#archiveteam 2014-08-11,Mon


Time Nickname Message
00:46 🔗 Lethalize Hello :)
00:48 🔗 xmc hi
00:51 🔗 Lethalize Uhm, is there a cap on Twitch download speed? I am currently on a 100/50 line and can't get a download going faster than 20 Mbit/s from them
00:51 🔗 Lethalize is this only me or?
00:53 🔗 balrog Lethalize: yes they throttle. see #burnthetwitch
00:53 🔗 Lethalize no way to get around this?
00:56 🔗 trs80 increase the number of workers in your warrior
00:56 🔗 J4ko you can download directly from the Twitch servers instead of the CDN, it shouldn't be limited like that
01:06 🔗 Lethalize Workers? you mean more VMs?
01:06 🔗 trs80 no, go to the web interface, your settings, show advanced, concurrent items
01:23 🔗 Lethalize Increasing concurrent items doesn't help =/ they just split the speed of the downloads
02:22 🔗 chfoo === We are also archiving Canv.as now! Tracker: http://tracker.archiveteam.org/canvas/ Details: http://archiveteam.org/index.php?title=Canv.as . Select the Canvas project in the warrior. ===
02:35 🔗 Cameron_D rsync keeps failing for it
02:49 🔗 bsmith093 possible stupid question, how do i run multiple warrior VMs at once?
02:51 🔗 crypto__ Import the warrior again
02:51 🔗 crypto__ VirtualBox -> Settings -> Network
02:51 🔗 crypto__ And change the host port to 8002, but leave the guest one alone.
02:51 🔗 crypto__ Do all the above for the second warrior.
02:51 🔗 crypto__ You can use the same ova you downloaded earlier
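crypto__'s steps can also be scripted instead of clicked through in the VirtualBox GUI. A minimal sketch using VBoxManage from Python, assuming the default warrior image serves its web UI on guest port 8001; the .ova filename, VM name, and forwarding-rule name here are hypothetical:

    import subprocess

    VM = "archiveteam-warrior-2"  # hypothetical name for the second import

    # Import the same .ova a second time under a new name.
    subprocess.run(["VBoxManage", "import", "archiveteam-warrior.ova",
                    "--vsys", "0", "--vmname", VM], check=True)

    # Forward host port 8002 to the warrior web UI on guest port 8001,
    # leaving the guest side alone as described above. If the imported VM
    # already carries an 8001->8001 rule, delete that one first (its name varies).
    subprocess.run(["VBoxManage", "modifyvm", VM,
                    "--natpf1", "warrior-web,tcp,,8002,,8001"], check=True)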
02:51 🔗 bsmith093 ah, ok. that makes so much more sense.
02:52 🔗 bsmith093 completely different question is there still no good or fast way to convert audible .aa files, into mp3s?
02:52 🔗 crypto__ That I know nothing about.
02:53 🔗 garyrh bsmith093, I think there's some way to use itunes and burn the .aa files to a CD (or a virtual CD.)
02:53 🔗 garyrh not sure how though.
02:54 🔗 bsmith093 garyrh, that's still the best way, after all these years of it? ugh.
02:55 🔗 aaaaaaaaa crypto__: I love how you are now the go-to expert on that.
02:55 🔗 crypto__ haha
02:55 🔗 crypto__ I know
02:55 🔗 crypto__ You should get all the credit for that.
02:55 🔗 bsmith093 crypto__, how many VM's are you running?
02:55 🔗 crypto__ Just 2
02:55 🔗 aaaaaaaaa Hey, didn't want to steal credit, just pump your self-esteem.
02:56 🔗 crypto__ That's about all I trust my Mac Mini to run efficiently.
02:56 🔗 aaaaaaaaa The problem with too many is overprovisioning.
02:56 🔗 crypto__ I'd run the manual scripts, but I like the fact I can eat dinner, and it will automagically attach to a project without me needing to do anything.
02:57 🔗 aaaaaaaaa if you don't have enough ram, you just end up thrashing.
02:57 🔗 crypto__ Just for shits and giggles, I gave both warriors an extra 100mb of ram.
02:57 🔗 aaaaaaaaa It is a super handy system
02:57 🔗 crypto__ Don't really know if that helps them...but I had it to spare.
02:57 🔗 crypto__ It really is though.
02:58 🔗 crypto__ I'll be back later.
04:06 🔗 phuzion Is there a channel for the canv.as project?
04:07 🔗 aaaaaaaaa phuzion: #canvas
04:07 🔗 phuzion thanks
04:12 🔗 nitro2k01 #canv.as on Rizon last time I checked. At least *an* IRC channel where some of the site's users hung out, last time I was in the channel.
04:14 🔗 phuzion nitro2k01: I was specifically asking about the archiveteam project
04:23 🔗 yipdw #canvas on efnet
08:17 🔗 NovaKing SketchCow: tomorrow everything should be prepped for the open source movie, giving you a torrent file would be the best method (then you can get it from two seedboxes)
08:18 🔗 SketchCow Great
08:33 🔗 NovaKing maybe even later today, just waiting on hash checks to complete
09:52 🔗 jjonas hi:)
09:54 🔗 jjonas can we avoid getting a false HTTP status 200 from the archives? http://web.archive.org/web/20140212211621/https://webaccess.uc.cl/simplesaml/ (also http://www.webcitation.org/query?url=https://webaccess.uc.cl/&date=2037-12-31 )
09:55 🔗 jjonas or is there any documentation why they do this:( ?
10:53 🔗 Nemo_bis jjonas: does that redirect loop ever stop?
10:54 🔗 Nemo_bis seems not
12:06 🔗 jjonas yes, it is a funny one:)
12:07 🔗 jjonas but for the example all that matters is: in the first HTTP response header they both only send status 200 OK instead of a redirect or error like 302 / 404
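The false 200 jjonas describes is easy to observe; a quick sketch using the third-party requests library, with the snapshot URL from the log above:

    import requests

    snapshot = ("http://web.archive.org/web/20140212211621/"
                "https://webaccess.uc.cl/simplesaml/")
    resp = requests.get(snapshot, allow_redirects=False)
    # Per jjonas, the archive answers 200 here even though the captured
    # page was a redirect loop, instead of surfacing the original 302/404.
    print(resp.status_code)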
13:15 🔗 NovaKing SketchCow: i'm heading out, but everything is ready on my end, just let me know how you want this torrent file
13:31 🔗 Nemo_bis Just drop this damned info_hash in his PM :)
13:59 🔗 Sanqui so I'm guessing there's a bot here that handles this, else I'll look like an idiot
13:59 🔗 Sanqui WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
13:59 🔗 midas hah yahoosucks
13:59 🔗 midas without the hah
14:00 🔗 Sanqui so I could've just asked, okay
14:00 🔗 Sanqui thanks
14:01 🔗 midas yeah
14:03 🔗 tephra starting to upload swipnet grab, did anyone start work on username discovery? (been away for a week)
14:05 🔗 Arkiver2 tephra: any indication on how many users?
14:08 🔗 tephra Arkiver2: nope, I got 000000-9999999 but then there are the users that can have ASCII usernames. No idea how many those are, but since it's a somewhat old Swedish community I would imagine the numbers are in the thousands at most
14:41 🔗 sep332 i thought canvas was dead already. in fact i thought AT did a grab already... what's different this time?
14:41 🔗 midas all the data
14:41 🔗 midas first grab was broken
14:44 🔗 sep332 ok, cool
14:55 🔗 nitro2k01 The site is offline, but the images are still online, as well as the user dump pages.
14:55 🔗 nitro2k01 "All" you need to access a user dump page is the person's user id and name as a pair.
14:56 🔗 nitro2k01 http://canvas-export.s3-website-us-east-1.amazonaws.com/xxxx-yyyyyy/ xxxx = the user ID, yyyyyy = the user name.
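Put differently, given a known (id, name) pair the dump URL can be built directly; a small sketch, where the pair shown is hypothetical:

    BASE = "http://canvas-export.s3-website-us-east-1.amazonaws.com"

    def export_url(user_id, user_name):
        # xxxx-yyyyyy as in nitro2k01's pattern above.
        return "{}/{}-{}/".format(BASE, user_id, user_name)

    print(export_url("1234", "someuser"))  # hypothetical (id, name) pair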
15:18 🔗 chfoo tephra: we can do a warrior project to discover usernames
15:20 🔗 chfoo i'd need a list of top swedish words likely to be used in web pages though
15:22 🔗 midas does anyone understand the verizon webspaces? i have no idea what they are :p
15:38 🔗 tephra chfoo: cool, I was thinking of doing a googlescrape. hmm top swedish words likely to be used in web pages, will have to see if i can get that
15:42 🔗 DFJustin http://download.openwall.net/pub/wordlists/languages/Swedish/
15:42 🔗 DFJustin dunno how much passwords overlap with web page words
15:43 🔗 DFJustin I would expect a lot of english words to be used as well
15:46 🔗 tephra yes and probably many non-words too
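A rough sketch of the googlescrape idea tephra mentions: turn a wordlist such as the openwall one into site-restricted search queries. The hosting domain, wordlist filename, and query shape are placeholders, not the project's actual discovery method:

    from itertools import islice

    SITE = "home.swipnet.se"  # assumed hosting domain; verify before relying on it

    def queries(wordlist_path, site=SITE):
        # Yield one site-restricted query per wordlist entry.
        with open(wordlist_path, encoding="latin-1") as fh:
            for line in fh:
                word = line.strip()
                if word:
                    yield 'site:{} "{}"'.format(site, word)

    for q in islice(queries("swedish.txt"), 5):  # hypothetical wordlist file
        print(q)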
16:31 🔗 Fnorder Hi, do you guys still use wget-lua?
16:32 🔗 Fnorder If so, should I push additional features on github? :> Like scripting local filenames etc
16:32 🔗 xmc sometimes
16:32 🔗 xmc I don't see why not, but a lot of work has moved to wpull
16:33 🔗 Fnorder oh that looks interesting
16:34 🔗 xmc it uses sqlite to queue urls, which is a major advantage over wget
16:34 🔗 yipdw it's also Python-scriptable, which can be nice
16:35 🔗 Fnorder Yeah and a bit more convenient to script
16:35 🔗 yipdw https://github.com/ArchiveTeam/ArchiveBot/blob/master/pipeline/wpull_hooks.py for example
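For the curious, a hook script along those lines is just a Python file that wpull loads at runtime; a very rough sketch, with the callback name, signature, and registration mechanism recalled from that era's API and therefore best treated as assumptions rather than current wpull behaviour:

    def accept_url(url_info, record_info, verdict, reasons):
        # Keep wpull's own verdict except for a hypothetical path we never want.
        if "/logout" in url_info["url"]:
            return False
        return verdict

    # Registration happens through the object wpull injects into hook
    # scripts (loaded with --python-script), roughly:
    #   wpull_hook.callbacks.accept_url = accept_url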
16:36 🔗 yipdw oh, yeah, wpull also has runtime-adjustable concurrent fetchers
16:37 🔗 yipdw there is a wget fork that has concurrent fetch but I don't think it ever made mainline
16:38 🔗 Fnorder Definitely will play with that
16:38 🔗 Fnorder yeah, parallel-wget
16:39 🔗 Fnorder it's on mainline's git I think...so eventually perhaps
16:39 🔗 yipdw ah, vool
16:40 🔗 yipdw cool
16:40 🔗 yipdw Fnorder: all that said, the canvas and Twitch projects are using wget-lua still
16:40 🔗 yipdw we will probably use it for a while whilst wpull matures
16:42 🔗 yipdw also wpull is really heavily tested on Python 3, and that's not in the current Warrior image, so wget-lua has an advantage there
16:42 🔗 yipdw wpull does have Python 2.x configs in Travis, but the day-to-day hammering happens on 3 :P
16:45 🔗 Fnorder k I'll cleanup the changes and pr
17:05 🔗 Fnorder it seems github broke their js
17:08 🔗 Fnorder there, the PR's in the archivebot branch... which is the current one I hope
17:13 🔗 yipdw oh, wrong branch
17:13 🔗 yipdw :P
17:14 🔗 yipdw Fnorder: can you retarget the PR at the lua branch?
17:15 🔗 Fnorder Sure. Though proceed_p isn't there
17:17 🔗 yipdw oh, hmm
17:17 🔗 yipdw oh right
17:18 🔗 yipdw I added httploop_proceed_p
17:18 🔗 Fnorder sec
17:18 🔗 yipdw actually, sorry
17:18 🔗 yipdw just keep it against archivebot for now
17:19 🔗 yipdw it's been a long time since I looked at that branch -- httploop_proceed_p was added to support archivebot's ignore patterns
17:19 🔗 yipdw Fnorder: ^
17:20 🔗 yipdw I need to understand what your PR does, heh
17:21 🔗 Fnorder there
17:21 🔗 Fnorder Ahhh
17:22 🔗 Fnorder that's called *right* before the HTTP GET. With my change, returning { local_file="something" } will override the local filename
17:23 🔗 yipdw ah ok
17:23 🔗 yipdw let's take this to #warrior, it's a sorta-kinda dev channel
17:23 🔗 Fnorder mainly intended to clean up crap?like=this&and=such
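The wget-lua hook itself is Lua, but the kind of cleanup Fnorder means is easy to picture; a Python sketch of mapping a query-string-heavy URL to a filesystem-safe local name (not the actual PR code):

    import re
    try:
        from urllib.parse import urlsplit  # Python 3
    except ImportError:
        from urlparse import urlsplit      # Python 2

    def local_name(url):
        parts = urlsplit(url)
        raw = parts.path.lstrip("/") or "index"
        if parts.query:
            raw += "_" + parts.query
        # Collapse anything awkward in a filename into underscores.
        return re.sub(r"[^A-Za-z0-9._-]+", "_", raw)

    print(local_name("http://example.com/crap?like=this&and=such"))
    # -> crap_like_this_and_such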
17:33 🔗 SketchCow yipdw and chfoo - I am coming around to the idea of an Internet Archive donation link in the warrior.
17:33 🔗 SketchCow Twitch is a major burden.
17:33 🔗 yipdw cool, that should be doable
17:34 🔗 NovaKing twitch is slowly dying
17:34 🔗 NovaKing with its Google thing now
17:34 🔗 NovaKing SketchCow: can i send you that torrent file now?
17:36 🔗 SketchCow Yes.
17:38 🔗 NovaKing best way = ?
17:39 🔗 SketchCow e-mail?
17:40 🔗 NovaKing sending now
18:06 🔗 Nemo_bis SketchCow: should I also put a donation request somewhere about https://archive.org/details/wikimediacommons ?
18:07 🔗 Nemo_bis Its size seems comparable to what you just called a major burden, and we don't want to be a burden
18:07 🔗 SketchCow Wikimedia Commons is 290 TB?
18:08 🔗 Nemo_bis 34 so far
18:08 🔗 Nemo_bis Ok, I misread the numbers on the twitch tracker then
18:15 🔗 NovaKing SketchCow: get the email?
18:15 🔗 SketchCow I did.
18:15 🔗 NovaKing ok cool
18:16 🔗 SketchCow I am going to confess to you this is not my #1 priority today.
18:16 🔗 NovaKing then i can leave peacefully
18:16 🔗 SketchCow But it will be done.
18:16 🔗 NovaKing yes, understandable
18:16 🔗 NovaKing just wanted to make sure you got it
18:53 🔗 chfoo SketchCow , yipdw: twitch already has a donation link in the project web ui banner and whenever it downloads a video, it outputs a reminder to donate. should i put even more donation links?
18:54 🔗 SketchCow In the client itself?
18:55 🔗 balrog yeah iirc the idea was a donation link in the warrior client
18:58 🔗 chfoo SketchCow: i mean in the warrior vm, if you select twitch, it will have a donation link on the page where it shows items being downloaded. if you are running the scripts in the console, you will also see a donation link. i'm guessing you are asking whether a donation link should be displayed in a permanent place in the warrior.
19:06 🔗 commentat the warrior (in a virtual machine on Windows) doesn't respect the "Concurrent items:" setting, it is only using 1
19:12 🔗 chfoo commentat: try stopping the project and rebooting the vm
19:13 🔗 commentat yes, done that
19:13 🔗 commentat it is on multiple PCs now
19:16 🔗 chfoo commentat: come to #warrior and we'll see if we can troubleshoot what's wrong
20:02 🔗 n00b561 WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
20:02 🔗 chfoo n00b561: yahoosucks
20:02 🔗 n00b561 awesome
22:22 🔗 midas SketchCow: seems that i cant upload to https://archive.org/details/archiveteam_twitchtv
22:50 🔗 dashcloud this is an old link, but still rather interesting and relevant I suspect: http://waldo.jaquith.org/blog/2011/02/ocr-video/
23:11 🔗 dashcloud SketchCow: any chance of making a collection for this magazine? https://archive.org/search.php?query=PoC%20||%20GTFO (just uploading the latest issue now)
23:15 🔗 xmc dashcloud: neat page, I didn't realize this guy was doing that high quality of work
23:16 🔗 dashcloud the PDF is only half (or less, depending on the issue) of the experience
23:16 🔗 dashcloud there's as much going on outside the PDF as there is on the pages
23:40 🔗 SketchCow dashcloud: Moved them into zines.
23:41 🔗 dashcloud thanks!
