[00:46] Hello :)
[00:48] hi
[00:51] Uhm, is there a cap on twitch download speed? i am currently on a 100/50 line and can't get a download going faster than 20mbit/s from them
[00:51] is this only me or?
[00:53] Lethalize: yes they throttle. see #burnthetwitch
[00:53] no way to go around this?
[00:56] increase the number of workers in your warrior
[00:56] you can download directly from twitch servers instead of the cdn, that shouldn't be limited like that
[01:06] Workers? you mean more VMs?
[01:06] no, go to the web interface, your settings, show advanced, concurrent items
[01:23] Increasing Concurrent doesn't help =/ they just split the speed of the downloads
[02:22] === We are also archiving Canv.as now! Tracker: http://tracker.archiveteam.org/canvas/ Details: http://archiveteam.org/index.php?title=Canv.as . Select the Canvas project in the warrior. ===
[02:35] rsync keeps failing for it
[02:49] possible stupid question, how do i run multiple warrior VMs at once?
[02:51] Import the warrior again
[02:51] VirtualBox -> Settings -> Network
[02:51] And change the host port to 8002, but leave the guest one alone.
[02:51] Do all the above for the second warrior.
[02:51] You can use the same ova you downloaded earlier
[02:51] ah, ok. that makes so much more sense.
[02:52] completely different question: is there still no good or fast way to convert audible .aa files into mp3s?
[02:52] That I know nothing about.
[02:53] bsmith093, I think there's some way to use itunes and burn the .aa files to a CD (or a virtual CD).
[02:53] not sure how though.
[02:54] garyrh, that's still the best way, after all these years of it? ugh.
[02:55] crypto__: I love how you are now the go-to expert on that.
[02:55] haha
[02:55] I know
[02:55] You should get all the credit for that.
[02:55] crypto__, how many VM's are you running?
[02:55] Just 2
[02:55] Hey, didn't want to steal credit, just pump your self-esteem.
[02:56] That's about all I trust my Mac Mini to run efficiently.
[02:56] The problem with too many is overprovisioning.
[02:56] I'd run the manual scripts, but I like the fact I can eat dinner, and it will automagically attach to a project without me needing to do anything.
[02:57] if you don't have enough ram, you just end up thrashing.
[02:57] Just for shits and giggles, I gave both warriors an extra 100mb of ram.
[02:57] It is a super handy system
[02:57] Don't really know if that helps them... but I had it to spare.
[02:57] It really is though.
[02:58] I'll be back later.
[04:06] Is there a channel for the canv.as project?
[04:07] phuzion: #canvas
[04:07] thanks
[04:12] #canv.as on Rizon last time I checked. At least *an* IRC channel where some of the site's users hung out, last time I was in the channel.
[04:14] nitro2k01: I was specifically asking about the archiveteam project
[04:23] #canvas on efnet
[08:17] SketchCow: tomorrow everything should be prepped for the open source movie, giving you a torrent file would be the best method (then you can get it from two seedboxes)
[08:18] Great
[08:33] maybe even later today, just waiting on hash checks to complete
[09:52] hi :)
[09:54] can we avoid getting a false http status 200 from archives http://web.archive.org/web/20140212211621/https://webaccess.uc.cl/simplesaml/ (also http://www.webcitation.org/query?url=https://webaccess.uc.cl/&date=2037-12-31 )
[09:55] or is there any documentation on why they do this :( ?
[10:53] jjonas: does that redirect loop ever stop?
[10:54] seems not
[12:06] yes, it is a funny one :)
[12:07] but for the example what matters is: in the first http header response they both only send status 200 OK instead of an error like 302 / 404
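On the false-200 problem discussed above (09:54-12:07): rather than trusting the status of the replay response, the Wayback Machine's CDX API reports the status code that was recorded at capture time. A rough sketch in Python, using only the public CDX endpoint; the example URL is the one from the discussion, and whether its captures actually carry a non-200 status is not guaranteed:

```python
# Sketch: ask the Wayback CDX API which status code was captured,
# instead of trusting the 200 that the replay URL itself returns.
import json
import urllib.parse
import urllib.request

def captured_statuses(url, limit=5):
    query = urllib.parse.urlencode({"url": url, "output": "json", "limit": limit})
    with urllib.request.urlopen("http://web.archive.org/cdx/search/cdx?" + query) as resp:
        rows = json.load(resp)
    if not rows:
        return []
    # First row of the JSON output is the field-name header.
    header, captures = rows[0], rows[1:]
    ts = header.index("timestamp")
    status = header.index("statuscode")
    return [(row[ts], row[status]) for row in captures]

if __name__ == "__main__":
    for timestamp, status in captured_statuses("https://webaccess.uc.cl/simplesaml/"):
        print(timestamp, status)
```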
[13:15] SketchCow: i'm heading out, but everything is ready on my end, just let me know how you want this torrent file
[13:31] Just drop this damned info_hash in his PM :)
[13:59] so I'm guessing there's a bot here that handles this, else I'll look like an idiot
[13:59] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[13:59] hah yahoosucks
[13:59] without the hah
[14:00] so I could've just asked, okay
[14:00] thanks
[14:01] yeah
[14:03] starting to upload the swipnet grab, did anyone start work on username discovery? (been away for a week)
[14:05] tephra: any indication of how many users?
[14:08] Arkiver2: nope, I got 000000-9999999 but then there are the users that can have ascii usernames. No idea how many those are, but since it's a somewhat old swedish community I would imagine that the numbers are in the thousands at the most
[14:41] i thought canvas was dead already. in fact i thought AT did a grab already... what's different this time?
[14:41] all the data
[14:41] first grab was broken
[14:44] ok, cool
[14:55] The site is offline, but the images are still online, as well as the user dump pages.
[14:55] "All" you need to access a user dump page is the person's user id and name as a pair.
[14:56] http://canvas-export.s3-website-us-east-1.amazonaws.com/xxxx-yyyyyy/ xxxx = the user ID, yyyyyy = the user name.
[15:18] tephra: we can do a warrior project to discover usernames
[15:20] i'd need a list of top swedish words likely to be used in web pages though
[15:22] does anyone understand the verizon webspaces? i have no idea what they are :p
[15:38] chfoo: cool, I was thinking of doing a googlescrape. hmm, top swedish words likely to be used in web pages, will have to see if i can get that
[15:42] http://download.openwall.net/pub/wordlists/languages/Swedish/
[15:42] dunno how much passwords overlap with web page words
[15:43] I would expect a lot of english words to be used as well
[15:46] yes and probably many non-words too
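To make the Canv.as dump URL pattern from 14:56 concrete: given a user ID and user name pair, the dump page address can be built and probed like this. A minimal sketch; the example pair is made up, and treating a plain GET that returns 200 as "the dump exists" is an assumption:

```python
# Sketch: build a Canv.as user dump URL from an (id, name) pair and check
# whether it resolves. The example pair below is hypothetical.
import urllib.error
import urllib.request

BASE = "http://canvas-export.s3-website-us-east-1.amazonaws.com"

def dump_url(user_id, user_name):
    return f"{BASE}/{user_id}-{user_name}/"

def dump_exists(user_id, user_name):
    try:
        with urllib.request.urlopen(dump_url(user_id, user_name)) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False

if __name__ == "__main__":
    # hypothetical pair, for illustration only
    print(dump_url("1234", "example_user"))
    print(dump_exists("1234", "example_user"))
```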
[16:31] Hi, do you guys still use wget-lua?
[16:32] If so, should I push additional features on github? :> Like scripting local filenames etc
[16:32] sometimes
[16:32] I don't see why not, but a lot of work has moved to wpull
[16:33] oh that looks interesting
[16:34] it uses sqlite to queue urls, which is a major advantage over wget
[16:34] it's also Python-scriptable, which can be nice
[16:35] Yeah and a bit more convenient to script
[16:35] https://github.com/ArchiveTeam/ArchiveBot/blob/master/pipeline/wpull_hooks.py for example
[16:36] oh, yeah, wpull also has runtime-adjustable concurrent fetchers
[16:37] there is a wget fork that has concurrent fetch but I don't think it ever made mainline
[16:38] Definitely will play with that
[16:38] yeah, parallel-wget
[16:39] it's on mainline's git I think... so eventually perhaps
[16:39] ah, vool
[16:40] cool
[16:40] Fnorder: all that said, the canvas and Twitch projects are using wget-lua still
[16:40] we will probably use it for a while whilst wpull matures
[16:42] also wpull is really heavily tested on Python 3, and that's not in the current Warrior image, so wget-lua has an advantage there
[16:42] wpull does have Python 2.x configs in Travis, but the day-to-day hammering happens on 3 :P
[16:45] k I'll clean up the changes and pr
[17:05] it seems github broke their js
[17:08] there, pr's in the archivebot branch... which is the current one I hope
[17:13] oh, wrong branch
[17:13] :P
[17:14] Fnorder: can you retarget the PR at the lua branch?
[17:15] Sure. Though proceed_p isn't there
[17:17] oh, hmm
[17:17] oh right
[17:18] I added httploop_proceed_p
[17:18] sec
[17:18] actually, sorry
[17:18] just keep it against archivebot for now
[17:19] it's been a long time since I looked at that branch -- httploop_proceed_p was added to support archivebot's ignore patterns
[17:19] Fnorder: ^
[17:20] I need to understand what your PR does, heh
[17:21] there
[17:21] Ahhh
[17:22] *right* before http get, that's called. With my change, return { local_file="something" } will override the local filename
[17:23] ah ok
[17:23] let's take this to #warrior, it's a sorta-kinda dev channel
[17:23] mainly intended to clean up crap?like=this&and=such
[17:33] yipdw and chfoo - I am coming around to the idea of an Internet Archive donation link in the warrior.
[17:33] Twitch is a major burden.
[17:33] cool, that should be doable
[17:34] twitch is slowly dying
[17:34] with its google thing now
[17:34] SketchCow: can i send you that torrent file now?
[17:36] Yes.
[17:38] best way = ?
[17:39] e-mail?
[17:40] sending now
[18:06] SketchCow: should I also put a donation request somewhere about https://archive.org/details/wikimediacommons ?
[18:07] Its size seems comparable to what you just called a major burden, and we don't want to be a burden
[18:07] Wikipedia commons is 290tb?
[18:08] 34 so far
[18:08] Ok, I misread the numbers on the twitch tracker then
[18:15] SketchCow: get the email?
[18:15] I did.
[18:15] ok cool
[18:16] I am going to confess to you this is not my #1 priority today.
[18:16] then i can leave peacefully
[18:16] But it will be done.
[18:16] yes, understandable
[18:16] just wanted to make sure you got it
[18:53] SketchCow, yipdw: twitch already has a donation link in the project web ui banner and whenever it downloads a video, it outputs a reminder to donate. should i put even more donation links?
[18:54] In the client itself?
[18:55] yeah iirc the idea was a donation link in the warrior client
[18:58] SketchCow: i mean in the warrior vm, if you select twitch, it will have a donation link on the page where it shows items being downloaded. if you are running the scripts in the console, you will also see a donation link as well. i'm guessing you are asking whether a donation link should be displayed in a permanent place in the warrior.
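Back on the wpull point from 16:34 (queuing URLs in sqlite instead of in memory): the sketch below is not wpull's actual schema or API, just a stripped-down illustration of why a sqlite-backed frontier is handy, since a crawl can be stopped and resumed without holding the whole queue in RAM:

```python
# Sketch of a sqlite-backed URL queue, illustrating the idea behind wpull's
# on-disk frontier. Not wpull's real schema or API.
import sqlite3

class UrlQueue:
    def __init__(self, path="frontier.db"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS urls ("
            " url TEXT PRIMARY KEY,"
            " status TEXT NOT NULL DEFAULT 'todo')"
        )
        self.db.commit()

    def add(self, url):
        # Ignore duplicates so the same URL is never queued twice.
        self.db.execute("INSERT OR IGNORE INTO urls (url) VALUES (?)", (url,))
        self.db.commit()

    def next(self):
        row = self.db.execute(
            "SELECT url FROM urls WHERE status = 'todo' LIMIT 1").fetchone()
        return row[0] if row else None

    def mark_done(self, url):
        self.db.execute("UPDATE urls SET status = 'done' WHERE url = ?", (url,))
        self.db.commit()

if __name__ == "__main__":
    q = UrlQueue(":memory:")
    q.add("http://example.com/")
    url = q.next()
    print("fetching", url)
    q.mark_done(url)
```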
[19:06] the warrior (in a virtual machine on windows) doesn't respect the "Concurrent items:" setting, it is only using 1
[19:12] commentat: try stopping the project and rebooting the vm
[19:13] yes, done that
[19:13] it is on multiple pc's now
[19:16] commentat: come to #warrior and we'll see if we can troubleshoot what's wrong
[20:02] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[20:02] n00b561: yahoosucks
[20:02] awesome
[22:22] SketchCow: seems that i can't upload to https://archive.org/details/archiveteam_twitchtv
[22:50] this is an old link, but still rather interesting and relevant I suspect: http://waldo.jaquith.org/blog/2011/02/ocr-video/
[23:11] SketchCow: any chance of making a collection for this magazine? https://archive.org/search.php?query=PoC%20||%20GTFO (just uploading the latest issue now)
[23:15] dashcloud: neat page, I didn't realize this guy was doing that high quality of work
[23:16] the PDF is only half (or less, depending on the issue) of the experience
[23:16] there's as much going on outside the PDF as there is on the pages
[23:40] dashcloud: Moved them into zines.
[23:41] thanks!