[00:01] *** w0rp has joined #archiveteam-bs
[00:17] *** ZexaronS has quit IRC (Quit: Leaving)
[00:45] *** Ravenloft has quit IRC (Ping timeout: 260 seconds)
[00:55] *** dashcloud has quit IRC (Read error: Operation timed out)
[00:56] *** BlueMaxim has joined #archiveteam-bs
[00:59] *** dashcloud has joined #archiveteam-bs
[01:00] *** zyphlar has joined #archiveteam-bs
[01:05] *** ZexaronS has joined #archiveteam-bs
[02:07] *** j08nY has quit IRC (Quit: Leaving)
[02:42] *** pizzaiolo has quit IRC (pizzaiolo)
[03:46] *** ZexaronS has quit IRC (Leaving)
[03:58] looks like my script a long time ago didn't upload a lot of reuters.com videos
[03:58] *** qw3rty6 has joined #archiveteam-bs
[04:00] i downloaded the download pages to grab a list of items to check if they were all uploaded, and it turned out they're not
[04:01] there is about 5gb of video not uploaded for 2008 alone
[04:03] *** qw3rty5 has quit IRC (Read error: Operation timed out)
[04:24] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[04:32] *** dashcloud has joined #archiveteam-bs
[04:45] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:52] *** Sk1d has joined #archiveteam-bs
[05:01] *** Meroje has quit IRC (Ping timeout: 260 seconds)
[05:02] *** BnAboyZ66 has quit IRC (Ping timeout: 260 seconds)
[05:04] *** Meroje has joined #archiveteam-bs
[05:05] *** mundus201 is now known as mundus
[05:24] *** ld1 has quit IRC (Ping timeout: 260 seconds)
[05:24] *** ld1 has joined #archiveteam-bs
[05:25] *** svchfoo1 has quit IRC (Quit: Closing)
[05:40] *** godane has left
[05:40] *** godane has joined #archiveteam-bs
[05:41] *** Stiletti is now known as Stiletto
[06:13] *** ld1 has quit IRC (Ping timeout: 260 seconds)
[06:14] *** ld1 has joined #archiveteam-bs
[06:56] *** kristian_ has joined #archiveteam-bs
[07:15] *** pikhq has quit IRC (Ping timeout: 268 seconds)
[08:39] *** chazchaz_ has quit IRC (Read error: Operation timed out)
[08:39] *** dxrt- has quit IRC (Read error: Operation timed out)
[08:41] *** espes__ has quit IRC (Ping timeout: 268 seconds)
[08:42] *** chazchaz has joined #archiveteam-bs
[08:44] *** dxrt- has joined #archiveteam-bs
[08:47] *** espes__ has joined #archiveteam-bs
[08:51] *** kristian_ has quit IRC (Quit: Leaving)
[08:55] *** Honno has quit IRC (Read error: Operation timed out)
[10:19] *** j08nY has joined #archiveteam-bs
[10:28] yuku returning 403 for warrior user-agent, firefox ok.
[10:29] that's rude
[10:32] *** BlueMaxim has quit IRC (Quit: Leaving)
[10:33] I guess we're Firefox now? :p
[10:42] should we make some random useragent generator?
[10:43] as in, random per-client, perhaps based on reported nickname
[10:53] *** t2t2 has quit IRC (Quit: "goodbye uptime")
[10:55] just hash it
[10:55] *** tuluu has joined #archiveteam-bs
[10:58] *** pizzaiolo has joined #archiveteam-bs
[11:01] *** pikhq has joined #archiveteam-bs
[11:06] *** BartoCH has joined #archiveteam-bs
[11:07] *** jspiros has quit IRC (leaving)
[11:07] *** RichardG has quit IRC (Ping timeout: 260 seconds)
[11:10] *** jspiros has joined #archiveteam-bs
[11:49] GLaDOS: My opinion is an optional random user-agent generator is not a bad feature to have that can be implemented if needed.
[11:49] But not as default behavior
[11:58] yeah, definitely optional
[11:58] we had the same issue with soundcloud
[12:01] from a site perspective, it might be best to base it on the public IP
[12:01] maybe that hashed together with the username
[12:02] although the best way for the pipeline to get its public IP would have to be figured out
[12:03] perhaps the tracker reports it back when you retrieve a job?
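Editor's note: the "just hash it" idea discussed between 10:42 and 12:03 could look roughly like the sketch below. It is only an illustration under stated assumptions, not code from the warrior or any ArchiveTeam pipeline: the function name `pick_user_agent`, the `USER_AGENTS` pool, and the way the public IP is supplied are all hypothetical, and how a client would actually learn its public IP is left open, as it was in the discussion.

```python
import hashlib

# Hypothetical pool of browser-like user-agent strings (an assumption for
# illustration; not taken from any ArchiveTeam project).
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:54.0) Gecko/20100101 Firefox/54.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:54.0) Gecko/20100101 Firefox/54.0",
]


def pick_user_agent(nickname: str, public_ip: str = "") -> str:
    """Deterministically map a client identity to one user-agent from the pool.

    The same (nickname, public_ip) pair always yields the same string, so a
    given client looks consistent to the site across requests.
    """
    key = f"{nickname}|{public_ip}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(USER_AGENTS)
    return USER_AGENTS[index]
```

Because the choice is a pure function of the inputs, it is "random per-client" without storing any state; making it optional, as suggested at 11:49, would just mean falling back to the default user-agent when the feature is switched off.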
[12:31] *** quantum has joined #archiveteam-bs
[12:34] *** godane has quit IRC (Read error: Operation timed out)
[13:10] *** t2t2 has joined #archiveteam-bs
[13:40] *** sep332 has quit IRC (Quit: konversation out)
[13:41] *** sep332 has joined #archiveteam-bs
[13:55] *** quantum has quit IRC (Ping timeout: 268 seconds)
[14:05] *** godane has joined #archiveteam-bs
[15:02] *** RichardG has joined #archiveteam-bs
[15:18] *** pikhq has quit IRC (Read error: Operation timed out)
[15:21] *** schbirid has joined #archiveteam-bs
[15:25] *** pikhq has joined #archiveteam-bs
[15:42] *** pikhq has quit IRC (Ping timeout: 268 seconds)
[15:48] *** pikhq has joined #archiveteam-bs
[16:15] SketchCow: i'm uploading HeroesRebornNBC youtube channel onto FOS
[16:15] it will be in Dead-Youtube-Channels
[16:17] there are only 2 videos on the channel now
[16:17] but i got 72 videos from it in the past
[18:06] *** dashcloud has quit IRC (Read error: Operation timed out)
[18:25] *** j08nY has quit IRC (Quit: Leaving)
[18:29] *** dashcloud has joined #archiveteam-bs
[18:30] *** Soni has quit IRC (Ping timeout: 272 seconds)
[18:32] http://libgen.io/robots.txt
[18:33] huh
[18:34] *** Soni has joined #archiveteam-bs
[18:34] http://gen.lib.rus.ec/robots.txt too
[18:34] no idea if new
[18:36] https://www.reddit.com/r/Scholar/comments/6puywe/meta_libgen_article_repository_is_down/
[18:42] no one seeding https://thepiratebay.org/torrent/11674459/The+Library+Genesis+SciMag+Repository+2015-01-31+%28torrents+only%29 :(
[18:46] some http://torrentproject.se/?t=scimag
[18:51] *** fie has quit IRC (Ping timeout: 268 seconds)
[19:10] Someone wanna update the current running warrior project to yuku?
[19:24] mundus: it's not returning 403 for every request anymore?
[19:25] what?
[19:25] It's just the active project
[19:25] *** ItsYoda has quit IRC (Quit: rippppp to the yoda you used to know!)
[19:28] *** ItsYoda has joined #archiveteam-bs
[19:29] *** zino has quit IRC (Quit: Leaving)
[19:32] *** Whopper has quit IRC (Read error: Operation timed out)
[19:32] Just tried starting it, got one item. All the fetches for it returned 403. Opening one of those urls manually gives a phpbb sql error: "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'RELECT * FROM conservativeallies_users WHERE user_id = 0' at line 1 [1064]"
[19:34] Fyi, https://www.reddit.com/r/opiaterollcall/ was recently banned, but Google still has some cached pages.
[19:38] *** Whopper has joined #archiveteam-bs
[19:49] *** Honno has joined #archiveteam-bs
[19:49] Re: yuku, I obviously can't be certain (since I don't know what the topic of a given thread is supposed to be and so can't verify it's still the same content), but it seems the urls to access threads were changed from e.g. http://conservativeallies.yuku.com/topic/9280/ to http://conservativeallies.yuku.com/slug-usually-goes-here-t9280.html (the minimal fake slug you can get away with would be
[19:49] .../-t9280.html) (if you want the canonical url you'll have to extract it from the fetched page)
[19:49] so 1154 flv files were missing from reuters.com video 2008 uploads
[19:50] those are now uploaded
[19:50] Also, if the sample I got was representative (i.e. all 403s), the project should be put on hold again (it's unclear why it's running again)
[19:53] huh, that url format change doesn't seem to be global... e.g. http://monsterkidclassichorrorforum.yuku.com/ still uses the old style
[19:53] i'm uploading 3089 videos missing from reuters.com video 2009 uploads
[19:53] ¯\_(ツ)_/¯
[19:53] a lot are missing from 2009-09 to 2009-12
[19:54] in a bit of weirdness 2009-02 items are all fine
[19:54] no missing files there
[20:32] *** TheLovina has joined #archiveteam-bs
[20:32] *** Whopper has quit IRC (Read error: Connection reset by peer)
[20:38] *** godane has quit IRC (Read error: Operation timed out)
[20:45] *** godane has joined #archiveteam-bs
[21:28] tobbez: we have a project for yuku
[21:28] we can just load more items
[21:29] huh
[21:29] I see many projects have been removed from the tracker
[21:30] who started yuku? I'm not sure if it was ready to be restarted
[21:31] what projects have been removed from the tracker now??
[21:33] GLaDOS: see above ^
[21:33] was the yuku project tested properly before being restarted?
[21:33] it was not run for quite some time, the website might have undergone some changes
[21:34] yuku banned our useragent
[21:34] the project is paused again
[21:39] I'll check yuku and see if other stuff changed that needs editing of the project
[21:40] also working on dayviews project, will be here https://github.com/ArchiveTeam/dayviews-grab
[21:52] *** schbirid2 has joined #archiveteam-bs
[21:55] *** sep332 is now known as sep332_
[21:56] *** schbirid has quit IRC (Read error: Operation timed out)
[22:40] *** username1 has joined #archiveteam-bs
[22:43] *** schbirid2 has quit IRC (Read error: Operation timed out)
[22:45] *** username1 has quit IRC (Read error: Operation timed out)
[22:49] *** schbirid has joined #archiveteam-bs
[22:57] *** schbirid2 has joined #archiveteam-bs
[22:59] *** schbirid has quit IRC (Read error: Operation timed out)
[23:07] *** Odd0002 has joined #archiveteam-bs
[23:22] *** username1 has joined #archiveteam-bs
[23:24] *** schbirid2 has quit IRC (Read error: Operation timed out)
[23:32] *** kristian_ has joined #archiveteam-bs
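Editor's note: the yuku thread URL change described at 19:49 (old style `/topic/9280/`, new style ending in `-t9280.html`, with `/-t9280.html` as the minimal fake slug) can be sketched as a small rewrite helper. This is a hedged illustration only: the function name `to_new_style` and the regular expression are assumptions, and as noted at 19:53 the change does not appear to apply to every forum, so any real use would need to check per forum which style is live.

```python
import re
from typing import Optional

# Old-style thread URL as quoted in the log: http://<forum>.yuku.com/topic/<id>/
OLD_TOPIC_RE = re.compile(r"^(https?://[^/]+\.yuku\.com)/topic/(\d+)/?$")


def to_new_style(url: str) -> Optional[str]:
    """Rewrite an old-style topic URL to the minimal new-style form.

    Returns None when the URL does not look like an old-style topic URL.
    The result uses the empty "fake slug" mentioned in the log; the
    canonical slug would still have to be read from the fetched page.
    """
    match = OLD_TOPIC_RE.match(url)
    if match is None:
        return None
    base, topic_id = match.groups()
    return f"{base}/-t{topic_id}.html"


print(to_new_style("http://conservativeallies.yuku.com/topic/9280/"))
# prints http://conservativeallies.yuku.com/-t9280.html
```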