[00:21] *** aaaaaaaa_ has joined #archiveteam
[00:21] *** aaaaaaaaa has quit IRC (Read error: Connection reset by peer)
[00:21] *** swebb sets mode: +o aaaaaaaa_
[00:31] *** Atom-- has joined #archiveteam
[00:36] *** aaaaaaaa_ is now known as aaaaaaaaa
[00:37] *** Atom__ has quit IRC (Ping timeout: 506 seconds)
[00:51] *** Ungstein has joined #archiveteam
[00:54] *** xk_id has quit IRC (Remote host closed the connection)
[00:55] *** xk_id has joined #archiveteam
[01:01] *** Pythia has quit IRC (WeeChat 1.3)
[01:03] *** dserodio has quit IRC (Read error: Operation timed out)
[01:06] *** xk_id has quit IRC (Read error: Operation timed out)
[01:17] *** dserodio has joined #archiveteam
[01:18] *** primus104 has quit IRC (Leaving.)
[01:55] *** zenguy_pc has quit IRC (Read error: Connection reset by peer)
[01:56] awesome stuff coming to the Internet Archive & Wayback Machine: http://blog.archive.org/2015/10/21/grant-to-develop-the-next-generation-wayback-machine/
[02:05] Now we just need a way to fix the brains of people who write obnoxious robots.txt and cut their sites out of history
[02:08] ^
[02:09] Isn't the only reason that archive.org respects robot.txt is due to le lawsuits
[02:10] *** zenguy_pc has joined #archiveteam
[02:10] *** zenguy_pc has quit IRC (Read error: Connection reset by peer)
[02:11] probably the main reason anyway
[02:13] http://techcrunch.com/2015/10/21/an-offer-creators-cant-refuse/
[02:13] "Today YouTube confirmed that any “partner” creator who earns a cut of ad revenue but doesn’t agree to sign its revenue share deal for its new YouTube Red $9.99 ad-free subscription will have their videos hidden from public view on both the ad-supported and ad-free tiers."
[02:20] Say what
[02:20] I'm not sure if that's a bad thing or not
[02:20] they say "99% of content consumed on YouTube will be still available"
[02:21] the youtube partner program is like 8 years old.
[02:21] Ikt
[02:21] so plenty of old stuff will probably disappear
[02:21] Oh,shiit
[02:21] so 7 days to save a big chunk of youtube?
[02:22] So it will still be viewable with the url?just now with search and such?
[02:23] it's unclear
[02:23] oh probably just search and any form of discovery apart from the channel page
[02:24] never mind then ¯\_(ツ)_/¯
[02:24] So it's should be just unlisted
[02:24] But still we should do something about this
[02:25] "[..." why the overwhelming majority of our partners, representing over 98% of the content watched on YouTube, have signed up. Videos of partners who don’t update their terms will be made private in the US at launch [..."
[02:25] Maybe scrape,download,then upload to an channel where that doesn't do that to?
[02:25] so maybe not just unlisted?
[02:25] ^ and then get flagged by contentid
[02:26] ?
[02:30] *** SiBurning has joined #archiveteam
[02:43] *** zenguy_pc has joined #archiveteam
[02:51] *** VADemon has quit IRC (left4dead)
[02:56] acchan: you can read that site using inoreader.com
[02:57] acchan: make an account and paste http://2dteleidoscope.wordpress.com/feed/ into the subscription box; there are entries from 2011 and 2012
[02:57] 2009-2012 actually
[02:57] you can try other RSS feed readers like feedly too
[03:00] it's also in the google reader archive but there is no convenient way to read it afaik https://www.refheap.com/0788b640ca608157796e9fbeb/raw
[03:03] *** gibigiana has quit IRC (Read error: Operation timed out)
[03:03] *** RichardG has quit IRC (Read error: Connection reset by peer)
[03:03] *** RichardG has joined #archiveteam
[03:04] *** acchan has quit IRC (Read error: Operation timed out)
[03:04] *** jmad980 has quit IRC (Read error: Operation timed out)
[03:04] *** Laverne has quit IRC (Read error: Operation timed out)
[03:04] *** acchan has joined #archiveteam
[03:04] *** mistym- has quit IRC (Ping timeout: 369 seconds)
[03:05] *** mistym has joined #archiveteam
[03:05] *** dserodio has quit IRC (Read error: Operation timed out)
[03:05] *** swebb has quit IRC (Read error: Operation timed out)
[03:06] acchan: https://www.refheap.com/25d3c0ae8005be159d810529f/raw
[03:06] *** Coderjoe has quit IRC (Read error: Operation timed out)
[03:08] *** SiBurning has quit IRC (Read error: Operation timed out)
[03:08] *** ohhdemgir has quit IRC (Ping timeout: 369 seconds)
[03:08] *** SiBurning has joined #archiveteam
[03:09] *** ohhdemgir has joined #archiveteam
[03:10] *** atlogbot has quit IRC (Ping timeout: 369 seconds)
[03:11] *** dcmorton has quit IRC (Ping timeout: 369 seconds)
[03:12] *** dxrt has quit IRC (Ping timeout: 369 seconds)
[03:12] *** Coderjoe has joined #archiveteam
[03:12] *** slyphic has quit IRC (Read error: Operation timed out)
[03:13] *** riz has quit IRC (Ping timeout: 369 seconds)
[03:14] *** gibigiana has joined #archiveteam
[03:15] *** godane has quit IRC (Read error: Operation timed out)
[03:16] *** riz has joined #archiveteam
[03:16] *** dxrt has joined #archiveteam
[03:17] *** jmad980 has joined #archiveteam
[03:19] *** godane has joined #archiveteam
[03:19] *** n00b494 has joined #archiveteam
[03:19] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[03:20] Yahoosucks
[03:20] Should of guessed.
[03:24] *** antomatic has joined #archiveteam
[03:29] *** dcmorton has joined #archiveteam
[03:29] *** zenguy_pc has quit IRC (Read error: Connection reset by peer)
[03:32] *** atlogbot has joined #archiveteam
[03:32] *** slyphic has joined #archiveteam
[03:33] *** dcmorton has quit IRC (Excess Flood)
[03:33] *** dcmorton has joined #archiveteam
[03:34] *** Laverne has joined #archiveteam
[03:35] *** swebb has joined #archiveteam
[03:38] *** antomati_ has quit IRC (Read error: Operation timed out)
[03:46] *** zenguy_pc has joined #archiveteam
[03:47] *** atlogbot has quit IRC (Ping timeout: 369 seconds)
[03:48] *** aaaaaaaaa sets mode: +o swebb
[03:57] *** slyphic has quit IRC (Read error: Operation timed out)
[04:12] *** aaaaaaaaa has quit IRC (Leaving)
[04:12] *** cadbury_ has quit IRC (Read error: Operation timed out)
[04:25] *** slyphic has joined #archiveteam
[04:27] *** atlogbot has joined #archiveteam
[04:32] *** SiBurning has quit IRC ()
[04:48] *** n00b494 has quit IRC (Quit: Page closed)
[04:49] *** chazchaz_ has quit IRC (Remote host closed the connection)
[04:51] *** robink has quit IRC (Read error: Connection reset by peer)
[04:52] *** Froggypwn has quit IRC (Read error: Operation timed out)
[04:52] *** Froggypwn has joined #archiveteam
[04:53] *** chazchaz_ has joined #archiveteam
[04:53] *** chazchaz_ has quit IRC (Remote host closed the connection)
[04:55] *** cloudmons has quit IRC (Read error: Connection reset by peer)
[04:57] *** cloudmons has joined #archiveteam
[04:58] *** robink has joined #archiveteam
[05:02] *** robink has quit IRC (Read error: Connection reset by peer)
[05:04] *** cloudmons has quit IRC (Read error: Connection reset by peer)
[05:04] *** cadbury has joined #archiveteam
[05:15] *** cloudmons has joined #archiveteam
[05:17] *** robink has joined #archiveteam
[05:26] *** Stilett0 has joined #archiveteam
[05:28] *** Stiletto has quit IRC (Ping timeout: 310 seconds)
[05:34] *** Atom-- has quit IRC (Ping timeout: 506 seconds)
[05:38] *** JesseW has joined #archiveteam
[05:49] *** WinterFox has joined #archiveteam
[05:51] *** JesseW has quit IRC (Read error: Operation timed out)
[06:09] *** Dark_Star has quit IRC (Ping timeout: 268 seconds)
[06:30] *** pokeball9 has quit IRC (Quit: Connection closed for inactivity)
[06:36] *** vitzli has joined #archiveteam
[06:41] *** primus104 has joined #archiveteam
[06:41] *** robink has quit IRC (Read error: Connection reset by peer)
[06:43] *** robink has joined #archiveteam
[06:45] *** cloudmons has quit IRC (Ping timeout: 506 seconds)
[06:49] *** cloudmons has joined #archiveteam
[06:53] *** robink has quit IRC (Ping timeout: 506 seconds)
[06:54] *** Sanqui has quit IRC (Ping timeout: 506 seconds)
[06:54] *** patricko- has joined #archiveteam
[06:56] *** patrickod has quit IRC (Ping timeout: 506 seconds)
[06:58] *** cloudmons has quit IRC (Ping timeout: 506 seconds)
[07:19] *** philpem has joined #archiveteam
[07:32] *** PurpleSym has joined #archiveteam
[07:33] *** Ungstein1 has joined #archiveteam
[07:36] *** Ungstein has quit IRC (Ping timeout: 252 seconds)
[07:41] *** atomotic has joined #archiveteam
[07:48] Awesome, awesome, awesome https://blog.archive.org/2015/10/21/grant-to-develop-the-next-generation-wayback-machine/
[07:49] "Rewriting the Wayback Machine code." might allow us to ignore certain parts of URLs like sessionIDs
[07:49] "Improving the playback of media-rich and interactive websites." probably means support for youtube and blip videos! :D
[07:51] *** cloudmons has joined #archiveteam
[07:51] *** robink has joined #archiveteam
[07:51] *** Sanqui has joined #archiveteam
[07:59] oOo
[08:00] all of these are exciting
[08:11] *** Ungstein has joined #archiveteam
[08:12] *** Ungstein1 has quit IRC (Ping timeout: 252 seconds)
[08:27] *** philpem has quit IRC (Ping timeout: 252 seconds)
[08:35] *** xk_id has joined #archiveteam
[09:01] Aren't there already support for youtube videos?
[09:08] *** bzc6p_ is now known as bzc6p
[09:08] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[09:11] *** ohhdemgir has quit IRC (Ping timeout: 268 seconds)
[09:12] *** Emcy has joined #archiveteam
[09:15] As for linkrot, see https://quarry.wmflabs.org/query/5794
[09:21] ersi: archive-it has support for them, the wayback machine not yet
[09:22] I've watched YouTube videos in Wayback though
[09:25] archive-it crawls go in the general pool AFAIK
[09:25] yes they do
[09:25] But the player of WARCs of Archive-It is not the same player the Wayback Machine uses
[09:53] *** xk_id_ has joined #archiveteam
[09:53] *** xk_id has quit IRC (Read error: Connection reset by peer)
[09:56] *** xk_id has joined #archiveteam
[09:56] *** xk_id_ has quit IRC (Read error: Connection reset by peer)
[10:02] *** primus104 has quit IRC (Leaving.)
[10:05] *** schbirid has joined #archiveteam
[10:19] *** PurpleSym has quit IRC (Quit: WeeChat 1.1.1)
[10:25] *** ppsym has joined #archiveteam
[10:34] *** nertzy has quit IRC (Ping timeout: 252 seconds)
[10:39] *** atomotic has joined #archiveteam
[10:42] *** Ghost_of_ has joined #archiveteam
[10:44] hi all
[10:44] ALL of Yuku.com is down now
[10:44] earlier on, it was just the CHFB
[10:46] *** terburg has joined #archiveteam
[10:47] http://www.yuku.com/
[10:47] http://monsterkidclassichorrorforum.yuku.com/
[10:57] I tried to document here: http://archiveteam.org/index.php?title=Talk:The_Classic_Horror_Film_Board
[11:07] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[11:08] *** ppsym has quit IRC (WeeChat 1.1.1)
[11:11] *** PurpleSym has joined #archiveteam
[11:26] *** bzc6p_ has joined #archiveteam
[11:26] *** swebb sets mode: +o bzc6p_
[11:32] *** bzc6p has quit IRC (Ping timeout: 615 seconds)
[11:34] *** Morbus has quit IRC (Quit: http://www.disobey.com/)
[11:35] *** primus104 has joined #archiveteam
[11:38] *** pokeball9 has joined #archiveteam
[11:47] *** ohhdemgir has joined #archiveteam
[11:49] *** dserodio has joined #archiveteam
[11:51] *** xk_id has quit IRC (Remote host closed the connection)
[12:00] *** Morbus has joined #archiveteam
[12:18] *** Morbus has quit IRC (Quit: http://www.disobey.com/)
[12:27] *** Morbus has joined #archiveteam
[12:31] *** nertzy has joined #archiveteam
[12:33] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[12:44] Ghost_of_: let's hope it will be back. It would worth contacting them that they shouldn't go yet.
[12:46] We would know more about their problems, also maybe they would be cooperative in archiving them.
[12:56] paging arkiver
[12:56] arkiver ping
[12:57] bzc6p_: Ghost_of_: http://help.yuku.com/support/discussions/topics/3000161234
[12:57] " All Freeforums, Yuku, Lefora and Forumer networks are down at this time. Our technicians are working to bring everything back online. Thank you for your patience as this issue is addressed. We apologize for the inconvenience this is causing you"
[12:58] (support desks are often hosted off-site)
[13:01] *** lhobas has quit IRC (Ping timeout: 252 seconds)
[13:02] *** lhobas has joined #archiveteam
[13:05] *** schbirid has quit IRC (Quit: Leaving)
[13:11] *** primus104 has quit IRC (Leaving.)
[13:24] *** WinterFox has quit IRC (Remote host closed the connection)
[13:27] bzc6p_: the question is if they will be friendly
[13:28] joepie91: thanks, makes me slightly less nervous
[13:28] yuku.com is still gone, though
[13:28] Ghost_of_: yes, everything is down, it's an outage
[13:29] there was some discussion on the boards re. Yuku policies ... I can't link to it now, but it was suggested they will do their best to stop us, per the EULA
[13:30] I'm thinking it may be better to continue here
[13:30] arkiver: what software will we be using?
[13:32] *** scyther has joined #archiveteam
[14:30] Ghost_of_: I'll be writing some scripts for the software we're always using
[14:31] That is the tracker, wget-lua and seesaw
[14:31] arkiver: awesome
[14:36] thanks for all your help
[14:42] *** VADemon has joined #archiveteam
[14:51] *** atomotic has joined #archiveteam
[15:12] *** JesseW has joined #archiveteam
[15:15] *** terburg has quit IRC (terburg)
[15:17] Ghost_of_: :)
[15:17] *** Ghost_of_ has quit IRC (Quit: Leaving)
[15:21] *** chfoo has quit IRC (Ping timeout: 310 seconds)
[15:23] *** jspiros has quit IRC (Ping timeout: 186 seconds)
[15:25] Archiving yuku will be fun, looking forward to it :) http://monsterkidclassichorrorforum.yuku.com/reply/1158270/PLEASE-READ-I-have-suggested-the-CHFB-be-archived#reply-1158270
[15:33] *** primus104 has joined #archiveteam
[15:34] *** JesseW has quit IRC (Read error: Operation timed out)
[15:36] *** jspiros has joined #archiveteam
[15:39] hey can i get the permisions to use the #archivebot? i want to get http://adequacy.org/ archived
[15:42] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[15:43] *** atomotic has joined #archiveteam
[15:53] *** Dark_Star has joined #archiveteam
[15:57] *** chfoo has joined #archiveteam
[16:05] *** will has joined #archiveteam
[16:06] *** jspiros has quit IRC (Ping timeout: 186 seconds)
[16:28] *** wp494 has quit IRC (Ping timeout: 606 seconds)
[16:29] *** jspiros has joined #archiveteam
[16:29] *** chfoo has quit IRC (Read error: Operation timed out)
[16:30] *** chfoo has joined #archiveteam
[16:40] *** insane_al has joined #archiveteam
[16:43] *** jspiros has quit IRC (Ping timeout: 186 seconds)
[16:48] *** philpem has joined #archiveteam
[16:51] .title http://www.economist.com/news/business/21676808-marissa-mayer-has-failed-revive-internet-sloth-portal-nowhere
[16:51] or not
[16:52] *** botpie91 has joined #archiveteam
[16:52] .title http://www.economist.com/news/business/21676808-marissa-mayer-has-failed-revive-internet-sloth-portal-nowhere
[16:52] joepie91: A portal to nowhere | The Economist
[16:52] well that was not helpful at all...
[16:52] tl;dr bad news about Yahoo etc
[16:55] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[17:06] *** jspiros has joined #archiveteam
[17:10] *** Ghost_of_ has joined #archiveteam
[17:11] *** xk_id has joined #archiveteam
[17:27] *** Ghost_of_ has quit IRC (Quit: Leaving)
[17:27] *** Ghost_of_ has joined #archiveteam
[17:28] *** Ghost_of_ has quit IRC (Client Quit)
[17:28] *** Ghost_of_ has joined #archiveteam
[17:29] should there be a chan for CHFB/Yuku?
[17:43] *** bzc6p_ is now known as bzc6p
[17:43] *** vitzli has quit IRC (Quit: Leaving)
[17:46] *** test_ has joined #archiveteam
[17:46] *** xk_id has quit IRC (Read error: Connection reset by peer)
[17:46] *** xk_id_ has joined #archiveteam
[17:47] *** test_ has quit IRC (Client Quit)
[17:49] I've linked this room from the board, some people might drop by
[17:58] *** Ghost_of_ has quit IRC (Remote host closed the connection)
[18:01] *** jspiros has quit IRC (Ping timeout: 186 seconds)
[18:27] *** godane has left
[18:35] *** godane has joined #archiveteam
[18:38] *** aaaaaaaaa has joined #archiveteam
[18:38] *** swebb sets mode: +o aaaaaaaaa
[18:42] *** aaaaaaaaa sets mode: +ooo chfoo godane midas
[19:11] *** jspiros has joined #archiveteam
[19:30] *** terburg has joined #archiveteam
[19:32] *** edsu has joined #archiveteam
[19:32] *** swebb sets mode: +o edsu
[20:00] *** atomotic has joined #archiveteam
[20:06] *** Jon has quit IRC (Read error: Operation timed out)
[20:09] *** terburg has quit IRC (Quit: terburg)
[20:28] *** insane_al has quit IRC (Leaving)
[20:29] *** superkuh has joined #archiveteam
[20:33] *** RichardG_ has joined #archiveteam
[20:33] *** RichardG has quit IRC (Read error: Connection reset by peer)
[20:35] *** RedType_ has quit IRC (Quit: leaving)
[20:35] *** RedType has joined #archiveteam
[20:38] *** atomotic has quit IRC (Ping timeout: 252 seconds)
[20:40] *** jmtd has joined #archiveteam
[20:40] *** RichardG_ is now known as RichardG
[20:48] *** aaaaaaaaa has quit IRC (Leaving)
[20:52] *** zenguy_pc has quit IRC (Read error: Connection reset by peer)
[21:08] *** zenguy_pc has joined #archiveteam
[21:26] *** aaaaaaaaa has joined #archiveteam
[21:26] *** swebb sets mode: +o aaaaaaaaa
[21:28] *** wp494 has joined #archiveteam
[21:30] *** scyther has quit IRC (Read error: Connection reset by peer)
[21:48] *** BlueMaxim has joined #archiveteam
[21:53] *** Ghost_of_ has joined #archiveteam
[21:56] heads up: reason to believe that hackforums (yes, that hackforums) may be going away in the near future
[21:59] How soon is 'soon'? There are over 3.3 million threads so that could take a while if we just threw it at ArchiveBot
[22:00] MrRadar: anywhere between now and never, basically
[22:00] more towards now
[22:00] reasoning?
[22:00] probably best not to say that publicly :P
[22:00] it's not confirmed yet, just a likely risk
[22:02] but it'd probably be ideal to have at least a strategy within the next few hours
[22:03] you need to login to see threads
[22:03] HCross: or be googlebot
[22:03] lol
[22:03] rofl, such security
[22:04] HCross: well, it's HF. surely you did not expect anything else? :P
[22:04] it's just a way to get people to reg
[22:04] heh
[22:04] I have an acc somewhere, which I use to laugh at it
[22:05] yeah, basically the same for me
[22:05] can't stand it as a forum
[22:05] but occasionally there's a tidbit of info that I'm looking for
[22:05] which ive just fogotten the password to
[22:05] (eg. scammers on LET)
[22:06] so is this high priority then
[22:07] HCross: I would say so, yes
[22:08] can't stand the place, but it has years upon years of history
[22:08] lots of info that isn't elsewhre
[22:08] elsewhere*
[22:08] I'll have some time tomorrow to write some scripts
[22:08] lots of useful info for researching scammers, too
[22:08] tomorrow afternoon
[22:08] as they almost inevitably have a history on HF
[22:08] but then, is this a warrior job or an archivebot job?
[22:08] any deadline?
[22:08] arkiver: I'm not sure. it's pretty big
[22:08] arkiver: deadline is ASAP
[22:08] and what is the reason they might be going away?
[22:08] it may disappear tomorrow, it may stick around for years
[22:08] arkiver: PM
[22:10] ive just logged in, and the amount of bad skiddies is amazing
[22:13] haha. my scaleway VPN is blocked
[22:14] thats another note, they seem to block datacenter IP's [403 Forbidden Error] - You might be blocked by your IP, Country, or ISP. We also block VPN's, proxies, datacenters and socks.
[22:15] Speaking of dying web forums, the Wizards of the Coast Community Forums archivebot job still has over 8 million pages in its queue and the site dies in 1 week
[22:15] Is there anything that can be done to make it faster? Or are we just going to have to live with whatever we get?
[22:16] upped the concurrency?
[22:17] !concurrency number (keep cranking this up until it errors or breaks the site)
[22:17] I meant !concurrency ident number
[22:18] MrRadar: getting rate limited by tracker? or not enough clients?
[22:19] I'm not sure
[22:22] Hey joepie91 how will we get though cloudflare on hack fourm?or is there already a known way to do that?
[22:22] And while we are at it why not do leak forums as well?
[22:31] pokeball9, probably GoogleBot
[22:33] googlebot, yes
[22:35] also getting 403's from AWS with any useragent just FYI
[22:35] yup. 403's from PoneyTelecom too
[22:35] yeah, this would be an at-home-only project
[22:35] I have a server or 2 under the stairs, they would be fine
[22:35] hmm...
[22:36] one moment while I inquire
[22:37] Any archivebot nodes in the UK?
[22:37] unless they up for archival and we could get a useragent that is allowed
[22:40] very doubtful, tbh
[22:40] Does anyone know if ArchiveBot can handle BBC iPlayer radio pages? like http://www.bbc.co.uk/programmes/p035tvm3
[22:45] wizards community probably needs to be a warrior project
[22:48] HCross: i'm grabbing it
[22:48] hey
[22:50] godane, it doesnt seem to need a UK IP either
[22:50] i know
[22:51] Yeah, the iPlayer video streams require a UK IP but audio streams are worldwide
[22:52] and sadly my OVH "UK" IP's dont work
[22:53] about hackforums
[22:53] there's that db dump from 2011 or so, so we'd only need to get posts from then onwards right?
[23:00] *** slang has quit IRC (Quit: Page closed)
[23:00] might as well grab it all into warcs so its viewable in the wayback
[23:12] marvinw: holy shit I'll take a look. Thank you!
[23:20] *** Ghost_of_ has quit IRC (Quit: Leaving)