[00:04] OK, just checked and no main UK political party website has been archived since the election announcement, so I'm going to start putting them in.
[00:04] (to archivebot, that is)
[00:05] *** dashcloud has quit IRC (Ping timeout: 245 seconds)
[00:06] *** j08nY has quit IRC (Quit: Leaving)
[00:07] *** dashcloud has joined #archiveteam
[00:22] someone posted a list of sites, twitter, and facebook pages earlier - the non-facebook ones are probably in progress in #archivebot - Facebook has a hardline stance against scraping, so that has to be done individually
[00:23] dashcloud: tapedrive posted a list of candidate pages, as distinct from party websites
[00:24] thanks!
[00:24] dashcloud: Seeing as there are 2303 facebook pages, I think we might have to leave them, as doing them individually would take way too long.
[00:50] *** dashcloud has quit IRC (Ping timeout: 245 seconds)
[00:58] *** dashcloud has joined #archiveteam
[00:59] *** owl_ has joined #archiveteam
[01:09] *** Guest has quit IRC (Read error: Operation timed out)
[01:32] *** bitspill has joined #archiveteam
[01:36] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[01:40] *** BlueMaxim has joined #archiveteam
[01:47] powerKitt: we'll get codeplex
[01:49] icedice2: let's contact imgbox and see if they want to provide a list of images
[01:52] *** tfgbd_znc has joined #archiveteam
[02:07] Eroshare, which as its name implies, is meant for sharing erotic stuff, is shutting down on the 30th
[02:08] I'll link the /r/datahoarder thread for your sake, obviously NSFW if you click through
[02:08] https://www.reddit.com/r/DataHoarder/comments/6g4c3p/erosharecom_nsfw_shutting_down_june_30th/
[02:08] additionally /u/cyberdwarf said imgbox.com and imagebam.com are going down too: https://www.reddit.com/r/DataHoarder/comments/6g4c3p/erosharecom_nsfw_shutting_down_june_30th/dinfnph/
[02:09] *** REiN^ has quit IRC (Read error: Operation timed out)
[02:10] *** REiN^ has joined #archiveteam
[02:19] if we even do bother with grabbing eroshare I suggest we have a separate navbox just for the NSFW and otherwise sketchy stuff
[02:44] *** ndiddy has quit IRC ()
[02:51] *** dashcloud has quit IRC (Read error: Operation timed out)
[02:58] *** Ctrl has joined #archiveteam
[03:16] One of my URLs redirects to a facebook page without a trailing slash. Is archivebot going to go down an infinite path now?
[03:17] The URL that redirects is http://www.wallaseyconservatives.com
[03:17] it shouldn't go above/outside the directory named in the urls that you give it
[03:18] Is that going to be okay?
[03:19] that redirects to https://samething and then to https://www.facebook.com/wallaseyconservatives
[03:19] should be fine
[03:20] Thanks. I'll keep an eye on it anyway.
[03:51] *** Stilett0 has joined #archiveteam
[03:51] *** Stilett0 is now known as Stiletto
[03:52] *** phuzion has quit IRC (Ping timeout: 600 seconds)
[03:53] *** phuzion has joined #archiveteam
[04:07] *** pizzaiolo has joined #archiveteam
[04:08] *** pizzaiolo has quit IRC (Client Quit)
[04:56] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[05:01] *** Sk1d has joined #archiveteam
[05:58] *** arbin_ has quit IRC (Read error: Connection reset by peer)
[06:18] *** owl_ has quit IRC (Quit: owl_)
[06:19] *** kristian_ has joined #archiveteam
[06:49] *** arbin has joined #archiveteam
[07:03] *** j08nY has joined #archiveteam
[07:07] *** ivan has quit IRC (Leaving)
[07:11] *** ivan has joined #archiveteam
[07:45] *** atomotic has joined #archiveteam
[07:48] *** j08nY has quit IRC (Read error: Operation timed out)
[08:48] *** kristian_ has quit IRC (Quit: Leaving)
[09:12] *** owl_ has joined #archiveteam
[09:20] *** Jonison has joined #archiveteam
[09:40] *** owl_ has quit IRC (Quit: owl_)
[09:50] *** Jonison has quit IRC (Quit: Leaving)
[09:55] *** icedice has joined #archiveteam
[10:25] *** gui7 has joined #archiveteam
[10:28] *** j08nY has joined #archiveteam
[10:35] https://i.imgur.com/CfLW7t6.png is this normal? I haven't been able to make any progress whatsoever with pixiv lately with my Warrior
[10:36] Yes, that's normal.
[10:37] All rooms rated for age 18+ are failing currently (on purpose). We already finished all rooms (I think), but requeued everything to get those that failed for other reasons. We'll grab the 18+ rooms later.
[10:52] wp494: and sendvid too
[10:54] What the hell? Are these all run by the same people?
[10:56] *** anhedonis has joined #archiveteam
[10:57] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[11:13] *** atomotic has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
[11:43] *** atomotic has joined #archiveteam
[12:12] *** phuzion has quit IRC (Remote host closed the connection)
[12:24] *** phuzion has joined #archiveteam
[12:42] *** pizzaiolo has joined #archiveteam
[13:01] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[13:51] *** zino_ has joined #archiveteam
[13:55] SourceForge is shaking up their mailing lists on the 26th. No deletion of mails yet, but they will close down lists without activity and mass-unsubscribe users that do not answer back with their country of residency "to comply with electronic messaging and privacy laws".
[13:55] Speculation: Deletion or de-listing of inactive lists in the future seems likely.
[13:55] https://sourceforge.net/blog/sourceforge-project-e-mail-policy-update/
[13:59] *** owl_ has joined #archiveteam
[14:04] *** ZexaronS has joined #archiveteam
[15:17] *** ZexaronS has quit IRC (Leaving)
[15:28] *** odemg has joined #archiveteam
[15:46] ----------------------------------------------------------------
[15:46] INTERNET ARCHIVE WANTS ARCHIVE TEAM STUFF TO GO INTO ARCHIVETEAM COLLECTION
[15:46] If you pump WARCs and stuff into the archive, talk to me about collection access.
[15:47] ----------------------------------------------------------------
[15:47] Actually, e-mail me, jscott@archive.org, so I don't miss it.
[15:47] Turns out some stuff was either going into wayback but the reports were off, or it was NOT going in because people are uploading to the wrong places and there's no derivation
[16:45] *** arbin has quit IRC (Read error: Operation timed out)
[17:18] *** nertzy has joined #archiveteam
[17:23] *** Honno has joined #archiveteam
[17:24] *** ReimuHaku has joined #archiveteam
[17:32] *** MMovie2 has joined #archiveteam
[17:32] *** MMovie has quit IRC (Read error: Operation timed out)
[17:40] *** MMovie has joined #archiveteam
[17:43] *** MMovie2 has quit IRC (Ping timeout: 600 seconds)
[18:16] *** guest has joined #archiveteam
[18:17] *** nertzy2 has joined #archiveteam
[18:18] hi, can you help me figure out how to download tens of thousands of links from a simple list, keeping the directory structure as in the urls?
[18:21] *** nertzy has quit IRC (Read error: Operation timed out)
[18:23] *** nertzy2 has quit IRC (Read error: Connection reset by peer)
[18:23] *** nertzy2 has joined #archiveteam
[18:24] guest: wget or wpull should be able to do that
[18:26] http://www.zdnet.com/article/microsoft-to-shut-down-its-docs-com-file-sharing-site-december-15/
[18:27] Reminder: that's the page that lets you search through all public documents, e.g. for "password".
[18:35] *** godane has quit IRC (Ping timeout: 245 seconds)
[18:36] MrRadar: thank you, found a solution with wget
[18:39] *** godane has joined #archiveteam
[18:40] *** guest has quit IRC (Ping timeout: 268 seconds)
[18:42] *** metaprime has joined #archiveteam
[19:19] *** ItsYoda has quit IRC (Quit: rippppp to the yoda you used to know!)
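For the 18:18 question about downloading a list of links while keeping the directory structure from the URLs: the log does not say which wget invocation guest ended up using, but wget's `-i` (`--input-file`) and `-x` (`--force-directories`) options exist for this purpose. The Python below is only a rough sketch of the same idea, not what anyone in the channel ran. The names `url_to_local_path`, `download_all`, and the `mirror` root directory are made up for illustration, and query strings are ignored, so two URLs that differ only in their query would collide.

```python
import shutil
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlsplit


def url_to_local_path(url: str, root: str = "mirror") -> str:
    """Map a URL to root/host/path, roughly like wget's forced directories."""
    parts = urlsplit(url)
    path = unquote(parts.path) or "/"
    # A URL ending in "/" names a directory; store it as index.html.
    if path.endswith("/"):
        path += "index.html"
    # Drop empty, "." and ".." segments so a URL cannot escape the root.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    host = parts.hostname or "unknown-host"
    return str(PurePosixPath(root, host, *segments))


def download_all(list_file: str, root: str = "mirror") -> None:
    """Fetch every URL in list_file (one per line) into its mapped path."""
    with open(list_file, encoding="utf-8") as handle:
        for line in handle:
            url = line.strip()
            if not url:
                continue
            target = Path(url_to_local_path(url, root))
            target.parent.mkdir(parents=True, exist_ok=True)
            with urllib.request.urlopen(url) as response, open(target, "wb") as out:
                shutil.copyfileobj(response, out)
```

For tens of thousands of URLs, wget or wpull (as suggested at 18:24) remain the more practical choice, since this sketch has no retries, rate limiting, or WARC output.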
[19:20] *** metaprime has quit IRC (Quit: Page closed)
[19:25] *** ZexaronS has joined #archiveteam
[19:31] *** schbirid has joined #archiveteam
[19:34] *** gui7 has quit IRC (Read error: Operation timed out)
[19:59] *** ItsYoda has joined #archiveteam
[21:11] *** ndiddy has joined #archiveteam
[21:17] schbirid: http://members.visi.net/~fathom isn't on that. I was wondering if there's any archive of the site, not listed on internet archive, that might have it
[21:17] no idea :)
[22:07] *** owl_ has quit IRC (Quit: owl_)
[22:29] *** Honno has quit IRC (Read error: Operation timed out)
[22:49] *** nertzy2 has quit IRC (Quit: This computer has gone to sleep)
[22:59] *** icedice has quit IRC (Quit: Leaving)
[23:10] *** MMovie has quit IRC (Read error: Operation timed out)
[23:12] *** MMovie has joined #archiveteam
[23:28] *** ZexaronS has quit IRC (Leaving)
[23:39] *** db48x has quit IRC (Read error: Connection reset by peer)
[23:55] *** db48x has joined #archiveteam