[00:01] *** Martle has joined #archiveteam
[00:06] In case anyone doesn't know, we have drop-dead backups of the Archiveteam Wiki here:
[00:07] https://archive.org/download/archiveteam_wiki_backup
[00:09] Can someone here raise their hand, register, and download all of https://case.law/bulk/ to then put on the Internet Archive?
[00:19] *** Kitaru_ has quit IRC (Quit: This computer has gone to sleep)
[00:50] *** pizzaiolo has joined #archiveteam
[01:12] *** Stilett0 has joined #archiveteam
[01:14] *** Sk1d has quit IRC (Read error: Operation timed out)
[01:14] *** Stiletto has quit IRC (Read error: Operation timed out)
[01:14] *** Stiletto has joined #archiveteam
[01:16] *** Stilett0 has quit IRC (Ping timeout: 260 seconds)
[01:17] *** Sk1d has joined #archiveteam
[01:24] *** Stilett0 has joined #archiveteam
[01:25] *** Stiletto has quit IRC (Ping timeout: 264 seconds)
[01:29] *** Sk1d has quit IRC (Read error: Operation timed out)
[01:32] *** Sk1d has joined #archiveteam
[01:45] *** Sk1d has quit IRC (Read error: Operation timed out)
[01:46] *** Pixi` has joined #archiveteam
[01:47] *** Sk1d has joined #archiveteam
[01:49] *** Pixi has quit IRC (Read error: Operation timed out)
[02:03] *** Kitaru has joined #archiveteam
[02:05] *** BlueMax has joined #archiveteam
[03:11] *** Kitaru has quit IRC (Quit: This computer has gone to sleep)
[03:29] SketchCow: how big do you estimate that stuff is?
[03:30] No friggin' clue
[03:30] *** pizzaiolo has quit IRC (Read error: Connection reset by peer)
[03:31] *** pizzaiolo has joined #archiveteam
[03:32] *** zerkalo has quit IRC (Ping timeout: 264 seconds)
[03:32] *** zerkalo has joined #archiveteam
[03:35] *** RichardG_ has joined #archiveteam
[03:36] *** Pixi has joined #archiveteam
[03:37] *** pizzaiolo has quit IRC (Quit: pizzaiolo)
[03:38] *** RichardG has quit IRC (Ping timeout: 360 seconds)
[03:38] *** randomdes has quit IRC (Ping timeout: 360 seconds)
[03:38] *** VerifiedJ has quit IRC (Read error: Operation timed out)
[03:38] *** randomdes has joined #archiveteam
[03:39] *** dxrt has quit IRC (Ping timeout: 360 seconds)
[03:39] *** Polylith has quit IRC (Ping timeout: 360 seconds)
[03:39] *** dxrt has joined #archiveteam
[03:41] *** saper has quit IRC (Ping timeout: 264 seconds)
[03:41] *** superkuh has quit IRC (Excess Flood)
[03:41] *** SketchCo1 has joined #archiveteam
[03:42] *** swebb sets mode: +o SketchCo1
[03:42] *** Polylith has joined #archiveteam
[03:42] *** unlobito has quit IRC (Ping timeout: 360 seconds)
[03:43] *** unlobito has joined #archiveteam
[03:43] *** Pixi` has quit IRC (Read error: Operation timed out)
[03:43] *** me_ has quit IRC (Read error: Operation timed out)
[03:45] *** MMovie has quit IRC (Read error: Operation timed out)
[03:46] *** Sk1d has quit IRC (Read error: Operation timed out)
[03:47] *** phirephly has quit IRC (Ping timeout: 360 seconds)
[03:47] *** fennec has quit IRC (Ping timeout: 360 seconds)
[03:47] *** twigfoot has quit IRC (Ping timeout: 360 seconds)
[03:47] *** arkiver has quit IRC (Ping timeout: 360 seconds)
[03:47] *** Zebranky has quit IRC (Remote host closed the connection)
[03:47] *** Zebranky has joined #archiveteam
[03:47] *** SketchCow has quit IRC (Read error: Connection reset by peer)
[03:48] *** phirephly has joined #archiveteam
[03:48] *** saper has joined #archiveteam
[03:48] *** Darkstar has quit IRC (Read error: Connection reset by peer)
[03:48] *** eprillios has quit IRC (Ping timeout: 360 seconds)
[03:49] *** twigfoot has joined #archiveteam
[03:49] *** closure has quit IRC (Read error: Operation timed out)
[03:50] *** me_ has joined #archiveteam
[03:50] *** closure has joined #archiveteam
[03:50] *** anarchat has joined #archiveteam
[03:50] *** anarcat has quit IRC (Ping timeout: 360 seconds)
[03:51] *** Sk1d has joined #archiveteam
[03:52] *** eprillios has joined #archiveteam
[03:52] *** SketchCo1 is now known as SketchCow
[03:52] *** arkiver has joined #archiveteam
[03:52] *** gibigian1 has joined #archiveteam
[03:52] *** swebb sets mode: +o arkiver
[03:53] *** gibigiana has quit IRC (Ping timeout: 395 seconds)
[03:53] *** Darkstar has joined #archiveteam
[03:56] *** Pixi has quit IRC (Quit: Pixi)
[03:56] *** MMovie has joined #archiveteam
[03:56] *** anarchat is now known as anarcat
[03:56] *** Stiletto has joined #archiveteam
[03:57] *** fennec has joined #archiveteam
[03:57] *** Pixi has joined #archiveteam
[03:58] *** Stilett0 has quit IRC (Read error: Operation timed out)
[03:59] *** superkuh has joined #archiveteam
[03:59] *** superkuh has quit IRC (Excess Flood)
[04:03] *** Sk1d has quit IRC (Read error: Operation timed out)
[04:05] *** Cameron_D has quit IRC (Read error: Operation timed out)
[04:07] *** JTL has joined #archiveteam
[04:07] *** Sk1d has joined #archiveteam
[04:08] *** Cameron_D has joined #archiveteam
[04:08] *** superkuh has joined #archiveteam
[04:12] *** Stilett0 has joined #archiveteam
[04:17] *** Stiletto has quit IRC (Read error: Operation timed out)
[04:23] *** Dimtree has quit IRC (Read error: Connection reset by peer)
[04:32] *** Dimtree has joined #archiveteam
[04:39] *** Muramasa has quit IRC (Remote host closed the connection)
[04:40] *** Muramasa has joined #archiveteam
[04:51] *** qw3rty117 has joined #archiveteam
[04:57] *** qw3rty116 has quit IRC (Read error: Operation timed out)
[05:32] *** Sk1d has quit IRC (Read error: Operation timed out)
[05:35] *** Sk1d has joined #archiveteam
[05:48] *** Sk1d has quit IRC (Read error: Operation timed out)
[05:52] *** Sk1d has joined #archiveteam
[06:05] *** Sk1d has quit IRC (Read error: Operation timed out)
[06:08] *** Sk1d has joined #archiveteam
[06:18] *** logchfoo0 starts logging #archiveteam at Fri Nov 02 06:18:53 2018
[06:18] *** logchfoo0 has joined #archiveteam
[06:19] *** Petri152 has joined #archiveteam
[06:21] *** Sk1d has quit IRC (Read error: Operation timed out)
[06:25] *** Sk1d has joined #archiveteam
[06:28] *** Kitaru has joined #archiveteam
[06:53] *** aMunster has quit IRC (Read error: Operation timed out)
[06:53] *** aMunster has joined #archiveteam
[07:13] *** Martle has quit IRC (Leaving)
[07:31] *** SmileyG has quit IRC (Read error: Operation timed out)
[07:31] *** Sk1d has quit IRC (Read error: Operation timed out)
[07:31] *** Smiley has joined #archiveteam
[07:34] *** brayden has quit IRC (Ping timeout: 260 seconds)
[07:35] *** Sk1d has joined #archiveteam
[07:36] *** alex__ has joined #archiveteam
[07:47] *** brayden has joined #archiveteam
[07:47] *** swebb sets mode: +o brayden
[07:47] *** Larsenv has quit IRC (Read error: Operation timed out)
[07:48] *** Sk1d has quit IRC (Read error: Operation timed out)
[07:48] *** Kitaru has quit IRC (Quit: This computer has gone to sleep)
[07:50] *** Sk1d has joined #archiveteam
[08:04] *** Sk1d has quit IRC (Read error: Operation timed out)
[08:07] *** Sk1d has joined #archiveteam
[08:22] *** Petri152 has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** |Ripley| has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** Mayonaise has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** sknebel_ has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** jspiros has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** voker57 has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** S1mpbrain has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** ivan has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** JAA has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** K4k has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** c4rc4s has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:22] *** zyphlar has quit IRC (hub.efnet.us irc.colosolutions.net)
[08:32] *** Larsenv has joined #archiveteam
[08:32] *** Petri152 has joined #archiveteam
[08:32] *** |Ripley| has joined #archiveteam
[08:32] *** Mayonaise has joined #archiveteam
[08:32] *** sknebel_ has joined #archiveteam
[08:32] *** jspiros has joined #archiveteam
[08:32] *** voker57 has joined #archiveteam
[08:32] *** S1mpbrain has joined #archiveteam
[08:32] *** ivan has joined #archiveteam
[08:32] *** JAA has joined #archiveteam
[08:32] *** K4k has joined #archiveteam
[08:32] *** c4rc4s has joined #archiveteam
[08:32] *** zyphlar has joined #archiveteam
[08:32] *** irc.colosolutions.net sets mode: +oo ivan JAA
[08:32] *** swebb sets mode: +o JAA
[08:32] *** bakJAA sets mode: +o JAA
[08:34] *** JAA sets mode: +o bakJAA
[08:46] *** Sk1d has quit IRC (Read error: Operation timed out)
[08:49] *** Sk1d has joined #archiveteam
[09:02] *** Sk1d has quit IRC (Read error: Operation timed out)
[09:03] *** Larsenv has quit IRC (Ping timeout: 246 seconds)
[09:06] *** Sk1d has joined #archiveteam
[09:15] *** Larsenv has joined #archiveteam
[09:17] *** Sk1d has quit IRC (Read error: Operation timed out)
[09:22] *** Sk1d has joined #archiveteam
[09:34] *** Sk1d has quit IRC (Read error: Operation timed out)
[09:34] I want to run a Warrior on an old Core Duo iMac, so I’ll need an old version of VirtualBox or VMware. Is the latest Warrior download compatible with these older versions? Can’t seem to find info on this.
[09:35] The iMac can go as high as Mac OS X 10.6.8
[09:36] *** alex__ has quit IRC (Quit: alex__)
[09:38] *** alex__ has joined #archiveteam
[09:40] *** Sk1d has joined #archiveteam
[09:41] @Sk1d Do you know if Mac OS X 10.6.8 and an older version of VMware or VirtualBox would meet the system requirements for running the Warrior?
[09:42] I just want to put this old machine to good use 24/7
[09:42] why not just try it?
[09:42] don't message people randomly just because you are impatient
[09:43] Yeah I should just try it. You’re right @schbirid
[09:44] Hey @schbirid a few months back, I did a huge scan of the entire internet for anonymous FTP sites and uploaded the findings to archive.org (which I know this effort is not associated with). Is this something that you think might interest the effort?
[09:46] I am not sure if I can engage in random discussion here or if I am going OT. If I am, just scream at me … it’s ok.
[09:46] for random discussion go to #archiveteam-bs, for completely offtopic go to #archiveteam-ot
[09:47] Do you think my query on the FTP scan I did would be more appropriate for the #archiveteam-bs channel?
[09:48] If so I will post it there and of course, as always, apologies when they are due.
[09:50] *** alex__ has quit IRC (Quit: alex__)
[10:01] alex__ : most things somewhat related to archiving are appropriate for #archiveteam-bs
[10:16] *** nertzy has joined #archiveteam
[10:19] *** Larsenv has quit IRC (Ping timeout: 260 seconds)
[10:23] *** VerifiedJ has joined #archiveteam
[10:31] *** Sk1d has quit IRC (Read error: Operation timed out)
[10:33] *** Larsenv has joined #archiveteam
[10:34] *** Sk1d has joined #archiveteam
[10:46] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[10:48] *** pizzaiolo has joined #archiveteam
[10:48] *** Sk1d has quit IRC (Read error: Operation timed out)
[10:50] *** Sk1d has joined #archiveteam
[10:52] *** alex__ has joined #archiveteam
[11:02] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[11:12] Flickr is going to torch part of its collection https://blog.flickr.net/en/2018/11/01/changing-flickr-free-accounts-1000-photos/
[11:13] maybe we could upload them to wikimedia commons, like it was done with panoramio
[11:23] pizzaiolo: there are like 6 million Flickr CC images in Commons https://commons.wikimedia.org/wiki/Category:Files_from_Flickr (count "reviewed by" subcategories)
[11:24] hence the urgency
[11:24] Commons doesn't accept all CC images, just images useful for wikipedia, so it won't work as a vault for all Flickr CC images
[11:25] we can discuss in #flickrfckr
[11:42] *** JAA changes topic to: Archive Team: We're not archive.org | https://archiveteam.org/ | Long discussions: #archiveteam-bs | Offtopic: #archiveteam-ot | Thanks, we know about Flickr
[12:07] *** Sk1d has quit IRC (Read error: Operation timed out)
[12:10] *** Sk1d has joined #archiveteam
[12:14] *** svchfoo1 has joined #archiveteam
[12:14] *** svchfoo3 has joined #archiveteam
[12:14] *** PurpleSym sets mode: +oo svchfoo1 svchfoo3
[12:52] *** Mateon1 has quit IRC (Read error: Operation timed out)
[12:58] *** Mateon1 has joined #archiveteam
[13:15] *** Sk1d has quit IRC (Read error: Operation timed out)
[13:19] *** Sk1d has joined #archiveteam
[13:33] *** Sk1d has quit IRC (Read error: Operation timed out)
[13:35] *** Larsenv has quit IRC (Ping timeout: 246 seconds)
[13:36] *** Sk1d has joined #archiveteam
[13:43] *** Larsenv has joined #archiveteam
[13:48] *** Sk1d has quit IRC (Read error: Operation timed out)
[13:52] *** Sk1d has joined #archiveteam
[14:44] *** matthusby has joined #archiveteam
[15:01] Grabbed case.law/bulk and case.law via grab-site. Uploading to IA now.
[15:03] hmm... should i use grab-site instead of wpull locally?
[15:49] *** Larsenv has quit IRC (Ping timeout: 260 seconds)
[15:51] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[15:52] (apparently my wiki account got purged at some point, and the secret word has changed since i last knew it since the spammers figured it out)
[16:03] Case law upload: https://archive.org/download/case.law
[16:03] anarcat: grab-site uses wpull
[16:10] *** Larsenv has joined #archiveteam
[16:18] *** pizzaiolo has quit IRC (Read error: Connection reset by peer)
[16:19] *** pizzaiolo has joined #archiveteam
[16:35] swebb: sure, but a forked version...
[16:35] i've been struggling to make wpull do the right thing sometimes and i'm wondering if grab-site might be better
[16:35] the main benefit is being able to control things during the crawl
[16:36] like archivebot does? e.g. blocking regexes?
[16:36] yes
[16:36] neat
[16:36] i got to say, all this stuff needs release engineering so bad :p
[16:39] I will be gone the rest of the day soon so if you have any questions hopefully someone else can field them
[16:41] thanks
[16:41] are you @ludios on github?
[16:41] i'm wondering why there's still a wpull fork there... seems the changes could be merged back upstream...
[16:42] ah yes, you are one of the main contributors, sorry
[16:42] yes
[16:42] if JAA wants his changes in first there will be some work to do
[16:42] also I haven't fixed all the tests with the new html scraper
[16:46] understood
[16:46] but JAA's changes are in a PR, and i don't see yours... did i miss something? :)
[16:46] true
[16:47] :)
[16:48] well it seems JAA has some work to do to catch up with your review anyways :p
[16:48] but it does fix some annoying issues like the htmllib stuff
[16:49] that's one gigantic PR too
[16:52] * anarcat reviewed a bunch of PRs on wpull
[16:53] Time to move this to a different channel.
[16:54] If there is enough interest, we could establish #archiveteam-dev.
[16:56] *** Sk1d has quit IRC (Read error: Operation timed out)
[16:56] or is that -bs?
[16:59] *** Sk1d has joined #archiveteam
[17:01] Yeah, -bs for now. If the discussions get too long and start annoying people, we can still move to a separate channel.
[17:03] *** karl_marx has joined #archiveteam
[17:04] Hi, everyone! I've found that in 2015 someone was trying to find a way to create vk.cc shortened links, but couldn't find how to do it. They're created through VK's main website: https://vk.com/cc
[17:05] Oh wait, I'm on the wrong IRC
[17:14] *** wp494 has quit IRC (Read error: Operation timed out)
[17:14] *** Sk1d has quit IRC (Read error: Operation timed out)
[17:14] *** wp494 has joined #archiveteam
[17:18] *** Sk1d has joined #archiveteam
[17:25] *** Pixi has quit IRC (Quit: Pixi)
[17:32] *** Larsenv has quit IRC (Ping timeout: 633 seconds)
[17:59] *** Larsenv has joined #archiveteam
[18:01] is there a channel for flickr archiving yet?
[18:01] I am particularly interested in older/early accounts that have been stale for a while
[18:01] that are non-CC
[18:14] balrog: #flickrfckr
[18:42] *** Stiletto has joined #archiveteam
[18:45] *** Kitaru has joined #archiveteam
[18:46] *** Stilett0 has quit IRC (Read error: Operation timed out)
[18:57] *** karl_marx has quit IRC (Quit: Page closed)
[19:05] *** alex____ has joined #archiveteam
[19:06] *** alex__ has quit IRC (Ping timeout: 252 seconds)
[19:08] *** Pixi has joined #archiveteam
[19:21] *** Martle has joined #archiveteam
[21:05] *** Larsenv has quit IRC (Ping timeout: 260 seconds)
[21:10] *** Sk1d has quit IRC (Read error: Operation timed out)
[21:13] *** Sk1d has joined #archiveteam
[21:22] so i'm looking at coordinating work on archival of brasil public websites and databases, following the regime change over there
[21:22] we have about 30 websites of various sizes we need to archive fairly promptly
[21:22] so far i've been feeding some of those into archivebot, but i would like to scale that up
[21:23] i know a similar effort was done for the US gov in 2016, and i was looking for advice on how to proceed
[21:23] our first need would be to have a collection on archive.org to regroup our uploads
[21:23] in fact, that might become multiple collections...
[21:24] SketchCow, do you think you could spare some cycles to help us with this, esp. with administrative matters?
[21:27] *** Larsenv has joined #archiveteam
[21:47] *** tuluu has quit IRC (Remote host closed the connection)
[21:49] *** tuluu has joined #archiveteam
[22:04] *** BlueMax has joined #archiveteam
[22:52] *** Kitaru_ has joined #archiveteam
[22:54] *** Kitaru has quit IRC (Read error: Operation timed out)
[22:57] *** dxrt_ sets mode: +o dxrt