[00:01] *** Stiletto has joined #archiveteam-bs
[00:10] *** BlueMaxim has joined #archiveteam-bs
[00:45] *** JesseW has joined #archiveteam-bs
[01:02] *** tomwsmf-a has joined #archiveteam-bs
[01:23] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[01:23] *** JesseW has joined #archiveteam-bs
[01:30] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[01:48] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[02:01] *** ErkDog has quit IRC (Read error: Operation timed out)
[02:07] *** ErkDog has joined #archiveteam-bs
[02:33] i'm starting to upload loveline
[02:47] also i'm going to be uploading the 2 different bitrate mp3s podcasts as one item
[02:48] thats mostly so if one has something wrong with it maybe the other lower or higher bit rate one works
[02:59] https://archive.org/details/loveline-podcast-2006-05-02
[04:13] *** bwn_ has quit IRC (Read error: Operation timed out)
[04:44] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:52] *** Sk1d has joined #archiveteam-bs
[04:52] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[04:58] * Yoshimura scratches head, wonders if yipdw noticed the message.
[05:19] What pipeline?
[05:25] yosh-d-ao
[05:26] Archivebot, got cut connection to server
[05:30] I see
[05:31] What error are you getting?
[05:33] *** dashcloud has quit IRC (Read error: Operation timed out)
[05:36] *** bwn has joined #archiveteam-bs
[05:39] *** JesseW has joined #archiveteam-bs
[05:40] *** dashcloud has joined #archiveteam-bs
[05:48] *** bwn has quit IRC (Read error: Connection reset by peer)
[05:48] *** bwn has joined #archiveteam-bs
[05:58] *** bwn_ has joined #archiveteam-bs
[05:58] *** bwn has quit IRC (Read error: Connection reset by peer)
[06:00] *** bwn_ has quit IRC (Read error: Connection reset by peer)
[06:01] *** bwn has joined #archiveteam-bs
[06:06] i'm uploading my theblaze.com 2013-12 stories pages for last year
[06:06] *from last year
[06:07] i'm also going to be downloaded more bulk grabs of theblaze.com
[06:13] *** Honno has joined #archiveteam-bs
[06:15] *** mismatch has quit IRC (Remote host closed the connection)
[06:17] *** mismatch has joined #archiveteam-bs
[06:26] *** BlueMaxim has joined #archiveteam-bs
[06:35] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[07:13] SketchCow: now you have both copies of this show: https://archive.org/details/loveline-podcast-2006-09-12
[07:13] duration on the low and high mp3s are sometimes different by a lot
[07:23] Btw, is anyone doing the a) voice memo and b) danish social site ?
[07:43] *** VADemon has quit IRC (left4dead)
[07:44] Yoshimura: Are those on the Deathwatch wiki page yet?
[07:44] Yeah, I should re-check that I guess.
[07:45] A *real* calendar instead of that wiki page would be helpful I guess.
[07:45] It is not.
[07:46] I mean not on deathwatch, idk why. arto.com is there.
[07:46] Also there is only one month remaining.
[07:47] The data from site might be a lot lower then Those 2TB, as a lot of is marked private or deleted.
[07:47] btw i'm at 691k items now
[07:47] godane: What you are fetching?
[07:48] *** bwn has quit IRC (Read error: Operation timed out)
[08:02] *** Yoshimura has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client)
[08:30] *** schbirid has joined #archiveteam-bs
[08:45] *** SN4T14 has quit IRC (Read error: Operation timed out)
[08:51] *** Honno has quit IRC (Ping timeout: 1208 seconds)
[08:51] *** bwn has joined #archiveteam-bs
[08:56] *** SN4T14 has joined #archiveteam-bs
[08:58] *** Honno has joined #archiveteam-bs
[08:59] i put this up since other people were looking for 2006 or later mp3s of loveline: https://www.reddit.com/r/Loveline/comments/4h3ss9/list_of_mp3s_going_back_to_may_2006/
[09:05] *** ohhdemgir has joined #archiveteam-bs
[09:07] *** ohhdemgir has quit IRC (Client Quit)
[09:43] *** Yoshimura has joined #archiveteam-bs
[09:55] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[10:37] *** HCross has quit IRC (Ping timeout: 250 seconds)
[10:40] *** HCross has joined #archiveteam-bs
[10:47] hopping off filefront forums
[10:48] only 1 connection at a time is slowwww, lol
[11:46] *** Stiletto is now known as Stilett0|
[11:46] *** Stilett0| is now known as Stilett0
[11:47] *** Stilett0 is now known as Stilett0a
[11:49] *** zino has quit IRC (Read error: Connection reset by peer)
[12:17] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:46] Yoshimura: what is the voice memo site again?
[13:38] *** schbirid has quit IRC (Quit: Leaving)
[14:34] Are there any grabs that could do with 10/20 grabbers thrown at them?
[14:39] nah not really. Even yuku is hitting rate limit
[15:21] *** schbirid has joined #archiveteam-bs
[16:03] what a disappointment : http://www.meatfilter.com/
[16:04] *** JesseW has joined #archiveteam-bs
[16:04] *** atrocity has quit IRC (Ping timeout: 244 seconds)
[16:11] *** schbirid has quit IRC (Quit: Leaving)
[16:19] *** wyatt8740 has quit IRC (Read error: Operation timed out)
[16:28] *** wyatt8740 has joined #archiveteam-bs
[16:30] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[16:35] *** SimpBrain has quit IRC (Remote host closed the connection)
[16:51] *** VADemon has joined #archiveteam-bs
[17:04] hah
[17:16] *** JesseW has joined #archiveteam-bs
[17:52] *** bwn has quit IRC (Ping timeout: 246 seconds)
[18:22] arkiver, ive found something that might be interesting to grab
[18:24] godane, this might also be for you
[18:24] well dump it
[18:24] http://naenara.com.kp/en/periodic/f_trade/detail.php?kind=f_trade&&name=2016/01/01.pdf seems to have a set of north korean periodicals/magazines
[18:27] *** atrocity has joined #archiveteam-bs
[18:31] *** bwn has joined #archiveteam-bs
[18:43] HCross: URLTeam can always use workers
[18:44] Will sort something out
[18:49] HCross: i'm having trouble connecting to that website
[18:49] very slow
[18:51] .kp is the TLD for North Korea and the IP address for it appears to be in NK itself according to iplocation.net
[18:51] Which is strange since most externally-facing NK sites are hosted in Japan since NK barely has any external Internet bandwidth
[19:00] *** VADemon has quit IRC (Quit: left4dead)
[19:02] This is weird. The IP seems to be located in japan, but is in an IP-Range that is geolocated in NK, with an AS, that only has CHINA UNICORN as Upstream and when tracerouted, terminates in China. http://www.tcpiputils.com/browse/ip-address/175.45.176.77
[19:03] So yes, seems to be hosted in NK
[19:03] china unicorn, not unicom?
[19:05] This page says China Unicom provides most of NK's Internet service (with a secondary satellite connection as a backup): https://nknetobserver.github.io/
[19:05] unicom. Fuck autocorrent
[19:05] LOL
[19:10] maybe it was the keming
[19:11] *** wp494_ has quit IRC (Read error: Connection reset by peer)
[19:22] yuku isn't fully maxed out yet
[19:23] Nice find HCross! We should definitely get that into IA
[19:24] urlteam can always use people *investigating* shorteners, but we do actually sometimes fill up with available slots for warriors (I need to see if I can boost the queue sizes on some projects to help with this)
[19:24] JesseW: I made urlteam the default project again for the warrior
[19:25] actually, we should be able to accept more individual warriors right now, as long as they are on differing IPs. bitly is only at 27 clients with a queue of 70
[19:26] arkiver: ok, thanks
[19:26] JesseW, does it matter they are all in the same subnet?
[19:32] HCross: IDK. Probably not? Maybe?
[19:33] I think the local check is just for exact IP address.
[19:33] The services we are scraping may make broader checks, though.
[19:36] * JesseW just switched my local warrior to yuku from filefront
[19:36] er, gamefront
[20:02] *** wp494 has joined #archiveteam-bs
[20:45] *** MrIdea has joined #archiveteam-bs
[20:46] hello again
[20:46] just wondering, does anyone here actually have FTP access to the betaarchive?
[20:48] if so, we really need to get an census of what they actually have, so we can see if some of the stuff they have is else where, so that people that want to try and "steal" some software from the ftp don't download dupes and waste space
[20:49] *** RichardG has quit IRC (Read error: Connection reset by peer)
[21:47] *** RichardG has joined #archiveteam-bs
[22:22] arkiver or whoever it concerns: yuku errors (i guess): http://pastebin.com/raw/UBnHsQPB seemingly kept growing & http://pastebin.com/raw/ZPWkFEQn
[22:27] also -- is there a recommended method of scraping a tumblr? im really only after the text content of all the posts
[22:28] use archivebot with the singletumblr,notumblrnotes ignoresets
[22:28] Use wget or wpull with recursion turned on but the acceptable domain restricted to just the tumblr's domain
[22:28] Or that, which does effectively the same thing
[22:29] ah
[22:29] thanks :)
[22:33] *** Honno has quit IRC (Read error: Operation timed out)
[22:35] Rotab: also, please let me know of tumblr's you'd like archived; I'm making a general project of doing so
[23:12] i'm assuming archive team is backing up the defcon site videos?
[23:28] *** mismatch has quit IRC (Ping timeout: 633 seconds)
[23:28] It looks like we did an ArchiveBot crawl of the site in 2014: http://archive.fart.website/archivebot/viewer/job/743vs
[23:30] JesseW: i just wanted all the tracklists from http://even1ngmass.tumblr.com/ :)
[23:32] Rotab: added to my list
[23:37] *** BlueMaxim has joined #archiveteam-bs
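
Editor's note on the [22:28] wget suggestion: a minimal sketch of what "recursion turned on, domain restricted to the tumblr's domain" might look like. The helper name `build_wget_command` and the one-second wait are assumptions for illustration, not something anyone in the channel specified; the function only builds the argument list and does not run wget.

```python
from urllib.parse import urlparse


def build_wget_command(start_url):
    """Build a wget argument list for a recursive crawl limited to one host.

    Hypothetical helper: the flag selection is an assumption, not the
    channel's agreed recipe.
    """
    host = urlparse(start_url).hostname
    if host is None:
        raise ValueError("no hostname in %r" % (start_url,))
    return [
        "wget",
        "--recursive",        # follow links found on fetched pages
        "--level=inf",        # no limit on recursion depth
        "--domains=" + host,  # accept list: only follow links on this host
        "--wait=1",           # assumed politeness delay between requests
        start_url,
    ]


if __name__ == "__main__":
    # The tumblr mentioned at [23:30] is used as the example start URL.
    print(" ".join(build_wget_command("http://even1ngmass.tumblr.com/")))
```

The list can be handed to `subprocess.run` as-is. wget stays on the starting host by default unless `--span-hosts` is given, so `--domains` here mainly makes the restriction explicit.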