#archiveteam-bs 2016-12-15,Thu

Time Nickname Message
00:13 🔗 MrRadar godane: How much data did you get before you had to stop?
00:21 🔗 godane it was only 2.6gb
00:21 🔗 godane i may do a it different now looking at
00:26 🔗 godane anyways the worst that would happen is that all of this data goes to FOS
00:32 🔗 godane ok i'm starting to upload it to FOS
01:05 🔗 alembic what's the purpose of the files ftp_check.py produces in the `archive` directory?
01:15 🔗 brayden_ has joined #archiveteam-bs
01:15 🔗 swebb sets mode: +o brayden_
01:20 🔗 brayden has quit IRC (Ping timeout: 633 seconds)
01:36 🔗 coretx has quit IRC (Remote host closed the connection)
01:37 🔗 Somebody has joined #archiveteam-bs
01:39 🔗 coretx has joined #archiveteam-bs
01:50 🔗 DiscantX has joined #archiveteam-bs
02:08 🔗 i336 has quit IRC (Remote host closed the connection)
02:13 🔗 DiscantX has quit IRC (Read error: Operation timed out)
02:14 🔗 DFJustin has quit IRC (Remote host closed the connection)
02:15 🔗 Start has quit IRC (Remote host closed the connection)
02:19 🔗 Start has joined #archiveteam-bs
02:19 🔗 DFJustin has joined #archiveteam-bs
02:19 🔗 Start has quit IRC (Client Quit)
02:20 🔗 DFJustin has quit IRC (Remote host closed the connection)
02:25 🔗 DFJustin has joined #archiveteam-bs
02:33 🔗 Start has joined #archiveteam-bs
02:57 🔗 Ravenloft has quit IRC ()
02:59 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
03:09 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
03:15 🔗 ravetcofx has joined #archiveteam-bs
03:16 🔗 Sk1d has joined #archiveteam-bs
04:40 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
04:49 🔗 ravetcofx has joined #archiveteam-bs
05:16 🔗 Frogging has quit IRC (El Psy Kongroo!)
05:24 🔗 Frogging has joined #archiveteam-bs
05:32 🔗 Sk1d has quit IRC (Ping timeout: 194 seconds)
05:38 🔗 Sk1d has joined #archiveteam-bs
06:11 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
06:42 🔗 godane so i'm starting to upload BBC World Service Newshour
06:42 🔗 godane i have about 46gb from 2009-06-23 to end of 2011
06:50 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
07:06 🔗 ravetcofx has joined #archiveteam-bs
07:15 🔗 brayden_ is now known as brayden
07:19 🔗 godane SketchCow: did you miss getting a magazine called MCmicrocomputer?
07:20 🔗 godane its a computer magazine from italy
07:21 🔗 dashcloud has quit IRC (Ping timeout: 250 seconds)
07:22 🔗 dashcloud has joined #archiveteam-bs
07:23 🔗 godane based on what i could tell it is not on archive.org
07:24 🔗 BlueMaxim has joined #archiveteam-bs
07:45 🔗 Honno has joined #archiveteam-bs
08:20 🔗 ndiddy has quit IRC (Read error: Connection reset by peer)
09:05 🔗 kristian_ has joined #archiveteam-bs
09:21 🔗 DiscantX has joined #archiveteam-bs
09:29 🔗 GE has joined #archiveteam-bs
09:34 🔗 Smiley has joined #archiveteam-bs
09:35 🔗 RichardG_ has joined #archiveteam-bs
09:39 🔗 kanzure_ has joined #archiveteam-bs
09:39 🔗 Igloo_ has joined #archiveteam-bs
09:39 🔗 GE has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 RichardG has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 Yoshimura has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 godane has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 SmileyG has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 achip has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 superkuh has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 Igloo has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 kanzure has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 tpw_rules has quit IRC (hub.efnet.us irc.Prison.NET)
09:39 🔗 aschmitz has quit IRC (hub.efnet.us irc.Prison.NET)
09:52 🔗 Honno has quit IRC (Read error: Operation timed out)
09:56 🔗 godane has joined #archiveteam-bs
10:50 🔗 DiscantX has quit IRC (Ping timeout: 492 seconds)
11:14 🔗 superkuh has joined #archiveteam-bs
11:15 🔗 Igloo_ is now known as Igloo
11:46 🔗 arkiver PurpleSym: what did you make https://archive.org/download/ftp-mayn-de-2016-08-04 with?
11:47 🔗 arkiver downloading it now too, but if you're here, please let me know
11:57 🔗 PurpleSym arkiver: Uh, wget, I think.
11:58 🔗 PurpleSym Yes, header confirms it was wget.
12:08 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
12:12 🔗 arkiver ah, I didn't know wget does FTP to WARC
12:22 🔗 GE has joined #archiveteam-bs
12:47 🔗 Honno has joined #archiveteam-bs
13:29 🔗 VADemon I did let my slow FTP crawler (1 folder per ~1.5s) take over ftp.sec.gov over night, just to make a file list and get an estimated total size. It's still not finished: the file list alone is 18MB.
13:31 🔗 VADemon most files are 4-20kB in size
13:58 🔗 RichardG_ is now known as RichardG
14:25 🔗 wp494 has quit IRC (Ping timeout: 506 seconds)
14:27 🔗 wp494 has joined #archiveteam-bs
15:22 🔗 REiN^ has quit IRC (Max SendQ exceeded)
15:22 🔗 REiN^ has joined #archiveteam-bs
15:31 🔗 Start has quit IRC (Quit: Disconnected.)
15:32 🔗 REiN^ has quit IRC (Max SendQ exceeded)
15:32 🔗 REiN^ has joined #archiveteam-bs
16:35 🔗 arkiver VADemon: what information are you saving about the files?
16:41 🔗 godane has quit IRC (Quit: Leaving.)
16:41 🔗 godane has joined #archiveteam-bs
16:54 🔗 yipdw ContraCT: download went fine, I think
16:54 🔗 yipdw oops
17:05 🔗 Somebody has quit IRC (Ping timeout: 370 seconds)
17:07 🔗 kristian_ has quit IRC (Quit: Leaving)
17:09 🔗 Fletcher has quit IRC (Ping timeout: 244 seconds)
17:16 🔗 Fletcher has joined #archiveteam-bs
17:51 🔗 Stiletto has quit IRC ()
18:00 🔗 Start has joined #archiveteam-bs
18:12 🔗 Start has quit IRC (Quit: Disconnected.)
18:59 🔗 GE_ has joined #archiveteam-bs
19:00 🔗 GE has quit IRC (Ping timeout: 255 seconds)
19:00 🔗 GE_ is now known as GE
19:23 🔗 SketchCow godane: There's anything I possibly could get
19:23 🔗 SketchCow I always miss a few.
19:32 🔗 jrwr has quit IRC (Remote host closed the connection)
19:38 🔗 godane i think MC microcomputer may have been missed cause its on issuu
19:38 🔗 godane anyways i'm grabbing it and making cbz files out of the images
19:42 🔗 godane even their website only links directly to the issuu.com urls: http://www.mc-online.it/
19:43 🔗 godane and to top it off i had to fix my issuu.sh script so it will work with the new html code
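
[Editor's note] A CBZ file is an ordinary ZIP archive of page images, so the packaging step godane mentions can be sketched in a few lines. This is a minimal illustration, not godane's issuu.sh script; the function name `build_cbz`, the directory layout, and the list of image extensions are all assumptions made for the example.

```python
import zipfile
from pathlib import Path


def build_cbz(image_dir, cbz_path):
    """Pack the page images found in image_dir into a CBZ (a plain ZIP archive).

    Hypothetical helper for illustration only. Pages are added in sorted
    filename order, so page numbers should be zero-padded (page_001.jpg).
    """
    exts = {".jpg", ".jpeg", ".png", ".gif"}  # assumed set of page formats
    pages = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in exts
    )
    # Images are already compressed, so store them without recompression.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as cbz:
        for page in pages:
            cbz.write(page, arcname=page.name)
    return [p.name for p in pages]
```

Comic readers generally show pages in archive or filename order, which is why the sketch sorts the names before writing.
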
19:43 🔗 Start has joined #archiveteam-bs
20:24 🔗 bwn has quit IRC (Ping timeout: 244 seconds)
20:25 🔗 Stiletto has joined #archiveteam-bs
20:35 🔗 bwn has joined #archiveteam-bs
20:37 🔗 Stiletto has quit IRC (Ping timeout: 362 seconds)
20:38 🔗 Stiletto has joined #archiveteam-bs
20:53 🔗 tpw_rules has joined #archiveteam-bs
20:58 🔗 Start has quit IRC (Quit: Disconnected.)
21:03 🔗 Start has joined #archiveteam-bs
21:12 🔗 Stilett0 has joined #archiveteam-bs
21:14 🔗 Stiletto has quit IRC (Ping timeout: 246 seconds)
21:20 🔗 antomati_ is now known as antomatic
21:24 🔗 Stilett0 has quit IRC (Read error: Connection reset by peer)
21:30 🔗 DiscantX has joined #archiveteam-bs
21:49 🔗 Stiletto has joined #archiveteam-bs
21:50 🔗 Smiley waybackmachine.org in google - no meta data because of robots.txt
21:50 🔗 Smiley I LOL'd
21:54 🔗 DiscantX has quit IRC (Ping timeout: 492 seconds)
21:58 🔗 hook54321 Anyone know if there is another way to access the wayback machine? I'm trying to get around a web blocker
22:02 🔗 SketchCow Pay for proxy
22:15 🔗 Famicoma1 has quit IRC (Ping timeout: 260 seconds)
22:19 🔗 hook54321 There isn't an alternate domain similar to how archive.is has archive.li and archive.fo ?
22:20 🔗 Frogging set up a server
22:20 🔗 Frogging SSH tunnel
22:22 🔗 Stiletto has quit IRC (Remote host closed the connection)
22:22 🔗 HCross2 hook54321: remind me in the week, I'll sort something
22:22 🔗 Stiletto has joined #archiveteam-bs
22:27 🔗 Stiletto has quit IRC (Ping timeout: 260 seconds)
22:28 🔗 Stiletto has joined #archiveteam-bs
22:28 🔗 Start has quit IRC (Quit: Disconnected.)
22:35 🔗 sep332 has quit IRC (Quit: konversation out)
22:50 🔗 RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue)
22:51 🔗 RichardG has joined #archiveteam-bs
22:57 🔗 kristian_ has joined #archiveteam-bs
23:00 🔗 hook54321 HCross2: Remind you in a week or during the week?
23:09 🔗 VADemon arkiver: file paths and sizes. it's still not finished.
23:09 🔗 VADemon the website states that users shall use the index files to quickly get what they want, so if we believe that these indexes are a complete representation of the server, we may use them to plan out the mirroring
23:11 🔗 BlueMaxim has joined #archiveteam-bs
23:13 🔗 VADemon 345k+ files so far. i'm thinking of a warrior project..?
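
[Editor's note] A file list of paths and sizes like the one VADemon describes is enough to estimate the total size of a mirror before downloading anything. The sketch below assumes a hypothetical listing format of one `size<TAB>path` entry per line; the actual format of VADemon's list is not given in the log, and `summarize_listing` is a name invented for this example.

```python
def summarize_listing(lines):
    """Return (file_count, total_bytes) for lines shaped like 'size<TAB>path'.

    The line format is an assumption for illustration. Blank or malformed
    lines are skipped rather than treated as errors.
    """
    count = 0
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        size_text, _, path = line.partition("\t")
        if not path or not size_text.isdigit():
            continue  # not a 'size<TAB>path' entry
        count += 1
        total += int(size_text)
    return count, total
```

Because it accepts any iterable of lines, an open file object can be passed directly, so even a large list is processed one line at a time.
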
23:13 🔗 GE has quit IRC (Remote host closed the connection)
23:23 🔗 Smiley has quit IRC (Ping timeout: 250 seconds)
23:24 🔗 Famicoma1 has joined #archiveteam-bs
23:27 🔗 Smiley has joined #archiveteam-bs
23:31 🔗 wp494_ has joined #archiveteam-bs
23:32 🔗 wp494 has quit IRC (Ping timeout: 245 seconds)
23:33 🔗 godane SketchCow: i got ftp://ftp.sec.noaa.gov
23:33 🔗 wp494_ is now known as wp494
23:35 🔗 SketchCow How big
23:36 🔗 godane 2.6gb zip
23:36 🔗 godane it's about 3.8gb uncompressed
23:37 🔗 godane i'm going after smaller ftps so i'm not stuck uploading it forever
23:47 🔗 arkiver it might be better to grab FTPs as WARCs
23:51 🔗 Stiletto has quit IRC (Read error: Operation timed out)
23:52 🔗 godane i'm just grabbing them as zips for the file boneyards
23:52 🔗 Stiletto has joined #archiveteam-bs
