#archiveteam-bs 2015-02-27,Fri


Time Nickname Message
00:27 🔗 SmileyG has quit IRC (Remote host closed the connection)
00:29 🔗 Kirk has quit IRC (Ping timeout: 240 seconds)
00:30 🔗 BlueMaxim has quit IRC (Ping timeout: 265 seconds)
00:30 🔗 Kirk has joined #archiveteam-bs
00:31 🔗 Start has quit IRC (Read error: Connection reset by peer)
00:31 🔗 BlueMaxim has joined #archiveteam-bs
00:33 🔗 Start has joined #archiveteam-bs
00:34 🔗 wp494 has quit IRC (Read error: Operation timed out)
00:37 🔗 Smiley has joined #archiveteam-bs
00:49 🔗 wp494 has joined #archiveteam-bs
00:52 🔗 GLaDOS has joined #archiveteam-bs
00:52 🔗 swebb sets mode: +o GLaDOS
01:11 🔗 aaaaaaaaa has quit IRC (Read error: Operation timed out)
01:14 🔗 aaaaaaaaa has joined #archiveteam-bs
01:15 🔗 dashcloud has quit IRC (Read error: Operation timed out)
01:21 🔗 dashcloud has joined #archiveteam-bs
01:32 🔗 primus104 has quit IRC (Leaving.)
02:39 🔗 RedType_ has quit IRC (Remote host closed the connection)
02:58 🔗 mistym has quit IRC (Remote host closed the connection)
03:29 🔗 mistym has joined #archiveteam-bs
03:38 🔗 dashcloud has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 ionpulse has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 pikhq has quit IRC (hub.dk irc.homelien.no)
03:38 🔗 altlabel has quit IRC (hub.dk irc.homelien.no)
03:54 🔗 dashcloud has joined #archiveteam-bs
04:04 🔗 SN4T14 has joined #archiveteam-bs
05:00 🔗 mistym has quit IRC (Remote host closed the connection)
05:23 🔗 aaaaaaaaa has quit IRC (Leaving)
05:45 🔗 mistym has joined #archiveteam-bs
05:57 🔗 mistym has quit IRC (Remote host closed the connection)
06:03 🔗 RedType has joined #archiveteam-bs
06:12 🔗 RedType has quit IRC (Quit: Lost terminal)
06:19 🔗 RedType has joined #archiveteam-bs
06:23 🔗 mistym has joined #archiveteam-bs
06:26 🔗 godane so there may be an archive of PRI's The World on audible.com
06:33 🔗 RedType has quit IRC (Client Quit)
07:04 🔗 Muad-Dib has quit IRC (Ping timeout: 260 seconds)
07:08 🔗 Muad-Dib has joined #archiveteam-bs
07:15 🔗 pikhq has joined #archiveteam-bs
07:17 🔗 ionpulse has joined #archiveteam-bs
07:19 🔗 godane https://medium.com/message/archive-fever-2a330b627274
07:24 🔗 godane "The archivist produces more archive, and that is why the archive is never closed. It opens out of the future."
08:00 🔗 Ctrl-S can we crawl the arduino sites? there's some sort of split happening http://hackaday.com/2015/02/25/arduino-v-arduino/
08:12 🔗 mistym has quit IRC (Remote host closed the connection)
08:18 🔗 primus104 has joined #archiveteam-bs
08:25 🔗 acridAxid has quit IRC (Quit: Quitting)
08:29 🔗 dashcloud has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 wp494 has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 rejon has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Famicoma1 has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 balrog has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 lrkj has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 slash` has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Baljem has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 espes__ has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Mayonaise has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 marvinw has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Cameron_D has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 ohhdemgir has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 eprillios has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 Rickster has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 fenn has quit IRC (west.us.hub irc.eversible.com)
08:29 🔗 xmc has quit IRC (west.us.hub irc.eversible.com)
08:30 🔗 acridAxid has joined #archiveteam-bs
08:30 🔗 wp494_ has joined #archiveteam-bs
08:30 🔗 wp494_ has quit IRC (Excess Flood)
08:30 🔗 wp494_ has joined #archiveteam-bs
08:31 🔗 Rickster` has joined #archiveteam-bs
08:32 🔗 lrkj_ has joined #archiveteam-bs
08:36 🔗 acridAxid has quit IRC (Read error: Operation timed out)
08:41 🔗 acridAxid has joined #archiveteam-bs
08:44 🔗 Rickster` is now known as Rickster
08:44 🔗 rejon has joined #archiveteam-bs
08:44 🔗 balrog has joined #archiveteam-bs
08:44 🔗 slash` has joined #archiveteam-bs
08:44 🔗 Baljem has joined #archiveteam-bs
08:44 🔗 espes__ has joined #archiveteam-bs
08:44 🔗 Mayonaise has joined #archiveteam-bs
08:44 🔗 marvinw has joined #archiveteam-bs
08:44 🔗 Cameron_D has joined #archiveteam-bs
08:44 🔗 xmc has joined #archiveteam-bs
08:44 🔗 irc.eversible.com sets mode: +oo balrog xmc
08:44 🔗 swebb sets mode: +o balrog
08:44 🔗 swebb sets mode: +o xmc
08:48 🔗 fenn has joined #archiveteam-bs
08:51 🔗 fenn has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 rejon has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 balrog has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 slash` has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 Baljem has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 espes__ has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 Mayonaise has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 marvinw has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 Cameron_D has quit IRC (west.us.hub irc.eversible.com)
08:51 🔗 xmc has quit IRC (west.us.hub irc.eversible.com)
08:54 🔗 espes___ has joined #archiveteam-bs
08:55 🔗 schbirid has joined #archiveteam-bs
08:55 🔗 marvinw_ has joined #archiveteam-bs
08:55 🔗 Baljem_ has joined #archiveteam-bs
09:06 🔗 eprillios has joined #archiveteam-bs
09:10 🔗 dashcloud has joined #archiveteam-bs
09:10 🔗 rejon has joined #archiveteam-bs
09:10 🔗 balrog has joined #archiveteam-bs
09:10 🔗 Mayonaise has joined #archiveteam-bs
09:10 🔗 Cameron_D has joined #archiveteam-bs
09:10 🔗 xmc has joined #archiveteam-bs
09:10 🔗 irc.eversible.com sets mode: +oo balrog xmc
09:10 🔗 swebb sets mode: +o balrog
09:10 🔗 swebb sets mode: +o xmc
09:19 🔗 fenn has joined #archiveteam-bs
09:21 🔗 primus104 has quit IRC (Leaving.)
09:46 🔗 eprillios has quit IRC (Ping timeout: 506 seconds)
09:52 🔗 eprillios has joined #archiveteam-bs
09:59 🔗 Famicoman has joined #archiveteam-bs
10:18 🔗 swebb has quit IRC (Read error: Operation timed out)
10:22 🔗 swebb has joined #archiveteam-bs
12:00 🔗 slash` has joined #archiveteam-bs
12:11 🔗 joepie91_ whoop whoop
12:11 🔗 joepie91_ first mirror-to-IA code in Node.js completed
12:11 🔗 joepie91_ handles unicode, newlines, the whole shebang :D
12:12 🔗 * ersi pats joepie91_
12:23 🔗 * BlueMaxim pats ersi
12:25 🔗 primus104 has joined #archiveteam-bs
12:27 🔗 * ersi explodes
12:53 🔗 joepie91_ damnit BlueMaxim
12:53 🔗 joepie91_ I told you not to pat the creeper
12:54 🔗 joepie91_ now I have to rebuild my code
12:54 🔗 joepie91_ :(
12:56 🔗 * BlueMaxim is exploded
13:04 🔗 BlueMaxim has quit IRC (Quit: Leaving)
13:11 🔗 wp494_ has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
13:11 🔗 wp494 has joined #archiveteam-bs
13:24 🔗 ohhdemgir has joined #archiveteam-bs
13:46 🔗 sankin has joined #archiveteam-bs
14:50 🔗 sankin has quit IRC (Leaving.)
14:58 🔗 sankin has joined #archiveteam-bs
15:02 🔗 primus104 has quit IRC (Leaving.)
15:32 🔗 mistym has joined #archiveteam-bs
15:50 🔗 mistym has quit IRC (Remote host closed the connection)
15:53 🔗 mhazinsk has quit IRC (Ping timeout: 186 seconds)
15:56 🔗 mhazinsk has joined #archiveteam-bs
16:11 🔗 mistym has joined #archiveteam-bs
16:33 🔗 Rickster has quit IRC (hub.se efnet.port80.se)
16:33 🔗 Muad-Dib has quit IRC (hub.se efnet.port80.se)
16:33 🔗 GLaDOS has quit IRC (hub.se efnet.port80.se)
16:33 🔗 WubTheCap has quit IRC (hub.se efnet.port80.se)
16:33 🔗 Sue_ has quit IRC (hub.se efnet.port80.se)
16:33 🔗 danneh_ has quit IRC (hub.se efnet.port80.se)
16:33 🔗 deathy has quit IRC (hub.se efnet.port80.se)
16:36 🔗 aaaaaaaaa has joined #archiveteam-bs
16:37 🔗 Sue__ has joined #archiveteam-bs
16:49 🔗 WubTheCap has joined #archiveteam-bs
17:12 🔗 mistym has quit IRC (Remote host closed the connection)
17:27 🔗 Start_ has joined #archiveteam-bs
17:27 🔗 Start has quit IRC (Read error: Connection reset by peer)
17:27 🔗 Start_ is now known as Start
17:38 🔗 primus104 has joined #archiveteam-bs
17:46 🔗 primus104 has quit IRC (Leaving.)
17:59 🔗 xmc sets mode: +o swebb
18:06 🔗 mistym has joined #archiveteam-bs
18:07 🔗 mistym_ has joined #archiveteam-bs
18:16 🔗 mistym has quit IRC (Ping timeout: 600 seconds)
18:27 🔗 primus104 has joined #archiveteam-bs
18:34 🔗 slash` has quit IRC (Ping timeout: 512 seconds)
18:36 🔗 balrog has quit IRC (Ping timeout: 512 seconds)
18:37 🔗 balrog has joined #archiveteam-bs
18:37 🔗 swebb sets mode: +o balrog
19:11 🔗 RedType has joined #archiveteam-bs
19:23 🔗 slash` has joined #archiveteam-bs
19:25 🔗 RedType_ has joined #archiveteam-bs
19:26 🔗 RedType_ has quit IRC (Client Quit)
19:28 🔗 RedType_ has joined #archiveteam-bs
19:42 🔗 RedType has quit IRC (Quit: Lost terminal)
20:12 🔗 godane SketchCow: i'm grabbing web ahead episodes
20:13 🔗 kyan has quit IRC (Quit: Leaving)
20:17 🔗 BlueMaxim has joined #archiveteam-bs
20:19 🔗 ersi Wow, unbelievably sad..
20:19 🔗 ersi Idiots in "IS" have destroyed pieces up to 3500 years old at the museum of Mosul, northern Iraq
20:23 🔗 godane i heard about that yesterday on glenn beck show
20:30 🔗 sep332 was the Google Reader grab really 9TB or am I mathing wrong?
20:59 🔗 RedType_ has quit IRC (Quit: leaving)
20:59 🔗 RedType has joined #archiveteam-bs
21:01 🔗 SketchCow http://imgur.com/gallery/KO9V4
21:07 🔗 godane i'm uploading 2008-03 urls of theguardian.com
21:08 🔗 godane turns out i grabbed that in 20141006
21:10 🔗 mistym_ has quit IRC (Remote host closed the connection)
21:31 🔗 DFJustin they took out the mosul library too http://www.csmonitor.com/Books/chapter-and-verse/2015/0225/ISIS-burns-Mosul-library-Why-terrorists-target-books
21:56 🔗 mistym has joined #archiveteam-bs
22:01 🔗 sankin has quit IRC (Leaving.)
22:15 🔗 n00b740 has joined #archiveteam-bs
22:15 🔗 n00b740 has quit IRC (Client Quit)
22:28 🔗 lelandbat has joined #archiveteam-bs
22:28 🔗 cbb2 has joined #archiveteam-bs
22:29 🔗 lelandbat Hello?
22:31 🔗 xmc sup
22:33 🔗 lelandbat I was invited to the channel by sep332 to discuss a quandary I'm facing
22:33 🔗 lelandbat I'll just dump here:
22:36 🔗 lelandbat A while ago, I built a site to make gifs out of YouTube videos. I put no restrictions on size, duration, or number. Since then, about 30,000 gifs have been made, about 200GB in total.
22:38 🔗 lelandbat I don't want to host 200GB of gifs, many of which are not particularly popular. What do I do?
22:39 🔗 xmc hm
22:39 🔗 lelandbat Or more accurately, what would people recommend that I do? I have a backup of all gifs on my own computer, but I've been deleting old gifs on the site in an effort to reduce space.
22:39 🔗 xmc do you want to keep them at their current urls
22:39 🔗 xmc ok
22:39 🔗 xmc so, if you don't care that much, stick them into a .tar file and then http://archive.org/upload/
22:40 🔗 xmc best to include a file with original urls, other metadata
22:40 🔗 lelandbat How would I include this metadata?
22:40 🔗 xmc uh, what do you have now and what form is it in
22:42 🔗 godane so i think archive doesn't count duplicates right
22:43 🔗 godane the 0511full.wma in theworld.ort/content url will say the 2 urls are unique
22:43 🔗 godane i grabbed both mp3s and they have the same md5sum
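A minimal sketch of the check godane describes, assuming two already-downloaded local files: hash both and compare the digests. The function names are hypothetical, and this only illustrates comparing content by MD5; it says nothing about how archive.org itself treats duplicate URLs.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large audio files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same MD5, whatever their names or URLs."""
    return md5_of(path_a) == md5_of(path_b)
```

Two files fetched from different URLs that return True here are byte-identical for practical purposes, which is the situation described above.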
22:47 🔗 lelandbat Right now, I have no metadata (other than the preserved file creation dates). Since the files were just served out of an apache directory, the url for any given gif is just a hostname+path+gif_name
22:52 🔗 lelandbat What format should metadata be in?
22:59 🔗 xmc if all you've got is file modification times and paths, tar should be fine
22:59 🔗 xmc if you have anything else relevant, like what youtube video they came out of, you probably should include that as well
23:00 🔗 xmc in a csv or whatever. it's not a giant machine-readable dataset. probably just useful to humans. i'd suggest writing up a quick readme too
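xmc's suggestion (a plain tar plus a small CSV of original URLs) could be sketched roughly as follows. The function name, the CSV column names and the `base_url` argument are assumptions for illustration, not anything specified in the channel, and `tar_path` is assumed to live outside `gif_dir`. The tar headers keep each file's modification time on their own; the CSV just makes the original URLs readable to humans.

```python
import csv
import os
import tarfile
from datetime import datetime, timezone


def build_archive(gif_dir, base_url, tar_path, manifest_name="manifest.csv"):
    """Pack gif_dir into an uncompressed tar plus a CSV of original URLs."""
    rows = []
    for root, _dirs, files in os.walk(gif_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, gif_dir).replace(os.sep, "/")
            mtime = datetime.fromtimestamp(os.path.getmtime(full), tz=timezone.utc)
            # Original URL = hostname + path + gif name, as the site served it.
            rows.append((rel, base_url.rstrip("/") + "/" + rel, mtime.isoformat()))
    rows.sort()

    # Write the manifest next to the tar, then add it at the tar's top level.
    out_dir = os.path.dirname(os.path.abspath(tar_path))
    manifest_path = os.path.join(out_dir, manifest_name)
    with open(manifest_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "original_url", "modified_utc"])
        writer.writerows(rows)

    with tarfile.open(tar_path, "w") as tar:  # mode "w" = plain tar, no compression
        tar.add(gif_dir, arcname=os.path.basename(os.path.normpath(gif_dir)))
        tar.add(manifest_path, arcname=manifest_name)
    return len(rows)
```

If more metadata turns up later, such as the source YouTube video for each gif, it can go in as an extra CSV column, with a short readme describing the columns.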
23:17 🔗 cbb2 has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 primus104 has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 schbirid has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 Zebranky_ has quit IRC (hub.dk irc.efnet.pl)
23:17 🔗 cbb2 has joined #archiveteam-bs
23:17 🔗 primus104 has joined #archiveteam-bs
23:17 🔗 schbirid has joined #archiveteam-bs
23:17 🔗 Zebranky_ has joined #archiveteam-bs
23:18 🔗 lelandbat Ah, I see. But if I could get that metadata, such as the original video a gif was from, that would be interesting?
23:22 🔗 DFJustin IA can address files inside of .tar as well so you could set up your server to redirect to the archived versions and not break links
23:22 🔗 DFJustin the more metadata the merrier generally
23:24 🔗 lelandbat How would I set up my server to redirect to archived versions? I thought about doing that myself: converting the gifs into much more efficient webm videos, then redirecting to those. I'd be fine with that since webm is about 15x smaller than gifs (in my testing), but I don't know how to set up the redirects.
23:24 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
23:24 🔗 DFJustin look for a mod_rewrite tutorial for apache, it's not too hard
23:31 🔗 DFJustin once you have the .tar uploaded to an internet archive item like this https://archive.org/details/ftp.cs.umanitoba.ca
23:32 🔗 godane https://archive.org/details/Y2K_Family_Survival_Guide_With_Leonard_Nimoy_Palsojom1.X264.CG
23:32 🔗 DFJustin you can then access files inside like this http://archive.org/download/ftp.cs.umanitoba.ca/2014.06.ftp.cs.umanitoba.ca.tar/ftp.cs.umanitoba.ca/pub/andersj/nelson/CGoodmanR.jpg
23:32 🔗 DFJustin basically archive.org/itemname/filename.tar/directories/inside/tar/file.name
23:33 🔗 lelandbat oh, how cool!
23:33 🔗 DFJustin er archive.org/download/itemname/filename.tar/directories/inside/tar/file.name
23:33 🔗 DFJustin we've been systematically putting up old ftps this way https://archive.org/details/ftpsites
23:34 🔗 lelandbat How could I efficiently upload 200+GB to Archive.org?
23:34 🔗 DFJustin note that it will redirect to a url like ia802506.us.archive.org, you don't want to use that as periodically the ia##### server number will change
23:34 🔗 lelandbat good to know
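A rough sketch of the redirect idea, assuming the pattern DFJustin gives (archive.org/download/itemname/filename.tar/path/inside/tar). It builds the stable archive.org/download URL rather than the ia###### host it redirects to, and prints one Apache mod_rewrite line. The function names, the item name `my-gif-item`, the tar name and the `gifs` directory are all hypothetical; the rule also assumes `RewriteEngine On` is already set and that the local prefix contains no regex metacharacters.

```python
from urllib.parse import quote


def archived_url(item, tar_name, inner_path):
    """Build the stable archive.org/download URL for a file inside a tar."""
    parts = [item, tar_name] + inner_path.strip("/").split("/")
    # Percent-encode each path segment separately so slashes stay as separators.
    return "https://archive.org/download/" + "/".join(quote(p) for p in parts)


def rewrite_rule(local_prefix, item, tar_name, tar_prefix):
    """Return one mod_rewrite line redirecting local gif URLs to the archive."""
    target = archived_url(item, tar_name, tar_prefix) + "/$1"
    pattern = "^/?" + local_prefix.strip("/") + r"/(.+\.gif)$"
    # R=301 marks the redirect as permanent; L stops further rule processing.
    return f"RewriteRule {pattern} {target} [R=301,L]"


if __name__ == "__main__":
    print(rewrite_rule("gifs", "my-gif-item", "gifs.tar", "gifs"))
```

Testing the generated rule against a small trial item first fits the advice later in the log about not starting with the full 200GB.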
23:36 🔗 DFJustin there is a command line upload tool https://pypi.python.org/pypi/internetarchive
23:36 🔗 lelandbat perfect, thank you!
23:36 🔗 DFJustin I would recommend testing with something less than 200gb first though
23:37 🔗 BlueMaxim has joined #archiveteam-bs
23:38 🔗 xmc you can delete and replace files within an item
23:38 🔗 xmc but you can't delete an item
23:38 🔗 xmc you can mark an item as 'test' when creating it, which means it'll be deleted soonish
23:40 🔗 DFJustin .zip also works but is probably pointless for .gif content
23:40 🔗 DFJustin .tar.gz etc does not work
23:45 🔗 Jonimus has joined #archiveteam-bs
23:50 🔗 nico_32_ has joined #archiveteam-bs
23:50 🔗 nico_32 has quit IRC (Read error: Connection reset by peer)
23:50 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
23:53 🔗 BlueMaxim has joined #archiveteam-bs
23:57 🔗 xmc yep. plain tarfile
23:58 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
23:59 🔗 BlueMaxim has joined #archiveteam-bs
23:59 🔗 schbirid is linking to files inside tar files performant enough and welcomed by IA?
23:59 🔗 schbirid i mean lots of tiny files
