Time |
Nickname |
Message |
00:27
🔗
|
|
SmileyG has quit IRC (Remote host closed the connection) |
00:29
🔗
|
|
Kirk has quit IRC (Ping timeout: 240 seconds) |
00:30
🔗
|
|
BlueMaxim has quit IRC (Ping timeout: 265 seconds) |
00:30
🔗
|
|
Kirk has joined #archiveteam-bs |
00:31
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
00:31
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
00:33
🔗
|
|
Start has joined #archiveteam-bs |
00:34
🔗
|
|
wp494 has quit IRC (Read error: Operation timed out) |
00:37
🔗
|
|
Smiley has joined #archiveteam-bs |
00:49
🔗
|
|
wp494 has joined #archiveteam-bs |
00:52
🔗
|
|
GLaDOS has joined #archiveteam-bs |
00:52
🔗
|
|
swebb sets mode: +o GLaDOS |
01:11
🔗
|
|
aaaaaaaaa has quit IRC (Read error: Operation timed out) |
01:14
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
01:15
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
01:21
🔗
|
|
dashcloud has joined #archiveteam-bs |
01:32
🔗
|
|
primus104 has quit IRC (Leaving.) |
02:39
🔗
|
|
RedType_ has quit IRC (Remote host closed the connection) |
02:58
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
03:29
🔗
|
|
mistym has joined #archiveteam-bs |
03:38
🔗
|
|
dashcloud has quit IRC (hub.dk irc.homelien.no) |
03:38
🔗
|
|
ionpulse has quit IRC (hub.dk irc.homelien.no) |
03:38
🔗
|
|
pikhq has quit IRC (hub.dk irc.homelien.no) |
03:38
🔗
|
|
altlabel has quit IRC (hub.dk irc.homelien.no) |
03:54
🔗
|
|
dashcloud has joined #archiveteam-bs |
04:04
🔗
|
|
SN4T14 has joined #archiveteam-bs |
05:00
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
05:23
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
05:45
🔗
|
|
mistym has joined #archiveteam-bs |
05:57
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
06:03
🔗
|
|
RedType has joined #archiveteam-bs |
06:12
🔗
|
|
RedType has quit IRC (Quit: Lost terminal) |
06:19
🔗
|
|
RedType has joined #archiveteam-bs |
06:23
🔗
|
|
mistym has joined #archiveteam-bs |
06:26
🔗
|
godane |
so there may be an archive of PRI's The World on audible.com |
06:33
🔗
|
|
RedType has quit IRC (Client Quit) |
07:04
🔗
|
|
Muad-Dib has quit IRC (Ping timeout: 260 seconds) |
07:08
🔗
|
|
Muad-Dib has joined #archiveteam-bs |
07:15
🔗
|
|
pikhq has joined #archiveteam-bs |
07:17
🔗
|
|
ionpulse has joined #archiveteam-bs |
07:19
🔗
|
godane |
https://medium.com/message/archive-fever-2a330b627274 |
07:24
🔗
|
godane |
"The archivist produces more archive, and that is why the archive is never closed. It opens out of the future." |
08:00
🔗
|
Ctrl-S |
can we crawl the arduino sites? there's some sort of split happening http://hackaday.com/2015/02/25/arduino-v-arduino/ |
08:12
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
08:18
🔗
|
|
primus104 has joined #archiveteam-bs |
08:25
🔗
|
|
acridAxid has quit IRC (Quit: Quitting) |
08:29
🔗
|
|
dashcloud has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
wp494 has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
rejon has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
Famicoma1 has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
balrog has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
lrkj has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
slash` has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
Baljem has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
espes__ has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
Mayonaise has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
marvinw has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
Cameron_D has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
ohhdemgir has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
eprillios has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
Rickster has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
fenn has quit IRC (west.us.hub irc.eversible.com) |
08:29
🔗
|
|
xmc has quit IRC (west.us.hub irc.eversible.com) |
08:30
🔗
|
|
acridAxid has joined #archiveteam-bs |
08:30
🔗
|
|
wp494_ has joined #archiveteam-bs |
08:30
🔗
|
|
wp494_ has quit IRC (Excess Flood) |
08:30
🔗
|
|
wp494_ has joined #archiveteam-bs |
08:31
🔗
|
|
Rickster` has joined #archiveteam-bs |
08:32
🔗
|
|
lrkj_ has joined #archiveteam-bs |
08:36
🔗
|
|
acridAxid has quit IRC (Read error: Operation timed out) |
08:41
🔗
|
|
acridAxid has joined #archiveteam-bs |
08:44
🔗
|
|
Rickster` is now known as Rickster |
08:44
🔗
|
|
rejon has joined #archiveteam-bs |
08:44
🔗
|
|
balrog has joined #archiveteam-bs |
08:44
🔗
|
|
slash` has joined #archiveteam-bs |
08:44
🔗
|
|
Baljem has joined #archiveteam-bs |
08:44
🔗
|
|
espes__ has joined #archiveteam-bs |
08:44
🔗
|
|
Mayonaise has joined #archiveteam-bs |
08:44
🔗
|
|
marvinw has joined #archiveteam-bs |
08:44
🔗
|
|
Cameron_D has joined #archiveteam-bs |
08:44
🔗
|
|
xmc has joined #archiveteam-bs |
08:44
🔗
|
|
irc.eversible.com sets mode: +oo balrog xmc |
08:44
🔗
|
|
swebb sets mode: +o balrog |
08:44
🔗
|
|
swebb sets mode: +o xmc |
08:48
🔗
|
|
fenn has joined #archiveteam-bs |
08:51
🔗
|
|
fenn has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
rejon has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
balrog has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
slash` has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
Baljem has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
espes__ has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
Mayonaise has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
marvinw has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
Cameron_D has quit IRC (west.us.hub irc.eversible.com) |
08:51
🔗
|
|
xmc has quit IRC (west.us.hub irc.eversible.com) |
08:54
🔗
|
|
espes___ has joined #archiveteam-bs |
08:55
🔗
|
|
schbirid has joined #archiveteam-bs |
08:55
🔗
|
|
marvinw_ has joined #archiveteam-bs |
08:55
🔗
|
|
Baljem_ has joined #archiveteam-bs |
09:06
🔗
|
|
eprillios has joined #archiveteam-bs |
09:10
🔗
|
|
dashcloud has joined #archiveteam-bs |
09:10
🔗
|
|
rejon has joined #archiveteam-bs |
09:10
🔗
|
|
balrog has joined #archiveteam-bs |
09:10
🔗
|
|
Mayonaise has joined #archiveteam-bs |
09:10
🔗
|
|
Cameron_D has joined #archiveteam-bs |
09:10
🔗
|
|
xmc has joined #archiveteam-bs |
09:10
🔗
|
|
irc.eversible.com sets mode: +oo balrog xmc |
09:10
🔗
|
|
swebb sets mode: +o balrog |
09:10
🔗
|
|
swebb sets mode: +o xmc |
09:19
🔗
|
|
fenn has joined #archiveteam-bs |
09:21
🔗
|
|
primus104 has quit IRC (Leaving.) |
09:46
🔗
|
|
eprillios has quit IRC (Ping timeout: 506 seconds) |
09:52
🔗
|
|
eprillios has joined #archiveteam-bs |
09:59
🔗
|
|
Famicoman has joined #archiveteam-bs |
10:18
🔗
|
|
swebb has quit IRC (Read error: Operation timed out) |
10:22
🔗
|
|
swebb has joined #archiveteam-bs |
12:00
🔗
|
|
slash` has joined #archiveteam-bs |
12:11
🔗
|
joepie91_ |
whoop whoop |
12:11
🔗
|
joepie91_ |
first mirror-to-IA code in Node.js completed |
12:11
🔗
|
joepie91_ |
handles unicode, newlines, the whole shebang :D |
12:12
🔗
|
* |
ersi pats joepie91_ |
12:23
🔗
|
* |
BlueMaxim pats ersi |
12:25
🔗
|
|
primus104 has joined #archiveteam-bs |
12:27
🔗
|
* |
ersi explodes |
12:53
🔗
|
joepie91_ |
damnit BlueMaxim |
12:53
🔗
|
joepie91_ |
I told you not to pat the creeper |
12:54
🔗
|
joepie91_ |
now I have to rebuild my code |
12:54
🔗
|
joepie91_ |
:( |
12:56
🔗
|
* |
BlueMaxim is exploded |
13:04
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
13:11
🔗
|
|
wp494_ has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES) |
13:11
🔗
|
|
wp494 has joined #archiveteam-bs |
13:24
🔗
|
|
ohhdemgir has joined #archiveteam-bs |
13:46
🔗
|
|
sankin has joined #archiveteam-bs |
14:50
🔗
|
|
sankin has quit IRC (Leaving.) |
14:58
🔗
|
|
sankin has joined #archiveteam-bs |
15:02
🔗
|
|
primus104 has quit IRC (Leaving.) |
15:32
🔗
|
|
mistym has joined #archiveteam-bs |
15:50
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
15:53
🔗
|
|
mhazinsk has quit IRC (Ping timeout: 186 seconds) |
15:56
🔗
|
|
mhazinsk has joined #archiveteam-bs |
16:11
🔗
|
|
mistym has joined #archiveteam-bs |
16:33
🔗
|
|
Rickster has quit IRC (hub.se efnet.port80.se) |
16:33
🔗
|
|
Muad-Dib has quit IRC (hub.se efnet.port80.se) |
16:33
🔗
|
|
GLaDOS has quit IRC (hub.se efnet.port80.se) |
16:33
🔗
|
|
WubTheCap has quit IRC (hub.se efnet.port80.se) |
16:33
🔗
|
|
Sue_ has quit IRC (hub.se efnet.port80.se) |
16:33
🔗
|
|
danneh_ has quit IRC (hub.se efnet.port80.se) |
16:33
🔗
|
|
deathy has quit IRC (hub.se efnet.port80.se) |
16:36
🔗
|
|
aaaaaaaaa has joined #archiveteam-bs |
16:37
🔗
|
|
Sue__ has joined #archiveteam-bs |
16:49
🔗
|
|
WubTheCap has joined #archiveteam-bs |
17:12
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
17:27
🔗
|
|
Start_ has joined #archiveteam-bs |
17:27
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
17:27
🔗
|
|
Start_ is now known as Start |
17:38
🔗
|
|
primus104 has joined #archiveteam-bs |
17:46
🔗
|
|
primus104 has quit IRC (Leaving.) |
17:59
🔗
|
|
xmc sets mode: +o swebb |
18:06
🔗
|
|
mistym has joined #archiveteam-bs |
18:07
🔗
|
|
mistym_ has joined #archiveteam-bs |
18:16
🔗
|
|
mistym has quit IRC (Ping timeout: 600 seconds) |
18:27
🔗
|
|
primus104 has joined #archiveteam-bs |
18:34
🔗
|
|
slash` has quit IRC (Ping timeout: 512 seconds) |
18:36
🔗
|
|
balrog has quit IRC (Ping timeout: 512 seconds) |
18:37
🔗
|
|
balrog has joined #archiveteam-bs |
18:37
🔗
|
|
swebb sets mode: +o balrog |
19:11
🔗
|
|
RedType has joined #archiveteam-bs |
19:23
🔗
|
|
slash` has joined #archiveteam-bs |
19:25
🔗
|
|
RedType_ has joined #archiveteam-bs |
19:26
🔗
|
|
RedType_ has quit IRC (Client Quit) |
19:28
🔗
|
|
RedType_ has joined #archiveteam-bs |
19:42
🔗
|
|
RedType has quit IRC (Quit: Lost terminal) |
20:12
🔗
|
godane |
SketchCow: i'm grabbing web ahead episodes |
20:13
🔗
|
|
kyan has quit IRC (Quit: Leaving) |
20:17
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
20:19
🔗
|
ersi |
Wow, unbelievably sad.. |
20:19
🔗
|
ersi |
Idiots in "IS" have destroyed pieces up to 3500 years old at the museum of Mosul, northern Iraq |
20:23
🔗
|
godane |
i heard about that yesterday on glenn beck show |
20:30
🔗
|
sep332 |
was the Google Reader grab really 9TB or am I mathing wrong? |
20:59
🔗
|
|
RedType_ has quit IRC (Quit: leaving) |
20:59
🔗
|
|
RedType has joined #archiveteam-bs |
21:01
🔗
|
SketchCow |
http://imgur.com/gallery/KO9V4 |
21:07
🔗
|
godane |
i'm uploading 2008-03 urls of theguardian.com |
21:08
🔗
|
godane |
turns out i grabbed that in 20141006 |
21:10
🔗
|
|
mistym_ has quit IRC (Remote host closed the connection) |
21:31
🔗
|
DFJustin |
they took out the mosul library too http://www.csmonitor.com/Books/chapter-and-verse/2015/0225/ISIS-burns-Mosul-library-Why-terrorists-target-books |
21:56
🔗
|
|
mistym has joined #archiveteam-bs |
22:01
🔗
|
|
sankin has quit IRC (Leaving.) |
22:15
🔗
|
|
n00b740 has joined #archiveteam-bs |
22:15
🔗
|
|
n00b740 has quit IRC (Client Quit) |
22:28
🔗
|
|
lelandbat has joined #archiveteam-bs |
22:28
🔗
|
|
cbb2 has joined #archiveteam-bs |
22:29
🔗
|
lelandbat |
Hello? |
22:31
🔗
|
xmc |
sup |
22:33
🔗
|
lelandbat |
I was invited to the channel by sep332 to discuss a quandary I'm facing |
22:33
🔗
|
lelandbat |
I'll just dump here: |
22:36
🔗
|
lelandbat |
A while ago, I built a site to make gifs out of YouTube videos. I put no restrictions on size, duration, or number. Since then, about 30,000 gifs have been made, about 200GB in total. |
22:38
🔗
|
lelandbat |
I don't want to host 200GB of gifs, many of which are not particularly popular. What do I do? |
22:39
🔗
|
xmc |
hm |
22:39
🔗
|
lelandbat |
Or more accurately, what would people recommend that I do? I have a backup of all gifs on my own computer, but I've been deleting old gifs on the site in an effort to reduce space. |
22:39
🔗
|
xmc |
do you want to keep them at their current urls |
22:39
🔗
|
xmc |
ok |
22:39
🔗
|
xmc |
so, if you don't care that much, stick them into a .tar file and then http://archive.org/upload/ |
22:40
🔗
|
xmc |
best to include a file with original urls, other metadata |
22:40
🔗
|
lelandbat |
How would I include this metadata? |
22:40
🔗
|
xmc |
uh, what do you have now and what form is it in |
22:42
🔗
|
godane |
so i think archive doesn't count duplicates right |
22:43
🔗
|
godane |
the 0511full.wma in theworld.org/content url will say the 2 urls are unique |
22:43
🔗
|
godane |
i grabbed both mp3s and they have the same md5sum |
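The duplicate check godane describes (two downloads with the same md5sum) can be sketched in Python. This is only an illustration: the function names `md5_of` and `group_duplicates` are made up here, and the sketch makes no claim about how archive.org itself treats duplicates.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks to keep memory flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_duplicates(paths):
    """Group paths by MD5 and return only the groups holding more than one file."""
    by_hash = {}
    for path in paths:
        by_hash.setdefault(md5_of(path), []).append(path)
    return [group for group in by_hash.values() if len(group) > 1]
```

Files that land in the same group have identical bytes under different names, which matches the case of two URLs serving the same audio.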
22:47
🔗
|
lelandbat |
Right now, I have no metadata (other than the preserved file creation dates). Since the files were just served out of an apache directory, the url for any given gif is just a hostname+path+gif_name |
22:52
🔗
|
lelandbat |
What format should metadata be in? |
22:59
🔗
|
xmc |
if all you've got is file modification times and paths, tar should be fin |
22:59
🔗
|
xmc |
e |
22:59
🔗
|
xmc |
if you have anything else relevant, like what youtube video they came out of, you probably should include that as well |
23:00
🔗
|
xmc |
in a csv or whatever. it's not a giant machine-readable dataset. probably just useful to humans. i'd suggest writing up a quick readme too |
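A minimal sketch of xmc's suggestion: write a CSV of original URLs and modification times plus a short readme, then pack everything into a plain tar. The names `build_archive`, `manifest.csv`, `README.txt` and the `base_url` argument are assumptions for illustration, not anything lelandbat's site actually uses.

```python
import csv
import os
import tarfile
from datetime import datetime, timezone


def build_archive(gif_dir, base_url, tar_path):
    """Write manifest.csv and README.txt into gif_dir, then tar the directory.

    base_url is the (hypothetical) public prefix the gifs were served from.
    tar_path must live outside gif_dir so the tar does not include itself.
    """
    names = sorted(n for n in os.listdir(gif_dir) if n.lower().endswith(".gif"))

    with open(os.path.join(gif_dir, "manifest.csv"), "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["filename", "original_url", "modified_utc", "bytes"])
        for name in names:
            st = os.stat(os.path.join(gif_dir, name))
            mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
            writer.writerow([name, base_url + name, mtime.isoformat(), st.st_size])

    with open(os.path.join(gif_dir, "README.txt"), "w") as fh:
        fh.write("Gifs made from YouTube videos.\n")
        fh.write("manifest.csv lists each file's original URL and mtime.\n")

    # Mode "w" writes an uncompressed tar; tar itself also keeps the mtimes.
    with tarfile.open(tar_path, "w") as tar:
        tar.add(gif_dir, arcname=os.path.basename(os.path.normpath(gif_dir)))
    return names
```

If the source YouTube video for each gif can be recovered, it would fit as one more column in the same CSV.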
23:17
🔗
|
|
cbb2 has quit IRC (hub.dk irc.efnet.pl) |
23:17
🔗
|
|
primus104 has quit IRC (hub.dk irc.efnet.pl) |
23:17
🔗
|
|
schbirid has quit IRC (hub.dk irc.efnet.pl) |
23:17
🔗
|
|
Zebranky_ has quit IRC (hub.dk irc.efnet.pl) |
23:17
🔗
|
|
cbb2 has joined #archiveteam-bs |
23:17
🔗
|
|
primus104 has joined #archiveteam-bs |
23:17
🔗
|
|
schbirid has joined #archiveteam-bs |
23:17
🔗
|
|
Zebranky_ has joined #archiveteam-bs |
23:18
🔗
|
lelandbat |
Ah, I see. But if I could get that metadata, such as the original video a gif was from, that would be interesting? |
23:22
🔗
|
DFJustin |
IA can address files inside of .tar as well so you could set up your server to redirect to the archived versions and not break links |
23:22
🔗
|
DFJustin |
the more metadata the merrier generally |
23:24
🔗
|
lelandbat |
How would I set up my server to redirect to archived versions? I thought about doing that myself: converting the gifs into much more efficient webm videos, then redirecting to those. I'd be fine with that since webm is about 15x smaller than gifs (in my testing), but I don't know how to set up the redirects. |
23:24
🔗
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
23:24
🔗
|
DFJustin |
look for a mod_rewrite tutorial for apache, it's not too hard |
23:31
🔗
|
DFJustin |
once you have the .tar uploaded to an internet archive item like this https://archive.org/details/ftp.cs.umanitoba.ca |
23:32
🔗
|
godane |
https://archive.org/details/Y2K_Family_Survival_Guide_With_Leonard_Nimoy_Palsojom1.X264.CG |
23:32
🔗
|
DFJustin |
you can then access files inside like this http://archive.org/download/ftp.cs.umanitoba.ca/2014.06.ftp.cs.umanitoba.ca.tar/ftp.cs.umanitoba.ca/pub/andersj/nelson/CGoodmanR.jpg |
23:32
🔗
|
DFJustin |
basically archive.org/itemname/filename.tar/directories/inside/tar/file.name |
23:33
🔗
|
lelandbat |
oh, how cool! |
23:33
🔗
|
DFJustin |
er archive.org/download/itemname/filename.tar/directories/inside/tar/file.name |
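The corrected pattern can be written as a small Python helper, which a redirect handler on the old server could use to compute its `Location` header. This is a sketch only: `archived_url` and `redirect_for` are hypothetical names, and on Apache a real setup would more likely be a mod_rewrite rule than Python.

```python
from urllib.parse import quote


def archived_url(item, tar_name, path_in_tar):
    """Build the archive.org/download URL for a file inside an uploaded .tar."""
    parts = [item, tar_name] + path_in_tar.strip("/").split("/")
    # Percent-encode each path segment separately so the slashes stay intact.
    return "https://archive.org/download/" + "/".join(quote(p) for p in parts)


def redirect_for(request_path, item, tar_name):
    """Map an old local path such as /gifs/cat.gif to a redirect response.

    Assumes the tar was built so that its internal paths mirror the old
    URL paths, which is an assumption about how the upload was prepared.
    """
    location = archived_url(item, tar_name, request_path.lstrip("/"))
    return "302 Found", [("Location", location)]
```

Called with the item and tar from DFJustin's example, `archived_url` reproduces the same path under `archive.org/download/`.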
23:33
🔗
|
DFJustin |
we've been systematically putting up old ftps this way https://archive.org/details/ftpsites |
23:34
🔗
|
lelandbat |
How could I efficiently upload 200+GB to Archive.org? |
23:34
🔗
|
DFJustin |
note that it will redirect to a url like ia802506.us.archive.org, you don't want to use that as periodically the ia##### server number will change |
23:34
🔗
|
lelandbat |
good to know |
23:36
🔗
|
DFJustin |
there is a command line upload tool https://pypi.python.org/pypi/internetarchive |
23:36
🔗
|
lelandbat |
perfect, thank you! |
23:36
🔗
|
DFJustin |
I would recommend testing with something less than 200gb first though |
23:37
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
23:38
🔗
|
xmc |
you can delete and replace files within an item |
23:38
🔗
|
xmc |
but you can't delete an item |
23:38
🔗
|
xmc |
you can mark an item as 'test' when creating it, which means it'll be deleted soonish |
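A hedged sketch of that workflow with the `internetarchive` Python package from the link above: pick a small batch first, then upload it to a throwaway item. `pick_sample` and `upload_sample` are hypothetical helpers. The `upload()` call is the library's own, but the `test_collection` value is my understanding of how an item gets marked as 'test', so check it against the current IA documentation before relying on it.

```python
def pick_sample(files, max_bytes):
    """From (path, size) pairs, return the smallest files that fit in max_bytes."""
    chosen, total = [], 0
    for path, size in sorted(files, key=lambda f: f[1]):
        if total + size > max_bytes:
            break
        chosen.append(path)
        total += size
    return chosen


def upload_sample(identifier, paths):
    """Upload paths into one archive.org item (needs network and credentials)."""
    # Imported lazily so the helper above works without the package installed.
    # Install with `pip install internetarchive`, then run `ia configure` once.
    from internetarchive import upload

    metadata = {
        "title": "gif upload test",
        # Assumption: items in test_collection are removed automatically.
        "collection": "test_collection",
    }
    return upload(identifier, files=paths, metadata=metadata)
```

A trial run might be `upload_sample("my-gif-test-item", pick_sample(files, 500 * 1024**2))`, where the identifier is made up and `files` is a list of `(path, size)` pairs.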
23:40
🔗
|
DFJustin |
.zip also works but is probably pointless for .gif content |
23:40
🔗
|
DFJustin |
.tar.gz etc does not work |
23:45
🔗
|
|
Jonimus has joined #archiveteam-bs |
23:50
🔗
|
|
nico_32_ has joined #archiveteam-bs |
23:50
🔗
|
|
nico_32 has quit IRC (Read error: Connection reset by peer) |
23:50
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
23:53
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
23:57
🔗
|
xmc |
yep. plain tarfile |
23:58
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
23:59
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
23:59
🔗
|
schbirid |
is linking to files inside tar files performant enough and welcomed by IA? |
23:59
🔗
|
schbirid |
i mean lots of tiny files |