#archiveteam 2016-01-22,Fri


Time Nickname Message
00:23 🔗 nertzy2 has joined #archiveteam
00:40 🔗 SimpBrain has quit IRC (Read error: Operation timed out)
00:42 🔗 nertzy2 has quit IRC (Quit: This computer has gone to sleep)
00:48 🔗 mismatch_ has joined #archiveteam
01:02 🔗 logchfoo1 starts logging #archiveteam at Fri Jan 22 01:02:44 2016
01:02 🔗 logchfoo1 has joined #archiveteam
01:15 🔗 Ghost_of_ has quit IRC (Read error: Operation timed out)
01:20 🔗 JesseW has joined #archiveteam
01:23 🔗 SimpBrain has joined #archiveteam
01:53 🔗 JesseW has quit IRC (Leaving.)
02:13 🔗 jspiros has quit IRC (leaving)
02:13 🔗 jspiros has joined #archiveteam
02:30 🔗 megaminxw has quit IRC (Quit: Leaving.)
02:30 🔗 JesseW has joined #archiveteam
02:48 🔗 Froggypwn has joined #archiveteam
03:11 🔗 Zebranky_ is now known as Zebranky
03:45 🔗 kyan has joined #archiveteam
03:58 🔗 W1nterFox has joined #archiveteam
04:03 🔗 WinterFox has quit IRC (Read error: Operation timed out)
04:06 🔗 W1nterFox has quit IRC (Read error: Operation timed out)
04:38 🔗 brayden has quit IRC (Quit: Leaving)
04:39 🔗 brayden has joined #archiveteam
04:52 🔗 xmc request for assistance: could someone please help me backup the gitorious disk image? it's just shy of 5T and i would like to have more than one copy
04:53 🔗 xmc looking for serious multiyear commitments
04:53 🔗 xmc ultimately i will figure out how to shove it into IA but it's a bit large for one item
04:53 🔗 xmc or something
04:56 🔗 megaminxw has joined #archiveteam
04:57 🔗 WinterFox has joined #archiveteam
05:00 🔗 fie has quit IRC (Read error: Operation timed out)
05:10 🔗 JesseW I am glad to physically hold on to a copy, but: 1) I'm also in Seattle, so that helps less with geographic separation ; 2) While I can likely afford to buy 5TBs worth of hard drives, I haven't done so yet.
05:10 🔗 JesseW xmc:
05:11 🔗 xmc hi
05:11 🔗 xmc the image is physically on a ceph cluster in san jose, not in my house :P
05:11 🔗 JesseW ah, well then me storing one in Seattle might be more useful. :-)
05:12 🔗 xmc yea
05:12 🔗 JesseW and it should be easier to get it to IA (because you're going to want to use sneakernet)
05:13 🔗 xmc mmmmaybe
05:13 🔗 JesseW why ever not?
05:14 🔗 xmc because that would require traveling and i don't have a place to stay down there and i don't really want to visit the bay area?
05:15 🔗 JesseW Just ask the IA folks to drop by the data center with 5 1T drives, plug them in, then pick them back up.
05:15 🔗 JesseW when they are full
05:15 🔗 * xmc shrug
05:20 🔗 * xmc email info@
05:30 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
05:33 🔗 SketchCow Is there a simple way to split it?
05:33 🔗 SketchCow (He asked)
05:36 🔗 xmc well i could split it by username, each username has somewhere between two and many repositories
05:37 🔗 xmc an item by user would make some sense, though clones are stored in the directory of the account it was cloned *from*
05:37 🔗 MrRadar 5TB drives aren't that expensive these days; when I was bulk-grabbing videos from Blip they were $150 each
05:39 🔗 MrRadar Also, it looks like oldfriends.co.nz is fully shut down. Should the primary domain be blocked from the ArchiveBot job?
05:39 🔗 MrRadar They have a secondary images domain which is still returning data
05:39 🔗 MrRadar At least for now
05:50 🔗 JesseW Ha -- I now know of *68* identifiers that IA will show records (to logged in users) of having (shock, horror) "deleted". ;-P
05:56 🔗 kyan JesseW, wut?
05:56 🔗 kyan I assume deleted meaning darked or something?
05:57 🔗 kyan (My understanding from talking to IA was that nothing is ever deleted)
05:57 🔗 JesseW hehehehehehehehehehehehehehehehe
05:58 🔗 JesseW https://archive.org/history/20_minutes_of_massachusetts
05:59 🔗 kyan JesseW, That's fairly disturbing but all that was there was the meta.xml, the reviews.xml, and an empty dir, apparently
05:59 🔗 kyan I'll have to find where i was told nothing's deleted
06:00 🔗 kyan Frankly I wonder why anyone would go to the trouble of deleting an empty identifier
06:00 🔗 kyan to save 10k of disk space or sthg?
06:01 🔗 * JesseW shrug -- IDK. The ones I've looked at have all been in 2007, so maybe things were different then.
06:01 🔗 kyan Dec 05 14:51:40 <SketchCow> We don't even delete SPAM
06:01 🔗 kyan Dec 05 14:51:59 <SketchCow> Nothing leaves the archive, not a bit
06:01 🔗 kyan JesseW, Ah, hmm
06:03 🔗 SketchCow You're all the most fucking adorable things
06:03 🔗 * JesseW bows
06:03 🔗 kyan http://archive.fart.website/bin/irclogger_log/archiveteam-bs?date=2014-12-05,Fri&sel=228#l224
06:05 🔗 * JesseW is mostly amused by the levels of recordkeeping -- even when something is *removed*, it still shows up in whatever jake used to generate the census list.
06:13 🔗 JesseW http://archive.fart.website/bin/irclogger_log/archiveteam-bs?date=2014-12-05,Fri&sel=238#l234 -- hm, I wonder how they are now.
06:14 🔗 JesseW https://archive.org/details/opensource_media <- 204,816
06:14 🔗 JesseW https://archive.org/details/opensource_movies <- 547,723
06:15 🔗 JesseW https://archive.org/details/opensource_religionvideo <- 104,421
06:15 🔗 JesseW https://archive.org/details/opensource <- 426,941
06:16 🔗 JesseW https://archive.org/details/opensource_audio <- 2,059,999 (!!)
06:17 🔗 JesseW https://archive.org/details/open_source_software <- 11,368
06:18 🔗 JesseW I think that's all of them...
06:21 🔗 Atluxity good morning
06:22 🔗 JesseW Atluxity: morning
06:25 🔗 JesseW well, of the 68 I found, all but 3 were deleted in 2006 or 2007. The other 3 were deleted by the archive.org staffer who uploaded them, presumably as a test.
06:25 🔗 JesseW Mostly just an interesting curio.
06:41 🔗 RichardG has joined #archiveteam
07:00 🔗 JesseW what's more concerning are the 48 items that were retrievable in the last census (in March 2015) but now are gone without even any records in archive.org/history/
07:01 🔗 JesseW including one that (randomly) was the first in the itemlist, https://archive.org/history/Urdu-Trana-001
07:03 🔗 kyan That's weird.
07:03 🔗 JesseW according to the census, it contained 10 mp3s of what was presumably Islamic speeches (from the filenames) and was in the iraq_middleeast, iraq_war and newsandpublicaffairs collections.
07:03 🔗 kyan Could it have been renamed?
07:04 🔗 JesseW Hm, let me look in those collections.
07:05 🔗 kyan There are over 36k items there https://archive.org/search.php?query=collection%3A%22newsandpublicaffairs%22%20collection%3A%22iraq_war%22%20collection%3A%22iraq_middleeast%22
07:05 🔗 JesseW Yeah, the other two are also too large to look through manually.
07:05 🔗 JesseW Hm, the name of the _meta file doesn't match the identifier. Let me look in that.
07:06 🔗 JesseW yep, there it is: https://archive.org/metadata/AansoonAurAhoon-MP3
07:07 🔗 kyan Ah, yay. The files are the same? Assuming they are we should probably upload a placeholder item to the other identifier to aid in locating it. Also, is that identifier also listed in the census?
07:07 🔗 JesseW how did it get under that other identifier, I wonder?
07:08 🔗 JesseW yep, the other identifier is in the census
07:08 🔗 kyan "key => 3439506-18046 prevtask => 382505602 dir => /30/items/AansoonAurAhoon-MP3 comment => Running noop's to update collection string in metadata table to match collections in items meta.xml noop => 1 key=3439506-18046&noop=1&.."
07:08 🔗 vitzli has joined #archiveteam
07:09 🔗 JesseW Heh, I was just going to post that. :-)
07:09 🔗 kyan fixer.php submitted by jake@archive.org (who IIRC did the census?) around a year ago https://catalogd.archive.org/log/382521508
07:09 🔗 kyan :P
07:10 🔗 JesseW yep, I've found a few other fixes jake did after running the census. :-)
07:10 🔗 JesseW and I've sent a few more into info@ which have been done now.
07:11 🔗 kyan (FWIW, https://archive.org/details/AansoonAurAhoon-MP3 seems to be music, rather than speeches)
07:12 🔗 kyan Hah this one is cool https://ia802304.us.archive.org/30/items/AansoonAurAhoon-MP3/ek-sitara-tha-main.mp3
07:13 🔗 bai if you google for the song title, looks like it's associated with some graphic videos
07:14 🔗 vitzli pics are good too
07:16 🔗 kyan English song title is "I was a Star", it's in Hindi apparently
07:16 🔗 kyan according to Google Translate
07:24 🔗 kyan wish i knew what the lyrics were. All too many references to "jihad" in the google search results for the title for my taste
07:24 🔗 kyan i like the music tho
07:27 🔗 yipdw I just realized that JIHAD is an acronym for "Jesus, I'm Having A Dump"
07:27 🔗 yipdw sorry that was like lightyears off topic
07:27 🔗 JesseW *how* exactly did you "just realize" that? :-)
07:28 🔗 yipdw I got tired of not knowing what I'm doing and so I switched to the IRC client for a little while and it just happened
07:31 🔗 JesseW well, you're welcome. :-)
07:32 🔗 yipdw it may happen more often as I continue to realize that everything I knew about the GPU is wrong
08:16 🔗 JesseW has quit IRC (Leaving.)
08:31 🔗 atomotic has joined #archiveteam
08:33 🔗 redlob has quit IRC (Quit: ZNC - http://znc.in)
08:36 🔗 redlob has joined #archiveteam
08:51 🔗 vitzli has quit IRC (Leaving)
09:15 🔗 MrRadar has quit IRC (Read error: Operation timed out)
09:18 🔗 MrRadar has joined #archiveteam
09:31 🔗 arkiver SketchCow: oldfriends has closed. Our grab was a success!
09:31 🔗 arkiver There's some older files you can delete from FOS from oldfriends
09:32 🔗 arkiver Or instead of that pack them up in a non-WARC archive and upload them to IA, so we have them anyway
09:32 🔗 * kyan likes the second option better
09:37 🔗 Atluxity delete something?! that's not how we do it
09:38 🔗 arkiver SketchCow: looks like some items didn't get the metadata update: https://archive.org/details/archiveteam_newssites_20160120_0021
09:47 🔗 HCross2 arkiver: the bot has crashed, and student WiFi here blocks SSH
10:20 🔗 dashcloud has quit IRC (Read error: Operation timed out)
10:24 🔗 PurpleSym Is catalogd down?
10:26 🔗 dashcloud has joined #archiveteam
11:18 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
12:45 🔗 K4k_ has joined #archiveteam
12:47 🔗 atomotic has joined #archiveteam
12:49 🔗 VADemon has joined #archiveteam
13:00 🔗 Ghost_of_ has joined #archiveteam
13:14 🔗 K4k_ has quit IRC (Read error: Operation timed out)
13:56 🔗 atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…)
13:57 🔗 atomotic has joined #archiveteam
14:02 🔗 K4k_ has joined #archiveteam
14:02 🔗 K4k_ has quit IRC (Remote host closed the connection!)
14:02 🔗 K4k_ has joined #archiveteam
14:15 🔗 nertzy2 has joined #archiveteam
14:24 🔗 dashcloud has quit IRC (Read error: Operation timed out)
14:25 🔗 Ghost_of_ has quit IRC (Quit: Leaving)
14:27 🔗 dashcloud has joined #archiveteam
14:44 🔗 WinterFox has quit IRC (Remote host closed the connection)
14:49 🔗 phuzion Can we requeue some of the items for gamefront?
14:50 🔗 nertzy2 has quit IRC (Quit: This computer has gone to sleep)
14:50 🔗 phuzion There's 60k items out right now
14:53 🔗 nertzy2 has joined #archiveteam
14:55 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
14:56 🔗 dashcloud has quit IRC (Read error: Operation timed out)
15:03 🔗 nertzy2 has quit IRC (Quit: This computer has gone to sleep)
15:04 🔗 dashcloud has joined #archiveteam
15:20 🔗 K4k_ has quit IRC (Ping timeout: 260 seconds)
15:31 🔗 nertzy2 has joined #archiveteam
15:41 🔗 nertzy2 has quit IRC (Quit: This computer has gone to sleep)
15:59 🔗 K4k_ has joined #archiveteam
16:03 🔗 Lord_Nigh sets mode: +o balrog
16:10 🔗 godane has quit IRC (Read error: Operation timed out)
16:13 🔗 megaminxw has quit IRC (Quit: Leaving.)
16:31 🔗 Ghost_of_ has joined #archiveteam
16:51 🔗 BlueMaxim has joined #archiveteam
17:03 🔗 K4k__ has joined #archiveteam
17:05 🔗 K4k_ has quit IRC (Ping timeout: 252 seconds)
17:15 🔗 JesseW has joined #archiveteam
17:18 🔗 kristian_ has joined #archiveteam
17:22 🔗 schbirid has joined #archiveteam
17:27 🔗 JesseW has quit IRC (Leaving.)
17:34 🔗 z00nx has quit IRC (Ping timeout: 252 seconds)
17:34 🔗 z00nx has joined #archiveteam
17:36 🔗 rizzzz has quit IRC (Read error: Operation timed out)
17:40 🔗 rizzzz has joined #archiveteam
17:43 🔗 Atom__ has joined #archiveteam
17:46 🔗 Atom-- has quit IRC (Ping timeout: 252 seconds)
17:56 🔗 atomotic has joined #archiveteam
18:03 🔗 dashcloud has quit IRC (Read error: Operation timed out)
18:06 🔗 dashcloud has joined #archiveteam
18:31 🔗 Emcy has quit IRC (Ping timeout: 250 seconds)
19:03 🔗 K4k has joined #archiveteam
19:08 🔗 K4k__ has quit IRC (Read error: Operation timed out)
19:15 🔗 dashcloud has quit IRC (Read error: Operation timed out)
19:20 🔗 dashcloud has joined #archiveteam
19:21 🔗 scyther has joined #archiveteam
19:35 🔗 aliz has quit IRC (Ping timeout: 260 seconds)
19:41 🔗 atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…)
19:54 🔗 atomotic has joined #archiveteam
19:54 🔗 atomotic has quit IRC (Client Quit)
19:58 🔗 dashcloud has quit IRC (Read error: Operation timed out)
20:02 🔗 dashcloud has joined #archiveteam
20:16 🔗 Ghost_of_ has quit IRC (Quit: Leaving)
20:16 🔗 kristian_ has quit IRC (Quit: Leaving)
20:38 🔗 JesseW has joined #archiveteam
20:41 🔗 JesseW has quit IRC (Client Quit)
20:47 🔗 arkiver Great news on Google Code!
20:47 🔗 arkiver We can keep the grab running after the shutdown on the 25th
20:53 🔗 phuzion Awesome!
21:17 🔗 godane has joined #archiveteam
21:20 🔗 K4k has quit IRC (Ping timeout: 252 seconds)
21:41 🔗 Ghost_of_ has joined #archiveteam
21:44 🔗 dashcloud has quit IRC (Read error: Operation timed out)
21:45 🔗 dashcloud has joined #archiveteam
21:55 🔗 scyther has quit IRC (Quit: Leaving)
22:05 🔗 Atluxity holy COW those gamefront items are getting BIG
22:16 🔗 K4k has joined #archiveteam
22:22 🔗 JetBalsa has quit IRC (Read error: Connection reset by peer)
22:23 🔗 K4k has quit IRC (Read error: Operation timed out)
22:25 🔗 dashcloud has quit IRC (Read error: Operation timed out)
22:26 🔗 Atluxity if I was banned from gamefront, would I be getting any 200 OK at all?
22:29 🔗 dashcloud has joined #archiveteam
22:35 🔗 Start has quit IRC (Read error: Connection reset by peer)
22:35 🔗 Start has joined #archiveteam
22:38 🔗 JesseW has joined #archiveteam
23:08 🔗 WinterFox has joined #archiveteam
23:18 🔗 K4k has joined #archiveteam
23:23 🔗 K4k has quit IRC (Ping timeout: 260 seconds)
23:26 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:30 🔗 dashcloud has joined #archiveteam
23:41 🔗 nertzy2 has joined #archiveteam
23:45 🔗 JesseW has quit IRC (Leaving.)
23:49 🔗 SketchCow xmc: Archive came to me asking what to do about the guy with gitorious
23:49 🔗 xmc hahaha
23:49 🔗 xmc ok
23:49 🔗 SketchCow So really, it's all about me. I'm Rome and everything leads to me
23:49 🔗 xmc so i'll make something up and then do it
23:50 🔗 SketchCow If you could split it into 5 pieces, that would be good.
23:50 🔗 SketchCow Even if it kind of sucks
23:50 🔗 xmc i could, but it'd be weird
23:52 🔗 xmc i could also split it into 40,000 pieces, one per username
23:52 🔗 xmc eh, i can do it alphabetically or something
23:52 🔗 xmc ok.
23:58 🔗 SketchCow Work through it.
23:58 🔗 SketchCow But we'll take it.
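
The split discussed above (a handful of pieces, cut alphabetically by username) could be sketched roughly as follows. This is only an illustration under stated assumptions: the per-username sizes are assumed to be known already, the function name `split_alphabetically` and the sample usernames are hypothetical, and the sketch ignores the point that clones live in the directory of the account they were cloned from.

```python
from typing import Dict, List


def split_alphabetically(sizes: Dict[str, int], pieces: int) -> List[List[str]]:
    """Cut an alphabetically sorted list of usernames into at most `pieces`
    contiguous groups whose total sizes are roughly equal.

    `sizes` maps a username to the bytes its repositories occupy
    (hypothetical input; the log does not say how sizes would be measured).
    """
    names = sorted(sizes, key=str.lower)
    total = sum(sizes.values())
    buckets: List[List[str]] = [[]]
    running = 0
    for name in names:
        # Share of the whole that should be covered once the current bucket closes.
        boundary = total * len(buckets) / pieces
        # Start a new bucket when the running total has reached that share,
        # as long as the current bucket is non-empty and pieces remain.
        if buckets[-1] and running >= boundary and len(buckets) < pieces:
            buckets.append([])
        buckets[-1].append(name)
        running += sizes[name]
    return buckets


if __name__ == "__main__":
    # Made-up usernames and sizes, purely for illustration.
    example = {"alice": 10, "bob": 10, "carol": 10, "dave": 10, "erin": 10}
    for index, group in enumerate(split_alphabetically(example, 5), start=1):
        print(index, group[0], "to", group[-1])
```

Because each group is a contiguous alphabetical range, an item could be labelled by its first and last username, which keeps lookup simple even if the sizes end up uneven.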
