Time |
Nickname |
Message |
00:23
🔗
|
|
nertzy2 has joined #archiveteam |
00:40
🔗
|
|
SimpBrain has quit IRC (Read error: Operation timed out) |
00:42
🔗
|
|
nertzy2 has quit IRC (Quit: This computer has gone to sleep) |
00:48
🔗
|
|
mismatch_ has joined #archiveteam |
01:02
🔗
|
|
logchfoo1 starts logging #archiveteam at Fri Jan 22 01:02:44 2016 |
01:02
🔗
|
|
logchfoo1 has joined #archiveteam |
01:15
🔗
|
|
Ghost_of_ has quit IRC (Read error: Operation timed out) |
01:20
🔗
|
|
JesseW has joined #archiveteam |
01:23
🔗
|
|
SimpBrain has joined #archiveteam |
01:53
🔗
|
|
JesseW has quit IRC (Leaving.) |
02:13
🔗
|
|
jspiros has quit IRC (leaving) |
02:13
🔗
|
|
jspiros has joined #archiveteam |
02:30
🔗
|
|
megaminxw has quit IRC (Quit: Leaving.) |
02:30
🔗
|
|
JesseW has joined #archiveteam |
02:48
🔗
|
|
Froggypwn has joined #archiveteam |
03:11
🔗
|
|
Zebranky_ is now known as Zebranky |
03:45
🔗
|
|
kyan has joined #archiveteam |
03:58
🔗
|
|
W1nterFox has joined #archiveteam |
04:03
🔗
|
|
WinterFox has quit IRC (Read error: Operation timed out) |
04:06
🔗
|
|
W1nterFox has quit IRC (Read error: Operation timed out) |
04:38
🔗
|
|
brayden has quit IRC (Quit: Leaving) |
04:39
🔗
|
|
brayden has joined #archiveteam |
04:52
🔗
|
xmc |
request for assistance: could someone please help me backup the gitorious disk image? it's just shy of 5T and i would like to have more than one copy |
04:53
🔗
|
xmc |
looking for serious multiyear commitments |
04:53
🔗
|
xmc |
ultimately i will figure out how to shove it into IA but it's a bit large for one item |
04:53
🔗
|
xmc |
or something |
04:56
🔗
|
|
megaminxw has joined #archiveteam |
04:57
🔗
|
|
WinterFox has joined #archiveteam |
05:00
🔗
|
|
fie has quit IRC (Read error: Operation timed out) |
05:10
🔗
|
JesseW |
I am glad to physically hold on to a copy, but: 1) I'm also in Seattle, so that helps less with geographic separation ; 2) While I can likely afford to buy 5TBs worth of hard drives, I haven't done so yet. |
05:10
🔗
|
JesseW |
xmc: |
05:11
🔗
|
xmc |
hi |
05:11
🔗
|
xmc |
the image is physically on a ceph cluster in san jose, not in my house :P |
05:11
🔗
|
JesseW |
ah, well then me storing one in Seattle might be more useful. :-) |
05:12
🔗
|
xmc |
yea |
05:12
🔗
|
JesseW |
and it should be easier to get it to IA (because you're going to want to use sneakernet) |
05:13
🔗
|
xmc |
mmmmaybe |
05:13
🔗
|
JesseW |
why ever not? |
05:14
🔗
|
xmc |
because that would require traveling and i don't have a place to stay down there and i don't really want to visit the bay area? |
05:15
🔗
|
JesseW |
Just ask the IA folks to drop by the data center with 5 1T drives, plug them in, then pick them back up. |
05:15
🔗
|
JesseW |
when they are full |
05:15
🔗
|
* |
xmc shrug |
05:20
🔗
|
* |
xmc email info@ |
05:30
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
05:33
🔗
|
SketchCow |
Is there a simple way to split it? |
05:33
🔗
|
SketchCow |
(He asked) |
05:36
🔗
|
xmc |
well i could split it by username, each username has somewhere between two and many repositories |
05:37
🔗
|
xmc |
an item by user would make some sense, though clones are stored in the directory of the account it was cloned *from* |
05:37
🔗
|
MrRadar |
5TB drives aren't that expensive these days; when I was bulk-grabbing videos from Blip they were $150 each |
05:39
🔗
|
MrRadar |
Also, it looks like oldfriends.co.nz is fully shut down. Should the primary domain be blocked from the ArchiveBot job? |
05:39
🔗
|
MrRadar |
They have a secondary images domain which is still returning data |
05:39
🔗
|
MrRadar |
At least for now |
05:50
🔗
|
JesseW |
Ha -- I now know of *68* identifiers that IA will show records (to logged in users) of having (shock, horror) "deleted". ;-P |
05:56
🔗
|
kyan |
JesseW, wut? |
05:56
🔗
|
kyan |
I assume deleted meaning darked or something? |
05:57
🔗
|
kyan |
(My understanding from talking to IA was that nothing is ever deleted) |
05:57
🔗
|
JesseW |
hehehehehehehehehehehehehehehehe |
05:58
🔗
|
JesseW |
https://archive.org/history/20_minutes_of_massachusetts |
05:59
🔗
|
kyan |
JesseW, That's fairly disturbing but all that was there was the meta.xml, the reviews.xml, and an empty dir, apparently |
05:59
🔗
|
kyan |
I'll have to find where i was told nothing's deleted |
06:00
🔗
|
kyan |
Frankly I wonder why anyone would go to the trouble of deleting an empty identifier |
06:00
🔗
|
kyan |
to save 10k of disk space or sthg? |
06:01
🔗
|
* |
JesseW shrug -- IDK. The ones I've looked at have all been in 2007, so maybe things were different then. |
06:01
🔗
|
kyan |
Dec 05 14:51:40 <SketchCow> We don't even delete SPAM |
06:01
🔗
|
kyan |
Dec 05 14:51:59 <SketchCow> Nothing leaves the archive, not a bit |
06:01
🔗
|
kyan |
JesseW, Ah, hmm |
06:03
🔗
|
SketchCow |
You're all the most fucking adorable things |
06:03
🔗
|
* |
JesseW bows |
06:03
🔗
|
kyan |
http://archive.fart.website/bin/irclogger_log/archiveteam-bs?date=2014-12-05,Fri&sel=228#l224 |
06:05
🔗
|
* |
JesseW is mostly amused by the levels of recordkeeping -- even when something is *removed*, it still shows up in whatever jake used to generate the census list. |
06:13
🔗
|
JesseW |
http://archive.fart.website/bin/irclogger_log/archiveteam-bs?date=2014-12-05,Fri&sel=238#l234 -- hm, I wonder how they are now. |
06:14
🔗
|
JesseW |
https://archive.org/details/opensource_media <- 204,816 |
06:14
🔗
|
JesseW |
https://archive.org/details/opensource_movies <- 547,723 |
06:15
🔗
|
JesseW |
https://archive.org/details/opensource_religionvideo <- 104,421 |
06:15
🔗
|
JesseW |
https://archive.org/details/opensource <- 426,941 |
06:16
🔗
|
JesseW |
https://archive.org/details/opensource_audio <- 2,059,999 (!!) |
06:17
🔗
|
JesseW |
https://archive.org/details/open_source_software <- 11,368 |
06:18
🔗
|
JesseW |
I think that's all of them... |
06:21
🔗
|
Atluxity |
good morning |
06:22
🔗
|
JesseW |
Atluxity: morning |
06:25
🔗
|
JesseW |
well, of the 68 I found, all but 3 were deleted in 2006 or 2007. The other 3 were deleted by the archive.org staffer who uploaded them, presumably as a test. |
06:25
🔗
|
JesseW |
Mostly just an interesting curio. |
06:41
🔗
|
|
RichardG has joined #archiveteam |
07:00
🔗
|
JesseW |
what's more concerning are the 48 items that were retrievable in the last census (in March 2015) but now are gone without even any records in archive.org/history/ |
07:01
🔗
|
JesseW |
including one that (randomly) was the first in the itemlist, https://archive.org/history/Urdu-Trana-001 |
07:03
🔗
|
kyan |
That's weird. |
07:03
🔗
|
JesseW |
according to the census, it contained 10 mp3s of what was presumably Islamic speeches (from the filenames) and was in the iraq_middleeast, iraq_war and newsandpublicaffairs collections. |
07:03
🔗
|
kyan |
Could it have been renamed? |
07:04
🔗
|
JesseW |
Hm, let me look in those collections. |
07:05
🔗
|
kyan |
There are over 36k items there https://archive.org/search.php?query=collection%3A%22newsandpublicaffairs%22%20collection%3A%22iraq_war%22%20collection%3A%22iraq_middleeast%22 |
07:05
🔗
|
JesseW |
Yeah, the other two are also too large to look through manually. |
07:05
🔗
|
JesseW |
Hm, the name of the _meta file doesn't match the identifier. Let me look in that. |
07:06
🔗
|
JesseW |
yep, there it is: https://archive.org/metadata/AansoonAurAhoon-MP3 |
07:07
🔗
|
kyan |
Ah, yay. The files are the same? Assuming they are we should probably upload a placeholder item to the other identifier to aid in locating it. Also, is that identifier also listed in the census? |
07:07
🔗
|
JesseW |
how did it get under that other identifier, I wonder? |
07:08
🔗
|
JesseW |
yep, the other identifier is in the census |
07:08
🔗
|
kyan |
"key=>3439506-18046 prevtask=>382505602 dir=>/30/items/AansoonAurAhoon-MP3 comment=>Running noop's to update collection string in metadata table to match collections in items meta.xml noop=>1 key=3439506-18046&noop=1&.." |
07:08
🔗
|
|
vitzli has joined #archiveteam |
07:09
🔗
|
JesseW |
Heh, I was just going to post that. :-) |
07:09
🔗
|
kyan |
fixer.php submitted by jake@archive.org (who IIRC did the census?) around a year ago https://catalogd.archive.org/log/382521508 |
07:09
🔗
|
kyan |
:P |
07:10
🔗
|
JesseW |
yep, I've found a few other fixes jake did after running the census. :-) |
07:10
🔗
|
JesseW |
and I've sent a few more into info@ which have been done now. |
07:11
🔗
|
kyan |
(FWIW, https://archive.org/details/AansoonAurAhoon-MP3 seems to be music, rather than speeches) |
07:12
🔗
|
kyan |
Hah this one is cool https://ia802304.us.archive.org/30/items/AansoonAurAhoon-MP3/ek-sitara-tha-main.mp3 |
07:13
🔗
|
bai |
if you google for the song title, looks like it's associated with some graphic videos |
07:14
🔗
|
vitzli |
pics are good too |
07:16
🔗
|
kyan |
English song title is "I was a Star", it's in Hindi apparently |
07:16
🔗
|
kyan |
according to Google Translate |
07:24
🔗
|
kyan |
wish i knew what the lyrics were. All too many references to "jihad" in the google search results for the title for my taste |
07:24
🔗
|
kyan |
i like the music tho |
07:27
🔗
|
yipdw |
I just realized that JIHAD is an acronym for "Jesus, I'm Having A Dump" |
07:27
🔗
|
yipdw |
sorry that was like lightyears off topic |
07:27
🔗
|
JesseW |
*how* exactly did you "just realize" that? :-) |
07:28
🔗
|
yipdw |
I got tired of not knowing what I'm doing and so I switched to the IRC client for a little while and it just happened |
07:31
🔗
|
JesseW |
well, you're welcome. :-) |
07:32
🔗
|
yipdw |
it may happen more often as I continue to realize that everything I knew about the GPU is wrong |
08:16
🔗
|
|
JesseW has quit IRC (Leaving.) |
08:31
🔗
|
|
atomotic has joined #archiveteam |
08:33
🔗
|
|
redlob has quit IRC (Quit: ZNC - http://znc.in) |
08:36
🔗
|
|
redlob has joined #archiveteam |
08:51
🔗
|
|
vitzli has quit IRC (Leaving) |
09:15
🔗
|
|
MrRadar has quit IRC (Read error: Operation timed out) |
09:18
🔗
|
|
MrRadar has joined #archiveteam |
09:31
🔗
|
arkiver |
SketchCow: oldfriends has closed. Our grab was a success! |
09:31
🔗
|
arkiver |
There's some older files you can delete from FOS from oldfriends |
09:32
🔗
|
arkiver |
Or instead of that pack them up in a non-WARC archive and upload them to IA, so we have them anyway |
09:32
🔗
|
* |
kyan likes the second option better |
09:37
🔗
|
Atluxity |
delete something?! that's not how we do it |
09:38
🔗
|
arkiver |
SketchCow: looks like some items didn't get the metadata update: https://archive.org/details/archiveteam_newssites_20160120_0021 |
09:47
🔗
|
HCross2 |
arkiver: the bot has crashed, and student WiFi here blocks SSH |
10:20
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
10:24
🔗
|
PurpleSym |
Is catalogd down? |
10:26
🔗
|
|
dashcloud has joined #archiveteam |
11:18
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
12:45
🔗
|
|
K4k_ has joined #archiveteam |
12:47
🔗
|
|
atomotic has joined #archiveteam |
12:49
🔗
|
|
VADemon has joined #archiveteam |
13:00
🔗
|
|
Ghost_of_ has joined #archiveteam |
13:14
🔗
|
|
K4k_ has quit IRC (Read error: Operation timed out) |
13:56
🔗
|
|
atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…) |
13:57
🔗
|
|
atomotic has joined #archiveteam |
14:02
🔗
|
|
K4k_ has joined #archiveteam |
14:02
🔗
|
|
K4k_ has quit IRC (Remote host closed the connection!) |
14:02
🔗
|
|
K4k_ has joined #archiveteam |
14:15
🔗
|
|
nertzy2 has joined #archiveteam |
14:24
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
14:25
🔗
|
|
Ghost_of_ has quit IRC (Quit: Leaving) |
14:27
🔗
|
|
dashcloud has joined #archiveteam |
14:44
🔗
|
|
WinterFox has quit IRC (Remote host closed the connection) |
14:49
🔗
|
phuzion |
Can we requeue some of the items for gamefront? |
14:50
🔗
|
|
nertzy2 has quit IRC (Quit: This computer has gone to sleep) |
14:50
🔗
|
phuzion |
There's 60k items out right now |
14:53
🔗
|
|
nertzy2 has joined #archiveteam |
14:55
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
14:56
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
15:03
🔗
|
|
nertzy2 has quit IRC (Quit: This computer has gone to sleep) |
15:04
🔗
|
|
dashcloud has joined #archiveteam |
15:20
🔗
|
|
K4k_ has quit IRC (Ping timeout: 260 seconds) |
15:31
🔗
|
|
nertzy2 has joined #archiveteam |
15:41
🔗
|
|
nertzy2 has quit IRC (Quit: This computer has gone to sleep) |
15:59
🔗
|
|
K4k_ has joined #archiveteam |
16:03
🔗
|
|
Lord_Nigh sets mode: +o balrog |
16:10
🔗
|
|
godane has quit IRC (Read error: Operation timed out) |
16:13
🔗
|
|
megaminxw has quit IRC (Quit: Leaving.) |
16:31
🔗
|
|
Ghost_of_ has joined #archiveteam |
16:51
🔗
|
|
BlueMaxim has joined #archiveteam |
17:03
🔗
|
|
K4k__ has joined #archiveteam |
17:05
🔗
|
|
K4k_ has quit IRC (Ping timeout: 252 seconds) |
17:15
🔗
|
|
JesseW has joined #archiveteam |
17:18
🔗
|
|
kristian_ has joined #archiveteam |
17:22
🔗
|
|
schbirid has joined #archiveteam |
17:27
🔗
|
|
JesseW has quit IRC (Leaving.) |
17:34
🔗
|
|
z00nx has quit IRC (Ping timeout: 252 seconds) |
17:34
🔗
|
|
z00nx has joined #archiveteam |
17:36
🔗
|
|
rizzzz has quit IRC (Read error: Operation timed out) |
17:40
🔗
|
|
rizzzz has joined #archiveteam |
17:43
🔗
|
|
Atom__ has joined #archiveteam |
17:46
🔗
|
|
Atom-- has quit IRC (Ping timeout: 252 seconds) |
17:56
🔗
|
|
atomotic has joined #archiveteam |
18:03
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
18:06
🔗
|
|
dashcloud has joined #archiveteam |
18:31
🔗
|
|
Emcy has quit IRC (Ping timeout: 250 seconds) |
19:03
🔗
|
|
K4k has joined #archiveteam |
19:08
🔗
|
|
K4k__ has quit IRC (Read error: Operation timed out) |
19:15
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
19:20
🔗
|
|
dashcloud has joined #archiveteam |
19:21
🔗
|
|
scyther has joined #archiveteam |
19:35
🔗
|
|
aliz has quit IRC (Ping timeout: 260 seconds) |
19:41
🔗
|
|
atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…) |
19:54
🔗
|
|
atomotic has joined #archiveteam |
19:54
🔗
|
|
atomotic has quit IRC (Client Quit) |
19:58
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
20:02
🔗
|
|
dashcloud has joined #archiveteam |
20:16
🔗
|
|
Ghost_of_ has quit IRC (Quit: Leaving) |
20:16
🔗
|
|
kristian_ has quit IRC (Quit: Leaving) |
20:38
🔗
|
|
JesseW has joined #archiveteam |
20:41
🔗
|
|
JesseW has quit IRC (Client Quit) |
20:47
🔗
|
arkiver |
Great news on Google Code! |
20:47
🔗
|
arkiver |
We can keep the grab running after the shutdown on the 25th |
20:53
🔗
|
phuzion |
Awesome! |
21:17
🔗
|
|
godane has joined #archiveteam |
21:20
🔗
|
|
K4k has quit IRC (Ping timeout: 252 seconds) |
21:41
🔗
|
|
Ghost_of_ has joined #archiveteam |
21:44
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
21:45
🔗
|
|
dashcloud has joined #archiveteam |
21:55
🔗
|
|
scyther has quit IRC (Quit: Leaving) |
22:05
🔗
|
Atluxity |
holy COW those gamefront items are getting BIG |
22:16
🔗
|
|
K4k has joined #archiveteam |
22:22
🔗
|
|
JetBalsa has quit IRC (Read error: Connection reset by peer) |
22:23
🔗
|
|
K4k has quit IRC (Read error: Operation timed out) |
22:25
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
22:26
🔗
|
Atluxity |
if I was banned from gamefront, would I be getting any 200 OK at all? |
22:29
🔗
|
|
dashcloud has joined #archiveteam |
22:35
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
22:35
🔗
|
|
Start has joined #archiveteam |
22:38
🔗
|
|
JesseW has joined #archiveteam |
23:08
🔗
|
|
WinterFox has joined #archiveteam |
23:18
🔗
|
|
K4k has joined #archiveteam |
23:23
🔗
|
|
K4k has quit IRC (Ping timeout: 260 seconds) |
23:26
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
23:30
🔗
|
|
dashcloud has joined #archiveteam |
23:41
🔗
|
|
nertzy2 has joined #archiveteam |
23:45
🔗
|
|
JesseW has quit IRC (Leaving.) |
23:49
🔗
|
SketchCow |
xmc: Archive came to me asking what to do about the guy with gitorious |
23:49
🔗
|
xmc |
hahaha |
23:49
🔗
|
xmc |
ok |
23:49
🔗
|
SketchCow |
So really, it's all about me. I'm Rome and everything leads to me |
23:49
🔗
|
xmc |
so i'll make something up and then do it |
23:50
🔗
|
SketchCow |
If you could split it into 5 pieces, that would be good. |
23:50
🔗
|
SketchCow |
Even if it kind of sucks |
23:50
🔗
|
xmc |
i could, but it'd be weird |
23:52
🔗
|
xmc |
i could also split it into 40,000 pieces, one per username |
23:52
🔗
|
xmc |
eh, i can do it alphabetically or something |
23:52
🔗
|
xmc |
ok. |
23:58
🔗
|
SketchCow |
Work through it. |
23:58
🔗
|
SketchCow |
But we'll take it. |