#archiveteam-bs 2014-05-26,Mon


Time Nickname Message
05:16 🔗 godane can anyone download this from way back machine: https://web.archive.org/web/20070626122847/http://msnbc.vo.llnwd.net/e1/video/podcast/pdv_nn_netcast_m4v-08-01-2007-193947.m4v
05:16 🔗 godane it's very troubling when the wayback machine says it has a video file but only gives like 33k of it
05:18 🔗 godane cause if i can get those links working i can then give you guys about 3 months of nbc night news from 2007
05:18 🔗 underscor http://web.archive.org/cdx/search/cdx?url=http://msnbc.vo.llnwd.net/e1/video/podcast/pdv_nn_netcast_m4v-08-01-2007-193947.m4v
05:18 🔗 underscor It only got 33k (27897 bytes)
05:18 🔗 godane that sucks
05:20 🔗 underscor http://web.archive.org/cdx/search/cdx?url=http://msnbc.vo.llnwd.net/* btw
05:20 🔗 underscor note that's on a somewhat fragile app server and is also prone to breaking or going away
05:21 🔗 underscor I think there's a max it will return per page, too
05:21 🔗 underscor (though all of that one fits on one page)
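[Editor's sketch] The CDX lookups underscor links above can be scripted. The following is a minimal sketch, not the Archive's own tooling. It assumes the endpoint shown in the log, the `limit` query parameter, and the default space-separated field order (urlkey, timestamp, original, mimetype, statuscode, digest, length). The helper names and the size threshold are made up for illustration, and as underscor notes the service may break or go away.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the log; availability is not guaranteed.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"

# Assumed default field order of a plain-text CDX response line.
CDX_FIELDS = ["urlkey", "timestamp", "original", "mimetype",
              "statuscode", "digest", "length"]


def cdx_query_url(url, limit=None):
    """Build a CDX query URL for one URL or a wildcard like 'host/*'."""
    params = {"url": url}
    if limit is not None:
        params["limit"] = limit
    return CDX_ENDPOINT + "?" + urlencode(params)


def parse_cdx_line(line):
    """Split one space-separated CDX line into a dict keyed by field name."""
    return dict(zip(CDX_FIELDS, line.strip().split(" ")))


def small_captures(lines, max_bytes):
    """Return captures whose recorded length is at most max_bytes.

    Useful for spotting video files that were only partially captured.
    Lines with a non-numeric length (for example '-') are skipped.
    """
    hits = []
    for line in lines:
        if not line.strip():
            continue
        row = parse_cdx_line(line)
        length = row.get("length", "")
        if length.isdigit() and int(length) <= max_bytes:
            hits.append(row)
    return hits
```

To use it, fetch `cdx_query_url("http://msnbc.vo.llnwd.net/*")` with any HTTP client and pass the response lines to `small_captures` with a threshold of your choosing.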
05:27 🔗 godane is there a file search for wayback machine urls
05:28 🔗 godane not per a domain
05:28 🔗 underscor not publicly
05:28 🔗 underscor they're not indexed in a way to make that cheap
05:29 🔗 underscor it basically runs as a map-reduce job on the global wayback index in hadoop, afaik
05:30 🔗 underscor (our indexes are done by SURT, which is basically "reverse subdomain order")
05:30 🔗 underscor so like foo.archive.org/bar.txt becomes org,archive,foo)/bar.txt
05:31 🔗 underscor so we can efficiently look up something like "all org domains" or "all files on archive.org and all subdomains", etc
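[Editor's sketch] The key transformation underscor describes can be illustrated in a few lines. This is a simplified sketch that only reverses the host labels and appends the path, which is enough to reproduce the example in the log. Real SURT canonicalization handles more cases (schemes, ports, query strings and so on), and the function names here are hypothetical.

```python
import bisect
from urllib.parse import urlsplit


def surt_key(url):
    """Turn a URL into a simplified reverse-host key.

    'http://foo.archive.org/bar.txt' becomes 'org,archive,foo)/bar.txt'.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    return ",".join(reversed(labels)) + ")" + (parts.path or "/")


def prefix_range(sorted_keys, prefix):
    """Return every key in a sorted list that starts with the given prefix.

    Because related hosts sort next to each other, a prefix lookup is
    two binary searches and a slice rather than a full scan.
    """
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[lo:hi]
```

With keys sorted this way, `prefix_range(keys, "org,")` covers all org domains and `prefix_range(keys, "org,archive")` covers archive.org together with its subdomains, which is why those lookups are cheap while a search by filename across all hosts is not.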
05:50 🔗 godane i'm going to check all vo.llnwd.net domains
06:54 🔗 godane anyways i'm fixing a typo i did with cbsradio
06:55 🔗 godane the creator for those items has a typo
07:14 🔗 SketchCow Famicoman: All downloaded, now to inject into the archive.
07:20 🔗 godane just know my fix for the cbs radio typo is going to create some fake cbs radio dates
07:20 🔗 godane i will go through those and deindex them later
07:29 🔗 godane i'm uploading one of my Best Computer Games issue dvds
07:29 🔗 godane that will be about 16gb of data on 2 isos
07:30 🔗 godane one is a video disc and another is the game files
10:05 🔗 schbirid wasn't there a gnu tool to transpose (rotate the table so rows become columns and vice-versa) csv files?
10:07 🔗 schbirid looked through coreutils, i guess i had "pr" in mind but that does not do it
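[Editor's sketch] This is not the GNU tool schbirid is asking about, but the transpose itself is short to write with Python's standard library. The function name is made up for illustration. Short rows are padded with empty cells, which is an assumption about how ragged input should be handled.

```python
import csv
import io
from itertools import zip_longest


def transpose_csv(text):
    """Transpose CSV text so that rows become columns and vice versa."""
    rows = list(csv.reader(io.StringIO(text)))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # zip_longest pairs up the n-th cell of every row; missing cells
    # in short rows are filled with empty strings.
    writer.writerows(zip_longest(*rows, fillvalue=""))
    return out.getvalue()
```

The csv module takes care of quoting, so cells containing commas survive the round trip.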
13:10 🔗 ersi Cool cool, so in the span of a week I've met both kennethreitz and SketchCow, without having to travel somewhere
13:11 🔗 ersi Sketchy is out exploring Stockholm atm
13:14 🔗 tephra ersi: you were at pycon?
13:16 🔗 ersi Yeah, 'course
13:16 🔗 tephra man, I was a volunteer there
13:16 🔗 ersi I even helped organise it, slightly, like, totally minimally
13:17 🔗 ersi oh, huh :O
13:17 🔗 tephra :O
13:17 🔗 tephra we must have met without knowing :P
13:17 🔗 ersi Indeed
13:18 🔗 ersi well, there was only like 260-290 attendees.. so we've *def* met.. but yeah :D
13:18 🔗 tephra :P
13:20 🔗 tephra should really try to get together sometime
13:20 🔗 ersi indeed~
14:18 🔗 midas silly yahoo, one of my clients was sending spam: temporarily deferred due to user complaints - 4.16.55.1; see http://postmaster.yahoo.com/421-ts01.html
14:18 🔗 midas only they removed the postmaster url.
14:20 🔗 ersi yeah, they be silly
14:22 🔗 midas they like to be silly
15:28 🔗 godane uploaded: https://archive.org/details/dvdrom-lki-62
15:37 🔗 godane so looks like i can't get anything uploaded right now
15:38 🔗 godane keep getting slow down errors
15:42 🔗 SadDM godane: fwiw, I'm uploading right now without any problems.
15:43 🔗 SadDM https://archive.org/details/Talislanta-wizard_hunter
15:53 🔗 godane its working again
15:54 🔗 godane nevermind
15:54 🔗 godane it got to 100% with one then started to fail again
15:55 🔗 godane now its saying 400 bad request
16:34 🔗 godane uploaded: https://archive.org/search.php?query=creator%3A%22The+Midday%22
16:34 🔗 is4 Heh http://www.wjla.com/articles/2012/01/jason-scott-sentenced-to-100-years-71267.html @sketchcow
18:58 🔗 godane SketchCow: there is going to be a Wisconsin Public Radio collection
18:58 🔗 godane with sub-collection for each of the shows
18:58 🔗 godane it will have to be that way so i can upload to it
18:59 🔗 godane since i'm at that 30 collection limit or something
19:46 🔗 godane The Midday collection so far: https://archive.org/search.php?query=collection%3Agodaneinbox%20AND%20subject%3A%22The%20Midday%22&sort=-date
21:48 🔗 godane 2013 of The Midday collection is getting uploaded
21:49 🔗 godane i'm getting stuff done
22:56 🔗 SketchCow Great.
22:57 🔗 SketchCow All hail, met esri.
22:59 🔗 yipdw esri or ersi?
23:01 🔗 SketchCow ersi
23:01 🔗 SketchCow I just woke up from a nap.
23:01 🔗 SketchCow I did a lot of walking in Stockholm.
23:01 🔗 SketchCow I mean, a lot. Miles and miles.
23:01 🔗 SketchCow And I got my goddamn swedish meatballs in sweden
23:01 🔗 SketchCow All I wanted
23:04 🔗 dashcloud did you visit an Ikea as well?
23:05 🔗 midas bought a arkhiv?
23:05 🔗 godane i'm going after more global national
23:15 🔗 godane SketchCow: do you know if TV Archive project saves Global News channel?
23:15 🔗 godane i only ask cause i search for global national came up nothing
23:17 🔗 SketchCow For what it's worth, I don't know how important it is to get that over other things.
23:18 🔗 SketchCow But I honestly don't know. underscor is in much better shape to answer.
23:28 🔗 godane all i know is global news doesn't do a good job of keeping stuff
23:29 🔗 godane i will be lucky to get stuff over a year old
23:29 🔗 godane for example
23:29 🔗 godane only 20 episodes of march 2014 global national episodes still work
23:30 🔗 godane feb 2014 only has 11 episodes still working
23:31 🔗 godane jan 2014 is also 11
23:32 🔗 godane what's more funny is that between 2013-09-26 and 2014-01-05 only 4 episodes are not working
23:32 🔗 SketchCow Well go for it.
23:35 🔗 godane also the stupid podcasts they release only go back 6 weeks
23:35 🔗 godane so the streams are doing better but not by much
23:46 🔗 SketchCow Achieved newsblur zero.
23:46 🔗 SketchCow inbox zero is perhaps a bit too ambitious.
23:47 🔗 garyrh Is such a thing possible?!
23:51 🔗 SketchCow I had it once.
23:51 🔗 Ravenloft a man can dream
23:52 🔗 SketchCow https://www.youtube.com/watch?v=Ad9U3h2UmcA
