#archiveteam-bs 2012-12-01,Sat

Time Nickname Message
00:09 🔗 SketchCow Today I found out Random House publishing was meant to be a reference to actually publishing random books
01:00 🔗 godane now this is weird
01:00 🔗 godane a file for computer science spring 10 has timestamp of march 30 2011
01:00 🔗 godane *spring 2010
01:01 🔗 godane and the spring 2011 has timestamp of may 17 2010
01:02 🔗 godane it looks to be same file
01:02 🔗 godane http://crypto.stanford.edu/cs155old/cs155-spring11/hw_and_proj/proj3/traces/
01:02 🔗 godane http://crypto.stanford.edu/cs155old/cs155-spring10/hw_and_proj/proj3/traces/
01:02 🔗 godane there is lots of dedup that could have happened on this website warc grab
01:03 🔗 godane maybe useful when you have a 4gb warc file limit
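
The duplicate files godane describes (the same trace file served under both the spring10 and spring11 paths) could be spotted by hashing response bodies. This is a minimal sketch, not what was actually run for this grab: it assumes the bodies are already in memory as bytes keyed by URL, and the function name `find_duplicates` is hypothetical. The WARC format also has a `revisit` record type for recording a repeat of an earlier payload, which this sketch does not write.

```python
import hashlib
from collections import defaultdict


def find_duplicates(payloads):
    """Group URLs whose response bodies are byte-for-byte identical.

    `payloads` maps URL -> body bytes (an assumed input shape).
    Returns a list of URL groups, each with two or more members.
    """
    by_digest = defaultdict(list)
    for url, body in payloads.items():
        # Identical bodies produce identical SHA-256 digests.
        by_digest[hashlib.sha256(body).hexdigest()].append(url)
    return [sorted(urls) for urls in by_digest.values() if len(urls) > 1]
```

Each group after the first member is a candidate to skip or to store once, which matters most when a size cap on the WARC is in play.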
01:46 🔗 mistym Someone on Slashdot linked this glorious textfile: http://www.textfiles.com/food/newcoke.txt
01:47 🔗 mistym I love how, aside from a couple cultural references, that could be an angry internet rant from today
01:54 🔗 chronomex "In conclusion, I understand that Burger King plans to change the recipe on the Whopper soon."
01:58 🔗 balrog_ http://www.youtube.com/watch?feature=player_embedded&v=6pDy-CSFsPs
02:28 🔗 godane does anyone know of a custom dirbuster to get a list of folders/files from a website?
02:28 🔗 godane i'm asking this cause i think computerpoweruser.com has a lot more pdf files
02:29 🔗 godane the folder indexes can't be accessed so the only way is to guess the file names
02:29 🔗 chronomex hmm, no, but you could probably just bang something together with curl and your favorite scripting language
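
What chronomex suggests could look roughly like the sketch below, using Python's standard library in place of curl. Everything about the URL layout is an assumption: the `base/year/name.pdf` pattern, the name list, and the helper names `candidate_urls`, `head_status`, and `probe` are all hypothetical, and the real site's naming scheme would have to be worked out from known links first.

```python
import itertools
import urllib.error
import urllib.request


def candidate_urls(base, years, names, ext=".pdf"):
    """Yield guessed URLs of the assumed form base/year/name.ext."""
    base = base.rstrip("/")
    for year, name in itertools.product(years, names):
        yield f"{base}/{year}/{name}{ext}"


def head_status(url, timeout=10.0):
    """Return the HTTP status of a HEAD request, or 0 if the host is unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 404, 403 and similar arrive as exceptions; report the code.
        return err.code
    except urllib.error.URLError:
        return 0


def probe(urls, status_of=head_status):
    """Keep only the URLs whose status check returns 200."""
    return [url for url in urls if status_of(url) == 200]
```

Passing a different `status_of` function makes it possible to try the guessing logic without touching the network, and a real run should pause between requests so the server is not hammered.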
02:51 🔗 DFJustin godane: are you checking http://wayback-beta.archive.org/ when determining if there have been crawls or not
02:55 🔗 godane i know there are crawls for this DFJustin
02:56 🔗 godane but i have found other folders that have not been crawled at all
02:57 🔗 DFJustin I know, just wondering if you're using the up to date beta information
02:58 🔗 godane a lot of these captures happened back in 2006
05:01 🔗 godane i found something interesting
05:01 🔗 godane looks like mininova.org forums still has stuff going back to 2005
05:01 🔗 godane so that may have some worth
06:45 🔗 godane so i got crypto.stanford.edu
06:46 🔗 godane the warc max size was set to 1gb
06:46 🔗 godane i got about 3.7gb of the site in warc.gz :-D
06:46 🔗 godane it's about 4gb uncompressed
07:50 🔗 DFJustin some stuff for the manuals collection https://archive.org/post/441568/
09:29 🔗 godane uploaded: http://archive.org/details/www.urinal.net-20121128-mirror
09:30 🔗 godane it has 100s of images of different urinals in it from around the world
09:40 🔗 * BlueMax salutes City of Heroes
10:41 🔗 godane so i found a 14.5mb pdf on computerpoweruser.com
10:41 🔗 godane turns out it's just 3 pages
10:41 🔗 godane its funny cause i have 5mb full magazine pdfs
10:42 🔗 schbiridi pdfs sometimes are just bundled JPEG files or worse
10:46 🔗 schbiridi SketchCow: if you pass through hamburg by any chance on your deutschland trip, i would have about 3TB of jamendo ogg vorbis for drive-through copying. dating back to 2009 or whenever i started grabbing them
13:35 🔗 SketchCow That is not going to happen. :)
13:35 🔗 SketchCow (spoiler alert)
21:33 🔗 schbiridi aww :D
22:19 🔗 Coderjoe it's been publicly announced: http://blog.archive.org/2012/11/30/3-for-1-match/
22:46 🔗 godane so i got more urls that archive.org has for computerpoweruser.com/articles/2001/
22:46 🔗 godane i have dozens of pdfs from there
23:49 🔗 godane uploaded finally: http://archive.org/details/crypto.stanford.edu-20121130-mirror
