#archiveteam 2013-10-25,Fri

Time Nickname Message
01:29 🔗 joepie91 uhhh...
01:29 🔗 joepie91 hey SketchCow, why is https://archive.org/details/archiveteam-warrior darked?
01:31 🔗 joepie91 this is very strange
01:31 🔗 joepie91 there's no dark job or anything
01:31 🔗 * joepie91 blinks
01:31 🔗 joepie91 wait, why can I see the history in the first place?
01:32 🔗 DFJustin history view just needs an account I think
01:33 🔗 joepie91 ah
01:33 🔗 joepie91 still, why is it darked
01:33 🔗 joepie91 or well, at least I think it's darked
01:33 🔗 joepie91 someone messaged me that the download didn't work...
01:41 🔗 godane joepie91: the last change to log was 215 days ago
01:41 🔗 joepie91 yes, that is what I noticed
01:42 🔗 godane normally some event will say its dark in the history
06:25 🔗 deathy anyone know what bittorrent tracker archive.org is using?
06:32 🔗 Nemo_bis deathy: its own
06:35 🔗 deathy ah..found in the FAQ "We are using opentracker, which has proven to be highly scalable."
06:38 🔗 deathy This might be dangerous? "Starting in 2011, the Internet Archive began automatically retrieving BitTorrent files uploaded into most Community collections."
06:39 🔗 DFJustin naah
06:39 🔗 DFJustin more like 'awesome'
06:40 🔗 deathy I began crawling some bittorrent dumps a few weeks ago. Have ~6 Million .torrent files lying around..
06:40 🔗 DFJustin zip & upload
06:42 🔗 DFJustin I put this up the other day which is 873,671 old torrents from the pirate bay https://archive.org/details/TPB_index_20090815
06:42 🔗 DFJustin helpfully downloaded from another torrent via the aforementioned automatic feature
06:56 🔗 deathy I just have them like "INFOHASH.torrent" files, with a few folders based on filename.(just the torrents, no meta) That +zipped acceptable upload format? ..
07:01 🔗 odie5533_ How do you guys view warc files?
07:04 🔗 odie5533_ deathy: what is the use of old torrent files?
07:08 🔗 godane just found the chat logs when i told you guys about dl.tv and crankygeeks not being on archive.org: http://badcheese.com/~steve/atlogs/?chan=archiveteam&day=2011-09-27
07:08 🔗 deathy odie5533_: in my case, possible research. hopefully doing my master's thesis bittorrent-related. For anyone else...I don't know. Backup everything?
07:09 🔗 odie5533_ deathy: care to elaborate on your plans for them?
07:11 🔗 deathy odie5533_: initially with these, getting whatever stats I can. Long term..not completely decided yet..
07:15 🔗 yipdw odie5533_: re: viewing: https://github.com/ArchiveTeam/warc-proxy
07:15 🔗 odie5533_ yipdw: is that the only thing you use?
07:26 🔗 odie5533_ Are there any other programs for viewing warcs, or does everyone use warc-proxy?
07:27 🔗 yipdw odie5533_: well, there's wayback
07:27 🔗 yipdw but warc-proxy requires a lot less supporting software
07:27 🔗 odie5533_ Does everyone here use warc-proxy?
07:33 🔗 phillipsj not *all* of the 146 lurkers.
07:33 🔗 odie5533_ but anyone that views warc files uses it I guess.
07:33 🔗 * phillipsj shrug
07:34 🔗 odie5533_ Do most people on archiveteam view warc files, or just upload them and not look at the contents?
07:34 🔗 aMunster i just run the warrior
07:34 🔗 aMunster fuck the rest of that noise
07:37 🔗 odie5533_ aMunster: how can you be sure you're doing anything?
07:37 🔗 aMunster actually, i run the scripts outside of warrior
07:37 🔗 aMunster i check in on them every few days, and then lurk
07:38 🔗 odie5533_ aMunster: how do you run them outside of the warrior?
07:38 🔗 aMunster http://archiveteam.org/index.php?title=Puu.sh
07:38 🔗 aMunster project source link: https://github.com/ArchiveTeam/puush-grab
07:39 🔗 aMunster then, the tracker such as http://chfoo-d1.mooo.com:8031/puush/#show-all shows highscores
07:39 🔗 odie5533_ ooh high scores!
07:40 🔗 odie5533_ woah that one dude has over 1 TB!
07:40 🔗 aMunster how many IPs do you have at your disposal?
07:40 🔗 odie5533_ I have a VPN
07:40 🔗 odie5533_ so maybe a few dozen
07:41 🔗 aMunster you'll need more than that to beat those scores
07:41 🔗 odie5533_ But I have like no bandwidth, so it barely matters.
07:41 🔗 odie5533_ don't even have enough bandwidth to download things I want for myself, let alone for archiveteam =/
07:41 🔗 aMunster college, eh
07:41 🔗 odie5533_ no.
07:42 🔗 aMunster no excuse for bandwidth problems then
07:42 🔗 aMunster :(
07:42 🔗 odie5533_ heh
07:43 🔗 odie5533_ What is happening to puu.sh?
07:43 🔗 aMunster one month link expiration
07:43 🔗 odie5533_ but they said they were offering "permanent storage"
07:43 🔗 odie5533_ how can "permanent storage" end?
07:43 🔗 aMunster terms of service change often
07:44 🔗 odie5533_ How can I look at some of the puu.sh files that have been saved?
07:45 🔗 aMunster https://archive.org/details/archiveteam_puush
07:45 🔗 odie5533_ those are like 10 GB.
07:46 🔗 aMunster they're split into those chunks, yes
07:46 🔗 odie5533_ I just wanted to look at a few small files.
07:47 🔗 odie5533_ What type of files are they? All images?
07:47 🔗 aMunster anything that's on puush
07:47 🔗 yipdw videos too
07:47 🔗 odie5533_ Can you give me an example? I can't seem to find a "search" button on the puush site
07:48 🔗 yipdw http://puu.sh/4gMPk
07:48 🔗 odie5533_ I see.
07:50 🔗 odie5533_ we don't even know who these images belong to right?
07:51 🔗 aMunster that isn't part of the scope i think
07:52 🔗 odie5533_ I looked through a few images and didn't find any I'd want to save.
07:53 🔗 godane SketchCow: can please start giving me full access to may collections
07:54 🔗 godane *my collections
07:54 🔗 godane i just took 20 mins uploading an episode of diggnation to my diggnationseries just to figure out i don't have access to it
07:56 🔗 godane based on what i can tell i could move stuff from godaneinbox once i have access to a collection to move it to
08:00 🔗 odie5533_ So, is there a way to easily access individual files from those giant archives?
08:01 🔗 odie5533_ or is the only way to download the entire 10 GB chunks and play them with e.g. the warc-proxy
08:01 🔗 aMunster that's the only way
08:28 🔗 ersi No, that's not the only way.
08:28 🔗 ersi You could parse the CDX file and get the byte offsets and get only the parts you want. It's a bit more hassle though.
08:29 🔗 odie5533_ ersi: hmm, interesting idea.
08:29 🔗 odie5533_ Does archive.org support byte-by-byte download?
08:31 🔗 odie5533_ Also, what is the difference between archiveteam_puush_20131025041147.cdx.gz and puush_20131025041147.megawarc.warc.gz ?
08:31 🔗 ersi It's called "byterange" download and yes.
08:32 🔗 ersi cdx is a index
08:32 🔗 odie5533_ sorry, I pasted the wrong bit
08:32 🔗 odie5533_ there is both an Item CDX Index and a WARC CDX Index
08:34 🔗 odie5533_ they have the same contents.
08:34 🔗 ersi I'm not sure. But maybe the "WARC CDX index" is generated when the item is uploaded and the "item CDX index" is what we've generated and uploaded
08:35 🔗 ivan` right, and they should be identical, though the gzip setting makes a difference in .gz size
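
[Editor's sketch, not part of the log] ersi's suggestion (parse the CDX, take the byte offsets, fetch only that range) could look roughly like the Python below. It assumes the CDX file starts with a header line such as " CDX N b a m s k r M S V g", where V is the compressed record offset, S the compressed record size and g the file name; other CDX layouts exist, so check the header of the actual file. The sample CDX row, its digest and its numbers are made up for illustration. It also assumes each record in the .warc.gz is its own gzip member, which is what makes a single range decompressible on its own.

```python
import gzip


def parse_cdx_header(header):
    """Return the list of field letters from a ' CDX N b a ...' header line."""
    parts = header.split()
    if not parts or parts[0] != "CDX":
        raise ValueError("not a CDX header line")
    return parts[1:]


def byte_range(fields, line):
    """Return (filename, first_byte, last_byte) for one CDX row.

    Assumes 'V' = compressed offset, 'S' = compressed size, 'g' = file name.
    """
    row = dict(zip(fields, line.split()))
    offset = int(row["V"])
    size = int(row["S"])
    return row["g"], offset, offset + size - 1


def range_header(first_byte, last_byte):
    """Build the HTTP header for a byte-range request (both ends inclusive)."""
    return {"Range": "bytes=%d-%d" % (first_byte, last_byte)}


def extract_record(blob, first_byte, last_byte):
    """Decompress one gzip member cut out of a larger .warc.gz blob."""
    return gzip.decompress(blob[first_byte:last_byte + 1])


# Hypothetical header and row, for illustration only.
fields = parse_cdx_header(" CDX N b a m s k r M S V g")
row = ("sh,puu)/4gmpk 20131025041147 http://puu.sh/4gMPk image/png 200 "
       "AAAA - - 1234 5678 puush_20131025041147.megawarc.warc.gz")
name, first, last = byte_range(fields, row)
headers = range_header(first, last)
```

The resulting `headers` dict would be sent with a GET for the file named in `name` under the item's download URL; the response body is then passed to `gzip.decompress` to get the single WARC record.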
11:57 🔗 Nemo_bis deathy: if you want some bigger test cases, I'm uploading them :) https://archive.org/details/ftp-ftp.hp.com_pub-2013-10
11:57 🔗 Nemo_bis $ lrztar -l softlib/software10
11:57 🔗 Nemo_bis Compression Ratio: 1.186. Average Compression Speed: 4.864MB/s.
11:57 🔗 Nemo_bis Total time: 19:32:02.90
11:57 🔗 Nemo_bis directory down to 282 GB, not super-impressing but better than nothing
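
[Editor's sketch, not part of the log] The three figures Nemo_bis pasted can be cross-checked against each other. The Python below assumes the reported speed is measured over the uncompressed input in units of 1024-based megabytes, and that the ratio is uncompressed size divided by compressed size; neither assumption is confirmed in the log.

```python
def hms_to_seconds(hms):
    """Convert 'HH:MM:SS.ss' to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def estimate_sizes(total_time, speed_mb_s, ratio):
    """Return (original_mb, compressed_mb) implied by time, speed and ratio."""
    original_mb = hms_to_seconds(total_time) * speed_mb_s
    compressed_mb = original_mb / ratio
    return original_mb, compressed_mb


original_mb, compressed_mb = estimate_sizes("19:32:02.90", 4.864, 1.186)
original_gb = original_mb / 1024
compressed_gb = compressed_mb / 1024
```

Under those assumptions the input comes out at roughly 334 GB and the output at roughly 282 GB, which agrees with the "down to 282 GB" in the message above.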
11:59 🔗 n00b476 WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
13:09 🔗 ersi "yahoosucks" :)
13:17 🔗 GLaDOS he left ages ago
13:17 🔗 GLaDOS we suck at customer service
13:17 🔗 GLaDOS 0/10 would never do it again
13:19 🔗 ersi heh, yeah - I got join/parts off
13:20 🔗 ersi I still like saying yahoosucks though
13:20 🔗 GLaDOS but yahoo does suck!
14:13 🔗 DFJustin odie5533_: the item cdx index is an index of all the warc files on the item whereas the warc cdx index is associated with one warc file
14:13 🔗 DFJustin if the item has only one warc they would be the same
14:14 🔗 odie5533_ ah. that makes sense
14:14 🔗 DFJustin for viewing results there is also http://www.archiveteam.org/index.php?title=The_WARC_Ecosystem#warc_to_zip
14:15 🔗 odie5533_ yeah, I saw that. didn't seem as useful. thanks though
14:15 🔗 DFJustin or once it has been indexed into the wayback machine you can do a search like http://web.archive.org/web/*/http://puu.sh/* and compare the capture dates with the warc date
14:16 🔗 odie5533_ I had uploaded a warc a while back, but it never showed up in the wayback machine.
14:16 🔗 DFJustin what's the item
14:17 🔗 odie5533_ aww
14:17 🔗 odie5533_ your link crashed my firefox
14:17 🔗 DFJustin yeah that works better for sites that are less omghuge
14:18 🔗 odie5533_ https://archive.org/details/journals.math.tku.edu.tw_2013-01-06_mirror
14:19 🔗 DFJustin it needs to be in one of the web collections with mediatype=web in order to be added to wayback
14:19 🔗 ersi and not have a restrictive robots.txt
14:19 🔗 DFJustin robots looks ok in this case
14:19 🔗 odie5533_ doesn't seem to let me change the mediatype
14:19 🔗 DFJustin an admin needs to do it
14:20 🔗 odie5533_ I don't know any admins =/
14:20 🔗 ersi Yeah, just saying that it's a criterion.
14:21 🔗 DFJustin jscott@archive.org
14:21 🔗 ersi (^ That's SketchCow's Internet Archive e-mail)
14:21 🔗 odie5533_ but SketchCow is right here.
14:22 🔗 ersi SketchCow is very busy. Put it on the pile by sending an e-mail. :-)
14:22 🔗 odie5533_ alright
18:28 🔗 SketchCow Good morning
18:29 🔗 SketchCow Fixed.
18:36 🔗 joepie91 SketchCow: awesome; what was the issue?
18:50 🔗 SketchCow odie5533 can't upload web.
18:52 🔗 WiK hello world
18:52 🔗 WiK anyone have a good source of images from 5.25 floppys?
18:52 🔗 WiK games/apps, idk
18:52 🔗 WiK preferably stuff that will run on like dos or win3.1
19:03 🔗 joepie91 WiK: you could look at the older Twilight CDs
19:04 🔗 joepie91 afaik a bunch of stuff on there came from floppy-distributed things
19:04 🔗 joepie91 godane has been uploading them to IA
19:05 🔗 DFJustin in terms of images there's a bunch in the ibm5150 section of https://archive.org/download/MESS_0.149_Software_List_ROMs/MESS_0.149_Software_List_ROMs.zip/
19:05 🔗 DFJustin most of it is booters and not dos/windows based though
19:09 🔗 Jacek There are some 20GB+ dos game packs that show up on nzb search sites. Probably torrents too but they might have died :(
19:10 🔗 DFJustin a bunch of those are on ia now
19:10 🔗 DFJustin https://archive.org/search.php?query=collection%3Avintagesoftware%20dos
19:11 🔗 WiK DFJustin: perfect, thanks
19:11 🔗 DFJustin and of course https://archive.org/details/classicpcgames
19:13 🔗 WiK now, to see if i can get my kryoflux to write these to 5.25 floppies
19:16 🔗 phillipsj Wik I have access to some physical 360k floppies :)
19:16 🔗 DFJustin dumps plz
19:17 🔗 WiK phillipsj: ive got 400 5.25 PC/AT floppies, brand new
19:18 🔗 phillipsj Not sure about the copyright status of all the software. Much of it is free/trialware though (from a BBS)
19:18 🔗 DFJustin no one really cares with stuff that old
19:19 🔗 WiK seems i need to make an IPF or some kind of image file from the dos files, and then i can push via kryoflux
19:20 🔗 DFJustin what os
19:21 🔗 WiK win 7, but ive access to cygwin/linux
19:21 🔗 DFJustin http://www.winimage.com/
19:22 🔗 DFJustin I dunno if kryoflux supports dos sector images for writing though, it didn't last I checked but it's been a while
19:22 🔗 DFJustin this is getting to be -bs btw
19:22 🔗 phillipsj DFJustin, if that is the case, why does copyright last so long?
19:22 🔗 WiK DFJustin: hoping it does
19:23 🔗 DFJustin people care about stuff that still makes gazillions of dollars like mickey mouse
19:23 🔗 DFJustin cga games, not so much
19:24 🔗 DFJustin I will note that http://www.gog.com/ is awesome and if you want to pay for some old games that is a way to go
19:25 🔗 joepie91 I can second gog being awesome
19:26 🔗 DFJustin but there's hardly anything from the 360k era still available that way
19:27 🔗 phillipsj DFJustin, gog wraps them in icky windows binaries :P
21:04 🔗 arkhive Anyone here live in Colorado besides swebb and myself?
21:07 🔗 arkhive Also, I've brought up the whole unsold TV pilots thing before. I was thinking about it recently and was wondering if a Kickstarter/IndieGoGo would work to purchase those lost television pilots and make publicly available for free.
21:09 🔗 arkhive I don't know much about KickStarter/IndieGoGo so I'm not sure if that kind of stuff would work. I know someone here said the pilots are locked away in the studio's 'vault' and never seen again.. But could we raise enough money to buy them from the NBC's and CBS's?
21:09 🔗 arkhive Again I have little to no idea how studios do business and crowd funding... Just me thinking out loud.
21:31 🔗 phillipsj With more work, I may be able to copy C64 160k floppies (drives are notorious for going out of alignment)
21:34 🔗 CowerZZZZ ah yes, hand aligning 1541s with just an oscope was good for $30 a pop back in the day
