#archiveteam 2013-03-11,Mon

Time Nickname Message
00:41 🔗 SketchCo1 Great. I'll slam it over to the front.
00:41 🔗 SketchCo1 But yeah, we have to make sure these are all living.
00:44 🔗 SketchCo1 Posterous is now shoving in nicely, and I'm doing a few more that were in the hopper. But this is the post-work efforts.
02:12 🔗 SketchCo1 Well, impressively, we have 7 terabytes of to-be-uploaded material.
02:12 🔗 SketchCo1 (From all the projects.)
03:24 🔗 dashcloud SketchCo1: did your scan all the manuals at home project start yet?
03:30 🔗 SketchCo1 nope!
04:05 🔗 ivan` http://www.intrade.com/
04:09 🔗 omf_ ivan`, I am pulling the site down now
04:13 🔗 omf_ I got the whole site, uploading to IA now
04:14 🔗 ivan` :)
04:16 🔗 omf_ It took 53 seconds to mirror the site and more than 5x that for me to upload it :(
04:17 🔗 omf_ https://archive.org/details/intrade.com
04:20 🔗 SketchCo1 Modified and put in web.
04:20 🔗 SketchCo1 So, small point of order: Wayback won't touch it if it isn't type "web"
04:20 🔗 SketchCo1 And you can't set it web. I can.
04:21 🔗 SketchCo1 So be sure to tell me, OR I find it another way.
04:22 🔗 godane SketchCo1: i finally got the video pages on g4tv.com
04:23 🔗 SketchCo1 Great.
04:23 🔗 godane over 62k files on there
04:23 🔗 godane also we have comments pages too
04:24 🔗 godane it was better than my first grab earlier this year
04:24 🔗 godane there were no pages with that one
04:25 🔗 godane also the image dumps i'm doing are very big
04:25 🔗 omf_ SketchCo1, here are the small sites I did that need mediatype changes:
04:25 🔗 omf_ http://archive.org/details/RobinSachs.warc
04:25 🔗 omf_ http://archive.org/details/blog.memolane.com
04:25 🔗 omf_ http://archive.org/details/WarnerHomeVideo-SpaceJam
04:26 🔗 omf_ http://archive.org/details/TwitCleaner
04:29 🔗 SketchCo1 Converted.
04:30 🔗 godane looks like i just need another 220 downloads of kat.ph blog dump to be in the top 5 downloads of archiveteam collection
04:31 🔗 omf_ thanks
04:31 🔗 SketchCo1 Conversation has gone well: http://www.tosecdev.org/index.php/forum/index.php?topic=494.0
04:32 🔗 SketchCo1 http://archive.org/details/folkscanomy getting huge
04:35 🔗 omf_ anyone know of other service shutdowns that need a site grab, besides the ones we already know about?
04:36 🔗 omf_ smaller sites that do not need as much people power
04:37 🔗 SketchCow Well, there's lots of stuff lodged in the stacks.
04:38 🔗 SketchCow Like, we're finding amazing shit, to be frank
04:39 🔗 SketchCow http://archive.org/details/amigaformatmagazine was me shoving something in today
04:39 🔗 godane cool
04:40 🔗 godane i was grabbing amiga format magazines that i found on the internet
04:41 🔗 SketchCow Yeah.
04:41 🔗 SketchCow They're all up.
04:41 🔗 godane thats good
04:42 🔗 godane one less thing for me
04:45 🔗 godane SketchCow: missed one: http://awesome.commodore.me/downloads/magazine/Amiga_Format/Amiga_Format_Issue_105_(1997)(Future_Publishing)(GB)%5Bchristmas_edition%5D.pdf
05:02 🔗 SketchCow http://archive.org/details/amigaformatmagazine-105
05:02 🔗 godane i'm uploading attack of the show blog
05:03 🔗 godane thanks
05:42 🔗 SketchCow Who wants to be the WARC hero? http://internettourbus.com/
05:44 🔗 * chronomex raises hand
05:44 🔗 chronomex in danger?
05:44 🔗 chronomex or just needs good coverage?
05:54 🔗 chronomex -rw-r--r-- 1 duncan duncan 11M Mar 10 22:52 internettourbus.com.warc.gz
05:57 🔗 SketchCow Someone told me in danger, but they might have meant ignored and closed
06:01 🔗 godane i'm grabbing it
06:05 🔗 SketchCow chronomex has already grabbed it.
06:06 🔗 godane just noticed that
06:07 🔗 godane seeing at this point if its the right size
06:09 🔗 godane looks to be about 11mb
06:34 🔗 godane so i'm finding the missing videos
06:34 🔗 godane very boring
06:41 🔗 godane anyways i'm getting Video Game Tricks, Codes and Strategies Vol.1 from myspleen
07:47 🔗 godane uploaded: https://archive.org/details/www.g4tv.com-video-pages-20130309
08:38 🔗 godane so my account at thebox.bz is disabled again
08:38 🔗 godane good news is i got sky at night episodes from 1995
08:39 🔗 godane and i'm up to date with click episodes
10:04 🔗 omf_ There are rumors that echofon is going to shutdown the rest of their twitter apps
10:05 🔗 omf_ so here is a grab of the site https://archive.org/details/echofon.com
10:39 🔗 SketchCow added
10:41 🔗 SketchCow Can someone PLEASE grab http://www.oqotalk.com ? Due to http://www.oqotalk.com/index.php?topic=5304.0
10:46 🔗 omf_ SketchCow, I got a grab going on it now
11:08 🔗 SketchCow Thanks.
11:10 🔗 omf_ There look to be ~40,000 posts and I got over 4,000 so far
12:02 🔗 omf_ grabbing wrathofheroes.warhammeronline.com is proving problematic
12:02 🔗 omf_ it just does not want to grab all the pages
12:03 🔗 omf_ I tried numerous wget commands with span hosts and others skipping domains
12:03 🔗 omf_ even httrack is not finding pages that I can navigate to from the frontpage
12:04 🔗 godane so i got some good news
12:04 🔗 godane i registered on to g4tv.com forums
12:05 🔗 godane and there are more topics i have not archived yet
12:11 🔗 Smiley :O
12:14 🔗 godane doing --load-cookies=cookies.txt is not working
12:20 🔗 godane Smiley: i can't get it with wget
12:20 🔗 Smiley :<
12:20 🔗 Smiley useragent issue possibly?
12:25 🔗 godane adding firefox to user-agent doesn't work
12:28 🔗 Smiley hmmm some java scripting thing maybe :S
12:28 🔗 Smiley Im not good when wget doesn't work :(
12:34 🔗 omf_ wrathofheroes.warhammeronline.com is closing down March 29th
12:42 🔗 godane i need help guys
12:43 🔗 godane i can't grab the secret forums on g4
12:53 🔗 godane i'm getting it finally
12:53 🔗 godane i forgot to click the remember me button
12:55 🔗 godane if you don't have that then the cookies are not useful
13:44 🔗 omf_ fuck fuck FUCK. I just had to kill a 13 day download via wget because it was using 9gb of RAM
13:44 🔗 omf_ If I wasn't already writing a replacement for wget I would definitely be now
13:47 🔗 Smiley D:
13:47 🔗 Smiley :/
13:47 🔗 Smiley omf_: does it not write out anything?
13:47 🔗 Smiley :<<<
13:47 🔗 omf_ I have 52gb saved
13:48 🔗 omf_ but no way to determine where I am in the process
13:57 🔗 Smiley Is it a warc?
13:57 🔗 Smiley we should discuss this in #ispygames
14:42 🔗 omf_ Does anyone have experience beyond basic usage of heritrix
20:11 🔗 balrog_ so is punchfork gonna get done?
20:12 🔗 alard There are 39 hard cases left.
20:47 🔗 alard Our Friendster data has found its way into science: http://snap.stanford.edu/data/com-Friendster.html
20:51 🔗 ersi awesome
20:51 🔗 alard Also fun (but not published, it seems): http://www.sg.ethz.ch/media/publication_files/OSN_Kcore.pdf
20:52 🔗 alard They're trying to analyse the structure of successful and unsuccessful social networks.
20:52 🔗 balrog_ if anyone needs 1tb harddrives: http://www.newegg.com/Product/Product.aspx?Item=N82E16822149382
20:52 🔗 ersi pretty cool
21:55 🔗 godane so i just found out i screwed up the new forums dumps of g4
21:55 🔗 godane i will have to redo it
21:56 🔗 godane i think i forgot to put the cookies.txt file in the tmp folder i use to build the full warc.gz after getting the index warc
22:00 🔗 godane so we may have some more bad news about the forums
22:01 🔗 godane the s= urls are used in all the links
22:01 🔗 godane so i don't know if this stuff will be browsable at all
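A minimal sketch of one way to handle the `s=` problem godane describes: drop the session parameter from each link so that URLs differing only by session collapse into one. It assumes the session ID travels in a query parameter named `s`. The function name `strip_session_param` and the `showthread.php` example URL are hypothetical and are not taken from the actual forum dump.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_session_param(url, param="s"):
    """Return url with every occurrence of the query parameter `param` removed."""
    parts = urlsplit(url)
    # Keep every query pair except the session parameter, preserving order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Hypothetical link of the kind a session-tracking forum might emit.
    print(strip_session_param("http://forums.g4tv.com/showthread.php?s=abc123&t=42"))
```

This only normalizes a list of links. It does not rewrite URLs already recorded inside a WARC, so it would help mostly when building a clean URL list for a re-grab.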
22:34 🔗 dashcloud so, I'd like to back up a forum: http://diehardwolfers.areyep.com/index.php what commandline do I use here to do so?
22:35 🔗 ersi Something in the lines of `wget --warc-file=diehardwolfers.areyep.com --mirror --page-requisites http://diehardwolfers.areyep.com/index.php`
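A small sketch of how ersi's suggested command could be assembled from a script, for anyone grabbing several small sites in a row. It uses only the flags from the line above (`--warc-file`, `--mirror`, `--page-requisites`). The helper name `build_wget_command` and the choice to name the WARC after the host are assumptions for illustration.

```python
import shlex
from urllib.parse import urlsplit


def build_wget_command(start_url):
    """Build the wget argument list for a WARC mirror of start_url."""
    # Assumption: name the WARC file after the site's host, as in the example above.
    host = urlsplit(start_url).netloc
    return [
        "wget",
        f"--warc-file={host}",
        "--mirror",
        "--page-requisites",
        start_url,
    ]


if __name__ == "__main__":
    cmd = build_wget_command("http://diehardwolfers.areyep.com/index.php")
    # Print the command for review instead of running it.
    print(shlex.join(cmd))
```

To actually run the grab, the list could be passed to `subprocess.run(cmd, check=True)`, which needs a wget build with WARC support installed.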
22:40 🔗 arkhive Question to AT members: What happens to unsold television pilots. Like if the pilot episode gets made but the network decides against picking it up for a series..What happens to it? I read that there are a whole bunch that don't sell. And I found some online but not 'a whole bunch'
22:42 🔗 ersi Well, uh.. they.. disappear. Depends on if they ever get published somewhere or not. I imagine some media corps save all pilots
22:43 🔗 arkhive Is there a way to watch the ones that aren't released to the public? Might be a stupid question lol. I'm just a big fan of what could have been. heh.
22:44 🔗 ersi dunno man, I guess one is; Work at one of the media megacorps
22:45 🔗 ersi (Please do, and leak these kinds of things to the IA ;D)
22:45 🔗 arkhive And when a series gets cancelled before even getting through a first season. even when they made all the episodes for that season and they never air it. never release on streaming or itunes and such.
22:45 🔗 arkhive i guess the same.
22:45 🔗 arkhive I totally would
22:46 🔗 ersi think about all the rejected paper articles ;)
22:46 🔗 arkhive Like The Playboy Club show had a lot of potential. They filmed more episodes than they released. I think they should at least put them up somewhere. Especially when NBC has TPC on their website to watch.. though only the first three episodes.. doesn't make sense.
22:46 🔗 S[h]O[r]T if the media corp paid for the pilot to be made then they likely own the rights and hold the tape somewhere.
22:47 🔗 arkhive And Heist. It was stupid and cheesy but i enjoyed it. Cater to the masses though
22:47 🔗 arkhive ersi: ya. if only if only.
22:47 🔗 S[h]O[r]T a lot of times if its a big hyped series that gets canceled or internationals are interested, those countries still get those episodes
22:47 🔗 S[h]O[r]T because they purchase the season/series before it was canceled
22:48 🔗 S[h]O[r]T so youll see tv captures from europe and australia and sometimes canada where the episodes that didnt air in the USA eventually air there
22:48 🔗 arkhive S[h]O[r]T: ya i know but that happened to Heist if i remember right. like three more episodes were released in another country. but still one remains missing!!!
22:48 🔗 arkhive it's crazy
22:49 🔗 S[h]O[r]T unless the network has something to hide, if you were a tv network you could buy the rights to air them
22:49 🔗 S[h]O[r]T start a tv network that airs all the missing episodes :p
22:49 🔗 S[h]O[r]T pilots that are greenlit, people have them esp if its from the major corporations
22:50 🔗 arkhive Ya. I thought about it. Need money though. And the megacorps would charge too much for their failed series/unsold pilots
22:50 🔗 S[h]O[r]T they will send them out to press or screen at events, or send them to affiliate tv networks for their staff and marketing departments
22:50 🔗 S[h]O[r]T if it airs on tv, someone will record it
22:50 🔗 S[h]O[r]T otherwise its hard to find, you may not see it for years
22:51 🔗 arkhive But how come some remain missing? IT SUCKS!
22:51 🔗 S[h]O[r]T ill trade you pilots for un-edited sex tapes
22:51 🔗 arkhive haha
22:51 🔗 S[h]O[r]T i want all the non dubbed versions of the vivid sex tapes
22:51 🔗 S[h]O[r]T i guess this is -bs
22:51 🔗 arkhive well part of it was regular
22:51 🔗 S[h]O[r]T so to end...its their copyright and they chose what they want to do with it :(
22:52 🔗 arkhive hmm.. i wonder if i tweeted one of The Playboy Club actor/actresses to see if they have a copy, if they'd let me have it. :P if only if only
22:52 🔗 S[h]O[r]T they wouldnt
22:53 🔗 arkhive ya iknow
22:54 🔗 S[h]O[r]T most of the time they send tapes a week or so in advance to networks, they dont send the entire series
22:54 🔗 S[h]O[r]T but according to wikipedia fx latin america was going to air it but it got canceled. so im sure they had a contract to air it. and so did citytv in canada
22:54 🔗 arkhive It just seems stupid to keep them locked up to never see the light of day.
22:55 🔗 arkhive TPC or Heist?
22:56 🔗 S[h]O[r]T for heist wiki says the 6th aired in UK but not the 7th episode. so that was likely what their contract ran up until
22:57 🔗 S[h]O[r]T also dont forget that even tho NBC or another network may air them. they are most of the time produced by an entirely different media corp
22:57 🔗 arkhive Ya. Heist's Hot Digity episode..gone.
22:58 🔗 S[h]O[r]T there was a big concern when MGM filed for bankruptcy
23:00 🔗 S[h]O[r]T there was that dexters laboratory episode that was a rumor for years and eventually this year someone got approval to pull it from their vault and put it online
23:00 🔗 arkhive oh damn.
23:01 🔗 balrog_ arkhive: did you ever recover that hard drive?
23:02 🔗 arkhive Just about all of it. I PM'ed SketchCow like 3 times awhile back to tell him I'm ready to upload it and he never responded
23:03 🔗 arkhive But ya I got all that i could
23:03 🔗 arkhive which is almost all of it. like really close.
23:04 🔗 arkhive I didn't know if Jason was mad at me for my mess up or what lol.
23:04 🔗 arkhive He might have responded and I didn't get the message i don't know :)
23:06 🔗 ersi He's just pretty busy, and he's away from his IRC client a lot.
23:07 🔗 ersi Is this some private upload thingie? If not, I'd suggest giving it a go to upload to IA and just giving SketchCow the link.
23:07 🔗 arkhive Oh by the way. part of the folder structure is not intact (like the directory) because of my mess up. And i do sincerely apologize for my screw up.
23:07 🔗 arkhive It's in regards to MobileMe
23:09 🔗 ersi ah
23:11 🔗 balrog_ are the filenames intact?
23:11 🔗 arkhive ya
23:11 🔗 balrog_ then it probably can be reconstructed
23:12 🔗 arkhive Problem that i had was they were locked when I was trying to access on another computer. So I went through all of my computers and macs and used unlocking programs and such.
23:12 🔗 arkhive Like it wouldn't let me copy them to my folder to be uploaded. or even upload directly from where they were.
23:12 🔗 arkhive That problem is fixed though. Took a long ass time.
23:21 🔗 SketchCow I'm barraged. BARRAGED. Constantly.
23:21 🔗 SketchCow Believe me. You will know if I'm mad at you.
23:21 🔗 arkhive :)
23:21 🔗 SketchCow Your relatives going out to second cousins will know I'm mad at you
23:21 🔗 GLaDOS I'm quite close to my 5th cousins, will they know as well?
23:22 🔗 godane SketchCow: so it looks like i screwed up the first dumps of forums.g4tv.com
23:22 🔗 godane it has the s=session numbers in it
23:22 🔗 godane urls are saved as it should be
23:22 🔗 godane so maybe unable to really use it in wayback machine
23:24 🔗 godane just for you guys to know
23:24 🔗 godane i will not get every video id of g4tv.com
23:25 🔗 godane there are tons that are not active anymore but without knowing the file names that go with the ids i will most likely not be able to get them
