Time |
Nickname |
Message |
00:41
🔗
|
SketchCo1 |
Great. I'll slam it over to the front. |
00:41
🔗
|
SketchCo1 |
But yeah, we have to make sure these are all living. |
00:44
🔗
|
SketchCo1 |
Posterous is now shoving in nicely, and I'm doing a few more that were in the hopper. But this is the post-work efforts. |
02:12
🔗
|
SketchCo1 |
Well, impressively, we have 7 terabytes of to-be-uploaded material. |
02:12
🔗
|
SketchCo1 |
(From all the projects.) |
03:24
🔗
|
dashcloud |
SketchCo1: did your "scan all the manuals at home" project start yet? |
03:30
🔗
|
SketchCo1 |
nope! |
04:05
🔗
|
ivan` |
http://www.intrade.com/ |
04:09
🔗
|
omf_ |
ivan`, I am pulling the site down now |
04:13
🔗
|
omf_ |
I got the whole site, uploading to IA now |
04:14
🔗
|
ivan` |
:) |
04:16
🔗
|
omf_ |
It took 53 seconds to mirror the site and more than 5x that for me to upload it :( |
04:17
🔗
|
omf_ |
https://archive.org/details/intrade.com |
04:20
🔗
|
SketchCo1 |
Modified and put in web. |
04:20
🔗
|
SketchCo1 |
So, small point of order: Wayback won't touch it if it isn't type "web" |
04:20
🔗
|
SketchCo1 |
And you can't set it web. I can. |
04:21
🔗
|
SketchCo1 |
So be sure to tell me, OR I find it another way. |
04:22
🔗
|
godane |
SketchCo1: i finally got the video pages on g4tv.com |
04:23
🔗
|
SketchCo1 |
Great. |
04:23
🔗
|
godane |
over 62k files on there |
04:23
🔗
|
godane |
also we have comments pages too |
04:24
🔗
|
godane |
it was better than my first grab earlier this year |
04:24
🔗
|
godane |
there were no pages with that one |
04:25
🔗
|
godane |
also the images dumps i'm doing are very big |
04:25
🔗
|
omf_ |
SketchCo1, here are the small sites I did that need mediatype changes: |
04:25
🔗
|
omf_ |
http://archive.org/details/RobinSachs.warc |
04:25
🔗
|
omf_ |
http://archive.org/details/blog.memolane.com |
04:25
🔗
|
omf_ |
http://archive.org/details/WarnerHomeVideo-SpaceJam |
04:26
🔗
|
omf_ |
http://archive.org/details/TwitCleaner |
04:29
🔗
|
SketchCo1 |
Converted. |
04:30
🔗
|
godane |
looks like i just need another 220 downloads of kat.ph blog dump to be in the top 5 downloads of archiveteam collection |
04:31
🔗
|
omf_ |
thanks |
04:31
🔗
|
SketchCo1 |
Conversation has gone well: http://www.tosecdev.org/index.php/forum/index.php?topic=494.0 |
04:32
🔗
|
SketchCo1 |
http://archive.org/details/folkscanomy getting huge |
04:35
🔗
|
omf_ |
anyone know of other shutdowns of services that need a site grab, besides what we know? |
04:36
🔗
|
omf_ |
smaller sites that do not need as much people power |
04:37
🔗
|
SketchCow |
Well, there's lots of stuff lodged in the stacks. |
04:38
🔗
|
SketchCow |
Like, we're finding amazing shit, to be frank |
04:39
🔗
|
SketchCow |
http://archive.org/details/amigaformatmagazine was me shoving something in today |
04:39
🔗
|
godane |
cool |
04:40
🔗
|
godane |
i was grabbing amiga format magazines that i found on the internet |
04:41
🔗
|
SketchCow |
Yeah. |
04:41
🔗
|
SketchCow |
They're all up. |
04:41
🔗
|
godane |
thats good |
04:42
🔗
|
godane |
one less thing for me |
04:45
🔗
|
godane |
SketchCow: missed one: http://awesome.commodore.me/downloads/magazine/Amiga_Format/Amiga_Format_Issue_105_(1997)(Future_Publishing)(GB)%5Bchristmas_edition%5D.pdf |
05:02
🔗
|
SketchCow |
http://archive.org/details/amigaformatmagazine-105 |
05:02
🔗
|
godane |
i'm uploading attack of the show blog |
05:03
🔗
|
godane |
thanks |
05:42
🔗
|
SketchCow |
Who wants to be the WARC hero? http://internettourbus.com/ |
05:44
🔗
|
* |
chronomex raises hand |
05:44
🔗
|
chronomex |
in danger? |
05:44
🔗
|
chronomex |
or just needs good coverage? |
05:54
🔗
|
chronomex |
-rw-r--r-- 1 duncan duncan 11M Mar 10 22:52 internettourbus.com.warc.gz |
05:57
🔗
|
SketchCow |
Someone told me in danger, but they might have meant ignored and closed |
06:01
🔗
|
godane |
i'm grabbing it |
06:05
🔗
|
SketchCow |
chronomex has already grabbed it. |
06:06
🔗
|
godane |
just noticed that |
06:07
🔗
|
godane |
seeing at this point if its the right size |
06:09
🔗
|
godane |
looks to be about 11mb |
06:34
🔗
|
godane |
so i'm finding the missing videos |
06:34
🔗
|
godane |
very boring |
06:41
🔗
|
godane |
anyways i'm getting Video Game Tricks, Codes and Strategies Vol.1 from myspleen |
07:47
🔗
|
godane |
uploaded: https://archive.org/details/www.g4tv.com-video-pages-20130309 |
08:38
🔗
|
godane |
so my account at thebox.bz is disabled again |
08:38
🔗
|
godane |
good news is i got sky at night episodes from 1995 |
08:39
🔗
|
godane |
and i'm up to date with click episodes |
10:04
🔗
|
omf_ |
There are rumors that echofon is going to shut down the rest of their twitter apps |
10:05
🔗
|
omf_ |
so here is a grab of the site https://archive.org/details/echofon.com |
10:39
🔗
|
SketchCow |
added |
10:41
🔗
|
SketchCow |
Can someone PLEASE grab http://www.oqotalk.com ? Due to http://www.oqotalk.com/index.php?topic=5304.0 |
10:46
🔗
|
omf_ |
SketchCow, I got a grab going on it now |
11:08
🔗
|
SketchCow |
Thanks. |
11:10
🔗
|
omf_ |
There look to be ~40,000 posts and I got over 4,000 so far |
12:02
🔗
|
omf_ |
grabbing wrathofheroes.warhammeronline.com is proving problematic |
12:02
🔗
|
omf_ |
it just does not want to grab all the pages |
12:03
🔗
|
omf_ |
I tried numerous wget commands with span hosts and others skipping domains |
12:03
🔗
|
omf_ |
even httrack is not finding pages that I can navigate to from the frontpage |
12:04
🔗
|
godane |
so i got some good news |
12:04
🔗
|
godane |
i registered on to g4tv.com forums |
12:05
🔗
|
godane |
and there are more topics i have not archived yet |
12:11
🔗
|
Smiley |
:O |
12:14
🔗
|
godane |
doing --load-cookies=cookies.txt is not working |
12:20
🔗
|
godane |
Smiley: i can't get it with wget |
12:20
🔗
|
Smiley |
:< |
12:20
🔗
|
Smiley |
useragent issue possibly? |
12:25
🔗
|
godane |
adding firefox to user-agent doesn't work |
12:28
🔗
|
Smiley |
hmmm some java scripting thing maybe :S |
12:28
🔗
|
Smiley |
Im not good when wget doesn't work :( |
12:34
🔗
|
omf_ |
wrathofheroes.warhammeronline.com is closing down March 29th |
12:42
🔗
|
godane |
i need help guys |
12:43
🔗
|
godane |
i can't grab the secret forums on g4 |
12:53
🔗
|
godane |
i'm getting it finally |
12:53
🔗
|
godane |
i forgot to click the remember me button |
12:55
🔗
|
godane |
if you don't have that then the cookies are not useful |
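godane's fix suggests the exported cookies only helped once the login was a persistent one. A rough way to check an exported cookies.txt before a long grab, assuming it is in the Netscape format that wget's `--load-cookies` reads, where the fifth tab-separated field is the expiry time and 0 marks a session-only cookie. `count_session_cookies` is a hypothetical helper name, and the URL and file names in the comments are placeholders:

```shell
#!/bin/sh
# count_session_cookies FILE
# Hypothetical helper: prints how many cookies in a Netscape-format
# cookies.txt have an expiry of 0 (session-only). Assumes 7 tab-separated
# fields per cookie line, with the expiry time in field 5.
count_session_cookies() {
    awk -F '\t' 'NF == 7 && $5 == 0 { n++ } END { print n + 0 }' "$1"
}

# Illustrative use before a logged-in grab (placeholders throughout):
#   if [ "$(count_session_cookies cookies.txt)" -gt 0 ]; then
#       echo "some cookies are session-only; log in again with 'remember me'" >&2
#   fi
#   wget --load-cookies=cookies.txt --warc-file=forum \
#        --mirror --page-requisites http://forum.example.com/
```

A non-zero count does not prove the grab will fail; it is only a hint that the login may not carry over to wget.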
13:44
🔗
|
omf_ |
fuck fuck FUCK. I just had to kill a 13 day download via wget because it was using 9gb of RAM |
13:44
🔗
|
omf_ |
If I wasn't already writing a replacement for wget I would definitely be now |
13:47
🔗
|
Smiley |
D: |
13:47
🔗
|
Smiley |
:/ |
13:47
🔗
|
Smiley |
omf_: does it not write out anything? |
13:47
🔗
|
Smiley |
:<<< |
13:47
🔗
|
omf_ |
I have 52gb saved |
13:48
🔗
|
omf_ |
but no way to determine where I am in the process |
13:57
🔗
|
Smiley |
Is it a warc? |
13:57
🔗
|
Smiley |
we should discuss this in #ispygames |
14:42
🔗
|
omf_ |
Does anyone have experience beyond basic usage of heritrix |
20:11
🔗
|
balrog_ |
so is punchfork gonna get done? |
20:12
🔗
|
alard |
There are 39 hard cases left. |
20:47
🔗
|
alard |
Our Friendster data has found its way into science: http://snap.stanford.edu/data/com-Friendster.html |
20:51
🔗
|
ersi |
awesome |
20:51
🔗
|
alard |
Also fun (but not published, it seems): http://www.sg.ethz.ch/media/publication_files/OSN_Kcore.pdf |
20:52
🔗
|
alard |
They're trying to analyse the structure of successful and unsuccessful social networks. |
20:52
🔗
|
balrog_ |
if anyone needs 1tb harddrives: http://www.newegg.com/Product/Product.aspx?Item=N82E16822149382 |
20:52
🔗
|
ersi |
pretty cool |
21:55
🔗
|
godane |
so i just found out i screwed up the new forums dumps of g4 |
21:55
🔗
|
godane |
i will have to redo it |
21:56
🔗
|
godane |
i think i forgot to put the cookies.txt file in the tmp folder i use to build the full warc.gz after getting the index warc |
22:00
🔗
|
godane |
so we may have some more bad news about the forums |
22:01
🔗
|
godane |
the s= urls are used in all the links |
22:01
🔗
|
godane |
so i don't know if this stuff will be browsable at all |
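One way to see how much of a dump is affected is to normalise the URLs and compare. A minimal sketch, assuming the session id travels as a hex-valued `s=` query parameter (the chat only says "s= urls"); `strip_session_id` is a hypothetical helper and the example.com URLs are placeholders:

```shell
#!/bin/sh
# strip_session_id URL
# Hypothetical helper: removes an "s=<hex>" query parameter from a URL.
# Assumes the session id is hexadecimal; other "s=" values are left alone.
strip_session_id() {
    printf '%s\n' "$1" | sed \
        -e 's/\([?&]\)s=[0-9a-fA-F]*&/\1/' \
        -e 's/[?&]s=[0-9a-fA-F]*$//'
}

# Illustrative use:
#   strip_session_id 'http://forums.example.com/showthread.php?s=0123abcd&t=42'
```

This only rewrites a list of URLs; it does not change what is already stored in a WARC, so a re-grab that avoids the session links would still be needed for playback.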
22:34
🔗
|
dashcloud |
so, I'd like to back up a forum: http://diehardwolfers.areyep.com/index.php what commandline do I use here to do so? |
22:35
🔗
|
ersi |
Something in the lines of `wget --warc-file=diehardwolfers.areyep.com --mirror --page-requisites http://diehardwolfers.areyep.com/index.php` |
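ersi's command line can be wrapped in a small function so the WARC name follows the host automatically. This is only a sketch: `mirror_site` and `DRY_RUN` are hypothetical names, and `--warc-cdx`, `--wait=1` and `--random-wait` are additions that are not in the chat (an index file and some politeness toward the server).

```shell
#!/bin/sh
# mirror_site URL
# Hypothetical wrapper around the wget invocation suggested in the chat.
# With DRY_RUN=1 it prints the command instead of running it.
mirror_site() {
    url=$1
    # Use the host part of the URL as the WARC base name.
    host=$(printf '%s\n' "$url" | sed -e 's|^[a-z]*://||' -e 's|/.*$||')
    # Build the command in the function's own positional parameters.
    set -- wget --warc-file="$host" --warc-cdx \
        --mirror --page-requisites \
        --wait=1 --random-wait \
        "$url"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Illustrative use:
#   DRY_RUN=1 mirror_site http://diehardwolfers.areyep.com/index.php
```

Running it with `DRY_RUN=1` first makes it easy to check the flags before starting a long grab.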
22:40
🔗
|
arkhive |
Question to AT members: What happens to unsold television pilots. Like if the pilot episode gets made but the network decides against picking it up for a series..What happens to it? I read that there are a whole bunch that don't sell. And I found some online but not 'a whole bunch' |
22:42
🔗
|
ersi |
Well, uh.. they.. disappear. Depends on if they ever get published somewhere or not. I imagine some media corps save all pilots |
22:43
🔗
|
arkhive |
Is there a way to watch the ones that aren't released to the public? Might be a stupid question lol. I'm just a big fan of what could have been. heh. |
22:44
🔗
|
ersi |
dunno man, I guess one is; Work at one of the media megacorps |
22:45
🔗
|
ersi |
(Please do, and leak these kinds of things to the IA ;D) |
22:45
🔗
|
arkhive |
And when a series gets cancelled before even getting through a first season. even when they made all the episodes for that season and they never air it. never release on streaming or itunes and such. |
22:45
🔗
|
arkhive |
i guess the same. |
22:45
🔗
|
arkhive |
I totally would |
22:46
🔗
|
ersi |
think about all the rejected paper articles ;) |
22:46
🔗
|
arkhive |
Like The Playboy Club show had a lot of potential. They filmed more episodes than they released. I think they should at least put them up somewhere. Especially when NBC has TPC on their website to watch.. though only the first three episodes.. doesn't make sense. |
22:46
🔗
|
S[h]O[r]T |
if the media corp paid for the pilot to be made then they likely own the rights and hold the tape somewhere. |
22:47
🔗
|
arkhive |
And Heist. It was stupid and cheesy but i enjoyed it. Cater to the masses though |
22:47
🔗
|
arkhive |
ersi: ya. if only if only. |
22:47
🔗
|
S[h]O[r]T |
a lot of times if its a big hyped series that gets canceled or internationals are interested, those countries still get those episodes |
22:47
🔗
|
S[h]O[r]T |
because they purchase the season/series before it was canceled |
22:48
🔗
|
S[h]O[r]T |
so youll see tv captures from europe and australia and sometimes canada where the episodes that didnt air in the USA eventually air there |
22:48
🔗
|
arkhive |
S[h]O[r]T: ya i know but that happened to Heist if i remember right. like three more episodes were released in another country. but still one remains missing!!! |
22:48
🔗
|
arkhive |
it's crazy |
22:49
🔗
|
S[h]O[r]T |
unless the network has something to hide, if you were a tv network you could buy the rights to air them |
22:49
🔗
|
S[h]O[r]T |
start a tv network that airs all the missing episodes :p |
22:49
🔗
|
S[h]O[r]T |
pilots that are greenlit, people have them, esp if its from the major corporations |
22:50
🔗
|
arkhive |
Ya. I thought about it. Need money though. And the megacorps would charge too much for their failed series/unsold pilots |
22:50
🔗
|
S[h]O[r]T |
they will send them out to press or screen at events, or send them to affiliate tv networks for their staff and marketing departments |
22:50
🔗
|
S[h]O[r]T |
if it airs on tv, someone will record it |
22:50
🔗
|
S[h]O[r]T |
otherwise its hard to find/you may never for years |
22:51
🔗
|
arkhive |
But how come some remain missing? IT SUCKS! |
22:51
🔗
|
S[h]O[r]T |
ill trade you pilots for un-edited sex tapes |
22:51
🔗
|
arkhive |
haha |
22:51
🔗
|
S[h]O[r]T |
i want all the non dubbed versions of the vivid sex tapes |
22:51
🔗
|
S[h]O[r]T |
i guess this is -bs |
22:51
🔗
|
arkhive |
well part of it was regular |
22:51
🔗
|
S[h]O[r]T |
so to end...its their copyright and they chose what they want to do with it :( |
22:52
🔗
|
arkhive |
hmm.. i wonder if i tweeted one of The Playboy Club actor/actresses to see if they have a copy, if they'd let me have it. :P if only if only |
22:52
🔗
|
S[h]O[r]T |
they wouldnt |
22:53
🔗
|
arkhive |
ya i know |
22:54
🔗
|
S[h]O[r]T |
most of the time they send tapes a week or so in advance to networks, they dont send the entire series |
22:54
🔗
|
S[h]O[r]T |
but according to wikipedia fx latin america was going to air it but it got canceled. so im sure they had a contract to air it. and so did citytv in canada |
22:54
🔗
|
arkhive |
It just seems stupid to keep them locked up to never see the light of day. |
22:55
🔗
|
arkhive |
TPC or Heist? |
22:56
🔗
|
S[h]O[r]T |
for heist wiki says the 6th aired in UK but not the 7th episode. so that was likely what their contract ran up until |
22:57
🔗
|
S[h]O[r]T |
also dont forget that even tho NBC or another network may air them. they are most of the time produced by an entirely different media corp |
22:57
🔗
|
arkhive |
Ya. Heist's Hot Digity episode..gone. |
22:58
🔗
|
S[h]O[r]T |
there was a big concern when MGM filed for bankruptcy |
23:00
🔗
|
S[h]O[r]T |
there was that dexter's laboratory episode that was a rumor for years and eventually this year someone got approval to pull it from their vault and put it online |
23:00
🔗
|
arkhive |
oh damn. |
23:01
🔗
|
balrog_ |
arkhive: did you ever recover that hard drive? |
23:02
🔗
|
arkhive |
Just about all of it. I PM'ed SketchCow like 3 times awhile back to tell him I'm ready to upload it and he never responded |
23:03
🔗
|
arkhive |
But ya I got all that i could |
23:03
🔗
|
arkhive |
which is almost all of it. like really close. |
23:04
🔗
|
arkhive |
I didn't know if Jason was mad at me for my mess up or what lol. |
23:04
🔗
|
arkhive |
He might have responded and I didn't get the message i don't know :) |
23:06
🔗
|
ersi |
He's just pretty busy, and he's away from his IRC client a lot. |
23:07
🔗
|
ersi |
Is this some private upload thingie? If not, I'd suggest giving it a go to upload to IA and just giving SketchCow the link. |
23:07
🔗
|
arkhive |
Oh by the way. part of the folder structure is not intact (like the directory) because of my mess up. And i do sincerely apologize for my screw up. |
23:07
🔗
|
arkhive |
It's in regards to MobileMe |
23:09
🔗
|
ersi |
ah |
23:11
🔗
|
balrog_ |
are the filenames intact? |
23:11
🔗
|
arkhive |
ya |
23:11
🔗
|
balrog_ |
then it probably can be reconstructed |
23:12
🔗
|
arkhive |
Problem that i had was they were locked when I was trying to access on another computer. So I went through all of my computers and macs and used unlocking programs and such. |
23:12
🔗
|
arkhive |
Like it wouldn't let me copy them to my folder to be uploaded. or even upload directly from where they were. |
23:12
🔗
|
arkhive |
That problem is fixed though. Took a long ass time. |
23:21
🔗
|
SketchCow |
I'm barraged. BARRAGED. Constantly. |
23:21
🔗
|
SketchCow |
Believe me. You will know if I'm mad at you. |
23:21
🔗
|
arkhive |
:) |
23:21
🔗
|
SketchCow |
Your relatives going out to second cousins will know I'm mad at you |
23:21
🔗
|
GLaDOS |
I'm quite close to my 5th cousins, will they know as well? |
23:22
🔗
|
godane |
SketchCow: so it looks like i screwed up the first dumps of forums.g4tv.com |
23:22
🔗
|
godane |
it has the s=session numbers in it |
23:22
🔗
|
godane |
urls are saved as it should be |
23:22
🔗
|
godane |
so maybe unable to really use it in wayback machine |
23:24
🔗
|
godane |
just for you guys to know |
23:24
🔗
|
godane |
i will not get every video id of g4tv.com |
23:25
🔗
|
godane |
there are tons that are not active anymore but without knowing the file name with the ids i will most likely not be able to get them |