#archiveteam 2013-02-25,Mon


Time Nickname Message
00:32 🔗 SketchCow http://tracker.archiveteam.org/ looks amazing
00:41 🔗 NoMoon Soo... what's this I see about the Posterous project?
00:42 🔗 NoMoon Getting "No Item Received" from Punchfork, so figured I'd switch over, but it says to ask here first.
00:42 🔗 SketchCow Posterous is getting geared up.
00:43 🔗 omf_ http://tracker.archiveteam.org/posterous/
00:44 🔗 nomoon @omf: Yup, I see that. Just don't want to get banned.
00:58 🔗 SketchCow I see that.
00:58 🔗 SketchCow (I'm running it, to get banned.)
01:19 🔗 godane i need no tracker for my uploads
01:19 🔗 godane of course g4tv.com is maybe the last time i do a very big project
01:20 🔗 godane it fills like backing up geocities (to me) all by yourself
01:20 🔗 godane *feels
01:30 🔗 ChrisAM hi
01:31 🔗 ChrisAM What do I need to know before working on the Posterous project in Warrior?
01:33 🔗 omf_ you are going to get banned
01:33 🔗 omf_ everyone has
01:33 🔗 omf_ we are trying to figure it out
01:34 🔗 ChrisAM I don't personally mind getting banned, but should I wait until it's solved to start?
01:36 🔗 S[h]O[r]T godane i know i mentioned before and never really pursued but if you have scripts and lists that just need to be run for the HD videos on g4tv i can do that. ive got bw and storage
01:39 🔗 godane i have the list uploaded
01:39 🔗 godane https://archive.org/details/g4tv.com-video-url-list-1
01:40 🔗 godane just need sed -i 's|_flv\.flv|_flvhd.flv|' list
01:40 🔗 godane just know that i have seen some hd videos that i needed to edit the name more than that
01:41 🔗 godane one hd video (i think one of the g4films) i need to remove fix2 to download it
01:41 🔗 godane cause just change _flv to _flvhd was not enough
01:43 🔗 godane i also have all descs that i custom sed a xml warc to making the uploads have the right descs
01:46 🔗 godane just know you will have to start at about 26000 for hd stuff
01:47 🔗 godane you will not get it until 26419 (i think) and not every video will have hd
01:48 🔗 godane recent video doesn't even have hd
01:49 🔗 godane this one: http://www.g4tv.com/videos/61896/metal-gear-rising-revengeance-launch-trailer/
01:50 🔗 godane some part of me thinks i should have check for flvhd file first then if it doesn't exist go for flv
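A minimal sketch of the "check for flvhd first, then fall back to flv" idea godane describes above. It is an illustration only, not godane's actual script: the function names and the `exists` callback are hypothetical, and it assumes list entries end in `_flv.flv` as in the sed command. It does not handle the odd cases mentioned above, such as names that also need `fix2` removed.

```python
def hd_candidates(url):
    """Return the URLs to try in order: the HD variant first, then the original."""
    suffix = "_flv.flv"
    if url.endswith(suffix):
        return [url[: -len(suffix)] + "_flvhd.flv", url]
    # Not a name we know how to rewrite, so only the original is tried.
    return [url]


def first_available(url, exists):
    """Return the first candidate for which exists(candidate) is true, else None.

    `exists` is a caller-supplied check (hypothetical here), for example a
    function that issues a HEAD request and reports whether the file is there.
    """
    for candidate in hd_candidates(url):
        if exists(candidate):
            return candidate
    return None
```

Passing the existence check in as a function keeps the rewrite logic separate from the network code, so the ordering can be tried against a plain list of known files before touching the real site.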
02:46 🔗 db48x nomoon: everyone always gets banned
02:47 🔗 db48x you'll have to get a new ip address every hour
03:49 🔗 godane so my mouse couldn't left click
03:50 🔗 godane luckily i have an old mouse that i can use
04:25 🔗 underscor ok
04:25 🔗 underscor woah
04:25 🔗 underscor hi guys
04:25 🔗 underscor long time no see
04:25 🔗 underscor What've I missed?
04:27 🔗 underscor also, holy shit is it good to be back
04:27 🔗 godane hey underscor
04:27 🔗 godane i'm just trying to save g4tv.com
04:29 🔗 DFJustin underscor: everything is shutting down
04:29 🔗 underscor Yeah, posterous, gamespot (iirc?)
04:29 🔗 underscor anything else?
04:30 🔗 DFJustin not gamespot, ign/gamespy/1up
04:30 🔗 DFJustin which is a shitload of sites
04:32 🔗 DFJustin opensolaris
04:32 🔗 DFJustin punchfork
04:34 🔗 underscor damn
04:34 🔗 underscor Couldn't they spread themselves out? Jeez...
05:10 🔗 ewook *yawn*
05:10 🔗 ewook So, about posterous... dynamic ip is the way to go I guess then :)
05:14 🔗 SketchCow For the moment, yes.
05:46 🔗 ewook is it performed automagically, or are they hunting manually?
05:47 🔗 ewook might be fun to do this over Tor, if one can verify that a new endpoint would be chosen each time you reconnect.
06:01 🔗 db48x ewook: both
06:02 🔗 db48x ewook: we surmise that there is a cron job that runs 10 minutes before the hour which automatically bans based on some log info or other
06:03 🔗 db48x but it looks like there is occasionally human intervention as well, because some of the bans happen at other times, and the duration is variable
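One way to check the "cron job 10 minutes before the hour" guess is to bucket observed ban times by minute of the hour and see how many land near :50. The sketch below assumes the ban times have been noted by hand as `HH:MM` strings. The function names and the default window are made up for illustration, and nothing here comes from Posterous itself.

```python
from collections import Counter
from datetime import datetime


def minute_histogram(ban_times):
    """Count bans per minute-of-hour from a list of 'HH:MM' strings."""
    return Counter(datetime.strptime(t, "%H:%M").minute for t in ban_times)


def share_in_window(ban_times, start=50, end=59):
    """Return the fraction of bans whose minute falls within [start, end]."""
    if not ban_times:
        return 0.0
    hist = minute_histogram(ban_times)
    hits = sum(count for minute, count in hist.items() if start <= minute <= end)
    return hits / len(ban_times)
```

A share close to 1.0 would fit a purely scheduled job. A noticeable number of bans outside the window would fit the occasional human intervention db48x mentions.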
06:21 🔗 ewook db48x: geebus. Why in the world would one do that - bandwidth-capping sure, but simply toss the connection away? Does it look like one ban/one ip, or have you seen subnets being blocked in the manual part as well? (perhaps too early to tell?).
06:23 🔗 ewook one solution could be to ssh-tunnel and use cron to circulate between "ip's" and that might go under the radar, if I jump often enough.
06:26 🔗 ewook does the warrior still go full user get, and then move on, or does perhaps alternating between user-data work better?
06:27 🔗 SketchCow Maybe misidentify user agent to not be wget?
06:28 🔗 ewook If I were watching my site, I'd check for ip's going through a users data in the same manner, and with "dedication". Sure, that would be one way to do it, and much easier.
06:30 🔗 ewook oh, is punchfork done now? my warriors are just sitting pretty :).
06:31 🔗 db48x can't really say what was going on in their head when they wrote it
06:31 🔗 ewook sadly no :(.
08:44 🔗 SketchCow Hello,
08:44 🔗 SketchCow I have a question to ask about retrieving photos that were posted by a
08:44 🔗 SketchCow professional photographer to a Mobile Me gallery. I downloaded a few
08:44 🔗 SketchCow photos, but never the entire gallery. I have never used the MobileMe
08:44 🔗 SketchCow gallery for anything else, and when I just went to look at my photos,
08:44 🔗 SketchCow I saw the shocking news that the site is shut down. I am sick to
08:44 🔗 SketchCow think that these priceless photos of my toddler may be gone. Is there
08:44 🔗 SketchCow a way that your company could retrieve these photos?
08:44 🔗 SketchCow Thanks for your help,
08:44 🔗 SketchCow Kimberly
08:45 🔗 SmileyG SketchCow: <3, managed to help btw?
08:46 🔗 SketchCow Asking her for the username
08:52 🔗 SmileyG Do you have a list of these .... rescues anywhere?
08:53 🔗 SmileyG Ask people if they mind being added to a website listing those who've contacted you/the group and been helped.
09:01 🔗 omf_ Yeah testimonials work just like good reviews in driving more people to the cause
09:02 🔗 SmileyG In one of the videos, at defcon I think. The site for that child who had died.
09:02 🔗 SmileyG That convinced me.
09:02 🔗 SmileyG The one on geocities
09:03 🔗 omf_ I saw that clip in the Open Source Bridge talk
09:03 🔗 omf_ I watched that talk and was happy to learn people were coming together to be more proactive about saving the internet
09:04 🔗 SmileyG tbh I'd never really thought about it before.
09:04 🔗 omf_ I had heard about geocities but at the time I thought it was a one time gig
09:04 🔗 SmileyG I still haven't checked to see if my site is in the geocities grab.
09:04 🔗 omf_ To me the geocities clip shows the impressive value of stored culture.
09:04 🔗 omf_ All my old ones are
09:04 🔗 omf_ like 7 of them
09:05 🔗 omf_ I did not even have backups of those
09:05 🔗 omf_ I had erased them a long time ago
09:05 🔗 omf_ All this data saved has immense value
09:05 🔗 omf_ everything else is just trying to explain that point to people
09:05 🔗 SmileyG I don't know how to check tbh.
09:05 🔗 SmileyG as it's not a warc?
09:10 🔗 SketchCow http://gallery.me.com/stephaniefay1/101561
09:10 🔗 SketchCow Anyone want to take a shot at finding this?
09:12 🔗 SketchCow alard: Let me know if we can find this outside the index
09:12 🔗 SketchCow Or if there's a list somewhere of what all we grabbed
11:31 🔗 offender hi everyone!
11:31 🔗 offender burn all jews in oven!
11:31 🔗 offender death to jews
11:31 🔗 offender sieg heil
11:31 🔗 offender burn all jews in GAS oven
11:32 🔗 SmileyG ty.
11:32 🔗 joepie91 funny how his kick reason is the same as his nick, yet appropriate
11:32 🔗 SmileyG ;)
11:33 🔗 SmileyG I would have done offended ;)
11:33 🔗 offender hi i came back
11:33 🔗 offender death to jews!
11:33 🔗 offender sieg heil
11:33 🔗 offender allahu akhbar
11:34 🔗 SmileyG Like so
11:34 🔗 SmileyG wtf where did my reasoning go
11:34 🔗 SmileyG :D
12:12 🔗 ersi Bleh, missed an @ - which made things a little silly.
12:21 🔗 GLaDOS ersi: wut r u doin
12:22 🔗 GLaDOS oh wait you fixed it
12:28 🔗 SmileyG D:
12:28 🔗 SmileyG ersi: :D
12:28 🔗 SmileyG i just did it for speed.
15:28 🔗 DFJustin moved some channels which I think are finished to the idle section on http://www.archiveteam.org/index.php?title=IRC - if I missed some or moved incorrectly pls fix kthxbye
15:35 🔗 ersi DFJustin: Good job.
15:40 🔗 sep332 other than username and concurrent items, what configuration do i have to do for the warrior?
15:40 🔗 sep332 I click "Available projects" but nothing happens
15:43 🔗 db48x sep332: it should load a page that lets you choose which project to help with
15:43 🔗 sep332 that's what i expected but nothing happens
15:43 🔗 sep332 i tried chrome and firefox
15:44 🔗 sep332 (my firefox is full of extensions that could cause problems but chrome is clean)
15:51 🔗 sep332 is this the right place to carp about the warrior? :)
15:51 🔗 db48x it's as good as any
15:52 🔗 db48x I'm trying to load the api page that the project list comes from
15:53 🔗 db48x hmm, got an http error
15:53 🔗 db48x that would cause it trouble
15:56 🔗 ersi I think there might be issues with the back-end tracking servers - which I think is why you're having problems sep332 with the "Project listing" page
15:56 🔗 sep332 ah, ok
15:56 🔗 sep332 is http://tracker.archiveteam.org/ supposed to load in a normal browser?
15:57 🔗 sep332 it's throwing 502's right now
15:57 🔗 db48x yes
15:57 🔗 db48x that's the problem
15:57 🔗 db48x (although the warrior loads a slightly different url for this sort of thing)
15:57 🔗 sep332 gotcha, thanks a lot db48x
15:57 🔗 ersi I'd just sit tight - it'll be fixed :-)
15:57 🔗 db48x you're welcome
15:58 🔗 sep332 this looked pretty simple and i was feeling dumb for not even being able to turn it on :)
15:59 🔗 db48x yea, it's our fault for not having a warning or something
15:59 🔗 db48x it just assumes that the request didn't fail
16:00 🔗 sep332 will i have to restart it later, or can i just keep clicking the button to see if it works? :)
16:01 🔗 db48x you'll have to restart it
16:01 🔗 sep332 ok
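The failure db48x describes (the project list request fails with a 502 and the page silently does nothing) can be avoided by treating the HTTP error as a normal outcome and reporting it. The sketch below is not the warrior's actual code: `load_projects`, the warning texts, and the assumption that the list is served as JSON are illustrative, and the real URL the warrior loads is not given in the log.

```python
import json
import urllib.error
import urllib.request


def load_projects(url, opener=urllib.request.urlopen):
    """Return (projects, warning); exactly one of the two is None.

    `opener` defaults to urllib.request.urlopen and can be swapped out,
    which makes the error paths easy to exercise without a network.
    """
    try:
        with opener(url, timeout=10) as response:
            return json.load(response), None
    except urllib.error.HTTPError as err:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        return None, "tracker returned HTTP %d; try again later" % err.code
    except urllib.error.URLError as err:
        return None, "could not reach tracker: %s" % err.reason
    except ValueError:
        # json.JSONDecodeError is a ValueError subclass.
        return None, "tracker sent a response that is not valid JSON"
```

With the warning returned to the caller, the interface could show a message and offer a retry instead of requiring a restart.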
16:07 🔗 ersi sep332: Feel free to hang around or come around from time to time, by the way :-)
16:08 🔗 sep332 thanks! i should hang out here more often I think
16:09 🔗 db48x hmm
16:09 🔗 db48x I just realized that I don't have any way to diff two warcs
16:11 🔗 ersi What are you looking to accomplish? Compare each record with each other? De-duplicate? See how some Target-URI has changed on different fetches?
16:11 🔗 db48x yes
16:12 🔗 db48x comparing the results of two runs with different user agent strings
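A rough sketch of the kind of WARC diff db48x is after, using only the standard library. It assumes uncompressed WARC input (gzipped files would need decompressing first), looks only at `response` records, and hashes the body after the HTTP headers so that a changed `Date` header does not mark every page as different. The function names are made up for this example, and a dedicated WARC library would be more robust.

```python
import hashlib


def read_warc_records(stream):
    """Yield (headers, block) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return
        if not line.strip():
            continue  # blank lines separate consecutive records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % line)
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        yield headers, stream.read(int(headers["content-length"]))


def digest_by_uri(stream):
    """Map each response record's WARC-Target-URI to a SHA-1 of its payload."""
    digests = {}
    for headers, block in read_warc_records(stream):
        if headers.get("warc-type") == "response":
            # Drop the HTTP status line and headers; keep only the body.
            payload = block.split(b"\r\n\r\n", 1)[-1]
            digests[headers.get("warc-target-uri")] = hashlib.sha1(payload).hexdigest()
    return digests


def diff_warcs(left, right):
    """Compare two WARC streams by target URI and payload digest."""
    a, b = digest_by_uri(left), digest_by_uri(right)
    return {
        "only_left": sorted(set(a) - set(b)),
        "only_right": sorted(set(b) - set(a)),
        "changed": sorted(u for u in set(a) & set(b) if a[u] != b[u]),
    }
```

For two runs with different user agent strings, the `changed` list shows which URLs returned different bodies, and `only_left` / `only_right` show URLs fetched in just one run. If a URL appears more than once in a file, only its last response is compared.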
19:10 🔗 chronomex alard: wtf is tracker:/var/www/rsync for?
19:10 🔗 chronomex it's killing the box
19:16 🔗 chronomex doing what I can to free other space up in the meantime
19:40 🔗 sep332 what am I supposed to ask about before running the Posterous project in the warrior?
19:47 🔗 soultcer sep332: Posterous will ban you sooner or later
19:47 🔗 sep332 ok...
19:47 🔗 soultcer So if you need to visit any posterous blogs, you should not run it in the warrior
19:47 🔗 sep332 is that per-ip?
19:47 🔗 sep332 oh ok thanks
19:49 🔗 soultcer Yes, per IP
20:59 🔗 alard chronomex: Ah, grrr. /var/www/rsync is the debugging space that I use as a temporary upload area if I want to inspect a few warcs. I should remember to switch it back.
21:02 🔗 soultcer Do you have a space where you can put the data? Fos?
21:03 🔗 alard Yes, I've now switched the upload target back to fos. (There's an option in the tracker admin panel.)
21:03 🔗 alard I'm rsyncing-and-deleting the files to fos.
21:03 🔗 chronomex thanx
21:06 🔗 SketchCow FOS is ready for it.
21:08 🔗 alard SketchCow: That's very good, since most of it was already going that way. It's just that FOS is missing an easy just-let-me-look-in-that-warc-file-to-see-what's-wrong option.
21:08 🔗 balrog_ posterous bans for 7-10 days
21:09 🔗 alard SketchCow: If stephaniefay1 is not in the index, I wouldn't know how to find her, I.
21:09 🔗 alard ... I'm afraid.
21:10 🔗 SketchCow That's fine.
21:10 🔗 SketchCow We tried.
21:11 🔗 alard She's also not in the tracker log.
21:11 🔗 SketchCow So are we doing EC2 to get around this ban?
21:11 🔗 SketchCow The ban pisses me off.
21:12 🔗 omf_ I am going to be testing some vpn services in a little while
21:13 🔗 omf_ rotating ips
21:13 🔗 soultcer I thought most VPN services have a static IP that is shared by many customers
21:13 🔗 omf_ depends on who you go with
21:14 🔗 SketchCow Do we have ANY idea how big posterous will be?
21:14 🔗 omf_ Some have multiple class C subnets you can switch between and since ips are pooled you rotate on connection
21:14 🔗 omf_ We know # of accounts
21:15 🔗 omf_ over 9 million
21:15 🔗 alard Heroku could also be an option, if anyone would want to go that way: https://github.com/ArchiveTeam/heroku-buildpack-archiveteam-warrior
21:26 🔗 nomoon Is that Heroku pack functional on 1 dyno?
21:26 🔗 nomoon I see that Heroku's AUP says "please don't use more than 2TB/month" so I guess bandwidth isn't really an issue.
21:28 🔗 alard nomoon: Yes, I think so.
22:51 🔗 godane ADD Moment: looks like E3 2005 G4 Coverage was co-production with ign.com
22:52 🔗 godane so in some weird way i'm helping save stuff from ign
22:52 🔗 godane but not really
23:55 🔗 SketchCow AT .
23:55 🔗 SketchCow Our company Millenniata makes a 1,000-year data storage disc called the M-Disc which comes in Blu Ray (25 GB) and DVD (4.7 GB). We would love to share with your team how these discs can help you in your quest to archive history for a long period of time. Check out our website at www.mdisc.com. We'd be happy to answer any questions you have and explain how this might benefit you in your quest.
23:56 🔗 SketchCow Thank you,
23:56 🔗 SketchCow Josh Krall
23:56 🔗 SketchCow Director of Marketing
23:58 🔗 omf_ I had heard of them. I might order a 10pack to try out
23:58 🔗 chronomex unfortunately I don't have a time machine to transport these back to 1500 or so and test them out
23:59 🔗 omf_ wait nevermind I thought they had bluray disks
23:59 🔗 omf_ There are heat tests
23:59 🔗 dashcloud will they send out lots of them as test discs?
23:59 🔗 omf_ soak them in water
