#archiveteam 2014-11-30,Sun


Time Nickname Message
00:06 mistym has quit IRC (Remote host closed the connection)
00:16 godane has joined #archiveteam
00:57 primus104 has quit IRC (Leaving.)
00:57 mistym has joined #archiveteam
00:58 primus104 has joined #archiveteam
01:11 primus104 has quit IRC (Leaving.)
01:12 primus104 has joined #archiveteam
01:58 primus104 has quit IRC (Leaving.)
02:06 mistym has quit IRC (Remote host closed the connection)
02:19 philpem has quit IRC (Ping timeout: 272 seconds)
02:26 mistym has joined #archiveteam
02:30 Ymgve has quit IRC ()
02:33 ruukasu has quit IRC (Quit: WeeChat 1.0.1)
02:36 ruukasu has joined #archiveteam
04:14 ruukasu has quit IRC (Quit: WeeChat 1.0.1)
04:16 Start is now known as StartAway
04:22 wp494 has quit IRC (Ping timeout: 186 seconds)
04:23 ruukasu has joined #archiveteam
04:27 wp494 has joined #archiveteam
05:03 aaaaaaaaa has quit IRC (Leaving)
06:06 SketchCow Last of the nightmare directories.
06:08 godane hey SketchCow
06:11 SketchCow In THEORY, and I mean IN THEORY, FOS's mrtgs should get better now
06:12 godane everything with computers is in THEORY
06:12 godane like in THEORY the clown should save everything
06:13 godane but it doesn't when it goes bankrupt or gets bought
06:14 SketchCow http://fos.textfiles.com:8088/mrtg/
06:14 SketchCow godane, you're so cute when you're cynical
06:16 godane SketchCow: you may be getting radionz mp3 files
06:16 godane it goes back to 2008
06:18 godane it's funny that you have these talks on these radio channels that end up putting it online
06:18 mistym has quit IRC (Remote host closed the connection)
06:18 godane but there's proof that they're doing the very thing you're fighting against
06:19 SketchCow I can see MRTG's pain is heavily reduced
06:45 mistym has joined #archiveteam
06:48 DFJustin I pity the foo
06:50 Aranje has joined #archiveteam
07:15 mutoso has quit IRC (Read error: Connection reset by peer)
07:20 APerti has joined #archiveteam
07:24 slipstrea has quit IRC (Ping timeout: 480 seconds)
07:24 pfallenop has quit IRC (Ping timeout: 480 seconds)
07:28 pfallenop has joined #archiveteam
07:35 slipstrea has joined #archiveteam
07:40 dxdx is now known as dx
07:44 slipstrea has quit IRC (Ping timeout: 480 seconds)
07:55 slipstrea has joined #archiveteam
08:07 dxdx has joined #archiveteam
08:07 dx has quit IRC (Read error: Operation timed out)
08:19 primus104 has joined #archiveteam
08:37 philpem has joined #archiveteam
08:48 dashcloud has quit IRC (Read error: Operation timed out)
08:51 dashcloud has joined #archiveteam
09:05 APerti has quit IRC (Ping timeout: 378 seconds)
09:10 dx has joined #archiveteam
09:11 slipstrea has quit IRC (Ping timeout: 480 seconds)
09:12 dxdx has quit IRC (Read error: Operation timed out)
09:21 dx has quit IRC (Read error: Operation timed out)
09:26 dx has joined #archiveteam
09:26 slipstrea has joined #archiveteam
09:41 dx has quit IRC (Ping timeout: 272 seconds)
09:41 dx has joined #archiveteam
09:48 dx has quit IRC (Ping timeout: 265 seconds)
09:48 dx has joined #archiveteam
10:09 schbirid has joined #archiveteam
10:11 arkiver SketchCow: viddy is going to start after IA is back from their free days
10:13 arkiver problem is, there are over 3.5 million user id's and they are all random numbers (example: f874d7f4-b3fb-4bdc-b83f-2f944cf290ac)
10:14 arkiver and then we have 1.5 million video id's, which are constructed the same way
10:15 arkiver SketchCow: so I'm afraid this will have to be over 5 million items :/
10:21 schbirid https://twitter.com/ezmobius has passed away
10:43 norbert79 schbirid: According to his entry he was some scientist... How did he die?
10:46 norbert79 Yeah. I see there are no details yet
10:57 mutoso has joined #archiveteam
10:59 godane http://www.justtechnews.net/ezra-zygmuntowicz-has-passed-away/
10:59 godane https://news.ycombinator.com/item?id=8676140&utm_source=dlvr.it&utm_medium=tumblr
11:03 dashcloud has quit IRC (Ping timeout: 335 seconds)
11:04 rejon has joined #archiveteam
11:08 dashcloud has joined #archiveteam
11:27 mistym has quit IRC (Remote host closed the connection)
12:04 Ymgve has joined #archiveteam
12:09 bauruine has quit IRC (Ping timeout: 265 seconds)
12:28 bauruine has joined #archiveteam
12:45 bauruine has quit IRC (Read error: Connection timed out)
12:54 primus104 has quit IRC (Leaving.)
13:03 bauruine has joined #archiveteam
13:28 bauruine has quit IRC (Quit: ZNC - http://znc.in)
13:31 bauruine has joined #archiveteam
13:39 BlueMaxim has quit IRC (Quit: Leaving)
14:27 adrian_ has joined #archiveteam
14:30 adrian_ hey everyone
14:30 adrian_ a while back you guys helped me archive a website I am going to turn off
14:30 adrian_ http://archive.fart.website/archivebot/viewer/job/9tchq
14:32 adrian_ i'm wondering if it's possible to do a partial crawl of things that have changed since then
14:32 adrian_ i want to sunset the site tomorrow
14:34 midas i dont think archivebot can do that atm
14:34 GLaDOS partial crawls, no.
14:34 GLaDOS unless it's a subsection of the site it can crawl entirely
14:37 Smiley such as a subdomain, in case you don't know what GLaDOS means
14:37 Smiley or someone who's good with the archivebot can add ignore masks
14:41 adrian_ i could do it myself through curl, maybe?
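One way to approach a "changed since then" fetch by hand is an HTTP conditional GET, which asks the server to skip pages that have not been modified since a given date. The sketch below only builds such a request; the URL and the crawl date are made up for illustration, and it assumes the site's server honours the If-Modified-Since header, which not every server does. curl offers the same idea through its -z / --time-cond option.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def conditional_request(url: str, last_crawl: datetime) -> Request:
    """Build a GET that asks the server to skip pages unchanged since last_crawl.

    last_crawl must be a timezone-aware datetime.
    """
    # HTTP dates are always expressed in GMT.
    stamp = format_datetime(last_crawl.astimezone(timezone.utc), usegmt=True)
    return Request(url, headers={"If-Modified-Since": stamp})


# Hypothetical URL and crawl date, for illustration only.
req = conditional_request(
    "http://example.com/page",
    datetime(2014, 11, 1, tzinfo=timezone.utc),
)
```

Passing the request to urllib.request.urlopen would then return the page if it changed, or raise an HTTPError with code 304 if the server reports it as unmodified.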
14:49 fluff is now known as fluff_
14:50 fluff_ is now known as fluff
14:55 fluff is now known as fluff_
15:11 primus104 has joined #archiveteam
15:31 T31M has joined #archiveteam
15:32 cf has quit IRC (Quit: cf)
17:19 logchfoo has quit IRC (Ping timeout: 612 seconds)
17:21 logchfoo starts logging #archiveteam at Sun Nov 30 17:21:14 2014
17:21 logchfoo has joined #archiveteam
17:51 signius_ has quit IRC (Ping timeout: 258 seconds)
17:58 Aranje has quit IRC (Read error: Connection reset by peer)
17:58 Aranje has joined #archiveteam
18:03 signius_ has joined #archiveteam
18:19 cf has joined #archiveteam
18:40 StartAway is now known as Start
18:41 APerti has joined #archiveteam
19:06 cf has quit IRC (cf)
19:19 mistym has joined #archiveteam
19:19 cf has joined #archiveteam
19:26 SketchCow arkiver: Not 5 million IA items, right?
20:15 SketchCow Time to write something.
20:39 arkiver SketchCow: I just mean 5 million warc.gz items for FOS, which need to be made into megawarcs
20:42 aaaaaaaaa are the user pages huge? If not it might be better to combine them into packs.
20:43 arkiver aaaaaaaaa: <arkiver>problem is, there are over 3.5 million user id's and they are all random numbers (example: f874d7f4-b3fb-4bdc-b83f-2f944cf290ac)
20:46 mistym has quit IRC (Remote host closed the connection)
20:58 SketchCow We NEED to come up with a way to create the megawarcs that doesn't kill FOS again.
21:08 mistym has joined #archiveteam
21:20 schbirid has quit IRC (Leaving)
21:24 SketchCow http://archiveteam.org/index.php?title=Nightmare_Projects
21:32 BlueMaxim has joined #archiveteam
21:37 godane SketchCow: my funny or die collection is sort of a nightmare project
21:37 godane only cause i have no idea of its final size
21:38 godane but i'm uploading it slowly so IA can get the space needed for it
21:43 SketchCow It is a little silly
21:43 SketchCow But I think it's a useful thing, although if they turn, it could disappear
21:47 SketchCow There's a conspiracy theory among whatevers that you are some agent of Turner Broadcasting trying to false flag archive.org.
21:49 SketchCow Which is hysTERRRRICAL
21:49 SketchCow I remember when a somewhat respected member of the security industry (nowadays) accused me of being an FBI plant
21:51 garyrh Paranoia and imagination make quite a combination.
21:56 xmc SketchCow: seriously? hahaha! what kind of "whatevers" are these?
22:01 garyrh https://archive.org/search.php?query=forumPost%3A1%20AND%20%22funny%20or%20die%22
22:03 garyrh The forums have some... unique individuals.
22:03 arkiver SketchCow: when can I start halo again?
22:03 xmc that i can believe ... >.>
22:04 arkiver And alright if I start viddy on FOS very soon? (size will be something like 4-5 TB probably)
22:05 arkiver ^ and a few million items, as written before.
22:05 toad1 has quit IRC (Leaving.)
22:07 SketchCow Is there ANY way to pre-cook the megawarcs, or is my poor machine in hell again?
22:08 arkiver "pre-cook"
22:08 SketchCow The halo should be as slow as possible. It's filling drives
22:09 arkiver Well, I can try to have multiple items per item, but I'm not sure how long the item name can be
22:09 SketchCow Yeah, pre-cook. In an ideal world, they're turned into megawarcs before going on FOS. If not, keep it slow.
22:09 SketchCow Let's put it this way.
22:09 SketchCow Halo is killing FOS.
22:09 SketchCow KILLING IT.
22:10 arkiver chfoo: is there a limit on the length of the item names?
22:10 toad has joined #archiveteam
22:10 arkiver SketchCow: we'll run halo slow
22:11 arkiver SketchCow: it might be better to ask another AT member if he/she wants to help us out with the rsync?
22:12 xmc hmmmm. halfbaked idea, megawarc'ing in the process that receives incoming warcs
22:13 joepie91_ garyrh: xmc: I generally find the IA forums to be the most... "insane" part of IA
22:15 arkiver midas: would you be able to help us out with the Viddy project with some rsync space and processing power for the megawarcs?
22:16 fluff_ is now known as fluff
22:16 SketchCow I have to go to a movie, then drive north to the compound.
22:16 SketchCow But I'm happy to go into the ridiculous requirements of things.
22:18 SketchCow I mean, right now, SWIPNET and VERIZON are no longer in horrendous directories filled with hundreds of thousands of individual files
22:18 SketchCow Now they're in massive processes of turning 3-4 directories into megawarcs.
22:18 SketchCow All these are taking days.
22:20 xmc what's the resource constraint in the megawarc process?
22:20 xmc io wait?
22:20 xmc cpu time?
22:20 arkiver so can FOS take viddy? Right now I'm hearing that FOS is being killed by halo, is being killed by nightmare directories, and those things. So I'm starting to think it might be better to have this project not on FOS
22:21 arkiver it might be a good idea for a feature kind of warrior thing, that people can volunteer as Megawarc creators.
22:22 SketchCow The main constraint with FOS is that it is doing about 10-20 things.
22:22 arkiver They need to have at least 100 GB, the files are going to those volunteers, they create the megawarcs, the megawarcs are sent to FOS
22:22 Kazzy volunteer megawarcing could be a solution, although not everyone has the capacity to create 50gb warcs to send to fos/ia
22:23 Kazzy different size options would be needed. 10/20/30gb etc
22:23 arkiver I know, so it needs to be some separate project people can select
22:23 arkiver the problem with that is that the files can be deleted by the volunteer
22:23 SketchCow Some of those are very simple (download and later upload DNA Lounge) and some are nightmare (literally, copying ONE USER'S directory of swipnet grabs took this machine three solid days. just cp -v)
22:24 SketchCow OK, so.
22:24 SketchCow In an ideal world, we could probably have items creating 5 or 10gb megawarcs.
22:25 SketchCow Upload those to FOS, and then it can upload into the site.
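The size-capped packing discussed above (5 or 10 GB megawarcs, or Kazzy's 10/20/30 GB options) can be sketched as a simple grouping step that runs before anything reaches FOS. This is only an illustration of the batching idea, not how the megawarc tool itself works; plan_packs and its inputs are hypothetical names, and sizes are plain byte counts.

```python
def plan_packs(files, cap_bytes):
    """Group (name, size) pairs, in order, into packs of at most cap_bytes.

    A single file larger than the cap still gets a pack of its own.
    """
    packs, current, used = [], [], 0
    for name, size in files:
        # Close the current pack if adding this file would exceed the cap.
        if current and used + size > cap_bytes:
            packs.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        packs.append(current)
    return packs
```

Each resulting pack would then be handed to whatever builds the megawarc, so the receiving machine only ever sees a few large files instead of millions of small ones.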
22:25 SketchCow Like, Halo isn't hard, but it's pulling in a lot.
22:26 SketchCow And right now, I'm watching Swipnet go for days, making a megawarc
22:30 SketchCow I just wrote a couple scripts to make things more efficient, and I can shore things up more, but it's still making megawarcs of massive amounts of 1mb items
22:30 chfoo arkiver: no limit on item names i think. if you want to use long item names, you'll need to do something like sha1 digest to keep the filenames from exceeding the limit
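chfoo's sha1 suggestion could look roughly like the sketch below: join the IDs that go into one item, and fall back to a fixed-length digest once the joined name gets too long. The function name, the prefix, and the 80-character threshold are assumptions made for illustration; the real filename limit is not stated in the channel.

```python
import hashlib


def item_name(prefix, ids, max_len=80):
    """Return a readable name if it fits, else prefix plus a sha1 digest."""
    joined = "-".join(ids)
    readable = prefix + "-" + joined
    if len(readable) <= max_len:
        return readable
    # The hex digest is always 40 characters, however many IDs were combined.
    digest = hashlib.sha1(joined.encode("utf-8")).hexdigest()
    return prefix + "-" + digest
```

The digest is deterministic, so the same set of IDs in the same order always maps to the same item name, but the original IDs would need to be recorded elsewhere since they cannot be recovered from the hash.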
22:30 SketchCow That will always kill the machine, and it got to the point I'd see things take hours to do, like a ls
22:32 SketchCow One mistake on my part was that the ia uploader now has a nice --checksum and --delete pair.
22:32 SketchCow So if something is uploaded, it will delete it off the drive, having verified it got through.
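The verify-then-delete behaviour described here can be sketched in a few lines. This is not the ia client's code; it is a minimal illustration that assumes the remote side reports an MD5 checksum for the uploaded file, and delete_if_verified and remote_md5 are hypothetical names.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large WARCs never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def delete_if_verified(path, remote_md5):
    """Remove the local copy only when its checksum matches the remote one."""
    if md5_of(path) != remote_md5.lower():
        return False  # Mismatch: keep the file so the upload can be retried.
    os.remove(path)
    return True
```

Keeping the file on a mismatch is the conservative choice: disk space is only reclaimed once the upload has been confirmed.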
22:32 SketchCow That will help a LITTLE.
22:36 SketchCow Since this mess started, the developer of the ia client has added a bunch of stuff for me to make life easier
22:36 SketchCow Was using it for archivebot, now the uploaders will too.
22:38 SketchCow I see archivebot's up to a terabyte buffer
22:43 arkiver SketchCow: I'm going to do this. Tomorrow in the afternoon I'll start viddy, I will then use FOS's rsync. But in the meantime I'll try to find an rsync other than FOS. If I have another place to send the files to I'll disable the upload to FOS and continue with the new upload target.
22:44 arkiver chfoo: thanks, I'll see what I'm going to do with those item names
22:52 arkiver Viddy project channel: #viddiot
22:58 slash` has quit IRC (Quit: Coyote finally caught me)
23:00 slash` has joined #archiveteam
23:33 Start i've made notes on some sites that are currently shutting down (ziplist, nokia memories, etc.): http://paste.archivingyoursh.it/micevefine.avrasm
23:34 Start ziplist seems to be the easiest to archive, then exfm
23:34 Start brace.io reminds me of jux
23:34 Start and relay.im and nokia memories will be the trickiest for discovery
23:35 mistym has quit IRC (Read error: Connection reset by peer)
23:36 Start here's the correct link: http://paste.archivingyoursh.it/sebugupute.avrasm
23:36 Start first one has an error
23:36 mistym has joined #archiveteam
