[00:09] log-bookmark: why we do it (ignore that, it's for me to grep back to this point in the logs)
[00:10] (for the next time "why would you bother saving that" comes up)
[00:25] *** philpem has quit IRC (Ping timeout: 252 seconds)
[00:42] *** Fusl has quit IRC (Max SendQ exceeded)
[00:43] *** Fusl has joined #archiveteam
[00:46] *** wutno` has quit IRC (Ping timeout: 306 seconds)
[00:58] *** Emcy_ has joined #archiveteam
[01:00] *** lytv has quit IRC (Read error: Operation timed out)
[01:01] *** JesseW has joined #archiveteam
[01:04] *** lytv has joined #archiveteam
[01:05] *** Emcy has quit IRC (Read error: Operation timed out)
[01:12] *** vitzli has joined #archiveteam
[01:13] *** brayden__ has joined #archiveteam
[01:13] *** swebb sets mode: +o brayden__
[01:19] *** brayden_ has quit IRC (Read error: Operation timed out)
[01:36] SketchCow: can you ask him if http://blip.tv/lessig was his channel?
[01:37] when the archives are uploaded to IA I should be able to retrieve a list of his videos and get him the link to the original videos for download
[01:37] Blip also seems to be totally gone now
[01:40] does it not resolve for you either?
[01:41] no, it doesn't resolve for me either
[01:41] *** primus104 has quit IRC (Leaving.)
[01:42] arkiver: I asked
[01:43] No answer yet.
[01:43] thanks!
[01:43] SketchCow: a bit more than 100 videos haven't been saved from blip
[01:43] All the other videos are saved!
[01:44] 100 out of over 7 million isn't a bad miss rate
[01:45] Depending on what they are, they may end up somewhere else anyway
[01:45] e.g. channel awesome stuff or sfdebris
[01:46] Got it.
[02:03] Guys, could you add links from archiveteam.org to digitize and fileformats wikis on a sidebar?
[02:05] fileformats wiki looks like a secret trove and like a proper secret trove is hidden :(
[02:07] 1. Shut up
[02:08] 2. Learn how to phrase suggestions
[02:08] 3. See how well phrased #1 is
[02:08] :(
[02:10] *** BlueMaxim has joined #archiveteam
[02:11] Digitize is still in the oven.
[02:17] I am not 100% happy with my flatbed scanning skills. Definitely need to get better at straightening.
[02:19] Then again, I'm mostly digitizing these manuals as insurance against them not having made it into the big piles, which must be not true.
[02:22] Like, I'm being over-reactive as it is.
[02:44] *** JesseW has quit IRC (Read error: Operation timed out)
[02:44] better safe than sorry
[02:49] *** xk_id has quit IRC (Remote host closed the connection)
[02:53] *** JesseW has joined #archiveteam
[03:08] *** yipdw changes topic to: Archive Team: We're not archive.org | http://archiveteam.org/ | lengthy/off-topic in #archiveteam-bs | < BotoX_> dafuq is WARC | 1. Shut up
[03:09] he'll take his buddhist stick to you for vocalizing
[03:11] *** yipdw changes topic to: Archive Team: We're not archive.org | http://archiveteam.org/ | lengthy/off-topic in #archiveteam-bs | < BotoX_> dafuq is WARC
[03:14] *** SketchCow changes topic to: Archive Team: We're not archive.org | http://archiveteam.org/ | lengthy/off-topic in #archiveteam-bs | 1. Shut up
[03:25] https://archive.org/details/CompactPotentiometers7231072212InstructionManual723102J
[03:31] *** VADemon has quit IRC (left4dead)
[03:44] *** Elegance has joined #archiveteam
[03:45] *** Elegance has quit IRC (Client Quit)
[03:54] *** aaaaaaaaa has quit IRC (Leaving)
[04:41] *** vitzli has quit IRC (Quit: Leaving)
[04:43] * JesseW is reading over the manual SketchCow linked above
[04:43] * JesseW is amused by the note on page 19
[04:44] and apparently repeated on page 21
[04:46] https://archive.org/search.php?query=creator%3A%22James+G.+Biddle+Company%22
[04:46] apparently 5 other works by the same company were uploaded back in 2011.
[04:47] *** Ungstein has joined #archiveteam
[04:58] *** vitzli has joined #archiveteam
[05:00] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[05:01] *** dashcloud has joined #archiveteam
[05:11] *** bsmith096 has joined #archiveteam
[05:13] im just finishing up the last of my ffnet grab. it will be done by noon-ish est. does anyone have any suggestions for compressing these files on the fly? i dont have the room. almost 500GB, in roughly 9 million files
[05:15] if i just send them rsync as-is, it will thrash fos's disks, so im trying to minimize the inconvenience
[05:16] swebb SketchCow Coderjoe DFJustin closure ping
[05:18] just loose html files?
[05:22] DFJustin: text files of the stories
[05:22] it took me about 2 years to grab almost all of them, up to id # 11 million
[05:26] how much room do you have
[05:27] about 30gb
[05:28] DFJustin: the files are 400gb+
[05:29] so you need something that can put together a chunk of say 10gb at a time and keep track of where it left off
[05:30] don't know a tool that can do that offhand other than writing a custom script but there might be something
[05:30] ive heard i can also compress on the fly with pipes, but the syntax is weird
[05:31] ssh cat and gzip
[05:32] yeah you can probably rig together something like that but the odds of getting all the way through 9 million files without some kind of error in the middle seems low
[05:38] if you compress by last integer, 0-9, the 10 archives will be predictably consistent in size. move each off disk before compressing the next. you can recombine them later
[05:40] bzip2 is great for text compression btw
[05:41] ppdm via 7zip is also a good choice
[05:41] ppmd*
[05:43] *** Ungstein has quit IRC (Quit: Leaving.)
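The [05:38] suggestion (split the files by the last digit of their ID so that the ten archives come out roughly equal in size) could be sketched like this. It is only an illustration: the function name `bucket_by_last_digit` and the assumption that each story is stored as `<numeric id>.<ext>` are hypothetical and are not stated anywhere in the log.

```python
from collections import defaultdict
from pathlib import Path


def bucket_by_last_digit(paths):
    """Group paths by the last digit of their numeric file stem.

    Assumes (hypothetically) names like "1234567.txt". Anything whose
    stem is not purely ASCII digits goes into an "other" bucket so that
    no file is silently dropped.
    """
    buckets = defaultdict(list)
    for p in paths:
        stem = Path(p).stem
        if stem.isascii() and stem.isdigit():
            buckets[stem[-1]].append(p)
        else:
            buckets["other"].append(p)
    return dict(buckets)


if __name__ == "__main__":
    sample = ["10.txt", "21.txt", "31.txt", "notes.txt"]
    for key, members in sorted(bucket_by_last_digit(sample).items()):
        print(key, members)
```

Each bucket can then be archived and moved off the disk before the next one is started, which is the point of the suggestion: only one archive has to fit in the free space at a time.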
[05:44] the problem is they're very small individually, but there's a crapload of them, hence compression, but i dont really have the room to store both the compressed archives and the fully expanded files
[05:46] *** Ungstein has joined #archiveteam
[05:46] gnu tar has a --remove-files option
[05:47] dangerous, in that he has the only copy
[05:50] SketchCow is just gonna compress them anyway, i thought i'd save time, bandwidth, and my disk r/w heads and compress in transit
[05:57] *** Ungstein has quit IRC (Read error: Connection reset by peer)
[05:57] what id like to do is have gzip (or whatever) make the folder into multi gb chunks, copied to fos as they're made then delete the chunks
[05:58] i could rsync the gzip chunks
[05:58] but i fully admit i have no idea how to do that
[05:59] I think a lot of the folks here are asleep at this hour so you may have better luck in 8 hours or so
[06:00] *** Ungstein has joined #archiveteam
[06:01] k, the grab will probably be done by then. 2000 files to go!
[06:01] (ffnet throttles, or i would have been done a year and a half age)
[06:01] *ago
[06:12] `scp -C` or `rsync -z` will compress during transfer (though it will be uncompressed on the other end)
[06:17] just make tar chunks in your free space, rsync them, and delete them
[06:18] *** JesseW has quit IRC (Read error: Operation timed out)
[06:30] *** RichardG has quit IRC (Remote host closed the connection)
[06:46] *** Ungstein has quit IRC (Quit: Leaving.)
[06:57] *** Ungstein has joined #archiveteam
[07:04] *** primus104 has joined #archiveteam
[07:33] *** primus104 has quit IRC (Leaving.)
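The [06:17] advice (make tar chunks in the free space, send them, delete them) might look roughly like the sketch below. Everything named here is an assumption for illustration: `plan_chunks`, `ship_in_chunks`, and the `ship` callback (which in practice could wrap an rsync call) are hypothetical helpers, and the size budget is applied to the uncompressed input, so the compressed archive for text should come out smaller than the budget. Only the temporary archive is deleted; the source files are left alone, since they are the only copy.

```python
import os
import tarfile


def plan_chunks(files, max_bytes):
    """Yield lists of paths whose combined size does not exceed max_bytes.

    A single file larger than max_bytes still gets a chunk of its own.
    """
    chunk, total = [], 0
    for path in files:
        size = os.path.getsize(path)
        if chunk and total + size > max_bytes:
            yield chunk
            chunk, total = [], 0
        chunk.append(path)
        total += size
    if chunk:
        yield chunk


def ship_in_chunks(files, workdir, max_bytes, ship):
    """Build one tar.gz per chunk, pass it to ship(), then delete it."""
    shipped = []
    for index, chunk in enumerate(plan_chunks(files, max_bytes)):
        archive = os.path.join(workdir, f"chunk-{index:05d}.tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            for path in chunk:
                tar.add(path)
        ship(archive)        # hypothetical callback, e.g. a wrapper around rsync
        os.remove(archive)   # free the space; the source files are not touched
        shipped.append(os.path.basename(archive))
    return shipped
```

Because the chunk names are numbered, the list returned by `ship_in_chunks` also serves as a record of how far the transfer got if it has to be restarted.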
[07:35] *** philpem has joined #archiveteam
[07:36] *** Ravenloft has quit IRC (Read error: Connection reset by peer)
[08:02] *** primus104 has joined #archiveteam
[08:05] *** OregonRos has joined #archiveteam
[08:06] Is there any place on the Internet where I could possibly find a cached view of my old Myspace or Facebook profile
[08:07] *** OregonRos has quit IRC (Client Quit)
[08:24] *** vitzli has quit IRC (Quit: Leaving)
[09:50] *** nmnn_ has joined #archiveteam
[09:51] *** habi has joined #archiveteam
[09:55] *** philpem has quit IRC (Ping timeout: 252 seconds)
[10:00] *** habi has quit IRC (Quit: Leaving.)
[10:12] *** vitzli has joined #archiveteam
[10:17] *** nmnn_ has quit IRC (Ping timeout: 483 seconds)
[10:17] *** bsmith096 has quit IRC (Ping timeout: 240 seconds)
[10:26] *** Dennisjr1 has quit IRC (Ping timeout: 240 seconds)
[10:34] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[11:13] *** wutno has joined #archiveteam
[11:59] *** philpem has joined #archiveteam
[12:02] *** primus104 has quit IRC (Leaving.)
[12:54] *** primus104 has joined #archiveteam
[13:04] *** vitzli has quit IRC (Quit: Leaving)
[13:05] *** zenguy_pc has quit IRC (Read error: Connection reset by peer)
[13:06] *** zenguy_pc has joined #archiveteam
[13:10] *** vitzli has joined #archiveteam
[13:28] *** garyrh has quit IRC (Remote host closed the connection)
[14:00] *** VADemon has joined #archiveteam
[14:04] *** nmnn_ has joined #archiveteam
[14:14] *** garyrh has joined #archiveteam
[14:17] Hi what
[14:17] *** nmnn_ has quit IRC (Ping timeout: 483 seconds)
[14:18] What a lot of technical discussion for on the fly compression
[14:21] *** nmnn_ has joined #archiveteam
[14:24] Regarding blip.tv request for lessig, yes, the username is lessig.
[14:24] So yes, to arkiver
[14:29] *** nmnn_ has quit IRC (Ping timeout: 483 seconds)
[14:37] *** nmnn_ has joined #archiveteam
[14:38] *** nertzy has joined #archiveteam
[14:47] *** nmnn_ has quit IRC (Ping timeout: 483 seconds)
[15:17] *** PurpleSym has joined #archiveteam
[15:19] SketchCow: They should give some money to the IA ;) . And us some beers :D
[15:27] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[15:28] Uh, ok.
[15:29] *** nmnn_ has joined #archiveteam
[15:31] *** primus104 has quit IRC (Leaving.)
[15:42] OVH is raising prices... check your mails if you have one ;)
[15:54] well theres an amusing article on why archiving matters...
[15:54] http://www.gamasutra.com/view/feature/167392/sad_but_true_we_cant_prove_when_.php
[15:56] *** vitzli has quit IRC (Quit: Leaving)
[15:58] *** JesseW has joined #archiveteam
[16:05] *** Ungstein has quit IRC (Quit: Leaving.)
[16:13] *** primus104 has joined #archiveteam
[16:15] *** hive-mind has quit IRC (Remote host closed the connection)
[16:15] *** primus105 has joined #archiveteam
[16:22] *** primus104 has quit IRC (Read error: Operation timed out)
[16:35] *** JesseW has quit IRC (Read error: Operation timed out)
[16:44] *** xk_id has joined #archiveteam
[16:54] *** primus105 has quit IRC (Leaving.)
[17:03] *** schbirid has joined #archiveteam
[17:26] hmmm. we have gamespy archived, yes?
[17:26] SilSte: er. wat?
[17:27] partially at least, its hard to tell
[17:27] schbirid: well, it's still online, so... :P
[17:27] parts of it are D:
[17:28] ah
[17:28] oh i forgot to get that vm back online. gamespy-archives.quaddicted.com usually has the old forums and the planet sites
[17:29] SilSte: not seeing any changes for kimsufi anywy
[17:29] anyway *
[17:39] *** RichardG has joined #archiveteam
[17:48] anyone in that grsecurity stuff? Shall it be saved?
[17:58] *** xk_id has quit IRC (Remote host closed the connection)
[18:09] *** bsmith096 has joined #archiveteam
[18:15] SilSte: What specifically are you referring to? The announce.php?
[18:15] That's saved. Or all stable releases? or wut
[18:15] i would say the stable releases
[18:16] everything what is going to be hidden
[18:18] *** primus104 has joined #archiveteam
[18:18] woot. another amazon pilot episode found on a random vkontakte page
[18:23] (http://piratepad.net/e8FaqvIGRx)
[18:24] Is Amazon removing those?
[18:24] amazon only has rights to distribute them for a couple months
[18:25] Ah
[18:26] and yeah, turns out nobody makes copies of free stuff on amazon
[18:26] specially kids shows
[18:29] *** xk_id has joined #archiveteam
[18:31] so a bunch are lost
[18:31] SketchCow: im done, or at least ready to send you all the files, how do i not thrash the discs this time?
[18:37] *** aaaaaaaaa has joined #archiveteam
[18:37] *** swebb sets mode: +o aaaaaaaaa
[18:39] SketchCow: this is my current idea, tar -zc "/home/ben/Desktop/Ben's Stuff/Fanfiction" | pv | ssh wacko@fos.textfiles.com "cat > ~/bsmith/Fanfiction.tar.gz"
[18:46] that's probably ok
[18:47] i'd suggest instead making a list of directories and then giving it to tar, so if it breaks off you can trim the list up to a bit before that point and continue
[19:01] *** schbirid has quit IRC (Leaving)
[19:06] *** khaoohs has quit IRC (Read error: Connection reset by peer)
[19:20] *** xk_id has quit IRC (Remote host closed the connection)
[19:21] bsmith096: the tar "file" can also be a remote one
[19:21] at least in some OS
[19:23] *** khaoohs has joined #archiveteam
[19:24] Nemo_bis: DFJustin i'm going with my current idea, i passed 500G to pv as the expected size, b/c thats how big the drive is, and im using most of it. well over 470GB
[19:27] ETA 230 hrs , but thats an overestimate, the compression is probably adding to that, also im only doing this so i'll write one huge file instead of 10 million tiny ones, also, this doesnt leave temp files, right? b/c i seriously dont have the room
[19:32] The command you posted should not make any temp files on your local machine
[19:33] MrRadar: great!, its still running, it will probably take a few days
[19:34] *** scyther has joined #archiveteam
[20:03] *** lytv has quit IRC (Quit: Leaving)
[20:04] *** db48x` has joined #archiveteam
[20:10] *** aliz has quit IRC (Ping timeout: 252 seconds)
[20:16] *** xk_id has joined #archiveteam
[20:17] *** aliz has joined #archiveteam
[20:27] *** lytv has joined #archiveteam
[20:33] *** scyther has quit IRC (Leaving)
[20:37] *** aliz has quit IRC (Ping timeout: 252 seconds)
[20:44] *** aliz has joined #archiveteam
[20:52] *** PurpleSym has quit IRC (Remote host closed the connection)
[21:02] *** nmnn_ has quit IRC (Quit: Ex-Chat)
[21:27] *** aaaaaaaa_ has joined #archiveteam
[21:27] *** aaaaaaaaa has quit IRC (Read error: Connection reset by peer)
[21:27] *** swebb sets mode: +o aaaaaaaa_
[21:37] *** aaaaaaaa_ is now known as aaaaaaaaa
[21:41] *** goekesmi has quit IRC (Remote host closed the connection)
[22:08] *** RedType has quit IRC (Quit: leaving)
[22:08] *** RedType has joined #archiveteam
[22:19] *** goekesmi has joined #archiveteam
[22:24] *** db48x` has quit IRC (Read error: Operation timed out)
[22:27] *** bsmith096 has quit IRC (Ping timeout: 240 seconds)
[23:36] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[23:36] *** dashcloud has joined #archiveteam
[23:46] *** JesseW has joined #archiveteam
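Two points from the afternoon discussion can be illustrated together: the [18:47] idea of feeding tar an explicit list so a broken transfer can be resumed from a little before the failure point, and the [19:32] remark that a streamed archive leaves no temporary files locally. The sketch below is an assumption-laden illustration, not what anyone in the channel ran: `remaining_entries` and `stream_tar` are hypothetical names, and the `overlap` default of 2 is arbitrary.

```python
import tarfile


def remaining_entries(manifest, last_seen, overlap=2):
    """Return the part of manifest still to send.

    Restarts `overlap` entries before last_seen, in the spirit of
    "trim the list up to a bit before that point". If last_seen is
    None or not in the manifest, the whole manifest is returned.
    """
    if last_seen is None or last_seen not in manifest:
        return list(manifest)
    start = max(manifest.index(last_seen) - overlap, 0)
    return list(manifest[start:])


def stream_tar(entries, fileobj):
    """Write a gzip-compressed tar of entries straight to fileobj.

    The "w|gz" stream mode never seeks, so fileobj can be a pipe
    (for example sys.stdout.buffer feeding ssh) and nothing is
    staged on the local disk.
    """
    with tarfile.open(fileobj=fileobj, mode="w|gz") as tar:
        for path in entries:
            tar.add(path)  # directories are added recursively
```

A script built on this could write to `sys.stdout.buffer` and be piped into the same `ssh ... "cat > ..."` receiver shown at [18:39], once per batch of the manifest, with each batch landing in its own remote file.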