[00:17] *** RichardG has quit IRC (Ping timeout: 260 seconds)
[00:20] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[00:44] *** hawc145 has joined #archiveteam-bs
[00:45] *** HCross has quit IRC (Ping timeout: 246 seconds)
[01:10] *** JesseW has quit IRC (Quit: Leaving.)
[01:59] *** BnA-Rob1n has quit IRC (Ping timeout: 260 seconds)
[02:00] *** BnA-Rob1n has joined #archiveteam-bs
[02:32] *** Start has joined #archiveteam-bs
[03:39] *** RichardG has joined #archiveteam-bs
[03:55] *** JesseW has joined #archiveteam-bs
[04:05] *** bwn has quit IRC (Ping timeout: 492 seconds)
[04:25] *** bwn has joined #archiveteam-bs
[04:56] *** Coderjoe has quit IRC (Read error: Operation timed out)
[05:09] *** Coderjoe has joined #archiveteam-bs
[05:58] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[06:03] *** Sk1d has joined #archiveteam-bs
[06:18] *** DFJustin has quit IRC (Remote host closed the connection)
[06:29] *** DFJustin has joined #archiveteam-bs
[06:29] *** swebb sets mode: +o DFJustin
[06:56] *** wp494 has quit IRC (Read error: Connection reset by peer)
[07:05] *** wp494 has joined #archiveteam-bs
[07:08] *** JesseW has quit IRC (Quit: Leaving.)
[07:09] *** bwn has quit IRC (Read error: Operation timed out)
[07:51] *** ersi has quit IRC (Ping timeout: 258 seconds)
[07:53] *** wp494 has quit IRC (Read error: Connection reset by peer)
[07:58] *** wp494 has joined #archiveteam-bs
[08:11] *** bwn has joined #archiveteam-bs
[08:28] *** ersi has joined #archiveteam-bs
[08:28] *** swebb sets mode: +o ersi
[08:34] *** koon has joined #archiveteam-bs
[10:45] *** schbirid has joined #archiveteam-bs
[11:10] *** metalcamp has joined #archiveteam-bs
[12:08] *** hawc145 is now known as HCross
[13:35] *** vitzli has joined #archiveteam-bs
[15:45] *** Apathy has quit IRC (Quit: OOOOoooooooooo................)
[16:35] *** JesseW has joined #archiveteam-bs
[17:02] *** JesseW has quit IRC (Quit: Leaving.)
[17:14] *** MrRadar_ has joined #archiveteam-bs
[17:18] *** MrRadar has quit IRC (Ping timeout: 370 seconds)
[17:18] *** MrRadar_ is now known as MrRadar
[17:47] *** SN4T14_ has joined #archiveteam-bs
[17:48] *** SN4T14 has quit IRC (Read error: Operation timed out)
[18:15] *** JesseW has joined #archiveteam-bs
[18:32] *** bwn has quit IRC (Ping timeout: 246 seconds)
[18:32] *** SN4T14_ has quit IRC (Remote host closed the connection)
[18:34] *** JSharp___ has quit IRC (Ping timeout: 260 seconds)
[18:35] JesseW: how goes the repacking?
[18:38] if at all possible could you please include an inventory file, with the contents of each zip or whatever?
[18:41] *** HCross2 has quit IRC (Ping timeout: 260 seconds)
[18:47] *** schbirid has quit IRC (Quit: Leaving)
[18:48] bsmith093: what are your thoughts about dividing it up? Did you see my suggestions above?
[18:50] *** HCross2 has joined #archiveteam-bs
[18:51] if you're going to go with multiple files, is there a method that will produce standalone chunks? also I cannot stress this enough, please please make a list of whats where, people have been asking me for things, and i hate having to keep this huge monolithic tar file around.
[18:51] JesseW: i like your plan, multi chunk the biggest things, then just archive each category.
[18:52] I'll certainly make an index, of course.
[18:52] as you've noticed, the size drops off sharply. there's just so much of it!
[18:53] I wasn't actually thinking of making separate zip files for each category, but rather separate zip files for each *initial letter* (except for the giant top 3)
[18:53] that works too, i think, what would the sizes be like?
[18:54] *** BnA-Rob1n has quit IRC (Ping timeout: 260 seconds)
[18:54] *** Ctrl-S___ has quit IRC (Ping timeout: 260 seconds)
[18:54] *** vitzli has quit IRC (Leaving)
[18:54] *** johtso has quit IRC (Ping timeout: 260 seconds)
[18:54] well, the top 3 are 35, 18 and 16GB respectively.
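The inventory requested above could be produced with something like the following. This is only a sketch: `write_inventory` and its arguments are hypothetical names, not anything used in the channel, and it lists a directory tree before it is zipped rather than reading an existing archive. It assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: write_inventory DIR OUTFILE
# Lists every regular file under DIR (paths relative to DIR), sorted,
# one per line, so the list can be shipped next to the archive built from DIR.
write_inventory() {
    dir=$1
    out=$2
    # Run find from inside DIR so the paths match what the archive would contain.
    # The redirection happens outside the subshell, so a relative OUTFILE is
    # resolved against the caller's working directory, not DIR.
    ( cd "$dir" && find . -type f | LC_ALL=C sort | sed 's|^\./||' ) > "$out"
}
```

Called as `write_inventory Fanfiction/A A.txt` (illustrative paths), it would leave one line per file in `A.txt`. OUTFILE should live outside DIR, otherwise the list may include itself.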
[18:54] *** BnA-Rob1n has joined #archiveteam-bs
[18:55] I need to write up the script to calculate the other sizes.
[18:55] *** deathy has quit IRC (Ping timeout: 260 seconds)
[18:55] *** TheKiwi has joined #archiveteam-bs
[18:55] *** HCross2 has quit IRC (Ping timeout: 260 seconds)
[18:55] *** Boltsie has quit IRC (Ping timeout: 260 seconds)
[18:55] *** _desu___ has quit IRC (Ping timeout: 260 seconds)
[18:57] i found this online. $ mkdir -p output/{A..Z}; for i in tstdir/*; do export FILE=$(basename "$i"); LTR=$(echo "${FILE:0:1}" | tr '[a-z]' '[A-Z]'); mv "$i" "output/$LTR/$FILE" ; done
[18:57] just move the 3 biggest out first
[18:57] yeah, that should probably work
[18:57] I need to fix the Fanfiction/Fanfiction bit too
[18:57] can't code for crap, but i can tweak.
[18:57] just make that its own blob.
[18:58] *** HCross2 has joined #archiveteam-bs
[18:58] how big is that extra folder?
[18:58] *** bwn has joined #archiveteam-bs
[19:00] well, what I was planning to do was copy the 19 files with different versions over to the main one (with the older versions given a special extension, .bak or something), then delete the whole Fanfiction/Fanfiction hierarchy.
[19:00] how many dupes are there, is the bak thing really needed?
[19:00] also here http://unix.stackexchange.com/questions/111067/bash-script-to-sort-files-into-alphabetical-folders-on-readynas-duo-v1
[19:00] The Fanfiction/Fanfiction directory is 2.4GB
[19:00] where i got the one-liner
[19:01] *** SN4T14 has joined #archiveteam-bs
[19:01] there are 19 files that differ between the older and the newer versions (all In-Progress ones that got re-written)
[19:01] I think it's worth keeping them.
[19:01] ok, thats fair. i like seeing old drafts of things :)
[19:01] *** johtso has joined #archiveteam-bs
[19:01] *** _desu___ has joined #archiveteam-bs
[19:01] *** JSharp___ has joined #archiveteam-bs
[19:02] *** Ctrl-S___ has joined #archiveteam-bs
[19:02] *** deathy has joined #archiveteam-bs
[19:02] I think this will work to move them: rsync --checksum -i -r -b --suffix=.bak Fanfiction/Fanfiction/ Fanfiction/
[19:02] *** Boltsie has joined #archiveteam-bs
[19:06] *** JSharp___ has quit IRC (Ping timeout: 260 seconds)
[19:06] *** JSharp___ has joined #archiveteam-bs
[19:07] *** Boltsie has quit IRC (Ping timeout: 260 seconds)
[19:08] OK, running the rsync
[19:08] *** Boltsie has joined #archiveteam-bs
[19:09] *** deathy has quit IRC (Ping timeout: 260 seconds)
[19:11] *** TheKiwi has quit IRC (Ping timeout: 260 seconds)
[19:12] *** deathy has joined #archiveteam-bs
[19:12] *** HCross2 has quit IRC (Ping timeout: 260 seconds)
[19:13] *** JSharp___ has quit IRC (Ping timeout: 260 seconds)
[19:14] *** JSharp___ has joined #archiveteam-bs
[19:16] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[19:18] *** Ctrl-S___ has quit IRC (Ping timeout: 260 seconds)
[19:20] *** _desu___ has quit IRC (Ping timeout: 260 seconds)
[19:21] *** johtso has quit IRC (Ping timeout: 260 seconds)
[19:22] *** _desu___ has joined #archiveteam-bs
[19:24] *** Boltsie has quit IRC (Read error: Connection timed out)
[19:24] *** JSharp___ has quit IRC (Ping timeout: 260 seconds)
[19:24] *** johtso has joined #archiveteam-bs
[19:26] *** TheKiwi has joined #archiveteam-bs
[19:27] *** TheKiwi has quit IRC (Connection closed)
[19:27] *** deathy has quit IRC (Connection closed)
[19:28] *** TheKiwi has joined #archiveteam-bs
[19:29] *** johtso has quit IRC (Read error: Connection timed out)
[19:30] *** deathy has joined #archiveteam-bs
[19:31] *** HCross2 has joined #archiveteam-bs
[19:31] *** JSharp___ has joined #archiveteam-bs
[19:31] *** Boltsie has joined #archiveteam-bs
[19:32] *** Ctrl-S___ has joined #archiveteam-bs
[19:33] *** TheKiwi has quit IRC (Ping timeout: 260 seconds)
[19:33] OK, finished rsync
[19:36] *** HCross2 has quit IRC (Ping timeout: 260 seconds)
[19:37] *** HCross2 has joined #archiveteam-bs
[19:38] *** TheKiwi has joined #archiveteam-bs
[19:41] *** Ctrl-S___ has quit IRC (Ping timeout: 260 seconds)
[19:41] *** Boltsie has quit IRC (Ping timeout: 260 seconds)
[19:42] *** HCross2 has quit IRC (Ping timeout: 260 seconds)
[19:42] *** deathy has quit IRC (Ping timeout: 260 seconds)
[19:43] *** TheKiwi has quit IRC (Ping timeout: 260 seconds)
[19:43] *** Boltsie has joined #archiveteam-bs
[19:43] *** Ctrl-S___ has joined #archiveteam-bs
[19:44] *** deathy has joined #archiveteam-bs
[19:44] *** TheKiwi has joined #archiveteam-bs
[19:44] *** HCross2 has joined #archiveteam-bs
[19:45] *** johtso has joined #archiveteam-bs
[20:17] *** metalcamp has joined #archiveteam-bs
[20:18] ~derpnet~
[20:19] lol
[20:43] *** TheKiwi has quit IRC (Ping timeout: 260 seconds)
[20:43] *** deathy has quit IRC (Ping timeout: 260 seconds)
[20:44] *** deathy has joined #archiveteam-bs
[20:44] *** TheKiwi has joined #archiveteam-bs
[21:09] *** JetBalsa has joined #archiveteam-bs
[21:38] *** JetBalsa is now known as JRWR
[21:38] *** JRWR has quit IRC (Connection closed)
[21:38] *** JRWR has joined #archiveteam-bs
[21:48] *** HCross2 has quit IRC (Ping timeout: 260 seconds)
[21:48] *** HCross2 has joined #archiveteam-bs
[22:41] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[22:46] JesseW: updates?
[23:04] bsmith093: got them merged, and looked at the various initial letters
[23:04] There are a *few* lowercase letters, and punctuation -- I think I'll handle that by case-insensitivity and using the first alphabetical value
[23:04] Also, there are a bunch that start with a digit -- I think I'll combine all those.
[23:05] BTW, thanks for continuing to check on this.
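One possible reading of the bucketing plan just described (ignore case, skip leading punctuation, lump digits together) is sketched below. `bucket_for` and the bucket names `0-9` and `misc` are hypothetical, chosen only for illustration, and the handling of non-ASCII leading characters is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helper: print the bucket name for one top-level folder name.
bucket_for() {
    # Drop leading characters that are not ASCII letters or digits, keep the
    # first remaining character, and fold it to upper case so that "harry"
    # and "Harry" land in the same bucket. LC_ALL=C makes sed work bytewise,
    # so leading non-ASCII bytes are skipped as well.
    first=$(printf '%s' "$1" | LC_ALL=C sed 's/^[^[:alnum:]]*//' | cut -c1 | tr '[:lower:]' '[:upper:]')
    case $first in
        [A-Z]) printf '%s\n' "$first" ;;   # one bucket per letter
        [0-9]) printf '%s\n' "0-9" ;;      # all digit-initial names combined
        *)     printf '%s\n' "misc" ;;     # nothing usable left in the name
    esac
}
```

For example, `bucket_for "_hack"` would print `H` and `bucket_for "1984"` would print `0-9`. The result could serve as the destination directory or zip name when sorting folders.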
[23:07] The digit ones come to 641MB
[23:09] fanficfare auto converted all unsafe chars to underscores, thats why there's so many folders that look weird
[23:11] Ah, that makes sense
[23:12] Actually, I think I'll put all the ones that don't start with capital letters in a misc.zip file at the end.
[23:14] generating sizes now
[23:15] inside the files, the names are preserved, in all their utf8 glory.
[23:16] to clarify, are you preserving the folder structure?
[23:16] and just re shuffling it into less folders?
[23:17] *** BlueMaxim has joined #archiveteam-bs
[23:17] "Harry Potter/Completed/Harry Potter - author - title.txt" or just "H/Harry Potter - author - title.txt"
[23:18] I was planning to preserve the existing folder structure
[23:18] ok, great
[23:18] thanks!
[23:18] hm, A comes to 11G
[23:20] the FF.net archive? that damn thing's huge
[23:21] also both those setups are quite confusing when it comes to crossovers
[23:21] BlueMaxim: i know, i scraped it.
[23:21] BlueMaxim: not really, a crossover is just stored in the category folder for it
[23:22] eg Harry potter x men crossover would be Harry Potter_X Men/Completed/etc
[23:31] So the sizes so far are 11GB, 15GB, 12GB, 17GB for A-D -- then 3GB for E.
[23:31] This is including the three big ones, but I'll just exclude them after the count
[23:50] yeah bsmith093 but it seemed quite random to me how they were sorted by the franchises inside them
[23:50] I may have missed something though
[23:51] *** BlueMaxim is now known as BlueMax
[23:52] So the sizes range from 31GB (for S)
[23:55] and Harry Potter is 35G
[23:55] Q, U and Z are the only ones less than a GB
[23:56] assuming that a 35G (uncompressed) zip file is OK, I think this plan should work fine.
[23:57] What are you trying to do JesseW
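The per-letter size count discussed above could be approximated as follows. This is not the script that was actually run: `sizes_by_initial` is a hypothetical name, and the sketch assumes the directory is non-empty, that entry names contain no newlines, and that hidden entries can be ignored. Sizes are reported in KiB as given by `du -sk`.

```shell
#!/bin/sh
# Hypothetical helper: sizes_by_initial DIR
# Prints "<initial> <KiB>" for each initial character of the top-level
# entries in DIR, folding lower case into upper case.
sizes_by_initial() {
    # "./*" keeps names that begin with "-" from being parsed as du options.
    ( cd "$1" && du -sk ./* ) |
    awk '
        {
            size = $1
            # Strip the size column and the leading "./" to get the bare name.
            sub(/^[0-9]+[ \t]+\.\//, "")
            key = toupper(substr($0, 1, 1))
            total[key] += size
        }
        END { for (k in total) print k, total[k] }
    ' | LC_ALL=C sort
}
```

Running `sizes_by_initial Fanfiction` (illustrative path) would give one line per initial. Any oversized categories could be moved out first so they do not inflate their letter's total.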