Time |
Nickname |
Message |
00:03
🔗
|
|
sirdancea has quit IRC (Read error: Operation timed out) |
00:18
🔗
|
SketchCow |
Verified - this f/win 6 |
00:32
🔗
|
|
primus104 has quit IRC (Leaving.) |
00:39
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
00:52
🔗
|
|
Nystrom has quit IRC (Ping timeout: 492 seconds) |
00:54
🔗
|
|
mistym has joined #archiveteam |
00:57
🔗
|
|
Nystrom has joined #archiveteam |
01:28
🔗
|
|
Nystrom has quit IRC (Ping timeout: 492 seconds) |
01:30
🔗
|
|
Nystrom has joined #archiveteam |
01:32
🔗
|
|
Start has joined #archiveteam |
01:48
🔗
|
|
Ymgve has quit IRC () |
02:00
🔗
|
dashcloud |
that seems like a sentence fragment- was it meant for this channel? |
02:09
🔗
|
xmc |
i'm going to guess no |
02:16
🔗
|
|
Sk1d has quit IRC (Ping timeout: 265 seconds) |
02:19
🔗
|
|
Sk1d has joined #archiveteam |
02:22
🔗
|
|
guest9000 has joined #archiveteam |
02:28
🔗
|
guest9000 |
yipdw closure SketchCow balrog dcmorton is there a "rare stuff from the 90s that nobody can find a torrents link for" section yet, cause i have a ton of shows from that era. bill and ted, back to the future (both animated) 160-something eps of Tom and Jerry, jumanji *animated* etc, anywhere i can rsync it to? |
02:32
🔗
|
|
lexicon has joined #archiveteam |
02:34
🔗
|
|
lexicon has left WeeChat 1.1.1 |
02:35
🔗
|
|
bar_noone has joined #archiveteam |
02:36
🔗
|
|
bar_noone is now known as lexicon |
02:48
🔗
|
|
nertzy has joined #archiveteam |
03:16
🔗
|
|
nertzy has quit IRC (This computer has gone to sleep) |
03:17
🔗
|
SketchCow |
Yes, but bear in mind it could be restricted in access very quickly. |
03:29
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
03:57
🔗
|
|
mistym has joined #archiveteam |
04:06
🔗
|
|
tsp_ has joined #archiveteam |
04:08
🔗
|
* |
tsp_ heard something about massive fanfiction archive. If it's available somewhere, I'm interested in it |
04:08
🔗
|
tsp_ |
Then there's this, someone's downloading fimfiction as epubs: https://www.fimfiction.net/user/Fimfarchive |
04:09
🔗
|
guest9000 |
tsp that might be me i have 450+gb of it |
04:10
🔗
|
tsp_ |
From where? |
04:10
🔗
|
guest9000 |
tsp_: fanfiction.net |
04:10
🔗
|
tsp_ |
Oh, I've got lots of that. Not as much as you do, though |
04:11
🔗
|
guest9000 |
tsp_: want it? |
04:11
🔗
|
tsp_ |
I'm not sure I have the space for it. How big is it compressed? |
04:11
🔗
|
guest9000 |
no idea im actually still grabbing |
04:11
🔗
|
tsp_ |
I'll wait a bit, it'll probably end up on archive.org like the last one |
04:13
🔗
|
tsp_ |
I've got a bunch of .db files of the larger stories, been doing this for years on a small scale. But only story chapters; I can give you story ids you don't have |
04:13
🔗
|
guest9000 |
speaking of, im the guy who uploaded basically all of a03, and it turns out the ultra compressed archive file i had was corrupted when i uploaded it, anyone want to try fixing that? unfortunately i no longer have the original data |
04:13
🔗
|
tsp_ |
where's that? |
04:14
🔗
|
tsp_ |
and what's corrupted about it? I don't think that's too fixable |
04:14
🔗
|
tsp_ |
AO3's epubs are broken, they say that in their known issues page but you have to dig for it. |
04:15
🔗
|
guest9000 |
tsp here https://archive.org/details/Ao3ArchiveCrawl |
04:16
🔗
|
guest9000 |
and here the 7zip recovery page, that may as well be in chinese for all the good it does me http://www.7-zip.org/recover.html |
04:16
🔗
|
tsp_ |
Not my preferred settings, I'd at least prefer an html like format and more raw pages, but I take what I can get. Let's see |
04:16
🔗
|
guest9000 |
tsp_: txt files |
04:16
🔗
|
tsp_ |
18gb... of txt files |
04:17
🔗
|
guest9000 |
sorted by category/category - author - title.txt |
04:17
🔗
|
tsp_ |
How long did that take? The downloader sleeps for 1s or so between chapters |
04:17
🔗
|
tsp_ |
I hope you call it in parallel |
04:17
🔗
|
tsp_ |
cause that thing is ultra slow |
04:17
🔗
|
guest9000 |
tsp_: a few weeks i think, back when i grabbed afap |
04:18
🔗
|
guest9000 |
2 at once |
04:18
🔗
|
tsp_ |
what's afap? |
04:18
🔗
|
guest9000 |
i have the same downloader going on fanfiction.net right now |
04:18
🔗
|
guest9000 |
as fast as possible |
04:18
🔗
|
tsp_ |
The sleep settings there are even longer |
04:18
🔗
|
tsp_ |
You might be able to tweak them though |
04:18
🔗
|
guest9000 |
its been heavily updated since then. |
04:19
🔗
|
tsp_ |
Oh, fanficfare |
04:19
🔗
|
guest9000 |
i realize that but its probably not going down any time soon and im in no rush. i use a quick fire and forget script |
04:19
🔗
|
tsp_ |
How are you dealing with parallel downloads? I always fail at that |
04:20
🔗
|
guest9000 |
ive got 9 million stories, only took me 2 years or so. |
04:20
🔗
|
guest9000 |
split the list, run 2 instances at once |
04:21
🔗
|
tsp_ |
I can guess, you're going from 1 to max, without scraping the category pages to see what's there? |
04:21
🔗
|
guest9000 |
2 chapters per second 600k to go, ill be done in maybe a month or so |
04:21
🔗
|
guest9000 |
theres only 11 million ids, i figure thats the easiest way |
04:21
🔗
|
tsp_ |
What does it do if one fails? |
04:21
🔗
|
guest9000 |
notes it , goes on to the next one |
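The approach described above (walk the numeric ID range, split it between two instances, note failures and move on) can be sketched roughly as follows. This is only an illustration: `split_ids`, `crawl`, and the `fetch` callback are hypothetical names, not part of the downloader actually used in the channel.

```python
from typing import Callable, Iterable


def split_ids(first: int, last: int, workers: int) -> list[range]:
    """Split the inclusive ID range [first, last] into contiguous chunks."""
    total = last - first + 1
    base, extra = divmod(total, workers)
    chunks = []
    start = first
    for i in range(workers):
        # The first `extra` chunks take one additional ID each.
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def crawl(ids: Iterable[int], fetch: Callable[[int], None]) -> list[int]:
    """Try every ID once; note the ones that fail and go on to the next."""
    failed = []
    for story_id in ids:
        try:
            fetch(story_id)  # hypothetical: downloads and saves one story
        except Exception:
            failed.append(story_id)
    return failed
```

Each chunk would be handed to its own instance (for example one per screen session), and the returned list of failed IDs can be retried later.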
04:22
🔗
|
tsp_ |
I've got about 700k over a few years |
04:22
🔗
|
guest9000 |
i have a screen session for both threads. it logs, |
04:22
🔗
|
guest9000 |
the only downside is updating the collection is gonna be a bitch |
04:23
🔗
|
tsp_ |
If you include the update date in the story (it usually does), you can scrape the category pages for the story ids every few months |
04:24
🔗
|
tsp_ |
if the update dates are different, or the number of chapters are different, then update that story |
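A minimal sketch of that update rule, assuming each story's metadata has already been parsed out of the local file header and out of a scraped category listing. The `StoryMeta` record and both function names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoryMeta:
    story_id: int
    updated: str   # update date as shown on the site, compared as text
    chapters: int


def needs_update(local: Optional[StoryMeta], remote: StoryMeta) -> bool:
    """Re-download when the story is new, or its date or chapter count differs."""
    if local is None:
        return True
    return local.updated != remote.updated or local.chapters != remote.chapters


def stories_to_refresh(local_index: dict[int, StoryMeta],
                       listing: list[StoryMeta]) -> list[int]:
    """Return the IDs from a scraped listing that should be fetched again."""
    return [r.story_id for r in listing
            if needs_update(local_index.get(r.story_id), r)]
```

Running this against a fresh listing every few months would keep the download queue limited to stories that actually changed.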
04:24
🔗
|
guest9000 |
actually the files are sorted by "category/status/category - author - title.txt" in that format status is either "In-Progress" or "Completed" is there a way to only grep those files? |
04:24
🔗
|
tsp_ |
yeah, grep has a --from-files option |
04:24
🔗
|
tsp_ |
or something |
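Whatever the right grep flag turns out to be, the directory layout quoted above ("category/status/category - author - title.txt") is enough to pick out one status on its own. A rough sketch, with hypothetical function names and assuming the files are UTF-8 text:

```python
from pathlib import Path


def files_with_status(root: Path, status: str) -> list[Path]:
    """List story files under root/<category>/<status>/, e.g. status='In-Progress'."""
    return sorted(root.glob(f"*/{status}/*.txt"))


def grep_files(paths: list[Path], needle: str) -> list[Path]:
    """Return the files whose text contains `needle` (plain substring match)."""
    hits = []
    for path in paths:
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text:
            hits.append(path)
    return hits
```

For example, `grep_files(files_with_status(Path("archive"), "In-Progress"), "Updated:")` would search only the unfinished stories; the `archive` directory and the `Updated:` string are placeholders.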
04:25
🔗
|
guest9000 |
tsp_: update date, i do. |
04:25
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
04:26
🔗
|
guest9000 |
it has update date, date published author story urls and summary in a block of text at the beginning of each file. |
04:26
🔗
|
|
Sk1d has quit IRC (Read error: Operation timed out) |
04:26
🔗
|
tsp_ |
is the story id in the file? How do you handle resumes |
04:26
🔗
|
|
mistym has joined #archiveteam |
04:26
🔗
|
tsp_ |
oh, duh, you can record the last id you got |
04:26
🔗
|
guest9000 |
yes the story id is in the text block at the beginning, and what do you mean resumes? |
04:26
🔗
|
tsp_ |
simple setup, simple solution |
04:26
🔗
|
tsp_ |
like, if your script dies |
04:27
🔗
|
guest9000 |
screen log |
04:27
🔗
|
guest9000 |
i check every few days |
04:27
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
04:27
🔗
|
tsp_ |
what version of 7zip did you use on this thing? |
04:27
🔗
|
guest9000 |
realistically, if i miss a few hundred, i still have the largest and most comprehensive collection. |
04:28
🔗
|
guest9000 |
err, i have no idea, the default that came with ubuntu ...10 i think? |
04:28
🔗
|
guest9000 |
i used the manual's description of ultra settings |
04:29
🔗
|
|
Sk1d has joined #archiveteam |
04:32
🔗
|
guest9000 |
about the ao3 grab, i actually had an author contact me about some of her stories she had deleted and wanted back, and i had to disappoint her because i found out the file was bad. |
04:35
🔗
|
tsp_ |
have you tried the same 7z version on the ubuntu 10 box? |
04:36
🔗
|
tsp_ |
I looked at the header, the first 8 bytes are ok, the rest is all 0. After that it continues |
04:37
🔗
|
guest9000 |
tsp_: seriously, 900mb of zeros? |
04:37
🔗
|
tsp_ |
no |
04:37
🔗
|
tsp_ |
I'm not sure yet |
04:38
🔗
|
guest9000 |
oh, yeah that would be stupid!, so whats there? |
04:38
🔗
|
guest9000 |
ive upgraded since then, im on mint 17 now |
04:38
🔗
|
tsp_ |
the first 8 bytes seem correct, then the next 24 are 0, doing some quick math. Then a bunch of data |
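The 8 + 24 split matches the 32-byte 7z signature header: a 6-byte signature and 2 version bytes, followed by a 4-byte start-header CRC, an 8-byte next-header offset, an 8-byte next-header size, and a 4-byte next-header CRC, all little-endian. A small sketch of the kind of check described here; the function name and the returned dictionary keys are illustrative only.

```python
import struct

SIGNATURE = b"7z\xbc\xaf\x27\x1c"  # the six magic bytes of a .7z file


def inspect_start_header(header: bytes) -> dict:
    """Decode the first 32 bytes of a .7z file and flag an all-zero start header."""
    if len(header) < 32:
        raise ValueError("need at least 32 bytes")
    major, minor = header[6], header[7]
    # uint32 CRC, uint64 offset, uint64 size, uint32 CRC: 24 bytes, little-endian
    start_crc, next_offset, next_size, next_crc = struct.unpack("<IQQI", header[8:32])
    return {
        "signature_ok": header[:6] == SIGNATURE,
        "version": (major, minor),
        "start_header_crc": start_crc,
        "next_header_offset": next_offset,
        "next_header_size": next_size,
        "next_header_crc": next_crc,
        "start_header_zeroed": header[8:32] == bytes(24),
    }
```

Reading the first 32 bytes of the archive in binary mode and passing them in would show whether the offset and size fields are zeroed, as reported for this file.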
04:38
🔗
|
tsp_ |
So, let's see... I'll patch these bytes... |
04:39
🔗
|
tsp_ |
Oh, that's described on the recovery page as something I should do. |
04:40
🔗
|
guest9000 |
see, i would have tried whatever that means, but i have no idea how youre doing that |
04:41
🔗
|
tsp_ |
I'll try it locally and not in this silly vps |
04:41
🔗
|
guest9000 |
use this link https://archive.org/download/Ao3ArchiveCrawl/ao3-1-700000.7z i guarantee its faster |
04:42
🔗
|
tsp_ |
I can only download at 500 kb/s max |
04:42
🔗
|
tsp_ |
so it'll take half an hour |
04:42
🔗
|
guest9000 |
ugh, i keep forgetting 100mbps isnt as common as id like it to be |
04:43
🔗
|
guest9000 |
what city u in? |
04:43
🔗
|
tsp_ |
canada |
04:43
🔗
|
tsp_ |
hang on, oh here we go |
04:43
🔗
|
guest9000 |
tsp_: what, find something? |
04:44
🔗
|
tsp_ |
this situation can also happen if the archiving was interrupted for some reason |
04:44
🔗
|
guest9000 |
im reasonably sure i just let it run overnight, but this was years ago |
04:44
🔗
|
tsp_ |
this is like reading a google translated page |
04:47
🔗
|
guest9000 |
now, i really wish id backed up the data somewhere, i have plenty of space now! |
04:57
🔗
|
tsp_ |
I think the archive got interrupted before it finished |
04:57
🔗
|
tsp_ |
but could be wrong |
04:58
🔗
|
guest9000 |
so i'm most likely screwed on ever seeing any of that data then? |
04:59
🔗
|
tsp_ |
There's a complicated recovery process that might get some of it back, but I'm honestly not that good with a hex editor and byte offsets to pull it off |
04:59
🔗
|
tsp_ |
your best bet is to simply scrape ao3 again. Not ideal, but better than nothing |
05:20
🔗
|
|
mutoso has quit IRC (Quit: leaving) |
05:33
🔗
|
tsp_ |
guest9000: These things all end with "End file."? |
05:45
🔗
|
DFJustin |
there are some pretty crackerjack nerds in here, stick around and someone might come to the rescue |
05:55
🔗
|
guest9000 |
tsp_: youve got some valid text output?! |
05:56
🔗
|
guest9000 |
tsp_: and yes that was the default end for the scraper and i just left it. |
05:56
🔗
|
Lord_Nigh |
ok, the famitracker old forums definitely are closing |
05:56
🔗
|
tsp_ |
Yeah, I got a bunch of fanfics. Problem, they're all in one giant file, and I have to run it again because my dummy file was too small. |
05:57
🔗
|
guest9000 |
damn, i was afraid of that. are the names of the files preserved somewhere? |
05:57
🔗
|
Lord_Nigh |
http://famitracker.com/forum/ is closing |
05:57
🔗
|
tsp_ |
as the recovery page said, we can't really fix the giant file issue, but let's see what I can get out of it first with a 20gb file. |
05:57
🔗
|
Lord_Nigh |
the new forums are forums.famitracker.com |
05:58
🔗
|
guest9000 |
its all there?! *manly squee* |
05:58
🔗
|
tsp_ |
nope, they're not. But you have the titles, authors, categories, status, and the End file marker |
05:58
🔗
|
tsp_ |
I'm trying a 20gb file. I doubt I'll get 20gb of data |
05:58
🔗
|
tsp_ |
because there's no way 18gb can fit into a 900mb archive |
05:58
🔗
|
tsp_ |
Well, ok, maybe, but my guess is not |
05:59
🔗
|
tsp_ |
it'll take a few hours for this to compress. |
05:59
🔗
|
guest9000 |
honestly , its just a crapload of text, i figured that was just how good ultra was |
05:59
🔗
|
tsp_ |
rar beat 7z last I checked on text |
06:00
🔗
|
guest9000 |
in hindsight 900mb is only 5% of 18gb, so that ratio would be pretty amazing |
06:01
🔗
|
tsp_ |
Ah well, I'll send you the big text file once I get it, you can write a script to parse the stuff out of it you want |
06:02
🔗
|
guest9000 |
tsp_: where the hell are you going to upload that to? |
06:03
🔗
|
xmc |
archive.org ? :) |
06:03
🔗
|
guest9000 |
duh |
06:03
🔗
|
tsp_ |
my website, dropbox, wherever I can squeeze it in |
06:03
🔗
|
tsp_ |
noone wants a big text stream in the form it's going to come out as |
06:03
🔗
|
guest9000 |
its 2 am local time, i should probably seriously consider going to bed |
06:07
🔗
|
Lord_Nigh |
sleep is for the weak |
06:08
🔗
|
|
guest9000 has quit IRC (http://www.mibbit.com ajax IRC Client) |
06:08
🔗
|
xmc |
sleep is for the tired |
06:09
🔗
|
|
guest9000 has joined #archiveteam |
06:10
🔗
|
|
bsmith093 has joined #archiveteam |
06:11
🔗
|
|
bsmith093 has quit IRC (Client Quit) |
06:11
🔗
|
|
bsmith093 has joined #archiveteam |
06:13
🔗
|
guest9000 |
SketchCow Lord_Nigh xmc chfoo tsp_ working on a huge a03 archive recovery any ideas on parsing? |
06:13
🔗
|
xmc |
parsing what? |
06:13
🔗
|
Lord_Nigh |
a03? what's that? |
06:13
🔗
|
tsp_ |
AO3, archiveofourown |
06:14
🔗
|
guest9000 |
the 20gb file my ao3 crawl turned into from a bad compression apparently |
06:14
🔗
|
tsp_ |
it didn't turn into 20gb, yet |
06:14
🔗
|
tsp_ |
basically the 7zip recovery page says: make an archive bigger than the existing one, split it, put yours in place of that, extract it, and see what you get |
06:14
🔗
|
xmc |
huh |
06:15
🔗
|
guest9000 |
tsp_: still might as well get ideas flowing, any eta, also seriously tsp_ THANK YOU SOOOO MUCH! :) |
06:15
🔗
|
tsp_ |
1gb gave me 1gb of text, so 20gb should give me... however much text this 947mb archive holds |
06:16
🔗
|
tsp_ |
14% compressing. I need to compress a 20gb file of /dev/urandom first, then split it, put the block of data in, extract |
06:16
🔗
|
tsp_ |
what a silly way to do it, you can't just hex edit the expected size in? |
06:16
🔗
|
bsmith093 |
tsp_, would dev/zero be faster? |
06:17
🔗
|
tsp_ |
That wouldn't get the desired effect. "We must create new "good" 7z archive with same method as in bad.7z, and new archive must be much larger than bad.7z" |
06:18
🔗
|
tsp_ |
I take that to mean I can't just cheat and use /dev/zero |
06:19
🔗
|
|
primus104 has joined #archiveteam |
06:19
🔗
|
bsmith093 |
the really sad part is there's an inventory file of every single file that was supposed to be in this grab, so i'll know what's missing |
06:21
🔗
|
tsp_ |
Just grab everything not deleted again. You have the work urls |
06:22
🔗
|
guest9000 |
yeah but i know stuff's been deleted between now and then, and i'm one of those anal retentive nerds. oh well, youre right better than nothing |
06:28
🔗
|
|
MMovie2 has joined #archiveteam |
06:28
🔗
|
SketchCow |
What |
06:30
🔗
|
|
MMovie has quit IRC (Ping timeout: 306 seconds) |
06:37
🔗
|
|
guest9000 has quit IRC (http://www.mibbit.com ajax IRC Client) |
06:41
🔗
|
bsmith093 |
tsp_, restarted the grab with fanficfare, and the old config file, should be done in a month or two, they have 4.9million stories now |
06:41
🔗
|
bsmith093 |
going to bed |
06:43
🔗
|
SketchCow |
bsmith093: boop |
06:50
🔗
|
|
primus104 has quit IRC (Leaving.) |
06:51
🔗
|
|
garyrh has quit IRC (http://bnc4free.com/) |
06:51
🔗
|
|
garyrh has joined #archiveteam |
07:01
🔗
|
|
MMovie2 has quit IRC (Read error: Connection reset by peer) |
07:04
🔗
|
|
MMovie has joined #archiveteam |
07:04
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
07:09
🔗
|
|
MMovie has quit IRC (Ping timeout: 306 seconds) |
07:16
🔗
|
|
MMovie has joined #archiveteam |
07:19
🔗
|
|
atomotic has joined #archiveteam |
07:20
🔗
|
|
schbirid has joined #archiveteam |
07:22
🔗
|
|
MMovie has quit IRC (Ping timeout: 306 seconds) |
07:33
🔗
|
|
primus104 has joined #archiveteam |
07:34
🔗
|
|
MMovie has joined #archiveteam |
08:05
🔗
|
|
mistym has joined #archiveteam |
08:09
🔗
|
|
rolf has joined #archiveteam |
08:10
🔗
|
|
rejon has quit IRC (Read error: Operation timed out) |
08:11
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
08:25
🔗
|
|
primus104 has quit IRC (Leaving.) |
08:26
🔗
|
|
rejon has joined #archiveteam |
08:44
🔗
|
|
rejon has quit IRC (Ping timeout: 362 seconds) |
08:48
🔗
|
|
DopefishJ has joined #archiveteam |
08:56
🔗
|
|
DFJustin has quit IRC (Ping timeout: 740 seconds) |
08:58
🔗
|
|
vOYtEC_ has joined #archiveteam |
09:00
🔗
|
|
vOYtEC has quit IRC (Read error: Connection reset by peer) |
09:00
🔗
|
|
rejon has joined #archiveteam |
09:11
🔗
|
|
rolf has quit IRC (Leaving...) |
09:49
🔗
|
|
primus104 has joined #archiveteam |
10:00
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
10:07
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
10:07
🔗
|
|
mistym has joined #archiveteam |
10:16
🔗
|
|
mistym has quit IRC (Read error: Operation timed out) |
10:28
🔗
|
|
rolf has joined #archiveteam |
10:37
🔗
|
|
Ymgve has joined #archiveteam |
10:42
🔗
|
|
scyther has joined #archiveteam |
10:58
🔗
|
|
signius has quit IRC (Ping timeout: 252 seconds) |
11:09
🔗
|
|
dan_ has quit IRC (Ping timeout: 252 seconds) |
11:11
🔗
|
|
signius has joined #archiveteam |
11:28
🔗
|
dashcloud |
from the #aohell channel: https://twitter.com/AP/status/598081874146238465 (Verizon is buying AOL) |
11:38
🔗
|
|
antomati_ has joined #archiveteam |
11:40
🔗
|
|
antomatic has quit IRC (Read error: Operation timed out) |
12:08
🔗
|
|
mistym has joined #archiveteam |
12:12
🔗
|
|
dan_ has joined #archiveteam |
12:17
🔗
|
|
mistym has quit IRC (Ping timeout: 512 seconds) |
12:33
🔗
|
midas |
great, so now ill get some verizon cd's with 24 hours of internet |
13:01
🔗
|
|
sankin has joined #archiveteam |
13:12
🔗
|
|
atomotic has joined #archiveteam |
13:23
🔗
|
|
primus104 has quit IRC (Leaving.) |
13:26
🔗
|
|
sankin has quit IRC (Leaving.) |
13:48
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
13:52
🔗
|
|
sankin has joined #archiveteam |
14:03
🔗
|
|
nertzy has joined #archiveteam |
14:06
🔗
|
|
scyther has quit IRC (Read error: Connection reset by peer) |
14:10
🔗
|
|
mistym has joined #archiveteam |
14:15
🔗
|
|
mistym has quit IRC (Ping timeout: 252 seconds) |
14:18
🔗
|
|
DopefishJ is now known as DFJustin |
14:19
🔗
|
|
rolf has quit IRC (Leaving...) |
14:40
🔗
|
|
mistym has joined #archiveteam |
14:54
🔗
|
|
Mayonaise has quit IRC (Ping timeout: 362 seconds) |
15:05
🔗
|
|
Mayonaise has joined #archiveteam |
15:08
🔗
|
|
Start has quit IRC (Disconnected.) |
15:21
🔗
|
phillipsj |
dashcloud I have some Commodore 5??? drives (they tend to go out of alignment -- they came with an article explaining a quick and dirty "fix") I also have many 5¼ inch floppy drives and 3½ inch floppy drives (which I may want to put into active use for "secure boot" purposes) |
15:23
🔗
|
phillipsj |
My rarest drive is probably a 270MB 3.5" disk cartridge drive (syquest)? I may have damaged rather than fixed the heads by trying to clean them with a cotton swab though. |
15:36
🔗
|
|
dan_ has quit IRC (Ping timeout: 252 seconds) |
15:44
🔗
|
|
dan_ has joined #archiveteam |
15:44
🔗
|
|
sunny256_ has quit IRC (Read error: Connection reset by peer) |
15:45
🔗
|
|
nertzy has quit IRC (Read error: Operation timed out) |
15:47
🔗
|
|
nertzy has joined #archiveteam |
15:51
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
15:51
🔗
|
|
primus104 has joined #archiveteam |
15:55
🔗
|
|
DFJustin has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
Jonimus has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
wp494 has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
xtr-201 has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
Smiley has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
SketchCow has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
RedType has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
twrist has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
SadDM has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
sivoais has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
useretail has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
dx- has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
mr-b has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
thefinn93 has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
dugo_ has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
jk[SVP] has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
chfoo- has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
offby1 has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
Selanda has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
matthusby has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
rduser has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
underscor has quit IRC (ircd.shaw.ca irc.shaw.ca) |
15:55
🔗
|
|
NotGLaDOS has joined #archiveteam |
15:57
🔗
|
|
_vOYtEC has joined #archiveteam |
15:57
🔗
|
|
RedType_ has joined #archiveteam |
15:58
🔗
|
|
dugo has joined #archiveteam |
15:58
🔗
|
|
primus104 has quit IRC (Leaving.) |
15:59
🔗
|
|
SmileyG has joined #archiveteam |
16:02
🔗
|
|
vOYtEC_ has quit IRC (Read error: Connection reset by peer) |
16:03
🔗
|
|
PepsiMax has quit IRC (Ping timeout: 265 seconds) |
16:03
🔗
|
|
Deewiant has quit IRC (Ping timeout: 265 seconds) |
16:03
🔗
|
|
DFJustin has joined #archiveteam |
16:03
🔗
|
|
Jonimus has joined #archiveteam |
16:03
🔗
|
|
wp494 has joined #archiveteam |
16:03
🔗
|
|
SadDM has joined #archiveteam |
16:03
🔗
|
|
useretail has joined #archiveteam |
16:03
🔗
|
|
dx- has joined #archiveteam |
16:03
🔗
|
|
mr-b has joined #archiveteam |
16:03
🔗
|
|
thefinn93 has joined #archiveteam |
16:03
🔗
|
|
jk[SVP] has joined #archiveteam |
16:03
🔗
|
|
chfoo- has joined #archiveteam |
16:03
🔗
|
|
offby1 has joined #archiveteam |
16:03
🔗
|
|
Selanda has joined #archiveteam |
16:03
🔗
|
|
matthusby has joined #archiveteam |
16:03
🔗
|
|
rduser has joined #archiveteam |
16:03
🔗
|
|
underscor has joined #archiveteam |
16:03
🔗
|
|
irc.shaw.ca sets mode: +oo SadDM underscor |
16:03
🔗
|
|
Deewiant has joined #archiveteam |
16:03
🔗
|
|
nico_32 has quit IRC (Ping timeout: 265 seconds) |
16:04
🔗
|
|
PepsiMax has joined #archiveteam |
16:05
🔗
|
|
mistym has joined #archiveteam |
16:06
🔗
|
|
dan_ has quit IRC (Ping timeout: 252 seconds) |
16:07
🔗
|
|
dan_ has joined #archiveteam |
16:09
🔗
|
|
nico_32 has joined #archiveteam |
16:13
🔗
|
|
Start has joined #archiveteam |
16:17
🔗
|
|
sivoais has joined #archiveteam |
16:31
🔗
|
|
SimpBrain has joined #archiveteam |
16:45
🔗
|
|
Start has quit IRC (Disconnected.) |
16:48
🔗
|
|
philpem has joined #archiveteam |
16:50
🔗
|
|
Start has joined #archiveteam |
16:51
🔗
|
xmc |
midas: but, with verizon math, it'll actually only be 24 minutes |
16:59
🔗
|
|
SketchCow has joined #archiveteam |
16:59
🔗
|
|
GLaDOS sets mode: +o SketchCow |
17:02
🔗
|
|
scyther has joined #archiveteam |
17:06
🔗
|
|
nertzy has quit IRC (Quit: This computer has gone to sleep) |
17:29
🔗
|
|
xmc sets mode: +o swebb |
17:29
🔗
|
|
swebb sets mode: +o DFJustin |
17:42
🔗
|
|
Start has quit IRC (Disconnected.) |
17:45
🔗
|
SketchCow |
bsmith093: Hey there. So I've been packing up the fan fiction collection. |
17:45
🔗
|
|
pwnsrv has joined #archiveteam |
17:45
🔗
|
SketchCow |
It's big. I understand if another comes down the line. |
17:51
🔗
|
|
aaaaaaaaa has joined #archiveteam |
17:58
🔗
|
|
mistym has quit IRC (Remote host closed the connection) |
17:59
🔗
|
|
mistym has joined #archiveteam |
18:35
🔗
|
|
rolf has joined #archiveteam |
18:36
🔗
|
|
habi has joined #archiveteam |
18:37
🔗
|
|
caber has quit IRC (Quit: Kids: talk with your parents about ad-blockers, and, at some point; social media. But fundamentals first!) |
18:44
🔗
|
|
Start has joined #archiveteam |
18:44
🔗
|
|
caber has joined #archiveteam |
18:44
🔗
|
|
Start has quit IRC (Client Quit) |
18:45
🔗
|
|
Start has joined #archiveteam |
18:50
🔗
|
|
habi has quit IRC (Quit: Leaving.) |
18:50
🔗
|
|
habi has joined #archiveteam |
18:52
🔗
|
|
Start has quit IRC (Ping timeout: 370 seconds) |
18:54
🔗
|
|
rolf has quit IRC (Leaving...) |
18:56
🔗
|
|
habi has left |
19:03
🔗
|
|
rolf has joined #archiveteam |
19:03
🔗
|
|
rolf has quit IRC (Client Quit) |
19:08
🔗
|
|
Nystrom has quit IRC (- nbs-irc 2.39 - www.nbs-irc.net -) |
19:27
🔗
|
|
aaaaaaaaa has quit IRC (Leaving) |
19:34
🔗
|
|
rolf has joined #archiveteam |
19:36
🔗
|
|
aaaaaaaaa has joined #archiveteam |
19:40
🔗
|
|
rolf has quit IRC (Leaving...) |
19:40
🔗
|
|
wm_ has joined #archiveteam |
19:41
🔗
|
|
primus104 has joined #archiveteam |
19:54
🔗
|
|
SN4T14__ has joined #archiveteam |
20:00
🔗
|
|
SN4T14_ has quit IRC (Ping timeout: 369 seconds) |
20:02
🔗
|
|
bsmith094 has joined #archiveteam |
20:03
🔗
|
|
scyther has quit IRC (Leaving) |
20:13
🔗
|
bsmith094 |
SketchCow: i'm still running that actually, i havent sent anything up in a while, but its almost caught up, 600k ids to go |
20:20
🔗
|
bsmith094 |
tsp_: restarted the ao3 scraper about 10 hours ago, rough ETC ~4.5 months |
20:20
🔗
|
bsmith094 |
tsp_: theyve been very busy |
20:21
🔗
|
tsp_ |
bsmith094: I sent you a pm, well, 93 a pm |
20:24
🔗
|
bsmith093 |
got it, downloading |
20:27
🔗
|
tsp_ |
I don't think my script screwed up, things seem to be where they're supposed to be |
20:28
🔗
|
bsmith094 |
its opening, so all one file or split? |
20:32
🔗
|
bsmith094 |
tsp_: oh i see, its the same as i compressed it! awesome, how much is there? |
20:33
🔗
|
tsp_ |
4gb or so. I used the inventory file to reconstruct the filenames based on the work ids, and split at End file. |
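The reconstruction described here (cut the recovered stream at each "End file." marker, then name each piece from the inventory by work ID) might look something like the sketch below. It is not tsp_'s actual script: the function names are invented, and it assumes each story's header block contains a `/works/<id>` URL and that the inventory has already been loaded as a mapping from work ID to the original path.

```python
import re

WORK_ID = re.compile(r"/works/(\d+)")  # assumption: header holds an AO3 work URL


def split_stories(stream_text: str, marker: str = "End file.") -> list[str]:
    """Cut one big text stream into stories, each ending with the marker."""
    parts = stream_text.split(marker)
    # Whatever follows the last marker is a truncated story, so it is dropped.
    return [p.strip() + "\n" + marker for p in parts[:-1] if p.strip()]


def name_stories(stories: list[str], inventory: dict[int, str]) -> dict[str, str]:
    """Map original file paths to story text using the first work ID found."""
    named = {}
    for text in stories:
        match = WORK_ID.search(text)
        if match and int(match.group(1)) in inventory:
            named[inventory[int(match.group(1))]] = text
    return named
```

Stories whose ID is missing from the inventory are skipped rather than given a guessed name.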
20:34
🔗
|
bsmith094 |
i knew that inventory was a good idea, go past-me! |
20:34
🔗
|
bsmith094 |
i figured anyone who grabbed it , would like to have a list of what was there |
20:35
🔗
|
bsmith094 |
how many files? |
20:37
🔗
|
tsp_ |
140179 |
20:40
🔗
|
bsmith094 |
140179/700000 is about 20.02% so better than nothing! thanks :) |
20:40
🔗
|
tsp_ |
np |
20:42
🔗
|
bsmith094 |
merging now, 48 minutes to go |
20:42
🔗
|
tsp_ |
merging? With your current download? You should do that after you're done |
20:43
🔗
|
tsp_ |
you only want to merge if the story doesn't exist already on the site |
20:45
🔗
|
bsmith094 |
this way at least , i have something, a more complete collection |
20:53
🔗
|
|
sankin has quit IRC (Leaving.) |
20:54
🔗
|
tsp_ |
If you merge now and download any story you don't have, you won't be able to update them easily. If you merge later, you'll be able to only merge what doesn't already exist in what you downloaded, which is better IMO |
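The "merge later" idea amounts to letting the fresh download win and filling in only the works it lacks. A toy sketch, with both collections modelled as hypothetical dictionaries keyed by work ID:

```python
def merge_recovered(fresh: dict[int, str], recovered: dict[int, str]) -> dict[int, str]:
    """Keep every freshly downloaded story; add recovered ones only where missing."""
    merged = dict(fresh)
    for work_id, text in recovered.items():
        # setdefault never overwrites an entry the fresh crawl already has.
        merged.setdefault(work_id, text)
    return merged
```

Done in this order, anything still on the site stays in its freshly downloaded (and updatable) form, and the recovered copies cover only works that have since been deleted.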
20:59
🔗
|
bsmith094 |
crap, it just finished the merge. ah well redo only lost 12 hours |
21:11
🔗
|
|
bsmith094 has quit IRC (http://www.mibbit.com ajax IRC Client) |
21:12
🔗
|
|
rolf has joined #archiveteam |
21:16
🔗
|
|
BlueMaxim has joined #archiveteam |
21:17
🔗
|
SketchCow |
bsmith093: So you're saying I need to stop packing. |
21:17
🔗
|
SketchCow |
And will have to pack it later. |
21:17
🔗
|
SketchCow |
Doing so. |
21:18
🔗
|
SketchCow |
That's the problem with the FTP. Some people are working on things for months and others are doing it then walking away, then annoyed I don't mind-meld know they're finished. |
21:26
🔗
|
|
rolf has quit IRC (Leaving...) |
21:27
🔗
|
|
rolf has joined #archiveteam |
21:36
🔗
|
|
rolf has quit IRC (Leaving...) |
21:45
🔗
|
phillipsj |
BTW I thought of other possibly rare hardware I have: 8 track tape player and Record player capable of playing 78s. Archiving commercial music (or video) sounds like a pain though. |
22:00
🔗
|
|
phillipsj has quit IRC (Read error: Operation timed out) |
22:02
🔗
|
bsmith093 |
SketchCow, sorry, you can finish if you want, call it volume 1 or something, just thought i should tell you its not yet complete. |
22:03
🔗
|
bsmith093 |
SketchCow, its your space, and i appreciate you letting me use it :) |
22:03
🔗
|
SketchCow |
I have killed it and so let me know when you're done. |
22:18
🔗
|
|
nwf has quit IRC (Read error: Operation timed out) |
22:20
🔗
|
|
josephroo has quit IRC (Read error: Operation timed out) |
22:20
🔗
|
|
vegbrasil has quit IRC (Read error: Operation timed out) |
22:21
🔗
|
|
marvinw has quit IRC (Read error: Operation timed out) |
22:21
🔗
|
|
Froggypwn has quit IRC (Read error: Operation timed out) |
22:22
🔗
|
|
mistym_ has joined #archiveteam |
22:23
🔗
|
|
sep332 has quit IRC (Read error: Operation timed out) |
22:23
🔗
|
|
S[h]O[r]T has quit IRC (Read error: Operation timed out) |
22:24
🔗
|
|
mistym has quit IRC (Read error: Connection reset by peer) |
22:24
🔗
|
|
Ctrl-S has quit IRC (Read error: Operation timed out) |
22:24
🔗
|
|
Ctrl-S_ is now known as Ctrl-S |
22:24
🔗
|
|
S[h]O[r]T has joined #archiveteam |
22:24
🔗
|
|
Control-S has joined #archiveteam |
22:24
🔗
|
|
aaaaaaaaa has quit IRC (Read error: Operation timed out) |
22:24
🔗
|
|
bsmith095 has joined #archiveteam |
22:24
🔗
|
|
aMunster has quit IRC (Read error: Connection reset by peer) |
22:25
🔗
|
|
aMunster has joined #archiveteam |
22:25
🔗
|
|
Froggypwn has joined #archiveteam |
22:25
🔗
|
|
josephroo has joined #archiveteam |
22:25
🔗
|
|
lrkj has quit IRC (Remote host closed the connection) |
22:25
🔗
|
|
vegbrasil has joined #archiveteam |
22:25
🔗
|
|
aaaaaaaaa has joined #archiveteam |
22:25
🔗
|
|
sep332 has joined #archiveteam |
22:26
🔗
|
|
marvinw has joined #archiveteam |
22:26
🔗
|
bsmith095 |
SketchCow: should be mostly done in a few weeks or so |
22:27
🔗
|
SketchCow |
Great |
22:27
🔗
|
|
nwf has joined #archiveteam |
22:27
🔗
|
|
lrkj has joined #archiveteam |
22:34
🔗
|
|
xtr-201 has joined #archiveteam |
22:36
🔗
|
|
Emcy_ has joined #archiveteam |
22:38
🔗
|
|
NotGLaDOS has quit IRC (Ping timeout: 240 seconds) |
22:38
🔗
|
|
twrist has joined #archiveteam |
22:39
🔗
|
|
Emcy has quit IRC (Ping timeout: 240 seconds) |
22:42
🔗
|
|
Control-S has quit IRC (Read error: Connection reset by peer) |
22:42
🔗
|
|
achip has quit IRC (Read error: Operation timed out) |
22:43
🔗
|
|
achip has joined #archiveteam |
22:43
🔗
|
|
Control-S has joined #archiveteam |
22:46
🔗
|
|
Froggypwn has quit IRC (Read error: Operation timed out) |
22:46
🔗
|
|
lrkj has quit IRC (Read error: Connection reset by peer) |
22:46
🔗
|
|
godane has quit IRC (Quit: Leaving.) |
22:47
🔗
|
|
Control-S has quit IRC (Read error: Connection reset by peer) |
22:49
🔗
|
|
Froggypwn has joined #archiveteam |
22:49
🔗
|
|
nwf has quit IRC (Ping timeout: 600 seconds) |
22:49
🔗
|
|
aaaaaaaaa has quit IRC (Read error: Operation timed out) |
22:50
🔗
|
|
yotta has quit IRC (Ping timeout: 600 seconds) |
22:51
🔗
|
|
josephroo has quit IRC (Ping timeout: 600 seconds) |
22:52
🔗
|
|
aaaaaaaaa has joined #archiveteam |
22:53
🔗
|
|
sep332 has quit IRC (Ping timeout: 600 seconds) |
22:53
🔗
|
|
aMunster has quit IRC (Ping timeout: 600 seconds) |
22:55
🔗
|
|
lrkj has joined #archiveteam |
22:55
🔗
|
|
S[h]O[r]T has quit IRC (Ping timeout: 600 seconds) |
22:56
🔗
|
|
Control-S has joined #archiveteam |
22:56
🔗
|
|
aMunster has joined #archiveteam |
22:56
🔗
|
|
josephroo has joined #archiveteam |
22:56
🔗
|
|
nwf has joined #archiveteam |
22:56
🔗
|
|
yotta has joined #archiveteam |
22:56
🔗
|
|
S[h]O[r]T has joined #archiveteam |
22:59
🔗
|
|
sep332 has joined #archiveteam |
23:04
🔗
|
|
Start has joined #archiveteam |
23:06
🔗
|
|
Froggypwn has quit IRC (Ping timeout: 265 seconds) |
23:07
🔗
|
|
Froggypwn has joined #archiveteam |
23:15
🔗
|
|
xtr-201 has quit IRC (Ping timeout: 370 seconds) |
23:19
🔗
|
|
philpem has quit IRC (Ping timeout: 252 seconds) |
23:57
🔗
|
|
Ymgve has quit IRC () |