#archiveteam-bs 2019-03-29,Fri

Time Nickname Message
00:07 🔗 IanR has quit IRC (Remote host closed the connection)
00:15 🔗 IanR has joined #archiveteam-bs
00:44 🔗 wp494 has quit IRC (Read error: Operation timed out)
00:45 🔗 wp494 has joined #archiveteam-bs
01:03 🔗 ryry has joined #archiveteam-bs
01:10 🔗 godane SketchCow: so i'm going thru American Cinematographer 1926 someone scanned and noticed that 4 pages are missing from the june 1926 issue
01:10 🔗 godane pages 4, 5, 8 and 9
01:11 🔗 godane from this item: https://archive.org/details/amemato06asch
01:17 🔗 robbierut has quit IRC (Read error: Operation timed out)
01:31 🔗 adinbied has joined #archiveteam-bs
01:58 🔗 hendi_ has joined #archiveteam-bs
02:00 🔗 hendi has quit IRC (Read error: Operation timed out)
02:13 🔗 BlueMax has joined #archiveteam-bs
02:17 🔗 jdude104 has joined #archiveteam-bs
02:31 🔗 godane SketchCow: i'm uploading US News and World Report magazine
02:31 🔗 godane i scanned the 2 issues i found at savers
02:31 🔗 godane one from 1997-09-15 and 1998-01-26
02:32 🔗 godane looks like you guys have close to none on that one magazine
02:35 🔗 SketchCow I suspect we keep getting takedowns for that one.
02:35 🔗 Flashfire So is spam darked or is it deleted? I have always been curious about that sketchcow
02:36 🔗 godane SketchCow: that sucks
02:36 🔗 godane there is like no online archive of their back issues on their website
02:36 🔗 SketchCow Spam is darked
02:41 🔗 SketchCow I just darked over 1,000 spam items today
02:42 🔗 Flashfire What does spam usually consist of? I can't find much documentation
02:43 🔗 SketchCow You don't find any documentation
02:43 🔗 SketchCow But let's see....
02:43 🔗 SketchCow Taxi Businesses, Massage, Escorts, Malwarely go See The Full Movie, Best ____ in ___, etc.
02:43 🔗 SketchCow Anything where the item is not the item
02:43 🔗 SketchCow It's just an ad for something else
02:43 🔗 SketchCow We have some edge cases, but that's life
02:44 🔗 SketchCow Like, we have people who upload lifestyle blogging and then it's got these huge AND I DO AWESOME STUFF COME VISIT MY SITEEEEEEEEEEEEEEEEEEEEEE
02:44 🔗 SketchCow And they're shitball. Depending on the spam person doing the work, they stay or don't
02:44 🔗 Flashfire Does spam sometimes include web captures? Like save now? Cause I worry some of the weird stuff I find and use Savenow on
02:44 🔗 SketchCow I tend to be nice. Others aren't.
02:44 🔗 SketchCow No.
02:44 🔗 SketchCow Unless you're just uploading horseshit into opensource
02:44 🔗 Flashfire Alright so it's often the same crap you find across the net uploaded everywhere
02:45 🔗 Flashfire Not unless you count one or 2 obscure kids PC games as horseshit but I think that went into the software section
02:48 🔗 godane latest scan: https://archive.org/details/us-news-and-world-report-magazine-1997-09-15
02:48 🔗 godane latest scan: https://archive.org/details/us-news-and-world-report-magazine-1998-01-26
03:11 🔗 SketchCow Want to share https://visidata.org/ because it is neat.
03:23 🔗 verifiedj has quit IRC (Ping timeout: 252 seconds)
03:25 🔗 Mateon1 has quit IRC (Remote host closed the connection)
03:27 🔗 VerifiedJ has joined #archiveteam-bs
03:30 🔗 Mateon1 has joined #archiveteam-bs
03:35 🔗 ndiddy has quit IRC (Ping timeout: 506 seconds)
03:46 🔗 Mateon1 has quit IRC (Remote host closed the connection)
03:46 🔗 Mateon1 has joined #archiveteam-bs
03:50 🔗 Exairnous Is it possible to download specific sites from chromebot collections on IA?
03:53 🔗 Despatche has joined #archiveteam-bs
04:12 🔗 Stilett0 has joined #archiveteam-bs
04:14 🔗 Stiletto has quit IRC (Ping timeout: 506 seconds)
04:16 🔗 godane btw we can save magazines from nxtbooks with simple url patterns: http://transfer.nxtbook.com/nxtbooks/ac/ac0308/offline/ac_ac0308.pdf
04:17 🔗 godane i may be able to find American Cinematographer issues going back to 2000 on nxtbook
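The URL pattern godane describes can be sketched as below. This is only an illustration built from the single example URL in the log: the helper names are hypothetical, and reading `0308` as month plus two-digit year (March 2008) is an assumption that the log does not confirm. The sketch only builds candidate URLs; it does not fetch anything or check that an issue exists.

```python
from typing import Iterator


def nxtbook_pdf_url(title_code: str, issue_code: str) -> str:
    """Build an offline-PDF URL following the pattern seen in the log.

    title_code and issue_code are hypothetical parameter names; for the
    example in the log they would be "ac" and "0308".
    """
    issue = f"{title_code}{issue_code}"
    return (
        "http://transfer.nxtbook.com/nxtbooks/"
        f"{title_code}/{issue}/offline/{title_code}_{issue}.pdf"
    )


def candidate_issue_codes(first_year: int, last_year: int) -> Iterator[str]:
    """Yield issue codes for every month in the given range of years.

    ASSUMPTION: issue codes are MMYY (two-digit month, two-digit year).
    This is a guess based on "0308" and may be wrong for other issues.
    """
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            yield f"{month:02d}{year % 100:02d}"


if __name__ == "__main__":
    # Print candidate URLs for one year; each would still need to be checked.
    for code in candidate_issue_codes(2008, 2008):
        print(nxtbook_pdf_url("ac", code))
```

Keeping the URL builder separate from the issue-code guess means the guess can be replaced once the real numbering scheme is known.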
04:31 🔗 qw3rty111 has joined #archiveteam-bs
04:34 🔗 qw3rty119 has quit IRC (Read error: Operation timed out)
04:37 🔗 odemgi has joined #archiveteam-bs
04:39 🔗 odemgi_ has quit IRC (Ping timeout: 252 seconds)
04:45 🔗 odemg has quit IRC (Ping timeout: 615 seconds)
04:52 🔗 odemg has joined #archiveteam-bs
05:00 🔗 t3 Exairnous: I think chromebot's uploads would eventually end up on IA. Either it will upload WARCs or the WARCs will be combined with those of ArchiveBot to form MegaWARCs. If the site is in the files, then you should be able to find and download it.
05:01 🔗 t3 Exairnous: The tricky part is to locate where the WARCs went.
05:01 🔗 t3 And that depends on the specific site you're after.
05:06 🔗 ivan has quit IRC (Leaving)
05:07 🔗 ivan has joined #archiveteam-bs
05:24 🔗 Exairnous t3: Yes. I found a chromebot warc that contains the sites I want. The problem is it contains a bunch of extraneous sites as well and I was hoping there was an alternative way to download just the content I care about.
05:24 🔗 dhyan_nat has joined #archiveteam-bs
05:25 🔗 t3 Exairnous: Oh... I don't know if there are any alternative downloads. The only thing I can think of is to strip away the unnecessary bits from the WARC using some kind of WARC post-processing software.
05:28 🔗 Exairnous Ok. Thanks t3
05:29 🔗 Stilett0 has quit IRC (Ping timeout: 492 seconds)
05:30 🔗 t3 Exairnous: Anytime.
05:33 🔗 Stiletto has joined #archiveteam-bs
05:34 🔗 godane WTF i may be able to get all issues of American Cinematographer from at least 1980s on
05:34 🔗 godane maybe get all issues
05:35 🔗 godane fuck now i'm grabbing 1960-01 issue
05:36 🔗 godane so i will be able to get all missing years that IA doesn't have
05:37 🔗 t3 Nice.
05:37 🔗 godane this was all cause i was making issue pdfs based on the current volume scans on IA
05:40 🔗 godane i may do an iphone image grab of some issues cause the pdfs take forever to render
05:40 🔗 Exairnous t3: Any WARC post-processing software you'd recommend?
05:43 🔗 godane ok looks like all scans are on there
05:43 🔗 godane i can grab 1930-01 issue
05:44 🔗 godane and i hate the 1930-01 scan
05:44 🔗 godane looks like the last words on the edge of the page were cut off in part
05:46 🔗 godane ok earliest issue i can get is 1922-04
05:48 🔗 t3 Exairnous: I haven't tried any but there are some Python packages that might help. There is `warctools` and `warc`. I found some using `pip3 search warc`.
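One way the WARC post-processing t3 suggests might look is sketched below. It does not use the `warctools` or `warc` packages named above; it assumes the separate `warcio` package is installed, and the function names and the host-matching rule are hypothetical choices for illustration. The idea is to copy only the records whose `WARC-Target-URI` belongs to the wanted sites into a new, smaller WARC.

```python
from urllib.parse import urlsplit


def url_matches_hosts(url: str, wanted_hosts: set) -> bool:
    """Return True if the URL's host is a wanted host or a subdomain of one.

    wanted_hosts is expected to contain lowercase host names such as
    "example.com" (a placeholder, not a site from the log).
    """
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in wanted_hosts)


def filter_warc(src_path: str, dst_path: str, wanted_hosts: set) -> int:
    """Copy matching records from src_path into a new gzipped WARC.

    ASSUMPTION: warcio is available; it is imported here so the helper
    above stays usable without it. Records without a WARC-Target-URI
    (for example warcinfo records) are dropped by this sketch.
    """
    from warcio.archiveiterator import ArchiveIterator
    from warcio.warcwriter import WARCWriter

    kept = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        writer = WARCWriter(dst, gzip=True)
        for record in ArchiveIterator(src):
            uri = record.rec_headers.get_header("WARC-Target-URI")
            # Some older WARCs wrap the URI in angle brackets.
            if uri and url_matches_hosts(uri.strip("<>"), wanted_hosts):
                writer.write_record(record)
                kept += 1
    return kept
```

A call such as `filter_warc("big.warc.gz", "small.warc.gz", {"example.com"})` would then write only the records for that site; both file names are placeholders.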
05:51 🔗 Exairnous t3: Ok. Thanks
05:58 🔗 marked has quit IRC (Read error: Connection reset by peer)
05:58 🔗 robbierut has joined #archiveteam-bs
06:03 🔗 marked has joined #archiveteam-bs
06:14 🔗 SketchCo1 has joined #archiveteam-bs
06:14 🔗 Atom__ has joined #archiveteam-bs
06:16 🔗 IanQ has joined #archiveteam-bs
06:17 🔗 SketchCow has quit IRC (Read error: Connection reset by peer)
06:19 🔗 odemgi has quit IRC (Read error: Connection reset by peer)
06:19 🔗 IanR has quit IRC (Read error: Connection reset by peer)
06:19 🔗 Polylith_ has joined #archiveteam-bs
06:19 🔗 odemgi has joined #archiveteam-bs
06:19 🔗 VerifiedJ has quit IRC (Ping timeout: 252 seconds)
06:19 🔗 Atom-- has quit IRC (Ping timeout: 252 seconds)
06:19 🔗 jut has quit IRC (Ping timeout: 252 seconds)
06:19 🔗 yuitimoth has quit IRC (Ping timeout: 252 seconds)
06:19 🔗 deevious has quit IRC (Ping timeout: 252 seconds)
06:19 🔗 ColdIce has quit IRC (Ping timeout: 252 seconds)
06:20 🔗 yuitimoth has joined #archiveteam-bs
06:20 🔗 SmileyG has quit IRC (Ping timeout: 252 seconds)
06:20 🔗 Polylith has quit IRC (Ping timeout: 252 seconds)
06:20 🔗 Lord_Nigh has quit IRC (Ping timeout: 252 seconds)
06:21 🔗 Lord_Nigh has joined #archiveteam-bs
06:22 🔗 w0rmhole has quit IRC (Ping timeout: 252 seconds)
06:22 🔗 dashcloud has quit IRC (Ping timeout: 252 seconds)
06:25 🔗 jut has joined #archiveteam-bs
06:25 🔗 ranma has quit IRC (Ping timeout: 252 seconds)
06:25 🔗 Flashfire has quit IRC (Ping timeout: 252 seconds)
06:25 🔗 kiska has quit IRC (Ping timeout: 252 seconds)
06:25 🔗 Flashfire has joined #archiveteam-bs
06:25 🔗 w0rmhole has joined #archiveteam-bs
06:25 🔗 kiska has joined #archiveteam-bs
06:25 🔗 ranma has joined #archiveteam-bs
06:26 🔗 deevious has joined #archiveteam-bs
06:26 🔗 svchfoo1 sets mode: +o kiska
06:26 🔗 svchfoo3 sets mode: +o kiska
06:26 🔗 ColdIce has joined #archiveteam-bs
06:30 🔗 Smiley has joined #archiveteam-bs
06:35 🔗 jdude104 has quit IRC (Read error: Operation timed out)
06:46 🔗 Exairnous has quit IRC (Ping timeout: 265 seconds)
06:47 🔗 kiska sets mode: +o kiska1
06:47 🔗 kiska sets mode: +o kiskabak
06:57 🔗 Exairnous has joined #archiveteam-bs
07:02 🔗 Exairnous has quit IRC (Read error: Operation timed out)
07:12 🔗 Exairnous has joined #archiveteam-bs
07:39 🔗 polar has joined #archiveteam-bs
08:13 🔗 Smiley has quit IRC (Remote host closed the connection)
08:13 🔗 Smiley has joined #archiveteam-bs
08:14 🔗 antomati_ has joined #archiveteam-bs
08:16 🔗 dashcloud has joined #archiveteam-bs
08:18 🔗 Exairnous has quit IRC (Read error: Operation timed out)
08:20 🔗 ivan has quit IRC (Leaving)
08:21 🔗 antomatic has quit IRC (Ping timeout: 615 seconds)
08:22 🔗 ivan has joined #archiveteam-bs
08:39 🔗 robbier97 has joined #archiveteam-bs
08:45 🔗 robbierut has quit IRC (Read error: Operation timed out)
08:52 🔗 icedice has joined #archiveteam-bs
08:54 🔗 robbier97 has quit IRC (Read error: Connection reset by peer)
08:55 🔗 robbierut has joined #archiveteam-bs
09:34 🔗 polar has quit IRC (Quit: Page closed)
09:39 🔗 polar has joined #archiveteam-bs
09:40 🔗 Hintswen has quit IRC (Ping timeout: 265 seconds)
09:41 🔗 fuzy802 has joined #archiveteam-bs
09:43 🔗 Hintswen has joined #archiveteam-bs
09:45 🔗 wp494 has quit IRC (Read error: Operation timed out)
09:46 🔗 wp494 has joined #archiveteam-bs
09:49 🔗 fuzzy8021 has quit IRC (Ping timeout: 615 seconds)
09:51 🔗 fuzy802 is now known as fuzzy8021
10:03 🔗 VerifiedJ has joined #archiveteam-bs
10:07 🔗 fredgido has quit IRC (Read error: Connection reset by peer)
10:10 🔗 fredgido has joined #archiveteam-bs
10:15 🔗 fuzzy8021 has quit IRC (Read error: Operation timed out)
10:15 🔗 fuzzy8021 has joined #archiveteam-bs
10:23 🔗 BlueMax has quit IRC (Quit: Leaving)
10:33 🔗 icedice has quit IRC (Quit: Leaving)
11:50 🔗 netsound has quit IRC (Leaving)
11:53 🔗 polar has quit IRC (Quit: Page closed)
13:10 🔗 Pixi has quit IRC (Read error: Operation timed out)
13:11 🔗 Pixi has joined #archiveteam-bs
13:53 🔗 ryry has quit IRC (Ping timeout: 260 seconds)
14:38 🔗 Hintswen| has joined #archiveteam-bs
14:41 🔗 Despatche has quit IRC (Quit: Read error: Connection reset by deer)
14:43 🔗 Hintswen has quit IRC (Ping timeout: 604 seconds)
14:52 🔗 wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
15:07 🔗 wp494 has joined #archiveteam-bs
15:43 🔗 wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
17:01 🔗 wp494 has joined #archiveteam-bs
17:14 🔗 deevious has quit IRC (Quit: deevious)
17:49 🔗 Exairnous has joined #archiveteam-bs
18:03 🔗 Exairnous has quit IRC (Read error: Operation timed out)
18:09 🔗 SketchCo1 is now known as SketchCow
18:46 🔗 Odd0002_ has joined #archiveteam-bs
18:54 🔗 Odd0002 has quit IRC (Ping timeout: 615 seconds)
18:54 🔗 Odd0002_ is now known as Odd0002
19:12 🔗 godane SketchCow: i'm going to be uploading channelawesome videos from vid.me that i have to FOS
19:12 🔗 godane its over 100gb
19:14 🔗 godane SketchCow: for me this is a way to make sure you have a copy of it
19:15 🔗 godane and don't touch it without contacting me before uploading it
19:15 🔗 godane its going to take a very long time for me to upload this
19:16 🔗 godane even at best speeds we are talking maybe a week for it to be completely uploaded
19:42 🔗 killsushi has joined #archiveteam-bs
20:23 🔗 ReimuHaku has quit IRC (Read error: Operation timed out)
20:24 🔗 ReimuHaku has joined #archiveteam-bs
20:32 🔗 schbirid time for mongodb then?
20:35 🔗 Stilett0 has joined #archiveteam-bs
20:35 🔗 Atom-- has joined #archiveteam-bs
20:39 🔗 ndiddy has joined #archiveteam-bs
20:39 🔗 netsound has joined #archiveteam-bs
20:41 🔗 Stiletto has quit IRC (Read error: Operation timed out)
20:41 🔗 Atom__ has quit IRC (Read error: Operation timed out)
21:00 🔗 Hintswen| is now known as Hintswen
21:06 🔗 schbirid has quit IRC (Remote host closed the connection)
21:23 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
21:29 🔗 BlueMax has joined #archiveteam-bs
21:37 🔗 dhyan_nat has joined #archiveteam-bs
22:01 🔗 icedice has joined #archiveteam-bs
22:07 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
22:40 🔗 Ing3b0rg has joined #archiveteam-bs
22:52 🔗 Exairnous has joined #archiveteam-bs
22:55 🔗 t3 has quit IRC (Quit: Connection closed for inactivity)
23:00 🔗 Exairnous has quit IRC (Read error: Operation timed out)
23:32 🔗 t3 has joined #archiveteam-bs
23:48 🔗 JAA Soo, I want to be nitpicky...
23:48 🔗 JAA Yes, we did reach 1 PB, but not 1 PiB yet.
23:48 🔗 JAA The tracker shows GiB while labelling it as GB. :-(
23:49 🔗 JAA We're at 0.9939 PiB currently.
23:49 🔗 IanQ the decimal data notations are a bit daft
23:50 🔗 JAA I mean, I don't really care which unit is used as long as it's used correctly.
23:51 🔗 IanQ I always thought PB = PiB was correct usage, ignoring ancient professors and confused millennials, and anyone with a business interest in selling inferior products/services
23:51 🔗 JAA Personally, I always use the binary prefixes, but that's mostly because it's unambiguous.
23:52 🔗 IanQ I'd like a decimal prefix to clarify the other direction, in the rare case I'd want base 10 binary
23:53 🔗 JAA Using "GB" to mean 2^30 bytes is wrong since giga- has been defined for ages to mean 10^9. But writing "GB" is unfortunately ambiguous because so many people use it incorrectly.
23:53 🔗 JAA When I write it, I always actually mean 10^9 bytes, but yeah, to avoid the ambiguity, I tend to use the other units instead.
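The arithmetic behind JAA's point can be checked with a short sketch. The constants follow the standard definitions (peta- is 10^15, pebi- is 2^50); the function names are made up for illustration. Under those definitions 0.9939 PiB works out to roughly 1.119 PB, which is consistent with having passed 1 PB but not yet 1 PiB.

```python
PB = 10**15   # petabyte, decimal (SI) prefix
PIB = 2**50   # pebibyte, binary (IEC) prefix
GB = 10**9    # gigabyte
GIB = 2**30   # gibibyte


def pib_to_pb(pib: float) -> float:
    """Convert pebibytes to petabytes."""
    return pib * PIB / PB


def pb_to_pib(pb: float) -> float:
    """Convert petabytes to pebibytes."""
    return pb * PB / PIB


if __name__ == "__main__":
    # The figure quoted in the log, expressed in decimal petabytes.
    print(f"0.9939 PiB = {pib_to_pb(0.9939):.4f} PB")
    # How far 1 PB is from 1 PiB.
    print(f"1 PB = {pb_to_pib(1):.4f} PiB")
    # A tracker value labelled GB but counted in GiB understates the
    # decimal figure by this factor.
    print(f"1 GiB = {GIB / GB:.6f} GB")
```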
23:53 🔗 IanQ I grew up when incorrect was considered correct
23:54 🔗 JAA Yeah, "considered" but still factually wrong. :-)
23:54 🔗 IanQ and correct is breaking a lot of ground truths
23:54 🔗 JAA It was wrong from the very start when some stoned engineer thought calling 1024 bytes a "kilobyte" was fine.
23:54 🔗 JAA Anyway, this doesn't belong in -bs anymore.
23:55 🔗 IanQ this is ot, oh, no it isn't, oops
23:56 🔗 IanQ topic line has #archiveteam-ot in it, misleading :-)
