#archiveteam-bs 2017-12-08,Fri


Time Nickname Message
00:09 🔗 Stilett0 has joined #archiveteam-bs
00:20 🔗 ola_norsk has joined #archiveteam-bs
00:34 🔗 Valentine has quit IRC (Read error: Connection reset by peer)
00:34 🔗 TheLovina has joined #archiveteam-bs
00:39 🔗 Valentine has joined #archiveteam-bs
00:39 🔗 K4k has joined #archiveteam-bs
00:43 🔗 Stiletto has joined #archiveteam-bs
00:43 🔗 Stilett0 has quit IRC (Ping timeout: 250 seconds)
00:51 🔗 Valentine has quit IRC (Read error: Operation timed out)
00:54 🔗 Stilett0 has joined #archiveteam-bs
00:55 🔗 Valentine has joined #archiveteam-bs
00:58 🔗 Stiletto has quit IRC (Ping timeout: 245 seconds)
01:25 🔗 Stiletto has joined #archiveteam-bs
01:26 🔗 Stilett0 has quit IRC (Ping timeout: 250 seconds)
01:28 🔗 Dimtree has quit IRC (Peace)
01:30 🔗 Stiletto has quit IRC (Read error: Operation timed out)
01:30 🔗 CoolCanuk lmao my local police shipped a replica police car of theirs to Prince George at Buckingham Palace because of his Christmas Wishlist
01:32 🔗 Stilett0 has joined #archiveteam-bs
01:33 🔗 CoolCanuk 6,000 km away
01:36 🔗 Dimtree has joined #archiveteam-bs
02:03 🔗 Stilett0 has quit IRC (Ping timeout: 264 seconds)
02:05 🔗 ola_norsk has quit IRC (Ping timeout: 480 seconds)
02:10 🔗 kristian_ has joined #archiveteam-bs
02:24 🔗 kristian_ hi
02:25 🔗 kristian_ anyone got a Hathi account+
02:25 🔗 kristian_ ?
02:44 🔗 odemg CoolCanuk, how long have you been on archiveteam?
02:46 🔗 CoolCanuk 11 days
02:47 🔗 odemg xD
02:47 🔗 odemg sound
02:47 🔗 CoolCanuk hm?
02:48 🔗 CoolCanuk i'm still very much a noob
02:53 🔗 schbirid has quit IRC (Ping timeout: 255 seconds)
02:55 🔗 mundus lol, https://i.mundus.xyz/PfWEYB.png
02:56 🔗 vantec Yep, that uptime is laughable
02:56 🔗 K4k has quit IRC (Read error: Operation timed out)
02:56 🔗 mundus that's due to a shit host
02:56 🔗 mundus :p
03:04 🔗 schbirid has joined #archiveteam-bs
03:37 🔗 ld1 has quit IRC (Quit: ~)
03:44 🔗 ld1 has joined #archiveteam-bs
03:55 🔗 kristian_ has quit IRC (Quit: Leaving)
03:57 🔗 Stilett0 has joined #archiveteam-bs
04:08 🔗 qw3rty113 has joined #archiveteam-bs
04:14 🔗 du_ has quit IRC (Ping timeout: 260 seconds)
04:15 🔗 qw3rty112 has quit IRC (Read error: Operation timed out)
04:18 🔗 Aerochrom has joined #archiveteam-bs
05:12 🔗 Pixi` has joined #archiveteam-bs
05:16 🔗 Pixi has quit IRC (Ping timeout: 255 seconds)
05:20 🔗 Pixi` has quit IRC (Quit: Pixi`)
05:21 🔗 Pixi has joined #archiveteam-bs
06:09 🔗 wp494 has quit IRC (Read error: Operation timed out)
06:09 🔗 wp494 has joined #archiveteam-bs
06:22 🔗 K4k has joined #archiveteam-bs
06:54 🔗 CoolCanuk @mr_archiv: thanks for the edit. Sorry it took so long for you to get access.
06:55 🔗 mr_archiv CoolCanuk, glad to help. I really appreciate what you all do. I am tired of trying to visit websites that are down and finding that there is no backup of the website.
06:56 🔗 CoolCanuk You're one of us who helps make it possible. Nice work with the vidme warrior!
06:56 🔗 CoolCanuk :)
06:58 🔗 mr_archiv You too CoolCanuk I see you have done nearly 1TB of data. Are you running multiple servers?
06:58 🔗 CoolCanuk Running about 32 :)
06:59 🔗 CoolCanuk Google Compute Engine trials and DigitalOcean credit
07:00 🔗 CoolCanuk Might be about 20 running right now. Difficult to manage so many machines
07:02 🔗 mr_archiv Thank you for pointing out these opportunities I will check them out.
07:04 🔗 CoolCanuk :) youre not forced to go all out. Run what you can/want to.
07:04 🔗 CoolCanuk Personally I wont put any money into archiving until I get a decent job :)
07:06 🔗 mr_archiv I know I am not forced, it is free so why not do it?
07:08 🔗 CoolCanuk If you need help setting up Google Compute Engine, let me know.
07:08 🔗 CoolCanuk I will probably make a wiki tutorial within the next few days
07:09 🔗 CoolCanuk For DigitalOcean, i only used them because I got $50 free for being a student via github. I think you can use one of the many $10 codes on the web once you activate under billing. If you go over, it WILL charge your card.
07:10 🔗 CoolCanuk Google makes it very obvious when you're being charged, which is great.
07:11 🔗 CoolCanuk If I don't reply, i likely fell asleep. Cheers
07:12 🔗 mr_archiv Goodnight CoolCanuk
08:43 🔗 Frogging CoolCanuk: hey I used my github student digitalocean credit
08:43 🔗 Frogging for the same thing :p
09:05 🔗 icedice has joined #archiveteam-bs
09:06 🔗 wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
09:18 🔗 CoolCanuk has quit IRC (Quit: Connection closed for inactivity)
09:46 🔗 ZexaronS has joined #archiveteam-bs
10:10 🔗 icedice has quit IRC (Read error: Connection reset by peer)
10:49 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
10:52 🔗 bwn has quit IRC (Read error: Operation timed out)
10:53 🔗 bwn has joined #archiveteam-bs
11:03 🔗 ivan has quit IRC (Read error: Operation timed out)
11:03 🔗 marvinw has joined #archiveteam-bs
11:42 🔗 phaedra has joined #archiveteam-bs
11:42 🔗 phaedra has quit IRC (Client Quit)
12:33 🔗 jrwr Ugh
12:33 🔗 jrwr The wiki is dirt slow this morning
12:34 🔗 JAA s/morning/month/ :-/
12:52 🔗 JAA Toshiba will release a 14 TB drive in Q2 2018. PMR, helium-filled, nine (!) platters. Nice, this industry is in urgent need of competition.
13:00 🔗 Specular has joined #archiveteam-bs
13:01 🔗 kristian_ has joined #archiveteam-bs
13:08 🔗 Mateon1 has quit IRC (Remote host closed the connection)
13:09 🔗 Mateon1 has joined #archiveteam-bs
13:21 🔗 Specular I hate when sites don't even display *anything* without cookies.
13:50 🔗 godane uploading koreanet-2 chuncheon pg_butitv https://archive.org/details/koreanet-2-chuncheon-pg_butitv-20040124
13:58 🔗 SketchCow has quit IRC (Read error: Connection reset by peer)
14:15 🔗 schbirid has quit IRC (Ping timeout: 255 seconds)
14:27 🔗 schbirid has joined #archiveteam-bs
14:37 🔗 SketchCow has joined #archiveteam-bs
14:37 🔗 swebb sets mode: +o SketchCow
14:44 🔗 SketchCow has quit IRC (Read error: Connection reset by peer)
15:12 🔗 Mateon1 has quit IRC (Ping timeout: 260 seconds)
15:13 🔗 Mateon1 has joined #archiveteam-bs
15:40 🔗 kristian_ has quit IRC (Ping timeout: 360 seconds)
15:45 🔗 godane so looks like i can put thread queue size at 50000
15:46 🔗 godane i was fighting the larry sanders show final episode tape
16:14 🔗 Aerochrom has quit IRC (Ping timeout: 248 seconds)
16:15 🔗 Aerochrom has joined #archiveteam-bs
16:18 🔗 kristian_ has joined #archiveteam-bs
16:45 🔗 du_ has joined #archiveteam-bs
16:46 🔗 CoolCanuk has joined #archiveteam-bs
17:08 🔗 kristian_ has quit IRC (Quit: Leaving)
17:24 🔗 godane i'm now doing farewell good brothers tape
17:24 🔗 ez im wondering, does IA publish some sort of merkle root for the stuff they do have?
17:25 🔗 ez from what i understand iabak is rather haphazard endeavor when its not clear what it is to be actually mirrored
17:35 🔗 icedice has joined #archiveteam-bs
17:45 🔗 godane so i have like 9 tapes left from jason scott
17:45 🔗 godane or at least the ones i can capture
17:53 🔗 rbraun has joined #archiveteam-bs
18:09 🔗 ranavalon has quit IRC (Read error: Connection reset by peer)
18:09 🔗 ranavalon has joined #archiveteam-bs
18:31 🔗 icedice has quit IRC (Read error: Connection reset by peer)
18:49 🔗 icedice has joined #archiveteam-bs
19:26 🔗 dd0a13f37 has joined #archiveteam-bs
19:26 🔗 dd0a13f37 Is there any way to check how completely archived a site is in wayback, save for manually clicking on links and calculating 404%?
19:28 🔗 dd0a13f37 How well does wayback scrape and display interactive content? Does it execute JS that requires actively clicking on it?
19:39 🔗 jschwart has joined #archiveteam-bs
19:48 🔗 ez i think it scrapes only on html level. what js disabled browser wont see, ia wont see
19:54 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
19:56 🔗 RichardG has joined #archiveteam-bs
19:58 🔗 ola_norsk has joined #archiveteam-bs
19:59 🔗 ola_norsk JAA: it was you who mentioned another irc client than hexchat, right? (hexchat is beginning to piss me off)
20:01 🔗 wp494 has joined #archiveteam-bs
20:01 🔗 ola_norsk JAA: if i remember incorrectly, i apologize
20:03 🔗 ola_norsk CoolCanuk: the other day someone was talking about gamefaqs.com , is it going to shits that website?
20:03 🔗 icedice has quit IRC (Ping timeout: 260 seconds)
20:03 🔗 CoolCanuk I couldnt find any info on it closing
20:04 🔗 CoolCanuk there was talk about it being ruined in 2015, but I think it was just someone's opinion (eg: same as Disney taking over club penguin)
20:04 🔗 JAA ola_norsk: Definitely possible. irssi ftw.
20:04 🔗 ola_norsk ty
20:05 🔗 ola_norsk hexchat seems to be an ass. Even if i add server, set it to default, it still doesn't store it :/
20:05 🔗 schbirid its just bad ux
20:06 🔗 schbirid you have to make sure to take away focus from the field or something
20:06 🔗 godane so i'm now capturing the porn tapes
20:06 🔗 schbirid i get it all the time when trying to add custom servers...
20:06 🔗 schbirid godane: hf >:)
20:08 🔗 JAA ez: That depends. I don't know what the IA scrapes do, but if you archive something through the WM, it definitely also grabs some other stuff. Not everything though, since the WM doesn't (can't?) handle every request correctly.
20:10 🔗 ola_norsk has quit IRC (Leaving)
20:11 🔗 ola_norsk has joined #archiveteam-bs
20:19 🔗 ola_norsk CoolCanuk: Oh. Either way, been looking into ways to get httrack to get ALL urls/links mentioned on *gamefaqs.com , basically a 'sitemap' list/log, but struggle to get the httrack filter rules correct https://www.httrack.com/html/filters.html
20:19 🔗 CoolCanuk Be advised httrack has a size limit
20:20 🔗 ola_norsk CoolCanuk: could wget traverse it in same manner perhaps, without downloading anything?
20:20 🔗 CoolCanuk httrack also adds html comments to all pages unless you tell it not to
20:20 🔗 CoolCanuk i'm not sure :(
20:20 🔗 ola_norsk CoolCanuk: httrack seems to be able to 'check links' only
20:21 🔗 CoolCanuk oh, a set of links? never tried that
20:21 🔗 ola_norsk CoolCanuk: yeah, just log the links, never making any request other than to them..
20:22 🔗 CoolCanuk i'd look for sitemap generation software
20:22 🔗 CoolCanuk but it likely wouldnt get all links :(
20:23 🔗 ola_norsk CoolCanuk: from what i can tell, with the correct arguments and 'scan rules' httrack is able to make sitemaps
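The "just log the links" idea from the exchange above can be sketched in a few lines. This is a minimal illustration, not what httrack or wget do internally: it assumes the page's HTML has already been fetched by some other means, and the names `LinkCollector` and `same_host_links` are hypothetical helpers made up for this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.add(urljoin(self.base_url, value))


def same_host_links(html, base_url):
    """Return the sorted, de-duplicated links that stay on base_url's host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    return sorted(u for u in collector.links if urlparse(u).netloc == host)
```

Feeding each listed URL back through the same function would give a crude sitemap. Unlike a pure "check links" mode, this still requires fetching every HTML page once, and links that only appear after JavaScript runs would be missed.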
20:23 🔗 godane https://www.reddit.com/r/DataHoarder/comments/7iheqk/imgurnudes_archive_of_nudes_from_multiple/
20:36 🔗 Stilett0 has quit IRC (Ping timeout: 260 seconds)
20:37 🔗 godane https://www.youtube.com/watch?v=JucFpDhuF98
20:38 🔗 marvinw is now known as ivan
20:38 🔗 ola_norsk JAA: the problem with irssi is that i will undoubtedly enter terminal commands into chats eventually :D
20:39 🔗 Stilett0 has joined #archiveteam-bs
20:48 🔗 ola_norsk adding e.g an 'Prank_Call_Nation_2016-06-21_Pack_thumb.png' to an item collection with that same name would make it the thumbnail, right?
20:49 🔗 ola_norsk or does it need to be jpg
21:03 🔗 ez JAA: WM?
21:04 🔗 ez i thought ia has only a crawler, which doesnt seem to execute js
21:18 🔗 Ravenloft has quit IRC (Read error: Operation timed out)
21:19 🔗 godane so i can tell you guys this other porn tape is very dark in lighting
21:23 🔗 SketchCow has joined #archiveteam-bs
21:23 🔗 swebb sets mode: +o SketchCow
21:47 🔗 BlueMaxim has joined #archiveteam-bs
22:05 🔗 dd0a13f37 has quit IRC (Quit: Connection closed for inactivity)
22:18 🔗 CoolCanuk Lol
22:22 🔗 ranavalon has quit IRC (Read error: Connection reset by peer)
22:22 🔗 ranavalon has joined #archiveteam-bs
22:27 🔗 schbirid !topic plz: <godane> SketchCow: your porn tapes are getting digitized right now
22:27 🔗 schbirid ;D
22:28 🔗 ranavalon has quit IRC (Read error: Connection reset by peer)
22:28 🔗 godane !topic <godane> SketchCow: your porn tapes are getting digitized right now
22:28 🔗 godane i can't change the topic
22:28 🔗 schbirid only ops can :)
22:29 🔗 schbirid but i thought that line was funny
22:29 🔗 astrid i think you can change the topic
22:29 🔗 godane me too
22:29 🔗 astrid try using /topic instead of !topic
22:29 🔗 schbirid every man's nightmare except for us weirdo archivers
22:29 🔗 schbirid nah, my "!topic plz" was meant to be funny
22:29 🔗 godane that doesn't work either
22:29 🔗 schbirid topic can only be changed by ops
22:30 🔗 godane after this tape i got 2 anime porn tapes
22:33 🔗 JAA ez: WM = Wayback Machine. I mean when you manually use "Save now!".
22:33 🔗 JAA Also, if you visit a grab with your browser later, it will try to load anything that hasn't been archived.
22:34 🔗 ranavalon has joined #archiveteam-bs
22:34 🔗 JAA Well, if the browser requests it, which isn't always the case because of the previously mentioned incomplete JS processing in the WM.
22:35 🔗 ez oh, didnt know theres 'save page now', for some reason that option is well hidden to casual user
22:36 🔗 JAA Not really. It's in the lower right corner on https://web.archive.org/, and if you try to access something that hasn't been archived, there's a big message "Help make the Wayback Machine more complete" or similar.
22:36 🔗 JAA But yeah, for links which are already in the archives, I don't think it's exposed.
22:37 🔗 JAA (You can manually trigger it by replacing /web/<date> in the URL with /save.)
22:37 🔗 ez just going to /web exposes different "frontpage" i've never seen before where was the input box
22:38 🔗 ez i really suspect IA web frontend as a whole seems to be deliberately obtuse to keep the riffraff out
22:38 🔗 ez (in this case me)
22:38 🔗 astrid it's not deliberate
22:38 🔗 astrid but it has that effect regardless
22:39 🔗 ranavalon has quit IRC (Read error: Connection reset by peer)
22:40 🔗 ranavalon has joined #archiveteam-bs
22:42 🔗 Specular I map it to a browser keyword for manually archiving single pages. Saves a bit of clicking.
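The URL rewrite JAA describes above (replacing /web/<date> with /save) is easy to script. The sketch below only builds the URL and makes no request; the function name `to_save_url` and the regular expression are assumptions made for illustration, not part of any Internet Archive tooling.

```python
import re

# Matches a Wayback Machine capture URL such as
# https://web.archive.org/web/20171208000000/http://example.com/
# The <date> part is taken to be whatever sits between /web/ and the next slash.
_CAPTURE = re.compile(r"^(https?://web\.archive\.org)/web/[^/]+/(.+)$")


def to_save_url(capture_url):
    """Rewrite /web/<date>/<target> into /save/<target>."""
    match = _CAPTURE.match(capture_url)
    if match is None:
        raise ValueError("not a Wayback Machine capture URL: " + capture_url)
    return match.group(1) + "/save/" + match.group(2)
```

Opening the resulting URL in a browser is what triggers the capture, which is presumably what a browser keyword like the one Specular mentions does in a single step.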
23:00 🔗 qw3rty114 has joined #archiveteam-bs
23:03 🔗 qw3rty113 has quit IRC (Read error: Operation timed out)
23:09 🔗 qw3rty114 has quit IRC (Read error: Connection reset by peer)
23:12 🔗 ld1 has quit IRC (Quit: ~)
23:12 🔗 ld1 has joined #archiveteam-bs
23:14 🔗 ld1 has quit IRC (Client Quit)
23:16 🔗 qw3rty114 has joined #archiveteam-bs
23:17 🔗 ld1 has joined #archiveteam-bs
23:20 🔗 SketchCow changes topic to: Off-Topic and Lengthy Archive Team and Archive Discussions here | <godane> SketchCow: your porn tapes are getting digitized right now
23:25 🔗 ola_norsk what the hell kind of pr0n would that be, that's yet to be digitized?? :/
23:26 🔗 ola_norsk 'the man and the ant-hill' .. 'the stick and the maid' ... I'm worried
23:29 🔗 * ola_norsk is reminded of the discussion in the movie Braindead
23:32 🔗 Specular has quit IRC (Leaving)
23:43 🔗 dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
23:44 🔗 ld1 has quit IRC (Quit: ~)
23:44 🔗 ld1 has joined #archiveteam-bs
23:45 🔗 dashcloud has joined #archiveteam-bs
23:49 🔗 Odd0002 has quit IRC (Quit: ZNC - http://znc.in)
23:53 🔗 ola_norsk lol, there's some 'explisit' shit there https://archive.org/details/manga_library
23:54 🔗 ola_norsk maybe IA should consider a 'rated:R' metadata :D
23:56 🔗 ola_norsk ..or, maybe parents shouldn't let kids click links willynilly..
23:57 🔗 ola_norsk after all, it's no worse than WW2 footage...
