[00:09] *** Stilett0 has joined #archiveteam-bs
[00:20] *** ola_norsk has joined #archiveteam-bs
[00:34] *** Valentine has quit IRC (Read error: Connection reset by peer)
[00:34] *** TheLovina has joined #archiveteam-bs
[00:39] *** Valentine has joined #archiveteam-bs
[00:39] *** K4k has joined #archiveteam-bs
[00:43] *** Stiletto has joined #archiveteam-bs
[00:43] *** Stilett0 has quit IRC (Ping timeout: 250 seconds)
[00:51] *** Valentine has quit IRC (Read error: Operation timed out)
[00:54] *** Stilett0 has joined #archiveteam-bs
[00:55] *** Valentine has joined #archiveteam-bs
[00:58] *** Stiletto has quit IRC (Ping timeout: 245 seconds)
[01:25] *** Stiletto has joined #archiveteam-bs
[01:26] *** Stilett0 has quit IRC (Ping timeout: 250 seconds)
[01:28] *** Dimtree has quit IRC (Peace)
[01:30] *** Stiletto has quit IRC (Read error: Operation timed out)
[01:30] lmao my local police shipped a replica police car of theirs to Prince George at Buckingham Palace because of his Christmas Wishlist
[01:32] *** Stilett0 has joined #archiveteam-bs
[01:33] 6,000 km away
[01:36] *** Dimtree has joined #archiveteam-bs
[02:03] *** Stilett0 has quit IRC (Ping timeout: 264 seconds)
[02:05] *** ola_norsk has quit IRC (Ping timeout: 480 seconds)
[02:10] *** kristian_ has joined #archiveteam-bs
[02:24] hi
[02:25] anyone got a Hathi account+
[02:25] ?
[02:44] CoolCanuk, how long have you been on archiveteam?
[02:46] 11 days
[02:47] xD
[02:47] sound
[02:47] hm?
[02:48] i'm still very much a noob
[02:53] *** schbirid has quit IRC (Ping timeout: 255 seconds)
[02:55] lol, https://i.mundus.xyz/PfWEYB.png
[02:56] Yep, that uptime is laughable
[02:56] *** K4k has quit IRC (Read error: Operation timed out)
[02:56] that's due to a shit host
[02:56] :p
[03:04] *** schbirid has joined #archiveteam-bs
[03:37] *** ld1 has quit IRC (Quit: ~)
[03:44] *** ld1 has joined #archiveteam-bs
[03:55] *** kristian_ has quit IRC (Quit: Leaving)
[03:57] *** Stilett0 has joined #archiveteam-bs
[04:08] *** qw3rty113 has joined #archiveteam-bs
[04:14] *** du_ has quit IRC (Ping timeout: 260 seconds)
[04:15] *** qw3rty112 has quit IRC (Read error: Operation timed out)
[04:18] *** Aerochrom has joined #archiveteam-bs
[05:12] *** Pixi` has joined #archiveteam-bs
[05:16] *** Pixi has quit IRC (Ping timeout: 255 seconds)
[05:20] *** Pixi` has quit IRC (Quit: Pixi`)
[05:21] *** Pixi has joined #archiveteam-bs
[06:09] *** wp494 has quit IRC (Read error: Operation timed out)
[06:09] *** wp494 has joined #archiveteam-bs
[06:22] *** K4k has joined #archiveteam-bs
[06:54] @mr_archiv: thanks for the edit. Sorry it took so long for you to get access.
[06:55] CoolCanuk, glad to help I really appreciate what you all do, I am tired of trying to visit websites and they are down and I find that there is no backup of the website.
[06:56] You're one of us who helps make it possible. Nice work with the vidme warrior!
[06:56] :)
[06:58] You too CoolCanuk I see you have done nearly 1TB of data. Are you running multiple servers?
[06:58] Running about 32 :)
[06:59] Google Compute Engine trials and DigitalOcean credit
[07:00] Might be about 20 running right now. Difficult to manage so many machines
[07:02] Thank you for pointing out these opportunities I will check them out.
[07:04] :) youre not forced to go all out. Run what you can/want to.
[07:04] Personally I wont put any money into archiving until I get a decent job :)
[07:06] I know I am not forced, it is free so why not do it?
[07:08] If you need help setting up Google Compute Engine, let me know.
[07:08] I will probably make a wiki tutorial within the next few days
[07:09] For DigitalOcean, i only used them because I got $50 free for being a student via github. I think you can use one of the many $10 codes on the web once you activate under billing. If you go over, it WILL charge your card.
[07:10] Google makes it very obvious when you're being charged, which is great.
[07:11] If I don't reply, i likely fell asleep. Cheers
[07:12] Goodnight CoolCanuk
[08:43] CoolCanuk: hey I used my github student digitalocean credit
[08:43] for the same thing :p
[09:05] *** icedice has joined #archiveteam-bs
[09:06] *** wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
[09:18] *** CoolCanuk has quit IRC (Quit: Connection closed for inactivity)
[09:46] *** ZexaronS has joined #archiveteam-bs
[10:10] *** icedice has quit IRC (Read error: Connection reset by peer)
[10:49] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[10:52] *** bwn has quit IRC (Read error: Operation timed out)
[10:53] *** bwn has joined #archiveteam-bs
[11:03] *** ivan has quit IRC (Read error: Operation timed out)
[11:03] *** marvinw has joined #archiveteam-bs
[11:42] *** phaedra has joined #archiveteam-bs
[11:42] *** phaedra has quit IRC (Client Quit)
[12:33] Ugh
[12:33] The wiki is dirt slow this morning
[12:34] s/morning/month/ :-/
[12:52] Toshiba will release a 14 TB drive in Q2 2018. PMR, helium-filled, nine (!) platters. Nice, this industry is in urgent need of competition.
[13:00] *** Specular has joined #archiveteam-bs
[13:01] *** kristian_ has joined #archiveteam-bs
[13:08] *** Mateon1 has quit IRC (Remote host closed the connection)
[13:09] *** Mateon1 has joined #archiveteam-bs
[13:21] I hate when sites don't even display *anything* without cookies.
[13:50] uploading koreanet-2 chuncheon pg_butitv https://archive.org/details/koreanet-2-chuncheon-pg_butitv-20040124
[13:58] *** SketchCow has quit IRC (Read error: Connection reset by peer)
[14:15] *** schbirid has quit IRC (Ping timeout: 255 seconds)
[14:27] *** schbirid has joined #archiveteam-bs
[14:37] *** SketchCow has joined #archiveteam-bs
[14:37] *** swebb sets mode: +o SketchCow
[14:44] *** SketchCow has quit IRC (Read error: Connection reset by peer)
[15:12] *** Mateon1 has quit IRC (Ping timeout: 260 seconds)
[15:13] *** Mateon1 has joined #archiveteam-bs
[15:40] *** kristian_ has quit IRC (Ping timeout: 360 seconds)
[15:45] so looks like i can put thread queue size at 50000
[15:46] i was fighting the larry sanders show final episode tape
[16:14] *** Aerochrom has quit IRC (Ping timeout: 248 seconds)
[16:15] *** Aerochrom has joined #archiveteam-bs
[16:18] *** kristian_ has joined #archiveteam-bs
[16:45] *** du_ has joined #archiveteam-bs
[16:46] *** CoolCanuk has joined #archiveteam-bs
[17:08] *** kristian_ has quit IRC (Quit: Leaving)
[17:24] i'm now doing farewell good brothers tape
[17:24] im wondering, does IA publish some sort of merkle root for the stuff they do have?
[17:25] from what i understand iabak is rather haphazard endeavor when its not clear what it is to be actually mirrored
[17:35] *** icedice has joined #archiveteam-bs
[17:45] so i have like 9 tapes left from jason scott
[17:45] or at least the ones i can capture
[17:53] *** rbraun has joined #archiveteam-bs
[18:09] *** ranavalon has quit IRC (Read error: Connection reset by peer)
[18:09] *** ranavalon has joined #archiveteam-bs
[18:31] *** icedice has quit IRC (Read error: Connection reset by peer)
[18:49] *** icedice has joined #archiveteam-bs
[19:26] *** dd0a13f37 has joined #archiveteam-bs
[19:26] Is there any way to check how completely archived a site is in wayback, save for manually clicking on links and calculating 404%?
[19:28] How well does wayback scrape and display interactive content? Does it execute JS that requires actively clicking on it?
[19:39] *** jschwart has joined #archiveteam-bs
[19:48] i think it scrapes only on html level. what js disabled browser wont see, ia wont see
[19:54] *** RichardG has quit IRC (Read error: Connection reset by peer)
[19:56] *** RichardG has joined #archiveteam-bs
[19:58] *** ola_norsk has joined #archiveteam-bs
[19:59] JAA: it was you mentioned another irc client than hexchat right? (hexchat is beginning to piss me off)
[20:01] *** wp494 has joined #archiveteam-bs
[20:01] JAA: if i remember incorrectly, i apologize
[20:03] CoolCanuk: the other day someone was talking about gamefaqs.com , is it going to shits that website?
[20:03] *** icedice has quit IRC (Ping timeout: 260 seconds)
[20:03] I couldnt find any info on it closing
[20:04] there was talk about it being ruined in 2015, but I think it was just someone's opinion (eg: same as Disney taking over club penguin)
[20:04] ola_norsk: Definitely possible. irssi ftw.
[20:04] ty
[20:05] hexchat seems to be an ass. Even if i add server, set it to default, it still doesn't store it :/
[20:05] its just bad ux
[20:06] you have to make sure to take away focus from the field or something
[20:06] so i'm now capturing the porn tapes
[20:06] i get it all the time when trying to add custom servers...
[20:06] godane: hf >:)
[20:08] ez: That depends. I don't know what the IA scrapes do, but if you archive something through the WM, it definitely also grabs some other stuff. Not everything though, since the WM doesn't (can't?) handle every request correctly.
[20:10] *** ola_norsk has quit IRC (Leaving)
[20:11] *** ola_norsk has joined #archiveteam-bs
[20:19] CoolCanuk: Oh. Either way, been looking into ways to get httrack to get ALL urls/links mentioned on *gamefaqs.com , basically a 'sitemap' list/log, but struggle with httrack filter rules correct https://www.httrack.com/html/filters.html
[20:19] Be advised httrack has a size limit
[20:20] CoolCanuk: could wget traverse it in same manner perhaps, without downloading anything?
[20:20] httrack also adds html comments to all pages unless you tell it not to
[20:20] i'm not sure :(
[20:20] CoolCanuk: httrack seems to be able to 'check links' only
[20:21] oh, a set of links? never tried that
[20:21] CoolCanuk: yeah, just log the links, never making any request other than to them..
[20:22] i'd look for sitemap generation software
[20:22] but it likely wouldnt get all links :(
[20:23] CoolCanuk: from what i can tell, with the correct arguments and 'scan rules' httrack is able to make sitemaps
[20:23] https://www.reddit.com/r/DataHoarder/comments/7iheqk/imgurnudes_archive_of_nudes_from_multiple/
[20:36] *** Stilett0 has quit IRC (Ping timeout: 260 seconds)
[20:37] https://www.youtube.com/watch?v=JucFpDhuF98
[20:38] *** marvinw is now known as ivan
[20:38] JAA: the problem with irssi is that i undoubtedly will enter in terminal commands to chats eventually :D
[20:39] *** Stilett0 has joined #archiveteam-bs
[20:48] adding e.g an 'Prank_Call_Nation_2016-06-21_Pack_thumb.png' to an item collection with that same name would make it the thumbnail, right?
[20:49] or does it need to be jpg
[21:03] JAA: WM?
[21:04] i thought ia has only a crawler, which doesnt seem to execute js
[21:18] *** Ravenloft has quit IRC (Read error: Operation timed out)
[21:19] so i can tell you guys this other porn tape is very dark in lighting
[21:23] *** SketchCow has joined #archiveteam-bs
[21:23] *** swebb sets mode: +o SketchCow
[21:47] *** BlueMaxim has joined #archiveteam-bs
[22:05] *** dd0a13f37 has quit IRC (Quit: Connection closed for inactivity)
[22:18] Lol
[22:22] *** ranavalon has quit IRC (Read error: Connection reset by peer)
[22:22] *** ranavalon has joined #archiveteam-bs
[22:27] !topic plz: SketchCow: your porn tapes are getting digitized right now
[22:27] ;D
[22:28] *** ranavalon has quit IRC (Read error: Connection reset by peer)
[22:28] !topic SketchCow: your porn tapes are getting digitized right now
[22:28] i can't change the topic
[22:28] only ops can :)
[22:29] but i thought that line was funny
[22:29] i think you can change the topic
[22:29] me too
[22:29] try using /topic instead of !topic
[22:29] every man's nightmare except for us weirdo archivers
[22:29] nah, my "!topic plz" was meant to be funny
[22:29] that doesn't work either
[22:29] topic can only be changed by ops
[22:30] after this tape i got 2 anime porn tapes
[22:33] ez: WM = Wayback Machine. I mean when you manually use "Save now!".
[22:33] Also, if you visit a grab with your browser later, it will try to load anything that hasn't been archived.
[22:34] *** ranavalon has joined #archiveteam-bs
[22:34] Well, if the browser requests it, which isn't always the case because of the previously mentioned incomplete JS processing in the WM.
[22:35] oh, didnt know theres 'save page now', for some reason that option is well hidden to casual user
[22:36] Not really. It's in the lower right corner on https://web.archive.org/, and if you try to access something that hasn't been archived, there's a big message "Help make the Wayback Machine more complete" or similar.
[22:36] But yeah, for links which are already in the archives, I don't think it's exposed.
[22:37] (You can manually trigger it by replacing /web/ in the URL with /save.)
[22:37] just going to /web exposes different "frontpage" i've never seen before where was the input box
[22:38] i really suspect IA web frontend as a whole seems to be deliberately obtuse to keep the riffraff out
[22:38] (in this case me)
[22:38] it's not deliberate
[22:38] but it has that effect regardless
[22:39] *** ranavalon has quit IRC (Read error: Connection reset by peer)
[22:40] *** ranavalon has joined #archiveteam-bs
[22:42] I map it to a browser keyword for manually archiving single pages. Saves a bit of clicking.
[23:00] *** qw3rty114 has joined #archiveteam-bs
[23:03] *** qw3rty113 has quit IRC (Read error: Operation timed out)
[23:09] *** qw3rty114 has quit IRC (Read error: Connection reset by peer)
[23:12] *** ld1 has quit IRC (Quit: ~)
[23:12] *** ld1 has joined #archiveteam-bs
[23:14] *** ld1 has quit IRC (Client Quit)
[23:16] *** qw3rty114 has joined #archiveteam-bs
[23:17] *** ld1 has joined #archiveteam-bs
[23:20] *** SketchCow changes topic to: Off-Topic and Lengthy Archive Team and Archive Discussions here | SketchCow: your porn tapes are getting digitized right now
[23:25] what the hell kind of pr0n would that be, that's yet to be digitized?? :/
[23:26] 'the man and the ant-hill' .. 'the stick and the maid' ... I'm worried
[23:29] * ola_norsk is reminded of the discussion in the movie Braindead
[23:32] *** Specular has quit IRC (Leaving)
[23:43] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
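(Editor's sketch, not part of the log: the 22:37 tip about swapping /web/ for /save could be scripted roughly as below. This assumes the save endpoint takes the original URL directly after /save/ and that the timestamp segment of a playback URL should be dropped; the function name to_save_url is hypothetical, and the code only builds a URL without making any request.)

```python
from urllib.parse import urlsplit

WAYBACK_HOST = "web.archive.org"


def to_save_url(url: str) -> str:
    """Build a Wayback Machine /save/ URL from a playback URL or a plain URL.

    Assumption: the endpoint has the form https://web.archive.org/save/<original URL>.
    """
    parts = urlsplit(url)
    target = url
    if parts.netloc == WAYBACK_HOST and parts.path.startswith("/web/"):
        # Playback URLs look like /web/<timestamp>/<original URL>.
        rest = url.split("/web/", 1)[1]
        timestamp, sep, original = rest.partition("/")
        # Drop the first segment only if it looks like a timestamp or "*".
        if sep and (timestamp[:4].isdigit() or timestamp == "*"):
            target = original
        else:
            target = rest
    return f"https://{WAYBACK_HOST}/save/{target}"


if __name__ == "__main__":
    print(to_save_url("https://web.archive.org/web/20171210120000/http://example.com/page"))
    print(to_save_url("http://example.com/"))
```

(Opening the resulting URL in a browser is what would trigger the capture; binding it to a browser keyword, as mentioned at 22:42, does the same substitution by hand.)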
[23:44] *** ld1 has quit IRC (Quit: ~)
[23:44] *** ld1 has joined #archiveteam-bs
[23:45] *** dashcloud has joined #archiveteam-bs
[23:49] *** Odd0002 has quit IRC (Quit: ZNC - http://znc.in)
[23:53] lol, there's some 'explisit' shit there https://archive.org/details/manga_library
[23:54] maybe IA should consider a 'rated:R' metadata :D
[23:56] ..or, maybe parents shouldn't let kids click links willynilly..
[23:57] after all, it's no worse than WW2 footage...