#archiveteam-bs 2017-12-08,Fri


Who What When
*** Stilett0 has joined #archiveteam-bs [00:09]
ola_norsk has joined #archiveteam-bs [00:20]
Valentine has quit IRC (Read error: Connection reset by peer)
TheLovina has joined #archiveteam-bs
[00:34]
Valentine has joined #archiveteam-bs
K4k has joined #archiveteam-bs
Stiletto has joined #archiveteam-bs
Stilett0 has quit IRC (Ping timeout: 250 seconds)
[00:39]
Valentine has quit IRC (Read error: Operation timed out)
Stilett0 has joined #archiveteam-bs
Valentine has joined #archiveteam-bs
Stiletto has quit IRC (Ping timeout: 245 seconds)
[00:51]
...... (idle for 27mn)
Stiletto has joined #archiveteam-bs
Stilett0 has quit IRC (Ping timeout: 250 seconds)
Dimtree has quit IRC (Peace)
Stiletto has quit IRC (Read error: Operation timed out)
[01:25]
<CoolCanuk> lmao my local police shipped a replica police car of theirs to Prince George at Buckingham Palace because of his Christmas Wishlist [01:30]
*** Stilett0 has joined #archiveteam-bs [01:32]
<CoolCanuk> 6,000 km away [01:33]
*** Dimtree has joined #archiveteam-bs [01:36]
...... (idle for 27mn)
Stilett0 has quit IRC (Ping timeout: 264 seconds)
ola_norsk has quit IRC (Ping timeout: 480 seconds)
[02:03]
kristian_ has joined #archiveteam-bs [02:10]
<kristian_> hi
anyone got a Hathi account+
?
[02:24]
.... (idle for 19mn)
<odemg> CoolCanuk, how long have you been on archiveteam? [02:44]
<CoolCanuk> 11 days [02:46]
<odemg> xD
sound
[02:47]
<CoolCanuk> hm?
i'm still very much a noob
[02:47]
*** schbirid has quit IRC (Ping timeout: 255 seconds) [02:53]
<mundus> lol, https://i.mundus.xyz/PfWEYB.png [02:55]
<vantec> Yep, that uptime is laughable [02:56]
*** K4k has quit IRC (Read error: Operation timed out) [02:56]
<mundus> that's due to a shit host
:p
[02:56]
*** schbirid has joined #archiveteam-bs [03:04]
....... (idle for 33mn)
ld1 has quit IRC (Quit: ~) [03:37]
ld1 has joined #archiveteam-bs [03:44]
kristian_ has quit IRC (Quit: Leaving)
Stilett0 has joined #archiveteam-bs
[03:55]
qw3rty113 has joined #archiveteam-bs [04:08]
du_ has quit IRC (Ping timeout: 260 seconds)
qw3rty112 has quit IRC (Read error: Operation timed out)
Aerochrom has joined #archiveteam-bs
[04:14]
........... (idle for 54mn)
Pixi` has joined #archiveteam-bs
Pixi has quit IRC (Ping timeout: 255 seconds)
Pixi` has quit IRC (Quit: Pixi`)
Pixi has joined #archiveteam-bs
[05:12]
.......... (idle for 48mn)
wp494 has quit IRC (Read error: Operation timed out)
wp494 has joined #archiveteam-bs
[06:09]
K4k has joined #archiveteam-bs [06:22]
....... (idle for 32mn)
<CoolCanuk> @mr_archiv: thanks for the edit. Sorry it took so long for you to get access. [06:54]
<mr_archiv> CoolCanuk, glad to help. I really appreciate what you all do. I am tired of trying to visit websites and finding that they are down and that there is no backup of the website. [06:55]
<CoolCanuk> You're one of us who helps make it possible. Nice work with the vidme warrior!
:)
[06:56]
<mr_archiv> You too CoolCanuk, I see you have done nearly 1TB of data. Are you running multiple servers? [06:58]
<CoolCanuk> Running about 32 :)
Google Compute Engine trials and DigitalOcean credit
Might be about 20 running right now. Difficult to manage so many machines
[06:58]
<mr_archiv> Thank you for pointing out these opportunities, I will check them out. [07:02]
<CoolCanuk> :) youre not forced to go all out. Run what you can/want to.
Personally I wont put any money into archiving until I get a decent job :)
[07:04]
<mr_archiv> I know I am not forced, it is free so why not do it? [07:06]
<CoolCanuk> If you need help setting up Google Compute Engine, let me know.
I will probably make a wiki tutorial within the next few days
For DigitalOcean, i only used them because I got $50 free for being a student via github. I think you can use one of the many $10 codes on the web once you activate under billing. If you go over, it WILL charge your card.
Google makes it very obvious when you're being charged, which is great.
If I don't reply, i likely fell asleep. Cheers
[07:08]
<mr_archiv> Goodnight CoolCanuk [07:12]
................... (idle for 1h31mn)
<Frogging> CoolCanuk: hey I used my github student digitalocean credit
for the same thing :p
[08:43]
..... (idle for 22mn)
*** icedice has joined #archiveteam-bs
wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
[09:05]
CoolCanuk has quit IRC (Quit: Connection closed for inactivity) [09:18]
...... (idle for 28mn)
ZexaronS has joined #archiveteam-bs [09:46]
..... (idle for 24mn)
icedice has quit IRC (Read error: Connection reset by peer) [10:10]
........ (idle for 39mn)
BlueMaxim has quit IRC (Read error: Connection reset by peer)
bwn has quit IRC (Read error: Operation timed out)
bwn has joined #archiveteam-bs
[10:49]
ivan has quit IRC (Read error: Operation timed out)
marvinw has joined #archiveteam-bs
[11:03]
........ (idle for 39mn)
phaedra has joined #archiveteam-bs
phaedra has quit IRC (Client Quit)
[11:42]
........... (idle for 51mn)
<jrwr> Ugh
The wiki is dirt slow this morning
[12:33]
<JAA> s/morning/month/ :-/ [12:34]
.... (idle for 18mn)
Toshiba will release a 14 TB drive in Q2 2018. PMR, helium-filled, nine (!) platters. Nice, this industry is in urgent need of competition. [12:52]
*** Specular has joined #archiveteam-bs
kristian_ has joined #archiveteam-bs
[13:00]
Mateon1 has quit IRC (Remote host closed the connection)
Mateon1 has joined #archiveteam-bs
[13:08]
<Specular> I hate when sites don't even display *anything* without cookies. [13:21]
...... (idle for 29mn)
<godane> uploading koreanet-2 chuncheon pg_butitv https://archive.org/details/koreanet-2-chuncheon-pg_butitv-20040124 [13:50]
*** SketchCow has quit IRC (Read error: Connection reset by peer) [13:58]
.... (idle for 17mn)
schbirid has quit IRC (Ping timeout: 255 seconds) [14:15]
schbirid has joined #archiveteam-bs [14:27]
SketchCow has joined #archiveteam-bs
swebb sets mode: +o SketchCow
[14:37]
SketchCow has quit IRC (Read error: Connection reset by peer) [14:44]
...... (idle for 28mn)
Mateon1 has quit IRC (Ping timeout: 260 seconds)
Mateon1 has joined #archiveteam-bs
[15:12]
...... (idle for 27mn)
kristian_ has quit IRC (Ping timeout: 360 seconds) [15:40]
<godane> so looks like i can put thread queue size at 50000
i was fighting the larry sanders show final episode tape
[15:45]
...... (idle for 28mn)
*** Aerochrom has quit IRC (Ping timeout: 248 seconds)
Aerochrom has joined #archiveteam-bs
kristian_ has joined #archiveteam-bs
[16:14]
...... (idle for 27mn)
du_ has joined #archiveteam-bs
CoolCanuk has joined #archiveteam-bs
[16:45]
..... (idle for 22mn)
kristian_ has quit IRC (Quit: Leaving) [17:08]
.... (idle for 16mn)
<godane> i'm now doing farewell good brothers tape [17:24]
<ez> im wondering, does IA publish some sort of merkle root for the stuff they do have?
from what i understand iabak is a rather haphazard endeavor when its not clear what is actually to be mirrored
[17:24]
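For readers unfamiliar with the term ez uses, here is a minimal sketch of how a Merkle root could be computed over per-file hashes. It assumes SHA-256 and the common "duplicate the last node on odd levels" padding rule. Nothing in this log says IA publishes such a value, and `file_leaf` and `merkle_root` are hypothetical names chosen for illustration.

```python
import hashlib


def file_leaf(content: bytes) -> str:
    """Hash one file's content into a hex-encoded leaf (hypothetical helper)."""
    return hashlib.sha256(content).hexdigest()


def merkle_root(leaf_hashes):
    """Return the hex Merkle root of a list of hex-encoded SHA-256 leaves."""
    if not leaf_hashes:
        raise ValueError("need at least one leaf hash")
    level = [bytes.fromhex(h) for h in leaf_hashes]
    while len(level) > 1:
        # Assumption: pad odd-sized levels by duplicating the last node.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Hash each adjacent pair to build the next level up.
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

A mirror holding the same files in the same order would arrive at the same root, so one short value would be enough to compare two copies of a collection.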
*** icedice has joined #archiveteam-bs [17:35]
<godane> so i have like 9 tapes left from jason scott
or at least the ones i can capture
[17:45]
*** rbraun has joined #archiveteam-bs [17:53]
.... (idle for 16mn)
ranavalon has quit IRC (Read error: Connection reset by peer)
ranavalon has joined #archiveteam-bs
[18:09]
..... (idle for 22mn)
icedice has quit IRC (Read error: Connection reset by peer) [18:31]
.... (idle for 18mn)
icedice has joined #archiveteam-bs [18:49]
........ (idle for 37mn)
dd0a13f37 has joined #archiveteam-bs [19:26]
<dd0a13f37> Is there any way to check how completely archived a site is in wayback, save for manually clicking on links and calculating 404%?
How well does wayback scrape and display interactive content? Does it execute JS that requires actively clicking on it?
[19:26]
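The "404%" dd0a13f37 mentions can be written down as a small calculation. This is a sketch only: it assumes you have already collected an HTTP status code for the archived copy of each link by some means, and it says nothing about how the Wayback Machine itself reports coverage. `missing_percentage` is a hypothetical name.

```python
def missing_percentage(statuses):
    """Percentage of checked links whose archived copy returned 404.

    `statuses` maps each checked URL to the HTTP status code observed
    for its archived copy (how those codes are gathered is left open).
    """
    if not statuses:
        raise ValueError("no links checked")
    missing = sum(1 for code in statuses.values() if code == 404)
    return 100.0 * missing / len(statuses)
```

The figure only describes the links that were actually checked, so a small sample gives a rough estimate at best.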
*** jschwart has joined #archiveteam-bs [19:39]
<ez> i think it scrapes only on html level. what js disabled browser wont see, ia wont see [19:48]
*** RichardG has quit IRC (Read error: Connection reset by peer)
RichardG has joined #archiveteam-bs
ola_norsk has joined #archiveteam-bs
[19:54]
<ola_norsk> JAA: it was you who mentioned another irc client than hexchat, right? (hexchat is beginning to piss me off) [19:59]
*** wp494 has joined #archiveteam-bs [20:01]
<ola_norsk> JAA: if i remember incorrectly, i apologize
CoolCanuk: the other day someone was talking about gamefaqs.com , is that website going to shit?
[20:01]
*** icedice has quit IRC (Ping timeout: 260 seconds) [20:03]
<CoolCanuk> I couldnt find any info on it closing
there was talk about it being ruined in 2015, but I think it was just someone's opinion (eg: same as Disney taking over club penguin)
[20:03]
<JAA> ola_norsk: Definitely possible. irssi ftw. [20:04]
<ola_norsk> ty
hexchat seems to be an ass. Even if i add server, set it to default, it still doesn't store it :/
[20:04]
<schbirid> its just bad ux
you have to make sure to take away focus from the field or something
[20:05]
<godane> so i'm now capturing the porn tapes [20:06]
<schbirid> i get it all the time when trying to add custom servers...
godane: hf >:)
[20:06]
<JAA> ez: That depends. I don't know what the IA scrapes do, but if you archive something through the WM, it definitely also grabs some other stuff. Not everything though, since the WM doesn't (can't?) handle every request correctly. [20:08]
*** ola_norsk has quit IRC (Leaving)
ola_norsk has joined #archiveteam-bs
[20:10]
<ola_norsk> CoolCanuk: Oh. Either way, been looking into ways to get httrack to get ALL urls/links mentioned on *gamefaqs.com , basically a 'sitemap' list/log, but struggle to get the httrack filter rules correct https://www.httrack.com/html/filters.html [20:19]
<CoolCanuk> Be advised httrack has a size limit [20:19]
<ola_norsk> CoolCanuk: could wget traverse it in same manner perhaps, without downloading anything? [20:20]
<CoolCanuk> httrack also adds html comments to all pages unless you tell it not to
i'm not sure :(
[20:20]
<ola_norsk> CoolCanuk: httrack seems to be able to 'check links' only [20:20]
<CoolCanuk> oh, a set of links? never tried that [20:21]
<ola_norsk> CoolCanuk: yeah, just log the links, never making any request other than to them.. [20:21]
<CoolCanuk> i'd look for sitemap generation software
but it likely wouldnt get all links :(
[20:22]
<ola_norsk> CoolCanuk: from what i can tell, with the correct arguments and 'scan rules' httrack is able to make sitemaps [20:23]
<godane> https://www.reddit.com/r/DataHoarder/comments/7iheqk/imgurnudes_archive_of_nudes_from_multiple/ [20:23]
*** Stilett0 has quit IRC (Ping timeout: 260 seconds) [20:36]
<godane> https://www.youtube.com/watch?v=JucFpDhuF98 [20:37]
*** marvinw is now known as ivan [20:38]
<ola_norsk> JAA: the problem with irssi is that i undoubtedly will enter terminal commands into chats eventually :D [20:38]
*** Stilett0 has joined #archiveteam-bs [20:39]
<ola_norsk> adding e.g. a 'Prank_Call_Nation_2016-06-21_Pack_thumb.png' to an item collection with that same name would make it the thumbnail, right?
or does it need to be jpg
[20:48]
<ez> JAA: WM?
i thought ia has only a crawler, which doesnt seem to execute js
[21:03]
*** Ravenloft has quit IRC (Read error: Operation timed out) [21:18]
<godane> so i can tell you guys this other porn tape is very dark in lighting [21:19]
*** SketchCow has joined #archiveteam-bs
swebb sets mode: +o SketchCow
[21:23]
..... (idle for 24mn)
BlueMaxim has joined #archiveteam-bs [21:47]
.... (idle for 18mn)
dd0a13f37 has quit IRC (Quit: Connection closed for inactivity) [22:05]
<CoolCanuk> Lol [22:18]
*** ranavalon has quit IRC (Read error: Connection reset by peer)
ranavalon has joined #archiveteam-bs
[22:22]
<schbirid> !topic plz: <godane> SketchCow: your porn tapes are getting digitized right now
;D
[22:27]
*** ranavalon has quit IRC (Read error: Connection reset by peer) [22:28]
<godane> !topic <godane> SketchCow: your porn tapes are getting digitized right now
i can't change the topic
[22:28]
<schbirid> only ops can :)
but i thought that line was funny
[22:28]
<astrid> i think you can change the topic [22:29]
<godane> me too [22:29]
<astrid> try using /topic instead of !topic [22:29]
<schbirid> every man's nightmare except for us weirdo archivers
nah, my "!topic plz" was meant to be funny
[22:29]
<godane> that doesn't work either [22:29]
<schbirid> topic can only be changed by ops [22:29]
<godane> after this tape i got 2 anime porn tapes [22:30]
<JAA> ez: WM = Wayback Machine. I mean when you manually use "Save now!".
Also, if you visit a grab with your browser later, it will try to load anything that hasn't been archived.
[22:33]
*** ranavalon has joined #archiveteam-bs [22:34]
<JAA> Well, if the browser requests it, which isn't always the case because of the previously mentioned incomplete JS processing in the WM. [22:34]
<ez> oh, didnt know theres 'save page now', for some reason that option is well hidden to casual user [22:35]
<JAA> Not really. It's in the lower right corner on https://web.archive.org/, and if you try to access something that hasn't been archived, there's a big message "Help make the Wayback Machine more complete" or similar.
But yeah, for links which are already in the archives, I don't think it's exposed.
(You can manually trigger it by replacing /web/<date> in the URL with /save.)
[22:36]
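JAA's URL trick can be sketched as a small string rewrite. This only builds the `/save` URL from a snapshot URL as described in the message above; it makes no request, and whether the endpoint behaves the same way today is an assumption. `to_save_url` is a hypothetical helper name.

```python
import re

# Matches a Wayback snapshot URL: <host>/web/<date>/<original URL>.
_SNAPSHOT = re.compile(r"^(https?://web\.archive\.org)/web/[^/]+/(.+)$")


def to_save_url(wayback_url: str) -> str:
    """Replace the /web/<date> part of a snapshot URL with /save."""
    match = _SNAPSHOT.match(wayback_url)
    if match is None:
        raise ValueError("not a Wayback Machine snapshot URL")
    host, original = match.groups()
    return f"{host}/save/{original}"
```

For example, `to_save_url("https://web.archive.org/web/20171208000000/http://example.com/page")` gives `https://web.archive.org/save/http://example.com/page`, which is the kind of URL Specular describes binding to a browser keyword.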
<ez> just going to /web exposes different "frontpage" i've never seen before where was the input box
i really suspect IA web frontend as a whole seems to be deliberately obtuse to keep the riffraff out
(in this case me)
[22:37]
<astrid> it's not deliberate
but it has that effect regardless
[22:38]
*** ranavalon has quit IRC (Read error: Connection reset by peer)
ranavalon has joined #archiveteam-bs
[22:39]
<Specular> I map it to a browser keyword for manually archiving single pages. Saves a bit of clicking. [22:42]
.... (idle for 18mn)
*** qw3rty114 has joined #archiveteam-bs
qw3rty113 has quit IRC (Read error: Operation timed out)
[23:00]
qw3rty114 has quit IRC (Read error: Connection reset by peer)
ld1 has quit IRC (Quit: ~)
ld1 has joined #archiveteam-bs
ld1 has quit IRC (Client Quit)
qw3rty114 has joined #archiveteam-bs
ld1 has joined #archiveteam-bs
SketchCow changes topic to: Off-Topic and Lengthy Archive Team and Archive Discussions here | <godane> SketchCow: your porn tapes are getting digitized right now
[23:09]
<ola_norsk> what the hell kind of pr0n would that be, that's yet to be digitized?? :/
'the man and the ant-hill' .. 'the stick and the maid' ... I'm worried
ola_norsk is reminded of the discussion in the movie Braindead
[23:25]
*** Specular has quit IRC (Leaving) [23:32]
dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
ld1 has quit IRC (Quit: ~)
ld1 has joined #archiveteam-bs
dashcloud has joined #archiveteam-bs
Odd0002 has quit IRC (Quit: ZNC - http://znc.in)
[23:43]
<ola_norsk> lol, there's some 'explicit' shit there https://archive.org/details/manga_library
maybe IA should consider a 'rated:R' metadata :D
..or, maybe parents shouldn't let kids click links willynilly..
after all, it's no worse than WW2 footage...
[23:53]
