[00:10] *** Stilett0 has joined #archiveteam-bs
[00:12] *** julientm has joined #archiveteam-bs
[00:13] *** Stiletto has quit IRC (Read error: Operation timed out)
[00:14] *** Stilett0 is now known as Stiletto
[00:37] *** second has quit IRC (Remote host closed the connection)
[00:50] *** julientm has quit IRC (Ping timeout: 252 seconds)
[00:55] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[01:01] *** bitBaron has joined #archiveteam-bs
[01:13] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[01:18] *** Odd0002_ has joined #archiveteam-bs
[01:19] *** bitBaron has joined #archiveteam-bs
[01:23] *** Odd0002 has quit IRC (Ping timeout: 615 seconds)
[01:23] *** Odd0002_ is now known as Odd0002
[01:31] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[01:36] *** julientm has joined #archiveteam-bs
[01:39] *** Odd0002_ has joined #archiveteam-bs
[01:42] *** Odd0002 has quit IRC (Ping timeout: 600 seconds)
[01:42] *** Odd0002_ is now known as Odd0002
[01:48] *** bitBaron has joined #archiveteam-bs
[01:48] *** Odd0002_ has joined #archiveteam-bs
[01:53] *** Odd0002 has quit IRC (Ping timeout: 600 seconds)
[01:53] *** Odd0002_ is now known as Odd0002
[02:06] *** SimpBrain has quit IRC (Read error: Operation timed out)
[02:13] *** SimpBrain has joined #archiveteam-bs
[02:17] *** second has joined #archiveteam-bs
[02:31] *** julientm has quit IRC (Leaving)
[02:35] *** ndiddy has quit IRC ()
[02:41] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[02:59] *** julientm has joined #archiveteam-bs
[03:05] *** icedice has quit IRC (Read error: Operation timed out)
[03:08] *** Stiletto has quit IRC (Read error: Connection reset by peer)
[03:08] *** Stiletto has joined #archiveteam-bs
[03:13] *** marked has quit IRC (west.us.hub irc.Prison.NET)
[03:13] *** achip has quit IRC (west.us.hub irc.Prison.NET)
[03:13] *** SynMonger has quit IRC (west.us.hub irc.Prison.NET)
[03:16] *** synm0nger has joined #archiveteam-bs
[03:27] *** Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
[03:32] *** achip has joined #archiveteam-bs
[03:32] *** marked has joined #archiveteam-bs
[03:36] *** znak has quit IRC (Read error: Operation timed out)
[03:36] *** wabu has quit IRC (Read error: Operation timed out)
[03:37] *** Polylith_ has quit IRC (Read error: Operation timed out)
[03:37] *** simon816 has quit IRC (Ping timeout: 246 seconds)
[03:37] *** c4rc4s has quit IRC (Read error: Operation timed out)
[03:38] *** ivan has quit IRC (Ping timeout: 246 seconds)
[03:38] *** swebb_ has joined #archiveteam-bs
[03:38] *** swebb has quit IRC (Ping timeout: 246 seconds)
[03:38] *** JAA has quit IRC (Ping timeout: 246 seconds)
[03:38] *** K4k__ has quit IRC (Ping timeout: 246 seconds)
[03:38] *** svchfoo1 has quit IRC (Ping timeout: 246 seconds)
[03:39] *** swebb_ is now known as swebb
[03:39] *** colona has quit IRC (Ping timeout: 246 seconds)
[03:39] *** betamax has quit IRC (Ping timeout: 246 seconds)
[03:39] *** sknebel has quit IRC (Ping timeout: 246 seconds)
[03:39] *** joepie91 has quit IRC (Ping timeout: 246 seconds)
[03:39] *** TC01 has quit IRC (Ping timeout: 246 seconds)
[03:39] *** sknebel has joined #archiveteam-bs
[03:39] *** TC01 has joined #archiveteam-bs
[03:40] *** ivan has joined #archiveteam-bs
[03:41] *** betamax has joined #archiveteam-bs
[03:41] *** colona has joined #archiveteam-bs
[03:46] *** Polylith has joined #archiveteam-bs
[03:48] *** joepie91 has joined #archiveteam-bs
[03:50] *** Despatche has quit IRC (Quit: Connection reset by deer)
[03:50] *** wyatt8740 has joined #archiveteam-bs
[03:52] *** K4k__ has joined #archiveteam-bs
[03:58] *** znak has joined #archiveteam-bs
[04:20] *** julientm has quit IRC (Remote host closed the connection)
[04:28] *** Binzhou5 has joined #archiveteam-bs
[04:31] *** qw3rty113 has joined #archiveteam-bs
[04:35] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[04:35] *** SimpBrain has joined #archiveteam-bs
[04:36] *** c4rc4s has joined #archiveteam-bs
[04:36] *** simon816 has joined #archiveteam-bs
[04:37] *** svchfoo1 has joined #archiveteam-bs
[04:37] *** qw3rty112 has quit IRC (Ping timeout: 600 seconds)
[04:38] *** JAA has joined #archiveteam-bs
[04:38] *** bakJAA sets mode: +o JAA
[04:41] *** wabu has joined #archiveteam-bs
[04:48] *** odemgi has joined #archiveteam-bs
[04:50] *** odemgi_ has quit IRC (Ping timeout: 252 seconds)
[04:52] *** powerKitt has joined #archiveteam-bs
[04:52] How would I use Wikiteam to dump a wiki with $wgEnableAPI=false; set?
[04:53] powerKitt: By using Special:Export?
[04:54] no Special:Export
[04:56] https://ggwiki.deepfreeze.it/index.php?title=Special:Export Trying to dump the GamerGate wiki but it's really locked down for some reason
[04:56] *** odemg has quit IRC (Ping timeout: 615 seconds)
[04:57] (I don't agree with #GamerGate, but I think dumping it would be useful for future historians trying to understand what happened)
[05:03] *** odemg has joined #archiveteam-bs
[05:05] We are featured on https://youtu.be/FeAMpG4KbEc.
[05:05] To back up Google+.
[05:05] It's the last part of the video.
[05:09] https://archive.org/details/youtube-FeAMpG4KbEc tubeup mirror
[05:17] *** dhyan_nat has joined #archiveteam-bs
[05:18] *** Binzhou5 has quit IRC (Quit: Page closed)
[05:24] *** powerKitt has quit IRC (Quit: Page closed)
[06:17] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[06:20] *** SimpBrain has joined #archiveteam-bs
[06:42] *** wp494 has quit IRC (Read error: Operation timed out)
[06:43] *** wp494 has joined #archiveteam-bs
[07:00] *** logchfoo4 starts logging #archiveteam-bs at Tue Mar 19 07:00:51 2019
[07:00] *** logchfoo4 has joined #archiveteam-bs
[07:01] *** atbk_ has joined #archiveteam-bs
[07:01] *** LordNigh2 has joined #archiveteam-bs
[07:02] *** atbk has quit IRC (Ping timeout: 615 seconds)
[07:02] *** Laverne has joined #archiveteam-bs
[07:02] *** kiskabak has joined #archiveteam-bs
[07:02] *** xoxo has joined #archiveteam-bs
[07:02] *** Kaz has joined #archiveteam-bs
[07:02] *** efnet.portlane.se sets mode: +o Kaz
[07:03] *** Gfy has joined #archiveteam-bs
[07:09] *** underscor has joined #archiveteam-bs
[07:12] *** C4K3_ has joined #archiveteam-bs
[07:15] *** LordNigh2 is now known as Lord_Nigh
[07:48] *** SimpBrain has quit IRC (Remote host closed the connection)
[07:53] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[07:55] *** SimpBrain has joined #archiveteam-bs
[08:39] *** svchfoo1 has quit IRC (Read error: Operation timed out)
[08:40] *** logchfoo4 has quit IRC (Ping timeout: 246 seconds)
[08:41] *** logchfoo0 starts logging #archiveteam-bs at Tue Mar 19 08:41:16 2019
[08:41] *** logchfoo0 has joined #archiveteam-bs
[08:54] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[09:00] *** SimpBrain has joined #archiveteam-bs
[09:22] *** dhyan_nat has joined #archiveteam-bs
[09:38] *** S1mpbrain has joined #archiveteam-bs
[09:38] *** wabu has joined #archiveteam-bs
[09:38] *** c4rc4s has joined #archiveteam-bs
[09:38] *** simon816 has joined #archiveteam-bs
[09:39] *** SimpBrain has quit IRC (Read error: Connection reset by peer)
[09:40] *** JAA has joined #archiveteam-bs
[09:40] *** bakJAA sets mode: +o JAA
[09:41] *** svchfoo1 has joined #archiveteam-bs
[09:49] JAA: The tool tcp_closer ↑ works very well when gdb does not.
[10:03] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[10:15] (With the -t parameter you could even run it as a cron job to auto-fix stuck connections.)
[10:17] *** wyatt8740 has quit IRC (Ping timeout: 255 seconds)
[10:26] *** dhyan_nat has joined #archiveteam-bs
[10:43] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[10:45] *** dhyan_nat has joined #archiveteam-bs
[10:49] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[10:56] *** Joseph__ has joined #archiveteam-bs
[10:57] *** VerifiedJ has quit IRC (Ping timeout: 252 seconds)
[11:19] *** dhyan_nat has quit IRC (Quit: Konversation terminated!)
[11:19] *** dhyan_nat has joined #archiveteam-bs
[11:33] *** robbierut has joined #archiveteam-bs
[11:34] *** bitBaron has joined #archiveteam-bs
[11:59] *** S1mpbrain has quit IRC (Read error: Connection reset by peer)
[11:59] *** S1mpbrain has joined #archiveteam-bs
[12:07] *** dhyan_nat has quit IRC (Ping timeout: 268 seconds)
[12:35] *** S1mpbrain has quit IRC (Read error: Connection reset by peer)
[12:35] *** SimpBrain has joined #archiveteam-bs
[12:52] *** Flashfire has quit IRC (Ping timeout: 252 seconds)
[12:52] *** kiska has quit IRC (Ping timeout: 252 seconds)
[13:04] PurpleSym: Good to know, thanks!
[13:05] Won't work on all pipelines due to the kernel version requirement (Ubuntu 16.04 LTS still has 4.4, for example), unfortunately.
[13:06] The -t option seems very nice indeed.
[13:06] I'll do some testing with that on jap-saola I think.
[13:06] There’s always a catch. You don’t even need cron though. I’m running it as `tcp-closer -t 300000 -i 120 -d 443 -d 80` on my pipelines now.
[13:07] That closes connections idle for 5 minutes every two minutes.
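The numbers in that command line can be sanity-checked with a little arithmetic. The sketch below goes only by the description in the log: it assumes `-t` is an idle threshold in milliseconds, `-i` is a polling interval in seconds, and `-d` is a destination port. The helper name is hypothetical and none of this is taken from tcp-closer's documentation.

```python
def tcp_closer_args(idle_minutes, interval_seconds, ports):
    """Build an argument list in the shape quoted in the log (hypothetical helper).

    Assumptions inferred from the log only: -t is in milliseconds,
    -i is in seconds, and each -d names one destination port.
    """
    idle_ms = int(idle_minutes * 60 * 1000)  # minutes -> milliseconds
    args = ["tcp-closer", "-t", str(idle_ms), "-i", str(interval_seconds)]
    for port in ports:
        args += ["-d", str(port)]
    return args


# Five minutes idle, checked every two minutes, for HTTPS and HTTP.
print(" ".join(tcp_closer_args(5, 120, [443, 80])))
```

Under those assumptions, a more cautious threshold only changes the first argument to the helper.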
[13:07] Hmm, are there any valid connections which may be idle for 5 minutes?
[13:08] Seems a bit short to me.
[13:09] I’m not sure, but I can’t think of any reason for idling 5 minutes right now.
[13:12] But if you’re uncomfortable with that, we can always push it up to 30 or 60 minutes.
[13:29] *** w0rmhole has joined #archiveteam-bs
[13:29] *** kiska has joined #archiveteam-bs
[13:30] *** Flashfire has joined #archiveteam-bs
[13:58] *** bitBaron has quit IRC (My computer has gone to sleep. 😴😪ZZZzzz…)
[13:59] *** bitBaron has joined #archiveteam-bs
[14:00] *** Tenebrae has quit IRC (Read error: Operation timed out)
[14:01] *** Tenebrae has joined #archiveteam-bs
[14:09] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[14:16] *** bitBaron has joined #archiveteam-bs
[14:26] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[14:28] *** bitBaron has joined #archiveteam-bs
[14:38] *** bitBaron has quit IRC (My computer has gone to sleep. 😴😪ZZZzzz…)
[14:49] *** SimpBrain has quit IRC (Read error: Operation timed out)
[14:53] *** SimpBrain has joined #archiveteam-bs
[15:12] *** minoa has joined #archiveteam-bs
[15:13] *** dhyan_nat has joined #archiveteam-bs
[15:14] minoa: Are the SVGs stored on the same host and in the same directory (or a subdirectory) as the main page you pass to wget?
[15:15] Yes, but right now I am trying with robots turned off.
[15:16] Ah, I got it now: wget -ckm -e robots=off --user-agent="(Agent)" "https://minoa.li/" --warc-file="minoa". I never intended to block archive sites, and I thought ia_archiver did the trick … at the time.
[15:17] Ah yeah, "User-agent: * Disallow: /". :-|
[15:18] I will whitelist ArchiveBot. Sorry for any inconvenience.
[15:18] ArchiveBot ignores robots.txt anyway.
[15:18] I was after the SEO and referrer spammers, not archivists.
[15:19] And the ia_archiver block should make it visible on the Wayback Machine, I think.
[15:19] (Except for those two directories, obviously.)
[15:21] They are for internal use only. Not private but intended for other sites.
[15:21] *** Exairnous has quit IRC (Read error: Operation timed out)
[15:22] (By the way, if you haven't, take a look at our wiki page on robots.txt also.)
[15:24] *** Exairnous has joined #archiveteam-bs
[15:25] I saw that, but be assured I am only after the referrer spammers, not you. In fact I only learned about you today.
[15:32] So now I created a WARC dump: should it be compressed, and if so, what format?
[15:35] gzip if poss
[15:40] I think wget should already write gzipped WARCs. Note that this is not the same as writing an uncompressed WARC and then running it through gzip. Each WARC record is compressed individually to allow random access.
[15:40] I know the WARC is gzipped, but I am referring to packing the complete archive for submission, when ready
[15:41] *** wp494 has quit IRC (Read error: Operation timed out)
[15:42] *** wp494 has joined #archiveteam-bs
[15:42] Ah, no, nothing else is needed.
[15:46] I am also going to back up a MediaWiki wiki: https://nsindex.net - what considerations are needed for MediaWiki sites, because if I recall correctly there are some pointless pages to skip (like the login page).
[15:47] *** bitBaron has joined #archiveteam-bs
[15:47] minoa: If the MediaWiki API is available you can use a special tool to dump the entire wiki through the API
[15:47] https://www.archiveteam.org/index.php?title=WikiTeam#Tools_and_source_code
[15:48] Though it's probably a good idea to grab it as static WARC files as well
[15:48] *** SimpBrain has quit IRC (Read error: Operation timed out)
[15:49] In addition the ArchiveBot project has a bunch of exclusion regexes that are useful for MediaWikis: https://github.com/ArchiveTeam/ArchiveBot/blob/master/db/ignore_patterns/mediawiki.json
[15:50] I know dumpBackup.php and dumpUploads.php, but I thought you may prefer as it would appear now.
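The per-record compression point made at [15:40] can be illustrated with Python's standard library. This is a minimal sketch with made-up record contents, not real WARC records and not a description of how wget is implemented: each record is gzipped as its own member, the members are concatenated, and a reader that knows a member's byte offset can decompress just that record.

```python
import gzip
import zlib

# Made-up stand-ins for WARC records (real records carry WARC headers).
records = [b"record one\n", b"record two\n", b"record three\n"]

# Compress each record as its own gzip member and remember where it starts.
offsets = []
blob = b""
for rec in records:
    offsets.append(len(blob))
    blob += gzip.compress(rec)


def read_record(data, offset):
    """Decompress exactly one gzip member that starts at the given offset."""
    decomp = zlib.decompressobj(wbits=31)  # 31 = expect a gzip header and trailer
    return decomp.decompress(data[offset:])


# Random access: jump straight to the third record without reading the others.
print(read_record(blob, offsets[2]))

# The concatenation as a whole is still a valid gzip stream.
print(gzip.decompress(blob) == b"".join(records))
```

Running plain gzip over an uncompressed file would instead produce a single member, so reaching the last record would mean decompressing everything before it.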
[15:51] Both are useful in different contexts
[15:51] It's not like NSindex is disappearing on 20 March; I have backing up high on my priority list.
[15:52] But due to personal health issues I feel a checkpoint is in order.
[15:55] *** SimpBrain has joined #archiveteam-bs
[15:57] I think I am a bit too new to archiving NSindex from the Archive Team side — maybe I will have to submit the site to the queue while I deal with the dump scripts sort of thing.
[15:58] I do not even know if a Warrior VM will let me archive my own sites.
[15:59] Warrior is for big things
[15:59] #archivebot for one off
[16:00] https://github.com/ArchiveTeam/grab-site for personal archiving
[16:01] I probably have to come back here on 21 March if #archivebot does not have a scheduling system.
[16:04] BTW, what does it mean by “we are not the Internet Archive”? I know about robots.txt being ignored (which is not always a bad thing), but is there anything that I have missed?
[16:05] *** VADemon has quit IRC (Quit: left4dead)
[16:05] *** svchfoo3 has joined #archiveteam-bs
[16:05] We upload our stuff to the Internet Archive but we have no official connection to them
[16:05] *** PurpleSym sets mode: +oo svchfoo1 svchfoo3
[16:06] Well, we are not the Internet Archive. We're a group of crazy people who throw terabytes of archives into IA each day, but everything we do is completely separate from IA's infrastructure and organisation.
[16:06] *** svchfoo1 sets mode: +o joepie91
[16:07] minoa: it's clear archive.org wants a little distance. First off, you aren't their employees...
[16:07] *** svchfoo1 sets mode: +o kiska
[16:07] I wanted to make sure I get everything right before submitting NSindex to archivebot. I don't want to make a mistake that may upset you or something.
[16:09] ArchiveBot's quite busy currently, so unless nsindex.net is in danger of disappearing soon, I'd suggest we delay that until we have more free resources.
[16:10] But you can archive it yourself with grab-site if you want.
[16:10] (Downside: it won't become available in the Wayback Machine.)
[16:13] And if I can use grab-site, how do I submit the completed project?
[16:14] You can upload it to the Internet Archive directly.
[16:15] *** fuzzy8021 has quit IRC (Read error: Connection reset by peer)
[16:15] *** fuzzy8021 has joined #archiveteam-bs
[16:16] I used to upload monthly data dumps there until they removed it. It was at https://archive.org/details/nsindex
[16:19] Well, in that case, you should probably talk to IA before reuploading it (in whichever format).
[16:24] Sent the email off to them.
[16:24] Thanks for the help so far.
[16:24] *** minoa has left
[17:26] *** Hani has quit IRC (Ping timeout: 615 seconds)
[17:27] *** Stiletto has quit IRC ()
[17:43] *** sebras has joined #archiveteam-bs
[17:46] *** Stiletto has joined #archiveteam-bs
[18:25] *** icedice has joined #archiveteam-bs
[18:29] *** Hani has joined #archiveteam-bs
[18:34] *** icedice has quit IRC (Quit: Leaving)
[18:42] *** omarroth has joined #archiveteam-bs
[19:06] *** voltagex has quit IRC (Ping timeout: 264 seconds)
[21:20] *** icedice has joined #archiveteam-bs
[21:23] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[21:59] *** BlueMax has joined #archiveteam-bs
[22:11] *** omarroth has quit IRC (Ping timeout: 506 seconds)
[22:11] *** omarroth has joined #archiveteam-bs
[22:12] *** icedice has quit IRC (Quit: Leaving)
[22:25] *** tuluu has quit IRC (Ping timeout: 615 seconds)
[22:26] *** tuluu has joined #archiveteam-bs
[22:31] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[22:43] *** Hani has quit IRC (Ping timeout: 255 seconds)
[22:44] *** wyatt8740 has joined #archiveteam-bs
[22:50] *** arbin has quit IRC (Quit: .)
[22:51] *** Hani has joined #archiveteam-bs
[22:51] *** arbin has joined #archiveteam-bs
[22:55] *** tuluu_ has joined #archiveteam-bs
[22:59] *** tuluu has quit IRC (Ping timeout: 265 seconds)
[23:04] *** ttteessst has joined #archiveteam-bs
[23:04] *** icedice has joined #archiveteam-bs
[23:04] *** icedice has quit IRC (Remote host closed the connection)
[23:05] *** bitBaron has joined #archiveteam-bs
[23:17] *** Ryz has joined #archiveteam-bs
[23:36] JAA: So, I installed Python 3.7.2 https://www.python.org/downloads/release/python-372/ (Windows x86-64 embeddable zip file) - I tried to install snscrape but not sure how, I tried downloading Windows help file and opened it, but it didn't seem to work...what
[23:38] So yeah I'm stuck at the moment~
[23:40] Ryz: I probably won't be able to help you with that since I haven't used Windows in many years. Stack Overflow suggests that the installation comes with pip and you should therefore be able to run 'pip install snscrape', possibly explicitly specifying a path for pip or pip.exe. But I'm sure someone else in this channel has installed Python packages on Windows before and can help you better.
[23:44] Oh, so use the installer instead that has pip
[23:59] *** robbierut has quit IRC (Ping timeout: 262 seconds)
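Following up on the pip suggestion at [23:40], here is a small sketch for checking whether a given Python has pip at all. The Windows embeddable zip is a minimal distribution that normally ships without pip, which fits the conclusion reached at [23:44]. The helper name is hypothetical. The command is built around the running interpreter, so no explicit path to pip or pip.exe is needed, and the script only prints the command rather than running it.

```python
import importlib.util
import sys


def pip_install_command(package):
    """Return the command that installs a package with this interpreter's own pip."""
    # Fall back to a plain "python" if the interpreter path is unknown.
    python = sys.executable or "python"
    return [python, "-m", "pip", "install", package]


if importlib.util.find_spec("pip") is None:
    # Typical for a minimal distribution such as the embeddable zip.
    print("pip is not available in this Python; use the full installer instead")
else:
    print(" ".join(pip_install_command("snscrape")))
```

Running the script with the interpreter in question shows whether that interpreter can install packages.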