[00:12] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[00:25] *** VerfiedJ has quit IRC (Quit: Leaving)
[00:35] *** m007a83 has joined #archiveteam-ot
[00:40] *** Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
[00:42] https://jinglepings.com/
[00:42] I love the internet
[00:42] This is the best thing ever
[00:49] "Since the led wall was being flooded with inappropriate content" lol
[00:50] Very nice idea!
[00:50] I fucking love it
[00:50] I want to set one up here locally
[00:51] you don't even need that much IPv6 Space
[00:51] I'm going to paint the Archive Team Warrior Logo on it
[00:51] im setting it up right now
[00:52] the rainbow line going across seems to be having some packet loss
[00:53] lol
[00:54] damn, all my hetzner boxes have shit IPv6
[00:55] ah, got one
[02:31] *** wp494 has quit IRC (Read error: Operation timed out)
[02:32] *** wp494 has joined #archiveteam-ot
[02:32] *** svchfoo1 sets mode: +o wp494
[02:44] *** Sian1468 has joined #archiveteam-ot
[02:55] *** Sian1468 has quit IRC (Quit: Quit)
[03:02] I like that we've doomed ourselves before the world has even completely flipped to IPv6
[03:04] Does bing even support IPv6 yet?
[03:05] Maybe it's just me, but standardizing a normal end user as a /48 just seems like a silly idea
[03:54] someone almost faded out the whole board for a second
[04:02] *** kiska1 has quit IRC (Read error: Operation timed out)
[04:03] *** kiska1 has joined #archiveteam-ot
[04:51] *** BlueMax has joined #archiveteam-ot
[04:57] *** odemg has quit IRC (Ping timeout: 265 seconds)
[05:09] *** odemg has joined #archiveteam-ot
[05:16] *** Odd0002 has joined #archiveteam-ot
[06:14] *** wp494 has quit IRC (Ping timeout: 506 seconds)
[06:15] *** wp494 has joined #archiveteam-ot
[06:16] *** svchfoo1 sets mode: +o wp494
[07:41] Oooo
[07:41] Not every day this happens
[07:41] I got served what was supposed to be a spammy ad
[07:41] But the webserver appears to be pwned
[08:51] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[09:50] *** godane has quit IRC (Ping timeout: 265 seconds)
[09:57] *** godane has joined #archiveteam-ot
[09:57] *** svchfoo1 sets mode: +o godane
[10:55] *** Sian1468 has joined #archiveteam-ot
[13:10] *** Dj-Wawa has joined #archiveteam-ot
[13:31] is there a dump of opensubtitles or similar sites? searching in the scripts of all movies would be cool
[13:35] psi
[13:36] what ubuntu is that?
[13:36] 18.04, basically fresh install
[13:36] thats weird. i have a /etc/network/interfaces on mine but that could as well be because i run it with cloud-init
[13:37] what you can try
[13:37] `ip l l`, find your network interface name, then grep -r INTERFACENAME /etc
[13:37] and try to find whatever files reference that network interface
[13:38] VoynichCr: We did grab a few such sites through ArchiveBot a while ago when there was a wave of shutdowns (#domtitles). But not sure if an easily searchable DB exists.
[13:43] Fusl: I see some IPv6 related files and a file in the network folder of systemd but it seems commented out
[13:43] does the server get its ip via dhcp?
[13:43] (check ps fauxww | grep dhclient)
[13:44] only gets the grep process
[13:48] thats weird
[13:55] *** Sian1468 has quit IRC (Quit: Leaving)
[14:08] Oh
[14:08] Fusl: https://askubuntu.com/questions/1031709/ubuntu-18-04-switch-back-to-etc-network-interfaces
[14:09] oh god netplan
[14:10] Time to install Ubuntu 16 instead :)
[14:10] https://netplan.io/faq#use-pre-up-post-up-etc-hook-scripts
[14:11] anyway
[14:11] you dont necessarily need to put the iptables in a post-up for that interface, you can also manually run it every time after you reboot the machine
[14:11] yikes
[14:12] I had nothing set up yet, so I'd rather switch back to 16.04 and do it right the first time
[14:12] or just use an OS that doesn't f... with its users, like debian or alpine linux :P
[14:13] I don't have any experience with those so I'm a bit afraid to move to those
[14:13] debian is literally the same as ubuntu
[14:13] just without all that canonical crap installed
[14:13] oh
[14:14] well time to go ahead and try that then :p
[14:15] Grr, my auto-op script broke and I don't know why.
[14:24] Gotta have patience for reinstalling SyS servers
[15:18] *** wp494 has quit IRC (Ping timeout: 492 seconds)
[15:19] *** wp494 has joined #archiveteam-ot
[15:19] *** svchfoo1 sets mode: +o wp494
[17:00] *** Mateon1 has quit IRC (Read error: Operation timed out)
[17:02] *** Mateon1 has joined #archiveteam-ot
[17:02] *** terorie has joined #archiveteam-ot
[17:14] hook54321: what article?
[17:17] Fusl: https://motherboard.vice.com/en_us/article/d3bekm/archivists-say-tumblr-ip-banned-them-for-trying-to-preserve-adult-content
[17:19] ic
[17:19] well there are certainly mistakes but nothing that causes bad AT reputation :D
[17:21] Fusl: Main reason I said that is they wrote what seemed to be an article designed to make IA look bad awhile ago.
[17:22] did they?
[17:22] i dont read news for such reasons
[17:22] looking for it, one sec
[17:24] https://motherboard.vice.com/en_us/article/nekzzq/wayback-machine-deleting-evidence-flexispy
[17:25] And then they proceeded to post the article on their Twitter feed at least five times
[17:26] it was made unaccessible when they put a robots.txt on the site i guess?
[17:27] Iirc they requested that IA make it non-public, IA probably complied because the company has copyright over the material
[17:29] Also, the author of the article referenced a tweet (that they later deleted) that talks about diversifying archives, but more than half of the services mentioned in the tweet are run by IA. https://web.archive.org/web/20180526092639/https://twitter.com/josephfcox/status/999218176364892160
[17:36] I don't know when they deleted it, but it was after it was brought up in this channel. So there's a chance they're in here.
[17:38] lol that reply tweet
[17:40] *** Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
[18:02] Fusl: I have the IP block now, but there is no docker0 interface in /etc/network/interfaces... :/
[18:02] it does show up in `ip l l` though
[18:08] yeah, docker0 is managed by the docker daemon
[18:08] you'll need to add the post-up commands into the block for your primary network interface
[18:08] Oh I see
[18:08] Which would probably be the eth0 IPv4 interface?
[18:09] yeah
[18:09] And do I need to restart anything after adding that line?
[18:10] you can copy the post-up commands out and just run them as normal commands
[18:10] that will do it
[18:10] Starting with `iptables` I assume
[18:11] yeah
[18:11] once you've configured iptables to balance across your ips or prefixes, you can use `docker run --rm -ti alpine sh -c 'for i in $(seq 20); do wget -qO- https://ipinfo.io/ip; done'` to verify that your iptables rules are working properly
[18:12] it should give you the public ips back the docker container got after each request
[18:13] *** Fusl_ has joined #archiveteam-ot
[18:13] Hm, it's getting the same IP every time, but it is one from my new IP block
[18:13] how do your iptables rules look like?
[18:15] IPs censored, obviously https://www.irccloud.com/pastebin/N8CInW3c/
[18:15] The only other ones in /etc/network/interfaces are the loopback interface and the IPv6 eth0
[18:17] Oh, the test command is now alternating between 5.6.7.3 and 5.6.7.10
[18:17] oh yeah thats what it should do
[18:17] if it does that, everythings working
[18:18] Shouldn't all 16 IPs technically be used?
[18:19] technically yes, but "randomness" in iptables is kinda weird. once you get some threads up and running, they will alternate between all ips
[18:19] alrighty
[18:21] once you have some threads running you can check with `tcpdump -plnn -i eth0 'tcp[tcpflags] == tcp-syn' | awk -Winteractive '{print $3}' | awk -Winteractive -F. '{print $1"."$2"."$3"."$4}'` to see source ip addresses for tcp connects
[18:22] I see some IP addresses come by that I don't even recognise but most of them seem to be from my IP block
[18:22] *** terorie has quit IRC (Remote host closed the connection)
[18:22] those are inbound connections then
[18:23] inb4 russia's trying to hack me
[18:23] try this instead `tcpdump -Qout -plnn -i eth0 'tcp[tcpflags] == tcp-syn' | awk -Winteractive '{print $3}' | awk -Winteractive -F. '{print $1"."$2"."$3"."$4}'`
[18:24] `Unable to write output: broken pipe`
[18:24] oh my bad
[18:24] Yep that seems plenty random
[18:25] if you want fuller randomness you can still go ahead and write a custom iptables target chain with -m statistic --mode random or --mode nth
[18:25] *** terorie has joined #archiveteam-ot
[18:26] I hardly know what I'm doing anyway so I'll just leave it alone x)
[18:26] kk
[18:27] But now they have to block a /28 block instead of just one IP >:)
[18:27] Also, regarding the multiple IP blocks idea, it used to be the case that you have multiple --to-source params and it would round-robin between them
[18:28] But they removed that for whatever reason
[18:31] Oh, it was actually removed in the Linux kernel
[18:48] Hey Fusl does the `touch STOP` thing work when `docker exec -it`'d in containers running your file
[18:49] it should
[18:49] Hm, guess the tasks just take a while then
[18:50] you can check with `docker logs -f `
[18:51] that's a lot of logs
[18:52] But yeah it doesn't seem like it's getting new jobs
[18:57] *** terorie has quit IRC (Remote host closed the connection)
[18:59] not-quick and dirty script for even distribution of source nat ips http://xor.meo.ws/nbx3sPbsWx2b0TJWVRPNgnx0UiYb7DHj.txt
[19:00] requires all your "possible" outbound ip addresses to be added to the primary network interface
[19:07] meh
[19:16] *** ubahn has joined #archiveteam-ot
[19:22] Hm
[19:23] It's probably easier to do this the Docker Swarm way
[19:24] *** ubahn has quit IRC (Quit: ubahn)
[19:30] *** Odd0002 has quit IRC (ZNC - http://znc.in)
[19:33] *** VerfiedJ has joined #archiveteam-ot
[19:33] oh shit apparently i can get gigabit internet at home
[19:33] shiiiit
[19:34] 50mbit/s upstream lol
[19:36] *** Odd0002 has joined #archiveteam-ot
[19:37] *** Odd0002 has quit IRC (Client Quit)
[19:41] *** Odd0002 has joined #archiveteam-ot
[19:53] *** ubahn has joined #archiveteam-ot
[19:58] *** ubahn has quit IRC (Client Quit)
[20:57] *** terorie has joined #archiveteam-ot
[21:02] *** terorie has quit IRC (Ping timeout: 268 seconds)
[22:04] *** terorie has joined #archiveteam-ot
[22:05] *** s4t has joined #archiveteam-ot
[22:09] *** terorie has quit IRC (Ping timeout: 268 seconds)
[22:16] *** s4t has quit IRC (Quit: s4t)
[22:16] *** terorie has joined #archiveteam-ot
[22:33] *** BlueMax has joined #archiveteam-ot
[22:46] *** Dj-Wawa has joined #archiveteam-ot
[23:18] *** Odd0002 has quit IRC (ZNC - http://znc.in)
[23:21] *** Odd0002 has joined #archiveteam-ot
[23:23] *** silas has joined #archiveteam-ot
[23:35] I just finished downloading an FTP site as a WARC using wget. It made a WARC file and a normal folder with the stuff it downloaded, and my question is if I zip up the folder, will the Wayback Machine still be able to ingest the WARC properly? I feel like it's a kinda dumb question but I want to be sure since it's all a few hundred GB altogether and I don't want to mess anything up
[23:38] And just so it's clear I don't mean zipping up the warc with it, I just mean zipping up the subfolder wget makes that has all the files it's downloaded in it.
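
The 18:25 suggestion of a custom chain using `-m statistic --mode nth` can be sketched roughly as below. This is only an illustration and not the script linked at 18:59, whose contents are not reproduced in this log. The function name `snat_rules`, the block `5.6.7.0/28` (chosen to match the censored addresses quoted above), the interface `eth0`, and the Docker default bridge subnet `172.17.0.0/16` are all assumptions. The idea is a cascade: rule 1 takes every 16th new connection, rule 2 every 15th of what is left, and so on, with the last rule catching the remainder, so each address should receive an equal share.

```python
import ipaddress


def snat_rules(block, iface="eth0", source="172.17.0.0/16", chain="POSTROUTING"):
    """Build iptables commands that spread new connections evenly over `block`.

    All defaults are illustrative assumptions, not values from the log.
    """
    # Iterating a network yields every address in it (16 for a /28).
    addrs = [str(a) for a in ipaddress.ip_network(block)]
    rules = []
    for i, addr in enumerate(addrs):
        remaining = len(addrs) - i
        if remaining == 1:
            # Last rule: no statistic match, so it catches whatever is left.
            match = ""
        else:
            # Take 1 of every `remaining` connections that reach this rule.
            match = f"-m statistic --mode nth --every {remaining} --packet 0 "
        rules.append(
            f"iptables -t nat -A {chain} -s {source} -o {iface} "
            f"{match}-j SNAT --to-source {addr}"
        )
    return rules


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for rule in snat_rules("5.6.7.0/28"):
        print(rule)
```

The sketch only prints the commands, so they can be inspected before being pasted into a post-up line or run by hand as described at 18:10. As noted at 19:00, the outbound addresses would still need to be configured on the primary interface for the replies to arrive.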