#archiveteam-bs 2019-06-30,Sun


Time Nickname Message
00:02 🔗 icedice has quit IRC (Read error: Connection reset by peer)
00:02 🔗 icedice has joined #archiveteam-bs
02:23 🔗 SketchCow Grab it all
02:23 🔗 SketchCow Yank it
02:23 🔗 SketchCow In case the contract dies
02:24 🔗 SketchCow As for the spam reviewer HA HA MOTHERFUCKERS
02:24 🔗 SketchCow I have a set of 5 scripts running like a sentinel. It is currently not possible to post any of those URLs in any form in any review in the Internet Archive.
02:24 🔗 SketchCow They don't last for more than 15 minutes at the most. I watched him do 10,000-11,000 spam reviews today, they're gone, gone gone. Also any going back anywhere in time.
02:26 🔗 SketchCow Like that shit is done, and that's a couple hours I'm never getting back.
02:29 🔗 sec^nd has quit IRC (Read error: Connection reset by peer)
02:31 🔗 wyatt8740 has quit IRC (Quit: Ceci n'est pas un IRC quit message.)
02:33 🔗 wyatt8740 has joined #archiveteam-bs
02:36 🔗 second has joined #archiveteam-bs
03:31 🔗 odemgi_ has joined #archiveteam-bs
03:34 🔗 odemg has quit IRC (Ping timeout: 265 seconds)
03:37 🔗 Fusl6 has joined #archiveteam-bs
03:37 🔗 odemgi has quit IRC (Read error: Operation timed out)
03:40 🔗 MillerBOS I'll share my hours with you SketchCow
03:41 🔗 Fusl5 has quit IRC (Read error: Operation timed out)
03:46 🔗 odemg has joined #archiveteam-bs
04:29 🔗 jspiros has quit IRC (Read error: Operation timed out)
04:30 🔗 jspiros has joined #archiveteam-bs
05:15 🔗 godane has joined #archiveteam-bs
05:17 🔗 godane !ao https://www.yahoo.com/finance/news/40-acres-and-a-mule-reparations-in-2019-190018747.html
05:17 🔗 godane wrong channel
05:19 🔗 AnthonyI has quit IRC (Ping timeout: 264 seconds)
05:24 🔗 PurpleSym has quit IRC (Read error: Operation timed out)
05:24 🔗 PurpleSym has joined #archiveteam-bs
05:25 🔗 purplebot has quit IRC (Read error: Operation timed out)
05:26 🔗 Fusl has quit IRC (Read error: Operation timed out)
05:26 🔗 robogoat_ has joined #archiveteam-bs
05:26 🔗 Fusl has joined #archiveteam-bs
05:26 🔗 nyany has quit IRC (Ping timeout: 506 seconds)
05:26 🔗 Fusl_ sets mode: +o Fusl
05:27 🔗 wp494 has quit IRC (Ping timeout: 506 seconds)
05:27 🔗 svchfoo3 has quit IRC (Read error: Operation timed out)
05:28 🔗 wp494 has joined #archiveteam-bs
05:29 🔗 atomicthu has quit IRC (Ping timeout: 506 seconds)
05:29 🔗 robogoat has quit IRC (Ping timeout: 506 seconds)
05:29 🔗 mr_archiv has quit IRC (Ping timeout: 506 seconds)
05:29 🔗 atomicthu has joined #archiveteam-bs
05:31 🔗 mr_archiv has joined #archiveteam-bs
05:32 🔗 icedice2 has joined #archiveteam-bs
05:32 🔗 icedice2 has quit IRC (Connection closed)
05:33 🔗 icedice2 has joined #archiveteam-bs
05:37 🔗 icedice has quit IRC (Read error: Operation timed out)
05:42 🔗 nyaomi has quit IRC (Quit: meow)
06:10 🔗 nyaomi has joined #archiveteam-bs
06:21 🔗 purplebot has joined #archiveteam-bs
06:22 🔗 nyany has joined #archiveteam-bs
06:23 🔗 svchfoo3 has joined #archiveteam-bs
06:23 🔗 Fusl sets mode: +o svchfoo3
06:23 🔗 svchfoo1 sets mode: +o svchfoo3
06:32 🔗 icedice has joined #archiveteam-bs
06:39 🔗 icedice2 has quit IRC (Read error: Operation timed out)
07:03 🔗 stapler11 has quit IRC (Read error: Connection reset by peer)
07:04 🔗 stapler11 has joined #archiveteam-bs
07:16 🔗 Raccoon has joined #archiveteam-bs
10:30 🔗 icedice has quit IRC (Read error: Connection reset by peer)
10:30 🔗 icedice has joined #archiveteam-bs
10:37 🔗 icedice has quit IRC (Read error: Connection reset by peer)
10:37 🔗 icedice has joined #archiveteam-bs
11:12 🔗 Raccoon has quit IRC (Ping timeout: 265 seconds)
11:50 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
11:53 🔗 icedice has quit IRC (Quit: Leaving)
13:09 🔗 killsushi has quit IRC (Quit: Leaving)
13:10 🔗 schbirid has joined #archiveteam-bs
13:16 🔗 BartoCH has quit IRC (Ping timeout: 615 seconds)
13:47 🔗 katocala has quit IRC (Read error: Operation timed out)
15:03 🔗 Raccoon has joined #archiveteam-bs
15:12 🔗 Fusl_ i have https://github.com/Fusl/ateam-scripts now for anyone interested in knowing how i do my stuff
15:28 🔗 Fusl_ JAA: https://github.com/Fusl/ateam-scripts/blob/master/df/nratv/files/script.sh
15:30 🔗 JAA Fusl_: Nice.
15:30 🔗 JAA Soon we might need a list of repositories containing useful scripts. lol
15:30 🔗 Fusl_ :D
15:30 🔗 JAA Fusl_: I think wpull won't write a meta WARC with those args anyway, but why do you filter it out before the upload?
15:31 🔗 Fusl_ its just a copy paste of the sonysketch one: https://github.com/Fusl/ateam-scripts/blob/master/df/sonysketch/files/script.sh
15:32 🔗 JAA grab-site definitely writes meta WARCs. So I guess those are lost?
15:32 🔗 Fusl_ yeah i dont grab those meta warcs, the data warcs ended up in megawarcs
15:32 🔗 JAA :-(
15:32 🔗 JAA meta WARCs are important as well, they contain the retrieval log.
15:32 🔗 Fusl_ are they any useful?
15:33 🔗 Fusl_ ic
15:33 🔗 Fusl_ but we dont want them in megawarcs, right
15:33 🔗 Fusl_ right?
15:34 🔗 JAA Not sure. It might be best to have two megawarcs, one with the data and one with the meta WARCs. That way you can easily access the logs without downloading the entire thing.
15:34 🔗 ivan_ why even megawarc grab-site output?
15:34 🔗 ivan_ related sites?
15:35 🔗 Fusl_ ivan_: sony sketch stuff
15:35 🔗 ivan_ ah
15:35 🔗 Fusl_ roundabout 2.4tb of sony sketch images, each batch was 150 sketch ids and was assigned to one grab-site worker
15:36 🔗 Fusl_ also, JAA, looks like grab-site is actually not writing the meta warcs for me anymore
15:37 🔗 Fusl_ i just checked the running jobs on mips and none of them has any meta warcs
15:37 🔗 JAA Fusl_: The meta WARC is only written at the end.
15:37 🔗 JAA (And not when wpull crashes)
15:37 🔗 Fusl_ oic
15:38 🔗 Fusl_ welp
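
A minimal sketch of JAA's suggestion above (one megawarc for data WARCs, one for meta WARCs), written as a pre-upload sorting step. It assumes meta WARCs can be recognised by a `-meta.warc.gz` suffix; that naming and the helper name `partition_warcs` are assumptions for illustration and are not taken from the scripts linked in the log.

```python
def partition_warcs(filenames):
    """Split WARC filenames into (data, meta) lists.

    Assumption: meta WARCs are named '<job>-meta.warc.gz' (or '.warc').
    Anything that is not a WARC file is ignored.
    """
    data, meta = [], []
    for name in filenames:
        if not name.endswith((".warc", ".warc.gz")):
            continue  # skip logs, databases, and other non-WARC files
        # Strip the optional '.gz' and then '.warc' to inspect the stem.
        stem = name[:-len(".gz")] if name.endswith(".gz") else name
        stem = stem[:-len(".warc")]
        (meta if stem.endswith("-meta") else data).append(name)
    return data, meta


if __name__ == "__main__":
    data, meta = partition_warcs(
        ["job-00000.warc.gz", "job-meta.warc.gz", "wpull.log"]
    )
    print("data megawarc input:", data)
    print("meta megawarc input:", meta)
```

Each list could then be packed separately, so the retrieval logs stay accessible without downloading the full data megawarc.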
15:46 🔗 BartoCH has joined #archiveteam-bs
16:54 🔗 Fusl http://xor.meo.ws/e1db0857/1c6e/4b4d/800a/a71830b109b8.png not gonna cancel any of my ex42-nvme servers any time soon it seems :D
17:09 🔗 Kaz they've been out for a few days at least, bad times
17:11 🔗 JAA Yeah, noticed that the other day as well.
17:23 🔗 ivan_ heh, I was wondering how much stock they had
17:35 🔗 godane has quit IRC (Quit: Leaving.)
17:36 🔗 Stilettoo is now known as Stiletto
17:38 🔗 Kenshin has quit IRC (Quit: ZNC - http://znc.in)
17:40 🔗 Kenshin has joined #archiveteam-bs
17:40 🔗 Fusl sets mode: +o Kenshin
17:41 🔗 Fusl so uh if anyone around wants to get rid of their ex42-nvme server, preferably in finland but germany is also fine, let me know and i'll add them to the ateam hoard http://xor.meo.ws/8000c941/de80/43e1/82e7/05587f15c62b.png :P
17:44 🔗 Hani111 has joined #archiveteam-bs
17:54 🔗 Hani has quit IRC (Ping timeout: 615 seconds)
17:54 🔗 Hani111 is now known as Hani
18:39 🔗 Atom-- has quit IRC (Ping timeout: 604 seconds)
18:52 🔗 Raccoon has quit IRC (Read error: Connection reset by peer)
19:02 🔗 Fusl JAA: fyi i'm doing a custom crawl of the old www.edis.at website on mips, with dns pointed at the old ip address. they recently got a new website design and lots of information was not ported over to it, especially support articles, etc.
19:03 🔗 Fusl imma dump it directly into IA once done but dont think we want it in the WBM i guess?
19:04 🔗 JAA Fusl: Hmm, last time I had this issue it was not concerning to have it in the WBM because the domain was gone, but yeah, in this case, it might be better to not have it in there since it would collide with the actual current website.
19:06 🔗 JAA arkiver, SketchCow: Thoughts? ^
19:06 🔗 JAA Fusl: I assume the old site is not accessible directly with just the IP but requires the domain in the Host header?
19:07 🔗 Fusl correct
19:10 🔗 hook54321 You could try asking them to point a subdomain to the old site, I'm guessing they probably wouldn't bother though.
19:11 🔗 Fusl thats not easily doable
19:11 🔗 arkiver IA would love the data
19:11 🔗 Fusl the software running on that thing doesnt allow just "pointing a domain to it"
19:11 🔗 arkiver But it might not be a good idea to put this in the Wayback Machine
19:12 🔗 JAA One more alternative: grab it under the IP and use --header 'Host: www.edis.at'. That should make wpull write the IP to the WARC headers, so the snapshots wouldn't collide with the current site.
19:12 🔗 JAA But I'm not sure how that behaves for offsite links.
19:12 🔗 JAA Or images etc.
19:12 🔗 astrid hmm interesting
19:13 🔗 JAA And also it would still be incorrect since you can't access the site under the URL in the WARC then...
19:13 🔗 Fusl yeah that
19:14 🔗 Fusl all the page resources wouldnt load correctly
19:15 🔗 JAA In theory, wpull should probably overwrite the Host header if a child URL is on a different host. In practice, it probably doesn't do that.
19:16 🔗 Fusl In practice, it actually doesn't do that
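
A minimal sketch of the behaviour JAA describes as what a crawler should do "in theory": send the `Host` override only for URLs on the old site's hostname, and leave offsite links and resources alone. This is not wpull code. The function name `plan_request` and the IP `192.0.2.10` (a documentation-only address) are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def plan_request(url, site_host, origin_ip):
    """Return (target_url, extra_headers) for one URL in the crawl.

    URLs on site_host are rewritten to hit origin_ip directly, with a
    Host header naming the original site. All other URLs are returned
    unchanged and get no Host override.
    """
    parts = urlsplit(url)
    if parts.hostname != site_host:
        return url, {}  # offsite link or resource: fetch normally

    # Keep any explicit port on both the rewritten URL and the Host header.
    port = "" if parts.port is None else f":{parts.port}"
    target = urlunsplit(
        (parts.scheme, origin_ip + port, parts.path, parts.query, parts.fragment)
    )
    return target, {"Host": site_host + port}


if __name__ == "__main__":
    for u in ("http://www.edis.at/support/a?x=1",
              "https://cdn.example.com/img.png"):
        print(plan_request(u, "www.edis.at", "192.0.2.10"))
```

As noted in the discussion, records captured this way would carry the IP in their URLs, so they would not replay under the original domain.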
20:20 🔗 Fusl JAA Igloo: fyi, there's been a brief network outage on OSS host as the maximum-prefix limit on the routers was tripped due to me adding two more /24s for mips. everything is back now and the maximum-prefix limit on NFOrce router side has been raised by x3
20:23 🔗 Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
20:37 🔗 thepaul has joined #archiveteam-bs
20:37 🔗 thepaul ivan_ JAA: it was knzk
20:39 🔗 ivan_ thepaul: URLs?
20:41 🔗 ivan_ knzk.me?
20:43 🔗 thepaul that one
20:44 🔗 ivan_ chromebot was used to archive one page on it so it's probably just that and whatever other scraps are in wayback
20:45 🔗 Kaz is mastodon some sort of running joke I'm not in on? why does it exist and why is it so bad
20:46 🔗 ivan_ a playground for failing at federated social networking
20:47 🔗 ivan_ federation but with a large amount of centralization because who can operate servers anyway
20:47 🔗 ivan_ no data escrow/evacuation
20:50 🔗 Kaz sounds like a right laugh, unless you're one of the suckers that suddenly has all their data dropped because someone forgot to pay a bill
20:53 🔗 ivan_ thepaul: you could try looking for your cached data in your browser profile or other people's mastodon instances
21:12 🔗 icedice has joined #archiveteam-bs
21:19 🔗 icedice has quit IRC (Read error: Connection reset by peer)
21:19 🔗 icedice has joined #archiveteam-bs
21:20 🔗 ivan_ or uh ask this guy http://web.archive.org/web/20180307150224/https://knzk.me/@Knzk
21:21 🔗 Stilettoo has joined #archiveteam-bs
21:21 🔗 Stiletto has quit IRC (Read error: Operation timed out)
21:52 🔗 abstract ive yet to figure out whos killing the connection between me and archive.org
21:52 🔗 abstract i can get to the homepage but any attempt to view an archived page results in "web.archive.org unexpectedly closed the connection"
21:55 🔗 icedice has quit IRC (Read error: Connection reset by peer)
21:55 🔗 icedice has joined #archiveteam-bs
22:08 🔗 Raccoon has joined #archiveteam-bs
22:16 🔗 BlueMax has joined #archiveteam-bs
22:26 🔗 Kaz connecting from where?
22:35 🔗 Raccoon The A-team hoard need better names besides 'archivebox-hel1'
22:36 🔗 Raccoon Like... Hannibal and Howling Mad and Faceman and Bad A Baracus
22:36 🔗 Kaz sounds great until you forgot which is which
22:36 🔗 Kaz fun names don't scale past.. 5 machines
22:37 🔗 Raccoon Do the Mormons number their kids? Do the Roman Catholics?
22:37 🔗 Raccoon roman numeral kids?
22:39 🔗 Kaz yes, actually: https://www.bbc.co.uk/news/uk-politics-40506109
22:39 🔗 Kaz 5 got normal names, 6th was 'Sixtus'
22:41 🔗 Raccoon heh. I pitty that foo
22:59 🔗 killsushi has joined #archiveteam-bs
23:45 🔗 kiskabak has quit IRC (Remote host closed the connection)
23:45 🔗 kiskabak has joined #archiveteam-bs
23:45 🔗 Fusl sets mode: +o kiskabak
23:46 🔗 benjins abstract I've had that issue before, try clearing your site cookies/cached data for *.archive.org
