[02:42] *** BlueMax has joined #archiveteam-bs
[03:18] *** BlueMax has quit IRC (Quit: Leaving)
[03:43] *** odemg has quit IRC (Ping timeout: 265 seconds)
[03:47] *** qw3rty118 has joined #archiveteam-bs
[03:52] *** qw3rty117 has quit IRC (Read error: Operation timed out)
[03:55] *** odemg has joined #archiveteam-bs
[04:00] so i found this : http://www.shasej.org/gakkaishi/archive/archive.asp?Yers=7
[04:24] *** Sokar has quit IRC (Ping timeout: 615 seconds)
[04:26] *** BlueMax has joined #archiveteam-bs
[04:37] *** Sokar has joined #archiveteam-bs
[05:01] *** godane has quit IRC (Read error: Operation timed out)
[05:20] *** godane has joined #archiveteam-bs
[05:49] *** m007a83 has quit IRC (Quit: Fuck you Comcast)
[05:52] *** Mateon1 has quit IRC (Read error: Operation timed out)
[05:54] *** Mateon1 has joined #archiveteam-bs
[05:59] *** wyatt8740 has quit IRC (Read error: Operation timed out)
[06:08] SketchCow: something you may like : https://archive.org/details/Virus_Bulletin-1989-07
[06:09] i couldn't find it on archive.org so i'm uploading
[06:51] *** eientei95 has quit IRC (Quit: ZNC 1.7.2+deb2 - https://znc.in)
[07:02] *** susudo has joined #archiveteam-bs
[07:19] *** susudo has quit IRC (Quit: Page closed)
[09:02] *** deevious has quit IRC (Read error: Connection reset by peer)
[09:02] *** deevious has joined #archiveteam-bs
[09:06] Sanqui: What's your goal exactly? (from #archiveteam)
[09:21] *** martinlig has joined #archiveteam-bs
[09:28] JAA: Take a website archived with ArchiveBot, and get a list of all *.wz.cz domains from it. For example.
[09:32] Sanqui: As long as the job was run without --no-offsite-links, i.e. those domains were retrieved, I'd use either the meta WARC or IA's CDX. Both are fairly easy to parse with grep and/or awk. If the URLs were ignored, only the meta WARC will work.
[09:33] What is the difference between the CDX and meta?
[09:33] If the URLs were hard-ignored by --no-offsite-links, --no-parent, or some other wpull option, then the only way would be to parse the data WARC. Have fun with that...
[09:33] I have a problem with newsgrabber megawarcs, sometimes they don't get a CDX
[09:33] Igloo: The CDX is an index of all response records in the WARC. The meta WARC contains the wpull log.
[09:34] The CDX files are generated by the IA derive after upload. Each WARC in an item with mediatype:web gets one CDX, plus there's one item-wide index.
[09:35] ok, I wonder why some don't get that
[09:35] I need to re-write the dedupe anyway
[09:36] Have an example?
[09:38] I do but not right now :-)
[09:38] Poolside baby
[09:40] Ah, right, enjoy it!
[09:48] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[10:54] *** eientei95 has joined #archiveteam-bs
[10:54] *** eientei95 has quit IRC (Handshake flooding)
[10:56] *** eientei95 has joined #archiveteam-bs
[10:56] *** eientei95 has quit IRC (Handshake flooding)
[10:58] *** eientei95 has joined #archiveteam-bs
[11:11] SketchCow: Those ground zero photos of yours on IA/a non-flickr download?
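(Editor's sketch for the 09:28-09:34 exchange above, not something posted in the channel.) The log suggests parsing IA's CDX with grep and/or awk to list all *.wz.cz hosts; the same idea in Python might look roughly like this. It assumes the common space-separated CDX layout in which the third field is the original URL and the file starts with a " CDX ..." header line; the function name `hosts_under` and the sample rows are made up for illustration.

```python
from urllib.parse import urlsplit


def hosts_under(cdx_lines, suffix="wz.cz"):
    """Return the sorted set of hostnames equal to or below `suffix`.

    Assumption: each data line is space-separated and its third field
    is the original URL (the usual IA CDX layout). Check the header
    line of a real CDX file before relying on this.
    """
    hosts = set()
    for line in cdx_lines:
        stripped = line.strip()
        # Skip blank lines and the " CDX N b a m s ..." header line.
        if not stripped or stripped.startswith("CDX "):
            continue
        fields = stripped.split()
        if len(fields) < 3:
            continue
        # urlsplit lowercases the hostname; it is None if no scheme/netloc.
        host = urlsplit(fields[2]).hostname
        if host and (host == suffix or host.endswith("." + suffix)):
            hosts.add(host)
    return sorted(hosts)


if __name__ == "__main__":
    # Hypothetical sample rows, only to show the call shape.
    sample = [
        " CDX N b a m s k r M S V g",
        "cz,wz,foo)/ 20190621000000 http://foo.wz.cz/ text/html 200 AAAA - - 1 2 x.warc.gz",
        "com,example)/ 20190621000001 http://example.com/ text/html 200 BBBB - - 1 3 x.warc.gz",
    ]
    for host in hosts_under(sample):
        print(host)
```

Matching on the parsed hostname rather than grepping the whole line avoids false hits where a wz.cz URL only appears inside another URL's query string. As noted in the log, this only finds hosts that were actually retrieved; URLs that were ignored would have to come from the meta WARC's wpull log instead.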
[11:13] Another Ban Wave happened on Reddit
[11:14] Also someone uploaded https://archive.org/details/WTC-ISO - back story https://www.reddit.com/r/DataHoarder/comments/c2vi3b/2389_never_before_seen_photos_of_ground_zero_in/ern2es1/
[11:21] *** wyatt8740 has joined #archiveteam-bs
[12:02] *** deevious has quit IRC (Quit: deevious)
[12:11] *** martinlig has quit IRC (Quit: Connection closed for inactivity)
[12:42] *** ColdIce has quit IRC (Quit: The Lounge - https://thelounge.chat)
[13:45] *** Tenebrae has quit IRC (Remote host closed the connection)
[13:47] *** Tenebrae has joined #archiveteam-bs
[14:25] *** DogsRNice has joined #archiveteam-bs
[15:11] *** zhongfu_ has joined #archiveteam-bs
[15:17] *** zhongfu has quit IRC (Ping timeout: 615 seconds)
[15:24] https://www.mendeley.com/campaign/about-climate-change someone here wanna go ahead and write a script for grabbing those?
[16:18] *** atbk has quit IRC (Quit: ZNC - https://znc.in)
[16:19] *** atbk has joined #archiveteam-bs
[16:27] *** odemgi_ has quit IRC (Remote host closed the connection)
[16:50] *** Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
[17:15] *** zhongfu_ has quit IRC (Read error: Connection reset by peer)
[17:18] *** zhongfu has joined #archiveteam-bs
[17:37] *** Mateon1 has quit IRC (Remote host closed the connection)
[17:37] *** Mateon1 has joined #archiveteam-bs
[18:00] *** VADemon has joined #archiveteam-bs
[18:20] *** Mateon1 has quit IRC (Read error: Operation timed out)
[18:20] *** Mateon1 has joined #archiveteam-bs
[19:02] Anyone want to take a shot at grabbing the video file?
[19:02] https://commerce.veritone.com/search/asset/18501328
[19:02] commerce.veritone.com’s server IP address could not be found.
[19:03] https://cdnt3mt-a.akamaihd.net/2B1/0DA/FF2/2B10DAFF2_001_xp-wmte.f4v?__gda__=1561158164_f675578419cb54dbe4a01ccaf9ec14de
[19:04] Fusl: https://www.irccloud.com/pastebin/lV0VJguR/
[19:05] Kaz: http://xor.meo.ws/cZqg5z86rfZWsxnGvuWhvOj-w47n5nJY.txt
[19:06] either CF is wrong, or everyone else is
[19:06] hell I actually don't remember who i'm forwarding to atm
[19:09] Kaz: Good one, got it
[19:10] Kaz: veritone.com. requires DNSSEC signing but commerce.veritone.com. points with a CNAME to commerce.pd.dmh.wzplatform.com. which is not DNSSEC signed, cloudflare is correct and everyone else is wrong here
[19:11] https://dnssec-analyzer.verisignlabs.com/commerce.veritone.com
[19:15] Fusl: guess I'm forwarding to google then
[19:38] *** CoolCanuk has joined #archiveteam-bs
[20:27] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[20:31] *** killsushi has joined #archiveteam-bs
[20:34] *** qw3rty119 has joined #archiveteam-bs
[20:39] *** qw3rty118 has quit IRC (Ping timeout: 600 seconds)
[20:50] *** qw3rty119 has quit IRC (Nettalk6 - www.ntalk.de)
[20:59] *** fredgido has quit IRC (Read error: Connection reset by peer)
[20:59] *** fredgido has joined #archiveteam-bs
[21:00] Fusl: I am curious what is up with your /bd/0x5000c... paths. Which filesystem generates such a structure?
[21:06] mergerfs iirc
[21:31] would be nice if archivebot automatically knew "oh, it's a twitter/fb url, let me assign an ignoreset"
[21:32] Sometimes we don't need the ignoreset though
[21:32] s/need/want
[21:33] oh ok
[22:06] SketchCow : 90s rave & jungle cassette tapes : http://artmeetsscience.co.uk/tapes/
[22:13] *** Atom__ has joined #archiveteam-bs
[22:19] *** Atom-- has quit IRC (Read error: Operation timed out)
[22:26] *** fredgido has quit IRC (Remote host closed the connection)
[22:26] *** Fionera has quit IRC (Read error: Connection reset by peer)
[22:26] h3ndr1k: /bd/ means backing device.
it's just an ext4 mounted with the wwn-id of each disk and partition from /dev/disk/by-id/ and mergerfs uses that to stripe it into /data
[22:26] *** yano has quit IRC (Read error: Connection reset by peer)
[22:27] *** Fionera has joined #archiveteam-bs
[22:28] *** fredgido has joined #archiveteam-bs
[22:28] *** yano has joined #archiveteam-bs
[22:28] *** TigerbotH has quit IRC (Read error: Connection reset by peer)
[22:29] *** PotcFdk has quit IRC (Ping timeout: 600 seconds)
[22:30] *** chungone_ has joined #archiveteam-bs
[22:31] *** jspiros_ has quit IRC (Read error: Operation timed out)
[22:31] *** paul2520 has quit IRC (Write error: Broken pipe)
[22:31] *** nightpoo- has quit IRC (Write error: Broken pipe)
[22:31] *** ndiddy has quit IRC (Write error: Broken pipe)
[22:31] *** nightpool has joined #archiveteam-bs
[22:31] *** sep332 has quit IRC (Read error: Operation timed out)
[22:39] *** logchfoo3 starts logging #archiveteam-bs at Fri Jun 21 22:39:21 2019
[22:39] *** logchfoo3 has joined #archiveteam-bs
[22:39] *** abstract has joined #archiveteam-bs
[22:39] *** jodizzle has quit IRC (Ping timeout: 246 seconds)
[22:39] *** Coderjo_ has quit IRC (Ping timeout: 600 seconds)
[22:39] *** nothere has quit IRC (Ping timeout: 600 seconds)
[22:39] *** betamax_ has joined #archiveteam-bs
[22:39] *** anarcat has quit IRC (Ping timeout: 600 seconds)
[22:39] *** ivan has quit IRC (Read error: Operation timed out)
[22:41] *** squires has quit IRC (Ping timeout: 600 seconds)
[22:41] *** asdf0101 has quit IRC (Read error: Operation timed out)
[22:41] *** fredgido has quit IRC (Read error: Operation timed out)
[22:41] *** step has quit IRC (Ping timeout: 600 seconds)
[22:43] *** betamax has quit IRC (Read error: Operation timed out)
[22:44] *** arkiver has joined #archiveteam-bs
[22:45] *** mistym has joined #archiveteam-bs
[22:45] *** dxrt_ has quit IRC (Ping timeout: 600 seconds)
[22:47] *** Fusl has joined #archiveteam-bs
[22:47] *** Fusl_ sets mode: +o Fusl
[22:48] ***
LordNigh2 has joined #archiveteam-bs
[22:48] *** joepie91 has joined #archiveteam-bs
[22:50] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[22:50] *** LordNigh2 is now known as Lord_Nigh
[22:50] *** anarcat has joined #archiveteam-bs
[22:51] *** jodizzle has joined #archiveteam-bs
[22:54] *** Zebranky has joined #archiveteam-bs
[22:55] Fusl: Thanks, interesting. I would have guessed some cluster file system, because it looks quite complicated. But then I'm working with ceph and it never looks that complicated, it's just a single mount point for one cephfs. But maybe some crazy file system :)
[22:56] Never worked with mergerfs though
[23:01] *** kode54 has quit IRC (Quit: The Lounge - https://thelounge.chat)
[23:03] *** kiska1 has joined #archiveteam-bs
[23:03] *** kode54 has joined #archiveteam-bs
[23:03] *** svchfoo3 sets mode: +o kiska1
[23:04] *** paul2520 has joined #archiveteam-bs
[23:06] *** mr_archiv has joined #archiveteam-bs
[23:08] *** GDorn__ has joined #archiveteam-bs
[23:16] *** BlueMax has joined #archiveteam-bs
[23:36] *** c4rc4s has joined #archiveteam-bs
[23:36] *** svchfoo1 has joined #archiveteam-bs
[23:36] *** Fusl sets mode: +o svchfoo1
[23:37] *** ivan has joined #archiveteam-bs
[23:37] *** simon816 has joined #archiveteam-bs
[23:37] *** Fusl sets mode: +o ivan
[23:37] *** asdf0101 has joined #archiveteam-bs
[23:37] *** markedL has joined #archiveteam-bs
[23:37] *** cfarquhar has joined #archiveteam-bs
[23:37] *** JAA has joined #archiveteam-bs
[23:37] *** Fusl sets mode: +o JAA
[23:37] *** AlsoJAA sets mode: +o JAA
[23:58] *** CoolCanuk has quit IRC (Quit: Connection closed for inactivity)