#archiveteam-bs 2016-11-02,Wed

Time Nickname Message
01:06 πŸ”— Stiletto has quit IRC ()
01:24 πŸ”— powerKitt has quit IRC (Remote host closed the connection)
01:27 πŸ”— dashcloud has joined #archiveteam-bs
01:44 πŸ”— Swizzle has joined #archiveteam-bs
02:03 πŸ”— bsmith093 about fos, i went to throw more things i found into it, and my folder was gone, was it uploaded, or just dumped?
02:04 πŸ”— bsmith093 im "wacko" incase you forgot
02:09 πŸ”— logchfoo2 starts logging #archiveteam-bs at Wed Nov 02 02:09:03 2016
02:09 πŸ”— logchfoo2 has joined #archiveteam-bs
02:10 πŸ”— SketchCow bsmith093: Hi
02:10 πŸ”— SketchCow Are you the one uploading, like, cable TV shows
02:11 πŸ”— bsmith093 soooo, stop doing that?
02:11 πŸ”— bsmith093 k
02:11 πŸ”— SketchCow Yes
02:11 πŸ”— SketchCow I've been deleting them as fast as they come in
02:12 πŸ”— SketchCow They just cause huge headaches for the archive providing basically available TV shows and known properties
02:12 πŸ”— SketchCow Vs., say, all sorts of unusual documents, or VHS rips of long-lost properties, etc.
02:19 πŸ”— bsmith093 has quit IRC (Read error: Connection reset by peer)
02:21 πŸ”— bsmith093 has joined #archiveteam-bs
02:25 πŸ”— Swizzle has quit IRC (Quit: Leaving)
02:27 πŸ”— Sanqui has quit IRC (Ping timeout: 260 seconds)
02:28 πŸ”— Sanqui has joined #archiveteam-bs
02:31 πŸ”— GE has joined #archiveteam-bs
02:40 πŸ”— GE_ has joined #archiveteam-bs
02:42 πŸ”— GE has quit IRC (Ping timeout: 255 seconds)
02:42 πŸ”— GE_ is now known as GE
02:52 πŸ”— GE has quit IRC (Remote host closed the connection)
03:09 πŸ”— Stiletto has joined #archiveteam-bs
03:31 πŸ”— Stiletto has quit IRC ()
04:15 πŸ”— brayden has joined #archiveteam-bs
04:15 πŸ”— swebb sets mode: +o brayden
04:16 πŸ”— brayden has quit IRC (Client Quit)
04:17 πŸ”— brayden has joined #archiveteam-bs
04:17 πŸ”— swebb sets mode: +o brayden
04:25 πŸ”— Stiletto has joined #archiveteam-bs
04:35 πŸ”— godane SketchCow: i mostly go after original recording blocks of tv shows
04:36 πŸ”— godane at least if they're WOC they have something different than the dvd
04:36 πŸ”— godane also the original broadcasts could be different than the dvd ones
05:06 πŸ”— Blackout has quit IRC (Quit: http://www.mibbit.com ajax IRC Client)
05:09 πŸ”— yipdw hmm http://www.csoonline.com/article/3137181/security/google-to-untrust-wosign-and-startcom-certificates.html
05:10 πŸ”— yipdw as an experiment, I removed wosign and startcom from my trust roots
05:10 πŸ”— yipdw it's interesting to see how many HTTPS TLS-related errors you get when you do that
05:10 πŸ”— yipdw (it's quite a few sites)
05:11 πŸ”— yipdw bugzilla.gnome.org is one of them, which caused some finagling on my part to fix that up
05:13 πŸ”— yipdw 2025: Let's Encrypt issues 90% of TLS certificates for HTTPS
05:13 πŸ”— yipdw 2026: "Flintlock" vulnerability punches massive holes in trust chains
05:20 πŸ”— Sk1d has quit IRC (Ping timeout: 194 seconds)
05:26 πŸ”— Sk1d has joined #archiveteam-bs
05:56 πŸ”— is- has quit IRC (Ping timeout: 633 seconds)
06:02 πŸ”— is- has joined #archiveteam-bs
06:28 πŸ”— Coderjoe has joined #archiveteam-bs
06:57 πŸ”— eloos has joined #archiveteam-bs
08:06 πŸ”— brayden_ has joined #archiveteam-bs
08:06 πŸ”— swebb sets mode: +o brayden_
08:09 πŸ”— yipdw has quit IRC (Remote host closed the connection)
08:10 πŸ”— yipdw has joined #archiveteam-bs
08:11 πŸ”— brayden has quit IRC (Read error: Operation timed out)
08:27 πŸ”— GE has joined #archiveteam-bs
08:32 πŸ”— brayden_ has quit IRC (Read error: Operation timed out)
09:21 πŸ”— ravetcofx has quit IRC (Ping timeout: 506 seconds)
09:32 πŸ”— Midas yipdw: don't say that, the internet will burn
09:32 πŸ”— Yoshimura What should I look for when I would like fulltext search of archived pages?
09:47 πŸ”— fie has joined #archiveteam-bs
09:59 πŸ”— Yoshimura has quit IRC (Remote host closed the connection)
10:04 πŸ”— Yoshimura has joined #archiveteam-bs
10:22 πŸ”— Yoshimura tiddlyspace: So the resets seem to happen when redirecting to https.
10:23 πŸ”— Yoshimura From archive.org perspective, will there be a full fetch or just the tiddler data? as else I cannot see how would that be accessible.
10:36 πŸ”— BlueMaxim has quit IRC (Quit: Leaving)
10:37 πŸ”— GE has quit IRC (Remote host closed the connection)
10:49 πŸ”— Yoshimura Can one match only the top page for a domain with CDX search?
10:56 πŸ”— Yoshimura Nevermind, I forgot about filter.
11:00 πŸ”— Yoshimura https://web.archive.org/cdx/search/cdx?url=tiddlyspace.com&collapse=urlkey&fl=original&matchType=domain&filter=original:.*\.tiddlyspace.com(|:[^/]*)/&limit=2000
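
The query above can also be assembled programmatically. What follows is a minimal sketch in Python that uses only the parameters visible in that URL. The helper names `build_cdx_url` and `extract_handles` are hypothetical and not part of any Archive.org client. The handle extraction assumes the response is plain text with one original URL per line, which is what `fl=original` suggests.

```python
import re
from urllib.parse import urlencode, urlsplit

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, limit=2000):
    """Build a CDX query that keeps only top-level pages of subdomains."""
    params = {
        "url": domain,
        "matchType": "domain",   # include every subdomain of `domain`
        "collapse": "urlkey",    # one row per distinct URL key
        "fl": "original",        # return only the original URL column
        # Keep rows whose original URL ends right after the host (and
        # optional port), i.e. the top page of each subdomain.
        "filter": r"original:.*\." + re.escape(domain) + r"(|:[^/]*)/",
        "limit": limit,
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def extract_handles(originals, domain):
    """Return the sorted unique subdomain labels found in `originals`."""
    suffix = "." + domain.lower()
    handles = set()
    for line in originals:
        host = urlsplit(line.strip()).hostname  # lowercased, port removed
        if host and host.endswith(suffix):
            handles.add(host[: -len(suffix)])
    return sorted(handles)


if __name__ == "__main__":
    print(build_cdx_url("tiddlyspace.com"))
```

The sketch makes no network request. Fetching the printed URL and passing the response lines to `extract_handles` would give a deduplicated list comparable to the one linked in the following messages, under the assumptions stated above.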
11:24 πŸ”— Yoshimura 1684 unique tiddlyspace subdomains/handles http://chunk.io/f/38108e5a63614fd588a67b583c78e4aa
11:24 πŸ”— Yoshimura Parsed from archive.org + some other lesser sources.
11:26 πŸ”— Yoshimura Actually, I failed, this one is unique (still 1684): http://chunk.io/f/6be43facd1a34395acce906b1d1d6d22
11:27 πŸ”— Yoshimura And there are some other data apparently mixes, but yeah, here it is, I need brain food.
11:38 πŸ”— brayden has joined #archiveteam-bs
11:38 πŸ”— swebb sets mode: +o brayden
11:52 πŸ”— godane so we are up to 2014 with abc.net.au/news/2014
12:17 πŸ”— GE has joined #archiveteam-bs
12:39 πŸ”— VADemon has joined #archiveteam-bs
13:16 πŸ”— luckcolor hey guys does anyone know the channel for Vine?
13:16 πŸ”— luckcolor (if it's already inpòace ofc)
13:16 πŸ”— luckcolor *inplace
13:17 πŸ”— Whopper #vinewhine
13:18 πŸ”— luckcolor thx
14:52 πŸ”— VADemon_ has joined #archiveteam-bs
14:54 πŸ”— VADemon has quit IRC (Ping timeout: 250 seconds)
15:12 πŸ”— powerKitt has joined #archiveteam-bs
17:01 πŸ”— ravetcofx has joined #archiveteam-bs
17:11 πŸ”— VADemon_ has quit IRC (Quit: left4dead)
17:46 πŸ”— Frogging Huh. I've started to get spam in the email I used to sign up for myvip
17:46 πŸ”— Frogging it was an address used only for that
17:50 πŸ”— Frogging or maybe it was sent from and/or through myvip, because it does mention the website. it's selling vehicle insurance. though it's all in hungarian so it's hard to tell
18:04 πŸ”— powerKitt has quit IRC (Read error: Operation timed out)
18:29 πŸ”— powerKitt has joined #archiveteam-bs
18:36 πŸ”— Medowar0 same. seems to be coming from the company behind myvip, because rdns for myvip and the mailserver are very similar.
18:37 πŸ”— Medowar0 seems like a desperate method to generate some money out of a sinking ship.
18:45 πŸ”— JW_work has quit IRC (Quit: Leaving.)
18:50 πŸ”— JW_work has joined #archiveteam-bs
18:55 πŸ”— JW_work has quit IRC (Quit: Leaving.)
19:03 πŸ”— JW_work has joined #archiveteam-bs
19:15 πŸ”— PurpleSym arkiver: Uploaded public NXP datasheets: https://archive.org/details/nxp-datasheets-11-2016 Can you move the item to collections web and archiveteam?
19:44 πŸ”— Yoshimura PurpleSym: What was the method used and what everything is in it?
19:47 πŸ”— PurpleSym Yoshimura: There’s a site search returning XML. I scraped it and got all unique documents (by Asset_id).
19:49 πŸ”— Yoshimura Oh. You sure you got all? Therefore I need not to do it. Good for me.
19:49 πŸ”— Yoshimura Are those only datasheets or all docs?
19:50 πŸ”— PurpleSym I did not double-check yet, so I’m not sure.
19:50 πŸ”— PurpleSym All documents.
19:50 πŸ”— bsmith093 how do i open a lz archive? i've literally *never* heard of that format.
19:50 πŸ”— PurpleSym lzip.
19:51 πŸ”— PurpleSym Same compression algorithm that xz uses, but simpler container format.
19:51 πŸ”— VADemon has joined #archiveteam-bs
19:53 πŸ”— Yoshimura To be honest, lzip is not best choice for archival
19:55 πŸ”— PurpleSym What do you suggest instead?
19:56 πŸ”— xmc gzip is best
19:56 πŸ”— xmc bzip is pretty good
19:56 πŸ”— xmc .zip is also an a+ choice
19:56 πŸ”— Aranje has joined #archiveteam-bs
19:57 πŸ”— JW_work print out the data in hex, and scan it back in as jpegs
19:57 πŸ”— JW_work That would be a Bad Idea.
19:58 πŸ”— JW_work but could be useful for material one wanted to *really painful* to search for
19:58 πŸ”— JW_work er, wanted to *make* really
20:00 πŸ”— PurpleSym Well, in terms of compression ratio none of these beats LZMA.
20:01 πŸ”— xmc lzma is not very well tested and doesn't have a robust container format
20:01 πŸ”— xmc i mean, petabytes have been gzipped, and we know how it works
20:01 πŸ”— xmc we are archivists, not compression fanatics
20:01 πŸ”— yipdw it's 52 MB of JSON next to a 10 GB warc.gz
20:02 πŸ”— xmc goal #1 is to not destroy things accidentally
20:02 πŸ”— yipdw this is like saying a Veyron Super Sport is faster than an Aventador
20:02 πŸ”— yipdw technically true and nobody gives a shit
20:03 πŸ”— yipdw also I think I miss Top Gear
20:05 πŸ”— xmc aw
20:05 πŸ”— PurpleSym LZMA has been here for quite a while and lzip’s container format is dead simple, just like gzip.
20:06 πŸ”— xmc just use gzip
20:06 πŸ”— xmc it's less confusing to everyone
20:07 πŸ”— yipdw in vine news
20:07 πŸ”— yipdw https://gitlab.peach-bun.com/snippets/40
20:08 πŸ”— yipdw I like how #2 is a backflip
20:09 πŸ”— yipdw well, backflip minus the tuck
20:10 πŸ”— Yoshimura xz is in linux kernel at least.
20:10 πŸ”— Yoshimura But unless the data is very compressible, there is not much point in using xz and not gz.
20:10 πŸ”— xmc we discussed yesterday why xz is not very robust either
20:10 πŸ”— xmc http://www.nongnu.org/lzip/xz_inadequate.html
20:13 πŸ”— yipdw #9 is also a backflip lol
20:13 πŸ”— Yoshimura Yeah. I can argue though that data integrity and error recovery should be handled on storage level either by user or by storage.
20:13 πŸ”— PurpleSym Yoshimura: It’s JSON, so yeah, it compresses well: gzip: 7828510, lzip: 3968256
20:14 πŸ”— Yoshimura PurpleSym: gzip -9?
20:14 πŸ”— PurpleSym --best
20:14 πŸ”— Yoshimura Yeah. But it is a small file, so unless you use that format across the board for a large amount of data, you know.
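
A comparison like the one quoted above can be reproduced on any file. This is a rough sketch only: Python's standard library has no lzip support, so it uses the `lzma` module, which writes the .xz container with the same LZMA family of compression. That is a stand-in and not what PurpleSym ran. The sample records and the `asset_id` field are made up for illustration, and the actual sizes depend entirely on the input.

```python
import gzip
import json
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the byte size of `data` raw, gzipped, and xz-compressed."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data, compresslevel=9)),  # like gzip --best
        "xz": len(lzma.compress(data, preset=9)),           # .xz, not .lz
    }


# Hypothetical, highly repetitive JSON standing in for scraped metadata.
records = [
    {"asset_id": i, "title": f"Document {i}", "type": "datasheet"}
    for i in range(5000)
]
sample = json.dumps(records).encode("utf-8")
sizes = compressed_sizes(sample)

if __name__ == "__main__":
    for name, size in sizes.items():
        print(f"{name:>4}: {size} bytes")
```

Running it against the real file instead of `sample` shows how much the format choice matters for a given item. For a few megabytes next to a multi-gigabyte WARC, the absolute saving is small either way.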
20:16 πŸ”— Yoshimura I know there are strong arguments for gzip, but personally I would store stuff differently, saving a lot of space, a lot of disk time. I cannot talk for the Archive.org scale though and how having the data already gzip compressed helps to serve the data without decompression.
20:17 πŸ”— Yoshimura And not sure how it's handled (curious). If it needs to decompress to verify checksum then you need more compute anyway, but not as much as compression. verifying compressed would be faster.
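
On the checksum question, for gzip specifically: the trailer of a gzip member stores a CRC-32 and the length (modulo 2^32) of the uncompressed data, so checking it does require decompressing. A minimal sketch, assuming a single-member gzip stream; `gzip_trailer` and `verify_gzip` are hypothetical helper names, and how Archive.org actually verifies its items is not covered here.

```python
import gzip
import struct
import zlib


def gzip_trailer(blob: bytes):
    """Read (crc32, isize) from the last 8 bytes of a single-member gzip."""
    crc32, isize = struct.unpack("<II", blob[-8:])  # both little-endian
    return crc32, isize


def verify_gzip(blob: bytes) -> bool:
    """Decompress and compare against the stored CRC-32 and length."""
    crc32, isize = gzip_trailer(blob)
    data = gzip.decompress(blob)  # itself raises on a corrupt stream
    return zlib.crc32(data) == crc32 and len(data) % 2**32 == isize


payload = b'{"example": "record"}\n' * 1000
blob = gzip.compress(payload)

if __name__ == "__main__":
    print(verify_gzip(blob))
```

A checksum of the compressed file itself (for example a hash stored alongside the item) can be checked without decompressing, but that is a separate mechanism from the CRC inside the gzip container.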
20:18 πŸ”— PurpleSym Here, everything you need to know: http://www.nongnu.org/lzip/manual/lzip_manual.html#File-format
20:18 πŸ”— Yoshimura was kicked by xmc (you are not contributing meaningfully)
20:19 πŸ”— powerKitt Nice one, xmc.
20:19 πŸ”— yipdw the hell is going on
20:19 πŸ”— powerKitt Yoshimura being a butt
20:19 πŸ”— yipdw is this seriously all happening over compression formats
20:19 πŸ”— yipdw come on
20:22 πŸ”— yipdw here if you want to be technical, explain to this guy why his vine isn't "lagging" https://vine.co/v/OjaUq3gi60h
20:24 πŸ”— xmc ha
20:24 πŸ”— Aoede Top Gear may be gone, but The Grand Tour is coming in few weeks
20:28 πŸ”— Yoshimura has joined #archiveteam-bs
20:32 πŸ”— Smiley pfft?
20:34 πŸ”— Yoshimura Never knew one has to contribute only in offtopic channel.
20:36 πŸ”— jrwr has joined #archiveteam-bs
20:54 πŸ”— jrwr SketchCow, Or BS it up in here
20:54 πŸ”— jrwr It was a productive chat from what I saw
20:55 πŸ”— xmc it's Another Chat on Ethical Archiving (TM)
20:56 πŸ”— jrwr Its understandable, Does AT have a official stance on that type of data?
20:56 πŸ”— yipdw no
20:57 πŸ”— yipdw my personal stance on it is that it's harmful
20:57 πŸ”— yipdw I don't think humans know how to interpret that sort of data
20:57 πŸ”— yipdw without fucking themselves and everyone around them
20:57 πŸ”— jrwr I think there is a Southpark Ep about this that just came out
20:58 πŸ”— yipdw when you're in a chat you aren't producing text that's carefully massaged for consumption by the public at large
20:58 πŸ”— yipdw *regardless* of technical access controls
20:58 πŸ”— yipdw intent is a tricky thing
20:58 πŸ”— JW_work has quit IRC (Quit: Leaving.)
20:59 πŸ”— JW_work has joined #archiveteam-bs
20:59 πŸ”— Sanqui i agree with yipdw, and i think this is a fundamental flaw in telegram
20:59 πŸ”— jrwr It is, as it just takes one conversation to see how something began
20:59 πŸ”— JW_work has quit IRC (Client Quit)
20:59 πŸ”— Sanqui past messages in a chat should be only accessible by the people who were present at that point at time
20:59 πŸ”— Sanqui IMO
21:00 πŸ”— yipdw I agree
21:00 πŸ”— SketchCow I go back and forth
21:00 πŸ”— Sanqui unless there is a public and loud notification that the chat is logged publicly, like some freenode help channels have
21:00 πŸ”— Sanqui publicpublicpulbic
21:00 πŸ”— jrwr What about IRC logging then? It's very common, but some of my best projects start out in an IRC channel as (Why hasn't anyone done X yet)
21:00 πŸ”— xmc i whip my discourse back and forth
21:00 πŸ”— SketchCow But I DEFINITELY think that a case where you have everyone in some sort of communication and NOBODY has ANY idea it's being recorded for posterity, that's straight up black, not grey-area
21:01 πŸ”— Sanqui personal logging is obviously fine, but if you publish the logs without the approval of the people who are present in it, you aren't being ethical
21:01 πŸ”— yipdw jrwr: depends on channel, intent, and context, and my fallback policy is "don't log it and ask first"
21:01 πŸ”— yipdw obviously i can't control what others do
21:02 πŸ”— Sanqui some of the stuff I've said in "public" chat could fuck me up lol
21:02 πŸ”— yipdw like if it's a known project channel on (say) Freenode I'd expect public logs to show up somewhere
21:02 πŸ”— yipdw I know this channel is publicly logged but I'm writing things in here that I wouldn't write on a Freenode project channel
21:02 πŸ”— yipdw stuff like that
21:03 πŸ”— yipdw it's possible that policy is irrational
21:03 πŸ”— yipdw but I think with (again) Freenode there's a shared, implicit expectation amongst many (if not all) participants that the chat is to be treated more like proceedings of a meeting
21:04 πŸ”— yipdw not so much others
21:04 πŸ”— yipdw other channels/networks that is
21:04 πŸ”— jrwr like the deeper parts of IRC, the private channels of the private project channels
21:08 πŸ”— yipdw one thing I did find interesting was the outrage over warrantless wiretapping vs. the accepted nature of "everyone logs"
21:08 πŸ”— yipdw I wonder if there was a similar outrage in the early days of IRC
21:08 πŸ”— xmc read up on the history of dejanews
21:09 πŸ”— yipdw http://www.antipope.org/charlie/old/rant/dejanews.html <-- ?
21:13 πŸ”— jrwr Ah
21:14 πŸ”— jrwr the old Newsgroups discussions
21:14 πŸ”— jrwr As people, we don't want things taken out of context as it might look bad
21:14 πŸ”— jrwr like a bad 90s joke that if your employer found 10 years later might get you passed up on
21:16 πŸ”— Frogging that'd be silly af
21:17 πŸ”— jrwr I know for a fact that has happened
21:17 πŸ”— jrwr since it was a racist joke
21:17 πŸ”— Frogging of course it was, people got real sensitive in the last while :p
21:18 πŸ”— jrwr I know I have some crazy IRC logs of my username floating around
21:18 πŸ”— jrwr like '04 and back when I was like 11
21:26 πŸ”— Kaz otoh, the fact that a conversation 12 years ago is still available, is pretty cool
21:31 πŸ”— Yoshimura The fact that I had FullHD Evanescence video (which I probably lost) and Vevo has 360p version is the sad truth about copyrighted stuff.
21:47 πŸ”— dashcloud has quit IRC (Ping timeout: 250 seconds)
21:49 πŸ”— VADemon has quit IRC (Read error: Operation timed out)
21:50 πŸ”— dashcloud has joined #archiveteam-bs
22:09 πŸ”— JW_work has joined #archiveteam-bs
22:10 πŸ”— SketchCow Well, a sort of sad truth
22:11 πŸ”— SketchCow Oh, and is this where I tell you an insider told me how basically every public IRC server is being logged, at the server level
22:11 πŸ”— SketchCow All conversations, period
22:13 πŸ”— powerKitt Huh, that sounds interesting. Makes me wonder if irc.mindfang.net (the IRC server for the "Pesterchum" roleplay chat client) is logged at the server level.
22:14 πŸ”— powerKitt Cause if it is, it might be interesting to try and get a copy of logs related to the ARG I'm archiving.
22:15 πŸ”— SketchCow I didn't say the servers knew this was happening
22:16 πŸ”— wp494_ has joined #archiveteam-bs
22:17 πŸ”— jrwr There are addons to the IRC core to do this
22:17 πŸ”— jrwr and all trimmings
22:21 πŸ”— joepie91 SketchCow: seems to fall into the same bucket as "basically every shared hosting server in existence is pwnt in some way at the admin level"
22:21 πŸ”— wp494 has quit IRC (Read error: Operation timed out)
22:21 πŸ”— joepie91 (you just don't see that as a customer, because it's useful to send spoofed UDP as root)
22:25 πŸ”— jrwr the size of IoT DDoS swarms are amazing
22:26 πŸ”— jrwr it endangers sites to the point they may never come back online
22:27 πŸ”— joepie91 that thing that security people have been warning about happening for the past 2 years, happened
22:27 πŸ”— joepie91 news at 11
22:27 πŸ”— joepie91 :P
22:27 πŸ”— joepie91 (more than 2, even)
22:28 πŸ”— joepie91 like, my response to the DDoS stuff was mostly "yeah, that's an IoT botnet, they're fucked"
22:28 πŸ”— joepie91 it's not unexpected at all
22:28 πŸ”— joepie91 have a pile of companies put out shit at the lowest common denominator with no repercussions for fucking up security nor incentives to do it right
22:28 πŸ”— joepie91 attach them all to a network
22:28 πŸ”— joepie91 what do you *think* is going to happen, really
22:29 πŸ”— jrwr maybe it was the plan all along
22:33 πŸ”— dashcloud has quit IRC (Remote host closed the connection)
22:34 πŸ”— dashcloud has joined #archiveteam-bs
22:35 πŸ”— Yoshimura That is why the serious have multilevel routers with BGP blackholing, FPGA filtering, and finetunable end of the line software filtering.
22:36 πŸ”— jrwr Good ol OVH, their DDoS Protection is not half bad
22:37 πŸ”— Yoshimura xmc Thank you for the link to the xz scrutiny. That plus related materials gave me more insight. My concern is data size for individuals, that do not possess resources of accumulated wealth by community donations/work. But which can play a big role. Meanwhile cost of data often means cost of transfer, but I do not know the insides.
23:02 πŸ”— Ravenloft has joined #archiveteam-bs
23:05 πŸ”— GE has quit IRC (Remote host closed the connection)
23:05 πŸ”— Swizzle has joined #archiveteam-bs
23:10 πŸ”— wp494_ is now known as wp494
23:27 πŸ”— Swizzle has quit IRC (Read error: Operation timed out)
23:33 πŸ”— Frogging all the world's problems would be solved if only people were as smart as you, eh Yoshimura
23:37 πŸ”— BlueMaxim has joined #archiveteam-bs
23:40 πŸ”— Sanqui why so abrasive
23:53 πŸ”— powerKitt has quit IRC (Remote host closed the connection)
23:56 πŸ”— ndiddy has joined #archiveteam-bs
