[00:27] *** BlueMaxim has joined #archiveteam-bs
[00:33] *** VADemon has quit IRC (left4dead)
[00:41] *** Marcelo has joined #archiveteam-bs
[00:44] *** DoomTay has quit IRC (Quit: Page closed)
[01:14] *** zino has quit IRC (Quit: Leaving)
[01:14] *** zino has joined #archiveteam-bs
[01:16] *** schbirid has joined #archiveteam-bs
[01:17] *** schbirid2 has quit IRC (Ping timeout: 244 seconds)
[01:25] *** DoomTay has joined #archiveteam-bs
[02:01] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[02:02] *** Honno has joined #archiveteam-bs
[02:14] hook54321 scan @ 300 DPI grayscale, black and white might seem more ideal, but it will fragment the edges, grayscale seems to provide the best "edges" for OCR
[02:18] grayscale gives ocr much more data to work with
[02:18] you can always turn greyscale into b/w but you can't go the other way
[02:19] I thought greyscale would be harder to work with, having less contrast and all
[02:22] nope, you can filter it to improve the contrast. this is what the scanner does
[02:23] but you aren't adding any information
[02:23] greyscale is better for humans to look at, and ocr engines can deal with it just fine
[02:23] hook54321: if the material is important, 600dpi, but 300 is fine usually
[02:28] To change the topic a bit, anyone mind if I update the screencap on http://www.archiveteam.org/index.php?title=Internet_Archive?
[02:28] i can't imagine anyone objecting
[02:29] It would be a partial picture since the new look is an "endless" page
[02:39] *** Honno has quit IRC (Read error: Operation timed out)
[03:07] *** DoomTay has quit IRC (Quit: Page closed)
[03:07] *** DoomTay has joined #archiveteam-bs
[03:09] Hm... I was about to pack it up, but I got pinged as soon as I closed the page, and now I can't tell what it was for
[03:10] was probably this
[03:10] [23:07:15] -- Notice(purplebot): Alive... OR ARE THEY edited by DoomTay (+161, /* Still Alive */) 1 minute ago -- http://www.archiveteam.org/?diff=26187&oldid=26147
[03:10] Oh
[03:10] you should get yourself an IRC client :)
[03:10] Yeah...
[03:11] archive.fart.website/ seems to loave those out
[03:11] *leave those out
[03:11] the notices? yeah
[03:15] *** DoomTay has quit IRC (Quit: Page closed)
[03:19] *** Whopper_ has joined #archiveteam-bs
[03:23] *** Whopper has quit IRC (Read error: Operation timed out)
[04:30] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:37] *** Sk1d has joined #archiveteam-bs
[05:17] *** REiN^ has quit IRC (Read error: Operation timed out)
[05:33] *** bsmith093 has joined #archiveteam-bs
[05:34] *** RedType has joined #archiveteam-bs
[05:37] *** bsmith093 has quit IRC (Client Quit)
[05:39] *** bsmith093 has joined #archiveteam-bs
[06:03] *** fusl has quit IRC (.)
[06:07] *** fusl has joined #archiveteam-bs
[06:13] HCross: oh oops, may be resolved now
[06:33] *** Marcelo has quit IRC (Ping timeout: 268 seconds)
[06:38] *** davidar_ has joined #archiveteam-bs
[06:40] *** davidar_ is now known as davidar
[07:48] *** RedType has quit IRC (Remote host closed the connection)
[07:48] *** RedType has joined #archiveteam-bs
[08:48] *** BartoCH has joined #archiveteam-bs
[08:57] *** Honno has joined #archiveteam-bs
[09:37] ErkDog, xmc: I was under the impression that grayscale was better for OCR... Especially with fonts that are unusually condensed.
[09:38] Is there any harm in scanning at higher than 600 dpi?
[10:05] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[10:06] *** BartoCH has joined #archiveteam-bs
[10:14] *** tomwsmf has quit IRC (Read error: Operation timed out)
[10:29] *** dashcloud has quit IRC (Ping timeout: 260 seconds)
[10:31] *** dashcloud has joined #archiveteam-bs
[12:17] *** RichardG has joined #archiveteam-bs
[12:25] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[12:25] *** dashcloud has joined #archiveteam-bs
[13:01] *** davidar has quit IRC (Quit: Connection closed for inactivity)
[13:13] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[13:14] *** dashcloud has joined #archiveteam-bs
[13:21] *** dashcloud has quit IRC (Read error: Operation timed out)
[13:24] *** dashcloud has joined #archiveteam-bs
[13:49] *** BlueMaxim has quit IRC (Quit: Leaving)
[14:00] *** REiN^ has joined #archiveteam-bs
[14:31] *** Simpbrain has joined #archiveteam-bs
[14:43] *** dashcloud has quit IRC (Read error: Operation timed out)
[14:51] *** dashcloud has joined #archiveteam-bs
[15:13] https://rol.im/securegoldenkeyboot/
[15:13] lols
[15:34] hook54321: no substantive harm in >600 dpi, but you might pick up patterns in the paper that are distracting to the ocr
[15:58] *** DoomTay has joined #archiveteam-bs
[16:36] *** dashcloud has quit IRC (Read error: Operation timed out)
[16:39] *** dashcloud has joined #archiveteam-bs
[16:50] *** dashcloud has quit IRC (Read error: Operation timed out)
[16:55] *** dashcloud has joined #archiveteam-bs
[16:58] *** dashcloud has quit IRC (Read error: Operation timed out)
[17:01] *** dashcloud has joined #archiveteam-bs
[17:04] *** dashcloud has quit IRC (Read error: Operation timed out)
[17:07] *** dashcloud has joined #archiveteam-bs
[17:46] one of these days I'll start a file-sharing site limited to 64 bytes or less
[17:46] i'll call it femto
[17:47] lol
[17:47] check out that twitter.com website
[17:48] yipdw: i like it
[17:49] hey nobody ever let existing services get in the way of new ones
[17:49] yeah you could use twitter as a backing store
[17:49] it's Synergy
[17:49] twatfs
[17:49] treating twitter as an append-only log
[17:50] perl -e 'print "y" x 64' | base64 -w 0 | wc -c says that it'd take 88 characters, leaving plenty of room for metadata
[17:55] Haha, nice
[17:55] has anyone really tried to retrieve a full account history? I keep getting stymied on pages further in time
[17:55] I mean for like a thousand tweets it works
[17:55] but go further and it's like "hm"
[17:55] twatfs: lossy append-only log-structured filesystem
[17:56] it works if you have a link but,
[17:56] they limit it to 1000 tweets from scrolling down
[17:56] yeah, the web interface is a no-go for it
[17:56] the API might but it's like "hmm"
[17:57] * xmc nod
[17:57] from a twitter employee -
[17:57] n> From the analytics site you can download your full history
[17:57] n> Through the API, I don't think so.
[17:58] nice
[18:00] *** dashcloud has quit IRC (Read error: Operation timed out)
[18:01] [19:55] twatfs: lossy append-only log-structured filesystem
[18:01] so... we've found the replacement for mongodb, then?
[18:01] yeah a lot of contradictions
[18:01] oh snap
[18:01] i haven't used mongodb in a while so I've tried to stop myself from talking about it
[18:02] it's an easy target because it's just broken by design and still ends up at the bottom of every benchmark
[18:02] as far as I know these days it's friggin awesome but I'm over here just like "yeah I like postgres still sry"
[18:02] :p
[18:02] nah, it's not
[18:02] like
[18:02] I argue this topic regularly
[18:02] with people
[18:02] i like postgres & not sry
[18:02] and for the past few months, they've come up with all kinds of
[18:02] "yeah it has this now"
[18:02] yeah, introduced in pg 9.1
[18:02] and this!
[18:02] pg 9.3
[18:02] and this!
[18:02] yeah, pg 9.2
[18:02] lol
[18:03] and it goes on like that for a while until they eventually run out of steam, start making scalability claims again
[18:03] sql lets me reason about my data and ask the computer questions. which i like.
[18:03] and you then point out "nope, still loses data"
[18:03] the most recent documented incident: https://engineering.meteor.com/mongodb-queries-dont-always-return-all-matching-documents-654b6594a827
[18:03] less loss, but still not good
[18:04] ~ eventual consistency ~
[18:04] if you have mongodb cluster isn't that kinda what you expect
[18:04] (that's about 2 months old)
[18:04] oh wait
[18:04] nm
[18:04] yipdw: no, people think it's a magically scalable database
[18:04] that does ~magic~
[18:04] without tradeoffs
[18:04] and you can cluster postgresql now as well with some data consistency tradeoffs, so...
[18:04] :P
[18:04] well
[18:04] shard
[18:04] ref https://www.citusdata.com/blog/2016/03/24/citus-unforks-goes-open-source/
[18:05] oh that's cool
[18:05] *** dashcloud has joined #archiveteam-bs
[18:23] *** DoomTay has quit IRC (Ping timeout: 268 seconds)
[18:59] *** Coderjoe has quit IRC (Read error: Operation timed out)
[19:08] *** Coderjoe has joined #archiveteam-bs
[19:08] *** DoomTay has joined #archiveteam-bs
[19:17] I wonder if our Wiki has a way of seeing how many sites closed suddenly vs, how many had an announcement beforehand
[19:20] *** kristian_ has quit IRC (Leaving)
[19:35] *** MrRadar has quit IRC (Read error: Operation timed out)
[19:37] Is twitter actually dying or are people over reacting?
[19:37] ?
[19:41] i'm at 805k items now
[19:46] people overreacting, which causes Twitter to die
[19:47] life and death of social networks is full of fun causality loops
[19:47] that said twitter is a dumpster fire for a lot of groups, so dying might not be so bad
[19:53] every social media is.
[20:02] i can agree with that^
[20:10] So apparently FossHub was hacked and they put up a fake shutdown notice
[20:10] That's just ridiculously evil. I never thought that would even happen in real life
[20:11] wasn't fosshub the one hacked to distribute MBR-erasing malware?
[20:12] Yup
[20:12] which also feels kinda obsolete and pointlessly evil
[20:13] I think I only dealt with a f*cked up MBR once and I don't even know how it happened
[20:13] *** Stiletto has quit IRC (Ping timeout: 244 seconds)
[20:16] hook54321: what's today's cause of deathj?
[20:16] death*
[20:26] do you guys download single videos?
[20:27] probably will be DMCA'd
[20:27] www.youtube.com/watch?v=eyxgToqzKR0
[20:27] Annemiek van Vleuten in Horror Crash FULL ALTERNATIVE ANGLE | Women's Cycling Rio 2016 - [3m56s] 2016-08-08 - Vineshub - Vine Compilations - 2,700,503 views
[20:31] #archivebot can take care of that easily
[20:33] Why do you think it will be DMCA'd, though?
[20:38] *** RichardG has quit IRC (Read error: Connection reset by peer)
[20:39] *** RichardG has joined #archiveteam-bs
[20:44] *** dashcloud has quit IRC (Ping timeout: 250 seconds)
[20:44] *** dashcloud has joined #archiveteam-bs
[20:45] the international Olympic committee sounds like they are very controlling of their content
[20:46] someone tell me there's a psychological study on the popularity of fail videos
[20:46] it seems like pathological behavior to me
[20:47] I'm only grabbing it because I've lost access to DMCA'd videos
[20:47] oh, there are a few
[20:47] I'm REALLY pissed about a video clearly falling under fair use that the UFC took down
[20:48] analysing Rhonda Roussey or however you spell her name
[20:48] bits of a fight she lost
[20:48] i mention it because there was an interesting serendipity about you posting that and a few days ago when I was talking with a student
[20:48] acrobatics goof-up videos inculcated a kind of fear in the student's mind
[20:49] we had to work through it
[20:49] it only used maybe a few seconds of the actual fight, so fair use was very very justifiable
[20:49] this is part of the pathological behavior I describe; the other part of it is the encouragement of a sick spectator culture that rewards injury and does nothing for actual performance
[20:49] * yipdw end soapbox
[20:51] s/performance/progress and participation/
[20:54] I've been yelled at for not doing things, but when I make up for it the next day or so by spending most of the day getting chores done, I am rewarded with silence
[21:18] *** Stiletto has joined #archiveteam-bs
[21:24] *** wp494 has quit IRC (Read error: Connection reset by peer)
[21:32] *** tomwsmf has joined #archiveteam-bs
[21:33] so i found something interesting when doing my pc magazine grab
[21:33] issue 1989-12-26 is sort of backwards
[21:34] this is so you can see what i mean: https://books.google.com/books?id=-Xr7Ic-ivyMC&printsec=frontcover#v=onepage&q&f=false
[21:34] Oh wow
[21:35] most of the time its not that bad
[21:35] at least the pages look to be good scans
[21:36] some of the 1982 issues are bit screwed up too
[21:36] like have 3 issues per a id
[21:55] *** bauruine has quit IRC (Ping timeout: 260 seconds)
[22:40] *** Honno has quit IRC (Read error: Operation timed out)
[22:42] *** Stiletto has quit IRC (Read error: Operation timed out)
[22:46] http://www.redditstatus.com/
[22:46] 'slight issue', lists reddit.com as operational
[22:47] error rate nearly 100%, request rate dropped to 0
[22:47] status reporting (Y)
[22:48] yep, reddit's down
[22:51] joepie91: #savetwitter was trending last night
[22:52] that doesn't explain why people thought it needed saving
[22:55] twitter had some issues around counting yesterday
[22:56] Like, site malfunctioning?
[22:56] Like what reddit's going through right now?
[22:57] DoomTay: check what Kaz posted
[22:58] No, I meant by twitter having issues, do you mean that the site malfunctioned like reddit is malfunctioning right now?
[23:00] oh
[23:01] dunno
[23:01] Ugh! My school password automatically reset, now I can't archive something unless I somehow get to a school computer and set the password.
[23:02] Your school doesn't allow for resetting the password from home? That's dumb
[23:03] *** Stiletto has joined #archiveteam-bs
[23:03] There is a way to change the password from home, but it appears that it doesn't work after the auto reset near the end of the summer.
[23:04] *** wp494 has joined #archiveteam-bs
[23:05] Ya know how there is a way to log into Windows through a network, right?
[23:07] If I parked in a school parking lot, brought a laptop and connected to the WiFi, could I reset my password somehow?
[23:14] *** SketchCo1 has joined #archiveteam-bs
[23:14] *** swebb sets mode: +o SketchCo1
[23:15] *** tomwsmf has quit IRC (west.us.hub irc.Prison.NET)
[23:15] *** RedType has quit IRC (west.us.hub irc.Prison.NET)
[23:15] *** SketchCow has quit IRC (west.us.hub irc.Prison.NET)
[23:19] *** RedType_ has joined #archiveteam-bs
[23:24] *** fusl has quit IRC (Max SendQ exceeded)
[23:24] *** fusl has joined #archiveteam-bs
[23:24] hook54321: probably not
[23:25] Then I guess my only options now are to ask a teacher to reset it or to sneak into a computer lab real quick.
[23:25] yep
[23:27] How risky is the second option? :/
[23:27] i think that is for you to determine :p
[23:28] Bring a gun; then if they catch you they won't ask about the password
[23:28] that's the most american thing i've heard today
[23:29] At least I know I'm firee
[23:29] I'm pretty sure I would get expelled if I brought a gun... Why wouldn't I want them to know about the password?
[23:29] 🐷
[23:30] https://www.youtube.com/watch?v=Q65KZIqay4E
[23:30] *** kristian_ has joined #archiveteam-bs
[23:31] *** SketchCo1 is now known as SketchCow
[23:33] The elementary school near me has a computer just in the middle of the library, it isn't even locked or anything.
[23:33] 19:29 < hook54321> I'm pretty sure I would get expelled if I brought a gun...
[23:33] And they would never ask about the password
[23:33] You gotta look at the big picture
[23:33] he speaks truth^
[23:34] Why would they care about the password?
[23:34] :P
[23:35] Sobriquet Kael
[23:35] Aug 3 (8 days ago)
[23:35] to archiveteam
[23:35] I honestly don't know how to word an email so as not to sound like a douche.
[23:35] ----
[23:35] (This was the highlight of the inbox.)
[23:39] *** Ravenloft has joined #archiveteam-bs
[23:40] *** DoomTay has quit IRC (Quit: Page closed)
[23:50] https://archive.org/details/densho?and[]=mediatype%3A"movies"&sort=titleSorter
[23:50] Internet Archive has 3,000 of these uploading, more to go
[23:54] what would be the turning point from a text based internet to a image based one
[23:54] or a time window in which the change occurred
[23:55] 2006
[23:55] invention of the gif
[23:55] Wait, definition of those
[23:55] But yes, 1993
[23:55] 1994
[23:56] I was thinking about it the other day, I started using the internet in 1999, so I am just a newcomer compared to some of you, but, on other hand, since it started to get really popular here only a few years late, I was a early adopter in my context
[23:57] also I was 14 in 1999, so there is that
[23:57] anyway
[23:57] I start with IRC mostly
[23:57] *started
[23:57] and did a lot of web surfing with a pentium 133 via 56kbps dial up
[23:57] \m/
[23:58] and now even simple pages are slow to load on relatively modern machines
[23:58] with all the added crap
[23:58] Boy, I spend a lot of time logging into FOS and going "THIS IS A LOT OF STUFF"