[00:03] <Raccoon> If I were to ping random image IDs on imgur to extrapolate the number of images they're hosting at the end of 2019, can I add that to the Imgur article?  Or are there new technical reasons why this won't be possible?
[00:04] <Flashfire> I mean you might get rate limited
[00:04] <Flashfire> I want to know if they go sequentially or not
[00:04] <Raccoon> lol, they don't.  Unless you can figure out the hashing algorithm and seed they use
[00:06] <Raccoon> it's arguably easier and less risky to just generate a random string and check if it exists before assigning it
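A minimal sketch of the "generate a random string and check if it exists before assigning it" idea Raccoon describes. This is not imgur's actual code: the function name new_image_id and the in-memory set standing in for their database are assumptions made for illustration.

```python
import secrets
import string

# 62 characters: a-z, A-Z, 0-9 (no "_" or "-")
ALPHABET = string.ascii_letters + string.digits

def new_image_id(taken, length=7):
    """Draw random IDs until one is not already in `taken`, then reserve it.

    `taken` is a hypothetical stand-in for whatever store tracks used IDs.
    """
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in taken:
            taken.add(candidate)
            return candidate

taken_ids = set()
print(new_image_id(taken_ids))
```

With the ID space mostly empty, the loop almost always succeeds on the first draw, which is why no hashing algorithm or seed is needed.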
[00:07] *** HashbangI has quit IRC (Read error: Operation timed out)
[00:09] *** HashbangI has joined #archiveteam-bs
[00:26] <amelia386> They didn't directly say it, but an old blog post makes it seem like it is random: https://blog.imgur.com/2013/01/18/more-characters-in-filenames/
[00:55] <Raccoon> Am I correct in assuming that imgur image ids are, and always have been, 7 digits [a-zA-Z0-9]?
[00:55] <Raccoon> no _ or -
[00:56] <amelia386> old ones were 5, newer ones are 7 (when they ran out of 5 char ones)
[00:56] <Raccoon> hmm, ok.
[00:56] <Flashfire> https://imgur.com/gallery/dhtSi
[00:56] <Raccoon> no, that's a gallery ID
[00:56] <Raccoon> NOT an image id
[00:56] <Flashfire> Oh ok sorry
[00:57] <amelia386> https://i.imgur.com/Bw2ppzG.jpg is the actual image
[00:57] <Raccoon> did they in fact issue 5 digit image ids then?
[00:57] <amelia386> And capitalization does matter for them
[00:57] <Raccoon> I guess I could just do two extrapolations and then add them together
[01:00] <Raccoon> 62^5 = 916132832, 62^7 = 3521614606208.  I need to figure out how to determine the margin of error based on the size of the sample pool
[01:00] <Raccoon> Statistics 101 I'm sure
[01:01] <Flashfire> But doesn't 62^7 calculation include the 62^5 combinations?
[01:01] <Raccoon> hmm, yeah
[01:01] <Raccoon> 62^7 - 62^6
[01:02] <Flashfire> 3.4648144e+12
[01:02] <Raccoon> still very close: 3464814370624 and 901356496
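A quick check of the arithmetic above, as a sketch. One point worth noting: a 5-character ID and a 7-character ID can never be the same string, so the two ID spaces do not overlap and nothing needs to be subtracted. The figures 3464814370624 and 901356496 correspond to 62^7 - 62^6 and 62^5 - 62^4. The variable names are only illustrative.

```python
ALPHABET_SIZE = 62  # a-z, A-Z, 0-9

five_char = ALPHABET_SIZE ** 5
seven_char = ALPHABET_SIZE ** 7

print(five_char)               # 916132832
print(seven_char)              # 3521614606208
# Different lengths means disjoint sets, so the combined space is a plain sum.
print(five_char + seven_char)  # 3522530739040

# The subtracted figures quoted in the chat:
print(seven_char - ALPHABET_SIZE ** 6)  # 3464814370624
print(five_char - ALPHABET_SIZE ** 4)   # 901356496
```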
[01:04] <amelia386> 6 years ago it was 1B images: https://web.archive.org/web/20130330220304/http://imgur.com/stats
[01:04] <Flashfire> also #imgone
[01:04] <amelia386> 1M, I can read...
[01:05] <amelia386> So 1M/day for 6 years (and likely more than that since)
[01:06] <Raccoon> that's why I'd like to sample.
[01:08] <Raccoon> basically hit (e.g., 1 million) random image ids, count the 404s vs 200s, extrapolate the statistical odds of that number within the set
[01:09] <Raccoon> 11 1/2 days work for 1 hit per second
[01:10] <Raccoon> ($hits / 1000000 * 3521614606208)
[01:12] <amelia386> if it actually has a uniform distribution over the IDs
[01:12] <Raccoon> wouldn't random sampling be immune
[01:13] <amelia386> not sure
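A rough sketch of the extrapolation Raccoon proposes, ($hits / samples * ID space). No real requests are made here: simulate_hits is a hypothetical stand-in for the 404-vs-200 check, and the 0.001 fill rate is an arbitrary number chosen only to exercise the code. On the uniformity question: as long as the probe IDs themselves are drawn uniformly at random, the hit fraction should be an unbiased estimate of the filled fraction regardless of how the site hands out IDs.

```python
import random

ID_SPACE_7 = 62 ** 7  # number of possible 7-character IDs

def estimate_total(hits, samples, id_space=ID_SPACE_7):
    """Scale the observed hit fraction up to the whole ID space."""
    return hits / samples * id_space

def simulate_hits(true_fraction, samples, rng):
    """Pretend to probe `samples` random IDs; each one exists with
    probability `true_fraction` (an assumed value, not a measured one)."""
    return sum(rng.random() < true_fraction for _ in range(samples))

rng = random.Random(2019)
sample_size = 100_000
hits = simulate_hits(0.001, sample_size, rng)
print(hits, round(estimate_total(hits, sample_size)))
```

The 5-character IDs would need their own sample and their own call with id_space=62 ** 5, with the two estimates added together afterwards.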
[01:22] <Raccoon> margin of error and standard deviations still hurt my brain, even when it's explained for dummies.  https://www.dummies.com/education/math/statistics/how-to-determine-the-minimum-size-needed-for-a-statistical-sample/
[01:22] <Raccoon> Why can't they just explain it for coders.
[01:25] <kiska> Search for 95% and 97% confidence intervals, I guess
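Margin of error "for coders", as a hedged sketch. It uses the textbook normal-approximation interval for a proportion, p ± z * sqrt(p * (1 - p) / n), which is only one of several possible methods and gets unreliable when the number of hits is very small. The function names and the example counts (500 hits in 1,000,000 probes) are made up for illustration.

```python
import math
from statistics import NormalDist

def z_score(confidence):
    """Two-sided z value for a confidence level, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def margin_of_error(hits, samples, confidence=0.95):
    """Half-width of the interval around the observed hit fraction."""
    p = hits / samples
    return z_score(confidence) * math.sqrt(p * (1 - p) / samples)

def estimate_with_interval(hits, samples, id_space=62 ** 7, confidence=0.95):
    """Return (estimate, low, high) for the number of existing IDs."""
    p = hits / samples
    moe = margin_of_error(hits, samples, confidence)
    low = max(0.0, p - moe) * id_space
    high = min(1.0, p + moe) * id_space
    return p * id_space, low, high

# Hypothetical outcome: 500 hits out of 1,000,000 probes.
for level in (0.95, 0.97):
    est, low, high = estimate_with_interval(500, 1_000_000, confidence=level)
    print(f"{level:.0%}: {est:.3e} (between {low:.3e} and {high:.3e})")
```

The interval narrows with the square root of the sample size, so quadrupling the number of probes roughly halves the margin of error.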
[01:25] <Raccoon> >_>
[01:27] *** Stiletto has quit IRC (Ping timeout: 506 seconds)
[01:38] <amelia386> This is over my head; I don't understand stats well enough.
[01:41] *** Stiletto has joined #archiveteam-bs
[01:41] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[01:53] *** Stiletto has quit IRC (Ping timeout: 745 seconds)
[02:00] <SketchCow> home.
[02:01] <Flashfire> Welcome Home Sketch
[02:28] *** Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
[02:48] *** Stiletto has joined #archiveteam-bs
[02:52] *** Stiletto has quit IRC (Remote host closed the connection)
[03:03] *** Flashfloo has quit IRC (Read error: Connection reset by peer)
[03:03] *** kiska has quit IRC (Remote host closed the connection)
[03:03] *** Flashfire has quit IRC (Remote host closed the connection)
[03:04] *** Flashfloo has joined #archiveteam-bs
[03:04] *** kiska has joined #archiveteam-bs
[03:04] *** Fusl sets mode: +o kiska
[03:04] *** Flashfire has joined #archiveteam-bs
[03:29] *** Stiletto has joined #archiveteam-bs
[03:32] *** Stilettoo has joined #archiveteam-bs
[03:40] *** Stiletto has quit IRC (Ping timeout: 506 seconds)
[03:42] *** Stilettoo has quit IRC (Ping timeout: 604 seconds)
[03:56] *** qw3rty111 has joined #archiveteam-bs
[04:01] *** qw3rty119 has quit IRC (Ping timeout: 600 seconds)
[05:30] *** Mateon1 has quit IRC (Remote host closed the connection)
[05:32] *** Mateon1 has joined #archiveteam-bs
[07:44] *** m007a83_ has joined #archiveteam-bs
[07:46] *** fredgido has quit IRC (Read error: Connection reset by peer)
[07:47] *** fredgido has joined #archiveteam-bs
[07:48] *** m007a83 has quit IRC (Ping timeout: 252 seconds)
[08:14] *** m007a83_ has quit IRC (Ping timeout: 252 seconds)
[08:17] *** m007a83 has joined #archiveteam-bs
[08:36] *** deevious has joined #archiveteam-bs
[10:10] *** Shen has quit IRC (Ping timeout: 240 seconds)
[10:27] *** Shen has joined #archiveteam-bs
[10:40] *** fredgido has quit IRC (Remote host closed the connection)
[10:41] *** fredgido has joined #archiveteam-bs
[10:58] *** Dj-Wawa has joined #archiveteam-bs
[12:13] *** VerifiedJ has joined #archiveteam-bs
[12:15] *** fredgido has quit IRC (Read error: Connection reset by peer)
[12:16] *** fredgido has joined #archiveteam-bs
[12:29] *** BlueMax has quit IRC (Quit: Leaving)
[13:05] <JAA> jrwr or SketchCow, can you set  $wgNamespacesWithSubpages[NS_MAIN] = true;  please in the wiki's LocalSettings.php so we get proper subpages which link back to the parent page? (Thanks revi)
[14:52] *** systwi_ has joined #archiveteam-bs
[14:58] *** systwi has quit IRC (Read error: Operation timed out)
[14:59] *** systwi_ is now known as systwi
[16:49] *** Stiletto has joined #archiveteam-bs
[17:00] *** DogsRNice has joined #archiveteam-bs
[17:15] <Fusl> Kenshin: can you pm me your IA address you use for megawarc uploads?
[17:15] <Fusl> email address
[17:16] <Kenshin> don't have an account
[17:16] <Kenshin> i never do any of the uploading myself
[17:16] <Fusl> ah
[17:55] *** yano has quit IRC (WeeChat, The Better IRC Client, https://weechat.org/)
[18:06] *** yano has joined #archiveteam-bs
[18:10] *** schbirid has joined #archiveteam-bs
[18:13] *** zhongfu has quit IRC (Ping timeout: 745 seconds)
[18:27] *** zhongfu has joined #archiveteam-bs
[18:29] *** killsushi has joined #archiveteam-bs
[18:54] *** Pokemonpr has quit IRC (Quit: Page closed)
[20:43] *** fredgido has quit IRC (Read error: Connection reset by peer)
[20:44] *** fredgido has joined #archiveteam-bs
[21:10] *** Gfy has quit IRC (Remote host closed the connection)
[21:15] *** Gfy has joined #archiveteam-bs
[21:24] *** schbirid has quit IRC (Remote host closed the connection)
[21:44] *** dashcloud has quit IRC (Ping timeout: 252 seconds)
[21:49] *** dashcloud has joined #archiveteam-bs
[22:24] *** TC01 has quit IRC (Ping timeout: 745 seconds)
[22:39] *** BlueMax has joined #archiveteam-bs
[22:55] <arkiver> Fusl: what would Kenshin upload?
[23:17] *** Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
[23:45] *** wyatt8740 has quit IRC (Read error: Operation timed out)