Time |
Nickname |
Message |
00:03
|
Raccoon |
If I were to ping random image IDs on imgur to extrapolate the number of images they're hosting at the end of 2019, can I add that to the Imgur article? Or are there new technical reasons why this won't be possible to do? |
00:04
|
Flashfire |
I mean you might get rate limited |
00:04
|
Flashfire |
I want to know if they go sequentially or not |
00:04
|
Raccoon |
lol, they don't. Unless you can figure out the hashing algorithm and seed they use |
00:06
|
Raccoon |
it's arguably easier and less risky to just generate a random string and check if it exists before assigning it |
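A minimal sketch of the generate-and-check approach Raccoon describes, assuming a 62-character alphabet and 7-character IDs (both are guesses made later in this log, not confirmed by Imgur). The `new_id` helper and the in-memory `existing` set are hypothetical stand-ins for whatever storage a real service would use.

```python
import secrets
import string

# Assumption: IDs use [a-zA-Z0-9], i.e. 62 possible characters per position.
ALPHABET = string.ascii_letters + string.digits


def new_id(existing: set, length: int = 7) -> str:
    """Draw random IDs until an unused one turns up, then reserve it."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in existing:
            existing.add(candidate)  # reserve it so later calls cannot reuse it
            return candidate


issued = set()
first = new_id(issued)
second = new_id(issued)
```

If only a small fraction of the keyspace is occupied, the loop almost always exits on the first draw, which is why this can be simpler than a sequential counter run through a keyed hash.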
00:07
|
|
HashbangI has quit IRC (Read error: Operation timed out) |
00:09
|
|
HashbangI has joined #archiveteam-bs |
00:26
|
amelia386 |
They didn't directly say it, but an old blog post makes it seem like it is random: https://blog.imgur.com/2013/01/18/more-characters-in-filenames/ |
00:55
|
Raccoon |
Am I correct in assuming that imgur image ids are, and always have been, 7 digits [a-zA-Z0-9] |
00:55
|
Raccoon |
no _ or - |
00:56
|
amelia386 |
old ones were 5, newer ones are 7 (when they ran out of 5 char ones) |
00:56
|
Raccoon |
hmm, ok. |
00:56
|
Flashfire |
https://imgur.com/gallery/dhtSi |
00:56
|
Raccoon |
no, that's a gallery ID |
00:56
|
Raccoon |
NOT an image id |
00:56
|
Flashfire |
Oh ok sorry |
00:57
|
amelia386 |
https://i.imgur.com/Bw2ppzG.jpg is the actual image |
00:57
|
Raccoon |
did they in fact issue 5 digit image ids then? |
00:57
|
amelia386 |
And capitalization does matter for them |
00:57
|
Raccoon |
I guess I could just do two extrapolations and then add them together |
01:00
|
Raccoon |
62^5 = 916132832, 62^7 = 3521614606208. I need to figure out how to determine the margin of error based on the size of the sample pool |
01:00
|
Raccoon |
Statistics 101, I'm sure |
01:01
|
Flashfire |
But doesn't the 62^7 calculation include the 62^5 combinations? |
01:01
|
Raccoon |
hmm, yeah |
01:01
|
Raccoon |
62^7 - 62^6 |
01:02
|
Flashfire |
3.4648144e+12 |
01:02
|
Raccoon |
still very close: 3464814370624 and 901356496 |
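The figures quoted above can be reproduced with a few lines of integer arithmetic. This sketch assumes the 62-character alphabet guessed in the chat; `keyspace` and `net_of_shorter` are illustrative names. Note that fixed-length strings of different lengths never collide, so 62^7 already counts only 7-character IDs. The subtraction is kept here only to match the numbers in the log, and it shifts the total by 1/62, or about 1.6%.

```python
ALPHABET_SIZE = 62  # assumption: [a-zA-Z0-9]


def keyspace(length: int) -> int:
    """Number of distinct IDs of exactly `length` characters."""
    return ALPHABET_SIZE ** length


def net_of_shorter(length: int) -> int:
    """The subtraction used in the chat: 62^n minus 62^(n-1)."""
    return keyspace(length) - keyspace(length - 1)


print(keyspace(5))        # 916132832
print(keyspace(7))        # 3521614606208
print(net_of_shorter(7))  # 3464814370624
print(net_of_shorter(5))  # 901356496
```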
01:04
|
amelia386 |
6 years ago it was 1B images: https://web.archive.org/web/20130330220304/http://imgur.com/stats |
01:04
|
Flashfire |
also #imgone |
01:04
|
amelia386 |
1M, I can read... |
01:05
|
amelia386 |
So 1M/day for 6 years (and likely more than that since) |
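As a rough back-of-the-envelope check on the 1M/day figure, the sketch below works out the implied upload count and how sparsely it would fill the keyspace. It assumes a constant rate, no deletions, 365-day years, and that every upload lands in the 7-character [a-zA-Z0-9] space. None of that is confirmed, so treat the result as an order-of-magnitude guide only.

```python
UPLOADS_PER_DAY = 1_000_000  # rate quoted in the chat from the archived stats page
YEARS = 6
KEYSPACE_7 = 62 ** 7         # assumption: 7-character [a-zA-Z0-9] IDs

uploads = UPLOADS_PER_DAY * 365 * YEARS           # ignores leap days
occupancy = uploads / KEYSPACE_7                  # fraction of IDs in use
expected_hits_per_million = occupancy * 1_000_000

print(uploads)                           # 2190000000
print(round(expected_hits_per_million))  # 622
```

Under these assumptions only about 0.06% of random IDs would resolve, so a sample of one million random IDs would be expected to find a few hundred images.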
01:06
|
Raccoon |
that's why I'd like to sample. |
01:08
|
Raccoon |
basically hit (e.g., 1 million) random image ids, count the 404s vs 200s, extrapolate the statistical odds of that number within the set |
01:09
|
Raccoon |
11 1/2 days work for 1 hit per second |
01:10
|
Raccoon |
($hits / 1000000 * 3521614606208) |
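Raccoon's formula and the 11 1/2 day figure, written out as a sketch. The function names are illustrative, the keyspace assumes 7-character [a-zA-Z0-9] IDs, and the 600 hits in the usage lines is a made-up number, not a measurement.

```python
KEYSPACE_7 = 62 ** 7  # 3521614606208, assuming 7-character [a-zA-Z0-9] IDs


def estimate_total(hits: int, samples: int, keyspace: int = KEYSPACE_7) -> float:
    """Scale the observed hit fraction up to the whole keyspace."""
    return hits / samples * keyspace


def sampling_days(samples: int, rate_per_second: float = 1.0) -> float:
    """Wall-clock days needed to send `samples` requests at a fixed rate."""
    return samples / rate_per_second / 86_400


days = sampling_days(1_000_000)            # roughly 11.6 days at 1 request/second
guess = estimate_total(600, 1_000_000)     # hypothetical: 600 of 1M probes return 200
```

With the hypothetical 600 hits, the estimate comes out at about 2.1 billion images. The point estimate is unbiased as long as the probed IDs are drawn uniformly at random, whatever the distribution of the issued IDs.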
01:12
|
amelia386 |
if it actually has a uniform distribution over the IDs |
01:12
|
Raccoon |
wouldn't random sampling be immune |
01:13
|
amelia386 |
not sure |
01:22
|
Raccoon |
margin of error and standard deviations still hurt my brain, even when it's explained for dummies. https://www.dummies.com/education/math/statistics/how-to-determine-the-minimum-size-needed-for-a-statistical-sample/ |
01:22
|
Raccoon |
Why can't they just explain it for coders. |
01:25
|
kiska |
Search for 95% and 97% confidence intervals, I guess |
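For the "explain it for coders" request, here is one possible sketch of a 95% confidence interval on the hit fraction, scaled up to an image count. It uses the Wilson score interval, which is a choice made for this sketch because it behaves better than the plain normal approximation when hits are rare. Nobody in the chat proposed it. The 600 hits is again a made-up input, and `z = 1.96` is the standard value for 95%.

```python
import math

KEYSPACE_7 = 62 ** 7  # assumption: 7-character [a-zA-Z0-9] IDs


def wilson_interval(hits: int, samples: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    p = hits / samples
    denom = 1 + z * z / samples
    center = (p + z * z / (2 * samples)) / denom
    half = z * math.sqrt(p * (1 - p) / samples + z * z / (4 * samples ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def scale(interval, keyspace: int = KEYSPACE_7):
    """Turn a (low, high) fraction into a (low, high) image count."""
    low, high = interval
    return low * keyspace, high * keyspace


low, high = wilson_interval(600, 1_000_000)  # hypothetical 600 hits in 1M probes
low_images, high_images = scale((low, high))
```

A handy rule of thumb for rare hits: the relative margin of error at 95% is roughly 1.96 / sqrt(hits), so 600 hits gives about plus or minus 8%, and it takes about four times as many hits to halve that.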
01:25
|
Raccoon |
>_> |
01:27
|
|
Stiletto has quit IRC (Ping timeout: 506 seconds) |
01:38
|
amelia386 |
This is over my head, given how little I understand stats. |
01:41
|
|
Stiletto has joined #archiveteam-bs |
01:41
|
|
DogsRNice has quit IRC (Read error: Connection reset by peer) |
01:53
|
|
Stiletto has quit IRC (Ping timeout: 745 seconds) |
02:00
|
SketchCow |
home. |
02:01
|
Flashfire |
Welcome Home Sketch |
02:28
|
|
Dj-Wawa has quit IRC (Quit: Connection closed for inactivity) |
02:48
|
|
Stiletto has joined #archiveteam-bs |
02:52
|
|
Stiletto has quit IRC (Remote host closed the connection) |
03:03
|
|
Flashfloo has quit IRC (Read error: Connection reset by peer) |
03:03
|
|
kiska has quit IRC (Remote host closed the connection) |
03:03
|
|
Flashfire has quit IRC (Remote host closed the connection) |
03:04
|
|
Flashfloo has joined #archiveteam-bs |
03:04
|
|
kiska has joined #archiveteam-bs |
03:04
|
|
Fusl sets mode: +o kiska |
03:04
|
|
Flashfire has joined #archiveteam-bs |
03:29
|
|
Stiletto has joined #archiveteam-bs |
03:32
|
|
Stilettoo has joined #archiveteam-bs |
03:40
|
|
Stiletto has quit IRC (Ping timeout: 506 seconds) |
03:42
|
|
Stilettoo has quit IRC (Ping timeout: 604 seconds) |
03:56
|
|
qw3rty111 has joined #archiveteam-bs |
04:01
|
|
qw3rty119 has quit IRC (Ping timeout: 600 seconds) |
05:30
|
|
Mateon1 has quit IRC (Remote host closed the connection) |
05:32
|
|
Mateon1 has joined #archiveteam-bs |
07:44
|
|
m007a83_ has joined #archiveteam-bs |
07:46
|
|
fredgido has quit IRC (Read error: Connection reset by peer) |
07:47
|
|
fredgido has joined #archiveteam-bs |
07:48
|
|
m007a83 has quit IRC (Ping timeout: 252 seconds) |
08:14
|
|
m007a83_ has quit IRC (Ping timeout: 252 seconds) |
08:17
|
|
m007a83 has joined #archiveteam-bs |
08:36
|
|
deevious has joined #archiveteam-bs |
10:10
|
|
Shen has quit IRC (Ping timeout: 240 seconds) |
10:27
|
|
Shen has joined #archiveteam-bs |
10:40
|
|
fredgido has quit IRC (Remote host closed the connection) |
10:41
|
|
fredgido has joined #archiveteam-bs |
10:58
|
|
Dj-Wawa has joined #archiveteam-bs |
12:13
|
|
VerifiedJ has joined #archiveteam-bs |
12:15
|
|
fredgido has quit IRC (Read error: Connection reset by peer) |
12:16
|
|
fredgido has joined #archiveteam-bs |
12:29
|
|
BlueMax has quit IRC (Quit: Leaving) |
13:05
|
JAA |
jrwr or SketchCow, can you set $wgNamespacesWithSubpages[NS_MAIN] = true; please in the wiki's LocalSettings.php so we get proper subpages which link back to the parent page? (Thanks revi) |
14:52
|
|
systwi_ has joined #archiveteam-bs |
14:58
|
|
systwi has quit IRC (Read error: Operation timed out) |
14:59
|
|
systwi_ is now known as systwi |
16:49
|
|
Stiletto has joined #archiveteam-bs |
17:00
|
|
DogsRNice has joined #archiveteam-bs |
17:15
|
Fusl |
Kenshin: can you pm me your IA address you use for megawarc uploads? |
17:15
|
Fusl |
email address |
17:16
|
Kenshin |
don't have an account |
17:16
|
Kenshin |
i never do any of the uploading myself |
17:16
|
Fusl |
ah |
17:55
|
|
yano has quit IRC (WeeChat, The Better IRC Client, https://weechat.org/) |
18:06
|
|
yano has joined #archiveteam-bs |
18:10
|
|
schbirid has joined #archiveteam-bs |
18:13
|
|
zhongfu has quit IRC (Ping timeout: 745 seconds) |
18:27
|
|
zhongfu has joined #archiveteam-bs |
18:29
|
|
killsushi has joined #archiveteam-bs |
18:54
|
|
Pokemonpr has quit IRC (Quit: Page closed) |
20:43
|
|
fredgido has quit IRC (Read error: Connection reset by peer) |
20:44
|
|
fredgido has joined #archiveteam-bs |
21:10
|
|
Gfy has quit IRC (Remote host closed the connection) |
21:15
|
|
Gfy has joined #archiveteam-bs |
21:24
|
|
schbirid has quit IRC (Remote host closed the connection) |
21:44
|
|
dashcloud has quit IRC (Ping timeout: 252 seconds) |
21:49
|
|
dashcloud has joined #archiveteam-bs |
22:24
|
|
TC01 has quit IRC (Ping timeout: 745 seconds) |
22:39
|
|
BlueMax has joined #archiveteam-bs |
22:55
|
arkiver |
Fusl: what would Kenshin upload? |
23:17
|
|
Dj-Wawa has quit IRC (Quit: Connection closed for inactivity) |
23:45
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |