[00:39] *** Zerote has quit IRC (Read error: Operation timed out)
[04:57] *** kiska1 has quit IRC (Read error: Operation timed out)
[05:13] *** kiska1 has joined #wikiteam
[06:16] *** Zerote has joined #wikiteam
[07:16] *** Zerote has quit IRC (Read error: Operation timed out)
[07:20] *** Zerote has joined #wikiteam
[14:20] *** Zerote has quit IRC (Ping timeout: 600 seconds)
[15:22] *** Zerote has joined #wikiteam
[20:04] *** phuzion has quit IRC (Remote host closed the connection)
[20:47] *** phuzion has joined #wikiteam
[21:37] I've encountered a weird issue with the dumpgenerator when multiple images exist with the same name but different casing
[21:38] In this case, the wiki has the two files "Alliance Handshake.jpg" and "Alliance handshake.jpg"; it will save the first one, then not save the second, because it has the same name
[21:38] However, if the scraper is paused or disconnects and has to read the list of files, it will only find one of them
[21:40] This means it stops once it can't find "Alliance handshake.jpg" and assumes only ~1500 images were downloaded, when it was more than 10k
[22:20] Wait, I'm dumb, this might just be a problem with Windows
[22:28] That would make sense. NTFS on Windows is case-insensitive but case-preserving by default, i.e. filenames that differ only in case collide, but the case you use when creating a file is preserved.
[22:57] Yeah, just got a couple of Linux VMs running, I'll be redoing the scrapes I did on Windows just to make sure stuff isn't missing because of that
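For anyone wanting to check whether a wiki's image list is affected before scraping on Windows, here is a minimal sketch of spotting names that differ only by case. It is not part of dumpgenerator. The function name `find_case_collisions` and the sample list are made up for illustration, and `str.casefold()` is only an approximation of how NTFS compares names.

```python
from collections import defaultdict


def find_case_collisions(filenames):
    """Return groups of filenames that differ only by letter case.

    Hypothetical helper: casefold() approximates case-insensitive
    matching; NTFS uses its own uppercase table, so edge cases may differ.
    """
    groups = defaultdict(list)
    for name in filenames:
        # Names that fold to the same key would collide on a
        # case-insensitive filesystem.
        groups[name.casefold()].append(name)
    # Keep only keys that map to more than one original name.
    return [names for names in groups.values() if len(names) > 1]


# Sample list using the two filenames from the report above.
titles = ["Alliance Handshake.jpg", "Alliance handshake.jpg", "Banner.png"]
collisions = find_case_collisions(titles)
print(collisions)
# [['Alliance Handshake.jpg', 'Alliance handshake.jpg']]
```

If this returns anything for a wiki's image list, a scrape onto a case-insensitive filesystem would likely end up with fewer files on disk than titles in the list, which matches the resume problem described above.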