[00:01] *** Despatche has joined #archiveteam-ot
[00:56] *** godane has quit IRC (Read error: Operation timed out)
[01:06] *** Despatche has quit IRC (Remote host closed the connection)
[01:16] *** Despatche has joined #archiveteam-ot
[01:16] *** Despatche has quit IRC (Read error: Connection reset by peer)
[01:17] *** Despatche has joined #archiveteam-ot
[01:30] *** Despatche has quit IRC (Quit: Connection reset by deer)
[01:57] *** picklefac has joined #archiveteam-ot
[02:07] *** m007a83 has quit IRC (Read error: Connection reset by peer)
[02:10] *** m007a83 has joined #archiveteam-ot
[02:11] *** ola_norsk has joined #archiveteam-ot
[02:13] i've 'tubeupped' a slew of 'twitter cards' from a specific twitter user, where some of it seems to be quite duplicate 'gif memes', and yet, the important ones are actually worth keeping..
[02:13] How would you, (or should i even), solve that?
[02:14] *** terorie has joined #archiveteam-ot
[02:15] Among the actually important video tweets, there's like ~20+ 'twitter card' gif anims of the office facepalm etc..
[02:17] If the gif anims are identical to AI, i wouldn't worry about it... But if not, i feel bad and kind of spammy if i don't fix it
[02:17] IA*
[02:27] ola_norsk, you should look into CBIR (content based image retrieval)
[02:27] *** picklefac has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
[02:28] There are Open Source programs that help you with things such as: I have an original JPEG and a recompressed version of the JPEG given one version find the other.
[02:29] Or even someone edited the photo a-bit.
[02:30] I assume this is what you are doing, you have one GIF and you want to find another and you can't do it with file checksums because it may be degraded because it is a reupload.
[02:31] Note that no solution is 100% so you may want to manually review the results you get from whatever software you end up choosing
[02:33] Another thing to search for would be near duplicate image detection
[02:35] This might do it: http://pdiff.sourceforge.net/
[02:38] *** terorie has quit IRC (Remote host closed the connection)
[02:39] *** terorie has joined #archiveteam-ot
[02:44] *** terorie has quit IRC (Ping timeout: 268 seconds)
[02:45] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[02:57] *** terorie has joined #archiveteam-ot
[03:02] *** terorie has quit IRC (Ping timeout: 268 seconds)
[03:23] *** icedice has quit IRC (Quit: Leaving)
[04:15] *** BlueMax has joined #archiveteam-ot
[04:26] *** odemg has quit IRC (Ping timeout: 615 seconds)
[04:29] *** godane has joined #archiveteam-ot
[05:04] *** Mateon1 has quit IRC (Ping timeout: 268 seconds)
[05:04] *** Mateon1 has joined #archiveteam-ot
[05:16] *** wp494 has quit IRC (Ping timeout: 506 seconds)
[05:16] *** wp494 has joined #archiveteam-ot
[08:21] mr_archiv: any chance i could rely on IA doing the 'near duplicate' stuff?
[08:28] With my 'expertize' http://pdiff.sourceforge.net/ + and 'ia delete' , might just end up all items gone
[08:32] *** ola_norsk has quit IRC (leaving)
[09:03] I am not affiliated with the Internet Archive in any way so I cannot speak on their behalf, however I do not think it is likely that they do this.
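Editorial sketch, not part of the original conversation: the "near duplicate image detection" idea mentioned above can be illustrated with a difference hash (dHash). This is a different and much simpler technique than the perceptual metric used by the pdiff tool linked in the log, and nothing here reflects what the Internet Archive does. The function names (`shrink`, `dhash`, `hamming`, `near_duplicates`) and the `max_distance` threshold are hypothetical. Frames are assumed to be already decoded into 2D lists of grayscale values; decoding real GIFs would need an image library and is not shown.

```python
# Sketch only: near-duplicate detection with a difference hash (dHash).
# Frames are plain 2D lists of grayscale values (0-255). All names and the
# default threshold are illustrative assumptions, not an established API.


def shrink(pixels, width, height):
    """Downsample a 2D grayscale grid to width x height by box averaging."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        y0 = y * src_h // height
        y1 = max(y0 + 1, (y + 1) * src_h // height)
        row = []
        for x in range(width):
            x0 = x * src_w // width
            x1 = max(x0 + 1, (x + 1) * src_w // width)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels, size=8):
    """Return a size*size-bit integer, one bit per left/right neighbour comparison."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def near_duplicates(frames, max_distance=5):
    """Yield (name_a, name_b, distance) for pairs whose hashes are close.

    frames: dict mapping a name to a 2D grayscale grid.
    max_distance: assumed threshold; tune it on your own data.
    """
    hashes = {name: dhash(grid) for name, grid in frames.items()}
    names = sorted(hashes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = hamming(hashes[a], hashes[b])
            if d <= max_distance:
                yield a, b, d
```

As noted in the log, no approach of this kind is 100% reliable, so any pairs it reports are candidates for manual review rather than a list of items that are safe to delete.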
[09:14] *** Mateon1 has quit IRC (west.us.hub irc.Prison.NET)
[09:14] *** Polylith has quit IRC (west.us.hub irc.Prison.NET)
[09:14] *** chirlu has quit IRC (west.us.hub irc.Prison.NET)
[09:14] *** marked has quit IRC (west.us.hub irc.Prison.NET)
[09:16] *** Polylith_ has joined #archiveteam-ot
[09:23] *** chirlu` has joined #archiveteam-ot
[09:25] *** chirlu has joined #archiveteam-ot
[09:25] *** chirlu has quit IRC (Ping timeout: 255 seconds)
[09:25] *** marked has joined #archiveteam-ot
[09:30] *** Mateon1 has joined #archiveteam-ot
[10:08] *** terorie has joined #archiveteam-ot
[10:13] *** terorie has quit IRC (Ping timeout: 268 seconds)
[10:54] *** Despatche has joined #archiveteam-ot
[11:00] *** Despatche has quit IRC (Quit: Connection reset by deer)
[13:06] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[13:21] *** terorie has joined #archiveteam-ot
[13:26] *** terorie has quit IRC (Ping timeout: 268 seconds)
[14:08] *** Oddly has joined #archiveteam-ot
[14:12] *** wp494 has quit IRC (Read error: Operation timed out)
[14:12] *** wp494 has joined #archiveteam-ot
[14:30] *** schbirid has joined #archiveteam-ot
[17:16] *** icedice has joined #archiveteam-ot
[18:26] *** chimyatta has joined #archiveteam-ot
[18:50] *** LFlare has joined #archiveteam-ot
[19:23] *** Oddly has quit IRC (Ping timeout: 255 seconds)
[21:15] *** terorie has joined #archiveteam-ot
[21:48] *** BlueMax has joined #archiveteam-ot
[21:54] *** Despatche has joined #archiveteam-ot
[22:49] *** schbirid has quit IRC (Remote host closed the connection)
[23:13] *** wp494 has quit IRC (Ping timeout: 364 seconds)
[23:14] *** wp494 has joined #archiveteam-ot
[23:20] *** Stiletto has quit IRC (Read error: Operation timed out)
[23:20] *** Stiletto has joined #archiveteam-ot