[00:35] *** VADemon has quit IRC (Read error: Connection reset by peer)
[01:17] *** newbie12 has joined #archiveteam-bs
[01:18] *** newbie12 is now known as jianaran
[01:33] *** VerfiedJ has quit IRC (Quit: Leaving)
[02:17] *** Despatche has quit IRC (Read error: Connection reset by peer)
[02:17] *** Despatche has joined #archiveteam-bs
[02:34] *** wp494_ has joined #archiveteam-bs
[02:38] *** wp494 has quit IRC (Read error: Operation timed out)
[02:45] *** odemg has quit IRC (Ping timeout: 265 seconds)
[02:45] *** Hani111 has joined #archiveteam-bs
[02:46] *** odemgi_ has quit IRC (Ping timeout: 252 seconds)
[02:52] *** Hani has quit IRC (Ping timeout: 615 seconds)
[02:52] *** Hani111 is now known as Hani
[03:01] *** Sk1d has quit IRC (Read error: Operation timed out)
[03:03] *** Sk1d has joined #archiveteam-bs
[03:09] *** jianaran has quit IRC (Read error: Operation timed out)
[03:09] *** turnkit has quit IRC (Read error: Connection reset by peer)
[03:10] *** turnkit has joined #archiveteam-bs
[03:13] *** kyonko has joined #archiveteam-bs
[03:16] *** systwi has quit IRC (Read error: Connection reset by peer)
[03:17] *** systwi has joined #archiveteam-bs
[03:20] *** Sk1d has quit IRC (Read error: Operation timed out)
[03:24] *** Sk1d has joined #archiveteam-bs
[03:55] *** Sk1d has quit IRC (Read error: Operation timed out)
[03:58] *** Sk1d has joined #archiveteam-bs
[04:04] *** Sk1d has quit IRC (Read error: Operation timed out)
[04:08] *** Sk1d has joined #archiveteam-bs
[04:28] *** odemgi has joined #archiveteam-bs
[04:33] *** qw3rty118 has joined #archiveteam-bs
[04:35] *** odemg has joined #archiveteam-bs
[04:37] *** systwi has quit IRC (Read error: Connection reset by peer)
[04:37] *** systwi_ has joined #archiveteam-bs
[04:37] *** qw3rty117 has quit IRC (Ping timeout: 600 seconds)
[04:37] *** ndiddy has quit IRC ()
[04:39] *** systwi_ is now known as systwi
[04:44] *** systwi has quit IRC (Read error: Connection reset by peer)
[04:44] *** systwi_ has joined #archiveteam-bs
[04:46] *** systwi_ is now known as systwi
[04:59] *** systwi has quit IRC (Read error: Connection reset by peer)
[05:00] *** systwi has joined #archiveteam-bs
[05:13] *** systwi has quit IRC (Read error: Operation timed out)
[05:13] *** HashbangI has quit IRC (Read error: Connection reset by peer)
[05:14] *** systwi has joined #archiveteam-bs
[05:19] *** icedice has quit IRC (Quit: Leaving)
[05:58] *** Odd0002_ has joined #archiveteam-bs
[05:58] *** Odd0002 has quit IRC (Ping timeout: 252 seconds)
[05:59] *** Odd0002_ is now known as Odd0002
[06:04] *** systwi_ has joined #archiveteam-bs
[06:04] *** systwi has quit IRC (Read error: Connection reset by peer)
[06:05] *** ats has quit IRC (Read error: Operation timed out)
[06:06] *** ats has joined #archiveteam-bs
[06:13] *** fuzy802 has joined #archiveteam-bs
[06:14] *** fuzzy8021 has quit IRC (Read error: Operation timed out)
[06:16] *** Sk1d has quit IRC (Read error: Operation timed out)
[06:19] *** Sk1d has joined #archiveteam-bs
[06:23] *** fuzy802 is now known as fuzzy8021
[06:25] *** m007a83 has quit IRC (Read error: Connection reset by peer)
[06:43] *** pawbs|alt has joined #archiveteam-bs
[06:45] *** pawbs has quit IRC (Read error: Operation timed out)
[06:46] *** m007a83 has joined #archiveteam-bs
[06:57] *** Sk1d has quit IRC (Read error: Operation timed out)
[07:00] *** Sk1d has joined #archiveteam-bs
[07:03] *** newbie|2 has joined #archiveteam-bs
[07:04] *** newbie|2 is now known as kyonko2
[07:06] *** kyonko has quit IRC (Ping timeout: 252 seconds)
[07:16] *** Sk1d has quit IRC (Read error: Operation timed out)
[07:18] *** m007a83 has quit IRC (Ping timeout: 252 seconds)
[07:20] *** Sk1d has joined #archiveteam-bs
[07:21] *** m007a83 has joined #archiveteam-bs
[07:36] *** Odd0002 has quit IRC (Read error: Operation timed out)
[07:41] *** newbie12 has joined #archiveteam-bs
[07:41] *** newbie12 is now known as jianaran
[07:42] Is anyone here familiar with internetarchive's warc python package? I'm struggling to write a new record, and would really appreciate some help
[07:44] *** Gfy has quit IRC (Read error: Connection reset by peer)
[07:47] *** newbie|2 has joined #archiveteam-bs
[07:48] *** newbie|2 has quit IRC (Read error: Connection reset by peer)
[07:48] *** newbie|2 has joined #archiveteam-bs
[07:48] *** kyonko2 has quit IRC (Ping timeout: 252 seconds)
[07:50] *** newbie|2 is now known as kyonko2
[07:58] *** Gfy has joined #archiveteam-bs
[08:10] *** Exairnous has quit IRC (Read error: Operation timed out)
[08:15] *** Despatche has quit IRC (Remote host closed the connection)
[08:17] *** Despatche has joined #archiveteam-bs
[08:25] *** brayden has quit IRC (Read error: Operation timed out)
[08:26] jianaran: I'm loosely familiar, what's the problem?
[08:27] Though I should probably mention before you dig too deep that the internetarchive package is quite old, and a lot of people prefer warcio: https://github.com/webrecorder/warcio
[08:27] I think I've solved it now using warcio, but the IA warc package wouldn't write new .warc.gz files without throwing errors
[08:27] Ha, there we go!
[08:27] Ah okay
[08:28] There are also python3 forks of the internetarchive package that are slightly better, but I don't think by much
[08:31] Yeah, I spent a little while trying to make one work before giving up
[08:31] But the good thing is, I now have deduplication working!
[08:36] Is there a good way to extract media from within a WARC? eg, if I want to go through and pull out every jpg to a folder?
[08:43] *** Sk1d has quit IRC (Read error: Operation timed out)
[08:43] *** Despatche has quit IRC (Read error: Connection reset by peer)
[08:46] *** Sk1d has joined #archiveteam-bs
[08:49] *** Despatche has joined #archiveteam-bs
[08:57] *** Despatche has quit IRC (Ping timeout: 360 seconds)
[08:58] *** Despatche has joined #archiveteam-bs
[09:03] *** brayden has joined #archiveteam-bs
[09:03] *** Despatche has quit IRC (Ping timeout: 246 seconds)
[09:09] jianaran: Well, you could write a script to do it with warcio, or you could try using warcat: https://github.com/chfoo/warcat
[09:26] *** BlueMax has quit IRC (Quit: Leaving)
[09:40] *** Jens has quit IRC (Remote host closed the connection)
[09:41] *** Jens has joined #archiveteam-bs
[09:45] *** Despatche has joined #archiveteam-bs
[09:57] *** Despatche has quit IRC (Ping timeout: 600 seconds)
[09:58] *** jianaran has quit IRC (Ping timeout: 612 seconds)
[10:01] *** LFlare has quit IRC (Read error: Operation timed out)
[10:02] *** Despatche has joined #archiveteam-bs
[10:04] *** LFlare has joined #archiveteam-bs
[10:28] *** Despatche has quit IRC (Ping timeout: 612 seconds)
[10:29] *** Despatche has joined #archiveteam-bs
[10:40] *** systwi_ has quit IRC (Give me your HAND, and I'll help you across.)
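Editor's note on the 09:09 suggestion to script the extraction with warcio: the following is a minimal sketch of what such a script might look like, not something posted in the channel. It assumes warcio is installed, that "every jpg" means every HTTP response whose Content-Type starts with image/jpeg, and that a numbered file name is acceptable. The names jpeg_filename and extract_jpegs are hypothetical.

```python
import os
from urllib.parse import urlsplit


def jpeg_filename(uri, index):
    """Build a local file name from a captured URL (hypothetical helper).

    The zero-padded index keeps names unique when two URLs share a basename.
    """
    base = os.path.basename(urlsplit(uri).path) or "unnamed"
    if not base.lower().endswith((".jpg", ".jpeg")):
        base += ".jpg"
    return f"{index:06d}_{base}"


def extract_jpegs(warc_path, out_dir):
    """Write the body of every image/jpeg response in warc_path into out_dir."""
    # Imported lazily so jpeg_filename() can be used without warcio installed.
    from warcio.archiveiterator import ArchiveIterator

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with open(warc_path, "rb") as stream:
        # ArchiveIterator reads both plain .warc and gzipped .warc.gz files.
        for record in ArchiveIterator(stream):
            if record.rec_type != "response" or record.http_headers is None:
                continue
            content_type = record.http_headers.get_header("Content-Type") or ""
            if not content_type.lower().startswith("image/jpeg"):
                continue
            uri = record.rec_headers.get_header("WARC-Target-URI") or ""
            path = os.path.join(out_dir, jpeg_filename(uri, len(saved)))
            with open(path, "wb") as out:
                # content_stream() yields the payload with HTTP encodings undone.
                out.write(record.content_stream().read())
            saved.append(path)
    return saved
```

Usage would be something like extract_jpegs("crawl.warc.gz", "jpegs"), where both names are placeholders. Filtering on Content-Type rather than on the URL extension is a design choice: it also catches JPEGs served from URLs without a .jpg suffix, but it trusts the server's header.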
[10:45] *** Sk1d has quit IRC (Read error: Operation timed out)
[10:48] *** Sk1d has joined #archiveteam-bs
[11:02] *** Raccoon` has joined #archiveteam-bs
[11:03] *** Mateon1 has quit IRC (Read error: Operation timed out)
[11:04] *** Mateon1 has joined #archiveteam-bs
[11:22] *** RichardG_ has quit IRC (Read error: Connection reset by peer)
[11:22] *** RichardG has joined #archiveteam-bs
[11:35] *** wp494 has joined #archiveteam-bs
[11:41] *** wp494_ has quit IRC (Read error: Operation timed out)
[12:35] *** HashbangI has joined #archiveteam-bs
[13:21] *** Sk1d has quit IRC (Read error: Operation timed out)
[13:26] *** Sk1d has joined #archiveteam-bs
[14:36] *** ubahn has joined #archiveteam-bs
[14:56] *** VerfiedJ has joined #archiveteam-bs
[15:51] *** ubahn has quit IRC (ubahn)
[15:51] *** kbtoo_ has joined #archiveteam-bs
[15:53] *** ubahn has joined #archiveteam-bs
[15:59] *** kbtoo has quit IRC (Read error: Operation timed out)
[16:24] *** fredgido_ has joined #archiveteam-bs
[16:25] *** fredgido has quit IRC (Ping timeout: 252 seconds)
[16:49] *** Sk1d has quit IRC (Read error: Operation timed out)
[16:53] *** Sk1d has joined #archiveteam-bs
[16:56] *** schbirid has joined #archiveteam-bs
[16:57] *** Sk1d has quit IRC (Read error: Operation timed out)
[17:00] *** Sk1d has joined #archiveteam-bs
[17:06] *** Despatche has quit IRC (Read error: Operation timed out)
[17:44] *** Despatche has joined #archiveteam-bs
[18:03] *** zirnch has joined #archiveteam-bs
[18:36] *** RichardG_ has joined #archiveteam-bs
[18:37] *** RichardG has quit IRC (Read error: Connection reset by peer)
[18:43] *** ubahn has quit IRC (Quit: ubahn)
[19:09] *** m007a83_ has joined #archiveteam-bs
[19:09] *** m007a83_ has quit IRC (Read error: Connection reset by peer)
[19:12] *** m007a83 has quit IRC (Ping timeout: 252 seconds)
[19:27] *** Exairnous has joined #archiveteam-bs
[19:44] *** zirnch_ has joined #archiveteam-bs
[19:45] *** zirnch has quit IRC (Read error: Connection reset by peer)
[19:47] *** ubahn has joined #archiveteam-bs
[20:20] *** Oddly has joined #archiveteam-bs
[20:35] *** wp494_ has joined #archiveteam-bs
[20:39] *** wp494 has quit IRC (Read error: Operation timed out)
[20:46] *** Sk1d has quit IRC (Read error: Operation timed out)
[20:49] *** ubahn has quit IRC (Quit: ubahn)
[20:49] *** Sk1d has joined #archiveteam-bs
[21:22] *** ivan_ has quit IRC (Leaving)
[21:24] *** ivan_ has joined #archiveteam-bs
[21:28] *** BlueMax has joined #archiveteam-bs
[21:30] *** Oddly has quit IRC (Read error: Operation timed out)
[21:38] *** X-Scale has joined #archiveteam-bs
[21:57] *** schbirid has quit IRC (Remote host closed the connection)
[22:14] *** TC01 has quit IRC (Read error: Operation timed out)
[22:16] *** TC01 has joined #archiveteam-bs
[22:27] i'm capturing a kiss 3d video tape i found at savers
[22:35] Would someone be willing to archive some more snscrapes of Venezuelan accounts? These are mostly either government accounts or accounts of government officials related to the United Socialist Party: https://transfer.sh/dkYfx/venezuela-psuv-twitter.txt https://transfer.sh/Dof8T/venezuela-psuv-facebook.txt https://transfer.sh/8IAdg/venezuela-psuv-instagram.txt
[22:35] Warning that the twitter list is pretty large, around 1.1 million tweets.
[22:36] If someone could also archivebot http://www.psuv.org.ve/, that'd be great
[22:51] *** ndiddy has joined #archiveteam-bs
[23:32] *** Stilett0 has joined #archiveteam-bs
[23:34] *** Stiletto has quit IRC (Ping timeout: 268 seconds)
[23:50] *** kode54 has quit IRC (Quit: ZNC 1.7.1 - https://znc.in)