[00:04] https://www.flickr.com/photos/52611635@N06/
[00:04] the guys flickr pictures
[00:04] with pictures of tapes
[00:06] SketchCow: btw please send me tapes to help digitizing of them
[00:24] *** TheLovina has joined #archiveteam-bs
[00:25] *** drumstick has quit IRC (Ping timeout: 255 seconds)
[00:35] *** RichardG has quit IRC (Ping timeout: 255 seconds)
[00:40] *** BlueMaxim has joined #archiveteam-bs
[00:42] *** JensRex has quit IRC (Remote host closed the connection)
[00:43] *** JensRex has joined #archiveteam-bs
[01:38] *** Asparagir has quit IRC (Asparagir)
[01:43] *** drumstick has joined #archiveteam-bs
[02:00] *** Honno has joined #archiveteam-bs
[02:02] *** refeed has joined #archiveteam-bs
[02:02] *** _refeed_ has joined #archiveteam-bs
[02:03] *** refeed has quit IRC (Client Quit)
[02:04] *** _refeed_ is now known as refeed
[02:14] *** refeed has quit IRC (Ping timeout: 260 seconds)
[02:23] *** Stilett0 has joined #archiveteam-bs
[02:38] *** Honno has quit IRC (Read error: Operation timed out)
[02:46] *** Asparagir has joined #archiveteam-bs
[02:47] *** svchfoo3 sets mode: +o Asparagir
[02:47] *** svchfoo1 sets mode: +o Asparagir
[02:58] *** refeed has joined #archiveteam-bs
[03:19] *** _refeed_ has joined #archiveteam-bs
[03:19] *** refeed has quit IRC (Read error: Connection reset by peer)
[03:30] *** __refeed_ has joined #archiveteam-bs
[03:30] *** _refeed_ has quit IRC (Read error: Connection reset by peer)
[03:47] *** Stilett0 is now known as Stiletto
[04:01] *** __refeed_ has quit IRC (Read error: Connection reset by peer)
[04:16] *** __refeed_ has joined #archiveteam-bs
[04:24] *** pizzaiolo has quit IRC (Quit: pizzaiolo)
[04:31] *** balrog has quit IRC (Read error: Operation timed out)
[04:32] *** REiN^ has quit IRC (Read error: Operation timed out)
[04:32] *** Mayonaise has quit IRC (Read error: Operation timed out)
[04:32] *** squires has quit IRC (Write error: Broken pipe)
[04:32] *** ruunyan has quit IRC (Read error: Operation timed out)
[04:32] *** C4K3 has quit IRC (Read error: Operation timed out)
[04:33] *** Asparagir has quit IRC (Read error: Operation timed out)
[04:33] *** spacegirl has quit IRC (Read error: Operation timed out)
[04:33] *** Mayonaise has joined #archiveteam-bs
[04:33] *** robogoat has quit IRC (Read error: Operation timed out)
[04:33] *** Odd0002 has quit IRC (Read error: Operation timed out)
[04:33] *** bwn has quit IRC (Read error: Operation timed out)
[04:34] *** __refeed_ has quit IRC (Ping timeout: 260 seconds)
[04:34] *** drumstick has quit IRC (Read error: Operation timed out)
[04:34] *** rocode has quit IRC (Read error: Operation timed out)
[04:34] *** Baljem has quit IRC (Read error: Operation timed out)
[04:34] *** balrog has joined #archiveteam-bs
[04:34] *** swebb sets mode: +o balrog
[04:34] *** svchfoo3 sets mode: +o balrog
[04:35] *** __refeed_ has joined #archiveteam-bs
[04:35] *** robogoat has joined #archiveteam-bs
[04:35] *** htw has quit IRC (Read error: Operation timed out)
[04:36] *** spacegirl has joined #archiveteam-bs
[04:36] *** Odd0002 has joined #archiveteam-bs
[04:36] *** Dimtree has quit IRC (Read error: Operation timed out)
[04:36] *** PotcFdk has quit IRC (Read error: Operation timed out)
[04:37] *** godane has quit IRC (Read error: Operation timed out)
[04:37] *** tfgbd_znc has quit IRC (Read error: Operation timed out)
[04:38] *** drumstick has joined #archiveteam-bs
[04:39] *** robink has quit IRC (Read error: Operation timed out)
[04:40] *** robink has joined #archiveteam-bs
[04:42] *** htw has joined #archiveteam-bs
[04:43] *** __refeed_ has quit IRC (Ping timeout: 260 seconds)
[04:44] *** bwn has joined #archiveteam-bs
[04:48] *** godane has joined #archiveteam-bs
[04:49] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:53] *** REiN^ has joined #archiveteam-bs
[04:53] *** rocode has joined #archiveteam-bs
[04:53] *** ruunyan has joined #archiveteam-bs
[04:56] *** Sk1d has joined #archiveteam-bs
[04:56] *** Sk1d has quit IRC (Connection Closed)
[04:57] *** Sk1d has joined #archiveteam-bs
[04:57] *** C4K3 has joined #archiveteam-bs
[04:59] *** squires has joined #archiveteam-bs
[05:00] *** tfgbd_znc has joined #archiveteam-bs
[05:07] *** PotcFdk has joined #archiveteam-bs
[05:15] *** Dimtree has joined #archiveteam-bs
[05:17] *** Baljem has joined #archiveteam-bs
[05:22] *** what_the_ has quit IRC (Ping timeout: 268 seconds)
[05:29] *** __refeed_ has joined #archiveteam-bs
[06:08] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[06:09] *** etudier has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
[06:18] *** tfgbd_znc has quit IRC (Read error: Connection reset by peer)
[06:19] *** tfgbd_znc has joined #archiveteam-bs
[06:28] *** __refeed_ has quit IRC (Remote host closed the connection)
[07:17] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[07:17] *** yuitimoth has joined #archiveteam-bs
[07:17] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[07:17] *** yuitimoth has joined #archiveteam-bs
[07:18] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[07:18] *** yuitimoth has joined #archiveteam-bs
[07:18] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[07:18] *** yuitimoth has joined #archiveteam-bs
[07:18] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[07:18] *** yuitimoth has joined #archiveteam-bs
[08:24] *** tuluu has quit IRC (Quit: No Ping reply in 180 seconds.)
[08:27] *** tuluu has joined #archiveteam-bs
[08:31] *** Honno has joined #archiveteam-bs
[09:20] *** drumstick has quit IRC (Read error: Operation timed out)
[09:21] *** drumstick has joined #archiveteam-bs
[10:13] Fyi, the maintainer of the Ublock repository (not ublock origin) has been deleting comments, issues, etc, asking about where the funds are going if there isn't any development going on. https://github.com/chrisaljoudi/uBlock
[10:19] Shall we grab https://github.com/chrisaljoudi/uBlock as well (without the code)?
[10:19] (Moved from #archivebot)
[10:20] Looks like the issues are still around, e.g. https://github.com/chrisaljoudi/uBlock/issues/1706
[10:22] Ah yeah, he deleted comments. Hmm
[10:23] one sec
[10:24] I'm holding down the end key on twitter
[10:26] Yeah, this isn't urgent, looks like most of the stuff happened two months ago anyway (including that ticket I linked).
[11:07] *** pizzaiolo has joined #archiveteam-bs
[11:12] *** RichardG has joined #archiveteam-bs
[11:25] *** refeed has joined #archiveteam-bs
[11:26] *** sep332 has quit IRC (Ping timeout: 260 seconds)
[11:30] *** drumstick has quit IRC (Ping timeout: 255 seconds)
[11:31] finally
[11:31] It went from 4 hours ago, to 22 hours ago, to september 5th.
[11:37] *** _refeed_ has joined #archiveteam-bs
[11:48] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[11:48] *** yuitimoth has joined #archiveteam-bs
[11:49] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[11:49] *** yuitimoth has joined #archiveteam-bs
[11:49] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:49] *** yuitimoth has quit IRC (Read error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac)
[11:49] *** yuitimoth has joined #archiveteam-bs
[11:49] *** yuitimoth has quit IRC (Remote host closed the connection)
[11:49] *** yuitimoth has joined #archiveteam-bs
[11:50] !w f3p6sgk7e3f2hxwn0hy1w3xcq
[11:55] *** _refeed_ has quit IRC (Leaving)
[12:02] *** tobbez has joined #archiveteam-bs
[12:04] *** godane has quit IRC (Ping timeout: 260 seconds)
[13:36] *** sep332 has joined #archiveteam-bs
[13:40] *** pizzaiolo has quit IRC (Ping timeout: 245 seconds)
[13:42] *** pizzaiolo has joined #archiveteam-bs
[13:55] hook54321: You're not the only one getting temp-banned from bit.ly. I currently only get 403 replies from them on at least one machine.
[14:03] *** JAA___ has joined #archiveteam-bs
[14:04] *** JAA sets mode: +o JAA___
[14:05] *** JAA has quit IRC (leaving)
[14:08] *** JAA has joined #archiveteam-bs
[14:08] *** swebb sets mode: +o JAA
[14:13] *** JAA___ has quit IRC (Quit: Page closed)
[14:23] *** pikhq has quit IRC (Read error: Operation timed out)
[14:31] SketchCow, We've got people saying things like https://www.reddit.com/r/DataHoarder/comments/704h1g/saw_this_on_another_hoarding_site_first_there/dn0r7p6/ ..what's the status of ia/at getting these tapes? It'd be a shame to let some donut pick them up thinking he could digitise 24k tapes by himself :/ there's no coordination going on so I bet this guy is getting 100s of emails from people wasting his time and
[14:31] ours.
[14:37] odemg: http://archive.fart.website/bin/irclogger_log/archiveteam-bs?date=2017-09-14,Thu&sel=301#l297
[14:37] https://twitter.com/textfiles/status/908432524128456704
[14:38] Thank fuck.
[14:41] *** Mateon1 has quit IRC (Read error: Operation timed out)
[14:41] *** Mateon1 has joined #archiveteam-bs
[15:16] *** VADemon has joined #archiveteam-bs
[15:38] jrwr: why are they rioting
[15:38] So
[15:39] The main headline is "Police Officer Murder Trial: NOT GUILTY"
[15:39] in 2011 a black guy named Anthony Smith Killed after a care chase
[15:39] car*
[15:39] I see
[15:40] so its a black lives matter protest
[15:40] thanks for the info :-)
[16:19] fuckin
[16:19] why can't cops just like ... not kill people
[16:20] it doesn't seem that complicated. i don't kill people. never even once!
[16:25] i killed a centipede yesterday
[16:26] I choose to believe that the spider I washed down in the shower this morning is living a happy life with his alligator buddies.
[16:37] *** _refeed_ has joined #archiveteam-bs
[16:41] *** refeed has quit IRC (Ping timeout: 600 seconds)
[17:26] *** atrocity has quit IRC (Ping timeout: 260 seconds)
[17:49] I'm just gonna place this here, I don't have time to do anything with it right now. It's a list of x.vu URLs.
[17:50] https://gist.github.com/anonymous/e6a10e4adad2db2c453366ecd07c7359
[17:57] *** sun_shine has joined #archiveteam-bs
[18:19] *** RichardG has quit IRC (Read error: Connection reset by peer)
[18:21] *** Asparagir has joined #archiveteam-bs
[18:22] *** svchfoo3 sets mode: +o Asparagir
[18:22] *** svchfoo1 sets mode: +o Asparagir
[18:23] *** dd0a13f37 has joined #archiveteam-bs
[18:24] *** RichardG has joined #archiveteam-bs
[18:28] *** dd0a13f37 has quit IRC (Ping timeout: 268 seconds)
[18:31] *** BartoCH has joined #archiveteam-bs
[18:32] *** dd0a13f37 has joined #archiveteam-bs
[18:33] What is archiveteam's position on archiving stuff obtained by questionable means? For example, would you accept scrapes done with hijacked accounts? AT presumably violates ToS frequently, but do you have any official "guidelines" on what to do/not do?
[18:34] For services that require accounts we generally sign up for one specifically for archiving if that's possible, or ask for people to donate accounts if not
[18:35] But for example paywalled content for #newsgrabber
[18:36] For most sites just browsing with cookies disabled is enough to bypass the paywall, so that's not an issue
[18:36] Well, for some, but some others (svd.se for example) require you to be logged in to an account that in turn needs to be registered with valid and working payment info
[18:37] Those kind of cases I'm not sure about ¯\_(ツ)_/¯
[18:37] Also, there are complete .pdf archives of some newspapers which need premium/subscriber accounts to access, but you can download them as long as you have the URL.
[18:38] So you could log in with tor, get all the URLs, then use much more crude methods to fetch the actual files
[18:40] If you want to archive them, it seems like the least bad method. You could have a volunteer register a trial account with valid payment info, but I don't think it's a brilliant idea to give out your name and address to do what legally speaking probably is copyright infringement of some kind to the people whose copyright you're allegedly infringing
[18:43] Also, the getting of accounts can be done programmatically. You would only need a few to stay clear of ratelimits, and most of the news sites have insecure login forms (doesn't say "invalid login", says "invalid email"/"invalid password")
[18:47] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[18:51] dd0a13f37: usually, getting on people's bad sides depends a lot on who that person is and how they're perceived relative to the rest of society
[18:52] as a rule, pushing for shady means is frowned upon here
[18:52] it's not hard-and-fast and I'm afraid that if you're looking for something you could codify in a program you won't find it here
[18:52] Okay, thanks.
[18:53] Could you tell me the secret word for wiki account creation?
[18:53] I can't remember what it is offhand
[18:53] Well yes, it definitely goes under "shady" by any definition to use hijacked accounts, so that's pretty clear.
[18:56] Okay, thanks. Could you add links to http://libgen.io/libgen/repository_torrent/ http://libgen.io/dbdumps/libgen/ to the page "Library Genesis"?
[18:57] It's 30tb (for the books, much more in papers), and they're not in very good health (the server went down just now for instance)
[19:02] *** pikhq has joined #archiveteam-bs
[19:18] *** lag has joined #archiveteam-bs
[19:29] *** _refeed_ has quit IRC (Read error: Operation timed out)
[19:42] *** godane has joined #archiveteam-bs
[19:52] What are your thoughts on archiving bittorrent DHT? There are some projects that are scraping it already, like btdig, btdb, torrentproject(dead), itorrents(not scraping but has a huge amount of torrents)
[19:53] i've thought about it
[19:53] i have ENTIRELY too many projects, but it's appealing
[19:54] It should be as simple as asking them for a copy, the problem is what to do with the ones that discard torrent files after scraping them
[19:56] Hey astrid, hate to spam, but do you have the wiki secret word? It's not y********s anymore
[19:57] yeah i do, sec
[20:22] *** dd0a13f37 has quit IRC (Ping timeout: 268 seconds)
[20:22] "i have ENTIRELY too many projects, but it's appealing" -- That sounds way too familiar.
[20:23] Hey, you have all figured out my middle name!
[20:26] Speaking of that: does anyone know of any active Reddit archival efforts? I had an idea yesterday...
[20:27] Unfortunately, searching the logs for "reddit" is not particularly helpful.
[20:27] (I'm aware of the comment dump up to 2015, hence why I wrote "active".)
[20:32] *** dd0a13f37 has joined #archiveteam-bs
[20:33] Would it be too out of scope to run an archiveteam project to scrape the DHT? It will definitely be useful for the future, you can find tons of obscure stuff in other p2p networks if you have filenames etcetera, and it's quite "cheap" (4mb gets you one image or hundreds of torrents)
[20:34] As in, not indexing services but getting it straight from the source
[20:35] !ig an2l7kygr2q9ilkuydo3qimq1 ^https?://sputniknews\.com/services/likes/
[20:35] d'oh
[20:43] JAA: Isn't that still updated?
[20:44] dd0a13f37: I haven't seen anything recently updated on IA, at least.
[20:46] Oh right, here it is: http://files.pushshift.io/reddit/comments/
[20:47] oh
[20:47] jackpot https://files.pushshift.io/
[20:47] And on IA: https://archive.org/details/reddit-data-comments
[20:47] Why hasn't anyone come up with a decent solution for Tor on IRC?
[20:47] Sweet. I won't have to do anything then. :-)
[20:48] so looks like my squashfs file is being seen as a wave file for some reason
[20:48] Allowing channel operators to ignore the ban, ask you to solve some captchas and wait a few days, allowing people to login to previously registered accounts, etc
[20:49] run ffplay on it and see what happens
[20:49] #1 and #3 exist, but only on decent IRC networks (i.e. not EFNet).
[20:50] I know that's how freenode does it
[20:50] EVEN WHEN I PUT IT AS A .squashfs
[20:51] still think its a wave file
[20:51] Yes, Freenode belongs to the decent IRC networks.
[20:51] my problem is with this item: https://archive.org/details/slackwarearm-14.2-20170906-kiwix
[20:51] #2 works fine on freenet which is 100% anonymous and extremely slowly moderated AND has a huge spam problem, it's also used by swedish forum flashback. EFnet is nice and decentralized though
[20:52] godane: what's the issue? you can still mount it
[20:52] True, but that also causes a decent amount of issues (e.g. netsplits).
[20:52] its deriving like its a wave file
[20:52] i'm trying to stop deriving
[20:52] deriving?
[20:53] it tries to make a wave file into mp3, flac file
[20:53] What?
[20:53] but its not wave
[20:53] just mount it manually if you're on linux
[20:53] a.org also recognizes it as such, see https://ia601500.us.archive.org/21/items/slackwarearm-14.2-20170906-kiwix/slackwarearm-14.2-20170906-kiwix_files.xml
[20:54] sudo mount -o loop whatever.squashfs /mntpath/
[20:54] i know that
[20:55] my problem is the IA thinking its a wave
[20:55] also i'm trying to stop it from deriving
[20:55] You can download the file though
[20:55] its my file
[20:55] if all else fails use the torrent
[20:56] oh ok i see
[20:56] the .sb suffix is throwing it off apparently
[20:56] according to the derive log https://catalogd.archive.org/log/733622918
[20:56] astrid: but i put squashfs as file name
[20:56] yeah that's weird
[20:57] can you queue a derive with delete of all former derive results?
[20:57] It's not doing anything with the magics?
[20:57] bc it looks like you changed something in the last few minutes
[20:57] but there's no derive scheduled
[20:58] xxd file | head -n 1, does it start with RIFF?
[20:58] i change the end from img to squashfs since .img was still saying its a wave file
[20:59] ah, so you changed the filename? yeah, re-queue a derive and tick the "delete all prior versions" box
[20:59] i delete the derive manually
[21:00] that also works i guess
[21:01] If I want to contact someone for archival efforts, should I ask someone here to do it so it's done "officially" or can I just email them and ask for a DB copy?
[21:01] #2
[21:04] *** etudier has joined #archiveteam-bs
[21:11] https://pastebin.com/GztDCtV3 Is there anything else I should add?
[21:11] *** etudier has quit IRC (Ping timeout: 370 seconds)
[21:33] Might want to explain a little bit about who you are and why you want the info, so they don't think you work for the RIAA or MPAA or something.
[21:37] Too late, already sent it. But I do mention Torrentproject shutting down
[21:38] itorrents is run by a guy in pakistan who writes out his full name and address on whois and also openly runs limetorrents, so I dont think he is worried about MPAA
[21:39] btdigg used to provide an API, so they should understand.
[21:41] MrRadar: rather late, but the Hauppauge products are always reliable- I've used a PVR950q, and before that I used the PVR250 (which was a hardware MPEG2 encoder)
[21:43] wow, the dht is big and it has more search engines than I thought
[21:46] 2-3tb of incompressible data, 13 chinese search engines, torbt, digbt, the 4 ones I already mentioned
[22:09] *** drumstick has joined #archiveteam-bs
[22:20] *** BartoCH has quit IRC (Quit: WeeChat 1.9)
[22:32] *** dd0a13f37 has quit IRC ()
[23:29] *** robink has quit IRC (Ping timeout: 260 seconds)
[23:32] *** robink has joined #archiveteam-bs
[23:46] *** etudier has joined #archiveteam-bs
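
Editorial note on the [20:58] check ("xxd file | head -n 1, does it start with RIFF?"): a minimal sketch of the same magic-byte test in Python, in case it helps anyone hitting the same derive problem. It only looks at the first 12 bytes of a file. The function names `sniff_container` and `sniff_file` are made up for this note, and this is not how archive.org's deriver decides formats (the log above suggests the filename suffix was what mattered there). The facts assumed are that a WAVE file starts with `RIFF`, a 4-byte size, then `WAVE`, and that a little-endian SquashFS image starts with the bytes `hsqs`.

```python
def sniff_container(header: bytes) -> str:
    """Classify a file from its leading bytes (hypothetical helper)."""
    # RIFF container carrying WAVE audio: "RIFF" <4-byte size> "WAVE"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wave"
    # Little-endian SquashFS superblock magic
    if header[:4] == b"hsqs":
        return "squashfs"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 12 bytes of a file on disk and classify them."""
    with open(path, "rb") as fh:
        return sniff_container(fh.read(12))
```

Calling `sniff_file("whatever.squashfs")` on the uploaded image would show whether the content itself looks like a WAVE file or whether only the name is misleading the deriver.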