[00:40] *** Jonimus has quit IRC (ircd.shaw.ca irc.shaw.ca)
[00:40] *** rduser has quit IRC (ircd.shaw.ca irc.shaw.ca)
[00:52] *** mistym has quit IRC (Remote host closed the connection)
[00:57] *** oldcad has quit IRC (Quit: Leaving.)
[01:00] *** Asparagir has quit IRC (Asparagir)
[01:02] *** Asparagir has joined #archiveteam-bs
[01:09] *** rduser has joined #archiveteam-bs
[01:26] *** pikhq has joined #archiveteam-bs
[01:33] *** Asparagir has quit IRC (Asparagir)
[01:35] *** rduser has quit IRC (ircd.shaw.ca irc.shaw.ca)
[01:59] *** xtr-201 has quit IRC (Read error: Operation timed out)
[02:04] *** rduser has joined #archiveteam-bs
[02:07] *** schbirid2 has joined #archiveteam-bs
[02:08] *** Stiletto has joined #archiveteam-bs
[02:08] *** rduser has quit IRC (ircd.shaw.ca irc.shaw.ca)
[02:08] *** schbirid has quit IRC (Read error: Operation timed out)
[02:36] *** rduser has joined #archiveteam-bs
[02:38] *** kyan has joined #archiveteam-bs
[02:44] *** mistym has joined #archiveteam-bs
[02:53] *** kniffy has quit IRC (Ping timeout: 252 seconds)
[02:54] *** ripvanwin has quit IRC (Read error: Operation timed out)
[02:56] *** dashcloud has quit IRC (Read error: Operation timed out)
[03:02] *** dashcloud has joined #archiveteam-bs
[03:07] *** primus104 has quit IRC (Leaving.)
[03:34] *** xf2e has quit IRC (Remote host closed the connection)
[03:59] *** xtr-201 has joined #archiveteam-bs
[04:23] *** aaaaaaaaa has quit IRC (Leaving)
[04:44] *** ripvanwin has joined #archiveteam-bs
[04:47] *** mistym has quit IRC (Remote host closed the connection)
[04:54] *** john1 has joined #archiveteam-bs
[04:55] I'd like to be able to scrape youtube links off of a web page. Does a script for this already exist, that you know of?
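Editor's note: nobody in the channel answers the [04:55] question, so what follows is only a minimal sketch of one possible approach, not a script anyone in the log mentions. It uses only the Python standard library. The host list, the class name YouTubeLinkParser, and the function name extract_youtube_links are all assumptions made for illustration, and the code only sees links present in the static HTML (href and src attributes), not ones injected by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of hostnames that count as "YouTube"; extend as needed.
YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "m.youtube.com",
    "youtu.be",
    "www.youtube-nocookie.com",
}


class YouTubeLinkParser(HTMLParser):
    """Collects href/src attribute values that point at a YouTube host."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            # Resolve relative and protocol-relative URLs against the page URL.
            url = urljoin(self.base_url, value)
            host = (urlparse(url).hostname or "").lower()
            # Keep first-seen order and skip duplicates.
            if host in YOUTUBE_HOSTS and url not in self.links:
                self.links.append(url)


def extract_youtube_links(html, base_url=""):
    """Return YouTube URLs found in an HTML string, in document order."""
    parser = YouTubeLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A caller would fetch the page by whatever means it prefers, pass the HTML text and the page URL to extract_youtube_links, and print or save the returned list.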
[05:04] *** vitzli has joined #archiveteam-bs
[05:10] *** mistym has joined #archiveteam-bs
[05:18] *** dashcloud has quit IRC (Read error: Operation timed out)
[05:19] *** rduser has quit IRC (ircd.shaw.ca irc.shaw.ca)
[05:23] *** dashcloud has joined #archiveteam-bs
[05:27] *** rduser` has joined #archiveteam-bs
[05:35] *** rduser` is now known as rduser
[06:03] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[06:05] For large amounts of storage, do you recommend cheaper drives with more duplication and maintenance, or more reliable ones?
[06:11] *** dashcloud has joined #archiveteam-bs
[06:13] *** BlueMaxim has quit IRC (Ping timeout: 306 seconds)
[06:14] *** BlueMaxim has joined #archiveteam-bs
[06:24] *** superkuh_ has quit IRC (Quit: the neuronal action potential is an electrical manipulation of reversible abrupt phase changes in the lipid bilaye)
[07:18] http://www.the-master-list.com/
[07:23] schbirid2: Is that from the nineties? Has it been updated recently?
[07:23] updated 2014 :D
[07:23] Nice.
[07:23] You want it archived?
[07:23] *** BlueMaxim has quit IRC (Quit: Leaving)
[07:27] dont care, just thought it was funny
[07:31] By the way, you know there's a modern geocities clone?
[07:32] https://neocities.org/browse
[07:33] yeah
[07:36] I'm not sure the old geocities had bandwidth limits though. :/
[07:39] oh it did
[07:40] All right.
[07:41] Is the Archive Team prepared to face legal challenges? Have they done so before?
[07:46] that can mean many things, no one could answer that
[07:47] On issues of intellectual property, to be more specific.
[07:47] what is your threat model
[07:49] xmc: Basically, copyright trolls and assholes. For example, one might claim that their website is their intellectual property, and that you have no right to redistribute it. Though, I do believe this argument is no longer up for judicial debate.
[07:49] it's never come up and i doubt it will
[07:50] I'm also wondering how DMCA requests will be handled.
[07:50] archive.org's job, not ours
[07:50] we dont care about copyright etc
[07:51] All right.
[07:52] be aware that if you upload to IA, your mail address is available for the world to see though
[07:53] An email address doesn't necessarily identify a person though.
[07:54] Of course, not using an email address you read may also deprive you of chances to defend yourself.
[07:54] But I really don't care that much.
[07:55] I'm a copyright antagonist myself, but I ask these questions because I don't want to give pirated data to organizations that avoid it.
[08:00] *** kniffy has joined #archiveteam-bs
[08:01] look at recent uploads at IA 8)
[08:02] Yeah, I know.
[08:02] *** mistym has quit IRC (Remote host closed the connection)
[08:03] The best part is, when they receive a request, they don't even delete it. They just store it indefinitely, which I believe they are allowed to do due to their legal library status.
[08:03] Though, I am curious how that stuff doesn't get lost, but that's a question for them I suppose.
[08:05] *** mistym has joined #archiveteam-bs
[08:06] What do you want to upload?
[08:06] Nothing right now. Just thinking ahead of myself.
[08:09] Though, I wouldn't upload anything there specifically for piracy.
[08:10] *** mistym has quit IRC (Remote host closed the connection)
[08:45] Hehehe, Legal challenges.
[08:45] *** primus104 has joined #archiveteam-bs
[08:48] *** BlueMaxim has joined #archiveteam-bs
[08:49] *** dashcloud has quit IRC (Read error: Operation timed out)
[09:03] *** dashcloud has joined #archiveteam-bs
[09:06] Anyone else facing a similar problem? https://pastee.org/5uxhq
[09:09] *** primus104 has quit IRC (Leaving.)
[09:10] *** mistym has joined #archiveteam-bs
[09:10] *** ohhdemgir has quit IRC (Read error: Operation timed out)
[09:25] *** mistym has quit IRC (Read error: Operation timed out)
[09:37] *** primus104 has joined #archiveteam-bs
[09:37] *** dashcloud has quit IRC (Read error: Operation timed out)
[09:40] *** dashcloud has joined #archiveteam-bs
[09:59] john1: looks like wpull is python2, so use pip2 :P
[10:00] hm, never mind
[10:00] weird exception then lol
[10:00] (wpull is python 3, ignore what I had said)
[10:06] *** ohhdemgir has joined #archiveteam-bs
[10:21] *** superkuh has joined #archiveteam-bs
[10:50] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:02] can anyone download this: https://drive.google.com/file/d/0BxrjMy713etLcmhrWU1MVGNOT3M/view?pli=1
[11:03] its a 1994 film called Otaku and looks like its hard to find
[11:03] download button doesn't work for me
[11:06] *** oldcad has joined #archiveteam-bs
[11:09] "Not Found
[11:09] Error 404" for me
[11:09] for me too
[11:09] i figured it out
[11:09] yeah, I guess just downloading the converted video
[11:10] or did you find a working link to the wmv?
[11:10] its just the converted webm
[11:12] i got the link from here: http://www.reddit.com/r/Documentaries/comments/3dlcrt/otaku_1994_this_classic_documentary_focuses_on_a/
[11:13] *** dashcloud has quit IRC (Read error: Operation timed out)
[11:13] godane: are you into torrents? it is available at a tracker i am member of
[11:13] i mean generally, if you would use it well, i could invite you
[11:14] ok
[11:14] which torrent tracker?
[11:14] pm me your mail address if you want
[11:14] cinemageddon
[11:14] YES
[11:14] :D
[11:16] *** dashcloud has joined #archiveteam-bs
[11:33] *** Coderjoe_ is now known as Coderjoe
[11:42] *** dashcloud has quit IRC (Read error: Operation timed out)
[11:46] *** dashcloud has joined #archiveteam-bs
[11:53] heh https://github.com/avinassh/rockstar
[12:00] *** ivan` has joined #archiveteam-bs
[12:00] john1: old Python 3, perhaps?
[12:10] so i'm looking at cinemageddon forums
[12:11] its a lot easier to grab than underground-gamer
[12:11] thats cause all pages of a topic are on the index pages
[12:15] *** primus104 has quit IRC (Leaving.)
[12:15] oi, dont get me banned
[12:15] that was not the intention
[12:15] not to mention that those forums are private
[12:17] ok
[12:42] what's the status of imageshack?
[12:42] it seems they deleted a lot of old images and relaunched as a mobile app?
[12:42] did we ever grab anything from them?
[13:02] *** Muad-Dib has quit IRC (Ping timeout: 252 seconds)
[13:14] *** mistym has joined #archiveteam-bs
[13:22] *** mistym has quit IRC (Ping timeout: 492 seconds)
[14:22] *** kyan has quit IRC (Quit: Leaving)
[14:28] *** vitzli has quit IRC (Quit: Leaving)
[14:43] *** Stiletto has quit IRC ()
[14:44] godane: i managed to get only the reencoded mp4 for streaming, 200MB
[14:44] I did find a torrent for that though, I'll go see if it's similar
[14:53] i got the torrent
[14:56] ah alright
[15:04] *** Stiletto has joined #archiveteam-bs
[15:09] *** primus104 has joined #archiveteam-bs
[15:10] *** Stiletto has quit IRC ()
[15:26] *** Stiletto has joined #archiveteam-bs
[15:35] *** Coderjoe has quit IRC (Ping timeout: 186 seconds)
[15:58] Sanqui: No, the setup.py told me to use python3.
[15:58] ivan`: Python 3.4 is still the current version, isn't it?
[16:19] *** Coderjoe has joined #archiveteam-bs
[16:45] i'm starting to upload these: https://archive.org/details/koreanet-2_cheongju_tvpro_chung-20030107
[16:46] the tvpro part will be dropped from 2009 on
[16:56] *** aaaaaaaaa has joined #archiveteam-bs
[16:56] *** swebb sets mode: +o aaaaaaaaa
[17:00] *** mistym has joined #archiveteam-bs
[17:35] *** dashcloud has quit IRC (Read error: Operation timed out)
[17:39] *** dashcloud has joined #archiveteam-bs
[17:40] *** Asparagir has joined #archiveteam-bs
[17:48] *** dashcloud has quit IRC (Read error: Operation timed out)
[17:51] *** dashcloud has joined #archiveteam-bs
[17:55] *** godane has quit IRC (Ping timeout: 252 seconds)
[18:10] *** godane has joined #archiveteam-bs
[18:17] *** dashcloud has quit IRC (Read error: Operation timed out)
[18:21] *** dashcloud has joined #archiveteam-bs
[18:27] *** primus104 has quit IRC (Leaving.)
[18:35] *** godane has quit IRC (Quit: Leaving.)
[18:37] *** godane has joined #archiveteam-bs
[19:00] *** dashcloud has quit IRC (Read error: Operation timed out)
[19:11] *** dashcloud has joined #archiveteam-bs
[19:34] *** dashcloud has quit IRC (Read error: Operation timed out)
[19:37] *** dashcloud has joined #archiveteam-bs
[19:51] *** Kazzy has quit IRC (Quit: ZNC - http://znc.in)
[19:51] *** Kazzy has joined #archiveteam-bs
[20:00] *** Kazzy has quit IRC (Quit: ZNC - http://znc.in)
[20:03] *** Kazzy has joined #archiveteam-bs
[20:32] *** HCross has quit IRC (Ping timeout: 265 seconds)
[20:35] *** wp494 has quit IRC (Read error: Connection reset by peer)
[20:37] *** wp494 has joined #archiveteam-bs
[20:37] *** wp494 has quit IRC (Excess Flood)
[20:37] *** HCross has joined #archiveteam-bs
[20:37] *** wp494 has joined #archiveteam-bs
[21:01] john1: it is, I guess it's not that
[21:26] *** dashcloud has quit IRC (Read error: Operation timed out)
[21:29] *** dashcloud has joined #archiveteam-bs
[21:53] *** Ravenloft has joined #archiveteam-bs
[22:07] *** DopefishJ has joined #archiveteam-bs
[22:07] *** swebb sets mode: +o DopefishJ
[22:08] *** DFJustin has quit IRC (Read error: Operation timed out)
[22:10] *** primus104 has joined #archiveteam-bs
[22:14] *** espes___ has quit IRC (Ping timeout: 240 seconds)
[22:23] *** Asparagir has quit IRC (Asparagir)
[22:40] *** espes__ has joined #archiveteam-bs
[22:46] *** chazchaz_ has quit IRC (Remote host closed the connection)
[22:49] *** chazchaz_ has joined #archiveteam-bs
[22:59] well, I guess sourceforge may be really dead...
[23:01] They posted https://twitter.com/sourceforge/status/622237830186577920 a while ago
[23:02] .title https://twitter.com/sourceforge/status/622237830186577920
[23:02] aaaaaaaaa: sourceforge on Twitter: "#SourceForge directory, download and project summary pages are back online; dev services (SCM, uploads, ML's, project web) pending restoral"
[23:02] aaaaaaaaa: yes, I know
[23:02] aaaaaaaaa: tbh, it feels like they're on borrowed time right now
[23:02] what is the status of SF archival?
[23:03] paused
[23:03] phiren: any particular reason?
[23:03] others might not. But I like how everything they list as back is available from the mirrors.
[23:04] the archival efforts started and a sourceforge admin showed up a few hours later and said "woah, you aren't following robots.txt"
[23:04] They noticed the effort, panicked, stopped the party and started banning the user agent and rsync boxes
[23:05] so SketchCow was trying to negotiate something with them.
[23:06] arkiver probably has the best picture
[23:06] there was a great post on reddit
[23:06] https://www.reddit.com/r/sysadmin/comments/3do9k0/sourceforge_is_down_due_to_storage_problems_no_eta/ct77o49
[23:06] I'll write a mail to them tomorrow
[23:06] will let you all check it first
[23:06] SketchCow 6
[23:06] ^*
[23:07] so the binaries aren't lost, I guess that's something
[23:08] I think they'll all be restored
[23:10] may have to overnight a box, and then figure out how the hell to reverse engineer their kludge.
[23:12] I'd like an update on the SF negotiations
[23:12] because this storage issue is a giant red flag to me
[23:12] and I feel like this is going to be the next geocities, but without advance notice
[23:12] joepie91: there's not much of an update yet
[23:12] it all smells of neglect
[23:13] they're being cooperative
[23:13] arkiver: but the project has paused
[23:13] but they don't fully understand what we exactly want, so we need to explain that better
[23:13] arkiver: okay, what is unclear about their understanding?
[23:13] in*
[23:14] They asked us if we want to be a mirror.
[23:15] And they want to talk with us "to ensure you can obtain a copy of Open Source content we host in a manner that uses a fair share of our delivery capacity and without impact to our other community members"
[23:16] We do not want mirrors, we want a full web grab
[23:16] arkiver: right. yeah
[23:16] Now we need to explain that well in a mail
[23:16] arkiver: it would perhaps be useful to point them at the wayback machine as an example of what you want
[23:16] yeah
[23:17] or say 'static dump' or whatever
[23:17] there is no point trying to negotiate right now, they will be running around with their hair on fire
[23:17] Though, I think it'll be hard for them to really understand why we want a webgrab and not just all files as a mirror
[23:17] arkiver: let me know if you need my help explaining that
[23:18] * joepie91 puts on "talking to non-knowledgeable people" hat
[23:18] heh, depending on what you want, it's the best time
[23:18] though, we don't just want a webgrab
[23:18] we want all the SCM data as well
[23:18] also, don't forget the code
[23:18] joepie91: I might need that, are you available tomorrow?
[23:18] arkiver: it's slightly disturbing to me that they don't understand the requirements, though - makes me feel you're not talking to a technical person
[23:18] arkiver: depends on the time
[23:19] joepie91: ok, I'll just ping you
[23:19] arkiver: if you give me an approx time (in NL timezone), I can try to schedule it in with sleep and such
[23:19] Mail won't be sent before you have read it
[23:19] :p
[23:19] I'm sorry, I can't give some exact time
[23:20] arkiver: just ballpark. my normal sleep pattern is to go to sleep around 07:00 NL time, and wake up 7:30 - 8:00 hours later
[23:20] at this point
[23:20] but I can adjust that if need be
[23:21] so I'd be back around 16:00 or so
[23:21] that's ok
[23:21] alright
[23:22] they are not technical, the contact person is in pr
[23:22] arkiver: just highlight me here in channel then, I'll probably miss a PM if I'm not actively looking for it (because my client doesn't reliably show new PM tabs to me)
[23:22] I'll make sure this tab is in view :P
[23:22] OK, I'll do that
[23:23] I did just send a PM, you got it?
[23:23] arkiver: yeah, but the tab is somewhere three kilometers off screen :P
[23:23] haha
[23:42] *** dashcloud has quit IRC (Read error: Operation timed out)
[23:47] *** dashcloud has joined #archiveteam-bs