[00:47] *** Stiletto has quit IRC ()
[01:07] thanks chfoo yipdw
[01:10] *** j08nY has quit IRC (Remote host closed the connection)
[01:12] *** BlueMaxim has joined #archiveteam-bs
[01:15] *** Stilett0 has joined #archiveteam-bs
[01:15] *** Stilett0 has quit IRC (Client Quit)
[01:17] *** Stilett0 has joined #archiveteam-bs
[01:18] *** Stilett0 is now known as Stiletto
[01:21] *** ndiddy has quit IRC ()
[01:28] *** ndiddy has joined #archiveteam-bs
[01:30] *** schbirid2 has joined #archiveteam-bs
[01:33] *** schbirid has quit IRC (Read error: Operation timed out)
[01:46] *** REiN^ has quit IRC (Max SendQ exceeded)
[01:47] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[01:47] *** REiN^ has joined #archiveteam-bs
[02:12] *** Stiletto has quit IRC (Read error: Operation timed out)
[02:12] *** Stilett0 has joined #archiveteam-bs
[02:28] OK, I'm fixing all the New Computer Express stuff the guy's uploading.
[02:28] https://archive.org/details/NewComputerExpress000 will be the first one that finishes, I bet
[02:34] *** pizzaiolo has quit IRC (Quit: pizzaiolo)
[03:18] *** BlueMaxim has joined #archiveteam-bs
[04:31] *** Odd0002 has quit IRC (Remote host closed the connection)
[04:56] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[05:03] *** Sk1d has joined #archiveteam-bs
[06:22] *** fie has quit IRC (Ping timeout: 506 seconds)
[06:31] *** fie has joined #archiveteam-bs
[06:37] *** powerArch has quit IRC (Remote host closed the connection)
[06:48] *** vitzli has joined #archiveteam-bs
[08:09] *** Honno has joined #archiveteam-bs
[08:11] *** vitzli has quit IRC (Quit: Leaving)
[08:18] i'm asking the retromags people if they will look at the New Computer Express on IA
[08:19] i figure they would make an edited version of it and also a smaller size cbz for their site
[09:16] *** SHODAN_UI has joined #archiveteam-bs
[09:55] SketchCow: the pdf derive sucks ass on this one: https://archive.org/details/TNM_The_Apple_Collection_Catalog
[09:55] there is like no text in the pdf at all
[09:56] the only derive that works well is normally the jp2.zip files
[10:14] *** j08nY has joined #archiveteam-bs
[10:56] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[10:57] *** BlueMaxim has joined #archiveteam-bs
[11:02] *** fie has quit IRC (Quit: Leaving)
[11:16] *** fie has joined #archiveteam-bs
[11:39] *** SHODAN_UI has quit IRC (Remote host closed the connection)
[12:13] *** robinak has joined #archiveteam-bs
[12:13] *** robink has quit IRC (Read error: Connection reset by peer)
[13:01] *** C4K3 has quit IRC (Quit: leaving)
[13:01] *** C4K3 has joined #archiveteam-bs
[13:04] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[13:32] *** fie has quit IRC (Read error: Operation timed out)
[13:33] *** yuitimoth has quit IRC (Remote host closed the connection)
[13:34] *** yuitimoth has joined #archiveteam-bs
[13:40] *** SHODAN_UI has joined #archiveteam-bs
[13:42] *** j08nY has quit IRC (Read error: Operation timed out)
[14:09] *** j08nY has joined #archiveteam-bs
[15:09] *** pizzaiolo has joined #archiveteam-bs
[15:30] What is a good upload rate limit on uploading to IA? I noticed some of my grabs are offline, so I need to get them on IA before I get hit by a bus, but I don't want to overload the server with concurrent uploads. Is there a guideline?
[16:22] if you use the s3 interface, it'll throw 429's at you if you go too fast
[16:23] as for bandwidth, I'm sure newsgrabber was sending in the 500-700mbit/s at it pretty constantly
[16:29] *** dashcloud has joined #archiveteam-bs
[16:31] *** godane has quit IRC (Read error: Operation timed out)
[16:37] I'm grabbing Tanobb now. It's quite slow, probably at least partially because their servers are located in Japan, but well, I'll try to get as much as possible before they shut down.
[16:42] *** godane has joined #archiveteam-bs
[17:10] *** godane has quit IRC (Quit: Leaving.)
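Editor's note on the 16:22 remark about the s3 interface returning 429s: one common way to respect that signal is to retry with exponential backoff. The following is only a minimal sketch under stated assumptions, not IA's documented guidance. `send` is a hypothetical callable that performs one upload request and returns the HTTP status code, and the delay values are illustrative.

```python
import random
import time


def upload_with_backoff(send, max_attempts=6, base_delay=1.0,
                        max_delay=60.0, sleep=time.sleep):
    """Call send() until it stops returning HTTP 429 or attempts run out.

    send is a hypothetical zero-argument callable that performs a single
    upload request and returns the HTTP status code as an int.
    """
    for attempt in range(max_attempts):
        status = send()
        # Stop on any non-429 answer, or when this was the last attempt.
        if status != 429 or attempt == max_attempts - 1:
            return status
        # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay,
        # plus random jitter so parallel uploaders do not retry in step.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(delay + random.uniform(0, delay / 2))
```

Keeping the number of concurrent uploads small and letting the 429 responses set the pace is the design idea here. The `sleep` parameter exists only so the helper can be exercised without real waiting.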
[17:37] *** TheLovina has joined #archiveteam-bs
[18:00] *** RichardG has quit IRC (Read error: Connection reset by peer)
[18:17] *** RichardG has joined #archiveteam-bs
[18:42] *** godane has joined #archiveteam-bs
[18:57] *** pizzaiolo has quit IRC (Read error: Operation timed out)
[18:58] *** Florian_ has joined #archiveteam-bs
[18:58] *** Florian_ has quit IRC (Client Quit)
[19:05] *** SHODAN_UI has quit IRC (Remote host closed the connection)
[19:32] *** jrwr has joined #archiveteam-bs
[19:53] timmc, Kaz as ero was built primarily for reddit we can assume that a very high percentage of the links were posted on reddit, this allows us to grep datasets like this one - http://files.pushshift.io/reddit/ - for eroshare links and download them
[19:54] yeah, that should make it a ton easier to work through
[19:54] should I get to work pulling the links out of that data?
[19:54] only seems to go up to april though
[19:55] probably not a bad idea to start working it through, if we do end up grabbing
[19:55] we can worry about that afterwards, ero was born when? I know it's only been around a few years at this point so no need to grab earlier files
[19:56] you probably know better than me when ero started.. :)
[19:56] domain is 10 years old
[19:57] highly doubt it's been used for that long, somehow.
[19:57] .... whois on the domain says it was regged on 2006-11-30
[19:57] wut
[19:57] actually, #nofap is probably the best place if we're actually going to start grabbing
[20:02] *** ZexaronS has joined #archiveteam-bs
[20:02] Kaz, explain?
[20:04] Project channel
[20:06] I got my Archive Team Warrior Stickers in
[20:06] they are nice!
[20:10] *** wp494 has quit IRC (Read error: Operation timed out)
[20:17] *** icedice has joined #archiveteam-bs
[20:20] odemg: Ah, interesting.
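Editor's note on the 19:53 idea of grepping the reddit dumps for eroshare links: a rough sketch of that extraction might look like the following. It assumes the dumps are newline-delimited JSON, that link text lives in fields named `url`, `selftext`, or `body`, and that links are on the `eroshare.com` domain. None of this is confirmed in the log, so adjust the pattern and field names to the real data.

```python
import json
import re

# Assumption: links look like https://eroshare.com/<id>; widen the pattern
# if the site uses other hosts or path characters.
LINK_RE = re.compile(
    r"https?://(?:www\.)?eroshare\.com/[A-Za-z0-9_/-]+", re.IGNORECASE
)


def extract_links(lines, fields=("url", "selftext", "body")):
    """Return the sorted, de-duplicated links found in JSON-lines records."""
    found = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the run
        if not isinstance(record, dict):
            continue
        for field in fields:
            value = record.get(field)
            if isinstance(value, str):
                found.update(LINK_RE.findall(value))
    return sorted(found)
```

If the dump files are bz2-compressed, something like `extract_links(bz2.open(path, "rt", encoding="utf-8"))` would stream them line by line without unpacking to disk first.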
[20:21] https://goo.gl/photos/MnPQMLAe1ixzjedV9
[20:21] Works very well on my mug
[20:26] *** pizzaiolo has joined #archiveteam-bs
[20:42] *** Pudsey has joined #archiveteam-bs
[20:58] *** SHODAN_UI has joined #archiveteam-bs
[21:07] *** pizzaiolo has quit IRC (Ping timeout: 506 seconds)
[21:20] *** icedice has quit IRC (Quit: Leaving)
[21:28] *** pizzaiolo has joined #archiveteam-bs
[21:30] *** SilSte has quit IRC (Remote host closed the connection)
[21:31] *** SilSte has joined #archiveteam-bs
[21:33] *** Pudsey has quit IRC (Remote host closed the connection)
[21:40] SilSte: Howd
[21:40] *** Silas has joined #archiveteam-bs
[21:40] So Silas
[21:40] can you hover over the invalid config icon at the bottom
[21:40] what does it say
[21:41] *** pizzaiolo has quit IRC (Read error: Operation timed out)
[21:41] jrwr: re cygwin port, I guess it's a good idea if it's stable and works
[21:41] its just a warning about there not being enough video memory to go into fullscreen or seamless
[21:41] Ok
[21:41] that said, WSL might be better? albeit only win10 support
[21:41] damn, that should boot then
[21:41] *** pizzaiolo has joined #archiveteam-bs
[21:42] Kaz: true but since its pretty simple on the scripts, and would want it to even work on win7/win8
[21:42] make a little installer + some shortcuts to some scripts to turn off the warrior and such
[21:43] i should check my bios to make sure vt-x is enabled in the first place
[21:43] the processor doesn't support it
[21:43] oh
[21:43] :/
[21:44] VMware player might work in this case, it can import the OVA
[21:44] Or
[21:44] a 2.99 Euro a month Virtual Machine Instance at scaleway works very well
[21:44] ill try out vmware player
[21:45] Let us know how it goes, ill be here all night
[21:45] Kaz: if we keep the docker support, Ill work on an installer for it, wont be too hard at all. ill do some testing today on it, I'm pretty bored right now
[21:54] *** Odd0002 has joined #archiveteam-bs
[21:56] Seriously though, the warrior is a ridiculous security risk (which is why I'd never let it anywhere near my machines). I'm not sure about the Docker container, though based on a quick look it seems to be based on a 3-month-old version of phusion/baseimage and probably also has some security issues.
[21:57] I got the warrior running in VMWare Player and set to run whatever ArchiveTeam's Choice is, everything looks good!
[21:57] Thanks jrwr
[21:57] Awesome!
[21:57] jrwr: Ill have the cygwin version only listen on 127.0.0.1
[21:58] did you just talk to yourself?
[21:58] I did
[21:58] I meant JAA
[21:58] ah
[21:58] Well, it still needs an internet connection...
[21:58] Ya
[21:59] and it is running random code from the internet
[21:59] :)
[22:00] True. It also auto-updates scripts and fetches the wget-lua source via HTTP without a checksum check. :-| (Ping JensRex, you implemented that, didn't you? What happened with that?)
[22:00] As I said, ridiculous security risk.
[22:01] it is a botnet after all
[22:02] That's true, but it could be improved a lot.
[22:02] Yes
[22:03] I've thought about doing it, making a new Virtual Machine, but I think that should be up to the project managers, as I have no say around here
[22:03] *** decay has quit IRC (Read error: Operation timed out)
[22:03] *** decay has joined #archiveteam-bs
[22:12] wow, that was easy.. almost too easy
[22:21] *** Silas has quit IRC (Quit: help)
[22:21] *** Silas has joined #archiveteam-bs
[22:24] *** Silas has quit IRC (Client Quit)
[22:45] we've had problems with people using cygwin in the past because of the case-insensitive filesystem and certain filenames being illegal on windows
[22:45] that's one of the reasons the warrior was created in the first place
[22:45] to have a consistent no-surprises environment
[22:47] *** Ravenloft has joined #archiveteam-bs
[22:48] http://www.os2museum.com/wp/rich-heimlichs-patch-set-overview/
[22:48] if you're not willing to run the warrior and you don't have access to a linux system then it's better to just let someone else do it rather than cause problems
[22:48] *** SHODAN_UI has quit IRC (Remote host closed the connection)
[22:48] we usually don't have a shortage of warriors running
[22:50] This is true
[22:50] also wget-lua hates cygwin atm
[22:53] * jrwr is just trying to be helpful
[22:55] yeah it's just that if it creates a mess and an admin has to clean it up that far outweighs the benefit of 1 more client running
[22:56] usually the rate limiter is the site being archived or our staging servers rather than number of people crawling
[22:56] *** Silas has joined #archiveteam-bs
[22:57] *** Ravenloft has quit IRC ()
[22:58] *** Ravenloft has joined #archiveteam-bs
[23:09] Ya i provided a staging server for pixiv
[23:09] it was crazy the amount of traffic inbound I was getting, poor FOS just cant keep up
[23:10] jrwr: there's not really such a thing as "the project managers" :P
[23:11] stuff gets done when somebody decides to do it
[23:11] i love the fact there's a leaderboard
[23:11] im a sucker for stat tracking lol
[23:12] Its a loose collection of people, There are people with access to infra and they tend to help when a need arises, those are the true "managers" of AT
[23:12] like I would say arkiver has been an amazing "manager" of savepixiv
[23:14] thats what she said
[23:19] *** j08nY has quit IRC (Remote host closed the connection)
[23:20] *** j08nY has joined #archiveteam-bs
[23:40] What the hell, wpull is suddenly ignoring my --reject-regex options. O.o
[23:46] *** Silas has quit IRC (Quit: Page closed)
[23:48] Ok, looks like it never worked, but now I seriously wonder why.
[23:52] Ooh, you can only have one --reject-regex option. Well, that's... unexpected.
[23:59] *** Honno has quit IRC (Read error: Operation timed out)
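Editor's note on the 23:52 finding that only one --reject-regex option takes effect: a possible workaround is to fold several patterns into a single alternation before passing it on the command line. This is a sketch that assumes the tool matches with Python-style regular expressions, which the log does not confirm. The helper name and example patterns are hypothetical.

```python
import re


def combine_patterns(patterns):
    """Join several regex patterns into one alternation string.

    Each pattern is wrapped in a non-capturing group so that its own
    alternations and anchors stay scoped to that pattern.
    """
    combined = "|".join(f"(?:{p})" for p in patterns)
    re.compile(combined)  # raises re.error early if any pattern is invalid
    return combined


# Hypothetical reject patterns, for illustration only.
reject = combine_patterns([r"/login", r"\?replytocom=", r"\.jpg$"])
```

The resulting string can then be supplied once, as the single --reject-regex value. Patterns that rely on inline flags such as `(?i)` may need those flags moved to the very start of the combined expression.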