[00:03] *** mutoso has joined #archiveteam-bs [00:04] *** ky0ko has quit IRC (Read error: Operation timed out) [00:07] *** ky0ko has joined #archiveteam-bs [00:35] *** RichardG_ has joined #archiveteam-bs [00:36] *** RichardG has quit IRC (Read error: Operation timed out) [00:47] *** Honno has quit IRC (Read error: Operation timed out) [00:54] *** brayden has joined #archiveteam-bs [00:54] *** swebb sets mode: +o brayden [01:04] *** zenguy_pc has quit IRC (Read error: Operation timed out) [01:05] *** BlueMaxim has joined #archiveteam-bs [01:19] *** zenguy_pc has joined #archiveteam-bs [01:19] *** kristian_ has quit IRC (Quit: Leaving) [01:35] *** zenguy_pc has quit IRC (Read error: Operation timed out) [01:52] *** zenguy_pc has joined #archiveteam-bs [01:56] *** zenguy_pc has quit IRC (Read error: Operation timed out) [02:34] *** zenguy_pc has joined #archiveteam-bs [02:37] *** RichardG_ has quit IRC (Ping timeout: 260 seconds) [02:38] *** RichardG has joined #archiveteam-bs [02:41] *** RichardG_ has joined #archiveteam-bs [02:48] *** RichardG has quit IRC (Read error: Operation timed out) [02:57] i'm starting to upload www.reuters.com videos again [03:00] *** zenguy_pc has quit IRC (Read error: Operation timed out) [03:04] *** RichardG_ has quit IRC (Ping timeout: 255 seconds) [03:08] are 6Gb/s hard drives a gimmick? pretty sure they can't saturate 3Gb/s [03:31] *** zenguy_pc has joined #archiveteam-bs [03:36] *** zenguy_pc has quit IRC (Ping timeout: 244 seconds) [04:08] *** Sk1d has quit IRC (Ping timeout: 194 seconds) [04:11] *** dashcloud has quit IRC (Read error: Operation timed out) [04:14] *** Sk1d has joined #archiveteam-bs [04:38] *** Meroje has quit IRC (Quit: bye!) 
[04:39] *** Meroje has joined #archiveteam-bs [05:05] *** Ravenloft has quit IRC () [05:17] *** ravetcofx has quit IRC (Ping timeout: 246 seconds) [05:35] *** ravetcofx has joined #archiveteam-bs [06:42] *** zenguy_pc has joined #archiveteam-bs [06:59] *** espes__ has quit IRC (Ping timeout: 250 seconds) [07:17] *** espes__ has joined #archiveteam-bs [07:23] *** espes__ has quit IRC (Ping timeout: 250 seconds) [07:35] Frogging: I asked an engineer that question, they said: "Sure" "Most don't actually sustain those speeds" "That's usually the burst speed. I.e. How fast it can read out of its cache. The drive itself probably can't sustain that for long at all, unless it is solid state." [07:48] *** espes__ has joined #archiveteam-bs [07:51] *** schbirid has joined #archiveteam-bs [07:56] *** DrKyonko has joined #archiveteam-bs [09:03] *** GE has joined #archiveteam-bs [09:05] *** Honno has joined #archiveteam-bs [09:07] *** espes__ has quit IRC (Ping timeout: 250 seconds) [09:13] *** espes__ has joined #archiveteam-bs [09:26] *** Jeroen__u has joined #archiveteam-bs [09:27] Jeroe__u: I have a fat server, that once had 400 instances running. All at the same time, all with different ports. [09:28] like screen -dmS e1 run-pipeline3 pipeline.py --concurrent 20 --address '127.0.0.1' --port 1331 Medowar [09:28] and then increasing the screen-name to e2 and the port to 1332 [09:32] Medowar: Are you running all manually or is there any cron involved? [09:33] no cron, but I am using an edited version of HCross mass-launch script [09:34] my version: http://pastebin.com/ysReMBnQ [09:34] Medowar: Cheers, I'll check it out [09:34] the example is for 10 instances. [09:42] You can also run it with --disable-web-server [09:42] Which doesn't give the web GUI for the script [09:43] Saving resources, sockets and so on and so on [09:43] That's how I run mine [09:44] pipeline.py --concurrent 20 --disable-web-server XXXXX [09:44] Thanks for the tip, I will change my setup. 
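The launch pattern described above (one detached screen session per instance, with the session name and port bumped by one each time) can be sketched in Python. This is only an illustration built from the commands quoted in the chat; it is not the pastebin mass-launch script, whose contents are not shown here. The function name `launch_commands` and its parameters are hypothetical, and the assumption that `--disable-web-server` replaces the `--address`/`--port` pair simply mirrors the two command lines as they were typed.

```python
import shlex


def launch_commands(count, nickname, first_port=1331, concurrent=20,
                    prefix="e", web_server=True):
    """Build one `screen` command (as an argv list) per pipeline instance.

    Hypothetical helper: names and defaults are taken from the example
    command in the chat (session e1 on port 1331, e2 on 1332, ...).
    """
    commands = []
    for i in range(count):
        argv = ["screen", "-dmS", f"{prefix}{i + 1}",
                "run-pipeline3", "pipeline.py",
                "--concurrent", str(concurrent)]
        if web_server:
            # Each instance gets its own port for the web GUI.
            argv += ["--address", "127.0.0.1", "--port", str(first_port + i)]
        else:
            # Assumption: with the web GUI disabled no address/port is needed.
            argv.append("--disable-web-server")
        argv.append(nickname)
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for argv in launch_commands(10, "Medowar"):
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to try; each argv list could be passed to `subprocess.run` once the output looks right.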
[09:45] Igloo^ Medowar I really appreciate the input, was thinking of pushing a RPi3 to see what it can do while running linux containers (docker) [09:45] I think you'll run afoul of the CPU [09:45] But let me know on that one [09:46] Sure, it's always good to know when and why something breaks [09:46] please keep in mind that we usually run this stuff on x86-64 machines [09:47] if you're going to armhf/arm64 you may or may not hit portability problems [09:48] lua, python 2.7, wget all do run fine when built for ARM. I can't recall if anyone's run the Warrior combination though [09:49] yipdw: I'm fully aware, I'm always poking the bear. Even had a TS server running using qemu just to see if I could and when it would break [09:50] yipdw: I think that I ran into missing dependencies on CentOS, I'm not sure which but the following seems to make it run. It also installs the EPEL repository. http://pastebin.com/4z2ykrSv [09:50] All the cheap hardware is fun, but still I feel it's a waste of resources when it's only doing a single task [09:50] Jeroen__u: probably. nobody really runs this on CentOS [09:51] if the README needs to be updated please submit a PR [09:51] Kksmkrn: Whenever I need to run a single task on a server it is usually because it needs a specific distro, or there is a deps hell going on. [09:52] Kksmkrn: And recently GitLab started using up a lot of RAM that I just didn't have on a specific machine, so yesterday I started renting a VPS just specifically for GitLab. [09:52] yipdw: I'll look into the dependency issue and I'll make a PR if it is a problem on the official ISO. [09:52] cool thanks [09:53] Jeroen__u: Oh deps hell, yea I remember, vaguely.. 
Left that behind when I started playing with Arch and docker :D [09:54] Jeroen__u: Though Alpine is my favorite at the moment, nice and small [09:54] *** DrKyonko has quit IRC (Quit: Depression is merely anger without enthusiasm) [09:56] Kksmkrn: I prefer CentOS, although old stable versions usually don't have certain cool new features. [09:56] One does not really have bleeding edge features on CentOS. [09:57] Bleeding edge is nice to tinker with, though there are distros that can be so cutting edge they'll cut the OS right from under you [09:57] *** BlueMaxim has quit IRC (Quit: Leaving) [09:57] Pardon the Denglish [10:04] kksmkrn: Denglish? Are you German? Also please forget docker. the overhead is massive... I migrated from docker to native and had a 50% performance improvement with less RAM and CPU usage [10:05] Docker is more for fast spinning up, like for a quick task on a DO or Linode server. Pull the image, load the script files, and go. not really efficient... [10:05] Medowar: Haha no, Denglish as in Dutch English [10:05] ah. Denglish is also German for Deutsch-English. [10:06] Oh my, the confusion :) [10:08] Medowar: So drop docker you say, how else will I be able to tinker with my RPi3 and still have it run tasks I need it to do without cluttering the OS [10:09] You still can use it, I just want to warn you that it has quite a performance overhead and you shouldn't use it for permanent usage. [10:09] This use case justifies it. [10:09] Just be careful not to reboot your rpi [10:10] I wasn't planning on it, much [10:11] What would happen if I did though? [10:21] *** GE has quit IRC (Remote host closed the connection) [10:28] Kksmkrn: i'm currently running all the archiveteam scripts on an armhf [10:28] i can confirm they are mostly working [10:29] luckcolor: That's good to hear, any particular issues that cause anything to break? 
[10:32] not really: just get used to disappointment if you get an item which starts to loop infinitely and eats all of your ram [10:33] happened a lot with google code for me [10:33] to cite something that's working nicely: archivebot [10:33] i have a pipeline on my server, it's slow but it gets the jobs done [10:34] it's named after my nickname in the archivebot's dashboard [10:34] oic [10:35] mostly the problems you might have are python being VERY stupid about dependencies or strange errors [10:35] but those are something you get everywhere [10:36] (and for the record yes i'm not a fan of python) :P [10:36] As python is derping as we speak, nothing dependency related tho [10:36] Kksmkrn: anyways let me know if you have problems [10:36] luckcolor: Much appreciated :) [10:38] Kksmkrn: another thing, a project i recommend not to run if you don't have a lot of computing power is Newsbuddy [10:39] grab-site is heavy on cpu and if you don't have enough it may run too slow and jobs start to backlog in your pipeline [10:39] i don't know if the backlog problem has been fixed, i haven't tested the new version [10:40] (with backlog problem i mean concurrency limit) [10:40] luckcolor: Duly noted, might want to test to see if it still has that issue [10:54] Can you run multiple concurrent jobs in the same directory? Or will there be conflicts? [10:54] i never tried but you should be able to [10:57] I am worried that the data could be corrupted. [10:59] vantec is doing so many nujij articles that my warriors are having problems getting in the middle of it. 
[11:18] *** irl has joined #archiveteam-bs [11:18] *** zenguy_pc has quit IRC (Ping timeout: 260 seconds) [11:18] SketchCow: update on manuals for things, scanner has arrived and appears to work but the adf is screwy, going to order a replacement roller and pickup thingy [11:18] i'm back from travels now, so once that arrives i'll begin mass scanning of them [11:18] (assuming that's the only thing i need to fix) [11:18] it claims it can cope with 75 sheets at a time, and it has a duplexer [11:53] *** dashcloud has joined #archiveteam-bs [11:55] *** GE has joined #archiveteam-bs [12:24] *** tuankiet has quit IRC (Ping timeout: 244 seconds) [12:24] *** tuankiet has joined #archiveteam-bs [13:06] *** Whopper has quit IRC (Ping timeout: 370 seconds) [13:12] *** brayden has quit IRC (Read error: Operation timed out) [13:13] *** ky0ko has quit IRC (Read error: Operation timed out) [13:21] *** ky0ko has joined #archiveteam-bs [13:26] *** brayden has joined #archiveteam-bs [13:26] *** swebb sets mode: +o brayden [13:36] *** dashcloud has quit IRC (Read error: Operation timed out) [13:53] *** ravetcofx has quit IRC (Ping timeout: 370 seconds) [15:09] heh, didn't catch this before: Kim Dotcom permitted to livestream his appeal against extradition on Youtube http://hn.premii.com/#/article/12393211 [15:09] "all footage must be removed when the hearing ends" [15:10] *removed from the internet [15:11] >removing things from the Internet [15:34] *** ravetcofx has joined #archiveteam-bs [15:44] The internet never forgets. [15:45] Could, for the fun of it, mirror the appeal over many hard drives. 
[15:47] *** ravetcofx has quit IRC (Read error: Operation timed out) [15:52] *** ravetcofx has joined #archiveteam-bs [15:56] if no one records it, the internet does forget [15:56] that's why we are here [15:57] for kim dotcom though, I'm sure it's recorded by someone [16:07] ranma: Frogging: Jeroen__u: arkiver: https://www.youtube.com/channel/UCw7XhgJhQDHkVrJjiw4CONg/live [16:07] the hearing only ends in a few days [16:07] the recordings are there for now [16:07] nice, still online [16:08] [11:56:42] <@arkiver> that's why we are here [16:08] very true [16:11] *** metal_cam has joined #archiveteam-bs [16:12] https://archive.org/details/FreebaseRdsLatest poof [16:14] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [16:14] *** kristian_ has joined #archiveteam-bs [16:43] joepie91: arkiver, Frogging, ranma: We could make a syncthing folder to share current and upcoming streams. [16:47] *** JW_work1 has joined #archiveteam-bs [16:48] *** BartoCH has joined #archiveteam-bs [16:49] *** JW_work has quit IRC (Read error: Operation timed out) [16:51] *** JW_work1 has quit IRC (Client Quit) [17:06] maybe worth downloading: http://support.wdc.com/download/gpl/ [17:09] http://kb.netgear.com/app/answers/detail/a_id/2649/~/netgear-open-source-code-for-programmers-(gpl) [17:10] I love the "for programmers" suffix there [17:10] lol [17:10] yeah :p [17:10] "source code for people who look at source code" [17:13] *** RichardG has joined #archiveteam-bs [17:13] *** RichardG has quit IRC (Client Quit) [17:13] *** RichardG has joined #archiveteam-bs [17:34] *** Simpbrain has quit IRC (Ping timeout: 370 seconds) [17:34] *** Simpbrain has joined #archiveteam-bs [17:48] joepie91: sent to archivebot [17:49] luckcolor: archivebot won't get it [17:49] afaik it's buttons [17:49] ah [17:49] well it will get the kb one [17:58] *** jeroen52_ has joined #archiveteam-bs [17:59] *** Jeroen__u has quit IRC (Ping timeout: 268 seconds) [18:01] *** ravetcofx has quit IRC (Read error: Operation 
timed out) [18:04] *** RichardG_ has joined #archiveteam-bs [18:04] *** RichardG has quit IRC (Read error: Operation timed out) [18:19] *** RichardG_ has quit IRC (Ping timeout: 370 seconds) [18:23] *** tuankiet has quit IRC (Ping timeout: 244 seconds) [18:32] *** Spring has joined #archiveteam-bs [18:32] whose idea was it to archive pomf.se? Did the author contact you about it? [18:32] http://archiveteam.org/index.php?title=Pomf.se [18:32] yes [18:34] yipdw, thx. Looking at the Clones page I see Mixtape.moe 'isn't very cooperative' with archiving efforts. [18:34] what a shame [18:35] Was going to ask about that one as I'm not sure of their purpose. Reading their blog posts every one is titled "Rule of Acquisition :" [18:35] so perhaps they're looking to be bought [18:35] or they watched too much Star Trek [18:36] I'm just wonder about the viability of it as a hosting service if it goes under and they aren't willing to have a copy made [18:36] *I just [18:36] *** tuankiet has joined #archiveteam-bs [18:37] from the number of clones and the full backup I'd say pomf.se's dataset is the longest-lived, most robust dataset in the entire history of the internet [18:37] probably don't need copies of copies [18:37] yipdw: uh, isn't the tz database more robust and better replicated [18:38] possibly but the tz database doesn't seem to get as much love [18:38] “<probably> don't need copies of copies” come again? Do you mean the site backend/design or the files themselves? [18:39] it's possible I've missed an immense amount of important information that's steganographically moeified [18:42] lol [18:43] I mean for original content [18:44] but true, there's probably far less of that than rehosted content [18:44] we got everything pomf.se had. what's the difference? 
[18:45] oh, are you referring to uploads on the clones but not in the original dataset [18:45] yes [18:46] then yes, maybe a flaky host could be a problem [18:53] *** Spring has quit IRC (Ping timeout: 370 seconds) [18:54] *** JW_work has joined #archiveteam-bs [19:19] *** metalcamp has joined #archiveteam-bs [19:19] oh nice, the *_s secure API functions in the Windows CRT do not and will never officially exist on Windows XP [19:20] i guess the tl;dr version of it is "get off Windows XP already, this is making shipping code a pain in the ass" [19:22] *** metal_cam has quit IRC (Read error: Operation timed out) [19:23] * yipdw is shipping stuff via mingw-w64 and ran into fun compiler errors [19:29] *** JW_work1 has joined #archiveteam-bs [19:31] *** JW_work has quit IRC (Read error: Operation timed out) [19:37] *** JW_work1 has quit IRC (Read error: Operation timed out) [19:42] *** ndiddy has joined #archiveteam-bs [20:06] *** GE has quit IRC (Remote host closed the connection) [20:06] *** GE has joined #archiveteam-bs [20:16] *** metalcamp has quit IRC (Read error: Operation timed out) [20:16] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [20:18] *** BartoCH has joined #archiveteam-bs [20:29] Does Windows Fax and Scan let you scan multiple pages into a single PDF? 
[20:32] *** Jeroen52 has quit IRC (Read error: Connection reset by peer) [20:43] *** dashcloud has joined #archiveteam-bs [20:57] yipdw: that makes sense - if memory serves, without saying too much, the _s functions work by inspecting allocation metadata [20:57] the allocator changed completely between XP SP2 and SP3, and again in Vista [21:10] *** Jeroen52 has joined #archiveteam-bs [21:11] *** ky0ko has quit IRC (Ping timeout: 244 seconds) [21:32] *** schbirid has quit IRC (Quit: Leaving) [21:39] *** ky0ko has joined #archiveteam-bs [21:39] article I just looked at mentioned the rpi3 doesn't have wifi and usb sharing a bus [21:40] doesn't mention the ethernet adapter though [21:40] I'm on a RPi3 as we speak, running hypriotOS (raspbian + docker) [21:41] well, i have a first generation pi, model b [21:41] only 100mbit ethernet on it too :( [21:41] Same for RPi3 [21:41] looks like they fixed my problem there [21:41] I have a 2 and 3 [21:42] Anyway, I'm working my way through the steps provided in the Dockerfile to see if I can make a workable warrior out of it [21:43] *** RichardG has joined #archiveteam-bs [21:57] FalconK: ah, ok, that'd explain it [21:59] Does it matter what PDF version I save scanned documents in? [21:59] *** BlueMaxim has joined #archiveteam-bs [22:14] *** RichardG has quit IRC (Read error: Operation timed out) [22:16] Well, don't load it up with garbage [22:18] *** bsmith093 has joined #archiveteam-bs [22:24] *** Honno has quit IRC (Read error: Operation timed out) [23:36] *** GE has quit IRC (Quit: zzz)