#archiveteam-bs 2017-01-26,Thu

Time Nickname Message
00:20 🔗 zerkalo has joined #archiveteam-bs
00:27 🔗 hook54321 pizzaiolo: Checked, it's already there.
00:28 🔗 hook54321 We might want to consider doing it soon though
00:29 🔗 nickname_ has quit IRC (Read error: Operation timed out)
00:32 🔗 espes__ has joined #archiveteam-bs
00:47 🔗 odemg has joined #archiveteam-bs
00:54 🔗 odemg has quit IRC (Remote host closed the connection)
01:02 🔗 godane has left
01:21 🔗 odemg has joined #archiveteam-bs
01:44 🔗 BlueMaxim has joined #archiveteam-bs
01:49 🔗 Darkstar has quit IRC (Ping timeout: 506 seconds)
01:58 🔗 icedice has quit IRC (Quit: Leaving)
02:09 🔗 kristian_ has quit IRC (Quit: Leaving)
02:16 🔗 vitzli has joined #archiveteam-bs
02:18 🔗 yan has quit IRC (Read error: Operation timed out)
02:28 🔗 nickname_ has joined #archiveteam-bs
02:29 🔗 Darkstar has joined #archiveteam-bs
02:40 🔗 schbirid2 has joined #archiveteam-bs
02:43 🔗 username1 has quit IRC (Read error: Operation timed out)
03:01 🔗 zhongfu has quit IRC (Ping timeout: 260 seconds)
03:04 🔗 pizzaiolo has left
03:08 🔗 godane has joined #archiveteam-bs
03:08 🔗 godane looks like ftp://aftp.cmdl.noaa.gov/ is gone
03:09 🔗 alembic welp
03:09 🔗 alembic glad we did that one first
03:12 🔗 Frogging nice
03:20 🔗 vitzli has quit IRC (Quit: Leaving)
03:27 🔗 Asparagir has quit IRC (Read error: Operation timed out)
03:28 🔗 Asparagir has joined #archiveteam-bs
04:23 🔗 nickname_ has quit IRC (Read error: Operation timed out)
04:25 🔗 nickname_ has joined #archiveteam-bs
04:32 🔗 Stiletto has quit IRC (Read error: Operation timed out)
04:32 🔗 Stil3tt0 has joined #archiveteam-bs
05:00 🔗 nickname_ has quit IRC (Read error: Operation timed out)
05:12 🔗 ndizzle has joined #archiveteam-bs
05:18 🔗 Somebody2 has quit IRC (Read error: Operation timed out)
05:19 🔗 ndiddy has quit IRC (Read error: Operation timed out)
05:35 🔗 Somebody2 has joined #archiveteam-bs
05:45 🔗 ndiddy has joined #archiveteam-bs
05:45 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
05:48 🔗 ndizzle has quit IRC (Ping timeout: 244 seconds)
05:50 🔗 godane i'm uploading rev copies of UN Daily Radio
05:51 🔗 godane they do revision copies sometimes to update their radio program
05:52 🔗 Sk1d has joined #archiveteam-bs
06:17 🔗 ravetcofx has joined #archiveteam-bs
06:31 🔗 Honno has joined #archiveteam-bs
07:06 🔗 Lord_Nigh <godane> looks like ftp://aftp.cmdl.noaa.gov/ is gone <- gone? the appeal for backing it up was on reddit maybe 10 hours ago. i think everyone from /r/datahoarders tried to wget it at once
07:09 🔗 Frogging oh, that may not have been a good idea
07:15 🔗 Aranje has quit IRC (Quit: Three sheets to the wind)
07:18 🔗 Lord_Nigh do we have a complete copy of it? iirc archiving of it was started in late december?
07:18 🔗 Lord_Nigh so we should have a copy of it
07:23 🔗 Stil3tt0 is now known as Stiletto
07:24 🔗 pikhq has quit IRC (Read error: Operation timed out)
07:30 🔗 pikhq has joined #archiveteam-bs
07:40 🔗 atomicthu has quit IRC (hub.dk irc.homelien.no)
07:40 🔗 PotcFdk has quit IRC (hub.dk irc.homelien.no)
07:40 🔗 alfie has quit IRC (hub.dk irc.homelien.no)
08:11 🔗 alfie has joined #archiveteam-bs
08:21 🔗 GE has joined #archiveteam-bs
08:45 🔗 SketchCow https://archive.org/details/archiveteam_ftpgov?sort=-publicdate&and[]=aftp.cmdl.noaa.gov
08:45 🔗 SketchCow We have a lot of it. arkiver will know
08:47 🔗 SketchCow I think they DDOSed it
08:48 🔗 SketchCow root@teamarchive0:/0/GODANERATOR# ftp aftp.cmdl.noaa.gov
08:48 🔗 SketchCow Connected to aftp.cmdl.noaa.gov.
08:48 🔗 SketchCow 421 There are too many connected users, please try later.
08:48 🔗 SketchCow ftp>
08:53 🔗 zhongfu has joined #archiveteam-bs
09:25 🔗 bwn has quit IRC (Read error: Operation timed out)
09:26 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
09:35 🔗 Lord_Nigh SketchCow: well, based on http://www.sciencemag.org/news/2017/01/trump-officials-suspend-plan-delete-epa-climate-web-page?utm_source=newsfromscience&utm_medium=facebook-text&utm_campaign=suspendepa-10685 i don't know if it will go down anytime soon, so maybe wait for people to get bored of archiving, or ask people on reddit to nicely stop hammering the server especially with multiple connections
09:36 🔗 Lord_Nigh https://www.reddit.com/r/DataHoarder/comments/5q4xxe/erik_fichtner_on_twitter_please_wget_m_np/?utm_content=comments&utm_medium=hot&utm_source=reddit&utm_name=DataHoarder is the post about it
09:36 🔗 Lord_Nigh but its spawned by a tweet, not from reddit itself
09:36 🔗 Lord_Nigh hammering the server into oblivion just means nobody gets the data
09:36 🔗 Lord_Nigh arkiver: ^
09:37 🔗 Lord_Nigh https://www.reddit.com/r/DataHoarder/comments/5q4xxe/erik_fichtner_on_twitter_please_wget_m_np/dcwurts/ might be a good person to contact if they aren't already in here
09:38 🔗 Lord_Nigh since that might be everything, unless they're actively mirroring more
09:38 🔗 Lord_Nigh also someone on news.ycombinator.com thread about this
09:39 🔗 Lord_Nigh https://news.ycombinator.com/item?id=13487843
09:39 🔗 Lord_Nigh said they have an internet2 campus link pulling the data as well
09:48 🔗 SketchCow This is a lot of talking for a simple thing
09:48 🔗 GE has quit IRC (Remote host closed the connection)
09:54 🔗 Lord_Nigh SketchCow: ok, do we have a plan? we can continue probing the ftp periodically until a slot opens up?
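The "probe periodically until a slot opens" idea could be sketched roughly as below. This is only an illustration, not what anyone in the channel actually ran: `connect_with_backoff` and `open_aftp` are hypothetical names, and the attempt count and delays are arbitrary assumptions. It relies on `ftplib` raising `ftplib.error_temp` for 4xx replies such as the `421` shown above, and it waits longer after each refusal so the probe does not add to the load on the server.

```python
import ftplib
import time


def connect_with_backoff(connect, max_attempts=5, base_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off after temporary refusals.

    connect      -- zero-argument callable that returns a connection (hypothetical helper)
    max_attempts -- assumed limit, chosen arbitrarily for this sketch
    base_delay   -- seconds to wait after the first refusal; doubles each time
    sleep        -- injectable so the function can be exercised without waiting
    """
    for attempt in range(max_attempts):
        try:
            return connect()
        except ftplib.error_temp:
            # 4xx replies (e.g. "421 too many connected users") are temporary.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def open_aftp():
    # Opens the control connection only; login and transfers are left out.
    return ftplib.FTP("aftp.cmdl.noaa.gov", timeout=60)
```

Usage would be something like `ftp = connect_with_backoff(open_aftp)`, using a single connection rather than several in parallel.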
10:13 🔗 antomatic has quit IRC (Read error: Operation timed out)
10:14 🔗 antomatic has joined #archiveteam-bs
10:14 🔗 swebb sets mode: +o antomatic
10:15 🔗 Coderjoe has quit IRC (Read error: Operation timed out)
10:22 🔗 Coderjoe has joined #archiveteam-bs
11:02 🔗 kniffy has quit IRC (Ping timeout: 240 seconds)
11:10 🔗 kniffy has joined #archiveteam-bs
11:32 🔗 GE has joined #archiveteam-bs
12:22 🔗 SadDM has joined #archiveteam-bs
12:22 🔗 swebb sets mode: +o SadDM
12:22 🔗 tychot has quit IRC (Ping timeout: 245 seconds)
12:24 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
12:27 🔗 pizzaiolo has joined #archiveteam-bs
12:37 🔗 tychot has joined #archiveteam-bs
12:45 🔗 pizzaiolo any particular reason archiveteam.org doesn't have HTTPS?
12:46 🔗 odemg has quit IRC (Remote host closed the connection)
12:51 🔗 bwn has joined #archiveteam-bs
12:53 🔗 Honno has quit IRC (Ping timeout: 370 seconds)
12:59 🔗 Simpbrain has quit IRC (Remote host closed the connection)
14:08 🔗 Honno has joined #archiveteam-bs
14:18 🔗 odemg has joined #archiveteam-bs
14:31 🔗 atomicthu has joined #archiveteam-bs
14:31 🔗 PotcFdk has joined #archiveteam-bs
14:53 🔗 vitzli has joined #archiveteam-bs
14:56 🔗 Honno has quit IRC (Ping timeout: 370 seconds)
16:06 🔗 xmc because nobody has made it happen yet
16:07 🔗 xmc like with most things in archiveteam that are reasonable ideas, nobody's gotten a bee up their ass to do it yet
16:37 🔗 pizzaiolo xmc: who's hosting the wiki?
16:38 🔗 Frogging s/archiveteam/any volunteer efforts/
16:44 🔗 kniffy has quit IRC (Ping timeout: 240 seconds)
16:49 🔗 kniffy has joined #archiveteam-bs
17:11 🔗 schbirid2 i used caddy for the first time today, how AWESOME
17:22 🔗 ndiddy pizzaiolo: why would you need https to access a wiki
17:22 🔗 ndiddy "oh no, attackers can spy on my wiki edits!"
17:24 🔗 Frogging lol
17:25 🔗 Frogging privacy activists would disagree but yeah I don't think it matters much
17:25 🔗 ndiddy as far as i can tell the login form is a frame to an https page so there shouldn't be any security issues
17:25 🔗 Frogging also I forget who hosts the wiki
17:26 🔗 yipdw if it's the AT wiki, there's no HTTPS at any phase, login or otherwise
17:26 🔗 ndiddy yolo
17:26 🔗 yipdw there are valid reasons for wanting HTTPS to cover all requests beyond the initial login
17:27 🔗 yipdw see e.g. technique popularized by firesheep, edit integrity
17:27 🔗 yipdw fortunately these days getting a useful cert isn't too hard what with LE and all
17:27 🔗 Frogging someone could narf your session key
17:28 🔗 Frogging or just your password *shrug*
17:28 🔗 yipdw yes, that is the technique popularized by firesheep
17:28 🔗 yipdw HTTPS isn't an unreasonable request, it just hasn't really been on the priority list
17:33 🔗 yipdw also haha I like the way Intel formatted this
17:33 🔗 yipdw https://gitlab.peach-bun.com/yipdw/random-images/uploads/298582b3b25e313b88950a27915b9e5f/intel6.png
17:33 🔗 yipdw "Buy Composer Edition! IT INCLUDES NOTHING"
17:33 🔗 Frogging lmao
17:33 🔗 yipdw yeah, you have the "All Editions Feature" subheading on the other side
17:33 🔗 Frogging Improved buying experience
17:33 🔗 MrRadar Both Firefox and Chrome are planning to mark non-secure pages with password fields as explicitly non-secure. It may be a good idea to roll out HTTPS at least for the login page before then (though if you go that far you may as well go all the way)
17:34 🔗 yipdw I dunno, I guess if the idea is to make people go for the Cluster Edition, it works
17:38 🔗 Frogging Is there some sort of filesystem container that compresses the contents automatically? My IRC logs are large but easily compressible, it'd be cool if I could put them in their own filesystem that compresses them transparently
17:38 🔗 Frogging And still be easily greppable because it's transparent
17:38 🔗 MrRadar On Windows you can enable NTFS compression on a per-folder basis
17:38 🔗 hook54321 sets mode: +o Asparagir
17:38 🔗 Frogging linux
17:38 🔗 MrRadar There's probably some FUSE file system that would work
17:40 🔗 arkiver ftp://aftp.cmdl.noaa.gov/ is up for me
17:40 🔗 Frogging must have been a reddit hug of death then
17:41 🔗 arkiver yeah
17:41 🔗 arkiver it won't work when 100 people go after the same file
17:42 🔗 arkiver that's why we have the warrior.
17:42 🔗 yipdw btrfs does transparent compression via lzo
17:42 🔗 * arkiver is afk for a bit
17:42 🔗 yipdw I dunno if you want to switch to btrfs though
17:42 🔗 arkiver when I'm back I'll check if we have all of aftp
17:42 🔗 rocode btrfs will be great when it is stable.
17:42 🔗 yipdw not that it's necessarily unstable (I have no idea), but switching filesystems is generally just a pain in the ass no matter what the target
17:43 🔗 n00b709 has joined #archiveteam-bs
17:43 🔗 Frogging yipdw: it'd just be a mounted image that sits inside my normal filesystem,
17:44 🔗 yipdw I guess that works, yeah
17:44 🔗 n00b709 has left
17:45 🔗 yipdw I never tried using different filesystems for LVM volume groups
17:45 🔗 yipdw maybe I should see if that works
17:45 🔗 yipdw you know, the next time I decide "Wow, I have nothing better to do than destroy my computer"
17:47 🔗 rocode "I will just do this one thing, how bad can it be? There is even a tutorial!" -> 5 hours later -> "Okay, I have managed to reflash my bios and unbrick my system. Let's never do that again." -> Repeat.
17:47 🔗 MrRadar https://xkcd.com/349/
17:48 🔗 yipdw yeah, basically
17:48 🔗 yipdw although, to be fair, I have had very good experiences with LVM
17:48 🔗 yipdw it was very handy when I was moving from 2 80 GB SATA SSDs to a 400 GB PCIe
17:48 🔗 yipdw add the new drive to a volume group, copy the two SSD VGs to the new one
17:48 🔗 yipdw done
17:49 🔗 yipdw I was surprised. I was expecting to have to finagle filesystem arcana or some shit and I was disappointed, in a way, that I didn't, because it meant I couldn't make snide "In 2016, ..." tweets
17:50 🔗 jrwr has joined #archiveteam-bs
17:59 🔗 kurt|rbx1 has quit IRC (Ping timeout: 260 seconds)
18:02 🔗 Honno has joined #archiveteam-bs
18:12 🔗 mhazinsk Frogging: ZFS does compression
18:19 🔗 vitzli has quit IRC (Quit: Leaving)
18:20 🔗 odemg has quit IRC (Remote host closed the connection)
18:36 🔗 kniffy has quit IRC (Ping timeout: 260 seconds)
18:42 🔗 kniffy has joined #archiveteam-bs
18:42 🔗 zino ZFS with lz4 is better than sliced bread.
18:48 🔗 VADemon has joined #archiveteam-bs
18:59 🔗 odemg has joined #archiveteam-bs
19:05 🔗 kniffy has quit IRC (Ping timeout: 240 seconds)
19:06 🔗 Lord_Nigh https://www.reddit.com/r/DataHoarder/comments/5q4xxe/erik_fichtner_on_twitter_please_wget_m_np/dcxq9n9/ is worrying
19:06 🔗 Lord_Nigh judging by the warrior stats we only have at most 14gb from that ftp
19:07 🔗 Lord_Nigh while the person in the parent post of the one i linked on reddit has 514GB
19:09 🔗 Lord_Nigh SketchCow: maybe ask that /u/fuckoffplsthankyou guy if he can upload the 514GB to an AT machine?
19:11 🔗 kniffy has joined #archiveteam-bs
19:15 🔗 merp has joined #archiveteam-bs
19:21 🔗 merp has left
19:28 🔗 kniffy has quit IRC (Ping timeout: 240 seconds)
19:33 🔗 kniffy has joined #archiveteam-bs
19:52 🔗 Jordan has quit IRC (Read error: Operation timed out)
19:53 🔗 gourgastl has joined #archiveteam-bs
19:54 🔗 Jordan has joined #archiveteam-bs
20:03 🔗 Jordan has quit IRC (Remote host closed the connection)
20:04 🔗 Jordan has joined #archiveteam-bs
20:23 🔗 ravetcofx has joined #archiveteam-bs
20:31 🔗 kevinr has joined #archiveteam-bs
20:41 🔗 godane has quit IRC (Read error: Operation timed out)
20:48 🔗 nickname_ has joined #archiveteam-bs
21:24 🔗 alembic anybody recommend good bang/$ when it comes to VPS providers to run warrior on. I'm running on DigitalOcean atm with moderate success.
21:26 🔗 Kaz depending on the budget, but OVH's SoYouStart is pretty much the best you're going to get (not VPSs though, dedis)
21:27 🔗 alembic hmpf... might have to pay 15% sales tax on that since I'm Canadian, but 2GB for $7/mo is very competitive.
21:29 🔗 gourgastl has quit IRC (Quit: Page closed)
21:31 🔗 alembic oops, that was the Kimsufi line
21:33 🔗 Kaz RAM isn't everything though, those Atom processors will be terrible for, well, pretty much everything
21:35 🔗 alembic ahaha yah, but the scripts are mostly I/O bound, no? The load wouldn't be so crazy as to bottleneck the I/O, would it?
21:36 🔗 Kaz on those processors I'd say the CPU would be your issue
21:37 🔗 Kaz depends on the project really. I'm maxing out 8 cores of a Xeon E3 on ftp-gov, but that is excessively heavy on cpu
21:39 🔗 rocode ^ I use Scaleway VPS for mine. $10 a month for 6 x x86 cores, 8GB of RAM, 200GB of SSD, 200mbit unmetered.
21:40 🔗 godane has joined #archiveteam-bs
21:43 🔗 Kaz rocode: have you tried ftp-gov on that? wondering about performance
21:45 🔗 rocode Mine is currently running 2 concurrent of the following projects: wikiteam, urlshort, yuku, pdf, googlecode, ftp, vine, ipernity, yahooanswers, and 5 grab-site grabbers.
21:45 🔗 rocode So, I am not running FTPgov, but I don't think it would be a problem. Something caused me not to run it, probably an error of some sort.
21:46 🔗 Kaz ah
21:46 🔗 Kaz just wondering as I'm having 'issues' with online/scaleway's network
21:47 🔗 rocode Tried switching to the Amsterdam datacenter?
21:47 🔗 Kaz can't, as it's a dedi through online.net rather than scaleway
21:48 🔗 rocode Ah. No clue then.
21:48 🔗 rocode I have never had network issues.
21:48 🔗 Kaz for reference, 107ms from OVH in Roubaix to one of the nasa FTPs. From online in paris I'm seeing an average of 1270ms :(
21:51 🔗 HCross2 Kaz: very simple solution. More ovh :p
21:51 🔗 Ravenloft has quit IRC (Read error: Connection reset by peer)
21:53 🔗 Kaz I'm enjoying the £20/mo I'm already saving!
21:53 🔗 Kaz shame about the 260mbps limit
21:54 🔗 HCross id use hetzner in a heartbeat if it wasnt for their bw caps
21:54 🔗 Kaz turns out we're not actually the ideal customer for any provider, who knew?
21:55 🔗 Kaz OVH could do away with Kimsufi though, bring the price down for the rest of us
21:55 🔗 HCross Kaz, tempted to test servdiscount - however from what I read their network can be slow
21:57 🔗 Kaz They have a 30day trial/moneyback thing
21:57 🔗 Kaz it's win/win. The network either works and you keep it, or it's crap so they have no reason not to honor it
21:57 🔗 HCross https://servdiscount.com/en/services/payment-methods.html is not that good tho
21:58 🔗 Kaz ?
21:58 🔗 Kaz visa/mastercard accepted
21:58 🔗 HCross for a 5% fee
21:58 🔗 HCross if you scroll down
21:58 🔗 Kaz ah, yeah they hide that a bit out the way
21:58 🔗 HCross Can do SEPA, but my bank charge £4 per payment on it
21:59 🔗 HCross although 2 eur on a 40eur server isnt too bad
22:02 🔗 Kaz http://www.speedtest.net/result/4242574225.png
22:02 🔗 Kaz admittedly that's old
22:02 🔗 HCross wonderful - the order form has suddenly gone all german on me
22:02 🔗 Kaz hah
22:03 🔗 HCross the url has /en/ in it - yet its gone all German
22:03 🔗 Kaz https://www.peeringdb.com/net/1007 too - doesn't seem too bad but doubt you'll get anything worthwhile outside of europe
22:03 🔗 HCross im pretty amazed with my box in LA. 140ms between my home and the server, RDP still feels like its at OVH in france, and gameservers and stuff run well
22:03 🔗 BartoCH can someone requeue the flickr items or change the warrior default to something else? I'm feeling like it doesn't do much right now.
22:06 🔗 HCross Kaz, their order form cant take a full UK postcode
22:10 🔗 Kaz ah
22:14 🔗 alembic i've noticed ftp-gov has been heavy on CPU as well Kaz, any idea why? Also, why does the item have to fit in memory?
22:14 🔗 alembic if it's a matter of having to send it off to the AT servers, then we should be able to read and send in chunks, no?
22:16 🔗 Kaz honestly no idea, I haven't looked much into what the scripts are actually doing other than pointing wpull at some ftp sites
22:17 🔗 yipdw the ftp-gov scripts use wpull's on-disk URL database; if there's a memory limit somewhere, at least it isn't that
22:20 🔗 alembic k
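The "read and send in chunks" idea from alembic's question can be sketched as follows. This is a generic illustration and makes no claim about how the ftp-gov scripts or the AT upload path actually work; `iter_chunks` and the 1 MiB default are assumptions made for the example. The point is only that a file-like object can be consumed piece by piece, so the whole item never has to sit in memory at once.

```python
import io


def iter_chunks(stream, chunk_size=1024 * 1024):
    """Yield successive chunks from a binary file-like object.

    chunk_size is an assumed default (1 MiB); only one chunk is
    held in memory at a time.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means end of stream.
            break
        yield chunk


# Small in-memory example: a 10-byte payload read 4 bytes at a time.
pieces = list(iter_chunks(io.BytesIO(b"abcdefghij"), chunk_size=4))
```

An uploader would iterate over `iter_chunks(open(path, "rb"))` and hand each piece to whatever transport it uses.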
22:21 🔗 yipdw as for high CPU, we notice the same phenomenon in archivebot, which is also using wpull
22:21 🔗 yipdw the cause is presently unknown because we haven't profiled yet
22:22 🔗 yipdw in archivebot's case it's possible it's not wpull proper but rather one of our plugins
22:22 🔗 yipdw etc.
22:23 🔗 HCross yipdw, are you taking pipeline applications?
22:23 🔗 yipdw maybe soon. I want to keep watching the new pipeline code for a bit
22:24 🔗 alembic coolio... I've been getting a lot of memory surprises with long-running python scripts these days myself
22:24 🔗 alembic one time it was a memory leak from a c library interface, the other time it was one of those weird suboptimal GC cases
22:25 🔗 yipdw we usually don't hit memory problems in pipelines
22:25 🔗 yipdw although sometimes it happens
22:25 🔗 yipdw it's odd
22:25 🔗 yipdw i'd really love a visualvm equivalent for CPython
22:26 🔗 alembic there are some weird profiling tools for python, but it's all opaque to me
22:28 🔗 yipdw there are a few but the ones I've seen give you a flat or graph profile at the end of some block, or at process termination. i want the ability (either by explicit instrumentation or VM hooks) to watch the process as it runs
22:28 🔗 yipdw there's so much information you can get about process behavior by watching it over time vs. trying to reconstruct that history from a profiler snapshot. it's strange that this isn't more common in profiler land
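A very rough approximation of "watch the process as it runs" is possible with the standard library alone. The sketch below is an assumption-laden illustration, not a visualvm equivalent and not something archivebot or wpull ships: `MemorySampler` is a hypothetical helper that uses `tracemalloc` and a background thread to record traced memory at a fixed interval, giving a time series rather than a single end-of-run snapshot. It covers Python-level allocations only and says nothing about CPU time.

```python
import threading
import time
import tracemalloc


class MemorySampler:
    """Record (timestamp, current_bytes, peak_bytes) while a block runs."""

    def __init__(self, interval=0.01):
        self.interval = interval  # seconds between samples (assumed default)
        self.samples = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while True:
            current, peak = tracemalloc.get_traced_memory()
            self.samples.append((time.monotonic(), current, peak))
            # wait() returns True once stop has been requested.
            if self._stop.wait(self.interval):
                break

    def __enter__(self):
        tracemalloc.start()
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        tracemalloc.stop()
        return False
```

Wrapping a long-running section in `with MemorySampler() as s:` leaves a list in `s.samples` that can be plotted or logged afterwards.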
22:29 🔗 yipdw like offhand there's visualvm, instruments, telemetry, uh
22:29 🔗 yipdw vtune maybe
22:42 🔗 dashcloud yipdw: you may have seen this already, but just in case you haven't, is this helpful: http://www.brendangregg.com/blog/2016-10-27/dtrace-for-linux-2016.html (the person who worked a lot on bpf)
22:44 🔗 yipdw dtrace is pretty nice and there is some work on integrating dtrace probes into CPython, yeah
22:44 🔗 yipdw I hope that takes off
22:44 🔗 yipdw I didn't know DTrace made it into Linux
22:45 🔗 yipdw oh, wait, it didn't
22:45 🔗 yipdw well maybe one day
22:56 🔗 odemg has quit IRC (Remote host closed the connection)
23:02 🔗 GE has quit IRC (Quit: zzz)
23:05 🔗 Honno has quit IRC (Ping timeout: 370 seconds)
23:20 🔗 odemg has joined #archiveteam-bs
23:23 🔗 dashcloud from the post, it sounds like bpf offers everything dtrace does except the vast library of pre-existing probes
23:25 🔗 Stiletto has quit IRC (Read error: Operation timed out)
23:26 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:32 🔗 dashcloud has joined #archiveteam-bs
23:37 🔗 BlueMaxim has joined #archiveteam-bs
23:55 🔗 Stil3tt0 has joined #archiveteam-bs
