#archiveteam 2017-10-13,Fri

Time Nickname Message
00:00 🔗 Rondom has quit IRC (Remote host closed the connection)
00:01 🔗 Rondom has joined #archiveteam
00:03 🔗 Soni has quit IRC (Read error: Operation timed out)
00:15 🔗 dboard2 is now known as dboard
00:59 🔗 Laverne has quit IRC (Read error: Operation timed out)
01:24 🔗 Laverne has joined #archiveteam
01:47 🔗 username1 has joined #archiveteam
01:50 🔗 schbirid2 has quit IRC (Read error: Operation timed out)
01:56 🔗 Stilett0 has quit IRC (Ping timeout: 255 seconds)
02:46 🔗 Stilett0 has joined #archiveteam
03:10 🔗 Soni has joined #archiveteam
03:24 🔗 qw3rty5 has joined #archiveteam
03:28 🔗 qw3rty4 has quit IRC (Read error: Operation timed out)
03:34 🔗 SketchCow ----------------------------------
03:34 🔗 SketchCow All Archiveteam Programmy Nerds
03:34 🔗 SketchCow Your presence is requested
03:34 🔗 SketchCow In #last20
03:34 🔗 SketchCow ----------------------------------
04:03 🔗 jrwr has quit IRC (Max SendQ exceeded)
04:16 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
04:23 🔗 ZexaronS has joined #archiveteam
04:23 🔗 Sk1d has joined #archiveteam
04:43 🔗 Stilett0 has quit IRC (Read error: Connection reset by peer)
04:44 🔗 Stilett0 has joined #archiveteam
04:45 🔗 Stilett0 is now known as Stiletto
05:08 🔗 MMovie2 has quit IRC (Ping timeout: 600 seconds)
05:12 🔗 MMovie has joined #archiveteam
05:25 🔗 ZexaronS has quit IRC (Quit: Leaving)
05:36 🔗 ZexaronS has joined #archiveteam
06:13 🔗 Nemo_bis SketchCow splendid appearance in http://blog.archive.org/2017/10/13/the-20th-century-time-machine/ :)
06:15 🔗 jrwr has joined #archiveteam
06:17 🔗 Nemo_bis wow https://archive.org/details/last20
06:17 🔗 Pixi` has quit IRC (Quit: Pixi`)
06:17 🔗 Pixi has joined #archiveteam
06:37 🔗 K4k has quit IRC (Ping timeout: 255 seconds)
06:38 🔗 K4k has joined #archiveteam
07:01 🔗 ZexaronS- has joined #archiveteam
07:02 🔗 ZexaronS has quit IRC (Ping timeout: 260 seconds)
07:15 🔗 Guest has joined #archiveteam
07:19 🔗 Guest has quit IRC (Connection closed)
07:22 🔗 Soni has quit IRC (Ping timeout: 272 seconds)
07:39 🔗 Soni has joined #archiveteam
07:42 🔗 atomotic has joined #archiveteam
07:43 🔗 hive-mind has quit IRC (Remote host closed the connection)
07:50 🔗 hive-mind has joined #archiveteam
07:50 🔗 Honno has joined #archiveteam
09:00 🔗 Jonison has joined #archiveteam
09:00 🔗 pizzaiolo has quit IRC (Quit: pizzaiolo)
09:32 🔗 Honno has quit IRC (Read error: Operation timed out)
09:37 🔗 atomotic has quit IRC (Quit: atomotic)
09:40 🔗 atomotic has joined #archiveteam
10:01 🔗 Mateon1 has quit IRC (Read error: Operation timed out)
10:02 🔗 Mateon1 has joined #archiveteam
10:29 🔗 BlueMaxim has quit IRC (Quit: Leaving)
10:47 🔗 schbirid2 has joined #archiveteam
10:50 🔗 username1 has quit IRC (Read error: Operation timed out)
11:05 🔗 icedice has joined #archiveteam
11:06 🔗 Valentine has joined #archiveteam
11:14 🔗 atomotic has quit IRC (Quit: atomotic)
11:51 🔗 atomotic has joined #archiveteam
11:58 🔗 c0mpass has joined #archiveteam
11:59 🔗 c0mpass I have a question: if I run the warrior on my dedicated server, nothing illegal is going through it, right?
12:01 🔗 JAA Define: "illegal"
12:02 🔗 c0mpass Uhhh
12:02 🔗 c0mpass You know what I mean
12:02 🔗 JAA Strictly speaking, almost everything we archive is protected by copyright, and in some jurisdictions, laws regarding unauthorised access to computer systems might apply.
12:02 🔗 joepie91 aside from that, many archived sites are user content sites
12:02 🔗 c0mpass Reason why I ask is I have about 40Gbps of available servers
12:02 🔗 joepie91 that usually contain technically-illegal content *somewhere*
12:03 🔗 joepie91 so the more useful question, I think, is "will I get in trouble for running the warrior"
12:03 🔗 c0mpass Yeah
12:03 🔗 c0mpass That's basically it
12:03 🔗 joepie91 to which my answer would be "usually not, but if you're doing 40gbps, that might change"
12:03 🔗 joepie91 mostly because at 40gbps the site owners start going "wtf?" :)
12:03 🔗 c0mpass Lmao
12:03 🔗 joepie91 c0mpass: depending on the amount of storage space, you may be better off running an rsync target
12:03 🔗 joepie91 ie. a collection server where warriors send their archived data to
12:04 🔗 joepie91 before it ends up in the Internet Archive
12:04 🔗 c0mpass All low storage, NVME servers.
12:04 🔗 joepie91 ah, crap
12:04 🔗 c0mpass Work gives me 4 10Gbps servers free as a perk
12:04 🔗 JAA It's also worth mentioning that the warrior projects are usually rate limited, so you wouldn't actually fire 40 Gbit/s at the targets.
12:04 🔗 c0mpass I just don't want to get fired because of downloading illegal stuff on them
12:04 🔗 joepie91 c0mpass: so I'd say that it's probably safe to run the warrior (I don't think anybody's ever gotten in trouble for it? automated IP bans at worst), but I wouldn't try to do so at 40gbps
12:05 🔗 joepie91 basically, make it not come across as an attack
12:05 🔗 c0mpass I mean I could throttle it to gigabit
12:05 🔗 c0mpass even 500 meg
12:05 🔗 joepie91 yeah, you'd probably want to throttle to way less
12:05 🔗 joepie91 500mbps is probably the upper bound of what you can get away with before site owners start asking questions
12:05 🔗 joepie91 (ballpark guess, mind)
12:05 🔗 joepie91 also depends whether it's all from the same IP range, etc.
12:05 🔗 c0mpass I mean if I just do the yahoo answers thing then I should have no issues at 40
12:06 🔗 JAA Yahoo throttles heavily.
12:06 🔗 c0mpass IP's are all in the same block
12:06 🔗 joepie91 right. then you'd want to maintain one ratelimit for all of them
12:06 🔗 c0mpass Second question.
12:06 🔗 joepie91 probably safe to run with tens of threads for most projects, especially partaking in the higher-bandwidth ones like video sites
12:06 🔗 c0mpass If I were to do this through a VPN
12:06 🔗 joepie91 just not the heavily throttled projects :)
12:07 🔗 joepie91 c0mpass: it's generally discouraged to run warriors on anything other than a direct uncensored pipe to the internet, because there are too many factors in between that could corrupt the data
12:07 🔗 joepie91 provider cockups, block pages, etc.
12:07 🔗 c0mpass thats what I thought
12:08 🔗 joepie91 even adding a VPN would basically double the amount of parties that could be messing up the responses :P
12:08 🔗 c0mpass Hi BartoCH
12:08 🔗 BartoCH hullo
12:08 🔗 c0mpass Yeah figured.
12:08 🔗 c0mpass BartoCH: yes.
12:08 🔗 BartoCH hrhr
12:08 🔗 c0mpass Okay well I'll set this up on one server and see how it goes
12:10 🔗 joepie91 c0mpass: hm, only just realized we're in #archiveteam. if you have further questions, prefer to switch to #archiveteam-bs as this channel is mostly for low-noise announcements and "oh no this site is dying, did you hear" type messages :)
12:10 🔗 c0mpass Ohhh so sorry
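[Editor's note: joepie91's suggestion above, to maintain one rate limit for all servers in the same IP block, could be sketched as a shared token bucket. This is a minimal illustration only, not how the warrior actually throttles. The class name SharedTokenBucket, its parameters, and the 500 Mbit/s figure (taken from the ballpark guess in the conversation) are assumptions, and a lock only coordinates threads in one process; several machines would need a shared coordinator, which is not shown.]

```python
import threading
import time


class SharedTokenBucket:
    """Hypothetical token bucket capping the combined throughput of many workers."""

    def __init__(self, rate_bytes_per_s, capacity_bytes, clock=time.monotonic):
        self.rate = float(rate_bytes_per_s)      # refill speed
        self.capacity = float(capacity_bytes)    # largest allowed burst
        self.tokens = float(capacity_bytes)      # start with a full bucket
        self.clock = clock                       # injectable for testing
        self.last = clock()
        self.lock = threading.Lock()

    def try_consume(self, nbytes):
        """Return True if nbytes may be sent now, False if the caller should wait."""
        with self.lock:
            now = self.clock()
            # Refill according to elapsed time, never beyond capacity.
            refill = (now - self.last) * self.rate
            self.tokens = min(self.capacity, self.tokens + refill)
            self.last = now
            if nbytes <= self.tokens:
                self.tokens -= nbytes
                return True
            return False


# Assumed budget: 500 Mbit/s total = 62,500,000 bytes/s, shared by all workers.
bucket = SharedTokenBucket(rate_bytes_per_s=62_500_000, capacity_bytes=62_500_000)
```

[Every worker thread would call bucket.try_consume(len(chunk)) before sending and back off briefly on False, so the total stays under the one shared budget no matter how many workers run.]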
13:00 🔗 atomotic has quit IRC (Quit: atomotic)
13:24 🔗 Valentine hi all, can I get the Archive Team's help to save the news site AsiaOne, which might shut down as early as next month? https://sg.news.yahoo.com/sph-news-aggregator-site-asiaone-close-090754309.html
13:30 🔗 JAA I'll throw it into ArchiveBot. Because there is a huge queue currently, I'm not sure if it will be grabbed in time, but let's try...
13:36 🔗 JAA Note, that won't grab everything, e.g. no videos (I think).
14:12 🔗 Valentine that's fine, thanks!
14:13 🔗 godane has quit IRC (Quit: Leaving.)
14:57 🔗 godane has joined #archiveteam
15:06 🔗 klapperst has joined #archiveteam
15:07 🔗 klapperst hi
15:13 🔗 Jonison has quit IRC (Read error: Connection reset by peer)
15:20 🔗 ZexaronS- has quit IRC (Quit: Leaving)
15:32 🔗 atomotic has joined #archiveteam
15:38 🔗 schbirid2 has quit IRC (Quit: Leaving)
15:48 🔗 Xe has quit IRC (Max SendQ exceeded)
15:52 🔗 icedice has quit IRC (Quit: Leaving)
16:02 🔗 Xe has joined #archiveteam
16:06 🔗 icedice has joined #archiveteam
16:07 🔗 klapperst has quit IRC (Quit: Page closed)
16:34 🔗 schbirid has joined #archiveteam
16:41 🔗 atomotic has quit IRC (Quit: atomotic)
16:42 🔗 ZexaronS has joined #archiveteam
17:24 🔗 ZexaronS has quit IRC (Quit: Leaving)
17:47 🔗 Starholme has joined #archiveteam
17:55 🔗 kepler45 has joined #archiveteam
17:59 🔗 bRick5772 has joined #archiveteam
18:41 🔗 icedice has quit IRC (Read error: Connection reset by peer)
18:42 🔗 icedice has joined #archiveteam
19:07 🔗 kris33 has joined #archiveteam
19:25 🔗 atrocity has quit IRC (Read error: Operation timed out)
19:36 🔗 kris33 has quit IRC (Textual IRC Client: www.textualapp.com)
20:44 🔗 atomicthu https://scrapinghub.com/platform this looks extremely useful if a bit expensive
20:47 🔗 joepie91 atomicthu: that looks fantastically proprietary :)
20:47 🔗 atomicthu yep
20:47 🔗 atomicthu not a thing you can download, just a serve
20:47 🔗 atomicthu *service
20:47 🔗 atomicthu since it's 2017 and Gotta Make Mad VC Cash yo
20:47 🔗 joepie91 heh
20:47 🔗 joepie91 "NPM package as a service"
20:48 🔗 joepie91 more seriously; probably not useful for archiveteam
20:48 🔗 joepie91 due to its proprietary nature
21:01 🔗 MMovie has quit IRC (Read error: Operation timed out)
21:09 🔗 bRick5772 has quit IRC (Quit: Leaving.)
21:29 🔗 Honno has joined #archiveteam
21:42 🔗 schbirid has quit IRC (Quit: Leaving)
21:47 🔗 MMovie has joined #archiveteam
21:58 🔗 MMovie2 has joined #archiveteam
22:02 🔗 Valentin- has joined #archiveteam
22:02 🔗 MMovie has quit IRC (Read error: Operation timed out)
22:03 🔗 Valentine has quit IRC (Ping timeout: 506 seconds)
22:06 🔗 MMovie has joined #archiveteam
22:10 🔗 MMovie2 has quit IRC (Read error: Operation timed out)
22:21 🔗 underscor has quit IRC (Quit: No Ping reply in 180 seconds.)
22:22 🔗 underscor has joined #archiveteam
22:22 🔗 swebb sets mode: +o underscor
22:26 🔗 atomicthu joepie91: i was more looking at the "crawlera" part since it works as a proxy
22:26 🔗 atomicthu might be useful for sites that limit bandwidth per-IP
23:00 🔗 Starholme has quit IRC (Quit: Page closed)
23:12 🔗 dashcloud has joined #archiveteam
23:21 🔗 kepler45 has quit IRC (Quit: Leaving)
23:27 🔗 MMovie2 has joined #archiveteam
23:28 🔗 Gfy has quit IRC (Read error: Operation timed out)
23:28 🔗 MMovie has quit IRC (Read error: Operation timed out)
23:31 🔗 BlueMaxim has joined #archiveteam
23:32 🔗 Gfy has joined #archiveteam
23:39 🔗 Honno has quit IRC (Read error: Operation timed out)
23:55 🔗 PotcFdk has quit IRC (~'o'/)
23:59 🔗 MMovie has joined #archiveteam
