#archiveteam-bs 2016-09-18,Sun


Time Nickname Message
00:01 🔗 metal_cam has quit IRC (Read error: Operation timed out)
00:03 🔗 balrog has quit IRC (Read error: Operation timed out)
00:08 🔗 balrog has joined #archiveteam-bs
00:08 🔗 swebb sets mode: +o balrog
00:24 🔗 metalcamp has quit IRC (Ping timeout: 506 seconds)
01:13 🔗 BlueMaxim has joined #archiveteam-bs
02:32 🔗 JesseW has joined #archiveteam-bs
03:01 🔗 ivan hook54321: isn't the remote server the one controlling access? how would you expect to just bypass that?
03:02 🔗 hook54321 I would assume that's how it works. I'm not really sure though.
03:03 🔗 ivan yes. it would be hilarious if the client were responsible for enforcing this.
03:04 🔗 hook54321 I think some of the teachers have access to the network drives...
03:05 🔗 ivan I would recommend not trying to get expelled, if you want to graduate from wherever you're at
03:05 🔗 hook54321 I'm not trying to
03:06 🔗 hook54321 What if I asked some teachers if they could copy the stuff onto an external drive for me?
03:06 🔗 ivan sounds like it could work
03:07 🔗 hook54321 I kinda doubt they would do it though
03:07 🔗 hook54321 Well, I know one teacher pretty well
03:45 🔗 ndiddy has quit IRC (Read error: Connection reset by peer)
04:16 🔗 MrRadar Someone compiled a list of every open FTP server on the IPv4 Internet: https://github.com/massivedynamic/openftp4
04:17 🔗 JesseW hm, we probably want those for the FTP project
04:51 🔗 Stiletto has joined #archiveteam-bs
04:52 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
04:53 🔗 Sk1d has quit IRC (Ping timeout: 194 seconds)
04:59 🔗 Sk1d has joined #archiveteam-bs
05:23 🔗 Frogging has quit IRC (El Psy Kongroo!)
05:26 🔗 RichardG_ has joined #archiveteam-bs
05:26 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
05:30 🔗 Frogging has joined #archiveteam-bs
06:01 🔗 GE has joined #archiveteam-bs
06:39 🔗 GE has quit IRC (Ping timeout: 255 seconds)
07:11 🔗 HCross2 I tried doing something like that once. Never again.
07:16 🔗 metalcamp has joined #archiveteam-bs
07:25 🔗 hook54321 HCross2: why? what happened?
07:25 🔗 Sanqui MrRadar: wow, I keep forgetting stuff like this is feasible nowadays
07:26 🔗 Sanqui now I want to scan for every MUD/MUSH server
07:26 🔗 HCross2 Well. When you contact port 21 on every IP on the internet. You hit Government ranges. Governments don't like that
07:26 🔗 Sanqui Ah, right. And they don't always run on predictable ports either.
07:27 🔗 HCross2 If I can remember right, the MoD got in contact
07:28 🔗 hook54321 Someone once did something like that except for Minecraft servers, I think they received a cease and desist notice.
07:28 🔗 Sanqui scanning the Entire internet is serious business, I guess
07:31 🔗 ranma there was a def con talk on that
07:31 🔗 hook54321 What's the difference between the ftp project and ArchiveBot?
07:32 🔗 yipdw they're not related at all
07:32 🔗 Sanqui ArchiveBot is a service
07:32 🔗 hook54321 Isn't ArchiveBot compatible with ftp though? Why don't they just use that?
07:32 🔗 Sanqui ftp project is a regular project
07:32 🔗 yipdw because slamming terabytes of FTP through a few pipelines is not a good idea
07:33 🔗 hook54321 Ah
07:33 🔗 yipdw there are other ways to get archives of FTP sites than download them
07:33 🔗 Sanqui single ftp servers can be archived by archivebot
07:33 🔗 hook54321 There are?
07:34 🔗 ranma https://www.youtube.com/watch?v=UOWexFaRylM
07:34 🔗 ranma Massscanning the Internet - Defcon 22 (2014)
07:34 🔗 yipdw yes? downloading is one option but for problematic sites you also have the option of writing to the maintainer
07:34 🔗 HCross2 hook54321: archivebot pipelines are a finite resource
07:34 🔗 yipdw and as far as FTP is concerned, a file mirror is a pretty good mirror
07:34 🔗 hook54321 HCross2: I know
07:35 🔗 hook54321 How is the FTP stuff handled if the maintainer replies?
07:36 🔗 yipdw I don't know, I'm throwing it out there as a plan B
07:36 🔗 yipdw there is of course a warrior project for the FTP grab
07:38 🔗 schbirid has joined #archiveteam-bs
07:39 🔗 GLaDOS has quit IRC (Oh crap, I died.)
07:40 🔗 GLaDOS has joined #archiveteam-bs
08:28 🔗 PurpleSym sets mode: +o arkiver
08:29 🔗 PurpleSym sets mode: +o midas
08:40 🔗 odemg has joined #archiveteam-bs
08:42 🔗 godane you guys may want to go after this: https://www.youtube.com/user/UCBerkeley/playlists
08:42 🔗 godane http://news.berkeley.edu/2016/09/13/a-statement-on-online-course-content-and-accessibility/
08:47 🔗 Sanqui I honestly think they should be contacted. if they're as committed to keeping the content available as they claim, they should be willing to give everything out and have it on IA
09:10 🔗 odemg has quit IRC (Quit: Leaving)
09:26 🔗 brayden has quit IRC (Ping timeout: 633 seconds)
09:28 🔗 yipdw huh, Chrome 52 offers MIDI control permissions for websites
09:28 🔗 yipdw does this mean I can interact with websites via my Launchpad, because if so that is stupendously badass
09:30 🔗 brayden has joined #archiveteam-bs
09:30 🔗 swebb sets mode: +o brayden
09:30 🔗 yipdw oh there's an entire subsection of WebAudio about this, nice
09:36 🔗 GE has joined #archiveteam-bs
10:03 🔗 VADemon has joined #archiveteam-bs
10:23 🔗 godane has quit IRC (Read error: Operation timed out)
10:26 🔗 godane has joined #archiveteam-bs
10:45 🔗 atrocity wtf, the first video of the computer science playlist was webm
10:58 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
10:58 🔗 dashcloud has joined #archiveteam-bs
10:59 🔗 GE has quit IRC (Remote host closed the connection)
11:23 🔗 dashcloud has quit IRC (Read error: Operation timed out)
11:27 🔗 dashcloud has joined #archiveteam-bs
11:28 🔗 atrocity wow, i'm so glad youtube-dl has the -U option
11:28 🔗 atrocity couldn't get twitch streams until i did it and bam, it's working
11:37 🔗 atrocity IA is totally fine uploading 15GB videos, right?
11:37 🔗 atrocity the roguelike conference was yesterday and i'm downloading the twitch streams to upload to IA because i don't trust twitch
11:39 🔗 GE has joined #archiveteam-bs
11:40 🔗 Igloo^_^ Yeah
11:40 🔗 atrocity ok, cool
11:40 🔗 atrocity wow, i REALLY don't like twitch, lol
11:40 🔗 atrocity have to download a 300KB .part file for each chunk, then combine them all later into the video. taking forever with a 12.5GB file, lol
11:41 🔗 Igloo^_^ Yeahhhhh
11:41 🔗 Igloo^_^ Twitch
11:48 🔗 Kaz twitch was a fun project.
11:56 🔗 atrocity wow, they killed my download, lol
11:56 🔗 atrocity i was bouncing between 150 and 1.5MB/s
11:56 🔗 atrocity now i'm bouncing between 5MB-7MB/s
11:56 🔗 atrocity glad it resumes, too
11:56 🔗 atrocity hopefully they're not sending me fake data
11:57 🔗 atrocity i should've probably limited my speed to a "normal" streamer
11:57 🔗 Kaz I was under the impression Twitch's CDN limited download speed anyway?
12:00 🔗 atrocity oh, maybe so
12:00 🔗 atrocity my download maxes out around 9MB/s, so i'm almost hitting it. just odd that i was only hitting 1.5MB/s for 20 minutes, it dies, and now i'm getting 7.5MB/s
12:23 🔗 fie_ has joined #archiveteam-bs
12:25 🔗 fie__ has quit IRC (Ping timeout: 244 seconds)
12:26 🔗 BlueMaxim has quit IRC (Quit: Leaving)
12:50 🔗 atrocity so many pieces that windows is choking to even show the directory now
12:50 🔗 atrocity 14.5k+ pieces it's downloading to recombine, lol
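The recombining step atrocity describes can be sketched roughly as below. This is only an illustration, not what youtube-dl does internally: the `chunk-<n>.part` naming and the `combine_parts` helper are assumptions made up for this example. The one detail worth noting is that the pieces are sorted by their numeric index, since a plain alphabetical sort would put `chunk-10` before `chunk-2`.

```python
import re
import shutil
from pathlib import Path


def part_index(path: Path) -> int:
    """Extract the numeric index from a hypothetical 'chunk-<n>.part' name."""
    match = re.search(r"(\d+)\.part$", path.name)
    if match is None:
        raise ValueError(f"no numeric index in {path.name!r}")
    return int(match.group(1))


def combine_parts(part_dir: str, output: str) -> int:
    """Append every *.part file in part_dir to output, in numeric order.

    Returns the number of pieces that were combined.
    """
    parts = sorted(Path(part_dir).glob("*.part"), key=part_index)
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the piece so a multi-gigabyte result never sits in memory.
                shutil.copyfileobj(src, out)
    return len(parts)
```

Writing the combined file outside the directory that holds the pieces keeps it from being picked up by the `*.part` glob on a later run.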
13:12 🔗 VADemon atrocity: I assume it's server-side caching and optimisation of some sort
13:13 🔗 joepie91 https://www.ets.berkeley.edu/news/fall-2015-changes-course-capture-webcast-service :/
13:16 🔗 GE has quit IRC (Remote host closed the connection)
13:16 🔗 atrocity joepie91: that makes no sense
13:16 🔗 atrocity they're just admitting they're going to do it for their students, but because of costs, not for the world
13:16 🔗 atrocity the cost is already done if they're doing it for their students
13:22 🔗 joepie91 atrocity: it was hosted on YT so what costs did they really have?
14:02 🔗 fie_ has quit IRC (Quit: Leaving)
14:02 🔗 atrocity exactly what i mean, lol
14:03 🔗 fie has joined #archiveteam-bs
14:17 🔗 t2t2 the amount of parts a twitch vod has is roughly $duration_milliseconds/4000
14:24 🔗 t2t2 or in other words, one part per four seconds
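t2t2's rule of thumb can be written out as a small calculation. This is a sketch of the arithmetic only: the four-second part length comes from the message above and is described there as rough, and the function names are made up for this example.

```python
def estimated_part_count(duration_ms: int, part_ms: int = 4000) -> int:
    """Estimate how many parts a VOD has, assuming roughly one part per 4 seconds.

    Rounds up so a trailing partial part is still counted.
    """
    if duration_ms < 0:
        raise ValueError("duration_ms must be non-negative")
    return -(-duration_ms // part_ms)  # ceiling division on integers


def estimated_duration_hours(part_count: int, part_ms: int = 4000) -> float:
    """Invert the estimate: approximate VOD length in hours from a part count."""
    return part_count * part_ms / 3_600_000
```

Under the same assumption, the 14.5k pieces mentioned earlier would correspond to a recording of roughly 16 hours.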
14:29 🔗 brayden has quit IRC (Read error: Operation timed out)
14:31 🔗 joepie91 https://torrentfreak.com/elsevier-wants-cloudflare-to-expose-pirate-sites-160917/
14:31 🔗 joepie91 "In the ongoing copyright infringement lawsuit against alleged pirate sites Sci-Hub, Libgen and Bookfi, academic publisher Elsevier wants help from Cloudflare. The publisher informs the court that a subpoena against Cloudflare is needed to expose the personal details of the sites' owners."
14:34 🔗 joepie91 "In addition to contacting Cloudflare, the academic publisher also requested information from Whois Privacy Corp. – the domain registration anonymization service used by both Libgen.org and Bookfi.org – but the company hasn’t responded to these requests at all."
14:34 🔗 joepie91 apparently internet.bs' WHOIS privacy thing is giving them the runaround, heh
14:43 🔗 GE has joined #archiveteam-bs
14:47 🔗 Sanqui it begins
14:47 🔗 Sanqui grrrrr
15:30 🔗 Sanqui I just noticed archive.org refuses to show nifty homepages entirely
15:31 🔗 Sanqui "this URL has been excluded"
15:31 🔗 Sanqui does that mean robots.txt? because that doesn't ban everything: http://homepage2.nifty.com/robots.txt
15:34 🔗 dashcloud has quit IRC (Remote host closed the connection)
15:40 🔗 dashcloud has joined #archiveteam-bs
15:51 🔗 Aranje has joined #archiveteam-bs
16:22 🔗 brayden has joined #archiveteam-bs
16:22 🔗 swebb sets mode: +o brayden
16:29 🔗 RichardG_ has quit IRC (Read error: Connection reset by peer)
16:29 🔗 RichardG has joined #archiveteam-bs
16:47 🔗 VADemon It's because their robots.txt parser is stupid and they don't seem to care
17:34 🔗 ndiddy has joined #archiveteam-bs
18:29 🔗 arkiver looks like it's not blocked because of robots.txt
18:30 🔗 arkiver else the wayback machine would have said something about robots.txt
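For the robots.txt question above, one way to check what a given robots.txt actually permits is Python's standard `urllib.robotparser`. This is a sketch under assumptions: the rules below are a made-up sample, not the contents of the nifty.com file, and `ia_archiver` is used only as an example user-agent string. It also says nothing about how the Wayback Machine applies exclusions, which (as arkiver notes) may have nothing to do with robots.txt here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body that blocks only one directory.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the sample rules, `is_allowed(SAMPLE_ROBOTS_TXT, "ia_archiver", "http://example.com/index.html")` is permitted while anything under `/private/` is not, which is the "doesn't ban everything" case Sanqui describes.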
18:33 🔗 JesseW has joined #archiveteam-bs
18:45 🔗 tomwsmf_ has joined #archiveteam-bs
19:18 🔗 schbirid has quit IRC (Quit: Leaving)
19:19 🔗 tomwsmf_ has quit IRC (Read error: Operation timed out)
19:33 🔗 ndiddy i have a question about that
19:34 🔗 ndiddy has anyone else had a site that they wanted to look at in the internet archive, but the site went down and was replaced with a parked page that has a robots.txt file? the original didn't have one, but because the parked page does, you can't see any backups of the page
19:34 🔗 ndiddy it's pretty annoying
19:34 🔗 xmc happens all the time
19:34 🔗 xmc known issue
19:36 🔗 ndiddy also, parked pages are pretty annoying in general tbh
19:36 🔗 xmc your point?
19:36 🔗 ndiddy nobody's going to pay $500 for a domain
19:40 🔗 dashcloud has quit IRC (Read error: Operation timed out)
19:42 🔗 GE The only joy I ever got out of parked websites was why they all had that same picture of that girl
19:42 🔗 ndiddy now it's all pictures of granite and shit
19:42 🔗 ndiddy fun fact: i was looking up what sunsoft was up to a few days ago and they sold their us domain http://sunsoftgames.com/
19:43 🔗 ndiddy i sent sunsoft japan an email about it and they didn't reply yet
19:43 🔗 GE My mind just had a thought about what the internet has done to that image and I regret going there
19:44 🔗 godane has quit IRC (Quit: Leaving.)
19:44 🔗 godane has joined #archiveteam-bs
19:45 🔗 dashcloud has joined #archiveteam-bs
19:45 🔗 ndizzle has joined #archiveteam-bs
19:52 🔗 ndiddy has quit IRC (Read error: Operation timed out)
19:53 🔗 ndizzle is now known as ndiddy
20:04 🔗 ndiddy has quit IRC (Read error: Connection reset by peer)
20:06 🔗 ndiddy has joined #archiveteam-bs
20:08 🔗 ndizzle has joined #archiveteam-bs
20:16 🔗 ndiddy has quit IRC (Read error: Operation timed out)
20:17 🔗 ndizzle has quit IRC (Read error: Operation timed out)
20:17 🔗 ndiddy has joined #archiveteam-bs
20:31 🔗 JesseW has quit IRC (Quit: Leaving.)
20:32 🔗 JesseW has joined #archiveteam-bs
20:39 🔗 BartoCH has quit IRC (Ping timeout: 260 seconds)
20:39 🔗 BartoCH has joined #archiveteam-bs
20:41 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
20:56 🔗 arkiver PurpleSym: having a look at the WARC now
21:02 🔗 arkiver sorry for the delay :/
21:02 🔗 arkiver flickr put up a new version, going to check if everything still works
21:02 🔗 arkiver if it does, I'm going to start the test grab
21:27 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
21:30 🔗 metalcamp has quit IRC (Read error: Operation timed out)
21:39 🔗 ravetcofx has joined #archiveteam-bs
21:51 🔗 dashcloud has quit IRC (Remote host closed the connection)
21:54 🔗 dashcloud has joined #archiveteam-bs
22:01 🔗 VADemon has quit IRC (Quit: left4dead)
22:49 🔗 atrocity you can actually fight for a domain if somebody is using it for ads
22:49 🔗 atrocity ppl apparently get sued over that
22:49 🔗 atrocity i used to own a reddit "typo" domain and made some money off of it years ago and didn't renew it for that reason, lol
23:01 🔗 JesseW has joined #archiveteam-bs
23:07 🔗 GE has quit IRC (Quit: zzz)
23:15 🔗 tomwsmf_ has joined #archiveteam-bs
23:37 🔗 kristian_ has joined #archiveteam-bs
23:44 🔗 ivan "nobody's going to pay $500 for a domain" lol. just lol. sorry.
23:45 🔗 Frogging I have a feeling like that industry survives due to people with more money than sense tbh
23:46 🔗 Frogging just need a few of those very high margin sales of not-intrinsically-valuable items profitable
23:46 🔗 Frogging :p
23:46 🔗 Frogging I'm just speculating. the only thing I know for sure is that it's cancer
23:47 🔗 Frogging to be profitable*
23:47 🔗 Frogging where's my brain
23:48 🔗 ivan "intrinsically valuable" is not really a thing that means anything
23:48 🔗 ivan (is land at location X intrinsically valuable? gold (at market prices)?)
23:49 🔗 Frogging fair enough, since value is indeed determined by how much people will pay for it. I guess my point was that very few people are willing to pay that, but enough are to make it worthwhile
23:53 🔗 arkiver PurpleSym: the resource records look good
23:53 🔗 arkiver I'm only not sure about the WARC-Target-URI.
23:54 🔗 arkiver Maybe this should be the location on your machine, or the exact URL you got this synced from to your machine
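To make the WARC-Target-URI question concrete, here is a minimal sketch of a WARC/1.0 `resource` record built with the standard library only. It is not PurpleSym's code or any particular tool's output: the `resource_record` helper and the example path are assumptions for illustration. The target URI is passed in by the caller, so it can be either of the options arkiver mentions, a `file://` location on the local machine or the URL the file was synced from.

```python
import uuid
from datetime import datetime, timezone
from pathlib import PurePosixPath


def resource_record(target_uri: str, payload: bytes,
                    content_type: str = "application/octet-stream") -> bytes:
    """Serialize one WARC/1.0 resource record for payload."""
    warc_date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Date: {warc_date}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Header lines end in CRLF, a blank line separates headers from the block,
    # and the record is terminated by two CRLFs.
    return "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n" + payload + b"\r\n\r\n"


# Hypothetical local path, turned into a file:// URI for the first option.
local_uri = PurePosixPath("/home/user/sync/notes.txt").as_uri()
record = resource_record(local_uri, b"example payload")
```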
