#archiveteam-bs 2014-11-10,Mon

Time Nickname Message
00:33 🔗 schbirid2 has joined #archiveteam-bs
00:35 🔗 schbirid has quit IRC (Read error: Operation timed out)
00:44 🔗 wp494 has joined #archiveteam-bs
00:44 🔗 BiggieJo1 has joined #archiveteam-bs
00:47 🔗 BiggieJon has quit IRC (Read error: Operation timed out)
00:53 🔗 Mayonaise has quit IRC (Ping timeout: 365 seconds)
00:53 🔗 zenguy_pc has quit IRC (Read error: Operation timed out)
01:06 🔗 primus104 has quit IRC (Leaving.)
01:09 🔗 zenguy_pc has joined #archiveteam-bs
01:38 🔗 Mayonaise has joined #archiveteam-bs
01:47 🔗 tfgbd Ugh, no wonder I had trouble uploading some massive site dumps..
01:47 🔗 tfgbd Just noticed these emails...
01:47 🔗 tfgbd Thank you for your interest in adding files to the Internet Archive. Unfortunately, one or more of the files you uploaded into item VetuswareSoftware_olddosru appear to be malware, and the item has been removed from archive.org. You can get more details about the malware file(s) here:
01:47 🔗 tfgbd Communication_update.zip https://www.virustotal.com/file/3babe259474e50616dfb47fcb8dc983dae673e5d6f856d18c8cbd903c67256f7/analysis/1415505484/
01:55 🔗 tfgbd but that doesn't explain why the huge files fail
01:55 🔗 tfgbd I think I'll just give up on that ID and just reupload everything...
02:06 🔗 Mayonaise has quit IRC (Ping timeout: 365 seconds)
02:11 🔗 Mayonaise has joined #archiveteam-bs
02:35 🔗 logchfoo starts logging #archiveteam-bs at Mon Nov 10 02:35:30 2014
02:35 🔗 logchfoo has joined #archiveteam-bs
02:40 🔗 ex-parrot has joined #archiveteam-bs
02:52 🔗 Ravenloft has quit IRC (Ping timeout: 606 seconds)
02:57 🔗 bauruine has quit IRC (Ping timeout: 265 seconds)
03:02 🔗 bauruine has joined #archiveteam-bs
03:03 🔗 schbirid2 has quit IRC (Read error: Operation timed out)
03:06 🔗 dashcloud so, if you used BrowserStack, bad news: they were hacked, and if you believe the pastebin, it was very bad
03:11 🔗 schbirid2 has joined #archiveteam-bs
03:47 🔗 mistym has joined #archiveteam-bs
03:51 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
03:53 🔗 Lord_Nigh has joined #archiveteam-bs
04:23 🔗 bsmith093 has quit IRC (Read error: Operation timed out)
04:38 🔗 mistym has quit IRC (Remote host closed the connection)
04:39 🔗 bsmith093 has joined #archiveteam-bs
04:39 🔗 midas sets mode: +o bsmith093
04:53 🔗 ex-parrot has quit IRC (Leaving.)
04:56 🔗 aaaaaaaaa has quit IRC (Leaving)
05:23 🔗 mistym has joined #archiveteam-bs
05:33 🔗 JonimusP is now known as Jonimus
07:16 🔗 joepie91 dashcloud: link to pastebin?
07:17 🔗 Kazzy https://www.reddit.com/r/sysadmin/comments/2ltemy/crazy_browserstack_email_i_just_got/
07:17 🔗 Kazzy this is all I've seen from it
07:17 🔗 Kazzy I'm assuming that'd be the contents of whatever someone had put on pastebin
07:18 🔗 garyrh the pastebin: http://pastebin.com/RQXd2Au3
07:18 🔗 garyrh (from https://news.ycombinator.com/item?id=8581477)
07:18 🔗 joepie91 right
07:23 🔗 primus104 has joined #archiveteam-bs
07:39 🔗 rduser has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 SadDM has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 twrist has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 altlabel has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 pikhq has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 ionpulse has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 eprillios has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Insomnia_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Aranje has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 dcmorton has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 SmileyG has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Cameron_D has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 slash` has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 pft has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 antomatic has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Sue_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 mistym has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Lord_Nigh has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 GLaDOS has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 arkiver has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 RainbowCo has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 SN4T14__ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 brayden has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Zebranky_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Atluxity has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 tfgbd has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 wm_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 bauruine has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Sellyme_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 dashcloud has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 DFJustin has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Sk1d has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 danneh_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Kirk has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 primus104 has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 schbirid2 has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Coderjoe has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 norbert79 has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 ersi has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 garyrh has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Void_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Boppen has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Kenshin has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 wp494 has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 wiktor_b has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 bsmith093 has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 SketchCow has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 kanzure has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 lytv has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 RedType has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 chfoo has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 dx has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 xmc has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 zenguy_pc has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 BlueMaxim has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 primus_ has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Jonimus has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 swebb has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 w0rp has quit IRC (ircd.choopa.net hub.efnet.us)
07:39 🔗 Laverne has quit IRC (ircd.choopa.net hub.efnet.us)
17:03 🔗 logchfoo starts logging #archiveteam-bs at Mon Nov 10 17:03:04 2014
17:03 🔗 logchfoo has joined #archiveteam-bs
17:14 🔗 mistym has quit IRC (Remote host closed the connection)
17:31 🔗 logchfoo starts logging #archiveteam-bs at Mon Nov 10 17:31:59 2014
17:31 🔗 logchfoo has joined #archiveteam-bs
17:32 🔗 mistym has joined #archiveteam-bs
17:51 🔗 primus104 has quit IRC (Leaving.)
17:59 🔗 bobby_ has joined #archiveteam-bs
18:23 🔗 bobby__ has joined #archiveteam-bs
18:26 🔗 bobby_ has quit IRC (Ping timeout: 240 seconds)
18:39 🔗 espes__ so after churning through 100gb of the hyves indexes I realised I got the username wrong
18:39 🔗 espes__ I probably should have been tipped off by the "mother" being a teenage boy
18:43 🔗 kyan has joined #archiveteam-bs
18:43 🔗 kyan So, I'm trying to find this file http://downloads.bbc.co.uk/podcasts/radio4/ipm/ipm_20080412-1843.mp3
18:44 🔗 joepie91 espes__: lol?
18:44 🔗 kyan BBC's archiving policy is, as far as I can tell, to BURN IT ALL. That makes me sad. I was wgetting the podcast for a while last year, but it was kind of bad because it only worked when my laptop was online. Is there any sort of scheduled archiving thing?
18:44 🔗 joepie91 espes__: I'm still writing parsing code, but, I need sleep :(
18:44 🔗 * joepie91 hit a zlib speed bump
18:45 🔗 kyan Like, periodic archiving in the cloud
18:45 🔗 joepie91 ~cloud~
18:45 🔗 joepie91 kyan: I'm running a periodic wget for the NHK broadcasts, I could set one up for this if necessary
18:45 🔗 joepie91 no idea if something like that already exists though
18:46 🔗 kyan joepie91: Ah. I was thinking that archivebot or something might have that. guess not though. Yea, as far as I can tell BBC4 deletes basically everything after a couple weeks
18:46 🔗 joepie91 :
18:46 🔗 joepie91 :/ *
18:47 🔗 joepie91 kyan: is there, like, a directory index for them?
18:47 🔗 joepie91 or does it require page parsing?
18:47 🔗 * joepie91 notes that he has now almost a year of NHK podcasts
18:47 🔗 kyan the closest thing I found was the actual podcast XML, IIRC
18:47 🔗 antomatic BBC archiving policy (today) is 'KEEP IT ALL', but that does not mean that they'll let anyone else have it.
18:47 🔗 joepie91 kyan: if you could drop me all the URLs you have for it in PM
18:47 🔗 kyan I think what i ended up doing was just wgetting the main podcast page for a few hops
18:47 🔗 joepie91 I can have a look at it tomorrow
18:48 🔗 joepie91 still need to set up an automated upload job for NHK as well
18:48 🔗 kyan Hmm, if they keep it all, maybe it's not a priority issue for us then?
18:48 🔗 antomatic They're reasonably good at making stuff /available/ for limited periods - e.g. 7-30 days after broadcast, but after that it doesn't count towards ratings so they hide it away again.
18:48 🔗 joepie91 so may as well look into adding this to the schedule
18:48 🔗 joepie91 kyan: dark archive is no archive
18:48 🔗 joepie91 :)
18:48 🔗 kyan true.
18:48 🔗 * antomatic nods
18:49 🔗 joepie91 but yeah, drop me all the relevant URLs in PM and I will look at it probably tomorrow
18:49 🔗 antomatic BBC policy used to be 'tapes are expensive!', but they do seem more enlightened today
18:49 🔗 kyan I think there are a lot more podcasts than that, though
18:49 🔗 kyan like, one for each show
18:50 🔗 kyan that's the one I've been doing, because of that interview about the scientology documents (which is apparently the only thing that gave the name of the involved lawyer)
18:51 🔗 espes__ there should be a thing to automatically ingest stuff from an rss feed
18:52 🔗 espes__ and another scraperwiki-like thing to easily generate rss feeds
18:52 🔗 espes__ or just like, cron-as-a-service :P
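A rough sketch of the "ingest from an RSS feed" idea above, assuming a standard RSS 2.0 podcast feed whose episodes carry `<enclosure url="...">` elements. The sample feed, the example.org URLs, and the function names `enclosure_urls` and `new_urls` are all made up for illustration; nothing here is an existing Archive Team tool. A scheduled job (cron or similar) could fetch the feed, call `new_urls` with the set of URLs it has already grabbed, and hand the result to a downloader.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real job would fetch the podcast XML instead.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.org/ep1.mp3" type="audio/mpeg" length="123"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.org/ep2.mp3" type="audio/mpeg" length="456"/>
    </item>
  </channel>
</rss>"""


def enclosure_urls(feed_xml):
    """Return the media URLs from every <enclosure> in the feed, in document order."""
    root = ET.fromstring(feed_xml)
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


def new_urls(feed_xml, seen):
    """Return only the enclosure URLs that are not already in the `seen` set."""
    return [url for url in enclosure_urls(feed_xml) if url not in seen]


if __name__ == "__main__":
    already_have = {"http://example.org/ep1.mp3"}
    for url in new_urls(SAMPLE_FEED, already_have):
        print(url)
```

Keeping the "seen" set outside the parser means the same function works whether the state lives in a text file, a database, or just the list of files already on disk.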
18:53 🔗 schbirid2 oh debian, php5 depends on apache2
18:53 🔗 kyan For what it's worth, BBC also uses IP blocks to limit some content to UK only http://www.bbc.co.uk/podcasts/help/uk_only
18:54 🔗 joepie91 schbirid2: wrong php package
18:54 🔗 joepie91 schbirid2: php5-cgi/php5-fpm for lighttpd/nginx
18:56 🔗 DFJustin http://file.wikileaks.org/robots.txt sad face
19:04 🔗 antomatic Most of the BBC's content is behind iPlayer, which is completely IP-locked, unlike [most] podcasts
19:05 🔗 primus104 has joined #archiveteam-bs
19:05 🔗 schbirid2 joepie91: too late :P
19:05 🔗 schbirid2 also i want to run the builtin php server because i like danger
19:07 🔗 joepie91 antomatic: haha, IP-locked
19:07 🔗 * joepie91 SSHs into UK VPS
19:07 🔗 joepie91 :D
19:10 🔗 antomatic Ssh, don't tell them. :)
19:11 🔗 antomatic True confession: I used to look after the geoblocking for a video site at work. Pretty easy, in the main it was just whitelisting the big UK ISPs, and denying everything else with a "Has there been an error? Let us know" reply form.
19:12 🔗 antomatic Occasionally an ISP would open up a new IP range, we'd hear about it from the form.
19:12 🔗 antomatic More often I'd get emails from people saying "I am just trying to use your site, in my home, and it does not work, please can you fix it"
19:13 🔗 antomatic which upon due investigation would reveal that their home was apparently hosted in the middle of a large datacenter. :)
19:16 🔗 antomatic Or that their IP range was allocated to "Soopa VPN Ltd"
19:16 🔗 joepie91 lol
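The allowlist approach antomatic describes could look roughly like the sketch below: permit requests from a list of known IP ranges and answer everything else with the error form. The ranges shown are reserved documentation prefixes standing in for real ISP allocations, and `ALLOWED_RANGES`, `is_allowed`, and `respond` are hypothetical names; the actual site's implementation is not known.

```python
import ipaddress

# Placeholder ranges (reserved documentation prefixes), standing in for
# the real UK ISP allocations, which are not given in the log.
ALLOWED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(addr):
    """Return True if the client address falls inside any allowlisted range."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in ALLOWED_RANGES)


def respond(addr):
    """Serve allowlisted clients; send everyone else the error-report prompt."""
    if is_allowed(addr):
        return "200 OK"
    return "403 Has there been an error? Let us know"


if __name__ == "__main__":
    for client in ("192.0.2.17", "203.0.113.5"):
        print(client, respond(client))
```

Adding a newly opened ISP range then amounts to appending one more network to the list, which matches the "we'd hear about it from the form" workflow.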
19:16 🔗 kyan joepie91: btw I pmed you with an idea for a wget command
19:42 🔗 Aranje has quit IRC (Quit: Three sheets to the wind)
20:06 🔗 ex-parrot has joined #archiveteam-bs
20:08 🔗 kyan it looks like something might have gone wrong with bhscfbemh541lxe06mrgurvz9 in archivebot, Facebook urls are timing out
20:08 🔗 ex-parrot has quit IRC (Client Quit)
20:20 🔗 kyan Also: is there a way to search for specific finished Archivebot WARCs?
20:23 🔗 Panasonic has quit IRC (Ping timeout: 480 seconds)
20:24 🔗 bobby__ has quit IRC (Ping timeout: 240 seconds)
20:30 🔗 bobby_ has joined #archiveteam-bs
21:06 🔗 DFJustin google site:archive.org archivebot whatever
21:09 🔗 bobby_ has quit IRC (Quit: Page closed)
21:09 🔗 Bobby_ has joined #archiveteam-bs
21:28 🔗 BlueMaxim has joined #archiveteam-bs
21:31 🔗 mistym has quit IRC (Remote host closed the connection)
21:33 🔗 kyan_ has joined #archiveteam-bs
21:36 🔗 kyan_ has quit IRC (Client Quit)
21:38 🔗 kyan has quit IRC (Ping timeout: 480 seconds)
21:46 🔗 Bobby_ has quit IRC ()
21:52 🔗 mistym has joined #archiveteam-bs
22:22 🔗 mistym has quit IRC (Remote host closed the connection)
22:37 🔗 mistym has joined #archiveteam-bs
22:54 🔗 RedType has quit IRC (Quit: leaving)
22:54 🔗 RedType has joined #archiveteam-bs
22:54 🔗 RedType has quit IRC (Client Quit)
22:55 🔗 midas https://archive.org/details/archivebot
22:56 🔗 midas (it has its own collection you know:))
23:07 🔗 RedType has joined #archiveteam-bs
