#archiveteam-bs 2016-06-12,Sun

Time Nickname Message
00:02 🔗 dashcloud has quit IRC (Read error: Operation timed out)
00:06 🔗 dashcloud has joined #archiveteam-bs
00:20 🔗 ralphdnak has quit IRC (Ping timeout: 633 seconds)
00:21 🔗 ralphdnak has joined #archiveteam-bs
00:31 🔗 kristian_ has joined #archiveteam-bs
00:39 🔗 ralphdnak has quit IRC (Ping timeout: 244 seconds)
01:10 🔗 MrRadar Interesting article on video game preservation: https://web.stanford.edu/group/htgg/cgi-bin/drupal/?q=node/1211
01:19 🔗 bauruine has joined #archiveteam-bs
01:33 🔗 kristian_ has quit IRC (Leaving)
01:36 🔗 Eloquence has joined #archiveteam-bs
01:40 🔗 r3c0d3x_ has joined #archiveteam-bs
01:41 🔗 r3c0d3x_ Hey, can I speak to a staff member about something? (My internet connection has been acting up the past few days and has probably spammed this chat with joins/leaves.)
01:44 🔗 JesseW r3c0d3x_: we noticed. :-)
01:44 🔗 JesseW I'm not sure who's around right now, but someone probably will speak up eventually.
01:58 🔗 r3c0d3x_ K, thanks. Really sorry about all that, my ISP has been having problems the past few days (and they're still ongoing), but I'm currently on a separate, stable server now, so that should no longer be an issue.
02:07 🔗 JesseW Eh, it happens.
02:07 🔗 JesseW I can hardly complain, as I don't even run a bouncer, so I pop in and out a lot.
02:22 🔗 Stilett0 has joined #archiveteam-bs
02:22 🔗 Stiletto has quit IRC (Read error: Operation timed out)
02:39 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
02:39 🔗 Stiletto has joined #archiveteam-bs
02:54 🔗 bwn i couldn't think of a better place for that, so if anyone has any suggestions..
02:56 🔗 JesseW seems good
02:57 🔗 bwn that's a really interesting tool, a big part of what's needed if you ask me
02:58 🔗 JesseW The list of currently supported sites didn't happen to be any of personal interest to me -- but the idea certainly seems good and useful.
02:58 🔗 r3c0d3x_ reading through their site now, this does look really interesting! nice find bwn.
02:59 🔗 r3c0d3x_ might contribute at some point
02:59 🔗 bwn Nemo bis found it and added it to the Quora wiki
02:59 🔗 * JesseW is happily reading through http://wiki.erights.org/wiki/Walnut/Distributed_Computing right now
03:00 🔗 bwn jessew: i happen to have a quora account but zero answers, none of the others either, heh
03:01 🔗 bwn the extensible aspect though
03:14 🔗 MrRadar !ig 2lnjehj9rvargx2kpdcxcxzx5 ^https?://www\.drudgereportarchives\.com/data/.*_video-gunshots-shouts-allahu-akbar-french-magazine-shooting_823281\.html
03:23 🔗 JesseW hm, I need to look at the extensible aspect more, I guess
03:29 🔗 JesseW https://freeyourstuff.cc/plugins <- bwn, I presume you meant this page?
03:30 🔗 dashcloud has quit IRC (Read error: Operation timed out)
03:32 🔗 JesseW probably a good idea to take Erik up on this: https://freeyourstuff.cc/mirrors
03:34 🔗 dashcloud has joined #archiveteam-bs
03:36 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
03:38 🔗 BlueMaxim has joined #archiveteam-bs
03:44 🔗 JesseW it might be interesting to write a generalized mediawiki plugin for freeyourstuff.cc
03:58 🔗 bwn sorry, yes, the plugins is what i was referring to
04:11 🔗 DopefishJ is now known as DFJustin
04:16 🔗 VADemon has quit IRC (Read error: Connection reset by peer)
04:27 🔗 Stiletto has quit IRC (Read error: Operation timed out)
04:27 🔗 Stiletto has joined #archiveteam-bs
04:51 🔗 Stiletto has quit IRC (Read error: Operation timed out)
04:51 🔗 Stiletto has joined #archiveteam-bs
04:56 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
04:56 🔗 BlueMaxim has joined #archiveteam-bs
05:23 🔗 Stiletto has quit IRC (Read error: Operation timed out)
05:24 🔗 Stiletto has joined #archiveteam-bs
05:48 🔗 Stiletto has quit IRC (Read error: Operation timed out)
05:48 🔗 Stiletto has joined #archiveteam-bs
06:10 🔗 Stiletto has quit IRC (Read error: Operation timed out)
06:10 🔗 Stiletto has joined #archiveteam-bs
06:15 🔗 Honno has joined #archiveteam-bs
06:55 🔗 Stiletto has quit IRC (Read error: Operation timed out)
06:55 🔗 Stiletto has joined #archiveteam-bs
06:56 🔗 JesseW has quit IRC (Read error: Operation timed out)
07:00 🔗 Eloquence has quit IRC (Ping timeout: 244 seconds)
07:03 🔗 ralphdnak has joined #archiveteam-bs
07:09 🔗 PurpleSym sets mode: -b r3c0d3x!*@*
07:11 🔗 PurpleSym r3c0d3x_: ^
07:22 🔗 bwn has quit IRC (Ping timeout: 244 seconds)
07:24 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
07:25 🔗 Aranje has quit IRC (Remote host closed the connection)
07:30 🔗 bwn has joined #archiveteam-bs
07:39 🔗 BlueMaxim has quit IRC (Quit: Leaving)
07:45 🔗 bzc6p has joined #archiveteam-bs
07:45 🔗 swebb sets mode: +o bzc6p
08:01 🔗 xXx_ndidd has joined #archiveteam-bs
08:03 🔗 ralphdnak has quit IRC (Ping timeout: 244 seconds)
08:06 🔗 ndiddy has quit IRC (Read error: Operation timed out)
08:15 🔗 bzc6p has left
08:22 🔗 Kazzy PurpleSym: cheers, just woke up
08:37 🔗 Eloquence has joined #archiveteam-bs
08:56 🔗 Stiletto has quit IRC (Read error: Operation timed out)
08:56 🔗 Stiletto has joined #archiveteam-bs
09:02 🔗 Eloquence has quit IRC (Read error: Operation timed out)
09:19 🔗 closure has quit IRC (Ping timeout: 250 seconds)
09:19 🔗 closure has joined #archiveteam-bs
09:19 🔗 midas sets mode: +o closure
09:41 🔗 Stiletto has quit IRC (Read error: Operation timed out)
09:41 🔗 Stiletto has joined #archiveteam-bs
09:43 🔗 ralphdnak has joined #archiveteam-bs
10:38 🔗 ralphdnak has quit IRC (Ping timeout: 244 seconds)
10:51 🔗 Stiletto has quit IRC (Read error: Operation timed out)
10:51 🔗 Stiletto has joined #archiveteam-bs
11:15 🔗 Stiletto has quit IRC (Read error: Operation timed out)
11:15 🔗 Stiletto has joined #archiveteam-bs
11:42 🔗 Stiletto has quit IRC (Read error: Operation timed out)
11:42 🔗 Stiletto has joined #archiveteam-bs
12:09 🔗 VADemon has joined #archiveteam-bs
12:24 🔗 Honno has quit IRC (Read error: Operation timed out)
12:42 🔗 VADemon PSA: Facebook forces users to download their new app "Moments" in order to NOT LOSE (auto-)synced photos
12:42 🔗 VADemon https://twitter.com/aurevoiralexis/status/740728442254135296/photo/1
13:10 🔗 Stiletto has quit IRC (Read error: Operation timed out)
13:10 🔗 Stiletto has joined #archiveteam-bs
13:45 🔗 Stiletto has quit IRC (Read error: Operation timed out)
13:45 🔗 Stiletto has joined #archiveteam-bs
15:02 🔗 r3c0d3x_ PurpleSym: Thanks! Everything should be fixed now.
15:02 🔗 r3c0d3x_ is now known as r3c0d3x
15:07 🔗 Honno has joined #archiveteam-bs
15:23 🔗 Honno has quit IRC (Read error: Operation timed out)
15:42 🔗 Stiletto has quit IRC (Read error: Operation timed out)
15:42 🔗 Stiletto has joined #archiveteam-bs
15:59 🔗 dashcloud if there's any projects/ideas/whatever that would benefit from unfiltered, super-high speed connections for a few days, make a list- HOPE is this summer, and you'll have access to a great network for the weekend (July 22-24)
16:02 🔗 GLaDOS has quit IRC (Quit: Oh crap, I died.)
16:03 🔗 GLaDOS has joined #archiveteam-bs
16:11 🔗 Rotab has quit IRC (Read error: Connection reset by peer)
16:23 🔗 joepie91 "To make a long story short, we managed to find the company that had purchased our valve manufacturer and it turns out they had exited the manufacturing business and they were now a magazine. However, they still had a warehouse full of the fucking valves, and they'd sell us one if we wanted it. And that was the day we ordered an expensive three way valve from a company that had no idea how it worked, or what it did."
16:23 🔗 joepie91 ( https://www.reddit.com/r/talesfromtechsupport/comments/4njv3r/our_operators_are_too_stupid_part_1/d44plu1 )
16:31 🔗 JesseW has joined #archiveteam-bs
16:32 🔗 yipdw joepie91: the company in that story seriously sounds like Roche Pharmaceuticals
16:33 🔗 yipdw they are Very Big and they have a strong presence in Indiana, which is basically Nowhere, USA
16:35 🔗 schbirid has joined #archiveteam-bs
16:40 🔗 godane another g4tv.com video saved: https://archive.org/details/g4tv.com-video36368-flvhd
17:14 🔗 fie has joined #archiveteam-bs
18:03 🔗 Sanqui https://publicpolicy.googleblog.com/2016/06/the-trans-pacific-partnership-step.html what.
18:07 🔗 _desu___ has joined #archiveteam-bs
18:07 🔗 HCross2 has joined #archiveteam-bs
18:08 🔗 JesseW Hi HCross2!
18:09 🔗 HCross2 Hello
18:09 🔗 JesseW Did you see my list of academictorrents I posted yesterday? Will that work for you, or would you like me to parse it further?
18:10 🔗 HCross2 Seems IRCCloud is on various sorts of fire. I had a look and ideally I want a set of torrent files
18:10 🔗 HCross2 Deluge can't take a set of magnets
18:11 🔗 arkiver hey
18:11 🔗 arkiver Can I help with a script to back them up to IA?
18:11 🔗 JesseW arkiver: certainly!
18:11 🔗 arkiver Just backing up the torrents to IA, let IA download them
18:11 🔗 JesseW Yep, that's the basic plan.
18:11 🔗 HCross yeah, if we can just get the torrents, we can feed them into the IA and they will get them
18:11 🔗 arkiver ok
18:11 🔗 JesseW From the infohashes, you should be able to download the torrents like this, I think:
18:12 🔗 arkiver yes
18:12 🔗 JesseW http://academictorrents.com/download/403e6d6945a64dd1b9e185a6cd8d029274efccdc.torrent
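A minimal sketch of the infohash-to-URL step described above, assuming the list holds one infohash per line and that each infohash is a 40-character hex string (the usual BitTorrent v1 form). The URL pattern is the one from the example link; the function name `torrent_urls` is hypothetical, and lines that are not complete infohashes are set aside rather than guessed at.

```python
import re

# URL pattern taken from the example link in the log.
TORRENT_URL = "http://academictorrents.com/download/{}.torrent"

# Assumption: an infohash is exactly 40 hex characters (BitTorrent v1 SHA-1).
INFOHASH_RE = re.compile(r"[0-9a-f]{40}")


def torrent_urls(lines):
    """Return (urls, skipped) for an iterable of infohash lines.

    Assumption: one infohash per line. Blank lines are ignored;
    anything else that is not a full infohash goes into `skipped`.
    """
    urls = []
    skipped = []
    for line in lines:
        candidate = line.strip().lower()
        if INFOHASH_RE.fullmatch(candidate):
            urls.append(TORRENT_URL.format(candidate))
        elif candidate:
            skipped.append(candidate)
    return urls, skipped
```

Returning the skipped lines separately makes a truncated entry visible instead of silently producing a broken download URL.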
18:12 🔗 arkiver do we already have a list of hashes/torrents?
18:13 🔗 JesseW I made a list of 296 infohashes
18:13 🔗 JesseW http://termbin.com/j6f9
18:13 🔗 arkiver ok
18:14 🔗 arkiver It looks like that list is incomplete
18:15 🔗 JesseW That's just a list of datasets -- the other items are papers, I think.
18:16 🔗 arkiver I mean, see the last line of that list
18:17 🔗 JesseW hm, yeah
18:17 🔗 JesseW I'm not sure what happened there :-(
18:17 🔗 JesseW I'll see about fixing that.
18:17 🔗 arkiver But I can add some scraping of the site to the script.
18:18 🔗 Aranje has joined #archiveteam-bs
18:18 🔗 JesseW sure. My scraping was as simple as downloading http://academictorrents.com/browse.php?cat=6&sort_field=seeders&sort_dir=DESC&page=0 and running a regex on the result
18:18 🔗 JesseW this was the regex: re.findall(r"""href="/details/([0-9a-z]+)"><b>([^<]+)</b>.+?filelist=1">([0-9]+)<.+?<nobr>([-0-9]+)<.+?>([0-9.]+[A-Z]+)<.+?center>([0-9,]+)<.+?dllist=1">([0-9+]+)<.+?</tr>""",txt, re.DOTALL)
18:18 🔗 JesseW There are 15 pages of datasets, and 55 pages of papers.
18:18 🔗 JesseW currently
18:19 🔗 JesseW with 20 items on each page
18:19 🔗 JesseW That will get you the infohashes, titles, sizes, file counts, "mirror" (i.e. seed) counts
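The scraping approach above can be sketched roughly as follows. The regex and the browse URL are the ones quoted in the log; everything else is an assumption made for illustration: the helper names `page_urls` and `parse_rows`, zero-based page numbering, and the `SAMPLE` row, which is invented markup built only to satisfy the regex and is not the site's real HTML. The log names the first captured values (infohash, title) and mentions sizes and file counts, but it does not spell out what the remaining groups hold, so those are kept under neutral names.

```python
import re

# Browse URL from the log, with category and page made into parameters.
BROWSE = ("http://academictorrents.com/browse.php"
          "?cat={cat}&sort_field=seeders&sort_dir=DESC&page={page}")

# The regex quoted in the log, unchanged (split across lines for width only).
ROW_RE = re.compile(
    r"""href="/details/([0-9a-z]+)"><b>([^<]+)</b>.+?filelist=1">([0-9]+)<"""
    r""".+?<nobr>([-0-9]+)<.+?>([0-9.]+[A-Z]+)<.+?center>([0-9,]+)<"""
    r""".+?dllist=1">([0-9+]+)<.+?</tr>""",
    re.DOTALL,
)

# Hypothetical sample row, NOT the real site's markup.
SAMPLE = '''<tr><td><a href="/details/403e6d6945a64dd1b9e185a6cd8d029274efccdc"><b>Example Dataset</b></a></td>
<td><a href="/details/403e6d6945a64dd1b9e185a6cd8d029274efccdc&filelist=1">3</a></td>
<td><nobr>2016-06-11</nobr></td>
<td>1.5GB</td>
<td align=center>1,234</td>
<td><a href="/details/403e6d6945a64dd1b9e185a6cd8d029274efccdc&dllist=1">7+</a></td></tr>'''


def page_urls(cat, pages):
    """Build browse URLs for `pages` pages, assuming numbering starts at 0."""
    return [BROWSE.format(cat=cat, page=p) for p in range(pages)]


def parse_rows(html):
    """Run the regex over one page of HTML and return one dict per row."""
    rows = []
    for m in ROW_RE.findall(html):
        rows.append({
            "infohash": m[0],
            "title": m[1],
            "file_count": int(m[2]),   # value after filelist=1
            "group4": m[3],            # <nobr> field; meaning not stated in the log
            "size": m[4],              # digits plus unit letters
            "group6": m[5],            # centered numeric field; meaning not stated
            "group7": m[6],            # value after dllist=1; meaning not stated
        })
    return rows
```

With 15 dataset pages of 20 items each, `page_urls(6, 15)` would cover at most 300 entries, which is in line with the 296 infohashes mentioned earlier.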
18:20 🔗 PurpleSym http://academictorrents.com/about.php#mirroring could be relevant.
18:22 🔗 arkiver ok
18:22 🔗 arkiver I'll try to have something in a bit
18:22 🔗 JesseW PurpleSym: not particularly -- we *want* to do "blind mirroring of all data", so their per-collection lists don't help that much. :-)
18:23 🔗 JesseW arkiver: I don't think it's particularly urgent, but it's a good thing to do.
18:23 🔗 PurpleSym Yeah, right below that section are details on their API.
18:23 🔗 PurpleSym That might be easier than screen scraping.
18:27 🔗 JesseW PurpleSym: I looked into it, but the API, unlike the real interface, didn't seem to support paging, oddly.
18:27 🔗 PurpleSym The examples suggest you can use &limit=9999
18:27 🔗 JesseW And changing the limit seemed to require an API key -- which, no thanks, I'll just use what you are *already making available*
18:28 🔗 PurpleSym I see.
18:37 🔗 PurpleSym Anyway, curl -s -b 'uid=4510;pass=f2e3f605ea9062c5eb7390a3bd3f8eb9' 'http://academictorrents.com/apiv2/entries?limit=9999' | jq -r '.[] | [.infohash, .name, .size, .dateadded] | @csv'
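For anyone who prefers to do the `jq` step in a script, here is a rough Python equivalent of the filter above. It assumes the API response is JSON whose top level is either a list of entries or an object whose values are entries (both of which `.[]` would iterate), and that each entry carries the four keys named in the command. The function name `entries_to_csv` is hypothetical, and the HTTP request itself is left out.

```python
import csv
import io
import json

# The four fields selected by the jq filter in the log.
FIELDS = ("infohash", "name", "size", "dateadded")


def entries_to_csv(json_text):
    """Convert an API response body into CSV text, one row per entry.

    Assumption: the top level is a list of entries or a dict whose
    values are entries. Missing keys become empty fields.
    """
    data = json.loads(json_text)
    entries = data.values() if isinstance(data, dict) else data
    out = io.StringIO()
    # QUOTE_NONNUMERIC quotes strings and leaves numbers bare, which is
    # close to what jq's @csv produces.
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for entry in entries:
        writer.writerow([entry.get(field, "") for field in FIELDS])
    return out.getvalue()
```

The first column of the result is the infohash, so the output can be fed straight into whatever builds the `.torrent` download URLs.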
18:39 🔗 JesseW Nice!
18:39 🔗 JesseW better you than me.
19:08 🔗 xioustic has joined #archiveteam-bs
19:22 🔗 ndizzle has joined #archiveteam-bs
19:26 🔗 JesseW has quit IRC (Read error: Operation timed out)
19:26 🔗 xXx_ndidd has quit IRC (Ping timeout: 244 seconds)
19:37 🔗 Eloquence has joined #archiveteam-bs
19:39 🔗 dashcloud has quit IRC (Read error: Operation timed out)
19:41 🔗 Start has quit IRC (Read error: Connection reset by peer)
19:41 🔗 Start has joined #archiveteam-bs
19:43 🔗 dashcloud has joined #archiveteam-bs
19:44 🔗 tomwsmf-a has joined #archiveteam-bs
20:01 🔗 Eloquence has quit IRC (Read error: Operation timed out)
20:14 🔗 arkiver I asked SketchCow to create a collection
20:36 🔗 Simpbrain has quit IRC (Read error: Operation timed out)
20:37 🔗 Eloquence has joined #archiveteam-bs
20:51 🔗 Stiletto has quit IRC (Read error: Operation timed out)
20:51 🔗 Stiletto has joined #archiveteam-bs
20:52 🔗 Simpbrain has joined #archiveteam-bs
21:03 🔗 schbirid has quit IRC (Quit: Leaving)
21:04 🔗 Simpbrain has quit IRC (Ping timeout: 633 seconds)
21:05 🔗 RichardG has joined #archiveteam-bs
21:07 🔗 Eloquence has quit IRC (Read error: Operation timed out)
21:07 🔗 Simpbrain has joined #archiveteam-bs
21:32 🔗 Simpbra1 has joined #archiveteam-bs
21:33 🔗 Simpbrain has quit IRC (Ping timeout: 1208 seconds)
21:38 🔗 kristian_ has joined #archiveteam-bs
21:43 🔗 JesseW has joined #archiveteam-bs
21:45 🔗 dashcloud has quit IRC (Read error: Operation timed out)
21:48 🔗 dashcloud has joined #archiveteam-bs
21:59 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
21:59 🔗 dashcloud has joined #archiveteam-bs
22:01 🔗 signius has quit IRC (Remote host closed the connection)
22:13 🔗 kristian_ has quit IRC (Leaving)
22:14 🔗 Eloquence has joined #archiveteam-bs
22:19 🔗 Stiletto has quit IRC (Read error: Operation timed out)
22:19 🔗 Stiletto has joined #archiveteam-bs
22:26 🔗 joepie91 so apparently Savant's soundcloud was baleeted over a bunch of remixes
22:26 🔗 joepie91 https://www.facebook.com/zyonMGMT/videos/vb.649866465116005/682443285191656/?type=2&theater
22:42 🔗 BlueMaxim has joined #archiveteam-bs
22:51 🔗 dashcloud has quit IRC (Read error: Operation timed out)
22:52 🔗 mutoso has quit IRC (Read error: Operation timed out)
22:54 🔗 godane so i have uploaded up to 2015-04 with kotaku.com
22:54 🔗 dashcloud has joined #archiveteam-bs
23:31 🔗 godane looks like i got all of gawker.com up to 2015: https://archive.org/search.php?query=subject%3A%22gawker.com%22
23:51 🔗 Honno has joined #archiveteam-bs
23:52 🔗 Stiletto has quit IRC (Read error: Operation timed out)
23:52 🔗 Stiletto has joined #archiveteam-bs
23:53 🔗 godane so looks like i did lifehacker.com sitemap grab last summer
23:54 🔗 Eloquence has quit IRC (Read error: Operation timed out)
23:54 🔗 godane i will have to grab at least another 17 months of it so we are sure it's up to date
23:57 🔗 JesseW good
23:57 🔗 JesseW valleywag also seems important to try and get, if we haven't already
