Time |
Nickname |
Message |
00:02
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
00:06
🔗
|
|
dashcloud has joined #archiveteam-bs |
00:20
🔗
|
|
ralphdnak has quit IRC (Ping timeout: 633 seconds) |
00:21
🔗
|
|
ralphdnak has joined #archiveteam-bs |
00:31
🔗
|
|
kristian_ has joined #archiveteam-bs |
00:39
🔗
|
|
ralphdnak has quit IRC (Ping timeout: 244 seconds) |
01:10
🔗
|
MrRadar |
Interesting article on video game preservation: https://web.stanford.edu/group/htgg/cgi-bin/drupal/?q=node/1211 |
01:19
🔗
|
|
bauruine has joined #archiveteam-bs |
01:33
🔗
|
|
kristian_ has quit IRC (Leaving) |
01:36
🔗
|
|
Eloquence has joined #archiveteam-bs |
01:40
🔗
|
|
r3c0d3x_ has joined #archiveteam-bs |
01:41
🔗
|
r3c0d3x_ |
Hey, can I speak to a staff member about something? (My internet connection has been acting up the past few days and has probably spammed this chat with joins/leaves.) |
01:44
🔗
|
JesseW |
r3c0d3x_: we noticed. :-) |
01:44
🔗
|
JesseW |
I'm not sure who's around right now, but someone probably will speak up eventually. |
01:58
🔗
|
r3c0d3x_ |
K, thanks. Really sorry about all that, my ISP has been having problems the past few days (and they're still ongoing), but I'm currently on a separate, stable server now, so that should no longer be an issue. |
02:07
🔗
|
JesseW |
Eh, it happens. |
02:07
🔗
|
JesseW |
I can hardly complain, as I don't even run a bouncer, so I pop in and out a lot. |
02:22
🔗
|
|
Stilett0 has joined #archiveteam-bs |
02:22
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
02:39
🔗
|
|
Stilett0 has quit IRC (Read error: Operation timed out) |
02:39
🔗
|
|
Stiletto has joined #archiveteam-bs |
02:54
🔗
|
bwn |
i couldn't think of a better place for that, so if anyone has any suggestions.. |
02:56
🔗
|
JesseW |
seems good |
02:57
🔗
|
bwn |
that's a really interesting tool, a big part of what's needed if you ask me |
02:58
🔗
|
JesseW |
The list of currently supported sites didn't happen to include any of personal interest to me -- but the idea certainly seems good and useful. |
02:58
🔗
|
r3c0d3x_ |
reading through their site now, this does look really interesting! nice find bwn. |
02:59
🔗
|
r3c0d3x_ |
might contribute at some point |
02:59
🔗
|
bwn |
Nemo bis found it and added it to the Quora wiki |
02:59
🔗
|
* |
JesseW is happily reading through http://wiki.erights.org/wiki/Walnut/Distributed_Computing right now |
03:00
🔗
|
bwn |
jessew: i happen to have a quora account but zero answers, none of the others either, heh |
03:01
🔗
|
bwn |
the extensible aspect though |
03:14
🔗
|
MrRadar |
!ig 2lnjehj9rvargx2kpdcxcxzx5 ^https?://www\.drudgereportarchives\.com/data/.*_video-gunshots-shouts-allahu-akbar-french-magazine-shooting_823281\.html |
03:23
🔗
|
JesseW |
hm, I need to look at the extensible aspect more, I guess |
03:29
🔗
|
JesseW |
https://freeyourstuff.cc/plugins <- bwn, I presume you meant this page? |
03:30
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
03:32
🔗
|
JesseW |
probably a good idea to take Erik up on this: https://freeyourstuff.cc/mirrors |
03:34
🔗
|
|
dashcloud has joined #archiveteam-bs |
03:36
🔗
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
03:38
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
03:44
🔗
|
JesseW |
it might be interesting to write a generalized mediawiki plugin for freeyourstuff.cc |
03:58
🔗
|
bwn |
sorry, yes, the plugins page is what i was referring to |
04:11
🔗
|
|
DopefishJ is now known as DFJustin |
04:16
🔗
|
|
VADemon has quit IRC (Read error: Connection reset by peer) |
04:27
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
04:27
🔗
|
|
Stiletto has joined #archiveteam-bs |
04:51
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
04:51
🔗
|
|
Stiletto has joined #archiveteam-bs |
04:56
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
04:56
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
05:23
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
05:24
🔗
|
|
Stiletto has joined #archiveteam-bs |
05:48
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
05:48
🔗
|
|
Stiletto has joined #archiveteam-bs |
06:10
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
06:10
🔗
|
|
Stiletto has joined #archiveteam-bs |
06:15
🔗
|
|
Honno has joined #archiveteam-bs |
06:55
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
06:55
🔗
|
|
Stiletto has joined #archiveteam-bs |
06:56
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
07:00
🔗
|
|
Eloquence has quit IRC (Ping timeout: 244 seconds) |
07:03
🔗
|
|
ralphdnak has joined #archiveteam-bs |
07:09
🔗
|
|
PurpleSym sets mode: -b r3c0d3x!*@* |
07:11
🔗
|
PurpleSym |
r3c0d3x_: ^ |
07:22
🔗
|
|
bwn has quit IRC (Ping timeout: 244 seconds) |
07:24
🔗
|
|
tomwsmf-a has quit IRC (Read error: Operation timed out) |
07:25
🔗
|
|
Aranje has quit IRC (Remote host closed the connection) |
07:30
🔗
|
|
bwn has joined #archiveteam-bs |
07:39
🔗
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
07:45
🔗
|
|
bzc6p has joined #archiveteam-bs |
07:45
🔗
|
|
swebb sets mode: +o bzc6p |
08:01
🔗
|
|
xXx_ndidd has joined #archiveteam-bs |
08:03
🔗
|
|
ralphdnak has quit IRC (Ping timeout: 244 seconds) |
08:06
🔗
|
|
ndiddy has quit IRC (Read error: Operation timed out) |
08:15
🔗
|
|
bzc6p has left |
08:22
🔗
|
Kazzy |
PurpleSym: cheers, just woke up |
08:37
🔗
|
|
Eloquence has joined #archiveteam-bs |
08:56
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
08:56
🔗
|
|
Stiletto has joined #archiveteam-bs |
09:02
🔗
|
|
Eloquence has quit IRC (Read error: Operation timed out) |
09:19
🔗
|
|
closure has quit IRC (Ping timeout: 250 seconds) |
09:19
🔗
|
|
closure has joined #archiveteam-bs |
09:19
🔗
|
|
midas sets mode: +o closure |
09:41
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
09:41
🔗
|
|
Stiletto has joined #archiveteam-bs |
09:43
🔗
|
|
ralphdnak has joined #archiveteam-bs |
10:38
🔗
|
|
ralphdnak has quit IRC (Ping timeout: 244 seconds) |
10:51
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
10:51
🔗
|
|
Stiletto has joined #archiveteam-bs |
11:15
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
11:15
🔗
|
|
Stiletto has joined #archiveteam-bs |
11:42
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
11:42
🔗
|
|
Stiletto has joined #archiveteam-bs |
12:09
🔗
|
|
VADemon has joined #archiveteam-bs |
12:24
🔗
|
|
Honno has quit IRC (Read error: Operation timed out) |
12:42
🔗
|
VADemon |
PSA: Facebook forces users to download their new app "Moments" in order to NOT LOSE (auto-)synced photos |
12:42
🔗
|
VADemon |
https://twitter.com/aurevoiralexis/status/740728442254135296/photo/1 |
13:10
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
13:10
🔗
|
|
Stiletto has joined #archiveteam-bs |
13:45
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
13:45
🔗
|
|
Stiletto has joined #archiveteam-bs |
15:02
🔗
|
r3c0d3x_ |
PurpleSym: Thanks! Everything should be fixed now. |
15:02
🔗
|
|
r3c0d3x_ is now known as r3c0d3x |
15:07
🔗
|
|
Honno has joined #archiveteam-bs |
15:23
🔗
|
|
Honno has quit IRC (Read error: Operation timed out) |
15:42
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
15:42
🔗
|
|
Stiletto has joined #archiveteam-bs |
15:59
🔗
|
dashcloud |
if there's any projects/ideas/whatever that would benefit from unfiltered, super-high speed connections for a few days, make a list- HOPE is this summer, and you'll have access to a great network for the weekend (July 22-24) |
16:02
🔗
|
|
GLaDOS has quit IRC (Quit: Oh crap, I died.) |
16:03
🔗
|
|
GLaDOS has joined #archiveteam-bs |
16:11
🔗
|
|
Rotab has quit IRC (Read error: Connection reset by peer) |
16:23
🔗
|
joepie91 |
"To make a long story short, we managed to find the company that had purchased our valve manufacturer and it turns out they had exited the manufacturing buisness and they were now a magazine. However, they still had a warehouse full of the fucking valves, and they'd sell us one if we wanted it. And that was the day we ordered an expensive three way valve from a company that had no idea how it worked, or what it did." |
16:23
🔗
|
joepie91 |
( https://www.reddit.com/r/talesfromtechsupport/comments/4njv3r/our_operators_are_too_stupid_part_1/d44plu1 ) |
16:31
🔗
|
|
JesseW has joined #archiveteam-bs |
16:32
🔗
|
yipdw |
joepie91: the company in that story seriously sounds like Roche Pharmaceuticals |
16:33
🔗
|
yipdw |
they are Very Big and they have a strong presence in Indiana, which is basically Nowhere, USA |
16:35
🔗
|
|
schbirid has joined #archiveteam-bs |
16:40
🔗
|
godane |
another g4tv.com video saved: https://archive.org/details/g4tv.com-video36368-flvhd |
17:14
🔗
|
|
fie has joined #archiveteam-bs |
18:03
🔗
|
Sanqui |
https://publicpolicy.googleblog.com/2016/06/the-trans-pacific-partnership-step.html what. |
18:07
🔗
|
|
_desu___ has joined #archiveteam-bs |
18:07
🔗
|
|
HCross2 has joined #archiveteam-bs |
18:08
🔗
|
JesseW |
Hi HCross2! |
18:09
🔗
|
HCross2 |
Hello |
18:09
🔗
|
JesseW |
Did you see my list of academictorrents I posted yesterday? Will that work for you, or would you like me to parse it further? |
18:10
🔗
|
HCross2 |
Seems IRCCloud is on various sorts of fire. I had a look and ideally I want a set of torrent files |
18:10
🔗
|
HCross2 |
Deluge can't take a set of magnets |
18:11
🔗
|
arkiver |
hey |
18:11
🔗
|
arkiver |
Can I help with a script to back them up to IA? |
18:11
🔗
|
JesseW |
arkiver: certainly! |
18:11
🔗
|
arkiver |
Just backing up the torrents to IA, let IA download them |
18:11
🔗
|
JesseW |
Yep, that's the basic plan. |
18:11
🔗
|
HCross |
yeah, if we can just get the torrents, we can feed them into the IA and they will get them |
18:11
🔗
|
arkiver |
ok |
18:11
🔗
|
JesseW |
From the infohashes, you should be able to download the torrents like this, I think: |
18:12
🔗
|
arkiver |
yes |
18:12
🔗
|
JesseW |
http://academictorrents.com/download/403e6d6945a64dd1b9e185a6cd8d029274efccdc.torrent |
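A minimal sketch of the infohash-to-torrent step JesseW describes, assuming the URL pattern generalizes from the single example link above. The names `BASE_URL`, `torrent_url`, and `torrent_urls` are made up for illustration, and nothing here has been checked against the live site.

```python
import re

# Assumed URL pattern, inferred from the one example link in the log.
BASE_URL = "http://academictorrents.com/download/{infohash}.torrent"


def torrent_url(infohash: str) -> str:
    """Return the .torrent download URL for a 40-character hex infohash."""
    infohash = infohash.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{40}", infohash):
        raise ValueError(f"not a SHA-1 infohash: {infohash!r}")
    return BASE_URL.format(infohash=infohash)


def torrent_urls(lines):
    """Map an iterable of infohash lines to URLs, skipping blank or malformed lines."""
    urls = []
    for line in lines:
        try:
            urls.append(torrent_url(line))
        except ValueError:
            # Blank or truncated line: leave it out rather than build a bad URL.
            continue
    return urls
```

Feeding it the lines of a saved infohash list (for example `torrent_urls(open("hashes.txt"))`, with `hashes.txt` a hypothetical local copy) would give one URL per valid hash.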
18:12
🔗
|
arkiver |
do we already have a list of hashes/torrents? |
18:13
🔗
|
JesseW |
I made a list of 296 infohashes |
18:13
🔗
|
JesseW |
http://termbin.com/j6f9 |
18:13
🔗
|
arkiver |
ok |
18:14
🔗
|
arkiver |
It looks like that list is incomplete |
18:15
🔗
|
JesseW |
That's just a list of datasets -- the other items are papers, I think. |
18:16
🔗
|
arkiver |
I mean, see the last line of that list |
18:17
🔗
|
JesseW |
hm, yeah |
18:17
🔗
|
JesseW |
I'm not sure what happened there :-( |
18:17
🔗
|
JesseW |
I'll see about fixing that. |
18:17
🔗
|
arkiver |
But I can add some scraping of the site to the script. |
18:18
🔗
|
|
Aranje has joined #archiveteam-bs |
18:18
🔗
|
JesseW |
sure. My scraping was as simple as downloading http://academictorrents.com/browse.php?cat=6&sort_field=seeders&sort_dir=DESC&page=0 and running a regex on the result |
18:18
🔗
|
JesseW |
this was the regex: re.findall(r"""href="/details/([0-9a-z]+)"><b>([^<]+)</b>.+?filelist=1">([0-9]+)<.+?<nobr>([-0-9]+)<.+?>([0-9.]+[A-Z]+)<.+?center>([0-9,]+)<.+?dllist=1">([0-9+]+)<.+?</tr>""",txt, re.DOTALL) |
18:18
🔗
|
JesseW |
There are 15 pages of datasets, and 55 pages of papers. |
18:18
🔗
|
JesseW |
currently |
18:19
🔗
|
JesseW |
with 20 items on each page |
18:19
🔗
|
JesseW |
That will get you the infohashes, titles, sizes, file counts, "mirror" (i.e. seed) counts |
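The scraping approach described above can be sketched roughly as follows. The regex and the browse URL are copied from JesseW's messages. Everything else is an assumption for illustration: the helper names (`parse_listing`, `fetch_page`, `scrape`), zero-based page numbering, and the idea that `cat=6` is the datasets category. The seven captured columns are returned as raw tuples because the log only names some of them (infohash, title, file count, size, mirror count).

```python
import re
import urllib.request

# Regex as posted in the channel, split across lines for readability.
ROW_RE = re.compile(
    r'href="/details/([0-9a-z]+)"><b>([^<]+)</b>.+?filelist=1">([0-9]+)<'
    r'.+?<nobr>([-0-9]+)<.+?>([0-9.]+[A-Z]+)<.+?center>([0-9,]+)<'
    r'.+?dllist=1">([0-9+]+)<.+?</tr>',
    re.DOTALL,
)

# Browse URL from the log; the page number is the only part that varies here.
BROWSE_URL = (
    "http://academictorrents.com/browse.php"
    "?cat=6&sort_field=seeders&sort_dir=DESC&page={page}"
)


def parse_listing(html: str):
    """Return one 7-tuple of captured strings per table row found in the page."""
    return ROW_RE.findall(html)


def fetch_page(page: int) -> str:
    """Download one listing page (network access; page layout may have changed)."""
    with urllib.request.urlopen(BROWSE_URL.format(page=page)) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape(pages: int, fetch=fetch_page):
    """Collect rows from pages 0..pages-1; `fetch` can be swapped out for testing."""
    rows = []
    for page in range(pages):
        rows.extend(parse_listing(fetch(page)))
    return rows
```

With the counts mentioned in the log, `scrape(15)` would cover the dataset pages. The first element of each tuple is the infohash.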
18:20
🔗
|
PurpleSym |
http://academictorrents.com/about.php#mirroring could be relevant. |
18:22
🔗
|
arkiver |
ok |
18:22
🔗
|
arkiver |
I'll try to have something in a bit |
18:22
🔗
|
JesseW |
PurpleSym: not particularly -- we *want* to do "blind mirroring of all data", so their per-collection lists don't help that much. :-) |
18:23
🔗
|
JesseW |
arkiver: I don't think it's particularly urgent, but it's a good thing to do. |
18:23
🔗
|
PurpleSym |
Yeah, right below that section are details on their API. |
18:23
🔗
|
PurpleSym |
That might be easier than screen scraping. |
18:27
🔗
|
JesseW |
PurpleSym: I looked into it, but the API, unlike the real interface, didn't seem to support paging, oddly. |
18:27
🔗
|
PurpleSym |
The examples suggest you can use &limit=9999 |
18:27
🔗
|
JesseW |
And changing the limit seemed to require an API key -- which, no thanks, I'll just use what you are *already making available* |
18:28
🔗
|
PurpleSym |
I see. |
18:37
🔗
|
PurpleSym |
Anyway, curl -s -b 'uid=4510;pass=f2e3f605ea9062c5eb7390a3bd3f8eb9' 'http://academictorrents.com/apiv2/entries?limit=9999' | jq -r '.[] | [.infohash, .name, .size, .dateadded] | @csv' |
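For anyone without jq, the filter half of that one-liner could look something like this in Python. It assumes the API response has already been saved as JSON text and that each entry carries the four keys named in the jq expression (`infohash`, `name`, `size`, `dateadded`). The function name `entries_to_csv` is made up, and the actual response shape is not confirmed by the log.

```python
import csv
import io
import json

# Keys taken from the jq filter in the log; assumed to exist on every entry.
FIELDS = ("infohash", "name", "size", "dateadded")


def entries_to_csv(json_text: str) -> str:
    """Convert a JSON array (or object) of entries into CSV, one entry per line."""
    data = json.loads(json_text)
    # jq's `.[]` iterates array items or object values, so accept both shapes.
    entries = data.values() if isinstance(data, dict) else data
    out = io.StringIO()
    # Quote strings and leave numbers bare, similar to what jq's @csv does.
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for entry in entries:
        writer.writerow([entry.get(field) for field in FIELDS])
    return out.getvalue()
```

Missing keys come out as empty fields rather than raising, which seemed the safer choice for a one-off dump.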
18:39
🔗
|
JesseW |
Nice! |
18:39
🔗
|
JesseW |
better you than me. |
19:08
🔗
|
|
xioustic has joined #archiveteam-bs |
19:22
🔗
|
|
ndizzle has joined #archiveteam-bs |
19:26
🔗
|
|
JesseW has quit IRC (Read error: Operation timed out) |
19:26
🔗
|
|
xXx_ndidd has quit IRC (Ping timeout: 244 seconds) |
19:37
🔗
|
|
Eloquence has joined #archiveteam-bs |
19:39
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
19:41
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
19:41
🔗
|
|
Start has joined #archiveteam-bs |
19:43
🔗
|
|
dashcloud has joined #archiveteam-bs |
19:44
🔗
|
|
tomwsmf-a has joined #archiveteam-bs |
20:01
🔗
|
|
Eloquence has quit IRC (Read error: Operation timed out) |
20:14
🔗
|
arkiver |
I asked SketchCow to create a collection |
20:36
🔗
|
|
Simpbrain has quit IRC (Read error: Operation timed out) |
20:37
🔗
|
|
Eloquence has joined #archiveteam-bs |
20:51
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
20:51
🔗
|
|
Stiletto has joined #archiveteam-bs |
20:52
🔗
|
|
Simpbrain has joined #archiveteam-bs |
21:03
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
21:04
🔗
|
|
Simpbrain has quit IRC (Ping timeout: 633 seconds) |
21:05
🔗
|
|
RichardG has joined #archiveteam-bs |
21:07
🔗
|
|
Eloquence has quit IRC (Read error: Operation timed out) |
21:07
🔗
|
|
Simpbrain has joined #archiveteam-bs |
21:32
🔗
|
|
Simpbra1 has joined #archiveteam-bs |
21:33
🔗
|
|
Simpbrain has quit IRC (Ping timeout: 1208 seconds) |
21:38
🔗
|
|
kristian_ has joined #archiveteam-bs |
21:43
🔗
|
|
JesseW has joined #archiveteam-bs |
21:45
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
21:48
🔗
|
|
dashcloud has joined #archiveteam-bs |
21:59
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
21:59
🔗
|
|
dashcloud has joined #archiveteam-bs |
22:01
🔗
|
|
signius has quit IRC (Remote host closed the connection) |
22:13
🔗
|
|
kristian_ has quit IRC (Leaving) |
22:14
🔗
|
|
Eloquence has joined #archiveteam-bs |
22:19
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
22:19
🔗
|
|
Stiletto has joined #archiveteam-bs |
22:26
🔗
|
joepie91 |
so apparently Savant's soundcloud was baleeted over a bunch of remixes |
22:26
🔗
|
joepie91 |
https://www.facebook.com/zyonMGMT/videos/vb.649866465116005/682443285191656/?type=2&theater |
22:42
🔗
|
|
BlueMaxim has joined #archiveteam-bs |
22:51
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
22:52
🔗
|
|
mutoso has quit IRC (Read error: Operation timed out) |
22:54
🔗
|
godane |
so i have uploaded up to 2015-04 with kotaku.com |
22:54
🔗
|
|
dashcloud has joined #archiveteam-bs |
23:31
🔗
|
godane |
looks like i got all of gawker.com up to 2015: https://archive.org/search.php?query=subject%3A%22gawker.com%22 |
23:51
🔗
|
|
Honno has joined #archiveteam-bs |
23:52
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
23:52
🔗
|
|
Stiletto has joined #archiveteam-bs |
23:53
🔗
|
godane |
so looks like i did a lifehacker.com sitemap grab last summer |
23:54
🔗
|
|
Eloquence has quit IRC (Read error: Operation timed out) |
23:54
🔗
|
godane |
i will have to grab at least another 17 months of it so we are sure we're up to date with it |
23:57
🔗
|
JesseW |
good |
23:57
🔗
|
JesseW |
valleywag also seems important to try and get, if we haven't already |