| Time |
Nickname |
Message |
|
01:07
🔗
|
phuzion |
ivan`: where did you get that github-repositories.txt file? |
|
01:07
🔗
|
chronomex |
the gothub subcommittee posted it on IA recently |
|
01:08
🔗
|
chronomex |
http://archive.org/details/archiveteam-github-repository-index-201212 |
|
01:09
🔗
|
phuzion |
Thanks |
|
01:09
🔗
|
* |
phuzion considers just starting that for the hell of it, to see how much space it ends up taking. |
|
01:27
🔗
|
balrog_ |
some repos already got renamed or deleted |
|
01:27
🔗
|
joepie91 |
does anyone have a done-in-10-seconds way to submit a WARC to the internet archive? |
|
01:27
🔗
|
joepie91 |
from a headless server |
|
01:28
🔗
|
joepie91 |
I have a WARC of the BBC site of a while ago |
|
01:28
🔗
|
chronomex |
balrog_: yeah, they will do that |
|
01:28
🔗
|
chronomex |
joepie91: you can cobble together something that uses curl to POST it to the s3 api |
|
01:29
🔗
|
joepie91 |
:p |
|
01:29
🔗
|
joepie91 |
right, but how would I do that, seeing as I'm entirely unfamiliar with the s3 api |
|
01:29
🔗
|
chronomex |
okay |
|
01:30
🔗
|
chronomex |
you need to get tokens and I'll give you a command line |
|
01:30
🔗
|
joepie91 |
how do i get tokens? |
|
01:30
🔗
|
ivan` |
phuzion: let me know if you run out, I might have 3TB of space to do some of it |
|
01:31
🔗
|
phuzion |
ivan`: I've started on it, I'll let you know when it fills up my drive. |
|
01:31
🔗
|
chronomex |
joepie91: http://archive.org/account/s3.php |
|
01:31
🔗
|
joepie91 |
okay, got them |
|
01:32
🔗
|
ivan` |
phuzion: you might want to run two in parallel since half the time github will be busy counting objects |
|
01:32
🔗
|
chronomex |
joepie91: curl '--header' 'authorization: LOW your-magic-token' '--header' 'x-archive-meta01-collection:opensource' '--header' 'x-amz-auto-make-bucket:1' '--header' 'x-archive-meta-noindex:true' --header 'x-archive-meta-(title|date|mediatype|language|etc): Value' |
|
01:32
🔗
|
phuzion |
Hmm... Perhaps I can figure out how to split the list into even and odd lines... |
|
01:32
🔗
|
ivan` |
or with xargs or parallel |
|
01:32
🔗
|
* |
ivan` looks it up |
|
01:33
🔗
|
chronomex |
yes, `parallel' is good |
|
01:33
🔗
|
joepie91 |
magic token == secret key? |
|
01:33
🔗
|
chronomex |
joepie91: hold on don't run that yet |
|
01:33
🔗
|
chronomex |
yes, secret key |
|
01:33
🔗
|
chronomex |
you actually want to run it with these as well ... |
|
01:34
🔗
|
chronomex |
curl -i '-#' ${args from above} --upload-file /dev/null "http://s3.us.archive.org/"$identifier |
|
01:34
🔗
|
chronomex |
this will give you a progress bar and stuff |
|
01:35
🔗
|
joepie91 |
what is the $identifier? |
|
01:35
🔗
|
* |
joepie91 is confused now |
|
01:35
🔗
|
joepie91 |
okay, let me ask it differently |
|
01:36
🔗
|
joepie91 |
if I wanted to upload a warc.gz of the BBC.co.uk site named "BBC.co.uk WARC", and the filename was at-bbc.warc.gz |
|
01:36
🔗
|
joepie91 |
what would the full command be to run (minus secret key, ofc) |
|
01:36
🔗
|
joepie91 |
so that I get a bit of a better grasp on the syntax :p |
|
01:38
🔗
|
phuzion |
ivan`: I'm trying to figure out how to split the file in half, I want to do even and odd lines, but can't quite nail the sed syntax, you any good with sed? |
|
01:39
🔗
|
phuzion |
Wait, hang on, I might have gotten it |
|
01:39
🔗
|
ivan` |
no, I was busy trying to figure out how to do the subshell thing with parallel |
|
01:41
🔗
|
DFJustin |
so did someone warc this yet, closing tomorrow http://japan.gamespot.com/ |
|
01:41
🔗
|
phuzion |
Yeah, got it |
|
01:41
🔗
|
phuzion |
sed -n "1~2 p" github-repositories.txt > github-odd.txt and then sed -n "2~2 p" github-repositories.txt > github-even.txt |
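Editor's note: the two sed commands above use the `first~step` address form, which is a GNU sed extension. A minimal Python sketch of the same odd/even split follows, assuming the list fits in memory. The function names `split_odd_even` and `split_file` are illustrative and do not come from the log.

```python
def split_odd_even(lines):
    """Split a list of lines by 1-based line number.

    Returns (odd, even): odd holds lines 1, 3, 5, ... and even holds
    lines 2, 4, 6, ..., matching sed -n '1~2 p' and sed -n '2~2 p'.
    """
    odd = lines[0::2]   # index 0 is line 1, so this is the odd-numbered lines
    even = lines[1::2]  # index 1 is line 2, so this is the even-numbered lines
    return odd, even


def split_file(src_path, odd_path, even_path):
    """Read src_path and write its odd and even lines to two new files."""
    with open(src_path) as src:
        lines = src.readlines()
    odd, even = split_odd_even(lines)
    with open(odd_path, "w") as out:
        out.writelines(odd)
    with open(even_path, "w") as out:
        out.writelines(even)
```

Calling `split_file("github-repositories.txt", "github-odd.txt", "github-even.txt")` would produce the same two files as the sed commands, which can then be fed to two clone jobs running side by side.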
|
01:46
🔗
|
chronomex |
joepie91: curl -i -'#' (all the --header options from above) --upload-file at-bbc.warc.gz "http://s3.us.archive.org/BBC.co.uk-warc" |
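Editor's note: the pieces of the curl command are spread over several messages above. The sketch below only assembles them into one argument list and does not run anything. Two points are assumptions rather than statements from the log: that the authorization value takes the form `LOW accesskey:secret` (the log only says "your-magic-token"), and that the file name is appended after the identifier in the URL. Both should be checked against the examples in abouts3.txt. The function name `build_ia_upload_command` is hypothetical.

```python
import os
import shlex


def build_ia_upload_command(identifier, filename, access_key, secret_key,
                            title, collection="opensource"):
    """Return a curl argument list for uploading one file to an archive.org item."""
    # Assumption: the object key is the file's base name under the identifier.
    url = "http://s3.us.archive.org/{}/{}".format(
        identifier, os.path.basename(filename))
    headers = [
        # Assumption: access key and secret joined by a colon after "LOW".
        "authorization: LOW {}:{}".format(access_key, secret_key),
        "x-archive-meta01-collection:" + collection,
        "x-amz-auto-make-bucket:1",
        "x-archive-meta-noindex:true",
        "x-archive-meta-title:" + title,
    ]
    cmd = ["curl", "-i", "-#"]  # -i shows response headers, -# shows a progress bar
    for header in headers:
        cmd += ["--header", header]
    cmd += ["--upload-file", filename, url]
    return cmd


def as_shell_line(cmd):
    """Quote the argument list so it can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

For the case asked about in the log, `as_shell_line(build_ia_upload_command("BBC.co.uk-warc", "at-bbc.warc.gz", access_key, secret_key, "BBC.co.uk WARC"))` yields a single pasteable line. Further `x-archive-meta-*` headers such as date or mediatype can be added to the list in the same way.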
|
01:48
🔗
|
chronomex |
joepie91: make sense? |
|
01:48
🔗
|
chronomex |
you should write a description and stuff |
|
01:54
🔗
|
joepie91 |
right... not much to describe though |
|
01:54
🔗
|
joepie91 |
:P |
|
01:55
🔗
|
chronomex |
well, write where it came from, include the wget command line, etc |
|
01:55
🔗
|
chronomex |
maybe why you got it |
|
01:55
🔗
|
chronomex |
just a few sentences |
|
01:56
🔗
|
joepie91 |
just use \n to insert a newline? |
|
01:57
🔗
|
chronomex |
I don't know tbh |
|
01:57
🔗
|
* |
phuzion predicts that his 1tb drive will be full tonight, thanks to cloning github repos |
|
01:57
🔗
|
chronomex |
phuzion: that sounds like a safe bet |
|
01:57
🔗
|
joepie91 |
meh, don't have the command I ran anymore anyway :/ |
|
01:57
🔗
|
phuzion |
heh |
|
01:58
🔗
|
godane |
how big would you think japan.gamespot.com should be? |
|
01:58
🔗
|
joepie91 |
desc? |
|
01:58
🔗
|
joepie91 |
what's the name of the description header? |
|
01:58
🔗
|
chronomex |
ummm |
|
01:58
🔗
|
phuzion |
Can someone take http://git.kernel.org/index.html and get all of the git:// links out of the page for me? I wanna clone all of those as well |
|
02:00
🔗
|
chronomex |
joepie91: read the examples at http://archive.org/help/abouts3.txt |
|
02:02
🔗
|
joepie91 |
phuzion: http://sprunge.us/XOBV |
|
02:02
🔗
|
joepie91 |
ignore the first line |
|
02:03
🔗
|
joepie91 |
rest should be valid |
|
02:03
🔗
|
phuzion |
joepie91: you sir, are a gentleman and a scholar |
|
02:03
🔗
|
phuzion |
Mind if I ask the wizardry you used to obtain such a result? |
|
02:07
🔗
|
joepie91 |
sure, 1 sec :P |
|
02:08
🔗
|
joepie91 |
bit of a nasty method, but |
|
02:08
🔗
|
joepie91 |
http://pastie.org/5541069 |
|
02:08
🔗
|
joepie91 |
it does the job |
|
02:08
🔗
|
joepie91 |
curl http://whatever | python gitlink.py |
|
02:09
🔗
|
joepie91 |
the regex is extremely lazy though, and there's no guarantee that it'll work with other stuff :P |
|
02:09
🔗
|
joepie91 |
plus I don't think it'll match more than one git:// url per line in the html file |
|
02:09
🔗
|
joepie91 |
which is fine for this, but may not be fine for other things |
|
02:10
🔗
|
chronomex |
I would do curl http://whatever | sed -e 's/[" ]/\n/g' | grep ^git:// |
|
02:10
🔗
|
joepie91 |
chronomex: that won't work if there's other stuff on the same line, right? |
|
02:10
🔗
|
joepie91 |
wait |
|
02:10
🔗
|
joepie91 |
I see what you're doing |
|
02:11
🔗
|
chronomex |
:) |
|
02:11
🔗
|
joepie91 |
that would break here though, if you don't include ) in your regex |
|
02:11
🔗
|
chronomex |
ok |
|
02:11
🔗
|
joepie91 |
there was one that would break and have a ) at the end |
|
02:11
🔗
|
chronomex |
well, as usual, it requires tuning |
|
02:11
🔗
|
joepie91 |
:P |
|
02:11
🔗
|
joepie91 |
plus you'd have to add < |
|
02:11
🔗
|
joepie91 |
in case it's mentioned as text |
|
02:11
🔗
|
chronomex |
well yes |
|
02:11
🔗
|
chronomex |
but you see where I'm going with it |
|
02:11
🔗
|
joepie91 |
yes :) |
|
02:11
🔗
|
joepie91 |
I'm horrible with sed and awk so I prefer python for these kind of things :P |
|
02:12
🔗
|
chronomex |
or you could do grep -o 'git://[-_A-Za-z./%0-9 etc]*' |
|
02:12
🔗
|
chronomex |
-o is only-matching-regions |
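Editor's note: the `grep -o` idea above sidesteps both objections raised earlier, since it picks up several links per line and stops at characters such as `)`, `"` or `<`. A minimal Python sketch of the same approach follows. The character class is our guess at what the "etc" in the message stands for, and `extract_git_urls` is a hypothetical name; this is not the script from the pastie link.

```python
import re

# Assumption: git:// URLs on the page use only these characters.
# Anything else, such as a quote, ')' or '<', ends the match.
GIT_URL = re.compile(r"git://[-_A-Za-z0-9./%~+]+")


def extract_git_urls(html):
    """Return every git:// URL found in html, in order, without duplicates."""
    seen = set()
    urls = []
    for match in GIT_URL.findall(html):
        url = match.rstrip(".")  # drop a trailing sentence period, if any
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Feeding it the page text, for example `extract_git_urls(sys.stdin.read())` behind `curl http://whatever |`, gives one clean list to hand to `git clone`. As with the sed and grep versions, the pattern still needs tuning for pages that use other characters in their URLs.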
|
02:31
🔗
|
chronomex |
alard: tracker is back in swapsville http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/memory.html |
|
02:39
🔗
|
godane |
so i'm starting the mirroring of japan.gamespot.com |
|
03:27
🔗
|
godane |
and japan.gamespot.com is gone |
|
03:27
🔗
|
balrog_ |
:/ did it get backed up? |
|
03:28
🔗
|
godane |
part of it did |
|
03:28
🔗
|
godane |
not much |
|
03:43
🔗
|
godane |
i'm uploading my warc for japan.gamespot.com right now |
|
03:43
🔗
|
godane |
just don't expect much |
|
03:50
🔗
|
balrog_ |
:[ |
|
03:55
🔗
|
godane |
uploaded: http://archive.org/details/japan.gamespot.com-20121216-mirror-incomplete |
|
03:55
🔗
|
godane |
we were not fast enough |
|
03:56
🔗
|
godane |
i'm grabbing stuff like fireflyfans.net before it needs a panic download in under 2 hours |
|
03:59
🔗
|
godane |
it was already starting to redirect to japan.cnet.com based on my wget.log |
|
04:03
🔗
|
phuzion |
chronomex: The tracker that you talk of, is that why I can't download github stuff? |
|
04:03
🔗
|
chronomex |
phuzion: no idea |
|
07:01
🔗
|
* |
chronomex currently stuffing some ftp grabs from last week into .zips |
|
09:50
🔗
|
Nemo_bis |
chronomex: you uploaded some of these a while ago didn't you? https://archive.org/details/bellsystempractices |
|
09:50
🔗
|
Nemo_bis |
are http://thepiratebay.se/torrent/5946997/Bell_Systems_Technical_Journals_(Full_Site_Rip) darkened or just not on archive.org? |
|
09:54
🔗
|
chronomex |
Nemo_bis: that is my collection, yes. |
|
09:56
🔗
|
Nemo_bis |
chronomex: do you anything about the bell system technical journals then? |
|
09:56
🔗
|
chronomex |
do I what anything? |
|
09:57
🔗
|
chronomex |
I think you a word |
|
09:57
🔗
|
alard |
chronomex: The tracker likes to swap. We have too many large projects at the moment. |
|
09:57
🔗
|
chronomex |
yeah, that was my understanding |
|
09:58
🔗
|
Nemo_bis |
*do you know |
|
09:59
🔗
|
alard |
GitHub is done now, so that will be going. |
|
09:59
🔗
|
chronomex |
I know some things about the BSTJ, yes? |
|
10:00
🔗
|
Nemo_bis |
chronomex: about them being uploaded on archive.org |
|
10:01
🔗
|
chronomex |
don't |
|
10:02
🔗
|
chronomex |
http://archive.org/search.php?query=bell%20system%20technical%20journal hmmm this is bad |
|
10:03
🔗
|
chronomex |
maybe I should upload that 50G torrent |
|
10:03
🔗
|
chronomex |
orrrrr not? |
|
10:03
🔗
|
chronomex |
upload it from the lucent site |
|
10:03
🔗
|
chronomex |
maybe I'll do that tomorrow |
|
10:04
🔗
|
chronomex |
yeahhhh |
|
10:18
🔗
|
Nemo_bis |
chronomex: yes you should :) |
|
10:18
🔗
|
Nemo_bis |
unless Jason already did it? |
|
10:25
🔗
|
hiker3 |
http://japan.gamespot.com/ is gone now |
|
10:26
🔗
|
hiker3 |
I see godane managed to grab some of it |
|
10:28
🔗
|
hiker3 |
If http://andriasang.com/ comes back online it might be nice to grab a copy as well. I am not sure how much longer it will stay up |
|
10:29
🔗
|
hiker3 |
Thank you for grabbing what you did, godane. |
|
11:58
🔗
|
Nemo_bis |
If you want to pick some... (Or add suggestions; my proxy and myself got sick of browsing TPB. ;-) ) http://archiveteam.org/index.php?title=Magazines_and_journals |
|
13:05
🔗
|
godane |
hiker3: looks like gamespotjapan twitter feed is gone too |
|
13:05
🔗
|
hiker3 |
Hi! But isn't twitter archived automatically? |
|
13:06
🔗
|
godane |
don't know |
|
13:06
🔗
|
godane |
i just know that the account doesn't exist anymore |
|
13:06
🔗
|
hiker3 |
Were you the only one grabbing the site? |
|
13:11
🔗
|
godane |
i don't know |
|
13:12
🔗
|
godane |
in least jason scott got it |
|
13:12
🔗
|
godane |
*in less |
|
13:12
🔗
|
godane |
when was it posted that it was going to be redirected to japan.cnet.com |
|
13:13
🔗
|
hiker3 |
I came in here 3 days ago and mentioned it I think |
|
13:13
🔗
|
godane |
so i hope jason got the warning then |
|
13:13
🔗
|
godane |
i know i was not going to get all of it |
|
13:14
🔗
|
hiker3 |
Is there any way someone can get http://andriasang.com/ if it comes back up? |
|
13:14
🔗
|
hiker3 |
It's been having errors for a few weeks now, and the author has moved on to other things so I am not sure how much longer it will stay up. |
|
14:10
🔗
|
godane |
so looks like fireflyfans.net bluesunroom is very big |
|
14:41
🔗
|
joepie91 |
chronomex: http://aarnist.cryto.net:81/data/at-trancenu.warc.gz |
|
14:41
🔗
|
joepie91 |
a seemingly complete warc of trance.nu |
|
14:44
🔗
|
godane |
uploaded: http://archive.org/details/www.engadget.com-images-2006-mirror |
|
14:44
🔗
|
norbert79 |
joepie91: What do you think, would all sources found for gopher be worth of uploading? |
|
14:45
🔗
|
norbert79 |
I mean the UMN gopher engine |
|
14:48
🔗
|
joepie91 |
norbert79: I have no idea what that would entail, to be perfectly honest |
|
14:48
🔗
|
joepie91 |
that was before my time :P |
|
14:50
🔗
|
norbert79 |
Lot of old gopher code; it would mean like I would say: old apache2 code :) |
|
14:52
🔗
|
joepie91 |
ah, right |
|
14:52
🔗
|
joepie91 |
sure, why not :P |
|
14:54
🔗
|
hiker3 |
Is there a list of websites which have shutdown but have archives from AT? |
|
14:56
🔗
|
godane |
i have a gopher plugin for firefox |
|
14:58
🔗
|
Nemo_bis |
sigh people packaging PDFs in NRG packaged in multifile RARs |
|
15:11
🔗
|
norbert79 |
Looks like sharing isn't accessible atm |
|
15:12
🔗
|
godane |
i found some usenet dumps |
|
15:12
🔗
|
godane |
on gopher://telefisk.org/ |
|
15:12
🔗
|
joepie91 |
hiker3: I think it's on the archiveteam wiki |
|
15:12
🔗
|
godane |
the archive is up to like 2011 |
|
15:13
🔗
|
norbert79 |
godane: Telefisk is still an active gopher server |
|
15:13
🔗
|
godane |
yes |
|
15:13
🔗
|
godane |
from what i can tell |
|
15:13
🔗
|
norbert79 |
godane: You could also add olduse.net to this too |
|
15:14
🔗
|
joepie91 |
ah |
|
15:14
🔗
|
joepie91 |
hiker3: http://archive.org/details/archiveteam |
|
15:15
🔗
|
norbert79 |
godane: Wanted to upload Old Gopher Sources, connection died, now I can't use that keyword anymore, but am offered OldGopherSOurces_631 |
|
15:15
🔗
|
norbert79 |
godane: What now? |
|
15:15
🔗
|
norbert79 |
Shall I ignore this? |
|
15:17
🔗
|
godane |
i'm downloading this stuff to be on the safe side |
|
15:19
🔗
|
DFJustin |
norbert79: it looks like https://archive.org/details/OldGopherSources was created, so you ought to be able to go in and edit it |
|
15:20
🔗
|
norbert79 |
DFJustin: Cheers, looks like both https://archive.org/details/OldGopherSources and https://archive.org/details/OldGopherSources_693 got created and got stuck again |
|
15:20
🔗
|
DFJustin |
afaik olduse.net comes from data that is already on IA so no point in archiving it https://archive.org/details/utzoo-wiseman-usenet-archive |
|
15:21
🔗
|
norbert79 |
DFJustin: About these pages, can I somehow remove them? |
|
15:21
🔗
|
DFJustin |
no |
|
15:21
🔗
|
norbert79 |
I wish to remove the second, aw crap |
|
15:21
🔗
|
DFJustin |
it's not public yet so no big deal |
|
15:21
🔗
|
norbert79 |
Ok |
|
15:22
🔗
|
norbert79 |
DFJustin: What is the right choice for compressed source files? |
|
15:22
🔗
|
norbert79 |
I am offered movie, audio and text |
|
15:22
🔗
|
norbert79 |
and etree |
|
15:22
🔗
|
DFJustin |
pick text and an admin can move it later |
|
15:23
🔗
|
norbert79 |
cheers |
|
15:29
🔗
|
norbert79 |
Done |
|
16:15
🔗
|
godane |
looks like fireflyfans.net stores the bluesun images using the file's md5sum |
|
16:41
🔗
|
joepie91 |
anything else that needs wget-warcing? |
|
16:48
🔗
|
Nemo_bis |
joepie91: are you open also to different suggestions? :) |
|
16:48
🔗
|
joepie91 |
that depends on what said suggestion is :P |
|
16:48
🔗
|
Nemo_bis |
I put some on http://archiveteam.org/index.php?title=Magazines_and_journals |
|
16:50
🔗
|
joepie91 |
Nemo_bis: I can't do torrents, though |
|
16:50
🔗
|
Nemo_bis |
ah |
|
16:50
🔗
|
joepie91 |
disallowed by the host that I'm using |
|
16:50
🔗
|
joepie91 |
because it's very IO heavy |
|
16:51
🔗
|
joepie91 |
:P |
|
16:51
🔗
|
balrog_ |
:[ |
|
16:51
🔗
|
joepie91 |
see https://srsvps.com/terms.html |
|
16:51
🔗
|
balrog_ |
even if you limit to a few connections at a time? ahh |
|
16:51
🔗
|
balrog_ |
I understand that OVH is pretty lenient |
|
16:51
🔗
|
balrog_ |
and is popular for seedboxes |
|
16:51
🔗
|
joepie91 |
ya, but my only OVH box that I could use for this would be my kimsufi |
|
16:51
🔗
|
balrog_ |
yeah |
|
16:52
🔗
|
joepie91 |
:P |
|
16:52
🔗
|
joepie91 |
and that one isn't supposed to do anything besides function as a testing box for my vps panel |
|
16:52
🔗
|
joepie91 |
don't want to risk suspension or similar |
|
16:52
🔗
|
Nemo_bis |
aww 503 Service Unavailable |
|
16:52
🔗
|
joepie91 |
I have one other VPS on an OVH server, but if I start torrenting on that, encyclopedia dramatica will probably slow down to a crawl, since it's a backend server >.> |
|
16:53
🔗
|
Nemo_bis |
there's an interesting NATO FTP site there that you could grab though |
|
16:53
🔗
|
balrog_ |
ohhh? |
|
16:53
🔗
|
* |
balrog_ has been on the lookout for NATO documents |
|
16:53
🔗
|
balrog_ |
well, certain specific ones having to do with speech codecs |
|
16:53
🔗
|
joepie91 |
Nemo_bis: how large is it, approx? |
|
16:53
🔗
|
joepie91 |
I have about 50G of space left |
|
16:53
🔗
|
Nemo_bis |
joepie91: dunno, some dozens GiB perhaps |
|
16:53
🔗
|
joepie91 |
on this vps |
|
16:53
🔗
|
joepie91 |
hmm |
|
16:53
🔗
|
joepie91 |
I could do it partially |
|
16:53
🔗
|
Nemo_bis |
ftp.rta.nato.int/PubFullText/AGARD/ or http://thepiratebay.se/torrent/7639843/AGARD_monographs_(_AGARDographs_) 15 GiB/453 |
|
16:54
🔗
|
Nemo_bis |
and parent folder |
|
16:54
🔗
|
joepie91 |
what is the easiest way to mass-download from an FTP server? |
|
16:54
🔗
|
Nemo_bis |
wget |
|
16:54
🔗
|
DFJustin |
lftp |
|
16:54
🔗
|
joepie91 |
I'd assume warc isn't suitable for this |
|
16:54
🔗
|
Nemo_bis |
http://archiveteam.org/index.php?title=FTP |
|
16:54
🔗
|
joepie91 |
ah, nice :P |
|
16:55
🔗
|
DFJustin |
it doesn't really matter if you're doing a one time pull, I like lftp for updating an existing mirror |
|
16:56
🔗
|
joepie91 |
downloading... |
|
16:57
🔗
|
joepie91 |
140kb/sec |
|
16:57
🔗
|
joepie91 |
:p |
|
16:57
🔗
|
joepie91 |
not particularly fast |
|
16:57
🔗
|
joepie91 |
KB* |
|
16:57
🔗
|
Nemo_bis |
:< |
|
16:57
🔗
|
joepie91 |
you'd think nato could afford a decent pipe |
|
16:57
🔗
|
joepie91 |
oh, by the way, alard, are you here? |
|
17:04
🔗
|
joepie91 |
Nemo_bis: I've started downloading the car and motorcycle manual torrents from my home connection (on my media server) |
|
17:04
🔗
|
joepie91 |
:P |
|
17:04
🔗
|
joepie91 |
it'll be slow, but it's something |
|
17:05
🔗
|
joepie91 |
it'll download at 1.1MB/sec max, and upload at like 60KB/sec max |
|
17:06
🔗
|
godane |
i found a amiga virus collection |
|
17:07
🔗
|
joepie91 |
haha |
|
17:08
🔗
|
Nemo_bis |
joepie91: ok, upload with the bulk uploader, you know how? |
|
17:09
🔗
|
joepie91 |
Nemo_bis: no idea, and the standard uploader on the archive.org site says it doesn't work properly on unix-based systems |
|
17:09
🔗
|
joepie91 |
honestly, archive.org needs some kind of software to do uploads |
|
17:09
🔗
|
joepie91 |
easily |
|
17:09
🔗
|
joepie91 |
including the whole tagging etc |
|
17:10
🔗
|
Nemo_bis |
https://wiki.archive.org/twiki/bin/view/Main/IAS3BulkUploader |
|
17:12
🔗
|
joepie91 |
whoa, I did not know that existed |
|
17:20
🔗
|
Nemo_bis |
joepie91: if your upload bandwidth is so little, perhaps you chose too big a torrent :) |
|
17:20
🔗
|
Nemo_bis |
but let's see |
|
17:21
🔗
|
joepie91 |
nah, I'll just have patience :P |
|
17:21
🔗
|
joepie91 |
that server runs 24/7 anyway |
|
17:21
🔗
|
joepie91 |
plus I'll probably upload stuff separately |
|
17:26
🔗
|
godane |
this item needs some help: www.engadget.com-images-2007-mirror |
|
17:34
🔗
|
schbiridi |
joepie91: i mount the ftp with curlftpfs and then use rsync |
|
17:34
🔗
|
joepie91 |
schbiridi: you mean the archive.org FTP upload? |
|
17:35
🔗
|
joepie91 |
according to the info page that's not recommended because of bandwidth |
|
17:53
🔗
|
schbiridi |
nah, for mirroring FTP servers |
|
17:54
🔗
|
schbiridi |
sorry :D |
|
17:54
🔗
|
balrog_ |
schbiridi: I usually use lftp here, or wget |
|
17:57
🔗
|
Nemo_bis |
the wonders of interoperable systems: everyone can use any flavour of software one likes most ;) |
|
18:00
🔗
|
joepie91 |
ahh |
|
18:01
🔗
|
schbiridi |
i find rsync the most versatile |
|
18:53
🔗
|
alard |
joepie91: Yes? |
|
19:06
🔗
|
joepie91 |
alard: I have something that may be of use to you for future projects |
|
19:06
🔗
|
joepie91 |
I wrote a self-extracting python script thingie |
|
19:06
🔗
|
joepie91 |
I'm using it for the installer for my VPS panel, but it may be useful for stand-alone versions of crawlers etc as well |
|
19:07
🔗
|
joepie91 |
https://github.com/joepie91/cvm/tree/develop/tools/pysfx |
|
19:07
🔗
|
joepie91 |
it doesn't have its own repo yet (it will soon, though) |
|
19:07
🔗
|
joepie91 |
example usage: https://github.com/joepie91/cvm/blob/develop/installer/build.sh |
|
19:07
🔗
|
joepie91 |
end result is a single .py that you can run, it'll extract itself to a temp dir, and run the specified command |
|
19:49
🔗
|
ivan` |
blip.tv a serious risk given "@richhickey says @skelter Blip doesn't want conference vids, tech talks etc, and gave us 2 weeks to move." |
|
19:49
🔗
|
ivan` |
I have all the Clojure videos, don't worry about those |
|
19:58
🔗
|
chronomex |
oh really? |
|
19:58
🔗
|
chronomex |
I thought blip.tv has the HOPE video |
|
20:00
🔗
|
ivan` |
I also have all of http://blip.tv/linuxconfau |
|
20:04
🔗
|
ivan` |
going to grab http://blip.tv/linux-journal and other things I can find on google |
|
20:06
🔗
|
ivan` |
(note: my upstream is terrible and no real backups) |
|
20:07
🔗
|
alard |
joepie91: Ah, that's something to remember. Similar to py2exe, but for Linux? |
|
20:08
🔗
|
joepie91 |
alard: more similar to a 7zip sfx with autorun, I'd say, but for Linux :P |
|
20:09
🔗
|
joepie91 |
it doesn't include dependencies etc, it just works with whatever tar.gz you give it |
|
20:09
🔗
|
joepie91 |
you could theoretically pack up something entirely non-python with it |
|
20:09
🔗
|
joepie91 |
as it will just run whatever command you specify, but with working directory set to the temp extraction directory |
|
20:13
🔗
|
ivan` |
what's our preferred piratepad? |
|
20:13
🔗
|
ivan` |
piratepad? |
|
20:16
🔗
|
alard |
joepie91: OK. |
|
20:24
🔗
|
ivan` |
if anyone is really interested in blip.tv I can provide a youtube-dl patch and start listing channels in piratepad |
|
20:25
🔗
|
ivan` |
otherwise I'll just continue sucking things down at 1M/s and hope 2 weeks only apply to hickey |
|
20:35
🔗
|
ivan` |
surprised to see a lot of great content, site must have terrible googlejuice |
|
21:38
🔗
|
godane |
SketchCow: so i have the bluesunroom of fireflyfans.net |
|
21:38
🔗
|
godane |
2.1gb warc.gz with 11000+ images |
|
22:11
🔗
|
SketchCow |
Goodness |
|
22:13
🔗
|
Nemo_bis |
I think hank briefly killed the site (or so) earlier today, maybe with this https://archive.org/~tracey/mrtg/derives.html |
|
22:16
🔗
|
godane |
SketchCow: did you grab japan.gamespot.com? |
|
22:16
🔗
|
godane |
i only grabbed 90mb |
|
22:18
🔗
|
alard |
SketchCow: I uploaded the GitHub files, see http://archive.org/details/github-downloads-201212-part-a (to -z and -0 to -9) |
|
22:23
🔗
|
SketchCow |
So this is the after-we-fixed-the-bugs thing? |
|
22:24
🔗
|
alard |
Yes. |
|
22:32
🔗
|
SketchCow |
Fantastic. |
|
22:32
🔗
|
SketchCow |
I think I put this in software. |
|
22:36
🔗
|
dashcloud |
ivan`: I'm interested in pulling down the blip.tv stuff and I've got a good pipe here- the latest released version of youtube-dl seems to work fine with blip- any special options to use? |
|
22:38
🔗
|
ivan` |
dashcloud: yes, you need a patch to get the Source/720p content |
|
22:38
🔗
|
ivan` |
sec |
|
22:41
🔗
|
dashcloud |
just ping me with it, and I'll get to it sometime tonight |
|
22:43
🔗
|
SketchCow |
http://archive.org/details/github-downloads-2012-12 |
|
22:48
🔗
|
SketchCow |
Don't do mediatype data, do mediatype software |
|
22:50
🔗
|
alard |
"data" is the default, I think. (I'm not even sure if non-admins can upload anything but the default type, but I haven't tried that.) |
|
22:50
🔗
|
DFJustin |
non-admins can upload anything |
|
22:56
🔗
|
SketchCow |
Anyway, it's all set now |
|
23:01
🔗
|
ivan` |
dashcloud: http://piratepad.net/R18h7lKV1N has the patch and some channels |
|
23:02
🔗
|
ivan` |
it's possible that the clojure channel got specifically targeted for using up too much of their bandwidth or something, but blip.tv still seems careless |
|
23:12
🔗
|
dashcloud |
ivan`: conferences don't seem a terribly good fit for blip as it is now- conferences happen once a year, and blip is geared toward episodic-type content (weekly/biweekly/monthly shows) |
|
23:12
🔗
|
godane |
so i got a account to astraweb.com |
|
23:13
🔗
|
godane |
only got the $10/25gb credit |
|
23:13
🔗
|
dashcloud |
holy crap- that's weird watching text suddenly show up on the page |
|
23:13
🔗
|
godane |
just to test if i can get episode of attack of the show without missing parts |
|
23:20
🔗
|
joepie91 |
dashcloud: heh, you're unfamiliar with etherpad? |
|
23:20
🔗
|
joepie91 |
it's basically multiplayer notepad :P |
|
23:21
🔗
|
dashcloud |
I've never used one before |
|
23:21
🔗
|
dashcloud |
did you run the git-annex kickstarter? |
|
23:22
🔗
|
joepie91 |
git-annex kickstarter? |
|
23:22
🔗
|
godane |
good news everyone |
|
23:22
🔗
|
godane |
i may be able to save more aots |
|
23:35
🔗
|
godane |
also most of my engadget dumps are uploaded |
|
23:35
🔗
|
godane |
will do a 2012 year dump sometime next year |
|
23:41
🔗
|
ivan` |
oh hey http://blip.tv/acquia and its videos got nuked too |
|
23:41
🔗
|
ivan` |
whatever it was. not that I'll have any idea now. |