Time |
Nickname |
Message |
01:41
🔗
|
dashcloud |
well, the answer to that question is yes- which sucked for me |
01:42
🔗
|
dashcloud |
also, apparently you can't do --output-document and --truncate-output together- wget doesn't recognize the combination, but just fails as if you left off the URL |
01:42
🔗
|
dashcloud |
(for wget that is) |
06:28
🔗
|
Nemo_bis |
Uh, still no updates for http://archiveteam.org/index.php?title=Ancestry.com ? Does it have a channel or something? |
06:29
🔗
|
joepie91 |
if not, I propose #gravedigging as channel name |
06:46
🔗
|
joepie91 |
goddamnit |
06:46
🔗
|
joepie91 |
anybody have a copy of HipHop, that music thing? |
06:46
🔗
|
joepie91 |
site is dead, repo is dead |
06:46
🔗
|
joepie91 |
no forks |
06:47
🔗
|
joepie91 |
AUR package refers to source tarball on (now-defunct) site |
06:47
🔗
|
joepie91 |
no matches in wayback |
06:47
🔗
|
joepie91 |
well the site is in wayback, but the tarballs are not |
06:50
🔗
|
joepie91 |
ffs |
06:50
🔗
|
joepie91 |
how can there be no copies of this thing |
06:56
🔗
|
ivan` |
https://github.com/rhumlover/hiphop |
06:57
🔗
|
ivan` |
https://github.com/liamja/hiphop |
06:58
🔗
|
ivan` |
https://encrypted.google.com/search?q=%22forked%20from%20hiphopapp%2Fhiphop%22%20site%3Agithub.com |
07:08
🔗
|
joepie91 |
ivan`: wtf? github search gave me 0 results :| |
07:08
🔗
|
joepie91 |
anyway, thanks |
07:08
🔗
|
yipdw |
ivan` has Github Search+ |
07:09
🔗
|
midas |
he paid for fastlane |
07:09
🔗
|
trs80 |
how did hiphop actually work? youtube? |
07:10
🔗
|
midas |
oh wait, the ISP's dont like that name, you have fastlane, but for faster you need hyperspeedlane |
07:11
🔗
|
joepie91 |
trs80: node-webkit application, searched on youtube, then (presumably) grabbed cover art for it |
07:11
🔗
|
joepie91 |
and displayed the list |
07:11
🔗
|
joepie91 |
as far as I can tell |
07:11
🔗
|
joepie91 |
so nothing too terribly new (because Tomahawk) but still |
07:11
🔗
|
joepie91 |
also Tomahawk is buggy as shit, so yeagh |
07:11
🔗
|
joepie91 |
yeah * |
07:12
🔗
|
yipdw |
the only Tomahawk I know is the BT + Adam K track, which makes that sentence really funny |
07:12
🔗
|
joepie91 |
lol |
07:12
🔗
|
joepie91 |
yipdw: http://www.tomahawk-player.org/ |
07:12
🔗
|
yipdw |
oh |
07:12
🔗
|
trs80 |
yeah, I could see cover art came from itunes/last.fm |
07:13
🔗
|
joepie91 |
Tomahawk is a great idea, but last time I tried to use it, I gave up after an hour of chasing inexplicable bugs |
07:16
🔗
|
yipdw |
huh, Tomahawk looks pretty neat |
10:55
🔗
|
schbirid |
http://www.minuszerodegrees.net/manuals/ |
11:01
🔗
|
midas |
https://archive.org/post/1018340/add-website |
11:01
🔗
|
midas |
should I? yes i should |
11:16
🔗
|
nico |
lol |
11:21
🔗
|
midas |
archivebot for the win right :p |
12:37
🔗
|
Nemo_bis |
Ah it's nothing new. :( http://www.reddit.com/r/DataHoarder/comments/27y8ux/standing_up_40tbs_of_data_for_fun_times/ci601wk |
13:33
🔗
|
ohhdemgir |
midas, what would the archivebot get if you set it off on http://hardforum.com/ |
13:33
🔗
|
ohhdemgir |
off site images from posts included or not? |
13:38
🔗
|
ivan` |
ohhdemgir: embedded images are grabbed from all domains |
13:39
🔗
|
ivan` |
links are not |
13:39
🔗
|
ohhdemgir |
wonder how big hf is |
13:42
🔗
|
ivan` |
Google thinks 3.5M pages |
13:43
🔗
|
ivan` |
# of threads adds up to 949,580 |
13:44
🔗
|
ohhdemgir |
what command does the bot run to do the grab? |
13:46
🔗
|
ivan` |
see https://github.com/ArchiveTeam/ArchiveBot/blob/master/pipeline/pipeline.py#L94 |
13:59
🔗
|
Arkiver2 |
Is the upload speed to the IA very slow today? |
13:59
🔗
|
Arkiver2 |
It is uploading very slow right now |
14:00
🔗
|
ohhdemgir |
Arkiver2, same here, very |
14:00
🔗
|
ohhdemgir |
midas, ^^ |
14:00
🔗
|
ohhdemgir |
random pickups here and there but generally it's been silly slow |
14:00
🔗
|
Arkiver2 |
Will be fixed in a bit of time probably |
14:01
🔗
|
Arkiver2 |
I just had to make sure it wasn't anything with me |
14:01
🔗
|
ohhdemgir |
where are you uploading from? |
14:01
🔗
|
Arkiver2 |
The Netherlands |
14:05
🔗
|
Nemo_bis |
Arkiver2: do a traceroute to s3.us.archive.org? |
14:08
🔗
|
Nemo_bis |
eek, I have 350 ms ping from Amsterdam via HE but 170 ms from Helsinki via ISC. Usual stuff. |
14:11
🔗
|
Arkiver2 |
yeah |
14:11
🔗
|
Arkiver2 |
got the same here |
14:11
🔗
|
Arkiver2 |
but I think it will be fixed in some hours, so I'll just wait |
14:14
🔗
|
Arkiver2 |
Nemo_bis: ah look: https://monitor.archive.org/weathermap/weathermap.html |
14:14
🔗
|
Arkiver2 |
HE uploading is 85 to 100% |
14:15
🔗
|
joepie91 |
damn, congested |
14:15
🔗
|
Co2e35_ |
365ms delay here also netherlands |
14:19
🔗
|
Arkiver2 |
joepie91: could it be some kind of hacking thing? |
14:19
🔗
|
joepie91 |
Arkiver2: what? |
14:19
🔗
|
midas |
Arkiver2: from online,bet is slow, OVH was hitting 11MB/s |
14:20
🔗
|
midas |
s/bet/very |
14:20
🔗
|
Arkiver2 |
joepie91: nah, nevermind |
14:21
🔗
|
Nemo_bis |
Arkiver2: that's downloading |
14:21
🔗
|
Nemo_bis |
but maybe they cap the sum of up/down or something |
14:22
🔗
|
Arkiver2 |
Nemo_bis: yeah, just maybe someone is flooding it with download requests |
14:22
🔗
|
Nemo_bis |
nah, IA via HE.net is always slow for me |
14:23
🔗
|
Nemo_bis |
try to route via Japan if necessary :P to avoid HE |
14:23
🔗
|
joepie91 |
the only statement more insane than that would be "try to route via Australia, to get better speeds" |
14:23
🔗
|
joepie91 |
:P |
14:24
🔗
|
Arkiver2 |
I'll just wait for some hours, speed will be better then hopefully |
14:37
🔗
|
underscor |
Yeah, our HE uplink is generally pretty saturated |
14:37
🔗
|
joepie91 |
hai underscor |
14:37
🔗
|
underscor |
https://monitor.archive.org/pubaccess/graph_2064.html |
14:38
🔗
|
joepie91 |
damn |
14:38
🔗
|
joepie91 |
HE must either love or hate you |
14:38
🔗
|
underscor |
haha |
14:38
🔗
|
Arkiver2 |
underscor: but it will get back to normal in a while right? |
14:38
🔗
|
joepie91 |
depending on what you pay :P |
14:38
🔗
|
underscor |
well, I mean, we just pay for the 10gbit port unmetered, so they agreed to it originally |
14:38
🔗
|
underscor |
haha |
14:39
🔗
|
underscor |
Arkiver2: Nothing looks particularly busier than it usually is; how long has it been slow? |
14:39
🔗
|
Arkiver2 |
not sure |
14:39
🔗
|
Arkiver2 |
a few hours at least |
14:39
🔗
|
Arkiver2 |
I was not home |
14:40
🔗
|
Arkiver2 |
When I came back home it was very slow and according to the total uploaded bytes it was slow for a few hours already |
14:40
🔗
|
joepie91 |
underscor: heh, probably hate then |
14:40
🔗
|
joepie91 |
"10gbps unmetered? whatever, they'll never use all of it anyway" |
14:40
🔗
|
Arkiver2 |
yesterday it was around 8 to 10 times as fast |
14:40
🔗
|
joepie91 |
".... oh...." |
14:40
🔗
|
underscor |
:D |
14:40
🔗
|
underscor |
that's basically what happened with isc, haha |
14:40
🔗
|
joepie91 |
lol, really? |
14:40
🔗
|
underscor |
although in the end it netted them some really sweet peering agreements |
14:41
🔗
|
underscor |
yeah, we like, quadrupled their internet footprint XD |
14:41
🔗
|
joepie91 |
hahaha |
14:42
🔗
|
underscor |
Arkiver2: Looks like the intake racks are a bit busy today (ia9025) which is probably why it's slow at the moment |
14:42
🔗
|
underscor |
That's the only thing I see that explains it; catalog seems to be operating normally and the link saturation is in the other direction |
14:42
🔗
|
Arkiver2 |
underscor: ah, thank you! |
14:43
🔗
|
underscor |
If you give me your IP I can do a traceroute back to see what route your return traffic is taking, also |
14:43
🔗
|
Arkiver2 |
hmm |
14:43
🔗
|
Arkiver2 |
I will be back here in 2 hours ok? |
14:43
🔗
|
underscor |
(or one near your netblock) |
14:43
🔗
|
underscor |
sure, just pm me :) |
14:43
🔗
|
Arkiver2 |
thanks underscor! |
14:44
🔗
|
Arkiver2 |
see you guys later |
14:46
🔗
|
joepie91 |
sorry, this live CD has a very dubious web browser |
14:46
🔗
|
joepie91 |
underscor: what did you say>? |
14:47
🔗
|
underscor |
lol |
14:47
🔗
|
underscor |
<underscor> Arkiver2: Looks like the intake racks are a bit busy today (ia9025) which is probably why it's slow at the moment |
14:47
🔗
|
underscor |
<underscor> If you give me your IP I can do a traceroute back to see what route your return traffic is taking, also |
14:47
🔗
|
underscor |
was basically it |
14:47
🔗
|
joepie91 |
I se |
14:47
🔗
|
joepie91 |
see * |
14:48
🔗
|
joepie91 |
also, boot repair just finished |
14:48
🔗
|
joepie91 |
so, time to reboot, and probably watch it not boot |
14:48
🔗
|
underscor |
gl;hf |
14:48
🔗
|
joepie91 |
not going to be much fun here |
14:48
🔗
|
joepie91 |
lol |
14:48
🔗
|
joepie91 |
I hate suicidal bootloeaders |
14:48
🔗
|
underscor |
haha |
14:48
🔗
|
joepie91 |
bootloaders * |
14:48
🔗
|
joepie91 |
:( |
14:48
🔗
|
underscor |
boatleaders |
14:49
🔗
|
underscor |
buttloafers |
14:49
🔗
|
joepie91 |
okay |
14:49
🔗
|
joepie91 |
time to reboot |
14:50
🔗
|
joepie91 |
back in anywhere between 5 minutes best case to 5 days worst case |
14:50
🔗
|
joepie91 |
:P |
14:50
🔗
|
underscor |
haha |
14:53
🔗
|
Co2e35_ |
guys whats cacti? |
14:54
🔗
|
underscor |
a tree based monitoring/graphing application named after the infamous desert plant |
14:55
🔗
|
underscor |
For router graphs and stuff |
14:58
🔗
|
joepie91 |
well, that was quite clearly a failure. |
14:58
🔗
|
SN4T14 |
Co2e35_, plural of cactus. |
15:15
🔗
|
Co2e35_ |
thanks @underscor |
15:25
🔗
|
Co2e35_ |
joepie91 u are dutch right? |
15:25
🔗
|
joepie91 |
Co2e35_: correct |
15:25
🔗
|
Co2e35_ |
great can u check on #angerthehyve? |
15:26
🔗
|
joepie91 |
Co2e35_: um, what needs checking? |
15:26
🔗
|
joepie91 |
because I'm kinda trying to recover my bootloader :P |
15:27
🔗
|
Co2e35_ |
for what os? |
15:27
🔗
|
joepie91 |
Co2e35_: openSUSE |
15:27
🔗
|
joepie91 |
my GRUB killed itself (again) |
15:28
🔗
|
Co2e35_ |
cant u run it on 2 drives and copying the working one to the broken one sorry my programmer english is horrible |
15:28
🔗
|
Co2e35_ |
xD |
15:29
🔗
|
joepie91 |
:P |
15:29
🔗
|
joepie91 |
Co2e35_: that's implying I have two drives |
15:29
🔗
|
joepie91 |
I only have one usable HDD |
15:29
🔗
|
Co2e35_ |
virtual drive? |
15:29
🔗
|
joepie91 |
my old HDD is 40GB |
15:29
🔗
|
joepie91 |
and the setup on that is -definitely- broken |
15:29
🔗
|
Co2e35_ |
let me guess, too small :P |
15:29
🔗
|
joepie91 |
well yes, and it has a broken opensuse and bootloader on it |
15:29
🔗
|
joepie91 |
let's just say that I was a bit too careless while cloning my disk |
15:30
🔗
|
joepie91 |
when I got a new HD |
15:30
🔗
|
joepie91 |
HDD * |
15:30
🔗
|
Co2e35_ |
xD |
15:30
🔗
|
joepie91 |
and since then, I've had no end of problems with bootloaders :p |
15:30
🔗
|
joepie91 |
but yeah, time to reboot and see if my stuff works now |
15:30
🔗
|
joepie91 |
so, brb |
15:30
🔗
|
Co2e35_ |
k gl |
15:47
🔗
|
Co2e35_ |
keep getting |
15:47
🔗
|
Co2e35_ |
https://monitor.archive.org/weathermap/weathermap.html |
15:47
🔗
|
Co2e35_ |
no item received |
15:49
🔗
|
yipdw |
Co2e35_: for hyves, that project ended a long time ago |
15:52
🔗
|
Co2e35_ |
no for project justin |
15:53
🔗
|
Co2e35_ |
about hyves i was wondering if i can get a specific profile in the archive |
15:57
🔗
|
yipdw |
Co2e35_: yeah, there's no items currently out for justin.tv either |
16:01
🔗
|
Co2e35_ |
ok |
17:28
🔗
|
schbirid |
http://blog.earbits.com/online_radio/earbits-will-be-shutting-down-june-16th/ |
17:28
🔗
|
schbirid |
4 days notice |
17:30
🔗
|
Co2e35_ |
and did it work joepie? |
17:34
🔗
|
schbirid |
started a wget from http://www.earbits.com/artists/ |
17:35
🔗
|
schbirid |
pretty slow |
17:41
🔗
|
schbirid |
looks like indie music |
17:41
🔗
|
schbirid |
worth trying to save music? |
17:48
🔗
|
joepie91 |
Co2e35_: oh, yes, it did |
17:48
🔗
|
joepie91 |
after approximately 4290938 attempts |
17:48
🔗
|
joepie91 |
and then figuring out that my kernel was half-installed, which I suppose is a pretty valid reason for not booting |
17:48
🔗
|
schbirid |
it just threw json with the mp3 urls at me but only for a minute or so |
17:48
🔗
|
joepie91 |
but further discussion of that belongs in #archiveteam-bs ;) |
17:49
🔗
|
schbirid |
streaming is served like http://media-http-prod-0.earbits.com/0be2946d74dbf54b808ce58333bf1bcc.mp3 |
17:49
🔗
|
schbirid |
that's for http://www.earbits.com/collections/indie-rock/tracks/50243b8680eb5b00020015f6 |
17:57
🔗
|
schbirid |
ok there is some kind of api, but it requires a "Cookie: client_token" |
18:01
🔗
|
schbirid |
on it |
18:02
🔗
|
schbirid |
is there a good json parser on debian? i need to extract specific fields |
18:02
🔗
|
schbirid |
grepping json_pp output works but ... |
18:03
🔗
|
ivan` |
jq? |
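For readers following the "extract specific fields" question: a minimal Python sketch of the same idea, as an alternative to grepping `json_pp` output. The JSON shape, the `tracks` list, and the `media_url` field are all hypothetical, since the log never shows the real Earbits response.

```python
import json

# Hypothetical response shape; the real Earbits JSON is not shown in the log.
SAMPLE = """
{"tracks": [
  {"id": "a1", "name": "Song A", "media_url": "http://example.invalid/a1.mp3"},
  {"id": "b2", "name": "Song B", "media_url": "http://example.invalid/b2.mp3"},
  {"id": "c3", "name": "Song C"}
]}
"""


def extract_field(document, field):
    """Return the value of `field` from every track object that has it."""
    data = json.loads(document)
    return [track[field] for track in data.get("tracks", []) if field in track]


if __name__ == "__main__":
    for url in extract_field(SAMPLE, "media_url"):
        print(url)
```

Parsing the document and indexing by key avoids the fragility of line-based grepping when the JSON is reformatted.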
18:08
🔗
|
is4 |
schbirid: beat me to it |
18:09
🔗
|
schbirid |
i am writing a script to download tracks per artist |
18:09
🔗
|
schbirid |
my warc will be erroneous, can't hurt to redo it :) |
18:20
🔗
|
schbirid |
sometimes i hate bash |
18:21
🔗
|
schbirid |
https://gist.github.com/SpiritQuaddicted/ebc149573e68ef2da33a |
18:21
🔗
|
schbirid |
i need to have quotes for curl's -H parameters in the end but of course they get "stripped" |
18:22
🔗
|
schbirid |
there are variables inside, so no ' ' |
18:22
🔗
|
* |
schbirid feels like a noob |
18:23
🔗
|
schbirid |
the "while read trackid" is failing, before it is fine |
18:23
🔗
|
yipdw |
use python, save your lifespan |
18:23
🔗
|
yipdw |
there are many reasons why the Warrior code moved from bash to python :P |
18:24
🔗
|
schbirid |
:) |
18:24
🔗
|
schbirid |
i use "\", works alright |
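The header-quoting problem discussed above mostly disappears when the request is built without a shell in between. A minimal sketch in Python, assuming the cookie has the form `client_token=<value>` (the log only names the header, not its exact format); the URL and token below are placeholders, not real Earbits values.

```python
import urllib.request


def build_request(url, client_token):
    """Build a GET request carrying the client_token cookie.

    The header value is an ordinary Python string, so variables can be
    interpolated freely and no shell quoting rules apply.
    """
    headers = {"Cookie": f"client_token={client_token}"}
    return urllib.request.Request(url, headers=headers)


if __name__ == "__main__":
    # Placeholder URL and token, for illustration only.
    req = build_request("http://example.invalid/api/tracks", "TOKEN-PLACEHOLDER")
    print(req.get_header("Cookie"))
```

Passing the built request to `urllib.request.urlopen` would send it; that call is left out here so the sketch runs without network access.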
18:34
🔗
|
schbirid |
i give up, can't find the proper trackids to query the API with. they are not the ones in the json |
18:40
🔗
|
schbirid |
probably some md5 |
18:49
🔗
|
Arkiver2 |
underscor: the upload speed is slowly getting better now... |
18:50
🔗
|
schbirid |
if someone wants to help with earbits: https://gist.github.com/SpiritQuaddicted/af2e2479aec6aa4eda25 |
18:53
🔗
|
SN4T14 |
schbirid, could you give me an example link with an ID? |
18:53
🔗
|
SN4T14 |
And do they change if you play the same track twice? |
18:54
🔗
|
schbirid |
good question |
18:54
🔗
|
schbirid |
pmed you a URL |
18:55
🔗
|
SN4T14 |
Will take a look at it in a bit |
18:56
🔗
|
schbirid |
hm, can't re-trigger the website calling the API. it seems to remember what it looked up |
18:56
🔗
|
Arkiver2 |
the songs are just downloadable |
18:56
🔗
|
Arkiver2 |
here is a list with the songs: |
18:56
🔗
|
Arkiver2 |
http://media-http-prod-0.earbits.com/ |
18:56
🔗
|
schbirid |
maybe https://chrome.google.com/webstore/detail/earbits-radio-free-music/mgkjffcdjblaipglnmhanakilfbniihj/details will help |
18:57
🔗
|
schbirid |
lol |
18:57
🔗
|
schbirid |
dumbest trick in the book, thanks |
18:57
🔗
|
joepie91 |
bahaha |
18:57
🔗
|
Arkiver2 |
:P |
18:57
🔗
|
joepie91 |
been a while since I've seen a site do that one wrong |
18:57
🔗
|
Arkiver2 |
glad I could help |
18:57
🔗
|
Arkiver2 |
lol yes |
18:57
🔗
|
joepie91 |
amazed |
18:57
🔗
|
joepie91 |
misconfigured s3, probably |
18:57
🔗
|
Arkiver2 |
they aren't the best at security... |
18:57
🔗
|
schbirid |
i love those |
18:57
🔗
|
joepie91 |
:P |
18:57
🔗
|
joepie91 |
schbirid: goodies! |
18:57
🔗
|
Arkiver2 |
hmm |
18:58
🔗
|
Arkiver2 |
is there only http://media-http-prod-0.earbits.com/ |
18:58
🔗
|
Arkiver2 |
or also others? |
18:58
🔗
|
Arkiver2 |
there is no http://media-http-prod-1.earbits.com/ |
18:58
🔗
|
schbirid |
only saw that one so far |
18:58
🔗
|
Arkiver2 |
hmm |
18:58
🔗
|
Arkiver2 |
going to create a list now of the mp3 |
18:59
🔗
|
schbirid |
hm, just 800 files |
18:59
🔗
|
joepie91 |
https://www.google.com/search?q=%22media-http-prod-*.earbits.com%22&oq=%22media-http-prod-*.earbits.com%22&aqs=chrome..69i57.2247j0j4&sourceid=chrome&es_sm=0&ie=UTF-8 |
18:59
🔗
|
joepie91 |
helpful google is helpful |
19:00
🔗
|
Arkiver2 |
there are also log files in the list |
19:00
🔗
|
Arkiver2 |
http://media-http-prod-0.earbits.com/003b0478d0d4465f2a3da911e8b9f2ee.log |
19:00
🔗
|
Arkiver2 |
not sure if it's useful |
19:00
🔗
|
Arkiver2 |
but I thought I'd post it |
19:01
🔗
|
joepie91 |
https://www.google.com/search?q=site%3A%22*.earbits.com%22+-site%3Awww.earbits.com+-site%3Aemail.earbits.com+-site%3Ablog.earbits.com+-site%3Ahelp.earbits.com&oq=site%3A%22*.earbits.com%22+-site%3Awww.earbits.com+-site%3Aemail.earbits.com+-site%3Ablog.earbits.com+-site%3Ahelp.earbits.com&aqs=chrome..69i57j69i58.647j0j9&sourceid=chrome&es_sm=0&ie=UTF-8 |
19:01
🔗
|
joepie91 |
not much interesting otherwise |
19:02
🔗
|
schbirid |
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html "This implementation of the GET operation returns some or all (up to 1000) of the objects in a bucket. You can use the request parameters as selection criteria to return a subset of the objects in a bucket." |
19:02
🔗
|
schbirid |
that makes sense |
19:02
🔗
|
schbirid |
so how do we query that bucket for more |
19:03
🔗
|
Smiley |
hmmm it'll be something like Start=1001 or something |
19:03
🔗
|
Smiley |
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html |
19:03
🔗
|
Smiley |
errr maker |
19:04
🔗
|
Smiley |
Specifies the key to start with when listing objects in a bucket. Amazon S3 lists objects in alphabetical order. |
19:04
🔗
|
Smiley |
marker*; marker |
19:04
🔗
|
Smiley |
Type: String |
19:07
🔗
|
Arkiver2 |
oh shit |
19:07
🔗
|
joepie91 |
ALL THE S3 |
19:07
🔗
|
Arkiver2 |
http://media-http-prod-0.earbits.com/ doesn't have the full list |
19:07
🔗
|
Arkiver2 |
wait |
19:08
🔗
|
schbirid |
someone needs an aws account, then we should be able to query it with s3cmd |
19:08
🔗
|
Arkiver2 |
here |
19:08
🔗
|
Arkiver2 |
http://pastebin.com/wwQgpWmw |
19:08
🔗
|
schbirid |
eg s3cmd ls s3://earbits-media-production-0 |
19:08
🔗
|
Arkiver2 |
it goes up to 0078be8d5ad0a3189be4bfbb38422082.log |
19:08
🔗
|
schbirid |
that's just the 1000 files, yeah :) |
19:09
🔗
|
Arkiver2 |
songs like http://media-http-prod-0.earbits.com/0be2946d74dbf54b808ce58333bf1bcc.mp3 aren't there |
19:09
🔗
|
Arkiver2 |
schbirid: do you know how to get the full list? |
19:09
🔗
|
schbirid |
scroll up! |
19:09
🔗
|
joepie91 |
there's flacs and wavs there! |
19:09
🔗
|
Arkiver2 |
schbirid: sorry... I see |
19:09
🔗
|
schbirid |
how much will it cost them if we do this :( |
19:10
🔗
|
Arkiver2 |
why do we care about that? They only said their website is going away 4 days before it is going away... |
19:11
🔗
|
joepie91 |
not our problem |
19:11
🔗
|
joepie91 |
they can save costs by shipping us a HDD |
19:11
🔗
|
joepie91 |
lol |
19:11
🔗
|
schbirid |
i am a nice person |
19:11
🔗
|
schbirid |
that's why |
19:11
🔗
|
Arkiver2 |
lol |
19:12
🔗
|
Arkiver2 |
so, any solution for the max keys = 1000 problem? |
19:12
🔗
|
* |
joepie91 points at the sign on the front door saying "Archive Team, Band of Rogue Archivists" |
19:13
🔗
|
schbirid |
ok should not be much money if it is say 300k tracks at 5mb each. < 200usd unless i calced wrong |
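The estimate above can be reproduced with a few lines of arithmetic. The track count and size are the figures from the message; the per-GB transfer price is an assumption chosen for illustration and is not stated anywhere in the log.

```python
TRACKS = 300_000          # from the message above
MB_PER_TRACK = 5          # from the message above
USD_PER_GB = 0.12         # ASSUMED data-transfer price, not from the log

total_gb = TRACKS * MB_PER_TRACK / 1000   # decimal gigabytes
cost_usd = total_gb * USD_PER_GB

if __name__ == "__main__":
    print(f"{total_gb:.0f} GB transferred, roughly ${cost_usd:.0f}")
```

Under that assumed price the total comes to about 1.5 TB and stays below the 200 USD figure mentioned; a different price per GB scales the result linearly.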
19:13
🔗
|
Smiley |
you need to just set the marker to 1001 and it'll return 1001-2001 |
19:14
🔗
|
schbirid |
Smiley: what marker:P |
19:14
🔗
|
Smiley |
schbirid: not sure exactly how the GET works |
19:14
🔗
|
Smiley |
but it has an attribute called marker, which you can set to start the index at an arbitrary point. |
19:14
🔗
|
schbirid |
Arkiver2: we need someone with aws credentials i think. i dont want to get my personal info involved so i wont do that. then s3cmd or similar tools will allow querying |
19:14
🔗
|
Smiley |
as the site says D: |
19:14
🔗
|
schbirid |
yeah, via s3 |
19:15
🔗
|
Arkiver2 |
hmm |
19:19
🔗
|
Arkiver2 |
doing a crawl for what we have already http://pastebin.com/TpbTuTEH Crawling with Heritrix. |
19:19
🔗
|
Arkiver2 |
discovering more urls!!!! |
19:19
🔗
|
Arkiver2 |
http://cdn-1.earbits.com/ |
19:19
🔗
|
Arkiver2 |
the same |
19:21
🔗
|
schbirid |
yuck, private information in there |
19:22
🔗
|
Arkiver2 |
oh shit |
19:22
🔗
|
Arkiver2 |
yeah |
19:22
🔗
|
schbirid |
this is something one should tell them :( |
19:22
🔗
|
Arkiver2 |
hmm |
19:23
🔗
|
Arkiver2 |
Maybe it's weird, but shouldn't we first archive it? Because if they know of it they will also remove access to that url with all the songs |
19:24
🔗
|
yipdw |
oh hahaha |
19:24
🔗
|
yipdw |
that's an open S3 bucket |
19:24
🔗
|
yipdw |
gg earbits |
19:24
🔗
|
schbirid |
we should totally try to archive it first |
19:24
🔗
|
pft |
wow, yeah |
19:24
🔗
|
pft |
names and email addresses |
19:24
🔗
|
pft |
hurrr |
19:25
🔗
|
Arkiver2 |
this one is not accessible cdn-100.earbits.com |
19:25
🔗
|
pft |
archive it but flag it as dark |
19:25
🔗
|
Arkiver2 |
while it does have some urls |
19:25
🔗
|
schbirid |
no one has aws credentials? |
19:26
🔗
|
yipdw |
I have AWS accounts, but you don't need them |
19:26
🔗
|
yipdw |
the bucket is public |
19:26
🔗
|
schbirid |
how can we get the full list then? |
19:26
🔗
|
schbirid |
i don't know how to query without access credentials |
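A sketch of how a publicly listable bucket can be paged through without credentials, based on the `marker` parameter quoted from the S3 documentation earlier in the log. Note that in the S3 list API the marker is a key name (the listing resumes after that key), not a numeric offset, so the usual approach is to pass the last key of each page as the next marker. The function names here are illustrative, and whether this particular bucket permits it is not confirmed by the log.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by S3 ListBucketResult documents.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def parse_listing(xml_text):
    """Return (keys, is_truncated) from one ListBucketResult page."""
    root = ET.fromstring(xml_text)
    keys = [c.findtext(f"{S3_NS}Key") for c in root.iter(f"{S3_NS}Contents")]
    truncated = root.findtext(f"{S3_NS}IsTruncated") == "true"
    return keys, truncated


def http_get(url):
    """Fetch a URL anonymously and return the body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def list_all_keys(bucket_url, fetch=http_get):
    """Yield every key in the bucket, one page (up to 1000 keys) at a time."""
    marker = ""
    while True:
        url = bucket_url
        if marker:
            url += "?" + urllib.parse.urlencode({"marker": marker})
        keys, truncated = parse_listing(fetch(url))
        yield from keys
        if not truncated or not keys:
            break
        # Resume after the last key seen on this page.
        marker = keys[-1]
```

Calling `list_all_keys("http://media-http-prod-0.earbits.com/")` would, under these assumptions, walk past the first 1000 entries; the `fetch` parameter exists only so the paging logic can be exercised without network access.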
19:31
🔗
|
joepie91 |
wow, this is so bad |
19:34
🔗
|
schbirid |
i am only getting ~15MBit/s from those buckets anyways .( |
19:39
🔗
|
Arkiver2 |
any solution for the 1000 max links? |
19:40
🔗
|
joepie91 |
Arkiver2: the solution has been mentioned several times, but won't get anywhere until somebody uses an Amazon account to actually do it |
19:41
🔗
|
midas |
do we have chan yet? |
19:41
🔗
|
midas |
or too small? |
19:42
🔗
|
Arkiver2 |
what about #earbite ? |
19:43
🔗
|
midas |
im down with that |
19:44
🔗
|
Arkiver2 |
ok good I'm in it |
19:44
🔗
|
schbirid |
ok |
20:46
🔗
|
schbirid |
yipdw: if you know more about accessing public buckets, please help us in #earbite |
21:56
🔗
|
Arkiver2 |
underscor: uploadspeed is back to normal again, thanks man! |
22:35
🔗
|
dashcloud |
if I want to exclude club.angelfire.com while still grabbing everything else in the angelfire domain, should I use --reject-regex=club.angelfire.com* or --exclude-domains=club.angelfire.com ? |
22:45
🔗
|
dashcloud |
ddrescue is out with a new release- changes to the copy algorithm and trimming: http://lists.gnu.org/archive/html/info-gnu/2014-06/msg00009.html |
22:45
🔗
|
Smiley |
oo |
22:48
🔗
|
SN4T14 |
dashcloud, about the exclude thing, they should both work, but I'd use exclude-domains because it's simpler. |
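One reason the domain option is simpler: the `--reject-regex` value is a regular expression, not a glob, so `club.angelfire.com*` means "any character" at each dot and "zero or more m" at the end. The sketch below uses Python's `re` as a stand-in for wget's regex engine and assumes the pattern is searched for anywhere in the full URL; the example URLs are made up.

```python
import re

# The pattern as typed in the question, and a stricter escaped variant.
as_written = re.compile(r"club.angelfire.com*")
escaped = re.compile(r"club\.angelfire\.com")

# Made-up URLs for illustration.
URLS = [
    "http://club.angelfire.com/page.html",
    "http://www.angelfire.com/clubXangelfire-co/index.html",
    "http://www.angelfire.com/music/index.html",
]


def rejected(pattern, url):
    """True if the pattern matches anywhere in the URL."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    for url in URLS:
        print(url, rejected(as_written, url), rejected(escaped, url))
```

The unescaped pattern still rejects the intended host, but it can also reject unrelated URLs that merely resemble it, which the escaped form and `--exclude-domains` avoid.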
22:49
🔗
|
dashcloud |
thanks! |
23:44
🔗
|
Nemo_bis |
underscor: MOAR cacti links |