Time |
Nickname |
Message |
00:03
🔗
|
JW_work |
eddiedean: btw, you can get unmodified versions of wayback pages by suffixing id_ to the date part (see http://www.archiveteam.org/index.php?title=Internet_Archive#Downloading_from_archive.org ) |
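A minimal sketch of the id_ trick JW_work describes, assuming the usual web.archive.org/web/ replay prefix. The function name, timestamp and example URL are hypothetical, chosen only for illustration.

```python
def raw_wayback_url(timestamp: str, original_url: str) -> str:
    """Build a Wayback Machine URL that returns the capture unmodified."""
    # Suffixing "id_" to the date part asks for the original bytes,
    # without the link rewriting and toolbar the normal replay adds.
    return f"https://web.archive.org/web/{timestamp}id_/{original_url}"


if __name__ == "__main__":
    # Hypothetical capture date and page.
    print(raw_wayback_url("20160309000000", "http://example.com/"))
```

Whether a capture exists for a given timestamp is a separate question; this only builds the URL.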
00:03
🔗
|
eddiedean |
JW_work yes, that is what my script does. |
00:04
🔗
|
JW_work |
ah, good — it wasn't clear if you were trying to retroactively clean up the modified versions, or were just downloading the unmodified ones. |
00:05
🔗
|
eddiedean |
I may have explained myself badly, English isn't my first language :P |
00:05
🔗
|
JW_work |
and the github link doesn't seem to go anywhere. :-/ |
00:06
🔗
|
eddiedean |
It is fun to monitor the tool, so I can get performance data. Some guy is retrieving library.gnome.org, 32K of files huhuhu |
00:07
🔗
|
eddiedean |
JW_work haven't published it yet, the icons in the footer are WIP right now :P |
00:07
🔗
|
JW_work |
heh |
00:07
🔗
|
JW_work |
I tried python.org and ludios.org — both 504'ed. |
00:08
🔗
|
eddiedean |
When? |
00:08
🔗
|
JW_work |
just now |
00:08
🔗
|
eddiedean |
Hmm, let me check :) |
00:09
🔗
|
|
Famicoma1 has joined #archiveteam |
00:10
🔗
|
eddiedean |
It might be the API. 1 min, I'm going to test something |
00:20
🔗
|
eddiedean |
JW_work retry now :). It was some Google Cloud issue, now it is solved :) |
00:23
🔗
|
JW_work |
heh, ok |
00:24
🔗
|
JW_work |
hm, the wayback machine appears to be having some maintenance issues right now |
00:24
🔗
|
xmc |
yeah they just rebooted it like 20 minutes ago |
00:24
🔗
|
xmc |
reeeelllllaaaaaxxxxx |
00:25
🔗
|
JW_work |
eddiedean: so, I'd think people would be … reticent … about providing you with email addresses, rather than you just, you know, providing a normal download link. |
00:26
🔗
|
eddiedean |
They can use a disposable one. But I could provide a temp link that shows the download link when it is ready. The thing is that each download request goes to a queue, and it takes some time to be processed because each snapshot has different size |
00:27
🔗
|
JW_work |
I think a download link would be preferable. |
00:28
🔗
|
eddiedean |
Good idea. I'll add a download link option too, so people can choose. |
00:30
🔗
|
eddiedean |
I guess that the issues that I'm having right now are because archive.org servers have been rebooted? |
00:31
🔗
|
eddiedean |
Because I've added some caching to this, and a previously requested URL can be shown, but not a new one. |
00:51
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
00:54
🔗
|
|
dashcloud has joined #archiveteam |
00:56
🔗
|
|
chfoo0 is now known as chfoo |
01:10
🔗
|
BnA-Rob1n |
Anyone here who can add some more items for fotolog? |
01:29
🔗
|
|
bai_ is now known as bai |
01:30
🔗
|
|
eddiedean has quit IRC (Ping timeout: 260 seconds) |
01:36
🔗
|
|
pft has joined #archiveteam |
01:38
🔗
|
|
JesseW has joined #archiveteam |
01:40
🔗
|
|
dxrt_ is now known as dxrt |
01:48
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
01:48
🔗
|
|
hawc145 is now known as HCross |
01:52
🔗
|
|
dashcloud has joined #archiveteam |
01:57
🔗
|
|
Guest_ has joined #archiveteam |
02:36
🔗
|
|
tomwsmf-a has joined #archiveteam |
02:36
🔗
|
|
arkiver has quit IRC (Ping timeout: 260 seconds) |
02:37
🔗
|
|
Guest_ has quit IRC (Quit: My MacBook Pro has gone to sleep. ZZZzzz…) |
02:43
🔗
|
|
dxrt has quit IRC (Read error: Operation timed out) |
02:43
🔗
|
|
dxrt has joined #archiveteam |
02:43
🔗
|
|
dxrt- sets mode: +o dxrt |
02:44
🔗
|
|
Peetz0r_ has quit IRC (Read error: Operation timed out) |
02:44
🔗
|
|
Jonimus has quit IRC (Read error: Operation timed out) |
02:44
🔗
|
|
mhazinsk has quit IRC (Read error: Operation timed out) |
02:44
🔗
|
|
bai has quit IRC (Read error: Operation timed out) |
02:44
🔗
|
|
bai has joined #archiveteam |
02:45
🔗
|
|
aMunster has quit IRC (Read error: Operation timed out) |
02:45
🔗
|
|
vegbrasil has quit IRC (Read error: Operation timed out) |
02:46
🔗
|
|
beardicus has quit IRC (Read error: Operation timed out) |
02:47
🔗
|
|
maseck has quit IRC (Read error: Operation timed out) |
02:47
🔗
|
|
toad1 has quit IRC (Read error: Operation timed out) |
02:48
🔗
|
|
phuz has quit IRC (Read error: Operation timed out) |
02:48
🔗
|
|
SimpBrai1 has quit IRC (Read error: Operation timed out) |
02:49
🔗
|
|
is- has joined #archiveteam |
02:49
🔗
|
|
jmad980 has joined #archiveteam |
02:49
🔗
|
|
closure has quit IRC (Ping timeout: 633 seconds) |
02:50
🔗
|
|
jmad980_ has quit IRC (Read error: Operation timed out) |
02:50
🔗
|
|
Peetz0r has joined #archiveteam |
02:50
🔗
|
|
nwf has quit IRC (Read error: Operation timed out) |
02:51
🔗
|
|
maseck has joined #archiveteam |
02:52
🔗
|
|
MMovie has quit IRC (Ping timeout: 633 seconds) |
02:52
🔗
|
|
is-_ has quit IRC (Read error: Connection reset by peer) |
02:54
🔗
|
|
phuzion has joined #archiveteam |
02:56
🔗
|
|
arkiver has joined #archiveteam |
02:57
🔗
|
|
SimpBrai1 has joined #archiveteam |
03:07
🔗
|
|
VADemon has quit IRC (Quit: left4dead) |
03:12
🔗
|
|
toad1 has joined #archiveteam |
03:14
🔗
|
|
beardicus has joined #archiveteam |
03:14
🔗
|
|
vegbrasil has joined #archiveteam |
03:14
🔗
|
|
winr4r has quit IRC (Ping timeout: 260 seconds) |
03:14
🔗
|
|
winr4r has joined #archiveteam |
03:15
🔗
|
|
closure has joined #archiveteam |
03:30
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
03:31
🔗
|
|
aMunster has joined #archiveteam |
03:35
🔗
|
|
MMovie has joined #archiveteam |
03:58
🔗
|
SketchCow |
* rebooted |
03:58
🔗
|
SketchCow |
* rsync back up |
03:58
🔗
|
|
bwn has quit IRC (Ping timeout: 492 seconds) |
04:03
🔗
|
ErkDog |
nice |
04:03
🔗
|
ErkDog |
but 100k/sec still :( |
04:05
🔗
|
SketchCow |
Is it. |
04:06
🔗
|
ErkDog |
yeah I got failed rsync messages |
04:06
🔗
|
ErkDog |
then it connected back |
04:06
🔗
|
ErkDog |
and 70-100 K/sec |
04:06
🔗
|
ErkDog |
http://puu.sh/nzYJA/b677b392d7.png |
04:06
🔗
|
SketchCow |
Well, let's assume it's you, and that's why mommy and daddy aren't together. |
04:06
🔗
|
SketchCow |
Guess we'll have to wait when I'm onsite. |
04:06
🔗
|
SketchCow |
Unless you want to be an extra credit traceroute noodle |
04:07
🔗
|
ErkDog |
sure |
04:08
🔗
|
ErkDog |
http://paste.nerds.io/cumoqoloqi.avrasm |
04:08
🔗
|
ErkDog |
goku.ecansol.net will get you back this way |
04:09
🔗
|
dxrt |
It's still buggered from France, 30KB/s. |
04:11
🔗
|
|
nwf has joined #archiveteam |
04:17
🔗
|
|
Jonimus has joined #archiveteam |
04:19
🔗
|
|
bwn has joined #archiveteam |
04:20
🔗
|
|
bwn has quit IRC (Client Quit) |
04:23
🔗
|
|
mhazinsk has joined #archiveteam |
04:31
🔗
|
FalconK |
if anyone wants, the bits I am running are https://github.com/falconkirtaran/ArchiveBot |
04:31
🔗
|
|
xXx_ndidd has quit IRC (Read error: Connection reset by peer) |
04:31
🔗
|
FalconK |
my pipeline is (slowly) emptying its buffer out. |
04:31
🔗
|
FalconK |
17GB free now. |
04:34
🔗
|
|
logan2 has joined #archiveteam |
04:36
🔗
|
|
logan has quit IRC (Read error: Operation timed out) |
04:36
🔗
|
|
khaoohs_ has joined #archiveteam |
04:39
🔗
|
|
Froggypwn has quit IRC (Read error: Operation timed out) |
04:39
🔗
|
|
vtyl has joined #archiveteam |
04:40
🔗
|
|
Froggypwn has joined #archiveteam |
04:40
🔗
|
|
lytv has quit IRC (Read error: Operation timed out) |
04:41
🔗
|
|
khaoohs has quit IRC (Read error: Operation timed out) |
04:48
🔗
|
|
Simpbrai_ has quit IRC (Read error: Operation timed out) |
04:48
🔗
|
|
maseck has quit IRC (Read error: Connection reset by peer) |
04:48
🔗
|
|
Simpbrai_ has joined #archiveteam |
04:49
🔗
|
|
maseck has joined #archiveteam |
05:01
🔗
|
|
tomwsmf-a has quit IRC (Read error: Operation timed out) |
05:11
🔗
|
|
acridAxid has quit IRC (marauder) |
05:12
🔗
|
|
Sk1d has quit IRC (Ping timeout: 250 seconds) |
05:12
🔗
|
|
acridAxid has joined #archiveteam |
05:15
🔗
|
ErkDog |
How is a message board with 200,000 threads considered "small" lol |
05:19
🔗
|
|
Sk1d has joined #archiveteam |
05:45
🔗
|
|
logan2 has quit IRC (Read error: Operation timed out) |
05:46
🔗
|
|
logan has joined #archiveteam |
06:16
🔗
|
|
Famicoma1 has quit IRC (Ping timeout: 260 seconds) |
06:25
🔗
|
|
WinterFox has joined #archiveteam |
06:28
🔗
|
|
JesseW has joined #archiveteam |
07:20
🔗
|
|
metalcamp has joined #archiveteam |
07:22
🔗
|
|
davidar has joined #archiveteam |
07:23
🔗
|
davidar |
ping arkiver |
07:24
🔗
|
davidar |
or Fletcher |
07:31
🔗
|
JesseW |
? |
07:33
🔗
|
davidar |
hey JesseW |
07:34
🔗
|
davidar |
just need to clarify some more details on how this pdf crawl will work |
07:34
🔗
|
davidar |
arkiver said the pdfs can be delivered to an rsync target |
07:34
🔗
|
davidar |
are we also able to attach original URLs to those pdfs? |
07:37
🔗
|
|
Famicoma1 has joined #archiveteam |
07:38
🔗
|
JesseW |
we certainly could, yeah |
07:39
🔗
|
JesseW |
simplest way is probably to make the filename a normalized form of the URL. |
07:40
🔗
|
JesseW |
i.e. if the URL is http://forge.fh-potsdam.de/~IFLA/INSPEL/96-1riea.pdf the resulting file name would be: forge.fh-potsdam.de_~IFLA_INSPEL_96-1riea.pdf |
07:41
🔗
|
JesseW |
(with the http:// removed and slashes converted into underscores) |
07:42
🔗
|
JesseW |
If you wanted to normalize them more reversibly, you could convert the slashes to _SLASH_ instead. |
07:42
🔗
|
JesseW |
davidar: |
07:46
🔗
|
davidar |
that's true |
07:47
🔗
|
davidar |
could we also decide on identifiers for each URL beforehand, and then just use that? |
07:47
🔗
|
davidar |
(not sure if that would be simpler than normalising URLs) |
07:50
🔗
|
davidar |
JesseW: but the filename thing sounds fine (so long as it doesn't introduce any ambiguity) |
07:52
🔗
|
JesseW |
eh, we could prefix a hash of the url to the filename, but that wouldn't really get us anything more than including the actual url. :-) |
07:52
🔗
|
JesseW |
but IDK about how warrior jobs work in detail -- so this may already be handled |
07:53
🔗
|
JesseW |
Assuming no URL actually contains "_SLASH_" there wouldn't be any ambiguity. |
07:53
🔗
|
JesseW |
davidar: |
07:54
🔗
|
davidar |
cool |
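The filename scheme JesseW proposes can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the scheme is assumed to be http when reversing, and (as noted in the discussion) the reversible form is unambiguous only if no URL already contains "_SLASH_".

```python
import re


def url_to_filename(url: str, reversible: bool = False) -> str:
    """Turn a URL into a flat filename by dropping the scheme and replacing slashes."""
    stripped = re.sub(r"^https?://", "", url)  # remove http:// or https://
    return stripped.replace("/", "_SLASH_" if reversible else "_")


def filename_to_url(name: str, scheme: str = "http") -> str:
    """Undo the reversible form; the original scheme is not stored, so it is assumed."""
    return f"{scheme}://" + name.replace("_SLASH_", "/")


if __name__ == "__main__":
    url = "http://forge.fh-potsdam.de/~IFLA/INSPEL/96-1riea.pdf"
    print(url_to_filename(url))
    print(filename_to_url(url_to_filename(url, reversible=True)))
```

The plain underscore form cannot be reversed reliably, because underscores that were already in the URL are indistinguishable from converted slashes.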
08:00
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
08:14
🔗
|
|
signius has joined #archiveteam |
08:35
🔗
|
FalconK |
yo, anyone know any details about archive.org rationing on S3 API uploads? |
08:36
🔗
|
xmc |
nope, never hit it |
08:36
🔗
|
xmc |
i think it's when the edge boxes get congested? |
08:36
🔗
|
FalconK |
I'm wondering what the rationing enabled condition means exactly and how long it might taje to clear |
08:36
🔗
|
FalconK |
**take |
08:36
🔗
|
xmc |
probably bitrate |
08:36
🔗
|
FalconK |
probably |
08:36
🔗
|
xmc |
i never hit it uploading a zillion tiny items |
08:36
🔗
|
FalconK |
there are two expressions of it |
08:36
🔗
|
FalconK |
one is whether rationing is enabled, and another is whether you hit your limit in particular |
08:37
🔗
|
xmc |
mmm ok |
08:37
🔗
|
FalconK |
the implementation seems poor: if you begin an upload while you are not at your limit, it will happily cancel a large upload halfway done (even knowing your size hint), only to let you begin anew and do the same thing |
08:38
🔗
|
FalconK |
so I told it to block whenever rationing is enabled at all, but I wonder if there is a better answer |
08:38
🔗
|
FalconK |
I mean, I'm hitting it with a *lot* of traffic |
08:39
🔗
|
FalconK |
like in the past 6 hours, maybe as much as 10GB/hr |
08:41
🔗
|
|
roninski has joined #archiveteam |
08:42
🔗
|
yipdw_ |
I've received 503 Slow Down before |
08:42
🔗
|
yipdw_ |
usually the condition will reset once the machines are ok to take more |
08:42
🔗
|
|
roninski has left |
08:43
🔗
|
xmc |
it cancels uploads midstream? |
08:43
🔗
|
xmc |
are you sending a size hint? |
08:43
🔗
|
|
roninski has joined #archiveteam |
08:43
🔗
|
xmc |
x-archive-size-hint:19327352832 (in bytes) |
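A small sketch of how the size hint xmc quotes might be attached to an upload request. The helper name is hypothetical; only the header name and the fact that the value is in bytes come from the message above.

```python
def size_hint_header(total_bytes: int) -> dict:
    """Return the size-hint header for an upload, with the size given in bytes."""
    if total_bytes < 0:
        raise ValueError("size must be non-negative")
    # HTTP header values are strings, so the byte count is stringified.
    return {"x-archive-size-hint": str(total_bytes)}


if __name__ == "__main__":
    print(size_hint_header(19327352832))
```

The returned dict would be merged into whatever headers the HTTP client already sends with the PUT request.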
08:44
🔗
|
yipdw_ |
as far as I know, detecting when to resume isn't possible client-side; megawarc factory handles this by just applying its usual retry strategy |
08:45
🔗
|
yipdw_ |
this does mean that you might get consecutive failures, but so far nobody has complained |
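The "just retry" approach yipdw_ describes could look roughly like this. It is not the megawarc factory code: the function name, the fixed delay and the attempt cap are assumptions, and `upload` stands for any callable that performs one attempt and returns the HTTP status code.

```python
import time


def upload_with_retry(upload, max_attempts=5, delay=60.0, sleep=time.sleep):
    """Call upload() until it stops returning 503, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        status = upload()
        if status != 503:
            return status  # success, or an error that retrying will not fix
        if attempt < max_attempts:
            sleep(delay)  # 503 Slow Down: wait, then start the upload over
    return 503


if __name__ == "__main__":
    # Simulated server: two refusals, then success.
    responses = iter([503, 503, 200])
    print(upload_with_retry(lambda: next(responses), delay=0.0))
```

As the discussion notes, each retry restarts the whole upload, so consecutive failures cost bandwidth but need no extra logic.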
08:46
🔗
|
midas |
also, don't upload huge files. s3 hates it. you might not get a slowdown, but a "disk is full" error really sucks when you upload a 1TB tar file |
08:53
🔗
|
SketchCow |
This is why I cut megawarc files down to 50gb |
08:53
🔗
|
SketchCow |
Big enough to not make 100,000 objects, small enough the system doesn't fucking explode. |
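One way to keep uploads under a size cap like the 50 GB SketchCow mentions is to group files greedily. This is a hypothetical sketch, not how megawarc actually packs files; the function name and the reading of 50 GB as 50 * 10**9 bytes are assumptions.

```python
def batch_by_size(files, limit=50 * 10**9):
    """Group (name, size_in_bytes) pairs into batches whose total size stays under limit.

    A single file larger than the limit still gets a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        # Close the current batch if adding this file would exceed the cap.
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Hypothetical file names and sizes, with a tiny limit for readability.
    print(batch_by_size([("a.warc.gz", 30), ("b.warc.gz", 30), ("c.warc.gz", 10)], limit=50))
```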
08:54
🔗
|
|
schbirid has joined #archiveteam |
08:56
🔗
|
SketchCow |
So. I just turned on the archivebot uploading. |
08:57
🔗
|
FalconK |
I'm only uploading 5GB warcs at the moment |
08:58
🔗
|
FalconK |
you can query the status of the throttling, which exposes counters at the API key, bucket, and general levels, and indicates whether your API key in particular is at its limit right now, which I believe means an upload right now would get a 503 error |
08:59
🔗
|
yipdw_ |
oh |
08:59
🔗
|
FalconK |
I don't really understand what the counters mean and I don't see any documentation on them |
08:59
🔗
|
* |
yipdw_ never bothered |
08:59
🔗
|
FalconK |
well I noticed that my uploader would upload like 500mb, and then get 503 and curl would fail out |
08:59
🔗
|
FalconK |
then it would upload the same 500mb and do it again |
08:59
🔗
|
yipdw_ |
I've noticed that sometimes as well |
08:59
🔗
|
FalconK |
so that it would keep wasting everyone's bandwidth |
08:59
🔗
|
yipdw_ |
I just let it loop |
08:59
🔗
|
midas |
^ |
08:59
🔗
|
midas |
that |
09:00
🔗
|
FalconK |
so I wrote a little thing that queries first, and tries to predict what will happen, in a loop. |
09:00
🔗
|
yipdw_ |
because I figure the more intelligence I add, the more it will fuck up |
09:00
🔗
|
FalconK |
right now if it sees that it's throttled or that rationing is globally on, it waits 5 seconds and asks again |
09:00
🔗
|
HCross2 |
FalconK: can you pm me that? |
09:01
🔗
|
FalconK |
HCross2: it's in github - https://github.com/falconkirtaran/ArchiveBot/blob/master/uploader/uploader.py#L35 |
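For readers who do not follow the link, the query-then-wait loop FalconK describes can be sketched as follows. This is not the linked uploader.py code: `check_status` is a hypothetical callable that returns the throttling status as a flat dict, and the key names `over_limit` and `rationing_engaged` are assumptions about how that status is exposed.

```python
import time


def wait_until_clear(check_status, delay=5.0, sleep=time.sleep):
    """Poll the throttling status until neither flag is set, then return it."""
    while True:
        status = check_status()
        throttled = bool(status.get("over_limit"))
        rationing = bool(status.get("rationing_engaged"))
        if not throttled and not rationing:
            return status  # safe to start the upload
        sleep(delay)  # wait 5 seconds by default, then ask again


if __name__ == "__main__":
    # Simulated status responses: throttled, rationing only, then clear.
    fake = iter([
        {"over_limit": 1, "rationing_engaged": 1},
        {"over_limit": 0, "rationing_engaged": 1},
        {"over_limit": 0, "rationing_engaged": 0},
    ])
    print(wait_until_clear(lambda: next(fake), delay=0.0))
```

Blocking on the global rationing flag as well as the per-key flag is the conservative choice FalconK mentions; waiting on `over_limit` alone would start uploads sooner at the risk of being cut off midstream.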
09:01
🔗
|
midas |
S3 status messages are just as informative as a blank paper, i wouldn't bother and just keep pushing it in until it accepts |
09:01
🔗
|
yipdw_ |
eh the code's there |
09:02
🔗
|
yipdw_ |
if it works, whichever |
09:02
🔗
|
FalconK |
it's trying to tell me *something* |
09:02
🔗
|
yipdw_ |
I just never bothered for the above reasons |
09:02
🔗
|
|
Guest__ has joined #archiveteam |
09:02
🔗
|
FalconK |
over_limit is for sure informative as it means any activity will block |
09:02
🔗
|
FalconK |
er, will 503 |
09:04
🔗
|
yipdw_ |
it looks like over_limit is the only flag that you can rely on |
09:04
🔗
|
FalconK |
I would be interested to know what constitutes a "task" for these counters |
09:05
🔗
|
yipdw_ |
so I guess it's probably fine to just wait until it's cleared and assume zero means go-ahead |
09:05
🔗
|
yipdw_ |
if it cuts you off midstream, oh well, what's a few gigabytes between friends |
09:05
🔗
|
FalconK |
well, yes, but it looked like any attempt at all to upload giant things at the rate I am capable of uploading them will exceed my ration |
09:05
🔗
|
yipdw_ |
today just might not be a good day |
09:06
🔗
|
FalconK |
it depends on "task" |
09:06
🔗
|
yipdw_ |
I've pushed stuff in at ~600 Mbps |
09:06
🔗
|
yipdw_ |
it does, but the detail field is also explicitly documented as internal |
09:06
🔗
|
FalconK |
yeah, while rationing_engaged=0, it doesn't seem to care what kind of mess I make |
09:06
🔗
|
yipdw_ |
as a result it doesn't seem like it's anything to rely on |
09:07
🔗
|
SketchCow |
s3 is overloaded, so there's that. |
09:07
🔗
|
FalconK |
I'm trying to be a good netizen and not create ops problems ;) |
09:07
🔗
|
FalconK |
what is it overloaded by? semantic processing? bandwidth? data migration? |
09:09
🔗
|
FalconK |
it looks as though people are enqueueing tasks at a very high rate tonight |
09:09
🔗
|
midas |
http://i.imgur.com/Q45H7.gif <-- s3 graph |
09:10
🔗
|
FalconK |
lol |
09:10
🔗
|
midas |
not to worry, it will unclog at a certain moment |
09:11
🔗
|
yipdw_ |
there might be someone at IA you can email; I'm not sure who that'd be or if they're available for this sort of thing |
09:11
🔗
|
schbirid |
i hope s3 gave their consent |
09:11
🔗
|
schbirid |
;P |
09:11
🔗
|
midas |
lol |
09:11
🔗
|
FalconK |
... it seems as though it should be possible to approximate the desired rate by seeing that accesskey_tasks_queued<accesskey_ration, bucket_tasks_queued<bucket_ration, and total_tasks_queued<total_global_limit |
09:11
🔗
|
midas |
SketchCow: nice work on the busy gif btw |
09:12
🔗
|
midas |
https://monitor.archive.org/about/busy.gif |
09:12
🔗
|
FalconK |
and long polling while querying that until the condition is true |
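The three-way comparison FalconK suggests can be written as a single predicate. The counter names are taken from the message above; the flat dict layout and the sample numbers are assumptions, and what counts as a "task" is undocumented, so this is at best an approximation.

```python
def has_headroom(counters: dict) -> bool:
    """True when queued tasks are below the ration at every level."""
    return (
        counters["accesskey_tasks_queued"] < counters["accesskey_ration"]
        and counters["bucket_tasks_queued"] < counters["bucket_ration"]
        and counters["total_tasks_queued"] < counters["total_global_limit"]
    )


if __name__ == "__main__":
    # Hypothetical counter values for illustration only.
    sample = {
        "accesskey_tasks_queued": 10, "accesskey_ration": 100,
        "bucket_tasks_queued": 2, "bucket_ration": 50,
        "total_tasks_queued": 900, "total_global_limit": 1000,
    }
    print(has_headroom(sample))
```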
09:12
🔗
|
FalconK |
but that is WAY too much work |
09:12
🔗
|
FalconK |
and I can't do that while shelling out to curl for uploads really |
09:12
🔗
|
|
pikhq has quit IRC (Ping timeout: 506 seconds) |
09:12
🔗
|
FalconK |
and that would really put too much logic in there |
09:12
🔗
|
FalconK |
ugh |
09:17
🔗
|
midas |
what you can do FalconK is have a look at http://monitor.archive.org/stats/s3.php |
09:18
🔗
|
midas |
but that's mostly old data, as in not live feed |
09:18
🔗
|
yipdw_ |
I don't know how you'd use that in a script |
09:18
🔗
|
midas |
mostly to check with your eyes and see "well this seems to be a bad time to upload something" ;-) |
09:19
🔗
|
FalconK |
not even a bit amenable to automation but it's nice to have the stats. |
09:19
🔗
|
yipdw_ |
nobody really does that with the archivebot uploader |
09:19
🔗
|
yipdw_ |
hell I forget it's running |
09:19
🔗
|
yipdw_ |
I'd prefer to keep forgetting that it's there |
09:19
🔗
|
FalconK |
:P |
09:19
🔗
|
midas |
that's because the archivebot uploader keeps looping anyway |
09:20
🔗
|
yipdw_ |
anyway good luck I guess -- I just haven't hit 503 Slow Down often enough to really look into optimizing backoff/retry |
09:20
🔗
|
yipdw_ |
if you hear back from (say) IA staff it'd be cool to have that info |
09:21
🔗
|
yipdw_ |
it'd be super-cool if http://archive.org/help/abouts3.txt was updated with detail info, but the way that's written sounds like the rate limiting policy is in flux |
09:21
🔗
|
HCross2 |
I'm talking to Mark from the IA tonight. I'll bring up the slowdowns and see |
09:21
🔗
|
FalconK |
it doesn't look well-correlated with anything in particular |
09:22
🔗
|
FalconK |
yeah! |
09:22
🔗
|
FalconK |
it would be nice to know like |
09:22
🔗
|
SketchCow |
Mark will not be able to help you. |
09:22
🔗
|
FalconK |
anything at all that relates upload size to tasks |
09:22
🔗
|
|
bwn has joined #archiveteam |
09:22
🔗
|
SketchCow |
I mean, I know why FOS is slow and why S3 has problems. |
09:23
🔗
|
FalconK |
it'd be awful nice if it could use my size hint to dispose of the request fast |
09:28
🔗
|
HCross2 |
Ok |
09:35
🔗
|
SketchCow |
s3 unjammed |
09:35
🔗
|
FalconK |
so it is! |
09:35
🔗
|
FalconK |
I also notice that most of my uploads have been flagged for admin intervention due to being on full disks as derive.php started |
09:37
🔗
|
FalconK |
or things like connect to host iw600504 port 22: No route to host |
09:37
🔗
|
FalconK |
guessing it's related to the downtime today and derivation will continue anon. |
09:55
🔗
|
|
roninski1 has joined #archiveteam |
09:56
🔗
|
|
MMovie has quit IRC (Read error: Operation timed out) |
09:57
🔗
|
|
vitzli has joined #archiveteam |
10:01
🔗
|
|
roninski has quit IRC (Read error: Operation timed out) |
10:06
🔗
|
|
metalcamp has quit IRC (Ping timeout: 250 seconds) |
10:52
🔗
|
|
atomotic has joined #archiveteam |
11:11
🔗
|
|
metalcamp has joined #archiveteam |
11:21
🔗
|
|
db48x has quit IRC (Read error: Connection reset by peer) |
11:41
🔗
|
|
metalcamp has quit IRC (Ping timeout: 250 seconds) |
11:53
🔗
|
|
roninski has joined #archiveteam |
11:58
🔗
|
|
roninski1 has quit IRC (Read error: Operation timed out) |
12:00
🔗
|
|
khaoohs_ has quit IRC (Read error: Operation timed out) |
12:00
🔗
|
|
khaoohs has joined #archiveteam |
12:07
🔗
|
|
roninski1 has joined #archiveteam |
12:12
🔗
|
|
MMovie has joined #archiveteam |
12:13
🔗
|
|
roninski has quit IRC (Read error: Operation timed out) |
12:31
🔗
|
|
jmad980 has quit IRC (Read error: Operation timed out) |
12:33
🔗
|
|
metalcamp has joined #archiveteam |
12:35
🔗
|
|
Jonimus has quit IRC (Read error: Operation timed out) |
12:35
🔗
|
|
nwf has quit IRC (Read error: Operation timed out) |
12:36
🔗
|
|
aMunster has quit IRC (Read error: Operation timed out) |
12:36
🔗
|
|
toad1 has quit IRC (Read error: Operation timed out) |
12:36
🔗
|
|
mhazinsk has quit IRC (Read error: Operation timed out) |
12:37
🔗
|
|
MMovie has quit IRC (Read error: Operation timed out) |
12:37
🔗
|
|
vegbrasil has quit IRC (Read error: Operation timed out) |
12:38
🔗
|
|
closure has quit IRC (Read error: Operation timed out) |
12:38
🔗
|
|
vtyl has quit IRC (Read error: Operation timed out) |
12:41
🔗
|
|
beardicus has quit IRC (Read error: Operation timed out) |
12:42
🔗
|
|
lytv has joined #archiveteam |
12:42
🔗
|
|
toad1 has joined #archiveteam |
12:49
🔗
|
|
jmad980 has joined #archiveteam |
12:51
🔗
|
|
metal_cam has joined #archiveteam |
12:51
🔗
|
|
metalcamp has quit IRC (Ping timeout: 258 seconds) |
12:54
🔗
|
|
WinterFox has quit IRC (Remote host closed the connection) |
12:57
🔗
|
|
metal_cam is now known as metalcamp |
13:04
🔗
|
|
VADemon has joined #archiveteam |
13:05
🔗
|
|
beardicus has joined #archiveteam |
13:06
🔗
|
|
vegbrasil has joined #archiveteam |
13:08
🔗
|
|
closure has joined #archiveteam |
13:08
🔗
|
|
dserodio has joined #archiveteam |
13:16
🔗
|
|
aMunster has joined #archiveteam |
13:54
🔗
|
|
maseck has quit IRC (Quit: No Ping reply in 180 seconds.) |
13:55
🔗
|
|
maseck has joined #archiveteam |
14:05
🔗
|
|
vitzli has quit IRC (Leaving) |
14:06
🔗
|
|
Jonimus has joined #archiveteam |
14:10
🔗
|
|
MMovie has joined #archiveteam |
14:15
🔗
|
|
mhazinsk has joined #archiveteam |
14:23
🔗
|
|
tomwsmf-a has joined #archiveteam |
14:26
🔗
|
|
nwf has joined #archiveteam |
14:35
🔗
|
|
pgoetz has quit IRC (Remote host closed the connection) |
15:06
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
15:07
🔗
|
Start |
http://www.theverge.com/2016/3/9/11184518/flickr-photo-uploader-now-paid-feature |
15:10
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
15:15
🔗
|
|
Ungstein1 has quit IRC (Read error: Connection reset by peer) |
15:17
🔗
|
|
Morbus has joined #archiveteam |
15:21
🔗
|
|
Ungstein has joined #archiveteam |
15:24
🔗
|
SketchCow |
Bad sign |
15:28
🔗
|
* |
ersi starts the Death Watch |
15:36
🔗
|
|
Ungstein has quit IRC (Quit: Leaving.) |
15:41
🔗
|
|
Ungstein has joined #archiveteam |
15:45
🔗
|
|
Start has joined #archiveteam |
16:23
🔗
|
|
RichardG_ has quit IRC (Ping timeout: 258 seconds) |
16:30
🔗
|
|
RichardG has joined #archiveteam |
16:34
🔗
|
|
arkiver2 has joined #archiveteam |
16:50
🔗
|
|
vOYtEC has quit IRC (rm -r *) |
16:51
🔗
|
|
Start_ has joined #archiveteam |
16:51
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
16:51
🔗
|
|
Start_ is now known as Start |
16:59
🔗
|
|
JesseW has joined #archiveteam |
17:03
🔗
|
|
MMovie has quit IRC (Read error: Connection reset by peer) |
17:05
🔗
|
|
MMovie has joined #archiveteam |
17:07
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
17:13
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
17:20
🔗
|
|
arkiver2 has quit IRC (Ping timeout: 258 seconds) |
17:23
🔗
|
|
metalcamp has quit IRC (Ping timeout: 258 seconds) |
17:28
🔗
|
HCross |
FalconK, does your s3 thingy need the secret key, or just the access key? |
17:29
🔗
|
godane |
SketchCow: 2010 mp3s of kpfa will be all uploaded by tonight |
17:29
🔗
|
godane |
i'm up to 2010-12-09 right now |
17:31
🔗
|
|
atomotic has joined #archiveteam |
17:42
🔗
|
|
metalcamp has joined #archiveteam |
17:44
🔗
|
xmc |
FalconK: are you uploading with the magic flag set that blocks derive operations until you're done putting files into the item? |
17:44
🔗
|
xmc |
it could be that you are putting too many derive jobs in the queue |
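The flag xmc refers to is presumably the `x-archive-queue-derive` request header of the archive.org S3-like API; that identification is an assumption, as is the helper name below. The sketch suppresses the derive for every file except the last one uploaded to an item.

```python
def derive_headers(is_last_file: bool) -> dict:
    """Headers controlling whether this upload queues a derive task."""
    if is_last_file:
        # No header: fall back to the default behaviour, which queues a derive.
        return {}
    # Ask the server not to queue a derive after this file.
    return {"x-archive-queue-derive": "0"}


if __name__ == "__main__":
    # Hypothetical file list for one item.
    files = ["part1.warc.gz", "part2.warc.gz", "part3.warc.gz"]
    for index, name in enumerate(files):
        print(name, derive_headers(index == len(files) - 1))
```

With this pattern an item with many files queues one derive job instead of one per file.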
17:46
🔗
|
|
VADemon has quit IRC (Quit: left4dead) |
17:51
🔗
|
|
vOYtEC has joined #archiveteam |
17:51
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
17:58
🔗
|
Frogging |
SketchCow: I thought it was already clear that Flickr was in trouble since Yahoo told their shareholders they were going to kill it :p |
18:01
🔗
|
HCross |
just the words "owned by Yahoo" puts anything close to death |
18:02
🔗
|
Frogging |
mhmm |
18:02
🔗
|
|
philpem has joined #archiveteam |
18:20
🔗
|
johtso |
is there anywhere that caches of github repositories can be found? |
18:28
🔗
|
MrRadar |
Cached as in the Git repository or cached as in the other Github stuff (issues, pull requests, etc)? |
18:39
🔗
|
|
jut has joined #archiveteam |
18:40
🔗
|
|
Froggypwn has quit IRC (Ping timeout: 258 seconds) |
18:42
🔗
|
johtso |
MrRadar: the files themselves, I can see the code in google's cache, but can't get at the one DLL file I need :( |
18:43
🔗
|
johtso |
https://webcache.googleusercontent.com/search?q=cache:7NqX3jT4LUQJ:https://github.com/jgoewert/USBCD-Module+&cd=1&hl=en&ct=clnk&gl=uk |
18:43
🔗
|
johtso |
a shame internet archive doesn't crawl github :( |
18:47
🔗
|
MrRadar |
Yeah, though GitHub repositories are gigantic if you actually capture the source code through the web interface |
18:47
🔗
|
MrRadar |
It doesn't look like anyone forked that, unfortunately |
18:47
🔗
|
DFJustin |
there have been people here pulling github repos but I don't know what the current status is |
18:50
🔗
|
|
GChriss has joined #archiveteam |
18:51
🔗
|
GChriss |
a shutdown notice from pyvideo.org that hasn't been added to the wiki yet: http://bluesock.org/~willkg/blog/pyvideo/status_20160115.html |
18:51
🔗
|
MrRadar |
I think we already grabbed that through ArchiveBot |
18:52
🔗
|
MrRadar |
Yes: http://archive.fart.website/archivebot/viewer/job/eilj3 |
18:52
🔗
|
GChriss |
yes, checking degree of completeness now |
18:53
🔗
|
GChriss |
youtube embedding is broken but I'm guessing that's expected |
18:58
🔗
|
|
bwn has quit IRC (Read error: Operation timed out) |
19:01
🔗
|
yipdw_ |
if you're checking completeness, try multiple replay tools |
19:04
🔗
|
JW_work |
Hm; it might be worth making a tool to automatically pick through the github firehose and fork any repository that didn't have any forks after a couple of days. |
19:05
🔗
|
JW_work |
(if one has enough space, cloning them privately might be even better, to avoid cases where github decides they don't want to host the repo *or* mirrors they know about) |
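The selection step of the tool JW_work imagines might look like this. It is a hypothetical sketch: it assumes each repository is a dict shaped like a GitHub REST API repository object, with `forks_count` and an ISO 8601 `created_at`, and it leaves out fetching the firehose and doing the actual fork or clone.

```python
from datetime import datetime, timedelta, timezone


def needs_backup(repo: dict, now: datetime, min_age: timedelta = timedelta(days=2)) -> bool:
    """True for repositories that are at least min_age old and still have no forks."""
    created = datetime.strptime(repo["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    created = created.replace(tzinfo=timezone.utc)  # the trailing Z means UTC
    return repo["forks_count"] == 0 and now - created >= min_age


if __name__ == "__main__":
    now = datetime(2016, 3, 10, tzinfo=timezone.utc)
    # Hypothetical repository records.
    repos = [
        {"full_name": "someone/old-unforked", "forks_count": 0, "created_at": "2016-03-07T00:00:00Z"},
        {"full_name": "someone/too-new", "forks_count": 0, "created_at": "2016-03-09T12:00:00Z"},
        {"full_name": "someone/already-forked", "forks_count": 3, "created_at": "2016-03-01T00:00:00Z"},
    ]
    print([r["full_name"] for r in repos if needs_backup(r, now)])
```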
19:05
🔗
|
phuzion |
Project name: Giterdun |
19:06
🔗
|
JW_work |
:-) |
19:09
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
19:14
🔗
|
joepie91 |
lol |
19:23
🔗
|
|
bwn has joined #archiveteam |
19:28
🔗
|
|
jut has quit IRC (Read error: Connection reset by peer) |
19:28
🔗
|
|
schbirid has joined #archiveteam |
19:46
🔗
|
|
roninski has joined #archiveteam |
19:50
🔗
|
|
roninski1 has quit IRC (Read error: Operation timed out) |
20:38
🔗
|
|
Start has joined #archiveteam |
20:45
🔗
|
|
Start has quit IRC (Quit: Disconnected.) |
21:02
🔗
|
|
K4k has quit IRC (Ping timeout: 260 seconds) |
21:06
🔗
|
|
K4k has joined #archiveteam |
21:07
🔗
|
|
Lord_Nigh has quit IRC (Read error: Operation timed out) |
21:10
🔗
|
|
Lord_Nigh has joined #archiveteam |
21:11
🔗
|
|
balrog sets mode: +o Lord_Nigh |
21:16
🔗
|
|
metalcamp has quit IRC (Ping timeout: 258 seconds) |
21:39
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
21:40
🔗
|
|
tomwsmf-a has quit IRC (Ping timeout: 258 seconds) |
22:01
🔗
|
johtso |
or clone to a private bitbucket repo |
22:28
🔗
|
|
lysobit has quit IRC (Quit: that hurt deep, nsh) |
22:32
🔗
|
|
lysobit has joined #archiveteam |
22:38
🔗
|
|
ndiddy has joined #archiveteam |
22:44
🔗
|
|
Kenshin has quit IRC (Ping timeout: 260 seconds) |
22:48
🔗
|
|
Kenshin has joined #archiveteam |
22:49
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
22:50
🔗
|
arkiver |
Is someone using the IA CDX API for anything flickr related currently? |
22:57
🔗
|
|
dashcloud has joined #archiveteam |
23:09
🔗
|
godane |
https://archive.org/details/ET_And_Friends_1982 |
23:26
🔗
|
|
MMovie has quit IRC (Read error: Operation timed out) |
23:26
🔗
|
|
aMunster has quit IRC (Read error: Operation timed out) |
23:27
🔗
|
|
vegbrasil has quit IRC (Read error: Operation timed out) |
23:27
🔗
|
|
nwf has quit IRC (Read error: Operation timed out) |
23:28
🔗
|
|
mhazinsk has quit IRC (Read error: Operation timed out) |
23:28
🔗
|
|
beardicus has quit IRC (Read error: Operation timed out) |
23:30
🔗
|
|
closure has quit IRC (Read error: Operation timed out) |
23:41
🔗
|
|
Jonimus has quit IRC (Ping timeout: 633 seconds) |
23:54
🔗
|
|
beardicus has joined #archiveteam |
23:57
🔗
|
|
vegbrasil has joined #archiveteam |