Time | Nickname | Message
02:40 | SketchCow | Schbirid: FOS will be available for it soon. I am currently uploading Wretch
07:56 | Schbirid | SketchCow: excellent, i will try to get my dockstar onto the network today! it would probably end up as an rsync daemon to sync from.
09:46 | arkiver | saving warhammeronline.com in just under 60 minutes...
09:46 | arkiver | :)
09:46 | arkiver | new record!!
09:56 | m1das | nice job arkiver
10:03 | nico_32 | how much was it? (in size)
10:16 | arkiver | using a different method
10:16 | arkiver | not adding a website and downloading that whole website
10:16 | arkiver | since it then downloads everything one by one
10:16 | arkiver | but I used a program that quickly discovers all the links from a website
10:17 | arkiver | then I download all those links instead of crawling the website
10:17 | arkiver | that way the whole website downloads much faster
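arkiver never names the tool, but the two-phase idea (discover all URLs first, then fetch them in parallel instead of downloading page by page during the crawl) can be sketched roughly. A minimal illustration in Python, assuming the requests package; a real grab would also write WARCs rather than bare files:

    # Rough sketch of the two-phase approach arkiver describes: phase 1 only
    # discovers URLs, phase 2 fetches them all concurrently, instead of a
    # sequential crawl that downloads everything one by one. Illustration
    # only; not the actual program used.
    import urllib.parse
    from collections import deque
    from concurrent.futures import ThreadPoolExecutor
    from html.parser import HTMLParser

    import requests  # assumed available: pip install requests


    class LinkParser(HTMLParser):
        """Collect href/src attribute values from one HTML page."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            for name, value in attrs:
                if name in ("href", "src") and value:
                    self.links.append(value)


    def discover(start_url, limit=1000):
        """Phase 1: breadth-first link discovery, same host only."""
        host = urllib.parse.urlparse(start_url).netloc
        seen, queue = {start_url}, deque([start_url])
        while queue and len(seen) < limit:
            url = queue.popleft()
            try:
                resp = requests.get(url, timeout=30)
            except requests.RequestException:
                continue
            if "text/html" not in resp.headers.get("Content-Type", ""):
                continue  # only HTML pages yield further links
            parser = LinkParser()
            parser.feed(resp.text)
            for link in parser.links:
                absolute, _ = urllib.parse.urldefrag(urllib.parse.urljoin(url, link))
                if urllib.parse.urlparse(absolute).netloc == host and absolute not in seen:
                    seen.add(absolute)
                    queue.append(absolute)
        return seen


    def fetch(url):
        """Phase 2 worker: download one discovered URL."""
        try:
            return url, requests.get(url, timeout=30).content
        except requests.RequestException:
            return url, None


    urls = discover("http://www.warhammeronline.com/")
    with ThreadPoolExecutor(max_workers=20) as pool:
        for url, body in pool.map(fetch, urls):
            print(url, "ok" if body is not None else "failed")

The speedup comes from the second phase: once the URL list exists, twenty workers can pull files simultaneously instead of one at a time.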
10:37 | arkiver | is http://commons.wikimedia.org/ also saved by the archiveteam already?
10:43 | nico_32 | probably backed up by the wikiteam
10:50 | arkiver | ah ok
11:00 | Nemo_bis | arkiver: what part of it?
11:00 | Nemo_bis | the text is in http://dumps.wikimedia.org/backup-index.html with some mirrors
11:00 | arkiver | yes
11:00 | arkiver | but I mean all the images and videos and so on
11:01 | Nemo_bis | uploads are close to 30 TB, I spent a few months archiving them
11:01 | Nemo_bis | if you find something/someone to seed the torrents, that's appreciated :) there's one per month https://archive.org/details/wikimediacommons-torrents
11:09 | arkiver | are you only uploading them as torrents or also as WARCs?
11:10 | Nemo_bis | O_o
11:10 | Nemo_bis | they're uploaded as ZIP files (which contain the individual media files + XML descriptions), torrents are just a means of distribution
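The monthly torrent items Nemo_bis links can be enumerated programmatically. A short sketch, assuming the internetarchive Python package (pip install internetarchive); the collection identifier comes straight from the URL above:

    # List the monthly Commons torrent items in the archive.org collection.
    from internetarchive import search_items

    for result in search_items("collection:wikimediacommons-torrents"):
        print(result["identifier"])
        # internetarchive.download(result["identifier"], glob_pattern="*.torrent")
        # would fetch just the .torrent file for seeding.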
11:10 | m1das | 30 TB, that's about the storage i have in total.
11:10 | arkiver | no, I mean are they in the wayback machine?
11:11 | Nemo_bis | see https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media_tarballs for more info
11:11 | arkiver | brb
11:11 | Nemo_bis | I doubt it, and it wouldn't be very useful anyway, you can't download more than 100 MB per file from wayback
11:11 | Nemo_bis | though the legend says you can from some machines
11:48 | arkiver | ??
11:48 | arkiver | you can download more than 100 MB per file from the wayback machine...
12:27 | Nemo_bis | arkiver: not always https://archive.org/post/1003894/wayback-machine-doesnt-support-the-range-header-aka-wget-continue-doesnt-work
12:28 | arkiver | Nemo_bis: I've never experienced that yet...
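The post Nemo_bis links is about the Wayback Machine not honoring the HTTP Range header, which is what makes resumed downloads (wget --continue) fail on large files. Whether any given server supports ranged requests is easy to check: ask for the first kilobyte and see whether you get 206 Partial Content back. A small probe in Python, assuming requests; the URL is the one from the log below:

    # Probe for Range support: 206 means partial content is honored,
    # anything else means resumable downloads will not work.
    import requests

    url = ("http://web.archive.org/web/20070810113028/"
           "http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi")
    resp = requests.get(url, headers={"Range": "bytes=0-1023"},
                        stream=True, timeout=60)
    if resp.status_code == 206:
        print("Range honored:", resp.headers.get("Content-Range"))
    else:
        print("no Range support; got", resp.status_code,
              "- wget --continue will not help here")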
12:28 | arkiver | can someone here create good scripts or little programs for windows?
12:29 | Nemo_bis | arkiver: then try downloading http://web.archive.org/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi and tell me what you get :)
12:31 | arkiver | Nemo_bis: ah yes, I see...
12:31 | arkiver | I did get that sometimes, but never consistently at 100 MB
12:31 | arkiver | it is different every time
12:31 | arkiver | but from what I learned, it just needs to be archived again
12:32 | arkiver | since there was probably some kind of error in the connection at that time
12:34 | Nemo_bis | ouch, that would be terrible because those videos are gone; where did you read this?
12:34 | arkiver | no, it's just from what I tried out
12:34 | arkiver | I tried and tried with other links
12:35 | arkiver | and that is my "conclusion"
12:35 | arkiver | but man
12:35 | arkiver | maybe we should put wikimedia in the wayback machine?
12:36 | Nemo_bis | that's a bit generic :) what part of it?
12:36 | arkiver | hmm
12:37 | arkiver | alright if we talk about this a little later?
12:37 | arkiver | lol
12:37 | arkiver | doing several things atm
12:37 | arkiver | and I want to have a good conversation about it
12:37 | arkiver | ok?
12:39 | arkiver | till when are you online?
15:04 | chfoo | the #btch project is up and running. manual script running: https://github.com/ArchiveTeam/ptch-grab
15:05 | Marcelo | Only manual?
15:08 | chfoo | i need an admin to add it to projects.json please
15:17 | nico_32 | another project ?
15:19 | chfoo | yahoo! is shutting down ptch. ~5 days remain.
15:21 | nico_32 | 74k to do? definitive number?
15:23 | nico_32 | chfoo: how many concurrent connections per ipv4?
15:25 | chfoo | nico_32: 74k should be definitive based on the list deathy gave me. i'm not sure how many concurrent threads are ok.
15:26 | chfoo | if possible, best advice is to use a sacrificial ip address and let us know.
15:27 | deathy | for ptch there was no obvious/visible rate-limiting when I did initial research/API calls.
15:28 | deathy | that being said... 2 concurrent is safe... let's at least see how it goes before trying to break it
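chfoo's "sacrificial IP" advice can be automated: hit the site at a fixed concurrency and count status codes; a burst of 429/403s or sharply rising latency is the usual sign of rate limiting. A rough sketch in Python, assuming requests; the test URL is a placeholder, not a real ptch endpoint:

    # Crude rate-limit probe: N requests at a fixed concurrency,
    # then tally status codes and mean latency.
    import time
    from collections import Counter
    from concurrent.futures import ThreadPoolExecutor

    import requests

    TEST_URL = "http://example.com/"  # placeholder; substitute a real URL
    CONCURRENCY = 2                   # deathy's "safe" starting point
    N_REQUESTS = 50


    def probe(_):
        start = time.time()
        try:
            status = requests.get(TEST_URL, timeout=30).status_code
        except requests.RequestException:
            status = "error"
        return status, time.time() - start


    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        results = list(pool.map(probe, range(N_REQUESTS)))

    print(Counter(status for status, _ in results))
    print("mean latency: %.2fs" % (sum(t for _, t in results) / len(results)))

For the grab itself, going by other ArchiveTeam grab READMEs, concurrency is typically set when launching seesaw's run-pipeline (e.g. a --concurrent 2 option), so nico_32's concurrent=2 below maps directly onto that.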
15:28 | nico_32 | so running concurrent=2 on 4 ipv4s
15:30 | nico_32 | got another dedicated server
15:30 | Marcelo | Can I increase upload slots?
15:34 | Marcelo | Concurrent uploads
15:34 | nico_32 | the upload target is slow
15:35 | nico_32 | ~75 kB/s here
15:39 | Marcelo | 75.98 kB/s here
15:40 | nico_32 | from Schbirid (was got klined from efnet): "hey, could someone test the speed of my jamendo vorbis album server?"
15:40 | nico_32 | from Schbirid (was got klined from efnet): "rsync -avP 151.217.55.80::albums2 ."
15:40 | nico_32 | s/was/who/g
15:41 | nico_32 | from Schbirid (who got klined from efnet): "if it works, maybe someone could sync from/to fos? albums2 is the first hdd with 2TB"
15:41 | nico_32 | from Schbirid (who got klined from efnet): "rsync -avP --dry-run 151.217.55.80::albums2 jamendo-albums/"
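For anyone testing Schbirid's commands: the double colon in 151.217.55.80::albums2 is rsync daemon syntax, naming a module exported by the daemon rather than an SSH path, and --dry-run only lists what would be transferred, so the second command is a safe way to inspect the 2TB module's contents before committing any bandwidth.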
15:41
🔗
|
nico_32 |
poke SketchCow |
15:51
🔗
|
SketchCow |
OK. |
15:55
🔗
|
Nemo_bis |
chfoo: the README doesn't include the instructions added in last revisions of https://github.com/ArchiveTeam/wretch-grab |
15:56
🔗
|
chfoo |
Nemo_bis: noted. i'll fix it now |
15:56
🔗
|
Nemo_bis |
I guess they need to be pushed to the upstream repo? |
15:56
🔗
|
Nemo_bis |
thanks |
15:57
🔗
|
Nemo_bis |
I also noted we still require gnutls-dev[el] and openssl-dev[el], I had to install them on fedora (this used to be the most common problem, with mobileme) |
15:57
🔗
|
Nemo_bis |
so maybe that's to add too |
16:11
🔗
|
nico_32 |
it is openssl-dev or gnutls-dev |
16:11
🔗
|
nico_32 |
one is enough |
16:22
🔗
|
Nemo_bis |
hmmm |
16:24
🔗
|
Nemo_bis |
I can't make sense out of my package manager history, oh well |
16:25
🔗
|
joepie91 |
Nemo_bis: wait, you mean there are people that -can- make sense out of package manager history? |
16:25
🔗
|
joepie91 |
where do I find these mythical creatures? |
16:27
🔗
|
Nemo_bis |
:) apper is rather easy to use |
16:28
🔗
|
Nemo_bis |
but apparently I didn't install the packages I remembered, probably I'm the wrong one ;) |
18:22
🔗
|
wp494 |
!! |
18:22
🔗
|
wp494 |
http://www.theverge.com/2013/12/27/5248286/vdio-shut-down-by-rdio |
18:25
🔗
|
yipdw |
rdio killed the vdio star |
19:09
🔗
|
zenguy_pc |
how does web,archive.org determine what imgur links they cache |
19:09
🔗
|
zenguy_pc |
yipdw: lol |
20:47
🔗
|
DFJustin |
zenguy_pc: I would assume it's just a crapshoot based on what their spiders reach |
20:47
🔗
|
DFJustin |
so popular images linked from multiple external pages are more likely |
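DFJustin's point is that coverage depends on what crawlers happen to reach; individual URLs can also be pushed in explicitly through the Wayback Machine's save endpoint. A minimal sketch in Python, assuming requests; the imgur URL is a hypothetical example, and the exact response details vary:

    # Ask the Wayback Machine to take a fresh snapshot of one URL.
    import requests

    url = "http://imgur.com/gallery/example"  # hypothetical example URL
    resp = requests.get("https://web.archive.org/save/" + url, timeout=120)
    # A 200 here generally means a snapshot was taken; some responses
    # include the archived path in the Content-Location header.
    print(resp.status_code, resp.headers.get("Content-Location", ""))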