#archiveteam 2013-12-27,Fri

Time Nickname Message
02:40 🔗 SketchCow Schbirid: FOS will be available for it soon. I am currently uploading Wretch
07:56 🔗 Schbirid SketchCow: excellent, i will see about getting my dockstar into the network today! it would probably end up as an rsync daemon to sync from.
09:46 🔗 arkiver saving warhammeronline.com in just under 60 minutes...
09:46 🔗 arkiver :)
09:46 🔗 arkiver new record!!
09:56 🔗 m1das nice job arkiver
10:03 🔗 nico_32 how much was it? (in size)
10:16 🔗 arkiver using a different method
10:16 🔗 arkiver instead of adding a website and downloading that whole website
10:16 🔗 arkiver since that downloads everything one by one
10:16 🔗 arkiver I used a program that quickly discovers all the links from a website
10:17 🔗 arkiver then I download all those links instead of the website
10:17 🔗 arkiver the website then downloads much faster
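
arkiver doesn't name the program he used, but the approach he describes — enumerate a site's URLs first, then fetch them all directly — can be sketched with standard tools. The sitemap URL and file names below are assumptions for illustration, not what he actually ran:

    # assumption: the site exposes a sitemap listing its pages
    wget -qO- http://www.warhammeronline.com/sitemap.xml \
      | grep -oP '(?<=<loc>)[^<]+' > urls.txt
    # fetch the discovered URLs directly, writing a WARC as ArchiveTeam grabs do
    wget --input-file=urls.txt --warc-file=warhammeronline \
      --page-requisites --no-verbose
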
10:37 🔗 arkiver is http://commons.wikimedia.org/ also saved by the archiveteam already?
10:43 🔗 nico_32 probably backed up by the wikiteam
10:50 🔗 arkiver ah ok
11:00 🔗 Nemo_bis arkiver: what part of it?
11:00 🔗 Nemo_bis the text is in http://dumps.wikimedia.org/backup-index.html with some mirrors
11:00 🔗 arkiver yes
11:00 🔗 arkiver but I mean all the images and videos and so on
11:01 🔗 Nemo_bis uploads are close to 30 TB, I spent a few months archiving them
11:01 🔗 Nemo_bis if you find something/someone to seed the torrents, that's appreciated :) there's one per month https://archive.org/details/wikimediacommons-torrents
11:09 🔗 arkiver are you only uploading them as torrents or also as WARCs?
11:10 🔗 Nemo_bis O_o
11:10 🔗 Nemo_bis they're uploaded as ZIP files (which contain the individual media files + XML descriptions), torrents are just a way for distribution
11:10 🔗 m1das 30TB, that's about the storage i have in total.
11:10 🔗 arkiver no I mean are they in the wayback machine?
11:11 🔗 Nemo_bis see https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Media_tarballs for more info
11:11 🔗 arkiver brb
11:11 🔗 Nemo_bis I doubt it and it wouldn't be very useful anyway, you can't download more than 100 MB per file from wayback
11:11 🔗 Nemo_bis though the legend says you can from some machines
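
For anyone who does want to help seed those monthly Commons torrents, something along these lines would do it; the torrent filename here is hypothetical, so pick a real one from the item's file list:

    # download one month's torrent from the archive.org item, then seed it
    wget https://archive.org/download/wikimediacommons-torrents/wikimediacommons-2013-01.torrent
    transmission-cli wikimediacommons-2013-01.torrent
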
11:48 🔗 arkiver ??
11:48 🔗 arkiver you can download more than 100 MB per file from the wayback machine...
12:27 🔗 Nemo_bis arkiver: not always https://archive.org/post/1003894/wayback-machine-doesnt-support-the-range-header-aka-wget-continue-doesnt-work
12:28 🔗 arkiver Nemo_bis: I never experienced that yet...
12:28 🔗 arkiver can someone here create good scripts or little programs for windows?
12:29 🔗 Nemo_bis arkiver: then try downloading http://web.archive.org/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi and tell me what you get :)
12:31 🔗 arkiver Nemo_bis: ah yes, I see...
12:31 🔗 arkiver I did get that sometimes, but not always at exactly 100 MB
12:31 🔗 arkiver it is different every time
12:31 🔗 arkiver but from what I learned, it just needs to be archived again
12:31 🔗 arkiver since there was probably some kind of error in the connection at that time
12:32 🔗 Nemo_bis ouch, that would be terrible because those videos are gone; where did you read this?
12:34 🔗 arkiver no it's just from what I tried out
12:34 🔗 arkiver I tried and tried with other links
12:34 🔗 arkiver and that is my "conclusion"
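
The range-request problem Nemo_bis links to is easy to check from the command line: ask the Wayback Machine for only the first kilobyte of that file and see whether it answers 206 Partial Content or a plain 200. This is a diagnostic sketch, not something from the discussion:

    # "HTTP/1.1 206" on the first line means range requests (and wget -c) work;
    # "HTTP/1.1 200" means the server ignores the Range header and sends the whole file
    curl -s -D - -o /dev/null -r 0-1023 \
      'http://web.archive.org/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi' \
      | head -n 1
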
12:35 🔗 arkiver but man
12:35 🔗 arkiver maybe we should put wikimedia in the wayback machine?
12:36 🔗 Nemo_bis that's a bit generic :) what part of it
12:36 🔗 arkiver hmm
12:37 🔗 arkiver alright, is it ok if we talk about this a little later
12:37 🔗 arkiver lol
12:37 🔗 arkiver doing several things atm
12:37 🔗 arkiver and I want to have a good conversation about it
12:37 🔗 arkiver ok?
12:39 🔗 arkiver till when are you online?
15:04 🔗 chfoo the #btch project is up and running. manual script running: https://github.com/ArchiveTeam/ptch-grab
15:05 🔗 Marcelo Only manual?
15:08 🔗 chfoo i need an admin to add it to projects.json please
15:17 🔗 nico_32 another project ?
15:19 🔗 chfoo yahoo! is shutting down ptch. ~5 days remain.
15:21 🔗 nico_32 74k todo? is that the definitive number?
15:23 🔗 nico_32 chfoo: how many concurrent per ipv4?
15:25 🔗 chfoo nico_32: 74k should be definitive based on the list deathy gave me. i'm not sure how many concurrent threads are ok.
15:26 🔗 chfoo if possible, best advice is to use a sacrificial ip address and let us know.
15:27 🔗 deathy for ptch there was no obvious/visible rate-limiting when I did initial research/API calls.
15:28 🔗 deathy that being said... 2 concurrent is safe...let's at least see how it goes before trying to break it
15:28 🔗 nico_32 so running concurrent=2 on 4 ipv4
15:28 🔗 nico_32 got another dedicated server
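
For anyone joining in, running the grab manually with deathy's suggested concurrency follows the usual seesaw pattern; treat this as a rough sketch and check the ptch-grab README for the authoritative steps (YOURNICK is a placeholder):

    git clone https://github.com/ArchiveTeam/ptch-grab
    cd ptch-grab
    sudo pip install seesaw                      # ArchiveTeam's pipeline runner
    run-pipeline pipeline.py --concurrent 2 YOURNICK
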
15:30 🔗 Marcelo Can I increase upload slots?
15:30 🔗 Marcelo Concurrent uploads
15:34 🔗 nico_32 the upload target is slow
15:34 🔗 nico_32 ~75 kBps here
15:35 🔗 Marcelo 75.98kB/s here
15:39 🔗 nico_32 from Schbirid (who got klined from efnet): "hey, could someone test the speed of my jamendo vorbis album server?"
15:40 🔗 nico_32 from Schbirid (who got klined from efnet): "rsync -avP 151.217.55.80::albums2 ."
15:40 🔗 nico_32 from Schbirid (who got klined from efnet): "if it works, maybe someone could sync from/to fos? albums2 is the first hdd with 2TB"
15:41 🔗 nico_32 from Schbirid (who got klined from efnet): "rsync -avP --dry-run 151.217.55.80::albums2 jamendo-albums/"
15:41 🔗 nico_32 poke SketchCow
15:51 🔗 SketchCow OK.
15:55 🔗 Nemo_bis chfoo: the README doesn't include the instructions added in the last revisions of https://github.com/ArchiveTeam/wretch-grab
15:56 🔗 chfoo Nemo_bis: noted. i'll fix it now
15:56 🔗 Nemo_bis I guess they need to be pushed to the upstream repo?
15:56 🔗 Nemo_bis thanks
15:57 🔗 Nemo_bis I also noticed we still require gnutls-dev[el] and openssl-dev[el]; I had to install them on fedora (this used to be the most common problem, with mobileme)
15:57 🔗 Nemo_bis so maybe that should be added too
16:11 🔗 nico_32 it is openssl-dev or gnutls-dev
16:11 🔗 nico_32 one is enough
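
On Fedora at the time that would have been a yum install; which of the two the build actually needs depends on how the bundled wget gets compiled, so treat this as a guess:

    # one of these should satisfy the build; install whichever the configure step asks for
    sudo yum install openssl-devel
    # or:
    sudo yum install gnutls-devel
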
16:22 🔗 Nemo_bis hmmm
16:24 🔗 Nemo_bis I can't make sense out of my package manager history, oh well
16:25 🔗 joepie91 Nemo_bis: wait, you mean there are people that -can- make sense out of package manager history?
16:25 🔗 joepie91 where do I find these mythical creatures?
16:27 🔗 Nemo_bis :) apper is rather easy to use
16:28 🔗 Nemo_bis but apparently I didn't install the packages I remembered, probably I'm the wrong one ;)
18:22 🔗 wp494 !!
18:22 🔗 wp494 http://www.theverge.com/2013/12/27/5248286/vdio-shut-down-by-rdio
18:25 🔗 yipdw rdio killed the vdio star
19:09 🔗 zenguy_pc how does web.archive.org determine what imgur links they cache
19:09 🔗 zenguy_pc yipdw: lol
20:47 🔗 DFJustin zenguy_pc: I would assume it's just a crapshoot based on what their spiders reach
20:47 🔗 DFJustin so popular images linked from multiple external pages are more likely
