Time |
Nickname |
Message |
01:02
🔗
|
pronoiac |
Nuts. I was looking into a problem from yesterday, and the server now incorrectly believes that one's done - it's tauran, which Wyatt|Wor had problems with. |
07:44
🔗
|
ersi |
Coderjoe: ah, well - I lost the conversation in the backlog so I just thought you were asking what it meant :) |
07:56
🔗
|
Nemo_bis |
does the archive.org flash/javascript interface use chunked uploading? |
11:58
🔗
|
Schbirid |
i can never remember how to redirect stderr to devnull |
11:58
🔗
|
Schbirid |
2>/dev/null |
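[Editor's note: a minimal sketch of the redirect mentioned above. The `noisy` function is a hypothetical stand-in for any command that writes to both stdout and stderr; it is not part of the fileplanet script.]

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
noisy() {
  echo "useful output"
  echo "warning: something noisy" >&2
}

# Discard only stderr (file descriptor 2); stdout is still captured.
quiet_out=$(noisy 2>/dev/null)

# Discard both streams: send stdout to /dev/null, then point stderr at stdout.
both_out=$(noisy >/dev/null 2>&1)
```

The order matters in the second form: `2>&1` has to come after `>/dev/null`, otherwise stderr is duplicated onto the original stdout before stdout is redirected.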
12:01
🔗
|
Schbirid |
https://github.com/SpiritQuaddicted/fileplanet-file-download/blob/master/download_pages_and_files_from_fileplanet.sh is much nicer now |
12:12
🔗
|
Schbirid |
http://www.pastie.org/3867284 |
12:15
🔗
|
Schbirid |
working well |
12:35
🔗
|
ersi |
Schbirid: that's not #AT bizniz |
12:36
🔗
|
ersi |
IMO |
12:46
🔗
|
underscor |
Schbirid: need more people to help download fileplanet? |
12:46
🔗
|
Schbirid |
yes, definitely. i am just deciding on the final packaging then i would have asked |
12:46
🔗
|
Schbirid |
how big can one archive.org item become? |
12:48
🔗
|
ersi |
AFAIK it can be any size |
12:49
🔗
|
ersi |
preferably it should be smaller though |
12:58
🔗
|
Schbirid |
argh, got a bug with 's |
13:04
🔗
|
Schbirid |
i am too dumb to figure out how to remove the last character from a string in bash or gnu coreutils |
13:07
🔗
|
Schbirid |
| rev | cut -c 2- | rev |
13:07
🔗
|
Schbirid |
heh |
13:07
🔗
|
Schbirid |
well, why not |
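[Editor's note: a short sketch of two alternatives to the `rev | cut | rev` pipeline above, assuming the value is a single-line string held in a shell variable. The variable names and the sample filename are made up for illustration.]

```shell
# Sample value with an unwanted trailing single quote (illustrative only).
name="file.zip'"

# POSIX parameter expansion: remove the shortest suffix matching any one character.
trimmed=${name%?}

# sed equivalent: delete the last character of each line.
trimmed_sed=$(printf '%s\n' "$name" | sed 's/.$//')

# The pipeline from the chat: reverse, drop the first character, reverse back.
# Note that rev ships with util-linux rather than coreutils.
trimmed_rev=$(printf '%s\n' "$name" | rev | cut -c 2- | rev)
```

The parameter expansion form avoids spawning any extra processes, which adds up when it runs once per file in a long download loop.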
13:37
🔗
|
underscor |
Schbirid: We aim for 10GB |
13:37
🔗
|
underscor |
bigger than that and you can run into task issues, as there is only ~10GB guaranteed to be free on a datanode drive at any point |
14:23
🔗
|
Schbirid |
hm, anyone able to download http://www.fileplanet.com/52249/download ? |
14:24
🔗
|
Schbirid |
i always get a 403 forbidden |
14:25
🔗
|
DFJustin |
same |
14:26
🔗
|
Schbirid |
please refresh, check the source for the link (grep for default-file-download-link) and try pasting that into your address bar |
14:27
🔗
|
DFJustin |
same |
14:28
🔗
|
Schbirid |
cheers |
14:28
🔗
|
Schbirid |
(i like how they have single quotes in filenames and use single quotes in their javascript as well) |
14:29
🔗
|
DFJustin |
it's available at http://www.gamefront.com/files/13625/GRIST_MILLBY |
14:30
🔗
|
Schbirid |
50000-54999 is 24G already, ugh |
15:16
🔗
|
Schbirid |
20k-30k: ~7-8GB |
15:16
🔗
|
Schbirid |
30k-40k: 10GB |
15:16
🔗
|
Schbirid |
40k-50k: 18G |
15:16
🔗
|
Schbirid |
50k-55k: 25G |
15:17
🔗
|
Schbirid |
i am scared. might mean that we'd need to do 1-2k increments. the end would be at 250k or something |
15:17
🔗
|
Schbirid |
bbl |
15:22
🔗
|
Nemo_bis |
Schbirid, put 5k files per item then |
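[Editor's note: a rough sketch of splitting an ID range into fixed-size blocks, one block per item, as suggested above. `print_ranges` is a hypothetical helper, not something from the fileplanet script, and the block size is only a stand-in: the sizes quoted in the chat suggest later ranges may need much smaller blocks to stay near the 10GB target.]

```shell
# Print "first last" pairs covering start..end in blocks of `step` IDs.
# Arguments: $1 = start ID, $2 = end ID, $3 = block size.
print_ranges() {
  first=$1
  while [ "$first" -le "$2" ]; do
    last=$((first + $3 - 1))
    # Clamp the final block so it does not run past the end ID.
    if [ "$last" -gt "$2" ]; then
      last=$2
    fi
    echo "$first $last"
    first=$((first + $3))
  done
}

# Example: two blocks of 5000 IDs each.
ranges=$(print_ranges 55000 64999 5000)
```

Each printed pair could then be handed to a downloader that takes a start and end ID as its two arguments.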
15:37
🔗
|
chronomex |
don't be scared! |
16:36
🔗
|
codebear |
mobileme news: http://arstechnica.com/apple/news/2012/05/free-20gb-cloud-storage-for-mobileme-subscribers-extended-to-sept-30.ars |
16:40
🔗
|
DFJustin |
http://archive.org/post/419499/chumbycom-is-going-away-request-for-archiving |
16:44
🔗
|
yipdw |
oh neat |
16:44
🔗
|
yipdw |
Github added organization display to user profiles |
16:44
🔗
|
yipdw |
Archive Team needs a snazzy gravatar now |
16:44
🔗
|
yipdw |
maybe we can reuse the unicorn one |
16:46
🔗
|
yipdw |
(for those who don't know: http://archiveteam.org/images/0/05/Rejectedatlogo.jpg) |
16:47
🔗
|
mistym |
I vote yes! |
16:48
🔗
|
mistym |
I still wish Github let you follow organizations. |
17:07
🔗
|
Schbirid |
alright, who wants to download some fileplanet! |
17:07
🔗
|
Schbirid |
right now you will need to tar it manually in the end |
17:08
🔗
|
Schbirid |
i guess we will just upload each tar separately and have someone put them together into a collection? |
17:18
🔗
|
Nemo_bis |
yes |
17:18
🔗
|
* |
Nemo_bis is already downloading thousands of wikis |
17:19
🔗
|
Schbirid |
ok |
17:19
🔗
|
Schbirid |
https://github.com/SpiritQuaddicted/fileplanet-file-download/blob/master/download_pages_and_files_from_fileplanet.sh |
17:20
🔗
|
Schbirid |
run: download_pages_and_files_from_fileplanet.sh 55000 59999 |
17:20
🔗
|
Schbirid |
will be about 30G i guess |
17:28
🔗
|
Schbirid |
registering on the forums is still not possible? do we have a shared account i could use? |
18:10
🔗
|
dnova |
Apple is extending its free storage offer to paid MobileMe subscribers from June 30 to September 30, 2012 |
18:11
🔗
|
yipdw |
http://venturebeat.com/2012/05/06/brazil-facebook-lies/ <-- fucking Black Mirror |
18:16
🔗
|
yipdw |
also, props to whatever generated the URL slug, because it's totally apropos |
18:46
🔗
|
Schbirid |
Nemo_bis: you running it? |
18:47
🔗
|
Nemo_bis |
Schbirid, what? |
18:48
🔗
|
Nemo_bis |
if you mean your script I'm not, as I said I'm already busy with wikis, load was at 60 a few min ago |
18:48
🔗
|
Schbirid |
oh, i guess i misunderstood you |
18:48
🔗
|
Nemo_bis |
7 now but still no disk space :) |
18:48
🔗
|
Schbirid |
heh |
18:48
🔗
|
Schbirid |
nice |
19:59
🔗
|
ersi |
The Swedish site http://www.resdagboken.se is closing down, according to a press release by its owner (the large Norwegian(?) company/media conglomerate Schibsted). The site is a "travel journey diary" for travelers, so it's mostly, if not entirely, user-made content |
20:00
🔗
|
ersi |
Unsure if the content is going to be deleted, but.. if something's on its deathbed, most likely. They've disabled the ability to create new users/logins as well as new 'journey diaries'. But existing diaries can be updated until 15 June 2012 |
20:04
🔗
|
ersi |
There's at least 17 million images and 2 million "journey diaries" from users according to their stats in the press release |
20:05
🔗
|
shaqfu |
Think it'll be better to sweep through now, or wait for last call? |
20:06
🔗
|
ersi |
not sure, but earlier is always better |
20:07
🔗
|
shaqfu |
Might be better to wait a bit - it looks like people are doing final entries now |
20:09
🔗
|
ersi |
hm, true. But starting out finding users' diaries and such might be good |
20:19
🔗
|
shaqfu |
Doesn't look nicely structured, sadly |
20:44
🔗
|
shaqfu |
http://archiveteam.org/index.php?title=Fileplanet |
20:46
🔗
|
shaqfu |
Until account creation's back up, I'll probably give Schbirid my credentials to keep the page count updated |
20:47
🔗
|
shaqfu |
Or it might be better not to, so someone's keeping track of all the downloads |
21:51
🔗
|
underscor |
If someone sees Schbirid, tell him I have a list of all the valid ids |
21:51
🔗
|
underscor |
it's much more efficient than brute-forcing every number between 1 and 220000 |
21:53
🔗
|
shaqfu |
underscor: Spiffy; how many are there? |
21:53
🔗
|
underscor |
wc -l valid |
21:53
🔗
|
underscor |
87190 valid |
21:53
🔗
|
* |
shaqfu updates the page |
21:54
🔗
|
shaqfu |
Any estimates on size? |
21:56
🔗
|
underscor |
nope |
21:56
🔗
|
underscor |
I just extracted them from the sitemap XML files |
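[Editor's note: a minimal sketch of pulling numeric IDs out of sitemap XML and counting them. This is not underscor's actual command. The sample XML, the IDs in it, and the assumption that each `<loc>` URL has the form `http://www.fileplanet.com/<id>/...` with one `<loc>` per line are all illustrative guesses based on the download URL quoted earlier in the log.]

```shell
# Hypothetical sitemap excerpt; the real files and URL layout may differ.
sitemap='<urlset>
<url><loc>http://www.fileplanet.com/52249/download</loc></url>
<url><loc>http://www.fileplanet.com/20001/download</loc></url>
<url><loc>http://www.fileplanet.com/52249/download</loc></url>
</urlset>'

# Keep only the numeric ID from each <loc> line, then sort numerically
# and drop duplicates.
valid=$(printf '%s\n' "$sitemap" \
  | sed -n 's|.*<loc>http://www\.fileplanet\.com/\([0-9][0-9]*\)/.*|\1|p' \
  | sort -n -u)

# Count the IDs; tr strips the padding some wc implementations add.
count=$(printf '%s\n' "$valid" | wc -l | tr -d ' ')
```

A line-oriented `sed` is only safe here because of the one-`<loc>`-per-line assumption; a sitemap written as a single long line would need a real XML parser or a different splitting step.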
21:56
🔗
|
shaqfu |
Gotcha |
22:24
🔗
|
Nemo_bis |
underscor, email him? |
23:56
🔗
|
SketchCow |
HUZZAH ARCHIVETEAM |
23:56
🔗
|
dashcloud |
hello! |
23:59
🔗
|
shaqfu |
Con go well? |