#archiveteam 2014-12-01,Mon

Time Nickname Message
00:18 🔗 Start is now known as StartAway
00:23 🔗 Froggypwn has quit IRC (Quit: ~ Trillian Astra - www.trillian.im ~)
00:24 🔗 dashcloud has quit IRC (Read error: Operation timed out)
00:25 🔗 cf_ has joined #archiveteam
00:28 🔗 dashcloud has joined #archiveteam
00:29 🔗 cf has quit IRC (Read error: Operation timed out)
00:29 🔗 cf_ is now known as cf
00:31 🔗 mistym has quit IRC (Read error: Operation timed out)
00:38 🔗 cf_ has joined #archiveteam
00:45 🔗 cf has quit IRC (Read error: Operation timed out)
00:45 🔗 cf_ is now known as cf
01:01 🔗 Aranje has quit IRC (Read error: Connection reset by peer)
01:02 🔗 Aranje has joined #archiveteam
01:02 🔗 Aranje has quit IRC (Read error: Connection reset by peer)
01:03 🔗 Aranje has joined #archiveteam
01:44 🔗 Ymgve has quit IRC ()
02:01 🔗 Sanqui has quit IRC (Read error: Operation timed out)
02:03 🔗 Sanqui has joined #archiveteam
02:03 🔗 primus104 has quit IRC (Leaving.)
02:06 🔗 Sanqui has quit IRC (Read error: Operation timed out)
02:07 🔗 Sanqui has joined #archiveteam
02:16 🔗 dashcloud has quit IRC (Ping timeout: 272 seconds)
02:21 🔗 dashcloud has joined #archiveteam
02:38 🔗 xtr-201 has quit IRC (Ping timeout: 852 seconds)
02:39 🔗 StartAway is now known as Start
03:04 🔗 xtr-201 has joined #archiveteam
03:06 🔗 Silent700 has joined #archiveteam
03:07 🔗 Silent700 Hello - question re: archive.org's uploader...
03:07 🔗 Silent700 If I upload a .zip of .tif files, will they be processed into a PDF or just the online-viewable format?
03:07 🔗 Silent700 Or nothing at all?
03:11 🔗 xmc it will turn it into every other format you normally see
03:12 🔗 Silent700 so .zip is OK?
03:12 🔗 xmc yes
03:12 🔗 xmc a zip of TIFF files is quite good
03:12 🔗 Silent700 I found an old forum post that said it would not burst the zip, but I figured that was outdated
03:12 🔗 xmc that is outdated.
03:12 🔗 Silent700 then that is what you will get
03:12 🔗 * xmc points to /topic :)
03:12 🔗 xmc IA will get it
03:12 🔗 Silent700 then that is what /they/ will get
03:13 🔗 Silent700 and you, by way of them :)
03:13 🔗 xmc fair enough
03:42 🔗 Silent700 does the uploader support multiple files at once _and_ create an entry for each one? Or must they all be contained under one entry?
03:42 🔗 Silent700 for example, issues of a magazine
03:47 🔗 xmc each issue should go in a different item
03:47 🔗 Silent700 And the uploader tool will do that, or the uploader (me) must do that?
03:50 🔗 BlueMaxim has quit IRC (Quit: Leaving)
04:07 🔗 Start has quit IRC (Read error: Operation timed out)
04:22 🔗 Start has joined #archiveteam
04:37 🔗 dx has quit IRC (Remote host closed the connection)
04:37 🔗 dx has joined #archiveteam
04:47 🔗 aaaaaaaaa has quit IRC (Leaving)
05:37 🔗 Silent700 has left
05:42 🔗 Start http://wayback.archive.org/web/20140331060749/http://ex.fm/api
05:43 🔗 Start ex.fm's api still works with archive.ex.fm
05:45 🔗 Start of course it could also be scraped through http://archive.ex.fm/siteindex.xml.gz
05:47 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
05:48 🔗 dashcloud has joined #archiveteam
05:54 🔗 mutoso Is there a way to tell wget to delete the non-warc'd parts of a download and just leave the generated warc file?
05:55 🔗 yipdw mutoso: wget-warc will write non-warc content no matter what, but you can limit it with --output-document and --truncate-output
05:56 🔗 yipdw if this is unacceptable, consider wpull, which implements most of the same wget options
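A minimal sketch of the invocation yipdw describes, written as a Python helper that only builds the argument list. The `--output-document` and `--truncate-output` options are the ones named in the message above; `--warc-file` is wget's option for naming the WARC output. The helper name, the scratch file name `wget.tmp`, and the example URL are illustrative assumptions, and `--truncate-output` is assumed to be supported by the wget build in use.

```python
def warc_only_wget_args(url, warc_name, scratch_path="wget.tmp"):
    """Build an argv list for a WARC-writing wget run.

    All non-WARC output is sent to a single scratch file, which is
    truncated as the crawl proceeds, so the WARC is the only output
    that grows. scratch_path is a hypothetical name.
    """
    return [
        "wget",
        "--warc-file=" + warc_name,          # WARC file name prefix
        "--output-document=" + scratch_path,  # one file for all page bodies
        "--truncate-output",                  # flag named in the log; assumed available
        url,
    ]


if __name__ == "__main__":
    print(" ".join(warc_only_wget_args("http://example.com/", "example")))
```

The scratch file can be deleted once the run finishes; the list can be passed to `subprocess.run` unchanged.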
05:59 🔗 mistym has joined #archiveteam
06:07 🔗 Start i've made more detailed notes on sites shutting down: http://paste.archivingyoursh.it/jotejecagi.vhdl
06:08 🔗 dashcloud has quit IRC (Read error: Operation timed out)
06:13 🔗 dashcloud has joined #archiveteam
06:24 🔗 godane good news on the KBS News Today archiving
06:24 🔗 godane i found the podcast paths:
06:24 🔗 godane http://newsdown.kbs.gscdn.com/news_today/2013/01/04/10.mp4
06:24 🔗 godane working one: http://newsdown.kbs.gscdn.com/news_today/2013/08/01/10.mp4
06:25 🔗 godane the august 2013 videos are not even on youtube
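The two paths above suggest a date-based layout, `news_today/YYYY/MM/DD/10.mp4`. A rough sketch of generating candidate URLs for a date range follows. It assumes the file name `10.mp4` is constant across days, which is inferred from only two examples, and it makes no claim that any given date actually has an episode (only the second URL above is described as working).

```python
from datetime import date, timedelta

BASE = "http://newsdown.kbs.gscdn.com/news_today"


def candidate_urls(start, end, filename="10.mp4"):
    """Yield one candidate URL per calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        # date objects accept strftime codes in format specs
        yield f"{BASE}/{day:%Y/%m/%d}/{filename}"
        day += timedelta(days=1)


if __name__ == "__main__":
    for url in candidate_urls(date(2013, 8, 1), date(2013, 8, 3)):
        print(url)
```

Each candidate would still need to be checked (for example with a HEAD request) before being queued for download.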
06:37 🔗 arkiver Start: I didn't know about ziplist... :/
06:37 🔗 arkiver So the deadline is December 10th. I'll make sure we have it running today or tomorrow
06:37 🔗 Start thanks
06:38 🔗 Start i wasn't aware of it until today either
06:38 🔗 Start thankfully it seems pretty straightforward to grab
06:39 🔗 arkiver SketchCow: how are you on space on FOS? I'll start Viddy this afternoon with FOS, but will try to have someone else volunteer to take the files
06:40 🔗 arkiver But first everything is going into FOS
07:04 🔗 SketchCow For which... viddy or whatever?
07:05 🔗 SketchCow FOS is overburdened, and will be for at least a week.
07:05 🔗 SketchCow I mean, that doesn't seem to be stopping anyone, but there we are.
07:05 🔗 arkiver viddy
07:06 🔗 SketchCow I mean, nothing's going to change soon - FOS is under ridiculous strain.
07:06 🔗 SketchCow Right now FOS has 2.3tb free but that seems to change in no time. Like I said, someone pumped 1tb of Archivebot joy into it in 72 hours.
07:07 🔗 arkiver We currently have only 15 days left for viddy, so we really have to start. Since FOS is overburdened, I'll try to have another upload target besides FOS as soon as possible, but until then I want to go with FOS.
07:07 🔗 SketchCow If you do viddy, don't do Halo.
07:07 🔗 SketchCow That's all.
07:08 🔗 yipdw ask Kenshin if you can reuse part of tank
07:08 🔗 yipdw [archiveteam@tank ~]$ df -h /home/archiveteam/
07:08 🔗 yipdw Filesystem Size Used Avail Capacity Mounted on
07:08 🔗 yipdw zfs/archiveteam 20T 842G 19T 4% /home/archiveteam
07:08 🔗 yipdw I will say, though, that that machine gets pretty busy during twitpic
07:08 🔗 yipdw also part of life is learning to deal with losing a few times
07:09 🔗 SketchCow Well, Halo is also the lowest of low priority high-resource projects, frankly.
07:09 🔗 SketchCow Having a millions-of-games sample of Halo games, which we now already have, is quite good.
07:09 🔗 SketchCow Having a comprehensive collection of every game played on Halo 2-4 is less good
07:09 🔗 SketchCow or important.
07:11 🔗 Start arkiver: regarding ziplist, the highest valid recipe ID i could find is http://www.ziplist.com/recipes/3320393
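Given a highest known recipe ID, the ID space can be split into fixed-size ranges for workers to claim. The sketch below uses the ID from the message above as the upper bound. The lower bound of 1, the batch size of 1000, and the function names are assumptions for illustration, not details of any actual ziplist grab.

```python
MAX_RECIPE_ID = 3320393  # highest valid ID reported in the log


def recipe_url(recipe_id):
    """Return the recipe page URL for a numeric ID."""
    return f"http://www.ziplist.com/recipes/{recipe_id}"


def id_batches(first_id=1, last_id=MAX_RECIPE_ID, batch_size=1000):
    """Yield (start, end) ID pairs, inclusive, covering first_id..last_id."""
    for start in range(first_id, last_id + 1, batch_size):
        yield start, min(start + batch_size - 1, last_id)


if __name__ == "__main__":
    batches = list(id_batches())
    print(len(batches), batches[0], batches[-1])
    print(recipe_url(MAX_RECIPE_ID))
```

Not every ID in the range is necessarily valid, so a crawler would have to tolerate missing pages.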
07:19 🔗 godane i found about 1 hour of video that you hate to archive: http://video-hot.nowcom.gscdn.com/mvod/20140424/321/85542321_1.mp4
07:20 🔗 godane their folders are public so i was able to find it
07:21 🔗 godane i sent one of these gscdn.com urls to archivebot because it has 2008 starcraft tournament videos
07:21 🔗 godane from South Korea
07:23 🔗 balrog SketchCow: I bet if you made a graph of archivebot uploads over time it would be *very* up-and-down
07:25 🔗 godane SketchCow: i found a web stream of WCS Season 1 GSL: http://ongameimg.gscdn.com/web/WCS%20Season1%20GSL/05-14/05.14%20WCS%20FULL%20Version.mp4
07:25 🔗 godane over 3 hours long
07:27 🔗 godane i may have to look at ongameimg.gscdn.com more
07:27 🔗 balrog SketchCow: pretty much a roller coaster
07:27 🔗 balrog that would be interesting data.
07:27 🔗 godane looks like some folders are open but folders going to that folder are not
07:27 🔗 arkiver SketchCow: I'll do that
07:28 🔗 arkiver yipdw: I'll do that also
07:29 🔗 Start is now known as StartAway
07:38 🔗 primus104 has joined #archiveteam
07:50 🔗 StartAway is now known as Start
07:59 🔗 Start is now known as StartAway
08:00 🔗 SketchCow It finished. Right now, I have 2.6tb of Halo backed up
08:00 🔗 SketchCow Buffered. Waiting to be uploaded.
08:07 🔗 mistym has quit IRC (Remote host closed the connection)
08:47 🔗 midas arkiver: getting another 8TB box ready for you
10:15 🔗 Nemo_bis Silent is gone, but I think a good guide is https://en.wikisource.org/wiki/Help:DjVu_files#The_Internet_Archive (yes, I'm biased)
10:20 🔗 BlueMaxim has joined #archiveteam
10:20 🔗 schbirid has joined #archiveteam
10:26 🔗 signius_ has quit IRC (Read error: Operation timed out)
10:39 🔗 signius_ has joined #archiveteam
10:43 🔗 BlueMaxim has quit IRC (Quit: Leaving)
11:10 🔗 APerti has quit IRC (Ping timeout: 378 seconds)
12:29 🔗 LordNigh2 has joined #archiveteam
12:31 🔗 Lord_Nigh has quit IRC (Ping timeout: 272 seconds)
12:31 🔗 LordNigh2 is now known as Lord_Nigh
12:35 🔗 slipstrea is now known as raylee
12:56 🔗 cf has quit IRC (Quit: cf)
12:57 🔗 Froggypwn has joined #archiveteam
13:01 🔗 Ymgve has joined #archiveteam
13:42 🔗 cf has joined #archiveteam
13:46 🔗 xk_id has joined #archiveteam
14:03 🔗 will__ Sorry I'm pretty new to all of this - what does FOS actually stand for?
14:04 🔗 Nemo_bis will__: fortress of solitude, it's just a name
14:05 🔗 will__ Ah right thanks Nemo_bis
14:16 🔗 Smiley "fields of saves"
14:16 🔗 Smiley as in build it, and they will come
14:16 🔗 Smiley :DDDDD
14:16 🔗 Smiley He built it, we came
14:16 🔗 Smiley we saved
14:16 🔗 Smiley we broke things...
14:17 🔗 StartAway is now known as Start
14:37 🔗 REiN^ has joined #archiveteam
14:41 🔗 Start has quit IRC (Remote host closed the connection)
14:43 🔗 cf has quit IRC (Quit: cf)
15:09 🔗 BiggieJon has joined #archiveteam
15:19 🔗 aaaaaaaaa has joined #archiveteam
15:40 🔗 mistym has joined #archiveteam
15:40 🔗 cf has joined #archiveteam
15:43 🔗 arkiver Start: how do you get that each item of ziplist is ~9 MB? http://paste.archivingyoursh.it/jotejecagi.vhdl
15:43 🔗 mistym has quit IRC (Remote host closed the connection)
15:57 🔗 dashcloud has quit IRC (Read error: Operation timed out)
16:01 🔗 mistym has joined #archiveteam
16:04 🔗 dashcloud has joined #archiveteam
16:32 🔗 Froggypwn has quit IRC (Quit: ~ Trillian Astra - www.trillian.im ~)
16:36 🔗 thechip has joined #archiveteam
16:38 🔗 cf has quit IRC (Quit: cf)
16:54 🔗 joe_ has joined #archiveteam
16:55 🔗 joe_ hey, not sure where to direct this request, but would it be possible to have https://archive.org point to https://web.archive.org in the search box (rather than https:// pointing to the unencrypted http://web.archive.org) ?
17:01 🔗 DFJustin direct it to info@archive.org
17:02 🔗 okeuday has quit IRC (Read error: Operation timed out)
17:03 🔗 joe_ don't use email :(
17:08 🔗 aaaaaaaaa has quit IRC (Read error: Operation timed out)
17:15 🔗 mistym has quit IRC (Remote host closed the connection)
17:16 🔗 aaaaaaaaa has joined #archiveteam
17:19 🔗 okeuday has joined #archiveteam
17:19 🔗 Start has joined #archiveteam
17:25 🔗 the_fox Oh yeah, I was wondering about that. Sort of bothered me that the login page is apparently unencrypted.
17:29 🔗 the_fox First you'd probably need a valid certificate though. archiveteam.org's is self signed. And it's mismatched. And it's expired.
17:29 🔗 the_fox Maybe I should go ahead and send that email.
17:29 🔗 joe_ <3
17:29 🔗 joe_ thank you, then I'll be on my way
17:29 🔗 joe_ has left
17:31 🔗 primus104 has quit IRC (Leaving.)
17:43 🔗 SmileyG has joined #archiveteam
17:44 🔗 the_fox E-mail sent.
17:46 🔗 Smiley has quit IRC (Read error: Operation timed out)
17:46 🔗 ivan` has quit IRC (Ping timeout: 248 seconds)
17:48 🔗 ruukasu has quit IRC (Quit: WeeChat 1.0.1)
17:48 🔗 phuzion has quit IRC (Read error: Connection reset by peer)
17:48 🔗 Start has quit IRC (Read error: Connection reset by peer)
17:48 🔗 Sanqui has quit IRC (Ping timeout: 248 seconds)
17:48 🔗 Start has joined #archiveteam
17:48 🔗 Sanqui has joined #archiveteam
17:48 🔗 phuzion has joined #archiveteam
17:50 🔗 ivan` has joined #archiveteam
17:50 🔗 Start has quit IRC (Read error: Connection reset by peer)
17:51 🔗 todrobbin has joined #archiveteam
17:58 🔗 Start has joined #archiveteam
17:58 🔗 Start i've created a wiki page for ziplist
18:02 🔗 Nemo_bis Did someone say wiki
18:02 🔗 ruukasu has joined #archiveteam
18:07 🔗 APerti has joined #archiveteam
18:23 🔗 kyan_ has joined #archiveteam
18:31 🔗 ruukasu has quit IRC (Quit: WeeChat 1.0.1)
18:32 🔗 ruukasu has joined #archiveteam
18:35 🔗 SketchCow godane: There's no extant Internet Archive ERIC Archive, right?
18:36 🔗 godane from what i know
18:36 🔗 SketchCow OK.
18:37 🔗 SketchCow http://www.calpro-online.org/ERIC/index.asp
18:38 🔗 rejon has quit IRC (Ping timeout: 480 seconds)
18:41 🔗 SketchCow New archive made, I'm going to pump the 55,539 items into it.
18:42 🔗 godane ok
18:47 🔗 Start has quit IRC (Ping timeout: 252 seconds)
18:52 🔗 kyan_ is now known as kyan
18:53 🔗 godane so now i think i can get 2009 videos of kbs news today
18:53 🔗 godane what the hell
18:54 🔗 godane i just add 700k/10.mp4 to the end of the date path
18:54 🔗 godane its my best guess if all the old asf files are now mp4
19:00 🔗 cf has joined #archiveteam
19:14 🔗 SketchCow Ha ha, they're swapping over the ERICs, but it's not enjoying the process.
19:18 🔗 cf has quit IRC (Quit: cf)
19:19 🔗 SketchCow I'll hit the rest of your stuff when it's done with this, godane.
19:20 🔗 godane ok
19:21 🔗 ruukasu has quit IRC (Quit: WeeChat 1.0.1)
19:21 🔗 ruukasu has joined #archiveteam
19:22 🔗 godane SketchCow: looks like i will be able to get a good chunk of KBS News Today
19:22 🔗 godane i have just started downloading the 20080729 episode of it
19:23 🔗 SmileyG How did the metadata for the arcade go?
19:25 🔗 SketchCow Which what.
19:25 🔗 SketchCow Like, we still need a lot.
19:25 🔗 SketchCow Some arcade games are becoming unplayable, but not surprisingly the more obscure ones that REALLY need metadata are still without good descriptions.
19:28 🔗 primus104 has joined #archiveteam
19:57 🔗 xmc has quit IRC (Quit: Lost terminal)
19:58 🔗 chronomex has joined #archiveteam
20:04 🔗 cf has joined #archiveteam
20:09 🔗 SketchCow I'm now aggressively moving all my "project" items to one drive and all the "incoming buffer" to another, although I doubt this will change much.
20:13 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
20:14 🔗 dashcloud has joined #archiveteam
20:15 🔗 mutoso has quit IRC (Read error: Operation timed out)
20:28 🔗 SmileyG for a second then i was worried you meant OneDrive
20:33 🔗 Start has joined #archiveteam
20:37 🔗 mistym has joined #archiveteam
20:50 🔗 arkiver SketchCow: good luck with getting all those items up into IA.
20:51 🔗 arkiver For now I'll keep Halo paused and Viddy is going to midas
20:51 🔗 arkiver So FOS can totally focus on getting everything up and be ready for next projects
20:55 🔗 arkiver ------------------------------------------
20:55 🔗 arkiver Viddy has started
20:55 🔗 arkiver 15 days left to download all the videos
20:55 🔗 arkiver Join our newest project in: #viddiot
20:55 🔗 arkiver ------------------------------------------
21:00 🔗 nblr_ is now known as nblr
21:00 🔗 chronomex is now known as xmc
21:01 🔗 cf has quit IRC (cf)
21:04 🔗 mistym has quit IRC (Remote host closed the connection)
21:07 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
21:07 🔗 Start arkiver: should i add ziplist to the upcoming projects list?
21:09 🔗 dashcloud has joined #archiveteam
21:10 🔗 human39 has joined #archiveteam
21:12 🔗 arkiver Start: yeah, sure
21:12 🔗 Start ok
21:13 🔗 arkiver Start: I'll start working on it now, will keep you informed
21:14 🔗 Start alright
21:14 🔗 arkiver Maybe it is time to create a channel for ziplist
21:14 🔗 Start #zipyourlips
21:15 🔗 arkiver sounds good. you decide on the channel name ;)
21:23 🔗 Start has quit IRC (Quit: Disconnected.)
21:25 🔗 mistym has joined #archiveteam
21:33 🔗 cf has joined #archiveteam
21:35 🔗 todrobbin has quit IRC (Quit: todrobbin)
22:02 🔗 cf has quit IRC (Quit: cf)
22:13 🔗 schbirid has quit IRC (Leaving)
23:17 🔗 Start has joined #archiveteam
