[00:00] *** dashcloud has quit IRC (Ping timeout: 265 seconds)
[00:06] *** Nertsy has quit IRC (Read error: Connection reset by peer)
[00:06] *** dashcloud has joined #archiveteam
[00:11] *** Nertsy has joined #archiveteam
[00:28] *** schbirid has quit IRC (Leaving)
[01:11] *** dashcloud has quit IRC (Read error: Operation timed out)
[01:12] *** Start has joined #archiveteam
[01:15] *** dashcloud has joined #archiveteam
[01:39] *** primus104 has quit IRC (Leaving.)
[01:45] *** BiggieJo1 has joined #archiveteam
[01:48] *** BiggieJon has quit IRC (Read error: Operation timed out)
[01:49] *** Start has quit IRC (Ping timeout: 492 seconds)
[02:11] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[02:11] *** ruukasu has joined #archiveteam
[02:42] *** signius has quit IRC (Ping timeout: 480 seconds)
[02:51] *** signius has joined #archiveteam
[02:56] *** brayden has quit IRC (Ping timeout: 606 seconds)
[02:59] *** brayden has joined #archiveteam
[03:03] *** mistym has quit IRC (Remote host closed the connection)
[04:17] *** Nertsy has quit IRC (Read error: Connection reset by peer)
[04:20] *** Nertsy has joined #archiveteam
[05:01] *** aaaaaaaaa has quit IRC (Leaving)
[05:06] *** Swizzle has quit IRC (Quit: HydraIRC -> http://www.hydrairc.com <- Would you like to know more?)
[05:52] *** dashcloud has quit IRC (Read error: Operation timed out)
[05:56] *** dashcloud has joined #archiveteam
[06:00] *** aschmitz has joined #archiveteam
[06:31] *** Nertsy has quit IRC (Remote host closed the connection)
[06:31] *** Nertsy has joined #archiveteam
[06:52] *** mistym has joined #archiveteam
[07:10] *** rejon has quit IRC (Read error: Operation timed out)
[07:28] *** dashcloud has quit IRC (Read error: Operation timed out)
[07:32] *** dashcloud has joined #archiveteam
[07:37] *** primus104 has joined #archiveteam
[08:17] *** primus104 has quit IRC (Leaving.)
[08:32] *** Ymgve has joined #archiveteam
[08:51] *** indigo_ has quit IRC (Remote host closed the connection)
[08:55] *** brayden has quit IRC (Quit: Leaving)
[09:14] *** brayden has joined #archiveteam
[09:46] *** schbirid has joined #archiveteam
[10:24] *** mistym has quit IRC (Remote host closed the connection)
[10:24] *** Ymgve__ has joined #archiveteam
[10:32] *** Ymgve has quit IRC (Ping timeout: 512 seconds)
[10:35] *** primus104 has joined #archiveteam
[10:47] So I have my two week holiday!
[10:47] Will be working on the upcoming projects for the warrior
[10:48] SketchCow: can the halo project start again?
[10:49] *** primus104 has quit IRC (Leaving.)
[10:57] *** APerti has quit IRC (Ping timeout: 370 seconds)
[11:44] *** primus104 has joined #archiveteam
[12:10] *** BlueMaxim has quit IRC (Quit: Leaving)
[13:54] *** Nertsy` has joined #archiveteam
[13:54] *** Nertsy has quit IRC (Read error: Connection reset by peer)
[14:02] *** Nemo_bis has quit IRC (Remote host closed the connection)
[14:08] *** hive-mind has quit IRC (Ping timeout: 272 seconds)
[14:15] *** hive-mind has joined #archiveteam
[14:19] *** Daloader_ has joined #archiveteam
[14:21] *** T31M has joined #archiveteam
[14:24] *** Nertsy` has quit IRC (Remote host closed the connection)
[14:25] *** Nertsy has joined #archiveteam
[14:25] *** chazchaz has quit IRC (Read error: Connection reset by peer)
[14:25] *** chazchaz has joined #archiveteam
[14:25] *** Laverne has quit IRC (Ping timeout: 369 seconds)
[14:26] *** Laverne has joined #archiveteam
[14:26] *** T31m_ has quit IRC (Read error: Operation timed out)
[14:31] *** chazchaz has quit IRC (Remote host closed the connection)
[14:32] *** T31M has quit IRC (Read error: Operation timed out)
[14:32] *** Daloader_ has quit IRC (Read error: Operation timed out)
[14:36] *** chazchaz has joined #archiveteam
[14:40] *** smither has joined #archiveteam
[14:40] hi there
[14:40] I’ve been trying to do a grab of cbc.ca/Q
[14:40] but something prevents 
wget from retrieving more than one page
[14:41] I’m using user-agent=“not Google” but that ain’t tricking the machine
[14:41] *** Daloader_ has joined #archiveteam
[14:42] (for background, CBC is deleting some of its archives because one of their anchors turned out to be a rapist. But it’s problematic because they’re erasing a lot of info for journalists)
[14:42] http://www.huffingtonpost.ca/2014/12/17/ghomeshi-q-archives_n_6340882.html
[14:46] *** Jonimus has quit IRC (Excess Flood)
[14:47] *** Jonimus has joined #archiveteam
[14:48] *** BiggieJo1 is now known as BiggieJ
[14:49] *** Jonimus has quit IRC (Excess Flood)
[14:50] *** Jonimus has joined #archiveteam
[14:51] *** eprillios has quit IRC (Ping timeout: 369 seconds)
[14:51] *** eprillios has joined #archiveteam
[14:52] smither: looks like the podcast rss was archived very often at least
[14:52] https://web.archive.org/web/*/http://www.cbc.ca/podcasting/includes/qpodcast.xml
[14:52] so it should be fine ?
[14:53] no
[14:53] i'm grabbing the mp3s right now
[14:54] also from what i can tell 2011 mp3 urls don't work anymore
[14:58] any idea why my wget didn’t work ?
[14:58] I used wget -mc --no-parent --no-clobber --adjust-extension --user-agent="not Google" --convert-links --page-requisites cbc.ca/q
[15:02] *** T31m_ has joined #archiveteam
[15:04] *** ohhdemgir has quit IRC (Read error: Operation timed out)
[15:04] *** brayden has quit IRC (Read error: Operation timed out)
[15:05] *** Nertsy has quit IRC (Read error: Connection reset by peer)
[15:05] i tried my own way and it will not mirror either
[15:05] wget --mirror cbc.ca/q -U "firefox" -e robots=off --warc-file=cbc-q --warc-cdx -E -o wget.log
[15:06] *** ohhdemgir has joined #archiveteam
[15:07] *** Nertsy has joined #archiveteam
[15:11] *** zenguy_pc has quit IRC (Excess Flood)
[15:12] so it’s not the robot?
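The log never establishes why either wget run stopped after one page. As a starting point for further testing, here is a minimal Python sketch that assembles the second command from the log (mirror mode, robots.txt ignored, WARC output) as an argument list. The function name and the `extra_domains` parameter are hypothetical. The optional `--span-hosts`/`--domains` pair rests on an untested assumption: that the start URL redirects to another host such as www.cbc.ca, which recursive wget would not follow by default. Nothing in the log confirms that this is the cause.

```python
import shlex


def build_wget_mirror_command(start_url, warc_name, user_agent="firefox",
                              extra_domains=None):
    """Return a wget argument list for a recursive grab that also writes a WARC.

    All flags are standard GNU wget options; the first group mirrors the
    second command quoted in the log.
    """
    cmd = [
        "wget",
        "--mirror",                      # recursive, infinite depth, timestamping
        "--adjust-extension",            # same as -E in the log
        "-e", "robots=off",              # ignore robots.txt, as tried in the log
        f"--user-agent={user_agent}",
        f"--warc-file={warc_name}",
        "--warc-cdx",
        "-o", "wget.log",
    ]
    if extra_domains:
        # ASSUMPTION (not confirmed in the log): the site redirects to another
        # host, so recursion needs permission to leave the starting host.
        # --domains keeps that host spanning limited to the listed domains.
        cmd += ["--span-hosts", "--domains=" + ",".join(extra_domains)]
    cmd.append(start_url)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_wget_mirror_command(
        "cbc.ca/q", "cbc-q", extra_domains=["cbc.ca"])))
```

Comparing wget.log from a run with and without the extra domain list would show whether host spanning makes any difference.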
[15:12] *** zenguy_pc has joined #archiveteam
[15:14] *** Daloader_ has quit IRC (Read error: Operation timed out)
[15:15] *** brayden has joined #archiveteam
[15:23] *** Nemo_bis has joined #archiveteam
[15:27] *** T31M has joined #archiveteam
[15:29] *** primus104 has quit IRC (Leaving.)
[15:31] *** Nertsy has quit IRC (Remote host closed the connection)
[15:31] *** Nertsy has joined #archiveteam
[15:34] *** T31m_ has quit IRC (Read error: Operation timed out)
[15:39] am i currently the only one mirroring wallbase?
[15:39] *** smither has quit IRC (smither)
[15:40] *** Nertsy has quit IRC (Remote host closed the connection)
[15:41] i expect the server to go down beginning 2015, can someone please be so kind and mirror it and put the mirror in the wallbase mirror list?
[15:41] *** Nertsy has joined #archiveteam
[15:48] *** Nertsy` has joined #archiveteam
[15:48] *** Nertsy has quit IRC (Read error: Connection reset by peer)
[15:56] *** goekesmi has quit IRC (Ping timeout: 369 seconds)
[16:04] *** goekesmi has joined #archiveteam
[16:05] Fusl: has the site been taken down? officially propose to push to IA?
[16:05] *** aaaaaaaaa has joined #archiveteam
[16:07] *** Daloader_ has joined #archiveteam
[16:07] Fusl: yeah, sure
[16:08] the site has not been taken down
[16:08] but the host node where i'm hosting that entire thing on (it's about 1.3TB huge) will be cancelled
[16:08] because of the lack of money
[16:08] Fusl: you're talking about this right? http://archive_wallbase.cc.mirror.fuslvz.ws/
[16:08] yes, but there is rsync on this mirror
[16:09] rsync://mirror.fuslvz.ws/archive_wallbase.cc/
[16:10] *** goekesmi has quit IRC (Read error: Connection reset by peer)
[16:10] *** indigo_ has joined #archiveteam
[16:13] *** T31M has quit IRC (Read error: Operation timed out)
[16:14] *** goekesmi has joined #archiveteam
[16:27] iomart?
[16:28] arkiver: do you have anywhere i can dump the halo stuff i was holding while FOS was down?
[16:28] *** T31M has joined #archiveteam
[16:28] i have qwiki with me too
[16:28] halo: https://archive.org/details/archiveteam_halo
[16:28] if u can get those off me i'll have space for Fusl
[16:28] But SketchCow needs to give you access to upload to there
[16:28] arkiver: rsync target
[16:28] i'm holding rsync data only
[16:29] ah ok
[16:29] we might be able to use FOS's rsync, but I'm not sure if we can already start uploading to that one
[16:29] will need to ask SketchCow i guess
[16:29] what about qwiki?
[16:29] If SketchCow thinks FOS is fine again, we can move your stuff to FOS
[16:29] it's not a lot, about 700MB
[16:30] qwiki the same
[16:30] We'll have to wait for what SketchCow says
[16:30] k. if he gives the go ahead then i'll have 1.5T for Fusl's stuff
[16:30] yes
[16:30] alternatively, fusl push straight to IA if it gets approved
[16:31] how much of halo do you have?
[16:31] since the website is pretty much dead
[16:31] Kenshin: if you explain how, i can do that :)
[16:31] arkiver: 1.7T
[16:31] Kenshin: ok, that'd be fine
[16:31] Fusl: no idea. i've always left uploading to yip
[16:32] hm
[16:32] arkiver: what would be the best way to push 1.3t of website data to IA?
[16:34] Kenshin: megawarc it and upload it to the collection, with https://pypi.python.org/pypi/internetarchive or https://github.com/kngenie/ias3upload
[16:34] Fusl: I'd suggest using one of the above tools ^ to upload all the stuff to IA
[16:38] *** primus104 has joined #archiveteam
[16:38] *** T31m_ has joined #archiveteam
[16:39] the latter one, what .csv files do i need exactly?
[16:39] or don't i need them?
[16:41] *** Daloader_ has quit IRC (Read error: Operation timed out)
[16:42] *** Daloader_ has joined #archiveteam
[16:44] *** w0rp has quit IRC (Ping timeout: 1221 seconds)
[16:45] *** w0rp has joined #archiveteam
[16:45] *** dashcloud has quit IRC (Read error: Operation timed out)
[16:45] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[16:45] *** ruukasu has joined #archiveteam
[16:46] *** T31M has quit IRC (Read error: Operation timed out)
[16:48] Fusl: in the csv file you'll write the information for your item, like the identifier, title, description, tags, etc. https://github.com/kngenie/ias3upload/blob/master/metadata.csv
[16:48] *** dashcloud has joined #archiveteam
[16:50] *** T31m_ has quit IRC (Read error: Operation timed out)
[16:50] unfortunately too much work at the moment
[16:50] i have to move other stuff :/
[16:54] *** primus104 has quit IRC (Leaving.)
[16:57] Fusl: I'm mirroring parts of it now
[16:57] will get folders with images up in IA
[16:57] arkiver: if you want, i can throw your ssh key in the server so you can put it up on IA directly from there...?
[16:58] nah, it's going fine this way
[16:58] k
[16:59] I'll put them in separate items for each folder. So the images from http://archive_wallbase.cc.mirror.fuslvz.ws/siterip/images/0000000/ will have the collection wallbase.cc-rip-0000000
[17:11] Fusl: test item: https://archive.org/details/test_wallbase.cc-rip-0000000
[17:11] looks good? (still uploading, not derived yet)
[17:12] neat
[17:14] Fusl: what's the full size of everything in this directory? http://archive_wallbase.cc.mirror.fuslvz.ws/siterip/images/
[17:14] if not too big I'll put everything in one item
[17:15] calculating ...
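The one-item-per-folder scheme described at 16:59 can be sketched in Python with the `internetarchive` package linked earlier in the log. This is an illustration under stated assumptions, not the script arkiver actually ran: both function names are hypothetical, the `mediatype` value is a guess, and the upload expects credentials to have been set up beforehand with `ia configure`.

```python
from urllib.parse import urlparse


def item_identifier_for_folder(folder_url, prefix="wallbase.cc-rip-"):
    """Derive an item identifier from the last path segment of a folder URL.

    The prefix follows the naming quoted in the log (wallbase.cc-rip-0000000).
    """
    folder = urlparse(folder_url).path.rstrip("/").rsplit("/", 1)[-1]
    return prefix + folder


def upload_folder(identifier, file_paths, title):
    """Upload local files into one archive.org item (hypothetical helper)."""
    # Imported lazily so the naming helper above works even when the
    # internetarchive package is not installed.
    from internetarchive import upload

    # Title is one of the fields the log lists for the ias3upload CSV;
    # the mediatype value is an assumption made for this sketch.
    metadata = {"title": title, "mediatype": "image"}
    return upload(identifier, files=file_paths, metadata=metadata)
```

For example, `item_identifier_for_folder("http://archive_wallbase.cc.mirror.fuslvz.ws/siterip/images/0000000/")` yields the identifier for the first image folder, which can then be passed to `upload_folder` along with the local file paths.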
[17:15] 1.2T
[17:50] *** rejon has joined #archiveteam
[18:06] *** mistym has joined #archiveteam
[18:13] *** bsmith093 has quit IRC (Read error: Operation timed out)
[18:15] *** db48x has quit IRC (Ping timeout: 258 seconds)
[18:27] *** bsmith093 has joined #archiveteam
[18:43] *** APerti has joined #archiveteam
[19:49] *** okeuday has joined #archiveteam
[20:03] What
[20:03] Kenshin: FOS, go ahead.
[20:03] Sorry for lack of response, maniacs
[20:04] /join #aside
[20:05] *** rejon has quit IRC (Read error: Operation timed out)
[20:17] How did I miss #roon
[20:19] *** thechip_ has quit IRC (Read error: Operation timed out)
[20:22] Anyway, I guess we're doing roon. I'm doing the groupings now.
[20:24] *** primus104 has joined #archiveteam
[20:31] *** fluff is now known as fluff_
[20:34] I'm in #roon on irc.efnet.net and no one is there
[20:35] #rooined
[20:41] SketchCow: do you have the rsync urls for qwiki and halo?
[20:47] chfoo does
[20:48] *** ohhdemgir has quit IRC (Read error: Operation timed out)
[20:49] *** ohhdemgir has joined #archiveteam
[20:54] Kenshin: for fos i assume: rsync://fos.textfiles.com/chfoo/warrior/qwiki/:downloader/ & rsync://fos.textfiles.com/chfoo/warrior/halo/:downloader/
[20:56] cool thanks, much appreciated
[20:56] replace :downloader with a nickname
[20:56] *** mistym has quit IRC (Remote host closed the connection)
[21:11] *** BlueMaxim has joined #archiveteam
[21:29] *** wp494 has quit IRC ()
[21:31] *** wp494 has joined #archiveteam
[21:37] *** bzc6p has joined #archiveteam
[21:44] chfoo, arkiver: In a template I made on the wiki, I included "some additional information" I miss from the script documentation on GitHub. Could you please include those pieces of information the next time you create projects on GitHub?
[21:45] (I mean the missing ones, about the concurrency, stopping the script, and what to do when outdated – even rephrased if necessary.) 
I think these are important for newcomers, but if they were on github, I could remove them from the wiki. Thank you.
[21:46] the template we're using is located at https://github.com/ArchiveTeam/standalone-readme-template
[21:48] Could you please expand that then? I don't want to create a github account just for that.
[21:49] (Of course if you too find it a good idea.)
[21:50] bzc6p: Where's your template on the wiki?
[21:51] ersi: http://archiveteam.org/index.php?title=Template:Howcanihelp
[21:51] Sorry, I indeed forgot to name it...
[21:51] *** dashcloud has quit IRC (Read error: Operation timed out)
[21:52] the additional info is put into a collapsible box
[21:54] I'll take a look at merging it. I do have a GitHub account. :)
[21:56] *** dashcloud has joined #archiveteam
[22:11] *** schbirid has quit IRC (Leaving)
[22:50] *** wp494 has quit IRC ()
[22:52] *** wp494 has joined #archiveteam
[23:04] *** Start has joined #archiveteam
[23:16] *** bzc6p has left
[23:18] *** Start has quit IRC (Ping timeout: 606 seconds)
[23:23] *** primus has joined #archiveteam