[00:08] *** Start has joined #archiveteam
[00:17] Hi.
[00:18] *** JesseW has joined #archiveteam
[00:20] *** primus104 has quit IRC (Leaving.)
[00:29] Looks like we're getting some skillfeed
[00:39] *** vitzli has joined #archiveteam
[00:48] *** Aranje has quit IRC (Read error: Connection reset by peer)
[00:49] *** Aranje has joined #archiveteam
[01:01] *** sivoais has quit IRC (Read error: Operation timed out)
[01:01] *** sivoais has joined #archiveteam
[01:02] *** JesseW has quit IRC (Read error: Operation timed out)
[01:04] *** Aranje has quit IRC (Read error: Connection reset by peer)
[01:04] *** Aranje has joined #archiveteam
[01:08] *** Ungstein has joined #archiveteam
[01:09] *** Aranje has quit IRC (Read error: Connection reset by peer)
[01:09] *** Aranje has joined #archiveteam
[01:10] *** Aranje has quit IRC (Read error: Connection reset by peer)
[01:10] *** Aranje has joined #archiveteam
[01:11] *** Aranje has quit IRC (Read error: Connection reset by peer)
[01:11] *** Ungstein1 has joined #archiveteam
[01:12] *** Aranje has joined #archiveteam
[01:12] *** Ungstein has quit IRC (Ping timeout: 252 seconds)
[01:14] *** vitzli has quit IRC (Quit: Leaving)
[01:16] *** nertzy has joined #archiveteam
[01:16] *** Aranje has quit IRC (Read error: Connection reset by peer)
[01:17] *** Aranje has joined #archiveteam
[01:26] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[01:28] What will happen to the site’s content?
[01:28] Shortly after September 30 all content will be permanently removed from Skillfeed and not saved by the company.
[01:28] Can I pull the content off the site?
[01:28] Skillfeed does not have a capability to allow users to download content from site.
[01:28] Can I view the content elsewhere?
[01:28] No the content is only accessible on the Skillfeed site unless the contributor has posted their material on other tutorial websites. All content will be completely removed after September 30.
[01:28] Will Skillfeed reopen?
[01:28] No.
[01:28] DICKS.
[01:28] HOLY CRAP DICKS
[01:31] I've begun the upload of skillfeed into archive.org.
[01:33] so skillfeed actually had content on it? I thought it was just some spam answer site
[01:33] *** Aranje has quit IRC (Read error: Connection reset by peer)
[01:33] it appears to have had quite a lot of content, if the size of some of the items in the grab is anything to go by
[01:34] (he says, watching a ~4GB upload trickle up the pipe)
[01:35] *** RichardG_ has joined #archiveteam
[01:35] *** RichardG has quit IRC (Read error: Connection reset by peer)
[01:35] https://archive.org/details/archiveteam_skillfeed
[01:41] better are the questions about money. They don't have your details on September 30th, you don't get any money you are owed and paying customers only get a refund for time remaining after Oct. 15th
[01:43] so this maybe a video like: https://embed-ssl.wistia.com/deliveries/5222036716253eef48043dc78f5d3f2cedbf6a46.bin
[01:44] its a .bin
[01:46] ah, nevermind. I misread the sentence about the 15 days.
[02:01] so skillfeed uses wistia for video hosting
[02:01] they use api embeding based on source code: http://wistia.com/doc/embedding
[02:03] good news is i found a way to grab video
[02:03] godane: I believe they are already grabbing the videos in the main script
[02:04] SketchCow: looking at that FAQ page, I particularly like the "why are you closing?" "see above" when all 'above' says is "we've decided to close". *thumbs up* a prize example of the fuck-you shutdown, I guess!
[02:05] ok then
[02:06] looks like there is more then one video url
[02:07] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[02:07] maybe you should ask arkiver tomorrow, then
[02:08] anyway here is basic code i got so far: curl -s https://www.skillfeed.com/courses/13774 | grep 'id="wistia_' | sed 's|" class.*||g' | sed 's|.*div id="wistia_|fast.wistia.net/embed/iframe/|g'
[02:08] that will get you this: fast.wistia.net/embed/iframe/u4g3ksi019
[02:08] which you can then curl to get the video url
[02:09] ok i see the lua code
[02:09] btw there is no page for skillfeed on archiveteam.org
[02:11] *** zenguy_pc has joined #archiveteam
[02:19] *** JesseW has joined #archiveteam
[02:26] Is there a Skillfeed channel? I didn't see any mention of one.
[02:27] In any case, the tracker appears to be returning status 403.
[02:27] arkiver: ping (not sure who else)
[02:29] *** Stiletto has quit IRC ()
[02:31] JesseW: #skillessfeed
[02:31] tracker working for me
[02:57] bentpins: thanks
[03:14] *** Stiletto has joined #archiveteam
[03:51] *** wacky has quit IRC (Ping timeout: 240 seconds)
[03:51] *** wacky has joined #archiveteam
[03:52] *** godane has quit IRC (Ping timeout: 240 seconds)
[03:52] *** SketchCow has quit IRC (Ping timeout: 240 seconds)
[03:52] *** godane has joined #archiveteam
[04:26] *** xk_id_ has joined #archiveteam
[04:30] *** Ungstein1 has quit IRC (Quit: Leaving.)
[04:31] *** aaaaaaaaa has quit IRC (Leaving)
[04:31] *** xk_id has quit IRC (Read error: Operation timed out)
[05:19] *** BlueMaxim has joined #archiveteam
[05:53] *** Froggypwn has quit IRC (Ping timeout: 483 seconds)
[05:54] *** Froggypwn has joined #archiveteam
[06:08] *** vitzli has joined #archiveteam
[06:28] *** JesseW has quit IRC (Read error: Operation timed out)
[06:35] *** primus104 has joined #archiveteam
[07:02] Did initial page for skillfeed and added it to 'current events'
[07:07] *** SketchCow has joined #archiveteam
[07:07] *** swebb sets mode: +o SketchCow
[07:07] *** GLaDOS sets mode: +o SketchCow
[07:07] https://archive.org/details/archiveteam_skillfeed_20150915003154
[07:07] Integrating skillfeed, biiiiiitches
[07:09] schluuuurp
[07:36] ------------------------------------
[07:36] Oh shit son, another one.
[07:36] https://publish.comcast.net/splash/
[07:36] ------------------------------------
[07:39] we have a page: http://archiveteam.org/index.php?title=Comcast_Personal_Web_Pages
[07:41] *** schbirid has joined #archiveteam
[07:41] Whew
[07:49] http://www.cbc.ca/radio/spark/292-what-you-say-will-be-searched-why-recognition-systems-don-t-recognize-accents-and-more-1.3211777/why-i-spent-my-summer-rescuing-thousands-of-vintage-manuals-1.3222642
[07:50] *** HCross- has quit IRC (Ping timeout: 252 seconds)
[07:55] SketchCow: episode 208 of CBC Spark is gone
[07:56] luckly for you i'm up to 231 with my CBC Spark collection
[08:02] *** HCrossII has joined #archiveteam
[08:02] *** Start has quit IRC (Read error: Connection reset by peer)
[08:03] *** Start has joined #archiveteam
[08:12] Not surprisingly, the skillfeed packs are coming in faster than FOS can process them and upload them.
[08:12] But we're only at 14 percent usage at the moment. I'll sound an alarm when it's somewhere real.
[08:14] ye. My upload is quite slow
[08:14] first new cbc spark upload: https://archive.org/details/spark_20131120_26405
[08:17] looks like some beat me to this id: https://archive.org/details/spark_20131120_42239
[08:17] that can be moved to my collection spark_cbc
[08:21] godane: implemented all that two days ago in the script. Don't worry, we really ae getting everything! (as always)
[08:22] ok
[08:23] Start: do you think you can discover comcast websites like you did with previous projects?
[08:32] SketchCow: have you seen what I wrote earlier here about GameFront?
[08:32] http://www.gamesindustry.biz/articles/2015-01-21-defy-media-lays-off-staff-at-gaming-sites
[08:32] "The Escapist, GameTrailers, and GameFront all lose headcount but avoid shutdown"
[08:32] http://www.gamefront.com/a-farewell-to-the-front/
[08:33] GameFront hosts tens of thousands of freeware, addons, mods, skins, etc. of games
[08:35] Ah shit.
[08:35] Well, pile it on
[08:36] *** vitzli has quit IRC (Quit: Leaving)
[08:46] *** primus104 has quit IRC (Leaving.)
[08:57] ok!
[08:57] *** xk_id_ has quit IRC (Remote host closed the connection)
[09:18] *** Ungstein has joined #archiveteam
[09:23] *** RedType_ has quit IRC (Read error: Operation timed out)
[09:24] *** RedType has joined #archiveteam
[09:46] *** atomotic has joined #archiveteam
[09:49] *** xk_id has joined #archiveteam
[09:51] *** fenn has quit IRC (Remote host closed the connection)
[09:57] *** primus104 has joined #archiveteam
[10:24] *** RedType has quit IRC (Read error: Operation timed out)
[10:25] *** RedType has joined #archiveteam
[10:27] *** zhongfu has quit IRC (Quit: Goodbye.)
[10:28] *** zhongfu has joined #archiveteam
[10:44] *** atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…)
[10:56] *** vitzli has joined #archiveteam
[11:01] *** xk_id has quit IRC (Remote host closed the connection)
[11:04] *** primus104 has quit IRC (Leaving.)
[11:27] *** db48x has quit IRC (Read error: Connection reset by peer)
[11:29] *** Ungstein has quit IRC (Read error: Connection reset by peer)
[11:30] *** atomotic has joined #archiveteam
[11:34] *** xk_id has joined #archiveteam
[11:35] *** Ungstein has joined #archiveteam
[11:37] *** db48x` has joined #archiveteam
[11:39] *** db48x has joined #archiveteam
[11:41] *** db48x` has quit IRC (Client Quit)
[11:55] *** primus104 has joined #archiveteam
[12:01] *** InAUGral has joined #archiveteam
[12:07] Hey guys is this a good place to suggest additions to the "archive project"?
[12:08] here :)
[12:08] also the wiki
[12:10] Anywho there is a star trek/sci fi review site ive been using for around 7 years. It occured to me recently it could disappear at any time. its URL is http://www.jammersreviews.com/
[12:10] A review of every star trek episode is there plus battlestar galactica
[12:10] It is fairly unique
[12:11] good idea
[12:11] you should be able to feed that to the archivebot
[12:11] http://www.archiveteam.org/index.php?title=ArchiveBot
[12:12] just added
[12:12] Awesome
[12:15] #archivebot and http://archivebot.at.ninjawedding.org:4567/ will let you track the job if desired
[12:17] Awesomesauce
[12:17] I will indeed
[12:21] though it seems to be archived fairly frequently https://web.archive.org/web/*/jammersreviews.com
[12:22] lets have a look
[12:22] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[12:22] Ah yes. Well more hosts for achives is never a bad thing?
[12:25] pretty much
[12:39] *** xk_id has quit IRC (Remote host closed the connection)
[12:42] *** cloudmons has quit IRC (Ping timeout: 492 seconds)
[12:43] *** robink has quit IRC (Ping timeout: 492 seconds)
[12:56] *** lrkj has quit IRC (Read error: Operation timed out)
[13:07] *** atomotic has joined #archiveteam
[13:07] *** InAUGral has left
[13:14] *** lrkj has joined #archiveteam
[13:19] *** BlueMaxim has quit IRC (Ping timeout: 306 seconds)
[13:20] *** PurpleSym has joined #archiveteam
[13:32] *** xk_id has joined #archiveteam
[13:36] *** robink has joined #archiveteam
[13:36] *** cloudmons has joined #archiveteam
[13:38] *** lrkj has quit IRC (Read error: Operation timed out)
[13:40] *** lrkj has joined #archiveteam
[13:43] *** primus104 has quit IRC (Leaving.)
[13:50] *** scyther has joined #archiveteam
[13:51] *** Start has quit IRC (Quit: Disconnected.)
[14:22] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[14:33] *** dxrt has quit IRC (Read error: Operation timed out)
[14:34] *** dxrt has joined #archiveteam
[14:38] *** Start has joined #archiveteam
[15:11] *** HCross has joined #archiveteam
[15:15] *** primus104 has joined #archiveteam
[15:31] *** nertzy has joined #archiveteam
[15:40] *** primus104 has quit IRC (Leaving.)
[15:41] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[15:43] *** mafrasi2 has quit IRC (Remote host closed the connection)
[15:44] *** HCrossII has quit IRC (Leaving)
[15:46] *** mafrasi2 has joined #archiveteam
[15:54] *** nertzy has joined #archiveteam
[16:01] *** RichardG_ is now known as RichardG
[16:01] *** mafrasi2 has quit IRC (Remote host closed the connection)
[16:03] *** Start has quit IRC (Quit: Disconnected.)
[16:04] *** mafrasi2 has joined #archiveteam
[16:05] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[16:12] *** Start has joined #archiveteam
[16:14] *** JesseW has joined #archiveteam
[16:15] *** mafrasi2 has quit IRC (Remote host closed the connection)
[16:16] *** mafrasi2 has joined #archiveteam
[16:28] *** DFJustin has quit IRC (Remote host closed the connection)
[16:28] *** DFJustin has joined #archiveteam
[16:28] *** swebb sets mode: +o DFJustin
[16:34] *** superkuh has quit IRC (Remote host closed the connection)
[16:38] *** vitzli has quit IRC (Quit: Leaving)
[16:47] Hey, so it's quite obvious the way that Skillfuckers works, it will be slow to process.
[16:47] yes
[16:47] I just saw a 980mb .WARC go by. It makes the MegaWARCer flip out.
[16:47] And also FOS uploads are sloww
[16:47] *** superkuh has joined #archiveteam
[16:47] *** JesseW has quit IRC (Read error: Operation timed out)
[16:52] Well, there's tons of you shits
[16:54] slow is relative
[16:54] fast is relative
[16:54] * ersi sips beer
[16:55] *** SimpBrain has joined #archiveteam
[16:57] *** primus104 has joined #archiveteam
[17:00] *** dan-- has quit IRC (Ping timeout: 483 seconds)
[17:05] *** dan- has joined #archiveteam
[17:06] my stuff, your shit :)
[17:08] I have a 10 gig WARC going out for Skillfeed
[17:08] If 980 megs makes it flip out, I have no clue what 10 gigs will do
[17:15] *** Start has quit IRC (Quit: Disconnected.)
[17:21] *** dan- has quit IRC (Ping timeout: 483 seconds)
[17:22] it's not just WARC size, it's also structure
[17:22] *** dan- has joined #archiveteam
[17:29] having e.g. a lot of videos or photos can cause things to go bad faster than usual
[17:30] Oh yeah, this WARC has over 3600 URLs in it
[17:39] 2.6G 20150915150652/skillfeed-video_1294-20150915-065554.warc.gz
[17:39] 1.5G 20150915150652/skillfeed-video_2258-20150915-070843.warc.gz
[17:39] 1.3G 20150915150652/skillfeed-video_2304-20150915-065707.warc.gz
[17:39] 2.7G 20150915150652/skillfeed-video_2370-20150915-034424.warc.gz
[17:39] 4.2G 20150915150652/skillfeed-video_2616-20150915-023014.warc.gz
[17:39] 4.9G 20150915150652/skillfeed-video_3117-20150915-021346.warc.gz
[17:39] 2.9G 20150915150652/skillfeed-video_3152-20150915-045728.warc.gz
[17:39] 2.5G 20150915150652/skillfeed-video_4126-20150915-061757.warc.gz
[17:39] 1.9G 20150915150652/skillfeed-video_5725-20150915-023617.warc.gz
[17:40] 1.3G 20150915150652/skillfeed-video_9899-20150915-105103.warc.gz
[17:40] Stuff like that getting pulled in.
[17:48] I'm working on one that's 12gb
[17:55] *** beardicus has quit IRC (Quit: bye now)
[18:00] *** beardicus has joined #archiveteam
[18:24] *** Stiletto has quit IRC (Remote host closed the connection)
[18:25] *** Stiletto has joined #archiveteam
[18:30] *** aaaaaaaaa has joined #archiveteam
[18:31] *** swebb sets mode: +o aaaaaaaaa
[18:39] *** kevin has quit IRC (Quit: Updating details, brb)
[18:39] *** kevin has joined #archiveteam
[19:20] *** beardicu- has joined #archiveteam
[19:21] Verified, it's going to be slow going.
[19:21] We're going to need another rsync target
[19:23] Somewhere EU?
[19:24] That in itself matters very little. Peering and disk I/O is what makes it go full whack or not.
[19:26] IA base themselves of Cogent
[19:31] *** khaoohs_ has joined #archiveteam
[19:34] *** khaoohs has quit IRC (Ping timeout: 306 seconds)
[19:38] Hm? https://commons.wikimedia.org/w/index.php?title=Help:Server-side_upload&curid=16126343&diff=171934389&oldid=171908773
[19:39] *** beardicu- has quit IRC (Read error: Operation timed out)
[19:39] Nemo_bis: https://support.google.com/drive/answer/2881970
[19:41] but that page is about files
[19:41] will it become impossible to wget an URL?
[19:42] (that page = the commons one=
[19:42] beats me, i don't work at google
[19:42] :) ok thanks
[19:43] it's possible that the contributor misunderstand the announcement
[19:44] *** beardicu- has joined #archiveteam
[19:49] *** Start has joined #archiveteam
[19:51] *** habi has joined #archiveteam
[19:52] *** habi has left
[19:56] *** Start has quit IRC (Quit: Disconnected.)
[20:11] *** Start has joined #archiveteam
[20:14] *** schbirid has quit IRC (Quit: Leaving)
[20:14]
[20:14]
[20:17] *** microguru has joined #archiveteam
[20:20] *** Start has quit IRC (Quit: Disconnected.)
[20:26]
[20:27]
[20:41] *** microguru has quit IRC (Quit: Page closed)
[20:43] *** mafrasi2 has quit IRC (Remote host closed the connection)
[20:44] *** mafrasi2 has joined #archiveteam
[20:52] *** PurpleSym has quit IRC (Remote host closed the connection)
[20:56] *** scyther has quit IRC (Read error: Operation timed out)
[20:58] *** scyther has joined #archiveteam
[21:05] *** SimpBrain has quit IRC (Remote host closed the connection)
[21:08] *** scyther has quit IRC (Read error: Operation timed out)
[21:11] *** beardicu- has quit IRC (Ping timeout: 483 seconds)
[21:41] *** superkuh has quit IRC (Remote host closed the connection)
[21:43] *** superkuh has joined #archiveteam
[22:00] *** garyrh has quit IRC (Quit: http://bnc4free.com/)
[22:06] *** Peetz0r_ has joined #archiveteam
[22:09] *** kevin has quit IRC (hub.se efnet.port80.se)
[22:09] *** dan- has quit IRC (hub.se efnet.port80.se)
[22:09] *** HCross has quit IRC (hub.se efnet.port80.se)
[22:09] *** Ungstein has quit IRC (hub.se efnet.port80.se)
[22:09] *** zhongfu has quit IRC (hub.se efnet.port80.se)
[22:09] *** pikhq has quit IRC (hub.se efnet.port80.se)
[22:09] *** Sue_ has quit IRC (hub.se efnet.port80.se)
[22:09] *** Famicoman has quit IRC (hub.se efnet.port80.se)
[22:09] *** rizzzz has quit IRC (hub.se efnet.port80.se)
[22:09] *** aliz has quit IRC (hub.se efnet.port80.se)
[22:09] *** goekesmi has quit IRC (hub.se efnet.port80.se)
[22:09] *** Wyatts has quit IRC (hub.se efnet.port80.se)
[22:09] *** _0x2A has quit IRC (hub.se efnet.port80.se)
[22:09] *** Peetz0r has quit IRC (hub.se efnet.port80.se)
[22:09] *** lhobas has quit IRC (hub.se efnet.port80.se)
[22:09] *** GLaDOS has quit IRC (hub.se efnet.port80.se)
[22:09] *** Boltsie has quit IRC (hub.se efnet.port80.se)
[22:09] *** JSharp has quit IRC (hub.se efnet.port80.se)
[22:09] *** karissa has quit IRC (hub.se efnet.port80.se)
[22:09] *** _desu_ has quit IRC (hub.se efnet.port80.se)
[22:09] *** Ctrl-S has quit IRC (hub.se efnet.port80.se)
[22:09] *** russss__ has quit IRC (hub.se efnet.port80.se)
[22:09] *** diacope has quit IRC (hub.se efnet.port80.se)
[22:09] *** zyphlar has quit IRC (hub.se efnet.port80.se)
[22:09] *** codl has quit IRC (hub.se efnet.port80.se)
[22:09] *** afics has quit IRC (hub.se efnet.port80.se)
[22:09] *** deathy has quit IRC (hub.se efnet.port80.se)
[22:09] *** Fletcher has quit IRC (hub.se efnet.port80.se)
[22:09] *** sigkell has quit IRC (hub.se efnet.port80.se)
[22:09] *** irl1 has quit IRC (hub.se efnet.port80.se)
[22:09] *** Rickster has quit IRC (hub.se efnet.port80.se)
[22:09] *** Muad-Dib has quit IRC (hub.se efnet.port80.se)
[22:11] *** nertzy has joined #archiveteam
[22:19] *** goekesmi_ has joined #archiveteam
[22:25] *** Sue_ has joined #archiveteam
[22:37] *** mafrasi2 has quit IRC (Remote host closed the connection)
[22:38] *** nertzy has quit IRC (Quit: This computer has gone to sleep)
[22:39] *** mafrasi2 has joined #archiveteam
[22:56]
[22:57]
[22:58]
[23:00]
[23:02] *** Start has joined #archiveteam
[23:40] *** mafrasi2 has quit IRC (Remote host closed the connection)
[23:41] FOS definitely does not like working with already-compressed items
[23:42] *** mafrasi2 has joined #archiveteam
[23:56] *** BlueMaxim has joined #archiveteam