[00:01] *** VADemon has quit IRC (Quit: left4dead)
[00:25] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[00:58] *** JesseW has joined #archiveteam-bs
[01:22] *** BlueMaxim has joined #archiveteam-bs
[01:25] *** tomwsmf-a has quit IRC (Remote host closed the connection)
[01:26] *** tomwsmf-a has joined #archiveteam-bs
[03:13] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[03:14] *** vitzli has joined #archiveteam-bs
[03:14] *** zhongfu has quit IRC (Ping timeout: 260 seconds)
[03:22] *** JesseW has joined #archiveteam-bs
[03:31] *** zhongfu has joined #archiveteam-bs
[03:55] *** Ducky_ is now known as Ducky
[04:07] Apparently, this is the Edge UA string: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246
[04:08] Good ol' UAs. Been broken nonsense for ages.
[04:10] *** vitzli has quit IRC (Quit: Leaving)
[04:13] lol
[04:14] ErkDog: ?
[04:15] that Edge UA
[04:15] like shouldn't it just be "Edge/12.246"
[04:15] what's all the other garbage
[04:15] ErkDog: Accounting for a *lot* of Javascript that checks for certain strings in a UA string.
[04:16] yeah but like, if it's the Edge browser, then why doesn't it just say Edge, why reference mozilla, and chrome,
[04:16] Because Javascript is looking for those strings.
[04:16] You break things if you omit them.
[04:21] gotcha
[04:24] ErkDog: To answer your question about how old tr.im is dead -- the database was wiped; there's a *new* service at that same URL, but it's a different database.
[04:24] IIRC
[04:38] K thanks, yeah I was like whaaat then as I went down page, I saw and deleted my comment, lol
[04:40] I saw you had, but I wasn't sure if you got your question answered as well
[04:40] so i plan on doing something different with mbc videos i have downloaded
[04:40] it's a reasonable question -- might be worth adding an additional note to that entry clarifying it
[04:40] there are going to be more in a daily dump of all the videos they have from 2007 on
[04:54] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[05:10] Now, that's a phrase I hadn't previously thought of: "pancake flavored Easter themed Kit Kat bar"
[05:10] https://twitter.com/textfiles/status/739647391502983168
[05:11] Japan gets all the best candy flavors :(
[05:13] Hm, there's probably an "online random name generator generator" out there somewhere. Might be fun to make a "X flavored Y themed Z bar" instance.
[05:23] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[05:25] x86 -> ARM cross-compilation of C extensions in Python modules is not quite hell, but it is very close
[05:26] oh yeah and you have to teach the Debian packaging pipeline about that
[05:26] * yipdw argh
[05:28] dh_virtualenv seems to get pretty close, though
[05:35] so i'm using livestreamer to grab the mbc news mp4s now
[05:38] livestreamer grabs the m3u8 file that's 20 minutes long in under 60 seconds
[06:01] *** Honno has joined #archiveteam-bs
[06:14] looks like livestreamer will make 700k videos out of 300k streams if under 20 minutes
[06:15] i have to do a re-encode of the output from livestreamer cause there is no time code
[06:16] these videos from livestreamer will work in mplayer but not in vlc
[06:16] I seem to remember there's a way to rebuild the index without running it through another encoding.
[06:18] http://superuser.com/questions/4570/how-can-i-repair-a-broken-avi-file
[06:18] i re-encode with -c copy
[06:22] so i may use the mp4 paths to grab the rest of MBC Newsdesk
[06:22] see that i can get streams at a much faster rate
[06:33] *** brayden_ has quit IRC (Read error: Connection reset by peer)
[06:34] *** HCross has quit IRC (Read error: Connection reset by peer)
[06:34] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[06:34] *** HCross has joined #archiveteam-bs
[06:34] *** ndiddy has joined #archiveteam-bs
[06:34] *** brayden_ has joined #archiveteam-bs
[06:34] *** swebb sets mode: +o brayden_
[07:44] *** schbirid has joined #archiveteam-bs
[08:06] *** jut has joined #archiveteam-bs
[08:07] will be back to full archiving duties soon, just juggling new home and new job
[08:09] np
[08:09] yr allowed to have a life
[08:09] :)
[08:13] Good new happy end of school (Lithuania)! I have my life back.
[08:18] yay!
[08:20] On that note, I may not be around as much this week, end of the year so fun is happening.. Not
[09:53] *** Ducky has quit IRC ()
[10:49] *** bzc6p has joined #archiveteam-bs
[10:49] *** swebb sets mode: +o bzc6p
[10:54] *** bzc6p has left
[12:17] *** Stilett0 has quit IRC (Ping timeout: 260 seconds)
[13:07] *** Stiletto has joined #archiveteam-bs
[14:59] requests and urllib don't seem to have any way of seeing their raw input and output, weird.
[15:00] *** VADemon has joined #archiveteam-bs
[15:11] my hacky stuff to get around that doesn't work very well.
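(Editor's sketch for the [14:59] question about seeing what urllib sends and receives. This is not what anyone in the channel used; it assumes only the Python 3 standard library. `urllib.request.HTTPHandler` accepts a `debuglevel` argument, and `http.client` then prints the request it sends plus the status line and headers it gets back. The helper name `fetch_with_wire_log` is hypothetical. The trace is `http.client`'s own debug printout, not a byte-exact capture, and response bodies are not included in it.)

```python
import contextlib
import io
import urllib.request


def fetch_with_wire_log(url):
    """Fetch url and return (body, trace).

    trace is the text http.client prints when debugging is on:
    "send:" lines for the outgoing request, then "reply:" and
    "header:" lines for the response status and headers.
    """
    # Hypothetical helper; debuglevel=1 switches on http.client's printing.
    opener = urllib.request.build_opener(
        urllib.request.HTTPHandler(debuglevel=1),
        urllib.request.HTTPSHandler(debuglevel=1),
    )
    trace = io.StringIO()
    # http.client writes its debug output with print(), so capture stdout.
    with contextlib.redirect_stdout(trace):
        with opener.open(url, timeout=10) as response:
            body = response.read()
    return body, trace.getvalue()
```

(Calling `body, trace = fetch_with_wire_log(some_url)` and printing `trace` shows the request line and headers as they were handed to the socket, which may be enough when a full packet capture is overkill.)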
[15:27] *** BlueMaxim has quit IRC (Quit: Leaving)
[15:33] Just saw this go by in the Archivebot dashboard, quilters try to figure out whether copyright applies to quilting patterns: http://www.quiltingboard.com/main-f1/more-discusion-about-copyright-issues-t185888.html
[15:33] IANAL but I'm pretty sure the answer is yes
[15:33] Thought it was interesting to see copyright come up in that context
[15:33] *** JesseW has joined #archiveteam-bs
[15:34] it looks like wpull implements HTTP/1.1 itself, wasn't expecting that.
[16:11] trying to find this article for free: http://blogs.ft.com/tech-blog/2013/06/tim-berners-lee-prism/
[16:12] nvm. found it.
[16:12] http://web.archive.org/web/20130612022939/http://blogs.ft.com/tech-blog/2013/06/tim-berners-lee-prism/
[16:12] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[16:35] http://virtuallyfun.superglobalmegacorp.com/2016/06/06/pcem-v11-released/
[16:36] I'm curious, what's the difference between DosBox and that?
[16:56] *** JW_work1 has joined #archiveteam-bs
[17:01] *** JW_work has quit IRC (Ping timeout: 370 seconds)
[17:01] *** fie has joined #archiveteam-bs
[17:18] *** Silvan has joined #archiveteam-bs
[17:18] *** SilSte has quit IRC (Read error: Connection reset by peer)
[17:53] *** tomwsmf-a has joined #archiveteam-bs
[19:29] *** Dark_Star has quit IRC (Remote host closed the connection)
[19:29] *** Silvan has quit IRC (Remote host closed the connection)
[19:29] *** brayden_ has quit IRC (Read error: Connection reset by peer)
[19:29] *** brayden_ has joined #archiveteam-bs
[19:30] *** mutoso has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** tomwsmf-a has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** schbirid has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** SketchCo1 has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** midas has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Smiley has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** achip has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Zebranky has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** jspiros has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** remsen has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** mr-b has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** acridAxid has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** balrog has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** marvinw has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** zenguy has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** SadDM has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** yakfish has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** fie has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** JW_work1 has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** ndiddy has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Honno has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Baljem has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** dxrt has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** mistym- has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** godane has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** zino has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Cameron_D has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** dcmorton has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** decay has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** bsmith093 has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** superkuh has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** GLaDOS has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** ranma has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** SN4T14 has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** swebb has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** signius has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** atlogbot has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** lytv has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** yipdw has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** antomatic has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Infreq_ has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** ErkDog has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** slyphic has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** winr5r has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** rduser has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** jut has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** goekesmi has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** bwn has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** beardicus has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** phuzion has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** tfgbd_znc has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** botpie91 has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Mayonaise has quit IRC (hub.dk ircd.choopa.net)
[19:30] *** Dark_Star has joined #archiveteam-bs
[19:36] *** balrog has joined #archiveteam-bs
[19:36] *** tomwsmf-a has joined #archiveteam-bs
[19:36] *** fie has joined #archiveteam-bs
[19:36] *** JW_work1 has joined #archiveteam-bs
[19:36] *** jut has joined #archiveteam-bs
[19:36] *** schbirid has joined #archiveteam-bs
[19:36] *** ndiddy has joined #archiveteam-bs
[19:36] *** Honno has joined #archiveteam-bs
[19:36] *** mutoso has joined #archiveteam-bs
[19:36] *** jspiros has joined #archiveteam-bs
[19:36] *** goekesmi has joined #archiveteam-bs
[19:36] *** bwn has joined #archiveteam-bs
[19:36] *** Baljem has joined #archiveteam-bs
[19:36] *** SketchCo1 has joined #archiveteam-bs
[19:36] *** beardicus has joined #archiveteam-bs
[19:36] *** phuzion has joined #archiveteam-bs
[19:36] *** tfgbd_znc has joined #archiveteam-bs
[19:36] *** botpie91 has joined #archiveteam-bs
[19:36] *** Mayonaise has joined #archiveteam-bs
[19:36] *** dxrt has joined #archiveteam-bs
[19:36] *** remsen has joined #archiveteam-bs
[19:36] *** mr-b has joined #archiveteam-bs
[19:36] *** mistym- has joined #archiveteam-bs
[19:36] *** acridAxid has joined #archiveteam-bs
[19:36] *** marvinw has joined #archiveteam-bs
[19:36] *** midas has joined #archiveteam-bs
[19:36] *** godane has joined #archiveteam-bs
[19:36] *** hub.efnet.us sets mode: +oooo balrog SketchCo1 dxrt godane
[19:36] *** zino has joined #archiveteam-bs
[19:36] *** Smiley has joined #archiveteam-bs
[19:36] *** Cameron_D has joined #archiveteam-bs
[19:36] *** dcmorton has joined #archiveteam-bs
[19:36] *** decay has joined #archiveteam-bs
[19:36] *** zenguy has joined #archiveteam-bs
[19:36] *** bsmith093 has joined #archiveteam-bs
[19:36] *** superkuh has joined #archiveteam-bs
[19:36] *** SadDM has joined #archiveteam-bs
[19:36] *** yakfish has joined #archiveteam-bs
[19:36] *** GLaDOS has joined #archiveteam-bs
[19:36] *** ranma has joined #archiveteam-bs
[19:36] *** SN4T14 has joined #archiveteam-bs
[19:36] *** swebb has joined #archiveteam-bs
[19:36] *** hub.efnet.us sets mode: +oooo Smiley SadDM GLaDOS swebb
[19:36] *** signius has joined #archiveteam-bs
[19:36] *** achip has joined #archiveteam-bs
[19:36] *** atlogbot has joined #archiveteam-bs
[19:36] *** lytv has joined #archiveteam-bs
[19:36] *** hub.efnet.us sets mode: +o achip
[19:36] *** Zebranky has joined #archiveteam-bs
[19:36] *** yipdw has joined #archiveteam-bs
[19:36] *** antomatic has joined #archiveteam-bs
[19:36] *** Infreq_ has joined #archiveteam-bs
[19:36] *** ErkDog has joined #archiveteam-bs
[19:36] *** slyphic has joined #archiveteam-bs
[19:36] *** winr5r has joined #archiveteam-bs
[19:36] *** rduser has joined #archiveteam-bs
[19:36] *** hub.efnet.us sets mode: +ooo yipdw antomatic Infreq_
[19:36] *** swebb sets mode: +o brayden_
[19:36] *** swebb sets mode: +o DFJustin
[19:36] *** swebb sets mode: +o xmc
[19:38] Ohhh I did the lord of the rings wiki, lol
[19:39] http://puu.sh/pj9wB/3275066ac6.png
[19:43] *** jut has quit IRC (Leaving)
[20:00] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[20:17] fashion/clothes can't be copyrighted, right?
[20:17] yep
[20:18] guess I'll have to read more of the quilting thread to see upon what grounds they're claiming it can be copyrighted
[20:22] *** SN4T14 has quit IRC (Read error: Operation timed out)
[20:27] *** SN4T14 has joined #archiveteam-bs
[20:52] *** yakfish has quit IRC (Read error: Operation timed out)
[21:00] are you sure? i could swear there was some ruckus last year
[21:00] *** schbirid has quit IRC (Quit: Leaving)
[21:12] *** yakfish has joined #archiveteam-bs
[21:33] *** will has quit IRC (Ping timeout: 244 seconds)
[21:34] *** will has joined #archiveteam-bs
[21:43] *** VADemon has quit IRC (Quit: left4dead)
[21:48] anyone know why http://jezebel.com/ is excluded from the wayback machine?
[21:49] *** Honno has quit IRC (Read error: Operation timed out)
[21:53] Huh, it's not a robots.txt exclusion. I've never seen that before
[21:53] You'd probably have to ask info@archive.org what's up
[21:53] http://web.archive.org/web/http://jezebel.com/
[21:54] I haven't seen any of the other gawker websites be excluded from the wayback machine
[21:54] Yeah, I'll send them a quick email.
[21:56] hook54321: that error message generally means someone from the site directly emailed IA and asked for it to be excluded, rather than using the robots.txt method.
[21:57] Might be worth archivebotting once some pipelines free up
[21:57] While you are welcome to email info@ I doubt they'll tell you anything more than that.
[21:58] yeah. I'll try contacting IA first, then Jezebel, then jezebel authors directly, then gawker authors, then gawker.
[21:58] Eh, I wouldn't go that far
[21:58] archive with complete abandon.
[21:58] Given Gawker's general problems right now, I doubt you'll get a very favorable hearing from anyone.
[21:59] If they don't want the IA crawling them that's their prerogative
[21:59] There's still problems at gawker?
[21:59] It's even more strange they didn't have the other sites excluded though.
[22:01] does IA require the webmaster to request the site to be excluded?
[22:02] The easiest way is just to block ia_archiver in your robots.txt
[22:02] That will also retroactively remove access to past crawls until you un-block the IA's bot
[22:12] *** n00bLurke has joined #archiveteam-bs
[22:17] i'm doing a sitemap grab of jezebel.com
[22:44] *** n00bLurke has quit IRC (n00bLurke)
[23:54] *** SketchCo1 is now known as SketchCow
[23:58] *** BlueMaxim has joined #archiveteam-bs
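(Editor's sketch for the [22:02] remark about blocking ia_archiver in robots.txt. The two-line rule set below is the conventional form of such a block; it is an illustration, not a copy of any real site's file. The helper `is_allowed`, the bot name `SomeOtherBot`, and the example.com URL are hypothetical. The check uses Python's standard `urllib.robotparser`, which only shows how a rule-following crawler would read the file; it says nothing about how the Internet Archive applies exclusions on its side.)

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt that blocks only the ia_archiver user agent.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# ia_archiver is disallowed everywhere; other agents are unaffected.
blocked = not is_allowed(ROBOTS_TXT, "ia_archiver", "http://example.com/page")
others_ok = is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.com/page")
```

(Because there is no `User-agent: *` section, agents other than ia_archiver fall through to the default of being allowed.)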