#archiveteam-bs 2016-12-02,Fri

Time Nickname Message
00:02 🔗 Stilett0 has quit IRC (Ping timeout: 246 seconds)
00:07 🔗 yipdw has quit IRC (ny.us.hub irc.servercentral.net)
00:07 🔗 ravetcofx has quit IRC (ny.us.hub irc.servercentral.net)
00:07 🔗 robink has quit IRC (ny.us.hub irc.servercentral.net)
00:07 🔗 swebb has quit IRC (ny.us.hub irc.servercentral.net)
00:07 🔗 Laverne has quit IRC (ny.us.hub irc.servercentral.net)
00:07 🔗 Flierp has quit IRC (ny.us.hub irc.servercentral.net)
00:07 🔗 ZizzyDizz has quit IRC (ny.us.hub irc.servercentral.net)
00:14 🔗 Start has joined #archiveteam-bs
00:14 🔗 GE has quit IRC (Quit: zzz)
00:21 🔗 ravetcofx has joined #archiveteam-bs
00:21 🔗 Laverne has joined #archiveteam-bs
00:21 🔗 chazchaz has joined #archiveteam-bs
00:21 🔗 atlogbot has joined #archiveteam-bs
00:21 🔗 slyphic_ has joined #archiveteam-bs
00:21 🔗 Cameron_D has joined #archiveteam-bs
00:21 🔗 MrRadar has joined #archiveteam-bs
00:21 🔗 Flierp has joined #archiveteam-bs
00:21 🔗 ZizzyDizz has joined #archiveteam-bs
00:22 🔗 swebb_ is now known as swebb
00:25 🔗 brayden has joined #archiveteam-bs
00:53 🔗 tfgbd_znc has joined #archiveteam-bs
01:15 🔗 Somebody has joined #archiveteam-bs
01:17 🔗 Start has quit IRC (Quit: Disconnected.)
01:20 🔗 Start has joined #archiveteam-bs
01:23 🔗 Stiletto has joined #archiveteam-bs
01:37 🔗 VADemon has quit IRC (Quit: left4dead)
02:21 🔗 tfgbd_znc has quit IRC (Read error: Connection reset by peer)
02:23 🔗 Somebody This delights my weird archivist heart: https://archive.org/details/MusicLocker_201608
02:26 🔗 tfgbd_znc has joined #archiveteam-bs
03:05 🔗 Somebody has quit IRC (Ping timeout: 370 seconds)
03:18 🔗 jrwr has quit IRC (Leaving)
03:21 🔗 Stiletto has quit IRC (Ping timeout: 246 seconds)
03:28 🔗 Ravenloft has quit IRC (Ping timeout: 244 seconds)
03:43 🔗 Somebody has joined #archiveteam-bs
03:50 🔗 Stiletto has joined #archiveteam-bs
03:56 🔗 dashcloud has quit IRC (Read error: Operation timed out)
03:57 🔗 dashcloud has joined #archiveteam-bs
04:26 🔗 vitzli has joined #archiveteam-bs
04:29 🔗 BlueMaxim has joined #archiveteam-bs
04:55 🔗 Yoshimura has quit IRC (Ping timeout: 255 seconds)
05:20 🔗 ranma https://twitter.com/mikko/status/804232169728053252
05:20 🔗 ranma <Chii> Mikko Hypponen on Twitter: "I'm a bit worried about what's going to happen to Pebble now that Fitbit seems to be acquiring them. https://t.co/vPjz2WRk1F" ~ twitter.com
05:32 🔗 yipdw_ ranma: https://badcheese.com/~steve/atlogs/?chan=archiveteam&day=2016-12-01
05:35 🔗 ranma ah, i missed the !a request
05:45 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
05:48 🔗 Aranje has joined #archiveteam-bs
05:52 🔗 Sk1d has joined #archiveteam-bs
05:54 🔗 Aranje has quit IRC (Read error: Connection timed out)
05:54 🔗 Aranje has joined #archiveteam-bs
06:02 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
06:10 🔗 ndiddy has quit IRC (Read error: Connection reset by peer)
06:14 🔗 Somebody has quit IRC (Ping timeout: 370 seconds)
06:18 🔗 ravetcofx has joined #archiveteam-bs
06:26 🔗 krazedkat has quit IRC (Ping timeout: 244 seconds)
06:29 🔗 krazedkat has joined #archiveteam-bs
06:33 🔗 jsp12345 has quit IRC (Ping timeout: 492 seconds)
06:41 🔗 krazedkat has quit IRC (Ping timeout: 244 seconds)
06:42 🔗 krazedkat has joined #archiveteam-bs
06:55 🔗 Aranje has quit IRC (Quit: Three sheets to the wind)
07:04 🔗 alembic has joined #archiveteam-bs
07:07 🔗 alembic has quit IRC (Client Quit)
07:10 🔗 alembic has joined #archiveteam-bs
07:11 🔗 Somebody has joined #archiveteam-bs
07:11 🔗 alembic has quit IRC (Client Quit)
07:11 🔗 alembic has joined #archiveteam-bs
07:12 🔗 krazedkat has quit IRC (Quit: Leaving)
07:15 🔗 REiN^ has quit IRC (Max SendQ exceeded)
07:15 🔗 REiN^ has joined #archiveteam-bs
07:39 🔗 Stiletto has quit IRC (Read error: Connection reset by peer)
07:40 🔗 Stiletto has joined #archiveteam-bs
08:23 🔗 Somebody has quit IRC (Ping timeout: 370 seconds)
08:33 🔗 GE has joined #archiveteam-bs
09:12 🔗 yipdw_ is now known as yipdw
09:19 🔗 hawc145 is now known as HCross
09:22 🔗 vitzli has quit IRC (Quit: Leaving)
09:28 🔗 vitzli has joined #archiveteam-bs
09:34 🔗 HCross has quit IRC (Read error: Connection reset by peer)
09:35 🔗 HCross has joined #archiveteam-bs
09:36 🔗 xx343 has quit IRC (Read error: Connection reset by peer)
09:37 🔗 xx343 has joined #archiveteam-bs
10:00 🔗 godane so Arirang Business Daily is almost done
10:00 🔗 godane i'm uploading the 2016-10-24 episode right now
10:37 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
10:38 🔗 BlueMaxim has joined #archiveteam-bs
10:39 🔗 ravetcofx has quit IRC (Read error: Operation timed out)
11:12 🔗 godane i'm starting to upload free north korea radio
11:13 🔗 godane i got mp3s going back to feb 2010
11:13 🔗 GE has quit IRC (Remote host closed the connection)
11:40 🔗 BlueMaxim has quit IRC (Ping timeout: 370 seconds)
12:16 🔗 signius_ has joined #archiveteam-bs
12:54 🔗 VADemon has joined #archiveteam-bs
13:01 🔗 GE has joined #archiveteam-bs
14:09 🔗 godane SketchCow: can you find out why WBAI archives are mostly gone
14:09 🔗 godane there was like 10 years worth of mp3s about 18 months ago
14:27 🔗 vitzli has quit IRC (Quit: Leaving)
15:30 🔗 RichardG_ has joined #archiveteam-bs
15:31 🔗 godane SketchCow: this needs to be put into a collection: https://archive.org/search.php?query=subject%3A%22nerdtv%22&sort=-publicdate&and[]=subject%3A%22cringely%22
15:31 🔗 godane its the PBS NerdTV series
15:34 🔗 RichardG has quit IRC (Ping timeout: 364 seconds)
15:34 🔗 godane we also got lucky cause the site is down
15:42 🔗 RichardG_ is now known as RichardG
15:59 🔗 fie has joined #archiveteam-bs
16:44 🔗 RichardG has quit IRC (Ping timeout: 250 seconds)
16:59 🔗 RichardG has joined #archiveteam-bs
17:20 🔗 godane i'm grabbing tons of mp3s from 2005 from way back for wbai archive collection
17:36 🔗 Somebody has joined #archiveteam-bs
18:43 🔗 Somebody has quit IRC (Ping timeout: 370 seconds)
19:31 🔗 ravetcofx has joined #archiveteam-bs
19:34 🔗 drunksci has quit IRC (Remote host closed the connection)
19:38 🔗 jsp12345 has joined #archiveteam-bs
20:26 🔗 drunksci has joined #archiveteam-bs
20:27 🔗 BlueMaxim has joined #archiveteam-bs
20:31 🔗 Coderjoe has joined #archiveteam-bs
20:32 🔗 jrwr has joined #archiveteam-bs
20:33 🔗 Coderjoe grr. I see several places linking to mailing list messages in the pipermail archive that used to be hosted at arduino.cc, but those links are now dead and I can't seem to find copies of those messages.
20:36 🔗 Coderjoe the specific message I am currently trying to find used to live at http://arduino.cc/pipermail/developers_arduino.cc/2011-September/005568.html
20:37 🔗 Coderjoe but I see several other dead links. this makes me both angry and sad.
20:39 🔗 joepie91 Coderjoe: it's possible that they got renumbered
20:39 🔗 joepie91 this happens with python mailinglists every few months
20:39 🔗 joepie91 it's very irritating
20:59 🔗 powerKitt has joined #archiveteam-bs
21:00 🔗 powerKitt I want to start a project to archive the SCP Foundation wiki (and other Wikidot sites) but there's one "small" problem.
21:00 🔗 powerKitt Usage of the API requires a $49.90 payment to Wikidot yearly.
21:01 🔗 powerKitt http://www.wikidot.com/plans https://www.wikidot.com/doc:api
21:01 🔗 xmc fffff
21:04 🔗 powerKitt Also, it appears you may need to be a "member" of a wiki before you can scrape it using the API. I'd check, but I don't have $49.90 payable to Wikidot on hand.
21:05 🔗 joepie91 you wouldn't want to archive through the API anyway
21:05 🔗 joepie91 at least not initially
21:05 🔗 joepie91 a WARC from a web scrape is more useful, generally
21:05 🔗 xmc also, i'm sure you could construct a pretty straightforward api from the website anyway
21:06 🔗 powerKitt The main issue with WARC scrapes, Wikidot-wise at least, is that Wikidot pages are messes of javascript.
21:06 🔗 xmc yeah
21:08 🔗 powerKitt Take http://www.scp-wiki.net/scp-343 for example. Revision history and files uploaded are javascript drop downs.
21:09 🔗 powerKitt Viewing a past revision requires using the History dropdown, and then clicking the button to view the revision. URL does not change. Source code for an article revision is obtained the same way.
21:10 🔗 powerKitt Oh, and the dropdowns don't even appear if you aren't a logged in Wikidot user who's a "member" of the SCP Foundation wiki.
21:10 🔗 xmc i have a wikidot account, haven't formally joined scp-wiki in any way, and they're visible to me
21:10 🔗 xmc when logged in
21:14 🔗 powerKitt Huh.
21:15 🔗 powerKitt http://ci-wiki.wikidot.com/item-experimentation They don't appear on this one, though. Which is strange.
21:17 🔗 powerKitt http://ci-wiki.wikidot.com/system:list-all-pages It should be noted that this is the Wikidot equivalent of MediaWiki's Special:AllPages
21:26 🔗 powerKitt http://r.wikidot.com/ What kind of absurd mistake is this. http://r.wikidot.com/system:list-all-pages
21:27 🔗 powerKitt Who thought a Wikidot based //URL SHORTENER// was a good idea??
21:27 🔗 xmc is this real?!?
21:27 🔗 xmc oh my gosh
21:27 🔗 powerKitt Apparently so
21:28 🔗 powerKitt I mean, I've seen some complete mistakes of free site usage before.
21:28 🔗 powerKitt But this is next level madness.
21:29 🔗 ae_g_i_s especially since the domain is way too long for a shortener
21:31 🔗 powerKitt Some quick calculations reveal that there's roughly 118762 pages on http://r.wikidot.com/
21:33 🔗 powerKitt So there's probably a bit under that many links "shortened" this way, as I haven't subtracted non-link pages.
21:49 🔗 powerKitt brb, checking something
21:49 🔗 powerKitt has quit IRC (Quit: Page closed)
21:52 🔗 powerKitt has joined #archiveteam-bs
21:53 🔗 powerKitt Well, I did some looking around, but I can't seem to find a way to get page source/revisions/filelists without javascript.
21:56 🔗 drunksci has quit IRC ()
21:56 🔗 powerKitt http://web.archive.org/web/20161202215539/http://scp-wiki.wikidot.com/scp-2111/ Wow that is one ugly mess the Wayback machine spit out
21:58 🔗 powerKitt http://i.imgur.com/pouy0Ar.png
22:05 🔗 xmc my gosh
22:07 🔗 arkiver powerKitt: try https://web-beta.archive.org/web/20161202215539/http://scp-wiki.wikidot.com/scp-2111/
22:08 🔗 powerKitt Apparently if you try to save a wikidot page manually with a logged in account, it freaks out.
22:09 🔗 Coderjoe joepie91: the entire pipermail tree is gone. the pipermail directory itself redirects to a google groups mailing list, and I don't think it includes any of the old list's messages
22:22 🔗 powerKitt http://scp-jp-sandbox2.wikidot.com/system:list-all-pages Apparently it's possible for a Wikidot wiki to delete system:list-all-pages
22:22 🔗 powerKitt great.
22:24 🔗 powerKitt That, or it's localized to the wiki region
22:24 🔗 powerKitt Either way, /great/.
22:27 🔗 ae_g_i_s the js is also quite horrible ^^
22:28 🔗 ae_g_i_s did a quick check how difficult it'd be to emulate/rewrite it, but it's minified and just...bah
22:29 🔗 yipdw web-beta works fine, but honestly I like the glitchy version more
22:29 🔗 yipdw I mean, really, the page has a mention of "memetic security systems"
22:29 🔗 yipdw OF COURSE you need glitches
22:30 🔗 powerKitt ae_g_i_s: If you're using Notepad++, the JSTool plugin can make it less hideous looking.
22:32 🔗 yipdw people who run cyberpunkish fiction sites should totally install request filters that, if they detect something like ia_archiver, introduces glitch CSS
22:32 🔗 yipdw that would be awesome and would drive people here insane
22:32 🔗 * yipdw +1
22:32 🔗 ae_g_i_s powerKitt: thx, chrome does have a pretty printer too, but what it can't do is refactor variables and other identifiers :/
22:32 🔗 xmc that's a perfect item for "evil thought of the day"
22:32 🔗 yipdw I prefer to think of it as performance art
22:33 🔗 ae_g_i_s :D
22:33 🔗 yipdw you don't destroy the content, you merely jack with its form
22:33 🔗 xmc https://twitter.com/search?q=3totd
22:33 🔗 yipdw oh i didn't know that was a thing
22:34 🔗 xmc it's mostly a few people in seattle
22:35 🔗 yipdw OR
22:36 🔗 yipdw ok so, these days, you can get the current date from Javascript really easily and it's not too hard to use that to do things like manipulate CSS classes
22:36 🔗 yipdw if you detect ia_archiver or ArchiveBot: introduce CSS and Javascript to activate it, but the change is subtle and occurs over time
22:36 🔗 yipdw like, have the page slowly rot
22:37 🔗 xmc or you could make two requests to the same endpoint, which should return different results; if you get the same result then you're being cached or archived, so activate the payload
22:37 🔗 ae_g_i_s a dali painting over 4 years of wayback machine
22:37 🔗 xmc hm
22:37 🔗 ae_g_i_s melting away
22:37 🔗 xmc i like this rotting idea though
22:39 🔗 yipdw I wonder if there's a way to do this just with CSS
22:41 🔗 Sanqui yipdw: css prefixes are literally a way of webpage rot
22:41 🔗 ae_g_i_s yipdw: that's exactly the reason i have the wikipedia page for media queries open
22:41 🔗 yipdw yeah, but I want something that occurs over time
22:41 🔗 Sanqui because they stop being supported at some point
22:41 🔗 yipdw controlled
22:41 🔗 yipdw not via vendor prefixes
22:41 🔗 powerKitt http://de-scp.wikidot.com/ http://scp-wiki-de.wikidot.com/ Weird. There's actually two German SCP Foundation wikis.
22:41 🔗 ae_g_i_s but they don't seem to do this kind of thing, i can't find anything that'd depend on a long-term state or date
22:42 🔗 yipdw yeah, me either
22:42 🔗 xmc so you serve it with the current unix time, and the further the page's stored time is from its run-time, it degrades more?
22:42 🔗 xmc stored in the source for the page or whatever
22:42 🔗 yipdw the closest thing I've found so far are the :past/:future selectors in the CSS level 4 proposal
22:42 🔗 yipdw but that's intended for WebVTT, which is all relative times
22:43 🔗 yipdw xmc: something like that, but only if the page was requested in a way that it's clear an archiver user-agent was involved
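
A minimal sketch of the "archiver user-agent" check yipdw describes, assuming the server has the request's User-Agent header as a string. The marker list and the function name `is_archiver` are hypothetical; only the two names mentioned in the chat (ia_archiver, ArchiveBot) are used, and real crawlers may identify themselves differently.

```python
# Hypothetical marker list: just the two archiver names mentioned in the chat.
ARCHIVER_MARKERS = ("ia_archiver", "archivebot")


def is_archiver(user_agent):
    """Return True if the User-Agent string contains a known archiver marker."""
    # Treat a missing header as "not an archiver" and compare case-insensitively.
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in ARCHIVER_MARKERS)
```

A request handler could call `is_archiver(headers.get("User-Agent"))` and only then include the extra CSS.
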
22:43 🔗 Sanqui really long term css animations
22:43 🔗 yipdw so the rot has to be client-side and ideally would not involve JS
22:43 🔗 Sanqui will degrade the page if you keep it open
22:43 🔗 Sanqui lol
22:43 🔗 xmc yipdw: hm.
22:44 🔗 xmc i was just thinking "if this page was autogenerated more than X days ago, activate progressive rot"
22:44 🔗 yipdw I figure someone must have done this already
22:44 🔗 xmc oh! (1) does wayback serve with the date-modified header, and (2) can you fetch that from page scripting
22:44 🔗 yipdw you can parse out the grab date from the URL
22:45 🔗 yipdw but I think you still need Javascript to do that
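
A rough sketch of "parse out the grab date from the URL", written in Python only to show the parsing step (in the page itself it would have to be JavaScript, as yipdw says). It assumes the 14-digit timestamp form seen in the Wayback links earlier in this log; `grab_date` is a made-up helper name.

```python
import re
from datetime import datetime

# Matches the /web/<14-digit timestamp>/ segment of a Wayback URL,
# allowing an optional lowercase modifier suffix after the digits.
WAYBACK_RE = re.compile(r"/web/(\d{14})[a-z_]*/")


def grab_date(wayback_url):
    """Return the capture time encoded in a Wayback URL, or None if absent."""
    match = WAYBACK_RE.search(wayback_url)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
```

For the scp-2111 link posted above, this yields 2016-12-02 21:55:39.
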
22:45 🔗 ae_g_i_s yeah, there's no 'text matching' in CSS selectors
22:45 🔗 xmc i'm thinking a thing that works independent of wayback itself
22:45 🔗 xmc just an age-of-page thingy
22:46 🔗 yipdw oh
22:47 🔗 ae_g_i_s one ugly and naive way to do it would be writing the age of the page as a class into a specific element in a server-side script...combined with a huge amount of css selectors
22:47 🔗 ae_g_i_s i.e. one per day or whatever time unit you're using
22:47 🔗 ae_g_i_s well, not just selectors, but also "CSS rules of the day" for every day in the future
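
A small sketch of ae_g_i_s's "ugly and naive" approach, assuming a server-side script that knows when the page was generated. The `age-N` class naming, both function names, and the use of opacity as the "rot" are illustrative assumptions, not anything from the chat.

```python
from datetime import date


def age_class(generated_on, today):
    """Return the class name encoding the page's age in whole days."""
    days = max(0, (today - generated_on).days)
    return f"age-{days}"


def rot_rules(max_days):
    """Build one CSS rule per day, fading linearly from opaque to invisible."""
    rules = []
    for day in range(max_days + 1):
        # Guard against division by zero when only day 0 is requested.
        opacity = max(0.0, 1.0 - day / max_days) if max_days else 1.0
        rules.append(f".age-{day} {{ opacity: {opacity:.2f}; }}")
    return "\n".join(rules)
```

The class is computed when the page is served, so a stored copy keeps whatever value it was captured with.
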
22:48 🔗 yipdw er wait lol
22:48 🔗 yipdw <time>
22:48 🔗 yipdw hmm
22:48 🔗 ae_g_i_s ?
22:48 🔗 ae_g_i_s is that an actual tag?
22:48 🔗 yipdw wait sorry
22:48 🔗 yipdw that's a markup tag, not an input control
22:49 🔗 yipdw there is an input type="time" and maybe you can do some stuff with attr^=value
22:49 🔗 ae_g_i_s damn :/ also, i don't think you can select based on form controls' content
22:49 🔗 yipdw er sorry, datetime
22:49 🔗 yipdw yeah, I think that's true too
22:51 🔗 yipdw that and date/datetime/datetime-local/etc. has pretty poor browser support, plus I don't see a way to autopopulate those with the current time
22:51 🔗 yipdw maybe in the future though
22:54 🔗 ae_g_i_s was gonna check out the 'turing complete' argument for CSS3/HTML5, but it requires user interaction
22:54 🔗 powerKitt http://pastebin.com/FuVWe9nY Preliminary Wikidot scrape for SCP Foundation wikis.
22:56 🔗 yipdw ae_g_i_s: oh, the Rule 110 automaton?
22:57 🔗 ae_g_i_s yipdw: yeah, exactly...was considering if maybe some parts of it would be reusable for this, but probably not
22:57 🔗 yipdw ah
23:01 🔗 powerKitt http://developer.wikidot.com/i-want-api-access
23:01 🔗 powerKitt "Note the API access needs to be enabled in _admin" well there goes my plans.
23:04 🔗 powerKitt Looks like I'm definitely going to have to whip up some kind of scraper.
23:10 🔗 GE has quit IRC (Quit: zzz)
23:12 🔗 powerKitt https://github.com/wertercatt/Wikidot-Scraper Used Google Chrome's inspect element tool to save the packets sent and received when you view:
23:13 🔗 ae_g_i_s oh, cool
23:13 🔗 powerKitt Page history, A specific page revision, source of a page revision, and file listing
23:14 🔗 powerKitt I have no idea where to start on writing a scraper though, so help would be appreciated.
23:19 🔗 powerKitt Fun fact: I have one of those "instantly save to Internet Archive" bookmarklets.
23:20 🔗 powerKitt and every so often I accidentally hit it while trying to click in the url bar.
23:20 🔗 xmc :)
23:27 🔗 powerKitt Anyway, I guess I'll try to figure out how site scraping works.
23:30 🔗 powerKitt Idea: script that finds YouTube video pages saved to the wayback machine, and then runs tubeup.py on them.
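
A partial sketch of the first half of that idea, assuming you already have a list of original URLs that the Wayback Machine has captured. How that list is obtained and how tubeup.py is then run are deliberately left out; the function names and the accepted hostnames are assumptions for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Assumed set of hostnames that serve YouTube watch pages.
YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def youtube_video_id(original_url):
    """Return the video id of a YouTube watch URL, or None for anything else."""
    parsed = urlparse(original_url)
    host = (parsed.hostname or "").lower()
    if host not in YOUTUBE_HOSTS or parsed.path != "/watch":
        return None
    ids = parse_qs(parsed.query).get("v")
    return ids[0] if ids else None


def videos_to_fetch(captured_urls):
    """Return de-duplicated canonical watch URLs, in first-seen order."""
    seen = []
    for url in captured_urls:
        video_id = youtube_video_id(url)
        if video_id and video_id not in seen:
            seen.append(video_id)
    return [f"https://www.youtube.com/watch?v={vid}" for vid in seen]
```

Each returned URL would then be handed to the uploader one at a time.
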
23:34 🔗 ndiddy has joined #archiveteam-bs
23:37 🔗 powerKitt has quit IRC (Quit: Page closed)