#archiveteam 2016-08-17,Wed

Time Nickname Message
00:00 🔗 laufwerkf has joined #archiveteam
00:17 🔗 tfgbd_znc has quit IRC (Ping timeout: 633 seconds)
00:29 🔗 kristian_ has quit IRC (Leaving)
00:49 🔗 JesseW has joined #archiveteam
00:58 🔗 DoomTay has joined #archiveteam
01:26 🔗 ZeoNet has quit IRC (Read error: Operation timed out)
01:52 🔗 BlueMaxim has joined #archiveteam
02:24 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
02:27 🔗 Zialus is now known as RMF|away
02:31 🔗 nicolas17 so...
02:32 🔗 nicolas17 how do I help with archiving stuff? the Warrior?
02:52 🔗 dashcloud that's probably the easiest way
02:55 🔗 nicolas17 I just ran it as a docker container in a VPS
02:56 🔗 nicolas17 took me a while to get the web UI to work (port forwarding and all)
02:56 🔗 nicolas17 I set it to "archiveteam's choice" and it seems its choice was urlteam, which is frequently giving "no tasks available" :o
02:58 🔗 nicolas17 the web UI is super slick
03:07 🔗 laufwerkf has quit IRC ()
03:09 🔗 nicolas17 what project does have tasks?
03:09 🔗 nicolas17 I have a *lot* of bandwidth and I'd like to use it :P
03:15 🔗 JesseW has joined #archiveteam
03:28 🔗 DoomTay Doesn't the warrior interface display all possible tasks?
03:32 🔗 nicolas17 it shows all possible projects, and most of the ones I try say they have no tasks from the tracker
03:32 🔗 DoomTay nicolas17: How about Orkut? That's going kaput next month.
03:33 🔗 RichardG has quit IRC (Ping timeout: 370 seconds)
03:34 🔗 nicolas17 the throttling and project switching is pretty annoying... like if I switch to orkut, it doesn't start any new orkut task because there are too many concurrent tasks from another project running already... but all those tasks are sleeping! ("No items available currently. Trying again in 120 seconds")
03:35 🔗 JesseW nicolas17: they will eventually time out, and it will load new tasks from orkut
03:36 🔗 JesseW there certainly are various things that could be improved about the warrior, though
03:36 🔗 nicolas17 more like keep sleeping because of heavy tracker rate limiting on the orkut project :P
03:36 🔗 JesseW when I looked into it, I got stuck trying to set up a testing environment
03:36 🔗 JesseW nicolas17: sure, but at least they'll be waiting on orkut
03:37 🔗 JesseW also, if you can, URLteam can always use people investigating shorteners -- then I can add more to the tracker, and there will be more work to do
03:38 🔗 nicolas17 I'm on a gigabit pipe doing pretty brief 50KB/s bursts and then sleeping, it's a bit frustrating
03:40 🔗 tomwsmf has quit IRC (Read error: Operation timed out)
03:42 🔗 DoomTay What's your ISP?
03:42 🔗 nicolas17 I'm running it on a VPS
03:44 🔗 nicolas17 maybe I should run an archivebot node instead? :P
03:44 🔗 bwn nicolas17: <HCross2> Putting another call out. We really could do with a few more newsbuddy grabbers. If anyone has a fast, stable connection and is willing to help, just come into #newsgrabber and let myself or arkiver know please
03:44 🔗 * nicolas17 still reading the wiki
03:44 🔗 JesseW nicolas17: no new archivebot nodes for now
03:45 🔗 nicolas17 oki
03:53 🔗 nicolas17 will the orkut grab finish in time with the current rate limiting? it seems like adding more warriors would make no difference
03:54 🔗 RichardG has joined #archiveteam
04:13 🔗 ravetcofx has quit IRC (Read error: Connection reset by peer)
04:16 🔗 ravetcofx has joined #archiveteam
04:20 🔗 DoomTay has quit IRC (DoomTay)
04:23 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
04:30 🔗 Sk1d has joined #archiveteam
05:35 🔗 barblfish has joined #archiveteam
05:36 🔗 barblfish According to the wiki article on DNS History, "the site is a zombie"
05:36 🔗 barblfish I just took a peek and the site looks to be working normally, including search
05:37 🔗 barblfish has quit IRC (Client Quit)
05:38 🔗 barblfish has joined #archiveteam
05:39 🔗 JesseW barblfish: good! maybe the interest prompted the site owner to keep it running
05:40 🔗 JesseW barblfish: feel free to update the wiki page, mentioning that the site seems to be working (but mention exactly what you did and didn't try, as other pieces may still be broken).
05:40 🔗 barblfish Probably should TRY to archive it at a "leisurely" pace just in case. For whatever reason, the closure notice is still there
05:40 🔗 JesseW the magic word is yahoosucks
05:40 🔗 barblfish K
05:41 🔗 JesseW barblfish: yeah, having an individual run a grab-site instance at, say 1 request per couple of minutes (in a random order, with a random delay) is probably worth doing
05:46 🔗 barblfish has quit IRC (Quit: ChatZilla 0.9.92 [Firefox 48.0/20160726073904])
06:02 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
06:07 🔗 nicolas17 has quit IRC (Read error: Operation timed out)
07:44 🔗 Selavi has joined #archiveteam
07:48 🔗 JesseW has joined #archiveteam
08:25 🔗 MMovie1 has joined #archiveteam
08:27 🔗 MMovie has quit IRC (Read error: Operation timed out)
08:42 🔗 Morbus has quit IRC (Ping timeout: 255 seconds)
08:45 🔗 Morbus has joined #archiveteam
09:08 🔗 Honno has joined #archiveteam
10:07 🔗 WinterFox has joined #archiveteam
10:10 🔗 JesseW has quit IRC (Read error: Operation timed out)
10:25 🔗 SketchCow has quit IRC (Read error: Operation timed out)
10:28 🔗 SketchCow has joined #archiveteam
10:28 🔗 swebb sets mode: +o SketchCow
11:03 🔗 ats has quit IRC (Quit: Lost terminal)
11:05 🔗 ats has joined #archiveteam
12:59 🔗 BlueMaxim has quit IRC (Quit: Leaving)
13:00 🔗 RMF|away is now known as Zialus
13:05 🔗 WinterFox has quit IRC (Read error: Operation timed out)
13:21 🔗 DoomTay has joined #archiveteam
13:30 🔗 ats has quit IRC (Quit: leaving)
13:36 🔗 ats has joined #archiveteam
13:36 🔗 joepie91 Reddit thread regarding Google Code deleting tarballs in 5 months: https://www.reddit.com/r/programming/comments/4y4epv/about_5_months_from_now_the_tarballs_from_google/
13:45 🔗 arkiver joepie91: you're talking about the Google Code Archive shutting down too?
13:46 🔗 joepie91 seems so
13:46 🔗 joepie91 have not read the thread carefully
13:46 🔗 joepie91 just passing it on
13:46 🔗 arkiver yeah, it looks like it
13:47 🔗 arkiver When we're done with the 'original' google code we'll do the google code archive too
13:48 🔗 bauruine has quit IRC (Ping timeout: 260 seconds)
13:53 🔗 bauruine has joined #archiveteam
13:58 🔗 DoomTay Since ArchiveBot seems to not handle LEGO.com videos properly, I'm going to try my hand at archiving videos at http://web.archive.org/web/20160616230429/http://www.lego.com/en-us/chima/videos "manually". And I just figured out how to do that
14:07 🔗 arkiver how are you going to do that?
14:07 🔗 arkiver let's move this to #archiveteam-bs also
14:08 🔗 DoomTay Yeaah....
14:09 🔗 DoomTay Can't
14:31 🔗 Sneakyimp has joined #archiveteam
14:53 🔗 voltagex DoomTay: come over to -bs, also, youtube-dl should save those for you.
14:53 🔗 voltagex arkiver, joepie91: just emailed Chris DiBona about getting in touch with ArchiveTeam re: Google Code, we'll see how that goes.
14:54 🔗 voltagex the #googlecodeblue wiki needs some TLC.
14:54 🔗 voltagex tracker seems to be down, also.
14:54 🔗 DoomTay It looks like I'm banned from -bs. Also, I already tried youtube-dl with ArchiveBot. no luck.
14:56 🔗 voltagex no, you'd only be able to use it on the live site
14:56 🔗 voltagex if the videos ain't in archive.org, they ain't in archive.org.
14:56 🔗 voltagex I wonder what you did to get banned from bs
14:57 🔗 DoomTay Apparently a history of "saying galactically dumb shit"
14:59 🔗 voltagex oh well, live and learn
14:59 🔗 voltagex what are you trying to save exactly?
14:59 🔗 voltagex if it's missing files in archive.org you'd have to go back to the source
14:59 🔗 voltagex if they're gone there, YouTube or you're too late.
15:01 🔗 DoomTay I'm trying to save videos off of http://www.lego.com/en-us/chima/videos . Problem is their video player is powered by AngularJS, and the player is set up "on the fly"
15:02 🔗 DoomTay And using youtube-dl is also a no go: "unsupported URL"
15:03 🔗 voltagex correct
15:04 🔗 voltagex post a correctly formatted issue / request on https://github.com/rg3/youtube-dl/issues
15:06 🔗 voltagex the only other hint I'll give you is look at the network traffic for manifest.f4m
15:22 🔗 JesseW has joined #archiveteam
15:33 🔗 nwf Hey channel. I have Internet2 at my disposal and a huge stash of unused disk space; can I be of assistance for google code or some other project? Ideally your answer is something like "Yes, please run aria2 on each URL in the list at $URL." ;)
15:36 🔗 JesseW nwf: join #newsgrabber and ask about being a grabber
15:37 🔗 JesseW nwf: also, check out iabackup.archiveteam.org for a use for your disk space
15:37 🔗 JesseW and THANK YOU!
15:37 🔗 JesseW feel free to ask here if you have questions
15:38 🔗 nwf Thanks. :)
15:39 🔗 JesseW you can also run a #warrior, but we don't have any project ATM that needs help, I think. But that could change anytime.
15:40 🔗 Sanqui you can run an archivebot pipeline
15:40 🔗 Sanqui reliable long term ones are always wanted
15:41 🔗 nwf Whazzat?
15:41 🔗 Sanqui on-demand archiver of small-to-medium or at-risk websites
15:42 🔗 Sanqui see #archivebot, http://archiveteam.org/index.php?title=ArchiveBot
15:42 🔗 nwf Sounds neat. Who has authority to push to the queue? (I don't want there to be risk to my hosting organization.)
15:43 🔗 Sanqui trusted users from here, though the bar is set pretty low
15:43 🔗 Sanqui and sometimes questionable stuff is archived
15:43 🔗 Sanqui so if that's of concern, it's fine
15:43 🔗 nwf Well, it just means I need to ask the admins for permission / give them a heads up that network security might come after them for a particular IP address.
15:44 🔗 JesseW we're also not accepting new #archivebot pipelines right now, according to yipdw (who maintains the list)
15:44 🔗 Sanqui oh
15:44 🔗 Sanqui I didn't know that, alright
15:44 🔗 nicolas17 has joined #archiveteam
15:45 🔗 JesseW yeah, new archivebot pipelines are blocked by various code changes (i'm not certain exactly what)
15:45 🔗 Sanqui well archivebot needs to be rewritten, I know that, but it's trudging along anyway :P
15:45 🔗 JesseW but AFAIK, #newsgrabber is actively looking for new pipelines, and #iabackup, while inactive, is still accepting new storage
15:47 🔗 JesseW Sanqui: http://archiveteam.org/index.php?title=ArchiveBot#Volunteer_a_Node see the note at the top
15:47 🔗 Sanqui got it
16:03 🔗 DoomTay has quit IRC (Quit: Page closed)
16:08 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
16:21 🔗 DoomTay has joined #archiveteam
17:04 🔗 AlexLehm has joined #archiveteam
17:33 🔗 kristian_ has joined #archiveteam
18:06 🔗 JW_work has quit IRC (Quit: Leaving.)
18:07 🔗 JW_work has joined #archiveteam
18:10 🔗 r3c0d3x http://www.npr.org/sections/ombudsman/2016/08/17/489516952/npr-website-to-get-rid-of-comments NPR is removing comments from articles, but the comments will still be alive through Disqus. Are we planning on addressing this? (i.e. dumping threads from disqus for each article, or perhaps something else..?)
18:11 🔗 r3c0d3x Quote from the article: "All existing comments on the site will disappear. That is because while comments look as though they exist on the NPR.org pages, they actually live within Disqus, an outside moderation platform used by NPR. So when the commenting software is removed, the archival comments go with it, Montgomery said, adding that it is not possible to remove the comment system but leave the old comments. Individual users will still be able
18:11 🔗 r3c0d3x to see an archive of their own comments in their Disqus accounts."
18:30 🔗 JW_work has quit IRC (Read error: Connection reset by peer)
18:31 🔗 JW_work has joined #archiveteam
18:55 🔗 GLaDOS has quit IRC (Read error: Operation timed out)
18:56 🔗 GLaDOS has joined #archiveteam
19:03 🔗 SmileyG has quit IRC (Remote host closed the connection)
19:13 🔗 JW_work1 has joined #archiveteam
19:15 🔗 JW_work has quit IRC (Read error: Operation timed out)
19:16 🔗 Smiley has joined #archiveteam
19:28 🔗 ats has quit IRC (reeeeboooooooot)
19:59 🔗 AlexLehm has quit IRC (Ping timeout: 260 seconds)
20:06 🔗 tomwsmf has joined #archiveteam
20:07 🔗 SirCmpwn has quit IRC (Read error: Operation timed out)
20:10 🔗 ats has joined #archiveteam
20:24 🔗 SirCmpwn has joined #archiveteam
20:25 🔗 kristian_ has quit IRC (Leaving)
20:31 🔗 pfallenop has quit IRC (Read error: Operation timed out)
20:37 🔗 mr-b has quit IRC (Read error: Operation timed out)
20:40 🔗 mr-b has joined #archiveteam
20:40 🔗 pfallenop has joined #archiveteam
20:42 🔗 DoomTay has quit IRC (Quit: Page closed)
20:45 🔗 mr-b has quit IRC (Ping timeout: 246 seconds)
21:02 🔗 kristian_ has joined #archiveteam
21:02 🔗 mr-b has joined #archiveteam
21:06 🔗 Honno has quit IRC (Read error: Operation timed out)
21:35 🔗 robink has quit IRC (Ping timeout: 501 seconds)
22:08 🔗 robink has joined #archiveteam
22:38 🔗 pfallenop has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 nicolas17 has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 SketchCow has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 Morbus has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 zenguy has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 superkuh has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 dashcloud has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 chazchaz has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 winr5r has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 MrRadar has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 RedType_ has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 zino has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 arkiver has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 Peetz0r_ has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 Infreq has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 aschmitz has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 gibigiana has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 w0rp has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 HCross has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 indrora has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 dxrt has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 Zebranky has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 ranma has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 antomatic has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 hook54321 has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 luckcolor has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 ErkDog has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 Cameron_D has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 dcmorton has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 is- has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 Jogie has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 mistym- has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 swebb has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 atlogbot has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 dserodio has quit IRC (ny.us.hub irc.servercentral.net)
22:38 🔗 filippo__ has quit IRC (ny.us.hub irc.servercentral.net)
22:40 🔗 andromed1 has quit IRC (Read error: Connection reset by peer)
22:55 🔗 JW_work has joined #archiveteam
23:04 🔗 JW_work1 has quit IRC (Read error: Operation timed out)
23:05 🔗 DoomTay has joined #archiveteam
23:05 🔗 pfallenop has joined #archiveteam
23:05 🔗 nicolas17 has joined #archiveteam
23:05 🔗 SketchCow has joined #archiveteam
23:05 🔗 Morbus has joined #archiveteam
23:05 🔗 zenguy has joined #archiveteam
23:05 🔗 superkuh has joined #archiveteam
23:05 🔗 dashcloud has joined #archiveteam
23:05 🔗 chazchaz has joined #archiveteam
23:05 🔗 winr5r has joined #archiveteam
23:05 🔗 MrRadar has joined #archiveteam
23:05 🔗 RedType_ has joined #archiveteam
23:05 🔗 zino has joined #archiveteam
23:05 🔗 arkiver has joined #archiveteam
23:05 🔗 Infreq has joined #archiveteam
23:05 🔗 Peetz0r_ has joined #archiveteam
23:05 🔗 indrora has joined #archiveteam
23:05 🔗 aschmitz has joined #archiveteam
23:05 🔗 gibigiana has joined #archiveteam
23:05 🔗 w0rp has joined #archiveteam
23:05 🔗 HCross has joined #archiveteam
23:05 🔗 irc.servercentral.net sets mode: +oooo SketchCow chazchaz arkiver HCross
23:05 🔗 dxrt has joined #archiveteam
23:05 🔗 Zebranky has joined #archiveteam
23:05 🔗 ranma has joined #archiveteam
23:05 🔗 antomatic has joined #archiveteam
23:05 🔗 hook54321 has joined #archiveteam
23:05 🔗 luckcolor has joined #archiveteam
23:05 🔗 ErkDog has joined #archiveteam
23:05 🔗 Cameron_D has joined #archiveteam
23:05 🔗 dcmorton has joined #archiveteam
23:05 🔗 irc.servercentral.net sets mode: +oooo dxrt antomatic luckcolor dcmorton
23:05 🔗 is- has joined #archiveteam
23:05 🔗 mistym- has joined #archiveteam
23:05 🔗 swebb has joined #archiveteam
23:05 🔗 atlogbot has joined #archiveteam
23:05 🔗 dserodio has joined #archiveteam
23:05 🔗 filippo__ has joined #archiveteam
23:05 🔗 irc.servercentral.net sets mode: +oo mistym- swebb
23:05 🔗 swebb sets mode: +o brayden_
23:05 🔗 swebb sets mode: +o Atluxity
23:05 🔗 swebb sets mode: +o DFJustin
23:05 🔗 swebb sets mode: +o beardicus
23:05 🔗 swebb sets mode: +o midas
23:05 🔗 swebb sets mode: +o SadDM
23:05 🔗 swebb sets mode: +o balrog
23:05 🔗 swebb sets mode: +o edsu
23:05 🔗 swebb sets mode: +o joepie91
23:05 🔗 swebb sets mode: +o altlabel
23:05 🔗 swebb sets mode: +o Jonimoose
23:05 🔗 swebb sets mode: +o xmc
23:08 🔗 dxrt has quit IRC (Ping timeout: 370 seconds)
23:10 🔗 dxrt has joined #archiveteam
23:13 🔗 max has joined #archiveteam
23:14 🔗 max i have a site that may have historical significance and i am thinking of shutting it down. who should i talk to about potentially getting it archived efficiently?
23:15 🔗 Frogging What's the site?
23:15 🔗 max www.ytmnd.com
23:16 🔗 xmc o my
23:16 🔗 nicolas17 ...okay yes that has historical / internet culture significance o.O
23:16 🔗 Frogging o.o
23:16 🔗 max it isn't really cost-effective to host anymore
23:16 🔗 xmc yea we can hold it
23:16 🔗 JW_work max: thank you for considering how best to archive it
23:16 🔗 xmc <3
23:16 🔗 max i could spend the time to try to get it on all virtualized, but i think it would only prolong the inevitable death
23:17 🔗 nicolas17 max: how much bandwidth is it eating?
23:17 🔗 JW_work the best way would be to make a copy of the whole site database, and ship/upload that to archive.org as an item
23:17 🔗 JW_work (we can help if you have questions)
23:18 🔗 JW_work if that's not feasible (and maybe as an alternative), we can make a scrape of it before it goes down, which will get copied into the Wayback Machine
23:18 🔗 max nicolas17: probably less than 10mbps on average, mainly the costs are colocation fees at the moment since the hardware is aging
23:18 🔗 xmc imo a scrape would be best in any case
23:18 🔗 howdoicom has joined #archiveteam
23:18 🔗 xmc guided by a list of valid sites
23:18 🔗 xmc warc it up
23:18 🔗 JW_work it'd just be nice to have the raw database, too, in case someone else wants to host it again later
23:18 🔗 BlueMaxim has joined #archiveteam
23:18 🔗 xmc yeah
23:19 🔗 JW_work but yeah, both — both would be best
23:19 🔗 nicolas17 max: I meant in GB/month (a constant 10mbps would mean 3TB/mo)
23:19 🔗 max nicolas17: i haven't looked and i get billed at 95th percentile
23:19 🔗 nicolas17 JW_work: bothisgood.gif
23:19 🔗 JW_work exactly
23:19 🔗 max the content drive is currently 1.7T, i think i'd probably need to anonymize the db at the very least, remove private messages and stuff
23:20 🔗 max at the very least, i could write a script to create a list of every unique URL on the entire site
23:20 🔗 Frogging JW_work: someone should probably write some scripts
23:21 🔗 JW_work well, if you're willing, I'm pretty certain archive.org would be delighted to get a non-anonymized version of the drive and keep it private for a couple of decades or so
23:21 🔗 max to be fair, there is probably a ton of dmca violations, and horrific nsfw stuff
23:21 🔗 JW_work 1.7T is not particularly painfully large for us
23:21 🔗 max i figured
23:22 🔗 nicolas17 JW_work: I heard you guys wanted a copy of Mapillary in case they go under...
23:22 🔗 max the database is pretty large
23:22 🔗 JW_work yep, it'd be great to have that, too
23:22 🔗 nicolas17 Mapillary staff told me they have 200TB of photos, so yeah, 1.7TB is small XD
23:23 🔗 JW_work yeah, 200TB is in the range where we'd need to discuss with IA staff before dropping it on them :-)
23:23 🔗 max but 1.5TB of our content is probably homemade drawings of sonic the hedgehog having sex with tails
23:24 🔗 Frogging that's fine
23:24 🔗 JW_work eh, we're still glad to have it
23:24 🔗 max db is mysql and around 180gb. it has historical view data for every site dating back to 2004 i think
23:24 🔗 JW_work that would be awesome to have
23:25 🔗 nicolas17 you mean like access logs? data scientists are drooling right now
23:25 🔗 JW_work :-)
23:25 🔗 max it's more like date, site_id, view_counter
23:25 🔗 max but yeah some neat stuff could be done with it
23:27 🔗 max i wonder if a warc would be able to faithfully encapsulate/play back a ytmnd
23:27 🔗 max it uses a flash loader because at the time it was the only way you could gaplessly loop WAV files
23:28 🔗 nicolas17 warc as a format should support it, a naive scraper trying to create the warc would have trouble with the Flash though
23:28 🔗 Frogging max: How long can you keep it online for?
23:28 🔗 xmc if you want to make a html5 version that plays nicely in the archive, people from the future would appreciate it
23:28 🔗 xmc if you don't want to, that's fine
23:29 🔗 max Frogging: indefinitely
23:29 🔗 Frogging thanks
23:29 🔗 max this is pretty preliminary, but if i dont give it to someone it will just sit on a hard drive in my closet forever which seems pretty lame
23:30 🔗 nicolas17 I think Google made a Flash-to-HTML5 converter (mainly for Flash ads to work on mobile), it would be interesting to see if it can handle ytmnd .swf's
23:30 🔗 DoomTay I once tried to make a script to convert the things to HTML5, but I got absolutely nowhere with it
23:30 🔗 nicolas17 (actually kind of Flash-to-JSON which is then interpreted by an HTML5/Javascript player)
23:30 🔗 max ytmnd just has 1 swf for the player and everything else is standard image/audio formats
23:31 🔗 max i made a prelim html5 version in 2011 but audio support wasnt very good back then
23:31 🔗 nicolas17 oh :o
23:31 🔗 max and that was the last time i really worked on the site
23:31 🔗 nicolas17 I thought you had vector swf animations and stuff
23:34 🔗 max it's a glorified flash intro and then is just used to play sound
23:34 🔗 max i.e. waits until the gif and audio are loaded before playing either
23:37 🔗 DoomTay We should probably give max the secret phrase so he can make a page about this
23:37 🔗 nicolas17 xmc: ok, there is no way you can throw a generic warc grabber at this; there is a swf loader that gets a json to know what wav and jpg to load
23:38 🔗 nicolas17 so if you want to scrape, custom script it is
23:38 🔗 xmc yea
23:38 🔗 xmc that was my gut feeling
23:38 🔗 nicolas17 http://picard.ytmnd.com/info/508/json
23:38 🔗 xmc this sounds like a good job for the warrior
23:42 🔗 kristian_ has quit IRC (Leaving)
23:43 🔗 max turns out if i just change the default from flash to html5, it seems to work fine now
23:44 🔗 max less flashy since there's no status or anything, but it lets you see the site at least
23:47 🔗 DoomTay Ha ha, flashy
23:51 🔗 ErkDog omg I grew up on YTMND
23:51 🔗 nicolas17 ErkDog: what were the consequences? :P
23:52 🔗 ErkDog max, is it PHP / MySQL?
23:53 🔗 ErkDog guess I could read, lol
23:54 🔗 max yeah common lamp stack
23:54 🔗 ErkDog lol I mirrored this from YTMND 1,000 years ago
23:54 🔗 ErkDog http://erkdog.netho.tk/picard/
23:55 🔗 ErkDog YTMND was basically audio-memes before memes were even a thing
23:55 🔗 ErkDog well is*
23:55 🔗 ErkDog it's like a meme w/ audio / animation except memes didn't exist back then
23:57 🔗 max we just called them fads, very few of them had the staying power of something like dickbutt
