#urlteam 2016-12-05,Mon

Time Nickname Message
00:33 Somebody has joined #urlteam
05:41 Sk1d has quit IRC (Ping timeout: 194 seconds)
05:44 Start has quit IRC (Quit: Disconnected.)
05:47 Sk1d has joined #urlteam
05:47 Start has joined #urlteam
06:46 svchfoo1 has quit IRC (Quit: Closing)
06:47 svchfoo1 has joined #urlteam
06:48 svchfoo3 sets mode: +o svchfoo1
06:56 svchfoo1 has quit IRC (Quit: Closing)
06:57 Somebody has quit IRC (Ping timeout: 370 seconds)
06:57 svchfoo1 has joined #urlteam
06:58 svchfoo3 sets mode: +o svchfoo1
10:57 swebb has quit IRC (Ping timeout: 246 seconds)
10:57 swebb has joined #urlteam
10:58 svchfoo1 sets mode: +o swebb
11:21 dashcloud has quit IRC (Read error: Operation timed out)
11:33 dashcloud has joined #urlteam
14:25 WinterFox has quit IRC (Read error: Operation timed out)
15:41 Start has quit IRC (Quit: Disconnected.)
17:08 VADemon has joined #urlteam
17:37 Somebody has joined #urlteam
18:26 Somebody has quit IRC (Ping timeout: 370 seconds)
20:08 VADemon has quit IRC (Quit: left4dead)
20:54 HCross has quit IRC (Quit: Leaving)
20:58 dashcloud has quit IRC (Read error: Operation timed out)
20:59 pizzaiolo has joined #urlteam
20:59 pizzaiolo folks
20:59 dashcloud has joined #urlteam
20:59 pizzaiolo I'm talking to the maintainer of a link shortener service
21:00 pizzaiolo he's asking whether submitting the database of shortened links to the internet archive is enough for backing up
21:00 pizzaiolo what are your thoughts?
21:15 Smiley has quit IRC (Ping timeout: 250 seconds)
21:20 Smiley has joined #urlteam
21:23 hawc145 has joined #urlteam
21:24 hawc145 has quit IRC (Read error: Connection reset by peer)
21:30 bwn pizzaiolo: sounds good to me!
21:31 pizzaiolo bwn: does he need to do anything else? I'm not very sure how to help him as I'm also a beginner to the world of internet archiving
21:37 bwn a database of the shortlinks to urls sounds like a good start to me
21:37 bwn right now the urlteam results are packaged up into beacon format: https://gbv.github.io/beaconspec/beacon.html
21:43 bwn if the maintainer is willing, there's a python tool that can be used to pretty easily automate an upload to ia: https://internetarchive.readthedocs.io/en/latest/
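
A rough sketch of what bwn describes: render a shortcode-to-URL mapping as BEACON-style text, then hand the file to the `internetarchive` Python library. The function names, the `mediatype` value, and the choice to percent-encode `|` inside targets are illustrative assumptions, not URLTeam's actual packaging; the item identifier passed to the upload is hypothetical and would be chosen by the maintainer.

```python
def to_beacon(prefix, links):
    """Render {shortcode: target_url} as BEACON text: meta lines, then one link per line."""
    lines = ["#FORMAT: BEACON", f"#PREFIX: {prefix}"]
    for code, target in sorted(links.items()):
        # Assumption: escape "|" in targets so it cannot be read as a field separator.
        lines.append(f"{code}|{target.replace('|', '%7C')}")
    return "\n".join(lines) + "\n"


def upload_beacon(identifier, path):
    """Upload one file to an Internet Archive item (makes network requests).

    Requires `pip install internetarchive` and credentials from `ia configure`.
    """
    from internetarchive import upload  # imported lazily so to_beacon works without it

    return upload(identifier, files=[path], metadata={"mediatype": "data"})
```

The maintainer would write the output of `to_beacon` to a file and pass that path, together with an identifier of their choosing, to `upload_beacon`.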
22:22 pizzaiolo thanks bwn
22:29 JW_work has joined #urlteam
22:32 JW_work pizzaiolo: thanks for checking in! What's the URL of the shortening service you are in contact with? Is it still making new URLs? How many does it have currently, and how many new ones are generated per month?
22:32 pizzaiolo JW_work: http://ĝi.ga
22:33 pizzaiolo yes, still active
22:33 JW_work thanks for the URL. It looks like the short codes are 4 characters
22:33 pizzaiolo it's not a large service, let me check how many with the maintainer
22:33 JW_work At that scale, it's probably easiest for us to just grab them ourselves.
22:34 JW_work (although an upload by the maintainer would be certainly welcome as well!)
22:35 pizzaiolo he seemed enthusiastic about the idea
22:35 pizzaiolo he says he'll surely do it
22:36 JW_work great!
22:36 JW_work And I'll make a grab of it as well; it looks easy enough to do so
22:37 pizzaiolo hmm says he doesn't know how many
22:37 JW_work http://www.xn--i-8ia.ga/aaf9
22:37 pizzaiolo neat, thanks JW_work
22:37 JW_work that's an example
22:37 JW_work just a 302 response
22:37 pizzaiolo he says maybe less than 100 links per month
22:37 JW_work cool, that's certainly easy for us to grab
22:38 JW_work and 200 for non-existing ones
22:38 JW_work you could warn him he will get somewhat higher traffic for a day or so (probably less) starting this evening
22:39 pizzaiolo neat, will do!
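
The behaviour JW_work describes above (a 302 for an existing short code, a 200 for a non-existing one) is enough to sketch how such a grab could work. This is only an illustration, not the URLTeam tracker code: the alphabet is an assumption based on the single example code `aaf9`, and all function names are made up for this sketch.

```python
import http.client
import itertools
import string

# Assumption: codes use lowercase letters and digits (the only example seen is "aaf9").
ALPHABET = string.ascii_lowercase + string.digits


def candidate_codes(length=4, alphabet=ALPHABET):
    """Yield every possible short code of the given length."""
    for chars in itertools.product(alphabet, repeat=length):
        yield "".join(chars)


def classify(status, location):
    """Return the redirect target for a 302 with a Location header, else None."""
    if status == 302 and location:
        return location
    return None  # e.g. the 200 page served for non-existing codes


def probe(host, code):
    """Request one short code without following the redirect (makes a network request)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("GET", "/" + code)
        resp = conn.getresponse()
        return classify(resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

Under the assumed 36-character alphabet there are 36**4 = 1,679,616 candidate codes, so a real run would want a delay between calls such as `probe("www.xn--i-8ia.ga", "aaf9")` to keep the extra traffic modest.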
22:40 JW_work thanks very much for reaching out! If you'd like to help more, we have literally hundreds of URLs rumored to contain (or have previously contained) shorteners listed on the URLTeam wiki page that we'd love help investigating…
22:41 pizzaiolo JW_work: heh, I've added a couple of link shorteners to the list on the wiki
22:41 pizzaiolo I'm still very new to internet archiving but I love the idea
22:41 JW_work much appreciated!
22:42 pizzaiolo for now I've mostly relied on https://addons.mozilla.org/en-US/firefox/addon/archive-webextension/?src=search
22:43 JW_work you can also do that just with a bookmarklet javascript:(function(){location.href='http://web.archive.org/save/'+(location.href);})();
22:44 JW_work (although that fails on, say, github due to some security feature I don't remember the name of right now)
22:44 JW_work and if you want to grab a whole domain, #archivebot is your friend
22:44 pizzaiolo oh
22:45 JW_work (also, if you want to grab a single page and have it downloadable outside the Wayback Machine (and as such, not vulnerable to robots.txt changes), archivebot also works for that)
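
The bookmarklet JW_work posted a few lines up simply prefixes the current page address with the Wayback Machine's save endpoint. The same URL construction, sketched in Python for use outside a browser (the function name `save_url` is made up for this example):

```python
SAVE_PREFIX = "http://web.archive.org/save/"


def save_url(page_url):
    """Build the address the bookmarklet would navigate to for the given page."""
    return SAVE_PREFIX + page_url
```

Opening the resulting address in a browser should have the same effect as clicking the bookmarklet on that page.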
22:45 pizzaiolo ArchiveBot is an IRC bot designed to automate the archival of smaller websites (e.g. up to a few hundred thousand URLs). You give it a URL to start at, and it grabs all content under that URL, records it in a WARC, and then uploads that WARC to ArchiveTeam servers for eventual injection into the Internet Archive (or other archive sites).
22:45 JW_work please add https://addons.mozilla.org/en-US/firefox/addon/archive-webextension to the Internet Archive wiki page on the archiveteam wiki.
22:45 pizzaiolo my god this is brilliant
22:46 pizzaiolo okie
22:46 JW_work glad to hear it!
22:46 pizzaiolo where do you think it could go? on the wiki
22:46 JW_work just at the bottom of the page, under See Also/External Links
22:46 pizzaiolo ok
22:47 JW_work it's just nice to have links to all the Internet Archive-related tools that we know of
22:47 pizzaiolo of course
22:50 pizzaiolo added
22:55 JW_work thanks!
23:16 dashcloud has quit IRC (Remote host closed the connection)
23:23 Start has joined #urlteam