#archiveteam 2016-08-15,Mon

Time Nickname Message
00:01 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
00:14 🔗 DoomTay has joined #archiveteam
00:22 🔗 Coderjoe has joined #archiveteam
00:31 🔗 DoomTay has quit IRC (Ping timeout: 268 seconds)
00:42 🔗 JesseW has joined #archiveteam
01:09 🔗 BlueMaxim has joined #archiveteam
01:46 🔗 ravetcofx has quit IRC (Remote host closed the connection)
01:50 🔗 ravetcofx has joined #archiveteam
02:05 🔗 kristian_ has joined #archiveteam
02:21 🔗 Aranje has quit IRC (Ping timeout: 260 seconds)
02:31 🔗 yipdw_ is now known as yipdw
03:15 🔗 VADemon has quit IRC (Quit: left4dead)
03:49 🔗 redlob_ has joined #archiveteam
03:50 🔗 redlob has quit IRC (Read error: Operation timed out)
04:24 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
04:29 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
04:31 🔗 Sk1d has joined #archiveteam
04:43 🔗 JesseW has joined #archiveteam
04:52 🔗 tfgbd_znc has quit IRC (Ping timeout: 633 seconds)
04:56 🔗 DoomTay has joined #archiveteam
05:06 🔗 tfgbd_znc has joined #archiveteam
05:06 🔗 Honno has joined #archiveteam
05:20 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
05:28 🔗 DoomTay has quit IRC (Quit: Page closed)
05:51 🔗 TC02 has quit IRC (Read error: Connection reset by peer)
05:51 🔗 TC02 has joined #archiveteam
05:55 🔗 vitzli has joined #archiveteam
06:03 🔗 dashcloud has quit IRC (Read error: Operation timed out)
06:07 🔗 dashcloud has joined #archiveteam
06:41 🔗 tomwsmf has quit IRC (Read error: Operation timed out)
07:13 🔗 midas1 is now known as midas
07:30 🔗 signius has quit IRC (Read error: Operation timed out)
07:31 🔗 vitzli has quit IRC (Quit: Leaving)
07:35 🔗 brayden has joined #archiveteam
07:35 🔗 swebb sets mode: +o brayden
07:37 🔗 brayden_ has quit IRC (Read error: Operation timed out)
07:45 🔗 signius has joined #archiveteam
07:45 🔗 redlob_ has quit IRC (Read error: Operation timed out)
07:51 🔗 redlob has joined #archiveteam
08:16 🔗 kristian_ has quit IRC (Leaving)
08:58 🔗 z00nx has joined #archiveteam
09:05 🔗 schbirid has joined #archiveteam
09:29 🔗 WinterFox has joined #archiveteam
09:58 🔗 dashcloud has quit IRC (Read error: Operation timed out)
10:07 🔗 dashcloud has joined #archiveteam
11:17 🔗 Jon hey guys, does anyone know of anyone archiving old .plan files from famous-ish people? e.g. John Carmack's .plan is widely available, but apparently some other ID people published them too
11:18 🔗 schbirid i have a lot
11:20 🔗 schbirid buuuut i am on and off working on turning them into a nice interface or twitter bot so i am not too keen on sharing =(
11:22 🔗 schbirid bluesnews has a nice archive
11:27 🔗 Jon schbirid: thanks
11:27 🔗 Jon schbirid: please don't die or something before sharing :P
11:27 🔗 schbirid :D
11:28 🔗 midas yeah schbirid, dont do that.
11:28 🔗 schbirid midas has no rights to say anything about this kind of stuff until the jamendo stuff is up
11:28 🔗 midas ill shut up
11:28 🔗 midas :p
11:29 🔗 midas next week ill be switching back to my old ISP, so fast internet again
11:38 🔗 Stiletto has quit IRC (Read error: Connection reset by peer)
11:39 🔗 Stiletto has joined #archiveteam
11:44 🔗 Nemo_bis aww sorry for the failed switch
11:55 🔗 dashcloud has quit IRC (Read error: Operation timed out)
11:59 🔗 dashcloud has joined #archiveteam
12:06 🔗 espes__ has quit IRC (Read error: Connection reset by peer)
12:06 🔗 yeoldeto1 has quit IRC (Read error: Connection reset by peer)
12:17 🔗 yeoldetoa has joined #archiveteam
12:18 🔗 espes__ has joined #archiveteam
12:22 🔗 Sanqui has quit IRC (Remote host closed the connection)
12:27 🔗 Sanqui has joined #archiveteam
12:56 🔗 BlueMaxim has quit IRC (Quit: Leaving)
13:00 🔗 WinterFox has quit IRC (Ping timeout: 492 seconds)
13:42 🔗 Froggypwn has joined #archiveteam
13:50 🔗 sep332 has quit IRC (Quit: konversation out)
14:00 🔗 sep332 has joined #archiveteam
14:39 🔗 JesseW has joined #archiveteam
15:01 🔗 JesseW The sites described here are probably worth a look by ArchiveTeam (ping godane): http://listheory.prattsils.org/cataloging-plunder-thoughts-on-the-digital-text-sharing-underground/
15:04 🔗 SketchCow We archived UbuWeb some time ago.
15:15 🔗 godane cool
15:15 🔗 JesseW excellent
15:18 🔗 godane i put this url in archivebot https://www.memoryoftheworld.org/
15:19 🔗 godane we only have one page of that site archived based on status
15:25 🔗 tomwsmf has joined #archiveteam
15:29 🔗 JesseW godane: thanks!
15:32 🔗 JesseW has quit IRC (Quit: Leaving.)
15:32 🔗 JesseW has joined #archiveteam
15:51 🔗 Aranje has joined #archiveteam
15:57 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
15:58 🔗 K4k has quit IRC (Quit: WeeChat 1.5)
15:58 🔗 K4k has joined #archiveteam
16:06 🔗 SketchCow (It's 6.2tb of material left on the machine, although I have no idea how much of that is, say, some backups)
16:06 🔗 DoomTay has joined #archiveteam
16:07 🔗 SketchCow Or the aforementioned "Pipes". For example, the Hip-Hop pipe is 257gb for some reason.
16:07 🔗 SketchCow Ha ha, I see why.
16:08 🔗 SketchCow (There's a 230gb project in it, subdirectory.)
16:16 🔗 SketchCow Oh see, 13gb of the 27gb remaining were hip-hop albums I forgot to shove in! (They're portions of the albums that an ill-advised combining of two archives happened)
16:32 🔗 redlob_ has joined #archiveteam
16:33 🔗 redlob has quit IRC (Read error: Operation timed out)
16:36 🔗 DoomTay So Lego CHIMA basically concluded back in 2015, so there's no telling how much longer http://www.lego.com/en-us/chima will stay up. The thing about sticking it in ArchiveBot is that it seems it just won't download the actual video files despite giving it PhantomJS AND youtube-dl AND UA spoofing
16:46 🔗 JW_work1 has quit IRC (Quit: Leaving.)
16:52 🔗 bauruine has joined #archiveteam
17:02 🔗 AlexLehm has joined #archiveteam
17:04 🔗 JW_work has joined #archiveteam
18:30 🔗 DoomTay has quit IRC (Quit: Page closed)
19:17 🔗 nicolas17 has joined #archiveteam
19:18 🔗 nicolas17 hi archivers
19:18 🔗 nicolas17 the forum is the only part of the OpenStreetMap infrastructure that isn't managed by the main OSM operations team, and its administrator seems to be Missing In Action, I think nobody else has server access
19:19 🔗 nicolas17 they are at the point of considering setting up a new forum and pointing the forum.openstreetmap.org hostname to it
19:20 🔗 nicolas17 but that means losing the existing data
19:37 🔗 JW_work nicolas17: is there a link to the discussion about setting up a new forum?
19:42 🔗 nicolas17 sec
19:43 🔗 nicolas17 gee it'd be nice if gmane was up
19:43 🔗 JW_work ha. it's being worked on
19:43 🔗 nicolas17 https://lists.openstreetmap.org/pipermail/talk/2016-August/076580.html last post
19:45 🔗 nicolas17 if I'm going to scrape, looks like I can get the raw bbcode by logging in and trying to quote a post
19:50 🔗 JW_work we'll do a scrape in WARC format
19:50 🔗 JW_work it should be available in a couple of days
19:51 🔗 nicolas17 I once archived a small forum with plain old wget -r, and it got a *lot* of redundant stuff, like following links that returned the same thread in a different order, or thread?id=1 and post?id=2 giving pretty much the same content
19:53 🔗 bai isn't there often a url parameter you can pass to those sorts of forums to get the crawler-friendly page, with less links, all canonicalized?
19:53 🔗 bai or maybe based on googlebot useragent
19:53 🔗 nicolas17 hm maybe
19:54 🔗 nicolas17 that crap I archived with wget was an old phpbb
20:03 🔗 nicolas17 bai: I just tried setting googlebot UA and I get the same page
20:08 🔗 bai yeah I think there's a url parameter, like when you click a forum post link on google and you get the black-and-grey-on-white printer friendly view
20:09 🔗 bai not having much luck searching for what that option is though
20:09 🔗 nicolas17 are you sure this software supports such thing? :P
20:09 🔗 bai also that may be specifically phpBB or one of the other popular ones, dunno about fluxBB
20:09 🔗 nicolas17 ah
20:23 🔗 xmc nicolas17: yeah we have a set of options to make crawling forums work much better
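The redundancy nicolas17 describes (the same thread reachable in a different order, or through near-identical thread and post URLs) can be reduced by canonicalizing URLs before queueing them. The following is only a minimal sketch of that idea, not the option set xmc refers to and not anything ArchiveBot is known to do; the parameter names in `PRESENTATION_PARAMS` and the example hosts are hypothetical and would need to be adapted to the forum software in question.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical names of query parameters that only change presentation
# (ordering, view mode, session id), not the content being served.
PRESENTATION_PARAMS = {"sort", "order", "view", "sid"}


def canonicalize(url):
    """Drop presentation-only parameters, sort the rest, remove the fragment."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    )
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def dedupe(urls):
    """Keep the first URL seen for each canonical form, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

This does not catch the thread?id=1 versus post?id=2 case, since those are different paths with different ids; that needs knowledge of the specific forum software.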
20:32 🔗 DoomTay has joined #archiveteam
20:52 🔗 schbirid i archived the forums some months ago, can't find any trace of it though =)
20:52 🔗 schbirid iirc it was a bit harder than usual
20:52 🔗 schbirid but otherwise standard forum stuff
20:53 🔗 nicolas17 "Digital objects last forever – or 5 years, whichever comes first"
20:56 🔗 schbirid if no one else raises their hand, i will start a wget for it right now
20:56 🔗 nicolas17 schbirid: I interpreted JW_work's message as hand raising already?
20:57 🔗 xmc schbirid: go for it
20:57 🔗 JW_work schbirid: I just started a #archivebot job
20:57 🔗 JW_work but duplicate is probably fine
20:57 🔗 schbirid yeah :)
20:57 🔗 JW_work especially as the archivebot job seems to have failed...
20:58 🔗 schbirid on it
21:00 🔗 schbirid man, fluxbb is nice and clean
21:06 🔗 yipdw JW_work: yes, there's a reason why it failed
21:06 🔗 yipdw <JW_work> !a http://http://forum.openstreetmap.org/ --ignore-sets=forums
21:07 🔗 JW_work yeah, I see that now :-/
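The failed job quoted above was submitted with a doubled scheme (`http://http://`). A small pre-submission check could catch that kind of paste error. This is a hedged sketch only: the function names are made up for illustration and nothing here is part of ArchiveBot.

```python
import re

# Matches two URL schemes back to back at the start, e.g. "http://http://".
DOUBLED_SCHEME = re.compile(
    r"^[a-z][a-z0-9+.-]*://[a-z][a-z0-9+.-]*://", re.IGNORECASE
)


def has_doubled_scheme(url):
    """Return True if the URL starts with two schemes in a row."""
    return DOUBLED_SCHEME.match(url) is not None


def strip_doubled_scheme(url):
    """Remove leading schemes until only one remains."""
    while has_doubled_scheme(url):
        url = url.split("://", 1)[1]
    return url
```

For example, `strip_doubled_scheme("http://http://forum.openstreetmap.org/")` returns `"http://forum.openstreetmap.org/"`, and an already well-formed URL is returned unchanged.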
21:07 🔗 schbirid runs well here
21:09 🔗 DoomTay I wonder what would happen if the site gets overwhelmed. BZPower would say "The servers are too busy to handle your request". No idea what status code it would return though
21:12 🔗 MMovie has quit IRC (Read error: Connection reset by peer)
21:18 🔗 MMovie has joined #archiveteam
21:21 🔗 RichardG has quit IRC (Read error: Operation timed out)
21:28 🔗 RichardG has joined #archiveteam
21:35 🔗 schbirid running nice and smooth, good night for now
21:35 🔗 schbirid has quit IRC (Quit: Leaving)
21:39 🔗 RichardG has quit IRC (Ping timeout: 370 seconds)
21:57 🔗 Honno has quit IRC (Read error: Operation timed out)
22:17 🔗 Gfy has quit IRC (Read error: Operation timed out)
22:27 🔗 Gfy has joined #archiveteam
22:45 🔗 redlob_ has quit IRC (Read error: Operation timed out)
22:46 🔗 RichardG has joined #archiveteam
22:51 🔗 redlob has joined #archiveteam
23:02 🔗 DoomTay has quit IRC (Ping timeout: 268 seconds)
23:39 🔗 AlexLehm has quit IRC (Ping timeout: 260 seconds)
23:40 🔗 RichardG_ has joined #archiveteam
23:41 🔗 RichardG has quit IRC (Ping timeout: 370 seconds)
23:42 🔗 ZeoNet has joined #archiveteam
23:47 🔗 RichardG_ has quit IRC (Ping timeout: 370 seconds)
23:48 🔗 RichardG has joined #archiveteam
23:58 🔗 ZeoNet has quit IRC (Ping timeout: 244 seconds)
