#archiveteam 2018-08-13,Mon


Time Nickname Message
00:07 🔗 BlueMax has joined #archiveteam
00:23 🔗 Soni has joined #archiveteam
00:28 🔗 Stilett0 has quit IRC (Ping timeout: 252 seconds)
00:30 🔗 Stilett0 has joined #archiveteam
00:38 🔗 Stiletto has joined #archiveteam
00:43 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
01:41 🔗 Stilett0 has joined #archiveteam
01:43 🔗 Stiletto has quit IRC (Read error: Operation timed out)
01:55 🔗 pizzaiolo has quit IRC (Remote host closed the connection)
02:18 🔗 Soni has quit IRC (Ping timeout: 264 seconds)
02:33 🔗 Stilett0 has quit IRC (Ping timeout: 268 seconds)
02:36 🔗 Stilett0 has joined #archiveteam
03:33 🔗 Stiletto has joined #archiveteam
03:35 🔗 Stilett0 has quit IRC (Ping timeout: 268 seconds)
03:38 🔗 Stilett0 has joined #archiveteam
03:38 🔗 Stiletto has quit IRC (Ping timeout: 261 seconds)
03:48 🔗 Stilett0 has quit IRC (Ping timeout: 360 seconds)
03:50 🔗 Stilett0 has joined #archiveteam
03:53 🔗 archodg_ has joined #archiveteam
03:55 🔗 archodg__ has quit IRC (Ping timeout: 252 seconds)
03:55 🔗 odemg has quit IRC (Ping timeout: 260 seconds)
04:07 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
04:08 🔗 odemg has joined #archiveteam
04:41 🔗 archodg__ has joined #archiveteam
04:42 🔗 archodg_ has quit IRC (Read error: Connection reset by peer)
06:16 🔗 sam-p has joined #archiveteam
06:20 🔗 wp494 has quit IRC (Read error: Connection reset by peer)
06:37 🔗 m007a83_ has quit IRC (Read error: Operation timed out)
06:37 🔗 wp494 has joined #archiveteam
06:44 🔗 m007a83 has joined #archiveteam
07:02 🔗 Stilett0 has joined #archiveteam
07:25 🔗 sam-p reposted from earlier in #TalkBork, since I don't know who is *really* here or not...
07:26 🔗 sam-p *sighs* either way, it looks like things are on borrowed time, since the "deadline" is 10 august but it looks like the webspace system is still online for now. here are the additional domain/paths
07:26 🔗 sam-p users.tinyworld.co.uk/USERNAME users.tinyonline.co.uk/USERNAME www.USERNAME.screaming.net www.USERNAME.homecall.co.uk www.USERNAME.ukgateway.net www.USERNAME.worldonline.co.uk www.USERNAME.nildram.co.uk
07:27 🔗 sam-p I would contribute myself but I lack scripting skills, and the requirement to add "?showpage=true" to each URL prevents me from simply using winhttrack (since all requests get redirected to talktalk's interstitial page first)
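
The URL handling sam-p describes could be sketched roughly as follows. This is a minimal illustration, not a tested crawler: the patterns are copied from the message above, while the `http` scheme, the function name `candidate_urls`, the username `example`, and the exact placement of `?showpage=true` (directly after the path or the root `/`) are assumptions.

```python
# Patterns copied from the 07:26 message; USERNAME is the placeholder to fill in.
PATTERNS = [
    "users.tinyworld.co.uk/USERNAME",
    "users.tinyonline.co.uk/USERNAME",
    "www.USERNAME.screaming.net",
    "www.USERNAME.homecall.co.uk",
    "www.USERNAME.ukgateway.net",
    "www.USERNAME.worldonline.co.uk",
    "www.USERNAME.nildram.co.uk",
]


def candidate_urls(username, scheme="http"):
    """Build one candidate URL per pattern with ?showpage=true appended.

    The scheme and the position of the query string are assumptions;
    the log only says the parameter must be added to each URL.
    """
    urls = []
    for pattern in PATTERNS:
        host_and_path = pattern.replace("USERNAME", username)
        # Subdomain-style patterns have no path, so give them a root path.
        if "/" not in host_and_path:
            host_and_path += "/"
        urls.append(f"{scheme}://{host_and_path}?showpage=true")
    return urls


if __name__ == "__main__":
    # "example" is a hypothetical username used only for illustration.
    for url in candidate_urls("example"):
        print(url)
```

A list generated this way could be fed to whatever downloader is at hand; whether links inside the fetched pages also need the parameter is not something the log settles.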
08:19 🔗 ta9le has joined #archiveteam
08:20 🔗 Zialus has quit IRC (Read error: Operation timed out)
09:28 🔗 redlob_ has quit IRC (Read error: Operation timed out)
09:31 🔗 redlob has joined #archiveteam
10:04 🔗 icedice has joined #archiveteam
10:26 🔗 ta9le has quit IRC (Quit: Connection closed for inactivity)
10:27 🔗 fredgido has quit IRC (Quit: Connection closed for inactivity)
10:31 🔗 Hiccup has joined #archiveteam
10:32 🔗 Hiccup is there a video/tubeup bot?
10:41 🔗 m007a83_ has joined #archiveteam
10:43 🔗 m007a83 has quit IRC (Read error: Operation timed out)
10:46 🔗 JAA Hiccup: There's #videobot, but it's currently not functional and limited to few services (Facebook, Periscope, Liveleak, and Twitter; some might be broken as well).
10:48 🔗 Hiccup okay, I'll install tubeup instead
11:02 🔗 Jusque has quit IRC (Ping timeout: 268 seconds)
11:03 🔗 Jusque has joined #archiveteam
11:06 🔗 pizzaiolo has joined #archiveteam
11:09 🔗 Hiccup has quit IRC (Quit: Page closed)
11:16 🔗 m007a83 has joined #archiveteam
11:19 🔗 m007a83_ has quit IRC (Read error: Operation timed out)
11:23 🔗 Darkstar has quit IRC (Ping timeout: 260 seconds)
11:47 🔗 Darkstar has joined #archiveteam
12:00 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
13:06 🔗 bitBaron has joined #archiveteam
13:36 🔗 SketchCow ---------------------------------------------
13:36 🔗 SketchCow REMINDER THAT FOS GOES OFFLINE FOR A LOT OF TUESDAY
13:36 🔗 SketchCow (Scheduled Maintenance on the Machine/VM, etc.)
13:37 🔗 SketchCow Going to try and get it as clean as possible beforehand so when it's back, it'll take all the backed up stuff.
13:37 🔗 SketchCow ---------------------------------------------
13:39 🔗 icedice What's FOS?
13:40 🔗 jut archiveteam.org/index.php?title=Fortress_of_Solitude
13:52 🔗 icedice Ah
13:52 🔗 icedice Thanks jut
13:55 🔗 davidar has quit IRC (Quit: Connection closed for inactivity)
13:57 🔗 SketchCow A lot of things/processes stop super dead while this machine is down and we push a lot of data through it so I'm prioritizing making sure nothing is sitting around.
13:59 🔗 SketchCow 600gb of archivebot being uploaded
14:03 🔗 Soni has joined #archiveteam
14:04 🔗 JAA ^ ArchiveBot pipeline operators be aware of that and throttle jobs (i.e. increase delay) if your pipelines are at risk of running out of disk space.
14:16 🔗 Odd0002 has quit IRC (Read error: Operation timed out)
14:20 🔗 Odd0002 has joined #archiveteam
14:59 🔗 pizzaiolo has quit IRC (Quit: pizzaiolo)
15:13 🔗 tyzoid Has anyone considered trying to archive liveatc.net?
15:13 🔗 tyzoid They record air traffic control frequencies via third-party radios, and keep them for 30 days
15:14 🔗 tyzoid They have directoryindexes turned off, so I can't just recursively wget it, or throw the pipeline at it
15:16 🔗 tyzoid If we're interested, they maintain a contact page for inquiries, might be a good idea to see if they're willing to either upload directly, or provide us a way to grab it.
15:16 🔗 tyzoid otherwise, I could rig up a script to enumerate their files.
15:30 🔗 martini has joined #archiveteam
15:42 🔗 wp494 has quit IRC (Read error: Operation timed out)
15:42 🔗 wp494 has joined #archiveteam
15:52 🔗 turbo_ has joined #archiveteam
15:53 🔗 turbo_ has quit IRC (Client Quit)
15:53 🔗 m007a83_ has joined #archiveteam
15:54 🔗 m007a83 has quit IRC (Ping timeout: 252 seconds)
16:03 🔗 SketchCow Sounds like a great way to fill hard drives
16:03 🔗 tyzoid can't tell if sarcasm
16:04 🔗 SketchCow changes topic to: Archive Team: We're not archive.org | https://archiveteam.org/ | Long discussions: #archiveteam-bs | Offtopic: #archiveteam-ot | Can't Tell if Sarcasm
16:07 🔗 pfallenop has quit IRC (Quit: leaving)
16:09 🔗 pfallenop has joined #archiveteam
16:19 🔗 JAA There are almost 2300 feeds on that site, and a half-hour recording appears to be a few MiB. So that's roughly 10 GiB per half hour or 480 GiB per day. Yep, decent way to fill hard drives.
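
JAA's back-of-the-envelope estimate can be reproduced with a few lines. The feed count (almost 2300) and the half-hour recording length come from the message above; the 4.5 MiB per recording is an assumed stand-in for "a few MiB", and the function name is hypothetical.

```python
FEEDS = 2300              # "almost 2300 feeds", per the message above
MIB_PER_RECORDING = 4.5   # assumption: one reading of "a few MiB"
HALF_HOURS_PER_DAY = 48   # 24 hours of half-hour recordings


def storage_estimate(feeds=FEEDS, mib_per_recording=MIB_PER_RECORDING):
    """Return (GiB per half hour, GiB per day) for the given assumptions."""
    gib_per_half_hour = feeds * mib_per_recording / 1024  # 1 GiB = 1024 MiB
    gib_per_day = gib_per_half_hour * HALF_HOURS_PER_DAY
    return gib_per_half_hour, gib_per_day


if __name__ == "__main__":
    per_half_hour, per_day = storage_estimate()
    print(f"{per_half_hour:.1f} GiB per half hour, {per_day:.0f} GiB per day")
```

With these inputs the result lands close to the rounded figures in the message (roughly 10 GiB per half hour, on the order of 480 GiB per day); a different per-recording size scales both numbers linearly.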
16:27 🔗 tyzoid We could limit to just the class bravo airports
16:27 🔗 tyzoid imo, that still has value
16:38 🔗 SketchCow Let's put it this way
16:38 🔗 SketchCow It has value
16:38 🔗 SketchCow Buy hard drives and do it
16:42 🔗 MMovie has quit IRC (Read error: Operation timed out)
16:44 🔗 MMovie has joined #archiveteam
16:49 🔗 Nemo_bis "Third-party use of LiveATC live audio streams is prohibited"
16:49 🔗 Nemo_bis Whatever that means
16:51 🔗 m007a83 has joined #archiveteam
16:52 🔗 m007a83_ has quit IRC (Ping timeout: 252 seconds)
17:01 🔗 eientei95 Nemo_bis: It means "Please archive us"
17:18 🔗 bitBaron has quit IRC (Quit: Bye!)
17:41 🔗 m007a83_ has joined #archiveteam
17:44 🔗 m007a83 has quit IRC (Ping timeout: 252 seconds)
18:00 🔗 betamax has quit IRC (Read error: Operation timed out)
18:17 🔗 fredgido has joined #archiveteam
18:24 🔗 bRick5772 has joined #archiveteam
18:34 🔗 archodg_ has joined #archiveteam
18:36 🔗 archodg__ has quit IRC (Ping timeout: 252 seconds)
18:36 🔗 odemg has quit IRC (Ping timeout: 260 seconds)
18:43 🔗 m007a83 has joined #archiveteam
18:44 🔗 m007a83_ has quit IRC (Ping timeout: 252 seconds)
18:50 🔗 odemg has joined #archiveteam
19:05 🔗 schbirid has quit IRC (Remote host closed the connection)
19:19 🔗 m007a83_ has joined #archiveteam
19:20 🔗 macsek has joined #archiveteam
19:22 🔗 macsek hello everyone! pretty new to the world of archiving but this site http://pcvilag.muskatli.hu/ caught my attention because it hosts insane amounts of good quality scanned materials regarding the computer history of hungary
19:23 🔗 m007a83 has quit IRC (Read error: Operation timed out)
19:23 🔗 macsek my question is what can be done to preserve it for the future?
19:27 🔗 bsmith093 has quit IRC (Remote host closed the connection)
19:34 🔗 arkiver macsek: being archive now, thanks!
19:34 🔗 arkiver archived*
19:39 🔗 tyzoid macsek: In general, we have three tools at our disposal: Manual archiving (i.e. downloading it ourselves, then uploading that to IA), ArchiveBot (automated crawler for smallish sites), and the ArchiveTeam Warrior (For larger sites)
19:39 🔗 tyzoid Generally, we want to grab it in WARC format, so that the wayback machine can index it
19:44 🔗 macsek WOW thank you guys awesome! one more thing, where can one view data archived by archiveteam?
19:45 🔗 tyzoid There's a few places. The archivebot dashboard shows current bot projects: http://dashboard.at.ninjawedding.org/3?showNicks=1
19:46 🔗 tyzoid That data ends up here: https://archive.org/details/archivebot
19:46 🔗 tyzoid Other data streams can end up in various places
19:47 🔗 macsek hmmm i will have fiber in my middle of nowhere town in a few months, is there a shortage of "warriors"?
19:48 🔗 tyzoid macsek: If you're interested in sticking around, check out #archivebot, #getgit (looking at archiving parts of github), #wikiteam, and #radio-archive
19:51 🔗 SketchCow Accuweather is closing Forums at the end of year
19:51 🔗 SketchCow So, I am a member of this forum. The folks there were looking for a solution to be able to back up the content contained therein.
19:51 🔗 SketchCow Is that something you guys do, or do you guys have howtos or links to the right tools for this task?
19:51 🔗 SketchCow Thanks!
19:52 🔗 bRick5772 has quit IRC (Quit: Leaving.)
19:52 🔗 tyzoid Quick estimate, there's about ~15k threads with ~2M posts
19:53 🔗 Kaz cool, it's all public
19:53 🔗 tyzoid "Our members have made a total of 2,288,196 posts / We have 14,193 registered members"
19:53 🔗 tyzoid Kaz: attachments require sign-in to view
19:54 🔗 Kaz ah, that's a shame.
19:54 🔗 Kaz -bs for channel name shenanigans?
19:54 🔗 tyzoid sure
20:11 🔗 macsek has quit IRC (Quit: Page closed)
20:22 🔗 nyaomi has quit IRC (Quit: meow)
20:26 🔗 fredgido has quit IRC (Quit: Connection closed for inactivity)
20:35 🔗 Zialus has joined #archiveteam
20:50 🔗 Zialus has quit IRC (i'm out!)
20:56 🔗 Zialus has joined #archiveteam
20:56 🔗 bsmith093 has joined #archiveteam
20:59 🔗 nyaomi has joined #archiveteam
21:00 🔗 bsmith093 has quit IRC (Client Quit)
21:03 🔗 bsmith093 has joined #archiveteam
21:06 🔗 bRick5772 has joined #archiveteam
21:16 🔗 m007a83_ has quit IRC (Ping timeout: 252 seconds)
21:30 🔗 mal_ has quit IRC (mal_)
21:34 🔗 bRick5772 has quit IRC (Quit: Leaving.)
21:38 🔗 mal has joined #archiveteam
21:43 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
22:01 🔗 lindalap has quit IRC (Quit: lindalap)
22:12 🔗 mib_dfu24 has joined #archiveteam
22:12 🔗 mib_dfu24 has quit IRC (Client Quit)
22:37 🔗 martini has quit IRC (Ping timeout: 360 seconds)
23:16 🔗 m007a83 has joined #archiveteam
23:32 🔗 tyzoid has quit IRC (Read error: Operation timed out)
23:54 🔗 adinbied has quit IRC (Left Channel.)
23:55 🔗 ivan has quit IRC (Read error: Operation timed out)
23:55 🔗 adinbied has joined #archiveteam
23:55 🔗 JAA has quit IRC (Read error: Operation timed out)
23:55 🔗 ivan has joined #archiveteam
23:56 🔗 zyphlar has quit IRC (Read error: Operation timed out)
23:56 🔗 jspiros has quit IRC (Read error: Operation timed out)
23:56 🔗 wabu has quit IRC (Read error: Operation timed out)
23:57 🔗 Petri152 has quit IRC (Read error: Operation timed out)
23:57 🔗 Jusque has quit IRC (Read error: Operation timed out)
23:58 🔗 Jusque has joined #archiveteam
