#archiveteam-bs 2017-10-07,Sat

Time Nickname Message
00:07 🔗 godane i'm uploading The Place For No Story as 'The Place For No Story 1973 Timecode'
00:08 🔗 godane cause there is timecode burned into the video
00:14 🔗 Asparagir JAA; Of course we're interested, why would you even ask. :-)
00:16 🔗 Asparagir has quit IRC (Asparagir)
00:32 🔗 vitzli has joined #archiveteam-bs
01:06 🔗 Baljem has quit IRC (Read error: Operation timed out)
01:06 🔗 godane i'm capturing the 36th tape from Laughing Squid
01:08 🔗 godane SketchCow: at this rate i may have all tapes digitized in about a week
01:21 🔗 Odd0002_ has joined #archiveteam-bs
01:21 🔗 Odd0002 has quit IRC (Ping timeout: 600 seconds)
01:21 🔗 Odd0002_ is now known as Odd0002
01:26 🔗 dashcloud @Stiletto Amazingly, I was able to find the article again: http://www.vintagecomputing.com/index.php/archives/1063/bringing-prodigy-back-from-the-dead
01:30 🔗 Stiletto thanks so much :D
01:31 🔗 dashcloud I can't wait to see all of the cool stuff you've found
01:34 🔗 username1 has joined #archiveteam-bs
01:37 🔗 schbirid2 has quit IRC (Read error: Operation timed out)
02:22 🔗 pizzaiolo has quit IRC (Quit: pizzaiolo)
04:04 🔗 vitzli has quit IRC (Quit: Leaving)
04:23 🔗 ranma http://techreport.com/news/32659/pour-one-out-for-aol-instant-messenger
04:24 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
04:31 🔗 Sk1d has joined #archiveteam-bs
04:53 🔗 superkuh Yeah. I tried logging on to AIM just now for kicks. SSL error on login.
04:53 🔗 superkuh Can't seem to get online.
04:53 🔗 superkuh ICQ still works fine though. We'll always have ICQ.
05:03 🔗 ranma hopefully
05:25 🔗 fie zino, godane : I have a torrent site for "home-recordings" and " odd stuff like travel tapes"
05:27 🔗 fie superkuh, yeah you need an up-to-date client
05:28 🔗 superkuh Makes sense. I'm using Pidgin 2.6.6 which is pretty old.
05:29 🔗 fie I was afraid they were only going to allow official aim client but new pidgin works
05:30 🔗 fie gf just said they are shutting down now? wtf
05:31 🔗 fie damn you facebook
05:32 🔗 fie Why can't mozilla take it over or something
05:34 🔗 fie someone named Mental Elf messaged me...
05:34 🔗 fie nobody on my buddy list is ever signed on
05:47 🔗 godane fie: is it societyglitch?
05:47 🔗 godane i have an account there
06:34 🔗 Stiletto has quit IRC ()
06:47 🔗 fie godane, yes
07:18 🔗 fie Just don't know where I would source home movies and odd stuff... probably not ebay.
07:34 🔗 Stilett0- has joined #archiveteam-bs
08:23 🔗 TheLovina has quit IRC (Ping timeout: 370 seconds)
09:42 🔗 brayden has quit IRC (Ping timeout: 255 seconds)
09:43 🔗 brayden has joined #archiveteam-bs
09:43 🔗 swebb sets mode: +o brayden
11:14 🔗 JAA Alright, so about wordpress.com: they have a link shortener, wp.me. The shortcode can have various different formats for linking to specific pages of a blog (e.g. directly to a post or an image attached to a post etc.). The format of main interest in this context, however, is simply the blog ID encoded in base62 ([0-9a-zA-Z]).
11:15 🔗 JAA This shortening is provided by Jetpack, a Wordpress plugin installed and activated by default on all wordpress.com blogs (including free ones).
11:18 🔗 JAA It seems that the maximum blog ID is currently somewhere just below 9g000, i.e. on the order of 135M shortcodes need to be scanned (9 * 62^4 + 16 * 62^3).
11:19 🔗 JAA That's also the order of magnitude of how many blogs there are.
11:20 🔗 JAA We could do this through URLTeam and then figure out what to do with it later.
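The arithmetic in JAA's estimate can be checked with a small base62 converter. This is only a sketch: it assumes the digit order is exactly the [0-9a-zA-Z] order given above, and the function names are made up for illustration, not taken from Jetpack or URLTeam code.

```python
import string

# Digit order [0-9a-zA-Z] as described in the log; treated here as an assumption.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_decode(code: str) -> int:
    """Convert a shortcode such as '9g000' to the integer it encodes."""
    value = 0
    for ch in code:
        # ALPHABET.index raises ValueError for characters outside the alphabet.
        value = value * 62 + ALPHABET.index(ch)
    return value


def base62_encode(number: int) -> str:
    """Convert a non-negative integer to its base62 shortcode."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


upper_bound = base62_decode("9g000")
print(upper_bound)  # 136800272, i.e. 9 * 62**4 + 16 * 62**3
```

Counting upward from 0 and passing each integer through base62_encode would enumerate candidate shortcodes up to that bound.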
11:47 🔗 dd0a13f37 has joined #archiveteam-bs
11:47 🔗 dd0a13f37 Where do I report security issues for archive.org?
11:59 🔗 username1 info@archive.org
11:59 🔗 dashcloud has quit IRC (Read error: Connection reset by peer)
12:00 🔗 username1 they can either forward it for you or give you direct contact
12:00 🔗 dd0a13f37 alright, thanks
12:00 🔗 dashcloud has joined #archiveteam-bs
12:01 🔗 username1 thank YOU
12:07 🔗 dd0a13f37 Not sure it's anything major, but better safe than sorry I guess
12:09 🔗 godane SketchCow: you're getting a showtime airing of Road To Wellville cause that's in the case of tapes
12:10 🔗 godane plus side is it got the most out of having 10000k setting being at 6.4gb
12:11 🔗 godane based on the Outer Limits preview, it aired the week of 1996-04-05
12:12 🔗 godane it was a preview for the episode called "The Refuge" with actor M. Emmet Walsh
12:21 🔗 icedice has joined #archiveteam-bs
12:34 🔗 dd0a13f37 JAA: It's very close, not just the same order of magnitude. Converting the latest shortlink to decimal and using it together with https://wordpress.com/activity/ to get an estimate
12:34 🔗 dd0a13f37 for posts/blog gives a result close to https://en.blog.wordpress.com/2015/01/06/2014-in-review/
12:35 🔗 dd0a13f37 Or the other way around, estimate number of blogs from 2014 posts/blog stats and stats, convert to b64, note that it's close
12:35 🔗 dd0a13f37 b62*
12:35 🔗 JAA Yeah, I ran a test with the two-character codes and almost all of them existed.
12:36 🔗 dd0a13f37 According to that, there should be (base62) 08 57 30 52 59 blogs (131913987)
12:36 🔗 dd0a13f37 Which is close to 9g000
12:37 🔗 JAA Yup
12:44 🔗 dd0a13f37 Although it's not exact - if you manipulate the POST request from the stats page you can get a chart for the number of blogs which gives 125452778 (08 30 23 61 56) as total
12:45 🔗 dd0a13f37 Or maybe they subtract deleted blogs, in which case it makes perfect sense
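The two base62 figures quoted by dd0a13f37 are written as lists of digit values (0 to 61) rather than as shortcode characters. A minimal sketch of the conversion back to decimal, with a hypothetical helper name, looks like this:

```python
def digits_to_int(digits, base=62):
    """Fold a most-significant-first list of digit values into an integer."""
    value = 0
    for digit in digits:
        if not 0 <= digit < base:
            raise ValueError(f"digit {digit} out of range for base {base}")
        value = value * base + digit
    return value


# Digit groups exactly as quoted in the log.
estimate_from_posts = digits_to_int([8, 57, 30, 52, 59])
total_from_chart = digits_to_int([8, 30, 23, 61, 56])

print(estimate_from_posts)                     # 131913987
print(total_from_chart)                        # 125452778
print(estimate_from_posts - total_from_chart)  # 6461209
```

The difference of roughly 6.5M would be consistent with the guess that deleted blogs are subtracted from the chart total, though the log does not confirm that.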
12:50 🔗 dd0a13f37 4 billion posts, that's actually not a whole lot
12:56 🔗 K4k has joined #archiveteam-bs
12:57 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
12:59 🔗 Mateon1 has quit IRC (Ping timeout: 255 seconds)
13:00 🔗 Mateon1 has joined #archiveteam-bs
13:51 🔗 dd0a13f37 I don't think any of the libgen collections on IA are complete unless the logs have been tampered with. Should I upload it again?
15:02 🔗 dashcloud has quit IRC (Read error: Operation timed out)
15:15 🔗 icedice2 has joined #archiveteam-bs
15:17 🔗 icedice has quit IRC (Ping timeout: 260 seconds)
15:17 🔗 icedice2 has quit IRC (Client Quit)
15:17 🔗 icedice has joined #archiveteam-bs
15:28 🔗 JAA Just to confirm: is gawker.com archived, and where can the archives be found? I saw several mentions of it in the logs, but I can't find it on IA. (Via: https://www.reddit.com/r/Archiveteam/comments/73xszd/has_gawker_been_fully_archived/ )
15:31 🔗 dashcloud has joined #archiveteam-bs
15:37 🔗 dd0a13f37 nvm, i found the real collection, up to r_2092000 is archived
15:51 🔗 username1 is now known as schbirid
15:52 🔗 schbirid anyone know how to strip all formatting from a $msg in irssi perl scripting?
15:57 🔗 Rai-chan has joined #archiveteam-bs
16:08 🔗 RichardG_ has joined #archiveteam-bs
16:08 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
16:18 🔗 dd0a13f37 Sci-mag is archived up to 64099999, foreignfiction up to 1600000
16:19 🔗 dd0a13f37 Foreignfiction goes up to 1890000, sci-mag torrents are down so not sure exactly how far they go
16:20 🔗 schbirid are they surely fully archived? or just 50% stalled torrents?
16:23 🔗 dd0a13f37 The torrents are seeded, so I think they're archived. The ones that I checked were at least
16:25 🔗 schbirid i meant to grab all of scimag but about iirc 25% of the ones i tried were not fully seeded :(
16:34 🔗 dd0a13f37 Ask on forums for reseed then
16:42 🔗 loadup has joined #archiveteam-bs
16:54 🔗 icedice2 has joined #archiveteam-bs
16:56 🔗 icedice has quit IRC (Ping timeout: 250 seconds)
16:59 🔗 kepler45 has quit IRC (Quit: Leaving)
17:20 🔗 pizzaiolo has joined #archiveteam-bs
17:27 🔗 Asparagir has joined #archiveteam-bs
17:28 🔗 svchfoo1 sets mode: +o Asparagir
17:28 🔗 icedice2 has quit IRC (Quit: Leaving)
17:29 🔗 icedice has joined #archiveteam-bs
17:33 🔗 TC01 has quit IRC (Remote host closed the connection)
17:38 🔗 icedice2 has joined #archiveteam-bs
17:44 🔗 icedice has quit IRC (Read error: Operation timed out)
18:00 🔗 Asparagir has quit IRC (Asparagir)
18:31 🔗 icedice2 has quit IRC (Ping timeout: 255 seconds)
18:31 🔗 JAA For the record, we're now grabbing wp.me in URLTeam. :-)
18:33 🔗 icedice has joined #archiveteam-bs
18:46 🔗 dd0a13f37 So will you archive the whole of WP?
18:46 🔗 Somebody2 dd0a13f37: just the URLs, not their contents (at this point, at least)
18:47 🔗 dd0a13f37 In URLteam, yes, but for the WP project
18:47 🔗 dd0a13f37 Are they having any problems?
18:47 🔗 JAA Not that I know of.
18:47 🔗 Somebody2 Not that I know of, but it's good to have a backup
18:47 🔗 JAA But I figured, why the hell not?
18:47 🔗 Somebody2 heh, jinx
18:48 🔗 dd0a13f37 4 bil posts, 1/4 have images, 1bil images, 1m each, 1pb
18:48 🔗 dd0a13f37 Large endeavour
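The back-of-the-envelope size estimate above can be written out explicitly. All inputs are the rough figures from the log (4 billion posts, one in four with an image, about 1 MB per image); decimal units are assumed, so 1 PB is taken as 10**15 bytes.

```python
# Rough inputs from the log; none of these are measured values.
POSTS = 4_000_000_000          # "4 bil posts"
POSTS_PER_IMAGE = 4            # "1/4 have images", one image per such post assumed
BYTES_PER_IMAGE = 1_000_000    # "1m each", read as 1 MB (decimal)

images = POSTS // POSTS_PER_IMAGE
total_bytes = images * BYTES_PER_IMAGE
petabytes = total_bytes / 10**15

print(images)     # 1000000000
print(petabytes)  # 1.0
```

This counts images only; HTML, stylesheets and other page resources would come on top of it.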
18:53 🔗 dd0a13f37 has quit IRC (Ping timeout: 268 seconds)
18:58 🔗 icedice2 has joined #archiveteam-bs
18:59 🔗 dd0a13f37 has joined #archiveteam-bs
19:03 🔗 icedice has quit IRC (Ping timeout: 506 seconds)
19:14 🔗 icedice has joined #archiveteam-bs
19:16 🔗 dd0a13f37 has quit IRC (Ping timeout: 268 seconds)
19:16 🔗 dd0a has joined #archiveteam-bs
19:16 🔗 dd0a is now known as dd0a13f37
19:21 🔗 icedice2 has quit IRC (Ping timeout: 506 seconds)
19:27 🔗 icedice2 has joined #archiveteam-bs
19:29 🔗 icedice has quit IRC (Ping timeout: 245 seconds)
19:30 🔗 icedice has joined #archiveteam-bs
19:32 🔗 dd0a13f37 has quit IRC (Ping timeout: 268 seconds)
19:36 🔗 ajshell1 has quit IRC (Quit: Leaving)
19:36 🔗 icedice2 has quit IRC (Read error: Operation timed out)
19:38 🔗 atrocity has joined #archiveteam-bs
19:39 🔗 Atros has quit IRC (Ping timeout: 246 seconds)
19:44 🔗 icedice2 has joined #archiveteam-bs
19:50 🔗 icedice has quit IRC (Read error: Operation timed out)
19:56 🔗 dd0a13f37 has joined #archiveteam-bs
19:57 🔗 dd0a13f37 It sure is some improvement over proxy+webirc
20:04 🔗 ajshell1 has joined #archiveteam-bs
20:15 🔗 atrocity has quit IRC (Read error: Connection reset by peer)
20:16 🔗 atrocity has joined #archiveteam-bs
20:35 🔗 ajshell1 has quit IRC (Quit: Leaving)
20:45 🔗 godane so i have this on tape from the box: https://en.wikipedia.org/wiki/Heat_and_Sunlight
20:45 🔗 godane digitizing it now
20:48 🔗 ajshell1 has joined #archiveteam-bs
20:48 🔗 ajshell1 has quit IRC (Client Quit)
20:56 🔗 ajshell1 has joined #archiveteam-bs
20:57 🔗 Stilett0- has quit IRC (Ping timeout: 260 seconds)
21:02 🔗 dashcloud has quit IRC (Remote host closed the connection)
21:03 🔗 dashcloud has joined #archiveteam-bs
21:13 🔗 TC01 has joined #archiveteam-bs
21:17 🔗 ajshell1 has quit IRC (Quit: Leaving)
21:32 🔗 ajshell1 has joined #archiveteam-bs
21:45 🔗 ajshell1 has quit IRC (Quit: Leaving)
21:51 🔗 kepler45 has joined #archiveteam-bs
22:13 🔗 ajshell1 has joined #archiveteam-bs
22:19 🔗 ajshell1 has quit IRC (Quit: Leaving)
22:25 🔗 kepler45 has quit IRC (Quit: Leaving)
22:25 🔗 odemg has quit IRC (Read error: Operation timed out)
22:32 🔗 odemg has joined #archiveteam-bs
22:38 🔗 dd0a13f37 A stripped-down version of archivebot for !ao, now that would be something. You could make it run much much faster if you can disregard certain constraints
22:39 🔗 JAA You can't really ignore that much though. You still need to process images, stylesheets, scripts, etc.
22:41 🔗 JAA An internet where everyone conforms to standards so we don't have to use parsers which are slowed down by all kinds of odd special cases, now that would be something.
22:41 🔗 dd0a13f37 Not always. And you could use another parser, like myhtml
22:41 🔗 dd0a13f37 Myhtml is fast, but there are no python bindings
22:43 🔗 Somebody2 !ao jobs don't seem to be much of a bottleneck
22:43 🔗 JAA Indeed, we rarely have a queue of !ao jobs.
22:44 🔗 JAA And that's with only one !ao-only pipeline...
22:45 🔗 dd0a13f37 JAA: I don't think they parse inline scripts
22:45 🔗 JAA dd0a13f37: wpull does not actually parse scripts, but it does process it and tries to extract links from it.
22:45 🔗 JAA s/it/them/
22:46 🔗 JAA Same with CSS, I believe.
22:46 🔗 JAA Only HTML is parsed properly.
23:00 🔗 zino has quit IRC (Read error: Connection reset by peer)
23:01 🔗 zino has joined #archiveteam-bs
23:08 🔗 ajshell1 has joined #archiveteam-bs
23:12 🔗 ajshell1 has quit IRC (Client Quit)
23:18 🔗 icedice has joined #archiveteam-bs
23:20 🔗 icedice2 has quit IRC (Ping timeout: 260 seconds)
23:27 🔗 ajshell1 has joined #archiveteam-bs
23:41 🔗 icedice2 has joined #archiveteam-bs
23:43 🔗 icedice has quit IRC (Ping timeout: 260 seconds)
23:45 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:48 🔗 dashcloud has joined #archiveteam-bs
23:58 🔗 icedice has joined #archiveteam-bs
