Time |
Nickname |
Message |
00:07
🔗
|
godane |
i'm uploading The Place For No Story as 'The Place For No Story 1973 Timecode' |
00:08
🔗
|
godane |
cause there is timecode burned into the video |
00:14
🔗
|
Asparagir |
JAA; Of course we're interested, why would you even ask. :-) |
00:16
🔗
|
|
Asparagir has quit IRC (Asparagir) |
00:32
🔗
|
|
vitzli has joined #archiveteam-bs |
01:06
🔗
|
|
Baljem has quit IRC (Read error: Operation timed out) |
01:06
🔗
|
godane |
i'm capturing the 36th tape from Laughing Squid |
01:08
🔗
|
godane |
SketchCow: at this rate i may have all tapes digitized in about a week |
01:21
🔗
|
|
Odd0002_ has joined #archiveteam-bs |
01:21
🔗
|
|
Odd0002 has quit IRC (Ping timeout: 600 seconds) |
01:21
🔗
|
|
Odd0002_ is now known as Odd0002 |
01:26
🔗
|
dashcloud |
@Stiletto Amazingly, I was able to find the article again: http://www.vintagecomputing.com/index.php/archives/1063/bringing-prodigy-back-from-the-dead |
01:30
🔗
|
Stiletto |
thanks so much :D |
01:31
🔗
|
dashcloud |
I can't wait to see all of the cool stuff you've found |
01:34
🔗
|
|
username1 has joined #archiveteam-bs |
01:37
🔗
|
|
schbirid2 has quit IRC (Read error: Operation timed out) |
02:22
🔗
|
|
pizzaiolo has quit IRC (Quit: pizzaiolo) |
04:04
🔗
|
|
vitzli has quit IRC (Quit: Leaving) |
04:23
🔗
|
ranma |
http://techreport.com/news/32659/pour-one-out-for-aol-instant-messenger |
04:24
🔗
|
|
Sk1d has quit IRC (Ping timeout: 250 seconds) |
04:31
🔗
|
|
Sk1d has joined #archiveteam-bs |
04:53
🔗
|
superkuh |
Yeah. I tried logging on to AIM just now for kicks. SSL error on login. |
04:53
🔗
|
superkuh |
Can't seem to get online. |
04:53
🔗
|
superkuh |
ICQ still works fine though. We'll always have ICQ. |
05:03
🔗
|
ranma |
hopefully |
05:25
🔗
|
fie |
zino, godane : I have a torrent site for "home-recordings" and " odd stuff like travel tapes" |
05:27
🔗
|
fie |
superkuh, yeah you need an up-to-date client |
05:28
🔗
|
superkuh |
Makes sense. I'm using Pidgin 2.6.6 which is pretty old. |
05:29
🔗
|
fie |
I was afraid they were only going to allow official aim client but new pidgin works |
05:30
🔗
|
fie |
gf just said they are shutting down now? wtf |
05:31
🔗
|
fie |
damn you facebook |
05:32
🔗
|
fie |
Why can't mozilla take it over or something |
05:34
🔗
|
fie |
someone named Mental Elf messaged me... |
05:34
🔗
|
fie |
nobody on my buddy list is ever signed on |
05:47
🔗
|
godane |
fie: is it societyglitch? |
05:47
🔗
|
godane |
i have an account there |
06:34
🔗
|
|
Stiletto has quit IRC () |
06:47
🔗
|
fie |
godane, yes |
07:18
🔗
|
fie |
Just don't know where I would source home movies and odd stuff... probably not ebay. |
07:34
🔗
|
|
Stilett0- has joined #archiveteam-bs |
08:23
🔗
|
|
TheLovina has quit IRC (Ping timeout: 370 seconds) |
09:42
🔗
|
|
brayden has quit IRC (Ping timeout: 255 seconds) |
09:43
🔗
|
|
brayden has joined #archiveteam-bs |
09:43
🔗
|
|
swebb sets mode: +o brayden |
11:14
🔗
|
JAA |
Alright, so about wordpress.com: they have a link shortener, wp.me. The shortcode can have various different formats for linking to specific pages of a blog (e.g. directly to a post or an image attached to a post etc.). The format of main interest in this context, however, is simply the blog ID encoded in base62 ([0-9a-zA-Z]). |
11:15
🔗
|
JAA |
This shortening is provided by Jetpack, a Wordpress plugin installed and activated by default on all wordpress.com blogs (including free ones). |
11:18
🔗
|
JAA |
It seems that the maximum blog ID is currently somewhere just below 9g000, i.e. on the order of 135M shortcodes need to be scanned (9 * 62^4 + 16 * 62^3). |
11:19
🔗
|
JAA |
That's also the order of magnitude of how many blogs there are. |
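The arithmetic above can be checked with a minimal sketch. It assumes the base62 digit order is 0-9, then a-z, then A-Z (the order in which the character class is written); `ALPHABET` and `decode_base62` are illustrative names, not part of any wp.me or Jetpack API.

```python
import string

# Assumed digit order: 0-9, a-z, A-Z (read off the "[0-9a-zA-Z]" class above).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def decode_base62(code: str) -> int:
    """Turn a base62 shortcode into the integer it encodes."""
    value = 0
    for char in code:
        # ALPHABET.index raises ValueError for characters outside the alphabet.
        value = value * 62 + ALPHABET.index(char)
    return value


# Upper bound quoted above: every shortcode below "9g000".
upper_bound = decode_base62("9g000")
print(upper_bound)  # same value as 9 * 62**4 + 16 * 62**3
```

Under that digit order, "9" is 9 and "g" is 16, which is where the two terms in the estimate come from.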
11:20
🔗
|
JAA |
We could do this through URLTeam and then figure out what to do with it later. |
11:47
🔗
|
|
dd0a13f37 has joined #archiveteam-bs |
11:47
🔗
|
dd0a13f37 |
Where do I report security issues for archive.org? |
11:59
🔗
|
username1 |
info@archive.org |
11:59
🔗
|
|
dashcloud has quit IRC (Read error: Connection reset by peer) |
12:00
🔗
|
username1 |
they can either forward it for you or give you direct contact |
12:00
🔗
|
dd0a13f37 |
alright, thanks |
12:00
🔗
|
|
dashcloud has joined #archiveteam-bs |
12:01
🔗
|
username1 |
thank YOU |
12:07
🔗
|
dd0a13f37 |
Not sure it's anything major, but better safe than sorry I guess |
12:09
🔗
|
godane |
SketchCow: you're getting a showtime airing of Road To Wellville cause that's in the case of tapes |
12:10
🔗
|
godane |
plus side is it got the most out of having 10000k setting being at 6.4gb |
12:11
🔗
|
godane |
based on the Outer Limits preview it aired on the week of 1996-04-05 |
12:12
🔗
|
godane |
it was a preview for the episode called "The Refuge" with actor M. Emmet Walsh |
12:21
🔗
|
|
icedice has joined #archiveteam-bs |
12:34
🔗
|
dd0a13f37 |
JAA: It's very close, much closer than just the same order of magnitude. Converting the latest shortlink to decimal and using it together with https://wordpress.com/activity/ to get an estimate |
12:34
🔗
|
dd0a13f37 |
for posts/blog gives a result close to https://en.blog.wordpress.com/2015/01/06/2014-in-review/ |
12:35
🔗
|
dd0a13f37 |
Or the other way around, estimate number of blogs from 2014 posts/blog stats and stats, convert to b64, note that it's close |
12:35
🔗
|
dd0a13f37 |
b62* |
12:35
🔗
|
JAA |
Yeah, I ran a test with the two-character codes and almost all of them existed. |
12:36
🔗
|
dd0a13f37 |
According to that, there should be (base62) 08 57 30 52 59 blogs (131913987) |
12:36
🔗
|
dd0a13f37 |
Which is close to 9g000 |
12:37
🔗
|
JAA |
Yup |
12:44
🔗
|
dd0a13f37 |
Although it's not exact - if you manipulate the POST request from the stats page you can get a chart for the number of blogs which gives 125452778 (08 30 23 61 56) as total |
12:45
🔗
|
dd0a13f37 |
Or maybe they subtract deleted blogs, in which case it makes perfect sense |
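The two base62 digit lists quoted above can be converted back to decimal to compare the estimates. This is only a sketch of the conversion; `digits_to_int` is a hypothetical helper name, and the digit lists are copied from the messages as written.

```python
def digits_to_int(digits, base=62):
    """Combine a most-significant-first list of digit values into one integer."""
    value = 0
    for digit in digits:
        if not 0 <= digit < base:
            raise ValueError(f"digit {digit} is out of range for base {base}")
        value = value * base + digit
    return value


# Estimate derived from the posts-per-blog statistics.
estimate_from_stats = digits_to_int([8, 57, 30, 52, 59])
# Total read from the chart returned by the stats page.
chart_total = digits_to_int([8, 30, 23, 61, 56])

print(estimate_from_stats)                # 131913987
print(chart_total)                        # 125452778
print(estimate_from_stats - chart_total)  # 6461209
```

The gap of about 6.5 million is the figure that deleted blogs would have to account for if that explanation holds.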
12:50
🔗
|
dd0a13f37 |
4 billion posts, that's actually not a whole lot |
12:56
🔗
|
|
K4k has joined #archiveteam-bs |
12:57
🔗
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
12:59
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 255 seconds) |
13:00
🔗
|
|
Mateon1 has joined #archiveteam-bs |
13:51
🔗
|
dd0a13f37 |
I don't think any of the libgen collections on IA are complete unless the logs have been tampered with. Should I upload it again? |
15:02
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
15:15
🔗
|
|
icedice2 has joined #archiveteam-bs |
15:17
🔗
|
|
icedice has quit IRC (Ping timeout: 260 seconds) |
15:17
🔗
|
|
icedice2 has quit IRC (Client Quit) |
15:17
🔗
|
|
icedice has joined #archiveteam-bs |
15:28
🔗
|
JAA |
Just to confirm: is gawker.com archived, and where can the archives be found? I saw several mentions of it in the logs, but I can't find it on IA. (Via: https://www.reddit.com/r/Archiveteam/comments/73xszd/has_gawker_been_fully_archived/ ) |
15:31
🔗
|
|
dashcloud has joined #archiveteam-bs |
15:37
🔗
|
dd0a13f37 |
nvm, i found the real collection, up to r_2092000 is archived |
15:51
🔗
|
|
username1 is now known as schbirid |
15:52
🔗
|
schbirid |
anyone know how to strip all formatting from a $msg in irssi perl scripting? |
15:57
🔗
|
|
Rai-chan has joined #archiveteam-bs |
16:08
🔗
|
|
RichardG_ has joined #archiveteam-bs |
16:08
🔗
|
|
RichardG has quit IRC (Read error: Connection reset by peer) |
16:18
🔗
|
dd0a13f37 |
Sci-mag is archived up to 64099999, foreignfiction up to 1600000 |
16:19
🔗
|
dd0a13f37 |
Foreignfiction goes up to 1890000, sci-mag torrents are down so not sure exactly how far they go |
16:20
🔗
|
schbirid |
are they surely fully archived? or just 50% stalled torrents? |
16:23
🔗
|
dd0a13f37 |
The torrents are seeded, so I think they're archived. The ones that I checked were at least |
16:25
🔗
|
schbirid |
i meant to grab all of scimag but about iirc 25% of the ones i tried were not fully seeded :( |
16:34
🔗
|
dd0a13f37 |
Ask on forums for reseed then |
16:42
🔗
|
|
loadup has joined #archiveteam-bs |
16:54
🔗
|
|
icedice2 has joined #archiveteam-bs |
16:56
🔗
|
|
icedice has quit IRC (Ping timeout: 250 seconds) |
16:59
🔗
|
|
kepler45 has quit IRC (Quit: Leaving) |
17:20
🔗
|
|
pizzaiolo has joined #archiveteam-bs |
17:27
🔗
|
|
Asparagir has joined #archiveteam-bs |
17:28
🔗
|
|
svchfoo1 sets mode: +o Asparagir |
17:28
🔗
|
|
icedice2 has quit IRC (Quit: Leaving) |
17:29
🔗
|
|
icedice has joined #archiveteam-bs |
17:33
🔗
|
|
TC01 has quit IRC (Remote host closed the connection) |
17:38
🔗
|
|
icedice2 has joined #archiveteam-bs |
17:44
🔗
|
|
icedice has quit IRC (Read error: Operation timed out) |
18:00
🔗
|
|
Asparagir has quit IRC (Asparagir) |
18:31
🔗
|
|
icedice2 has quit IRC (Ping timeout: 255 seconds) |
18:31
🔗
|
JAA |
For the record, we're now grabbing wp.me in URLTeam. :-) |
18:33
🔗
|
|
icedice has joined #archiveteam-bs |
18:46
🔗
|
dd0a13f37 |
So will you archive the whole of WP? |
18:46
🔗
|
Somebody2 |
dd0a13f37: just the URLs, not their contents (at this point, at least) |
18:47
🔗
|
dd0a13f37 |
In URLteam, yes, but for the WP project |
18:47
🔗
|
dd0a13f37 |
Are they having any problems? |
18:47
🔗
|
JAA |
Not that I know of. |
18:47
🔗
|
Somebody2 |
Not that I know of, but it's good to have a backup |
18:47
🔗
|
JAA |
But I figured, why the hell not? |
18:47
🔗
|
Somebody2 |
heh, jinx |
18:48
🔗
|
dd0a13f37 |
4 bil posts, 1/4 have images, 1bil images, 1m each, 1pb |
18:48
🔗
|
dd0a13f37 |
Large endeavour |
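The back-of-the-envelope estimate above, written out as a sketch. It assumes "1m" means one decimal megabyte per image and that each post with images counts as a single image; the variable names are illustrative.

```python
posts = 4_000_000_000            # "4 bil posts"
images = posts // 4              # "1/4 have images" gives 1 billion images
bytes_per_image = 1_000_000      # assumed: 1 MB (decimal) per image
total_bytes = images * bytes_per_image

# 1 PB is taken as 10**15 bytes here (decimal petabyte).
total_petabytes = total_bytes / 10**15
print(images, total_petabytes)
```

This counts image data only; HTML, stylesheets and other assets would come on top.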
18:53
🔗
|
|
dd0a13f37 has quit IRC (Ping timeout: 268 seconds) |
18:58
🔗
|
|
icedice2 has joined #archiveteam-bs |
18:59
🔗
|
|
dd0a13f37 has joined #archiveteam-bs |
19:03
🔗
|
|
icedice has quit IRC (Ping timeout: 506 seconds) |
19:14
🔗
|
|
icedice has joined #archiveteam-bs |
19:16
🔗
|
|
dd0a13f37 has quit IRC (Ping timeout: 268 seconds) |
19:16
🔗
|
|
dd0a has joined #archiveteam-bs |
19:16
🔗
|
|
dd0a is now known as dd0a13f37 |
19:21
🔗
|
|
icedice2 has quit IRC (Ping timeout: 506 seconds) |
19:27
🔗
|
|
icedice2 has joined #archiveteam-bs |
19:29
🔗
|
|
icedice has quit IRC (Ping timeout: 245 seconds) |
19:30
🔗
|
|
icedice has joined #archiveteam-bs |
19:32
🔗
|
|
dd0a13f37 has quit IRC (Ping timeout: 268 seconds) |
19:36
🔗
|
|
ajshell1 has quit IRC (Quit: Leaving) |
19:36
🔗
|
|
icedice2 has quit IRC (Read error: Operation timed out) |
19:38
🔗
|
|
atrocity has joined #archiveteam-bs |
19:39
🔗
|
|
Atros has quit IRC (Ping timeout: 246 seconds) |
19:44
🔗
|
|
icedice2 has joined #archiveteam-bs |
19:50
🔗
|
|
icedice has quit IRC (Read error: Operation timed out) |
19:56
🔗
|
|
dd0a13f37 has joined #archiveteam-bs |
19:57
🔗
|
dd0a13f37 |
It sure is some improvement over proxy+webirc |
20:04
🔗
|
|
ajshell1 has joined #archiveteam-bs |
20:15
🔗
|
|
atrocity has quit IRC (Read error: Connection reset by peer) |
20:16
🔗
|
|
atrocity has joined #archiveteam-bs |
20:35
🔗
|
|
ajshell1 has quit IRC (Quit: Leaving) |
20:45
🔗
|
godane |
so i have this on tape from the box: https://en.wikipedia.org/wiki/Heat_and_Sunlight |
20:45
🔗
|
godane |
digitizing it now |
20:48
🔗
|
|
ajshell1 has joined #archiveteam-bs |
20:48
🔗
|
|
ajshell1 has quit IRC (Client Quit) |
20:56
🔗
|
|
ajshell1 has joined #archiveteam-bs |
20:57
🔗
|
|
Stilett0- has quit IRC (Ping timeout: 260 seconds) |
21:02
🔗
|
|
dashcloud has quit IRC (Remote host closed the connection) |
21:03
🔗
|
|
dashcloud has joined #archiveteam-bs |
21:13
🔗
|
|
TC01 has joined #archiveteam-bs |
21:17
🔗
|
|
ajshell1 has quit IRC (Quit: Leaving) |
21:32
🔗
|
|
ajshell1 has joined #archiveteam-bs |
21:45
🔗
|
|
ajshell1 has quit IRC (Quit: Leaving) |
21:51
🔗
|
|
kepler45 has joined #archiveteam-bs |
22:13
🔗
|
|
ajshell1 has joined #archiveteam-bs |
22:19
🔗
|
|
ajshell1 has quit IRC (Quit: Leaving) |
22:25
🔗
|
|
kepler45 has quit IRC (Quit: Leaving) |
22:25
🔗
|
|
odemg has quit IRC (Read error: Operation timed out) |
22:32
🔗
|
|
odemg has joined #archiveteam-bs |
22:38
🔗
|
dd0a13f37 |
A stripped-down version of archivebot for !ao, now that would be something. You could make it run much much faster if you can disregard certain constraints |
22:39
🔗
|
JAA |
You can't really ignore that much though. You still need to process images, stylesheets, scripts, etc. |
22:41
🔗
|
JAA |
An internet where everyone conforms to standards so we don't have to use parsers which are slowed down by all kinds of odd special cases, now that would be something. |
22:41
🔗
|
dd0a13f37 |
Not always. And you could use another parser, like myhtml |
22:41
🔗
|
dd0a13f37 |
Myhtml is fast, but there are no python bindings |
22:43
🔗
|
Somebody2 |
!ao jobs don't seem to be much of a bottleneck |
22:43
🔗
|
JAA |
Indeed, we rarely have a queue of !ao jobs. |
22:44
🔗
|
JAA |
And that's with only one !ao-only pipeline... |
22:45
🔗
|
dd0a13f37 |
JAA: I don't think they parse inline scripts |
22:45
🔗
|
JAA |
dd0a13f37: wpull does not actually parse scripts, but it does process it and tries to extract links from it. |
22:45
🔗
|
JAA |
s/it/them/ |
22:46
🔗
|
JAA |
Same with CSS, I believe. |
22:46
🔗
|
JAA |
Only HTML is parsed properly. |
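The distinction drawn above, parsing HTML properly versus merely scanning scripts for things that look like links, can be illustrated with a small sketch. This is not wpull's actual code: `LinkCollector`, `URL_PATTERN` and `extract_links` are hypothetical names, and the regular expression is a deliberately crude assumption about what a URL in script text looks like.

```python
import re
from html.parser import HTMLParser

# Crude heuristic: anything starting with http(s):// up to a quote, bracket or space.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>\\)]+""")


class LinkCollector(HTMLParser):
    """Collect links from parsed attributes and, separately, from script text."""

    def __init__(self):
        super().__init__()
        self.parsed_links = []   # found by real HTML parsing (href/src attributes)
        self.scraped_links = []  # found by pattern-matching inside <script> text
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.parsed_links.append(value)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script bodies are not parsed as JavaScript, only scanned for URL-like text.
        if self._in_script:
            self.scraped_links.extend(URL_PATTERN.findall(data))


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.parsed_links, collector.scraped_links


sample = (
    '<a href="/about">About</a><img src="logo.png">'
    '<script>var api = "https://example.com/api";</script>'
)
parsed, scraped = extract_links(sample)
print(parsed)   # ['/about', 'logo.png']
print(scraped)  # ['https://example.com/api']
```

The scanning step is cheap but blind to URLs that a script assembles at run time, which is one reason script handling is best-effort.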
23:00
🔗
|
|
zino has quit IRC (Read error: Connection reset by peer) |
23:01
🔗
|
|
zino has joined #archiveteam-bs |
23:08
🔗
|
|
ajshell1 has joined #archiveteam-bs |
23:12
🔗
|
|
ajshell1 has quit IRC (Client Quit) |
23:18
🔗
|
|
icedice has joined #archiveteam-bs |
23:20
🔗
|
|
icedice2 has quit IRC (Ping timeout: 260 seconds) |
23:27
🔗
|
|
ajshell1 has joined #archiveteam-bs |
23:41
🔗
|
|
icedice2 has joined #archiveteam-bs |
23:43
🔗
|
|
icedice has quit IRC (Ping timeout: 260 seconds) |
23:45
🔗
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
23:48
🔗
|
|
dashcloud has joined #archiveteam-bs |
23:58
🔗
|
|
icedice has joined #archiveteam-bs |