| Time |
Nickname |
Message |
|
00:07
|
|
primus has quit IRC (Ping timeout: 512 seconds) |
|
00:10
|
chfoo |
SketchCow: for the items "2015_ovi_store_panic" and "2015_ovi_store_panic_2", the CDX files don't seem to be generated. could you check up on that? |
|
00:10
|
|
primus has joined #archiveteam |
|
00:18
|
|
primus104 has joined #archiveteam |
|
00:22
|
Start |
anyone know if there's a python script for scraping reddit.com/domain/ ? |
|
00:25
|
yipdw |
!a http://www.reddit.com/r/subreddit --ignore-sets=reddit |
|
00:25
|
yipdw |
or https://github.com/ludios/grab-site as a distributable |
|
00:30
|
Start |
i'm more interested in just getting the urls from a specific domain, for example reddit.com/domain/layervault.com |
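(Editor's sketch, not part of the log.) One way to collect the submitted URLs for a single domain might be to page through reddit's JSON listing for `/domain/<domain>/`. The endpoint shape, the `limit`/`after` parameters, the field names (`data.children[].data.url`, `data.after`) and the User-Agent string are assumptions about reddit's public listing API, not something confirmed in the channel; the helper names are hypothetical.

```python
import json
import urllib.request


def extract_urls(listing):
    """Return (urls, after) from one page of a reddit Listing JSON object.

    Assumes the usual Listing layout: data.children[].data.url and data.after.
    """
    data = listing["data"]
    urls = [child["data"]["url"] for child in data["children"]]
    return urls, data.get("after")


def fetch_domain_urls(domain, max_pages=5):
    """Walk reddit.com/domain/<domain>/.json page by page (assumed endpoint)."""
    urls, after = [], None
    for _ in range(max_pages):
        page = f"https://www.reddit.com/domain/{domain}/.json?limit=100"
        if after:
            page += f"&after={after}"
        # A descriptive User-Agent is an assumption; reddit may throttle defaults.
        req = urllib.request.Request(
            page, headers={"User-Agent": "domain-url-sketch/0.1"}
        )
        with urllib.request.urlopen(req) as resp:
            listing = json.load(resp)
        found, after = extract_urls(listing)
        urls.extend(found)
        if not after:  # no further pages
            break
    return urls
```

Calling `fetch_domain_urls("layervault.com")` would then return a flat list of URLs, subject to whatever rate limits and listing depth reddit enforces.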
|
00:32
|
|
ohhdemgir has quit IRC (Quit: Leaving) |
|
00:53
|
|
kyan has joined #archiveteam |
|
00:56
|
|
beardicus has quit IRC (Quit: My MacBook Pro has gone to sleep. ZZZzzz…) |
|
01:07
|
|
RichardG has joined #archiveteam |
|
01:08
|
|
Wizardcry has quit IRC (Read error: Operation timed out) |
|
01:19
|
|
RichardG has quit IRC (Quit: No keyboard found, press F1 to continue) |
|
01:20
|
|
RichardG has joined #archiveteam |
|
01:27
|
joepie91_ |
Start: there's a node.js module named reddit-stream that might do what you want |
|
01:34
|
|
NovaKing_ has quit IRC (Read error: Operation timed out) |
|
01:34
|
|
Selanda has quit IRC (Read error: Operation timed out) |
|
01:35
|
|
lytv has quit IRC (Read error: Operation timed out) |
|
01:35
|
|
cadbury_ has quit IRC (Read error: Operation timed out) |
|
01:35
|
|
aNthraXx has quit IRC (Read error: Operation timed out) |
|
01:35
|
|
caber has quit IRC (Read error: Operation timed out) |
|
01:36
|
|
lytv has joined #archiveteam |
|
01:36
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
|
01:37
|
|
Coderjoe has joined #archiveteam |
|
01:37
|
|
caber has joined #archiveteam |
|
01:40
|
|
brayden has quit IRC (Read error: Operation timed out) |
|
01:40
|
|
caber has quit IRC (Read error: Operation timed out) |
|
01:41
|
|
Selanda has joined #archiveteam |
|
01:42
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
|
01:51
|
|
primus104 has quit IRC (Leaving.) |
|
01:53
|
|
Coderjoe has joined #archiveteam |
|
01:59
|
|
caber has joined #archiveteam |
|
02:01
|
|
cadbury_ has joined #archiveteam |
|
02:03
|
|
aNthraXx has joined #archiveteam |
|
02:12
|
|
NovaKing_ has joined #archiveteam |
|
02:44
|
Start |
arkiver: we should be able to begin layervault in the next few days |
|
02:44
|
Start |
i've discovered some sequential api urls |
|
02:45
|
Start |
i'd recommend having layervault.com and news.layervault.com (designer news) as separate warrior projects, as they are completely different sites |
|
03:05
|
|
brayden has joined #archiveteam |
|
03:19
|
|
primus has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
BlueMaxim has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
SN4T14_ has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
Emcy has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
Mayonaise has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
rejon has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
Rickster has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
ryan_ has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
xmc has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
Sue_ has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
yipdw has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
dcmorton has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
marnold has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
ersi has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
slash` has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
Famicoman has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
eprillios has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:19
|
|
Cameron_D has quit IRC (ircd.choopa.net irc.eversible.com) |
|
03:21
|
|
SN4T14 has joined #archiveteam |
|
03:23
|
|
primus has joined #archiveteam |
|
03:23
|
|
BlueMaxim has joined #archiveteam |
|
03:23
|
|
SN4T14_ has joined #archiveteam |
|
03:23
|
|
Mayonaise has joined #archiveteam |
|
03:23
|
|
rejon has joined #archiveteam |
|
03:23
|
|
Rickster has joined #archiveteam |
|
03:23
|
|
ryan_ has joined #archiveteam |
|
03:23
|
|
xmc has joined #archiveteam |
|
03:23
|
|
Sue_ has joined #archiveteam |
|
03:23
|
|
yipdw has joined #archiveteam |
|
03:23
|
|
dcmorton has joined #archiveteam |
|
03:23
|
|
marnold has joined #archiveteam |
|
03:23
|
|
ersi has joined #archiveteam |
|
03:23
|
|
slash` has joined #archiveteam |
|
03:23
|
|
Famicoman has joined #archiveteam |
|
03:23
|
|
Cameron_D has joined #archiveteam |
|
03:23
|
|
irc.eversible.com sets mode: +oooo xmc dcmorton ersi Cameron_D |
|
03:23
|
|
swebb sets mode: +o xmc |
|
03:23
|
|
swebb sets mode: +o ersi |
|
03:25
|
|
Wolfie has quit IRC (Read error: Connection reset by peer) |
|
03:26
|
|
dcmorton has quit IRC (Excess Flood) |
|
03:26
|
|
dcmorton has joined #archiveteam |
|
03:27
|
|
Famicoman has quit IRC (Remote host closed the connection) |
|
03:27
|
|
ersi has quit IRC (Read error: Connection reset by peer) |
|
03:27
|
|
ersi has joined #archiveteam |
|
03:27
|
|
swebb sets mode: +o ersi |
|
03:30
|
|
SN4T14_ has quit IRC (Ping timeout: 512 seconds) |
|
03:35
|
|
eprillios has joined #archiveteam |
|
03:36
|
|
Famicoman has joined #archiveteam |
|
03:37
|
|
fiatjaf has left undefined |
|
04:16
|
SketchCow |
chfoo: Restarted - let's see if it derives |
|
04:18
|
|
Infreq has joined #archiveteam |
|
04:21
|
|
chazchaz_ has quit IRC (Remote host closed the connection) |
|
04:22
|
|
chazchaz_ has joined #archiveteam |
|
04:33
|
|
svchfoo2 has quit IRC (Quit: Closing) |
|
04:36
|
|
svchfoo2 has joined #archiveteam |
|
06:06
|
garyrh |
I'm writing up a grab project for blingee. |
|
06:06
|
garyrh |
(channel name suggestion: #jankee) |
|
06:18
|
|
mistym has joined #archiveteam |
|
06:45
|
|
JMC has quit IRC (Ping timeout: 370 seconds) |
|
07:18
|
|
scyther has joined #archiveteam |
|
07:32
|
signius |
Is there any ETA on when the Google Code grab is likely to start? |
|
07:33
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
|
08:07
|
|
primus104 has joined #archiveteam |
|
08:24
|
|
schbirid has joined #archiveteam |
|
08:50
|
|
mistym has quit IRC (Remote host closed the connection) |
|
08:58
|
|
BlueMaxim has quit IRC (Ping timeout: 512 seconds) |
|
08:59
|
|
BlueMaxim has joined #archiveteam |
|
09:44
|
|
Ymgve has joined #archiveteam |
|
09:45
|
|
habi has joined #archiveteam |
|
09:47
|
|
habi has left |
|
10:02
|
|
signius has quit IRC (Ping timeout: 306 seconds) |
|
10:14
|
|
signius has joined #archiveteam |
|
10:22
|
schbirid |
could someone fully archive https://www.reddit.com/r/IAmA/comments/31esm0/iama_95_year_old_german_women_from_a_village_in/ ? it's wonderful |
|
10:31
|
Smiley |
archivebot no good for it?? |
|
11:09
|
BlueMaxim |
I imagine that loading all the comments would be the problem |
|
11:17
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
|
11:34
|
|
SimpBrain has joined #archiveteam |
|
11:48
|
|
primus has quit IRC (Read error: Connection timed out) |
|
11:49
|
|
primus has joined #archiveteam |
|
11:59
|
|
dashcloud has joined #archiveteam |
|
12:32
|
|
Ara_ has joined #archiveteam |
|
12:35
|
|
philpem has joined #archiveteam |
|
12:38
|
|
Ara__ has quit IRC (Ping timeout: 492 seconds) |
|
12:45
|
|
Ara__ has joined #archiveteam |
|
12:51
|
|
Ara_ has quit IRC (Ping timeout: 492 seconds) |
|
12:53
|
|
Ara_ has joined #archiveteam |
|
12:54
|
|
Ara__ has quit IRC (Ping timeout: 492 seconds) |
|
13:10
|
SimpBrain |
got a new server ready to pile archiveteam data on to. Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz (Cores 8), 2 x 3TB hdd's. 16GB Ram |
|
13:29
|
|
Ara_ has quit IRC (Ping timeout: 492 seconds) |
|
13:35
|
|
monod has joined #archiveteam |
|
13:42
|
mietek |
Would someone be able to help recover lost files related to the Charity programming language? https://github.com/mietek/charity-language/issues/1 |
|
13:43
|
mietek |
I’ve manually archived the Charity website (http://pll.cpsc.ucalgary.ca/charity1/www/home.html) while it’s still available, fixing broken links and restoring papers from other locations: https://github.com/mietek/charity-language |
|
13:43
|
mietek |
There are, however, some files which I cannot find |
|
13:45
|
mietek |
I’ve also made a full mirror of the website, which includes many TeX source files, and unrelated papers - if anyone is interested, I can upload the tarball somewhere. |
|
13:46
|
mietek |
It’s 160MB compressed |
|
13:46
|
arkiver |
you can always upload those archives to the Internet Archive |
|
13:46
|
arkiver |
they'd happily store it for you :) |
|
13:49
|
mietek |
Didn’t know they take tarballs |
|
13:50
|
arkiver |
they take any kind of file |
|
14:01
|
|
antomatic has quit IRC () |
|
14:03
|
|
Wolfie has joined #archiveteam |
|
14:05
|
|
antomatic has joined #archiveteam |
|
14:08
|
|
primus104 has quit IRC (Leaving.) |
|
14:38
|
|
bzc6p has joined #archiveteam |
|
14:39
|
|
bzc6p has left |
|
14:50
|
|
Peetz0r_ has joined #archiveteam |
|
14:50
|
|
Peetz0r has quit IRC (Read error: Connection reset by peer) |
|
14:58
|
|
Ara_ has joined #archiveteam |
|
15:13
|
|
monod has quit IRC (Ping timeout: 512 seconds) |
|
15:18
|
|
SimpBrai1 has joined #archiveteam |
|
15:25
|
|
SimpBrain has quit IRC (Ping timeout: 512 seconds) |
|
15:31
|
|
Infreq has quit IRC () |
|
15:46
|
|
signius has quit IRC (Quit: Leaving) |
|
15:47
|
|
signius has joined #archiveteam |
|
15:48
|
|
signius has quit IRC (Client Quit) |
|
15:48
|
|
signius has joined #archiveteam |
|
16:40
|
|
monod has joined #archiveteam |
|
16:40
|
balrog |
mietek: what happened to ftp.cpsc.ucalgary.ca? |
|
16:41
|
mietek |
balrog: good question |
|
16:41
|
balrog |
has anyone asked the university? |
|
16:41
|
mietek |
Probably overzealous IT departments |
|
16:41
|
mietek |
Note the Calgary pages block IA |
|
16:41
|
balrog |
(asked as a programmer/researcher) |
|
16:41
|
mietek |
I’ve contacted the main researcher behind the project; no response yet |
|
16:42
|
balrog |
:/ ok |
|
16:42
|
mietek |
I’m now working down the list of people associated with the project |
|
16:42
|
mietek |
But they’re all long gone from the university |
|
16:42
|
|
primus104 has joined #archiveteam |
|
16:42
|
mietek |
It really pisses me off that universities delete people’s home pages |
|
16:42
|
mietek |
It should be a crime to do that |
|
16:43
|
balrog |
http://pll.cpsc.ucalgary.ca/charity1/www/home.html does seem still to be up |
|
16:43
|
balrog |
and that server has no robots.txt |
|
16:43
|
mietek |
I know. I pasted that above :) |
|
16:43
|
mietek |
That server is pretty badly set up, so you can actually browse the entire hierarchy |
|
16:43
|
mietek |
And so I was able to recover almost all of their papers |
|
16:44
|
mietek |
Home pages were hosted on e.g. http://web.archive.org/web/*/pages.cpsc.ucalgary.ca/%7Espoonerd/ |
|
16:44
|
balrog |
and there's no robots.txt there either |
|
16:44
|
balrog |
this is an issue with IA where it doesn't refresh if the robots.txt is removed, apparently :/ |
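(Editor's sketch, not part of the log.) Whether a given robots.txt would shut out a particular crawler can be checked offline with Python's standard `urllib.robotparser`. The sample rules and the helper name `is_blocked` are hypothetical, and treating `ia_archiver` as the user agent the Wayback Machine honored is an assumption; nothing here reflects the actual contents of the Calgary servers' robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, user_agent, url):
    """Return True if the given robots.txt text disallows user_agent on url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical rules of the kind that would exclude the Wayback Machine.
SAMPLE_ROBOTS = "User-agent: ia_archiver\nDisallow: /\n"
PAGE = "http://pll.cpsc.ucalgary.ca/charity1/www/home.html"

print(is_blocked(SAMPLE_ROBOTS, "ia_archiver", PAGE))  # blocked by the sample rules
print(is_blocked("", "ia_archiver", PAGE))             # empty robots.txt: nothing disallowed
```

This only evaluates rules you already have as text; it says nothing about whether an archive has re-fetched robots.txt since it was removed, which is the problem described above.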
|
16:45
|
mietek |
I’m holding out hope that IA crawls even if it’s blocked |
|
16:45
|
mietek |
And just silently collects the data |
|
16:45
|
mietek |
For the future |
|
16:45
|
|
habi has joined #archiveteam |
|
16:45
|
balrog |
afaik IA does not |
|
16:45
|
mietek |
:( |
|
16:46
|
|
habi has left |
|
16:46
|
xmc |
archivebot does! |
|
16:47
|
mietek |
Was it around in 1997? |
|
16:47
|
xmc |
no. |
|
16:50
|
mietek |
Do you have any tips for locating people? |
|
16:50
|
mietek |
https://github.com/mietek/charity-language/blob/master/doc/pdf/2003-zeng-an-implementation-of-charity.pdf |
|
16:51
|
mietek |
Min Zeng, Calgary MSc 2003 |
|
16:51
|
|
habi1 has joined #archiveteam |
|
16:51
|
mietek |
Actually, that’s probably easy. |
|
16:55
|
|
Ara_ has quit IRC (Ping timeout: 240 seconds) |
|
16:58
|
|
habi1 has left |
|
17:04
|
|
mistym has joined #archiveteam |
|
17:28
|
|
Wizardcry has joined #archiveteam |
|
17:53
|
|
monod has quit IRC (Ping timeout: 512 seconds) |
|
17:56
|
|
Wizardcry has quit IRC (Read error: Operation timed out) |
|
18:02
|
|
appledash has quit IRC (Read error: Connection reset by peer) |
|
18:12
|
|
Ara_ has joined #archiveteam |
|
18:19
|
|
rolfb has joined #archiveteam |
|
18:21
|
|
aliz has joined #archiveteam |
|
18:29
|
|
rolfb has quit IRC (Leaving...) |
|
19:25
|
|
garyrh has quit IRC (Write error: Broken pipe) |
|
19:28
|
|
useretail has quit IRC (hub.se irc.ac.za) |
|
19:35
|
|
garyrh has joined #archiveteam |
|
19:36
|
|
lytv has quit IRC (Ping timeout: 265 seconds) |
|
19:37
|
Start |
arkiver: once we've started with friendfeed, we'll be able to start layervault |
|
19:37
|
Start |
i found a way of grabbing everything sequentially through their api |
|
19:39
|
|
lytv has joined #archiveteam |
|
19:40
|
arkiver |
Start: awesome |
|
19:40
|
arkiver |
looks like I can safely fully start the grab of friendfeed tonight, which means less work on that |
|
19:40
|
arkiver |
then I'll get on layervault |
|
19:47
|
|
Rickster has quit IRC (Quit: ZNC - http://znc.in) |
|
19:48
|
|
SN4T14_ has joined #archiveteam |
|
19:51
|
|
Rickster has joined #archiveteam |
|
19:52
|
|
Mayonaise has quit IRC (Ping timeout: 512 seconds) |
|
19:53
|
|
SN4T14 has quit IRC (Ping timeout: 306 seconds) |
|
20:39
|
|
Mayonaise has joined #archiveteam |
|
20:40
|
|
godane has quit IRC (Read error: Operation timed out) |
|
20:47
|
|
svchfoo2 has quit IRC (Ping timeout: 240 seconds) |
|
20:52
|
|
svchfoo2 has joined #archiveteam |
|
21:06
|
|
SimpBrai1 has quit IRC (Quit: Leaving) |
|
21:08
|
|
Deewiant has joined #archiveteam |
|
21:11
|
|
aaaaaaaaa has joined #archiveteam |
|
21:13
|
|
godane has joined #archiveteam |
|
21:39
|
|
Peetz0r_ is now known as Peetz0r |
|
21:56
|
|
scyther has quit IRC (Read error: Connection reset by peer) |
|
22:15
|
|
wtron has joined #archiveteam |
|
22:16
|
|
BlueMaxim has joined #archiveteam |
|
22:21
|
|
mistym has quit IRC (Remote host closed the connection) |
|
22:51
|
arkiver |
Start: can we talk in ~10 hours about the findings you got from layervault? |
|
22:51
|
arkiver |
we'll be starting a discovery tomorrow for that |
|
22:52
|
Start |
ok |
|
23:15
|
|
mahadri has joined #archiveteam |
|
23:58
|
Atluxity |
hmmm.. I just had a fantasy about an archive warrior using html5, websockets etc... All one would need to participate would be to visit a warrior-url with a modern browser, and then it would do the job from there |
|
23:58
|
|
Ara_ has quit IRC (Ping timeout: 240 seconds) |
|
23:59
|
|
philpem has quit IRC (Ping timeout: 260 seconds) |
|
23:59
|
|
mistym has joined #archiveteam |