Time |
Nickname |
Message |
00:07
|
|
primus has quit IRC (Ping timeout: 512 seconds) |
00:10
|
chfoo |
SketchCow: for the items "2015_ovi_store_panic" and "2015_ovi_store_panic_2", the CDX files don't seem to be generated. could you check up on that? |
00:10
|
|
primus has joined #archiveteam |
00:18
|
|
primus104 has joined #archiveteam |
00:22
|
Start |
anyone know if there's a python script for scraping reddit.com/domain/ ? |
00:25
|
yipdw |
!a http://www.reddit.com/r/subreddit --ignore-sets=reddit |
00:25
|
yipdw |
or https://github.com/ludios/grab-site as a distributable |
00:30
|
Start |
i'm more interested in just getting the urls from a specific domain, for example reddit.com/domain/layervault.com |
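A minimal Python sketch of what Start is asking for, under stated assumptions: it assumes the `reddit.com/domain/<domain>/` listing is also served as JSON when `.json` is appended, and that each page carries an `after` cursor for the next page. The function names (`listing_url`, `extract_links`, `collect_urls`, `fetch_json`) and the User-Agent string are hypothetical, and reddit may rate-limit or require authentication, so treat this as a starting point rather than a working grabber.

```python
import json
import urllib.request
from urllib.parse import urlencode


def listing_url(domain, after=None, limit=100):
    """Build the (assumed) JSON listing URL for reddit.com/domain/<domain>/."""
    params = {"limit": limit}
    if after:
        params["after"] = after  # cursor returned by the previous page
    return f"https://www.reddit.com/domain/{domain}/.json?{urlencode(params)}"


def extract_links(listing):
    """Return (submitted URLs, next-page cursor) from one listing page."""
    data = listing["data"]
    urls = [child["data"]["url"] for child in data["children"]]
    return urls, data.get("after")


def collect_urls(domain, fetch, max_pages=10):
    """Walk the listing page by page; `fetch` maps a URL to parsed JSON."""
    urls, after = [], None
    for _ in range(max_pages):
        page_urls, after = extract_links(fetch(listing_url(domain, after)))
        urls.extend(page_urls)
        if not after:  # no cursor means this was the last page
            break
    return urls


def fetch_json(url):
    """Hypothetical network fetcher; the User-Agent value is a placeholder."""
    request = urllib.request.Request(url, headers={"User-Agent": "domain-url-lister/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `collect_urls("layervault.com", fetch_json)` would then return the submitted URLs in listing order; passing a different `fetch` callable makes the paging logic testable without touching the network.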
00:32
|
|
ohhdemgir has quit IRC (Quit: Leaving) |
00:53
|
|
kyan has joined #archiveteam |
00:56
|
|
beardicus has quit IRC (Quit: My MacBook Pro has gone to sleep. ZZZzzz…) |
01:07
|
|
RichardG has joined #archiveteam |
01:08
|
|
Wizardcry has quit IRC (Read error: Operation timed out) |
01:19
|
|
RichardG has quit IRC (Quit: No keyboard found, press F1 to continue) |
01:20
|
|
RichardG has joined #archiveteam |
01:27
|
joepie91_ |
Start: there's a node.js module named reddit-stream that might do what you want |
01:34
|
|
NovaKing_ has quit IRC (Read error: Operation timed out) |
01:34
|
|
Selanda has quit IRC (Read error: Operation timed out) |
01:35
|
|
lytv has quit IRC (Read error: Operation timed out) |
01:35
|
|
cadbury_ has quit IRC (Read error: Operation timed out) |
01:35
|
|
aNthraXx has quit IRC (Read error: Operation timed out) |
01:35
|
|
caber has quit IRC (Read error: Operation timed out) |
01:36
|
|
lytv has joined #archiveteam |
01:36
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
01:37
|
|
Coderjoe has joined #archiveteam |
01:37
|
|
caber has joined #archiveteam |
01:40
|
|
brayden has quit IRC (Read error: Operation timed out) |
01:40
|
|
caber has quit IRC (Read error: Operation timed out) |
01:41
|
|
Selanda has joined #archiveteam |
01:42
|
|
Coderjoe has quit IRC (Read error: Operation timed out) |
01:51
|
|
primus104 has quit IRC (Leaving.) |
01:53
|
|
Coderjoe has joined #archiveteam |
01:59
|
|
caber has joined #archiveteam |
02:01
|
|
cadbury_ has joined #archiveteam |
02:03
|
|
aNthraXx has joined #archiveteam |
02:12
|
|
NovaKing_ has joined #archiveteam |
02:44
|
Start |
arkiver: we should be able to begin layervault in the next few days |
02:44
|
Start |
i've discovered some sequential api urls |
02:45
|
Start |
i'd recommend having layervault.com and news.layervault.com (designer news) as separate warrior projects, as they are completely different sites |
03:05
|
|
brayden has joined #archiveteam |
03:19
|
|
primus has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
BlueMaxim has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
SN4T14_ has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
Emcy has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
Mayonaise has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
rejon has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
Rickster has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
ryan_ has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
xmc has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
Sue_ has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
yipdw has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
dcmorton has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
marnold has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
ersi has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
slash` has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
Famicoman has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
eprillios has quit IRC (ircd.choopa.net irc.eversible.com) |
03:19
|
|
Cameron_D has quit IRC (ircd.choopa.net irc.eversible.com) |
03:21
|
|
SN4T14 has joined #archiveteam |
03:23
|
|
primus has joined #archiveteam |
03:23
|
|
BlueMaxim has joined #archiveteam |
03:23
|
|
SN4T14_ has joined #archiveteam |
03:23
|
|
Mayonaise has joined #archiveteam |
03:23
|
|
rejon has joined #archiveteam |
03:23
|
|
Rickster has joined #archiveteam |
03:23
|
|
ryan_ has joined #archiveteam |
03:23
|
|
xmc has joined #archiveteam |
03:23
|
|
Sue_ has joined #archiveteam |
03:23
|
|
yipdw has joined #archiveteam |
03:23
|
|
dcmorton has joined #archiveteam |
03:23
|
|
marnold has joined #archiveteam |
03:23
|
|
ersi has joined #archiveteam |
03:23
|
|
slash` has joined #archiveteam |
03:23
|
|
Famicoman has joined #archiveteam |
03:23
|
|
Cameron_D has joined #archiveteam |
03:23
|
|
irc.eversible.com sets mode: +oooo xmc dcmorton ersi Cameron_D |
03:23
|
|
swebb sets mode: +o xmc |
03:23
|
|
swebb sets mode: +o ersi |
03:25
|
|
Wolfie has quit IRC (Read error: Connection reset by peer) |
03:26
|
|
dcmorton has quit IRC (Excess Flood) |
03:26
|
|
dcmorton has joined #archiveteam |
03:27
|
|
Famicoman has quit IRC (Remote host closed the connection) |
03:27
|
|
ersi has quit IRC (Read error: Connection reset by peer) |
03:27
|
|
ersi has joined #archiveteam |
03:27
|
|
swebb sets mode: +o ersi |
03:30
|
|
SN4T14_ has quit IRC (Ping timeout: 512 seconds) |
03:35
|
|
eprillios has joined #archiveteam |
03:36
|
|
Famicoman has joined #archiveteam |
03:37
|
|
fiatjaf has left |
04:16
|
SketchCow |
chfoo: Restarted - let's see if it derives |
04:18
|
|
Infreq has joined #archiveteam |
04:21
|
|
chazchaz_ has quit IRC (Remote host closed the connection) |
04:22
|
|
chazchaz_ has joined #archiveteam |
04:33
|
|
svchfoo2 has quit IRC (Quit: Closing) |
04:36
|
|
svchfoo2 has joined #archiveteam |
06:06
|
garyrh |
I'm writing up a grab project for blingee. |
06:06
|
garyrh |
(channel name suggestion: #jankee) |
06:18
|
|
mistym has joined #archiveteam |
06:45
|
|
JMC has quit IRC (Ping timeout: 370 seconds) |
07:18
|
|
scyther has joined #archiveteam |
07:32
|
signius |
Is there any ETA on when the Google Code grab is likely to start? |
07:33
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
08:07
|
|
primus104 has joined #archiveteam |
08:24
|
|
schbirid has joined #archiveteam |
08:50
|
|
mistym has quit IRC (Remote host closed the connection) |
08:58
|
|
BlueMaxim has quit IRC (Ping timeout: 512 seconds) |
08:59
|
|
BlueMaxim has joined #archiveteam |
09:44
|
|
Ymgve has joined #archiveteam |
09:45
|
|
habi has joined #archiveteam |
09:47
|
|
habi has left |
10:02
|
|
signius has quit IRC (Ping timeout: 306 seconds) |
10:14
|
|
signius has joined #archiveteam |
10:22
|
schbirid |
could someone fully archive https://www.reddit.com/r/IAmA/comments/31esm0/iama_95_year_old_german_women_from_a_village_in/ ? it's wonderful |
10:31
|
Smiley |
archivebot no good for it?? |
11:09
|
BlueMaxim |
I imagine that loading all the comments would be the problem |
11:17
|
|
BlueMaxim has quit IRC (Read error: Connection reset by peer) |
11:34
|
|
SimpBrain has joined #archiveteam |
11:48
|
|
primus has quit IRC (Read error: Connection timed out) |
11:49
|
|
primus has joined #archiveteam |
11:59
|
|
dashcloud has joined #archiveteam |
12:32
|
|
Ara_ has joined #archiveteam |
12:35
|
|
philpem has joined #archiveteam |
12:38
|
|
Ara__ has quit IRC (Ping timeout: 492 seconds) |
12:45
|
|
Ara__ has joined #archiveteam |
12:51
|
|
Ara_ has quit IRC (Ping timeout: 492 seconds) |
12:53
|
|
Ara_ has joined #archiveteam |
12:54
|
|
Ara__ has quit IRC (Ping timeout: 492 seconds) |
13:10
|
SimpBrain |
got a new server ready to pile archiveteam data on to. Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz (Cores 8), 2 x 3TB hdd's. 16GB Ram |
13:29
|
|
Ara_ has quit IRC (Ping timeout: 492 seconds) |
13:35
|
|
monod has joined #archiveteam |
13:42
|
mietek |
Would someone be able to help recover lost files related to the Charity programming language? https://github.com/mietek/charity-language/issues/1 |
13:43
|
mietek |
I’ve manually archived the Charity website (http://pll.cpsc.ucalgary.ca/charity1/www/home.html) while it’s still available, fixing broken links and restoring papers from other locations: https://github.com/mietek/charity-language |
13:43
|
mietek |
There are, however, some files which I cannot find |
13:45
|
mietek |
I’ve also made a full mirror of the website, which includes many TeX source files, and unrelated papers; if anyone is interested, I can upload the tarball somewhere. |
13:46
|
mietek |
It’s 160MB compressed |
13:46
|
arkiver |
you can always upload those archives to the Internet Archive |
13:46
|
arkiver |
they'd happily store it for you :) |
13:49
|
mietek |
Didn’t know they take tarballs |
13:50
|
arkiver |
they take any kind of file |
14:01
|
|
antomatic has quit IRC () |
14:03
|
|
Wolfie has joined #archiveteam |
14:05
|
|
antomatic has joined #archiveteam |
14:08
|
|
primus104 has quit IRC (Leaving.) |
14:38
|
|
bzc6p has joined #archiveteam |
14:39
|
|
bzc6p has left |
14:50
|
|
Peetz0r_ has joined #archiveteam |
14:50
|
|
Peetz0r has quit IRC (Read error: Connection reset by peer) |
14:58
|
|
Ara_ has joined #archiveteam |
15:13
|
|
monod has quit IRC (Ping timeout: 512 seconds) |
15:18
|
|
SimpBrai1 has joined #archiveteam |
15:25
|
|
SimpBrain has quit IRC (Ping timeout: 512 seconds) |
15:31
|
|
Infreq has quit IRC () |
15:46
|
|
signius has quit IRC (Quit: Leaving) |
15:47
|
|
signius has joined #archiveteam |
15:48
|
|
signius has quit IRC (Client Quit) |
15:48
|
|
signius has joined #archiveteam |
16:40
|
|
monod has joined #archiveteam |
16:40
|
balrog |
mietek: what happened to ftp.cpsc.ucalgary.ca? |
16:41
|
mietek |
balrog: good question |
16:41
|
balrog |
has anyone asked the university? |
16:41
|
mietek |
Probably overzealous IT departments |
16:41
|
mietek |
Note the Calgary pages block IA |
16:41
|
balrog |
(asked as a programmer/researcher) |
16:41
|
mietek |
I’ve contacted the main researcher behind the project; no response yet |
16:42
|
balrog |
:/ ok |
16:42
|
mietek |
I’m now working down the list of people associated with the project |
16:42
|
mietek |
But they’re all long gone from the university |
16:42
|
|
primus104 has joined #archiveteam |
16:42
|
mietek |
It really pisses me off that universities delete people’s home pages |
16:42
|
mietek |
It should be a crime to do that |
16:43
|
balrog |
http://pll.cpsc.ucalgary.ca/charity1/www/home.html does seem still to be up |
16:43
|
balrog |
and that server has no robots.txt |
16:43
|
mietek |
I know. I pasted that above :) |
16:43
|
mietek |
That server is pretty badly set up, so you can actually browse the entire hierarchy |
16:43
|
mietek |
And so I was able to recover almost all of their papers |
16:44
|
mietek |
Home pages were hosted on e.g. http://web.archive.org/web/*/pages.cpsc.ucalgary.ca/%7Espoonerd/ |
16:44
|
balrog |
and there's no robots.txt there either |
16:44
|
balrog |
this is an issue with IA where it doesn't refresh if the robots.txt is removed, apparently :/ |
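A small Python sketch of the robots.txt check being discussed, using the standard library's `urllib.robotparser`. The blocking rules below are made up for illustration (the chat says the live servers currently have no robots.txt), `is_allowed` is a hypothetical helper name, and `ia_archiver` is used on the assumption that it is the user agent a site would have named to exclude the Wayback Machine.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Evaluate robots.txt text (already downloaded) for one agent and URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


PAGE = "http://pll.cpsc.ucalgary.ca/charity1/www/home.html"

# Illustrative rules only: what an old exclusion might have looked like.
OLD_RULES = "User-agent: ia_archiver\nDisallow: /\n"

blocked_before = is_allowed(OLD_RULES, "ia_archiver", PAGE)   # excluded agent
other_crawler = is_allowed(OLD_RULES, "otherbot", PAGE)       # rule does not apply
allowed_now = is_allowed("", "ia_archiver", PAGE)             # empty file: no rules
```

An empty rule set permits everything, so a crawler that re-read the file after it was removed would be free to fetch the page again; a stale cached copy of the old rules would keep returning the blocked result.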
16:45
|
mietek |
I’m holding out hope that IA crawls even if it’s blocked |
16:45
|
mietek |
And just silently collects the data |
16:45
|
mietek |
For the future |
16:45
|
|
habi has joined #archiveteam |
16:45
|
balrog |
afaik IA does not |
16:45
|
mietek |
:( |
16:46
|
|
habi has left |
16:46
|
xmc |
archivebot does! |
16:47
|
mietek |
Was it around in 1997? |
16:47
|
xmc |
no. |
16:50
|
mietek |
Do you have any tips for locating people? |
16:50
|
mietek |
https://github.com/mietek/charity-language/blob/master/doc/pdf/2003-zeng-an-implementation-of-charity.pdf |
16:51
|
mietek |
Min Zeng, Calgary MSc 2003 |
16:51
|
|
habi1 has joined #archiveteam |
16:51
|
mietek |
Actually, that’s probably easy. |
16:55
|
|
Ara_ has quit IRC (Ping timeout: 240 seconds) |
16:58
|
|
habi1 has left |
17:04
|
|
mistym has joined #archiveteam |
17:28
|
|
Wizardcry has joined #archiveteam |
17:53
|
|
monod has quit IRC (Ping timeout: 512 seconds) |
17:56
|
|
Wizardcry has quit IRC (Read error: Operation timed out) |
18:02
|
|
appledash has quit IRC (Read error: Connection reset by peer) |
18:12
|
|
Ara_ has joined #archiveteam |
18:19
|
|
rolfb has joined #archiveteam |
18:21
|
|
aliz has joined #archiveteam |
18:29
|
|
rolfb has quit IRC (Leaving...) |
19:25
|
|
garyrh has quit IRC (Write error: Broken pipe) |
19:28
|
|
useretail has quit IRC (hub.se irc.ac.za) |
19:35
|
|
garyrh has joined #archiveteam |
19:36
|
|
lytv has quit IRC (Ping timeout: 265 seconds) |
19:37
|
Start |
arkiver: once we've started with friendfeed, we'll be able to start layervault |
19:37
|
Start |
i found a way of grabbing everything sequentially through their api |
19:39
|
|
lytv has joined #archiveteam |
19:40
|
arkiver |
Start: awesome |
19:40
|
arkiver |
looks like I can safely fully start the grab of friendfeed tonight, which means less work on that |
19:40
|
arkiver |
then I'll get on layervault |
19:47
|
|
Rickster has quit IRC (Quit: ZNC - http://znc.in) |
19:48
|
|
SN4T14_ has joined #archiveteam |
19:51
|
|
Rickster has joined #archiveteam |
19:52
|
|
Mayonaise has quit IRC (Ping timeout: 512 seconds) |
19:53
|
|
SN4T14 has quit IRC (Ping timeout: 306 seconds) |
20:39
|
|
Mayonaise has joined #archiveteam |
20:40
|
|
godane has quit IRC (Read error: Operation timed out) |
20:47
|
|
svchfoo2 has quit IRC (Ping timeout: 240 seconds) |
20:52
|
|
svchfoo2 has joined #archiveteam |
21:06
|
|
SimpBrai1 has quit IRC (Quit: Leaving) |
21:08
|
|
Deewiant has joined #archiveteam |
21:11
|
|
aaaaaaaaa has joined #archiveteam |
21:13
|
|
godane has joined #archiveteam |
21:39
|
|
Peetz0r_ is now known as Peetz0r |
21:56
|
|
scyther has quit IRC (Read error: Connection reset by peer) |
22:15
|
|
wtron has joined #archiveteam |
22:16
|
|
BlueMaxim has joined #archiveteam |
22:21
|
|
mistym has quit IRC (Remote host closed the connection) |
22:51
|
arkiver |
Start: can we talk in ~10 hours about the findings you got from layervault? |
22:51
|
arkiver |
we'll be starting a discovery tomorrow for that |
22:52
|
Start |
ok |
23:15
|
|
mahadri has joined #archiveteam |
23:58
|
Atluxity |
hmmm.. I just had a fantasy about an archive warrior using html5, websockets etc... All one would need to participate would be to visit a warrior-url with a modern browser, and then it would do the job from there |
23:58
|
|
Ara_ has quit IRC (Ping timeout: 240 seconds) |
23:59
|
|
philpem has quit IRC (Ping timeout: 260 seconds) |
23:59
|
|
mistym has joined #archiveteam |