Time |
Nickname |
Message |
01:44
|
|
jacketcha has quit IRC (Ping timeout: 252 seconds) |
02:20
|
|
trs80 has quit IRC (Ping timeout: 246 seconds) |
03:25
|
|
Mateon1 has quit IRC (Remote host closed the connection) |
03:26
|
|
Mateon1 has joined #internetarchive.bak |
06:56
|
|
trs80 has joined #internetarchive.bak |
07:45
|
|
trs80 has quit IRC (Ping timeout: 246 seconds) |
07:45
|
|
trs80 has joined #internetarchive.bak |
08:18
|
|
rrn has joined #internetarchive.bak |
08:24
|
|
atomotic has joined #internetarchive.bak |
10:11
|
|
atomotic has quit IRC (Quit: atomotic) |
10:32
|
|
Mateon1 has quit IRC (Read error: Operation timed out) |
10:32
|
|
Mateon1 has joined #internetarchive.bak |
10:58
|
|
atomotic has joined #internetarchive.bak |
12:34
|
|
atomotic has quit IRC (Quit: atomotic) |
13:02
|
|
atomotic has joined #internetarchive.bak |
13:13
|
|
AsmoB has joined #internetarchive.bak |
13:49
|
|
atomotic has quit IRC (Quit: atomotic) |
14:04
|
|
atomotic has joined #internetarchive.bak |
16:10
|
|
atomotic has quit IRC (Quit: atomotic) |
17:42
|
|
atomotic has joined #internetarchive.bak |
17:58
|
|
atomotic has quit IRC (Quit: atomotic) |
18:26
|
|
Pixi has quit IRC (Quit: Pixi) |
18:27
|
|
Pixi has joined #internetarchive.bak |
18:28
|
|
iabak-reg has joined #internetarchive.bak |
18:53
|
|
shenghac has joined #internetarchive.bak |
18:53
|
|
shenghac has quit IRC (Client Quit) |
18:55
|
|
hc has joined #internetarchive.bak |
18:58
|
|
shenghac has joined #internetarchive.bak |
18:59
|
shenghac |
https://www.irccloud.com/pastebin/mkn7oNEz/ |
18:59
|
shenghac |
Let me introduce myself: |
18:59
|
shenghac |
I am Yu-Sheng Su, a computer science graduate student at National Chengchi University (Taiwan). My research is on network embedding and visual captioning in the Computational Linguistics and Information Processing Laboratory. I was also an R&D intern at Microsoft and a machine learning intern at TradingValley (a startup company), so I am familiar with sklearn, tensorflow, and keras. |
19:00
|
shenghac |
Project Question: |
19:00
|
shenghac |
I am very interested in [Idea 3 Detect "soft 404s" and "parked" websites]. After studying it this week, I have a few questions: |
19:00
|
shenghac |
1. Will the Internet Archive provide datasets of "soft 404s" and "parked" websites? Will the data be labeled or not? |
19:00
|
shenghac |
- If the data is labeled, I can follow this paper [Identifying "Soft 404" Error Pages: Analyzing the Lexical Signatures of Documents in Distributed Collections] to meet |
19:00
|
shenghac |
a precision of 99% and a recall of 92% (or higher) |
19:00
|
shenghac |
- If there is no labeled data, I may use an unsupervised or NN model instead. |
19:00
|
shenghac |
2. Final result: |
19:00
|
shenghac |
- After Idea 3 is finished, will this system be merged into wayback-machine-chrome, wayback-machine-firefox, or wayback-machine-safari? If so, I will need to take that into account when I build it. |
19:00
|
shenghac |
3. Mentor: |
19:00
|
shenghac |
Who will be the mentor for this project? I found this project (Kenji Nagahashi) in internetarchive. Will Kenji Nagahashi be a mentor for this project? If so, how can I contact him to ask for more details? |
19:01
|
shenghac |
====================== |
19:01
|
shenghac |
I look forward to your reply; it will be a great help to me. |
19:01
|
shenghac |
Big thanks!! |
19:03
|
|
hc has quit IRC (Quit: Page closed) |
21:22
|
|
AsmoB has quit IRC ((null)) |