#archiveteam 2012-12-07,Fri

Time Nickname Message
05:00 🔗 tuankiet @alard: How is Yahoo Blog?
08:55 🔗 alard tuankiet: Still there, I guess?
08:56 🔗 alard (I'm not doing anything with it at the moment, if that's what you mean.)
08:56 🔗 ersi Yeah, he just wrote in #dailybooth so I think so
08:57 🔗 alard Ah, haha. I wanted to write that Yahoo Blog is still alive, but my question works in the other sense as well. :)
08:58 🔗 ersi Ha
08:58 🔗 ersi Yeah :) I'm a bit sleepy I guess
09:17 🔗 alard tuankiet: Yahoo search could perhaps produce blog names? http://search.yahoo.com/search?p=T%C3%B4i&n=100&ei=UTF-8&vs=blog.yahoo.com
09:19 🔗 alard Unfriendly urls though: "Hey, read my blog, it's on http://blog.yahoo.com/_XVBWXXWGXGFK4VJXNK5EGIRKLQ/"
09:21 🔗 SmileyG lol thats actually a link :D
09:21 🔗 * SmileyG thought you'd headbutted the keyboard
09:22 🔗 SmileyG About 12,200,000 results (0.31 seconds) << google site: search
09:24 🔗 alard Is Yahoo good at blocking?
09:24 🔗 ersi uh, weird that they display ads for Yahoo! Korea on Yahoo! Vietnam
09:51 🔗 tuankiet Sorry, I am here
10:05 🔗 ersi tuankiet: alard said "Still there." to "How is Yahoo Blog?"
10:21 🔗 alard tuankiet: Do you know about these unreadable blog names? Do those blogs also have a 'normal' name, or is that ID the only way to reach them?
11:26 🔗 tuankiet I don't know. There are readable blog names like http://blog.yahoo.com/blog.thietke/ or http://blog.yahoo.com/hungno1 but some aren't readable, like http://blog.yahoo.com/_A4YNU4FWW57SQE2KMZPX5LN3RE. I am looking for the reason
11:27 🔗 tuankiet An unreadable blog name like http://blog.yahoo.com/_A4YNU4FWW57SQE2KMZPX5LN3RE is autogenerated by Yahoo.
11:30 🔗 tuankiet Ah, bad thing. There are Yahoo blogs in Vietnam, Hong Kong and Taiwan. They all use the same domain (http://blog.yahoo.com). So if we search based on http://blog.yahoo.com/{username}, we will also rescue Yahoo Blog Hong Kong and Taiwan
11:36 🔗 alard https://gist.github.com/4fa302a54ea8b5aa5c28
11:39 🔗 tuankiet Sorry to say this, but I DON'T KNOW Python. Only Pascal
11:40 🔗 alard It's a blog finder. It searches Yahoo with words from the dictionary, extracts the blog names and reports them to the tracker.
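[Editor's note] The gist itself is not reproduced in this log, so the following is only a minimal sketch of the "extracts the blog names" step alard describes, not the actual script. The function name `extract_blog_names` and the regular expression are assumptions made for illustration; the sample URLs are the ones quoted in this channel.

```python
import re

# Assumed pattern: a blog name is the first path segment after blog.yahoo.com.
# This is an illustrative guess, not the pattern used in alard's gist.
BLOG_URL_RE = re.compile(r"https?://blog\.yahoo\.com/([A-Za-z0-9_.\-]+)")


def extract_blog_names(text):
    """Return the distinct blog names found in text, in order of first appearance."""
    names = []
    seen = set()
    for match in BLOG_URL_RE.finditer(text):
        # A URL at the end of a sentence drags the full stop along; drop it.
        name = match.group(1).rstrip(".")
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    return names


if __name__ == "__main__":
    sample = (
        "http://blog.yahoo.com/blog.thietke/ or http://blog.yahoo.com/hungno1 "
        "and http://blog.yahoo.com/_A4YNU4FWW57SQE2KMZPX5LN3RE."
    )
    for blog_name in extract_blog_names(sample):
        print(blog_name)
```

Both the readable names and the autogenerated underscore-prefixed IDs fit the same pattern, so a finder along these lines would not need to treat them differently.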
11:40 🔗 chronomex are you interested in learning? most decent programmers can muddle along in python without great difficulty
11:42 🔗 chronomex (I say this not to taunt or belittle, just to be friendly)
11:43 🔗 tuankiet In Vietnam, people don't learn Python
11:43 🔗 alard There's a great niche for you then! The first Python programmer in Vietnam. :)
11:43 🔗 chronomex hah
11:44 🔗 chronomex why don't they, tuankiet?
11:44 🔗 chronomex pascal is an unusual language to learn, is it more common there?
11:44 🔗 tuankiet Very much. Every student knows Pascal because they must learn Pascal in 8th grade.
11:45 🔗 ersi Whaat, awesome
11:45 🔗 tuankiet I am in 7th and I also learn it
11:45 🔗 chronomex fascinating
11:45 🔗 ersi yeah, it really is :o
11:45 🔗 chronomex you're in 7th grade right now??
11:46 🔗 tuankiet To 11th or 11th, they learn about Microsoft Access. It's useless, too
11:46 🔗 chronomex I must say, you seem remarkably mature for 7th grade
11:47 🔗 chronomex also it's exciting to hear that lots of students learn to program
11:47 🔗 chronomex :D
11:48 🔗 tuankiet Oh, Sorry ;)
11:48 🔗 chronomex sorry for what?
11:48 🔗 tuankiet Look above: "I must say, you seem remarkably mature for 7th grade"
11:49 🔗 chronomex no, keep up the good work
11:49 🔗 tuankiet OK
11:49 🔗 chronomex many 7th grade people on the internet act immature, I absolutely didn't guess you were so young
11:50 🔗 alard I don't want to discourage this discussion, but shall I ring the #archiveteam-bs bell?
11:50 🔗 chronomex yes, I was about to do the same
11:51 🔗 chronomex tuankiet: I suggest you also join #archiveteam-bs , it's where we put off-topic conversation
11:57 🔗 tuankiet OK
12:05 🔗 tuankiet Does the Python script work?
12:08 🔗 alard Technically, yes. Practically, somewhat. The timeouts need a bit of tuning. Yahoo blocks it.
12:16 🔗 tuankiet Running on Ubuntu 12.10 and I get "no module named tornado". How can I fix this?
12:18 🔗 alard sudo apt-get install python-tornado
12:19 🔗 alard (may or may not work)
12:22 🔗 tuankiet Thanks. The code is running
12:35 🔗 alard Good. Here's a counter: http://tracker.archiveteam.org:8124/
12:39 🔗 tuankiet I know it. Thanks
12:39 🔗 tuankiet Now I will run it every day?
12:43 🔗 tuankiet Here is the output: http://pastebin.com/ZNRUA1k9
13:01 🔗 tuankiet Error 999 blocked!
13:14 🔗 alard Yahoo returns error 999 if they have had enough for a while. The script will wait for a while and then retry.
13:16 🔗 alard You may want to increase the WAIT time, to 30 seconds, for instance. (Line 16 of the script.)
13:34 🔗 tuankiet How do I make this script run forever?
13:35 🔗 SmileyG &
13:36 🔗 tuankiet What?
13:36 🔗 SmileyG i don't know what you mean by running forever.
13:36 🔗 alard Something like while true ; do python script.py ; done
13:36 🔗 SmileyG yeah, that'd work
13:39 🔗 tuankiet Thanks!
15:12 🔗 tuankiet @alard: you should change the BLOCK_TIMEOUT to 600 (10 mins)
15:12 🔗 alard tuankiet: Is that how long it takes?
15:14 🔗 tuankiet No, I think it takes 30 mins. So maybe BLOCK_TIMEOUT should be 1800
15:14 🔗 tuankiet I am not sure
15:14 🔗 tuankiet ;)
15:14 🔗 tuankiet Should be 600
15:15 🔗 tuankiet 600, sure
15:17 🔗 alard Ideally the WAIT should be long enough to avoid the blocking.
15:26 🔗 tuankiet What does the WAIT do?
15:27 🔗 alard It's the time between each request to Yahoo.
15:27 🔗 tuankiet ok
15:28 🔗 tuankiet What should it be? 60 or 120?
15:28 🔗 alard I don't know. It should be just long enough not to trigger the alarm.
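[Editor's note] To make the roles of the two settings concrete, here is a rough sketch of the behaviour described above: pause WAIT seconds between requests, and back off for BLOCK_TIMEOUT seconds whenever Yahoo answers with error 999. The names WAIT and BLOCK_TIMEOUT and the values 30 and 600 come from the conversation; the function `run_queries` and its `fetch` and `sleep` parameters are hypothetical and do not come from alard's script.

```python
import time

WAIT = 30            # seconds between requests (the value tuankiet is testing)
BLOCK_TIMEOUT = 600  # seconds to back off after a 999 (the value tuankiet proposes)


def run_queries(queries, fetch, sleep=time.sleep,
                wait=WAIT, block_timeout=BLOCK_TIMEOUT):
    """Run each query, retrying after a long pause when the server blocks us.

    fetch(query) is a caller-supplied (hypothetical) function returning a
    (status_code, body) tuple; sleep is injectable so the loop can be tried
    out without really waiting.
    """
    results = {}
    for query in queries:
        while True:
            status, body = fetch(query)
            if status == 999:
                # Blocked: wait out the block, then retry the same query.
                sleep(block_timeout)
                continue
            results[query] = body
            break
        # Normal pacing between requests, meant to avoid triggering the block.
        sleep(wait)
    return results
```

As alard notes, the ideal is a WAIT long enough that the 999 branch is never taken; BLOCK_TIMEOUT only matters once the pacing has already failed.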
15:31 🔗 tuankiet Just testing with WAIT=30. This may take a few days to find out
15:32 🔗 tuankiet And now I have to sleep. It's half past 10 in Vietnam
15:32 🔗 tuankiet Goodbye ;)
15:32 🔗 alard Goodnight.
15:55 🔗 SketchCow AAAAAAND morning
15:56 🔗 SketchCow Did we just have a 7th grade archive team member?
15:57 🔗 ersi Indeedily. He's still here and #archiveteam-bs is still there :-)
15:59 🔗 alard SketchCow: Want to join #dailybooth?
16:10 🔗 SketchCow dashcloud: All of rainsoft uploaded? 136mb?
