#internetarchive 2017-03-28,Tue

Time Nickname Message
01:28 🔗 edsu has quit IRC (Read error: Operation timed out)
01:35 🔗 edsu has joined #internetarchive
01:39 🔗 X-Scale has quit IRC (Ping timeout: 240 seconds)
01:40 🔗 [X-Scale] has joined #internetarchive
03:44 🔗 nicolas17 has joined #internetarchive
03:44 🔗 nicolas17 hi, what's the user-agent used by the wayback machine crawler?
03:47 🔗 nicolas17 I'm trying to archive http://www.casarosada.gob.ar/informacion/eventos-destacados-presi/39048-el-gobierno-firmo-convenios-con-la-casa-ana-frank-para-promover-el-dialogo-y-la-tolerancia and I'm getting a blank page
03:48 🔗 nicolas17 so I'm wondering if they block IA
03:48 🔗 xmc https://archive.org/details/archive.org_bot
03:48 🔗 xmc "User Agent archive.org_bot is used for our wide crawl of the web"
03:48 🔗 nicolas17 I'm using the "Save Page Now" thing, will that do things differently?
03:48 🔗 xmc i'm not sure
03:49 🔗 nicolas17 they do block curl (response becomes "<html><head></head><body><br><br></body></html>" if the user-agent is curl)
03:50 🔗 xmc that's positively adversarial
03:50 🔗 nicolas17 but curl -A "archive.org_bot" gives the normal page
03:50 🔗 nicolas17 so I dunno
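
[editor's note] The check nicolas17 describes (same URL, different User-Agent, compare what comes back) can be sketched roughly as below. This is only an illustration, not what Save Page Now or the wide crawler actually does: the helper names `fetch_body`, `compare_user_agents` and `looks_blocked`, the 10% size threshold, and the `curl/7.52.1` version string are all assumptions made up for the example. Only the `archive.org_bot` string and the URL come from the conversation above.

```python
import urllib.request
from typing import Callable, Dict, Iterable, List


def fetch_body(url: str, user_agent: str, timeout: float = 10.0) -> bytes:
    """Fetch url with the given User-Agent header and return the raw body.

    Note: urlopen raises urllib.error.HTTPError on 4xx/5xx responses;
    this sketch lets that propagate instead of handling it.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def compare_user_agents(
    url: str,
    user_agents: Iterable[str],
    fetch: Callable[[str, str], bytes] = fetch_body,
) -> Dict[str, int]:
    """Return the response body size in bytes for each User-Agent."""
    return {ua: len(fetch(url, ua)) for ua in user_agents}


def looks_blocked(sizes: Dict[str, int], ratio: float = 0.1) -> List[str]:
    """List User-Agents whose body is far smaller than the largest one.

    The ratio is an arbitrary heuristic (hypothetical), not a known rule.
    """
    largest = max(sizes.values(), default=0)
    return [ua for ua, n in sizes.items() if largest and n < largest * ratio]


if __name__ == "__main__":
    # URL taken from the log; the site may behave differently today.
    url = (
        "http://www.casarosada.gob.ar/informacion/eventos-destacados-presi/"
        "39048-el-gobierno-firmo-convenios-con-la-casa-ana-frank-para-"
        "promover-el-dialogo-y-la-tolerancia"
    )
    sizes = compare_user_agents(url, ["curl/7.52.1", "archive.org_bot"])
    print(sizes)
    print("possibly blocked:", looks_blocked(sizes))
```

A near-empty body for one User-Agent and a full page for another would match what was seen with curl here. It would not by itself explain the blank Save Page Now capture, since the log never establishes which User-Agent that service sends.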
04:47 🔗 X-Scale has joined #internetarchive
04:48 🔗 [X-Scale] has quit IRC (Ping timeout: 240 seconds)
06:14 🔗 Stilett0 has quit IRC (Ping timeout: 250 seconds)
06:22 🔗 nicolas17 has quit IRC (Quit: Konversation terminated!)
07:34 🔗 atomotic has joined #internetarchive
08:00 🔗 X-Scale has quit IRC (Quit: Try HydraIRC -> http://www.hydrairc.com <-)
08:23 🔗 atomotic has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
08:28 🔗 atomotic has joined #internetarchive
09:33 🔗 atomotic has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
09:34 🔗 kyounko has joined #internetarchive
10:08 🔗 atomotic has joined #internetarchive
14:00 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
15:21 🔗 nicolas17 has joined #internetarchive
15:30 🔗 atomotic has joined #internetarchive
16:51 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
17:14 🔗 Stilett0 has joined #internetarchive
20:31 🔗 SketchCow has quit IRC (Quit: Lost terminal)
20:38 🔗 SketchCow has joined #internetarchive
23:19 🔗 butchster has quit IRC (Ping timeout: 492 seconds)
23:25 🔗 kyan has joined #internetarchive
