#archiveteam-ot 2018-10-04,Thu

Time Nickname Message
00:01 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
00:02 🔗 Stilett0 has joined #archiveteam-ot
00:03 🔗 Stiletto has quit IRC (Ping timeout: 264 seconds)
00:21 🔗 Stiletto has joined #archiveteam-ot
00:23 🔗 Stilett0 has quit IRC (Ping timeout: 246 seconds)
00:36 🔗 Stilett0 has joined #archiveteam-ot
00:36 🔗 Stiletto has quit IRC (Ping timeout: 260 seconds)
01:24 🔗 Stiletto has joined #archiveteam-ot
01:26 🔗 Stilett0 has quit IRC (Ping timeout: 252 seconds)
01:31 🔗 Stilett0 has joined #archiveteam-ot
01:32 🔗 Stiletto has quit IRC (Ping timeout: 264 seconds)
01:38 🔗 Stiletto has joined #archiveteam-ot
01:41 🔗 Stilett0 has quit IRC (Ping timeout: 260 seconds)
01:46 🔗 odemg has quit IRC (Ping timeout: 260 seconds)
01:58 🔗 odemg has joined #archiveteam-ot
02:56 🔗 Despatche has joined #archiveteam-ot
03:01 🔗 Stilett0 has joined #archiveteam-ot
03:03 🔗 Stiletto has quit IRC (Ping timeout: 264 seconds)
03:36 🔗 odemg has quit IRC (Ping timeout: 260 seconds)
03:41 🔗 Despatche has quit IRC (Read error: Operation timed out)
03:48 🔗 odemg has joined #archiveteam-ot
04:04 🔗 odemg has quit IRC (Ping timeout: 260 seconds)
04:04 🔗 odemg has joined #archiveteam-ot
04:50 🔗 Stiletto has joined #archiveteam-ot
04:52 🔗 Stilett0 has quit IRC (Read error: Operation timed out)
05:35 🔗 BlueMax has quit IRC (Remote host closed the connection)
05:35 🔗 m007a83_ has joined #archiveteam-ot
05:36 🔗 asie has quit IRC (Read error: Operation timed out)
05:37 🔗 BlueMax has joined #archiveteam-ot
05:38 🔗 Jens has quit IRC (Read error: Operation timed out)
05:39 🔗 m007a83 has quit IRC (Read error: Operation timed out)
05:41 🔗 wp494 has quit IRC (Ping timeout: 633 seconds)
05:43 🔗 wp494 has joined #archiveteam-ot
05:48 🔗 m007a83_ has quit IRC (Quit: Fuck you Comcast)
06:15 🔗 Jens has joined #archiveteam-ot
06:53 🔗 Mateon1 has quit IRC (Ping timeout: 268 seconds)
06:53 🔗 Mateon1 has joined #archiveteam-ot
06:57 🔗 asie has joined #archiveteam-ot
07:19 🔗 dashcloud has quit IRC (Read error: Operation timed out)
07:23 🔗 dashcloud has joined #archiveteam-ot
07:45 🔗 MrRadar has quit IRC (Read error: Operation timed out)
07:51 🔗 MrRadar has joined #archiveteam-ot
08:20 🔗 Odd0002_ has joined #archiveteam-ot
08:22 🔗 Odd0002 has quit IRC (Read error: Operation timed out)
08:22 🔗 Odd0002_ is now known as Odd0002
08:29 🔗 godane has joined #archiveteam-ot
08:29 🔗 svchfoo3 sets mode: +o godane
09:24 🔗 ivan https://github.com/ludios/item-maker
09:41 🔗 ivan for mass twitter archiving I am ignoring ^https?://pbs\.twimg\.com/(emoji|profile_images)/
09:41 🔗 ivan hopefully most of the profile images are in other captures
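
[editor's note] A minimal sketch of how the ignore pattern ivan quotes would classify URLs. The regex is copied from the log; the helper name `is_ignored` and the sample URLs are made up for illustration, and grab-site's own matching logic may differ.

```python
import re

# Pattern quoted verbatim from ivan's message above.
IGNORE = re.compile(r"^https?://pbs\.twimg\.com/(emoji|profile_images)/")

def is_ignored(url: str) -> bool:
    """Return True if the URL matches the ignore pattern (hypothetical helper)."""
    return IGNORE.search(url) is not None

# Sample URLs are invented for illustration only.
samples = [
    "https://pbs.twimg.com/profile_images/123/avatar.jpg",
    "http://pbs.twimg.com/emoji/v2/72x72/1f600.png",
    "https://pbs.twimg.com/media/abc.jpg",
]
for url in samples:
    print(url, "->", "ignored" if is_ignored(url) else "kept")
```

Only the `emoji` and `profile_images` paths are skipped here; other `pbs.twimg.com` paths such as `media` would still be fetched.
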
11:24 🔗 ivan https://github.com/ludios/grab-site/tree/html5-parser still baking
11:29 🔗 ivan if anyone has a dire need for it right now add @html5-parser to the pip3 install url
11:33 🔗 * ivan goes to actually make it work
11:40 🔗 ivan I thought I ran it with the new --html-parser but no and now I realize it's a little trickier
11:42 🔗 ivan wpull works on its own objects created by HTMLParserTarget instead of the lxml tree
11:57 🔗 ivan mmm multiple inheritance
12:43 🔗 ivan ok, it's back to scraping links again
12:55 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
13:07 🔗 ivan does anyone use {primary_url} in their own grab-site ignores? tempted to remove it as I rewrite ignoracle
13:14 🔗 kiska1 has quit IRC (Read error: Operation timed out)
13:15 🔗 mal_ has quit IRC (Write error: Broken pipe)
13:15 🔗 djsundog has quit IRC (Read error: Operation timed out)
13:15 🔗 dxrt_ has quit IRC (Read error: Operation timed out)
13:15 🔗 Albardin has quit IRC (Write error: Broken pipe)
13:49 🔗 m007a83 has joined #archiveteam-ot
13:56 🔗 kiska1 has joined #archiveteam-ot
14:03 🔗 mal_ has joined #archiveteam-ot
14:12 🔗 Albardin has joined #archiveteam-ot
14:12 🔗 djsundog has joined #archiveteam-ot
14:13 🔗 dxrt_ has joined #archiveteam-ot
14:13 🔗 dxrt sets mode: +o dxrt_
14:44 🔗 Albardin has quit IRC (Read error: Operation timed out)
14:44 🔗 mal_ has quit IRC (Write error: Broken pipe)
14:44 🔗 kiska1 has quit IRC (Read error: Operation timed out)
14:44 🔗 dxrt_ has quit IRC (Read error: Operation timed out)
14:44 🔗 djsundog has quit IRC (Read error: Operation timed out)
15:03 🔗 ivan I cherry-picked some of the early wpull 2.0 commits and now grab-site is running on Python 3.7 :-)
15:12 🔗 adinbied has quit IRC (Read error: Operation timed out)
15:16 🔗 JAA ivan: Did you see my ignoracle changes that were merged into ArchiveBot recently? Not sure regarding {primary_url} and {primary_netloc}, but it's used in ArchiveBot sometimes (singletumblr igset, for example).
15:16 🔗 JAA I think the only changes necessary in wpull for 3.7 are replacing asyncio.async with asyncio.ensure_future.
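
[editor's note] A small sketch of the replacement JAA describes. `async` became a reserved keyword in Python 3.7, so `asyncio.async(...)` no longer parses there; `asyncio.ensure_future(...)` is the documented equivalent. The `fetch` coroutine is a made-up stand-in, not wpull code.

```python
import asyncio

async def fetch(n):
    """Hypothetical coroutine standing in for real work."""
    await asyncio.sleep(0)
    return n * 2

async def main():
    # Older code would have written: asyncio.async(fetch(21))
    # That spelling is a SyntaxError on Python 3.7+.
    task = asyncio.ensure_future(fetch(21))
    return await task

result = asyncio.run(main())  # asyncio.run requires Python 3.7+
print(result)
```
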
15:17 🔗 adinbied has joined #archiveteam-ot
15:20 🔗 eLbot has quit IRC (Ping timeout: 268 seconds)
15:20 🔗 kiskabak has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 w0rmybak has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 robogoat has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 fenn has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 Igloo has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 Igloo has joined #archiveteam-ot
15:21 🔗 fenn has joined #archiveteam-ot
15:21 🔗 ivan I did not see them, I will take a look
15:21 🔗 ivan I was planning on combining all the ignores into one and putting into pyre2
15:21 🔗 jut has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 mr_archiv has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 svchfoo1 has quit IRC (Ping timeout: 268 seconds)
15:21 🔗 ivan putting it into
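
[editor's note] One way to read "combining all the ignores into one": join the individual patterns into a single alternation so each URL is tested once. This sketch uses the standard-library `re` module as a stand-in; ivan mentions pyre2, whose API is not shown here. The function name `combine_ignores` and the second pattern are assumptions for illustration.

```python
import re

def combine_ignores(patterns):
    """Join patterns into one regex; each is wrapped in a non-capturing group
    so its own alternations and anchors stay scoped to that pattern."""
    return re.compile("|".join("(?:%s)" % p for p in patterns))

patterns = [
    r"^https?://pbs\.twimg\.com/(emoji|profile_images)/",  # from the log
    r"[?&]replytocom=",                                     # hypothetical second ignore
]
combined = combine_ignores(patterns)

def is_ignored(url):
    return combined.search(url) is not None
```

A single combined pattern trades per-pattern diagnostics (which ignore matched) for one pass over each URL.
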
15:22 🔗 wp494 has quit IRC (Read error: Operation timed out)
15:22 🔗 eLbot has joined #archiveteam-ot
15:23 🔗 wp494 has joined #archiveteam-ot
15:23 🔗 JAA Oof, the ping timeouts.
15:28 🔗 kiska1 has joined #archiveteam-ot
15:29 🔗 kiska Yeah Choopa and portlane are very unstable
15:29 🔗 ivan how come my WebSocketClientProtocol.on_close isn't working after switching from trollius to asyncio
15:31 🔗 ivan onClose I mean
15:32 🔗 mal_ has joined #archiveteam-ot
15:35 🔗 robogoat has joined #archiveteam-ot
15:41 🔗 MrRadar2 has quit IRC (Ping timeout: 268 seconds)
15:42 🔗 ivan I'm sure autobahn is doing something terrible like swallowing an exception
15:42 🔗 SketchCow has quit IRC (Ping timeout: 268 seconds)
15:43 🔗 Albardin has joined #archiveteam-ot
15:43 🔗 mr_archiv has joined #archiveteam-ot
15:43 🔗 djsundog has joined #archiveteam-ot
15:43 🔗 dxrt_ has joined #archiveteam-ot
15:43 🔗 dxrt sets mode: +o dxrt_
15:44 🔗 SketchCow has joined #archiveteam-ot
15:45 🔗 BnAboyZ has quit IRC (Ping timeout: 268 seconds)
15:47 🔗 MrRadar2 has joined #archiveteam-ot
15:48 🔗 BnAboyZ has joined #archiveteam-ot
15:53 🔗 adinbied has quit IRC (Read error: Operation timed out)
15:58 🔗 VerifiedJ has joined #archiveteam-ot
15:59 🔗 kiskabak has joined #archiveteam-ot
15:59 🔗 w0rmybak has joined #archiveteam-ot
15:59 🔗 svchfoo1 has joined #archiveteam-ot
16:00 🔗 svchfoo3 sets mode: +o svchfoo1
16:00 🔗 jut has joined #archiveteam-ot
16:19 🔗 godane has quit IRC (Read error: Operation timed out)
16:30 🔗 adinbied has joined #archiveteam-ot
16:43 🔗 ivan I google for "'NoneType' object has no attribute 'resume_reading'" because it's a thing I'm seeing on Python 3.7 and on the second page... ArchiveBot dashboard 3.0
16:44 🔗 ivan https://github.com/ludios/wpull/issues/3
16:45 🔗 Despatche has joined #archiveteam-ot
16:46 🔗 ivan if someone could write a crawler in a good language that would be great
16:52 🔗 JAA Haha
16:52 🔗 JAA I blame asyncio. Its network stack is awful.
16:54 🔗 JAA Well, and Tornado. Removing that from wpull is actually one of the things on my todo list.
16:55 🔗 ivan it looks like asyncio was refactored to use async functions and I guess someone fucked it up
16:55 🔗 ivan everything can change across an `await`
16:56 🔗 JAA That was always the case though.
16:56 🔗 m007a83_ has joined #archiveteam-ot
16:56 🔗 JAA Even when it was all directly generator-based etc.
16:56 🔗 ivan sure
16:57 🔗 VerifiedJ has quit IRC (Read error: Operation timed out)
16:57 🔗 JAA On the other hand, I haven't noticed any issues with my aiohttp-based script so far, even after a few hundred million requests.
16:58 🔗 JAA I'm still running that on 3.4 though, so maybe that's not too surprising.
16:58 🔗 JAA If something went wrong in the @asyncio.coroutine -> async def conversion.
16:58 🔗 VerifiedJ has joined #archiveteam-ot
16:58 🔗 m007a83 has quit IRC (Read error: Operation timed out)
17:03 🔗 mgrytbak has quit IRC (Ping timeout: 492 seconds)
17:05 🔗 ivan ok I patched asyncio to check if _transport is not None
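
[editor's note] A toy illustration of the kind of guard ivan describes, combined with the earlier point that state can change across an `await`. `ToyReader` is a hypothetical class, not asyncio's stream reader, and this is not the actual patch; it only shows why checking `_transport is not None` avoids the AttributeError.

```python
import asyncio

class ToyReader:
    """Hypothetical stand-in for a reader that holds a transport reference."""

    def __init__(self):
        self._transport = object()  # placeholder for a live transport
        self.resumed = False

    def close(self):
        # Closing drops the transport reference.
        self._transport = None

    async def read_then_resume(self):
        await asyncio.sleep(0)  # other code may run here and close the reader
        if self._transport is not None:  # guard in the spirit of the patch
            self.resumed = True  # real code would call resume_reading() here
        return self.resumed

async def main():
    reader = ToyReader()
    task = asyncio.ensure_future(reader.read_then_resume())
    reader.close()  # happens before the task reaches its guard
    return await task

resumed = asyncio.run(main())
print(resumed)
```

Without the guard, the equivalent of `self._transport.resume_reading()` would be called on `None`.
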
17:06 🔗 mgrytbak_ has joined #archiveteam-ot
17:15 🔗 apache2_ has quit IRC (Remote host closed the connection)
17:15 🔗 apache2 has joined #archiveteam-ot
17:16 🔗 Despatche has quit IRC (Quit: Error: Connection reset by peer)
17:28 🔗 ivan https://github.com/ludios/grab-site/commit/9b9682b72209dab1b4e7f149e513175f00a03592#diff-30267b0aeba882fe1003b704fbed5804
17:34 🔗 ivan I didn't even patch the right file, great
17:34 🔗 JAA Hmm, that error crashed wpull? I thought it just produces an error.
17:36 🔗 ivan oh, I'm not running wpull 2.0.3 but rather just enough patches to get wpull 1.2.3 running on Python 3.7
17:36 🔗 ivan maybe I should stop and use the real 2.x
17:37 🔗 JAA Aah, I see.
17:37 🔗 JAA Well, if you're a masochist, maybe you should. :-)
17:40 🔗 kiska xD
18:04 🔗 godane has joined #archiveteam-ot
18:04 🔗 svchfoo1 sets mode: +o godane
18:14 🔗 ivan oh I see the thing I've been fighting was fixed in 0c517bab510ff9555bff055b4e1e78a807e0bd90 in the fork
18:15 🔗 ivan makes total sense for asyncio to be raising AttributeError!
18:18 🔗 JAA Ah, so that's why it's not a crash in ArchiveBot.
18:18 🔗 wp494 has quit IRC (west.us.hub irc.Prison.NET)
18:34 🔗 wp494 has joined #archiveteam-ot
18:43 🔗 vectr0n has quit IRC (Quit: ZNC - https://znc.in)
19:03 🔗 dxrt has quit IRC (Ping timeout: 360 seconds)
19:03 🔗 dxrt has joined #archiveteam-ot
19:06 🔗 vectr0n has joined #archiveteam-ot
19:06 🔗 wp494 has quit IRC (Read error: Operation timed out)
19:06 🔗 Igloo_ has joined #archiveteam-ot
19:06 🔗 Igloo has quit IRC (Ping timeout: 360 seconds)
19:06 🔗 wp494 has joined #archiveteam-ot
19:08 🔗 arkiver has quit IRC (Ping timeout: 360 seconds)
19:10 🔗 w0rmybak has quit IRC (Ping timeout: 268 seconds)
19:10 🔗 kiskabak has quit IRC (Ping timeout: 268 seconds)
19:12 🔗 arkiver has joined #archiveteam-ot
19:14 🔗 nightpool has quit IRC (Read error: Operation timed out)
19:14 🔗 nightpool has joined #archiveteam-ot
19:15 🔗 Albardin has quit IRC (Ping timeout: 600 seconds)
19:15 🔗 MrRadar has quit IRC (Read error: Connection reset by peer)
19:16 🔗 robogoat has quit IRC (Ping timeout: 360 seconds)
19:19 🔗 chirlu` has quit IRC (Read error: Operation timed out)
19:23 🔗 robogoat has joined #archiveteam-ot
19:29 🔗 chirlu has joined #archiveteam-ot
19:38 🔗 kiskabak has joined #archiveteam-ot
19:38 🔗 w0rmybak has joined #archiveteam-ot
19:38 🔗 Albardin has joined #archiveteam-ot
19:48 🔗 m007a83_ is now known as m007a83
19:51 🔗 MrRadar has joined #archiveteam-ot
20:03 🔗 mgrytbak_ is now known as mgrytbak
21:39 🔗 BlueMax has joined #archiveteam-ot
21:40 🔗 SimpBrain has joined #archiveteam-ot
21:42 🔗 Famicoman has quit IRC (Quit: Famicoman)
21:50 🔗 VerifiedJ has quit IRC (Quit: Leaving)
21:52 🔗 Flashfire I will be out today with family so unable to help monitor archivebot
21:53 🔗 Stiletto has quit IRC ()
21:57 🔗 w0rmhole has quit IRC (Excess Flood)
21:57 🔗 w0rmhole has joined #archiveteam-ot
22:04 🔗 w0rmhole !ao < https://transfer.sh/13hV5W/-_WikiTeam-tweets --igset twitter --delay 0