[00:06] *** wp494 has quit IRC (Read error: Operation timed out)
[00:06] *** wp494 has joined #archiveteam-ot
[00:19] *** kiskaLap has quit IRC (Leaving)
[00:19] *** kiskaLap has joined #archiveteam-ot
[00:27] *** kiskaLap has quit IRC (Leaving)
[00:27] *** kiskaLap has joined #archiveteam-ot
[00:50] *** ta9le has quit IRC (Quit: Connection closed for inactivity)
[01:04] *** dxrt_ has quit IRC (Ping timeout: 268 seconds)
[01:08] *** dxrt_ has joined #archiveteam-ot
[01:08] *** dxrt sets mode: +o dxrt_
[02:15] *** Flashfire has joined #archiveteam-ot
[02:27] *** redlizard has quit IRC (Read error: Connection reset by peer)
[03:56] *** odemg has quit IRC (Ping timeout: 268 seconds)
[04:07] *** odemg has joined #archiveteam-ot
[06:36] *** kiska3 has joined #archiveteam-ot
[06:39] *** kiska has quit IRC (Ping timeout: 252 seconds)
[08:54] *** ivan has quit IRC (Read error: Operation timed out)
[08:54] *** JAA has quit IRC (Read error: Operation timed out)
[08:54] *** JAA has joined #archiveteam-ot
[08:54] *** ivan has joined #archiveteam-ot
[08:55] *** jspiros has quit IRC (Read error: Operation timed out)
[08:55] *** bakJAA sets mode: +o JAA
[08:55] *** SketchCow has quit IRC (Read error: Operation timed out)
[09:00] *** wp494 has quit IRC (Read error: Operation timed out)
[09:00] *** wp494 has joined #archiveteam-ot
[09:03] *** schbirid has joined #archiveteam-ot
[09:18] *** BlueMaxim has joined #archiveteam-ot
[09:23] *** BlueMax has quit IRC (Read error: Operation timed out)
[09:29] *** betamax has quit IRC (Read error: Operation timed out)
[09:32] *** betamax has joined #archiveteam-ot
[09:54] *** ta9le has joined #archiveteam-ot
[09:59] *** jspiros has joined #archiveteam-ot
[10:29] *** Flashfire has quit IRC (Bye)
[10:38] *** dxrt has quit IRC (Remote host closed the connection)
[10:38] *** dxrt has joined #archiveteam-ot
[10:59] *** vitzli has joined #archiveteam-ot
[11:04] *** t2t2 has quit IRC (Read error: Operation timed out)
[11:05] *** t2t2 has joined #archiveteam-ot
[14:16] *** vitzli has quit IRC (Quit: Leaving)
[15:30] *** schbirid has quit IRC (Quit: Leaving)
[15:30] *** schbirid has joined #archiveteam-ot
[16:11] *** jut_ has joined #archiveteam-ot
[16:14] *** jut has quit IRC (Ping timeout: 252 seconds)
[16:29] *** BlueMaxim has quit IRC (Leaving)
[17:50] *** icedice has joined #archiveteam-ot
[17:51] Does anybody here know if HGST HDDs are still good after the Western Digital merger?
[17:57] *** icedice has quit IRC (Quit: Leaving)
[17:58] *** icedice has joined #archiveteam-ot
[18:09] discussing reliability benefits for single/few hard drives is idiotic
[18:09] pick any you like
[18:09] you have backups so why worry about it
[18:25] *** SketchCow has joined #archiveteam-ot
[18:54] *** godane has quit IRC (Ping timeout: 252 seconds)
[20:29] *** MrRadar has quit IRC (Read error: Operation timed out)
[20:30] *** Soni has quit IRC (Ping timeout: 264 seconds)
[20:35] *** arkiver has quit IRC (Ping timeout: 360 seconds)
[20:37] *** arkiver has joined #archiveteam-ot
[20:37] *** Soni has joined #archiveteam-ot
[20:41] *** Stiletto has quit IRC (Ping timeout: 360 seconds)
[20:42] *** Stilett0 has joined #archiveteam-ot
[20:44] *** dxrt- has joined #archiveteam-ot
[20:44] *** chirlu` has quit IRC (Excess Flood)
[20:45] *** dxrt has quit IRC (Ping timeout: 360 seconds)
[20:49] *** MrRadar has joined #archiveteam-ot
[20:50] *** chirlu has joined #archiveteam-ot
[20:53] *** astrid has quit IRC (Read error: Operation timed out)
[20:53] *** zino has quit IRC (Read error: Connection reset by peer)
[20:53] *** zino has joined #archiveteam-ot
[20:55] *** MrRadar has quit IRC (Read error: Connection reset by peer)
[20:55] *** MrRadar_ has joined #archiveteam-ot
[20:59] *** astrid has joined #archiveteam-ot
[21:00] *** schbirid has quit IRC (Ping timeout: 1212 seconds)
[21:03] *** icedice has quit IRC (Ping timeout: 252 seconds)
[22:27] *** nightpool has joined #archiveteam-ot
[22:39] Anyone know a grep pattern/command/whatever that I could use to extract all URLs from a text document?
[22:48] hook54321: That's a fairly difficult problem, especially when it also involves stuff like "check out this page (http://domain/file.html)" where you'd want to not include the closing parenthesis.
[22:49] But if there's whitespace around the URLs, try grep -o 'http\S*' file
[22:49] (Which would obviously also find things that aren't URLs.)
[22:50] http URLs*
[22:52] Here's another simple one that would extract URLs with any scheme and be a bit more strict about it: grep -o '\<[a-z]\+://\S*'
[23:01] JAA: With that second pattern there's still " after them (and some other characters after ")
[23:01] on most of them at least
[23:02] Yeah, that's the "difficult" part I mentioned above.
[23:06] hook54321: grep -Po '(^|\s)([^a-z]?)\K[a-z]+://\S*(?=\2(\s|$))' file
[23:06] That still won't deal with parentheses correctly though.
[23:06] It does handle quotes around URLs though.
[23:11] ok, thanks
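
The parentheses case left open above is awkward to express in a single grep pattern, so here is a minimal sketch of one possible approach in Python instead. It is a heuristic, not a full URL parser. The names `extract_urls`, `CANDIDATE`, and `TRAILING`, as well as the sample text, are assumptions made up for illustration and do not come from the channel.

```python
import re

# Candidate: a lowercase scheme, "://", then characters up to the next
# whitespace, double quote, or angle bracket. Those three delimiters may not
# appear unescaped in a URL, so stopping there also drops a closing " and
# whatever markup follows it.
CANDIDATE = re.compile(r'\b[a-z][a-z0-9+.-]*://[^\s"<>]+')

# Trailing characters assumed to belong to the surrounding prose (heuristic).
TRAILING = ".,;:!?'"


def extract_urls(text):
    """Return URL-like strings found in text, trimmed of trailing prose."""
    urls = []
    for match in CANDIDATE.finditer(text):
        url = match.group(0)
        while url:
            last = url[-1]
            if last in TRAILING:
                # Sentence punctuation or a closing single quote.
                url = url[:-1]
            elif last == ")" and url.count(")") > url.count("("):
                # Unbalanced ")": assume it closes a parenthesis opened
                # before the URL, as in "(http://domain/file.html)".
                url = url[:-1]
            else:
                break
        urls.append(url)
    return urls


if __name__ == "__main__":
    sample = (
        "check out this page (http://domain/file.html) and "
        '<a href="https://example.org/a_(b)">x</a>, or ftp://host/path.'
    )
    for url in extract_urls(sample):
        print(url)
```

A closing parenthesis is only dropped when the candidate contains more `)` than `(`, so a URL with balanced parentheses in its path is kept whole. Like the grep patterns above, this will still accept strings that merely look like URLs.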