[00:09] *** Arcorann_ has joined #archiveteam-bs
[00:10] hmm, does anyone know a tool that'll let me spider a sub-portion of a website and only save files with a specific extension? I want to grab all of a specific filetype from one website and nothing else
[00:10] I can go into more detail if needed
[00:20] I can probably rig something up with beautifulsoup if needed, but I'm hoping there's something easier than writing code myself
[00:25] *** godane has joined #archiveteam-bs
[00:27] endrift: I think wget can do this, but not entirely sure. What should definitely work with wget is --spider, parse the log, then run the actual retrieval separately, but that's ugly.
[00:27] ah, interesting
[00:28] I had read the wget manual prior but it's big enough that I hadn't found --spider
[00:44] heh got IP banned
[00:44] I'll slow down the scan next time
[00:44] ...does curl have an option for that?
[00:45] I don't think curl has any recursion or even HTML parsing/link extraction.
[00:45] oh I extracted the links using grep
[00:45] turns out that was good enough
[00:51] *** BlueMax has joined #archiveteam-bs
[01:18] *** scorche has quit IRC (Read error: Operation timed out)
[01:20] *** SynMonger has quit IRC (hub.efnet.us irc.Prison.NET)
[01:23] *** synm0nger has joined #archiveteam-bs
[01:44] *** scorche has joined #archiveteam-bs
[01:54] *** Frogging has joined #archiveteam-bs
[02:06] *** RichardG_ has joined #archiveteam-bs
[02:06] *** RichardG has quit IRC (Read error: Connection reset by peer)
[02:07] *** BnAboyZ has quit IRC (Read error: Operation timed out)
[02:09] *** Ctrl has quit IRC (Read error: Operation timed out)
[02:13] *** bsmith093 has quit IRC (Read error: Operation timed out)
[02:16] *** BnAboyZ has joined #archiveteam-bs
[02:19] *** Ctrl has joined #archiveteam-bs
[03:25] I only got IP banned once \o/
[03:25] If you ever wanted every single GBA save file hosted on GameFAQs for some reason: I archived 'em
[03:25] I may do DS at some point too
[03:27] They used to allow savestates I think but at some point they all got purged
[03:27] which is frustrating
[04:26] *** DFJustin has quit IRC (Ping timeout: 745 seconds)
[05:03] *** Stiletto has quit IRC ()
[05:14] *** DFJustin has joined #archiveteam-bs
[05:37] *** Stiletto has joined #archiveteam-bs
[06:38] *** klg has joined #archiveteam-bs
[07:18] *** scorche has quit IRC (hub.efnet.us irc.Prison.NET)
[07:18] *** Pixi` has joined #archiveteam-bs
[07:19] *** AlsoJAA has quit IRC (Ping timeout: 258 seconds)
[07:19] *** AlsoJAA has joined #archiveteam-bs
[07:19] *** JAA sets mode: +o AlsoJAA
[07:22] *** Jake has quit IRC (Read error: Connection reset by peer)
[07:22] *** Ctrl has quit IRC (Read error: Operation timed out)
[07:24] *** jshoard has joined #archiveteam-bs
[07:25] *** BnAboyZ has quit IRC (Read error: Connection reset by peer)
[07:25] *** Mayonaise has quit IRC (Read error: Operation timed out)
[07:25] *** dxrt_ has quit IRC (Read error: Operation timed out)
[07:25] *** Wingy has quit IRC (Read error: Operation timed out)
[07:26] *** Pixi has quit IRC (Read error: Operation timed out)
[07:26] *** hook54321 has quit IRC (Ping timeout: 258 seconds)
[07:26] *** Arcorann_ has quit IRC (Read error: Operation timed out)
[07:26] *** systwi has joined #archiveteam-bs
[07:26] *** SLC has quit IRC (Read error: Operation timed out)
[07:26] *** SLC has joined #archiveteam-bs
[07:27] *** Mayonaise has joined #archiveteam-bs
[07:27] *** endrift has quit IRC (Read error: Operation timed out)
[07:29] *** gandalf has quit IRC (Ping timeout: 622 seconds)
[07:30] *** BnAboyZ has joined #archiveteam-bs
[07:30] *** sembiance has quit IRC (Read error: Operation timed out)
[07:32] *** gandalf has joined #archiveteam-bs
[07:33] *** systwi_ has quit IRC (Ping timeout: 622 seconds)
[07:33] *** paul2520 has quit IRC (Ping timeout: 622 seconds)
[07:36] *** systwi has quit IRC (Ping timeout: 622 seconds)
[07:37] *** synm0nger has quit IRC (Ping timeout: 622 seconds)
[07:38] *** Arcorann has joined #archiveteam-bs
[07:38] *** Arcorann has quit IRC (Remote host closed the connection)
[07:38] *** asdf01011 has quit IRC (Read error: Operation timed out)
[07:41] *** systwi has joined #archiveteam-bs
[07:42] *** atphoenix has quit IRC (Read error: Operation timed out)
[07:43] *** atphoenix has joined #archiveteam-bs
[07:43] *** scorche has joined #archiveteam-bs
[07:43] *** SynMonger has joined #archiveteam-bs
[07:45] *** hook54321 has joined #archiveteam-bs
[07:48] *** paul2520 has joined #archiveteam-bs
[07:49] *** Arcorann has joined #archiveteam-bs
[07:52] *** godane has quit IRC (Read error: Operation timed out)
[07:52] *** systwi has quit IRC (Read error: Operation timed out)
[07:52] *** Arcorann_ has joined #archiveteam-bs
[07:52] *** endrift has joined #archiveteam-bs
[07:53] *** Arcorann has quit IRC (Read error: Connection reset by peer)
[07:53] *** Arcorann_ has quit IRC (Remote host closed the connection)
[07:53] *** Arcorann_ has joined #archiveteam-bs
[07:53] *** systwi has joined #archiveteam-bs
[07:53] *** godane has joined #archiveteam-bs
[07:55] *** sembiance has joined #archiveteam-bs
[08:13] *** hook54321 has quit IRC (Ping timeout: 258 seconds)
[08:17] *** hook54321 has joined #archiveteam-bs
[08:22] *** Ctrl has joined #archiveteam-bs
[08:37] *** hook54321 has quit IRC (Ping timeout: 258 seconds)
[08:41] *** hook54321 has joined #archiveteam-bs
[08:56] *** BlueMax has quit IRC (Quit: Leaving)
[09:55] *** BnAboyZ has quit IRC (Read error: Connection reset by peer)
[09:58] *** Ctrl has quit IRC (Read error: Operation timed out)
[09:59] *** BnAboyZ has joined #archiveteam-bs
[10:01] *** Ctrl has joined #archiveteam-bs
[10:04] *** fuzzy802 has joined #archiveteam-bs
[10:04] *** Selavi has quit IRC (verb. to stop or discontinue)
[10:04] *** kiska has quit IRC (Read error: Connection reset by peer)
[10:05] *** SketchCow has quit IRC (Read error: Connection reset by peer)
[10:05] *** kiska has joined #archiveteam-bs
[10:05] *** Mayonaise has quit IRC (Read error: Operation timed out)
[10:05] *** SketchCow has joined #archiveteam-bs
[10:05] *** TC01 has quit IRC (Read error: Operation timed out)
[10:05] *** balrog has quit IRC (Quit: Bye)
[10:05] *** Mayonaise has joined #archiveteam-bs
[10:06] *** TC01 has joined #archiveteam-bs
[10:06] *** Raccoon has quit IRC (Ping timeout: 265 seconds)
[10:07] *** benjinsmi has joined #archiveteam-bs
[10:07] *** kiska has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** Frogging has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** Larsenv has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** benjinss has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** Yurume has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** betamax has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** SmileyG has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** _niklas has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** ats_ has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** fuzzy8021 has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** Darkstar has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** swebb has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** Fionera_ has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** jodizzle has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** JAA has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** ripdog has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** simon816 has quit IRC (hub.efnet.us irc.colosolutions.net)
[10:07] *** balrog has joined #archiveteam-bs
[10:07] *** ats has joined #archiveteam-bs
[10:09] *** jrwr has quit IRC (Read error: Operation timed out)
[10:09] *** betamax_ has joined #archiveteam-bs
[10:09] *** kyledrake has quit IRC (Ping timeout: 260 seconds)
[10:09] *** Larsenv_ has joined #archiveteam-bs
[10:09] *** revi has quit IRC (Ping timeout: 260 seconds)
[10:10] *** Meli has joined #archiveteam-bs
[10:11] *** Meli-sama has quit IRC (Ping timeout: 272 seconds)
[10:12] *** hook54321 has quit IRC (Ping timeout: 258 seconds)
[10:12] *** mgrytbak has quit IRC (Read error: Network is unreachable)
[10:13] *** mr_archiv has quit IRC (Ping timeout: 376 seconds)
[10:14] *** SketchCow has quit IRC (Read error: Connection reset by peer)
[10:15] *** Frogging has joined #archiveteam-bs
[10:15] *** Yurume has joined #archiveteam-bs
[10:15] *** SmileyG has joined #archiveteam-bs
[10:15] *** jodizzle has joined #archiveteam-bs
[10:15] *** simon816 has joined #archiveteam-bs
[10:17] *** Dark_Star has joined #archiveteam-bs
[10:20] *** bsmith094 has joined #archiveteam-bs
[10:20] *** Selavi has joined #archiveteam-bs
[10:20] *** swebb has joined #archiveteam-bs
[10:20] *** kiska has joined #archiveteam-bs
[10:20] *** ripdog has joined #archiveteam-bs
[10:21] *** mr_archiv has joined #archiveteam-bs
[10:22] *** swebb has quit IRC (Read error: Operation timed out)
[10:22] *** kiska has quit IRC (Read error: Operation timed out)
[10:24] *** swebb has joined #archiveteam-bs
[10:25] *** kiska has joined #archiveteam-bs
[10:25] *** ripdog has quit IRC (Read error: Operation timed out)
[10:25] *** ripdog has joined #archiveteam-bs
[10:26] *** swebb has quit IRC (Read error: Operation timed out)
[10:26] *** JAA has joined #archiveteam-bs
[10:26] *** kiska has quit IRC (Read error: Operation timed out)
[10:26] *** AlsoJAA sets mode: +o JAA
[10:28] *** ripdog has quit IRC (Read error: Operation timed out)
[10:29] *** JAA has quit IRC (Read error: Operation timed out)
[10:29] *** bsmith094 has quit IRC (Ping timeout: 745 seconds)
[10:31] *** kiska has joined #archiveteam-bs
[10:31] *** ripdog has joined #archiveteam-bs
[10:31] *** JAA has joined #archiveteam-bs
[10:32] *** JAA has quit IRC (Read error: Operation timed out)
[10:33] *** kiska has quit IRC (Read error: Operation timed out)
[10:34] *** ripdog has quit IRC (Read error: Operation timed out)
[10:35] *** swebb has joined #archiveteam-bs
[10:37] *** hook54321 has joined #archiveteam-bs
[10:37] *** hook54321 has quit IRC (Excess Flood)
[10:37] *** hook54321 has joined #archiveteam-bs
[10:40] *** kyledrake has joined #archiveteam-bs
[10:41] *** swebb has quit IRC (Ping timeout: 376 seconds)
[10:41] *** JAA has joined #archiveteam-bs
[10:41] *** AlsoJAA sets mode: +o JAA
[10:42] *** JAA has quit IRC (Read error: Operation timed out)
[10:44] *** mgrytbak has joined #archiveteam-bs
[10:51] *** jrwr has joined #archiveteam-bs
[10:55] *** jrwr has quit IRC (Read error: Operation timed out)
[10:55] *** swebb has joined #archiveteam-bs
[10:58] *** swebb has quit IRC (Read error: Operation timed out)
[10:59] *** jrwr has joined #archiveteam-bs
[11:04] *** jrwr has quit IRC (Ping timeout: 260 seconds)
[11:05] *** swebb has joined #archiveteam-bs
[11:12] *** Ctrl has quit IRC (Read error: Operation timed out)
[11:13] *** jrwr has joined #archiveteam-bs
[11:13] *** swebb has quit IRC (Read error: Operation timed out)
[11:14] *** VerifiedJ has joined #archiveteam-bs
[11:17] *** revi has joined #archiveteam-bs
[11:22] *** kyledrake has quit IRC (Ping timeout: 1221 seconds)
[11:28] *** BnAboyZ has quit IRC (Read error: Connection reset by peer)
[11:28] *** BnAboyZ has joined #archiveteam-bs
[11:30] *** kyledrake has joined #archiveteam-bs
[11:40] *** SynMonger has quit IRC (hub.efnet.us irc.Prison.NET)
[11:40] *** scorche has quit IRC (hub.efnet.us irc.Prison.NET)
[11:40] *** jrwr has quit IRC (Read error: Connection reset by peer)
[11:40] *** hook54321 has quit IRC (Read error: Connection reset by peer)
[11:41] *** revi has quit IRC (Read error: Connection reset by peer)
[11:41] *** mgrytbak has quit IRC (Read error: Network is unreachable)
[11:41] *** revi has joined #archiveteam-bs
[11:42] *** swebb has joined #archiveteam-bs
[11:42] *** mgrytbak has joined #archiveteam-bs
[11:42] *** hook54321 has joined #archiveteam-bs
[11:43] *** SJon___ has joined #archiveteam-bs
[11:44] *** swebb has quit IRC (Read error: Operation timed out)
[11:46] *** BnAboyZ has quit IRC (Read error: Connection reset by peer)
[11:47] *** mgrytbak has quit IRC (Ping timeout: 272 seconds)
[11:47] *** mr_archiv has quit IRC (Ping timeout: 857 seconds)
[11:48] *** mr_archiv has joined #archiveteam-bs
[11:56] *** SJon___ has quit IRC (Read error: Operation timed out)
[11:56] *** swebb has joined #archiveteam-bs
[11:58] *** mgrytbak has joined #archiveteam-bs
[12:00] *** swebb has quit IRC (Read error: Operation timed out)
[12:05] *** swebb has joined #archiveteam-bs
[12:09] *** BnAboyZ has joined #archiveteam-bs
[12:10] *** tech234a has quit IRC (Quit: Connection closed for inactivity)
[12:11] *** SynMonger has joined #archiveteam-bs
[12:13] *** swebb has quit IRC (Ping timeout: 376 seconds)
[12:13] *** revi has quit IRC (Ping timeout: 260 seconds)
[12:13] *** revi has joined #archiveteam-bs
[12:15] *** mgrytbak has quit IRC (Read error: Network is unreachable)
[12:15] *** mgrytbak has joined #archiveteam-bs
[12:16] *** Xibalba has quit IRC (Quit: ZNC - https://znc.in)
[12:16] *** Xibalba has joined #archiveteam-bs
[12:20] *** swebb has joined #archiveteam-bs
[12:21] *** prq has quit IRC (Read error: Operation timed out)
[12:22] *** swebb has quit IRC (Read error: Operation timed out)
[12:27] *** ftl has quit IRC (Quit: Connection closed for inactivity)
[12:30] *** jrwr has joined #archiveteam-bs
[12:32] *** swebb has joined #archiveteam-bs
[12:36] *** prq has joined #archiveteam-bs
[12:38] *** mgrytbak has quit IRC (Read error: Network is unreachable)
[12:39] *** hook54321 has quit IRC (Ping timeout: 258 seconds)
[12:39] *** SJon____ has joined #archiveteam-bs
[12:39] *** swebb has quit IRC (Ping timeout: 376 seconds)
[12:40] *** hook54321 has joined #archiveteam-bs
[12:40] *** mgrytbak has joined #archiveteam-bs
[12:45] *** acridAxid has quit IRC (Quit: marauder)
[12:49] *** kisspunch has quit IRC (Quit: ZNC - http://znc.in)
[12:51] *** swebb has joined #archiveteam-bs
[12:52] *** kisspunch has joined #archiveteam-bs
[12:53] *** swebb has quit IRC (Read error: Operation timed out)
[12:55] *** JonimusP has quit IRC (Ping timeout: 745 seconds)
[13:00] *** JonimusP has joined #archiveteam-bs
[13:03] *** scorche has joined #archiveteam-bs
[13:22] *** swebb has joined #archiveteam-bs
[13:26] *** swebb has quit IRC (Read error: Connection reset by peer)
[13:27] *** revi has quit IRC (Read error: Connection reset by peer)
[13:27] *** jrwr has quit IRC (Ping timeout: 260 seconds)
[13:27] *** revi has joined #archiveteam-bs
[13:28] *** jrwr has joined #archiveteam-bs
[13:33] *** swebb has joined #archiveteam-bs
[13:36] *** swebb_ has joined #archiveteam-bs
[13:36] *** JAA has joined #archiveteam-bs
[13:36] *** AlsoJAA sets mode: +o JAA
[13:37] *** swebb_ has quit IRC (Read error: Operation timed out)
[13:38] *** JAA has quit IRC (Read error: Operation timed out)
[13:39] *** swebb has quit IRC (Ping timeout: 376 seconds)
[13:43] *** swebb has joined #archiveteam-bs
[13:50] *** swebb has quit IRC (Ping timeout: 376 seconds)
[14:02] *** acridAxid has joined #archiveteam-bs
[14:08] *** mgrytbak has quit IRC (Ping timeout: 272 seconds)
[14:08] *** mgrytbak has joined #archiveteam-bs
[14:12] *** kiska has joined #archiveteam-bs
[14:21] *** JAA has joined #archiveteam-bs
[14:21] *** AlsoJAA sets mode: +o JAA
[14:23] *** mgrytbak has quit IRC (Read error: Network is unreachable)
[14:24] *** mgrytbak has joined #archiveteam-bs
[14:24] *** swebb has joined #archiveteam-bs
[14:27] *** SketchCow has joined #archiveteam-bs
[14:28] *** jrwr has quit IRC (Ping timeout: 260 seconds)
[14:29] *** revi has quit IRC (Ping timeout: 260 seconds)
[14:31] *** swebb has quit IRC (Ping timeout: 376 seconds)
[14:36] *** HP_Archiv has joined #archiveteam-bs
[14:42] *** swebb has joined #archiveteam-bs
[14:42] *** revi has joined #archiveteam-bs
[14:53] *** swebb has quit IRC (Read error: Connection reset by peer)
[14:53] *** swebb has joined #archiveteam-bs
[15:01] *** tech234a has joined #archiveteam-bs
[15:15] *** swebb has quit IRC (Read error: Operation timed out)
[15:16] *** swebb has joined #archiveteam-bs
[15:51] *** betamax_ is now known as betamax
[16:02] *** katocala has quit IRC ()
[16:04] *** katocala has joined #archiveteam-bs
[16:12] *** Larsenv_ is now known as Larsenv
[16:13] *** HP_Archiv has quit IRC (Quit: Leaving)
[16:31] *** synm0nger has joined #archiveteam-bs
[16:32] *** SynMonger has quit IRC (Read error: Connection reset by peer)
[18:46] *** Sokar has joined #archiveteam-bs
[18:56] *** semisimpl has joined #archiveteam-bs
[18:57] *** cascode1 has quit IRC (Quit: WeeChat 2.9)
[19:03] *** Jake has joined #archiveteam-bs
[19:27] *** jrwr has joined #archiveteam-bs
[19:49] *** Arcorann_ has quit IRC (Read error: Connection reset by peer)
[20:26] *** VerifiedJ has quit IRC (Quit: Leaving)
[21:24] *** britmob has quit IRC (Read error: Connection reset by peer)
[22:36] *** britmob has joined #archiveteam-bs
[22:38] *** Ctrl has joined #archiveteam-bs
[23:18] *** semisimpl has quit IRC (Quit: semisimpl)
[23:23] *** Arcorann_ has joined #archiveteam-bs
[23:46] *** jshoard has quit IRC (Quit: Leaving)
[23:55] *** britmob has quit IRC (Quit: Leaving)
[23:59] *** britmob has joined #archiveteam-bs
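
The approach discussed between 00:10 and 00:45 (pull the links out of a page, keep only one file extension, and fetch slowly enough to avoid an IP ban) could be sketched in Python roughly as follows. This is a minimal sketch, not what was actually run: the log only says the links were extracted with grep. The names `LinkCollector`, `links_with_extension`, and `fetch_slowly`, the `.sav` extension, the `example.com` URLs, and the 5-second delay are all assumptions made for illustration.

```python
import time
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_with_extension(html, base_url, extension):
    """Return absolute URLs found in html whose path ends with extension.

    The comparison is case-insensitive and ignores any query string.
    """
    collector = LinkCollector()
    collector.feed(html)
    wanted = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if urlparse(url).path.lower().endswith(extension.lower()):
            wanted.append(url)
    return wanted


def fetch_slowly(urls, delay_seconds=5.0):
    """Yield (url, body) pairs, sleeping after each request.

    The delay is a guess; a suitable value depends on the site.
    """
    for url in urls:
        with urllib.request.urlopen(url) as response:
            yield url, response.read()
        time.sleep(delay_seconds)


if __name__ == "__main__":
    # Hypothetical page content; no network access is needed for this part.
    page = '<a href="/saves/a.sav">a</a> <a href="b.html">b</a>'
    print(links_with_extension(page, "https://example.com/gba/index.html", ".sav"))
```

Only the standard library is used here, so beautifulsoup is not required for simple pages. The wget manual also documents recursive retrieval with an accept list (`-r -A`) and a `--wait` option, which may cover the same ground without writing code.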