[00:20] *** WinterFox has joined #archiveteam
[01:21] *** Honno has quit IRC (Read error: Operation timed out)
[01:49] *** philpem has quit IRC (Ping timeout: 260 seconds)
[01:50] *** Pudsey has joined #archiveteam
[01:51] *** Pudsey has quit IRC (Remote host closed the connection)
[02:12] *** ariscop_ has joined #archiveteam
[02:18] *** ariscop has quit IRC (Read error: Operation timed out)
[02:38] *** tfgbd_znc has joined #archiveteam
[02:39] *** achip has quit IRC (Ping timeout: 258 seconds)
[02:39] *** VADemon_ has joined #archiveteam
[02:42] *** VADemon has quit IRC (Ping timeout: 258 seconds)
[02:42] *** achip has joined #archiveteam
[03:33] *** MMovie1 has quit IRC (Read error: Operation timed out)
[03:34] *** MMovie has joined #archiveteam
[04:28] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[04:36] *** Sk1d has joined #archiveteam
[04:43] *** Pudsey has joined #archiveteam
[04:45] How can you search the justin tv archive if you only know the channel name? For blip you had a big list of urls you could filter.
[05:32] *** xmc sets mode: +b *!uid118096@*
[05:35] *** bsmith093 has quit IRC (Ping timeout: 370 seconds)
[05:54] *** bsmith093 has joined #archiveteam
[06:07] *** JesseW has joined #archiveteam
[06:36] *** Pudsey has quit IRC (Remote host closed the connection)
[06:48] *** JesseW has quit IRC (Read error: Operation timed out)
[06:58] *** ariscop has joined #archiveteam
[07:05] *** ariscop_ has quit IRC (Read error: Operation timed out)
[07:11] *** tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
[07:29] *** Emcy_ has quit IRC (Read error: Operation timed out)
[07:48] *** chazchaz has quit IRC (Quit: leaving)
[07:48] *** chazchaz has joined #archiveteam
[07:49] https://www.destructoid.com/overwatch-porn-creators-that-rip-the-game-s-assets-are-getting-notices-of-copyright-infringement-364472.phtml
[08:09] *** philpem has joined #archiveteam
[09:21] *** hook54321 has quit IRC (Quit: Connection closed for inactivity)
[09:21] Sanqui: awesome!!
[09:21] We'll also start a grab soon for the actual wikis, but for now only external URLs.
[09:21] yeah, that's wonderful
[09:22] I mean, external urls are wonderful. I fear for them more than for the wikis themselves
[09:23] here's one: hiddenpalace:hiddenpalace.org/w/api.php:hiddenpalace.org/
[09:25] *** BartoCH has quit IRC (Ping timeout: 260 seconds)
[09:25] so, is there a list of existing ones? and what's the formal process for submitting? :)
[09:26] https://github.com/ArchiveTeam/wikis-items
[09:27] Just create a list and send it to me or create a PR on github
[09:27] That should do it
[09:28] okay, I can do that. should I just create a new file with my own list, or should I make an attempt at organization?
[09:28] the eu notation is a bit awkward, couldn't eu items be autogenerated from regular ones?
[09:28] Well, that's the name the item is given for the external URLs grab.
[09:29] What do you mean by an attempt at organization?
[09:29] *** atomotic has joined #archiveteam
[09:29] Well, it seems to me that you'll need to define each wiki twice, once for their grab and once for an external URL grab
[09:30] if I make a pull request, should I just make a new file (20_, 21_...) for each wiki?
[09:30] Yes
[09:30] *** Emcy has joined #archiveteam
[09:30] Err
[09:30] You don't need to create a new file for each wiki.
[09:30] But you do have to add the mediawikieu, since these lists are exactly the lists that are added to the tracker.
[09:31] I got it.
[09:31] That also means the lists are later on also created without the 'eu' part, to grab the real wiki pages.
[09:31] *** BartoCH has joined #archiveteam
[09:32] Alright.
[09:32] I guess I'll just name it 20_mediawikieu_assorted or something? :)
[09:32] If you have a wikifarm of wikis, like wikia, you can add a list with a name like '15_mediawikieu_wikifarm.com'
[09:32] yeah, no farm
[09:32] else just '15_mediawikieu' and then have a list of different wikis
[09:32] yeah, do what's best for you
[09:32] nod! thank you!
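Editor's note: the exchange above describes items such as hiddenpalace:hiddenpalace.org/w/api.php:hiddenpalace.org/ and list names that differ only by the 'eu' part. A minimal sketch of how the regular list name could be derived from the external-URL one is given here. It assumes the three colon-separated fields mean name, API path, and base path (the log does not spell this out) and that no field contains a scheme like https://. The names WikiItem, parse_item, and regular_list_name are hypothetical and are not part of the wikis-items repository.

```python
from typing import NamedTuple


class WikiItem(NamedTuple):
    """One entry from a wiki list (field meanings are an assumption)."""
    name: str
    api: str
    base: str


def parse_item(line: str) -> WikiItem:
    """Split a 'name:api:base' entry into its three fields.

    Assumes the fields carry no URL scheme, so ':' only ever
    appears as the field separator.
    """
    parts = line.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 3 colon-separated fields, got {len(parts)}")
    return WikiItem(*parts)


def regular_list_name(eu_list_name: str) -> str:
    """Derive the regular list name by dropping the 'eu' part.

    Hypothetical helper: replaces the first 'mediawikieu' with 'mediawiki'.
    """
    return eu_list_name.replace("mediawikieu", "mediawiki", 1)


item = parse_item("hiddenpalace:hiddenpalace.org/w/api.php:hiddenpalace.org/")
print(item.name, item.api, item.base)
print(regular_list_name("15_mediawikieu_wikifarm.com"))
```

With a mapping like this, each wiki would only need to be listed once and the second list could be generated from the first.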
[09:33] *** Honno has joined #archiveteam
[09:34] Thank you too :)
[09:40] *** metalcamp has joined #archiveteam
[09:41] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[09:43] fwiw, the api link could be automatically scraped from special:version
[09:53] *** Fake-Nam1 has joined #archiveteam
[09:55] *** Fake-Name has quit IRC (Read error: Operation timed out)
[10:03] The guys behind the Osu! forums are releasing a new forum "soon" https://osu.ppy.sh/forum/ - http://blog.ppy.sh/post/143838044998/2016-04-dev-meeting not sure what is happening to the old forums, but I think we should get a copy
[10:26] *** Fake-Name has joined #archiveteam
[10:28] *** Fake-Nam1 has quit IRC (Read error: Operation timed out)
[10:34] can the osu! ...uhm, level files be backed up?
[10:39] or is the goal moreso the forums?
[10:40] merely forums feels a bit incomplete without the osu level files
[10:52] the beatmaps require a login
[10:52] and I think they limit how many unless you have osu! supporter
[11:32] *** philpem has quit IRC (Read error: Connection reset by peer)
[11:40] *** arkiver2 has joined #archiveteam
[12:02] *** ndiddy has joined #archiveteam
[12:04] *** ndiddy has quit IRC (Client Quit)
[12:06] *** ndiddy has joined #archiveteam
[12:06] *** arkiver3 has joined #archiveteam
[12:09] *** arkiver2 has quit IRC (Read error: Connection reset by peer)
[12:09] *** arkiver3 has quit IRC (Read error: Connection reset by peer)
[12:16] Sanqui: yeah, but as far as I know some wikis are heavily edited. On those they can't be easily found
[12:21] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[12:39] *** atomotic has joined #archiveteam
[12:41] *** Froggypwn has quit IRC (Quit: ~ Trillian Astra - www.trillian.im ~)
[12:51] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:55] *** Simpbra1 has quit IRC (Ping timeout: 260 seconds)
[13:14] *** WinterFox has quit IRC (Remote host closed the connection)
[13:23] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[13:36] *** atomotic has joined #archiveteam
[14:00] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[14:33] *** hook54321 has joined #archiveteam
[15:16] *** Simpbrain has joined #archiveteam
[15:16] *** schbirid has joined #archiveteam
[15:32] *** maseck_ has joined #archiveteam
[15:35] *** JesseW has joined #archiveteam
[15:38] *** maseck has quit IRC (Ping timeout: 633 seconds)
[15:48] *** Simpbrain has quit IRC (Quit: Leaving)
[15:49] *** zino has joined #archiveteam
[16:13] *** Simpbrain has joined #archiveteam
[16:14] *** Simpbrain has quit IRC (Remote host closed the connection)
[16:19] *** VADemon_ has quit IRC (Quit: left4dead)
[16:37] *** zenguy has quit IRC (Read error: Operation timed out)
[16:43] *** zenguy has joined #archiveteam
[17:05] *** maseck has joined #archiveteam
[17:07] *** maseck_ has quit IRC (Read error: Operation timed out)
[17:22] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[17:57] *** balrog has quit IRC (Read error: Operation timed out)
[18:00] *** balrog has joined #archiveteam
[18:08] *** Tomcat_ has joined #archiveteam
[18:57] *** Honno has quit IRC (Quit: Leaving)
[19:23] *** schbirid has quit IRC (Quit: Leaving)
[19:33] *** atomotic has joined #archiveteam
[19:43] *** maseck_ has joined #archiveteam
[19:45] *** Tomcat_ has quit IRC (Remote host closed the connection)
[19:47] *** ariscop has quit IRC (Quit: Leaving)
[19:51] *** maseck has quit IRC (Ping timeout: 633 seconds)
[20:04] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[20:22] *** tfgbd_znc is now known as tfgbd
[20:47] *** maseck_ has quit IRC (Ping timeout: 250 seconds)
[20:47] The scripts for arto.com are updated!
[20:48] Please update your scripts, I'll set the new version in the tracker as minimum version tomorrow.
[20:52] *** maseck has joined #archiveteam
[20:59] *** ariscop has joined #archiveteam
[21:25] *** tomwsmf-a has joined #archiveteam
[21:31] how does archive team deal with cloudflare
[21:34] and I don't want to hear no "it doesn't"
[21:35] https://github.com/search?utf8=%E2%9C%93&q=cloudflare+bypass&type=Repositories&ref=searchresults
[21:35] we sit all day and solve captchas
[21:35] I have found https://github.com/Anorov/cloudflare-scrape
[21:35] can we integrate this with archivebot somehow?
[21:36] it looks absolutely wonderful
[21:36] and integrates with requests
[21:36] >It's easy to integrate cloudflare-scrape with other applications and tools. Cloudflare uses two cookies as tokens: one to verify you made it past their challenge page and one to track your session. To bypass the challenge page, simply include both of these cookies (with the appropriate user-agent) in all HTTP requests you make.
[21:37] so run a script to bypass cloudflare, grab those cookies, and then run a regular archivebot job - can we do that
[21:39] it sounds possible with programming and willpower
[21:39] wonderful
[21:40] things are even easier if you run your own pipeline that you patch to support crazy stuff
[21:40] this should be in the standard archivebot distribution
[21:40] I'd like to at least open an issue. should I do it on ArchiveBot?
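Editor's note: the quoted README text says the two Cloudflare cookies plus the matching user-agent have to be sent with every request. A minimal sketch of that last step, using only the Python standard library, is given here. It assumes the cookies and user-agent string were already obtained by some other tool; the function name build_request, the cookie names, and the user-agent value are placeholders, and nothing here reflects how ArchiveBot itself issues requests.

```python
from urllib.request import Request


def build_request(url: str, cookies: dict, user_agent: str) -> Request:
    """Attach previously obtained cookies and a user-agent to a request.

    Hypothetical helper: 'cookies' maps cookie names to values and is
    serialized into a single Cookie header.
    """
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(url, headers={"Cookie": cookie_header, "User-Agent": user_agent})


# Placeholder values; real ones would come from the challenge-solving step.
req = build_request(
    "http://example.com/",
    {"clearance_cookie": "abc", "session_cookie": "xyz"},
    "ExampleAgent/1.0",
)
print(req.get_header("Cookie"))
```

The request object is only constructed, not sent, so the sketch has no network side effects. The same cookie string and user-agent would have to be reused for every request in a crawl.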
[21:41] https://github.com/ArchiveTeam/ArchiveBot/issues/101
[21:41] *** PaulFerts has quit IRC (Read error: Operation timed out)
[21:41] cool, I'll reference that issue
[21:45] https://github.com/ArchiveTeam/ArchiveBot/issues/216
[21:54] *** Simpbrain has joined #archiveteam
[22:09] *** divingk has joined #archiveteam
[22:09] https://www.youtube.com/channel/UCzZ8jigea0Ph0pgUgrX5U-Q
[22:09] I appreciate channels like this one.
[22:09] Because yes, there are craploads of games getting their servers shut down every day.
[22:11] *** maseck has quit IRC (Read error: Operation timed out)
[22:14] *** arkiver3 has joined #archiveteam
[22:15] *** arkiver3 has quit IRC (Client Quit)
[22:22] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[22:23] *** maseck has joined #archiveteam
[22:37] *** tomwsmf-a has joined #archiveteam
[22:50] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[23:28] *** Aranje has joined #archiveteam
[23:59] *** MMovie1 has joined #archiveteam