[00:15] *** wyatt8750 has joined #archiveteam-bs [00:15] *** Yurume_ has quit IRC (Read error: Operation timed out) [00:15] *** legoktm has quit IRC (Read error: Operation timed out) [00:15] *** legoktm has joined #archiveteam-bs [00:16] *** pie_ has quit IRC (Read error: Operation timed out) [00:17] *** jodizzle has quit IRC (Read error: Operation timed out) [00:17] *** BnAboyZ has quit IRC (Read error: Operation timed out) [00:17] *** jodizzle has joined #archiveteam-bs [00:18] *** colona has quit IRC (Read error: Connection reset by peer) [00:20] *** colona has joined #archiveteam-bs [00:21] *** pie_ has joined #archiveteam-bs [00:22] *** Yurume has joined #archiveteam-bs [00:24] *** BnAboyZ has joined #archiveteam-bs [00:24] *** wessel152 has quit IRC (Read error: Operation timed out) [00:25] *** wyatt8740 has quit IRC (Read error: Operation timed out) [00:25] *** wyatt8750 is now known as wyatt8740 [00:33] *** BnAboyZ has quit IRC (Read error: Connection reset by peer) [00:37] *** hook54321 has quit IRC (se.hub irc.nordunet.se) [00:42] *** BnAboyZ has joined #archiveteam-bs [01:13] *** Jake has quit IRC (Read error: Operation timed out) [01:13] *** Mayonaise has quit IRC (Read error: Operation timed out) [01:14] *** dxrt_ has quit IRC (Read error: Operation timed out) [01:14] *** t3 has quit IRC (Quit: Connection closed for inactivity) [01:14] *** SmileyG has joined #archiveteam-bs [01:14] *** Wingy has quit IRC (Read error: Operation timed out) [01:15] *** paul2520 has quit IRC (Read error: Operation timed out) [01:15] *** Lord_Nigh has quit IRC (Read error: Operation timed out) [01:15] *** Mayonaise has joined #archiveteam-bs [01:15] *** Lord_Nigh has joined #archiveteam-bs [01:16] *** jodizzle has quit IRC (Read error: Operation timed out) [01:16] *** jodizzle has joined #archiveteam-bs [01:16] *** asdf0101 has quit IRC (Read error: Operation timed out) [01:17] *** colona has quit IRC (Read error: Operation timed out) [01:17] *** _niklas has joined 
#archiveteam-bs [01:17] *** _niklas_ has quit IRC (Write error: Broken pipe) [01:17] *** katocala has quit IRC (Read error: Operation timed out) [01:17] *** katocala has joined #archiveteam-bs [01:17] *** colona has joined #archiveteam-bs [01:17] *** svchfoo3 sets mode: +o katocala [01:17] *** Selavi has quit IRC (Read error: Operation timed out) [01:18] *** endrift has quit IRC (Read error: Operation timed out) [01:18] *** Smiley has quit IRC (Read error: Operation timed out) [01:19] *** Gfy_ has quit IRC (Read error: Operation timed out) [01:19] *** Selavi has joined #archiveteam-bs [01:21] *** systwi_ has quit IRC (Read error: Operation timed out) [01:24] *** Wingy has joined #archiveteam-bs [01:24] *** Jake4 has joined #archiveteam-bs [01:24] *** paul2520 has joined #archiveteam-bs [01:24] *** sembiance has quit IRC (Read error: Connection reset by peer) [01:24] *** systwi has joined #archiveteam-bs [01:24] *** dxrt_ has joined #archiveteam-bs [01:24] *** dxrt sets mode: +o dxrt_ [01:25] *** Gfy has joined #archiveteam-bs [01:27] *** Jake45 has joined #archiveteam-bs [01:28] *** Jake43 has joined #archiveteam-bs [01:28] *** asdf0101 has joined #archiveteam-bs [01:28] *** sembiance has joined #archiveteam-bs [01:29] *** Ctrl has joined #archiveteam-bs [01:30] *** Jake43 has quit IRC (Client Quit) [01:30] *** Jake43 has joined #archiveteam-bs [01:30] *** endrift has joined #archiveteam-bs [01:32] *** Jake43 has quit IRC (Client Quit) [01:34] *** Jake47 has joined #archiveteam-bs [01:35] *** Jake47 is now known as Jake [01:35] *** Jake4 has quit IRC (Ping timeout: 622 seconds) [01:37] *** Jake45 has quit IRC (Remote host closed the connection) [01:37] *** Jake has quit IRC (Remote host closed the connection) [01:38] *** Jake has joined #archiveteam-bs [01:42] *** Jake0 has joined #archiveteam-bs [01:44] *** Jake0 has quit IRC (Client Quit) [01:44] *** Jake1 has joined #archiveteam-bs [01:46] *** Jake1 has quit IRC (Client Quit) [01:46] *** Jake5 has joined 
#archiveteam-bs [01:48] *** Jake5 has quit IRC (Client Quit) [01:48] *** Jake5 has joined #archiveteam-bs [01:50] *** Jake5 has quit IRC (Client Quit) [01:50] *** Jake has quit IRC (Ping timeout: 622 seconds) [01:50] *** Jake has joined #archiveteam-bs [02:06] *** Raccoon has quit IRC (Ping timeout: 745 seconds) [02:41] *** wyatt8740 has quit IRC (Ping timeout: 260 seconds) [03:01] *** wyatt8740 has joined #archiveteam-bs [03:22] *** qw3rty_ has joined #archiveteam-bs [03:26] *** qw3rty has quit IRC (Ping timeout: 265 seconds) [03:30] *** Mateon1 has quit IRC (Remote host closed the connection) [03:35] *** Mateon1 has joined #archiveteam-bs [03:50] *** wyatt8740 has quit IRC (Remote host closed the connection) [03:54] *** wyatt8740 has joined #archiveteam-bs [04:17] *** asdf01011 has joined #archiveteam-bs [04:20] *** asdf0101 has quit IRC (Read error: Operation timed out) [04:25] *** endrift has quit IRC (Read error: Operation timed out) [04:25] *** endrift has joined #archiveteam-bs [04:27] *** asdf01011 has quit IRC (Read error: Operation timed out) [04:27] *** dxrt_ has quit IRC (Read error: Connection reset by peer) [04:28] *** dxrt_ has joined #archiveteam-bs [04:28] *** dxrt sets mode: +o dxrt_ [04:39] *** wyatt8740 has quit IRC (Ping timeout: 260 seconds) [04:39] *** wyatt8740 has joined #archiveteam-bs [04:40] *** Pixi` is now known as Pixi [04:45] *** colona has quit IRC (Read error: Operation timed out) [04:46] *** colona has joined #archiveteam-bs [05:23] *** Smiley has joined #archiveteam-bs [05:25] *** Laverne has quit IRC (Ping timeout: 272 seconds) [05:25] *** trc_ has joined #archiveteam-bs [05:25] *** _niklas has quit IRC (Ping timeout: 272 seconds) [05:25] *** Lord_Nigh has quit IRC (Ping timeout: 272 seconds) [05:26] *** Doranwen has joined #archiveteam-bs [05:26] *** Gfy has quit IRC (Ping timeout: 272 seconds) [05:26] *** SmileyG has quit IRC (Ping timeout: 272 seconds) [05:26] *** luckcolor has quit IRC (Ping timeout: 272 seconds) [05:26] *** 
Doran has quit IRC (Ping timeout: 272 seconds) [05:26] *** brayden has quit IRC (Ping timeout: 272 seconds) [05:27] *** luckcolor has joined #archiveteam-bs [05:29] *** Arcorann_ has joined #archiveteam-bs [05:29] *** _niklas has joined #archiveteam-bs [05:29] *** Lord_Nigh has joined #archiveteam-bs [05:30] *** Gfy has joined #archiveteam-bs [05:30] *** Arcorann has quit IRC (Ping timeout: 265 seconds) [05:30] *** trc has quit IRC (Ping timeout: 265 seconds) [05:32] *** qw3rty_ has quit IRC (Ping timeout: 265 seconds) [05:32] *** qw3rty has joined #archiveteam-bs [05:33] *** Stilett0 has joined #archiveteam-bs [05:36] *** britm0b has joined #archiveteam-bs [05:40] *** trc has joined #archiveteam-bs [05:43] *** trc_ has quit IRC (se.hub irc.underworld.no) [05:43] *** cascode has quit IRC (se.hub irc.underworld.no) [05:43] *** Stiletto has quit IRC (se.hub irc.underworld.no) [05:43] *** ats has quit IRC (se.hub irc.underworld.no) [05:43] *** wp494 has quit IRC (se.hub irc.underworld.no) [05:43] *** britmob has quit IRC (se.hub irc.underworld.no) [05:43] *** Aoede has quit IRC (se.hub irc.underworld.no) [05:43] *** Jon has quit IRC (se.hub irc.underworld.no) [05:43] *** OrIdow6 has quit IRC (se.hub irc.underworld.no) [05:43] *** i0npulse has quit IRC (se.hub irc.underworld.no) [05:43] *** Tugboat has quit IRC (se.hub irc.underworld.no) [05:43] *** Jens has quit IRC (se.hub irc.underworld.no) [05:43] *** arkiver has quit IRC (se.hub irc.underworld.no) [05:49] *** cascode has joined #archiveteam-bs [05:49] *** ats has joined #archiveteam-bs [05:49] *** wp494 has joined #archiveteam-bs [05:49] *** Aoede has joined #archiveteam-bs [05:49] *** Jon has joined #archiveteam-bs [05:49] *** OrIdow6 has joined #archiveteam-bs [05:49] *** arkiver has joined #archiveteam-bs [05:49] *** i0npulse has joined #archiveteam-bs [05:49] *** Tugboat has joined #archiveteam-bs [05:49] *** irc.underworld.no sets mode: +o arkiver [06:07] *** Laverne has joined #archiveteam-bs [06:08] *** 
brayden has joined #archiveteam-bs [06:14] *** brayden has quit IRC (Ping timeout: 272 seconds) [06:14] *** Laverne has quit IRC (Ping timeout: 272 seconds) [06:15] *** icedice has joined #archiveteam-bs [06:31] *** HP_Archiv has joined #archiveteam-bs [06:51] *** HP_Archiv has quit IRC (Quit: Leaving) [06:56] *** Laverne has joined #archiveteam-bs [06:56] *** brayden has joined #archiveteam-bs [07:06] *** Laverne has quit IRC (Ping timeout: 272 seconds) [07:07] *** brayden has quit IRC (Ping timeout: 272 seconds) [07:11] *** Raccoon has joined #archiveteam-bs [07:39] *** icedice has quit IRC (Leaving) [07:50] *** brayden has joined #archiveteam-bs [07:50] *** Laverne has joined #archiveteam-bs [07:55] *** Laverne has quit IRC (Ping timeout: 272 seconds) [07:56] *** brayden has quit IRC (Ping timeout: 272 seconds) [08:23] *** jshoard has joined #archiveteam-bs [08:28] I've got a full rip with ddrescue log file for this item https://archive.org/details/maximumlinux which is incomplete in the archive (corrupted disk); should I upload it as a separate item then request someone merge/change/update/whatever? 
[08:29] oh wait there's a comment, someone else has beat me to it [08:39] *** Laverne has joined #archiveteam-bs [08:39] *** brayden has joined #archiveteam-bs [08:58] *** Laverne has quit IRC (Ping timeout: 272 seconds) [08:59] *** brayden has quit IRC (Ping timeout: 272 seconds) [09:10] *** abartov__ has quit IRC (Read error: Connection reset by peer) [09:10] *** HCross has quit IRC (Read error: Connection reset by peer) [09:18] *** HCross has joined #archiveteam-bs [09:18] *** svchfoo3 sets mode: +o HCross [09:22] *** abartov__ has joined #archiveteam-bs [09:40] *** brayden has joined #archiveteam-bs [09:40] *** Laverne has joined #archiveteam-bs [09:44] *** VerifiedJ has joined #archiveteam-bs [10:31] *** BlueMax has quit IRC (Read error: Connection reset by peer) [10:49] *** BnAboyZ has quit IRC (Read error: Operation timed out) [10:50] *** colona has quit IRC (Read error: Operation timed out) [10:53] *** colona has joined #archiveteam-bs [10:53] *** Ctrl has quit IRC (Read error: Operation timed out) [10:59] *** BnAboyZ has joined #archiveteam-bs [11:21] *** Ctrl has joined #archiveteam-bs [12:04] *** HP_Archiv has joined #archiveteam-bs [12:06] *** HP_Archiv has quit IRC (Client Quit) [12:19] *** Meli has quit IRC (Read error: Connection reset by peer) [12:20] *** Meli has joined #archiveteam-bs [12:40] *** Mateon1 has quit IRC (Read error: Operation timed out) [12:44] *** Mateon1 has joined #archiveteam-bs [12:44] *** yano_ is now known as yano [13:30] *** pew has joined #archiveteam-bs [13:39] *** fuzzy802 has joined #archiveteam-bs [13:43] *** fuzzy8021 has quit IRC (Read error: Operation timed out) [13:49] *** fuzzy802 is now known as fuzzy8021 [13:53] *** icedice has joined #archiveteam-bs [14:59] *** DigiDigi` has quit IRC (Read error: Operation timed out) [15:12] *** DigiDigi has joined #archiveteam-bs [15:35] *** Arcorann_ has quit IRC (Read error: Connection reset by peer) [16:16] *** icedice has quit IRC (Ping timeout: 622 seconds) [16:29] *** 
cascode has quit IRC (Ping timeout: 265 seconds) [16:36] *** icedice has joined #archiveteam-bs [16:40] *** VADemon has joined #archiveteam-bs [16:44] *** cascode has joined #archiveteam-bs [16:59] *** cascode has quit IRC (Ping timeout: 265 seconds) [17:19] *** hook54321 has joined #archiveteam-bs [17:49] *** Meli has quit IRC (Ping timeout: 272 seconds) [17:49] *** Laverne has quit IRC (Ping timeout: 272 seconds) [17:49] *** Meli has joined #archiveteam-bs [17:50] *** brayden has quit IRC (Ping timeout: 272 seconds) [17:51] *** VADemon has quit IRC (Read error: Connection reset by peer) [18:01] *** JAA sets mode: +o hook54321 [18:04] *** wessel152 has joined #archiveteam-bs [18:30] *** cascode has joined #archiveteam-bs [18:32] *** Laverne has joined #archiveteam-bs [18:38] *** cascode has quit IRC (Read error: Operation timed out) [18:38] *** Laverne has quit IRC (Ping timeout: 272 seconds) [19:15] *** cascode has joined #archiveteam-bs [19:20] *** Laverne has joined #archiveteam-bs [19:48] *** cascode has quit IRC (Ping timeout: 745 seconds) [19:57] *** semisimpl has joined #archiveteam-bs [20:51] *** VerifiedJ has quit IRC (Quit: Leaving) [20:58] here is the data i get when i download a manual manually from manualslib: https://pastebin.com/raw/sgrHZN4z [20:59] can you guys figure out how to grab the manual pdf urls? [21:05] 1> https://www.manualslib.com/manual/1350327/Kenmore-105-20002410.html [21:05] 2> https://www.manualslib.com/download/1350327/Kenmore-105-20002410.html [21:05] 3> https://data2.manualslib.com/pdf6/136/13504/1350327-kenmore/10520002410.pdf?198264392e33b225a10c91f7e3ccc36f&take=binary [21:06] i just want to autobot it in bash now [21:06] so the addresses are pretty easy to deconstruct. 
the trick would be the auth token / captcha cookie [21:06] well, i would start by grabbing the /manual/ url links for all 4 million manuals [21:06] since those are not behind captcha [21:06] *** icedice has quit IRC (Quit: Leaving) [21:06] then the work can be figured out from there [21:07] no you just have to get the post data of manualslib.com/download/ [21:07] maybe the same auth token can be reused N number of times? needs experimentation [21:07] what post data [21:08] in firefox when i'm downloading the manual it will come up with manualslib.com/download/ [21:08] it gives the url [21:08] of the pdf [21:08] yeah. i gave you the url deconstruction above [21:08] your post data is not important from what i can tell [21:09] it is to get the pdf url [21:09] once you get the url you don't need it, i know [21:10] 1> https://www.manualslib.com/manual/1350327/Kenmore-105-20002410.html [21:10] 3> https://data2.manualslib.com/pdf6/136/13504/1350327-kenmore/10520002410.pdf?198264392e33b225a10c91f7e3ccc36f&take=binary [21:11] might need to look at the 136 and 13504 closer if there's some pattern [21:12] i still think you're screwed as far as the captcha goes, unless you've found an 0day exploit to google recaptcha [21:13] you will have to bypass the captcha to get the urls i would think [21:14] that's not a thing [21:14] except when it is [21:14] Buster? [21:14] how many manuals can you grab by hand before it makes you solve another captcha? 
[21:15] every manual has a captcha [21:15] btw, 136 is just 135+1, and 13504 is just 13503+1 [21:15] that might be the pattern [21:17] you can grab them like this : https://www.manualslib.com/manual/1350327/a.html [21:17] the pages redirect at least on those [21:17] do they progress 0 to 4000000 [21:18] there are gaps [21:18] first task would be filling a text file with every knowable url that way [21:19] and collecting metadata while at it [21:19] to me there may be an api we could try to find [21:20] that just spits out the pdf urls [21:20] do they have a web app? [21:20] no [21:20] Reverse-engineer https://play.google.com/store/apps/details?id=com.manualslib.app&hl=en_US perhaps? [21:20] not sure why they would design that if they specifically captcha wall their content [21:20] there you go [21:21] i don't have an android phone [21:21] so i can't do anything with that [21:21] You can download it on https://apkpure.com/manualslib-user-guides-owners-manuals-library/com.manualslib.app and then analyse from there. [21:21] shouldn't need one if just groping for "http" strings with grep. but it'd be easier to just spy [21:21] ok will do that [21:22] I just searched on GitHub. Guess whose code I found? :-P [21:23] that's too easy. it's a trap. [21:23] what did you find? [21:23] https://github.com/godane/random-code/blob/master/manualslib/download.sh [21:24] of course [21:24] Yup [21:24] i tried downloading them as cbz files [21:24] the images just get downloaded then made into a cbz file, but to me that's a poor way to grab it [21:25] also hd space will go up by a lot [21:26] Thought you didn't own an Android device :) thought only Kindle peeps used cbz :) [21:26] (only/mainly) [21:31] I just checked the mobile app, I assume it also uses the cbz format (text is raster, not embedded text) [21:33] *** Meli has quit IRC (Quit: After 1d 5h 40m 25s of wasteful lurking, 's brain 63gf4u1ted! 
X_x) [21:33] *** Meli has joined #archiveteam-bs [21:34] Although maybe it's actually the same pdf format, I can't tell [21:37] so i found this : https://app.manualslib.com/v2/ [21:37] but nothing comes up [21:46] *** Stilett0 is now known as Stiletto [22:02] If you do want to run an Android app on PC, there are programs like BlueStacks to do that. [22:40] i think i will just do it by making cbz files of the manuals [22:41] i think the bigImage url is stable cause they use it for the zoom feature [22:48] *** semisimpl has quit IRC (Quit: semisimpl) [23:01] *** Arcorann_ has joined #archiveteam-bs [23:02] *** Arcorann_ has quit IRC (Read error: Connection reset by peer) [23:02] *** Arcorann_ has joined #archiveteam-bs [23:16] anyways i came up with code to grab a brand name of manuals that i think should be archived [23:16] this makes the uploading different cause i'm not uploading a range of id numbers [23:20] *** cascode has joined #archiveteam-bs [23:24] *** BlueMax has joined #archiveteam-bs [23:27] *** cascode has quit IRC (Ping timeout: 272 seconds) [23:28] *** wyatt8740 has quit IRC (Ping timeout: 260 seconds) [23:29] *** wyatt8740 has joined #archiveteam-bs [23:39] *** Meli has quit IRC (Ping timeout: 272 seconds) [23:39] *** Meli has joined #archiveteam-bs [23:40] *** Laverne has quit IRC (Ping timeout: 272 seconds) [23:46] *** bsmith093 has quit IRC (Read error: Operation timed out) [23:46] *** scorche` has quit IRC (Read error: Operation timed out) [23:47] *** yano has quit IRC (Read error: Connection reset by peer) [23:47] *** Jonimoose has quit IRC (Ping timeout: 260 seconds) [23:47] *** godane has quit IRC (Read error: Operation timed out) [23:47] *** nico_32 has quit IRC (Read error: Operation timed out) [23:48] *** Doranwen has quit IRC (Read error: Operation timed out) [23:48] *** logchfoo0 has quit IRC (Ping timeout: 260 seconds) [23:49] *** logchfoo1 starts logging #archiveteam-bs at Mon Aug 17 23:49:13 2020 [23:49] *** logchfoo1 has joined 
#archiveteam-bs [23:49] *** Doran has joined #archiveteam-bs [23:53] *** nico_32_ has joined #archiveteam-bs [23:57] *** scorche has joined #archiveteam-bs
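[note] The URL deconstruction discussed between 21:05 and 21:19 could be sketched in Python roughly as follows. This is a minimal sketch built on the single Kenmore example in the log, not a confirmed description of how manualslib lays out its files. All function names are hypothetical. The "+1" directory rule, the fixed "pdf6" segment, and the brand/model split on the first hyphen are assumptions taken from that one sample and would likely fail on brands with hyphens in their name. The host (data2) and the query token after "?" are left out on purpose, since the log does not establish how they are produced. Nothing here touches the network or the captcha.

```python
from urllib.parse import urlsplit


def guess_pdf_dirs(manual_id: int) -> tuple[int, int]:
    """Guess the two numeric directories in the PDF path from a manual id.

    Assumption from the one sample in the log: 1350327 -> 136 and 13504,
    read here as (id // 10000) + 1 and (id // 100) + 1. Unverified elsewhere.
    """
    return manual_id // 10000 + 1, manual_id // 100 + 1


def parse_manual_url(url: str) -> tuple[int, str]:
    """Split a /manual/<id>/<slug>.html URL into (id, slug)."""
    parts = urlsplit(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "manual" or not parts[2].endswith(".html"):
        raise ValueError(f"not a /manual/ url: {url}")
    return int(parts[1]), parts[2][: -len(".html")]


def guess_pdf_path(manual_url: str) -> str:
    """Build the guessed PDF path (no host, no token) for a /manual/ URL."""
    manual_id, slug = parse_manual_url(manual_url)
    # Assumption: brand is everything before the first hyphen in the slug.
    brand, _, model = slug.partition("-")
    top, sub = guess_pdf_dirs(manual_id)
    # "pdf6" is copied from the sample; whether it varies per manual is unknown.
    return f"/pdf6/{top}/{sub}/{manual_id}-{brand.lower()}/{model.replace('-', '')}.pdf"


def candidate_manual_urls(start: int, stop: int):
    """Yield /manual/<id>/a.html URLs for an id range.

    The log says such URLs redirect to the real page and that the id space
    has gaps, so many of these are expected to lead nowhere.
    """
    for manual_id in range(start, stop):
        yield f"https://www.manualslib.com/manual/{manual_id}/a.html"


if __name__ == "__main__":
    sample = "https://www.manualslib.com/manual/1350327/Kenmore-105-20002410.html"
    print(guess_pdf_path(sample))
    for url in candidate_manual_urls(1350327, 1350330):
        print(url)
```

Run against the sample from the log, the guessed path matches the one pasted at 21:05 up to the missing host and token. A list produced by candidate_manual_urls would be the "text file with every knowable url" mentioned at 21:18; resolving the redirects and collecting metadata is a separate step.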