[00:13] *** killsushi has quit IRC (Quit: Leaving)
[01:11] *** Quirk8 has quit IRC (Read error: Operation timed out)
[01:26] *** acridAxid has quit IRC (Quit: marauder)
[01:55] *** BlueMax has joined #archiveteam-bs
[02:17] *** Quirk8 has joined #archiveteam-bs
[02:30] *** DogsRNice has quit IRC (Read error: Connection reset by peer)
[02:39] second: https://wiki.archlinux.org/index.php/Arch_Linux_Archive#Historical_Archive
[03:28] *** qw3rty has joined #archiveteam-bs
[03:35] *** qw3rty2 has quit IRC (Ping timeout: 745 seconds)
[03:36] *** odemgi_ has joined #archiveteam-bs
[03:38] *** odemgi has quit IRC (Ping timeout: 252 seconds)
[03:43] *** ShellyRol has quit IRC (Quit: Leaving)
[03:43] *** ShellyRol has joined #archiveteam-bs
[04:55] *** eythian has quit IRC (Remote host closed the connection)
[05:09] *** kiska18 has quit IRC (Read error: Operation timed out)
[05:11] *** kiska18 has joined #archiveteam-bs
[05:23] *** m007a83 has quit IRC (Quit: Fuck you Comcast)
[05:24] *** eythian has joined #archiveteam-bs
[07:05] SketchCow: so i noticed that you put episodes of x-play in community video instead of the collection you made for me years ago
[07:05] said episodes : https://archive.org/search.php?query=subject%3A%22X-Play%22+addeddate%3A2019&and[]=creator%3A%22g4%22
[07:06] the collection to move them to : https://archive.org/details/g4-xplay
[07:17] *** ScruffyB has quit IRC (Remote host closed the connection)
[07:17] *** ScruffyB has joined #archiveteam-bs
[08:55] *** PurpleSym has joined #archiveteam-bs
[08:56] *** Fusl____ sets mode: +o PurpleSym
[08:56] *** Fusl sets mode: +o PurpleSym
[08:56] *** Fusl_ sets mode: +o PurpleSym
[09:27] *** bluefoo has quit IRC (Read error: Connection reset by peer)
[09:34] *** deevious has joined #archiveteam-bs
[10:16] *** icedice has joined #archiveteam-bs
[10:16] *** icedice has quit IRC (Connection closed)
[10:32] *** bluefoo has joined #archiveteam-bs
[11:10] *** icedice has joined #archiveteam-bs
[11:31] *** icedice has quit IRC (Quit: Leaving)
[11:38] *** icedice has joined #archiveteam-bs
[11:38] *** icedice has quit IRC (Connection closed)
[11:39] *** icedice has joined #archiveteam-bs
[11:47] *** bluefoo has quit IRC (Read error: Connection reset by peer)
[12:05] *** deevious has quit IRC (Remote host closed the connection)
[12:36] *** BlueMax has quit IRC (Quit: Leaving)
[13:05] *** MaximeleG has joined #archiveteam-bs
[13:12] *** MaximeleG has quit IRC (Quit: MaximeleG)
[13:48] *** milkydice has joined #archiveteam-bs
[13:48] So about Drawr.net
[13:48] Ok milkydice, I have started two jobs. One for the notice page and one for the website. However, it is hugely likely that these won't complete in time.
[13:49] So, we need to use our other toolsets.
[13:49] We could do with some stats if we can get them? Number of users, site layout, number of sketches. IDs ideally
[13:49] I see. Were you able to grab the timelapse data too?
[13:50] https://archive.drawr.net/ It's currently grabbing everything from here.
[13:50] However, the pipeline for the main website has been IP blocked
[13:55] I'm not quite getting this. From that link it seems to imply that you can only download your own works
[13:55] Based on crude Google translation that is
[13:56] Yeah that's what I am using
[13:56] http://dashboard.at.ninjawedding.org/
[13:56] There is a job here for drawr cvyzqtvmq3dhxbsn4sux95ju2
[13:56] Which is getting "stuff".
[13:57] http://drawr.net/show.php?id=6284135
[13:58] http://drawr.net/show.php?id=7161353
[13:58] Is the latest if "new" is to be believed
[13:58] 7,161,353 images.
[13:58] (Some are deleted).
[13:58] ((or private))
[13:59] I launched a channel #drawrnomore
[13:59] As I can see this needing some more work than what the archivebot tool can manage
[13:59] I see, that's going to take a while
[13:59] We can do it with the warrior tool.
[13:59] That won't take too long, just need to make some scripts
[14:00] I'll use the archivebot pipeline to get some information on site structure.
[14:00] Is it possible to grab the data played through the flash player too?
[14:00] https://github.com/ArchiveTeam/pixiv-grab
[14:00] Somebody did something similar for Pixiv Sketch
[14:01] We can get them yeah
[14:01] Just need to know the website format
[14:02] Ah I see, thanks a lot for the help!
[14:03] *** yano has quit IRC (WeeChat, The Better IRC Client, https://weechat.org/)
[14:03] It's what we do :)
[14:03] I'm not very technically literate with this kind of stuff, so I'm glad there's a day early
[14:03] *fast response
[14:04] *** yano has joined #archiveteam-bs
[14:04] *** apache2 has quit IRC (Remote host closed the connection)
[14:04] *** apache2 has joined #archiveteam-bs
[14:05] http://drawr.net/user.php?id=xxxxx http://drawr.net/show.php?id=xxxx
[14:05] Are the main things to get.
[14:05] I'll write something later when I get a chance and get this as a warrior project. We can power through 7 million in very short order as the site seems fairly quick.
[14:06] That sounds very promising!
[14:07] Is there a reason to go through the user pages too?
[14:07] Makes playback work nicely
[14:07] And also if the user pages are linked elsewhere they'll be in the wayback machine too
[14:07] Ah right makes sense.
[14:09] Oh, nice, their pagination for user pages doesn't suck
[14:09] and doesn't present a link if there aren't any.
[14:09] So we can just power through the users.
[14:09] Which will then find all the image pages.
[14:09] Do you have a link to the flash pages?
[14:11] The flash player should be available in every artwork page. You just need to click it, then the plug-in loads.
[14:12] As in http://drawr.net/show.php?id=xxxx
[14:14] Ah I block flash by default
[14:14] Just seeing what happens
[14:16] Must be loaded through JS somehow since there's no or .swf reference in the image page HTML.
[14:16] Yeah
[14:16] jsel_plyr_fn ="5d8caab3jHzft4u2
[14:16] Yeah, it happens in http://drawr.net/show.php?id=7161353
[14:16] function setevt_plyr
[14:17] var movie = 'draw_player.swf?xml=' + fn + '.xml&obj=ebdrw' + this.getAttribute(pf + 'id') + '&ver=' + plyr_ver;
[14:17] img10.drawr.net/draw/img/288983/5d8caab3jHzft4u2.gz
[14:18] I have no idea if that would play in WBM.
[14:18] drawr.net/draw_player.swf?xml=http://img10.drawr.net/draw/img/288983/5d8caab3jHzft4u2.xml&obj=ebdrw7161353&ver=100216
[14:19] I doubt that the WBM rewrites SWFs and redirects those requests.
[14:19] Also, that XML doesn't exist?
[14:19] Nope.
[14:20] ¯\_(ツ)_/¯
[14:20] The file is stored as the .gz though
[14:20] And the .gz is something weird. zcat/zless don't like it.
[14:20] gunzip and mediainfo?
[14:24] I doubt that gunzip does anything different than zcat or zless since it's part of the same toolset.
[14:24] Don't have mediainfo handy right now.
[14:24] file says it's "zlib compressed data".
[14:25] So yeah, I guess gzip toolery can't read it.
[14:26] Interesting.
[14:27] *** bluefoo has joined #archiveteam-bs
[14:29] python3 -c 'import zlib,sys'$'\n''sys.stdout.buffer.write(zlib.decompress(sys.stdin.buffer.read()))' <5d8caab3jHzft4u2.gz
[14:31] It's some sort of binary file format for the drawing data.
[14:32] I guess reverse engineering would be in order if anyone wants to figure out how it works.
[14:36] *** milkydice has quit IRC (Ping timeout: 260 seconds)
[14:38] We can just hit the image page and grab the data for now
[14:38] That can be fixed later (never)
[14:39] *** milkydice has joined #archiveteam-bs
[14:40] Yeah, just preserve the data, everything else can be figured out by whoever wants to look at the data in the future.
[14:42] Pixiv Sketch flash data was able to be played back in BlueMaxima's Flashpoint. So maybe it'll be possible too in this case.
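(Editor's note: the exchange above found that the ".gz" drawing files are raw zlib streams rather than gzip containers, which would explain why zcat/zless reject them. A minimal Python sketch of that distinction follows. The function names are hypothetical and chosen for illustration; nothing here is known about the decompressed drawing format itself, which the log describes only as an undocumented binary format.)

```python
import zlib


def looks_like_gzip(raw: bytes) -> bool:
    """True if the data starts with the gzip magic bytes 0x1f 0x8b."""
    return raw[:2] == b"\x1f\x8b"


def looks_like_zlib(raw: bytes) -> bool:
    """True if the first two bytes form a valid zlib (RFC 1950) header.

    The low nibble of the first byte must be 8 (deflate), and the first
    two bytes read as a big-endian integer must be divisible by 31.
    """
    if len(raw) < 2:
        return False
    return (raw[0] & 0x0F) == 8 and ((raw[0] << 8) | raw[1]) % 31 == 0


def inflate_drawing(raw: bytes) -> bytes:
    """Decompress a raw zlib stream, as the one-liner in the log does."""
    return zlib.decompress(raw)
```

(A file saved from the image server could be read with open(path, "rb").read() and passed to inflate_drawing; the result is still opaque binary data that would need reverse engineering.)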
[14:55] Off to the restroom
[15:03] *** qwebirc31 has joined #archiveteam-bs
[15:05] *** milkydice has quit IRC (Ping timeout: 260 seconds)
[15:08] *** qwebirc31 has quit IRC (Client Quit)
[15:13] *** milkydice has joined #archiveteam-bs
[15:44] *** Smiley has quit IRC (Read error: Operation timed out)
[16:04] *** yano has quit IRC (Quit: WeeChat, The Better IRC Client, https://weechat.org/)
[16:06] *** yano has joined #archiveteam-bs
[16:42] *** themadpr0 has joined #archiveteam-bs
[16:43] the graph in the lower left corner of the warrior is showing all 0's
[16:43] am I doing it right?
[16:44] pls ignore the above two messages
[16:45] *** themadpr0 has left
[16:50] *** m007a83 has joined #archiveteam-bs
[17:39] I'm working on a size estimate of NatGeo Your Shot, but I'm running into weird timeout issues.
[17:50] Mm, my estimate shot up because I found that a different API, used for search results, uses square-sized previews while the first one only used aspect-ratio-preserving resizes. But I'm going to worry about those pages later. What would help most is someone looking at the JS to confirm what sizes will never be used.
[17:50] Any opposition to #gotshot ?
[17:53] Nope, seems good. I've created #gotshot on hackint (!).
[17:54] I still say #shootermurdoch or #rupertmurderednatgeo :p
[17:56] But Murdoch isn't involved in NatGeo anymore at all?
[17:57] I thought he bought the thing only a couple years ago.
[17:58] My elderly neighbors had been subscribed since day one, have a complete collection, and have said the last 2 years have been utter trash and are remiss to cancel.
[17:58] Yeah, he/21st Century Fox did, but Disney bought them recently.
[17:59] ah, the corporate soup
[17:59] that would make your neighbor the oldest person alive
[17:59] well, they be pretty old
[17:59] lol yup, first edition was published in 1888.
[17:59] huh
[18:00] i think they're vampires then!
[18:01] i wonder if disney bought natgeo to bury their sordid history with lemmings
[18:02] Disney bought 21st Century Fox, not just NatGeo.
[18:02] oh wow
[18:02] Anyway, this is getting into -ot territory.
[18:02] The channel for NatGeo Your Shot is #gotshot on hackint.
[18:02] well they couldn't have cleaned house / reversed the damage that quickly. /end
[18:13] SketchCow: i'm starting to upload the fmso-monographs pdfs
[18:14] i did it differently than my eric archive and stuff
[18:14] mostly cause the id numbers can do a content-disposition and have a very unique file name
[18:15] SketchCow: an example of one : https://archive.org/details/20161016_Winter_Manpower_Gaps_in_the_Syrian_Army
[18:20] i may also upload a zip of them too
[18:20] some of the file names are too long for archive.org
[18:23] *** DogsRNice has joined #archiveteam-bs
[18:52] *** Smiley has joined #archiveteam-bs
[19:26] https://archive.org/details/godaneinbox?and[]=creator%3A%22apan%22
[19:39] *** anarcat has quit IRC (Remote host closed the connection)
[19:45] *** icedice2 has joined #archiveteam-bs
[19:49] *** icedice has quit IRC (Read error: Operation timed out)
[20:21] *** britmob has quit IRC (Read error: Connection reset by peer)
[20:22] *** milkydice has quit IRC (Remote host closed the connection)
[20:33] *** icedice2 has quit IRC (Quit: Leaving)
[20:38] *** icedice has joined #archiveteam-bs
[21:01] *** icedice2 has joined #archiveteam-bs
[21:02] *** MRX3 has joined #archiveteam-bs
[21:05] *** icedice has quit IRC (Ping timeout: 252 seconds)
[21:06] *** icedice2 has quit IRC (Ping timeout: 252 seconds)
[21:16] *** h3ndr1k has quit IRC (Quit: )
[21:18] drawr?
[21:18] *** h3ndr1k has joined #archiveteam-bs
[21:19] nice drawr looks sequential
[21:20] I'm thinking of getting a Warrior project running for it
[21:35] *** BlueMax has joined #archiveteam-bs
[21:46] so i may have to do another crazy project
[21:46] the fcc has tons of these small like file memos on their website
[21:47] paths like this : ecfsapi.fcc.gov/file/6516984268.pdf
[21:47] so i may end up calling them something like ecfsapi-fcc-gov-file-$id or something
[21:48] i don't think i can get metadata and some of them are like less than a paragraph
[22:05] #drawrnomore
[22:17] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[22:23] *** milkydice has joined #archiveteam-bs
[23:11] seems the 3 upcoming deadlines are staggered- sony sketch Sept, natgeo yourshot Oct, drawr.net Nov
[23:17] *** MRX3 has quit IRC (Ping timeout: 252 seconds)
[23:27] *** Mateon1 has quit IRC (Quit: Mateon1)
[23:27] *** Mateon1 has joined #archiveteam-bs
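(Editor's note: the log observes that drawr IDs look sequential, with show.php?id=7161353 reported as the latest. A rough Python sketch of splitting such an ID space into batches follows. The batch size, the starting ID of 1, and the function names are assumptions made for illustration; the log does not say how the actual Warrior project defines its work items.)

```python
from typing import Iterator, List, Tuple

# URL pattern and latest ID are taken from the log; everything else is assumed.
SHOW_URL = "http://drawr.net/show.php?id={id}"
LATEST_ID = 7161353


def id_ranges(first: int, last: int, size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) ID ranges covering first..last."""
    start = first
    while start <= last:
        end = min(start + size - 1, last)
        yield start, end
        start = end + 1


def urls_for_range(start: int, end: int) -> List[str]:
    """Build the image-page URLs for one inclusive ID range."""
    return [SHOW_URL.format(id=i) for i in range(start, end + 1)]
```

(For example, list(id_ranges(1, LATEST_ID, 1000)) gives batches of up to 1000 IDs each. Deleted or private IDs, which the log mentions, would simply return no content when fetched.)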