#archiveteam-bs 2017-06-09,Fri


Time Nickname Message
00:05 dashcloud has quit IRC (Ping timeout: 245 seconds)
00:06 j08nY has quit IRC (Quit: Leaving)
00:07 dashcloud has joined #archiveteam-bs
00:50 dashcloud has quit IRC (Ping timeout: 245 seconds)
00:58 dashcloud has joined #archiveteam-bs
01:40 BlueMaxim has joined #archiveteam-bs
01:52 tfgbd_znc has joined #archiveteam-bs
02:09 REiN^ has quit IRC (Read error: Operation timed out)
02:10 REiN^ has joined #archiveteam-bs
02:44 ndiddy has quit IRC ()
02:51 dashcloud has quit IRC (Read error: Operation timed out)
03:51 Stilett0 has joined #archiveteam-bs
03:51 Stilett0 is now known as Stiletto
03:52 phuzion has quit IRC (Ping timeout: 600 seconds)
03:53 phuzion has joined #archiveteam-bs
04:07 pizzaiolo has joined #archiveteam-bs
04:08 pizzaiolo has quit IRC (Client Quit)
04:15 godane SketchCow: just read your post on retromags
04:16 godane that's good that he's just been resting
04:56 Sk1d has quit IRC (Ping timeout: 194 seconds)
05:01 Sk1d has joined #archiveteam-bs
05:18 SHODAN_UI has joined #archiveteam-bs
06:19 kristian_ has joined #archiveteam-bs
06:27 ranma archivebot ignores robots.txt, right?
06:34 dxrt ranma: yes
07:03 j08nY has joined #archiveteam-bs
07:06 SHODAN_UI has quit IRC (Remote host closed the connection)
07:07 ivan has quit IRC (Leaving)
07:11 ivan has joined #archiveteam-bs
07:48 j08nY has quit IRC (Read error: Operation timed out)
08:25 JAA 06-09 03:17:50 <@xmc> it shouldn't go above/outside the directory named in the urls that you give it -- I actually observed the opposite. If you archive https://example.com/ and there's a 302 redirect on https://example.com/foo to https://otherpage.net/, it will recursively grab otherpage.net as well.
08:26 JAA Or maybe 301, don't remember.
08:28 JAA Checked the logs, it was a 303 See Other on job a98hr9u2potfhw6ikf0dnbsua.
08:31 JAA 06-08 21:38:57 <@xmc> [!ao on Twitter] won't get more than the most recent posts, but it'll go much faster -- ArchiveBot won't grab the entire tweet history regardless of the options, even with phantomjs, in my experience.
08:48 kristian_ has quit IRC (Quit: Leaving)
08:59 SHODAN_UI has joined #archiveteam-bs
09:20 Jonison has joined #archiveteam-bs
09:50 Jonison has quit IRC (Quit: Leaving)
09:55 icedice has joined #archiveteam-bs
10:25 gui7 has joined #archiveteam-bs
10:28 j08nY has joined #archiveteam-bs
10:28 gui7 ok so, question. there's this website in my native language that is an incredible treasure trove of soccer match data?
10:29 gui7 I just need a bit of help getting started... regex is rusty lol
10:31 JAA Link?
10:57 BlueMaxim has quit IRC (Read error: Operation timed out)
11:29 ranma not that the average archivist would want to back this up, but https://www.reddit.com/r/DataHoarder/comments/6g4c3p/erosharecom_nsfw_shutting_down_june_30th/
11:32 JAA Yeah, EroShare, ImgBox, ImageBam, and SendVid are all shutting down end of June.
11:33 ranma TIL sendvid. someone mentioned imgbox, imagebam when i mentioned that link
12:12 phuzion has quit IRC (Remote host closed the connection)
12:24 phuzion has joined #archiveteam-bs
12:42 pizzaiolo has joined #archiveteam-bs
13:26 joepie91 don't see why there couldn't be a project for it
13:41 JAA Well, sure. We'll need a list of URLs though. EroShare and SendVid seem to use 8-char base-36 IDs (2.8 * 10^12 combinations), ImgBox 8-char base-62 IDs (2.2 * 10^14), ImageBam 14/15-char base-16 IDs (1.2 * 10^18). ImageBam also has galleries with 32-char base-36 IDs it seems (6.3 * 10^49 !)...
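The combination counts in JAA's message follow from alphabet size raised to ID length. A minimal sketch of that arithmetic is below. The ID formats are JAA's guesses from the message above, the ImageBam figure assumes both 14- and 15-character IDs are in use, and `days_to_enumerate` with its request rate is a hypothetical helper added only to illustrate why brute-forcing the ID space is impractical and a URL list is needed.

```python
def keyspace(alphabet_size, length):
    """Number of distinct IDs of exactly `length` characters."""
    return alphabet_size ** length


# ID schemes as guessed in the chat; not confirmed by the sites themselves.
sizes = {
    "EroShare / SendVid (8-char base-36)": keyspace(36, 8),
    "ImgBox (8-char base-62)": keyspace(62, 8),
    # Assumption: 14-char and 15-char IDs both occur, so the spaces add up.
    "ImageBam images (14/15-char base-16)": keyspace(16, 14) + keyspace(16, 15),
    "ImageBam galleries (32-char base-36)": keyspace(36, 32),
}


def days_to_enumerate(n_ids, requests_per_second):
    """Days needed to try every ID at a fixed (hypothetical) request rate."""
    return n_ids / requests_per_second / 86400


if __name__ == "__main__":
    for name, n in sizes.items():
        # 1000 requests/second is an arbitrary illustrative rate.
        print(f"{name}: {n:.1e} IDs, {days_to_enumerate(n, 1000):.1e} days at 1000 req/s")
```

Even the smallest of these spaces would take decades to walk at that rate, which is the reason the discussion turns to obtaining URL lists instead.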
13:46 JAA By the way, ImgBox, ImageBam, and SendVid are indeed operated by the same entity, Flixya Entertainment, LLC.
13:58 JAA They also ran VideoBam, ViRoll, and Snapixel previously. Apparently, shared.com was also theirs at some point, but it looks like they sold that.
14:01 JAA ImageBam also has a second domain: imgbam.com
14:02 Frogging they're all shutting down at once? o.o
14:04 JAA Yep.
14:04 ZexaronS has joined #archiveteam-bs
14:04 JAA Not sure if there's a connection between EroShare and Flixya.
14:04 JAA I guess it's possible Flixya also runs EroShare but doesn't want to be associated with it or something like that. (eroshare.com is registered through a whois proxy.)
14:05 JAA But it might also just be a coincidence.
14:05 JAA The other three are just Flixya still not having figured out how to run a profitable image hosting website.
14:05 JAA image/video*
14:06 Frogging nobody has figured that out and that's why they all shut down eventually
14:06 Frogging :p
14:06 Frogging or become so ad-laden as to be unusable
14:07 JAA :-P
14:07 Frogging I've noticed imgur is becoming more and more obnoxious with their redirecting of hotlinks
14:07 JAA Yeah, same.
14:09 JAA picyou.com and ucash.in were also Flixya's at some point, but now seem to belong to another party (like shared.com).
14:13 JAA I found two more Flixya registrations: adhance.com, an advertising platform, and continue.com, a "traffic recapturing" service (think adf.ly). Both are now for sale.
14:16 JAA Also sharedhq.com, which has this beautiful quote: "[Flixya etc. founder] Ivan [Wong] is a serial entrepreneur and veteran web producer. He has been featured on the “New York Times” and excels in online advertising, analytics and project management." Perhaps he misunderstood the verb "to excel" as "I can calculate some online advertising, analytics, and project management stuff in Microsoft Excel"
14:16 JAA ?
14:55 joepie91 'serial entrepreneur'
14:55 joepie91 is that now the new term for "trying to be like Yahoo"?
15:15 Frogging hopping from one overfunded project to the next and leaving a trail of destruction
15:15 Frogging :p
15:17 ZexaronS has quit IRC (Leaving)
15:28 odemg has joined #archiveteam-bs
15:28 Gilfoyle has joined #archiveteam-bs
17:23 Honno has joined #archiveteam-bs
17:24 ReimuHaku has joined #archiveteam-bs
17:25 SHODAN_UI has quit IRC (Remote host closed the connection)
17:41 godane SketchCow: so this guy did a ton of scans of New Computer Express: https://archive.org/details/@zzapmort
18:30 arkiver yeah, lists of URLs are the most important for these new sites shutting down
18:30 arkiver we can try to contact them
18:31 arkiver wut, imagebam shutting down?
18:31 arkiver huh, all of them?
18:31 arkiver that's big
18:32 arkiver Can we create a list of what exactly is shutting down and how it is all connected?
18:32 JAA ImgBox, ImageBam, SendVid are all Flixya Entertainment, LLC services
18:33 JAA EroShare is the other service shutting down on 30 June. Not related to Flixya, at least publicly.
18:34 arkiver I hope there's some way to list them without the IDs in the URLs
18:35 arkiver list the content on them*
18:35 godane has quit IRC (Ping timeout: 245 seconds)
18:39 godane has joined #archiveteam-bs
19:12 SHODAN_UI has joined #archiveteam-bs
19:19 ItsYoda has quit IRC (Quit: rippppp to the yoda you used to know!)
19:22 SketchCow godane: I've written them for permission to re-render them as readable. Good catch
19:25 ZexaronS has joined #archiveteam-bs
19:28 godane SketchCow: i only found it cause i was looking at retropdfs.wordpress.com
19:29 godane and the guy had tons of New Computer Express missing in his collection
19:29 godane so i started looking for the magazine and found it on archive.org
19:30 godane SketchCow: he also has tons of Commodore inlays
19:31 schbirid has joined #archiveteam-bs
19:33 godane i did find it weird that he started using tiff for issues 131 and 135
19:33 godane based on what i can tell the rest are just jpgs in zips
19:34 gui7 has quit IRC (Read error: Operation timed out)
19:59 ItsYoda has joined #archiveteam-bs
21:04 Kaz is there a channel for eroshare-related stuff yet?
21:11 ndiddy has joined #archiveteam-bs
21:15 xmc no, maybe it should be #nofap though
21:17 tklk +1 for xmc's suggestion
21:18 Kaz I'll sit in it, if that's the route we'll go
21:18 Kaz not sure if we're actually going to grab any of it though, does the archive *want* the data?
21:43 arkiver haha
21:44 timmc I'm sure people 200 years from now would love to be able to look back at our quaint porn.
21:45 arkiver yep
21:47 timmc "oh hah they still used their bodies back then, unlike now with our VR quantum hyperfornication"
21:49 DFJustin judging by item view counts, way more people want porn than anything else in the archive.org collections
21:50 Nazca pixiv is done? nice
21:50 Nazca should change the archiveteam's choice then
21:51 MrRadar Not quite, Nazca. We're going to grab tags and then do another pass for "R18" rooms
21:51 MrRadar Which require an account
21:53 Nazca ETA for that?
21:54 arkiver chfoo: how can I export the out items from a warrior project? in this case it's for pixiv
21:54 arkiver yipdw might know too ^
21:57 Nazca welp
21:57 Nazca my current project page broke
21:57 Nazca it's completely empty now
21:57 Nazca no matter what project I pick
21:57 Nazca I already restarted the VM
21:57 Nazca how weird
22:01 Nazca hard booting without using the web menu worked
22:17 chfoo arkiver: something like: redis-cli zrange pixiv:out 0 -1 > pixiv_out.txt
22:26 SHODAN_UI has quit IRC (Remote host closed the connection)
22:29 Honno has quit IRC (Read error: Operation timed out)
22:59 icedice has quit IRC (Quit: Leaving)
23:02 yipdw arkiver: I don't know of a function in the tracker, but if you can ssh into tracker.archiveteam.org, you can run redis-cli zrange pixiv:out 0 -1
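The `redis-cli zrange pixiv:out 0 -1` command suggested by chfoo and yipdw reads every member of a Redis sorted set. A rough Python equivalent might look like the sketch below. It is not part of the tracker: `export_sorted_set` is a hypothetical helper, and `client` is assumed to be any object with a `zrange(key, start, end)` method, such as a redis-py `redis.Redis` instance connected to the tracker's Redis.

```python
def export_sorted_set(client, key, out_path):
    """Write every member of the sorted set `key` to `out_path`, one per line.

    `client` is assumed to expose zrange(key, start, end) like redis-py does.
    Returns the number of members written.
    """
    # 0 .. -1 selects the whole set, mirroring the redis-cli command above.
    members = client.zrange(key, 0, -1)
    with open(out_path, "w", encoding="utf-8") as out:
        for member in members:
            # redis-py returns bytes unless decode_responses=True is set.
            if isinstance(member, bytes):
                member = member.decode("utf-8", errors="replace")
            out.write(member + "\n")
    return len(members)
```

With redis-py installed, a call would look something like `export_sorted_set(redis.Redis(), "pixiv:out", "pixiv_out.txt")`, assuming it runs on a host that can reach the tracker's Redis. For a very large set, the plain `redis-cli` redirect from the chat is simpler and avoids holding the whole list in Python memory.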
23:28 ZexaronS has quit IRC (Leaving)
