#archiveteam-bs 2017-06-09,Fri


Who What When
***dashcloud has quit IRC (Ping timeout: 245 seconds)
j08nY has quit IRC (Quit: Leaving)
dashcloud has joined #archiveteam-bs
[00:05]
......... (idle for 43mn)
dashcloud has quit IRC (Ping timeout: 245 seconds) [00:50]
dashcloud has joined #archiveteam-bs [00:58]
......... (idle for 42mn)
BlueMaxim has joined #archiveteam-bs [01:40]
tfgbd_znc has joined #archiveteam-bs [01:52]
.... (idle for 17mn)
REiN^ has quit IRC (Read error: Operation timed out)
REiN^ has joined #archiveteam-bs
[02:09]
....... (idle for 34mn)
ndiddy has quit IRC () [02:44]
dashcloud has quit IRC (Read error: Operation timed out) [02:51]
............. (idle for 1h0mn)
Stilett0 has joined #archiveteam-bs
Stilett0 is now known as Stiletto
phuzion has quit IRC (Ping timeout: 600 seconds)
phuzion has joined #archiveteam-bs
[03:51]
pizzaiolo has joined #archiveteam-bs
pizzaiolo has quit IRC (Client Quit)
[04:07]
<godane> SketchCow: just read your post on retromags
that's good that he's just been resting
[04:15]
......... (idle for 40mn)
***Sk1d has quit IRC (Ping timeout: 194 seconds) [04:56]
Sk1d has joined #archiveteam-bs [05:01]
.... (idle for 17mn)
SHODAN_UI has joined #archiveteam-bs [05:18]
............. (idle for 1h1mn)
kristian_ has joined #archiveteam-bs [06:19]
<ranma> archivebot ignores robots.txt, right? [06:27]
<dxrt> ranma: yes [06:34]
...... (idle for 29mn)
***j08nY has joined #archiveteam-bs
SHODAN_UI has quit IRC (Remote host closed the connection)
ivan has quit IRC (Leaving)
ivan has joined #archiveteam-bs
[07:03]
........ (idle for 37mn)
j08nY has quit IRC (Read error: Operation timed out) [07:48]
........ (idle for 37mn)
<JAA> 06-09 03:17:50 <@xmc> it shouldn't go above/outside the directory named in the urls that you give it -- I actually observed the opposite. If you archive https://example.com/ and there's a 302 redirect on https://example.com/foo to https://otherpage.net/, it will recursively grab otherpage.net as well.
Or maybe 301, don't remember.
Checked the logs, it was a 303 See Other on job a98hr9u2potfhw6ikf0dnbsua.
06-08 21:38:57 <@xmc> [!ao on Twitter] won't get more than the most recent posts, but it'll go much faster -- ArchiveBot won't grab the entire tweet history regardless of the options, even with phantomjs, in my experience.
[08:25]
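A minimal sketch of the scope rule xmc describes (stay within the directory named by the seed URL), to make the discrepancy JAA reports concrete. This is not ArchiveBot's or wpull's actual logic; the function name `in_scope` and the exact matching rule are assumptions made for illustration only.

```python
from urllib.parse import urlsplit


def in_scope(seed: str, candidate: str) -> bool:
    """Hypothetical scope check: same scheme and host, and the candidate
    path lies at or below the directory of the seed URL."""
    s, c = urlsplit(seed), urlsplit(candidate)
    # Assumption: a different scheme or host is always out of scope.
    if (s.scheme, s.netloc.lower()) != (c.scheme, c.netloc.lower()):
        return False
    # Directory of the seed = its path up to and including the last slash.
    seed_dir = s.path[: s.path.rfind("/") + 1] or "/"
    cand_path = c.path or "/"
    return cand_path.startswith(seed_dir)


if __name__ == "__main__":
    print(in_scope("https://example.com/", "https://example.com/foo"))
    print(in_scope("https://example.com/", "https://otherpage.net/"))
```

Under this rule the redirect target https://otherpage.net/ is out of scope for a job seeded with https://example.com/, so a crawler that recurses into it after a 303 is behaving differently from the rule as stated.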
.... (idle for 17mn)
***kristian_ has quit IRC (Quit: Leaving) [08:48]
SHODAN_UI has joined #archiveteam-bs [08:59]
..... (idle for 21mn)
Jonison has joined #archiveteam-bs [09:20]
....... (idle for 30mn)
Jonison has quit IRC (Quit: Leaving) [09:50]
icedice has joined #archiveteam-bs [09:55]
....... (idle for 30mn)
gui7 has joined #archiveteam-bs
j08nY has joined #archiveteam-bs
[10:25]
<gui7> ok so, question. there's this website in my native language that is an incredible treasure trove of soccer match data?
I just need a bit of help getting started... regex is rusty lol
[10:28]
<JAA> Link? [10:31]
...... (idle for 26mn)
***BlueMaxim has quit IRC (Read error: Operation timed out) [10:57]
....... (idle for 32mn)
<ranma> not that the average archivist would want to back this up, but https://www.reddit.com/r/DataHoarder/comments/6g4c3p/erosharecom_nsfw_shutting_down_june_30th/ [11:29]
<JAA> Yeah, EroShare, ImgBox, ImageBam, and SendVid are all shutting down end of June. [11:32]
<ranma> TIL sendvid. someone mentioned imgbox, imagebam when i mentioned that link [11:33]
........ (idle for 39mn)
***phuzion has quit IRC (Remote host closed the connection) [12:12]
phuzion has joined #archiveteam-bs [12:24]
.... (idle for 18mn)
pizzaiolo has joined #archiveteam-bs [12:42]
......... (idle for 44mn)
<joepie91> don't see why there couldn't be a project for it [13:26]
.... (idle for 15mn)
<JAA> Well, sure. We'll need a list of URLs though. EroShare and SendVid seem to use 8-char base-36 IDs (2.8 * 10^12 combinations), ImgBox 8-char base-62 IDs (2.2 * 10^14), ImageBam 14/15-char base-16 IDs (1.2 * 10^18). ImageBam also has galleries with 32-char base-36 IDs it seems (6.3 * 10^49 !)... [13:41]
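The ID-space figures in the message above can be reproduced with a short calculation. This is only a sketch: the alphabet sizes and lengths are the ones quoted in the message, while the helper names and the 1000 requests per second crawl rate are assumptions chosen for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def id_space(alphabet_size: int, length: int) -> int:
    """Number of distinct IDs of exactly `length` characters."""
    return alphabet_size ** length


def years_to_enumerate(total_ids: int, requests_per_second: float) -> float:
    """Time to try every ID once at a fixed (assumed) request rate."""
    return total_ids / requests_per_second / SECONDS_PER_YEAR


# (service, alphabet size, ID length) as quoted in the chat message.
SCHEMES = [
    ("EroShare / SendVid", 36, 8),
    ("ImgBox", 62, 8),
    ("ImageBam (14 chars)", 16, 14),
    ("ImageBam (15 chars)", 16, 15),
    ("ImageBam galleries", 36, 32),
]

if __name__ == "__main__":
    for name, base, length in SCHEMES:
        total = id_space(base, length)
        years = years_to_enumerate(total, 1000)  # hypothetical crawl rate
        print(f"{name}: {total:.1e} IDs, about {years:.3g} years at 1000 req/s")
```

Even the smallest space (36^8, about 2.8 * 10^12) would take roughly 89 years to walk at the assumed rate, which is why a list of known URLs matters more than brute-force enumeration. The quoted 1.2 * 10^18 for ImageBam matches the 15-character IDs, with or without the 14-character ones added.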
By the way, ImgBox, ImageBam, and SendVid are indeed operated by the same entity, Flixya Entertainment, LLC. [13:46]
They also ran VideoBam, ViRoll, and Snapixel previously. Apparently, shared.com was also theirs at some point, but it looks like they sold that.
ImageBam also has a second domain: imgbam.com
[13:58]
<Frogging> they're all shutting down at once? o.o [14:02]
<JAA> Yep. [14:04]
***ZexaronS has joined #archiveteam-bs [14:04]
<JAA> Not sure if there's a connection between EroShare and Flixya.
I guess it's possible Flixya also runs EroShare but doesn't want to be associated with it or something like that. (eroshare.com is registered through a whois proxy.)
But it might also just be a coincidence.
The other three are just Flixya still not having figured out how to run a profitable image hosting website.
image/video*
[14:04]
<Frogging> nobody has figured that out and that's why they all shut down eventually
:p
or become so ad-laden as to be unusable
[14:06]
<JAA> :-P [14:07]
<Frogging> I've noticed imgur is becoming more and more obnoxious with their redirecting of hotlinks [14:07]
<JAA> Yeah, same.
picyou.com and ucash.in were also Flixya's at some point, but now seem to belong to another party (like shared.com).
I found two more Flixya registrations: adhance.com, an advertising platform, and continue.com, a "traffic recapturing" service (think adf.ly). Both are now for sale.
Also sharedhq.com, which has this beautiful quote: "[Flixya etc. founder] Ivan [Wong] is a serial entrepreneur and veteran web producer. He has been featured on the “New York Times” and excels in online advertising, analytics and project management." Perhaps he misunderstood the verb "to excel" as "I can calculate some online advertising, analytics, and project management stuff in Microsoft Excel"
?
[14:07]
........ (idle for 39mn)
<joepie91> 'serial entrepreneur'
is that now the new term for "trying to be like Yahoo"?
[14:55]
..... (idle for 20mn)
<Frogging> hopping from one overfunded project to the next and leaving a trail of destruction
:p
[15:15]
***ZexaronS has quit IRC (Leaving) [15:17]
odemg has joined #archiveteam-bs
Gilfoyle has joined #archiveteam-bs
[15:28]
........................ (idle for 1h55mn)
Honno has joined #archiveteam-bs
ReimuHaku has joined #archiveteam-bs
SHODAN_UI has quit IRC (Remote host closed the connection)
[17:23]
.... (idle for 16mn)
<godane> SketchCow: so this guy did a ton of scans of New Computer Express: https://archive.org/details/@zzapmort [17:41]
.......... (idle for 49mn)
<arkiver> yeah, lists of URLs are the most important for these new sites shutting down
we can try to contact them
wut, imagebam shutting down?
huh, all of them?
that's big
Can we create a list of what is exactly shutting down and how it is all connected with each other?
[18:30]
<JAA> ImgBox, ImageBam, SendVid are all Flixya Entertainment, LLC services
EroShare is the other service shutting down on 30 June. Not related to Flixya, at least publicly.
[18:32]
<arkiver> I hope there's some way to list them without the IDs in the URLs
list the content on them*
[18:34]
***godane has quit IRC (Ping timeout: 245 seconds)
godane has joined #archiveteam-bs
[18:35]
....... (idle for 33mn)
SHODAN_UI has joined #archiveteam-bs [19:12]
ItsYoda has quit IRC (Quit: rippppp to the yoda you used to know!) [19:19]
<SketchCow> godane: I've written them for permission to re-render them as readable. Good catch [19:22]
***ZexaronS has joined #archiveteam-bs [19:25]
<godane> SketchCow: i only found it cause i was looking at retropdfs.wordpress.com
and the guy had tons of New Computer Express missing in his collection
so i started looking for the magazine and found it on archive.org
SketchCow: he also has tons of Commodore inlays
[19:28]
***schbirid has joined #archiveteam-bs [19:31]
<godane> i did find it weird that he started using tiff for issues 131 and 135
based on what i can tell the rest are just jpgs in zips
[19:33]
***gui7 has quit IRC (Read error: Operation timed out) [19:34]
...... (idle for 25mn)
ItsYoda has joined #archiveteam-bs [19:59]
.............. (idle for 1h5mn)
<Kaz> is there a channel for eroshare-related stuff yet? [21:04]
***ndiddy has joined #archiveteam-bs [21:11]
<xmc> no, maybe it should be #nofap though [21:15]
<tklk> +1 for xmc's suggestion [21:17]
<Kaz> I'll sit in it, if that's the route we'll go
not sure if we're actually going to grab any of it though, does the archive *want* the data?
[21:18]
...... (idle for 25mn)
<arkiver> haha [21:43]
<timmc> I'm sure people 200 years from now would love to be able to look back at our quaint porn. [21:44]
<arkiver> yep [21:45]
<timmc> "oh hah they still used their bodies back then, unlike now with our VR quantum hyperfornication" [21:47]
<DFJustin> judging by item view counts, way more people want porn than anything else in the archive.org collections [21:49]
<Nazca> pixiv is done? nice
should change the archiveteam's choice then
[21:50]
<MrRadar> Not quite, Nazca. We're going to grab tags and then do another pass for "R18" rooms
Which require an account
[21:51]
<Nazca> ETA for that? [21:53]
<arkiver> chfoo: how can I export the out items from a warrior project? in this case it's for pixiv
yipdw might know too ^
[21:54]
<Nazca> welp
my current project page broke
it's completely empty now
no matter what project I pick
I already restarted the VM
how weird
hard booting without using the web menu worked
[21:57]
.... (idle for 16mn)
<chfoo> arkiver: something like: redis-cli zrange pixiv:out 0 -1 > pixiv_out.txt [22:17]
***SHODAN_UI has quit IRC (Remote host closed the connection)
Honno has quit IRC (Read error: Operation timed out)
[22:26]
....... (idle for 30mn)
icedice has quit IRC (Quit: Leaving) [22:59]
<yipdw> arkiver: I don't know of a function in the tracker, but if you can ssh into tracker.archiveteam.org, you can run redis-cli zrange pixiv:out 0 -1 [23:02]
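The redis-cli one-liner suggested by chfoo and yipdw could be wrapped in a small script along these lines. This is a sketch only: it assumes `redis-cli` is on the PATH of the machine that can reach the tracker's Redis instance, and the function names and output file name are hypothetical. The key `pixiv:out` and the `zrange ... 0 -1` call are taken from the messages above.

```python
import subprocess
from typing import List


def zrange_all_command(key: str) -> List[str]:
    """Build the redis-cli invocation that lists every member of a sorted set
    (index 0 through -1, i.e. the whole range)."""
    return ["redis-cli", "zrange", key, "0", "-1"]


def export_sorted_set(key: str, out_path: str) -> None:
    """Run the command and write its output to a file, one member per line.
    Raises CalledProcessError if redis-cli exits with a non-zero status."""
    with open(out_path, "w", encoding="utf-8") as fh:
        subprocess.run(zrange_all_command(key), stdout=fh, check=True)


if __name__ == "__main__":
    # Key name as quoted in the chat; the output file name is arbitrary.
    export_sorted_set("pixiv:out", "pixiv_out.txt")
```

Keeping the command builder separate from the subprocess call makes it possible to check the arguments without a Redis server at hand.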
...... (idle for 26mn)
***ZexaronS has quit IRC (Leaving) [23:28]
