Time |
Nickname |
Message |
00:06
π
|
arkiver |
http://tracker.archiveteam.org/gamefrontforums/ is now active! |
00:43
π
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
00:44
π
|
|
WinterFox has joined #archiveteam |
00:46
π
|
|
sigkell has quit IRC (Ping timeout: 260 seconds) |
00:48
π
|
|
philpem has quit IRC (Ping timeout: 260 seconds) |
00:57
π
|
|
sigkell has joined #archiveteam |
01:17
π
|
|
JesseW has joined #archiveteam |
01:43
π
|
|
Honno has quit IRC (Read error: Operation timed out) |
01:53
π
|
|
xmc has quit IRC (Read error: Operation timed out) |
01:53
π
|
|
mismatch has joined #archiveteam |
01:54
π
|
|
Fletcher has quit IRC (Read error: Operation timed out) |
01:54
π
|
|
mismatch_ has quit IRC (Read error: Operation timed out) |
01:54
π
|
|
Famicoma1 has quit IRC (Read error: Operation timed out) |
01:55
π
|
|
xmc has joined #archiveteam |
01:55
π
|
|
swebb sets mode: +o xmc |
01:55
π
|
|
robink has quit IRC (Read error: Connection reset by peer) |
01:56
π
|
|
Fletcher has joined #archiveteam |
01:59
π
|
|
robink has joined #archiveteam |
02:03
π
|
MrRadar |
arkiver: In the gamefrontforums grab did you mean to include the geo-IP block check for downloading GameFront files? |
02:03
π
|
MrRadar |
I don't think the forums have the same block |
02:04
π
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
02:08
π
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
02:13
π
|
|
dashcloud has joined #archiveteam |
02:48
π
|
|
Famicoma1 has joined #archiveteam |
03:55
π
|
|
JesseW has joined #archiveteam |
03:59
π
|
JesseW |
arkiver: since I'm not banned from gamefront, shall I stay with that one, rather than switching over to forums? |
04:28
π
|
JesseW |
up to 6 concurrency on gamefront |
04:47
π
|
|
Sk1d has quit IRC (Ping timeout: 194 seconds) |
04:53
π
|
|
Sk1d has joined #archiveteam |
05:24
π
|
|
metalcamp has joined #archiveteam |
05:39
π
|
|
Honno has joined #archiveteam |
05:43
π
|
|
BlueMaxim has joined #archiveteam |
06:21
π
|
|
signius has joined #archiveteam |
06:22
π
|
|
mismatch has quit IRC (Ping timeout: 633 seconds) |
06:33
π
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
07:24
π
|
|
Honno has quit IRC (Read error: Operation timed out) |
07:30
π
|
|
schbirid has joined #archiveteam |
07:35
π
|
|
morbus_ has joined #archiveteam |
07:41
π
|
|
Morbus has quit IRC (Read error: Operation timed out) |
07:57
π
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
08:01
π
|
|
redlob has quit IRC (Read error: Operation timed out) |
08:02
π
|
|
redlob has joined #archiveteam |
08:09
π
|
|
metalcamp has joined #archiveteam |
08:37
π
|
|
Wuked has joined #archiveteam |
09:14
π
|
|
Smiley has joined #archiveteam |
09:25
π
|
|
atomotic has joined #archiveteam |
09:34
π
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
09:35
π
|
|
bwn has quit IRC (Ping timeout: 492 seconds) |
09:45
π
|
|
metalcamp has joined #archiveteam |
09:50
π
|
|
bwn has joined #archiveteam |
10:15
π
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
10:25
π
|
|
Wuked has quit IRC (My Mac has gone to sleep. ZZZzzz…) |
10:35
π
|
|
metalcamp has joined #archiveteam |
10:35
π
|
|
Wuked has joined #archiveteam |
10:41
π
|
|
Wuked has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…) |
10:53
π
|
|
Wuked has joined #archiveteam |
11:07
π
|
|
Wuked has quit IRC (My Mac has gone to sleep. ZZZzzz…) |
11:26
π
|
|
Lord_Nigh has quit IRC (Ping timeout: 244 seconds) |
11:29
π
|
|
Crocatowa has quit IRC (Read error: Operation timed out) |
11:30
π
|
|
Crocatowa has joined #archiveteam |
11:35
π
|
|
Medowar has joined #archiveteam |
11:37
π
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
11:41
π
|
|
vitzli has joined #archiveteam |
11:45
π
|
|
Lord_Nigh has joined #archiveteam |
11:54
π
|
|
kris33 has joined #archiveteam |
12:02
π
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
12:25
π
|
|
RichardG has quit IRC (Ping timeout: 260 seconds) |
12:25
π
|
|
atomotic has joined #archiveteam |
12:32
π
|
|
RichardG has joined #archiveteam |
12:45
π
|
|
Honno has joined #archiveteam |
12:47
π
|
|
Wuked has joined #archiveteam |
12:59
π
|
|
kris33 has quit IRC (Textual IRC Client: www.textualapp.com) |
13:06
π
|
|
Honno has quit IRC (Ping timeout: 1208 seconds) |
13:23
π
|
|
suggestio has joined #archiveteam |
13:26
π
|
suggestio |
Image hosting site run by ThePirateBay crew has been temporarily revived after sudden shutdown in 2014. Old images now accessible again. http://bayimg.com/ |
13:28
π
|
|
BlueMaxim has quit IRC (Read error: Operation timed out) |
13:33
π
|
phuzion |
suggestio: Know anything about how their image URLs are generated? |
13:33
π
|
|
suggestio has quit IRC (Ping timeout: 268 seconds) |
13:53
π
|
|
VADemon has joined #archiveteam |
13:53
π
|
|
scyther has joined #archiveteam |
13:53
π
|
|
scyther has quit IRC (Connection closed) |
14:26
π
|
|
metalcamp has joined #archiveteam |
14:43
π
|
joepie91 |
phuzion: potential vector for mapping stuff out: http://bayimg.com/album/ |
14:43
π
|
joepie91 |
it seems to try to load everything |
14:44
π
|
phuzion |
I can't imagine that it's more than a few TB, think it might be worth trying to archive? |
14:44
π
|
joepie91 |
everything lives on image.bayimg.com |
14:44
π
|
joepie91 |
seemingly using hashes of files |
14:44
π
|
joepie91 |
http://image.bayimg.com/d3099f010b848bd079b53d0c985e409f67914928.jpg |
14:44
π
|
phuzion |
gross |
14:44
π
|
joepie91 |
the 'view' pages are easier |
14:44
π
|
joepie91 |
http://bayimg.com/PaiLPAAgH |
14:44
π
|
phuzion |
lol Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 32 bytes) in /var/data/bayimg.com/www/ajax_album.php on line 70 |
14:44
π
|
joepie91 |
album sample: http://bayimg.com/album/MAANGaaaC |
14:45
π
|
joepie91 |
phuzion: yes, I think it tries to list everything |
14:45
π
|
joepie91 |
note the ajax_ prefix |
14:45
π
|
phuzion |
yeah |
14:45
π
|
joepie91 |
I think it's a sort of API as well |
14:45
π
|
|
atomotic has quit IRC (Ping timeout: 260 seconds) |
14:45
π
|
joepie91 |
might be able to enumerate it with certain params |
14:45
π
|
joepie91 |
ah, hold on |
14:46
π
|
joepie91 |
http://bayimg.com/album/- also fails |
14:47
π
|
joepie91 |
it seems to always fail.. |
14:48
π
|
joepie91 |
strange... |
14:50
π
|
joepie91 |
Google knows a lot of them anyway |
14:51
π
|
joepie91 |
interesting |
14:51
π
|
joepie91 |
phuzion: the image IDs are NOT randomly generated |
14:52
π
|
joepie91 |
phuzion: first two pages of Google: https://gist.github.com/joepie91/ac014769a3446e074d62c2792b9c05b2 |
14:52
π
|
joepie91 |
entirely too consistent |
14:52
π
|
joepie91 |
always a lowercase or uppercase A in the 2nd 6th position |
14:53
π
|
joepie91 |
2nd and 6th* |
14:53
π
|
joepie91 |
various other apparent patterns |
14:53
π
|
joepie91 |
always a lowercase or uppercase A in the 7th posuition to, it seems |
14:53
π
|
joepie91 |
position too* |
14:54
π
|
PurpleSym |
bing results: http://paste.nerds.io/edixocitog.txt |
14:55
π
|
joepie91 |
yeah, very not random |
14:55
π
|
joepie91 |
lol |
14:58
π
|
joepie91 |
PurpleSym: phuzion: combined and sorted: http://sprunge.us/ceTE |
14:58
π
|
joepie91 |
time to find patterns :P |
15:04
π
|
|
Wuked has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…) |
15:06
π
|
PurpleSym |
Ids seem to be case-insensitive. |
15:06
π
|
joepie91 |
whoa, fucking seriously? |
15:07
π
|
joepie91 |
wow |
15:07
π
|
joepie91 |
that is great |
15:10
π
|
joepie91 |
PurpleSym: updated list: http://sprunge.us/UiPM |
15:10
π
|
joepie91 |
seems the 8th position is just a-f? |
15:12
π
|
PurpleSym |
Could be a coincidence. The sample size is quite small. |
15:12
π
|
joepie91 |
seems unlikely |
15:12
π
|
joepie91 |
PurpleSym: can you get a larger list? |
15:12
π
|
joepie91 |
via Google or w/e |
15:13
π
|
PurpleSym |
I could try the common crawl index. |
15:14
π
|
joepie91 |
please do :P |
15:15
π
|
VADemon |
What's the status of maxfile.ro? Has anyone more detailed information? |
15:16
π
|
|
atomotic has joined #archiveteam |
15:16
π
|
VADemon |
PurpleSym: I've grabbed yandex, duckduckgo and bing for maxfile.ro. Turns out 50 of your results were still unique to mine |
15:17
π
|
PurpleSym |
Better get them all, VADemon. |
15:17
π
|
PurpleSym |
joepie91: Nothing in the Common Crawl Index, as far as I see. |
15:17
π
|
|
atomotic has quit IRC (Client Quit) |
15:17
π
|
joepie91 |
PurpleSym: try Google then? |
15:17
π
|
|
atomotic has joined #archiveteam |
15:18
π
|
PurpleSym |
I don’t have scripts for that. |
15:18
π
|
joepie91 |
PurpleSym: if you use Chrome, "Link Grabber" is greatly useful for this |
15:18
π
|
joepie91 |
lets you ignore internal stuff |
15:19
π
|
joepie91 |
and only extract actual search results |
15:19
π
|
joepie91 |
makes it a semi-automated process |
15:22
π
|
PurpleSym |
Well, I don’t. |
15:31
π
|
|
Rotab has joined #archiveteam |
15:31
π
|
joepie91 |
PurpleSym: char frequency counts: http://storage2.static.itmages.com/i/16/0426/h_1461684754_5932817_5afad23c2c.png |
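A per-position character count like the one in that screenshot can be reproduced with a few lines of Python. This is only a sketch: the sample IDs are ones quoted elsewhere in this log rather than the full sprunge list, and the function name is made up for illustration. IDs are lowercased first because they appear to be case-insensitive.

```python
from collections import Counter


def position_frequencies(ids):
    """Return one Counter per character position, over lowercased IDs."""
    width = max(len(image_id) for image_id in ids)
    counters = [Counter() for _ in range(width)]
    for image_id in ids:
        for pos, ch in enumerate(image_id.lower()):
            counters[pos][ch] += 1
    return counters


# A handful of image IDs taken from URLs quoted in this log (not the full list).
sample = ["PaiLPAAgH", "aAimFAAgH", "aaiMgaagH", "aAiMHaagh", "cAIMfAaGH"]
freqs = position_frequencies(sample)

for pos, counter in enumerate(freqs, start=1):
    print(pos, sorted(counter.items()))
```

On this tiny sample, positions 2, 6 and 7 contain only "a", which matches the pattern described in the channel.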
15:32
π
|
|
Wuked has joined #archiveteam |
15:34
π
|
joepie91 |
http://bayimg.com/fajkkaadd and http://bayimg.com/fajkkaaddd are the same image |
15:34
π
|
joepie91 |
so it ignores everything beyond 8 chars |
15:34
π
|
joepie91 |
also doesn't seem to go beyond P anywhere |
15:34
π
|
Rotab |
are you murdering gamefront (filefront?) forums? |
15:35
π
|
joepie91 |
so... |
15:35
π
|
* |
joepie91 makes permutation calc |
15:35
π
|
MrRadar |
Rotab: we did start archiving them last night, so probably |
15:35
π
|
MrRadar |
Ping arkiver ^^^^ |
15:35
π
|
joepie91 |
phuzion: PurpleSym: 6291456 permutations |
15:35
π
|
joepie91 |
I think we can pull that off |
15:35
π
|
joepie91 |
(for bayimg) |
15:36
π
|
phuzion |
6.2m, that's not bad |
15:36
π
|
PurpleSym |
Sure, that’s not too bad. |
15:36
π
|
Rotab |
i cant even join in on gamefront, the ip check fails :S |
15:37
π
|
PurpleSym |
joepie91: Wrt frequency counts: Could be an increasing 32 bit counter with nibbles shuffled around. |
15:37
π
|
PurpleSym |
*shifted |
15:37
π
|
VADemon |
Rotab: doesn't seem to be caused by us, the file downloading from their servers still works for me |
15:38
π
|
MrRadar |
The issue is not the file hosting, it's the forums |
15:38
π
|
MrRadar |
I can access them but it takes about 10 seconds for each page to load |
15:38
π
|
|
luckcolor has joined #archiveteam |
15:38
π
|
luckcolor |
Hello |
15:39
π
|
MrRadar |
Hello |
15:39
π
|
|
WinterFox has quit IRC (Remote host closed the connection) |
15:39
π
|
Rotab |
yeah, it is very slow |
15:39
π
|
joepie91 |
PurpleSym: that goes beyond my capabilities :) |
15:39
π
|
joepie91 |
the distribution is a bit odd but I think we can just treat it as randomized |
15:39
π
|
joepie91 |
with the given ranges |
15:39
π
|
Rotab |
although the forumgrab needlessly checks if you can download files |
15:39
π
|
joepie91 |
and have it be Good Enough |
15:39
π
|
MrRadar |
Yeah, I mentioned that last night but arkiver is AFK |
15:40
π
|
PurpleSym |
Yeah, that should work fine, joepie91. |
15:44
π
|
|
luckcolor has quit IRC (Quit: Page closed) |
15:48
π
|
|
JesseW has joined #archiveteam |
15:49
π
|
|
atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…) |
15:53
π
|
|
atomotic has joined #archiveteam |
15:56
π
|
|
atomotic has quit IRC (Client Quit) |
15:58
π
|
VADemon |
- Started grabbing maxfile.ro, with 660 links from search engines - |
15:59
π
|
PurpleSym |
joepie91: Three pictures I just uploaded in this order: http://bayimg.com/aAimFAAgH http://bayimg.com/aaiMgaagH http://bayimg.com/aAiMHaagh |
16:02
π
|
joepie91 |
incremental? excellent :) |
16:03
π
|
joepie91 |
huh |
16:03
π
|
joepie91 |
that's odd |
16:03
π
|
joepie91 |
those are 9 chars |
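The "incremental" observation can be checked directly on the three IDs PurpleSym posted. A minimal sketch, assuming case is irrelevant (as noted earlier in the channel); the helper name is hypothetical.

```python
def differing_positions(a, b):
    """Return the 1-based positions at which two equal-length strings differ."""
    return [pos for pos, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# The three IDs uploaded in order, lowercased because IDs are case-insensitive.
uploaded = ["aAimFAAgH", "aaiMgaagH", "aAiMHaagh"]
normalized = [image_id.lower() for image_id in uploaded]

lengths = [len(image_id) for image_id in normalized]
steps = [differing_positions(a, b) for a, b in zip(normalized, normalized[1:])]

print(normalized)  # ['aaimfaagh', 'aaimgaagh', 'aaimhaagh']
print(lengths)     # [9, 9, 9]
print(steps)       # [[5], [5]]
```

Only the 5th character changes between consecutive uploads (f, g, h), and each ID is 9 characters long.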
16:19
π
|
|
JesseW has quit IRC (Ping timeout: 370 seconds) |
16:40
π
|
arkiver |
hi |
16:40
π
|
arkiver |
I limited filefront |
16:40
π
|
arkiver |
What's up with bayimg and maxfile.ro? |
16:43
π
|
HCross |
its come back up for a bit |
16:44
π
|
VADemon |
arkiver: Started mirroring with 660 links off search engines |
16:44
π
|
arkiver |
Ah yeah, it was shutting down |
16:44
π
|
arkiver |
Great! |
16:44
π
|
|
vitzli has quit IRC (Quit: Leaving) |
16:48
π
|
joepie91 |
arkiver: can you write something for bayimg given the above information? |
16:48
π
|
joepie91 |
may be a warrior project thing |
16:48
π
|
arkiver |
sure |
16:48
π
|
joepie91 |
I'm not sure about the 9th char though |
16:48
π
|
arkiver |
what is wrong with the website |
16:48
π
|
arkiver |
I haven't read it all |
16:48
π
|
joepie91 |
it ignores everything after the 8th char but we did get a 9th char |
16:48
π
|
joepie91 |
arkiver: https://torrentfreak.com/pirate-bays-image-hosting-site-bayimg-returns-for-a-bit-160425/ |
16:48
π
|
joepie91 |
"The site will remain online for a week or so. This allows people to secure their files, if needed, but in a few days the site will close its doors again. Apparently, the TPB team prefers to focus exclusively on the torrent site." |
16:49
π
|
joepie91 |
that sounds like an invitation ;) |
16:51
π
|
joepie91 |
so it seems like they've simply shifted it by an A |
16:51
π
|
joepie91 |
for the newer uploads |
16:52
π
|
joepie91 |
maybe they ran out of keyspace? |
16:52
π
|
joepie91 |
hm, maybe not |
16:52
π
|
joepie91 |
wow, nevermind |
16:52
π
|
joepie91 |
I'm blind |
16:52
π
|
joepie91 |
it has always been 9 chars, not 8 |
16:52
π
|
joepie91 |
ignore everything I just said |
16:52
π
|
joepie91 |
kik |
16:53
π
|
joepie91 |
lol* |
16:57
π
|
arkiver |
I'll have a look at the site this evening |
17:08
π
|
|
Honno has joined #archiveteam |
17:12
π
|
|
philpem has joined #archiveteam |
17:16
π
|
joepie91 |
arkiver: alright. |
17:16
π
|
joepie91 |
arkiver: the essential information: |
17:16
π
|
joepie91 |
1) char frequency information: http://storage2.static.itmages.com/i/16/0426/h_1461684754_5932817_5afad23c2c.png |
17:16
π
|
joepie91 |
2) everything after 9 chars is ignored (so you can ignore `position 9` there) |
17:16
π
|
joepie91 |
3) image IDs are case-insensitive |
17:17
π
|
joepie91 |
4) album URLs are linked from the pages of the images that belong to them, so you can discover albums just by scraping the "Album" buttons (eg. http://bayimg.com/PaiLPAAgH ) |
17:18
π
|
joepie91 |
5) nothing ever goes beyond P in the image IDs |
17:19
π
|
MrRadar |
arkiver: For the FileFront Forums did you mean to include the GameFront geo-IP block check in the scripts? I don't think the forums have any geoblocking |
17:23
π
|
joepie91 |
arkiver: oh and 6) it's gone in a week :) |
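Observations 2, 3 and 5 above suggest a simple way to canonicalise an image ID before queueing it. The sketch below is an assumption about how one might do that, not the site's own logic; the function names and the `ALPHABET` constant are invented for illustration.

```python
ID_LENGTH = 9                   # everything after 9 chars is ignored
ALPHABET = "abcdefghijklmnop"   # nothing observed beyond "p"


def normalize_image_id(raw):
    """Truncate to 9 characters and lowercase (IDs are case-insensitive)."""
    return raw[:ID_LENGTH].lower()


def looks_valid(raw):
    """Check length and alphabet against the patterns observed so far."""
    normalized = normalize_image_id(raw)
    return len(normalized) == ID_LENGTH and all(ch in ALPHABET for ch in normalized)


print(normalize_image_id("fajkkaaddd"))  # fajkkaadd
print(normalize_image_id("PaiLPAAgH"))   # pailpaagh
print(looks_valid("PaiLPAAgH"))          # True
```

Normalising first means two spellings of the same ID are only fetched once.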
17:24
π
|
joepie91 |
okay, so some quick calculation work |
17:24
π
|
joepie91 |
6 million permutations and change |
17:24
π
|
joepie91 |
in a week's time |
17:25
π
|
joepie91 |
say 1 million permutations a day |
17:25
π
|
joepie91 |
works out to ~12 requests per second |
17:25
π
|
joepie91 |
not sure they're going to be able to handle that |
17:25
π
|
joepie91 |
and that's just for the images, not the discovered albums |
17:25
π
|
joepie91 |
they're already slow, so chances are they will start complaining at us |
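The request-rate estimate above is easy to redo. A small sketch using only the figures mentioned in the channel (6291456 permutations, one week, and the rounder "1 million a day"):

```python
SECONDS_PER_DAY = 24 * 60 * 60

permutations = 6291456      # figure quoted earlier in the channel
days_available = 7          # "it's gone in a week"

# Rate needed to cover the whole keyspace within the week.
rate_full = permutations / (days_available * SECONDS_PER_DAY)

# Rate for the rounder "1 million permutations a day" figure.
rate_million_per_day = 1_000_000 / SECONDS_PER_DAY

print(round(rate_full, 1))             # 10.4
print(round(rate_million_per_day, 1))  # 11.6
```

Either way it comes out at roughly 10 to 12 requests per second for image pages alone, before any album pages are added.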
18:19
π
|
|
Medowar has quit IRC (Quit: Connection closed for inactivity) |
18:27
π
|
SketchCow |
https://archive.org/details/roiocollection |
18:37
π
|
|
hictooth has joined #archiveteam |
18:42
π
|
|
hictooth has quit IRC (Quit: Bye!) |
18:44
π
|
|
Peetz0r has quit IRC (Read error: Operation timed out) |
18:48
π
|
|
hictooth has joined #archiveteam |
19:03
π
|
|
BartoCH has joined #archiveteam |
19:18
π
|
SketchCow |
If gangsta art and music is your thing, you're in luck with https://archive.org/details/@sketch_the_cow?and[]=mediatype%3A%22audio%22&and[]=collection:audio |
19:18
π
|
SketchCow |
https://www.flickr.com/photos/textfiles/sets/72157594265759470 is getting all the papers I'm now scanning. |
19:19
π
|
SketchCow |
https://www.flickr.com/photos/textfiles/albums/72157663634874672 is getting all CD-ROM faces I'm scanning (ISOs will go up on hard drives mailed to IA) |
19:22
π
|
SketchCow |
I'm also describing Negativland items, but that's once every 5-8 hours, that's hardly worth noting. |
19:27
π
|
|
bwn has quit IRC (Ping timeout: 246 seconds) |
19:30
π
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
19:33
π
|
|
dashcloud has joined #archiveteam |
19:56
π
|
|
sivoais has quit IRC (Read error: Operation timed out) |
20:03
π
|
|
Wuked has quit IRC (Read error: Connection reset by peer) |
20:03
π
|
|
bwn has joined #archiveteam |
20:07
π
|
|
sivoais has joined #archiveteam |
20:07
π
|
VADemon |
joepie91, bayimg: if my bruteforce string generator is correct and we exclude strings which has "a" in positions 2,6,7 then our search space is totalling at 1,048,600 URLs |
20:08
π
|
|
Wuked has joined #archiveteam |
20:09
π
|
VADemon |
arkiver: basically I've generated items for warrior here: https://github.com/VADemon/bayimg-brute/blob/40f3bfd130e3e405ec83dd874ee8990b9c0bc192/bayimg-portion-list.txt portion;<id>;<starting string>;<ending string>;<endString length>;<total strings in this item> |
20:11
π
|
VADemon |
each individual string would be generated on the fly by lua and given to wget-lua, that's how I imagine this to work |
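For anyone consuming that portion list, a parser for the semicolon-separated layout VADemon describes might look like the sketch below. The field names are my own labels for the six fields, and the example line is invented for illustration; it is not copied from the linked file.

```python
from typing import NamedTuple


class Portion(NamedTuple):
    item_id: str
    start: str
    end: str
    end_length: int
    total: int


def parse_portion(line):
    """Parse 'portion;<id>;<start>;<end>;<endString length>;<total>'."""
    fields = line.strip().split(";")
    if len(fields) != 6 or fields[0] != "portion":
        raise ValueError(f"unexpected portion line: {line!r}")
    _, item_id, start, end, end_length, total = fields
    return Portion(item_id, start, end, int(end_length), int(total))


# Hypothetical line, made up to show the layout only.
example = "portion;1;aaaaaaaaa;aaaapaaaa;9;16"
portion = parse_portion(example)
print(portion)
```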
20:11
π
|
schbirid |
can anyone recommend a feed reader that stores warc files of each post automatically? |
20:13
π
|
|
Wuked has quit IRC (Read error: Connection reset by peer) |
20:13
π
|
joepie91 |
VADemon: huh. hold on. |
20:14
π
|
|
Wuked has joined #archiveteam |
20:14
π
|
joepie91 |
VADemon: |
20:14
π
|
joepie91 |
> 16 * 1 * 16 * 16 * 16 * 1 * 1 * 6 * 16 |
20:14
π
|
joepie91 |
6291456 |
20:14
π
|
joepie91 |
what am I missing? |
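joepie91's product can be reproduced as below. The per-position choice counts come straight from the expression above; the comparison with VADemon's figure is only a guess at where the two numbers diverge, not something confirmed in the channel.

```python
import math

# Choices per position as in joepie91's expression:
# 16 letters (a-p) where the position varies freely, 1 where it is always "a",
# and 6 (a-f) for the 8th position.
choices_per_position = [16, 1, 16, 16, 16, 1, 1, 6, 16]

total = math.prod(choices_per_position)
print(total)  # 6291456

# Without the factor of 6 for the 8th position the product is 16**5,
# which is close to (but not exactly) the 1,048,600 VADemon reported.
without_eighth = total // 6
print(without_eighth)  # 1048576
```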
20:18
π
|
|
Sanqui has quit IRC (Remote host closed the connection) |
20:19
π
|
|
Sanqui has joined #archiveteam |
20:31
π
|
|
Medowar has joined #archiveteam |
20:37
π
|
|
Wuked_ has joined #archiveteam |
20:37
π
|
|
Wuked has quit IRC (Read error: Connection reset by peer) |
20:38
π
|
|
Ravenloft has joined #archiveteam |
20:46
π
|
|
Wuked_ has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
20:51
π
|
|
schbirid has quit IRC (Quit: Leaving) |
20:54
π
|
|
Lord_Nigh has quit IRC (Ping timeout: 250 seconds) |
20:54
π
|
|
metalcamp has quit IRC (Ping timeout: 244 seconds) |
20:54
π
|
|
Lord_Nigh has joined #archiveteam |
21:13
π
|
|
Emcy_ has joined #archiveteam |
21:15
π
|
|
Emcy has quit IRC (Ping timeout: 246 seconds) |
21:24
π
|
|
Emcy_ has quit IRC (Ping timeout: 246 seconds) |
21:25
π
|
|
Emcy has joined #archiveteam |
21:37
π
|
|
Emcy has quit IRC (Ping timeout: 370 seconds) |
22:25
π
|
arkiver |
bayimg is not case sensitive. It seems to randomly use some case |
22:25
π
|
|
Honno has quit IRC (Read error: Operation timed out) |
22:29
π
|
arkiver |
we'll get it |
22:30
π
|
arkiver |
they also have albums and tags |
22:30
π
|
arkiver |
will have to add some discovery for that |
22:30
π
|
arkiver |
Who has some rsync space for the discovery part? |
22:33
π
|
joepie91 |
arkiver: see above, albums can be derived from images |
22:35
π
|
arkiver |
Yeah, http://bayimg.com/cAIMfAaGH the album button |
22:52
π
|
arkiver |
will be 458752 items, 16 images/item |
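arkiver's item count implies a total URL count that can be worked out directly. The factorisation at the end is my own observation and only suggests, without confirming, that the 8th position is being given 7 possible values here rather than 6.

```python
items = 458752
images_per_item = 16

total_images = items * images_per_item
print(total_images)  # 7340032

# The total happens to equal 7 * 16**5, i.e. five free positions (a-p)
# and seven values for one more position. This is an inference, not
# something stated in the channel.
print(total_images == 7 * 16 ** 5)  # True
```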
23:20
π
|
|
JW_work has quit IRC (Read error: Operation timed out) |
23:29
π
|
|
JW_work has joined #archiveteam |
23:29
π
|
joepie91 |
ack |
23:31
π
|
|
dashcloud has quit IRC (Read error: Operation timed out) |
23:43
π
|
|
Ymgve__ has joined #archiveteam |
23:46
π
|
|
Ymgve has quit IRC (Ping timeout: 506 seconds) |
23:47
π
|
|
dashcloud has joined #archiveteam |
23:50
π
|
|
Ravenloft has quit IRC (Ping timeout: 260 seconds) |
23:52
π
|
|
Rye has quit IRC (Ping timeout: 244 seconds) |
23:52
π
|
|
Ravenloft has joined #archiveteam |
23:58
π
|
|
Rye has joined #archiveteam |