#archiveteam-bs 2017-06-06,Tue


Who | What | When
***gui7 has joined #archiveteam-bs
gui7 has quit IRC (Client Quit)
BlueMaxim has joined #archiveteam-bs
[00:01]
.... (idle for 18mn)
JRWR-Thin has joined #archiveteam-bs [00:21]
BlueMaxim has quit IRC (Quit: Leaving)
Whopper_ has quit IRC (Ping timeout: 246 seconds)
Whopper has joined #archiveteam-bs
[00:27]
<JRWR-Thin> so far I'm impressed with the speed of the scaleway ARM64-2GB for a warrior
just wish it has more disk space
does run-pipeline handle low diskspace?
[00:35]
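Whether run-pipeline itself reacts to a nearly full disk is not answered in this log. A minimal sketch of a guard that could be run before starting a warrior on a small box, assuming Python 3; the function name `has_free_space` and the 5 GiB threshold are hypothetical and not taken from any ArchiveTeam tool:

```python
import shutil

# Hypothetical threshold, not taken from the log or from any project default.
MIN_FREE_BYTES = 5 * 1024**3


def has_free_space(path=".", min_free_bytes=MIN_FREE_BYTES):
    """Return True if the filesystem holding `path` has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes


if __name__ == "__main__":
    if has_free_space():
        print("enough free space, safe to start")
    else:
        print("low disk space, not starting")
```

A wrapper script could call this check before each launch and skip the run when it returns False.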
***pizzaiolo has quit IRC (Remote host closed the connection) [00:37]
....... (idle for 30mn)
<MrRadar> Yo, if you're running SPUF via scripts please make sure to update them. It's a wall of Warriors over on the tracker right now...
(And no, the project is not done yet.)
[01:07]
***ndiddy has quit IRC (Read error: Operation timed out) [01:10]
j08nY has quit IRC (Quit: Leaving)
JRWR-Thin has quit IRC (Ping timeout: 268 seconds)
[01:20]
JRWR-iPAD has joined #archiveteam-bs [01:28]
username1 has joined #archiveteam-bs
JRWR-iPAD has quit IRC (Ping timeout: 268 seconds)
schbirid2 has quit IRC (Read error: Operation timed out)
JRWR-iPAD has joined #archiveteam-bs
[01:35]
...... (idle for 25mn)
ZexaronS has quit IRC (Leaving) [02:05]
<joepie91> JRWR: you can add additional disk space, no? [02:17]
<JRWR> it's an OVH box
So I could spin up another box and move the DNS
[02:18]
But I can't afford it [02:30]
......... (idle for 41mn)
<joepie91> JRWR: huh? you said Scaleway
not OVH :P
and Scaleway has block storage stuff iirc
[03:11]
<JRWR> Wait
What is the topic, I was talking about the Rsync Ingress I'm running
[03:11]
<joepie91> [02:35] <JRWR-Thin> so far I'm impressed with the speed of the scaleway ARM64-2GB for a warrior
[02:35] <JRWR-Thin> just wish it has more disk space
[03:13]
..... (idle for 20mn)
***yuitimoth has quit IRC (Remote host closed the connection)
yuitimoth has joined #archiveteam-bs
[03:33]
.... (idle for 16mn)
r3c0d3x has joined #archiveteam-bs [03:49]
...... (idle for 26mn)
<MrRadar> arkiver xmc: Someone from pixiv is asking us to throttle our requests a bit
Since our grab is triggering their automatic anti-DDoS systems
[04:15]
<Odd0002> I think this is day 3 of my yahoo answers thread running [04:17]
<MrRadar> Kaz, GLaDOS: See above ^^^
Over in #savepixiv
[04:17]
...... (idle for 26mn)
***BlueMaxim has joined #archiveteam-bs [04:44]
.... (idle for 15mn)
Sk1d has quit IRC (Ping timeout: 250 seconds)
dashcloud has quit IRC (Ping timeout: 492 seconds)
[04:59]
Sk1d has joined #archiveteam-bs [05:06]
...... (idle for 25mn)
<JRWR-iPAD> Hrm
We could use a tracker admin
need to put the brakes on pixiv
40 RPS has been requested
[05:31]
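The log suggests the 40 requests per second limit is meant to be applied by a tracker admin. For a single client driven from a script, a minimal pacing sketch could look like the following, assuming Python 3; the class name `Pacer` and its parameters are illustrative and not part of any ArchiveTeam tool:

```python
import time


class Pacer:
    """Space out calls so that at most `rate` of them happen per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate   # minimum gap between two requests
        self.clock = clock           # injectable for testing
        self.sleep = sleep           # injectable for testing
        self.next_slot = None        # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self.clock()
        if self.next_slot is None or now >= self.next_slot:
            # First call, or we are already behind schedule: go immediately.
            self.next_slot = now + self.interval
            return 0.0
        delay = self.next_slot - now
        self.sleep(delay)
        self.next_slot += self.interval
        return delay


if __name__ == "__main__":
    pacer = Pacer(rate=40)  # 40 RPS, the figure quoted in the log
    for i in range(5):
        pacer.wait()
        print("request", i)
```

This limits only one process; several clients each running their own pacer would still add up to more than the requested total.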
<Odd0002> ok, I'll pause mine for now [05:32]
<JRWR-iPAD> I'll stop mine as well [05:33]
...... (idle for 27mn)
***SHODAN_UI has joined #archiveteam-bs [06:00]
........ (idle for 36mn)
bsmith093 has quit IRC (Read error: Operation timed out) [06:36]
.......... (idle for 47mn)
<godane> so i'm close to going past last month's number of items
i have like another 100 items to upload to pass it
[07:23]
***SHODAN_UI has quit IRC (Remote host closed the connection)
tuluu has quit IRC (Ping timeout: 260 seconds)
[07:32]
.... (idle for 18mn)
bsmith093 has joined #archiveteam-bs [07:52]
.... (idle for 15mn)
bsmith093 has quit IRC (Ping timeout: 245 seconds) [08:07]
BlueMaxim has quit IRC (Read error: Operation timed out)
BlueMaxim has joined #archiveteam-bs
[08:15]
bsmith093 has joined #archiveteam-bs [08:25]
<godane> http://www.npr.org/sections/alltechconsidered/2017/06/03/529155865/videotapes-are-becoming-unwatchable-as-archivists-work-to-save-them [08:28]
***JRWR-iPAD has quit IRC (Quit: Page closed) [08:39]
jiphex has joined #archiveteam-bs [08:45]
............ (idle for 55mn)
BlueMaxim has quit IRC (Quit: Leaving) [09:40]
j08nY has joined #archiveteam-bs [09:50]
ZexaronS has joined #archiveteam-bs [10:02]
......... (idle for 43mn)
Stiletto has quit IRC (Read error: Operation timed out) [10:45]
.................. (idle for 1h29mn)
<Aoede> I did a scrape of literotica.com, let me know if there are any issues with it.
https://archive.org/details/literotica.com_2017-04
[12:14]
......................... (idle for 2h2mn)
***pizzaiolo has joined #archiveteam-bs [14:16]
<JAA> This may be of interest to some of you: https://www.reddit.com/r/DataHoarder/comments/6fku0s/2873_nes_roms_the_eye/ [14:21]
***alfie has quit IRC (Ping timeout: 260 seconds) [14:33]
alfie has joined #archiveteam-bs [14:44]
....... (idle for 33mn)
SHODAN_UI has joined #archiveteam-bs
pizzaiolo has quit IRC (Read error: Connection reset by peer)
pizzaiolo has joined #archiveteam-bs
[15:17]
.......... (idle for 49mn)
username1 is now known as spirit [16:08]
<spirit> JAA: probably better to use the standard sets
Aoede: thanks >:)
[16:08]
<Aoede> spirit: not like I have anything better to do with my time :P [16:20]
<spirit> well, in about 27 minutes I do [16:20]
<Aoede> ? [16:23]
<spirit> says wget [16:28]
<Aoede> oh :D
It's around 11GB uncompressed
[16:29]
***pizzaiolo has quit IRC (Quit: pizzaiolo) [16:42]
<joepie91> I was just pointed at this very, very interesting project: http://formats.kaitai.io/ -- cc SketchCow
essentially standardized format specifications for a crapload of (binary) formats
with parser and chart generation and whatnot
(though the crapload is not yet big enough :P)
[16:42]
<MrRadar> Someone with an account on the File Formats wiki should add a link there: http://fileformats.archiveteam.org/wiki/Main_Page [16:45]
<JAA> Interesting. Shall we throw it into ArchiveBot just in case? [16:47]
............ (idle for 59mn)
***bmcginty has quit IRC (Ping timeout: 268 seconds)
icedice has joined #archiveteam-bs
[17:46]
bmcginty has joined #archiveteam-bs [17:52]
Pudsey has joined #archiveteam-bs [17:59]
Pudsey has quit IRC (Remote host closed the connection) [18:04]
....... (idle for 32mn)
JRWR-iPAD has joined #archiveteam-bs [18:36]
SHODAN_UI has quit IRC (Remote host closed the connection) [18:44]
............ (idle for 59mn)
Honno has quit IRC (Quit: Leaving) [19:43]
Aranje has joined #archiveteam-bs [19:48]
kristian_ has joined #archiveteam-bs [20:02]
<JAA> I'm looking into archiving Steam Greenlight. Some of it looks pretty straightforward, other things will be really annoying. [20:03]
<Odd0002> what sort of stuff is there to archive? Comments and pages? [20:06]
<JAA> The list of games which were released through Greenlight, announcements plus comments to those, and two discussion forums.
As far as I can see
[20:09]
<Odd0002> ok [20:11]
<JAA> The announcement comments are the most annoying thing I've seen so far as they're based entirely on JavaScripted POST requests. [20:11]
<MrRadar> Bleh [20:19]
.... (idle for 17mn)
***ZexaronS has quit IRC (Read error: Connection reset by peer) [20:36]
ZexaronS has joined #archiveteam-bs
SHODAN_UI has joined #archiveteam-bs
[20:46]
............... (idle for 1h10mn)
spirit has quit IRC (Quit: Leaving)
dashcloud has joined #archiveteam-bs
[21:58]
<JAA> Ok, so those POST requests aren't actually necessary strictly speaking since all information can also be accessed directly as a static HTML page. That doesn't mean I won't try archiving that "API" as well, of course.
Each game released through Greenlight obviously has its own page, including a discussion forum etc. I think I'll skip those for now.
[22:07]
<arkiver> do you have a list of games
or IDs
[22:09]
<JAA> Not yet, but I guess I can make one, why? [22:09]
<arkiver> well for archiving [22:09]
<JAA> Well yeah, but I don't see those games going anywhere anytime soon. It's just the whole Greenlight framework which will disappear, as I understand it.
It would certainly be a good idea to grab all of Steam at some point though.
[22:10]
<arkiver> ah
at some point yeah
any idea how big this is?
greenlight I mean
[22:10]
<JRWR> 14k games overall over the course of its lifetime
that's including all the crap ones
[22:11]
<arkiver> ah
good enough for archivebot I think
[22:12]
<JAA> Not big. 16k games and 6k forum threads are the main part. [22:12]
<arkiver> right yeah, so a few GB only [22:12]
<JAA> I don't think ArchiveBot will handle this very well. It's spread across various directories on the steamcommunity.com domain.
A specific wpull with the relevant --accept-regex rules will work better, I think.
[22:13]
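The accept-regex idea above amounts to following only URLs that match a whitelist of patterns. A minimal sketch of that filter, assuming Python 3; the two patterns and the name `accepted` are illustrative guesses at what such rules might look like, not the rules actually used for the grab:

```python
import re

# Hypothetical accept rules for parts of steamcommunity.com; the real
# directory layout would need to be checked before using anything like this.
ACCEPT_PATTERNS = [
    r"^https?://steamcommunity\.com/greenlight(/|$)",
    r"^https?://steamcommunity\.com/sharedfiles/filedetails/",
]

# Combine the rules into one alternation so a URL is kept if any rule matches.
ACCEPT_RE = re.compile("|".join(f"(?:{p})" for p in ACCEPT_PATTERNS))


def accepted(url):
    """Return True if the crawler should follow `url`."""
    return ACCEPT_RE.search(url) is not None


if __name__ == "__main__":
    for url in (
        "https://steamcommunity.com/greenlight/discussions/",
        "https://steamcommunity.com/market/",
    ):
        print(url, accepted(url))
```

Testing candidate patterns against a handful of known URLs like this is cheaper than discovering a bad rule partway through a crawl.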
<arkiver> yep [22:14]
<JRWR-iPAD> like a mini archivebocx
bot*
[22:14]
<arkiver> :P [22:14]
<JAA> :-)
JustAnotherArchivebot
Hmm, actually, it may be necessary to grab the games as well. Still not big though.
Just painful
[22:14]
***SHODAN_UI has quit IRC (Remote host closed the connection) [22:34]
ZexaronS- has joined #archiveteam-bs
ZexaronS has quit IRC (Read error: Operation timed out)
[22:41]
BartoCH has quit IRC (Ping timeout: 260 seconds) [22:54]
.... (idle for 19mn)
kristian_ has quit IRC (Quit: Leaving) [23:13]
<yipdw> holy shit http://codeology.braintreepayments.com/archiveteam/archivebot
this is a sweet repo visualization
[23:14]
<JRWR> IS THAT FLYING SPAGHETTI MONSTER? [23:15]
<yipdw> what I really like is that the objects used to represent the repo aren't random
Python code, for example, always shows up as those purple pyramidal-ish things
stuff recognized as Makefile directives show up as those spindly structures
[23:16]
<JRWR> Archive bot is FSM
Legit thats fucking GSM
FSM*
[23:16]
<yipdw> so you develop a visual (and, if this were to be e.g. 3D-printed, tactile/material) vocabulary for a repo's composition
here's seesaw: http://codeology.braintreepayments.com/archiveteam/seesaw-kit
[23:17]
<JRWR> my god, it's ALIVE
http://codeology.braintreepayments.com/archiveteam/glowing-computing-machine
[23:19]
<yipdw> anything with a lot of spindly things sticking out of it immediately signals "the build system comprises a substantial portion of the code"
that's really cool
[23:20]
***JRWR has quit IRC (Quit: Page closed)
JRWR has joined #archiveteam-bs
icedice has quit IRC (Quit: Leaving)
[23:22]
<JRWR> oh
oh my god
http://codeology.braintreepayments.com/featured/torvalds/linux
Linux is AMAZING
[23:25]
<voidsta> that's pretty cool [23:29]
<JRWR> http://codeology.braintreepayments.com/featured/microsoft/typescript
looks like a bat
[23:30]
<voidsta> haha, yep [23:31]
.... (idle for 18mn)
***BlueMaxim has joined #archiveteam-bs
zenguy has quit IRC (Ping timeout: 370 seconds)
[23:49]
