| Time |
Nickname |
Message |
|
00:00
|
|
kvieta has joined #archiveteam-bs |
|
00:23
|
arkiver |
HCross: well, flickr is currently paused because I need to have a look at a problem with WARCs being too small |
|
00:25
|
|
dashcloud has joined #archiveteam-bs |
|
00:30
|
|
BlueMaxim has joined #archiveteam-bs |
|
00:44
|
|
kvieta has quit IRC (Ping timeout: 370 seconds) |
|
00:47
|
|
kvieta has joined #archiveteam-bs |
|
01:21
|
|
kvieta has quit IRC (Read error: Operation timed out) |
|
01:51
|
|
kvieta has joined #archiveteam-bs |
|
03:52
|
|
wp494 has quit IRC (Read error: Connection reset by peer) |
|
04:04
|
|
ndiddy has quit IRC (Ping timeout: 244 seconds) |
|
04:12
|
|
mutoso_ has quit IRC (Read error: Connection reset by peer) |
|
04:17
|
|
mutoso has joined #archiveteam-bs |
|
04:21
|
yipdw |
huh neat, you can use colons in sqlite table names, and pretty much every sqlite tool that isn't the sqlite shell breaks in awesome ways |
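
A minimal sketch of the observation above, using Python's built-in `sqlite3` module. The table name `warc:files` and its single column are made up for illustration; the point is only that SQLite accepts a colon in a table name as long as the identifier is double-quoted.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Double quotes make "warc:files" a quoted identifier, so the colon is accepted.
# (The table and column names are hypothetical.)
conn.execute('CREATE TABLE "warc:files" (name TEXT)')
conn.execute('INSERT INTO "warc:files" (name) VALUES (?)', ("example.warc.gz",))

rows = conn.execute('SELECT name FROM "warc:files"').fetchall()

# The schema table stores the name without the surrounding quotes.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

print(rows, tables)
```

Tools that build SQL by pasting the table name in unquoted would choke on such a name, which is presumably the kind of breakage being described.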
|
04:23
|
xmc |
deeelightful |
|
04:43
|
|
Sk1d has quit IRC (Ping timeout: 194 seconds) |
|
04:49
|
|
Sk1d has joined #archiveteam-bs |
|
04:55
|
ranma |
has this been backed up? http://www.therobotsvoice.com/ |
|
04:55
|
ranma |
it's a site that has shitty 15-short-paged articles |
|
04:56
|
ranma |
stumbled across it linked from a stackexchange post on swearing from Firefly |
|
05:05
|
ranma |
http://www.therobotsvoice.com/2010/11/fireflys_15_best_uses_of_chinese_profanity.php |
|
05:05
|
ranma |
"I've given this site formerly known as Topless Robot three years of my life and hard work, and I wouldn't trade them. I hoped that covering the subjects and culture that I love would sustain the site. For three years, it has - the three years it took to make The Force Awakens, no less. But all things must end. Today is the The Robot's Voice's final day of publication. After |
|
05:05
|
ranma |
years of trying, we couldn't make this work financially..." |
|
05:06
|
ranma |
Thank you for reading the site, supporting it and creating a community here over the years. I spent more time each day with our regular commenters than I did with my own wife or family, so even though I don't actually know all your real names, I'll miss you. Sly, Timely, Abraxas, FakeAss, Gallen, Polk, Mindbender, Zoidberg, Canadian Scott, GrimlockPrime, and everyone else...I'll |
|
05:06
|
ranma |
never forget you. I stayed up until the early hours of the morning, created social media posts on weekends, ran from dinner tables when news happened, and generally made TR/TRV the focus of my life. You got 100 percent of me, like it or not. And I hope you did..." |
|
05:06
|
ranma |
etc etc |
|
05:07
|
ranma |
not sure if worth backing up |
|
05:24
|
yipdw |
I'd argue it's more "worth backing up" than the latest leak of NSA documents or whatever |
|
05:24
|
yipdw |
every nerd on the Internet gets off on saving a copy of those and then never reading them |
|
05:25
|
ranma |
lol |
|
05:25
|
yipdw |
fanworks though, they don't get much |
|
05:25
|
ranma |
i presume this community had just a small reach |
|
05:25
|
yipdw |
so in the long run we end up with thousands of copies of unknown integrity of one thing and significantly incomplete copies of everything else |
|
05:25
|
yipdw |
so I threw that site into archivebot since that's what it was made for |
|
05:26
|
ranma |
will try to keep that in mind! |
|
05:27
|
yipdw |
also just for full disclosure, yes, I have a copy of the wikileaks insurance file |
|
05:27
|
yipdw |
I too get off on that stuff |
|
05:28
|
ranma |
i just don't want to throw EVERYTHING at archivebot |
|
05:28
|
ranma |
still gauging what's worth time etc |
|
05:28
|
ranma |
time, space |
|
05:28
|
yipdw |
I might whine about it a lot but really it's better to just throw something in |
|
05:28
|
yipdw |
we do have some limits like github/bitbucket links just making it a mess |
|
05:29
|
ranma |
http://archive.fart.website/archivebot/viewer/ <-can incomplete URLs be searched? |
|
05:29
|
yipdw |
hostnames only |
|
05:30
|
yipdw |
but if you throw in "tumblr" you'll get all hostnames matching tumblr |
|
05:30
|
ranma |
ah,yeah, that's what i was wondering |
|
05:30
|
ranma |
if "digitalocean" would return all domains |
|
05:30
|
ranma |
and subdomains |
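
The search behaviour described above (hostnames only, matched by substring) can be pictured with a small sketch. This is not the viewer's actual code; `search_hostnames` and the sample catalog are hypothetical and only illustrate why a term like "tumblr" or "digitalocean" returns every hostname containing it, subdomains included.

```python
def search_hostnames(hostnames, term):
    """Return every hostname that contains `term`, ignoring case."""
    needle = term.lower()
    return [h for h in hostnames if needle in h.lower()]


# Hypothetical catalog entries, not real viewer data.
catalog = [
    "staff.tumblr.com",
    "www.digitalocean.com",
    "example.tumblr.com",
    "www.linode.com",
]

matches = search_hostnames(catalog, "tumblr")
print(matches)
```

Because only hostnames are indexed in this model, a partial path such as `/community/tutorials` would never match anything.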
|
05:31
|
ranma |
is it not good at searching backups? or is everything backed up not necessarily tracked there? |
|
05:31
|
ranma |
i'd assume digitalocean, linode (for their guides) have been backed up |
|
05:31
|
yipdw |
that's just archivebot's catalog |
|
05:31
|
yipdw |
there's a ton of other stuff that isn't in there |
|
05:32
|
yipdw |
Warrior projects, works from other AT members, everything else in IA, ... |
|
05:34
|
ranma |
how good is archivebot at backing up sites with dynamic "next page/more posts" buttons? https://www.digitalocean.com/community/tutorials |
|
05:35
|
ranma |
at the end of the page is a js button "load more results" |
|
05:35
|
yipdw |
it's not going to work |
|
05:35
|
ranma |
damn |
|
05:35
|
yipdw |
phantomjs mode just scrolls, there's no "click this button" function |
|
05:36
|
yipdw |
if that button is actually an <a> you might have luck with phantomjs |
|
05:36
|
yipdw |
I'm not sure |
|
05:37
|
ranma |
<a class="load-more-results" href="javascript:void(0);">Load More Results</a> |
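
A rough sketch of why that anchor does not help a crawler that only follows links: its `href` is a `javascript:` pseudo-URL, so there is no address to fetch. The `LinkClassifier` class and the `?page=2` link below are hypothetical, written with Python's standard `html.parser`; they are not part of ArchiveBot.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Split <a> tags into links a crawler can fetch and script-only links."""

    def __init__(self):
        super().__init__()
        self.followable = []
        self.script_only = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        # An empty href or a javascript: pseudo-URL gives nothing to download.
        if not href or href.lower().startswith("javascript:"):
            self.script_only.append(href)
        else:
            self.followable.append(href)


page = (
    '<a class="load-more-results" href="javascript:void(0);">Load More Results</a>'
    '<a href="/community/tutorials?page=2">Next</a>'  # made-up ordinary link
)

parser = LinkClassifier()
parser.feed(page)
print(parser.followable, parser.script_only)
```

Only the second, ordinary link would be queued; the "Load More Results" button needs a real click handler to run, which matches the "it's not going to work" answer.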
|
05:41
|
ranma |
is there a way to archive site.com/dir2 |
|
05:41
|
ranma |
and site.com/dir2/sub1 site.com/dir2/sub2 |
|
05:41
|
ranma |
but not traverse back to site.com |
|
05:41
|
ranma |
and not backup site.com/dir1, etc, linked from site.com |
|
05:42
|
yipdw |
yes, !a https://site.com/dir2/ |
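
The idea behind starting a job at `https://site.com/dir2/` can be sketched as a simple scope check: stay on the same host and never climb above the starting directory. This is only an illustration of that rule under those assumptions, not ArchiveBot's real implementation; `in_scope` is a hypothetical helper.

```python
from urllib.parse import urlparse


def in_scope(url, start_url):
    """True if `url` is on the same host and at or below the start directory."""
    start = urlparse(start_url)
    target = urlparse(url)
    # Normalise the start path to a directory prefix ending in "/".
    prefix = start.path if start.path.endswith("/") else start.path + "/"
    path = target.path or "/"
    same_host = target.netloc == start.netloc
    below = path.startswith(prefix) or path == prefix.rstrip("/")
    return same_host and below


start = "https://site.com/dir2/"
for candidate in (
    "https://site.com/dir2/sub1",
    "https://site.com/dir2/sub2",
    "https://site.com/dir1/",
    "https://site.com/",
):
    print(candidate, in_scope(candidate, start))
```

Under this rule `dir2/sub1` and `dir2/sub2` are crawled while `site.com/` and `site.com/dir1/` are skipped, which is the behaviour asked about.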
|
05:43
|
ranma |
does !ao only backup site.com/dir2/index.html + images/resources? |
|
05:43
|
ranma |
or does it still spider |
|
05:43
|
yipdw |
it's page plus prerequisites |
|
05:43
|
yipdw |
https://archivebot.readthedocs.io/en/latest/commands.html#archiveonly |
|
05:44
|
ranma |
it didn't make much sense to me :< |
|
05:44
|
* |
ranma holds onto his butt and feeds archivebot something |
|
05:47
|
ranma |
lol @ krebsonsecurity |
|
05:47
|
ranma |
was just reading about their DDoS |
|
06:32
|
|
Aranje has quit IRC (Quit: Three sheets to the wind) |
|
06:50
|
|
wp494 has joined #archiveteam-bs |
|
06:54
|
|
fie has joined #archiveteam-bs |
|
07:08
|
HCross2 |
ranma: same few days as OVH got hit with 1.5Tbps |
|
07:11
|
ranma |
wheee |
|
07:11
|
ranma |
and the company i work for is banking on IoT |
|
07:30
|
|
ravetcofx has quit IRC (Read error: Operation timed out) |
|
08:05
|
|
xmc sets mode: +o yipdw |
|
08:40
|
|
GE has joined #archiveteam-bs |
|
09:00
|
midas |
they like DDoSing? |
|
09:03
|
ranma |
their store salespeople are a bag of dicks, so i don't have much sympathy |
|
09:05
|
ranma |
not implying i'm an aggressor. just don't like babysitting them |
|
09:41
|
|
kurt has joined #archiveteam-bs |
|
10:18
|
|
GE has quit IRC (Remote host closed the connection) |
|
11:07
|
|
GE has joined #archiveteam-bs |
|
11:47
|
|
kyounko has quit IRC (Read error: Operation timed out) |
|
12:08
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
|
12:22
|
|
GE has quit IRC (Remote host closed the connection) |
|
13:59
|
|
GE has joined #archiveteam-bs |
|
14:34
|
|
Start has quit IRC (Quit: Disconnected.) |
|
14:34
|
|
Start has joined #archiveteam-bs |
|
14:35
|
|
Start has quit IRC (Client Quit) |
|
14:45
|
|
achip has joined #archiveteam-bs |
|
15:29
|
|
kurt has quit IRC (Remote host closed the connection) |
|
16:15
|
|
VADemon has joined #archiveteam-bs |
|
17:24
|
|
Swizzle has quit IRC (Quit: Leaving) |
|
17:50
|
|
GE has quit IRC (Quit: zzz) |
|
17:50
|
|
GE has joined #archiveteam-bs |
|
18:03
|
|
VADemon has quit IRC (Read error: Operation timed out) |
|
18:04
|
godane |
i'm at 889k items now |
|
18:11
|
yipdw |
whoa https://mosh.org/ |
|
18:12
|
xmc |
yeah mosh is super nifty |
|
18:12
|
yipdw |
I need to try this |
|
18:12
|
xmc |
highly recommended |
|
18:12
|
yipdw |
intermittent connectivity is the rule for me now and I'd love something that doesn't broken-pipe on me every time |
|
18:13
|
xmc |
not sure how its prediction works with how fish likes to redraw the command line in various colors |
|
18:13
|
xmc |
but it's worth a shot |
|
18:13
|
yipdw |
I need to figure out how to make mosh work with siped |
|
18:13
|
yipdw |
er spiped |
|
18:13
|
Frogging |
mosh is amazing |
|
18:14
|
yipdw |
maybe just tell mosh to connect via spiped and have networking work out the rest |
|
18:14
|
Frogging |
the one issue I have with it is that it breaks scrolling, so you'll probably want to use tmux/screen with it |
|
18:17
|
xmc |
i regularly use mosh on airplane wifi. it makes it tolerable. |
|
18:18
|
yipdw |
oh nice, I guess mosh uses SSH to establish the initial connection and start mosh-server |
|
18:18
|
yipdw |
so my existing spipe ProxyCommands work fine |
|
18:20
|
yipdw |
oh my god this is amazing |
|
18:21
|
xmc |
:D |
|
18:22
|
yipdw |
my build server runs in online.net's Paris datacenter and you get some noticeable lag on the Chicago -> Paris hop |
|
18:22
|
yipdw |
but not here |
|
18:28
|
yipdw |
this probably also means I can go back to using irssi |
|
18:29
|
Frogging |
I would suggest trying out Weechat |
|
18:29
|
xmc |
or irssi |
|
18:32
|
|
ravetcofx has joined #archiveteam-bs |
|
18:32
|
yipdw |
over the years I've come to know irssi fairly well so that's why |
|
18:32
|
yipdw |
one of these days I'll try a new client |
|
18:33
|
Frogging |
I used irssi up until I got a bouncer with multiple networks, and it didn't let connect to the same hostname multiple times. I wonder if they fixed that |
|
18:33
|
Frogging |
didn't let me* |
|
18:34
|
yipdw |
or I can be really obstinate and reinstall ircii |
|
18:35
|
Frogging |
this is kind of neat :p http://tools.suckless.org/ii/ |
|
18:35
|
yipdw |
that kinda reminds me of trying to use Plan9 |
|
18:35
|
yipdw |
cool ideas in theory but I couldn't really integrate them comfortably |
|
18:37
|
yipdw |
on the other hand, ii would probably make a good bot substrate |
|
19:21
|
HCross2 |
godane: I'll buy you a cake when you get to one million items |
|
19:57
|
|
SketchCow has joined #archiveteam-bs |
|
19:57
|
|
midas sets mode: +o SketchCow |
|
19:57
|
|
swebb sets mode: +o SketchCow |
|
20:04
|
godane |
that should be around december sometime |
|
20:05
|
godane |
based on me pushing for about 50k items a month |
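
The estimate can be checked with a few lines of arithmetic, using only the figures quoted in the channel (889k items so far, about 50k items a month, a target of one million). The variable names are just for illustration.

```python
import math

current_items = 889_000      # "i'm at 889k items now"
target_items = 1_000_000     # the one-million-item milestone
items_per_month = 50_000     # "about 50k items a month"

# Months of uploading still needed at the stated pace.
months_needed = (target_items - current_items) / items_per_month

# Number of calendar months touched before the milestone is passed.
whole_months = math.ceil(months_needed)

print(f"{months_needed:.2f} months, i.e. within {whole_months} months")
```

Roughly 2.2 months of uploads remain at that rate, so the milestone lands a little over two months out.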
|
20:32
|
|
Stiletto has quit IRC () |
|
20:38
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |
|
20:40
|
|
acridAxid has quit IRC (Quit: marauder) |
|
20:41
|
|
acridAxid has joined #archiveteam-bs |
|
20:51
|
|
Stiletto has joined #archiveteam-bs |
|
21:03
|
|
ndiddy has joined #archiveteam-bs |
|
21:05
|
|
computerf has quit IRC (Read error: Operation timed out) |
|
21:08
|
|
computerf has joined #archiveteam-bs |
|
21:08
|
|
RichardG has joined #archiveteam-bs |
|
21:24
|
|
computerf has quit IRC (Read error: Operation timed out) |
|
21:35
|
|
computerf has joined #archiveteam-bs |
|
21:41
|
|
ndiddy has quit IRC (Quit: Leaving) |
|
21:51
|
|
yeoldetoa has quit IRC (Remote host closed the connection) |
|
21:52
|
|
kristian_ has joined #archiveteam-bs |
|
22:22
|
|
BlueMaxim has joined #archiveteam-bs |
|
22:33
|
|
Start has joined #archiveteam-bs |
|
22:35
|
|
achip has quit IRC (Read error: Operation timed out) |
|
22:37
|
|
GE has quit IRC (hub.efnet.us irc.Prison.NET) |
|
23:07
|
SketchCow |
Hi Jason, |
|
23:07
|
SketchCow |
I read that your group is archiving Gawker. I'm a documentary producer, have created events and series for History Channel, National Geographic, MTV and others, and produce feature/festival documentaries for Smithsonian Network etc. |
|
23:07
|
SketchCow |
I am currently in pre-productions on a film about why Gawker and what you're doing are important. |
|
23:07
|
SketchCow |
May we get on the phone so that I can tell you a bit more about the project? |
|
23:07
|
SketchCow |
My goal is to interview you and document you and your volunteers saving history. |
|
23:07
|
SketchCow |
... |
|
23:07
|
SketchCow |
My intention is to not respond, unless someone thinks different. |
|
23:09
|
xmc |
approved ✓ |
|
23:25
|
arkiver |
yeah, sounds interesting |
|
23:25
|
xmc |
no i mean not responding is meeting with my approval |
|
23:26
|
arkiver |
I think it sounds interesting |
|
23:27
|
arkiver |
If he's positive about us, it would be nice to have us in a documentary |
|
23:28
|
xmc |
i've been burned enough times |
|
23:30
|
SketchCow |
Not responding. |
|
23:30
|
SketchCow |
"Gawker" + "Documentary" = hellscape |
|
23:31
|
arkiver |
:( ok |
|
23:31
|
arkiver |
if it's only about gawker then not |
|
23:32
|
arkiver |
but maybe he wants to do something more about web history in general too |
|
23:32
|
xmc |
from the description it's a gawker documentary |
|
23:32
|
SketchCow |
No. |
|
23:32
|
SketchCow |
This will be a gawker documentary |
|
23:32
|
arkiver |
oh well |
|
23:32
|
arkiver |
I'm off anyway |
|
23:32
|
arkiver |
have a good day all! |
|
23:32
|
* |
arkiver zzzzzzz |
|
23:32
|
SketchCow |
I'd rather watch my eye going through a spaghetti strainer with my remaining eye than be involved in anything glorifying gawker |
|
23:33
|
arkiver |
^ got it |
|
23:33
|
SketchCow |
Hi Jason, |
|
23:33
|
SketchCow |
I read that your group is archiving Gawker. I'm a documentary producer, have created events and series for History Channel, National Geographic, MTV and others, and produce feature/festival documentaries for Smithsonian Network etc. |
|
23:33
|
SketchCow |
<3 |
|
23:33
|
SketchCow |
Added disk check to pipeline, by the way, sleepy |
|
23:33
|
arkiver |
yes, saw it! |
|
23:33
|
arkiver |
Looks awesome :D |
|
23:33
|
arkiver |
thanks |
|
23:52
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |