Time |
Nickname |
Message |
00:00
π
|
|
kvieta has joined #archiveteam-bs |
00:23
π
|
arkiver |
HCross: well, flickr is currently paused because I need to have a look at a problem with WARCs being too small |
00:25
π
|
|
dashcloud has joined #archiveteam-bs |
00:30
π
|
|
BlueMaxim has joined #archiveteam-bs |
00:44
π
|
|
kvieta has quit IRC (Ping timeout: 370 seconds) |
00:47
π
|
|
kvieta has joined #archiveteam-bs |
01:21
π
|
|
kvieta has quit IRC (Read error: Operation timed out) |
01:51
π
|
|
kvieta has joined #archiveteam-bs |
03:52
π
|
|
wp494 has quit IRC (Read error: Connection reset by peer) |
04:04
π
|
|
ndiddy has quit IRC (Ping timeout: 244 seconds) |
04:12
π
|
|
mutoso_ has quit IRC (Read error: Connection reset by peer) |
04:17
π
|
|
mutoso has joined #archiveteam-bs |
04:21
π
|
yipdw |
huh neat, you can use colons in sqlite table names, and pretty much every sqlite tool that isn't the sqlite shell breaks in awesome ways |
04:23
π
|
xmc |
deeelightful |
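A minimal sketch of the colon behaviour yipdw mentions, using Python's built-in sqlite3 module. The table name `jobs:2016` and the `url` column are made up for illustration. The assumption here is that the name has to be quoted as an identifier, which is plausibly where tools that paste table names into SQL unquoted go wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Double quotes make "jobs:2016" one identifier; the name is hypothetical.
conn.execute('CREATE TABLE "jobs:2016" (url TEXT)')
conn.execute('INSERT INTO "jobs:2016" (url) VALUES (?)', ("http://example.com/",))

# Read the row back through the same quoted name.
rows = conn.execute('SELECT url FROM "jobs:2016"').fetchall()

# The colon is stored verbatim in the schema table.
names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

print(rows, names)
```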
04:43
π
|
|
Sk1d has quit IRC (Ping timeout: 194 seconds) |
04:49
π
|
|
Sk1d has joined #archiveteam-bs |
04:55
π
|
ranma |
has this been backed up? http://www.therobotsvoice.com/ |
04:55
π
|
ranma |
it's a site that has shitty 15-short-paged articles |
04:56
π
|
ranma |
stumbled across it linked from a stackexchange post on swearing from Firefly |
05:05
π
|
ranma |
http://www.therobotsvoice.com/2010/11/fireflys_15_best_uses_of_chinese_profanity.php |
05:05
π
|
ranma |
"I've given this site formerly known as Topless Robot three years of my life and hard work, and I wouldn't trade them. I hoped that covering the subjects and culture that I love would sustain the site. For three years, it has - the three years it took to make The Force Awakens, no less. But all things must end. Today is the The Robot's Voice's final day of publication. After |
05:05
π
|
ranma |
years of trying, we couldn't make this work financially..." |
05:06
π
|
ranma |
Thank you for reading the site, supporting it and creating a community here over the years. I spent more time each day with our regular commenters than I did with my own wife or family, so even though I don't actually know all your real names, I'll miss you. Sly, Timely, Abraxas, FakeAss, Gallen, Polk, Mindbender, Zoidberg, Canadian Scott, GrimlockPrime, and everyone else…I'll |
05:06
π
|
ranma |
never forget you. I stayed up until the early hours of the morning, created social media posts on weekends, ran from dinner tables when news happened, and generally made TR/TRV the focus of my life. You got 100 percent of me, like it or not. And I hope you did..." |
05:06
π
|
ranma |
etc etc |
05:07
π
|
ranma |
not sure if worth backing up |
05:24
π
|
yipdw |
I'd argue it's more "worth backing up" than the latest leak of NSA documents or whatever |
05:24
π
|
yipdw |
every nerd on the Internet gets off on saving a copy of those and then never reading them |
05:25
π
|
ranma |
lol |
05:25
π
|
yipdw |
fanworks though, they don't get much |
05:25
π
|
ranma |
i presume this community had just a small reach |
05:25
π
|
yipdw |
so in the long run we end up with thousands of copies of unknown integrity of one thing and significantly incomplete copies of everything else |
05:25
π
|
yipdw |
so I threw that site into archivebot since that's what it was made for |
05:26
π
|
ranma |
will try to keep that in mind! |
05:27
π
|
yipdw |
also just for full disclosure, yes, I have a copy of the wikileaks insurance file |
05:27
π
|
yipdw |
I too get off on that stuff |
05:28
π
|
ranma |
i just don't want to throw EVERYTHING at archivebot |
05:28
π
|
ranma |
still gauging what's worth time etc |
05:28
π
|
ranma |
time, space |
05:28
π
|
yipdw |
I might whine about it a lot but really it's better to just throw something in |
05:28
π
|
yipdw |
we do have some limits like github/bitbucket links just making it a mess |
05:29
π
|
ranma |
http://archive.fart.website/archivebot/viewer/ <-can incomplete URLs be searched? |
05:29
π
|
yipdw |
hostnames only |
05:30
π
|
yipdw |
but if you throw in "tumblr" you'll get all hostnames matching tumblr |
05:30
π
|
ranma |
ah, yeah, that's what i was wondering |
05:30
π
|
ranma |
if "digitalocean" would return all domains |
05:30
π
|
ranma |
and subdomains |
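A rough sketch of the "hostnames only" search described above, assuming it behaves like a plain substring match on each hostname. The function name and the sample catalog are hypothetical; this is not the viewer's actual code.

```python
from urllib.parse import urlsplit

def matching_hostnames(urls, term):
    """Return the sorted, distinct hostnames that contain `term`."""
    term = term.lower()
    hosts = {urlsplit(u).hostname or "" for u in urls}
    return sorted(h for h in hosts if term in h)

# Hypothetical catalog entries, for illustration only.
catalog = [
    "https://www.digitalocean.com/community/tutorials",
    "https://blog.digitalocean.com/",
    "http://example.tumblr.com/post/1",
    "https://www.linode.com/docs/",
]

by_host = matching_hostnames(catalog, "digitalocean")
by_path = matching_hostnames(catalog, "tutorials")
print(by_host, by_path)
```

Under that assumption, "digitalocean" returns the domain and its subdomains, while a term that only appears in a path ("tutorials") returns nothing.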
05:31
π
|
ranma |
is it not good at searching backups? or is everything backed up not necessarily tracked there? |
05:31
π
|
ranma |
i'd assume digitalocean, linode (for their guides) have been backed up |
05:31
π
|
yipdw |
that's just archivebot's catalog |
05:31
π
|
yipdw |
there's a ton of other stuff that isn't in there |
05:32
π
|
yipdw |
Warrior projects, works from other AT members, everything else in IA, ... |
05:34
π
|
ranma |
how good is archivebot at backing up sites with dynamic "next page/more posts" buttons? https://www.digitalocean.com/community/tutorials |
05:35
π
|
ranma |
at the end of the page is a js button "load more results" |
05:35
π
|
yipdw |
it's not going to work |
05:35
π
|
ranma |
damn |
05:35
π
|
yipdw |
phantomjs mode just scrolls, there's no "click this button" function |
05:36
π
|
yipdw |
if that button is actually an <a> you might have luck with phantomjs |
05:36
π
|
yipdw |
I'm not sure |
05:37
π
|
ranma |
<a class="load-more-results" href="javascript:void(0);">Load More Results</a> |
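A small sketch of why that anchor gives a link-following crawler nothing to fetch: its href is a `javascript:` pseudo-URL, not a page address. The `LinkCollector` class and the `followable` helper are illustrative names, not part of archivebot.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)

def followable(href):
    # javascript: pseudo-URLs run script in a browser; there is nothing to download.
    return not href.strip().lower().startswith("javascript:")

snippet = '<a class="load-more-results" href="javascript:void(0);">Load More Results</a>'
collector = LinkCollector()
collector.feed(snippet)
crawlable = [h for h in collector.hrefs if followable(h)]
print(collector.hrefs, crawlable)
```

So even though the button is an `<a>`, the extra results only appear if something actually clicks it and runs the page's script.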
05:41
π
|
ranma |
is there a way to archive site.com/dir2 |
05:41
π
|
ranma |
and site.com/dir2/sub1 site.com/dir2/sub2 |
05:41
π
|
ranma |
but not traverse back to site.com |
05:41
π
|
ranma |
and not backup site.com/dir1, etc, linked from site.com |
05:42
π
|
yipdw |
yes, !a https://site.com/dir2/ |
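A sketch of the scoping rule ranma is asking for, assuming `!a` on a directory URL works roughly like a "no parent" recursive crawl: stay on the same host and never go above the starting directory. `in_scope` is a hypothetical helper, not archivebot's real implementation.

```python
from urllib.parse import urlsplit

def in_scope(candidate, start):
    """True if `candidate` is on the same host and under the start URL's directory."""
    c, s = urlsplit(candidate), urlsplit(start)
    if c.hostname != s.hostname:
        return False
    # Treat the start URL's directory as the base: "/dir2/" stays "/dir2/",
    # while "/dir2/index.html" is trimmed to "/dir2/".
    base = s.path if s.path.endswith("/") else s.path.rsplit("/", 1)[0] + "/"
    return c.path.startswith(base)

start = "https://site.com/dir2/"
for url in ("https://site.com/dir2/sub1",
            "https://site.com/dir2/sub2",
            "https://site.com/dir1/",
            "https://site.com/"):
    print(url, in_scope(url, start))
```

The trailing slash on the start URL matters in this sketch: without it, `dir2` would be read as a file name and the base would become the parent directory.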
05:43
π
|
ranma |
does !ao only backup site.com/dir2/index.html + images/resources? |
05:43
π
|
ranma |
or does it still spider |
05:43
π
|
yipdw |
it's page plus prerequisites |
05:43
π
|
yipdw |
https://archivebot.readthedocs.io/en/latest/commands.html#archiveonly |
05:44
π
|
ranma |
it didn't make much sense to me :< |
05:44
π
|
* |
ranma holds onto his butt and feeds archivebot something |
05:47
π
|
ranma |
lol @ krebsonsecurity |
05:47
π
|
ranma |
was just reading about their DDoS |
06:32
π
|
|
Aranje has quit IRC (Quit: Three sheets to the wind) |
06:50
π
|
|
wp494 has joined #archiveteam-bs |
06:54
π
|
|
fie has joined #archiveteam-bs |
07:08
π
|
HCross2 |
ranma: same few days as OVH got hit with 1.5Tbps |
07:11
π
|
ranma |
wheee |
07:11
π
|
ranma |
and the company i work for is banking on IoT |
07:30
π
|
|
ravetcofx has quit IRC (Read error: Operation timed out) |
08:05
π
|
|
xmc sets mode: +o yipdw |
08:40
π
|
|
GE has joined #archiveteam-bs |
09:00
π
|
midas |
they like DDoSing? |
09:03
π
|
ranma |
their store salespeople are a bag of dicks, so i don't have much sympathy |
09:05
π
|
ranma |
not implying i'm an aggressor. just don't like babysitting them |
09:41
π
|
|
kurt has joined #archiveteam-bs |
10:18
π
|
|
GE has quit IRC (Remote host closed the connection) |
11:07
π
|
|
GE has joined #archiveteam-bs |
11:47
π
|
|
kyounko has quit IRC (Read error: Operation timed out) |
12:08
π
|
|
BlueMaxim has quit IRC (Quit: Leaving) |
12:22
π
|
|
GE has quit IRC (Remote host closed the connection) |
13:59
π
|
|
GE has joined #archiveteam-bs |
14:34
π
|
|
Start has quit IRC (Quit: Disconnected.) |
14:34
π
|
|
Start has joined #archiveteam-bs |
14:35
π
|
|
Start has quit IRC (Client Quit) |
14:45
π
|
|
achip has joined #archiveteam-bs |
15:29
π
|
|
kurt has quit IRC (Remote host closed the connection) |
16:15
π
|
|
VADemon has joined #archiveteam-bs |
17:24
π
|
|
Swizzle has quit IRC (Quit: Leaving) |
17:50
π
|
|
GE has quit IRC (Quit: zzz) |
17:50
π
|
|
GE has joined #archiveteam-bs |
18:03
π
|
|
VADemon has quit IRC (Read error: Operation timed out) |
18:04
π
|
godane |
i'm at 889k items now |
18:11
π
|
yipdw |
whoa https://mosh.org/ |
18:12
π
|
xmc |
yeah mosh is super nifty |
18:12
π
|
yipdw |
I need to try this |
18:12
π
|
xmc |
highly recommended |
18:12
π
|
yipdw |
intermittent connectivity is the rule for me now and I'd love something that doesn't broken-pipe on me every time |
18:13
π
|
xmc |
not sure how its prediction works with how fish likes to redraw the command line in various colors |
18:13
π
|
xmc |
but it's worth a shot |
18:13
π
|
yipdw |
I need to figure out how to make mosh work with siped |
18:13
π
|
yipdw |
er spiped |
18:13
π
|
Frogging |
mosh is amazing |
18:14
π
|
yipdw |
maybe just tell mosh to connect via spiped and have networking work out the rest |
18:14
π
|
Frogging |
the one issue I have with it is that it breaks scrolling, so you'll probably want to use tmux/screen with it |
18:17
π
|
xmc |
i regularly use mosh on airplane wifi. it makes it tolerable. |
18:18
π
|
yipdw |
oh nice, I guess mosh uses SSH to establish the initial connection and start mosh-server |
18:18
π
|
yipdw |
so my existing spipe ProxyCommands work fine |
18:20
π
|
yipdw |
oh my god this is amazing |
18:21
π
|
xmc |
:D |
18:22
π
|
yipdw |
my build server runs in online.net's Paris datacenter and you get some noticeable lag on the Chicago -> Paris hop |
18:22
π
|
yipdw |
but not here |
18:28
π
|
yipdw |
this probably also means I can go back to using irssi |
18:29
π
|
Frogging |
I would suggest trying out Weechat |
18:29
π
|
xmc |
or irssi |
18:32
π
|
|
ravetcofx has joined #archiveteam-bs |
18:32
π
|
yipdw |
over the years I've come to know irssi fairly well so that's why |
18:32
π
|
yipdw |
one of these days I'll try a new client |
18:33
π
|
Frogging |
I used irssi up until I got a bouncer with multiple networks, and it didn't let connect to the same hostname multiple times. I wonder if they fixed that |
18:33
π
|
Frogging |
didn't let me* |
18:34
π
|
yipdw |
or I can be really obstinate and reinstall ircii |
18:35
π
|
Frogging |
this is kind of neat :p http://tools.suckless.org/ii/ |
18:35
π
|
yipdw |
that kinda reminds me of trying to use Plan9 |
18:35
π
|
yipdw |
cool ideas in theory but I couldn't really integrate them comfortably |
18:37
π
|
yipdw |
on the other hand, ii would probably make a good bot substrate |
19:21
π
|
HCross2 |
godane: I'll buy you a cake when you get to one million items |
19:57
π
|
|
SketchCow has joined #archiveteam-bs |
19:57
π
|
|
midas sets mode: +o SketchCow |
19:57
π
|
|
swebb sets mode: +o SketchCow |
20:04
π
|
godane |
that should be around december sometime |
20:05
π
|
godane |
based on me pushing for about 50k items a month |
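The estimate above can be checked with the two figures godane gives (889k items now, about 50k items a month). The function name is made up, and a steady upload rate is an assumption.

```python
import math

def months_to_target(current, target, per_month):
    """Months needed to reach `target` at a constant rate of `per_month`."""
    return (target - current) / per_month

months = months_to_target(889_000, 1_000_000, 50_000)
print(f"{months:.2f} months, so within {math.ceil(months)} calendar months")
```

That works out to a little over two months of uploads at the stated pace.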
20:32
π
|
|
Stiletto has quit IRC () |
20:38
π
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |
20:40
π
|
|
acridAxid has quit IRC (Quit: marauder) |
20:41
π
|
|
acridAxid has joined #archiveteam-bs |
20:51
π
|
|
Stiletto has joined #archiveteam-bs |
21:03
π
|
|
ndiddy has joined #archiveteam-bs |
21:05
π
|
|
computerf has quit IRC (Read error: Operation timed out) |
21:08
π
|
|
computerf has joined #archiveteam-bs |
21:08
π
|
|
RichardG has joined #archiveteam-bs |
21:24
π
|
|
computerf has quit IRC (Read error: Operation timed out) |
21:35
π
|
|
computerf has joined #archiveteam-bs |
21:41
π
|
|
ndiddy has quit IRC (Quit: Leaving) |
21:51
π
|
|
yeoldetoa has quit IRC (Remote host closed the connection) |
21:52
π
|
|
kristian_ has joined #archiveteam-bs |
22:22
π
|
|
BlueMaxim has joined #archiveteam-bs |
22:33
π
|
|
Start has joined #archiveteam-bs |
22:35
π
|
|
achip has quit IRC (Read error: Operation timed out) |
22:37
π
|
|
GE has quit IRC (hub.efnet.us irc.Prison.NET) |
23:07
π
|
SketchCow |
Hi Jason, |
23:07
π
|
SketchCow |
I read that your group is archiving Gawker. I'm a documentary producer, have created events and series for History Channel, National Geographic, MTV and others, and produce feature/festival documentaries for Smithsonian Network etc. |
23:07
π
|
SketchCow |
I am currently in pre-productions on a film about why Gawker and what you're doing are important. |
23:07
π
|
SketchCow |
May we get on the phone so that I can tell you a bit more about the project? |
23:07
π
|
SketchCow |
My goal is to interview you and document you and your volunteers saving history. |
23:07
π
|
SketchCow |
... |
23:07
π
|
SketchCow |
My intention is to not respond, unless someone thinks different. |
23:09
π
|
xmc |
approved |
23:25
π
|
arkiver |
yeah, sounds interesting |
23:25
π
|
xmc |
no i mean not responding is meeting with my approval |
23:26
π
|
arkiver |
I think it sounds interesting |
23:27
π
|
arkiver |
If he's positive about us, it would be nice to have us in a documentary |
23:28
π
|
xmc |
i've been burned enough times |
23:30
π
|
SketchCow |
Not responding. |
23:30
π
|
SketchCow |
"Gawker" + "Documentary" = hellscape |
23:31
π
|
arkiver |
:( ok |
23:31
π
|
arkiver |
if it's only about gawker then not |
23:32
π
|
arkiver |
but maybe he wants to do something more about web history in general too |
23:32
π
|
xmc |
from the description it's a gawker documentary |
23:32
π
|
SketchCow |
No. |
23:32
π
|
SketchCow |
This will be a gawker documentary |
23:32
π
|
arkiver |
oh well |
23:32
π
|
arkiver |
I'm off anyway |
23:32
π
|
arkiver |
have a good day all! |
23:32
π
|
* |
arkiver zzzzzzz |
23:32
π
|
SketchCow |
I'd rather watch my eye going through a spaghetti strainer with my remaining eye than be involved in anything glorifying gawker |
23:33
π
|
arkiver |
^ got it |
23:33
π
|
SketchCow |
Hi Jason, |
23:33
π
|
SketchCow |
I read that your group is archiving Gawker. I'm a documentary producer, have created events and series for History Channel, National Geographic, MTV and others, and produce feature/festival documentaries for Smithsonian Network etc. |
23:33
π
|
SketchCow |
<3 |
23:33
π
|
SketchCow |
Added disk check to pipeline, by the way, sleepy |
23:33
π
|
arkiver |
yes, saw it! |
23:33
π
|
arkiver |
Looks awesome :D |
23:33
π
|
arkiver |
thanks |
23:52
π
|
|
RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) |