#archiveteam-bs 2017-06-04,Sun


Who What When
***Boppen has quit IRC (Read error: Connection reset by peer)
Boppen has joined #archiveteam-bs
[00:10]
icedice2 has joined #archiveteam-bs
icedice has quit IRC (Ping timeout: 260 seconds)
[00:23]
Boppen has quit IRC (Read error: Connection reset by peer)
Boppen has joined #archiveteam-bs
[00:33]
Boppen has quit IRC (Read error: Connection reset by peer)
Boppen has joined #archiveteam-bs
VADemon has joined #archiveteam-bs
[00:46]
.... (idle for 17mn)
icedice2 has quit IRC (Read error: Connection reset by peer) [01:08]
BlueMaxim has joined #archiveteam-bs [01:19]
ZexaronS has quit IRC (Leaving)
schbirid2 has joined #archiveteam-bs
j08nY has quit IRC (Quit: Leaving)
JRWR-Work is now known as JRWR
schbirid has quit IRC (Ping timeout: 268 seconds)
schbirid2 has quit IRC (Ping timeout: 250 seconds)
schbirid has joined #archiveteam-bs
[01:31]
.......... (idle for 47mn)
JRWRIs it strange that I find it fun to run a Rsync target [02:28]
joepie91JRWR: you might be a datahoarder
:P
[02:28]
JRWRLOOK AT THESE GRAPHS http://jrwr.io:19999
JRWR starts to drool
[02:28]
joepie91lol [02:29]
FroggingI love graphs, I look at my munin ones all the time for no reason
:p
[02:31]
JRWRnetdata has awesome live graphs [02:31]
***dashcloud has quit IRC (Read error: Operation timed out) [02:33]
Odd0002JRWR: you only have 1 GB of RAM? [02:35]
JRWROn that box its 16GB [02:35]
Odd0002oh, I didn't even look at "cached" [02:35]
JRWRIts handling the Pixiv project right now [02:36]
Odd0002is pixiv going down soon or something?
I haven't been paying attention
[02:37]
Frogginglol I turned on my nas after it being off for a few weeks, the rootfs is hosed
john@oblivion:~$ htop
-bash: /usr/bin/htop: Input/output error
are SSDs really that bad without power
[02:37]
JRWROdd0002: chat.pixiv is [02:38]
Odd0002ok [02:38]
hook54321Are there any media outlets that have said who the attackers in London are, or anything about them? [02:48]
MrRadarFrogging: If they're worn-out, maybe? A healthy SSD should definitely be able to go a few weeks without power [02:49]
Froggingyeah I don't know yet if the SSD is broken or just the filesystem
I don't trust it anymore either way though
[02:50]
MrRadarYeah. At the very least I'd do a secure erase on it to reset its internal state [02:51]
JRWRMrRadar: Man we are blazing now
we are up 3x the speed now
[02:52]
MrRadarYeah, FOS has lots of storage but it taps out fairly quickly on IOPS [02:53]
JRWRAll I got is 8TB [02:54]
***kyounko has quit IRC (Read error: Operation timed out) [02:55]
JRWRI was thinking about that
cross uploading to FOS to make sure in case of my server failing
there is a backup
but I would do it in big chunks
[02:56]
Odd0002wait what, SSDs die if they're not on? [02:57]
MrRadarThey *can* but any decent drive should be able to go for a year or more unpowered [02:58]
JRWRYes
Bit Fade
[02:58]
MrRadarTheir flash cells leak electrons [02:58]
JRWREven USB Drives do [02:58]
MrRadarAnd SD/CF/whatever flash cards [02:59]
JRWRSpinning Rust will Decay as well
but it takes MUCH longer
[02:59]
Odd0002how long does it take for SSD's? I know I have a 20 GB HDD that came with win98 that still works [03:00]
JRWRwhen was the last time it was powered on [03:00]
Odd0002it's still on
oh, the HDD?
a few weeks ago
[03:01]
MrRadarI found an article on SSD unpowered data retention: http://www.anandtech.com/show/9248/the-truth-about-ssd-data-retention
Includes this graph, showing how many weeks a drive is expected to retain data based on the temperature at which the data was written and the temperature the drive is stored: http://images.anandtech.com/doci/9248/3_575px.PNG
[03:02]
JRWRGotta keep them powered on
Mostly every type of storage will decay
[03:04]
Odd0002huh, guess I need to heat up my SSD while it's on [03:05]
JRWRlol
lol, my rsync MOTD shows up on the warriors
[03:05]
Odd0002maybe attach a peltier device to my SSD that heats one side when the PC is on and cools it when its off
if that data is correct then it might lengthen the data's lifetime
if I heat it to 55C and cool it to 25 when its off then I could get 8x the data lifetime!
[03:06]
MrRadarReading the article, that chart is for a drive at the end of its lifespan. Newer drives should retain data much longer [03:10]
Odd0002I know, but now I know what to do with my SSD after it's near the end of its lifespan
heat it when I power it on then put it in the freezer afterwards
[03:12]
JRWRIt's like playing an incremental game, watching the numbers go up and up [03:22]
.... (idle for 16mn)
JRWR buys some Archive Team Stickers [03:38]
MrRadarSpeaking of rsync, can we have seesaw detect when rsync failed because the files don't exist and just fail that job? Right now I'm going to have to cancel 5 jobs because the 6th is stuck in an infinite upload failure loop [03:38]
JRWRlol
What project? is it savepixiv?
[03:39]
MrRadarIt's for SPUF
But I had to do the same for a pixiv pipeline earlier today
[03:39]
joepie91MrRadar: it should never try to upload nonexistent files in the first place? [03:47]
JRWRYa
That should be the pipeline's fault
[03:47]
MrRadarI agree with that part too, but since it seems like it happens every so often it should probably still be handled [03:48]
JRWRYa, overall if a job keeps failing outright
the job manager should just nuke it and send it back
[03:48]
joepie91MrRadar: have you filed a bug about this occurring? [03:50]
MrRadarNo, but that's a good idea [03:51]
joepie91rsync trying to upload nonexistent files sounds indicative of a bigger, more serious issue :)
it's an invalid state that should never occur
[03:51]
MrRadarLooks like there's already a bug for the missing data issue: https://github.com/ArchiveTeam/seesaw-kit/issues/48 [03:52]
joepie91MrRadar: hold on, that's failing on a nonexistent *directory*
not nonexistent *files*
[03:52]
MrRadarYeah, sorry for not being precise, that's the exact issue I'm seeing
LOL, I even commented on the issue at the very end
Almost exactly 1 year ago
[03:53]
JRWRI would love to become a secondary rsync ingress for AT
have it auto sync the uploaded data to FOS and clear out the old when I have confirmed data upload
[03:55]
joepie91MrRadar: hm :/ [03:56]
JRWRdo it in bulk transfers and such since FOS iops are kind of low
afk 1 hour
[03:56]
***JRWR has quit IRC (Quit: Page closed)
tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
[03:57]
.... (idle for 18mn)
tfgbd_znc has joined #archiveteam-bs [04:17]
........ (idle for 39mn)
SN4T14 has quit IRC (Quit: ZNC 1.6.3 - http://znc.in) [04:56]
...................... (idle for 1h45mn)
SN4T14 has joined #archiveteam-bs [06:41]
Aranje has quit IRC (Ping timeout: 506 seconds) [06:55]
............ (idle for 59mn)
SHODAN_UI has joined #archiveteam-bs [07:54]
PurpleSymjoepie91, arkiver: Just skimmed through the logs, but that “garbage” looks a lot like HTTP chunked transfer encoding to me. [08:04]
***Mayonaise has quit IRC (Read error: Operation timed out) [08:05]
Mayonaise has joined #archiveteam-bs
SHODAN_UI has quit IRC (Remote host closed the connection)
[08:17]
..................... (idle for 1h41mn)
Whopper_ has quit IRC (Ping timeout: 250 seconds) [09:59]
Whopper has joined #archiveteam-bs [10:05]
Whopper has quit IRC (Read error: Operation timed out)
Whopper has joined #archiveteam-bs
[10:14]
RichardG has quit IRC (Read error: Connection reset by peer)
RichardG has joined #archiveteam-bs
SHODAN_UI has joined #archiveteam-bs
[10:26]
.... (idle for 16mn)
godane has quit IRC (Ping timeout: 506 seconds) [10:43]
BlueMaxim has quit IRC (Read error: Operation timed out)
BlueMaxim has joined #archiveteam-bs
[10:50]
j08nY has joined #archiveteam-bs [11:03]
antomati_ has joined #archiveteam-bs
swebb sets mode: +o antomati_
antomatic has quit IRC (Read error: Operation timed out)
antomati_ is now known as antomatic
[11:17]
fie has quit IRC (Ping timeout: 246 seconds) [11:26]
t2t2rsync error: errors selecting input/output files, dirs (code 3) at flist.c(2118) [sender=3.1.1]
Process RsyncUpload returned exit code 3 for Item threads:2755750-2755759
^ so this is what you were discussing 12 hours ago; it just happened to me too
[11:29]
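The behaviour asked for earlier in the log (fail the job instead of retrying an upload forever) could be sketched roughly as below. This is only an illustration and not seesaw-kit code: `should_retry_upload`, `PERMANENT_RSYNC_CODES` and `max_attempts` are hypothetical names, and treating exit code 3 as permanent is an assumption that may not hold if the target directory is only temporarily missing.

```python
# Hypothetical helper; not part of seesaw-kit or any ArchiveTeam pipeline.
# rsync exit codes per its man page: 1 = syntax or usage error,
# 3 = errors selecting input/output files, dirs.
PERMANENT_RSYNC_CODES = {1, 3}  # assumption: retrying these will not help


def should_retry_upload(exit_code: int, attempts: int, max_attempts: int = 10) -> bool:
    """Return True if the upload should be tried again, False to stop."""
    if exit_code == 0:
        return False  # upload succeeded, nothing to retry
    if exit_code in PERMANENT_RSYNC_CODES:
        return False  # e.g. the item directory does not exist; fail the item
    # Network-type errors (10, 12, 30, ...) are retried, but not forever.
    return attempts < max_attempts
```

With a rule like this, the "code 3" failure quoted above would fail the item on the first attempt rather than blocking the other jobs in the same pipeline.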
***SHODAN_UI has quit IRC (Remote host closed the connection)
fie has joined #archiveteam-bs
[11:40]
ZexaronS has joined #archiveteam-bs [11:47]
.... (idle for 18mn)
GLaDOS has quit IRC (Read error: Operation timed out) [12:05]
............ (idle for 57mn)
BlueMaxim has quit IRC (Read error: Operation timed out)
bmcginty has quit IRC (Ping timeout: 268 seconds)
bmcginty has joined #archiveteam-bs
[13:02]
...... (idle for 27mn)
tfgbd_znc has quit IRC (Ping timeout: 600 seconds) [13:35]
.... (idle for 16mn)
tfgbd_znc has joined #archiveteam-bs
icedice has joined #archiveteam-bs
[13:51]
.... (idle for 18mn)
fio67 has joined #archiveteam-bs [14:11]
timmcPurpleSym: There are allegedly some instances where the garbage is space-delimited instead of newline-delimited. [14:11]
Nazcaguys, is 6 warriors going to get me IP banned from anything besides yahoo and that other project? [14:12]
PurpleSymGot an example URL, timmc? [14:12]
timmcI don't, just reporting what I saw in chat. [14:14]
fio67Hello, I know you guys aren't archive.org, but do you know if there's a channel for it? Tried ##archive on freenode but it's invite-only. [14:15]
timmcPurpleSym: There was this *very* interesting report joepie91 generated for (I think) just one URL, and by gosh it does look like garbled chunked transfer-encoding: http://sprunge.us/RjWi [14:18]
PurpleSymWould be interesting to look at the actual WARC. [14:19]
timmcThere are overlapping context chunks that have the right byte counts. (And look, there's a zero at the end.) [14:19]
MrRadarfio67: There's #internetarchive here on EFNet but I'm not sure if it's "official" [14:20]
fio67MrRadar: thanks, I'll take a look [14:20]
***fie has quit IRC (Ping timeout: 600 seconds)
Kalroth has joined #archiveteam-bs
[14:22]
timmcPurpleSym: I think https://archive.org/download/archiveteam_portalgraphics_20160727140857/portalgraphics_20160727140857.megawarc.warc.gz and https://web.archive.org/web/20160724001629/http://www.portalgraphics.net/pg/illust/?image_id=10575 [14:28]
PurpleSymThat URL is not listed in the CDX as far as I see. [14:31]
***fie has joined #archiveteam-bs
tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
[14:33]
joepie91PurpleSym: timmc: transfer chunk encoding sizes are decimal though, not hexadecimal?
that having been said it does seem to generally match up
in terms of size
[14:35]
PurpleSymNope, hex: https://tools.ietf.org/html/rfc2616#section-3.6.1 [14:35]
joepie91huh. [14:35]
timmcjoepie91: Those hex chunks that are space-delimited... is that only in the version displayed on web.archive.org, or also in the WARC? [14:37]
joepie91no idea, haven't checked the source. I think it's easier for arkiver to check that [14:38]
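To make the hex-versus-decimal point concrete, here is a minimal sketch of a chunked transfer-encoding decoder. It is an illustration only (the function name `decode_chunked` is made up here, trailers are ignored, and no error handling is attempted); it is not the code used by wget-lua, ArchiveBot or the Wayback Machine.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked message body (trailers are ignored)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        # Anything after ";" on the size line is a chunk extension.
        size_field = raw[pos:line_end].split(b";", 1)[0]
        size = int(size_field.decode("ascii"), 16)  # hexadecimal, not decimal
        pos = line_end + 2
        if size == 0:
            break  # the zero-length last chunk ends the body
        body += raw[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)
```

A chunk announced as `1a` therefore carries 26 bytes. If the chunk framing is never removed, the size lines (and the final `0`) end up interleaved with the page content, which would match the "garbage" described above.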
***Kalroth has quit IRC (Quit: Bye!) [14:47]
tfgbd_znc has joined #archiveteam-bs
JRWR has joined #archiveteam-bs
[14:52]
fio67 has quit IRC (Quit: Page closed) [15:00]
icedice has quit IRC (Ping timeout: 506 seconds) [15:07]
Kalroth has joined #archiveteam-bs [15:15]
pikhq has quit IRC (Ping timeout: 245 seconds)
tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
pikhq has joined #archiveteam-bs
[15:26]
Aranje has joined #archiveteam-bs [15:37]
tfgbd_znc has joined #archiveteam-bs
SHODAN_UI has joined #archiveteam-bs
[15:45]
..... (idle for 20mn)
Honno has joined #archiveteam-bs
tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
[16:05]
w0rp has quit IRC (Ping timeout: 245 seconds)
tfgbd_znc has joined #archiveteam-bs
w0rp has joined #archiveteam-bs
[16:11]
........ (idle for 37mn)
dashcloud has joined #archiveteam-bs [16:49]
.......... (idle for 48mn)
ZexaronS- has joined #archiveteam-bs
ZexaronS has quit IRC (Read error: Operation timed out)
ZexaronS- has quit IRC (Client Quit)
[17:37]
.... (idle for 19mn)
JRWR has quit IRC (Quit: Page closed)
antomati_ has joined #archiveteam-bs
swebb sets mode: +o antomati_
Silvan has quit IRC (Read error: Operation timed out)
antomatic has quit IRC (Ping timeout: 250 seconds)
antomati_ is now known as antomatic
SilSte has joined #archiveteam-bs
[18:00]
...... (idle for 26mn)
SHODAN_UI has quit IRC (Remote host closed the connection)
j08nY has quit IRC (Read error: Operation timed out)
[18:30]
JRWR has joined #archiveteam-bs [18:40]
JRWRAll the DATA, Give me maor! So as a just in case it happens
when I get to 75% full on my rsync ingress server, what do
[18:41]
joepie91add more HDDs
:p
[18:42]
JRWRits a OVH Box [18:42]
joepie91JRWR: also, you probably should be in #DataHoarder on Freenode :P [18:42]
JRWRI am already :3 [18:43]
joepie91anyway, that wasn't totally serious advice
oh, you are?
[18:43]
JRWRwell once I get my server back [18:43]
joepie91not under this nick though?
ah
[18:43]
JRWRya my 190TB Plex Archive is very nice
I used my main server as a Weechat instance
since I blanked it for this project I dont have that setup atm
since I have 1Gbps up going to waste, I thought about syncing to FOS
but I would want to talk to SketchCow before I did, make sure everything was secure
I could do bulk uploads to it, then purge the old jobs that I have confirmed on FOS
but I've got another 6TB to go, so we have tons of time
if its even needed at all
[18:43]
.... (idle for 18mn)
schbirid190TB on ovh? what do you pay? [19:04]
voidstaim guessing plex cloud
synced with a google drive
thatd be a hefty chunk of change :D
[19:07]
JRWRI got one of the "Unlimited" google drive accounts
so thats where it is
UnEncrypted so it can be deduped
[19:08]
schbiridah nice [19:09]
JRWRThe server I am using now for rsync ingress was my general server (had plex, website, other crap)
I wiped it and raid0 it for this, and man I'm impressed
[19:11]
voidstamy main is an EG-16
they recently lowered the prices on them
but im locked in at my original price
:(
[19:13]
JRWRThats what I'm using
im at 78$
[19:13]
voidstasame
79
it's like 74 now
lol
[19:14]
JRWRI really want to find another one
that is cheaper with the same stats
[19:14]
voidstadunno if youll be able to
unmetered BW is the bomb
[19:16]
Kalrotheverybody wants unmetered, but noone wants to pay for it :P [19:18]
voidsta^^ [19:18]
schbiridi like cheap hetzner auction servers [19:19]
arkiverJRWR: I'll handle the uploads to IA or FOS [19:19]
schbirid6tb for ~30€ but limited bw [19:19]
Kalroth"unlimited" bw on that one [19:20]
joepie91nobody will beat OVH for unmetered [19:20]
voidsta^ [19:20]
joepie91that's pretty much certain
anybody who does will almost certainly go bankrupt in under a year
[19:20]
voidstathat's why they have my service
hehe
[19:20]
joepie91(and they're usually just OVH resellers anyway) [19:20]
Kalrothunlimited as in over 20TB = your 1gbit goes 10mbit
(20TB outgoing)
so it's really not that bad for a hetzner server
[19:20]
joepie91like, probably the only reason OVH can get away with unmetered with a relatively good network is that they have a massive network of their own
none of the other big names in budget servers have that
[19:21]
Kalrothtier 1 like, yeah [19:21]
joepie91and none of the new players will have it either, at least not for a while [19:21]
JRWRarkiver: still got the logins I gave you [19:22]
voidstajoepie91: theyre expanding rapidly too [19:22]
joepie91hence why it's pretty much certain that nobody will beat OVH without giving up on quality
for unmetered stuff :P
voidsta: yeah
[19:22]
voidstabeen following the ceo on twitter
new dc here
new dc there
everyone gets a dc
lol
[19:22]
joepie91hehe, was about to make that joke [19:22]
voidsta:D [19:22]
arkiverJRWR: yes [19:23]
JRWRCool [19:29]
also the scaleway arm64 instances are NICE
they are faster than the x86/arm ones
OVH is really the only name in dedicated servers without the fuss
At least stateside
most of the other places are just meh
[19:38]
***Igloo has quit IRC (Read error: Operation timed out) [19:49]
voidstaagree [19:50]
***bmcginty has quit IRC (Ping timeout: 250 seconds)
Igloo has joined #archiveteam-bs
bmcginty has joined #archiveteam-bs
ZexaronS has joined #archiveteam-bs
[19:57]
.... (idle for 17mn)
pizzaiolo has joined #archiveteam-bs [20:21]
JAAIs this page broken for anyone else? https://archive.org/search.php?query=collection%3Aarchivebot&sort=-publicdate
The HTML just stops somewhere in the navigation links.
[20:30]
JRWRworks for me [20:30]
JAAHm, weird. [20:30]
JRWRit has magic scrolling going on
when you get to the bottom
[20:30]
JAAI know that, but I only get a partial navigation bar, no content at all.
<li><a target="_top" href="//blog.archive.org">BLOG</a></li>
<li><a target=
^ End of the HTML
[20:31]
JRWRYa
Im getting full HTML
https://hastebin.com/ipupobevun.xml
[20:33]
JAAHm yeah, it works from one of my servers, but from this machine, I get it on both two different Firefox profiles and cURL. What the hell? [20:34]
***j08nY has joined #archiveteam-bs
kristian_ has joined #archiveteam-bs
[20:38]
JAALooks like the connection is killed after receiving about 32k of data. [20:42]
SanquiJAA: where are you located, which network provider do you use [20:44]
JAAhttps://archive.org/search.php works, but any actual search has the same issue.
Sanqui: Switzerland, currently on a Swisscom connection (not my own)
[20:44]
Sanquiok. no problems from czech republic [20:45]
JAAIt works from another machine within Switzerland.
(Not on the same network)
[20:46]
***pizzaiolo has quit IRC (Quit: pizzaiolo) [20:48]
JRWRAny of you going to defcon this year? [20:55]
***SHODAN_UI has joined #archiveteam-bs [20:56]
Odd0002huh, my yahoo answers thing has been running for 22 hours now [21:00]
***icedice has joined #archiveteam-bs [21:04]
JAAThe IA page randomly started working again. ¯\_(ツ)_/¯ [21:04]
Kalrothits magic! [21:07]
***pizzaiolo has joined #archiveteam-bs [21:10]
pizzaiolo has quit IRC (Read error: Connection reset by peer)
pizzaiolo has joined #archiveteam-bs
[21:18]
godane has joined #archiveteam-bs [21:27]
ZexaronS has quit IRC (Leaving)
kristian_ has quit IRC (Quit: Leaving)
[21:37]
.... (idle for 18mn)
JAAIs there any way to get from a Wayback Machine page to the corresponding item (or even better, WARC file)? [21:59]
***SHODAN_UI has quit IRC (Remote host closed the connection) [22:00]
schbiridyou could not download them anyways [22:10]
***Ravenloft has joined #archiveteam-bs [22:10]
JAAThis particular one is from the ArchiveBot collection, so I most likely could. [22:11]
SanquiJAA: try the viewer then https://archive.fart.website/archivebot/viewer/ [22:21]
JAASanqui: Yeah, I tried that, but didn't find any match.
I'm trying to look into the corruption issue that was discussed yesterday. Much of the discussion focused on wget-lua, but as DoomTay mentioned earlier in #archivebot, at least one ArchiveBot grab was also affected: https://web.archive.org/web/20160615222159/http://www.portalgraphics.net/pg/illust/?image_id=10575
[22:22]
How certain are we that this "corruption" is really a client-side issue? It definitely looks like chunked transfer encoding gone wrong. But that could also be a missing "Transfer-Encoding: chunked" header from the portalgraphics server. Have we found any examples on other domains yet? [22:38]
......... (idle for 42mn)
godanei can't upload to IA
2.4%Warning: Transient problem: HTTP error Will retry in 5 seconds. 10 retries
i keep getting that
[23:20]
JAAHere's a list mapping the URLs SketchCow posted yesterday to the corresponding IA item: https://hastebin.com/raw/iruvobodor . I also included some additional examples I found. [23:24]
***jtn2 has left [23:28]
.... (idle for 17mn)
JAAHmm. I'm looking at archiveteam_portalgraphics_20160727144032 now. Their server does send the Transfer-Encoding: chunked header according to the WARC. And there is no double chunked encoding or something like that.
This makes me think that maybe the IA processes these incorrectly.
[23:45]
Here's the WARC records for https://web.archive.org/web/20160725184715/http://www.portalgraphics.net/pg/illust/?image_id=21107&lang=en : https://hastebin.com/raw/gokiwuzage
(Hastebin seems to screw up the Unicode characters there, disregard that.)
[23:50]
I suspect that it's due to the space character after "5c". This space doesn't conform to the specs ( https://tools.ietf.org/html/rfc7230#section-4.1 ), which define a chunk as `chunk-size [ chunk-ext ] CRLF chunk-data CRLF`. chunk-size is the size of the chunk in hexadecimal digits (upper or lower case), chunk-ext is an optional extension of the algorithm and is `*( ";" chunk-ext-name [ "=" chunk-ext-val ] )`, i.e. starts always with a semicolon.
[23:59]
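The difference between a strict and a lenient reading of the chunk-size line can be sketched as follows. This is a rough illustration of the grammar quoted above, not the parser of any particular tool: the function names are hypothetical, the quoted-string pattern is a simplification, and whether IA's processing actually trips over the space is only the suspicion voiced in the log.

```python
import re
from typing import Optional

# token and quoted-string, simplified from the HTTP/1.1 grammar (assumption:
# close enough for illustration, not a complete implementation).
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'
# chunk-size [ chunk-ext ], i.e. the size line without its CRLF.
CHUNK_LINE = re.compile(
    rf"[0-9A-Fa-f]+(?:;{TOKEN}(?:=(?:{TOKEN}|{QUOTED}))?)*"
)


def parse_chunk_size_strict(line: str) -> Optional[int]:
    """Return the chunk size, or None if the line violates the grammar."""
    if CHUNK_LINE.fullmatch(line) is None:
        return None
    return int(line.split(";", 1)[0], 16)


def parse_chunk_size_lenient(line: str) -> int:
    """Tolerate stray whitespace around the size, as many clients do."""
    return int(line.split(";", 1)[0].strip(), 16)
```

Under these assumptions, `"5c"` and `"5c;name=val"` parse in both modes, while `"5c "` (with the trailing space) is rejected by the strict parser but read as 92 by the lenient one. A strict consumer fed such a response might give up on de-chunking and pass the body through as-is.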
