#archiveteam-bs 2017-06-04,Sun

Time Nickname Message
00:10 πŸ”— Boppen has quit IRC (Read error: Connection reset by peer)
00:11 πŸ”— Boppen has joined #archiveteam-bs
00:23 πŸ”— icedice2 has joined #archiveteam-bs
00:26 πŸ”— icedice has quit IRC (Ping timeout: 260 seconds)
00:33 πŸ”— Boppen has quit IRC (Read error: Connection reset by peer)
00:34 πŸ”— Boppen has joined #archiveteam-bs
00:46 πŸ”— Boppen has quit IRC (Read error: Connection reset by peer)
00:48 πŸ”— Boppen has joined #archiveteam-bs
00:51 πŸ”— VADemon has joined #archiveteam-bs
01:08 πŸ”— icedice2 has quit IRC (Read error: Connection reset by peer)
01:19 πŸ”— BlueMaxim has joined #archiveteam-bs
01:31 πŸ”— ZexaronS has quit IRC (Leaving)
01:32 πŸ”— schbirid2 has joined #archiveteam-bs
01:33 πŸ”— j08nY has quit IRC (Quit: Leaving)
01:34 πŸ”— JRWR-Work is now known as JRWR
01:35 πŸ”— schbirid has quit IRC (Ping timeout: 268 seconds)
01:38 πŸ”— schbirid2 has quit IRC (Ping timeout: 250 seconds)
01:41 πŸ”— schbirid has joined #archiveteam-bs
02:28 πŸ”— JRWR Is it strange that I find it fun to run a Rsync target
02:28 πŸ”— joepie91 JRWR: you might be a datahoarder
02:28 πŸ”— joepie91 :P
02:28 πŸ”— JRWR LOOK AT THESE GRAPHS http://jrwr.io:19999
02:29 πŸ”— * JRWR starts to drool
02:29 πŸ”— joepie91 lol
02:31 πŸ”— Frogging I love graphs, I look at my munin ones all the time for no reason
02:31 πŸ”— Frogging :p
02:31 πŸ”— JRWR netdata has awesome live graphs
02:33 πŸ”— dashcloud has quit IRC (Read error: Operation timed out)
02:35 πŸ”— Odd0002 JRWR: you only have 1 GB of RAM?
02:35 πŸ”— JRWR On that box its 16GB
02:35 πŸ”— Odd0002 oh, I didn't even look at "cached"
02:36 πŸ”— JRWR Its handling the Pixiv project right now
02:37 πŸ”— Odd0002 is pixiv going down soon or something?
02:37 πŸ”— Odd0002 I haven't been paying attention
02:37 πŸ”— Frogging lol I turned on my nas after it being off for a few weeks, the rootfs is hosed
02:37 πŸ”— Frogging john@oblivion:~$ htop
02:37 πŸ”— Frogging -bash: /usr/bin/htop: Input/output error
02:38 πŸ”— Frogging are SSDs really that bad without power
02:38 πŸ”— JRWR Odd0002: chat.pixiv is
02:38 πŸ”— Odd0002 ok
02:48 πŸ”— hook54321 Are there any media outlets that have said who the attackers in London are, or anything about them?
02:49 πŸ”— MrRadar Frogging: If they're worn-out, maybe? A healthy SSD should definitely be able to go a few weeks without power
02:50 πŸ”— Frogging yeah I don't know yet if the SSD is broken or just the filesystem
02:50 πŸ”— Frogging I don't trust it anymore either way though
02:51 πŸ”— MrRadar Yeah. At the very least I'd do a secure erase on it to reset its internal state
02:52 πŸ”— JRWR MrRadar: Man we are blazing now
02:52 πŸ”— JRWR we are up 3x the speed now
02:53 πŸ”— MrRadar Yeah, FOS has lots of storage but it taps out fairly quickly on IOPS
02:54 πŸ”— JRWR All I got is 8TB
02:55 πŸ”— kyounko has quit IRC (Read error: Operation timed out)
02:56 πŸ”— JRWR I was thinking about that
02:56 πŸ”— JRWR cross uploading to FOS to make sure in case of my server failing
02:56 πŸ”— JRWR there is a backup
02:56 πŸ”— JRWR but I would do it in big chunks
02:57 πŸ”— Odd0002 wait what, SSDs die if they're not on?
02:58 πŸ”— MrRadar They *can* but any decent drive should be able to go for a year or more unpowered
02:58 πŸ”— JRWR Yes
02:58 πŸ”— JRWR Bit Fade
02:58 πŸ”— MrRadar Their flash cells leak electrons
02:58 πŸ”— JRWR Even USB Drives do
02:59 πŸ”— MrRadar And SD/CF/whatever flash cards
02:59 πŸ”— JRWR Spinning Rust will Decay as well
02:59 πŸ”— JRWR but it takes MUCH longer
03:00 πŸ”— Odd0002 how long does it take for SSD's? I know I have a 20 GB HDD that came with win98 that still works
03:00 πŸ”— JRWR when was the last time it was powered on
03:01 πŸ”— Odd0002 it's still on
03:01 πŸ”— Odd0002 oh, the HDD?
03:02 πŸ”— Odd0002 a few weeks ago
03:02 πŸ”— MrRadar I found an article on SSD unpowered data retention: http://www.anandtech.com/show/9248/the-truth-about-ssd-data-retention
03:03 πŸ”— MrRadar Includes this graph, showing how many weeks a drive is expected to retain data based on the temperature at which the data was written and the temperature the drive is stored: http://images.anandtech.com/doci/9248/3_575px.PNG
03:04 πŸ”— JRWR Gotta keep them powered onj
03:04 πŸ”— JRWR Mostly every type of storage will decay
03:05 πŸ”— Odd0002 huh, guess I need to heat up my SSD while it's on
03:05 πŸ”— JRWR lol
03:05 πŸ”— JRWR lol, my rsync MOTD shows up on the warriors
03:06 πŸ”— Odd0002 maybe attach a peltier device to my SSD that heats one side when the PC is on and cools it when its off
03:08 πŸ”— Odd0002 if that data is correct then it might lengthen the data's lifetime
03:09 πŸ”— Odd0002 if I heat it to 55C and cool it to 25 when its off then I could get 8x the data lifetime!
03:10 πŸ”— MrRadar Reading the article, that chart is for a drive at the end of its lifespan. Newer drives should retain data much longer
03:12 πŸ”— Odd0002 I know, but now I know what to do with my SSD after it's near the end of its lifespan
03:12 πŸ”— Odd0002 heat it when I power it on then put it in the freezer afterwards
03:22 πŸ”— JRWR Its like playing a incermental game, watching the numbers go up and up
03:38 πŸ”— * JRWR buys some Archive Team Stickets
03:38 πŸ”— MrRadar Speaking of rsync, can we have seesaw detect when rsync failed because the files don't exist and just fail that job? Right now I'm going to have to cancel 5 jobs because the 6th is stuck in an infinite upload failure loop
03:39 πŸ”— JRWR lol
03:39 πŸ”— JRWR What project? is it savepixiv?
03:39 πŸ”— MrRadar It's for SPUF
03:39 πŸ”— MrRadar But I had to do the same for some a pixiv pipeline earlier today
03:47 πŸ”— joepie91 MrRadar: it should never try to upload nonexistent files in the first place?
03:47 πŸ”— JRWR Ya
03:47 πŸ”— JRWR That should be the pipline's fault
03:48 πŸ”— MrRadar I agree with that part too, but since it seems like it happens every so often it should probably still be handled
03:48 πŸ”— JRWR Ya, overall if a job keeps failing outright
03:48 πŸ”— JRWR the job manager should just nuke it and send it back
03:50 πŸ”— joepie91 MrRadar: have you filed a bug about this occurring?
03:51 πŸ”— MrRadar No, but that's a good idea
03:51 πŸ”— joepie91 rsync trying to upload nonexistent files sounds indicative of a bigger, more serious issue :)
03:51 πŸ”— joepie91 it's an invalid state that should never occur
03:52 πŸ”— MrRadar Looks like there's already a bug for the missing data issue: https://github.com/ArchiveTeam/seesaw-kit/issues/48
03:52 πŸ”— joepie91 MrRadar: hold on, that's failing on a nonexistent *directory*
03:52 πŸ”— joepie91 not nonexistent *files*
03:53 πŸ”— MrRadar Yeah, sorry for not being precise, that's the exact issue I'm seeing
03:53 πŸ”— MrRadar LOL, I even commented on the issue at the very end
03:54 πŸ”— MrRadar Almost exactly 1 year ago
03:55 πŸ”— JRWR I would love to become a secondary rsync ingress for AT
03:55 πŸ”— JRWR have it auto sync the uploaded data to FOS and clear out the old when I have confirmed data upload
03:56 πŸ”— joepie91 MrRadar: hm :/
03:56 πŸ”— JRWR do it in bulk transfers and such since FOS iops are kind of low
03:57 πŸ”— JRWR afk 1 hour
03:57 πŸ”— JRWR has quit IRC (Quit: Page closed)
03:59 πŸ”— tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
04:17 πŸ”— tfgbd_znc has joined #archiveteam-bs
04:56 πŸ”— SN4T14 has quit IRC (Quit: ZNC 1.6.3 - http://znc.in)
06:41 πŸ”— SN4T14 has joined #archiveteam-bs
06:55 πŸ”— Aranje has quit IRC (Ping timeout: 506 seconds)
07:54 πŸ”— SHODAN_UI has joined #archiveteam-bs
08:04 πŸ”— PurpleSym joepie91, arkiver: Just skimmed through the logs, but that “garbage” looks alot like HTTP chunked transfer encoding to me.
08:05 πŸ”— Mayonaise has quit IRC (Read error: Operation timed out)
08:17 πŸ”— Mayonaise has joined #archiveteam-bs
08:18 πŸ”— SHODAN_UI has quit IRC (Remote host closed the connection)
09:59 πŸ”— Whopper_ has quit IRC (Ping timeout: 250 seconds)
10:05 πŸ”— Whopper has joined #archiveteam-bs
10:14 πŸ”— Whopper has quit IRC (Read error: Operation timed out)
10:18 πŸ”— Whopper has joined #archiveteam-bs
10:26 πŸ”— RichardG has quit IRC (Read error: Connection reset by peer)
10:26 πŸ”— RichardG has joined #archiveteam-bs
10:27 πŸ”— SHODAN_UI has joined #archiveteam-bs
10:43 πŸ”— godane has quit IRC (Ping timeout: 506 seconds)
10:50 πŸ”— BlueMaxim has quit IRC (Read error: Operation timed out)
10:51 πŸ”— BlueMaxim has joined #archiveteam-bs
11:03 πŸ”— j08nY has joined #archiveteam-bs
11:17 πŸ”— antomati_ has joined #archiveteam-bs
11:17 πŸ”— swebb sets mode: +o antomati_
11:18 πŸ”— antomatic has quit IRC (Read error: Operation timed out)
11:21 πŸ”— antomati_ is now known as antomatic
11:26 πŸ”— fie has quit IRC (Ping timeout: 246 seconds)
11:29 πŸ”— t2t2 rsync error: errors selecting input/output files, dirs (code 3) at flist.c(2118) [sender=3.1.1]
11:29 πŸ”— t2t2 Process RsyncUpload returned exit code 3 for Item threads:2755750-2755759
11:29 πŸ”— t2t2 ^ so this what you were discussing 12 hours ago just happened to me too
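
[Editor's note] The request above (have the uploader stop retrying when rsync fails because the source files or directories do not exist) could be sketched roughly as follows. This is only an illustration and not seesaw's actual code or API: the function name `classify_rsync_exit` and the fatal/retry split are assumptions. The exit-code meanings themselves come from the rsync man page, and code 3 is the one reported in the log lines above.

```python
# Hypothetical helper, not part of seesaw-kit: decide what to do with an
# rsync exit code. Meanings are taken from the rsync man page.
RSYNC_FATAL = {
    1: "syntax or usage error",
    2: "protocol incompatibility",
    3: "errors selecting input/output files, dirs",
}


def classify_rsync_exit(code: int) -> str:
    """Return "ok", "fail" (give up on this item) or "retry"."""
    if code == 0:
        return "ok"
    if code in RSYNC_FATAL:
        # Retrying cannot help: the local files or directories are missing
        # or the command line itself is wrong.
        return "fail"
    # Everything else (timeouts, socket errors, ...) is treated as
    # transient in this sketch, which is an assumption.
    return "retry"


if __name__ == "__main__":
    for code in (0, 3, 30):
        print(code, classify_rsync_exit(code))
```

With a check like this, an item whose upload returns exit code 3 would be failed once instead of looping, while network-level errors would still be retried.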
11:40 πŸ”— SHODAN_UI has quit IRC (Remote host closed the connection)
11:40 πŸ”— fie has joined #archiveteam-bs
11:47 πŸ”— ZexaronS has joined #archiveteam-bs
12:05 πŸ”— GLaDOS has quit IRC (Read error: Operation timed out)
13:02 πŸ”— BlueMaxim has quit IRC (Read error: Operation timed out)
13:06 πŸ”— bmcginty has quit IRC (Ping timeout: 268 seconds)
13:08 πŸ”— bmcginty has joined #archiveteam-bs
13:35 πŸ”— tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
13:51 πŸ”— tfgbd_znc has joined #archiveteam-bs
13:53 πŸ”— icedice has joined #archiveteam-bs
14:11 πŸ”— fio67 has joined #archiveteam-bs
14:11 πŸ”— timmc PurpleSym: There are allegedly some instances where the garbage is space-delimited instead of newline-delimited.
14:12 πŸ”— Nazca guys, is 6 warriors going to get me IP banned from anything besides yahoo and that other project?
14:12 πŸ”— PurpleSym Got an example URL, timmc?
14:14 πŸ”— timmc I don't, just reporting what I saw in chat.
14:15 πŸ”— fio67 Hello, I know you guys aren't archive.org, but do you know if there's a channel for it? Tried ##archive on freenode but it's invite-only.
14:18 πŸ”— timmc PurpleSym: There was this *very* interesting report joepie91 generated for (I think) just one URL, and by gosh it does look like garbled chunked transfer-encoding: http://sprunge.us/RjWi
14:19 πŸ”— PurpleSym Would be interesting to look at the actual WARC.
14:19 πŸ”— timmc There are overlapping context chunks that have the right byte counts. (And look, there's a zero at the end.)
14:20 πŸ”— MrRadar fio67: There's #internetarchive here on EFNet but I'm not sure if it's "official"
14:20 πŸ”— fio67 MrRadar: thanks, I'll take a look
14:22 πŸ”— fie has quit IRC (Ping timeout: 600 seconds)
14:26 πŸ”— Kalroth has joined #archiveteam-bs
14:28 πŸ”— timmc PurpleSym: I think https://archive.org/download/archiveteam_portalgraphics_20160727140857/portalgraphics_20160727140857.megawarc.warc.gz and https://web.archive.org/web/20160724001629/http://www.portalgraphics.net/pg/illust/?image_id=10575
14:31 πŸ”— PurpleSym That URL is not listed in the CDX as far as I see.
14:33 πŸ”— fie has joined #archiveteam-bs
14:34 πŸ”— tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
14:35 πŸ”— joepie91 PurpleSym: timmc: transfer chunk encoding sizes are decimal though, not hexadecimal?
14:35 πŸ”— joepie91 that having been said it does seem to generally match up
14:35 πŸ”— joepie91 in terms of size
14:35 πŸ”— PurpleSym Nope, hex: https://tools.ietf.org/html/rfc2616#section-3.6.1
14:35 πŸ”— joepie91 huh.
14:37 πŸ”— timmc joepie91: Those hex chunks that are space-delimited... is that only in the version displayed on web.archive.org, or also in the WARC?
14:38 πŸ”— joepie91 no idea, haven't checked the source. I think it's easier for arkiver to check that
14:47 πŸ”— Kalroth has quit IRC (Quit: Bye!)
14:52 πŸ”— tfgbd_znc has joined #archiveteam-bs
14:55 πŸ”— JRWR has joined #archiveteam-bs
15:00 πŸ”— fio67 has quit IRC (Quit: Page closed)
15:07 πŸ”— icedice has quit IRC (Ping timeout: 506 seconds)
15:15 πŸ”— Kalroth has joined #archiveteam-bs
15:26 πŸ”— pikhq has quit IRC (Ping timeout: 245 seconds)
15:29 πŸ”— tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
15:31 πŸ”— pikhq has joined #archiveteam-bs
15:37 πŸ”— Aranje has joined #archiveteam-bs
15:45 πŸ”— tfgbd_znc has joined #archiveteam-bs
15:45 πŸ”— SHODAN_UI has joined #archiveteam-bs
16:05 πŸ”— Honno has joined #archiveteam-bs
16:05 πŸ”— tfgbd_znc has quit IRC (Ping timeout: 600 seconds)
16:11 πŸ”— w0rp has quit IRC (Ping timeout: 245 seconds)
16:11 πŸ”— tfgbd_znc has joined #archiveteam-bs
16:12 πŸ”— w0rp has joined #archiveteam-bs
16:49 πŸ”— dashcloud has joined #archiveteam-bs
17:37 πŸ”— ZexaronS- has joined #archiveteam-bs
17:40 πŸ”— ZexaronS has quit IRC (Read error: Operation timed out)
17:41 πŸ”— ZexaronS- has quit IRC (Client Quit)
18:00 πŸ”— JRWR has quit IRC (Quit: Page closed)
18:00 πŸ”— antomati_ has joined #archiveteam-bs
18:00 πŸ”— swebb sets mode: +o antomati_
18:02 πŸ”— Silvan has quit IRC (Read error: Operation timed out)
18:02 πŸ”— antomatic has quit IRC (Ping timeout: 250 seconds)
18:02 πŸ”— antomati_ is now known as antomatic
18:04 πŸ”— SilSte has joined #archiveteam-bs
18:30 πŸ”— SHODAN_UI has quit IRC (Remote host closed the connection)
18:31 πŸ”— j08nY has quit IRC (Read error: Operation timed out)
18:40 πŸ”— JRWR has joined #archiveteam-bs
18:41 πŸ”— JRWR All the DATA, Give me maor! So as a just in case it happens
18:41 πŸ”— JRWR when I get to 75% full on my rsync ingress server, what do
18:42 πŸ”— joepie91 add more HDDs
18:42 πŸ”— joepie91 :p
18:42 πŸ”— JRWR its a OVH Box
18:42 πŸ”— joepie91 JRWR: also, you probably should be in #DataHoarder on Freenode :P
18:43 πŸ”— JRWR I am already :3
18:43 πŸ”— joepie91 anyway, that wasn't totally serious advice
18:43 πŸ”— joepie91 oh, you are?
18:43 πŸ”— JRWR well once I get my server back
18:43 πŸ”— joepie91 not under this nick though?
18:43 πŸ”— joepie91 ah
18:43 πŸ”— JRWR ya my 190TB Plex Archive is very nice
18:43 πŸ”— JRWR I used my main server as a Weechat instance
18:43 πŸ”— JRWR since I blanked it for this project I dont have that setup atm
18:44 πŸ”— JRWR since I have 1Gbps up going to waste, I though about syncing to FOS
18:45 πŸ”— JRWR but I would want to talk to SketchCow before I did, make sure everything was secure
18:45 πŸ”— JRWR I could do bulk uploads to it, then purge the old jobs that I have confrimed on FOS
18:45 πŸ”— JRWR but I've got another 6TB to go, so we have tons of time
18:46 πŸ”— JRWR if its even needed at all
19:04 πŸ”— schbirid 190TB on ovh? what do you pay?
19:07 πŸ”— voidsta im guessing plex cloud
19:07 πŸ”— voidsta synced with a google drive
19:08 πŸ”— voidsta thatd be a hefty chunk of change :D
19:08 πŸ”— JRWR I got one of the "Unlimted" google drive accounts
19:08 πŸ”— JRWR so thats where it is
19:08 πŸ”— JRWR UnEncrypted so it can be deduped
19:09 πŸ”— schbirid ah nice
19:11 πŸ”— JRWR The server I am using now for rsync ingress was my general server (had plex, website, other crap)
19:11 πŸ”— JRWR I wiped it and raid0 it for this, and man I'm impressed
19:13 πŸ”— voidsta my main is an EG-16
19:13 πŸ”— voidsta they recently lowered the prices on them
19:13 πŸ”— voidsta but im locked in at my original price
19:13 πŸ”— voidsta :(
19:13 πŸ”— JRWR Thats what I'm using
19:13 πŸ”— JRWR im at 78$
19:14 πŸ”— voidsta same
19:14 πŸ”— voidsta 79
19:14 πŸ”— voidsta it's like 74 now
19:14 πŸ”— voidsta lol
19:14 πŸ”— JRWR I really want to find another one
19:14 πŸ”— JRWR that is cheaper with the same stats
19:16 πŸ”— voidsta dunno if youll be able to
19:16 πŸ”— voidsta unmetered BW is the bomb
19:18 πŸ”— Kalroth everybody wants unmetered, but noone wants to pay for it :P
19:18 πŸ”— voidsta ^^
19:19 πŸ”— schbirid i like cheap hetzner auction servers
19:19 πŸ”— arkiver JRWR: I'll handle the uploads to IA or FOS
19:19 πŸ”— schbirid 6tb for ~30€ but limited bw
19:20 πŸ”— Kalroth "unlimited" bw on that one
19:20 πŸ”— joepie91 nobody will beat OVH for unmetered
19:20 πŸ”— voidsta ^
19:20 πŸ”— joepie91 that's pretty much certain
19:20 πŸ”— joepie91 anybody who does will almost certainly go bankrupt in under a year
19:20 πŸ”— voidsta that's why they have my service
19:20 πŸ”— voidsta hehe
19:20 πŸ”— joepie91 (and they're usually just OVH resellers anyway)
19:20 πŸ”— Kalroth unlimited as in over 20TB = your 1gbit goes 10mbit
19:21 πŸ”— Kalroth (20TB outgoing)
19:21 πŸ”— Kalroth so it's really not that bad for a hetzner server
19:21 πŸ”— joepie91 like, probably the only reason OVH can get away with unmetered with a relatively good network is that they have a massive network of their own
19:21 πŸ”— joepie91 none of the other big names in budget servers have that
19:21 πŸ”— Kalroth tier 1 like, yeah
19:21 πŸ”— joepie91 and none of the new players will have it either, at least not for a while
19:22 πŸ”— JRWR arkiver: still got the logins I gave you
19:22 πŸ”— voidsta joepie91: theyre expanding rapidly too
19:22 πŸ”— joepie91 hence why it's pretty much certain that nobody will beat OVH without giving up on quality
19:22 πŸ”— joepie91 for unmetered stuff :P
19:22 πŸ”— joepie91 voidsta: yeah
19:22 πŸ”— voidsta been following the ceo on twitter
19:22 πŸ”— voidsta new dc here
19:22 πŸ”— voidsta new dc there
19:22 πŸ”— voidsta everyone gets a dc
19:22 πŸ”— voidsta lol
19:22 πŸ”— joepie91 hehe, was about to make that joke
19:22 πŸ”— voidsta :D
19:23 πŸ”— arkiver JRWR: yes
19:29 πŸ”— JRWR Cool
19:38 πŸ”— JRWR also the scaleway arm64 instances are NICE
19:39 πŸ”— JRWR they are faster then the x86/arm ones
19:42 πŸ”— JRWR OVH is really the only name in dedicated servers without the fuss
19:42 πŸ”— JRWR Atleast state side
19:42 πŸ”— JRWR most of the other places are just meh
19:49 πŸ”— Igloo has quit IRC (Read error: Operation timed out)
19:50 πŸ”— voidsta agree
19:57 πŸ”— bmcginty has quit IRC (Ping timeout: 250 seconds)
20:00 πŸ”— Igloo has joined #archiveteam-bs
20:04 πŸ”— bmcginty has joined #archiveteam-bs
20:04 πŸ”— ZexaronS has joined #archiveteam-bs
20:21 πŸ”— pizzaiolo has joined #archiveteam-bs
20:30 πŸ”— JAA Is this page broken for anyone else? https://archive.org/search.php?query=collection%3Aarchivebot&sort=-publicdate
20:30 πŸ”— JAA The HTML just stops somewhere in the navigation links.
20:30 πŸ”— JRWR works for me
20:30 πŸ”— JAA Hm, weird.
20:30 πŸ”— JRWR it has magic scrolling going on
20:30 πŸ”— JRWR when you get to the bottom
20:31 πŸ”— JAA I know that, but I only get a partial navigation bar, no content at all.
20:31 πŸ”— JAA <li><a target="_top" href="//blog.archive.org">BLOG</a></li>
20:31 πŸ”— JAA <li><a target=
20:31 πŸ”— JAA ^ End of the HTML
20:33 πŸ”— JRWR Ya
20:33 πŸ”— JRWR Im getting full HTML
20:33 πŸ”— JRWR https://hastebin.com/ipupobevun.xml
20:34 πŸ”— JAA Hm yeah, it works from one of my servers, but from this machine, I get it on both two different Firefox profiles and cURL. What the hell?
20:38 πŸ”— j08nY has joined #archiveteam-bs
20:40 πŸ”— kristian_ has joined #archiveteam-bs
20:42 πŸ”— JAA Looks like the connection is killed after receiving about 32k of data.
20:44 πŸ”— Sanqui JAA: where are you located, which network provider do you use
20:44 πŸ”— JAA https://archive.org/search.php works, but any actual search has the same issue.
20:44 πŸ”— JAA Sanqui: Switzerland, currently on a Swisscom connection (not my own)
20:45 πŸ”— Sanqui ok. no problems from czech republic
20:46 πŸ”— JAA It works from another machine within Switzerland.
20:46 πŸ”— JAA (Not on the same network)
20:48 πŸ”— pizzaiolo has quit IRC (Quit: pizzaiolo)
20:55 πŸ”— JRWR Any of you going to defcon this year?
20:56 πŸ”— SHODAN_UI has joined #archiveteam-bs
21:00 πŸ”— Odd0002 huh, my yahoo answers thing has been running for 22 hours now
21:04 πŸ”— icedice has joined #archiveteam-bs
21:04 πŸ”— JAA The IA page randomly started working again. ¯\_(ツ)_/¯
21:07 πŸ”— Kalroth its magic!
21:10 πŸ”— pizzaiolo has joined #archiveteam-bs
21:18 πŸ”— pizzaiolo has quit IRC (Read error: Connection reset by peer)
21:18 πŸ”— pizzaiolo has joined #archiveteam-bs
21:27 πŸ”— godane has joined #archiveteam-bs
21:37 πŸ”— ZexaronS has quit IRC (Leaving)
21:41 πŸ”— kristian_ has quit IRC (Quit: Leaving)
21:59 πŸ”— JAA Is there any way to get from a Wayback Machine page to the corresponding item (or even better, WARC file)?
22:00 πŸ”— SHODAN_UI has quit IRC (Remote host closed the connection)
22:10 πŸ”— schbirid you could not download them anyways
22:10 πŸ”— Ravenloft has joined #archiveteam-bs
22:11 πŸ”— JAA This particular one is from the ArchiveBot collection, so I most likely could.
22:21 πŸ”— Sanqui JAA: try the viewer then https://archive.fart.website/archivebot/viewer/
22:22 πŸ”— JAA Sanqui: Yeah, I tried that, but didn't find any match.
22:25 πŸ”— JAA I'm trying to look into the corruption issue that was discussed yesterday. Much of the discussion focused on wget-lua, but as DoomTay mentioned earlier in #archivebot, at least one ArchiveBot grab was also affected: https://web.archive.org/web/20160615222159/http://www.portalgraphics.net/pg/illust/?image_id=10575
22:38 πŸ”— JAA How certain are we that this "corruption" is really a client-side issue? It definitely looks like chunked transfer encoding gone wrong. But that could also be a missing "Transfer-Encoding: chunked" header from the portalgraphics server. Have we found any examples on other domains yet?
23:20 πŸ”— godane i can't upload to IA
23:20 πŸ”— godane 2.4%Warning: Transient problem: HTTP error Will retry in 5 seconds. 10 retries
23:20 πŸ”— godane i keep getting that
23:24 πŸ”— JAA Here's a list mapping the URLs SketchCow posted yesterday to the corresponding IA item: https://hastebin.com/raw/iruvobodor . I also included some additional examples I found.
23:28 πŸ”— jtn2 has left
23:45 πŸ”— JAA Hmm. I'm looking at archiveteam_portalgraphics_20160727144032 now. Their server does send the Transfer-Encoding: chunked header according to the WARC. And there is no double chunked encoding or something like that.
23:45 πŸ”— JAA This makes me think that maybe the IA processes these incorrectly.
23:50 πŸ”— JAA Here's the WARC records for https://web.archive.org/web/20160725184715/http://www.portalgraphics.net/pg/illust/?image_id=21107&lang=en : https://hastebin.com/raw/gokiwuzage
23:50 πŸ”— JAA (Hastebin seems to screw up the Unicode characters there, disregard that.)
23:59 πŸ”— JAA I suspect that it's due to the space character after "5c". This space doesn't conform to the specs ( https://tools.ietf.org/html/rfc7230#section-4.1 ), which define a chunk as `chunk-size [ chunk-ext ] CRLF chunk-data CRLF`. chunk-size is the size of the chunk in hexadecimal digits (upper or lower case), chunk-ext is an optional extension of the algorithm and is `*( ";" chunk-ext-name [ "=" chunk-ext-val ] )`, i.e. starts always with a semicolon.
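
[Editor's note] To illustrate the suspicion above, here is a minimal sketch of a chunked-body decoder that follows the RFC 7230 section 4.1 grammar quoted in the message. It is not the Wayback Machine's code or any library's implementation. The function name `decode_chunked`, the `strict` flag, and the "lenient" behaviour of stripping trailing whitespace are assumptions made for the example, and trailers after the last chunk are ignored.

```python
import re

# chunk-size [ chunk-ext ]: hex digits, then optionally extensions that
# must begin with ";". A bare space after the size does not match.
SIZE_LINE = re.compile(rb"([0-9A-Fa-f]+)(?:;[^\r\n]*)?")


def decode_chunked(body: bytes, strict: bool = True) -> bytes:
    """Decode a chunked body; raise ValueError on malformed input."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("missing CRLF after chunk size")
        line = body[pos:eol]
        if not strict:
            # Assumed lenient behaviour: tolerate trailing spaces/tabs.
            line = line.rstrip(b" \t")
        match = SIZE_LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"bad chunk-size line: {line!r}")
        size = int(match.group(1), 16)  # sizes are hexadecimal
        pos = eol + 2
        if size == 0:
            return bytes(out)  # last-chunk reached
        chunk = body[pos:pos + size]
        if len(chunk) != size or body[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data truncated or not followed by CRLF")
        out += chunk
        pos += size + 2


if __name__ == "__main__":
    payload = b"x" * 0x5C
    conforming = b"5c\r\n" + payload + b"\r\n0\r\n\r\n"
    with_space = b"5c \r\n" + payload + b"\r\n0\r\n\r\n"
    print(len(decode_chunked(conforming)))
    print(len(decode_chunked(with_space, strict=False)))
```

A strict parser rejects the `5c ` size line outright, while a lenient one decodes it. Software that handles the two cases differently could plausibly leave the raw chunk framing in the replayed page, which would match the "garbage" described earlier in the log.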
