#archiveteam-bs 2016-04-19,Tue

Time Nickname Message
00:03 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
00:04 🔗 atrocity fantastic, this worked: -o %(id)s/%(title)s.%(ext)s
00:04 🔗 atrocity now to archive all youtubes
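The -o value quoted above looks like a youtube-dl output template, which uses Python's %-style formatting with named fields. A minimal sketch of how such a template expands, assuming a made-up metadata dictionary (the id, title and ext values are hypothetical, and expand_template is an illustrative helper, not part of youtube-dl):

```python
def expand_template(template, info):
    """Fill a %-style template with named fields taken from a dict."""
    return template % info


# Hypothetical metadata for one video; a real downloader supplies these values.
info = {"id": "abc123", "title": "Example video", "ext": "mp4"}

# One directory per video id, with the file named after the title.
print(expand_template("%(id)s/%(title)s.%(ext)s", info))
```

Putting the id in the directory name keeps videos with identical titles from overwriting each other.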
00:08 🔗 dashcloud has quit IRC (Read error: Operation timed out)
00:11 🔗 dashcloud has joined #archiveteam-bs
00:11 🔗 bwn__ is now known as bwn
00:13 🔗 BlueMaxim has joined #archiveteam-bs
00:19 🔗 bwn JW_work: JesseW: i don't use it, but i had come across that warcreate extension, it looked like he was working on adding a 'record' type thing similar to what you were talking about
00:19 🔗 bwn but it just did a snapshot of the current page when I had played with it
00:21 🔗 bwn https://github.com/machawk1/warcreate
00:21 🔗 JesseW has joined #archiveteam-bs
00:25 🔗 Balrog_ has joined #archiveteam-bs
00:30 🔗 Balrog_ has quit IRC (<TerminusEst13> hung she dong)
00:30 🔗 atrocity be awesome if there was a firefox version
00:31 🔗 Start has joined #archiveteam-bs
00:41 🔗 ivan` Delimiter is still AWOL. would not recommend
00:41 🔗 ivan` and I thought OVH had bad customer service
00:43 🔗 BlueMaxim has quit IRC (Quit: Leaving)
00:44 🔗 JesseW bwn: https://github.com/machawk1/warcreate/issues/66
00:44 🔗 JesseW thanks for pointing me at warcreate
01:03 🔗 Honno has quit IRC (Quit: Leaving)
01:06 🔗 * Yoshimura thanks VADemon. Could use that ;) Wish you as well.
01:11 🔗 ivan` joepie91: if you know the lowendtalk guy maybe you can vouch for me, new account ivank
01:12 🔗 wp494 has quit IRC (Read error: Connection reset by peer)
01:17 🔗 wp494 has joined #archiveteam-bs
01:33 🔗 tomwsmf-a has joined #archiveteam-bs
02:01 🔗 wp494 has quit IRC (Read error: Operation timed out)
02:01 🔗 wp494 has joined #archiveteam-bs
02:06 🔗 VADemon has quit IRC (Quit: left4dead)
02:26 🔗 atrocity has quit IRC (Ping timeout: 260 seconds)
02:28 🔗 atrocity has joined #archiveteam-bs
02:29 🔗 atrocity FUCK
02:29 🔗 atrocity power went out here, so lost my openwith shit
02:51 🔗 bwn has quit IRC (Read error: Operation timed out)
02:57 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
02:59 🔗 xmc D:
03:12 🔗 Yoshimura The doom with url shorteners is terrible. Not sure what is worse, the shorteners or the new ones with custom urls and long crap.
03:13 🔗 Yoshimura Also ads, and more. There are a lot of uncrawled ones also. *looks at JesseW with smile*
03:14 🔗 * xmc smiles creepily
03:14 🔗 xmc er, i'm not that creepy
03:14 🔗 Yoshimura Is there metadata about megawarcs anywhere on archive? Would like to systematically go through some files, indexes or pages.
03:15 🔗 xmc megawarcs are just big warcs
03:15 🔗 xmc what are you looking for?
03:15 🔗 Yoshimura I know. I meant I do not have to click on each page on AI.
03:15 🔗 tomwsmf-a has joined #archiveteam-bs
03:15 🔗 Yoshimura I meant IA... looking for HTML pages to extract data from.
03:16 🔗 Yoshimura I got both AT related, and two/three different projects related. So it would be handy.
03:16 🔗 Yoshimura First step would be metadata, second index, last sections of warcs by range requests to get only the HTML.
03:17 🔗 Yoshimura And only the more fresh, and depending on content. Some vast sites are kind of useless (except the very index or comments)
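The last step described above, pulling only selected records out of a large WARC with range requests, could look roughly like the sketch below. It assumes the file is a .warc.gz in which every record is its own gzip member, and that an index has already supplied the record's byte offset and compressed length. The URL, offset and length are placeholders to be filled in by the caller, and the function names are illustrative.

```python
import gzip
import urllib.request


def build_range_header(offset, length):
    """Return a Range header value covering `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return "bytes=%d-%d" % (offset, offset + length - 1)


def fetch_record(url, offset, length):
    """Download one gzip member from a remote .warc.gz and decompress it.

    Assumption: the server honours Range requests. If it ignores the header,
    the whole file is downloaded and decompressed instead.
    """
    request = urllib.request.Request(
        url, headers={"Range": build_range_header(offset, length)}
    )
    with urllib.request.urlopen(request) as response:
        compressed = response.read()
    return gzip.decompress(compressed)
```

The returned bytes hold the WARC headers followed by the captured HTTP response, so the HTML still has to be split out of the record afterwards.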
03:27 🔗 RichardG has quit IRC (Ping timeout: 260 seconds)
03:32 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
03:48 🔗 ErkDog has quit IRC (Read error: Operation timed out)
04:01 🔗 godane has quit IRC (Quit: Leaving.)
04:05 🔗 * Yoshimura found one problem when (ordinary) people get stuff for free... they then expect everything to be super f.... nice and free at least, if not give them gifts.
04:06 🔗 ErkDog has joined #archiveteam-bs
04:21 🔗 Crocatowa has joined #archiveteam-bs
04:28 🔗 ErkDog has quit IRC (Read error: Operation timed out)
04:33 🔗 ErkDog has joined #archiveteam-bs
04:40 🔗 Frogging dem ordinary people
04:41 🔗 bwn has joined #archiveteam-bs
04:50 🔗 BlueMaxim has joined #archiveteam-bs
04:55 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
05:02 🔗 Sk1d has joined #archiveteam-bs
06:22 🔗 hawc145 has joined #archiveteam-bs
06:27 🔗 HCross has quit IRC (Read error: Operation timed out)
06:36 🔗 schbirid has joined #archiveteam-bs
06:49 🔗 JesseW has quit IRC (Quit: Leaving.)
06:49 🔗 JesseW has joined #archiveteam-bs
07:02 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
07:06 🔗 joepie91 damn
07:07 🔗 joepie91 new EU privacy laws give European privacy watchdog the authority to impose fines of up to 20 million euro or 4% of the *global* revenue for significant violations, and 10 million / 2% for more 'formal' violations
07:25 🔗 metalcamp has joined #archiveteam-bs
07:33 🔗 Medowar has joined #archiveteam-bs
07:34 🔗 VADemon has joined #archiveteam-bs
07:59 🔗 mismatch_ has quit IRC (Remote host closed the connection)
08:01 🔗 mismatch_ has joined #archiveteam-bs
08:29 🔗 godane has joined #archiveteam-bs
08:36 🔗 hawc145 is now known as HCross
09:24 🔗 metalcamp has quit IRC (Ping timeout: 244 seconds)
09:56 🔗 bwn has quit IRC (Read error: Operation timed out)
10:05 🔗 bwn has joined #archiveteam-bs
10:55 🔗 metalcamp has joined #archiveteam-bs
10:56 🔗 Atluxity Would this truck blend in anywhere? https://twitter.com/textfiles/status/722094405931397121/photo/1
11:36 🔗 RichardG has joined #archiveteam-bs
11:38 🔗 Medowar has quit IRC (Quit: Connection closed for inactivity)
11:50 🔗 atrocity who would steal from a library...
11:52 🔗 Atluxity maybe they just borrowed it
11:56 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
12:00 🔗 Lord_Nigh has joined #archiveteam-bs
12:04 🔗 Lord_Nigh has quit IRC (Read error: Operation timed out)
12:18 🔗 Medowar has joined #archiveteam-bs
12:24 🔗 BlueMaxim has quit IRC (Quit: Leaving)
12:41 🔗 Lord_Nigh has joined #archiveteam-bs
12:45 🔗 vitzli has joined #archiveteam-bs
13:30 🔗 tomwsmf-a has joined #archiveteam-bs
13:31 🔗 RichardG has quit IRC (Ping timeout: 272 seconds)
13:34 🔗 SketchCow That truck
13:42 🔗 hook54321 has joined #archiveteam-bs
13:47 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
14:11 🔗 RichardG has joined #archiveteam-bs
14:16 🔗 Start has quit IRC (Quit: Disconnected.)
15:03 🔗 RichardG has quit IRC (Read error: Operation timed out)
15:03 🔗 RichardG has joined #archiveteam-bs
15:09 🔗 tomwsmf-a has joined #archiveteam-bs
15:20 🔗 RichardG has quit IRC (Ping timeout: 250 seconds)
15:21 🔗 RichardG has joined #archiveteam-bs
15:29 🔗 JesseW has joined #archiveteam-bs
15:30 🔗 godane has quit IRC (Read error: Operation timed out)
15:31 🔗 Start has joined #archiveteam-bs
15:44 🔗 RichardG has quit IRC (Ping timeout: 244 seconds)
15:50 🔗 RichardG has joined #archiveteam-bs
15:50 🔗 hook54321 has quit IRC (Quit: Connection closed for inactivity)
15:51 🔗 HCross2 Sorry again, but newsbuddy could do with more power and grabbers please
15:55 🔗 Start has quit IRC (Ping timeout: 260 seconds)
15:56 🔗 Yoshimura HCross2: Define power and grabber?
15:57 🔗 Yoshimura Grabber = pipe, power = cpu?
15:57 🔗 HCross2 Yeah, bandwidth and CPU really
16:00 🔗 Start has joined #archiveteam-bs
16:01 🔗 Yoshimura I could serve BW, or CPU, not both at same time currently.
16:01 🔗 Yoshimura Well, both, but not mutually, aka diff location.
16:02 🔗 Yoshimura If you do not need a dedicated box, I can give you a container; I use 1/100 - 5/100 of available bw atm.
16:05 🔗 Start has quit IRC (Remote host closed the connection)
16:05 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
16:05 🔗 Medowar >Container
16:05 🔗 Medowar We really should get a docker image. If I have time, I can create one.
16:05 🔗 Medowar but right now, time is an issue.
16:07 🔗 Yoshimura Medowar: Yeah, I can provide HCross2 a Docker with ubuntu stuff (phusion/baseimage).
16:08 🔗 Medowar yeah i have done the same.
16:08 🔗 Yoshimura Btw, does anyone know about a high-bandwidth pipeline?
16:08 🔗 Medowar But with a debian base please
16:08 🔗 Yoshimura Maybe not needed at all, the hltv is going off in a few days.
16:08 🔗 Yoshimura And wayback lacks tons of it
16:08 🔗 Medowar you mean high bandwidth servers?
16:09 🔗 Yoshimura Nah archivebot. no one seems to care + the site seems to be loaded so warrior would make no sense
16:09 🔗 Yoshimura Someone might try to reach them or something, but saying they will shut down in a few days sucks.
16:10 🔗 Yoshimura After people realized that I guess more people crawl them personally or something.
16:10 🔗 Yoshimura http://www.hltv.org/?pageid=86&galleryid=7880
16:10 🔗 Yoshimura Example page. Load speed, and I think this one is not in wayback either.
16:11 🔗 Medowar hltv is going offline?
16:11 🔗 Yoshimura I announced that twice at least on main channel
16:11 🔗 Yoshimura No one cared or noticed. Yes, on 23rd April.
16:12 🔗 Medowar wow. Is there an official announcement anywhere?
16:12 🔗 Yoshimura Yes on twitter it was I think
16:12 🔗 Yoshimura https://twitter.com/hltvorg_/status/722083587357544448
16:13 🔗 Yoshimura But it may or may not be a hoax, I do not know.
16:13 🔗 Medowar fake. Wrong twitter account.
16:13 🔗 Yoshimura The twitter handle sounds sketchy. But someone on the wiki already said it's valuable. So even if it is a confirmed hoax, we should still crawl it after the 23rd.
16:14 🔗 Medowar it has literally nothing on it other than the announcement.
16:14 🔗 Medowar https://twitter.com/HLTVORG
16:14 🔗 Medowar this is the original account
16:14 🔗 Medowar also, the creator announced 9 months ago that he is going fulltime hltv, so I don't think that it is shutting down
16:14 🔗 Medowar http://www.hltv.org/?pageid=135&userid=1&blogid=10102
16:16 🔗 Medowar and it is the most important CSGO news site. Has dedicated staff to do interviews on events and stuff
16:17 🔗 Medowar afk 30 min, driving home
16:22 🔗 Yoshimura Alright, then I guess best strategy would be to wait and fetch the site once a year.
16:22 🔗 Yoshimura bot would have space problems maybe, due to galleries, I do not know.
16:23 🔗 SimpBrain has joined #archiveteam-bs
16:47 🔗 bwn_ has joined #archiveteam-bs
16:59 🔗 bwn has quit IRC (Read error: Operation timed out)
17:05 🔗 atrocity newsgrabber on warrior? if so, i can give you like 40/40
17:06 🔗 HCross NOPE
17:06 🔗 HCross It isnty
17:06 🔗 HCross isnt
17:07 🔗 atrocity :/
17:08 🔗 atrocity yuku it is, lol
17:11 🔗 vitzli has quit IRC (Quit: Leaving)
17:14 🔗 ivan` HCross: had a chat with "Michael" who tells me "service will be live by Friday" because that's when they set up their drives in a batch
17:14 🔗 ivan` and still no response to tickets or emails
17:14 🔗 ivan` pretty sure I'm going to be out $130 on my drive
17:15 🔗 HCross Ouch :/
17:18 🔗 HCross Are the drives over at their DC now?
17:18 🔗 ivan` I have no idea. they received the drive last Wednesday
17:19 🔗 ivan` maybe it's already been sold for their hookers-and-blow fund
17:19 🔗 HCross Then surely it would be in last Friday's batch if they do it weekly
17:19 🔗 HCross nah, more like their "Downtime poptart fund"
17:30 🔗 jspiros has quit IRC (Read error: Operation timed out)
17:34 🔗 jspiros has joined #archiveteam-bs
17:45 🔗 Start has joined #archiveteam-bs
17:46 🔗 Honno has joined #archiveteam-bs
17:53 🔗 JW_work1 has joined #archiveteam-bs
17:59 🔗 JW_work has quit IRC (Ping timeout: 370 seconds)
18:08 🔗 ivan` I opened a ticket to get them to cancel and return my drive
18:08 🔗 ivan` fuckers will probably try to bill me $25 for packing the drive
18:10 🔗 HCross pay it, then speak to your CC company
18:10 🔗 ivan` yeah
18:10 🔗 HCross but wait until the drive is in your hand before
18:13 🔗 ivan` I'm out $39 just for shipping back and forth
18:13 🔗 ivan` last month I was out $28 for shipping smoke-filled PS3s back and forth
18:13 🔗 ivan` it's good to be a shipping co
18:14 🔗 HCross yeah, that does sound a tad expensive though. I sent a 2.5inch disk from London to LA for £14 the other month
18:15 🔗 Start has quit IRC (Quit: Disconnected.)
18:16 🔗 HCross Took less than 48 hours to reach LA, but then another 2 weeks to get through customs
18:16 🔗 ivan` heh
18:17 🔗 HCross Yep
18:18 🔗 Start has joined #archiveteam-bs
18:19 🔗 HCross something to do with sending HDDs from the EU being risky or something
18:53 🔗 Yoshimura Yeah, if it goes air or ship ... air means radiation from cosmos.
18:54 🔗 Yoshimura Ship might be ok but slow, but temperatures.
18:54 🔗 Yoshimura Transport over the wire with a special-purpose application and protocol over UDP (scientists have those, and they are free or OSS) works best.
19:00 🔗 Kazzy no, probably more like the contents of the drive
19:00 🔗 Kazzy pretty sure they're fine with the whole air travel bit in general
19:02 🔗 HCross it was empty too
19:03 🔗 Yoshimura Kazzy: Cosmic radiation = damaging the bits on the magnetic surface?
19:04 🔗 Kazzy Yoshimura: wrap it in tin foil, that blocks all the rads
19:05 🔗 Yoshimura Nope.
19:05 🔗 xmc lead foil
19:05 🔗 Kazzy i don't lose my bits when i go on a plane, why does a metal thing
19:06 🔗 Yoshimura Density.
19:06 🔗 * bwn_ makes a foil hat
19:06 🔗 Yoshimura Also, cosmic radiation is fast as hell, so shielding does not do much.
19:06 🔗 Kazzy wait that's rude
19:09 🔗 arkiver Electrical field around the drive :D
19:15 🔗 Kazzy Yoshimura: planes are fast as hell
19:18 🔗 HCross ^ nearly 11 hours from London to LA
19:23 🔗 bwn_ has quit IRC (Read error: Operation timed out)
19:27 🔗 schbirid has quit IRC (Quit: Leaving)
19:32 🔗 Frogging Yoshimura: I don't see what the speed of the particles has to do with shielding
19:32 🔗 Frogging things can be and are shielded from cosmic rays, otherwise the satellites orbiting Earth would have issues
19:33 🔗 Yoshimura Frogging: It goes through, only mountains help. Yeah, can but costly.
19:33 🔗 Frogging pretty sure they don't have mountains in orbit
19:33 🔗 Yoshimura Nope, but they have resistant storage media made for that
19:33 🔗 Frogging they have shielding
19:34 🔗 Frogging the microchips aren't special, they're just shielded
19:36 🔗 Yoshimura Shield your disk and send it instead of upload then
19:37 🔗 godane has joined #archiveteam-bs
19:44 🔗 Frogging well, yes. that's what was being suggested. Your objection was that "the radiation is too fast so shielding doesn't work", remember?
19:45 🔗 Start has quit IRC (Quit: Disconnected.)
19:46 🔗 bwn_ has joined #archiveteam-bs
19:47 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
20:13 🔗 Start has joined #archiveteam-bs
20:24 🔗 metalcamp has quit IRC (Ping timeout: 244 seconds)
20:30 🔗 powerKite has joined #archiveteam-bs
20:33 🔗 * zino is trying to remember what free forum hosting service his lost forum was on.
20:35 🔗 xmc invisionfree?
20:36 🔗 powerKite I think the worst thing about archiving an ARG
20:36 🔗 powerKite is that you end up having to ***DO THE PUZZLES AGAIN*** to find out what you need to archive
20:37 🔗 zino Heh.
20:38 🔗 zino xmc: I think the domain for the forum contained "easyforum.com" or something.
20:40 🔗 joepie91 forumotion?
20:40 🔗 zino Hmm. Nope.
20:41 🔗 powerKite anyway, is there a Megaswf archive I just don't know about or something?
20:41 🔗 powerKite or am I just fucked in regards to getting those SWFs
20:46 🔗 powerKite judging by the lack of responses, it's probably the latter
20:47 🔗 zino Quite possibly
20:48 🔗 Medowar has quit IRC (Quit: Connection closed for inactivity)
20:49 🔗 Atluxity :P
21:00 🔗 powerKite has quit IRC (Quit: Page closed)
21:10 🔗 Start has quit IRC (Quit: Disconnected.)
22:17 🔗 JW_work1 https://twitter.com/textfiles/status/722530539006214146
22:19 🔗 ErkDog has quit IRC (Read error: Operation timed out)
22:20 🔗 zino JW_work1: Great! I choose to believe that it was my retweet that made the difference...
22:21 🔗 JW_work1 I'm just curious what damage, if any, there will be to it.
22:21 🔗 BlueMaxim has joined #archiveteam-bs
22:21 🔗 JW_work1 Hopefully if there's damage to the paint job, they can get the artist to fix it
22:23 🔗 Yoshimura Is there anywhere a picture of the van?
22:24 🔗 Kazzy https://twitter.com/textfiles/status/722094405931397121
22:24 🔗 Yoshimura Thanks ;)
22:26 🔗 ErkDog has joined #archiveteam-bs
22:32 🔗 Yoshimura Ok, pipeline, would like to run one.
22:34 🔗 Kazzy vbox + https://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova
22:34 🔗 Yoshimura If anyone could help or provide more info, I would be glad. I did not care till now, when apparently pipes are loaded, stalling etc. If they all worked there would be enough BW.
22:34 🔗 Yoshimura Kazzy: Archivebot :P Already running all projects on warrior simultaneously at concurrency 6 (which IRL is lower thanks to lack of work)
22:35 🔗 Kazzy archivebot is a ton more involved
22:35 🔗 Kazzy basically don't bother even trying unless you can provide 50/50 (ideally 100/100 line) for 2-3 months minimum at 100% uptime, guaranteed with no filtering
22:36 🔗 Kazzy if you pass all that, proceed to https://github.com/ArchiveTeam/ArchiveBot/blob/master/INSTALL.pipeline
22:37 🔗 Yoshimura Kazzy: 100/100
22:37 🔗 Yoshimura Atm, at least once.
22:38 🔗 Yoshimura Not even SLAs have 100%, but 99.9
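For scale, the gap between 100% and a 99.9% SLA over the 2-3 month window mentioned above can be worked out directly. A small sketch of the arithmetic, assuming a 90-day period (the helper name and the period length are illustrative, not taken from any ArchiveBot document):

```python
def allowed_downtime_hours(availability, days):
    """Hours of downtime permitted by `availability` (0..1) over `days` days."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * days * 24


# 99.9% over an assumed 90-day period allows roughly two hours of downtime.
print(round(allowed_downtime_hours(0.999, 90), 2))
```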
22:39 🔗 Yoshimura And filtering is needed, and used almost everywhere, people just pretend to think it's not (IPS, IDS)
22:39 🔗 godane SketchCow: we are up to 2008-07-05 with funny or die archive videos
22:40 🔗 Yoshimura But the providers do it, to lower DDoS, while retaining the real bandwidth, plus residual DoS.
22:49 🔗 Yoshimura If you want your pipeline to only handle !ao/!archiveonly jobs, run it with the AO_ONLY environment variable set.
22:50 🔗 Yoshimura Sounds like a job for me, starting small. Sounds great.
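The instruction quoted above only says the AO_ONLY variable has to be set. A minimal sketch of how a process might test for that, assuming that any value (even an empty one) counts as set; this is an illustration and not ArchiveBot's actual code, and ao_only_enabled is a hypothetical helper:

```python
import os


def ao_only_enabled(environ=None):
    """Return True when AO_ONLY is present in the given environment mapping.

    Assumption: presence alone matters, so AO_ONLY="" still counts as set.
    """
    env = os.environ if environ is None else environ
    return "AO_ONLY" in env


# Passing a mapping explicitly keeps the check easy to try out.
print(ao_only_enabled({"AO_ONLY": "1"}))
print(ao_only_enabled({}))
```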
22:56 🔗 Honno has quit IRC (Quit: Leaving)
23:23 🔗 Rickster has quit IRC (Ping timeout: 260 seconds)
23:34 🔗 Rickster has joined #archiveteam-bs
23:38 🔗 VADemon has quit IRC (Quit: left4dead)
23:50 🔗 JesseW has joined #archiveteam-bs
