#archiveteam 2015-10-25,Sun


Time Nickname Message
00:09 🔗 marvinw grabbing it
00:11 🔗 marvinw currently archiving 2100 channels, it's taking a while
00:13 🔗 aaaaaaaaa has joined #archiveteam
00:16 🔗 HCross yeah, I need to sort the disk space on my virtualisation server then I can ace a few channels
00:21 🔗 Microguru do we have a formal log of who has what? if not, we should start one. just a list of names and videos with notes like "entire playlist" and "entire channel up to October 2015" will do at first. I'm saying this to make it easy to fold our efforts into a big push on the off chance that youtube does something inadvisable like deleting all the nonprofitable channels
00:21 🔗 * joepie91 is grabbing nopefully
00:22 🔗 marvinw Microguru: the only easy thing for me to share is a list of channels+users+playlists I'm scheduled to grab, and a list of all video IDs that I have
00:23 🔗 Microguru marvinw, that's likely enough info for now.
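A minimal sketch of the kind of "who has what" log Microguru describes, assuming a plain list of (name, target, note) entries. The nicknames and channel names here are hypothetical placeholders, not real claims from the channel.

```python
from collections import defaultdict


def merge_claims(claims):
    """Group (nick, target, note) claims by target so overlaps are visible."""
    by_target = defaultdict(list)
    for nick, target, note in claims:
        by_target[target].append((nick, note))
    return dict(by_target)


def overlapping(by_target):
    """Return the targets that more than one distinct person has claimed."""
    return sorted(
        target
        for target, holders in by_target.items()
        if len({nick for nick, _ in holders}) > 1
    )


# Hypothetical entries, using the note wording suggested in the channel.
claims = [
    ("alice", "channel:ExampleChannelA", "entire channel up to October 2015"),
    ("bob", "channel:ExampleChannelB", "entire channel up to October 2015"),
    ("carol", "channel:ExampleChannelA", "entire playlist"),
]
log = merge_claims(claims)
print(overlapping(log))
```

Grouping by target rather than by person makes duplicated effort easy to spot if the separate grabs are later folded into one push.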
00:31 🔗 HCrossSRV has joined #archiveteam
00:34 🔗 nertzy has quit IRC (Quit: This computer has gone to sleep)
00:41 🔗 marvinw in related news here is the god-awful software I am using to archive users/channels/playlists https://www.refheap.com/0a0da5376faa8984131d83f83/raw
00:45 🔗 Microguru definitely looks like quite a hack. how do the proxies work? are they different servers in different countries that you are ssh port forwarding to?
00:48 🔗 marvinw Microguru: spiped to a polipo proxy
00:48 🔗 marvinw yes, different countries
00:49 🔗 Microguru when it comes to setting up download servers, do you think that being geographically diverse is a good idea? many hosting providers let you set up servers in several countries. DigitalOcean, for example, lets me have a server in the netherlands, USA, singapore, UK, germany and canada
00:49 🔗 marvinw digitalocean is not very good for archiving because you don't get unmetered bandwidth
00:50 🔗 philpem has quit IRC (Ping timeout: 252 seconds)
00:50 🔗 Microguru true. I think that my VPN server gets 1 TB a month, and it's the $5 package
00:50 🔗 marvinw being geographically diverse is good but beware of crawling the web from countries that censor websites
00:51 🔗 Microguru are datacenters generally subject to national filters?
00:51 🔗 marvinw I don't know
00:52 🔗 marvinw my tests from online.net suggested that French censorship did not apply
00:52 🔗 marvinw best to test it anyway
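One way to "test it anyway" is to fetch the same URLs from a trusted vantage point and through the proxy under test, then compare the results. The sketch below only does the comparison; the fetching is left out, and the "status changed or body shrank a lot" heuristic is an assumption, not something marvinw describes.

```python
def suspicious_urls(baseline, candidate, min_ratio=0.5):
    """Compare two {url: (status, body_bytes)} maps.

    baseline  -- results from a vantage point believed to be unfiltered
    candidate -- results fetched through the proxy or country under test
    Returns a list of (url, reason) pairs worth checking by hand.
    """
    flagged = []
    for url, (base_status, base_body) in baseline.items():
        if url not in candidate:
            flagged.append((url, "no response"))
            continue
        status, body = candidate[url]
        if status != base_status:
            flagged.append((url, f"status {base_status} -> {status}"))
        elif len(body) < min_ratio * len(base_body):
            # A block page is often much shorter than the real page.
            flagged.append((url, "body much shorter"))
    return flagged


baseline = {
    "http://example.com/a": (200, b"x" * 1000),
    "http://example.com/b": (200, b"x" * 1000),
    "http://example.com/c": (200, b"x" * 1000),
}
candidate = {
    "http://example.com/a": (200, b"x" * 990),
    "http://example.com/b": (403, b"blocked"),
}
print(suspicious_urls(baseline, candidate))
```

Flagged URLs are only candidates; dynamic pages can differ between fetches for reasons that have nothing to do with filtering.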
00:52 🔗 Microguru in Singapore, for example, " Censorship of sexual, political and racially or religiously sensitive content is extensive."
00:52 🔗 marvinw let's move to -bs
00:52 🔗 Microguru they might be a bad place for archiving if they apply that to datacenters and not just to citizens
00:52 🔗 Microguru ok
00:52 🔗 Microguru to -bs it is
00:54 🔗 joepie91 looks like IA groks youtube videos now? https://web.archive.org/web/20130303114655if_/http://www.youtube.com/embed/dU1xS07N-FA
00:54 🔗 Microguru so does that mean that all we need to do is scrape for videos and let IA do the rest? sounds more bandwidth efficient
00:54 🔗 marvinw joepie91: they implemented this a while ago when they grabbed ~1PB of YouTube but I think they mostly stopped grabbing YouTube
00:55 🔗 joepie91 aha
00:56 🔗 marvinw it also works on the /watch?v= pages
00:56 🔗 marvinw if you're lucky Chrome will manage to load some kind of replacement Flash (?) player
00:57 🔗 marvinw your /embed/ link isn't working for me
00:57 🔗 marvinw no video playback, I think that's just the YouTube player failing to load YouTube content?
00:57 🔗 Microguru "when they grabbed ~1PB of YouTube" what did they all grab?
00:58 🔗 yipdw #-bs
01:00 🔗 Sanqui has quit IRC (Read error: Operation timed out)
01:01 🔗 wp494_ has joined #archiveteam
01:02 🔗 stevieo has joined #archiveteam
01:04 🔗 cloudmons has quit IRC (Ping timeout: 506 seconds)
01:05 🔗 SiBurning has quit IRC (Ping timeout: 506 seconds)
01:05 🔗 wp494 has quit IRC (Ping timeout: 506 seconds)
01:05 🔗 robink has quit IRC (Ping timeout: 506 seconds)
01:09 🔗 BlueMaxim has joined #archiveteam
01:21 🔗 wp494_ is now known as wp494
01:27 🔗 aaaaaaaaa has quit IRC (Leaving)
01:33 🔗 cloudmons has joined #archiveteam
01:34 🔗 robink has joined #archiveteam
01:39 🔗 Sanqui has joined #archiveteam
01:45 🔗 cloudmons has quit IRC (Read error: Connection reset by peer)
01:45 🔗 cloudmons has joined #archiveteam
02:01 🔗 aaaaaaaaa has joined #archiveteam
02:04 🔗 godane has quit IRC (Ping timeout: 268 seconds)
02:12 🔗 cloudmons has quit IRC (Read error: Connection reset by peer)
02:12 🔗 cloudmons has joined #archiveteam
02:13 🔗 robink has quit IRC (Read error: Connection reset by peer)
02:14 🔗 robink has joined #archiveteam
02:23 🔗 Coderjoe_ has quit IRC (Ping timeout: 252 seconds)
02:25 🔗 Coderjoe has joined #archiveteam
02:49 🔗 SketchCow Microguru: 56,302 copies of "Shake it Off"
02:50 🔗 SketchCow Back home!
02:50 🔗 SketchCow ....for two days
02:56 🔗 stevieo has quit IRC (Read error: Connection reset by peer)
02:57 🔗 godane has joined #archiveteam
03:32 🔗 SketchCow Uploading of Gamefront begins.
03:48 🔗 zenguy_pc has quit IRC (Read error: Connection reset by peer)
04:04 🔗 zenguy_pc has joined #archiveteam
04:18 🔗 Froggypwn has joined #archiveteam
04:25 🔗 aaaaaaaaa has quit IRC (Leaving)
04:44 🔗 zenguy_pc has quit IRC (Read error: Connection reset by peer)
04:44 🔗 matthusb- has quit IRC (Read error: Operation timed out)
04:44 🔗 matthusby has joined #archiveteam
05:01 🔗 zenguy_pc has joined #archiveteam
05:38 🔗 swebb has quit IRC (ny.us.hub irc.colosolutions.net)
05:52 🔗 Infreq has joined #archiveteam
06:00 🔗 pokeball9 has quit IRC (Quit: Connection closed for inactivity)
06:46 🔗 bzc6p has joined #archiveteam
06:48 🔗 scyther has joined #archiveteam
06:58 🔗 Ungstein1 has quit IRC (Ping timeout: 252 seconds)
06:59 🔗 Ungstein1 has joined #archiveteam
07:16 🔗 vitzli has joined #archiveteam
07:18 🔗 asfd has joined #archiveteam
07:23 🔗 JesseW has joined #archiveteam
07:29 🔗 JesseW has quit IRC (Read error: Operation timed out)
07:40 🔗 primus104 has joined #archiveteam
07:47 🔗 diskozap has joined #archiveteam
07:47 🔗 diskozap has quit IRC (Client Quit)
08:10 🔗 DFJustin http://forum.vgcw.net/single/?p=8640299&t=11352108
08:21 🔗 oli has quit IRC (Ping timeout: 252 seconds)
08:26 🔗 philpem has joined #archiveteam
08:37 🔗 schbirid has joined #archiveteam
08:46 🔗 oli has joined #archiveteam
08:49 🔗 scyther has quit IRC (Quit: Leaving)
08:54 🔗 insane_al has joined #archiveteam
09:04 🔗 arkiver SketchCows: thanks!
09:06 🔗 arkiver oops, without the s
09:11 🔗 bai the world cannot handle more than one SketchCow. it's just not ready.
09:11 🔗 schbirid there would be a bovine ignition movement from scared citizens
09:21 🔗 Ungstein1 has quit IRC (Quit: Leaving.)
09:24 🔗 arkiver yuku grab is started!
09:26 🔗 philpem has quit IRC (Ping timeout: 252 seconds)
09:29 🔗 dxrt arkiver: I'm getting rate limited?
09:29 🔗 arkiver yeah, currently going at 1 item/min
09:30 🔗 arkiver the site is very unstable, I'm not sure if that's due to bandwidth
09:30 🔗 arkiver if it is, we should go very very slow
09:30 🔗 arkiver I'll raise the limit if the site remains stable
09:31 🔗 dxrt alright
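The policy arkiver describes (start at 1 item/min, raise the limit while the site stays stable, back off if it does not) can be sketched as an additive-increase, halve-on-failure limiter. This is an illustration only; the class name, step sizes and thresholds are assumptions and not the tracker's actual settings.

```python
class AdaptiveRate:
    """Items-per-minute limit that creeps up on success and halves on failure."""

    def __init__(self, start=1.0, floor=0.25, ceiling=10.0, step=0.5, streak_needed=5):
        self.rate = start                  # current limit, items per minute
        self.floor = floor                 # never go slower than this
        self.ceiling = ceiling             # never go faster than this
        self.step = step                   # how much to add after a good streak
        self.streak_needed = streak_needed
        self._streak = 0                   # consecutive successful items

    def record(self, ok):
        """Report the outcome of one item and return the new rate."""
        if ok:
            self._streak += 1
            if self._streak >= self.streak_needed:
                self.rate = min(self.ceiling, self.rate + self.step)
                self._streak = 0
        else:
            # The site looks unstable: back off quickly and start counting again.
            self.rate = max(self.floor, self.rate / 2)
            self._streak = 0
        return self.rate

    def delay_seconds(self):
        """Seconds to wait between items at the current rate."""
        return 60.0 / self.rate


limiter = AdaptiveRate()
for _ in range(5):
    limiter.record(True)
print(limiter.rate, limiter.delay_seconds())
```

Raising slowly and dropping fast errs on the side of the "very very slow" crawl arkiver asks for when the site is struggling.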
10:16 🔗 Ghost_of_ has joined #archiveteam
10:22 🔗 asfd has quit IRC (Quit: Leaving)
11:22 🔗 Dark_Star has quit IRC ()
11:33 🔗 zerkalo has quit IRC (Ping timeout: 186 seconds)
11:33 🔗 thefinn93 has quit IRC (Ping timeout: 186 seconds)
11:33 🔗 winr4r has quit IRC (Ping timeout: 186 seconds)
11:33 🔗 jmtd is now known as Jon
11:34 🔗 Coderjoe has quit IRC (Ping timeout: 186 seconds)
11:34 🔗 Coderjoe has joined #archiveteam
11:34 🔗 Nemo_bis has quit IRC (Ping timeout: 186 seconds)
11:34 🔗 Nemo_bis has joined #archiveteam
11:35 🔗 thefinn93 has joined #archiveteam
11:36 🔗 zerkalo has joined #archiveteam
11:38 🔗 winr4r has joined #archiveteam
11:45 🔗 primus104 has quit IRC (Leaving.)
11:57 🔗 VADemon has joined #archiveteam
12:20 🔗 BlueMaxim has quit IRC (Read error: Connection reset by peer)
12:21 🔗 WinterFox has quit IRC (Remote host closed the connection)
12:30 🔗 HCrossSRV has quit IRC (Read error: Operation timed out)
12:31 🔗 HCrossSRV has joined #archiveteam
12:49 🔗 Ghost_of_ has quit IRC (Quit: Leaving)
12:50 🔗 Elegance has quit IRC (Read error: Connection reset by peer)
12:59 🔗 Elegance has joined #archiveteam
13:16 🔗 zenguy_pc has quit IRC (Read error: Connection reset by peer)
13:30 🔗 z00nx has quit IRC (Quit: WeeChat 1.2)
13:30 🔗 z00nx has joined #archiveteam
13:33 🔗 zenguy_pc has joined #archiveteam
13:36 🔗 z00nx has quit IRC (Quit: WeeChat 1.2)
13:36 🔗 z00nx has joined #archiveteam
13:38 🔗 z00nx has quit IRC (Client Quit)
13:40 🔗 z00nx has joined #archiveteam
13:49 🔗 primus104 has joined #archiveteam
14:19 🔗 VADemon_ has joined #archiveteam
14:21 🔗 SketchCow Transfers going fine.
14:22 🔗 arkiver Yes, already 14 items uploaded.
14:22 🔗 VADemon has quit IRC (Read error: Operation timed out)
14:25 🔗 arkiver SketchCow: so you also have a presentation on the 31st? It's not listed on the IMPAKT website. When will it be?
14:26 🔗 SketchCow I speak in Brussels on the 31st and Utrecht on the 1st.
14:27 🔗 arkiver Sounds like a busy week
14:27 🔗 SketchCow PACKED is the Brussels event
14:28 🔗 Ghost_of_ has joined #archiveteam
14:28 🔗 arkiver http://www.packed.be/
14:29 🔗 arkiver Me and midas will be there the 1st of november
14:29 🔗 arkiver Looking forward to it :)
14:43 🔗 HCross anything like that in the London area?
14:45 🔗 SketchCow I've spoken in the London area in the past, but nothing scheduled this year forward.
14:45 🔗 SketchCow http://devslovebacon.com/conferences/bacon-2014/talks/from-colo-to-yolo-confessions-of-the-angriest-archivist (2014)
14:46 🔗 HCross http://www.thinkingdigital.co.uk might be a good one to look at
14:46 🔗 SketchCow I get flown to them. I don't fly to them.
14:46 🔗 HCross ah xD
14:47 🔗 SketchCow There's like six new active Archive Team members, like this HCross guy here, and I probably should say hi to them.
14:48 🔗 HCross Hi SketchCow
14:48 🔗 SketchCow I mostly notice them (and you, HCross) because they come in demanding answers
14:48 🔗 SketchCow Potentially good AT material
14:48 🔗 HCross ah
14:48 🔗 SketchCow First ask "why", then go "what the fuck", then "fuck you", then "I built a tool, good luck negotiating with your ISP next month for overages"
14:49 🔗 HCross haha. Unlimited BW servers here
14:51 🔗 joepie91 SketchCow: talks being recorded?
14:51 🔗 joepie91 / published
14:51 🔗 SketchCow I have no idea.
14:51 🔗 SketchCow Probably.
14:52 🔗 SketchCow I could PROBABLY sneak you into the second one, you cheapshit
14:52 🔗 SketchCow Since it's down the way
14:53 🔗 HCross SketchCow, a very enjoyable talk. Thanks
14:53 🔗 joepie91 lol
14:53 🔗 HCross "If you see a cat change colour, RUNe
14:53 🔗 HCross "If you see a cat change colour, RUN"
14:53 🔗 joepie91 SketchCow: would be appreciated, but I still have to take into account travel cost as well, especially since I'm presently behind on rent :P
14:53 🔗 joepie91 well
14:53 🔗 joepie91 not technically behind yet
14:53 🔗 joepie91 but going to be
14:54 🔗 SketchCow SO BONE POOR
14:54 🔗 SketchCow I could certainly see about it
14:54 🔗 joepie91 (it's like 27 euro roundtrip by train, from Dordrecht <-> Utrecht)
14:57 🔗 SketchCow Cling to the bottom of the train
15:00 🔗 joepie91 lol
15:02 🔗 pokeball9 has joined #archiveteam
15:13 🔗 * midas fires up google maps
15:13 🔗 midas arkiver: whereabouts are you?
15:14 🔗 arkiver amsterdam
15:14 🔗 arkiver you?
15:14 🔗 midas naarden
15:14 🔗 midas so im in between utrecht / amsterdam
15:14 🔗 arkiver yes
15:15 🔗 arkiver are you going all day or only to SketchCow's talk?
15:15 🔗 midas probably just the talk, i might go all day
15:15 🔗 midas joepie91: if in need, i can pick you up and bring you home again. yadayada companycar
15:17 🔗 midas sets mode: +o joepie91
15:17 🔗 joepie91 midas: that'd be a viable option :P cc SketchCow
15:17 🔗 midas sets mode: +oo ersi Nemo_bis
15:18 🔗 midas it's just a 160k roundtrip :p
15:18 🔗 midas but companycar! it's free
15:18 🔗 joepie91 midas: calculate in ~10-20 minutes of getting completely lost, though, this city is a nightmare to navigate by car
15:18 🔗 joepie91 and I have seen exactly one satnav get it right
15:18 🔗 joepie91 lol
15:18 🔗 HCross just made the Yuku grab work on ARM
15:19 🔗 midas apple maps will fail at it, badly
15:19 🔗 midas but ill give it a try
15:19 🔗 joepie91 midas: it's all one-way streets everywhere, you make a single wrong turn and you end up having to make a full circle around the city center to get back where you were
15:19 🔗 joepie91 :P
15:19 🔗 midas thats why im using apple maps, i like to see the city a couple of times
15:19 🔗 schbirid dutch-bs
15:20 🔗 midas daww
15:38 🔗 Ghost_of_ has quit IRC (Quit: Leaving)
16:06 🔗 nertzy has joined #archiveteam
16:18 🔗 JesseW has joined #archiveteam
16:18 🔗 jmad980 has quit IRC (Ping timeout: 252 seconds)
16:30 🔗 JesseW has quit IRC (Read error: Operation timed out)
16:34 🔗 jmad980 has joined #archiveteam
16:45 🔗 Start has quit IRC (Read error: Connection reset by peer)
16:46 🔗 Start has joined #archiveteam
16:47 🔗 arkiver bzc6p: available for myvip questions?
16:49 🔗 bzc6p Yes. But we should probably set up a channel for that.
16:49 🔗 arkiver let's think of a channel then
16:50 🔗 bzc6p #byevip ?
16:50 🔗 arkiver yes, let's do #byevip
16:51 🔗 SketchCow #nosovip
16:51 🔗 SketchCow notsovip
16:51 🔗 philpem has joined #archiveteam
16:53 🔗 arkiver everyone: #byevip or #notsovip ?
16:55 🔗 bzc6p In fact, byevip may sound better only in Hungarian, as we pronounce it [maivip] instead of [maiviaipi:].
16:55 🔗 arkiver SketchCow: you talked about archiving external links from mediawiki's some time ago.
16:55 🔗 SketchCow Yes
16:55 🔗 SketchCow IO
16:55 🔗 arkiver I think we'll archive those with the upcoming wikis project
16:55 🔗 SketchCow I'd like that very very much, starting with fileformats.archiveteam.org
16:56 🔗 arkiver Basically that project will grab all wikis into WARCs, I'll also make sure we can grab all external links from wikis into WARCs
16:56 🔗 arkiver So those will be done through the warrior I think
16:56 🔗 arkiver That ok?
16:57 🔗 bzc6p arkiver: I'm away for like 20 minutes, don't you mind?
16:57 🔗 arkiver Nope
16:58 🔗 bzc6p You may list your questions in the appropriate channel until then. Thanks for your time, by the way!
16:58 🔗 arkiver ;)
17:01 🔗 JesseW has joined #archiveteam
17:03 🔗 gibigiana has quit IRC (Ping timeout: 252 seconds)
17:03 🔗 SketchCow So, I've been handed nearly the entire metadata (grabbed HTML and images) of MP3.COM.
17:03 🔗 SketchCow It's safely ensconced behind a dark object at the archive now. About 5gb
17:04 🔗 Lord_Nigh has quit IRC (Ping timeout: 252 seconds)
17:04 🔗 vtyl has quit IRC (Ping timeout: 252 seconds)
17:04 🔗 arkiver nice
17:05 🔗 is- has quit IRC (Ping timeout: 252 seconds)
17:05 🔗 lukeman has quit IRC (Ping timeout: 252 seconds)
17:06 🔗 JesseW has quit IRC (Read error: Operation timed out)
17:06 🔗 is- has joined #archiveteam
17:07 🔗 dan- has quit IRC (Ping timeout: 252 seconds)
17:07 🔗 lytv has joined #archiveteam
17:07 🔗 wp494_ has joined #archiveteam
17:08 🔗 balrog has quit IRC (Ping timeout: 252 seconds)
17:08 🔗 zhongfu has quit IRC (Ping timeout: 252 seconds)
17:08 🔗 tephra has quit IRC (Ping timeout: 252 seconds)
17:08 🔗 tephra has joined #archiveteam
17:08 🔗 wp494 has quit IRC (Ping timeout: 252 seconds)
17:09 🔗 Baljem_ has joined #archiveteam
17:09 🔗 Selanda has quit IRC (Ping timeout: 252 seconds)
17:10 🔗 vitzli has quit IRC (Leaving)
17:11 🔗 goekesmi has quit IRC (Ping timeout: 252 seconds)
17:11 🔗 bai has quit IRC (Ping timeout: 252 seconds)
17:11 🔗 gibigiana has joined #archiveteam
17:11 🔗 goekesmi has joined #archiveteam
17:11 🔗 Selanda has joined #archiveteam
17:12 🔗 bai has joined #archiveteam
17:12 🔗 diacope has quit IRC (Ping timeout: 252 seconds)
17:13 🔗 wacky_ has quit IRC (Ping timeout: 252 seconds)
17:13 🔗 Baljem has quit IRC (Ping timeout: 252 seconds)
17:14 🔗 Kenshin has quit IRC (Ping timeout: 252 seconds)
17:15 🔗 wacky has joined #archiveteam
17:15 🔗 Lord_Nigh has joined #archiveteam
17:17 🔗 Kenshin has joined #archiveteam
17:17 🔗 zhongfu has joined #archiveteam
17:17 🔗 lukeman has joined #archiveteam
17:18 🔗 dan- has joined #archiveteam
17:23 🔗 balrog has joined #archiveteam
17:24 🔗 Kenshin has quit IRC (Quit: ZNC - http://znc.in)
17:24 🔗 Kenshin has joined #archiveteam
17:29 🔗 diacope has joined #archiveteam
17:36 🔗 arkiver2 has joined #archiveteam
17:36 🔗 wp494_ is now known as wp494
17:42 🔗 SketchCow So, my plan is to work with Archive Team members to convert it back into WARCS and shove it into the archive.
17:42 🔗 diacope has quit IRC (Ping timeout: 252 seconds)
17:43 🔗 arkiver2 has quit IRC (Ping timeout: 252 seconds)
17:43 🔗 Fletcher has quit IRC (Ping timeout: 252 seconds)
17:43 🔗 dan- has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 Famicoman has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 sivoais has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 balrog has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 wacky has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 joepie91 has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 WubTheCap has quit IRC (Ping timeout: 252 seconds)
17:44 🔗 wacky has joined #archiveteam
17:44 🔗 Sue_ has quit IRC (Ping timeout: 252 seconds)
17:45 🔗 sivoais has joined #archiveteam
17:51 🔗 joepie91 has joined #archiveteam
17:52 🔗 aaaaaaaaa has joined #archiveteam
17:55 🔗 WubTheCap has joined #archiveteam
17:56 🔗 dan- has joined #archiveteam
17:56 🔗 diacope has joined #archiveteam
17:57 🔗 SketchCow This brings the classic issue
17:57 🔗 SketchCow Of creating WARCs from nowhere
17:57 🔗 balrog has joined #archiveteam
18:00 🔗 Sue_ has joined #archiveteam
18:04 🔗 bzc6p_ has joined #archiveteam
18:09 🔗 arkiver SketchCow: Main problem is the recreation of request and response headers
18:09 🔗 bzc6p has quit IRC (Read error: Operation timed out)
18:13 🔗 SketchCow Agreed, a host of issues.
18:13 🔗 SketchCow I've started with the foundation, of course. Would this break wayback?
18:13 🔗 SketchCow (Asking our engineers)
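To make the header problem concrete, here is a rough sketch of writing a WARC/1.0 `response` record for content that was saved without its original HTTP exchange. The HTTP status line and headers in it are synthesized, which is exactly the fabrication the discussion is worried about; the function name and defaults are assumptions, and whether the Wayback Machine would replay such a record is the open question raised above.

```python
import uuid
from datetime import datetime, timezone


def build_warc_response(url, body, content_type="text/html", when=None):
    """Return one WARC/1.0 response record as bytes.

    The HTTP header block below is NOT the server's original response;
    it is a minimal stand-in created because the real headers were lost.
    """
    when = when or datetime.now(timezone.utc)
    http_block = (
        b"HTTP/1.1 200 OK\r\n"
        + f"Content-Type: {content_type}\r\n".encode("ascii")
        + f"Content-Length: {len(body)}\r\n".encode("ascii")
        + b"\r\n"
        + body
    )
    warc_headers = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Target-URI: {url}",
        # Ideally the original capture time, if it is known.
        f"WARC-Date: {when.strftime('%Y-%m-%dT%H:%M:%SZ')}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        "Content-Type: application/http; msgtype=response",
        f"Content-Length: {len(http_block)}",
    ]
    # Header lines, a blank line, the block, then two CRLFs to end the record.
    return "\r\n".join(warc_headers).encode("utf-8") + b"\r\n\r\n" + http_block + b"\r\n\r\n"


record = build_warc_response("http://example.com/", b"<html>hi</html>")
print(record.decode("utf-8"))
```

The request record is missing entirely here, and nothing in the output marks the headers as reconstructed, so any real conversion would need an agreed way to flag such records.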
18:17 🔗 SketchCow This dubstep is making me forget everybody sucks
18:18 🔗 Fletcher has joined #archiveteam
18:18 🔗 schbirid i kinda want to sneak in some fake warc data some day
18:18 🔗 schbirid it's crazy that we can do it
18:19 🔗 SketchCow This entire situation is built on a whole set of trust built on my name and reputation with the archive.
18:19 🔗 SketchCow Violate it and we won't even remember where you once stood
18:19 🔗 schbirid the problem is that you cannot trust people, who knows who sneaks in stuff :(
18:20 🔗 SketchCow I find they tend to be little blabbermouths who mention it in a general channel.
18:20 🔗 arkiver So the external URL grabbing script is working
18:21 🔗 arkiver The fileformats wii has 17381 external URLs
18:22 🔗 arkiver wiki*
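arkiver's script is not shown in the channel. As a rough illustration of the idea, external links can be pulled out of a rendered wiki page by resolving every `href` and keeping only http(s) URLs on a different host; the class and function names below are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links(html, page_url):
    """Return sorted, de-duplicated http(s) links that leave the wiki's host."""
    wiki_host = urlparse(page_url).netloc.lower()
    parser = LinkCollector()
    parser.feed(html)
    found = set()
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)   # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc.lower() != wiki_host:
            found.add(absolute)
    return sorted(found)


page = """
<a href="/wiki/PNG">PNG</a>
<a href="http://www.w3.org/TR/PNG/">spec</a>
<a href="http://www.w3.org/TR/PNG/">spec again</a>
<a href="mailto:someone@example.org">mail</a>
"""
print(external_links(page, "http://fileformats.archiveteam.org/wiki/Main_Page"))
```

On a MediaWiki install the same list may also be available from the wiki's own external-links tables, which would avoid parsing HTML at all.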
18:24 🔗 SketchCow Yep
18:25 🔗 arkiver chfoo: can you please create a FOS rsync target for 'wikis'?
18:26 🔗 arkiver chfoo: we'll be grabbing full wikis and external links from wikis
18:54 🔗 Famicoman has joined #archiveteam
18:57 🔗 insane_al has quit IRC (Leaving)
19:03 🔗 bzc6p_ is now known as bzc6p
19:17 🔗 DFJustin didn't you already upload that mp3.com stuff ages ago
19:18 🔗 DFJustin https://archive.org/details/mp3com-skeleton
19:20 🔗 insane_al has joined #archiveteam
19:23 🔗 Ghost_of_ has joined #archiveteam
19:39 🔗 Start has quit IRC (Quit: Disconnected.)
19:39 🔗 Start has joined #archiveteam
19:40 🔗 SketchCow ha ha
19:40 🔗 SketchCow mmmaybe
19:42 🔗 HCross The IA is now getting a copy of "Justin Bieber OS"
19:44 🔗 Start_ has joined #archiveteam
19:45 🔗 SketchCow Thank god
19:45 🔗 HCross it needs to run on all the IA servers now
19:46 🔗 Start has quit IRC (Read error: Operation timed out)
19:46 🔗 HCross [19:46:20] 19<Major> HCross: Your job for http://biebian.sourceforge.net/ has finished.
19:47 🔗 chfoo arkiver: ok, done
19:47 🔗 Start has joined #archiveteam
19:51 🔗 Start_ has quit IRC (Ping timeout: 310 seconds)
19:52 🔗 bzc6p It's Linux, at least.
19:54 🔗 bzc6p By the way, what's the news with sourceforge?
19:55 🔗 arkiver no reply yet
19:56 🔗 bzc6p then there won't be any
19:57 🔗 SketchCow Assume they're all dead
19:57 🔗 SketchCow http://i.ytimg.com/vi/PE-CfJ190x4/maxresdefault.jpg
19:57 🔗 SimpBrain has quit IRC (Read error: Operation timed out)
19:58 🔗 Start has quit IRC (Quit: Disconnected.)
20:10 🔗 Start has joined #archiveteam
20:16 🔗 schbirid has quit IRC (Quit: Leaving)
20:19 🔗 insane_al has quit IRC (Leaving)
20:23 🔗 aaaaaaaaa with both sourceforge and google code, for the source code items, what exactly is the plan for them?
20:24 🔗 aaaaaaaaa I mean one item per repo seems a little much. But I don't know if randomly stuffed in packs is that useful
20:25 🔗 arkiver I think alphabetical packing
20:26 🔗 aaaaaaaaa the problem with that is they would need to be grabbed alphabetically. Plus some letters will have way more than others, like l, x, and p
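The uneven-letter half of that objection can be softened by cutting packs on size instead of on letter boundaries: sort the repos by name and start a new pack whenever a size cap would be exceeded. This is a sketch with made-up names and sizes, and it does not address the other half of the objection, since the full listing still has to exist before packing.

```python
def pack_alphabetically(repos, max_bytes):
    """Split (name, size_bytes) pairs into name-ordered packs under a size cap.

    A single repo larger than max_bytes still gets a pack of its own.
    """
    packs, current, current_size = [], [], 0
    for name, size in sorted(repos, key=lambda r: r[0].lower()):
        if current and current_size + size > max_bytes:
            packs.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        packs.append(current)
    return packs


def label(pack):
    """Name a pack by the first and last repo it contains."""
    return f"{pack[0]}..{pack[-1]}"


# Hypothetical repos and sizes, for illustration only.
repos = [("xterm", 40), ("libpng", 60), ("apache", 50), ("lame", 30), ("perl", 70)]
packs = pack_alphabetically(repos, max_bytes=100)
print([label(p) for p in packs])
```

Labelling each pack by its first and last name keeps lookup alphabetical even though the cut points no longer fall on letters.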
20:27 🔗 SketchCow 2.7tb of gamefront being uploaded at the moment.
20:30 🔗 scyther has joined #archiveteam
20:33 🔗 SimpBrain has joined #archiveteam
21:02 🔗 arkiver Where going to start the wikis project for external URLs!
21:02 🔗 arkiver Who wants to take the (big) fileformats archiveteam wiki item?
21:03 🔗 arkiver ooops, we're I mean
21:04 🔗 arkiver sorry, I'm a bit tired
21:05 🔗 VADemon_ the wikis project should be able to grab referenced youtube videos, imho
21:05 🔗 HCross what is the most urgent one?
21:05 🔗 arkiver currently that grab is not able to grab youtube videos
21:05 🔗 VADemon_ although youtube is stable...
21:06 🔗 arkiver HCross: you want to fileformats wiki?
21:06 🔗 Meeh has quit IRC (Remote host closed the connection)
21:06 🔗 HCross I meant, is the wiki project more important vs the Yuku
21:07 🔗 arkiver yuku is more important, but we currently don't need a lot more concurrent on that grab
21:07 🔗 HCross I will stay on Yuku with my 40 workers
21:08 🔗 arkiver you already have 40 on it? yeah, please keep them on yuku then
21:08 🔗 HCross 2x20
21:20 🔗 arkiver2 has joined #archiveteam
21:20 🔗 arkiver2 has quit IRC (Client Quit)
21:41 🔗 WubTheCap has quit IRC (Quit: Leaving)
21:59 🔗 scyther has quit IRC (Read error: Connection reset by peer)
22:03 🔗 nertzy has quit IRC (Quit: This computer has gone to sleep)
22:10 🔗 VADemon_ yuku-discovery: PR, added DNSdumpster and penetration-tools https://github.com/chpwssn/yuku-discovery/pull/1
22:27 🔗 arkiver VADemon_: thanks, but we found a sitemap a while ago, so we have all sites
22:27 🔗 aaaaaaaaa has quit IRC (Read error: Connection reset by peer)
22:27 🔗 aaaaaaaaa has joined #archiveteam
22:44 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
22:47 🔗 RichardG has joined #archiveteam
22:51 🔗 VADemon_ ah yea, it has the sitemap.xml too, but the script still reported added entries
22:56 🔗 dashcloud has joined #archiveteam
22:58 🔗 VADemon_ arkiver, amazonklubben.yuku.com is not in the sitemap.xml for example
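A check like VADemon_'s can be sketched as a set difference between hostnames listed in sitemap.xml and hostnames found by other discovery methods. The sitemap content below is a made-up stand-in (only amazonklubben.yuku.com comes from the channel), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the standard sitemap schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_hosts(xml_text):
    """Return the set of hostnames that appear in <loc> entries."""
    root = ET.fromstring(xml_text)
    return {
        urlparse(loc.text.strip()).netloc.lower()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    }


def missing_from_sitemap(xml_text, discovered_hosts):
    """Return discovered hostnames that the sitemap does not list."""
    known = sitemap_hosts(xml_text)
    return sorted({h.lower() for h in discovered_hosts} - known)


sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://exampleforum1.yuku.com/</loc></url>
  <url><loc>http://exampleforum2.yuku.com/</loc></url>
</urlset>"""

discovered = ["exampleforum1.yuku.com", "amazonklubben.yuku.com"]
print(missing_from_sitemap(sample_sitemap, discovered))
```

Anything this reports is a candidate to add to the grab list after confirming the host still resolves.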
23:04 🔗 arkiver VADemon_: nice find!
23:04 🔗 arkiver well, will have a look at that later, afk now for the night
23:25 🔗 dashcloud has quit IRC (Read error: Operation timed out)
23:27 🔗 dashcloud has joined #archiveteam
23:55 🔗 dan- has quit IRC (Ping timeout: 252 seconds)
