#archiveteam-bs 2018-06-27,Wed


Time Nickname Message
00:11 ta9le has joined #archiveteam-bs
00:16 wacky_ has quit IRC (Ping timeout: 260 seconds)
00:45 Stilett0 has quit IRC (Ping timeout: 252 seconds)
00:54 Darkstar has quit IRC (Ping timeout: 632 seconds)
01:08 Darkstar has joined #archiveteam-bs
01:13 C4K3 has quit IRC (Read error: Operation timed out)
01:28 m007a83 has quit IRC (Read error: Connection reset by peer)
01:37 Dimtree has quit IRC (Read error: Connection reset by peer)
01:41 archodg__ has quit IRC (Read error: Operation timed out)
01:44 Dimtree has joined #archiveteam-bs
02:22 wacky has joined #archiveteam-bs
02:31 m007a83 has joined #archiveteam-bs
02:45 C4K3 has joined #archiveteam-bs
03:03 ta9le has quit IRC (Quit: Connection closed for inactivity)
03:06 wp494 has quit IRC (Ping timeout: 492 seconds)
03:06 wp494 has joined #archiveteam-bs
03:09 odemg has quit IRC (Ping timeout: 260 seconds)
03:22 odemg has joined #archiveteam-bs
03:34 flashfire has joined #archiveteam-bs
04:08 K4k_ has quit IRC (Read error: Connection reset by peer)
04:34 flashfire has quit IRC (Quit: http://www.mibbit.com ajax IRC Client)
04:38 BlueMax has quit IRC (Leaving)
05:04 BlueMax has joined #archiveteam-bs
05:17 Zebranky has quit IRC (Read error: Operation timed out)
05:17 JAA has quit IRC (Read error: Operation timed out)
05:18 tapedrive has quit IRC (Read error: Operation timed out)
05:18 Dimtree has quit IRC (Read error: Operation timed out)
05:18 squires has quit IRC (Read error: Operation timed out)
05:18 beardicus has quit IRC (Read error: Operation timed out)
05:18 BlueMaxim has joined #archiveteam-bs
05:26 logchfoo3 starts logging #archiveteam-bs at Wed Jun 27 05:26:44 2018
05:26 logchfoo3 has joined #archiveteam-bs
05:27 C4K3 has quit IRC (Ping timeout: 601 seconds)
05:27 Gfy has joined #archiveteam-bs
05:28 REiN^ has quit IRC (Ping timeout: 602 seconds)
05:30 Lord_Nigh has quit IRC (Read error: Operation timed out)
05:33 Lord_Nigh has joined #archiveteam-bs
05:58 Dimtree has joined #archiveteam-bs
06:14 beardicus has joined #archiveteam-bs
06:14 REiN^ has joined #archiveteam-bs
06:15 rbraun has joined #archiveteam-bs
06:17 squires has joined #archiveteam-bs
06:19 tyzoid has joined #archiveteam-bs
06:32 PotcFdk has joined #archiveteam-bs
06:33 phillipsj has quit IRC (Ping timeout: 1212 seconds)
06:41 ItsYoda has quit IRC (Ping timeout: 260 seconds)
06:45 sep332 has joined #archiveteam-bs
06:48 C4K3 has joined #archiveteam-bs
06:50 ItsYoda has joined #archiveteam-bs
07:23 Jusque has quit IRC (Ping timeout: 260 seconds)
07:23 Jusque has joined #archiveteam-bs
09:38 ta9le has joined #archiveteam-bs
10:08 Mateon1 has quit IRC (Read error: Operation timed out)
10:08 Mateon1 has joined #archiveteam-bs
10:43 eientei95 [22:41:14] <eientei95> I'm not sure if 6k0ssd31g4qrmlp05pc2kgdx5 will complete or if we want it to
10:43 eientei95 [22:41:29] <eientei95> It's got a lot of large files, most of which are TV shows, movies or software
10:43 eientei95 [22:41:44] <eientei95> If we were to archive them, it'd be better to have them made as individual items
10:54 BlueMaxim has quit IRC (Quit: Leaving)
12:16 Muad-Dib JAA: If someone were to be successful in getting the Halo data from Bungie, who will handle possible pre-IA storage?
12:36 PurpleSym Any idea how much data we’re talking about, Muad-Dib ?
12:36 PurpleSym (Spread some ops please)
12:47 Muad-Dib the statistics themselves are mostly just tables of numbers for each game, but there are about 800M games played for Halo 2 and from what I hear WAY more for Halo 3 and up
12:47 Muad-Dib PurpleSym: ^
12:48 Muad-Dib and if we do get more multimedia content, I'd push the estimate up to tens, if not hundreds, of TB
12:49 Muad-Dib I can't remember the total file size and scope of the Halo 3 multimedia grab we did, but that should be a good basis for an estimate
12:51 Muad-Dib sets mode: +o arkiver
12:51 Muad-Dib sets mode: +oo HCross SketchCow
12:52 Muad-Dib sets mode: +o PurpleSym
12:52 Muad-Dib JAA: I still haven't been able to find contact information for Wolfson, only an earthlink email address from a very old personal website of his that I'll assume not to be alive anymore
12:53 svchfoo1 has joined #archiveteam-bs
12:53 svchfoo3 has joined #archiveteam-bs
12:53 PurpleSym sets mode: +oo svchfoo1 svchfoo3
12:59 Muad-Dib PurpleSym: In the less ideal case, grabbing individual games' stats pages with Wpull/archivebot is currently giving me 4000 games per gigabyte (there's a "small" job for the last ~138k games running on archivebot right now)
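The storage implied by that rate can be worked out in a few lines. This is a rough sketch only: it takes the 4000 games per gigabyte and the roughly 800M Halo 2 games quoted above at face value, and it assumes decimal units (1 TB = 1000 GB). The function name is illustrative.

```python
# Back-of-the-envelope storage estimate from the figures quoted in the log.
GAMES_PER_GB = 4000          # rate reported for the Wpull/ArchiveBot grabs
HALO2_GAMES = 800_000_000    # approximate Halo 2 game count from the log


def estimated_size_tb(games, games_per_gb=GAMES_PER_GB):
    """Return the estimated size in TB, assuming 1 TB = 1000 GB."""
    return games / games_per_gb / 1000


print(estimated_size_tb(HALO2_GAMES))   # 200.0
print(estimated_size_tb(138_000))       # the "small" job of ~138k games
```

At that rate Halo 2 alone would land in the hundreds of TB that Muad-Dib mentions, if the per-gigabyte figure reflects stored size rather than transferred bytes.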
13:23 PurpleSym Maybe we can guess his bungie email address. first.last@bungie.net or something similar?
13:28 Igloo I can provide a couple of TB of space to it, But it'd have to be WARCd and then I can upload to IA. But it'd be best as a warrior project
13:28 Igloo With numerous sync targets to spread the love
13:51 Muad-Dib Igloo: there's probably no time to make it a Warrior project, the detailed stats are going tomorrow 17:00 UTC
13:52 Muad-Dib that's we're considering asking Bungie
13:52 Muad-Dib that's why we're*
14:04 Aoede has quit IRC (Quit: ZNC - https://znc.in)
14:07 Aoede has joined #archiveteam-bs
14:23 icedice has joined #archiveteam-bs
14:27 icedice has quit IRC (Client Quit)
14:27 icedice has joined #archiveteam-bs
14:57 K4k has joined #archiveteam-bs
15:12 JAA According to our wiki page, we archived 37 TB of data in 2014/15.
15:13 JAA The game pages should be smaller than that, but the problem is that we can't really grab 800 million (or more) pages in 26 hours.
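To put a number on why that is hard, here is a small sketch of the sustained request rate the deadline would demand. The 800 million pages and 26 hours come from the message above; the helper name is made up for illustration.

```python
def required_rate_per_minute(pages, hours):
    """Pages per minute needed to fetch `pages` within `hours`."""
    return pages / (hours * 60)


rate = required_rate_per_minute(800_000_000, 26)
print(f"{rate:,.0f} pages per minute")
print(f"{rate / 60:,.0f} pages per second")
```

That works out to roughly half a million pages per minute, sustained for the whole window.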
15:15 JAA Muad-Dib: I believe the numbers displayed on the ArchiveBot dashboard are uncompressed, by the way.
15:15 JAA I.e. the amount of (download?) network traffic, not the size of the archives.
15:21 godane SketchCow: i'm uploading some old tapes rips i have not uploaded yet
15:21 godane one is NH Unsolved Mysteries from WMUR
15:22 godane i can't tell if its from 1996 or 1997 so i put 199x in file name
15:33 icedice has quit IRC (Ping timeout: 268 seconds)
16:47 schbirid uh, trying to "$ gpg --verify archiveteam-warrior-v3-20171013.ova.asc archiveteam-warrior-v3-20171013.ova" i dont have the public key and cannot find where to get it. chfoo
16:57 SketchCow WHY HELLO
16:58 arkiver JAA: if we need a project, let me know
17:07 Kaz wasn't twitch like 2014/15?
17:08 Kaz oh, context is key, ignore me
17:10 schbirid has quit IRC (Quit: Leaving)
17:17 godane looks like that salem lot incomplete tbs recording is from late October 1993
17:17 godane it maybe about october 25 or 26
17:17 godane cause there was a tbs news report in the commercial breaks
17:38 archodg has joined #archiveteam-bs
17:39 schbirid has joined #archiveteam-bs
17:51 chfoo schbirid: the public key id is C718CE578A321F2D. it should be published on most public key servers.
17:57 schbirid cheers
17:57 schbirid gpg had suggested B251CF4887C1510C
17:57 schbirid maybe add the id to the readme?: )
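The exchange above amounts to two steps: import the signing key by the ID chfoo gives, then rerun the verification. A minimal sketch that only wraps the `gpg` command line is shown here. It assumes `gpg` is installed and has a working default keyserver configured; the function names are illustrative, and the file names are the ones from schbirid's message.

```python
import subprocess

KEY_ID = "C718CE578A321F2D"  # key ID given by chfoo in the log
SIGNATURE = "archiveteam-warrior-v3-20171013.ova.asc"
IMAGE = "archiveteam-warrior-v3-20171013.ova"


def recv_key_command(key_id=KEY_ID):
    """argv for importing a public key from the configured keyserver."""
    return ["gpg", "--recv-keys", key_id]


def verify_command(signature=SIGNATURE, data=IMAGE):
    """argv for checking a detached signature against a file."""
    return ["gpg", "--verify", signature, data]


def fetch_key_and_verify():
    """Run both steps; raises CalledProcessError if either one fails."""
    subprocess.run(recv_key_command(), check=True)
    subprocess.run(verify_command(), check=True)
```

Calling `fetch_key_and_verify()` in the directory holding both files performs the import followed by the check; nothing runs on import of the module itself.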
19:04 jschwart has joined #archiveteam-bs
19:23 icedice has joined #archiveteam-bs
19:38 godane SketchCow: i noticed your post about BBC Computer Literacy Project
19:39 godane i couldn't use youtube-dl to download urls directly but i made a script grab the m3u8 using youtube-dl
19:39 SketchCow Let's... hold off
19:39 SketchCow Like, maybe not reward the BBC for uploading a series by immediately mirroring it
19:40 godane its for my personal collection
19:40 godane its going to be a few months anyways before i upload it to you guys
19:41 godane i'm most likely going to give a copy to myspleen people first
19:41 SketchCow Ok, but then you're calling me out
19:41 SketchCow And I come in to see if I'm needed
19:41 SketchCow And I'm not
19:41 SketchCow http://fos.textfiles.com/RECOGNIZER/
19:41 SketchCow I'm over in this page and related ones trying to sort collections
19:41 SketchCow It's hardish make-you-crazy work
19:42 SketchCow http://fos.textfiles.com/RECOGNIZER/type.html
19:42 godane i'm just tell you cause it was not a simple youtube-dl $url
19:43 godane but was a simple youtube-dl --hls-prefer-native --fixup never https://computer-literacy-project.pilots.bbcconnectedstudio.co.uk/asset/video/$(basename $url)/index.m3u8
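The one-liner above takes the last path segment of a page URL and drops it into the asset host's playlist path. A sketch of the same construction follows, assuming the page URL really does end in the asset name as godane's command implies. The example page URL is hypothetical, and unlike the shell `basename`, this version ignores any query string.

```python
import posixpath
from urllib.parse import urlparse

ASSET_BASE = (
    "https://computer-literacy-project.pilots.bbcconnectedstudio.co.uk"
    "/asset/video"
)


def playlist_url(page_url):
    """Build the index.m3u8 URL from the last path segment of page_url."""
    path = urlparse(page_url).path.rstrip("/")
    return f"{ASSET_BASE}/{posixpath.basename(path)}/index.m3u8"


def youtube_dl_command(page_url):
    """argv matching the command in the log, with the URL filled in."""
    return ["youtube-dl", "--hls-prefer-native", "--fixup", "never",
            playlist_url(page_url)]


# Hypothetical page URL, used only to show the shape of the result.
print(playlist_url("https://example.org/programmes/abc123"))
```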
21:01 wp494 has quit IRC (Ping timeout: 260 seconds)
21:01 wp494 has joined #archiveteam-bs
21:08 schbirid has quit IRC (Quit: Leaving)
21:23 verifiedj has joined #archiveteam-bs
21:34 verifiedj has quit IRC (Quit: http://www.mibbit.com ajax IRC Client)
21:51 verifiedj has joined #archiveteam-bs
22:04 verifiedj atluxity: https://verifiedjoseph.com/archiveteam/mocpages.txt
22:05 jschwart has quit IRC (Quit: Konversation terminated!)
22:07 arkiver moc pages going down?
22:07 icedice has quit IRC (Quit: Leaving)
22:08 verifiedj No, but it's just come back online after 8 days of being offline with a database error, it has downtime like this on an all too frequent basis.
22:14 verifiedj And the guy who runs it (Sean Kenney) never tells the community what's going on, so saving it may be a good idea.
22:23 JAA arkiver: Well, the deadline's in less than 19 hours. It would be great to do a warrior project, but I doubt we can do it that quickly and we probably won't get all of it anyway.
22:25 JAA It's worth mentioning that it looks like they won't delete all content. If I read the announcement correctly, they'll purge some data about the games (e.g. medals achieved within a particular game) but keep at least the basic stats.
22:26 JAA I've started a grab for some Halo 2 game pages, namely the oldest million (1-1M) and the newest million (802138050 to 803138049). Let's see how this goes.
22:28 arkiver we can try to do a project
22:28 arkiver I see it's just IDs?
22:28 JAA Yup, e.g. http://halo.bungie.net/Stats/GameStatsHalo2.aspx?gameid=803138049
22:28 JAA (That's the last existing ID as far as I can see.)
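Since the pages differ only by a numeric ID, the URL list for the two ranges JAA describes can be generated directly. This sketch uses the URL pattern and ID bounds from the messages above; the function and constant names are illustrative, and nothing here is fetched.

```python
BASE_URL = "http://halo.bungie.net/Stats/GameStatsHalo2.aspx?gameid={}"

OLDEST = (1, 1_000_000)              # the oldest million games
NEWEST = (802_138_050, 803_138_049)  # the newest million games


def game_urls(first_id, last_id):
    """Yield the stats page URL for every game ID in [first_id, last_id]."""
    for game_id in range(first_id, last_id + 1):
        yield BASE_URL.format(game_id)


def range_size(id_range):
    """Number of IDs in an inclusive (first, last) range."""
    first, last = id_range
    return last - first + 1


print(range_size(OLDEST), range_size(NEWEST))
print(next(game_urls(*NEWEST)))
```

Both ranges cover exactly one million IDs, so the grab is two million pages in total.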
22:29 arkiver and that one image of the playing field is specific to the game?
22:29 JAA There are similar pages for the other Halo games, some with significantly more IDs (at least 1.9 billion for Halo 3 for example).
22:29 arkiver As in we should grab that page and the image of the playing field
22:29 arkiver Well we can do 1.9 billion :P, will just take a little bit of time
22:31 JAA I'm doing 23k requests per minute at the moment. Two months at that speed for 2 billion.
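JAA's two-month figure follows from simple division, sketched here with the 23k requests per minute and 2 billion pages from the message above. The helper name is illustrative.

```python
MINUTES_PER_DAY = 60 * 24


def days_to_fetch(total_pages, pages_per_minute):
    """Days needed to fetch total_pages at a constant rate."""
    return total_pages / pages_per_minute / MINUTES_PER_DAY


days = days_to_fetch(2_000_000_000, 23_000)
print(f"{days:.1f} days")
```

The result is a little over 60 days, which matches the "two months" estimate.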
22:31 arkiver Did you find any limits?
22:31 arkiver I believe we can do a few hundred thousand per minute
22:32 arkiver also nvm about that image, didn't look good
22:32 arkiver so only the html page, nothing more
22:32 JAA Haven't come across any limits so far.
22:33 arkiver 803138049 is latest?
22:33 JAA For Halo 2, yes.
22:33 DragonMon has quit IRC (Read error: Connection reset by peer)
22:33 arkiver we have to do ~700,000 URLs per minute for this one
22:34 arkiver I'm thinking 3000 URLs/item?
22:34 arkiver or 4000 or something
22:34 arkiver just enough so we will not be limited by the tracker handling not enough items per minute
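The trade-off arkiver describes is between item size and how many items the tracker must hand out per minute. A sketch of the chunking follows, assuming items are simply contiguous ID ranges; the actual item format of any tracker project is not given in the log, and all names here are illustrative.

```python
import math

LAST_ID = 803_138_049          # latest Halo 2 game ID per the log
TARGET_URLS_PER_MINUTE = 700_000


def item_ranges(first_id, last_id, urls_per_item):
    """Yield inclusive (start, end) ID ranges of at most urls_per_item IDs."""
    start = first_id
    while start <= last_id:
        end = min(start + urls_per_item - 1, last_id)
        yield (start, end)
        start = end + 1


def item_count(first_id, last_id, urls_per_item):
    """Number of items needed to cover [first_id, last_id]."""
    return math.ceil((last_id - first_id + 1) / urls_per_item)


for size in (3000, 4000):
    items = item_count(1, LAST_ID, size)
    per_minute = TARGET_URLS_PER_MINUTE / size
    print(f"{size} URLs/item: {items:,} items, {per_minute:.0f} items/minute")
```

With 3000 to 4000 URLs per item, the tracker would need to serve on the order of a couple of hundred items per minute to sustain the target rate.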
22:35 JAA They seem to be using CloudFlare's CDN. So I'm sure they could handle it; the question is if CF lets us.
22:36 arkiver awesome
22:36 arkiver this will be fun
22:36 arkiver let's get people :D
22:36 arkiver odemg ^
22:38 JAA According to http://halo.bungie.net/images/News/Inline12/sunset/halo_mulitplayer_stats_sm.jpg, there were roughly 21 billion games in total across the different Halo games.
22:38 astrid are you going to try and archive every halo game ever played?
22:38 arkiver that image doesn't say 800 million halo 2 games
22:38 JAA But that graphic also mentions a number of 5.4 billion games for Halo 2, which doesn't match the maximum ID of 803 million.
22:39 astrid in god's name, why
22:39 arkiver while 803138049 is the latest (?)
22:39 arkiver yah
22:39 arkiver yeah*
22:39 astrid if you do that i'm quitting archiveteam
22:41 astrid i'm not kidding
22:41 astrid this is absurd and it's a sign that you all have lost your way
22:42 arkiver total size shouldn't be too big
22:47 arkiver kind of an overreaction there
22:47 JAA About 1.1 GiB per 100k games, it seems.
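Extrapolating that WARC size to every Halo 2 game gives a rough total. The sketch below assumes the 1.1 GiB per 100k games rate holds across all 803,138,049 IDs, which the log does not confirm; the function name is illustrative.

```python
def total_size_tib(games, gib_per_100k):
    """Total archive size in TiB at a given GiB-per-100k-games rate."""
    return games / 100_000 * gib_per_100k / 1024


halo2_total = total_size_tib(803_138_049, 1.1)
print(f"{halo2_total:.1f} TiB")
```

That comes to under 9 TiB for Halo 2 if the rate were constant.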
22:49 astrid it's not the space
22:49 astrid it's the mental energy
22:49 astrid i joined archiveteam because geocities was going away
22:49 astrid no fucking way is geocities on the same level as a list of all the halo games
22:53 DragonMon has joined #archiveteam-bs
22:53 JAA I don't disagree with you there, and that's why I invested significantly more time and energy into archiving the Halo discussion forums. But I still think it's also worth preserving at least some information about the actual games for such an influential video game.
23:00 * arkiver agrees with JAA
23:08 arkiver astrid: ^ ?
23:22 verifiedj has quit IRC (http://www.mibbit.com ajax IRC Client)
23:46 BlueMax has joined #archiveteam-bs
23:48 JAA For the record, my grab of the Halo forums covered 3036432 threads. According to the homepage, there are 3037088 threads in total. So I should have pretty much everything; not sure where those 656 remaining ones are, or maybe the counters on the homepage are off, but that's only 0.02 %.
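The coverage figures in that message check out, as this short calculation using the two thread counts from the log shows. The variable names are illustrative.

```python
GRABBED_THREADS = 3_036_432   # threads covered by the forum grab
TOTAL_THREADS = 3_037_088     # thread count shown on the homepage

missing = TOTAL_THREADS - GRABBED_THREADS
missing_percent = missing / TOTAL_THREADS * 100

print(missing)
print(f"{missing_percent:.2f} %")
```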
23:51 JAA The game grab has slowed down to about 10k per minute, and the WARCs are now about 1.6 GiB per 100k games. I didn't look into it, but I suspect that the old ones were "purged" and thus the pages are much smaller than for newer, not-yet-purged games. The slowdown/size increase occurred around the time it completed the 1 million oldest games and started with the 1 million most recent ones.
23:54 Muad-Dib JAA: yeah, that seemed to happen to my small archivebot runs as well
