#archiveteam-bs 2018-06-26,Tue


Time Nickname Message
00:00 πŸ”— flashfire has joined #archiveteam-bs
00:00 πŸ”— flashfire https://globenewswire.com/news-release/2018/06/04/1516160/0/en/Corning-Closes-Acquisition-of-Substantially-All-of-3M-s-Communication-Markets-Division.html
00:02 πŸ”— wp494 has quit IRC (Ping timeout: 255 seconds)
00:02 πŸ”— wp494 has joined #archiveteam-bs
00:13 πŸ”— godane SketchCow: how many boxes are you senting?
00:13 πŸ”— godane *sending
00:23 πŸ”— dashcloud has joined #archiveteam-bs
00:23 πŸ”— tstarling has joined #archiveteam-bs
00:23 πŸ”— flashfire So yeah my archiving tends to be hit and miss. Sometimes I am at school and have to load it into Archive.is but this work around I have come up with allows me to use !ao
00:24 πŸ”— flashfire I also forget how to use the Ignore functions and sometimes forget i set jobs
00:25 πŸ”— flashfire Plus I tend to archive stuff nobody cares about sometimes simply because I liked the look of the site and didnt see a copy in the wayback machine
00:25 πŸ”— flashfire the forgetting ignore patterns was what astrid may have sighed about or I may have interpreted it as that
00:26 πŸ”— flashfire Also I am using mibbit with irc.underworld.no as a workaround because my school is using smoothwall which blocks way too much to be usable
00:26 πŸ”— flashfire The archive itself is blocked at my school
00:26 πŸ”— astrid my school blocked the archive too :( it's bogus
00:27 πŸ”— flashfire but strangely enough they didnt block archive.is
00:28 πŸ”— astrid that is a little bit strange, yes
00:28 πŸ”— flashfire also when i access http://dashboard.at.ninjawedding.org/3?showNicks=1 I get a frozen snapshot of what it was up to when the site initially loaded. realtime updates are blocked but not the site itself
00:28 πŸ”— astrid especially because archive.org is a state of california licensed library :)
00:28 πŸ”— flashfire exactly
00:28 πŸ”— astrid anyway
00:28 πŸ”— flashfire I mean I am in australia but its still an internationally recognised library
00:29 πŸ”— flashfire https://finance.yahoo.com/news/match-group-can-get-away-acquiring-25-dating-sites-counting-151306438.html this seems suspicious
00:31 πŸ”— flashfire 5 minutes before I have to go any passing comments?
00:31 πŸ”— * astrid shrugs
00:32 πŸ”— flashfire Apart from WATCH THE IGNORE SETS FLASH DAMN IT ITS NOT THAT HARD
00:33 πŸ”— astrid nah, i get it, it can be a pain to babysit these things
00:34 πŸ”— flashfire Yeah I find it oddly therapeutic to watch them tick over sometimes
00:34 πŸ”— astrid it's more a "it'd be good if you knew when a job was going to be big and came by every day or so to check that it's not got stuck somewhere irrelevant"
00:35 πŸ”— flashfire I usually only grab small sites. Its when I put an explain next to it that I had a reason more than oooh that site has pretty colours
00:35 πŸ”— astrid aye
00:35 πŸ”— flashfire or if you google the URL you will find its being shut down. Or its a thrown together spam site designed to collect ad revenue
00:36 πŸ”— flashfire bye
00:36 πŸ”— flashfire has quit IRC (Quit: http://www.mibbit.com ajax IRC Client)
00:36 πŸ”— tstarling re archiving POST responses: there's nothing in the WARC spec that helps us out, as far as I can see
00:36 πŸ”— astrid the request headers will go into the archive just like any other request header/body content
00:36 πŸ”— astrid but, as i said, the wayback machine has no way to match on that and ensure that you get the right response
00:37 πŸ”— astrid i'm not sure how much more clear i can be about this
00:37 πŸ”— tstarling yeah ok
00:37 πŸ”— tstarling "For a target-URI of the ‘http’ or ‘https’ schemes, a ‘request’ record block should contain the full HTTP request sent over the network, including headers."
00:38 πŸ”— tstarling so will archivebot actually archive POST requests right now?
00:39 πŸ”— astrid no
00:39 πŸ”— astrid you will have to make your own crawler
00:40 πŸ”— tstarling ok...
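To make the spec sentence quoted above concrete, here is a minimal sketch of what a WARC 'request' record for a POST could look like when written by a custom crawler. It uses only the Python standard library; the function name `warc_request_record` and the example.org URL are hypothetical, and this is not how ArchiveBot or any existing tool builds its records.

```python
import uuid
from datetime import datetime, timezone


def warc_request_record(target_uri: str, http_request: bytes) -> bytes:
    """Wrap a raw HTTP request (headers + body) in a WARC/1.0 'request' record."""
    fields = [
        ("WARC-Type", "request"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "application/http; msgtype=request"),
        # Content-Length counts the bytes of the record block, i.e. the HTTP request.
        ("Content-Length", str(len(http_request))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in fields) + "\r\n"
    # A record ends with two CRLFs after the block.
    return head.encode("utf-8") + http_request + b"\r\n\r\n"


if __name__ == "__main__":
    # Hypothetical POST; the block holds the full request including the body.
    raw = (
        b"POST /search HTTP/1.1\r\n"
        b"Host: example.org\r\n"
        b"Content-Type: application/x-www-form-urlencoded\r\n"
        b"Content-Length: 6\r\n"
        b"\r\n"
        b"q=halo"
    )
    print(warc_request_record("http://example.org/search", raw).decode("utf-8"))
```

The POST body ends up in the record, which matches astrid's point: the data is preserved, but a replay tool that looks records up by URL alone cannot tell two POSTs to the same URL apart.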
00:42 πŸ”— tstarling and when you said "create an index into it", is an index a concept that exists already?
00:42 πŸ”— astrid i'm thinking more of a html file with links or something
00:42 πŸ”— astrid worry about that once you have the content
00:45 πŸ”— tstarling so if I generated this WARC file, would I be able to upload it somewhere for safe keeping?
00:45 πŸ”— astrid yep!
00:45 πŸ”— astrid https://archive.org/upload
00:46 πŸ”— tstarling ok, and then IA would not be able to incorporate it into the wayback machine for now, but maybe in the future they could do that
00:46 πŸ”— astrid yep!
00:47 πŸ”— ta9le has quit IRC (Quit: Connection closed for inactivity)
00:47 πŸ”— tstarling sounds like a plan
00:50 πŸ”— tstarling there's a revision history, which you're not able to view when logged out
00:51 πŸ”— tstarling but it would be cool to have it
00:52 πŸ”— tstarling account creation is not restricted, you just need any user account, but I guess the usual convention is not to archive content that's protected in that way?
00:53 πŸ”— astrid eh, if you can freely register an account then you should do that
00:53 πŸ”— astrid it starts getting less clear if you have to get manually approved
00:54 πŸ”— tstarling you can freely register
00:54 πŸ”— astrid go for it
03:07 πŸ”— archodg__ has joined #archiveteam-bs
03:09 πŸ”— odemg has quit IRC (Read error: Operation timed out)
03:12 πŸ”— archodg_ has quit IRC (Read error: Operation timed out)
03:22 πŸ”— odemg has joined #archiveteam-bs
03:37 πŸ”— flashfire has joined #archiveteam-bs
03:59 πŸ”— Dimtree has quit IRC (Read error: Operation timed out)
04:01 πŸ”— flashfire astrid why are you still online werent you going to have a nap?
04:08 πŸ”— Meroje has quit IRC (Ping timeout: 260 seconds)
04:23 πŸ”— Dimtree has joined #archiveteam-bs
04:25 πŸ”— Meroje has joined #archiveteam-bs
04:56 πŸ”— flashfire has quit IRC (Quit: http://www.mibbit.com ajax IRC Client)
05:21 πŸ”— wp494 has quit IRC (Read error: Operation timed out)
05:25 πŸ”— wp494 has joined #archiveteam-bs
06:24 πŸ”— Sk2d has joined #archiveteam-bs
06:26 πŸ”— Sk1d has quit IRC (Read error: Operation timed out)
06:26 πŸ”— Sk2d is now known as Sk1d
06:39 πŸ”— schbirid has joined #archiveteam-bs
07:11 πŸ”— godane has quit IRC (Read error: Operation timed out)
07:21 πŸ”— JAA The Bungie Halo forums grab finished around midnight UTC. 271 GiB in total now. It would be around 100 GiB less if I hadn't grabbed those threads thousands of times. Oh well...
07:21 πŸ”— JAA I'll look into grabbing the user profile pages later.
07:24 πŸ”— PurpleSym JAA: chromebot dedups WARCs before uploading them. You might want to look into that. https://github.com/PromyLOPh/crocoite/blob/master/crocoite/tools.py#L29
07:25 πŸ”— JAA PurpleSym: Yeah, I'll look into that as well. Thanks.
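For readers unfamiliar with the idea, deduplication of repeated payloads can be sketched roughly as follows. This is a generic illustration keyed on a SHA-1 payload digest, not the crocoite code linked above; `dedup_payloads` and the sample URLs are hypothetical.

```python
import hashlib


def dedup_payloads(records):
    """Keep the first copy of each payload; later copies point back to it.

    `records` is an iterable of (url, payload_bytes). Each output entry is
    (url, payload_or_None, original_url_or_None).
    """
    first_seen = {}  # digest -> URL of the first record carrying that payload
    out = []
    for url, payload in records:
        digest = hashlib.sha1(payload).hexdigest()
        if digest in first_seen:
            # Duplicate: drop the bytes, remember where the original lives.
            out.append((url, None, first_seen[digest]))
        else:
            first_seen[digest] = url
            out.append((url, payload, None))
    return out


if __name__ == "__main__":
    sample = [
        ("http://example.org/thread/1", b"<html>same page</html>"),
        ("http://example.org/thread/1?again", b"<html>same page</html>"),
        ("http://example.org/thread/2", b"<html>other page</html>"),
    ]
    for entry in dedup_payloads(sample):
        print(entry)
```

In a real WARC the duplicate would typically be stored as a small record that refers to the original instead of repeating the payload, which is where the space saving comes from.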
07:46 πŸ”— schbirid has quit IRC (Quit: Leaving)
07:59 πŸ”— ta9le has joined #archiveteam-bs
08:35 πŸ”— BlueMax has quit IRC (Leaving)
08:43 πŸ”— junknickf has joined #archiveteam-bs
08:48 πŸ”— junknickf has quit IRC (Quit: Page closed)
08:55 πŸ”— horkermon has joined #archiveteam-bs
09:06 πŸ”— m007a83_ has joined #archiveteam-bs
09:08 πŸ”— m007a83__ has joined #archiveteam-bs
09:09 πŸ”— m007a83 has quit IRC (Read error: Operation timed out)
09:13 πŸ”— m007a83_ has quit IRC (Read error: Operation timed out)
10:28 πŸ”— godane has joined #archiveteam-bs
10:52 πŸ”— godane so i found something interesting: http://www2.boxoffice.com/the_vault/page_thumbnails?issue_id=2000-11-1
10:53 πŸ”— godane bad news is their scans look like shit
10:54 πŸ”— godane the text for the most part is readable but the images are just ugly: http://www2.boxoffice.com/the_vault/issue_page?issue_id=2000-11-1&page_no=5#page_start
11:05 πŸ”— eientei95 Looks like DJVU compression
11:36 πŸ”— Valentine has quit IRC (Quit: Addio, adieu, adios, aloha, arrivederci, auf Wiedersehen, au revoir, bye, bye-bye, cheerio, cheers, farewell, good)
11:43 πŸ”— godane how do you fix DJVU compression in those images?
11:45 πŸ”— eientei95 You can't, it's lossy
11:59 πŸ”— godane so i found a new website while looking through boxoffice magazine
11:59 πŸ”— godane called yumpu.com
11:59 πŸ”— godane i think its based in Germany but not 100% sure
12:00 πŸ”— godane anyways copies of boxoffice magazines were there too but still with bad compression
12:11 πŸ”— Mateon1 has quit IRC (Read error: Operation timed out)
12:16 πŸ”— Valentine has joined #archiveteam-bs
12:48 πŸ”— ta9le has quit IRC (Quit: Connection closed for inactivity)
12:52 πŸ”— m007a83_ has joined #archiveteam-bs
12:56 πŸ”— m007a83__ has quit IRC (Ping timeout: 252 seconds)
13:20 πŸ”— m007a83_ is now known as m007a83
13:21 πŸ”— m007a83 has quit IRC (Quit: Leaving)
13:21 πŸ”— m007a83 has joined #archiveteam-bs
13:27 πŸ”— godane evening 2005 pdfs looks like shit: http://www2.boxoffice.com/the_vault/issue_pages?issue_id=2005-1-1
13:27 πŸ”— godane *even
13:27 πŸ”— godane i'm off to bed
15:16 πŸ”— SilSte has quit IRC (Read error: Operation timed out)
15:17 πŸ”— Sk2d has joined #archiveteam-bs
15:17 πŸ”— Sk1d has quit IRC (Read error: Operation timed out)
15:17 πŸ”— Sk2d is now known as Sk1d
15:23 πŸ”— SilSte has joined #archiveteam-bs
15:38 πŸ”— ta9le has joined #archiveteam-bs
16:11 πŸ”— schbirid has joined #archiveteam-bs
16:12 πŸ”— rbraun has quit IRC (Read error: Operation timed out)
16:14 πŸ”— rbraun has joined #archiveteam-bs
16:23 πŸ”— JAA Muad-Dib: You were referring to Halo 2 stats yesterday, which were supposedly not grabbed as part of the project a few years ago. Where can I find those stats at all? The current halo.bungie.net website doesn't seem to have a section for them, and I haven't found any links on user profile pages either.
16:25 πŸ”— JAA By the way, I found 296k members during the thread retrieval. I'm setting up a grab for the profile pages and stats plus an extraction of groups currently, to be started later today.
16:25 πŸ”— Muad-Dib JAA: that's correct, the buttons are gone but they're still there, hold on
16:27 πŸ”— Muad-Dib You have to get to them through the search function
16:27 πŸ”— JAA Oh yeah, when you access a user's Halo 3 stats page, you get a link to the Halo 2 stats.
16:27 πŸ”— Muad-Dib you look for a gamertag that can't be found for halo 3 or reach, then select Halo 2 through a dropdown menu
16:28 πŸ”— Muad-Dib JAA: that too
16:28 πŸ”— Muad-Dib I put some of that information on the wiki back in 14
16:29 πŸ”— JAA Yeah, I should read the wiki more often.
16:30 πŸ”— Muad-Dib the game id's seem to be sequential, following the halo.bungie.net/Stats/GameStatsHalo2.aspx?gameid=<integer> pattern
16:30 πŸ”— Darkstar has quit IRC (Ping timeout: 1212 seconds)
16:30 πŸ”— JAA Muad-Dib: Hmm, which page would that be? https://archiveteam.org/index.php?title=Halo has very little details.
16:31 πŸ”— Muad-Dib problem: the lowest number I've found was around 6060, the highest somewhere in the 803 million ...
16:31 πŸ”— Muad-Dib JAA: that's all, it wasn't much
16:32 πŸ”— JAA Ah ok. Yeah, 800 million requests aren't going to happen.
16:33 πŸ”— Muad-Dib yup, that's the problem
16:34 πŸ”— Muad-Dib maybe grab the first couple thousand or something
16:34 πŸ”— Muad-Dib they start 2 days before the release date http://halo.bungie.net/Stats/GameStatsHalo2.aspx?gameid=6066
16:34 πŸ”— Muad-Dib first and last couple thousand
16:36 πŸ”— Muad-Dib because both those groups are interesting in their own way, birth/death of halo 2 multiplayer etc.
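The "first and last couple thousand" idea could be turned into a URL list along these lines. This is a sketch only: the URL pattern and the starting ID 6066 come from the messages above, while `edge_urls`, the upper bound of 803000000 and the count of 2000 are placeholder assumptions (the log only says the highest ID is "somewhere in the 803 million").

```python
# Pattern quoted in the log, with the integer game ID filled in.
BASE = "http://halo.bungie.net/Stats/GameStatsHalo2.aspx?gameid={}"


def edge_urls(lowest, highest, count):
    """Return URLs for the first and last `count` game IDs in [lowest, highest]."""
    if highest - lowest + 1 <= 2 * count:
        # The two edges overlap, so just take the whole range once.
        ids = list(range(lowest, highest + 1))
    else:
        ids = list(range(lowest, lowest + count))
        ids += list(range(highest - count + 1, highest + 1))
    return [BASE.format(i) for i in ids]


if __name__ == "__main__":
    # 803000000 is an assumed stand-in for the real highest game ID.
    urls = edge_urls(6066, 803000000, 2000)
    print(len(urls))
    print(urls[0])
    print(urls[-1])
```

Writing the result to a text file, one URL per line, gives a list that can be handed to whatever crawler ends up doing the grab.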
16:37 πŸ”— Muad-Dib the 800 million figure does seem plausible, considering bungie mentioned 500 million a few years before shutdown
16:39 πŸ”— JAA Damn, the numbers for Halo 3 are much bigger. I've seen over 1.9 billion already.
16:41 πŸ”— Muad-Dib :/
16:44 πŸ”— Muad-Dib sparkle of hope: it seems all information contained in the tables of a game details screen is present without any async js bullshit needing to happen to retrieve it, the only thing the js on the page seems to mostly do is toggle visibility attribute of the tables
16:44 πŸ”— Muad-Dib toggle the*
16:45 πŸ”— Muad-Dib so at least stuff can be recovered from non-js-enabled grabs
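If the tables really are in the static HTML, pulling them out of a non-JS grab needs nothing more than an HTML parser. Here is a minimal sketch with Python's standard `html.parser`; the class name `TableText` and the sample markup are made up for illustration and do not reflect the actual structure of the Bungie stats pages.

```python
from html.parser import HTMLParser


class TableText(HTMLParser):
    """Collect the text of every table row, ignoring visibility styling."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


if __name__ == "__main__":
    # Hypothetical page fragment: the table is hidden by style but present in the HTML.
    html = (
        '<table style="visibility:hidden">'
        "<tr><th>Player</th><th>Kills</th></tr>"
        "<tr><td>ExampleTag</td><td>12</td></tr>"
        "</table>"
    )
    parser = TableText()
    parser.feed(html)
    print(parser.rows)
```

Hidden tables come out the same as visible ones because the parser never evaluates styles or scripts.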
16:45 πŸ”— Muad-Dib maybe I'll throw the first and last few thousand games into archivebot
16:50 πŸ”— Muad-Dib except for the "rich" game statistics/game viewer, that would require figuring out the game viewer http://halo.bungie.net/Stats/Halo2WebMaps/richgame.aspx?g=800000000 -- http://halo.bungie.net/Stats/Halo2WebMaps/halo2webmap.ashx?g=800000000&mn=0&mx=571&v=85&zs=1
16:51 πŸ”— Meroje has quit IRC (Quit: bye!)
16:52 πŸ”— Meroje has joined #archiveteam-bs
16:54 πŸ”— Darkstar has joined #archiveteam-bs
16:58 πŸ”— Muad-Dib jesus christ, so I thought it would be manageable to grab the games from release to the end of 2004, since it was released in november... the last gameid from 2004 is 43103253
16:58 πŸ”— Muad-Dib over 43 million games within the first two months
16:58 πŸ”— Muad-Dib from 6066 to 43103253
16:59 πŸ”— JAA When are they shutting down exactly again?
16:59 πŸ”— Muad-Dib 28th
16:59 πŸ”— Muad-Dib looking up the PDT time now
17:00 πŸ”— Muad-Dib can't we get an insider to help us again?
17:02 πŸ”— JAA That would be nice, yeah.
17:02 πŸ”— Muad-Dib JAA: "These changes go into effect on June 28 at 10 a.m. PDT. If you need to save anything, get it done before then." https://www.bungie.net/en/Explore/Detail/News/46965
17:03 πŸ”— Muad-Dib they name someone in the news posting
17:03 πŸ”— JAA So that's 2018-06-28 17:00 UTC. Thanks.
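The time conversion and the scale of the problem can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the deadline and the ID range 6066 to 43103253 are taken from the messages above, the "now" timestamp is simply the time of this message, and `required_rate` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# PDT is UTC-7; a fixed offset is enough for a single known date.
PDT = timezone(timedelta(hours=-7), "PDT")

shutdown_local = datetime(2018, 6, 28, 10, 0, tzinfo=PDT)
shutdown_utc = shutdown_local.astimezone(timezone.utc)

# Halo 2 game IDs from release through the end of 2004, as quoted in the log.
games_2004 = 43103253 - 6066 + 1


def required_rate(n_requests, now, deadline):
    """Average requests per second needed to finish before the deadline."""
    seconds_left = (deadline - now).total_seconds()
    if seconds_left <= 0:
        raise ValueError("deadline has already passed")
    return n_requests / seconds_left


if __name__ == "__main__":
    now = datetime(2018, 6, 26, 17, 3, tzinfo=timezone.utc)
    print(shutdown_utc.isoformat())
    print(games_2004)
    print(round(required_rate(games_2004, now, shutdown_utc)))
```

Even the 2004-only slice works out to roughly 250 requests per second sustained until the deadline, assuming one request per game, which is why only the edges of the ID range look feasible.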
17:03 πŸ”— Muad-Dib np
17:03 πŸ”— Muad-Dib "Resident Archivist Roger Wolfson will explain what's changing and how it may affect you. "
17:04 πŸ”— JAA It looks like our 2014/15 project was only about the Halo 3 files, i.e. the stuff listed on http://halo.bungie.net/online/default.aspx. Is that correct?
17:05 πŸ”— JAA (That's based on looking at the halo-grab and halo-items repositories on GitHub.)
17:05 πŸ”— Muad-Dib arkiver did the code, I believe
17:06 πŸ”— JAA Yes, looks like it.
17:07 πŸ”— Muad-Dib I pitched the idea about the project because I cared about the material there and also knew Jason enjoyed playing it ;)
17:07 πŸ”— Muad-Dib https://web.archive.org/web/20120606154827/https://www.bungie.net/News/content.aspx?cid=18094
17:09 πŸ”— Muad-Dib that Wolfson guy got hired as "Server Software Development Lead", I'll just assume he also has weight on the administration side of things nowadays
17:10 πŸ”— Muad-Dib If we want an insider, he'd probably be able to help us out
17:11 πŸ”— Muad-Dib question is, would he?
17:11 πŸ”— Muad-Dib (looking people up like this makes me feel like such a creep)
17:12 πŸ”— Muad-Dib anyway, dinner before monologues, brb
17:12 πŸ”— JAA Might be worth contacting him. I doubt it'd lead to an accelerated shutdown, so it couldn't hurt, right?
17:18 πŸ”— JAA What about Bungie Pro Video http://halo.bungie.net/Projects/BungiePro/default.aspx ? They'll also purge those videos hosted there. Are they publicly available, and do we know the URL format?
17:27 πŸ”— Darkstar has quit IRC (Ping timeout: 633 seconds)
17:39 πŸ”— Darkstar has joined #archiveteam-bs
17:43 πŸ”— jschwart has joined #archiveteam-bs
17:47 πŸ”— SilSte has quit IRC (Read error: Connection reset by peer)
17:47 πŸ”— SilSte has joined #archiveteam-bs
17:50 πŸ”— Muad-Dib http://halo.bungie.net/News/content.aspx?type=topnews&cid=32028 "Bungie will preserve all existing historical Halo data on Bungie.net for as long as the Internet and Bungie's data storage systems remain functional."
17:52 πŸ”— Muad-Dib https://www.youtube.com/watch?v=Xr9Oubxw1gA
17:57 πŸ”— JAA That didn't age well.
18:06 πŸ”— Muad-Dib we could mention it to them ;)
18:10 πŸ”— Darkstar has quit IRC (Ping timeout: 246 seconds)
18:22 πŸ”— Darkstar has joined #archiveteam-bs
18:44 πŸ”— Gfy_ is now known as Gfy
18:51 πŸ”— Muad-Dib I probably shouldn't write that e-mail though, you might've noticed it would become a rambling shitfest
18:54 πŸ”— ta9le has quit IRC (Quit: Connection closed for inactivity)
19:09 πŸ”— schbirid has quit IRC (Quit: Leaving)
19:26 πŸ”— antomati_ has joined #archiveteam-bs
19:28 πŸ”— antomatic has quit IRC (Read error: Operation timed out)
19:29 πŸ”— schbirid has joined #archiveteam-bs
19:37 πŸ”— Mateon1 has joined #archiveteam-bs
21:00 πŸ”— Jens has quit IRC (Remote host closed the connection)
21:01 πŸ”— Jens has joined #archiveteam-bs
21:21 πŸ”— Muad-Dib JAA: do you know someone who's tactful at writing those mails?
21:23 πŸ”— Muad-Dib I'm going to bed and have a pretty busy schedule the coming days, so I won't be able to help out as much as I want to
21:24 πŸ”— HCross I've got 600GB or so of stuff from the forums so far
21:32 πŸ”— jschwart has quit IRC (Quit: Konversation terminated!)
21:39 πŸ”— Darkstar has quit IRC (Remote host closed the connection)
21:44 πŸ”— Darkstar has joined #archiveteam-bs
21:48 πŸ”— horker has joined #archiveteam-bs
21:51 πŸ”— horkermon has quit IRC (Read error: Operation timed out)
23:08 πŸ”— BlueMax has joined #archiveteam-bs
23:41 πŸ”— flashfire has joined #archiveteam-bs
23:42 πŸ”— flashfire If you need storage for the next few months I have a google education account at the moment
23:43 πŸ”— horker has quit IRC (Quit: Leaving)
23:45 πŸ”— flashfire has quit IRC (Client Quit)
23:53 πŸ”— Lord_Nigh has quit IRC (Read error: Operation timed out)
23:59 πŸ”— Lord_Nigh has joined #archiveteam-bs
