#archiveteam 2012-04-20,Fri

Time Nickname Message
00:51 ๐Ÿ”— bayleef` Stupid thing. All three irc connections broke.
02:07 ๐Ÿ”— godane i'm making a mirror of torrentfreak.com
02:08 ๐Ÿ”— godane i may have to learn how to do a warc version next mouth
02:08 ๐Ÿ”— godane don't want to do that now since i have download over 50gbs of dl.tv this mouth
02:08 ๐Ÿ”— godane *month
03:03 ๐Ÿ”— LordNlptp ... is archiveteam, and not the thailand flood, responsible for the rise in hard drive prices? :P
03:52 ๐Ÿ”— Coderjoe grr
03:53 ๐Ÿ”— bayleef` boo hiss!
03:53 ๐Ÿ”— bsmith093 is there a dark part of the IA? for stuff that isnt out of copyright yet, but will be dust before that?
03:54 ๐Ÿ”— Coderjoe how long have I been in reconnect limbo?
03:54 ๐Ÿ”— dnova bsmith093: yes, it has been talked about many times
03:54 ๐Ÿ”— Coderjoe 2012-04-19 16:44:21 [I] Connection closed
03:55 ๐Ÿ”— bsmith093 and?
03:55 ๐Ÿ”— dnova and what
03:55 ๐Ÿ”— Coderjoe so 7 hours. GRARRRR
03:55 ๐Ÿ”— bsmith093 is it officially there, or do they just ignore it and make parts dark?
03:55 ๐Ÿ”— dnova officially?
03:56 ๐Ÿ”— dnova I've never gone looking around for some kind of proof that they have a collection that is not accessible
03:56 ๐Ÿ”— dnova if I was them I'd probably keep my mouth shut
03:57 ๐Ÿ”— Coderjoe they can specify public/nonpublic and unindexed/indexed on a per-item basis
03:57 ๐Ÿ”— bsmith093 books are great, but what about cassette-only audiobooks, or animation? i can't imagine they don't have the full disney collection somewhere, but all the other stuff (old tv shows, lost pilots, whatever the BBC hasn't destroyed yet) should all be in there on some level
03:57 ๐Ÿ”— Coderjoe no need to move them into a special collection
03:57 ๐Ÿ”— Coderjoe they have a large television archive. they have systems capturing 24/7 from a number of tv channels
03:57 ๐Ÿ”— bsmith093 basically make the LoC a subset, is what I'm saying
03:58 ๐Ÿ”— Coderjoe all of which goes dark until some later date
03:58 ๐Ÿ”— Coderjoe which is where all the stuff for the september 11 collection was culled from
03:59 ๐Ÿ”— shaqfu Coderjoe: The story I heard was that the 9/11 collection was accidental
03:59 ๐Ÿ”— shaqfu bsmith093: As for media companies, those are typically handled internally
03:59 ๐Ÿ”— bsmith093 there's an australian movie, i think it was made for tv, called Go to Hell! that i only found out about through demonoid, from a guy who specifically uploads old and/or rare stuff
04:01 ๐Ÿ”— shaqfu Those tend to be hard to keep track of since there's no central entity managing them
04:01 ๐Ÿ”— shaqfu If a studio goes under, so do its archival holdings
04:14 ๐Ÿ”— godane looks like mirroring torrentfreak.com equals a lot of garbage data
04:15 ๐Ÿ”— godane like .html files that don't work at all
04:31 ๐Ÿ”— Coderjoe anyway, there are metadata fields such as "noindex", "public", and "hidden", as well as "publicdate"
04:31 ๐Ÿ”— Coderjoe public and hidden are marked as deprecated
04:38 ๐Ÿ”— Coderjoe that was an awesome split from my POV
04:44 ๐Ÿ”— ragechin ditto
04:44 ๐Ÿ”— chronomex who split from your pov?
04:46 ๐Ÿ”— Coderjoe chronomex: https://ezcrypt.it/XT4n#FxYzfwfz0E8DYcmjItfmhJnJ
04:47 ๐Ÿ”— chronomex that's quite the split
05:50 ๐Ÿ”— Coderjoe huh
05:50 ๐Ÿ”— Coderjoe dna lounge has a live video feed
05:51 ๐Ÿ”— Coderjoe bit of a sync offset, though
06:50 ๐Ÿ”— godane how do you make sure nothing is compressed when mirroring with wget?
06:50 ๐Ÿ”— godane mirroring torrentfreak causes some files to be compressed but most of it is not
07:05 ๐Ÿ”— aggro Finished the mirror a few days ago using wget-warc. Did I miss anything? Any post-checks I could do? In short... what's next? :P
07:05 ๐Ÿ”— aggro http://archiveteam.org/index.php?title=Cyberpunkreview.com
07:54 ๐Ÿ”— godane how do you guys stop webserver from compressing html?
07:55 ๐Ÿ”— godane i want to make a mirror of torrentfreak but for some files i just get the compressed data
07:55 ๐Ÿ”— aggro Do you mean: get the webserver (apache) to not compress html at all, ever? Or do you mean: advertise to the server that my client does not support compression?
07:55 ๐Ÿ”— Frigolit that's weird, the compression should only be on the protocol, not on the resulting data
07:55 ๐Ÿ”— godane i'm using wget and some files are compressed
07:56 ๐Ÿ”— Frigolit and considering it's "some" files and not all
07:56 ๐Ÿ”— Frigolit you sure they aren't supposed to be that way
07:56 ๐Ÿ”— Frigolit cause wget should decompress the data before writing it to disk
07:57 ๐Ÿ”— godane it's spotty too
07:57 ๐Ÿ”— Frigolit but you could maybe make wget not tell the webserver that it supports gzip compression
07:57 ๐Ÿ”— Frigolit how i dunno but
07:57 ๐Ÿ”— Frigolit and are you sure the compressed files aren't supposed to be compressed?
07:57 ๐Ÿ”— godane some files download as compressed gzip, then other times they don't
07:58 ๐Ÿ”— godane i'm adding --header="Content-Encoding: deflate"
07:58 ๐Ÿ”— aggro wget --header="accept-encoding: " <-- I don't know if that would work or not. But that's my first guess to tell the server you don't support compression.
07:58 ๐Ÿ”— Frigolit gzip uses deflate
07:59 ๐Ÿ”— Frigolit so it's pretty much the same
07:59 ๐Ÿ”— aggro Quick test shows that --header="accept-encoding: " does "turn off" compression.
07:59 ๐Ÿ”— Frigolit nice
08:00 ๐Ÿ”— godane it didn't work for me
08:00 ๐Ÿ”— godane trying to mirror torrentfreak.com
08:01 ๐Ÿ”— aggro What's the command you're using? I'll try to replicate.
08:01 ๐Ÿ”— Coderjoe wget doesn't do compression, and does not ask the server for a compressed transfer encoding
08:02 ๐Ÿ”— godane wget --mirror -r -p --html-extension --convert-links -U Mozilla --header="Accept-Encoding:" -e robots=off -nc http://torrentfreak.com/
08:02 ๐Ÿ”— alard Why did you add the Accept-Encoding header?
08:02 ๐Ÿ”— godane aggro said it worked
08:02 ๐Ÿ”— Coderjoe the server (or script on the server) is violating spec if it returns data with a compressed transfer encoding when not asked for
08:03 ๐Ÿ”— aggro I only tested on my blog.
08:03 ๐Ÿ”— aggro This command: wget --header="accept-encoding: gzip" http://aspensmonster.com/
08:03 ๐Ÿ”— aggro Versus this one: wget --header="accept-encoding: " http://aspensmonster.com/
08:04 ๐Ÿ”— aggro first one gives me the binary. Second one gives me plain html.
08:04 ๐Ÿ”— aggro But still replicatin' on torrentfreak
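A quick way to see what the server is actually sending, independent of wget, is to dump the response headers and inspect a saved file. A minimal sketch (any URL works, torrentfreak.com is just the one under discussion):

    curl -s -D - -o /dev/null -H "Accept-Encoding:" http://torrentfreak.com/ | grep -i '^content-encoding'
    file index.html    # reports "gzip compressed data" if the body was stored compressed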
08:06 ๐Ÿ”— godane is there a way to log a wget mirror with -o and still see it in the console?
08:06 ๐Ÿ”— aggro Not that I know of, but "tail -f the-log-file" has the same effect.
08:07 ๐Ÿ”— godane now its giving me compress data again
08:07 ๐Ÿ”— godane :-(
08:08 ๐Ÿ”— Coderjoe you can tee wget's stdout
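Both suggestions amount to the same thing; a short sketch of each, assuming a long-running mirror job:

    wget -o wget.log --mirror http://torrentfreak.com/ &    # -o logs to a file; follow it in another terminal
    tail -f wget.log
    # or send wget's output (which goes to stderr) through tee so it hits the console and a log at once
    wget --mirror http://torrentfreak.com/ 2>&1 | tee wget.log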
08:08 ๐Ÿ”— aggro yeah. I get compressed data too with your command. Now I'm just watching the headers to see what's going on. Might even get to jump into wireshark :D
08:08 ๐Ÿ”— Coderjoe godane: I think the server is ignoring the accept-encoding header and forcing a transfer-encoding on you
08:09 ๐Ÿ”— Coderjoe "everyone supports this, right?"
08:09 ๐Ÿ”— aggro Well, it's a varnish server running over nginx...
08:09 ๐Ÿ”— aggro hmm...
08:10 ๐Ÿ”— Coderjoe i think the only zlib support in wget is with the warc output
08:10 ๐Ÿ”— Coderjoe got a specific offending url?
08:11 ๐Ÿ”— godane root index.html file
08:11 ๐Ÿ”— godane it causes problems sometimes
08:11 ๐Ÿ”— Coderjoe url?
08:11 ๐Ÿ”— aggro curl --header "accept-encoding: " http://torrentfreak.com
08:11 ๐Ÿ”— aggro still spits out binary.
08:12 ๐Ÿ”— Coderjoe Content-Encoding: gzip
08:13 ๐Ÿ”— Coderjoe either varnish, nginx, php, or the php script is ignoring the lack of accepting gzip
08:13 ๐Ÿ”— aggro Well, if its grabbing from a proxy, the proxy could very well be forcing gzip no matter what. Nginx + varnish is a high probability of that.
08:14 ๐Ÿ”— Coderjoe varnish is a proxy, from the look of the headers
08:14 ๐Ÿ”— aggro Via: 1.1 varnish
08:14 ๐Ÿ”— Coderjoe Varnish is an HTTP accelerator designed for content-heavy dynamic web sites. In contrast to other HTTP accelerators, such as Squid, which began life as a client-side cache, or Apache and nginx, which are primarily origin servers, Varnish was designed from the ground up as an HTTP accelerator. Varnish is focused exclusively on HTTP, unlike other proxy servers that often support FTP, SMTP and other network protocols.
08:15 ๐Ÿ”— Coderjoe IOW: varnish is a proxy server
08:15 ๐Ÿ”— aggro Wordpress too. Ten bucks says he's running w3-total-cache or super-cache
08:15 ๐Ÿ”— aggro http://torrentfreak.com/wp-login.php
08:15 ๐Ÿ”— Coderjoe 1.1
08:16 ๐Ÿ”— Coderjoe quite old
08:16 ๐Ÿ”— Coderjoe 3.0 was released in 2011
08:17 ๐Ÿ”— aggro Hmmmm...
08:18 ๐Ÿ”— aggro Apparently varnish can't handle cookies according to http://stackoverflow.com/questions/8011102/is-there-a-way-to-bypass-varnish-cache-on-client-side. Got something new with this:
08:18 ๐Ÿ”— aggro curl --header http://torrentfreak.com/?cache=123
08:18 ๐Ÿ”— aggro the "?cache=123" is just meant to throw off the caching mechanism
08:23 ๐Ÿ”— aggro derp. no need for "--header" up there.
08:23 ๐Ÿ”— godane i'm trying to mirror torrentfreak the right way
08:24 ๐Ÿ”— godane so how can i have a local mirror of torrentfreak and disable the gzip problem?
08:24 ๐Ÿ”— chronomex wget-warc, yes?
08:25 ๐Ÿ”— godane i want to use wget-warc
08:25 ๐Ÿ”— godane but i also want a copy in torrentfreak.com folder
08:25 ๐Ÿ”— chronomex aye
08:25 ๐Ÿ”— godane so it could be hosted on my local lan
08:25 ๐Ÿ”— BlueMaxim Is it unusual for my netbook hard drive to make a small click like in this video every now and again? http://www.youtube.com/watch?v=aEDaPeKcFys
08:28 ๐Ÿ”— Coderjoe it feels like a varnish bug that wasn't caught by 1.1 because most normal browsers support gzip encoding
08:29 ๐Ÿ”— Coderjoe hmm
08:30 ๐Ÿ”— Coderjoe except this mailing list message suggests varnish had no gzip support until jan 2011
08:32 ๐Ÿ”— aggro Well, if it makes you feel any better, this command does seem to chug along:
08:32 ๐Ÿ”— aggro wget --mirror -r -p --html-extension --convert-links -U Mozilla --header="accept-encoding: " -e robots=off -nc http://torrentfreak.com/?dummyvar=123
08:33 ๐Ÿ”— aggro I get the initial "index.html?dummyvar=123.html" file for the root, then lots of folders for the SEF URLs that WP generates... and clean index.html files in there...
08:33 ๐Ÿ”— aggro Just an option I suppose.
08:33 ๐Ÿ”— aggro Also, I'd probably institute some sort of delay between requests.
08:35 ๐Ÿ”— aggro Or rather, upon scanning, it's doing the same behaviour you mentioned :P some encoded, some not. Probably has to do with reusing the same connection.
08:35 ๐Ÿ”— godane like i thought
08:36 ๐Ÿ”— aggro I think wget might have an option to force a new connection each time, but I'm also looking into post-processing. I.e., just decompressing after the fact.
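wget does have such a flag, --no-http-keep-alive, which disables persistent connections; whether it actually stops the mixed gzip behaviour on this server is untested, so treat this as a sketch:

    wget --mirror -p --html-extension --convert-links -U Mozilla \
         --header="Accept-Encoding:" --no-http-keep-alive \
         -e robots=off -nc http://torrentfreak.com/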
08:37 ๐Ÿ”— Coderjoe there's the option of adding gzip and deflate encoding support to wget
08:38 ๐Ÿ”— aggro Renaming the file to index.html.gz, then "gzip -d index.html.gz" gives you the original html, but I don't know how that would square with how wget does the link conversion.
08:38 ๐Ÿ”— aggro If anything I'd bet wget would attempt to convert links on the binary gz's first :P
08:38 ๐Ÿ”— Coderjoe it needs to be able to read the file to get links
08:38 ๐Ÿ”— aggro ^
08:39 ๐Ÿ”— Coderjoe link conversion happens all at the end
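For the post-processing route, one hedged sketch: find the .html files that file(1) reports as gzip data and decompress them in place before any link-conversion pass; gunzip restores the original filename:

    find . -name '*.html' -print0 | while IFS= read -r -d '' f; do
      if file "$f" | grep -q 'gzip compressed'; then
        mv "$f" "$f.gz" && gunzip "$f.gz"    # leaves a plain-text "$f" behind
      fi
    done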
08:41 ๐Ÿ”— Coderjoe within the last hour or so, i passed the 1500 uploaded videos mark
08:44 ๐Ÿ”— Nemo_bis and I'm nearing 1000 wikis downloaded with WikiTeam taskforce
08:45 ๐Ÿ”— Coderjoe getting close to 1/3 of this stage6 data uploaded
08:46 ๐Ÿ”— Coderjoe 23 items needing special attention so far
09:19 ๐Ÿ”— aggro of course.
09:20 ๐Ÿ”— aggro my irc foo is weak.
09:21 ๐Ÿ”— aggro Regardless, I think I've found a trick for torrent freak @godane ; I tried to figure out a way to get wget to append parameters to each request and I think I've found one. Been running for about five minutes now without any gzip encoded html files.
09:21 ๐Ÿ”— aggro wget --mirror -r -p --html-extension --convert-links -U Mozilla --header=accept-encoding: -e robots=off -nc --post-data user=blah&password=blah http://torrentfreak.com/
09:21 ๐Ÿ”— aggro The line I'm running to test for the presence of gzip'd files is:
09:21 ๐Ÿ”— aggro for i in `find . -name "*.html"`; do file $i ; done | grep gzip
09:22 ๐Ÿ”— aggro (from the root directory of the mirror, of course)
09:24 ๐Ÿ”— aggro Adding gzip support to wget would be an awesome idea though. I just don't trust myself to try tackling that beast :P
09:37 ๐Ÿ”— Coderjoe uh
09:38 ๐Ÿ”— Coderjoe that's going to cause a POST request, when the site is likely expecting a GET request
09:42 ๐Ÿ”— Coderjoe you'd be better off creating a bogus cookie. varnish, it appears, does not use the cache when there are cookies
09:42 ๐Ÿ”— Coderjoe since it has no idea how to interpret them
10:25 ๐Ÿ”— aggro How would I go about making a bogus cookie?
10:29 ๐Ÿ”— dnova put raisins in
10:29 ๐Ÿ”— aggro and what sort of dough is best?
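In HTTP terms a bogus cookie is just any Cookie header the cache will not recognise; a sketch of Coderjoe's suggestion, assuming varnish passes cookie-bearing requests straight to the backend (the cookie name and value here are invented):

    wget --mirror -p --html-extension --convert-links -U Mozilla \
         --header="Accept-Encoding:" --header="Cookie: atbypass=1" \
         -e robots=off -nc http://torrentfreak.com/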
11:04 ๐Ÿ”— godane aggro: what did you find?
11:04 ๐Ÿ”— godane nevermind
11:05 ๐Ÿ”— godane just notice your pm from earlier
12:34 ๐Ÿ”— godane i found the edge magazine
12:35 ๐Ÿ”— godane i mean from 1995 and up
12:49 ๐Ÿ”— godane holy bat crap
12:50 ๐Ÿ”— godane there is a edge magazine torrent with 400+mb pdfs
12:50 ๐Ÿ”— godane what the hell?
12:52 ๐Ÿ”— emijrp pr0n.
12:57 ๐Ÿ”— SketchCow Ops, please.
12:57 ๐Ÿ”— winr4r sup jason
12:58 ๐Ÿ”— SketchCow Well, EFnet shit an actual bed into another bed, which it then shit
12:58 ๐Ÿ”— kronoch ?
12:59 ๐Ÿ”— winr4r haha
12:59 ๐Ÿ”— godane SketchCow: Just found most of edge magazine on demonoid
13:00 ๐Ÿ”— godane just search 'edge magazine' on demonid
13:00 ๐Ÿ”— godane no need to email the links
13:00 ๐Ÿ”— emijrp winr4r: no speedydeletion yet, we have a win rar
13:00 ๐Ÿ”— oli many peers?
13:00 ๐Ÿ”— oli seeds i mean ..
13:00 ๐Ÿ”— winr4r emijrp: i know, i was amazed
13:01 ๐Ÿ”— emijrp added webcitation links to all the citations
13:01 ๐Ÿ”— winr4r i thought there was some bot on wikipedia that automatically slapped a speedy tag on new articles man
13:02 ๐Ÿ”— emijrp ha, i did a bot for that at Spanish Wikipedia, to delete junk and test pages
13:02 ๐Ÿ”— emijrp using regexps to detect, and other tricky stuff
13:03 ๐Ÿ”— winr4r oh man, i hated those things
13:04 ๐Ÿ”— emijrp find the baddass guys and discard the suckers https://en.wikipedia.org/wiki/Category:Web_archiving_initiatives
13:04 ๐Ÿ”— winr4r who wants to bet that it'll end up in AFD anyway?
13:05 ๐Ÿ”— emijrp in that case it will end in '''keep''', citations are OK
13:10 ๐Ÿ”— emijrp ha https://en.wikipedia.org/wiki/AT#Other_uses
13:11 ๐Ÿ”— winr4r emijrp: haha
13:11 ๐Ÿ”— Jaybird11 SketchCow, my Q-audio upload progresses. Once it's done I'll run a Perl script I found to find duplicate files, as there are bound to be several.
13:35 ๐Ÿ”— SketchCow Sure.
13:36 ๐Ÿ”— Jaybird11 There are also, as I suspected, even more blatant copyright violations than I thought, which is going to present a special problem in the case of songs by artists who aren't widely-known enough that a definite determination can be made.
13:38 ๐Ÿ”— Jaybird11 I don't know if anything like this is up there, but let's say somebody produced their own CD and uploaded it so their Twitter friends could enjoy it. The filenames are likely to look just like other infringing files up there.
13:40 ๐Ÿ”— Jaybird11 The strange thing is, Q has stopped complaining about scrapers. Either everyone doesn't want to get their IP blocked and has stopped, he's started blocking people silently, or he's stopped caring about it.
13:40 ๐Ÿ”— SketchCow Regarding violation, I want it all.
13:41 ๐Ÿ”— SketchCow We'll deal with the problems of that later, but slowing down to track down provenance is not the way to go.
13:41 ๐Ÿ”— SketchCow And I bet he probably has slowed down a bit while people show interest.
13:41 ๐Ÿ”— SketchCow I just want blind dudes to have their shit
13:42 ๐Ÿ”— Jaybird11 I'd conduct a little experiment and scrape some to see if I get blocked, except A. I don't have a scraper capable of grabbing the modern filenaming convention, and B. I don't want my IP blocked, silently or otherwise.
13:42 ๐Ÿ”— SmileyG [14:41:40] <@SketchCow> I just want blind dudes to have their shit
13:42 ๐Ÿ”— SmileyG That. Is. Beautiful.
13:42 ๐Ÿ”— SmileyG <3 you
13:42 ๐Ÿ”— SmileyG <3 your ideals (that I know of)
13:42 ๐Ÿ”— SmileyG <3 your style
13:42 ๐Ÿ”— SmileyG <3
13:43 ๐Ÿ”— SmileyG I have a spare IP..... on a slowish connection (10Mb/1Mb)
13:43 ๐Ÿ”— SmileyG If you want me to try :D
13:43 ๐Ÿ”— SketchCow No, I am sure Jaybird11 is handling it just fine.
13:43 ๐Ÿ”— SmileyG ok
13:44 ๐Ÿ”— * SmileyG goes back to his Cave to worship the church of Sketchcow and his "Son" sockington.
13:44 ๐Ÿ”— SketchCow Don't make this weird
13:44 ๐Ÿ”— SketchCow er
13:46 ๐Ÿ”— Jaybird11 I came up with an example of a version of SketchCow's civil war letter thing. Take http://q-audio.net/d/89
13:46 ๐Ÿ”— Jaybird11 This is my very first audio tweet in my own voice to Q-audio. And it's very stupid, nothing of significance is said, and it was poorly produced to say the least!
13:47 ๐Ÿ”— Jaybird11 But you can tell several things. First, that I wanted to test Q-audio's record/upload facility and didn't want to bother with doing a proper setup for my microphone.
13:48 ๐Ÿ”— Jaybird11 Second, near the beginning you can hear my speech synthesizer. That's a DECtalk express. The fact that someone was still using one of those as his primary speech synthesizer on October 10, 2010 might be significant years down the road.
13:48 ๐Ÿ”— Jaybird11 It goes on and on.
13:50 ๐Ÿ”— Jaybird11 Amusingly, some time later someone posted a Q-audio post where they were making weird noises with feedback and pretending to contact aliens. Lol!
13:57 ๐Ÿ”— SmileyG :/ sorry.
14:00 ๐Ÿ”— Jaybird11 Wow it seems from the files going across right now like someone uploaded the same file over and over and over again, based on the name
14:50 ๐Ÿ”— godane i'm on underground-gamer now
14:51 ๐Ÿ”— godane i may get some goods for dark-rack
14:52 ๐Ÿ”— godane i can say that a lot of torrents are like 400gb+
15:10 ๐Ÿ”— godane i think i found something
15:10 ๐Ÿ”— godane the first pc gamer CD
15:17 ๐Ÿ”— DFJustin I'm grabbing all the CD stuff on UG but you can go after mags if you want
15:18 ๐Ÿ”— DFJustin I got the pc zone one already
15:20 ๐Ÿ”— godane i didn't see pc gamer cd #1 on archive.org
15:21 ๐Ÿ”— DFJustin yeah guess that one is new
15:22 ๐Ÿ”— godane its going at about 20kbytes
15:23 ๐Ÿ”— godane its only 221mb
15:54 ๐Ÿ”— godane i found game manuals for nes
17:04 ๐Ÿ”— godane SketchCow: I found dragon magazine
17:04 ๐Ÿ”— godane a dungeons and dragons magazine
17:05 ๐Ÿ”— shaqfu godane: email him the torrent link
17:05 ๐Ÿ”— godane its on underground-gamer
17:05 ๐Ÿ”— godane i don't know if he has a account there
17:06 ๐Ÿ”— shaqfu Grab it, then; I'm sure it's not a big set
17:06 ๐Ÿ”— godane its 5.74gb
17:09 ๐Ÿ”— DFJustin there's a whole magazine category on there with hundreds of gb of stuff
17:10 ๐Ÿ”— shaqfu I'm tempted to grab the whole magazine category; it'll take me months, but what the hell
17:14 ๐Ÿ”— DFJustin it might not be too bad if you skip stuff that's already on IA or retromags
17:28 ๐Ÿ”— closure so, if you decide to celebrate 4/20 by finding the 43 minute version of Dark Star, you'll find some beautiful words about archival http://archive.org/details/gd73-12-06.sbd.kaplan-fink-hamilton.4452.sbeok.shnf
17:28 ๐Ÿ”— closure "It's so amazing to be able to take a Dark Star or a St. Stephen from 72 and hold it up next to one from 77 or 82, roll them around on your tongue and decide which one really smacks you hardest right in the pleasure lobe. Truly a blessing, this Archive. God bless you."
17:40 ๐Ÿ”— godane i also found gameroom magazine
18:12 ๐Ÿ”— godane does the stage6 videos have any satellaview videos?
18:12 ๐Ÿ”— godane i only ask cause i may have found something rare
18:44 ๐Ÿ”— LordNlptp satellaview? someone recorded one of the live transmissions with the satellite audio?
18:44 ๐Ÿ”— godane yes
18:45 ๐Ÿ”— underscor Coderjoe: ping
18:46 ๐Ÿ”— godane there is id numbers with the torrent
18:47 ๐Ÿ”— godane 1011014, 1011305, 1017211
19:12 ๐Ÿ”— SketchCow closure, we're making you an admin of the group for stage6
19:13 ๐Ÿ”— SketchCow Shortly, you'll be able to just upload right into that group, ok?
19:14 ๐Ÿ”— emijrp can i be an admin in wikiteam collection pl0x?
19:23 ๐Ÿ”— shaqfu http://www.kickstarter.com/projects/280024034/3ton-preservation-society
19:24 ๐Ÿ”— mistym shaqfu: Wow, very modest goal.
19:25 ๐Ÿ”— shaqfu Seems like a neat collection of stuff, too
19:25 ๐Ÿ”— mistym I wish Kickstarter didn't make it a complete pain to participate if you're not American.
19:25 ๐Ÿ”— shaqfu I suppose because of rewards?
19:26 ๐Ÿ”— mistym No, just because their financial backend is very US-centric. It's not possible to accept payments at all if you're not in the US (so, no out-of-the-States projects), and difficult to pay for non-Americans.
19:26 ๐Ÿ”— shaqfu Gotcha
19:26 ๐Ÿ”— DFJustin I was able to use it from canada but they may have tightened it up recently
19:26 ๐Ÿ”— SketchCow I know why they originally did that.
19:27 ๐Ÿ”— SketchCow It was because only Amazon allowed financial holds.
19:27 ๐Ÿ”— SketchCow And Amazon was only US.
19:27 ๐Ÿ”— mistym Aha.
19:27 ๐Ÿ”— SketchCow I too have been surprised.
19:28 ๐Ÿ”— shaqfu Seems like they're big enough now to loosen that
19:28 ๐Ÿ”— SketchCow Considering they're blowing up, the utter shutout of international orders and refusing to find ways to deal with that have been pretty f'in glaring.
19:28 ๐Ÿ”— Deewiant Paying through Amazon was quite painless for me, in Europe.
19:28 ๐Ÿ”— SketchCow A year ago, some russians asked, BEGGED for a russian kickstarter.
19:28 ๐Ÿ”— SketchCow They made a film! Kickstarter blogged it! Did nothing.
19:28 ๐Ÿ”— shaqfu SketchCow: Blowing up good, or blowing up bad?
19:28 ๐Ÿ”— SketchCow Blowing up good
19:29 ๐Ÿ”— SketchCow Although they're about to encounter a concentration of fraud and terror they're probably not ready for.
19:29 ๐Ÿ”— shaqfu Yeah, they're in for a scary time once a big project tanks
19:29 ๐Ÿ”— SketchCow Or is a well designed fraud!
19:29 ๐Ÿ”— shaqfu My bet's on Wasteland 2 never actually getting made
19:29 ๐Ÿ”— SketchCow http://www.wired.com/gamelife/2012/04/prince-of-persia-source-code/ in case nobody has seen it yet.
19:29 ๐Ÿ”— SketchCow It's me being a hero with other people being heroes.
19:30 ๐Ÿ”— SketchCow No not-heroes except the sadness and rot of time
19:30 ๐Ÿ”— balrog__ :p
19:30 ๐Ÿ”— balrog__ they still didn't fix the issues though, heh
19:30 ๐Ÿ”— SketchCow Fuck you, time
19:30 ๐Ÿ”— DFJustin obligatory hourglass reference
19:30 ๐Ÿ”— SketchCow balrog__: I mailed the guy, he mailed back an ages-for-you 500 seconds ago saying he was telling his editor.
19:31 ๐Ÿ”— yipdw godane: see ffnet-grab
19:31 ๐Ÿ”— yipdw we use warctools' warc2warc to deal with servers that violate HTTP specs
19:31 ๐Ÿ”— yipdw well
19:31 ๐Ÿ”— yipdw "violate"
19:31 ๐Ÿ”— yipdw it's technically not a violation
19:31 ๐Ÿ”— balrog__ yeah I know :)
19:31 ๐Ÿ”— balrog__ yipdw: what do they do then?
19:31 ๐Ÿ”— yipdw but it is a dick move
19:31 ๐Ÿ”— yipdw balrog__: github.com/ArchiveTeam/ffnet-grab :P
19:32 ๐Ÿ”— yipdw we use warc2warc -D to decompress compressed data
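For reference, a hedged sketch of that post-processing step; exact flags depend on the warctools version, and this assumes warc2warc writes the rewritten records to stdout when no output file is given:

    warc2warc -D original.warc.gz > decoded.warc    # -D decodes the Content-Encoding of stored responses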
19:33 ๐Ÿ”— shaqfu As expected, in the comments on the Ars article about PoP, there was a license war :P
19:33 ๐Ÿ”— oli is the seesaw thing still broken
19:33 ๐Ÿ”— oli for direct s3 uploads
19:34 ๐Ÿ”— mistym shaqfu: Amazingly, it hasn't broken out on the Github repo.
19:34 ๐Ÿ”— mistym I have not had to use my terrible powers once.
19:34 ๐Ÿ”— Coderjoe underscor: pong?
19:34 ๐Ÿ”— shaqfu The Ars thread even had an invocation of RMS!
19:34 ๐Ÿ”— balrog__ why do people continually fight about licensing? :(
19:34 ๐Ÿ”— shaqfu mistym: That's promising; hopefully it stays calm
19:35 ๐Ÿ”— Nemo_bis SketchCow, <emijrp> can i be an admin in wikiteam collection pl0x?
19:35 ๐Ÿ”— SmileyG herp. Hi internet dudes and dudettes...
19:35 ๐Ÿ”— Nemo_bis me too, i'm going to upload a thousand wikis (wikiteam items) in a few days – if someone writes a script for that ;)
19:36 ๐Ÿ”— Nemo_bis (via s3)
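A rough sketch of what a per-item s3 upload could look like with curl; the endpoint and x-archive-* headers follow the documented ias3 interface, but the identifier, keys, and metadata here are placeholders:

    ACCESS=IA_ACCESS_KEY; SECRET=IA_SECRET_KEY
    curl --location \
         --header "authorization: LOW $ACCESS:$SECRET" \
         --header "x-archive-auto-make-bucket:1" \
         --header "x-archive-meta-mediatype:web" \
         --header "x-archive-meta-title:Example wiki dump" \
         --upload-file examplewiki.xml.7z \
         "http://s3.us.archive.org/wikiteam-examplewiki/examplewiki.xml.7z"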
19:40 ๐Ÿ”— Coderjoe underscor, godane: i do not have any of the three IDs listed before, it appears
19:41 ๐Ÿ”— Coderjoe not even metadata.
19:41 ๐Ÿ”— Coderjoe sadly, I only have metadata for videos I was personally interested in (or other videos that shared tags with ones I was interested in)
19:41 ๐Ÿ”— Coderjoe so my contribution to the collection is slightly slanted in that direction
19:42 ๐Ÿ”— alard oli: Was it broken?
19:43 ๐Ÿ”— Coderjoe 1600 items with the prefix stage6-
19:44 ๐Ÿ”— SketchCow I want to pass along the thing with adminning a collection.
19:44 ๐Ÿ”— SketchCow So, I would love to make people admins of collections they're contributing to, removing the human factor of me moving them over.
19:44 ๐Ÿ”— SketchCow But to do that, I need to know the e-mail address you have for your uploader account.
19:45 ๐Ÿ”— SketchCow Once I know that, I can have that account added as the admin for the given collection.
19:45 ๐Ÿ”— oli alard: well all my scripts broke
19:45 ๐Ÿ”— oli and someone acknowledged some problem earlier in this chan or the memac one
19:45 ๐Ÿ”— oli the rsync process was just not connecting or something
19:46 ๐Ÿ”— alard oli: Ah, but seesaw-s3 isn't using rsync. The normal seesaw is.
19:46 ๐Ÿ”— oli ok, well whatever it was :p i was using this: DATA_DIR=data-1 ./seesaw-s3-repeat.sh oli BLAH BLAH
19:46 ๐Ÿ”— oli and that one was not working
19:46 ๐Ÿ”— oli IIRC
19:47 ๐Ÿ”— oli it kept hanging anyway
19:47 ๐Ÿ”— oli w/ that curl problem or HTTP 100 or whatever
19:48 ๐Ÿ”— alard oli: Let's do this on the #memac channel.
20:05 ๐Ÿ”— kennethre SketchCow: https://twitter.com/mongoose_q/status/193426274964877312
20:09 ๐Ÿ”— SketchCow Yeah
20:30 ๐Ÿ”— ivan` would archive.org be interested in 160GB of UT2004 maps grabbed from some long-gone mirror server?
20:30 ๐Ÿ”— SketchCow I would.
20:30 ๐Ÿ”— SketchCow archive.org is different than me.
20:31 ๐Ÿ”— SketchCow Me, I'm interested, and I use archive.org as a place to store things.
20:31 ๐Ÿ”— ivan` okay, I guess I will send them to you
20:31 ๐Ÿ”— SketchCow So yes.
20:31 ๐Ÿ”— SketchCow is it a file directory? I can give you an rsync list.
20:31 ๐Ÿ”— ivan` my upstream is too slow but I can mail you a drive that I do not need back
20:32 ๐Ÿ”— ivan` it's a dir with 60,000 files
20:36 ๐Ÿ”— godane i'm getting edge magazine
20:37 ๐Ÿ”— godane some one made some 200dpi scans with ads removed
20:37 ๐Ÿ”— godane goes from 02-1995 to end of 2007
20:39 ๐Ÿ”— shaqfu Ads removed? Unfortunate
20:39 ๐Ÿ”— mistym 12 years worth of scans is pretty ++, though.
20:40 ๐Ÿ”— godane do you guys have cd-i magazine?
20:48 ๐Ÿ”— SketchCow OKAY NEXT
20:48 ๐Ÿ”— SketchCow ----------------------------------------------------------
20:48 ๐Ÿ”— SketchCow SO I HAVE BEEN WORKING WITH THE DISCFERRET PROJECT
20:48 ๐Ÿ”— SketchCow GOING VERY WELL, NEEDS MORE HANDS
20:48 ๐Ÿ”— SketchCow THIS IS YOUR TIME TO STRETCH THOSE DEV MUSCLES
20:48 ๐Ÿ”— SketchCow COME TO #DISCFERRET ON THE FREENODE IRC NETWORK
20:49 ๐Ÿ”— emijrp C-C-C-C-C-OMBO BREAKER: 10 days for knol
20:49 ๐Ÿ”— SketchCow DON'T BE A DUMBASS, THAT IS ALL
20:49 ๐Ÿ”— SketchCow ----------------------------------------------------------
20:49 ๐Ÿ”— mistym CAPS LOCK
20:49 ๐Ÿ”— SketchCow emijrp: dm me your e-mail on archive.org and what collections you need admin on
20:55 ๐Ÿ”— godane SketchCow: do you have Ahoy Disk Magazine for the Commodore 64/128
20:56 ๐Ÿ”— godane this torrent may include all the floppies that come with magazine
21:01 ๐Ÿ”— ivan` at least people who really want something might Google and find me https://ludios.org/archive/uz2.gameservers.net-ut2004-listing.html
21:01 ๐Ÿ”— * ivan` makes a third copy
21:02 ๐Ÿ”— ivan` looks like it died 6 months after I wget'ed, good timing
21:04 ๐Ÿ”— emijrp why dont we use the IA S3 to upload all the YouTube Creative Commons videos?
21:05 ๐Ÿ”— ivan` IA has a free S3 now?
21:05 ๐Ÿ”— emijrp one to one, no shitty 200GB packs
21:05 ๐Ÿ”— SketchCow Before we do that, I need to discuss it with Brewster
21:05 ๐Ÿ”— SketchCow I'm trying to deal with the at-risk stuff as a priority.
21:06 ๐Ÿ”— emijrp k
21:06 ๐Ÿ”— ivan` I feel so useless in a low-bandwidth residence
21:06 ๐Ÿ”— ivan` it's like a gravity well
21:07 ๐Ÿ”— alard ivan`: You can think of smart things that other people can run?
21:07 ๐Ÿ”— ivan` yeah, I should write some software
21:15 ๐Ÿ”— ivan` http://uz.ut-files.com/ http://uz.ut-files.com/ http://uz2.ut-files.com/ http://uz3.ut-files.com/ are high-risk given past failures and that these are treated as "caches"
21:15 ๐Ÿ”— ivan` via http://news.ut-files.com/
21:17 ๐Ÿ”— ivan` must be at least 200GB
21:18 ๐Ÿ”— SketchCow e-mail me for a mailing address for the drive
21:18 ๐Ÿ”— ivan` okay, will do when I get it onto a drive
21:30 ๐Ÿ”— kennethre SketchCow: are you going to be in SF in may?
21:32 ๐Ÿ”— shaqfu "Must have experience in Tier 4 archiving" - corporate archive ads are always lol
21:34 ๐Ÿ”— kennethre shaqfu: "must have at least 10 years experience with warc"
21:34 ๐Ÿ”— kennethre … and node.js
21:35 ๐Ÿ”— shaqfu Haha
21:35 ๐Ÿ”— shaqfu (for those following at home, a "tier 4 archive" is suit code for "shit you page")
21:36 ๐Ÿ”— mistym Recently saw a library job ad looking for someone with 8+ years experience with Rails.
21:37 ๐Ÿ”— shaqfu Oh! At NYPL, right?
21:37 ๐Ÿ”— mistym Yeah!
21:37 ๐Ÿ”— shaqfu My brother called me and asked about that
21:37 ๐Ÿ”— kennethre mistym: must have experience since THE DAY IT WAS INVENTED
21:37 ๐Ÿ”— mistym I guess they only really want dhh to apply.
21:37 ๐Ÿ”— shaqfu Him and his dev friends were having laughs at their expense; I told him that *all* their postings are like that
21:38 ๐Ÿ”— shaqfu Protip: if SketchCow got a MLIS and wouldn't be qualified for your digital archivist posting, rethink your posting
21:38 ๐Ÿ”— mistym shaqfu: No kidding.
21:38 ๐Ÿ”— mistym Of course, it's also hilarious how unrealistic these requirements are while at the same time paying a pittance.
21:41 ๐Ÿ”— mistym Maybe designed so they can adjust the pay scale even further downward when they inevitably hire someone who doesn't meet every requirement.
21:41 ๐Ÿ”— shaqfu mistym: Happens when the jobs say "BA required, MLIS preferred"
21:41 ๐Ÿ”— mistym shaqfu: Ugh. Yes.
21:41 ๐Ÿ”— shaqfu "Oh, it's a BA-level posting, so we'll pay BA-level rates"
21:42 ๐Ÿ”— shaqfu mistym: Don't get me started on "archivist" positions that are actually secretaries
21:43 ๐Ÿ”— shaqfu "High school diploma and 1+ years corporate office experience" is *not* what we do!
21:43 ๐Ÿ”— mistym Ugh.
21:43 ๐Ÿ”— mistym Starting to get fed up with archives jobs tbh. It's sad how many archives are quick to talk about how much we need new archivists with strong tech skills while actively driving them out of the field.
21:44 ๐Ÿ”— mistym (Not to say that good archives jobs don't exist, etc. But the places that "get it" are not in the majority either.)
21:44 ๐Ÿ”— SketchCow kennethre: Why
21:45 ๐Ÿ”— kennethre SketchCow: i'll be in town for a week if you want to meetup
21:45 ๐Ÿ”— SketchCow Where are you normally?
21:45 ๐Ÿ”— kennethre east coast, near DC-ish
21:45 ๐Ÿ”— SketchCow Oh.
21:46 ๐Ÿ”— SketchCow Whenever everybody in Readability goes to jail, they're still going to come for you
21:46 ๐Ÿ”— SketchCow I suggest a false name, on the run
21:46 ๐Ÿ”— SketchCow Anonymous git checkins at work
21:46 ๐Ÿ”— kennethre hahahaha
21:46 ๐Ÿ”— kennethre I left for a reason :)
21:47 ๐Ÿ”— SketchCow I'm saying
21:47 ๐Ÿ”— SketchCow They're still going to find you
21:47 ๐Ÿ”— kennethre i never wrote my $1 check to get the company shares
21:47 ๐Ÿ”— shaqfu mistym: Yeah, it's miserable. I more or less gave up the chance to ever settle down
21:47 ๐Ÿ”— kennethre so i think i'll be last on the list
21:48 ๐Ÿ”— shaqfu Anyway! Enough of me ranting at job boards; I have a question for the good folk of AT
21:48 ๐Ÿ”— mistym shaqfu: What's up?
21:49 ๐Ÿ”— shaqfu NDIIP's holding a conference on digital archiving/preservation, and they're open to mini-talks by anyone
21:49 ๐Ÿ”— shaqfu I wrote this a few weeks back: http://archiveteam.org/index.php?title=Backup_Tips
21:49 ๐Ÿ”— SketchCow NDIIP: We never saw a working committee that could beat a subject into the ground with no action we didn't like
21:49 ๐Ÿ”— shaqfu And curious if you folk think the bit on appraisal is worth discussing for ~5 minutes
21:49 ๐Ÿ”— shaqfu SketchCow: You just described every LOC-affiliated group ever
21:50 ๐Ÿ”— mistym It's so true.
21:50 ๐Ÿ”— SketchCow Yes, which is why I am keynoting the JCDL conference
21:50 ๐Ÿ”— SketchCow And not sitting in a committee
21:50 ๐Ÿ”— SketchCow Do you know the title of my talk?
21:50 ๐Ÿ”— shaqfu JCDL?
21:50 ๐Ÿ”— shaqfu And no
21:50 ๐Ÿ”— SketchCow http://www.jcdl2012.info/
21:50 ๐Ÿ”— SketchCow The ACM/IEEE Joint Conference on Digital Libraries is a major international forum focusing on digital libraries and associated technical, practical, organizational, and social issues.
21:51 ๐Ÿ”— SketchCow I am the first keynote, the first speech
21:51 ๐Ÿ”— kennethre awesome
21:51 ๐Ÿ”— shaqfu What's the title?
21:51 ๐Ÿ”— shaqfu (and what's that orange thingy in your pic on the site?)
21:52 ๐Ÿ”— SketchCow ALL YOU CARED ABOUT IS GONE AND ALL YOUR FRIENDS ARE DEAD:
21:52 ๐Ÿ”— SketchCow The Fun Frolic of Preservation Activism
21:52 ๐Ÿ”— shaqfu Hah
21:52 ๐Ÿ”— mistym That's wonderful.
21:53 ๐Ÿ”— SketchCow So you'll forgive me if the opportunity to get involved in a mini-talk at some digipres conference isn't making me whip out the windmill
21:53 ๐Ÿ”— shaqfu SketchCow: I gotcha
21:53 ๐Ÿ”— SketchCow I am all for archiveteam members destroying conferences with crazy thoughts, though
21:53 ๐Ÿ”— SketchCow So go for it.
21:53 ๐Ÿ”— shaqfu I was just asking if it merited sending something in; I know it's ultimately meaningless :P
21:53 ๐Ÿ”— SketchCow Embarassed regard of me as a founder of archive team always welcome
21:53 ๐Ÿ”— SketchCow "Yeah... yeah, he did"
21:54 ๐Ÿ”— shaqfu Since nobody talks about appraisal outside of "you should back up important shit"
21:54 ๐Ÿ”— shaqfu Which is tantamount to telling someone to eat their peas
21:55 ๐Ÿ”— mistym shaqfu: I would say "go for it".
21:55 ๐Ÿ”— shaqfu In true LOC fashion, they want a 300 word proposal, which would take more than five minutes to read...
21:55 ๐Ÿ”— SketchCow EAT YOUR FUCKING PEAS
21:56 ๐Ÿ”— shaqfu Admittedly, if you came bearing down on me like that, I would eat my fucking peas religiously every fucking day
21:56 ๐Ÿ”— kennethre SketchCow: so was that a yes or no to may
21:56 ๐Ÿ”— mistym They do not know the MEANING of "less yack more hack".
21:56 ๐Ÿ”— shaqfu mistym: MAC this year was about web archive appraisal, which makes me wonder "there's time to appraise?!"
21:57 ๐Ÿ”— Nemo_bis shaqfu, copy to hard drive only every 2-3 months??
21:57 ๐Ÿ”— mistym shaqfu: I liked helrond's take on that re: description: http://hillelarnold.com/blog/?p=680
21:57 ๐Ÿ”— mistym shaqfu: Erg. Yes. Most people do not know how to work on the timeframe web archiving TAKES.
21:58 ๐Ÿ”— mistym I hope to beat it into people's heads with my ACA talk.
21:59 ๐Ÿ”— shaqfu Nemo_bis: Adjust to every month, then
21:59 ๐Ÿ”— Nemo_bis shaqfu, I think one should have a backup process easy enough to do it weekly, or something like that
22:00 ๐Ÿ”— SketchCow I just don't know.
22:00 ๐Ÿ”— SketchCow To if I'll be there.
22:00 ๐Ÿ”— mistym Good god bzr is slow.
22:00 ๐Ÿ”— shaqfu Nemo_bis: The issue there is if people actually make enough stuff to merit a backup weekly
22:00 ๐Ÿ”— kennethre cool
22:00 ๐Ÿ”— SketchCow If so great, otherwise no.
22:00 ๐Ÿ”— SketchCow I go to DC a few times this year.
22:00 ๐Ÿ”— SketchCow Then I can touch you lovingly
22:00 ๐Ÿ”— * kennethre backs away slowly
22:00 ๐Ÿ”— Nemo_bis shaqfu, unless one doesn't produce new data as valuable as the time spent archiving it in the defined period.
22:01 ๐Ÿ”— shaqfu Nemo_bis: Fair point
22:02 ๐Ÿ”— Coderjoe bleh
22:03 ๐Ÿ”— Coderjoe scans of Turing's two newly-released papers need to be released
22:03 ๐Ÿ”— shaqfu Coderjoe: Any reason why they're not?
22:03 ๐Ÿ”— * ivan` grabs all of dharmaseed.org and it is unexpectedly huge, hundreds of gigs
22:04 ๐Ÿ”— DFJustin I do everything weekly
22:04 ๐Ÿ”— Coderjoe shaqfu: the GCHQ just released them into a museum exhibit, afaict
22:04 ๐Ÿ”— Coderjoe http://www.bbc.com/news/technology-17771962
22:04 ๐Ÿ”— DFJustin the fact that it's not too much data is actually helpful because then it doesn't take long
22:04 ๐Ÿ”— shaqfu Coderjoe: Hunh; I figured there might be some weird copyright issue with it, but I suppose not
22:05 ๐Ÿ”— Coderjoe they say they released them "to the public domain"
22:06 ๐Ÿ”— Nemo_bis shaqfu, also, you could expand the part about being organized; it's very easy to backup if you keep all your personal data in a single directory, very hard if your C: drive (or home directory) is a mix of programs, personal directories, directories of software data and assorted garbage
22:06 ๐Ÿ”— shaqfu Nemo_bis: Another good idea, although there's a problem with everything being across multiple devices
22:07 ๐Ÿ”— Nemo_bis shaqfu, I don't understand
22:07 ๐Ÿ”— shaqfu Nemo_bis: Docs on C:\, photos on the camera, bad photos and texts on the phone, God knows what in the cloud...
22:08 ๐Ÿ”— DFJustin programs like synctoy let you define a big ol hairy mess of directories
22:08 ๐Ÿ”— Nemo_bis shaqfu, that's why you already recommend to keep copies of everything locally, don't you
22:08 ๐Ÿ”— shaqfu Nemo_bis: I did? I forgot I wrote that bit
22:08 ๐Ÿ”— Nemo_bis hmm
22:09 ๐Ÿ”— Nemo_bis maybe I read too much into it
22:09 ๐Ÿ”— Nemo_bis but for instance I'd hate having my emails only on a webmail
22:11 ๐Ÿ”— shaqfu Hm, I did
22:11 ๐Ÿ”— DFJustin well I have personal data across 3 HDDs on the same box so there's that
22:14 ๐Ÿ”— shaqfu Nemo_bis: Updated
22:15 ๐Ÿ”— SmileyG synctoy is the win for windows :/
22:19 ๐Ÿ”— shaqfu What's it do? Pretends that stuff on an external device lives in somewhere more sensible (cam pics in /My Pictures/)?
22:20 ๐Ÿ”— DFJustin it's just a simple utility to define a bunch of folder pairs and then when you hit the button, copy across everything so that the same contents are in both
22:20 ๐Ÿ”— shaqfu Gotcha. Seems useful
22:21 ๐Ÿ”— DFJustin with options as to one-way, two-way, exclude lists etc
22:22 ๐Ÿ”— DFJustin it also has smarts to figure out that I:\MP3s on your phone is really the same as F:\MP3s from last week because you reshuffled your usb devices
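For the non-Windows equivalent of a one-way folder pair, rsync does the same job; a minimal sketch (paths are examples):

    rsync -av --delete /home/user/Documents/ /mnt/backup/Documents/    # one-way: make the backup mirror the source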
