#archiveteam-bs 2014-06-08,Sun

Time	Nickname	Message
00:09 ^🔗	Famicoman	all of it?
00:28 ^🔗	balrog	joepie91: http://www.anonnews.org/press/item/820/comments/ didn't have it?
00:29 ^🔗	balrog	it used to be at http://www.archiveteam.org/archives/edramatica/ED_archive.zip
00:32 ^🔗	joepie91	balrog: well yes, "used to e"
00:32 ^🔗	joepie91	be *
00:32 ^🔗	joepie91	I've run across a number of broken links on archiveteam.org
00:32 ^🔗	joepie91	which is simultaneously funny and kinda bad
00:56 ^🔗	nico	so we should run !a http://www.archiveteam.org/ more often on #archivebot
01:06 ^🔗	ivan`	joepie91: archivebot has it
01:06 ^🔗	ivan`	maybe not the old version you want
01:06 ^🔗	ivan`	https://encrypted.google.com/search?q=archivebot+encyclopediadramatica+site%3Aarchive.org&btnG=Search
01:09 ^🔗	DFJustin	http://web.archive.org/http://www.archiveteam.org/archives/edramatica/ED_archive.zip
03:16 ^🔗	godane	so i'm mirroring msnbc news pages from wayback machine
03:16 ^🔗	godane	crazy code to make it happen from cdx: cat cdxmsnbc.comnews1 | grep 'asp?cp1=1 ' | grep 'text/html 200' | sed 's| http|/http|g' | sed 's| text/html.||g' | sed 's|. ||g' | sed 's|:80||g' | sed 's|http://msnbc.com|http://www.msnbc.com|g' | sort | uniq > urls.txt
03:17 ^🔗	yipdw	yeah
03:17 ^🔗	yipdw	there comes a point where shell is no longer the best option :P
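For readers following yipdw's point, here is a rough Python sketch of what godane's pipeline appears to be doing: keep only HTML captures with status 200 whose URL ends in `asp?cp1=1`, join timestamp and URL into a Wayback-style path, drop `:80`, normalize the host to `www.msnbc.com`, and de-duplicate. The field order (urlkey, timestamp, original URL, MIME type, status) and the function name `cdx_to_wayback_paths` are assumptions, not something stated in the log.

```python
def cdx_to_wayback_paths(lines):
    """Turn space-separated CDX lines into sorted, unique 'timestamp/url' paths.

    Assumed field order (not confirmed by the log):
    urlkey timestamp original mimetype status ...
    """
    paths = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        _urlkey, timestamp, original, mimetype, status = fields[:5]
        # Same filters as the two greps in the shell pipeline.
        if not original.endswith("asp?cp1=1"):
            continue
        if mimetype != "text/html" or status != "200":
            continue
        # Same normalization as the ':80' and 'www' sed steps.
        original = original.replace(":80", "", 1)
        original = original.replace("http://msnbc.com", "http://www.msnbc.com", 1)
        paths.add(f"{timestamp}/{original}")
    return sorted(paths)
```

Called with an open file, for example `cdx_to_wayback_paths(open("cdxmsnbc.comnews1"))`, it returns the list that the shell version writes to `urls.txt`. Parsing fields by position avoids depending on the exact spacing that the `sed` expressions rely on.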
03:39 ^🔗	yipdw	https://www.fanfiction.net/s/9571902/1/The-Truth
03:39 ^🔗	yipdw	whoa
03:40 ^🔗	yipdw	Edward Snowden/Hetalia Axis Powers crossover
04:15 ^🔗	joepie91	yipdw: there are no limits to what can be found on hte interwebs
04:15 ^🔗	joepie91	the *
04:15 ^🔗	joepie91	ivan`: not the same stuff
04:15 ^🔗	joepie91	I mean, that webecology backup -was- integrated into the new site
04:16 ^🔗	joepie91	but it's not the same data :p
04:27 ^🔗	godane	so i got a 22 min video from dateline in 1998 about beef
04:27 ^🔗	vantec	well, it was what's for dinner
06:37 ^🔗	joepie91	how come this is not being updated anymore? https://archive.org/details/freemusicarchive
06:43 ^🔗	joepie91	SketchCow: underscor: if I were to write a client for IA, what should I set as the default maximum concurrent download and upload limit?
07:12 ^🔗	SketchCow	Why would you write a client?
07:12 ^🔗	SketchCow	We already have one.
07:12 ^🔗	SketchCow	You could look at it and see if improvements or features are needed.
07:18 ^🔗	exmic	but making it work with fortran is so much work
07:30 ^🔗	SketchCow	https://pypi.python.org/pypi/internetarchive
07:30 ^🔗	SketchCow	We've done a million uploads with it
08:00 ^🔗	joepie91	SketchCow: I mean a graphical client, where uploading to IA is one of the features
08:00 ^🔗	joepie91	not just a library
08:00 ^🔗	joepie91	it's something I've been working on for a while to automate some processes here
08:01 ^🔗	joepie91	hence wondering how many concurrent transfers are acceptable
08:01 ^🔗	joepie91	(also, SketchCow, I've actually been providing some feedback / bug reports on that library already :)
08:05 ^🔗	SketchCow	That's the one.
08:05 ^🔗	SketchCow	I would say, ask Jake then.
08:05 ^🔗	SketchCow	jake@archive.org
08:07 ^🔗	joepie91	alright, thanks
08:10 ^🔗	SketchCow	Also, the answer to "why hasn't _____ been updated on archive.org" is ALWAYS "because there are 8 people responsible for maintaining collections"
08:10 ^🔗	SketchCow	So unless an outside person is maintaining/co-maintaining the collection, fix-ups come in waves
08:11 ^🔗	SketchCow	Across years, sometimes
08:25 ^🔗	godane	i'm close to 1000 videos for 2000 clips from nbcnews
08:25 ^🔗	godane	*for year 2000
08:46 ^🔗	joepie91	damnit gmail
08:46 ^🔗	joepie91	where did my "you don't have a subject" warning go
08:46 ^🔗	joepie91	SketchCow: I see
08:46 ^🔗	SketchCow	So I've been working on script-based ways to shore up our stuff.
08:47 ^🔗	SketchCow	Because when the new UI kicks in it will DEFINITELY show gaps and slowdowns in additions.
08:48 ^🔗	joepie91	what kind of stuff should I be thinking about?
08:48 ^🔗	SketchCow	In what context
08:48 ^🔗	joepie91	thinking of*, sorry
08:48 ^🔗	joepie91	like, what kind of stuff is to be shored up
08:48 ^🔗	joepie91	(my brain is on low-power mode today)
08:50 ^🔗	SketchCow	Help me understand what's going on, again. You hinted but I was busy.
08:50 ^🔗	SketchCow	Quit your job, intend to do "stuff" for a year.
08:50 ^🔗	SketchCow	With IA being one of the beneficiaries of this time.
08:50 ^🔗	SketchCow	Is that right?
08:50 ^🔗	joepie91	oh, that was a different context actually
08:50 ^🔗	joepie91	this was more a generic question of "what do you mean with 'stuff' in <@SketchCow> So I've been working on script-based ways to shore up our stuff."
08:51 ^🔗	joepie91	but yes, the above is also correct
08:51 ^🔗	joepie91	(though I'll have to see how the fundraiser idea works out before I commit to anything)
08:51 ^🔗	SketchCow	What I am talking about scripting isn't an archiveteam thing. It's a me and the archive thing.
08:51 ^🔗	joepie91	well yes, but I'm curious what kind of stuff it entails :P
08:51 ^🔗	SketchCow	Many items don't have cover images. Many don't have keywords, etc.
08:51 ^🔗	joepie91	aha
08:51 ^🔗	joepie91	right
08:52 ^🔗	SketchCow	Many have no metadata of any kind. Intend to work on that.
08:52 ^🔗	joepie91	SketchCow: I'd been pondering about this a bit, but idk if this might simply already be on the roadmap: would wikifying metadata not be an option?
08:52 ^🔗	SketchCow	That is an ugly situation.
08:52 ^🔗	SketchCow	We worked together on that one solution, but I've had zero time to work with your code.
08:53 ^🔗	SketchCow	Yanking metadata into a wiki wholesale, and then we edit and I oversee it flying back in, could be good.
08:53 ^🔗	SketchCow	That's the best compromise we can have it.
08:53 ^🔗	joepie91	well, the idea I was thinking of was more inline wikified editing - so that a user with an account on IA could just edit metadata from an item page itself (excluding 'protected' items)
08:53 ^🔗	joepie91	but not sure how technically feasible
08:53 ^🔗	SketchCow	There will never, never, ever be, at least within the span of years, a case where you click on something at IA and people do editing in a wiki fashion.
08:54 ^🔗	joepie91	what's the reasoning behind that?
08:54 ^🔗	SketchCow	It's baked into the organization at the moment.
08:54 ^🔗	SketchCow	I mean, you want to go ahead and tell me why it's great, go ahead, make yourself feel better. But I can see it won't happy anytime soon.
08:54 ^🔗	joepie91	right, but I'm quite curious whether that's just a time/attention constraint issue, or an inherent conceptual problem with wikifying
08:54 ^🔗	SketchCow	Happy?
08:54 ^🔗	joepie91	er
08:54 ^🔗	SketchCow	Conceptual problem.
08:54 ^🔗	joepie91	conceptual problem that people have with *
08:54 ^🔗	joepie91	right
08:54 ^🔗	SketchCow	Combined with time/attention.
08:56 ^🔗	joepie91	SketchCow: completely unrelated quesiton, do you guys at IA have a spamfilter that triggers on empty subject lines? because I accidentally sent my email to jake without a subject, and apparently my gmail setting to warn me about that has magically vanished
08:56 ^🔗	SketchCow	My end-run is the closest we'll have.
08:56 ^🔗	joepie91	question *
08:57 ^🔗	SketchCow	I have not the slightest idea.
08:57 ^🔗	SketchCow	I do know we have a spam issue.
08:57 ^🔗	SketchCow	I don't use the IA mail system.
08:57 ^🔗	joepie91	alright, we'll see if I get a response then
08:57 ^🔗	joepie91	right :P
08:57 ^🔗	joepie91	I suppose that if you have a spam issue, it's not a terribly trigger-happy filter (if any at all), so my mail will probably go through fin
08:57 ^🔗	joepie91	fine *
08:57 ^🔗	SketchCow	I am all for us using the parallel wiki idea.
08:58 ^🔗	joepie91	SketchCow: can you elaborate on how you'd see that working, in a technical sense?
08:58 ^🔗	exmic	metadata goes in
08:59 ^🔗	exmic	metadata comes out
08:59 ^🔗	exmic	can't explain that
08:59 ^🔗	joepie91	lol
08:59 ^🔗	SketchCow	We did a prototype a while ago.
08:59 ^🔗	SketchCow	Sort of - you wrote a post bot but I've been busy.
08:59 ^🔗	joepie91	well obviously, but the idea I got was that SketchCow meant using a standard wiki system (a la mediawiki), at which point the question is "how do you turn the wiki page back into useful metadata without making the page a pain to edit"
08:59 ^🔗	joepie91	re: exmic
09:00 ^🔗	SketchCow	* collection chosen
09:00 ^🔗	SketchCow	* metadata of all items is pulled into wiki under a set, with each item a page
09:00 ^🔗	SketchCow	* editttttt
09:00 ^🔗	SketchCow	* push all of it back
09:00 ^🔗	SketchCow	----
09:00 ^🔗	SketchCow	On a page:
09:00 ^🔗	SketchCow	metadata pair becomes == METADATA NAME ==
09:00 ^🔗	SketchCow	Followed by metadata.
09:01 ^🔗	SketchCow	Obviously there is some trickery from the ingestor to pull things in.
09:01 ^🔗	SketchCow	Obviously there is potential for things to go wrong, or for issues with newbs making a mess
09:01 ^🔗	SketchCow	Obviously it's not the fast fast fast fast shut the fuck up it's fast keep going world of, say, Wikipedia.
09:01 ^🔗	SketchCow	Which... I hate.
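The page format SketchCow outlines above (each metadata pair becomes a `== METADATA NAME ==` heading followed by the metadata) can be sketched as a small round trip in Python. This is only an illustration under stated assumptions, not the prototype mentioned in the log: the function names `metadata_to_wikitext` and `wikitext_to_metadata` are hypothetical, and the choice to put one value per line under each heading is an assumption.

```python
import re

# A wiki heading of the form "== name ==".
HEADING = re.compile(r"^==\s*(.+?)\s*==$")


def metadata_to_wikitext(metadata):
    """Render a metadata dict as wiki text, one heading per field.

    Values may be a string or a list of strings; lists are written
    one value per line (an assumption, not from the log).
    """
    parts = []
    for name, value in metadata.items():
        values = [value] if isinstance(value, str) else list(value)
        parts.append(f"== {name} ==")
        parts.extend(values)
        parts.append("")  # blank line between fields for readability
    return "\n".join(parts)


def wikitext_to_metadata(text):
    """Parse wiki text back into a dict mapping field name to a list of values."""
    metadata = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADING.match(line)
        if match:
            current = match.group(1)
            metadata.setdefault(current, [])
        elif line and current is not None:
            metadata[current].append(line)
    return metadata
```

Parsing every field back as a list keeps the round trip predictable, at the cost of not distinguishing a multi-line description from a multi-valued field. A real ingestor would also need the "trickery" and safeguards against messy edits that SketchCow mentions.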
11:43 ^🔗	nico	05:40 yipdw> Edward Snowden/Hetalia Axis Powers crossover
11:44 ^🔗	nico	i really should try to restart the ffnet archiving project
11:50 ^🔗	nico	https://github.com/FlatRockSoft/
14:10 ^🔗	SadDM	SketchCow: is the code for your keyword generator posted anywhere?
14:11 ^🔗	SadDM	I know you're using https://github.com/ox-it/spindle-code/ and https://pypi.python.org/pypi/internetarchive, but what about the glue and baling twine that holds them together?
14:23 ^🔗	godane	some good news on the martin yan's chinatowns torrents
14:24 ^🔗	godane	i got upload 2 and upload 4 last night
14:25 ^🔗	godane	so now i got about 30 episodes of it
17:03 ^🔗	ersi	Hmm~ got a USB stick that shows up in dmesg as a SCSI removable disk (like usual) that gets a device (/dev/sdb).. but I can't mount it and if I `dd` from it, it says "dd opening /dev/sdb no medium found" :/
17:04 ^🔗	ersi	Any ideas on how to retrieve data from it?
17:39 ^🔗	nico	ersi: borked usb stick?
17:39 ^🔗	nico	do cfdisk /dev/sdb return something real?
18:31 ^🔗	SketchCow	SadDM: My keyword generator is VERY weaksauce
18:32 ^🔗	SketchCow	If you want it, I can provide it
18:32 ^🔗	SketchCow	Obviously you need write control on the item for it to work.
18:43 ^🔗	SketchCow	SadDM: http://fos.textfiles.com/keyworder.zip
18:44 ^🔗	SketchCow	You need internetarchive (the python program) installed
19:36 ^🔗	SadDM	SketchCow: anything I'd cobble together would also be weaksauce... you've just saved me the trouble
19:39 ^🔗	SadDM	gah! BOOM goes the zip file
20:16 ^🔗	DFJustin	http://www.reading.ac.uk/news-and-events/releases/PR583836.aspx
20:41 ^🔗	DFJustin	https://www.youtube.com/watch?v=d0mg9DxvfZE
23:41 ^🔗	balrog	kanzure_: good question, I dunno. I'd think that people who do photographic printed circuit board production might know.
23:41 ^🔗	balrog	this is for diybio?
23:59 ^🔗	kanzure_	balrog: yes, sort of
