Time |
Nickname |
Message |
00:09
🔗
|
Famicoman |
all of it? |
00:28
🔗
|
balrog |
joepie91: http://www.anonnews.org/press/item/820/comments/ didn't have it? |
00:29
🔗
|
balrog |
it used to be at http://www.archiveteam.org/archives/edramatica/ED_archive.zip |
00:32
🔗
|
joepie91 |
balrog: well yes, "used to e" |
00:32
🔗
|
joepie91 |
be * |
00:32
🔗
|
joepie91 |
I've run across a number of broken links on archiveteam.org |
00:32
🔗
|
joepie91 |
which is simultaneously funny and kinda bad |
00:56
🔗
|
nico |
so we should run !a http://www.archiveteam.org/ more often on #archivebot |
01:06
🔗
|
ivan` |
joepie91: archivebot has it |
01:06
🔗
|
ivan` |
maybe not the old version you want |
01:06
🔗
|
ivan` |
https://encrypted.google.com/search?q=archivebot+encyclopediadramatica+site%3Aarchive.org&btnG=Search |
01:09
🔗
|
DFJustin |
http://web.archive.org/http://www.archiveteam.org/archives/edramatica/ED_archive.zip |
03:16
🔗
|
godane |
so i'm mirroring msnbc news pages from wayback machine |
03:16
🔗
|
godane |
crazy code to make it happen from cdx: cat cdx*msnbc.com*news*1* | grep 'asp?cp1=1 ' | grep 'text/html 200' | sed 's| http|/http|g' | sed 's| text/html.*||g' | sed 's|.* ||g' | sed 's|:80||g' | sed 's|http://msnbc.com|http://www.msnbc.com|g' | sort | uniq > urls.txt |
03:17
🔗
|
yipdw |
yeah |
03:17
🔗
|
yipdw |
there comes a point where shell is no longer the best option :P |
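Editor's sketch, not something posted in the channel: a rough Python equivalent of godane's pipeline above. It assumes each CDX line is space-separated with the fields in the order urlkey, timestamp, original URL, MIME type, status code; the function name `wayback_paths` is hypothetical. The shell version matches `asp?cp1=1 ` anywhere on the line, while this sketch checks only the original-URL field, so treat it as an approximation rather than a drop-in replacement.

```python
def wayback_paths(cdx_lines):
    """Return sorted, de-duplicated 'timestamp/url' strings from CDX lines.

    Assumed field order (not confirmed by the log):
    urlkey, timestamp, original URL, MIME type, status code, ...
    """
    paths = set()
    for line in cdx_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # blank or malformed line
        _urlkey, timestamp, original, mimetype, status = fields[:5]
        # Keep only successful HTML captures of the "cp1=1" article pages.
        if not original.endswith("asp?cp1=1"):
            continue
        if mimetype != "text/html" or status != "200":
            continue
        # Drop the explicit port and normalise the bare host to www.
        url = original.replace(":80", "", 1)
        url = url.replace("http://msnbc.com", "http://www.msnbc.com", 1)
        paths.add(f"{timestamp}/{url}")
    return sorted(paths)


if __name__ == "__main__":
    import fileinput

    # Usage: python cdx_urls.py cdx-file [cdx-file ...] > urls.txt
    for path in wayback_paths(fileinput.input()):
        print(path)
```

Each output line is a `timestamp/url` fragment, the same shape the sed chain produces.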
03:39
🔗
|
yipdw |
https://www.fanfiction.net/s/9571902/1/The-Truth |
03:39
🔗
|
yipdw |
whoa |
03:40
🔗
|
yipdw |
Edward Snowden/Hetalia Axis Powers crossover |
04:15
🔗
|
joepie91 |
yipdw: there are no limits to what can be found on hte interwebs |
04:15
🔗
|
joepie91 |
the * |
04:15
🔗
|
joepie91 |
ivan`: not the same stuff |
04:15
🔗
|
joepie91 |
I mean, that webecology backup -was- integrated into the new site |
04:16
🔗
|
joepie91 |
but it's not the same data :p |
04:27
🔗
|
godane |
so i got a 22 min video from dateline in 1998 about beef |
04:27
🔗
|
vantec |
well, it was what's for dinner |
06:37
🔗
|
joepie91 |
how come this is not being updated anymore? https://archive.org/details/freemusicarchive |
06:43
🔗
|
joepie91 |
SketchCow: underscor: if I were to write a client for IA, what should I set as the default maximum concurrent download and upload limit? |
07:12
🔗
|
SketchCow |
Why would you write a client? |
07:12
🔗
|
SketchCow |
We already have one. |
07:12
🔗
|
SketchCow |
You could look at it and see if improvements or features are needed. |
07:18
🔗
|
exmic |
but making it work with fortran is so much work |
07:30
🔗
|
SketchCow |
https://pypi.python.org/pypi/internetarchive |
07:30
🔗
|
SketchCow |
We've done a million uploads with it |
08:00
🔗
|
joepie91 |
SketchCow: I mean a graphical client, where uploading to IA is one of the features |
08:00
🔗
|
joepie91 |
not just a library |
08:00
🔗
|
joepie91 |
it's something I've been working on for a while to automate some processes here |
08:01
🔗
|
joepie91 |
hence wondering how many concurrent transfers are acceptable |
08:01
🔗
|
joepie91 |
(also, SketchCow, I've actually been providing some feedback / bug reports on that library already :) |
08:05
🔗
|
SketchCow |
That's the one. |
08:05
🔗
|
SketchCow |
I would say, ask Jake then. |
08:05
🔗
|
SketchCow |
jake@archive.org |
08:07
🔗
|
joepie91 |
alright, thanks |
08:10
🔗
|
SketchCow |
Also, the answer to "why hasn't _____ been updated on archive.org" is ALWAYS "because there are 8 people responsible for maintaining collections" |
08:10
🔗
|
SketchCow |
So unless an outside person is maintaining/co-maintaining the collection, fix-ups come in waves |
08:11
🔗
|
SketchCow |
Across years, sometimes |
08:25
🔗
|
godane |
i'm close to 1000 videos for 2000 clips from nbcnews |
08:25
🔗
|
godane |
*for year 2000 |
08:46
🔗
|
joepie91 |
damnit gmail |
08:46
🔗
|
joepie91 |
where did my "you don't have a subject" warning go |
08:46
🔗
|
joepie91 |
SketchCow: I see |
08:46
🔗
|
SketchCow |
So I've been working on script-based ways to shore up our stuff. |
08:47
🔗
|
SketchCow |
Because when the new UI kicks in it will DEFINITELY show gaps and slowdowns in additions. |
08:48
🔗
|
joepie91 |
what kind of stuff should I be thinking about? |
08:48
🔗
|
SketchCow |
In what context |
08:48
🔗
|
joepie91 |
thinking of*, sorry |
08:48
🔗
|
joepie91 |
like, what kind of stuff is to be shored up |
08:48
🔗
|
joepie91 |
(my brain is on low-power mode today) |
08:50
🔗
|
SketchCow |
Help me understand what's going on, again. You hinted but I was busy. |
08:50
🔗
|
SketchCow |
Quit your job, intend to do "stuff" for a year. |
08:50
🔗
|
SketchCow |
With IA being one of the beneficiaries of this time. |
08:50
🔗
|
SketchCow |
Is that right? |
08:50
🔗
|
joepie91 |
oh, that was a different context actually |
08:50
🔗
|
joepie91 |
this was more a generic question of "what do you mean with 'stuff' in <@SketchCow> So I've been working on script-based ways to shore up our stuff." |
08:51
🔗
|
joepie91 |
but yes, the above is also correct |
08:51
🔗
|
joepie91 |
(though I'll have to see how the fundraiser idea works out before I commit to anything) |
08:51
🔗
|
SketchCow |
What I am talking about scripting isn't an archiveteam thing. It's a me and the archive thing. |
08:51
🔗
|
joepie91 |
well yes, but I'm curious what kind of stuff it entails :P |
08:51
🔗
|
SketchCow |
Many items don't have cover images. Many don't have keywords, etc. |
08:51
🔗
|
joepie91 |
aha |
08:51
🔗
|
joepie91 |
right |
08:52
🔗
|
SketchCow |
Many have no metadata of any kind. Intend to work on that. |
08:52
🔗
|
joepie91 |
SketchCow: I'd been pondering about this a bit, but idk if this might simply already be on the roadmap: would wikifying metadata not be an option? |
08:52
🔗
|
SketchCow |
That is an ugly situation. |
08:52
🔗
|
SketchCow |
We worked together on that one solution, but I've had zero time to work with your code. |
08:53
🔗
|
SketchCow |
Yanking metadata into a wiki wholesale, and then we edit and I oversee it flying back in, could be good. |
08:53
🔗
|
SketchCow |
That's the best compromise we can have. |
08:53
🔗
|
joepie91 |
well, the idea I was thinking of was more inline wikified editing - so that a user with an account on IA could just edit metadata from an item page itself (excluding 'protected' items) |
08:53
🔗
|
joepie91 |
but not sure how technically feasible |
08:53
🔗
|
SketchCow |
There will never, never, ever be, at least within the span of years, a case where you click on something at IA and people do editing in a wiki fashion. |
08:54
🔗
|
joepie91 |
what's the reasoning behind that? |
08:54
🔗
|
SketchCow |
It's baked into the organization at the moment. |
08:54
🔗
|
SketchCow |
I mean, you want to go ahead and tell me why it's great, go ahead, make yourself feel better. But I can see it won't happy anytime soon. |
08:54
🔗
|
joepie91 |
right, but I'm quite curious whether that's just a time/attention constraint issue, or an inherent conceptual problem with wikifying |
08:54
🔗
|
SketchCow |
Happy? |
08:54
🔗
|
joepie91 |
er |
08:54
🔗
|
SketchCow |
Conceptual problem. |
08:54
🔗
|
joepie91 |
conceptual problem that people have with * |
08:54
🔗
|
joepie91 |
right |
08:54
🔗
|
SketchCow |
Combined with time/attention. |
08:56
🔗
|
joepie91 |
SketchCow: completely unrelated quesiton, do you guys at IA have a spamfilter that triggers on empty subject lines? because I accidentally sent my email to jake without a subject, and apparently my gmail setting to warn me about that has magically vanished |
08:56
🔗
|
SketchCow |
My end-run is the closest we'll have. |
08:56
🔗
|
joepie91 |
question * |
08:57
🔗
|
SketchCow |
I have not the slightest idea. |
08:57
🔗
|
SketchCow |
I do know we have a spam issue. |
08:57
🔗
|
SketchCow |
I don't use the IA mail system. |
08:57
🔗
|
joepie91 |
alright, we'll see if I get a response then |
08:57
🔗
|
joepie91 |
right :P |
08:57
🔗
|
joepie91 |
I suppose that if you have a spam issue, it's not a terribly trigger-happy filter (if any at all), so my mail will probably go through fin |
08:57
🔗
|
joepie91 |
fine * |
08:57
🔗
|
SketchCow |
I am all for us using the parallel wiki idea. |
08:58
🔗
|
joepie91 |
SketchCow: can you elaborate on how you'd see that working, in a technical sense? |
08:58
🔗
|
exmic |
metadata goes in |
08:59
🔗
|
exmic |
metadata comes out |
08:59
🔗
|
exmic |
can't explain that |
08:59
🔗
|
joepie91 |
lol |
08:59
🔗
|
SketchCow |
We did a prototype a while ago. |
08:59
🔗
|
SketchCow |
Sort of - you wrote a post bot but I've been busy. |
08:59
🔗
|
joepie91 |
well obviously, but the idea I got was that SketchCow meant using a standard wiki system (a la mediawiki), at which point the question is "how do you turn the wiki page back into useful metadata without making the page a pain to edit" |
08:59
🔗
|
joepie91 |
re: exmic |
09:00
🔗
|
SketchCow |
* collection chosen |
09:00
🔗
|
SketchCow |
* metadata of all items is pulled into wiki under a set, with each item a page |
09:00
🔗
|
SketchCow |
* editttttt |
09:00
🔗
|
SketchCow |
* push all of it back |
09:00
🔗
|
SketchCow |
---- |
09:00
🔗
|
SketchCow |
On a page: |
09:00
🔗
|
SketchCow |
metadata pair becomes == METADATA NAME == |
09:00
🔗
|
SketchCow |
Followed by metadata. |
09:01
🔗
|
SketchCow |
Obviously there is some trickery from the ingestor to pull things in. |
09:01
🔗
|
SketchCow |
Obviously there is potential for things to go wrong, or for issues with newbs making a mess |
09:01
🔗
|
SketchCow |
Obviously it's not the fast fast fast fast shut the fuck up it's fast keep going world of, say, Wikipedia. |
09:01
🔗
|
SketchCow |
Which... I hate. |
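Editor's sketch, not code from the channel: one way the page layout SketchCow describes (each metadata pair becomes a `== NAME ==` heading followed by its value) could be rendered and parsed back. The function names and the sample metadata are hypothetical, and this is not the prototype mentioned above. It ignores the "trickery from the ingestor" and does no validation of edits.

```python
import re

# A wiki-style level-2 heading such as "== title ==".
HEADING = re.compile(r"^==\s*(.+?)\s*==\s*$")


def metadata_to_wikitext(metadata):
    """Render a dict of metadata pairs as one wiki page body."""
    sections = [f"== {name} ==\n{value}\n" for name, value in metadata.items()]
    return "\n".join(sections)


def wikitext_to_metadata(text):
    """Parse a page body back into a dict of metadata pairs."""
    metadata = {}
    name = None
    body = []
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            if name is not None:
                metadata[name] = "\n".join(body).strip()
            name = match.group(1)
            body = []
        elif name is not None:
            body.append(line)
    if name is not None:
        metadata[name] = "\n".join(body).strip()
    return metadata


if __name__ == "__main__":
    # Hypothetical item metadata, for illustration only.
    item = {
        "title": "Example item",
        "subject": "example; keywords",
        "description": "First line.\nSecond line.",
    }
    page = metadata_to_wikitext(item)
    print(page)
    assert wikitext_to_metadata(page) == item
```

A value that itself contains a line shaped like `== something ==` would not survive the round trip, which is one concrete form of the "potential for things to go wrong" noted above.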
11:43
🔗
|
nico |
05:40 yipdw> Edward Snowden/Hetalia Axis Powers crossover |
11:44
🔗
|
nico |
i really should try to restart the ffnet archiving project |
11:50
🔗
|
nico |
https://github.com/FlatRockSoft/ |
14:10
🔗
|
SadDM |
SketchCow: is the code for your keyword generator posted anywhere? |
14:11
🔗
|
SadDM |
I know you're using https://github.com/ox-it/spindle-code/ and https://pypi.python.org/pypi/internetarchive, but what about the glue and baling twine that holds them together? |
14:23
🔗
|
godane |
some good news on the martin yan's chinatowns torrents |
14:24
🔗
|
godane |
i got upload 2 and upload 4 last night |
14:25
🔗
|
godane |
so now i got about 30 episodes of it |
17:03
🔗
|
ersi |
Hmm~ got a USB stick that shows up in dmesg as a SCSI removable disk (like usual) that gets a device (/dev/sdb).. but I can't mount it and if I `dd` from it, it says "dd opening /dev/sdb no medium found" :/ |
17:04
🔗
|
ersi |
Any ideas on how to retrieve data from it? |
17:39
🔗
|
nico |
ersi: borked usb stick? |
17:39
🔗
|
nico |
does cfdisk /dev/sdb return something real? |
18:31
🔗
|
SketchCow |
SadDM: My keyword generator is VERY weaksauce |
18:32
🔗
|
SketchCow |
If you want it, I can provide it |
18:32
🔗
|
SketchCow |
Obviously you need write control on the item for it to work. |
18:43
🔗
|
SketchCow |
SadDM: http://fos.textfiles.com/keyworder.zip |
18:44
🔗
|
SketchCow |
You need internetarchive (the python program) installed |
19:36
🔗
|
SadDM |
SketchCow: anything I'd cobble together would also be weaksauce... you've just saved me the trouble |
19:39
🔗
|
SadDM |
gah! *BOOM* goes the zip file |
20:16
🔗
|
DFJustin |
http://www.reading.ac.uk/news-and-events/releases/PR583836.aspx |
20:41
🔗
|
DFJustin |
https://www.youtube.com/watch?v=d0mg9DxvfZE |
23:41
🔗
|
balrog |
kanzure_: good question, I dunno. I'd think that people who do photographic printed circuit board production might know. |
23:41
🔗
|
balrog |
this is for diybio? |
23:59
🔗
|
kanzure_ |
balrog: yes, sort of |