#archiveteam-bs 2017-11-01,Wed


Who What When
***TC01 has quit IRC (Read error: Connection reset by peer) [00:04]
TC01 has joined #archiveteam-bs [00:09]
...... (idle for 29mn)
bitBaron has quit IRC (Quit: My computer has gone to sleep. ZZZzzz…) [00:38]
...... (idle for 26mn)
SilSte has joined #archiveteam-bs [01:04]
...... (idle for 25mn)
MRX3 has quit IRC (Quit: Leaving) [01:29]
hook54321If a teacher says something and a student records it, is the recording part of the Public Domain? [01:36]
***drumstick has quit IRC (Ping timeout: 255 seconds)
drumstick has joined #archiveteam-bs
[01:42]
dashcloudno
everything now is copyrighted, unless you do something specifically to change that
that's what Creative Commons does for non-software things, and what all the open-source licenses do for software
[01:54]
Somebody2Copyright in recordings of extemporaneous speech generally belongs to the person who makes the recording, IIRC.
But recording people without their consent can be illegal, depending on other factors.
(like what state you are in, whether it was a private conversation, and others)
[01:56]
dashcloudif you're in person, in a school environment, unless you are specifically requested not to do so, you should be able to record without issue [01:58]
Somebody2And if a speech was written down before it was delivered, whoever wrote it down holds the copyright on it, and audio recordings are derivative works. [01:58]
dashcloudhook54321: I get the feeling that none of this really answers the question you had [01:59]
Somebody2It's a *VERY* gray area just how detailed notes have to be to make a recording of a speech a derivative work.
But yeah, I suspect you had a different question.
[01:59]
hook54321It wasn't really a speech, it was this teacher's rant.
https://archive.org/details/BillJohnson
[02:00]
.......... (idle for 47mn)
Somebody2That certainly sounds extemporaneous, so copyright is likely not a concern.
But it also seems likely to attract the attention of an irrational and angry person, so I, at least, will be staying far away.
[02:47]
***schbirid2 has quit IRC (Ping timeout: 255 seconds) [02:50]
godanei'm splitting the bbc america bowie tape into 2 parts
cause one recording is from 2000 and the other is from 1975
[02:57]
***schbirid2 has joined #archiveteam-bs [03:03]
godaneso another unlabeled tape has a Cinemax recording of Excalibur
i'm very sure that's on dvd somewhere
anyways turns out there is some sort of Live event recorded after it
called Film Independent's Spirit Awards
this must have been the 2008 one
SketchCow: btw its hosted by Rainn Wilson
also your going to better bitrate the commercial tapes with this one
i'm getting 8300k to 8700k
[03:10]
***Stilett0 has joined #archiveteam-bs
pizzaiolo has quit IRC (Remote host closed the connection)
[03:25]
..... (idle for 24mn)
zhongfu has quit IRC (Ping timeout: 260 seconds) [03:53]
..... (idle for 24mn)
qw3rty116 has joined #archiveteam-bs
bitBaron has joined #archiveteam-bs
qw3rty115 has quit IRC (Read error: Operation timed out)
bitBaron has quit IRC (Quit: My computer has gone to sleep. ZZZzzz…)
[04:17]
..... (idle for 23mn)
zhongfu has joined #archiveteam-bs [04:48]
..... (idle for 23mn)
Lord_Nigh has quit IRC (Read error: Operation timed out) [05:11]
hook54321JAA: Do you still have the partial recording of Bryan Lunduke's 24 hour thing? [05:12]
***Lord_Nigh has joined #archiveteam-bs [05:14]
...... (idle for 28mn)
balrog has quit IRC (Read error: Operation timed out)
balrog has joined #archiveteam-bs
swebb sets mode: +o balrog
[05:42]
................... (idle for 1h31mn)
godaneSketchCow: we got some good old hbo previews at the end of this tape too [07:17]
.... (idle for 17mn)
these hbo previews are from 1990/1991 since there is inside the nfl talking about super bowl 25 [07:34]
..... (idle for 23mn)
***Stilett0 is now known as Stiletto [07:57]
godaneone tape i'm skipping is the 'in treatment' hbo episodes
i question its from a 2008 series and its on dvd
[08:05]
................ (idle for 1h19mn)
1 minute of footage from this tape is missing [09:25]
***pizzaiolo has joined #archiveteam-bs [09:26]
godaneaudio captured but video goes black and white then to black
around 34:16 to 35:15 this happens
[09:26]
***Stiletto has quit IRC () [09:30]
Stilett0 has joined #archiveteam-bs [09:41]
nyaomi has quit IRC (Read error: Operation timed out) [09:50]
nyaomi has joined #archiveteam-bs
drumstick has quit IRC (Ping timeout: 255 seconds)
drumstick has joined #archiveteam-bs
pizzaiolo has quit IRC (pizzaiolo)
pizzaiolo has joined #archiveteam-bs
pizzaiolo has quit IRC (Ping timeout: 246 seconds)
[09:59]
...... (idle for 26mn)
godanei may stop the tape after the current episode only cause it is having problems
there is frame issue with episode 4 on this tape
[10:31]
***BlueMaxim has quit IRC (Quit: Leaving) [10:34]
JAAhook54321: Yes, I do. [10:45]
***zhongfu has quit IRC (Ping timeout: 260 seconds) [10:57]
zhongfu has joined #archiveteam-bs [11:03]
zhongfu has quit IRC (Ping timeout: 260 seconds)
zhongfu has joined #archiveteam-bs
ScruffyB has joined #archiveteam-bs
decay_ has joined #archiveteam-bs
pikhq_ has joined #archiveteam-bs
RKenshin has joined #archiveteam-bs
tuluu has joined #archiveteam-bs
SN4T14_ has joined #archiveteam-bs
ppsym has joined #archiveteam-bs
Hecatz- has joined #archiveteam-bs
db420 has joined #archiveteam-bs
db420 has quit IRC (Connection closed)
LeG0ax has joined #archiveteam-bs
[11:12]
phillipsj has quit IRC (se.hub irc.underworld.no)
Aerochrom has quit IRC (se.hub irc.underworld.no)
purplebot has quit IRC (se.hub irc.underworld.no)
PurpleSym has quit IRC (se.hub irc.underworld.no)
JensRex has quit IRC (se.hub irc.underworld.no)
tuluu_ has quit IRC (se.hub irc.underworld.no)
i0npulse has quit IRC (se.hub irc.underworld.no)
pikhq has quit IRC (se.hub irc.underworld.no)
Hecatz has quit IRC (se.hub irc.underworld.no)
Kenshin has quit IRC (se.hub irc.underworld.no)
Ing3b0rg has quit IRC (se.hub irc.underworld.no)
dboard2 has quit IRC (se.hub irc.underworld.no)
Rai-chan has quit IRC (se.hub irc.underworld.no)
medowar has quit IRC (se.hub irc.underworld.no)
decay has quit IRC (se.hub irc.underworld.no)
SN4T14 has quit IRC (se.hub irc.underworld.no)
[11:29]
zhongfu has quit IRC (Ping timeout: 260 seconds)
zhongfu has joined #archiveteam-bs
[11:37]
RKenshin is now known as Kenshin
ppsym is now known as PurpleSym
LeG0ax is now known as Ing3b0rg
Hecatz- is now known as Hecatz
Aerochrom has joined #archiveteam-bs
[11:45]
pizzaiolo has joined #archiveteam-bs
drumstick has quit IRC (Read error: Operation timed out)
zhongfu has quit IRC (Ping timeout: 260 seconds)
zhongfu has joined #archiveteam-bs
[11:56]
...... (idle for 25mn)
dboard2 has joined #archiveteam-bs
dboard2 has quit IRC (Connection closed)
[12:24]
............. (idle for 1h3mn)
godaneso i decided to subscribe to The New Yorker for the digital issues on their site [13:27]
..... (idle for 23mn)
***kyounko_ has joined #archiveteam-bs
kyounko_ has quit IRC (Excess Flood)
alfie has quit IRC (Ping timeout: 260 seconds)
r3c0d3x has quit IRC (Ping timeout: 260 seconds)
Meroje has quit IRC (Ping timeout: 260 seconds)
dan- has quit IRC (Ping timeout: 260 seconds)
DopefishJ has joined #archiveteam-bs
swebb sets mode: +o DopefishJ
Meroje has joined #archiveteam-bs
kyounko_ has joined #archiveteam-bs
r3c0d3x has joined #archiveteam-bs
dan- has joined #archiveteam-bs
Hecatz has quit IRC (Ping timeout: 260 seconds)
ld1 has quit IRC (Ping timeout: 260 seconds)
Muad-Dib has quit IRC (Ping timeout: 260 seconds)
ZexaronS- has joined #archiveteam-bs
jsa has quit IRC (Quit: No Ping reply in 180 seconds.)
zhongfu has quit IRC (Remote host closed the connection)
ld1 has joined #archiveteam-bs
jsa has joined #archiveteam-bs
kyounko has quit IRC (Ping timeout: 260 seconds)
DFJustin has quit IRC (Ping timeout: 260 seconds)
ZexaronS has quit IRC (Ping timeout: 260 seconds)
Hecatz has joined #archiveteam-bs
alfie has joined #archiveteam-bs
[13:50]
JAAWhat is going on with all these ping timeouts? [13:52]
***zhongfu has joined #archiveteam-bs
alfie has quit IRC (Ping timeout: 260 seconds)
alembic has quit IRC (Ping timeout: 260 seconds)
riking has quit IRC (Ping timeout: 260 seconds)
ThisAsYou has quit IRC (Ping timeout: 260 seconds)
midas has quit IRC (Ping timeout: 260 seconds)
ItsYoda has quit IRC (Ping timeout: 260 seconds)
JSharp has quit IRC (Ping timeout: 260 seconds)
riking has joined #archiveteam-bs
JSharp has joined #archiveteam-bs
ThisAsYou has joined #archiveteam-bs
alembic has joined #archiveteam-bs
DopefishJ has quit IRC (Ping timeout: 260 seconds)
DrasticAc has quit IRC (Ping timeout: 260 seconds)
bitspill has quit IRC (Ping timeout: 260 seconds)
robogoat has quit IRC (Ping timeout: 260 seconds)
trvz has quit IRC (Ping timeout: 260 seconds)
Hecatz has quit IRC (Ping timeout: 260 seconds)
ld1 has quit IRC (Ping timeout: 260 seconds)
dan- has quit IRC (Ping timeout: 260 seconds)
pikhq_ has quit IRC (Ping timeout: 260 seconds)
spacegirl has quit IRC (Ping timeout: 260 seconds)
zhongfu has quit IRC (Ping timeout: 260 seconds)
robogoat has joined #archiveteam-bs
pikhq has joined #archiveteam-bs
spacegirl has joined #archiveteam-bs
DFJustin has joined #archiveteam-bs
swebb sets mode: +o DFJustin
zhongfu has joined #archiveteam-bs
ld1 has joined #archiveteam-bs
midas has joined #archiveteam-bs
DrasticAc has joined #archiveteam-bs
bitspill has joined #archiveteam-bs
Hecatz has joined #archiveteam-bs
alfie has joined #archiveteam-bs
ItsYoda has joined #archiveteam-bs
trvz has joined #archiveteam-bs
Muad-Dib has joined #archiveteam-bs
[13:53]
godaneso looks like i have this tape from jason : https://archive.org/details/ShigeruMiyamotoGdcKeynote1999
we have the opening of it so it is different than that one
[14:13]
***dan- has joined #archiveteam-bs [14:17]
.... (idle for 17mn)
tuluu has quit IRC (Read error: Operation timed out)
tuluu has joined #archiveteam-bs
purplebot has joined #archiveteam-bs
Rai-chan has joined #archiveteam-bs
dboard2 has joined #archiveteam-bs
i0npulse has joined #archiveteam-bs
[14:34]
godane has left
godane has joined #archiveteam-bs
bitBaron has joined #archiveteam-bs
bitBaron has quit IRC (Client Quit)
[14:48]
.................. (idle for 1h28mn)
Stilett0 is now known as Stiletto
HCross2 has quit IRC (Ping timeout: 260 seconds)
mattl has quit IRC (Ping timeout: 260 seconds)
voltagex has quit IRC (Ping timeout: 260 seconds)
jiphex has quit IRC (Ping timeout: 260 seconds)
trvz has quit IRC (Ping timeout: 260 seconds)
bitspill has quit IRC (Ping timeout: 260 seconds)
DrasticAc has quit IRC (Ping timeout: 260 seconds)
r3c0d3x has quit IRC (Ping timeout: 260 seconds)
tklk has quit IRC (Ping timeout: 260 seconds)
floogulin has quit IRC (Ping timeout: 260 seconds)
DedSec has quit IRC (Ping timeout: 260 seconds)
fallenoak has quit IRC (Ping timeout: 260 seconds)
ThisAsYou has quit IRC (Ping timeout: 260 seconds)
alembic has quit IRC (Ping timeout: 260 seconds)
JSharp has quit IRC (Ping timeout: 260 seconds)
riking has quit IRC (Ping timeout: 260 seconds)
SN4T14_ has quit IRC (Ping timeout: 260 seconds)
BartoCH has quit IRC (Ping timeout: 260 seconds)
Ctrl-S___ has quit IRC (Ping timeout: 260 seconds)
deathy has quit IRC (Ping timeout: 260 seconds)
xarph has quit IRC (Ping timeout: 260 seconds)
Muad-Dib has quit IRC (Ping timeout: 260 seconds)
jsa has quit IRC (Ping timeout: 260 seconds)
Meroje has quit IRC (Ping timeout: 260 seconds)
victorbje has quit IRC (Ping timeout: 260 seconds)
johtso has quit IRC (Ping timeout: 260 seconds)
octarine has quit IRC (Ping timeout: 260 seconds)
jrwr has quit IRC (Ping timeout: 260 seconds)
[16:22]
JAAWTF [16:28]
***BartoCH has joined #archiveteam-bs
mattl has joined #archiveteam-bs
deathy has joined #archiveteam-bs
JSharp has joined #archiveteam-bs
ThisAsYou has joined #archiveteam-bs
riking has joined #archiveteam-bs
jiphex has joined #archiveteam-bs
voltagex has joined #archiveteam-bs
alembic has joined #archiveteam-bs
Ctrl-S___ has joined #archiveteam-bs
HCross2 has joined #archiveteam-bs
trvz has joined #archiveteam-bs
octarine has joined #archiveteam-bs
victorbje has joined #archiveteam-bs
r3c0d3x has joined #archiveteam-bs
Meroje has joined #archiveteam-bs
floogulin has joined #archiveteam-bs
tklk has joined #archiveteam-bs
DrasticAc has joined #archiveteam-bs
fallenoak has joined #archiveteam-bs
DedSec has joined #archiveteam-bs
bitspill has joined #archiveteam-bs
johtso has joined #archiveteam-bs
jsa has joined #archiveteam-bs
SN4T14 has joined #archiveteam-bs
Muad-Dib has joined #archiveteam-bs
[16:29]
...... (idle for 29mn)
joepie91https://motherboard.vice.com/en_us/article/bj7vam/why-twitter-is-the-best-social-media-platform-for-disinformation [17:03]
***midas2 has quit IRC (Read error: Operation timed out) [17:10]
midas2 has joined #archiveteam-bs [17:16]
K4k has quit IRC (Read error: Operation timed out)
K4k has joined #archiveteam-bs
[17:22]
Coderjohmm... not quite a user-driven site, but...
http://support.comixology.com/customer/portal/articles/2887181-pull-list-retirement-faq
it somewhat was, with the retailer portal bit, I guess
And who is surprised at Amazon killing this part of the company after acquiring it? Show of hands?
[17:30]
***xarph has joined #archiveteam-bs [17:45]
.... (idle for 17mn)
JensRex has joined #archiveteam-bs [18:02]
...... (idle for 26mn)
K4k has quit IRC (Quit: WeeChat 1.9.1)
jrwr has joined #archiveteam-bs
K4k has joined #archiveteam-bs
[18:28]
K4k has quit IRC (Quit: WeeChat 1.9.1)
K4k has joined #archiveteam-bs
jrochkind has joined #archiveteam-bs
[18:37]
jrochkindHello, I am a librarian-programmer, but not professionally involved in digital archiving, and don’t know much about archiveteam. BUT….
Baltimore City Paper, Baltimore’s 40-year old alternative free weekly, just published their last issue, after being bought by Tribune Media/TRONC. The website is still up, with lots and lots of content, but who knows for how long. I want to try to preserve as much as possible.
Can anyone here help? Either via archiveteam project, or advice, or whatever?
[18:38]
JAAThank you. I'll add it to ArchiveBot.
That might not exactly grab everything though.
[18:42]
jrochkindThanks! http://www.citypaper.com/ I will continue exploring various other approaches. Is there a place i can check to see ArchiveBot progress, or find the results of what it manages to get? Sorry, I am starting from zero knowledge about how your tools work, although I am an engineer and understand stuff. [18:45]
JAAYeah, the site uses JS for quite a lot of stuff.
http://dashboard.at.ninjawedding.org/
[18:45]
jrochkindawesome, thank you. [18:46]
JAAIt will be job at569nt11fsuk3019kimdq036 (displayed on the far right), but it might not start for a few days.
I'll also throw in some subdomains, e.g. http://events.citypaper.com/
http://digitaledition.citypaper.com/ definitely won't work with ArchiveBot at all.
And even on the main site, galleries etc. all only work with JavaScript. :-|
[18:46]
jrochkindi actually didn’t even know about digitaledition.citypaper.com, ha. There’s def years of content just available on HTML pages, although I don’t know about the internal links, if a scraper is going to find them. [18:47]
JAAYeah, I'm not quite sure either. [18:50]
jrochkindHere’s an example page I found on google (happens to have a letter to the editor from me, is how I targetted it), which is not currently in the IA wayback machine. It’s just an ordinary HTML page, but I dunno about internal links for a scraper to find it. http://www.citypaper.com/bcp-cms-1-1406281-migrated-story-cp-20121121-mail-20121121-story.html [18:50]
JAAI think it should discover quite a large part of the site through http://www.citypaper.com/topic/
Luckily, the listings within topics are using URL-based pagination, e.g. http://www.citypaper.com/topic/politics-government/government/catherine-e.-pugh-PEPLT00007656-topic.html -> http://www.citypaper.com/topic/politics-government/government/catherine-e.-pugh-PEPLT00007656-topic.html?page=2&
[18:56]
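Editor's note: the URL-based pagination JAA points out above can be enumerated mechanically. The following is a minimal sketch, assuming that every page after the first is reachable by appending `?page=N` to the topic URL, as in the example JAA gives. The function name `topic_page_urls` and the `last_page` parameter are hypothetical; the log does not say how many pages a topic has, so the caller would have to supply or discover that.

```python
def topic_page_urls(topic_url, last_page):
    """Return the listing URLs for pages 1..last_page of one topic.

    Assumes topic_url has no query string of its own and that page N
    (N >= 2) lives at topic_url + "?page=N", as in the log's example.
    """
    urls = [topic_url]  # page 1 is the bare topic URL
    for page in range(2, last_page + 1):
        urls.append(f"{topic_url}?page={page}")
    return urls


if __name__ == "__main__":
    base = ("http://www.citypaper.com/topic/politics-government/government/"
            "catherine-e.-pugh-PEPLT00007656-topic.html")
    for url in topic_page_urls(base, 3):
        print(url)
```

A crawler that follows links would find these pages on its own; a list like this is only useful as an explicit seed list.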
jrochkindhmm. what if I get a list of every `site:citypaper.com` hit URL from google, perhaps using a google CSE I pay for. Is there anything useful I can do with that? [18:59]
JAAYes, we could make use of that. But keep in mind that search engines (especially Google) have strict rate limits. Scraping it for results is only really possible for smallish websites, in my experience. [19:01]
jrochkindoh nice, yeah that topics index with paginated lists of topics is pretty good. [19:01]
JAASpecifically, they'll make you fill out captchas, so you can't really automate it. [19:01]
jrochkindGoogle has 30-40K hits for citypaper.com. If you pay google, you actually get an allowed API, no captcha, unless they’ve cancelled that service since I used it last. It will not be expensive to use to just get all the paginated results. (I’d pay for it). Although the allowed API actually might not let me get em all, it might stop you from paginating beyond a certain point. But I might mess with it, if a giant list of URLs would be useful to you. If I do get a list of a few tens of K of URLs, can I share them with you somehow?
[19:03]
JAAAh, right. [19:03]
jrochkind$5 per 1000 queries, if it really lets me paginate through 30K at 10 at a time, that’s only $15. [19:04]
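Editor's note: the cost estimate above works out as 30,000 hits / 10 hits per query = 3,000 queries, and 3,000 queries at $5 per 1,000 = $15. A small sketch of the same arithmetic follows. The price and page size are the figures jrochkind quotes, not values checked against any current pricing, and the function name `cse_cost_usd` is hypothetical.

```python
import math


def cse_cost_usd(total_hits, hits_per_query=10, usd_per_1000_queries=5.0):
    """Return (number of queries, cost in USD) to page through total_hits.

    Defaults are the figures quoted in the log: 10 results per query
    and $5 per 1000 queries.
    """
    queries = math.ceil(total_hits / hits_per_query)
    cost = queries * usd_per_1000_queries / 1000
    return queries, cost


if __name__ == "__main__":
    # Lower and upper ends of the "30-40K hits" estimate from the log.
    for hits in (30_000, 40_000):
        queries, cost = cse_cost_usd(hits)
        print(f"{hits} hits: {queries} queries, ${cost:.2f}")
```

Even at the upper end of the 30-40K estimate the cost stays at $20, so the practical limit is the pagination cap jrochkind mentions, not the price.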
JAAIt could be useful, but if those articles are all (or almost all) discovered through /topic anyway, it's probably not worth it.
I need to leave for a bit. Maybe someone else has better ideas.
[19:06]
jrochkinddrat, I believe Google actually shut down that API anyway. Even though their docs still doc it, it gives me an error when I try to create one, and I vaguely remember them saying they were gonna shut it down. Ah, Google. Anyway, ok, thank you JAA! [19:07]
...... (idle for 26mn)
***TheLovina has quit IRC (Read error: Connection reset by peer) [19:33]
jrochkindJAA if they come back or any other interested parties, they do have a sitemap.xml, although it seems to only have some very limited things in it, it’s not really a sitemap. Don’t know if your tools will use sitemap. [19:36]
their robots.txt actually disallows all those topic/ pages, which seemed the most useful for scraping links. don’t know what archivebot does with robots.txt [19:41]
JAAjrochkind: wpull (used by ArchiveBot) knows about both sitemaps and robots.txt. With the options used in ArchiveBot, it grabs both to discover additional content (i.e. ignores Disallow directives). [19:52]
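Editor's note: to make the sitemap and robots.txt discussion concrete, here is a minimal sketch using only the Python standard library. It is not wpull's or ArchiveBot's code. The sample robots.txt and sitemap contents are hypothetical stand-ins, since the real files are not quoted in the log, and the helper names `sitemap_urls` and `robots_allows` are made up for illustration.

```python
import urllib.robotparser
import xml.etree.ElementTree as ET

# Hypothetical sample inputs; the site's real files are not quoted in the log.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /topic/
"""

SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.citypaper.com/</loc></url>
  <url><loc>http://www.citypaper.com/topic/</loc></url>
</urlset>
"""

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]


def robots_allows(robots_text, url, user_agent="*"):
    """Return True if robots_text permits user_agent to fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE_SITEMAP):
        verdict = "allowed" if robots_allows(SAMPLE_ROBOTS, url) else "disallowed"
        print(url, verdict)
```

A crawler that honors robots.txt would skip the disallowed URLs. As JAA describes it, ArchiveBot reads both files to discover additional content and fetches the URLs regardless of Disallow directives.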
jrochkindcool. looking at it, this site might not be very scrapable, it’s a pretty poorly designed site. we’ll find out!
those topics are actually pretty useless. I think it’s just a listing of terms from some standard vocabulary, I have yet to find one that actually leads to articles.
which may be why they are disallowed in robots.txt.
[19:52]
JAAYes, most of those "topics" seem useless, but some do have links to articles, e.g. the one I linked above.
In that case, it seems to be the author of the articles.
[19:54]
jrochkindah, cool. it might trip up a scraper in requesting thousands of useless links too though. [19:56]
JAAYeah, but some thousands of links aren't really that problematic in the big picture. [19:57]
jrochkindinteresting. there are some weird topic links for sure. http://www.citypaper.com/topic/education/schools/high-schools/05005003-topic.html
i wonder who they licensed that vocabulary from haha
[19:58]
...... (idle for 27mn)
***schbirid2 has quit IRC (Quit: Leaving) [20:25]
jschwart has joined #archiveteam-bs [20:33]
Mateon1 has quit IRC (Ping timeout: 250 seconds)
Mateon1 has joined #archiveteam-bs
[20:38]
........ (idle for 36mn)
tuluu has quit IRC (Remote host closed the connection)
tuluu has joined #archiveteam-bs
[21:16]
....... (idle for 34mn)
dashcloud has quit IRC (Remote host closed the connection) [21:53]
kyounko_ has quit IRC (Ping timeout: 255 seconds) [22:02]
...... (idle for 25mn)
drumstick has joined #archiveteam-bs [22:27]
.... (idle for 17mn)
jschwart has quit IRC (Konversation terminated!) [22:44]
......... (idle for 40mn)
dashcloud has joined #archiveteam-bs
BlueMaxim has joined #archiveteam-bs
[23:24]
.... (idle for 19mn)
jrochkind has quit IRC (jrochkind) [23:47]
