#archiveteam 2017-11-19,Sun


WhoWhatWhen
***Ctrl has joined #archiveteam [00:07]
DFJustinhttps://archive.org/details/jstor_ejc
dunno about the open access ebooks part though
[00:07]
***Martle_ has joined #archiveteam
Martle has quit IRC (Read error: Operation timed out)
j08nY has quit IRC (Read error: Operation timed out)
j08nY has joined #archiveteam
Martle__ has joined #archiveteam
Ctrl has quit IRC (Remote host closed the connection)
Ctrl has joined #archiveteam
Ctrl has quit IRC (Excess Flood)
[00:20]
SketchCowThe SILK guy got back to me with a .csv of subdomains.
I've forwarded the list to arkiver to process.
Along with the warnings of the guy, i.e. they know we're going to do this but they can be over capacity easy by us doing massive grabs
[00:35]
***Martle_ has quit IRC (Read error: Operation timed out) [00:36]
SketchCowAlso
From John Gilmore: archiveteam.org uses an invalid security certificate. The certificate is only valid for the following names: breeze.tqhosting.com, www.breeze.tqhosting.com Error code: SSL_ERROR_BAD_CERT_DOMAIN
I'll happily work with someone to fix this
[00:39]
***Soni has quit IRC (Ping timeout: 264 seconds) [00:48]
Ctrl has joined #archiveteam
icedice2 has quit IRC (Quit: Leaving)
Soni has joined #archiveteam
[01:02]
ZexaronS has joined #archiveteam [01:13]
.... (idle for 19mn)
SketchCowAlso
Hey Jason - I wonder if it's worth having the Archive Team spider
FamilySearch.org? It looks like their proprietary "partners" are
forcing them to put it behind a login-wall starting Dec 13. And of
course the first thing a login-wall does is to turn off any account
that starts doing bulk downloads...
And if you're talking about "history going offline", this has some of
the best most detailed history of human ancestry ever collected. I
have discovered and researched my ancestors back to the early 1800s in
their data -- all without logging in. Church baptism records from the 1500s.
Government census records from the very beginning. Etc.
[01:32]
.... (idle for 15mn)
***nertzy2 has joined #archiveteam [01:48]
nertzy has quit IRC (Read error: Operation timed out) [01:55]
ZexaronS has quit IRC (Read error: Operation timed out) [02:07]
........ (idle for 36mn)
pizzaiolo has quit IRC (Remote host closed the connection)
kristian_ has joined #archiveteam
j08nY has quit IRC (Remote host closed the connection)
[02:43]
Valentine has joined #archiveteam
Valentin- has quit IRC (Ping timeout: 506 seconds)
[02:59]
......... (idle for 44mn)
superkuh has quit IRC (Quit: the neuronal action potential is an electrical manipulation of reversible abrupt phase changes in the lipid bilaye) [03:44]
SketchCow-----------------------------
FOS UPDATE
The new FOS should basically have taken everything over from old FOS
There's a few hundred gigs of this and that I'll nail down this week
Some things might not run right, let me know if you see them
-----------------------------
[03:47]
***odemg has quit IRC (Ping timeout: 245 seconds) [04:00]
odemg has joined #archiveteam [04:14]
DFJustinffs the one genealogy site on the internet that isn't ruined [04:19]
***ranavalon has quit IRC (Read error: Connection reset by peer) [04:23]
...... (idle for 29mn)
ZexaronS has joined #archiveteam
qw3rty110 has joined #archiveteam
kristian_ has quit IRC (Quit: Leaving)
qw3rty19 has quit IRC (Read error: Operation timed out)
[04:52]
......... (idle for 41mn)
ZexaronS has quit IRC (Quit: Leaving) [05:42]
hook54321I'm in the FamilySearch Yammer chat, if anyone has any question I can probably ask them there.
https://media.familysearch.org/familysearch-free-sign-in-offers-greater-subscriber-experiences-and-benefits/
[05:43]
........ (idle for 36mn)
***dboard2 is now known as dboard [06:21]
................................... (idle for 2h51mn)
Mateon1 has quit IRC (Ping timeout: 260 seconds)
Mateon1 has joined #archiveteam
[09:12]
.......... (idle for 47mn)
pizzaiolo has joined #archiveteam [09:59]
....... (idle for 31mn)
BlueMaxim has quit IRC (Read error: Connection reset by peer) [10:30]
j08nY has joined #archiveteam [10:41]
luk has joined #archiveteam [10:51]
luk has quit IRC (Ping timeout: 260 seconds)
Fusl has joined #archiveteam
[10:58]
schbirid has joined #archiveteam [11:06]
...... (idle for 27mn)
zino has quit IRC (Ping timeout: 255 seconds) [11:33]
zino has joined #archiveteam [11:41]
zino has quit IRC (Remote host closed the connection) [11:53]
........ (idle for 38mn)
kristian_ has joined #archiveteam [12:31]
...... (idle for 27mn)
schbiridso uh, dont ask me why but i set up some automatic grabbing of (selected by format) cinemageddon uploads with no actual plan but soothing my hoarding mind
if someone (i know and trust from here) wants to upload&dark them to IA, i could rsync to you. finished torrent contents only, not the torrent file or the metadata or anything, sorry
[12:58]
...... (idle for 26mn)
***j08nY has quit IRC (Quit: Leaving) [13:25]
..... (idle for 20mn)
ZexaronS has joined #archiveteam [13:45]
........ (idle for 38mn)
kristian_ has quit IRC (Quit: Leaving)
ranavalon has joined #archiveteam
ranavalon has quit IRC (Remote host closed the connection)
ranavalon has joined #archiveteam
[14:23]
ZexaronS has quit IRC (Quit: Leaving) [14:33]
..... (idle for 21mn)
Stilett0 has joined #archiveteam
justaj has joined #archiveteam
[14:54]
justajhi, I was wondering how I could best save an entire Reddit thread. I've read on the AT wiki that there was a partial archive of Reddit but I want to save threads just one by one if that's possible. I made a thread asking just that - https://redd.it/7e0xm6
I'd appreciate if anyone could help out.
[14:57]
***superkuh has joined #archiveteam [15:09]
balrog has quit IRC (Read error: Operation timed out) [15:20]
JAAjustaj: That's a bit tricky. As soon as a thread grows too large, you can't easily access all child comments but have to retrieve what those "load more comments" links do as well. I'm not aware of any straightforward archiving solution for Reddit threads.
However, there is an archive of all Reddit comments at https://files.pushshift.io/reddit/comments/ (IA mirror at https://archive.org/details/reddit-data-comments ), and it should be possible to extract all comments for a particular thread from there.
[15:28]
justaj: One way to archive an entire thread would be to use warcprox with any browser, then go to the relevant thread and click on all the "load more comments" and "continue thread" links manually. That would save all relevant data to a WARC file, which can later be played back e.g. with pywb. It's all manual though. [15:36]
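A minimal sketch of the dump-based route JAA mentions, in case it helps: it assumes the comment dumps are newline-delimited JSON and that each comment object carries a `link_id` field of the form `t3_<thread id>`. The function name and the file name in the usage comment are hypothetical; the thread id `7e0xm6` is taken from the redd.it link above.

```python
import json
from typing import Iterable, Iterator


def comments_for_thread(lines: Iterable[str], thread_id: str) -> Iterator[dict]:
    """Yield the comment objects that belong to one Reddit thread.

    Assumes one JSON object per line, with a "link_id" field such as
    "t3_7e0xm6" naming the thread the comment was posted in.
    """
    wanted = "t3_" + thread_id
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            comment = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(comment, dict) and comment.get("link_id") == wanted:
            yield comment


# Usage sketch (the file name is hypothetical; open the dump with whatever
# decompressor matches the file you actually downloaded):
#
#   import bz2
#   with bz2.open("RC_dump.bz2", "rt", encoding="utf-8") as dump:
#       for comment in comments_for_thread(dump, "7e0xm6"):
#           print(comment.get("author"), comment.get("body"))
```

This streams the dump line by line, so it never needs the whole file in memory, but it still has to read every comment in the file to find the ones for a single thread.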
.... (idle for 16mn)
schbiridhttp://www.pagetable.com/?p=904 [15:52]
........ (idle for 36mn)
justajJAA: I see. One trick (if you want to see a maximum of 500 comments) is to append ?limit=1000 to the URL and then archive that way. However, that still doesn't solve the issue with archiving long comment threads that are behind the "Continue this thread --->" parts. I have the Wayback Machine browser extension and I don't really mind the manual work, so I think I'll try to archive the links leading to those "hidden" parts as well using that.
I'll try messing around with warcprox, but I'm so far a noob with python and certainly messing around with certificates and MITM.
[16:28]
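A small sketch of the `?limit=` trick described above, assuming the thread URL may already carry other query parameters. The function names and the example URL are illustrative only; the default of 500 follows justaj's remark that 500 is the most comments shown, and the `https://web.archive.org/save/` prefix is the Wayback Machine's "Save Page Now" address.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_comment_limit(thread_url: str, limit: int = 500) -> str:
    """Return thread_url with its "limit" query parameter set to `limit`."""
    parts = urlsplit(thread_url)
    # Keep every existing parameter except a previous "limit".
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "limit"
    ]
    query.append(("limit", str(limit)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def wayback_save_url(url: str) -> str:
    """Build the Save Page Now address for `url` (nothing is fetched here)."""
    return "https://web.archive.org/save/" + url


# Example with an illustrative thread URL:
#   with_comment_limit("https://www.reddit.com/comments/7e0xm6/")
```

As noted in the discussion, this only widens the first page of comments; the "load more comments" and "continue this thread" links still have to be captured separately.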
JAAYeah, and that also doesn't help with comments which received tons of replies because some of those will be hidden behind "load more comments". I think warcprox (or a similar software) is probably the only way to really capture everything.
You don't need to know Python at all to use warcprox, and the certificate thing should be fairly straightforward.
If you want to discuss this further, please come to #archiveteam-bs. This channel is mainly for announcements.
[16:33]
***odemg has quit IRC (Quit: Leaving) [16:46]
.... (idle for 18mn)
pizzaiolo has quit IRC (Read error: Operation timed out) [17:04]
.............. (idle for 1h8mn)
SirCmpwn has quit IRC (Read error: Operation timed out)
Zialus has quit IRC (Read error: Operation timed out)
Fusl_ has joined #archiveteam
Martle has joined #archiveteam
Stiletto has joined #archiveteam
liam has quit IRC (Read error: Operation timed out)
lukeman has quit IRC (Read error: Operation timed out)
squires has quit IRC (Read error: Operation timed out)
beardicus has quit IRC (Read error: Operation timed out)
MMovie has quit IRC (Read error: Operation timed out)
justaj has quit IRC (Read error: Operation timed out)
Fusl has quit IRC (Read error: Operation timed out)
lukeman has joined #archiveteam
Stilett0 has quit IRC (Read error: Operation timed out)
C4K3 has quit IRC (Read error: Operation timed out)
REiN^ has quit IRC (Read error: Operation timed out)
PotcFdk has quit IRC (Read error: Operation timed out)
Martle__ has quit IRC (Read error: Operation timed out)
Dimtree has quit IRC (Read error: Operation timed out)
c4rc4s has quit IRC (Ping timeout: 600 seconds)
nwf_ has quit IRC (Read error: Operation timed out)
qw3rty110 has quit IRC (Read error: Operation timed out)
oli_ has joined #archiveteam
c4rc4s has joined #archiveteam
oli has quit IRC (Read error: Operation timed out)
oli_ is now known as oli
[18:12]
SirCmpwn has joined #archiveteam [18:25]
wp494Weather Underground is tossing out webcams now: http://help.wunderground.com/knowledgebase/articles/1821811
"After 10 years of proudly displaying your webcam footage across our website and apps, we sadly have to remove this functionality as we no longer have the necessary resources to maintain it. On December 15, 2017, we’ll remove the webcam feeds from our website, mobile apps, and within our API – meaning uploading and accessing webcam footage will no longer be available."
"Q: Can I download my existing webcam footage?
Unfortunately, we do not have download functionality for webcam footage."
[18:37]
JAAUgh [18:38]
wp494I thought IBM "liberating" WU from NBC/Comcast would be a good thing but so far it really hasn't been [18:39]
***Dimtree has joined #archiveteam [18:43]
Harzilein has joined #archiveteam [18:50]
Harzileinhi [18:50]
***qw3rty110 has joined #archiveteam [18:51]
wp494Yes, hello [18:51]
***liam has joined #archiveteam
beardicus has joined #archiveteam
REiN^ has joined #archiveteam
squires has joined #archiveteam
MMovie has joined #archiveteam
C4K3 has joined #archiveteam
[18:51]
arkiverwe can archive the webcam footage from wunderground.com
https://www.wunderground.com/webcams/
[18:56]
***Zialus has joined #archiveteam [19:00]
nwf_ has joined #archiveteam [19:07]
PotcFdk has joined #archiveteam [19:15]
pizzaiolo has joined #archiveteam [19:23]
Fusl_does someone know if there's a docker image available for the warrior that doesn't require manual configuration on container boot? [19:27]
***Fusl_ is now known as Fusl
Pixi` has quit IRC (Quit: Pixi`)
Pixi has joined #archiveteam
[19:27]
antomaticHuh! IBM are short of disc space? Who knew. [19:31]
***jschwart has joined #archiveteam
bithippo has quit IRC (My MacBook Air has gone to sleep. ZZZzzz…)
odemg has joined #archiveteam
[19:39]
hook54321arkiver: http://icons.wunderground.com/webcamarchive/u/t/utdot/246/2016/09/20160911.mp4 [19:53]
arkiveryeah
we just need a list of uploaders
like kydot in https://www.wunderground.com/webcams/kydot/
can maybe get that from the map, will have a look
[19:54]
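A rough sketch of how the archive URL hook54321 posted could be taken apart to recover the uploader handle. The layout (`webcamarchive/<first letter>/<second letter>/<handle>/<camera>/<year>/<month>/<yyyymmdd>.mp4`) is only inferred from that single example, so the meaning of each segment, in particular reading `246` as a camera id, is an assumption, and the function name is hypothetical.

```python
from datetime import datetime
from urllib.parse import urlsplit


def parse_webcam_archive_url(url: str) -> dict:
    """Split a webcamarchive URL into its (assumed) components."""
    segments = [part for part in urlsplit(url).path.split("/") if part]
    if len(segments) != 8 or segments[0] != "webcamarchive":
        raise ValueError("unexpected webcam archive URL layout: " + url)
    # Assumed layout, inferred from one example URL only.
    _, _first, _second, handle, camera, _year, _month, filename = segments
    stem, _, extension = filename.rpartition(".")
    return {
        "handle": handle,        # uploader, e.g. "utdot"
        "camera": camera,        # assumed to be a per-uploader camera id
        "date": datetime.strptime(stem, "%Y%m%d").date(),
        "extension": extension,
    }


# Example, using the URL from the log:
#   parse_webcam_archive_url(
#       "http://icons.wunderground.com/webcamarchive/u/t/utdot/246/2016/09/20160911.mp4"
#   )["handle"]
```

If the layout holds, a list of handles and camera ids would be enough to enumerate the daily files, which is presumably why the list of uploaders matters.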
hook54321why do we need a list of uploaders? [20:00]
antomaticso we know what to archive
(or at least where to start)
[20:09]
***j08nY has joined #archiveteam [20:13]
ZexaronS has joined #archiveteam
bithippo has joined #archiveteam
[20:24]
...... (idle for 29mn)
balrog has joined #archiveteam [20:57]
....... (idle for 30mn)
trvz has quit IRC (Ping timeout: 260 seconds) [21:27]
.... (idle for 15mn)
icedice has joined #archiveteam [21:42]
............. (idle for 1h1mn)
matt_ has joined #archiveteam
matt_ is now known as Igloo_
achip has joined #archiveteam
Igloo_ has quit IRC (Client Quit)
jschwart has quit IRC (Quit: Konversation terminated!)
Igloo_ has joined #archiveteam
Igloo has quit IRC (Quit: leaving)
Igloo_ is now known as Igloo
[22:43]
Rondom_ has joined #archiveteam
yuitimoth has quit IRC (Read error: Connection reset by peer)
Rondom has quit IRC (Read error: Network is unreachable)
atluxity has quit IRC (Remote host closed the connection)
yuitimoth has joined #archiveteam
atluxity has joined #archiveteam
kcaj has quit IRC (Ping timeout: 506 seconds)
kcaj has joined #archiveteam
[23:06]
...... (idle for 29mn)
trvz has joined #archiveteam
BlueMaxim has joined #archiveteam
[23:40]
