#archiveteam-bs 2017-10-11,Wed


Who What When
godaneit's an rca home theater vcr
it powers on but the machine will not take the tape
[00:00]
***Atom has quit IRC (Read error: Connection reset by peer)
icedice has quit IRC (Read error: Operation timed out)
[00:09]
godaneso i now have a tape hitting the 10000k mark
not of your tapes didn't for some reason
[00:14]
anyways i'm only putting this tape at 6000k now
mostly cause i only capture tv stuff around 5000k
[00:19]
.... (idle for 18mn)
btw i got the money from patreon now [00:38]
***BlueMaxim has quit IRC (Quit: Leaving)
pizzaiolo has joined #archiveteam-bs
[00:40]
....... (idle for 32mn)
BlueMaxim has joined #archiveteam-bs [01:15]
...... (idle for 26mn)
dashcloudglad to hear you got the payment situation fixed godane [01:41]
***schbirid2 has joined #archiveteam-bs [01:49]
godanebtw i found a youtube channel with all Sightings episodes
i'm grabbing it for the archive and myspleen cause they've been looking for all episodes of it
[01:50]
***username1 has quit IRC (Read error: Operation timed out)
dashcloud has quit IRC (Remote host closed the connection)
[01:54]
godaneso got sci-fi airing of star trek [01:56]
atrocitywhy does sci-fi airing it matter?
just for commercials and stuff?
[02:05]
secondmundus: what's up with your server? [02:07]
godaneit has intro with william shatner talking about the episode
so it's for commercials and stuff
sometimes bad edits on stations
anyways i got 14 tapes from the guy for $10.01
[02:11]
***VerifiedJ has joined #archiveteam-bs [02:19]
.... (idle for 17mn)
VerifiedJgodane: I did some more digging and found a way to get full PDFs. Details here https://verifiedjoseph.com/f68qUv7lqs/archiveteam/pagesuite-pdfs.txt (i hope it makes sense) [02:36]
***r3c0d3x has quit IRC (Ping timeout: 260 seconds)
VerifiedJ has left
r3c0d3x has joined #archiveteam-bs
[02:42]
...... (idle for 27mn)
Asparagir has quit IRC (Asparagir) [03:13]
Stilett0 has joined #archiveteam-bs
Stilett0 is now known as Stiletto
[03:22]
mundussecond, can't afford it [03:24]
***pizzaiolo has quit IRC (Quit: pizzaiolo) [03:24]
secondmundus: how much was it costing? [03:25]
mundus$7/mo [03:25]
secondhmm
Bandwidth cost?
[03:28]
mundusI know it's not much
But I don't have much money
Unlimited bw
[03:28]
....... (idle for 33mn)
secondDoes anyone know where I can find the old imdb database?
Or a movie database dataset?
Can someone archive this? ftp://ftp.fu-berlin.de/pub/misc/movies/database/temporaryaccess/
https://sourceforge.net/p/imdbpy/mailman/message/35922484/
And perhaps this ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/
IMDB got rid of their database dumps and got a new format but it is missing a lot of data
a lot of cast / crew meta
[04:02]
Somebody2second: how big are those? [04:10]
secondA few gigs
Post said it was Old files: 49 files, 1.9 GB
[04:11]
Somebody2nods [04:12]
secondNew files: 6 files, 361 MB on S3
I can download the S3 stuff and pay for it, I just need to know where I can upload it for archiveteam to take it and how they want it
Or I can pay someone to get it and they give me a copy ;D
[04:12]
Somebody2second: you can upload it to the Internet Archive.
Just make an account (all you need is an email address, which will be permanently and (not very) publicly attached to whatever you upload).
You can download each file and upload them before downloading the next, which should avoid you needing to hold on to larger amounts of space.
[04:13]
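
The download-one, upload-one, delete approach Somebody2 describes could be sketched roughly as below. This is a minimal sketch and not anyone's actual tooling: `relay_files` and the `upload` callback are hypothetical names, and the upload step is left as a caller-supplied function because the log does not say which upload tool would be used.

```python
import os
import shutil
import tempfile
import urllib.request
from typing import Callable, Iterable


def relay_files(urls: Iterable[str], upload: Callable[[str], None]) -> int:
    """Download each URL, pass the local copy to `upload`, then delete it.

    Only one file is kept on disk at a time. `upload` is a hypothetical
    callback supplied by the caller (it receives the local file path).
    Returns the number of files relayed.
    """
    count = 0
    for url in urls:
        # Derive a local file name from the last path component of the URL.
        name = os.path.basename(url.rstrip("/")) or "download"
        workdir = tempfile.mkdtemp(prefix="relay-")
        local_path = os.path.join(workdir, name)
        try:
            # urlopen handles http://, https://, ftp:// and file:// URLs.
            with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
                shutil.copyfileobj(response, out)
            upload(local_path)
            count += 1
        finally:
            # Remove the local copy before moving on to the next file.
            shutil.rmtree(workdir, ignore_errors=True)
    return count
```

The temporary directory is removed even when the download or upload raises, so a failed file does not leave a partial copy behind; the exception still propagates to the caller.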
secondHow is the IA doing with space? [04:15]
Somebody2second: they've got plenty.
a few gigs won't even be noticed.
[04:15]
secondI hope they have backups, also california is on fire, I hope the IA doesn't burn down too [04:16]
Somebody2a few *terabytes* wouldn't be noticed
once you get up to a petabyte, it's polite to ask first.
(I'm somewhat exaggerating, but only somewhat)
They have backups; they are working on a backup in Canada, although I haven't heard much about it lately.
[04:16]
***Sk1d has quit IRC (Ping timeout: 250 seconds) [04:19]
Somebody2second: I'm grabbing the fu-berlin one now. [04:25]
***Sk1d has joined #archiveteam-bs [04:26]
kisspunchsecond: also grabbing [04:27]
.... (idle for 15mn)
Somebody2Up to 2.8G so far
second: it looks like the other address, funet.fi, is a mirror; are you sure the data is different?
[04:42]
kisspunchSomebody2: IMDB releases snapshots with diffs--compare something outside the diffs folder
What I was trying to convey is that they will either be the same data at 2 points in time or the same data at the same point in time
[04:46]
Somebody2kisspunch: not sure what you mean?
the reported file sizes are identical
the timestamps are a few hours different
[04:47]
kisspunchSomebody2: I am saying, compare a file that's not a diff if you're going to do that check. In any case, I'm just grabbing both and running a deduplicator after [04:49]
Somebody2sounds good.
Let me know if your deduplication finds that they are different, and I'll grab the second one.
[04:49]
kisspunchPretty sure they're the same (actors.list.gz is the same size) but I'll double check tomorrow or so
Expect it to be 14G
My internet's not that fast, I just have an old dump :)
[04:56]
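
Identical sizes and near-identical timestamps are only a hint; a content hash settles whether two mirrors really hold the same files. The sketch below is one way to run that check with the standard library. It is not the deduplicator kisspunch used, and `sha256_of`, `tree_hashes`, and `compare_mirrors` are hypothetical helper names.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_hashes(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    hashes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            hashes[os.path.relpath(full, root)] = sha256_of(full)
    return hashes


def compare_mirrors(root_a, root_b):
    """Return sorted relative paths that are missing on one side or differ."""
    a, b = tree_hashes(root_a), tree_hashes(root_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `compare_mirrors` on the two local download directories would mean every file exists in both and has the same content.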
yipdwJAA: send me an SSH public key over query or email or whatnot, I can grant you access to archivebot@archivebot-proto2 and then you can register new pipelines [05:11]
***wp494 has quit IRC (Ping timeout: 506 seconds) [05:13]
Somebody2YAYAYAY! New pipeline energy!
https://blog.archive.org/2017/10/10/books-from-1923-to-1941-now-liberated/
One of the points about this focuses on whether copies can be bought for a fair price.
If there are only a few copies, can someone buy them, and announce that they are no longer for sale, and thereby trigger section 108(h)?
[05:19]
pikhqHow shockingly reasonable of US copyright law. [05:23]
Somebody2pikhq: yeah, ain't it? [05:25]
.... (idle for 19mn)
***wp494 has joined #archiveteam-bs
BlueMaxim has quit IRC (Quit: Leaving)
[05:44]
..... (idle for 22mn)
Somebody2second: I've now got the fu-berlin one; it's 13G in size. I'll wait to hear from kisspunch about whether the funet.fi one is different before going after that. [06:10]
***BlueMaxim has joined #archiveteam-bs [06:24]
loadup has quit IRC (Read error: Operation timed out) [06:35]
...... (idle for 27mn)
Honno has joined #archiveteam-bs [07:02]
............ (idle for 59mn)
atrocity has quit IRC () [08:01]
..... (idle for 22mn)
BlueMaxim has quit IRC (Ping timeout: 255 seconds)
BlueMaxim has joined #archiveteam-bs
[08:23]
..... (idle for 21mn)
wp494 has quit IRC (Ping timeout: 492 seconds) [08:44]
wp494 has joined #archiveteam-bs [08:51]
........ (idle for 35mn)
tfgbd_znc has quit IRC (Read error: Connection reset by peer) [09:26]
..... (idle for 20mn)
wabu has quit IRC (Read error: Operation timed out) [09:46]
wabu has joined #archiveteam-bs
kepler45 has joined #archiveteam-bs
[09:56]
kisspunchugh, there are so many versions of fdupes [10:05]
***Honno has quit IRC (Read error: Operation timed out) [10:15]
kisspunchSomebody2: It's the same. [10:22]
***atrocity has joined #archiveteam-bs
ivan has quit IRC (Leaving)
[10:28]
marvinw has joined #archiveteam-bs [10:40]
Mateon1 has quit IRC (Ping timeout: 250 seconds) [10:54]
midas has quit IRC (Read error: Connection reset by peer)
midas has joined #archiveteam-bs
[11:02]
JAAyipdw: Excellent, will do in a bit. [11:04]
..... (idle for 23mn)
***pizzaiolo has joined #archiveteam-bs [11:27]
....... (idle for 33mn)
qw3rty3 has joined #archiveteam-bs [12:00]
...... (idle for 25mn)
wabu has quit IRC (Read error: Operation timed out)
Atom has joined #archiveteam-bs
[12:25]
wabu has joined #archiveteam-bs [12:35]
BlueMaxim has quit IRC (Quit: Leaving) [12:43]
.......... (idle for 49mn)
secondThank you Somebody2
Did anyone by chance download the aws bucket for the imdb data?
[13:32]
JAASince you have to pay S3's exorbitant bandwidth fees (it's a Requester-Pays bucket), I kind of doubt it. I believe IMDB is still working on an HTTP interface without those fees.
See: https://getsatisfaction.com/imdb/topics/imdb-data-now-available-in-amazon-s3
Them not having the HTTP interface up seems to be the reason why the FTP servers are still online.
[13:39]
secondDoes anyone know where I can find a last.fm dump? [13:46]
***Mateon1 has joined #archiveteam-bs [13:47]
...... (idle for 25mn)
Pixi has quit IRC (Quit: Pixi)
Pixi has joined #archiveteam-bs
icedice has joined #archiveteam-bs
[14:12]
qw3rty3Is there a channel for Amazon Forum archival? [14:17]
..... (idle for 20mn)
***sep332 has joined #archiveteam-bs [14:37]
...... (idle for 29mn)
Asparagir has joined #archiveteam-bs [15:06]
icedice has quit IRC (Quit: Leaving) [15:16]
Stiletto has quit IRC (Ping timeout: 260 seconds) [15:21]
ZexaronS- has joined #archiveteam-bs
ZexaronS has quit IRC (Ping timeout: 260 seconds)
[15:28]
JAAqw3rty3: No, there isn't. [15:34]
..... (idle for 24mn)
***Asparagir has quit IRC (Asparagir) [15:58]
.... (idle for 17mn)
schbirid2re that new order forum. 125$ for a forum that would run on a 5$ host... wtf [16:15]
JAAI don't know the story behind this case, but I've seen similar setups before, and there it was a matter of "never change a running system" mixed with "I'm too lazy to do anything about it". [16:16]
***Stilett0 has joined #archiveteam-bs [16:18]
Stilett0 is now known as Stiletto [16:24]
............ (idle for 58mn)
Asparagir has joined #archiveteam-bs [17:22]
.... (idle for 17mn)
pa has joined #archiveteam-bs [17:39]
dd0a13f37VerifiedJ: that's basically a slower version of pdfcat though, it's not pristine so to speak [17:50]
***pa has quit IRC (Quit: pa)
pa has joined #archiveteam-bs
[17:52]
dd0a13f37second: a quick google search gives me https://www.demonforums.net/Thread-Last-fm-Dump-Re-upload https://leakninja.com/39243-lastfm-1-8gb-dump-12.html
oh hey, https://btdig.com/85f39f1d94917d61277725e7da85d8177a5c12eb/
/last.fm/lastfm.txt.gz
Any way to upload a torrent larger than 100gb to internetarchive?
[17:56]
.......... (idle for 45mn)
***Stiletto has quit IRC () [18:44]
....... (idle for 32mn)
Asparagir has quit IRC (Asparagir) [19:16]
........... (idle for 51mn)
schbirid2 has quit IRC (Quit: Leaving)
schbirid has joined #archiveteam-bs
[20:07]
..... (idle for 23mn)
pa has quit IRC (Quit: pa)
pa has joined #archiveteam-bs
pa has quit IRC (Client Quit)
[20:32]
JAAWhat's the best way to archive different source code repositories? I know about svnrdump for SVN repos, but what about other software? git, Mercurial, Bazaar, CVS, etc.
In particular, what to do if the repository itself is not public but only accessible through a web frontend? (There's an ArchiveBot job currently grabbing a CVSweb instance; that's the immediate trigger for these questions, though I've been wondering about it for longer.)
[20:39]
yipdwgit clone
etc
github-backup for stuff on github that might have other useful things like issues, wiki pages, etc
[20:44]
kisspunchFor git clone the harder part is keeping your mirror up to date--the initial clone yeah, git clone works fine [20:53]
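
For the git case, the "clone once, then keep it current" routine could look something like the sketch below. It assumes a `git` binary on the PATH and uses `git clone --mirror` for the first copy and `git remote update --prune` for later refreshes; `mirror_command` and `sync_mirror` are hypothetical names, and this covers only the repository itself, not issues or wiki pages.

```python
import os
import subprocess


def mirror_command(url, dest):
    """Pick the git command for this run.

    If `dest` already exists, refresh the existing mirror; otherwise make
    the initial bare mirror clone.
    """
    if os.path.isdir(dest):
        return ["git", "-C", dest, "remote", "update", "--prune"]
    return ["git", "clone", "--mirror", url, dest]


def sync_mirror(url, dest, run=subprocess.run):
    """Run the clone or update command and return the command that was used.

    `run` defaults to subprocess.run and can be swapped out for testing.
    """
    command = mirror_command(url, dest)
    run(command, check=True)
    return command
```

Calling `sync_mirror` with the same URL and destination from a scheduled job would clone on the first run and update on every later one.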
.......... (idle for 45mn)
***Stilett0 has joined #archiveteam-bs [21:38]
............ (idle for 55mn)
Asparagir has joined #archiveteam-bs [22:33]
kepler45 has quit IRC (Quit: Leaving) [22:39]
wp494that guy that I asked to PM me about the whole deal involving NCIX got back to me and he straight up refused despite having PMed someone else already
/shrug
[22:49]
...... (idle for 26mn)
***BlueMaxim has joined #archiveteam-bs [23:15]
..... (idle for 22mn)
Soni has quit IRC (Ping timeout: 272 seconds) [23:37]
.... (idle for 16mn)
Asparagir has quit IRC (Asparagir) [23:53]
