#archiveteam-bs 2017-10-11,Wed


Time Nickname Message
00:00 🔗 godane its a rca home theater vcr
00:01 🔗 godane it powers on but the machine will not take the tape
00:09 🔗 Atom has quit IRC (Read error: Connection reset by peer)
00:10 🔗 icedice has quit IRC (Read error: Operation timed out)
00:14 🔗 godane so i now have a tape hitting the 10000k mark
00:14 🔗 godane not of your tapes didn't for some reason
00:19 🔗 godane anyways i'm only putting this tape at 6000k now
00:20 🔗 godane mostly cause i only capture tv stuff around 5000k
00:38 🔗 godane btw i got the money from patreon now
00:40 🔗 BlueMaxim has quit IRC (Quit: Leaving)
00:43 🔗 pizzaiolo has joined #archiveteam-bs
01:15 🔗 BlueMaxim has joined #archiveteam-bs
01:41 🔗 dashcloud glad to hear you got the payment situation fixed godane
01:49 🔗 schbirid2 has joined #archiveteam-bs
01:50 🔗 godane btw i found a youtube channel with all Sightings episodes
01:50 🔗 godane i'm grabbing it for the archive and myspleen cause they been looking for all episodes of it
01:54 🔗 username1 has quit IRC (Read error: Operation timed out)
01:55 🔗 dashcloud has quit IRC (Remote host closed the connection)
01:56 🔗 godane so got sci-fi airing of star trek
02:05 🔗 atrocity why does sci-fi airing it matter?
02:05 🔗 atrocity just for commercials and stuff?
02:07 🔗 second mundus: what's up with your server?
02:11 🔗 godane it has intro with william shatner talking about the episode
02:12 🔗 godane so its for commercials and stuff
02:12 🔗 godane some times bad edits on stations
02:13 🔗 godane anyways i got 14 tapes from the guy for $10.01
02:19 🔗 VerifiedJ has joined #archiveteam-bs
02:36 🔗 VerifiedJ godane: I did some more digging and found a way to get full PDFs. Details here https://verifiedjoseph.com/f68qUv7lqs/archiveteam/pagesuite-pdfs.txt (i hope it makes sense)
02:42 🔗 r3c0d3x has quit IRC (Ping timeout: 260 seconds)
02:44 🔗 VerifiedJ has left
02:46 🔗 r3c0d3x has joined #archiveteam-bs
03:13 🔗 Asparagir has quit IRC (Asparagir)
03:22 🔗 Stilett0 has joined #archiveteam-bs
03:22 🔗 Stilett0 is now known as Stiletto
03:24 🔗 mundus second, can't afford it
03:24 🔗 pizzaiolo has quit IRC (Quit: pizzaiolo)
03:25 🔗 second mundus: how much was it costing?
03:25 🔗 mundus $7/mo
03:28 🔗 second hmm
03:28 🔗 second Bandwidth cost?
03:28 🔗 mundus I know it's not much
03:29 🔗 mundus But I don't have much money
03:29 🔗 mundus Unlimited bw
04:02 🔗 second Does anyone know where I can find the old imdb database?
04:04 🔗 second Or a movie database dataset?
04:07 🔗 second Can someone archive this? ftp://ftp.fu-berlin.de/pub/misc/movies/database/temporaryaccess/
04:07 🔗 second https://sourceforge.net/p/imdbpy/mailman/message/35922484/
04:07 🔗 second And perhaps this ftp://ftp.funet.fi/pub/mirrors/ftp.imdb.com/pub/
04:08 🔗 second IMDB got rid of their database dumps and got a new format but it is missing a lot of data
04:08 🔗 second a lot of cast / crew meta
04:10 🔗 Somebody2 second: how big are those?
04:11 🔗 second A few gigs
04:12 🔗 second Post said it was Old files: 49 files, 1.9 GB
04:12 🔗 Somebody2 nods
04:12 🔗 second New files: 6 files, 361 MB on S3
04:12 🔗 second I can download the S3 stuff and pay for it just need to know where I can upload it for archiveteam to take it and how they want it
04:13 🔗 second Or I can pay someone to get it and they give me a copy ;D
04:13 🔗 Somebody2 second: you can upload it to the Internet Archive.
04:14 🔗 Somebody2 Just make an account (all you need is an email address, which will be permanently and (not very) publicly attached to whatever you upload).
04:14 🔗 Somebody2 You can download each file and upload them before downloading the next, which should avoid you needing to hold on to larger amounts of space.
04:15 🔗 second How is the IA doing with space?
04:15 🔗 Somebody2 second: they've got plenty.
04:16 🔗 Somebody2 a few gigs won't even be noticed.
04:16 🔗 second I hope they have backups, also california is on fire, I hope the IA doesn't burn down too
04:16 🔗 Somebody2 a few *terabytes* wouldn't be noticed
04:16 🔗 Somebody2 once you get up to a petabyte, it's polite to ask first.
04:16 🔗 Somebody2 (I'm somewhat exaggerating, but only somewhat)
04:17 🔗 Somebody2 They have backups; they are working on a backup in Canada, although I haven't heard much about it lately.
04:19 🔗 Sk1d has quit IRC (Ping timeout: 250 seconds)
04:25 🔗 Somebody2 second: I'm grabbing the fu-berlin one now.
04:26 🔗 Sk1d has joined #archiveteam-bs
04:27 🔗 kisspunch second: also grabbing
04:42 🔗 Somebody2 Up to 2.8G so far
04:45 🔗 Somebody2 second: it looks like the other address, funet.fi, is a mirror; are you sure the data is different?
04:46 🔗 kisspunch Somebody2: IMDB releases snapshots with diffs--compare something outside the diffs folder
04:46 🔗 kisspunch They will either be the same data at 2 points in time or the same point in time is what I was trying to convey
04:47 🔗 Somebody2 kisspunch: not sure what you mean?
04:47 🔗 Somebody2 the reported file sizes are identical
04:47 🔗 Somebody2 the timestamps are a few hours different
04:49 🔗 kisspunch Somebody2: I am saying, compare a file that's not a diff if you're going to do that check. In any case, I'm just grabbing both and running a deduplicator after
04:49 🔗 Somebody2 sounds good.
04:50 🔗 Somebody2 Let me know if your deduplication finds that they are different, and I'll grab the second one.
04:56 🔗 kisspunch Pretty sure they're the same (actors.list.gz is the same size) but I'll double check tomorrow or so
04:58 🔗 kisspunch Expect it to be 14G
04:58 🔗 kisspunch My internet's not that fast, I just have an old dump :)
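
[Editor's sketch, not part of the log] The check discussed above, whether two mirrors hold the same files, can be done by hashing every file in each downloaded tree and comparing the results, which is stronger than comparing sizes or timestamps. This is a minimal sketch and not what anyone in the channel actually ran (kisspunch mentions fdupes later); the function names `file_digest`, `tree_digests`, and `compare_mirrors` are hypothetical, and it assumes both mirrors have already been downloaded to local directories.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_mirrors(dir_a, dir_b):
    """Return (only_in_a, only_in_b, differing) lists of relative paths."""
    a, b = tree_digests(dir_a), tree_digests(dir_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differing = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, differing
```

If all three returned lists are empty, the two local copies are byte-for-byte identical, so only one needs to be uploaded.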
05:11 🔗 yipdw JAA: send me an SSH public key over query or email or whatnot, I can grant you access to archivebot@archivebot-proto2 and then you can register new pipelines
05:13 🔗 wp494 has quit IRC (Ping timeout: 506 seconds)
05:19 🔗 Somebody2 YAYAYAY! New pipeline energy!
05:20 🔗 Somebody2 https://blog.archive.org/2017/10/10/books-from-1923-to-1941-now-liberated/
05:21 🔗 Somebody2 One of the points about this focuses on whether copies can be bought for a fair price.
05:21 🔗 Somebody2 If there are only a few copies, can someone buy them, and announce that they are no longer for sale, and thereby trigger section 108(h)?
05:23 🔗 pikhq How shockingly reasonable of US copyright law.
05:25 🔗 Somebody2 pikhq: yeah, ain't it?
05:44 🔗 wp494 has joined #archiveteam-bs
05:48 🔗 BlueMaxim has quit IRC (Quit: Leaving)
06:10 🔗 Somebody2 second: I've now got the fu-berlin one; it's 13G in size. I'll wait to hear from kisspunch about whether the funet.fi one is different before going after that.
06:24 🔗 BlueMaxim has joined #archiveteam-bs
06:35 🔗 loadup has quit IRC (Read error: Operation timed out)
07:02 🔗 Honno has joined #archiveteam-bs
08:01 🔗 atrocity has quit IRC ()
08:23 🔗 BlueMaxim has quit IRC (Ping timeout: 255 seconds)
08:23 🔗 BlueMaxim has joined #archiveteam-bs
08:44 🔗 wp494 has quit IRC (Ping timeout: 492 seconds)
08:51 🔗 wp494 has joined #archiveteam-bs
09:26 🔗 tfgbd_znc has quit IRC (Read error: Connection reset by peer)
09:46 🔗 wabu has quit IRC (Read error: Operation timed out)
09:56 🔗 wabu has joined #archiveteam-bs
09:56 🔗 kepler45 has joined #archiveteam-bs
10:05 🔗 kisspunch ugh, there are so many versions of fdupes
10:15 🔗 Honno has quit IRC (Read error: Operation timed out)
10:22 🔗 kisspunch Somebody2: It's the same.
10:28 🔗 atrocity has joined #archiveteam-bs
10:29 🔗 ivan has quit IRC (Leaving)
10:40 🔗 marvinw has joined #archiveteam-bs
10:54 🔗 Mateon1 has quit IRC (Ping timeout: 250 seconds)
11:02 🔗 midas has quit IRC (Read error: Connection reset by peer)
11:03 🔗 midas has joined #archiveteam-bs
11:04 🔗 JAA yipdw: Excellent, will do in a bit.
11:27 🔗 pizzaiolo has joined #archiveteam-bs
12:00 🔗 qw3rty3 has joined #archiveteam-bs
12:25 🔗 wabu has quit IRC (Read error: Operation timed out)
12:28 🔗 Atom has joined #archiveteam-bs
12:35 🔗 wabu has joined #archiveteam-bs
12:43 🔗 BlueMaxim has quit IRC (Quit: Leaving)
13:32 🔗 second Thank you Somebody2
13:33 🔗 second Did anyone by chance download the aws bucket for the imdb data?
13:39 🔗 JAA Since you have to pay S3's exorbitant bandwidth fees (it's a Requester-Pays bucket), I kind of doubt it. I believe IMDB is still working on an HTTP interface without those fees.
13:39 🔗 JAA See: https://getsatisfaction.com/imdb/topics/imdb-data-now-available-in-amazon-s3
13:40 🔗 JAA Them not having the HTTP interface up seems to be the reason why the FTP servers are still online.
13:46 🔗 second Does anyone know where I can find a last.fm dump?
13:47 🔗 Mateon1 has joined #archiveteam-bs
14:12 🔗 Pixi has quit IRC (Quit: Pixi)
14:12 🔗 Pixi has joined #archiveteam-bs
14:14 🔗 icedice has joined #archiveteam-bs
14:17 🔗 qw3rty3 Is there a channel for Amazon Forum archival?
14:37 🔗 sep332 has joined #archiveteam-bs
15:06 🔗 Asparagir has joined #archiveteam-bs
15:16 🔗 icedice has quit IRC (Quit: Leaving)
15:21 🔗 Stiletto has quit IRC (Ping timeout: 260 seconds)
15:28 🔗 ZexaronS- has joined #archiveteam-bs
15:30 🔗 ZexaronS has quit IRC (Ping timeout: 260 seconds)
15:34 🔗 JAA qw3rty3: No, there isn't.
15:58 🔗 Asparagir has quit IRC (Asparagir)
16:15 🔗 schbirid2 re that new order forum. 125$ for a forum that would run on a 5$ host... wtf
16:16 🔗 JAA I don't know the story behind this case, but I've seen similar setups before, and there it was a matter of "never change a running system" mixed with "I'm too lazy to do anything about it".
16:18 🔗 Stilett0 has joined #archiveteam-bs
16:24 🔗 Stilett0 is now known as Stiletto
17:22 🔗 Asparagir has joined #archiveteam-bs
17:39 🔗 pa has joined #archiveteam-bs
17:50 🔗 dd0a13f37 VerifiedJ: that's basically a slower version of pdfcat though, it's not pristine so to speak
17:52 🔗 pa has quit IRC (Quit: pa)
17:54 🔗 pa has joined #archiveteam-bs
17:56 🔗 dd0a13f37 second: a quick google search gives me https://www.demonforums.net/Thread-Last-fm-Dump-Re-upload https://leakninja.com/39243-lastfm-1-8gb-dump-12.html
17:56 🔗 dd0a13f37 oh hey, https://btdig.com/85f39f1d94917d61277725e7da85d8177a5c12eb/
17:57 🔗 dd0a13f37 /last.fm/lastfm.txt.gz
17:59 🔗 dd0a13f37 Any way to upload a torrent larger than 100gb to internetarchive?
18:44 🔗 Stiletto has quit IRC ()
19:16 🔗 Asparagir has quit IRC (Asparagir)
20:07 🔗 schbirid2 has quit IRC (Quit: Leaving)
20:09 🔗 schbirid has joined #archiveteam-bs
20:32 🔗 pa has quit IRC (Quit: pa)
20:33 🔗 pa has joined #archiveteam-bs
20:34 🔗 pa has quit IRC (Client Quit)
20:39 🔗 JAA What's the best way to archive different source code repositories? I know about svnrdump for SVN repos, but what about other software? git, Mercurial, Bazaar, CVS, etc.
20:41 🔗 JAA In particular, what to do if the repository itself is not public but only accessible through a web frontend? (There's an ArchiveBot job currently grabbing a CVSweb instance; that's the immediate trigger for these questions, though I've been wondering about it for longer.)
20:44 🔗 yipdw git clone
20:44 🔗 yipdw etc
20:44 🔗 yipdw github-backup for stuff on github that might have other useful things like issues, wiki pages, etc
20:53 🔗 kisspunch For git clone the harder part is keeping your mirror up to date--the initial clone yeah, git clone works fine
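
[Editor's sketch, not part of the log] One common way to handle the point above about keeping a git mirror up to date is to create it once with `git clone --mirror` and refresh it afterwards with `git remote update --prune`. This is a minimal sketch under that assumption, not a method anyone in the channel described; the helper names `mirror_commands` and `sync_mirror` are hypothetical, and it assumes the `git` command-line tool is installed.

```python
import subprocess
from pathlib import Path


def mirror_commands(url, dest):
    """Build the git command(s) to create or refresh a bare mirror at dest."""
    dest = Path(dest)
    if dest.exists():
        # Mirror already present: fetch all refs and drop ones deleted upstream.
        return [["git", "-C", str(dest), "remote", "update", "--prune"]]
    # First run: create a bare mirror clone with all branches and tags.
    return [["git", "clone", "--mirror", url, str(dest)]]


def sync_mirror(url, dest):
    """Run the commands; raises CalledProcessError if git fails."""
    for cmd in mirror_commands(url, dest):
        subprocess.run(cmd, check=True)
```

Calling `sync_mirror` on a schedule (for example from cron) would keep the local copy current. Note that pruning deletes refs that disappear upstream, so for archival purposes it may be preferable to drop `--prune`.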
21:38 🔗 Stilett0 has joined #archiveteam-bs
22:33 🔗 Asparagir has joined #archiveteam-bs
22:39 🔗 kepler45 has quit IRC (Quit: Leaving)
22:49 🔗 wp494 that guy that I asked to PM me about the whole deal involving NCIX got back to me and he straight up refused despite having PMed someone else already
22:49 🔗 wp494 /shrug
23:15 🔗 BlueMaxim has joined #archiveteam-bs
23:37 🔗 Soni has quit IRC (Ping timeout: 272 seconds)
23:53 🔗 Asparagir has quit IRC (Asparagir)
