#archiveteam 2012-12-19,Wed


Time Nickname Message
03:35 🔗 SketchCow When is blip closing?
03:36 🔗 chronomex nothing on their blog, I think it may be preemptive
03:46 🔗 dashcloud blip isn't closing- they notified one guy to clear out his videos, and told him it's because they're not interested in that type of content anymore
03:47 🔗 dashcloud I've sent O'Reilly (the book publisher) a question about it, since they've got multiple blip channels, and they're looking into it
03:48 🔗 dashcloud in the meantime, me and ivan` are downloading the channels in the list here: http://piratepad.net/R18h7lKV1N
03:56 🔗 Famicoman what content
03:56 🔗 Famicoman err, what type of
04:00 🔗 dashcloud ivan` could tell you more, but this is the tweet he pointed out: https://twitter.com/richhickey/status/280469081512083460
04:00 🔗 Famicoman That's very interesting.
04:08 🔗 dashcloud I hope it's just a big misunderstanding, but in the event it isn't, we'll have copies
04:09 🔗 ivan` I found another company that got their videos nuked because they were using it to promote themselves
04:09 🔗 ivan` that is not allowed, apparently
04:17 🔗 ivan` https://www.facebook.com/notes/tilestack/tilestack-on-youtube/267036710303
04:17 🔗 ivan` doesn't sound like they got any warning
05:21 🔗 ivan` I'm probably blip.tv's #1 user at this point
05:22 🔗 ivan` found even more channels in my IRC logs
05:22 🔗 ivan` blip.tv channels, that is
05:27 🔗 ivan` including "EyeHandy delivers free how to videos performed by attractive female models in an elegant fashion. Discover a sexier, more captivating way to learn."
05:36 🔗 chronomex sounds good to me
05:36 🔗 chronomex where do I throw the money
05:51 🔗 godane so is archive.org down?
05:52 🔗 chronomex always
05:53 🔗 godane so i may have 3 episodes of attack of the show uploaded now
05:54 🔗 ivan` since money for disks is limited, shouldn't we have some guidelines on packing as much culture into the free space as possible?
05:54 🔗 chronomex disk is cheap
05:54 🔗 chronomex not free, but shockingly cheap
05:54 🔗 ivan` well, not cheap enough apparently
05:56 🔗 godane ivan`: i think stuff like dedup should be used
05:57 🔗 godane or at least check to see how much is repeat data
05:58 🔗 godane i'm also thinking of a script that can detect repeat data in warc.gz
05:58 🔗 ivan` well, repeat data is easily compressed
05:58 🔗 ivan` with LZMA/LZMA2 especially
05:59 🔗 godane yes but with dedup you're only storing it once
06:00 🔗 godane it beats any compression type i think
06:00 🔗 godane but it could just be checked for files of x size
06:00 🔗 godane like video files
06:01 🔗 godane that are more than 1 GB
06:01 🔗 godane also stuff that's darked should be checked this way too
06:02 🔗 ivan` darked?
06:02 🔗 godane that are not searchable or have no access cause of stuff like copyright
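The file-level dedup godane describes (only checking files above some size, such as videos over 1 GB) might be sketched roughly as below. This is an illustration only, not a script anyone in the channel wrote: the function names `sha256_of_file` and `find_duplicate_files` and the `min_size` parameter are hypothetical, and it assumes plain files on a local filesystem rather than items inside an archive.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large videos never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_files(root, min_size=1 << 30):
    """Return groups of byte-identical files under root that are >= min_size.

    min_size defaults to 1 GiB (an assumed threshold, matching the
    "more than 1 GB" idea in the conversation).
    """
    # Pass 1: group candidates by size; files of different size cannot match.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size >= min_size:
                by_size[size].append(path)

    # Pass 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[sha256_of_file(path)].append(path)

    return sorted(sorted(paths) for paths in by_hash.values() if len(paths) > 1)
```

Grouping by size first means most large files are never read at all, which matters when every candidate is over a gigabyte.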
06:06 🔗 ivan` LZMA2 with a 4GB solid block size is pretty much like whole-filesystem dedup
06:07 🔗 ivan` ZFS dedup does not actually work for anyone I know
06:07 🔗 ivan` that other layered filesystem sort of does, I hear
06:08 🔗 ivan` HTML files will have a lot of partially-repeating content that page-level dedup will not dedup
06:08 🔗 chronomex that's what compression is for
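ivan`'s point that repeated data compresses away under LZMA can be checked with a small experiment. The sketch below uses Python's standard `lzma` module at preset 6, whose dictionary is 8 MiB, so it only finds repeats that lie within 8 MiB of each other; it does not reproduce the 4 GB solid block mentioned above. The helper name `repeat_cost` is hypothetical.

```python
import lzma
import os


def repeat_cost(block):
    """Return (size of block compressed once, size of block+block compressed).

    If the second copy is found inside the LZMA dictionary, the second
    number should be only slightly larger than the first.
    """
    once = len(lzma.compress(block, preset=6))
    twice = len(lzma.compress(block + block, preset=6))
    return once, twice


if __name__ == "__main__":
    # Random bytes are incompressible on their own, so any saving on the
    # doubled input comes purely from matching the repeated copy.
    block = os.urandom(64 * 1024)
    once, twice = repeat_cost(block)
    print("one copy:  ", once, "bytes")
    print("two copies:", twice, "bytes")
```

Unlike whole-file dedup, this kind of matching also catches partially repeating content, such as HTML pages that share a template, as long as the repeats fall inside the dictionary window.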
06:25 🔗 theJ3STeR http://tinyurl.com/MayanUpdateNews
06:43 🔗 GLaDOS theJ3STeR: http://5z8.info/back-to-africa_i2s3iz_nakedgrandmas.jpg
06:46 🔗 chronomex I'm not clicking that even though I know it's from shadyurl
06:46 🔗 GLaDOS it just links to http://pastebin.com/raw.php?i=mCGemjr8
06:56 🔗 theJ3STeR Shady:/
06:56 🔗 theJ3STeR ...
06:56 🔗 GLaDOS Also hi
08:39 🔗 computerm Hello, brothers and sisters!
08:39 🔗 computerm I have a piece of code that may interest you.
08:39 🔗 computerm (First of all, sorry for my English, I am not a native speaker...)
08:39 🔗 computerm I wrote a multiprocessing TPB crawler and a converter from the crawler .txt or BTSN .txt ("|" separated) formats to a MySQL or SQLite DB.
08:39 🔗 computerm Link on forum thread:
08:39 🔗 computerm https://forum.suprbay.org/showthread.php?tid=131515
08:39 🔗 computerm More clear post:
08:39 🔗 computerm https://forum.suprbay.org/showthread.php?tid=131515&pid=817353#pid817353
08:39 🔗 computerm github:
08:39 🔗 computerm https://github.com/computermite/TPBLocalKit
08:39 🔗 computerm The archive and Perl script on your page http://archiveteam.org/index.php?title=The_Pirate_Bay
08:39 🔗 computerm are very outdated.
08:39 🔗 computerm And one question: are you interested in co-work?
08:39 🔗 computerm (Also, for legal reasons, i will not provide any dumps, only code. Working code!)
08:40 🔗 chronomex what do you mean by co-work?
08:41 🔗 computerm how to integrate with you? or better to finalize my code as standalone program?
08:41 🔗 chronomex ahh
08:42 🔗 chronomex much/all of archiveteam stuff is at https://github.com/archiveteam/
08:43 🔗 computerm ok, i will learn it. thanks. When i will have time, i will commit my part. And will help as i can.
08:47 🔗 computerm Also i can provide one 1mb git mirror in Ex-USSR and one mirror in Canada(40Gb/100Mbit/1Tb-month) for mirroring or testing purposes if you want.
08:56 🔗 computerm By the way i will be on channel(try to) 24/7.
08:56 🔗 ivan` can you provide 17TB of disks for a mirror of github? ;)
08:59 🔗 computerm phh. only 40_______Gb_______. if you got "cloud" structure - i can add it
09:01 🔗 computerm at my current primary work i got thousands of petabytes, but at private use only 40-60(if upgraded) Gigabytes.
09:02 🔗 computerm 17TB is not so big, but try get it for free...
09:04 🔗 computerm FYI: my latest archive of PBAY + BTSN takes 3.5GB with ~17-18 million magnets in MySQL.
09:04 🔗 computerm unique magnets, I mean.
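A converter of the kind computerm describes, loading "|"-separated text into SQLite while keeping only unique magnets, could look something like the sketch below. It is not taken from TPBLocalKit. The three-field layout `name|size_bytes|info_hash`, the table name `torrents`, and the function `load_magnets` are all assumptions made for illustration; the real crawler and BTSN formats may differ.

```python
import sqlite3


def load_magnets(lines, conn):
    """Insert 'name|size_bytes|info_hash' lines, skipping duplicate hashes.

    The field layout is hypothetical. Returns the number of new rows.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS torrents ("
        " info_hash TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL)"
    )
    inserted = 0
    for line in lines:
        # Split from the right so a '|' inside the name does not shift fields.
        parts = line.rstrip("\r\n").rsplit("|", 2)
        if len(parts) != 3 or not parts[1].isdigit():
            continue  # skip malformed lines instead of aborting the import
        name, size, info_hash = parts
        cursor = conn.execute(
            "INSERT OR IGNORE INTO torrents (info_hash, name, size_bytes)"
            " VALUES (?, ?, ?)",
            (info_hash.lower(), name, int(size)),
        )
        inserted += cursor.rowcount  # 0 when the hash was already present
    conn.commit()
    return inserted


if __name__ == "__main__":
    sample = [
        "Some Linux ISO|734003200|ABCDEF0123",
        "Same ISO, other listing|734003200|abcdef0123",
        "Another file|1048576|0123456789",
    ]
    connection = sqlite3.connect(":memory:")
    print(load_magnets(sample, connection), "unique magnets stored")
```

Making the info hash the primary key lets the database enforce uniqueness, so merging two overlapping dumps needs no separate dedup pass.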
09:05 🔗 chronomex hmmmm interesting
09:05 🔗 chronomex why can't you share your dumps?
09:05 🔗 theJ3STeR http://tinyurl.com/TopologyLOG
09:06 🔗 chronomex theJ3STeR: no short urls in #archiveteam
09:06 🔗 chronomex final warning
09:06 🔗 computerm legal. i come to Other World from Ex-Ussr so don't want to go back zzzzzz..
09:06 🔗 chronomex computerm: what is the Other World?
09:07 🔗 computerm Other world is Russia-Ukraine-Belarus. They are scary for IT people. For all people.
09:08 🔗 chronomex ah, I bet
09:08 🔗 chronomex you're in the usa now, though?
09:09 🔗 computerm yeah, but no talks about exact place, ok ;).
09:12 🔗 Nemo_bis hm http://stores.ebay.it/The-Attic-Bug?_trksid=p4340.l2563
09:13 🔗 computerm One more question: does a style guide exist? Example: https://github.com/ArchiveTeam/cityofheroes-grab/blob/master/pipeline.py
09:13 🔗 computerm A predefined USERAGENT in the code - is that ok?
09:15 🔗 chronomex computerm: your ip address says you're in california
09:15 🔗 chronomex btw
09:16 🔗 computerm Why not?
09:16 🔗 chronomex computerm: you need a style guide? archiveteam prefers code that works :]
09:16 🔗 computerm ok, understand.
09:18 🔗 computerm This weekend I'll try to adapt the code to your style and tools. And for now - thanks for everything, I'm going to sleep.
09:19 🔗 chronomex ok! goodnight
09:19 🔗 chronomex we're not very picky about precise style
