Time |
Nickname |
Message |
03:35
|
SketchCow |
When is blip closing? |
03:36
|
chronomex |
nothing on their blog, I think it may be preemptive |
03:46
|
dashcloud |
blip isn't closing- they notified one guy to clear out his videos, and told him it's because they're not interested in that type of content anymore |
03:47
|
dashcloud |
I've sent O'Reilly (the book publisher) a question about it, since they've got multiple blip channels, and they're looking into it |
03:48
|
dashcloud |
in the meantime, me and ivan` are downloading the channels in the list here: http://piratepad.net/R18h7lKV1N |
03:56
|
Famicoman |
what content |
03:56
|
Famicoman |
err, what type of |
04:00
|
dashcloud |
ivan` could tell you more, but this is the tweet he pointed out: https://twitter.com/richhickey/status/280469081512083460 |
04:00
|
Famicoman |
That's very interesting. |
04:08
|
dashcloud |
I hope it's just a big misunderstanding, but in the event it isn't, we'll have copies |
04:09
|
ivan` |
I found another company that got their videos nuked because they were using it to promote themselves |
04:09
|
ivan` |
that is not allowed, apparently |
04:17
|
ivan` |
https://www.facebook.com/notes/tilestack/tilestack-on-youtube/267036710303 |
04:17
|
ivan` |
doesn't sound like they got any warning |
05:21
|
ivan` |
I'm probably blip.tv's #1 user at this point |
05:22
|
ivan` |
found even more channels in my IRC logs |
05:22
|
ivan` |
blip.tv channels, that is |
05:27
|
ivan` |
including "EyeHandy delivers free how to videos performed by attractive female models in an elegant fashion. Discover a sexier, more captivating way to learn." |
05:36
|
chronomex |
sounds good to me |
05:36
|
chronomex |
where do I throw the money |
05:51
|
godane |
so is archive.org down? |
05:52
|
chronomex |
always |
05:53
|
godane |
so i may have 3 episodes of attack of the show uploaded now |
05:54
|
ivan` |
since money for disks is limited, shouldn't we have some guidelines on packing as much culture into the free space as possible? |
05:54
|
chronomex |
disk is cheap |
05:54
|
chronomex |
not free, but shockingly cheap |
05:54
|
ivan` |
well, not cheap enough apparently |
05:56
|
godane |
ivan`: i think stuff like dedup should be used |
05:57
|
godane |
or lastest check to see how much is repeat date |
05:57
|
godane |
*data |
05:58
|
godane |
i'm also thinking of a script that can detect repeat data in warc.gz |
05:58
|
ivan` |
well, repeat data is easily compressed |
05:58
|
ivan` |
with LZMA/LZMA2 especially |
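[Editor's note] A minimal sketch of the point ivan` is making here, using Python's standard lzma module. The payload and its size are made up for illustration: one block of random bytes is compressed once, then compressed again with a second identical copy appended. As long as the repeat falls inside the compressor's dictionary window, the second copy should add very little to the output.

```python
import lzma
import os


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of data after LZMA compression (xz container)."""
    return len(lzma.compress(data, preset=6))


# Hypothetical payload: 1 MiB of random bytes, which do not compress on their own.
block = os.urandom(1 << 20)

single = compressed_size(block)
# The same block stored twice in one stream; the repeat is encoded as
# back-references to the first copy rather than stored again.
doubled = compressed_size(block + block)

print(f"one copy: {single} bytes, two copies: {doubled} bytes")
```

This only works when both copies go through the same compressed stream, which is why a large solid block size matters in the discussion that follows.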
05:59
|
godane |
yes but with dedup you're only storing it once |
06:00
|
godane |
it beats any compression type i think |
06:00
|
godane |
but could just be checked with files of x size |
06:00
|
godane |
like video files |
06:01
|
godane |
that are more than 1gb |
06:01
|
godane |
also stuff that's darked should be checked this way too |
06:02
|
ivan` |
darked? |
06:02
|
godane |
that are not searchable or have no access cause of stuff like copyright |
06:06
|
ivan` |
LZMA2 with a 4GB solid block size is pretty much like whole-filesystem dedup |
06:07
|
ivan` |
ZFS dedup does not actually work for anyone I know |
06:07
|
ivan` |
that other layered filesystem sort of does, I hear |
06:08
|
ivan` |
HTML files will have a lot of partially-repeating content that page-level dedup will not dedup |
06:08
|
chronomex |
that's what compression is for |
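[Editor's note] The script godane has in mind for spotting repeat data could start out as something like the sketch below. It is an assumption-laden illustration, not anyone's actual tool: it expects the record payloads to have been read out of the warc.gz already (by a WARC library, not shown), and the function name, record ids and sample payloads are all hypothetical. It groups records by the SHA-256 digest of their payload and reports how many bytes storing each payload once would save.

```python
import hashlib
from collections import defaultdict


def find_duplicates(records):
    """Group record ids by the SHA-256 digest of their payload.

    `records` is an iterable of (record_id, payload_bytes) pairs.
    Returns (groups of ids sharing a payload, bytes saved by storing each once).
    """
    ids_by_digest = defaultdict(list)
    size_by_digest = {}
    for record_id, payload in records:
        digest = hashlib.sha256(payload).hexdigest()
        ids_by_digest[digest].append(record_id)
        size_by_digest[digest] = len(payload)

    groups = []
    saved = 0
    for digest, ids in ids_by_digest.items():
        if len(ids) > 1:
            groups.append(ids)
            # Every copy after the first is redundant.
            saved += size_by_digest[digest] * (len(ids) - 1)
    return groups, saved


# Made-up sample standing in for payloads extracted from a warc.gz.
sample = [
    ("rec-1", b"<html>same page</html>"),
    ("rec-2", b"<html>other page</html>"),
    ("rec-3", b"<html>same page</html>"),
]
groups, saved = find_duplicates(sample)
print(groups, saved)
```

Whole-payload hashing only catches exact repeats. As ivan` notes above, HTML pages that differ in a few bytes will hash differently, and that partially repeating content is left for compression to handle.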
06:25
|
theJ3STeR |
http://tinyurl.com/MayanUpdateNews |
06:43
|
GLaDOS |
theJ3STeR: http://5z8.info/back-to-africa_i2s3iz_nakedgrandmas.jpg |
06:46
|
chronomex |
I'm not clicking that even though I know it's from shadyurl |
06:46
|
GLaDOS |
it just links to http://pastebin.com/raw.php?i=mCGemjr8 |
06:56
|
theJ3STeR |
Shady:/ |
06:56
|
theJ3STeR |
... |
06:56
|
GLaDOS |
Also hi |
08:39
|
computerm |
Hello, brothers and sisters! |
08:39
|
computerm |
I have a piece of code that may interest you. |
08:39
|
computerm |
(At first - sorry for my english, i am not native speaker...). |
08:39
|
computerm |
I wrote a multiprocessing TPB crawler and a converter from crawler .txt or BTSN .txt ("|" separated) formats to MySQL or SQLite DB. |
08:39
|
computerm |
Link on forum thread: |
08:39
|
computerm |
https://forum.suprbay.org/showthread.php?tid=131515 |
08:39
|
computerm |
More clear post: |
08:39
|
computerm |
https://forum.suprbay.org/showthread.php?tid=131515&pid=817353#pid817353 |
08:39
|
computerm |
github: |
08:39
|
computerm |
https://github.com/computermite/TPBLocalKit |
08:39
|
computerm |
Archive and Perl script on your page http://archiveteam.org/index.php?title=The_Pirate_Bay |
08:39
|
computerm |
very outdated. |
08:39
|
computerm |
And one question: are you interested in co-work? |
08:39
|
computerm |
(Also, for legal reasons, i will not provide any dumps, only code. Working code!) |
08:40
|
chronomex |
what do you mean by co-work? |
08:41
|
computerm |
how to integrate with you? or better to finalize my code as standalone program? |
08:41
|
chronomex |
ahh |
08:42
|
chronomex |
much/all of archiveteam stuff is at https://github.com/archiveteam/ |
08:43
|
computerm |
ok, i will learn it. thanks. When i will have time, i will commit my part. And will help as i can. |
08:47
|
computerm |
Also i can provide one 1mb git mirror in Ex-USSR and one mirror in Canada(40Gb/100Mbit/1Tb-month) for mirroring or testing purposes if you want. |
08:56
|
computerm |
By the way i will be on channel(try to) 24/7. |
08:56
|
ivan` |
can you provide 17TB of disks for a mirror of github? ;) |
08:59
|
computerm |
phh. only 40_______Gb_______. if you got "cloud" structure - i can add it |
09:01
|
computerm |
at my current primary work i got thousands of petabytes, but at private use only 40-60(if upgraded) Gigabytes. |
09:02
|
computerm |
17TB is not so big, but try get it for free... |
09:04
|
computerm |
FYI: my latest archive PBAY + BTSN take 3.5GB with ~17-18 000 000 magnets in MySQL. |
09:04
|
computerm |
uniq magnets, i mean. |
09:05
|
chronomex |
hmmmm interesting |
09:05
|
chronomex |
why can't you share your dumps? |
09:05
|
theJ3STeR |
http://tinyurl.com/TopologyLOG |
09:06
|
chronomex |
theJ3STeR: no short urls in #archiveteam |
09:06
|
chronomex |
final warning |
09:06
|
computerm |
legal. i come to Other World from Ex-Ussr so don't want to go back zzzzzz.. |
09:06
|
chronomex |
computerm: what is the Other World? |
09:07
|
computerm |
Other world is Russia-Ukraine-Belarus. They are scary for IT people. For all people. |
09:08
|
chronomex |
ah, I bet |
09:08
|
chronomex |
you're in the usa now, though? |
09:09
|
computerm |
yeah, but no talks about exact place, ok ;). |
09:12
|
Nemo_bis |
hm http://stores.ebay.it/The-Attic-Bug?_trksid=p4340.l2563 |
09:13
|
computerm |
One more question: does a style guide exist? Example: https://github.com/ArchiveTeam/cityofheroes-grab/blob/master/pipeline.py |
09:13
|
computerm |
Predefined USERAGENT in code - is it ok? |
09:15
|
chronomex |
computerm: your ip address says you're in california |
09:15
|
chronomex |
btw |
09:16
|
computerm |
Why not? |
09:16
|
chronomex |
computerm: you need a style guide? archiveteam prefers code that works :] |
09:16
|
computerm |
ok, understand. |
09:18
|
computerm |
On this weekend i'll try to adapt the code to your style and tools. And for now - thanks for all, i will go sleep. |
09:19
|
chronomex |
ok! goodnight |
09:19
|
chronomex |
we're not very picky about precise style |
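[Editor's note] For readers wondering what the "|"-separated .txt to SQLite conversion computerm describes earlier in this log might involve, here is a rough sketch using only Python's standard library. It is not taken from TPBLocalKit. The three-column layout (id, title, magnet), the table name and the sample rows are assumptions made for illustration, since the log does not say what the real files contain. A UNIQUE constraint on the magnet column keeps only unique magnets, in the spirit of the "uniq magnets" count mentioned above.

```python
import csv
import io
import sqlite3


def load_pipe_separated(lines, conn):
    """Insert '|'-separated rows into SQLite, skipping repeated magnets.

    Assumes a hypothetical three-column layout: id|title|magnet.
    Returns the number of rows in the table afterwards.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS torrents ("
        "id INTEGER, title TEXT, magnet TEXT UNIQUE)"
    )
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    # Keep only well-formed rows; anything else is silently dropped here.
    rows = [tuple(row) for row in reader if len(row) == 3]
    # INSERT OR IGNORE skips rows whose magnet is already present.
    conn.executemany("INSERT OR IGNORE INTO torrents VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM torrents").fetchone()[0]


# Made-up sample input; the third row repeats the first row's magnet.
sample = io.StringIO(
    "1|Example A|magnet:?xt=urn:btih:aaaa\n"
    "2|Example B|magnet:?xt=urn:btih:bbbb\n"
    "3|Example A again|magnet:?xt=urn:btih:aaaa\n"
)
conn = sqlite3.connect(":memory:")
count = load_pipe_separated(sample, conn)
print(count)
```

A real dump with titles containing "|" would need a more careful parser than a plain split on the delimiter.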