Time |
Nickname |
Message |
07:38
🔗
|
SmileyG |
godane: images thread is HUGE |
07:38
🔗
|
SmileyG |
did you get that whole thread? |
07:41
🔗
|
godane |
i'm getting it |
07:42
🔗
|
godane |
i had to redo it cause it was over 4gb |
07:46
🔗
|
SmileyG |
o_O |
07:46
🔗
|
SmileyG |
I think it'll be suitably massive, I can grab a copy if you got the wget code? |
07:47
🔗
|
godane |
i think i can do it |
07:48
🔗
|
godane |
it's past 18k urls and it's not even 1gb yet |
07:48
🔗
|
godane |
i just had to set it up with --warc-file-size=1G a while back |
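A minimal sketch of the kind of wget invocation discussed here, written as a Python helper that only builds the argument list and never runs it. It assumes GNU Wget's documented `--warc-file` and `--warc-max-size` options (the manual lists `--warc-max-size` for splitting WARC output, so the flag name quoted above may be from memory). The URL, the output name, and the function name `build_warc_command` are hypothetical.

```python
import shlex


def build_warc_command(url, warc_name, max_size="1G"):
    """Build (but do not run) a wget command that mirrors `url` into size-capped WARCs."""
    return [
        "wget",
        "--mirror",                     # recursive download with timestamping
        "--page-requisites",            # also fetch images/CSS needed to render pages
        f"--warc-file={warc_name}",     # base name for the WARC output
        f"--warc-max-size={max_size}",  # start a new WARC once this size is reached
        "-o", f"{warc_name}.log",       # write wget's log to a file
        url,
    ]


if __name__ == "__main__":
    # Hypothetical target and name, for illustration only.
    cmd = build_warc_command("http://example.com/community/", "example-community")
    print(shlex.join(cmd))
```

Capping each WARC at a fixed size keeps the individual files small enough to upload one at a time, which matches the reason given later in this log.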
09:10
🔗
|
godane |
so i'm pushing more tech news today episodes |
10:16
🔗
|
omf_ |
vegetables the breakfast of companions |
10:19
🔗
|
Schbirid |
i am not sure why, but i am crawling steam user profiles |
10:19
🔗
|
Schbirid |
and steam does not block |
10:19
🔗
|
omf_ |
find anything interesting |
10:19
🔗
|
Schbirid |
yes, usernames :P |
10:19
🔗
|
Schbirid |
nah, i thought there would be tools to "easily" create graphs but hey, i stumbled into high processing territory instead |
10:20
🔗
|
Schbirid |
but it is fun |
10:22
🔗
|
SmileyG |
gnuploit! |
10:22
🔗
|
SmileyG |
gnuplot! |
10:22
🔗
|
Schbirid |
funny :P |
10:22
🔗
|
SmileyG |
;) |
10:23
🔗
|
omf_ |
try d3 it is pretty user friendly |
10:23
🔗
|
Schbirid |
i meant graph as in "user connections" |
10:23
🔗
|
Schbirid |
should have said that |
10:24
🔗
|
omf_ |
d3 can do directed graphs which is what you are looking for http://bl.ocks.org/mbostock/4062045 |
10:25
🔗
|
Schbirid |
yeah but i have a "bit" more nodes and edges |
10:25
🔗
|
Schbirid |
300k+ so far |
10:25
🔗
|
Schbirid |
gephi is working but slow |
10:25
🔗
|
omf_ |
I doubt that. I loaded 15 million into d3 for a freebase talk |
10:25
🔗
|
omf_ |
and if you are in the billions you need a proper graph database like titan |
10:26
🔗
|
Schbirid |
whoa |
10:26
🔗
|
omf_ |
all a "graph" is, is a directed data structure |
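To make the "directed data structure" remark concrete, here is a small sketch of a directed graph stored as an adjacency mapping. This is one common representation, not necessarily what anyone in the channel used; the function names and the sample user names are made up for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Turn (source, target) pairs into an adjacency mapping: node -> set of successors."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
        graph[target]  # touch the target so nodes without outgoing edges still appear
    return graph


def count_edges(graph):
    """Count directed edges by summing the size of each successor set."""
    return sum(len(targets) for targets in graph.values())


if __name__ == "__main__":
    # Made-up user connections, for illustration only.
    friends = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]
    graph = build_graph(friends)
    print(len(graph), "nodes,", count_edges(graph), "edges")
```

Storing the structure is the easy part; as the following messages point out, laying out hundreds of thousands of nodes for display is where the cost lies.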
10:26
🔗
|
Schbirid |
yeah but the sorting for visual display is hard |
10:26
🔗
|
omf_ |
yep |
10:26
🔗
|
Schbirid |
cant believe d3 does 15 million at once |
10:26
🔗
|
omf_ |
there are no good solutions as of yet. |
10:26
🔗
|
omf_ |
It is limited by memory |
10:27
🔗
|
omf_ |
decades got spent on relational databases and little on graph databases |
10:27
🔗
|
omf_ |
now the best solutions are still pathetic |
10:28
🔗
|
omf_ |
google, facebook, yahoo, ms, twitter all use their own rolled closed solution |
10:32
🔗
|
omf_ |
and yet those same companies minus MS use open source relational databases all the time |
10:34
🔗
|
omf_ |
I feel your pain Schbirid |
10:37
🔗
|
omf_ |
I liken graph databases to lisp, super powerful and not fully understood by most of the programming community |
10:41
🔗
|
godane |
so i got 2012 of glenn beck show uploaded now |
10:41
🔗
|
godane |
also i'm close to getting 2011-09 of tech news today uploaded |
10:48
🔗
|
godane |
now this is very odd: http://web.archive.org/web/*/http://torrentbytes.net |
10:49
🔗
|
godane |
looks like IA has been trying to mirror torrentbytes.net a lot |
10:50
🔗
|
godane |
looks like my stuff is in wayback machine now |
10:54
🔗
|
godane |
looks like the forum_index.php i didn't grab but you get all the forum_viewforum.php here: http://web.archive.org/web/*/http://www.torrentbytes.net/forum_viewforum.php* |
11:22
🔗
|
godane |
the wget log of katproxy.com alone is over 60mb |
12:37
🔗
|
SmileyG |
SketchCow: defcon docu on archive.org yet? |
12:44
🔗
|
SmileyG |
damnit i'm in left korea again ¬_¬ |
12:48
🔗
|
Schbirid |
i am not sure how i feel about non-public forums ending up in the wayback machine |
12:51
🔗
|
ersi |
me either, even though I'm of course thinking of the "expect it to become public, if you share it" mindset as well |
12:52
🔗
|
Schbirid |
well, things can be shared with a selected group of people, in this case the members of a site |
12:52
🔗
|
godane |
i was not thinking it was going to be in wayback this quick |
12:53
🔗
|
ersi |
of course, but any of the parties that has access can make it public or share it along.. so I guess one should consider it public from the get go |
12:53
🔗
|
ersi |
then again, like I said - I'm a bit skeptical as well :) |
12:53
🔗
|
Schbirid |
that's a post-privacy standpoint i vehemently disagree with |
12:53
🔗
|
Schbirid |
it's like saying that any kind of communication should be considered public because the other person can make it public |
12:54
🔗
|
godane |
i only did a panic mirror cause the site may be going down |
12:54
🔗
|
Schbirid |
just because the technology makes it easy does not make it ok |
12:54
🔗
|
Schbirid |
godane: not bashing you, just thinking out loud |
12:55
🔗
|
Schbirid |
hell, any archiving we do is complicated but to me making previously non-public stuff public is more complicated than preserving public content data |
12:55
🔗
|
godane |
also this is funny: http://web.archive.org/web/20130724114835/http://www.torrentbytes.net/robots.txt |
12:55
🔗
|
ersi |
Schbirid: I completely agree with you there, though there is always a risk that one of the other parties, or someone outside the intended audience, reveals the data |
12:56
🔗
|
ersi |
I'm not saying that everyone should consider all communications public as default. I just thought out loud about the inherent 'risk' |
12:56
🔗
|
godane |
the robots disallows everything |
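A short sketch of what "disallows everything" means to a crawler, using Python's standard `urllib.robotparser`. The robots.txt body below is an assumed two-line example of a disallow-all policy, not a copy of the archived file, and the helper name `allowed` and the example URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed example of a disallow-all policy (not the archived file's actual content).
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"


def allowed(robots_txt, url, user_agent="*"):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    print(allowed(DISALLOW_ALL, "http://example.com/forum_index.php"))
```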
12:56
🔗
|
Schbirid |
aye :) |
12:56
🔗
|
Schbirid |
just for the record, my manlihood is huge |
13:03
🔗
|
godane |
also know that torrentbytes.net was getting hit like 5 times a day for some reason by IA |
13:15
🔗
|
godane |
uploaded: https://archive.org/details/katproxy.com-community-20130805 |
14:14
🔗
|
Schbirid |
ok, gephi falls apart with 3 million edges already for me |
14:14
🔗
|
Schbirid |
also it updates the graph display when i do stuff in the fucking ui |
14:15
🔗
|
Schbirid |
which takes many seconds |
14:20
🔗
|
SmileyG |
it's like "private" irc |
14:21
🔗
|
SmileyG |
exactly 1Gb godane for katproxy.com?! |
14:21
🔗
|
SmileyG |
that's... suspicious. |
14:21
🔗
|
omf_ |
he is cutting at 1gb warcs |
14:22
🔗
|
omf_ |
easier for him to upload |
14:22
🔗
|
SmileyG |
So the rest aren't uploaded yet? |
14:22
🔗
|
SmileyG |
ok |
14:23
🔗
|
SmileyG |
how big did it end up? |
14:24
🔗
|
omf_ |
^ lol |
14:24
🔗
|
SmileyG |
omf_: i'm confused, something funny? |
14:25
🔗
|
omf_ |
my brain is dry rotted by the internet and porn |
14:25
🔗
|
SmileyG |
:/ |
14:26
🔗
|
SmileyG |
i need to know if we have it all |
14:26
🔗
|
SmileyG |
if not, I need to fix that. |
14:32
🔗
|
godane |
no, the html of katproxy.com/community/ is about 1gb |
14:37
🔗
|
godane |
the images are about 10gb i think |
14:38
🔗
|
godane |
a lot of the urls are from yuq.me |
14:40
🔗
|
SmileyG |
k |
18:23
🔗
|
godane |
looks like there are more g4 archiving fans: http://g4tvarchive.tumblr.com/ |
18:59
🔗
|
godane |
anyways uploading the katproxy.com community images |
19:09
🔗
|
SmileyG |
good good |
19:43
🔗
|
SketchCow |
Whew |
19:45
🔗
|
SmileyG |
fun times? |
19:46
🔗
|
* |
SmileyG wants to see the docu. :/ |
19:46
🔗
|
SmileyG |
Or are you selling DVD SketchCow ? |
19:51
🔗
|
omf_ |
git tip of the day: git ls-files --other --exclude-standard |
19:52
🔗
|
omf_ |
list only the files not tracked by git |
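The tip above can be wrapped in a small Python helper, sketched here under the assumption that `git` is on the PATH and the given directory is inside a repository. It spells the flag as `--others`, the form listed in the git documentation; the names `UNTRACKED_CMD` and `list_untracked` are hypothetical.

```python
import subprocess

# `--others` lists untracked files; `--exclude-standard` applies the usual ignore rules
# (.gitignore, .git/info/exclude, and the user's global excludes file).
UNTRACKED_CMD = ["git", "ls-files", "--others", "--exclude-standard"]


def list_untracked(repo_dir="."):
    """Return untracked, non-ignored paths in the repository at `repo_dir`."""
    result = subprocess.run(
        UNTRACKED_CMD, cwd=repo_dir, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()
```

Calling `list_untracked()` from inside a working tree returns one path per untracked file; outside a repository, `check=True` makes the call raise `subprocess.CalledProcessError` rather than return an empty list.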
20:50
🔗
|
SketchCow |
I am not |
20:50
🔗
|
SketchCow |
hackerstickers.com |
20:51
🔗
|
SketchCow |
but it's on youtube, piratebay, etc |
20:51
🔗
|
SketchCow |
* done 1365798.3 MB Rate: 912.9 / 0.0 KB Uploaded: 2115262.0 MB |
20:51
🔗
|
SketchCow |
* MESS 0.149 Software List CHDs |
21:16
🔗
|
SketchCow |
Like a boss |
21:30
🔗
|
SmileyG |
that site just broke my eyes D: |
22:31
🔗
|
omf_ |
SmileyG, http://imgur.com/gallery/PLiWoj4 |
23:46
🔗
|
omf_ |
Anyone got an invite for medium.com |