#archiveteam-bs 2013-08-06,Tue

Time Nickname Message
07:38 🔗 SmileyG godane: images thread is HUGE.
07:38 🔗 SmileyG did you get that whole thread?
07:41 🔗 godane i'm getting it
07:42 🔗 godane i had to redo it cause it was over 4gb
07:46 🔗 SmileyG o_O
07:46 🔗 SmileyG I think it'll be suitably massive, I can grab a copy if you got the wget code?
07:47 🔗 godane i think i can do it
07:48 🔗 godane it's past 18k urls and it's not even 1gb yet
07:48 🔗 godane i just had to set it up with --warc-max-size=1G a while back
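A minimal sketch of a size-capped WARC crawl like the one described above, written in Python. It assumes a GNU Wget build with WARC support, where the documented option for capping each WARC file is `--warc-max-size`. The URL, the output prefix, and the helper name `build_wget_warc_command` are placeholders for illustration, not the command godane actually ran.

```python
import shlex


def build_wget_warc_command(url, warc_prefix, max_size="1G"):
    """Return an argv list for a recursive wget crawl that writes size-capped WARCs.

    The flag selection here is an assumption for illustration only.
    """
    return [
        "wget",
        "--recursive",                    # follow links below the start URL
        "--page-requisites",              # also fetch images, CSS, etc.
        f"--warc-file={warc_prefix}",     # base name for the WARC output
        f"--warc-max-size={max_size}",    # start a new WARC file at this size
        url,
    ]


if __name__ == "__main__":
    # Hypothetical target and prefix; print the command instead of running it.
    cmd = build_wget_warc_command("http://example.com/forum/", "example-forum")
    print(shlex.join(cmd))
```

Capping each file keeps any single WARC small enough to upload or retry without redoing the whole crawl.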
09:10 🔗 godane so i'm pushing more tech news today episodes
10:16 🔗 omf_ vegetables the breakfast of companions
10:19 🔗 Schbirid i am not sure why, but i am crawling steam user profiles
10:19 🔗 Schbirid and steam does not block
10:19 🔗 omf_ find anything interesting
10:19 🔗 Schbirid yes, usernames :P
10:19 🔗 Schbirid nah, i thought there would be tools to "easily" create graphs but hey, i stumbled into high processing territory instead
10:20 🔗 Schbirid but it is fun
10:22 🔗 SmileyG gnuploit!
10:22 🔗 SmileyG gnuplot!
10:22 🔗 Schbirid funny :P
10:22 🔗 SmileyG ;)
10:23 🔗 omf_ try d3 it is pretty user friendly
10:23 🔗 Schbirid i meant graph as in "user connections"
10:23 🔗 Schbirid should have said that
10:24 🔗 omf_ d3 can do direct graphs which is what you are looking for http://bl.ocks.org/mbostock/4062045
10:25 🔗 Schbirid yeah but i have a "bit" more nodes and edges
10:25 🔗 Schbirid 300k+ so far
10:25 🔗 Schbirid gephi is working but slow
10:25 🔗 omf_ I doubt that. I loaded 15 million into d3 for a freebase talk
10:25 🔗 omf_ and if you are in the billions you need a proper graph database like titan
10:26 🔗 Schbirid whoa
10:26 🔗 omf_ all a "graph" is, is a directed data structure
10:26 🔗 Schbirid yeah but the sorting for visual display is hard
10:26 🔗 omf_ yep
10:26 🔗 Schbirid cant believe d3 does 15 million at once
10:26 🔗 omf_ there are no good solutions as of yet.
10:26 🔗 omf_ It is limited by memory
10:27 🔗 omf_ decades got spent on relational databases and little on graph databases
10:27 🔗 omf_ now the best solutions are still pathetic
10:28 🔗 omf_ google, facebook, yahoo, ms, twitter all use their own rolled closed solution
10:32 🔗 omf_ and yet those same companies minus MS use open source relational databases all the time
10:34 🔗 omf_ I feel your pain Schbirid
10:37 🔗 omf_ I liken graph databases to lisp, super powerful and not fully understood by most of the programming community
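The "user connections" graph discussed above can be sketched as a plain adjacency structure. This is only an illustration of the data structure, assuming the crawl yields (source, target) pairs; the sample names and the helpers `build_adjacency` and `out_degrees` are hypothetical and say nothing about how Schbirid actually stored the Steam data.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each node to the set of nodes it points to (a directed graph)."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target]  # touch the key so nodes with no outgoing edges still appear
    return adjacency


def out_degrees(adjacency):
    """Count outgoing edges per node."""
    return {node: len(targets) for node, targets in adjacency.items()}


if __name__ == "__main__":
    # Made-up sample edges standing in for "user A lists user B as a friend".
    friend_edges = [("alice", "bob"), ("alice", "carol"), ("bob", "carol")]
    graph = build_adjacency(friend_edges)
    for node, degree in sorted(out_degrees(graph).items()):
        print(node, degree)
```

Holding the edges like this is cheap; as the conversation notes, the expensive part is computing a layout for visual display.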
10:41 🔗 godane so i got 2012 of glenn beck show uploaded now
10:41 🔗 godane also i'm close to getting 2011-09 of tech news today uploaded
10:48 🔗 godane now this is very odd: http://web.archive.org/web/*/http://torrentbytes.net
10:49 🔗 godane looks like IA has been trying to mirror torrentbytes.net a lot
10:50 🔗 godane looks like my stuff is in wayback machine now
10:54 🔗 godane looks like the forum_index.php i didn't grab but you get all the forum_viewforum.php here: http://web.archive.org/web/*/http://www.torrentbytes.net/forum_viewforum.php*
11:22 🔗 godane the wget log of katproxy.com alone is over 60mb
12:37 🔗 SmileyG SketchCow: defcon docu on archive.org yet?
12:44 🔗 SmileyG damnit i'm in left korea again ¬_¬
12:48 🔗 Schbirid i am not sure how i feel about non-public forums ending up in the wayback machine
12:51 🔗 ersi me either, even though I'm of course thinking of the "expect it to become public, if you share it" mindset as well
12:52 🔗 Schbirid well, things can be shared with a selected group of people, in this case the members of a site
12:52 🔗 godane i was not thinking it was going to be in wayback this quick
12:53 🔗 ersi of course, but any of the parties that has access can make it public or share it along.. so I guess one should consider it public from the get go
12:53 🔗 ersi then again, like I said - I'm a bit skeptical as well :)
12:53 🔗 Schbirid that's a post-privacy standpoint i vehemently disagree with
12:53 🔗 Schbirid it's like saying that any kind of communication should be considered public because the other person can make it public
12:54 🔗 godane i only did a panic mirror cause the site may be going down
12:54 🔗 Schbirid just because the technology makes it easy does not make it ok
12:54 🔗 Schbirid godane: not bashing you, just thinking out loud
12:55 🔗 Schbirid hell, any archiving we do is complicated but to me making previously non-public stuff public is more complicated than preserving public content data
12:55 🔗 godane also this is funny: http://web.archive.org/web/20130724114835/http://www.torrentbytes.net/robots.txt
12:55 🔗 ersi Schbirid: I completely agree with you there, though it is always a risk that either the parties that aren't you or aren't the intended audience reveals the data
12:56 🔗 ersi I'm not saying that everyone should consider all communications public as default. I just thought out loud about the inherent 'risk'
12:56 🔗 godane the robots disallows everything
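What "disallows everything" means for a crawler can be checked with Python's standard `urllib.robotparser`. The log does not quote the archived file, so the two-line `DISALLOW_ALL` text below is an assumed, typical deny-all robots.txt rather than the site's actual contents, and `is_allowed` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Assumed deny-all rules; not copied from the archived robots.txt.
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(DISALLOW_ALL, "http://www.torrentbytes.net/forum_viewforum.php"))
```

A crawler that honors these rules would skip every path on the site.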
12:56 🔗 Schbirid aye :)
12:56 🔗 Schbirid just for the record, my manlihood is huge
13:03 🔗 godane also know that torrentbytes.net was getting hit like 5 times a day for some reason by IA
13:15 🔗 godane uploaded: https://archive.org/details/katproxy.com-community-20130805
14:14 🔗 Schbirid ok, gephi falls apart with 3 million edges already for me
14:14 🔗 Schbirid also it updates the graph display when i do stuff in the fucking ui
14:15 🔗 Schbirid which takes many seconds
14:20 🔗 SmileyG it's like "private" irc
14:21 🔗 SmileyG exactly 1Gb godane for katproxy.com?!
14:21 🔗 SmileyG that's... suspicious.
14:21 🔗 omf_ he is cutting at 1gb warcs
14:22 🔗 omf_ easier for him to upload
14:22 🔗 SmileyG So the rest aren't uploaded yet?
14:22 🔗 SmileyG ok
14:23 🔗 SmileyG how big did it end up?
14:24 🔗 omf_ ^ lol
14:24 🔗 SmileyG omf_: i'm confused, something funny?
14:25 🔗 omf_ my brain is dry rotted by the internet and porn
14:25 🔗 SmileyG :/
14:26 🔗 SmileyG i need to know if we have it all
14:26 🔗 SmileyG if not, I need to fix that.
14:32 🔗 godane no, the html of katproxy.com/community/ is about 1gb
14:37 🔗 godane the images are about 10gb i think
14:38 🔗 godane a lot of the urls are from yuq.me
14:40 🔗 SmileyG k
18:23 🔗 godane looks like there is more g4 archiving fans: http://g4tvarchive.tumblr.com/
18:59 🔗 godane anyways uploading the katproxy.com community images
19:09 🔗 SmileyG good good
19:43 🔗 SketchCow Whew
19:45 🔗 SmileyG fun times?
19:46 🔗 * SmileyG wants to see the docu. :/
19:46 🔗 SmileyG Or are you selling DVD SketchCow ?
19:51 🔗 omf_ git tip of the day: git ls-files --other --exclude-standard
19:52 🔗 omf_ list only the files not tracked by git
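The git tip above can be wrapped in a short Python helper. This is a sketch that assumes `git` is on the PATH and uses the documented long spelling `--others`; the names `GIT_UNTRACKED_CMD`, `parse_ls_files`, and `untracked_files` are made up for illustration.

```python
import subprocess

# Untracked files only, with the usual ignore rules (.gitignore etc.) applied.
GIT_UNTRACKED_CMD = ["git", "ls-files", "--others", "--exclude-standard"]


def parse_ls_files(output):
    """Split git ls-files output into a list of paths, dropping empty lines."""
    return [line for line in output.splitlines() if line]


def untracked_files(repo_dir="."):
    """Run git in repo_dir and return the paths it reports as untracked."""
    result = subprocess.run(
        GIT_UNTRACKED_CMD,
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git exits non-zero, e.g. outside a repository
    )
    return parse_ls_files(result.stdout)
```

Calling `untracked_files()` inside a working tree returns the same paths the command prints, one list entry per file.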
20:50 🔗 SketchCow I am not
20:50 🔗 SketchCow hackerstickers.com
20:51 🔗 SketchCow but it's on youtube, piratebay, etc
20:51 🔗 SketchCow * done 1365798.3 MB Rate: 912.9 / 0.0 KB Uploaded: 2115262.0 MB
20:51 🔗 SketchCow * MESS 0.149 Software List CHDs
21:16 🔗 SketchCow Like a boss
21:30 🔗 SmileyG that site just broke my eyes D:
22:31 🔗 omf_ SmileyG, http://imgur.com/gallery/PLiWoj4
23:46 🔗 omf_ Anyone got an invite for medium.com
