#archiveteam 2016-05-28,Sat

Time Nickname Message
00:13 🔗 tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
00:29 🔗 VADemon has quit IRC (Quit: left4dead)
00:42 🔗 tomwsmf-a has joined #archiveteam
01:02 🔗 BlueMaxim has joined #archiveteam
02:12 🔗 Coderjoe has quit IRC (Read error: Connection reset by peer)
02:32 🔗 Coderjoe has joined #archiveteam
02:45 🔗 Start has joined #archiveteam
03:09 🔗 tomwsmf-a has quit IRC (Read error: Operation timed out)
03:17 🔗 philpem has quit IRC (Ping timeout: 260 seconds)
04:09 🔗 RichardG has quit IRC (Ping timeout: 260 seconds)
04:21 🔗 JesseW has joined #archiveteam
04:29 🔗 Sk1d has quit IRC (Ping timeout: 194 seconds)
04:35 🔗 Sk1d has joined #archiveteam
05:00 🔗 JesseW has quit IRC (Ping timeout: 370 seconds)
05:38 🔗 ariscop has quit IRC (Quit: Leaving)
05:40 🔗 ariscop has joined #archiveteam
06:44 🔗 RichardG has joined #archiveteam
07:00 🔗 RichardG has quit IRC (Ping timeout: 370 seconds)
07:31 🔗 Honno has joined #archiveteam
07:56 🔗 philpem has joined #archiveteam
08:19 🔗 schbirid has joined #archiveteam
08:23 🔗 godane has quit IRC (Read error: Operation timed out)
08:36 🔗 godane has joined #archiveteam
09:14 🔗 zhongfu has quit IRC (Remote host closed the connection)
09:16 🔗 zhongfu has joined #archiveteam
09:38 🔗 n3m0 has joined #archiveteam
09:38 🔗 n3m0 is now known as skrp
09:40 🔗 skrp can i get a serious pm. 100T+ bsd over zfs system. 2 Million files. 1 Million+ books. I've been archiving on my own, but would appreciate a merger
10:19 🔗 RichardG has joined #archiveteam
10:31 🔗 bai skrp: https://twitter.com/textfiles/status/736270734033575936
10:33 🔗 Tomcat_ has joined #archiveteam
10:36 🔗 M-davidar has joined #archiveteam
10:41 🔗 M-davidar is now known as davidar_
10:53 🔗 Tomcat__ has joined #archiveteam
10:53 🔗 Tomcat__ has quit IRC (Remote host closed the connection)
10:53 🔗 Tomcat_ has quit IRC (Read error: Operation timed out)
10:58 🔗 Tomcat_ has joined #archiveteam
10:58 🔗 Tomcat_ has quit IRC (Connection closed)
10:58 🔗 Tomcat_ has joined #archiveteam
10:58 🔗 Tomcat_ has quit IRC (Connection closed)
10:59 🔗 Tomcat_ has joined #archiveteam
11:06 🔗 skrp bai: :/ it's hard to take tweeters seriously
11:06 🔗 skrp the days when men were men and birds were birds...
11:12 🔗 skrp *didn't mean to sound sexist. if he is a lonely housewife, it's acceptable
11:13 🔗 ivan skrp: what do you want to merge?
11:15 🔗 ivan are your books from the libgen torrents?
11:32 🔗 skrp ivan: do you already have all the libgen?
11:33 🔗 skrp no they are wildcard actions ive done via torrent/http/ftp
11:35 🔗 skrp ive been working on a c coded archive system that works over zfs. keeping all the files in one pool, naturally deduplicated as each file is named after its $index~$sha256^$filename@$previous_path
11:36 🔗 Tomcat_ has quit IRC (Remote host closed the connection)
11:37 🔗 skrp you add an input source and it recursively extracts everything while maintaining important 'metadata'
11:40 🔗 skrp ripper -t http -i www.pedrk.com -o /uber_dump --index 010518 #this gives it an ultra transient always updated always deduplicated [zfs dedup is a shod]
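The naming scheme skrp describes ($index~$sha256^$filename@$previous_path) can be sketched roughly as follows. This is only an illustration, not skrp's C implementation: the function names `archive_name` and `content_key` are hypothetical, and percent-encoding the previous path (so that slashes can live inside a single file name) is an assumption that the log does not confirm.

```python
import hashlib
from urllib.parse import quote


def archive_name(index: str, data: bytes, filename: str, previous_path: str) -> str:
    """Build a name of the form index~sha256^filename@previous_path."""
    # The SHA-256 of the file content is what makes duplicates detectable.
    digest = hashlib.sha256(data).hexdigest()
    # Assumption: encode "/" and other unsafe characters in the old path
    # so the whole name is valid as one file name in a flat pool.
    safe_path = quote(previous_path, safe="")
    return f"{index}~{digest}^{filename}@{safe_path}"


def content_key(name: str) -> str:
    """Return the sha256 part, which sits between the first '~' and the first '^'."""
    return name.split("~", 1)[1].split("^", 1)[0]


if __name__ == "__main__":
    a = archive_name("010518", b"same bytes", "book.pdf", "/pub/books")
    b = archive_name("010518", b"same bytes", "copy.pdf", "/mirror/misc")
    # Two names with the same content key point at identical content.
    print(content_key(a) == content_key(b))
```

Comparing only the hash component is one way to spot duplicate content even when the file name and original path differ.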
11:48 🔗 BartoCH has quit IRC (Read error: Connection reset by peer)
11:48 🔗 ivan skrp: no, I am lacking libgen
11:48 🔗 ivan I've never worried about dedup because I just dump hundreds of TB into google drive :-)
11:51 🔗 ivan is now known as ivan`
11:51 🔗 skrp ivan: lol i dont trust The Machine. I maintain my own servers with my accounting business funds
11:51 🔗 HCross Downloading a set of subreddits and other sites related to the EU Referendum
11:52 🔗 ivan has joined #archiveteam
11:52 🔗 skrp ivan`: libgen is a very hard to get deal. but if you are willing to deal :D
11:54 🔗 skrp im trying to get into a group that shares my same philosophy "Data gets lost. Storage gets cheaper. So get everything now"
11:56 🔗 Sanqui i think we're your people, even if some of us vary in scale - i barely own a single terabyte of my own data :)
11:57 🔗 BartoCH has joined #archiveteam
11:57 🔗 skrp Sanqui: well with one TB you could back up many htmls
11:58 🔗 Sanqui absolutely - i help archive sites with the archivebot and keep some private material too.
11:58 🔗 skrp the internet is a glass cannon. all universities share the same 'ebsco host' systems which amount to only 200k files.
11:59 🔗 skrp once war hits the internet is bye bye. too insecure
11:59 🔗 skrp so thats why i call myself noah of my bsd zfs ark haha
12:00 🔗 skrp ivan`: the libgen is a lot larger than most ppl think, i suspect the russians also stored steganography information in them as well
12:01 🔗 Sanqui anyway, if you want to speak to somebody serious about your collection, SketchCow's your guy
12:05 🔗 skrp well ill be at bsdcan if anyone else from this group is a demon
12:25 🔗 RichardG_ has joined #archiveteam
12:26 🔗 RichardG has quit IRC (Read error: Connection reset by peer)
12:39 🔗 arkiver RuTracker project is running again.
12:52 🔗 ariscop has quit IRC (Ping timeout: 633 seconds)
13:06 🔗 BlueMaxim has quit IRC (Read error: Operation timed out)
13:07 🔗 Simpbra1 has joined #archiveteam
13:08 🔗 Simpbrain has quit IRC (Read error: Connection reset by peer)
13:18 🔗 voltagex https://launchpad.net/~voltagex/+archive/ubuntu/wget-lua if anyone needs it - rebuilt wget-lua for newer Ubuntus (helpful for EC2 instances)
13:36 🔗 metalcamp has joined #archiveteam
14:02 🔗 metalcamp has quit IRC (Ping timeout: 244 seconds)
14:05 🔗 WinterFox has quit IRC (Remote host closed the connection)
14:09 🔗 arkiver New wikis are added to the wikis project:
14:09 🔗 arkiver battlestarwiki.org
14:09 🔗 arkiver editthis.info
14:09 🔗 arkiver gamepedia.com
14:09 🔗 arkiver miraheze.org
14:09 🔗 arkiver referata.com
14:09 🔗 arkiver wiki-site.com
14:10 🔗 arkiver The lists are taken from the wikiteam project
14:10 🔗 arkiver All external URLs from these sites will be grabbed in the wikis project.
14:10 🔗 luckcolor firng up scripts in a sec
14:10 🔗 luckcolor *firing
14:11 🔗 arkiver Thanks
14:11 🔗 arkiver It seems wiki-site.com is currently all failing, but that will be fixed.
14:12 🔗 arkiver almost all*
14:12 🔗 luckcolor concurrency 4 go!
14:14 🔗 arkiver The grab has been running since November 2015. In November 2016 we will regrab all sites, to fetch new external URLs and refetch changed external URLs.
15:03 🔗 RichardG_ is now known as RichardG
15:23 🔗 tfgbd has quit IRC (Read error: Connection reset by peer)
15:48 🔗 JesseW has joined #archiveteam
16:30 🔗 JesseW has quit IRC (Read error: Operation timed out)
16:42 🔗 Sanqui arkiver: can I just request more wikis to be added?
17:00 🔗 JesseW has joined #archiveteam
17:33 🔗 fie has quit IRC (Read error: Operation timed out)
18:09 🔗 Zinob has joined #archiveteam
18:10 🔗 Zinob WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
18:10 🔗 Zinob ... oh ok, nvm. Just keep an eye on OpenCores.org?
18:11 🔗 luckcolor it's yahoosucks
18:12 🔗 JesseW has quit IRC (Read error: Operation timed out)
18:12 🔗 luckcolor Zinob: what's the status of that website?
18:12 🔗 Zei-Pii has joined #archiveteam
18:14 🔗 Zinob luckcolor: It is all fine! But the company that hosts the site's biggest customer ran into problems.
18:15 🔗 luckcolor If you want i can schedule a grab on #archivebot
18:17 🔗 luckcolor *Zinob:
18:17 🔗 Zinob Either that site will get more love now that they have less to do with the other big company OR it might go belly-up...
18:17 🔗 Zinob So yeah.. a grab might be in order.
18:18 🔗 luckcolor Ok join the channel, in the topic there's the link to check the progress of the crawl
18:18 🔗 Zinob I don't really care for the site personally, but it is for FPGA design what SourceForge is for the GPL community
18:19 🔗 luckcolor Ok it's going
18:21 🔗 Zinob Nice
18:21 🔗 luckcolor Let me know if you need anything else :p
18:21 🔗 Zinob I usually pester a friend that is in the archive-team but i thought that i could check for myself for once :)
18:22 🔗 Zinob Great stuff, Keep the good work up. Your Wikipedia Archives have helped me a few times
19:18 🔗 schbirid has quit IRC (Quit: Leaving)
19:36 🔗 Jeroen52 has quit IRC (Ping timeout: 260 seconds)
19:56 🔗 Jeroen52 has joined #archiveteam
20:08 🔗 zino has quit IRC (Read error: Operation timed out)
20:09 🔗 tomwsmf-a has joined #archiveteam
20:22 🔗 tfgbd_znc has joined #archiveteam
20:27 🔗 tfgbd_znc has quit IRC (Read error: Connection reset by peer)
20:41 🔗 Aranje has joined #archiveteam
21:17 🔗 VADemon has joined #archiveteam
21:31 🔗 Zei-Pii has quit IRC (Read error: Connection reset by peer)
21:34 🔗 maseck has quit IRC (Remote host closed the connection)
21:40 🔗 maseck has joined #archiveteam
22:47 🔗 arkiver Sanqui: always!
22:47 🔗 arkiver For now only mediawikis are supported
22:47 🔗 Sanqui is there a formal way, or should I just put them here?
22:47 🔗 arkiver What an item looks like:
22:49 🔗 arkiver for example, mediawikieu:bulpedia.wikia.com/api.php:bulpedia.wikia.com/wiki/
22:49 🔗 arkiver 'eu' in mediawikieu means 'external urls'
22:50 🔗 arkiver bulpedia.wikia.com/api.php is the location of the api.php
22:50 🔗 arkiver bulpedia.wikia.com/wiki/ is the prefix for the articles
22:50 🔗 arkiver for example, the above wiki has a page http://bulpedia.wikia.com/wiki/Jokes
22:50 🔗 arkiver so the prefix is bulpedia.wikia.com/wiki/
22:51 🔗 arkiver it is different for different wikis
22:51 🔗 arkiver if you have all that, it can be added to the warrior grab
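The item format arkiver describes (type, api.php location, article prefix, joined with colons) can be sketched like this. It is a minimal illustration, not the project's actual tracker code: the names `WikiItem`, `parse_item` and `build_item` are hypothetical, and the sketch assumes neither location contains a colon of its own (no scheme or port), as in the example above.

```python
from typing import NamedTuple


class WikiItem(NamedTuple):
    item_type: str       # e.g. "mediawikieu" ("eu" = external urls)
    api_location: str    # where api.php lives, without a scheme
    article_prefix: str  # prefix shared by all article URLs


def parse_item(item: str) -> WikiItem:
    """Split an item string into its three colon-separated parts."""
    parts = item.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected type:api:prefix, got {item!r}")
    return WikiItem(*parts)


def build_item(api_location: str, article_prefix: str,
               item_type: str = "mediawikieu") -> str:
    """Join the parts back into an item string."""
    return f"{item_type}:{api_location}:{article_prefix}"


if __name__ == "__main__":
    item = parse_item(
        "mediawikieu:bulpedia.wikia.com/api.php:bulpedia.wikia.com/wiki/"
    )
    print(item.api_location)
    print(item.article_prefix)
```

For a new wiki, the two values to look up are the api.php location and the article prefix; `build_item` then produces a string in the same shape as the example.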
23:03 🔗 FalconK has quit IRC (Ping timeout: 260 seconds)
23:04 🔗 BlueMaxim has joined #archiveteam
23:05 🔗 Sanqui arkiver: yeah, I can get that.
23:17 🔗 FalconK has joined #archiveteam
23:49 🔗 ariscop has joined #archiveteam
