#newsgrabber 2017-06-17,Sat




Who What When
*** kyan has quit IRC (Read error: Operation timed out) [01:40]
Aranje has joined #newsgrabber [01:45]
...................................... (idle for 3h6mn)
phuzion has quit IRC (Ping timeout: 600 seconds)
phuzion has joined #newsgrabber
[04:51]
..... (idle for 23mn)
kyan has joined #newsgrabber [05:18]
..................... (idle for 1h40mn)
kyan_ has joined #newsgrabber
kyan has quit IRC (Read error: Operation timed out)
[06:58]
.................... (idle for 1h37mn)
midas1 has joined #newsgrabber [08:36]
................................................. (idle for 4h1mn)
SpaffGarg has quit IRC (Quit: ZNC 1.6.3+deb1 - http://znc.in) [12:37]
........... (idle for 52mn)
<arkiver> was away all day yesterday
time to finish this up now
HCross Kaz ^
jrwr: we could do that, but it's not much different from what we already have
[13:29]
<HCross2> I'm around all day to get this done :) [13:35]
<arkiver> awesome :)
there is http://wikiba.se/
which powers wikidata
but also https://data.droidwiki.org/
which looks the same
wouldn't it be possible to set up our own wiki of this kind?
with wikibase for example
[13:37]
<HCross2> certainly [13:38]
<arkiver> and we'd have only a certain list of properties, like site, seedURL, refreshtime, etc.
I don't have any experience with setting up websites or stuff like that
[13:39]
*** SpaffGarg has joined #newsgrabber [13:43]
<HCross2> so we'd want a mediawiki with that too [13:43]
<arkiver> I guess so
wikibase says it's a mediawiki extension to create something like wikidata
[13:44]
<HCross2> arkiver: http://wiki.newsbuddy.net/dashboard/ :) [13:50]
<arkiver> nice! [13:56]
<HCross2> http://wiki.newsbuddy.net/mediawiki/Main_Page Ta Daa! [13:57]
<arkiver> that was fast...
nice
[14:03]
<HCross2> arkiver: xampp makes it fast as anything
and it helps that the cloud provider I use now doesn't bill me :p
[14:03]
<arkiver> will it be possible to allow people without an account to make edits?
haha :)
it might be nice to try it out first (people without an account)
but if it goes wrong, disable it
though given that this will have only a specific set of options that can be selected for the information to be added
I don't think there will be many spambots that will be able to handle that
[14:04]
<HCross2> I'm going to take nightly snapshots anyway - so if something happens we can simply pull down the previous night's snapshot and restore it [14:05]
<arkiver> sounds good [14:05]
<HCross2> jrwr: are you around? [14:06]
<jrwr> I am
What's up HCross2
[14:08]
<HCross2> Are you any good with mediawiki? [14:09]
<jrwr> I'm very good with it
jrwr Ran PCGamingWiki for a year
[14:10]
<HCross2> Awesome - I've got the NewsGrabber wiki set up [14:10]
<jrwr> Cool
Did you add the wikibase extensions to it
[14:10]
<HCross2> not yet [14:11]
<jrwr> If you want, I can give you my ssh pubkey and I can get her all fixed up [14:11]
<HCross2> yes please [14:11]
<arkiver> new version of the warrior scripts is online
https://github.com/ArchiveTeam/NewsGrabber-Warrior
this has the locally edited version of warcio
final steps are testing uploads for the main machine
and an extra test for the warrior scripts
[14:12]
...... (idle for 29mn)
main server script now ready
cleaning up from previous runs
going to remove all old incoming URLs lists
which might take some time
[14:43]
.... (idle for 16mn)
*** newsbuddy has joined #newsgrabber [15:01]
<newsbuddy> Hello! I've just been (re)started. Follow my newsgrabs in #newsgrabberbot [15:01]
*** newsbuddy has quit IRC (Remote host closed the connection) [15:01]
<arkiver> starting a test run
should be up in a minute
just started main.py
[15:01]
*** newsbuddy has joined #newsgrabber [15:02]
<newsbuddy> Hello! I've just been (re)started. Follow my newsgrabs in #newsgrabberbot [15:02]
*** newsbuddy has quit IRC (Remote host closed the connection)
newsbuddy has joined #newsgrabber
[15:04]
<newsbuddy> Hello! I've just been (re)started. Follow my newsgrabs in #newsgrabberbot [15:06]
*** newsbuddy has quit IRC (Remote host closed the connection)
newsbuddy has joined #newsgrabber
[15:07]
<newsbuddy> Hello! I've just been (re)started. Follow my newsgrabs in #newsgrabberbot [15:08]
*** newsbuddy has quit IRC (Remote host closed the connection) [15:10]
<arkiver> all looking good
we're currently doing 10 URLs/item, btw
I'm thinking of making that 20 or more
we can decide on that later though
[15:11]
*** newsbuddy has joined #newsgrabber [15:11]
<newsbuddy> Hello! I've just been (re)started. Follow my newsgrabs in #newsgrabberbot [15:11]
<HCross2> ok - I'll warm some warriors up [15:13]
<arkiver> ok
I'm still doing some last testing on the warrior
looks like we're missing KazHomeD
[15:13]
<HCross2> Kaz:
kurt:
[15:15]
<Kaz> Hello
Give me a mo
[15:16]
<HCross2> http://176.9.43.188:1337 :) [15:17]
<Kaz> Right, not at a laptop atm and various things have changed with kazhomed
Can we take that out for now please
[15:21]
<HCross2> done [15:24]
<Kaz> Thanks [15:25]
<arkiver> !rs [15:29]
<newsbuddy> arkiver: Refreshing services...
arkiver: Refreshed services.
[15:29]
<HCross2> arkiver: for some reason, Bangalore 1 just decided it wanted to use all of its RAM, disk and CPU at once - did something happen? [15:34]
<arkiver> hmm
I don't think so
[15:34]
<HCross2> arkiver: want me to enable some requests per minute now?
in the tracker
[15:43]
<arkiver> no, not yet
I'll ping you when we can start or have started
[15:43]
*** kyan_ has quit IRC (Remote host closed the connection) [15:55]
<jrwr> arkiver: HCross2 go make an account http://wiki.newsbuddy.net [15:57]
<HCross2> done [15:58]
<jrwr> You are now admin [16:03]
Wikibase client and repo are installed
That was "Interesting" to install
the whole thing is installed to /var/www/html/
[16:11]
<HCross2> I'm going to guess the M247 mirrors weren't being too useful :p [16:12]
<jrwr> na
Wikibase latest is broken
Had to go back to a version that matched the stable install of mediawiki
I'm going to have to research how to use wikibase
it's ... interesting to use
[16:21]
...... (idle for 27mn)
<Kaz> right hello I'm home [16:52]
HCross2: home.kurt.gg should be good to go for discovery [16:59]
<arkiver> jrwr: will make an account in a bit
I'm adding an extra check for warcio for the warrior
[17:12]
<jrwr> Cool
The wiki is installed, secured and cached
[17:13]
<arkiver> :D awesome!
thanks :)
I'll ping you when I've made an account
[17:23]
................. (idle for 1h22mn)
<Kaz> looks like items are going in nicely
but I guess that's a ~4 TB backlog already, forgot how heavy this project was
[18:45]
........ (idle for 38mn)
<HCross2> yup, and my server setup has changed to the point that my only real grabbing capacity is a bandwidth-limited Hetzner [19:24]
<arkiver> we can make this the default project once it's running [19:30]
<jrwr> Still got my OVH box you can abuse
it's got 500 Mbit up
[19:35]
<Kaz> well, at least we're not going to have overload issues like in the past
so we'll have full grabs without failed jobs
[19:37]
<HCross2> yup, fingers crossed the master can hold up
OVH do do some NVMe servers now, but they're £££££
[19:37]
<Kaz> E5v2-SAT-1-64 might be a good choice for a megawarc node.. 25 GB megawarcs let you do the whole thing in RAM
arkiver: do you think we'll start tonight?
[19:39]
<jrwr> if you get an E5v2-SAT-1-64
I can config it
[19:47]
<arkiver> Kaz: I hope so
If it works fine, yes
[19:49]
<Kaz> :D [19:49]
<HCross2> we'll see how the 12TB server works first [19:55]
<jrwr> Cool [19:58]
http://wiki.newsbuddy.net/Item:Q2
I'm working on these
[20:07]
<HCross2> is it a manual thing or are you scripting them? [20:07]
<jrwr> manual for now
to get all the data outlines
there are a TON of pieces I have to put together
[20:08]
<arkiver> It's nice we can set multiple languages for these properties
we might be able to get Chinese or Russian speaking people to contribute
maybe even go to local African sites :D
[20:10]
<jrwr> Yep [20:11]
<HCross2> arkiver: Mark will like this an absolute ton :) [20:11]
<jrwr> It's a very complex system we will be setting up [20:12]
<arkiver> definitely :D
jrwr: not sure
basically people should just add their sites and the data
[20:12]
<jrwr> Right [20:12]
<arkiver> and we'll process a wikidump when we update
arkiver is afk for 2 hours
[20:12]
<jrwr> https://www.wikidata.org/wiki/Property:P31
Here are the docs
https://github.com/wikimedia/mediawiki-extensions-Wikibase/tree/master/docs
[20:14]
*** Smiley has quit IRC (Read error: Connection reset by peer)
Smiley has joined #newsgrabber
[20:18]
.................. (idle for 1h28mn)
<Kaz> HCross2: is kazhomed set up on master? Not sure if it's just working off an old assignment list [21:47]
<HCross2> Not yet, it might be worth you resetting anyway
Just so that there are no old lists
[21:48]
<jrwr> Ok HCross2 I've done some science
http://wiki.newsbuddy.net/index.php?title=ABC_News
check out the edit form (also I nuked the DB, you will need to make another account)
[21:56]
<HCross2> Thank you so much. I've recreated my account [21:58]
<jrwr> So
that's NOT using Wikibase
that's just basic templates and forms
man wikibase was fuck all complex
[21:58]
<Kaz> just restarted kazhomed [22:04]
<jrwr> That's neat, I disabled the edit source form, so it's required to use the form to edit anything in the Services Category [22:05]
<Kaz> hmm, I wonder if we can alter the tracker's job assignment speed automatically
even if it's just an emergency 'kill everything' if master fills up
[22:05]
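Editor's note: Kaz's "emergency kill if master fills up" idea could be sketched roughly as below. This is only an illustration under assumptions: the 90% threshold, the function names `should_pause` and `master_should_pause`, and the checked path are all hypothetical, and how the result would be fed to the tracker is not covered here because the log does not describe the tracker's interface.

```python
import shutil


def should_pause(used_bytes: int, total_bytes: int, max_used_fraction: float = 0.9) -> bool:
    """Return True when disk usage has reached the (assumed) safety threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes >= max_used_fraction


def master_should_pause(path: str = "/", max_used_fraction: float = 0.9) -> bool:
    """Check the filesystem holding `path` (hypothetical location of incoming data)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return should_pause(usage.used, usage.total, max_used_fraction)


if __name__ == "__main__":
    # A cron job or watchdog could run this and stop job assignment when True.
    print("pause job assignment:", master_should_pause("."))
```

Keeping the threshold check separate from the disk lookup makes the decision easy to test without touching a real filesystem.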
<arkiver> That looks nice jrwr [22:10]
<jrwr> Yep
It's very handy
[22:10]
<arkiver> it wouldn't really be possible to create a form with wikidata? [22:10]
<jrwr> Not really [22:11]
<arkiver> nice thing about wikidata is the support for different languages [22:11]
<jrwr> not on this simple of a level
This does as well
[22:11]
<arkiver> especially since we want non-English stuff too
let me create an account
would it be possible to add 'extra' fields?
[22:11]
<jrwr> Yes
Any field you want
[22:11]
<arkiver> I mean fields that are not needed to fill in [22:12]
<jrwr> just edit the Template:Services then Form:Services with settings
Yep
it has two hidden fields now
Status and Log
[22:12]
<arkiver> ah right [22:12]
<jrwr> Status and Log will be auto filled by a wiki bot [22:12]
<arkiver> arkiver creates an account [22:12]
<jrwr> Status is if the changes are approved or not
and Log will show what links were found with the regexes
Pretty much the output of your testing tool
[22:12]
<arkiver> I have an account
it's arkiver
not sure how hard it would be, but can we add https?
Not sure about showing what URLs were found, many changes would have to be made all the time and it's literally thousands of found URLs after some time
[22:15]
<jrwr> Just even some basic examples
some feedback would be nice for the end users
like the first 4 links found per regex
then a bot could go back over every blue moon and see if the regex is still valid
like if now a regex is not returning anything at all
Also permissions were trash for wikidata
the only thing I could do was limit if new props could be made
[22:18]
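Editor's note: the feedback jrwr describes (the first 4 links found per regex, plus a periodic check for regexes that no longer return anything) might look something like the sketch below. The function names, the pattern labels and the example URLs are hypothetical; this is not the actual NewsGrabber testing tool, only an illustration of the idea using Python's standard `re` module.

```python
import re
from typing import Dict, List


def sample_matches(page_text: str, patterns: Dict[str, str], limit: int = 4) -> Dict[str, List[str]]:
    """Collect up to `limit` distinct matches per regex, in order of appearance."""
    results: Dict[str, List[str]] = {}
    for name, pattern in patterns.items():
        found: List[str] = []
        for match in re.finditer(pattern, page_text):
            url = match.group(0)
            if url not in found:
                found.append(url)
            if len(found) == limit:
                break
        results[name] = found
    return results


def stale_patterns(samples: Dict[str, List[str]]) -> List[str]:
    """Names of regexes that matched nothing and may need review."""
    return [name for name, urls in samples.items() if not urls]


if __name__ == "__main__":
    # Made-up page content standing in for a fetched seed URL.
    text = " ".join(f"http://news.example/story/{i}" for i in range(1, 8))
    samples = sample_matches(text, {
        "stories": r"http://news\.example/story/\d+",
        "videos": r"http://news\.example/video/\d+",
    })
    print(samples["stories"])       # the first four story URLs
    print(stale_patterns(samples))  # regexes with no matches
```

Capping the sample at a few links per regex keeps the Log field small, which addresses arkiver's concern about thousands of found URLs.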
<arkiver> I'm editing stuff a little
will write some guides on how to contribute
[22:22]
<jrwr> Ok
We can make more than one form use the same Template
it's under specialpages on create form
[22:22]
<arkiver> Hmm yes [22:23]
<jrwr> So like an "Easy Mode" form [22:23]
<arkiver> can we edit languages, but still keep it to 1 form? [22:23]
<JAA> arkiver: Setting up HTTPS is really trivial with Let's Encrypt. I can help with that if needed. [22:23]
<arkiver> like add Korean
but keep the form without duplicating it
[22:23]
<jrwr> We can copy a form and edit from there [22:23]
<JAA> Unless MediaWiki needs some special attention regarding that. I have no experience whatsoever with wiki software. [22:23]
<jrwr> that's all we can do really
I'm already in progress of TLS
[22:23]
<JAA> Ah sweet [22:24]
<arkiver> this looks really nice [22:26]
<jrwr> Also it's marked as Stable, Wikibase is not (and kind of broken)
I'm making the dhparam key right now
[22:29]
*** Aranje has quit IRC (Three sheets to the wind) [22:30]
<jrwr> oh arkiver there is a default field for that
so it auto fills
instead of placeholder
https://www.mediawiki.org/wiki/Extension:Page_Forms/Defining_forms
https://www.mediawiki.org/wiki/Extension:Page_Forms/Input_types
that's if you are raw editing a form
[22:31]
arkiver: watch your case, its picky
SSL is now in place
I've installed Cargo, this will allow me to do advanced queries on the data
almost like SQL, (hell it stores it as SQL and I can just poll the database)
[22:37]
............. (idle for 1h0mn)
<arkiver> thanks jrwr, having a look at that [23:42]
