Remote call to Roxie server from Thor - hpcc-ecl

Assuming that we have the end-point of a Roxie server of interest, I was wondering whether it is possible to make a remote call to it from a BWR script on Thor and get the number of nodes that Roxie server has.
The code would probably look like the following:
RoxieServerIP := 'roxie-end-point';
numNodesRoxie := someBuiltInFunctionToGetNodes(RoxieServerIP);
OUTPUT(numNodesRoxie, NAMED('numNodesRoxie'));
I looked into some of the built-in functions to get the number of nodes of a cluster that you are running a process on such as:
OUTPUT(thorlib.wuid());
OUTPUT(thorlib.nodes());
but I haven't seen anything where we can call out to a different server (e.g. Roxie) and get its number of nodes.
Any help would be appreciated!
Thanks

I chatted with the development team today, and the best way to approach what you need is to deploy a query to the remote ROXIE that returns how many nodes it has. In other words, build a "diagnostic" ROXIE query that embeds the nodes() function, and then call it from your other remote location.
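A minimal sketch of such a diagnostic query (the attribute and result names here are illustrative, not a published HPCC example) might be:

```ecl
// Diagnostic query: compile and publish this on the remote Roxie.
// Thorlib.nodes() reports the node count of the cluster the query runs on.
IMPORT Std.System.Thorlib;
OUTPUT(Thorlib.nodes(), NAMED('numNodes'));
```

Once published, you could call it from your Thor BWR script, for example via SOAPCALL against the remote Roxie's WsEcl endpoint, and read back numNodes.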
Hope this helps!
Bob

Related

Hazelcast readiness probe

I am aware that there is a probe available at /hazelcast/health/ready at port 5701.
However, I need to do this programmatically through code, since I am using embedded Hazelcast deployed on a Kubernetes cluster and all communication should pass through the main application (that means Hazelcast cannot expose that endpoint, and making HTTP requests through localhost would not suffice). I tried looking into the documentation but found no help there.
The only thing I found is to use instance.getServer().getPartitionService().isLocalMemberSafe(), but I have no evidence that this is effectively the same as checking the readiness probe.
Any help would be appreciated, thanks!
The exact logic for /ready endpoint is:
node.isRunning() && node.getNodeExtension().isStartCompleted()
I guess you can't use exactly the same logic from code, but fairly good approximations are:
instance.getLifecycleService().isRunning() (the only difference is that it won't wait with being ready for joining other members)
instance.getPartitionService().isClusterSafe() (the difference is that it will wait for all the Hazelcast migration to finish)
You can use either of them. If you want to be really sure that the Hazelcast member can receive traffic when it's ready, then the second option is totally safe.
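As a sketch, a readiness check built on the second approximation could look like this (assuming an embedded HazelcastInstance named instance; this mirrors the approximation above, not the exact /ready logic):

```java
import com.hazelcast.core.HazelcastInstance;

public class ReadinessCheck {
    private final HazelcastInstance instance;

    public ReadinessCheck(HazelcastInstance instance) {
        this.instance = instance;
    }

    // Approximation of /hazelcast/health/ready:
    // the member is running and no partition migrations are in flight.
    public boolean isReady() {
        return instance.getLifecycleService().isRunning()
            && instance.getPartitionService().isClusterSafe();
    }
}
```

Your main application's readiness endpoint could then delegate to isReady() instead of exposing port 5701.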

How can I use multiple dynamic DB with single code/project on run time with node.js

I want to connect to a dynamic Mongo DB from my single codebase according to the sub-domain URL.
eg.
if www.xyz.example.com then mongo DB is xyz
if www.abc.example.com then mongo DB is abc
if www.efg.example.com then mongo DB is efg
If someone hits the www.xyz.example.com URL, the xyz DB should connect automatically; if someone hits www.abc.example.com, the abc DB should connect automatically.
But the xyz DB connection should not be disconnected. It should remain, because there is a single codebase/project.
Please give a solution.
I'm not quite sure about your application's use case, so I cannot promise this is the best solution.
One feasible solution is to run 3 Node.js processes on 3 different ports, each connecting to a specific DB instance. You can do this by running 3 different Node.js processes with different environment variables, then forwarding the requests for each domain to the corresponding port.
This approach has some advantages:
Ease of configuration: you only need to care about deployment settings, with no if/else hacking in the source code.
System availability: if 1 of the 3 DBs is down, only 1 domain is affected; the others still work well.
NOTE: This approach only works well with a small number of sub-domains. If you have 30 sub-domains, or dynamic domains, then please reconsider your deployment architecture :). You may need some more advanced techniques to deal with it. A quick (but not best) way is to maintain a list of mongoose instances inside the application at runtime, each responsible for 1 sub-domain. Then use req.get('host') to check the sub-domain and use the corresponding mongoose instance for the DB operations.
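A rough sketch of that last approach (connectTo is a stand-in for a real driver call such as mongoose.createConnection; the connection URL and the sub-domain-to-DB naming scheme are assumptions):

```javascript
// Cache one DB connection per sub-domain so earlier connections stay open.
const connections = {};

// Derive the DB name from the Host header, e.g. www.xyz.example.com -> "xyz".
function dbNameFromHost(host) {
  return host.replace(/^www\./, '').split('.')[0];
}

// connectTo stands in for e.g. mongoose.createConnection.
function getConnection(host, connectTo) {
  const name = dbNameFromHost(host);
  if (!connections[name]) {
    connections[name] = connectTo('mongodb://localhost:27017/' + name);
  }
  return connections[name];
}
```

In an Express handler you would call getConnection(req.get('host'), mongoose.createConnection) and run the DB operation on the returned connection; repeated requests for the same sub-domain reuse the cached connection instead of disconnecting it.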

Beanstalkd / Pheanstalk security issue

I have just started using beanstalkd and pheanstalk and I am curious whether the following situation is a security issue (and if not, why not?):
When designing a queue that will contain jobs for an eventual worker script to pick up and perform SQL database queries, I asked a friend what I could do to prevent an online user from connecting to port 11300 of my server and inserting a job into the queue himself, thereby causing the job to be executed with malicious code. I was told that I could include a password inside the job being sent.
Though after some time passed, I realized that someone could perform a few simple commands in a terminal, obtain a job from the queue, find the password, and then create jobs with the password included:
telnet thewebsitesipaddress 11300 //creating a telnet connection
list-tubes //finding which tubes are currently being used
use a_tube_found //using one of the tubes found
peek-ready //see what's inside one of the jobs and find the password
What could be done to make sure this does not happen and my queue doesn't get hacked / controlled?
Thanks in advance!
You can avoid those situations by placing beanstalkd behind a firewall or in a private network.
DigitalOcean (for example) offers such a service where you have a private network IP address which can be accessed only from servers of the same location.
We've been using beanstalkd in our company for more than a year, and we haven't had any of those issues yet.
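For example (the trusted address here is illustrative; beanstalkd's -l flag sets the listen address):

```shell
# Bind beanstalkd to localhost only, so it is unreachable from outside:
beanstalkd -l 127.0.0.1 -p 11300

# Or, with iptables, allow port 11300 only from a trusted app server
# (10.0.0.5 is a placeholder for your back-end's private IP):
iptables -A INPUT -p tcp --dport 11300 -s 10.0.0.5 -j ACCEPT
iptables -A INPUT -p tcp --dport 11300 -j DROP
```

Either way, telnet from the public internet can no longer reach the queue, so peek-ready and friends are only available to your own servers.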
I see, but what if the producer was a page called index.php, where, when someone visited it, a job would be sent to the queue? In this situation, wouldn't the server have to be on an open network?
The browser has no way to get in contact with the job server; it only accesses the resources you allow it to, that is, the view page. Only the back-end is allowed to access the job server. Also, if you build the web application in such a way that the front-end is separated from the back-end, you're going to have even fewer potential security issues.

How to connect pinoccio to apache couchdb

Is there anyone using the nice Pinoccio from www.pinocc.io?
I want to use it to post data into an Apache CouchDB using node.js. So I'm trying to poll data from the Pinoccio API, but I'm a little lost whether I should:
schedule the polls
do long polls
do a completely different approach
Any ideas are welcome
Pitt
Sure. I wrote the Pinoccio API, here’s how you do it
https://gist.github.com/soldair/c11d6ae6f4bead140838
This example depends on the pinoccio npm module ~0.1.3 so make sure to npm install again to pick up the newest version.
You don't need to poll, because Pinoccio will send you changes as they happen if you have an open connection to either "stats" or "sync". If you want to poll you can, but it's not "real time".
Sync gives you the current state + streams changes as they happen, so it's perfect if you only need to save the changes to your troop while your script is running, or show the current and last known state on a web page.
The solution that replicates every data point we store is stats. This is the example provided. Stats lets you read everything that has happened to a scout. Digital pins for example are the "digital" report. You can ask for data from a specific point in time or just from the current time (default). Changes to this "digital" report will continue streaming live as they happen, until the "end" time is reached, or if "tail" equals 0 in the options passed to stats.
Hope this helps. I tested the script on my local couch and it worked well. You would need to modify it to copy more stats from each scout. I hope that soon you will be able to request multiple reports from multiple scouts in the same stream; I just have some bugs to sort out ;)
You need to look into 2 dimensions:
node.js talking to CouchDB. This is well understood and there are some questions you can find here.
Getting the data from the pinoccio. The API suggests that as long as the connection is open, you get data. So use a short timeout and a loop. You might want to run your own node.js instance for that.
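As a minimal sketch of the first dimension (stdlib only; the database name, document shape, and local CouchDB address are assumptions), posting a document to CouchDB needs nothing more than Node's http module:

```javascript
// Save a document to a local CouchDB using only Node's stdlib http module.
const http = require('http');

function saveDoc(db, doc, callback) {
  const body = JSON.stringify(doc);
  const req = http.request({
    host: '127.0.0.1',        // assumed local CouchDB
    port: 5984,               // CouchDB's default port
    path: '/' + db,           // POST to /<db> creates a doc with a generated id
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }
  }, res => {
    let data = '';
    res.on('data', chunk => (data += chunk));
    res.on('end', () => callback(null, JSON.parse(data)));
  });
  req.on('error', callback);
  req.end(body);
}
```

With that in place, each change event you receive from the Pinoccio stream can simply be passed to saveDoc.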
Interesting fact: the CouchDB team seems to work on replacing their internal JS engine with node.js

nodeJS multi node Web server

I need to create a multi-node web server that allows controlling the number of nodes in real time and changing the process UID and GID.
For example at start server starts 5 workers and pushes them into workers pool.
When the server gets a new request, it searches for a free worker, sets the UID or GID if needed, and gives it the request to process. If there are no free workers, the server creates a new one, sets its GID or UID, pushes it into the pool, and so on.
Can you suggest me how it can be implemented?
I've tried this example http://nodejs.ru/385 but it doesn't allow controlling the number of workers, so I decided that there must be another solution, but I can't find one.
If you have some examples or links that will help me resolve this issue, please share them.
I guess you are looking for this: http://learnboost.github.com/cluster/
I don't think cluster will do it for you.
What you want is to use one process per request.
Bear in mind that this can be very inefficient, and Node is designed to work around these types of worker processing, but if you really must do it, then you must do it.
On the other hand, Node is very good at handling processes, so you need to keep a process pool, which is easily accomplished using Node's internal child_process.spawn API.
Also, you will need a way for you to communicate to the worker process.
I suggest opening a unix-domain socket and sending the client connection file descriptor, so you can delegate that connection into the new worker.
Also, you will need to handle edge-cases for timeouts, etc.
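A rough sketch of that pool (worker.js is hypothetical; Node's child_process IPC can pass a socket handle via send, and the worker could call process.setuid / process.setgid before handling the request):

```javascript
// Minimal worker pool that hands incoming sockets to child processes.
const { fork } = require('child_process');
const net = require('net');

const pool = [];

function getWorker() {
  // Reuse an idle worker if one exists, otherwise fork a new one.
  const idle = pool.find(w => !w.busy);
  if (idle) return idle;
  const worker = { proc: fork('worker.js'), busy: false };  // worker.js is hypothetical
  worker.proc.on('message', msg => {
    if (msg === 'done') worker.busy = false;  // worker reports back when free
  });
  pool.push(worker);
  return worker;
}

net.createServer(socket => {
  const worker = getWorker();
  worker.busy = true;
  // The second argument sends the actual socket handle over the IPC channel,
  // delegating the client connection to the child process.
  worker.proc.send('connection', socket);
}).listen(3000);
```

The worker side would listen for the 'connection' message, drop privileges with process.setuid/process.setgid if required, handle the socket, and send 'done' back to the parent.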
I use this: https://github.com/pgte/fugue
