VoltDB - Writing your own client or using JSON HTTP Interface

I am a bit confused about how I should perform my operations using VoltDB. There are two choices:
Run the VoltDB server, create a connection from a client, and call the required procedure.
Use the JSON HTTP Interface provided by VoltDB itself.
I have several applications that need to access the data stored in VoltDB, so I was writing code to connect and call the required procedures. Later, when I read about the JSON HTTP Interface provided by VoltDB, I realized that the data can be accessed over HTTP APIs without connecting each application to VoltDB directly.
Now I am confused about which method I should choose and why.
I am pretty much in favor of using the HTTP APIs provided by VoltDB, but what are the implications of that?

Well, the answer is pretty straightforward.
If you have a situation where low latency is a high priority, for example:
storing/processing real-time data at a high rate of transactions per second,
high insertion rates,
high data query rates,
then using a real client will typically be the best solution, since you can keep a persistent connection. That is not possible with the HTTP API, which needs to reconnect and re-authenticate for each call. Use the HTTP API for operational queries, such as occasionally fetching or storing data, where the request rate is low.
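For illustration, here is a minimal Node.js sketch of the HTTP-API side; the host, port, procedure name, and parameters are placeholders I have assumed, and the /api/1.0/ URL format is the one described for VoltDB's JSON interface.

// Minimal sketch: invoke a VoltDB stored procedure through the JSON HTTP interface.
// Assumes VoltDB is listening on localhost:8080 and exposes a procedure named
// "GetReading" taking a single id; adjust names for your own schema.
async function callProcedure(procedure, parameters) {
  const url = new URL('http://localhost:8080/api/1.0/');
  url.searchParams.set('Procedure', procedure);
  url.searchParams.set('Parameters', JSON.stringify(parameters));

  const response = await fetch(url); // a new HTTP request (and authentication) per call
  if (!response.ok) {
    throw new Error(`VoltDB HTTP API returned ${response.status}`);
  }
  return response.json(); // JSON body with the procedure's status and results
}

callProcedure('GetReading', [42])
  .then((result) => console.log(result))
  .catch((err) => console.error(err));

A native client library, by contrast, authenticates once, keeps the connection open, and sends every subsequent procedure call over it, which is why it wins for high-rate workloads.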

Related

Firestore snapshot listener limit in node.js application

I have a nodejs backend that is serving as a gRPC server in front of a cloud firestore datastore. In perusing the best practices documentation for Firestore, I noticed: "limit snapshot listeners to 100 per client".
This is a pretty reasonable limitation if a "client" is a web UI or flutter application, but does the same limitation apply to a node.js or golang server connecting to the database via the admin interface? Suddenly, in the best case I am looking at 100 concurrent users per server process, which isn't super-great, if those users each request a single resource in streaming mode.
So: does that 100 snapshot listeners per client limitation apply when the "client" is actually a backend API service? And if so, what are some best practices to work around this?
(Yes, I know I could just use the regular client API in the client itself, and I will be doing that; I am mostly wondering about the limitations in an academic sense. I was considering streaming gRPC because there's a fair bit of data massaging that needs to happen between the storage representation and what the client consumes, so putting that all into a single place on a server where I control the rollout frequency is easier than dealing with data-representation sync errors because some client is using an older implementation of a transformer method. Plus: that's extra data / code to ship to clients.)
The 100 snapshot listeners per client limit should apply for any client, including a backend API service.
Firestore doesn't have a way to distinguish where the calls come from, so there is no built-in mechanism to exempt a backend service from the limitation.
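For reference, attaching such a listener from a backend process with the Firebase Admin SDK looks roughly like the sketch below; the collection, document id, and handlers are assumptions, and each listener attached this way counts toward the per-client limit.

// Minimal sketch: a snapshot listener attached from a server process via firebase-admin.
const admin = require('firebase-admin');

admin.initializeApp(); // credentials taken from the environment

const db = admin.firestore();

// One listener per watched resource; each one counts toward the ~100-per-client limit.
const unsubscribe = db
  .collection('resources')   // placeholder collection
  .doc('resource-123')       // placeholder document
  .onSnapshot(
    (snapshot) => console.log('changed:', snapshot.data()),
    (err) => console.error('listener error:', err)
  );

// Detach the listener when the streaming RPC that needed it ends.
// unsubscribe();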

Front End vs Back-End processing

I'm ingesting a codebase that is a React-NodeJS stack. One of the concepts that I am trying to grasp is the back-end API and how it's handled client-side.
The codebase is essentially dumping an entire collection from MongoDB with an API call, and then doing a good amount of parsing and client-side logic with React in order to render custom views. The HTTP responses here are pretty large, and will only get larger as data is added to the DB.
Is there any advantage/disadvantage to this approach, as opposed to creating multiple endpoints in NodeJS and utilizing something like Mongoose to return filtered data to the client, making rendering easy and responses smaller?
Things to take into consideration could be resource consumption, how this would be billed if in the cloud, the impact of SPAs, etc.
Hopefully I get some more clarity at the end of this.
Client-side processing is tempting because it frees up server-side resources, which can then easily handle more requests. But sending large amounts of data to the client for processing incurs client overhead and makes the browsing experience worse, data security may be compromised, and the network may be overwhelmed and bandwidth consumed. Processing data server-side instead increases your server load per client.
So to avoid these issues, it is best to push as much of the work as possible down to the database (with filtering and query conditions), process security-sensitive data on the server, and avoid sending clients data they don't need.
The remaining presentation-level processing can still happen on the client machine, and an SPA has other benefits as well, of course.
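As a concrete illustration of pushing the filtering down to the database, a filtered endpoint could look roughly like the sketch below; the Express app, the Item model, its fields, and the query parameters are assumptions, not taken from the actual codebase.

// Minimal sketch: return filtered, trimmed documents instead of dumping the whole collection.
const express = require('express');
const mongoose = require('mongoose');

// Placeholder schema; the real one will differ.
const Item = mongoose.model('Item', new mongoose.Schema({
  category: String,
  name: String,
  createdAt: Date,
  payload: Object, // large field the client view doesn't actually need
}));

const app = express();

app.get('/api/items', async (req, res) => {
  try {
    const items = await Item.find({ category: req.query.category }) // filter in MongoDB
      .select('name createdAt')  // send only the fields the view needs
      .sort({ createdAt: -1 })
      .limit(50)                 // cap the response size
      .lean();                   // plain objects, cheaper to serialize
    res.json(items);
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(3000);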
So I mostly do server-side processing, unless it's really basic stuff like simple sorting.
Also, don't assume JavaScript is enabled. You have to fall back gracefully, and that would require the server to do the processing anyhow.
This link explains the differences between server-side and client-side programming.

Best approach to connect two nodejs REST API servers

The scenario is that I have two Node applications providing REST APIs: Server_A has one set of REST endpoints, and Server_B has another set.
We have a requirement where Server_A needs some data from Server_B. We could create some REST endpoints for this, but there may be performance issues, since Server_A would open an HTTP connection to Server_B for each call.
We could use WebSockets, but I am not sure whether that would be a good approach or not.
In all cases Server_A will be calling Server_B, and Server_B will return data instantly.
Server_B will be doing most of the database operations; Server_A has calculations only. Server_A will call Server_B for some data requirements.
In addition, there would be only one socket connection, between Server_A and Server_B; all other clients would connect via REST only.
Could anyone suggest if it would be correct approach?
Or you have some better idea.
It would be helpful if I get some code references, modules suggestions.
Thanks
What you are asking about is premature optimization. You are attempting to optimize before you even know you have a problem.
HTTP connections are pretty darn fast. There are databases that work using an HTTP API and those databases are consulted on every HTTP request of the server. So, an HTTP API that is used frequently can work just fine.
What you need to do is to implement your server A using the regular HTTP requests to server B that are already supported. Then, test your system at load and see how it performs. Chances are pretty good that the real bottleneck won't have anything to do with the fact that you're using HTTP requests between server A and server B and if you want to improve the performance of your system, you will probably be working on different problems. This is why you don't want to do premature optimization.
The more moving parts in a system, the less likely you are to have any idea where the actual bottlenecks are when you put the system under load. That's why you have to test the system under load, instrument it like crazy so you can see where the performance is being impacted the most, and then measure like crazy. Then, and only then, will you know where it makes sense to invest your development resources to improve your scalability or performance.
FYI, a webSocket connection has some advantages over repeated HTTP connections (less connection overhead per request), but also some disadvantages (it's not request/response, so you have to invent your own way to match a response with a given request).
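If, after measuring, the per-call connection setup between the two servers really does show up as a cost, plain HTTP with a keep-alive agent reuses TCP connections across requests. Here is a minimal sketch; the port and the /data/:id route on Server_B are assumptions.

// Minimal sketch: Server_A calling Server_B over HTTP while reusing connections.
const http = require('http');

// A keep-alive agent keeps sockets open and reuses them instead of reconnecting per request.
const keepAliveAgent = new http.Agent({ keepAlive: true, maxSockets: 20 });

function fetchFromServerB(id) {
  return new Promise((resolve, reject) => {
    const req = http.get(
      { host: 'localhost', port: 4000, path: `/data/${id}`, agent: keepAliveAgent },
      (res) => {
        let body = '';
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => {
          if (res.statusCode !== 200) {
            return reject(new Error(`Server_B responded with ${res.statusCode}`));
          }
          resolve(JSON.parse(body));
        });
      }
    );
    req.on('error', reject);
  });
}

// Usage inside one of Server_A's handlers:
fetchFromServerB(42)
  .then((data) => console.log('got data from Server_B:', data))
  .catch((err) => console.error(err));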

Is it a bad idea to use a web api instead of a tcp socket for master/slave communication?

TL:DR; Are there any drawbacks / pitfalls to use a RESTful API instead of a TCP/Socket connection for a master-slave pattern?
I'm writing a web application that fetches data from a 3rd-party API, calculates statistics (and more), and presents it. The fetched data is stored in a database.
I'm writing this web application with NodeJS and AngularJS.
So there are 2 basic operations the app will do:
Answer HTTP requests from the front end
Update the database. This includes: Fetching data, saving data to database, and caching results of some heavy queries every 5 minutes in a in-memory database like redis.
I want to separate these operations into two applications. The HTTP server shall be the master. The second application is just a slave, of which as many instances as needed can be spawned.
The master will implement some kind of task processor which just distributes tasks to idle slaves. The slave is very dumb. It can report its status (idle/busy and some details like current load etc). You can start tasks on a slave. The master will take care of queueing tasks and so on.
I guess this is a common server/agent pattern.
So I've started implementing a TCP server and client, but it just feels like so much overhead to do this. A bit like reinventing the wheel. So I thought I could just use an HTTP server on my client which just has two endpoints like
[GET] /status
[POST] /execute/:task
Am I on the right track here?
TL;DR: there are drawbacks to rolling your own REST API for a master-slave architecture.
Your server/agent pattern is commonly referred to as microservices.
Rolling your own REST API might work, but is probably not optimal. Dealing with delivery semantics (for example, at most once vs at least once), transient failures, polling, etc will likely cause a lot of pain before you get it right.
There are libraries/services to provide varying levels of convenience:
Seneca - http://senecajs.org
Pigato - http://prdn.github.io/pigato/
Kong by Mashape - http://getkong.org
Webtask by Auth0 (paid) - https://webtask.io
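If you do end up rolling it yourself anyway, the two endpoints described in the question could be sketched with Express roughly as below; the route names come from the question, while the status fields and the placeholder task work are assumptions.

// Minimal sketch of the "dumb" slave: report status, accept tasks.
const express = require('express');
const app = express();
app.use(express.json());

let busy = false;

// [GET] /status - the master polls this to find idle slaves.
app.get('/status', (req, res) => {
  res.json({ state: busy ? 'busy' : 'idle', load: process.cpuUsage() });
});

// [POST] /execute/:task - the master starts a task on this slave.
app.post('/execute/:task', (req, res) => {
  if (busy) {
    return res.status(409).json({ error: 'slave is busy' });
  }
  busy = true;
  res.status(202).json({ accepted: req.params.task });

  // Placeholder for the real work (fetching data, caching results, ...).
  setTimeout(() => { busy = false; }, 5000);
});

app.listen(5001);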

MongoDB document streaming with HTTP response?

After an all-day research on node.js real-time frameworks/wrappers (derby.js, meteor, Socket.IO...) I realised that the more old-fashioned (sorry) way of a RESTful API fits all my needs.
One of the reasons I thought I had to use an ongoing socket connection was that I want to stream my MongoDB documents from the database instead of loading them all into memory on the server. I think this is the recommended way because it minimizes the use of server resources.
But here is the problem:
Does simple document query streaming work with the ordinary HTTP request/response model, or do we have to establish an ongoing socket connection to stream all documents to the client?
Note: I only have to load the documents on an AJAX call - there is no need for new documents to be pushed to the client (so really no need to be realtime).
Is there anything special to consider?
You can stream the results of the query using the standard HTTP request/response APIs.
The general sequence of calls is:
res.writeHead(<header content>)
res.write(<data>)
...
res.write(<data>)
res.end();
But you make those calls asynchronously, driven by the streaming events from your query.
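For example, here is a minimal sketch of driving those writes from a Mongoose query cursor; the Item model and the /items route are assumptions, and the documents are framed as one JSON array.

// Minimal sketch: stream query results to the response one document at a time,
// so the full result set never has to sit in server memory.
const express = require('express');
const mongoose = require('mongoose');

const Item = mongoose.model('Item', new mongoose.Schema({ name: String })); // placeholder model

const app = express();

app.get('/items', async (req, res) => {
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.write('[');

  let first = true;
  const cursor = Item.find().cursor(); // yields documents as they arrive from MongoDB

  try {
    for await (const doc of cursor) {
      if (!first) res.write(',');
      res.write(JSON.stringify(doc));
      first = false;
    }
    res.write(']');
    res.end();
  } catch (err) {
    // Headers are already sent, so just terminate the response.
    res.destroy(err);
  }
});

app.listen(3000);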
