What is the best way to sync a MongoDB instance on a local server with a dynamic IP (set by the ISP) with a MongoDB instance on a public server (e.g. Amazon AWS)? Can I do that from Node.js?
You can do this in a number of ways. First, to address the public/dynamic IP issue, you will want to either use a hostname --> IP address mapping that you maintain (/etc/hosts or your own DNS servers) or look into one of the dynamic DNS solutions.
Once you have the changing IP address problem solved, the question is how to keep the systems in sync. The most obvious way is to put the two nodes in a replica set. If your connection is reliable enough this might work, though you will probably want to add an arbiter, locally or remotely, on whichever side of the connection you want to keep taking writes when the connection is flaky. In a 2-node set, if either node is unreachable, the remaining node cannot see a majority of the set, so it runs as a secondary and cannot take writes.
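To make the majority rule concrete, here is a minimal sketch in plain Node.js. It does not talk to MongoDB at all; the function names majorityNeeded and canAcceptWrites are made up for this illustration, and it assumes every member (including an arbiter) has one vote.

```javascript
// A primary can only exist while a strict majority of the voting members
// can see each other. Hypothetical helper names, not part of any driver.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

function canAcceptWrites(votingMembers, reachableMembers) {
  return reachableMembers >= majorityNeeded(votingMembers);
}

// Two data nodes, link down: each side only sees itself (1 of 2).
console.log(canAcceptWrites(2, 1)); // false

// Two data nodes plus an arbiter on the local side, link down:
// the local side still sees 2 of 3 voters, the remote side only 1 of 3.
console.log(canAcceptWrites(3, 2)); // true
console.log(canAcceptWrites(3, 1)); // false
```

This is why the arbiter should sit on the side that must keep accepting writes during an outage.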
Another option is to use the mongo connector which lets you sync to arbitrary destinations, including another MongoDB instance.
That project will give you a pretty good idea of what you need to do (in Python) to provide such a syncing service. You will need to write something similar in Node.js to achieve a proper sync. Essentially, you will need to tail the oplog on one host and apply it to the other on a regular basis, depending on your requirements.
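As a rough illustration of the "read the oplog, apply it elsewhere" idea, here is a sketch that works on in-memory data only. The entries are simplified stand-ins: real oplog entries use BSON timestamps, carry a namespace, and have more update formats than the two handled here. The names applyEntry and applyNewEntries are hypothetical, and a Map stands in for the destination collection.

```javascript
// Apply one simplified oplog-style entry to the target Map (keyed by _id).
// op: 'i' = insert, 'u' = update, 'd' = delete; anything else is skipped.
function applyEntry(target, entry) {
  switch (entry.op) {
    case 'i':
      target.set(entry.o._id, { ...entry.o });
      break;
    case 'u': {
      const id = entry.o2._id;
      if (entry.o.$set) {
        // Field-level update: merge the changed fields into the document.
        target.set(id, { ...(target.get(id) || { _id: id }), ...entry.o.$set });
      } else {
        // Full replacement of the document.
        target.set(id, { _id: id, ...entry.o });
      }
      break;
    }
    case 'd':
      target.delete(entry.o._id);
      break;
    default:
      break; // no-ops and commands are ignored in this sketch
  }
}

// Apply every entry newer than the checkpoint and return the new checkpoint.
// Assumes the entries are ordered by ts (plain numbers here).
function applyNewEntries(oplog, target, lastAppliedTs) {
  let latest = lastAppliedTs;
  for (const entry of oplog) {
    if (entry.ts <= lastAppliedTs) continue; // applied on an earlier pass
    applyEntry(target, entry);
    latest = entry.ts;
  }
  return latest;
}

const sampleOplog = [
  { ts: 1, op: 'i', o: { _id: 'a', n: 1 } },
  { ts: 2, op: 'u', o2: { _id: 'a' }, o: { $set: { n: 2 } } },
  { ts: 3, op: 'i', o: { _id: 'b', n: 5 } },
  { ts: 4, op: 'd', o: { _id: 'b' } },
];
const replica = new Map();
const checkpoint = applyNewEntries(sampleOplog, replica, 0);
console.log(checkpoint, [...replica.values()]);
```

Storing the checkpoint somewhere durable is what lets the sync resume after the connection drops, which matters on a flaky link.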
Let's say I have a cluster of 3 nodes for ScyllaDB in my local network (it can be AWS VPC).
I have my Java application running in the same local network.
I am unsure how to properly connect the app to the DB.
Do I need to specify all 3 IP addresses of DB nodes for the app?
What if over time one or several nodes die and get resurrected on other IPs? Do I have to manually reconfigure application?
How is it done properly in big real production cases with tens of DB servers, possibly in different data centers?
I would be very grateful for a code sample showing how to connect a Java app to a multi-node cluster.
You need to specify contact points (you can use DNS names instead of IPs): several nodes (usually 2-3). The driver will connect to one of them and, once connected, will discover all the other nodes of the cluster (see the driver's documentation). After the connection is established, the driver keeps a separate control connection open, and through it receives information about nodes going up and down, joining or leaving the cluster, etc., so it is able to keep its information about the cluster topology up to date.
If you're specifying DNS names instead of IP addresses, then it's better to set the configuration parameter datastax-java-driver.advanced.resolve-contact-points to false (see docs), so the names will be resolved to IPs on every reconnect, instead of being resolved only once at application start.
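The difference between resolving a contact point once and resolving it on every reconnect can be shown outside the driver. The following is only an illustration in Node.js, not the Java driver's implementation; makeAddressProvider and its parameters are names invented for this sketch, and the lookup function is injectable so the behaviour can be tried without real DNS.

```javascript
const dns = require('node:dns');

// resolveEachTime = false: look the name up once and reuse the address.
// resolveEachTime = true: look the name up again on every call, which is
// what you want when a node can come back on a different IP.
function makeAddressProvider(hostname, resolveEachTime, lookup = dns.promises.lookup) {
  let cached = null;
  return async function nextAddress() {
    if (resolveEachTime || cached === null) {
      const { address } = await lookup(hostname);
      cached = address;
    }
    return cached;
  };
}

// Fake resolver for the demo: the node "moves" after the first lookup.
function makeMovingLookup() {
  const answers = ['10.0.0.1', '10.0.0.2'];
  let calls = 0;
  return async () => ({ address: answers[Math.min(calls++, 1)], family: 4 });
}

async function demo() {
  const once = makeAddressProvider('db1.example.internal', false, makeMovingLookup());
  const every = makeAddressProvider('db1.example.internal', true, makeMovingLookup());
  return {
    once: [await once(), await once()],
    every: [await every(), await every()],
  };
}

demo().then((result) => console.log(result));
```

Note that the JVM and the OS have their own DNS caches, so re-resolving in the application is not always sufficient on its own.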
Alex Ott's answer is correct, but I wanted to add a bit more background so that it doesn't look arbitrary.
The selection of the 2 or 3 nodes to connect to is described at
https://docs.scylladb.com/kb/seed-nodes/
However, going forward, Scylla is looking to move away from differentiating between Seed and non-Seed nodes. So, in future releases, the answer will likely be different. Details on these developments at:
https://www.scylladb.com/2020/09/22/seedless-nosql-getting-rid-of-seed-nodes-in-scylla/
Answering the specific questions:
Do I need to specify all 3 IP addresses of DB nodes for the app?
No. Your app just needs one to work. But it might not be a bad idea to have a few, just in case one is down.
What if over time one or several nodes die and get resurrected on other IPs?
As long as your app doesn't stop, it maintains its own version of gossip. So it will see the new nodes being added and connect to them as it needs to.
Do I have to manually reconfigure application?
If you're specifying IP addresses, yes.
How is it done properly in big real production cases with tens of DB servers, possibly in different data centers?
By abstracting away the need for a specific IP, using something like Consul. If you wanted to, you could easily build a simple RESTful service to expose an inventory list or even the results of nodetool status.
Problem:
I have an AWS EC2 instance running FreeBSD. In there, I'm running a NodeJS TLS/TCP server. I'd like to create a set of rules (in my NodeJS application) to be able to individually block IP addresses programmatically based on a few logical conditions.
I'd like to run an external (not on the same machine/instance) firewall or load-balancer, that I can control from NodeJS programmatically, such that when certain conditions are given, I can block a specific remote-address(IP) before it reaches the NodeJS instance.
Things I've tried:
I initially looked into nginx as an option, running it on a second instance and placing my NodeJS server behind it, but after skimming through the "NGINX Cookbook: Advanced Recipes for High Performance Load Balancing" I've learned that only NGINX Plus (the paid version) allows for remote/API control & customization. While I believe that paying $3500/license is not too much (considering all of NGINX Plus' features), I simply cannot afford to buy it at this point in time; in addition, the only features I'd be using (at this point) would be the remote API control and the IP address blocking.
My second thought was to go with the AWS/ELB (elastic-load-balancer) by integrating AWS' SDK into my project. That sounded feasible. Unfortunately, after reading a few forum threads and part of their documentation, it seems (unless I'm mistaken) that the two features I need are not available on the AWS/ELB. AWS seems to offer an entirely different service called WAF that I honestly don't understand very well (both as a service and from a feature standpoint).
I have also (briefly) looked into CloudFlare, as it was recommended in one of the posts here on Stack Overflow, though I can't really tell if their firewall would allow this level of (remote) control.
Question:
What are my options? What would you guys recommend I do?
I think nginx provides this kind of functionality; please refer to the link.
If you want to block an IP from Node, you can just edit the nginx config file, add a deny rule for that IP address, and reload nginx.
Frankly speaking, if I were you, I would use AWS WAF, but if you don't want to use it, you can simply do it in Node.js.
In Node.js you would keep a global collection (for example an array) of all blocked IP addresses and, on each connection, check whether the connecting host's IP is in it. The problem is that when the machine or application is restarted, you lose all information about blocked IPs. As a solution, you can set up Redis (a key-value database that also supports other data types) and store the blocked IPs there. Since Redis keeps its data in RAM, interaction with it is very fast, and if persistence (snapshots or an append-only file) is enabled, Redis writes the data to disk and reloads it after a restart, so it continues working in RAM with the previous data.
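A minimal sketch of the in-process check described above, using only the net module. It uses a Set instead of an array for constant-time lookups, the helper names (blockAddress, isBlocked, normalizeAddress) are invented for this example, and persistence (Redis or otherwise) is left out. A TLS server would do the same check on the raw TCP connection before the handshake.

```javascript
const net = require('node:net');

// In-memory blocklist; lost on restart unless backed by external storage.
const blockedAddresses = new Set();

// IPv4 clients on a dual-stack socket are reported as "::ffff:a.b.c.d",
// so strip that prefix to compare addresses in one form.
function normalizeAddress(address) {
  return address.startsWith('::ffff:') ? address.slice('::ffff:'.length) : address;
}

function blockAddress(address) {
  blockedAddresses.add(normalizeAddress(address));
}

function isBlocked(address) {
  return blockedAddresses.has(normalizeAddress(address));
}

const server = net.createServer((socket) => {
  if (isBlocked(socket.remoteAddress || '')) {
    socket.destroy(); // drop the connection before any application logic runs
    return;
  }
  socket.end('hello\n');
});

// The port is an arbitrary example; uncomment to actually accept connections.
// server.listen(8000);

blockAddress('203.0.113.7');
console.log(isBlocked('::ffff:203.0.113.7')); // true
console.log(isBlocked('198.51.100.1')); // false
```

Keep in mind that with this approach the blocked connection still reaches the Node process; it is only closed early.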
I'm a systems/architecture noob, and am trying to get an understanding of how to use multiple GCE instances to run a Meteor app. This walkthrough seems pretty straightforward for getting Meteor running on a single instance, but if I want to add more instances it isn't clear to me how to connect them together.
From what I understand, I'll add each instance to an instance group and use a load-balancer to direct incoming traffic evenly across them. It also seems like I want to attach a persistent disk to each instance which the OS will boot from and which will include a MongoDB installation that participates in a "replicated set".
Is that accurate? And if so, how do I actually tell the MongoDB installation on each instance's disk to be a part of the replicated set?
This question is actually somewhat similar to Get local IP address in node.js, but with one big difference.
I will run a Node.js-based application on a machine with many different network interfaces (not all on the same subnet).
I will absolutely have to tell my own IP address to each peer in my first message to them.
According to the routing table, I should now be able to find the right IP-Address which is reachable by my peer.
So... I know how to list all the network interfaces. But that's not enough to find the right one. I need routing information from the OS.
How can this be done in a platform-independent way and, if possible, without using too much native code?
I've read about https://www.npmjs.com/package/netroute. But I think it's an absolute overkill to install Python only for this.
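One possible approach that stays inside Node's standard library is to let the OS pick the source address: "connect" a UDP socket to the peer and read back the local address the socket ended up with. This is a sketch, not a tested cross-platform guarantee. It assumes an IPv4 peer and that the OS assigns the routed source address when a UDP socket is connected; connecting a UDP socket does not send any packet. The function name localAddressFor and the default port 53 are arbitrary choices for the example.

```javascript
const dgram = require('node:dgram');

// Resolve to the local IPv4 address the OS would use to reach peerHost.
function localAddressFor(peerHost, peerPort = 53) {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket('udp4');
    socket.once('error', reject); // e.g. the implicit bind failed
    socket.connect(peerPort, peerHost, (err) => {
      if (err) {
        socket.close();
        reject(err);
        return;
      }
      // The socket is now associated with the peer, so its local address
      // is the one chosen by the routing table.
      const { address } = socket.address();
      socket.close();
      resolve(address);
    });
  });
}

localAddressFor('127.0.0.1').then((address) => console.log(address));
```

Called with a peer's real address instead of 127.0.0.1, this should return the address of whichever interface the route to that peer goes through.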
I want to know how to structure my NodeJS server.
I want to separate the services offered on my website so that I can build a cluster in the future and have many servers (each dedicated to one specific task).
Example :
The 'main' server, which has one project: ExpressJS and Database
The 'communication' server, which has one project: Chat + Forum
Others projects : For complex computing (generating chart / stats / emailing)
Could you explain the different approaches for this type of complex website?
Like Benjamin Gruenbaym said, the architecture belongs somewhere else.
If you are wondering about how to set up the applications on an individual server, there are a few things to keep in mind.
A NodeJS application runs its JavaScript on a single thread in a single process, so it will mostly keep 1 core of the CPU busy. If you run a database on the same server, that is another core. So it may be fine to host all node applications on the same server, if it has a sufficient number of cores.
To run two different Node processes on the same machine, you simply start them one after another, but make sure that they listen on different ports.
To make sure that you can scale out your application later, it is important that you use domain names instead of IP addresses when you identify your services to each other. So the NodeJS app should know about the database as mydatabase.mycompany.com, not as 192.168.1.10 or any other IP address. This will allow you to later move the database to another network address or to use a load balancer.
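As a small sketch of both points (separate ports per process, hostnames instead of IPs), each service can read its settings from environment variables. The variable names PORT, DB_HOST and DB_PORT, the function loadServiceConfig, and the default port numbers are assumptions made for this example, not something prescribed by Node or Express.

```javascript
// Build the service configuration from environment variables, falling back
// to defaults. The database is always referred to by hostname, never by IP.
function loadServiceConfig(env = process.env) {
  return {
    port: Number(env.PORT || 3000),
    databaseHost: env.DB_HOST || 'mydatabase.mycompany.com',
    databasePort: Number(env.DB_PORT || 27017), // arbitrary example default
  };
}

// Two processes on the same machine only differ in the port they listen on.
const mainConfig = loadServiceConfig({ PORT: '3000' });
const chatConfig = loadServiceConfig({ PORT: '3001' });
console.log(mainConfig, chatConfig);
```

Moving the database later then only means changing the DNS record (or DB_HOST), not the application code.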