Migrating an existing MarkLogic Application Server from Linux to AWS

I'd like to migrate a MarkLogic 7 Application Server from a Linux environment to AWS.
I've seen PDFs/tutorials on creating a new server on AWS, but I'm not sure how to migrate existing data and configurations.
There is more than one cluster.
Thanks
NGala

This question has nothing to do with AWS (AWS servers are just standard Linux servers). Consult your MarkLogic documentation on how to migrate between servers.

It makes a big difference whether or not you need to keep the server online the whole time. If you can shut it down, just install MarkLogic on an AWS Linux image and copy /var/opt/MarkLogic and any external data directories.
If you need to keep the system online, export a configuration package for your database and app server(s) from the MarkLogic Configuration Manager on port 8000, then import it on the new host. Next, set up database replication as described at http://docs.marklogic.com/guide/database-replication/dbrep_intro; once replication has synchronized, fail over to the new system.
Specific to AWS, you can back up a database to S3 from one cluster and then restore it on another cluster. This works even outside AWS, as long as the system can access S3.
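As a rough sketch of that last option (this is not MarkLogic's documented tooling; the endpoint and payload shape are from memory, so verify them against the Management REST API documentation for your version), the backup and restore can be driven over the Management API on port 8002, assuming S3 credentials are already configured in MarkLogic and the manage app server accepts the authentication used here:

// Hypothetical helper: POST an operation to /manage/v2/databases/{name}.
// Host names, credentials, database and bucket names are placeholders.
const http = require('http');

function manageOperation(host, database, body) {
  const payload = JSON.stringify(body);
  const options = {
    host,
    port: 8002,                                  // MarkLogic Management API
    path: `/manage/v2/databases/${database}`,
    method: 'POST',
    auth: 'admin:admin',                         // placeholder credentials
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(payload)
    }
  };
  return new Promise((resolve, reject) => {
    const req = http.request(options, (res) => {
      let data = '';
      res.on('data', (chunk) => (data += chunk));
      res.on('end', () => resolve({ status: res.statusCode, body: data }));
    });
    req.on('error', reject);
    req.write(payload);
    req.end();
  });
}

// On the source cluster: back the database up to S3.
manageOperation('old-host.example.com', 'my-database', {
  operation: 'backup-database',
  'backup-dir': 's3://my-bucket/marklogic-backups/my-database'
}).then(console.log);

// On the new AWS cluster, once the backup has finished:
// manageOperation('new-host.example.com', 'my-database', {
//   operation: 'restore-database',
//   'backup-dir': 's3://my-bucket/marklogic-backups/my-database'
// }).then(console.log);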

Related

Random error on external Oracle database connection with Kubernetes

After a month of research, we are here, hoping someone has an insight into this issue:
On a GKE cluster, our pods (Node.js) are having trouble connecting to our external Oracle business database.
To be more precise, roughly 70% of our connection attempts end in this error:
ORA-12545: Connect failed because target host or object does not exist
The remaining 30% work well and don't reset or end prematurely. Once a connection is established, everything is fine from there.
Our stack:
Our flows are handled by containers based on a node:12.15.0-slim image, to which we add libaio1 and an Oracle Instant Client (v12.2). We use oracledb v5.0.0 as the node module.
A CronJob pod runs our Node container, behind a ClusterIP service, on a GKE cluster (1.16.15-gke.4300).
Our external Oracle database is on a private network (which our cluster can reach), running Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit, behind a load balancer.
I can give more detail if needed.
What we have already tried:
We connected directly to the database, cutting out the load balancer: no effect.
We ran a CronJob pod pinging the database server every minute for a day: no errors, yet the flow pods still hit the ORA-12545 error.
We reworked all our code, connecting to the database differently, and upgraded our oracledb node module (v4 to v5): no effect.
We monitored the load on the Oracle database and spread our flows over a whole night instead of a one-hour window: no effect.
We ran our own Kubernetes cluster before GKE, directly on our private network, and it produced exactly the same error.
We had an audit by Kubernetes experts, and they neither found the issue nor saw anything critical in our cluster/k8s configuration.
What works:
All our other pods (some querying a MySQL database, microservices, web front ends) work fine.
All our business tools (dozens of them, including Talend and some custom software) use the Oracle database without issue.
Our own flow-handling Node containers work fine with the Oracle database as long as they run in a plain Docker environment rather than in Kubernetes.
To sum up: we have a mysterious issue when trying to connect to an Oracle database from a Kubernetes environment, where pods are randomly unable to reach the database.
We are looking for any hint we can get.
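For reference, the connection path in the flow pods boils down to something like the sketch below; the hostname, service name, and credentials are placeholders, not our real values:

// Minimal reproduction of the connection pattern described above (node-oracledb v5).
const oracledb = require('oracledb');

async function probe() {
  let connection;
  try {
    connection = await oracledb.getConnection({
      user: 'app_user',
      password: process.env.ORA_PASSWORD,
      // EZConnect string pointing at the load balancer in front of the database
      connectString: 'oracle-lb.internal.example:1521/ORCL'
    });
    const result = await connection.execute('SELECT 1 FROM dual');
    console.log('connected, got', result.rows);
  } catch (err) {
    // Roughly 70% of attempts from the GKE pods land here with ORA-12545
    console.error('connection failed:', err.message);
  } finally {
    if (connection) {
      await connection.close();
    }
  }
}

probe();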

Is it possible to fake a Cassandra connection?

I have been given the task of configuring a Cassandra DB for the project. We are facing a problem: every environment has a dedicated server for Cassandra, but for the DEV environment the client does not want to provide a separate server, and the current DEV servers are already at full capacity, so we can't afford to install Cassandra on them.
My question is: is there any possibility to fake the connection to Cassandra in that environment? I've created a CassandraConfiguration.java class and configured the session, cluster, etc.; it all works smoothly in the other environments, but on DEV it fails, as it cannot connect, because there's no Cassandra... Committing the Cassandra configuration file as-is will break the DEV environment.
You can use scassandra (simulated Cassandra) or Simulacron, which emulate Cassandra, or you can use cassandra-unit, which runs Cassandra in the same JVM as your tests.

Open ports for Cassandra in a Google Cloud Compute production environment

I've been working with Storm topologies and Cassandra databases for a relatively short period of time. I recently realized that my development environment's spec is not strong enough for my testing, so I deployed a 3-node Cassandra cluster on Google Cloud instances. Now I'd like to let a Storm topology (hosted on a separate box) insert into Cassandra. Obviously, this is not enabled by default, and I'd like a guideline on how to securely open Cassandra to database queries from a different IP in a production scenario. (I suspect that Google protects its instances with a firewall as well?)
Following Carlos Rojas's directions in THIS LINK, I could open the ports to access Cassandra from outside the network. Alternatively, you can open the ports in your firewall with this line (from THIS LINK):
gcutil addfirewall cassandra-rule --allowed="tcp:9042,tcp:9160" --network="default" --description="Allow external Cassandra Thrift/CQL connections"
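Note that gcutil has since been deprecated in favor of gcloud; assuming the same default network, the equivalent rule today would be roughly gcloud compute firewall-rules create cassandra-rule --allow=tcp:9042,tcp:9160 --network=default --description="Allow external Cassandra Thrift/CQL connections" (port 9042 covers CQL, 9160 the legacy Thrift interface).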

npm cluster package on a server cluster

So I have an app I am working on and I am wondering if I am doing it correctly.
I am running cluster on my Node.js app; here is a link to cluster. I couldn't find anywhere that states whether I should only run cluster on a single server or whether it is okay to run it on a cluster of servers. If I continue down the road I am going, I will have a cluster inside a cluster.
So that the answers aren't just opinions, here is my question: was the cluster package made to do what I am doing (a cluster of workers on a single server, inside a cluster of servers)?
Thanks in advance!
Cluster wasn't specifically designed for that, but there is nothing about it which would cause a problem. If you've designed your app to work with cluster, it's a good indication that your app will also scale across multiple servers. The main gotcha would be if you're doing anything stateful on the filesystem. For example, if a user uploads a photo and you store it on the server disk, that would be problematic when scaling out across multiple servers (that don't share the same disk).
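For concreteness, the single-server side of that setup is just the standard cluster pattern below (the port and restart behavior are arbitrary choices for the sketch); scaling across servers then happens outside the process, e.g. behind a load balancer:

// One worker per CPU on a single server; running this same app on several
// servers behind a load balancer is a separate, compatible concern.
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, starting a new one`);
    cluster.fork();
  });
} else {
  // Keep workers stateless: as noted above, anything written to local disk
  // (e.g. uploaded photos) won't be visible to workers on other servers.
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(3000);
}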

How to run Couchbase on multiple servers or multiple AWS instances?

I am trying to evaluate Couchbase's performance on multiple nodes. I have a client that generates data for me based on a schema (currently for one local node). But I want to know how I can horizontally scale Couchbase and how that works. For example, if I have multiple machines, AWS instances, or Windows Azure instances, how can I configure Couchbase to shard the data so that I can then evaluate its performance on multiple nodes? Any suggestions and details as to how I can do this?
I am not (yet) familiar with Azure, but you can find a very good white paper about Couchbase on AWS:
Running Couchbase on AWS
Regarding the cluster itself, you just need to:
install Couchbase on multiple nodes
create a "cluster" on one of them
then simply add the other nodes to the cluster and rebalance.
I have created an Ansible script that uses exactly these steps to create a cluster from the command line; see
Create a Couchbase cluster with Ansible
Once you have done that, your application will leverage all the nodes automatically, and you can add or remove nodes as you need; a short sketch of the client side follows below.
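For illustration, once the cluster is rebalanced the client only needs to bootstrap from one node and it discovers the rest of the topology on its own. A minimal sketch with the Couchbase Node.js SDK (3.x syntax; the hostname, bucket name, and credentials are placeholders):

// Bootstrap from any one node; the SDK fetches the cluster map and talks to
// all nodes directly, so adding or removing nodes needs no client changes.
const couchbase = require('couchbase');

async function main() {
  const cluster = await couchbase.connect('couchbase://node1.example.com', {
    username: 'Administrator',
    password: 'password'
  });
  const bucket = cluster.bucket('perf-test');        // placeholder bucket name
  const collection = bucket.defaultCollection();

  // Simple write/read round trip; keys are hashed to vBuckets, which are
  // spread across the nodes, so the data is sharded automatically.
  await collection.upsert('doc::1', { type: 'test', value: 42 });
  const result = await collection.get('doc::1');
  console.log(result.content);

  await cluster.close();
}

main().catch(console.error);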
Finally, if you want to learn more about Couchbase architecture and how sharding, failover, data consistency, and indexing work, I invite you to look at this white paper:
Couchbase Server: An Architectural Overview