Recommended approach & tools to provision VM instances from node.js?

I am trying to implement a 'lab in the cloud' to give people a sandbox to experiment and learn in, e.g. for DevOps (Chef/Puppet), installing or configuring software, etc.
I have a node.js server implementation to manage this and am looking for sane and reasonable ways to attack this problem.
The options are bewilderingly diverse: Puppet or Chef directly, or Vagrant, all seem appropriate, but OpenStack, Cloud Foundry and Amazon EC2 also provide their own feature sets.
A micro-cloud solution (multiple VMs per instance) would be ideal, as there isn't going to be any large computational load.
Suggestions most appreciated.
Cheers

After some investigation, it seems that LXC on EC2 might be the way forward:
It gives:
lightweight instances on a single EC2 instance
hibernate/restore support
fast stand-up
the ability to automate using Chef/Cucumber
EC2 virtualization using LXC
Chef-lxc
Testing infrastructure code in LXC using Cucumber
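To tie this back to node.js: a minimal sketch of driving LXC from the Node.js server by shelling out to the LXC command-line tools. The container name and the ubuntu template below are illustrative, and it assumes the LXC userspace tools are installed on the EC2 host and the process has the privileges to run them.

    // Sketch: provision and tear down LXC sandboxes from Node.js
    const { execFile } = require('child_process');
    const { promisify } = require('util');
    const run = promisify(execFile);

    async function createSandbox(name) {
      // Build a container from the stock "ubuntu" template
      await run('lxc-create', ['-n', name, '-t', 'ubuntu']);
      // Start it detached in the background
      await run('lxc-start', ['-n', name, '-d']);
    }

    async function destroySandbox(name) {
      await run('lxc-stop', ['-n', name]);
      await run('lxc-destroy', ['-n', name]);
    }

    createSandbox('lab-user-42')
      .then(() => console.log('sandbox up'))
      .catch(err => console.error('provisioning failed:', err));

From there, Chef or Cucumber runs can be triggered inside the container (via lxc-attach) once it is up.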

Related

Can I install different software on one AWS EC2 virtual machine?

I want to ask a question as I'm new to AWS.
On one Ubuntu EC2 instance I have installed InfluxDB, and it is running. Can I install Node.js on that same instance, and would the Node.js installation affect InfluxDB?
Basically, I want to run a background Node.js script, live forever, that inserts data into InfluxDB from a server.
Would I need to launch a separate virtual machine to run that script, or can it live on the same one?
Generally speaking, you can install and run any software on a single EC2 instance. The only limit is the underlying resources: whether the instance has sufficient memory, CPU, disk I/O and network bandwidth to run all of it.
Practically, any decision you make will have trade-offs, and it's always good to be aware of them.
In your case, I can give you some pros and cons of the two approaches.
Same-instance installation
Pros: easy to configure, as your script is on the same instance as your InfluxDB. Also, if your Node.js script has a small resource footprint, this approach is possibly cheaper as well.
Cons: if you are running a cluster of multiple InfluxDB instances, keeping a copy of the Node.js script on every InfluxDB instance will make those instances hard to maintain, deploy, update and monitor.
This approach is only recommended if you are running a single-node InfluxDB.
Dedicated installation
Pros: easy to scale up; easy to manage, deploy and update; better availability.
You can have a dedicated cluster for InfluxDB and another, much smaller cluster for your Node.js scripts.
This separation gives you a more reliable InfluxDB cluster, since you will usually update your Node.js script far more often than your InfluxDB software. Having a dedicated Node.js cluster gives you peace of mind that even if your script has a critical bug, your InfluxDB cluster is still running fine.
Cons: harder to configure. You also need to deal with the distributed nature of your system, as your script is now hosted on different instances from your InfluxDB. This approach is more expensive as well.
You should consider this approach if you are running an InfluxDB cluster.
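For completeness, a minimal sketch of the background writer described in the question, using the node-influx client (npm install influx). The database name, measurement and interval are illustrative assumptions.

    // Sketch: a long-running Node.js script writing points into InfluxDB
    const Influx = require('influx');
    const os = require('os');

    const influx = new Influx.InfluxDB({
      host: 'localhost',    // same-instance setup: InfluxDB is local
      database: 'metrics',  // hypothetical database name
    });

    // Write one point every 10 seconds, forever
    setInterval(async () => {
      try {
        await influx.writePoints([{
          measurement: 'load_average',
          tags: { host: os.hostname() },
          fields: { one_minute: os.loadavg()[0] },
        }]);
      } catch (err) {
        console.error('write failed:', err.message);
      }
    }, 10 * 1000);

To keep such a script "live forever" in either approach, run it under a process supervisor (for example pm2, or a systemd unit) so it is restarted after crashes and reboots.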

How to make a cluster of GitLab instances

Is it possible to create a cluster of multiple GitLab instances (multiple machines)? My instance is over-utilized and I would like to add other machines, but access should stay transparent for the users: a user accessing his project shouldn't care which instance it is hosted on.
What could be the best solution to help the users?
I'm on GitLab Community Edition 10.6.4
Thanks for your help,
Leonardo
I reckon you are talking about scaling the GitLab server, not GitLab runners.
GitLab Omnibus is a fairly complex system with multiple components; some are stateless and some are stateful.
If you currently have everything on the same server, the easiest option is to scale up (move to a bigger machine).
If you can't, you can extract the stateful components and host them separately: PostgreSQL, Redis, and files on NFS.
Funnily enough, you can make performance worse here.
As a next step, you can scale out the stateless side.
But it is in no way an easy task.
I'd suggest starting with proper monitoring, to see where your limitations (CPU, RAM, IO) and bottlenecks (in which components) are.
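As a starting point for that monitoring, GitLab ships built-in health-check endpoints; a small Node.js probe against them can show which components are degraded. The host below is a placeholder, and the readiness endpoint typically requires the probe's IP to be whitelisted in the GitLab configuration.

    // Sketch: poll GitLab's health-check endpoints
    const https = require('https');

    function probe(path) {
      return new Promise((resolve, reject) => {
        https.get({ host: 'gitlab.example.com', path }, res => {
          let body = '';
          res.on('data', chunk => (body += chunk));
          res.on('end', () => resolve({ status: res.statusCode, body }));
        }).on('error', reject);
      });
    }

    (async () => {
      console.log(await probe('/-/health'));    // overall liveness
      console.log(await probe('/-/readiness')); // per-component status (db, redis, ...)
    })().catch(console.error);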
See docs, including some examples of scaling:
https://docs.gitlab.com/ee/administration/high_availability/
https://about.gitlab.com/solutions/high-availability/
https://docs.gitlab.com/charts/
https://docs.gitlab.com/ee/development/architecture.html
https://docs.gitlab.com/ee/administration/high_availability/gitlab.html

Docker For Development Only

I am an IT supervisor with very little development background, so I apologize for this naive question.
Currently, we are using Weblogic, running in Linux VMs, created by Oracle VM (OVM) to host our application for production.
The development environment also uses the same configuration.
Our developers are suggesting we use docker in the development environment and utilize DevOps to increase the agility of development.
This sounds like a good idea to me, but I still want production to run on the same configuration it runs on today (WebLogic in Linux VMs over the Oracle VM hypervisor); I do not want to use Docker for production.
I have been searching to find out whether that is possible, with no luck.
I would really appreciate it if you could help.
I have three questions:
Is that possible?
Is it normal practice to run Docker for development only while using a traditional non-Docker setup for production?
If it is possible, what are the best ways to achieve that?
Thank You
Docker is Linux distro-agnostic, and Java development is JEE container-agnostic (if you follow the official Java specs defined in the JSRs).
These are two reasons why you should get the same behaviour between your development environment and your production environment. Of course, a pre-production environment is welcome to make sure this holds, and do look at memory and performance issues before committing to this. Moreover, depending on why you are using WebLogic, ask yourself which JVM and JEE container you would run in your Docker containers.
Is that possible?
Yes; we do that in my organization for some applications, using Tomcat (we use WebSphere for other applications).
Is it normal practice to run Docker for development only while using a traditional non-Docker setup for production?
There are many practices, depending on the organization's goals, strategy and level of agility. Using Docker for development and not in production is the most common use case for Docker containers nowadays, but the next level is to use a Docker engine in the production environment too. See the next section.
If it is possible, what are the best practices to achieve that?
The difficulty is that in a production environment, you need a system for automating deployment, scaling, and management of containerized applications.
Developers do not need that, so it is really easy for them to migrate to Docker (and it lets them do things more easily and faster than without Docker).
In production, you should really consider using Kubernetes or OpenShift instead of running a plain Docker engine like your developers do. But that is much more complicated than simply installing Docker on a single Windows or Linux host.
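To make the developer side concrete: the dev-only container can be as small as a Dockerfile that drops the same WAR you deploy to WebLogic into a JEE-spec-compliant container. This is only a sketch; the base image, paths and WAR name are placeholders, and (per the point above about the JSRs) you should verify the application does not rely on WebLogic-specific behaviour.

    # Hypothetical dev-only image running the same artifact production runs on WebLogic
    FROM tomcat:9-jdk11
    COPY target/myapp.war /usr/local/tomcat/webapps/myapp.war
    EXPOSE 8080
    CMD ["catalina.sh", "run"]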

How does RunKit make their virtual servers?

There are many websites providing cloud coding, such as Cloud9 and repl.it. They must use server virtualisation technologies. For example, Cloud9's workspaces are powered by Docker Ubuntu containers; every workspace is a fully self-contained VM (see details).
I would like to know if there are other technologies to make sandboxed environment. For example, RunKit seems to have a light solution:
It runs a completely standard copy of Node.js on a virtual server
created just for you. Every one of npm's 300,000+ packages are
pre-installed, so try it out
Does anyone know how RunKit achieves this?
You can see more in "Tonic is now RunKit - A Part of Stripe!" (see discussion):
we attacked the problem of time traveling debugging not at the application level, but directly on the OS by using the bleeding edge virtualization tools of CRIU on top of Docker.
The details are in "Time Traveling in Node.js Notebooks"
we were able to take a different approach thanks to an ambitious open source project called CRIU (which stands for checkpoint and restore in user space).
The name says it all. CRIU aims to give you the same checkpointing capability for a process tree that virtual machines give you for an entire computer.
This is no small task: CRIU incorporates a lot of lessons learned from earlier attempts at similar functionality, and years of discussion and work with the Linux kernel team. The most common use case of CRIU is to allow migrating containers from one computer to another.
The next step was to get CRIU working well with Docker.
Part of that setup is being open-sourced, as mentioned in this HackerNews thread.
It uses Linux containers, currently powered by Docker.
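As an illustration of the checkpoint/restore primitive described above, Docker exposes CRIU through its (experimental) checkpoint commands, which can be driven from Node.js like any other CLI. Container and checkpoint names are placeholders; this requires a Docker daemon with experimental features enabled and CRIU installed on the host.

    // Sketch: checkpoint and restore a container's full process tree
    const { execFile } = require('child_process');
    const { promisify } = require('util');
    const docker = (...args) => promisify(execFile)('docker', args);

    async function checkpointAndRestore(container) {
      // Freeze the container's state to disk (this stops the container
      // unless --leave-running is passed)
      await docker('checkpoint', 'create', container, 'cp1');
      // Later: resume the container exactly where it was frozen
      await docker('start', '--checkpoint', 'cp1', container);
    }

    checkpointAndRestore('runkit-sandbox').catch(console.error);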

Securing Elasticsearch Clusters

I want to create a secure Elasticsearch Cluster.
About my use case: I want a multi-tenant system where users have administrative access to their own namespace. After a couple of tries, I'm now just giving users their own clusters (via Docker).
Attempt 1: Shield on a dedicated node with multitenancy. This requires me to modify the roles.yml file for every user, which is cumbersome and painful.
Attempt 2: Docker container + Shield. This looked to be working OK after some trial and error, but I don't like the licensing, and I also do not understand how it secures the TCP transport.
Attempt 3: Docker container + nginx reverse proxy & htpasswd. This works well for securing the HTTP transport, and works great with Kibana now that basic auth is supported there. Unfortunately, it limits my clustering abilities because port 9300 is wide open.
Attempt 4: I'm about to try Docker container + Search Guard. This looks like a decent option, but I'm still not sure how the TCP transport is supposed to be secured.
How do people actually secure multitenant Elasticsearch clusters?
You're on the right track. ES isn't inherently multi-tenant, and you really can't know for sure that you've properly secured and namespaced access. ES also lacks built-in authentication and HTTPS, so you'll have those problems to deal with too. I know you can pay for the privilege, and there are some other hacks you can do to get it, but realistically the system is per-customer, not multi-tenant.
I'd also caution against the assumption that multi-tenancy using Docker is a viable solution. Again, Docker security is not a well-known, solved problem yet. There are risks when you virtualize on top of the kernel, the main one being that the kernel is a huge amount of code compared with accepted virtualization techniques on hardware. Take an Amazon EC2 instance that runs on a hypervisor: the hypervisor implements much of the boundary between VMs through hardware, i.e. special CPU features that assist in isolating different VMs at the hardware level.
Because the hypervisor is a small bit of code (compared to the kernel) it's much easier to audit, and because the hypervisor uses hardware features to enforce isolation, it's much safer.
On one dimension, Docker actually adds security on a per-process basis (i.e., if your application running nginx gets hacked and the Docker container is set up well, then the intruder will also have to break out of the container). On the other dimension, it's not nearly as good as machine virtualization.
My recommendation is to create a cluster of VMs for each customer, and on each VM cluster run the ES Docker container plus the other application containers.
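For what it's worth, the basic-auth gate from attempt 3 is simple enough to sketch in Node.js, which makes the mechanism (and its limits) concrete: everything on the HTTP port is challenged, while the 9300 transport is untouched. Credentials and ports below are placeholders; in practice you would also want TLS in front.

    // Sketch: basic-auth reverse proxy in front of Elasticsearch's HTTP port
    const http = require('http');

    const USER = 'tenant1';    // hypothetical credentials; use a real
    const PASS = 'change-me';  // credential store in practice
    const expected = 'Basic ' + Buffer.from(`${USER}:${PASS}`).toString('base64');

    http.createServer((req, res) => {
      if (req.headers.authorization !== expected) {
        res.writeHead(401, { 'WWW-Authenticate': 'Basic realm="es"' });
        return res.end();
      }
      // Forward the authenticated request to Elasticsearch
      const upstream = http.request(
        { host: '127.0.0.1', port: 9200, path: req.url,
          method: req.method, headers: req.headers },
        esRes => {
          res.writeHead(esRes.statusCode, esRes.headers);
          esRes.pipe(res);
        }
      );
      upstream.on('error', () => { res.writeHead(502); res.end(); });
      req.pipe(upstream);
    }).listen(8080);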
