How to create and run multiple EC2 instances with the same configuration and software installed? - linux

Fairly new to cloud computing, so bear with me if the question is obvious or silly. With the tons of information available on the internet, I was able to successfully create an EC2 Linux instance and install R and RStudio on it. Running my scripts on it went really well but took too long (16 hours) and was very expensive as well, since I require instances with high memory and many vCPUs.
In my programs, I am essentially running the same scripts for different datasets.
My question is: is there any way I can run multiple similar EC2 instances (with exactly the same software and scripts installed)? That way I could run my scripts on each dataset on a separate instance simultaneously, in less time.
Here is what I have tried so far. I created an AMI image of my existing instance and launched it, but I couldn't SSH into it because of its odd username and IP address, something like "root#10.0.0.1". I can see that both instances are running (the original and the one launched from the AMI); I can SSH into the original but not into the other one. I am able to log in to RStudio for the original instance on port 8787.
Another question is how to connect to this AMI-based instance over SSH (PuTTY) in parallel with the original instance. Will it cause any problems if I use both of them in the browser (RStudio in this case) simultaneously?
Please help me with this! Thanks!

Problem: For a school project, I was running several machine learning algorithms on fairly large data which required 30-35 GB of memory, and my PC couldn't handle it. I was using R/RStudio, so I resorted to AWS for my memory limitation.
What I did initially: I created an EC2 instance and installed R/RStudio. Everything worked out perfectly and I was able to run my programs in RStudio through the browser. I first ran my scripts on a very small dataset on this AWS instance to see how things were going. Much to my surprise, the whole script took very long to run even with this small dataset. Soon enough, I realized that all the algorithms in my programs could be run independently on the same set of features with a little tweak to the scripts.
So, I decided to play with AWS a little bit. I restructured the programs so that everything stayed the same except the learning algorithm in each script. In other words, I wanted to run a copy of these programs with a different algorithm in each, so that everything ran simultaneously and produced the results in less time.
Now, my goal was to run multiple copies of this instance (the original instance), with RStudio accessible in my browser for each of them, e.g. 5 EC2 instances would have 5 RStudio sessions running concurrently in different browser tabs.
Then I created an image (AMI) of this instance and created multiple instances from the AMI, but I missed a few points while creating those new instances, which caused the problem I asked about in the question above.
I initially suspected that it had something to do with port 8787 and that I might not be able to run multiple RStudio sessions in the browser, one per EC2 instance. However, that was not the problem at all.
There are a few very important things to take care of when you create new instances from an AMI.
Mistake: While CREATING new instances from this AMI, I was NOT selecting two important things correctly, i.e. the VPC and the Security Group.
Correct method is:
VPC -- On the "Configure Instance Details" page:
a. Click the "Network" dropdown and select the VPC that was created for the original instance. (The original instance is the one used to create the AMI (image).)
b. Click the "Auto-assign Public IP" dropdown and select Enable
Security Group -- On the "Configure Security Group" page:
a. For the "Assign a security group" option, tick "Select an existing security group"
b. If there is more than one security group in the list, select the one that was created for the original instance (OR create a new security group and make sure it has the same inbound and outbound port rules). A hedged AWS CLI equivalent of these two settings is sketched below.
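For comparison only, here is a rough sketch of the same launch done with today's AWS CLI. Every ID below is a placeholder: the subnet must belong to the original instance's VPC, and the security group is the original instance's one.
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --count 5 \
  --instance-type r5.2xlarge \
  --key-name my-key \
  --subnet-id subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0 \
  --associate-public-ip-address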
Once I set this up, as Marc B mentioned in the comments, each instance gets its own IP address, and a local subnet address is assigned as well.
The public address of an instance looks like: ec2-33-444-22-111.us-west-1.compute.amazonaws.com
The subnet address looks like: 127.0.0.35
After learning this, I recreated 5 instances from my AMI, so I now had 5 instances with RStudio on each of them. All of them were running perfectly fine, and I was able to SSH into each of them.
I thought I should now be able to work with these instances in different browser tabs and run my scripts in them, but I wasn't able to log in to all the RStudio instances from my browser tabs. Only one of them was working; the others just weren't reachable in the browser. However, I was able to SSH into all of them from PuTTY. I could have run my scripts over SSH (Linux) as well, but I wanted to run them using RStudio.
After spending a good number of hours on this, I figured out the problem: the RStudio server needs to be started manually on each EC2 instance except the very first one.
For one of the EC2 instances (besides the one that was already working fine in the browser), I started the RStudio server manually as follows:
SSH using PuTTY
Become root: sudo su
Go to this path where RStudio was installed on my Linux instance: cd /usr/lib/rstudio-server/bin
Start RStudio Server with this command: rstudio-server start
Now go back to the browser, open another tab, and use your EC2 instance address and port number (http://ec2-33-444-22-111.us-west-1.compute.amazonaws.com:8787). You should now get the RStudio login page for this instance as well. (A way to script this step across several instances at once is sketched below.)
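Since the same command has to be run on every new instance, a small loop from a local shell can do it in one pass. This is only a sketch: the hostnames, key file, and username below are placeholders for your own values.
for host in ec2-33-444-22-111.us-west-1.compute.amazonaws.com ec2-33-444-22-112.us-west-1.compute.amazonaws.com; do
  ssh -i my-key.pem ec2-user@"$host" "sudo /usr/lib/rstudio-server/bin/rstudio-server start"
done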
With a similar process, I had to manually start the RStudio server on every other instance in order to access it through the browser. Then I wondered whether there was a way to start the RStudio server every time Linux boots, and came up with a solution: I made a change to one of Linux's configuration files as follows:
Become root: sudo su
Go to this path: cd /etc/rc.d
vi the file rc.local and add the following command (a non-interactive equivalent is sketched after these steps):
/usr/lib/rstudio-server/bin/rstudio-server start
Save the changes you made.
Close the SSH connection.
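For reference, a hedged one-liner that appends the same command without opening vi (run as root; on some distributions rc.local must also be executable for it to run at boot):
echo "/usr/lib/rstudio-server/bin/rstudio-server start" >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local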
Then I went back to the AWS console, stopped this instance, and created an AMI (image) of it. Now the above change is effective for each instance I create from this AMI, i.e. the RStudio server starts as soon as the instance boots and is accessible through the browser.
Now I can use multiple RStudio instances in different tabs of my browser. Make sure you are using the correct instance address in the browser; the port number stays the same for all of them, i.e. 8787.

Related

Does an open SSH connection to a GCloud VM prevent it from freezing/crashing?

I have an f1-micro gcloud VM instance running Ubuntu 20.04.
It has 0.2 vCPUs and 600 MB of memory.
I write freezing/crashing to mean that it simply stops responding to anything.
From my monitoring I can see that the CPU peaks at 40% usage (usually steady under 1%), while the memory is always around 60% (both stats with my (nodejs) server running).
When I open an SSH connection to my instance and run my (nodejs) server in the background, everything works fine as long as I keep the SSH connection alive. As soon as I close the connection, it takes a few more minutes until the instance freezes/crashes. Without closing the SSH connection I can keep it running for hours without any problem.
I don't get any crash or freeze information from gcloud itself. The instance has a green checkmark and is, in a sense, still running; I just can't open a new SSH connection, and the only way to do anything with this instance again is to restart it.
I have Cloud Logging active and there are also no messages in there.
So with this knowledge, my question is whether gcloud somehow boosts SSH-connected VMs to keep them alive,
because I don't know what else could cause this behaviour.
My (nodejs) server uses around 120 MB, another service uses 80 MB, and the GCP monitoring agent uses 30 MB. The Linux free command on the instance shows available memory between 60 MB and 100 MB.
In addition to John Hanley's and Mike's suggestions, you can edit your machine type based on your needs, either in the console (steps below) or with the gcloud CLI (sketched after the steps).
In the Google Cloud Console, go to VM instances under Compute Engine.
Select the instance name to open its Overview page.
Make sure to stop the instance before editing it.
Select a machine type that matches your application's needs.
Save.
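A hedged gcloud CLI equivalent of the steps above; the instance name, zone, and machine type are placeholders to replace with your own:
gcloud compute instances stop my-instance --zone=us-central1-a
gcloud compute instances set-machine-type my-instance --zone=us-central1-a --machine-type=e2-small
gcloud compute instances start my-instance --zone=us-central1-a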
For more info and guides you may refer to the links below:
Edit Instance
Machine Family Categories
Since there were no answers that explained the strange behaviour I encountered:
I haven't fully figured it out either, but at least my servers won't crash/freeze anymore.
I somehow fixed it by running my node.js application as an actual background job using forever, instead of running it like node main.js &.
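A minimal sketch of the difference, assuming main.js is the server's entry point and npm is available on the VM:
node main.js &           # fragile: the process is tied to the SSH session's shell
npm install -g forever   # install forever globally
forever start main.js    # forever daemonizes the process and restarts it if it exits
forever list             # check that it is running
forever stop main.js     # stop it later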

Should I configure my EC2 using user_data or Ansible?

When launching EC2 with Terraform (or CloudFormation), we can configure the instance by putting some scripts in user_data/remote-exec. Alternatively, we can configure EC2 using Ansible/Chef, etc. What is the difference between configuring EC2 with user_data/remote-exec and doing it with Ansible/Chef? When should I use the former and when the latter (I know Ansible/Chef is idempotent)?
In my case, the EC2 instance was originally launched manually and then configured manually using a lot of Linux commands, and those commands were not written by me. Now I am the person automating the whole setup with Terraform and configuring the EC2 instances. Using user_data/remote-exec to configure EC2 is straightforward: I just need to put all the existing Linux commands into some scripts with small changes. And if the configuration produced by my script is not successful, at least I can quickly figure out whether I missed some commands by comparing my script with the original Linux commands. But if I use Ansible/Chef, I have to rewrite all the steps in a different language, and if the configuration is not what I expected, it is hard to figure out which steps are incorrect, because the syntax of Ansible/Chef and plain Linux commands are totally different.
My question is: in my case, should I use Ansible/Chef or user_data/remote-exec for configuration?
User data is good for initial configuration of the system. If you need longer-term maintenance, configuration management software like Ansible/Chef/Salt/Puppet is a great option.
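As a rough illustration of the user_data route (the package names below are placeholders for whatever the existing manual commands install), the script is just the shell commands you already have, run once as root at first boot by cloud-init:
#!/bin/bash
yum update -y                # apt-get on Debian/Ubuntu AMIs
yum install -y git htop      # placeholder packages; substitute the real ones
# ...the rest of the existing manual commands go here...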
Packer can be used for immutable infrastructure, i.e. infrastructure that doesn't change after creation. You bake all the scripts and installs into the image so it is ready as soon as it boots; this is also faster because you don't have to wait for user data to run.
A few questions you have to ask as well: how often are you going to patch these? Are you going to just update the existing servers or replace them with new ones? Ansible is great for configuration since it's just YAML files.
Blue/Green deployments generally replace servers with all new ones and gradually move traffic over to the new servers.
Some more things to consider with your Infrastructure as Code.

Install Neo4j on Azure, cannot browse WebAdmin

I've just installed Neo4j 1.8.2 onto Azure by following this step-by-step process...
http://de.slideshare.net/neo4j/neo4j-on-azure-step-by-step-22598695
Unfortunately, when I browse to http://<my-server>:7474/webadmin, Fiddler says Error 10061 - No connection could be made because the target machine actively refused it.
I've followed the instructions exactly and haven't received any errors.
Any help much appreciated.
So, I think I got to the bottom of this. I think it was due to the size of the compute/VM I was creating. It looks like the problem occurs when running on Extra Small instances. I created a new installation using a Small instance and everything now works :).
Try setting the server to accept connections from all hosts, and maybe use a newer Neo4j, say 1.9.4:
http://docs.neo4j.org/chunked/stable/security-server.html#_secure_the_port_and_remote_client_connection_accepts
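Following that doc, for Neo4j 1.x this is a one-line change in conf/neo4j-server.properties under the install directory (the setting is commented out by default, which limits the web server to localhost), followed by a restart from the same directory:
org.neo4j.server.webserver.address=0.0.0.0
bin/neo4j restart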
The way the VM Depot image is set up, it's pre-configured to allow all hosts to connect, and the Neo4j server will auto-start. The only thing you need to take care of, when constructing your VM, is to open an Input Endpoint, with any public port you want (preferably 7474 to stay true to Neo4j) and internal port 7474.
Note that the UI has changed a bit since the how-to was published: you can specify the endpoint as the last step before creating your virtual machine. Other than that, the instructions should be the same. And... once the VM is up and running (it will take about 5-10 minutes), you just visit http://yourservicename.cloudapp.net:7474 and you should see the web admin. Note: this is not the same as your VM name. If you named your VM something like 'neo', then you do not want http://neo:7474 or http://neo.cloudapp.net:7474. You need to use your cloud service name (you had to create a name for the service when you deployed the VM).
I've deployed that image several times in demos, and just tried again right now to make sure nothing wonky happened. Worked perfectly.

Is an Amazon Machine Image (AMI) static, or can its code be modified and rebuilt?

I have a customer who wants me to do some customisations of the ERP system opentaps, which they use via the opentaps Amazon Elastic Compute Cloud (EC2) images. I've only worked with it on a normal server and don't know anything about images in the cloud. When I SSH in with the details the client gave me, there is no sign of the ERP installation directory I'd expect to see. I originally expected that the image wouldn't be accessible, but the client assured me it was. I suppose they could be confused.
Would one have to create a new image and swap it out or is there a way to alter the source and rebuild like on a normal server?
Something is not quite clear to me here. First of all, EC2 instances running in the cloud are just like normal virtual servers, so if you have access to the running instance there is no difference between an instance in the cloud and an instance on another PC in your home, for example.
You have to find out how opentaps is installed on the provided AMIs, then make your modifications, create an image from the modified instance, and save it to S3 for backup if necessary.
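With today's AWS CLI, creating that image from the modified instance is a single call; the instance ID, name, and description below are placeholders:
aws ec2 create-image --instance-id i-0123456789abcdef0 --name "opentaps-customised" --description "opentaps with my customisations"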
If you want to start with a fresh instance, you can start up any Linux/Windows distro on EC2, install opentaps yourself your own way, and you are done.

How to duplicate a virtual PC with SharePoint, K2 and domain controller

Is anyone aware of an easy way of duplicating and renaming a virtual PC (this can be MS Virtual PC, VMware or VirtualBox) which is running SharePoint, K2 and acting as a domain controller? I’m looking for a method of creating an image which can be quickly and easily copied and run by multiple parties on the same network simultaneously without name conflicts. It’s either that or go through a ground-up build on each and every machine, as far as I can see.
I'd advise against it: renaming an installed SharePoint machine is sure to cause you pain indefinitely and unexpectedly. The way to go is with scripted installs:
create a copy of a VM with the OS installed
rename the machine + run sysprep
script install SQL
script install MOSS
script configure MOSS (replaces the config wizard + a lot of manual settings)
It can all be done unattended.
As a shortcut to install short-lived development machines I have used the following. Just make sure the SharePoint configuration wizard runs after the rename and there should be no problem.
create a copy of a VM having: OS + SQL + MOSS (no config wizard)
rename machine
script configure MOSS
It has the advantage that your development machines are identically installed. It takes about 10 minutes to create a fresh one. This approach doesn't include sysprep, but the machines are renamed so you can run them all on your network. Not running sysprep has never caused me grief, but I wouldn't do it for production environments. Running the MOSS configuration scripted makes sure it will work in the renamed environment (and all MOSS farms are configured exactly the same: same ports, SSP setup, etc, yay!)
For MOSS configuration scripting see http://stsadm.blogspot.com/2008/03/sample-install-script.html
Plenty of samples for SQL out there too.
SharePoint doesn't like having the server renamed from under its feet (so to speak). Neither does SQL Server (which I assume you'd have installed on the VM for the installation). Not sure about a DC being renamed; there are probably problems there as well...
Having said that, there are some instructions I've read for renaming both SharePoint machines and SQL Server machines, so you might get somewhere.
On the third hand, I've tried it a few times and always ended up rebuilding the server from the ground up for SharePoint as it can get subtly mangled in ways which aren't always apparent straight away (the admin interface and shared services seem to be especially easy to confuse). I've found that I can build a vanilla MOSS install pretty quickly these days...
SharePoint writes the name of the server into configuration tables in SQL Server, so if you change the name of the server, things stop working.
What you can do is install just the OS, then take a copy each time you need a new machine. Run sysprep to give the machine a new name, then install SQL Server and MOSS.
This is not exactly what you are after but it should save you some time.
I've done this, and it wasn't too bad.
Rename the SharePoint server first, then rename the Windows server.
This posting has a nice checklist.
Don't forget to remove the NIC node from the settings file of the virtual machine, otherwise you will get name collisions due to duplicate MAC addresses. Here's a how-to.
I believe the solutions above are really good. But I would suggest an alternative ...
If this is a development virtual PC, I would suggest that you do the following:
Do not rename the server
Change the IP address to be on different network
Change the MAC address so that there are no packet collisions
Since you are using it as a development VPC, edit the computer's lmhosts file and change the entry to point to the new IP address
You might want to skip step 2 and stay on the same network, but changing the hosts file will still point back to you. For example, if your server name was "myserver" and it pointed to 192.168.1.100, which was the local IP (with a hosts file entry), then if you copy the server, give it the IP 192.168.1.150, and edit the hosts file to point myserver to 192.168.1.150, the system will still work flawlessly. There will be some domain name collisions in the event log of the machine, but they won't affect your development.
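For reference, using the example addresses from above, the hosts/lmhosts entry on the copied machine changes from the first pair of lines to the second:
# on the original machine
192.168.1.100   myserver
# on the copy, after the edit
192.168.1.150   myserver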
