Is an Amazon Machine Image (AMI) static, or can its code be modified and rebuilt? - amazon-ami

I have a customer who wants me to customise the ERP system opentaps, which they use via opentaps Amazon Elastic Compute Cloud (EC2) images. I've only worked with it on a normal server and don't know anything about images in the cloud. When I SSH in with the details the client gave me, there is no sign of the ERP installation directory I'd expect to see. I originally expected that the image wouldn't be accessible, but the client assured me it was. I suppose they could be confused.
Would one have to create a new image and swap it out, or is there a way to alter the source and rebuild, as on a normal server?

Something is not quite clear to me here. First of all, instances launched from EC2 images are just like normal virtual servers, so if you have access to the running instance, there is no difference between an instance in the cloud and one running on another PC in your home, for example.
You have to find out how opentaps is installed on the provided AMIs, make your modifications, then create an image from the modified instance and save it to S3 for backup if necessary.
If you want to start with a fresh instance, you can start up any Linux/Windows distro on EC2, install opentaps yourself your own way, and you are done.
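The "create an image from the modified instance" step can also be done from the command line rather than the console. Here is a minimal sketch, assuming the AWS CLI is installed and configured. The instance ID and image name are hypothetical placeholders, and the function only builds the argument list; it does not call AWS.

```python
import shlex


def create_image_command(instance_id, name, description=""):
    """Build the `aws ec2 create-image` call that turns an instance into a new AMI."""
    cmd = ["aws", "ec2", "create-image",
           "--instance-id", instance_id,
           "--name", name]
    if description:
        cmd += ["--description", description]
    return cmd


if __name__ == "__main__":
    # Hypothetical values; take the real instance ID from the EC2 console.
    cmd = create_image_command("i-0123456789abcdef0", "opentaps-customised")
    print(shlex.join(cmd))
```

The printed command can be reviewed and then run by hand in a shell.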

Related

Use Google App Engine or Google Cloud Compute VM to Test Run My App?

I'm moving my Three.js app and its customized node.js environment, which I've been running on my local machine to Google Cloud. I want to test things out there, and hopefully soon get some early alpha testing going with other people.
I'm not sure which is the wiser way to go... to upload the repo I've been running locally as-is onto a VM which users would then access via the VM's external IP until I get a good name to call this app... or merge my local node.js environment with what's available via the Google App Engine and run it on GAE.
Issues I'm running into with the Linux VM approach: I'm not sure how to do the equivalent on the VM of what I've been doing locally. In Windows PowerShell I cd into the app directory and then enter node index.js. I'm assuming that with this method of deployment I can get the app running as soon as the browser hits the external IP. I should mention too that the app will allow users to save content as well as upload images and, eventually, 3D models and JSON datasets.
Issues I'm running into with the App Engine approach: it looks like I only have access to a Linux-based command line and have to install all the node.js modules manually. Meanwhile I have a bunch of files to upload, both the server-side node files and all the frontend stuff. I don't see where to upload those files, and ultimately what I'd like is access to a visual, editable file-tree interface, as I have in Windows and FileZilla, so I can swap files in and out, etc. Alternatively, I suppose I could import a repo from GitHub? GitHub would be fine as long as I can visually see what's happening. Is there a visual interface for file structure available in GAE somewhere? Am I missing something?
I went through the GAE "Hello World" tutorial and that worked fine, but was left scratching my head afterward regarding how to actually see and edit the guts of the tutorial app, or even where to look for the files.
So first off, I want to determine which approach is better, and then, if possible, how to make getting my app up and running there a more visual, user-friendly experience.
Thanks.

There are many things to consider when choosing how to run an app, but my instinct for your use case is to simply use a VM on GCE. The most compelling reason for this is that it's the most similar thing to what you have now. You can SSH into the machine and run nohup node index.js & (or node index.js inside tmux/screen if you prefer), and it will start the app and keep it running after you log out of SSH. You can use SCP / SFTP with whatever GUI client you want to upload files. You don't have to learn anything new! If you wanted to, you could even use a Windows VM (although I think you have to pay a little more than for a comparable Linux VM due to the licensing fees).
That said, the other way is arguably more "correct" by modern development standards, but it will involve a lot more learning that will prevent you from getting your app running somewhere other than your laptop in the short term:
First, you'll need to learn about Docker and stateless containers, which is basically what your app runs inside of on AppEngine.
Next, you'll need to learn how to hook up a separate stateful service (database, file server, ...) to your app's container so you can store your files, etc. in it, and then probably rewrite your app somewhat to use it to store stuff.
Next, you'll probably want some way to automatically deploy this from code instead of manually doing it, which gets you into build systems, package managers, artifact storage, continuous integration systems, and on and on and on.
This latter path is certainly what you should choose for a long-running production service if you work with a big team of developers -- but that doesn't mean that it's necessarily the right path for your project today. If you don't care about scaling up automatically, load balancing between nodes, redundant copies of your app running in different regions in case there's a natural disaster, etc., then go with the easy way for now, and you can learn new ways to improve the service when they're actually needed.

File server in container

I can see that web and DB servers have been implemented to run as containers, so it occurred to me: why not implement a file server the same way, backed by centralized storage (e.g. a SAN)?
Has anyone tried this before, or does anyone have a recommendation for me?
My basic idea is to use 2-3 Docker images to create the file servers (mostly Windows servers), all mounting the same storage. For the front end, I may go for DFS namespaces to normalize the UNC path.

Windows-based images have the Server service disabled out of the box. It cannot be started either, because the drivers have been removed as well. This will not be possible in Windows containers.

How to create and run multiple EC2 instances with the same configuration and software installed?

I'm fairly new to cloud computing, so bear with me if the question is obvious or silly. With the tons of information available on the internet, I was able to create an EC2 Linux instance and install R and RStudio on it. I ran my scripts on it, which went really well but took too long (16 hrs) and was very expensive as well, since I require instances with high memory and many vCPUs.
In my programs, I am essentially running the same scripts for different datasets.
My question is: is there any way I can run multiple identical EC2 instances (with exactly the same software and my scripts installed)? That way, I would be able to run my scripts on every dataset on a separate instance simultaneously, in less time.
Here is what I have tried so far. I created an AMI of my existing instance and launched it, but I couldn't SSH into it because of its weird username and IP address, something like "root#10.0.0.1". I can see that both instances are running (the original and the one launched from the AMI); I can SSH into the original but not into the other one. I am able to log in to RStudio for the original instance on port 8787.
Another question is how to connect to the instance launched from the AMI over SSH (PuTTY) in parallel with the original instance. What problems will it cause if I use both of them in the browser (RStudio in this case) simultaneously?
Please help me with this! Thanks!

Problem: For a school project, I was running several machine learning algorithms on a fairly large dataset, which turned out to require 30-35 GB of memory, and my PC couldn't handle it. I was using R/RStudio, so I turned to AWS to get around the memory limitation.
What I did initially: I created an EC2 instance and installed R/RStudio. Everything worked out perfectly and I was able to run my programs in RStudio through the browser. I actually ran my scripts on a very small dataset on this AWS instance to see how things would go. Much to my surprise, the whole script took very long to run even with this small dataset. Soon enough, I realized that all the algorithms in my programs could be run independently on the same set of features with a little tweak to the scripts.
So I decided to play with AWS a little bit. I rewrote the programs so that everything stayed the same except the learning algorithm in each script. In other words, I wanted to run copies of these programs with different algorithms simultaneously and produce the results in a smaller amount of time.
Now my goal was to run multiple copies of the original instance, with RStudio open in the browser for each of them, e.g. 5 EC2 instances would have 5 RStudio sessions running concurrently in different browser tabs.
Then I created an image (AMI) of this instance and created multiple instances from the AMI, but I missed a few points while creating those new instances, which caused the problem I asked about in the question above.
I initially suspected that it had something to do with port 8787 and that I might not be able to run RStudio for multiple EC2 instances in the browser. However, that was not the problem at all.
There are a few very important things to take care of when you create new instances from an AMI.
Mistake: While creating new instances from this AMI, I was not selecting two important things correctly, i.e. the VPC and the Security Group.
The correct method is:
VPC -- On the "Configure Instance Details" page:
a. Click the "Network" dropdown and select the VPC that was created for the original instance (the original instance is the one used to create the AMI).
b. Click the "Auto-assign Public IP" dropdown and select "Enable".
Security Group -- On the "Configure Security Group" page:
a. For the "Assign a security group" option, tick "Select an existing security group".
b. If there is more than one security group in the list, select the one that was created for the original instance (or create a new security group and make sure it has the same inbound and outbound port access).
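The same choices can be spelled out when launching from the command line instead of the console wizard. Here is a minimal sketch, assuming the AWS CLI is installed and configured. Every ID, the key name and the instance type are hypothetical placeholders. Choosing a subnet implies its VPC, so the subnet of the original instance stands in for the "Network" dropdown. The function only builds the argument list and does not call AWS.

```python
import shlex


def run_instances_command(ami_id, count, instance_type, key_name,
                          subnet_id, security_group_id):
    """Build an `aws ec2 run-instances` call that reuses the original network setup."""
    return [
        "aws", "ec2", "run-instances",
        "--image-id", ami_id,
        "--count", str(count),
        "--instance-type", instance_type,
        "--key-name", key_name,
        "--subnet-id", subnet_id,                   # a subnet inside the original VPC
        "--security-group-ids", security_group_id,  # same inbound/outbound rules
        "--associate-public-ip-address",            # "Auto-assign Public IP: Enable"
    ]


if __name__ == "__main__":
    # Hypothetical placeholders; copy the real values from the original instance.
    cmd = run_instances_command(
        ami_id="ami-0123456789abcdef0",
        count=5,
        instance_type="r5.2xlarge",
        key_name="my-key",
        subnet_id="subnet-0123456789abcdef0",
        security_group_id="sg-0123456789abcdef0",
    )
    print(shlex.join(cmd))
```

Launching all copies with one call and a count keeps their network settings identical, which is what the manual steps above are meant to guarantee.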
Once I set this up, as Marc B mentioned in the comments, each instance gets its own public address, and a local subnet address is assigned as well.
The public address of an instance looks like: ec2-33-444-22-111.us-west-1.compute.amazonaws.com
The subnet address looks like: 127.0.0.35
After learning this, I recreated 5 instances from my AMI, so now I had 5 instances with RStudio on each of them. All of them were running perfectly fine, and I was able to SSH into each of them.
I thought I should now be able to work with these instances in different browser tabs and run my scripts in them, but I wasn't able to log in to all the RStudio instances. Only one of them worked in the browser; the others did not. However, I was able to SSH into all of them from PuTTY. I could have run my scripts from the Linux shell (over SSH) as well, but I wanted to run them using RStudio.
After spending a good number of hours on this, I figured out that the RStudio server needs to be started manually on every EC2 instance except the very first one.
For one of the EC2 instances (other than the one that was working fine in the browser), I started the RStudio server manually as follows:
SSH in using PuTTY
Become root: sudo su
Go to the path where RStudio was installed on my Linux instance: cd /usr/lib/rstudio-server/bin
Start the RStudio server with this command: ./rstudio-server start
Now go back to the browser, open another tab and use your ec2-instance address and port number (http://ec2-33-444-22-111.us-west-1.compute.amazonaws.com:8787). And now you should get the login page of RStudio for this instance as well.
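When a tab does not load, it helps to know whether anything is listening on port 8787 at all. Here is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the hostname is the example address from above. A closed port can mean either that the RStudio server is not started or that the security group blocks the port, so this check narrows the problem down but does not settle it.

```python
import socket


def rstudio_url(public_dns, port=8787):
    """Return the browser URL for the RStudio server on one instance."""
    return f"http://{public_dns}:{port}"


def port_is_open(host, port=8787, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "ec2-33-444-22-111.us-west-1.compute.amazonaws.com"
    print(rstudio_url(host))
```

Calling port_is_open(host) from your own machine for each instance shows which ones still need attention.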
With a similar process, I had to start the RStudio server manually on all the other instances in order to access them through the browser. Then I wondered whether there was a way to start the RStudio server every time Linux boots, and came up with a solution: I made a change to one of the Linux configuration files, as follows:
Become root: sudo su
Go to this path: cd /etc/rc.d
Open the file rc.local in vi and add the following command:
/usr/lib/rstudio-server/bin/rstudio-server start
Save the changes you made.
Close the SSH connection.
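The rc.local edit can also be scripted so that repeating it never adds the line twice. Here is a minimal sketch with hypothetical helper names. It assumes it is run as root, that the file is /etc/rc.d/rc.local as above, and that the file does not end with an exit statement, since a line appended after one would never run.

```python
from pathlib import Path

START_LINE = "/usr/lib/rstudio-server/bin/rstudio-server start"


def ensure_line(text, line):
    """Return text with `line` appended on its own line unless it is already there."""
    if line in text.splitlines():
        return text
    if text and not text.endswith("\n"):
        text += "\n"
    return text + line + "\n"


def patch_rc_local(path, line=START_LINE):
    """Append the start command to the file at `path`; return True if it changed."""
    p = Path(path)
    old = p.read_text() if p.exists() else ""
    new = ensure_line(old, line)
    if new != old:
        p.write_text(new)
    return new != old
```

Calling patch_rc_local("/etc/rc.d/rc.local") once before creating the AMI has the same effect as the manual edit.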
Then I went back to the AWS console, stopped this instance and created an AMI (image) of it. The above changes now take effect for every instance created from this AMI, i.e. the RStudio server starts as soon as the instance boots and is accessible through the browser.
Now I can use multiple RStudio instances in different tabs of my browser. Make sure you are using the correct instance address in the browser. The port number stays the same for all of them, i.e. 8787.

Use a Rackspace cloud image on Amazon EC2?

I have a Rackspace (UK) cloud instance running Ubuntu 11.10, on which it has taken 10+ man-hours to install all the packages (and custom application code) I need, tighten security, test, etc.
I can take a snapshot of that, and start another instance on Rackspace UK. That worked nicely. Because I've got /etc under git source control I could see the files the start-up process altered were:
network files (IP address, default gateway)
root password
/etc/hostname
About the only post-startup steps I needed to do were a DNS entry and dpkg-reconfigure postfix to set the new machine name.
I'm assuming, but haven't tested yet, that I could use this image with Rackspace U.S. But what about Amazon EC2 (or any other cloud provider, for that matter)? Can I just download the image, upload it to Amazon S3, and start new instances with it? If not, is there a utility I can run to convert from one Linux image format to another?

The poor man's approach is to use rsync between servers. Rackspace has a 3-part guide on this, starting here:
http://www.rackspace.com/knowledge_center/index.php/Migrating_a_Linux_Server_From_Command_Line_Stage_1
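As a rough illustration of the rsync approach, the sketch below builds a command that copies the root filesystem to a new server while skipping host-specific files like those listed in the question. It is not the procedure from the Rackspace guide. The target host is a hypothetical placeholder, and /etc/network/interfaces is an assumption about where the network settings live on Ubuntu. The function only builds the argument list, and it defaults to a dry run.

```python
import shlex

# Host-specific files that should keep the target's own values (assumed paths).
HOST_SPECIFIC = ["/etc/hostname", "/etc/network/interfaces"]


def rsync_migration_command(target, excludes=HOST_SPECIFIC, dry_run=True):
    """Build an rsync call copying / to target:/ without crossing filesystems."""
    # -a archive mode, -H keep hard links, -x stay on one filesystem
    cmd = ["rsync", "-aHx", "--numeric-ids"]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied, change nothing
    for path in excludes:
        cmd += ["--exclude", path]
    cmd += ["/", f"{target}:/"]
    return cmd


if __name__ == "__main__":
    print(shlex.join(rsync_migration_command("root@new-server.example.com")))
```

Reviewing the dry-run output before running the real copy is a cheap way to catch a missing exclude.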

Cloud Foundry: how to use the filesystem

I am planning to use the Cloud Foundry PaaS (from VMware) to host my node.js application. I have seen that it supports Mongo and Redis in the service layer, as well as the node.js framework. So far so good.
Now I need to store my media files (images uploaded by users) on a filesystem. I have the metadata stored in Mongo.
I have been searching the internet, but have not found good information yet.

You cannot do that, for the following reasons:
There are multiple host machines running your application. They each have their own filesystems. Each running process in your application would see a different set of files.
The host machines on which your particular application is running can change moment-to-moment. Indeed, they will change every time you re-deploy your application. Every time a process is started on a new host machine, it will see an empty set of files. Every time a process is stopped on an old host machine, all the files would be permanently deleted.
You absolutely must solve this problem in another way.
Store the media files in MongoDB GridFS.
Store the media files in an object store such as Amazon S3 or Rackspace Cloud Files.
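Either option comes down to the same pattern: the bytes go to a store that outlives any single host, and only a reference is kept with the metadata. Here is a minimal sketch of that pattern, written in Python for brevity although the application in the question is node.js. The class and function names are hypothetical, and the in-memory store is only a stand-in for S3, Cloud Files or GridFS.

```python
import hashlib


class InMemoryObjectStore:
    """Stand-in for an external object store; used only to keep the sketch runnable."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]


def save_upload(store, filename, data):
    """Store the bytes externally and return the metadata document to keep in Mongo."""
    # A content hash as the key means identical uploads map to the same object.
    digest = hashlib.sha256(data).hexdigest()
    key = f"media/{digest}"
    store.put(key, data)
    return {"filename": filename, "key": key, "size": len(data)}


if __name__ == "__main__":
    store = InMemoryObjectStore()
    meta = save_upload(store, "cat.png", b"fake image bytes")
    print(meta["key"], meta["size"])
```

Because no process writes to its local disk, any instance of the application can serve any file, and a redeploy loses nothing.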
Filesystems in most cloud solutions are "ephemeral", so you cannot rely on the local filesystem. You will have to use a solution like S3 or a database for this purpose.
