Container Implementation - rackspace-cloud

I'm currently working on a project where users will upload projects, but other users will be able to clone those projects (think GitHub-esque).
Now my initial idea is to create a container for each project, making it easy to clone them, though I will still store a reference to each file and its location in the database.
Would creating a container for each project be the best option, or should I stick to a container per user? I know the limits on the number of files per container are huge, but I feel my initial plan would scale better.
Thoughts, people?

This is just a personal opinion, as I am also currently using Rackspace Cloud in my project. I think creating one container per user is still a good option, since you can copy and move objects within a container.
Also, with a container per user you can easily get the current size and object count of that user's container, so you know how much free space they have left without doing any additional calculation.
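For example, since Cloud Files speaks the OpenStack Swift API, a quick sketch with python-swiftclient (the credentials, container and object names here are made up; pyrax would work just as well) shows both the usage check and a server-side "clone" copy:

```python
# Hypothetical sketch using python-swiftclient, which speaks the OpenStack
# Swift API that Rackspace Cloud Files is built on. All names and credentials
# below are placeholders.
from swiftclient.client import Connection

conn = Connection(
    authurl="https://identity.api.rackspacecloud.com/v2.0/",  # auth endpoint (placeholder)
    user="myuser",
    key="myapikey",
    auth_version="2",
)

# Per-user container: one HEAD request returns usage, no extra bookkeeping.
headers = conn.head_container("user-42")
print("bytes used:  ", headers["x-container-bytes-used"])
print("object count:", headers["x-container-object-count"])

# "Cloning" a project file: a server-side copy, no download/upload round trip.
conn.put_object(
    "user-42",                                  # destination container
    "projects/cloned-project/app.py",           # destination object name
    contents=b"",
    headers={"X-Copy-From": "/user-7/projects/original-project/app.py"},
)
```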


Using the temp directory for Azure Functions

I have a set of Azure Functions running on the same host, which scales up to many instances at times. I'd like to store a very small amount of ephemeral data (a few KB) and opportunistically share that data between function executions. I know that the temp directory is only available to the functions running on that same instance. I also know that I could use the home directory, Durable Functions, or other Azure storage (such as Blob storage) to share data between all functions persistently.
I have two main questions:
What are the security implications of using the temp directory? Who can access its contents outside of the running function?
Is this still a reasonable solution? I can't find much in the way of Microsoft documentation outside of what looks like some outdated Kudu documentation here.
Thanks!
Answer to Question 1
Yes, it is secure. The Functions host process runs inside a sandbox. All data stored under D:\local is self-contained and isolated to the processes within the sandbox. Kindly see https://github.com/projectkudu/kudu/wiki/Azure-Web-App-sandbox
Answer to Question 2
The data in D:\local\Temp exists as long as the Functions host process is alive. The host process can be recycled at any time due to unexpected events such as unhandled exceptions, timeouts, or hitting the resource usage limits of your plan. As long as your workflow accounts for the fact that the data stored in D:\local\Temp is ephemeral, then the answer is 'yes'.
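As a rough illustration (assuming the Python worker with the v1 programming model; the cache file name is made up), opportunistically reusing a small per-instance cache from the temp directory could look like this:

```python
import json
import os
import tempfile

import azure.functions as func

# Hypothetical cache file in the per-instance temp directory (D:\local\Temp on
# Windows plans). It is shared only between executions on the same instance and
# disappears whenever the host is recycled or the app scales to a new instance.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "my_small_cache.json")


def load_cache() -> dict:
    try:
        with open(CACHE_PATH, "r") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}  # cache miss: treat as empty and rebuild


def save_cache(cache: dict) -> None:
    with open(CACHE_PATH, "w") as f:
        json.dump(cache, f)


def main(req: func.HttpRequest) -> func.HttpResponse:
    cache = load_cache()
    cache["hits"] = cache.get("hits", 0) + 1
    save_cache(cache)
    return func.HttpResponse(f"seen {cache['hits']} requests on this instance")
```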
I believe this will answer your question; please refer to this for more details.
Also, when folders/files are created via code inside the "Temp" folder, you cannot view them when you visit the Kudu site, but you can still use those files/folders.
How to view the files/folders if created via Kudu?
You will need to add WEBSITE_DISABLE_SCM_SEPARATION = true in Configuration (application settings).
Note: the main site and the SCM site do not share temp files, so if you write some files there from your site, you will not see them from the Kudu console (and vice versa).
You can make them use the same temp space if you disable separation (via WEBSITE_DISABLE_SCM_SEPARATION).
But note that this is a legacy flag, and its use is not recommended/supported.
(ref : shared document link)
Security implications depend on the level of isolation you are seeking.
In a shared App Service plan or a Consumption plan, you need to trust the sandbox isolation; this is not an isolated microVM like AWS Lambda.
If you have your own App Service plan, then you need to trust the hypervisor isolation of the VMs in that plan.
If you are really paranoid or running a healthcare application, then you likely need to run your function in an ASE (App Service Environment) plan.
A reasonable solution is one where the cost does not exceed the worth of the data you are protecting :)

How to create Azure Windows images in Azure while keeping master VM?

For my students in my teaching classes, I create short-lived Azure VMs based on an image that I have created using Sysprep and captured. That all works great.
But my problem is that each time I Sysprep and capture my master VM I lose it, which means I have to recreate the master VM from scratch each time I want to update the image, and that takes many hours to do.
I have seen many fragile approaches, all of which seem to involve a lot of manual steps and low-level disk backup/copy/VHD work to get around this.
So my question is what is the best approach for me as a teacher to keep my master VM alive so that I don't have to re-create it from scratch each time I need to create a new image for my clones?
Surely there must be a better way to do this?
For your requirement, I think you need to make a copy of your VM and then create the image from the copied VM, so that your original VM stays alive. You can follow the copy steps here, then create the image as before.
You do need to create a new image each time you update your VM, since all the clone VMs are created from the image. So that is the only way to do it.
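If you prefer to script it, one non-destructive way to make that copy (assuming managed disks; all names below are placeholders and this is only a sketch with the azure-mgmt-compute Python SDK) is to snapshot the master's OS disk and build the throwaway copy from that snapshot; you then Sysprep and capture the copy while the master keeps running:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

# Placeholder values; substitute your own.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "teaching-rg"
MASTER_VM = "master-vm"
LOCATION = "westeurope"

client = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Find the master VM's managed OS disk (the master itself is left untouched).
master = client.virtual_machines.get(RESOURCE_GROUP, MASTER_VM)
os_disk_id = master.storage_profile.os_disk.managed_disk.id

# Snapshot the OS disk; this is the non-destructive "copy" step.
snapshot = client.snapshots.begin_create_or_update(
    RESOURCE_GROUP,
    "master-os-snapshot",
    {
        "location": LOCATION,
        "creation_data": {"create_option": "Copy", "source_resource_id": os_disk_id},
    },
).result()

# Create a new managed disk from the snapshot. Attach it to a throwaway VM,
# then Sysprep and capture *that* VM; the master stays alive.
copy_disk = client.disks.begin_create_or_update(
    RESOURCE_GROUP,
    "master-copy-disk",
    {
        "location": LOCATION,
        "creation_data": {"create_option": "Copy", "source_resource_id": snapshot.id},
    },
).result()
```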

Is there any way to make a file instance persist for all users in a website

I have a file which is around 1.2 GB in size, and I want to load an instance of it while formulating results for my website. Is it possible to make that instance the same for all users of the website? According to my understanding, Heroku, for example, creates separate instances of the website for every user; is there any way to make this happen? I apologize in advance if the question is naive!
Bummer, Heroku only allows you to have 500 MB of files in your deploys.
If not for that, you could just commit that big file to your repo and use it on your site.
Heroku doesn't create an instance of the app for every user but for every deploy. Once you deploy your application, your entire old server is thrown away and a new one is created.
Let's say you need a temporary file: you could put it at /tmp/something and then use it for a while. Once you make a deploy, since the server is discarded and a new one is spawned, that file wouldn't be there and you'd have to recreate it on the first request.
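A minimal sketch of that temp-file pattern (the path and the rebuild step are made up), where the file is lazily rebuilt whenever a fresh dyno doesn't have it yet:

```python
import os

# Hypothetical location on the dyno's ephemeral filesystem; anything written
# here vanishes on the next deploy or dyno restart.
TMP_PATH = "/tmp/derived-results.csv"


def get_results_file() -> str:
    """Return the path to the derived file, rebuilding it on a fresh dyno."""
    if not os.path.exists(TMP_PATH):
        rebuild_results(TMP_PATH)  # placeholder for whatever generates the data
    return TMP_PATH


def rebuild_results(path: str) -> None:
    # Stand-in for the real work (fetching and crunching the 1.2 GB source).
    with open(path, "w") as f:
        f.write("id,value\n1,42\n")
```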
Anyway, that doesn't look good. Even if Heroku let you store the file there, you'd also have to parse and dig through it in order to display your results on your site, which is likely to make you run out of memory too.
I would recommend you review your approach to this problem: maybe break that file into small pieces, or store it in a database so you can perform smaller calculations.
If there is absolutely no way around it, you could have a server somewhere else, like DigitalOcean, and build a small API to perform the calculations there and have your site call it.
If you want, I can give you a hand switching the approach; just post a different question, comment the link to it, and I'll give it a shot.

How to mount a file and access it from an application in a Kubernetes container

I am looking for the best solution to a problem where, let's say, an application has to access a CSV file (say employee.csv) and perform some operations on it such as getEmployee or updateEmployee.
Which volume type is best suited for this, and why?
Please note that employee.csv will have some pre-loaded data already.
Also, to be precise, we are using azure-cli for managing Kubernetes.
Please Help!!
My first question would be: is your application meant to be scalable (i.e., have multiple instances running at the same time)? If that is the case, then you should choose a volume that can be written by multiple instances at the same time (ReadWriteMany, https://kubernetes.io/docs/concepts/storage/persistent-volumes/). As you are using Azure, the AzureFile volume type could fit your case. However, I am concerned that there could be conflicts with multiple writers (and some data may be lost). My advice would be to use a database system instead, so you avoid that kind of situation.
If you only want to have one writer, then you could use pretty much any of them. However, if you use local volumes you could have problems when a pod gets rescheduled on another host (it would not be able to retrieve the data). Given the requirements you have (a simple CSV file), the reason I would give you for using one PersistentVolume provider over another is whichever is less painful to set up. In that sense, just as before, if you are using Azure you could simply use the AzureFile volume type, as it should be more straightforward to configure in that cloud: https://learn.microsoft.com/en-us/azure/aks/azure-files
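Whichever volume type you mount, the application itself just sees a file path. Below is a minimal, hypothetical sketch of the getEmployee/updateEmployee side in Python (the /data mount path and column names are assumptions); note that the atomic-replace trick only guards against torn writes, not against two pods overwriting each other's changes, which is exactly why a database is safer with multiple writers.

```python
import csv
import os
import tempfile

# Hypothetical mount path; the volume (e.g. an AzureFile share) would be
# mounted here via the pod spec.
DATA_DIR = os.environ.get("DATA_DIR", "/data")
CSV_PATH = os.path.join(DATA_DIR, "employee.csv")


def get_employee(emp_id: str):
    """Return the row for emp_id, or None if it is not present."""
    with open(CSV_PATH, newline="") as f:
        for row in csv.DictReader(f):
            if row["id"] == emp_id:
                return row
    return None


def update_employee(emp_id: str, **changes) -> None:
    """Rewrite the CSV with the given columns changed for emp_id."""
    with open(CSV_PATH, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = [dict(row, **changes) if row["id"] == emp_id else row for row in reader]

    # Write to a temp file on the same volume and atomically replace, so readers
    # never see a half-written file. This does NOT serialize concurrent writers
    # across pods.
    fd, tmp_path = tempfile.mkstemp(dir=DATA_DIR)
    with os.fdopen(fd, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    os.replace(tmp_path, CSV_PATH)
```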

Why does nobody do it this way with Docker? (All-in-one container / "black box")

I need a lot of various web applications and microservices.
Also, I need easy backup/restore and the ability to move them between servers/cloud providers.
I started studying Docker for this, and I'm puzzled when I see advice like: "create a first container for your application, create a second container for your database and link them together".
But why do I need a separate container for the database? If I understand correctly, Docker's main message is: "run and move applications with all their dependencies in an isolated environment". That is, as I understand it, it is appropriate to place the application and all its dependencies in one container (especially if it's a small application with no requirement for an external database).
Here is how I see the best way to use Docker in my case:
Take a base image (e.g. phusion/baseimage).
Build my own image based on this (with nginx, the database and the application code).
Expose a port for interaction with my application.
Create a data volume based on this image on the target server (to store application data, the database, uploads, etc.) or restore the data volume from a previous backup.
Run this container and have fun.
Pros:
Easy to back up/restore/move the whole application around (move only the data volume and simply start it on the new server/environment).
The application is a "black box" with no troublesome external dependencies.
If I need to store data in external databases or consume data from them, nothing prevents me from doing it (but it is rarely necessary). And I prefer to use the APIs of other black boxes instead of direct access to their databases.
More isolation and security than in the case of a single database shared by all containers.
Cons:
Greater consumption of RAM and disk space.
A little bit harder to scale. (If I need several instances of the app to handle thousands of requests per second, I can move the database into a separate container and link several app instances to it, but that is needed only in very rare cases.)
Why have I not found recommendations for this approach? What's wrong with it? What pitfalls have I not seen?
First of all, you need to understand that a Docker container is not a virtual machine; it is just a wrapper around the kernel features chroot, cgroups and namespaces, using layered filesystems, with its own packaging format. A virtual machine is usually a heavyweight, stateful artifact with extensive configuration options regarding the resources available on the host machine, and you can set up complex environments within a VM.
A container is a lightweight, throwaway runtime environment, with the recommendation to make it as stateless as possible. All changes are stored within the container, which is just a running instance of the image, and you will lose all of those diffs if the container is deleted. Of course you can map volumes for more persistent data, but that is available to a multi-container architecture too.
If you pack everything into one container, you lose the ability to scale the components independently of each other and you build in tight coupling.
With this tight coupling you can't implement failover, redundancy and scalability features in your app configuration. Most modern NoSQL databases are built to scale out easily, and data redundancy becomes a possibility when you run more than one backing database instance.
On the other hand, defining these single-responsibility containers is easy with docker-compose, where you can declare them in a simple YAML file.
