File Server Design Approaches [closed]

I have been working on designing a file server that could take load off the primary website and serve images/files over the web to clients.
Primary goals of the file server:
- Take load off the primary server hosting the site
- Reuse the existing web server code base and avoid duplicating code/logic, for better maintainability
- Be scalable as downloads increase
- Hide the real download URL/path from the user
With the above in mind, I could come up with two approaches. Sequence diagram representations of the two approaches are included for ease of understanding [apologies for the skewed use of sequence diagrams]. Neither approach satisfies all of my goals.
Which of these approaches would you recommend considering my goals?
Is there a better third approach?
Some of the differences I could think of:
- Approach #1 would result in duplicated BL code, causing maintainability issues
- Approach #2 would reuse code and centralize the BL, reducing maintainability issues
- Approach #1 would reduce network calls, while #2 increases them
The concepts of file servers, download scalability and bandwidth distribution have all been around for a while now. Please share your thoughts!
UPDATED:
Approach #1 looks very attractive, as it takes the load off the primary server completely. The only issue to address in #1 is code duplication and the resulting maintainability problems. This could be overcome by having a single project for the BL/DAC comprising the functionality required by both the web service and the file server, and referencing that assembly/library in both projects. There is then only one BL/DAC codebase to maintain, and the extra network calls of approach #2 are avoided.
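As a rough sketch of this shared-library idea (hypothetical names, and in Python only as an analogy, since the actual projects would reference a shared .NET assembly): both the web service and the file server import the one BL/DAC module, so there is a single implementation to maintain.

# shared_bl.py - the single BL/DAC library referenced by both projects (hypothetical)
def resolve_download(file_id, user):
    """Apply authorization and business rules, then map the public file id
    to the real storage path, which is never exposed to the client."""
    # ... validation, authorization, auditing ...
    return "/storage/%s" % file_id

# web_service.py and file_server.py would both do:
#     from shared_bl import resolve_download
# so there is only one BL/DAC codebase, and no cross-server network call is needed.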

By serving images/files to the client, I assume you mean static files: CSS, JS, etc.
Most of the time, a simple solution is the best solution. Just host them on a different server under a different subdomain, e.g. http://content.mydomain.com/img/xyz.jpg. You could host them on a dedicated server at a data centre for performance (close to the backbone), and you could load balance the URL across 2+ servers at 2+ different data centres, giving you resilience and scalability.
Your maintenance task is then a find-and-replace when promoting your site to live, replacing dev/UAT content paths with the live content path (though you'd only need to do this in CSS files, as the paths for content used within ASPX files can be stored as config data).
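One way to avoid the find-and-replace step entirely (a minimal sketch with hypothetical names; the same idea applies to appSettings in an ASP.NET web.config) is to keep the content host in configuration and build asset URLs from it, so dev/UAT/live differ only by config:

import os

# hypothetical environment variable holding the content subdomain for this environment
CONTENT_BASE = os.environ.get("CONTENT_BASE_URL", "http://content.mydomain.com")

def content_url(path):
    """Build an absolute URL for a static asset on the content server."""
    return "%s/%s" % (CONTENT_BASE.rstrip("/"), path.lstrip("/"))

print(content_url("img/xyz.jpg"))  # -> http://content.mydomain.com/img/xyz.jpg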

Point of the application service? [closed]

I don't know if I see the benefit of an app service. It seems like you can get the same result with controllers and domain services. Can someone post a scenario where it would make sense to use an app service over a controller/domain service?
Host independence - if your domain model is only exercised through HTTP calls, controllers alone may be fine to keep complexity low, but being able to call the same use cases from multiple hosts (think a console application for an event queue, serverless, tests) can be beneficial. I am a fan of adding complexity as it is needed, but unfortunately a lot of developers will just copy the pattern that has come before, especially if the initial plans have been lost to attrition.
Tests - mentioned above and really just one facet of the same point, but having the application service as a seam to write tests against is often quite useful if you don't want to tie your tests to your host. Having said that, tools for testing ASP.NET (and, I am sure, other technologies) in-process have come a long way over the years.
Reveal intent - controllers, unfortunately, suffer from functionality gravitation: all functionality tends to be pulled into them. Are they accepting HTTP requests? Deserializing those requests? Converting to a command? To a model? Orchestrating the calls into the domain model, domain services and repositories? What really is their responsibility? The term "service" is so overloaded that teams I have worked with call application services "use cases" and name them for exactly what they exercise the domain to do.
You can of course use controllers as this entry point; you lose some flexibility but gain some initial simplicity. This is the same balancing act you play when deciding whether to adopt DDD at all instead of a standard MVC app with no strategy for managing complexity. Maybe, if you are not seeing a benefit, the application has not reached the complexity that warrants DDD in the first place? It does come with a complexity cost.
With regard to domain services, they are really part of your domain and do the work, whereas the application service is the entry point that orchestrates the whole use case. Be careful of overusing domain services, though; personally, I often view them as a failure on my part to find a decent model (but maybe that's just me).
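To make the "entry point that orchestrates the use case" idea concrete, here is a minimal sketch (hypothetical names, no framework assumed): the application service owns the orchestration, and an HTTP controller, a console host or a test can all drive it.

class RegisterUser:
    """Application service / use case: orchestrates repositories, domain rules and outbound dependencies."""
    def __init__(self, users, mailer):
        self.users = users      # repository
        self.mailer = mailer    # outbound dependency

    def execute(self, email):
        if self.users.find_by_email(email):
            raise ValueError("already registered")  # a (trivial) domain rule
        user = self.users.add(email)
        self.mailer.send_welcome(user)
        return user

# Trivial in-memory collaborators so the sketch runs on its own:
class InMemoryUsers:
    def __init__(self):
        self._users = {}
    def find_by_email(self, email):
        return self._users.get(email)
    def add(self, email):
        self._users[email] = {"email": email}
        return self._users[email]

class ConsoleMailer:
    def send_welcome(self, user):
        print("welcome", user["email"])

# An HTTP controller would deserialize the request and then call exactly the same line:
RegisterUser(InMemoryUsers(), ConsoleMailer()).execute("a@example.com")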

Aim of using puppet, chef or ansible [closed]

I have read many articles about configuration management, but I don't really understand what this configuration is applied to.
Is it applied to the software itself? Like changing hosts in a conf file, etc.?
Or to the app's "host"? In that case, what is the point of using this kind of software, given that we generally use Docker containers that are "ready to use"?
You spent hours setting up that server, configuring every variable, installing every package, updating config files. You love that server so much that you named it 'Lucy'.
Tomorrow you get run over by a bus. Will your coworkers know every single tiny change you made to that server? Unlikely. They will have to spend hours digging into that server trying to figure out what you've done and why you've done it.
Now multiply this by hundreds or even thousands of servers. Doing this manually is infeasible.
That's where config management systems come in.
They give you documentation of your systems' configuration by their very nature: playbooks/manifests/recipes/whatever term they use become the authoritative description of your servers. Unlike a readme.txt, which might not always match the real world, these systems ensure that what you see there is what you actually have on your servers.
It also becomes relatively simple to duplicate a server configuration over and over, to potentially limitless scale (Google, Facebook, Microsoft and every other large company work that way).
You might think of a "golden image" approach, where you configure everything, take a snapshot and keep replicating it over and over. The problem is that it's difficult to compare two such images: you just have binary blobs. Whereas with most config management systems you can use a traditional VCS and easily diff versions.
The same principle applies to containers.
Don't treat your servers as pets, treat them as cattle.
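To make that concrete, here is a minimal sketch of an Ansible-style playbook (hypothetical host group, package and paths). It is plain, diffable text that lives under version control, and it doubles as the authoritative description of what the servers should look like:

- hosts: webservers
  become: true
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
    - name: Deploy site configuration
      template:
        src: templates/site.conf.j2
        dest: /etc/nginx/conf.d/site.conf
      notify: reload nginx
  handlers:
    - name: reload nginx
      service:
        name: nginx
        state: reloaded

Running it repeatedly is safe: each task only acts when the actual state has drifted from the described state.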

How do companies like Facebook release features slowly to portions of their user base? [closed]

I like how Facebook releases features incrementally rather than all at once to their entire user base. I get that this can be replicated with a bunch of if statements scattered throughout your code, but there has to be a better way to do this. Perhaps that really is all they are doing, but that seems rather inelegant. Does anyone know if there is an industry-standard architecture that can incrementally release features to portions of a user base?
On that same note, I have a feeling that all of their employees see an entirely different, completely beta view of the site. So it seems they are able to mark certain portions of their website as beta and others as production, and have some sort of access control list deciding what people see? That seems like it would be slow.
Thanks!
Facebook has a lot of servers, so they can apply new features to only some of them. They also have servers where they test new features before committing them to production.
A more elegant solution is if statements backed by feature flags, using systems like gargoyle (in Python).
Using a system like this you could do something like:
if feature_flag.is_active(MY_FEATURE_NAME, request, user, other_key_objects):
    # do some stuff
In a web interface you would be able to isolate/describe users, requests, or any other key object your system has, and deliver your feature only to them. In fact, via requests you could direct X% of traffic to the new feature, and thus run A/B tests and gather analytics.
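As a sketch of what a percentage-based rollout can look like under the hood (a hypothetical helper, not gargoyle's actual API): hash a stable key such as the user id so each user consistently falls in or out of the rollout bucket.

import hashlib

def in_rollout(feature_name, user_id, percent):
    """Deterministically place user_id into a 0-99 bucket for this feature."""
    key = ("%s:%s" % (feature_name, user_id)).encode("utf-8")
    bucket = int(hashlib.md5(key).hexdigest(), 16) % 100
    return bucket < percent

# Roll a hypothetical "new_timeline" feature out to ~10% of users:
if in_rollout("new_timeline", user_id=42, percent=10):
    pass  # serve the new code path
else:
    pass  # serve the existing code path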
An approach to this is to have a tiered architecture where the authentication tier hands off to the product tier.
A user enters the product URL and that is resolved to direct them to a cluster of authentication servers. These servers handle authentication and then hand off the session to a cluster of product servers.
Using this approach you can:
Separate your product servers into 'zones' that run different versions of your application
Run logic on your authentication servers that decides which zone to route the session to
As an example, you could have Zone A running the latest production code and Zone B running beta code. At the point of login the authentication server sends every user with a user name starting with a-m to Zone A and n-z to Zone B. That way roughly half the users are running on the beta product.
Depending on the information you have available at the point of login you could even do something more sophisticated than this. For example you could target a particular user demographic (e.g. age ranges, gender, location, etc).
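A minimal sketch of that routing decision (hypothetical server names; assumes the username is available at login): the authentication tier picks a zone for the user and then a server within it.

ZONES = {
    "zone_a": ["prod-a1.example.com", "prod-a2.example.com"],  # latest production code
    "zone_b": ["beta-b1.example.com"],                         # beta code
}

def pick_zone(username):
    """Usernames starting n-z go to the beta zone, everything else to production."""
    return "zone_b" if username[:1].lower() >= "n" else "zone_a"

def pick_server(username):
    servers = ZONES[pick_zone(username)]
    return servers[sum(username.encode("utf-8")) % len(servers)]  # crude spread within the zone

print(pick_server("alice"))  # -> a zone_a server
print(pick_server("oscar"))  # -> the zone_b server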

Are there any examples of group data-sharing using a replicated database, such as CouchDB? [closed]

Background: I am working on a proposal for a PHP/web-based P2P replication layer for PDO databases. My vision is that someone with a need to crowd-source data sets up this software on a web server, hooks it up to their preferred db platform, and then writes a web app around it to add/edit/delete data locally. Other parties, if they wish, may set up a similar thing - with their own web apps written around it - and set up data-sharing agreements with one or more peers. In the general case, changes made to one database are written to another on a versioned basis, such that they eventually flow around the whole network.
Someone has asked me why I'm not using CouchDB, since it has bi-directional replication and record versioning offered as standard. I wasn't aware of these capabilities, so this turns out to be an excellent question! It occurs to me, if this facility is already available, are there any existing examples of server-to-server replication between separate groups? I've done a great deal of hunting and not found anything.
(I suppose what I am looking for is examples of "group-sourcing": give groups a means to access a shared dataset locally, plus the benefits of critical mass they would be unable to build individually, whilst avoiding the political ownership/control problems associated with the traditional centralised model.)
You might want to check out http://refuge.io/
It is built around CouchDB, but more specifically designed to form peer groups.
Also, here is a Couchbase-sponsored case study of replication between various groups:
http://site.couchio.couchone.com/case-study-assay-depot
This can be achieved on standard CouchDB installs.
Hope that gives you a start.
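For reference, bi-directional replication between two peers can be set up through CouchDB's HTTP replication API; here is a rough sketch in Python (hypothetical hosts, credentials and database name, with one call per direction since each replication job is one-way):

import requests  # pip install requests

A = "http://admin:secret@site-a.example.com:5984"
B = "http://admin:secret@site-b.example.com:5984"

def replicate(source, target):
    """Ask node A's CouchDB to run a continuous replication from source to target."""
    resp = requests.post(
        A + "/_replicate",
        json={"source": source, "target": target, "continuous": True},
    )
    resp.raise_for_status()
    return resp.json()

# Two one-way replications give bi-directional sync of the shared dataset:
replicate(A + "/shared_data", B + "/shared_data")
replicate(B + "/shared_data", A + "/shared_data")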

When to start performance tuning a website [closed]

I have an ASP.NET MVC website and the volume of traffic is increasing. The site points to a backend SQL Server 2008 database.
At what point do I need to figure out where the bottleneck of the system is, and review whether I need to load balance across machines or change the way I am doing database connection management?
Are there specific tools and thresholds that indicate the current model isn't scalable or is hitting a breaking point (besides just observing a slow site)?
When you start noticing performance issues.
There are some very easy things you can do to increase performance with so little work that it's easier to just do them than to check whether you need them yet ;)
First and foremost, put all static images and other media on a separate server. That eliminates a whole lot of requests hitting the boxen running the dynamic parts of the web server.
Next in line is making sure you are using as many hard drive spindles as possible. Of course you want your database on a separate machine, let alone a separate hard drive, but you also want your web server logs written to a separate hard drive. That prevents a lot of jumping around of the drive heads.
As far as "how do you know when you need to performance tune", I will give a different answer than George Stocker: When there is a cost associated with your performance that outstrips the cost of looking into it. I say it this way because your customers may be a little unhappy if your website is a little sluggish, but if it doesn't prevent anyone from using it, or recommending it to others, then it may not be worth looking into. People put up with sub-optimal performance all the time.
There are a plethora of tools available to address the plethora of possible bottlenecks. A decent performance tuning strategy starts with measurement and consistent instrumentation of the given system.
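As a trivial illustration of starting with measurement (a sketch only; in practice a profiler or an APM tool will do this far better than hand-rolled timers): wrap suspect code paths so every call reports its latency, then tune whatever the numbers point at.

import functools
import logging
import time

logging.basicConfig(level=logging.INFO)

def timed(func):
    """Log how long each call takes - a crude starting point for instrumentation."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            logging.info("%s took %.1f ms", func.__name__,
                         (time.perf_counter() - start) * 1000)
    return wrapper

@timed
def render_product_page(product_id):
    time.sleep(0.05)  # stand-in for real view/database work
    return "<html>...</html>"

render_product_page(42)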
But performance tuning requires precious time and resources, and should only be pursued when it gives you the most bang for the buck, i.e. it provides the greatest improvement to achieving your website's objectives given the work required. If your website supports (or is) a business or organization, you must continuously evaluate the business landscape and plan the next allocation of resources. This is entirely dependent on the particular industry.
An engineer might focus on continual refinement of an existing system, but the project commissioners (be they an external client, or your company's management) must weigh the costs and benefits of all types of development, from improving an existing featureset, to adding new features, to addressing technical limitations affecting product usability (including performance issues). That's not to say engineers have no say in resource allocation, but their perspective is just one of many contributing to success.
When you have doubts that the website would survive a doubling in maximum usage. One common line of thought where I am from is that you should have the capacity to support at least 2x the number of users you expect.
Determining whether or not you can support 2x is better left to load testing, though, rather than speculation. One note on your other comment: chances are a website performance problem will affect everyone using the site, including you on a local machine... unless it's a bandwidth problem and you're connected to a local network. Barring cable cuts, it's not going to be 'just the people in Asia'.
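A very rough sketch of the kind of load test that answers the 2x question (hypothetical URL and numbers; dedicated tools such as JMeter will do this far better): fire concurrent requests at a staging copy of the site and watch how error rate and tail latency change as you raise the concurrency.

import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URL = "http://staging.example.com/"  # hypothetical staging endpoint

def hit(_):
    start = time.perf_counter()
    try:
        urlopen(URL, timeout=10).read()
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start

def run(concurrency, total_requests):
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(hit, range(total_requests)))
    errors = sum(1 for ok, _ in results if not ok)
    latencies = sorted(t for _, t in results)
    p95 = latencies[int(len(latencies) * 0.95) - 1]
    print("concurrency=%d errors=%d p95=%.2fs" % (concurrency, errors, p95))

for c in (10, 20, 40):  # keep doubling the load and compare the numbers
    run(c, total_requests=200)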
