What are the advantages of merging multiple .NET assemblies into one?


I've seen discussions of techniques for merging multiple assemblies into one (e.g. ilmerge). I am scratching my head over why one would want to do so. Is there any reason other than the obvious "one file is easier to deploy/track/maintain/reference"?

Some reasons may be:
Small utility programs, for example when you want to just copy/paste the whole program to a server without any folders or anything.
Deployment scenarios where you want to keep the number of files that must be copied, uploaded over FTP, etc. to a minimum, for example installers.
NuGet packages where you want to keep the number of added references to a minimum.

Merging can give a faster startup time, because fewer files mean less I/O during an application's cold startup. The best way to see how much time can be saved is to measure, as each app is different: number of assemblies, size of assemblies, etc.
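One way to take that measurement is a small timing harness. The sketch below is an illustration in Python, not part of the original answer: `time_startup` is a hypothetical helper, and the command it launches is a placeholder for your own merged or unmerged executable. Only the first launch after a reboot approximates a true cold start, since later runs benefit from the OS file cache.

```python
import subprocess
import sys
import time


def time_startup(command, runs=5):
    """Launch `command` several times and return each wall-clock duration in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output; only the elapsed time matters here.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the path to your application.
    placeholder_command = [sys.executable, "-c", "pass"]
    for index, seconds in enumerate(time_startup(placeholder_command, runs=3), start=1):
        print(f"run {index}: {seconds:.3f}s")
```

Running it once against the merged build and once against the unmerged build, each right after a reboot, gives a rough comparison for your particular application.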

Related

Node server: code weight and server performance

I would like to know how significant the impact would be of using a 15.5k library just for very simple validations, versus using my own 1k super-simple validation class, once I have more than 10k users on my system (Node + Mongo running on an 8-core Pentium machine with 32 GB of RAM).
Is it worth caring about this 14.5k of code?
I can't find any clue in my bleak but ever-wondering mind.
I'd very much appreciate your opinion.

A nice thing about server development is that you usually have significant RAM available and the code is generally loaded just once at the startup of the server so load time is not part of the user experience.
Because of these, you'd be hard pressed to even measure a meaningful impact between a 1k library and a 15k library. You might care about 15k of memory usage per active user, but you will not care about an extra 15k of code loaded into memory one time and it will not affect your server performance in any way.
So, I'd say you should not worry about skimping on code size (within reason). Instead, pick the tool that best solves your problem and makes your development quickest and most reliable. And, where possible, use what has been built and tested before rather than building your own from scratch. That will leave you more development time to spend on the things that really matter to making your website better, different or great. Or, it will get you to market faster.
For reference, 15k is 0.000045% of your total computer's RAM.
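That percentage can be checked with a few lines of arithmetic. This is only a sketch, assuming "15k" means 15 KiB and "32gb" means 32 GiB:

```python
# Assumptions: 15k = 15 KiB of code, 32gb = 32 GiB of RAM.
code_size_bytes = 15 * 1024
ram_bytes = 32 * 1024**3

percent_of_ram = code_size_bytes / ram_bytes * 100
print(f"{percent_of_ram:.6f}%")  # prints 0.000045%
```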

I agree with @jfriend00. There's almost no impact on memory or performance for the code sizes you describe. You can always benchmark different modules against your usage profile and choose for yourself. However, I think you should ask yourself some other (similar) questions:
Why is the package I use so 'big'? Maybe there's a much 'smaller' one that does the same job with the same performance. By big or small I mean in terms of functionality. Most of the time you'd want to go with minimum functionality, even if the package size seems big. If you use a validation module that also validates emails and you don't need that, it doesn't mean you shouldn't use it; just know the tradeoffs. It might get updated more frequently because of bugs in the email validation, those updates might introduce bugs in the integer validation that you do use, and you have more code to read if you want to feel safer about your production code (explained below).
Does the package function as I expect? (read the tests)
Is the package I use "secure"/"OK for production"? Read the code of the packages you use and make sure there isn't something fishy going on. Node packages are usually not that big because most are minimal (I have never used it, but I know https://requiresafe.com/ exists for these types of questions; you might want to check it out). Note that a larger package means more code to read.
Ask these questions (and others, if you feel you should) recursively about the package's dependencies.

Setting up Perforce depot for multiple projects

Summary: I want help figuring out how to set up the depot and my development environment so that I can support multiple, related projects.
Details:
Until now I've had a depot which had in it only one project - ProjectA - robot version A.
I am starting to work on a new version (ProjectB) which has some differences in HW - I/O port mappings and timers have changed. I would like to continue to develop code for both projects.
This means that ProjectB will share some files with ProjectA and some files will be different.
Since the differences are HW related items what I'm thinking of doing is creating a common area and then project specific areas where the common area is for device independent code and project specific area is for device dependent code.
The differences are big enough that I don't want to do #ifdef within files. Some differences are simple - different I/O port mapping and some are completely new modules.
To make maintenance easier, I would like to be able to compare differences between device dependent code and propagate selected changes.
Finally, to minimize my burden during comparisons, I would like to mark differences that I know are okay so that in future comparisons they don't show up.
Help!

Your instincts are good -- you're trying to Not Duplicate Code. This is the core of good design & engineering.
As for the file layout, it's always annoying to have your directories too deep, but that's MUCH better than too shallow. Maybe:
<root>
    main/
        projects/
            robot1/...
            robot2/...
        shared1/
        shared2/
(Big repositories are much deeper than that, even.)
As for how you make shared code -- you could have different setup.h or constants.h that drive what the various shared libraries do. Alternatively, build your shared libraries so they are parameterized at runtime.
SetupDrivers(0x80020); // address of PIO registers
And lastly -- if the projects really are different, decide if sharing the code really is the right thing. Usually yes, but everything is a choice. If you hope to manually "diff" your files to look for differences, it's really up to you to keep the structures close enough to diff. The "different config.h file for each project" idea mentioned above would help.
If you roll your own diff tool (in python or whatever) you could use special comments to flag "expected different lines".
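A minimal sketch of such a tool, using Python's standard `difflib`. The `EXPECTED-DIFF` marker and the `unexpected_diff` function are hypothetical names chosen for illustration; any comment convention you can grep for would do. Lines carrying the marker are dropped from both sides before diffing, so known differences never show up in the report.

```python
import difflib

# Hypothetical convention: tag lines that are allowed to differ with this comment.
MARKER = "EXPECTED-DIFF"


def unexpected_diff(lines_a, lines_b, marker=MARKER):
    """Return unified-diff lines for two files, ignoring lines flagged with `marker`."""
    keep_a = [line for line in lines_a if marker not in line]
    keep_b = [line for line in lines_b if marker not in line]
    return list(difflib.unified_diff(keep_a, keep_b,
                                     fromfile="projectA", tofile="projectB",
                                     lineterm=""))


if __name__ == "__main__":
    project_a = ["#define LED_PORT 0x10  // EXPECTED-DIFF",
                 "void init(void);",
                 "int speed = 5;"]
    project_b = ["#define LED_PORT 0x20  // EXPECTED-DIFF",
                 "void init(void);",
                 "int speed = 7;"]
    # Only the `speed` change is reported; the port mapping is a known difference.
    for line in unexpected_diff(project_a, project_b):
        print(line)
```

Dropping flagged lines keeps the tool simple, at the cost of shifting the line numbers in the hunk headers relative to the real files.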

Code archive? what do people use?

I have loads of Notepad, .js and .cs files in a folder that I refer back to when I'm developing. They are just in a folder on my laptop. Is anyone aware of a better way of storing all this guff in a more structured way? I'm thinking of some kind of cloud website or something.

You can use a wiki for this kind of thing. There are wikis that are local, such as TiddlyWiki.
One way or another, to keep things safe, you should use source control, and/or backup to the cloud.

I keep my code samples that aren't project-specific in a revision-controlled directory tree, based on the language they're in; actual projects are also kept in revision control, but are kept separately. I have tons of them now.
For smaller idioms and snippets that are useful or that I forget as I switch between languages for a period of time, I pop them into a wiki, with different pages also based on which language they're in. I don't put whole files in there; I just extract the pieces that I tend to forget and pop them in there.
They do tend to build up as time goes on, so just putting the smaller pieces in is much more efficient for fast lookup.

How to keep track of performance testing

I'm currently doing performance and load testing of a complex many-tier system, investigating the effect of different changes, but I'm having problems keeping track of everything:
There are many copies of different assemblies
Originally released assemblies
Officially released hotfixes
Assemblies that I've built containing further additional fixes
Assemblies that I've built containing additional diagnostic logging or tracing
There are many database patches, some of the above assemblies depend on certain database patches being applied
Many different logging levels exist, in different tiers (Application logging, Application performance statistics, SQL server profiling)
There are many different scenarios, sometimes it is useful to test only 1 scenario, other times I need to test combinations of different scenarios.
Load may be split across multiple machines or only a single machine
The data present in the database can change, for example some tests might be done with generated data, and then later with data taken from a live system.
There is a massive amount of potential performance data to be collected after each test, for example:
Many different types of application specific logging
SQL Profiler traces
Event logs
DMVs
Perfmon counters
The database(s) are several GB in size, so where I would otherwise have used backups to revert to a previous state, I tend to apply changes to whatever database is present after the last test, causing me to quickly lose track of things.
I collect as much information as I can about each test I do (the scenario tested, which patches are applied, what data is in the database), but I still find myself having to repeat tests because of inconsistent results. For example, I just did a test which I believed to be an exact duplicate of a test I ran a few months ago, but with updated data in the database. I know for a fact that the new data should cause a performance degradation, yet the results show the opposite!
At the same time I find myself spending disproportionate amounts of time recording all these details.
One thing I considered was using scripting to automate the collection of performance data etc., but I wasn't sure this was such a good idea: not only is it time spent developing scripts instead of testing, but bugs in my scripts could cause me to lose track of things even quicker.
I'm after some advice / hints on how better to manage the test environment, in particular how to strike a balance between collecting everything and actually getting some testing done at the risk of missing something important?

Scripting the collection of the test parameters + environment is a very good idea to check out. If you're testing across several days, and the scripting takes a day, it's time well spent. If after a day you see it won't finish soon, reevaluate and possibly stop pursuing this direction.
But you owe it to yourself to try it.

I would tend to agree with @orip: scripting at least part of your workload is likely to save you time. You might consider taking a moment to ask which tasks are the most time consuming in terms of your labor, and how amenable they are to automation. Scripts are especially good at collecting and summarizing data, typically much better than people. If the performance data requires a lot of interpretation on your part, you may have problems.
An advantage of scripting some of these tasks is that you can then check the scripts in alongside the source / patches / branches, and you may find you benefit from an organizational structure for your system's complexity rather than struggling to chase it as you do now.
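As a rough illustration of what such a script might record, the sketch below hashes every assembly in a directory and stores the result next to the scenario name and the applied database patches. All names here (`build_manifest`, the `*.dll` pattern, the field names) are assumptions made for the example, not anything from the original question.

```python
import hashlib
import json
import time
from pathlib import Path


def file_sha256(path):
    """Hash a file in chunks so large assemblies are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(assembly_dir, scenario, db_patches, pattern="*.dll"):
    """Describe one test run: which binaries, which scenario, which DB patches."""
    files = sorted(Path(assembly_dir).glob(pattern))
    return {
        "recorded_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "scenario": scenario,
        "db_patches": list(db_patches),
        "assemblies": {f.name: file_sha256(f) for f in files},
    }


if __name__ == "__main__":
    # Hypothetical inputs: the current directory, a scenario label, one patch id.
    manifest = build_manifest(".", scenario="checkout-load", db_patches=["patch-001"])
    print(json.dumps(manifest, indent=2))
```

Two runs that were meant to be identical can then be compared by diffing their manifests, which makes a mismatched hotfix or a missing patch visible immediately.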

If you can get away with testing only against a few set configurations that will keep the admin simple. It may also make it easier to put one on each of several virtual machines which can be quickly redeployed to give clean baselines.
If you genuinely need the complexity you describe, I'd recommend building a simple database that lets you query the multivariate results you have. Having a column for each of the important factors will allow you to answer questions like "what testing config had the lowest variance in latency?" and "which test database allowed the raising of most bugs?". I would use sqlite3 (probably through the Python wrapper or the Firefox plug-in) for this kind of lightweight collection, because it keeps maintenance overhead relatively low and lets you avoid perturbing the system under test too far, even if you need to run on the same box.
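A minimal sketch of that idea with Python's built-in `sqlite3` module. The table name, columns and sample figures are invented for illustration. SQLite has no built-in variance aggregate, so the query computes the population variance as the mean of squares minus the square of the mean.

```python
import sqlite3

SCHEMA = """
CREATE TABLE test_run (
    id         INTEGER PRIMARY KEY,
    config     TEXT NOT NULL,   -- e.g. which hotfixes were installed
    dataset    TEXT NOT NULL,   -- e.g. generated vs. live data
    latency_ms REAL NOT NULL
)
"""

# Population variance per config: AVG(x*x) - AVG(x)^2, lowest first.
VARIANCE_QUERY = """
SELECT config,
       AVG(latency_ms * latency_ms) - AVG(latency_ms) * AVG(latency_ms) AS variance
FROM test_run
GROUP BY config
ORDER BY variance ASC
"""


def lowest_variance_config(conn):
    """Return (config, variance) for the config with the steadiest latency."""
    return conn.execute(VARIANCE_QUERY).fetchone()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path to keep results between runs
    conn.execute(SCHEMA)
    sample_rows = [  # made-up measurements
        ("baseline", "generated", 100.0), ("baseline", "generated", 160.0),
        ("baseline", "live", 130.0),
        ("hotfix-1", "generated", 100.0), ("hotfix-1", "generated", 110.0),
        ("hotfix-1", "live", 105.0),
    ]
    conn.executemany(
        "INSERT INTO test_run (config, dataset, latency_ms) VALUES (?, ?, ?)",
        sample_rows)
    print(lowest_variance_config(conn))
```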
Scripting the tests will make them quicker to execute and permit results to be gathered in an already-ordered way, but it sounds like your system may be too complex to make this easy to do.

Reorganizing a project for expansion/reuse

The scope of the project I'm working on is being expanded. The application is fairly simple but currently targets a very specific niche. For the immediate future I've been asked to fork the project to target a new market and continue developing the two projects in tandem.
Both projects will be functionally similar so there is a very strong incentive to generalize a lot of the guts of the original project. Also I'm certain I'll be targeting more markets in the near future (the markets are geographic).
The problem is that previous maintainers of the project made a lot of assumptions that tie it to its original market. It's going to take quite a bit of refactoring to separate the generic code from the market-specific code.
To make things more complex several suggestions have been tossed around on how to organize the projects for the growing number of markets:
Each market is a separate project, commonalities between projects are moved to a shared library, projects are deployed independently.
Expand the existing project to target multiple markets, limiting functionality based on purchased license.
Create a parent application and redesign projects as plugins, purchased separately
All three suggestions have merit, and ideally I would like to structure the code to be flexible enough that any of these is possible with minor adjustments. Suggestion 3 appears to be the most daunting as it would require building a plugin architecture. The first two suggestions are a bit more plausible.
Are there any good resources available on the pros and cons of these different architectures?
What are the pros and cons of sharing code between projects versus copying and forking?

Forking is usually going to get you a quicker result initially, but almost always going to come around and bite you in maintenance -- bug fixes and feature enhancements from one fork get lost in the other forks, and eventually you find yourself throwing out whole forks and having to re-add their features to the "best" fork. Avoid it if you can.
Moving on: all three of your options can work, but they have trade-offs in terms of build complexity, cost of maintenance, deployment, communication overhead and the amount of refactoring you need to do.
1. Each market is a separate project
A good solution if you're going to be developing simultaneously for multiple markets.
Pros:
It allows developers for market A to break the A build without interfering with ongoing work on B
It makes it much less likely that a change made for market A will cause a bug for market B
Cons:
You have to take the time to separate out the shared code
You have to take the time to set up parallel builds
Modifications to the shared code now have more overhead since they affect both teams.
2. Expand the existing project to target multiple markets
Can be made to work okay for quite a while. If you're going to be working on releases for one market at a time, with a small team, it might be your best bet.
Pros:
The license work is probably valuable anyway, even if you move toward (1) or (3).
The single code base allows refactoring across all markets.
Cons:
Even if you're just working on something for market A, you have to build and ship the code for markets B, C and D as well -- okay if you have a small code base, but increasingly annoying as you get into thousands of classes
Changes to one market risk breaking the code for other markets
Changes to one market require other markets to be re-tested
3. Create a parent application and redesign projects as plugins
Feels technically sweet, and may allow you to share more code.
Pros:
All the pros of (1), potentially, plus:
clearer separation of shared and market-specific code
may allow you to move toward a public API, which would allow offloading some of your work onto your customers and/or selling lucrative service projects
Cons:
All the cons of (1), plus requires even more refactoring.
I would guess that (2) is sort of where you find yourself now, apart from the licensing. I think it's okay to stay there for a little while, but put some effort into moving toward (1) -- moving the shared code into a separate project even if it's all built together, for instance, trying to make sure the dependencies from market code to shared code are all one-way.
Whether you end up at (1) or (3) kind of depends. Mostly it comes down to who's "in charge" -- the shared code, or the market-specific code? The line between a plugin, and a controller class that configures some shared component, can be pretty blurry. My advice would be, let the code tell you what it needs.

1) NO! You don't want to manage different branches of the same code base... Because as common as the code may be, you will want to make sweeping changes, and one project will "at the moment" not be as important as the others, and then you will get one branch growing faster than the others.... insert snowball.
2) This is more or less the industry standard. Big config file, limit things based on license/configuration. It can make the app a bit cumbersome, but as long as the code complains about mutually exclusive stuff and all the developers are in constant communication about new features and how they ripple throughout the entire application, you should do fine. This also is the easiest to hack, if that is a concern.
3) This also 'can' work. If you are using C#, plugins are relatively simple; you only have to worry about dependency hell. If the plugins have any chance of becoming circularly interdependent (that is, a requires b requires c requires a), then this will quickly explode and you will revert (quite easily) back to #2.
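The circular case (a requires b requires c requires a) is cheap to detect before it explodes. The sketch below uses Python's standard `graphlib` module (Python 3.9+); `plugin_load_order` and the plugin names are hypothetical, and the answer's C# setting would need an equivalent check in that language.

```python
from graphlib import CycleError, TopologicalSorter


def plugin_load_order(requires):
    """Return plugins in an order that loads dependencies first.

    `requires` maps each plugin name to the set of plugins it depends on.
    Raises ValueError naming the cycle if the dependencies are circular.
    """
    try:
        return list(TopologicalSorter(requires).static_order())
    except CycleError as err:
        # The second exception argument is the list of nodes forming the cycle.
        cycle = " -> ".join(err.args[1])
        raise ValueError(f"circular plugin dependency: {cycle}") from err


if __name__ == "__main__":
    print(plugin_load_order({"a": {"b"}, "b": {"c"}, "c": set()}))
    try:
        plugin_load_order({"a": {"b"}, "b": {"c"}, "c": {"a"}})
    except ValueError as problem:
        print(problem)
```

Running a check like this as part of the build keeps a cycle from reaching the point where plugins fail to load at runtime.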
The best resources you have are probably the past experiences of your coworkers on different projects, and the experience of people yammering about it on here or Slashdot or wherever. Certainly the cheapest.
Pros of sharing code:
One change changes everything.
Unified data model.
There is only one truth. (Much easier for everyone to be on the same page)
Cons of sharing code:
One change changes everything.. Be careful.
If one bug is in it, it affects everything.
Pros of copying/forking:
Usually quicker to implement a specific feature for a specific customer.
Faster to hack when you realize that assumption A is only applicable for markets B and C, not D.
Cons of copying/forking:
One or more of the copied projects will eventually fail, due to a lack of cohesion in your code.
As above said: Sweeping changes take a lot longer.
Good luck.

You said "copying and forking", which leads me to think that perhaps you haven't considered managing this "fork" as a branch in a revision control system like SVN. By doing it this way, when you refactor the branch to accommodate a different industry, you can merge those changes back into the main trunk with the aid of the revision control system.
If you are following a long-term strategy of moving to a single app where all the variations are controlled by a config file (or an SQLite config database), then this approach will help you. You don't have to merge anything until you are confident that you have generalised it for both industries, so you can still build two unique systems for as long as you need to. But you aren't backing yourself into a corner, because it is all in one source code tree: the trunk for the legacy industry, and one branch for each new industry.
If your company really wants to attack multiple industries, then I don't think that the config database solution will meet all your needs. You will still need to have special code modules of some sort. A plug-in architecture is a good thing to put in because it will help, particularly if you embed a scripting engine like Python into your app. However, I don't think that plugins will be able to meet all your code variation requirements when you get into the "thousands of classes" scale.
You need to take a pragmatic approach that allows you to build a separate app today for the new industry, but makes it relatively easy to merge the improvements into the existing app as you go along. You may never reach the nirvana of a single trunk with thousands of classes and several industries, but you will at least have tamed the complexity, and will only have to deal with really important variations where there is real divergence in the industry need.
If I were in your shoes, I would also be looking at any and all features in the app which might be considered "reporting" and trying to factor them out, maybe even into an off the shelf reporting tool.
