SemVer major upgrade or not?

I have an artifact that is not used directly by end users. It contains the server that runs users' applications. The API that is visible to users (i.e. to third parties) is well defined in a separate library.
Now I am making some changes in the server. Some public methods get a signature change. However, this does not affect users, since they never see those methods.
I am not sure what SemVer defines in this situation. Should I
A) Bump the server's major version, since a public method signature changed, or
B) Bump the minor version, since this change does not affect users of the server?
In fact, it seems that in case B the server will never get a major version increase, i.e. it would always stay at 1.x.x, since the user-facing API is defined in a different library (the server is just an implementation of it).
How should I treat this case?

A "public method signature" change sounds like a breaking change to me. But then you say this change doesn't affect the clients. It's a bit confusing. Did you add features to the API? That's a minor bump. If you changed an existing API such that a client must be recompiled or modified in order to continue using it, then you have a major bump. The SemVer spec is quite clear that API changes require either a major or minor bump and everything else falls under the "bug fix" category.
If your change involves adding optional parameters to existing methods or accepting additional constants/enumerations to functions in a way that does not require client code to be recompiled and no behavior changes are visible to clients using the old feature set, you definitely have a minor change.
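As a concrete illustration of the minor-bump case (a hypothetical Renderer class, not taken from the question): adding an overload leaves every existing call site valid, so only the minor version needs to change.

```java
public class Renderer {
    // Existing API, unchanged: old client code keeps compiling against it.
    public String render(String template) {
        return render(template, false);
    }

    // Added in a later release: a backwards-compatible addition like this
    // overload calls for a MINOR bump (e.g. 1.2.0 -> 1.3.0), not a MAJOR one.
    public String render(String template, boolean escapeHtml) {
        return escapeHtml ? template.replace("<", "&lt;") : template;
    }
}
```

Removing or changing the signature of the original render(String), by contrast, would break existing callers and require a major bump.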


GWT - Ensure SQL password is never handed to the client

I have a GWT project where the server needs to talk to an SQL database. It needs a password to do that. That password needs to be stored somewhere. I can think of three locations to store that password:
Right there in the call to DriverManager.getConnection.
A final String field somewhere.
A .properties file.
With options 1 and 2, there is a risk that the source code gets translated to JavaScript and sent to the client.
That would never happen intentionally, since only the server should talk to the database, not the client, but it could happen accidentally. In case 1, GWT would probably complain that it can't deal with JDBC, but in case 2 the field might sit in some Constants class that compiles to JavaScript just fine.
I don't have enough experience with GWT to know how .properties files are handled. E.g. files in the src\foo\server directory might not be included in the JavaScript that gets handed to the client, but someone might come along later and accidentally move the file somewhere else where it is included.
So how can I ensure that the password is never accidentally sent to the client?
Note that I don't care that the password is stored in plain-text, either in code or in a config file.
Edit:
Clarification of my current situation:
My TestModule.gwt.xml only contains <source path='client'/>. It does not contain <source path='shared'/> or <source path='server'/>!
I have shared configs and server-only configs (the server-only config would contain the password for the database, then):
In the TestScreen (which is a Composite that shows a button on the page) I can use the ServerConfig class and SharedConfig class from client code without any problems:
This is a problem since I (or someone else) might accidentally cause the class with the password to be translated to JS and sent to the client.
The database password should rather be stored in a properties file than somewhere in the code. Unlike the code, this properties file should not be submitted to a version control system (like git or similar). It should also be outside the web folders.
Moreover, it would be a huge security risk to use a public static final String to store a password. Public members are visible to all other classes, static means no instance is necessary to access it, and final means it won't change. You would be storing a String constant available to all instances of the class and to every other class using it. That is a poor starting point for security, and it is not specific to GWT. It would be like storing a lot of money in a bank with no walls or doors and then asking how one could make it safe.
As long as the data stays on the server side, you're fine. By default, only the client and shared paths are specified as translatable code. If your server classes do not implement IsSerializable and are not explicitly listed as translatable code in your gwt.xml file, they won't be sent to the client.
You have more than one option here:
Use a separate classpath for client and server so the classes in the server are never referenced in the client. This can be done by following the recommended project structure where each of the client/shared/server parts is a separate project; you can create such a project structure using https://github.com/tbroyer/gwt-maven-archetypes. With this setup, the build will most likely fail when anyone tries to depend on the server from the client, though there is still a possibility that someone will find a way to make it work.
Use the @GwtIncompatible annotation on the class that holds the password, which means the class will never be transpiled to JS at all; if it is referenced from the client side, it will produce a compilation error during the GWT compilation phase.
Never put the password in a source file at all; depend on an environment variable or some sort of password/key store that exists only on the server where you deploy the app. You can still set it locally for development.
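A minimal sketch of that last option. The class name and the DB_PASSWORD variable name are assumptions for illustration; keeping the environment lookup injectable also makes the class testable:

```java
import java.util.Map;

// Server-side only: this class lives in the server package and is never
// listed as translatable code, so the GWT compiler never sees it.
public class DatabaseConfig {
    private final Map<String, String> env;

    public DatabaseConfig(Map<String, String> env) {
        this.env = env;
    }

    // Read the password from the deployment environment at runtime,
    // failing fast if it was never configured.
    public String dbPassword() {
        String pw = env.get("DB_PASSWORD");
        if (pw == null || pw.isEmpty()) {
            throw new IllegalStateException("DB_PASSWORD is not set");
        }
        return pw;
    }
}
```

In production you would construct it with System.getenv(); in tests, with a plain Map.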
If the server types and members are still accessible, you have misconfigured the .gwt.xml file, as @Adam said - make sure that the server vs. client vs. shared packages all live together in the same package as your .gwt.xml, and that no other .gwt.xml exists.
This is not a security feature, as you are treating it, but a "how do I get the code I actually need to do my work" issue - Java bytecode doesn't have enough detail in it (generics are erased, and old versions of GWT actually used Javadoc tags for more detail) to generate the sources. Generally speaking, if you don't have sources, you can't pass that Java to GWT and expect it to be used in producing JS.
There are at least two edge case exceptions to this. There are probably more, but these spots of weirdness usually only matter when trying to understand why GWT can't generate JS from some Java, whereas you are trying to leverage these limitations as security features.
Generators and linkers run in the JVM, so naturally they can function with just plain JVM bytecode while the compiler is running. It would be a weird case where you would care about this, but consider something like a generator which was trying to extract some kind of reflection information and provide it in a static format for the browser.
GWT uses JDT to read the classes to be compiled, and it loads up bytecode where possible to resolve some things - one of those things happens to include constants. A "static final" string or primitive can be read from bytecode in this way without needing to go to the original .java sources.
If you have content in your bytecode that must not be considered in any way when generating JS, it belongs on a separate classpath - generally speaking, you should always separate your client code from your server code into separate projects with separate classpaths. There may be at least one more project for "shared" code which both client and server need access to.
And finally, it is generally speaking considered a bad idea to put secrets of any kind in your project itself, whether in the code itself or properties files, but instead to make it part of the deployment or runtime environment.

Version queries in URL to bypass browser cache

I'm writing a web application which will likely be updated frequently, including changes to css and js files which are typically cached aggressively by the browser.
In order for changes to become instantly visible to users, without affecting cache performance, I've come up with the following system:
Every resource file has a version. This version is appended with a ? sign, e.g. main.css becomes main.css?v=147. If I change a file, I increment the version in all references. (In practice I would probably just have a script to increment the version for all resources, every time I deploy an update.)
My questions are:
Is this a reasonable approach for production code? Am I missing something? (Or is there a better way?)
Does the question mark introduce additional overhead? I could incorporate the version number into the filename, if that is more efficient.
The approach sounds reasonable. Here are some points to consider:
If you have many different resource files with different version numbers, it can be significant overhead for developers to manage them all and increment them in the right situations.
You might need to implement a policy for your team,
or write a CI task to check that the devs did it right.
You could use one version number for all files. For example, when you have a version number for the whole app, you could use that.
It makes "managing" the versions a no-op for developers.
It changes the links on every deploy.
Depending on the number of resource files, the frequency of deploys vs. the frequency of deploys that actually change a resource file, and the number of requests for these files, one solution or the other may be more performant. This is a question of trade-offs.

Semantic versioning - major version for a traditional web application

I have a Rails app which is a traditional web application (HTTP requests are processed and HTML pages are rendered). As of now, it does not have any APIs exposed to other apps.
I want to use semantic versioning for versioning the application. Currently it is at '0.0.0'.
Quoting from the documentation:
MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards-compatible manner, and
PATCH version when you make backwards-compatible bug fixes.
From what I understand, because there are no applications dependent on mine, the major version will never change. Only the minor and patch versions will change, the major version will always remain 0.
I want to know if my understanding is correct. Is there any scenario in which my major version will change?
Since you're not developing and releasing a software package, semantic versioning is not directly applicable. It sounds like a single "release" number could be enough for your use case, since what you need is to track when a code change will be in test and in prod. Assuming code must go through test before going to prod, you would update the number whenever you update the test environment with code from the development branch. This way, at a given moment, development would have release N, test N-1, and prod N-2.
API versioning is a different problem, independent of release numbering. In my experience API users only care about breaking changes, so those need to be versioned. Also, since users are slow to update their apps you must be prepared to keep old versions around indefinitely.
One way you could think about this is to think about the user's flow through the application as the basis for versioning. If a breaking change happens (i.e. the user's flow is changed in a way that makes the old route impossible) then it could be considered breaking. If you're adding new functionality that hasn't existed before (i.e. the user has access to a new feature or sees something new on the website that they can interact with) then that could be considered a minor version increase. If you're deploying minor fixes to things like text, then that could be considered a patch-level change.
The problem with this approach, though, is that you need to understand a user's workflow through the application to be able to correctly increment the major version, and as software developers we're still pretty terrible at doing that properly.
Ref: https://christianlydemann.com/versioning-your-angular-app-automatically-with-standard-version

What does "public api" mean in Semantic Versioning?

I'm learning about how to assign and increment version numbers with the rule called "Semantic Versioning" from http://semver.org/.
Among all its rules, the first one said:
Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it should be precise and comprehensive.
I am confused about "public API". What does it refer to?
Public API refers to the "point of access" that the external world (users, other programs, and/or programmers) has to your software.
E.g., if you're developing a library, the public API is the set of all the method invocations that can be made against your library.
There is an understanding that, unless the major version changes, your API will be backwards-compatible, i.e. all the calls that were valid on one version will remain valid on a later version.
You can read at point 9 of those rules:
Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API.
I discovered SemVer today and read up on it from several sources to ensure I had fully grasped it.
I am confused about "public API". What does it refer to?
I was also confused about this. I wanted to set about using SemVer immediately to version some of my scripts, but they didn't have a public API and it wasn't even clear to me how they could have one.
The best answer I found is one that explains:
SemVer is explicitly not for versioning all code. It's only for code that has a public API.
Using SemVer to version the wrong software is an all too common source of frustration. SemVer can't version software that doesn't declare a public API.
Software that declares a public API includes libraries and command line applications. Software that doesn't declare a public API includes many games and websites. Consider a blog; unlike a library, it has no public API. Other pieces of software cannot access it programmatically. As such, the concept of backward compatibility doesn't apply to a blog. As we'll explain, semver version numbers depend on backward compatibility. Because of this dependence, semver can't version software like blogs.
Source: What Software Can SemVer Version?
It requires a public API in order to effectively apply its versioning pattern.
For example:
Bug fixes not affecting the API increment the patch version,
backwards-compatible API additions/changes increment the minor version, and
backwards-incompatible API changes increment the major version.
What represents your API is subjective, as they even state in the SemVer doc:
This may consist of documentation or be enforced by the code itself.
Having read the spec a few times,
Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it should be precise and comprehensive.
I wonder whether all it means is that the consumers of your software must be able to establish the precise "semantic" version they are using.
For example, I could produce a simple script where the semantic version is in the name of the script:
DoStuff_1.0.0.ps1
It's public and precise. Not just in my head :)
Semantic versioning is intended to remove the arbitrariness that can be seen when someone decides to select a versioning scheme for their project. To do that, it needs rules, and a public API is a rule that SemVer chose to use. If you are building a personal project, you don't need to follow SemVer, or follow it strictly. You can, for example, choose to loosely interpret it as:
MAJOR: Big new feature or major refactor
MINOR: New feature which does not impact the rest of the code much
PATCH: Small bug fix
But the vagueness of this loose interpretation opens you up to arbitrariness again. That might not matter to you, or the people you foresee who will be using your software.
The larger your project is, the more the details of your versioning scheme matters. As someone who has worked in a third level support for a large IT company (which licenses APIs to customers) for quite some time, I have seen the "is it a bug or is it a feature" debate many times. SemVer intends to make such distinctions easier.
A public API can, of course, be a REST API, or the public interface of a software library. The public/private distinction is important, because one should have the freedom to change the private code without it adversely affecting other people. (If someone accesses your private code through, say, reflection, and you make a change which breaks their code, that is their loss.)
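The public/private distinction above can be sketched with a hypothetical class: only the public member is the contract SemVer tracks, so only changes to it can force a major bump.

```java
public class PriceFormatter {
    // Public API: removing or changing this signature would be a
    // backwards-incompatible change and would force a MAJOR bump.
    public String format(long cents) {
        return dollars(cents) + "." + String.format("%02d", cents % 100);
    }

    // Private implementation detail: free to rename or refactor in a
    // PATCH release, since no well-behaved caller can depend on it.
    private String dollars(long cents) {
        return Long.toString(cents / 100);
    }
}
```

Someone reaching the private method through reflection is relying on behavior outside the declared contract, which is exactly the "their loss" case described above.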
But a public API can even be something like command line switches. Think of POSIX compliant CLI tools. These tools are standalone applications. But they are used in shell scripts, so the input they accept, and the output they produce, can matter. The GNU project may choose to reimplement a POSIX standard tool, and include its own features, but in order for a plethora of shell scripts to continue working across different systems, they need to maintain the behaviour of the existing switches for that application. I have seen people having to build wrappers around applications because the output of the version command changes, and they had scripts relying on the output of the version command to be in a certain form. Should the output of the version command be part of the public API, or is what those people using it did a hack? The answer is that it is up to you and what guarantees you make to the people using your software. You might not be able to imagine all use cases. Declaring the intended use of your software creates a contract with your users, which is what a public API is.
SemVer, thus, makes it easier for your users to know what they are getting when upgrading. Did only the patch level change? Then better install it quickly to get the latest fix. Did the major version change? Better run a full, potentially expensive, regression test suite to see if my application will still work after the upgrade.

How to manage "3rd party" sub-projects in Perforce?

Our group integrates a bunch of different sub-blocks into our main project, and we are trying to determine the best way to manage all of these different pieces of intellectual property. (From here on, I will refer to these sub-projects as pieces of IP, "intellectual property".)
The IP will be a mixture of third party vendor IP, previous projects IP and new to this project IP. Here are some of the ideas we are considering for managing all the different pieces of IP:
Publish releases on a physical drive and have the main project point to the correct releases.
PROS - Little to no dependency on the SCM; seems simpler to manage initially.
CONS - Must remember to keep each physical design center up to date.
Use Perforce client spec views to include the correct version.
PROS - Able to quickly see which IPs are being used in the client spec.
CONS - With a lot of IPs the client spec becomes very messy and hard to manage; each team member manages their own client spec (inconsistencies); the very thing determining which IP version to use is not under SCM (by default).
Integrate the different releases into a single one-line client view.
PROS - Makes client spec maintenance dead simple; any change to the IP version is easily observable with the standard Perforce tools.
CONS - Not as easy to see which versions of IP we are using.
Our manager prefers #2 because it is easiest for him to look at a client spec and know all the IPs we are using and their versions. The worker bees tend to strongly dislike it, as it means we have to try to keep everyone's individual client specs up to date, and the choice of IP versions is not under SCM of the project itself.
How do others handle IP within a Perforce project and what recommendations do you have?
UPDATE:
I am really leaning towards solution #3; it just seems so much cleaner and easier to maintain. If anyone can think of why #3 is not a good idea, please let me know.
I would go for the third solution too.
I can't think of any downsides, and have not experienced any when faced with similar situations in the past.
You could placate your manager by using a branch spec that clearly spells out which IP versions are branched in. He could then refer to that branch spec instead of a client spec.
Also if you look up 'spec depots' in the help, you can set Perforce up so that it version controls all specs, including branch specs, automatically, which will give you traceability if you alter IP versions.
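As a sketch of what this might look like (all depot paths and IP names here are hypothetical): the client spec stays a single line, while a branch spec, version-controlled via the spec depot, records which IP releases were integrated.

```
# Client spec: dead simple and identical for every team member.
View:
    //depot/project/main/... //my-workspace/project/...

# Branch spec (tracked in the spec depot): records the IP releases
# that were branched into //depot/project/main.
View:
    //depot/ip/vendor-codec/r2.3/... //depot/project/main/ip/vendor-codec/...
    //depot/ip/legacy-dsp/r1.7/...   //depot/project/main/ip/legacy-dsp/...
```

The manager then reads the branch spec, not individual client specs, to see exactly which IP versions are in use.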
"each team member manages their own client spec (inconsistencies)"
Don't do that. Have the client spec be a file that is checked in to Perforce.
I would suggest #2 as it is the most transparent system. Yes it will mean some more work keeping clients up to date, but you can minimize that issue by using template clients.
At my work we use template clients that the devs copy from to keep their clients properly configured. We name this with the pattern "0-PRODUCT-BRANCH" (and sometimes add platform if needed). Then it is a one line command from the command line, or a couple clicks from the GUI to update your client. I send notices to the team whenever the template changes.
Now in my case, template changes don't happen very often. There is maybe a max of 5-6 per year, so the hassle level may be different for you.
