Naming step definitions used with 2+ keywords (e.g. Given + Then) - cucumber

Consider this feature snippet:
Given the date is '2022-01-01'
...
When the date is '2022-01-31'
...
One identical step, but used with two different keywords. I haven't found much discussion of the naming convention for this case, and I'd love for people to give their input.
The three main approaches, I suppose, are:
Name the method after the first keyword you use. If you need the step again with another keyword, just reuse it and ignore the fact that the method name contains "Given"
[Given(@"the date is '(.*)'")]
[Then(@"the date is '(.*)'")]
public void GivenTheDateIs(string date)
{
...
}
Use the appropriate keyword in the method name and create a new method whenever the step is needed with another keyword
[Given(@"the date is '(.*)'")]
public void GivenTheDateIs(string date)
{
//same code
}
[Then(@"the date is '(.*)'")]
public void ThenTheDateIs(string date)
{
//same code
}
Don't use a keyword in the method name at all
[Given(@"the date is '(.*)'")]
[Then(@"the date is '(.*)'")]
public void TheDateIs(string date)
{
...
}
Which one do you use and why?

Given is for setting up state
When is for doing something
Then is for evaluating results
In English at least tense differentiates
Given - past tense
When - present tense
Then - future tense
Generally, Whens have to be implemented the slow way, through the application's real interface.
With a UI a When will involve a browser interaction. Often a Given that does the same thing can be done more efficiently without that browser interaction. With an API the When will be done by a request/response cycle. Again a Given can do the same thing without the request/response.
For example, consider adding registration behaviour. First you would write
When 'I register' do
  fill_in "name", with: "Fred"
  ...
  submit_form
end
But for your Given you might write
Given 'I am registered' do
  User.create(
    name: "Fred"
    ...
  )
end
which is much much faster.
You will use When I register only a few times in your application
You will use Given I am registered (in some form or other) for almost every user interaction you write a scenario for.
For the above example the Then is
Then 'I should be registered' do
  # check UI for proof of registration
end
So Given's, When's and Then's are not only different in their context and tense, but also they are different in their implementations and their frequency of use.
The idea that a Given could be identical to a When or Then just reflects a lack of understanding and precision in your use of language for your scenarios and code for the implementation of your step definitions.
Of course you are allowed to write Cukes without this precision and control, but you don't need to, and you will gain a lot if you can write precisely enough that you never have to worry about the difference between a Given, a When or a Then.
Caveats: Answer is opinionated, but question is an excellent one in my opinion. All code and examples are Ruby. All the examples are crude and simplistic. Putting lots of code in step def and calling the database directly in test code is poor practice.
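To make the distinction concrete without pulling in Cucumber or a browser, here is a plain-Ruby sketch. RegistrationApp and the three helper methods are hypothetical stand-ins invented for this illustration: submit_registration_form plays the role of the slow UI or request/response path, and create_user plays the role of the direct shortcut a Given would take.

```ruby
# Hypothetical in-memory application, used only for this sketch.
class RegistrationApp
  attr_reader :users, :ui_interactions

  def initialize
    @users = []
    @ui_interactions = 0
  end

  # Slow path: stands in for driving a browser form or an HTTP request.
  def submit_registration_form(name)
    @ui_interactions += 1
    create_user(name)
  end

  # Fast path: stands in for creating the record directly.
  def create_user(name)
    @users << name
  end

  def registered?(name)
    @users.include?(name)
  end
end

# Given: set up state by the cheapest route available.
def given_i_am_registered(app, name)
  app.create_user(name)
end

# When: exercise the behaviour through the real interface.
def when_i_register(app, name)
  app.submit_registration_form(name)
end

# Then: evaluate the result without changing anything.
def then_i_should_be_registered(app, name)
  raise "#{name} is not registered" unless app.registered?(name)
end

app = RegistrationApp.new
given_i_am_registered(app, "Fred")      # no UI interaction
when_i_register(app, "Wilma")           # one UI interaction
then_i_should_be_registered(app, "Fred")
then_i_should_be_registered(app, "Wilma")
```

The three helpers end in the same state for the user, yet only the When pays for the interface round trip, which is the reason the answer gives for not sharing one implementation between keywords.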

Related

Use value object in command and event?

Can we use a value object in a command?
Suppose I have a Shop (aggregate) that contains one value object, Address.
I put some validation logic for the address in the Address constructor.
If I use that Address object in the command (CreateShopCmd), it is validated when the command is created, but from what I have read, that validation should happen in the command handler.
The problem is that I would then have to repeat the validation in the command handler (it is already present in the Address constructor). If I don't put it in the command handler, the validation occurs when I create the Address object in the event handler and assign it to the Shop aggregate, which is incorrect.
So please guide me.
A code example is below:
@Aggregate
@AggregateRoot
public class Shop {
    @AggregateIdentifier
    private ShopId shopId;
    private String shopName;
    private Address address;
    @CommandHandler
    public Shop(CreateShopCmd cmd) {
        // Validation logic goes here if the Address is not used
        // in the cmd
        // Fire an event after validation
        ShopRegistredEvt shopRegistredEvt = new ShopRegistredEvt();
        AggregateLifecycle.apply(shopRegistredEvt);
    }
    @EventSourcingHandler
    public void on(ShopRegistredEvt evt) {
        this.shopName = evt.getShopName();
        // Validation happens here, when the Address object is created,
        // if it was not done in the cmd - this is wrong
        this.address = new Address(evt.getCity(), evt.getCountry(), evt.getZipCode());
    }
}
public class CreateShopCmd{
private String shopId;
private String shopName;
private String city;
private String zipCode;
private String country;
}
public class ShopRegistredEvt {
private String shopId;
private String shopName;
private String city;
private String zipCode;
private String country;
}
There is nothing conceptually wrong with using Value Objects in Commands or Events. However, you should use them with caution.
The structure of a Message may change over time. If you have used Value Objects excessively inside your messages, it may become less clear how a change in one of the value objects changes the structure of different messages.
For Value Objects that represent a "common" concept, such as an Address, this is not so much of a problem. But as soon as the Value Objects become more domain-specific, this may come up as an issue.
This is a very good question, and I have thought thoroughly about whether or not to embed value objects in commands. I came to the conclusion that you should definitely not use Value Objects in commands:
Commands are part of the application layer. They should be as simple as possible, avoid typed objects, and work best with literals (think serialization). What happens when an external system wants to plug into your hexagon (application layer) and send commands to your application? Does it need your command library to be able to use the objects and the structure you defined? Hell no! You don't want that, so keep commands simple.
Another reason is, as DmitriBodiu said, that VOs contain business logic and validation. They belong to the domain layer, so do not ever put them in commands. The application service does the translation and is responsible for reporting validation errors to the client for any non-conforming command.
There is nothing wrong with your design; it is actually how Vaughn Vernon (the author of Implementing Domain-Driven Design, the IDDD book) did it in his repository. You might want to check the application layer at this link:
https://github.com/VaughnVernon/IDDD_Samples/blob/master/iddd_identityaccess/src/main/java/com/saasovation/identityaccess/application/IdentityApplicationService.java
Notice how he reconstructs every object, going from flat commands to value objects that belong to the domain layer:
@Transactional
public void changeUserContactInformation(ChangeContactInfoCommand aCommand) {
    User user = this.existingUser(aCommand.getTenantId(), aCommand.getUsername());
    this.internalChangeUserContactInformation(
            user,
            new ContactInformation(
                    new EmailAddress(aCommand.getEmailAddress()),
                    new PostalAddress(
                            aCommand.getAddressStreetAddress(),
                            aCommand.getAddressCity(),
                            aCommand.getAddressStateProvince(),
                            aCommand.getAddressPostalCode(),
                            aCommand.getAddressCountryCode()),
                    new Telephone(aCommand.getPrimaryTelephone()),
                    new Telephone(aCommand.getSecondaryTelephone())));
}
Commands must not contain business logic, so they cannot carry a value object.
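For the Shop example from the question, the same translation step could look roughly like the Ruby sketch below. This is only an illustration under assumptions: ShopApplicationService, the Struct-based CreateShopCommand and the "required field" checks are hypothetical names and rules made up here, not Axon or IDDD code.

```ruby
# Domain layer: the value object owns its own validation.
class Address
  attr_reader :city, :country, :zip_code

  def initialize(city:, country:, zip_code:)
    # Hypothetical rules; the question does not say what is validated.
    raise ArgumentError, "city is required" if city.to_s.strip.empty?
    raise ArgumentError, "country is required" if country.to_s.strip.empty?
    raise ArgumentError, "zip code is required" if zip_code.to_s.strip.empty?

    @city = city
    @country = country
    @zip_code = zip_code
    freeze
  end
end

# Application layer: the command is a flat bag of primitives.
CreateShopCommand = Struct.new(:shop_id, :shop_name, :city, :zip_code, :country,
                               keyword_init: true)

Shop = Struct.new(:shop_id, :shop_name, :address, keyword_init: true)

class ShopApplicationService
  # Translates the flat command into domain objects. Invalid data is
  # rejected here, at the moment the Address is constructed.
  def create_shop(command)
    address = Address.new(city: command.city,
                          country: command.country,
                          zip_code: command.zip_code)
    Shop.new(shop_id: command.shop_id, shop_name: command.shop_name, address: address)
  end
end

service = ShopApplicationService.new
shop = service.create_shop(
  CreateShopCommand.new(shop_id: "s-1", shop_name: "Corner Shop",
                        city: "Pune", zip_code: "411001", country: "IN")
)
```

The validation is written once, in Address, and runs while the command is being handled, so the command itself stays serializable and free of domain types.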
I wouldn't suggest using Value Objects in commands, because your commands are part of the application layer, while Value Objects are kept in the domain layer. You can use your Value Objects in domain events, though: if the domain model changes, modifying your domain events wouldn't be that painful, because the modification is done within the same bounded context. You should never use Value Objects in integration events, however.
Short answer: Have you ever thought about Integer, String, Boolean, etc.? Those are Value Objects, too. The only difference is, that you didn't create them yourself. Now try to build a Command without any Value Objects ;-)
Long answer:
In general I don't see any issue with Value Objects within Commands. As long as you follow a few simple guidelines:
The most important code in your application is your Domain Model. The Domain Model defines the data structures it expects for Command handling. This means: The only reason to change your Command Model is if your Domain Model requires this change. The same applies to your Value Objects: Value Objects only change if this change is required by your Domain Model. No exceptions!
Commands can in general fail either because of business constraints, or because of invalid data (or because of optimistic locking, or whatever).
As said above: Integers and Strings are Value Objects, too. If you only use basic types within your Command, it will already throw an exception if you try new SetAgeCommand(aggId, "foo"), because String cannot be assigned to int. The same applies if you don't provide an Aggregate ID to your UpdatePersonCommand. These are no business constraints, but instead very basic data and type validation. Your Command will never be created if you pass malformed data.
Now let's say you have a PersonAge Value Object. It doesn't matter where you construct this object, because in any case it must throw an Exception if you try to construct it with a negative number: -5 cannot be assigned to PersonAge - looks familiar? As long as you can make sure that your code created those Value Object instances, you can know for sure that they are valid.
Business rules should be checked by the Command Handler within your Domain Model. In general business constraints are specific to your Domain, and most often they rely on the data within your Aggregate. Take for example SendMoneyCommand. Your Money Value Object can validate if it's a valid currency, but it cannot validate if the user's bank account has enough money to execute the transaction. This is a business validation and it's part of your Domain Model.
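The split between data validation and business validation can be sketched in Ruby as follows. Everything here is assumed for illustration: the currency list, the cents-based amount, the InsufficientFunds error and the BankAccount aggregate are hypothetical, chosen only to mirror the SendMoneyCommand example.

```ruby
# Value object: rejects malformed data wherever it is constructed.
class Money
  CURRENCIES = %w[EUR USD].freeze # assumed list, for illustration only

  attr_reader :amount_cents, :currency

  def initialize(amount_cents, currency)
    unless amount_cents.is_a?(Integer) && amount_cents >= 0
      raise ArgumentError, "amount must be a non-negative integer"
    end
    raise ArgumentError, "unknown currency #{currency}" unless CURRENCIES.include?(currency)

    @amount_cents = amount_cents
    @currency = currency
    freeze
  end
end

class InsufficientFunds < StandardError; end

SendMoneyCommand = Struct.new(:account_id, :money)

# Aggregate: the business rule needs the aggregate's own state,
# so it cannot live inside the Money value object.
class BankAccount
  attr_reader :balance_cents

  def initialize(balance_cents)
    @balance_cents = balance_cents
  end

  def handle(command)
    raise InsufficientFunds if command.money.amount_cents > balance_cents

    @balance_cents -= command.money.amount_cents
  end
end
```

A command holding a Money instance is already known to carry well-formed data; whether the transfer is allowed is still decided by the aggregate when it handles the command.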
And a word regarding Events: I'd suggest to only use very basic Value Objects inside your events. For example: String, Integer, Date, etc. Basically every kind of Value Object that will never change. The reason behind it: Business requirements can change. For example: Maybe your Domain Model requires your Address Value Object to change, and it's now required to provide geo-coordinates. Then this will implicitly change your NewAddressAddedEvent. But your already persisted Events didn't have this requirement, so you're unable to construct Address Value Objects from your past event data, because the new Address Value Object will throw an Exception if there are no geo-coordinates provided.
There are (at least) two solutions for this problem:
Versioned Events: After modifying your Address Value Object, you have now a NewAddressAddedEvent_Version2 which uses the new Address Value Object, and you have the old NewAddressAddedEvent which must use a backup copy of the old Address Value Object.
Write a Script that "repairs" your event database by adding geo-coordinates to every Event that uses the Address Value Object. So you can throw away the old NewAddressAddedEvent.
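The first option (versioned events) might be sketched like this in Ruby. The hash-based event format, the ADDRESS_READERS lookup table and the address_from helper are assumptions made up for the example; a real event store would have its own way of recording the event type and version.

```ruby
# Frozen copy of the original value object, kept only to read old events.
AddressV1 = Struct.new(:city, :country, :zip_code, keyword_init: true)

# Current value object: geo-coordinates are now mandatory.
class Address
  attr_reader :city, :country, :zip_code, :latitude, :longitude

  def initialize(city:, country:, zip_code:, latitude: nil, longitude: nil)
    raise ArgumentError, "geo-coordinates are required" if latitude.nil? || longitude.nil?

    @city = city
    @country = country
    @zip_code = zip_code
    @latitude = latitude
    @longitude = longitude
  end
end

# Maps each stored event name to the class that can read its payload.
ADDRESS_READERS = {
  "NewAddressAddedEvent" => AddressV1,
  "NewAddressAddedEvent_Version2" => Address
}.freeze

# Rebuilds the address carried by a stored event (a plain Hash here).
def address_from(event)
  ADDRESS_READERS.fetch(event[:name]).new(**event[:payload])
end

old_event = { name: "NewAddressAddedEvent",
              payload: { city: "Pune", country: "IN", zip_code: "411001" } }
new_event = { name: "NewAddressAddedEvent_Version2",
              payload: { city: "Pune", country: "IN", zip_code: "411001",
                         latitude: 18.52, longitude: 73.86 } }

address_from(old_event) # read with the old value object
address_from(new_event) # read with the current one
```

Feeding the old payload straight into the current Address would raise, which is exactly the replay problem described above; the lookup table keeps old events readable at the cost of carrying AddressV1 around indefinitely.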
That's OK as long as the value objects are conceptually a part of your message contract, and not used in entities.
And if they are a part of your entity, don't expose them as public properties of your message or you'll be in the soup.

How to model associations in DDD approach?

I'm learning the DDD approach step by step with an imaginary business domain by reading the books of Eric Evans and Vaughn Vernon, and I'm trying to apply it in my project using PHP (but that really doesn't matter here).
Recently I've been reading a lot about the Aggregate, AggregateRoot and Entity patterns for models that should be defined by a domain. Frankly, I'm not sure I understand all the definitions well, so I decided to ask my questions here.
First I'd like to present my (sub)domain, which is responsible for managing employees' holidays; this should make my questions easier to answer.
The most trivial case is that an Employee can be found in many Teams. When the employee decides to take a few days off, he has to send a HolidaysRequest with metadata such as the type of holidays (rest holidays, days off to take care of his child, etc.), the acceptance status and, of course, the time range during which he is not going to appear in his office. Of course, the HolidaysRequest should know which Employee sent it. I'd also like to find all HolidaysRequests sent by an Employee.
I'm quite sure that things like DateRange or HolidayType are pure ValueObjects; that is quite clear to me. The problems start when I have to define the boundaries of entities. I may have the bad habit of defining associations by nesting objects in entities, so please help me work out the division of responsibilities here.
What is an entity here? What should be an Aggregate and where's the place for AggregateRoot?
How to define associations between entities? E.g. an Employee can belong to multiple Teams or HolidaysRequest is authored by Employee and assigned to another Employee who can accept it. Should they be implemented as Aggregates?
Why am I asking these questions? Because a few weeks ago I posted a question here, and one of the answers was to think about the relation between Employee and Teams: that they should be in a single Aggregate called EmployeeInTeam. But I'm not sure I understand it properly.
Thanks for any advice.
The main thing about DDD is to put the focus on the domain; that's why it's called Domain-Driven Design.
When you start asking about relationships, aggregates and entities without even deeply exploring what your domain consists of, you're actually doing database modeling instead of domain modeling.
Please note, I'm not saying you're asking the wrong questions, nor criticising them; I think you're not wrong at all to try putting things into practice while studying.
I'm not a DDD expert, I'm learning just like you, but I'm going to try to help.
Start by thinking about what situations may arise in holiday management. When you have different rules for something, you could start by using strategies (I'm not saying this is the final solution).
Building a nice and meaningful domain is very hard (at least for me). You write code. Test it. Have insights, throw your code away and rewrite it. Refactor it. Throughout your software's lifecycle you should focus on the domain, so you should always be improving it.
Start by coding (like a draft of the domain) to see what it looks like. Let's try it. First of all, why do we need to manage this stuff? What problem are we trying to solve? Ahh, sometimes employees ask for some days off, and we want to control that. We may approve or not, depending on the reason they want a "holyday" and on our team's status. If we decline and they still go home, we'll decide later whether to fire them or deduct from their salary. Enforcing ubiquitous language, let's express this problem in code:
public interface IHolydayStrategy
{
bool CanTakeDaysOff(HolydayRequest request);
}
public class TakeCareOfChildren : IHolydayStrategy
{
public bool CanTakeDaysOff(HolydayRequest request)
{
return IsTotalDaysRequestedUnderLimit(request.Range.TotalDays());
}
public bool IsTotalDaysRequestedUnderLimit(int totalDays)
{
return totalDays < 3;
}
}
public class InjuredEmployee : IHolydayStrategy
{
public bool CanTakeDaysOff(HolydayRequest request)
{
return true;
}
}
public class NeedsToRelax : IHolydayStrategy
{
public bool CanTakeDaysOff(HolydayRequest request)
{
return IsCurrentPercentageOfWorkingEmployeesAcceptable(request.TeamRealSize, request.WorkingEmployees)
|| AreProjectsWithinDeadline(request.Projects);
}
private bool AreProjectsWithinDeadline(IEnumerable<Project> projects)
{
return !projects.Any(p => p.IsDeadlineExceeded());
}
private bool IsCurrentPercentageOfWorkingEmployeesAcceptable(int teamRealSize, int workingEmployees)
{
return (double)workingEmployees / teamRealSize > 0.7d; // cast avoids integer division
}
}
public class Project
{
public bool IsDeadlineExceeded()
{
throw new NotImplementedException();
}
}
public class DateRange
{
public DateTime Start { get; set; }
public DateTime End { get; set; }
public int TotalDays()
{
return End.Subtract(Start).Days;
}
public bool IsBetween(DateTime date)
{
return date > Start && date < End;
}
}
public enum HolydayTypes
{
TakeCareOfChildren,
NeedToRelax,
BankOfHours,
Injured,
NeedToVisitDoctor,
WannaVisitDisney
}
public class HolydayRequest
{
public IEnumerable<Project> Projects { get; internal set; }
public DateRange Range { get; set; }
public HolydayTypes Reason { get; set; }
public int TeamRealSize { get; internal set; }
public int WorkingEmployees { get; internal set; }
}
Here is how I quickly wrote this:
Holydays may be granted or not, depending on the situation and
reason, let's create a IHolydayStrategy.
Created an empty (propertyless) HolydayRequest class.
For each possible reason, let's create a different strategy.
If the reason is to take care of children, they can take days off if
the total days request is under a limit.
If the reason is because the employee has been injured, we have no
choice other than allowing the request.
If the reason is because they need to relax, we check if we have an
acceptable percentage of working employees, or if projects are within
deadline.
As soon as I needed some data in the strategy, I used CTRL + . to
automagically create properties in HolydayRequest.
See how I don't even know how this stuff is going to be stored or mapped? I just wrote code to solve a problem and pulled in the pieces of information needed to resolve it.
Obviously this is not the final domain; it is just a draft. I might throw this code away and rewrite it if needed; I have no feelings for it yet.
People may think it's useless to create an InjuredEmployee class just to always return true, but the point here is to make use of ubiquitous language, to make things as explicit as possible, anyone would read and understand the same thing: "Well, if we have an injured employee, they are always allowed to take days off, regardless of the team's situation and how many days they need.". One of the problems this concept in DDD solves is the misunderstanding of terms and rules between developers, product owners, domain experts, and other participants.
After this, I would start writing some tests with mock data. I might refactor code.
This "3":
public bool IsTotalDaysRequestedUnderLimit(int totalDays)
{
return totalDays < 3;
}
and this "0.7d":
private bool IsCurrentPercentageOfWorkingEmployeesAcceptable(int teamRealSize, int workingEmployees)
{
return (double)workingEmployees / teamRealSize > 0.7d; // cast avoids integer division
}
are specifications, in my view, and shouldn't reside in a strategy. We might apply the Specification pattern to decouple things.
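As a rough idea of what pulling those two rules out into specifications could look like, here is a minimal Ruby sketch. The Specification class, the constant names and the simplified HolidayRequest struct (with a projects_on_schedule flag standing in for the project deadline check) are assumptions for illustration, not part of the draft above.

```ruby
# Minimal specification: a named predicate that can be combined with others.
class Specification
  def initialize(&predicate)
    @predicate = predicate
  end

  def satisfied_by?(candidate)
    @predicate.call(candidate)
  end

  # Returns a new specification satisfied when either operand is.
  def or(other)
    Specification.new { |c| satisfied_by?(c) || other.satisfied_by?(c) }
  end
end

# Simplified, hypothetical request shape for this sketch.
HolidayRequest = Struct.new(:total_days, :team_size, :working_employees,
                            :projects_on_schedule, keyword_init: true)

MAX_CHILD_CARE_DAYS = 3   # the "3" from the strategy
MIN_WORKING_RATIO   = 0.7 # the "0.7d" from the strategy

UNDER_CHILD_CARE_LIMIT = Specification.new { |r| r.total_days < MAX_CHILD_CARE_DAYS }
ENOUGH_EMPLOYEES_WORKING = Specification.new do |r|
  # fdiv forces floating-point division of the two integers.
  r.working_employees.fdiv(r.team_size) > MIN_WORKING_RATIO
end
PROJECTS_WITHIN_DEADLINE = Specification.new { |r| r.projects_on_schedule }

# The NeedsToRelax rule, composed from the two smaller rules.
CAN_RELAX = ENOUGH_EMPLOYEES_WORKING.or(PROJECTS_WITHIN_DEADLINE)

request = HolidayRequest.new(total_days: 2, team_size: 10,
                             working_employees: 8, projects_on_schedule: false)
CAN_RELAX.satisfied_by?(request) # => true, 8 of 10 employees are working
```

A strategy would then only pick which specification applies to a given reason, while the thresholds live in one named place each.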
Once we get to a reasonable initial solution with passing tests, let's think about how we should store it. We might use the final classes (such as Team, Project, Employee) here to be mapped by an ORM.
As soon as you start writing your domain, relationships will arise between your entities; that's why, at this point, I usually don't care at all how the ORM will persist my domain or what is an Aggregate.
See how I didn't create an Employee class yet, even though it sounds very important. That's why we shouldn't start by creating entities and their properties, because it's the exact same thing as creating tables and fields.
Your DDD turns into Database Driven Design that way, and we don't want this. Of course, eventually we'll create the Employee, but let's go step by step and create things only when we need them. Don't try to model everything at once, predicting all the entities you're going to need. Focus on your problem and how to solve it.
About your questions on what is an entity and what is an aggregate: I think you're not asking for their definitions, but whether Employee is one or the other in your domain. You'll eventually answer that yourself, as your domain is revealed by your code. You'll know once you start developing your Application Layer, which has the responsibility of loading data and delegating to your domain: what data does my domain logic expect, and where do I start querying from?
I hope I helped someone.

what does getType do in antlr4?

This question is with reference to the Cymbol code from the book (~ page 143) :
int t = ctx.type().start.getType(); // in DefPhase.enterFunctionDecl()
Symbol.Type type = CheckSymbols.getType(t);
What does each component return: "ctx.type()", "start", "getType()" ? The book does not contain any explanation about these names.
I can "kind of" understand that "ctx.type()" refers to the "type" rule, and "getType()" returns the number associated with it. But what exactly does the "start" do?
Also, to generalize this question: what is the mechanism to get the value/structure returned by a rule - especially in the context of usage in a listener?
I can see that for an ID, it is:
String name = ctx.ID().getText();
And as in above, for an enumeration of keywords it is via "start.getType()". Any other special kinds of access that I should be aware of?
Let's take the problem apart step by step. Obviously, ctx is an instance of CymbolParser.FunctionDeclContext. On pages 98-99 you can see how the grammar and ParseTree are implemented (at least to get a feeling for it; for the real implementation please see the .g4 file).
Take a look at the figure of the AST on page 99: you can see that the FunctionDeclContext node has several children, one labeled type. Intuitively you can see that it corresponds to the function's return type. This is the node you retrieve when calling CymbolParser.FunctionDeclContext::type. The return type is probably something like TypeContext.
Note that methods without 'get' at the beginning are usually children-getters - e.g. you can access the block by calling CymbolParser.FunctionDeclContext::block.
So you now have the type context of the function declaration you were passed. Every context has start and stop fields that hold the first and last Token of the context. Simply put, start gets you "the first word". In this case, the first Token is of course the function return type itself, e.g. int.
And the last call, Token::getType, returns the integer token type of that Token.
You can find more information at API reference webpages - Context, Token. But the best way of understanding the behavior is reading through the generated ANTLR classes such as <GrammarName>Parser etc. And to be complete, I attach a link to the book.

DDD - Invalidating expirable

Currently diving into DDD, and I've read most of the big blue book by Eric Evans. Quite interesting so far :)
I've been modeling some aggregates where they hold a collection of entities which expire. I've come up with a generic approach of expressing that:
public class Expirable<T>
{
public T Value { get; protected set; }
public DateTime ValidTill { get; protected set; }
public Expirable(T value, DateTime validTill)
{
Value = value;
ValidTill = validTill;
}
}
I am curious what the best way is to invalidate an Expirable (nullify or omit it when working in a set). So far I've been thinking to do that in the Repository constructor since that's the place where you access the aggregates from and acts as a 'collection'.
I am curious if someone has come up with a solution to tackle this and I would be glad to hear it :) Other approaches are also very welcome.
UPDATE 10-1-2013:
This is not DDD with the CQRS/ES approach from Greg Young. But the approach Evans had, since I just started with the book and the first app. Like Greg Young said, if you have to make good tables, you have to make a few first ;)
There are probably multiple ways to approach this, but I, personally, would solve this using the Specification pattern. Assuming object expiration is a business rule that belongs in the domain, I would have a specification in addition to the class you have written. Here is an example:
public class NotExpiredSpecification<T>
{
    public bool IsSatisfiedBy(Expirable<T> expirableValue)
    {
        // Return true if not expired; otherwise, false.
        // Assumes ValidTill is stored in UTC.
        return expirableValue.ValidTill > DateTime.UtcNow;
    }
}
Then, when your repositories are returning a list of aggregates or when performing any business actions on a set, this can be utilized to restrict the set to un-expired values which will make your code expressive and keep the business logic within the domain.
To learn more about the Specification pattern, see this paper.
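A small Ruby sketch of the same idea, showing the repository using the specification to restrict a set. SessionRepository and the injected clock are assumptions added for the example; injecting the clock is just one way to make the expiry rule testable without waiting for real time to pass.

```ruby
# Ruby counterpart of Expirable<T>: a value plus the time it is valid until.
Expirable = Struct.new(:value, :valid_till)

class NotExpiredSpecification
  # The clock is injected so the rule can be checked deterministically.
  def initialize(now: -> { Time.now.utc })
    @now = now
  end

  def satisfied_by?(expirable)
    expirable.valid_till > @now.call
  end
end

# Hypothetical repository that only hands out un-expired values.
class SessionRepository
  def initialize(sessions, specification)
    @sessions = sessions
    @specification = specification
  end

  def active_sessions
    @sessions.select { |session| @specification.satisfied_by?(session) }
  end
end

fixed_now = Time.utc(2013, 1, 10, 12, 0, 0)
specification = NotExpiredSpecification.new(now: -> { fixed_now })
repository = SessionRepository.new(
  [Expirable.new("stale", Time.utc(2013, 1, 10, 11, 0, 0)),
   Expirable.new("fresh", Time.utc(2013, 1, 11, 0, 0, 0))],
  specification
)
repository.active_sessions.map(&:value) # => ["fresh"]
```

Note that this only filters on read; the expired entries are still stored until something removes them.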
I've added a method, InvalidateExpirable, to my abstract repository. An example is the UserRepository, where I remove inactive user sessions like this: InvalidateExpirable(x => x.Sessions, (user, expiredSession) => user.RemoveSession(expiredSession));.
The signature of InvalidateExpirable looks like this: protected void InvalidateExpirable<TExpirableValue>(Expression<Func<T, IEnumerable<Expirable<TExpirableValue>>>> selector, Action<T, Expirable<TExpirableValue>> remover). The method itself uses reflection to extract the selected property from the selector parameter. That property name is glued into a generic HQL query which traverses the set, calling the remove lambda. user.RemoveSession removes the session from the aggregate. This way I keep the aggregate responsible for its own data. Also, RemoveSession raises a domain event for future cases.
See: https://gist.github.com/4484261 for an example
Works quite well so far; I have to see how it works further down in the application, though.
Have been reading up on DDD with CQRS/ES (Greg Young approach) and found a great example on the MSDN site about CQRS/ES: http://msdn.microsoft.com/en-us/library/jj554200.aspx
In this example they use the command message queue to queue an Expire message for the future, which will call the aggregate at the specified time to remove or deactivate the expirable construct.
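The scheduling idea can be imitated in memory with a few lines of Ruby. This is a toy sketch, not the MSDN sample: ScheduledMessageQueue, ExpireSession and the User aggregate below are hypothetical, and a real system would use a durable queue with delayed delivery instead of a polled array.

```ruby
ExpireSession = Struct.new(:session_id)

# Toy stand-in for a message queue that supports delayed delivery.
class ScheduledMessageQueue
  def initialize
    @messages = []
  end

  def schedule(due_at, message)
    @messages << [due_at, message]
  end

  # Delivers every message that is due at `now` and keeps the rest queued.
  def deliver_due(now, handler)
    due, @messages = @messages.partition { |due_at, _| due_at <= now }
    due.sort_by(&:first).each { |_, message| handler.handle(message) }
  end
end

# The aggregate schedules its own expiry and removes the session
# when the Expire message comes back.
class User
  attr_reader :sessions

  def initialize(queue)
    @queue = queue
    @sessions = {}
  end

  def add_session(session_id, valid_till)
    @sessions[session_id] = valid_till
    @queue.schedule(valid_till, ExpireSession.new(session_id))
  end

  def handle(message)
    @sessions.delete(message.session_id)
  end
end

queue = ScheduledMessageQueue.new
user = User.new(queue)
start = Time.utc(2013, 1, 10, 12, 0, 0)
user.add_session("short", start + 60)
user.add_session("long", start + 3600)
queue.deliver_due(start + 120, user) # only the "short" session has expired
```

Compared with filtering on read, this keeps the aggregate's state itself free of expired entries, at the price of depending on the queue to deliver the message.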

Are string constants overrated?

It's easy to lose track of odd numbers like 0, 1, or 5. I used to be very strict about this when I wrote low-level C code. As I work more with all the string literals involved in XML and SQL, I find myself often breaking the rule against embedding constants in code, at least when it comes to string literals. (I'm still good about numeric constants.)
Strings aren't the same as numbers. It feels tedious and a little silly to create a compile-time constant that has the same name as its value (E.g. const string NameField = "Name";), and although the repetition of the same string literal in many locations seems risky, there's little chance of a typo thanks to copying and pasting, and when I refactor I'm usually doing a global search that involves changing more than just the name of the thing, like how it's treated functionally in relation to the things around it.
So, let's say you don't have a good XML serializer (or aren't in the mood to set one up). Which of these would you personally use (if you weren't trying to bow to peer pressure in some code review):
static void Main(string[] args)
{
// ...other code...
XmlNode node = ...;
Console.WriteLine(node["Name"].InnerText);
Console.WriteLine(node["Color"].InnerText);
Console.WriteLine(node["Taste"].InnerText);
// ...other code...
}
or:
class Fruit
{
private readonly XmlNode xml_node;
public Fruit(XmlNode xml_node)
{
this.xml_node = xml_node;
}
public string Name
{ get { return xml_node["Name"].InnerText; } }
public string Color
{ get { return xml_node["Color"].InnerText; } }
public string Taste
{ get { return xml_node["Taste"].InnerText; } }
}
static void Main(string[] args)
{
// ...other code...
XmlNode node = ...;
Fruit fruit_node = new Fruit(node);
Console.WriteLine(fruit_node.Name);
Console.WriteLine(fruit_node.Color);
Console.WriteLine(fruit_node.Taste);
// ...other code...
}
A defined constant is easier to refactor. If "Name" ends up being used three times and you change it to "FullName", changing the constant is one change instead of three.
For something like that it depends on how often the constant is used. If it's just in one place as per your example, then hard-coding is fine. If it's used in many different places, definitely use a constant. One typo could lead to hours of debugging if you're not careful, because your compiler isn't going to notice that you typed "Tsate" instead of "Taste", while it WILL notice that you typed fruit_node.Tsate instead of fruit_node.Taste.
Edit:
I see now that you mentioned copying and pasting, but if you're doing that you may also be losing the time you save by not creating a constant in the first place. With intellisense and auto-completion, you could have the constant out there in a few keystrokes, instead of going through the trouble of copy/paste.
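The same failure mode can be shown in a few lines of Ruby, with the caveat that Ruby has no compile step, so the "loud" failure arrives at run time rather than from a compiler. The FRUIT hash and the TASTE_FIELD constant are made up for this sketch.

```ruby
FRUIT = { "Name" => "Apple", "Color" => "Red", "Taste" => "Sweet" }.freeze

TASTE_FIELD = "Taste"

# A mistyped literal key is silently accepted and yields nil.
silent = FRUIT["Tsate"] # => nil

# A mistyped constant fails loudly at the point of use.
loud = begin
  FRUIT[TSATE_FIELD]
rescue NameError => e
  e.class
end

# Hash#fetch is a middle ground: a literal key, but a missing key raises.
strict = begin
  FRUIT.fetch("Tsate")
rescue KeyError => e
  e.class
end
```

Here silent is nil and nothing complains until that nil is used much later, which is the "hours of debugging" case; the constant and fetch variants both stop at the line with the typo.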
As you probably guessed, the answer is: it depends on the context.
It depends on what the example code is part of. If it's just part of a small throw away system then hard coding the constants may be acceptable.
If it's part of a large, complex system and the constants will be used in multiple files, I'd be more drawn to the second option.
As in many matters of programming, this is a matter of taste. The "laws" of proper programming were created from experience -- many people have been burned by global variables causing namespace or clarity problems, so Global Variables Are Evil. Many have used magic numbers, only to later discover that the number was wrong or needed changing. Text search is ill-suited to changing these values, so Constants In Code Are Evil.
But both are permitted, because sometimes they aren't evil. You need to make the decision yourself -- which leads to clearer code? Which is going to be better for maintainers? Does the reasoning behind the original rule apply to my situation? If I had to read or maintain this code later, how would I rather that it were written?
There is no absolute law of good coding style, because no two programmers' minds work exactly alike. The rule is to write the clearest, cleanest code that you can.
Personally, I'd load the fruit from the XML file in advance - something like:
public class Fruit
{
public Fruit(string name, Color color, string taste)
{
this.Name = name; this.Color = color; this.Taste = taste;
}
public string Name { get; private set; }
public Color Color { get; private set; }
public string Taste { get; private set; }
}
// ... In your data access handling class...
public static Fruit FruitFromXml(XmlNode node)
{
// create fruit from xml node with validation here
}
}
That way, the "fruit" isn't really tied to the storage.
I'd go with the constants. It is a little more work, but there is no performance impact. And even if you usually copy/paste the values, I've certainly had instances where I changed code when I typed and didn't realize that Visual Studio had focus. I'd much prefer these resulted in compile errors.
For the example given, where the Strings are used as keys to a map or dictionary, I would lean toward use of an enum (or other object) instead. You can often do much more with an enum than with a constant string. In addition, if some code is commented out, IDE's will often miss that when doing a refactor. Also, references to a String constant that are in comments may or may not be included in a refactor.
I will make a constant for a string when the string will be used in many locations, the string is long or complicated (such as a regex), or when a properly-named constant will make the code more obvious.
I prefer my typos, incomplete refactorings, and other bugs of this sort to fail to compile rather than to just fail to operate properly.
Like many other refactorings, it's an arguably optional additional step that leaves you with code that's less risky to maintain and is more easily grokked by the "next guy". If you're in a situation that rewards that kind of thing (most that I'm in do), go for it.
Yeah, pretty much.
I think developers in statically typed languages have an unhealthy fear of anything at all dynamic. Pretty much every line of code in a dynamically typed language is effectively a string literal, and they've been fine for years. For instance, in JavaScript technically this:
var x = myObject.prop1.prop2;
Is equivalent to this:
var x = window["myObject"]["prop1"]["prop2"]; // assuming global scope
But it is definitely not a standard practice in JavaScript to do this:
var OBJ_NAME = "myObject";
var PROP1_NAME = "prop1";
var PROP2_NAME = "prop2";
var x = window[OBJ_NAME][PROP1_NAME][PROP2_NAME];
That would just be ridiculous.
It still depends though, like if a string is used in numerous places and it's rather cumbersome/ugly to type ("name" vs. "my-custom-property-name-x"), then it's probably worth making a constant, even within a single class (at which point it's probably good to be internally consistent within the class and make all the other strings constants too).
Also, if you actually intend for other external users to interact with your library using these constants, then it's also a good idea to define publicly accessible constants and document that users should use those to interact with your library. However, a library which interacts via magic string constants is usually a bad practice and you should consider designing your library in such a way that you don't need to use magic constants to interact with it in the first place.
I think in the specific example you gave, where the strings are relatively simple to type and there are presumably no external users of your API who would expect to work with it using those string values (i.e. they're just for internal data manipulation), readable code is far more valuable than refactorable code, so I would just put the literals directly inline. Again, this is assuming I understand your exact use case specifically.
One thing nobody seemed to notice is that as soon as you define a constant, its scope becomes something to maintain and think about. This actually does have a cost, it's not free like everyone seems to think. Consider this:
Should it be private or public in my class? What if some other namespace/package has a need for the same value, should I now extract the constant to some global static class of constants? What if I now need it in other assemblies/modules, do I extract it further? All these things make the code less and less readable, harder to maintain, less pleasant to work with, and more complicated. All in the name of refactorability?
Usually, these "great refactorings" never occur, and when they do they require a complete rewrite anyway, with all new strings. And if you had been using some shared module before this great refactoring (as in the above paragraph) which didn't have these new strings which you now need, what then? Do you add them to the same shared module of constants (what if you don't have access to the code for this shared module)? Or do you keep them local to you, in which case there are now multiple scattered repositories of string constants, all at different levels, running the risk of duplicated constants all over the code? Once you get to this point (and believe me I've seen it), refactoring becomes moot, because while you'll get all your usages of your constants, you'll miss other people's usages of their constants, even though these constants have the same logical value as your constants and you're actually trying to change all of them.

Resources