What's the difference between doThrow() and thenThrow()?
Let's say we want to mock an authentication service to validate the login credentials of a user. What's the difference between the following two lines if we were to mock an exception?
doThrow(new BadCredentialsException("Wrong username/password!")).when(authenticationService).login("user1", "pass1");
vs
when(authenticationService.login("user1", "pass1")).thenThrow(new BadCredentialsException("Wrong username/password!"));
Almost nothing: in simple cases they behave exactly the same. The when syntax reads more like a grammatical sentence in English.
Why "almost"? Note that the when style actually contains a call to authenticationService.login. That's the first expression evaluated in that line, so whatever behavior you have stubbed will happen during the call to when. Most of the time, there's no problem here: the method call has no stubbed behavior, so Mockito only returns a dummy value and the two calls are exactly equivalent. However, this might not be the case if either of the following is true:
you're overriding behavior you already stubbed, particularly to run an Answer or throw an Exception
you're working with a spy with a non-trivial implementation
In those cases, doThrow(...).when(authenticationService) hands the mock to Mockito before login is called, so the existing stubbed or real behavior never runs, whereas when().thenThrow() will invoke the dangerous method and throw off your test.
(Of course, for void methods, you'll also need to use doThrow—the when syntax won't compile without a return value. There's no choice there.)
Thus, doThrow is always a little safer as a rule, but when().thenThrow() is slightly more readable and usually equivalent.
I am trying to implement a utility library in Node.js that I can use in different projects. I am stuck on how to handle errors properly. For example, suppose I have a function
function dateCompare(date1,operator,date2) // it can take either a date object or valid date string.
Now suppose an invalid input is supplied-
1. I can return the error in the result, as asynchronous logic does: {error:true,result:""}, but this prevents me from using my function as if (dateCompare(date1,'eq',date2) || dateCompare(date3,'l3',date4))
2. If I throw a custom exception here, I am afraid that, since Node is single-threaded, creating the error context will be very expensive.
How can we handle it so that it is easy to use as well as not very expensive? Under what circumstances throwing exceptions will be more appropriate even if it is too expensive? Some practical use cases will be very helpful.
There's no "right" answer for questions like this. There are various different philosophies and you have to decide which one makes the most sense for you or for your context.
Here's my general scheme:
If you detect a serious programming mistake such as a required argument to a function is missing or is the wrong type, then I prefer to just throw an exception and spell out in the exception msg exactly what is wrong. This should get seen by the developer the first time this code is run and they should then know they need to correct their code immediately. The general idea here is that you want the developer to see their error immediately and throwing an exception is usually the fastest way to do so and you can put a useful message in the exception.
If there are expected error return values such as "user name already taken" or "user name contains invalid characters" that are not programming mistakes, but are just an indication of why a given operation (perhaps containing user data) did not complete, then I would craft return values from the function that communicate this info to the caller.
If your function needs to return either a result or an error, then you have to decide on a case by case basis if it is easy to come up with a range of error values that are easily detectable as separate from the successful return values. For example, Array.prototype.indexOf() returns a negative value to indicate the value was not found or zero or a positive number to indicate it is returning an index. These ranges are completely independent so they are easy to code a test to distinguish them.
Another reason to throw an exception is that your code is likely to be used in a circumstance where it's simpler to let the exception propagate up several calling levels or block levels rather than manually writing code to propagate errors. This is a double-edged sword. While sometimes it's very useful to let the exception propagate, sometimes you actually need to know about and deal with the exception at each level anyway to properly clean up in an error condition (release resources, etc...), so you can't let it go up that many levels automatically anyway.
If such a distinction is not simple to make, either for you as the author of the function or for the developer who will call it, then sometimes it makes sense to return an object that has more than one property, one of which is an error property, another of which is a value.
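As a rough illustration of that last option, here is a minimal sketch of a function that returns an object with an error property and a value property instead of throwing. The name parseDateSafe and the exact shape of the result are my own assumptions for the example, not something from the question.

```javascript
// Hypothetical helper: reports failure through the return value, never throws.
function parseDateSafe(input) {
  // Accept only the two input kinds the question mentions.
  if (!(input instanceof Date) && typeof input !== "string") {
    return { error: "Expected a Date object or a date string", value: null };
  }
  const date = new Date(input);
  // An unparseable input produces an "Invalid Date" whose time is NaN.
  if (Number.isNaN(date.getTime())) {
    return { error: `Invalid date: ${String(input)}`, value: null };
  }
  return { error: null, value: date };
}

const good = parseDateSafe("2020-01-15T00:00:00Z");
const bad = parseDateSafe("not a date");

if (bad.error) {
  console.log(bad.error);
}
if (!good.error) {
  console.log(good.value.toISOString());
}
```

The cost is visible in the calling code: every caller has to check error before touching value, which is exactly what makes this style awkward inside a compound if condition.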
In your specific case of:
function dateCompare(date1,operator,date2)
and
if (dateCompare(date1,'eq',date2) || dateCompare(date3,'l3',date4))
It sure would be convenient if the function just returned a boolean and threw an exception if the date values or operator are invalid. Whether this is a good design decision depends a bit on how this is going to be used. If you're in a tight loop, running this on lots of values, many of which will be badly formatted and would throw such an exception, and performance is important in this case, then it may be better to return the above-described object and change how you write the calling code.
But, if a format failure is not a regular expected case or you're only doing it once or the performance difference of an exception vs. a return value wouldn't even be noticed (which is usually the case), then throw the exception - it's a clean way to handle invalid input without polluting the expected use case of the function.
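Here is one way the throwing version could look. Treat it as a sketch under assumptions: the operator names in OPERATORS and the helper toTimestamp are made up for the example, since the question doesn't say which operators dateCompare supports.

```javascript
// Hypothetical operator table; the question does not list the real operators.
const OPERATORS = {
  eq: (a, b) => a === b,
  lt: (a, b) => a < b,
  gt: (a, b) => a > b,
};

// Converts a Date or date string to a timestamp, throwing on bad input.
function toTimestamp(input) {
  if (!(input instanceof Date) && typeof input !== "string") {
    throw new TypeError("dateCompare: expected a Date or a date string");
  }
  const time = new Date(input).getTime();
  if (Number.isNaN(time)) {
    throw new RangeError(`dateCompare: invalid date "${input}"`);
  }
  return time;
}

// Returns a plain boolean; invalid operators or dates raise an exception.
function dateCompare(date1, operator, date2) {
  if (!Object.prototype.hasOwnProperty.call(OPERATORS, operator)) {
    throw new RangeError(`dateCompare: unknown operator "${operator}"`);
  }
  return OPERATORS[operator](toTimestamp(date1), toTimestamp(date2));
}

// Expected path: booleans compose naturally in a condition.
if (dateCompare("2020-01-01", "lt", "2020-06-01") || dateCompare("2021-01-01", "eq", "2022-01-01")) {
  console.log("at least one comparison holds");
}

// Unexpected path: the bad operator surfaces as an exception with a clear message.
try {
  dateCompare("2020-01-01", "l3", "2020-06-01");
} catch (err) {
  console.log(err.message);
}
```

The expected code path never touches try/catch, so the only callers who pay for the exception are the ones passing bad input.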
How can we handle it so that it is easy to use as well as not very
expensive?
It's not expensive to throw an exception upon bad input if that isn't the normally expected case. Plus, unless this code is in some kind of tight loop and called many times, it's unlikely you would even notice the difference between a return value and a thrown/caught exception. So, I'd suggest you code to make the expected cases simpler to code for and use exceptions for the unexpected conditions. Then, your expected code path doesn't go the exception route. In other words, exceptions actually are "exceptions" to normal.
Under what circumstances throwing exceptions will be more appropriate
even if it is too expensive?
See the description above.
I see the following approach often when working on certain projects that use Node.js and Bluebird.js:
function someAsyncOp(arg) {
return somethingAsync(arg).then(function (results) {
return somethingElseAsync(results);
});
}
That is, creating a wrapper function/closure around another function that accepts the exact same arguments. It seems this could be written more cleanly as:
function someAsyncOp(arg) {
return somethingAsync(arg).then(somethingElseAsync);
}
When I propose it to others, they usually like it and switch to it.
There is, however, an important caveat: if you're calling something like object.function, and the function relies on this (like console.log does), then this will lose its binding. You have to do object.function.bind(object):
return somethingAsync(arg).then(somethingElseAsync).catch(console.log.bind(console));
This does seem potentially undesirable, and the .bind call feels a little awkward. You can't go wrong with the let's-always-do-the-closure approach.
I can't seem to find any discussion on this on google, there doesn't seem to be anything in ESLint about unnecessary wrapper functions. I'm trying to find out more about it so here I am. I guess it's a case of I don't know what I don't know. Is there a name for this? (Useless use of closures?) Any other thoughts or wisdoms? Thank you.
Edit: someone's going to comment that someAsyncOp is also redundant, yes, it is, let's pretend it does something useful.
The discussion here is pretty straightforward. If your function is OK being called directly by the promise system, with the exact arguments and this value that will be in place when it's called directly by the promise system, and its return value is exactly what you want in the promise chain, then by all means, just specify the function reference directly as the .then() handler:
somethingAsync(arg).then(somethingElseAsync)
But, if your function isn't set up to be called directly that way, then you need a wrapper function or something like .bind() to fix the mismatch and call your function exactly as you want or set up the proper return value.
There's really nothing more to it than that. It's no different than specifying any callback anywhere in Javascript. If you have a function that already meets the specs of the callback exactly, then you can specify that function name as a direct reference with no wrapper. But, if the function you have doesn't quite work the way the callback is designed to work, then you use a wrapper function to smooth over the mismatch.
All callback functions have the same issue with passing obj.method as the callback. If your .method expects the this value to be obj, then you will probably have to do something to make sure that the this value is set accordingly before your function executes. The callbacks in .then() handlers are no different than callbacks for any other Javascript/node.js function such as setTimeout() or fs.readFile() or any other function that takes a callback as an argument. So, neither of the issues you mention is unique to promises at all. It just so happens that promises live by callbacks, so if you're trying to make method calls via a callback, you will run into the issue of getting the object value passed appropriately to the method.
FYI, it is possible to code methods so that they are permanently bound to their own object and can be passed as obj.method, but that can only be used in your method implementation and has some other tradeoffs. In general, experienced Javascript developers are perfectly fine using obj.method.bind(obj) as the reference to pass. Seeing the .bind() in the code also indicates that you're aware that you need the proper obj value inside the method and that you have made a provision for that.
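To make the options concrete, here is a small sketch. The Greeter class and the callLater function are invented for the example; callLater just stands in for any API that takes a callback, such as a .then() handler or setTimeout().

```javascript
class Greeter {
  constructor(name) {
    this.name = name;
    // Permanently bound copy, so greeter.greetBound can be passed around safely.
    this.greetBound = this.greet.bind(this);
  }

  greet(result) {
    return `${this.name} got ${result}`;
  }
}

// Stand-in for any callback-taking API: it calls the handler with no `this`.
function callLater(handler) {
  return handler("done");
}

const greeter = new Greeter("worker");

// Passing the bare method reference loses `this`, so greet() fails.
let lostThisMessage = null;
try {
  callLater(greeter.greet);
} catch (err) {
  lostThisMessage = err.message;
}
console.log("unbound reference failed:", lostThisMessage !== null);

// Three ways to fix the mismatch; each returns "worker got done".
console.log(callLater(greeter.greet.bind(greeter)));       // bind at the call site
console.log(callLater(greeter.greetBound));                // method bound in the constructor
console.log(callLater((result) => greeter.greet(result))); // wrapper function

// The same bound reference works as a promise handler.
Promise.resolve("done").then(greeter.greetBound).then((line) => console.log(line));
```

The constructor-bound variant is the tradeoff mentioned above: it has to be set up inside the class itself, and every instance carries its own bound copy of the method.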
As for some of your bolded questions or comments:
Is there a name for this?
Not that I'm aware of. Technically it's "passing a named reference to a previously defined function as a callback", but I doubt that's something you can search for and find useful discussion of.
Any other thoughts or wisdoms?
For reasons I'm not entirely sure of (though it has been a topic of discussion elsewhere), Javascript programming style conventions seem to encourage the use of anonymous inline callbacks rather than defining a method or function elsewhere and then passing that named reference (like you would be more likely to do in many other languages). Obviously, if you put the actual code to process the callback in an inline anonymous function, then neither of the issues you mention comes up. Using arrow functions in ES6 now even allows you to preserve the current value of this in the inline callback. I'm not saying that this is an answer to your question, just an observation about common Javascript coding conventions.
You can't go wrong with the let's-always-do-the-closure approach.
As you seem to already know, it's a waste to wrap something if it doesn't need wrapping. I would vote for wrapping only when there's a mismatch between the specification for the callback and the already existing named function and there's a reason not to just fix the named function to match the specification of the callback.
As per THIS post, there are two ways to mock the method doSomeStuff() to return 1:
when(bloMock.doSomeStuff()).thenReturn(1);
and
doReturn(1).when(bloMock).doSomeStuff();
The very important difference is that the first option will actually
call the doSomeStuff()- method while the second will not
So, my question is: what is the point of having the first option, which calls the actual method but only returns 1? In which use case might we want to do something like that?
I dug a bit more than this, and the answer to why both syntaxes exist can be found in the old release notes, and a referenced mailing list discussion.
To start with, doReturn() was added in version 1.5 (26/07/2008), while when() was added in version 1.6 (21/10/2008). when() was implemented to replace the old stub() method and doReturn() to replace stubVoid(). Basically this is a design decision by the creator of Mockito (cited from the mailing list 29/06/2008):
I never liked stubVoid() syntax but that was the best I could think
of. The stubbing syntax I'd implement now if I did mockito again:
//regular stubbing:
when(mock.getStuff()).thenReturn("foo");
when(mock.getStuff()).thenThrow(new RuntimeException());
//for void methods and some corner cases:
doReturn("foo").when(mock).getStuff();
doThrow(new RuntimeException()).when(mock).someMethod();
//for stubbing consecutively:
when(mock.getStuff()).thenReturn("foo").thenThrow(new RuntimeException());
doThrow(new RuntimeException()).thenReturn("foo").when(mock).someMethod();
I proposed this syntax couple of weeks ago but received only single
feedback saying that it's rather cosmetic (which is true...). Hence I
decided not to change the API.
And as already pointed out by Bewusstsein in the comments, when() provides type safety. If we have a method String doSomething(), both lines below will compile. The latter, however, will throw an exception at runtime.
Mockito.doReturn("String").when(mock).doSomething();
Mockito.doReturn(1).when(mock).doSomething();
So, to conclude, it was a design decision to introduce both ways of mocking. when() was implemented as the preferred way of mocking, due to its type safety and its fluent reading. doReturn() was implemented to allow for mocking of void methods and other corner cases.
So, I was just coding a bit today, and I realized that I don't have much consistency when it comes to a coding style when programming functions. One of my main concerns is whether or not it's proper to code it so that you check that the input of the user is valid OUTSIDE of the function, or just throw the values passed by the user into the function and check if the values are valid in there. Let me sketch an example:
I have a function that lists hosts based on an environment, and I want to be able to split the environment into chunks of hosts. So an example of the usage is this:
listhosts -e testenv -s 2 1
This will get all the hosts from the "testenv", split it up into two parts, and it is displaying part one.
In my code, I have a function that you pass a list, and it returns a list of lists based on your parameters for splitting. BUT, before I pass it a list, I first verify the parameters in my MAIN during the getops process, so in the main I check to make sure there are no negatives passed by the user, I make sure the user didn't request to split into, say, 4 parts but ask to display part 5 (which would not be valid), etc.
tl;dr: Would you check the validity of a user's input in the flow of your MAIN class, or would you do a check in your function itself, and either return a valid response in the case of valid input, or return NULL in the case of invalid input?
Obviously both methods work, I'm just interested to hear from experts as to which approach is better :) Thanks for any comments and suggestions you guys have! FYI, my example is coded in Python, but I'm still more interested in a general programming answer as opposed to a language-specific one!
Good question! My main advice is that you approach the problem systematically. If you are designing a function f, here is how I think about its specification:
What are the absolute requirements that a caller of f must meet? Those requirements are f's precondition.
What does f do for its caller? When f returns, what is the return value and what is the state of the machine? Under what circumstances does f throw an exception, and what exception is thrown? The answers to all these questions constitute f's postcondition.
The precondition and postcondition together constitute f's contract with callers.
Only a caller meeting the precondition gets to rely on the postcondition.
Finally, bearing directly on your question, what happens if f's caller doesn't meet the precondition? You have two choices:
You guarantee to halt the program, one hopes with an informative message. This is a checked run-time error.
Anything goes. Maybe there's a segfault, maybe memory is corrupted, maybe f silently returns a wrong answer. This is an unchecked run-time error.
Notice some items not on this list: raising an exception or returning an error code. If these behaviors are to be relied upon, they become part of f's contract.
Now I can rephrase your question:
What should a function do when its caller violates its contract?
In most kinds of applications, the function should halt the program with a checked run-time error. If the program is part of an application that needs to be reliable, either the application should provide an external mechanism for restarting an application that halts with a checked run-time error (common in Erlang code), or if restarting is difficult, all functions' contracts should be made very permissive so that "bad input" still meets the contract but promises always to raise an exception.
In every program, unchecked run-time errors should be rare. An unchecked run-time error is typically justified only on performance grounds, and even then only when code is performance-critical. Another source of unchecked run-time errors is programming in unsafe languages; for example, in C, there's no way to check whether memory pointed to has actually been initialized.
Another aspect of your question is
What kinds of contracts make the best designs?
The answer to this question varies more depending on the problem domain.
Because none of the work I do has to be high-availability or safety-critical, I use restrictive contracts and lots of checked run-time errors (typically assertion failures). When you are designing the interfaces and contracts of a big system, it is much easier if you keep the contracts simple, you keep the preconditions restrictive (tight), and you rely on checked run-time errors when arguments are "bad".
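For readers who want to see what a restrictive contract with checked run-time errors might look like in code, here is a minimal sketch. I've written it in JavaScript rather than the asker's Python, and the names precondition and splitIntoParts are mine, chosen only for illustration.

```javascript
// Hypothetical helper: turns a violated precondition into a checked run-time error.
function precondition(condition, message) {
  if (!condition) {
    throw new Error(`Precondition violated: ${message}`);
  }
}

// Splits `items` into `parts` contiguous chunks and returns a list of lists.
// Contract: `items` is an array and `parts` is a positive integer.
function splitIntoParts(items, parts) {
  precondition(Array.isArray(items), "items must be an array");
  precondition(Number.isInteger(parts) && parts > 0, "parts must be a positive integer");

  const size = Math.ceil(items.length / parts);
  const chunks = [];
  for (let i = 0; i < parts; i += 1) {
    chunks.push(items.slice(i * size, (i + 1) * size));
  }
  return chunks;
}

const hosts = ["h1", "h2", "h3", "h4", "h5"];
console.log(splitIntoParts(hosts, 2)); // [["h1", "h2", "h3"], ["h4", "h5"]]
```

If nothing catches the error, the program halts with the message, which is the behavior a checked run-time error is supposed to have. Note that the function makes no promise about what it throws, so callers are not meant to catch it and carry on.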
I have a function that you pass a list, and it returns a list of lists based on your parameters for splitting. BUT, before I pass it a list, I first verify the parameters in my MAIN during the getops process, so in the main I check to make sure there are no negatives passed by the user, I make sure the user didn't request to split into, say, 4 parts but ask to display part 5.
I think this is exactly the right way to solve this particular problem:
Your contract with the user is that the user can say anything, and if the user utters a nonsensical request, your program won't fall over— it will issue a sensible error message and then continue.
Your internal contract with your request-processing function is that you will pass it only sensible requests.
You therefore have a third function, outside the second, whose job it is to distinguish sense from nonsense and act accordingly—your request-processing function gets "sense", the user is told about "nonsense", and all contracts are met.
One of my main concerns is whether or not it's proper to code it so that you check that the input of the user is valid OUTSIDE of the function.
Yes. Almost always this is the best design. In fact, there's probably a design pattern somewhere with a fancy name. But if not, experienced programmers have seen this over and over again. One of two things happens:
parse / validate / reject with error message
parse / validate / process
This kind of design has one data type (request) and four functions. Since I'm writing tons of Haskell code this week, I'll give an example in Haskell:
data Request -- type of a request
parse :: UserInput -> Request -- has a somewhat permissive precondition
validate :: Request -> Maybe ErrorMessage -- has a very permissive precondition
process :: Request -> Result -- has a very restrictive precondition
Of course there are many other ways to do it. Failures could be detected at the parsing stage as well as the validation stage. "Valid request" could actually be represented by a different type than "unvalidated request". And so on.
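The same shape can be sketched in a dynamically typed language too. The JavaScript below is only an illustration: parseRequest, validateRequest, processRequest and handle are hypothetical names mirroring parse, validate and process above, and the request fields are assumptions based on the listhosts example.

```javascript
// parse: somewhat permissive, turns raw argument strings into a request object.
function parseRequest(args) {
  const [env, parts, show] = args;
  return { env: String(env), parts: Number(parts), show: Number(show) };
}

// validate: very permissive, accepts any request and returns an error message or null.
function validateRequest(request) {
  if (!Number.isInteger(request.parts) || request.parts < 1) {
    return "number of parts must be a positive integer";
  }
  if (!Number.isInteger(request.show) || request.show < 1) {
    return "part to display must be a positive integer";
  }
  if (request.show > request.parts) {
    return `cannot display part ${request.show} of ${request.parts}`;
  }
  return null;
}

// process: very restrictive, assumes the request has already been validated.
function processRequest(request, hosts) {
  const size = Math.ceil(hosts.length / request.parts);
  return hosts.slice((request.show - 1) * size, request.show * size);
}

// The outer function separates sense from nonsense so that all contracts are met.
function handle(args, hosts) {
  const request = parseRequest(args);
  const error = validateRequest(request);
  if (error !== null) {
    return { ok: false, message: `error: ${error}` };
  }
  return { ok: true, hosts: processRequest(request, hosts) };
}

const testenvHosts = ["h1", "h2", "h3", "h4"];
console.log(handle(["testenv", "2", "1"], testenvHosts)); // ok, first half of the hosts
console.log(handle(["testenv", "4", "5"], testenvHosts)); // rejected with a message
```

Only handle ever talks to the user, so processRequest can stay simple and never has to decide what an error message should look like.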
I'd do the check inside the function itself to make sure that the parameters I was expecting were indeed what I got.
Call it "defensive programming" or "programming by contract" or "assert checking parameters" or "encapsulation", but the idea is that the function should be responsible for checking its own pre- and post-conditions and making sure that no invariants are violated.
If you do it outside the function, you leave yourself open to the possibility that a client won't perform the checks. A method should not rely on others knowing how to use it properly.
If the contract fails you either throw an exception, if your language supports them, or return an error code of some kind.
Checking within the function adds complexity, so my personal policy is to do sanity checking as far up the stack as possible, and catch exceptions as they arise. I also make sure that my functions are documented so that other programmers know what the function expects of them. They may not always follow such expectations, but to be blunt, it is not my job to make their programs work.
It often makes sense to check the input in both places.
In the function you should validate the inputs and throw an exception if they are incorrect. This prevents invalid inputs causing the function to get halfway through and then throw an unexpected exception like "array index out of bounds" or similar. This will make debugging errors much simpler.
However throwing exceptions shouldn't be used as flow control and you wouldn't want to throw the raw exception straight to the user, so I would also add logic in the user interface to make sure I never call the function with invalid inputs. In your case this would be displaying a message on the console, but in other cases it might be showing a validation error in a GUI, possibly as you are typing.
"Code Complete" suggests an isolation strategy where one could draw a line between classes that validate all input and classes that treat their input as already validated. Anything allowed to pass the validation line is considered safe and can be passed to functions that don't do validation (they use asserts instead, so that errors in the external validation code can manifest themselves).
How to handle errors depends on the programming language; however, when writing a commandline application, the commandline really should validate that the input is reasonable. If the input is not reasonable, the appropriate behavior is to print a "Usage" message with an explanation of the requirements as well as to exit with a non-zero status code so that other programs know it failed (by testing the exit code).
Silent failure is the worst kind of failure, and that is what happens if you simply return incorrect results when given invalid arguments. If the failure is ever caught, then it will most likely be discovered very far away from the true point of failure (passing the invalid argument). Therefore, it is best, IMHO to throw an exception (or, where not possible, to return an error status code) when an argument is invalid, since it flags the error as soon as it occurs, making it much easier to identify and correct the true cause of failure.
I should also add that it is very important to be consistent in how you handle invalid inputs; you should either check and throw an exception on invalid input for all functions or do that for none of them, since if users of your interface discover that some functions throw on invalid input, they will begin to rely on this behavior and will be incredibly surprised when other functions simply return invalid results rather than complaining.
I recently took over a small MFC C++ application, which is obviously in a working state. To get started I'm running PC-Lint over the code, and lint is complaining that CStringT's are being passed to Format. Opinion on the internet seems to be divided. Some say that CString is designed to handle this use case without error, but others (and an MSDN article) say that it should always be cast when passed to a variable argument function. Can Stack Overflow come to any consensus on the issue?
CString has been carefully designed to be passed as part of a variable argument list, so it is safe to use it that way. And you can be fairly sure that Microsoft will take care not to break this particular behavior. So I'd say you are safe to continue using it that way, if you want to.
That said, personally I'd prefer the cast. It is not common behavior that string classes behave that way (e.g. std::string does not) and for mental consistency it may be better to just do it the "safe" way.
P.S.: See this thread for implementation details and further notes on how to cast.