Example test cases for hypothesis based strategies? - python-hypothesis

What is considered current best practice for testing your own strategies that are built on Hypothesis? There are, for example, tests of how well examples shrink in HypothesisWorks/hypothesis-python/tests/quality/test_shrink_quality.py. However, so far I could not find examples that test the data-generation side of strategies (correctness in general, performance, etc.).

Hypothesis runs a series of health checks on each strategy you use, including for the time taken to generate data and the proportion of generation attempts that succeed - try e.g. none().map(lambda x: time.sleep(2)).example() or integers().filter(lambda x: x % 17 == 0).example() to see them in action!
In most cases you do not need to test your own strategies beyond relying on these health checks. Instead, I would check that your tests are sufficient by using a code coverage library.
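If you do want an explicit test of a strategy's output, one possible approach is an ordinary property test that asserts the invariants the strategy is meant to guarantee. The sketch below is only an illustration: even_integers is a hypothetical custom strategy invented for this example, not something from the question or from Hypothesis itself.

```python
from hypothesis import given, settings, strategies as st


def even_integers(max_value=100):
    """Hypothetical custom strategy: even integers in [0, max_value]."""
    return st.integers(min_value=0, max_value=max_value // 2).map(lambda n: 2 * n)


@settings(max_examples=50)
@given(even_integers())
def test_even_integers_respect_their_contract(value):
    # Every drawn value must satisfy the invariants the strategy promises.
    assert value % 2 == 0
    assert 0 <= value <= 100
```

Because the test uses the strategy through @given, the usual health checks still apply to it while the test runs.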

Related

Reusing cucumber steps in a large codebase/team

We're using cucumberJS on a fairly large codebase with hundreds of cucumber scenarios and we've been running into issues with step reuse.
Since all the steps in Cucumber are global, it's quite difficult to write steps like "and I select the first item in the list" or similar that would be similarly high-level. We end up having to append "on homepage" (so: "I select the first item in the list of folders on homepage") which just feels wrong and reads wrong.
Also, I find it very hard to figure out what the dependencies between steps are. For example we use a "and I see " pattern for storing a page object reference on the world cucumber instance to be used in some later steps. I find that very awkward since those dependencies are all but invisible when reading the .feature files.
What are your tips on how to use cucumber within a large team? (Including "ditch cucumber and use something else instead" :) )
Write scenarios/steps that are about what you are doing and why you are doing it rather than about how you do things. Cucumber is a tool for doing BDD. The key word here is Behaviour, and its interpretation. The fundamental idea behind Cucumber and steps is that each piece of behaviour (the what) has a unique name and place in the application, and in the application context you can talk about that behaviour using that name without ambiguity.
So your examples should never be in steps, because they are about HOW you do something. Good steps never talk about clicking or selecting. Instead they talk about the reason WHY you are clicking or selecting.
When you follow this pattern you end up with fewer steps at a higher level of abstraction that are each focused on a particular topic.
This pattern is easy to implement, and moderately easy to maintain. The difficulty is that to write the scenarios you have to have a profound understanding of what you are doing and why it's important, so you can discover/uncover the language you need to express yourself distinctly, clearly and simply.
I'll give my standard example about login. I use this because we share an understanding of What login is and Why it's important. Realise that before you can log in you have to be registered, and that is complex.
Scenario: Login
Given I am registered
When I login
Then I should be logged in
The implementation of this is interesting in that I delegate all work to helper methods:
Given 'I am registered' do
  # remember the registered user for later steps
  @i = create_registered_user
end
When 'I login' do
  login_as(user: @i)
end
Then 'I should be logged in' do
  should_be_logged_in
end
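The answer does not show the helper methods themselves. As a rough illustration of the delegation idea only, here is a sketch in plain Python rather than Cucumber's Ruby DSL. The World class, the in-memory sets and the bodies of the helpers are all assumptions made up for this example; only the helper names come from the steps above.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Hypothetical stand-in for Cucumber's per-scenario world object."""
    users: set = field(default_factory=set)
    logged_in: set = field(default_factory=set)


def create_registered_user(world, name="alice"):
    # Registration is reduced to remembering the name (an assumption).
    world.users.add(name)
    return name


def login_as(world, user):
    if user not in world.users:
        raise ValueError(f"{user!r} is not registered")
    world.logged_in.add(user)


def should_be_logged_in(world, user):
    assert user in world.logged_in, f"{user!r} is not logged in"


# The three steps, each delegating to exactly one helper.
world = World()
user = create_registered_user(world)   # Given I am registered
login_as(world, user)                  # When I login
should_be_logged_in(world, user)       # Then I should be logged in
```

Each step stays a one-liner, so all of the HOW lives in ordinary functions that can be named, refactored and tested like any other code.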
Now your problem becomes one of managing helper methods. What you have is a global namespace with a large number of helper methods. This is now a code and naming problem, and all you have to do is
- keep the number of helper methods as small as possible
- keep each helper method simple
- ensure there is no ambiguity between method names
- ensure there is no duplication
This is still a hard problem, but
- it's not as hard as what you are dealing with
- getting to this point has a large number of additional benefits
- it's now a code problem, and lots of people have experience of managing code.
You can do all these things with
- naming discipline (all my methods above have login in their name)
- clever but controlled use of arguments
- frequent refactoring and code cleaning
The code of your helper methods will have
- the highest churn of all your application code
- the greatest need to be simple and clear
So currently your problem is not about Cucumber; it's about the debt you have with your existing scenarios and their implementation. You have to pay off your debt if you want things to improve. Good luck!

Testing for heteroskedasticity and autocorrelation in large unbalanced panel data

I want to test for heteroskedasticity and autocorrelation in a large unbalanced panel dataset.
I do so using the following code:
* Heteroskedasticity test
// iterated GLS with only heteroskedasticity produces
// maximum-likelihood parameter estimates
xtgls adjusted_volume ibn.rounded_time i.id i.TRD_EVENT_DT, igls panels(heteroskedastic)
estimates store hetero
* Autocorrelation
findit xtserial
net sj 3-2 st0039
net install st0039
xtserial adjusted_volume ibn.rounded_time i.id i.TRD_EVENT_DT
Although I run this on a high-performance computing center, the iterative method means the procedure takes more than 15 hours.
What is the most efficient way to perform these tests in Stata?
This question is borderline off-topic and quite broad, but I suspect it is still of considerable interest to new users. As such, I will try to consolidate our conversation in the comments into an answer here.
In the future, I strongly advise you to refrain from using highly subjective words such as 'best', which can mean different things to different people, or terms like 'efficient', which can have a different meaning in a different context. It is also difficult to provide specific advice regarding the use of commands when we know nothing about what you are trying to do.
In my view, the 'best' choice is the choice that gets the job done as accurately as possible given the available data. Speed is an important consideration nowadays, but accuracy is still the most fundamental one. As you continue to use Stata, you will see that it has a considerable number of commands, often with overlapping functionality. Depending on the use case, sometimes opting for one implementation over another can be 'better', in the sense that it may be more practical or faster in achieving the desired end result.
Case in point, your comment in your previous post where the noconstant option is unavailable in rreg. In that particular context you can get a reasonably good alternative using regress with the vce(robust) option. In fact, this alternative may often be adequate for several use cases.
In this particular example, xtgls will be considerably faster if the igls option is not used. This will be especially true with larger and more 'difficult' datasets. In cases where MLE is necessary, the iterate option will allow you to specify a fixed number of iterations, which could speed things up but can be a recipe for disaster if you don't know what you are doing and is thus not recommended. This option is usually used for other purposes. However, is xtgls the only command you could use? Read here why this may in fact not necessarily be the case.
Regarding speed, Stata in general is slow, at least when the ado language is used. This is because it is an interpreted language. The only realistic option for speed gains here is through parallelisation if you have Stata MP. Even in this case, whether any gains are achieved will depend on a number of factors, including which command you use.
Finally, xtserial is a community-contributed command, something which you fail to make clear in your question. It is customary and useful to provide this information right from the start, so others know that you do not refer to an official, built-in command.

What is the difference between Test Strategy, Test Scenario, Test Case and Test Script (if we are not taking automation into consideration)?

If automation is excluded, and from a manual testing point of view, what is the difference between Test Strategy, Test Scenario, Test Case and Test Script?
Test Strategy
A Test Strategy document is a high-level document, normally developed by the project manager. This document defines the “Software Testing Approach” used to achieve the testing objectives. The Test Strategy is normally derived from the Business Requirement Specification document.
Some companies include the “Test Approach” or “Strategy” inside the Test Plan, which is fine and is usually the case for small projects. However, for larger projects, there is one Test Strategy document and a number of Test Plans, one for each phase or level of testing.
Components of the Test Strategy document
1) Scope and objectives
2) Business issues
3) Roles and responsibilities
4) Communication and status reporting
5) Test deliverables
6) Industry standards to follow
7) Test automation and tools
8) Testing measurements and metrics
9) Risks and mitigation
10) Defect reporting and tracking
11) Change and configuration management
12) Training plan
Test Scenario
A scenario is a story that describes a hypothetical situation. In testing, you check how the program copes with this hypothetical situation.
The ideal scenario test is credible, motivating, easy to evaluate, and complex.
Scenarios are usually different from test cases in that test cases are single steps and scenarios cover a number of steps. Test suites and scenarios can be used in concert for complete system tests.
A Scenario is any functionality that can be tested. It is also called a Test Condition or Test Possibility.
Test Cases
In software engineering, a test case is a set of conditions or variables under which a tester will determine whether a requirement upon an application is partially or fully satisfied. It may take many test cases to determine that a requirement is fully satisfied. In order to fully test that all the requirements of an application are met, there must be at least one test case for each requirement, unless a requirement has sub-requirements. In that situation, each sub-requirement must have at least one test case.
A test case is also defined as a sequence of steps to test the correct behavior of a functionality/feature of an application.
That is, it is a sequence of steps consisting of actions to be performed on the system under test. (These steps are sometimes called the test procedure or test script.) These actions are often associated with some set of data (preloaded or input during the test). The combination of actions taken and data provided to the system under test leads to the test condition. This condition tends to produce results that the test can compare with the expected results, i.e. assess quality under the given test condition. The actions can be performed serially, in parallel, or in some other combination.
Test Script
Test Script is a set of instructions (written using a scripting/programming language) that is performed on a system under test to verify that the system performs as expected. Test scripts are used in automated testing.
Sometimes, a set of instructions (written in a human language), used in manual testing, is also called a Test Script but a better term for that would be a Test Case.
Test Scenario means "what is to be tested" and Test Case means "how it is to be tested".
Test case: It consists of a test case name, preconditions, steps / input conditions, and the expected result.
Test Scenario: A test scenario consists of a detailed test procedure. We can also say that a test scenario has many test cases associated with it. Before executing the test scenario we need to think of test cases for each scenario.
Test Script: A Test Script is a set of instructions (written using a programming language) that is performed on a system under test to verify that the system performs as expected.
Test script is the term used when referring to automated testing. When you're creating a test script, you are using an automation tool to create your script.
Test strategy
A test strategy outlines the testing approach and everything else that surrounds it. It is different from the test plan, in the sense that a test strategy is only a subset of the test plan. It is a hard-core test document that is to an extent generic and static. There is also an argument about the levels at which a test strategy or plan is used, but I really do not see any discernible difference.
Example: Test plan gives the information of who is going to test at what time. For example: Module 1 is going to be tested by “X tester”. If tester Y replaces X for some reason, the test plan has to be updated.
By contrast, a test strategy is going to have details like “Individual modules are to be tested by test team members.” In this case, it does not matter who is testing it, so it is generic, and a change of team member does not require an update, which keeps it static.
Test scenario
This is a one line pointer that testers create as an initial, transitional step into the test design phase. This is mostly a one line definition of “What” we are going to test with respect to a certain feature. Usually, test scenarios are an input for the creation of test cases. In agile projects, Test scenarios are the only test design outputs and no test cases are written following these. A test scenario might result in multiple tests.
Example test scenarios:
- Validate if a new country can be added by the admin
- Validate if an existing country can be deleted by the admin
- Validate if an existing country can be updated
Test Case:
Test Case is a commonly used term for a specific test. This is usually the smallest unit of testing. A Test Case will consist of information such as requirements testing, test steps, verification steps, prerequisites, outputs, test environment, etc.
A set of inputs, execution preconditions, and expected outcomes developed for a particular objective, such as to exercise a particular program path or to verify compliance with a specific requirement.
Test Script:
Commonly used to refer to the instructions for a particular test that will be carried out by an automated test tool
Test Scenarios: A high-level/simple/individual test panorama of actual system capability. We do not need to define a clear step-by-step way of validation at this stage, as we define test scenarios at very early stages of the software life cycle. This will not be considered for the test plan, as it is an undefined item in terms of resource allocation.
Test Case: A document which consists of system-specific prerequisites, but no step-by-step validation. In test case traceability we use a test case document against requirements. This is how we define the test coverage matrix against requirements. In most cases, a test case will cover multiple test scenarios. A test case will carry complexity. Test cases are used for calculating the testing effort for a particular release with respect to the code version.
Test Script (without automation/programming language context): Everyone is aware of the fact that a test script is an automation program which is uniquely mapped to a test case. But the term can be used without automation as well, especially when you are using Rational Quality Manager (RQM) as your test repository.
1. When a test case has multiple versions and the testing team needs to maintain all test case versions against multiple system code versions. In this case, one test case will have multiple test scripts (one for each version).
2. When a test case produces different results in different environments (operating system, technology, etc.), a test case will be mapped to multiple test scripts in which the expected results change but the rest of the test case remains the same.
In either of the above cases, while creating the test plan we first need to decide which version of the test case (in other words, which test script) to execute, based on the code version or the environment.
Hope this helps to answer your question.

Testing Statistical Methods

I write a lot of statistical methods for an application.
The problem is that I don't know how to test them appropriately.
For example, in a unit test I check whether the sum of all probabilities of a distribution converges to 1; however, it is never exactly 1.
For example, the sum of all probabilities might be 0.9999999 or even 1.0000000005; the actual value strongly depends on how many different outcomes the distribution has.
Maybe I can test like so:
value should be less than 1.1
value should be more than 0.9
but I am not sure that this test is consistent; maybe there is a distribution that, due to numerical error, will output 1.1.
How do I test this appropriately?
There is a related discussion here that you might find interesting.
The short version is that you want to break up your statistical methods into pieces that can be tested deterministically.
Where that's not the case, you probably want to use some epsilon value to compare your expected and actual outputs. You could also run several iterations of the test and perform a simpler statistical test (a t-test perhaps?) to see if the distribution looks like what you think it should be.
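As a minimal sketch of the epsilon comparison, assuming the distribution is available as a plain list of probabilities: the helper name is_normalised and the tolerance of 1e-9 are illustrative choices, not values from the answer, and a suitable tolerance depends on your computation.

```python
import math


def is_normalised(probabilities, abs_tol=1e-9):
    """Return True if the probabilities sum to 1 within abs_tol."""
    # math.fsum tracks partial sums exactly, so the total does not drift
    # with the number of outcomes the way a naive running sum can.
    total = math.fsum(probabilities)
    return math.isclose(total, 1.0, rel_tol=0.0, abs_tol=abs_tol)


uniform = [1 / 10] * 10
assert is_normalised(uniform)
assert not is_normalised([0.5, 0.4])
```

A tolerance this tight is far stricter than the 0.9 to 1.1 window from the question, so it still catches real normalisation bugs while tolerating floating-point rounding.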

Can TDD be a valid alternative to overkill data validation?

Consider these two data validation scenarios:
Check everything everywhere
Make sure that every method that takes one or more arguments actually checks them to ensure that they're syntactically valid.
Pros
- Very fine check granularity.
- If the code that is being written is for some kind of library, we make sure to limit the damage that can be done if the developers who will be using it fail to provide valid data.
Cons
- It's costly to always perform checks that most of the time shouldn't be needed.
- It's still possible to forget to add a check every now and then.
- More code is being written and hence in need of maintenance.
Make use of TDD goodness
Validate data only when it enters your code from the external world.
To make sure that internally data will be always syntactically correct, create tests that check every method that returns a value. To make sure that if valid data enters, valid data exits.
The pros and the cons are practically switched with the ones from the former approach.
As of now I'm using the first approach, but since I'm employing test driven development I thought that maybe I could go with the second one.
The advantages are clear, still, I wonder if it's as secure as the first method.
It sounds like the first method is contract driven, and one aspect of that is that you also need to verify that what you return from any public interface meets the contract.
I think that both approaches are valid, but very different.
TDD only partially deals with the public interface, in that it should check that every input is properly validated. Unfortunately, unless you have all your validation in separate functions, it becomes very difficult to ensure that a function of 3 or 4 parameters is being adequately tested for validity. The number of tests you have to write is quite high in either approach.
If your code is a library, then in every function that can be called directly from the outside (outside being outside the library) you will need to check that every input is valid, and that invalid input is handled as per the contract, either by returning a null or by throwing an exception. Either way, it must be in agreement with the documentation.
Once you have verified it, then there is no reason to force the verification on private functions as those can only be called from within the library, and you should be verifying that you are only dealing with valid data.
Lots of tests will be needed, regardless, unfortunately. All these tests do is to ensure that you don't have any surprise problems, but that should generally help justify the cost of writing and maintaining them.
As to your question: if your tests are really well written, and you ensure that all validity checks are done completely, then it should be as secure. The risk is that if you believe it is secure and your tests are poorly written, it will actually be worse than having no tests, because everything rests on the assumption that your tests are well written.
I would use both methods until you know your tests are well written, and then just go with TDD.
My opinion is that in the first scenario, two of your Cons outweigh everything else:
- It's costly to always perform checks that most of the time shouldn't be needed.
- More code is being written and hence in need of maintenance.
Also, technically TDD has no bearing on this question, because it is not a testing technique. More later...
To mitigate the Cons I would strongly advocate (as I think you say) splitting the code into an outside and an inside: The outside is where all the validation occurs. Hopefully this is but a thin wrapper around the inside, to prevent GIGO. Once inside, data never needs to be validated again.
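A minimal sketch of that outside/inside split, assuming a toy task (averaging numbers supplied as comma-separated text). The function names mean_from_text and _mean are hypothetical and exist only for this illustration.

```python
def _mean(values):
    """Inside: assumes a non-empty list of floats and performs no checks."""
    return sum(values) / len(values)


def mean_from_text(raw):
    """Outside: validates untrusted input once, then delegates inward."""
    if not isinstance(raw, str):
        raise TypeError("expected a comma-separated string")
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    if not parts:
        raise ValueError("no numbers supplied")
    try:
        values = [float(p) for p in parts]
    except ValueError:
        raise ValueError(f"not a list of numbers: {raw!r}") from None
    return _mean(values)
```

All checking happens in the thin wrapper; the inner function stays short because it may rely on the wrapper's guarantees.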
As for TDD, I would strongly advocate (as you are now doing) employing it to develop your code, with the added benefit of leaving a trail of tests that become a regression test suite. Now you will naturally develop your outside code to perform robust validation, with the promise of easily adding any checks that you might initially forget. Your inside code can be developed assuming it will only handle valid data, but TDD will still give you the confidence that it will function to spec.
I'm saying that I would go with the second approach, as I've described, independently of whether I'm developing with TDD, or not (but TDD is always my first choice).
"The advantages are clear, still, I wonder if it's as secure as the first method."
This completely depends on how well you test it.
This could be just as secure, if the following two criteria are met:
- Every publicly exposed means of adding data to the system is validated completely
- Every internal method that translates data is completely and adequately tested
However, I question that this would be easier or that it would require less code. The amount of code required to check every public entry point is going to be very similar to the amount of code required to validate each method. You're going to need more checks in the entry points, since they'll have to check things that might otherwise be checked internally.
For the second method, you need two good sets of tests. You must not only check that
To make sure that if valid data
enters, valid data exits.
You must also check that if Invalid data enters, an exception is thrown. I suppose you still have to validate data and kick out if you have invalid data. This is really the only way if you don't want pesky ArgumentNullException s or other cryptic errors in your production application. However TDD can really toughen up the quality of all that checking (especially with Fuzz Testing).
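The two sets of tests could look roughly like the following sketch. It uses Python's unittest rather than the .NET stack the answer alludes to, and parse_age is a hypothetical entry point invented for the example.

```python
import unittest


def parse_age(raw):
    """Hypothetical entry point: turns external text into a validated age."""
    if raw is None:
        raise ValueError("age is required")
    try:
        age = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"age must be an integer, got {raw!r}") from None
    if not 0 <= age <= 150:
        raise ValueError(f"age out of range: {age}")
    return age


class ParseAgeTests(unittest.TestCase):
    def test_valid_data_in_valid_data_out(self):
        for raw, expected in [("0", 0), (" 42 ", 42), ("150", 150)]:
            self.assertEqual(parse_age(raw), expected)

    def test_invalid_data_raises(self):
        # Boundaries and malformed input must be rejected explicitly.
        for raw in [None, "", "abc", "-1", "151", "4.2"]:
            with self.assertRaises(ValueError):
                parse_age(raw)
```

The second test is the one that is easy to forget: it pins down that bad input is rejected with a clear error instead of leaking further into the system.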
One item is missing from your list of Pros and Cons, and it is important enough to make unit testing a much safer method than manic parameter checking.
You just have to consider the When and the Where.
For unit testing, the when and the where are:
- when: at design time
- where: in a dedicated source file outside of the application code
For overkill data checking, they are:
- when: at runtime
- where: entangled in the application source code, typically using asserts.
That is the point: code covered by unit testing detects errors at design time, when you run the tests. If you are the paranoid and schizophrenic kind of tester (the best kind), you write tests designed to break whatever can be broken, checking each data boundary and perverse input. You also use code coverage tools to ensure every branch of every alternative is tested. You have no limit: tests live in their own files and do not clutter the application. It doesn't matter if you end up with ten times as many test lines as actual application code: there is no runtime penalty and no readability penalty.
On the other hand, integrated overkill checking detects errors at runtime. In the worst case it will detect errors on the user's system, where you can do nothing about it (if you ever even hear of the error happening). Also, even if you are the paranoid kind, you will have to limit your checking. Assertions just can't be 90 percent of the application code: that raises readability and maintenance issues, and often a heavy performance penalty. Where will you stop, then: only checking parameters for external input? Checking every possible or impossible input of inner functions? Checking every loop invariant? Also testing behavior when out-of-flow data (globals, system files, etc.) is changed? You must also be aware that assertion code can itself contain bugs. What if the formula in an assertion performs a division? You must ensure it will not lead to a divide-by-zero error or the like.
Another problem is that in many cases you just don't know what can be done when an assertion fails. If you are at a real entry point you can return something understandable to your user or the library user... when you are checking inner functions

Resources