I've been tasked with investigating whether something can be added to our MR approval process to highlight whether proposed merge requests (MRs) in GitLab, for our C++ codebase, contain any new unit tests or modifications to existing tests. The overall aim is to remind developers and approvers that they need to consider unit testing.
Ideally a small script would run and detect the presence of additional tests or changes (this I can easily write, and I accept that there's a limit to how much can be done here) and display a warning on the MR if none were detected.
An additional step, if at all possible, would be to block the MR until either further commits are pushed that meet the criteria, or an (extra/custom) GitLab MR field is completed explaining why unit testing is not appropriate for this change. This field would be kept with the MR for audit purposes. I accept that this is not foolproof, but I'm hoping to pilot it as part of a bigger push for more unit test coverage.
As mentioned, I can easily write a script in, say, Python to check for unit tests in the commit(s), but what I don't know is whether/how I can hook this into the GitLab MR process (I looked at webhooks, but they seem to focus on notifying other systems rather than being transactional) and whether GitLab is extensible enough for us to achieve the additional step above. Any thoughts? Can this be done, and if so, how would I go about it?
Measuring the lack of unit tests
"detect the presence of additional tests or changes"
I think you are looking for the wrong thing here.
The fact that tests have changed, or that there are additional tests, does not mean that the MR contains any unit tests for the submitted code.
The underlying problem is of course a hard one.
A good approximation of what you want is typically to check how many lines of code are covered by the test suite.
If the test suite covers more lines of code after the MR than before, then the developer has done their homework and the test suite has improved. If the coverage has grown smaller, then there is a problem.
Of course it's still possible for a user to submit unit tests that are totally unrelated to their code changes, but at least the overall coverage has improved (or: if you already have 100% coverage before the MR, then any MR that keeps the coverage at 100% and adds new code has obviously added unit tests for the new code).
Finally, to come to your question:
Yes, it's possible to configure a GitLab project to report the test coverage change introduced by an MR.
https://docs.gitlab.com/ee/ci/pipelines/settings.html#test-coverage-parsing
You obviously need to create a coverage report from your unit test run.
How you do this depends on the unit testing framework you are using, but the GitLab documentation gives some hints.
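To make the "coverage must not shrink" idea above concrete, here is a minimal sketch (purely illustrative; it assumes Cobertura-style XML reports, e.g. as produced by gcovr for C++, where the root <coverage> element carries a line-rate attribute, and the file names are made up):

# compare_coverage.py -- sketch: fail if coverage dropped between two reports.
# Assumes Cobertura-style XML (root <coverage> element with a "line-rate"
# attribute holding overall line coverage as a fraction between 0 and 1).
import sys
import xml.etree.ElementTree as ET

def line_rate(path):
    root = ET.parse(path).getroot()
    return float(root.attrib["line-rate"])

def main():
    # Usage: python compare_coverage.py coverage_before.xml coverage_after.xml
    before, after = line_rate(sys.argv[1]), line_rate(sys.argv[2])
    print(f"line coverage before MR: {before:.2%}, after MR: {after:.2%}")
    if after < before:
        print("Coverage went down -- please add unit tests for the new code.")
        sys.exit(1)

if __name__ == "__main__":
    main()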
You don't need a webhook or anything like that. This should be something you can more or less trivially solve with just an extra job in your .gitlab-ci.yml. Run your Python script and have it exit nonzero if there are no new tests, ideally with an error message indicating that new tests are required. Now, when MRs are posted, your job will run, and if there are no new tests the pipeline will fail.
If you want the pipeline to fail very fast, you can put this new job at the head of the pipeline so that nothing else runs if this one fails.
You will probably want to make it conditional so that it only runs as part of an MR, otherwise you might get false failures (e.g. if just running the pipeline against some arbitrary commit on a branch).
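As a rough illustration of such a script (the file name, test-file patterns and fallback branch are assumptions you would adapt to your repository; CI_MERGE_REQUEST_TARGET_BRANCH_NAME is only set in merge request pipelines):

# check_new_tests.py -- minimal sketch, not a drop-in solution.
# Fails (exit code 1) when the MR touches no files that look like unit tests.
import os
import re
import subprocess
import sys

# Naming conventions are an assumption -- adjust to your repository layout.
TEST_PATTERNS = [r"(^|/)tests?/", r"_test\.(cpp|cc|hpp|h)$", r"Test\.(cpp|cc)$"]

def changed_files(target_branch):
    # List files touched by the MR relative to the target branch.
    subprocess.run(["git", "fetch", "origin", target_branch], check=True)
    diff = subprocess.run(
        ["git", "diff", "--name-only", f"origin/{target_branch}...HEAD"],
        check=True, capture_output=True, text=True,
    )
    return diff.stdout.splitlines()

def main():
    target = os.environ.get("CI_MERGE_REQUEST_TARGET_BRANCH_NAME", "main")
    files = changed_files(target)
    if any(re.search(p, path) for path in files for p in TEST_PATTERNS):
        print("Unit tests were added or modified in this MR.")
        return 0
    print("WARNING: no new or modified unit tests detected in this MR.")
    return 1  # non-zero exit makes the CI job (and hence the pipeline) fail

if __name__ == "__main__":
    sys.exit(main())

The job that runs it can then be restricted to merge request pipelines with a rules condition, which also addresses the false-failure concern above.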
The first thing I must state is that this is not because certain tests must run before others. We have to run full regressions every week, which generates a single report rather than lots of smaller reports (I asked, and management doesn't want this): 11k scenarios, and we tag each scenario with the release it is associated with. We would love to be able to run the scenarios in a specific order depending on what occurred during the release, so that we aren't wasting time waiting for the 100th-from-last test to fail only to have to start from scratch again, or at the very least run certain files first.
I know there is the solution of just renaming the folders/files each release, which is what I am doing, but it is extremely tedious and I would like to just change something in our Java runner.
There are similar questions from years ago, so I am hoping some hack or feature has been added that I just can't seem to find.
I am using GitLab and its CI for a project.
I used to measure test coverage with some CI jobs until these scripts stopped working ("keyword cobertura not valid").
Around the same time, I found that the CI automatically added some "external" jobs handling coverage (see screenshot).
I don't know why they appeared; maybe it's because I have linked the project with the Codecov external site.
This was a pleasant surprise at the time because I didn't have to maintain a special script for coverage.
However, now these external coverage checks are failing and I can't merge my changes because of them.
The worst part is that these are not normal scripts, so I can't see what is wrong with them. And there isn't even a Retry button (see screenshot, on the right).
I don't want to throw away my otherwise perfectly working merge request.
How can I see what is wrong with this part of the CI?
Clicking on the failed check sends me to the Codecov website, and I don't see anything wrong there.
Here is the link to the pipeline: https://gitlab.com/correaa/boost-multi/-/pipelines/540520025
I think I solved the problem: it could have been that the coverage percentage decreased (by 0.01%!) and that was interpreted by "the system" as a failure.
I added tests to cover some uncovered lines and the problem was solved.
If this is the right interpretation, it is indeed nice, but also scary, because some big changes sometimes require taking a hit in coverage.
In my particular case, what happened is that I simplified code and the total number of lines went down, making the covered fraction lower than before.
I think this error might have something to do with the coverage range you have declared.
Looking at your .codecov.yml file:
coverage:
  precision: 2
  round: down
  range: "99...100"
You're excluding 100% when using three dots in the range, and you have achieved 100% coverage with this branch. I feel like this shouldn't matter, but you could be hitting an edge case with codecov. Maybe file a bug report with them.
Try changing the range to 99..100. Quotes should be unnecessary.
https://docs.codecov.com/docs/coverage-configuration
I would like to determine which schedulers to trigger depending on the branch name, from inside the build factory - if that's possible.
Essentially I have a builder that does all the common build steps to compile, package, etc., and then has a bunch of trigger steps that trigger a bunch of tests (via triggerable schedulers).
However, I would like to configure the type of tests that get started (i.e. which schedulers are triggered) to depend on the branch name. So far I've tried to add the change_filter arg to my Triggerable scheduler, but it seems that it doesn't accept that argument. I guess that makes sense, because it is supposed to be triggered, so maybe it doesn't care about using a change filter. That seems a bit strange, though, because Dependent schedulers do accept this kwarg.
So far the correct way to set this up is not clear to me.
I guess my questions are really:
Is there a way to use renderables / properties to decide which schedulers to trigger (based on the branch name for example)?
Is there a better way to do this? Perhaps create separate schedulers for the build that apply the change filter I need and have a build factory that triggers the correct tests, but that's not very DRY.
I came back to leave this here in case it might help someone with a tricky buildbot setup.
I solved this by making all of the dependent schedulers (for specific types of tests) into triggerable schedulers. Then I created main build schedulers for each subset of tests, each with a change filter and a regex for the branches that should undergo that subset of tests. Finally, I created the build factory for each main scheduler by passing it only the triggerable schedulers for the tests that that specific main scheduler should run.
For my current use case, this works great!
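In config terms, the result looks roughly like the sketch below (a fragment of a master.cfg; all scheduler, builder and branch names here are made up for illustration, not taken from the setup above):

# Sketch of the layout described above (names are illustrative only).
from buildbot.plugins import schedulers, steps, util

# Triggerable schedulers, one per subset of tests; they run only when triggered.
smoke_tests = schedulers.Triggerable(name="smoke-tests",
                                     builderNames=["smoke-tests-builder"])
full_tests = schedulers.Triggerable(name="full-tests",
                                    builderNames=["full-tests-builder"])

# Main build schedulers, each with a change filter selecting the branches
# that should get that subset of tests.
release_build = schedulers.SingleBranchScheduler(
    name="release-build",
    change_filter=util.ChangeFilter(branch_re=r"^release/.*"),
    builderNames=["release-builder"],
)
feature_build = schedulers.SingleBranchScheduler(
    name="feature-build",
    change_filter=util.ChangeFilter(branch_re=r"^feature/.*"),
    builderNames=["feature-builder"],
)

def make_build_factory(test_scheduler_names):
    # Common compile/package steps go here, followed by trigger steps for
    # only the test schedulers this kind of build should run.
    factory = util.BuildFactory()
    # ... common build steps ...
    for name in test_scheduler_names:
        factory.addStep(steps.Trigger(schedulerNames=[name],
                                      waitForFinish=False))
    return factory

release_factory = make_build_factory(["smoke-tests", "full-tests"])
feature_factory = make_build_factory(["smoke-tests"])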
We are using WebdriverIO for our automated tests and we generate HTML reports with Mochawesome in the end based on the result JSON files.
Now we have a lot of implemented tests, and we want to determine the difference between two test runs as fast as possible. Therefore it would be great to have a way to compare two test run results with each other and to generate an HTML report containing only the test result differences.
Maybe there is an existing implementation/package to do that? Yes, of course it is possible to compare the two JSON result files with each other, but I would prefer an existing solution to save effort.
How would you do the comparison in my case?
Thanks,
Martin
You could set up a job in a CI tool like Jenkins.
There it always compares the latest results with the previous build and tells you whether each test is a new failure, a regression, or a fixed script.
Regression indicates that the test passed in the previous build but is failing in the new build.
Failed indicates that it has been failing for the past couple of builds.
Fixed indicates that it was failing in the previous build but is now passing in the latest build.
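If you end up doing the comparison yourself after all, a small script over the two Mochawesome JSON files can already produce exactly those categories. A rough sketch (it assumes test entries carry "fullTitle" and "state" fields, which is how Mochawesome reports usually look; adjust the key names to your actual files):

# diff_testruns.py -- rough sketch for comparing two Mochawesome JSON reports.
import json
import sys

def collect_states(node, out):
    # Walk the whole JSON tree and pick up anything that looks like a test
    # result, so the exact nesting of suites doesn't matter.
    if isinstance(node, dict):
        if "fullTitle" in node and "state" in node:
            out[node["fullTitle"]] = node["state"]
        for value in node.values():
            collect_states(value, out)
    elif isinstance(node, list):
        for item in node:
            collect_states(item, out)
    return out

def load(path):
    with open(path) as handle:
        return collect_states(json.load(handle), {})

def main(previous_path, latest_path):
    previous, latest = load(previous_path), load(latest_path)
    for title, state in sorted(latest.items()):
        before = previous.get(title)
        if before is None and state == "failed":
            print(f"NEW FAILURE:   {title}")
        elif before == "passed" and state == "failed":
            print(f"REGRESSION:    {title}")
        elif before == "failed" and state == "failed":
            print(f"STILL FAILING: {title}")
        elif before == "failed" and state == "passed":
            print(f"FIXED:         {title}")

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])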
I am automating acceptance tests defined in a specification written in Gherkin using Elixir. One way to do this is an ExUnit addon called Cabbage.
Now ExUnit seems to provide a setup hook, which runs before each single test, and a setup_all hook, which runs once before all the tests in a module.
Now when I try to isolate my Gherkin scenarios by resetting the persistence within the setup hook, it seems that the persistence is purged before each step definition is executed. But one scenario in Gherkin almost always needs multiple steps which build up the test environment and execute the test in a fixed order.
The other option, the setup_all hook, on the other hand, resets the persistence once per feature file. But a feature file in Gherkin almost always includes multiple scenarios, which should ideally be fully isolated from each other.
So the aforementioned hooks seem to allow me to isolate single steps (which I consider pointless) and whole feature files (which is far from optimal).
Is there any way to isolate each scenario instead?
First of all, there are alternatives, for example whitebread.
If all your features need some similar initial step, background steps might be something to look into. Sadly, those changes were mixed into a much larger rewrite of the library that never got merged. There is another PR which is also mixed in with other functionality and is currently waiting on a companion library update. So currently that doesn't work.
I haven't tested how the library behaves with setup hooks, but setup_all should work fine.
There is such a thing as tags, which I think haven't yet been published in a release but are in master. They work with a callback tag. You can take a closer look at the example in the tests.
Things are currently a little bit of a mess; I don't have as much time for this library as I would like.
Hope this helps you a little bit :)