What's a flaky test?
It's a test that sometimes fails, but eventually passes if you retry it enough times.
When a test frequently fails in master, a ~"master:broken" issue should be created.
If the test cannot be fixed in a timely fashion, there is an impact on the
productivity of all the developers, so it should be placed in quarantine by
assigning the :quarantine metadata with the issue URL.
it 'should succeed', quarantine: 'https://gitlab.com/gitlab-org/gitlab/-/issues/12345' do
  expect(response).to have_gitlab_http_status(:ok)
end
This means it is skipped unless run with --tag quarantine:
bin/rspec --tag quarantine
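The skip behaviour can be pictured with a small plain-Ruby model. This is only a sketch: the run_example? helper and the hash-based metadata are hypothetical, and this is not how RSpec or GitLab implement tag filtering internally.

```ruby
# Hypothetical model of quarantine filtering; not RSpec's real implementation.
# metadata:       the metadata hash attached to an example
# requested_tags: the tags passed on the command line with --tag
def run_example?(metadata, requested_tags)
  quarantined = metadata.key?(:quarantine)

  if requested_tags.include?(:quarantine)
    quarantined   # only quarantined examples are selected
  else
    !quarantined  # quarantined examples are skipped by default
  end
end

flaky  = { quarantine: 'https://gitlab.com/gitlab-org/gitlab/-/issues/12345' }
stable = {}

puts run_example?(flaky, [])              # default run: skipped
puts run_example?(flaky, [:quarantine])   # --tag quarantine: selected
puts run_example?(stable, [])             # default run: selected
```

Storing the issue URL as the metadata value keeps the reason for the quarantine next to the test itself.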
Before putting a test in quarantine, you should make sure that a ~"master:broken" issue exists for it so it doesn't stay in quarantine forever.
Once a test is in quarantine, there are 3 choices:
- Should the test be fixed (i.e. get rid of its flakiness)?
- Should the test be moved to a lower level of testing?
- Should the test be removed entirely (e.g. because there's already a lower-level test, or it's duplicating another same-level test, or it's testing too much etc.)?
Quarantine tests on the CI
Quarantined tests are run on the CI in dedicated jobs that are allowed to fail:
- rspec-pg-quarantine (CE & EE)
Automatic retries and flaky tests detection
We also use a home-made RspecFlaky::Listener listener, which records flaky
examples in a JSON report file.
This was originally implemented in: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/13021.
If you want to enable retries locally, you can use the RETRIES environment variable.
For instance, RETRIES=1 bin/rspec ... would retry the failing examples once.
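The idea of retrying a failing example and recording it as flaky when a retry succeeds can be sketched in plain Ruby. The run_with_retries helper and the report layout below are hypothetical and are not the actual rspec-retry or RspecFlaky::Listener code.

```ruby
require 'json'

# Hypothetical helper: runs the block, retrying up to `retries` extra times.
# Returns :passed (first attempt succeeded), :flaky (failed, then passed on a
# retry) or :failed (never passed).
def run_with_retries(retries)
  attempts = 0
  begin
    attempts += 1
    yield attempts
    attempts == 1 ? :passed : :flaky
  rescue StandardError
    retry if attempts <= retries
    :failed
  end
end

report = {}

# One retry allowed, comparable to running with RETRIES=1.
outcome = run_with_retries(1) do |attempt|
  raise 'transient failure' if attempt == 1 # fails once, then passes
end

# Only examples that needed a retry end up in the (hypothetical) report.
report['example that fails once'] = outcome if outcome == :flaky

puts JSON.pretty_generate(report)
```

Recording only the examples that passed on a retry keeps the report focused on flakiness rather than on genuine failures.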
Problems we had in the past at GitLab
- rspec-retry is biting us when some API specs fail: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/9825
Sporadic RSpec failures due to
- FFaker generates funky data that tests are not ready to handle (and tests should be predictable so that's bad!):
- Make spec/mailers/notify_spec.rb more robust: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10015
Transient failure in
- Replace FFaker factory data with sequences: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10184
- Transient failure in spec/finders/issues_finder_spec.rb: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10404
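Replacing random factory data with sequences, as one of the fixes listed above does, makes every generated value predictable. A minimal sketch of the idea in plain Ruby follows; the Sequence class is a hypothetical stand-in and not the factory library's own implementation.

```ruby
# Hypothetical stand-in for a factory sequence: each call returns the next
# predictable value, unlike a random data generator.
class Sequence
  def initialize(&formatter)
    @counter = 0
    @formatter = formatter
  end

  # Increments the counter and formats it into the next value.
  def next_value
    @counter += 1
    @formatter.call(@counter)
  end
end

usernames = Sequence.new { |n| "user#{n}" }
emails    = Sequence.new { |n| "user#{n}@example.com" }

puts usernames.next_value
puts usernames.next_value
puts emails.next_value
```

Because the values are unique and follow a fixed pattern, a test can never receive an unexpected character or a duplicate that it was not written to handle.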
Order-dependent flaky tests
These flaky tests can fail depending on the order in which they run relative to other tests.
To identify the tests that lead to such a failure, we can use rspec --bisect,
which would give us the minimal test combination to reproduce the failure:
rspec --bisect ee/spec/services/ee/merge_requests/update_service_spec.rb ee/spec/services/ee/notes/quick_actions_service_spec.rb ee/spec/services/epic_links/create_service_spec.rb ee/spec/services/ee/issuable/bulk_update_service_spec.rb
Bisect started using options: "ee/spec/services/ee/merge_requests/update_service_spec.rb ee/spec/services/ee/notes/quick_actions_service_spec.rb ee/spec/services/epic_links/create_service_spec.rb ee/spec/services/ee/issuable/bulk_update_service_spec.rb"
Running suite to find failures... (2 minutes 18.4 seconds)
Starting bisect with 3 failing examples and 144 non-failing examples.
Checking that failure(s) are order-dependent... failure appears to be order-dependent
Round 1: bisecting over non-failing examples 1-144 . ignoring examples 1-72 (1 minute 11.33 seconds)
...
Round 7: bisecting over non-failing examples 132-133 . ignoring example 132 (43.78 seconds)
Bisect complete! Reduced necessary non-failing examples from 144 to 1 in 8 minutes 31 seconds.
The minimal reproduction command is:
  rspec ./ee/spec/services/ee/issuable/bulk_update_service_spec.rb[1:2:1:1:1:1,1:2:1:2:1:1,1:2:1:3:1] ./ee/spec/services/epic_links/create_service_spec.rb[1:1:2:2:6:4]
We can reproduce the test failure with the reproduction command above. If we change the order of the tests, the test would pass.
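A common cause of this kind of failure is state that one test changes and never resets. The plain-Ruby sketch below illustrates the mechanism only; the Settings module and the two lambdas are hypothetical and are not taken from the specs named above.

```ruby
# Hypothetical global state that one test mutates and never resets.
module Settings
  @feature_enabled = false

  class << self
    attr_accessor :feature_enabled
  end
end

# Each lambda stands in for one test and returns true when it passes.
LEAKY_TEST = lambda do
  Settings.feature_enabled = true # leaks into whatever runs next
  true
end
DEPENDENT_TEST = -> { Settings.feature_enabled == false }

# Simulates a fresh process, then runs the tests in the given order.
def run_in_order(tests)
  Settings.feature_enabled = false
  tests.map(&:call)
end

puts run_in_order([DEPENDENT_TEST, LEAKY_TEST]).inspect # both pass
puts run_in_order([LEAKY_TEST, DEPENDENT_TEST]).inspect # second one fails
```

The dependent test passes or fails purely because of what ran before it, which is the situation a bisect run narrows down to a minimal pair.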
Time-sensitive flaky tests
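Tests that read the wall clock can pass or fail depending on when they run. One way to make them deterministic is to pass the current time in explicitly. The sketch below is plain Ruby under that assumption; the expired? helper is hypothetical. In a Rails test suite, time helpers such as travel_to or freeze_time serve a similar purpose.

```ruby
# Hypothetical helper whose result depends on the current time.
# The clock is injected through `now` so tests can pin it.
def expired?(expires_at, now: Time.now)
  now >= expires_at
end

# A fixed reference time keeps the outcome independent of when the test runs.
fixed_now = Time.utc(2024, 1, 1, 12, 0, 0)

puts expired?(fixed_now + 60, now: fixed_now) # expires in one minute
puts expired?(fixed_now - 60, now: fixed_now) # expired one minute ago
```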
Array order expectation
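An expectation that compares arrays element by element fails intermittently when the data source does not guarantee an order. A minimal plain-Ruby sketch of an order-insensitive comparison follows; the same_elements? helper is hypothetical. In RSpec, the match_array and contain_exactly matchers express the same intent.

```ruby
# Hypothetical helper: true when both arrays hold the same elements,
# regardless of order (duplicates are taken into account).
def same_elements?(actual, expected)
  actual.sort == expected.sort
end

expected_ids = [1, 2, 3]
returned_ids = [3, 1, 2] # same records, different order

puts returned_ids == expected_ids                # strict comparison fails
puts same_elements?(returned_ids, expected_ids)  # order-insensitive passes
```

When the order itself matters to the feature, the alternative is to make the code under test sort explicitly and keep the strict comparison.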
- Be sure to create all the data the test needs before starting the exercise: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12059
- Bis: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12604
- Bis: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12664
- Assert against the underlying database state instead of against a page's content: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10934
- In JS tests, shifting elements can cause Capybara to mis-click when the element moves at the exact time Capybara sends the click
- Triggering JS events before the event handlers are set up
- Wait for the image to be lazy-loaded when asserting on a Markdown image's
Capybara viewport size related issues
- Transient failure of spec/features/issues/filtered_search/filter_issues_spec.rb: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10411
Capybara JS driver related issues
- Don't wait for AJAX when no AJAX request is fired: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/10454
- Bis: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12626
PhantomJS / WebKit related issues
- Memory is through the roof! (Load images but block images requests!): https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/12003
Capybara expectation times out
- Test imports a project (via Sidekiq) that is growing over time, leading to timeouts when the import takes longer than 60 seconds