Necessist
Run tests with statements and method calls removed to help identify broken tests
Necessist currently supports Foundry, Go, Hardhat TS, and Rust.
Installation
System requirements:
Install pkg-config and sqlite3 development files on your system, e.g., on Ubuntu:
Install Necessist from crates.io:
Install Necessist from github.com:
Overview
Necessist iteratively removes statements and method calls from tests and then runs them. If a test passes with a statement or method call removed, it could indicate a problem in the test. Or worse, it could indicate a problem in the code being tested.
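The loop described above can be modeled in a few lines of Rust. This is a toy sketch under simplifying assumptions, not Necessist's implementation: a test is a list of statement strings, the hypothetical run_test closure stands in for rebuilding and running the test, and only the passed and failed outcomes are modeled.

```rust
/// Outcome of running a test with one statement removed (toy subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Passed,
    Failed,
}

/// Removes each statement in turn, reruns the test, and returns the
/// statements whose removal still lets the test pass.
fn suspicious_removals<'a>(
    statements: &[&'a str],
    run_test: impl Fn(&[&str]) -> Outcome,
) -> Vec<&'a str> {
    let mut suspicious = Vec::new();
    for (i, removed) in statements.iter().enumerate() {
        // Rebuild the test body without statement `i`.
        let remaining: Vec<&str> = statements
            .iter()
            .enumerate()
            .filter(|(j, _)| *j != i)
            .map(|(_, s)| *s)
            .collect();
        if run_test(remaining.as_slice()) == Outcome::Passed {
            suspicious.push(*removed);
        }
    }
    suspicious
}

fn main() {
    // Hypothetical test body.
    let statements = ["deposit(100)", "log_balance()", "withdraw(30)"];
    // Hypothetical runner: the test passes only if both state-changing
    // calls are still present; `log_balance()` does not affect the result.
    let suspicious = suspicious_removals(&statements, |remaining| {
        if remaining.contains(&"deposit(100)") && remaining.contains(&"withdraw(30)") {
            Outcome::Passed
        } else {
            Outcome::Failed
        }
    });
    println!("{suspicious:?}");
}
```

In this model, only the removal of log_balance() leaves the test passing, so it is the one statement reported for triage.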
Example
This example is from rust-openssl. The verify_untrusted_callback_override_ok test checks that a failed certificate validation can be overridden by a callback. But if the callback were never called (e.g., because of a failed connection), the test would still pass. Necessist reveals this fact by showing that the test passes without the call to set_verify_callback:
Following this discovery, a flag was added to the test to record whether the callback is called. The flag must be set for the test to succeed:
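As a self-contained illustration of this pattern, the sketch below uses a hypothetical Client type rather than the rust-openssl API. The boolean parameter models whether the call to set_verify_callback is present. The weak test passes either way; the strong test, which records the call in a flag, fails when the call is removed.

```rust
use std::cell::Cell;
use std::panic;
use std::rc::Rc;

/// Hypothetical stand-in for a TLS client; this is not the rust-openssl API.
struct Client {
    verify_callback: Option<Box<dyn Fn(bool) -> bool>>,
}

impl Client {
    fn new() -> Self {
        Client { verify_callback: None }
    }

    /// Registers a callback that can override the certificate check.
    fn set_verify_callback(&mut self, callback: impl Fn(bool) -> bool + 'static) {
        self.verify_callback = Some(Box::new(callback));
    }

    /// Simulates a handshake with a peer whose certificate is untrusted.
    fn connect(&self) -> Result<(), &'static str> {
        let preverify_ok = false;
        let accepted = match &self.verify_callback {
            Some(callback) => callback(preverify_ok),
            None => preverify_ok,
        };
        if accepted { Ok(()) } else { Err("certificate rejected") }
    }
}

/// Weak test: nothing checks that the callback ran, so the test passes
/// even when the call to `set_verify_callback` is removed.
fn weak_test(call_set_verify_callback: bool) {
    let mut client = Client::new();
    if call_set_verify_callback {
        client.set_verify_callback(|preverify_ok| {
            assert!(!preverify_ok);
            true
        });
    }
    let _ = client.connect();
}

/// Stronger test: a flag records whether the callback was called, and the
/// test fails unless the flag is set.
fn strong_test(call_set_verify_callback: bool) {
    let called = Rc::new(Cell::new(false));
    let mut client = Client::new();
    if call_set_verify_callback {
        let called = Rc::clone(&called);
        client.set_verify_callback(move |preverify_ok| {
            called.set(true);
            assert!(!preverify_ok);
            true
        });
    }
    let _ = client.connect();
    assert!(called.get(), "verify callback was never called");
}

fn main() {
    // With the call removed, the weak test still passes...
    weak_test(false);
    // ...but the strong test panics.
    let result = panic::catch_unwind(|| strong_test(false));
    println!("strong test without callback panicked: {}", result.is_err());
    // With the call present, both tests pass.
    weak_test(true);
    strong_test(true);
}
```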
Comparison to conventional mutation testing
Conventional mutation testing tries to identify gaps in test coverage, whereas Necessist tries to identify bugs in existing tests.
Conventional mutation testing tools (such as universalmutator) randomly inject faults into source code and check whether the code's tests still pass. If they do, it could mean the code's tests are inadequate.
Notably, conventional mutation testing is about finding deficiencies in the set of tests as a whole, not in individual tests. That is, for any given test, randomly injecting faults into the code is not especially likely to reveal bugs in that test. This is unfortunate since some tests are more important than others, e.g., because ensuring the correctness of some parts of the code is more important than others.
By comparison, Necessist's approach of iteratively removing statements and method calls does target individual tests, and thus can reveal bugs in individual tests.
Of course, there is overlap in the sets of problems the two approaches can uncover, e.g., a failure to find an injected fault could indicate a bug in a test. Nonetheless, for the reasons just given, we see the two approaches as complementary, not competing.
Usage
Usage: necessist [OPTIONS] [TEST_FILES]... [-- <ARGS>...]
Arguments:
[TEST_FILES]... Test files to mutilate (optional)
[ARGS]... Additional arguments to pass to each test command
Options:
--allow <WARNING> Silence <WARNING>; `--allow all` silences all warnings
--default-config Create a default necessist.toml file in the project's root directory
--deny <WARNING> Treat <WARNING> as an error; `--deny all` treats all warnings as errors
--dump Dump sqlite database contents to the console
--dump-candidates Dump removal candidates and exit (for debugging)
--framework <FRAMEWORK> Assume testing framework is <FRAMEWORK> [possible values: auto, foundry, go, hardhat-ts, rust]
--no-dry-run Do not perform dry runs
--no-sqlite Do not output to an sqlite database
--quiet Do not output to the console
--reset Discard sqlite database contents
--resume Resume from the sqlite database
--root <ROOT> Root directory of the project under test
--timeout <TIMEOUT> Maximum number of seconds to run any test; 60 is the default, 0 means no timeout
--verbose Show test outcomes besides `passed`
-h, --help Print help
-V, --version Print version
Output
By default, Necessist outputs to the console only when tests pass. Passing --verbose causes Necessist to instead output all of the removal outcomes below.
| Outcome | Meaning (With the statement/method call removed...) |
|---|---|
| passed | The test(s) built and passed. |
| timed-out | The test(s) built but timed out. |
| failed | The test(s) built but failed. |
| nonbuildable | The test(s) did not build. |
By default, Necessist outputs to both the console and to an sqlite database. For the latter, a tool like sqlitebrowser can be used to filter/sort the results.
Details
Generally speaking, Necessist will not attempt to remove a statement if it is one of the following:
- a statement containing other statements (e.g., a for loop)
- a declaration (e.g., a local or let binding)
- a break, continue, or return
- the last statement in a test
Similarly, Necessist will not attempt to remove a method call if:
- It is the primary effect of an enclosing statement (e.g., x.foo();).
- It appears in the argument list of an ignored function, method, or macro (see below).
Also, for some frameworks, certain statements and methods are ignored. The specifics for each framework follow.
Foundry
In addition to the below, the Foundry framework ignores:
- a statement immediately following a use of vm.prank or any form of vm.expect (e.g., vm.expectRevert)
- an emit statement
Ignored functions
- Anything beginning with assert (e.g., assertEq)
- Anything beginning with vm.expect (e.g., vm.expectCall)
- Anything beginning with console.log (e.g., console.log, console.logInt)
- Anything beginning with console2.log (e.g., console2.log, console2.logInt)
- vm.getLabel
- vm.label
Go
In addition to the below, the Go framework ignores:
- Anything beginning with assert. (e.g., assert.Equal)
- Anything beginning with require. (e.g., require.Equal)
- defer statements
Ignored methods*
- Close
- Error
- Errorf
- Fail
- FailNow
- Fatal
- Fatalf
- Log
- Logf
- Parallel
* This list is based primarily on testing.T's methods. However, some methods with commonplace names are omitted to avoid colliding with other types' methods.
Hardhat TS
Ignored functions
- assert
- Anything beginning with assert. (e.g., assert.equal)
- expect
Ignored methods
- toNumber
- toString
Rust
Ignored macros
- assert
- assert_eq
- assert_matches
- assert_ne
- eprint
- eprintln
- panic
- print
- println
- unimplemented
- unreachable
Ignored methods*
- as_bytes
- as_mut
- as_mut_os_str
- as_mut_os_string
- as_mut_slice
- as_mut_str
- as_os_str
- as_os_str_bytes
- as_path
- as_ref
- as_slice
- as_str
- borrow
- borrow_mut
- clone
- cloned
- copied
- deref
- deref_mut
- expect
- expect_err
- into_boxed_bytes
- into_boxed_os_str
- into_boxed_path
- into_boxed_slice
- into_boxed_str
- into_bytes
- into_os_string
- into_owned
- into_path_buf
- into_string
- into_vec
- iter
- iter_mut
- success
- to_os_string
- to_owned
- to_path_buf
- to_string
- to_vec
- unwrap
- unwrap_err
* This list is essentially the watched trait and inherent methods of Dylint's unnecessary_conversion_for_trait lint, with the following additions:
- clone (e.g., std::clone::Clone::clone)
- cloned (e.g., std::iter::Iterator::cloned)
- copied (e.g., std::iter::Iterator::copied)
- expect (e.g., std::option::Option::expect)
- expect_err (e.g., std::result::Result::expect_err)
- into_owned (e.g., std::borrow::Cow::into_owned)
- success (e.g., assert_cmd::assert::Assert::success)
- unwrap (e.g., std::option::Option::unwrap)
- unwrap_err (e.g., std::result::Result::unwrap_err)
Configuration files
A configuration file allows one to tailor Necessist's behavior with respect to a project. The file must be named necessist.toml, appear in the project's root directory, and be TOML encoded. The file may contain one or more of the options listed below.
- ignored_functions, ignored_methods, ignored_macros: Each is a list of strings interpreted as patterns. A function, method, or macro (respectively) whose path matches a pattern in the list is ignored. Note that ignored_macros is currently used only by the Rust framework.
- ignored_path_disambiguation: One of the strings Either, Function, or Method. For a path that could refer to a function or method (see below), this option influences whether the function or method is ignored.
  - Either (default): Ignore if the path matches either an ignored_functions or ignored_methods pattern.
  - Function: Ignore only if the path matches an ignored_functions pattern.
  - Method: Ignore only if the path matches an ignored_methods pattern.
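For illustration, a necessist.toml using the options above might look like the following. The pattern values are hypothetical, and the layout (top-level keys, arrays of strings) is an assumption based on the option descriptions; running necessist --default-config produces a default file to start from.

```toml
# Hypothetical example values; adjust the patterns to your project.
ignored_functions = ["assert*", "console.log*"]
ignored_methods = ["unwrap", "to_string"]
# Used only by the Rust framework.
ignored_macros = ["debug_assert*"]
# One of "Either" (default), "Function", or "Method".
ignored_path_disambiguation = "Function"
```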
Patterns
A pattern is a string composed of letters, numbers, ., _, or *. Each character, other than *, is treated literally and matches itself only. A * matches any string, including the empty string.
The following are examples of patterns:
- assert: matches itself only
- assert_eq: matches itself only
- assertEqual: matches itself only
- assert.Equal: matches itself only
- assert.*: matches assert.Equal, but not assert, assert_eq, or assertEqual
- assert*: matches assert, assert_eq, assertEqual, and assert.Equal
- *.Equal: matches assert.Equal, but not Equal
Notes:
- Patterns match paths, not individual identifiers.
- . is treated literally, as in a glob pattern, not as in a regular expression.
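The matching rules above can be sketched in Rust as follows. This is a minimal illustration of the stated semantics, not Necessist's implementation; the function name pattern_matches is hypothetical.

```rust
/// Returns true if `pattern` matches the whole of `path`. A `*` matches any
/// string (including the empty string); every other character, including
/// `.`, matches itself only.
fn pattern_matches(pattern: &str, path: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let s: Vec<char> = path.chars().collect();
    let (mut pi, mut si) = (0, 0);
    // Index of the most recent `*` and the path index it currently covers up to.
    let mut backtrack: Option<(usize, usize)> = None;
    while si < s.len() {
        if pi < p.len() && p[pi] == '*' {
            // Tentatively let `*` match the empty string.
            backtrack = Some((pi, si));
            pi += 1;
        } else if pi < p.len() && p[pi] == s[si] {
            pi += 1;
            si += 1;
        } else if let Some((star_pi, star_si)) = backtrack {
            // Mismatch: let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            si = star_si + 1;
            backtrack = Some((star_pi, star_si + 1));
        } else {
            return false;
        }
    }
    // The path is exhausted; only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

fn main() {
    for (pattern, path) in [
        ("assert.*", "assert.Equal"),
        ("assert.*", "assert_eq"),
        ("assert*", "assertEqual"),
        ("*.Equal", "Equal"),
    ] {
        println!("{pattern} vs {path}: {}", pattern_matches(pattern, path));
    }
}
```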
Paths
A path is a sequence of identifiers separated by .. Consider this example (from Chainlink):
operator.connect(roles.oracleNode).signer.sendTransaction({
to: operator.address,
data,
}),
In the above, operator.connect and signer.sendTransaction are paths.
Note, however, that paths like operator.connect are ambiguous:
- If operator refers to a package or module, then operator.connect refers to a function.
- If operator refers to an object, then operator.connect refers to a method.
By default, Necessist ignores such a path if it matches either an ignored_functions or ignored_methods pattern. Setting the ignored_path_disambiguation option above to Function or Method causes Necessist to ignore the path only if it matches an ignored_functions or ignored_methods pattern (respectively).
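The decision for an ambiguous path can be summarized with a small Rust sketch. The names here are hypothetical and this is not Necessist's code; the two boolean inputs stand for whether the path matched some ignored_functions pattern and some ignored_methods pattern.

```rust
/// Values of the `ignored_path_disambiguation` option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Disambiguation {
    Either,
    Function,
    Method,
}

/// Decides whether an ambiguous path such as `operator.connect` is ignored.
/// `matches_function` and `matches_method` say whether the path matched an
/// `ignored_functions` pattern and an `ignored_methods` pattern, respectively.
fn is_ignored(mode: Disambiguation, matches_function: bool, matches_method: bool) -> bool {
    match mode {
        Disambiguation::Either => matches_function || matches_method,
        Disambiguation::Function => matches_function,
        Disambiguation::Method => matches_method,
    }
}

fn main() {
    // Hypothetical configuration: `operator.connect` matches an
    // `ignored_methods` pattern but no `ignored_functions` pattern.
    let (matches_function, matches_method) = (false, true);
    for mode in [
        Disambiguation::Either,
        Disambiguation::Function,
        Disambiguation::Method,
    ] {
        println!(
            "{mode:?}: ignored = {}",
            is_ignored(mode, matches_function, matches_method)
        );
    }
}
```

With this configuration, the path is ignored under Either and Method but not under Function.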
Limitations
- Slow. Modifying tests requires them to be rebuilt. Running Necessist on even moderately sized codebases can take several hours.
- Triage requires intimate knowledge of the source code. Generally speaking, Necessist does not produce "obvious" bugs. In our experience, deciding whether a statement/method call should be necessary requires intimate knowledge of the code under test. Necessist is best run on codebases for which one has (or intends to have) such knowledge.
Goals
- If a project uses a supported framework, then cd-ing into the project's directory and typing necessist (with no arguments) should produce meaningful output.
References
- Groce, A., Ahmed, I., Jensen, C., McKenney, P.E., Holmes, J.: How verified (or tested) is my code? Falsification-driven verification and testing. Autom. Softw. Eng. 25, 917–960 (2018). A preprint is available. See Section 2.3.
License
Necessist is licensed and distributed under the AGPLv3 license. Contact us if you're looking for an exception to the terms.