conf
conf is a derive-based env-and-argument parser aimed at the practically-minded web developer building large web projects.
It uses clap under the hood to parse CLI arguments and generate help text.
conf has an intentionally similar proc-macro API to clap-derive, but it is not a fork. It is a new library with different goals. It offers some powerful features and support that clap-derive does not, which help with the configuration of large projects. But it also doesn't offer some features of clap, which are typically less useful in a web project.
The features that you get for this bargain are:
- You can assign a prefix to a structure's fields when flattening it into another structure, and you can similarly do
envprefixing in a controlled way. - You get ALL the errors and not just one of them if some required env is missing and/or several of the values are invalid. In my searching I found that surprisingly few config crates out there actually do this. Very helpful if your deployments take a while.
- Isolation & testability around
env.claponly supports reading env values fromstd::env::var_os.- If you want to test what happens when different variables are set, your tests can become racy.
- If you want to test a component that takes config as an argument, and use
::parse_fromto initialize the config, then your tests will pass or fail depending on your local env. - If you want to implement
Defaultbased on the default values your declared on your structure, you can't really because you can't isolate it fromenv. conflets you pass an iterator to represent a snapshot of the environment.
- Support for
envaliases.clapsupports aliases for command-line arguments but not forenv. Make changes without breaking compatibility. - You can declare fields which are only read from
envand cannot be read from args at all. - You can declare fields which represent secrets. This controls whether or not the entire value should be printed in error messages if it fails to parse.
- Support for an optional-flatten syntax. This can be simpler and more idiomatic than using argument groups and such in
clap-derive. - Support for user-defined validation predicates. This allows you to express constraints that can't be expressed in
clap.
conf is heavily influenced by clap-derive and the earlier struct-opt which I used for years. They are both great and became popular for a reason.
In most cases, conf tries to stay extremely close to clap-derive syntax and behavior, for familiarity and ease of migrating a large project.
In some cases, there are small deviations from the behavior of clap-derive to either help avoid mistakes, or to make the defaults closer to a good 12-factor app behavior.
For some advanced features of clap, conf has a way to achieve the same thing, but we took a different approach. This is typically in an attempt to simplify how it works for the user of the derive macro, to have fewer named concepts, or to ease maintenance going forward.
Rather than try to make many of the underlying objects and concepts part of the public API, and support hand-written implementations of the derivable traits as clap-derive does, the public API here is restricted to one trait, one proc-macro to derive it, and one error type. It is hoped that this will both reduce the learning curve and ease future development and maintenance.
See MOTIVATION.md for more discussion about this project and the other various alternatives out there.
Using conf in a cargo project
First add conf to the dependencies in your Cargo.toml file:
[dependencies]
conf = "0.1"
NOTE: Not actually published to crates.io yet... there is some more test coverage that is needed before that can happen.
Then, create a struct which represents the configuration data your application needs to read on startup.
This struct should derive the Conf trait, and the conf attributes should be used to describe how each field can be read.
Finally, you can parse the config:
let config = parse;
Usually you would call that somewhere in fn main() and then use the config to initialize your application.
The parse() function will automatically add a --help option for users that contains auto-generated documentation, based on your doc strings.
Additionally, if parsing fails for some reason, it will display a helpful error message and exit.
(The Conf trait offers a few variants of this function, which you can read about in the docs.)
Generally, the CLI interface and help text that is generated is meant to conform to POSIX and GNU conventions. Read more in clap docu about this.
A tour
A field in your struct can be read from a few sources:
#[arg(short)]means that it corresponds to a "short" option, such as-u. By default the first letter of your field is used. This can be overridden with#[conf(short='t')]for example.#[arg(long)]means that it corresponds to a "long" option, such as--url. By default the kebab-case name of your field is used. This can be overridden with#[conf(long="target-url")]for example.#[arg(env)]means that it corresponds to an environment variable, such asURL. By default the upper snake-case name of your field is used. This can be overridden with#[conf(env="TARGET_URL")]for example.#[arg(default_value)]specifies a default value for this field if none of the other three possible sources provides one.
Such attributes can be combined by separating them with commas, for example #[arg(long, env, default_value="x")] means the field has an assocated long option, an associated environment variable, and a default value if both of these are omitted.
Your field can have any type as long as it implements FromStr, and this will be used to parse it.
The type bool is special and results in a "flag" being generated rather than a "parameter", which expects no string parameter to be passed during parsing.
Option<T> is also special, and indicates that the value is optional rather than required. You can also specify an alternative parsing function using value_parser.
So far this is almost exactly the same clap-derive. Where it gets more interesting is the flatten option.
You may have one structure that derives Conf and declares a bunch of related config values:
Then you can "flatten" it into a larger Conf structure using the conf(flatten) attribute.
Intuitively, this is meant to read a lot like the serde(flatten) attribute, and has a similar behavior.
During parsing, the parser behaves as if every field of DbConfig were declared within Config, and generates matching options, env, and help, but then the parsed values actually
get stored in subfields of the .db field.
Using flatten can save a lot of labor. For example, suppose your web application consists of ten different web services, and they all need a DbConfig. Instead of duplicating all the values,
any env, any defaults, any help text, in each Config that you have, you can write that once and then flatten it ten times. Then, later when you discover that DbConfig should contain another value,
you only have to add it to DbConfig once, and every service that uses DbConfig will get the new config parameter. Also, when you need to initialize your db connection, you can just pass it the entire .db field rather
than pick out needed config arguments one-by-one.
Where conf differs from clap-derive is that we expect that you will use flatten in your project quite a lot.
For example, you might need to do this:
because logically, you have three different http clients that you need to configure.
However with clap-derive, this is going to cause a problem, because when the fields from HttpClientConfig get flattened, their names will collide, and the parser will reject it as ambiguous.
When using conf, you can resolve it by declaring a prefix.
This will cause every option associated to the auth_service structure to get a prefix, derived from the field name, auth_service, on any long-form options and on any env variables. The prefix will be kebab-case for long-form options and upper snake-case for env variables. And similarly for friend_service and snaps_service.
You can also override this prefix:
You can also configure env prefixes and option prefixes separately if you want that. Setting env_prefix will cause env vars to be prefixed, but not options. long_prefix will cause long-form options to be prefixed, but not env vars. (Short options are never prefixed, so there is not usually a good way to resolve a conflict among them. Short options should be used with caution in a large project.)
Finally, you can also declare prefixes at the level of a struct rather than a field. So for example, if you need every environment variable your program reads to be prefixed with ACME_, you can achieve that very easily.
You can read about all the attributes in the docs or the REFERENCE.md, but hopefully this is enough to get started.
Topics
This section discusses more advanced features and usage patterns, and alternatives.
Reading files
Sometimes, a web service needs to read a file on startup. conf supports this by using the value_parser feature, which works very similarly as in clap.
A value_parser is a function that takes a &str and returns either a value or an error.
For example, if you need to read a yaml file on startup according to a schema, one way you could do that is
use Conf;
use Deserialize;
use ;
This will read a file path either from CLI args or from env, then attempt to open the file and parse it according to the yaml schema.
If your value_parser is complex or needs to be reused, the best practice is to put it in a named function.
This can also be a good pattern for things like reading a certificate or a cryptographic key from a file, which you want to check on startup, failing fast if the file is not found or is invalid.
Hierarchical config
Hierarchical config is the idea that config values should be merged in from files as well as from args and env.
Applications should follow a hierarchical configuration structure. Use the following order, from highest priority to lowest.
- Command-line arguments
- Environment variables
- Directory or repository-scoped configuration
- User-scoped configuration
- System-wide configuration
- Default configuration shipped with the program.
conf has built-in support for (1), (2), and (6) here.
To get the others when using something like conf, a common practice is to use a crate like dotenvy. This crate can search for an .env file, and then set env values if they are not already set in your program.
You can do this right before calling Config::parse(), and in this manner achieve hierarchical config. You can load multiple .env files this way if you need to.
In web applications, I often use this approach for development rather than production.
If your application has a lot of required values, it may take an engineer a while to figure out how to just run it locally. But you may not want to provide default values in the program that would not be appropriate in production, for safety. Instead, you can provide a .env file which is checked in to the repo, with values which are appropriate for local testing / CI. Then an engineer can use cargo run and it will just work. When you go to build docker containers, you can leave out these .env files, and then be sure that in the deployed environment, kubernetes or similar is in total control, and any missing or misspelled values in the helm charts and whatnot will be loud and fail fast.
These .env files work well if you are using diesel, because the diesel cli tool also uses dotenvy to search for a .env file and find the DATABASE_URL when manging database migrations locally.
This approach to hierarchical config is much less general than what crates like config and figment offer, but it's also simpler, and it's easy to change and debug. There are other reasons discussed in MOTIVATION.md that I personally favor this approach. This of course is highly opinionated -- over time conf may add more features that support other ways of using it. To start, I only built what I felt I needed. Your mileage may vary.
Secrets
conf tries to provide the most helpful and detailed errors that it can, and also to report as many problems as it can when parsing fails.
Usually, if a user-provided value cannot be parsed, we want to provide the value and the error in the error message to help debugging. But if the value represents a secret, then logging its value is bad.
To prevent conf from logging the value, you can mark the field as secret.
pub api_key: ApiKey
When conf knows that something is a secret, it will avoid revealing the value when generating any kind of error message or help text.
conf will also describe it with the [secret] tag in the help text.
Handling secrets is a complex topic and much of the discussion is out of scope here. We'll offer just three points of guidance around this tool.
- The more valuable the secrets are, and the more challenging the threat model is, the more time it makes sense to spend working on defensive measures. The converse is also true.
- If you're at a point where systematically marking things
secretseems like a good idea, then you should also be using special types to manage the secrets. For example, usingSecretStringfrom thesecrecycrate instead ofStringwill prevent your password from appearing in debug logs after it has been loaded. There are alternatives out there ifsecrecycrate doesn't work for your use-case. This is usually a pretty low-effort improvement, and it goes hand-in-hand with what thesecretmarking does.- It's very easy to expose your secret by accident if you don't do something like this. For example, just by putting a
#[tracing::instrument]annotation on a function that some day takes aconfigstruct, you could accidentally log your password.
- It's very easy to expose your secret by accident if you don't do something like this. For example, just by putting a
- If you're at a point where you think you need to systematically zeroize all copies of your secret that reside in process memory when they are no longer needed, then you are past the point
where you can use an environment variable to pass the secret value to the application. Your application most likely needs to read the secret value from a file instead.
- The rust standard library handles environment values as
std::ffi::OsStringinternally and in its API, but this type cannot be securely zeroized. There are no public APIs to mutably access the underlying bytes. - At a lower level,
glibcexposes the environment aschar **environ, makes copies of the entire environment whenever it is changed usingset_varor similar, and leaks the old values. It is difficult to systematically ensure that all of these copies are cleaned up if they contain sensitive data.environoften gets copied by other things very early in the process. The rust standard library also interacts with the environment via theseglibcAPIs, which means that typical rust libraries likedotenvydo as well.
- The rust standard library handles environment values as
Argument groups and constraints
clap has support for the concept of "argument groups" (ArgGroup) and also "dependencies" among Arg's. This is used to create additional conditions that must be satisfied for the config to be valid, and error messages if it is invalid.
clap provides many functions on Arg and on ArgGroup which can be used to define various kinds of constraints, such as conditional dependency or mutual exclusion, between Arg's or ArgGroup's.
The main reason to use these features in clap is that it will generate nicely formatted errors if these constraints are violated, and then you don't have to worry about handling the situation in your application code.
conf similarly wants to support adding constraints in this manner that are checked during parsing, but the design goal is that all of these errors should reportable alongside all the other types of errors.
For several reasons, conf chose to offer a different API than the clap for these purposes.
- In
clap, this API was designed first for the clap builder API, and then exposed via theclap-deriveAPI. - There are about a dozen functions exposed in total, and multiple named concepts (
Argis now joined byArgGroupwhich is different fromArgs) - The API relies on explicit
idvalues forArg's andArgGroups, but this is less idiomatic in the derive API. The derive API is simpler from the user's point of view if theseid's are not really exposed and are more like implementation details. - The API often provides multiple ways to do the same thing, which makes code that uses it less predictable.
- The API has many defaults that I find hard to remember. For example, in an
ArgGroup, doesrequireddefault totrueorfalse? Doesmultipledefault totrueorfalse? These defaults are different for anArgs. - Sometimes the API doesn't feel idiomatic. For example if I have a group of options where if one of them appears, all of them must appear, the most idiomatic thing is if the API can give me a single
Optionthat includes all of them. Otherwise I have to unwrap a bunch of options in application code, on the assumption that my constraint works as expected.
conf provides one mechanism for idiomatically representing when some collection of arguments are optional-but-mutually-required. Then it provides a few one-offs to express exclusivity between arguments. Finally, it provides a very general mechanism that can express arbitrary constraints.
flatten-optional
conf supports the following syntax:
Intuitively, this means that the snaps_service config is optional, and if none of those fields appear, that's not an error, and snaps_service will be None in the parsed config object.
However, if any of the fields of snaps_service appear, then all of its required fields must appear, and parsing the entire flattened object must succeed.
This allows the code that consumes the conditional config to be simpler -- you can just match on whether snaps_service is present or absent, and the type system encodes that when any of those fields are present, all are present.
And you can express which arguments in the group are required to be present or not by marking them optional or not (or giving them a default value), within HttpClientConfig.
This feature actually covers every use-case I've had in real-life for argument groups and constraints in clap across all my web projects, and I like it because I feel that it introduces fewer named concepts
and promotes code reuse. The same struct can be flattened in a required way in one setting and in an optional way in another setting.
Hopefully it's easy to remember what it means, just by looking at the type of the data, and thinking about what would have to happen for it to succeed.
If we can't see any of the (prefixed) substructure's fields appearing, then we return None. If we see some of them appearing, it indicates that we're supposed to be producing a Some. Once we decide that we're supposed to produce Some, it's an error if we can't do so in the normal (non-optional) manner for flatten'ed structures.
one_of_fields
conf provides a simple way to specify that some fields in a struct are mutually exclusive.
When used with two fields, it provides a way to translate many usages of conflicts_with in the clap-derive API.
When used with all fields in a struct, it is similar to an ArgGroup with multiple=false and required=false in the clap-derive API.
This also works with the flatten-optional feature, so one or more optional flattened groups can be made exclusive with eachother or with simple arguments in this structure.
However, it can only be used with fields on the struct that is marked with this attribute, and cannot be used with fields inside of flattened structs, or elsewhere in the structure.
conf provides a variation which requires exactly one of the fields to appear.
When used with all fields in a struct, this is similar to an ArgGroup with multiple=false and required=true in the clap-derive API.
Finally conf provides one more variation
When used with all fields in a struct, this is similar to an ArgGroup with multiple=true and required=true in the clap-derive API.
Any of these attributes can be used multiple times on the same struct to create multiple constraints that apply to that struct.
validation predicate
flatten-optional and one_of_fields provide some easy-to-understand ways to create dependencies and exclusion constraints between different optional fields in a conf structure.
They can directly translate many simple uses of ArgGroup and some of the constraints in the clap-derive API. But, there are many other constraint types supported by clap
that don't translate directly into this, and we don't support declaring arg group membership directly on a field, which is something that clap does support.
At the same time, there are other kinds of constraints you might have a legitimate use for that you can't express in clap's API.
For example, one of your arguments might be a Url object, and you might want to require that if the Url starts with https then some other options are required. As far as I know, there's no way to do this in clap.
Instead of providing direct analogues for every function in clap's constraint API,
conf supports user-defined validation predicates on a per-struct basis.
A validation predicate is a function that takes &T where T is the struct at hand, and returns Result<(), impl Display>.
It behaves similarly to value_parser, in that any function expression can be accepted.
The idea here is, rather than adding increasing numbers of one-off constraint types to conf, or enabling you to write non-local constraints using proc-macro attributes, it
will be more maintainable for you and for conf if you just express what you want in rust code, once your constraints get sophisticated enough.
There's both less API for you to learn and remember, and less API surface area for conf to test and maintain. You will also be able to generate very precise error messages when complex constraints fail.
Using these features together, you can express any kind of constraint you want to impose on your config structure, and hopefully make it feel idiomatic and natural.
Given that the validation_predicate for a T runs after we have actually parsed a T, why have this feature at all? The users could just run such functions on their own after Config::parse succeeds.
The benefit of using the validation_predicate is that if a predicate fails, conf is still able to report those errors and any other errors that occurred elsewhere in the tree.
For example, in this config struct:
It's possible that when parsing a Config, the auth_service fails to parse because of a missing required argument, friend_service fails to parse because of a missing argument and an invalid value, and snaps_service parses but fails its validation predicate. In this scenario conf will report all of these errors, which distinguishes it from other crates in this genre.
Who should use this crate?
The best reason to use this is crate is if you have a medium-to-large project, such as a web app consisting of multiple services, which has a lot of configuration needs. You have multiple services that have several common subsystems or components, and these components have subcomponents, some of which are shared, etc., all of which should read config from the environment in accordance with 12-factor style, and may need to read more such config on short notice. You may already be using clap-derive but have run into limitations as your project has grown.
The purpose of the crate is to help you arrange all of that config in the simplest and most maintainable way possible, while still ensuring that all values that are needed are checked for on program startup (failing fast), reporting as many configuration errors as possible in the most helpful way possible when your deployment goes bad, and providing automated --help documentation of all of the config that is being read.
If you think that this crate is a good fit for you, the suggested way to use it is:
- Whenever you have a component that you think should use a value that is read on startup, you should create a config struct for that component.
You should
derive(Conf)on that struct, and pass that config struct to the component on initialization. The config struct should live in the same module as the component that it is configuring. - If your component is initialized by a larger component, then that component should have its own config struct and you should use
flattento assemble it. You should usually use theprefixandhelp_prefixoptions when flattening. - Each binary target should have a config struct, and should
::parse()it infn main().
This way, whenever you discover in the future that you need to add more config values for one of your small components, all you have to do is add it to the associated config struct, and it will automatically appear in every service that needs it, as many times as needed with appropriate prefixing, without you having to plumb it through every step of the way. Additionally, it makes it easier to create correct config for any future services or tools. And it causes all of your services and tools to have a similar, predictable style, and to have all of their config documented in --help, even pretty obscure environment variables and such, which usually just don't get documented if you choose to read them directly from std::env instead.
When should clap-derive be preferred to this crate?
This crate defines itself somewhat differently from clap-derive and has different features and goals.
clap-deriveis meant to be an alternative to the clap builder API, and exposes essentially all of the features of the builder.clapitself is primarily a CLI argument parser per maintainers, and many simple features aroundenvsupport, like, arguments that can only be read fromenv, are considered out of scope.
conf places emphasis on features differently.
envis actually the most important thing for a 12-factor web app.confhas a different architecture, such that its easier to pass information at runtime between astructand thestructthat it is flattened into, in both directions. This enables many of the new features that it brings to the table. Many of the details are undocumented and not part of the public API, so that they can be extended and improved without a breaking change.confhas very specific goals around error reporting. We want to return as many config errors as possible at once, because deployment might take a relatively long time.
In order to meet its goals, conf does not use clap to handle env at all. clap is only used to parse CLI arguments as strings, and to render help text, which are the two things that it is best at.
This crate can expose more features of the underlying clap builder and get closer towards the feature set offered by clap-derive, but will probably never expose all of them -- we can only expose features that we are sure will work well with the additional features that we have created, like flatten-with-prefix, and will work well with the manner in which we are using the underlying clap builder. The most interesting features are those that can be motivated by common web development needs.
If you have very specific CLI argument parsing needs, or if you need pixel-perfect help text, you will be better off using clap directly instead of this crate, because you will have more control that way. clap is the most mature and feature-complete CLI argument parser out there, by a wide margin.
In many web projects, you don't really have such needs. You aren't making very sophisticated use of clap, your project is small, and you don't particularly need any features of conf either, so you will be able to use clap-derive or conf equally well and not notice very much difference.
If you prefer, you can stick with clap-derive, and then only if you find that you need flatten-with-prefix or another feature, try to switch to conf at that point.
conf is designed to make this migration relatively easy for such projects. (Indeed, I started working on conf because I had several large projects on clap-derive and I was hitting limitations and being forced info workarounds that I wasn't happy with, and I couldn't find a wholly satsifactory alternative.) If you find that you get stuck when trying to migrate, you can open a discussion and we can try to help.
License
Code is available under MIT or Apache 2 at your option.