# Netscape Cookie Importer Behavior
This document compares how curl, wget, and aria2 import Netscape cookie files.
All three are HTTP clients that load records into a cookie jar; none of them is
a pure record parser. Their behavior is useful compatibility context, but it
does not by itself define the default behavior of this crate.
## Summary
| Behavior | curl | wget | aria2 |
| --- | --- | --- | --- |
| Role | Imports records into libcurl's cookie jar. | Imports records into wget's cookie jar. | Imports records into aria2's cookie jar. |
| Domain field | Strips one leading dot before storing the internal domain. The tail-match field carries the wildcard-domain meaning. | Strips one leading dot before storing the internal domain. The tail-match field is stored separately as `domain_exact`. | Strips leading dots before storing the internal domain. The tail-match field becomes `hostOnly`. |
| Saved domain | Writes a leading dot back when tail matching is enabled. | Writes a leading dot back when tail matching is enabled. | Writes a leading dot back when `hostOnly` is false. |
| Path field | Sanitizes on import: strips a surrounding quote pair, converts empty or non-absolute paths to `/`, and removes one trailing slash from non-root paths. | Preserves the path field exactly after splitting the line. | Preserves the path field exactly after splitting the line. |
| Empty path field | Rejected for a standard seven-field record. Legacy records that omitted the path field are accepted with `/` as the path. | Rejected because required tab fields must be non-empty. | Rejected because `goodPath` requires a non-empty path starting with `/`. |
| Relative path field | Accepted and sanitized to `/`. | Accepted as raw data by the file loader. | Rejected by `goodPath`. |
| Trailing slash in path | `/account/` becomes `/account`. | `/account/` remains `/account/`. | `/account/` remains `/account/`. |
| `#HttpOnly_` marker | Supported and imported as HttpOnly metadata. | Not supported; `#` lines are comments. | Not supported; `#` lines are comments. |
| Missing trailing value | Accepted as an empty value. | Accepted as an empty value after the name field. | Accepted as an empty value if the split result has the other fields. |
| Legacy missing path field | Supported if the path field position looks like a boolean secure field. | Not supported. | Not supported. |
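
The `#HttpOnly_` row is easiest to see as a line-classification step that runs before any field splitting. The following is a minimal sketch, not code from any of the three clients: `LineKind`, `classify_line`, and the `honor_http_only_prefix` flag are hypothetical names chosen for illustration, and blank-line handling is left out.

```rust
/// Hypothetical result of classifying one line of a Netscape cookie file.
#[derive(Debug, PartialEq)]
enum LineKind<'a> {
    /// The line is skipped.
    Comment,
    /// The line carries a record; `body` is the text to split on tabs.
    Record { http_only: bool, body: &'a str },
}

/// Classifies a line. With `honor_http_only_prefix` set, the `#HttpOnly_`
/// prefix marks a record (the curl column above); otherwise every line that
/// starts with `#` is a comment (the wget and aria2 columns).
fn classify_line(line: &str, honor_http_only_prefix: bool) -> LineKind<'_> {
    if honor_http_only_prefix {
        if let Some(body) = line.strip_prefix("#HttpOnly_") {
            return LineKind::Record { http_only: true, body };
        }
    }
    if line.starts_with('#') {
        LineKind::Comment
    } else {
        LineKind::Record { http_only: false, body: line }
    }
}
```

The prefix check has to run before the generic `#` check; with the order reversed, every `#HttpOnly_` record would be dropped as a comment.
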
## curl
curl treats imported cookies as internal cookie jar entries. Its `Cookie` struct
documents `path` as a "canonical path", and both `Set-Cookie` header parsing and
Netscape file parsing pass paths through `sanitize_cookie_path`.
The sanitizer does three important things once a path value reaches it:
- removes one surrounding double-quote pair when present;
- converts an empty path or a path that does not start with `/` into `/`;
- removes one trailing slash from a non-root path.
That means a Netscape cookie file record containing `/account/` is stored as
`/account` after curl imports it. curl also has explicit Netscape-file support
for the Firefox/curl `#HttpOnly_` line prefix, and it has a compatibility path
for old records that omitted the path field. In that legacy case curl inserts
`/` as the missing path. A standard seven-field Netscape record with an empty
path field is not accepted because the empty field is interpreted as a
missing-path layout and the remaining fields no longer line up as a valid
record.
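
The three sanitizer steps listed above can be sketched in Rust as follows. This is an illustration of the described behavior, not a port of curl's C code; the function name `sanitize_path` is hypothetical.

```rust
/// Sketch of the sanitizer steps described above (hypothetical helper,
/// not a translation of curl's `sanitize_cookie_path`).
fn sanitize_path(raw: &str) -> String {
    // Step 1: strip one surrounding double-quote pair when present.
    let unquoted = if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
        &raw[1..raw.len() - 1]
    } else {
        raw
    };

    // Step 2: an empty path, or one that does not start with '/', becomes "/".
    if !unquoted.starts_with('/') {
        return "/".to_owned();
    }

    // Step 3: remove one trailing slash, but leave the root path "/" alone.
    match unquoted.strip_suffix('/') {
        Some(stripped) if !stripped.is_empty() => stripped.to_owned(),
        _ => unquoted.to_owned(),
    }
}
```

Under this sketch, `sanitize_path("/account/")` yields `/account`, while `sanitize_path("account")` and `sanitize_path("")` both yield `/`.
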
Relevant curl source points:
- `lib/cookie.c`: `sanitize_cookie_path`
- `lib/cookie.c`: `parse_netscape`
- `lib/cookie.c`: `get_netscape_format`
## wget
wget also imports the file into a cookie jar, but its Netscape file loader does
not sanitize the path. It splits the line with a helper that rejects empty
required fields and stores `cookie->path` with `strdupdelim(path_b, path_e)`.
The only normalization it applies is to strip one leading dot from the domain
before storing the internal domain.
The tail-match field is kept as cookie metadata. When wget writes the jar back,
it emits a leading dot again for cookies that are not exact-domain cookies.
Consequences:
- `/account/` remains `/account/`;
- an empty path field is rejected;
- a relative path field is preserved by the file loader;
- `#HttpOnly_` records are treated as comments because lines beginning with `#`
  are skipped before field parsing.
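
The domain handling described above (strip one leading dot on load, write it back on save for non-exact cookies) can be sketched as a small round trip. This is an illustration only: `StoredDomain`, `import_domain`, and `export_domain` are hypothetical names, and the sketch keeps a `tail_match` flag rather than modelling wget's own `domain_exact` field.

```rust
/// Hypothetical in-memory form of an imported domain field.
#[derive(Debug, PartialEq)]
struct StoredDomain {
    /// Domain with at most one leading dot removed.
    domain: String,
    /// The tail-match field from the file, kept as separate metadata.
    tail_match: bool,
}

/// Load side: strip one leading dot and keep the flag alongside the domain.
fn import_domain(field: &str, tail_match: bool) -> StoredDomain {
    let domain = field.strip_prefix('.').unwrap_or(field).to_owned();
    StoredDomain { domain, tail_match }
}

/// Save side: write the leading dot back only for tail-matching cookies.
fn export_domain(stored: &StoredDomain) -> String {
    if stored.tail_match {
        format!(".{}", stored.domain)
    } else {
        stored.domain.clone()
    }
}
```

One consequence of this representation is that the round trip is not byte-preserving: a record written as `example.com` with tail matching enabled is saved as `.example.com`.
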
Relevant wget source points:
- `src/cookies.c`: `cookie_jar_load`
- `src/cookies.c`: `cookie_jar_save`
- `src/cookies.c`: `path_matches`
## aria2
aria2's `NsCookieParser` imports Netscape records into `Cookie` objects. It
splits on tabs, strips leading dots from the domain, validates the path with
`goodPath`, and then stores the path field directly with `setPath`.
`goodPath` only checks that the field is non-empty and begins with `/`. It does
not trim quotes, does not remove a trailing slash, and does not normalize path
segments.
Consequences:
- `/account/` remains `/account/`;
- an empty path field is rejected;
- a relative path field is rejected;
- `#HttpOnly_` records are treated as comments because lines beginning with `#`
  are skipped before Netscape parsing.
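
The validate-then-preserve policy described above is small enough to sketch directly. The Rust below mirrors the description of `goodPath` given in this section rather than aria2's C++ source; `good_path` and `import_path` are hypothetical names.

```rust
/// Sketch of the check described above: non-empty and starting with '/'.
/// The emptiness test is redundant with `starts_with`, but it is kept so the
/// two stated conditions stay visible.
fn good_path(path: &str) -> bool {
    !path.is_empty() && path.starts_with('/')
}

/// Returns the path field unchanged when it passes the check, or `None` when
/// the record should be rejected. No quotes or slashes are trimmed.
fn import_path(field: &str) -> Option<&str> {
    good_path(field).then_some(field)
}
```

Under this sketch, `/account/` is returned as-is, while an empty field, `account`, and a quoted `"/account"` are all rejected.
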
Relevant aria2 source points:
- `src/NsCookieParser.cc`: `parseNsCookie`
- `src/cookie_helper.cc`: `goodPath`
- `src/cookie_helper.cc`: `pathMatch`
- `src/Cookie.cc`: `toNsCookieFormat`
## Parser API Implication
The three clients agree that a cookie jar importer may normalize domain storage
while keeping the tail-match flag as separate metadata. They do not agree on
path import policy:
- curl canonicalizes imported paths;
- wget preserves imported paths;
- aria2 preserves imported paths after requiring them to be absolute.
For this crate, the importer survey supports removing one leading domain dot by
default while keeping the tail-match field as separate metadata. That is the
common representation used by curl, wget, and aria2 when storing imported
cookies internally.
The default parser also follows curl for legacy records that omitted the path
field: when the path position looks like a boolean secure field, `/` is inserted
as the path before parsing continues. This compatibility rule is narrow and does
not otherwise canonicalize ordinary path fields.
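
A minimal sketch of this legacy rule follows. It is not this crate's implementation: `fields_with_legacy_path` and `looks_like_bool` are hypothetical names, and the sketch assumes that "looks like a boolean" means the literal strings `TRUE` or `FALSE`. It also ignores records with a missing trailing value and does not validate the path itself.

```rust
/// Assumption for this sketch: a boolean field is exactly "TRUE" or "FALSE".
fn looks_like_bool(field: &str) -> bool {
    field == "TRUE" || field == "FALSE"
}

/// Splits a record on tabs and returns seven fields, inserting "/" as the
/// path when the line uses the legacy six-field layout
/// (domain, tail-match, secure, expiry, name, value).
fn fields_with_legacy_path(line: &str) -> Option<Vec<String>> {
    let mut fields: Vec<String> = line.split('\t').map(str::to_owned).collect();

    // In the legacy layout the secure flag sits where the path would be.
    if fields.len() == 6 && looks_like_bool(&fields[2]) {
        fields.insert(2, "/".to_owned());
    }

    // Ordinary path fields are passed through untouched.
    if fields.len() == 7 {
        Some(fields)
    } else {
        None
    }
}
```

With this sketch, `example.com\tTRUE\tFALSE\t0\tname\tvalue` gains `/` as its third field, while a standard record with the path `/account/` keeps that path unchanged.
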
Path canonicalization is different because it can change cookie path-match
semantics. This crate preserves the path field and leaves curl-style path
normalization to caller-side post-processing when an application wants that
importer policy.