Module opendal::docs::rfcs::rfc_0112_path_normalization
source · Expand description
- Proposal Name:
path-normalization
- Start Date: 2022-03-08
- RFC PR: datafuselabs/opendal#112
- Tracking Issue: datafuselabs/opendal#112
Summary
Implement path normalization to enhance user experience.
Motivation
OpenDAL’s current path behavior makes users confused:
They are different bugs that reflect the exact root cause: the path is not well normalized.
On local fs, we can read the same path with different path: abc/def/../def
, abc/def
, abc//def
, abc/./def
.
There is no magic here: our stdlib does the dirty job. For example:
- std::path::PathBuf::canonicalize: Returns the canonical, absolute form of the path with all intermediate components normalized and symbolic links resolved.
- std::path::PathBuf::components: Produces an iterator over the Components of the path. When parsing the path, there is a small amount of normalization…
But for s3 alike storage system, there’s no such helpers: abc/def/../def
, abc/def
, abc//def
, abc/./def
refers entirely different objects. So users may confuse why I can’t get the object with this path.
So OpenDAL needs to implement path normalization to enhance the user experience.
Guide-level explanation
We will do path normalization automatically.
The following rules will be applied (so far):
- Remove
//
inside path:op.object("abc/def")
andop.object("abc//def")
will resolve to the same object. - Make sure path under
root
:op.object("/abc")
andop.object("abc")
will resolve to the same object.
Other rules still need more consideration to leave them for the future.
Reference-level explanation
We will build the absolute path via {root}/{path}
and replace all //
into /
instead.
Drawbacks
None
Rationale and alternatives
How about the link?
If we build an actual path via {root}/{path}
, the link object may be inaccessible.
I don’t have good ideas so far. Maybe we can add a new flag to control the link behavior. For now, there’s no feature request for link support.
Let’s leave for the future to resolve.
S3 URI Clean
For s3, abc//def
is different from abc/def
indeed. To make it possible to access not normalized path, we can provide a new flag for the builder:
let builder = Backend::build().disable_path_normalization()
In this way, the user can control the path more precisely.
Prior art
None
Unresolved questions
None
Future possibilities
None