pub struct PythonTorchDatasetConfig {
pub root: PathBuf,
pub dataset_id: DatasetId,
pub dataset_view_id: DatasetViewId,
pub source_uri: String,
pub format: String,
pub manifest_hash: ContentId,
pub preprocessing_hash: ContentId,
pub tokenizer_hash: Option<ContentId>,
pub sizing: DatasetSizing,
pub planner: MicroShardPlannerConfig,
pub microshards_per_batch: usize,
pub metadata: BTreeMap<String, String>,
}
Declares the shard-backed dataset view exposed to the p2p runtime.
Fields§
root: PathBuf
Root containing fetch-manifest.json and shard files.
dataset_id: DatasetId
Stable dataset id.
dataset_view_id: DatasetViewId
Stable dataset view id.
source_uri: String
Source URI surfaced in dataset metadata.
format: String
Dataset format tag.
manifest_hash: ContentId
Dataset manifest hash.
preprocessing_hash: ContentId
Preprocessing hash for the view.
tokenizer_hash: Option<ContentId>
Optional tokenizer hash.
sizing: DatasetSizing
Dataset sizing used to plan microshards.
planner: MicroShardPlannerConfig
Planner config used to derive microshard ids.
microshards_per_batch: usize
Number of cached microshards grouped into one Python batch ref.
metadata: BTreeMap<String, String>
Arbitrary dataset metadata propagated into the registration.
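To make microshards_per_batch concrete, the sketch below groups a list of cached microshard ids into batch-sized chunks. It is a std-only illustration, not the crate's implementation: `MicroShardId` is a stand-in `u64` alias, `group_into_batches` is a hypothetical helper, and letting the final batch be shorter than the others is an assumption.

```rust
/// Stand-in for the crate's microshard id type (assumption: the real type is not shown here).
type MicroShardId = u64;

/// Hypothetical helper: groups cached microshard ids into batches of at most
/// `microshards_per_batch` entries, preserving their order.
/// Returns `None` when `microshards_per_batch` is zero, because `chunks(0)` would panic.
fn group_into_batches(
    cached: &[MicroShardId],
    microshards_per_batch: usize,
) -> Option<Vec<Vec<MicroShardId>>> {
    if microshards_per_batch == 0 {
        return None;
    }
    Some(
        cached
            .chunks(microshards_per_batch)
            .map(|chunk| chunk.to_vec())
            .collect(),
    )
}
```

With five cached ids and microshards_per_batch set to 2, this helper yields three batches, the last holding a single id. Whether the real runtime pads, drops, or keeps a short trailing batch is not stated in this page.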
Implementations§
impl PythonTorchDatasetConfig
pub fn registration(&self) -> DatasetRegistration
Returns a local-upstream dataset registration.
pub fn plan(&self) -> Result<MicroShardPlan>
Plans the microshards for this dataset view.
Trait Implementations§
impl Clone for PythonTorchDatasetConfig
fn clone(&self) -> PythonTorchDatasetConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for PythonTorchDatasetConfig
impl<'de> Deserialize<'de> for PythonTorchDatasetConfig
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations§
impl Freeze for PythonTorchDatasetConfig
impl RefUnwindSafe for PythonTorchDatasetConfig
impl Send for PythonTorchDatasetConfig
impl Sync for PythonTorchDatasetConfig
impl Unpin for PythonTorchDatasetConfig
impl UnsafeUnpin for PythonTorchDatasetConfig
impl UnwindSafe for PythonTorchDatasetConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CanonicalSchema for T
where
    T: Serialize,
fn to_cbor_vec(&self) -> Result<Vec<u8>, SchemaError>
Serializes the value into canonical CBOR bytes.
fn content_id(&self) -> Result<ContentId, SchemaError>
Computes the canonical content identifier for the value.
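A content identifier is only useful if equal values always encode to the same bytes. As a small, std-only illustration of one ingredient of that property, the sketch below shows that a `BTreeMap<String, String>` (the type of the metadata field) iterates in sorted key order regardless of insertion order. The helpers `metadata_from` and `metadata_lines` and the example keys are hypothetical, and the "key=value" output is not canonical CBOR; how the crate actually encodes maps is not described on this page.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: builds a metadata map from pairs inserted in the given order.
fn metadata_from(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Hypothetical helper: flattens metadata into "key=value" lines in iteration order.
/// This is NOT canonical CBOR; it only exposes the map's iteration order.
fn metadata_lines(metadata: &BTreeMap<String, String>) -> Vec<String> {
    metadata.iter().map(|(k, v)| format!("{k}={v}")).collect()
}
```

Two maps built from the same pairs in different insertion orders produce identical lines here, which is the kind of stability a deterministic encoder can build on.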