perspective_python

Struct Table

pub struct Table(/* private fields */);

Table is Perspective’s columnar data frame, analogous to a pandas DataFrame or an Apache Arrow table, supporting appends and in-place updates, removal by index, and update notifications.

A Table contains columns, each of which has a unique name, is strongly and consistently typed, and contains rows of data conforming to the column’s type. Each column in a Table must have the same number of rows, though not every row must contain data; null values are used to indicate missing values in the dataset.

The schema of a Table is immutable after creation, which means the column names and data types cannot be changed after the Table has been created. Columns cannot be added or deleted after creation either, but a View can be used to select an arbitrary set of columns from the Table.

The examples in this module are in JavaScript and Python. See perspective docs for the Rust API.

§Schema and Types

The mapping of a Table’s column names to data types is referred to as a schema. Each column has a unique name and a single data type:

var schema = {
    x: "integer",
    y: "string",
    z: "boolean",
};

const table2 = await worker.table(schema);

from datetime import date, datetime

schema = {
    "x": "integer",
    "y": "string",
    "z": "boolean",
}

table2 = perspective.Table(schema)

let data = TableData::Schema(vec![(" a".to_string(), ColumnType::FLOAT)]);
let options = TableInitOptions::default();
let table = client.table(data.into(), options).await?;

When passing data directly to the crate::Client::table constructor, the type of each column is inferred automatically. In some cases, the inference algorithm may not return exactly what you’d like. For example, a column may be interpreted as a datetime when you intended it to be a string, or a column may have no values at all (yet), as it will be updated with values from a real-time data source later on. In these cases, create a table() with a schema.

Once the Table has been created with a schema, further update() calls will no longer perform type inference, so columns must only include values supported by the column’s [ColumnType].

§Data Formats

A Table may also be created or updated from data in CSV, Apache Arrow, row-oriented JSON, or column-oriented JSON formats.

In addition to these core formats, perspective-python supports pyarrow.Table and pandas.DataFrame objects directly. These formats are otherwise identical to the built-in formats and don’t exhibit any additional support or type-awareness; e.g., pandas.DataFrame support is just pyarrow.Table.from_pandas piped into Perspective’s Arrow reader.

crate::Client::table and Table::update perform coercion on their input for all input formats except Arrow (which comes with its own schema and has no need for coercion).

"date" and "datetime" column types do not have native JSON representations, so these column types cannot be inferred from JSON input. Instead, for columns of these types for JSON input, a Table must first be constructed with a schema. Next, call Table::update with the JSON input - Perspective’s JSON reader may coerce a date or datetime from these native JSON types:

  • integer as milliseconds-since-epoch.
  • string as any of Perspective’s built-in date formats.
  • JavaScript Date and Python datetime.date and datetime.datetime are not supported directly. However, in JavaScript, Date values are automatically coerced to the correct integer timestamps by default when converted to JSON.
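
Since Python datetime.date and datetime.datetime values are not accepted directly in JSON input, one option is to convert them to integer milliseconds-since-epoch before serializing. The sketch below uses only the standard library. The helper name `to_epoch_ms` is hypothetical and not part of Perspective, and it assumes that naive datetimes and plain dates should be read as UTC.

```python
import json
from datetime import date, datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_ms(value):
    """Convert a date or datetime to integer milliseconds since the Unix epoch."""
    if isinstance(value, datetime):
        # Assumption: a naive datetime is interpreted as UTC.
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
    elif isinstance(value, date):
        # Assumption: a plain date is taken as midnight UTC on that day.
        value = datetime(value.year, value.month, value.day, tzinfo=timezone.utc)
    else:
        raise TypeError(f"expected date or datetime, got {type(value).__name__}")
    # Integer arithmetic avoids float rounding at the millisecond boundary.
    delta = value - EPOCH
    return delta.days * 86_400_000 + delta.seconds * 1000 + delta.microseconds // 1000

rows = [{"when": to_epoch_ms(datetime(2024, 1, 1, 12, 0)), "label": "noon"}]
payload = json.dumps(rows)  # JSON text containing only native JSON types
```

The resulting `payload` could then be passed to an update on a Table whose schema declares `when` as "datetime".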

For CSV input types, Perspective relies on Apache Arrow’s CSV parser, and as such uses the same column-type inference logic as Arrow itself.

§Row Oriented JSON

Row-oriented JSON is in the form of a list of objects. Each object in the list corresponds to a row in the table. For example:

[
    { "a": 86, "b": false, "c": "words" },
    { "a": 0, "b": true, "c": "" },
    { "a": 12345, "b": false, "c": "here" }
]

§Column Oriented JSON

Column-Oriented JSON comes in the form of an object of lists. Each key of the object is a column name, and each element of the list is the corresponding value in the row.

{
    "a": [86, 0, 12345],
    "b": [false, true, false],
    "c": ["words", "", "here"]
}
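
The two JSON layouts carry the same information, so converting between them is mechanical. The following is a minimal standard-library sketch; the function names `rows_to_columns` and `columns_to_rows` are hypothetical helpers, not Perspective APIs, and filling absent keys with None is an assumption of this sketch.

```python
def rows_to_columns(rows):
    """Turn a list of row dicts into a dict of column lists."""
    columns = {}
    for i, row in enumerate(rows):
        for name, value in row.items():
            # Back-fill None for rows that appeared before this column did.
            columns.setdefault(name, [None] * i).append(value)
        # Pad columns this row did not mention, so all columns stay equal length.
        for values in columns.values():
            if len(values) <= i:
                values.append(None)
    return columns

def columns_to_rows(columns):
    """Turn a dict of equal-length column lists into a list of row dicts."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]

rows = [
    {"a": 86, "b": False, "c": "words"},
    {"a": 0, "b": True, "c": ""},
    {"a": 12345, "b": False, "c": "here"},
]
columns = rows_to_columns(rows)
```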

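§NDJSON

NDJSON (newline-delimited JSON) is in the form of one JSON object per line. Each line corresponds to a row in the table. For example: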

{ "a": 86, "b": false, "c": "words" }
{ "a": 0, "b": true, "c": "" }
{ "a": 12345, "b": false, "c": "here" }
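
NDJSON text can be produced from, or split back into, row-oriented data with the standard `json` module. This is a small sketch with hypothetical helper names (`to_ndjson`, `parse_ndjson`); it does not reflect how Perspective's own reader is implemented.

```python
import json

def to_ndjson(rows):
    """Serialize a list of row dicts as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)

def parse_ndjson(text):
    """Parse NDJSON text into a list of row dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

rows = [
    {"a": 86, "b": False, "c": "words"},
    {"a": 0, "b": True, "c": ""},
]
text = to_ndjson(rows)
```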

§Index and Limit

Initializing a Table with an index tells Perspective to treat a column as the primary key, allowing in-place updates of rows. Only a single column (of any type) can be used as an index. Indexed Table instances allow:

  • In-place updates whenever a new row shares an index value with an existing row.
  • Partial updates when a data batch omits some columns.
  • Removal of rows by index.

To create an indexed Table, provide the index property with a string column name to be used as an index:

const indexed_table = await perspective.table(data, { index: "a" });

indexed_table = perspective.Table(data, index="a")

Initializing a Table with a limit sets the total number of rows the Table is allowed to have. When the Table is updated, and the resulting size of the Table would exceed its limit, rows that exceed limit overwrite the oldest rows in the Table. To create a Table with a limit, provide the limit property with an integer indicating the maximum rows:

const limit_table = await perspective.table(data, { limit: 1000 });

limit_table = perspective.Table(data, limit=1000)
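
As a rough mental model of `limit`, a fixed-size buffer keeps only the most recent rows. The sketch below uses `collections.deque` from the standard library and models only which rows survive. It is not Perspective's implementation, and it says nothing about the order in which a limited Table stores its rows.

```python
from collections import deque

limit = 3
buffer = deque(maxlen=limit)  # once full, appending discards the oldest entry

for row in ({"x": i} for i in range(5)):
    buffer.append(row)

# Only the newest `limit` rows remain in the buffer.
surviving = [row["x"] for row in buffer]
```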

§Table::update and Table::remove

Once a Table has been created, it can be updated with new data conforming to the Table’s schema. Table::update supports the same data formats as crate::Client::table, minus schema.

const schema = {
    a: "integer",
    b: "float",
};

const table = await perspective.table(schema);
table.update(new_data);

schema = {"a": "integer", "b": "float"}

table = perspective.Table(schema)
table.update(new_data)

Without an index set, calls to update() append new data to the end of the Table. Otherwise, Perspective allows partial updates (in-place) using the index to determine which rows to update:

indexed_table.update({ id: [1, 4], name: ["x", "y"] });

indexed_table.update({"id": [1, 4], "name": ["x", "y"]})

Any value in a Table can be unset using the value null in JSON or Arrow input formats. Values may be unset on construction, as any null in the dataset will be treated as an unset value. Table::update calls do not need to provide all columns in the Table’s schema; missing columns will be omitted from the Table’s updated rows.

table.update([{ x: 3, y: null }]); // `z` missing

table.update([{"x": 3, "y": None}])  # `z` missing

Rows can also be removed from an indexed Table, by calling Table::remove with an array of index values:

indexed_table.remove([1, 4]);

indexed_table.remove([1, 4])
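
The semantics described in this section (in-place updates matched on the index, partial updates that leave omitted columns alone, null to unset a value, and removal by index) can be modelled with a plain dictionary. `IndexedRows` below is a hypothetical toy class for illustration only; it is not how Perspective stores data.

```python
class IndexedRows:
    """Toy model of an indexed table: one row dict per index value."""

    def __init__(self, index):
        self.index = index
        self.rows = {}  # index value -> row dict

    def update(self, columns):
        """Apply column-oriented data; rows are matched on the index column."""
        for i, key in enumerate(columns[self.index]):
            row = self.rows.setdefault(key, {self.index: key})
            for name, values in columns.items():
                # Columns absent from `columns` are never touched (partial update);
                # an explicit None overwrites the stored value (unset).
                row[name] = values[i]

    def remove(self, keys):
        """Delete the rows whose index value is in `keys`."""
        for key in keys:
            self.rows.pop(key, None)

table = IndexedRows(index="id")
table.update({"id": [1, 2, 4], "name": ["a", "b", "c"], "score": [10, 20, 40]})
table.update({"id": [1, 4], "name": ["x", "y"]})  # partial, in place
after_partial = {key: dict(row) for key, row in table.rows.items()}

table.update({"id": [2], "score": [None]})  # unset a single value
table.remove([1, 4])
```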

§Table::clear and Table::replace

Calling Table::clear will remove all data from the underlying Table. Calling Table::replace with new data will clear the Table, and update it with a new dataset that conforms to Perspective’s data types and the existing schema on the Table.

table.clear();
table.replace(json);

table.clear()
table.replace(df)

Note that `limit` cannot be used in conjunction with `index`.

Implementations§

Source§

impl Table

Source

pub fn get_index(&self, py: Python<'_>) -> Option<String>

Returns the name of the index column for the table.

§JavaScript Examples
const table = await client.table("x,y\n1,2\n3,4", { index: "x" });
const index = table.get_index(); // "x"
§Python Examples
table = client.table("x,y\n1,2\n3,4", index="x")
index = table.get_index() # "x"
§Examples
let options = TableInitOptions {index: Some("x".to_string()), ..default() };
let table = client.table("x,y\n1,2\n3,4", options).await;
let tables = client.open_table("table_one").await;
Source

pub fn get_client(&self, py: Python<'_>) -> Client

Get a copy of the Client this Table came from.

Source

pub fn get_limit(&self, py: Python<'_>) -> Option<u32>

Returns the row limit of this Table, if one was set when the Table was created.

Source

pub fn get_name(&self, py: Python<'_>) -> String

Returns the name of this Table.

Source

pub fn clear(&self, py: Python<'_>) -> PyResult<()>

Removes all the rows in the Table, but preserves everything else including the schema, index, and any callbacks or registered View instances.

Calling Table::clear, like Table::update and Table::remove, will trigger an update event to any registered listeners via View::on_update.

Source

pub fn columns(&self, py: Python<'_>) -> PyResult<Vec<String>>

Returns the column names of this Table in “natural” order (the ordering implied by the input format).

§JavaScript Examples
const columns = await table.columns();
§Python Examples
columns = table.columns()
§Examples
let columns = table.columns().await;
Source

pub fn delete(&self, py: Python<'_>) -> PyResult<()>

Deletes this Table and cleans up associated resources, assuming it has no View instances registered to it (which must be deleted first).

Tables do not stop consuming resources or processing updates when they are garbage collected in their host language - you must call this method to reclaim these.

§JavaScript Examples
const table = await client.table("x,y\n1,2\n3,4");

// ...

await table.delete();
§Python Examples
table = client.table("x,y\n1,2\n3,4")

# ...

table.delete()
§Examples
let opts = TableInitOptions::default();
let data = TableData::Update(UpdateData::Csv("x,y\n1,2\n3,4".into()));
let table = client.table(data, opts).await?;

// ...

table.delete().await?;
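
Because a Table is not cleaned up by garbage collection, it can help to tie the `delete()` call to a scope. The context manager below is a hedged sketch built on `contextlib`; `deleting` and `FakeResource` are hypothetical names used for illustration, and the only assumption about the wrapped objects is that they expose a `delete()` method.

```python
from contextlib import contextmanager

@contextmanager
def deleting(*resources):
    """Yield `resources`, then call .delete() on each in the order given."""
    try:
        yield resources
    finally:
        for resource in resources:
            resource.delete()

class FakeResource:
    """Stand-in for a View or Table; records the order of delete() calls."""
    log = []

    def __init__(self, name):
        self.name = name

    def delete(self):
        FakeResource.log.append(self.name)

# Views are listed before the Table they derive from, so they are deleted first.
with deleting(FakeResource("view"), FakeResource("table")) as (view, table):
    pass  # work with view and table here
```

In this sketch, an exception raised by one `delete()` call skips the remaining ones; a more careful version might collect errors and continue.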
Source

pub fn make_port(&self, py: Python<'_>) -> PyResult<i32>

Create a unique channel ID on this Table, which allows View::on_update callback calls to be associated with the Table::update which caused them.

Source

pub fn on_delete(&self, py: Python<'_>, callback: Py<PyAny>) -> PyResult<u32>

Register a callback which is called exactly once, when this Table is deleted with the Table::delete method.

Table::on_delete resolves when the subscription message is sent, not when the delete event occurs.

Source

pub fn remove( &self, py: Python<'_>, input: Py<PyAny>, format: Option<String>, ) -> PyResult<()>

Removes rows from this Table with the index column values supplied.

§Arguments
  • input - A list of index column values for rows that should be removed.
§JavaScript Examples
await table.remove([1, 2, 3]);
§Python Examples
tbl = Table({"a": [1, 2, 3]}, index="a")
tbl.remove([2, 3])
§Examples
table.remove(UpdateData::Csv("index\n1\n2\n3")).await?;
Source

pub fn remove_delete(&self, py: Python<'_>, callback_id: u32) -> PyResult<()>

Removes a listener with a given ID, as returned by a previous call to Table::on_delete.

Source

pub fn schema(&self, py: Python<'_>) -> PyResult<HashMap<String, String>>

Returns a table’s Schema, a mapping of column names to column types.

The mapping of a Table’s column names to data types is referred to as a Schema. Each column has a unique name and a data type, one of:

  • "boolean" - A boolean type
  • "date" - A timezone-agnostic date type (month/day/year)
  • "datetime" - A millisecond-precision datetime type in the UTC timezone
  • "float" - A 64 bit float
  • "integer" - A signed 32 bit integer (the integer type supported by JavaScript)
  • "string" - A String data type (encoded internally as a dictionary)

Note that all Table columns are nullable, regardless of the data type.
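
A schema dictionary can be sanity-checked against the type names listed above before it is handed to a Table. The helpers below are a standard-library sketch with hypothetical names (`invalid_columns`, `fits_integer`). They check only the type-name spelling and the signed 32-bit range of "integer", and do not reproduce Perspective's coercion rules.

```python
VALID_TYPES = {"boolean", "date", "datetime", "float", "integer", "string"}
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def invalid_columns(schema):
    """Return the column names whose declared type is not a known type name."""
    return sorted(name for name, type_name in schema.items()
                  if type_name not in VALID_TYPES)

def fits_integer(value):
    """True if `value` is null or an int within the signed 32-bit range."""
    if value is None:  # every column is nullable
        return True
    # bool is a subclass of int in Python, so exclude it explicitly.
    return (isinstance(value, int) and not isinstance(value, bool)
            and INT32_MIN <= value <= INT32_MAX)

bad = invalid_columns({"x": "integer", "y": "text", "z": "boolean"})
```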

Source

pub fn validate_expressions( &self, py: Python<'_>, expression: Py<PyAny>, ) -> PyResult<Py<PyAny>>

Validates the given expressions.

§Python Examples
exprs = table.validate_expressions({"computed": '"Quantity" + 4'})
Source

pub fn view(&self, py: Python<'_>, config: Option<Py<PyDict>>) -> PyResult<View>

Create a new View from this table with a specified ViewConfigUpdate.

See View struct.

§JavaScript Examples
const view = await table.view({
    columns: ["Sales"],
    aggregates: { Sales: "sum" },
    group_by: ["Region", "Country"],
    filter: [["Category", "in", ["Furniture", "Technology"]]],
});
§Python Examples
view = table.view(
  columns=["Sales"],
  aggregates={"Sales": "sum"},
  group_by=["Region", "Country"],
  filter=[["Category", "in", ["Furniture", "Technology"]]]
)
§Examples
use crate::config::*;
let view = table
    .view(Some(ViewConfigUpdate {
        columns: Some(vec![Some("Sales".into())]),
        aggregates: Some(HashMap::from_iter(vec![("Sales".into(), "sum".into())])),
        group_by: Some(vec!["Region".into(), "Country".into()]),
        filter: Some(vec![Filter::new("Category", "in", &[
            "Furniture",
            "Technology",
        ])]),
        ..ViewConfigUpdate::default()
    }))
    .await?;
Source

pub fn size(&self, py: Python<'_>) -> PyResult<usize>

Returns the number of rows in a Table.

Source

pub fn replace( &self, py: Python<'_>, input: Py<PyAny>, format: Option<String>, ) -> PyResult<()>

Replaces all rows in this Table with the input data, and updates any derived View instances.

Calling Table::replace clears the Table and then updates it with a new dataset that conforms to Perspective’s data types and the existing schema on the Table.

§Arguments
  • input - The input data for this Table. The schema of a Table is immutable after creation, so this method cannot be called with a schema.
§JavaScript Examples
await table.replace("x,y\n1,2");
§Python Examples
table.replace("x,y\n1,2")
§Examples
let data = UpdateData::Csv("x,y\n1,2".into());
table.replace(data).await?;
Source

pub fn update( &self, py: Python<'_>, input: Py<PyAny>, port_id: Option<u32>, format: Option<String>, ) -> PyResult<()>

Updates the rows of this table and any derived View instances.

Calling Table::update will trigger the View::on_update callbacks registered to derived View instances, and the call itself will not resolve until all derived View instances are notified.

When updating a Table with an index, Table::update supports partial updates, by omitting columns from the update data.

§Arguments
  • input - The input data for this Table. The schema of a Table is immutable after creation, so this method cannot be called with a schema.
  • options - Options for this update step - see UpdateOptions.
§JavaScript Examples
await table.update("x,y\n1,2");
§Python Examples
table.update("x,y\n1,2")
§Examples
let data = UpdateData::Csv("x,y\n1,2".into());
let opts = UpdateOptions::default();
table.update(data, opts).await?;

Trait Implementations§

Source§

impl HasPyGilRef for Table

Source§

type AsRefTarget = PyCell<Table>

Utility type to make Py::as_ref work.
Source§

impl IntoPy<Py<PyAny>> for Table

Source§

fn into_py(self, py: Python<'_>) -> PyObject

Performs the conversion.
Source§

impl PyClass for Table

Source§

type Frozen = False

Whether the pyclass is frozen.
Source§

impl PyClassImpl for Table

Source§

const IS_BASETYPE: bool = true

#[pyclass(subclass)]
Source§

const IS_SUBCLASS: bool = false

#[pyclass(extends=…)]
Source§

const IS_MAPPING: bool = false

#[pyclass(mapping)]
Source§

const IS_SEQUENCE: bool = false

#[pyclass(sequence)]
Source§

type BaseType = PyAny

Base class
Source§

type ThreadChecker = SendablePyClass<Table>

This handles following two situations: Read more
Source§

type PyClassMutability = <<PyAny as PyClassBaseType>::PyClassMutability as PyClassMutability>::MutableChild

Immutable or mutable
Source§

type Dict = PyClassDummySlot

Specify this class has #[pyclass(dict)] or not.
Source§

type WeakRef = PyClassDummySlot

Specify this class has #[pyclass(weakref)] or not.
Source§

type BaseNativeType = PyAny

The closest native ancestor. This is PyAny by default, and when you declare #[pyclass(extends=PyDict)], it’s PyDict.
Source§

fn items_iter() -> PyClassItemsIter

Source§

fn doc(py: Python<'_>) -> PyResult<&'static CStr>

Rendered class doc
Source§

fn lazy_type_object() -> &'static LazyTypeObject<Self>

Source§

fn dict_offset() -> Option<isize>

Source§

fn weaklist_offset() -> Option<isize>

Source§

impl PyClassNewTextSignature<Table> for PyClassImplCollector<Table>

Source§

fn new_text_signature(self) -> Option<&'static str>

Source§

impl<'a, 'py> PyFunctionArgument<'a, 'py> for &'a Table

Source§

type Holder = Option<PyRef<'py, Table>>

Source§

fn extract( obj: &'a Bound<'py, PyAny>, holder: &'a mut Self::Holder, ) -> PyResult<Self>

Source§

impl<'a, 'py> PyFunctionArgument<'a, 'py> for &'a mut Table

Source§

type Holder = Option<PyRefMut<'py, Table>>

Source§

fn extract( obj: &'a Bound<'py, PyAny>, holder: &'a mut Self::Holder, ) -> PyResult<Self>

Source§

impl PyMethods<Table> for PyClassImplCollector<Table>

Source§

fn py_methods(self) -> &'static PyClassItems

Source§

impl PyTypeInfo for Table

Source§

const NAME: &'static str = "Table"

Class name.
Source§

const MODULE: Option<&'static str>

Module name, if any.
Source§

fn type_object_raw(py: Python<'_>) -> *mut PyTypeObject

Returns the PyTypeObject instance for this type.
Source§

fn type_object(py: Python<'_>) -> &PyType

👎Deprecated since 0.21.0: PyTypeInfo::type_object will be replaced by PyTypeInfo::type_object_bound in a future PyO3 version
Returns the safe abstraction over the type object.
Source§

fn type_object_bound(py: Python<'_>) -> Bound<'_, PyType>

Returns the safe abstraction over the type object.
Source§

fn is_type_of(object: &PyAny) -> bool

👎Deprecated since 0.21.0: PyTypeInfo::is_type_of will be replaced by PyTypeInfo::is_type_of_bound in a future PyO3 version
Checks if object is an instance of this type or a subclass of this type.
Source§

fn is_type_of_bound(object: &Bound<'_, PyAny>) -> bool

Checks if object is an instance of this type or a subclass of this type.
Source§

fn is_exact_type_of(object: &PyAny) -> bool

👎Deprecated since 0.21.0: PyTypeInfo::is_exact_type_of will be replaced by PyTypeInfo::is_exact_type_of_bound in a future PyO3 version
Checks if object is an instance of this type.
Source§

fn is_exact_type_of_bound(object: &Bound<'_, PyAny>) -> bool

Checks if object is an instance of this type.
Source§

impl DerefToPyAny for Table

Auto Trait Implementations§

§

impl Freeze for Table

§

impl !RefUnwindSafe for Table

§

impl Send for Table

§

impl Sync for Table

§

impl Unpin for Table

§

impl !UnwindSafe for Table

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> PyErrArguments for T
where T: IntoPy<Py<PyAny>> + Send + Sync,

Source§

fn arguments(self, py: Python<'_>) -> Py<PyAny>

Arguments for exception
Source§

impl<T> PyTypeCheck for T
where T: PyTypeInfo,

Source§

const NAME: &'static str = <T as PyTypeInfo>::NAME

Name of self. This is used in error messages, for example.
Source§

fn type_check(object: &Bound<'_, PyAny>) -> bool

Checks if object is an instance of Self, which may include a subtype.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.
Source§

impl<T> Ungil for T
where T: Send,