perspective_client

Struct Table

pub struct Table { /* private fields */ }

Table is Perspective’s columnar data frame, analogous to a pandas DataFrame or an Apache Arrow table, supporting appends and in-place updates, removal by index, and update notifications.

A Table contains columns, each of which has a unique name, is strongly and consistently typed, and contains rows of data conforming to the column’s type. Each column in a Table must have the same number of rows, though not every row must contain data; null values are used to indicate missing values in the dataset.

The schema of a Table is immutable after creation, which means the column names and data types cannot be changed after the Table has been created. Columns cannot be added or deleted after creation either, but a View can be used to select an arbitrary set of columns from the Table.

The examples in this module are in JavaScript and Python. See the Perspective docs for the Rust API.

§Schema and Types

The mapping of a Table’s column names to data types is referred to as a schema. Each column has a unique name and a single data type:

// JavaScript
const schema = {
    x: "integer",
    y: "string",
    z: "boolean",
};

const table2 = await worker.table(schema);

# Python
schema = {
    "x": "integer",
    "y": "string",
    "z": "boolean",
}

table2 = perspective.Table(schema)

// Rust
let data = TableData::Schema(vec![("a".to_string(), ColumnType::FLOAT)]);
let options = TableInitOptions::default();
let table = client.table(data.into(), options).await?;

When passing data directly to the crate::Client::table constructor, the type of each column is inferred automatically. In some cases, the inference algorithm may not return exactly what you’d like. For example, a column may be interpreted as a datetime when you intended it to be a string, or a column may have no values at all (yet), as it will be updated with values from a real-time data source later on. In these cases, create a table() with a schema.

Once the Table has been created with a schema, further update() calls will no longer perform type inference, so columns must only include values supported by the column’s ColumnType.

§Data Formats

A Table may also be created or updated from data in CSV, Apache Arrow, row-oriented JSON, or column-oriented JSON formats.

In addition to these core formats, perspective-python supports pyarrow.Table and pandas.DataFrame objects directly. These formats are otherwise identical to the built-in formats and don’t exhibit any additional support or type-awareness; e.g., pandas.DataFrame support is just pyarrow.Table.from_pandas piped into Perspective’s Arrow reader.

crate::Client::table and Table::update perform coercion on their input for all input formats except Arrow (which comes with its own schema and has no need for coercion).

The "date" and "datetime" column types have no native JSON representation, so they cannot be inferred from JSON input. To load JSON into columns of these types, first construct a Table with a schema, then call Table::update with the JSON input. Perspective’s JSON reader may coerce a date or datetime from these native JSON types:

  • integer as milliseconds-since-epoch.
  • string as any of Perspective’s built-in date formats.
  • JavaScript Date and Python datetime.date and datetime.datetime are not supported directly. However, in JavaScript, Date values are automatically coerced to the correct integer timestamps by default when converted to JSON.
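
The two coercion paths listed above can be sketched in plain Rust. This is only an illustration, not Perspective’s JSON reader: JsonScalar and coerce_to_epoch_millis are hypothetical names, and the sketch assumes that "YYYY-MM-DD" is among the accepted string formats, which this page does not confirm.

```rust
/// A JSON scalar that may carry a date or datetime (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum JsonScalar {
    Integer(i64),
    Text(String),
}

const MILLIS_PER_DAY: i64 = 86_400_000;

/// Days between 1970-01-01 and the given proleptic Gregorian date.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as the last months of the previous year,
    // so that the leap day falls at the end of the shifted year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = if month > 2 { month - 3 } else { month + 9 };
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Coerces a JSON scalar to milliseconds since the Unix epoch.
pub fn coerce_to_epoch_millis(value: &JsonScalar) -> Option<i64> {
    match value {
        // Integers are taken as milliseconds-since-epoch, unchanged.
        JsonScalar::Integer(ms) => Some(*ms),
        // Assumption: this sketch handles only the "YYYY-MM-DD" format.
        JsonScalar::Text(text) => {
            let mut parts = text.split('-');
            let year: i64 = parts.next()?.parse().ok()?;
            let month: i64 = parts.next()?.parse().ok()?;
            let day: i64 = parts.next()?.parse().ok()?;
            let in_range = (1..=12).contains(&month) && (1..=31).contains(&day);
            if parts.next().is_some() || !in_range {
                return None;
            }
            Some(days_from_civil(year, month, day) * MILLIS_PER_DAY)
        }
    }
}
```

Under these assumptions, the integer 86_400_000 and the string "1970-01-02" coerce to the same timestamp, while a string that is not a date yields None.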

For CSV input types, Perspective relies on Apache Arrow’s CSV parser, and as such uses the same column-type inference logic as Arrow itself.

§Index and Limit

Initializing a Table with an index tells Perspective to treat a column as the primary key, allowing in-place updates of rows. Only a single column (of any type) can be used as an index. Indexed Table instances allow:

  • In-place updates whenever a new row shares an index value with an existing row.
  • Partial updates when a data batch omits some columns.
  • Removal of rows by index.

To create an indexed Table, provide the index property with a string column name to be used as an index:

// JavaScript
const indexed_table = await perspective.table(data, { index: "a" });

# Python
indexed_table = perspective.Table(data, index="a")

Initializing a Table with a limit sets the total number of rows the Table is allowed to have. When the Table is updated, and the resulting size of the Table would exceed its limit, rows that exceed limit overwrite the oldest rows in the Table. To create a Table with a limit, provide the limit property with an integer indicating the maximum rows:

// JavaScript
const limit_table = await perspective.table(data, { limit: 1000 });

# Python
limit_table = perspective.Table(data, limit=1000)
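
The overwrite-the-oldest behaviour of limit can be pictured as a ring buffer. The sketch below is a toy model in plain Rust, not Perspective’s storage engine; LimitedRows and its methods are hypothetical names introduced only for illustration.

```rust
/// A toy row store with a fixed capacity (hypothetical type, not part of
/// Perspective), illustrating how rows past the limit overwrite the oldest.
pub struct LimitedRows<T> {
    rows: Vec<T>,
    limit: usize,
    /// Slot that the next row will be written to once the store is full.
    next: usize,
}

impl<T> LimitedRows<T> {
    pub fn new(limit: usize) -> Self {
        assert!(limit > 0, "limit must be positive");
        Self { rows: Vec::with_capacity(limit), limit, next: 0 }
    }

    /// Appends a row, overwriting the oldest row once `limit` is reached.
    pub fn push(&mut self, row: T) {
        if self.rows.len() < self.limit {
            self.rows.push(row);
        } else {
            self.rows[self.next] = row;
        }
        self.next = (self.next + 1) % self.limit;
    }

    /// Number of rows currently held; never exceeds `limit`.
    pub fn len(&self) -> usize {
        self.rows.len()
    }

    /// Rows ordered from oldest to newest.
    pub fn oldest_first(&self) -> Vec<&T> {
        if self.rows.len() < self.limit {
            self.rows.iter().collect()
        } else {
            self.rows[self.next..]
                .iter()
                .chain(self.rows[..self.next].iter())
                .collect()
        }
    }
}
```

Pushing the values 1 through 5 into a LimitedRows with a limit of 3 leaves three rows, holding 3, 4 and 5 in oldest-first order. Whether Perspective exposes overwritten slots in this order is not stated here.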

§Table::update and Table::remove

Once a Table has been created, it can be updated with new data conforming to the Table’s schema. Table::update supports the same data formats as crate::Client::table, except for a schema.

// JavaScript
const schema = {
    a: "integer",
    b: "float",
};

const table = await perspective.table(schema);
await table.update(new_data);

# Python
schema = {"a": "integer", "b": "float"}

table = perspective.Table(schema)
table.update(new_data)

Without an index set, calls to update() append new data to the end of the Table. Otherwise, Perspective allows partial updates (in-place) using the index to determine which rows to update:

// JavaScript
indexed_table.update({ id: [1, 4], name: ["x", "y"] });

# Python
indexed_table.update({"id": [1, 4], "name": ["x", "y"]})
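
The difference between appending and updating in place can be modelled with a map from index value to row position. The following is a minimal sketch in plain Rust under that assumption; ToyTable is a hypothetical two-column stand-in (id, name), not the Table type documented on this page.

```rust
use std::collections::HashMap;

/// A toy two-column table (`id`, `name`). Hypothetical type for illustration.
pub struct ToyTable {
    indexed: bool,
    rows: Vec<(i32, String)>,
    /// Maps an index value to the position of its row (used only if indexed).
    positions: HashMap<i32, usize>,
}

impl ToyTable {
    pub fn new(indexed: bool) -> Self {
        Self { indexed, rows: Vec::new(), positions: HashMap::new() }
    }

    /// Applies a batch: in place for known index values, appended otherwise.
    pub fn update(&mut self, batch: &[(i32, &str)]) {
        for &(id, name) in batch {
            if self.indexed {
                if let Some(&pos) = self.positions.get(&id) {
                    // An existing row shares this index value: overwrite it.
                    self.rows[pos].1 = name.to_string();
                    continue;
                }
                self.positions.insert(id, self.rows.len());
            }
            // No index, or a new index value: append to the end.
            self.rows.push((id, name.to_string()));
        }
    }

    pub fn rows(&self) -> &[(i32, String)] {
        &self.rows
    }
}
```

With indexed set to false, two updates carrying the same id produce two rows; with indexed set to true, the second update rewrites the first row instead.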

Any value in a Table can be unset using the value null in JSON or Arrow input formats. Values may also be unset on construction, as any null in the dataset is treated as an unset value. Table::update calls do not need to provide every column in the Table’s schema; missing columns are omitted from the Table’s updated rows.

// JavaScript
table.update([{ x: 3, y: null }]); // `z` missing

# Python
table.update([{"x": 3, "y": None}])  # `z` missing
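
An explicit null and a missing column are different signals. One way to picture the distinction for a partial update on an indexed Table is a three-state field, sketched here in plain Rust. Field, Row, Patch and apply are hypothetical names, and the sketch assumes x is the index column; it is not how Perspective represents updates internally.

```rust
/// One field of an update: absent from the batch, explicitly null, or a value.
#[derive(Clone, Debug, PartialEq)]
pub enum Field<T> {
    Missing,
    Null,
    Value(T),
}

/// A stored row; every non-index column is nullable.
#[derive(Clone, Debug, PartialEq)]
pub struct Row {
    pub x: i32, // assumed to be the index column
    pub y: Option<String>,
    pub z: Option<bool>,
}

/// A partial update addressed to the row whose index equals `x`.
pub struct Patch {
    pub x: i32,
    pub y: Field<String>,
    pub z: Field<bool>,
}

fn merge<T>(slot: &mut Option<T>, field: Field<T>) {
    match field {
        Field::Missing => {}                  // column omitted: keep the old value
        Field::Null => *slot = None,          // explicit null: unset the value
        Field::Value(v) => *slot = Some(v),   // new value: overwrite
    }
}

/// Applies `patch` to `row`; returns false if the index values differ.
pub fn apply(row: &mut Row, patch: Patch) -> bool {
    if row.x != patch.x {
        return false;
    }
    merge(&mut row.y, patch.y);
    merge(&mut row.z, patch.z);
    true
}
```

Applying a patch with y set to Field::Null and z set to Field::Missing mirrors the update shown above: y is unset and z keeps its previous value.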

Rows can also be removed from an indexed Table, by calling Table::remove with an array of index values:

// JavaScript
indexed_table.remove([1, 4]);

# Python
indexed_table.remove([1, 4])

§Table::clear and Table::replace

Calling Table::clear will remove all data from the underlying Table. Calling Table::replace with new data will clear the Table, and update it with a new dataset that conforms to Perspective’s data types and the existing schema on the Table.

// JavaScript
await table.clear();
await table.replace(json);

# Python
table.clear()
table.replace(df)

Note that `limit` cannot be used in conjunction with `index`.

Implementations§

impl Table

pub fn get_client(&self) -> Client

Get a copy of the Client this Table came from.

pub fn get_features(&self) -> ClientResult<Features>

Get a metadata dictionary of the perspective_server::Server’s features. The dictionary is (currently) implementation-specific, but there is only one implementation.

pub fn get_index(&self) -> Option<String>

Returns the name of the index column for the table.

§JavaScript Examples
const table = await client.table("x,y\n1,2\n3,4", { index: "x" });
const index = table.get_index(); // "x"
§Python Examples
table = client.table("x,y\n1,2\n3,4", index="x")
index = table.get_index() # "x"
§Examples
let options = TableInitOptions { index: Some("x".to_string()), ..Default::default() };
let data = TableData::Update(UpdateData::Csv("x,y\n1,2\n3,4".into()));
let table = client.table(data, options).await?;
let index = table.get_index(); // Some("x".to_string())

pub fn get_limit(&self) -> Option<u32>

Returns the user-specified row limit for this table.

pub fn get_name(&self) -> &str

Returns the name of this Table.

pub async fn clear(&self) -> ClientResult<()>

Removes all the rows in the Table, but preserves everything else including the schema, index, and any callbacks or registered View instances.

Calling Table::clear, like Table::update and Table::remove, will trigger an update event to any registered listeners via View::on_update.

pub async fn delete(&self) -> ClientResult<()>

Deletes this Table and cleans up associated resources, assuming it has no View instances registered to it (which must be deleted first).

Tables do not stop consuming resources or processing updates when they are garbage collected in their host language; you must call this method to reclaim these resources.

§JavaScript Examples
const table = await client.table("x,y\n1,2\n3,4");

// ...

await table.delete();
§Python Examples
table = client.table("x,y\n1,2\n3,4")

# ...

table.delete()
§Examples
let opts = TableInitOptions::default();
let data = TableData::Update(UpdateData::Csv("x,y\n1,2\n3,4".into()));
let table = client.table(data, opts).await?;

// ...

table.delete().await?;

pub async fn columns(&self) -> ClientResult<Vec<String>>

Returns the column names of this Table in “natural” order (the ordering implied by the input format).

§JavaScript Examples
const columns = await table.columns();
§Python Examples
columns = table.columns()
§Examples
let columns = table.columns().await?;

pub async fn size(&self) -> ClientResult<usize>

Returns the number of rows in a Table.

pub async fn schema(&self) -> ClientResult<HashMap<String, ColumnType>>

Returns a table’s Schema, a mapping of column names to column types.

Each column has a unique name and a data type, one of:

  • "boolean" - A boolean type
  • "date" - A timezone-agnostic date type (month/day/year)
  • "datetime" - A millisecond-precision datetime type in the UTC timezone
  • "float" - A 64-bit float
  • "integer" - A signed 32-bit integer (the integer type supported by JavaScript)
  • "string" - A string data type (encoded internally as a dictionary)

Note that all Table columns are nullable, regardless of the data type.

pub async fn make_port(&self) -> ClientResult<i32>

Create a unique channel ID on this Table, which allows View::on_update callback calls to be associated with the Table::update which caused them.

pub async fn on_delete(&self, on_delete: Box<dyn Fn() + Send + Sync + 'static>) -> ClientResult<u32>

Register a callback which is called exactly once, when this Table is deleted with the Table::delete method.

Table::on_delete resolves when the subscription message is sent, not when the delete event occurs.

pub async fn remove_delete(&self, callback_id: u32) -> ClientResult<()>

Removes a listener with a given ID, as returned by a previous call to Table::on_delete.
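
The pairing of Table::on_delete and Table::remove_delete follows a common register-by-ID pattern. The sketch below models that pattern in plain, synchronous Rust so it can be read in isolation. DeleteListeners is a hypothetical type; only the callback bound is borrowed from the on_delete signature, and the real methods are async and talk to a server.

```rust
use std::collections::HashMap;

/// A toy registry of delete callbacks keyed by ID (hypothetical type).
pub struct DeleteListeners {
    next_id: u32,
    callbacks: HashMap<u32, Box<dyn Fn() + Send + Sync + 'static>>,
}

impl DeleteListeners {
    pub fn new() -> Self {
        Self { next_id: 0, callbacks: HashMap::new() }
    }

    /// Registers a callback and returns the ID needed to remove it later.
    pub fn on_delete(&mut self, callback: Box<dyn Fn() + Send + Sync + 'static>) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.callbacks.insert(id, callback);
        id
    }

    /// Removes a previously registered callback; returns whether it existed.
    pub fn remove_delete(&mut self, callback_id: u32) -> bool {
        self.callbacks.remove(&callback_id).is_some()
    }

    /// Fires every remaining callback. Draining the map means a second call
    /// finds nothing left to fire, so each callback runs at most once.
    pub fn delete(&mut self) {
        for (_, callback) in self.callbacks.drain() {
            callback();
        }
    }
}
```

A callback that was removed by ID before delete is never invoked, and calling delete twice does not invoke the remaining callbacks a second time.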

pub async fn remove(&self, input: UpdateData) -> ClientResult<()>

Removes rows from this Table with the index column values supplied.

§Arguments
  • input - A list of index column values for rows that should be removed.
§JavaScript Examples
await table.remove([1, 2, 3]);
§Python Examples
tbl = Table({"a": [1, 2, 3]}, index="a")
tbl.remove([2, 3])
§Examples
table.remove(UpdateData::Csv("index\n1\n2\n3".into())).await?;

pub async fn replace(&self, input: UpdateData) -> ClientResult<()>

Replace all rows in this Table with the input data, coerced to this Table’s existing Schema, notifying any derived View and View::on_update callbacks.

Calling Table::replace is an easy way to replace all the data in a Table without losing any derived View instances or View::on_update callbacks. Table::replace does not infer data types like Client::table does, rather it coerces input data to the Schema like Table::update. If you need a Table with a different Schema, you must create a new one.

§JavaScript Examples
await table.replace("x,y\n1,2");
§Python Examples
table.replace("x,y\n1,2")
§Examples
let data = UpdateData::Csv("x,y\n1,2".into());
table.replace(data).await?;

pub async fn update(&self, input: UpdateData, options: UpdateOptions) -> ClientResult<()>

Updates the rows of this table and any derived View instances.

Calling Table::update will trigger the View::on_update callbacks registered to derived View instances, and the call itself will not resolve until all derived View instances are notified.

When updating a Table with an index, Table::update supports partial updates, by omitting columns from the update data.

§Arguments
  • input - The input data for this Table. The schema of a Table is immutable after creation, so this method cannot be called with a schema.
  • options - Options for this update step - see UpdateOptions.
§JavaScript Examples
await table.update("x,y\n1,2");
§Python Examples
table.update("x,y\n1,2")
§Examples
let data = UpdateData::Csv("x,y\n1,2".into());
let opts = UpdateOptions::default();
table.update(data, opts).await?;

pub async fn validate_expressions(&self, expressions: Expressions) -> ClientResult<ValidateExpressionsData>

Validates the given expressions.

§Python Examples
exprs = table.validate_expressions({"computed": '"Quantity" + 4'})

pub async fn view(&self, config: Option<ViewConfigUpdate>) -> ClientResult<View>

Create a new View from this table with a specified ViewConfigUpdate.

See View struct.

§JavaScript Examples
const view = await table.view({
    columns: ["Sales"],
    aggregates: { Sales: "sum" },
    group_by: ["Region", "Country"],
    filter: [["Category", "in", ["Furniture", "Technology"]]],
});
§Python Examples
view = table.view(
  columns=["Sales"],
  aggregates={"Sales": "sum"},
  group_by=["Region", "Country"],
  filter=[["Category", "in", ["Furniture", "Technology"]]]
)
§Examples
use crate::config::*;
let view = table
    .view(Some(ViewConfigUpdate {
        columns: Some(vec![Some("Sales".into())]),
        aggregates: Some(HashMap::from_iter(vec![("Sales".into(), "sum".into())])),
        group_by: Some(vec!["Region".into(), "Country".into()]),
        filter: Some(vec![Filter::new("Category", "in", &[
            "Furniture",
            "Technology",
        ])]),
        ..ViewConfigUpdate::default()
    }))
    .await?;

Trait Implementations§

impl Clone for Table

fn clone(&self) -> Table

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

Auto Trait Implementations§

impl Freeze for Table

impl !RefUnwindSafe for Table

impl Send for Table

impl Sync for Table

impl Unpin for Table

impl !UnwindSafe for Table

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut T)

This is a nightly-only experimental API (clone_to_uninit). Performs copy-assignment from self to dst.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.