time_split.app.reexport#

Reexported resources from the time_split_app namespace.

Running python -m time_split app new will create a Dockerfile and a simple my_extensions.py module.

Environment variables:

Subset of variables used to customize Docker-based deployments.

DATASET_LOADER#: A DataLoaderWidget instance or type, e.g. my_extensions:MyDatasetLoader.

SPLIT_SELECT_FN#: A SelectSplitParams-function, e.g. my_extensions:my_select_fn.

PLOT_FN#: A plot()-like function, e.g. my_extensions:my_plot_fn.

LINK_FN#: A create_explorer_link()-like function, e.g. my_extensions:my_link_fn.

See rsundqvist/time-split-app for more variable-based configuration options.
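The `module:attribute` values above follow the familiar entry-point style. How the app resolves them is not described on this page; the following is only a minimal sketch of that convention, and `resolve_spec` is a hypothetical helper rather than part of time_split.

```python
import importlib
from typing import Any


def resolve_spec(spec: str) -> Any:
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, sep, attribute = spec.partition(":")
    if not sep or not module_name or not attribute:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Demonstrated with a stdlib module; a deployment would instead use a value
# such as "my_extensions:my_select_fn".
dumps = resolve_spec("json:dumps")
print(dumps({"ok": True}))
```

A malformed value such as `my_extensions.my_select_fn` (no colon) fails early with a `ValueError` in this sketch, which makes typos in environment variables easier to spot.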

Module Attributes

`SelectSplitParams`	Type of `SPLIT_SELECT_FN`.
`PlotFn`	Type of `PLOT_FN`.
`LinkFn`	Type of `LINK_FN`.

Functions

`load_dataset`(config)	Load dataset config.
`load_dataset_configs`(file)	Read dataset configs from file.
`select_datetime`()	Prompt user to select datetime.
`select_duration`(label, *[, horizontal])	See `DurationWidget.select()`.

Classes

`DataLoaderWidget`()	Load or generate datasets that require user input.
`Dataset`(*, label, path, index, aggregations, ...)	A loaded preconfigured dataset.
`DatasetConfig`(*, label, path, index, ...)	Configuration type for datasets on disk.
`DurationWidget`([default_unit, units, ...])	Duration specified by unit and count.
`QueryParams`(*[, schedule, step, n_splits, ...])	Parameters which may be passed as arguments in the URL.

Exceptions

`DuplicateIndexError`(df[, head])	Error raised when unaggregated data is detected.

class DataLoaderWidget[source]#

Bases: ABC

Load or generate datasets that require user input.

abstract get_title() → str[source]#: Title shown in the ⚙️ Configure data menu. Uses Markdown syntax.

abstract get_description() → str[source]#: Brief description shown in the ⚙️ Configure data menu. Uses Markdown syntax.

abstract load(params: bytes | None) → tuple[DataFrame, dict[str, str], bytes] | DataFrame[source]#

Load data.

Note

This method will be called many times due to the Streamlit data model.

You may want to use @streamlit.cache_data or @streamlit.cache_resource to improve performance. See https://docs.streamlit.io/develop/concepts/architecture/caching for more information.

Parameters:

params – Parameter preset as bytes. Handling is implementation-specific.

Returns:

A pandas DataFrame or a tuple (data, aggregations, params), where params is a bytes object that may be passed back to load() to recreate the returned frame.
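Because handling of params is implementation-specific, one option is to serialize the widget state so that a later load() call can rebuild the same frame from the returned bytes. A minimal sketch, assuming JSON as the encoding; `DEFAULTS`, `decode_params` and `encode_params` are hypothetical names, and building the DataFrame itself is left out:

```python
import json
from typing import Optional

# Hypothetical widget state for a loader that generates synthetic data.
DEFAULTS = {"start": "2019-05-01T00:00:00", "periods": 48, "freq_minutes": 30}


def decode_params(params: Optional[bytes]) -> dict:
    """Return a complete parameter dict; None means no preset was given."""
    if params is None:
        return dict(DEFAULTS)
    return {**DEFAULTS, **json.loads(params.decode("utf-8"))}


def encode_params(values: dict) -> bytes:
    """Serialize parameters to bytes; sorted keys keep the output stable."""
    return json.dumps(values, sort_keys=True).encode("utf-8")


# Round trip: the bytes returned from one call reproduce the same settings.
first = decode_params(None)
first["periods"] = 96
preset = encode_params(first)
restored = decode_params(preset)
print(restored["periods"])
```

Merging the preset over the defaults means that presets created before a new setting was introduced still decode to a complete dict.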

classmethod select_range(initial: tuple[datetime, datetime] | tuple[date, date] | None = None, *, date_only: Literal[False] = False, start_options: Collection[Literal['absolute', 'relative', 'now']] | None = None, end_options: Collection[Literal['absolute', 'relative', 'now']] | None = None) → tuple[datetime, datetime][source]#

classmethod select_range(initial: tuple[datetime, datetime] | tuple[date, date] | None = None, *, date_only: Literal[True], start_options: Collection[Literal['absolute', 'relative', 'now']] | None = None, end_options: Collection[Literal['absolute', 'relative', 'now']] | None = None) → tuple[date, date]

Support method for getting user date range input.

Parameters:

initial – Initial range used by the widget.
date_only – If True, disable the time selector and return dates.
start_options – Start options to make available to the user. Default = all.
end_options – End options to make available to the user. Default = all.

Returns:

A tuple (start, end).

Raises:

TypeError – If start_options or end_options are invalid.

class Dataset(*, label: str, path: str, index: str = '__INDEX__', aggregations: dict[str, str] = <factory>, description: str = '', read_function_kwargs: dict[str, ~typing.Any] = <factory>, df: ~pandas.core.frame.DataFrame)[source]#

Bases: DatasetConfig

A loaded preconfigured dataset.

df: DataFrame#

class DatasetConfig(*, label: str, path: str, index: str = '__INDEX__', aggregations: dict[str, str] = <factory>, description: str = '', read_function_kwargs: dict[str, ~typing.Any] = <factory>)[source]#

Bases: object

Configuration type for datasets on disk.

label: str#

Name shown in the UI (Markdown).

When using load_dataset_configs(), this will default to the section header.

path: str#: Dataset path. May be prefixed for remote paths, e.g. s3://my-bucket/my-data.csv.zip.

index: str = '__INDEX__'#

Index column. Must be datetime-like.

Use '__INDEX__' if the dataset already has a suitable index.

aggregations: dict[str, str]#: Default column aggregations. Users may override these in the UI.

description: str = ''#: A longer dataset description for the UI (Markdown). The first row will be used as a summary.

read_function_kwargs: dict[str, Any]#

Keyword arguments for the read function, which is derived from path, e.g. pandas.read_csv().

The path is always passed as a positional argument in the first position.

exception DuplicateIndexError(df: DataFrame, head: int = 5)[source]#

Bases: Exception

Error raised when unaggregated data is detected.

property samples: DataFrame#: Sample data with duplicated index values.

property n_duplicated: int#: Total number of duplicated index values.

property n_total: int#: Total number of rows in the original frame.

class DurationWidget(default_unit: str | None = None, units: Iterable[str] = ('days', 'hours', 'minutes'), default_periods: dict[str, int] | None = None)[source]#

Bases: object

Duration specified by unit and count.

Parameters:

default_unit – Default unit. Must be in units. Default is units[0].
units – An iterable of permitted units.
default_periods – Default period counts per unit.

select(label: str, *, horizontal: bool = False) → timedelta[source]#

Prompt user to select a duration.

Parameters:

label – Label to show.
horizontal – If True, show elements side-by-side.

Returns:

A timedelta.
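The default units ('days', 'hours', 'minutes') happen to be valid keyword arguments of `datetime.timedelta`, so a unit and a count map directly to a duration. The sketch below illustrates that mapping only; `to_timedelta` is a hypothetical helper, and how the widget builds its result internally is an assumption.

```python
from datetime import timedelta

UNITS = ("days", "hours", "minutes")  # the documented default for `units`


def to_timedelta(unit: str, count: int) -> timedelta:
    """Combine a unit name and a count into a timedelta."""
    if unit not in UNITS:
        raise ValueError(f"unit must be one of {UNITS}, got {unit!r}")
    return timedelta(**{unit: count})


print(to_timedelta("hours", 36))  # 1 day, 12:00:00
```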

LinkFn#

Type of LINK_FN.

alias of Callable[[…], str]

PlotFn#

Type of PLOT_FN.

alias of Callable[[…], Axes]

class QueryParams(*[, schedule, step, n_splits, ...])[source]#

Bases: object

Parameters which may be passed as arguments in the URL.

For example http://localhost:8501/?n_splits=3&step=3&show_removed=true will give

>>> QueryParams(step=3, n_splits=3, show_removed=True)

http://localhost:8501/?n_splits=3&step=3&show_removed=true&dataset=data.json.gzip

These are later used to set the initial values in various widgets.
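To make the mapping from URL to typed values concrete, here is a standalone sketch built on `urllib.parse`. It is not the library's own parser: the `CONVERTERS` table and `parse_query` are hypothetical, and only a few of the documented fields are covered.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical subset of fields and the converter assumed for each.
CONVERTERS = {
    "step": int,
    "n_splits": int,
    "show_removed": lambda text: text.lower() == "true",
    "schedule": str,
}


def parse_query(url: str) -> dict:
    """Extract known query parameters from a URL and convert their types."""
    raw = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list of values; keep the last one given.
    return {
        key: CONVERTERS[key](values[-1])
        for key, values in raw.items()
        if key in CONVERTERS
    }


params = parse_query("http://localhost:8501/?n_splits=3&step=3&show_removed=true")
print(params)  # {'n_splits': 3, 'step': 3, 'show_removed': True}
```

Unknown keys are silently dropped in this sketch; whether the app ignores or rejects them is not stated here.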

schedule: str | None = None#

step: int | None = None#

n_splits: int | None = None#

before: str | None = None#

after: str | None = None#

expand_limits: bool | str | None = None#

show_removed: bool | None = None#

data: int | str | bytes | tuple[datetime, datetime] | None = None#

Data selection.

If an int or str, it is assumed to refer to a DatasetConfig.label, either by index or by the label itself. Labels are normalized using normalize_dataset().

May also be a tuple of UNIX timestamps, specified in the form <start>-<stop>, e.g. 1556668800-1557606600 for a range ('2019-05-01T00:00:00z', '2019-05-11T20:30:00z'). Tuples are converted using convert_timestamps(). Note that timestamps are coerced into 5-minute increments as naive UTC timestamps.

If bytes, these are assumed to be the parameters of a custom dataset widget; the bytes will be forwarded to the load()-method of the implementation.

classmethod normalize_dataset(data: str) → str[source]#: Normalize a dataset label.

classmethod convert_timestamps(start: int, end: int, *, utc: bool = False) → tuple[datetime, datetime][source]#

Convert a pair of UNIX timestamps into datetime instances.

Parameters:

start – Start of the range.
end – End of the range.
utc – If True, tz-aware UTC instances are returned.

Returns:

A tuple of two timestamps.
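The `<start>-<stop>` form described for `data` can be worked through with the standard library alone. This is a sketch, not the library's implementation: `parse_range` is a hypothetical helper, and rounding down to the 5-minute grid is an assumption (the documentation only says timestamps are coerced into 5-minute increments).

```python
from datetime import datetime, timezone

STEP = 5 * 60  # 5-minute increments, in seconds


def parse_range(text: str, *, utc: bool = False) -> tuple:
    """Parse '<start>-<stop>' UNIX timestamps into a pair of datetimes."""
    start_text, _, stop_text = text.partition("-")
    result = []
    for raw in (start_text, stop_text):
        seconds = int(raw) - int(raw) % STEP  # assumption: round down
        moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
        # Naive UTC by default; keep the tzinfo only when utc=True.
        result.append(moment if utc else moment.replace(tzinfo=None))
    return tuple(result)


start, stop = parse_range("1556668800-1557606600")
print(start.isoformat(), stop.isoformat())
# 2019-05-01T00:00:00 2019-05-11T20:30:00
```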

classmethod get() → Self[source]#: Get the session query parameters object.

classmethod set() → Self[source]#: Set the session query parameters object.

to_dict(prefix: str = '', filter: bool = True) → dict[str, int | bool | str][source]#

Return self as a dict, optionally without None values.

Parameters:

prefix – Key prefix.
filter – If True, remove None values.

Returns:

Dict representation of self.
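The effect of the two arguments can be illustrated with a stand-in dataclass. `ParamsSketch` is hypothetical and carries only a few of the documented fields; prepending `prefix` to each key is an assumption about how the real method uses it.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ParamsSketch:
    """Stand-in with a few of the documented fields; not the real class."""

    step: Optional[int] = None
    n_splits: Optional[int] = None
    show_removed: Optional[bool] = None

    def to_dict(self, prefix: str = "", filter: bool = True) -> dict:
        """Return fields as a dict, optionally dropping None values."""
        return {
            f"{prefix}{key}": value
            for key, value in asdict(self).items()
            if not (filter and value is None)
        }


params = ParamsSketch(step=3, show_removed=True)
print(params.to_dict())              # {'step': 3, 'show_removed': True}
print(params.to_dict(prefix="q."))   # {'q.step': 3, 'q.show_removed': True}
print(params.to_dict(filter=False))  # also includes 'n_splits': None
```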

classmethod make(**kwargs: Any) → Self[source]#: Construct a new instance from keyword arguments.

SelectSplitParams#

Type of SPLIT_SELECT_FN.

A callable () -> split_params; see time_split.split() and DatetimeIndexSplitterKwargs.

alias of Callable[[], DatetimeIndexSplitterKwargs]
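A function of this type takes no arguments and returns a mapping of split parameters. The sketch below returns fixed values instead of prompting the user; the key names and values are assumptions based on the keyword arguments of time_split.split(), so check DatetimeIndexSplitterKwargs for the accepted set.

```python
from typing import Any, Dict


def my_select_fn() -> Dict[str, Any]:
    """Return fixed split parameters instead of prompting the user."""
    return {
        "schedule": "0 0 * * MON",  # assumed: cron-style weekly schedule
        "before": "14d",            # assumed: data range before each fold date
        "after": "7d",              # assumed: data range after each fold date
        "n_splits": 4,
    }


split_params = my_select_fn()
print(sorted(split_params))
```

Placed in my_extensions.py, such a function would be referenced as `my_extensions:my_select_fn` in SPLIT_SELECT_FN.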

load_dataset(config: DatasetConfig) → Dataset[source]#

Load dataset config.

Parameters:

config – A dataset configuration object.

Returns:

A Dataset instance.

load_dataset_configs(file: str | PathLike[str] | Path) → list[DatasetConfig][source]#

Read dataset configs from file.

Returns one config object per top-level section in file.

Parameters:

file – Path to a TOML file.

Returns:

A list of dataset configs.

select_datetime(label: str, initial: datetime | date | None = None, *, header: bool = True, date_only: bool = False, disabled: bool = False) → datetime | date[source]#

Prompt user to select datetime.

Parameters:

label – Widget label.
initial – Initial date. Pass None for current time.
header – If True, show label in a Streamlit header.
date_only – If True, disable the time selector and return plain datetime.date instances.
disabled – If True, disable user input to the Streamlit widget.

Returns:

A datetime or date.

select_duration(label: str, *, horizontal: bool = False) → timedelta[source]#: See DurationWidget.select().