time_split.integration.base
Base implementations for splitting generic data types.
Users may add splitting support for any data type by implementing suitable as_available and select functions.
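Module Attributes
DataT | Type of data to split.
DataAsAvailableFn | A callable (data: DataT) -> DatetimeIterable.
DataSelectFn | A callable (data: DataT, left_inclusive: datetime, end_exclusive: datetime) -> DataT.
Functions
split_data | Base implementation for splitting integrated data types.
Classes
DatetimeSplit | Time-based split of a generic data type.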
- class DataT
Type of data to split.
alias of TypeVar('DataT')
- DataAsAvailableFn
A callable (data: DataT) -> DatetimeIterable.
alias of Callable[[DataT], Iterable[str | Timestamp | datetime | date | datetime64]]
- DataSelectFn
A callable (data: DataT, left_inclusive: datetime, end_exclusive: datetime) -> DataT.
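As a minimal sketch of what these two callables might look like, assume a hypothetical data type Records: a plain list of (timestamp, value) tuples. The names Records, records_as_available and records_select are illustrative only and are not part of time_split. The sketch only demonstrates the documented call shapes, with the half-open range left_inclusive <= t < end_exclusive.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

# Hypothetical data type used for illustration: (timestamp, value) records.
Records = List[Tuple[datetime, float]]


def records_as_available(data: Records) -> Iterable[datetime]:
    """Matches the DataAsAvailableFn shape: return the timestamps present in the data."""
    return [timestamp for timestamp, _ in data]


def records_select(
    data: Records, left_inclusive: datetime, end_exclusive: datetime
) -> Records:
    """Matches the DataSelectFn shape: keep left_inclusive <= timestamp < end_exclusive."""
    return [(t, v) for t, v in data if left_inclusive <= t < end_exclusive]


# One record per day, 2022-01-01 through 2022-01-07.
records: Records = [(datetime(2022, 1, day), float(day)) for day in range(1, 8)]

available = list(records_as_available(records))
selected = records_select(records, datetime(2022, 1, 3), datetime(2022, 1, 5))
```

Callables of this shape are what the as_available and select arguments of split_data() expect. The upper limit is exclusive, so a record stamped exactly at end_exclusive is left out of the selection.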
- class DatetimeSplit(data: DataT, future_data: DataT, bounds: DatetimeSplitBounds)
Bases: NamedTuple, Generic[DataT]
Time-based split of a generic data type.
- data: DataT
Data before the simulated training_date.
Bounded by bounds.start <= time(data) < bounds.mid.
- future_data: DataT
Data after the simulated training_date.
Bounded by bounds.mid <= time(future_data) < bounds.end.
- bounds: DatetimeSplitBounds
The underlying bounds that produced this split.
- property training_date: pandas.Timestamp
Returns the simulated training date (alias of self.bounds.mid).
- split_data(data: DataT, *, log_progress: str | bool | Logger | LoggerAdapter[Any] | LogSplitProgressKwargs[MetricsType] = False, as_available: Callable[[DataT], Iterable[str | Timestamp | datetime | date | datetime64]], select: Callable[[DataT, datetime, datetime], DataT], **kwargs: Unpack[DatetimeIndexSplitterKwargs]) -> Iterable[DatetimeSplit[DataT]]
Base implementation for splitting integrated data types.
The required as_available and select callables perform the actual integration.
- Parameters:
data – The data to split.
log_progress – Controls logging of fold progress. See log_split_progress() for details.
as_available – A callable (data: DataT) -> DatetimeIterable.
select – A callable (data: DataT, left_inclusive: datetime, end_exclusive: datetime) -> DataT.
**kwargs – Keyword arguments for the split() function.
- Yields:
Tuples (data, future_data, bounds).
See also
To get started with your own integration, copy split_pandas() or split_polars() and use it as the baseline (click [source] on the linked function).
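The relationship between the bounds and the yielded (data, future_data, bounds) tuples can be sketched without the library. This is not the implementation of split_data(): Bounds, Split and sketch_split are hypothetical stand-ins for DatetimeSplitBounds, DatetimeSplit and split_data, and the bounds are written by hand here, whereas the library derives them from the split() keyword arguments and the available timestamps.

```python
from datetime import datetime
from typing import Callable, Iterable, Iterator, List, NamedTuple, Tuple

# Hypothetical data type used for illustration: (timestamp, value) records.
Records = List[Tuple[datetime, float]]


class Bounds(NamedTuple):
    """Hypothetical stand-in for DatetimeSplitBounds."""

    start: datetime
    mid: datetime
    end: datetime


class Split(NamedTuple):
    """Hypothetical stand-in for DatetimeSplit."""

    data: Records
    future_data: Records
    bounds: Bounds


def select(data: Records, left_inclusive: datetime, end_exclusive: datetime) -> Records:
    """Keep records with left_inclusive <= timestamp < end_exclusive."""
    return [(t, v) for t, v in data if left_inclusive <= t < end_exclusive]


def sketch_split(
    data: Records,
    all_bounds: Iterable[Bounds],
    select_fn: Callable[[Records, datetime, datetime], Records],
) -> Iterator[Split]:
    """Yield one (data, future_data, bounds) tuple per fold."""
    for bounds in all_bounds:
        yield Split(
            data=select_fn(data, bounds.start, bounds.mid),  # start <= t < mid
            future_data=select_fn(data, bounds.mid, bounds.end),  # mid <= t < end
            bounds=bounds,
        )


# One record per day, 2022-01-01 through 2022-01-10.
records: Records = [(datetime(2022, 1, day), float(day)) for day in range(1, 11)]

# Hand-written bounds; assumed for illustration only.
hand_written_bounds = [
    Bounds(datetime(2022, 1, 1), datetime(2022, 1, 5), datetime(2022, 1, 7)),
    Bounds(datetime(2022, 1, 3), datetime(2022, 1, 8), datetime(2022, 1, 10)),
]

folds = list(sketch_split(records, hand_written_bounds, select))
```

Each fold selects twice from the same input, so data and future_data never overlap: a record stamped exactly at bounds.mid belongs to future_data.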