vtools.data package

Submodules

vtools.data.dst module

Daylight Saving Time Conversion

This module provides the function dst_st() for converting a pandas Series or DataFrame with a naive DatetimeIndex that observes daylight saving time (DST) to a fixed standard time zone (e.g., PST) using POSIX conventions.

See the automatic API documentation for details: vtools.data.dst.dst_st()

dst_st(ts, src_tz: str = 'US/Pacific', target_tz: str = 'Etc/GMT+8')[source]

Convert a pandas Series with a datetime index from a timezone-unaware index that observes DST (e.g., US/Pacific) to a fixed standard time zone (e.g., Etc/GMT+8) using POSIX conventions.

Parameters:
tspandas.Series

Time series with a naive (timezone-unaware) DatetimeIndex.

src_tzstr, optional

Source timezone name (default is ‘US/Pacific’).

target_tzstr, optional

Target standard timezone name (default is ‘Etc/GMT+8’).

Returns:
pandas.Series

Time series with index converted to the target standard timezone and made naive.

Notes

  • The function assumes the index is not already timezone-aware.

  • ‘Etc/GMT+8’ is the correct tz name for UTC-8 (PST) in pytz; note the sign is reversed from what might be expected.

  • Handles ambiguous/nonexistent times due to DST transitions.

  • The returned index is naive (timezone-unaware) but represents the correct standard time.

  • If the input index is already timezone-aware, this function will raise an error.

Examples

>>> import pandas as pd
>>> from vtools import dst_st
>>> rng = pd.date_range("2023-11-05 00:00", "2023-11-05 04:00", freq="30min")
>>> ts = pd.Series(range(len(rng)), index=rng)
>>> converted = dst_st(ts)
>>> print(converted)
2023-11-05 00:00:00    0
2023-11-05 00:30:00    1
2023-11-05 01:00:00    2
2023-11-05 01:30:00    3
2023-11-05 02:30:00    5
2023-11-05 03:00:00    6
2023-11-05 03:30:00    7
2023-11-05 04:00:00    8
dtype: int64
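The conversion above can be sketched with plain pandas timezone operations. This is an illustrative equivalent, not necessarily the actual vtools implementation; a summer date is used so that no stamp falls in the ambiguous fall-back hour.

```python
import pandas as pd

# Sketch: localize the naive index to the DST-observing zone, convert to
# the fixed standard zone, then drop the timezone again.
rng = pd.date_range("2023-07-01 12:00", periods=3, freq="1h")  # summer: PDT
ts = pd.Series([0, 1, 2], index=rng)

out = ts.copy()
out.index = (
    ts.index.tz_localize("US/Pacific")   # interpret stamps as local clock time
            .tz_convert("Etc/GMT+8")     # fixed UTC-8 standard time (PST)
            .tz_localize(None)           # back to a naive index
)
print(out.index[0])  # 2023-07-01 11:00:00 -- noon PDT is 11:00 PST
```

Around DST transitions the real function must also resolve ambiguous and nonexistent times, which this sketch sidesteps.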

vtools.data.gap module

describe_null(dset, name, context=2)[source]

If dset is a DataFrame, run describe_series_gaps on each column. If it’s a Series, just run it once.

describe_series_gaps(s: Series, name: str, context: int = 2)[source]

Print gaps in a single Series s, showing context non-null points before and after each gap, with an ellipsis marker in between.

example_gap()[source]
gap_count(ts, state='gap', dtype=<class 'int'>)[source]

Count missing data. Identifies gaps (runs of missing or non-missing data) and quantifies the length of each run in number of samples, which works best for regular series. Each time point receives the length of the run it belongs to.

Parameters:
tsDataFrame

Time series to analyze

statestr one of ‘gap’|’good’|’both’

State to count. If state is ‘gap’, the block size of missing data is counted and reported for time points in the gap (every point in a given gap receives the same value); non-missing data receive a size of zero. Setting state to ‘good’ inverts this: missing blocks are reported as zero and runs of good data are counted.

dtypestr or type

Data type of output, should be acceptable to pandas astype
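The run-length idea behind gap_count can be sketched with a standard pandas grouping trick. The helper name `run_lengths` is hypothetical, for illustration only; the real vtools function may differ in details.

```python
import numpy as np
import pandas as pd

def run_lengths(s, state="gap"):
    """Assign each point the length of the NaN (or non-NaN) run it sits in."""
    isna = s.isna()
    # A new block starts whenever the null/non-null state changes.
    block = (isna != isna.shift()).cumsum()
    sizes = isna.groupby(block).transform("size")
    if state == "gap":
        return sizes.where(isna, 0).astype(int)   # non-missing points -> 0
    return sizes.where(~isna, 0).astype(int)      # state == "good"

s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 6.0])
print(run_lengths(s).tolist())  # [0, 2, 2, 0, 1, 0]
```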

gap_distance(ts, disttype='count', to='good')[source]

For each element of ts, compute the distance to the nearest good or bad data.

Parameters:
tsDataFrame
Time series to analyze
disttypestr one of ‘count’|’freq’
If disttype=“count” the distance is a number of values. If disttype=“freq” it is in the units of ts.freq
(so if freq == “15min” it is in minutes).
tostr one of ‘bad’|’good’
If to=“good” this is the distance to the nearest good data (which is 0 for good data).
If to=“bad”, this is the distance to the nearest nan (which is 0 for nan).
Returns:
resultDataFrame

A new regular time series with the same freq as the argument, holding the distance to good/bad data.
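The count-distance computation can be sketched with NumPy searches over the positions of good data. The helper name `dist_to_good` is hypothetical and assumes at least one non-NaN value; it mirrors the description of to=“good”, not the actual vtools code.

```python
import numpy as np
import pandas as pd

def dist_to_good(s):
    """Count-distance from each point to the nearest non-NaN value
    (0 for good data)."""
    pos = np.arange(len(s))
    good_pos = pos[s.notna().to_numpy()]
    left = np.searchsorted(good_pos, pos, side="right") - 1   # good at or before
    right = np.searchsorted(good_pos, pos, side="left")       # good at or after
    dl = np.where(left >= 0, pos - good_pos[np.maximum(left, 0)], np.inf)
    dr = np.where(right < len(good_pos),
                  good_pos[np.minimum(right, len(good_pos) - 1)] - pos, np.inf)
    return pd.Series(np.minimum(dl, dr), index=s.index)

s = pd.Series([np.nan, np.nan, 3.0, np.nan, 5.0])
print(dist_to_good(s).tolist())  # [2.0, 1.0, 0.0, 1.0, 0.0]
```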

gap_size(ts)[source]

Identifies gaps (runs of missing data) and quantifies the length of each gap. Each time point receives the length of its run, in time units (see the example below) or number of values along the time dimension, with non-missing data returning zero. Time is measured from when the data first becomes missing to when it first becomes non-missing.

Parameters:
tsDataFrame
Returns:
resultDataFrame

A new regular time series with the same freq as the argument holding the size of the gap.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> ndx = pd.date_range(pd.Timestamp(2017,1,1,12),freq='15min',periods=10)
>>> vals0 = np.arange(0.,10.,dtype='d')
>>> vals1 = np.arange(0.,10.,dtype='d')
>>> vals2 = np.arange(0.,10.,dtype='d')
>>> vals0[0:3] = np.nan
>>> vals0[7:-1] = np.nan
>>> vals1[2:4] = np.nan
>>> vals1[6] = np.nan
>>> vals1[9] = np.nan
>>> df = pd.DataFrame({'vals0':vals0,'vals1':vals1,'vals2':vals2},index = ndx)
>>> out = gap_size(df)
>>> print(df)
                         vals0  vals1  vals2
2017-01-01 12:00:00    NaN    0.0    0.0
2017-01-01 12:15:00    NaN    1.0    1.0
2017-01-01 12:30:00    NaN    NaN    2.0
2017-01-01 12:45:00    3.0    NaN    3.0
2017-01-01 13:00:00    4.0    4.0    4.0
2017-01-01 13:15:00    5.0    5.0    5.0
2017-01-01 13:30:00    6.0    NaN    6.0
2017-01-01 13:45:00    NaN    7.0    7.0
2017-01-01 14:00:00    NaN    8.0    8.0
2017-01-01 14:15:00    9.0    NaN    9.0
>>> print(out)
                         vals0  vals1  vals2
2017-01-01 12:00:00   45.0    0.0    0.0
2017-01-01 12:15:00   45.0    0.0    0.0
2017-01-01 12:30:00   45.0   30.0    0.0
2017-01-01 12:45:00    0.0   30.0    0.0
2017-01-01 13:00:00    0.0    0.0    0.0
2017-01-01 13:15:00    0.0    0.0    0.0
2017-01-01 13:30:00    0.0   15.0    0.0
2017-01-01 13:45:00   30.0    0.0    0.0
2017-01-01 14:00:00   30.0    0.0    0.0
2017-01-01 14:15:00    0.0    0.0    0.0

vtools.data.sample_series module

bessel_df()[source]

Sample series with bessel function signals

extra()[source]
interval(ts)[source]

Sampling interval of series

jay_flinchem_chirptest(c1=3.5, c2=5.5, c3=0.0002, c4=6.75)[source]

Approximation of the signal from Jay and Flinchem (1999), “A comparison of methods for analysis of tidal records containing multi-scale non-tidal background energy,” which has a small tide with noisy, river-influenced amplitude and a subtide.

small_subtide(subtide_scale=0.0, add_nan=False)[source]

Inspired by a large tidal flow with a small Qr undercurrent with a 72 hr period. This is a tough lowpass filtering job because the diurnal band is large and must be suppressed in order to see the more subtle subtidal amplitude.

vtools.data.timeseries module

Time series module: helpers for creating regular and irregular time series, transforming irregular series to regular, and analyzing gaps.

class PchipInterpolator(x, y, axis=0, extrapolate=None)[source]

Bases: CubicHermiteSpline

PCHIP 1-D monotonic cubic interpolation.

x and y are arrays of values used to approximate some function f, with y = f(x). The interpolant uses monotonic cubic splines to find the value of new points. (PCHIP stands for Piecewise Cubic Hermite Interpolating Polynomial).

Parameters:
xndarray, shape (npoints, )

A 1-D array of monotonically increasing real values. x cannot include duplicate values (otherwise f is overspecified)

yndarray, shape (…, npoints, …)

A N-D array of real values. y’s length along the interpolation axis must be equal to the length of x. Use the axis parameter to select the interpolation axis.

axisint, optional

Axis in the y array corresponding to the x-coordinate values. Defaults to axis=0.

extrapolatebool, optional

Whether to extrapolate to out-of-bounds points based on first and last intervals, or to return NaNs.

See also

CubicHermiteSpline

Piecewise-cubic interpolator.

Akima1DInterpolator

Akima 1D interpolator.

CubicSpline

Cubic spline data interpolator.

PPoly

Piecewise polynomial in terms of coefficients and breakpoints.

Notes

The interpolator preserves monotonicity in the interpolation data and does not overshoot if the data is not smooth.

The first derivatives are guaranteed to be continuous, but the second derivatives may jump at \(x_k\).

Determines the derivatives at the points \(x_k\), \(f'_k\), by using PCHIP algorithm [1].

Let \(h_k = x_{k+1} - x_k\) and \(d_k = (y_{k+1} - y_k) / h_k\) be the slopes at internal points \(x_k\). If the signs of \(d_k\) and \(d_{k-1}\) are different or either of them equals zero, then \(f'_k = 0\). Otherwise, it is given by the weighted harmonic mean

\[\frac{w_1 + w_2}{f'_k} = \frac{w_1}{d_{k-1}} + \frac{w_2}{d_k}\]

where \(w_1 = 2 h_k + h_{k-1}\) and \(w_2 = h_k + 2 h_{k-1}\).

The end slopes are set using a one-sided scheme [2].

References

[1]

F. N. Fritsch and J. Butland, A method for constructing local monotone piecewise cubic interpolants, SIAM J. Sci. Comput., 5(2), 300-304 (1984). :doi:`10.1137/0905021`.

[2]

see, e.g., C. Moler, Numerical Computing with Matlab, 2004. :doi:`10.1137/1.9780898717952`

Attributes:
axis
c
extrapolate
x

Methods

__call__(x[, nu, extrapolate])

Evaluate the piecewise polynomial or its derivative.

derivative([nu])

Construct a new piecewise polynomial representing the derivative.

antiderivative([nu])

Construct a new piecewise polynomial representing the antiderivative.

roots([discontinuity, extrapolate])

Find real roots of the piecewise polynomial.

__init__(x, y, axis=0, extrapolate=None)[source]
datetime_elapsed(index_or_ts, reftime=None, dtype='d', inplace=False)[source]

Convert a time series or DatetimeIndex to an integer/double series of elapsed time

Parameters:
index_or_tsDatetimeIndex

Time series or index to be transformed

reftimeDatetimeIndex or something convertible

The reference time upon which elapsed time is measured. Default of None means start of series

dtypestr like ‘i’ or ‘d’ or type like int (Int64) or float (Float64)

Data type for output, which starts out as a Float64 (‘d’) and gets converted, typically to Int64 (‘i’)

inplacebool

If input is a data frame, replaces the index in-place with no copy

Returns:
result

A new index using elapsed time from reftime as its value and of type dtype
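The elapsed-time conversion described above can be reproduced with plain pandas arithmetic. This is an illustrative sketch of the default case (reference time = start of series), not the vtools implementation itself.

```python
import pandas as pd

# Elapsed seconds from a reference time (here the first timestamp).
ndx = pd.date_range("2017-01-01 12:00", freq="15min", periods=4)
elapsed = (ndx - ndx[0]).total_seconds()
print(list(elapsed))  # [0.0, 900.0, 1800.0, 2700.0]
```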

days(d)[source]

Create a time interval representing d days

elapsed_datetime(index_or_ts, reftime=None, time_unit='s', inplace=False)[source]

Convert a time series or numerical Index to a Datetime index or series

Parameters:
index_or_tsDatetimeIndex

Time series or index to be transformed with index in elapsed seconds from reftime

reftimeDatetimeIndex or something convertible

The reference time upon which datetimes are to be evaluated.

inplacebool

If input is a data frame, replaces the index in-place with no copy

Returns:
result

A new DatetimeIndex inferred from elapsed time (in time_unit) from reftime.
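The inverse conversion can likewise be sketched with plain pandas. This illustrates the idea behind elapsed_datetime under the default time_unit of seconds; it is not the actual implementation.

```python
import pandas as pd

# Rebuild a DatetimeIndex from elapsed seconds relative to a reference time.
reftime = pd.Timestamp("2017-01-01 12:00")
seconds = [0.0, 900.0, 1800.0]
ndx = reftime + pd.to_timedelta(seconds, unit="s")
print(ndx[1])  # 2017-01-01 12:15:00
```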

example()[source]
extrapolate_ts(ts, start=None, end=None, method='ffill', val=None)[source]

Extend a regular time series to a new start and/or end using a specified extrapolation method.

Parameters:
tspandas.Series or pandas.DataFrame

The input time series with a DateTimeIndex and a regular frequency.

startdatetime-like, optional

The new starting time. If None, no extension is done before the existing data.

enddatetime-like, optional

The new ending time. If None, no extension is done after the existing data.

method{‘ffill’, ‘bfill’, ‘linear_slope’, ‘taper’, ‘constant’}, default ‘ffill’

The method used to fill new values outside the original time range:

  • ‘ffill’ : Forward-fill after the original data using its last value.

  • ‘bfill’ : Backward-fill before the original data using its first value.

  • ‘linear_slope’ : Bidirectional linear extrapolation using the first/last two points.

  • ‘taper’ : One-sided linear interpolation to/from a specified value (val).

  • ‘constant’ : One-sided constant value fill with val.

valfloat, optional

Required for ‘taper’ and ‘constant’. Specifies the value to use.

Returns:
extendedpandas.Series or pandas.DataFrame

The time series extended and filled using the selected method.

Raises:
ValueError
  • If extrapolation rules are violated based on the method.

  • If method requires or forbids val and it’s misused.

  • If frequency cannot be inferred.
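The ‘ffill’ case can be sketched with a reindex onto a wider regular index. This shows the assumed behavior of extending past the original end with the last value; details of the real function (validation, other methods) are omitted.

```python
import pandas as pd

ts = pd.Series([1.0, 2.0, 3.0],
               index=pd.date_range("2023-01-01", periods=3, freq="D"))

# Reindex onto the extended range, then forward-fill past the original end.
new_index = pd.date_range("2023-01-01", "2023-01-05", freq="D")
extended = ts.reindex(new_index).ffill()
print(extended.tolist())  # [1.0, 2.0, 3.0, 3.0, 3.0]
```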

hours(h)[source]

Create a time interval representing h hours

is_regular(ts, raise_exception=False)[source]

Check if a pandas DataFrame, Series, or xarray object with a time axis (axis 0) has a regular time index.

Regular means:
  • The index is unique.

  • The index equals a date_range spanning from the first to the last value with the inferred frequency.

Parameters:

ts : DataFrame, Series, or xarray object
raise_exception (bool) : If True, raises a ValueError when the index is not regular; otherwise, returns False.

Returns:

bool : True if the time index is regular; False otherwise.
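The two criteria above can be checked directly with pandas. The helper name `index_is_regular` is hypothetical; it mirrors the description, not the vtools source.

```python
import pandas as pd

def index_is_regular(ndx):
    """Unique index that equals a date_range spanning first to last value
    at the inferred frequency."""
    if not ndx.is_unique:
        return False
    freq = pd.infer_freq(ndx)
    if freq is None:
        return False
    return ndx.equals(pd.date_range(ndx[0], ndx[-1], freq=freq))

regular = pd.date_range("2020-01-01", periods=5, freq="1h")
gappy = regular.delete(2)  # drop one stamp -> irregular
print(index_is_regular(regular), index_is_regular(gappy))  # True False
```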

minutes(m)[source]

Create a time interval representing m minutes

months(m)[source]

Create a time interval representing m months

rts(data, start, freq, columns=None, props=None)[source]

Create a regular or calendar time series from data and time parameters

Parameters:
dataarray_like

An array/list of values. There is no restriction on data type, but not all functionality (such as addition or interpolation) will work on all data.

startPandas.Timestamp

Timestamp or a string or type that can be coerced to one.

freqtime_interval

Time interval of the series. Can also be a string representing a pandas freq.

Returns:
resultPandas.DataFrame

A regular time series with the freq attribute set
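What rts is described as producing can be sketched directly with a pandas date_range of matching length. This is an illustrative equivalent, not the vtools implementation.

```python
import numpy as np
import pandas as pd

# Build a regular series: index inferred from start, freq, and data length.
data = np.arange(4.0)
ndx = pd.date_range(start="2020-01-01", periods=len(data), freq="15min")
df = pd.DataFrame({"value": data}, index=ndx)
print(df.shape)  # (4, 1)
```

Because the index comes from date_range, its freq attribute is set, which is what downstream gap and resampling tools rely on.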

rts_formula(start, end, freq, valfunc=nan)[source]

Create a regular time series filled with constant value or formula based on elapsed seconds

Parameters:
startPandas.Timestamp

Starting Timestamp or a string or type that can be coerced to one.

endPandas.Timestamp

Ending Timestamp or a string or type that can be coerced to one.

freq_time_interval

Can also be a string representing an interval.

valfuncdict

Constant or dictionary that maps column names to lambdas evaluated on elapsed time from the start of the series. An example would be {“value”: lambda x: np.nan}

Returns:
resultPandas.DataFrame

A regular time series with the freq attribute set
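The formula-evaluation idea can be sketched as follows: build the index, compute elapsed seconds from the start, and apply each column's lambda to that array. This assumes the semantics stated in the docstring and is not the actual vtools code.

```python
import pandas as pd

ndx = pd.date_range("2020-01-01", "2020-01-01 01:00", freq="15min")
elapsed = (ndx - ndx[0]).total_seconds()

# Each column is a function of elapsed seconds from the series start.
valfunc = {"value": lambda t: t / 60.0}  # elapsed minutes, as an example
df = pd.DataFrame({name: f(elapsed) for name, f in valfunc.items()}, index=ndx)
print(df["value"].tolist())  # [0.0, 15.0, 30.0, 45.0, 60.0]
```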

seconds(s)[source]

Create a time interval representing s seconds

time_overlap(ts0, ts1, valid=True)[source]

Check for overlapping time coverage between two series. Returns a tuple with the start and end of the overlapping period. Only the time stamps of the start/end are considered, possibly ignoring NaNs at the beginning if valid=True; actual time stamp alignment is not checked.
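The overlap computation reduces to comparing endpoints. A minimal sketch of the valid=False case (NaN trimming omitted):

```python
import pandas as pd

ts0 = pd.Series(1.0, index=pd.date_range("2020-01-01", "2020-01-10"))
ts1 = pd.Series(2.0, index=pd.date_range("2020-01-05", "2020-01-20"))

# Overlapping window is [max of starts, min of ends], or None if disjoint.
start = max(ts0.index[0], ts1.index[0])
end = min(ts0.index[-1], ts1.index[-1])
overlap = (start, end) if start <= end else None
print(overlap)
```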

to_dataframe(ts)[source]
transition_ts(ts0, ts1, method='linear', create_gap=None, overlap=(0, 0), return_type='series')[source]
years(y)[source]

Create a time interval representing y years

vtools.data.vis_gap module

generate_sample_data()[source]
interactive_gap_plot(df)[source]
plot_missing_data(df, ax, min_gap_duration, overall_start, overall_end)[source]

vtools.data.vtime module

Basic ops for creating, testing and manipulating times and time intervals. This module contains factory and helper functions for working with times and time intervals.

For time intervals (or deltas), VTools uses classes that are compatible with the “freq” argument of pandas time series functions.

VTools requires a time and time interval system that is consistent (e.g., time + n*interval makes sense) and that can be applied to both calendar-dependent and calendar-independent intervals. Because this requirement is not met by any one implementation, it is recommended that you always use the factory functions in this module for creating intervals and for testing whether an interval is valid.
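The calendar-dependent vs. calendar-independent distinction is the crux here, and pandas itself illustrates it: a month must be a calendar-aware DateOffset, while hours or days can be fixed-length Timedeltas. The factory functions in this module hide that distinction.

```python
import pandas as pd

t = pd.Timestamp("2020-01-31")
# Calendar-aware: "one month later" is clipped to the end of February.
print(t + pd.DateOffset(months=1))   # 2020-02-29 00:00:00
# Fixed-length: exactly 30 days, landing in March.
print(t + pd.Timedelta(days=30))     # 2020-03-01 00:00:00
```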

days(d)[source]

Create a time interval representing d days

dst_to_standard_naive(ts, dst_zone='US/Pacific', standard_zone='Etc/GMT+8')[source]

Convert a timezone-unaware series from local (daylight-observing) time to standard time. This would be useful, say, for converting a series that is PDT during summer to one that is not. The routine mainly treats cases where the time stamps at DST interfaces are not redundant; if they are, you can probably use tz_convert and tz_localize with the ambiguous=‘infer’ option and do the job more efficiently, but many databases do not store data this way.

The choice of the standard_zone appears buggy. The defaults are supposed to convert from PST/PDT to pure PST, and the latter should be GMT-8. In a sense, this function is included before the behavior is fully understood.

Only regular series are accepted … this is a quirk of the implementation.

hours(h)[source]

Create a time interval representing h hours

minutes(m)[source]

Create a time interval representing m minutes

months(m)[source]

Create a time interval representing m months

seconds(s)[source]

Create a time interval representing s seconds

years(y)[source]

Create a time interval representing y years

Module contents