vtools.functions package

Submodules

vtools.functions.climatology module

apply_climatology(climate, index)[source]

Apply daily or monthly climatology to a new index

Parameters:
climate: DataFrame with integer index representing month of year (Jan=1) or day of year. Must be of size 12, 365, or 366. Day 366 will be inferred from the day 365 value
index: DatetimeIndex representing locations to be inferred
Returns:
DataFrame or Series as given by climate with values extracted from climatology for the month or day
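A minimal usage sketch (hypothetical data; assumes apply_climatology is imported from vtools.functions.climatology):

>>> import pandas as pd
>>> from vtools.functions.climatology import apply_climatology
>>> # Monthly climatology: integer index with Jan=1 through Dec=12
>>> clim = pd.Series([float(m) for m in range(1, 13)], index=range(1, 13))
>>> ndx = pd.date_range("2020-01-01", "2020-12-31", freq="D")
>>> filled = apply_climatology(clim, ndx)  # values looked up by month of ndx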
climatology(ts, freq, nsmooth=None)[source]

Create a climatology on the columns of ts

Parameters:
ts: DataFrame or Series
Data structure to be analyzed. Must have a length of at least 2*freq
freq: period [“day”,”month”]
Period over which the climatology is analyzed
nsmooth: int

window size (number of values) of pre-smoothing. This may not make sense for series that are not approximately regular. An odd number is usually best.

Returns:

out: DataFrame or Series
Data structure of the same type as ts, with an integer index representing month (Jan=1) or day of year (1:365).
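A hedged sketch of typical use on synthetic data (the import path is assumed from this module):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.climatology import climatology
>>> ndx = pd.date_range("2010-01-01", "2013-12-31", freq="D")  # well over 2*freq
>>> ts = pd.Series(np.sin(2 * np.pi * ndx.dayofyear / 365.25), index=ndx)
>>> monthly = climatology(ts, freq="month")         # integer index Jan=1..12
>>> daily = climatology(ts, freq="day", nsmooth=5)  # odd smoothing window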

climatology_quantiles(ts, min_day_year, max_day_year, window_width, quantiles=[0.05, 0.25, 0.5, 0.75, 0.95])[source]

Create windowed quantiles across years on a time series

Parameters:
ts: DataFrame or Series
Data structure to be analyzed.
min_day_year: int
Minimum Julian day to be considered
max_day_year: int
Maximum Julian day to be considered
window_width: int
Number of days to include, including the central day and days on each side. So for instance window_width=15 would span the central date and 7 days on each side
quantiles: array-like

quantiles requested

Returns:

out: DataFrame or Series
Data structure with Julian day as the index and quantiles as columns.
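A sketch under the same assumptions (synthetic multi-year series; the result is indexed by Julian day with one column per requested quantile):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.climatology import climatology_quantiles
>>> ndx = pd.date_range("2000-01-01", "2015-12-31", freq="D")
>>> ts = pd.Series(np.random.default_rng(0).normal(size=len(ndx)), index=ndx)
>>> q = climatology_quantiles(ts, min_day_year=1, max_day_year=365,
...                           window_width=15)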

vtools.functions.envelope module

class PchipInterpolator(x, y, axis=0, extrapolate=None)[source]

Bases: CubicHermiteSpline

PCHIP 1-D monotonic cubic interpolation.

x and y are arrays of values used to approximate some function f, with y = f(x). The interpolant uses monotonic cubic splines to find the value of new points. (PCHIP stands for Piecewise Cubic Hermite Interpolating Polynomial).

Parameters:
x : ndarray, shape (npoints, )

A 1-D array of monotonically increasing real values. x cannot include duplicate values (otherwise f is overspecified)

y : ndarray, shape (…, npoints, …)

A N-D array of real values. y’s length along the interpolation axis must be equal to the length of x. Use the axis parameter to select the interpolation axis.

axis : int, optional

Axis in the y array corresponding to the x-coordinate values. Defaults to axis=0.

extrapolate : bool, optional

Whether to extrapolate to out-of-bounds points based on first and last intervals, or to return NaNs.

See also

CubicHermiteSpline

Piecewise-cubic interpolator.

Akima1DInterpolator

Akima 1D interpolator.

CubicSpline

Cubic spline data interpolator.

PPoly

Piecewise polynomial in terms of coefficients and breakpoints.

Notes

The interpolator preserves monotonicity in the interpolation data and does not overshoot if the data is not smooth.

The first derivatives are guaranteed to be continuous, but the second derivatives may jump at \(x_k\).

Determines the derivatives at the points \(x_k\), \(f'_k\), by using PCHIP algorithm [1].

Let \(h_k = x_{k+1} - x_k\) and \(d_k = (y_{k+1} - y_k) / h_k\) be the slopes at internal points \(x_k\). If the signs of \(d_k\) and \(d_{k-1}\) are different or either of them equals zero, then \(f'_k = 0\). Otherwise, it is given by the weighted harmonic mean

\[\frac{w_1 + w_2}{f'_k} = \frac{w_1}{d_{k-1}} + \frac{w_2}{d_k}\]

where \(w_1 = 2 h_k + h_{k-1}\) and \(w_2 = h_k + 2 h_{k-1}\).

The end slopes are set using a one-sided scheme [2].

References

[1]

F. N. Fritsch and J. Butland, A method for constructing local monotone piecewise cubic interpolants, SIAM J. Sci. Comput., 5(2), 300-304 (1984). :doi:`10.1137/0905021`.

[2]

see, e.g., C. Moler, Numerical Computing with Matlab, 2004. :doi:`10.1137/1.9780898717952`

Attributes:
axis
c
extrapolate
x

Methods

__call__(x[, nu, extrapolate])

Evaluate the piecewise polynomial or its derivative.

derivative([nu])

Construct a new piecewise polynomial representing the derivative.

antiderivative([nu])

Construct a new piecewise polynomial representing the antiderivative.

roots([discontinuity, extrapolate])

Find real roots of the piecewise polynomial.

__annotations__ = {}
__init__(x, y, axis=0, extrapolate=None)[source]
__module__ = 'scipy.interpolate._cubic'
static _edge_case(h0, h1, m0, m1)[source]
static _find_derivatives(x, y)[source]
axis
c
extrapolate
x
chunked_loess_smoothing(ts, window_hours=1.25, chunk_days=10, overlap_days=1)[source]

Apply LOESS smoothing in overlapping chunks to reduce computation time.

Parameters:
ts : pd.Series

Time series with datetime index and possible NaNs.

window_hours : float

LOESS smoothing window size in hours.

chunk_days : int

Core chunk size (e.g., 10 days).

overlap_days : int

Overlap added before and after each chunk to avoid edge effects.

Returns:
pd.Series

Smoothed series, NaNs where input is NaN or unsupported.
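A minimal sketch on synthetic 15-minute data (assumes the function is importable from vtools.functions.envelope):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.envelope import chunked_loess_smoothing
>>> t = pd.date_range("2022-01-01", periods=96 * 30, freq="15min")  # 30 days
>>> y = pd.Series(np.sin(2 * np.pi * np.arange(len(t)) / 48.0), index=t)
>>> y.iloc[100:110] = np.nan  # NaNs in the input stay NaN in the output
>>> smooth = chunked_loess_smoothing(y, window_hours=1.25,
...                                  chunk_days=10, overlap_days=1)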

filter_extrema_ngood(extrema_df, smoothed, series, loess_window_pts=25, n_good=3, sig_gap_minutes=45)[source]

Filter extrema based on local and contextual data quality criteria.

Parameters:
extrema_df : pd.DataFrame

DataFrame with columns ‘time’ and ‘value’ for candidate extrema.

smoothed : pd.Series

Smoothed version of the signal used for extrema detection.

series : pd.Series

Original time series (with gaps).

loess_window_pts : int

Number of points in the LOESS window.

n_good : int

Minimum number of non-NaN points required.

sig_gap_minutes : float

Threshold for detecting significant gaps (in minutes).

Returns:
pd.DataFrame

Filtered extrema DataFrame.

find_peaks(x, height=None, threshold=None, distance=None, prominence=None, width=None, wlen=None, rel_height=0.5, plateau_size=None)[source]

Find peaks inside a signal based on peak properties.

This function takes a 1-D array and finds all local maxima by simple comparison of neighboring values. Optionally, a subset of these peaks can be selected by specifying conditions for a peak’s properties.

Parameters:
x : sequence

A signal with peaks.

height : number or ndarray or sequence, optional

Required height of peaks. Either a number, None, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required height.

threshold : number or ndarray or sequence, optional

Required threshold of peaks, the vertical distance to its neighboring samples. Either a number, None, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required threshold.

distance : number, optional

Required minimal horizontal distance (>= 1) in samples between neighbouring peaks. Smaller peaks are removed first until the condition is fulfilled for all remaining peaks.

prominence : number or ndarray or sequence, optional

Required prominence of peaks. Either a number, None, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required prominence.

width : number or ndarray or sequence, optional

Required width of peaks in samples. Either a number, None, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required width.

wlen : int, optional

Used for calculation of the peaks prominences, thus it is only used if one of the arguments prominence or width is given. See argument wlen in peak_prominences for a full description of its effects.

rel_height : float, optional

Used for calculation of the peaks width, thus it is only used if width is given. See argument rel_height in peak_widths for a full description of its effects.

plateau_size : number or ndarray or sequence, optional

Required size of the flat top of peaks in samples. Either a number, None, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied as the maximal required plateau size.

New in version 1.2.0.

Returns:
peaks : ndarray

Indices of peaks in x that satisfy all given conditions.

properties : dict

A dictionary containing properties of the returned peaks which were calculated as intermediate results during evaluation of the specified conditions:

  • ‘peak_heights’

    If height is given, the height of each peak in x.

  • ‘left_thresholds’, ‘right_thresholds’

    If threshold is given, these keys contain a peaks vertical distance to its neighbouring samples.

  • ‘prominences’, ‘right_bases’, ‘left_bases’

    If prominence is given, these keys are accessible. See peak_prominences for a description of their content.

  • ‘width_heights’, ‘left_ips’, ‘right_ips’

    If width is given, these keys are accessible. See peak_widths for a description of their content.

  • ‘plateau_sizes’, ‘left_edges’, ‘right_edges’

    If plateau_size is given, these keys are accessible and contain the indices of a peak’s edges (edges are still part of the plateau) and the calculated plateau sizes.

    New in version 1.2.0.

To calculate and return properties without excluding peaks, provide the open interval (None, None) as a value to the appropriate argument (excluding distance).

Warns:
PeakPropertyWarning

Raised if a peak’s properties have unexpected values (see peak_prominences and peak_widths).

Warning

This function may return unexpected results for data containing NaNs. To avoid this, NaNs should either be removed or replaced.

See also

find_peaks_cwt

Find peaks using the wavelet transformation.

peak_prominences

Directly calculate the prominence of peaks.

peak_widths

Directly calculate the width of peaks.

Notes

In the context of this function, a peak or local maximum is defined as any sample whose two direct neighbours have a smaller amplitude. For flat peaks (more than one sample of equal amplitude wide) the index of the middle sample is returned (rounded down in case the number of samples is even). For noisy signals the peak locations can be off because the noise might change the position of local maxima. In those cases consider smoothing the signal before searching for peaks or use other peak finding and fitting methods (like find_peaks_cwt).

Some additional comments on specifying conditions:

  • Almost all conditions (excluding distance) can be given as half-open or closed intervals, e.g., 1 or (1, None) defines the half-open interval \([1, \infty]\) while (None, 1) defines the interval \([-\infty, 1]\). The open interval (None, None) can be specified as well, which returns the matching properties without exclusion of peaks.

  • The border is always included in the interval used to select valid peaks.

  • For several conditions the interval borders can be specified with arrays matching x in shape which enables dynamic constraints based on the sample position.

  • The conditions are evaluated in the following order: plateau_size, height, threshold, distance, prominence, width. In most cases this order is the fastest one because faster operations are applied first to reduce the number of peaks that need to be evaluated later.

  • While indices in peaks are guaranteed to be at least distance samples apart, edges of flat peaks may be closer than the allowed distance.

  • Use wlen to reduce the time it takes to evaluate the conditions for prominence or width if x is large or has many local maxima (see peak_prominences).

New in version 1.1.0.

Examples

To demonstrate this function’s usage we use a signal x supplied with SciPy (see scipy.datasets.electrocardiogram). Let’s find all peaks (local maxima) in x whose amplitude lies above 0.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy.datasets import electrocardiogram
>>> from scipy.signal import find_peaks
>>> x = electrocardiogram()[2000:4000]
>>> peaks, _ = find_peaks(x, height=0)
>>> plt.plot(x)
>>> plt.plot(peaks, x[peaks], "x")
>>> plt.plot(np.zeros_like(x), "--", color="gray")
>>> plt.show()

We can select peaks below 0 with height=(None, 0) or use arrays matching x in size to reflect a changing condition for different parts of the signal.

>>> border = np.sin(np.linspace(0, 3 * np.pi, x.size))
>>> peaks, _ = find_peaks(x, height=(-border, border))
>>> plt.plot(x)
>>> plt.plot(-border, "--", color="gray")
>>> plt.plot(border, ":", color="gray")
>>> plt.plot(peaks, x[peaks], "x")
>>> plt.show()

Another useful condition for periodic signals can be given with the distance argument. In this case, we can easily select the positions of QRS complexes within the electrocardiogram (ECG) by demanding a distance of at least 150 samples.

>>> peaks, _ = find_peaks(x, distance=150)
>>> np.diff(peaks)
array([186, 180, 177, 171, 177, 169, 167, 164, 158, 162, 172])
>>> plt.plot(x)
>>> plt.plot(peaks, x[peaks], "x")
>>> plt.show()

Especially for noisy signals peaks can be easily grouped by their prominence (see peak_prominences). E.g., we can select all peaks except for the mentioned QRS complexes by limiting the allowed prominence to 0.6.

>>> peaks, properties = find_peaks(x, prominence=(None, 0.6))
>>> properties["prominences"].max()
0.5049999999999999
>>> plt.plot(x)
>>> plt.plot(peaks, x[peaks], "x")
>>> plt.show()

And, finally, let’s examine a different section of the ECG which contains beat forms of different shape. To select only the atypical heart beats, we combine two conditions: a minimal prominence of 1 and width of at least 20 samples.

>>> x = electrocardiogram()[17000:18000]
>>> peaks, properties = find_peaks(x, prominence=1, width=20)
>>> properties["prominences"], properties["widths"]
(array([1.495, 2.3  ]), array([36.93773946, 39.32723577]))
>>> plt.plot(x)
>>> plt.plot(peaks, x[peaks], "x")
>>> plt.vlines(x=peaks, ymin=x[peaks] - properties["prominences"],
...            ymax = x[peaks], color = "C1")
>>> plt.hlines(y=properties["width_heights"], xmin=properties["left_ips"],
...            xmax=properties["right_ips"], color = "C1")
>>> plt.show()
find_raw_extrema(smoothed, prominence=0.01)[source]

Find raw peaks and troughs using scipy.signal.find_peaks. Returns DataFrames for peaks and troughs.

generate_pink_noise(n, seed=None, scale=1.0)[source]

Generate pink (1/f) noise using the Voss-McCartney algorithm.

Parameters:
n : int

Number of samples to generate.

seed : int or None

Random seed for reproducibility.

scale : float

Standard deviation scaling factor for the noise.

Returns:
np.ndarray

Pink noise signal of length n.
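A minimal sketch (import path assumed from this module):

>>> from vtools.functions.envelope import generate_pink_noise
>>> noise = generate_pink_noise(1000, seed=42, scale=0.1)
>>> noise.shape
(1000,)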

generate_simplified_mixed_tide(start_time='2022-01-01', ndays=40, freq='15min', A_M2=1.0, A_K1=0.5, A_O1=0.5, phase_D1=1.570795, noise_amplitude=0.08, return_components=False)[source]

Generate a simplified synthetic mixed semidiurnal/diurnal tide with explicit O1 and K1.

Parameters:
start_time : str

Start time for the series.

ndays : int

Number of days.

freq : str

Sampling interval.

A_M2 : float

Amplitude of M2.

A_K1 : float

Amplitude of K1.

A_O1 : float

Amplitude of O1.

phase_D1 : float

Common phase shift for O1 and K1.

return_components : bool

Whether to return individual components.

Returns:
pd.Series or pd.DataFrame

Combined tide or components with time index.
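A minimal sketch (import path assumed from this module; defaults otherwise):

>>> from vtools.functions.envelope import generate_simplified_mixed_tide
>>> tide = generate_simplified_mixed_tide(ndays=10, noise_amplitude=0.0)
>>> parts = generate_simplified_mixed_tide(ndays=10, return_components=True)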

interpolate_envelope(anchor_df, series, max_anchor_gap_hours=36)[source]

Interpolate envelope using PCHIP, breaking if anchor points are too far apart.

lowess(endog, exog, frac=0.6666666666666666, it=3, delta=0.0, xvals=None, is_sorted=False, missing='drop', return_sorted=True)[source]

LOWESS (Locally Weighted Scatterplot Smoothing)

A lowess function that outputs smoothed estimates of endog at the given exog values from points (exog, endog)

Parameters:
endog : 1-D numpy array

The y-values of the observed points

exog : 1-D numpy array

The x-values of the observed points

frac : float

Between 0 and 1. The fraction of the data used when estimating each y-value.

it : int

The number of residual-based reweightings to perform.

delta : float

Distance within which to use linear-interpolation instead of weighted regression.

xvals: 1-D numpy array

Values of the exogenous variable at which to evaluate the regression. If supplied, cannot use delta.

is_sorted : bool

If False (default), then the data will be sorted by exog before calculating lowess. If True, then it is assumed that the data is already sorted by exog. If xvals is specified, then it too must be sorted if is_sorted is True.

missing : str

Available options are ‘none’, ‘drop’, and ‘raise’. If ‘none’, no nan checking is done. If ‘drop’, any observations with nans are dropped. If ‘raise’, an error is raised. Default is ‘drop’.

return_sorted : bool

If True (default), then the returned array is sorted by exog and has missing (nan or infinite) observations removed. If False, then the returned array is in the same length and the same sequence of observations as the input array.

Returns:
out : {ndarray, float}

The returned array is two-dimensional if return_sorted is True, and one dimensional if return_sorted is False. If return_sorted is True, then a numpy array with two columns. The first column contains the sorted x (exog) values and the second column the associated estimated y (endog) values. If return_sorted is False, then only the fitted values are returned, and the observations will be in the same order as the input arrays. If xvals is provided, then return_sorted is ignored and the returned array is always one dimensional, containing the y values fitted at the x values provided by xvals.

Notes

This lowess function implements the algorithm given in the reference below using local linear estimates.

Suppose the input data has N points. The algorithm works by estimating the smooth y_i by taking the frac*N closest points to (x_i,y_i) based on their x values and estimating y_i using a weighted linear regression. The weight for (x_j,y_j) is the tricube function applied to abs(x_i-x_j).

If it > 1, then further weighted local linear regressions are performed, where the weights are the same as above times the _lowess_bisquare function of the residuals. Each iteration takes approximately the same amount of time as the original fit, so these iterations are expensive. They are most useful when the noise has extremely heavy tails, such as Cauchy noise. Noise with less heavy-tails, such as t-distributions with df>2, are less problematic. The weights downgrade the influence of points with large residuals. In the extreme case, points whose residuals are larger than 6 times the median absolute residual are given weight 0.

delta can be used to save computations. For each x_i, regressions are skipped for points closer than delta. The next regression is fit for the farthest point within delta of x_i and all points in between are estimated by linearly interpolating between the two regression fits.

Judicious choice of delta can cut computation time considerably for large data (N > 5000). A good choice is delta = 0.01 * range(exog).

If xvals is provided, the regression is then computed at those points and the fit values are returned. Otherwise, the regression is run at points of exog.

Some experimentation is likely required to find a good choice of frac and it for a particular dataset.

References

Cleveland, W.S. (1979) “Robust Locally Weighted Regression and Smoothing Scatterplots”. Journal of the American Statistical Association 74 (368): 829-836.

Examples

The below allows a comparison between how different the fits from lowess for different values of frac can be.

>>> import numpy as np
>>> import statsmodels.api as sm
>>> lowess = sm.nonparametric.lowess
>>> x = np.random.uniform(low = -2*np.pi, high = 2*np.pi, size=500)
>>> y = np.sin(x) + np.random.normal(size=len(x))
>>> z = lowess(y, x)
>>> w = lowess(y, x, frac=1./3)

This gives a similar comparison for when it is 0 vs not.

>>> import numpy as np
>>> import scipy.stats as stats
>>> import statsmodels.api as sm
>>> lowess = sm.nonparametric.lowess
>>> x = np.random.uniform(low = -2*np.pi, high = 2*np.pi, size=500)
>>> y = np.sin(x) + stats.cauchy.rvs(size=len(x))
>>> z = lowess(y, x, frac= 1./3, it=0)
>>> w = lowess(y, x, frac=1./3)
main()[source]
savgol_filter(x, window_length, polyorder, deriv=0, delta=1.0, axis=-1, mode='interp', cval=0.0)[source]

Apply a Savitzky-Golay filter to an array.

This is a 1-D filter. If x has dimension greater than 1, axis determines the axis along which the filter is applied.

Parameters:
x : array_like

The data to be filtered. If x is not a single or double precision floating point array, it will be converted to type numpy.float64 before filtering.

window_length : int

The length of the filter window (i.e., the number of coefficients). If mode is ‘interp’, window_length must be less than or equal to the size of x.

polyorder : int

The order of the polynomial used to fit the samples. polyorder must be less than window_length.

deriv : int, optional

The order of the derivative to compute. This must be a nonnegative integer. The default is 0, which means to filter the data without differentiating.

delta : float, optional

The spacing of the samples to which the filter will be applied. This is only used if deriv > 0. Default is 1.0.

axis : int, optional

The axis of the array x along which the filter is to be applied. Default is -1.

mode : str, optional

Must be ‘mirror’, ‘constant’, ‘nearest’, ‘wrap’ or ‘interp’. This determines the type of extension to use for the padded signal to which the filter is applied. When mode is ‘constant’, the padding value is given by cval. See the Notes for more details on ‘mirror’, ‘constant’, ‘wrap’, and ‘nearest’. When the ‘interp’ mode is selected (the default), no extension is used. Instead, a degree polyorder polynomial is fit to the last window_length values of the edges, and this polynomial is used to evaluate the last window_length // 2 output values.

cval : scalar, optional

Value to fill past the edges of the input if mode is ‘constant’. Default is 0.0.

Returns:
y : ndarray, same shape as x

The filtered data.

See also

savgol_coeffs

Notes

Details on the mode options:

‘mirror’:

Repeats the values at the edges in reverse order. The value closest to the edge is not included.

‘nearest’:

The extension contains the nearest input value.

‘constant’:

The extension contains the value given by the cval argument.

‘wrap’:

The extension contains the values from the other end of the array.

For example, if the input is [1, 2, 3, 4, 5, 6, 7, 8], and window_length is 7, the following shows the extended data for the various mode options (assuming cval is 0):

mode       |   Ext   |         Input          |   Ext
-----------+---------+------------------------+---------
'mirror'   | 4  3  2 | 1  2  3  4  5  6  7  8 | 7  6  5
'nearest'  | 1  1  1 | 1  2  3  4  5  6  7  8 | 8  8  8
'constant' | 0  0  0 | 1  2  3  4  5  6  7  8 | 0  0  0
'wrap'     | 6  7  8 | 1  2  3  4  5  6  7  8 | 1  2  3

New in version 0.14.0.

Examples

>>> import numpy as np
>>> from scipy.signal import savgol_filter
>>> np.set_printoptions(precision=2)  # For compact display.
>>> x = np.array([2, 2, 5, 2, 1, 0, 1, 4, 9])

Filter with a window length of 5 and a degree 2 polynomial. Use the defaults for all other parameters.

>>> savgol_filter(x, 5, 2)
array([1.66, 3.17, 3.54, 2.86, 0.66, 0.17, 1.  , 4.  , 9.  ])

Note that the last five values in x are samples of a parabola, so when mode=’interp’ (the default) is used with polyorder=2, the last three values are unchanged. Compare that to, for example, mode=’nearest’:

>>> savgol_filter(x, 5, 2, mode='nearest')
array([1.74, 3.03, 3.54, 2.86, 0.66, 0.17, 1.  , 4.6 , 7.97])
select_salient_extrema(extrema, typ, spacing_hours=14, envelope_type='outer')[source]

Select salient extrema (HH/LL or HL/LH) using literal spacing-based OR logic.

Parameters:
extrema : pd.DataFrame with columns [“time”, “value”]

Candidate extrema.

typ : str

Either “high” or “low” (for peak or trough selection).

spacing_hours : float

Time window for neighbor comparison.

envelope_type : str

Either “outer” (default) or “inner” to switch saliency logic.

Returns:
pd.DataFrame

Extrema that passed the saliency test.
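A minimal sketch with hypothetical candidate extrema (import path assumed from this module):

>>> import pandas as pd
>>> from vtools.functions.envelope import select_salient_extrema
>>> cand = pd.DataFrame({
...     "time": pd.to_datetime(["2022-01-01 06:00", "2022-01-01 18:30",
...                             "2022-01-02 07:00"]),
...     "value": [1.2, 0.9, 1.4]})
>>> highs = select_salient_extrema(cand, typ="high", spacing_hours=14)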

smooth_series(ts, window_hours=1.75)[source]
smooth_series2(series, window_pts=25, method='lowess', **kwargs)[source]

Smooth a time series using the specified method. Currently supports ‘lowess’, ‘moving_average’, or ‘savgol’.
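A minimal sketch (import path assumed from this module; extra kwargs pass through to the chosen smoother):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.envelope import smooth_series2
>>> s = pd.Series(np.random.default_rng(0).normal(size=200))
>>> sm = smooth_series2(s, window_pts=25, method="lowess")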

tidal_envelope(series, smoothing_window_hours=2.5, n_good=3, peak_prominence=0.05, saliency_window_hours=14, max_anchor_gap_hours=36, envelope_type='outer')[source]

Compute the tidal envelope (high and low) of a time series using smoothing, extrema detection, and interpolation. This function processes a time series to extract its tidal envelope by smoothing the data, identifying significant peaks and troughs, filtering out unreliable extrema, selecting salient extrema within a specified window, and interpolating between anchor points to generate continuous envelope curves.

Parameters:
series : pandas.Series

Time-indexed series of water levels or similar data.

smoothing_window_hours : float, optional

Window size in hours for smoothing the input series (default is 2.5).

n_good : int, optional

Minimum number of good points required for an extremum to be considered valid (default is 3).

peak_prominence : float, optional

Minimum prominence of peaks/troughs to be considered as extrema (default is 0.05).

saliency_window_hours : float, optional

Window size in hours for selecting salient extrema (default is 14).

max_anchor_gap_hours : float, optional

Maximum allowed gap in hours between anchor points for interpolation (default is 36).

envelope_type : str, optional

Type of envelope to compute, e.g., “outer” (default is “outer”).

Returns:
env_high : pandas.Series

Interpolated high (upper) envelope of the input series.

env_low : pandas.Series

Interpolated low (lower) envelope of the input series.

anchor_highs : pandas.DataFrame

DataFrame of selected anchor points for the high envelope.

anchor_lows : pandas.DataFrame

DataFrame of selected anchor points for the low envelope.

smoothed : pandas.Series

Smoothed version of the input series.

Notes

This function assumes regular time intervals in the input series. If the frequency cannot be inferred, it is estimated from the first two timestamps.
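A hedged end-to-end sketch, assuming the synthetic generator from this module and that the returned tuple unpacks in the order listed under Returns:

>>> from vtools.functions.envelope import (tidal_envelope,
...                                        generate_simplified_mixed_tide)
>>> series = generate_simplified_mixed_tide(ndays=40)
>>> env_high, env_low, anchor_highs, anchor_lows, smoothed = tidal_envelope(series)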

vtools.functions.error_detect module

_nrepeat(ts)[source]

Series-only version

bounds_test(ts, bounds)[source]
despike(arr, n1=2, n2=20, block=10)[source]
example()[source]
example2()[source]
med_outliers(ts, level=4.0, scale=None, filt_len=7, range=(None, None), quantiles=(0.01, 0.99), copy=True, as_anomaly=False)[source]

Detect outliers by running a median filter, subtracting it from the original series and comparing the resulting residuals to a global robust range of scale (the interquartile range). Individual time points are rejected if the residual at that time point is more than level times the range of scale.

The original concept comes from Basu & Meckesheimer (2007), Automatic outlier detection for time series: an application to sensor data, although they used expert judgment rather than the interquartile range. To use this function effectively, you need to be thoughtful about what the interquartile range will be. For instance, for a strongly tidal flow station it is likely to substantially overestimate the reasonable variation over a single time step; see scale below.

level: Number of times the scale or interquantile range the residual has to be in order to be rejected.

scale: Expert judgment of the scale of maximum variation over a time step. If None, the interquartile range will be used. Note that for a strongly tidal station the interquartile range may substantially overestimate the reasonable variation over a single time step, in which case the filter will work fine, but level should be set to a number (less than one) accordingly.

filt_len: Length of the median filter; default is 7.

quantiles: Tuple of quantiles defining the measure of scale. Ignored if scale is given directly. Default is the interquartile range, and this is almost always a reasonable choice.

copy: If True, a copy is made leaving the original series intact.

range: You can also specify rejection of values based on a simple range.

Returns: copy of series with outliers replaced by nan
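A minimal sketch on synthetic data with an injected spike (import path assumed from vtools.functions.error_detect):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.error_detect import med_outliers
>>> t = pd.date_range("2023-01-01", periods=200, freq="h")
>>> vals = np.sin(2 * np.pi * np.arange(200) / 24.0)
>>> vals[50] = 15.0  # obvious outlier
>>> ts = pd.Series(vals, index=t)
>>> clean = med_outliers(ts, level=4.0)  # the spike should come back as nan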

median_test(ts, level=4, filt_len=7, quantiles=(0.005, 0.095), copy=True)[source]
nrepeat(ts)[source]

Return the length of consecutive runs of repeated values

Parameters:
ts: DataFrame or series
Returns:
Like-indexed series with lengths of runs. Nans will be mapped to 0
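A minimal sketch (import path assumed from this module):

>>> import pandas as pd
>>> from vtools.functions.error_detect import nrepeat
>>> s = pd.Series([1.0, 1.0, 2.0, 2.0, 2.0, 3.0])
>>> runs = nrepeat(s)  # like-indexed series of run lengths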
steep_then_nan(ts, level=4.0, scale=None, filt_len=11, range=(None, None), quantiles=(0.01, 0.99), copy=True, as_anomaly=True)[source]

Detect outliers by running a median filter, subtracting it from the original series and comparing the resulting residuals to a global robust range of scale (the interquartile range). Individual time points are rejected if the residual at that time point is more than level times the range of scale.

The original concept comes from Basu & Meckesheimer (2007), although they used expert judgment rather than the interquartile range. To use this function effectively, you need to be thoughtful about what the interquartile range will be. For instance, for a strongly tidal flow station it is likely to substantially overestimate the reasonable variation over a single time step; see scale below.

level: Number of times the scale or interquantile range the residual has to be in order to be rejected.

scale: Expert judgment of the scale of maximum variation over a time step. If None, the interquartile range will be used. Note that for a strongly tidal station the interquartile range may substantially overestimate the reasonable variation over a single time step, in which case the filter will work fine, but level should be set to a number (less than one) accordingly.

filt_len: Length of the median filter; default is 11.

quantiles: Tuple of quantiles defining the measure of scale. Ignored if scale is given directly. Default is the interquartile range, and this is almost always a reasonable choice.

copy: If True, a copy is made leaving the original series intact.

range: You can also specify rejection of values based on a simple range.

Returns: copy of series with outliers replaced by nan

threshold(ts, bounds, copy=True)[source]

vtools.functions.example2 module

main()[source]

vtools.functions.filter module

Module contains filters used in tidal time series analysis.

_gf1d(ts, sigma, order, mode, cval, truncate)[source]
_lanczos_impl(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True, cosine_taper=False)[source]

Squared low-pass cosine-Lanczos filter on a regular time series.

Parameters:
ts : DataFrame
filter_len : int or time_interval

Size of the Lanczos window; the default is the number of samples within filter_period*1.25.

cutoff_frequency : float, optional

Cutoff frequency expressed as a ratio of the Nyquist frequency; it should lie within the range (0,1). For example, if the sampling frequency is 1 sample/hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36-hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_period : string or _time_interval

Period of the cutoff frequency. If input as a string, it must be convertible to a _time_interval (pandas freq). cutoff_frequency and cutoff_period can’t be specified at the same time.

padtype : str or None, optional

Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.

padlen : int or None, optional

The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtype is not None and padlen is not given, padlen is set to 6*m.

fill_edge_nan : bool, optional

If padding is not used and fill_edge_nan is true, the data at both ends of the result are filled with nan to account for edge effects. This affects 2*m points on either end of the result. Default is true.

Returns:
result : TimeSeries

A new regular time series with the same interval as ts. If no padding is used, the first and last 4*m values will be set to nan to remove edge effects.

Raises:
ValueError

If the input time series is not regular; or cutoff_period and cutoff_frequency are given at the same time; or neither cutoff_period nor cutoff_frequency is given; or padtype is not ‘odd’, ‘even’, ‘constant’, or None; or padlen is larger than the data size.

butter(N, Wn, btype='low', analog=False, output='ba', fs=None)[source]

Butterworth digital and analog filter design.

Design an Nth-order digital or analog Butterworth filter and return the filter coefficients.

Parameters:
N : int

The order of the filter. For ‘bandpass’ and ‘bandstop’ filters, the resulting order of the final second-order sections (‘sos’) matrix is 2*N, with N the number of biquad sections of the desired system.

Wn : array_like

The critical frequency or frequencies. For lowpass and highpass filters, Wn is a scalar; for bandpass and bandstop filters, Wn is a length-2 sequence.

For a Butterworth filter, this is the point at which the gain drops to 1/sqrt(2) that of the passband (the “-3 dB point”).

For digital filters, if fs is not specified, Wn units are normalized from 0 to 1, where 1 is the Nyquist frequency (Wn is thus in half cycles / sample and defined as 2*critical frequencies / fs). If fs is specified, Wn is in the same units as fs.

For analog filters, Wn is an angular frequency (e.g. rad/s).

btype : {‘lowpass’, ‘highpass’, ‘bandpass’, ‘bandstop’}, optional

The type of filter. Default is ‘lowpass’.

analog : bool, optional

When True, return an analog filter, otherwise a digital filter is returned.

output : {‘ba’, ‘zpk’, ‘sos’}, optional

Type of output: numerator/denominator (‘ba’), pole-zero (‘zpk’), or second-order sections (‘sos’). Default is ‘ba’ for backwards compatibility, but ‘sos’ should be used for general-purpose filtering.

fs : float, optional

The sampling frequency of the digital system.

New in version 1.2.0.

Returns:
b, a : ndarray, ndarray

Numerator (b) and denominator (a) polynomials of the IIR filter. Only returned if output='ba'.

z, p, k : ndarray, ndarray, float

Zeros, poles, and system gain of the IIR filter transfer function. Only returned if output='zpk'.

sos : ndarray

Second-order sections representation of the IIR filter. Only returned if output='sos'.

See also

buttord, buttap

Notes

The Butterworth filter has maximally flat frequency response in the passband.

The 'sos' output parameter was added in 0.16.0.

If the transfer function form [b, a] is requested, numerical problems can occur since the conversion between roots and the polynomial coefficients is a numerically sensitive operation, even for N >= 4. It is recommended to work with the SOS representation.

Warning

Designing high-order and narrowband IIR filters in TF form can result in unstable or incorrect filtering due to floating point numerical precision issues. Consider inspecting output filter characteristics freqz or designing the filters with second-order sections via output='sos'.

Examples

Design an analog filter and plot its frequency response, showing the critical points:

>>> from scipy import signal
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> b, a = signal.butter(4, 100, 'low', analog=True)
>>> w, h = signal.freqs(b, a)
>>> plt.semilogx(w, 20 * np.log10(abs(h)))
>>> plt.title('Butterworth filter frequency response')
>>> plt.xlabel('Frequency [radians / second]')
>>> plt.ylabel('Amplitude [dB]')
>>> plt.margins(0, 0.1)
>>> plt.grid(which='both', axis='both')
>>> plt.axvline(100, color='green') # cutoff frequency
>>> plt.show()

Generate a signal made up of 10 Hz and 20 Hz, sampled at 1 kHz

>>> t = np.linspace(0, 1, 1000, False)  # 1 second
>>> sig = np.sin(2*np.pi*10*t) + np.sin(2*np.pi*20*t)
>>> fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
>>> ax1.plot(t, sig)
>>> ax1.set_title('10 Hz and 20 Hz sinusoids')
>>> ax1.axis([0, 1, -2, 2])

Design a digital high-pass filter at 15 Hz to remove the 10 Hz tone, and apply it to the signal. (It’s recommended to use second-order sections format when filtering, to avoid numerical error with transfer function (ba) format):

>>> sos = signal.butter(10, 15, 'hp', fs=1000, output='sos')
>>> filtered = signal.sosfilt(sos, sig)
>>> ax2.plot(t, filtered)
>>> ax2.set_title('After 15 Hz high-pass filter')
>>> ax2.axis([0, 1, -2, 2])
>>> ax2.set_xlabel('Time [seconds]')
>>> plt.tight_layout()
>>> plt.show()
butterworth(ts, cutoff_period=None, cutoff_frequency=None, order=4)[source]

Low-pass Butterworth-squared filter on a regular time series.

Parameters:
ts : DataFrame

Must be one or two dimensional, and regular.

order : int, optional

The default is 4.

cutoff_frequency : float, optional

Cutoff frequency expressed as a ratio of the Nyquist frequency; it should lie within the range (0,1). For a discretely sampled system, the Nyquist frequency is the fastest frequency that can be resolved by that sampling, which is half the sampling frequency. For example, if the sampling frequency is 1 sample/hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36-hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_period : string or _time_interval

Period corresponding to the cutoff frequency. If input as a string, it must be convertible to a regular interval using the same rules as a pandas frequency. cutoff_frequency and cutoff_period can’t be specified at the same time.

Returns:
result

A new regular time series with the same interval as ts.

Raises:
ValueError

If the input order is not even; or the input time series is not regular; or neither cutoff_period nor cutoff_frequency is given while the input time series interval is not 15min or 1 hour; or cutoff_period and cutoff_frequency are given at the same time.
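A minimal sketch on a synthetic hourly tide (import path assumed from vtools.functions.filter; the "40h" cutoff_period string is assumed to parse like a pandas frequency):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.filter import butterworth
>>> t = pd.date_range("2022-01-01", periods=24 * 60, freq="h")
>>> wl = pd.DataFrame({"wl": np.sin(2 * np.pi * np.arange(len(t)) / 12.42)},
...                   index=t)
>>> subtidal = butterworth(wl, cutoff_period="40h")  # suppress the tidal band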

convert_span_to_nstep(freq, span)[source]
cosine_lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]
cosine_lanczos5(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]

Squared low-pass cosine-Lanczos filter on a regular time series.

Parameters:
ts : DataFrame
filter_len : int or time_interval

Size of the Lanczos window; the default is the number of samples within filter_period*1.25.

cutoff_frequency : float, optional

Cutoff frequency expressed as a ratio of the Nyquist frequency; it should lie within the range (0,1). For example, if the sampling frequency is 1 sample/hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36-hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_period : string or _time_interval

Period of the cutoff frequency. If input as a string, it must be convertible to a _time_interval (pandas freq). cutoff_frequency and cutoff_period can’t be specified at the same time.

padtype : str or None, optional

Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.

padlen : int or None, optional

The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtype is not None and padlen is not given, padlen is set to 6*m.

fill_edge_nan : bool, optional

If padding is not used and fill_edge_nan is true, the data at both ends of the result are filled with nan to account for edge effects. This affects 2*m points on either end of the result. Default is true.

Returns:
result : TimeSeries

A new regular time series with the same interval as ts. If no padding is used, the first and last 4*m values will be set to nan to remove edge effects.

Raises:
ValueError

If the input time series is not regular; or cutoff_period and cutoff_frequency are given at the same time; or neither cutoff_period nor cutoff_frequency is given; or padtype is not ‘odd’, ‘even’, ‘constant’, or None; or padlen is larger than the data size.
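A comparable sketch for the cosine-Lanczos filter, under the same assumptions about import path and the cutoff_period string:

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.filter import cosine_lanczos
>>> t = pd.date_range("2022-01-01", periods=24 * 30, freq="h")
>>> wl = pd.DataFrame({"wl": np.sin(2 * np.pi * np.arange(len(t)) / 12.42)},
...                   index=t)
>>> lp = cosine_lanczos(wl, cutoff_period="40h", padtype="odd")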

filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None, method='pad', irlen=None)[source]

Apply a digital filter forward and backward to a signal.

This function applies a linear digital filter twice, once forward and once backwards. The combined filter has zero phase and a filter order twice that of the original.

The function provides options for handling the edges of the signal.

The function sosfiltfilt (and filter design using output='sos') should be preferred over filtfilt for most filtering tasks, as second-order sections have fewer numerical problems.

Parameters:
b : (N,) array_like

The numerator coefficient vector of the filter.

a : (N,) array_like

The denominator coefficient vector of the filter. If a[0] is not 1, then both a and b are normalized by a[0].

x : array_like

The array of data to be filtered.

axis : int, optional

The axis of x to which the filter is applied. Default is -1.

padtype : str or None, optional

Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is ‘odd’.

padlen : int or None, optional

The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis] - 1. padlen=0 implies no padding. The default value is 3 * max(len(a), len(b)).

method : str, optional

Determines the method for handling the edges of the signal, either “pad” or “gust”. When method is “pad”, the signal is padded; the type of padding is determined by padtype and padlen, and irlen is ignored. When method is “gust”, Gustafsson’s method is used, and padtype and padlen are ignored.

irlen : int or None, optional

When method is “gust”, irlen specifies the length of the impulse response of the filter. If irlen is None, no part of the impulse response is ignored. For a long signal, specifying irlen can significantly improve the performance of the filter.

Returns:
y : ndarray

The filtered output with the same shape as x.

See also

sosfiltfilt, lfilter_zi, lfilter, lfiltic, savgol_filter, sosfilt

Notes

When method is “pad”, the function pads the data along the given axis in one of three ways: odd, even or constant. The odd and even extensions have the corresponding symmetry about the end point of the data. The constant extension extends the data with the values at the end points. On both the forward and backward passes, the initial condition of the filter is found by using lfilter_zi and scaling it by the end point of the extended data.

When method is “gust”, Gustafsson’s method [1] is used. Initial conditions are chosen for the forward and backward passes so that the forward-backward filter gives the same result as the backward-forward filter.

The option to use Gustafsson’s method was added in scipy version 0.16.0.

References

[1]

F. Gustafsson, “Determining the initial states in forward-backward filtering”, Transactions on Signal Processing, Vol. 46, pp. 988-992, 1996.

Examples

The examples will use several functions from scipy.signal.

>>> import numpy as np
>>> from scipy import signal
>>> import matplotlib.pyplot as plt

First we create a one second signal that is the sum of two pure sine waves, with frequencies 5 Hz and 250 Hz, sampled at 2000 Hz.

>>> t = np.linspace(0, 1.0, 2001)
>>> xlow = np.sin(2 * np.pi * 5 * t)
>>> xhigh = np.sin(2 * np.pi * 250 * t)
>>> x = xlow + xhigh

Now create a lowpass Butterworth filter with a cutoff of 0.125 times the Nyquist frequency, or 125 Hz, and apply it to x with filtfilt. The result should be approximately xlow, with no phase shift.

>>> b, a = signal.butter(8, 0.125)
>>> y = signal.filtfilt(b, a, x, padlen=150)
>>> np.abs(y - xlow).max()
9.1086182074789912e-06

We get a fairly clean result for this artificial example because the odd extension is exact, and with the moderately long padding, the filter’s transients have dissipated by the time the actual data is reached. In general, transient effects at the edges are unavoidable.

The following example demonstrates the option method="gust".

First, create a filter.

>>> b, a = signal.ellip(4, 0.01, 120, 0.125)  # Filter to be applied.

sig is a random input signal to be filtered.

>>> rng = np.random.default_rng()
>>> n = 60
>>> sig = rng.standard_normal(n)**3 + 3*rng.standard_normal(n).cumsum()

Apply filtfilt to sig, once using the Gustafsson method, and once using padding, and plot the results for comparison.

>>> fgust = signal.filtfilt(b, a, sig, method="gust")
>>> fpad = signal.filtfilt(b, a, sig, padlen=50)
>>> plt.plot(sig, 'k-', label='input')
>>> plt.plot(fgust, 'b-', linewidth=4, label='gust')
>>> plt.plot(fpad, 'c-', linewidth=1.5, label='pad')
>>> plt.legend(loc='best')
>>> plt.show()

The irlen argument can be used to improve the performance of Gustafsson’s method.

Estimate the impulse response length of the filter.

>>> z, p, k = signal.tf2zpk(b, a)
>>> eps = 1e-9
>>> r = np.max(np.abs(p))
>>> approx_impulse_len = int(np.ceil(np.log(eps) / np.log(r)))
>>> approx_impulse_len
137

Apply the filter to a longer signal, with and without the irlen argument. The difference between y1 and y2 is small. For long signals, using irlen gives a significant performance improvement.

>>> x = rng.standard_normal(4000)
>>> y1 = signal.filtfilt(b, a, x, method='gust')
>>> y2 = signal.filtfilt(b, a, x, method='gust', irlen=approx_impulse_len)
>>> print(np.max(np.abs(y1 - y2)))
2.875334415008979e-10
firwin(numtaps, cutoff, *, width=None, window='hamming', pass_zero=True, scale=True, nyq=<object object>, fs=None)[source]

FIR filter design using the window method.

This function computes the coefficients of a finite impulse response filter. The filter will have linear phase; it will be Type I if numtaps is odd and Type II if numtaps is even.

Type II filters always have zero response at the Nyquist frequency, so a ValueError exception is raised if firwin is called with numtaps even and having a passband whose right end is at the Nyquist frequency.

Parameters:
numtaps : int

Length of the filter (number of coefficients, i.e. the filter order + 1). numtaps must be odd if a passband includes the Nyquist frequency.

cutoff : float or 1-D array_like

Cutoff frequency of filter (expressed in the same units as fs) OR an array of cutoff frequencies (that is, band edges). In the latter case, the frequencies in cutoff should be positive and monotonically increasing between 0 and fs/2. The values 0 and fs/2 must not be included in cutoff.

width : float or None, optional

If width is not None, then assume it is the approximate width of the transition region (expressed in the same units as fs) for use in Kaiser FIR filter design. In this case, the window argument is ignored.

window : string or tuple of string and parameter values, optional

Desired window to use. See scipy.signal.get_window for a list of windows and required parameters.

pass_zero : {True, False, ‘bandpass’, ‘lowpass’, ‘highpass’, ‘bandstop’}, optional

If True, the gain at the frequency 0 (i.e., the “DC gain”) is 1. If False, the DC gain is 0. Can also be a string argument for the desired filter type (equivalent to btype in IIR design functions).

New in version 1.3.0: Support for string arguments.

scale : bool, optional

Set to True to scale the coefficients so that the frequency response is exactly unity at a certain frequency. That frequency is either:

  • 0 (DC) if the first passband starts at 0 (i.e. pass_zero is True)

  • fs/2 (the Nyquist frequency) if the first passband ends at fs/2 (i.e. the filter is a single band highpass filter); center of first passband otherwise

nyq : float, optional, deprecated

This is the Nyquist frequency. Each frequency in cutoff must be between 0 and nyq. Default is 1.

Deprecated since version 1.0.0: firwin keyword argument nyq is deprecated in favour of fs and will be removed in SciPy 1.14.0.

fs : float, optional

The sampling frequency of the signal. Each frequency in cutoff must be between 0 and fs/2. Default is 2.

Returns:
h : (numtaps,) ndarray

Coefficients of length numtaps FIR filter.

Raises:
ValueError

If any value in cutoff is less than or equal to 0 or greater than or equal to fs/2, if the values in cutoff are not strictly monotonically increasing, or if numtaps is even but a passband includes the Nyquist frequency.

See also

firwin2
firls
minimum_phase
remez

Examples

Low-pass from 0 to f:

>>> from scipy import signal
>>> numtaps = 3
>>> f = 0.1
>>> signal.firwin(numtaps, f)
array([ 0.06799017,  0.86401967,  0.06799017])

Use a specific window function:

>>> signal.firwin(numtaps, f, window='nuttall')
array([  3.56607041e-04,   9.99286786e-01,   3.56607041e-04])

High-pass (‘stop’ from 0 to f):

>>> signal.firwin(numtaps, f, pass_zero=False)
array([-0.00859313,  0.98281375, -0.00859313])

Band-pass:

>>> f1, f2 = 0.1, 0.2
>>> signal.firwin(numtaps, [f1, f2], pass_zero=False)
array([ 0.06301614,  0.88770441,  0.06301614])

Band-stop:

>>> signal.firwin(numtaps, [f1, f2])
array([-0.00801395,  1.0160279 , -0.00801395])

Multi-band (passbands are [0, f1], [f2, f3] and [f4, 1]):

>>> f3, f4 = 0.3, 0.4
>>> signal.firwin(numtaps, [f1, f2, f3, f4])
array([-0.01376344,  1.02752689, -0.01376344])

Multi-band (passbands are [f1, f2] and [f3,f4]):

>>> signal.firwin(numtaps, [f1, f2, f3, f4], pass_zero=False)
array([ 0.04890915,  0.91284326,  0.04890915])
gaussian_filter1d(input, sigma, axis=-1, order=0, output=None, mode='reflect', cval=0.0, truncate=4.0, *, radius=None)[source]

1-D Gaussian filter.

Parameters:
input : array_like

The input array.

sigma : scalar

standard deviation for Gaussian kernel

axis : int, optional

The axis of input along which to calculate. Default is -1.

order : int, optional

An order of 0 corresponds to convolution with a Gaussian kernel. A positive order corresponds to convolution with that derivative of a Gaussian.

output : array or dtype, optional

The array in which to place the output, or the dtype of the returned array. By default an array of the same dtype as input will be created.

mode : {‘reflect’, ‘constant’, ‘nearest’, ‘mirror’, ‘wrap’}, optional

The mode parameter determines how the input array is extended beyond its boundaries. Default is ‘reflect’. Behavior for each valid value is as follows:

‘reflect’ (d c b a | a b c d | d c b a)

The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

‘constant’ (k k k k | a b c d | k k k k)

The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.

‘nearest’ (a a a a | a b c d | d d d d)

The input is extended by replicating the last pixel.

‘mirror’ (d c b | a b c d | c b a)

The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

‘wrap’ (a b c d | a b c d | a b c d)

The input is extended by wrapping around to the opposite edge.

For consistency with the interpolation functions, the following mode names can also be used:

‘grid-mirror’

This is a synonym for ‘reflect’.

‘grid-constant’

This is a synonym for ‘constant’.

‘grid-wrap’

This is a synonym for ‘wrap’.

cval : scalar, optional

Value to fill past edges of input if mode is ‘constant’. Default is 0.0.

truncate : float, optional

Truncate the filter at this many standard deviations. Default is 4.0.

radius : None or int, optional

Radius of the Gaussian kernel. If specified, the size of the kernel will be 2*radius + 1, and truncate is ignored. Default is None.

Returns:
gaussian_filter1d : ndarray

Notes

The Gaussian kernel will have size 2*radius + 1 along each axis. If radius is None, a default radius = round(truncate * sigma) will be used.

Examples

>>> from scipy.ndimage import gaussian_filter1d
>>> import numpy as np
>>> gaussian_filter1d([1.0, 2.0, 3.0, 4.0, 5.0], 1)
array([ 1.42704095,  2.06782203,  3.        ,  3.93217797,  4.57295905])
>>> gaussian_filter1d([1.0, 2.0, 3.0, 4.0, 5.0], 4)
array([ 2.91948343,  2.95023502,  3.        ,  3.04976498,  3.08051657])
>>> import matplotlib.pyplot as plt
>>> rng = np.random.default_rng()
>>> x = rng.standard_normal(101).cumsum()
>>> y3 = gaussian_filter1d(x, 3)
>>> y6 = gaussian_filter1d(x, 6)
>>> plt.plot(x, 'k', label='original data')
>>> plt.plot(y3, '--', label='filtered, sigma=3')
>>> plt.plot(y6, ':', label='filtered, sigma=6')
>>> plt.legend()
>>> plt.grid()
>>> plt.show()
generate_godin_fir(freq)[source]

Generate the Godin filter impulse response for a given freq, where freq is a pandas frequency.
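A minimal sketch (import path assumed from this module):

>>> from vtools.functions.filter import generate_godin_fir
>>> coeffs = generate_godin_fir("15min")  # impulse response for 15-minute data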

godin(ts)[source]

Low-pass Godin filter a regular time series. Applies the \(\mathcal{A_{24}^{2}A_{25}}\) Godin filter [1] The filter is generalized to be the equivalent of one boxcar of the length of the lunar diurnal (~25 hours) constituent and two of the solar diurnal (~24 hours), though the implementation combines these steps.

Parameters:
ts : DataFrame
Returns:
result : DataFrame

A new regular time series with the same interval as ts.

Raises:
NotImplementedError

If input time series is not univariate

References

[1]

Godin (1972) Analysis of Tides
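A minimal sketch on a synthetic univariate hourly series (import path assumed from vtools.functions.filter):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.filter import godin
>>> t = pd.date_range("2022-01-01", periods=24 * 30, freq="h")
>>> flow = pd.DataFrame(
...     {"flow": 100.0 + 50.0 * np.sin(2 * np.pi * np.arange(len(t)) / 12.42)},
...     index=t)
>>> tidal_avg = godin(flow)  # tidally averaged series, same interval as flow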

hours(h)[source]

Create a time interval representing h hours
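A minimal sketch (import path assumed from this module):

>>> from vtools.functions.filter import hours
>>> win = hours(36)  # interval usable, e.g., as a cutoff period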

lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]
lfilter(b, a, x, axis=-1, zi=None)[source]

Filter data along one-dimension with an IIR or FIR filter.

Filter a data sequence, x, using a digital filter. This works for many fundamental data types (including Object type). The filter is a direct form II transposed implementation of the standard difference equation (see Notes).

The function sosfilt (and filter design using output='sos') should be preferred over lfilter for most filtering tasks, as second-order sections have fewer numerical problems.

Parameters:
barray_like

The numerator coefficient vector in a 1-D sequence.

aarray_like

The denominator coefficient vector in a 1-D sequence. If a[0] is not 1, then both a and b are normalized by a[0].

xarray_like

An N-dimensional input array.

axisint, optional

The axis of the input data array along which to apply the linear filter. The filter is applied to each subarray along this axis. Default is -1.

ziarray_like, optional

Initial conditions for the filter delays. It is a vector (or array of vectors for an N-dimensional input) of length max(len(a), len(b)) - 1. If zi is None or is not given then initial rest is assumed. See lfiltic for more information.

Returns:
yarray

The output of the digital filter.

zfarray, optional

If zi is None, this is not returned, otherwise, zf holds the final filter delay values.

See also

lfiltic

Construct initial conditions for lfilter.

lfilter_zi

Compute initial state (steady state of step response) for lfilter.

filtfilt

A forward-backward filter, to obtain a filter with zero phase.

savgol_filter

A Savitzky-Golay filter.

sosfilt

Filter data using cascaded second-order sections.

sosfiltfilt

A forward-backward filter using second-order sections.

Notes

The filter function is implemented as a direct II transposed structure. This means that the filter implements:

a[0]*y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[M]*x[n-M]
                      - a[1]*y[n-1] - ... - a[N]*y[n-N]

where M is the degree of the numerator, N is the degree of the denominator, and n is the sample number. It is implemented using the following difference equations (assuming M = N):

a[0]*y[n] = b[0] * x[n]               + d[0][n-1]
  d[0][n] = b[1] * x[n] - a[1] * y[n] + d[1][n-1]
  d[1][n] = b[2] * x[n] - a[2] * y[n] + d[2][n-1]
...
d[N-2][n] = b[N-1]*x[n] - a[N-1]*y[n] + d[N-1][n-1]
d[N-1][n] = b[N] * x[n] - a[N] * y[n]

where d are the state variables.

The rational transfer function describing this filter in the z-transform domain is:

                    -1              -M
        b[0] + b[1]z  + ... + b[M] z
Y(z) = -------------------------------- X(z)
                    -1              -N
        a[0] + a[1]z  + ... + a[N] z

Examples

Generate a noisy signal to be filtered:

>>> import numpy as np
>>> from scipy import signal
>>> import matplotlib.pyplot as plt
>>> rng = np.random.default_rng()
>>> t = np.linspace(-1, 1, 201)
>>> x = (np.sin(2*np.pi*0.75*t*(1-t) + 2.1) +
...      0.1*np.sin(2*np.pi*1.25*t + 1) +
...      0.18*np.cos(2*np.pi*3.85*t))
>>> xn = x + rng.standard_normal(len(t)) * 0.08

Create an order 3 lowpass butterworth filter:

>>> b, a = signal.butter(3, 0.05)

Apply the filter to xn. Use lfilter_zi to choose the initial condition of the filter:

>>> zi = signal.lfilter_zi(b, a)
>>> z, _ = signal.lfilter(b, a, xn, zi=zi*xn[0])

Apply the filter again, to have a result filtered at an order the same as filtfilt:

>>> z2, _ = signal.lfilter(b, a, z, zi=zi*z[0])

Use filtfilt to apply the filter:

>>> y = signal.filtfilt(b, a, xn)

Plot the original signal and the various filtered versions:

>>> plt.figure()
>>> plt.plot(t, xn, 'b', alpha=0.75)
>>> plt.plot(t, z, 'r--', t, z2, 'r', t, y, 'k')
>>> plt.legend(('noisy signal', 'lfilter, once', 'lfilter, twice',
...             'filtfilt'), loc='best')
>>> plt.grid(True)
>>> plt.show()
lowpass_cosine_lanczos_filter_coef(cf, m, normalize=True)[source]

Return the convolution coefficients for a low-pass Lanczos filter.

Parameters:
cf: float

Cutoff frequency expressed as a ratio of the Nyquist frequency.

m: int

Size of the filtering window.

Returns:
results: list

Coefficients of filtering window.

lowpass_lanczos_filter_coef(cf, m, normalize=True, cosine_taper=False)[source]

Return the convolution coefficients for a low-pass Lanczos filter.

Parameters:
cffloat

Cutoff frequency expressed as a ratio of the Nyquist frequency.

mint

Size of the filtering window.

normalizebool, optional

Whether to normalize the filter coefficients so they sum to 1.

cosine_taperbool, optional

If True, applies a cosine-squared taper to the Lanczos window.

Returns:
resnp.ndarray

Coefficients of the filtering window.
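As a quick sketch of how the normalization can be checked (window size and cutoff are arbitrary illustration values; the import path is an assumption):

>>> from vtools.functions.filter import lowpass_lanczos_filter_coef  # import path assumed
>>> coef = lowpass_lanczos_filter_coef(0.056, 70)   # ~36h cutoff on hourly data
>>> bool(abs(coef.sum() - 1.0) < 1e-8)              # normalize=True: coefficients sum to one
True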

minutes(m)[source]

Create a time interval representing m minutes

process_cutoff(cutoff_frequency, cutoff_period, freq)[source]
seconds(s)[source]

Create a time interval representing s seconds

ts_gaussian_filter(ts, sigma, order=0, mode='reflect', cval=0.0, truncate=4.0)[source]

Column-wise Gaussian smoothing of a regular time series. Missing/irregular values are not handled, so this function is not much different from a rolling-window Gaussian average in pandas (ts.rolling(window=5, win_type='gaussian').mean()), which may be preferable in the case of missing data. This function has been kept around with the aspiration of handling irregular series, but that is not yet implemented.

Parameters:
tsDataFrame

The series to be smoothed

sigmaint or freq

The sigma scale of the smoothing (analogous to std. deviation), given as a number of steps or freq

Returns:
resultDataFrame

A new regular time series with the same interval as ts.
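For comparison, a sketch of the pandas rolling alternative mentioned above (window length and std are arbitrary illustration values):

>>> import numpy as np
>>> import pandas as pd
>>> idx = pd.date_range("2024-01-01", periods=100, freq="15min")
>>> s = pd.Series(np.random.default_rng(0).standard_normal(100), index=idx)
>>> smooth = s.rolling(window=5, win_type="gaussian", center=True).mean(std=2)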

vtools.functions.frequency_response module

butterworth(ts, cutoff_period=None, cutoff_frequency=None, order=4)[source]

Low-pass Butterworth-squared filter on a regular time series.

Parameters:
tsDataFrame

Must be one or two dimensional, and regular.

order: int, optional

The default is 4.

cutoff_frequency: float,optional

Cutoff frequency expressed as a ratio of the Nyquist frequency; should be within the range (0,1). For a discretely sampled system, the Nyquist frequency is the fastest frequency that can be resolved by that sampling, which is half the sampling frequency. For example, if the sampling frequency is 1 sample per hour, the Nyquist frequency is 1 sample per 2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_periodstring or _time_interval

Period corresponding to the cutoff frequency. If input as a string, it must be convertible to a regular interval using the same rules as a pandas frequency. cutoff_frequency and cutoff_period can't be specified at the same time.

Returns:
result

A new regular time series with the same interval as ts.

Raises:
ValueError

If the input order is not even; or the input time series is not regular; or neither cutoff_period nor cutoff_frequency is given while the input time series interval is not 15min or 1 hour; or cutoff_period and cutoff_frequency are given at the same time.
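A worked version of the cutoff arithmetic above, assuming 15-minute data and a 40-hour cutoff (numbers are illustrative; the commented call sketches usage with the parameters listed above):

>>> samples_per_hour = 4.0                  # 15min sampling
>>> nyquist = 0.5 * samples_per_hour        # 2 cycles/hour
>>> cutoff_frequency = (1.0 / 40.0) / nyquist
>>> round(cutoff_frequency, 4)
0.0125
>>> # filtered = butterworth(ts, cutoff_frequency=cutoff_frequency)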

compare_response(cutoff_period)[source]

Generate frequency response plot of low-pass filters: cosine_lanczos, boxcar 24h, boxcar 25h, and godin.

Parameters:
cutoff_periodint

Low-pass filter cutoff period in number of hours.

Returns:
None.
cosine_lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]
cosine_lanczos5(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]

Squared low-pass cosine-Lanczos filter on a regular time series.

Parameters:
tsDataFrame
filter_lenint, time_interval

Size of the Lanczos window; the default is the number of samples within filter_period*1.25.

cutoff_frequency: float,optional

Cutoff frequency expressed as a ratio of the Nyquist frequency; should be within the range (0,1). For example, if the sampling interval is 1 hour, the Nyquist frequency is 1 sample per 2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_periodstring or _time_interval

Period corresponding to the cutoff frequency. If input as a string, it must be convertible to a _time_interval (pandas freq). cutoff_frequency and cutoff_period can't be specified at the same time.

padtypestr or None, optional

Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.

padlenint or None, optional

The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtype is not None and padlen is not given, padlen is set to 6*m.

fill_edge_nan: bool,optional

If padding is not used and fill_edge_nan is true, the resulting data on both ends are filled with nan to account for edge effects. This is 2*m on either end of the result. Default is true.

Returns:
resultTimeSeries

A new regular time series with the same interval as ts. If no padding is used, the beginning and ending 4*m values will be set to nan to remove edge effects.

Raises:
ValueError

If the input time series is not regular; or cutoff_period and cutoff_frequency are given at the same time; or neither cutoff_period nor cutoff_frequency is given; or padtype is not 'odd', 'even', 'constant', or None; or padlen is larger than the data size.
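A hedged usage sketch on synthetic hourly data (import path assumed; cutoff given as a pandas-style period string per the parameter description above):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools import cosine_lanczos       # import path assumed
>>> idx = pd.date_range("2024-01-01", periods=720, freq="h")
>>> t = np.arange(720)
>>> ts = pd.DataFrame({"wl": np.cos(2*np.pi*t/12.42) + 0.001*t}, index=idx)
>>> subtidal = cosine_lanczos(ts, cutoff_period="40h", padtype="odd")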

freqz(b, a=1, worN=512, whole=False, plot=None, fs=6.283185307179586, include_nyquist=False)[source]

Compute the frequency response of a digital filter.

Given the M-order numerator b and N-order denominator a of a digital filter, compute its frequency response:

            jw                 -jw              -jwM
   jw    B(e  )    b[0] + b[1]e    + ... + b[M]e
H(e  ) = ------ = -----------------------------------
            jw                 -jw              -jwN
         A(e  )    a[0] + a[1]e    + ... + a[N]e
Parameters:
barray_like

Numerator of a linear filter. If b has dimension greater than 1, it is assumed that the coefficients are stored in the first dimension, and b.shape[1:], a.shape[1:], and the shape of the frequencies array must be compatible for broadcasting.

aarray_like

Denominator of a linear filter. If b has dimension greater than 1, it is assumed that the coefficients are stored in the first dimension, and b.shape[1:], a.shape[1:], and the shape of the frequencies array must be compatible for broadcasting.

worN{None, int, array_like}, optional

If a single integer, then compute at that many frequencies (default is N=512). This is a convenient alternative to:

np.linspace(0, fs if whole else fs/2, N, endpoint=include_nyquist)

Using a number that is fast for FFT computations can result in faster computations (see Notes).

If an array_like, compute the response at the frequencies given. These are in the same units as fs.

wholebool, optional

Normally, frequencies are computed from 0 to the Nyquist frequency, fs/2 (upper-half of unit-circle). If whole is True, compute frequencies from 0 to fs. Ignored if worN is array_like.

plotcallable

A callable that takes two arguments. If given, the return parameters w and h are passed to plot. Useful for plotting the frequency response inside freqz.

fsfloat, optional

The sampling frequency of the digital system. Defaults to 2*pi radians/sample (so w is from 0 to pi).

New in version 1.2.0.

include_nyquistbool, optional

If whole is False and worN is an integer, setting include_nyquist to True will include the last frequency (Nyquist frequency) and is otherwise ignored.

New in version 1.5.0.

Returns:
wndarray

The frequencies at which h was computed, in the same units as fs. By default, w is normalized to the range [0, pi) (radians/sample).

hndarray

The frequency response, as complex numbers.

See also

freqz_zpk
sosfreqz

Notes

Using Matplotlib’s matplotlib.pyplot.plot() function as the callable for plot produces unexpected results, as this plots the real part of the complex transfer function, not the magnitude. Try lambda w, h: plot(w, np.abs(h)).

A direct computation via (R)FFT is used to compute the frequency response when the following conditions are met:

  1. An integer value is given for worN.

  2. worN is fast to compute via FFT (i.e., scipy.fft.next_fast_len(worN) equals worN).

  3. The denominator coefficients are a single value (a.shape[0] == 1).

  4. worN is at least as long as the numerator coefficients (worN >= b.shape[0]).

  5. If b.ndim > 1, then b.shape[-1] == 1.

For long FIR filters, the FFT approach can have lower error and be much faster than the equivalent direct polynomial calculation.

Examples

>>> from scipy import signal
>>> import numpy as np
>>> b = signal.firwin(80, 0.5, window=('kaiser', 8))
>>> w, h = signal.freqz(b)
>>> import matplotlib.pyplot as plt
>>> fig, ax1 = plt.subplots()
>>> ax1.set_title('Digital filter frequency response')
>>> ax1.plot(w, 20 * np.log10(abs(h)), 'b')
>>> ax1.set_ylabel('Amplitude [dB]', color='b')
>>> ax1.set_xlabel('Frequency [rad/sample]')
>>> ax2 = ax1.twinx()
>>> angles = np.unwrap(np.angle(h))
>>> ax2.plot(w, angles, 'g')
>>> ax2.set_ylabel('Angle (radians)', color='g')
>>> ax2.grid(True)
>>> ax2.axis('tight')
>>> plt.show()

Broadcasting Examples

Suppose we have two FIR filters whose coefficients are stored in the rows of an array with shape (2, 25). For this demonstration, we’ll use random data:

>>> rng = np.random.default_rng()
>>> b = rng.random((2, 25))

To compute the frequency response for these two filters with one call to freqz, we must pass in b.T, because freqz expects the first axis to hold the coefficients. We must then extend the shape with a trivial dimension of length 1 to allow broadcasting with the array of frequencies. That is, we pass in b.T[..., np.newaxis], which has shape (25, 2, 1):

>>> w, h = signal.freqz(b.T[..., np.newaxis], worN=1024)
>>> w.shape
(1024,)
>>> h.shape
(2, 1024)

Now, suppose we have two transfer functions, with the same numerator coefficients b = [0.5, 0.5]. The coefficients for the two denominators are stored in the first dimension of the 2-D array a:

a = [   1      1  ]
    [ -0.25, -0.5 ]
>>> b = np.array([0.5, 0.5])
>>> a = np.array([[1, 1], [-0.25, -0.5]])

Only a is more than 1-D. To make it compatible for broadcasting with the frequencies, we extend it with a trivial dimension in the call to freqz:

>>> w, h = signal.freqz(b, a[..., np.newaxis], worN=1024)
>>> w.shape
(1024,)
>>> h.shape
(2, 1024)
godin(ts)[source]

Low-pass Godin filtering of a regular time series. Applies the \(\mathcal{A_{24}^{2}A_{25}}\) Godin filter [1]. The filter is generalized to be the equivalent of one boxcar of the length of the lunar diurnal (~25 hours) constituent and two of the solar diurnal (~24 hours), though the implementation combines these steps.

Parameters:
tsDataFrame
Returns:
resultDataFrame

A new regular time series with the same interval as ts.

Raises:
NotImplementedError

If input time series is not univariate

References

[1]

Godin (1972) Analysis of Tides

inset_axes(parent_axes, width, height, loc='upper right', bbox_to_anchor=None, bbox_transform=None, axes_class=None, axes_kwargs=None, borderpad=0.5)[source]

Create an inset axes with a given width and height.

Both sizes used can be specified either in inches or percentage. For example,:

inset_axes(parent_axes, width='40%', height='30%', loc='lower left')

creates an inset axes in the lower left corner of parent_axes which spans 30% of the height and 40% of the width of parent_axes. Since the usage of .inset_axes may become slightly tricky when exceeding such standard cases, it is recommended to read the examples.

Parameters:
parent_axesmatplotlib.axes.Axes

Axes to place the inset axes.

width, heightfloat or str

Size of the inset axes to create. If a float is provided, it is the size in inches, e.g. width=1.3. If a string is provided, it is the size in relative units, e.g. width=’40%’. By default, i.e. if neither bbox_to_anchor nor bbox_transform are specified, those are relative to the parent_axes. Otherwise, they are to be understood relative to the bounding box provided via bbox_to_anchor.

locstr, default: ‘upper right’

Location to place the inset axes. Valid locations are ‘upper left’, ‘upper center’, ‘upper right’, ‘center left’, ‘center’, ‘center right’, ‘lower left’, ‘lower center’, ‘lower right’. For backward compatibility, numeric values are accepted as well. See the parameter loc of .Legend for details.

bbox_to_anchortuple or ~matplotlib.transforms.BboxBase, optional

Bbox that the inset axes will be anchored to. If None, a tuple of (0, 0, 1, 1) is used if bbox_transform is set to parent_axes.transAxes or parent_axes.figure.transFigure. Otherwise, parent_axes.bbox is used. If a tuple, can be either [left, bottom, width, height], or [left, bottom]. If the kwargs width and/or height are specified in relative units, the 2-tuple [left, bottom] cannot be used. Note that, unless bbox_transform is set, the units of the bounding box are interpreted in the pixel coordinate. When using bbox_to_anchor with tuple, it almost always makes sense to also specify a bbox_transform. This might often be the axes transform parent_axes.transAxes.

bbox_transform~matplotlib.transforms.Transform, optional

Transformation for the bbox that contains the inset axes. If None, a .transforms.IdentityTransform is used. The value of bbox_to_anchor (or the return value of its get_points method) is transformed by the bbox_transform and then interpreted as points in the pixel coordinate (which is dpi dependent). You may provide bbox_to_anchor in some normalized coordinate, and give an appropriate transform (e.g., parent_axes.transAxes).

axes_class~matplotlib.axes.Axes type, default: .HostAxes

The type of the newly created inset axes.

axes_kwargsdict, optional

Keyword arguments to pass to the constructor of the inset axes. Valid arguments include:

Properties:
adjustable: {‘box’, ‘datalim’}
agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array and two offsets from the bottom left corner of the image
alpha: scalar or None
anchor: (float, float) or {‘C’, ‘SW’, ‘S’, ‘SE’, ‘E’, ‘NE’, …}
animated: bool
aspect: {‘auto’, ‘equal’} or float
autoscale_on: bool
autoscalex_on: unknown
autoscaley_on: unknown
axes_locator: Callable[[Axes, Renderer], Bbox]
axisbelow: bool or ‘line’
box_aspect: float or None
clip_box: ~matplotlib.transforms.BboxBase or None
clip_on: bool
clip_path: Patch or (Path, Transform) or None
facecolor or fc: color
figure: ~matplotlib.figure.Figure
frame_on: bool
gid: str
in_layout: bool
label: object
mouseover: bool
navigate: bool
navigate_mode: unknown
path_effects: list of .AbstractPathEffect
picker: None or bool or float or callable
position: [left, bottom, width, height] or ~matplotlib.transforms.Bbox
prop_cycle: ~cycler.Cycler
rasterization_zorder: float or None
rasterized: bool
sketch_params: (scale: float, length: float, randomness: float)
snap: bool or None
subplotspec: unknown
title: str
transform: ~matplotlib.transforms.Transform
url: str
visible: bool
xbound: (lower: float, upper: float)
xlabel: str
xlim: (left: float, right: float)
xmargin: float greater than -0.5
xscale: unknown
xticklabels: unknown
xticks: unknown
ybound: (lower: float, upper: float)
ylabel: str
ylim: (bottom: float, top: float)
ymargin: float greater than -0.5
yscale: unknown
yticklabels: unknown
yticks: unknown
zorder: float

borderpadfloat, default: 0.5

Padding between inset axes and the bbox_to_anchor. The units are axes font size, i.e. for a default font size of 10 points borderpad = 0.5 is equivalent to a padding of 5 points.

Returns:
inset_axesaxes_class

Inset axes object created.

Notes

The meaning of bbox_to_anchor and bbox_transform is interpreted differently from that of legend. The value of bbox_to_anchor (or the return value of its get_points method; the default is parent_axes.bbox) is transformed by the bbox_transform (the default is the identity transform) and then interpreted as points in the pixel coordinate (which is dpi dependent).

Thus, the following three calls are identical and create an inset axes with respect to the parent_axes:

axins = inset_axes(parent_axes, "30%", "40%")
axins = inset_axes(parent_axes, "30%", "40%",
                   bbox_to_anchor=parent_axes.bbox)
axins = inset_axes(parent_axes, "30%", "40%",
                   bbox_to_anchor=(0, 0, 1, 1),
                   bbox_transform=parent_axes.transAxes)
lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]
lowpass_cosine_lanczos_filter_coef(cf, m, normalize=True)[source]

Return the convolution coefficients for a low-pass Lanczos filter.

Parameters:
cf: float

Cutoff frequency expressed as a ratio of the Nyquist frequency.

m: int

Size of the filtering window.

Returns:
results: list

Coefficients of filtering window.

lowpass_lanczos_filter_coef(cf, m, normalize=True, cosine_taper=False)[source]

Return the convolution coefficients for a low-pass Lanczos filter.

Parameters:
cffloat

Cutoff frequency expressed as a ratio of the Nyquist frequency.

mint

Size of the filtering window.

normalizebool, optional

Whether to normalize the filter coefficients so they sum to 1.

cosine_taperbool, optional

If True, applies a cosine-squared taper to the Lanczos window.

Returns:
resnp.ndarray

Coefficients of the filtering window.

main()[source]
mark_inset(parent_axes, inset_axes, loc1, loc2, **kwargs)[source]

Draw a box to mark the location of an area represented by an inset axes.

This function draws a box in parent_axes at the bounding box of inset_axes, and shows a connection with the inset axes by drawing lines at the corners, giving a “zoomed in” effect.

Parameters:
parent_axes~matplotlib.axes.Axes

Axes which contains the area of the inset axes.

inset_axes~matplotlib.axes.Axes

The inset axes.

loc1, loc2{1, 2, 3, 4}

Corners to use for connecting the inset axes and the area in the parent axes.

**kwargs

Patch properties for the lines and box drawn:

Properties:
agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array and two offsets from the bottom left corner of the image
alpha: unknown
animated: bool
antialiased or aa: bool or None
capstyle: .CapStyle or {‘butt’, ‘projecting’, ‘round’}
clip_box: ~matplotlib.transforms.BboxBase or None
clip_on: bool
clip_path: Patch or (Path, Transform) or None
color: color
edgecolor or ec: color or None
facecolor or fc: color or None
figure: ~matplotlib.figure.Figure
fill: bool
gid: str
hatch: {‘/’, ‘\’, ‘|’, ‘-’, ‘+’, ‘x’, ‘o’, ‘O’, ‘.’, ‘*’}
in_layout: bool
joinstyle: .JoinStyle or {‘miter’, ‘round’, ‘bevel’}
label: object
linestyle or ls: {‘-’, ‘–’, ‘-.’, ‘:’, ‘’, (offset, on-off-seq), …}
linewidth or lw: float or None
mouseover: bool
path_effects: list of .AbstractPathEffect
picker: None or bool or float or callable
rasterized: bool
sketch_params: (scale: float, length: float, randomness: float)
snap: bool or None
transform: ~matplotlib.transforms.Transform
url: str
visible: bool
zorder: float

Returns:
pp~matplotlib.patches.Patch

The patch drawn to represent the area of the inset axes.

p1, p2~matplotlib.patches.Patch

The patches connecting two corners of the inset axes and its area.

ts_gaussian_filter(ts, sigma, order=0, mode='reflect', cval=0.0, truncate=4.0)[source]

Column-wise Gaussian smoothing of a regular time series. Missing/irregular values are not handled, so this function is not much different from a rolling-window Gaussian average in pandas (ts.rolling(window=5, win_type='gaussian').mean()), which may be preferable in the case of missing data. This function has been kept around with the aspiration of handling irregular series, but that is not yet implemented.

Parameters:
tsDataFrame

The series to be smoothed

sigmaint or freq

The sigma scale of the smoothing (analogous to std. deviation), given as a number of steps or freq

Returns:
resultDataFrame

A new regular time series with the same interval as ts.

unit_impulse_ts(size, interval)[source]
Parameters:
sizeint

Length of the impulse time series; must be an odd number.

intervalstring

Time series interval, such as "15min"

Returns:
pandas.Dataframe

Time series with value 0.0 everywhere except 1.0 at the middle of the series.

vtools.functions.interannual module

interannual(ts)[source]

Pivots years of a multiyear series to columns and converts the index to elapsed time of year

Parameters:
tsseries Univariate series
Returns:
annualDataFrame with year in the columns and elapsed time of year as index
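A minimal pandas sketch of the same pivot, using day of year as a simplified stand-in for elapsed time of year:

>>> import numpy as np
>>> import pandas as pd
>>> idx = pd.date_range("2020-01-01", "2022-12-31", freq="D")
>>> ts = pd.Series(np.arange(len(idx), dtype=float), index=idx)
>>> annual = ts.groupby([ts.index.year, ts.index.dayofyear]).first().unstack(level=0)
>>> annual.shape                            # one column per year
(366, 3)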
interannual_sample()[source]
interannual_ticks_labels(freq)[source]

Convenient ticks and labels for interannual

Parameters:
freqFrequency string for desired spacing. Must be “Q”,”QS”,”M”,”MS” for quarterly or monthly
Returns:
ticks_labelstuple of tick locations and labels

vtools.functions.interpolate module

Module for data interpolation using splines or interfaces unavailable in Pandas.

_monotonic_spline(x, y, xnew)[source]

Third order (M3-A) monotonicity-preserving spline. Usage: interpolate.spline(x, y, xnew)

where

x are the sorted index values of the original data
y are the original data
xnew are new locations for the spline

Reference: Huynh, H.T., "Accurate Monotone Cubic Interpolation", SIAM J. Numer. Anal., Vol. 30, No. 1, pp. 57-100. All equation numbers refer to this paper. The variable names are also almost the same. Double letters like "ee" indicate that the subscript should have "+1/2" added to it, and a number after the variable shows the "t" that the first member applies to.

_ratsp1(x, y, p, q, y0, yn)[source]

RATSP1 in Spath (1995)

interpolate_to_index(df, dest)[source]
monotonic_spline(ts, dest)[source]

Interpolate a regular time series (rts) to a finer rts by rational histospline.

The rational histospline preserves area under the curve. This is a good choice of spline for period averaged data where an interpolant is desired that is 'conservative'. Note that it is the underlying continuous interpolant that will be 'conservative', though, not the returned discrete time series, which merely samples the underlying interpolant.

Parameters:
tsPandas.DataFrame

Series to be interpolated, typically with DatetimeIndex

desta pandas freq code (e.g. ‘16min’ or ‘D’) or a DateTimeIndex
Returns:
resultDataFrame

A regular time series with same columns as ts, populated with instantaneous values and with an index of type DateTimeIndex

rhist(x, y, xnew, y0, yn, p, q)[source]

Histospline for arrays with tension. Based on the algorithm rhist2 in One Dimensional Spline Interpolation Algorithms by Helmuth Spath (1995).

Parameters:
xarray-like

Abscissa array of original data, of length n

yarray-like, dimension (n-1)

Values (mantissa) of original data giving the rectangle (average) values between x[i] and x[i+1]

xnewarray-like

Array of new locations at which to interpolate.

y0,ynfloat

Initial and terminal values

p,q: array-like, dimension (n-1)

Tension parameter; p and q are almost always the same. The higher p and q are for a particular x interval, the more rectangular the interpolant will look and the more positivity and shape preserving it is, at the expense of accuracy. For this routine any number p,q > -1 is allowed, although the bound routine doesn't use values less than zero.

Returns:
ynewarray-like

Array that interpolates the original data.

rhist_bound(x, y, xnew, y0, yn, p, lbound=None, maxiter=5, pfactor=2, floor_eps=0.001)[source]

Numpy implementation of a histospline with bounds: a histospline for arrays with lower bound enforcement. This routine drives rhist() but tests that the output array observes the lower bound, adapting the tension parameters as needed.

This will not work exactly if the input array has values right on the lower bound. In this case, the parameter floor_eps allows you to specify a tolerance of bound violation to shoot for … and if it isn’t met in maxiter iterations the value is simply floored.

Parameters:
xarray-like

Abscissa array of original data to be interpolated, of length n

yarray-like, dimension (n-1)

Values (mantissa) of original data giving the rectangle (average) values between x[i] and x[i+1]

xnewarray-like

Array of new locations at which to interpolate.

y0,ynfloat

Initial and terminal values

p: float

Tension parameter. This starts out as a global scalar, but will be converted to an array and adapted locally. The higher this goes for a particular x interval, the more rectangular the interpolant will look and the more positivity and shape preserving it is at the expense of accuracy. A good number is 1, and for this routine, p > 0 is required because the adaptive process multiplies it by pfactor each iteration on the expectation that it will get bigger.

lbound: float

Lower bound to be enforced. If the original y’s are strictly above this value, the output has the potential to also be strictly above. If the original y’s lie on the lower bound, then the lower bound can only be enforced within a tolerance using the Spath algorithm … and once the values reach that tolerance they are floored. If lbound = None, this function behaves like rhist()

maxiterinteger

Number of times to increase p by multiplying it by pfactor before giving up on satisfying floor_eps.

pfactorfloat

Factor by which to multiply individual time step p

floor_epsfloat

Tolerance for lower bound violation at which the algorithm will be terminated and the bounds will be enforced by flooring.

Returns:
ynewarray-like

Array that interpolates the original data, on a curve that conserves mass and strictly observes the lower bound.

rhist_coef(x, y, y0, yn, p, q)[source]

Routine that produces coefficients for the histospline

rhist_example()[source]
rhist_val(xnew, x, p, q, a, b, c)[source]

Evaluate a histospline at new x points

rhistinterp(ts, dest, p=2.0, lowbound=None, tolbound=0.001, maxiter=5)[source]

Interpolate a regular time series (rts) to a finer rts by rational histospline.

The rational histospline preserves area under the curve. This is a good choice of spline for period averaged data where an interpolant is desired that is 'conservative'. Note that it is the underlying continuous interpolant that will be 'conservative', though, not the returned discrete time series, which merely samples the underlying interpolant.

Parameters:
tsPandas.DataFrame

Series to be interpolated, with period index and assuming time stamps at beginning of the period and no missing data

deststring or DateTimeIndex

A pandas freq code (e.g. ‘16min’ or ‘D’) or a DateTimeIndex

pfloat, optional

Spline tension, usually between 0 and 20. Must be > -1. For a 'sufficiently large' value of p, the interpolant will be monotonicity-preserving and will maintain strict positivity (always being strictly > lowbound). It will also preserve the original shape of the time series.

lowboundfloat, optional

Lower bound of interpolated values.

tolboundfloat, optional

Tolerance for determining if an input is on the bound.

Returns:
resultpandas.DataFrame

A regular time series with same columns as ts, populated with instantaneous values and with an index of type DateTimeIndex
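A usage sketch on synthetic daily-averaged data (a PeriodIndex with period-start stamps, per the parameter description above):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.interpolate import rhistinterp
>>> pidx = pd.period_range("2024-01-01", periods=30, freq="D")
>>> daily = pd.DataFrame({"flow": 100.0 + 10.0*np.sin(np.arange(30)/5.0)}, index=pidx)
>>> fine = rhistinterp(daily, "15min", p=2.0, lowbound=0.0)   # conservative refinement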

tridiagonal(a, b, c, d)[source]

a is the lower band (with leading zero)
b is the center diagonal (length == nrow)
c is the upper band (trailing zero)
d is the right-hand side
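A sketch of how this band convention maps to a dense matrix, assuming the return value is the solution vector (the return is not documented above):

>>> import numpy as np
>>> from vtools.functions.interpolate import tridiagonal   # assumed exported here
>>> a = np.array([0.0, 1.0, 1.0, 1.0])     # lower band, leading zero
>>> b = np.array([4.0, 4.0, 4.0, 4.0])     # center diagonal
>>> c = np.array([1.0, 1.0, 1.0, 0.0])     # upper band, trailing zero
>>> d = np.array([5.0, 6.0, 6.0, 5.0])     # right-hand side
>>> dense = np.diag(b) + np.diag(a[1:], k=-1) + np.diag(c[:-1], k=1)
>>> x = tridiagonal(a, b, c, d)
>>> np.allclose(x, np.linalg.solve(dense, d))   # expected if x is the solution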

vtools.functions.lag_cross_correlation module

calculate_lag(lagged, base, max_lag, res, interpolate_method='linear')[source]

Calculate shift in lagged, that maximizes cross-correlation with base.

Parameters:
base,lagged: :class:`Pandas.Series`

time series to compare. The result is relative to base

max_lag: interval

Maximum positive/negative time shift to consider in cross-correlation (i.e., from -max_lag to +max_lag). Required windows in lagged will account for this bracket. For series dominated by a single frequency (e.g. 1 cycle/12.5 hours for tides), the algorithm can tolerate a range of 180 degrees (6 hours).

res: interval

Resolution of the analysis. The series lagged will be interpolated to this resolution using interpolate_method. The unit used here determines the type of the output. See the documentation of the interval concept, which is most compatible with pandas.tseries.offsets (not Timedelta) because of better math properties in operations like division; vtime helpers like minutes(1) may be helpful.

interpolate_method: str, optional

Interpolate method to refine lagged to res. Must be compatible with pandas interpolation method names (and hence scipy)

Returns:
laginterval

Shift as a pandas.tseries.offsets subtype that matches units with res. This shift is the apparent lateness (positive) or earliness (negative). It must be applied to base or removed from lagged to align the features.
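An illustrative sketch with a known two-hour offset (the import paths for the function and the vtime helpers are assumptions):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.lag_cross_correlation import calculate_lag
>>> from vtools import hours, minutes      # vtime helper import path assumed
>>> idx = pd.date_range("2024-01-01", periods=336, freq="h")
>>> t = np.arange(336, dtype=float)
>>> base = pd.Series(np.cos(2*np.pi*t/12.42), index=idx)
>>> lagged = pd.Series(np.cos(2*np.pi*(t - 2.0)/12.42), index=idx)   # ~2h late
>>> lag = calculate_lag(lagged, base, max_lag=hours(3), res=minutes(1))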

icrosscorr(lag, ts0, ts1)[source]

Lag-N cross correlation. Shifted data filled with NaNs

Parameters:
lagint, default 0
ts0, ts1pandas.Series objects of equal length
Returns:
crosscorrfloat
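A minimal sketch of the lag-N idea using a hypothetical helper (not the implementation above): shift one series and let pandas ignore the NaN overhang when correlating:

>>> import numpy as np
>>> import pandas as pd
>>> def lagged_corr(lag, ts0, ts1):
...     # shifted-in values are NaN and are excluded from the correlation
...     return ts0.corr(ts1.shift(lag))
>>> s = pd.Series(np.sin(np.linspace(0, 20, 200)))
>>> round(lagged_corr(0, s, s), 6)
1.0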
mincrosscorr(lag, ts0, ts1)[source]

vtools.functions.merge module

_apply_names(result, names)[source]

Helper to apply renaming and column selection based on names.

_reindex_to_continuous(result, first_freq)[source]

Reindex the given result (DataFrame or Series) to a continuous index spanning from the minimum to maximum timestamp of the current index using the provided frequency.

Parameters:
resultpandas.DataFrame or pandas.Series

The merged or spliced result whose index will be reindexed.

first_freqfrequency

The frequency (e.g. ‘D’ for daily) to use for the continuous index, taken from the first input series.

Returns:
resultpandas.DataFrame or pandas.Series

The result reindexed to a continuous index. Any gaps in the timeline will be filled with NaN. If the index is a PeriodIndex, the index is rebuilt with the proper frequency.

reduce(function, iterable[, initial]) -> value

Apply a function of two arguments cumulatively to the items of a sequence or iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If initial is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty.

ts_merge(series, names=None, strict_priority=False)[source]

Merge multiple time series together, prioritizing series in order.

Parameters:
seriessequence of pandas.Series or pandas.DataFrame

Higher priority first. All indexes must be DatetimeIndex.

namesNone, str, or iterable of str, optional
  • If None (default), inputs must share compatible column names.

  • If str, the output is univariate and will be named accordingly.

  • If iterable, it is used as a subset/ordering of columns.

strict_prioritybool, default False

If False (default): lower-priority data may fill NaNs in higher-priority series anywhere (traditional merge/overlay). If True: for each column, within the window [first_valid_index, last_valid_index] of any higher-priority series, lower-priority data are masked out — even if the higher-priority value is NaN. Outside those windows, behavior is unchanged.

Returns:
pandas.Series or pandas.DataFrame
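A small usage sketch with synthetic data (module path per the heading above):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.merge import ts_merge
>>> idx = pd.date_range("2024-01-01", periods=10, freq="D")
>>> primary = pd.Series(1.0, index=idx, name="flow")
>>> primary.iloc[3:5] = np.nan                 # gap in the preferred series
>>> backup = pd.Series(2.0, index=idx, name="flow")
>>> merged = ts_merge((primary, backup))       # backup fills the two NaNs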
ts_splice(series, names=None, transition='prefer_last', floor_dates=False)[source]

Splice multiple time series together, prioritizing series in patches of time.

Unlike ts_merge, which blends overlapping data points, ts_splice stitches together time series without overlap. The function determines when to switch between series based on a transition strategy.

Parameters:
seriestuple or list of pandas.DataFrame or pandas.Series

A tuple or list of time series. Each series must have a DatetimeIndex and consistent column structure.

namesNone, str, or iterable of str, optional
  • If None (default), all input series must share common column names, and the output will merge common columns.

  • If a str, all input series must have a single column, and the output will be a DataFrame with this name as the column name.

  • If an iterable of str, all input DataFrames must have the same number of columns matching the length of names, and these will be used for the output.

transition{‘prefer_first’, ‘prefer_last’} or list of pandas.Timestamp

Defines how to determine breakpoints between time series:
  • ‘prefer_first’: Uses the earlier series in the list up until its last valid timestamp.
  • ‘prefer_last’: Uses the later series starting from its first valid timestamp.
  • A list of specific timestamps can also be provided as transition points.

floor_datesbool, optional, default=False

If True, inferred transition timestamps (prefer_first or prefer_last) are floored to the beginning of the day. This can introduce NaNs if the input series are regular with a freq attribute.

Returns:
pandas.DataFrame or pandas.Series
  • If the input contains multi-column DataFrames, the output is a DataFrame with the same column structure.

  • If a collection of single-column Series is provided, the output will be a Series.

  • The output retains a freq attribute if all inputs share the same frequency.

See also

ts_merge

Merges series by filling gaps in order of priority.

Notes

  • The output time index is the union of input time indices.

  • If transition is ‘prefer_first’, gaps may appear in the final time series.

  • If transition is ‘prefer_last’, overlapping data is resolved in favor of later series.
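A companion sketch for splicing, in the same style as the ts_merge example above:

>>> import pandas as pd
>>> from vtools.functions.merge import ts_splice
>>> early = pd.Series(1.0, index=pd.date_range("2024-01-01", periods=10, freq="D"),
...                   name="flow")
>>> late = pd.Series(2.0, index=pd.date_range("2024-01-06", periods=10, freq="D"),
...                   name="flow")
>>> spliced = ts_splice((early, late), transition="prefer_last")
>>> # with 'prefer_last', values switch to late at its first valid timestamp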

vtools.functions.period_op module

period_op(ts, period='D', agg='mean', max_absent_frac=0.0)[source]
window_op(ts, window, period='D', agg='mean', max_absent_frac=0.0)[source]

vtools.functions.read_dss module

check_exclude(pathname, exclude_pathname)[source]

Returns True if pathname matches the exclude_pathname pattern. Wildcards (*) in exclude_pathname are supported.

read_dss(filename, pathname, dt=<15 * Minutes>, p=2.0, start_date=None, end_date=None, exclude_pathname=None)[source]

Reads in a DSM2 DSS file and interpolates, outputting an interpolated DataFrame of that variable.

Parameters:
filename: str|Path

Path to the DSS file to read

pathname: str|list

Pathname(s) within the DSS file to read. Needs to be in the format ‘/A_PART/B_PART/C_PART/D_PART/E_PART/F_PART/’ (e.g. ‘//RSAN112/FLOW////’)

vtools.functions.savitzky_golay module

_build_vander(x, degree)[source]
_evaluate_poly(c, x)[source]
_polyfit_window(x, y, w, degree)[source]
main()[source]
savgol_filter_numba(y, window_length, degree, error)[source]
savgol_filter_weighted(data, window_length, degree, error=None, cov_matrix=None, deriv=None, use_numba=True)[source]

Apply a Savitzky–Golay filter with weights to a univariate DataFrame or Series.

Parameters:
datapandas.DataFrame or pandas.Series

DataFrame or Series containing your data.

window_lengthint

Length of the filter window (must be odd).

degreeint

Degree of the polynomial fit.

errorpandas.Series, optional

Series containing the error (used to compute weights).

cov_matrix2D numpy array, optional

Covariance matrix for the errors.

derivint, optional

Derivative order to compute.

use_numbabool, optional

If True, uses the Numba-accelerated kernel.

Returns:
pandas.Series

Series of the filtered values.

Notes

The practical size of window_length depends on the data and the computational resources. Larger window lengths provide smoother results but require more computation and may not capture local variations well. It is recommended to experiment with different window lengths to find the optimal value for your specific application.

Some of the workflow derived from this work: https://github.com/surhudm/savitzky_golay_with_errors
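A usage sketch on synthetic noisy data (window length and degree are illustration values; module path per the heading above):

>>> import numpy as np
>>> import pandas as pd
>>> from vtools.functions.savitzky_golay import savgol_filter_weighted
>>> idx = pd.date_range("2024-01-01", periods=200, freq="15min")
>>> rng = np.random.default_rng(0)
>>> noisy = pd.Series(np.sin(np.linspace(0, 6, 200)) + 0.1*rng.standard_normal(200),
...                   index=idx)
>>> err = pd.Series(0.1, index=idx)            # constant 1-sigma error -> uniform weights
>>> smooth = savgol_filter_weighted(noisy, window_length=21, degree=3, error=err)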

savgol_filter_werror(y, window_length, degree, error=None, cov=None, deriv=None)[source]
solve_leastsq(yarr, ycov, vander, vanderT, deriv=None)[source]
solve_polyfit(xarr, yarr, degree, weight, deriv=None)[source]

vtools.functions.skill_metrics module

_main()[source]
corr_coefficient(predictions, targets, method='pearson')[source]

Calculates the correlation coefficient (the 'r' in 'r-squared') between two series.

For time series where the targets are serially correlated and may span only a fraction of the natural variability, this statistic may not be appropriate; Murphy (1988) explains why caution should be exercised in using it.

Parameters:
predictions, targetsarray_like

Time series to analyze

method{‘pearson’, ‘kendall’, ‘spearman’}

Method compatible with pandas

Returns:
rfloat

Correlation coefficient

mean_error(predictions, targets, proportiontocut)[source]

Calculate the untrimmed mean error, discounting nan values

Parameters:
predictions, targetsarray_like

Time series or arrays to be analyzed

Returns:
meanfloat

Mean error

median_error(predictions, targets)[source]

Calculate the median error, discounting nan values

Parameters:
predictions, targetsarray_like

Time series or arrays to be analyzed

Returns:
medfloat

Median error

mse(predictions, targets)[source]

Mean squared error

Parameters:
predictions, targetsarray_like

Time series or arrays to analyze

Returns:
msevtools.data.timeseries.TimeSeries

Mean squared error between predictions and targets

rmse(predictions, targets)[source]

Root mean squared error

Parameters:
predictions, targetsarray_like

Time series or arrays to analyze

Returns:
rmsefloat

Root mean squared error

skill_score(predictions, targets, ref=None)[source]

Calculate a Nash-Sutcliffe-like skill score based on mean squared error

As per the discussion in Murphy (1988) other reference forecasts (climatology, harmonic tide, etc.) are possible.

Parameters:
predictions, targetsarray_like

Time series or arrays to be analyzed

Returns:
scorefloat

Skill score based on mean squared error relative to the reference forecast
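A numpy sketch of an MSE-based skill score of this kind (a hypothetical helper, not necessarily the exact implementation; with ref=None the observed mean serves as the reference forecast):

>>> import numpy as np
>>> def mse_skill(pred, obs, ref=None):
...     # 1 - MSE(pred, obs) / MSE(ref, obs); 1 is perfect, 0 matches the reference
...     pred, obs = np.asarray(pred, float), np.asarray(obs, float)
...     ref = np.full_like(obs, np.nanmean(obs)) if ref is None else np.asarray(ref, float)
...     return 1.0 - np.nanmean((pred - obs)**2) / np.nanmean((ref - obs)**2)
>>> round(mse_skill([1.1, 2.0, 2.9], [1.0, 2.0, 3.0]), 3)
0.99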

tmean_error(predictions, targets, limits=None, inclusive=[True, True])[source]

Calculate the (possibly trimmed) mean error, discounting nan values

Parameters:
predictions, targetsarray_like

Time series or arrays to be analyzed

limitstuple(float)

Low and high limits for trimming

inclusive[boolean, boolean]

True if clipping is inclusive on the low/high end

Returns:
meanfloat

Trimmed mean error

willmott_score(predictions, targets, ref=None)[source]

Calculate the Willmott score (index of agreement)

As per the discussion in Murphy (1988) other reference forecasts (climatology, harmonic tide, etc.) are possible.

Parameters:
predictions, targetsarray_like

Time series or arrays to be analyzed

Returns:
scorefloat

Willmott score

vtools.functions.tidalhl module

cosine_lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]
filter_where_na(df, dfb)[source]

Remove values in df where dfb has NA values

get_phase_diff(df1, df2, tolerance='4h')[source]
get_smoothed_resampled(df, cutoff_period='2h', resample_period='1min', interpolate_method='pchip')[source]

Resample the dataframe (indexed by time) to the regular period resample_period using interpolate_method

Furthermore, the cosine-Lanczos filter is used with cutoff_period to smooth the signal and remove high-frequency noise

Args:

df (DataFrame): A single column dataframe indexed by datetime

cutoff_period (str, optional): cutoff period for cosine lanczos filter. Defaults to ‘2h’.

resample_period (str, optional): Resample to regular period. Defaults to ‘1min’.

interpolate_method (str, optional): interpolation for resampling. Defaults to ‘pchip’.

Returns:

DataFrame: smoothed and resampled dataframe indexed by datetime

get_tidal_amplitude(dfh, dfl)[source]

Tidal amplitude given tidal highs and lows

Args:

dfh (DataFrame): Tidal highs time series

dfl (DataFrame): Tidal lows time series

Returns:

DataFrame: Amplitude timeseries, at the times of the low following the high being used for amplitude calculation

get_tidal_amplitude_diff(dfamp1, dfamp2, percent_diff=False, tolerance='4h')[source]

Get the difference of values within +/- 4H of values in the two amplitude arrays

Args:

dfamp1 (DataFrame): Amplitude time series

dfamp2 (DataFrame): Amplitude time series

percent_diff (bool, optional): If true do percent diff. Defaults to False.

Returns:

DataFrame: Difference dfamp1-dfamp2 or % Difference (dfamp1-dfamp2)/dfamp2*100 for values within +/- 4H of each other

get_tidal_hl(df, cutoff_period='2h', resample_period='1min', interpolate_method='pchip', moving_window_size='7h')[source]

Get Tidal highs and lows

Args:

df (DataFrame): A single column dataframe indexed by datetime

cutoff_period (str, optional): cutoff period for cosine lanczos filter. Defaults to ‘2h’.

resample_period (str, optional): Resample to regular period. Defaults to ‘1min’.

interpolate_method (str, optional): interpolation for resampling. Defaults to ‘pchip’.

moving_window_size (str, optional): moving window size to look for lows within. Defaults to ‘7h’.

Returns:

tuple of DataFrame: Tidal high and tidal low time series

get_tidal_hl_rolling(df, cutoff_period='2h', resample_period='1min', interpolate_method='pchip', moving_window_size='7h')

Get Tidal highs and lows

Args:

df (DataFrame): A single column dataframe indexed by datetime

cutoff_period (str, optional): cutoff period for cosine lanczos filter. Defaults to ‘2h’.

resample_period (str, optional): Resample to regular period. Defaults to ‘1min’.

interpolate_method (str, optional): interpolation for resampling. Defaults to ‘pchip’.

moving_window_size (str, optional): moving window size to look for lows within. Defaults to ‘7h’.

Returns:

tuple of DataFrame: Tidal high and tidal low time series

get_tidal_hl_zerocrossing(df, round_to='1min')[source]

Finds the tidal high and low times using zero crossings of the first derivative.

This works for all situations but is not robust in the face of noise and perturbations in the signal

get_tidal_phase_diff(dfh2, dfl2, dfh1, dfl1, tolerance='4h')[source]

Calculates the phase difference between df2 and df1 tidal highs and lows

Scans +/- 4 hours in df1 to get the highs and lows in that window for df2, obtaining the tidal highs and lows at the times of df1

Args:

dfh2 (DataFrame): Timeseries of tidal highs

dfl2 (DataFrame): Timeseries of tidal lows

dfh1 (DataFrame): Timeseries of tidal highs

dfl1 (DataFRame): Timeseries of tidal lows

Returns:

DataFrame: Phase difference (dfh2-dfh1) and (dfl2-dfl1) in minutes

limit_to_indices(df, si, ei)[source]
lmax(arr)[source]

Local maximum: Returns value only when centered on maximum

lmin(arr)[source]

Local minimum: Returns value only when centered on minimum

periods_per_window(moving_window_size: str, period_str: str) -> int[source]

Number of periods in the moving window

Args:

moving_window_size (str): moving window size as a string, e.g. ‘7H’ for 7 hours

period_str (str): period as a string, e.g. ‘1T’ for 1 minute

Returns:

int: number of periods in the moving window rounded to an integer
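The underlying arithmetic reduces to a Timedelta division:

>>> import pandas as pd
>>> int(pd.Timedelta("7h") / pd.Timedelta("1min"))   # 1-minute periods in a 7-hour window
420
>>> # periods_per_window("7h", "1min") should agree, returning 420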

tidal_highs(df, moving_window_size='7h')[source]

Tidal highs (could be up to two highs in a 25 hr period)

Args:

df (DataFrame): a time series with a regular frequency

moving_window_size (str, optional): moving window size to look for highs within. Defaults to ‘7h’.

Returns:

DataFrame: an irregular time series with highs at resolution of df.index

tidal_lows(df, moving_window_size='7h')[source]

Tidal lows (could be up to two lows in a 25 hr period)

Args:

df (DataFrame): a time series with a regular frequency

moving_window_size (str, optional): moving window size to look for lows within. Defaults to ‘7h’.

Returns:

DataFrame: an irregular time series with lows at resolution of df.index

where_changed(df)[source]
where_same(dfg, df)[source]

Return dfg only where its value is the same as df for the same time stamps, i.e. the intersection locations with df

zerocross(df)[source]

Calculates the gradient of the time series and identifies locations where the gradient changes sign. Returns the time, rounded to the nearest minute, where the zero crossing happens (based on a linear derivative assumption).

vtools.functions.tidalhours module

cdiff(a, n=1, axis=-1)[source]

Like np.diff, but include difference from last element back to first.

cosine_lanczos5(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]

Squared low-pass cosine-Lanczos filter on a regular time series.

Parameters:
tsDataFrame
filter_lenint, time_interval

Size of the Lanczos window; the default is the number of samples within filter_period*1.25.

cutoff_frequency: float,optional

Cutoff frequency expressed as a ratio of the Nyquist frequency; should be within the range (0,1). For example, if the sampling interval is 1 hour, the Nyquist frequency is 1 sample per 2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.

cutoff_periodstring or _time_interval

Period corresponding to the cutoff frequency. If input as a string, it must be convertible to a _time_interval (pandas freq). cutoff_frequency and cutoff_period can't be specified at the same time.

padtypestr or None, optional

Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.

padlenint or None, optional

The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtype is not None and padlen is not given, padlen is set to 6*m.

fill_edge_nan: bool,optional

If padding is not used and fill_edge_nan is true, the resulting data on both ends are filled with nan to account for edge effects. This is 2*m on either end of the result. Default is true.

Returns:
resultTimeSeries

A new regular time series with the same interval as ts. If no padding is used, the beginning and ending 4*m values will be set to nan to remove edge effects.

Raises:
ValueError

If the input time series is not regular; or cutoff_period and cutoff_frequency are given at the same time; or neither cutoff_period nor cutoff_frequency is given; or padtype is not 'odd', 'even', 'constant', or None; or padlen is larger than the data size.

diff_h(tidal_hour_series)[source]

Compute the time derivative of tidal hour.

This derivative is often included to capture how rapidly the tidal phase is changing, which can be important in modeling flow reversals, estuarine dynamics, or for detecting slack tide conditions where the rate of change is near zero.

Parameters:
tidal_hour_seriespandas.Series

Output of tidal_hour_signal, indexed by datetime.

Returns:
pandas.Series

Time derivative of tidal hour (dH/dt) in hours/hour, indexed by datetime.

find_slack(jd, u, leave_mean=False, which='both')[source]
hilbert(x, N=None, axis=-1)[source]

Compute the analytic signal, using the Hilbert transform.

The transformation is done along the last axis by default.

Parameters:
xarray_like

Signal data. Must be real.

Nint, optional

Number of Fourier components. Default: x.shape[axis]

axisint, optional

Axis along which to do the transformation. Default: -1.

Returns:
xandarray

Analytic signal of x, of each 1-D array along axis

Notes

The analytic signal x_a(t) of signal x(t) is:

\[x_a = F^{-1}(F(x) 2U) = x + i y\]

where F is the Fourier transform, U the unit step function, and y the Hilbert transform of x. [1]

In other words, the negative half of the frequency spectrum is zeroed out, turning the real-valued signal into a complex signal. The Hilbert transformed signal can be obtained from np.imag(hilbert(x)), and the original signal from np.real(hilbert(x)).

References

[1]

Wikipedia, “Analytic signal”. https://en.wikipedia.org/wiki/Analytic_signal

[2]

Leon Cohen, “Time-Frequency Analysis”, 1995. Chapter 2.

[3]

Alan V. Oppenheim, Ronald W. Schafer. Discrete-Time Signal Processing, Third Edition, 2009. Chapter 12. ISBN 13: 978-1292-02572-8

Examples

In this example we use the Hilbert transform to determine the amplitude envelope and instantaneous frequency of an amplitude-modulated signal.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy.signal import hilbert, chirp
>>> duration = 1.0
>>> fs = 400.0
>>> samples = int(fs*duration)
>>> t = np.arange(samples) / fs

We create a chirp of which the frequency increases from 20 Hz to 100 Hz and apply an amplitude modulation.

>>> signal = chirp(t, 20.0, t[-1], 100.0)
>>> signal *= (1.0 + 0.5 * np.sin(2.0*np.pi*3.0*t) )

The amplitude envelope is given by magnitude of the analytic signal. The instantaneous frequency can be obtained by differentiating the instantaneous phase in respect to time. The instantaneous phase corresponds to the phase angle of the analytic signal.

>>> analytic_signal = hilbert(signal)
>>> amplitude_envelope = np.abs(analytic_signal)
>>> instantaneous_phase = np.unwrap(np.angle(analytic_signal))
>>> instantaneous_frequency = (np.diff(instantaneous_phase) /
...                            (2.0*np.pi) * fs)
>>> fig, (ax0, ax1) = plt.subplots(nrows=2)
>>> ax0.plot(t, signal, label='signal')
>>> ax0.plot(t, amplitude_envelope, label='envelope')
>>> ax0.set_xlabel("time in seconds")
>>> ax0.legend()
>>> ax1.plot(t[1:], instantaneous_frequency)
>>> ax1.set_xlabel("time in seconds")
>>> ax1.set_ylim(0.0, 120.0)
>>> fig.tight_layout()
hour_tide(jd, u=None, h=None, jd_new=None, leave_mean=False, start_datum='ebb')[source]

Calculate tide-hour from a time series of u (velocity, flood-positive) or h (water level, i.e. positive-up).

jd: time in days, i.e. julian day, datenum, etc.

u: velocity, flood-positive

h: water level, positive up.

jd_new: if you want to evaluate the tidal hour at a different (but overlapping) set of times.

leave_mean: by default, the time series mean is removed, but this can be disabled by passing True here.

The return values are between 0 and 12, with 0 being slack before ebb (hmm - may need to verify that).

hour_tide_fn(jd, u, start_datum='ebb', leave_mean=False)[source]

Return a function for extracting tidal hour from the time/velocity given. Use the _fn version if making repeated calls with different jd_new, but the same jd,u

class interp1d(x, y, kind='linear', axis=-1, copy=True, bounds_error=None, fill_value=nan, assume_sorted=False)[source]

Bases: _Interpolator1D

Interpolate a 1-D function.

x and y are arrays of values used to approximate some function f: y = f(x). This class returns a function whose call method uses interpolation to find the value of new points.

Parameters:
x(npoints, ) array_like

A 1-D array of real values.

y(…, npoints, …) array_like

A N-D array of real values. The length of y along the interpolation axis must be equal to the length of x. Use the axis parameter to select correct axis. Unlike other interpolators, the default interpolation axis is the last axis of y.

kindstr or int, optional

Specifies the kind of interpolation as a string or as an integer specifying the order of the spline interpolator to use. The string has to be one of ‘linear’, ‘nearest’, ‘nearest-up’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’. ‘zero’, ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of zeroth, first, second or third order; ‘previous’ and ‘next’ simply return the previous or next value of the point; ‘nearest-up’ and ‘nearest’ differ when interpolating half-integers (e.g. 0.5, 1.5) in that ‘nearest-up’ rounds up and ‘nearest’ rounds down. Default is ‘linear’.

axisint, optional

Axis in the y array corresponding to the x-coordinate values. Unlike other interpolators, defaults to axis=-1.

copybool, optional

If True, the class makes internal copies of x and y. If False, references to x and y are used. The default is to copy.

bounds_errorbool, optional

If True, a ValueError is raised any time interpolation is attempted on a value outside of the range of x (where extrapolation is necessary). If False, out of bounds values are assigned fill_value. By default, an error is raised unless fill_value="extrapolate".

fill_valuearray-like or (array-like, array_like) or “extrapolate”, optional
  • if a ndarray (or float), this value will be used to fill in for requested points outside of the data range. If not provided, then the default is NaN. The array-like must broadcast properly to the dimensions of the non-interpolation axes.

  • If a two-element tuple, then the first element is used as a fill value for x_new < x[0] and the second element is used for x_new > x[-1]. Anything that is not a 2-element tuple (e.g., list or ndarray, regardless of shape) is taken to be a single array-like argument meant to be used for both bounds as below, above = fill_value, fill_value. Using a two-element tuple or ndarray requires bounds_error=False.

    New in version 0.17.0.

  • If “extrapolate”, then points outside the data range will be extrapolated.

    New in version 0.17.0.

assume_sortedbool, optional

If False, values of x can be in any order and they are sorted first. If True, x has to be an array of monotonically increasing values.

See also

splrep, splev

Spline interpolation/smoothing based on FITPACK.

UnivariateSpline

An object-oriented wrapper of the FITPACK routines.

interp2d

2-D interpolation

Notes

Calling interp1d with NaNs present in input values results in undefined behaviour.

Input values x and y must be convertible to float values like int or float.

If the values in x are not unique, the resulting behavior is undefined and specific to the choice of kind, i.e., changing kind will change the behavior for duplicates.

Examples

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy import interpolate
>>> x = np.arange(0, 10)
>>> y = np.exp(-x/3.0)
>>> f = interpolate.interp1d(x, y)
>>> xnew = np.arange(0, 9, 0.1)
>>> ynew = f(xnew)   # use interpolation function returned by `interp1d`
>>> plt.plot(x, y, 'o', xnew, ynew, '-')
>>> plt.show()
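A short variant (not part of the scipy docstring) illustrating the out-of-bounds behavior described under bounds_error and fill_value:

>>> f_extrap = interpolate.interp1d(x, y, fill_value="extrapolate")
>>> f_clamp = interpolate.interp1d(x, y, bounds_error=False,
...                                fill_value=(y[0], y[-1]))
>>> f_extrap(10.5)   # linearly extrapolated beyond x[-1] = 9
>>> f_clamp(10.5)    # returns y[-1], the "above" fill value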
Attributes:
fill_value

The fill value.

Methods

__call__(x)

Evaluate the interpolant

__init__(x, y, kind='linear', axis=-1, copy=True, bounds_error=None, fill_value=nan, assume_sorted=False)[source]

Initialize a 1-D linear interpolation class.

__module__ = 'scipy.interpolate._interpolate'
__weakref__

list of weak references to the object (if defined)

_call_linear(x_new)[source]
_call_linear_np(x_new)[source]
_call_nan_spline(x_new)[source]
_call_nearest(x_new)[source]

Find nearest neighbor interpolated y_new = f(x_new).

_call_previousnext(x_new)[source]

Use previous/next neighbor of x_new, y_new = f(x_new).

_call_spline(x_new)[source]
_check_and_update_bounds_error_for_extrapolation()[source]
_check_bounds(x_new)[source]

Check the inputs for being in the bounds of the interpolated data.

Parameters:
x_newarray
Returns:
out_of_boundsbool array

The mask on x_new of values that are out of the bounds.

_evaluate(x_new)[source]

Actually evaluate the value of the interpolator.

_y_axis
_y_extra_shape
dtype
property fill_value

The fill value.

tidal_hour_signal(ts, filter=True)[source]

Compute the tidal hour of a semidiurnal signal.

This function returns the instantaneous phase-based tidal hour for a time series, assuming a semidiurnal signal. Optionally applies a cosine Lanczos low-pass filter (e.g., 40h) to isolate tidal components from subtidal or noisy fluctuations.

The tidal hour is computed using the phase of the analytic signal obtained via the Hilbert transform. This phase is then scaled to range from 0 to 12 hours to represent one semidiurnal tidal cycle. The output is a pandas Series aligned with the input time index.

Parameters:
tspandas.Series

Time series of water level or other semidiurnal signal. Must have a datetime index.

filterbool, default True

Whether to apply a 40-hour cosine Lanczos filter to the input signal. If False, uses the raw signal.

Returns:
pandas.Series

Tidal hour as a float (range [0, 12)), indexed by datetime.

See also

diff_h

Compute the derivative (rate of change) of tidal hour.

cosine_lanczos

External function used to apply low-pass filtering.

Notes

The tidal hour is derived from the instantaneous phase of the analytic signal. This signal is computed as:

analytic_signal = ts + 1j * hilbert(ts)

The phase (angle) of this complex signal varies smoothly over time and reflects the oscillatory nature of the tide, allowing us to construct a continuous representation of “tidal time” even between extrema.

The use of the Hilbert transform provides a smooth interpolation of the signal’s phase progression, since it yields the narrow-band envelope and instantaneous phase of the dominant frequency component (assumed to be semidiurnal here).
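Note that in the formula above, hilbert denotes the Hilbert transform operator; scipy.signal.hilbert already returns the full analytic signal x + i·H[x]. A minimal sketch of the computation (the helper name is hypothetical; only the phase-scaling idea is from the source), assuming a regularly sampled semidiurnal series:

import numpy as np
import pandas as pd
from scipy.signal import hilbert

def tidal_hour_sketch(ts: pd.Series) -> pd.Series:
    """Tidal hour in [0, 12) from the analytic-signal phase (sketch)."""
    demeaned = ts - ts.mean()                # corresponds to removing the mean
    analytic = hilbert(demeaned.to_numpy())  # scipy returns x + 1j*H[x] directly
    phase = np.unwrap(np.angle(analytic))    # continuous instantaneous phase
    # One full 2*pi cycle of the semidiurnal signal maps to 12 hours.
    hours = (phase * 12.0 / (2.0 * np.pi)) % 12.0
    return pd.Series(hours, index=ts.index, name="tidal_hour")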

tidal_hour_signal2(ts: Series | DataFrame, filter: bool = True) → Series | DataFrame[source]

Calculate the tidal hour from a semidiurnal tidal signal.

Parameters:
tspd.Series or pd.DataFrame

Input time series of water levels (must have datetime index)

filterbool, optional

If True, apply a Lanczos filter to remove low-frequency components (default True). Note that this is the opposite sense of ‘leave_mean’ in the original implementation.

Returns:
pd.Series or pd.DataFrame

Tidal hour, indexed by datetime (same shape as the input)

Notes

The tidal hour represents the phase of the semidiurnal tide in temporal units. The calculation uses complex interpolation for smooth phase estimation:

  1. The Hilbert transform creates an analytic signal.

  2. The angle gives the instantaneous phase.

  3. Complex interpolation avoids phase jumps at the 0/2π boundary (see the sketch after these notes).

  4. This provides continuous phase evolution even during slack tides.

If an h/u distinction is needed, consider applying diff_h to separate flood/ebb phases. The derivative was likely included in the original code to identify phase reversals during tidal current analysis.
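A minimal sketch of the complex (phasor) interpolation idea from the notes above (the helper name and NumPy-based approach are assumptions, not the library’s implementation):

import numpy as np

def interp_phase(t_new, t, phase):
    """Interpolate a wrapped phase by interpolating its unit phasor (sketch)."""
    phasor = np.exp(1j * phase)                   # map angles onto the unit circle
    re = np.interp(t_new, t, phasor.real)         # interpolate the real part
    im = np.interp(t_new, t, phasor.imag)         # interpolate the imaginary part
    # Recover an angle; the phasor passes smoothly through the 0/2*pi seam.
    return np.angle(re + 1j * im) % (2.0 * np.pi)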

vtools.functions.transition module

class PchipInterpolator(x, y, axis=0, extrapolate=None)[source]

Bases: CubicHermiteSpline

PCHIP 1-D monotonic cubic interpolation.

x and y are arrays of values used to approximate some function f, with y = f(x). The interpolant uses monotonic cubic splines to find the value of new points. (PCHIP stands for Piecewise Cubic Hermite Interpolating Polynomial).

Parameters:
xndarray, shape (npoints, )

A 1-D array of monotonically increasing real values. x cannot include duplicate values (otherwise f is overspecified)

yndarray, shape (…, npoints, …)

A N-D array of real values. y’s length along the interpolation axis must be equal to the length of x. Use the axis parameter to select the interpolation axis.

axisint, optional

Axis in the y array corresponding to the x-coordinate values. Defaults to axis=0.

extrapolatebool, optional

Whether to extrapolate to out-of-bounds points based on first and last intervals, or to return NaNs.

See also

CubicHermiteSpline

Piecewise-cubic interpolator.

Akima1DInterpolator

Akima 1D interpolator.

CubicSpline

Cubic spline data interpolator.

PPoly

Piecewise polynomial in terms of coefficients and breakpoints.

Notes

The interpolator preserves monotonicity in the interpolation data and does not overshoot if the data is not smooth.

The first derivatives are guaranteed to be continuous, but the second derivatives may jump at \(x_k\).

Determines the derivatives at the points \(x_k\), \(f'_k\), by using PCHIP algorithm [1].

Let \(h_k = x_{k+1} - x_k\), and let \(d_k = (y_{k+1} - y_k) / h_k\) be the slopes at internal points \(x_k\). If the signs of \(d_k\) and \(d_{k-1}\) are different or either of them equals zero, then \(f'_k = 0\). Otherwise, it is given by the weighted harmonic mean

\[\frac{w_1 + w_2}{f'_k} = \frac{w_1}{d_{k-1}} + \frac{w_2}{d_k}\]

where \(w_1 = 2 h_k + h_{k-1}\) and \(w_2 = h_k + 2 h_{k-1}\).

The end slopes are set using a one-sided scheme [2].
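A short demonstration (not from the source) of the no-overshoot property described above, comparing PCHIP against an unconstrained cubic spline:

import numpy as np
from scipy.interpolate import PchipInterpolator, CubicSpline

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 1.0])   # monotone data with flat runs
xq = np.linspace(0.0, 4.0, 201)

p = PchipInterpolator(x, y)(xq)           # monotone, stays within [0, 1]
s = CubicSpline(x, y)(xq)                 # unconstrained spline may ring

print(p.min() >= 0.0, p.max() <= 1.0)     # True True
print(s.max() > 1.0)                      # typically True (overshoot)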

References

[1]

F. N. Fritsch and J. Butland, A method for constructing local monotone piecewise cubic interpolants, SIAM J. Sci. Comput., 5(2), 300-304 (1984). :doi:`10.1137/0905021`.

[2]

see, e.g., C. Moler, Numerical Computing with Matlab, 2004. :doi:`10.1137/1.9780898717952`

Attributes:
axis
c
extrapolate
x

Methods

__call__(x[, nu, extrapolate])

Evaluate the piecewise polynomial or its derivative.

derivative([nu])

Construct a new piecewise polynomial representing the derivative.

antiderivative([nu])

Construct a new piecewise polynomial representing the antiderivative.

roots([discontinuity, extrapolate])

Find real roots of the piecewise polynomial.

__annotations__ = {}
__init__(x, y, axis=0, extrapolate=None)[source]
__module__ = 'scipy.interpolate._cubic'
static _edge_case(h0, h1, m0, m1)[source]
static _find_derivatives(x, y)[source]
axis
c
extrapolate
x
transition_ts(ts0, ts1, method='linear', create_gap=None, overlap=(0, 0), return_type='series')[source]

Create a smooth transition between two aligned time series.

Parameters:
ts0pandas.Series or pandas.DataFrame

The initial time series segment. Must share the same frequency and type as ts1.

ts1pandas.Series or pandas.DataFrame

The final time series segment. Must share the same frequency and type as ts0.

method{“linear”, “pchip”}, default=”linear”

The interpolation method to use for generating the transition.

create_gaplist of str or pandas.Timestamp, optional

A list of two elements [start, end] defining the gap over which to transition. If not specified, a natural gap between ts0 and ts1 will be inferred. If ts0 and ts1 overlap, create_gap is required.

overlaptuple of int or str, default=(0, 0)

Amount of overlap to use for interpolation anchoring in pchip mode. Each entry can be:
  • An integer: the number of data points before/after to use.

  • A pandas-compatible frequency string, e.g., “2h” or “45min”.

return_type{“series”, “glue”}, default=”series”
  • “series”: returns the full merged series including ts0, transition, ts1.

  • “glue”: returns only the interpolated transition segment.

Returns:
pandas.Series or pandas.DataFrame

The resulting time series segment, either the full merged series or just the transition zone.

Raises:
ValueError

If ts0 and ts1 have mismatched types or frequencies, or if overlap exists but create_gap is not specified.
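A hypothetical usage sketch (the import path and sample data are assumptions; the keyword arguments follow the signature above):

import numpy as np
import pandas as pd
from vtools.functions.transition import transition_ts  # import path assumed

# Two hourly segments separated by a two-day gap.
ts0 = pd.Series(np.sin(0.5 * np.arange(48)),
                index=pd.date_range("2020-01-01", periods=48, freq="h"))
ts1 = pd.Series(1.0 + np.sin(0.5 * np.arange(48)),
                index=pd.date_range("2020-01-05", periods=48, freq="h"))

# Bridge the gap with a monotone pchip transition, anchoring the interpolant
# on 6 hours of data on each side of the gap.
full = transition_ts(ts0, ts1, method="pchip", overlap=("6h", "6h"))

# Or request only the interpolated transition segment:
glue = transition_ts(ts0, ts1, method="pchip", overlap=("6h", "6h"),
                     return_type="glue")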

vtools.functions.unit_conversions module

celsius_to_fahrenheit(x)[source]

Convert Celsius to Fahrenheit

cfs_to_cms(x)[source]

Convert cfs to cms

cms_to_cfs(x)[source]

Convert cms to cfs
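The underlying conversion factor (a sketch of the arithmetic, not necessarily the library’s exact code): 1 ft = 0.3048 m exactly, so 1 cfs = 0.3048³ ≈ 0.0283168 m³/s.

CFS_TO_CMS = 0.3048 ** 3          # exact: 0.028316846592 m^3/s per ft^3/s

def cfs_to_cms_sketch(x):
    """Cubic feet per second -> cubic meters per second."""
    return x * CFS_TO_CMS

def cms_to_cfs_sketch(x):
    """Cubic meters per second -> cubic feet per second."""
    return x / CFS_TO_CMS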

ec_psu_25c(ts_ec, hill_correction=True)[source]

Convert scalar or vector ec to psu assuming 25C temperature and no low-salinity correction. Note: the input argument, ts_ec, will be altered.

fahrenheit_to_celsius(x)[source]

Convert Fahrenheit to Celsius
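The standard formulas behind these two temperature converters (a sketch; the library’s exact code is not shown in the source):

def celsius_to_fahrenheit_sketch(c):
    """Celsius -> Fahrenheit using F = C * 9/5 + 32."""
    return c * 9.0 / 5.0 + 32.0

def fahrenheit_to_celsius_sketch(f):
    """Fahrenheit -> Celsius using C = (F - 32) * 5/9."""
    return (f - 32.0) * 5.0 / 9.0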

ft_to_m(x)[source]

Convert feet to meters

m_to_ft(x)[source]

Convert meters to feet. WARNING: the function modifies the input argument if it is a TimeSeries.

psu_ec_25c(psu, refine=True, hill_correction=True)[source]

Conversion from psu to ec. If used without refinement, this is the Schemel conversion; otherwise root-finding is used. Note that the round trip ec-psu-ec tends to wander considerably without refine=True. The refinement usually takes 4-6 iterations.

psu_ec_25c_scalar(psu, refine=True, hill_correction=True)[source]

Conversion from psu to ec for a scalar. If used without refinement, this is the Schemel conversion; otherwise root-finding is used. Note that the round trip ec-psu-ec tends to wander considerably without refine=True. The refinement usually takes 4-6 iterations.
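A hypothetical round-trip check illustrating the note above (import path and units are assumptions):

from vtools.functions.unit_conversions import ec_psu_25c, psu_ec_25c  # path assumed

ec = 1000.0                 # specific conductance (assumed microS/cm)
psu = ec_psu_25c(ec)        # ec -> psu at 25C
ec_back = psu_ec_25c(psu)   # psu -> ec; refine=True (default) tightens the round trip
print(abs(ec_back - ec))    # small with refinement; may wander without it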

psu_ec_resid(x, psu, hill_correction)[source]

Module contents