vtools.functions package¶
Submodules¶
vtools.functions.climatology module¶
- apply_climatology(climate, index)[source]¶
Apply daily or monthly climatology to a new index
- Parameters:
- climate: DataFrame with integer index representing month of year (Jan=1) or day of year. Must be of size 12 365,366. Day 366 will be inferred from day 365 value
- index: DatetimeIndex representing locations to be inferred
- Returns:
- DataFrame or Series as given by climate with values extracted from climatology for the month or day
- climatology(ts, freq, nsmooth=None)[source]¶
“ Create a climatology on the columns of ts
- Parameters:
- ts: DataFrame or Series
- DataStructure to be analyzed. Must have a length of at least 2*freq
- freq: period [“day”,”month”]
- Period over which the climatology is analyzed
- nsmooth: int
window size (number of values) of pre-smoothing. This may not make sense for series that are not approximately regular. An odd number is usually best.
- Returns:
out: DataFrame or Series Data structure of the same type as ts, with Integer index representing month (Jan=1) or day of year (1:365).
- climatology_quantiles(ts, min_day_year, max_day_year, window_width, quantiles=[0.05, 0.25, 0.5, 0.75, 0.95])[source]¶
“ Create windowed quantiles across years on a time series
- Parameters:
- ts: DataFrame or Series
- DataStructure to be analyzed.
- min_day_year: int
- Minimum Julian day to be considered
- freq: period [“day”,”month”]
- Maximum Julian day to be considered
- window_width: int
- Number of days to include, including the central day and days on each side. So for instance window_width=15 would span the central date and 7 days on each side
- quantiles: array-like
quantiles requested
- Returns:
out: DataFrame or Series Data structure with Julian day as the index and quantiles as columns.
vtools.functions.envelope module¶
- class PchipInterpolator(x, y, axis=0, extrapolate=None)[source]¶
Bases:
CubicHermiteSpline
PCHIP 1-D monotonic cubic interpolation.
x
andy
are arrays of values used to approximate some function f, withy = f(x)
. The interpolant uses monotonic cubic splines to find the value of new points. (PCHIP stands for Piecewise Cubic Hermite Interpolating Polynomial).- Parameters:
- xndarray, shape (npoints, )
A 1-D array of monotonically increasing real values.
x
cannot include duplicate values (otherwise f is overspecified)- yndarray, shape (…, npoints, …)
A N-D array of real values.
y
’s length along the interpolation axis must be equal to the length ofx
. Use theaxis
parameter to select the interpolation axis.- axisint, optional
Axis in the
y
array corresponding to the x-coordinate values. Defaults toaxis=0
.- extrapolatebool, optional
Whether to extrapolate to out-of-bounds points based on first and last intervals, or to return NaNs.
See also
CubicHermiteSpline
Piecewise-cubic interpolator.
Akima1DInterpolator
Akima 1D interpolator.
CubicSpline
Cubic spline data interpolator.
PPoly
Piecewise polynomial in terms of coefficients and breakpoints.
Notes
The interpolator preserves monotonicity in the interpolation data and does not overshoot if the data is not smooth.
The first derivatives are guaranteed to be continuous, but the second derivatives may jump at \(x_k\).
Determines the derivatives at the points \(x_k\), \(f'_k\), by using PCHIP algorithm [1].
Let \(h_k = x_{k+1} - x_k\), and \(d_k = (y_{k+1} - y_k) / h_k\) are the slopes at internal points \(x_k\). If the signs of \(d_k\) and \(d_{k-1}\) are different or either of them equals zero, then \(f'_k = 0\). Otherwise, it is given by the weighted harmonic mean
\[\frac{w_1 + w_2}{f'_k} = \frac{w_1}{d_{k-1}} + \frac{w_2}{d_k}\]where \(w_1 = 2 h_k + h_{k-1}\) and \(w_2 = h_k + 2 h_{k-1}\).
The end slopes are set using a one-sided scheme [2].
References
[1]F. N. Fritsch and J. Butland, A method for constructing local monotone piecewise cubic interpolants, SIAM J. Sci. Comput., 5(2), 300-304 (1984). :doi:`10.1137/0905021`.
[2]see, e.g., C. Moler, Numerical Computing with Matlab, 2004. :doi:`10.1137/1.9780898717952`
- Attributes:
- axis
- c
- extrapolate
- x
Methods
__call__
(x[, nu, extrapolate])Evaluate the piecewise polynomial or its derivative.
derivative
([nu])Construct a new piecewise polynomial representing the derivative.
antiderivative
([nu])Construct a new piecewise polynomial representing the antiderivative.
roots
([discontinuity, extrapolate])Find real roots of the piecewise polynomial.
- __annotations__ = {}¶
- __doc__ = "PCHIP 1-D monotonic cubic interpolation.\n\n ``x`` and ``y`` are arrays of values used to approximate some function f,\n with ``y = f(x)``. The interpolant uses monotonic cubic splines\n to find the value of new points. (PCHIP stands for Piecewise Cubic\n Hermite Interpolating Polynomial).\n\n Parameters\n ----------\n x : ndarray, shape (npoints, )\n A 1-D array of monotonically increasing real values. ``x`` cannot\n include duplicate values (otherwise f is overspecified)\n y : ndarray, shape (..., npoints, ...)\n A N-D array of real values. ``y``'s length along the interpolation\n axis must be equal to the length of ``x``. Use the ``axis``\n parameter to select the interpolation axis.\n axis : int, optional\n Axis in the ``y`` array corresponding to the x-coordinate values. Defaults\n to ``axis=0``.\n extrapolate : bool, optional\n Whether to extrapolate to out-of-bounds points based on first\n and last intervals, or to return NaNs.\n\n Methods\n -------\n __call__\n derivative\n antiderivative\n roots\n\n See Also\n --------\n CubicHermiteSpline : Piecewise-cubic interpolator.\n Akima1DInterpolator : Akima 1D interpolator.\n CubicSpline : Cubic spline data interpolator.\n PPoly : Piecewise polynomial in terms of coefficients and breakpoints.\n\n Notes\n -----\n The interpolator preserves monotonicity in the interpolation data and does\n not overshoot if the data is not smooth.\n\n The first derivatives are guaranteed to be continuous, but the second\n derivatives may jump at :math:`x_k`.\n\n Determines the derivatives at the points :math:`x_k`, :math:`f'_k`,\n by using PCHIP algorithm [1]_.\n\n Let :math:`h_k = x_{k+1} - x_k`, and :math:`d_k = (y_{k+1} - y_k) / h_k`\n are the slopes at internal points :math:`x_k`.\n If the signs of :math:`d_k` and :math:`d_{k-1}` are different or either of\n them equals zero, then :math:`f'_k = 0`. Otherwise, it is given by the\n weighted harmonic mean\n\n .. math::\n\n \\frac{w_1 + w_2}{f'_k} = \\frac{w_1}{d_{k-1}} + \\frac{w_2}{d_k}\n\n where :math:`w_1 = 2 h_k + h_{k-1}` and :math:`w_2 = h_k + 2 h_{k-1}`.\n\n The end slopes are set using a one-sided scheme [2]_.\n\n\n References\n ----------\n .. [1] F. N. Fritsch and J. Butland,\n A method for constructing local\n monotone piecewise cubic interpolants,\n SIAM J. Sci. Comput., 5(2), 300-304 (1984).\n :doi:`10.1137/0905021`.\n .. [2] see, e.g., C. Moler, Numerical Computing with Matlab, 2004.\n :doi:`10.1137/1.9780898717952`\n\n "¶
- __module__ = 'scipy.interpolate._cubic'¶
- axis¶
- c¶
- extrapolate¶
- x¶
- chunked_loess_smoothing(ts, window_hours=1.25, chunk_days=10, overlap_days=1)[source]¶
Apply LOESS smoothing in overlapping chunks to reduce computation time.
- Parameters:
- tspd.Series
Time series with datetime index and possible NaNs.
- window_hoursfloat
LOESS smoothing window size in hours.
- chunk_daysint
Core chunk size (e.g., 10 days).
- overlap_daysint
Overlap added before and after each chunk to avoid edge effects.
- Returns:
- pd.Series
Smoothed series, NaNs where input is NaN or unsupported.
- filter_extrema_ngood(extrema_df, smoothed, series, loess_window_pts=25, n_good=3, sig_gap_minutes=45)[source]¶
Filter extrema based on local and contextual data quality criteria.
- Parameters:
- extrema_dfpd.DataFrame
DataFrame with columns ‘time’ and ‘value’ for candidate extrema.
- smoothedpd.Series
Smoothed version of the signal used for extrema detection.
- seriespd.Series
Original time series (with gaps).
- loess_window_ptsint
Number of points in the LOESS window.
- n_goodint
Minimum number of non-NaN points required.
- sig_gap_minutesfloat
Threshold for detecting significant gaps (in minutes).
- Returns:
- pd.DataFrame
Filtered extrema DataFrame.
- find_peaks(x, height=None, threshold=None, distance=None, prominence=None, width=None, wlen=None, rel_height=0.5, plateau_size=None)[source]¶
Find peaks inside a signal based on peak properties.
This function takes a 1-D array and finds all local maxima by simple comparison of neighboring values. Optionally, a subset of these peaks can be selected by specifying conditions for a peak’s properties.
- Parameters:
- xsequence
A signal with peaks.
- heightnumber or ndarray or sequence, optional
Required height of peaks. Either a number,
None
, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required height.- thresholdnumber or ndarray or sequence, optional
Required threshold of peaks, the vertical distance to its neighboring samples. Either a number,
None
, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required threshold.- distancenumber, optional
Required minimal horizontal distance (>= 1) in samples between neighbouring peaks. Smaller peaks are removed first until the condition is fulfilled for all remaining peaks.
- prominencenumber or ndarray or sequence, optional
Required prominence of peaks. Either a number,
None
, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required prominence.- widthnumber or ndarray or sequence, optional
Required width of peaks in samples. Either a number,
None
, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied, as the maximal required width.- wlenint, optional
Used for calculation of the peaks prominences, thus it is only used if one of the arguments prominence or width is given. See argument wlen in peak_prominences for a full description of its effects.
- rel_heightfloat, optional
Used for calculation of the peaks width, thus it is only used if width is given. See argument rel_height in peak_widths for a full description of its effects.
- plateau_sizenumber or ndarray or sequence, optional
Required size of the flat top of peaks in samples. Either a number,
None
, an array matching x or a 2-element sequence of the former. The first element is always interpreted as the minimal and the second, if supplied as the maximal required plateau size.New in version 1.2.0.
- Returns:
- peaksndarray
Indices of peaks in x that satisfy all given conditions.
- propertiesdict
A dictionary containing properties of the returned peaks which were calculated as intermediate results during evaluation of the specified conditions:
- ‘peak_heights’
If height is given, the height of each peak in x.
- ‘left_thresholds’, ‘right_thresholds’
If threshold is given, these keys contain a peaks vertical distance to its neighbouring samples.
- ‘prominences’, ‘right_bases’, ‘left_bases’
If prominence is given, these keys are accessible. See peak_prominences for a description of their content.
- ‘width_heights’, ‘left_ips’, ‘right_ips’
If width is given, these keys are accessible. See peak_widths for a description of their content.
- ‘plateau_sizes’, left_edges’, ‘right_edges’
If plateau_size is given, these keys are accessible and contain the indices of a peak’s edges (edges are still part of the plateau) and the calculated plateau sizes.
New in version 1.2.0.
To calculate and return properties without excluding peaks, provide the open interval
(None, None)
as a value to the appropriate argument (excluding distance).
- Warns:
- PeakPropertyWarning
Raised if a peak’s properties have unexpected values (see peak_prominences and peak_widths).
Warning
This function may return unexpected results for data containing NaNs. To avoid this, NaNs should either be removed or replaced.
See also
find_peaks_cwt
Find peaks using the wavelet transformation.
peak_prominences
Directly calculate the prominence of peaks.
peak_widths
Directly calculate the width of peaks.
Notes
In the context of this function, a peak or local maximum is defined as any sample whose two direct neighbours have a smaller amplitude. For flat peaks (more than one sample of equal amplitude wide) the index of the middle sample is returned (rounded down in case the number of samples is even). For noisy signals the peak locations can be off because the noise might change the position of local maxima. In those cases consider smoothing the signal before searching for peaks or use other peak finding and fitting methods (like find_peaks_cwt).
Some additional comments on specifying conditions:
Almost all conditions (excluding distance) can be given as half-open or closed intervals, e.g.,
1
or(1, None)
defines the half-open interval \([1, \infty]\) while(None, 1)
defines the interval \([-\infty, 1]\). The open interval(None, None)
can be specified as well, which returns the matching properties without exclusion of peaks.The border is always included in the interval used to select valid peaks.
For several conditions the interval borders can be specified with arrays matching x in shape which enables dynamic constrains based on the sample position.
The conditions are evaluated in the following order: plateau_size, height, threshold, distance, prominence, width. In most cases this order is the fastest one because faster operations are applied first to reduce the number of peaks that need to be evaluated later.
While indices in peaks are guaranteed to be at least distance samples apart, edges of flat peaks may be closer than the allowed distance.
Use wlen to reduce the time it takes to evaluate the conditions for prominence or width if x is large or has many local maxima (see peak_prominences).
New in version 1.1.0.
Examples
To demonstrate this function’s usage we use a signal x supplied with SciPy (see scipy.datasets.electrocardiogram). Let’s find all peaks (local maxima) in x whose amplitude lies above 0.
>>> import numpy as np >>> import matplotlib.pyplot as plt >>> from scipy.datasets import electrocardiogram >>> from scipy.signal import find_peaks >>> x = electrocardiogram()[2000:4000] >>> peaks, _ = find_peaks(x, height=0) >>> plt.plot(x) >>> plt.plot(peaks, x[peaks], "x") >>> plt.plot(np.zeros_like(x), "--", color="gray") >>> plt.show()
We can select peaks below 0 with
height=(None, 0)
or use arrays matching x in size to reflect a changing condition for different parts of the signal.>>> border = np.sin(np.linspace(0, 3 * np.pi, x.size)) >>> peaks, _ = find_peaks(x, height=(-border, border)) >>> plt.plot(x) >>> plt.plot(-border, "--", color="gray") >>> plt.plot(border, ":", color="gray") >>> plt.plot(peaks, x[peaks], "x") >>> plt.show()
Another useful condition for periodic signals can be given with the distance argument. In this case, we can easily select the positions of QRS complexes within the electrocardiogram (ECG) by demanding a distance of at least 150 samples.
>>> peaks, _ = find_peaks(x, distance=150) >>> np.diff(peaks) array([186, 180, 177, 171, 177, 169, 167, 164, 158, 162, 172]) >>> plt.plot(x) >>> plt.plot(peaks, x[peaks], "x") >>> plt.show()
Especially for noisy signals peaks can be easily grouped by their prominence (see peak_prominences). E.g., we can select all peaks except for the mentioned QRS complexes by limiting the allowed prominence to 0.6.
>>> peaks, properties = find_peaks(x, prominence=(None, 0.6)) >>> properties["prominences"].max() 0.5049999999999999 >>> plt.plot(x) >>> plt.plot(peaks, x[peaks], "x") >>> plt.show()
And, finally, let’s examine a different section of the ECG which contains beat forms of different shape. To select only the atypical heart beats, we combine two conditions: a minimal prominence of 1 and width of at least 20 samples.
>>> x = electrocardiogram()[17000:18000] >>> peaks, properties = find_peaks(x, prominence=1, width=20) >>> properties["prominences"], properties["widths"] (array([1.495, 2.3 ]), array([36.93773946, 39.32723577])) >>> plt.plot(x) >>> plt.plot(peaks, x[peaks], "x") >>> plt.vlines(x=peaks, ymin=x[peaks] - properties["prominences"], ... ymax = x[peaks], color = "C1") >>> plt.hlines(y=properties["width_heights"], xmin=properties["left_ips"], ... xmax=properties["right_ips"], color = "C1") >>> plt.show()
- find_raw_extrema(smoothed, prominence=0.01)[source]¶
Find raw peaks and troughs using scipy.signal.find_peaks. Returns DataFrames for peaks and troughs.
- generate_pink_noise(n, seed=None, scale=1.0)[source]¶
Generate pink (1/f) noise using the Voss-McCartney algorithm.
- Parameters:
- nint
Number of samples to generate.
- seedint or None
Random seed for reproducibility.
- scalefloat
Standard deviation scaling factor for the noise.
- Returns:
- np.ndarray
Pink noise signal of length n.
- generate_simplified_mixed_tide(start_time='2022-01-01', ndays=40, freq='15min', A_M2=1.0, A_K1=0.5, A_O1=0.5, phase_D1=1.570795, noise_amplitude=0.08, return_components=False)[source]¶
Generate a simplified synthetic mixed semidiurnal/diurnal tide with explicit O1 and K1.
- Parameters:
- start_timestr
Start time for the series.
- ndaysint
Number of days.
- freqstr
Sampling interval.
- A_M2float
Amplitude of M2.
- A_K1float
Amplitude of K1.
- A_O1float
Amplitude of O1.
- phase_D1float
Common phase shift for O1 and K1.
- return_componentsbool
Whether to return individual components.
- Returns:
- pd.Series or pd.DataFrame
Combined tide or components with time index.
- interpolate_envelope(anchor_df, series, max_anchor_gap_hours=36)[source]¶
Interpolate envelope using PCHIP, breaking if anchor points are too far apart.
- lowess(endog, exog, frac=0.6666666666666666, it=3, delta=0.0, xvals=None, is_sorted=False, missing='drop', return_sorted=True)[source]¶
LOWESS (Locally Weighted Scatterplot Smoothing)
A lowess function that outs smoothed estimates of endog at the given exog values from points (exog, endog)
- Parameters:
- endog1-D numpy array
The y-values of the observed points
- exog1-D numpy array
The x-values of the observed points
- fracfloat
Between 0 and 1. The fraction of the data used when estimating each y-value.
- itint
The number of residual-based reweightings to perform.
- deltafloat
Distance within which to use linear-interpolation instead of weighted regression.
- xvals: 1-D numpy array
Values of the exogenous variable at which to evaluate the regression. If supplied, cannot use delta.
- is_sortedbool
If False (default), then the data will be sorted by exog before calculating lowess. If True, then it is assumed that the data is already sorted by exog. If xvals is specified, then it too must be sorted if is_sorted is True.
- missingstr
Available options are ‘none’, ‘drop’, and ‘raise’. If ‘none’, no nan checking is done. If ‘drop’, any observations with nans are dropped. If ‘raise’, an error is raised. Default is ‘drop’.
- return_sortedbool
If True (default), then the returned array is sorted by exog and has missing (nan or infinite) observations removed. If False, then the returned array is in the same length and the same sequence of observations as the input array.
- Returns:
- out{ndarray, float}
The returned array is two-dimensional if return_sorted is True, and one dimensional if return_sorted is False. If return_sorted is True, then a numpy array with two columns. The first column contains the sorted x (exog) values and the second column the associated estimated y (endog) values. If return_sorted is False, then only the fitted values are returned, and the observations will be in the same order as the input arrays. If xvals is provided, then return_sorted is ignored and the returned array is always one dimensional, containing the y values fitted at the x values provided by xvals.
Notes
This lowess function implements the algorithm given in the reference below using local linear estimates.
Suppose the input data has N points. The algorithm works by estimating the smooth y_i by taking the frac*N closest points to (x_i,y_i) based on their x values and estimating y_i using a weighted linear regression. The weight for (x_j,y_j) is tricube function applied to abs(x_i-x_j).
If it > 1, then further weighted local linear regressions are performed, where the weights are the same as above times the _lowess_bisquare function of the residuals. Each iteration takes approximately the same amount of time as the original fit, so these iterations are expensive. They are most useful when the noise has extremely heavy tails, such as Cauchy noise. Noise with less heavy-tails, such as t-distributions with df>2, are less problematic. The weights downgrade the influence of points with large residuals. In the extreme case, points whose residuals are larger than 6 times the median absolute residual are given weight 0.
delta can be used to save computations. For each x_i, regressions are skipped for points closer than delta. The next regression is fit for the farthest point within delta of x_i and all points in between are estimated by linearly interpolating between the two regression fits.
Judicious choice of delta can cut computation time considerably for large data (N > 5000). A good choice is
delta = 0.01 * range(exog)
.If xvals is provided, the regression is then computed at those points and the fit values are returned. Otherwise, the regression is run at points of exog.
Some experimentation is likely required to find a good choice of frac and iter for a particular dataset.
References
Cleveland, W.S. (1979) “Robust Locally Weighted Regression and Smoothing Scatterplots”. Journal of the American Statistical Association 74 (368): 829-836.
Examples
The below allows a comparison between how different the fits from lowess for different values of frac can be.
>>> import numpy as np >>> import statsmodels.api as sm >>> lowess = sm.nonparametric.lowess >>> x = np.random.uniform(low = -2*np.pi, high = 2*np.pi, size=500) >>> y = np.sin(x) + np.random.normal(size=len(x)) >>> z = lowess(y, x) >>> w = lowess(y, x, frac=1./3)
This gives a similar comparison for when it is 0 vs not.
>>> import numpy as np >>> import scipy.stats as stats >>> import statsmodels.api as sm >>> lowess = sm.nonparametric.lowess >>> x = np.random.uniform(low = -2*np.pi, high = 2*np.pi, size=500) >>> y = np.sin(x) + stats.cauchy.rvs(size=len(x)) >>> z = lowess(y, x, frac= 1./3, it=0) >>> w = lowess(y, x, frac=1./3)
- savgol_filter(x, window_length, polyorder, deriv=0, delta=1.0, axis=-1, mode='interp', cval=0.0)[source]¶
Apply a Savitzky-Golay filter to an array.
This is a 1-D filter. If x has dimension greater than 1, axis determines the axis along which the filter is applied.
- Parameters:
- xarray_like
The data to be filtered. If x is not a single or double precision floating point array, it will be converted to type
numpy.float64
before filtering.- window_lengthint
The length of the filter window (i.e., the number of coefficients). If mode is ‘interp’, window_length must be less than or equal to the size of x.
- polyorderint
The order of the polynomial used to fit the samples. polyorder must be less than window_length.
- derivint, optional
The order of the derivative to compute. This must be a nonnegative integer. The default is 0, which means to filter the data without differentiating.
- deltafloat, optional
The spacing of the samples to which the filter will be applied. This is only used if deriv > 0. Default is 1.0.
- axisint, optional
The axis of the array x along which the filter is to be applied. Default is -1.
- modestr, optional
Must be ‘mirror’, ‘constant’, ‘nearest’, ‘wrap’ or ‘interp’. This determines the type of extension to use for the padded signal to which the filter is applied. When mode is ‘constant’, the padding value is given by cval. See the Notes for more details on ‘mirror’, ‘constant’, ‘wrap’, and ‘nearest’. When the ‘interp’ mode is selected (the default), no extension is used. Instead, a degree polyorder polynomial is fit to the last window_length values of the edges, and this polynomial is used to evaluate the last window_length // 2 output values.
- cvalscalar, optional
Value to fill past the edges of the input if mode is ‘constant’. Default is 0.0.
- Returns:
- yndarray, same shape as x
The filtered data.
See also
savgol_coeffs
Notes
Details on the mode options:
- ‘mirror’:
Repeats the values at the edges in reverse order. The value closest to the edge is not included.
- ‘nearest’:
The extension contains the nearest input value.
- ‘constant’:
The extension contains the value given by the cval argument.
- ‘wrap’:
The extension contains the values from the other end of the array.
For example, if the input is [1, 2, 3, 4, 5, 6, 7, 8], and window_length is 7, the following shows the extended data for the various mode options (assuming cval is 0):
mode | Ext | Input | Ext -----------+---------+------------------------+--------- 'mirror' | 4 3 2 | 1 2 3 4 5 6 7 8 | 7 6 5 'nearest' | 1 1 1 | 1 2 3 4 5 6 7 8 | 8 8 8 'constant' | 0 0 0 | 1 2 3 4 5 6 7 8 | 0 0 0 'wrap' | 6 7 8 | 1 2 3 4 5 6 7 8 | 1 2 3
New in version 0.14.0.
Examples
>>> import numpy as np >>> from scipy.signal import savgol_filter >>> np.set_printoptions(precision=2) # For compact display. >>> x = np.array([2, 2, 5, 2, 1, 0, 1, 4, 9])
Filter with a window length of 5 and a degree 2 polynomial. Use the defaults for all other parameters.
>>> savgol_filter(x, 5, 2) array([1.66, 3.17, 3.54, 2.86, 0.66, 0.17, 1. , 4. , 9. ])
Note that the last five values in x are samples of a parabola, so when mode=’interp’ (the default) is used with polyorder=2, the last three values are unchanged. Compare that to, for example, mode=’nearest’:
>>> savgol_filter(x, 5, 2, mode='nearest') array([1.74, 3.03, 3.54, 2.86, 0.66, 0.17, 1. , 4.6 , 7.97])
- select_salient_extrema(extrema, typ, spacing_hours=14, envelope_type='outer')[source]¶
Select salient extrema (HH/LL or HL/LH) using literal spacing-based OR logic.
- Parameters:
- extremapd.DataFrame with columns [“time”, “value”]
Candidate extrema.
- typstr
Either “high” or “low” (for peak or trough selection).
- spacing_hoursfloat
Time window for neighbor comparison.
- envelope_typestr
Either “outer” (default) or “inner” to switch saliency logic.
- Returns:
- pd.DataFrame
Extrema that passed the saliency test.
- smooth_series2(series, window_pts=25, method='lowess', **kwargs)[source]¶
Smooth a time series using the specified method. Currently supports ‘lowess’, ‘moving_average’, or ‘savgol’.
- tidal_envelope(series, smoothing_window_hours=2.5, n_good=3, peak_prominence=0.05, saliency_window_hours=14, max_anchor_gap_hours=36, envelope_type='outer')[source]¶
Compute the tidal envelope (high and low) of a time series using smoothing, extrema detection, and interpolation. This function processes a time series to extract its tidal envelope by smoothing the data, identifying significant peaks and troughs, filtering out unreliable extrema, selecting salient extrema within a specified window, and interpolating between anchor points to generate continuous envelope curves. Parameters ———- series : pandas.Series
Time-indexed series of water levels or similar data.
- smoothing_window_hoursfloat, optional
Window size in hours for smoothing the input series (default is 2.5).
- n_goodint, optional
Minimum number of good points required for an extremum to be considered valid (default is 3).
- peak_prominencefloat, optional
Minimum prominence of peaks/troughs to be considered as extrema (default is 0.05).
- saliency_window_hoursfloat, optional
Window size in hours for selecting salient extrema (default is 14).
- max_anchor_gap_hoursfloat, optional
Maximum allowed gap in hours between anchor points for interpolation (default is 36).
- envelope_typestr, optional
Type of envelope to compute, e.g., “outer” (default is “outer”).
Returns¶
- env_highpandas.Series
Interpolated high (upper) envelope of the input series.
- env_lowpandas.Series
Interpolated low (lower) envelope of the input series.
- anchor_highspandas.DataFrame
DataFrame of selected anchor points for the high envelope.
- anchor_lowspandas.DataFrame
DataFrame of selected anchor points for the low envelope.
- smoothedpandas.Series
Smoothed version of the input series.
Notes¶
This function assumes regular time intervals in the input series. If the frequency cannot be inferred, it is estimated from the first two timestamps.
vtools.functions.error_detect module¶
- med_outliers(ts, level=4.0, scale=None, filt_len=7, range=(None, None), quantiles=(0.01, 0.99), copy=True, as_anomaly=False)[source]¶
Detect outliers by running a median filter, subtracting it from the original series and comparing the resulting residuals to a global robust range of scale (the interquartile range). Individual time points are rejected if the residual at that time point is more than level times the range of scale.
The original concept comes from Basu & Meckesheimer (2007) Automatic outlier detection for time series: an application to sensor data although they didn’t use the interquartile range but rather expert judgment. To use this function effectively, you need to be thoughtful about what the interquartile range will be. For instance, for a strongly tidal flow station it is likely to
- level: Number of times the scale or interquantile range the data has to be
to be rejected.d
- scale: Expert judgment of the scale of maximum variation over a time step.
If None, the interquartile range will be used. Note that for a strongly tidal station the interquartile range may substantially overestimate the reasonable variation over a single time step, in which case the filter will work fine, but level should be set to a number (less than one) accordingly.
filt_len: length of median filter, default is 5
- quantilestuple of quantiles defining the measure of scale. Ignored
if scale is given directly. Default is interquartile range, and this is almost always a reasonable choice.
copy: if True, a copy is made leaving original series intact
You can also specify rejection of values based on a simple range
Returns: copy of series with outliers replaced by nan
- nrepeat(ts)[source]¶
Return the length of consecutive runs of repeated values
- Parameters:
- ts: DataFrame or series
- Returns:
- Like-indexed series with lengths of runs. Nans will be mapped to 0
- steep_then_nan(ts, level=4.0, scale=None, filt_len=11, range=(None, None), quantiles=(0.01, 0.99), copy=True, as_anomaly=True)[source]¶
Detect outliers by running a median filter, subtracting it from the original series and comparing the resulting residuals to a global robust range of scale (the interquartile range). Individual time points are rejected if the residual at that time point is more than level times the range of scale.
The original concept comes from Basu & Meckesheimer (2007) although they didn’t use the interquartile range but rather expert judgment. To use this function effectively, you need to be thoughtful about what the interquartile range will be. For instance, for a strongly tidal flow station it is likely to
- level: Number of times the scale or interquantile range the data has to be
to be rejected.d
- scale: Expert judgment of the scale of maximum variation over a time step.
If None, the interquartile range will be used. Note that for a strongly tidal station the interquartile range may substantially overestimate the reasonable variation over a single time step, in which case the filter will work fine, but level should be set to a number (less than one) accordingly.
filt_len: length of median filter, default is 5
- quantilestuple of quantiles defining the measure of scale. Ignored
if scale is given directly. Default is interquartile range, and this is almost always a reasonable choice.
copy: if True, a copy is made leaving original series intact
You can also specify rejection of values based on a simple range
Returns: copy of series with outliers replaced by nan
vtools.functions.example2 module¶
vtools.functions.filter module¶
Module contains filter used in tidal time series analysis.
- _lanczos_impl(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True, cosine_taper=False)[source]¶
squared low-pass cosine lanczos filter on a regular time series.
- Parameters:
- ts
DataFrame
- filter_lenint, time_interval
Size of lanczos window, default is to number of samples within filter_period*1.25.
- cutoff_frequency: float,optional
Cutoff frequency expressed as a ratio of a Nyquist frequency, should within the range (0,1). For example, if the sampling frequency is 1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.
- cutoff_periodstring or _time_interval
Period of cutting off frequency. If input as a string, it must be convertible to a _time_interval (Pandas freq). cutoff_frequency and cutoff_period can’t be specified at the same time.
- padtypestr or None, optional
Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.
- padlenint or None, optional
The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtye is not None and padlen is not given, padlen is be set to 6*m.
- fill_edge_nan: bool,optional
If pading is not used and fill_edge_nan is true, resulting data on the both ends are filled with nan to account for edge effect. This is 2*m on the either end of the result. Default is true.
- ts
- Returns:
- result
TimeSeries
A new regular time series with the same interval of ts. If no pading is used the beigning and ending 4*m resulting data will be set to nan to remove edge effect.
- result
- Raises:
- ValueError
If input timeseries is not regular, or, cutoff_period and cutoff_frequency are given at the same time, or, neither cutoff_period nor curoff_frequence is given, or, padtype is not “odd”,”even”,”constant”,or None, or, padlen is larger than data size
- butter(N, Wn, btype='low', analog=False, output='ba', fs=None)[source]¶
Butterworth digital and analog filter design.
Design an Nth-order digital or analog Butterworth filter and return the filter coefficients.
- Parameters:
- Nint
The order of the filter. For ‘bandpass’ and ‘bandstop’ filters, the resulting order of the final second-order sections (‘sos’) matrix is
2*N
, with N the number of biquad sections of the desired system.- Wnarray_like
The critical frequency or frequencies. For lowpass and highpass filters, Wn is a scalar; for bandpass and bandstop filters, Wn is a length-2 sequence.
For a Butterworth filter, this is the point at which the gain drops to 1/sqrt(2) that of the passband (the “-3 dB point”).
For digital filters, if fs is not specified, Wn units are normalized from 0 to 1, where 1 is the Nyquist frequency (Wn is thus in half cycles / sample and defined as 2*critical frequencies / fs). If fs is specified, Wn is in the same units as fs.
For analog filters, Wn is an angular frequency (e.g. rad/s).
- btype{‘lowpass’, ‘highpass’, ‘bandpass’, ‘bandstop’}, optional
The type of filter. Default is ‘lowpass’.
- analogbool, optional
When True, return an analog filter, otherwise a digital filter is returned.
- output{‘ba’, ‘zpk’, ‘sos’}, optional
Type of output: numerator/denominator (‘ba’), pole-zero (‘zpk’), or second-order sections (‘sos’). Default is ‘ba’ for backwards compatibility, but ‘sos’ should be used for general-purpose filtering.
- fsfloat, optional
The sampling frequency of the digital system.
New in version 1.2.0.
- Returns:
- b, andarray, ndarray
Numerator (b) and denominator (a) polynomials of the IIR filter. Only returned if
output='ba'
.- z, p, kndarray, ndarray, float
Zeros, poles, and system gain of the IIR filter transfer function. Only returned if
output='zpk'
.- sosndarray
Second-order sections representation of the IIR filter. Only returned if
output='sos'
.
See also
buttord
,buttap
Notes
The Butterworth filter has maximally flat frequency response in the passband.
The
'sos'
output parameter was added in 0.16.0.If the transfer function form
[b, a]
is requested, numerical problems can occur since the conversion between roots and the polynomial coefficients is a numerically sensitive operation, even for N >= 4. It is recommended to work with the SOS representation.Warning
Designing high-order and narrowband IIR filters in TF form can result in unstable or incorrect filtering due to floating point numerical precision issues. Consider inspecting output filter characteristics freqz or designing the filters with second-order sections via
output='sos'
.Examples
Design an analog filter and plot its frequency response, showing the critical points:
>>> from scipy import signal >>> import matplotlib.pyplot as plt >>> import numpy as np
>>> b, a = signal.butter(4, 100, 'low', analog=True) >>> w, h = signal.freqs(b, a) >>> plt.semilogx(w, 20 * np.log10(abs(h))) >>> plt.title('Butterworth filter frequency response') >>> plt.xlabel('Frequency [radians / second]') >>> plt.ylabel('Amplitude [dB]') >>> plt.margins(0, 0.1) >>> plt.grid(which='both', axis='both') >>> plt.axvline(100, color='green') # cutoff frequency >>> plt.show()
Generate a signal made up of 10 Hz and 20 Hz, sampled at 1 kHz
>>> t = np.linspace(0, 1, 1000, False) # 1 second >>> sig = np.sin(2*np.pi*10*t) + np.sin(2*np.pi*20*t) >>> fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True) >>> ax1.plot(t, sig) >>> ax1.set_title('10 Hz and 20 Hz sinusoids') >>> ax1.axis([0, 1, -2, 2])
Design a digital high-pass filter at 15 Hz to remove the 10 Hz tone, and apply it to the signal. (It’s recommended to use second-order sections format when filtering, to avoid numerical error with transfer function (
ba
) format):>>> sos = signal.butter(10, 15, 'hp', fs=1000, output='sos') >>> filtered = signal.sosfilt(sos, sig) >>> ax2.plot(t, filtered) >>> ax2.set_title('After 15 Hz high-pass filter') >>> ax2.axis([0, 1, -2, 2]) >>> ax2.set_xlabel('Time [seconds]') >>> plt.tight_layout() >>> plt.show()
- butterworth(ts, cutoff_period=None, cutoff_frequency=None, order=4)[source]¶
low-pass butterworth-squared filter on a regular time series.
- Parameters:
- ts
DataFrame
Must be one or two dimensional, and regular.
- order: int ,optional
The default is 4.
- cutoff_frequency: float,optional
Cutoff frequency expressed as a ratio with Nyquist frequency, should within the range (0,1). For a discretely sampled system, the Nyquist frequency is the fastest frequency that can be resolved by that sampling, which is half the sampling frequency. For example, if the sampling frequency is 1 sample/1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.
- cutoff_periodstring or _time_interval
Period corresponding to cutoff frequency. If input as a string, it must be convertible to a regular interval using the same rules as a pandas frequency.. cutoff_frequency and cutoff_period can’t be specified at the same time.
- ts
- Returns:
- result
A new regular time series with the same interval as ts.
- Raises:
- ValueError
If input order is not even, or input timeseries is not regular, or neither cutoff_period and cutoff_frequency is given while input time series interval is not 15min or 1 hour, or cutoff_period and cutoff_frequency are given at the same time.
- cosine_lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
- cosine_lanczos5(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
squared low-pass cosine lanczos filter on a regular time series.
- Parameters:
- ts
DataFrame
- filter_lenint, time_interval
Size of lanczos window, default is to number of samples within filter_period*1.25.
- cutoff_frequency: float,optional
Cutoff frequency expressed as a ratio of a Nyquist frequency, should within the range (0,1). For example, if the sampling frequency is 1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.
- cutoff_periodstring or _time_interval
Period of cutting off frequency. If input as a string, it must be convertible to a _time_interval (Pandas freq). cutoff_frequency and cutoff_period can’t be specified at the same time.
- padtypestr or None, optional
Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.
- padlenint or None, optional
The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtye is not None and padlen is not given, padlen is be set to 6*m.
- fill_edge_nan: bool,optional
If pading is not used and fill_edge_nan is true, resulting data on the both ends are filled with nan to account for edge effect. This is 2*m on the either end of the result. Default is true.
- ts
- Returns:
- result
TimeSeries
A new regular time series with the same interval of ts. If no pading is used the beigning and ending 4*m resulting data will be set to nan to remove edge effect.
- result
- Raises:
- ValueError
If input timeseries is not regular, or, cutoff_period and cutoff_frequency are given at the same time, or, neither cutoff_period nor curoff_frequence is given, or, padtype is not “odd”,”even”,”constant”,or None, or, padlen is larger than data size
- filtfilt(b, a, x, axis=-1, padtype='odd', padlen=None, method='pad', irlen=None)[source]¶
Apply a digital filter forward and backward to a signal.
This function applies a linear digital filter twice, once forward and once backwards. The combined filter has zero phase and a filter order twice that of the original.
The function provides options for handling the edges of the signal.
The function sosfiltfilt (and filter design using
output='sos'
) should be preferred over filtfilt for most filtering tasks, as second-order sections have fewer numerical problems.- Parameters:
- b(N,) array_like
The numerator coefficient vector of the filter.
- a(N,) array_like
The denominator coefficient vector of the filter. If
a[0]
is not 1, then both a and b are normalized bya[0]
.- xarray_like
The array of data to be filtered.
- axisint, optional
The axis of x to which the filter is applied. Default is -1.
- padtypestr or None, optional
Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is ‘odd’.
- padlenint or None, optional
The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than
x.shape[axis] - 1
.padlen=0
implies no padding. The default value is3 * max(len(a), len(b))
.- methodstr, optional
Determines the method for handling the edges of the signal, either “pad” or “gust”. When method is “pad”, the signal is padded; the type of padding is determined by padtype and padlen, and irlen is ignored. When method is “gust”, Gustafsson’s method is used, and padtype and padlen are ignored.
- irlenint or None, optional
When method is “gust”, irlen specifies the length of the impulse response of the filter. If irlen is None, no part of the impulse response is ignored. For a long signal, specifying irlen can significantly improve the performance of the filter.
- Returns:
- yndarray
The filtered output with the same shape as x.
See also
sosfiltfilt
,lfilter_zi
,lfilter
,lfiltic
,savgol_filter
,sosfilt
Notes
When method is “pad”, the function pads the data along the given axis in one of three ways: odd, even or constant. The odd and even extensions have the corresponding symmetry about the end point of the data. The constant extension extends the data with the values at the end points. On both the forward and backward passes, the initial condition of the filter is found by using lfilter_zi and scaling it by the end point of the extended data.
When method is “gust”, Gustafsson’s method [1] is used. Initial conditions are chosen for the forward and backward passes so that the forward-backward filter gives the same result as the backward-forward filter.
The option to use Gustaffson’s method was added in scipy version 0.16.0.
References
[1]F. Gustaffson, “Determining the initial states in forward-backward filtering”, Transactions on Signal Processing, Vol. 46, pp. 988-992, 1996.
Examples
The examples will use several functions from scipy.signal.
>>> import numpy as np >>> from scipy import signal >>> import matplotlib.pyplot as plt
First we create a one second signal that is the sum of two pure sine waves, with frequencies 5 Hz and 250 Hz, sampled at 2000 Hz.
>>> t = np.linspace(0, 1.0, 2001) >>> xlow = np.sin(2 * np.pi * 5 * t) >>> xhigh = np.sin(2 * np.pi * 250 * t) >>> x = xlow + xhigh
Now create a lowpass Butterworth filter with a cutoff of 0.125 times the Nyquist frequency, or 125 Hz, and apply it to
x
with filtfilt. The result should be approximatelyxlow
, with no phase shift.>>> b, a = signal.butter(8, 0.125) >>> y = signal.filtfilt(b, a, x, padlen=150) >>> np.abs(y - xlow).max() 9.1086182074789912e-06
We get a fairly clean result for this artificial example because the odd extension is exact, and with the moderately long padding, the filter’s transients have dissipated by the time the actual data is reached. In general, transient effects at the edges are unavoidable.
The following example demonstrates the option
method="gust"
.First, create a filter.
>>> b, a = signal.ellip(4, 0.01, 120, 0.125) # Filter to be applied.
sig is a random input signal to be filtered.
>>> rng = np.random.default_rng() >>> n = 60 >>> sig = rng.standard_normal(n)**3 + 3*rng.standard_normal(n).cumsum()
Apply filtfilt to sig, once using the Gustafsson method, and once using padding, and plot the results for comparison.
>>> fgust = signal.filtfilt(b, a, sig, method="gust") >>> fpad = signal.filtfilt(b, a, sig, padlen=50) >>> plt.plot(sig, 'k-', label='input') >>> plt.plot(fgust, 'b-', linewidth=4, label='gust') >>> plt.plot(fpad, 'c-', linewidth=1.5, label='pad') >>> plt.legend(loc='best') >>> plt.show()
The irlen argument can be used to improve the performance of Gustafsson’s method.
Estimate the impulse response length of the filter.
>>> z, p, k = signal.tf2zpk(b, a) >>> eps = 1e-9 >>> r = np.max(np.abs(p)) >>> approx_impulse_len = int(np.ceil(np.log(eps) / np.log(r))) >>> approx_impulse_len 137
Apply the filter to a longer signal, with and without the irlen argument. The difference between y1 and y2 is small. For long signals, using irlen gives a significant performance improvement.
>>> x = rng.standard_normal(4000) >>> y1 = signal.filtfilt(b, a, x, method='gust') >>> y2 = signal.filtfilt(b, a, x, method='gust', irlen=approx_impulse_len) >>> print(np.max(np.abs(y1 - y2))) 2.875334415008979e-10
- firwin(numtaps, cutoff, *, width=None, window='hamming', pass_zero=True, scale=True, nyq=<object object>, fs=None)[source]¶
FIR filter design using the window method.
This function computes the coefficients of a finite impulse response filter. The filter will have linear phase; it will be Type I if numtaps is odd and Type II if numtaps is even.
Type II filters always have zero response at the Nyquist frequency, so a ValueError exception is raised if firwin is called with numtaps even and having a passband whose right end is at the Nyquist frequency.
- Parameters:
- numtapsint
Length of the filter (number of coefficients, i.e. the filter order + 1). numtaps must be odd if a passband includes the Nyquist frequency.
- cutofffloat or 1-D array_like
Cutoff frequency of filter (expressed in the same units as fs) OR an array of cutoff frequencies (that is, band edges). In the latter case, the frequencies in cutoff should be positive and monotonically increasing between 0 and fs/2. The values 0 and fs/2 must not be included in cutoff.
- widthfloat or None, optional
If width is not None, then assume it is the approximate width of the transition region (expressed in the same units as fs) for use in Kaiser FIR filter design. In this case, the window argument is ignored.
- windowstring or tuple of string and parameter values, optional
Desired window to use. See scipy.signal.get_window for a list of windows and required parameters.
- pass_zero{True, False, ‘bandpass’, ‘lowpass’, ‘highpass’, ‘bandstop’}, optional
If True, the gain at the frequency 0 (i.e., the “DC gain”) is 1. If False, the DC gain is 0. Can also be a string argument for the desired filter type (equivalent to
btype
in IIR design functions).New in version 1.3.0: Support for string arguments.
- scalebool, optional
Set to True to scale the coefficients so that the frequency response is exactly unity at a certain frequency. That frequency is either:
0 (DC) if the first passband starts at 0 (i.e. pass_zero is True)
fs/2 (the Nyquist frequency) if the first passband ends at fs/2 (i.e the filter is a single band highpass filter); center of first passband otherwise
- nyqfloat, optional, deprecated
This is the Nyquist frequency. Each frequency in cutoff must be between 0 and nyq. Default is 1.
Deprecated since version 1.0.0: firwin keyword argument nyq is deprecated in favour of fs and will be removed in SciPy 1.14.0.
- fsfloat, optional
The sampling frequency of the signal. Each frequency in cutoff must be between 0 and
fs/2
. Default is 2.
- Returns:
- h(numtaps,) ndarray
Coefficients of length numtaps FIR filter.
- Raises:
- ValueError
If any value in cutoff is less than or equal to 0 or greater than or equal to
fs/2
, if the values in cutoff are not strictly monotonically increasing, or if numtaps is even but a passband includes the Nyquist frequency.
See also
firwin2
firls
minimum_phase
remez
Examples
Low-pass from 0 to f:
>>> from scipy import signal >>> numtaps = 3 >>> f = 0.1 >>> signal.firwin(numtaps, f) array([ 0.06799017, 0.86401967, 0.06799017])
Use a specific window function:
>>> signal.firwin(numtaps, f, window='nuttall') array([ 3.56607041e-04, 9.99286786e-01, 3.56607041e-04])
High-pass (‘stop’ from 0 to f):
>>> signal.firwin(numtaps, f, pass_zero=False) array([-0.00859313, 0.98281375, -0.00859313])
Band-pass:
>>> f1, f2 = 0.1, 0.2 >>> signal.firwin(numtaps, [f1, f2], pass_zero=False) array([ 0.06301614, 0.88770441, 0.06301614])
Band-stop:
>>> signal.firwin(numtaps, [f1, f2]) array([-0.00801395, 1.0160279 , -0.00801395])
Multi-band (passbands are [0, f1], [f2, f3] and [f4, 1]):
>>> f3, f4 = 0.3, 0.4 >>> signal.firwin(numtaps, [f1, f2, f3, f4]) array([-0.01376344, 1.02752689, -0.01376344])
Multi-band (passbands are [f1, f2] and [f3,f4]):
>>> signal.firwin(numtaps, [f1, f2, f3, f4], pass_zero=False) array([ 0.04890915, 0.91284326, 0.04890915])
- gaussian_filter1d(input, sigma, axis=-1, order=0, output=None, mode='reflect', cval=0.0, truncate=4.0, *, radius=None)[source]¶
1-D Gaussian filter.
- Parameters:
- inputarray_like
The input array.
- sigmascalar
standard deviation for Gaussian kernel
- axisint, optional
The axis of input along which to calculate. Default is -1.
- orderint, optional
An order of 0 corresponds to convolution with a Gaussian kernel. A positive order corresponds to convolution with that derivative of a Gaussian.
- outputarray or dtype, optional
The array in which to place the output, or the dtype of the returned array. By default an array of the same dtype as input will be created.
- mode{‘reflect’, ‘constant’, ‘nearest’, ‘mirror’, ‘wrap’}, optional
The mode parameter determines how the input array is extended beyond its boundaries. Default is ‘reflect’. Behavior for each valid value is as follows:
- ‘reflect’ (d c b a | a b c d | d c b a)
The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.
- ‘constant’ (k k k k | a b c d | k k k k)
The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.
- ‘nearest’ (a a a a | a b c d | d d d d)
The input is extended by replicating the last pixel.
- ‘mirror’ (d c b | a b c d | c b a)
The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.
- ‘wrap’ (a b c d | a b c d | a b c d)
The input is extended by wrapping around to the opposite edge.
For consistency with the interpolation functions, the following mode names can also be used:
- ‘grid-mirror’
This is a synonym for ‘reflect’.
- ‘grid-constant’
This is a synonym for ‘constant’.
- ‘grid-wrap’
This is a synonym for ‘wrap’.
- cvalscalar, optional
Value to fill past edges of input if mode is ‘constant’. Default is 0.0.
- truncatefloat, optional
Truncate the filter at this many standard deviations. Default is 4.0.
- radiusNone or int, optional
Radius of the Gaussian kernel. If specified, the size of the kernel will be
2*radius + 1
, and truncate is ignored. Default is None.
- Returns:
- gaussian_filter1dndarray
Notes
The Gaussian kernel will have size
2*radius + 1
along each axis. If radius is None, a defaultradius = round(truncate * sigma)
will be used.Examples
>>> from scipy.ndimage import gaussian_filter1d >>> import numpy as np >>> gaussian_filter1d([1.0, 2.0, 3.0, 4.0, 5.0], 1) array([ 1.42704095, 2.06782203, 3. , 3.93217797, 4.57295905]) >>> gaussian_filter1d([1.0, 2.0, 3.0, 4.0, 5.0], 4) array([ 2.91948343, 2.95023502, 3. , 3.04976498, 3.08051657]) >>> import matplotlib.pyplot as plt >>> rng = np.random.default_rng() >>> x = rng.standard_normal(101).cumsum() >>> y3 = gaussian_filter1d(x, 3) >>> y6 = gaussian_filter1d(x, 6) >>> plt.plot(x, 'k', label='original data') >>> plt.plot(y3, '--', label='filtered, sigma=3') >>> plt.plot(y6, ':', label='filtered, sigma=6') >>> plt.legend() >>> plt.grid() >>> plt.show()
- generate_godin_fir(freq)[source]¶
generate godin filter impulse response for given freq freq is a pandas freq
- godin(ts)[source]¶
Low-pass Godin filter a regular time series. Applies the \(\mathcal{A_{24}^{2}A_{25}}\) Godin filter [1] The filter is generalized to be the equivalent of one boxcar of the length of the lunar diurnal (~25 hours) constituent and two of the solar diurnal (~24 hours), though the implementation combines these steps.
- Parameters:
- Returns:
- result
DataFrame
A new regular time series with the same interval of ts.
- result
- Raises:
- NotImplementedError
If input time series is not univariate
References
[1]Godin (1972) Analysis of Tides
- lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
- lfilter(b, a, x, axis=-1, zi=None)[source]¶
Filter data along one-dimension with an IIR or FIR filter.
Filter a data sequence, x, using a digital filter. This works for many fundamental data types (including Object type). The filter is a direct form II transposed implementation of the standard difference equation (see Notes).
The function sosfilt (and filter design using
output='sos'
) should be preferred over lfilter for most filtering tasks, as second-order sections have fewer numerical problems.- Parameters:
- barray_like
The numerator coefficient vector in a 1-D sequence.
- aarray_like
The denominator coefficient vector in a 1-D sequence. If
a[0]
is not 1, then both a and b are normalized bya[0]
.- xarray_like
An N-dimensional input array.
- axisint, optional
The axis of the input data array along which to apply the linear filter. The filter is applied to each subarray along this axis. Default is -1.
- ziarray_like, optional
Initial conditions for the filter delays. It is a vector (or array of vectors for an N-dimensional input) of length
max(len(a), len(b)) - 1
. If zi is None or is not given then initial rest is assumed. See lfiltic for more information.
- Returns:
- yarray
The output of the digital filter.
- zfarray, optional
If zi is None, this is not returned, otherwise, zf holds the final filter delay values.
See also
lfiltic
Construct initial conditions for lfilter.
lfilter_zi
Compute initial state (steady state of step response) for lfilter.
filtfilt
A forward-backward filter, to obtain a filter with zero phase.
savgol_filter
A Savitzky-Golay filter.
sosfilt
Filter data using cascaded second-order sections.
sosfiltfilt
A forward-backward filter using second-order sections.
Notes
The filter function is implemented as a direct II transposed structure. This means that the filter implements:
a[0]*y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[M]*x[n-M] - a[1]*y[n-1] - ... - a[N]*y[n-N]
where M is the degree of the numerator, N is the degree of the denominator, and n is the sample number. It is implemented using the following difference equations (assuming M = N):
a[0]*y[n] = b[0] * x[n] + d[0][n-1] d[0][n] = b[1] * x[n] - a[1] * y[n] + d[1][n-1] d[1][n] = b[2] * x[n] - a[2] * y[n] + d[2][n-1] ... d[N-2][n] = b[N-1]*x[n] - a[N-1]*y[n] + d[N-1][n-1] d[N-1][n] = b[N] * x[n] - a[N] * y[n]
where d are the state variables.
The rational transfer function describing this filter in the z-transform domain is:
-1 -M b[0] + b[1]z + ... + b[M] z Y(z) = -------------------------------- X(z) -1 -N a[0] + a[1]z + ... + a[N] z
Examples
Generate a noisy signal to be filtered:
>>> import numpy as np >>> from scipy import signal >>> import matplotlib.pyplot as plt >>> rng = np.random.default_rng() >>> t = np.linspace(-1, 1, 201) >>> x = (np.sin(2*np.pi*0.75*t*(1-t) + 2.1) + ... 0.1*np.sin(2*np.pi*1.25*t + 1) + ... 0.18*np.cos(2*np.pi*3.85*t)) >>> xn = x + rng.standard_normal(len(t)) * 0.08
Create an order 3 lowpass butterworth filter:
>>> b, a = signal.butter(3, 0.05)
Apply the filter to xn. Use lfilter_zi to choose the initial condition of the filter:
>>> zi = signal.lfilter_zi(b, a) >>> z, _ = signal.lfilter(b, a, xn, zi=zi*xn[0])
Apply the filter again, to have a result filtered at an order the same as filtfilt:
>>> z2, _ = signal.lfilter(b, a, z, zi=zi*z[0])
Use filtfilt to apply the filter:
>>> y = signal.filtfilt(b, a, xn)
Plot the original signal and the various filtered versions:
>>> plt.figure >>> plt.plot(t, xn, 'b', alpha=0.75) >>> plt.plot(t, z, 'r--', t, z2, 'r', t, y, 'k') >>> plt.legend(('noisy signal', 'lfilter, once', 'lfilter, twice', ... 'filtfilt'), loc='best') >>> plt.grid(True) >>> plt.show()
- lowpass_cosine_lanczos_filter_coef(cf, m, normalize=True)[source]¶
return the convolution coefficients for low pass lanczos filter.
- Parameters:
- cf: float
Cutoff frequency expressed as a ratio of a Nyquist frequency.
- m: int
Size of filtering window size.
- Returns:
- results: list
Coefficients of filtering window.
- lowpass_lanczos_filter_coef(cf, m, normalize=True, cosine_taper=False)[source]¶
Return the convolution coefficients for a low-pass Lanczos filter.
- Parameters:
- cffloat
Cutoff frequency expressed as a ratio of the Nyquist frequency.
- mint
Size of the filtering window.
- normalizebool, optional
Whether to normalize the filter coefficients so they sum to 1.
- cosine_taperbool, optional
If True, applies a cosine-squared taper to the Lanczos window.
- Returns:
- resnp.ndarray
Coefficients of the filtering window.
- ts_gaussian_filter(ts, sigma, order=0, mode='reflect', cval=0.0, truncate=4.0)[source]¶
Column-wise Gaussian smoothing of regular time series. Missing/irregular values are not handled, which means this function is not much different from a rolling window gaussian average in pandas which may be preferable in the case of missing data (ts.rolling(window=5,win_type=’gaussian’).mean. This function has been kept around awaiting irreg as an aspiration but yet to be implemented.
vtools.functions.frequency_response module¶
- butterworth(ts, cutoff_period=None, cutoff_frequency=None, order=4)[source]¶
low-pass butterworth-squared filter on a regular time series.
- Parameters:
- ts
DataFrame
Must be one or two dimensional, and regular.
- order: int ,optional
The default is 4.
- cutoff_frequency: float,optional
Cutoff frequency expressed as a ratio with Nyquist frequency, should within the range (0,1). For a discretely sampled system, the Nyquist frequency is the fastest frequency that can be resolved by that sampling, which is half the sampling frequency. For example, if the sampling frequency is 1 sample/1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.
- cutoff_periodstring or _time_interval
Period corresponding to cutoff frequency. If input as a string, it must be convertible to a regular interval using the same rules as a pandas frequency.. cutoff_frequency and cutoff_period can’t be specified at the same time.
- ts
- Returns:
- result
A new regular time series with the same interval as ts.
- Raises:
- ValueError
If input order is not even, or input timeseries is not regular, or neither cutoff_period and cutoff_frequency is given while input time series interval is not 15min or 1 hour, or cutoff_period and cutoff_frequency are given at the same time.
- compare_response(cutoff_period)[source]¶
Generate frequency response plot of low-pass filters: cosine_lanczos, boxcar 24h, boxcar 25h, and godin.
- Parameters:
- cutoff_periodint
Low-pass filter cutoff period in number of hours.
- Returns:
- None.
- cosine_lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
- cosine_lanczos5(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
squared low-pass cosine lanczos filter on a regular time series.
- Parameters:
- ts
DataFrame
- filter_lenint, time_interval
Size of lanczos window, default is to number of samples within filter_period*1.25.
- cutoff_frequency: float,optional
Cutoff frequency expressed as a ratio of a Nyquist frequency, should within the range (0,1). For example, if the sampling frequency is 1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.
- cutoff_periodstring or _time_interval
Period of cutting off frequency. If input as a string, it must be convertible to a _time_interval (Pandas freq). cutoff_frequency and cutoff_period can’t be specified at the same time.
- padtypestr or None, optional
Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.
- padlenint or None, optional
The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtye is not None and padlen is not given, padlen is be set to 6*m.
- fill_edge_nan: bool,optional
If pading is not used and fill_edge_nan is true, resulting data on the both ends are filled with nan to account for edge effect. This is 2*m on the either end of the result. Default is true.
- ts
- Returns:
- result
TimeSeries
A new regular time series with the same interval of ts. If no pading is used the beigning and ending 4*m resulting data will be set to nan to remove edge effect.
- result
- Raises:
- ValueError
If input timeseries is not regular, or, cutoff_period and cutoff_frequency are given at the same time, or, neither cutoff_period nor curoff_frequence is given, or, padtype is not “odd”,”even”,”constant”,or None, or, padlen is larger than data size
- freqz(b, a=1, worN=512, whole=False, plot=None, fs=6.283185307179586, include_nyquist=False)[source]¶
Compute the frequency response of a digital filter.
Given the M-order numerator b and N-order denominator a of a digital filter, compute its frequency response:
jw -jw -jwM jw B(e ) b[0] + b[1]e + ... + b[M]e H(e ) = ------ = ----------------------------------- jw -jw -jwN A(e ) a[0] + a[1]e + ... + a[N]e
- Parameters:
- barray_like
Numerator of a linear filter. If b has dimension greater than 1, it is assumed that the coefficients are stored in the first dimension, and
b.shape[1:]
,a.shape[1:]
, and the shape of the frequencies array must be compatible for broadcasting.- aarray_like
Denominator of a linear filter. If b has dimension greater than 1, it is assumed that the coefficients are stored in the first dimension, and
b.shape[1:]
,a.shape[1:]
, and the shape of the frequencies array must be compatible for broadcasting.- worN{None, int, array_like}, optional
If a single integer, then compute at that many frequencies (default is N=512). This is a convenient alternative to:
np.linspace(0, fs if whole else fs/2, N, endpoint=include_nyquist)
Using a number that is fast for FFT computations can result in faster computations (see Notes).
If an array_like, compute the response at the frequencies given. These are in the same units as fs.
- wholebool, optional
Normally, frequencies are computed from 0 to the Nyquist frequency, fs/2 (upper-half of unit-circle). If whole is True, compute frequencies from 0 to fs. Ignored if worN is array_like.
- plotcallable
A callable that takes two arguments. If given, the return parameters w and h are passed to plot. Useful for plotting the frequency response inside freqz.
- fsfloat, optional
The sampling frequency of the digital system. Defaults to 2*pi radians/sample (so w is from 0 to pi).
New in version 1.2.0.
- include_nyquistbool, optional
If whole is False and worN is an integer, setting include_nyquist to True will include the last frequency (Nyquist frequency) and is otherwise ignored.
New in version 1.5.0.
- Returns:
- wndarray
The frequencies at which h was computed, in the same units as fs. By default, w is normalized to the range [0, pi) (radians/sample).
- hndarray
The frequency response, as complex numbers.
See also
freqz_zpk
sosfreqz
Notes
Using Matplotlib’s
matplotlib.pyplot.plot()
function as the callable for plot produces unexpected results, as this plots the real part of the complex transfer function, not the magnitude. Trylambda w, h: plot(w, np.abs(h))
.A direct computation via (R)FFT is used to compute the frequency response when the following conditions are met:
An integer value is given for worN.
worN is fast to compute via FFT (i.e., next_fast_len(worN) <scipy.fft.next_fast_len> equals worN).
The denominator coefficients are a single value (
a.shape[0] == 1
).worN is at least as long as the numerator coefficients (
worN >= b.shape[0]
).If
b.ndim > 1
, thenb.shape[-1] == 1
.
For long FIR filters, the FFT approach can have lower error and be much faster than the equivalent direct polynomial calculation.
Examples
>>> from scipy import signal >>> import numpy as np >>> b = signal.firwin(80, 0.5, window=('kaiser', 8)) >>> w, h = signal.freqz(b)
>>> import matplotlib.pyplot as plt >>> fig, ax1 = plt.subplots() >>> ax1.set_title('Digital filter frequency response')
>>> ax1.plot(w, 20 * np.log10(abs(h)), 'b') >>> ax1.set_ylabel('Amplitude [dB]', color='b') >>> ax1.set_xlabel('Frequency [rad/sample]')
>>> ax2 = ax1.twinx() >>> angles = np.unwrap(np.angle(h)) >>> ax2.plot(w, angles, 'g') >>> ax2.set_ylabel('Angle (radians)', color='g') >>> ax2.grid(True) >>> ax2.axis('tight') >>> plt.show()
Broadcasting Examples
Suppose we have two FIR filters whose coefficients are stored in the rows of an array with shape (2, 25). For this demonstration, we’ll use random data:
>>> rng = np.random.default_rng() >>> b = rng.random((2, 25))
To compute the frequency response for these two filters with one call to freqz, we must pass in
b.T
, because freqz expects the first axis to hold the coefficients. We must then extend the shape with a trivial dimension of length 1 to allow broadcasting with the array of frequencies. That is, we pass inb.T[..., np.newaxis]
, which has shape (25, 2, 1):>>> w, h = signal.freqz(b.T[..., np.newaxis], worN=1024) >>> w.shape (1024,) >>> h.shape (2, 1024)
Now, suppose we have two transfer functions, with the same numerator coefficients
b = [0.5, 0.5]
. The coefficients for the two denominators are stored in the first dimension of the 2-D array a:a = [ 1 1 ] [ -0.25, -0.5 ]
>>> b = np.array([0.5, 0.5]) >>> a = np.array([[1, 1], [-0.25, -0.5]])
Only a is more than 1-D. To make it compatible for broadcasting with the frequencies, we extend it with a trivial dimension in the call to freqz:
>>> w, h = signal.freqz(b, a[..., np.newaxis], worN=1024) >>> w.shape (1024,) >>> h.shape (2, 1024)
- godin(ts)[source]¶
Low-pass Godin filter a regular time series. Applies the \(\mathcal{A_{24}^{2}A_{25}}\) Godin filter [1] The filter is generalized to be the equivalent of one boxcar of the length of the lunar diurnal (~25 hours) constituent and two of the solar diurnal (~24 hours), though the implementation combines these steps.
- Parameters:
- Returns:
- result
DataFrame
A new regular time series with the same interval of ts.
- result
- Raises:
- NotImplementedError
If input time series is not univariate
References
[1]Godin (1972) Analysis of Tides
- inset_axes(parent_axes, width, height, loc='upper right', bbox_to_anchor=None, bbox_transform=None, axes_class=None, axes_kwargs=None, borderpad=0.5)[source]¶
Create an inset axes with a given width and height.
Both sizes used can be specified either in inches or percentage. For example,:
inset_axes(parent_axes, width='40%', height='30%', loc='lower left')
creates in inset axes in the lower left corner of parent_axes which spans over 30% in height and 40% in width of the parent_axes. Since the usage of .inset_axes may become slightly tricky when exceeding such standard cases, it is recommended to read the examples.
- Parameters:
- parent_axesmatplotlib.axes.Axes
Axes to place the inset axes.
- width, heightfloat or str
Size of the inset axes to create. If a float is provided, it is the size in inches, e.g. width=1.3. If a string is provided, it is the size in relative units, e.g. width=’40%’. By default, i.e. if neither bbox_to_anchor nor bbox_transform are specified, those are relative to the parent_axes. Otherwise, they are to be understood relative to the bounding box provided via bbox_to_anchor.
- locstr, default: ‘upper right’
Location to place the inset axes. Valid locations are ‘upper left’, ‘upper center’, ‘upper right’, ‘center left’, ‘center’, ‘center right’, ‘lower left’, ‘lower center’, ‘lower right’. For backward compatibility, numeric values are accepted as well. See the parameter loc of .Legend for details.
- bbox_to_anchortuple or ~matplotlib.transforms.BboxBase, optional
Bbox that the inset axes will be anchored to. If None, a tuple of (0, 0, 1, 1) is used if bbox_transform is set to parent_axes.transAxes or parent_axes.figure.transFigure. Otherwise, parent_axes.bbox is used. If a tuple, can be either [left, bottom, width, height], or [left, bottom]. If the kwargs width and/or height are specified in relative units, the 2-tuple [left, bottom] cannot be used. Note that, unless bbox_transform is set, the units of the bounding box are interpreted in the pixel coordinate. When using bbox_to_anchor with tuple, it almost always makes sense to also specify a bbox_transform. This might often be the axes transform parent_axes.transAxes.
- bbox_transform~matplotlib.transforms.Transform, optional
Transformation for the bbox that contains the inset axes. If None, a .transforms.IdentityTransform is used. The value of bbox_to_anchor (or the return value of its get_points method) is transformed by the bbox_transform and then interpreted as points in the pixel coordinate (which is dpi dependent). You may provide bbox_to_anchor in some normalized coordinate, and give an appropriate transform (e.g., parent_axes.transAxes).
- axes_class~matplotlib.axes.Axes type, default: .HostAxes
The type of the newly created inset axes.
- axes_kwargsdict, optional
Keyword arguments to pass to the constructor of the inset axes. Valid arguments include:
Properties: adjustable: {‘box’, ‘datalim’} agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array and two offsets from the bottom left corner of the image alpha: scalar or None anchor: (float, float) or {‘C’, ‘SW’, ‘S’, ‘SE’, ‘E’, ‘NE’, …} animated: bool aspect: {‘auto’, ‘equal’} or float autoscale_on: bool autoscalex_on: unknown autoscaley_on: unknown axes_locator: Callable[[Axes, Renderer], Bbox] axisbelow: bool or ‘line’ box_aspect: float or None clip_box: ~matplotlib.transforms.BboxBase or None clip_on: bool clip_path: Patch or (Path, Transform) or None facecolor or fc: color figure: ~matplotlib.figure.Figure frame_on: bool gid: str in_layout: bool label: object mouseover: bool navigate: bool navigate_mode: unknown path_effects: list of .AbstractPathEffect picker: None or bool or float or callable position: [left, bottom, width, height] or ~matplotlib.transforms.Bbox prop_cycle: ~cycler.Cycler rasterization_zorder: float or None rasterized: bool sketch_params: (scale: float, length: float, randomness: float) snap: bool or None subplotspec: unknown title: str transform: ~matplotlib.transforms.Transform url: str visible: bool xbound: (lower: float, upper: float) xlabel: str xlim: (left: float, right: float) xmargin: float greater than -0.5 xscale: unknown xticklabels: unknown xticks: unknown ybound: (lower: float, upper: float) ylabel: str ylim: (bottom: float, top: float) ymargin: float greater than -0.5 yscale: unknown yticklabels: unknown yticks: unknown zorder: float
- borderpadfloat, default: 0.5
Padding between inset axes and the bbox_to_anchor. The units are axes font size, i.e. for a default font size of 10 points borderpad = 0.5 is equivalent to a padding of 5 points.
- Returns:
- inset_axesaxes_class
Inset axes object created.
Notes
The meaning of bbox_to_anchor and bbox_to_transform is interpreted differently from that of legend. The value of bbox_to_anchor (or the return value of its get_points method; the default is parent_axes.bbox) is transformed by the bbox_transform (the default is Identity transform) and then interpreted as points in the pixel coordinate (which is dpi dependent).
Thus, following three calls are identical and creates an inset axes with respect to the parent_axes:
axins = inset_axes(parent_axes, "30%", "40%") axins = inset_axes(parent_axes, "30%", "40%", bbox_to_anchor=parent_axes.bbox) axins = inset_axes(parent_axes, "30%", "40%", bbox_to_anchor=(0, 0, 1, 1), bbox_transform=parent_axes.transAxes)
- lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
- lowpass_cosine_lanczos_filter_coef(cf, m, normalize=True)[source]¶
return the convolution coefficients for low pass lanczos filter.
- Parameters:
- cf: float
Cutoff frequency expressed as a ratio of a Nyquist frequency.
- m: int
Size of filtering window size.
- Returns:
- results: list
Coefficients of filtering window.
- lowpass_lanczos_filter_coef(cf, m, normalize=True, cosine_taper=False)[source]¶
Return the convolution coefficients for a low-pass Lanczos filter.
- Parameters:
- cffloat
Cutoff frequency expressed as a ratio of the Nyquist frequency.
- mint
Size of the filtering window.
- normalizebool, optional
Whether to normalize the filter coefficients so they sum to 1.
- cosine_taperbool, optional
If True, applies a cosine-squared taper to the Lanczos window.
- Returns:
- resnp.ndarray
Coefficients of the filtering window.
- mark_inset(parent_axes, inset_axes, loc1, loc2, **kwargs)[source]¶
Draw a box to mark the location of an area represented by an inset axes.
This function draws a box in parent_axes at the bounding box of inset_axes, and shows a connection with the inset axes by drawing lines at the corners, giving a “zoomed in” effect.
- Parameters:
- parent_axes~matplotlib.axes.Axes
Axes which contains the area of the inset axes.
- inset_axes~matplotlib.axes.Axes
The inset axes.
- loc1, loc2{1, 2, 3, 4}
Corners to use for connecting the inset axes and the area in the parent axes.
- **kwargs
Patch properties for the lines and box drawn:
Properties: agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array and two offsets from the bottom left corner of the image alpha: unknown animated: bool antialiased or aa: bool or None capstyle: .CapStyle or {‘butt’, ‘projecting’, ‘round’} clip_box: ~matplotlib.transforms.BboxBase or None clip_on: bool clip_path: Patch or (Path, Transform) or None color: color edgecolor or ec: color or None facecolor or fc: color or None figure: ~matplotlib.figure.Figure fill: bool gid: str hatch: {‘/’, ‘\’, ‘|’, ‘-’, ‘+’, ‘x’, ‘o’, ‘O’, ‘.’, ‘*’} in_layout: bool joinstyle: .JoinStyle or {‘miter’, ‘round’, ‘bevel’} label: object linestyle or ls: {‘-’, ‘–’, ‘-.’, ‘:’, ‘’, (offset, on-off-seq), …} linewidth or lw: float or None mouseover: bool path_effects: list of .AbstractPathEffect picker: None or bool or float or callable rasterized: bool sketch_params: (scale: float, length: float, randomness: float) snap: bool or None transform: ~matplotlib.transforms.Transform url: str visible: bool zorder: float
- Returns:
- pp~matplotlib.patches.Patch
The patch drawn to represent the area of the inset axes.
- p1, p2~matplotlib.patches.Patch
The patches connecting two corners of the inset axes and its area.
- ts_gaussian_filter(ts, sigma, order=0, mode='reflect', cval=0.0, truncate=4.0)[source]¶
Column-wise Gaussian smoothing of regular time series. Missing/irregular values are not handled, which means this function is not much different from a rolling window gaussian average in pandas which may be preferable in the case of missing data (ts.rolling(window=5,win_type=’gaussian’).mean. This function has been kept around awaiting irreg as an aspiration but yet to be implemented.
vtools.functions.interannual module¶
vtools.functions.interpolate module¶
Module for data interpolation using splines or interfaces unavailable in Pandas.
- _monotonic_spline(x, y, xnew)[source]¶
Third order (M3-A) monotonicity-preserving spline Usage: interpolate.spline(x,y,xnew)
- where
x are the sorted index values of the original data y are the original dataxnew xnew are new locations for the spline
Reference: Huynh, HT <<Accurate Monotone Cubic Interpolation>>, SIAM J. Numer. Analysis V30 No. 1 pp 57-100 All equation numbers refer to this paper. The variable names are also almost the same. Double letters like “ee” to indicate that the subscript should have “+1/2” added to it and a number after the variable to show the “t” that the first member applies to.
- monotonic_spline(ts, dest)[source]¶
Interpolating a regular time series (rts) to a finer rts by rational histospline.
The rational histospline preserves area under the curve. This is a good choice of spline for period averaged data where an interpolant is desired that is ‘conservative’. Note that it is the underlying continuous interpolant that will be ‘conservative though, not the returned discrete time series which merely samples the underlying interpolant.
- Parameters:
- ts
Pandas.DataFrame
Series to be interpolated, typically with DatetimeIndex
- desta pandas freq code (e.g. ‘16min’ or ‘D’) or a DateTimeIndex
- ts
- Returns:
- result
DataFrame
A regular time series with same columns as ts, populated with instantaneous values and with an index of type DateTimeIndex
- result
- rhist(x, y, xnew, y0, yn, p, q)[source]¶
Histopline for arrays with tension. Based by an algorithm rhist2 in One Dimensional Spline Interpolation Algorithms by Helmuth Spath (1995).
- Parameters:
- xarray-like
Abscissa array of original data, of length n
- yarray-like, dimension (n-1)
Values (mantissa) of original data giving the rectangle (average) values between x[i] and x[i+1]
- xnewarray-like
Array of new locations at which to interpolate.
- y0,ynfloat
Initial and terminal values
- p,q: array-like, dimension (n-1)
Tension parameter, p and q are almost always the same. The higher p and q are for a particular x interval, the more rectangular the interpolant will look and the more positivity and shape preserving it is at the expense of accuracy. For this routine any number p,q > -1 is allowed, although the bound routine doesn’t use vals less than zero.
- Returns
- ——-
- ynewarray-like
Array that interpolates the original data.
- rhist_bound(x, y, xnew, y0, yn, p, lbound=None, maxiter=5, pfactor=2, floor_eps=0.001)[source]¶
Numpy implementation of histospline with bounds Histopline for arrays with lower bound enforcement. This routine drives rhist() but tests that the output array observes the lower bound and adapts the tension parameters as needed.
This will not work exactly if the input array has values right on the lower bound. In this case, the parameter floor_eps allows you to specify a tolerance of bound violation to shoot for … and if it isn’t met in maxiter iterations the value is simply floored.
- Parameters:
- xarray-like
Abscissa array of original data to be interpolated, of length n
- yarray-like, dimension (n-1)
Values (mantissa) of original data giving the rectangle (average) values between x[i] and x[i+1]
- xnewarray-like
Array of new locations at which to interpolate.
- y0,ynfloat
Initial and terminal values
- p: float
Tension parameter. This starts out as a global scalar, but will be converted to an array and adapted locally. The higher this goes for a particular x interval, the more rectangular the interpolant will look and the more positivity and shape preserving it is at the expense of accuracy. A good number is 1, and for this routine, p > 0 is required because the adaptive process multiplies it by pfactor each iteration on the expectation that it will get bigger.
- lbound: float
Lower bound to be enforced. If the original y’s are strictly above this value, the output has the potential to also be strictly above. If the original y’s lie on the lower bound, then the lower bound can only be enforced within a tolerance using the Spath algorithm … and once the values reach that tolerance they are floored. If lbound = None, this function behaves like rhist()
- maxiterinteger
Number of times to increase p by multiplying it by pfactor before giving up on satisfying floor_eps.
- pfactorfloat
Factor by which to multiply individual time step p
- floor_epsfloat
Tolerance for lower bound violation at which the algorithm will be terminated and the bounds will be enforced by flooring.
- Returns:
- ynewarray-like
Array that interpolates the original data, on a curve that conserves mass and strictly observes the lower bound.
- rhistinterp(ts, dest, p=2.0, lowbound=None, tolbound=0.001, maxiter=5)[source]¶
Interpolate a regular time series (rts) to a finer rts by rational histospline.
The rational histospline preserves area under the curve. This is a good choice of spline for period averaged data where an interpolant is desired that is ‘conservative’. Note that it is the underlying continuous interpolant that will be ‘conservative though, not the returned discrete time series which merely samples the underlying interpolant.
- Parameters:
- ts
Pandas.DataFrame
Series to be interpolated, with period index and assuming time stamps at beginning of the period and no missing data
- deststring or
DateTimeIndex
A pandas freq code (e.g. ‘16min’ or ‘D’) or a DateTimeIndex
- pfloat, optional
Spline tension, usually between 0 and 20. Must >-1. For a ‘sufficiently large’ value of p, the interpolant will be monotonicity-preserving and will maintain strict positivity (always being strictly > lowbound). It will also preserve the original shape of the time series.
- lowboundfloat, optional
Lower bound of interpolated values.
- tolboundfloat, optional
Tolerance for determining if an input is on the bound.
- ts
- Returns:
- result
pandas.DataFrame
A regular time series with same columns as ts, populated with instantaneous values and with an index of type DateTimeIndex
- result
vtools.functions.lag_cross_correlation module¶
- calculate_lag(lagged, base, max_lag, res, interpolate_method='linear')[source]¶
Calculate shift in lagged, that maximizes cross-correlation with base.
- Parameters:
- base,lagged: :class:`Pandas.Series`
time series to compare. The result is relative to base
- max_lag: interval
Maximum pos/negative time shift to consider in cross-correlation (ie, from -max_lag to +max_lag). Required windows in lagged will account for this bracket. For series dominated by a single frequency (eg 1c/12.5 hours for tides), the algorithm can tolerate a range of 180 degrees (6 hours)
- res: interval
Resolution of analysis. The series lagged will be interpolated to this resolution using interpolate_method. Unit used here will determine the type of the output. See documentation of the interval concept which is most compatible with pandas.tseries.offsets, not timeDelta, because of better math properties in things like division – vtime helpers like minutes(1) may be helpful
- interpolate_method: str, optional
Interpolate method to refine lagged to res. Must be compatible with pandas interpolation method names (and hence scipy)
- Returns:
- laginterval
Shift as a pandas.tseries.offsets subtype that matches units with res This shift is the apparent lateness (pos) or earliness (neg). It must be applied to base or removed to lagged to align the features.
vtools.functions.merge module¶
- _reindex_to_continuous(result, first_freq)[source]¶
Reindex the given result (DataFrame or Series) to a continuous index spanning from the minimum to maximum timestamp of the current index using the provided frequency.
- Parameters:
- resultpandas.DataFrame or pandas.Series
The merged or spliced result whose index will be reindexed.
- first_freqfrequency
The frequency (e.g. ‘D’ for daily) to use for the continuous index, taken from the first input series.
- Returns:
- resultpandas.DataFrame or pandas.Series
The result reindexed to a continuous index. Any gaps in the timeline will be filled with NaN. If the index is a PeriodIndex, the index is rebuilt with the proper frequency.
- reduce(function, iterable[, initial]) value ¶
Apply a function of two arguments cumulatively to the items of a sequence or iterable, from left to right, so as to reduce the iterable to a single value. For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5). If initial is present, it is placed before the items of the iterable in the calculation, and serves as a default when the iterable is empty.
- ts_merge(series, names=None, strict_priority=False)[source]¶
Merge multiple time series together, prioritizing series in order.
- Parameters:
- seriessequence of pandas.Series or pandas.DataFrame
Higher priority first. All indexes must be DatetimeIndex.
- namesNone, str, or iterable of str, optional
If None (default), inputs must share compatible column names.
If str, the output is univariate and will be named accordingly.
If iterable, it is used as a subset/ordering of columns.
- strict_prioritybool, default False
If False (default): lower-priority data may fill NaNs in higher-priority series anywhere (traditional merge/overlay). If True: for each column, within the window [first_valid_index, last_valid_index] of any higher-priority series, lower-priority data are masked out — even if the higher-priority value is NaN. Outside those windows, behavior is unchanged.
- Returns:
- pandas.Series or pandas.DataFrame
- ts_splice(series, names=None, transition='prefer_last', floor_dates=False)[source]¶
Splice multiple time series together, prioritizing series in patches of time.
Unlike ts_merge, which blends overlapping data points, ts_splice stitches together time series without overlap. The function determines when to switch between series based on a transition strategy.
- Parameters:
- seriestuple or list of pandas.DataFrame or pandas.Series
A tuple or list of time series. Each series must have a DatetimeIndex and consistent column structure.
- namesNone, str, or iterable of str, optional
If None (default), all input series must share common column names, and the output will merge common columns.
If a str, all input series must have a single column, and the output will be a DataFrame with this name as the column name.
If an iterable of str, all input DataFrames must have the same number of columns matching the length of names, and these will be used for the output.
- transition{‘prefer_first’, ‘prefer_last’} or list of pandas.Timestamp
Defines how to determine breakpoints between time series: - ‘prefer_first’: Uses the earlier series on the list during until its valid timestamp. - ‘prefer_last’: Uses the later series starting from its first valid timestamp. - A list of specific timestamps can also be provided as transition points.
- floor_datesbool, optional, default=False
If True, inferred transition timestamps (prefer_first or prefer_last) are floored to the beginning of the day. This can introduce NaNs if the input series are regular with a freq attribute.
- Returns:
- pandas.DataFrame or pandas.Series
If the input contains multi-column DataFrames, the output is a DataFrame with the same column structure.
If a collection of single-column Series is provided, the output will be a Series.
The output retains a freq attribute if all inputs share the same frequency.
See also
ts_merge
Merges series by filling gaps in order of priority.
Notes
The output time index is the union of input time indices.
If transition is ‘prefer_first’, gaps may appear in the final time series.
If transition is ‘prefer_last’, overlapping data is resolved in favor of later series.
vtools.functions.period_op module¶
vtools.functions.read_dss module¶
- check_exclude(pathname, exclude_pathname)[source]¶
Returns True if pathname matches the exclude_pathname pattern. Wildcards (*) in exclude_pathname are supported.
- read_dss(filename, pathname, dt=<15 * Minutes>, p=2.0, start_date=None, end_date=None, exclude_pathname=None)[source]¶
Reads in a DSM2 dss file and interpolates Outputs an interpolated DataFrame of that variable
- Parameters:
- filename: str|Path
Path to the DSS file to read
- pathname: str|list
Pathname(s) within the DSS file to read. Needs to be in the format ‘/A_PART/B_PART/C_PART/D_PART/E_PART/F_PART/’ (e.g. ‘//RSAN112/FLOW////’)
vtools.functions.savitzky_golay module¶
- savgol_filter_weighted(data, window_length, degree, error=None, cov_matrix=None, deriv=None, use_numba=True)[source]¶
Apply a Savitzky–Golay filter with weights to a univariate DataFrame or Series.
- Parameters:
- datapandas.DataFrame or pandas.Series
DataFrame or Series containing your data.
- window_lengthint
Length of the filter window (must be odd).
- degreeint
Degree of the polynomial fit.
- errorpandas.Series, optional
Series containing the error (used to compute weights).
- cov_matrix2D numpy array, optional
Covariance matrix for the errors.
- derivint, optional
Derivative order to compute.
- use_numbabool, optional
If True, uses the Numba-accelerated kernel.
- Returns:
- pandas.Series
Series of the filtered values.
Notes
The practical size of window_length depends on the data and the computational resources. Larger window lengths provide smoother results but require more computation and may not capture local variations well. It is recommended to experiment with different window lengths to find the optimal value for your specific application.
Some of the workflow derived from this work: https://github.com/surhudm/savitzky_golay_with_errors
vtools.functions.skill_metrics module¶
- corr_coefficient(predictions, targets, method='pearson')[source]¶
Calculates the correlation coefficient (the ‘r’ in ‘-squared’ between two series.
For time series where the targets are serially correlated and may span only a fraction of the natural variability, this statistic may not be appropriate and Murphy (1988) explains why caution should be exercised in using this statistic.
- Parameters:
- predictions, targetsarray_like
Time series to analyze
- methodpearson’, ‘kendall’, ‘spearman’
Method compatilble with pandasa
- Returns:
- rfloat
Correlation coefficient
- mean_error(predictions, targets, proportiontocut)[source]¶
Calculate the untrimmed mean error, discounting nan values
- Parameters:
- predictions, targetsarray_like
Time series or arrays to be analyzed
- Returns:
- medfloat
Median error
- median_error(predictions, targets)[source]¶
Calculate the median error, discounting nan values
- Parameters:
- predictions, targetsarray_like
Time series or arrays to be analyzed
- Returns:
- medfloat
Median error
- mse(predictions, targets)[source]¶
Mean squared error
- Parameters:
- predictions, targetsarray_like
Time series or arrays to analyze
- Returns:
- msevtools.data.timeseries.TimeSeries
Mean squared error between predictions and targets
- rmse(predictions, targets)[source]¶
Root mean squared error
- Parameters:
- predictions, targetsarray_like
Time series or arrays to analyze
- Returns:
- msefloat
Mean squared error
- skill_score(predictions, targets, ref=None)[source]¶
Calculate a Nash-Sutcliffe-like skill score based on mean squared error
As per the discussion in Murphy (1988) other reference forecasts (climatology, harmonic tide, etc.) are possible.
- Parameters:
- predictions, targetsarray_like
Time series or arrays to be analyzed
- Returns:
- rmsefloat
Root mean squared error
- tmean_error(predictions, targets, limits=None, inclusive=[True, True])[source]¶
Calculate the (possibly trimmed) mean error, discounting nan values
- Parameters:
- predictions, targetsarray_like
Time series or arrays to be analyzed
- limitstuple(float)
Low and high limits for trimming
- inclusive[boolean, boolean]
True if clipping is inclusive on the low/high end
- Returns:
- meanfloat
Trimmed mean error
- willmott_score(predictions, targets, ref=None)[source]¶
Calculate a Nash-Sutcliffe-like skill score based on mean squared error
As per the discussion in Murphy (1988) other reference forecasts (climatology, harmonic tide, etc.) are possible.
- Parameters:
- predictions, targetsarray_like
Time series or arrays to be analyzed
- Returns:
- rmsefloat
Root mean squared error
vtools.functions.tidalhl module¶
- cosine_lanczos(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
- get_smoothed_resampled(df, cutoff_period='2h', resample_period='1min', interpolate_method='pchip')[source]¶
Resample the dataframe (indexed by time) to the regular period of resample_period using the interpolate method
Furthermore the cosine lanczos filter is used with a cutoff_period to smooth the signal to remove high frequency noise
Args:
df (DataFrame): A single column dataframe indexed by datetime
cutoff_period (str, optional): cutoff period for cosine lanczos filter. Defaults to ‘2h’.
resample_period (str, optional): Resample to regular period. Defaults to ‘1min’.
interpolate_method (str, optional): interpolation for resampling. Defaults to ‘pchip’.
Returns:
DataFrame: smoothed and resampled dataframe indexed by datetime
- get_tidal_amplitude(dfh, dfl)[source]¶
Tidal amplitude given tidal highs and lows
Args:
dfh (DataFrame): Tidal highs time series
dfl (DataFrame): Tidal lows time series
Returns:
DataFrame: Amplitude timeseries, at the times of the low following the high being used for amplitude calculation
- get_tidal_amplitude_diff(dfamp1, dfamp2, percent_diff=False, tolerance='4h')[source]¶
Get the difference of values within +/- 4H of values in the two amplitude arrays
Args:
dfamp1 (DataFrame): Amplitude time series
dfamp2 (DataFrame): Amplitude time series
percent_diff (bool, optional): If true do percent diff. Defaults to False.
Returns:
DataFrame: Difference dfamp1-dfamp2 or % Difference (dfamp1-dfamp2)/dfamp2*100 for values within +/- 4H of each other
- get_tidal_hl(df, cutoff_period='2h', resample_period='1min', interpolate_method='pchip', moving_window_size='7h')[source]¶
Get Tidal highs and lows
Args:
df (DataFrame): A single column dataframe indexed by datetime
cutoff_period (str, optional): cutoff period for cosine lanczos filter. Defaults to ‘2h’.
resample_period (str, optional): Resample to regular period. Defaults to ‘1min’.
interpolate_method (str, optional): interpolation for resampling. Defaults to ‘pchip’.
moving_window_size (str, optional): moving window size to look for lows within. Defaults to ‘7h’.
Returns:
tuple of DataFrame: Tidal high and tidal low time series
- get_tidal_hl_rolling(df, cutoff_period='2h', resample_period='1min', interpolate_method='pchip', moving_window_size='7h')¶
Get Tidal highs and lows
Args:
df (DataFrame): A single column dataframe indexed by datetime
cutoff_period (str, optional): cutoff period for cosine lanczos filter. Defaults to ‘2h’.
resample_period (str, optional): Resample to regular period. Defaults to ‘1min’.
interpolate_method (str, optional): interpolation for resampling. Defaults to ‘pchip’.
moving_window_size (str, optional): moving window size to look for lows within. Defaults to ‘7h’.
Returns:
tuple of DataFrame: Tidal high and tidal low time series
- get_tidal_hl_zerocrossing(df, round_to='1min')[source]¶
Finds the tidal high and low times using zero crossings of the first derivative.
This works for all situations but is not robust in the face of noise and perturbations in the signal
- get_tidal_phase_diff(dfh2, dfl2, dfh1, dfl1, tolerance='4h')[source]¶
Calculates the phase difference between df2 and df1 tidal highs and lows
Scans +/- 4 hours in df1 to get the highs and lows in that windows for df2 to get the tidal highs and lows at the times of df1
Args:
dfh2 (DataFrame): Timeseries of tidal highs
dfl2 (DataFrame): Timeseries of tidal lows
dfh1 (DataFrame): Timeseries of tidal highs
dfl1 (DataFRame): Timeseries of tidal lows
Returns:
DataFrame: Phase difference (dfh2-dfh1) and (dfl2-dfl1) in minutes
- periods_per_window(moving_window_size: str, period_str: str) int [source]¶
Number of period size in moving window
Args:
moving_window_size (str): moving window size as a string e.g 7H for 7 hour
period_str (str): period as str e.g. 1T for 1 min
Returns:
int: number of periods in the moving window rounded to an integer
- tidal_highs(df, moving_window_size='7h')[source]¶
Tidal highs (could be upto two highs in a 25 hr period)
Args:
df (DataFrame): a time series with a regular frequency
moving_window_size (str, optional): moving window size to look for highs within. Defaults to ‘7h’.
Returns:
DataFrame: an irregular time series with highs at resolution of df.index
- tidal_lows(df, moving_window_size='7h')[source]¶
Tidal lows (could be upto two lows in a 25 hr period)
Args:
df (DataFrame): a time series with a regular frequency
moving_window_size (str, optional): moving window size to look for lows within. Defaults to ‘7h’.
Returns:
DataFrame: an irregular time series with lows at resolution of df.index
vtools.functions.tidalhours module¶
- cdiff(a, n=1, axis=-1)[source]¶
Like np.diff, but include difference from last element back to first.
- cosine_lanczos5(ts, cutoff_period=None, cutoff_frequency=None, filter_len=None, padtype=None, padlen=None, fill_edge_nan=True)[source]¶
squared low-pass cosine lanczos filter on a regular time series.
- Parameters:
- ts
DataFrame
- filter_lenint, time_interval
Size of lanczos window, default is to number of samples within filter_period*1.25.
- cutoff_frequency: float,optional
Cutoff frequency expressed as a ratio of a Nyquist frequency, should within the range (0,1). For example, if the sampling frequency is 1 hour, the Nyquist frequency is 1 sample/2 hours. If we want a 36 hour cutoff period, the frequency is 1/36 or 0.0278 cycles per hour. Hence the cutoff frequency argument used here would be 0.0278/0.5 = 0.056.
- cutoff_periodstring or _time_interval
Period of cutting off frequency. If input as a string, it must be convertible to a _time_interval (Pandas freq). cutoff_frequency and cutoff_period can’t be specified at the same time.
- padtypestr or None, optional
Must be ‘odd’, ‘even’, ‘constant’, or None. This determines the type of extension to use for the padded signal to which the filter is applied. If padtype is None, no padding is used. The default is None.
- padlenint or None, optional
The number of elements by which to extend x at both ends of axis before applying the filter. This value must be less than x.shape[axis]-1. padlen=0 implies no padding. If padtye is not None and padlen is not given, padlen is be set to 6*m.
- fill_edge_nan: bool,optional
If pading is not used and fill_edge_nan is true, resulting data on the both ends are filled with nan to account for edge effect. This is 2*m on the either end of the result. Default is true.
- ts
- Returns:
- result
TimeSeries
A new regular time series with the same interval of ts. If no pading is used the beigning and ending 4*m resulting data will be set to nan to remove edge effect.
- result
- Raises:
- ValueError
If input timeseries is not regular, or, cutoff_period and cutoff_frequency are given at the same time, or, neither cutoff_period nor curoff_frequence is given, or, padtype is not “odd”,”even”,”constant”,or None, or, padlen is larger than data size
- diff_h(tidal_hour_series)[source]¶
Compute the time derivative of tidal hour.
This derivative is often included to capture how rapidly the tidal phase is changing, which can be important in modeling flow reversals, estuarine dynamics, or for detecting slack tide conditions where the rate of change is near zero.
- Parameters:
- tidal_hour_seriespandas.Series
Output of tidal_hour_signal, indexed by datetime.
- Returns:
- pandas.Series
Time derivative of tidal hour (dH/dt) in hours/hour, indexed by datetime.
- hilbert(x, N=None, axis=-1)[source]¶
Compute the analytic signal, using the Hilbert transform.
The transformation is done along the last axis by default.
- Parameters:
- xarray_like
Signal data. Must be real.
- Nint, optional
Number of Fourier components. Default:
x.shape[axis]
- axisint, optional
Axis along which to do the transformation. Default: -1.
- Returns:
- xandarray
Analytic signal of x, of each 1-D array along axis
Notes
The analytic signal
x_a(t)
of signalx(t)
is:\[x_a = F^{-1}(F(x) 2U) = x + i y\]where F is the Fourier transform, U the unit step function, and y the Hilbert transform of x. [1]
In other words, the negative half of the frequency spectrum is zeroed out, turning the real-valued signal into a complex signal. The Hilbert transformed signal can be obtained from
np.imag(hilbert(x))
, and the original signal fromnp.real(hilbert(x))
.References
[1]Wikipedia, “Analytic signal”. https://en.wikipedia.org/wiki/Analytic_signal
[2]Leon Cohen, “Time-Frequency Analysis”, 1995. Chapter 2.
[3]Alan V. Oppenheim, Ronald W. Schafer. Discrete-Time Signal Processing, Third Edition, 2009. Chapter 12. ISBN 13: 978-1292-02572-8
Examples
In this example we use the Hilbert transform to determine the amplitude envelope and instantaneous frequency of an amplitude-modulated signal.
>>> import numpy as np >>> import matplotlib.pyplot as plt >>> from scipy.signal import hilbert, chirp
>>> duration = 1.0 >>> fs = 400.0 >>> samples = int(fs*duration) >>> t = np.arange(samples) / fs
We create a chirp of which the frequency increases from 20 Hz to 100 Hz and apply an amplitude modulation.
>>> signal = chirp(t, 20.0, t[-1], 100.0) >>> signal *= (1.0 + 0.5 * np.sin(2.0*np.pi*3.0*t) )
The amplitude envelope is given by magnitude of the analytic signal. The instantaneous frequency can be obtained by differentiating the instantaneous phase in respect to time. The instantaneous phase corresponds to the phase angle of the analytic signal.
>>> analytic_signal = hilbert(signal) >>> amplitude_envelope = np.abs(analytic_signal) >>> instantaneous_phase = np.unwrap(np.angle(analytic_signal)) >>> instantaneous_frequency = (np.diff(instantaneous_phase) / ... (2.0*np.pi) * fs)
>>> fig, (ax0, ax1) = plt.subplots(nrows=2) >>> ax0.plot(t, signal, label='signal') >>> ax0.plot(t, amplitude_envelope, label='envelope') >>> ax0.set_xlabel("time in seconds") >>> ax0.legend() >>> ax1.plot(t[1:], instantaneous_frequency) >>> ax1.set_xlabel("time in seconds") >>> ax1.set_ylim(0.0, 120.0) >>> fig.tight_layout()
- hour_tide(jd, u=None, h=None, jd_new=None, leave_mean=False, start_datum='ebb')[source]¶
Calculate tide-hour from a time series of u (velocity, flood-positive) or h (water level, i.e. positive-up).
jd: time in days, i.e. julian day, datenum, etc. u: velocity, flood-positive h: water level, positive up. jd_new: if you want to evaluate the tidal hour at a different
(but overlapping) set of times.
- leave_mean: by default, the time series mean is removed, but this
can be disabled by passing True here.
the return values are between 0 and 12, with 0 being slack before ebb (hmm - may need to verify that.).
- hour_tide_fn(jd, u, start_datum='ebb', leave_mean=False)[source]¶
Return a function for extracting tidal hour from the time/velocity given. Use the _fn version if making repeated calls with different jd_new, but the same jd,u
- class interp1d(x, y, kind='linear', axis=-1, copy=True, bounds_error=None, fill_value=nan, assume_sorted=False)[source]¶
Bases:
_Interpolator1D
Interpolate a 1-D function.
x and y are arrays of values used to approximate some function f:
y = f(x)
. This class returns a function whose call method uses interpolation to find the value of new points.- Parameters:
- x(npoints, ) array_like
A 1-D array of real values.
- y(…, npoints, …) array_like
A N-D array of real values. The length of y along the interpolation axis must be equal to the length of x. Use the
axis
parameter to select correct axis. Unlike other interpolators, the default interpolation axis is the last axis of y.- kindstr or int, optional
Specifies the kind of interpolation as a string or as an integer specifying the order of the spline interpolator to use. The string has to be one of ‘linear’, ‘nearest’, ‘nearest-up’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, or ‘next’. ‘zero’, ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of zeroth, first, second or third order; ‘previous’ and ‘next’ simply return the previous or next value of the point; ‘nearest-up’ and ‘nearest’ differ when interpolating half-integers (e.g. 0.5, 1.5) in that ‘nearest-up’ rounds up and ‘nearest’ rounds down. Default is ‘linear’.
- axisint, optional
Axis in the
y
array corresponding to the x-coordinate values. Unlike other interpolators, defaults toaxis=-1
.- copybool, optional
If True, the class makes internal copies of x and y. If False, references to x and y are used. The default is to copy.
- bounds_errorbool, optional
If True, a ValueError is raised any time interpolation is attempted on a value outside of the range of x (where extrapolation is necessary). If False, out of bounds values are assigned fill_value. By default, an error is raised unless
fill_value="extrapolate"
.- fill_valuearray-like or (array-like, array_like) or “extrapolate”, optional
if a ndarray (or float), this value will be used to fill in for requested points outside of the data range. If not provided, then the default is NaN. The array-like must broadcast properly to the dimensions of the non-interpolation axes.
If a two-element tuple, then the first element is used as a fill value for
x_new < x[0]
and the second element is used forx_new > x[-1]
. Anything that is not a 2-element tuple (e.g., list or ndarray, regardless of shape) is taken to be a single array-like argument meant to be used for both bounds asbelow, above = fill_value, fill_value
. Using a two-element tuple or ndarray requiresbounds_error=False
.New in version 0.17.0.
If “extrapolate”, then points outside the data range will be extrapolated.
New in version 0.17.0.
- assume_sortedbool, optional
If False, values of x can be in any order and they are sorted first. If True, x has to be an array of monotonically increasing values.
See also
splrep
,splev
Spline interpolation/smoothing based on FITPACK.
UnivariateSpline
An object-oriented wrapper of the FITPACK routines.
interp2d
2-D interpolation
Notes
Calling interp1d with NaNs present in input values results in undefined behaviour.
Input values x and y must be convertible to float values like int or float.
If the values in x are not unique, the resulting behavior is undefined and specific to the choice of kind, i.e., changing kind will change the behavior for duplicates.
Examples
>>> import numpy as np >>> import matplotlib.pyplot as plt >>> from scipy import interpolate >>> x = np.arange(0, 10) >>> y = np.exp(-x/3.0) >>> f = interpolate.interp1d(x, y)
>>> xnew = np.arange(0, 9, 0.1) >>> ynew = f(xnew) # use interpolation function returned by `interp1d` >>> plt.plot(x, y, 'o', xnew, ynew, '-') >>> plt.show()
- Attributes:
fill_value
The fill value.
Methods
__call__
(x)Evaluate the interpolant
- __dict__ = mappingproxy({'__module__': 'scipy.interpolate._interpolate', '__doc__': '\n Interpolate a 1-D function.\n\n .. legacy:: class\n\n For a guide to the intended replacements for `interp1d` see\n :ref:`tutorial-interpolate_1Dsection`.\n\n `x` and `y` are arrays of values used to approximate some function f:\n ``y = f(x)``. This class returns a function whose call method uses\n interpolation to find the value of new points.\n\n Parameters\n ----------\n x : (npoints, ) array_like\n A 1-D array of real values.\n y : (..., npoints, ...) array_like\n A N-D array of real values. The length of `y` along the interpolation\n axis must be equal to the length of `x`. Use the ``axis`` parameter\n to select correct axis. Unlike other interpolators, the default\n interpolation axis is the last axis of `y`.\n kind : str or int, optional\n Specifies the kind of interpolation as a string or as an integer\n specifying the order of the spline interpolator to use.\n The string has to be one of \'linear\', \'nearest\', \'nearest-up\', \'zero\',\n \'slinear\', \'quadratic\', \'cubic\', \'previous\', or \'next\'. \'zero\',\n \'slinear\', \'quadratic\' and \'cubic\' refer to a spline interpolation of\n zeroth, first, second or third order; \'previous\' and \'next\' simply\n return the previous or next value of the point; \'nearest-up\' and\n \'nearest\' differ when interpolating half-integers (e.g. 0.5, 1.5)\n in that \'nearest-up\' rounds up and \'nearest\' rounds down. Default\n is \'linear\'.\n axis : int, optional\n Axis in the ``y`` array corresponding to the x-coordinate values. Unlike\n other interpolators, defaults to ``axis=-1``.\n copy : bool, optional\n If True, the class makes internal copies of x and y.\n If False, references to `x` and `y` are used. The default is to copy.\n bounds_error : bool, optional\n If True, a ValueError is raised any time interpolation is attempted on\n a value outside of the range of x (where extrapolation is\n necessary). If False, out of bounds values are assigned `fill_value`.\n By default, an error is raised unless ``fill_value="extrapolate"``.\n fill_value : array-like or (array-like, array_like) or "extrapolate", optional\n - if a ndarray (or float), this value will be used to fill in for\n requested points outside of the data range. If not provided, then\n the default is NaN. The array-like must broadcast properly to the\n dimensions of the non-interpolation axes.\n - If a two-element tuple, then the first element is used as a\n fill value for ``x_new < x[0]`` and the second element is used for\n ``x_new > x[-1]``. Anything that is not a 2-element tuple (e.g.,\n list or ndarray, regardless of shape) is taken to be a single\n array-like argument meant to be used for both bounds as\n ``below, above = fill_value, fill_value``. Using a two-element tuple\n or ndarray requires ``bounds_error=False``.\n\n .. versionadded:: 0.17.0\n - If "extrapolate", then points outside the data range will be\n extrapolated.\n\n .. versionadded:: 0.17.0\n assume_sorted : bool, optional\n If False, values of `x` can be in any order and they are sorted first.\n If True, `x` has to be an array of monotonically increasing values.\n\n Attributes\n ----------\n fill_value\n\n Methods\n -------\n __call__\n\n See Also\n --------\n splrep, splev\n Spline interpolation/smoothing based on FITPACK.\n UnivariateSpline : An object-oriented wrapper of the FITPACK routines.\n interp2d : 2-D interpolation\n\n Notes\n -----\n Calling `interp1d` with NaNs present in input values results in\n undefined behaviour.\n\n Input values `x` and `y` must be convertible to `float` values like\n `int` or `float`.\n\n If the values in `x` are not unique, the resulting behavior is\n undefined and specific to the choice of `kind`, i.e., changing\n `kind` will change the behavior for duplicates.\n\n\n Examples\n --------\n >>> import numpy as np\n >>> import matplotlib.pyplot as plt\n >>> from scipy import interpolate\n >>> x = np.arange(0, 10)\n >>> y = np.exp(-x/3.0)\n >>> f = interpolate.interp1d(x, y)\n\n >>> xnew = np.arange(0, 9, 0.1)\n >>> ynew = f(xnew) # use interpolation function returned by `interp1d`\n >>> plt.plot(x, y, \'o\', xnew, ynew, \'-\')\n >>> plt.show()\n ', '__init__': <function interp1d.__init__>, 'fill_value': <property object>, '_check_and_update_bounds_error_for_extrapolation': <function interp1d._check_and_update_bounds_error_for_extrapolation>, '_call_linear_np': <function interp1d._call_linear_np>, '_call_linear': <function interp1d._call_linear>, '_call_nearest': <function interp1d._call_nearest>, '_call_previousnext': <function interp1d._call_previousnext>, '_call_spline': <function interp1d._call_spline>, '_call_nan_spline': <function interp1d._call_nan_spline>, '_evaluate': <function interp1d._evaluate>, '_check_bounds': <function interp1d._check_bounds>, '__dict__': <attribute '__dict__' of 'interp1d' objects>, '__weakref__': <attribute '__weakref__' of 'interp1d' objects>, '__annotations__': {}})¶
- __doc__ = '\n Interpolate a 1-D function.\n\n .. legacy:: class\n\n For a guide to the intended replacements for `interp1d` see\n :ref:`tutorial-interpolate_1Dsection`.\n\n `x` and `y` are arrays of values used to approximate some function f:\n ``y = f(x)``. This class returns a function whose call method uses\n interpolation to find the value of new points.\n\n Parameters\n ----------\n x : (npoints, ) array_like\n A 1-D array of real values.\n y : (..., npoints, ...) array_like\n A N-D array of real values. The length of `y` along the interpolation\n axis must be equal to the length of `x`. Use the ``axis`` parameter\n to select correct axis. Unlike other interpolators, the default\n interpolation axis is the last axis of `y`.\n kind : str or int, optional\n Specifies the kind of interpolation as a string or as an integer\n specifying the order of the spline interpolator to use.\n The string has to be one of \'linear\', \'nearest\', \'nearest-up\', \'zero\',\n \'slinear\', \'quadratic\', \'cubic\', \'previous\', or \'next\'. \'zero\',\n \'slinear\', \'quadratic\' and \'cubic\' refer to a spline interpolation of\n zeroth, first, second or third order; \'previous\' and \'next\' simply\n return the previous or next value of the point; \'nearest-up\' and\n \'nearest\' differ when interpolating half-integers (e.g. 0.5, 1.5)\n in that \'nearest-up\' rounds up and \'nearest\' rounds down. Default\n is \'linear\'.\n axis : int, optional\n Axis in the ``y`` array corresponding to the x-coordinate values. Unlike\n other interpolators, defaults to ``axis=-1``.\n copy : bool, optional\n If True, the class makes internal copies of x and y.\n If False, references to `x` and `y` are used. The default is to copy.\n bounds_error : bool, optional\n If True, a ValueError is raised any time interpolation is attempted on\n a value outside of the range of x (where extrapolation is\n necessary). If False, out of bounds values are assigned `fill_value`.\n By default, an error is raised unless ``fill_value="extrapolate"``.\n fill_value : array-like or (array-like, array_like) or "extrapolate", optional\n - if a ndarray (or float), this value will be used to fill in for\n requested points outside of the data range. If not provided, then\n the default is NaN. The array-like must broadcast properly to the\n dimensions of the non-interpolation axes.\n - If a two-element tuple, then the first element is used as a\n fill value for ``x_new < x[0]`` and the second element is used for\n ``x_new > x[-1]``. Anything that is not a 2-element tuple (e.g.,\n list or ndarray, regardless of shape) is taken to be a single\n array-like argument meant to be used for both bounds as\n ``below, above = fill_value, fill_value``. Using a two-element tuple\n or ndarray requires ``bounds_error=False``.\n\n .. versionadded:: 0.17.0\n - If "extrapolate", then points outside the data range will be\n extrapolated.\n\n .. versionadded:: 0.17.0\n assume_sorted : bool, optional\n If False, values of `x` can be in any order and they are sorted first.\n If True, `x` has to be an array of monotonically increasing values.\n\n Attributes\n ----------\n fill_value\n\n Methods\n -------\n __call__\n\n See Also\n --------\n splrep, splev\n Spline interpolation/smoothing based on FITPACK.\n UnivariateSpline : An object-oriented wrapper of the FITPACK routines.\n interp2d : 2-D interpolation\n\n Notes\n -----\n Calling `interp1d` with NaNs present in input values results in\n undefined behaviour.\n\n Input values `x` and `y` must be convertible to `float` values like\n `int` or `float`.\n\n If the values in `x` are not unique, the resulting behavior is\n undefined and specific to the choice of `kind`, i.e., changing\n `kind` will change the behavior for duplicates.\n\n\n Examples\n --------\n >>> import numpy as np\n >>> import matplotlib.pyplot as plt\n >>> from scipy import interpolate\n >>> x = np.arange(0, 10)\n >>> y = np.exp(-x/3.0)\n >>> f = interpolate.interp1d(x, y)\n\n >>> xnew = np.arange(0, 9, 0.1)\n >>> ynew = f(xnew) # use interpolation function returned by `interp1d`\n >>> plt.plot(x, y, \'o\', xnew, ynew, \'-\')\n >>> plt.show()\n '¶
- __init__(x, y, kind='linear', axis=-1, copy=True, bounds_error=None, fill_value=nan, assume_sorted=False)[source]¶
Initialize a 1-D linear interpolation class.
- __module__ = 'scipy.interpolate._interpolate'¶
- __weakref__¶
list of weak references to the object (if defined)
- _check_bounds(x_new)[source]¶
Check the inputs for being in the bounds of the interpolated data.
- Parameters:
- x_newarray
- Returns:
- out_of_boundsbool array
The mask on x_new of values that are out of the bounds.
- _y_axis¶
- _y_extra_shape¶
- dtype¶
- property fill_value¶
The fill value.
- tidal_hour_signal(ts, filter=True)[source]¶
Compute the tidal hour of a semidiurnal signal.
This function returns the instantaneous phase-based tidal hour for a time series, assuming a semidiurnal signal. Optionally applies a cosine Lanczos low-pass filter (e.g., 40h) to isolate tidal components from subtidal or noisy fluctuations.
The tidal hour is computed using the phase of the analytic signal obtained via the Hilbert transform. This phase is then scaled to range from 0 to 12 hours to represent one semidiurnal tidal cycle. The output is a pandas Series aligned with the input time index.
- Parameters:
- tspandas.Series
Time series of water level or other semidiurnal signal. Must have a datetime index.
- filterbool, default True
Whether to apply a 40-hour cosine Lanczos filter to the input signal. If False, uses the raw signal.
- Returns:
- pandas.Series
Tidal hour as a float (range [0, 12)), indexed by datetime.
See also
diff_h
Compute the derivative (rate of change) of tidal hour.
cosine_lanczos
External function used to apply low-pass filtering.
Notes
The tidal hour is derived from the instantaneous phase of the analytic signal. This signal is computed as:
analytic_signal = ts + 1j * hilbert(ts)
The phase (angle) of this complex signal varies smoothly over time and reflects the oscillatory nature of the tide, allowing us to construct a continuous representation of “tidal time” even between extrema.
The use of the Hilbert transform provides a smooth interpolation of the signal’s phase progression, since it yields the narrow-band envelope and instantaneous phase of the dominant frequency component (assumed to be semidiurnal here).
- tidal_hour_signal2(ts: Series | DataFrame, filter: bool = True) Series | DataFrame [source]¶
Calculate the tidal hour from a semidiurnal tidal signal.
- Parameters:
- tspd.Series or pd.DataFrame
Input time series of water levels (must have datetime index)
- filterbool, optional
If True, apply Lanczos filter to remove low-frequency components (default True) Note this is opposite of ‘leave_mean’ in original implementation
- Returns:
- pd.Series or pd.DataFrame
Tidal hour in datetime format (same shape as input)
Notes
The tidal hour represents the phase of the semidiurnal tide in temporal units. The calculation uses complex interpolation for smooth phase estimation: 1. The Hilbert transform creates an analytic signal 2. The angle gives the instantaneous phase 3. Complex interpolation avoids phase jumps at 0/2π boundaries 4. This provides continuous phase evolution even during slack tides
If h/u distinction is needed, consider applying diff_h to separate flood/ebb phases. The derivative was likely included in original code to identify phase reversals during tidal current analysis.
vtools.functions.transition module¶
- class PchipInterpolator(x, y, axis=0, extrapolate=None)[source]¶
Bases:
CubicHermiteSpline
PCHIP 1-D monotonic cubic interpolation.
x
andy
are arrays of values used to approximate some function f, withy = f(x)
. The interpolant uses monotonic cubic splines to find the value of new points. (PCHIP stands for Piecewise Cubic Hermite Interpolating Polynomial).- Parameters:
- xndarray, shape (npoints, )
A 1-D array of monotonically increasing real values.
x
cannot include duplicate values (otherwise f is overspecified)- yndarray, shape (…, npoints, …)
A N-D array of real values.
y
’s length along the interpolation axis must be equal to the length ofx
. Use theaxis
parameter to select the interpolation axis.- axisint, optional
Axis in the
y
array corresponding to the x-coordinate values. Defaults toaxis=0
.- extrapolatebool, optional
Whether to extrapolate to out-of-bounds points based on first and last intervals, or to return NaNs.
See also
CubicHermiteSpline
Piecewise-cubic interpolator.
Akima1DInterpolator
Akima 1D interpolator.
CubicSpline
Cubic spline data interpolator.
PPoly
Piecewise polynomial in terms of coefficients and breakpoints.
Notes
The interpolator preserves monotonicity in the interpolation data and does not overshoot if the data is not smooth.
The first derivatives are guaranteed to be continuous, but the second derivatives may jump at \(x_k\).
Determines the derivatives at the points \(x_k\), \(f'_k\), by using PCHIP algorithm [1].
Let \(h_k = x_{k+1} - x_k\), and \(d_k = (y_{k+1} - y_k) / h_k\) are the slopes at internal points \(x_k\). If the signs of \(d_k\) and \(d_{k-1}\) are different or either of them equals zero, then \(f'_k = 0\). Otherwise, it is given by the weighted harmonic mean
\[\frac{w_1 + w_2}{f'_k} = \frac{w_1}{d_{k-1}} + \frac{w_2}{d_k}\]where \(w_1 = 2 h_k + h_{k-1}\) and \(w_2 = h_k + 2 h_{k-1}\).
The end slopes are set using a one-sided scheme [2].
References
[1]F. N. Fritsch and J. Butland, A method for constructing local monotone piecewise cubic interpolants, SIAM J. Sci. Comput., 5(2), 300-304 (1984). :doi:`10.1137/0905021`.
[2]see, e.g., C. Moler, Numerical Computing with Matlab, 2004. :doi:`10.1137/1.9780898717952`
- Attributes:
- axis
- c
- extrapolate
- x
Methods
__call__
(x[, nu, extrapolate])Evaluate the piecewise polynomial or its derivative.
derivative
([nu])Construct a new piecewise polynomial representing the derivative.
antiderivative
([nu])Construct a new piecewise polynomial representing the antiderivative.
roots
([discontinuity, extrapolate])Find real roots of the piecewise polynomial.
- __annotations__ = {}¶
- __doc__ = "PCHIP 1-D monotonic cubic interpolation.\n\n ``x`` and ``y`` are arrays of values used to approximate some function f,\n with ``y = f(x)``. The interpolant uses monotonic cubic splines\n to find the value of new points. (PCHIP stands for Piecewise Cubic\n Hermite Interpolating Polynomial).\n\n Parameters\n ----------\n x : ndarray, shape (npoints, )\n A 1-D array of monotonically increasing real values. ``x`` cannot\n include duplicate values (otherwise f is overspecified)\n y : ndarray, shape (..., npoints, ...)\n A N-D array of real values. ``y``'s length along the interpolation\n axis must be equal to the length of ``x``. Use the ``axis``\n parameter to select the interpolation axis.\n axis : int, optional\n Axis in the ``y`` array corresponding to the x-coordinate values. Defaults\n to ``axis=0``.\n extrapolate : bool, optional\n Whether to extrapolate to out-of-bounds points based on first\n and last intervals, or to return NaNs.\n\n Methods\n -------\n __call__\n derivative\n antiderivative\n roots\n\n See Also\n --------\n CubicHermiteSpline : Piecewise-cubic interpolator.\n Akima1DInterpolator : Akima 1D interpolator.\n CubicSpline : Cubic spline data interpolator.\n PPoly : Piecewise polynomial in terms of coefficients and breakpoints.\n\n Notes\n -----\n The interpolator preserves monotonicity in the interpolation data and does\n not overshoot if the data is not smooth.\n\n The first derivatives are guaranteed to be continuous, but the second\n derivatives may jump at :math:`x_k`.\n\n Determines the derivatives at the points :math:`x_k`, :math:`f'_k`,\n by using PCHIP algorithm [1]_.\n\n Let :math:`h_k = x_{k+1} - x_k`, and :math:`d_k = (y_{k+1} - y_k) / h_k`\n are the slopes at internal points :math:`x_k`.\n If the signs of :math:`d_k` and :math:`d_{k-1}` are different or either of\n them equals zero, then :math:`f'_k = 0`. Otherwise, it is given by the\n weighted harmonic mean\n\n .. math::\n\n \\frac{w_1 + w_2}{f'_k} = \\frac{w_1}{d_{k-1}} + \\frac{w_2}{d_k}\n\n where :math:`w_1 = 2 h_k + h_{k-1}` and :math:`w_2 = h_k + 2 h_{k-1}`.\n\n The end slopes are set using a one-sided scheme [2]_.\n\n\n References\n ----------\n .. [1] F. N. Fritsch and J. Butland,\n A method for constructing local\n monotone piecewise cubic interpolants,\n SIAM J. Sci. Comput., 5(2), 300-304 (1984).\n :doi:`10.1137/0905021`.\n .. [2] see, e.g., C. Moler, Numerical Computing with Matlab, 2004.\n :doi:`10.1137/1.9780898717952`\n\n "¶
- __module__ = 'scipy.interpolate._cubic'¶
- axis¶
- c¶
- extrapolate¶
- x¶
- transition_ts(ts0, ts1, method='linear', create_gap=None, overlap=(0, 0), return_type='series')[source]¶
Create a smooth transition between two aligned time series.
- Parameters:
- ts0pandas.Series or pandas.DataFrame
The initial time series segment. Must share the same frequency and type as ts1.
- ts1pandas.Series or pandas.DataFrame
The final time series segment. Must share the same frequency and type as ts0.
- method{“linear”, “pchip”}, default=”linear”
The interpolation method to use for generating the transition.
- create_gaplist of str or pandas.Timestamp, optional
A list of two elements [start, end] defining the gap over which to transition. If not specified, a natural gap between ts0 and ts1 will be inferred. If ts0 and ts1 overlap, create_gap is required.
- overlaptuple of int or str, default=(0, 0)
Amount of overlap to use for interpolation anchoring in pchip mode. Each entry can be: - An integer: number of data points before/after to use. - A pandas-compatible frequency string: e.g., “2h” or “45min”.
- return_type{“series”, “glue”}, default=”series”
“series”: returns the full merged series including ts0, transition, ts1.
“glue”: returns only the interpolated transition segment.
- Returns:
- pandas.Series or pandas.DataFrame
The resulting time series segment, either the full merged series or just the transition zone.
- Raises:
- ValueError
If ts0 and ts1 have mismatched types or frequencies, or if overlap exists but create_gap is not specified.
vtools.functions.unit_conversions module¶
- ec_psu_25c(ts_ec, hill_correction=True)[source]¶
Convert scalar or vector ec to psu assuming 25C temperature and no low salinity correction Note: The input argument, ts_ec, will be altered.
- m_to_ft(x)[source]¶
Convert meter to foot WARNING: The function modifies the input argument if it is TimeSeries
- psu_ec_25c(psu, refine=True, hill_correction=True)[source]¶
Conversion from psu to ec. If used without refinement, this is the Schemel conversion, otherwise uses root-finding Note that the round trip ec-psu-ec tends to wander a lot without refine = True. The refinement usually takes 4-6 iterations
- psu_ec_25c_scalar(psu, refine=True, hill_correction=True)[source]¶
Conversion from psu to ec for a scalar. If used without refinement, this is the Schemel conversion, otherwise uses root-finding Note that the round trip ec-psu-ec tends to wander a lot without refine = True. The refinement usually takes 4-6 iterations