tidyms.peaks

Functions and objects used to detect peaks.

Objects

  • Peak : Stores peak location and extension. Computes peak parameters.

Functions

  • estimate_noise(x) : Estimates noise level in a 1D signal

  • estimate_baseline(x, noise) : Estimates the baseline in a 1D signal

  • detect_peaks(x, noise, baseline) : Detects peaks in a 1D signal

  • find_centroids(mz, spint, min_snr, min_distance) : Computes the centroid and area of peaks in a 1D signal.

detect_peaks(x: ndarray, noise: ndarray, baseline: ndarray, find_peaks_params: dict | None = None) -> Tuple[ndarray, ndarray, ndarray]

Finds peaks in a 1D signal.

Parameters:
x: array

Signal with peaks.

noise: array with the same size as x

Noise level of x.

baseline: array with the same size as x

Baseline estimation of x.

find_peaks_params: dict or None, default=None

Parameters to pass to scipy.signal.find_peaks().

Returns:
start: array

Indices where peaks start.

apex: array

Indices where the peak maximum was found.

end: array

Indices where peaks end.

See also

estimate_noise

estimates noise in a 1D signal

estimate_baseline

estimates the baseline in a 1D signal

Peak

stores peak start, apex and end indices

get_peak_descriptors

computes descriptors on a list of Peak objects

Notes

The algorithm for peak finding is as follows:

  1. Peaks are detected using scipy.signal.find_peaks(). Peaks with a prominence lower than three times the noise or in regions classified as baseline are removed.

  2. Points from \(x\) are considered baseline if the following condition is met:

    \[|x[k] - b[k]| < e[k]\]

    where \(b\) is the baseline and \(e\) is the noise. If a detected peak is classified as baseline, it is removed.

  3. The extension of each peak is found by finding the closest baseline point to its left and right.

  4. If there are overlapping peaks (i.e. overlapping peak extensions), the extension is fixed by defining a boundary between the peaks as the minimum value between the two peaks.
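
The four steps above can be sketched with NumPy and SciPy as follows. This is only an illustration under stated assumptions, not the tidyms implementation: the name detect_peaks_sketch is hypothetical, the first and last indices of the signal are used as fallback boundaries when no baseline point exists on one side, and the returned end index is inclusive.

```python
import numpy as np
from scipy.signal import find_peaks


def detect_peaks_sketch(x, noise, baseline):
    """Illustrative version of steps 1-4; not the tidyms implementation."""
    # Step 2: a point is baseline when |x[k] - b[k]| < e[k].
    is_baseline = np.abs(x - baseline) < noise

    # Step 1: candidate apexes, keeping prominent peaks outside the baseline.
    apex, props = find_peaks(x, prominence=0)
    keep = (props["prominences"] >= 3 * noise[apex]) & ~is_baseline[apex]
    apex = apex[keep]

    # Step 3: closest baseline point on each side of every apex.
    # Assumption: the first and last indices act as fallback boundaries.
    baseline_idx = np.flatnonzero(is_baseline)
    bounds = np.unique(np.concatenate(([0], baseline_idx, [x.size - 1])))
    pos = np.searchsorted(bounds, apex)
    start, end = bounds[pos - 1], bounds[pos]

    # Step 4: split overlapping extensions at the minimum between the apexes.
    for i in range(apex.size - 1):
        if end[i] > start[i + 1]:
            boundary = apex[i] + np.argmin(x[apex[i]:apex[i + 1] + 1])
            end[i] = start[i + 1] = boundary
    return start, apex, end


# Synthetic signal: two well-separated Gaussian peaks on a flat baseline.
k = np.arange(200)
x = 10 * np.exp(-((k - 50) ** 2) / 18) + 5 * np.exp(-((k - 150) ** 2) / 18)
noise = np.full(x.size, 0.1)
baseline = np.zeros(x.size)
start, apex, end = detect_peaks_sketch(x, noise, baseline)
```

In this sketch the prominence threshold is applied after calling scipy.signal.find_peaks() so that the per-point noise level can be used; any extra keyword arguments that find_peaks_params would forward are left out for brevity.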

estimate_baseline(x: ndarray, noise: ndarray, min_proba: float = 0.05) -> ndarray

Computes the baseline of a 1D signal.

The baseline is estimated by classifying each point in the signal as either signal or baseline. The baseline is obtained by interpolation of baseline points. See [ADD LINK] for a detailed explanation of how the method works.

Parameters:
x: non-empty 1D array

noise: array

Noise estimation obtained with estimate_noise.

min_proba: number between 0 and 1, default=0.05

Returns:
baseline: array with the same size as x

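
As a rough illustration of the interpolation step only, the sketch below assumes that the signal/baseline classification is already available as a boolean mask. The names interpolate_baseline and is_baseline are hypothetical, and the classification that depends on min_proba is not reproduced here.

```python
import numpy as np


def interpolate_baseline(x, is_baseline):
    """Linearly interpolate the points flagged as baseline over the signal."""
    x = np.asarray(x, dtype=float)
    index = np.arange(x.size)
    baseline_index = index[np.asarray(is_baseline, dtype=bool)]
    # np.interp holds the first and last baseline values constant at the edges.
    return np.interp(index, baseline_index, x[baseline_index])


# A single peak (index 2) on a linearly drifting baseline.
x = np.array([0.0, 1.0, 10.0, 3.0, 4.0])
is_baseline = np.array([True, True, False, True, True])
baseline = interpolate_baseline(x, is_baseline)
```

The result has the same size as x, matching the documented return value; points flagged as baseline keep their original values.
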
estimate_noise(x: ndarray, min_slice_size: int = 200, n_slices: int = 5, robust: bool = True) -> ndarray

Estimates the noise level in a signal.

Splits x into several slices and estimates the noise in each slice, assuming that the noise within a slice is Gaussian and i.i.d. See [ADD LINK] for a detailed description of how the method works.

Parameters:
x: 1D array

min_slice_size: int, default=200

Minimum size of a slice. If the size of x is smaller than this value, the noise is estimated using the whole array.

n_slices: int, default=5

Number of slices to create. The size of each slice must be greater than min_slice_size.

robust: bool, default=True

If True, estimates the noise using the median absolute deviation. Otherwise, uses the standard deviation.

Returns:
noise: array with the same size as x
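
The sketch below shows one way the documented parameters could fit together; it is not the tidyms implementation. Two assumptions are made: the number of slices is reduced until each slice holds at least min_slice_size points, and the noise in each slice is estimated from the second finite difference of x, whose standard deviation is sqrt(6) times that of i.i.d. Gaussian noise. The name estimate_noise_sketch is hypothetical.

```python
import numpy as np


def estimate_noise_sketch(x, min_slice_size=200, n_slices=5, robust=True):
    """Per-slice noise estimate with the same size as x (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Assumption: use fewer slices rather than slices below min_slice_size.
    n_slices = max(1, min(n_slices, x.size // min_slice_size))
    noise = np.empty_like(x)
    for idx in np.array_split(np.arange(x.size), n_slices):
        # Assumption: the second difference suppresses slowly varying signal.
        d2 = np.diff(x[idx], n=2)
        if robust:
            # Median absolute deviation, scaled to match a Gaussian std.
            sigma = 1.4826 * np.median(np.abs(d2 - np.median(d2)))
        else:
            sigma = np.std(d2)
        # Var(e[k-1] - 2 e[k] + e[k+1]) = 6 * sigma**2 for i.i.d. noise.
        noise[idx] = sigma / np.sqrt(6)
    return noise


rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, 10_000)
noise = estimate_noise_sketch(x)
```

The median absolute deviation is less sensitive than the standard deviation to the few large values that real peaks introduce, which is the usual motivation for a robust option.
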
find_centroids(mz: ndarray, spint: ndarray, min_snr: float, min_distance: float) -> Tuple[ndarray, ndarray]

Finds the centroids of the peaks in a mass spectrum in profile mode.

Parameters:
mz: array

spint: array

min_snr: positive number

Minimum signal-to-noise ratio.

min_distance: positive number

Minimum m/z distance between consecutive centroids.

Returns:
centroid_mz: array

Centroid m/z of peaks.

centroid_int: array

Area of peaks.
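
To make the two return values concrete, the sketch below computes an intensity-weighted centroid and a trapezoidal area for peaks whose start and end indices are already known. It is a minimal illustration, not the tidyms implementation: centroid_and_area is a hypothetical helper, the end indices are assumed to be inclusive, and the filtering by min_snr and min_distance is not modeled.

```python
import numpy as np


def centroid_and_area(mz, spint, start, end):
    """Centroid m/z and area of each peak delimited by start/end indices."""
    centroid_mz = np.empty(len(start))
    centroid_int = np.empty(len(start))
    for k, (s, e) in enumerate(zip(start, end)):
        mz_k, sp_k = mz[s:e + 1], spint[s:e + 1]
        # Intensity-weighted mean of the m/z values in the peak.
        centroid_mz[k] = np.sum(mz_k * sp_k) / np.sum(sp_k)
        # Trapezoidal rule over the (possibly uneven) m/z grid.
        centroid_int[k] = np.sum(0.5 * (sp_k[1:] + sp_k[:-1]) * np.diff(mz_k))
    return centroid_mz, centroid_int


# One symmetric triangular peak sampled every 0.1 m/z units.
mz = np.array([100.0, 100.1, 100.2, 100.3, 100.4])
spint = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
centroid_mz, centroid_int = centroid_and_area(mz, spint, [0], [4])
```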