tidyms.lcms

Functions and objects for working with LC-MS data.

Objects

Chromatogram, MSSpectrum, Roi

Functions

make_chromatograms, make_roi, accumulate_spectra_profile, accumulate_spectra_centroid, get_lc_filter_peak_params, get_roi_params, get_find_centroid_params

class Annotation(label: int, isotopologue_label: int, isotopologue_index: int, charge: int)

Contains annotation information of features.

If an annotation is not available, -1 is used.

Attributes:
label: int

Correspondence label of features.

isotopologue_label: int

Groups features from the same isotopic envelope.

isotopologue_index: int

Position of the feature in an isotopic envelope.

charge: int

Charge state.
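
The -1 convention can be illustrated with a small stand-in dataclass. This is only a sketch and not the tidyms class: the name AnnotationSketch, the default values and the helper is_annotated are assumptions made for illustration.

```python
from dataclasses import dataclass

MISSING = -1  # value the documentation reserves for "annotation not available"


@dataclass
class AnnotationSketch:
    """Hypothetical stand-in mirroring the four documented fields."""

    label: int = MISSING
    isotopologue_label: int = MISSING
    isotopologue_index: int = MISSING
    charge: int = MISSING

    def is_annotated(self) -> bool:
        # Hypothetical helper: True only when every field has been assigned.
        fields = (self.label, self.isotopologue_label,
                  self.isotopologue_index, self.charge)
        return all(value != MISSING for value in fields)


empty = AnnotationSketch()
full = AnnotationSketch(label=3, isotopologue_label=0, isotopologue_index=1, charge=1)
```

Here empty reports every field as -1, while full counts as annotated.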

class Chromatogram(time: ndarray, spint: ndarray, index: int = 0, mode: str = 'uplc')

Representation of a chromatogram. Manages plotting and peak detection.

Subclassed from LCRoi.

Attributes:
time: array

Retention time data.

spint: array

Intensity data.

mode: {"uplc", "hplc"}, default="uplc"

Analytical platform used for separation. Sets default values for peak detection.

class Feature(roi: Roi, index: int = 0)

Abstract representation of a feature.

Attributes:
roi: Roi
annotation: Optional[Annotation]
index: int
exception InvalidPeakException

Exception raised when invalid indices are used in the construction of Peak objects.

class LCTrace(time: ndarray[Any, dtype[floating]], spint: ndarray[Any, dtype[floating]], mz: ndarray[Any, dtype[floating]], scan: ndarray[Any, dtype[integer]], index: int = 0, mode: str | None = 'uplc', noise: ndarray | None = None, baseline: ndarray | None = None)

m/z traces where chromatographic peaks may be found. m/z information is stored alongside time and intensity information.

Subclassed from Roi. Used for feature detection in LCMS data.

Attributes:
time: array

Time in each scan.

spint: array

Intensity in each scan.

mz: array

m/z in each scan.

scan: array

Scan numbers where the ROI is defined.

mode: {"uplc", "hplc"}

Analytical platform used for separation. Sets default values for peak detection.

extract_features(smoothing_strength: float | None = 1.0, store_smoothed: bool = False, **kwargs) list['Peak']

Detect chromatographic peaks.

Peaks are stored in the features attribute.

Parameters:
smoothing_strength: float or None, default=1.0

Scale of a Gaussian function used to smooth the signal. If None, no smoothing is applied.

store_smoothed: bool, default=False

If True, replaces the original data with the smoothed version.

**kwargs

Parameters to pass to tidyms.peaks.detect_peaks().

See also

tidyms.peaks.estimate_noise

noise estimation of 1D signals

tidyms.peaks.estimate_baseline

baseline estimation of 1D signals

tidyms.peaks.detect_peaks

peak detection of 1D signals.

Notes

Peak detection is done in four steps:

  1. Estimate the noise level.

  2. Apply Gaussian smoothing to the chromatogram.

  3. Estimate the baseline.

  4. Detect peaks in the chromatogram.

A complete description can be found here.
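
The steps listed above can be sketched with the standard library alone. The code below is not the tidyms implementation: every function is a simplified stand-in (noise from the median absolute point-to-point difference, a flat median baseline, peaks as thresholded local maxima), and the names and the min_snr threshold are assumptions made for illustration.

```python
import math
import statistics


def estimate_noise(y):
    # Step 1 (stand-in): robust scale of the point-to-point differences.
    diffs = [abs(b - a) for a, b in zip(y, y[1:])]
    return 1.4826 * statistics.median(diffs) / math.sqrt(2)


def gaussian_smooth(y, sigma):
    # Step 2: convolve with a normalized Gaussian kernel; edges are clamped.
    radius = max(1, math.ceil(4 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(kernel)
    n = len(y)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)
            acc += w * y[j]
        smoothed.append(acc / total)
    return smoothed


def estimate_baseline(y):
    # Step 3 (stand-in): a flat baseline at the median of the signal.
    level = statistics.median(y)
    return [level] * len(y)


def detect_peaks(y, baseline, noise, min_snr=10.0):
    # Step 4: keep local maxima whose height above the baseline reaches
    # min_snr * noise, then walk downhill to find the peak boundaries.
    peaks = []
    for apex in range(1, len(y) - 1):
        is_max = y[apex - 1] < y[apex] >= y[apex + 1]
        if not is_max or y[apex] - baseline[apex] < min_snr * noise:
            continue
        start = apex
        while start > 0 and y[start - 1] < y[start]:
            start -= 1
        end = apex
        while end < len(y) - 1 and y[end + 1] < y[end]:
            end += 1
        peaks.append((start, apex, end + 1))  # end is exclusive, like a slice
    return peaks


# Synthetic chromatogram: flat level, small alternating noise, one peak at 25.
signal = [
    10.0 + (0.5 if i % 2 else -0.5) + 100.0 * math.exp(-0.5 * ((i - 25) / 3.0) ** 2)
    for i in range(50)
]
noise = estimate_noise(signal)
smoothed = gaussian_smooth(signal, sigma=1.0)
baseline = estimate_baseline(smoothed)
peaks = detect_peaks(smoothed, baseline, noise)
```

Each entry of peaks is a (start, apex, end) triple of indices, matching the three indices that define a Peak.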

plot(figure: figure | None = None, show: bool = True) figure

Plot the ROI.

Parameters:
figure: bokeh.plotting.figure or None, default=None

Figure to which the plot is added. If None, a new figure is created.

show: bool, default=True

If True, calls bokeh.plotting.show on the figure.

Returns:
bokeh.plotting.figure
class MSSpectrum(mz: ndarray, spint: ndarray, time: float | None = None, ms_level: int = 1, polarity: int | None = None, instrument: str = 'qtof', is_centroid: bool = True)

Representation of a Mass Spectrum. Manages conversion to centroid and plotting of data.

Attributes:
mz: array

m/z data.

spint: array

Intensity data.

time: float or None

Time at which the spectrum was acquired.

ms_level: int

MS level of the scan.

polarity: int or None

Polarity used to acquire the data.

instrument: {"qtof", "orbitrap"}, default="qtof"

MS instrument type. Used to set default values in methods.

is_centroid: bool

True if the data is in centroid mode.

find_centroids(min_snr: float = 10.0, min_distance: float | None = None) Tuple[ndarray, ndarray]

Find centroids in the spectrum.

Parameters:
min_snr: positive number, default=10.0

Minimum signal-to-noise ratio of the peaks.

min_distance: positive number or None, default=None

Minimum distance between consecutive peaks. If None, the value is set to 0.01 if self.instrument is "qtof" or to 0.005 if self.instrument is "orbitrap".

Returns:
centroid: array

m/z centroids. If self.is_centroid is True, returns self.mz.

area: array

Peak area. If self.is_centroid is True, returns self.spint.
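
The way min_distance falls back to an instrument-specific default can be sketched as follows. The helper name resolve_min_distance and the ValueError for unknown instruments are assumptions; only the two default values come from the parameter description.

```python
# Defaults quoted in the min_distance description.
DEFAULT_MIN_DISTANCE = {"qtof": 0.01, "orbitrap": 0.005}


def resolve_min_distance(instrument, min_distance=None):
    """Hypothetical helper: pick the min_distance value to work with."""
    if min_distance is not None:
        return min_distance  # an explicit value always wins
    if instrument not in DEFAULT_MIN_DISTANCE:
        # Assumption: unknown instruments are rejected rather than guessed.
        raise ValueError(f"unknown instrument: {instrument!r}")
    return DEFAULT_MIN_DISTANCE[instrument]


qtof_default = resolve_min_distance("qtof")
orbitrap_default = resolve_min_distance("orbitrap")
explicit = resolve_min_distance("qtof", min_distance=0.02)
```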

plot(fig_params: dict | None = None, line_params: dict | None = None, show: bool = True) figure

Plot the spectrum using Bokeh.

Parameters:
fig_params: dict or None, default=None

Key-value parameters to pass to bokeh.plotting.figure.

line_params: dict or None, default=None

Key-value parameters to pass to bokeh.plotting.figure.line.

show: bool, default=True

If True, calls bokeh.plotting.show on the figure.

Returns:
bokeh.plotting.figure
class MZTrace(time: ndarray[Any, dtype[floating]], spint: ndarray[Any, dtype[floating]], mz: ndarray[Any, dtype[floating]], scan: ndarray[Any, dtype[integer]], index: int = 0, mode: str | None = None, noise: ndarray | None = None, baseline: ndarray | None = None)

ROI implementation using MZ traces.

MZ traces are 1D traces containing time, intensity and m/z associated with each scan.

Attributes:
time: array

Time in each scan.

spint: array

Intensity in each scan.

mz: array

m/z in each scan.

scan: array

Scan numbers where the ROI is defined.

mode: {"uplc", "hplc"}

Analytical platform used for separation. Sets default values for peak detection.

features: Optional[List[Feature]]
fill_nan(**kwargs)

Fill missing values in the trace.

Missing m/z values are filled using the mean m/z of the ROI. Missing intensity values are filled using linear interpolation. Missing values on the boundaries are filled by extrapolation. Negative values are set to 0.

Parameters:
kwargs:

Parameters to pass to scipy.interpolate.interp1d()
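
The filling rules described above can be sketched in plain Python. This is not the tidyms code, which delegates to scipy.interpolate.interp1d; the helpers interpolate and fill_missing are hypothetical, and the sketch assumes at least two known intensity values.

```python
import math


def interpolate(x, xp, fp):
    # Linear interpolation through (xp, fp); points outside the known range
    # are extrapolated along the nearest end segment.
    if x <= xp[0]:
        i = 0
    elif x >= xp[-1]:
        i = len(xp) - 2
    else:
        i = max(k for k in range(len(xp) - 1) if xp[k] <= x)
    slope = (fp[i + 1] - fp[i]) / (xp[i + 1] - xp[i])
    return fp[i] + slope * (x - xp[i])


def fill_missing(time, spint, mz):
    # Intensity: interpolate or extrapolate NaN entries, then clip at zero.
    known = [i for i, v in enumerate(spint) if not math.isnan(v)]
    xp = [time[i] for i in known]
    fp = [spint[i] for i in known]
    filled = [interpolate(t, xp, fp) if math.isnan(v) else v
              for t, v in zip(time, spint)]
    filled_spint = [max(0.0, v) for v in filled]
    # m/z: replace NaN entries with the mean of the observed values.
    valid_mz = [v for v in mz if not math.isnan(v)]
    mean_mz = sum(valid_mz) / len(valid_mz)
    filled_mz = [mean_mz if math.isnan(v) else v for v in mz]
    return filled_spint, filled_mz


nan = float("nan")
time = [0.0, 1.0, 2.0, 3.0, 4.0]
spint = [nan, 1.0, nan, 5.0, nan]
mz = [nan, 100.0, nan, 100.2, nan]
filled_spint, filled_mz = fill_missing(time, spint, mz)
```

In this example the left boundary extrapolates to -1 and is clipped to 0, the interior gap becomes 3, and the right boundary extrapolates to 7.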

to_string() str

Serializes the LCRoi into a JSON str.

Returns:
str
class Peak(start: int, apex: int, end: int, roi: LCTrace, index: int = 0)

Representation of a chromatographic peak. Computes peak descriptors.

Attributes:
start: int

Index where the peak begins. Must be smaller than apex.

apex: int

Index where the apex of the peak is located. Must be smaller than end.

end: int

Index where the peak ends. Used together as a slice, start and end define the peak region.

roi: LCTrace

ROI associated with the Peak.

index: int

Unique index for features detected in a ROI.
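
The ordering constraint on start, apex and end can be checked with a short sketch. InvalidPeakError is a hypothetical stand-in for InvalidPeakException, and the bounds check against the array length is an assumption that follows from end being used as an exclusive slice bound.

```python
class InvalidPeakError(ValueError):
    """Hypothetical stand-in for InvalidPeakException."""


def peak_slice(start, apex, end, size):
    # Documented ordering: start < apex < end.
    # Assumption: indices must also fit an array of length `size`,
    # with `end` exclusive because start and end are used as a slice.
    if not (0 <= start < apex < end <= size):
        raise InvalidPeakError(f"invalid peak indices: {start}, {apex}, {end}")
    return slice(start, end)


spint = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0]
region = spint[peak_slice(1, 3, 6, len(spint))]
```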

static compute_isotopic_envelope(features: list['Peak']) Tuple[list[float], list[float]]

Computes the m/z and relative abundance for a list of features.

describe() dict[str, float]

Computes peak height, area, location, width and SNR.

Returns:
descriptors: dict

A mapping of descriptor names to descriptor values.

get_area() float

Computes the area in the region defined by the peak.

If the baseline area is greater than the peak area, the area is set to zero.

Returns:
area: non-negative number.
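
A baseline-corrected area with the zero floor described above might look like the sketch below. Trapezoidal integration is an assumption, since the integration rule is not stated here, and peak_area is a hypothetical helper.

```python
def trapezoid(y, x):
    # Trapezoidal rule over possibly unevenly spaced samples.
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def peak_area(time, spint, baseline, start, end):
    t = time[start:end]
    total = trapezoid(spint[start:end], t)
    base = trapezoid(baseline[start:end], t)
    # If the baseline area exceeds the signal area, report zero.
    return max(0.0, total - base)


time = [0.0, 1.0, 2.0, 3.0, 4.0]
spint = [1.0, 3.0, 5.0, 3.0, 1.0]
area = peak_area(time, spint, [1.0] * 5, 0, 5)
area_high_baseline = peak_area(time, spint, [6.0] * 5, 0, 5)
```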
get_extension() float

Computes the peak extension, defined as the length of the peak region.

Returns:
extension: positive number
get_height() float

Computes the height of the peak, defined as the difference between the value of intensity in the ROI and the baseline at the peak apex.

Returns:
height: non-negative number. If the baseline estimate is greater than the intensity at the apex, the height is set to zero.
get_mz() float

Computes the weighted average m/z of the peak.

Returns:
mz_mean: float
get_mz_std() float | None

Computes the standard deviation of the m/z in the peak.

Returns:
mz_std: float or None
get_rt() float

Finds the peak location on the ROI time axis, using spint as weights.

Returns:
rt: float
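
Both get_mz and get_rt are described as intensity-weighted averages over the peak region. A minimal sketch, assuming raw intensities are used as weights without baseline subtraction (the helper weighted_mean is hypothetical):

```python
def weighted_mean(values, weights):
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


time = [10.0, 11.0, 12.0, 13.0, 14.0]
mz = [100.1, 100.0, 100.2, 100.4, 100.3]
spint = [0.0, 1.0, 2.0, 1.0, 0.0]
start, end = 1, 4  # peak region, used as a slice

rt = weighted_mean(time[start:end], spint[start:end])
mz_mean = weighted_mean(mz[start:end], spint[start:end])
```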
get_rt_end() float

Computes the end of the peak, in time units.

Returns:
float
get_rt_start() float

Computes the start of the peak, in time units.

Returns:
float
get_snr() float

Computes the peak signal-to-noise ratio, defined as the quotient between the peak height and the noise level at the apex.

Returns:
snr: float
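
The definitions of height and SNR given for get_height and get_snr translate directly into a few lines. The helper names are hypothetical, and the handling of a zero noise level is an assumption that is not specified here.

```python
import math


def peak_height(spint, baseline, apex):
    # Height above the baseline at the apex, floored at zero.
    return max(0.0, spint[apex] - baseline[apex])


def peak_snr(spint, baseline, noise, apex):
    height = peak_height(spint, baseline, apex)
    if noise[apex] == 0:
        # Assumption: avoid a division by zero when the noise level is zero.
        return math.inf if height > 0 else 0.0
    return height / noise[apex]


spint = [1.0, 5.0, 21.0, 5.0, 1.0]
baseline = [1.0] * 5
noise = [2.0] * 5
height = peak_height(spint, baseline, apex=2)
snr = peak_snr(spint, baseline, noise, apex=2)
```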
get_width() float

Computes the peak width, defined as the length of the region that contains 95 % of the total peak area.

Returns:
width: positive number.
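
One way to read the 95 % definition is as the time span between the 2.5 % and 97.5 % points of the cumulative peak area. The sketch below follows that reading; the symmetric split of the excluded 5 %, the trapezoidal cumulative area and the helper names are all assumptions, and the input is taken to be baseline-corrected and non-negative.

```python
def cumulative_area(time, y):
    # Running trapezoidal area, starting at zero.
    cum = [0.0]
    for i in range(len(time) - 1):
        cum.append(cum[-1] + 0.5 * (y[i] + y[i + 1]) * (time[i + 1] - time[i]))
    return cum


def time_at_fraction(time, frac, q):
    # Linearly interpolate the time at which the area fraction reaches q.
    for i in range(len(frac) - 1):
        lo, hi = frac[i], frac[i + 1]
        if hi >= q and hi > lo:
            return time[i] + (q - lo) / (hi - lo) * (time[i + 1] - time[i])
    return time[-1]


def peak_width(time, y, coverage=0.95):
    cum = cumulative_area(time, y)
    total = cum[-1]
    if total <= 0:
        return 0.0
    frac = [c / total for c in cum]
    tail = (1.0 - coverage) / 2.0
    return time_at_fraction(time, frac, 1.0 - tail) - time_at_fraction(time, frac, tail)


time = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 2.0, 1.0, 0.0]
width = peak_width(time, y)
```

For this triangular peak the 2.5 % and 97.5 % points fall at 0.2 and 3.8 time units, so the width is about 3.6, slightly less than the full extension of 4.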
class Roi(index: int)

Regions of interest extracted from raw MS data.

classmethod from_string(s: str) AnyRoi

Loads a ROI from a JSON string.

abstract to_string() str

Serializes a ROI into a string.
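
The to_string and from_string pair amounts to a JSON round trip. The sketch below shows the idea with a hypothetical TraceSketch class; the field names and the JSON layout are assumptions and do not describe the format tidyms actually writes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TraceSketch:
    """Hypothetical ROI-like container with JSON (de)serialization."""

    index: int
    time: list
    spint: list

    def to_string(self) -> str:
        # Serialize all fields into a JSON object.
        return json.dumps(asdict(self))

    @classmethod
    def from_string(cls, s: str) -> "TraceSketch":
        # Rebuild the object from the JSON object produced by to_string.
        return cls(**json.loads(s))


original = TraceSketch(index=0, time=[0.0, 1.0, 2.0], spint=[5.0, 20.0, 5.0])
restored = TraceSketch.from_string(original.to_string())
```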

get_find_centroid_params(instrument: str) dict

Returns default parameters for the MSSpectrum.find_centroids method, based on the instrument type.

Parameters:
instrument: {"qtof", "orbitrap"}
Returns:
params: dict