tidyms.lcms¶
Functions and objects for working with LC-MS data.
Objects¶
Chromatogram MSSpectrum Roi
Functions¶
make_chromatograms make_roi accumulate_spectra_profile accumulate_spectra_centroid get_lc_filter_peak_params get_roi_params get_find_centroid_params
- class Annotation(label: int, isotopologue_label: int, isotopologue_index: int, charge: int)¶
Contains annotation information of features.
If an annotation is not available, -1 is used.
- Attributes:
- labelint
Correspondence label of features.
- isotopologue_labelint
Groups features from the same isotopic envelope.
- isotopologue_indexint
Position of the feature in an isotopic envelope.
- chargeint
Charge state.
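The -1 convention can be illustrated with a small stand-in. This is a minimal sketch, not the tidyms class: `AnnotationSketch`, the `MISSING` constant, the default values, and `is_annotated` are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass

MISSING = -1  # sentinel described above for annotations that are not available


@dataclass
class AnnotationSketch:
    """Hypothetical stand-in for Annotation; not the tidyms class."""

    label: int = MISSING
    isotopologue_label: int = MISSING
    isotopologue_index: int = MISSING
    charge: int = MISSING

    def is_annotated(self) -> bool:
        # Assumption: a feature counts as annotated once it has a label.
        return self.label != MISSING


blank = AnnotationSketch()
monoisotopic = AnnotationSketch(
    label=12, isotopologue_label=3, isotopologue_index=0, charge=1
)
```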
- class Chromatogram(time: ndarray, spint: ndarray, index: int = 0, mode: str = 'uplc')¶
Representation of a chromatogram. Manages plotting and peak detection.
Subclassed from LCRoi.
- Attributes:
- timearray
Retention time data.
- spintarray
Intensity data.
- mode{“uplc”, “hplc”}, default=”uplc”
Analytical platform used for separation. Sets default values for peak detection.
- class Feature(roi: Roi, index: int = 0)¶
Abstract representation of a feature.
- Attributes:
- roi: Roi
- annotation: Optional[Annotation]
- index: int
- exception InvalidPeakException¶
Exception raised when invalid indices are used in the construction of Peak objects.
- class LCTrace(time: ndarray[Any, dtype[floating]], spint: ndarray[Any, dtype[floating]], mz: ndarray[Any, dtype[floating]], scan: ndarray[Any, dtype[integer]], index: int = 0, mode: str | None = 'uplc', noise: ndarray | None = None, baseline: ndarray | None = None)¶
m/z traces where chromatographic peaks may be found. m/z information is stored alongside time and intensity information.
Subclassed from Roi. Used for feature detection in LCMS data.
- Attributes:
- timearray
time in each scan.
- spintarray
intensity in each scan.
- mzarray
m/z in each scan.
- scanarray
scan numbers where the ROI is defined.
- mode{“uplc”, “hplc”}
Analytical platform used for separation. Sets default values for peak detection.
- extract_features(smoothing_strength: float | None = 1.0, store_smoothed: bool = False, **kwargs) list['Peak']¶
Detect chromatographic peaks.
Peaks are stored in the features attribute.
- Parameters:
- smoothing_strengthfloat or None, default=1.0
Scale of a Gaussian function used to smooth the signal. If None, no smoothing is applied.
- store_smoothedbool, default=False
If True, replaces the original data with the smoothed version.
- **kwargs
Parameters to pass to
tidyms.peaks.detect_peaks().
See also
tidyms.peaks.estimate_noisenoise estimation of 1D signals
tidyms.peaks.estimate_baselinebaseline estimation of 1D signals
tidyms.peaks.detect_peakspeak detection of 1D signals.
Notes
Peak detection is done in four steps:
Estimate the noise level.
Apply a gaussian smoothing to the chromatogram.
Estimate the baseline.
Detect peaks in the chromatogram.
A complete description can be found here.
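The steps listed above can be sketched in plain Python. This is a deliberately crude illustration and not the tidyms implementation: the noise estimator, the flat median baseline, the local-maximum rule, and the names `gaussian_smooth`, `detect_peaks_sketch`, and `min_snr` are all assumptions made for this example.

```python
import math
import statistics


def gaussian_smooth(y, sigma):
    """Convolve y with a Gaussian kernel, renormalized where it is cut off at the edges."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    out = []
    for i in range(len(y)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(y):
                num += w * y[j]
                den += w
        out.append(num / den)
    return out


def detect_peaks_sketch(y, smoothing_strength=1.0, min_snr=3.0):
    # 1. Noise level: spread of point-to-point differences (crude assumption).
    diffs = [b - a for a, b in zip(y, y[1:])]
    noise = statistics.pstdev(diffs) / math.sqrt(2)
    # 2. Gaussian smoothing, skipped when smoothing_strength is None.
    ys = gaussian_smooth(y, smoothing_strength) if smoothing_strength else list(y)
    # 3. Baseline: a flat line at the median (crude assumption).
    baseline = statistics.median(ys)
    # 4. Peaks: local maxima that rise above the baseline by min_snr * noise.
    apexes = []
    for i in range(1, len(ys) - 1):
        is_local_max = ys[i - 1] < ys[i] >= ys[i + 1]
        if is_local_max and ys[i] - baseline > min_snr * noise:
            apexes.append(i)
    return apexes


# Synthetic chromatogram: one Gaussian peak centered at index 20.
signal = [100 * math.exp(-0.5 * ((i - 20) / 2) ** 2) for i in range(41)]
apexes = detect_peaks_sketch(signal)
```

The order matters in this sketch: the noise is estimated on the raw signal, before smoothing reduces it, while the baseline and the peaks are computed on the smoothed signal.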
- plot(figure: figure | None = None, show: bool = True) figure¶
Plot the ROI.
- Parameters:
- figurebokeh.plotting.figure or None, default=None
Figure to add the plot. If None, a new figure is created.
- showbool, default=True
If True, calls bokeh.plotting.show on the figure.
- Returns:
- bokeh.plotting.figure
- class MSSpectrum(mz: ndarray, spint: ndarray, time: float | None = None, ms_level: int = 1, polarity: int | None = None, instrument: str = 'qtof', is_centroid: bool = True)¶
Representation of a Mass Spectrum. Manages conversion to centroid and plotting of data.
- Attributes:
- mzarray
m/z data
- spintarray
Intensity data
- timefloat or None
Time at which the spectrum was acquired
- ms_levelint
MS level of the scan
- polarityint or None
Polarity used to acquire the data.
- instrument{“qtof”, “orbitrap”}, default=”qtof”
MS instrument type. Used to set default values in methods.
- is_centroidbool
True if the data is in centroid mode.
- find_centroids(min_snr: float = 10.0, min_distance: float | None = None) Tuple[ndarray, ndarray]¶
Find centroids in the spectrum.
- Parameters:
- min_snrpositive number, default=10.0
Minimum signal-to-noise ratio of the peaks.
- min_distancepositive number or None, default=None
Minimum distance between consecutive peaks. If None, the value is set to 0.01 if self.instrument is "qtof" or to 0.005 if self.instrument is "orbitrap".
- Returns:
- centroidarray
m/z centroids. If self.is_centroid is True, returns self.mz.
- areaarray
peak area. If self.is_centroid is True, returns self.spint.
- plot(fig_params: dict | None = None, line_params: dict | None = None, show: bool = True) figure¶
Plot the spectrum using Bokeh.
- Parameters:
- fig_paramsdict or None, default=None
key-value parameters to pass to bokeh.plotting.figure.
- line_paramsdict or None, default=None
key-value parameters to pass to bokeh.plotting.figure.line.
- showbool, default=True
If True, calls bokeh.plotting.show on the figure.
- Returns:
- bokeh.plotting.figure
- class MZTrace(time: ndarray[Any, dtype[floating]], spint: ndarray[Any, dtype[floating]], mz: ndarray[Any, dtype[floating]], scan: ndarray[Any, dtype[integer]], index: int = 0, mode: str | None = None, noise: ndarray | None = None, baseline: ndarray | None = None)¶
ROI Implementation using MZ traces.
MZ traces are 1D traces containing time, intensity and m/z associated with each scan.
- Attributes:
- timearray
time in each scan.
- spintarray
intensity in each scan.
- mzarray
m/z in each scan.
- scanarray
scan numbers where the ROI is defined.
- mode{“uplc”, “hplc”}
Analytical platform used for separation. Sets default values for peak detection.
- featuresOptional[List[Feature]]
- fill_nan(**kwargs)¶
Fill missing values in the trace.
Missing m/z values are filled using the mean m/z of the ROI. Missing intensity values are filled using linear interpolation. Missing values on the boundaries are filled by extrapolation. Negative values are set to 0.
- Parameters:
- kwargs:
Parameters to pass to
scipy.interpolate.interp1d()
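The filling rules described above can be sketched without SciPy. This is only an illustration of the behavior, not the tidyms code: `fill_nan_sketch` is a hypothetical function, it works on plain lists, it interpolates by hand instead of calling scipy.interpolate.interp1d, and it clips only the values it fills.

```python
import math


def fill_nan_sketch(time, spint, mz):
    """Fill NaNs: mean m/z, linear interpolation or extrapolation for intensity."""
    # Missing m/z values take the mean of the observed m/z values.
    observed_mz = [m for m in mz if not math.isnan(m)]
    mz_mean = sum(observed_mz) / len(observed_mz)
    mz_filled = [mz_mean if math.isnan(m) else m for m in mz]

    # Indices with observed intensity; two are needed to define a line.
    known = [i for i, s in enumerate(spint) if not math.isnan(s)]
    if len(known) < 2:
        raise ValueError("need at least two observed intensity values")

    spint_filled = []
    for i, s in enumerate(spint):
        if not math.isnan(s):
            spint_filled.append(s)
            continue
        # Use the bracketing pair, or the nearest pair on a boundary (extrapolation).
        right = next((k for k in known if k > i), None)
        left = next((k for k in reversed(known) if k < i), None)
        if left is None:
            left, right = known[0], known[1]
        elif right is None:
            left, right = known[-2], known[-1]
        slope = (spint[right] - spint[left]) / (time[right] - time[left])
        value = spint[left] + slope * (time[i] - time[left])
        # Extrapolation can drop below zero; negative values are set to 0.
        spint_filled.append(max(value, 0.0))
    return mz_filled, spint_filled


nan = float("nan")
time = [0.0, 1.0, 2.0, 3.0, 4.0]
spint = [nan, 10.0, nan, 30.0, nan]
mz = [100.0, nan, 100.2, 100.1, nan]
mz_filled, spint_filled = fill_nan_sketch(time, spint, mz)
```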
- to_string() str¶
Serializes the LCRoi into a JSON str.
- Returns:
- str
- class Peak(start: int, apex: int, end: int, roi: LCTrace, index: int = 0)¶
Representation of a chromatographic peak. Computes peak descriptors.
- Attributes:
- start: int
index where the peak begins. Must be smaller than apex
- apex: int
index where the apex of the peak is located. Must be smaller than end
- end: int
index where the peak ends. Used as slice bounds, start and end define the peak region.
- roi: LCTrace
ROI associated with the Peak.
- index: int
Unique index for features detected in a ROI.
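The ordering constraints on start, apex and end can be checked as in the following sketch. `InvalidPeakError` and `check_peak_indices` are hypothetical names used for illustration; the bounds check against the trace length is an assumption and is not stated in the attribute descriptions.

```python
class InvalidPeakError(ValueError):
    """Hypothetical stand-in for InvalidPeakException."""


def check_peak_indices(start: int, apex: int, end: int, size: int) -> slice:
    """Validate peak indices for a trace of length size and return the peak slice."""
    # start < apex < end comes from the attribute descriptions;
    # 0 <= start and end <= size are assumptions of this sketch.
    if not (0 <= start < apex < end <= size):
        raise InvalidPeakError(f"invalid peak indices: {start}, {apex}, {end}")
    # start and end act as slice bounds, so the point at end is excluded.
    return slice(start, end)


region = check_peak_indices(2, 5, 9, size=10)
```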
- static compute_isotopic_envelope(features: list['Peak']) Tuple[list[float], list[float]]¶
Computes the m/z and relative abundance for a list of features.
- describe() dict[str, float]¶
Computes peak height, area, location, width and SNR.
- Returns:
- descriptors: dict
A mapping of descriptor names to descriptor values.
- get_area() float¶
Computes the area in the region defined by the peak.
If the baseline area is greater than the peak area, the area is set to zero.
- Returns:
- areapositive number.
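A baseline-corrected area with the zero clipping described above could look like this. It is a sketch under assumptions: trapezoidal integration is assumed here, and `trapezoid` and `peak_area_sketch` are hypothetical helpers, not tidyms functions.

```python
def trapezoid(x, y):
    """Trapezoidal integral of y over x."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def peak_area_sketch(time, spint, baseline, start, end):
    t = time[start:end]
    total = trapezoid(t, spint[start:end])
    under_baseline = trapezoid(t, baseline[start:end])
    # A baseline above the signal would give a negative area; clip to zero.
    return max(0.0, total - under_baseline)


time = [0.0, 1.0, 2.0, 3.0, 4.0]
spint = [1.0, 3.0, 7.0, 3.0, 1.0]
area = peak_area_sketch(time, spint, [1.0] * 5, start=0, end=5)
area_high_baseline = peak_area_sketch(time, spint, [10.0] * 5, start=0, end=5)
```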
- get_extension() float¶
Computes the peak extension, defined as the length of the peak region.
- Returns:
- extensionpositive number
- get_height() float¶
Computes the height of the peak, defined as the difference between the value of intensity in the ROI and the baseline at the peak apex.
- Returns:
- heightnon-negative number.
If the baseline estimate is greater than the intensity at the apex, the height is set to zero.
- get_mz() float¶
Computes the weighted average m/z of the peak.
- Returns:
- mz_meanfloat
- get_mz_std() float | None¶
Computes the standard deviation of the m/z in the peak
- Returns:
- mz_stdfloat
- get_rt() float¶
Computes the peak location as the average retention time in the peak region, using spint as weights.
- Returns:
- rtfloat
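An intensity-weighted average, as used for the peak location and for the peak m/z, can be sketched as follows. `weighted_mean` is a hypothetical helper; whether the baseline is subtracted from the weights first is not stated here, so this sketch uses the raw intensities.

```python
def weighted_mean(values, weights):
    """Average of values where each point counts in proportion to its weight."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Peak region of a trace: retention time and intensity per scan.
time = [10.0, 11.0, 12.0]
spint = [1.0, 2.0, 1.0]
rt = weighted_mean(time, spint)
```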
- get_rt_end() float¶
Computes the end of the peak, in time units
- Returns:
- float
- get_rt_start() float¶
Computes the start of the peak, in time units
- Returns:
- float
- get_snr() float¶
Computes the peak signal-to-noise ratio, defined as the quotient between the peak height and the noise level at the apex.
- Returns:
- snrfloat
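Height and SNR follow directly from their definitions. The sketch below assumes per-scan baseline and noise sequences, in line with the noise and baseline constructor arguments of LCTrace; the function names are hypothetical, and a zero noise level is not handled.

```python
def peak_height_sketch(spint, baseline, apex):
    # Height above the baseline at the apex, clipped so it is never negative.
    return max(0.0, spint[apex] - baseline[apex])


def peak_snr_sketch(spint, baseline, noise, apex):
    # Quotient between the peak height and the noise level at the apex.
    return peak_height_sketch(spint, baseline, apex) / noise[apex]


spint = [2.0, 12.0, 3.0]
baseline = [2.0, 2.0, 2.0]
noise = [0.5, 0.5, 0.5]
height = peak_height_sketch(spint, baseline, apex=1)
snr = peak_snr_sketch(spint, baseline, noise, apex=1)
```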
- get_width() float¶
Computes the peak width, defined as the region that contains 95 % of the total peak area.
- Returns:
- widthpositive number.
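One way to turn the 95 % definition into a number is to locate the times at which the cumulative baseline-corrected area crosses 2.5 % and 97.5 % of the total. This is an assumption about how the region is chosen (symmetric tails), not a description of the tidyms implementation, and `peak_width_sketch` is a hypothetical name.

```python
def peak_width_sketch(time, spint, baseline, start, end, coverage=0.95):
    t = time[start:end]
    y = [max(0.0, s - b) for s, b in zip(spint[start:end], baseline[start:end])]
    # Cumulative trapezoidal area, starting at 0 on the first point.
    cum = [0.0]
    for i in range(len(t) - 1):
        cum.append(cum[-1] + (t[i + 1] - t[i]) * (y[i] + y[i + 1]) / 2)
    total = cum[-1]
    if total <= 0:
        return 0.0

    def time_at(fraction):
        target = fraction * total
        for i in range(1, len(cum)):
            if cum[i] >= target:
                # Linear interpolation inside the interval that crosses the target.
                span = cum[i] - cum[i - 1]
                w = (target - cum[i - 1]) / span if span > 0 else 0.0
                return t[i - 1] + w * (t[i] - t[i - 1])
        return t[-1]

    tail = (1 - coverage) / 2  # area fraction left out on each side
    return time_at(1 - tail) - time_at(tail)


# Flat signal over 10 time units: 95 % of the area spans 9.5 time units.
time = [float(i) for i in range(11)]
width = peak_width_sketch(time, [1.0] * 11, [0.0] * 11, start=0, end=11)
```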
- class Roi(index: int)¶
Regions of interest extracted from raw MS data.
- classmethod from_string(s: str) AnyRoi¶
Loads a ROI from a JSON string.
- abstract to_string() str¶
Serializes a ROI into a string.
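The to_string and from_string pair describes a serialization round trip. A minimal sketch with the standard json module follows; `TraceSketch` and its fields are hypothetical and the real serialized layout of a ROI is not documented here.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class TraceSketch:
    """Hypothetical ROI-like container; not the tidyms Roi class."""

    time: List[float]
    spint: List[float]
    index: int = 0

    def to_string(self) -> str:
        # Serialize all fields into a JSON object.
        return json.dumps(asdict(self))

    @classmethod
    def from_string(cls, s: str) -> "TraceSketch":
        # Rebuild the object from the JSON object produced by to_string.
        return cls(**json.loads(s))


original = TraceSketch(time=[0.0, 1.0, 2.0], spint=[5.0, 9.0, 4.0], index=3)
restored = TraceSketch.from_string(original.to_string())
```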
- get_find_centroid_params(instrument: str) dict¶
Returns default parameters for the find_centroids method based on the instrument type.
- Parameters:
- instrument{“qtof”, “orbitrap”}
- Returns:
- paramsdict
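Instrument-dependent defaults can be sketched with the values documented for MSSpectrum.find_centroids (min_snr of 10.0, min_distance of 0.01 for "qtof" and 0.005 for "orbitrap"). The function name and the keys of the returned dict are assumptions; the actual contents of the dict returned by get_find_centroid_params are not listed in this page.

```python
def find_centroid_defaults(instrument: str) -> dict:
    """Hypothetical helper returning defaults keyed by instrument type."""
    min_distance = {"qtof": 0.01, "orbitrap": 0.005}
    if instrument not in min_distance:
        raise ValueError(f"instrument must be one of {sorted(min_distance)}")
    return {"min_snr": 10.0, "min_distance": min_distance[instrument]}


qtof_params = find_centroid_defaults("qtof")
orbitrap_params = find_centroid_defaults("orbitrap")
```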