tidyms.chem.FormulaGenerator
- class FormulaGenerator(bounds: Dict[str, Tuple[int, int]], max_M: float | None = None)
Generates sum formulas based on exact mass values.
- Attributes:
- n_results: int
Number of valid formulas generated.
- results: dict
A mapping of nominal masses of the results to a tuple of three arrays: (1) the row index of positive coefficients, (2) the row index of negative coefficients, and (3) the number of 12C in the formula.
Methods
generate_formulas(M, tolerance[, ...]): Computes formulas compatible with the given query mass.
results_to_array(): Converts results to an array of coefficients.
from_hmdb(mass[, bounds]): Creates a FormulaGenerator using elemental bounds obtained from molecules present in the Human Metabolome Database.
FormulaGenerator constructor.
- Parameters:
- bounds: Dict
A dictionary mapping isotope strings to the lower and upper bounds of the formula coefficients. Isotope strings can be an element symbol (e.g. "C") or an isotope string representation (e.g. "13C"). In the first case, the element is converted to its most abundant isotope ("12C").
- max_M: float or None, default=None
Maximum mass value for generated formulas. If specified, it is used to update the bounds. For example, if max_M=300 and the bounds for 32S are (0, 10), then they are updated to (0, 9).
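The bound update described for max_M can be sketched as follows. This is only an illustration of the arithmetic, not the library's implementation: the helper `clip_bounds` and the `ISOTOPE_MASS` table are hypothetical names, and the masses are approximate monoisotopic values typed in by hand.

```python
import math

# Approximate monoisotopic masses in Da (illustrative values, not read from tidyms).
ISOTOPE_MASS = {"12C": 12.0, "1H": 1.007825, "16O": 15.994915, "32S": 31.972071}


def clip_bounds(bounds, max_mass):
    """Cap each upper bound at the largest atom count whose mass fits in max_mass.

    Hypothetical helper: assumes bounds are (lower, upper) with upper inclusive.
    """
    clipped = {}
    for isotope, (lower, upper) in bounds.items():
        max_count = math.floor(max_mass / ISOTOPE_MASS[isotope])
        clipped[isotope] = (lower, min(upper, max_count))
    return clipped


# 300 / 31.972071 is about 9.38, so at most 9 sulfur atoms fit below 300 Da.
new_bounds = clip_bounds({"32S": (0, 10), "12C": (0, 20)}, 300)
```

Bounds that already fit below the mass limit, such as the 12C bounds here, are left unchanged.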
Examples
>>> import tidyms as ms
>>> fg_bounds = {"C": (0, 5), "H": (0, 10), "O": (0, 4)}
>>> fg = ms.chem.FormulaGenerator(fg_bounds)
- static from_hmdb(mass: int, bounds: Dict[str, Tuple[int, int]] | None = None)
Creates a FormulaGenerator using elemental bounds obtained from molecules present in the Human Metabolome Database. By default, bounds for CHNOPS elements are included.
- Parameters:
- mass: {500, 1000, 1500, 2000}
Bounds are created using molecules with molecular mass lower than this value.
- bounds: Dict[str, Tuple[int, int]] or None, default=None
Bounds for additional isotopes to pass to the generator.
- Returns:
- FormulaGenerator
See also
get_chnops_bounds
Examples
>>> import tidyms as ms
>>> # Create a formula generator using a max mass of 500.
>>> # Also include chlorine in the bounds.
>>> fg = ms.chem.FormulaGenerator.from_hmdb(500, bounds={"Cl": (0, 2)})
- generate_formulas(M: float, tolerance: float, min_defect: float | None = None, max_defect: float | None = None)
Computes formulas compatible with the given query mass. The formulas are computed assuming neutral species. If charged species are used, mass values must be corrected using the electron mass.
Results are stored in an internal format; use results_to_array to obtain the compatible formulas.
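The electron-mass correction mentioned above can be sketched as below. This is an assumption about how a user might prepare the query mass, not part of the tidyms API: `ion_mz_to_neutral_mass` is a hypothetical helper, and it only accounts for the electrons gained or lost, not for adducts such as an added proton.

```python
ELECTRON_MASS = 0.00054858  # electron mass in Da


def ion_mz_to_neutral_mass(mz, charge):
    """Mass of the neutral species with the same elemental composition as the ion.

    Hypothetical helper. charge is a signed integer: +1 for a singly charged
    cation, -1 for a singly charged anion.
    """
    if charge == 0:
        raise ValueError("charge must be a non-zero integer")
    ion_mass = mz * abs(charge)
    # A cation lacks electrons (add their mass back); an anion carries extra ones.
    return ion_mass + charge * ELECTRON_MASS


cation_mass = ion_mz_to_neutral_mass(100.0, 1)
anion_mass = ion_mz_to_neutral_mass(100.0, -1)
```

The corrected value would then be passed as M to generate_formulas.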
- Parameters:
- M: float
Exact mass used for formula generation.
- tolerance: float
Mass tolerance used to search for compatible formulas.
- min_defect: float or None, default=None
Minimum mass defect allowed for the results. If None, all values are allowed.
- max_defect: float or None, default=None
Maximum mass defect allowed for the results. If None, all values are allowed.
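How the min_defect and max_defect limits could act as a filter is sketched below. It assumes the mass defect is defined as the exact mass minus the nominal (integer) mass; the library may use a different convention, and both function names are hypothetical.

```python
def mass_defect(exact_mass, nominal_mass):
    """Mass defect, assumed here to be exact mass minus nominal mass."""
    return exact_mass - nominal_mass


def defect_allowed(defect, min_defect=None, max_defect=None):
    """Return True if defect lies within the optional limits (None means no limit)."""
    if min_defect is not None and defect < min_defect:
        return False
    if max_defect is not None and defect > max_defect:
        return False
    return True


# Ethanol, C2H6O: exact mass about 46.041865 Da, nominal mass 46.
ethanol_defect = mass_defect(46.041865, 46)
```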
Examples
>>> import tidyms as ms
>>> fg_bounds = {"C": (0, 5), "H": (0, 10), "O": (0, 4)}
>>> fg = ms.chem.FormulaGenerator(fg_bounds)
>>> fg.generate_formulas(46.042, 0.005)
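To make the query concrete, the following brute-force sketch enumerates every coefficient combination within the same bounds and keeps those whose mass falls within the tolerance. It is a naive stand-in for illustration only and says nothing about the algorithm FormulaGenerator uses internally; `brute_force_formulas` and the `MASS` table are hypothetical, and upper bounds are assumed to be inclusive.

```python
from itertools import product

# Approximate monoisotopic masses in Da for the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def brute_force_formulas(bounds, query_mass, tolerance):
    """Return (elements, matches) where each match is a tuple of coefficients."""
    elements = list(bounds)
    ranges = [range(lower, upper + 1) for lower, upper in bounds.values()]
    matches = []
    for coefficients in product(*ranges):
        mass = sum(n * MASS[el] for n, el in zip(coefficients, elements))
        if abs(mass - query_mass) <= tolerance:
            matches.append(coefficients)
    return elements, matches


fg_bounds = {"C": (0, 5), "H": (0, 10), "O": (0, 4)}
elements, matches = brute_force_formulas(fg_bounds, 46.042, 0.005)
```

With these bounds the only match is two carbons, six hydrogens and one oxygen (C2H6O). Exhaustive enumeration scales poorly as more elements are added, which is why a dedicated generator is preferable for real searches.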
- results_to_array() -> Tuple[ndarray, List[Isotope], ndarray]
Convert results to an array of coefficients.
- Returns:
- coefficients: np.array
Formula coefficients. Each row is a formula, each column is an isotope.
- isotopes: list[Isotope]
Isotopes associated with each column of coefficients.
- M: array
Exact mass associated with each row of coefficients.
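Since each row of coefficients lines up with the isotopes list, a row can be turned into a readable formula string along these lines. The sketch uses plain lists and string labels as stand-ins for the returned array and Isotope objects; `row_to_formula` is a hypothetical helper, not a tidyms function.

```python
def row_to_formula(row, labels):
    """Join non-zero coefficients with their column labels, e.g. C2H6O."""
    parts = []
    for count, label in zip(row, labels):
        if count == 0:
            continue  # isotopes absent from the formula are skipped
        parts.append(label if count == 1 else f"{label}{count}")
    return "".join(parts)


labels = ["C", "H", "O"]            # stand-in for the isotopes list
coefficients = [[2, 6, 1], [1, 4, 0]]  # stand-in for the coefficient rows
formulas = [row_to_formula(row, labels) for row in coefficients]
```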
Examples
>>> import tidyms as ms
>>> fg_bounds = {"C": (0, 5), "H": (0, 10), "O": (0, 4)}
>>> fg = ms.chem.FormulaGenerator(fg_bounds)
>>> fg.generate_formulas(46.042, 0.005)
>>> coeff, isotopes, M = fg.results_to_array()