interferences.table.build
interferences.table.build.build_table(elements=None, max_atoms=3, sortby=['m_z', 'charge', 'mass'], charges=[1, 2], add_labels=False, threshold=None, window=None, cache_results=True)
Build the interferences table.
Parameters:
- elements (list) – List of elements to include in the table.
- max_atoms (int) – Largest size of molecule to build, in atoms.
- sortby (str|list) – Column or list of columns to sort the final table by.
- charges (list(int)) – Ionic charges to include in the model.
- add_labels (bool) – Whether to produce nicely formatted molecule names. This takes additional computation time.
- threshold (float) – Isotopic abundance threshold for including low-abundance or non-stable isotopes.
- window (tuple) – Window of interest used to filter out irrelevant entries (here a mass window, which translates directly to an m/z window for z=1).
- cache_results (bool) – Whether to store the results on disk for
Todo
Consider options for parallelizing this to reduce build time. This would allow larger molecules to be included.
Invalid molecules (e.g. H{2+}) will currently be present, but will ideally be filtered out.
In some cases, mass peaks will be duplicated, and we want to keep the simplest version (e.g. Ar[40]+ and Ar[40]2{2+}). We here remove duplicate mass peaks before sorting (i.e. take the first one, as higher charges would be penalised), but we could potentially add a check that both contain the same isotopic components for verification (this would be slow).
While “m/z” would be an appropriate column name, it can’t be used in HDF indexes.
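The duplicate handling described in the notes above can be illustrated with a minimal sketch. It is not the package's own code: plain dictionaries stand in for the dataframe, the `drop_duplicate_peaks` helper is hypothetical, and the m/z values are rounded, illustrative numbers.

```python
def drop_duplicate_peaks(peaks, decimals=4):
    """Keep one entry per (rounded) m/z, preferring the lowest charge."""
    # Sort by charge first so the simplest ion is seen first at each m/z.
    ordered = sorted(peaks, key=lambda p: p["charge"])
    seen = set()
    kept = []
    for peak in ordered:
        key = round(peak["m_z"], decimals)
        if key not in seen:
            seen.add(key)
            kept.append(peak)
    # Final ordering by m/z, then charge (assumed to mirror the default sortby).
    return sorted(kept, key=lambda p: (p["m_z"], p["charge"]))


# Illustrative records only; Ar[40]+ and Ar[40]2{2+} share the same m/z.
peaks = [
    {"label": "Ar[40]2{2+}", "m_z": 39.9618, "charge": 2},
    {"label": "Ar[40]+", "m_z": 39.9618, "charge": 1},
    {"label": "Ca[44]+", "m_z": 43.9549, "charge": 1},
]
unique = drop_duplicate_peaks(peaks)
```

Here the doubly charged dimer is dropped in favour of the singly charged ion. Rounding m/z before comparing is a choice made for this sketch, not something the documentation specifies.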
interferences.table.combinations
Functions for calculating combinations (in the combinatorics sense) of elements and isotopes into isotope-specified molecular ions.
interferences.table.combinations.get_elemental_combinations(elements, max_atoms=3)
Combine a list of elements into lists of molecular combinations up to a maximum number of atoms per molecule. Successively smaller molecules are added, down to single atoms.
Parameters:
Todo
Check that isotopes supplied to this function are propagated.
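As a rough illustration of the idea, element combinations of decreasing size can be generated with the standard library. This is a sketch under assumptions, not the package's implementation: the `elemental_combinations` name, the tuple representation and the output ordering are all hypothetical.

```python
from itertools import combinations_with_replacement


def elemental_combinations(elements, max_atoms=3):
    """List unordered element combinations from max_atoms down to 1 atom."""
    combos = []
    # Start with the largest molecules and work down to single atoms.
    for n_atoms in range(max_atoms, 0, -1):
        # "With replacement" allows repeated elements, e.g. ("Ar", "Ar").
        combos.extend(combinations_with_replacement(elements, n_atoms))
    return combos


combos = elemental_combinations(["Ar", "O"], max_atoms=2)
```

For two elements and `max_atoms=2` this gives three diatomic combinations followed by the two single atoms. The count grows quickly with `max_atoms`, which is consistent with the build-time concern noted for `build_table`.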
interferences.table.combinations.get_isotopic_combinations(element_comb, threshold=None)
Take a combination of elements and expand it to generate the potential combinations of isotopes.
Parameters: Returns: Return type:
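The expansion from elements to isotopes can be sketched as a Cartesian product over each element's isotopes. Everything named here is an assumption for illustration: the `ISOTOPES` lookup, the `(element, mass number)` tuples and the `expand_isotopes` helper are hypothetical, and abundance thresholding is left out.

```python
from itertools import product

# Hypothetical lookup: element symbol -> mass numbers of its isotopes.
ISOTOPES = {
    "H": [1, 2],
    "Cl": [35, 37],
}


def expand_isotopes(element_comb, isotopes=ISOTOPES):
    """Expand a tuple of elements into unique isotope-specified combinations."""
    # One list of (element, mass_number) choices per atom in the molecule.
    choices = [[(el, a) for a in isotopes[el]] for el in element_comb]
    # Sort within each combination so that atom order does not create duplicates.
    unique = {tuple(sorted(combo)) for combo in product(*choices)}
    return sorted(unique)


hcl = expand_isotopes(("H", "Cl"))
h2 = expand_isotopes(("H", "H"))
```

Sorting inside each combination treats H[1]H[2] and H[2]H[1] as the same molecule, so `("H", "H")` yields three combinations rather than four.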
interferences.table.intensity
Functions to threshold, combine and estimate intensities of elements and isotopes based on their abundances.
interferences.table.intensity.isotope_abundance_threshold(isotopes, threshold=None)
Remove isotopes from a list which have no or zero abundance.
Parameters: Returns: Return type:
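A minimal sketch of this kind of filter, assuming isotopes are given as `(label, abundance)` pairs with abundance in percent. The pair representation, the `abundance_threshold` name and the way `threshold` is applied are assumptions for illustration, not the package's behaviour.

```python
def abundance_threshold(isotopes, threshold=None):
    """Drop isotopes with missing or zero abundance, or below a threshold."""
    kept = []
    for label, abundance in isotopes:
        if not abundance:
            # Covers both a missing value (None) and an abundance of zero.
            continue
        if threshold is not None and abundance < threshold:
            continue
        kept.append((label, abundance))
    return kept


# Illustrative abundances in percent.
carbon = [("C[12]", 98.93), ("C[13]", 1.07), ("C[14]", 0.0), ("C[11]", None)]
stable = abundance_threshold(carbon)
major = abundance_threshold(carbon, threshold=5.0)
```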
interferences.table.intensity.get_isotopic_abundance_product(components)
Estimate the abundance of a molecule based on the abundances of its isotopic components.
Returns: Return type: float
Notes
This is essentially a simplistic activity model. Isotopic abundances from periodictable are in %, and are hence divided by 100 here.
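The product described in the notes can be written out directly. This sketch takes a plain list of percent abundances rather than the components the real function accepts; the `abundance_product` name and the example values are illustrative assumptions.

```python
from math import prod


def abundance_product(abundances_percent):
    """Multiply fractional abundances, converting each from percent first."""
    return prod(a / 100.0 for a in abundances_percent)


# Approximate abundances (%) for H[1] and Cl[35].
hcl = abundance_product([99.9885, 75.76])
```

For H[1]Cl[35] this gives roughly 0.7575, i.e. about three quarters of HCl molecules would have this isotopic composition under the simple model.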
interferences.table.molecules
Functions for creating, formatting and serialising representations of molecules.
interferences.table.molecules.deduplicate(df, charges=None, multiples=True)
De-duplicate a dataframe index based on index values and molecule-multiples.
Parameters:
- df (pandas.DataFrame) – Dataframe to check the index of.
- charges (list) – List of valid charges for the frame.
- multiples (bool) – Whether to remove molecule-multiples.
Returns: Return type:
interferences.table.molecules.repr_formula(molecule)
Get a string representation of a formula which preserves element and isotope information.
interferences.table.molecules.get_formatted_formula(molecule, sorted=False)
Construct a formatted name for a molecule.
Parameters: Returns: Return type:
interferences.table.molecules.get_molecule_labels(df, **kwargs)
Get labels for molecules based on their composition and charge.
Parameters: df (pandas.DataFrame)
Returns: Return type: pandas.Series
interferences.table.molecules.molecule_from_components(components)
Builds a Formula from a list of atom or isotope components.
Parameters: components (list) – Atomic, isotope or molecular components to construct an ionic molecule from.
Returns: Return type: Formula
Todo
- Modify to accept consumption of molecular components (e.g. Fe2O3+)
interferences.table.store
interferences.table.store.load_store(path=None, complevel=4, complib='lzo', **kwargs)
Load the interferences HDF store.
Parameters:
- path (str|pathlib.Path) – Path to the store.
- complevel (int) – Compression level option for the HDF store. Uncompressed tables can easily reach a few hundred MB; this isn’t an issue on a local disk, but can be limiting for web transfer.
- complib (str) – Which compression library to use.
Returns: Return type: pandas.HDFStore
interferences.table.store.lookup_components(identifier, path=None, key='table', window=None, **kwargs)
Look up a list of components from the store based on their identifiers.
Parameters:
- identifier (str) – Identifiers for the components to look up.
- path (str|pathlib.Path) – Path to the store to search.
- key (str) – Key for the table within the store.
- window (tuple) – Window for indexing along m/z to return a subset of results.
- drop_first_level (bool) – Whether to drop the first level of the index for simplicity.
Returns: Return type:
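To make the `window` idea concrete, the sketch below selects a range from a sorted list of m/z values with the standard library. It does not touch the HDF store; the `select_window` helper, the inclusive bounds and the sample values are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def select_window(sorted_m_z, window=None):
    """Return the m/z values inside an inclusive (low, high) window."""
    if window is None:
        # No window given: return everything.
        return list(sorted_m_z)
    low, high = window
    # Binary search relies on the input already being sorted by m/z.
    start = bisect_left(sorted_m_z, low)
    stop = bisect_right(sorted_m_z, high)
    return sorted_m_z[start:stop]


m_z = [27.99, 39.96, 43.95, 55.93]  # illustrative values, sorted ascending
subset = select_window(m_z, window=(39.0, 44.0))
```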
interferences.table.store.process_subtables(dfs, charges=None, dump=True, path=None, mode='a', data_columns=['parts', 'elements', 'm_z', 'iso_abund_product'], complevel=4, complib='lzo', **kwargs)
Process and optionally dump a set of subtables to file, appending to the hierarchically-indexed table.
Parameters:
- dfs (list(pandas.DataFrame)) – Dataframes to dump.
- charges (list) – Charges used to create the table.
- path (str|pathlib.Path) – Path to the file to add the table to.
- mode (str) – Mode for accessing the HDF file.
- data_columns (list) – List of columns to create indexes for, to allow query-by-data.
- complevel (int) – Compression level option for the HDF store. Uncompressed tables can easily reach a few hundred MB; this isn’t an issue on a local disk, but can be limiting for web transfer.
- complib (str) – Which compression library to use.
Returns: De-duplicated concatenated version of new tables.
Return type:
interferences.table.store.reset_table(path=None, remove=True, key='table', format='table', complevel=4, complib='lzo', **kwargs)
Reset or remove an HDF store.
Parameters:
- path (str|pathlib.Path) – Path to the store.
- remove (bool) – Whether to remove the table from disk, if possible.
- format (str) – Format to set for the new tables.
- complevel (int) – Compression level option for the HDF store. Uncompressed tables can easily reach a few hundred MB; this isn’t an issue on a local disk, but can be limiting for web transfer.
- complib (str) – Which compression library to use.