lipidoz.isotope_scoring

This module contains functions for determining lipid double bond positions from OzID data by comparing observed and theoretical isotope distributions for putative Oz fragments.

Structure of score_db_pos_isotope_dist_polyunsat Results

The lipidoz.isotope_scoring.score_db_pos_isotope_dist_polyunsat() function returns a dictionary containing all of the output information from a double bond position determination analysis for a single lipid species. The dictionary is organized into two top-level sections: one with information related to the precursor (dictionary key 'precursor'), and one with information related to putative OzID fragments for each acyl chain (dictionary key 'fragments'). The fragments section is further divided hierarchically, first by double bond index, then by double bond position, with information on the corresponding putative OzID fragments stored under the keys 'aldehyde' and 'criegee'. For example, results['fragments'][2][9]['aldehyde'] contains information about the putative aldehyde OzID fragment corresponding to a double bond at position 9 with an index of 2 (i.e., the second double bond ordered from the end of the acyl chain). When a fragment is not found for a given double bond position, its entry is None.

Example score_db_pos_isotope_dist_polyunsat results dictionary
results = {
    'precursor': {  # data associated with the lipid precursor
        'target_mz': 853.4567,
        'target_rt': 23.70,
        'xic_peak_rt': 23.71,  # XIC fitted peak parameter
        'xic_peak_ht': 1.23e5,  # XIC fitted peak parameter
        'xic_peak_fwhm': 1.4,  # XIC fitted peak parameter
        'mz_ppm': 11.621,  # isotope distribution scoring component
        'abun_percent': 4.15,  # isotope distribution scoring component
        'mz_cos_dist': 0.030,  # isotope distribution scoring component
        'isotope_dist_img': b'...',  # image in .png format stored as bytes
        'xic_fit_img': b'...',  # image in .png format stored as bytes
        'saturation_corrected': True
    },
    'fragments': {
        1: {
            # ... results for other double bond indices omitted
        },
        2: {
            # ... results for other double bond positions omitted
            8: {
                'aldehyde': None,  # aldehyde fragment not found for this db position
                'criegee': None,  # criegee fragment not found for this db position
            },
            9: {
                'aldehyde': {
                    'target_mz': 625.424762,
                    'target_rt': 23.7,
                    'xic_peak_rt': 23.717556783082685,  # XIC fitted peak parameter
                    'xic_peak_ht': 12661.977204378885,  # XIC fitted peak parameter
                    'xic_peak_fwhm': 0.17212908575224203,  # XIC fitted peak parameter
                    'mz_ppm': 5.196747765566774,  # isotope distribution scoring component
                    'abun_percent': 1.3707328733140074,  # isotope distribution scoring component
                    'mz_cos_dist': 0.025067410332574203,  # isotope distribution scoring component
                    'rt_cos_dist': 0.13721195277045561,  # retention time agreement with precursor
                    'isotope_dist_img': b'...',  # image in .png format stored as bytes
                    'xic_fit_img': b'...',  # image in .png format stored as bytes
                    'saturation_corrected': False,
                },
                'criegee': {
                    'target_mz': 641.419677,
                    'target_rt': 23.7,
                    'xic_peak_rt': 23.72089709916982,  # XIC fitted peak parameter
                    'xic_peak_ht': 25546.76018712644,  # XIC fitted peak parameter
                    'xic_peak_fwhm': 0.17084010175727868,  # XIC fitted peak parameter
                    'mz_ppm': 7.422567626157515,  # isotope distribution scoring component
                    'abun_percent': 0.7105029034931888,  # isotope distribution scoring component
                    'mz_cos_dist': 0.03100184607806855,  # isotope distribution scoring component
                    'rt_cos_dist': 0.08396320539908453,  # retention time agreement with precursor
                    'isotope_dist_img': b'...',  # image in .png format stored as bytes
                    'xic_fit_img': b'...',  # image in .png format stored as bytes
                    'saturation_corrected': False,
                },
            },
            10: {
                'aldehyde': None,  # aldehyde fragment not found for this db position
                'criegee': {
                    'target_mz': 627.404027,
                    'target_rt': 23.7,
                    'xic_peak_rt': 23.200000000000003,  # XIC fitted peak parameter
                    'xic_peak_ht': 1320.5005362849708,  # XIC fitted peak parameter
                    'xic_peak_fwhm': 0.21637538953545316,  # XIC fitted peak parameter
                    'mz_ppm': 31.88232636309292,  # isotope distribution scoring component
                    'abun_percent': 2.9104719246527786,  # isotope distribution scoring component
                    'mz_cos_dist': 0.20026290761290033,  # isotope distribution scoring component
                    'rt_cos_dist': 0.9794397320037685,  # retention time agreement with precursor
                    'isotope_dist_img': b'...',  # image in .png format stored as bytes
                    'xic_fit_img': b'...',  # image in .png format stored as bytes
                    'saturation_corrected': False,
                },
            },
            # ... results for other double bond positions omitted
        },
        3: {
            # ... results for other double bond indices omitted
        },
    },
}
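
The nested layout above can be walked with three loops. The following is a minimal sketch, not part of lipidoz: the helper name iter_found_fragments is hypothetical, and the small results dictionary is a trimmed stand-in with values rounded from the example above. It skips entries that are None, i.e. fragments that were not found.

```python
def iter_found_fragments(results):
    """Yield (db_idx, db_pos, frag_type, frag_data) for every fragment that was found.

    Hypothetical helper, not part of lipidoz.
    """
    for db_idx, positions in sorted(results['fragments'].items()):
        for db_pos, frags in sorted(positions.items()):
            for frag_type in ('aldehyde', 'criegee'):
                frag = frags.get(frag_type)
                if frag is not None:  # None means the fragment was not found
                    yield db_idx, db_pos, frag_type, frag


# trimmed stand-in for a real results dictionary (most keys left out)
results = {
    'precursor': {'target_mz': 853.4567},
    'fragments': {
        2: {
            8: {'aldehyde': None, 'criegee': None},
            9: {
                'aldehyde': {'mz_ppm': 5.20, 'mz_cos_dist': 0.025, 'rt_cos_dist': 0.137},
                'criegee': {'mz_ppm': 7.42, 'mz_cos_dist': 0.031, 'rt_cos_dist': 0.084},
            },
            10: {
                'aldehyde': None,
                'criegee': {'mz_ppm': 31.88, 'mz_cos_dist': 0.200, 'rt_cos_dist': 0.979},
            },
        },
    },
}

found = list(iter_found_fragments(results))
for db_idx, db_pos, frag_type, frag in found:
    print(db_idx, db_pos, frag_type, frag['mz_ppm'], frag['mz_cos_dist'])
```

Sorting the keys only makes the output order predictable; it does not change which fragments are reported.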

Note

Results from lipidoz.isotope_scoring.score_db_pos_isotope_dist_polyunsat_infusion() are the same as for lipidoz.isotope_scoring.score_db_pos_isotope_dist_polyunsat(), except that all information related to retention time, XIC, etc. is omitted. Likewise, results from lipidoz.isotope_scoring.score_db_pos_isotope_dist_polyunsat_infusion() and lipidoz.isotope_scoring.score_db_pos_isotope_dist_targeted() are in exactly the same format; the targeted variant just has results for fewer double bond positions.
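
Because some result variants omit the retention time fields, code that consumes results from more than one function may want to read those fields defensively. The sketch below is an illustration under one assumption: that "omitted" means the keys are simply absent from the fragment dictionaries. The helper fragment_rows and the two small sample dictionaries are hypothetical and not part of lipidoz.

```python
# isotope distribution scoring components, and fields tied to retention time / XIC
SCORE_KEYS = ('mz_ppm', 'abun_percent', 'mz_cos_dist')
RT_KEYS = ('target_rt', 'xic_peak_rt', 'xic_peak_ht', 'xic_peak_fwhm', 'rt_cos_dist')


def fragment_rows(results):
    """Flatten results['fragments'] into a list of flat dicts, one per found fragment.

    Hypothetical helper. Missing keys are filled with None, so the same code
    can handle results with or without retention time information.
    """
    rows = []
    for db_idx, positions in results['fragments'].items():
        for db_pos, frags in positions.items():
            for frag_type, frag in frags.items():
                if frag is None:  # fragment not found
                    continue
                row = {'db_idx': db_idx, 'db_pos': db_pos, 'fragment': frag_type}
                for key in SCORE_KEYS + RT_KEYS:
                    row[key] = frag.get(key)  # None when the key is absent
                rows.append(row)
    return rows


# made-up sample data: one result with RT fields, one without
lc_results = {'fragments': {2: {9: {
    'aldehyde': {'mz_ppm': 5.2, 'abun_percent': 1.37, 'mz_cos_dist': 0.025,
                 'xic_peak_rt': 23.72, 'rt_cos_dist': 0.137},
    'criegee': None,
}}}}
infusion_results = {'fragments': {2: {9: {
    'aldehyde': {'mz_ppm': 5.2, 'abun_percent': 1.37, 'mz_cos_dist': 0.025},
    'criegee': None,
}}}}

lc_rows = fragment_rows(lc_results)
infusion_rows = fragment_rows(infusion_results)
```

Rows built this way always have the same set of keys, which makes them straightforward to write to a table regardless of which scoring function produced the results.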

Module Reference

lipidoz.isotope_scoring.score_db_pos_isotope_dist_polyunsat(oz_data, precursor_formula, fa_nc, fa_nu, precursor_rt, rt_tol, rt_peak_win, mz_tol, rt_fit_method='gauss', ms1_fit_method='localmax', check_saturation=True, saturation_threshold=100000.0, remove_d=None, debug_flag=None, debug_cb=None, info_cb=None, early_stop_event=None)

Performs isotope distribution scoring for a range of potential double bond positions in polyunsaturated lipids. It also works for monounsaturated lipids, and essentially does nothing for completely saturated lipids.

Parameters:
oz_data : mzapy.MZA

mza data interface instance for OzID data

precursor_formula : dict(str:int)

chemical formula of the precursor ion

fa_nc : tuple(int)

number of carbons in each precursor fatty acid

fa_nu : tuple(int)

number of double bonds in each precursor fatty acid, in the same order as fa_nc

precursor_rt : float

precursor retention time

rt_tol : float

retention time tolerance

rt_peak_win : float

size of RT window to extract for peak fitting

mz_tol : float

m/z tolerance for extracting XICs

rt_fit_method : str, default='gauss'

specify method to use for fitting the RT peak ('gauss' works best in testing)

ms1_fit_method : str, default='localmax'

specify method to use for fitting MS1 spectrum ('localmax' works best in testing)

check_saturation : bool, default=True

whether to check for signal saturation and use leading edge strategy if necessary

saturation_threshold : float, default=1e5

specify a threshold intensity for determining peak saturation

remove_d : int, optional

adjust molecular formulas to get rid of D labels on fatty acid tail that are part of the neutral loss (specific to SPLASH lipids)

debug_flag : str, optional

specifies how to dispatch the message and/or plot, None to do nothing

debug_cb : func, optional

callback function that takes the debugging message as an argument, can be None if debug_flag is not set to 'textcb'

info_cb : function, optional

optional callback function that gets called at several intermediate steps and gives information about data processing details. Callback function takes a single argument which is a str info message

early_stop_event : threading.Event, optional

When the workflow is running in its own thread and this event gets set, processing is stopped gracefully

Returns:
result : dict(…)

dictionary containing analysis results
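
As a rough illustration of how the info_cb and early_stop_event parameters might be wired up, the sketch below collects info messages in a list and creates a threading.Event that another thread can set to request a stop. The helper names make_monitor and run_polyunsat_scoring are hypothetical, and every argument value (formula, chain composition, retention time, tolerances) is a placeholder rather than a recommended setting; the units expected for the tolerances are not stated here and should be checked against the data. The lipidoz import is done inside the function so the sketch can be loaded without lipidoz installed.

```python
import threading


def make_monitor():
    """Return (messages, info_cb, stop_event) for monitoring a scoring run.

    Hypothetical helper, not part of lipidoz.
    """
    messages = []

    def info_cb(msg):
        # info_cb receives a single str info message
        messages.append(msg)

    return messages, info_cb, threading.Event()


def run_polyunsat_scoring(oz_data, info_cb, stop_event):
    """Call the scoring function with placeholder arguments (illustrative only)."""
    # lazy import: only needed when the function is actually called
    from lipidoz.isotope_scoring import score_db_pos_isotope_dist_polyunsat
    return score_db_pos_isotope_dist_polyunsat(
        oz_data,                                     # an mzapy.MZA instance
        {'C': 42, 'H': 83, 'N': 1, 'O': 8, 'P': 1},  # placeholder precursor formula
        (18, 16),                                    # fa_nc, placeholder
        (1, 0),                                      # fa_nu, same order as fa_nc
        23.7,                                        # precursor_rt, placeholder
        0.2,                                         # rt_tol, placeholder
        1.5,                                         # rt_peak_win, placeholder
        0.01,                                        # mz_tol, placeholder
        info_cb=info_cb,
        early_stop_event=stop_event,
    )


messages, info_cb, stop_event = make_monitor()
info_cb('example message')  # what the scoring function would do at intermediate steps
stop_event.set()            # what a controlling thread would do to request a stop
```

In practice run_polyunsat_scoring would be started in a worker thread, with the controlling thread holding on to stop_event and reading messages as they arrive.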

lipidoz.isotope_scoring.score_db_pos_isotope_dist_targeted(oz_data, precursor_formula, db_idxs, db_posns, precursor_rt, rt_tol, rt_peak_win, mz_tol, rt_fit_method='gauss', ms1_fit_method='localmax', check_saturation=True, saturation_threshold=100000.0, remove_d=None, debug_flag=None, debug_cb=None, info_cb=None)

Performs isotope distribution scoring for targeted double bond positions.

Parameters:
oz_data : mzapy.MZA

mza data interface instance for OzID data

precursor_formula : dict(str:int)

chemical formula of the precursor ion

db_idxs : list(int)

list of targeted double bond indices

db_posns : list(int)

list of targeted double bond positions

precursor_rt : float

precursor retention time

rt_tol : float

retention time tolerance

rt_peak_win : float

size of RT window to extract for peak fitting

mz_tol : float

m/z tolerance for extracting XICs

rt_fit_method : str, default='gauss'

specify method to use for fitting the RT peak ('gauss' works best in testing)

ms1_fit_method : str, default='localmax'

specify method to use for fitting MS1 spectrum ('localmax' works best in testing)

check_saturation : bool, default=True

whether to check for signal saturation and use leading edge strategy if necessary

saturation_threshold : float, default=1e5

specify a threshold intensity for determining peak saturation

remove_d : int, optional

adjust molecular formulas to get rid of D labels on fatty acid tail that are part of the neutral loss (specific to SPLASH lipids)

debug_flag : str, optional

specifies how to dispatch the message and/or plot, None to do nothing

debug_cb : func, optional

callback function that takes the debugging message as an argument, can be None if debug_flag is not set to 'textcb'

info_cb : function, optional

optional callback function that gets called at several intermediate steps and gives information about data processing details. Callback function takes a single argument which is a str info message

Returns:
result : dict(…)

dictionary containing analysis results

lipidoz.isotope_scoring.score_db_pos_isotope_dist_polyunsat_infusion(oz_data, precursor_formula, fa_nc, fa_nu, mz_tol, ms1_fit_method='localmax', remove_d=None, debug_flag=None, debug_cb=None)

Works the same as score_db_pos_isotope_dist_polyunsat but for direct infusion data (i.e., without retention time information). All components of the analysis related to retention time are omitted.

Parameters:
oz_data : mzapy.MZA

mza data interface instance for OzID data

precursor_formula : dict(str:int)

chemical formula of the precursor ion

fa_nc : tuple(int)

number of carbons in each precursor fatty acid

fa_nu : tuple(int)

number of double bonds in each precursor fatty acid, in the same order as fa_nc

mz_tol : float

m/z tolerance for extracting XICs

ms1_fit_method : str, default='localmax'

specify method to use for fitting MS1 spectrum ('localmax' works best in testing)

remove_d : int, optional

adjust molecular formulas to get rid of D labels on fatty acid tail that are part of the neutral loss (specific to SPLASH lipids)

debug_flag : str, optional

specifies how to dispatch the message and/or plot, None to do nothing

debug_cb : func, optional

callback function that takes the debugging message as an argument, can be None if debug_flag is not set to 'textcb'

Returns:
result : dict(…)

dictionary containing analysis results