Sodium adduct formation with graph-based machine learning can aid structural elucidation in non-targeted LC/ESI/HRMS

Riccardo Costalunga, Sofja Tshepelevitsh, Helen Sepman, Meelis Kull, Anneli Kruve*

*Corresponding author for this work

Abstract

Non-targeted screening with LC/ESI/HRMS aims to identify the structures of detected compounds using their retention time, exact mass, and fragmentation pattern. Challenges remain in differentiating between isomeric compounds. One untapped possibility for facilitating the identification of isomers relies on the different ionic species formed in electrospray. In positive ESI mode, both protonated molecules and adducts can be formed; however, not all isomeric structures form the same ionic species. The complicated mechanism of adduct formation has hindered the use of this molecular characteristic for structural elucidation in non-targeted screening. Here, we studied adduct formation for 94 small molecules with ion mobility spectra and compared the collision cross-sections of the respective ions. Based on the results, we developed a fast support vector machine classifier with polynomial kernels for accurately predicting sodium adduct formation in ESI/HRMS. The model is trained on five independent data sets from different laboratories and uses the graph-based connectivity of functional groups and PubChem fingerprints as input features. Validation of the model showed an accuracy of 74.7% (balanced accuracy 70.0%) on a data set from an independent laboratory that was not used in training. Lastly, we applied the classification algorithm to the SusDat database of the NORMAN network to evaluate the proportion of isomeric compounds that could be distinguished based on predicted sodium adduct formation. The predicted sodium adduct formation probability provided additional selectivity for about one quarter of the exact masses and, therefore, shows practical utility for structural assignment in non-targeted screening.
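The abstract reports both accuracy and balanced accuracy for the sodium adduct classifier. As a minimal sketch of how the two metrics differ for a binary "forms sodium adduct / does not" label, the Python below computes both from confusion-matrix counts. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all compounds whose adduct label was predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Mean of the per-class recalls, so both classes weigh equally."""
    sensitivity = tp / (tp + fn)  # recall for compounds that form the adduct
    specificity = tn / (tn + fp)  # recall for compounds that do not
    return (sensitivity + specificity) / 2


# Hypothetical, imbalanced validation set (NOT data from the paper):
# 60 compounds form a sodium adduct, 40 do not.
tp, fn = 50, 10  # adduct formers: correctly / incorrectly predicted
tn, fp = 20, 20  # non-formers: correctly / incorrectly predicted

print(f"accuracy          = {accuracy(tp, tn, fp, fn):.3f}")
print(f"balanced accuracy = {balanced_accuracy(tp, tn, fp, fn):.3f}")
```

When one class dominates the validation set, plain accuracy can look better than the classifier's performance on the minority class warrants; balanced accuracy averages the per-class recalls and is therefore commonly reported alongside it.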
Original language: English
Journal: Analytica Chimica Acta
Publication status: Published - 29.04.2022
