Species Name Identification for Essential Oils from Biomedical Abstracts Using Text Mining and Natural Language Processing


Authors : Chou-Cheng Chen

Volume/Issue : Volume 9 - 2024, Issue 4 - April

Google Scholar : https://tinyurl.com/27s7c3yc

Scribd : https://tinyurl.com/3vcnv647

DOI : https://doi.org/10.38124/ijisrt/IJISRT24APR2134

Abstract : In recent years, the number of scientific papers on essential oils has grown exponentially with the popularity of aromatherapy, yet we found no studies that apply natural language processing and text mining to extract information from this literature. To our knowledge, this study is the first to use natural language processing and text mining to identify species names appearing in abstracts related to essential oils. We obtained 34,637 abstracts by querying PubMed with the keyword “essential oil” on 15 March 2024. We downloaded 1,081,005 plant and fungal species names from the NCBI Taxonomy FTP site on the same day. Candidate terms were collected in two ways: nouns in article titles were identified by part-of-speech (POS) tagging, and terms enclosed in italic tags were extracted from titles and abstracts. These terms were used to identify 10,445 of the downloaded plant and fungal species names that appear in abstracts related to essential oils, and the matched names were then used to detect which abstracts mention each species. This yielded 156,371 records linking a PMID to a Taxonomy ID. Our results indicate that this method can efficiently identify species names in abstracts related to essential oils.
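The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: it skips POS tagging and uses only italic-tagged terms as candidates, and the function names, the PMIDs (101 to 103), and the Taxonomy IDs (9001 to 9003) are hypothetical placeholders rather than real identifiers.

```python
import re

ITALIC = re.compile(r"<i>(.*?)</i>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def candidate_terms(records):
    """Collect lowercase words found inside <i>...</i> spans of all records."""
    terms = set()
    for text in records.values():
        for span in ITALIC.findall(text):
            terms.update(w.lower() for w in re.findall(r"[A-Za-z]+", span))
    return terms


def filter_taxonomy(taxonomy, terms):
    """Keep only species names whose first word (genus) is a candidate term."""
    return {
        name: taxid
        for name, taxid in taxonomy.items()
        if name.split()[0].lower() in terms
    }


def link_records(records, taxonomy):
    """Return sorted (pmid, taxid) pairs where a species name occurs in the text."""
    links = set()
    for pmid, text in records.items():
        plain = TAG.sub("", text)  # strip markup before matching
        for name, taxid in taxonomy.items():
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, plain, re.IGNORECASE):
                links.add((pmid, taxid))
    return sorted(links)


# Hypothetical toy data: placeholder PMIDs and placeholder Taxonomy IDs.
records = {
    101: "Essential oil of <i>Lavandula angustifolia</i> reduced anxiety.",
    102: "Black cumin (<i>Nigella sativa</i>) oil and "
         "<i>Lavandula angustifolia</i> oil were compared.",
    103: "Essential oil composition was analysed by GC-MS.",
}
taxonomy = {
    "Lavandula angustifolia": 9001,
    "Nigella sativa": 9002,
    "Mentha piperita": 9003,
}

terms = candidate_terms(records)
filtered = filter_taxonomy(taxonomy, terms)
links = link_records(records, filtered)
print(links)
```

Filtering the taxonomy list by candidate terms before scanning the abstracts is the point of the sketch: it shrinks a very large name list to the small subset that could plausibly occur, so the full-text search runs over far fewer names.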

Keywords : Text Mining; POS; Essential Oil; Species Name.


