Extraction of Chinese Health News Using Computation of Noun Numbers | International Journal of Innovative Science and Research Technology

Extraction of Chinese Health News Using Computation of Noun Numbers

Authors : Chou-Cheng Chen

Volume/Issue : Volume 7 - 2022, Issue 11 - November

Google Scholar : https://bit.ly/3IIfn9N

DOI : https://doi.org/10.5281/zenodo.7439934

Abstract : Significant amounts of health information can be obtained from Chinese newspapers and magazines, but readers must spend considerable time reading it. Common methods of extracting information from articles include machine learning, text mining, word cloud sampling, and the use of algorithms. A high-quality machine learning model for information extraction must be trained on a large amount of good data. Text mining achieves high precision and recall only after many keywords have been collected to identify token sentences. Both machine learning and text mining therefore take up significant amounts of time. Word cloud systems can quickly identify which words are widely used in an article, but the extracted information is often fragmented. Accordingly, the author has created an algorithm that extracts health information from Chinese news by computing noun counts. Firstly, the title and subtitle of each Chinese health news article taken from websites were labeled. Secondly, the text was separated into sentences at commas, periods, and question marks. Thirdly, the word segments of the text were tagged with parts of speech via natural language processing. Fourthly, each sentence was scored by counting its nouns: a noun that also appears in the title counts 3 points, a noun that also appears in the subtitle counts 2 points, and any other noun counts 1 point. Finally, high-scoring sentences were selected according to the user's query.
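
The scoring steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation. It assumes the text has already been segmented and tagged into (word, tag) pairs by some Chinese part-of-speech tagger, that noun tags begin with "n", and that "selected via the query of the user" means returning the k best-scoring sentences. The function names, the delimiter set, and the sample tokens are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Delimiters named in the abstract: commas, periods, question marks.
# Including both full-width and ASCII forms is an assumption.
DELIMITERS = {"，", "。", "？", ",", ".", "?"}


def is_noun(tag: str) -> bool:
    # Assumption: the tagger marks nouns with tags that start with "n".
    return tag.startswith("n")


def split_sentences(tokens: Iterable[Token]) -> List[List[Token]]:
    """Cut the token stream into sentences at each delimiter token."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for word, tag in tokens:
        if word in DELIMITERS:
            if current:
                sentences.append(current)
            current = []
        else:
            current.append((word, tag))
    if current:
        sentences.append(current)
    return sentences


def score_sentence(sentence: List[Token],
                   title_nouns: Set[str],
                   subtitle_nouns: Set[str]) -> int:
    """3 points per title noun, 2 per subtitle noun, 1 per other noun."""
    score = 0
    for word, tag in sentence:
        if not is_noun(tag):
            continue
        if word in title_nouns:
            score += 3
        elif word in subtitle_nouns:
            score += 2
        else:
            score += 1
    return score


def top_sentences(tokens: Iterable[Token],
                  title_nouns: Set[str],
                  subtitle_nouns: Set[str],
                  k: int = 1) -> List[Tuple[int, str]]:
    """Return the k highest-scoring sentences as (score, text) pairs."""
    scored = [
        (score_sentence(s, title_nouns, subtitle_nouns),
         "".join(word for word, _ in s))
        for s in split_sentences(tokens)
    ]
    # sorted() is stable, so ties keep their original document order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical pre-tagged body text (tags are illustrative only).
sample_tokens = [
    ("糖尿病", "n"), ("患者", "n"), ("需要", "v"), ("控制", "v"),
    ("飲食", "n"), ("，", "x"),
    ("醫師", "n"), ("建議", "v"), ("運動", "n"), ("。", "x"),
]
sample_title_nouns = {"糖尿病"}
sample_subtitle_nouns = {"飲食"}

best = top_sentences(sample_tokens, sample_title_nouns,
                     sample_subtitle_nouns, k=1)
```

In this sample the first sentence contains one title noun, one subtitle noun, and one other noun, so it scores 3 + 2 + 1 = 6, while the second sentence contains two other nouns and scores 2. Taking pre-tagged tokens as input keeps the sketch independent of any particular segmentation library.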