Authors :
Atul Bengeri; Dr. Amol C Goje
Volume/Issue :
Volume 7 - 2022, Issue 2 - February
Google Scholar :
http://bitly.ws/gu88
Scribd :
https://bit.ly/3Cj1tsU
DOI :
https://doi.org/10.5281/zenodo.6334712
Abstract :
Conventional healthcare Database Management Systems serve as
data repositories and process structured data efficiently, but
they struggle when the data is both highly varied and very large
in volume. The question then arises of what to process, and how,
when the data comes from various sources, may be structured or
unstructured, and is distributed. Hadoop is an open-source
framework based on distributed computing that can store and
process Big Data, which may comprise structured, unstructured,
and semi-structured data. In this paper, we summarize the basic
operations performed on healthcare data in a Data Management
Lifecycle.
Keywords :
Big Data, Data Analysis, Distributed Computing, ETL, Hadoop, Healthcare, MapReduce.