Cloud Computing Fault Tolerance


Authors : Diwakar Mainali; Megan Nagarkoti; Jebin Dangol; Dipendra Pandit; Ojaswi Adhikari; Dr. Om Prakash Sharma

Volume/Issue : Volume 9 - 2024, Issue 8 - August


Google Scholar : https://shorturl.at/Yo1o5

Scribd : https://tinyurl.com/y8y6fdeb

DOI : https://doi.org/10.38124/ijisrt/IJISRT24AUG519



Abstract : Fault tolerance is essential to cloud computing because it keeps services available and reliable despite hardware, software, or network failures. This paper surveys the models and strategies for fault tolerance used in cloud computing. Through a literature review, we examine the key concepts in depth: redundancy, replication, consensus algorithms, checkpointing, and failover mechanisms. The review indicates that major cloud service providers such as Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure use these methods to protect data and provide high availability. We discuss emerging approaches, including serverless computing, microservices architecture, and machine learning for fault detection, as well as open challenges such as data consistency, performance overhead, security concerns, and the complexity of fault tolerance models. The paper closes with recommendations for further research, focusing on how emerging technologies affect fault tolerance. More empirical research is needed to address the limitations of this secondary study, namely the limited range of literature reviewed and the rapid pace of change in cloud technology. The findings can help researchers and practitioners who want to make cloud services more reliable understand how to incorporate fault tolerance techniques into cloud systems.

Keywords : Fault Tolerance, Cloud Computing, Redundancy, Replication, Consensus Algorithms, Checkpointing, Failover Mechanisms, Serverless Computing, Microservices, Machine Learning, Data Integrity, High Availability.


