Intelligent Process Automation for Data Lifecycle Management (Data Retention and Data Destruction) Through Process Mining


Authors : L Chingwaru; S Chaputsira; M Mutandavari

Volume/Issue : Volume 9 - 2024, Issue 8 - August


Google Scholar : https://tinyurl.com/yc5sjbkm

DOI : https://doi.org/10.38124/ijisrt/24aug1034



Abstract : In an era of rapidly growing data, effective data lifecycle management has become crucial for organizations. This paper addresses the challenge of identifying and classifying data columns as either demographic or transactional across various systems, where column names may differ significantly (e.g., "Sex" in one system and "Gender" in another). The purpose of this research is to develop a model that can accurately classify these data columns, enabling automated data retention and destruction processes. The proposed model leverages intelligent process automation and process mining to identify and categorize data, allowing transactional data to be archived automatically after a specified timeframe. By implementing this model, organizations can improve their data management efficiency, ensuring compliance with data retention policies while optimizing storage use.

Keywords : Data Lifecycle Management, Column Classification, Intelligent Process Automation, Process Mining, Data Retention.
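
The abstract does not describe the model's internals, so the following is only a minimal illustrative sketch of the two ideas it names: mapping differently named columns (such as "Sex" and "Gender") to one category, and flagging transactional data for archiving after a set timeframe. It uses a hand-written synonym table as a stand-in, not the authors' model. The names `CANONICAL`, `CATEGORY`, `classify_column`, `is_due_for_archive`, and the 365-day default are assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical synonym table: normalized column name -> canonical concept.
CANONICAL = {
    "sex": "gender",
    "gender": "gender",
    "dob": "birth_date",
    "date_of_birth": "birth_date",
    "txn_amount": "amount",
    "amount": "amount",
    "txn_date": "transaction_date",
    "transaction_date": "transaction_date",
}

# Hypothetical mapping: canonical concept -> data category.
CATEGORY = {
    "gender": "demographic",
    "birth_date": "demographic",
    "amount": "transactional",
    "transaction_date": "transactional",
}


def normalize(name: str) -> str:
    """Lower-case a column name and unify separators."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


def classify_column(name: str) -> str:
    """Return 'demographic', 'transactional', or 'unknown' for a column name."""
    concept = CANONICAL.get(normalize(name))
    return CATEGORY.get(concept, "unknown")


def is_due_for_archive(last_activity: date, today: date,
                       retention_days: int = 365) -> bool:
    """True when a transactional record is older than the retention window.

    The 365-day default is an assumed placeholder, not a value from the paper.
    """
    return (today - last_activity) > timedelta(days=retention_days)
```

With this table, `classify_column("Sex")` and `classify_column("Gender")` both resolve to the demographic category, while a name absent from the table falls through to "unknown". A learned classifier would replace the lookup but could keep the same interface.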


