Fin-RAG: A RAG System for Financial Documents


Authors : Dr. K. E. Kannammal; Mr. Anirudh R K; Kuzhali Tamizhiniyal P; Ganishkar G; Adrinath C

Volume/Issue : Volume 10 - 2025, Issue 4 - April


Google Scholar : https://tinyurl.com/bddf75t9

Scribd : https://tinyurl.com/bd8ree44

DOI : https://doi.org/10.38124/ijisrt/25apr1147



Abstract : Fin-RAG (Financial Retrieval-Augmented Generation) is an AI-powered chatbot system designed to simplify and accelerate financial data retrieval. Built on Retrieval-Augmented Generation (RAG), it enables natural language querying of financial documents, delivering accurate and context-aware responses in real time. The system supports both text-based and image-based documents, utilizing advanced NLP and image recognition capabilities. Users can extract key insights from balance sheets, profit and loss statements, and scanned invoices effortlessly. Fin-RAG leverages domain-specific embeddings via Hugging Face’s Inference API for precise and relevant search results. Key features include real-time insights, automated reporting, semantic search, and multimodal document analysis. Scalable and compliant, Fin-RAG improves financial decision-making efficiency. It is ideal for auditing, corporate finance, and strategic analysis.

Keywords : Fin-RAG, Retrieval-Augmented Generation (RAG), GPT-4, OpenAIMultiModal, Embedding Models, BERT (Bidirectional Encoder Representations from Transformers), CoBERT, Re-ranking, LlamaIndex, Semantic Understanding, Querying Precision, Multimodal Input, Financial Queries, Textual and Visual Data, Response Latency, Domain-Specific Fine-Tuning, Reinforcement Learning with Human Feedback (RLH
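
The retrieve-then-generate flow that the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for the domain-specific embeddings the abstract says are served through Hugging Face's Inference API, the sample chunks are made-up illustrative data, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical. The sketch stops at prompt construction, so no language model is called.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks ranked by similarity to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context_chunks):
    """Assemble the retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up sample chunks, standing in for text extracted from financial documents.
chunks = [
    "Balance sheet: total assets were 4.2 million at year end.",
    "Profit and loss statement: net profit rose to 310 thousand.",
    "Invoice 17: office supplies, amount due 1,250.",
]

query = "What was the net profit?"
top = retrieve(query, chunks, top_k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real deployment the prompt would be passed to a generator model, and a re-ranking stage (listed in the keywords) would typically sit between `retrieve` and `build_prompt`; both are left out here to keep the sketch self-contained.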


