Authors :
Adwaith Rajesh; Akash V V; Jyothish M; Sankeerth O T; Aswathy T S
Volume/Issue :
Volume 10 - 2025, Issue 4 - April
Google Scholar :
https://tinyurl.com/3dhpzbxa
Scribd :
https://tinyurl.com/3sf4ey2w
DOI :
https://doi.org/10.38124/ijisrt/25apr1659
Note : A published paper may take 4-5 working days from the publication date to appear in PlumX Metrics, Semantic Scholar, and ResearchGate.
Note : Google Scholar may take 15 to 20 days to display the article.
Abstract :
This project develops a system that identifies the source video of an individual frame or short sequence. Recognizing a video from a single still frame or brief clip is a complex but highly demanded capability in industries ranging from entertainment to security, and it has the potential to change how we interact with video content. The system matches frames to videos using visual feature extraction backed by a comprehensive frame database: a combination of SIFT, YOLOv5, and ResNet-50 processes and analyzes each frame, and ChromaDB, a vector database for AI applications, stores the resulting features and serves similarity searches.
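To make the retrieval step concrete, here is a minimal Python sketch of one way frame embedding and ChromaDB lookup could fit together, assuming torchvision's ResNet-50 as the feature extractor and the chromadb client; the collection name, metadata fields, and the omission of the SIFT and YOLOv5 stages are illustrative assumptions, not the authors' published implementation.

```python
# Illustrative sketch only: embeds a frame with ResNet-50 and queries a
# ChromaDB collection of pre-indexed video frames. The collection name
# "video_frames" and the "video_id" metadata field are hypothetical.
import chromadb
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# ResNet-50 with the classification head removed, used as a feature extractor.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed_frame(path: str) -> list[float]:
    """Return a 2048-d ResNet-50 embedding for one frame."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(img).squeeze(0).tolist()

client = chromadb.PersistentClient(path="./frame_index")
frames = client.get_or_create_collection("video_frames")  # hypothetical name

# Query: retrieve the k nearest indexed frames to the uploaded image.
result = frames.query(query_embeddings=[embed_frame("query.jpg")], n_results=10)
for meta, dist in zip(result["metadatas"][0], result["distances"][0]):
    print(meta["video_id"], dist)
```

In practice the indexing side would run the same `embed_frame` over frames sampled from each catalogued video and `add` them to the collection with their video IDs as metadata.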
A modified ensemble ranking scheme then combines factors such as match frequency, consistency, and tag coverage into a confidence score for each candidate video, which is displayed to the user alongside the matched results. The project also provides a user-friendly interface through which users can upload images, view the predicted videos, and inspect the calculations performed during matching. Planned improvements include refining the algorithm for selecting unique frames, adding history tracking to the interface, and improving the confidence calculation.
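The abstract names frequency, consistency, and tag coverage as ranking factors but does not give the formula, so the following is only a hedged sketch of one plausible way to combine them into a per-video confidence score; the equal weights and the normalizations are assumptions, not the published method.

```python
# Hypothetical confidence calculation: the paper's exact formula is not
# given, so this simply combines the three factors named in the abstract
# (frequency, consistency, tag coverage) with assumed equal weights.

def confidence_scores(matches, query_tags, weights=(1/3, 1/3, 1/3)):
    """matches: list of (video_id, distance, tags) for the k nearest frames.
    Returns {video_id: score in [0, 1]}, higher meaning a stronger match."""
    w_freq, w_cons, w_tags = weights
    by_video = {}
    for vid, dist, tags in matches:
        by_video.setdefault(vid, []).append((dist, tags))

    scores = {}
    k = len(matches)
    for vid, hits in by_video.items():
        frequency = len(hits) / k                             # share of top-k hits
        dists = [d for d, _ in hits]
        consistency = 1.0 / (1.0 + (max(dists) - min(dists)))  # tight cluster -> high
        covered = set().union(*(t for _, t in hits))
        tag_coverage = (len(covered & set(query_tags)) / len(query_tags)
                        if query_tags else 0.0)
        scores[vid] = w_freq * frequency + w_cons * consistency + w_tags * tag_coverage
    return scores

# Example: two candidate videos among the 3 nearest frames.
matches = [("vid_a", 0.12, {"indoor", "person"}),
           ("vid_a", 0.15, {"person"}),
           ("vid_b", 0.40, {"outdoor"})]
print(confidence_scores(matches, query_tags=["person", "indoor"]))
```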
Keywords :
Predictions, Recommendations, Machine Learning, Collaborative Farming, Streamline Trading.