Authors :
Adwaith Rajesh; Akash V V; Jyothish M; Sankeerth O T; Aswathy T S
Volume/Issue :
Volume 10 - 2025, Issue 4 - April
Google Scholar :
https://tinyurl.com/3dhpzbxa
Scribd :
https://tinyurl.com/3sf4ey2w
DOI :
https://doi.org/10.38124/ijisrt/25apr1659
Note : A published paper may take 4-5 working days from the publication date to appear in PlumX Metrics, Semantic Scholar, and ResearchGate.
Note : Google Scholar may take 15 to 20 days to display the article.
Abstract :
This project develops a system that identifies the source video of an individual frame or short sequence. Recognizing a video from a single still frame or brief clip is a complex but highly demanded capability in industries ranging from entertainment to security, and it has the potential to change how we interact with video content. The system matches frames to videos using visual feature extraction backed by a comprehensive frame database: a combination of SIFT, YOLOv5, and ResNet-50 processes and analyzes each frame, and ChromaDB, a vector database for AI applications, stores the resulting features and serves similarity searches.
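To make the retrieval step concrete, here is a minimal Python sketch of one way frame embedding and ChromaDB lookup could fit together, assuming torchvision's ResNet-50 as the feature extractor and the chromadb client; the collection name, metadata fields, and the omission of the SIFT and YOLOv5 stages are illustrative assumptions, not the authors' published implementation.

```python
# Illustrative sketch only: embeds a frame with ResNet-50 and queries a
# ChromaDB collection of pre-indexed video frames. The collection name
# "video_frames" and the "video_id" metadata field are hypothetical.
import chromadb
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# ResNet-50 with the classification head removed, used as a feature extractor.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed_frame(path: str) -> list[float]:
    """Return a 2048-d ResNet-50 embedding for one frame."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return backbone(img).squeeze(0).tolist()

client = chromadb.PersistentClient(path="./frame_index")
frames = client.get_or_create_collection("video_frames")  # hypothetical name

# Query: retrieve the k nearest indexed frames to the uploaded image.
result = frames.query(query_embeddings=[embed_frame("query.jpg")], n_results=10)
for meta, dist in zip(result["metadatas"][0], result["distances"][0]):
    print(meta["video_id"], dist)
```

In practice the indexing side would run the same `embed_frame` over frames sampled from each catalogued video and `add` them to the collection with their video IDs as metadata.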
A modified ensemble ranking scheme then combines factors such as match frequency, consistency, and tag coverage into a confidence score for each candidate video, which is displayed to the user alongside the matched results. The project also provides a user-friendly interface through which users can upload images, view the predicted videos, and inspect the calculations performed during matching. Planned improvements include refining the algorithm for selecting unique frames, adding history tracking to the interface, and improving the confidence calculation.
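The abstract names frequency, consistency, and tag coverage as ranking factors but does not give the formula, so the following is only a hedged sketch of one plausible way to combine them into a per-video confidence score; the equal weights and the normalizations are assumptions, not the published method.

```python
# Hypothetical confidence calculation: the paper's exact formula is not
# given, so this simply combines the three factors named in the abstract
# (frequency, consistency, tag coverage) with assumed equal weights.

def confidence_scores(matches, query_tags, weights=(1/3, 1/3, 1/3)):
    """matches: list of (video_id, distance, tags) for the k nearest frames.
    Returns {video_id: score in [0, 1]}, higher meaning a stronger match."""
    w_freq, w_cons, w_tags = weights
    by_video = {}
    for vid, dist, tags in matches:
        by_video.setdefault(vid, []).append((dist, tags))

    scores = {}
    k = len(matches)
    for vid, hits in by_video.items():
        frequency = len(hits) / k                             # share of top-k hits
        dists = [d for d, _ in hits]
        consistency = 1.0 / (1.0 + (max(dists) - min(dists)))  # tight cluster -> high
        covered = set().union(*(t for _, t in hits))
        tag_coverage = (len(covered & set(query_tags)) / len(query_tags)
                        if query_tags else 0.0)
        scores[vid] = w_freq * frequency + w_cons * consistency + w_tags * tag_coverage
    return scores

# Example: two candidate videos among the 3 nearest frames.
matches = [("vid_a", 0.12, {"indoor", "person"}),
           ("vid_a", 0.15, {"person"}),
           ("vid_b", 0.40, {"outdoor"})]
print(confidence_scores(matches, query_tags=["person", "indoor"]))
```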
Keywords :
Predictions, Recommendations, Machine Learning, Collaborative Farming, Streamline Trading.