Authors :
Sanidhya Vishal Sharma; Swati Joshi
Volume/Issue :
Volume 10 - 2025, Issue 11 - November
Google Scholar :
https://tinyurl.com/27t49rac
Scribd :
https://tinyurl.com/3afvj2hx
DOI :
https://doi.org/10.38124/ijisrt/25nov813
Abstract :
Behavioral finance has emerged as a critical framework for understanding market dynamics beyond traditional
rational agent models. This research presents a comprehensive multimodal approach to behavioral finance analysis,
integrating market data, macroeconomic indicators, news sentiment, cryptocurrency metrics, Web3 analytics, GitHub
development activity, and social sentiment to test five advanced hypotheses regarding behavioral pattern identification
and market anomaly detection. The study employs a data pipeline that processes 30,400 samples across
seven distinct data sources, generating 91 engineered features representing behavioral biases, investment patterns, and
market psychology. Machine learning techniques, including Principal Component Analysis, t-Distributed
Stochastic Neighbor Embedding, Variational Autoencoders, K-Means, Hierarchical Clustering, DBSCAN, Isolation
Forest, One-Class SVM, and Elliptic Envelope, are applied to identify behavioral structures and detect anomalies.
Statistical validation through chi-square tests, ANOVA, Granger causality analysis, and lagged correlation studies
demonstrates that three of five hypotheses (60%) achieve statistical significance at p < 0.05. Key findings reveal that
behavioral structures exist and correspond to canonical biases (chi-square = 3406.780, p < 0.001), cluster assignments
maintain moderate stability across market regimes (Jaccard similarity = 0.300), and sentiment and macroeconomic factors
exhibit 65 significant causal relationships with behavioral patterns. However, multimodal data integration does not
uniformly improve clustering quality (Silhouette score decrease of 0.116), and cluster-conditioned anomaly detection fails
to outperform global methods (F1-score decrease of 0.017). These findings contribute to behavioral finance theory while
providing practical applications for investment management, fraud detection, and regulatory compliance.
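The abstract reports cluster stability as a Jaccard similarity but does not state how it is computed. One common reading, sketched below purely as an assumption, is a pair-counting Jaccard index between two clusterings of the same samples (for instance, assignments from models fitted under two different market regimes). The function name `pair_jaccard` and the toy labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pair_jaccard(labels_a, labels_b):
    """Pair-counting Jaccard index between two clusterings of the same samples.

    Counts sample pairs placed in the same cluster by both labelings,
    divided by pairs placed together by at least one labeling.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same samples")
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        both += same_a and same_b      # pair co-clustered in both labelings
        either += same_a or same_b     # pair co-clustered in at least one
    # If no pair is ever co-clustered, the labelings agree trivially.
    return both / either if either else 1.0


# Hypothetical assignments for six samples under two regimes.
calm = [0, 0, 0, 1, 1, 1]
stress = [0, 0, 1, 1, 2, 2]
print(round(pair_jaccard(calm, stress), 3))  # 0.286
```

Under this definition a value of 1.0 means the two clusterings group every pair identically, regardless of how cluster IDs are numbered, while values near 0 mean few co-clustered pairs survive the regime change.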
Keywords :
Behavioral Finance, Multimodal Data Integration, Machine Learning, Hypothesis Testing, Anomaly Detection, Market Psychology