Why Current Historical Financial Data Is Neither Enough nor Truly Useful for Testing Financial Trading and Investment Models in the AI Era (Part 1)

Artificial intelligence (AI) is reshaping financial trading and investment. AI models can analyze massive amounts of data, learn from patterns and trends, and produce predictions and recommendations that support decision-making [1]. Training and testing these models, however, relies mainly on historical data. Is historical data enough, or even useful, for testing trading and investment models in the AI era? In this blog post, we discuss three limitations of historical data: scarcity, relevance, and quality.

The Problem of Data Scarcity

One of the main limitations of historical data is its scarcity. As we go back in time, financial data becomes less available and reliable. It is difficult to obtain comprehensive and accurate data from distant past periods, especially for emerging markets and new asset classes [2]. This limits the ability of backtesting models to capture the full complexity and diversity of real-world financial scenarios.

The Problem of Data Relevance

Another limitation of historical data is its relevance. The economy is a dynamic system that changes constantly. The factors that influenced the financial markets in the past may not be relevant or applicable in the present or future. For example, technological innovations, regulatory changes, accounting standards, corporate actions, and consumer behaviors all affect the financial landscape over time [3]. Therefore, historical data may not reflect the current or future market conditions and trends.
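As a rough illustration of this point, the sketch below compares the volatility of an "early" and a "late" window of returns. The return values, the window names, and the `vol_ratio` variable are all invented for illustration and do not come from real market data; the point is only that a parameter estimated on the older window can be far from what the recent window shows.

```python
from statistics import stdev

# Hypothetical daily returns, invented for illustration only.
early_window = [0.010, -0.010, 0.012, -0.008, 0.009, -0.011]  # older regime
late_window = [0.030, -0.040, 0.050, -0.035, 0.045, -0.050]   # recent regime

# Sample standard deviation of returns in each window.
early_vol = stdev(early_window)
late_vol = stdev(late_window)

# A ratio well above 1 means a risk estimate calibrated on the early
# window would understate the risk seen in the late window.
vol_ratio = late_vol / early_vol

print(f"early volatility: {early_vol:.4f}")
print(f"late volatility:  {late_vol:.4f}")
print(f"late / early:     {vol_ratio:.2f}")
```

A model tuned only on the early window would size positions for a much calmer market than the one it later trades in.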

The Problem of Data Quality

A third limitation of historical data is its quality. Historical data may contain errors, inconsistencies, gaps, outliers, or biases that undermine the accuracy and validity of backtesting and analysis results. For example, a historical dataset may not be adjusted for inflation, may suffer from survivorship bias (only entities that still exist today are included) [4], may introduce look-ahead bias (the test uses information that was not yet available at the time of the simulated decision) [5], or may ignore changes in market liquidity and volatility. Therefore, historical data may not provide a reliable basis for testing trading and investment models.
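Two of the biases named above can be made concrete with a small sketch. All fund names, returns, and prices below are hypothetical toy values, and the `backtest` helper and its `signal_lag` parameter are assumptions introduced only for this example. The first part averages returns over surviving funds only versus the full universe; the second part runs the same simple momentum rule once with a signal that peeks at the return it is about to earn and once with a properly lagged signal.

```python
from statistics import mean

# --- Survivorship bias (toy data, invented for illustration) ---
universe = [
    {"name": "A", "total_return": 0.40, "survived": True},
    {"name": "B", "total_return": 0.25, "survived": True},
    {"name": "C", "total_return": -0.60, "survived": False},
    {"name": "D", "total_return": 0.10, "survived": True},
    {"name": "E", "total_return": -0.85, "survived": False},
]

def average_return(funds):
    """Equal-weighted average of total returns."""
    return mean(f["total_return"] for f in funds)

survivors_only = average_return([f for f in universe if f["survived"]])
full_universe = average_return(universe)

# --- Look-ahead bias (toy prices, invented for illustration) ---
prices = [100.0, 102.0, 101.0, 103.0, 102.0, 104.0, 103.0, 105.0]
returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]

def backtest(returns, signal_lag):
    """Hold the asset in period t if the return signal_lag periods earlier
    was positive. signal_lag=0 uses the very return being earned, which
    would not be known when the trade is placed."""
    pnl = 0.0
    for t in range(signal_lag, len(returns)):
        if returns[t - signal_lag] > 0:
            pnl += returns[t]
    return pnl

biased_pnl = backtest(returns, signal_lag=0)  # peeks at the future
lagged_pnl = backtest(returns, signal_lag=1)  # uses only past information

print(f"survivors only: {survivors_only:.2f}, full universe: {full_universe:.2f}")
print(f"biased P&L: {biased_pnl:.4f}, lagged P&L: {lagged_pnl:.4f}")
```

In this toy setup the survivors-only average looks healthy while the full universe loses money, and the rule that looks profitable with the peeking signal loses money once the signal is lagged.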

Conclusion

While historical financial data is valuable for understanding past trends and patterns, its limitations must be recognized, especially in the AI era. Scarce data, the weak mapping between past and present conditions, the changing corporate landscape, and evolving accounting standards all reduce the usefulness of historical data for backtesting and investment analysis. It is crucial to balance historical insights with real-time data, alternative data, synthetic data, and expert opinions in order to adapt to the ever-evolving financial landscape. Ultimately, the ability to adapt and draw on a wide variety of information will be key to successful trading and investment strategies in the AI era. In our next blog post, we will delve into the potential of synthetic data as a unique and innovative source of financial information.

 

References

[1] Cao, Longbing. “AI in Finance: Challenges, Techniques and Opportunities.” arXiv preprint arXiv:2107.09051, 2021, https://arxiv.org/abs/2107.09051

[2] Ahmed, Shamima, et al. “Artificial intelligence and machine learning in finance: A bibliometric review.” Research in International Business and Finance, Volume 61, October 2022, 101646, https://www.sciencedirect.com/science/article/pii/S0275531922000344

[3] OECD. “Artificial Intelligence, Machine Learning and Big Data in Finance.” OECD, 2021, https://www.oecd.org/finance/artificial-intelligence-machine-learning-big-data-in-finance.htm

[4] CFI Team. “Survivorship Bias.” Corporate Finance Institute, 2023, https://corporatefinanceinstitute.com/resources/capital-markets/survivorship-bias/

Laurentiu Vasiliu, Peracton Ltd.