wrenparismoe/COMPS

Investigation of Financial Forecasting Systems: A Survey and Comprehensive Examination

Repository holding work for the Senior Composite Project in Computer Science at Occidental College. A 50-page research paper is included in the base directory (COMPS_Financial_Forecasting.pdf).

Abstract

The stock market holds an allure for many investors who believe they can beat market returns. The tools investment firms use to analyze the market are constantly evolving to give their developers an advantage over the competition. The latest example is the use of major technological advancements to analyze information and outsmart the market. New developments in computing capabilities offer the potential to bring in greater profits. Today 80% of trades are automated, indicating a massive change in how traditional trading is done. Bringing computing into stock price analysis puts mathematicians and computer scientists at the forefront of the industry. However, the question of whether computational forecasting techniques can consistently beat the market remains open and highly debated. Few studies provide a complete analysis of the performance of financial prediction systems as a whole, and none offers a holistic, component-wise comparison of entire financial time series forecasting systems. This research offers a comprehensive study of how combinations of input data, preprocessing methods, feature engineering techniques, and popular prediction models interact when tasked with predicting the prices of futures. Specific arrangements are identified that yield the best-performing system for each model. The results show that models can predict prices to a high degree of accuracy. Predicting market directions, up or down, consistently enough to beat the market is more challenging. Ultimately, support vector regression and Gaussian processes demonstrate the most consistent ability to produce low errors while also predicting market directions accurately.

Introduction

The stock market is evolving due to advancements in computing. The work of mathematicians, computer scientists, and financial engineers has become increasingly important to the industry. Developing profitable prediction models is a focus of both small- and large-scale financial institutions. Accurately predicting market movements is incredibly difficult. A successful financial trading system built on model predictions has the potential to provide lucrative profits to its developer. Information on the best models for forecasting the market is not public knowledge. The greatest challenge for a developer is determining which methods to use. There are countless models that can be applied to financial trading and forecasting. However, no method has yet been universally agreed upon as the best. Specific factors cause algorithms to fit certain indices, markets, or tasks better than others. The lack of a known algorithm that universally fits the market drives the ongoing study of financial prediction systems.

This project proposes a comprehensive study of each aspect of financial forecasting systems. The central question is whether it is possible to establish the information and methods required to accurately predict market prices and directions. The primary considerations for the study are data input, data preprocessing methods, feature engineering techniques, and forecasting models. Prediction models fall into two overarching classes: hard computing and soft computing. Hard computing is conventional programming in which rules and conditions are precisely coded into the structure of an algorithm. These methods tend to be simpler to understand and implement. Soft computing models are more complex, incorporating structures such as neural networks (NNs) and genetic algorithms (GAs). Broadly defined, soft computing is the use of approximate calculations that provide imprecise but informative solutions to computationally complex problems. Soft computing approaches exploit a tolerance for imprecision, uncertainty, and partial truth to solve such problems. A common debate among experts is whether soft computing techniques can consistently outperform their simpler hard computing counterparts.

The most popular financial prediction models are grouped into three distinct subcategories: classical statistical methods, machine learning (ML) techniques, and deep learning (DL) frameworks. Statistical methods are a prime example of hard computing, while ML and DL fall under soft computing. Studying the applications of hard and soft computing in finance provides relevant and valuable information for other areas. Knowledge of how components interact and influence one another within financial prediction systems is directly applicable to any time series analysis task. This research offers insight into such interactions and the performance that results from specific model combinations. Thus, this survey serves as a resource for programmers aspiring to build and learn about the foundations of financial forecasting systems.

Statement of Problem

Stock price forecasting is an exceedingly complex subject. The primary reason lies in the structure of financial time series data. A time series is a data set formed by a series of values obtained at successive times. Financial time series consist of historical open, high, low, and close (OHLC) prices, as well as recorded volume. In the most basic sense, time series forecasting analyzes past OHLC data to make predictions of future stock prices. The aim is to find or build a function capable of accurately mapping current and past values to future ones.
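The mapping from past values to future ones can be sketched as a sliding-window transformation. The snippet below is a minimal illustration, not the method used in the paper: the helper name `make_windows`, the `lookback` parameter, and the sample closing prices are all hypothetical. It turns a series of closing prices into (past window, next value) pairs that a forecasting model could be fitted on.

```python
from typing import List, Tuple


def make_windows(
    closes: List[float], lookback: int
) -> Tuple[List[List[float]], List[float]]:
    """Build (past window, next value) pairs from a price series.

    Hypothetical helper for illustration only; not taken from the repository.
    """
    if lookback < 1 or lookback >= len(closes):
        raise ValueError("lookback must be between 1 and len(closes) - 1")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(lookback, len(closes)):
        # The `lookback` values before time t are the model input ...
        inputs.append(closes[t - lookback:t])
        # ... and the value at time t is what the model should predict.
        targets.append(closes[t])
    return inputs, targets


# Made-up closing prices, used only to show the shape of the result.
closes = [100.0, 101.5, 101.0, 102.2, 103.0, 102.4]
X, y = make_windows(closes, lookback=3)
for window, target in zip(X, y):
    print(window, "->", target)
```

Each row of `X` holds three consecutive closes and the matching entry of `y` is the close that follows them. A real system would typically use the full OHLC and volume columns along with engineered features rather than closes alone.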

There are numerous reasons why accurately forecasting financial time series is a difficult problem. Relative to other types of time series, financial time series are particularly inconsistent. This presents a massive hurdle for mathematicians and computer scientists attempting to predict market movements. Financial time series are inherently embedded with extreme levels of randomness and high volatility. This characteristic of time series data is known as noise. The non-linearity and randomness of financial time series make building forecasting models a challenging task. The unpredictability of financial time series is described by the Efficient Market Hypothesis (EMH).
