freeradiantbunny.org

Feature Engineering for a Stock Return Prediction Neural Network

Feature engineering is a crucial step in building neural networks for stock return prediction. It transforms raw financial data into features that capture relevant patterns, trends, and signals, so the model has less noise to learn through. Because stock markets are complex, noisy, and influenced by many factors, well-chosen features distill the structure in the data that matters for forecasting returns.

Understanding the Problem

Predicting stock returns typically means forecasting the percentage change in price over a future time period, rather than predicting the next price level. Returns are usually closer to stationary than price levels, which makes them better suited to modeling. However, stock returns are influenced by a mix of historical price behavior, volume, market sentiment, macroeconomic indicators, and other external factors. Feature engineering aims to encode these influences in a numerical representation the neural network can process.
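As a concrete illustration of this prediction target, here is a minimal sketch in plain Python that computes the simple return over a future horizon. The function name `forward_returns`, the `horizon` parameter, and the sample prices are illustrative assumptions, not something taken from a particular library or dataset.

```python
def forward_returns(prices, horizon=1):
    """Simple return from step t to step t + horizon.

    The last `horizon` entries have no future price yet, so they are None.
    """
    out = []
    for t in range(len(prices)):
        if t + horizon < len(prices):
            out.append(prices[t + horizon] / prices[t] - 1.0)
        else:
            out.append(None)
    return out


closes = [100.0, 102.0, 101.0, 105.0]  # hypothetical closing prices
targets = forward_returns(closes, horizon=1)
```

Rows whose target is None would normally be dropped before training, since the future price they depend on has not been observed.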

Raw Data Sources

The starting point for feature engineering is the raw data, which usually includes historical price and volume series.

While raw price and volume data provide the foundation, these values alone are rarely sufficient for accurate prediction due to their volatility and noise.

Common Feature Engineering Techniques for Stock Return Prediction

1. Technical Indicators

Technical indicators summarize price and volume information into metrics that reflect market trends, momentum, volatility, or mean-reversion tendencies. Widely used examples include moving averages, the Relative Strength Index (RSI), and Bollinger Bands.

These indicators convert raw price series into more interpretable signals. By calculating them over different lookback periods, you can capture short-, medium-, and long-term dynamics.
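The idea of computing one indicator over several lookback periods can be sketched as follows. This is a minimal example in plain Python, assuming a simple moving average and a simple-average RSI (not Wilder's smoothed variant); the helper names, the lookback values, and the sample prices are illustrative.

```python
def sma(values, window):
    """Simple moving average; None until `window` observations exist."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def rsi(values, window):
    """RSI from plain averages of gains and losses over `window` price changes."""
    out = [None] * len(values)
    for i in range(window, len(values)):
        changes = [values[j] - values[j - 1] for j in range(i - window + 1, i + 1)]
        gain = sum(c for c in changes if c > 0) / window
        loss = sum(-c for c in changes if c < 0) / window
        if loss == 0:
            # Assumed convention: 100 when only gains occurred, 50 when flat.
            out[i] = 100.0 if gain > 0 else 50.0
        else:
            out[i] = 100.0 - 100.0 / (1.0 + gain / loss)
    return out


closes = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0]  # hypothetical prices
features = {f"sma_{w}": sma(closes, w) for w in (3, 5)}  # short and longer lookback
features["rsi_3"] = rsi(closes, 3)
```

Each value at index t uses only prices up to and including t, so the features do not leak future information into the model.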

2. Return Calculations

Since the goal is to predict returns, engineering features from returns themselves is natural, for example past returns measured over several lookback horizons.
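A minimal sketch of return-based features, assuming three illustrative choices: the latest one-period return, that return lagged by one step, and the cumulative return over a two-period lookback. The function names and sample prices are hypothetical.

```python
def pct_returns(prices):
    """One-period simple returns; the first entry is None (no prior price)."""
    return [None] + [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]


def cumulative_return(prices, lookback):
    """Return over the past `lookback` periods, ending at each index."""
    return [
        prices[i] / prices[i - lookback] - 1.0 if i >= lookback else None
        for i in range(len(prices))
    ]


def lag(series, k):
    """Shift a series by k steps so that row t holds the value from t - k."""
    return [None] * k + series[:len(series) - k]


closes = [100.0, 110.0, 99.0, 108.9]  # hypothetical closing prices
r1 = pct_returns(closes)
# One feature row per time step: (return, return lagged by 1, 2-period return)
feature_rows = list(zip(r1, lag(r1, 1), cumulative_return(closes, 2)))
```

Early rows contain None wherever the lookback reaches before the start of the series; those rows are typically discarded.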

3. Time Features

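Temporal features, such as the day of the week or the day of the year, help the model learn seasonality and periodic patterns.

Encoding these features cyclically (e.g., sine/cosine transforms of day-of-year) avoids artificial discontinuities, such as the jump from day 365 back to day 1 at the turn of the year.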
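The cyclical encoding can be sketched with the standard library alone. The helper names and the choice of periods (7 for day of week, 365.25 for day of year) are assumptions made for illustration.

```python
import math
from datetime import date


def cyclical(value, period):
    """Map a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(d):
    """Cyclical day-of-week and day-of-year features for a calendar date."""
    dow_sin, dow_cos = cyclical(d.weekday(), 7)  # Monday = 0
    doy_sin, doy_cos = cyclical(d.timetuple().tm_yday - 1, 365.25)
    return {
        "dow_sin": dow_sin, "dow_cos": dow_cos,
        "doy_sin": doy_sin, "doy_cos": doy_cos,
    }


dec31 = time_features(date(2023, 12, 31))
jan01 = time_features(date(2024, 1, 1))
# Distance between the two dates in the encoded day-of-year space
gap = math.hypot(dec31["doy_sin"] - jan01["doy_sin"],
                 dec31["doy_cos"] - jan01["doy_cos"])
```

With a raw day-of-year integer, December 31 and January 1 would sit at opposite ends of the scale; in the encoded space `gap` is small, which is the point of the transform.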

4. Volume-based Features

Volume indicates trading intensity and can signal the strength or weakness of price moves.
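Two volume features are sketched below as an illustration: relative volume (today's volume against the average of the preceding days) and on-balance volume, which adds volume on up days and subtracts it on down days. The window length and sample data are assumptions.

```python
def relative_volume(volumes, window):
    """Current volume divided by the mean of the previous `window` values."""
    out = []
    for i in range(len(volumes)):
        if i < window:
            out.append(None)
        else:
            baseline = sum(volumes[i - window:i]) / window
            out.append(volumes[i] / baseline if baseline > 0 else None)
    return out


def on_balance_volume(closes, volumes):
    """Running total: add volume when price rises, subtract when it falls."""
    obv = [0]
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])
        else:
            obv.append(obv[-1])
    return obv


closes = [10.0, 11.0, 10.5, 12.0, 12.0]    # hypothetical closing prices
volumes = [1000, 1200, 800, 3000, 1000]    # hypothetical share volumes
rel_vol = relative_volume(volumes, window=3)
obv = on_balance_volume(closes, volumes)
```

The baseline for relative volume excludes the current day, so an unusually heavy day is measured against normal activity rather than against itself.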

5. Cross-Asset or Market Features

Stocks do not move in isolation; incorporating features from related assets or indices helps capture market-wide effects.
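One way to express market-wide effects as features is sketched below: the stock's return in excess of an index, and a rolling beta (the least-squares slope of stock returns on index returns). The function names, the window, and the return series are hypothetical.

```python
def excess_returns(stock_returns, index_returns):
    """Stock return minus index return for each period."""
    return [s - m for s, m in zip(stock_returns, index_returns)]


def rolling_beta(stock_returns, index_returns, window):
    """Rolling least-squares slope of stock returns on index returns."""
    out = [None] * len(stock_returns)
    for i in range(window - 1, len(stock_returns)):
        s = stock_returns[i - window + 1:i + 1]
        m = index_returns[i - window + 1:i + 1]
        mean_s = sum(s) / window
        mean_m = sum(m) / window
        cov = sum((a - mean_s) * (b - mean_m) for a, b in zip(s, m))
        var = sum((b - mean_m) ** 2 for b in m)
        out[i] = cov / var if var > 0 else None
    return out


index_r = [0.01, -0.02, 0.015, 0.005]  # hypothetical index returns
stock_r = [0.02, -0.04, 0.03, 0.01]    # hypothetical stock returns (2x the index)
excess = excess_returns(stock_r, index_r)
beta = rolling_beta(stock_r, index_r, window=3)
```

Because the sample stock moves exactly twice as much as the index, the rolling beta comes out at about 2 wherever the window is full.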

6. Sentiment and Alternative Data

If available, sentiment analysis of news or social media may provide useful leading indicators.

Data Preprocessing Considerations

Before feeding features into the neural network, careful preprocessing is necessary.
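As one example of such preprocessing, the sketch below standardizes a feature using statistics computed on the training portion only, with a chronological split so that later data does not influence the scaling. Standardization and the 80/20 split are assumptions chosen for illustration, and the helper names and sample values are hypothetical.

```python
import statistics


def fit_scaler(train_values):
    """Mean and population standard deviation of the training data."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    return mean, (std if std > 0 else 1.0)  # avoid dividing by zero


def transform(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]


feature = [1.0, 2.0, 3.0, 4.0, 10.0]   # hypothetical feature, oldest first
split = int(len(feature) * 0.8)        # chronological split, no shuffling
train, test = feature[:split], feature[split:]

mean, std = fit_scaler(train)          # fitted on the training data only
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)
```

Fitting the scaler on the full series would let test-period values shape the training inputs, which is a form of look-ahead bias.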

Challenges and Best Practices