PyTorch deep learning case study

The following is an attempt to use PyTorch to create a stock price prediction model.

The aim is to study the code and how the model operates.

The first example is a stock price prediction program that uses PyTorch's LSTM.

Pipeline

Data Loading and Preprocessing

First, synthetic stock price data is generated and saved to a file.

The main program loads the stock price data and splits it into two sets: train and test.

Normalization is applied to the data to improve model performance.

The MinMaxScaler(feature_range=(0, 1)) function is used to initialize a scaler. Then the fit_transform(data) function is used to produce the scaled data.

The MinMaxScaler transforms each feature (column) individually by rescaling its values to the specified range, by default [0, 1].
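
As a minimal sketch (assuming the raw prices are held in a NumPy array or DataFrame called data, a placeholder name), the scaling step looks like this:

	from sklearn.preprocessing import MinMaxScaler

	# Rescale each feature (column) to the range [0, 1]
	scaler = MinMaxScaler(feature_range=(0, 1))
	scaled_data = scaler.fit_transform(data)  # data shape: (n_samples, n_features)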

LSTM Model Implementation

A model is defined. The model is a Long Short-Term Memory (LSTM) network for time-series forecasting.

The following paragraphs discuss how the LSTM model is specifically designed for stock price forecasting.

The model class is built on torch.nn: it subclasses nn.Module.

The model is instantiated with 3 parameters: input_size = 1, hidden_size = 50, num_layers = 2.

The batch_first argument is set to True (note that the PyTorch default is False). The PyTorch documentation states that this setting makes the input tensor's first dimension represent the batch size, which aligns with how most PyTorch models and DataLoaders structure data. It simplifies the code by keeping batch dimensions consistent across layers.

The model's final layer is then configured to predict one price (the closing price). This is done by setting nn.Linear with in_features set to hidden_size and out_features set to 1. This defines a fully connected (dense) linear layer as part of the neural network model. The "linear" aspect means that the output can be any real number and that no activation function is applied.

The meaning of input_size is that the model expects this number of features per time step in the input sequence. When passing input to an LSTM, the expected input tensor shape is:

	(batch_size, sequence_length, input_size)   # if batch_first=True
	(sequence_length, batch_size, input_size)   # if batch_first=False (the default)

So, if input_size = 10, each time step should have 10 features.

The "Number of Layers" is set. This determins how many stacked RNN/LSTM/GRU layers are used. Each layer's output becomes the input to the next layer. If num_layers = 3, the RNN is 3 layers deep. More layers help model more complex temporal dependencies, but also increase training time and risk of vanishing gradients.

The hidden_size parameter is most commonly used in sequence models like nn.LSTM. It defines the dimensionality of the hidden state vector, in other words, the number of features in the model's internal memory at each time step.

The model also has a forward() function definition. The forward(self, x) function is overridden. This function describes the computation performed at every call of the model instance. It's where you specify how input tensors are passed through layers (e.g., linear layers, activations, RNNs, etc.) to produce output.
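
Putting these pieces together, here is a minimal sketch of such a model (class and attribute names are illustrative; the original code may differ):

	import torch
	import torch.nn as nn

	class LSTMModel(nn.Module):
	    def __init__(self, input_size=1, hidden_size=50, num_layers=2):
	        super().__init__()
	        # batch_first=True: input shape is (batch_size, sequence_length, input_size)
	        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
	        # Fully connected layer maps the hidden state to one predicted price
	        self.fc = nn.Linear(hidden_size, 1)

	    def forward(self, x):
	        out, _ = self.lstm(x)          # out: (batch_size, seq_len, hidden_size)
	        return self.fc(out[:, -1, :])  # predict from the last time step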

Training Pipeline

The training loop includes dynamic learning rate adjustment and monitors the loss for efficient training.

The training function train_model() defines the training loop for a PyTorch LSTM neural network. Below is an explanation of what each part does:

The function trains the LSTM model using input sequences, expected outputs, and optimization settings for a specified number of epochs (full passes over the data).

Here are the parameters: the model, the training inputs and labels (train_data and train_labels), the loss function (criterion), the optimizer, and the number of epochs.

Here is a step-by-step description:

  1. model.train(): Puts the model in training mode. Enables layers like dropout if defined.
  2. for epoch in range(epochs):: Runs the training process for the specified number of epochs.
  3. outputs = model(train_data): Feeds the training data into the model to get predicted outputs.
  4. optimizer.zero_grad(): Clears old gradients from the previous iteration to avoid accumulation.
  5. loss = criterion(outputs, train_labels): Computes the difference between predictions and true labels.
  6. loss.backward(): Performs backpropagation to compute gradients of the loss w.r.t. model parameters.
  7. optimizer.step(): Updates the model parameters using the gradients and the optimizer's algorithm.
  8. if (epoch+1) % 10 == 0:: Prints the loss every 10 epochs to monitor training progress.
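
A minimal sketch of such a training function, following the steps above (the dynamic learning rate adjustment mentioned earlier is omitted here):

	def train_model(model, train_data, train_labels, criterion, optimizer, epochs=50):
	    model.train()                            # training mode
	    for epoch in range(epochs):
	        outputs = model(train_data)          # forward pass
	        optimizer.zero_grad()                # clear old gradients
	        loss = criterion(outputs, train_labels)
	        loss.backward()                      # backpropagation
	        optimizer.step()                     # update parameters
	        if (epoch + 1) % 10 == 0:
	            print(f"Epoch {epoch+1}/{epochs}, Loss: {loss.item():.4f}")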

The output looks like this:

	Epoch 10/50, Loss: 0.0271
	Epoch 20/50, Loss: 0.0195
	Epoch 30/50, Loss: 0.0152
	...

Evaluation and Visualization

The evaluate_model() function is a standard evaluation utility for PyTorch models.

It prepares the model for inference, performs predictions without tracking gradients, and converts both the predictions and the ground truth back into their original, human-readable form by reversing normalization. This is crucial for interpreting model results in real-world units, such as stock prices or temperatures.

The function evaluate_model is used to evaluate a trained model on test data. It returns both the model’s predicted values and the actual values, both rescaled back to their original range using a MinMaxScaler.

Here is a step-by-step explanation:

  1. model.eval(): Puts the model in evaluation mode. This disables training-specific behavior like dropout and batch normalization updates.
  2. with torch.no_grad():: Turns off gradient tracking to save memory and computation, since we’re not training.
  3. predictions = model(test_data): Feeds the test data into the model to generate predicted outputs.
  4. predictions = predictions.cpu().numpy(): Moves the predictions to the CPU (if on GPU) and converts the tensor to a NumPy array.
  5. predictions = scaler.inverse_transform(predictions): Unscales the predictions back to their original range using the inverse of MinMax scaling.
  6. actual_prices = scaler.inverse_transform(test_labels.cpu().numpy()): Also unscales the actual labels for direct comparison in the same scale.
  7. return actual_prices, predictions: Returns the original (unscaled) true values and the predicted values so they can be compared or plotted.
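
A minimal sketch of this function, following the steps above (parameter names are assumptions):

	def evaluate_model(model, test_data, test_labels, scaler):
	    model.eval()                              # evaluation mode
	    with torch.no_grad():                     # no gradient tracking
	        predictions = model(test_data)
	    predictions = scaler.inverse_transform(predictions.cpu().numpy())
	    actual_prices = scaler.inverse_transform(test_labels.cpu().numpy())
	    return actual_prices, predictions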

In addition, an image is generated to visualize the predicted vs. actual stock prices.

Requirements

The following libraries are used: torch (PyTorch), NumPy, scikit-learn (for MinMaxScaler), and matplotlib (for the visualization).

Application 2 Code Explanation - Stock Prediction Neural Net

Line-by-Line Explanation

Preparing the Data

Function Definition:
A create_data() function is defined that accepts two arguments: a dataset (most likely a DataFrame with stock data) and an integer indicating how many previous days of data to include as features for prediction.

Deep Copy:
A deep copy of the original dataset is created to avoid modifying the original data. This copy will be manipulated to add new columns and filter rows.

Set Date Index:
The 'Date' column is set as the index of the DataFrame. This is a standard step in time series data processing to allow operations that rely on chronological ordering.

Create Lagged Features:
A loop runs from 1 to the number of days specified. In each iteration, a new column is added to the DataFrame that represents the closing price from that many days ago. These lagged values are essential features for training time series models.

Drop Missing Values:
Rows with missing values are removed. These missing values are introduced because the first few rows won’t have enough historical data due to the shifting of closing prices.

Return Processed Data:
The cleaned and transformed DataFrame is returned. This DataFrame contains the current closing price and a sequence of lagged closing prices for use as input features in model training.

Apply the Function:
A variable is set to specify the number of days (e.g., 7), and the function is called to generate the training data. The result is stored in a new variable which holds the fully prepared DataFrame for the stock prediction task.
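
A minimal sketch of this function, assuming a pandas DataFrame with 'Date' and 'Close' columns (stock_df is a placeholder name for the raw data):

	from copy import deepcopy
	import pandas as pd

	def create_data(df, n_days):
	    df = deepcopy(df)                  # work on an independent copy
	    df.set_index('Date', inplace=True)
	    for i in range(1, n_days + 1):
	        # Closing price from i days earlier, as a lagged feature
	        df[f'Close(t-{i})'] = df['Close'].shift(i)
	    df.dropna(inplace=True)            # drop rows lacking full history
	    return df

	n_days = 7
	shifted_df = create_data(stock_df, n_days)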

Preprocessing and Splitting

Import the Scaler:
The code begins by importing a utility called MinMaxScaler from the sklearn.preprocessing module. This tool is used to rescale numerical values so they fall within a defined range, which is helpful for training neural networks efficiently.

Initialize the Scaler:
An instance of MinMaxScaler is created with the feature range set to -1 to 1. This means the smallest values in the dataset will be scaled to -1, the largest to 1, and all others proportionally in between.

Fit and Transform the Training Data:
The scaler is applied to the entire training DataFrame. It first analyzes the data to determine the minimum and maximum for each column, then transforms all feature values accordingly to fit within the target range of -1 to 1.

Convert to a DataFrame:
Since the scaled output is a NumPy array, it is converted back into a pandas DataFrame. This preserves the original column names and date indices so that the data remains labeled and easy to work with in subsequent steps.

Preview the Scaled Data:
The first few rows of the scaled DataFrame are displayed for inspection. This helps verify that the scaling was performed as expected and that the structure of the data remains intact.

Train/Test Preparation

Separate Features and Target:
All columns in the scaled DataFrame except for the Close column are selected as input features and stored in a variable called X. The Close column, which represents the value to predict, is stored separately in a variable called y.

Inspect Data Shapes:
The shapes (dimensions) of X and y are checked to confirm the correct number of rows and columns in each. This step ensures that inputs and targets align correctly.

Reverse Time Order of Features:
The columns in X are reversed. This flips the time order of the lagged features so that the most recent day appears first, and the oldest day appears last. This may be preferred depending on how the neural network expects input sequences.

Create Independent Copy:
A deep copy of the reversed data is created to ensure it does not share memory with the original data. This avoids unintentional modifications to the original feature matrix.

Compute Split Index:
A split index is calculated as 90% of the total number of samples. This index will determine where the dataset is divided between training and testing data.

Split into Training and Testing Sets:
The first 90% of X and y are assigned to X_train and y_train, which will be used for training the model. The remaining 10% are assigned to X_test and y_test, which will be used to evaluate model performance.

Verify Final Shapes:
The shapes of X_train, X_test, y_train, and y_test are checked to ensure that the data has been properly split and remains correctly structured for input into the neural network.
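
A minimal sketch of these steps, assuming the scaled DataFrame is called scaled_df:

	from copy import deepcopy
	import numpy as np

	X = scaled_df.drop(columns=['Close']).to_numpy()  # lagged features only
	y = scaled_df['Close'].to_numpy()                 # prediction target

	X = deepcopy(np.flip(X, axis=1))    # most recent day first

	split_index = int(len(X) * 0.9)     # 90% train / 10% test
	X_train, X_test = X[:split_index], X[split_index:]
	y_train, y_test = y[:split_index], y[split_index:]

	print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)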

Tensor Preparation

Reshape Input Features:
The training and testing feature arrays are reshaped into three-dimensional structures with the shape (samples, 7, 1). This format is commonly required for time-series models such as LSTMs or RNNs in PyTorch. Here, samples is the number of sequences, 7 is the number of time steps (the lagged days), and 1 is the number of features per time step.

Reshape Target Labels:
The target arrays, which contain the values to be predicted (closing prices), are reshaped into a two-dimensional structure with shape (samples, 1). This shape matches the expected output format for regression tasks using neural networks.

Inspect Shapes:
The shapes of all reshaped arrays are printed or examined to ensure that the structure is valid for input into the neural network.

Convert to PyTorch Tensors:
The reshaped arrays are converted from NumPy arrays to PyTorch tensors. During this conversion, they are also cast to 32-bit floating point format (float) to match PyTorch's default data type for training models.

Confirm Final Tensor Shapes:
The shapes of the PyTorch tensors are printed again to verify that the conversion and reshaping steps have preserved the correct dimensions for both input features and output labels.
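
A minimal sketch of the reshaping and tensor conversion described above:

	import torch

	X_train = X_train.reshape((-1, 7, 1))  # (samples, time steps, features)
	X_test = X_test.reshape((-1, 7, 1))
	y_train = y_train.reshape((-1, 1))     # (samples, 1) regression targets
	y_test = y_test.reshape((-1, 1))

	# Convert to float32 tensors (PyTorch's default floating-point type)
	X_train = torch.tensor(X_train).float()
	X_test = torch.tensor(X_test).float()
	y_train = torch.tensor(y_train).float()
	y_test = torch.tensor(y_test).float()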

Custom Dataset Definition for Stock Price Prediction

Import PyTorch Utilities:
The code begins by importing DataLoader and Dataset from torch.utils.data. These classes are used to efficiently handle data loading and batching during model training and evaluation in PyTorch.

Define a Custom Dataset Class:
A new class called StockDataset is defined by extending PyTorch’s Dataset base class. This class provides a custom way to store and access input-output pairs of stock data.

Constructor Method (__init__):
The constructor takes two arguments: X (the input features) and y (the target labels). These are saved as instance variables so that the dataset object can return them later when requested.

Length Method (__len__):
This method returns the total number of samples in the dataset. It allows PyTorch to know how many items can be iterated over or batched from this dataset.

Get Item Method (__getitem__):
This method retrieves a single data sample given an index. It returns a tuple consisting of the X and y values at the specified index. This is used internally by PyTorch when iterating over the dataset.

Create Dataset Instances:
Two instances of the StockDataset class are created: one for the training tensors and one for the test tensors.

These objects will later be used with PyTorch's DataLoader to feed data into the neural network during training and testing.
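
A minimal sketch of the dataset class and the two instances:

	from torch.utils.data import Dataset, DataLoader

	class StockDataset(Dataset):
	    def __init__(self, X, y):
	        self.X = X   # input sequences
	        self.y = y   # target prices

	    def __len__(self):
	        return len(self.X)

	    def __getitem__(self, i):
	        return self.X[i], self.y[i]

	train_dataset = StockDataset(X_train, y_train)
	test_dataset = StockDataset(X_test, y_test)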

DataLoader Setup for Training and Testing

Set Batch Size:
A variable batch_size is assigned the value 16. This determines how many samples will be processed together in each batch during training and testing. Using batches improves training efficiency and model convergence.

Create Training DataLoader:
A DataLoader object named train_loader is created from the training dataset. It uses the specified batch_size and enables shuffling of the data at each epoch. Shuffling helps the model generalize better by preventing it from seeing the data in the same order every time.

Create Testing DataLoader:
Similarly, a DataLoader called test_loader is created from the test dataset. It uses the same batch size, also with shuffling enabled. This allows evaluation of the model on batches of test data, in randomized order.
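
A minimal sketch of the two loaders (many pipelines disable shuffling for the test loader; it is enabled here to match the description above):

	batch_size = 16
	train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
	test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True)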

LSTM Neural Network Model for Stock Price Prediction

Class Definition:
A new class named LSTM is defined, which inherits from PyTorch's nn.Module. This class encapsulates the architecture and forward pass logic for an LSTM-based neural network.

Constructor (__init__):
The constructor method initializes the model's layers and parameters. Inside the constructor, the hidden size and number of stacked layers are stored as attributes, an nn.LSTM layer is created with batch_first=True, and a final nn.Linear layer maps the hidden state to a single predicted value.

Forward Method:
Defines how the input data flows through the network during the forward pass: the input batch is run through the LSTM, and the output of the last time step is passed through the linear layer to produce the prediction.

Model Instantiation:
An instance of the LSTM model is created with an input size of 1 (one feature per time step), along with the chosen hidden size and number of stacked layers.

Move Model to GPU:
The model is transferred to the GPU device to leverage hardware acceleration for training and inference.

Model Summary:
The final statement simply outputs the model architecture summary, showing the layers and parameters.
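
A minimal sketch of such a class (the layer sizes in the instantiation are illustrative, and torch / torch.nn are assumed to be imported as before):

	class LSTM(nn.Module):
	    def __init__(self, input_size, hidden_size, num_stacked_layers):
	        super().__init__()
	        self.hidden_size = hidden_size
	        self.num_stacked_layers = num_stacked_layers
	        self.lstm = nn.LSTM(input_size, hidden_size, num_stacked_layers,
	                            batch_first=True)
	        self.fc = nn.Linear(hidden_size, 1)

	    def forward(self, x):
	        batch_size = x.size(0)
	        # Fresh zero-initialized hidden and cell states for each batch
	        h0 = torch.zeros(self.num_stacked_layers, batch_size,
	                         self.hidden_size, device=x.device)
	        c0 = torch.zeros(self.num_stacked_layers, batch_size,
	                         self.hidden_size, device=x.device)
	        out, _ = self.lstm(x, (h0, c0))
	        return self.fc(out[:, -1, :])  # prediction from the last time step

	device = 'cuda' if torch.cuda.is_available() else 'cpu'
	model = LSTM(input_size=1, hidden_size=4, num_stacked_layers=1).to(device)
	print(model)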

Optimizer and Loss Function Initialization

Define the Optimizer:
The Adam optimizer from PyTorch's torch.optim module is instantiated. It is configured to optimize the parameters of the neural network model. The learning rate is set to 0.001, which controls how much the model weights are updated during training with respect to the gradient.

Define the Loss Function:
The mean squared error loss function (nn.MSELoss()) is created. This loss measures the average squared difference between the predicted values and the actual target values, which is appropriate for regression tasks such as stock price prediction.
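
A minimal sketch of these two definitions:

	learning_rate = 0.001
	optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
	loss_function = nn.MSELoss()   # mean squared error for regression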

Function to Train the Model for One Epoch

Function Definition:
Defines a function named train_one_epoch which accepts the current epoch number as an argument. This function performs one full pass (epoch) of training over the entire training dataset.

Set Model to Training Mode:
Calls model.train(True) to set the model in training mode. This enables layers like dropout and batch normalization to behave correctly during training.

Initialize Loss Tracking:
Initializes a variable to accumulate the running loss over batches, starting at zero.

Print Epoch Number:
Displays the current epoch number (incremented by 1 for human readability).

Iterate Over Training Batches:
Loops over batches of data from train_loader. For each batch, the inputs and labels are moved to the device, a forward pass produces predictions, the loss is computed and added to the running loss, old gradients are cleared, backpropagation computes new gradients, and the optimizer updates the weights (see the sketch after this section).

Periodic Loss Reporting:
Every 100 batches, calculates the average loss over the last 100 batches and prints it for monitoring training progress. Then resets the running loss to zero to start accumulating for the next set of batches.

Final Newline:
Prints a blank line at the end of the epoch for cleaner output formatting.
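
A minimal sketch of the whole function, assuming the model, loss_function, optimizer, device, and train_loader defined earlier:

	def train_one_epoch(epoch):
	    model.train(True)                 # training mode
	    running_loss = 0.0
	    print(f'Epoch: {epoch + 1}')
	    for batch_index, (x_batch, y_batch) in enumerate(train_loader):
	        x_batch = x_batch.to(device)
	        y_batch = y_batch.to(device)
	        output = model(x_batch)                  # forward pass
	        loss = loss_function(output, y_batch)
	        running_loss += loss.item()
	        optimizer.zero_grad()
	        loss.backward()                          # backpropagation
	        optimizer.step()                         # weight update
	        if batch_index % 100 == 99:              # every 100 batches
	            print(f'Batch {batch_index + 1}, Loss: {running_loss / 100:.3f}')
	            running_loss = 0.0
	    print()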

Function to Validate the Model for One Epoch

Function Definition:
Defines a function named validate_one_epoch that performs a single pass through the validation dataset to evaluate model performance without updating model weights.

Set Model to Evaluation Mode:
Calls model.train(False) to set the model into evaluation mode. This disables certain layers like dropout and batch normalization that behave differently during training.

Initialize Loss Tracking:
Sets a variable to accumulate the total validation loss, starting at zero.

Iterate Over Validation Batches:
Loops through batches from test_loader. For each batch, the inputs and labels are moved to the device, predictions and the loss are computed inside a torch.no_grad() block, and the loss is added to the running total (see the sketch after this section).

Compute Average Validation Loss:
After iterating over all batches, calculates the average validation loss by dividing the total loss by the number of batches.

Print Validation Loss:
Displays the average validation loss formatted to three decimal places to monitor how well the model is performing on unseen data.

Formatting Output:
Prints a separator line and a blank line to visually separate validation output from other logs for clarity.
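
A minimal sketch of the validation function, under the same assumptions:

	def validate_one_epoch():
	    model.train(False)                # evaluation mode
	    running_loss = 0.0
	    for x_batch, y_batch in test_loader:
	        x_batch = x_batch.to(device)
	        y_batch = y_batch.to(device)
	        with torch.no_grad():         # no gradients during validation
	            output = model(x_batch)
	            loss = loss_function(output, y_batch)
	            running_loss += loss.item()
	    avg_loss = running_loss / len(test_loader)
	    print(f'Val Loss: {avg_loss:.3f}')
	    print('*' * 25)
	    print()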

Training Loop, Prediction, and Plotting Results

Set Number of Epochs:
A variable num_epochs is set to 10, indicating that the training and validation processes will run for 10 complete cycles over the dataset.

Training and Validation Loop:
A for loop iterates over the number of epochs. For each epoch, train_one_epoch() is called to update the model on the training data, followed by validate_one_epoch() to measure performance on the test data.

This sequential training and validation allows monitoring of learning progress and helps detect overfitting.

Make Predictions:
After training completes, a block within torch.no_grad() disables gradient calculations to save memory and computation during inference, and the model predicts outputs for the entire training dataset in a single forward pass.

Plot Actual vs. Predicted:
Using the matplotlib library, two line plots are created on the same graph: one for the actual values and one for the model's predictions.

The plot is labeled with axis titles and includes a legend for clarity.

Display and Save the Plot:
The plot is displayed interactively with plt.show(), allowing the user to visually assess model performance. Then, the plot image is saved to a file named plot_spp-lSTM_predicted.jpg inside an images directory for future reference or reporting.

Inverse Transforming Scaled Values and Plotting Results

Flatten Predicted Values:
The predicted values from the model are flattened into a one-dimensional array for easier manipulation.

Create Dummy Array for Inverse Scaling:
A new NumPy array named dummies is created with zeros. Its shape matches the number of training samples and includes extra columns equal to the number of days plus one. This array will be used as input to the scaler’s inverse transform method, which expects the full feature set.

Insert Predictions into Dummy Array:
The first column of the dummy array is filled with the flattened predicted values. This allows the scaler to inverse transform the data properly by treating the predictions as the 'Close' feature while ignoring other columns.

Inverse Transform Predictions:
The dummy array is passed to the scaler’s inverse_transform method, which converts the scaled predictions back to their original scale (actual stock prices).

Extract Actual Predictions:
The first column of the inversely transformed array is copied as the final array of predicted closing prices in original scale.

Repeat for Actual Target Values:
A similar process is applied to the actual target values: a fresh dummy array is filled with the true (scaled) closing prices, inverse transformed, and its first column extracted, as shown in the sketch below.
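
A minimal sketch of the dummy-array trick for both predictions and targets (variable names such as predicted are assumptions):

	from copy import deepcopy
	import numpy as np

	# Model outputs for the training set, as a NumPy array
	predicted = model(X_train.to(device)).detach().cpu().numpy()

	# Predictions back to the original price scale via a dummy feature matrix
	dummies = np.zeros((X_train.shape[0], n_days + 1))
	dummies[:, 0] = predicted.flatten()    # predictions fill the 'Close' column
	train_predictions = deepcopy(scaler.inverse_transform(dummies)[:, 0])

	# Same trick for the actual (scaled) targets
	dummies = np.zeros((X_train.shape[0], n_days + 1))
	dummies[:, 0] = y_train.flatten().numpy()
	new_y_train = deepcopy(scaler.inverse_transform(dummies)[:, 0])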

Plot Actual vs. Predicted Closing Prices:
Using matplotlib, the actual closing prices and predicted closing prices are plotted on the same graph with proper labels, axis titles, and a legend.

Display and Save Plot:
The plot is displayed interactively and then saved as an image file named plot_spp-lSTM-predicted2.jpg inside the images directory for documentation or later review.

Generating and Inverse Transforming Test Predictions

Generate Test Predictions:
The model is used to generate predictions for the test dataset. The test input features are moved to the GPU for fast computation. The output tensor is detached from the computation graph (to prevent gradient tracking), moved back to the CPU, converted to a NumPy array, and flattened into a 1D array.

Create Dummy Array for Inverse Transformation:
A zero-filled NumPy array named dummies is created with rows equal to the number of test samples and columns equal to the number of days plus one. This array is used to accommodate the scaler’s expected input shape during inverse scaling.

Insert Predictions for Inverse Scaling:
The flattened test predictions are assigned to the first column of the dummy array. This setup allows the scaler to inverse transform only the relevant feature (the predicted 'Close' values).

Inverse Transform the Predicted Values:
The dummy array is passed to the scaler’s inverse_transform method, converting the scaled predictions back to their original scale. The first column of this inversely transformed array is deep-copied into test_predictions, now representing the predicted closing prices in the original scale.

Print Test Predictions:
The final array of test predictions is printed to the console for inspection.

Prepare Actual Test Targets for Inverse Scaling:
A similar dummy array of zeros is created for the actual test target values. Since the targets are stored as tensors on the GPU, they are flattened, moved to the CPU, and converted to NumPy arrays before assignment to the dummy array.

Inverse Transform Actual Test Targets:
The dummy array containing the actual scaled target values is inverse transformed back to the original scale, and the results are copied into new_y_test, representing the true closing prices for the test set in their original scale.

Plotting Actual and Predicted Stock Closing Prices

Plot Actual Closing Prices:
The actual closing prices stored in new_y_test are plotted as a line graph. A label “Actual Close” is added for the legend.

Plot Predicted Closing Prices:
The predicted closing prices stored in test_predictions are plotted on the same graph for comparison, with the label “Predicted Close”.

Label Axes:
The x-axis is labeled “Day” to indicate the time sequence, and the y-axis is labeled “Close” to represent stock closing prices.

Add Legend:
A legend is displayed on the plot to distinguish between the actual and predicted lines.

Display the Plot:
The plot is shown interactively, allowing visual assessment of model performance.

Save the Plot:
Finally, the plot is saved as an image file named plot_spp-lSTM_predicted3.jpg inside the images directory for documentation or future reference.

That is the end of the program.

Application 3

Project Workflow

Fetch Stock Data:
Use the yfinance library to download historical stock data. The data includes Open, Close, High, and Low prices.

Visualize Stock Trends:
Create plots for Open and Close prices over time.

Calculate Moving Averages:
Compute 100-day, 200-day, and 300-day moving averages.

Split Data:
Divide the data into training (80%) and testing (20%) datasets.

Preprocess the Data:
Scale the stock prices to the range [0, 1] using MinMaxScaler and create sliding window sequences of the past 100 days as input for the model.
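
A minimal sketch of the sliding-window construction, assuming scaled_prices is the scaled 1-D array of closing prices:

	import numpy as np

	def make_sequences(prices, window=100):
	    X, y = [], []
	    for i in range(window, len(prices)):
	        X.append(prices[i - window:i])  # past 100 days as input
	        y.append(prices[i])             # next day's price as target
	    return np.array(X), np.array(y)

	X_train, y_train = make_sequences(scaled_prices)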

Define the LSTM Model:
Implement the LSTM model in PyTorch. The model takes 100-day sequences of stock prices as input and predicts the next day's price.

Convert Data to PyTorch Tensors:
Prepare the training data and convert the NumPy arrays to PyTorch tensors.

Train the Model:
Optimize the model using Mean Squared Error Loss (MSE).

Test the Model:
Evaluate the model on the test data, make predictions, and compare them against actual values.

Visualize Results:
Plot the predicted prices against the actual prices.

Predict Next-Day Price:
Use the most recent 100 days of data to predict the stock's next day closing price.

Compare Predictions:
Compare the most recent actual closing price to the predicted next day's price and calculate the percentage difference.
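
A minimal sketch of this comparison (predicted_price and last_close are placeholder names):

	pct_diff = (predicted_price - last_close) / last_close * 100
	print(f"Last close: {last_close:.2f}")
	print(f"Predicted next-day close: {predicted_price:.2f} ({pct_diff:+.2f}%)")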