freeradiantbunny.org

normalization

Scaling in Stock Price Prediction with PyTorch

Common Scaling Techniques

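Min-max scaling (rescales each feature to the range [0, 1]):
X_scaled = (X - X_min) / (X_max - X_min)

Standardization (z-score; zero mean and unit standard deviation, where μ is the feature mean and σ its standard deviation):
X_scaled = (X - μ) / σ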

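For financial time series, standardization is typically preferred. Min-max scaling is pinned to the observed minimum and maximum, so a single outlier, or a price that later moves outside the training range, distorts the scale. Standardization also puts features with very different ranges (price, RSI, percentage changes) on a comparable footing.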
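
To make the comparison concrete, here is a minimal sketch of both formulas applied to one small tensor. The price values are made up for illustration, and the last one is a deliberate outlier; nothing here comes from real market data.

```python
import torch

# Hypothetical daily closing prices; the last value is a deliberate outlier.
prices = torch.tensor([100.0, 102.0, 98.0, 101.0, 150.0])

# Min-max scaling: maps the observed minimum to 0 and the maximum to 1.
min_max_scaled = (prices - prices.min()) / (prices.max() - prices.min())

# Standardization: zero mean and unit (population) standard deviation.
standardized = (prices - prices.mean()) / prices.std(unbiased=False)

print(min_max_scaled)
print(standardized)
```

With min-max scaling, the outlier alone sets the top of the range, so the four ordinary prices all land below 0.1. Standardization is still influenced by the outlier through the mean and standard deviation, but its output is not confined to a fixed interval.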

How to Effectively Scale Data in PyTorch

Assuming a dataset with multiple features for stock price prediction:

Step 1: Calculate Scaling Parameters on Training Data Only

    import torch

    # Example tensor: rows = samples, columns = features
    train_features = torch.tensor([
        [100.0, 0.7, 30.0],    # price, RSI, volume_change_pct
        [102.0, 0.8, 25.0],
        [98.0, 0.6, 40.0],
        # ... more training samples
    ])

    # Compute mean and std only on training set for each feature
    mean = train_features.mean(dim=0)
    std = train_features.std(dim=0, unbiased=False)  # population std for consistency
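
Step 1 presumes the data has already been split into training and validation parts. For time series, one common way to do that is a chronological split with no shuffling. The sketch below assumes the rows are ordered by time and uses an 80/20 split; the random data, the split ratio, and the names `all_features`, `train_part`, and `val_part` are illustrative assumptions, not part of the original example.

```python
import torch

# Hypothetical feature matrix: 10 time-ordered rows, 3 features.
torch.manual_seed(0)
all_features = torch.randn(10, 3) * 5.0 + 100.0

# Chronological split (no shuffling): first 80% for training, rest for validation.
split_idx = int(len(all_features) * 0.8)
train_part = all_features[:split_idx]
val_part = all_features[split_idx:]

# Statistics come from the training rows only, so nothing from the
# later (validation) period leaks into the scaling parameters.
train_mean = train_part.mean(dim=0)
train_std = train_part.std(dim=0, unbiased=False)

print(train_part.shape, val_part.shape)
print(train_mean, train_std)
```

Computing the statistics before the split, on `all_features`, would let information from the validation period influence the training inputs.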
  

Step 2: Apply Scaling to Train and Validation/Test Sets


    def standardize(tensor, mean, std):
        return (tensor - mean) / std

    train_scaled = standardize(train_features, mean, std)

    # When you get validation or test data, reuse the training mean and std:
    val_features = torch.tensor([
        [101.0, 0.75, 28.0],
        # ...
    ])
    val_scaled = standardize(val_features, mean, std)
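
A model trained on standardized inputs and targets produces outputs in scaled units, so predictions usually need to be mapped back to prices. The following is a rough sketch of that round trip. The `unstandardize` helper and the `eps` argument are hypothetical additions for illustration: `eps` guards against division by zero when a feature is constant in the training set, which the plain `standardize` above does not handle.

```python
import torch

def standardize(tensor, mean, std, eps=1e-8):
    # eps (an added assumption) avoids division by zero for constant features.
    return (tensor - mean) / (std + eps)

def unstandardize(scaled, mean, std, eps=1e-8):
    # Inverse of standardize above: maps scaled values back to original units.
    return scaled * (std + eps) + mean

train_features = torch.tensor([
    [100.0, 0.7, 30.0],    # price, RSI, volume_change_pct
    [102.0, 0.8, 25.0],
    [98.0, 0.6, 40.0],
])
mean = train_features.mean(dim=0)
std = train_features.std(dim=0, unbiased=False)

train_scaled = standardize(train_features, mean, std)
restored = unstandardize(train_scaled, mean, std)

# The round trip should reproduce the original values up to float error.
print(torch.allclose(restored, train_features, atol=1e-4))
```

To convert a predicted price alone, the same idea applies with only the price column's statistics, for example `mean[0]` and `std[0]` in this layout.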
  

Important Notes