freeradiantbunny.org

Scaling in Stock Price Prediction with PyTorch

Common Scaling Techniques

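For financial time series, standardization (subtracting each feature's mean and dividing by its standard deviation) is typically preferred: it is less distorted by outliers than scaling to a fixed range, and it puts features with very different ranges on a comparable scale.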
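
For reference, standardization is usually written as a per-feature z-score. The symbols here are introduced for illustration and are not from the original post: x_{ij} is feature j of sample i, and \mu_j and \sigma_j are the mean and standard deviation of feature j.

```latex
z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}
```

After this transform each feature has mean 0 and standard deviation 1 on the data used to compute \mu_j and \sigma_j.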

How to Effectively Scale Data in PyTorch

Assuming a dataset with multiple features for stock price prediction:

Step 1: Calculate Scaling Parameters on Training Data Only

import torch

# Example tensor: rows = samples, columns = features
train_features = torch.tensor([
    [100.0, 0.7, 30.0],    # price, RSI, volume_change_pct
    [102.0, 0.8, 25.0],
    [98.0, 0.6, 40.0],
    # ... more training samples
])

# Compute mean and std on the training set only, one value per feature (column)
mean = train_features.mean(dim=0)
std = train_features.std(dim=0, unbiased=False)  # population std (divides by N, not N - 1)
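
For time series, "training data only" usually implies a chronological split, so that no later observation influences the scaling parameters. The following is a minimal sketch under stated assumptions: the synthetic `all_features` tensor, the `train_frac` value of 0.8, and the variable names are hypothetical and not part of the original example.

```python
import torch

torch.manual_seed(0)

# Hypothetical time-ordered data: 10 samples, 3 features
# (price, RSI, volume_change_pct), oldest row first.
all_features = (
    torch.randn(10, 3) * torch.tensor([5.0, 0.1, 10.0])
    + torch.tensor([100.0, 0.7, 30.0])
)

# Assumed split ratio: first 80% of rows for training, the rest for validation.
train_frac = 0.8
split_idx = int(len(all_features) * train_frac)

# Slice by position rather than shuffling, so training rows precede validation rows.
train_features = all_features[:split_idx]
val_features = all_features[split_idx:]

# Scaling parameters come from the training slice only.
mean = train_features.mean(dim=0)
std = train_features.std(dim=0, unbiased=False)
```

Shuffling before splitting would let later rows leak into the training statistics, which is why the sketch slices by position.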
      

Step 2: Apply Scaling to Train and Validation/Test Sets

def standardize(tensor, mean, std):
    # The per-feature mean and std broadcast across all rows
    return (tensor - mean) / std

train_scaled = standardize(train_features, mean, std)

# Validation and test data reuse the training-set mean and std:
val_features = torch.tensor([
    [101.0, 0.75, 28.0],
    # ...
])
val_scaled = standardize(val_features, mean, std)
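
Two practical extensions of this step are sketched below: guarding against a feature whose standard deviation is zero, and mapping scaled values back to original units (useful when the model predicts a scaled price). The `eps` parameter and the `unstandardize` helper are assumptions added for illustration, not part of the original code.

```python
import torch

def standardize(tensor, mean, std, eps=1e-8):
    # eps is an assumed small constant that prevents division by zero
    # when a feature is constant in the training set (std == 0).
    return (tensor - mean) / (std + eps)

def unstandardize(scaled, mean, std, eps=1e-8):
    # Hypothetical inverse helper: undoes standardize() with the same eps.
    return scaled * (std + eps) + mean

train_features = torch.tensor([
    [100.0, 0.7, 30.0],    # price, RSI, volume_change_pct
    [102.0, 0.8, 25.0],
    [98.0, 0.6, 40.0],
])

mean = train_features.mean(dim=0)
std = train_features.std(dim=0, unbiased=False)

train_scaled = standardize(train_features, mean, std)

# Round trip back to original units, up to floating-point error.
restored = unstandardize(train_scaled, mean, std)
```

The same `mean`, `std`, and `eps` must be used in both directions; otherwise the round trip does not recover the original values.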
      

Important Notes