Training on multivariate data sets

Question

I have the following task: train on 3 consecutive days to predict the 4th day. Each day's data is one CSV file with dimensions 24x25, and every data point in a CSV file is a pixel value. I am required to use regression models (linear, ridge) and an LSTM.

This is how I prepare the 3 days of training data for each model:

For the regression models: after flattening each day's data, I transpose the stacked data to get shape (600, 3).

For the LSTM model: after flattening each day's data, I keep the stacked data as it is, with shape (3, 600).

For example:

import numpy as np

# Illustrative values only; "..." marks elided entries
day_1 = [0.1, 0.2, ..., 0.6]  # 600 features for Day 1
day_2 = [0.15, 0.25, ..., 0.65]  # 600 features for Day 2
day_3 = [0.2, 0.3, ..., 0.7]  # 600 features for Day 3
X_train_linear_ridge = np.array([
    [0.1, 0.15, 0.2],      # Feature 1 across Day 1, Day 2, Day 3
    [0.2, 0.25, 0.3],      # Feature 2 across Day 1, Day 2, Day 3
    # ...
    [0.6, 0.65, 0.7]       # Feature 600 across Day 1, Day 2, Day 3
])  # Shape: (600, 3)
X_train_lstm = np.array([
    [0.1, 0.2, ..., 0.6],   # Day 1 features
    [0.15, 0.25, ..., 0.65], # Day 2 features
    [0.2, 0.3, ..., 0.7]    # Day 3 features
])  # Shape: (3, 600)
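
To make the two layouts concrete, here is a minimal sketch of how I build them from three 24x25 arrays. The random arrays are only stand-ins for my CSV files (not real data), and the variable names are just for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for three daily CSV files, each 24x25 (random values, not real data)
days = [rng.random((24, 25)) for _ in range(3)]

# Flatten each day into a vector of 600 pixel values
flat = [d.reshape(-1) for d in days]

# Rows are days, columns are pixels
X_lstm_layout = np.stack(flat, axis=0)       # shape (3, 600)

# Rows are pixels, columns are days
X_regression_layout = X_lstm_layout.T        # shape (600, 3)

print(X_lstm_layout.shape, X_regression_layout.shape)
```

The transpose only swaps the roles of rows and columns; both arrays hold exactly the same 1800 values.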

Could anyone tell me whether preparing the data with shape (600, 3) for the regression models and shape (3, 600) for the LSTM is conceptually correct?

My motivation:

LSTM: LSTMs are designed to process data with a sequential, temporal relationship. By feeding data with shape (3, 600) (representing 3 time steps, each with 600 features), the LSTM can learn patterns across the sequence. Each time step corresponds to a day in the data, while the 600 features represent individual values for that day. This structure is essential for the LSTM to leverage temporal dependencies.
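
As far as I understand, LSTM layers in Keras (and PyTorch's nn.LSTM with batch_first=True) take 3-D input of shape (batch, timesteps, features), so a single (3, 600) array would be one sample. Below is a rough sketch of how I think several 3-day windows could be stacked, assuming I have more than 4 days available. The value n_days = 10 and the random data are hypothetical placeholders.

```python
import numpy as np

n_days, n_pixels, window = 10, 600, 3   # n_days is a hypothetical count
rng = np.random.default_rng(0)

# Placeholder for the flattened daily data: one row per day
series = rng.random((n_days, n_pixels))

# Each sample is 3 consecutive days; the target is the day right after them
X = np.stack([series[i:i + window] for i in range(n_days - window)])
y = series[window:]

print(X.shape, y.shape)   # (7, 3, 600) (7, 600)
```

With only the 3 training days and 1 target day from my task, this would reduce to a single sample of shape (1, 3, 600).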

Linear and Ridge Regression: These models lack the inherent sequential processing capability. They interpret each input as a single flat vector. To approximate sequential learning, we can treat each day's data as a separate feature, creating a setup with shape (600, 3), where the 600 features are "stacked" over the 3 days. Each day becomes a feature for the regression model, but it cannot capture temporal dependencies as an LSTM does.
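
To show what the (600, 3) layout means for the regression models, here is a minimal sketch, assuming the usual (n_samples, n_features) convention: each pixel is one sample, and its 3 features are its values on Day 1 to Day 3. I use closed-form least squares and ridge without an intercept in plain NumPy rather than a library estimator. The random data and alpha = 1.0 are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.random((600, 3))   # placeholder: 600 pixels x 3 days
y = rng.random(600)        # placeholder: the same 600 pixels on Day 4

# Ordinary least squares: one weight per day, shared by all pixels
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Ridge regression (no intercept): w = (X^T X + alpha * I)^(-1) X^T y
alpha = 1.0                # hypothetical regularization strength
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)

# Predicted Day 4 value for every pixel
y_pred = X @ w_ridge

print(w_ols.shape, w_ridge.shape, y_pred.shape)
```

In this layout the model learns only 3 weights, which are applied identically to every pixel. Concatenating the three days into one row of 1800 values would instead give one sample per 3-day window.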

Is my understanding of the input shapes correct for the regression models and the LSTM?
