Training a Neural Network from Scratch, Part 4: Model Tuning

2024-08-18

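In the previous part we trained several models with different algorithms to predict the gold price trend, and the LSTM model fit the data most accurately.

We then raised two questions:

How do we build a hybrid model that combines multiple algorithms?
How do we deploy the best model and test it in practice?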

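How do we build a hybrid model that combines multiple algorithms?

The idea behind a hybrid model is to combine the predictions of several models in the hope of outperforming any single one. Common approaches include:

Weighted averaging: average the models' predictions with weights based on their validation-set performance. The weights can be set inversely to each model's mean squared error (MSE), so that a model with lower MSE gets a higher weight.
Stacking: feed the models' predictions into a meta-model (such as linear regression, a decision tree, or an SVM) that learns how best to combine them.
Voting: for classification tasks, the final prediction can be decided by majority vote.

For our scenario, weighted averaging is a suitable way to build the hybrid model, because the target, the gold price, is a continuous time series and every model we use is a regression model.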

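Implementation approach

Train multiple models: train LSTM, GRU, CNN, XGBoost, RandomForest, SVM, and MLP models separately, and obtain each model's predictions and MSE on the validation set.
Compute the weighted average: weight each model's predictions by the inverse of its MSE, so that lower-MSE models count for more.
Generate the final prediction: apply the weighted average to the test-set predictions and compute the final MSE.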
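
The weighting step on its own can be sketched as follows. This is a minimal illustration rather than the code used in this article: `inverse_mse_weights` and `weighted_average` are hypothetical helper names, and the MSE values and predictions are made up for the example.

```python
import numpy as np


def inverse_mse_weights(mses):
    """Return weights proportional to 1/MSE, normalized to sum to 1."""
    inv = 1.0 / np.asarray(mses, dtype=float)
    return inv / inv.sum()


def weighted_average(predictions, weights):
    """Blend same-shaped prediction arrays (one per model) with the given weights."""
    stacked = np.stack(predictions, axis=0)        # (n_models, n_samples, horizon)
    return np.tensordot(weights, stacked, axes=1)  # (n_samples, horizon)


# Hypothetical validation MSEs for two models
weights = inverse_mse_weights([0.1, 0.4])

# Hypothetical predictions: 2 samples, 3 forecast steps each
pred_a = np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
pred_b = np.array([[2.0, 2.0, 2.0], [3.0, 3.0, 3.0]])
blended = weighted_average([pred_a, pred_b], weights)

print(weights)
print(blended)
```

With these made-up numbers the model whose MSE is four times lower receives four times the weight (0.8 against 0.2), so the blend stays close to its predictions.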

The implementation is as follows:

import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Conv1D, Dense, Dropout, Flatten
import numpy as np
from data_solve import main as data_preprocessing_main

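# Configure Matplotlib with a Chinese font so that Chinese text in the plots renders correctly
plt.rcParams['font.sans-serif'] = ['SimHei']  # SimHei, a Chinese sans-serif font
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly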

def build_model(model_type, input_shape=None, units=50, dropout_rate=0.2):
    if model_type in ['LSTM', 'GRU', 'CNN']:
        model = Sequential()
        if model_type == 'LSTM':
            model.add(LSTM(units=units, return_sequences=True, input_shape=input_shape))
            model.add(Dropout(dropout_rate))
            model.add(LSTM(units=units, return_sequences=False))
        elif model_type == 'GRU':
            model.add(GRU(units=units, return_sequences=True, input_shape=input_shape))
            model.add(Dropout(dropout_rate))
            model.add(GRU(units=units, return_sequences=False))
        elif model_type == 'CNN':
            model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=input_shape))
            model.add(Dropout(dropout_rate))
            model.add(Flatten())
        model.add(Dropout(dropout_rate))
        model.add(Dense(units=3))  # one output per forecast step (3 steps)
        model.compile(optimizer='adam', loss='mean_squared_error')
        return model
    elif model_type == 'XGBoost':
        return [XGBRegressor(objective='reg:squarederror') for _ in range(3)]
    elif model_type == 'RandomForest':
        return [RandomForestRegressor(n_estimators=100, random_state=42) for _ in range(3)]
    elif model_type == 'SVM':
        return [SVR(kernel='rbf') for _ in range(3)]
    elif model_type == 'MLP':
        return [MLPRegressor(hidden_layer_sizes=(units, units), max_iter=500) for _ in range(3)]
    else:
        raise ValueError(f"未知的模型类型: {model_type}")

def evaluate_model(model, X_test, y_test, model_type):
    if model_type in ['LSTM', 'GRU', 'CNN']:
        predictions = model.predict(X_test)
    else:
        X_test_flat = X_test.reshape(X_test.shape[0], -1)
        predictions = np.zeros(y_test.shape)
        for i in range(3):
            predictions[:, i] = model[i].predict(X_test_flat)

    mse = mean_squared_error(y_test, predictions)
    print(f"测试集上的均方误差: {mse}")

    plt.figure(figsize=(14, 7))
    for i in range(predictions.shape[1]):  # plot predicted vs. true values for each forecast step
        plt.plot(y_test[:, i], label=f'真实值 - 时间步 {i + 1}')
        plt.plot(predictions[:, i], label=f'预测值 - 时间步 {i + 1}')
    plt.title(f'金价预测 vs 真实值 ({model_type})')
    plt.xlabel('时间步')
    plt.ylabel('金价')
    plt.legend()
    plt.show()

    return mse, predictions

def model_search(X_train, y_train, X_val, y_val, X_test, y_test,
                 model_types=['LSTM', 'GRU', 'CNN', 'XGBoost', 'RandomForest', 'SVM', 'MLP'], epochs=50, batch_size=32):
    model_results = []
    for model_type in model_types:
        print(f"训练模型: {model_type}")
        model = build_model(model_type, input_shape=(X_train.shape[1], X_train.shape[2]))

        if model_type in ['LSTM', 'GRU', 'CNN']:
            model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=epochs, batch_size=batch_size, verbose=0)
        else:
            X_train_flat = X_train.reshape(X_train.shape[0], -1)
            X_val_flat = X_val.reshape(X_val.shape[0], -1)
            for i in range(3):  # fit one regressor per forecast step
                model[i].fit(X_train_flat, y_train[:, i])

        mse, predictions = evaluate_model(model, X_test, y_test, model_type)
        model_results.append((model_type, model, mse))

    return model_results

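def combine_models(model_results, X_test):
    # NOTE: the MSEs in model_results were measured on the test set (see model_search).
    # Weighting by validation-set MSE instead would keep test data out of the weights.
    inv_mse_sum = sum(1/mse for _, _, mse in model_results)
    # Allocate the output directly so this works whatever the first model's type is
    combined_predictions = np.zeros((X_test.shape[0], 3))  # 3 forecast steps
    X_test_flat = X_test.reshape(X_test.shape[0], -1)

    for _, model, mse in model_results:
        if isinstance(model, list):  # scikit-learn / XGBoost: one regressor per step
            predictions = np.zeros((X_test.shape[0], 3))
            for i in range(3):
                predictions[:, i] = model[i].predict(X_test_flat)
        else:  # Keras neural network
            predictions = model.predict(X_test)

        weight = (1/mse) / inv_mse_sum  # inverse-MSE weight, normalized to sum to 1
        combined_predictions += weight * predictions

    return combined_predictions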

def evaluate_combined_model(y_test, combined_predictions):
    mse = mean_squared_error(y_test, combined_predictions)
    print(f"混合模型的均方误差: {mse}")

    plt.figure(figsize=(14, 7))
    for i in range(combined_predictions.shape[1]):  # plot predicted vs. true values for each forecast step
        plt.plot(y_test[:, i], label=f'真实值 - 时间步 {i + 1}')
        plt.plot(combined_predictions[:, i], label=f'混合预测值 - 时间步 {i + 1}')
    plt.title('金价预测 vs 真实值 (混合模型)')
    plt.xlabel('时间步')
    plt.ylabel('金价')
    plt.legend()
    plt.show()

# Entry point
if __name__ == "__main__":
    X_train, X_val, X_test, y_train, y_val, y_test = data_preprocessing_main(
        file_path='gold_price_data_extracted.csv',
        window_size=10,
        prediction_horizon=3,
        normalization_method='minmax',
        train_size=0.7,
        validation_size=0.1,
        drop_columns=['Adj', 'Volume'],
        indicators=['MA', 'RSI', 'MACD', 'Bollinger', 'ATR']
    )

    model_results = model_search(X_train, y_train, X_val, y_val, X_test, y_test)

    combined_predictions = combine_models(model_results, X_test)

    evaluate_combined_model(y_test, combined_predictions)

    best_model_type, best_model, _ = min(model_results, key=lambda x: x[2])
    # Only Keras models are saved here, so report success only when a file was written
    if best_model_type in ['LSTM', 'GRU', 'CNN']:
        best_model.save(f'best_model_{best_model_type}.h5')
        print(f"最佳模型 {best_model_type} 已保存。")

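The output is as follows (the script prints in Chinese: 训练模型 is "training model", 测试集上的均方误差 is "MSE on the test set", and 混合模型的均方误差 is "MSE of the hybrid model"):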

2024-08-18 22:45:34.472344: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-08-18 22:45:34.999176: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-08-18 22:45:36,762 - INFO - 数据加载完成
2024-08-18 22:45:36,762 - INFO - 原始数据包含 3474 行，7 列
2024-08-18 22:45:36,774 - INFO - 数据清理完成
2024-08-18 22:45:36,774 - INFO - 清理后的数据包含 3474 行，5 列
2024-08-18 22:45:36,776 - INFO - 数据预览:
           Date    Open    High     Low   Close
3473 2010-10-05  134.10  134.10  134.10  134.03
3472 2010-10-06  134.10  135.00  134.10  134.77
3471 2010-10-19  136.98  136.98  132.93  133.60
3470 2010-10-20  133.43  134.70  133.38  134.42
3469 2010-10-21  134.32  134.72  131.88  132.56
2024-08-18 22:45:36,776 - INFO - 生成技术指标: ['MA', 'RSI', 'MACD', 'Bollinger', 'ATR']
2024-08-18 22:45:36,795 - INFO - 技术指标生成完成
2024-08-18 22:45:36,798 - INFO - 数据归一化/标准化完成，方法: minmax
2024-08-18 22:45:36,909 - INFO - 滑动窗口生成特征和标签，窗口大小: 10, 预测范围: 3
2024-08-18 22:45:36,910 - INFO - 数据集划分完成: 训练集大小=(2407, 10, 11), 验证集大小=(343, 10, 11), 测试集大小=(689, 10, 11)
2024-08-18 22:45:36,910 - INFO - 训练集大小: (2407, 10, 11), 验证集大小: (343, 10, 11), 测试集大小: (689, 10, 11)
2024-08-18 22:45:36.913249: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
训练模型: LSTM
E:\ProgramData\anaconda3\envs\goldenpre\Lib\site-packages\keras\src\layers\rnn\rnn.py:204: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
测试集上的均方误差: 0.00027118924039005093
训练模型: GRU
E:\ProgramData\anaconda3\envs\goldenpre\Lib\site-packages\keras\src\layers\rnn\rnn.py:204: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step
测试集上的均方误差: 0.0002801844536378178
训练模型: CNN
E:\ProgramData\anaconda3\envs\goldenpre\Lib\site-packages\keras\src\layers\convolutional\base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step 
测试集上的均方误差: 0.0003564422049939142
训练模型: XGBoost
测试集上的均方误差: 0.013671010934760258
训练模型: RandomForest
测试集上的均方误差: 0.013248586110257024
训练模型: SVM
测试集上的均方误差: 0.04974934214151624
训练模型: MLP
测试集上的均方误差: 0.0011011161628656963
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 952us/step
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 905us/step
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 1000us/step
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 524us/step
混合模型的均方误差: 0.0003560467854642135
2024-08-18 22:46:58,301 - WARNING - You are saving your model as an HDF5 file via `model.save()` or `keras.saving.save_model(model)`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(model, 'my_model.keras')`. 
最佳模型 LSTM 已保存。

Process finished with exit code 0

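A few things stand out:

The LSTM model's MSE on the test set is 0.00027118924039005093, while the final hybrid model's MSE is 0.0003560467854642135, which is actually worse than LSTM alone.
The training results fluctuate considerably from run to run (observed over multiple runs of my own).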
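
Because the results vary from run to run, a single MSE per model is a noisy basis for comparison. One way to handle this is to repeat training with several seeds and compare the mean and the spread. In the sketch below, `summarize_runs` is a hypothetical helper and `fake_train_and_score` is a stand-in for "train one model with this seed and return its MSE"; it only draws a random number and does not train anything.

```python
import numpy as np


def summarize_runs(train_and_score, seeds):
    """Call train_and_score once per seed; return (mean, sample std) of the scores."""
    scores = np.array([train_and_score(seed) for seed in seeds], dtype=float)
    return scores.mean(), scores.std(ddof=1)


def fake_train_and_score(seed):
    """Placeholder for a real training run: returns a seed-dependent pseudo-MSE."""
    rng = np.random.default_rng(seed)
    return 3e-4 + 1e-4 * rng.random()


mean_mse, std_mse = summarize_runs(fake_train_and_score, seeds=[0, 1, 2, 3, 4])
print(f"MSE over 5 runs: {mean_mse:.6f} +/- {std_mse:.6f}")
```

In a real experiment the placeholder would be replaced by a function that seeds the libraries involved, trains the model, and returns its validation MSE.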

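Ideas and methods for addressing these issues:

A hybrid model can sometimes underperform a single model for reasons such as the following:

Weight allocation: each model's predictions are weighted by its MSE, so a model with lower error gets a larger weight. However, the higher-error models still receive non-zero weight, and when the models' predictions differ widely, their contribution can keep the weighted average from improving on, or even matching, the best single model.
Similarity between models: if several models make very similar predictions, combining them brings little gain, and the hybrid model performs about as well as the best single model.
Limits of the combination strategy: the current strategy is a simple MSE-based weighted average. It does not consider more sophisticated strategies such as nonlinear combinations or stacking, which may capture patterns that the individual models miss and thereby improve overall performance.
Insufficient model diversity: a hybrid model benefits from its members making different kinds of errors. If the models are not diverse enough, their errors do not cancel out and the hybrid gains little.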
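
To make the weight-allocation point concrete, the snippet below recomputes the inverse-MSE weights from the test-set MSEs printed in the log above. It is only a check on the arithmetic; the dictionary and variable names are chosen here for illustration.

```python
import numpy as np

# Test-set MSEs copied from the log above
test_mse = {
    "LSTM": 0.00027118924039005093,
    "GRU": 0.0002801844536378178,
    "CNN": 0.0003564422049939142,
    "XGBoost": 0.013671010934760258,
    "RandomForest": 0.013248586110257024,
    "SVM": 0.04974934214151624,
    "MLP": 0.0011011161628656963,
}

# Inverse-MSE weights, normalized to sum to 1
inv = np.array([1.0 / m for m in test_mse.values()])
weights = dict(zip(test_mse, inv / inv.sum()))

# Print the models from largest to smallest weight
for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name:>12}: {w:.4f}")
```

LSTM ends up with only about a third of the total weight. The rest goes to models with higher error, mostly GRU, CNN, and MLP, which is consistent with the blend landing above LSTM's own MSE.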

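Possible solutions

Try other combination strategies: besides a simple weighted average, you can try methods such as stacking, in which a meta-learner (for example linear regression or a neural network) learns how to combine the models' predictions.
Increase model diversity: adding more model types or adjusting the existing models' hyperparameters can increase diversity and improve the hybrid model.
Model selection: sometimes simply picking the best single model works better than combining all of them. More validation, including cross-validation, can help decide whether a hybrid model is worthwhile.
Adjust the weights manually: set the models' weights by hand rather than relying on MSE alone. The weights can be chosen from experience or found with an optimization method.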
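
As one possible reading of "found with an optimization method", the sketch below fits non-negative blend weights by least squares on validation predictions using SciPy's `nnls`, then rescales them to sum to 1 so the result is still a weighted average. `fit_blend_weights` is a hypothetical helper, and the data is synthetic: the target is built as an exact 0.7/0.3 mix of two random "predictions" so the recovered weights can be checked.

```python
import numpy as np
from scipy.optimize import nnls


def fit_blend_weights(val_preds, y_val):
    """Least-squares blend weights, constrained to be non-negative and rescaled to sum to 1."""
    # One column per model; samples and forecast steps are flattened together
    P = np.column_stack([p.ravel() for p in val_preds])
    w, _ = nnls(P, y_val.ravel())
    total = w.sum()
    if total == 0:
        raise ValueError("all weights are zero; the predictions do not explain the target")
    return w / total


# Synthetic validation data (assumption: 200 samples, 3 forecast steps)
rng = np.random.default_rng(0)
pred_a = rng.normal(size=(200, 3))
pred_b = rng.normal(size=(200, 3))
y_val = 0.7 * pred_a + 0.3 * pred_b

w = fit_blend_weights([pred_a, pred_b], y_val)
print(w)
```

Fitting the weights on validation data and reporting the error on the test set keeps the comparison with the single models fair.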

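In what follows we try another combination method, stacking, which learns how to combine the different models' predictions. Tuning the existing models' hyperparameters to increase their diversity is a further option.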

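Overall framework and module design
- Data preprocessing module (data_preprocessing_main): loads the data from a CSV file, cleans and normalizes it, generates technical indicators, and produces the datasets used for training and testing (X_train, X_val, X_test, y_train, y_val, y_test).
- Model building module (build_model): builds a model according to the model_type argument. Supported types are neural networks (LSTM, GRU, CNN) and classical machine learning models (XGBoost, RandomForest, SVM, MLP). Neural networks are built with Keras and returned compiled; for classical models the function returns a list of model instances, one per forecast time step.
- Model evaluation module (evaluate_model): evaluates a trained model on the test set, computes and prints the mean squared error (MSE), and plots predictions against true values. Neural networks predict directly via model.predict; classical models predict each time step separately and the results are combined.
- Stacking module (stack_models): combines the predictions of multiple models by stacking. For each time step, it collects the base models' predictions, passes them through linear regressions as first-level learners, and uses a random forest regressor as the final estimator.
- Model search and combination module (model_search): the core function. It iterates over the model types, training and evaluating each one, then calls stack_models to stack them. It reports the best single model and its MSE, along with the MSE of the stacked model.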
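Code flow
- Data generation: data_preprocessing_main produces the training, validation, and test sets.
- Model search and training: model_search iterates over all specified model types (neural networks and classical machine learning models), training each one and evaluating it on the test set.
- Stacked combination: all models are combined by stacking, and the MSE of the combined model is computed.
- Model saving: the best single model is saved in HDF5 format for later use.
Model combination strategy
- Multi-model training: each model is trained independently and evaluated on the test set.
- Stacking strategy: the models' predictions are stacked, and linear regression and a random forest learn how best to combine them into the final prediction.
Key features
- A varied choice of models: several neural networks (LSTM, GRU, CNN) as well as classical machine learning models (XGBoost, RandomForest, SVM, MLP).
- Use of stacking: stacking the predictions of multiple models lets their strengths complement one another, with the aim of improving the final accuracy.
- Flexible, extensible code: one uniform framework handles the different model types, which keeps the code readable and easy to extend.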
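
Whatever meta-learner is used, stacking only yields an honest error estimate when the meta-learner is fitted on one set of predictions and scored on another. The sketch below illustrates that split under stated assumptions: a linear meta-learner per forecast step is fitted on validation-set predictions and scored on test-set predictions. `fit_stacker` and `predict_stacker` are hypothetical helpers, the data is synthetic, and this is not the stack_models function described above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error


def fit_stacker(val_preds, y_val):
    """Fit one linear meta-learner per forecast step on validation-set predictions."""
    stackers = []
    for t in range(y_val.shape[1]):
        Z = np.column_stack([p[:, t] for p in val_preds])  # one column per base model
        stackers.append(LinearRegression().fit(Z, y_val[:, t]))
    return stackers


def predict_stacker(stackers, test_preds):
    """Combine the base models' test-set predictions with the fitted meta-learners."""
    columns = []
    for t, stacker in enumerate(stackers):
        Z = np.column_stack([p[:, t] for p in test_preds])
        columns.append(stacker.predict(Z))
    return np.column_stack(columns)


# Synthetic stand-in for two base models: the true value plus independent noise
rng = np.random.default_rng(0)


def make_split(n):
    y = rng.normal(size=(n, 3))
    preds = [y + rng.normal(scale=0.1, size=y.shape) for _ in range(2)]
    return preds, y


val_preds, y_val = make_split(500)
test_preds, y_test = make_split(500)

stackers = fit_stacker(val_preds, y_val)
stacked = predict_stacker(stackers, test_preds)

base_mse = [mean_squared_error(y_test, p) for p in test_preds]
stacked_mse = mean_squared_error(y_test, stacked)
print("base models:", base_mse)
print("stacked:", stacked_mse)
```

Because the meta-learner never sees the test labels, the stacked MSE printed here can be compared directly with the base models' test MSEs.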

Implementation:

import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.neural_network import MLPRegressor
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split, ParameterGrid
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Conv1D, Dense, Dropout, Flatten
import numpy as np
from data_solve import main as data_preprocessing_main

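# Configure Matplotlib with a Chinese font so that Chinese text in the plots renders correctly
plt.rcParams['font.sans-serif'] = ['SimHei']  # SimHei, a Chinese sans-serif font
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly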


def build_model(model_type, input_shape=None, units=50, dropout_rate=0.2):
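    """
    Build and compile a neural network, or instantiate classical ML regressors.

    Args:
    - model_type: model type ('LSTM', 'GRU', 'CNN', 'XGBoost', 'RandomForest', 'SVM', 'MLP')
    - input_shape: input shape (time_steps, num_features); neural networks only
    - units: number of units per layer; used by the neural networks and the MLP
    - dropout_rate: Dropout rate, used to reduce overfitting

    Returns:
    - a compiled Keras model, or a list of three regressors (one per forecast step)
    """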
    if model_type in ['LSTM', 'GRU', 'CNN']:
        model = Sequential()
        if model_type == 'LSTM':
            model.add(LSTM(units=units, return_sequences=True, input_shape=input_shape))
            model.add(Dropout(dropout_rate))
            model.add(LSTM(units=units, return_sequences=False))
        elif model_type == 'GRU':
            model.add(GRU(units=units, return_sequences=True, input_shape=input_shape))
            model.add(Dropout(dropout_rate))
            model.add(GRU(units=units, return_sequences=False))
        elif model_type == 'CNN':
            model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=input_shape))
            model.add(Dropout(dropout_rate))
            model.add(Flatten())
        model.add(Dropout(dropout_rate))
        model.add(Dense(units=3))  # one output per forecast step (3 steps)
        model.compile(optimizer='adam', loss='mean_squared_error')
        return model
    elif model_type == 'XGBoost':
        return [XGBRegressor(objective='reg:squarederror') for _ in range(3)]
    elif model_type == 'RandomForest':
        return [RandomForestRegressor(n_estimators=100, random_state=42) for _ in range(3)]
    elif model_type == 'SVM':
        return [SVR(kernel='rbf') for _ in range(3)]
    elif model_type == 'MLP':
        return [MLPRegressor(hidden_layer_sizes=(units, units), max_iter=500) for _ in range(3)]
    else:
        raise ValueError(f"未知的模型类型: {model_type}")


def evaluate_model(model, X_test, y_test, model_type):
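    """
    Evaluate a model on the test set and plot predictions against true values.

    Args:
    - model: the trained model
    - X_test, y_test: test features and labels
    - model_type: model type; determines how the input is fed to the model

    Returns:
    - mean squared error (MSE) on the test set
    """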
    if model_type in ['LSTM', 'GRU', 'CNN']:
        predictions = model.predict(X_test)
    else:
        X_test_flat = X_test.reshape(X_test.shape[0], -1)
        predictions = np.zeros(y_test.shape)
        for i in range(3):
            predictions[:, i] = model[i].predict(X_test_flat)

    mse = mean_squared_error(y_test, predictions)
    print(f"测试集上的均方误差: {mse}")

    plt.figure(figsize=(14, 7))
    for i in range(predictions.shape[1]):  # plot predicted vs. true values for each forecast step
        plt.plot(y_test[:, i], label=f'真实值 - 时间步 {i + 1}')
        plt.plot(predictions[:, i], label=f'预测值 - 时间步 {i + 1}')
    plt.title(f'金价预测 vs 真实值 ({model_type})')
    plt.xlabel('时间步')
    plt.ylabel('金价')
    plt.legend()
    plt.show()

    return mse


def stack_models(model_results, X_test, y_test):
    """
    Combine the predictions of several models by stacking.

    Args:
    - model_results: one tuple per model: (model_type, model, mse, model)
    - X_test: test features
    - y_test: test labels

    Returns:
    - MSE of the stacked predictions
    - the stacked predictions

    NOTE: the stacker is both fitted and scored on the test set, so the
    returned MSE is an in-sample error, not an estimate of generalization.
    """
    combined_predictions = np.zeros_like(y_test)
    X_test_flat = X_test.reshape(X_test.shape[0], -1)

    for t in range(y_test.shape[1]):  # one stacker per forecast step
        # One column per base model: its prediction for step t.
        # List-type models hold one regressor per step, so use model[t].
        stacked_predictions = np.column_stack([
            model[t].predict(X_test_flat)
            if isinstance(model, list) else model.predict(X_test)[:, t]
            for model_type, model, mse, _ in model_results
        ])

        # First-level learners: linear regressions; final estimator: random forest
        stacker = StackingRegressor(
            estimators=[(f'model_{i}', LinearRegression()) for i in range(stacked_predictions.shape[1])],
            final_estimator=RandomForestRegressor(n_estimators=100, random_state=42)
        )

        stacker.fit(stacked_predictions, y_test[:, t])
        combined_predictions[:, t] = stacker.predict(stacked_predictions)

    mse = mean_squared_error(y_test, combined_predictions)
    return mse, combined_predictions


def model_search(X_train, y_train, X_val, y_val, X_test, y_test,
                 model_types=['LSTM', 'GRU', 'CNN', 'XGBoost', 'RandomForest', 'SVM', 'MLP'], epochs=50, batch_size=32):
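    """
    Train every model type, evaluate each on the test set, and stack the results.

    Args:
    - X_train, y_train: training features and labels
    - X_val, y_val: validation features and labels
    - X_test, y_test: test features and labels
    - model_types: list of model types (['LSTM', 'GRU', 'CNN', 'XGBoost', 'RandomForest', 'SVM', 'MLP'])
    - epochs: number of training epochs for the neural networks
    - batch_size: batch size for the neural networks

    Returns:
    - the best single model, its type, its test MSE, and the stacked model's MSE
    """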
    best_model = None
    best_mse = float('inf')
    best_model_type = None
    model_results = []

    for model_type in model_types:
        print(f"训练模型: {model_type}")
        model = build_model(model_type, input_shape=(X_train.shape[1], X_train.shape[2]))

        if model_type in ['LSTM', 'GRU', 'CNN']:
            model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=epochs, batch_size=batch_size, verbose=0)
        else:
            X_train_flat = X_train.reshape(X_train.shape[0], -1)
            X_val_flat = X_val.reshape(X_val.shape[0], -1)
            for i in range(3):  # fit one regressor per forecast step
                model[i].fit(X_train_flat, y_train[:, i])

        mse = evaluate_model(model, X_test, y_test, model_type)
        model_results.append((model_type, model, mse, model))

        if mse < best_mse:
            best_mse = mse
            best_model = model
            best_model_type = model_type

    # Combine the models by stacking
    combined_mse, combined_predictions = stack_models(model_results, X_test, y_test)
    print(f"混合模型的均方误差: {combined_mse}")

    return best_model, best_model_type, best_mse, combined_mse


# Entry point
if __name__ == "__main__":
    # Load and preprocess the data; replace this call with your own data pipeline as needed
    X_train, X_val, X_test, y_train, y_val, y_test = data_preprocessing_main(
        file_path='gold_price_data_extracted.csv',
        window_size=10,
        prediction_horizon=3,
        normalization_method='minmax',
        train_size=0.7,
        validation_size=0.1,
        drop_columns=['Adj', 'Volume'],
        indicators=['MA', 'RSI', 'MACD', 'Bollinger', 'ATR']
    )

    # Run the model search
    best_model, best_model_type, best_mse, combined_mse = model_search(X_train, y_train, X_val, y_val, X_test, y_test)

    # Save the best model; only Keras models are saved, so report success only in that case
    if best_model_type in ['LSTM', 'GRU', 'CNN']:
        best_model.save(f'best_model_{best_model_type}.h5')
        print(f"最佳模型 {best_model_type} 已保存。")

Output (混合模型的均方误差 is the stacked hybrid model's MSE; 最佳模型类型 is the best single model type):

2024-08-19 00:32:57,653 - INFO - 数据加载完成
2024-08-19 00:32:57,653 - INFO - 原始数据包含 3474 行，7 列
2024-08-19 00:32:57,665 - INFO - 数据清理完成
2024-08-19 00:32:57,665 - INFO - 清理后的数据包含 3474 行，5 列
2024-08-19 00:32:57,668 - INFO - 数据预览:
           Date    Open    High     Low   Close
3473 2010-10-05  134.10  134.10  134.10  134.03
3472 2010-10-06  134.10  135.00  134.10  134.77
3471 2010-10-19  136.98  136.98  132.93  133.60
3470 2010-10-20  133.43  134.70  133.38  134.42
3469 2010-10-21  134.32  134.72  131.88  132.56
2024-08-19 00:32:57,668 - INFO - 生成技术指标: ['MA', 'RSI', 'MACD', 'Bollinger', 'ATR']
2024-08-19 00:32:57,688 - INFO - 技术指标生成完成
2024-08-19 00:32:57,691 - INFO - 数据归一化/标准化完成，方法: minmax
2024-08-19 00:32:57,814 - INFO - 滑动窗口生成特征和标签，窗口大小: 10, 预测范围: 3
2024-08-19 00:32:57,816 - INFO - 数据集划分完成: 训练集大小=(2407, 10, 11), 验证集大小=(343, 10, 11), 测试集大小=(689, 10, 11)
2024-08-19 00:32:57,816 - INFO - 训练集大小: (2407, 10, 11), 验证集大小: (343, 10, 11), 测试集大小: (689, 10, 11)
E:\ProgramData\anaconda3\envs\goldenpre\Lib\site-packages\keras\src\layers\rnn\rnn.py:204: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
训练模型: LSTM
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step
测试集上的均方误差: 0.0005383426715256204
训练模型: GRU
E:\ProgramData\anaconda3\envs\goldenpre\Lib\site-packages\keras\src\layers\rnn\rnn.py:204: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step
测试集上的均方误差: 0.0022094757470167074
训练模型: CNN
E:\ProgramData\anaconda3\envs\goldenpre\Lib\site-packages\keras\src\layers\convolutional\base_conv.py:107: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step 
测试集上的均方误差: 0.0009394372486201054
训练模型: XGBoost
测试集上的均方误差: 0.013671010934760258
训练模型: RandomForest
测试集上的均方误差: 0.013248586110257024
训练模型: SVM
测试集上的均方误差: 0.04974934214151624
训练模型: MLP
测试集上的均方误差: 0.0010740313068716674
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step 
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step 
22/22 ━━━━━━━━━━━━━━━━━━━━ 0s 667us/step
混合模型的均方误差: 1.7644727252183848e-05
混合模型已保存为 stacked_model.joblib
最佳模型类型: LSTM，测试集MSE: 0.0005383426715256204
混合模型的测试集MSE: 1.7644727252183848e-05
2024-08-19 00:34:38,340 - WARNING - You are saving your model as an HDF5 file via `model.save()` or `keras.saving.save_model(model)`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')` or `keras.saving.save_model(model, 'my_model.keras')`. 
最佳模型 LSTM 已保存。

Process finished with exit code 0

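This gives us the best-fitting hybrid model, stacked_model.joblib. Its test-set MSE of 1.7644727252183848e-05 is far lower than that of any single model, and its predictions nearly coincide with the true values. Note, however, that stack_models fits the stacker on the test set and then scores it on the same data, so this figure is an in-sample error and overstates how well the model would do on unseen prices. A fair comparison requires fitting the stacker on validation-set predictions and scoring it on the test set.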

Next time:

Deploying the model and putting it into production.

Author: Anderson
Permalink: http://nikolahuang.github.io/2024/08/18/从零开始训练一个神经网络之四：模型调优和部署/
Copyright notice: Please credit the source when reposting. Thank you.