Abstract
Optimizing stock selection strategies is a long-standing topic of interest in finance. Although deep learning models have outperformed traditional methods, existing work still has shortcomings. First, previous studies give little justification for their feature selection and usually feed features such as the closing price directly into the predictor. Second, most studies predict the trend of stock indices or of individual stocks only, which makes their results hard to apply directly to actual stock selection. This paper proposes CNN-LSTM-GNN (CLGNN), a multivariate hybrid neural network model for stock prediction. The CNN module captures local patterns and the LSTM module captures the sequence as a whole, while a multivariate time series GNN module explores the potential relationships between the data through graph learning, graph convolutional, and temporal convolutional layers. CLGNN uses these relationships to classify stocks by return, then develops a stock selection strategy and directly outputs the returns and stock codes. For feature selection, this paper proposes a hybrid filter approach based on entropy and Pearson correlation. Experiments are conducted on all stocks in the CSI All Share Index (CSI). The results show that, for every model tested, the highest return is obtained when daily return, turnover rate, relative strength index, volume, and forward adjusted closing price are used as inputs, and that CLGNN achieves a higher return than the other models (e.g., TCN and Transformer).
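As a rough illustration of how a filter combining entropy and Pearson correlation might work, the sketch below ranks features by the Shannon entropy of their binned values and then drops any feature that is strongly correlated with one already kept. This is a minimal sketch under stated assumptions, not the procedure used in this paper: the function names, the equal-width binning, the thresholds `min_entropy` and `max_abs_corr`, and the toy values are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=10):
    """Shannon entropy (in bits) of a feature after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    width = (hi - lo) / bins
    # Clamp the index so the maximum value falls into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def hybrid_filter(features, min_entropy=1.0, max_abs_corr=0.9, bins=10):
    """Select feature names from a dict mapping name -> list of floats.

    Assumed two-stage filter (hypothetical thresholds):
      1. keep features whose entropy is at least min_entropy;
      2. visit them from highest to lowest entropy and keep a feature only
         if its |Pearson r| with every already-kept feature is <= max_abs_corr.
    """
    scores = {name: shannon_entropy(col, bins) for name, col in features.items()}
    candidates = sorted(
        (name for name in scores if scores[name] >= min_entropy),
        key=lambda name: -scores[name],
    )
    selected = []
    for name in candidates:
        if all(
            abs(pearson(features[name], features[kept])) <= max_abs_corr
            for kept in selected
        ):
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Toy values for illustration only; not market data.
    toy = {
        "close": [1, 2, 3, 4, 5, 6, 7, 8],
        "adj_close": [2, 4, 6, 8, 10, 12, 14, 16],  # redundant with "close"
        "flag": [5, 5, 5, 5, 5, 5, 5, 5],           # constant, zero entropy
        "turnover": [3, 1, 4, 1, 5, 9, 2, 6],
    }
    print(hybrid_filter(toy))
```

In this toy run the constant column is removed by the entropy stage and the redundant column by the correlation stage. A real pipeline would also have to choose the binning and both thresholds with care, since they determine which features survive.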