1. Introduction
In recent years, renewable energy generation from sources such as solar and wind has emerged as a crucial component of electrical energy production because it reduces carbon emissions and offers an alternative to rapidly depleting fossil fuels [1]. Photovoltaics accounted for about 45% of global renewable energy capacity additions in 2020 and showed a high penetration rate among renewable energy sources [2,3]. Photovoltaic power relies on uncontrollable solar radiation, which is not conducive to energy management planning. Additionally, inconsistent photovoltaic output makes it harder for the supply side of the power grid to depend on photovoltaic power [3]. Therefore, to stably integrate photovoltaic power into the power grid, it is essential to accurately forecast solar radiation, which has the most significant impact on photovoltaic power generation [4].
Solar radiation forecasting models based on various methods have been proposed to forecast solar radiation accurately. For example, models based on statistical methods include autoregressive integrated moving average (ARIMA) [5], multilinear regression (MLR) [6], and Holt-Winters [7]. These models perform well when the relationship between the inputs and outputs is linear, but their forecasting performance deteriorates when the relationship is nonlinear [8,9]. Artificial intelligence (AI)-based solar radiation forecasting models such as support vector regression (SVR) [10] and neural networks (NNs) [11] have been proposed to solve the performance degradation arising from the nonlinear relationship between the input and output. However, although AI-based forecasting models perform well for nonlinear data, their forecasting performance is greatly affected by the number of input variables and the amount of input data. To compensate for this sensitivity, hybrid forecasting models in the frequency domain based on preprocessing methods such as Fourier transformation (FT) and wavelet transformation (WT) have been proposed for data transformation and decomposition [12]. Such hybrid models showed improved solar radiation forecasting performance by decomposing the original solar radiation data into components that are better suited to modeling nonstationary data with a large amount of information [13,14]. However, since this approach uses only past solar radiation data for forecasting, it has a limited ability to cope with the changes in solar radiation caused by exogenous variables such as air temperature and relative humidity, and it cannot respond to rapid weather changes [15].
To overcome the limitations of existing hybrid forecasting models, we propose a domain hybrid solar radiation forecasting model that combines forecasting in the sequence domain using exogenous variables with forecasting in the frequency domain using past solar radiation. The proposed method consists of two stages, and each model uses algorithms with a relatively short training time and high accuracy [16]. In the first stage, solar radiation forecasting is performed in the sequence and frequency domains using exogenous variables and past solar radiation data as inputs, respectively. The forecasting model in the sequence domain is constructed using the light gradient boosting machine (LightGBM) [17] and time series cross-validation (TSCV). Because TSCV requires the model to be retrained on every fold, the sequence-domain model was built on LightGBM, which trains quickly and performs well. The forecasting model in the frequency domain uses WT [18] and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) [19] to transform past solar radiation data into the frequency domain and perform signal decomposition. We used CEEMDAN to solve the mode mixing problem in data decomposition and to minimize errors in data reconstruction. Then, forecasting models based on the multilayer perceptron (MLP) [20] were constructed using each decomposed solar radiation dataset as input. In the second stage, an MLP combines the first-stage results to produce a more accurate domain hybrid day-ahead solar radiation forecast that accounts for both the exogenous factors from the sequence domain and the solar radiation patterns from the frequency domain. The contributions of this paper are as follows:
We present a domain hybrid day-ahead solar radiation forecasting model that combines forecasting in the sequence and frequency domains for accurate solar radiation forecasting.
We further improve the solar radiation forecasting performance by ensembling the forecasting results of the two domains.
The proposed model performs day-ahead forecasting at 1 h intervals and achieves high accuracy.
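The time series cross-validation (TSCV) used for the sequence-domain model can be illustrated with a minimal sketch. The text does not state the exact splitting scheme, so the expanding-window layout, the function name `expanding_window_splits`, and the fold sizes below are assumptions made for illustration only. Model fitting is omitted; in practice a learner such as LightGBM would be retrained on the training indices of every fold.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each fold trains on every sample that precedes its test block, so no
    future observation leaks into training (assumed expanding-window scheme).
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


# Hypothetical setup: 10 days of hourly data, 3 folds, one day (24 h) per test block.
folds = list(expanding_window_splits(n_samples=240, n_splits=3, test_size=24))
for train, test in folds:
    # Report the training size and the first and last test index of each fold.
    print(len(train), test[0], test[-1])
```

Because the training window grows with each fold, the cost of cross-validation scales with the number of folds, which is consistent with the preference for a fast learner stated above.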
This paper is organized as follows: Section 2 introduces several related works. Section 3 presents the overall structure of the proposed domain hybrid day-ahead solar radiation forecasting model. Section 4 illustrates the experiments and their results. Lastly, Section 5 presents the major conclusions of the study.
2. Related Works
Recently, solar radiation forecasting models using AI methods such as SVR [21,22,23] and artificial neural networks (ANNs) [24,25,26] have been proposed to handle the nonlinearity and complex relationships of time series. For instance, Mellit et al. [25] presented a method for forecasting day-ahead solar radiation using air temperature values based on the MLP algorithm. This method was validated using data collected in the Italian city of Trieste. Yildirim et al. [27] studied solar radiation forecasting using regression analysis and ANNs for four different sites in Turkey. Their models use longitude, sunshine hours, relative humidity, air temperature, and time information as the input variables. The authors obtained the most accurate results from the ANN-based model. Kaba et al. [28] performed solar radiation forecasting at different sites in Turkey using deep learning algorithms. They used sunshine hours, cloud cover, and daily minimum and maximum temperature data as the input variables, and then compared and analyzed the change in accuracy according to different combinations of input variables. Yu et al. [29] proposed a short-term solar radiation forecasting model based on the long short-term memory (LSTM) algorithm. They considered relative humidity, cloud type, dew point, solar zenith, wind speed, etc. as the input variables and verified the applicability of the proposed method at three sites in the United States. Their results confirmed that the LSTM-based forecasting model performed excellently. He et al. [30] proposed a hybrid probabilistic solar radiation forecasting model that combined LSTM and residual modeling. LSTM was used for deterministic forecasting, and the resulting forecast was used to calculate the residual distribution. The input variables of the model were relative humidity, dew point temperature, cloudiness, wind speed, and time information. The authors verified that the proposed model outperformed the existing deep learning-based models.
Solar radiation forecasting using the aforementioned exogenous factors as input variables demonstrated excellent accuracy in the sequence domain. Nevertheless, there is a limit to the improvement in prediction accuracy when the number of input variables is small. To solve this problem, various forecasting models that use past solar radiation data in the sequence domain as input variables have been proposed. Huang et al. [12] proposed a solar radiation forecasting model in the frequency domain based on discrete Fourier transform (DFT), principal component analysis (PCA), and Elman neural network (ENN). The authors confirmed that the performance of the proposed forecasting model in the frequency domain was superior to that of the existing ones. Shamshirband et al. [31] proposed a solar radiation forecasting model in the frequency domain using WT and support vector machine (SVM). WT was used to decompose the solar radiation data, which were the input variables, and each decomposed component was used as the input to an individual SVM model. The authors verified that the developed model performed better than other models. Gao et al. [15] proposed a solar radiation forecasting model combining CEEMDAN, convolutional neural networks (CNNs), and LSTM. The authors verified that the forecasting accuracy for solar radiation, which is a noisy time series, can be improved by decomposing a complex signal into several relatively simple signals using CEEMDAN. Zhang et al. [32] proposed a model to improve the solar radiation forecasting performance in the frequency domain by combining WT, CEEMDAN, improved atom search optimization (IASO), and outlier robust extreme learning machine (ORELM). The authors showed that the performance can be improved through appropriate denoising with WT and decomposition of the signal data with CEEMDAN. In addition, it was revealed that the performance could be further enhanced by optimizing the model using IASO. Although the forecasting performance in the frequency domain was excellent, the response of these models to weather changes such as rainy and cloudy days was limited because they did not consider exogenous factors [15]. In addition, since only past solar radiation was considered, the accuracy of forecasting instantaneous changes in solar radiation was limited.
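The decomposition step shared by the frequency-domain models above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited work: it uses a single-level Haar wavelet (the simplest wavelet, chosen here only for brevity), the helper names are hypothetical, the irradiance values are made up, and CEEMDAN is not shown. The sketch splits a series into a smooth component and a fluctuation component that sum back to the original; a decomposition-based forecaster would fit one model per component and add the component forecasts.

```python
import math


def haar_step(signal):
    """One level of the Haar wavelet transform for an even-length signal."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_step, returning a signal twice as long as its inputs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def split_components(signal):
    """Split a signal into smooth and fluctuation parts that sum to it."""
    approx, detail = haar_step(signal)
    zeros = [0.0] * len(approx)
    smooth = haar_inverse(approx, zeros)       # keep only the approximation
    fluctuation = haar_inverse(zeros, detail)  # keep only the detail
    return smooth, fluctuation


# Made-up hourly irradiance values (W/m^2), for illustration only.
radiation = [0.0, 50.0, 210.0, 430.0, 610.0, 580.0, 300.0, 90.0]
smooth, fluctuation = split_components(radiation)
reconstructed = [a + b for a, b in zip(smooth, fluctuation)]
```

Because the transform is linear and invertible, summing the components recovers the input up to floating-point rounding, which is what allows per-component forecasts to be recombined into a forecast of the original series.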
In this paper, we present a domain hybrid day-ahead solar radiation forecasting model that combines sequence- and frequency-domain forecasting to compensate for these weaknesses and provide more robust and superior performance.
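The idea of combining the two domain forecasts can be illustrated with a deliberately simplified sketch. The proposed model performs this combination with an MLP; the example below swaps in a two-weight linear blend fitted by least squares, purely to show how a second stage can learn from first-stage outputs. The function name `fit_blend_weights` and all numbers are hypothetical.

```python
def fit_blend_weights(pred_a, pred_b, target):
    """Least-squares weights (w_a, w_b) minimizing sum((w_a*a + w_b*b - y)**2).

    Solves the 2x2 normal equations directly; this linear blend is a
    stand-in for the MLP-based second stage, not a reproduction of it.
    """
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, target))
    sby = sum(b * y for b, y in zip(pred_b, target))
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("forecasts are collinear; weights are not unique")
    w_a = (say * sbb - sby * sab) / det
    w_b = (sby * saa - say * sab) / det
    return w_a, w_b


# Made-up first-stage forecasts and observations (W/m^2), for illustration only.
sequence_forecast = [100.0, 200.0, 300.0, 400.0]
frequency_forecast = [120.0, 180.0, 330.0, 380.0]
observed = [110.0, 190.0, 315.0, 390.0]  # constructed as the average of the two

w_seq, w_freq = fit_blend_weights(sequence_forecast, frequency_forecast, observed)
blended = [w_seq * a + w_freq * b
           for a, b in zip(sequence_forecast, frequency_forecast)]
```

A nonlinear combiner such as an MLP can additionally learn when one domain is more reliable than the other, which a fixed pair of weights cannot express.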