1. Introduction
Forecasting the medium- to long-term price trend of the electricity market is subject to considerable uncertainty because of the influence of multiple complex factors: geopolitical events, climatic phenomena, social developments, regulations, technical factors, economic cycles, etc. [1].
Analyzing the evolution of the Spanish electricity market (OMIE) from 2018 to 2023 [2] reveals key factors behind the heightened volatility. The evolution of energy prices between 2018 and 2023 (Figure 1) sheds light on the many factors that influenced it. According to the International Energy Agency [3], during the initial years, from 2018 to 2021, energy prices were shaped by a confluence of variables, including rising costs across various fuels and technologies, supply chain pressures, labor market constraints, and fluctuations in critical mineral supplies and construction materials. Notably, clean energy costs, which had previously declined steadily, exhibited a distinct uptrend during this period.
However, the narrative dramatically turned after 2020, with energy prices experiencing heightened volatility. As reported, the primary catalyst for this volatility was the growing disparity between surging demand and a limited pipeline of new conventional projects within the oil industry. This imbalance introduced a significant risk of price spikes, casting shadows on the global economy's stability. Moreover, competitive electricity markets grappled with a widening gap between revenue from electricity sales and total generation costs, further contributing to price fluctuations.
The COVID-19 pandemic, a global disruptor, played its part by dampening overall energy demand, particularly in the case of carbon-intensive fuels like coal and oil. Conversely, renewable energy sources proved more resilient in the face of the pandemic's impacts. CO2 emissions were notably reduced, and the energy sector witnessed a decline in capital investment, primarily affecting oil and natural gas supply projects. These repercussions are expected to reverberate through energy markets in the years to come.
Furthermore, the world's current energy crisis, initiated by Russia's invasion of Ukraine, has added another layer of complexity to the energy landscape. High energy prices, particularly in the natural gas sector, have led to wealth transfers from consumers to producers, affecting electricity generation costs globally. The crisis has also posed challenges in ensuring access to modern energy for many, with lingering uncertainties about its duration and fossil fuel price trends. While short-term shifts have seen increased demand for oil and coal as alternatives to costly gas, the long-term trajectory points toward low-emissions sources such as renewables, nuclear energy, and heightened efficiency measures.
Time series forecasting models used to determine energy prices before the COVID-19 pandemic [4,5,6] faced significant challenges during and after the crisis. They struggled to adapt to the abrupt disruptions in energy demand, increased price volatility, supply chain pressures, and changes in the energy mix brought about by the pandemic. Additionally, uncertainties in the economic and policy landscapes further reduced their accuracy. The pandemic exposed the limitations of these forecasting models in coping with such unforeseen disruptions, emphasizing the need for more adaptable and robust forecasting approaches to navigate the evolving energy market dynamics effectively.
Scientific evidence suggests that prior information about these qualitative factors can be of great importance in understanding the possible future evolution of the market. For example, understanding what affects consumption patterns [7,8], demand density [9], and electricity generation [10,11] can be critical. In fact, many market specialists publish their understanding of recent market developments and possible trends based on factors such as social phenomena affecting demand, conflicts, government agreements, natural disasters, legislative actions, nuclear plant shutdowns, weather conditions, etc., either through journals or company reports available on the web.
These reports can be processed with sentiment analysis techniques. Typically, sentiment analysis techniques applied to markets (stock markets, oil markets, electricity markets) capture only headlines and classify the possible price direction. Research insights suggest that including sentiment analysis results improves the quantitative predictions of price forecasting models. There is a growing trend towards using deep learning techniques for market sentiment analysis. Machine learning-based methods involve training a model on a dataset of text where the sentiment of each text fragment is known. The trained model can then be used to predict the sentiment of new text. Examples of machine learning techniques include Convolutional Neural Networks (CNN), Recursive Neural Networks (RNN), and Transformer models [12].
Large Language Models (LLMs), such as GPT, are based on the Transformer architecture and are trained on a large volume of data before being fine-tuned for specific contexts [13]. Current LLMs have been trained on large and varied datasets, giving them the capability to understand the functioning of the economy and markets. To evaluate sentiment in a particular market, they need to be fine-tuned with new datasets derived from specialist reports and related news about that market [14,15]. Breitung et al. [16] apply this approach to the oil market by generating a specific dataset with 1600 records.
So far, we have not found publications dealing with sentiment analysis applied to trends in electricity markets.
Advanced natural language processing methods have transformed AI interaction, and Generative Pre-trained Transformers (GPT) are ground-breaking in generating human-like text and comprehending natural language [17]. GPT models can be extended and refined with the analysis of specialized news and reports [18], which, in addition to sentiment analysis, can significantly enhance energy price prediction models in several ways:
Data Enrichment: GPT-based models can analyze and extract valuable insights from a vast corpus of specialized news articles and reports on the energy market. This data enrichment provides a broader context for energy price forecasting models.
Event Detection: GPT models can detect and highlight significant events [19], such as geopolitical developments, supply disruptions, or regulatory changes, that may impact energy markets. These detected events can be used as input variables for forecasting models.
Market News Summarization: GPT can generate concise summaries of complex news articles and reports [20], making it easier for analysts and traders to stay informed about market developments. These summaries can serve as valuable inputs for forecasting models.
Identifying Influential Factors: GPT can identify and rank the factors mentioned in news and reports that are likely to influence energy prices. This information can guide feature selection and help prioritize variables in forecasting models.
Customized reports: In the case of OpenAI's GPT, users can provide customized prompts to extract specific information or insights from news and reports. This allows for tailored analysis based on the unique requirements of the forecasting model.
The assumption is that incorporating GPT-based analysis of specialized news and reports into energy price prediction models allows for a more comprehensive and timely understanding of market dynamics [21]. This, in turn, may improve the models' accuracy and help energy market participants make informed decisions in an increasingly complex and dynamic environment. The application of GPT technology to predicting energy prices in the Spanish market is the focus of our research.
In this paper, we analyze how the reasoning capabilities of large language models (LLMs) can be used to derive context-specific trend predictions from specialized news and expert reports. We construct a dataset of electricity news and reports and compare the forecasting performance of two different approaches for domain knowledge enhancement: prompt engineering and fine-tuning. Our findings indicate that LLMs can successfully leverage their reasoning capabilities to contextualize the evolution of electricity market prices.
We contribute to the literature on advanced Natural Language Processing (NLP) tools applied to management science by illustrating how LLMs can be used to extract business experts' perspectives on the future and consolidate their consensus, and by providing tactical guidance for steering these models towards the best results.
In what follows, we present the methods used to adapt the GPT model, the implementation details (the original code and the dataset are openly available¹), the metrics that we have defined to evaluate the different dimensions of the sentiment extracted from the texts, the results obtained, the future developments that would be worth addressing, and the main conclusions of this article.
3. Results
From the analysis of each news item or article, we extract its impact on the price, the direction of that impact, and the period in which it is expected to occur.
We group the insights from this analysis into short-term and mid/long-term horizons, calculate the average impact and its sign (the Direction) for each interval, and evaluate the accuracy of that prediction against the price variation over the period following publication of the news.
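The grouping step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the record layout (publication date, horizon label, signed impact score), the function names, and the "FLAT" label for a zero average are all assumptions introduced here. Short-term insights are bucketed by ISO week and mid/long-term insights by calendar month, matching the interval lengths used by the metrics.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical records: (publication date, horizon, signed impact score).
# Positive scores mean upward pressure on price, negative scores downward.
insights = [
    (date(2023, 1, 2), "short", 0.6),
    (date(2023, 1, 4), "short", -0.2),
    (date(2023, 1, 10), "short", -0.5),
    (date(2023, 1, 5), "mid_long", 0.4),
    (date(2023, 1, 20), "mid_long", 0.2),
]

def interval_key(day, horizon):
    """Weekly buckets for short-term insights, monthly buckets otherwise."""
    if horizon == "short":
        iso = day.isocalendar()
        return (iso[0], iso[1])  # (ISO year, ISO week)
    return (day.year, day.month)

def aggregate(records):
    """Return {(horizon, interval): (average impact, direction)}."""
    buckets = defaultdict(list)
    for day, horizon, impact in records:
        buckets[(horizon, interval_key(day, horizon))].append(impact)
    summary = {}
    for key, values in buckets.items():
        avg = mean(values)
        # The sign of the average impact gives the Direction; "FLAT" is an
        # assumed label for the zero case, which the text does not cover.
        direction = "UP" if avg > 0 else "DOWN" if avg < 0 else "FLAT"
        summary[key] = (avg, direction)
    return summary

summary = aggregate(insights)
for key in sorted(summary):
    print(key, summary[key])
```

Each entry of `summary` then carries one Direction per interval, which is the quantity the accuracy metrics compare against the subsequent price movement.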
We have defined three metrics to measure the accuracy of the predicted Direction of the Impact:
Close Price: If the Direction indicates that the Price will go UP (Figure 6), the OPEN PRICE at the beginning of the interval in which the news is published (interval t) should be LOWER than at least one of the CLOSE PRICE values of the current or the following two intervals (t, t+1, t+2). The intervals are weeks for short-term and months for mid/long-term.
If the Direction indicates that the Price will go DOWN (Figure 7), the OPEN PRICE at the beginning of the interval in which the news is published (t) should be HIGHER than at least one of the CLOSE PRICE values of the current or the following two intervals (t, t+1, t+2).
High/Low: If the Direction indicates that the Price will go UP (Figure 8), the OPEN PRICE at the beginning of the interval in which the news is published (interval t) should be LOWER than at least one of the HIGH PRICE values of the current or the following two intervals (t, t+1, t+2). The intervals are weeks for short-term and months for mid/long-term.
Figure 8.
OpenAI High/Low Price Metric: Direction UP.
If the Direction indicates that the Price will go DOWN (Figure 9), the OPEN PRICE at the beginning of the interval in which the news is published (t) should be HIGHER than at least one of the LOW PRICE values of the current or the following two intervals (t, t+1, t+2).
Threshold: Same as High/Low, but the difference between the Open Price and the High (Direction UP) or the Low (Direction DOWN) must reach a minimum of 2% for short-term and 5% for mid/long-term.
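The three metrics can be expressed compactly in code. The sketch below is an illustration under stated assumptions, not the authors' implementation: the dictionary layout of an interval, the function names, and the choice to measure the 2% and 5% minimum difference relative to the open price are all hypothetical. The comparison avoids a division so that it remains defined when the open price is zero.

```python
# Each interval is a dict with "open", "high", "low" and "close" prices
# (weeks for short-term, months for mid/long-term). The window passed to
# each function is the list of intervals [t, t+1, t+2].

MIN_MOVE = {"short": 0.02, "mid_long": 0.05}  # thresholds stated in the text

def close_price_hit(direction, window):
    """Close Price metric: open at t versus the close of t, t+1, t+2."""
    open_t = window[0]["open"]
    closes = [interval["close"] for interval in window]
    if direction == "UP":
        return any(open_t < close for close in closes)
    return any(open_t > close for close in closes)

def high_low_hit(direction, window, min_move=0.0):
    """High/Low metric; min_move > 0 turns it into the Threshold metric.

    Assumption: the minimum difference is relative to the open price.
    """
    open_t = window[0]["open"]
    required = min_move * abs(open_t)
    for interval in window:
        if direction == "UP":
            diff = interval["high"] - open_t
        else:
            diff = open_t - interval["low"]
        if diff > 0 and diff >= required:
            return True
    return False

def threshold_hit(direction, window, horizon):
    """Threshold metric: High/Low with a 2% or 5% minimum move."""
    return high_low_hit(direction, window, MIN_MOVE[horizon])

def accuracy(hits):
    """Share of predictions whose Direction was confirmed by a metric."""
    return sum(hits) / len(hits) if hits else 0.0

# Hypothetical price window for one news item published in interval t.
window = [
    {"open": 100.0, "high": 101.0, "low": 97.0, "close": 99.0},
    {"open": 99.0, "high": 101.5, "low": 98.0, "close": 100.5},
    {"open": 100.5, "high": 101.0, "low": 96.0, "close": 98.0},
]

print(close_price_hit("UP", window))         # close at t+1 exceeds the open
print(threshold_hit("UP", window, "short"))  # highest high is only 1.5% above
print(threshold_hit("DOWN", window, "short"))
```

Because the Threshold metric only adds a minimum move to the High/Low test, implementing it as a parameter of the same function keeps the two definitions consistent.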
3.1. Short-Term Analysis
The following figures (Figure 10 and Figure 11) and Table 1 show the results of the analysis of the short-term impacts on price identified using OpenAI, comparing both approaches: in-context examples and fine-tuning. The color of the arrows next to the price time series indicates whether OpenAI predicts the price to go UP (green) or DOWN (red).
The in-context model detects 117 impact points, while the fine-tuned model only detects 31 short-term impacts.
Concerning the scores, the fine-tuned approach performs slightly better, but it captures considerably fewer points.
3.2. Mid/Long-Term Analysis
The following figures (Figure 12 and Figure 13) and Table 2 show the results of the analysis of mid/long-term identified impacts on price using OpenAI and comparing both approaches: in-context examples and fine-tuning.
The in-context model detects 71 impact points, while the fine-tuned model detects 59 mid/long-term impacts. The scores show that the fine-tuned approach again performs better, although with slightly fewer points.
Overall, OpenAI detects the trend in the price time series, especially in the mid/long term. The in-context approach seems more sensitive, but the fine-tuned model provides slightly more accurate results.
4. Discussion
Our findings indicate the potential of GPT models to provide valuable insights and improve predictions, particularly in understanding mid-term price trends. Therefore, we conclude that continued exploration and optimization of OpenAI's capabilities are essential to unlock their full potential in energy price forecasting, and particularly in forecasting electricity market prices. Further research is needed to validate and generalize the observed results.
Building on the insights gained from this study, future work should enhance the contextual understanding and the data sources used as input for the GPT models by incorporating additional relevant news and report sources, increasing their periodicity, and optimizing the methods used to calculate the impact and its decay over time so that they reflect the real impact.
The key benefit of increasing LLM awareness of the market evolution through the in-context approach is that it avoids the need to modify the LLM parameters for this specific task. Instead, developers can append an external knowledge repository, enriching the input and thus refining the accuracy of the model's output. Therefore, the in-context approach is seen as more practical and economical, with a lower barrier to entry and independent of the specific LLM. However, more extensive and professional applications require automated workflows that provide scalability and more precise responses. A Retrieval-Augmented Generation (RAG) architecture [29] can be developed for this purpose.
RAG has become one of the most popular architectures in LLM systems, combining automated information retrieval mechanisms and in-context learning to bolster LLM performance. In this framework, a user query triggers the retrieval of relevant information through search algorithms. This information is then integrated into the LLM prompt, providing additional context for the generation process [30].
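The retrieve-then-prompt flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sample documents, the function names, and the use of bag-of-words cosine similarity as a stand-in for whatever search algorithm a real system would use (typically embedding-based search). The call to the LLM itself is deliberately left out; the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vector = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, documents, k=2):
    """Integrate the retrieved documents into the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical knowledge repository of news snippets.
documents = [
    "Gas prices rise sharply after supply disruption, pushing electricity prices up.",
    "New football stadium opens in Madrid.",
    "Strong wind generation expected to lower electricity prices next week.",
]

query = "What is the outlook for electricity prices?"
prompt = build_prompt(query, documents)
print(prompt)
```

The resulting prompt contains only the two market-related snippets, so the model receives focused context without any change to its parameters.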
On the basis of a RAG architecture, we plan to develop the following additional functionalities:
the incorporation of GPT-calculated features into multivariate time series prediction models as input variables
influential event detection as early warning signals (natural disasters, geopolitical conflicts, regulatory changes)
automatic generation of reports that describe the recent evolution of the electricity market price and the prediction of price trends.