\cormark[1]

\credit{Conceptualization, Modeling, Methodology, Software, Writing}

\credit{Methodology, Proof}

\credit{Modeling, Revision}

\credit{Supervision, Revision, Funding Support}

\cortext[1]{Corresponding author}

Long-Term Energy Management for Microgrid with Hybrid Hydrogen-Battery Energy Storage: A Prediction-Free Coordinated Optimization Framework

Ning Qi, Kaidi Huang, Zhiyuan Fan, Bolun Xu

Department of Earth and Environmental Engineering, Columbia University, New York, NY 10027, USA

Department of Electrical Engineering, Tsinghua University, Beijing 100084, China

Abstract

This paper studies the long-term energy management of a microgrid coordinating hybrid hydrogen-battery energy storage. We develop an approximate semi-empirical hydrogen storage model to accurately capture the power-dependent efficiency of hydrogen storage. We introduce a prediction-free two-stage coordinated optimization framework, which generates the annual state-of-charge (SoC) reference for hydrogen storage offline. During online operation, it updates the SoC reference using kernel regression and makes operation decisions based on the proposed adaptive virtual-queue-based online convex optimization (OCO) algorithm. We incorporate penalty terms for long-term pattern tracking and expert-tracking for step size updates. We prove that the proposed OCO algorithm achieves a sublinear bound on dynamic regret without using prediction information. Numerical studies based on the Elia and North China datasets show that the proposed framework significantly outperforms existing online optimization approaches, reducing operational costs and loss of load by around 30% and 80%, respectively. These benefits can be further enhanced with optimized settings for the penalty coefficient and step size of OCO, as well as more historical references.

keywords:

Long-Term Energy Management \sep Hydrogen \sep Hybrid Energy Storage \sep Online Convex Optimization \sep Microgrid

Highlights

Long-term energy management of microgrid considering seasonal uncertainties and seasonal storage

A prediction-free two-stage coordinated optimization framework

SoC reference of hydrogen storage generated from kernel regression and historical and AI-generated scenarios

A virtual-queue-based online convex optimization algorithm with expert-tracking

Numerical studies on Elia and North China with ground-truth datasets spanning 10 years

1 Introduction

1.1 Background and motivation

A microgrid is a self-contained electrical network with resources including energy storage (ES), renewable energy sources (RES), and controllable loads, operated in either grid-connected or island mode [5, 30]. Microgrids enhance energy resilience, promote decarbonization, and reduce transmission system investments, but the volatility of RES poses challenges to short-term supply-demand balances [23, 21]. Besides, seasonal variations in RES availability [13] and extreme weather events [42] have highlighted the significance of the long-term energy management of microgrids.

A hybrid energy storage system (HESS) [14, 24] offers a promising way to guarantee both the short-term and long-term supply-demand balance of microgrids. A HESS is composed of two or more ES units with different but complementary characteristics, such as duration and efficiency. In day-ahead or intra-day operations, batteries can effectively address the uncertainties introduced by RES and demand. For long-term operation, hydrogen storage consisting of an electrolyzer and a fuel cell can provide efficient solutions to seasonal energy shifting [16]. In this paper, we focus on a typical application: hybrid hydrogen-battery energy storage (H-BES). Given the differences in storage properties and unanticipated seasonal uncertainties, designing an effective long-term energy management framework for microgrids with H-BES is significant but challenging.

1.2 Literature review

Previous research mainly focuses on the short-term energy management of microgrids with H-BES. Two-stage robust optimization is proposed in [9] for the market operation of H-BES, where the uncertainties from RES are modeled by uncertainty sets. A two-stage distributionally robust optimization-based coordinated scheduling of an integrated energy system with H-BES is introduced in [26], where an ambiguity set is employed to model the uncertainties from RES and integrated energy loads. Two-stage stochastic energy management of H-BES is proposed in [7], where the uncertainties from RES, load, and prices are modeled by typical scenarios. However, these works rely solely on offline optimization methods with predefined uncertainty modeling, which may face optimality or feasibility issues in real-time operation. This motivates the research on real-time energy management with online optimization methods, such as the rolling-horizon method. Model predictive control (MPC) is a widely used rolling-horizon method, and multi-level MPC controllers are developed for microgrids with hydrogen or H-BES in [31, 13]. An actor-critic deep reinforcement learning method is proposed in [15] to address the multi-timescale coordinated dispatch of a microgrid with a hybrid battery and supercapacitor. MPC and approximate dynamic programming are jointly utilized for multi-stage coordinated dispatch in [19], which achieves robust real-time performance through continuously updated forecasts. However, the aforementioned works have two main limitations: (i) Short-term energy management methods may face infeasibility issues in long-term operation when considering seasonal variations in RES and load. (ii) The performance of these techniques strongly depends on the accuracy of the uncertainty predictions, which are often unavailable or unreliable for microgrid operators in practice.

To address the first limitation, recent studies have started to explore the long-term energy management of microgrids, which aims to solve the multi-time-period dispatch with non-anticipativity. Stochastic dynamic programming is technically sound, which can decompose the multi-period dispatch problem into sequential single-period dispatch problems through value function. And it is applied in [4] by learning the value function of H-BES. However, it becomes computationally intractable to train the value function if the storage duration spans multiple months. A continuous spectrum splitting approach is proposed in [10] to assign low-frequency uncertainty scenarios to hydrogen and high-frequency uncertainty scenarios to batteries for power balance, but this approach is designed for the planning of H-BES. A data-driven coordinated dispatch framework is proposed in [13], where the state of charge (SoC) reference for hydrogen storage is generated based on historical simulations. This reference is then updated and embedded into MPC for real-time operation. However, the use of MPC makes the entire framework dependent on forecasting.

Additionally, prediction-free online optimization methods are gaining increased attention. Lyapunov optimization and online convex optimization (OCO) are effective representatives [33]. Lyapunov optimization adopts a “1-lookahead” pattern, where uncertainties are observed first, followed by solving the Lyapunov drift problem [28]. It has wide applications in demand response [40], electric vehicle charging [34], microgrids [1], etc. The long-term operational cost minimization of hydrogen-based building energy systems is transformed into several single-slot subproblems using Lyapunov optimization [37]. A joint energy scheduling and trading algorithm based on Lyapunov optimization and a double-auction mechanism is designed in [43] to optimize the long-term energy cost of each microgrid. However, in some cases, the uncertainties cannot be observed before decision-making, and Lyapunov optimization becomes inapplicable. For instance, storage participants bid with unknown future prices, and the prices are cleared by the market after the bidding process [41]. In contrast, OCO adopts a “0-lookahead” pattern, where the decision is made before the uncertainties are observed. OCO has been utilized in demand side management [17] and ancillary services [39] due to its completely prediction-free and fast-response nature. However, to the best of our knowledge, no research has addressed the long-term energy management of microgrids with H-BES within the OCO framework. Applying OCO to this problem may face the following challenges: (i) OCO is problem-dependent without a predefined mathematical formulation, and there is no prior experience available as a reference for designing OCO for microgrids with H-BES. (ii) Although recent works [22, 20, 36, 6] have embedded inter-temporal constraints into the OCO framework, OCO still risks falling into local optima due to its myopic nature. (iii) OCO aims to achieve regret (Reg) that grows sublinearly with the time horizon $T$. However, most existing OCO algorithms fail to establish sublinear bounds for dynamic Reg [20, 36, 6] or require prediction information to improve the performance [22]. Please see Table 1 for a comprehensive comparison.

1.3 Research gap

Existing literature is summarized in Table 1. Although some works achieve good results in the long-term energy management of microgrids with H-BES, there are still several research gaps that have not been adequately addressed.

(1) Most existing studies employ a simplified operational model for hydrogen storage, using a constant energy conversion efficiency regardless of whether the storage operates at full capacity. However, the efficiency of hydrogen storage varies with the charge/discharge power and follows a nonlinear function [32]. Using a simplified model can result in sub-optimal or even infeasible solutions [3]. Therefore, it is crucial to incorporate this nonlinearity into the microgrid energy management with H-BES.

(2) Current microgrid energy management approaches either employ offline optimization methods (e.g., robust optimization [9], frequency-domain method [10]) or prediction-dependent online optimization methods (e.g., MPC [13], stochastic dynamic programming [4]). However, the distribution and prediction information is often inaccurate or unavailable in practical microgrid operations. Thus, designing a prediction-free optimization framework for microgrid energy management with H-BES is necessary.

(3) OCO is a promising “0-lookahead” online optimization method originating from the fields of machine learning and control [36]-[6]. However, OCO lacks a global view of long-term operations and adaptability to the high volatility of microgrids. Hence, it is important to extend traditional OCO methods to incorporate long-term operational patterns and time-varying properties.

Table 1: Comparison of existing literature on long-term and short-term energy management of H-BES.

Reference	Storage Type & Model	Long-term Optimization	Short-term Optimization	Prediction-Free
[9]	H-BES-Constant	X	Robust Optimization	$\checkmark$ (Offline)
[26]	H-BES-Constant	X	Distributionally Robust Optimization	$\checkmark$ (Offline)
[7]	H-BES-Constant	X	Stochastic Optimization	$\checkmark$ (Offline)
[31]	H-BES-Electrochemical	X	MPC	X
[15]	Battery+Supercapacitor-Constant	X	Deep Reinforcement Learning	X
[19]	Battery+Thermal Storage-Constant	X	MPC+Dynamic Programming	X
[4]	H-BES-Constant	Stochastic Dynamic Programming		X
[10]	H-BES-Constant	Spectrum Splitting Approach		$\checkmark$ (Offline)
[13]	H-BES-Constant	Historical Reference	MPC	X
[37]	H-BES-Constant	Lyapunov Optimization		$\checkmark$ (1-lookahead)
[43]	Hydrogen Fuel Cell-Constant	Lyapunov Optimization		$\checkmark$ (1-lookahead)
[22]	Not Given	X	OCO: Dynamic Reg $\mathcal{O}(T^{\max\{1-a-c,c\}})$ , 0<a,c<1	$\checkmark$ (0-lookahead)
[20]	Not Given	X	OCO: Dynamic Reg $\mathcal{O}(T^{c}P_{x}^{c})$ , 0<c<1	$\checkmark$ (0-lookahead)
[36]	Not Given	X	OCO: Static Reg $\mathcal{O}(T^{\max\{1-c,c\}})$ , 0<c<1	$\checkmark$ (0-lookahead)
[6]	Not Given	X	OCO: Dynamic Reg $\mathcal{O}(\max(T^{c}P_{x},T^{1-c}))$ , 0<c<1	$\checkmark$ (0-lookahead)
This Paper	H-BES-Semi-Empirical	Historical&AI-Generated Reference	OCO: Dynamic Reg $\mathcal{O}(T^{c}(1+P_{x})^{1-\kappa}+T^{1-c}(1+P_{x})^{\kappa})$ , $0<\kappa<c<1$	$\checkmark$ (0-lookahead)

a Depending on the baseline used, Reg is divided into static Reg, with the baseline being a single-period optimal solution, and dynamic Reg, with the baseline being the global optimal solution.

b $P_{x}$: path-length, i.e., the accumulated variation of optimal decisions; $P_{g}$: function variation, i.e., the accumulated variation of constraints.

1.4 Contributions

Motivated by the research gaps, this paper proposes a prediction-free coordinated optimization framework for long-term energy management of microgrid with H-BES while incorporating the nonlinearity of hydrogen storage and seasonal uncertainties from RES and load. Specifically, our contributions are threefold:

(1) Modeling: We propose an approximate semi-empirical hydrogen storage model using piecewise linear relaxation, which accurately captures the power-dependent efficiency of hydrogen storage. Simulations demonstrate that, compared to the constant efficiency model, the proposed approximation model avoids both overly optimistic and overly conservative strategies. This results in a reduction of the practical yearly operational cost by 10% or 36%, and a decrease in yearly loss of load by 1.94 MWh or 3.85 MWh.

(2) Solution Methodology: We introduce a prediction-free two-stage coordinated optimization framework. In the offline stage, the ex-post SoC references for hydrogen storage are generated by deterministic mixed-integer linear programming with historical and AI-generated data on RES and load. These references help to avoid myopic online decision-making and are incrementally updated by kernel regression with newly observed data. Subsequently, we develop an adaptive virtual-queue-based OCO algorithm for prediction-free online decision-making. Compared to the traditional OCO algorithm [22, 20, 36, 6], the proposed method innovatively incorporates a penalty term for long-term pattern tracking and expert-tracking for step size updates. The proposed OCO algorithm is proven to achieve a sublinear bound for dynamic regret.

(3) Numerical Study: We demonstrate the effectiveness of the proposed framework using ground-truth data from Elia [8] and North China [12]. Simulations show that introducing the reference significantly reduces operational costs and loss of load by 40%-57% and 60%-90%, respectively. Furthermore, compared to the prediction-dependent MPC method, the prediction-free OCO method further decreases operational costs and loss of load by 24%-29% and 73%-89%, respectively. These benefits can be further enhanced with optimized settings for the penalty coefficient and step size of OCO, as well as more historical references.

1.5 Paper Organization

We organize the remainder of the paper as follows. Section 2 presents an approximate semi-empirical modeling of hydrogen storage. Section 3 provides the problem formulation for long-term energy management of the microgrid with H-BES. Section 4 introduces the prediction-free two-stage coordinated optimization framework and the proof of OCO performance. Section 5 describes numerical case studies to verify the effectiveness of the proposed framework. Finally, we conclude this paper in Section 6.

2 Approximate semi-empirical hydrogen energy storage model

2.1 Structure of hydrogen storage system

A hydrogen storage system is composed of several key components, such as electrolyzers, hydrogen storage tanks, fuel cells, compressors, and other auxiliary equipment, as illustrated in Fig. 1. Electrolyzers convert electrical energy into chemical energy by producing hydrogen and oxygen. This paper considers the most mature and commonly used alkaline water electrolyzer. Hydrogen storage tanks are used to store the produced hydrogen. Fuel cells convert the stored hydrogen back into electricity, and we consider the typical type, proton exchange membrane fuel cell (PEMFC). Other auxiliary equipment, including the compressor, cooling system, and control system, is excluded from the modeling.

Figure 1: Schematic diagram of hydrogen storage system.

2.2 Alkaline water electrolyzer model

(1) Polarization curve

The polarization curve describes the electrochemical behavior of an electrolyzer, modeling the relationship between current and voltage. To account for the impact of temperature and pressure on the thermodynamics and electrochemical process within the electrolyzer, we combine the most used model proposed by Ulleberg [32] and the modified model proposed by Sanchez [27]:

\begin{equation}
\begin{split}
U_{\text{cell}}^{\text{E}} &= U_{\mathrm{rev}}+\left[\left(r_{1}+d_{1}\right)+r_{2}\cdot\theta+d_{2}\cdot P\right]\cdot\dfrac{i}{A}\\
&\quad+s\cdot\log\left[\left(t_{1}+\frac{t_{2}}{\theta}+\frac{t_{3}}{\theta^{2}}\right)\cdot\dfrac{i}{A}+1\right]
\end{split}
\tag{1}
\end{equation}

where $U_{\mathrm{rev}}$ and $U_{\text{cell}}^{\text{E}}$ are the reversible voltage and the cell voltage of the electrolyzer, $\theta$ and $P$ are the temperature and pressure, and $i$ and $A$ are the current and the effective area of the electrode. The parameters $r_{1}$, $r_{2}$, $d_{1}$, $d_{2}$, $t_{1}$, $t_{2}$, $t_{3}$, $s$ are constants that can be learned from experimental data.

(2) Faraday efficiency

Faraday efficiency is defined as the ratio of measured hydrogen production to the theoretical value. For an alkaline electrolyzer, the Faraday efficiency typically ranges from 85% to 95% and is affected by temperature. We adopt the four-parameter Faraday efficiency model as (2).

\begin{equation}
\eta_{\text{F}}=\left(\frac{(i/A)^{2}}{f_{1}+f_{2}\cdot\theta+(i/A)^{2}}\right)\cdot\left(f_{3}+f_{4}\cdot\theta\right)
\tag{2}
\end{equation}

where Faraday efficiency is defined as $\eta_{\text{F}}$ . Parameters $f_{1}$ , $f_{2}$ , $f_{3}$ , $f_{4}$ are the constants which can be learned from the experimental data.

(3) Approximate charging efficiency

According to Faraday’s law, the hydrogen production rate is defined as (3a). The charging efficiency is given by (3b).


\begin{align}
h^{\text{c}} &= 3600\cdot\dfrac{\eta_{\text{F}}\cdot M\cdot i\cdot N}{2F} \tag{3a}\\
\eta^{\text{H,c}} &= \dfrac{h^{\text{c}}\cdot\text{LHV}}{P_{\text{Stack}}}=3600\cdot\dfrac{\eta_{\text{F}}\cdot M\cdot\text{LHV}}{2F\cdot U_{\text{cell}}^{\text{E}}} \tag{3b}
\end{align}

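where $h^{\text{c}}$ is the hydrogen production rate of the electrolyzer, $M$ is the molar mass of hydrogen, $F$ is the Faraday constant, i.e., 96485 C/mol, and $N$ is the number of cells in the stack. $P_{\text{Stack}}=N\cdot U_{\text{cell}}^{\text{E}}\cdot i$ is the stack power, which gives the second equality in (3b). LHV is the lower heating value of hydrogen, i.e., 33.33 kWh/kg.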

As illustrated in Fig. 2, the blue curves from the semi-empirical model are non-linear and power-dependent, including a peak in efficiency at around 20% of the rated power. Therefore, a constant conversion efficiency cannot capture the variations in efficiency. To facilitate dispatch optimization, we adopt a piecewise linear approximation for hydrogen production, depicted by red dashed lines.
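To make the chain from (1) to (3b) concrete, the following Python sketch evaluates the cell voltage, the Faraday efficiency, the hydrogen production rate, and the charging efficiency at a few currents. All parameter values are illustrative placeholders chosen only to give plausible magnitudes; they are not the fitted constants behind Fig. 2. The sketch further assumes a base-10 logarithm in (1), sets the pressure coefficients $d_{1}$ and $d_{2}$ to zero, and takes $P_{\text{Stack}}=N\cdot U_{\text{cell}}^{\text{E}}\cdot i$ with explicit unit conversions (kg/mol for $M$, kW for power).

```python
import math

F = 96485.0      # Faraday constant, C/mol
M_H2 = 2.016e-3  # molar mass of H2, kg/mol
LHV = 33.33      # lower heating value of H2, kWh/kg

# Illustrative placeholder parameters (assumptions, not fitted values from the paper).
PAR = dict(U_rev=1.229, r1=8.05e-5, r2=-2.5e-7, d1=0.0, d2=0.0, s=0.185,
           t1=-0.1002, t2=8.424, t3=247.3,
           f1=2.5e4, f2=50.0, f3=0.99, f4=-2.5e-4,
           A=0.25, N=20)


def cell_voltage(i, theta, P, par=PAR):
    """Eq. (1): electrolyzer cell voltage in V (log taken as base 10, an assumption)."""
    J = i / par["A"]  # current density, A/m^2
    ohmic = (par["r1"] + par["d1"] + par["r2"] * theta + par["d2"] * P) * J
    t = par["t1"] + par["t2"] / theta + par["t3"] / theta ** 2
    return par["U_rev"] + ohmic + par["s"] * math.log10(t * J + 1.0)


def faraday_efficiency(i, theta, par=PAR):
    """Eq. (2): four-parameter Faraday efficiency."""
    J2 = (i / par["A"]) ** 2
    return J2 / (par["f1"] + par["f2"] * theta + J2) * (par["f3"] + par["f4"] * theta)


def production_rate(i, theta, par=PAR):
    """Eq. (3a): hydrogen production rate in kg/h."""
    return 3600.0 * faraday_efficiency(i, theta, par) * M_H2 * i * par["N"] / (2.0 * F)


def charging_efficiency(i, theta, P, par=PAR):
    """Eq. (3b): LHV-based efficiency, with the stack power converted from W to kW."""
    p_stack_kw = par["N"] * cell_voltage(i, theta, P, par) * i / 1000.0
    return production_rate(i, theta, par) * LHV / p_stack_kw


if __name__ == "__main__":
    for current in (50.0, 125.0, 250.0, 500.0):
        eta = charging_efficiency(current, 80.0, 1.0)
        print(f"i = {current:5.0f} A  efficiency = {eta:.3f}")
```

With these placeholders the efficiency rises at low current, where the Faraday efficiency dominates, and falls at high current, where the overvoltage dominates; the location of the peak depends entirely on the chosen parameters.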

2.3 PEMFC model

(1) Polarization curve

The polarization curve of the PEMFC is typically modeled using the equivalent circuit model proposed by Amphlett [2]. The cell voltage $U_{\text{cell}}^{\text{F}}$ is given by (4), which equals the open-circuit voltage $E_{\text{Nernst}}$ reduced by three types of irreversible losses: activation losses $U_{\text{act}}$, ohmic losses $U_{\text{ohmic}}$, and concentration losses $U_{\text{con}}$.


\begin{align}
U_{\text{cell}}^{\text{F}} &= E_{\text{Nernst}}-U_{\text{act}}-U_{\text{ohmic}}-U_{\text{con}} \tag{4a}\\
E_{\text{Nernst}} &= \frac{1}{2F}\Big[\Delta G-\Delta S(\theta-\theta_{\text{ref}})+R\cdot\theta\Big(\log(P_{\mathrm{H}_{2}})+\frac{\log(P_{\mathrm{O}_{2}})}{2}\Big)\Big] \tag{4b}\\
U_{\text{act}} &= a_{1}+a_{2}\cdot\theta+a_{3}\cdot\theta\cdot\log(C_{\mathrm{O}_{2}})+a_{4}\cdot\theta\cdot\log(i) \tag{4c}\\
U_{\text{ohmic}} &= i\cdot R_{\text{ohmic}}=i\left(r_{M}\cdot l/A+R_{c}\right) \tag{4d}\\
U_{\text{con}} &= B\cdot\log\left(1-\dfrac{J}{J_{\max}}\right),\quad J=\dfrac{i}{A} \tag{4e}
\end{align}

where $\Delta G$ is the Gibbs free energy. $\Delta S$ is the entropy change. $R$ is the gas constant (8.314 J/(K $\cdot$ mol)). $P_{\mathrm{H}_{2}}$ and $P_{\mathrm{O}_{2}}$ are the partial pressures of hydrogen and oxygen, respectively. $\theta_{\text{ref}}$ is the reference temperature (298.15 K). $C_{\mathrm{O}_{2}}$ is the oxygen concentration at the surface of the cathode catalyst. $r_{M}$ is the resistivity of the electrolyte membrane. $l$ is the thickness of the electrolyte membrane. $B$ is the concentration overpotential coefficient. $J$ and $J_{\max}$ are the current density and its maximum value. $a_{1}$, $a_{2}$, $a_{3}$, $a_{4}$ are constants that can be learned from experimental data.

(2) Approximate discharging efficiency

According to Faraday’s law, the hydrogen consumption rate is defined as (5a). The discharging efficiency is given by (5b).


\begin{align}
h^{\text{d}} &= 3600\cdot\dfrac{M\cdot i\cdot N}{2F} \tag{5a}\\
\eta^{\text{H,d}} &= \dfrac{P_{\text{Stack}}}{h^{\text{d}}\cdot\text{HHV}}=\dfrac{2F\cdot U_{\text{cell}}^{\text{F}}}{3600\,M\cdot\text{HHV}} \tag{5b}
\end{align}

where $h^{\text{d}}$ is the hydrogen consumption rate of the PEMFC. HHV is the higher heating value of hydrogen, i.e., 39.4 kWh/kg.

The blue curves from the semi-empirical model in Fig. 3 are non-linear and power-dependent. We also adopt a piecewise linear approximation for hydrogen consumption, depicted by red dashed lines.
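The piecewise linear approximation used for both hydrogen production and consumption can be sketched in Python as follows. The sketch builds each segment as the secant line between two adjacent breakpoints and returns the slope and intercept per segment. The breakpoint choice, the secant construction, and the toy curve `toy_h2_flow` are assumptions made for illustration; the text does not specify how the segments behind the red dashed lines were fitted.

```python
def pwl_segments(f, breakpoints):
    """Secant approximation of f: one (p_lo, p_hi, slope, intercept) tuple per interval."""
    segments = []
    for lo, hi in zip(breakpoints[:-1], breakpoints[1:]):
        slope = (f(hi) - f(lo)) / (hi - lo)
        segments.append((lo, hi, slope, f(lo) - slope * lo))
    return segments


def pwl_eval(segments, p):
    """Evaluate the approximation at power p using the first segment that contains p."""
    for lo, hi, slope, intercept in segments:
        if lo <= p <= hi:
            return slope * p + intercept
    raise ValueError("power outside the approximated range")


def toy_h2_flow(p):
    """Hypothetical concave hydrogen flow (kg/h) versus power (kW); not a fitted curve."""
    return 0.02 * p - 5e-5 * p ** 2


# Hypothetical breakpoints in kW, starting at an assumed minimum operating power.
BREAKPOINTS = [20.0, 40.0, 70.0, 100.0]
SEGMENTS = pwl_segments(toy_h2_flow, BREAKPOINTS)

if __name__ == "__main__":
    for lo, hi, slope, intercept in SEGMENTS:
        print(f"[{lo:5.1f}, {hi:5.1f}] kW: slope = {slope:.5f}, intercept = {intercept:.4f}")
```

Adding breakpoints tightens the approximation at the cost of one more binary variable per segment in the dispatch model, so the number of segments is a trade-off between accuracy and solution time.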

2.4 Equivalent hydrogen storage model

The equivalent hydrogen storage model is presented in (6). Constraint (6a) defines the relationship between SoC, charge power, and discharge power. Constraint (6b) keeps the SoC of hydrogen storage within its bounds. Constraint (6c) ensures a sustainable energy state for hydrogen storage over cycles. Constraints (6d)-(6f) describe the tractable formulation of the piecewise linear charging and discharging functions. Constraint (6g) limits the charging and discharging power of hydrogen storage.

Constraints: $\forall t\in\bm{\Omega}_{T},\ \forall p\in\bm{\Omega}_{P}$

\begin{gather}
E_{t+1}^{\text{H}}=E_{t}^{\text{H}}+\Delta t\,(h_{t}^{\text{c}}-h_{t}^{\text{d}})-E_{t}^{\text{H,L}} \tag{6a}\\
\underline{E}^{\text{H}}\leq E_{t}^{\text{H}}\leq\overline{E}^{\text{H}} \tag{6b}\\
E_{T}^{\text{H}}\geq E_{0}^{\text{H}} \tag{6c}\\
h_{t}^{\text{c}}=\sum_{p}\left(A_{p}P_{p,t}^{\text{H,c}}+B_{p}z_{p,t}^{\text{c}}\right),\quad h_{t}^{\text{d}}=\sum_{p}\left(C_{p}P_{p,t}^{\text{H,d}}+D_{p}z_{p,t}^{\text{d}}\right) \tag{6d}\\
P_{t}^{\text{H,c}}=\sum\nolimits_{p}P_{p,t}^{\text{H,c}},\quad P_{t}^{\text{H,d}}=\sum\nolimits_{p}P_{p,t}^{\text{H,d}} \tag{6e}\\
\sum\nolimits_{p}z_{p,t}^{\text{c}}=1,\quad \sum\nolimits_{p}z_{p,t}^{\text{d}}=1 \tag{6f}\\
\underline{P}_{p}^{\text{H}}z_{p,t}^{\text{c}}\leq P_{p,t}^{\text{H,c}}\leq\overline{P}_{p}^{\text{H}}z_{p,t}^{\text{c}},\quad 0\leq P_{p,t}^{\text{H,d}}\leq\overline{P}_{p}^{\text{H}}z_{p,t}^{\text{d}} \tag{6g}
\end{gather}

where $\bm{\Omega}_{T}$ and $\bm{\Omega}_{P}$ are the sets of time periods and piecewise linear segments, respectively. $P_{t}^{\text{H,c}}$, $P_{t}^{\text{H,d}}$, and $E_{t}^{\text{H}}$ are decision variables for the charge power, discharge power, and SoC of hydrogen storage. The SoC of hydrogen storage can be measured by the hydrogen mass or as a ratio of the rated capacity. $E_{t}^{\text{H,L}}$ is the hydrogen load for industrial production processes, such as fertilizer manufacturing and steel-making. $\underline{E}^{\text{H}}$ and $\overline{E}^{\text{H}}$ are the lower and upper bounds of SoC. $\underline{P}^{\text{H}}$ and $\overline{P}^{\text{H}}$ are the lower and upper bounds of power. The lower charging power bound is set by the minimum operating power of the electrolyzer, typically 15%-20% of the nominal power. $A_{p}$ and $B_{p}$ are the slope and the intercept of the piecewise linear charging segments. $C_{p}$ and $D_{p}$ are the slope and the intercept of the piecewise linear discharging segments. $z_{p,t}^{\text{c}}$ and $z_{p,t}^{\text{d}}$ are binary variables that select the active segment of the piecewise linear functions.
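As a sketch of how (6a), (6b), (6d), and (6g) act together in a single period, the Python snippet below selects the active segment for a given power (the segment with $z_{p,t}=1$), converts power to hydrogen flow, and advances the SoC. The segment values in `CHARGE_SEGS` and `DISCHARGE_SEGS` are hypothetical, and treating zero power as an idle state with zero hydrogen flow is an assumption of this sketch rather than something stated in (6).

```python
# Each segment is (p_lo, p_hi, slope, intercept): slope plays the role of A_p or C_p,
# intercept the role of B_p or D_p. All numbers are hypothetical (kW and kg/h).
CHARGE_SEGS = [(20.0, 50.0, 0.020, 0.00), (50.0, 100.0, 0.017, 0.15)]
DISCHARGE_SEGS = [(0.0, 50.0, 0.060, 0.00), (50.0, 100.0, 0.066, -0.30)]


def hydrogen_flow(power, segments):
    """Hydrogen flow for one period: pick the active segment, then apply (6d)."""
    if power == 0.0:
        return 0.0  # assumed idle state: no hydrogen is produced or consumed
    for p_lo, p_hi, slope, intercept in segments:
        if p_lo <= power <= p_hi:
            return slope * power + intercept
    raise ValueError("power violates the segment bounds in (6g)")


def soc_step(E, p_c, p_d, dt=1.0, h2_load=0.0, E_min=0.0, E_max=100.0):
    """SoC update (6a) followed by the bound check (6b); E is a hydrogen mass in kg."""
    h_c = hydrogen_flow(p_c, CHARGE_SEGS)
    h_d = hydrogen_flow(p_d, DISCHARGE_SEGS)
    E_next = E + dt * (h_c - h_d) - h2_load
    if not E_min <= E_next <= E_max:
        raise ValueError("SoC bound (6b) violated")
    return E_next


if __name__ == "__main__":
    print(soc_step(10.0, p_c=60.0, p_d=0.0))  # charging on the second segment
    print(soc_step(10.0, p_c=0.0, p_d=40.0))  # discharging on the first segment
```

A charging request below the first segment's lower bound (here 20 kW) raises an error, mirroring the minimum operating power of the electrolyzer encoded in (6g).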

3 Long-term energy management of microgrid

3.1 Microgrid structure

In this paper, we only consider the island mode of the microgrid, and the microgrid structure is illustrated in Fig. 4. The microgrid consists of renewable generators (wind and solar), diesel generators, H-BES and local loads.

3.2 Problem formulation

The objective defined in (7) aims to minimize the system cost. This cost comprises the production costs of the diesel generator, penalties for load curtailment (island mode), and operational costs of H-BES. Constraints (8a) and (8b) define the power bounds and ramping bounds of the diesel generator. Constraints (9) describe the battery and are similar in formulation to those for hydrogen storage (6), as both types of storage involve constraints on charging and discharging rates, SoC, etc. However, the battery efficiency is considered to be constant, there is no minimum charging power limit in (9d), and the self-discharge rate is considered in (9a). Constraints (10) limit the load curtailment and dispatchable RES. Constraint (11) limits the power import from the main grid. The power balance constraint is defined as (12). The complementarity constraints for charging and discharging of battery and hydrogen storage are relaxed and removed from the model since sufficient conditions are satisfied [18], i.e., the discharging price (“+”) is greater than the charging price (“0”). Moreover, the power flow constraints are neglected in the dispatch model since the microgrid network is generally designed with high reliability and large redundancy [29].

Objective Function:


\begin{align}
\underset{\bm{x}}{\min}\ G(\bm{x},\bm{\xi}) &= \sum_{t\in\bm{\Omega}_{T}}\left(C_{t}^{\text{L}}+C_{t}^{\text{D}}+C_{t}^{\text{B}}+C_{t}^{\text{H}}\right) \tag{7a}\\
C_{t}^{\text{L}}=c^{\text{L}}P_{t}^{\text{L}}\Delta t,\quad C_{t}^{\text{D}} &= c^{\text{D}}P_{t}^{\text{D}}\Delta t,\quad C_{t}^{\text{B/H}}=c^{\text{B/H}}P_{t}^{\text{B/H,d}}\Delta t \tag{7b}
\end{align}

where $c^{\text{L}}$ and $P_{t}^{\text{L}}$ are the load curtailment price and load curtailment power. $c^{\text{D}}$ and $P_{t}^{\text{D}}$ are the fuel price and power of diesel generator. $c^{\text{B}}$ and $c^{\text{H}}$ are marginal discharge costs of battery and hydrogen storage. $\Delta t$ is the time interval.

Constraints: $\forall t\in\bm{\Omega}_{T}$

\begin{gather}
\underline{P}^{\text{D}}\leq P_{t}^{\text{D}}\leq\overline{P}^{\text{D}} \tag{8a}\\
-RD^{\text{D}}\leq P_{t+1}^{\text{D}}-P_{t}^{\text{D}}\leq RU^{\text{D}} \tag{8b}
\end{gather}

\begin{gather}
E_{t+1}^{\text{B}}=(1-\varepsilon\Delta t)E_{t}^{\text{B}}+\Delta t\left(\eta^{\text{B,c}}P_{t}^{\text{B,c}}-P_{t}^{\text{B,d}}/\eta^{\text{B,d}}\right) \tag{9a}\\
\underline{E}^{\text{B}}\leq E_{t}^{\text{B}}\leq\overline{E}^{\text{B}} \tag{9b}\\
E_{T}^{\text{B}}\geq E_{0}^{\text{B}} \tag{9c}\\
0\leq P_{t}^{\text{B,c}}\leq\overline{P}^{\text{B}} \tag{9d}\\
0\leq P_{t}^{\text{B,d}}\leq\overline{P}^{\text{B}} \tag{9e}
\end{gather}

\begin{gather}
0\leq P_{t}^{\text{L}}\leq\xi_{t}^{\text{L}} \tag{10a}\\
0\leq P_{t}^{\text{R}}\leq\xi_{t}^{\text{R}} \tag{10b}
\end{gather}

\begin{equation}
0\leq P_{t}^{\text{G}}\leq\overline{P}_{t}^{\text{G}}
\tag{11}
\end{equation}

\begin{equation}
P_{t}^{\text{G}}+P_{t}^{\text{R}}+(P_{t}^{\text{B,d}}-P_{t}^{\text{B,c}})+(P_{t}^{\text{H,d}}-P_{t}^{\text{H,c}})+P_{t}^{\text{L}}=\xi_{t}^{\text{L}}
\tag{12}
\end{equation}

where $\underline{P}^{\text{D}}$ and $\overline{P}^{\text{D}}$ are the lower and upper power bounds of the diesel generator. $RD^{\text{D}}$ and $RU^{\text{D}}$ are the downward and upward ramping rates of the diesel generator. $P_{t}^{\text{B,c}}$, $P_{t}^{\text{B,d}}$, and $E_{t}^{\text{B}}$ are decision variables for the charge power, discharge power, and SoC of the battery. $\eta^{\text{B,c}}$ and $\eta^{\text{B,d}}$ are the charge and discharge efficiencies of the battery. $\varepsilon$ is the self-discharge rate of the battery. $\underline{E}^{\text{B}}$ and $\overline{E}^{\text{B}}$ are the lower and upper SoC bounds of the battery. $\overline{P}^{\text{B}}$ is the upper power bound of the battery. $\xi_{t}^{\text{L}}$ and $\xi_{t}^{\text{R}}$ are the load power and available RES power with uncertainties. $P_{t}^{\text{R}}$ is the dispatched RES power. The set of stochastic parameters is given by $\bm{\xi}=\{\xi_{t}^{\text{L}},\xi_{t}^{\text{R}}\}$. The set of decision variables is given by $\bm{x}=\{P_{t}^{\text{L}},P_{t}^{\text{D}},P_{t}^{\text{R}},P_{t}^{\text{B,c/d}},P_{t}^{\text{H,c/d}},E_{t}^{\text{B}},E_{t}^{\text{H}},h_{t}^{\text{c/d}},z_{t}^{\text{c/d}}\}$.

The multi-time-period economic dispatch of the microgrid with H-BES ($\textbf{P}_{1}$) is summarized in (13). Next, we present the methodology for solving this problem.

\begin{equation}
\begin{aligned}
(\textbf{P}_{1})\quad \underset{\bm{x}}{\min}\ & G(\bm{x},\bm{\xi})\\
\text{s.t.}\ & (6),\ (8)\text{-}(12)
\end{aligned}
\tag{13}
\end{equation}
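The cost terms in (7b) and the battery dynamics in (9a) can be evaluated with a few lines of Python, which is useful for checking a candidate schedule against the objective of ($\textbf{P}_{1}$). The prices in `PRICES`, the efficiencies, and the sample schedule are hypothetical values picked for illustration only; the function names are likewise not taken from the paper.

```python
# Hypothetical prices per unit of energy for curtailment (L), diesel (D),
# battery discharge (B), and hydrogen discharge (H).
PRICES = {"L": 1000.0, "D": 200.0, "B": 10.0, "H": 50.0}


def stage_cost(p_curt, p_diesel, p_bat_dis, p_h2_dis, prices=PRICES, dt=1.0):
    """Sum of the four single-period cost terms in (7b)."""
    return dt * (prices["L"] * p_curt + prices["D"] * p_diesel
                 + prices["B"] * p_bat_dis + prices["H"] * p_h2_dis)


def total_cost(schedule, prices=PRICES, dt=1.0):
    """Objective (7a): schedule is a list of (p_curt, p_diesel, p_bat_dis, p_h2_dis)."""
    return sum(stage_cost(*row, prices=prices, dt=dt) for row in schedule)


def battery_step(E, p_c, p_d, eta_c=0.95, eta_d=0.95, eps=0.0, dt=1.0):
    """Battery SoC update (9a) with constant efficiencies and self-discharge rate eps."""
    return (1.0 - eps * dt) * E + dt * (eta_c * p_c - p_d / eta_d)


if __name__ == "__main__":
    schedule = [(0.0, 1.0, 2.0, 0.5), (0.2, 0.0, 0.0, 1.0)]
    print(total_cost(schedule))
    print(battery_step(5.0, p_c=2.0, p_d=0.0, eta_c=0.9, eta_d=0.9, eps=0.01))
```

Because only discharging is priced in (7b), a schedule that discharges the cheaper storage first always has the lower stage cost for the same delivered power, which is the behavior the reference tracking in Section 4 is designed to counteract over long horizons.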

4 Prediction-free coordinated optimization framework

4.1 Motivations

Solving problem (13) poses the following challenges:

(1) Non-anticipativity: The long-term energy management of the microgrid typically spans more than one month or one season, whereas the forecast accuracy is acceptable only for several hours ahead. Hence, the load power and available RES power cannot be anticipated in the long-term optimization, and online optimization methods should be adopted to decompose the long-term optimization problem into several short-term optimization problems.

(2) Storage Dispatch Priority: Batteries with lower marginal discharge costs will be given priority over hydrogen storage with higher marginal discharge costs. We defer the complete proof to Appendix A. The battery-prioritized strategy is feasible and economical for short-term operation. However, this approach does not account for seasonal variations in RES and load, which will result in a lack of pre-stored hydrogen and load losses in long-term operations. Therefore, it is necessary to design a “reference” with a global view to help guide hydrogen storage actions.

(3) Convexity: The piecewise linearization introduces binary variables, and hence nonconvexity, into the optimization, which conflicts with the convexity assumption underlying most convex optimization approaches. However, introducing a global “reference” can mitigate this challenge by pre-determining the efficiency segment.

4.2 Two-stage coordinated optimization framework

We propose a two-stage coordinated optimization framework, as illustrated in Fig. 5. The framework consists of an offline stage and an online stage. The offline stage generates the ex-post SoC references for hydrogen storage using historical data on RES and load. These references help avoid myopic decision-making and are incrementally updated by kernel regression with newly observed data. Subsequently, online decisions are made using an adaptive virtual-queue-based OCO algorithm.

4.3 Offline-stage optimization

First, sequences of scenarios, denoted as $\bm{\xi_{s}}=\{\xi_{s,t}^{\text{L}},\xi_{s,t}^{\text{R}}\}$, $t\in\bm{\Omega}_{T}=\{1,2,\cdots,T\}$, $s\in\bm{\Omega}_{S}=\{1,2,\cdots,N\}$, are generated from historical data of the past few years. Additionally, to account for climate change and enhance the diversity of references, we can also collect references from different months and seasons. For instance, if we focus on a seasonal dispatch problem and have historical data for 5 years, then $T=1$ season and $N=5\times 4$. To enhance adaptability to extreme weather conditions, we add extreme scenarios to the historical data using Generative Adversarial Networks [11]. Afterward, we solve the deterministic mixed-integer linear programming (MILP) problem (14) to generate the SoC references of hydrogen storage, i.e., $\bm{E}_{s}^{\text{H,*}}=\{E_{s,t}^{\text{H,*}}\}$, $t\in\bm{\Omega}_{T}$, $s\in\bm{\Omega}_{S}$.

	$\displaystyle(\textbf{P}_{2})\hskip 4.0pt\underset{\bm{x_{s}}}{\min}$	$\displaystyle{\hskip 4.0pt}G(\bm{x_{s}},{\bm{\xi_{s}}})$
	s.t.	$\displaystyle{\hskip 4.0pt}\eqref{hydrogen}\text{, }\eqref{DG}-\eqref{powerbalance}$		(14)
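The scenario construction described above can be sketched as follows. This is only a minimal illustration, assuming each year of history is split into equal-length periods; the function name `build_scenarios` and the toy data are hypothetical and not part of the original framework.

```python
def build_scenarios(yearly_series, periods_per_year):
    """Split each yearly series into equal-length periods (one scenario each)."""
    scenarios = []
    for series in yearly_series:
        length = len(series) // periods_per_year
        for k in range(periods_per_year):
            scenarios.append(series[k * length:(k + 1) * length])
    return scenarios


# Toy data (hypothetical): 5 years, 8 samples per year, 4 seasons per year.
years = [[float(y * 10 + h) for h in range(8)] for y in range(5)]
scenarios = build_scenarios(years, periods_per_year=4)
num_scenarios = len(scenarios)  # N = 5 x 4 seasonal scenarios
```

With 5 years of data and 4 seasons per year, the sketch yields the $N=5\times 4$ scenarios mentioned in the text.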

4.4 Online-stage optimization

(1) Data-Driven Reference Tracking

Inspired by [13], we propose a data-driven reference tracking method that combines the 'lookback' pattern from historical data with the 'lookahead' pattern from newly observed data. First, we define $\bm{\xi}_{[t]}$ as the observed sequence of uncertainties from the first time slot to the current time slot $t$ in (15a). Additionally, $\bm{\xi}_{s,[t]}$ defined in (15b) represents the corresponding historical sequence of uncertainties in scenario $s$. Subsequently, by checking the similarity between $\bm{\xi}_{[t]}$ and $\bm{\xi}_{s,[t]}$ , dynamic weights $\omega_{s,t}$ are assigned to each historical scenario based on the Gaussian kernel function and Euclidean distance, as outlined in (15c). To account for the temporal dynamics, the Gaussian kernel function is modified with a scaling factor $t$. The optimal bandwidth $\sigma$ can be found through heuristic methods such as the bisection method. The weights are updated during real-time dispatch instead of being fixed at average or heuristic values. Finally, the SoC reference of hydrogen storage is updated as (15d). This updated reference also determines the efficiency segment of hydrogen storage, eliminating the nonconvexity issue that arises when using convex optimization approaches.


	$\displaystyle\bm{\xi}_{[t]}=\{\xi_{1}^{\text{G}},\xi_{1}^{\text{L}},\xi_{1}^{\text{R}},\cdots,\xi_{t}^{\text{G}},\xi_{t}^{\text{L}},\xi_{t}^{\text{R}}\}$		(15a)
	$\displaystyle\bm{\xi}_{s,[t]}=\{\xi_{s,1}^{\text{G}},\xi_{s,1}^{\text{L}},\xi_{s,1}^{\text{R}},\cdots,\xi_{s,t}^{\text{G}},\xi_{s,t}^{\text{L}},\xi_{s,t}^{\text{R}}\}$		(15b)
	$\displaystyle\omega_{s,t}=\dfrac{K_{t}(\bm{\xi}_{[t]},\bm{\xi}_{s,[t]})}{\sum_{s^{\prime}=1}^{N}K_{t}(\bm{\xi}_{[t]},\bm{\xi}_{s^{\prime},[t]})},\ K_{t}(x,y)=e^{-\frac{\|x-y\|_{2}^{2}}{t\sigma^{2}}}$		(15c)
	$\displaystyle\bm{E}_{[t]}^{\text{H,R}}=\sum\nolimits_{s=1}^{N}\omega_{s,t}\bm{E}_{s,[t]}^{\text{H,*}}$		(15d)
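The kernel-weighted update in (15c)-(15d) can be illustrated with a short sketch. It is a minimal example under stated assumptions: sequences are flattened into plain Python lists, and the names `kernel_weights` and `updated_reference` as well as the toy numbers are hypothetical.

```python
import math


def kernel_weights(xi_obs, xi_hist, t, sigma):
    """Normalized Gaussian-kernel weights in the form of (15c) at time slot t."""
    exponents = []
    for xi_s in xi_hist:
        sq_dist = sum((a - b) ** 2 for a, b in zip(xi_obs, xi_s))
        exponents.append(-sq_dist / (t * sigma ** 2))
    shift = max(exponents)  # subtracting the max leaves the normalized weights unchanged
    kernels = [math.exp(e - shift) for e in exponents]
    total = sum(kernels)
    return [k / total for k in kernels]


def updated_reference(weights, soc_refs):
    """Weighted SoC reference in the form of (15d); soc_refs[s] is scenario s."""
    horizon = len(soc_refs[0])
    return [sum(w * ref[k] for w, ref in zip(weights, soc_refs))
            for k in range(horizon)]


# Toy data (hypothetical): two historical scenarios observed over t = 2 slots.
xi_obs = [0.4, 0.6]
xi_hist = [[0.4, 0.6], [0.9, 0.1]]
weights = kernel_weights(xi_obs, xi_hist, t=2, sigma=0.5)
reference = updated_reference(weights, [[0.5, 0.6], [0.3, 0.2]])
```

In this toy case the first scenario matches the observation exactly and therefore receives the larger weight, so the updated reference lies closer to that scenario's SoC path.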

(2) Real-Time Corrective Dispatch

Real-time corrective dispatch ( $\textbf{P}_{3}$ ) is formulated in (16), which aims to minimize the instant operational cost while tracking the SoC reference of hydrogen storage. $\varphi$ is the penalty coefficient that controls the SoC deviation from the reference. $\textbf{P}_{3}$ admits the compact form in (17a), where $f_{t}$ and $g_{t}$ represent the time-varying objective function and time-varying constraints induced by the hydrogen storage SoC reference ${E}_{t}^{\text{H,R}}$ and uncertainties $\bm{\xi}$ , respectively. By leveraging Lagrangian relaxation, we can obtain the optimum by (17b), where $\lambda_{t}$ denotes the dual variables of the constraints $g_{t}$ and $\left\langle x,y\right\rangle$ denotes the standard inner product. However, without prior knowledge of uncertainties $\bm{\xi}$ , $f_{t}$ and $g_{t}$ are unknown to the online decision-maker. Hence, we next design a VQB-OCO algorithm to solve this issue.

	$\displaystyle(\textbf{P}_{3})\hskip 4.0pt\underset{\bm{x}_{t}}{\min}$	$\displaystyle{\hskip 4.0pt}G(\bm{x}_{t},{\bm{\xi}_{t}})+\varphi({E}_{t}^{\text{H}}-{E}_{t}^{\text{H,R}})^{2}$
	s.t.	$\displaystyle{\hskip 4.0pt}\eqref{hydrogen}\text{, }\eqref{DG}-\eqref{powerbalance}$		(16)


	$\displaystyle\min_{x_{t}}\ f_{t}(x_{t})\quad\text{s.t.}\quad g_{t}(x_{t})\leq 0$		(17a)
	$\displaystyle x_{t}=\arg\min_{x}\{f_{t}(x)+\left\langle\lambda_{t},\ g_{t}(x)\right\rangle\}$		(17b)

(3) VQB-OCO Algorithm

The key idea of VQB-OCO is to use information from past time to approximate the current situation. The virtual queue is employed as the substitution of unknown dual variables. Hence, we design the update policy for the virtual queue, decisions, and weights in (18)-(20). Finally, we can obtain the weighted average value of dispatch decision as (21). The VQB-OCO algorithm and the overall two-stage coordinated optimization framework are summarized in Algorithm 1.

	$\displaystyle Q_{i,t-1}=Q_{i,t-2}+\beta_{i,t-1}[g_{t-1}(x_{t-1})]_{+}$		(18)

	$\displaystyle x_{i,t}$	$\displaystyle=\arg\min_{x}\{\alpha_{i,t-1}\left\langle\partial f_{t-1}(x_{t-1}),\ x\right\rangle$
		$\displaystyle+\alpha_{i,t-1}\beta_{i,t-1}\left\langle Q_{i,t-1},\ [g_{t-1}(x)]_{+}\right\rangle+\|x-x_{i,t-1}\|^{2}\}$		(19)

	$\displaystyle\ell_{i,t-1}=\left\langle\partial f_{t-1}(x_{t-1}),\ x_{i,t-1}-x_{t-1}\right\rangle,\ \rho_{i,t}=\dfrac{\rho_{i,t-1}e^{-\gamma\ell_{i,t-1}}}{\sum_{j=1}^{M}\rho_{j,t-1}e^{-\gamma\ell_{j,t-1}}}$		(20)

	$\displaystyle x_{t}=\sum\nolimits_{i=1}^{M}\rho_{i,t}x_{i,t}$		(21)

Remark 1 (Approximation)

$f_{t}(x)$ is approximated using the first-order Taylor expansion $\left\langle\partial f_{t-1}(x_{t-1}),\ x\right\rangle$ . The term $\lambda_{t}$ is substituted by a virtual queue $Q_{i,t-1}$ . The constraint function $g_{t}(x)$ is replaced by the clipped constraint function $\left[g_{t-1}(x)\right]_{+}$ . A regularization term $\|x-x_{i,t-1}\|^{2}$ is added to ensure the convexity of the optimization problem and to enhance the convergence of the algorithm.

Remark 2 (Parallel Learning)

Determining the learning rate (step size) is important yet challenging. We assign different learning rates to the first two terms, $\alpha_{i,t-1}$ and $\beta_{i,t-1}$ . Rather than utilizing fixed or adaptive learning rates, we employ the expert-tracking algorithm proposed by [38], which computes $x_{t}$ in parallel with various learning rates as described in equation (19). The weights for each expert $\rho_{i,t}$ are updated based on their empirical performance using an exponential function, as shown in equation (20).

Remark 3 (Virtual Queue Updates)

Based on our previous work [25], the dual variables of the long-term constraints remain fixed when the optimum does not reach the constraint bounds. However, when the optimum reaches these bounds, the dual variables increase, representing a penalty. The update of the virtual queue follows the same pattern as described in equation (18) to limit constraint violations.
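To make the updates in (18)-(21) concrete, the following toy sketch runs a single online step for a scalar decision. It is a simplified illustration rather than the authors' implementation: the argmin in (19) is approximated by a grid search, and the constraint, gradient, step sizes, and all function names are hypothetical.

```python
import math


def clip_pos(value):
    """[.]_+ operator: keep only the violated part of a constraint."""
    return max(value, 0.0)


def queue_update(q_prev, beta, g_prev):
    """Virtual-queue recursion in the form of (18)."""
    return q_prev + beta * clip_pos(g_prev)


def expert_decision(x_prev_i, grad, queue, alpha, beta, g, grid):
    """Expert update in the form of (19); the argmin is approximated on a grid."""
    def objective(x):
        return (alpha * grad * x
                + alpha * beta * queue * clip_pos(g(x))
                + (x - x_prev_i) ** 2)
    return min(grid, key=objective)


def weight_update(rho, losses, gamma):
    """Exponential expert weighting in the form of (20)."""
    raw = [r * math.exp(-gamma * loss) for r, loss in zip(rho, losses)]
    total = sum(raw)
    return [r / total for r in raw]


def g(x):
    """Hypothetical constraint g(x) <= 0, i.e. x >= 0.2."""
    return 0.2 - x


# Toy data (all values hypothetical): decision in [0, 1], M = 2 experts.
grid = [k / 100 for k in range(101)]
grad = 1.0                       # subgradient of f_{t-1} at x_{t-1}
alphas, beta0, gamma = [0.1, 0.2], 1.0, 0.5
rho = [0.75, 0.25]               # (M+1)/[i(i+1)M] with M = 2
x_prev_experts, x_prev = [0.5, 0.5], 0.5
queues = [0.0, 0.0]

betas = [beta0 / math.sqrt(a) for a in alphas]
queues = [queue_update(q, b, g(x_prev)) for q, b in zip(queues, betas)]
x_experts = [expert_decision(xp, grad, q, a, b, g, grid)
             for xp, q, a, b in zip(x_prev_experts, queues, alphas, betas)]
losses = [grad * (xp - x_prev) for xp in x_prev_experts]
rho = weight_update(rho, losses, gamma)
x_t = sum(r * x for r, x in zip(rho, x_experts))  # combination as in (21)
```

Because the previous decision satisfies the toy constraint, the queues stay at zero and each expert simply takes a regularized gradient step; the expert with the larger step size moves further, and the final decision is the weighted average of the two.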

Algorithm 1: Prediction-Free Two-Stage Online Optimization Algorithm

Stage 1: Offline Optimization

Input: Historical scenarios of RES and load $\bm{\xi_{s}}$

Output: Historical reference for hydrogen storage $\bm{E}_{s}^{\text{H,*}}$

for $s=1$ to $N$ do

	Solve the deterministic MILP problem ( $\textbf{P}_{2}$ ) as (14) to generate the SoC references of hydrogen storage.

end for

Stage 2: Online Optimization

Input: Historical reference for hydrogen storage $\bm{E}_{s}^{\text{H,*}}$ ; real-time observation of RES and load $\bm{\xi_{t}}$

Output: Real-time dispatch decisions $x_{t}$

Step 1 - Initialization

Set $Q_{i,t}=0,\ x_{i,1}\in\bm{X},\ x_{1}=\sum_{i=1}^{M}\rho_{i,1}x_{i,1}$ , $\rho_{i,1}=(M+1)/[i(i+1)M],\ \forall i\in\{1,2,\cdots,M\}$ .

Step 2 - Reference Tracking & VQB-OCO

for $t=2$ to $T$ do

	Update real-time SoC reference as (15);

	for $i=1$ to $M$ parallel do

		Update virtual queue $Q_{i,t}$ as (18);

		Update decisions $x_{i,t}$ as (19);

		Update weights $\rho_{i,t}$ as (20).

	end for

	Calculate the dispatch decision $x_{t}$ as (21).

end for
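The initialization step and the parameter settings in (26) can be sketched as below. This is only an illustration under stated assumptions: the horizon `T = 8760` (hourly steps over one year) and the constants passed to the functions are hypothetical, as are the function names.

```python
import math


def expert_count(T, kappa):
    """Number of experts M as in (26)."""
    return math.floor(kappa * math.log2(1 + T)) + 1


def step_sizes(i, t, alpha0, beta0, c):
    """Step sizes alpha_{i,t} and beta_{i,t} as in (26)."""
    alpha = alpha0 * 2 ** (i - 1) / t ** c
    beta = beta0 / math.sqrt(alpha)
    return alpha, beta


def initial_weights(M):
    """Initial expert weights rho_{i,1} = (M+1)/[i(i+1)M] for i = 1..M."""
    return [(M + 1) / (i * (i + 1) * M) for i in range(1, M + 1)]


T = 8760                         # hypothetical horizon
M = expert_count(T, kappa=0.5)
rho_init = initial_weights(M)
alpha_31, beta_31 = step_sizes(i=3, t=4, alpha0=1.0, beta0=1.0, c=0.5)
```

The initial weights sum to one because $\sum_{i=1}^{M}\frac{1}{i(i+1)}=\frac{M}{M+1}$ , and they decrease with $i$ , so experts with smaller step sizes start with larger weights.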

(4) Performance of VQB-OCO

OCO focuses on the performance of regret (Reg), as defined in (22), where $y_{t}$ is the global optimum. Various OCO algorithms ensure that Reg is a sublinear function of T by designing parameters and update policy, as it implies that the algorithm performs as well as the global optimum in hindsight as T approaches infinity. Next, we provide parameter settings and a proof to achieve strictly sublinear dynamic regret.

	$\displaystyle\mathrm{Reg}=\sum\nolimits_{t=1}^{T}[f_{t}(x_{t})-f_{t}(y_{t})]$		(22)

Assumption 1. The functions $f_{t}$ and $g_{t}$ are convex. The feasible set $\bm{X}$ is convex and closed, and it has a bounded diameter $d(\bm{X})$ , i.e.,

	$\displaystyle\|x-y\|\leq d(\bm{X}),\ \forall x,y\in\bm{X}$		(23)

Assumption 2. There exists a positive constant $F$ such that

	$\displaystyle|f_{t}(x)-f_{t}(y)|\leq F,\ \|g_{t}(x)\|\leq F,\ \forall t\in\bm{\Omega}_{T},\ \forall x,y\in\bm{X}$		(24)

Assumption 3. The subgradients $\partial f_{t}(x)$ and $\partial g_{t}(x)$ exist, and there exists a positive constant $G$ such that

	$\displaystyle\|\partial f_{t}(x)\|\leq G,\ \|\partial g_{t}(x)\|\leq G,\ \forall t\in\bm{\Omega}_{T},\ \forall x\in\bm{X}$		(25)

Theorem 1

Suppose Assumptions 1–3 hold and the parameters are set as in (26), where $\kappa\in[0,c]$ , $c\in(0,1)$ , $\alpha_{0}>0$ , $\beta_{0}>0$ , and $\gamma_{0}\in(0,1/(\sqrt{2G}))$ are constants. Then, the performance of Reg and Vio is given by (27).

	$\displaystyle M=\lfloor\kappa\log_{2}(1+T)\rfloor+1,\ \alpha_{i,t}=\dfrac{\alpha_{0}2^{i-1}}{t^{c}},\ \beta_{i,t}=\dfrac{\beta_{0}}{\sqrt{\alpha_{i,t}}},\ \gamma=\dfrac{\gamma_{0}}{T^{c}}$		(26)

	$\displaystyle\text{Reg}=\mathcal{O}(T^{c}(1+P_{x})^{1-\kappa}+T^{1-c}(1+P_{x})^{\kappa})$		(27)

Proof: The proposed OCO algorithm achieves performance similar to that of [22], which attains $\mathcal{O}(T^{\max\{1-a-c,c\}})$ dynamic regret with the help of prediction data. It also improves on [20] and [6], whose dynamic regret is a linear function of $P_{x}$ , which is not satisfactory. Moreover, by setting $\kappa=c=0.5$ , the proposed OCO algorithm achieves $\mathcal{O}(\sqrt{T(1+P_{x})})$ , which aligns with the performance of [38], where long-term constraints are not considered. Hence, the proposed OCO algorithm is no worse than the existing versions. We defer the complete proof to Appendix B.
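As a worked check of the special case quoted above, and assuming that $P_{x}$ denotes the path length of the comparator sequence, $P_{x}=\sum_{t}\|y_{t+1}-y_{t}\|$ , as it is used in Appendix B, substituting $\kappa=c=0.5$ into (27) gives:

	$\displaystyle T^{0.5}(1+P_{x})^{0.5}+T^{0.5}(1+P_{x})^{0.5}=2\sqrt{T(1+P_{x})}=\mathcal{O}(\sqrt{T(1+P_{x})})$

Under this reading, the bound is sublinear in $T$ whenever $P_{x}$ grows sublinearly in $T$ .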

5 Case studies

5.1 Set-up

The main parameters and configurations are listed in Table 2. Specifically, the power capacities of the battery and hydrogen storage are each half of the load capacity. The storage durations of the battery and hydrogen are 2 hours and 400 hours, respectively. The installed capacity of renewables is 200 kW, comprising an equal share of solar and wind. The cost coefficients can be found in [13].

We demonstrate the effectiveness of the proposed method based on two datasets: (1) We use the 15-minute historical data on solar, wind, and load from 2014 to 2023 obtained from Belgium’s transmission system operator (Elia) [8] for the baseline case study. (2) We also use the hourly historical data of wind and load from 1981 to 2020 in North China [12] to demonstrate the impact of data resolution and data quantity.

The optimization is coded in MATLAB with the YALMIP interface and solved by Gurobi 11.0. The computing environment is an Intel Core i9-13900HX @ 2.30 GHz with 32 GB of RAM.

Table 2: Parameters and configuration of the test microgrid.

Parameters	Value	Parameters	Value	Parameters	Value
Initial SoC	0.5	$\overline{P}^{\text{H}}$	50 kW	$c^{\text{L}}$	\$5/kWh
$\eta^{\text{B,c/d}}$	0.9	$\overline{E}^{\text{H}}$	20 MWh	$\overline{P}^{\text{D}}$	50 kW
$\epsilon$	1%/month	$c^{\text{B}}$	\$0.02/kWh	Wind Capacity	100 kW
$\overline{P}^{\text{B}}$	50 kW	$c^{\text{H}}$	\$0.03/kWh	Solar Capacity	100 kW
$\overline{E}^{\text{B}}$	100 kWh	$c^{\text{D}}$	\$0.3/kWh	Load Capacity	100 kW
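The energy capacities follow from the rated power and the storage durations stated above. A minimal arithmetic sketch (the function name is hypothetical) is:

```python
def energy_capacity_kwh(power_kw, duration_h):
    """Energy capacity implied by a rated power and a storage duration."""
    return power_kw * duration_h


battery_kwh = energy_capacity_kwh(power_kw=50, duration_h=2)     # battery: 2-hour duration
hydrogen_kwh = energy_capacity_kwh(power_kw=50, duration_h=400)  # hydrogen: 400-hour duration
hydrogen_mwh = hydrogen_kwh / 1000
```

With a rated power of 50 kW each, the 2-hour battery corresponds to 100 kWh and the 400-hour hydrogen storage to 20 MWh.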

5.2 Offline-stage optimization

(1) Data visualization

We first show the monthly average available renewable and load power of Elia from 2014 to 2023 in Figure 6. It is observed that all of them exhibit seasonal patterns. Wind power is abundant in spring and winter but scarce in summer, while solar power is relatively high in summer and extremely low in winter. Load power peaks in winter. Correspondingly, the net load also peaks in winter and hits a low in summer. This indicates the critical role of hydrogen storage in addressing the seasonal variations in renewables and load, as well as in maintaining the long-term energy balance of the microgrid.

(2) Impact of hydrogen storage efficiency model

Next, we compare the offline energy management performance in 2023 with different hydrogen models, including:

(E1): Piecewise linear model as proposed in (6), with parameters fitted to the experimental data shown in Figures 2 and 3.

(E2): Constant efficiency model with both the highest charging and discharging efficiencies of 63%.

(E3): Constant efficiency model with the lowest charging and discharging efficiencies, i.e., 53% and 45%, respectively.

The hydrogen storage SoC is shown in Figure 7, and the other results are summarized in Table 3. It is observed that using the highest constant efficiency model results in the most optimistic performance, with the lowest operational cost (\$97,188), whereas the lowest constant efficiency model yields the highest operational cost (\$132,933) due to the significant increase in costs of diesel generation and loss of load. However, using a constant efficiency model can lead to feasibility issues in practical operation, resulting in losses in either charging or discharging power. Additionally, the optimistic strategy generated by the highest efficiency model introduces an additional loss of load of 1.94 MWh. Considering the practical consequences, E2 and E3 increase the total system costs by 10% and 36%, respectively, compared to E1. This result demonstrates that the proposed model can capture the characteristics of power-dependent efficiency and achieve more reliable and economical performance in practice.

Table 3: Yearly operational performance of the microgrid using different efficiency models.

Model	$\text{Cost}^{\text{T/P}}$ $(\$10^{4})$	$\sum P_{t}^{\text{D,T/P}}\Delta t$ $(\text{MWh})$	$\sum P_{t}^{\text{L,T/P}}\Delta t$ $(\text{MWh})$	$\sum\Delta P_{t}^{\text{H,c}}\Delta t$ $(\text{MWh})$	$\sum\Delta P_{t}^{\text{H,d}}\Delta t$ $(\text{MWh})$
E1	9.87/9.87	324.68/324.68	0.00/0.00	0.00	0.00
E2	9.81/10.78	322.19/322.19	0.00/1.94	0.00	-1.94
E3	13.29/13.29	374.33/374.19	3.85/3.85	-0.14	0.00

5.3 Online-stage optimization

(1) Reference tracking

We test the reference tracking performance in 2023 with different methods, including:

(R1): Global optimal reference generated by deterministic multi-period optimization with perfect knowledge of uncertainty realizations.

(R2): The proposed data-driven reference tracking, trained with 2014-2022 historical data.

(R3): The proposed data-driven reference tracking, trained with 2014-2022 historical data and AI-generated data. The AI-generated data is produced by randomly reducing the historical solar power but increasing the wind power by 10%-50% each quarter.

(R4)-(R6): Similar to (R3), except that the AI-generated data is produced as follows:

(R4): Randomly reducing both the historical solar and wind power by 10%-50% each quarter.

(R5): Increasing the historical solar and wind power by 10%-50% each quarter.

(R6): Using all the AI-generated data from (R3) to (R5).

(R7): The reference generated using the average historical performance.

The hydrogen SoC references are compared in Figure 8, and the tracking performance is summarized in Table 4. The root mean square error (RMSE) is calculated between the generated reference and the global optimal reference. The optimal choice of $\sigma$ obtained through the bisection method is 0.098. It is observed that the references generated by the proposed methods R2-R6 can better track the seasonal variations of RES and load, resulting in lower RMSE compared to the reference generated by R7. This is because the proposed methods employ kernel regression to update the weights of historical references instead of using fixed and average values. Additionally, generated data inputs can improve the tracking performance, as in R4 and R5, because they create new potential extreme scenarios. However, as shown by the performance of R6, an excessive amount of generated data can lead to overfitting in the regression model, thereby reducing tracking accuracy. The average computation time for a single time interval is around 2 ms, which is acceptable even for minute-level scheduling and control.

Table 4: Reference tracking performance with different reference tracking methods.

Method	R1	R2	R3	R4	R5	R6	R7
RMSE	——	8.41%	8.96%	8.25%	8.01%	8.33%	10.54%
Time (ms)	——	0.35	1.93	1.88	2.01	6.27	0.00
Data Size (Year)	——	9	45	45	45	116	9
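For reference, the RMSE metric can be sketched as follows. The sketch assumes that both SoC trajectories are expressed as fractions of the storage capacity, so that the result reads as a percentage after scaling; this normalization and the toy values are assumptions, not details stated in the source.

```python
import math


def rmse(reference, optimal):
    """Root mean square error between a generated and an optimal SoC reference."""
    if len(reference) != len(optimal):
        raise ValueError("trajectories must have the same length")
    squared = [(r - o) ** 2 for r, o in zip(reference, optimal)]
    return math.sqrt(sum(squared) / len(squared))


# Toy trajectories (hypothetical), SoC expressed as a fraction of capacity.
error = rmse([0.5, 0.7], [0.4, 0.8])
```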

Furthermore, we test the reference tracking performance on the North China dataset. The data visualization and reference tracking performance are shown in Figure 13 in Appendix C. Compared with the Elia dataset, both the wind and load data exhibit less variation across seasons and years. The maximum variations across years are 0.22 and 0.09 for wind and load, respectively, while for the Elia dataset, they are 0.25 and 0.15 for wind and load, respectively. Specifically, the load data in North China maintains the same shape across the years. The optimal choice of $\sigma$ obtained through the bisection method is 50. The proposed tracking method performs the same as the averaged method, with an RMSE of 0.046. This is because the historical data and historical references show significant similarity across years, causing the proposed method to select the average value when updating weights. The above results demonstrate the benefit of using a data-driven reference tracking method when historical uncertainties exhibit significant variations across years. Additionally, an appropriate amount of AI-generated data can improve adaptability to extreme weather scenarios.

(2) Online decision-making

We test the online decision-making in 2023 with different dispatch methods, including:

(M0): Deterministic optimization with perfect knowledge of uncertainty realizations for the whole year results in the global optimum. This method is optimistic and not applicable in practice. Hence, it only serves as a baseline.

(M1): The proposed prediction-free coordinated approach, which utilizes OCO for online optimization and R5 for reference tracking.

(M2): The scheduling-correction method proposed by [13], which utilizes MPC for online optimization and R5 for reference tracking.

(M3): Online optimization by OCO without reference tracking.

(M4): Online optimization by MPC without reference tracking.

The operational performance is summarized in Table 5. It is observed that the proposed method M1 outperforms the others in terms of cost-effectiveness, achieving an optimality gap of 27% compared with M0. This is because it has the smallest loss of load and the smallest RMSE relative to the reference. Additionally, M1 and M2 obtain much better economic performance than M3 and M4. This can be explained by the hydrogen SoC shown in Figure 9. M3 and M4 generate myopic decisions by continuously discharging the hydrogen storage to reduce short-term operational costs during the winter peak. They fail to charge the hydrogen storage during the renewable-rich spring and summer, resulting in an extremely low SoC after winter. Consequently, these myopic decisions prevent hydrogen storage from effectively shifting energy seasonally, leading to a substantial loss of load and low utilization of RES in practice. In contrast, M1 and M2 follow the pattern of the reference, and M1 follows it more closely (lower RMSE) since OCO utilizes the real-time observed data. This result demonstrates the benefit of introducing a global reference for the online optimization method. We also apply the SDP algorithm in [4] to this problem. However, the SDP cannot converge within 24 hr due to the “curse of dimensionality”. Therefore, it is infeasible to use SDP for long-term energy management of a microgrid with H-BES.

Moreover, we compare the power dispatch strategies of H-BES and DG using M1 and M2, as shown in Figure 10. It is observed that M1 can better track the net load curve using only hydrogen storage actions. In contrast, M2 keeps charging hydrogen storage and uses DG when renewables are insufficient. This is because the OCO-based method simultaneously tracks the previous decisions and the reference, updating the strategy based on newly observed data, which is more adaptive to the time-varying environment, whereas the MPC-based method only tracks the reference and updates the strategy based on forecast data. Therefore, if the reference or forecast is not accurate, the MPC-based method may struggle to achieve good performance. This can also be explained by the SoC gaps shown in Figure 9. Furthermore, due to prediction errors, MPC-based online optimization may encounter infeasibility issues, resulting in additional loss of load and penalty costs. In contrast, the OCO-based method makes decisions based on observed data, thereby avoiding infeasibility issues. Regarding computational efficiency, the OCO method requires less time per step than MPC. Both methods achieve single-step optimization in tens of ms, which is acceptable for most online optimization scenarios.

Table 5: Yearly operational performance of the microgrid in Elia using different optimization methods.

Method	Cost $(\$10^{4})$	$\sum P_{t}^{\text{D}}\Delta t$ $(\text{MWh})$	$\sum P_{t}^{\text{L}}\Delta t$ $(\text{MWh})$	RMSE	Time (ms)
M0	9.87	324.68	0.00	0.00	25.20
M1	12.55	336.30	4.59	10.46	87.87
M2	17.38	488.46	5.19	19.70	97.77
M3	29.43	257.77	43.19	53.19	48.59
M4	38.05	457.58	48.45	52.93	51.64
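The optimality gap quoted for M1 can be reproduced from the cost column of Table 5. The sketch below is a minimal check; the helper name `optimality_gap` is hypothetical, and the costs are the rounded table values in units of \$10^4.

```python
# Yearly costs from Table 5, in units of 10^4 dollars.
costs = {"M0": 9.87, "M1": 12.55, "M2": 17.38, "M3": 29.43, "M4": 38.05}


def optimality_gap(cost, baseline):
    """Relative cost increase over the perfect-foresight baseline, in percent."""
    return 100.0 * (cost - baseline) / baseline


gaps = {method: optimality_gap(cost, costs["M0"])
        for method, cost in costs.items()}
```

With these rounded values, the gap of M1 relative to M0 comes out at roughly 27%, and the gaps grow in the order M1, M2, M3, M4.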

Additional tests on the North China dataset align with the above results, showing that optimization with reference outperforms optimization without reference, and OCO-based methods outperform MPC-based methods. The results are summarized in Table 7 and Figure 14 in Appendix C. Compared with the case results in Elia, both M1 and M2 achieve SoC strategies with smaller gaps from M0, due to better reference tracking performance. However, in terms of cost and reliability performance, the North China case shows worse results. This is because renewable energy in the North China case is solely supplied by wind power, leading to insufficient generation and higher load curtailment.

5.4 Sensitivity analysis

In this subsection, we further investigate the key factors that affect the performance of the proposed optimization framework.

(1) Penalty Coefficient of Reference Tracking. The penalty coefficient represents the tradeoff between instant operational cost and reference tracking performance. However, since this reference is estimated from historical data, it may not be optimal for the current year. We compare the cost and tracking performance in Figure 11 when scaling up the penalty coefficient $\varphi$ . It is observed that the RMSE decreases monotonically with the penalty coefficient, while the operational cost initially decreases to a minimum at $\varphi=90000$ and then gradually increases with the penalty coefficient. This suggests that improving tracking performance does not always lead to lower operational costs.

(2) Reference and step size of OCO Algorithm. As illustrated in Section 5.3, the reference has a critical impact on operational performance. Table 6 summarizes the performance using M1 with different references (i.e., fixed reference generated by R7 and updated reference generated by R5) and different step sizes (i.e., fixed step size proposed by [22] and step size generated by the proposed expert-tracking algorithm). It is observed that compared to fixed reference, using the proposed updated reference reduces operational cost by 1.09%-1.70% and the loss of load by 5.68%-8.52%. This highlights the importance of finding the right reference for hydrogen storage. Additionally, the proposed step size setting decreases the operational cost by 0.67%-1.29% and the loss of load by 0.06%-3.16%. This is because the step size generated by expert-tracking can better adapt to the changing cost function and avoid heuristic settings.

Table 6: Yearly operational performance of the microgrid in Elia using M1 with different references and step sizes.

Reference	step size	$\text{Cost}^{\text{T/P}}$ $(\$10^{4})$	$\sum P_{t}^{\text{D,T/P}}\Delta t$ $(\text{MWh})$	$\sum P_{t}^{\text{L,T/P}}\Delta t$ $(\text{MWh})$	RMSE
Updated	Fixed	12.55	336.30	4.60	10.87
Updated	Expert-Tracking	12.47	333.66	4.60	10.99
Fixed	Fixed	12.77	336.33	5.03	9.26
Fixed	Expert-Tracking	12.61	333.58	4.87	9.26

(3) Sizing of Renewables. We further compare the reliability performance of M1 and M2 in Figure 12 when scaling up renewable capacity. It is observed that the loss of load decreases with increased renewable capacity. Moreover, M1 achieves an acceptable reliability level with twice the renewable capacity, while M2 requires at least 4 times the renewable capacity to meet reliability requirements. This demonstrates the benefit of the proposed method in reducing renewable planning costs.

6 Conclusion

This paper proposes a prediction-free coordinated optimization framework for long-term energy management of a microgrid with H-BES. To accurately capture the power-dependent efficiency of hydrogen storage, we propose an approximate semi-empirical hydrogen storage model using piecewise linear relaxation. Moreover, to address the long-term operational patterns of renewables and load and to eliminate dependence on predictions, we introduce a prediction-free, two-stage coordinated optimization framework. The key idea is to generate and track the SoC reference of hydrogen storage using historical scenarios and kernel regression to avoid myopic online decisions. Online decisions are then made by the proposed VQB-OCO algorithm, which leverages a feedback control policy and newly observed data. Case studies on Elia and North China verify that:

(1) Compared to the constant efficiency model, the proposed approximation model avoids both overly optimistic and overly conservative strategies.

(2) The proposed optimization framework outperforms existing online optimization methods and achieves an acceptable gap compared to deterministic optimization with perfect foresight of uncertainties.

(3) The SoC reference of hydrogen storage is critical to overall performance. Therefore, more reliable historical or AI-generated scenarios, along with sophisticated techniques for setting penalty coefficients, are needed.

(4) The OCO algorithm typically lacks a global view and is sensitive to step size settings. Thus, it is beneficial to incorporate a penalty term for long-term pattern tracking and expert-tracking for step size updates.

Future work will focus on addressing constraint violations in the OCO algorithm and extending the proposed framework to operations in a market environment.

Appendix

Appendix A Proof of storage priority

The marginal discharge cost of the battery is \$10/MWh-\$30/MWh, while that of hydrogen storage is \$1/kg-\$2/kg (\$30/MWh-\$60/MWh), which is much higher. Assume we have an optimal discharge power from H-BES that is entirely supplied by the battery, i.e., $x^{\text{1}}=P_{t}^{\text{B,d}}$ . Consider another solution in which the same discharge power is a mix of battery and hydrogen storage, i.e., $x^{\text{2}}=\rho P_{t}^{\text{B,d}}+(1-\rho)P_{t}^{\text{H,d}}$ . From the H-BES costs of the two solutions in (28) and the marginal costs and efficiencies of the two types of storage, we can conclude that $C_{t}^{\text{H-BES,2}}>C_{t}^{\text{H-BES,1}}$ . This indicates that hydrogen storage will not be activated until the battery is fully discharged or charged. Hence, we finish the proof.


	$\displaystyle P_{t}^{\text{B,d}}=\rho P_{t}^{\text{B,d}}+(1-\rho)P_{t}^{\text{H,d}}$		(28a)
	$\displaystyle C_{t}^{\text{H-BES,1}}=c^{\text{B}}P_{t}^{\text{B,d}}\Delta t$		(28b)
	$\displaystyle C_{t}^{\text{H-BES,2}}=(c^{\text{B}}\rho P_{t}^{\text{B,d}}+c^{\text{H}}(1-\rho)P_{t}^{\text{H,d}})\Delta t$		(28c)
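The comparison can be made explicit with a short worked step. Substituting (28a) into (28b) and subtracting the result from (28c), under the assumptions $\rho\in[0,1)$ and $P_{t}^{\text{H,d}}>0$ , gives:

	$\displaystyle C_{t}^{\text{H-BES,2}}-C_{t}^{\text{H-BES,1}}=(c^{\text{H}}-c^{\text{B}})(1-\rho)P_{t}^{\text{H,d}}\Delta t>0\quad\text{whenever }c^{\text{H}}>c^{\text{B}}$

For instance, with the coefficients in Table 2 ( $c^{\text{B}}=0.02$ and $c^{\text{H}}=0.03$ in \$/kWh), each kWh shifted from the battery to hydrogen storage adds \$0.01 to the H-BES cost.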

Appendix B Proof of bounded dynamic regret

Let $\{x_{i,t}\}$ and $\{x_{t}\}$ be the sequences generated by Algorithm 1. Let $\{y_{t}\}$ be a global optimum in the feasible set $\bm{X}$ . Since $f_{t}$ is convex, and by (25), we have:

		$\displaystyle f_{t}(x_{i,t})-f_{t}(y_{t})\leq\left\langle\partial f_{t}(x_{i,t}),\ x_{i,t}-y_{t}\right\rangle$		(29)
		$\displaystyle\leq G\|x_{i,t}-x_{i,t+1}\|+\left\langle\partial f_{t}(x_{i,t}),\ x_{i,t+1}-y_{t}\right\rangle$
		$\displaystyle\leq\frac{G^{2}\alpha_{i,t}}{2}+\frac{1}{2\alpha_{i,t}}\|x_{i,t}-x_{i,t+1}\|^{2}+\left\langle\partial f_{t}(x_{i,t}),\ x_{i,t+1}-y_{t}\right\rangle$

For the rightmost term of (29), we have:

		$\displaystyle\left\langle\partial f_{t}(x_{i,t}),\ x_{i,t+1}-y_{t}\right\rangle$		(30)
		$\displaystyle=\left\langle\beta_{i,t+1}(\partial[g_{t}(x_{i,t+1})]_{+})^{T}Q_{i,t},\ y_{t}-x_{i,t+1}\right\rangle$
		$\displaystyle+\left\langle\partial f_{t}(x_{i,t})+\beta_{i,t+1}(\partial[g_{t}(x_{i,t+1})]_{+})^{T}Q_{i,t},\ x_{i,t+1}-y_{t}\right\rangle$

Since $g_{t}$ is a convex function, it is trivial to show that $[g_{t}]_{+}$ is also convex; hence the first term of (30) can be relaxed:

		$\displaystyle\left\langle\beta_{i,t+1}(\partial[g_{t}(x_{i,t+1})]_{+})^{T}Q_{i,t},\ y_{t}-x_{i,t+1}\right\rangle$
		$\displaystyle\leq\beta_{i,t+1}\left\langle Q_{i,t},\ [g_{t}(y_{t})]_{+}\right\rangle-\beta_{i,t+1}\left\langle Q_{i,t},\ [g_{t}(x_{i,t+1})]_{+}\right\rangle$		(31)

From Lemma 1 in [35], we have:

		$\displaystyle\left\langle\partial f_{t}(x_{i,t})+\beta_{i,t+1}(\partial[g_{t}(x_{i,t+1})]_{+})^{T}Q_{i,t},\ x_{i,t+1}-y_{t}\right\rangle$
		$\displaystyle\leq\frac{1}{\alpha_{i,t}}(\|y_{t}-x_{i,t}\|^{2}-\|y_{t}-x_{i,t+1}\|^{2}-\|x_{i,t+1}-x_{i,t}\|^{2})$		(32)

Combining (20), (29)-(32), we have:

	$\displaystyle\ell_{t}(x_{i,t})-\ell_{t}(y_{t})$	$\displaystyle\leq\frac{G^{2}\alpha_{i,t}}{2}+\frac{1}{\alpha_{i,t}}(\|y_{t}-x_{i,t}\|^{2}-\|y_{t}-x_{i,t+1}\|^{2})$
		$\displaystyle+\beta_{i,t+1}\left\langle Q_{i,t},\ [g_{t}(y_{t})]_{+}\right\rangle$		(33)

Since $y_{t}$ satisfies $g_{t}(y_{t})\leq 0$ , the last term of (33) vanishes. Summing over $t$ , we have:

		$\displaystyle\sum_{t=1}^{T}(\ell_{t}(x_{i,t})-\ell_{t}(y_{t}))\leq\sum_{t=1}^{T}\frac{G^{2}\alpha_{i,t}}{2}$
		$\displaystyle+\sum_{t=1}^{T}\frac{1}{\alpha_{i,t}}(\|y_{t}-x_{i,t}\|^{2}-\|y_{t}-x_{i,t+1}\|^{2})$		(34)

For the first term of (34), we have:

	$\displaystyle\sum_{t=1}^{T}\frac{G^{2}\alpha_{i,t}}{2}=\frac{\alpha_{0}2^{i-1}G^{2}}{2}\sum_{t=1}^{T}\frac{1}{t^{c}}\leq\frac{\alpha_{0}2^{i-1}G^{2}}{2(1-c)}T^{1-c}$		(35)
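The last inequality in (35) relies on the bound $\sum_{t=1}^{T}t^{-c}\leq T^{1-c}/(1-c)$ for $c\in(0,1)$ , which follows from comparing the sum with the integral of $t^{-c}$ . A quick numerical sanity check (a sketch with hypothetical helper names, not part of the proof) is:

```python
def power_sum(T, c):
    """Left-hand side: sum of t^(-c) for t = 1..T."""
    return sum(t ** -c for t in range(1, T + 1))


def integral_bound(T, c):
    """Right-hand side: T^(1-c) / (1-c), valid for 0 < c < 1."""
    return T ** (1 - c) / (1 - c)


# Check the inequality on a few horizons and exponents.
checks = [(T, c, power_sum(T, c) <= integral_bound(T, c))
          for T in (1, 10, 1000) for c in (0.25, 0.5, 0.75)]
all_hold = all(ok for _, _, ok in checks)
```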

By leveraging (23) and update policy (26), we have:

		$\displaystyle\sum_{t=1}^{T}\frac{t^{c}}{\alpha_{0}2^{i-1}}\left(\|y_{t}-x_{i,t}\|^{2}-\|y_{t}-x_{i,t+1}\|^{2}\right)$		(36)
		$\displaystyle=\frac{1}{\alpha_{0}2^{i-1}}\sum_{t=1}^{T}\big(t^{c}\|y_{t}-x_{i,t}\|^{2}-(t+1)^{c}\|y_{t+1}-x_{i,t+1}\|^{2}$
		$\displaystyle+(t+1)^{c}\|y_{t+1}-x_{i,t+1}\|^{2}-t^{c}\|y_{t+1}-x_{i,t+1}\|^{2}$
		$\displaystyle+t^{c}\|y_{t+1}-x_{i,t+1}\|^{2}-t^{c}\|y_{t}-x_{i,t+1}\|^{2}\big)$
		$\displaystyle\leq\frac{1}{\alpha_{0}2^{i-1}}\|y_{1}-x_{i,1}\|^{2}+\frac{1}{\alpha_{0}2^{i-1}}\sum_{t=1}^{T}\left((t+1)^{c}-t^{c}\right)(d(\bm{X}))^{2}$
		$\displaystyle+\frac{2}{\alpha_{0}2^{i-1}}\sum_{t=1}^{T}t^{c}d(\bm{X})\|y_{t+1}-y_{t}\|$
		$\displaystyle\leq\frac{1}{\alpha_{0}2^{i-1}}\left(1+(T+1)^{c}-1\right)(d(\bm{X}))^{2}+\frac{2T^{c}d(\bm{X})P_{x}}{\alpha_{0}2^{i-1}}$
		$\displaystyle\leq\frac{2}{\alpha_{0}2^{i-1}}(d(\bm{X}))^{2}T^{c}\left(1+\frac{P_{x}}{d(\bm{X})}\right)$

Let $i_{0}=\left\lfloor\kappa\log_{2}(1+\frac{P_{x}}{d(\bm{X})})\right\rfloor+1\in[M]$ , such that we have:

	$\displaystyle 2^{i_{0}-1}\leq\left(1+\frac{P_{x}}{d(\bm{X})}\right)^{\kappa}\leq 2^{i_{0}}.$		(37)

Combining (35)-(37) yields:

	$\displaystyle\sum_{t=1}^{T}(\ell_{t}(x_{i_{0},t})-\ell_{t}(y_{t}))$	$\displaystyle\leq\frac{4}{\alpha_{0}}(d(\bm{X}))^{2}T^{c}\left(1+\frac{P_{x}}{d(\bm{X})}\right)^{1-\kappa}$
		$\displaystyle+\frac{G^{2}\alpha_{0}}{2(1-c)}T^{1-c}\left(1+\frac{P_{x}}{d(\bm{X})}\right)^{\kappa}$		(38)

Applying Lemma 1 in reference [38] to (20) and (21) yields:

\begin{equation}
\sum_{t=1}^{T}\ell_{t}(x_{t})-\min_{i\in[N]}\left\{\sum_{t=1}^{T}\ell_{t}(x_{i,t})+\frac{1}{\gamma}\ln\frac{1}{\rho_{i,1}}\right\}\leq\frac{\gamma(Gd(\bm{X}))^{2}T}{2}
\tag{39}
\end{equation}

Taking $\gamma=\gamma_{0}T^{-c}$ and $i=i_{0}$ in (39) gives:

\begin{equation}
\sum_{t=1}^{T}\left(\ell_{t}(x_{t})-\ell_{t}(x_{i_{0},t})\right)\leq\frac{\gamma_{0}(Gd(\bm{X}))^{2}T^{1-c}}{2}+\frac{1}{\gamma_{0}}T^{c}\ln\frac{1}{\rho_{i_{0},1}}
\tag{40}
\end{equation}

From $\rho_{i,1}=(M+1)/[i(i+1)M]$, we have:

\begin{equation}
\ln\frac{1}{\rho_{i_{0},1}}\leq\ln(i_{0}(i_{0}+1))\leq 2\ln(i_{0}+1)\leq 2\ln\left(\left\lfloor\kappa\log_{2}\left(1+\frac{P_{x}}{d(\bm{X})}\right)\right\rfloor\right)
\tag{41}
\end{equation}

From (20) and the convexity of $f_{t}$, we have:

\begin{equation}
f_{t}(x_{t})-f_{t}(y_{t})\leq\ell_{t}(x_{t})-\ell_{t}(y_{t})
\tag{42}
\end{equation}

Combining (38)-(42) yields:

\begin{equation}
\begin{aligned}
\text{Reg}&\leq\frac{4}{\alpha_{0}}(d(\bm{X}))^{2}T^{c}\left(1+\frac{P_{x}}{d(\bm{X})}\right)^{1-\kappa}+\frac{\gamma_{0}(Gd(\bm{X}))^{2}T^{1-c}}{2}\\
&\quad+\frac{G^{2}\alpha_{0}}{2(1-c)}T^{1-c}\left(1+\frac{P_{x}}{d(\bm{X})}\right)^{\kappa}+\frac{2}{\gamma_{0}}T^{c}\ln\left(\left\lfloor\kappa\log_{2}\left(1+\frac{P_{x}}{d(\bm{X})}\right)\right\rfloor\right)
\end{aligned}
\tag{43}
\end{equation}

This completes the proof.
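To see how the bound in (43) scales with the horizon, the right-hand side can be evaluated term by term. The Python sketch below is a minimal illustration under stated assumptions: the function name, all parameter values, and the choice $c=\kappa=1/2$ are hypothetical, $P_{x}$ is held fixed as $T$ grows, and the inputs are restricted so that the floor inside the logarithm is at least 1.

```python
import math


def regret_bound(T, P_x, d, G, alpha0, gamma0, c=0.5, kappa=0.5):
    """Evaluate the right-hand side of (43) for given (assumed) constants."""
    ratio = 1.0 + P_x / d
    k = math.floor(kappa * math.log2(ratio))
    if k < 1:
        # The logarithm in (43) is only evaluated here when its argument is >= 1.
        raise ValueError("need floor(kappa * log2(1 + P_x / d)) >= 1")
    tracking = 4.0 / alpha0 * d ** 2 * T ** c * ratio ** (1.0 - kappa)
    expert = gamma0 * (G * d) ** 2 * T ** (1.0 - c) / 2.0
    gradient = G ** 2 * alpha0 / (2.0 * (1.0 - c)) * T ** (1.0 - c) * ratio ** kappa
    prior = 2.0 / gamma0 * T ** c * math.log(k)
    return tracking + expert + gradient + prior


if __name__ == "__main__":
    # Illustrative constants; none of these values are taken from the paper.
    params = dict(P_x=30.0, d=2.0, G=1.0, alpha0=1.0, gamma0=1.0)
    for T in (100, 1000, 8760):
        bound = regret_bound(T, **params)
        print(f"T={T:5d}  bound={bound:12.2f}  bound/T={bound / T:8.4f}")
```

With $c=1/2$ and fixed $P_{x}$, every term grows like $\sqrt{T}$, so the per-step bound shrinks as the horizon lengthens, which is the sublinear behavior the proof establishes.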

Appendix C Results on North China Dataset

Table 7: Yearly operational performance of the microgrid in North China using different optimization methods.

Method	Cost $(\$10^{5})$	$\sum P_{t}^{\text{D}}\Delta t$ $(\text{MWh})$	$\sum P_{t}^{\text{L}}\Delta t$ $(\text{MWh})$	RMSE	Time (ms)
M0	5.30	411.78	80.60	0.00	24.28
M1	11.74	424.92	208.85	0.08	78.28
M2	15.49	257.52	1289.70	0.10	85.39
M3	13.35	349.87	245.83	53.19	35.46
M4	23.24	217.91	2104.74	52.93	58.31

\printcredits

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The original data can be downloaded from [8] and [12].

Acknowledgements

This work is supported in part by the National Science Foundation under award ECCS-2239046 and in part by the Columbia University SEAS Interdisciplinary Research Seed (SIRS) fund. Ning Qi gratefully acknowledges special funding from the China Postdoctoral Science Foundation (No. 2023TQ0169).

References

Alzahrani et al. [2023] Alzahrani, A., Sajjad, K., Hafeez, G., Murawwat, S., Khan, S., Khan, F.A., 2023. Real-time energy optimization and scheduling of buildings integrated with renewable microgrid. Applied Energy 335, 120640.
Amphlett et al. [1995] Amphlett, J.C., Baumert, R., Mann, R.F., Peppley, B.A., Roberge, P.R., Harris, T.J., 1995. Performance modeling of the ballard mark iv solid polymer electrolyte fuel cell: Ii. empirical model development. Journal of the Electrochemical Society 142, 9.
Baumhof et al. [2023] Baumhof, M.T., Raheli, E., Johnsen, A.G., Kazempour, J., 2023. Optimization of hybrid power plants: When is a detailed electrolyzer model necessary?, in: 2023 IEEE Belgrade PowerTech, IEEE. pp. 1–10.
Darivianakis et al. [2017] Darivianakis, G., Eichler, A., Smith, R.S., Lygeros, J., 2017. A data-driven stochastic optimization approach to the seasonal storage energy management. IEEE control systems letters 1, 394–399.
Dey et al. [2023] Dey, B., Misra, S., Marquez, F.P.G., 2023. Microgrid system energy management with demand response program for clean and economical operation. Applied Energy 334, 120717.
Ding et al. [2021] Ding, X., Chen, L., Zhou, P., Xu, Z., Wen, S., Lui, J.C., Jin, H., 2021. Dynamic online convex optimization with long-term constraints via virtual queue. Information Sciences 577, 140–161.
Eghbali et al. [2022] Eghbali, N., Hakimi, S.M., Hasankhani, A., Derakhshan, G., Abdi, B., 2022. Stochastic energy management for a renewable energy based microgrid considering battery, hydrogen storage, and demand response. Sustainable Energy, Grids and Networks 30, 100652.
Elia [2024] Elia, 2024. Historical data from elia. URL: https://www.elia.be/en/grid-data.
Fan et al. [2021] Fan, F., Zhang, R., Xu, Y., Ren, S., 2021. Robustly coordinated operation of an emission-free microgrid with hybrid hydrogen-battery energy storage. CSEE Journal of Power and Energy Systems 8, 369–379.
Feng and Wei [2024] Feng, S., Wei, W., 2024. Hybrid energy storage sizing in energy hubs: A continuous spectrum splitting approach. Energy 300, 131504.
Gui et al. [2021] Gui, J., Sun, Z., Wen, Y., Tao, D., Ye, J., 2021. A review on generative adversarial networks: Algorithms, theory, and applications. IEEE transactions on knowledge and data engineering 35, 3313–3332.
Guo [2024] Guo, Z., 2024. Historical data from north china. URL: https://github.com/ZhongjieGuo/Seasonal-data.
Guo et al. [2023] Guo, Z., Wei, W., Bai, J., Mei, S., 2023. Long-term operation of isolated microgrids with renewables and hybrid seasonal-battery storage. Applied Energy 349, 121628.
Hajiaghasi et al. [2019] Hajiaghasi, S., Salemnia, A., Hamzeh, M., 2019. Hybrid energy storage system for microgrids applications: A review. Journal of Energy Storage 21, 543–570.
Hu et al. [2022] Hu, C., Cai, Z., Zhang, Y., Yan, R., Cai, Y., Cen, B., 2022. A soft actor-critic deep reinforcement learning method for multi-timescale coordinated operation of microgrids. Protection and Control of Modern Power Systems 7, 29.
Jansen et al. [2021] Jansen, G., Dehouche, Z., Corrigan, H., 2021. Cost-effective sizing of a hybrid regenerative hydrogen fuel cell energy storage system for remote & off-grid telecom towers. International Journal of Hydrogen Energy 46, 18153–18166.
Kim and Giannakis [2016] Kim, S.J., Giannakis, G.B., 2016. An online convex optimization approach to real-time energy pricing for demand response. IEEE Transactions on Smart Grid 8, 2784–2793.
Li et al. [2015] Li, Z., Guo, Q., Sun, H., Wang, J., 2015. Sufficient conditions for exact relaxation of complementarity constraints for storage-concerned economic dispatch. IEEE Transactions on Power Systems 31, 1653–1654.
Li et al. [2021] Li, Z., Wu, L., Xu, Y., Moazeni, S., Tang, Z., 2021. Multi-stage real-time operation of a multi-energy microgrid with electrical and thermal energy storage assets: A data-driven mpc-adp approach. IEEE Transactions on Smart Grid 13, 213–226.
Liu et al. [2022] Liu, Q., Wu, W., Huang, L., Fang, Z., 2022. Simultaneously achieving sublinear regret and constraint violations for online convex optimization with time-varying constraints. ACM SIGMETRICS Performance Evaluation Review 49, 4–5.
Mariam et al. [2016] Mariam, L., Basu, M., Conlon, M.F., 2016. Microgrid: Architecture, policy and future trends. Renewable and Sustainable Energy Reviews 64, 477–489.
Muthirayan et al. [2022] Muthirayan, D., Yuan, J., Khargonekar, P.P., 2022. Online convex optimization with long-term constraints for predictable sequences. IEEE Control Systems Letters 7, 979–984.
Pang et al. [2023] Pang, K., Wang, C., Hatziargyriou, N.D., Wen, F., 2023. Microgrid formation and real-time scheduling of active distribution networks considering source-load stochasticity. IEEE Transactions on Power Systems.
Qi et al. [2023] Qi, N., Pinson, P., Almassalkhi, M.R., Cheng, L., Zhuang, Y., 2023. Chance-constrained generic energy storage operations under decision-dependent uncertainty. IEEE Trans. on Sustainable Energy 14, 2234–2248.
Qi et al. [2024] Qi, N., Zheng, N., Xu, B., 2024. Chance-constrained energy storage pricing for social welfare maximization. arXiv preprint arXiv:2407.07068.
Qiu et al. [2023] Qiu, Y., Li, Q., Ai, Y., Chen, W., Benbouzid, M., Liu, S., Gao, F., 2023. Two-stage distributionally robust optimization-based coordinated scheduling of integrated energy system with electricity-hydrogen hybrid energy storage. Protection and Control of Modern Power Systems 8, 1–14.
Sánchez et al. [2018] Sánchez, M., Amores, E., Rodríguez, L., Clemente-Jul, C., 2018. Semi-empirical model and experimental validation for the performance evaluation of a 15 kw alkaline water electrolyzer. International Journal of Hydrogen Energy 43, 20332–20345.
Shi et al. [2015] Shi, W., Li, N., Chu, C.C., Gadh, R., 2015. Real-time energy management in microgrids. IEEE Transactions on Smart Grid 8, 228–238.
Shuai et al. [2018] Shuai, H., Fang, J., Ai, X., Tang, Y., Wen, J., He, H., 2018. Stochastic optimization of economic dispatch for microgrid based on approximate dynamic programming. IEEE Transactions on Smart Grid 10, 2440–2452.
Solanki et al. [2018] Solanki, B.V., Cañizares, C.A., Bhattacharya, K., 2018. Practical energy management systems for isolated microgrids. IEEE Transactions on Smart Grid 10, 4762–4775.
Trifkovic et al. [2013] Trifkovic, M., Sheikhzadeh, M., Nigim, K., Daoutidis, P., 2013. Modeling and control of a renewable hybrid energy system with hydrogen storage. IEEE Transactions on Control Systems Technology 22, 169–179.
Ulleberg [2003] Ulleberg, Ø., 2003. Modeling of advanced alkaline electrolyzers: a system simulation approach. International journal of hydrogen energy 28, 21–33.
Wang et al. [2023] Wang, Z., Wei, W., Pang, J.Z.F., Liu, F., Yang, B., Guan, X., Mei, S., 2023. Online optimization in power systems with high penetration of renewable generation: Advances and prospects. IEEE/CAA Journal of Automatica Sinica 10, 839–858.
Yan et al. [2023] Yan, D., Huang, S., Chen, Y., 2023. Real-time feedback based online aggregate ev power flexibility characterization. IEEE Transactions on Sustainable Energy.
Yi et al. [2020] Yi, X., Li, X., Xie, L., Johansson, K.H., 2020. Distributed online convex optimization with time-varying coupled inequality constraints. IEEE Transactions on Signal Processing 68, 731–746.
Yi et al. [2022] Yi, X., Li, X., Yang, T., Xie, L., Chai, T., Johansson, K.H., 2022. Regret and cumulative constraint violation analysis for distributed online constrained convex optimization. IEEE Transactions on Automatic Control 68, 2875–2890.
Yu et al. [2022] Yu, L., Xu, Z., Guan, X., Zhao, Q., Dou, C., Yue, D., 2022. Joint optimization and learning approach for smart operation of hydrogen-based building energy systems. IEEE Transactions on Smart Grid 14, 199–216.
Zhang et al. [2018] Zhang, L., Lu, S., Zhou, Z.H., 2018. Adaptive online learning in dynamic environments. Advances in neural information processing systems 31.
Zhao et al. [2020] Zhao, T., Parisio, A., Milanović, J.V., 2020. Distributed control of battery energy storage systems for improved frequency regulation. IEEE Transactions on Power Systems 35, 3729–3738.
Zheng and Cai [2014] Zheng, L., Cai, L., 2014. A distributed demand response control strategy using lyapunov optimization. IEEE Transactions on Smart Grid 5, 2075–2083.
Zheng et al. [2023] Zheng, N., Qin, X., Wu, D., Murtaugh, G., Xu, B., 2023. Energy storage state-of-charge market model. IEEE Transactions on Energy Markets, Policy and Regulation 1, 11–22.
Zhou et al. [2023] Zhou, S., Han, Y., Zalhaf, A.S., Chen, S., Zhou, T., Yang, P., Elboshy, B., 2023. A novel multi-objective scheduling model for grid-connected hydro-wind-pv-battery complementary system under extreme weather: A case study of sichuan, china. Renewable Energy 212, 818–833.
Zhu et al. [2020] Zhu, D., Yang, B., Liu, Q., Ma, K., Zhu, S., Ma, C., Guan, X., 2020. Energy trading in microgrids for synergies among electricity, hydrogen and heat networks. Applied Energy 272, 115225.
