Dynamic Soft Sensor Model for Endpoint Carbon Content and Temperature in BOF Steelmaking Based on Adaptive Feature Matching Variational Autoencoder

Liu, Zhaoxiang; Liu, Hui; Chen, Fugang; Li, Heng; Xue, Xiaojun

doi:10.3390/pr12091807


by Zhaoxiang Liu ¹,², Hui Liu ¹,²,*, Fugang Chen ³, Heng Li ¹,² and Xiaojun Xue ¹,²

¹ Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650093, China

² Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650504, China

³ Yunnan Kungang Electronic and Information Science Ltd., Kunming 650302, China

* Author to whom correspondence should be addressed.

Processes 2024, 12(9), 1807; https://doi.org/10.3390/pr12091807

Submission received: 25 July 2024 / Revised: 18 August 2024 / Accepted: 24 August 2024 / Published: 26 August 2024

(This article belongs to the Section Energy Systems)


Abstract: The key to endpoint control in basic oxygen furnace (BOF) steelmaking lies in accurately predicting the endpoint carbon content and temperature. However, BOF steelmaking data are complex, and their distribution changes with variations in raw material batches, process adjustments, and equipment conditions, leading to concept drift and degrading model performance. To resolve these problems, this paper proposes a dynamic soft sensor model based on an adaptive feature matching variational autoencoder (VAE-AFM). Firstly, an adaptive feature matching (AFM) method is proposed. This method uses the maximum mean discrepancy to quantify the marginal and conditional distribution discrepancies. Based on the difference between these two values, a dynamic adjustment algorithm is designed to adaptively assign different weights to the two distributions. This approach dynamically and quantitatively evaluates and adjusts the relative importance of the two distributions in the domain adaptation process, thereby enhancing the effectiveness of cross-domain data alignment. Secondly, a variational autoencoder (VAE) is employed to process the data, as the VAE model can capture the complex data structures and latent features in the steelmaking process. Finally, the features extracted by the VAE are processed with the adaptive feature matching method, thereby constructing the VAE-AFM dynamic soft sensor model. Experimental studies on actual BOF steelmaking data validate the efficacy of the proposed approach, offering a reliable solution to the challenges of high complexity and concept drift in BOF steelmaking data.

Keywords: BOF steelmaking; soft sensor; deep learning; domain adaptation; distribution alignment

1. Introduction

Steel is crucial to modern industry and technology, and high-quality steel is essential for ensuring the quality of steel engineering. Basic oxygen furnace (BOF) steelmaking, a mainstream steelmaking technology, is widely used for its high efficiency and low energy consumption [1]. The process involves charging molten iron, scrap steel, ferroalloys, and certain non-metallic materials into the molten pool, followed by a sequence of intricate chemical and physical reactions that generate the heat needed to complete steelmaking [2]. The endpoint carbon content and temperature of the molten steel are critical indicators in the steelmaking process and directly affect the quality of the steel. Accurately predicting them not only improves production efficiency but also reduces production costs and emissions [3].

Currently, the measurement methods for endpoint carbon content and temperature in BOF steelmaking mainly include contact and non-contact techniques [4]. Contact measurement of carbon content and temperature in the molten steel pool primarily relies on the sublance detection method with special probes and tools [5]. Although highly accurate, this method cannot provide continuous real-time monitoring. Additionally, the sublance probe operates in a high-temperature, corrosive environment, leading to high operational costs. In the BOF steelmaking process, there is a nonlinear relationship between the process production data (such as the amounts of added limestone, molten iron, scrap steel, and blown oxygen) and the endpoint carbon content and temperature, which provides a theoretical basis for non-contact measurement of carbon content and temperature [6]. In addition, advancements in data acquisition and computer technology have made such non-contact measurement practical. Therefore, researchers have developed data-driven approaches that use artificial intelligence to construct models that accurately predict these key variables in BOF steelmaking [7].

In the BOF steelmaking process, data-driven soft sensor methods have been extensively studied. To improve the prediction accuracy of endpoint carbon content and temperature, researchers have introduced machine learning techniques such as support vector machines [8], artificial neural networks [9], and partial least squares regression [10]. These methods capture complex nonlinear relationships and improve prediction accuracy. Jan et al. [11] employed five machine learning techniques, namely multivariate adaptive regression splines (MARS), support vector regression (SVR), neural network (NN), k-nearest neighbor (k-NN), and random forest (RF) methods, to predict the endpoint carbon content and temperature in BOF steelmaking and proposed a combined dynamic and static prediction method. Han et al. [12] established an improved case-based reasoning method that incorporates fuzzy c-means clustering (FCM), mutual information, and support vector machines (SVM) to predict the endpoint in BOF steelmaking. Zhou et al. [13] predicted the nonlinear endpoint carbon content and temperature using sensor-collected data combined with multi-output least squares support vector regression. Wang et al. [14] presented a twin support vector machine-based model for predicting temperature and carbon content, which demonstrated higher prediction accuracy than the BP neural network. However, these methods are shallow learning techniques. BOF steelmaking is an intricate industrial process, and the data collected by its various sensors are characterized by high dimensionality and strong nonlinearity between variables, so shallow learning techniques may limit the prediction performance of a soft sensor model. To tackle the problem of high-dimensional nonlinear data, deep learning techniques have been extensively studied and have achieved significant breakthroughs [15].

With the advance of deep learning technology, the prediction of endpoint carbon content and temperature in BOF steelmaking has entered a new stage. Stacked autoencoders (SAE) [16], deep belief networks (DBN) [17], and deep convolutional neural networks (CNN) [18] are currently among the more popular deep learning algorithms. These algorithms can learn feature representations of high-dimensional complex data, thereby effectively improving model performance. Yang et al. [19] considered the multimodal distribution characteristics of BOF steelmaking production data and proposed an adaptively updated deep learning model based on the von Mises-Fisher (vMF) distribution, which provided a solution for endpoint determination in BOF steelmaking. Lu et al. [20] considered the low correlation between the acquired features and the labels in BOF steelmaking production data and proposed a soft sensor model based on a supervised dual-branch deep belief network (SD-DBN) to address this problem. Zhang et al. [21] proposed a data-driven soft sensor called the stacked supervised Poisson autoencoder (SSPAE). During feature extraction, SSPAE incorporates quality information so that the extracted deep features help improve the accuracy of the prediction model. He et al. [22] proposed a new attribute-dependent distributed variational autoencoder (AR-DVAE) to deal with the different effects of high-dimensional process variables on the predicted quality variables and on prediction accuracy. The model can effectively extract features from input variables with different correlations. In [23], a variable-wise weighted stacked autoencoder (VW-SAE) was proposed for layer-by-layer hierarchical output-related feature encoding. By performing correlation analysis with the output variables, important variables are identified among the inputs of each autoencoder layer, which enhances the predictive performance of the model. Given the nonlinear and dynamic nature of industrial production process data, the authors of [24] proposed a spatiotemporal attention-based LSTM network (STA-LSTM) for soft sensor modeling. This method addresses the failure of the original LSTM to take sample and variable relevance into account when predicting quality. In [25], the authors proposed a variable attention-based long short-term memory network (VA-LSTM) for soft sensing. This method assigns attention weights based on the correlation between input features and quality variables, enhancing the prediction model's performance. In [26], a supervised LSTM (SLSTM) network was presented to learn quality-relevant hidden dynamics for soft sensor applications, overcoming the lack of representation of quality data. Shen et al. [27] used deep feature extraction and feature pyramid augmentation to create a multiresolution pyramid variational autoencoder (MR-PVAE) predictive model that addresses discrepant sampling rates among process variables. Zhang et al. [28] introduced a type-II multivariate Laplacian distribution in the latent variable space for robust modeling and proposed a variational autoencoder regression approach based on the mixture Laplacian distribution (MLVAER). Tang et al. [29] created a semi-supervised deep conditional VAE (SS-DCVAE) soft sensor model based on unsupervised and supervised DCVAE, achieving excellent nonlinear and uncertain feature extraction capabilities in industrial process modeling. These deep learning methods are effective in processing the high-dimensional and nonlinear data generated in industrial processes.

In recent years, the problem of concept drift caused by changes in raw materials, production processes, and equipment conditions in industrial production has been widely studied [30]. In [31], the authors analyzed the characteristics of intermittent industries, in which changes between batches lead to concept drift and thereby reduce the control performance of the model. In [32], the authors analyzed the concept drift present in industrial production processes, which keeps accurate data-driven modeling a challenging task. In [33], the authors analyzed concept drift in the injection molding process, where machine wear or process changes cause drift and degrade the prediction performance of the model. In [34], the authors analyzed concept drift in industrial data streams caused by changes in raw materials, sensor aging, and similar factors, which leads to poor model performance. Therefore, understanding and effectively responding to concept drift caused by changes in raw materials, processes, and equipment is an important challenge for ensuring production process stability and product quality.

The data from the BOF steelmaking process are not only high-dimensional and highly nonlinear but are also subject to distribution changes due to variations in raw material batches, process adjustments, and changes in equipment conditions. These factors lead to concept drift, which negatively affects model performance [35]. To address the degraded prediction performance caused by concept drift in the BOF steelmaking process, soft sensor methods based on transfer learning have been widely studied [36]. Transfer learning aims to facilitate learning in a new target domain by using samples from one or more source domains while reducing the distribution divergence between the datasets. To achieve this goal, feature-based transfer learning methods have been extensively studied [37]. These methods extract transferable features from both the source and target domains through feature mappings, thereby reducing the distribution differences between the domains. Pan et al. [38] proposed the transfer component analysis (TCA) domain adaptation algorithm. The fundamental idea of TCA is to map data from the two domains into a high-dimensional reproducing kernel Hilbert space. In this space, the distance between the data of the two domains is minimized, thereby reducing the inter-domain differences. Long et al. [39,40] proposed the joint distribution adaptation (JDA) domain adaptation algorithm, which reduces inter-domain differences by simultaneously adapting the marginal and conditional distributions of the source and target domains. JDA extends TCA by addressing inter-domain distribution differences more comprehensively. Wang et al. [41] proposed the balanced distribution adaptation (BDA) algorithm. The core of BDA is to align the joint distribution of the source and target domains by adjusting the weights of the marginal and conditional distributions. Although these methods can effectively reduce distribution differences between datasets, the high dimensionality and strong nonlinearity of BOF steelmaking process data make it difficult to obtain accurate endpoint carbon content and temperature predictions using these transfer learning techniques alone.
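To make the notion of a distribution discrepancy concrete, the following minimal sketch estimates the squared maximum mean discrepancy (MMD) between two sample sets with an RBF kernel, and then combines a marginal and a conditional discrepancy with a fixed balance factor in the style of BDA. It is an illustration under stated assumptions, not the method of this paper: the function names, the kernel width `gamma`, the synthetic Gaussian data, and the fixed factor `mu` are all hypothetical choices made here.

```python
import math
import random


def rbf_kernel(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) between two feature vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(source, target, gamma=0.5):
    """Biased estimate of the squared MMD between two sample sets."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


def balanced_distance(marginal, conditional, mu):
    """BDA-style weighting: (1 - mu) * marginal + mu * conditional.

    Here mu is a fixed, user-chosen factor in [0, 1] (an assumption of this
    sketch); an adaptive method would instead derive it from the data.
    """
    return (1.0 - mu) * marginal + mu * conditional


def sample(n, mean, rng):
    """Draw n two-dimensional Gaussian points centred at (mean, mean)."""
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]


rng = random.Random(0)
source = sample(100, 0.0, rng)   # stand-in for a source-domain batch
same = sample(100, 0.0, rng)     # new batch from the same distribution
shifted = sample(100, 1.5, rng)  # batch whose distribution has drifted

print("MMD^2, same distribution:   ", mmd2(source, same))
print("MMD^2, shifted distribution:", mmd2(source, shifted))
```

The drifted batch yields a clearly larger estimate than the batch drawn from the same distribution, which is the property that MMD-based alignment methods exploit.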

Therefore, to address the issues of high dimensionality, strong nonlinearity, and concept drift in BOF steelmaking process data, an endpoint carbon content and temperature soft sensor model for BOF steelmaking is proposed in this study based on adaptive feature matching variational autoencoder (VAE-AFM). The variational autoencoder (VAE) model combines deep learning and variational Bayesian probabilistic inference to extract nonlinear feature representations of data. In this paper, VAE is utilized to extract features of BOF steelmaking process data. To effectively address the concept drift problem in BOF steelmaking, this paper proposes an adaptive feature matching (AFM) method. This method adaptively adjusts the weights of the marginal and conditional distributions between the extracted BOF steelmaking data features, thereby minimizing distribution differences between different BOF steelmaking datasets.

In summary, the main contributions of this paper are as follows:

(1): An adaptive feature matching method is proposed, which can dynamically assess the relative importance of marginal and conditional distributions between different domains. This approach effectively reduces differences between data distributions and is integrated into a deep learning framework.
(2): To address high dimensionality, strong nonlinearity, and concept drift in BOF steelmaking data, a variational autoencoder (VAE) is employed for effective feature extraction. This is then combined with the adaptive feature matching method to construct the VAE-AFM model.
(3): An experimental analysis was conducted on the manufacturing process data from BOF steelmaking. Through comparisons with various soft sensor model methods and ablation experiments, the efficacy of the proposed method was confirmed.

The remainder of this paper is structured as follows: Section 2 gives an overview of the BOF steelmaking process and an analysis of the main chemical processes involved. Section 3 introduces the principles of the adaptive feature matching variational autoencoder (VAE-AFM) model proposed in this paper. Section 4 describes how to build a dynamic soft sensor model of BOF steelmaking endpoint carbon content and temperature based on the VAE-AFM model. Section 5 presents the experimental findings and related discussions. Finally, the conclusions are presented in Section 6.

2. Introduction to the BOF Steelmaking Process and Chemical Process Analysis

This section primarily introduces and analyzes the BOF steelmaking process and the main chemical processes involved.

2.1. Introduction to the BOF Steelmaking Process

BOF steelmaking is a common method of steel production that involves blowing oxygen to oxidize the carbon and impurities in molten iron, producing steel [42]. This process takes place in a basic oxygen furnace (BOF). Initially, after the scrap has been charged, molten pig iron is poured into the furnace along with a certain amount of other materials. Oxygen is then blown in to induce vigorous reactions on the surface of the molten pig iron, causing the oxidation of iron, silicon, and manganese, which forms slag. The convection of the molten steel and slag facilitates the reactions throughout the entire furnace. Over time, once the molten steel retains only trace amounts of silicon and manganese, carbon starts to oxidize, generating carbon monoxide in an exothermic reaction and leading to vigorous boiling of the molten steel. The furnace mouth emits large flames due to the burning of the overflowing carbon monoxide. Finally, phosphorus oxidizes further to form ferrous phosphate. Reaction with quicklime then yields stable calcium phosphate and calcium sulfide, which together form slag. As phosphorus and sulfur gradually decrease, the flames subside, and brown fumes of ferric oxide appear at the furnace mouth, indicating that the steel has been refined. At this point, the blowing should be stopped immediately, and the molten steel should be poured into a ladle. Then, a deoxidizer is added to carry out the deoxidation process [43].

In summary, the BOF steelmaking process mainly includes the following [44]: (1) Pretreatment of hot metal: In the steelmaking process, pretreatment of hot metal refers to a series of preliminary processing techniques applied to molten iron. These techniques involve the removal of impurities, such as sulfur and phosphorus, and the adjustment of the chemical composition of the hot metal to enhance the quality of the final steel product [45]. (2) Converter charging: Adding scrap, molten pig iron, and additives into the converter. (3) Blowing: Injecting oxygen to react with impurities at high temperatures, removing them through oxidation and reduction, and adding alloying elements. (4) Tempering and alloying: Adjusting steel composition and properties by adding alloying elements in the ladle during tapping. (5) Output of steel: Pouring out the steel when carbon content, temperature, and other elements meet the required standards. The process of manufacturing BOF steel is shown in Figure 1.

2.2. Chemical Analysis of the BOF Steelmaking Process

The process of manufacturing BOF steel involves a number of intricate chemical processes. The chemical reactions in the BOF steelmaking process primarily involve the oxidation of impurities in the molten iron, such as silicon, carbon, manganese, and phosphorus [46]. The various impurities in pig iron have a high affinity for oxygen at high temperatures to varying degrees. Therefore, oxidation can be used to convert these impurities into liquid, solid, or gaseous oxides. Liquid and solid oxides react with fluxes added to the furnace at high temperatures, forming slag, which is then removed during slag skimming. Gaseous oxides are expelled from the furnace by CO during the boiling of the molten steel. The following is an analysis of the main chemical reaction principles in the BOF steelmaking process:

(1): Oxidation of silicon: Silicon has a strong affinity for oxygen, so it oxidizes very quickly and is completely oxidized to SiO₂ in the early stage of smelting. The reaction is as follows:

$\mathrm{Si} + 2\,\mathrm{FeO} \longrightarrow \mathrm{SiO_{2}} + 2\,\mathrm{Fe}$

(1)

At the same time, SiO₂ reacts with FeO to form silicate:

$2\,\mathrm{FeO} + \mathrm{SiO_{2}} \longrightarrow 2\,\mathrm{FeO} \cdot \mathrm{SiO_{2}}$

(2)

Silicate is a very important part of slag. It reacts with CaO to form the stable compound 2CaO · SiO₂ and FeO. The former remains firmly in the slag, while the latter becomes a free component of the slag, which increases the FeO content of the slag and further promotes the oxidation of impurities. The reaction is as follows:

$2\,\mathrm{FeO} \cdot \mathrm{SiO_{2}} + 2\,\mathrm{CaO} \longrightarrow 2\,\mathrm{CaO} \cdot \mathrm{SiO_{2}} + 2\,\mathrm{FeO}$

(3)

(2): Oxidation of carbon: Carbon oxidizes to form carbon monoxide (CO) or carbon dioxide (CO₂). Under high-temperature conditions, carbon first reacts with oxygen to produce carbon monoxide, and part of the carbon monoxide continues to oxidize to form carbon dioxide.

$2\,\mathrm{C} + \mathrm{O_{2}} \longrightarrow 2\,\mathrm{CO}$

(4)

$\mathrm{C} + \mathrm{O_{2}} \longrightarrow \mathrm{CO_{2}}$

(5)

(3): Oxidation of phosphorus: Oxidation of phosphorus occurs at a not overly high temperature. The dephosphorization process is composed of several reactions, which are as follows:

$2\,\mathrm{P} + 5\,\mathrm{FeO} \longrightarrow \mathrm{P_{2}O_{5}} + 5\,\mathrm{Fe}$

(6)

$\mathrm{P_{2}O_{5}} + 3\,\mathrm{FeO} \longrightarrow 3\,\mathrm{FeO} \cdot \mathrm{P_{2}O_{5}}$

(7)

When there is enough CaO in the alkaline slag, the following reactions will occur:

$3\,\mathrm{FeO} \cdot \mathrm{P_{2}O_{5}} + 4\,\mathrm{CaO} \longrightarrow 4\,\mathrm{CaO} \cdot \mathrm{P_{2}O_{5}} + 3\,\mathrm{FeO}$

(8)

The generated 4CaO · P₂O₅ is a stable compound, which is firmly retained in the slag, thus achieving the purpose of dephosphorization.

(4): Oxidation of sulfur: Sulfur exists in the form of FeS. When there is sufficient CaO in the slag, sulfur can also be removed through the following reaction:

$\mathrm{FeS} + \mathrm{CaO} \longrightarrow \mathrm{CaS} + \mathrm{FeO}$

(9)

$\mathrm{CaO} + \mathrm{FeS} + \mathrm{C} \longrightarrow \mathrm{CaS} + \mathrm{Fe} + \mathrm{CO}$

(10)

$\mathrm{FeS} + \mathrm{MnO} \longrightarrow \mathrm{MnS} + \mathrm{FeO}$

(11)

(5): Deoxidation of FeO: In the culminating phase of the steelmaking process, deoxidation is performed in the ladle to remove the large amount of oxygen present. This is typically achieved by adding deoxidizers such as ferromanganese, ferrosilicon, and aluminum to the molten steel. These deoxidizers strongly attract oxygen from FeO, achieving deoxidation through the following reactions:

$\mathrm{FeO} + \mathrm{Mn} \longrightarrow \mathrm{MnO} + \mathrm{Fe}$

(12)

$2\,\mathrm{FeO} + \mathrm{Si} \longrightarrow \mathrm{SiO_{2}} + 2\,\mathrm{Fe}$

(13)

$3\,\mathrm{FeO} + 2\,\mathrm{Al} \longrightarrow \mathrm{Al_{2}O_{3}} + 3\,\mathrm{Fe}$

(14)
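As a quick check on the stoichiometry of the reactions listed above, the following sketch verifies that Equations (1), (6), and (14) conserve every element. It is only an illustrative aid: the species table is entered by hand, and the helper names `atoms` and `is_balanced` are hypothetical.

```python
from collections import Counter

# Element counts per formula unit, entered by hand for the species involved.
SPECIES = {
    "Si": {"Si": 1},
    "Fe": {"Fe": 1},
    "Al": {"Al": 1},
    "P": {"P": 1},
    "FeO": {"Fe": 1, "O": 1},
    "SiO2": {"Si": 1, "O": 2},
    "P2O5": {"P": 2, "O": 5},
    "Al2O3": {"Al": 2, "O": 3},
}


def atoms(side):
    """Total atoms of each element on one side of a reaction.

    `side` is a list of (stoichiometric coefficient, species name) pairs.
    """
    total = Counter()
    for coeff, name in side:
        for element, count in SPECIES[name].items():
            total[element] += coeff * count
    return total


def is_balanced(reactants, products):
    """True when both sides contain the same number of each element."""
    return atoms(reactants) == atoms(products)


# Reactions keyed by their equation number in the text.
REACTIONS = {
    1: ([(1, "Si"), (2, "FeO")], [(1, "SiO2"), (2, "Fe")]),
    6: ([(2, "P"), (5, "FeO")], [(1, "P2O5"), (5, "Fe")]),
    14: ([(3, "FeO"), (2, "Al")], [(1, "Al2O3"), (3, "Fe")]),
}

for number, (lhs, rhs) in REACTIONS.items():
    print(f"Equation ({number}) balanced: {is_balanced(lhs, rhs)}")
```

The same table can be extended with the remaining species (for example CaO or MnO) to check the other equations in the same way.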

In summary, the oxidation of impurities including carbon, silicon, manganese, and phosphorus in the molten iron is the main chemical reaction in the BOF steelmaking process. These reactions are facilitated by blowing in oxygen, resulting in the formation of corresponding oxides. These oxides then react with added slagging materials, such as lime, to form stable slag. Most of these oxidation reactions are exothermic, releasing a substantial amount of heat that maintains the high-temperature environment within the BOF, thereby promoting further reactions. Through these complex chemical and physical processes, BOF steelmaking efficiently removes impurities from the molten iron, ultimately yielding high-purity steel products.

Drawing on the above analysis, the BOF steelmaking process exhibits complex nonlinear relationships between process parameters (such as the amounts of added burned lime, molten iron, and scrap steel and the oxygen blowing rate) and the endpoint carbon content and temperature. These nonlinear relationships provide a theoretical basis for non-contact measurement of carbon content and temperature, making it possible to predict the endpoint carbon content and temperature in BOF steelmaking using data-driven methods. In modern steelmaking processes, advanced data analysis and machine learning techniques can be used to mine and model production data in depth, thereby achieving high-precision predictions of endpoint carbon content and temperature. These data-driven methods not only raise the level of automation in the production process but also improve the control precision and efficiency of the steelmaking process, further promoting the development of steel production towards greater intelligence and efficiency. At the same time, by analyzing and using big data, steelmaking process parameters can be continuously optimized, reducing energy consumption and production costs, enhancing product quality, and meeting increasingly stringent environmental and economic requirements.

3. The Proposed Adaptive Feature Matching Variational Autoencoder Model

Based on the introduction to the BOF steelmaking process and the analysis of the relevant chemical processes in the previous section, this paper adopts a data-driven approach to predict the endpoint carbon content and temperature in BOF steel production. To address the high dimensionality, strong nonlinearity, and concept drift of BOF steelmaking data, it proposes a dynamic soft sensor model for predicting endpoint carbon content and temperature based on an adaptive feature matching variational autoencoder. This section introduces the rationale behind the proposed approach, including the principles of the variational autoencoder model and the dynamic balanced joint distribution alignment domain adaptation model.

3.1. Introduction of the Structure of the Proposed Method

The research method’s structural layout is described in this subsection. Figure 2 depicts the major elements of the theoretical structure. It is segmented into three components for elaboration, and the specific details are shown below:

(1) In the BOF steelmaking production process, data are collected in real-time from multiple sensors, including key parameters such as temperature, oxygen-blowing volume, and chemical composition. Due to the highly complex interactions and nonlinear characteristics of these data, traditional linear models are unable to fully explore the information contained within. In Section 3.2, to address this challenge, this paper utilizes the variational autoencoder to extract features from BOF steelmaking data. The VAE is a deep generative model that captures complex nonlinear relationships in high-dimensional data space and extracts low-dimensional representations of the data by learning the distribution of latent variables. The extracted features retain the primary information of the original data and simplify the data structure, thereby enhancing subsequent data analysis and modeling. Section 3.2 will introduce the precise details.

(2) In the BOF steelmaking process, factors such as raw material batches, process adjustments, and changes in equipment conditions can lead to variations in the distribution of BOF steelmaking data collected through sensors, thereby affecting the performance of data-driven models. To address this problem, Section 3.3 proposes a method based on a dynamic balanced joint distribution alignment domain adaptation network to achieve data alignment and enhance the model’s resilience and forecast precision. This network aims to minimize distributional differences between source and target domains by considering both marginal and conditional distributions. Conventional domain adaptation methods typically focus on only one and overlook the dynamic equilibrium between them. The method proposed in Section 3.3 adaptively adjusts the importance of the marginal and conditional distributions by introducing a dynamic balancing factor to more accurately align the data distributions in the source and target domains. The specific details will be introduced in Section 3.3.

(3) Based on the analyses in Section 3.2 and Section 3.3, Section 3.4 proposes a dynamic balanced joint distribution alignment domain adaptation variational autoencoder model. This model combines the domain adaptation capabilities of dynamic balanced joint distribution with the feature extraction capabilities of VAE to provide a comprehensive solution. The temperature and carbon content at the end of the real BOF steelmaking process were predicted using this soft sensor model. The findings of the experiment show that the proposed model effectively addresses the high-dimensional, nonlinear, and concept drift issues in BOF steelmaking data, providing a valuable reference for the industry. The specific details will be introduced in Section 3.4.

3.2. Output-Related Feature Extraction Method for BOF Steelmaking Data Based on Variational Autoencoder

Because BOF steelmaking process data exhibit strong nonlinearity and high dimensionality, traditional data analysis methods struggle to capture their inherent characteristics and patterns. To address this, we employ the variational autoencoder, a deep learning technique [47]. This method effectively extracts features from the data, providing high-quality input for subsequent model training and optimization and thereby enhancing the prediction and control capabilities of the BOF steelmaking process. The VAE consists of two components, the encoder and the decoder [48].

In this paper, the input of the variational autoencoder is the collected BOF steelmaking process data, the latent variable z represents the extracted features of the BOF steelmaking data, and the output represents the reconstructed BOF steelmaking data. The VAE assumes that the output is produced by a latent variable z. Specifically, it is assumed that z obeys a simple prior distribution $P(z)$, usually a standard normal distribution $N(0, 1)$, and then z is mapped to the data space x through the decoder. Both the encoder $q_{φ}(z | x)$ and the decoder $p_{θ}(x | z)$ are implemented by neural networks, and $φ$ and $θ$ denote the parameters of the encoder and decoder, respectively. The core idea of the variational autoencoder is to learn a low-dimensional latent representation of the data by maximizing the evidence lower bound (ELBO), which amounts to minimizing both the reconstruction error and the KL divergence between the approximate posterior distribution and the prior distribution. The derivation is as follows:

The formula for calculating the marginal likelihood of the input data $x$ is shown below:

$\begin{aligned} \log p_{θ}(x) &= E_{z \sim q_{φ}(z | x)} \left[ \log p_{θ}(x) \right] = E_{z \sim q_{φ}(z | x)} \left[ \log \frac{p_{θ}(x | z)\, p_{θ}(z)\, q_{φ}(z | x)}{p_{θ}(z | x)\, q_{φ}(z | x)} \right] \\ &= E_{z \sim q_{φ}(z | x)} \left[ \log p_{θ}(x | z) \right] + E_{z \sim q_{φ}(z | x)} \left[ \log \frac{q_{φ}(z | x)}{p_{θ}(z | x)} \right] - E_{z \sim q_{φ}(z | x)} \left[ \log \frac{q_{φ}(z | x)}{p_{θ}(z)} \right] \\ &= E_{z \sim q_{φ}(z | x)} \left[ \log p_{θ}(x | z) \right] - D_{KL}(q_{φ}(z | x) ∥ p_{θ}(z)) + D_{KL}(q_{φ}(z | x) ∥ p_{θ}(z | x)) \end{aligned}$

(15)

In the above formula, x represents the input data, z represents the latent variables, $p_{θ}(x)$ represents the marginal likelihood of the input data x, $p_{θ}(x | z)$ represents the likelihood of the input data x given the latent variables z, $p_{θ}(z)$ represents the prior distribution of the latent variables z, $q_{φ}(z | x)$ represents the approximation of the posterior distribution of the latent variables z given the input data x, and $D_{KL}$ denotes the KL divergence, which measures the difference between two distributions. This formula illustrates the transformation and derivation of the log-likelihood function in the variational autoencoder. The aim is to introduce the evidence lower bound (ELBO), facilitating effective optimization of the model.
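The step from Equation (15) to the lower bound can be written out explicitly. Because the KL divergence is non-negative, dropping the last term of Equation (15) leaves a quantity that never exceeds the log-likelihood. The symbol $\mathcal{L}_{ELBO}$ is introduced here only for this note and does not appear elsewhere in the paper.

```latex
\log p_{θ}(x)
  \;\ge\; E_{z \sim q_{φ}(z | x)} \left[ \log p_{θ}(x | z) \right]
        - D_{KL}\left( q_{φ}(z | x) \,\|\, p_{θ}(z) \right)
  \;=\; \mathcal{L}_{ELBO},
\qquad \text{since } D_{KL}\left( q_{φ}(z | x) \,\|\, p_{θ}(z | x) \right) \ge 0 .
```

Equality holds when the approximate posterior $q_{φ}(z | x)$ matches the true posterior $p_{θ}(z | x)$, so maximizing the bound with respect to $φ$ also tightens it.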

The VAE is trained by maximizing the evidence lower bound (ELBO), which is equivalent to minimizing the negative ELBO used as the loss function. Doing so simultaneously maximizes the reconstruction log-likelihood of the data and minimizes the divergence between the approximate posterior distribution and the prior distribution, thereby effectively training the variational autoencoder. The loss function of the VAE has two parts. The first term is the negative expected log-likelihood of the decoder output, which can be regarded as the reconstruction error $L_{rec}$ of the autoencoder. The second term is the KL divergence between the approximate posterior distribution $q_{φ}(z | x)$ and the prior distribution $p_{θ}(z)$. The specific formula is as follows:

$L_{VAE} = L_{rec} + L_{KL} = - E_{z \sim q_{φ}(z | x)} \left[ \log p_{θ}(x | z) \right] + D_{KL}(q_{φ}(z | x) ∥ p_{θ}(z))$

(16)

The formula for the reconstruction loss of VAE is as follows:

L_{r e c} = \frac{1}{N} \sum_{i = 1}^{N} {(x_{i} - {\tilde{x}}_{i})}^{2}

(17)

L_{r e c}

reflects the reconstruction error term, specifically the mean squared error (MSE) between the input data

x_{i}

and its reconstructed value

{\tilde{x}}_{i}

. This term measures how accurately the decoder reconstructs the input data x from the latent variables z, and N is the total number of samples. The smaller the mean squared error, the closer the reconstructed sample is to the original input sample. During training, the reconstruction loss and the KL divergence term together constitute the negative evidence lower bound (ELBO), so minimizing the reconstruction loss optimizes the model parameters and improves the reconstruction quality.

The KL divergence loss calculation formula of VAE is as follows:

L_{K L} = D_{K L} (q_{φ} (z | x) ∥p_{θ} (z)) = \int q_{φ} (z | x) log \frac{q_{φ} (z | x)}{p_{θ} (z)} d z = \frac{1}{2} \sum_{j = 1}^{J} (σ_{j}^{2} + μ_{j}^{2} - log σ_{j}^{2} - 1)

(18)

KL divergence is a measure of the disparity between two probability distributions. In the VAE model, the KL divergence loss term measures the difference between the approximate posterior distribution

q_{φ} (z | x)

and the prior distribution

p_{θ} (z)

, where

q_{φ} (z | x)

is the approximate posterior distribution obtained by the encoder network, which is usually assumed to be a normal distribution

N (μ, σ^{2})

.

p_{θ} (z)

is the prior distribution, generally chosen as the standard normal distribution

N (0, 1)

.

μ_{j}

and

σ_{j}^{2}

are the mean and variance of the jth dimension in the approximate posterior distribution

q_{φ} (z | x)

, respectively. Equation (18) is the closed-form KL divergence between a Gaussian with diagonal covariance and the standard normal distribution, summed over the J latent dimensions, so it can be computed directly from the mean and variance that the encoder outputs for each dimension.

In summary, the VAE loss function is the sum of the reconstruction error and the KL divergence:

L_{V A E} = L_{r e c} + L_{K L} = \frac{1}{N} \sum_{i = 1}^{N} {(x_{i} - {\tilde{x}}_{i})}^{2} + \frac{1}{2} \sum_{j = 1}^{J} (σ_{j}^{2} + μ_{j}^{2} - log σ_{j}^{2} - 1)

(19)
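As a minimal sketch of how Equations (17)–(19) can be evaluated, the following pure-Python functions compute the two loss terms for small lists of numbers. The function names are illustrative, and two assumptions are made that the text does not state: each sample is a feature vector whose squared error is summed over features, and the KL term is evaluated for a single encoder output (in batch training it would typically be averaged over samples).

```python
import math


def reconstruction_loss(x, x_rec):
    """Equation (17): squared error per sample, averaged over the N samples.

    Assumption: each sample is a list of features and the squared
    difference is summed over the features of that sample.
    """
    n = len(x)
    total = 0.0
    for sample, rec in zip(x, x_rec):
        total += sum((a - b) ** 2 for a, b in zip(sample, rec))
    return total / n


def kl_loss(mu, var):
    """Equation (18): closed-form KL(N(mu, diag(var)) || N(0, I))."""
    return 0.5 * sum(v + m * m - math.log(v) - 1.0 for m, v in zip(mu, var))


def vae_loss(x, x_rec, mu, var):
    """Equation (19): reconstruction error plus KL divergence."""
    return reconstruction_loss(x, x_rec) + kl_loss(mu, var)
```

When the encoder outputs a zero mean and unit variance, the KL term vanishes, which shows why this term acts as a regularizer pulling the latent distribution toward the prior.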

The working principle of the VAE model was introduced above. Specifically, the VAE first maps the high-dimensional input data into a lower-dimensional latent space through the encoder. In this process, the model not only learns a latent representation of the data but also samples from a parameterized Gaussian distribution for the latent variables. This introduces a degree of randomness and regularization. Next, the decoder generates reconstructed samples that closely resemble the input data distribution by leveraging the latent variables sampled from the latent space. The training objective of the VAE is to minimize the reconstruction error and the KL divergence between the latent distribution and the prior distribution. This ensures that the model can effectively capture the latent structure and key features of the data while maintaining its generative capabilities. This process not only enhances the model’s ability to generate data but also improves its generalization performance. In this paper, the VAE model is employed to extract characteristics from the BOF steelmaking process data. First, the production process data collected by the sensor is used as input x and mapped to the latent space to obtain the mean

μ

and variance

σ^{2}

. The latent variable z is sampled from the Gaussian distribution using the reparameterization technique, and z is mapped to the data space through the decoder network to generate the reconstructed data

\tilde{x}

. Then, the KL divergence and reconstruction loss are calculated to train the model, and the resulting latent representation serves as the extracted features of the BOF steelmaking production process data.
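The reparameterization step mentioned above can be sketched as follows. This is an illustrative stand-alone example, not the authors' implementation: the function name, the explicit random generator argument, and the toy mean and variance values are assumptions.

```python
import math
import random


def reparameterize(mu, var, rng):
    """Sample z_j = mu_j + sigma_j * eps_j with eps_j ~ N(0, 1).

    The randomness is isolated in eps, so z remains a deterministic,
    differentiable function of mu and var.
    """
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]


# Toy usage: a two-dimensional latent variable with hypothetical parameters.
rng = random.Random(0)
z = reparameterize([2.0, -1.0], [0.25, 4.0], rng)
print(z)
```

Sampling in this form is what allows gradients of the loss to flow back to the encoder outputs during training.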

3.3. Constructing Dynamic Balanced Joint Distribution Alignment Domain Adaptation Network Model

In the BOF steelmaking process, variations in raw material batches, process adjustments, and equipment conditions cause shifts in the distribution of sensor-collected steelmaking data, leading to concept drift. This affects the functionality of the model and reduces the precision of endpoint carbon content and temperature forecasts. To address this issue, this section proposes a method based on a dynamic balanced joint distribution alignment domain adaptation model to achieve adaptive feature matching.

(1): The distribution discrepancy of BOF steelmaking data is calculated based on the maximum mean discrepancy.

To quantify changes in the BOF steelmaking data distribution, this paper measures the discrepancy between distributions using the maximum mean discrepancy (MMD), a statistic frequently used in transfer learning tasks to reduce the distribution difference between the source and target domains [49]. This method achieves distribution alignment by measuring the difference between two probability distributions in the feature space, thereby enhancing the model’s generalization capability in the target domain. For example, a transfer learning method based on MMD is proposed for intrusion detection in [50]. MMD is a non-parametric, kernel-based statistic that measures the distance between two distributions. It is calculated as follows:

\begin{matrix} M M D (Z^{s}, Z^{t}) = {∥\frac{1}{n_{s}} \sum_{i = 1}^{n_{s}} Φ (Z_{i}^{s}) - \frac{1}{n_{t}} \sum_{j = 1}^{n_{t}} Φ (Z_{j}^{t})∥}_{H}^{2} \\ = \frac{1}{n_{s}^{2}} \sum_{i = 1}^{n_{s}} \sum_{j = 1}^{n_{s}} K (Z_{i}^{s}, Z_{j}^{s}) + \frac{1}{n_{t}^{2}} \sum_{i = 1}^{n_{t}} \sum_{j = 1}^{n_{t}} K (Z_{i}^{t}, Z_{j}^{t}) - \frac{2}{n_{s} n_{t}} \sum_{i = 1}^{n_{s}} \sum_{j = 1}^{n_{t}} K (Z_{i}^{s}, Z_{j}^{t}) \end{matrix}

(20)

In the above formula,

Z^{s}

and

Z^{t}

represent the source domain data and target domain data for BOF steelmaking, respectively.

n_{s}

represents the sample size of the source domain data, and

n_{t}

represents the sample size of the target domain data. H denotes the reproducing kernel Hilbert space (RKHS),

Φ (\cdot)

represents the feature mapping function that maps the original samples to RKHS, and

K (\cdot, \cdot)

denotes a Gaussian kernel function. The expression for

K (\cdot, \cdot)

is as follows:

K (z_{i}, z_{j}) = e^{- \frac{{∥z_{i} - z_{j}∥}^{2}}{2 σ^{2}}}

(21)

Here,

σ

stands for the bandwidth parameter that controls the scale of the kernel.
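The kernel form of Equation (20) can be evaluated directly from pairwise kernel values. The sketch below is a plain-Python illustration with hypothetical function names; it uses the standard Gaussian (RBF) kernel with the squared Euclidean distance in the exponent, and the default bandwidth of 1.0 is an arbitrary assumption rather than a value from the paper.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel exp(-||a - b||^2 / (2 * sigma^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd(zs, zt, sigma=1.0):
    """Kernel expansion of Equation (20) for source zs and target zt."""
    ns, nt = len(zs), len(zt)
    k_ss = sum(gaussian_kernel(a, b, sigma) for a in zs for b in zs) / (ns * ns)
    k_tt = sum(gaussian_kernel(a, b, sigma) for a in zt for b in zt) / (nt * nt)
    k_st = sum(gaussian_kernel(a, b, sigma) for a in zs for b in zt) / (ns * nt)
    return k_ss + k_tt - 2.0 * k_st


# Toy usage: the discrepancy grows as the target set moves away from the source.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near = [[x + 0.5, y + 0.5] for x, y in source]
far = [[x + 3.0, y + 3.0] for x, y in source]
print(mmd(source, source), mmd(source, near), mmd(source, far))
```

The double loops cost O(n_s^2 + n_t^2 + n_s n_t) kernel evaluations, which is acceptable for a sketch but would normally be vectorized for datasets of the size used in the experiments.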

(2): Align BOF steelmaking data from different domains based on the domain adaptation network mechanism.

In the BOF steelmaking process, variations in raw material batches, process adjustments, and changes in equipment conditions can cause shifts in data distribution, leading to concept drift. Specifically, the distribution differences between different domains in BOF steelmaking data can manifest as follows: the marginal distributions can differ, i.e.,

P (x_{s}) \neq P (x_{t})

, where

x_{s}

and

x_{t}

denote the features of the source and target domains, respectively, and the conditional distributions can differ, i.e.,

P (y_{s} | x_{s}) \neq P (y_{t} | x_{t})

, where

y_{s}

and

y_{t}

denote the labels of the source and target domains, respectively.

To address the decline in prediction performance caused by concept drift in BOF steelmaking data, this paper employs a domain adaptation approach. Domain adaptation is an important topic in transfer learning research [51], and domain adaptation neural networks are intended to resolve the inconsistency in data distribution between the source and target domains. In practical applications, changes in data distribution often lead to a significant decline in performance when a source domain model is applied directly to the target domain. Domain adaptation neural networks therefore aim to establish effective mapping relationships between the source and target domains so that the model performs well in the target domain, typically by reducing the distribution discrepancies between the two domains. Among these methods, marginal distribution alignment and joint distribution alignment are two commonly used strategies. Marginal distribution alignment reduces cross-domain discrepancies by making the marginal distributions of the source and target domains more consistent in the high-dimensional feature space. Joint distribution alignment goes a step further by considering the conditional relationships between input features and output labels; by simultaneously aligning both marginal and conditional distributions, it reduces the distribution differences more comprehensively. Domain adaptation neural networks transfer source domain models to the target domain by optimizing a loss function that minimizes the distribution distance between the two domains in the feature space. This process typically involves three steps. First, the data are mapped to a feature space through a feature extractor. Then, domain alignment techniques are applied to gradually reduce the distribution discrepancies between the source and target domain features. Finally, the model parameters are adjusted to ensure accurate regression predictions on the unlabeled target domain samples. In this paper, MMD is used as the metric to measure the distance between BOF steelmaking data from different domains in the feature space. Reducing this distance narrows the gap between feature distributions and thereby enhances the model’s prediction performance.

(3): Construct a dynamic balanced joint distribution alignment domain adaptation network model.

Based on the above analysis, to effectively align BOF steelmaking data from different domains, this paper constructs a dynamic balanced joint distribution alignment domain adaptation network model. This model fully considers the differences in both marginal and conditional distributions between the source and target domains.

In marginal distribution domain adaptation, the difference between the source domain marginal distribution

P (x_{s})

and the target domain marginal distribution

P (x_{t})

can be measured using a distance function

D (D_{s}, D_{t})

. The goal is to minimize this distance to reduce distribution differences.

D (D_{s}, D_{t})

can be expressed as follows:

D (D_{s}, D_{t}) = D (P (x_{s}), P (x_{t}))

(22)

In conditional distribution domain adaptation, the difference between the source domain conditional distribution

P (y_{s} | x_{s})

and the target domain conditional distribution

P (y_{t} | x_{t})

can be measured using a distance function

D (D_{s}, D_{t})

. The goal is to minimize this distance to reduce distribution differences.

D (D_{s}, D_{t})

can be expressed as follows:

D (D_{s}, D_{t}) = D (P (y_{s} | x_{s}), P (y_{t} | x_{t}))

(23)

In joint distribution domain adaptation,

D (D_{s}, D_{t})

can be defined as the sum of the marginal and conditional distribution distances, as shown below:

D (D_{s}, D_{t}) = D (P (x_{s}), P (x_{t})) + D (P (y_{s} | x_{s}), P (y_{t} | x_{t}))

(24)

The approaches above either consider only one of the two distributions or assume that both distributions are equally important; however, this assumption might not hold in practical applications. This paper addresses the problem with the dynamic balanced joint distribution alignment domain adaptation model. We design a balance factor

μ

that quantitatively evaluates the role that conditional and marginal distributions have in domain adaptation. By dynamically adjusting the factor

μ

, we can effectively balance these distributions. In dynamic balanced joint distribution alignment domain adaptation,

D (D_{s}, D_{t})

can be expressed as a weighted linear combination of the marginal and conditional distribution distances, as shown below:

D (D_{s}, D_{t}) = μ D (P (x_{s}), P (x_{t})) + (1 - μ) D (P (y_{s} | x_{s}), P (y_{t} | x_{t}))

(25)

Here,

0 \leq μ \leq 1

. When

μ

= 1, the above formula reduces to marginal distribution alignment; when

μ

= 0, it reduces to conditional distribution alignment; and when

μ

= 0.5, the two distances are weighted equally, which corresponds to the joint distribution alignment of Equation (24) up to a constant factor. Furthermore, based on the previous analysis, this paper employs the MMD distance to construct a dynamic balanced joint distribution alignment network.

The formula for calculating the marginal distribution distance based on MMD is as follows:

D_{M M D} (P_{s} (x), P_{t} (x)) = {∥\frac{1}{n_{s}} \sum_{i = 1}^{n_{s}} A^{T} x_{i} - \frac{1}{n_{t}} \sum_{j = 1}^{n_{t}} A^{T} x_{j}∥}_{H}^{2}

(26)

The formula for calculating the conditional distribution distance based on MMD is as follows:

D_{M M D} (P_{s} (y | x), P_{t} (y | x)) = \sum_{c = 1}^{C} {∥\frac{1}{{n_{s}}^{(c)}} \sum_{x_{s i} \in {D_{s}}^{(c)}} A^{T} x_{s i} - \frac{1}{{n_{t}}^{(c)}} \sum_{x_{t j} \in {D_{t}}^{(c)}} A^{T} x_{t j}∥}_{H}^{2}

(27)

Furthermore, the calculation formula for the dynamic adjustment factor

μ

is as follows:

μ = \frac{D_{M M D} (P_{s} (x), P_{t} (x))}{D_{M M D} (P_{s} (x), P_{t} (x)) + D_{M M D} (P_{s} (y | x), P_{t} (y | x))}

(28)

Based on the above analysis, the calculation formula for the proposed adaptive feature matching method is as follows:

\begin{matrix} D_{M M D} (A^{T} x_{s}, A^{T} x_{t}) = μ * D_{M M D} (P_{s} (x), P_{t} (x)) + (1 - μ) * D_{M M D} (P_{s} (y | x), P_{t} (y | x)) \\ = μ * {∥\frac{1}{n_{s}} \sum_{i = 1}^{n_{s}} A^{T} x_{i} - \frac{1}{n_{t}} \sum_{j = 1}^{n_{t}} A^{T} x_{j}∥}_{H}^{2} + (1 - μ) * \sum_{c = 1}^{C} {∥\frac{1}{{n_{s}}^{(c)}} \sum_{x_{s i} \in {D_{s}}^{(c)}} A^{T} x_{s i} - \frac{1}{{n_{t}}^{(c)}} \sum_{x_{t j} \in {D_{t}}^{(c)}} A^{T} x_{t j}∥}_{H}^{2} \end{matrix}

(29)
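To make the weighting in Equations (26)–(29) concrete, the following sketch computes the marginal distance, the conditional distance, the balance factor μ, and the weighted distance on toy data. Several simplifications are assumptions of this example and not statements about the authors' implementation: the transformation A is taken to be the identity, the class index c is supplied as integer labels (for a regression target these could be binned values or pseudo-labels on the target domain), only classes present in both domains are compared, and μ falls back to 0.5 when both distances are zero.

```python
def mean_vector(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_mean_gap(xs, xt):
    """Squared Euclidean distance between the two domain means."""
    ms, mt = mean_vector(xs), mean_vector(xt)
    return sum((a - b) ** 2 for a, b in zip(ms, mt))


def marginal_distance(xs, xt):
    """Equation (26) with A taken as the identity (assumption)."""
    return sq_mean_gap(xs, xt)


def conditional_distance(xs, ys, xt, yt):
    """Equation (27): sum of per-class mean gaps over shared classes."""
    total = 0.0
    for c in sorted(set(ys) & set(yt)):
        xs_c = [x for x, y in zip(xs, ys) if y == c]
        xt_c = [x for x, y in zip(xt, yt) if y == c]
        total += sq_mean_gap(xs_c, xt_c)
    return total


def balance_factor(d_marg, d_cond):
    """Equation (28); the 0.5 fallback for a zero denominator is an assumption."""
    denom = d_marg + d_cond
    return 0.5 if denom == 0.0 else d_marg / denom


def afm_distance(xs, ys, xt, yt):
    """Equation (29): returns (mu, weighted distance)."""
    d_m = marginal_distance(xs, xt)
    d_c = conditional_distance(xs, ys, xt, yt)
    mu = balance_factor(d_m, d_c)
    return mu, mu * d_m + (1.0 - mu) * d_c


# Toy usage with two classes per domain.
xs = [[0.0, 0.0], [0.0, 0.0], [2.0, 2.0], [2.0, 2.0]]
ys = [0, 0, 1, 1]
xt = [[1.0, 0.0], [1.0, 0.0], [3.0, 2.0], [3.0, 2.0]]
yt = [0, 0, 1, 1]
print(afm_distance(xs, ys, xt, yt))
```

In this toy case the marginal distance is 1 and the conditional distance is 2, so Equation (28) gives μ = 1/3 and the weighted distance is 5/3.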

Based on the above analysis, this section constructs a dynamic balanced joint distribution alignment domain adaptation network model, as shown in Figure 3:

The model achieves efficient domain adaptation by adaptively aligning the distributions across different domains. During the domain adaptation process, the contributions of marginal and conditional distributions to domain divergence are not identical. To accurately assess and adjust the influence of these distributions, the model first utilizes maximum mean discrepancy to separately calculate the marginal distribution difference and the conditional distribution difference between the source and target domains. This step ensures that the model can quantitatively capture the degree of divergence between different distributions. Subsequently, the model introduces a dynamic adjustment factor

μ

, which is designed to evaluate the relative importance of marginal and conditional distributions in real-time during the domain adaptation process. This dynamic adjustment factor

μ

adaptively adjusts the weights of each distribution based on the observed differences between them, allowing the model to flexibly address distributional discrepancies across different domains. Through this mechanism, the model not only achieves simultaneous alignment of marginal and conditional distributions but also dynamically adjusts the importance weights of these distributions in response to variations in inter-domain differences. This dynamic weight adjustment mechanism enhances the model’s adaptability and robustness when handling data from different domains, ensuring stability and predictive performance under varying distribution conditions. Overall, the model not only effectively reduces inter-domain distribution discrepancies during the domain adaptation process but also significantly improves its generalization capability in cross-domain applications.

3.4. Adaptive Feature Matching Variational Autoencoder Model

In Section 3.2 and Section 3.3, we introduced the use of the variational autoencoder model for extracting features from BOF steelmaking data, and the dynamic balanced joint distribution alignment domain adaptation network model for reducing differences between various BOF datasets. Based on the previous analysis, this section constructs an adaptive feature matching variational autoencoder model (VAE-AFM). As shown in Figure 4, the model is generally divided into four stages: domain partitioning, feature extraction, domain adaptation, and endpoint carbon content and temperature prediction. This model is designed to process data from the BOF steelmaking production process, particularly addressing the issue of concept drift.

This model integrates the feature extraction capabilities of the VAE with the dynamic balanced joint distribution alignment domain adaptation network. In the VAE-AFM model, the processing is divided into two main steps to ensure accurate prediction of the endpoint carbon content and temperature during the BOF steelmaking process. First, the VAE module extracts latent feature representations from the raw data. Through its encoder–decoder architecture, the VAE compresses high-dimensional input data into a lower-dimensional latent space: the encoder maps the input data to a distribution in the latent space, while the decoder reconstructs samples similar to the original data distribution. This approach extracts key features from the data and captures the underlying structures and patterns. Then, the AFM module processes the extracted latent features further. The core of AFM lies in applying domain adaptation techniques to ensure that the distributions of source domain data and target domain data in the latent space are as consistent as possible. Specifically, the AFM module minimizes the distribution differences between the source and target domains through dynamic joint adaptation of both marginal and conditional distributions, which enhances the model’s generalization ability and keeps its performance stable under varying data distributions. The two components are complementary: the VAE module provides a more precise latent representation of the data, while the AFM module refines these features by reducing distribution discrepancies between domains, thereby enhancing the model’s adaptability across different data domains. Overall, the VAE-AFM model is designed to handle the high dimensionality, strong nonlinearity, and concept drift inherent in BOF steelmaking process data, improving prediction accuracy and model robustness.

The pseudo-code for the proposed adaptive feature matching with the variational autoencoder (VAE-AFM) model for predicting the endpoint carbon content and temperature in BOF steelmaking is shown in Algorithm 1:

Algorithm 1: Dynamic Soft Sensor Model Based on Adaptive Feature Matching Variational Autoencoder (VAE-AFM)

4. Dynamic Soft Sensor for Endpoint Carbon Content and Temperature in BOF Steelmaking Based on VAE-AFM Model

Based on the analysis in Section 3, to forecast the endpoint carbon content and temperature in BOF steelmaking, a soft sensor model is built. This section provides a detailed overview of the issues in the BOF steelmaking process and explains how the proposed dynamic soft sensor model is able to forecast the endpoint carbon content and temperature.

4.1. Problem Description

In the BOF steelmaking process, concept drift can be caused by various factors. First, changes in raw materials can lead to fluctuations in data distribution. Variations in the composition and quality of raw materials from different batches can significantly impact the model’s prediction performance. For example, variations in the composition of iron ore and scrap steel directly affect the chemical reactions and temperature distribution within the furnace, thereby altering the composition and quality of the molten steel. Secondly, process adjustments are another important factor. To meet market demands or improve production efficiency, it is often necessary to adjust blowing parameters, alloy ratios, or the amount of scrap steel input during the production process. These adjustments may include changing the flow rates of oxygen and gas, adjusting the furnace pressure and temperature, and modifying the proportions of alloying elements. Each process adjustment leads to changes in data distribution, reducing the model’s prediction accuracy under the new conditions. Additionally, changes in equipment conditions significantly impact data distribution. Prolonged use of BOF equipment can lead to wear, aging, or malfunctions, all of which can alter the process conditions. For example, wear on the oxygen nozzles can result in uneven oxygen supply, affecting the temperature distribution and gas flow within the furnace. Such changes not only affect the current production process but also lead to inconsistencies between historical and current data distributions, thereby impacting the model’s prediction performance. Additionally, environmental factors and the experience level of operators can also impact the production process. For example, seasonal temperature changes can affect the heat conduction efficiency within the furnace. These factors can also cause changes in data distribution, further exacerbating the phenomenon of concept drift. 
Overall, the complex and variable factors in the BOF steelmaking process collectively lead to dynamic changes in data distribution, making concept drift a critical issue that requires significant attention. Effectively addressing concept drift is crucial for enhancing the robustness and prediction performance of the model.

In this paper, we use the early drift detection method (EDDM) to detect concept drift in the BOF steelmaking process data collected online [52]. EDDM is a statistical method that monitors the model’s prediction errors and provides early warnings so that concept drift can be identified promptly. Rather than tracking the error rate directly, EDDM tracks the distance, measured in number of samples, between two consecutive errors. While the data distribution is stable, this distance tends to stay constant or grow; when concept drift occurs, errors become more frequent and the distance shrinks.

Specifically, each time an error occurs, the EDDM algorithm updates the running mean and standard deviation of the distance between errors and records the largest value reached by the monitored statistic. The current statistic is then compared with this maximum. If the ratio falls below a first threshold, the point is treated as a candidate for concept drift; if it falls below a second, lower threshold, it is identified as a true concept drift point. The formulas are expressed as follows:

\text{warning}: \frac{p_{i}^{'} + 2 s_{i}^{'}}{p_{max}^{'} + 2 s_{max}^{'}} < α

(30)

\text{drift}: \frac{p_{i}^{'} + 2 s_{i}^{'}}{p_{max}^{'} + 2 s_{max}^{'}} < β

(31)

In the above formulas,

p_{i}^{'}

and

s_{i}^{'}

denote the mean and standard deviation, respectively, of the distance between two consecutive errors.

p_{i}^{'} + 2 s_{i}^{'}

is the monitored statistic,

p_{max}^{'} + 2 s_{max}^{'}

is the maximum value that this statistic has reached so far, and

α

is the warning threshold. When the ratio falls below

α

, incoming training samples are stored in anticipation of a possible change in the probability distribution.

β

is the drift threshold, and when the ratio falls below

β

, a true concept drift is considered to have occurred.
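The test in Equations (30) and (31) can be sketched as a small online detector. This is an illustrative implementation under stated assumptions, not the authors' code: the class and attribute names are hypothetical, the thresholds α = 0.95 and β = 0.90 and the minimum of 30 observed errors are commonly used EDDM defaults rather than values given in this paper, and each sample is assumed to arrive with a Boolean flag indicating whether the prediction counts as an error (for a regression target, for example, an absolute error above some tolerance).

```python
import math


class EDDM:
    """Early drift detection based on the distance between consecutive errors."""

    def __init__(self, alpha=0.95, beta=0.90, min_errors=30):
        self.alpha = alpha            # warning threshold (assumed default)
        self.beta = beta              # drift threshold (assumed default)
        self.min_errors = min_errors  # errors required before testing (assumed)
        self.t = 0                    # number of samples seen
        self.last_error_t = 0         # position of the previous error
        self.n_errors = 0
        self.mean = 0.0               # running mean of the distances, p'
        self.m2 = 0.0                 # running sum of squared deviations
        self.max_stat = 0.0           # largest p' + 2 s' seen so far

    def update(self, is_error):
        """Process one sample and return 'stable', 'warning', or 'drift'."""
        self.t += 1
        if not is_error:
            return "stable"
        distance = self.t - self.last_error_t
        self.last_error_t = self.t
        self.n_errors += 1
        # Welford update of the mean and variance of the distances.
        delta = distance - self.mean
        self.mean += delta / self.n_errors
        self.m2 += delta * (distance - self.mean)
        std = math.sqrt(self.m2 / self.n_errors)
        stat = self.mean + 2.0 * std
        if stat > self.max_stat:
            self.max_stat = stat
            return "stable"
        if self.n_errors < self.min_errors:
            return "stable"
        ratio = stat / self.max_stat
        if ratio < self.beta:
            return "drift"
        if ratio < self.alpha:
            return "warning"
        return "stable"


# Toy stream: one error every 10 samples, then one error every 2 samples.
detector = EDDM()
states = []
for t in range(1, 801):
    is_error = (t % 10 == 0) if t <= 400 else (t % 2 == 0)
    states.append(detector.update(is_error))
print(states.index("warning") + 1, states.index("drift") + 1)
```

In practice the detector state would be reset after a drift is signaled so that monitoring restarts on the new distribution; that step is omitted here to keep the sketch short.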

4.2. Dynamic Soft Sensor for Endpoint Carbon Content and Temperature in BOF Steelmaking Based on VAE-AFM Model

Based on the above analysis of the BOF steelmaking process, we established a dynamic soft sensor model for predicting the endpoint carbon content and temperature using the VAE-AFM model. The specific process is illustrated in Figure 5:

The specific details of the BOF steelmaking endpoint carbon content and temperature prediction model are as follows:

(1): Data Collection: During the BOF steelmaking process, advanced sensor technology is used to collect a large amount of key data. Such data include oxygen blowing rate, hot metal loading, scrap loading, temperature, pressure, gas composition inside the furnace, and other environmental and process parameters. These sensors collect data through a real-time monitoring system, ensuring data timeliness and completeness.
(2): Data Preprocessing: Due to environmental interference and equipment errors in actual production, the collected data require systematic preprocessing to ensure their quality and reliability. The preprocessing includes the following steps. Data cleaning: Outliers, missing values, and duplicate values are identified and handled. First, to ensure the accuracy and completeness of the collected original dataset, we manually delete all zero data, as they may represent invalid or meaningless records that would introduce bias in model training. Second, we strictly screen and manually delete obvious outliers, which are usually caused by acquisition errors, sensor failures, or extreme working conditions and may adversely affect the training and prediction performance of the model if retained. These steps remove noise from the data and enhance the representativeness and stability of the dataset, thereby improving the generalization ability and prediction accuracy of the model. Feature selection: Features that significantly impact the endpoint carbon content and temperature are selected through correlation analysis and feature engineering. For example, historical data are analyzed to identify which parameters (such as oxygen blowing rate, temperature, and scrap composition) are most important for the prediction results. Data standardization and normalization: The data are standardized using zero-mean standardization and min–max normalization to mitigate the effects of differing feature scales on model training. Real-time concept drift detection: An online detection algorithm is designed and implemented to monitor changes in data distribution in real time and detect the occurrence of concept drift, enabling timely model adjustments.
(3): Feature Matching and Domain Adaptation: In the BOF steelmaking process, changes in raw material batches, process adjustments, and equipment conditions can lead to shifts in data distribution, resulting in concept drift and affecting model performance. Therefore, it is necessary to reduce the differences between domains. The model proposed in this paper combines deep learning and transfer learning methods. First, a variational autoencoder is used to extract latent features from the high-dimensional raw data, which better represent the intrinsic structure and distribution of the data. Then, an adaptive feature matching method is designed to adaptively measure the relative importance of marginal and conditional distributions between different domains. This approach effectively reduces the differences between various BOF steelmaking datasets and enhances the prediction performance of the endpoint carbon content and temperature.
(4): Model Training and Experimental Results Analysis: Based on the aforementioned process, a deep transfer learning model (VAE-AFM) for predicting the endpoint carbon content and temperature in BOF steelmaking is established. The model is then used to predict the endpoint carbon content and temperature, and the experimental results are analyzed.
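The cleaning and scaling operations in step (2) can be illustrated with a short sketch. The function names are hypothetical, and two readings of the text are assumed here: "zero data" is interpreted as records whose entries are all zero, and a constant column is left unscaled by using a divisor of 1.0.

```python
import math


def drop_all_zero_rows(rows):
    """Remove records in which every feature is zero (assumed reading)."""
    return [r for r in rows if any(v != 0 for v in r)]


def zscore_columns(rows):
    """Zero-mean, unit-variance standardization applied per feature column."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def minmax_columns(rows):
    """Min-max normalization of each feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(r, lows, spans)] for r in rows]


# Toy usage with two features per record.
raw = [[0.0, 0.0], [1.0, 10.0], [3.0, 30.0]]
clean = drop_all_zero_rows(raw)
print(zscore_columns(clean))
print(minmax_columns(clean))
```

In a deployed soft sensor, the scaling statistics would normally be estimated on the source domain data and reused for incoming samples, so that the target domain is not rescaled with its own statistics.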

Through the above steps, the proposed VAE-AFM model can effectively reduce domain differences in BOF steelmaking data, resulting in improved prediction performance for endpoint carbon content and temperature. This model has both theoretical and practical application value.

5. Experimental Results and Analysis

To validate the effectiveness of the proposed approach, experiments are conducted on actual BOF steelmaking process data following the modeling process described in Section 3 and Section 4. In this section, we first present the experimental data, experimental platform, and evaluation metrics. Next, we conduct ablation experiments to demonstrate the effectiveness of the proposed approach in addressing the high dimensionality, complexity, and concept drift issues in BOF steelmaking process data. Finally, we compare the prediction ability of the proposed soft sensor model with that of other soft sensor models.

5.1. Introduction of Experimental Data and Experimental Environment

Precise prediction of temperature and carbon content is essential for endpoint control in BOF steel production. The experimental data used in this work come from the actual BOF steelmaking production process at a steel factory. The target variables, or output variables, are the endpoint carbon content and temperature. The raw features collected by various sensors serve as the input variables for the model. In the carbon content dataset, the source domain contains 3500 samples, while target domains 1, 2, and 3 contain 733, 625, and 492 samples, respectively. Each sample for carbon content includes 26 features. Similarly, in the temperature dataset, the source domain contains 3500 samples, while target domains 1, 2, and 3 contain 497, 660, and 425 samples, respectively. Each sample for temperature includes 30 features. The sample counts of the carbon content and temperature datasets are detailed in Table 1.

For the aforementioned carbon content and temperature datasets, this paper calculates the maximum mean discrepancy (MMD) distance between each domain to visually demonstrate changes in the distribution of BOF steelmaking data. MMD is a criterion used to measure the differences in data distributions, commonly employed in transfer learning to assess the differences between different operating conditions. A larger MMD value indicates greater differences between two datasets, while a smaller value indicates lesser differences. The MMD values are visualized as shown in Figure 6:

In the paper describing the variational autoencoder (VAE), we designed a model incorporating several key components to optimize its performance and adaptability. The main structure of the model includes two parts: the encoder and the decoder. In this study, the encoder consists of four fully connected layers, while the decoder has a symmetric structure, mapping the features from the latent space back to the original data space. ReLU was primarily chosen as the activation function due to its nonlinear properties and strong training performance. The optimizer used to train the VAE is Adam. Table 2 provides the details of the variational autoencoder architecture.

The parameter settings used in this study, which were chosen for the application at hand, are shown in Table 3. The network model takes 26 input variables for carbon content and 30 for temperature. The learning rate is set to 0.01. The model is trained for 4000 iterations for carbon content and 5000 for temperature.

Table 4 lists the features collected by various sensors during the BOF steelmaking production process. To evaluate the importance of each sample feature for the model output, we conducted a ranking analysis of the features related to endpoint carbon content and temperature. We quantified the relative importance of each feature by calculating its contribution to the model’s prediction outcomes. Through this feature importance analysis, we identified the key features that most strongly affect the output and gained a deeper understanding of the model’s decision-making mechanism. The feature importance ranking results are shown in Figure 7:

To display the features extracted by the model from the raw data during the training phase more intuitively, we used the t-SNE dimensionality reduction technique to visualize them. The results are shown in Figure 8, where (a) shows the features extracted for the endpoint carbon content and (b) shows the features extracted for the endpoint temperature.

In the experiment, we used a computer equipped with a 12th-generation Intel Core i7 processor, 16 GB of RAM, and an NVIDIA GeForce RTX 3060 Laptop GPU. The operating system was Windows 11, and the programming language used was Python 3.7. PyTorch was selected as the deep learning framework for the experiments.

5.2. Introduction of Experimental Evaluation Indicators

To assess how well various soft sensor models predict the endpoint carbon content and temperature in BOF steelmaking, we used several evaluation metrics, including prediction accuracy (PA), root mean square error (RMSE), and mean absolute percentage error (MAPE). Prediction accuracy (PA) measures the accuracy of the model, with higher values indicating more accurate predictions. Other metrics, such as root mean square error (RMSE) and mean absolute percentage error (MAPE), assess the magnitude of the model’s errors. Better prediction performance is demonstrated by lower values of these indicators, meaning smaller differences between the predicted values and the true values. By using multiple evaluation metrics, we can comprehensively assess the various models’ performances.

The calculation formulas for each evaluation metric are as follows:

$$PE_{i} = \begin{cases} 1, & |y_{test,i} - y_{pre,i}| \leq Th \\ 0, & |y_{test,i} - y_{pre,i}| > Th \end{cases} \tag{32}$$

$$PA = \frac{\sum_{i=1}^{N_{test}} PE_{i}}{N_{test}} \times 100\% \tag{33}$$

$$RMSE = \sqrt{\frac{1}{N_{test}} \sum_{i=1}^{N_{test}} (y_{test,i} - y_{pre,i})^{2}} \tag{34}$$

$$MAPE = \frac{1}{N_{test}} \sum_{i=1}^{N_{test}} \left| \frac{y_{test,i} - y_{pre,i}}{y_{test,i}} \right| \tag{35}$$

In the above equations, $N_{test}$ represents the number of test samples; $i \in \{1, \ldots, N_{test}\}$; $y_{test}$ and $y_{pre}$ denote the true and predicted values, respectively; and $Th$ denotes the allowable error margin.
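The three metrics in Equations (32) to (35) can be sketched in a few lines of Python. The function names and the toy temperature values below are made up for illustration and are not taken from the experimental data.

```python
import math

def prediction_accuracy(y_true, y_pred, threshold):
    """PA in percent: share of samples whose absolute error is within
    the allowable error margin Th (Equations (32) and (33))."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= threshold)
    return hits / len(y_true) * 100.0

def rmse(y_true, y_pred):
    """Root mean square error (Equation (34))."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (Equation (35)).
    Assumes no true value is zero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy endpoint temperatures in degrees Celsius (illustrative values only).
y_true = [1650.0, 1660.0, 1640.0, 1655.0]
y_pred = [1655.0, 1648.0, 1641.0, 1670.0]

print("PA (10):", prediction_accuracy(y_true, y_pred, 10.0))
print("PA (15):", prediction_accuracy(y_true, y_pred, 15.0))
print("RMSE:   ", rmse(y_true, y_pred))
print("MAPE:   ", mape(y_true, y_pred))
```

Because MAPE divides by the true value, it is small for temperatures around 1650 °C and much larger for carbon contents well below 1%, which is why the two targets are better compared through PA and RMSE within each dataset than across datasets.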

5.3. Ablation Experiments

To confirm that the suggested dynamic soft sensor model accurately predicts the endpoint carbon content and temperature in the production of BOF steelmaking, we conducted ablation experiments based on adaptive feature matching with the variational autoencoder model (VAE-AFM). The specific ablation experiment settings are as follows:

(1): Establish a soft sensor model to forecast the endpoint carbon content and temperature in BOF steelmaking based on the variational autoencoder (VAE).
(2): Establish a soft sensor model to forecast the endpoint carbon content and temperature in BOF steelmaking based on marginal distribution alignment with the variational autoencoder (VAE-MDA).
(3): Establish a soft sensor model to forecast the endpoint carbon content and temperature in BOF steelmaking based on joint distribution alignment with the variational autoencoder (VAE-JDA).
(4): Establish a dynamic soft sensor model to forecast the endpoint carbon content and temperature in BOF steelmaking based on the adaptive feature matching variational autoencoder (VAE-AFM).

5.3.1. Graphs of Experimental Results of the Carbon Content

Using the proposed method, the endpoint carbon content in BOF steelmaking is predicted. Table 5 displays the model’s experimental outcomes.

To illustrate the outcomes of the experiment more intuitively and vividly, this section includes a fitting plot of the carbon content prediction results and a radar chart of the prediction accuracy. These plots are shown in Figure 9, Figure 10, Figure 11 and Figure 12 below:

5.3.2. Analysis of Experimental Results of the Carbon Content

Several ablation experiments were conducted to predict the endpoint carbon content in BOF steelmaking. When the VAE model alone processes the BOF steelmaking data, the extracted features describe the output reasonably well, but concept drift changes the data distribution and degrades the model’s prediction performance. The marginal distribution model (VAE-MDA) considers only the marginal distribution differences between domains and therefore fails to reduce the overall domain differences effectively. The joint distribution model (VAE-JDA) considers both marginal and conditional distributions but does not account for their relative importance. In contrast, the dynamic balanced joint distribution alignment domain adaptation model proposed in this paper considers both distributions and evaluates their relative importance, so the VAE-AFM model reduces the differences between domains more effectively when processing BOF steelmaking data. As Table 5 shows, when predicting carbon content in target domains 1, 2, and 3, the VAE-AFM model achieves prediction accuracies of 85.71%, 84%, and 83.83% within a ±0.02% error margin, with root mean square errors (RMSE) of 0.0152, 0.0155, and 0.0139, respectively. These results indicate that the proposed VAE-AFM method effectively addresses the high dimensionality, strong nonlinearity, and concept drift of BOF steelmaking process data.
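The difference between the three alignment strategies can be illustrated with a weighted combination of a marginal and a conditional discrepancy. The sketch below uses the balanced form known from balanced distribution adaptation (BDA), $(1 - \mu) D_{marginal} + \mu D_{conditional}$. The rule used here to choose $\mu$ from the two discrepancy values is purely hypothetical and is not the dynamic adjustment algorithm of this paper, which is defined in the method section; the function names and numeric values are likewise made up.

```python
def balanced_discrepancy(d_marginal, d_conditional, mu):
    """BDA-style weighted sum: (1 - mu) * marginal + mu * conditional."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return (1.0 - mu) * d_marginal + mu * d_conditional

def adaptive_mu(d_marginal, d_conditional, eps=1e-12):
    """Hypothetical weighting rule (an assumption, not the paper's):
    weight the conditional term by its share of the total discrepancy."""
    return d_conditional / (d_marginal + d_conditional + eps)

# Made-up MMD values for the marginal and conditional distributions.
d_m, d_c = 0.2, 0.6

marginal_only = balanced_discrepancy(d_m, d_c, mu=0.0)   # marginal term only
equal_weights = balanced_discrepancy(d_m, d_c, mu=0.5)   # fixed equal weights
mu = adaptive_mu(d_m, d_c)
adaptive = balanced_discrepancy(d_m, d_c, mu)            # data-dependent weights

print(marginal_only, equal_weights, mu, adaptive)
```

Setting `mu=0.0` keeps only the marginal term, a fixed `mu=0.5` treats both terms as equally important, and a data-dependent `mu` shifts weight toward whichever distribution currently differs more.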

5.3.3. Graphs of Experimental Results of the Temperature

Using the proposed method, the endpoint temperature in BOF steelmaking is predicted. Table 6 displays the model’s experimental outcomes.

To showcase the outcomes of the experiment more intuitively and vividly, this section includes a fitting plot of the temperature prediction results and a radar chart of prediction accuracy. These plots are shown in Figure 13, Figure 14, Figure 15 and Figure 16 below:

5.3.4. Analysis of Experimental Results of the Temperature

This section presents the experiments on predicting the endpoint temperature in BOF steelmaking. The experimental design follows the same approach as the carbon content prediction experiments in Sections 5.3.1 and 5.3.2: the VAE, VAE-MDA, VAE-JDA, and VAE-AFM models are used to predict the temperature. With the proposed VAE-AFM model, the prediction accuracies within a ±10 °C error margin in target domains 1, 2, and 3 are 85%, 81.81%, and 84.7%, and the root mean square errors (RMSE) are 7.8317, 8.5019, and 8.8545, respectively. These results indicate that the proposed VAE-AFM method effectively addresses the high dimensionality, strong nonlinearity, and concept drift of BOF steelmaking process data.

5.4. Experimental Comparison Results and Analysis with Other Algorithms

In this section, we compare the proposed VAE-AFM soft sensor method with other soft sensor methods. These include the transfer component analysis (TCA) domain adaptation algorithm proposed by Pan et al. [38], the joint distribution adaptation (JDA) domain adaptation algorithm proposed by Long et al. [39,40], and the balanced distribution adaptation (BDA) algorithm proposed by Wang et al. [41]. Additionally, we compare it with the variable-wise weighted stacked autoencoder (VW-SAE) proposed in [23], the spatiotemporal attention-based LSTM network (STA-LSTM) model proposed in [24], the variable attention-based long short-term memory network (VALSTM) proposed in [25], the supervised LSTM network (SLSTM) model proposed in [26], the attribute-relevant distributed variational autoencoder (AR-DVAE) model proposed in [22], the stacked supervised Poisson autoencoder (SSPAE) model proposed in [21], the supervised dual-branch deep belief network (SD-DBN) model proposed in [20] and the von-Mises Fisher mixture model and weighted stacked autoencoder model (vMF-WSAE) proposed in [19].

5.4.1. Graphs of Experimental Results

Table 7 and Table 8 present the prediction performance metrics of the proposed VAE-AFM soft sensor method and other soft sensor models for endpoint carbon content and temperature in BOF steelmaking, respectively.

To present the experimental results more intuitively, this section includes radar charts of the MAPE and PA prediction results for endpoint carbon content and temperature in BOF steelmaking, as well as bar charts of the RMSE results. These charts are shown in Figure 17, Figure 18, Figure 19, Figure 20, Figure 21 and Figure 22 below:

5.4.2. Analysis of Experimental Results

The experimental results in Table 7 and Table 8 show that the proposed VAE-AFM model achieves the highest accuracy among the compared soft sensor models. For example, in target domain 1 it reaches a PA of 0.8571 at the ±0.02% margin for carbon content, against 0.8095 for the next-best model (vMF-WSAE), and an RMSE of 7.8317 for temperature, against 10.2925 for the next-best model (SSPAE). This performance can be attributed to the VAE model’s ability to extract complex features from the BOF steelmaking data and to the dynamic balanced joint distribution alignment domain adaptation model’s adaptive adjustment of the relative importance of marginal and conditional distributions between domains, which effectively reduces inter-domain differences.

As shown in Table 7 and Table 8, traditional domain adaptation methods such as TCA, JDA, and BDA exhibit relatively poor prediction performance, with higher root mean square error (RMSE) and mean absolute percentage error (MAPE). This is primarily because these domain adaptation methods either only consider the marginal distribution between different domains or consider both the marginal and conditional distributions without accounting for their relative importance. As a result, they cannot adequately reduce the differences between domains. For certain deep learning models, such as the variable-wise weighted stacked autoencoder (VW-SAE) and the spatiotemporal attention-based LSTM network (STA-LSTM), although these models can effectively handle the highly complex BOF steelmaking data, they do not account for the concept drift phenomenon that occurs in the BOF steelmaking process due to differences in raw material batches, process adjustments, and changes in equipment conditions. This leads to changes in data distribution, subsequently affecting model performance.

The model advanced in this paper not only utilizes deep learning to process the complex BOF steelmaking process data but also constructs a dynamic balanced joint distribution alignment domain adaptation model. This model adaptively adjusts the importance of marginal and conditional distributions between different domains, effectively reducing the discrepancies between them. By combining these approaches, the model’s performance is significantly improved, enabling more accurate predictions of the endpoint carbon content and temperature in BOF steelmaking. This study furnishes substantial support for enhancing the efficacy and applicability of soft sensor models in predicting endpoint carbon content and temperature in BOF steelmaking, thereby presenting notable theoretical and practical significance.

6. Conclusions

This paper addresses the challenges of elevated dimensionality and pronounced nonlinearity in BOF steelmaking process data. Additionally, it tackles the issue of concept drift caused by changes in data distribution due to variations in raw material batches, process adjustments, and equipment conditions during the steelmaking process, which affect model performance. We propose a dynamic balanced joint distribution alignment domain adaptation network soft sensor model based on deep learning, referred to as the VAE-AFM soft sensor model. To verify the effectiveness of the VAE-AFM soft sensor model, experiments were carried out in the real BOF steelmaking manufacturing process. This model integrates the advantages of deep learning and dynamic domain adaptation, making it particularly suitable for handling large-scale, high-dimensional, nonlinear data and concept drift in the BOF steelmaking process. The principal contributions of this research encompass the following aspects:

(1): An adaptive feature matching variational autoencoder (VAE-AFM) soft sensor model is proposed. VAE-AFM combines deep learning and transfer learning to process BOF steelmaking process data.
(2): By designing a dynamic adjustment factor $\mu$, a dynamic balanced joint distribution alignment domain adaptation network was constructed. This network adaptively measures the significance of marginal and conditional distributions across different domains, effectively reducing the differences between various BOF steelmaking data and improving the prediction performance for endpoint carbon content and temperature.
(3): Experiments were performed using real BOF steelmaking data. Compared to traditional domain adaptation methods and other deep learning models, the proposed VAE-AFM model demonstrated superior performance. These results provide an effective solution for predicting the endpoint carbon content and temperature in the BOF steelmaking production process.

Overall, the proposed VAE-AFM model effectively reduces differences between domains, thereby improving the prediction accuracy of endpoint carbon content and temperature in BOF steelmaking. This paper provides a promising research direction for modeling soft sensors for endpoint carbon content and temperature in BOF steelmaking using deep learning and transfer learning, offering valuable references for practical applications.

Author Contributions

Z.L.: conceptualization, methodology, software, validation, writing—original draft, writing—review and editing, visualization. H.L. (Hui Liu): conceptualization, methodology, supervision, project administration, funding acquisition. F.C.: data curation, formal analysis, investigation. H.L. (Heng Li): investigation. X.X.: investigation. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62263016); the Applied Basic Research Foundation of Yunnan Province, China (No. 202401AT070375).

Data Availability Statement

The data that have been used are confidential.

Conflicts of Interest

Author Fugang Chen was employed by the company Yunnan Kungang Electronic and Information Science Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Han, M.; Liu, C. Endpoint prediction model for basic oxygen furnace steel-making based on membrane algorithm evolving extreme learning machine. Appl. Soft Comput. 2014, 19, 430–437.
Zhang, R.; Yang, J.; Sun, H.; Yang, W. Prediction of lime utilization ratio of dephosphorization in BOF steelmaking based on online sequential extreme learning machine with forgetting mechanism. Int. J. Miner. Metall. Mater. 2024, 31, 508–517.
Ghalati, M.K.; Zhang, J.; El-Fallah, G.; Nenchev, B.; Dong, H. Toward learning steelmaking—A review on machine learning for basic oxygen furnace process. Mater. Genome Eng. Adv. 2023, 1, e6.
Jun, T.; Xin, W.; Tianyou, C.; Shuming, X. Intelligent control method and application for BOF steelmaking process. IFAC Proc. Vol. 2002, 35, 439–444.
Wang, Z.; Liu, Q.; Liu, H.; Wei, S. A review of end-point carbon prediction for BOF steelmaking process. High Temp. Mater. Process. 2020, 39, 653–662.
He, Z.; Qian, J.; Li, J.; Hong, M.; Man, Y. Data-driven soft sensors of papermaking process and its application to cleaner production with multi-objective optimization. J. Clean. Prod. 2022, 372, 133803.
Acosta, S.M.; Amoroso, A.L.; Sant’Anna, Â.M.O.; Junior, O.C. Predictive modeling in a steelmaking process using optimized relevance vector regression and support vector regression. Ann. Oper. Res. 2022, 316, 905–926.
He, F.; Zhang, L. Prediction model of end-point phosphorus content in BOF steelmaking process based on PCA and BP neural network. J. Process Control 2018, 66, 51–58.
Yuan, X.; Zhou, J.; Wang, Y. A spatial-temporal LWPLS for adaptive soft sensor modeling and its application for an industrial hydrocracking process. Chemom. Intell. Lab. Syst. 2020, 197, 103921.
Kačur, J.; Flegner, P.; Durdán, M.; Laciak, M. Prediction of temperature and carbon concentration in oxygen steelmaking by machine learning: A comparative study. Appl. Sci. 2022, 12, 7757.
Han, M.; Cao, Z. An improved case-based reasoning method and its application in endpoint prediction of basic oxygen furnace. Neurocomputing 2015, 149, 1245–1252.
Zhou, P.; Guo, D.; Wang, H.; Chai, T. Data-driven robust M-LS-SVR-based NARX modeling for estimation and control of molten iron quality indices in blast furnace ironmaking. IEEE Trans. Neural Netw. Learn. Syst. 2017, 29, 4007–4021.
Wang, M.; Gao, C.; Ai, X.; Zhai, B.; Li, S. Whale optimization end-point control model for 260 tons BOF steelmaking. ISIJ Int. 2022, 62, 1684–1693.
Shang, C.; Yang, F.; Huang, D.; Lyu, W. Data-driven soft sensor development based on deep learning technique. J. Process Control 2014, 24, 223–233.
Yu, J.; Hong, C.; Rui, Y.; Tao, D. Multitask autoencoder model for recovering human poses. IEEE Trans. Ind. Electron. 2017, 65, 5060–5068.
Lee, S.; Chang, J.H. Oscillometric blood pressure estimation based on deep learning. IEEE Trans. Ind. Inform. 2016, 13, 461–472.
Waziralilah, N.F.; Abu, A.; Lim, M.; Quen, L.K.; Elfakharany, A. A review on convolutional neural network in bearing fault diagnosis. MATEC Web Conf. 2019, 255, 06002.
Yuan, X.; Huang, B.; Wang, Y.; Yang, C.; Gui, W. Deep learning-based feature representation and its application for soft sensor modeling with variable-wise weighted SAE. IEEE Trans. Ind. Inform. 2018, 14, 3235–3243.
Lu, Y.; Hui, L.; Fugang, C. Soft sensor method of multimode BOF steelmaking endpoint carbon content and temperature based on vMF-WSAE dynamic deep learning. High Temp. Mater. Process. 2023, 42, 20220270.
Zongxu, L.; Hui, L.; Fugang, C.; Heng, L.; XiaoJun, X. BOF steelmaking endpoint carbon content and temperature soft sensor based on supervised dual-branch DBN. Meas. Sci. Technol. 2024, 35, 035119.
Xinmin, Z.; Manabu, K.; Masahiro, T. Stacked supervised Poisson autoencoders-based soft-sensor for defects prediction in steelmaking process. Comput. Chem. Eng. 2023, 172, 108182.
Yan-Lin, H.; Xing-Yuan, L.; Jia-Hui, M.; Qun-Xiong, Z.; Shan, L. Attribute-relevant distributed variational autoencoder integrated with LSTM for dynamic industrial soft sensing. Eng. Appl. Artif. Intell. 2023, 119, 105737.
Yuan, X.; Li, L.; Shardt, Y.A.; Wang, Y.; Yang, C. Deep learning with spatiotemporal attention-based LSTM for industrial soft sensor model development. IEEE Trans. Ind. Electron. 2020, 68, 4404–4414.
Yuan, X.; Li, L.; Wang, Y.; Yang, C.; Gui, W. Deep learning for quality prediction of nonlinear dynamic processes with variable attention-based long short-term memory network. Can. J. Chem. Eng. 2020, 98, 1377–1389.
Yuan, X.; Li, L.; Wang, Y. Nonlinear dynamic soft sensor modeling with supervised long short-term memory network. IEEE Trans. Ind. Inform. 2019, 16, 3168–3176.
Shen, B.; Yao, L.; Ge, Z. Predictive modeling with multiresolution pyramid VAE and industrial soft sensor applications. IEEE Trans. Cybern. 2022, 53, 4867–4879.
Zhang, T.; Yan, G.; Li, R.; Xiao, S.; Pang, Y. Multi-mode industrial soft sensor method based on mixture Laplace variational auto-encoder. Measurement 2024, 229, 114435.
Tang, X.; Yan, J.; Li, Y.; Zhang, X.; Song, Z. Semi-supervised Deep Conditional Variational Auto-encoder for Soft Sensor Modeling. IEEE Sens. J. 2024, 24, 7153–7164.
Zheng, Z.; Long, J.Y.; Gao, X.Q. Production scheduling problems of steelmaking-continuous casting process in dynamic production environment. J. Iron Steel Res. Int. 2017, 24, 586–594.
Sun, Z.; Tang, J.; Qiao, J.; Cui, C. Review of concept drift detection method for industrial process modeling. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; pp. 5754–5759.
Hwi, J.D.; Min, L.J. Ensemble learning based latent variable model predictive control for batch trajectory tracking under concept drift. Comput. Chem. Eng. 2020, 139, 106875.
Zhang, T.; Yan, G.; Ren, M.; Cheng, L.; Li, R.; Xie, G. Dynamic transfer soft sensor for concept drift adaptation. J. Process Control 2023, 123, 50–63.
Kvaktun, D.; Liu, D.; Schiffers, R. Detection of concept drift for quality prediction and process control in injection molding. AIP Conf. Proc. 2023, 2884, 110008.
Hinder, F.; Vaquet, V.; Hammer, B. One or two things we know about concept drift—A survey on monitoring in evolving environments. Part A: Detecting concept drift. Front. Artif. Intell. 2024, 7, 1330257.
Liu, Y.; Yang, C.; Liu, K.; Chen, B.; Yao, Y. Domain adaptation transfer learning soft sensor for product quality prediction. Chemom. Intell. Lab. Syst. 2019, 192, 103813.
Pan, J. Feature-Based Transfer Learning with Real-World Applications; Hong Kong University of Science and Technology (Hong Kong): Hong Kong, China, 2010.
Pan, S.J.; Tsang, I.W.; Kwok, J.T.; Yang, Q. Domain adaptation via transfer component analysis. IEEE Trans. Neural Netw. 2010, 22, 199–210.
Long, M.; Zhu, H.; Wang, J.; Jordan, M.I. Deep transfer learning with joint adaptation networks. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 2208–2217.
Long, M.; Wang, J.; Ding, G.; Sun, J.; Yu, P.S. Transfer feature learning with joint distribution adaptation. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2200–2207.
Wang, J.; Chen, Y.; Hao, S.; Feng, W.; Shen, Z. Balanced distribution adaptation for transfer learning. In Proceedings of the 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, 18–21 November 2017; pp. 1129–1134.
Cinelli, L.P.; Marins, M.A.; Da Silva, E.A.B.; Netto, S.L. Variational Methods for Machine Learning with Applications to Deep Networks; Springer: Berlin/Heidelberg, Germany, 2021; Volume 15.
Holappa, L. Recent achievements in iron and steel technology. J. Chem. Technol. Metall. 2017, 52, 159–167.
González, L.F.V.; González, D.F.; González, J.I.V. Operations and Basic Processes in Steelmaking; Springer: Berlin/Heidelberg, Germany, 2021.
Jalkanen, H.; Holappa, L. Converter steelmaking. In Treatise on Process Metallurgy; Elsevier: Amsterdam, The Netherlands, 2014; pp. 223–270.
Wu, W.; Yang, Q.; Gao, Q.; Zeng, J. Effects of calcium ferrite slag on dephosphorization of hot metal during pretreatment in the BOF converter. J. Mater. Res. Technol. 2020, 9, 2754–2761.
Gatschlhofer, C. Phosphorus Behaviour during Carbo-Thermal Reduction of Iron-, Chromium-, and Manganese-Rich Slags. Master’s Thesis, University of Leoben, Leoben, Austria, 2022.
Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2013, arXiv:1312.6114.
Meng, C.; Xu, C.; Lei, Q.; Su, W.; Wu, J. Balanced joint maximum mean discrepancy for deep transfer learning. Anal. Appl. 2021, 19, 491–508.
Kamath, U.; Liu, J.; Whitaker, J. Transfer learning: Domain adaptation. In Deep Learning for NLP and Speech Recognition; Springer: New York, NY, USA, 2019; pp. 495–535.
Mu, Z.; Yaoping, L.; Liangbo, X.; Wei, N. Maximum Mean Discrepancy Minimization Based Transfer Learning for Indoor WLAN Personnel Intrusion Detection. IEEE Sens. Lett. 2019, 3, 7500804.
Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A comprehensive survey on transfer learning. Proc. IEEE 2020, 109, 43–76.
Han, M.; Mu, D.; Li, A.; Liu, S.; Gao, Z. Concept drift detection methods based on different weighting strategies. Int. J. Mach. Learn. Cybern. 2024, 1–24.

Figure 1. The process of BOF steelmaking.

Figure 2. The structure of the proposed method.

Figure 3. Schematic diagram of the dynamic balanced joint distribution alignment domain adaptation network.

Figure 4. Schematic diagram of the VAE-AFM model.

Figure 5. Flowchart of the proposed method.

Figure 6. MMD values between different domains.

Figure 7. Ranking of the importance of each feature to output.

Figure 8. Visualization of latent variables.

Figure 9. Ablation experiment results for carbon content in target domain 1.

Figure 10. Ablation experiment results for carbon content in target domain 2.

Figure 11. Ablation experiment results for carbon content in target domain 3.

Figure 12. Radar chart of carbon content prediction results for different target domains.

Figure 13. Ablation experiment results for temperature in target domain 1.

Figure 14. Ablation experiment results for temperature in target domain 2.

Figure 15. Ablation experiment results for temperature in target domain 3.

Figure 16. Radar chart of temperature prediction results for different target domains.

Figure 17. Carbon content PA index graph.

Figure 18. Carbon content MAPE index graph.

Figure 19. Carbon content RMSE and PA dual indicator graph.

Figure 20. Temperature PA indicator graph.

Figure 21. Temperature MAPE indicator graph.

Figure 22. Temperature RMSE and PA dual indicator graph.

Table 1. Experimental datasets.

	Source Domain	Target Domain 1	Target Domain 2	Target Domain 3
Carbon content	3500	733	625	492
Temperature	3500	497	660	425

Table 2. Details about the VAE architecture.

Name	Parameters
Number of layers	Four layers
Types of layers	Fully connected layers
Activation functions	ReLU function
Loss functions	Reconstruction loss and KL divergence
Optimizers	Adam optimizer

Table 3. Experimental parameter settings.

Parameter Name	Values for the Carbon Content Dataset	Values for the Temperature Dataset
Number of network input variables	26	30
Network model layers	4	4
Learning rate	0.01	0.01
Training iterations	4000	5000
Allowable error Th	$\{\pm 0.02, \pm 0.03\}$ (%)	$\{\pm 10, \pm 15\}$ (°C)

Table 4. Features collected from BOF steelmaking industrial production engineering.

Target Variables	Input Variables
Carbon content, temperature	No. 1: Loading amount of hot metal.
	No. 2: Loading amount of pig iron.
	No. 3: Amount of loaded steel scrap.
	No. 4: Total of transfer.
	No. 5: Temperature of iron liquid.
	No. 6: Carbon content in molten iron.
	No. 7: Silicon content in molten iron.
	No. 8: Manganese content in molten iron.
	No. 9: Sulfur content in molten iron.
	No. 10: Phosphorus content in molten iron.
	No. 11: Content of ASN in molten iron.
	No. 12: The time from testing to starting to mix iron.
	No. 13: Duration of iron exchange.
	No. 14: End of iron mixing to oxygen opening time.
	No. 15: The time from tapping to mix iron.
	No. 16: The amount of lime added for the first time.
	No. 17: The time of adding magnesite ball.
	No. 18: Amount of magnesite ball.
	No. 19: The time of adding high carbon ferromanganese.
	No. 20: Amount of high carbon ferromanganese.

Table 5. Results of the carbon content ablation experiments.

Target Domain	Indicators	VAE	VAE-MDA	VAE-JDA	VAE-AFM
Target domain 1	PA (0.02%)	0.5238	0.6394	0.7142	0.8571
	PA (0.03%)	0.687	0.8163	0.8775	0.9455
	RMSE	0.0374	0.0238	0.0183	0.0152
	MAPE	0.2993	0.2201	0.1846	0.1526
Target domain 2	PA (0.02%)	0.536	0.664	0.72	0.84
	PA (0.03%)	0.664	0.84	0.88	0.952
	RMSE	0.0323	0.0213	0.0181	0.0155
	MAPE	0.2417	0.1886	0.1471	0.1243
Target domain 3	PA (0.02%)	0.5353	0.6565	0.7474	0.8383
	PA (0.03%)	0.6565	0.7575	0.8888	0.9696
	RMSE	0.034	0.0235	0.019	0.0139
	MAPE	0.2691	0.2235	0.1682	0.1225

Table 6. Results of the temperature ablation experiments.

Target Domain	Indicators	VAE	VAE-MDA	VAE-JDA	VAE-AFM
Target domain 1	PA (10 °C)	0.52	0.65	0.76	0.85
	PA (15 °C)	0.74	0.78	0.89	0.94
	RMSE	14.0395	13.2024	10.0327	7.8317
	MAPE	0.0066	0.0058	0.0045	0.0038
Target domain 2	PA (10 °C)	0.5075	0.6742	0.7272	0.8181
	PA (15 °C)	0.6818	0.7954	0.8787	0.9015
	RMSE	14.1775	11.1394	9.9015	8.5019
	MAPE	0.007	0.0051	0.0047	0.0039
Target domain 3	PA (10 °C)	0.5647	0.6352	0.7294	0.847
	PA (15 °C)	0.7411	0.8117	0.8588	0.9411
	RMSE	13.1295	11.2575	10.5132	8.8545
	MAPE	0.0063	0.0054	0.0045	0.0039

Table 7. Prediction performance metrics for carbon content using different algorithms.

Target Domain	Model	PA (0.02%)	PA (0.03%)	RMSE	MAPE
Target domain 1	TCA [38]	0.5918	0.7687	0.0297	0.2556
	JDA [39,40]	0.619	0.7687	0.0305	0.2172
	BDA [41]	0.6326	0.8163	0.0284	0.2056
	VW-SAE [23]	0.6734	0.8503	0.0217	0.1856
	STA-LSTM [24]	0.7074	0.8503	0.0219	0.1731
	VALSTM [25]	0.7287	0.8775	0.0234	0.2141
	SLSTM [26]	0.6734	0.8299	0.0229	0.206
	AR-DVAE [22]	0.6938	0.8639	0.0214	0.2078
	SSPAE [21]	0.7278	0.8503	0.0263	0.1962
	SD-DBN [20]	0.7482	0.8979	0.0199	0.1665
	vMF-WSAE [19]	0.8095	0.8843	0.0178	0.1535
	The Proposed Method	0.8571	0.9455	0.0152	0.1526
Target domain 2	TCA [38]	0.6	0.728	0.0344	0.2071
	JDA [39,40]	0.64	0.768	0.0336	0.1876
	BDA [41]	0.616	0.8	0.0346	0.183
	VW-SAE [23]	0.688	0.872	0.0207	0.1567
	STA-LSTM [24]	0.696	0.84	0.0226	0.176
	VALSTM [25]	0.664	0.832	0.023	0.1851
	SLSTM [26]	0.656	0.84	0.0238	0.1649
	AR-DVAE [22]	0.68	0.832	0.0226	0.1788
	SSPAE [21]	0.68	0.896	0.0244	0.1822
	SD-DBN [20]	0.728	0.872	0.0216	0.1622
	vMF-WSAE [19]	0.752	0.888	0.0204	0.1569
	The Proposed Method	0.84	0.952	0.0155	0.1243
Target domain 3	TCA [38]	0.606	0.7575	0.0253	0.193
	JDA [39,40]	0.6565	0.7777	0.0255	0.1768
	BDA [41]	0.6767	0.7979	0.0241	0.1634
	VW-SAE [23]	0.6262	0.8282	0.0219	0.1636
	STA-LSTM [24]	0.6969	0.8787	0.021	0.1608
	VALSTM [25]	0.7272	0.8383	0.022	0.1767
	SLSTM [26]	0.6767	0.8888	0.0206	0.1684
	AR-DVAE [22]	0.6666	0.8484	0.0207	0.1707
	SSPAE [21]	0.7373	0.8585	0.0262	0.1633
	SD-DBN [20]	0.7676	0.8989	0.0186	0.1457
	vMF-WSAE [19]	0.7777	0.9191	0.0167	0.1321
	The Proposed Method	0.8383	0.9696	0.0139	0.1225

Table 8. Prediction performance metrics for temperature using different algorithms.

Target Domain	Model	PA (10 °C)	PA (15 °C)	RMSE	MAPE
Target domain 1	TCA [38]	0.6	0.78	13.006	0.0059
	JDA [39,40]	0.69	0.82	11.67	0.0051
	BDA [41]	0.68	0.86	11.718	0.0051
	VW-SAE [23]	0.69	0.81	11.428	0.0052
	STA-LSTM [24]	0.72	0.82	15.237	0.0058
	VALSTM [25]	0.72	0.83	13.011	0.0049
	SLSTM [26]	0.74	0.83	11.537	0.0048
	AR-DVAE [22]	0.68	0.88	10.3691	0.0049
	SSPAE [21]	0.74	0.86	10.2925	0.0047
	SD-DBN [20]	0.7587	0.8542	11.1747	0.0047
	vMF-WSAE [19]	0.8	0.88	11.7434	0.0042
	The Proposed Method	0.85	0.94	7.8317	0.0038
Target domain 2	TCA [38]	0.6287	0.7954	12.569	0.0058
	JDA [39,40]	0.6893	0.8333	10.761	0.0049
	BDA [41]	0.6742	0.8409	10.796	0.0049
	VW-SAE [23]	0.6136	0.8181	11.63	0.0054
	STA-LSTM [24]	0.6818	0.856	12.955	0.0053
	VALSTM [25]	0.7424	0.8484	10.007	0.0045
	SLSTM [26]	0.7348	0.7878	12.696	0.0051
	AR-DVAE [22]	0.7045	0.8484	10.7092	0.0047
	SSPAE [21]	0.7272	0.8333	10.6635	0.0048
	SD-DBN [20]	0.7651	0.8636	11.5769	0.0044
	vMF-WSAE [19]	0.7954	0.8712	12.9275	0.0047
	The Proposed Method	0.8181	0.9015	8.5019	0.0039
Target domain 3	TCA [38]	0.5882	0.8117	11.612	0.0055
	JDA [39,40]	0.6352	0.847	11.461	0.0055
	BDA [41]	0.6588	0.8117	11.309	0.0054
	VW-SAE [23]	0.7167	0.8235	10.806	0.0047
	STA-LSTM [24]	0.6941	0.8117	14.059	0.0056
	VALSTM [25]	0.7529	0.847	10.845	0.0046
	SLSTM [26]	0.7294	0.847	11.614	0.005
	AR-DVAE [22]	0.7058	0.8588	11.125	0.0048
	SSPAE [21]	0.6941	0.8705	10.432	0.0048
	SD-DBN [20]	0.8181	0.8636	10.0135	0.0042
	vMF-WSAE [19]	0.7803	0.8939	10.0484	0.0043
	The Proposed Method	0.847	0.9411	8.8545	0.0039
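
The indicators reported in Tables 5 to 8 can be illustrated with a minimal Python sketch. It assumes the usual definitions: PA as the fraction of heats whose absolute prediction error falls within the stated tolerance, RMSE as the root mean squared error, and MAPE as the mean absolute relative error expressed as a fraction rather than a percentage. The function names and the sample temperatures are hypothetical and are not taken from the paper's data.

```python
import math

def prediction_accuracy(y_true, y_pred, tol):
    """Fraction of samples whose absolute error is within tol (assumed PA definition)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target variable."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (not multiplied by 100)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical endpoint temperatures in °C, for illustration only.
y_true = [1650.0, 1662.0, 1640.0, 1655.0]
y_pred = [1644.0, 1674.0, 1641.0, 1655.0]

pa_10 = prediction_accuracy(y_true, y_pred, 10.0)  # tolerance of 10 °C
pa_15 = prediction_accuracy(y_true, y_pred, 15.0)  # tolerance of 15 °C
print(pa_10, pa_15, rmse(y_true, y_pred), mape(y_true, y_pred))
```

The same functions would apply to carbon content by passing tolerances of 0.02 and 0.03 in the units used for the carbon measurements.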

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

