1. Introduction
Steel is crucial to modern industry and technology, and high-quality steel is essential for ensuring the quality of steel engineering. Basic oxygen furnace (BOF) steelmaking, a mainstream steelmaking technology, is widely used for its high efficiency and low energy consumption [1]. In this process, molten iron, scrap steel, ferroalloys, and certain non-metallic materials are added to the molten pool, where a sequence of intricate chemical and physical reactions releases the heat needed to complete steelmaking [2]. The endpoint carbon content and temperature of the molten steel are critical indicators in the steelmaking process and directly affect the quality of the steel. Accurately predicting them not only improves production efficiency but also reduces production costs and emissions [3].
Currently, the measurement methods for endpoint carbon content and temperature in BOF steelmaking mainly include contact and non-contact techniques [
4]. The contact measurement of carbon content and temperature in the molten steel pool primarily utilizes the sublance detection method with special probes and tools [
5]. Although highly accurate, this method cannot provide continuous real-time monitoring. Additionally, the sublance probe operates in a high-temperature, corrosive environment, leading to high operational costs. In the BOF steelmaking process, there is a nonlinear relationship between the process production data (such as the amounts of added limestone, molten iron, scrap steel, and blown oxygen) and the endpoint carbon content and temperature, which provides a theoretical basis for estimating the endpoint carbon content and temperature without direct contact with the molten steel [
6]. In addition, advancements in data acquisition and computer technology have enabled the non-contact measurement of endpoint carbon content and temperature in steelmaking. Therefore, researchers have developed data-driven approaches, utilizing artificial intelligence to construct models that accurately predict these key variables in BOF steelmaking [
7].
In the BOF steelmaking process, data-driven soft sensor methods have been extensively studied. To improve the prediction accuracy of endpoint carbon content and temperature in BOF steelmaking, researchers have introduced machine learning techniques such as support vector machines [
8], artificial neural networks [
9], and partial least squares regression [
10]. These methods effectively capture complex nonlinear relationships and improve prediction accuracy. Jan et al. [
11] employed five machine learning techniques, namely multivariate adaptive regression splines (MARS), support vector regression (SVR), neural network (NN), k-nearest neighbor (k-NN), and random forest (RF) methods, to predict the endpoint carbon content and temperature in BOF steelmaking and proposed a combined static and dynamic prediction method; Han et al. [12] established an improved case-based reasoning approach that incorporates fuzzy c-means clustering (FCM), mutual information, and support vector machines (SVM) to predict the endpoint in BOF steelmaking; Zhou et al. [13] predicted the nonlinear endpoint carbon content and temperature using sensor-collected data combined with multi-output least squares support vector regression; Wang et al. [14] presented a twin support vector machine-based model for predicting temperature and carbon content, which demonstrated higher prediction accuracy than the BP neural network. However, these methods are shallow learning techniques. BOF steelmaking is an intricate industrial process, and the data collected by its various sensors are characterized by high dimensionality and strong nonlinearity between variables, which can limit the prediction performance of soft sensor models built on shallow learning techniques. To tackle the problem of high-dimensional nonlinear data, deep learning techniques have been extensively studied and have achieved significant breakthroughs [
15].
With the advancement of deep learning technology, the prediction of endpoint carbon content and temperature in BOF steelmaking has entered a new stage. Stacked autoencoders (SAE) [16], deep belief networks (DBN) [17], and deep convolutional neural networks (CNN) [18] are currently among the most popular deep learning algorithms. These algorithms can achieve feature representation of high-dimensional complex data, thereby effectively improving the performance of the model. Yang et al. [
19] considered the multimodal distribution characteristics of industrial production process data of BOF steelmaking and proposed an adaptive update deep learning model based on the von Mises-Fisher (vMF) distribution, which offered one solution for endpoint determination in BOF steelmaking. Lu et al. [20] considered the low correlation between the acquired features and the labels in BOF steelmaking industrial production process data and proposed a BOF steelmaking soft sensor model based on a supervised dual-branch deep belief network (SD-DBN) to address this problem. Zhang et al. [
21] proposed an innovative data-driven soft sensor called the stacked supervised Poisson autoencoder (SSPAE). During the feature extraction process, SSPAE incorporates quality information so that the extracted deep features help improve the accuracy of the prediction model. He et al. [
22] proposed a new attribute-dependent distributed variational autoencoder (AR-DVAE) to account for the different effects that high-dimensional process variables have on the quality variables and on prediction accuracy. The model can effectively extract features from input variables with different correlations. In [23], a variable-wise weighted stacked autoencoder (VW-SAE) was proposed for layer-by-layer hierarchical output-related feature encoding. By performing correlation analysis with the output variables, important variables are identified from among the inputs in each autoencoder layer. This method can enhance the predictive performance of the model. Due to the nonlinear and dynamic nature of industrial production process data, the authors of [
24] proposed a spatiotemporal attention-based LSTM network (STA-LSTM) for a soft sensor model. This method addresses the issue with the original LSTM’s failure to take sample and variable relevance into account when predicting quality. In [
25], the authors proposed a variable attention-based long short-term memory network (VA-LSTM) for a soft sensor. This method assigns attention weights based on the correlation between input features and quality variables, enhancing the prediction model’s performance. In [
26], for the purpose of learning quality-relevant hidden dynamics for soft sensor applications, a supervised LSTM (SLSTM) network was presented, overcoming the lack of representation of quality data. Shen et al. [
27] used deep feature extraction and feature pyramid augmentation to create a multiresolution pyramid variational autoencoder (MR-PVAE) predictive model that addresses discrepancies in the sampling rates of process variables. Zhang et al. [28] introduced a type-II multivariate Laplacian distribution in the latent variable space to make the model robust and proposed a variational autoencoder regression approach based on the mixture Laplacian distribution (MLVAER). Tang et al. [29] created a semi-supervised deep conditional VAE (SS-DCVAE) soft sensor model based on unsupervised and supervised DCVAE, achieving excellent nonlinear and uncertain feature extraction capabilities in industrial process modeling. These deep learning methods are effective in processing high-dimensional and nonlinear data generated in industrial processes.
In recent years, the problem of concept drift caused by changes in raw materials, production processes, and equipment conditions in industrial production has been widely studied [
30]. In [
31], the authors analyzed the characteristics of intermittent industries. Changes in batches will lead to concept drift, which in turn will reduce the control performance of the model. In [
32], the authors analyzed the concept drift problem existing in industrial production processes, which makes accurate data-driven modeling still a challenging task. In [
33], the authors analyzed concept drift in the injection molding process, where due to machine wear or process changes concept drift will occur and the prediction performance of the model will decrease. In [
34], the authors analyzed the phenomenon of concept drift in industrial data streams due to changes in raw materials, sensor aging, etc., which would lead to poor model performance. Therefore, understanding and effectively responding to concept drift caused by changes in raw materials, processes, and equipment in industrial production processes is an important challenge to ensure production process stability and product quality.
The data from the BOF steelmaking process are not only high-dimensional and highly nonlinear but are also subject to distribution changes due to variations in raw material batches, process adjustments, and changes in equipment conditions. These factors lead to concept drift, which negatively affects model performance [
35]. To address the issue of degraded model prediction performance caused by concept drift in the BOF steelmaking process, soft sensor methods based on transfer learning have been widely studied [
36]. By lowering the dataset’s distribution divergence, transfer learning aims to facilitate learning in a new target domain by utilizing samples from certain source domains. To achieve this goal, feature-based transfer learning methods have been extensively studied [
37]. This method primarily extracts transferable features from both the source and target domains by feature mappings, thereby reducing the distribution differences between the different domains. Pan et al. [
38] proposed the transfer component analysis (TCA) domain adaptation algorithm. The fundamental idea of TCA is to map data from the two domains into a high-dimensional reproducing kernel Hilbert space. In this space, the data distance between the two domains is minimized, thereby reducing the inter-domain differences. Long et al. [
39,
40] proposed the joint distribution adaptation (JDA) domain adaptation algorithm, which aims to reduce the inter-domain differences by simultaneously adapting the marginal distribution and conditional distribution of the source and target domains. The JDA extends the TCA method by more comprehensively addressing inter-domain distribution differences. Wang et al. [
41] proposed the balanced distribution adaptation (BDA) algorithm. The core of the BDA method is to align the joint distribution of the source and target domains by adjusting the weights of the marginal and conditional distributions. Although these methods can effectively reduce distribution differences between datasets, the high dimensionality and strong nonlinearity of BOF steelmaking process data make it difficult to predict the endpoint carbon content and temperature accurately using these transfer learning techniques alone.
Therefore, to address the issues of high dimensionality, strong nonlinearity, and concept drift in BOF steelmaking process data, this study proposes an endpoint carbon content and temperature soft sensor model for BOF steelmaking based on an adaptive feature matching variational autoencoder (VAE-AFM). The variational autoencoder (VAE) model combines deep learning and variational Bayesian probabilistic inference to extract nonlinear feature representations of data. In this paper, the VAE is utilized to extract features of BOF steelmaking process data. To effectively address the concept drift problem in BOF steelmaking, this paper proposes an adaptive feature matching (AFM) method. This method adaptively adjusts the weights of the marginal and conditional distributions between the extracted BOF steelmaking data features, thereby minimizing distribution differences between different BOF steelmaking datasets.
In summary, the main contributions of this paper are as follows:
- (1)
An adaptive feature matching method is proposed, which can dynamically assess the relative importance of marginal and conditional distributions between different domains. This method effectively reduces differences between data distributions and is integrated into a deep learning framework.
- (2)
To address high dimensionality, strong nonlinearity, and concept drift in BOF steelmaking data, a variational autoencoder (VAE) is employed for effective feature extraction. This is then combined with the adaptive feature matching method to construct the VAE-AFM model.
- (3)
An experimental analysis was conducted on the manufacturing process data from BOF steelmaking. Through comparisons with various soft sensor model methods and ablation experiments, the efficacy of the proposed method was confirmed.
The remainder of this paper is structured as follows: Section 2 gives a thorough overview of the BOF steelmaking process and an analysis of the main chemical processes involved. Section 3 introduces the adaptive feature matching variational autoencoder (VAE-AFM) model proposed in this paper. Section 4 describes how to build a dynamic soft sensor model of BOF steelmaking endpoint carbon content and temperature based on the VAE-AFM model. Section 5 presents the experimental findings and related discussions. Finally, the conclusions are presented in Section 6.
3. The Proposed Adaptive Feature Matching Variational Autoencoder Model
Based on the introduction to the BOF steelmaking process and the analysis of the relevant chemical processes in the previous section, this paper adopts a data-driven approach to predict the endpoint carbon content and temperature in BOF steelmaking. To address the issues of high dimensionality, strong nonlinearity, and concept drift in BOF steelmaking data, it proposes a dynamic soft sensor model for predicting endpoint carbon content and temperature based on an adaptive feature matching variational autoencoder. This section introduces the rationale behind the proposed approach, including the principles of the variational autoencoder model and the dynamic balanced joint distribution alignment domain adaptation model.
3.1. Introduction of the Structure of the Proposed Method
This subsection describes the overall structure of the proposed method. Figure 2 depicts the major elements of the theoretical structure. It is divided into three components, which are detailed below:
(1) In the BOF steelmaking production process, data are collected in real time from multiple sensors, including key parameters such as temperature, oxygen-blowing volume, and chemical composition. Due to the highly complex interactions and nonlinear characteristics of these data, traditional linear models are unable to fully explore the information contained within. To address this challenge, Section 3.2 utilizes the variational autoencoder (VAE) to extract features from BOF steelmaking data. The VAE is a deep generative model that captures complex nonlinear relationships in high-dimensional data space and extracts low-dimensional representations of the data by learning the distribution of latent variables. The extracted features retain the primary information of the original data and simplify the data structure, thereby enhancing subsequent data analysis and modeling. The specific details are given in Section 3.2.
(2) In the BOF steelmaking process, factors such as raw material batches, process adjustments, and changes in equipment conditions can lead to variations in the distribution of BOF steelmaking data collected through sensors, thereby affecting the performance of data-driven models. To address this problem, Section 3.3 proposes a method based on a dynamic balanced joint distribution alignment domain adaptation network to achieve data alignment and enhance the model's robustness and prediction accuracy. This network aims to minimize distributional differences between source and target domains by considering both marginal and conditional distributions. Conventional domain adaptation methods typically focus on only one of the two distributions and overlook the dynamic balance between them. The method proposed in Section 3.3 adaptively adjusts the importance of the marginal and conditional distributions by introducing a dynamic balancing factor to more accurately align the data distributions in the source and target domains. The specific details are given in Section 3.3.
(3) Based on the analyses in Section 3.2 and Section 3.3, Section 3.4 proposes a dynamic balanced joint distribution alignment domain adaptation variational autoencoder model. This model combines the domain adaptation capabilities of dynamic balanced joint distribution alignment with the feature extraction capabilities of the VAE to provide a comprehensive solution. The endpoint carbon content and temperature of a real BOF steelmaking process were predicted using this soft sensor model. The experimental results show that the proposed model effectively addresses the high-dimensional, nonlinear, and concept drift issues in BOF steelmaking data, providing a valuable reference for the industry. The specific details are given in Section 3.4.
3.2. Output-Related Feature Extraction Method for BOF Steelmaking Data Based on Variational Autoencoder
Because BOF steelmaking process data exhibit strong nonlinearity and high dimensionality, traditional data analysis methods struggle to capture their inherent characteristics and patterns. To address this, we employ the variational autoencoder, a deep learning technique [47]. This method effectively extracts features from the data, providing high-quality input for subsequent model training and optimization, thereby enhancing the prediction and control capabilities of the BOF steelmaking process. The VAE consists of two components, the encoder and the decoder [48].
In this paper, the input of the variational autoencoder is the collected BOF steelmaking process data $x$, the latent variable $z$ represents the extracted features of the BOF steelmaking data, and the output $\hat{x}$ represents the reconstructed BOF steelmaking data. The VAE assumes that the data are generated from a latent variable $z$. Specifically, it is assumed that $z$ obeys a simple prior distribution $p(z)$, usually a standard normal distribution $\mathcal{N}(0, I)$, and then $z$ is mapped to the data space $x$ through the decoder. Both the encoder and decoder are implemented by neural networks, and $\phi$ and $\theta$ denote the parameters of the encoder and decoder, respectively. The core idea of the variational autoencoder is to learn the low-dimensional latent representation of the data by maximizing the evidence lower bound (ELBO), which involves minimizing both the reconstruction error and the divergence between the approximate posterior distribution of the latent variable and its prior. The derivation is as follows.
The log marginal likelihood of the input data $x$ and its lower bound are shown below:
$$\log p_\theta(x) = \log \int p_\theta(x \mid z)\, p(z)\, dz \geq \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)$$
In the above formula, $x$ represents the input data, $z$ represents the latent variables, $p_\theta(x)$ represents the marginal likelihood of the input data $x$, $p_\theta(x \mid z)$ represents the likelihood of the input data $x$ given the latent variables $z$, $p(z)$ represents the prior distribution of the latent variables $z$, $q_\phi(z \mid x)$ represents the approximation of the posterior distribution of the latent variables $z$ given the input data $x$, and $D_{KL}(\cdot \,\|\, \cdot)$ denotes the KL divergence, which measures the difference between two distributions. This formula illustrates the transformation and derivation of the log-likelihood function in the variational autoencoder. The right-hand side is the evidence lower bound (ELBO), which can be optimized effectively in place of the intractable log-likelihood.
The VAE is trained by maximizing the evidence lower bound (ELBO), which simultaneously maximizes the reconstruction log-likelihood of the data and minimizes the divergence between the approximate posterior distribution and the prior distribution. Equivalently, the loss function of the VAE, which is minimized, is the negative ELBO and has two parts. The first term is the expected log-likelihood of the decoder output, which can be considered as the reconstruction error $L_{rec}$ of the autoencoder. The second term is the difference between the approximate posterior distribution $q_\phi(z \mid x)$ and the prior distribution $p(z)$. The ELBO is as follows:
$$\mathcal{L}(\theta, \phi; x) = \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)$$
The formula for the reconstruction loss of VAE is as follows:
$$L_{rec} = \frac{1}{N} \sum_{i=1}^{N} \left\| x_i - \hat{x}_i \right\|^2$$
$L_{rec}$ reflects the reconstruction error term, specifically the mean squared error (MSE) between the input data $x_i$ and its reconstructed value $\hat{x}_i$, and $N$ is the total number of samples. This term measures the accuracy of the decoder in reconstructing the data $x$ from the latent space $z$. The smaller the mean squared error, the closer the sample reconstructed by the decoder is to the original input sample. During the training process of the VAE model, the reconstruction loss is part of the loss function, which together with the KL divergence term constitutes the negative of the ELBO. By minimizing the reconstruction loss, the parameters of the model can be optimized, thereby improving the reconstruction quality.
The KL divergence loss calculation formula of VAE is as follows:
$$L_{KL} = D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right) = -\frac{1}{2} \sum_{j=1}^{J} \left(1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2\right)$$
The KL divergence is a measure of the disparity between two probability distributions. Specifically, in the VAE model, the KL divergence loss term is used to measure the difference between the approximate posterior distribution $q_\phi(z \mid x)$ and the prior distribution $p(z)$, where $q_\phi(z \mid x)$ is the approximate posterior distribution obtained by the encoder network, which is usually assumed to be a normal distribution $\mathcal{N}(\mu, \sigma^2)$. $p(z)$ is the prior distribution, generally chosen as the standard normal distribution $\mathcal{N}(0, I)$. $\mu_j$ and $\sigma_j^2$ are the mean and variance of the $j$th dimension in the approximate posterior distribution $q_\phi(z \mid x)$, respectively, and $J$ is the dimension of the latent space. This formula is the KL divergence between a multivariate normal distribution with diagonal covariance and the standard normal distribution. It uses the mean and variance of each dimension to measure the difference between the approximate posterior distribution and the standard normal distribution.
In summary, the VAE loss function is the sum of the reconstruction error and the KL divergence. The formula is as follows:
$$L_{VAE} = L_{rec} + L_{KL}$$
The working principle of the VAE model was introduced above. Specifically, the VAE first maps the high-dimensional input data into a lower-dimensional latent space through the encoder. In this process, the model not only learns a latent representation of the data but also samples the latent variables from a parameterized Gaussian distribution, which introduces a degree of randomness and regularization. Next, the decoder uses the latent variables sampled from the latent space to generate reconstructed samples that closely resemble the input data. The training objective of the VAE is to minimize the reconstruction error and the KL divergence between the latent distribution and the prior distribution. This ensures that the model can effectively capture the latent structure and key features of the data while maintaining its generative capabilities and improving its generalization performance. In this paper, the VAE model is employed to extract features from the BOF steelmaking process data. First, the production process data collected by the sensors are used as input $x$ and mapped to the latent space by the encoder to obtain the mean $\mu$ and variance $\sigma^2$. The latent variable $z$ is sampled from the Gaussian distribution $\mathcal{N}(\mu, \sigma^2)$ using the reparameterization technique, $z = \mu + \sigma \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, and $z$ is mapped to the data space through the decoder network to generate the reconstructed data $\hat{x}$. Then, the KL divergence and reconstruction loss are calculated and minimized, and the learned latent variable $z$ serves as the extracted feature of the BOF steelmaking production process data.
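The loss computation described in this subsection can be illustrated with a minimal sketch in plain Python. It assumes a diagonal Gaussian posterior parameterized by a mean and a log-variance per latent dimension; the function names (`reparameterize`, `reconstruction_loss`, `kl_loss`, `vae_loss`) and the choice to average the KL term over the batch are illustrative assumptions, not the authors' implementation, and the encoder and decoder networks themselves are omitted.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps elementwise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def reconstruction_loss(x_batch, x_hat_batch):
    """Squared Euclidean reconstruction error, averaged over the N samples."""
    total = 0.0
    for x, x_hat in zip(x_batch, x_hat_batch):
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total / len(x_batch)


def kl_loss(mu, log_var):
    """KL divergence between N(mu, diag(sigma^2)) and N(0, I) for one sample."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def vae_loss(x_batch, x_hat_batch, mu_batch, log_var_batch):
    """Reconstruction error plus KL term (KL averaged over the batch: an assumption)."""
    rec = reconstruction_loss(x_batch, x_hat_batch)
    kl = sum(kl_loss(m, lv) for m, lv in zip(mu_batch, log_var_batch))
    return rec + kl / len(mu_batch)
```

Working with the log-variance rather than the variance is a common design choice because it keeps the variance positive without constraining the encoder output. In a full model, `mu` and `log_var` would come from the encoder and `x_hat_batch` from the decoder applied to the sampled `z`.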
3.3. Constructing Dynamic Balanced Joint Distribution Alignment Domain Adaptation Network Model
In the BOF steelmaking process, variations in raw material batches, process adjustments, and equipment conditions cause shifts in the distribution of sensor-collected steelmaking data, leading to concept drift. This affects the functionality of the model and reduces the precision of endpoint carbon content and temperature forecasts. To address this issue, this section proposes a method based on a dynamic balanced joint distribution alignment domain adaptation model to achieve adaptive feature matching.
- (1)
The distribution difference of BOF steelmaking data is measured based on the maximum mean discrepancy.
To address changes in the BOF steelmaking data distribution, this paper measures the difference between data distributions using the maximum mean discrepancy. The maximum mean discrepancy (MMD) is a statistical measure frequently used in transfer learning tasks to quantify, and thereby reduce, the distribution difference between the source and target domains [
49]. This method achieves distribution alignment by measuring the difference between two probability distributions in the feature space, thereby enhancing the model’s generalization capability in the target domain. For example, in the literature [
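50], a transfer learning method based on the maximum mean discrepancy is proposed for intrusion detection. MMD is a non-parametric statistic based on kernel methods that measures the distance between two distributions. The calculation formula for MMD is as follows:
$$\mathrm{MMD}^2(X_s, X_t) = \left\| \frac{1}{n_s} \sum_{i=1}^{n_s} \varphi(x_i^s) - \frac{1}{n_t} \sum_{j=1}^{n_t} \varphi(x_j^t) \right\|_{\mathcal{H}}^2 = \frac{1}{n_s^2} \sum_{i=1}^{n_s} \sum_{j=1}^{n_s} k(x_i^s, x_j^s) - \frac{2}{n_s n_t} \sum_{i=1}^{n_s} \sum_{j=1}^{n_t} k(x_i^s, x_j^t) + \frac{1}{n_t^2} \sum_{i=1}^{n_t} \sum_{j=1}^{n_t} k(x_i^t, x_j^t)$$
In the above formula, $X_s = \{x_i^s\}_{i=1}^{n_s}$ and $X_t = \{x_j^t\}_{j=1}^{n_t}$ represent the source domain data and target domain data for BOF steelmaking, respectively. $n_s$ represents the sample size of the source domain data, and $n_t$ represents the sample size of the target domain data. $\mathcal{H}$ denotes the reproducing kernel Hilbert space (RKHS), $\varphi(\cdot)$ represents the feature mapping function that maps the original samples to the RKHS, and $k(\cdot, \cdot)$ denotes a Gaussian kernel function satisfying $k(x, y) = \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}}$. The expression for $k(\cdot, \cdot)$ is as follows:
$$k(x, y) = \exp\left(-\frac{\left\| x - y \right\|^2}{2\sigma^2}\right)$$
Here, $\sigma$ stands for the bandwidth parameter that controls the scale of the kernel.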
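As an illustration of how the MMD can be evaluated from samples, the following minimal sketch computes the kernel form of the squared MMD with a Gaussian kernel in plain Python. It uses the simple biased estimator, in which each kernel mean includes the diagonal terms; the function names and the default bandwidth `sigma=1.0` are assumptions made for this example, and the paper does not state which estimator or bandwidth is used.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(a_samples, b_samples, sigma):
    """Average kernel value over all pairs drawn from the two sample sets."""
    total = sum(gaussian_kernel(a, b, sigma)
                for a in a_samples for b in b_samples)
    return total / (len(a_samples) * len(b_samples))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between source and target samples."""
    return (mean_kernel(source, source, sigma)
            - 2.0 * mean_kernel(source, target, sigma)
            + mean_kernel(target, target, sigma))


# Hypothetical one-dimensional features from two domains.
source_features = [[0.0], [1.0]]
shifted_features = [[5.0], [6.0]]
print(mmd_squared(source_features, source_features))   # identical sets
print(mmd_squared(source_features, shifted_features))  # shifted distribution
```

The estimate is zero when the two sample sets coincide and grows as the target samples move away from the source samples, which is the property that makes it usable as an alignment loss.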
- (2)
Align BOF steelmaking data from different domains based on the domain adaptation network mechanism.
In the BOF steelmaking process, variations in raw material batches, process adjustments, and changes in equipment conditions can cause shifts in data distribution, leading to concept drift. Specifically, the distribution differences between different domains in BOF steelmaking data can manifest as follows: the marginal distributions can differ, i.e., $P(X_s) \neq P(X_t)$, where $P(X_s)$ and $P(X_t)$ represent the feature distributions of the source and target domains, respectively, and the conditional distributions can differ, i.e., $P(Y_s \mid X_s) \neq P(Y_t \mid X_t)$, where $P(Y_s \mid X_s)$ and $P(Y_t \mid X_t)$ represent the conditional label distributions of the source and target domains, respectively.
To address the issue of decreased model prediction performance caused by concept drift in BOF steelmaking data, this paper employs a domain adaptation approach. Domain adaptation is an important topic in transfer learning research [51]. Domain adaptation neural networks are intended to resolve the inconsistency in data distribution between the source and target domains. In practical applications, changes in data distribution often lead to a significant decline in performance when a source domain model is applied directly to the target domain. Domain adaptation neural networks therefore aim to establish effective mapping relationships between the source and target domains, enabling the model to perform well in the target domain. They typically achieve this by reducing the distribution discrepancies between the two domains. Among these methods, marginal distribution alignment and joint distribution alignment are two commonly used strategies. Marginal distribution alignment reduces cross-domain discrepancies by making the marginal distributions of the source and target domains more consistent in the high-dimensional feature space. Joint distribution alignment goes a step further by also considering the conditional relationships between input features and output labels. By simultaneously aligning both marginal and conditional distributions, this approach reduces the distribution differences between the source and target domains more comprehensively. Domain adaptation neural networks transfer source domain models to the target domain by optimizing a loss function that minimizes the distribution distance between the source and target domains in the feature space. This process typically involves three steps: First, data are mapped to a feature space through a feature extractor. Then, domain alignment techniques are applied to gradually reduce the distribution discrepancies between the source and target domain features. Finally, model parameters are adjusted to ensure accurate regression predictions on the unlabeled target domain samples.
Overall, domain adaptation techniques overcome the challenge of cross-domain distribution inconsistency through effective distribution alignment strategies, enabling models trained on the source domain to be effectively applied to the target domain, thereby enhancing the practical effectiveness of transfer learning. In this paper, MMD is used as a metric to thoroughly measure the distance between BOF steelmaking data from different domains in the feature space. This effectively reduces the gap between feature distributions, thereby enhancing the model’s prediction performance.
- (3)
Construct a dynamic balanced joint distribution alignment domain adaptation network model.
Based on the above analysis, to effectively align BOF steelmaking data from different domains, this paper constructs a dynamic balanced joint distribution alignment domain adaptation network model. This model fully considers the differences in both marginal and conditional distributions between the source and target domains.
In marginal distribution domain adaptation, the difference between the source domain marginal distribution $P(X_s)$ and the target domain marginal distribution $P(X_t)$ can be measured using a distance function $D_M$. The goal is to minimize this distance to reduce distribution differences. $D_M$ can be expressed as follows:
$$D_M = D\left(P(X_s), P(X_t)\right)$$
In conditional distribution domain adaptation, the difference between the source domain conditional distribution $P(Y_s \mid X_s)$ and the target domain conditional distribution $P(Y_t \mid X_t)$ can be measured using a distance function $D_C$. The goal is to minimize this distance to reduce distribution differences. $D_C$ can be expressed as follows:
$$D_C = D\left(P(Y_s \mid X_s), P(Y_t \mid X_t)\right)$$
In joint distribution domain adaptation, the distance $D_J$ can be defined as the sum of the marginal and conditional distribution distances, as shown below:
$$D_J = D_M + D_C$$
The methods above either consider only one of the two distributions or assume that both are equally important; however, this assumption may not hold in practical applications. This paper addresses this problem with the dynamic balanced joint distribution alignment domain adaptation model. We design a balance factor $\mu$ that quantitatively evaluates the relative importance of the marginal and conditional distributions in domain adaptation. By dynamically adjusting the factor $\mu$, we can effectively balance these distributions. In dynamic balanced joint distribution alignment domain adaptation, the distance $D$ can be expressed as a weighted linear combination of the marginal and conditional distribution distances, as shown below:
$$D = \mu D_M + (1 - \mu) D_C$$
Here, $D_M$ and $D_C$ denote the marginal and conditional distribution distances, respectively, and $\mu \in [0, 1]$. When $\mu = 1$, the above formula reduces to marginal distribution alignment; when $\mu = 0$, it reduces to conditional distribution alignment; and when $\mu = 0.5$, the two distances are weighted equally, which corresponds to joint distribution alignment. Furthermore, based on the previous analysis, this paper employs the MMD distance to construct a dynamic balanced joint distribution alignment network.
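The role of the balance factor can be illustrated with a small sketch. It assumes the marginal and conditional distances have already been computed (for example, with MMD) and only shows how a factor in [0, 1] weights them, with the factor multiplying the marginal term. The function name `balanced_distance` and its argument names are hypothetical, and the rule the paper uses to compute the factor itself is not reproduced here.

```python
def balanced_distance(d_marginal, d_conditional, mu):
    """Weighted combination mu * D_M + (1 - mu) * D_C of the two distances."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return mu * d_marginal + (1.0 - mu) * d_conditional


# Hypothetical distance values, chosen only to show the limiting cases.
d_m, d_c = 0.8, 0.2
print(balanced_distance(d_m, d_c, 1.0))  # marginal alignment only
print(balanced_distance(d_m, d_c, 0.0))  # conditional alignment only
print(balanced_distance(d_m, d_c, 0.5))  # equal weighting of both
```

With equal weighting the result is half the sum of the two distances, so minimizing it is equivalent to minimizing the unweighted joint distance.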
The formula for calculating the marginal distribution distance based on MMD is as follows:
The formula for calculating the conditional distribution distance based on MMD is as follows:
Furthermore, the calculation formula for the dynamic adjustment factor $\mu$ is as follows:
Based on the above analysis, the calculation formula for the adaptive feature matching method proposed in this paper is as follows:
Based on the above analysis, this section constructs a dynamic balanced joint distribution alignment domain adaptation network model, as shown in Figure 3.
The model achieves domain adaptation by adaptively aligning the distributions across different domains. During domain adaptation, the marginal and conditional distributions do not contribute equally to domain divergence. To assess and adjust their influence, the model first uses the maximum mean discrepancy to calculate the marginal distribution difference and the conditional distribution difference between the source and target domains separately, which quantifies the degree of divergence for each distribution. The model then introduces a dynamic adjustment factor $\mu$, which evaluates the relative importance of the marginal and conditional distributions in real time during domain adaptation. This factor adapts the weight of each distribution to the observed differences between them, so that the model aligns the marginal and conditional distributions simultaneously while re-weighting them as the inter-domain differences change. This dynamic weighting improves the model’s adaptability and robustness when handling data from different domains and helps maintain stable predictive performance under varying distribution conditions. Overall, the model reduces inter-domain distribution discrepancies during domain adaptation and improves its generalization capability in cross-domain applications.
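The mechanism described above (measure both discrepancies, derive a weight from them, then combine) can be sketched as follows. The specific rule used here, which sets the factor to the marginal discrepancy's share of the total discrepancy, is a hypothetical stand-in chosen for illustration and should not be read as the paper's formula; the function names and the distance values are likewise assumptions.

```python
from typing import Tuple


def dynamic_factor(d_marginal: float, d_conditional: float,
                   eps: float = 1e-12) -> float:
    """Hypothetical rule: weight the marginal term by its share of the total discrepancy."""
    total = d_marginal + d_conditional
    if total < eps:
        # Both discrepancies are negligible, so fall back to equal weighting.
        return 0.5
    return d_marginal / total


def alignment_loss(d_marginal: float, d_conditional: float) -> Tuple[float, float]:
    """Return the weighted alignment loss and the factor used to compute it."""
    mu = dynamic_factor(d_marginal, d_conditional)
    loss = mu * d_marginal + (1.0 - mu) * d_conditional
    return loss, mu


# Made-up discrepancy values at three successive training steps.
for d_m, d_c in [(0.30, 0.10), (0.12, 0.12), (0.02, 0.08)]:
    loss, mu = alignment_loss(d_m, d_c)
    print(f"d_m={d_m:.2f} d_c={d_c:.2f} mu={mu:.2f} loss={loss:.4f}")
```

Under this assumed rule, the factor is recomputed from the current discrepancies at every step, so the larger of the two discrepancies receives the larger weight.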
3.4. Adaptive Feature Matching Variational Autoencoder Model
In Section 3.2 and Section 3.3, we introduced the use of the variational autoencoder model for extracting features from BOF steelmaking data and the dynamic balanced joint distribution alignment domain adaptation network model for reducing differences between various BOF datasets. Based on the previous analysis, this section constructs an adaptive feature matching variational autoencoder model (VAE-AFM). As shown in Figure 4, the model is divided into four stages: domain partitioning, feature extraction, domain adaptation, and endpoint carbon content and temperature prediction. This model is designed to process data from the BOF steelmaking production process, particularly addressing the issue of concept drift.
This model integrates the feature extraction capabilities of the VAE with the dynamic balanced joint distribution alignment domain adaptation network. In the VAE-AFM model, processing is divided into two main steps to ensure accurate prediction of the endpoint carbon content and temperature during the BOF steelmaking process. First, the VAE module extracts latent feature representations from the raw data. Through its encoder–decoder architecture, the VAE compresses high-dimensional input data into a lower-dimensional latent space: the encoder maps the input data to a distribution in the latent space, while the decoder reconstructs samples that follow the original data distribution. This approach not only extracts key features from the data but also captures the underlying structures and patterns, providing robust feature representations. Second, the AFM module processes the extracted latent features further. The core of AFM lies in applying domain adaptation techniques to ensure that the distributions of source domain data and target domain data in the latent space are as consistent as possible. Specifically, the AFM module minimizes the distribution differences between the source and target domains through dynamic joint adaptation of both marginal and conditional distributions. This approach enhances the model’s generalization ability, ensuring stable performance even when faced with varying data distributions. The synergy between the two components leads to improved prediction accuracy of the endpoint carbon content and temperature in BOF steelmaking.
The VAE module provides robust feature extraction capabilities, ensuring a more precise latent representation of the data, while the AFM module further refines these features by reducing distribution discrepancies between domains, thereby enhancing the model’s adaptability across different data domains. Overall, the VAE-AFM model demonstrates significant advantages in handling the high dimensionality, strong nonlinearity, and concept drift challenges inherent in BOF steelmaking process data, effectively improving prediction accuracy and model robustness.
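The encoder step described above, in which an input is mapped to a distribution in the latent space, can be illustrated with the two standard VAE building blocks: sampling a latent vector through the reparameterization trick and computing the closed-form KL divergence to a standard normal prior. The sketch assumes a diagonal Gaussian latent distribution; the function names and the example values are hypothetical and are not taken from the paper.

```python
import math
import random
from typing import List, Sequence


def reparameterize(mean: Sequence[float], log_var: Sequence[float],
                   rng: random.Random) -> List[float]:
    """Sample z = mean + sigma * eps with eps drawn from N(0, 1) per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean: Sequence[float], log_var: Sequence[float]) -> float:
    """Closed-form KL divergence from N(mean, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))


# Hypothetical encoder output for one heat: a 3-dimensional latent distribution.
rng = random.Random(0)
mean = [0.5, -0.2, 0.0]
log_var = [-1.0, -0.5, 0.0]

z = reparameterize(mean, log_var, rng)   # latent features passed on for alignment
print(len(z))
print(kl_to_standard_normal(mean, log_var))
```

In a setup of this kind, the sampled latent vectors from the source and target domains would be the features whose distributions the AFM module aligns.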
The pseudo-code for the proposed adaptive feature matching variational autoencoder (VAE-AFM) model for predicting the endpoint carbon content and temperature in BOF steelmaking is shown in Algorithm 1:
Algorithm 1: Dynamic Soft Sensor Model Based on Adaptive Feature Matching Variational Autoencoder (VAE-AFM) |
|