Goal-Oriented Sensor Reporting Scheduling for Non-linear Dynamic System Monitoring

Prasoon Raghuwanshi (ORCID 0000-0002-9629-9742), Onel Luis Alcaraz López (ORCID 0000-0003-1838-5183), Vimal Bhatia (ORCID 0000-0001-5148-6643), Matti Latva-aho (ORCID 0000-0002-6261-0969)

Prasoon Raghuwanshi, Onel Luis Alcaraz López, and Matti Latva-aho are with the Centre for Wireless Communications, University of Oulu, 90570, Oulu, Finland (e-mail: [email protected]; [email protected]; [email protected]). Vimal Bhatia is with the Department of Electrical Engineering, Indian Institute of Technology Indore, 453552, Indore, India (e-mail: [email protected]). This research has been supported by the Research Council of Finland (former Academy of Finland) 6G Flagship Programme (Grant 346208), the Finnish Foundation for Technology Promotion, the INDIFICORE project, and the European Commission through the Horizon Europe/JU SNS project Hexa-X-II (Grant 101095759).
Abstract

Goal-oriented communication (GoC) is a form of semantic communication where the effectiveness of information transmission is measured by its impact on achieving the desired goal. In the context of the Internet of Things (IoT), GoC can enable IoT sensors to selectively transmit data pertinent to the intended goals of the receiver. Therefore, GoC holds significant value for IoT networks as it facilitates timely decision-making at the receiver, reduces network congestion, and enhances spectral efficiency. In this paper, we consider a scenario where an edge node polls sensors monitoring the state of a non-linear dynamic system (NLDS) to respond to the queries of several clients. We address this GoC problem, which we term goal-oriented scheduling (GoS). Our proposed GoS utilizes deep reinforcement learning (DRL) with a carefully designed action space, state space, and reward function. The action space and reward function play a pivotal role in reducing the number of sensor transmissions. Meanwhile, the state space enables our DRL scheduler to poll the sensor whose observation is expected to minimize the mean square error (MSE) of the query responses. Our numerical analysis demonstrates that, depending on the type of query, the proposed GoS either achieves a lower query response MSE than benchmark scheduling methods or attains a comparable MSE. Furthermore, the proposed GoS proves to be energy-efficient for the sensors and of lower complexity compared to benchmark scheduling methods.

Index Terms:
Deep Reinforcement Learning, Goal-oriented Scheduling, Internet of Things, Non-linear Dynamic System.

I Introduction

There are billions of Internet of Things (IoT) devices worldwide and the number will keep growing in the coming years [1]. Notably, a significant share of the IoT landscape comprises low-cost/power sensors monitoring dynamic systems, which are usually high-dimensional. As a result, massive amounts of data are increasingly exchanged in IoT communications, often under stringent quality of service, e.g., latency and reliability, requirements [2, 3].

Given the resource limitations inherent to IoT sensors and networks, there has been a growing interest in remotely estimating the system states at a fusion center/edge node [4, 5, 6, 7]. Notably, an edge node may remotely estimate the entire system state by gathering observations from a subset of IoT sensors, rather than the entire sensor network, which ultimately results in energy-efficient state observation. The application of remote state estimation (RSE)-assisted sensor reporting scheduling is diverse, spanning fields such as voltage regulation in power systems [8], strategic actuator placement in control systems [9], and sensing/reporting scheduling in wireless networks [4, 5, 6].

The value-of-information (VoI) [10] has been suggested in [6, 7] as a suitable metric for quantifying the impact of sensor transmission on the RSE error. Here, RSE error is defined with respect to the desired goal. A goal might be to accurately (i) identify the system state, or (ii) respond to queries from clients regarding the system state. Table I provides examples of potential client queries.

Recently, the authors in [4, 5, 6] utilized RSE-assisted sensor reporting scheduling at an edge node. The objective in [4] is to identify the state of a linear dynamic system, whereas in [5, 6], the focus is on effectively addressing client queries regarding the state of a linear dynamic system. Thus, the VoI adopted in [4] corresponds to the mean square error (MSE) of the state estimation. Meanwhile, VoI is defined in [5, 6] as the difference between MSE of query response relative to the prior and posterior estimates of the state estimator [7]. Here, prior and posterior estimates denote estimates obtained before and after the sensor transmission, respectively. Furthermore, [4, 5, 6] exploit a key advantage offered by RSE, namely, the ability to observe system states by selectively polling a subset of sensors. In [4], the sensor scheduling strategy is devised to minimize the state estimation MSE, whereas in [5, 6], it aims to minimize the query response MSE.

TABLE I: Examples of Client Queries
Query | Definition, $z_{c}(\mathbf{x}(t))$
Current state | $\mathbf{x}(t)$
Maximum component | $\max(\mathbf{x}(t))$
Count range | $\sum_{m=1}^{M}\mathbbm{1}(x_{m}(t)\in[\fgee,\fges])$
Sample mean | $z_{mean}=\frac{1}{M}\sum_{m=1}^{M}x_{m}(t)$
Sample variance | $\frac{1}{M-1}\sum_{m=1}^{M}(x_{m}(t)-z_{mean})^{2}$
Herein, $\mathbf{x}(t)=[x_{1}(t),\cdots,x_{M}(t)]^{T}\in\mathbb{R}^{M\times 1}$.
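As an illustration only, the queries of Table I can be sketched as plain functions of a state vector. The function names, the example vector, and the count-range bounds `lo` and `hi` are assumptions introduced here for the sketch; the source does not specify numerical bounds.

```python
import numpy as np

def current_state(x):
    """Current state query: returns the state vector itself."""
    return np.asarray(x, dtype=float)

def maximum_component(x):
    """Maximum component query: largest entry of the state vector."""
    return float(np.max(x))

def count_range(x, lo, hi):
    """Count range query: number of components inside [lo, hi].
    lo and hi are hypothetical bounds chosen by the caller."""
    x = np.asarray(x, dtype=float)
    return int(np.sum((x >= lo) & (x <= hi)))

def sample_mean(x):
    """Sample mean query: (1/M) * sum of the components."""
    return float(np.mean(x))

def sample_variance(x):
    """Sample variance query with the 1/(M-1) normalization of Table I."""
    return float(np.var(x, ddof=1))

# Hypothetical state vector with M = 4 components.
x_example = np.array([1.0, 4.0, 2.0, 7.0])
```

In the considered system the edge node would evaluate such functions on a state estimate rather than on the true state, so the query response error depends on the quality of that estimate.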

Interestingly, a closed-form mathematical expression for the query response MSE can be obtained for certain queries like sample mean, sample variance, and current state. Thus, sensor reporting scheduling strategies for such queries can be determined analytically, as shown in [5]. Conversely, for queries such as the maximum system state component and count range, a closed-form expression for the query response MSE cannot be derived. Therefore, addressing such queries requires approaches such as deep reinforcement learning (DRL) to tackle the sensor reporting scheduling problem, as outlined in [6].

Figure 1: GoC illustration. Clients ask queries to the edge node about the NLDS state observed by the sensors. The edge node, based on the decision taken by its scheduler, may poll a sensor. Besides, the edge node responds to queries based on the state estimate computed by the CQKF.

Note that the proposals in [4, 5, 6] have one common flaw: they assume that the linear dynamic system model is perfectly known at the edge node, a prerequisite for Kalman filter-based RSE. Unfortunately, obtaining such information is often challenging or even impossible, especially in the case of a non-linear dynamic system (NLDS). Moreover, the Kalman filter is not applicable to NLDS. Besides, in [6], a sensor must be polled at every time step, even when there are no client queries, resulting in unnecessary depletion of sensor energy. Apart from that, the complete state of the Kalman filter is provided as input in [6] to its DRL-based sensor scheduler. This input significantly inflates the size of the deep neural network (DNN) utilized by the DRL-based sensor scheduler, as it must also extract relevant information from the input. On top of that, time instances where no queries are posed are treated uniformly, providing the same reward to the DRL-based sensor scheduling algorithm on all those time instances. Consequently, the proposal in [6] struggles to determine the optimal action in the absence of queries.

Motivated by the aforementioned limitations regarding NLDS and RSE-assisted sensor reporting scheduling, we propose a novel approach termed goal-oriented scheduling (GoS) for IoT sensors tasked with monitoring NLDS. In our goal-oriented communication (GoC) system model, illustrated in Fig. 1, clients pose queries about the NLDS state to the edge node, which then orchestrates sensor reporting scheduling to gather partial yet informative sensor observations. These observations are utilized by the edge node to perform RSE and address client queries. The sole motive of sensor reporting scheduling is to minimize the MSE of future query responses, hence the phrase goal-oriented scheduling. Within our system model, the edge node employs a DRL-based scheduler, which decides whether to poll a sensor at each time step. We have devised a reward function such that our DRL-based sensor scheduler makes judicious decisions even when no queries are posed. Furthermore, the edge node utilizes the observation from the polled sensor and the cubature quadrature Kalman filter (CQKF) [11] to estimate the entire NLDS state and respond to the client queries. However, since CQKF requires a mathematical model for the NLDS, we employ Holt's method [12, 11] to iteratively estimate it. Additionally, we provide a specific attribute of the CQKF state as input to our DRL-based sensor scheduler. This input not only aids in minimizing the query response MSE but also significantly shrinks the size of the DNN utilized by our DRL-based sensor scheduler. Lastly, we compare the performance of our proposed scheduler against two benchmark schedulers: the scheduler adopted in [6] and the Monte Carlo scheduler. Our complexity analysis indicates that the proposed scheduler exhibits the least complexity among the considered schedulers. Moreover, the numerical results reveal that, depending on the query type, our proposed scheduler either achieves a lower query response MSE than the benchmark schedulers or attains a comparable MSE. In either case, this is accomplished with fewer sensor transmissions, thereby saving sensor energy.

Algorithm 1 CQpoints
Input: $M, n'$
1.  Find the intersection points $\boldsymbol{\psi}_{j},\forall j\in\{1,\cdots,2M\}$ of the unit $M$-hyper-sphere and its axes    $\triangleright\ \boldsymbol{\psi}_{j}$: cubature point
2.  Compute the roots $\lambda_{j'},\forall j'\in\{1,\cdots,n'\}$ of the CL polynomial    $\triangleright\ \lambda_{j'}$: quadrature point
3.  $\boldsymbol{\xi}_{j'+(j-1)n'}=\sqrt{2\lambda_{j'}}\,\boldsymbol{\psi}_{j},$    $\triangleright$ CQ point
    $w_{j'+(j-1)n'}=\frac{n'!}{2M}\frac{\Gamma(\iota+n'+1)}{\Gamma(M/2)\lambda_{j'}}\frac{1}{L'(\lambda_{j'})^{2}},$ $\ddagger\ddagger$
    $\forall j\in\{1,\cdots,2M\},\ \forall j'\in\{1,\cdots,n'\}$
Output: $\mathbf{w}=[w_{1},\cdots,w_{2Mn'}]^{T},\ \boldsymbol{\Xi}=[\boldsymbol{\xi}_{1},\cdots,\boldsymbol{\xi}_{2Mn'}]^{T}$

$\ddagger\ddagger$ $L'(\lambda_{j'})$ is the first derivative of $L(\cdot)$ at $\lambda=\lambda_{j'}$.

The paper is structured as follows. Section II delineates the system model. Section III describes the components of the GoS framework and presents the scheduling problem. Section IV introduces benchmark schedulers and Section V discusses the computational complexities of all the considered schedulers. Section VI presents the numerical results. Lastly, Section VII concludes the paper and outlines potential avenues for future research.

Notation: $\operatorname{argmax}(\cdot)$ and $\max(\cdot)$ denote the argument of the maximum function and the maximum function itself, respectively. Similarly, $\operatorname{argmin}(\cdot)$ and $\min(\cdot)$ denote the argument of the minimum function and the minimum function itself, respectively. The cardinality of a set is represented by $|\cdot|$, while the transpose operation is denoted by $[\cdot]^{T}$. Column vectors/matrices are indicated by boldface lowercase/uppercase letters. The determinant, trace of a square matrix, and the expected value are denoted by $\det(\cdot)$, $\operatorname{Tr}(\cdot)$ and $\mathbb{E}[\cdot]$, respectively. $\mathbf{I}_{M}$ and $\mathbf{0}_{M}$ signify the $M\times M$ identity matrix and null vector of dimension $M\times 1$, respectively. Additionally, $\mathbf{1}_{p}$ denotes a vector of dimension $M\times 1$ with all elements set to zero except the $p^{th}$ element, which is $1$. The sets $\mathbb{R}^{M\times 1}$ and $\mathbb{N}^{C\times 1}$ represent real vectors of dimension $M\times 1$ and non-negative integer vectors of dimension $C\times 1$, respectively. A Gaussian sample vector with mean $\mathbf{\bar{y}}$ and covariance matrix $\mathbf{Z}$ is denoted as $\mathbf{y}\sim\mathcal{N}(\mathbf{\bar{y}},\mathbf{Z})$. Meanwhile, a Gaussian sample observation with mean $\mathbf{1}_{n}^{T}\mathbf{\bar{y}}$ and covariance $\mathbf{1}_{n}^{T}\mathbf{Z}\mathbf{1}_{n}$ is denoted as $y\sim\mathcal{N}(\mathbf{1}_{n}^{T}\mathbf{\bar{y}},\mathbf{1}_{n}^{T}\mathbf{Z}\mathbf{1}_{n})$. The indicator function, Cholesky decomposition, sample variance, and uniform distribution between $0$ and $1$ are denoted by $\mathbbm{1}(\cdot)$, $\operatorname{\textsc{Chol}}(\cdot)$, $\operatorname{\textsc{Var}}(\cdot)$, and $\mathcal{U}(0,1)$, respectively.

II System Model

Consider the GoC system illustrated in Fig. 1. In this system, an edge node receives data from $N$ sensors indexed by $n\in\{1,2,\cdots,N\}$ and is tasked with responding to queries from a set $\mathscr{C}$ of $C$ remote clients. A query from client $c\in\mathscr{C}$ is a request for the value of the function $z_{c}(\mathbf{x}(t))$, while the edge node responds to it with an estimate $\hat{z}_{c}$. Each client asks a different type of query about the system state. The system operates in discrete time slots, labeled as $t$. In each slot, the edge node decides whether to poll a single sensor or refrain from doing so. The sensors observe NLDS, with its state represented as

$$\mathbf{x}(t)=\mathbf{f}(\mathbf{x}(t-1))+\mathbf{v}_{1}(t)\in\mathbb{R}^{M\times 1}, \tag{1}$$

where $M$ is the dimensionality of the NLDS state, $\mathbf{f}(\cdot)$ represents a nonlinear state dynamics (NLSD) function, and $\mathbf{v}_{1}(t)\sim\mathcal{N}(\mathbf{0}_{M},\mathbf{\Sigma}_{v_{1}})$ denotes the Gaussian noise with zero mean and covariance $\mathbf{\Sigma}_{v_{1}}\in\mathbb{R}^{M\times M}$.

The sensors observe the system state as captured by

$$\mathbf{y}(t)=\mathbf{H}\mathbf{x}(t)+\mathbf{v}_{2}(t)\in\mathbb{R}^{N\times 1}. \tag{2}$$

Herein, $\mathbf{H}\in\mathbb{R}^{N\times M}$ represents the observation matrix, and $\mathbf{v}_{2}(t)\sim\mathcal{N}(\mathbf{0}_{N},\mathbf{\Sigma}_{v_{2}})$ is the zero-mean Gaussian measurement noise with covariance matrix $\mathbf{\Sigma}_{v_{2}}\in\mathbb{R}^{N\times N}$. Additionally, we model the channel between sensor $n$ and edge node as a packet erasure channel with a transmission error probability $\hbar_{n}$.
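To make the model in (1) and (2) concrete, the following is a minimal simulation sketch. The dynamics function `f_example`, the dimensions, and the noise covariances are placeholder assumptions chosen only for illustration; they are not the NLDS used in the paper. Sensors are indexed from 0 in the code, whereas the text indexes them from 1.

```python
import numpy as np

def simulate(f, H, Sigma_v1, Sigma_v2, x0, T, rng):
    """Generate T steps of the state in (1) and the observations in (2)."""
    M, N = x0.shape[0], H.shape[0]
    xs, ys = np.empty((T, M)), np.empty((T, N))
    x = x0
    for t in range(T):
        v1 = rng.multivariate_normal(np.zeros(M), Sigma_v1)  # process noise
        x = f(x) + v1                                         # Eq. (1)
        v2 = rng.multivariate_normal(np.zeros(N), Sigma_v2)  # measurement noise
        xs[t], ys[t] = x, H @ x + v2                          # Eq. (2)
    return xs, ys

def poll(y_t, n, erasure_prob, rng):
    """Packet erasure channel: return sensor n's observation, or None if
    the packet is lost with probability erasure_prob[n]."""
    if rng.uniform() < erasure_prob[n]:
        return None
    return y_t[n]

def f_example(x):
    """Hypothetical nonlinear dynamics, used only for this sketch."""
    return 0.9 * np.sin(x) + 0.1 * np.roll(x, 1)

rng = np.random.default_rng(0)
M = N = 3
xs, ys = simulate(f_example, np.eye(N, M), 0.01 * np.eye(M),
                  0.05 * np.eye(N), np.zeros(M), T=50, rng=rng)
```

Only the polled sensor's entry of `ys[t]` would reach the edge node in a given slot, and only if the packet is not erased.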

Algorithm 2 CQKF at $t$
Input: $\hat{\mathbf{x}}_{pos}(t-1),\mathbf{\Psi}_{pos}(t-1),\mathbf{\Sigma}_{v_{1}},\mathbf{w},\boldsymbol{\Xi},\varpi,\varsigma,\boldsymbol{a}(t-1),\boldsymbol{b}(t-1),\mathbf{\Sigma}_{v_{2}},\mathbf{H},p$
1.  $\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t),\boldsymbol{a}(t),\boldsymbol{b}(t),\mathbf{Z}^{*}(t-1)\leftarrow\operatorname{\textsc{PredictionStep}}(\hat{\mathbf{x}}_{pos}(t-1),\mathbf{\Psi}_{pos}(t-1),\mathbf{\Sigma}_{v_{1}},\mathbf{w},\boldsymbol{\Xi},\varpi,\varsigma,\boldsymbol{a}(t-1),\boldsymbol{b}(t-1))$
2.  Draw $\theta$ from $\mathcal{U}(0,1)$
3.  if $(p\neq 0)$ and $(\theta\geq 0.02\lceil\frac{p-1}{10}\rceil)$ then
4.     $\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t)\leftarrow\operatorname{\textsc{UpdateStep}}(\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t),\mathbf{Z}^{*}(t-1),\mathbf{\Sigma}_{v_{2}},\mathbf{H},\mathbf{w},\boldsymbol{\Xi},y(t),p)$
5.  else
6.     $\{\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t)\}=\{\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t)\}$
7.  end if
Output: $\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t),\boldsymbol{a}(t),\boldsymbol{b}(t),\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t)$
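The gating logic in steps 2 to 7 of Algorithm 2 can be sketched as below. This is a sketch under assumptions: the threshold in step 3 is read here as the erasure probability of the polled sensor $p$, and `update_step` is a hypothetical callable standing in for UpdateStep, whose internals are not reproduced here.

```python
import math
import random

def erasure_probability(p):
    """Threshold used in step 3 for a polled sensor p >= 1,
    interpreted here (an assumption) as its packet erasure probability."""
    return 0.02 * math.ceil((p - 1) / 10)

def posterior_from_prior(prior, p, y, update_step, rng=random):
    """Steps 2-7 of Algorithm 2: the update is applied only when a sensor
    was polled (p != 0) and its packet was not erased; otherwise the
    posterior estimate simply equals the prior estimate."""
    theta = rng.random()  # draw from U(0, 1)
    if p != 0 and theta >= erasure_probability(p):
        return update_step(prior, y, p)
    return prior
```

With this reading, sensors 2 to 11 share an erasure probability of 0.02, sensors 12 to 21 share 0.04, and so on, while sensor 1 is never erased.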

III Goal-Oriented Scheduling

The proposed GoS framework comprises the following three key components: state estimator, sensor scheduler, and query process at the clients. Detailed descriptions of each component are provided next.

III-A System State Estimator

We employ CQKF for NLDS state estimation. As initialization, CQKF requires cubature quadrature (CQ) points (𝚵)𝚵(\boldsymbol{\Xi})( bold_Ξ ) and their corresponding weights (𝐰)𝐰(\mathbf{w})( bold_w ), whose computation procedure is available in Algorithm 1. Initially, we determine the cubature points 𝝍j,j{1,,2M}subscript𝝍𝑗for-all𝑗12𝑀{\boldsymbol{\psi}_{j},\forall j\in\{1,\cdots,2M\}}bold_italic_ψ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , ∀ italic_j ∈ { 1 , ⋯ , 2 italic_M }, which are the intersection points of the unit M𝑀Mitalic_M-hyper-sphere and its axes. For example, the unit 2222-hyper-sphere, also known as the unit circle, has [1,0]T,[0,1]T,[1,0]Tsuperscript10𝑇superscript01𝑇superscript10𝑇{[1,0]^{T},[0,1]^{T},[-1,0]^{T}}[ 1 , 0 ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT , [ 0 , 1 ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT , [ - 1 , 0 ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT and [0,1]Tsuperscript01𝑇{[0,-1]^{T}}[ 0 , - 1 ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT as its four cubature points, which are basically the intersection points of the unit 2222-hyper-sphere with its axes. Likewise, the unit M𝑀Mitalic_M-hyper-sphere has 𝝍j=𝟏j,𝝍M+j=𝟏j,j{1,,M}formulae-sequencesubscript𝝍𝑗subscript1𝑗formulae-sequencesubscript𝝍𝑀𝑗subscript1𝑗for-all𝑗1𝑀{\boldsymbol{\psi}_{j}=\mathbf{1}_{j},\boldsymbol{\psi}_{M+j}=-\mathbf{1}_{j},% \forall j\in\{1,\cdots,M\}}bold_italic_ψ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = bold_1 start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , bold_italic_ψ start_POSTSUBSCRIPT italic_M + italic_j end_POSTSUBSCRIPT = - bold_1 start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , ∀ italic_j ∈ { 1 , ⋯ , italic_M }, as its cubature points. 
Subsequently, we compute the roots λj,j{1,,n}subscript𝜆superscript𝑗for-allsuperscript𝑗1superscript𝑛{\lambda_{j^{\prime}},\forall j^{\prime}\in\{1,\cdots,n^{\prime}\}}italic_λ start_POSTSUBSCRIPT italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT , ∀ italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ { 1 , ⋯ , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT } of the Chebyshev-Leguerre (CL) polynomial, known as quadrature points. Here, the CL polynomial is given as

L(λ)=k=0n(nk)(1)k(n+ι)!(n+ιk)!λnk=0+1λ++n1λn1+λn,𝐿𝜆superscriptsubscript𝑘0superscript𝑛binomialsuperscript𝑛𝑘superscript1𝑘superscript𝑛𝜄superscript𝑛𝜄𝑘superscript𝜆superscript𝑛𝑘subscript0subscript1𝜆subscriptsuperscript𝑛1superscript𝜆superscript𝑛1superscript𝜆superscript𝑛\displaystyle\begin{split}L(\lambda)=&\sum_{k=0}^{n^{\prime}}\binom{n^{\prime}% }{k}(-1)^{k}\frac{(n^{\prime}+\iota)!}{(n^{\prime}+\iota-k)!}\lambda^{n^{% \prime}-k}\\ =&\ell_{0}+\ell_{1}\lambda+\cdots+\ell_{n^{\prime}-1}\lambda^{n^{\prime}-1}+% \lambda^{n^{\prime}},\end{split}start_ROW start_CELL italic_L ( italic_λ ) = end_CELL start_CELL ∑ start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ( FRACOP start_ARG italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_k end_ARG ) ( - 1 ) start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT divide start_ARG ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_ι ) ! end_ARG start_ARG ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT + italic_ι - italic_k ) ! end_ARG italic_λ start_POSTSUPERSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_k end_POSTSUPERSCRIPT end_CELL end_ROW start_ROW start_CELL = end_CELL start_CELL roman_ℓ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + roman_ℓ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_λ + ⋯ + roman_ℓ start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_POSTSUBSCRIPT italic_λ start_POSTSUPERSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT + italic_λ start_POSTSUPERSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT , end_CELL end_ROW (3)

where $\iota = \frac{M}{2} - 1$. Consider $\boldsymbol{\ell} = [\ell_{1}, \cdots, \ell_{n'-1}]^{T}$. To compute the quadrature points, we first formulate the companion matrix $\mathbf{D}$ corresponding to $L(\lambda)$, where

$$\mathbf{D} = \begin{bmatrix} \mathbf{0}_{n'-1} & \mathbf{I}_{n'-1} \\ -\ell_{0} & -\boldsymbol{\ell}^{T} \end{bmatrix}. \tag{4}$$

Next, we formulate the characteristic polynomial of $\mathbf{D}$, which is $\det(\mathbf{D} - \lambda\mathbf{I}_{n'})$, where $\lambda$ corresponds to the eigenvalues of $\mathbf{D}$. Note that $L(\lambda) = \det(\mathbf{D} - \lambda\mathbf{I}_{n'})$. Therefore, the eigenvalues of $\mathbf{D}$ are the roots of $L(\lambda)$. Finally, we determine $\boldsymbol{\Xi}$ and $\mathbf{w}$ by utilizing the cubature and quadrature points in step 3 of Algorithm 1, respectively. Note that $L'(\lambda_{j'})$ in step 3 of Algorithm 1 is the first derivative of $L(\cdot)$ at $\lambda = \lambda_{j'}$.
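
As a minimal sketch of this root-finding step, the NumPy snippet below builds the coefficients of $L(\lambda)$ from Eq. (3), assembles the companion matrix of Eq. (4), and takes its eigenvalues as the quadrature points. The function names are hypothetical, and writing the factorial ratio as a falling factorial (so that a half-integer $\iota$, i.e., odd $M$, is also covered) is an assumption about how Eq. (3) is meant to be evaluated. The characteristic polynomial and $L(\lambda)$ can differ at most by the sign factor $(-1)^{n'}$, which does not change the roots.

```python
from math import comb

import numpy as np


def cl_polynomial_coeffs(n_prime, iota):
    """Return [l_0, ..., l_{n'-1}, 1], the coefficients of L(lambda) in Eq. (3)."""
    coeffs = np.zeros(n_prime + 1)
    for k in range(n_prime + 1):
        # (n'+iota)! / (n'+iota-k)! as a product of k terms (assumed reading).
        falling = 1.0
        for m in range(k):
            falling *= n_prime + iota - m
        coeffs[n_prime - k] = comb(n_prime, k) * (-1) ** k * falling
    return coeffs


def companion_matrix(coeffs):
    """Companion matrix D of Eq. (4) for a monic polynomial."""
    n = len(coeffs) - 1
    D = np.zeros((n, n))
    D[:-1, 1:] = np.eye(n - 1)   # [0 | I] block
    D[-1, :] = -coeffs[:n]       # last row: -l_0, -l_1, ..., -l_{n'-1}
    return D


def quadrature_points(n_prime, M):
    """Quadrature points as eigenvalues of D, sorted in ascending order."""
    iota = M / 2 - 1
    coeffs = cl_polynomial_coeffs(n_prime, iota)
    roots = np.linalg.eigvals(companion_matrix(coeffs))
    return np.sort(roots.real), coeffs
```

For $n'=2$ and $M=2$ (so $\iota=0$), Eq. (3) reduces to $L(\lambda)=\lambda^{2}-4\lambda+2$, whose roots are $2\mp\sqrt{2}$.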

The CQKF is detailed in Algorithm 2 and encompasses two steps, the prediction step and the update step, which are elaborated in Algorithms 3 and 4, respectively.

Algorithm 3 $\textsc{PredictionStep}$
Input: $\hat{\mathbf{x}}_{pos}(t-1), \mathbf{\Psi}_{pos}(t-1), \mathbf{\Sigma}_{v_{1}}, \mathbf{w}, \boldsymbol{\Xi}, \varpi, \varsigma, \boldsymbol{a}(t-1), \boldsymbol{b}(t-1)$
1. $\mathbf{\Sigma}_{pri} = \textsc{Chol}(\mathbf{\Psi}_{pos}(t-1))$ $\triangleright$ Cholesky decomposition
2. $\boldsymbol{\zeta}_{i}(t-1) = \mathbf{\Sigma}_{pri}\boldsymbol{\xi}_{i} + \hat{\mathbf{x}}_{pos}(t-1), \forall i \in \{1,\cdots,2Mn'\}$ $\triangleright$ $\mathbf{Z}(t-1) = [\boldsymbol{\zeta}_{1}(t-1), \cdots, \boldsymbol{\zeta}_{2Mn'}(t-1)]^{T}$
3. $\mathbf{Z}^{*}(t-1), \boldsymbol{a}(t), \boldsymbol{b}(t) \leftarrow \textsc{HoltsMethod}(\varpi, \varsigma, \mathbf{Z}(t-1), \boldsymbol{a}(t-1), \boldsymbol{b}(t-1), \hat{\mathbf{x}}_{pos}(t-1))$ $\triangleright$ $\mathbf{Z}^{*}(t-1) = [\boldsymbol{\zeta}_{1}^{*}(t-1), \cdots, \boldsymbol{\zeta}_{2Mn'}^{*}(t-1)]^{T}$
4. $\hat{\mathbf{x}}_{pri}(t) = \sum_{i=1}^{2Mn'} w_{i}\boldsymbol{\zeta}_{i}^{*}(t-1)$
5. $\mathbf{\Psi}_{pri}(t) = \sum_{i=1}^{2Mn'} w_{i}\boldsymbol{\zeta}_{i}^{*}(t-1)\boldsymbol{\zeta}_{i}^{*T}(t-1) - \hat{\mathbf{x}}_{pri}(t)\hat{\mathbf{x}}_{pri}^{T}(t) + \mathbf{\Sigma}_{v_{1}}$
Output: $\hat{\mathbf{x}}_{pri}(t), \mathbf{\Psi}_{pri}(t), \boldsymbol{a}(t), \boldsymbol{b}(t), \mathbf{Z}^{*}(t-1)$

III-A1 Prediction step

The prediction step computes the prior estimates $\hat{\mathbf{x}}_{pri}(t)$ and $\mathbf{\Psi}_{pri}(t)$. First, we compute the Cholesky decomposition $\mathbf{\Sigma}_{pri}$ of the previous posterior covariance $\mathbf{\Psi}_{pos}(t-1)$, which is then used to determine the sampling points $\mathbf{Z}(t-1)$. Next, Holt's method, described in the next paragraph, transforms $\mathbf{Z}(t-1)$ into the updated sampling points $\mathbf{Z}^{*}(t-1)$. Finally, we compute $\hat{\mathbf{x}}_{pri}(t)$ and $\mathbf{\Psi}_{pri}(t)$ from $\mathbf{Z}^{*}(t-1)$ in steps 4 and 5 of Algorithm 3.
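
The prediction step can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the points $\boldsymbol{\Xi}$ and weights $\mathbf{w}$ are taken as given (they come from Algorithm 1), the rows of `Xi` are assumed to hold the $\boldsymbol{\xi}_{i}$, and `propagate` is a hypothetical callable standing in for the Holt's method call in step 3.

```python
import numpy as np


def prediction_step(x_pos, Psi_pos, Sigma_v1, w, Xi, propagate):
    """Sketch of Algorithm 3.

    x_pos: (M,) previous posterior mean; Psi_pos: (M, M) its covariance.
    Sigma_v1: (M, M) process-noise covariance; w: (P,) weights.
    Xi: (P, M) array whose rows are the points xi_i (assumed layout).
    propagate: hypothetical callable mapping (P, M) points to (P, M) points.
    """
    S = np.linalg.cholesky(Psi_pos)      # step 1: lower-triangular factor
    Z = Xi @ S.T + x_pos                 # step 2: row i is S xi_i + x_pos
    Z_star = propagate(Z)                # step 3: stand-in for Holt's method
    x_pri = w @ Z_star                   # step 4: weighted mean
    Psi_pri = (Z_star.T * w) @ Z_star - np.outer(x_pri, x_pri) + Sigma_v1  # step 5
    return x_pri, Psi_pri, Z_star
```

With an identity `propagate` and a zero-mean, unit-covariance point set, the sketch returns the previous mean unchanged and inflates the covariance by $\mathbf{\Sigma}_{v_{1}}$, which is a convenient sanity check.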

Knowing $\mathbf{f}(\cdot)$ is essential to transform $\mathbf{Z}(t-1)$ into $\mathbf{Z}^{*}(t-1)$, but such information is not available at the edge node. Therefore, we opt for Holt's method, described in detail in Algorithm 5, a reliable way to model the NLDS function $\mathbf{f}(\cdot)$. Holt's method estimates $\mathbf{f}(\cdot)$ according to the expression in step 1 of Algorithm 5, which is updated at each time step with the help of the following smoothing parameters: $\varpi$, $\varsigma$, $\boldsymbol{a}(t)$, and $\boldsymbol{b}(t)$. Here, $\varpi$ and $\varsigma$ are constants, while $\boldsymbol{a}(t)$ and $\boldsymbol{b}(t)$ are variables whose update procedure is given in steps 2 and 3 of Algorithm 5.

Note that the CQKF requires $p$, the index of the selected action, and a random number $\theta \in \mathcal{U}(0,1)$. Both the notion of an action and $p$ appear in Algorithm 6. If $p > 0$ and $\theta \geq \hbar_{p}$, where $\hbar_{p} = 0.02\lceil \frac{p-1}{10} \rceil$ [5], we advance to the update step to compute the posterior estimates $\hat{\mathbf{x}}_{pos}(t)$ and $\mathbf{\Psi}_{pos}(t)$. Otherwise, $\{\hat{\mathbf{x}}_{pos}(t), \mathbf{\Psi}_{pos}(t)\} = \{\hat{\mathbf{x}}_{pri}(t), \mathbf{\Psi}_{pri}(t)\}$.
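
The gating rule above can be illustrated with a few lines of Python. The function names are hypothetical; only the threshold $\hbar_{p} = 0.02\lceil \frac{p-1}{10} \rceil$ and the condition $p > 0$, $\theta \geq \hbar_{p}$ are taken from the text.

```python
import math
import random


def hbar(p):
    """Threshold hbar_p = 0.02 * ceil((p - 1) / 10) for a polled sensor p >= 1."""
    return 0.02 * math.ceil((p - 1) / 10)


def run_update_step(p, theta):
    """True if the update step is executed; otherwise the prior is kept as posterior."""
    return p > 0 and theta >= hbar(p)


# Example draw of theta from U(0, 1); the seed is arbitrary.
rng = random.Random(0)
theta = rng.random()
decision = run_update_step(12, theta)
```

Under this rule, sensors 2 to 11 share the threshold 0.02, sensors 12 to 21 share 0.04, and so on, while sensor 1 has threshold 0 and therefore always triggers the update step when polled.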

III-A2 Update step

In the update step, we compute the Cholesky decomposition $\mathbf{\Sigma}_{pos}$ of $\mathbf{\Psi}_{pri}(t)$. Following this, we determine the sampling points $\mathbf{Z}(t)$, which undergo a linear transformation to become the updated sampling points $\mathbf{Z}^{*}(t)$, as delineated in step 3 of Algorithm 4. Subsequently, we compute a vector $\hat{\mathbf{y}}(t)$, representing the predicted sensor measurements, which is then used to determine the innovation error covariance $\mathbf{\Psi}_{yy}(t)$, the cross-covariance $\mathbf{\Psi}_{xy}(t)$, and the Kalman gain $\mathbf{K}(t)$. Lastly, we compute $\hat{\mathbf{x}}_{pos}(t)$ and $\mathbf{\Psi}_{pos}(t)$ by employing $\mathbf{K}(t)$, $\mathbf{\Psi}_{yy}(t)$, and $y(t)$ in steps 8 and 9 of Algorithm 4. Here, $y(t)$ denotes the measurement of the polled sensor.

Algorithm 4 $\textsc{UpdateStep}$
Input: $\hat{\mathbf{x}}_{pri}(t), \mathbf{\Psi}_{pri}(t), \mathbf{Z}^{*}(t-1), \mathbf{\Sigma}_{v_{2}}, \mathbf{H}, \mathbf{w}, \boldsymbol{\Xi}, y(t), p$
1. $\mathbf{\Sigma}_{pos} = \textsc{Chol}(\mathbf{\Psi}_{pri}(t))$ $\triangleright$ Cholesky decomposition
2. $\boldsymbol{\zeta}_{i}(t) = \mathbf{\Sigma}_{pos}\boldsymbol{\xi}_{i} + \hat{\mathbf{x}}_{pri}(t), i \in \{1,\cdots,2Mn'\}$ $\triangleright$ $\mathbf{Z}(t) = [\boldsymbol{\zeta}_{1}(t), \cdots, \boldsymbol{\zeta}_{2Mn'}(t)]^{T}$
3. $\boldsymbol{\zeta}_{i}^{*}(t) = \mathbf{H}\boldsymbol{\zeta}_{i}(t), i \in \{1,\cdots,2Mn'\}$ $\triangleright$ $\mathbf{Z}^{*}(t) = [\boldsymbol{\zeta}_{1}^{*}(t), \cdots, \boldsymbol{\zeta}_{2Mn'}^{*}(t)]^{T}$
4. $\hat{\mathbf{y}}(t) = \sum_{i=1}^{2Mn'} w_{i}\boldsymbol{\zeta}_{i}^{*}(t)$ $\triangleright$ $\hat{\mathbf{y}}(t) = [\hat{y}_{1}(t), \cdots, \hat{y}_{N}(t)]^{T}$
5. $\mathbf{\Psi}_{yy}(t) = \sum_{i=1}^{2Mn'} w_{i}\boldsymbol{\zeta}_{i}^{*}(t)\boldsymbol{\zeta}_{i}^{*T}(t) - \hat{\mathbf{y}}(t)\hat{\mathbf{y}}^{T}(t) + \mathbf{\Sigma}_{v_{2}}$
6. $\mathbf{\Psi}_{xy}(t) = \sum_{i=1}^{2Mn'} w_{i}\boldsymbol{\zeta}_{i}^{*}(t-1)\boldsymbol{\zeta}_{i}^{*T}(t) - \hat{\mathbf{x}}_{pri}(t)\hat{\mathbf{y}}^{T}(t)$ $\triangleright$ $\mathbf{Z}^{*}(t-1) = [\boldsymbol{\zeta}_{1}^{*}(t-1), \cdots, \boldsymbol{\zeta}_{2Mn'}^{*}(t-1)]^{T}$
7. $\mathbf{K}(t) = \mathbf{\Psi}_{xy}(t)\mathbf{\Psi}_{yy}(t)^{-1}$ $\triangleright$ Kalman gain
8. $\hat{\mathbf{x}}_{pos}(t) = \hat{\mathbf{x}}_{pri}(t) + \mathbf{K}(t)\mathbf{1}_{p}(y(t) - \hat{y}_{p}(t))$
9. $\mathbf{\Psi}_{pos}(t) = \mathbf{\Psi}_{pri}(t) - \mathbf{K}(t)\mathbf{\Psi}_{yy}(t)\mathbf{K}^{T}(t)$
Output: $\hat{\mathbf{x}}_{pos}(t), \mathbf{\Psi}_{pos}(t)$
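
The steps of Algorithm 4 can be sketched in NumPy as shown below. This is a sketch under stated assumptions rather than a definitive implementation: the rows of `Xi` are assumed to hold the $\boldsymbol{\xi}_{i}$, $\mathbf{H}$ is assumed to be an $N \times M$ matrix, and $\mathbf{1}_{p}$ is interpreted as a length-$N$ one-hot vector selecting the polled sensor $p$, which the text does not state explicitly.

```python
import numpy as np


def update_step(x_pri, Psi_pri, Z_star_prev, Sigma_v2, H, w, Xi, y, p):
    """Sketch of Algorithm 4 for a single polled sensor p in {1, ..., N}.

    Z_star_prev: (P, M) points Z*(t-1) returned by the prediction step.
    y: scalar measurement of the polled sensor.
    """
    S = np.linalg.cholesky(Psi_pri)                      # step 1
    Z = Xi @ S.T + x_pri                                 # step 2
    Y = Z @ H.T                                          # step 3: row i is H zeta_i
    y_hat = w @ Y                                        # step 4
    Psi_yy = (Y.T * w) @ Y - np.outer(y_hat, y_hat) + Sigma_v2      # step 5
    Psi_xy = (Z_star_prev.T * w) @ Y - np.outer(x_pri, y_hat)       # step 6
    K = Psi_xy @ np.linalg.inv(Psi_yy)                   # step 7: Kalman gain
    one_p = np.zeros(H.shape[0])                         # assumed one-hot selector
    one_p[p - 1] = 1.0
    x_pos = x_pri + (K @ one_p) * (y - y_hat[p - 1])     # step 8
    Psi_pos = Psi_pri - K @ Psi_yy @ K.T                 # step 9
    return x_pos, Psi_pos
```

When `Z_star_prev` happens to coincide with the points generated from $\hat{\mathbf{x}}_{pri}(t)$ and $\mathbf{\Psi}_{pri}(t)$, the gain reduces to the familiar linear Kalman gain $\mathbf{\Psi}_{pri}\mathbf{H}^{T}(\mathbf{H}\mathbf{\Psi}_{pri}\mathbf{H}^{T} + \mathbf{\Sigma}_{v_{2}})^{-1}$, which gives a simple consistency check for the sketch.
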
Algorithm 5 $\textsc{HoltsMethod}$
Input: $\varpi, \varsigma, \mathbf{Z}(t-1), \boldsymbol{a}(t-1), \boldsymbol{b}(t-1), \hat{\mathbf{x}}_{pos}(t-1)$
1. $\boldsymbol{\zeta}_{i}^{*}(t-1) = \varpi(1+\varsigma)\boldsymbol{\zeta}_{i}(t-1) + (1+\varsigma)(1-\varpi)\boldsymbol{\zeta}_{i}(t-1) - \varsigma\boldsymbol{a}(t-1) + (1-\varsigma)\boldsymbol{b}(t-1), \forall i \in \{1,\cdots,2Mn'\}$ $\triangleright$ $\mathbf{Z}^{*}(t-1) = [\boldsymbol{\zeta}_{1}^{*}(t-1), \cdots, \boldsymbol{\zeta}_{2Mn'}^{*}(t-1)]^{T}$
2. $\boldsymbol{a}(t) = \varpi\hat{\mathbf{x}}_{pos}(t-1) + (1-\varpi)\boldsymbol{a}(t-1)$
3. $\boldsymbol{b}(t) = \varsigma(\boldsymbol{a}(t) - \boldsymbol{a}(t-1)) + (1-\varsigma)\boldsymbol{b}(t-1)$
Output: $\mathbf{Z}^{*}(t-1), \boldsymbol{a}(t), \boldsymbol{b}(t)$
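
A direct transcription of Algorithm 5 into NumPy might look as follows. The function and argument names are hypothetical, and step 1 is coded exactly as printed in the algorithm; the array layout (one sampling point per row of `Z`) is an assumption.

```python
import numpy as np


def holts_method(varpi, varsigma, Z, a_prev, b_prev, x_pos_prev):
    """Sketch of Algorithm 5.

    Z: (P, M) sampling points Z(t-1), one point per row (assumed layout).
    a_prev, b_prev: (M,) smoothing variables a(t-1) and b(t-1).
    x_pos_prev: (M,) previous posterior mean.
    """
    # Step 1, term by term as printed; a_prev and b_prev broadcast over rows.
    Z_star = (varpi * (1 + varsigma) * Z
              + (1 + varsigma) * (1 - varpi) * Z
              - varsigma * a_prev
              + (1 - varsigma) * b_prev)
    a = varpi * x_pos_prev + (1 - varpi) * a_prev            # step 2
    b = varsigma * (a - a_prev) + (1 - varsigma) * b_prev    # step 3
    return Z_star, a, b
```

The returned `a` and `b` are fed back as `a_prev` and `b_prev` at the next time step, which is how the smoothing variables evolve over time.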

III-B Query Process and Query Response

The query process can be modeled as a Markov chain (MC). Each client $c$ operates independently, following its own MC, with its state at time $t$ denoted as $q_{c}(t) \in \mathcal{Q}_{c}$, governed by a known transition matrix $\mathbf{T}_{c}$. Client $c$ always requests the same function $z_{c}$ when its MC is within a subset of states, denoted as $\tilde{\mathcal{Q}}_{c}$, where $\tilde{\mathcal{Q}}_{c} \subset \mathcal{Q}_{c}$. Besides, the state of each client remains unknown to the edge node.
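
To make the query model concrete, the following sketch simulates one client's Markov chain and records the time steps at which it issues a query. The function name, the list-of-rows representation of $\mathbf{T}_{c}$, and the integer labelling of states are illustrative assumptions.

```python
import random


def simulate_queries(T, query_states, q0, steps, rng):
    """Simulate one client's chain and flag the steps with a query.

    T: row-stochastic transition matrix as a list of rows.
    query_states: set of state indices playing the role of the subset Q~_c.
    q0: initial state index; rng: a random.Random instance.
    """
    q = q0
    queries = []
    for _ in range(steps):
        # Draw the next state according to row q of the transition matrix.
        q = rng.choices(range(len(T)), weights=T[q])[0]
        queries.append(q in query_states)
    return queries


# Two-state example: the client queries whenever it is in state 1.
T_c = [[0.9, 0.1],
       [0.5, 0.5]]
flags = simulate_queries(T_c, {1}, q0=0, steps=20, rng=random.Random(1))
```

The edge node never sees `q` itself in this model; it only learns that a query arrived, which is what motivates the observation defined in Section III-C.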

The edge node responds to a query from client $c \in \mathscr{C}$ with an estimate $\hat{z}_{c}(\hat{\mathbf{x}}_{pos}(t), \mathbf{\Psi}_{pos}(t))$. The objective of the edge node is to respond to queries as accurately as possible, that is, to minimize the error in query responses. This error is quantified by the query response $\operatorname{MSE}$, which for client $c$ is defined as [5, 6]

$$\operatorname{MSE}_{c}(t) = \mathbb{E}\bigl[(\hat{z}_{c}(\hat{\mathbf{x}}_{pos}(t), \mathbf{\Psi}_{pos}(t)) - z_{c}(\mathbf{x}(t)))^{2}\bigr]. \tag{5}$$

III-C GoS Problem

The problem is to anticipate future queries and schedule sensor transmissions to minimize the $\operatorname{MSE}$ of future query responses. This task demands foresight, necessitating an understanding not only of the monitored NLDS but also of the query process and the interplay among various query functions.

We can model the GoS problem at the edge node as a partially observable Markov decision process (POMDP), in which the edge node must decide whether to poll a sensor and, if so, which one. Herein, the action space is $\mathcal{A} = \{0, 1, \cdots, N\}$, where action $p = 0$ signifies that no sensor is polled, and action $p = n \in \{1, \cdots, N\}$ signifies that sensor $n$ is polled.

Before initiating the sensor scheduling operation, the edge node possesses prior estimates. Moreover,

$$\operatorname{Tr}(\mathbf{\Psi}_{pri}(t)) = \mathbb{E}[(\mathbf{x}(t) - \hat{\mathbf{x}}_{pri}(t))^{T}(\mathbf{x}(t) - \hat{\mathbf{x}}_{pri}(t))]. \tag{6}$$

Consequently, the state in POMDP can be represented as 𝒔(t)=(Tr(𝚿pri(t)),𝒒(t))𝒔𝑡Trsubscript𝚿𝑝𝑟𝑖𝑡𝒒𝑡{\boldsymbol{s}(t)=(\operatorname{Tr}(\mathbf{\Psi}_{pri}(t)),\boldsymbol{q}(t% ))}bold_italic_s ( italic_t ) = ( roman_Tr ( bold_Ψ start_POSTSUBSCRIPT italic_p italic_r italic_i end_POSTSUBSCRIPT ( italic_t ) ) , bold_italic_q ( italic_t ) ), where 𝒒(t)=[q1(t),,qC(t)]T𝒒𝑡superscriptsubscript𝑞1𝑡subscript𝑞𝐶𝑡𝑇{\boldsymbol{q}(t)=[q_{1}(t),\cdots,q_{C}(t)]^{T}}bold_italic_q ( italic_t ) = [ italic_q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_t ) , ⋯ , italic_q start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ( italic_t ) ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT and the state space is 𝒮=×c=1C𝒬c𝒮superscriptsubscriptproduct𝑐1𝐶subscript𝒬𝑐{\mathcal{S}=\mathbb{R}\times\prod_{c=1}^{C}\mathcal{Q}_{c}}caligraphic_S = blackboard_R × ∏ start_POSTSUBSCRIPT italic_c = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_C end_POSTSUPERSCRIPT caligraphic_Q start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT. However, the edge node lacks knowledge of 𝒒(t)𝒒𝑡{\boldsymbol{q}(t)}bold_italic_q ( italic_t ), instead possessing information about the time 𝝉(t)=[τ1(t),,τC(t)]TC×1𝝉𝑡superscriptsubscript𝜏1𝑡subscript𝜏𝐶𝑡𝑇superscript𝐶1{\boldsymbol{\tau}(t)=[\tau_{1}(t),\cdots,\tau_{C}(t)]^{T}\in\mathbb{N}^{C% \times 1}}bold_italic_τ ( italic_t ) = [ italic_τ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_t ) , ⋯ , italic_τ start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ( italic_t ) ] start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ∈ blackboard_N start_POSTSUPERSCRIPT italic_C × 1 end_POSTSUPERSCRIPT that elapsed since the last query [6]. 
Consequently, the edge node has an observation 𝒐(t)=(Tr(𝚿pri(t)),𝝉(t))𝒐𝑡Trsubscript𝚿𝑝𝑟𝑖𝑡𝝉𝑡{\boldsymbol{o}(t)=(\operatorname{Tr}(\mathbf{\Psi}_{pri}(t)),\boldsymbol{\tau% }(t))}bold_italic_o ( italic_t ) = ( roman_Tr ( bold_Ψ start_POSTSUBSCRIPT italic_p italic_r italic_i end_POSTSUBSCRIPT ( italic_t ) ) , bold_italic_τ ( italic_t ) ), with an observation space 𝒪=×C𝒪superscript𝐶{\mathcal{O}=\mathbb{R}\times\mathbb{N}^{C}}caligraphic_O = blackboard_R × blackboard_N start_POSTSUPERSCRIPT italic_C end_POSTSUPERSCRIPT.
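As a minimal illustration of how the observation $\boldsymbol{o}(t)$ could be assembled, the sketch below stacks the trace of the prior covariance with the elapsed-time vector. The function name `build_observation` and the example values are hypothetical and not part of the original work.

```python
import numpy as np

def build_observation(psi_pri, tau):
    """Stack Tr(Psi_pri(t)) and the elapsed-time vector tau(t) into o(t)."""
    tau = np.asarray(tau, dtype=float).ravel()
    return np.concatenate(([np.trace(psi_pri)], tau))

# Illustrative values only: a 3-state system and C = 2 clients.
psi_pri = np.diag([0.5, 0.25, 0.25])   # prior error covariance
tau = np.array([0, 3])                 # time elapsed since each client's last query
o_t = build_observation(psi_pri, tau)  # vector of length C + 1
```

The resulting vector has length $C+1$, which matches the input dimension listed for the online and target networks.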

The reward $r_{p}(t)$ in the POMDP is defined as

$$r_{p}(t)=\begin{cases}-\mu^{\mathbbm{1}(p==0)}\operatorname{Tr}(\mathbf{\Psi}_{pos}(t)), & \text{no query},\\ -\sum_{c=1}^{C}\alpha_{c}\operatorname{MSE}_{c}(t)\,\mathbbm{1}(\tau_{c}==0), & \text{otherwise},\end{cases} \tag{9}$$

where $\mu\in(0,1)$, $p\in\mathcal{A}$ denotes the selected action, while $\alpha_{c}\in[0,1],\forall c\in\mathscr{C}$, signifies the relative importance of client $c$. Additionally, we presume that $\alpha_{c}$ is known to the edge node.
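The two branches of (9) can be sketched as follows. The sketch assumes that "no query" means that no client has $\tau_{c}=0$ at time $t$, which is an interpretation suggested by the indicator in (9); the function name `reward` and its argument layout are hypothetical.

```python
import numpy as np

def reward(p, tau, alpha, mse, psi_pos, mu=0.1):
    """Evaluate (9) for selected action p.

    Assumption: a query is pending for client c exactly when tau[c] == 0.
    """
    tau = np.asarray(tau)
    alpha = np.asarray(alpha, dtype=float)
    mse = np.asarray(mse, dtype=float)
    asked = tau == 0
    if asked.any():
        # Weighted query-response MSE over the clients that asked a query.
        return -float(np.sum(alpha[asked] * mse[asked]))
    # No query: penalize the posterior error, scaled by mu when action 0 is chosen.
    scale = mu if p == 0 else 1.0
    return -scale * float(np.trace(psi_pos))

psi_pos = np.eye(2)  # illustrative posterior covariance with trace 2
r_idle = reward(0, [2, 5], [1.0, 1.0], [0.0, 0.0], psi_pos)
r_poll = reward(1, [2, 5], [1.0, 1.0], [0.0, 0.0], psi_pos)
r_query = reward(1, [0, 5], [0.5, 1.0], [4.0, 9.0], psi_pos)
```

With $\mu=0.1$, staying idle when no query is pending costs ten times less than polling a sensor for the same posterior trace, which is the incentive for action-0 described in the text.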

The long-term reward $R(\pi)$ can be stated as

$$R(\pi(t))=\mathbb{E}\Biggl[\sum_{t^{\prime}=0}^{\infty}\gamma^{t^{\prime}}r_{p}(t+t^{\prime})\,\bigg|\,\boldsymbol{o}(t),\pi(t)\Biggr], \tag{10}$$

where $\gamma\in[0,1)$ is the exponential discount factor. Moreover, $\pi:\mathcal{O}\rightarrow\mathbf{\Phi}(\mathcal{A})$ represents the policy which maps $\mathcal{O}$ to $\mathbf{\Phi}(\mathcal{A})$, where $\mathbf{\Phi}(\mathcal{A})$ encompasses the probability of selecting each action. Finally, the GoS problem can be defined as [6]

$$\pi^{*}(t)=\underset{\pi:\mathcal{O}\rightarrow\mathbf{\Phi}(\mathcal{A})}{\operatorname{argmax}}\ R(\pi(t)), \tag{11}$$

where $\pi^{*}$ represents the optimal policy.
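For intuition about the discounted sum inside (10), the following sketch evaluates it for a single finite reward trajectory. This is only a truncated, single-sample illustration: the expectation in (10) would average such sums over many trajectories, and the helper name `discounted_return` is hypothetical.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum_k gamma**k * rewards[k] for one finite reward trajectory."""
    g = 0.0
    # Backward accumulation: g_k = r_k + gamma * g_{k+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Three steps with reward -1 each: -(1 + 0.9 + 0.81) = -2.71
g_example = discounted_return([-1.0, -1.0, -1.0], gamma=0.9)
```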

TABLE II: Online and Target Network Architecture and Parameters

| Parameters | Values |
| --- | --- |
| Input dimension | $C+1$ |
| Output dimension | $N+1$ |
| Number of hidden layers | $1$ |
| Hidden layers dimension | $\{4\}$ |
| Activation function | ReLU |
| Optimizer | RMSProp |
| Initial learning rate | $1.0$ |
| Mini-batch size $(B)$ | $\lvert\mathcal{A}\rvert\times 30$ |
| Memory buffer size $(\lvert\mathcal{E}\rvert)$ [13] | $\lvert\mathcal{A}\rvert\times 100$ |
| Exponential discount factor $(\gamma)$ [6] | $0.9$ |
| Threshold for global norm of gradient vector $(\delta)$ [13] | $5.0$ |
| $\Theta_{onl},\Theta_{tar}$ (initialize) | $[-0.3,0.3]$ |
| $\varepsilon$ (initial value) | $1$ |
| $\mu$ | $0.1$ |
Figure 2: A schematic of our proposed GoS.

III-D CQKF-cum-DRL-based Scheduler

We solve (11) using DRL and therefore name our scheduler the CQKF-cum-DRL-based scheduler, described in detail in Algorithm 6. We maintain two DNNs, named the online network and the target network, to improve the stability of our DRL scheduler. For the architecture of both networks, refer to Table II. A schematic of our proposed GoS is available in Fig. 2.
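To make the architecture in Table II concrete, the sketch below builds a network with one hidden layer of four ReLU units and evaluates the action values for one observation. Several details are assumptions rather than statements from the paper: the biases are initialized from the same $[-0.3,0.3]$ range as the weights, the output layer is linear, and the helper names `init_network` and `action_values` are hypothetical.

```python
import copy
import numpy as np

def init_network(num_clients, num_sensors, hidden=4, rng=None):
    """Draw all parameters uniformly from [-0.3, 0.3] (biases included, by assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    def uniform(*shape):
        return rng.uniform(-0.3, 0.3, size=shape)
    return {
        "W1": uniform(num_clients + 1, hidden), "b1": uniform(hidden),
        "W2": uniform(hidden, num_sensors + 1), "b2": uniform(num_sensors + 1),
    }

def action_values(theta, o):
    """Forward pass: ReLU hidden layer followed by a linear output layer."""
    h = np.maximum(0.0, o @ theta["W1"] + theta["b1"])
    return h @ theta["W2"] + theta["b2"]

theta_onl = init_network(num_clients=2, num_sensors=5, rng=np.random.default_rng(0))
theta_tar = copy.deepcopy(theta_onl)   # target network starts as a copy
q_hat = action_values(theta_onl, np.array([1.0, 0.0, 3.0]))  # one value per action
```

The output has $N+1$ entries, one per sensor plus one for action-0.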

Algorithm 6 operates as follows. Initially, it computes the prior estimates to formulate $\boldsymbol{o}(t)$. Subsequently, the online network, characterized by its weights $\Theta_{onl}$, takes $\boldsymbol{o}(t)$ as its input and outputs the action values $\hat{q}_{i}(\boldsymbol{o}(t)),\forall i\in\mathcal{A}$. Here, $\hat{q}_{i}(\boldsymbol{o}(t))$ serves as an estimate of the reward that the scheduler would gain if action $i$ is chosen. The $\varepsilon$-greedy method then employs the action values to select an action $p\in\mathcal{A}$. Primarily, the $\varepsilon$-greedy method selects $p$ as the argument of the maximum action value. However, to explore the whole action space, it occasionally selects $p$ randomly from the set $\mathcal{A}$. The former operation is called exploitation, while the latter is called exploration. The posterior estimates are then computed according to steps 2-7 of Algorithm 2. Subsequently, $r_{p}(t)$, gained by the online network for selecting action $p$, is computed using Algorithm 7.
If there is no query, then $-\mu^{\mathbbm{1}(p==0)}\operatorname{Tr}(\mathbf{\Psi}_{pos}(t))$ is used as the reward, to convey the mean square error in the posterior estimate to the DRL scheduler. Note that, because of $\mu^{\mathbbm{1}(p==0)}$, the reward expression provides an extra incentive to the DRL scheduler for selecting action-0 in case of no query. However, if a query has been asked, the following procedure is applied. First, $\operatorname{MSE}_{c}(t),\forall c\in\mathscr{C}$, required in (9), is computed. The computation of $\operatorname{MSE}_{c}(t)$ involves taking $S$ samples from a Gaussian distribution with mean $\hat{\mathbf{x}}_{pos}(t)$ and covariance $\mathbf{\Psi}_{pos}(t)$.
These samples are then utilized to obtain the vector $\boldsymbol{u}=[u_{1},\cdots,u_{S}]^{T}$, where $u_{s}=z_{c}(\mathbf{x}_{s})$ and $\mathbf{x}_{s}$ is the $s^{th}$ sample. The variance of $\boldsymbol{u}$ yields $\operatorname{MSE}_{c}(t)$. Once $\operatorname{MSE}_{c}(t),\forall c\in\mathscr{C}$, has been computed, $r_{p}(t)$ is obtained using (9).
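The sampling-based computation of $\operatorname{MSE}_{c}(t)$ described above can be sketched as follows. The query functions used here (`first_state`, `max_state`) are made-up examples of $z_{c}$, the helper name `estimate_query_mse` is hypothetical, and the use of the unbiased sample variance (`ddof=1`) is an assumption.

```python
import numpy as np

def estimate_query_mse(x_pos, psi_pos, z_c, num_samples=1000, rng=None):
    """Estimate MSE_c(t) as the sample variance of z_c over posterior samples."""
    rng = np.random.default_rng() if rng is None else rng
    # Draw S samples x_s ~ N(x_pos, Psi_pos).
    samples = rng.multivariate_normal(x_pos, psi_pos, size=num_samples)
    # u_s = z_c(x_s) for every sample.
    u = np.array([z_c(x) for x in samples])
    return float(np.var(u, ddof=1))

def first_state(x):
    return x[0]          # hypothetical linear query: value of the first state

def max_state(x):
    return np.max(x)     # hypothetical non-linear query: largest state value

x_pos = np.array([1.0, 2.0])
psi_pos = np.diag([4.0, 1.0])
rng = np.random.default_rng(0)
mse_linear = estimate_query_mse(x_pos, psi_pos, first_state, 20000, rng)
mse_nonlinear = estimate_query_mse(x_pos, psi_pos, max_state, 20000, rng)
```

For the linear query, the estimate should approach the corresponding diagonal entry of $\mathbf{\Psi}_{pos}(t)$ as $S$ grows; for non-linear queries no such closed form is generally available, which is why sampling is used.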

Algorithm 6 CQKF-cum-DRL-based Scheduler at $t$
Input: $\Theta_{tar},\boldsymbol{o}(t-1),\boldsymbol{o}(t),\hat{\mathbf{x}}_{pos}(t-1),\mathbf{\Psi}_{pos}(t-1),\boldsymbol{a}(t-1),\boldsymbol{b}(t-1),\eta,\varepsilon,\jmath$
1.  Compute $\{\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t),\boldsymbol{a}(t),\boldsymbol{b}(t),\mathbf{Z}^{*}(t-1)\}$ using step 1 of Algorithm 2
2.  Evaluate $\hat{q}_{i}(\boldsymbol{o}(t)),\forall i\in\mathcal{A}$ using the online network
3.  Draw $\theta$ from $\mathcal{U}(0,1)$
4.  if $\theta>\varepsilon$ then
5.     $p\leftarrow\operatorname{argmax}_{i\in\mathcal{A}}\hat{q}_{i}(\boldsymbol{o}(t))$ ▷ Exploitation
6.  else
7.     Select $p$ randomly from $\{0,\cdots,N\}$ ▷ Exploration
8.  end if ▷ $p$: index of selected action
9.  Compute $\{\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t)\}$ using steps 2-7 of Algorithm 2
10.  $r_{p}(t)\leftarrow\operatorname{\textsc{Reward}}(\mathscr{C},S,\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t),\boldsymbol{\tau},p,\{\alpha_{c},\forall c\})$
11.  if $\jmath==|\mathcal{E}|$ then
12.     Remove $\operatorname{\textsc{Tuple}}_{B}$ from $\mathcal{E}$
13.     $\jmath\leftarrow\jmath-1$
14.  end if ▷ $\mathcal{E}$: memory buffer at the edge node
15.  if $t>1$ then
16.     Store $\{\boldsymbol{o}(t-1),p,r_{p}(t),\boldsymbol{o}(t)\}$ as $(\jmath+1)^{th}$ tuple in $\mathcal{E}$
17.     $\jmath\leftarrow\jmath+1$ ▷ $\jmath$: number of tuples available in $\mathcal{E}$
18.  end if
19.  $\eta\leftarrow\eta+1$
20.  if $\eta==20$ then
21.     $\Theta_{tar}=\Theta_{onl}$ ▷ Update target network
22.     $\eta=0$ ▷ Restart counter
23.  end if
24.  Sample a mini-batch $\mathcal{B}$ of size $B$ from $\mathcal{E}$. Then, provide $\operatorname{\textsc{Tuple}}_{j,4},\forall j\in\{1,\cdots,B\}$, as input to the target network and utilize the target network's outputs in (12) to determine the target values $\bar{\bar{q}}_{j},\forall j\in\{1,\cdots,B\}$ for $\mathcal{B}$
25.  Provide $\operatorname{\textsc{Tuple}}_{j,1},\forall j\in\{1,\cdots,B\}$, as input to the online network and utilize the corresponding target values $\bar{\bar{q}}_{j},\forall j\in\{1,\cdots,B\}$, as labels for updating $\Theta_{onl}$ by minimizing $\Omega$, in (13), using RMSProp
26.  $\varepsilon\leftarrow\max(0.1,\ \varepsilon-0.005)$
Output: $\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t),\boldsymbol{a}(t),\boldsymbol{b}(t),\varepsilon,\Theta_{tar},\boldsymbol{o}(t),\eta,\jmath$
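Steps 3-8 and step 26 of Algorithm 6 can be sketched in a few lines. The function names `select_action` and `decay_epsilon` are hypothetical, and the action values passed in are illustrative numbers rather than outputs of a trained network.

```python
import numpy as np

def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over actions {0, ..., N} (steps 3-8)."""
    theta = rng.uniform(0.0, 1.0)
    if theta > epsilon:
        return int(np.argmax(q_values))          # exploitation
    return int(rng.integers(0, len(q_values)))   # exploration: uniform over all actions

def decay_epsilon(epsilon, step=0.005, floor=0.1):
    """Linear decay with a floor (step 26)."""
    return max(floor, epsilon - step)

rng = np.random.default_rng(0)
q_values = np.array([0.1, 0.9, 0.2])
greedy_choice = select_action(q_values, epsilon=0.0, rng=rng)
random_choice = select_action(q_values, epsilon=1.0, rng=rng)

# Starting from 1, epsilon reaches its floor of 0.1 after 180 decrements.
eps = 1.0
for _ in range(200):
    eps = decay_epsilon(eps)
```

Since $\theta$ is drawn from $\mathcal{U}(0,1)$, an initial $\varepsilon=1$ makes the scheduler explore in essentially every early step, and the floor of $0.1$ keeps a residual amount of exploration afterwards.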

Now that both $r_{p}(t)$ and $p$ are available, we proceed to store $\{\boldsymbol{o}(t-1),p,r_{p}(t),\boldsymbol{o}(t)\}$ as $\operatorname{\textsc{Tuple}}_{\jmath+1}$, i.e., the $(\jmath+1)^{th}$ tuple, in the finite memory $\mathcal{E}$ and increase $\jmath$ by $1$. Here, $\jmath$ represents the number of tuples available in $\mathcal{E}$. If $\mathcal{E}$ is full, we remove $\operatorname{\textsc{Tuple}}_{B}$ from $\mathcal{E}$ and decrease $\jmath$ by $1$ before storing the new tuple. Following this, we update the target network weights, denoted as $\Theta_{tar}$, by setting $\Theta_{tar}=\Theta_{onl}$, if the counter $\eta$ has reached its threshold value, herein set to $20$.
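The bookkeeping for the memory $\mathcal{E}$ and the periodic target update can be sketched as below. Note one deliberate simplification: this sketch evicts the oldest tuple when the memory is full, which is a common choice but an assumption here, since the text removes $\operatorname{\textsc{Tuple}}_{B}$. The class and function names are hypothetical.

```python
import copy
import random
from collections import deque

class ReplayMemory:
    """Finite memory of (o_prev, p, r, o_next) tuples.

    Assumption: when full, the oldest tuple is dropped (deque with maxlen).
    """
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, o_prev, p, r, o_next):
        self.buffer.append((o_prev, p, r, o_next))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the current size.
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)

def maybe_sync_target(theta_onl, theta_tar, eta, period=20):
    """Increment the counter and copy the online weights every `period` steps."""
    eta += 1
    if eta == period:
        return copy.deepcopy(theta_onl), 0
    return theta_tar, eta

memory = ReplayMemory(capacity=3)
for k in range(5):
    memory.store(o_prev=k, p=k, r=-1.0, o_next=k + 1)

synced, eta_after_sync = maybe_sync_target({"w": 1}, {"w": 0}, eta=19)
kept, eta_after_keep = maybe_sync_target({"w": 1}, {"w": 0}, eta=0)
```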

Next, the training process for the online network commences by sampling a mini-batch $\mathcal{B}$ of size $B$ from $\mathcal{E}$. Then, we provide $\operatorname{\textsc{Tuple}}_{j,4},\forall j\in\{1,\cdots,B\}$, i.e., the fourth element of $\operatorname{\textsc{Tuple}}_{j}\in\mathcal{B}$, as input to the target network and obtain its output $\vec{\boldsymbol{q}}_{j}=\{\vec{q}_{j,i}\,|\,\forall i\in\mathcal{A}\},\forall j\in\{1,\cdots,B\}$. Now, we utilize the outputs of the target network to determine the target values as

$$\bar{\bar{q}}_{j}=\operatorname{\textsc{Tuple}}_{j,3}+\gamma\,\underset{i\in\mathcal{A}}{\max}\ \vec{q}_{j,i},\quad\forall j\in\{1,\cdots,B\}, \tag{12}$$

for $\mathcal{B}$. Note that $\bar{\bar{q}}_{j},\forall j\in\{1,\cdots,B\}$, is an estimate of $R(\pi)$. Thereupon, we provide $\operatorname{\textsc{Tuple}}_{j,1},\forall j\in\{1,\cdots,B\}$, as input to the online network. The corresponding target values $\bar{\bar{q}}_{j},\forall j\in\{1,\cdots,B\}$, serve as labels for updating $\Theta_{onl}$ by minimizing $\Omega$ using the RMSProp optimizer, where

$$\Omega=\frac{1}{B}\sum_{j=1}^{B}\big[\bar{\bar{q}}_{j}-\hat{q}_{\operatorname{\textsc{Tuple}}_{j,2}}(\operatorname{\textsc{Tuple}}_{j,1})\big]^{2}. \tag{13}$$
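The target values in (12) and the loss in (13) reduce to a few array operations once the network outputs for a mini-batch are available. The sketch below takes those outputs as given arrays; the function names and the numerical values are illustrative assumptions.

```python
import numpy as np

def td_targets(rewards, q_target_next, gamma=0.9):
    """Eq. (12): reward plus discounted maximum target-network output."""
    return rewards + gamma * q_target_next.max(axis=1)

def dqn_loss(targets, q_online, actions):
    """Eq. (13): mean squared gap between targets and the chosen action values."""
    chosen = q_online[np.arange(len(actions)), actions]
    return float(np.mean((targets - chosen) ** 2))

# Mini-batch of B = 2 tuples and |A| = 2 actions (made-up numbers).
rewards = np.array([-1.0, -2.0])                    # third tuple elements
q_target_next = np.array([[0.0, 1.0], [2.0, -1.0]])  # target network on fourth elements
q_online = np.array([[0.0, 0.5], [-0.2, 0.3]])       # online network on first elements
actions = np.array([1, 0])                           # second tuple elements

targets = td_targets(rewards, q_target_next)         # [-0.1, -0.2]
loss = dqn_loss(targets, q_online, actions)          # ((-0.6)**2 + 0**2) / 2 = 0.18
```

Only the action value of the action actually stored in each tuple enters the loss, which is why the online output is indexed by `actions`.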

To deal with the exploding gradient problem during the online network's training phase, we perform gradient-norm clipping [14]. This involves clipping the gradient vector $\nabla_{\Theta_{onl}}\Omega$ as

$$\boldsymbol{\chi}=\frac{\delta\ \nabla_{\Theta_{onl}}\Omega}{\max({\|\nabla_{\Theta_{onl}}\Omega\|}_{2},\delta)}. \tag{14}$$

Herein, $\delta$ represents the threshold value for ${\|\nabla_{\Theta_{onl}}\Omega\|}_{2}$ and the vector $\boldsymbol{\chi}$ stores the clipped gradients. Lastly, to emphasize exploitation over exploration in the $\varepsilon$-greedy method, it is necessary to gradually decrease $\varepsilon$. Thus, we reduce $\varepsilon$ by $0.005$, unless it has already reached $0.1$.
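The clipping rule in (14) leaves a gradient untouched when its norm is at most $\delta$ and rescales it to norm $\delta$ otherwise. A minimal sketch, treating the gradient as one flattened vector (the function name `clip_gradient` is hypothetical):

```python
import numpy as np

def clip_gradient(grad, delta=5.0):
    """Eq. (14): rescale grad so that its Euclidean norm never exceeds delta."""
    norm = np.linalg.norm(grad)
    return delta * grad / max(norm, delta)

at_threshold = clip_gradient(np.array([3.0, 4.0]))     # norm 5: unchanged
too_large = clip_gradient(np.array([30.0, 40.0]))      # norm 50: rescaled to norm 5
small = clip_gradient(np.array([0.3, 0.4]))            # norm 0.5: unchanged
```

The direction of the gradient is preserved in all cases; only its magnitude is limited.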

Algorithm 7 $\operatorname{\textsc{Reward}}$
Input: $\mathscr{C},S,\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t),\boldsymbol{\tau},p,\{\alpha_{c},\forall c\in\mathscr{C}\}$
1.  if Query has been asked at $t$ then
2.     for every $c$ that asked a query do
3.        Draw $\mathbf{x}_{s}$ from $\mathcal{N}(\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t)),\forall s\in\{1,\cdots,S\}$
4.        $u_{s}=z_{c}(\mathbf{x}_{s}),\forall s\in\{1,\cdots,S\}$ ▷ $\boldsymbol{u}=[u_{1},\cdots,u_{S}]^{T}$
5.        $\operatorname{MSE}_{c}(t)=\operatorname{\textsc{Var}}(\boldsymbol{u})$ ▷ Sample variance
6.     end for
7.     $r_{p}(t)=-\sum_{c\in\mathscr{C}}\alpha_{c}\operatorname{MSE}_{c}(t)\mathbbm{1}(\tau_{c}==0)$ ▷ Reward
8.  else
9.     $r_{p}(t)=-\mu^{\mathbbm{1}(p==0)}\operatorname{Tr}(\mathbf{\Psi}_{pos}(t))$ ▷ Reward
10.  end if
Output: $r_{p}(t)$

IV Benchmark Schedulers

IV-A Monte Carlo scheduler

The Monte Carlo scheduler, described in detail in Algorithm 8, is adopted as a benchmark due to its versatility in handling any query type. For a given client $c\in\mathscr{C}$, Algorithm 8 operates as follows. Initially, it computes the prior estimates. Then, in an iterative manner, $S$ distinct Gaussian samples are drawn for sensor $n$ in step 11, by computing $S$ distinct posterior estimates either in step 7 or in step 9, depending on the inequality in step 5. The $S$ Gaussian samples are then employed to compute $S$ distinct query responses in step 12. These query responses are stored in $\boldsymbol{u}$. Next, in step 14, $\operatorname{\textsc{Var}}(\boldsymbol{u})$ is computed and stored in $\boldsymbol{\nu}$. Here, $\operatorname{\textsc{Var}}(\boldsymbol{u})$ represents the $\operatorname{MSE}_{c}(t)$ expected in case sensor $n$ is polled. The procedure from step 3 to step 14 is repeated $N$ times, once per sensor, to calculate $\operatorname{\textsc{Var}}(\boldsymbol{u})$ for every sensor. Then, in step 16, the sensor whose index corresponds to the index of the minimum element in $\boldsymbol{\nu}$ is polled. Following this, to compute the actual $\operatorname{MSE}_{c}(t)$ in step 18, Algorithm 8 again computes the posterior estimates by leveraging the received observation from the polled sensor.

The Monte Carlo scheduler does come with limitations. Unlike the proposed CQKF-cum-DRL-based scheduler, $C$ separate Monte Carlo schedulers must be designed in the case of $C$ clients. Moreover, the Monte Carlo scheduler does not take into account the information related to the query requests while polling a sensor. It simply polls a sensor whenever a query is asked.

Note that the utilization of CQKF necessitates modifications to the original Monte Carlo scheduler available in [5]. Specifically, we have relocated the computation of prior estimates outside the for loops present in steps 2 and 3. This alteration is due to the use of Holt's method, whose smoothing parameters $\boldsymbol{a}(t)$ and $\boldsymbol{b}(t)$ should only be updated once per time step.

Algorithm 8 Monte Carlo Scheduler for Client $c\in\mathscr{C}$
Input: $\hat{\mathbf{x}}_{pos}(t-1),\mathbf{\Psi}_{pos}(t-1),\boldsymbol{a}(t-1),\boldsymbol{b}(t-1)$
1.  Compute $\{\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t),\boldsymbol{a}(t),\boldsymbol{b}(t),\mathbf{Z}^{*}(t-1)\}$ using step 1 of Algorithm 2
2.  for $n\in\{1,\cdots,N\}$ do
3.     for $s\in\{1,\cdots,S\}$ do
4.        Draw $\theta$ from $\mathcal{U}(0,1)$
5.        if $\theta\geq 0.02\lceil\frac{n-1}{10}\rceil$ then
6.           Draw $y$ from $\mathcal{N}(\mathbf{1}_{n}^{T}\hat{\mathbf{x}},\mathbf{1}_{n}^{T}\mathbf{\Psi}\mathbf{1}_{n})$
7.           $\hat{\mathbf{x}},\mathbf{\Psi}\leftarrow\operatorname{\textsc{UpdateStep}}(\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t),\mathbf{Z}^{*}(t-1),\mathbf{\Sigma}_{v_{2}},\mathbf{H},\mathbf{w},\boldsymbol{\Xi},n)$
8.        else
9.           $\{\hat{\mathbf{x}},\mathbf{\Psi}\}=\{\hat{\mathbf{x}}_{pri}(t),\mathbf{\Psi}_{pri}(t)\}$
10.        end if
11.        Draw $\mathbf{x}_{s}$ from $\mathcal{N}(\hat{\mathbf{x}},\mathbf{\Psi})$
12.        $u_{s}=z_{c}(\mathbf{x}_{s})$
13.     end for
14.     $\nu_{n}=\operatorname{\textsc{Var}}(\boldsymbol{u})$ ▷ Sample variance
15.  end for
16.  $p=\operatorname{argmin}_{n\in\{1,\cdots,N\}}\nu_{n}$ ▷ $\boldsymbol{\nu}=[\nu_{1},\cdots,\nu_{N}]^{T}$
17.  Compute $\{\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t)\}$ using steps 2-7 of Algorithm 2
18.  Compute $\operatorname{MSE}_{c}(t)$ using steps 3-5 of Algorithm 7
18.  𝐱^pos(t),𝚿pos(t),𝒂(t),𝒃(t)subscript^𝐱𝑝𝑜𝑠𝑡subscript𝚿𝑝𝑜𝑠𝑡𝒂𝑡𝒃𝑡\hat{\mathbf{x}}_{pos}(t),\mathbf{\Psi}_{pos}(t),\boldsymbol{a}(t),\boldsymbol% {b}(t)over^ start_ARG bold_x end_ARG start_POSTSUBSCRIPT italic_p italic_o italic_s end_POSTSUBSCRIPT ( italic_t ) , bold_Ψ start_POSTSUBSCRIPT italic_p italic_o italic_s end_POSTSUBSCRIPT ( italic_t ) , bold_italic_a ( italic_t ) , bold_italic_b ( italic_t )
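
Steps 4 and 5 of the listing act as a random gate: since $\theta$ is uniform on $(0,1)$, the simulated update for sensor $n$ is skipped (step 9 is taken) with probability $0.02\lceil \frac{n-1}{10} \rceil$. The following minimal Python sketch tabulates this gate; the function name `skip_probability` is illustrative and not part of the paper.

```python
import math


def skip_probability(n: int) -> float:
    """Probability that the test in step 5 fails for sensor n (1-indexed).

    theta ~ U(0, 1), so P(theta < c) = c for any c in [0, 1].
    """
    if n < 1:
        raise ValueError("sensor index n must be >= 1")
    return 0.02 * math.ceil((n - 1) / 10)


# Gate thresholds for N = 20 sensors (the value of N listed in Table V).
thresholds = {n: skip_probability(n) for n in range(1, 21)}
```

Under this gate, sensor 1 always reaches the update step, sensors 2 to 11 skip it with probability 0.02, and sensors 12 to 20 skip it with probability 0.04.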

IV-B Benchmark DRL Scheduler

Our second benchmark scheduler adopts the action space, POMDP state/observation space, reward function, and online and target network architecture utilized by the scheduler in [6]. The benchmark DRL scheduler operates in the same way as described in Algorithm 6, except for the following changes:

  • Change $\mathcal{A}$ to $\{1,\cdots,N\}$, indicating that the edge node must poll a sensor at every time step.

  • In Algorithm 6, provide $\boldsymbol{o}(t) = (\hat{\mathbf{x}}_{pri}(t), \mathbf{\Psi}_{pri}(t), \boldsymbol{\tau}(t))$, with $\mathcal{O} = \mathbb{R}^{M+M^{2}} \times \mathbb{N}^{C}$, as input to the online network. Here, $\{\hat{\mathbf{x}}_{pri}(t), \mathbf{\Psi}_{pri}(t)\}$ indicates the complete state of CQKF after the prediction step.

  • Change step 9 of Algorithm 7 to $r_{p}(t) = 0$, indicating a zero reward when no query is posed at $t$.

  • Change the online and target network architecture by increasing the number of hidden layers to three, having $\{2.5M, M, N\}$ neurons and dropout probabilities of $\{0.1, 0.1, 0\}$, respectively.

Thus, the distinctive features that set apart the benchmark DRL scheduler from the proposed scheduler are its action space, observation space, reward function, and DNN architecture.

V Complexity of the Considered Schedulers

TABLE III: Complexity of Fundamental Operations
| Operations | Complexity | Operations | Complexity |
|---|---|---|---|
| $\operatorname{\textsc{Chol}}(\mathbf{\Psi}_{pri}(t))$ | $M^{3}/3$ | ReLU | $1$ |
| $\mathcal{N}(\mathbf{1}_{i}^{T}\hat{\mathbf{x}}, \mathbf{1}_{i}^{T}\mathbf{\Psi}\mathbf{1}_{i})$ | $1$ | $\mathbf{\Psi}_{yy}(t)^{-1}$ | $M^{3}$ |
| $\operatorname{argmin}_{n \in \{1,\cdots,N\}} \nu_{n}$ | $N$ | $\mathcal{N}(\hat{\mathbf{x}}, \mathbf{\Psi})$ | $M$ |
| Inequality | $1$ | $\operatorname{\textsc{Var}}(\boldsymbol{u})$ | $S$ |
| Draw $\theta$ from $\mathcal{U}(0,1)$ | $1$ | $z_{c}(\mathbf{x}_{s})$ | $1$ |

Herein, we quantify the computational complexity of our considered schedulers in terms of the number of arithmetic operations they perform. Table III presents the complexity expressions for fundamental operations utilized in the algorithms. Note that the complexity expressions for our considered schedulers pertain specifically to the complexity associated with making a scheduling decision at a single time step.

Notice that, because of step 5 of Algorithm 8, deriving an exact expression for the complexity of the Monte Carlo scheduler is not feasible. However, we can derive expressions for both the lower and upper bound of the complexity of the Monte Carlo scheduler. The lower bound expression pertains to the case that the inequality in step 5 of Algorithm 8 is never satisfied. Conversely, the upper bound expression represents the case that the aforesaid inequality is always satisfied. The lower and upper bound complexity expressions are given by

\begin{align}
\vartheta_{1,lb} &= \frac{M^{3}}{3} + 8M^{3}n' + 22M^{2}n' + 4M^{2} + 12Mn' + NS(4+M) + N, \tag{15a} \\
\vartheta_{1,ub} &= \vartheta_{1,lb} + NS\Bigl(\frac{22M^{3}}{3} + 16M^{3}n' + 10M^{2}n' + 8M^{2} + M + 3\Bigr), \tag{15b}
\end{align}

respectively. By taking into account the dominant terms in (15a) and (15b), the final lower and upper bound complexity expressions for the Monte Carlo scheduler, in terms of big-O notation, are given by

\begin{align}
\vartheta_{1,lb} &= O(8M^{3}n' + NSM), \tag{16a} \\
\vartheta_{1,ub} &= O(NSM^{3}n'). \tag{16b}
\end{align}
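
As a numerical cross-check, the bounds in (15a) and (15b) can be evaluated directly. The sketch below is a minimal illustration in Python; the function name `monte_carlo_bounds` is an assumption for this example, and exact rational arithmetic is used only so that the $M^{3}/3$ terms round cleanly.

```python
from fractions import Fraction


def monte_carlo_bounds(N: int, M: int, S: int, n_prime: int) -> tuple[int, int]:
    """Evaluate (15a) and (15b), rounded to the nearest integer."""
    m = Fraction(M)  # exact arithmetic for the M^3 / 3 terms
    lower = (m**3 / 3 + 8 * m**3 * n_prime + 22 * m**2 * n_prime
             + 4 * m**2 + 12 * m * n_prime + N * S * (4 + m) + N)
    upper = lower + N * S * (Fraction(22, 3) * m**3 + 16 * m**3 * n_prime
                             + 10 * m**2 * n_prime + 8 * m**2 + m + 3)
    return round(lower), round(upper)


# {N, M, S, n'} = {20, 20, 100, 2}
print(monte_carlo_bounds(20, 20, 100, 2))  # (198367, 651977700)
```

The printed pair matches the Monte Carlo entry for the first configuration in Table IV.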

The complexity expression for the proposed scheduler is the sum of the complexities of three distinct phases: the action-value generation phase, the action selection phase, and the training phase. The complexity expressions for the first and third phases are $\hat{\vartheta}_{1} = \sum_{i=1}^{|\boldsymbol{l}|-1} l_{i+1}(2l_{i}+1)$ and $\hat{\vartheta}_{3} = B\hat{\vartheta}_{1}$, respectively, as derived in [6]. Here, $\boldsymbol{l} = [l_{1},\cdots,l_{|\boldsymbol{l}|}]^{T}$ with $l_{1} = |\boldsymbol{o}(t)|$ and $l_{|\boldsymbol{l}|} = |\mathcal{A}|$, while the remaining elements of $\boldsymbol{l}$ are the hidden layer sizes. Moreover, because of steps 4-7 of Algorithm 6, the complexity of the second phase falls within the range $[3, (2+|\mathcal{A}|)]$. In the case of the proposed scheduler, $\boldsymbol{l} = [(C+1), 4, (N+1)]^{T}$. Thus, the lower and upper bound complexity expressions for the proposed scheduler are

\begin{align}
\vartheta_{2,lb} &= \hat{\vartheta}_{1} + \hat{\vartheta}_{3} + 3 \nonumber \\
&= (B+1)\hat{\vartheta}_{1} + 3 \nonumber \\
&= (30N+31)(8C+9N+21) + 3, \tag{17a} \\
\vartheta_{2,ub} &= \hat{\vartheta}_{1} + \hat{\vartheta}_{3} + 2 + (N+1) \nonumber \\
&= \vartheta_{2,lb} + N, \tag{17b}
\end{align}

respectively. By taking into account the dominant terms in (17a) and (17b), the final complexity expression for the proposed scheduler is given by

\begin{align}
\vartheta_{2} &= O(9N^{2} + 8CN). \tag{18}
\end{align}
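
The layer-wise sum and the resulting bounds for the proposed scheduler can be reproduced with a few lines of Python. This is a minimal sketch under one stated assumption: matching the factor $(30N+31)$ in (17a) to $(B+1)$ suggests $B = 30(N+1)$, a value inferred here from the closed form rather than quoted from the text. The function names are illustrative.

```python
def forward_cost(layers):
    """Sum of l_{i+1} * (2 * l_i + 1) over consecutive layer sizes."""
    return sum(nxt * (2 * cur + 1) for cur, nxt in zip(layers, layers[1:]))


def proposed_bounds(N, C, B):
    """Lower and upper complexity bounds of the proposed scheduler."""
    theta_1 = forward_cost([C + 1, 4, N + 1])  # action-value generation
    theta_3 = B * theta_1                      # training phase
    lower = theta_1 + theta_3 + 3              # cheapest action selection
    upper = theta_1 + theta_3 + 2 + (N + 1)    # costliest action selection
    return lower, upper


N, C = 20, 2
B = 30 * (N + 1)  # assumption inferred from the factor (30N + 31)
print(proposed_bounds(N, C, B))  # (136930, 136950)
```

With $N = 20$ and $C = 2$, the sketch returns the same range as the first proposed-scheduler entry of Table IV.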
TABLE IV: Complexities for Various System Parameter Configurations
| $\{N, M, C, S, n'\}$ | Proposed Scheduler | Benchmark DRL Scheduler | Monte Carlo Scheduler |
|---|---|---|---|
| $\{20, 20, 2, 100, 2\}$ | $[136930, 136950]$ | $[27591913, 27591932]$ | $[198367, 651977700]$ |
| $\{30, 20, 2, 100, 2\}$ | $[285820, 285850]$ | $[42644333, 42644362]$ | $[222377, 977891377]$ |
| $\{20, 30, 2, 100, 2\}$ | $[136930, 136950]$ | $[59090323, 59090342]$ | $[552940, 2175018940]$ |
| $\{20, 20, 8, 100, 2\}$ | $[167218, 167238]$ | $[27952513, 27952532]$ | $[198367, 651977700]$ |

As mentioned in Subsection IV-B, the benchmark DRL scheduler operates in the same way as the proposed scheduler. Thus, the general complexity expressions derived for the proposed scheduler also apply to the benchmark DRL scheduler. However, this time $\boldsymbol{l} = [(M+M^{2}+C), 2.5M, M, N, N]^{T}$. Thus, the lower and upper bound complexity expressions for the benchmark DRL scheduler are

\begin{align}
\vartheta_{3,lb} &= (30N+1)(5M^{3} + 10M^{2} + 5MC + 3.5M + 2NM + 2N^{2} + 2N) + 3, \tag{19a} \\
\vartheta_{3,ub} &= \vartheta_{3,lb} + N - 1, \tag{19b}
\end{align}

respectively. By taking into account the dominant terms in (19a) and (19b), the final complexity expression for the benchmark DRL scheduler is given by

\begin{align}
\vartheta_{3} &= O(5M^{3}N + 5MCN + 2MN^{2} + 2N^{3}). \tag{20}
\end{align}
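
As a check on the second factor of (19a), substituting $\boldsymbol{l} = [(M+M^{2}+C), 2.5M, M, N, N]^{T}$ into $\hat{\vartheta}_{1} = \sum_{i} l_{i+1}(2l_{i}+1)$ and expanding term by term gives the worked derivation below. The remaining factor $(30N+1)$ then plays the role of $(B+1)$, which would correspond to $B = 30N$; this value is inferred from the closed form and is not stated explicitly in this section.

```latex
\begin{align*}
\hat{\vartheta}_{1}
  &= 2.5M\bigl(2(M+M^{2}+C)+1\bigr) + M(2 \cdot 2.5M+1) + N(2M+1) + N(2N+1) \\
  &= (5M^{3} + 5M^{2} + 5MC + 2.5M) + (5M^{2} + M) + (2NM + N) + (2N^{2} + N) \\
  &= 5M^{3} + 10M^{2} + 5MC + 3.5M + 2NM + 2N^{2} + 2N .
\end{align*}
```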

From (16), (18), and (20), we observe that the proposed scheduler has quadratic computational complexity, while the benchmark schedulers have higher-order polynomial computational complexity. Moreover, using (15a), (15b), (17a), (17b), (19a), and (19b), Table IV lists the complexity ranges of the considered schedulers for various system parameter configurations. As can be seen in Table IV, both the lower and upper bound complexities of the proposed scheduler remain below $3 \times 10^{5}$ operations for all the listed system parameter configurations. Specifically, the upper bound complexity of the proposed scheduler is more than two orders of magnitude lower than that of the benchmark DRL scheduler and more than three orders of magnitude lower than that of the Monte Carlo scheduler. Furthermore, this notably low complexity renders the proposed scheduler suitable for implementation on an embedded processor-based edge node.
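
To make the comparison of upper bounds concrete, the ratios can be computed from the Table IV entries. The short Python sketch below copies the upper-bound values from Table IV; the names `UPPER_BOUNDS` and `upper_bound_ratios` are illustrative only.

```python
# Upper-bound complexities copied from Table IV; keys are (N, M, C, S, n').
# Each value is (proposed, benchmark DRL, Monte Carlo).
UPPER_BOUNDS = {
    (20, 20, 2, 100, 2): (136950, 27591932, 651977700),
    (30, 20, 2, 100, 2): (285850, 42644362, 977891377),
    (20, 30, 2, 100, 2): (136950, 59090342, 2175018940),
    (20, 20, 8, 100, 2): (167238, 27952532, 651977700),
}


def upper_bound_ratios(table):
    """Return {config: (benchmark_drl / proposed, monte_carlo / proposed)}."""
    return {cfg: (drl / prop, mc / prop) for cfg, (prop, drl, mc) in table.items()}


ratios = upper_bound_ratios(UPPER_BOUNDS)
for cfg, (drl_ratio, mc_ratio) in ratios.items():
    print(cfg, round(drl_ratio), round(mc_ratio))
```

For these four configurations, the benchmark DRL ratio stays above 100 and the Monte Carlo ratio stays above 1000.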

VI Results

TABLE V: Parameters Used in Simulations
| Parameters | Values |
|---|---|
| $\mathbf{\Sigma}_{v_{1}}$ | $2.5 \times 10^{-3}\,\mathbf{I}_{M}$ |
| $\mathbf{\Sigma}_{v_{2}}$ | $\mathbf{I}_{N}$ |
| $\mathbf{H}$ | $\mathbf{I}_{M}$ |
| $N, M$ | $20$ |
| $n'$ | $2$ |
| $S$ | $100$ |
| $\alpha_{c}, \forall c \in \mathscr{C}$ | $1$ |
| $\eta$ | $0$ (initial value) |
| $\boldsymbol{a}(0), \boldsymbol{b}(0)$ | $\mathbf{0}_{M}$ (initial value) |
| $\hat{\mathbf{x}}_{pos}(0)$ | $\mathbf{0}_{M}$ (initial value) |
| $\mathbf{\Psi}_{pos}(0)$ | $\mathbf{I}_{M}$ (initial value) |

| Parameters | NLSD (21a) | NLSD (21b) |
|---|---|---|
| $\{\varpi, \varsigma\}$ | $\{0.77, 0.02\}$ | $\{0.75, 0.025\}$ |
| $[\fgee, \fges]$ | $[-0.5, -0.2]$ | $[-0.2, -0.1]$ |

Our simulations consider the following two NLSD functions

\begin{align}
\mathbf{f}(\mathbf{x}(t)) &= \mathbf{x}(t) + 0.05\,\mathbf{x}(t) \odot (\mathbf{1}_{M} - \mathbf{x}(t) \odot \mathbf{x}(t)), \tag{21a} \\
\mathbf{f}(\mathbf{x}(t)) &= \mathbf{x}(t) \odot \operatorname{\textsc{Roll}}(\mathbf{x}(t)), \tag{21b}
\end{align}

where $\operatorname{\textsc{Roll}}(\mathbf{x}(t)) = [x_{2},\cdots,x_{M},x_{1}]^{T}$, and $\odot$ signifies the element-wise product. Notably, (21a) and (21b) lead to NLDSs with uncorrelated and correlated states, respectively. Furthermore, we model the query process at the client side using periodic and memoryless MCs, depicted in Fig. 3. Herein, a client generates a query when its corresponding MC reaches state A. Table VI lists the clients and the queries they pose for the case of $C=2$. Note that the parameter $\mathfrak{C}$ in Table VI refers to the MC combinations possible at the client side.
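
The two NLSD functions are simple enough to sketch in plain Python. The example below is a minimal illustration of (21a) and (21b) on lists of floats; the function names are assumptions, and the process noise is deliberately omitted. It shows that (21a) acts on each state element independently, whereas (21b) couples each element with its neighbor through the roll.

```python
def nlsd_a(x):
    """Element-wise map of (21a): x + 0.05 * x * (1 - x * x)."""
    return [xi + 0.05 * xi * (1.0 - xi * xi) for xi in x]


def roll(x):
    """Roll(x) = [x_2, ..., x_M, x_1]."""
    return x[1:] + x[:1]


def nlsd_b(x):
    """Element-wise product of x with its rolled copy, as in (21b)."""
    return [xi * xj for xi, xj in zip(x, roll(x))]


print(nlsd_a([1.0, 0.0, -1.0]))  # [1.0, 0.0, -1.0]
print(nlsd_b([1.0, 2.0, 3.0]))   # [2.0, 6.0, 3.0]
```

The first call illustrates that $-1$, $0$, and $1$ are fixed points of (21a), since the cubic correction term vanishes there.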

Figure 3: Periodic and memoryless MCs utilized as the query process at the client side. In both MCs, state A leads to a query generation event.
TABLE VI: Information about the Clients when $C=2$
| Parameters | Client-1 | Client-2 |
|---|---|---|
| $\mathfrak{C}=1$ | Periodic MC, Initial MC state: D | Periodic MC, Initial MC state: B |
| $\mathfrak{C}=2$ | Memoryless MC | Memoryless MC |
| $\mathfrak{C}=3$ | Memoryless MC | Periodic MC |
| Query asked | Maximum query | Count range query |
Figure 4: ASF resulting from using the considered schedulers for various $\mathfrak{C}$ and using NLSD (21b).
Figure 5: $\operatorname{MSE}_{c}(t)$ of the maximum query response accumulated during the run of the considered schedulers for various $\mathfrak{C}$.
Figure 6: $\operatorname{MSE}_{c}(t)$ of the count range query response accumulated during the run of the considered schedulers for various $\mathfrak{C}$.
Figure 7: $\operatorname{MSE}_{c}(t)$ of the maximum query response accumulated during the run of the proposed scheduler for various $\mathfrak{C}$ and $\mu$, for the case $C=2$.

The performance of the schedulers is evaluated over a duration of $4000$ time steps through the $\operatorname{MSE}_{c}(t), \forall c \in \mathscr{C}$, and action selection frequency (ASF) metrics. We treat the first $2000$ time steps as a warm-up period for Holt's method. Consequently, any actions selected and $\operatorname{MSE}_{c}(t)$ values, $\forall c \in \mathscr{C}$, computed during the warm-up period are discarded.
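
The warm-up handling and the ASF metric can be sketched as follows. This minimal Python example assumes that the ASF of an action is its relative selection frequency over the retained time steps, and that actions are logged as integers with 0 denoting "no sensor polled"; both conventions and the function name are assumptions made for illustration.

```python
from collections import Counter


def action_selection_frequency(actions, num_actions, warmup):
    """Relative frequency of each action after discarding the warm-up steps."""
    kept = actions[warmup:]
    if not kept:
        raise ValueError("no time steps remain after the warm-up period")
    counts = Counter(kept)
    return [counts.get(a, 0) / len(kept) for a in range(num_actions)]


# Toy log of six time steps; the first two are treated as warm-up.
log = [1, 1, 0, 0, 2, 0]
print(action_selection_frequency(log, num_actions=3, warmup=2))  # [0.75, 0.0, 0.25]
```

In the setting described above, the log would span 4000 time steps with `warmup=2000`.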

TABLE VII: Number of Sensor Transmissions
Proposed Scheduler
| $\mathfrak{C}$ | $\mu=0.1$, NLSD {(21a),(21b)} | $\mu=0.01$, NLSD {(21a),(21b)} | $\mu=1$, NLSD {(21a),(21b)} |
|---|---|---|---|
| $1$ | $\{190, 199\}$ | $\{193, 195\}$ | $\{1947, 1983\}$ |
| $2$ | $\{192, 200\}$ | $\{186, 175\}$ | $\{1981, 1995\}$ |
| $3$ | $\{169, 203\}$ | $\{182, 191\}$ | $\{1987, 1971\}$ |

| $\mathfrak{C}$ | Benchmark DRL Scheduler | Monte Carlo Scheduler |
|---|---|---|
| $1$ | $2000$ | $667$ |
| $2$ | $2000$ | $682$ |
| $3$ | $2000$ | $658$ |

Considering NLSD (21b), the bar plots in Fig. 4 reveal that action-0 is the action most frequently selected by the proposed scheduler. Moreover, the ASFs of all of its remaining actions are below $10^{-2}$. This dominance of action-0 stems from the reward $-0.1\operatorname{Tr}(\mathbf{\Psi}_{pos}(t))$, which incentivizes the proposed scheduler to opt for action-0 in the absence of queries. Besides, opting for action-0 minimizes sensor transmissions, consequently saving sensor energy. Meanwhile, the Monte Carlo scheduler predominantly selects action-1 across all $\mathfrak{C}$, resulting in substantial energy depletion at sensor-1. On the other hand, the ASFs of most of the actions are below $10^{-1}$ across all $\mathfrak{C}$ when using the benchmark DRL scheduler. However, the ASFs obtained through the benchmark schedulers are still higher than those obtained through the proposed scheduler. Furthermore, the proposed scheduler requires the lowest number of sensor transmissions for every $\mathfrak{C}$, as evidenced by Table VII: with $\mu = 0.1$, it uses between 169 and 203 transmissions, compared with 2000 for the benchmark DRL scheduler and between 658 and 682 for the Monte Carlo scheduler. Consequently, the sensor energy depletion is higher with the benchmark schedulers than with the proposed scheduler.

As illustrated through the box plots in Fig. 5, the benchmark schedulers obtain a lower $\operatorname{MSE}_{c}(t)$ of the maximum query response compared to the proposed scheduler. However, note that the $\operatorname{MSE}_{c}(t)$ values for all three schedulers are on the order of $10^{-2}$. Thus, the disparity in the $\operatorname{MSE}_{c}(t)$ of the maximum query response between the proposed scheduler and the benchmark schedulers is marginal.

Considering NLSD (21b), the box plots in Fig. 6 show that the proposed scheduler leads to a decline in the $\operatorname{MSE}_{c}(t)$ of the count range query response, relative to the benchmark schedulers, across all $\mathfrak{C}$. Meanwhile, in the case of NLSD (21a), the proposed and benchmark schedulers obtain similar $\operatorname{MSE}_{c}(t)$.

Furthermore, as illustrated in Figs. 5 and 6, the proposed scheduler performs better relative to the benchmark schedulers on the count range query than on the maximum query. This disparity arises because the $\operatorname{MSE}_{c}(t)$ of the maximum query response is notably more susceptible to outliers within the data gathered in $\boldsymbol{u}$ in steps 3-4 of Algorithm 7. Consequently, the $\operatorname{MSE}_{c}(t)$ of the maximum query response (refer to step 5 of Algorithm 7) typically fails to offer accurate insights into the central tendency of the collected data. Therefore, obtaining a satisfactory $\operatorname{MSE}_{c}(t)$ of the maximum query response with the proposed scheduler necessitates a higher value of $\mu$. Fig. 7 supports this claim, as increasing $\mu$ from $0.1$ to $1$ reduces the $\operatorname{MSE}_{c}(t)$ of the maximum query response for the proposed scheduler. An increment in $\mu$ increases the number of sensor transmissions, which, in turn, improves the accuracy of the posterior estimates. Consequently, this leads to a decline in the number of outliers within $\boldsymbol{u}$. However, the increased number of sensor transmissions is also the main drawback of a larger $\mu$: Table VII shows that increasing $\mu$ from $0.1$ to $1$ raises the number of sensor transmissions roughly tenfold, from about 170-200 to nearly 2000.

Based on the preceding discussion, it is apparent that the proposed scheduler either reduces $\operatorname{MSE}_{c}(t)$ or obtains a comparable $\operatorname{MSE}_{c}(t)$, relative to the benchmark schedulers. Furthermore, the proposed scheduler accomplishes this while reducing the number of sensor transmissions. The key to the satisfactory performance of the proposed scheduler lies in its input. Instead of feeding the complete prior state of the CQKF, i.e., $(\hat{\mathbf{x}}_{pri}(t), \mathbf{\Psi}_{pri}(t))$, as input to the DRL scheduler, we provide a specific attribute of the prior state of the CQKF, namely $\operatorname{Tr}(\mathbf{\Psi}_{pri}(t))$. As mentioned in Section III-C, $\operatorname{Tr}(\mathbf{\Psi}_{pri}(t))$ reflects the mean square error in the prior estimate. By using $\operatorname{Tr}(\mathbf{\Psi}_{pri}(t))$ as input, the DRL scheduler focuses solely on selecting the most fruitful action, which later minimizes $\operatorname{MSE}_{c}(t)$.
In contrast, providing the complete prior state of the CQKF as input, as done with the benchmark DRL scheduler, burdens the DRL scheduler with the extra workload of extracting the valuable information from its input.
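To make the difference between the two inputs concrete, the following Python sketch contrasts a scalar trace input with a flattened full prior state. The numerical values, the three-dimensional state, and the flattening layout used for the benchmark-style input are assumptions made purely for illustration; the actual state space of either scheduler may contain additional components.

```python
def trace(matrix):
    """Sum of the diagonal entries of a square matrix given as nested lists."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def full_prior_input(x_pri, psi_pri):
    """Concatenate the prior mean and the row-wise flattened prior covariance.

    This layout is an assumed stand-in for a benchmark-style input.
    """
    return list(x_pri) + [value for row in psi_pri for value in row]


# Hypothetical prior estimate and prior error covariance (3-dimensional state).
x_pri = [1.0, -0.5, 2.0]
psi_pri = [
    [0.40, 0.10, 0.00],
    [0.10, 0.30, 0.05],
    [0.00, 0.05, 0.20],
]

# Scalar summary of the prior uncertainty: Tr(Psi_pri).
proposed_input = trace(psi_pri)

# Full prior state: 3 mean entries plus 9 covariance entries.
benchmark_input = full_prior_input(x_pri, psi_pri)

print("trace input:", proposed_input)
print("full prior input length:", len(benchmark_input))
```

Under these assumptions the full prior input grows quadratically with the state dimension, whereas the trace input remains a single number regardless of that dimension.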

Meanwhile, relieving the DRL scheduler of the aforementioned extra workload positively impacts its ability to leverage correlation among NLDS states. In Fig. 6, for NLDS (21b), the proposed scheduler demonstrates a superior ability to capitalize on the correlation among NLDS states compared to the benchmark schedulers. Better exploitation of correlation implies that the proposed scheduler has better insight into which sensor is the most fruitful at the time of sensor polling. This, in turn, yields posterior estimates that are better than those obtained with the benchmark schedulers and, consequently, a lower $\operatorname{MSE}_{c}(t)$ of the count range query response. However, in the case of NLDS (21a), no such correlation among NLDS states is available for the proposed scheduler to exploit, so its $\operatorname{MSE}_{c}(t)$ of the count range query response is similar to that of the benchmark schedulers.

Moreover, because of this extra workload, the benchmark DRL scheduler requires a more complex DNN architecture, featuring three hidden layers with $\{2.5M, M, N\}$ neurons. In contrast, the DNN architecture of the proposed scheduler comprises just one hidden layer with four neurons. This streamlined architecture is another advantage of utilizing $\operatorname{Tr}(\mathbf{\Psi}_{pri}(t))$ as input.
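A rough sense of the gap in model size can be obtained by counting the weights and biases of the two fully connected architectures. The Python sketch below does so under loudly stated assumptions: the values of $M$ and $N$, the input width of the benchmark network, and the number of output actions are placeholders, not values taken from the experiments.

```python
def dense_param_count(layer_sizes):
    """Count weights plus biases of a fully connected network.

    layer_sizes lists the width of every layer, from input to output.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Placeholder values (assumptions for illustration only).
M, N = 20, 10
num_actions = 11          # assumed output width for both networks
benchmark_in = 420        # assumed input width for the full prior state

# Proposed scheduler: scalar trace input, one hidden layer with four neurons.
proposed_layers = [1, 4, num_actions]

# Benchmark DRL scheduler: three hidden layers with {2.5M, M, N} neurons.
benchmark_layers = [benchmark_in, int(2.5 * M), M, N, num_actions]

proposed_params = dense_param_count(proposed_layers)
benchmark_params = dense_param_count(benchmark_layers)

print("proposed scheduler parameters:", proposed_params)
print("benchmark scheduler parameters:", benchmark_params)
```

Whatever the exact dimensions are, the first layer of the benchmark-style network dominates its parameter count, since that layer scales with the size of the full prior state.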

Figure 8: $\operatorname{MSE}_{c}(t)$ accumulated during the run of the considered schedulers for maximum, count range, sample mean, and sample variance query responses. We consider $C=4$, and all four clients are modelled through a periodic MC with the following twelve MC states: $\{\textbf{A},\textbf{B},\cdots,\textbf{L}\}$. Besides, $\textbf{B}$, $\textbf{D}$, $\textbf{F}$, and $\textbf{H}$ are the initial MC states for clients 1, 2, 3, and 4, respectively. Moreover, clients 1, 2, 3, and 4 ask maximum, count range, sample mean, and sample variance queries, respectively.

Fig. 8 considers the scenario where, alongside the maximum and count range queries, two additional queries, sample mean and sample variance, are posed to the edge node by two additional clients. Note that there is a negligible disparity between the $\operatorname{MSE}_{c}(t)$ of the query responses obtained with the proposed scheduler and with the benchmark schedulers for the maximum, sample mean, and sample variance queries. Besides, Fig. 8 shows that the proposed scheduler leads to a decline in the $\operatorname{MSE}_{c}(t)$ of the count range query response, relative to the benchmark schedulers, when factoring in NLDS (21a). Meanwhile, in the case of NLDS (21b), the $\operatorname{MSE}_{c}(t)$ of the count range query response is closely similar across all three schedulers. Finally, even with an increase in the number of clients, the performance of the proposed scheduler does not degrade relative to the benchmark schedulers.

VII Conclusion

This paper introduced a GoS method tailored for IoT sensors tasked with sensing an NLDS. The reporting operation is scheduled by the edge node, and the phrase goal-oriented in GoS emphasizes its primary objective, which is to accurately respond to client queries regarding the NLDS state. Through GoS, the edge node gathers partial yet insightful sensor observations to advance towards its objective. These observations, along with a state estimator, are used to estimate the complete NLDS state, which is later employed to generate query responses. Notably, our state estimator operates effectively without requiring a mathematical model of the NLDS. Moreover, our findings showed that the proposed GoS yields energy-efficient state observation from the sensor perspective.

Our work considers only a single RL agent due to the centralized nature of the scheduling. A promising avenue for future research would be to adapt the proposed goal-oriented sensor scheduling framework to a multi-agent RL system, such as an unmanned aerial vehicle swarm where each RL agent acts as a sensor scheduler.

References

  • [1] O. L. A. López, O. M. Rosabal, D. E. Ruiz-Guirola, P. Raghuwanshi, K. Mikhaylov, L. Lovén, and S. Iyer, “Energy-sustainable IoT connectivity: Vision, technological enablers, challenges, and future directions,” IEEE Open Journal of the Communications Society, vol. 4, pp. 2609–2666, 2023.
  • [2] P. Di Lorenzo, M. Merluzzi, F. Binucci, C. Battiloro, P. Banelli, E. C. Strinati, and S. Barbarossa, “Goal-oriented communications for the IoT: System design and adaptive resource optimization,” IEEE Internet of Things Magazine, vol. 6, no. 4, pp. 26–32, 2023.
  • [3] C. Zhang, H. Zou, S. Lasaulce, W. Saad, M. Kountouris, and M. Bennis, “Goal-oriented communications for the IoT and application to data compression,” IEEE Internet of Things Magazine, vol. 5, no. 4, pp. 58–63, 2022.
  • [4] A. Hashemi, M. Ghasemi, H. Vikalo, and U. Topcu, “Randomized greedy sensor selection: Leveraging weak submodularity,” IEEE Transactions on Automatic Control, vol. 66, no. 1, pp. 199–212, 2021.
  • [5] F. Chiariotti, A. E. Kalør, J. Holm, B. Soret, and P. Popovski, “Scheduling of sensor transmissions based on value of information for summary statistics,” IEEE Networking Letters, vol. 4, no. 2, pp. 92–96, 2022.
  • [6] J. Holm, F. Chiariotti, A. E. Kalør, B. Soret, T. B. Pedersen, and P. Popovski, “Goal-oriented scheduling in sensor networks with application timing awareness,” IEEE Transactions on Communications, vol. 71, no. 8, pp. 4513–4527, 2023.
  • [7] D. Gündüz, F. Chiariotti, K. Huang, A. E. Kalør, S. Kobus, and P. Popovski, “Timely and massive communication in 6G: Pragmatics, learning, and inference,” IEEE BITS the Information Theory Magazine, vol. 3, no. 1, pp. 27–40, 2023.
  • [8] Z. Liu, A. Clark, P. Lee, L. Bushnell, D. Kirschen, and R. Poovendran, “Towards scalable voltage control in smart grid: A submodular optimization approach,” in Proceedings of the ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), 2016, pp. 1–10.
  • [9] V. Tzoumas, M. A. Rahimian, G. J. Pappas, and A. Jadbabaie, “Minimal actuator placement with optimal control constraints,” in Proceedings of the American Control Conference (ACC), 2015, pp. 2081–2086.
  • [10] A. Li, S. Wu, S. Meng, and Q. Zhang, “Towards goal-oriented semantic communications: New metrics, open challenges, and future research directions,” arXiv preprint arXiv:2304.00848, 2023.
  • [11] S. K. Nanda, “Advanced Kalman filtering with applications to power system and epidemiological data analysis,” PhD dissertation, Indian Institute of Technology Indore, May 2023.
  • [12] G. Valverde and V. Terzija, “Unscented Kalman filter for power system dynamic state estimation,” IET Generation, Transmission & Distribution, vol. 5, pp. 29–37, Jan. 2011.
  • [13] O. Nabati, T. Zahavy, and S. Mannor, “Online limited memory neural-linear bandits with likelihood matching,” in Proceedings of the International Conference on Machine Learning (ICML), Jul. 2021, pp. 7905–7915.
  • [14] “Tensorflow.” [Online]. Available: https://www.tensorflow.org/api_docs/python/tf/clip_by_global_norm