Data-driven methods for the reduction of energy consumption in warehouses: Use-case driven analysis

Article in Internet of Things · August 2023

DOI: 10.1016/j.iot.2023.100882

Ibrahim Shaer, The University of Western Ontario
Abdallah Shami, The University of Western Ontario


Data-driven Methods for the Reduction of Energy Consumption in Warehouse
Ibrahim Shaer∗, Abdallah Shami∗
∗ Department of Electrical and Computer Engineering
Western University
London ON, Canada
{ishaer, abdallah.shami}@uwo.ca

Abstract—On a global scale, buildings account for a significant portion of energy consumption and CO2 emissions, contributing to inclement environmental conditions. The Heating, Ventilation, and Air Conditioning (HVAC) system is an important energy consumer in buildings, which can be controlled to reduce overall energy consumption. This paper provides an overview of the HVAC control problem for energy reduction in warehouses. In particular, this paper first introduces the enabling technologies for HVAC control. After that, an extensive explanation of the warehouse environment is provided. Issues such as multi-zone spaces, occupancy profiling, and ambient environmental conditions are highlighted in connection to energy consumption. Next, a survey of traditional solutions is discussed, followed by the proposed solution of incorporating Reinforcement Learning (RL) and Supervised Learning techniques. After that, an in-depth summary of the challenges associated with the implementation of these techniques is presented. Lastly, a use case study is conducted to demonstrate the outlined challenges using a dataset collected in a real-world environment.

Index Terms—Energy Consumption Reduction, Warehouse HVAC systems, Use Case study, Reinforcement Learning, Supervised Learning

I. INTRODUCTION

The total energy used in commercial buildings and industrial facilities accounts for around 40% and 36% of the global energy consumption, respectively [1, 2]. With the expansion of industrial activities to new geographical and technological territories, these numbers are expected to increase exponentially. In this regard, there is a strong correlation between the proliferation of industrial activities that use different carbon-based fuels to power their processes and the prominence of greenhouse effects [3]. These effects have resulted in dire consequences on the Earth’s ecosystem, manifested by the accelerated loss of Arctic ice, the rise of sea levels, and frequently inclement weather conditions [4]. Such conditions herald drastic climate changes that have long-lasting effects on the economy, environment, and daily life.

Many governments have implemented aggressive energy legislation to curb the free fall of the Earth’s ecosystem. The new policies are geared towards providing monetary incentives for industries that manage to limit their carbon footprint by relying on alternative sustainable sources of energy, such as wind and solar energy [5, 6]. In the face of these increased monetary costs and the volatility of the oil price market, operators of industrial buildings are compelled to utilize renewable sources of energy and optimize their energy expenditure.

The introduction of renewable energy disrupts the conventional production pipelines in factories that are heavily dependent on carbon-based fuels [7]. A gradual induction of these sources into the industrial infrastructure is favourable given the operational expenses associated with the complete replacement of traditional energy sources [8]. The integration of renewable sources of energy to power industrial building activities is challenged by many factors. On the one hand, the storage medium of renewable sources of energy is prone to gradual degradation and inefficiencies due to charging and discharging cycles [9]. On the other hand, renewable energy exhibits extreme volatility due to its dependence on natural conditions that can be highly unpredictable [10].

To complement the benefits of the partial integration of renewable energy, industrial building operators can optimize their energy consumption to decrease their energy bills. A common controllable energy consumer for all types of buildings is the Heating, Ventilation, and Air Conditioning (HVAC) system. In manufacturing buildings, HVAC systems are used to control the air quality to ensure the workers’ safety. For warehouses, the temperature, humidity, and particulate control achieved through HVAC systems is pertinent to their functioning. HVAC systems account for about 30% of the total energy consumption in warehouses [11]. Therefore, this energy footprint defines a sizable opportunity for technical solutions to optimize warehouse energy consumption.

While thermal comfort is integral for residential buildings, it is less of a pressing issue in warehouses. The primary goal of HVAC control in a warehouse is to preserve the recommended indoor climate conditions to avoid inventory spoilage [12]. An essential part of the refrigerated inventory is dedicated to the food sector, which consumes about 72% of the total energy, whereby fossil fuels account for almost 79% of the energy consumed [13]. Every year, almost $35 billion is estimated to be lost in perishable item value worldwide due to spoilage, a condition that is detrimental to industries’ budgets and the Earth’s ecosystem. These conditions can be addressed by proper HVAC control [14]. The optimal control of HVAC systems in warehouses gains importance on three different levels: the social and environmental levels achieved through curbing greenhouse emissions, the legal level by following the regulatory bodies, and the financial level by limiting energy consumption.

Slashing the costs of HVAC systems in warehouses while maintaining the environmental requirements of the inventory
is challenging. The uncertainty of renewable energy sources, the structural limitations of warehouses, and the human factor represented by occupancy patterns are all salient impediments that contribute to the operation of HVAC systems, and consequently, the energy costs of warehouses. Therefore, these challenges must be considered when designing and implementing an energy optimization strategy.

The warehouse operators introduced different methods into their systems to address each challenge. In particular, methods such as scheduled control, reactive control, and model predictive control are the most popular. These methods suffer from some limitations that undermine their utility in the dynamic warehouse environment. For instance, scheduled control degrades upon a sudden change in occupancy and renewable energy patterns. Reactive control has a myopic view of the environment, which prioritizes short-term environment variability over long-term goals. Lastly, Model Predictive Control (MPC) requires models that encompass most of the factors that contribute to the warehouse environment, a time- and resource-intensive endeavour. To address all of the limiting characteristics of traditional approaches, this work proposes Reinforcement Learning (RL) and Supervised Learning to maintain the environmental requirements of warehouses while reducing energy consumption. The integration of RL and Deep Learning (DL) techniques enables the high-dimensional mapping of the warehouse environment and addresses the long-term concerns of HVAC systems through its reward function formulation. Supervised Learning facilitates the decision-making process of RL by modelling and predicting the future values of environmental conditions.

The contributions of this paper are as follows:
• Detail the physical phenomena taking place in a warehouse environment;
• Discuss the traditional data-scarce methods leveraged for HVAC control and their limitations in a warehouse environment;
• Motivate the utilization of Reinforcement Learning combined with Supervised Learning methods as a replacement for traditional methods and highlight the challenges of their implementation in the warehouse environment;
• Conduct a case study that implements supervised learning techniques to predict occupant proxies using a dataset collected in a real-world setting; and,
• Analyze the prediction results in connection to the multi-zone and air diffusion challenges in the warehouse environment.

This paper is the first to provide an overview of the energy consumption issue in warehouses. It first introduces the core principles and technologies that are foundational to the energy optimization goal, which is explained in Section II. After that, the main challenges associated with the warehouse environment are explained in Section III. Towards the goal of energy optimization, Section IV discusses the traditional methods that are currently implemented. Section V explains the role of supervised learning and reinforcement learning to fulfil the goal of energy reduction. Section VI details the challenges and solutions of supervised learning implementations. Similarly, Section VII explains the practical hurdles of reinforcement learning implementation. Section VIII presents a use case study to showcase the outlined challenges using a dataset collected in a real-world environment. Lastly, Section IX concludes the paper.

II. BACKGROUND

This section examines background information on the enablers for implementing an energy optimization strategy. These enablers include the Industrial Internet of Things (IIoT) devices that are connected using the communication capabilities of Wireless Sensor Networks (WSNs). Renewable energy sources are deployed to gradually replace carbon-based fuels in powering the warehouse. The Building Energy Management System (BEMS) can be developed to better manage the energy consumption of warehouses.

A. Industrial Internet of Things

The emergence of the Industry 4.0 concept is facilitated by the desire for the materialization of the inter-connected smart industry [15]. The envisioned transformation is spearheaded by integrating Internet of Things (IoT) technologies that enable the autonomous sensing, automation, and control of different operations in industrial complexes. Together, these technologies are referred to as the Industrial Internet of Things (IIoT). The inter-connectivity, storage, and computing capabilities of IIoT devices enable different industrial processes to extend their existing applications and envision new ways of operation. In particular, these technologies are responsible for creating a representative network of information on industrial processes that shapes the decision-making capabilities of IIoT systems. An example of an IIoT use case in a warehouse would be a sensor mounted on the warehouse entrance to identify a person entering or leaving the facility. This information, collected over a time horizon, can profile the occupancy in warehouses.

A drop-off in the costs and sizes of IIoT gateways and an increase in their processing power facilitate monitoring and analyzing information from different sources. With regards to energy consumption, the Annual Energy Outlook report published by the International Energy Agency recommends the adoption of next-generation sensors and control technologies, which can reduce energy costs annually by almost $18 billion [16]. As such, the integration of IIoT technologies is fundamental for the realization of efficient energy optimization strategies.

B. Wireless Sensor Networks

Wireless Sensor Networks (WSNs) are one of the foundational units in IIoT technology. A WSN is a group of spatially dispersed sensor nodes interconnected using wireless communication [17]. The battery-powered sensor node is formed of a processor, a storage unit, and a group of sensors. Its principal function is to capture the variations in ambient conditions and convert them into electric signals processed by the node’s processor [17].

A favourable property of these sensors is their ability to sense when deployed far from the target phenomenon. To
monitor a phenomenon of interest, a host of sensors can be randomly deployed so that they capture every aspect of the phenomenon. Such a deployment strategy eliminates the need for cables and complex wiring and promotes the flexible deployment of IIoT devices in remote and harsh locations [18]. The processing capabilities of each sensor enable pre-processing of the collected data so that only the filtered, useful data are sent to the data fusion sites.

The aforementioned harsh conditions are frequently encountered in warehouses. A use case of WSNs in warehouses is monitoring the ascent of high-temperature air. Towards that end, sensors are expected to be mounted on the warehouse’s high ceiling, which is a human-unreachable area.

C. Deployment of Renewable Sources of Energy

The integration of renewable energy sources into the warehouses’ electric grid should account for the warehouse environment and the cost-payoff trade-off. Photo-Voltaic (PV) panels, wind energy, biomass, and electromagnetic field utilization are considered prime candidates for supplying renewable energy. Compared to other renewable energy sources such as wind turbines and biomass, the characteristics of PV panels mitigate the constraints imposed by the warehouse environment. The warehouse rooftops provide a suitable medium for their placement because of the space available and the low probability of sunlight blockage. These conditions are favourable for on-site energy generation. Additionally, the applicability of this setup has been proven by recent implementations in large refrigerated warehouses [14].

D. Building Energy Management System

A Building Energy Management System (BEMS) is an entity responsible for controlling and monitoring loads of different electrical and mechanical entities inside a building. This overarching role can reduce the energy needed for illuminating, heating, and ventilating a building [19]. In the realm of HVAC control, the BEMS handles its main components, including Air Handling Units (AHUs), chillers, and heating.

The connectivity and monitoring capabilities of IIoT devices and WSNs have driven the integration of IoT technology solutions into the BEMS. Moreover, the vast and heterogeneous data available and the sensing and control capabilities have necessitated the integration of computational intelligence into BEMS. Machine Learning techniques are suitable for achieving the main goals of BEMS in terms of efficient energy consumption, its integration with the smart grid technology, and its resilience to any updates of sensory data [20]. Furthermore, the migration to IP-based networking has enabled the remote monitoring of energy consumption by a centralized entity, promoted by the emergence of user-friendly cloud-based software-as-a-service applications [19].

E. Digital Twin

Grieves et al. [21] introduced the concept of Digital Twin (DT), defined as a virtual blueprint containing information about a physical product. In its nascent stages, the concept of Digital Twin was meant to mirror processes occurring in product life-cycle management [21]. At its core, this concept consists of three components: the physical entity, the virtual entity, and the bi-directional data connections. The physical entity is defined as the real-world existence of an entity. The real-world landscape within which the physical entity exists is defined as the physical environment. This environment encompasses all the factors that can contribute to altering the physical entities. On the other hand, the virtual entity is defined as the digitized representation of the physical entity. Different virtual processes such as optimization, analysis, and predictions are realized on the virtual entity’s end. The virtual environment mirrors the physical environment constructed using IIoT devices. The data connections are broken down into physical-to-virtual and virtual-to-physical connections. Technologies such as IIoT devices and WSNs continuously relay the physical environment to the virtual environment. The rate of this exchange is referred to as the twinning rate [22]. The flow of information obtained from virtual processes to change the state of the physical entity and its environment is materialized using the virtual-to-physical connection. Through its actuators, the BEMS can realize this connection, which bridges the gap between the hypotheses built in the virtual environment and the feedback obtained with their implementation in the physical environment [22].

Projecting these definitions to the studied environment is straightforward. Here, the physical entity represents the HVAC systems in a warehouse environment. The warehouse environment represents the real-world blueprint, which, in the DT terminology, is defined as the physical environment. The virtual entity is connected to the main purposes for constructing the DT. In the context of HVAC systems, these can include predictive modelling of some environment-specific parameters, changing HVAC setpoints, and scheduling these systems. Lastly, the virtual environment encompasses all the factors deemed necessary to aid the virtual entity in achieving its main purposes.

III. PHYSICAL PHENOMENA IN WAREHOUSES

Warehouses are characterized by their spacious areas that store the goods and merchandise of a host of big corporations. The vast spaces and the warehouses’ architecture complicate indoor climate control, resulting in many challenges to the goal of energy optimization. Each of these challenges is explained in the following subsections as they represent the physical environment in the DT terminology.

A. Large Air Leaks

The common leak points in warehouses include large doors and windows that, when opened, introduce an outdoor airflow that disrupts the internal climate conditions [23]. The extent of this disruption depends on the outdoor thermal conditions, which can vary based on the warehouse location that dictates the ambient climate. Warehouses may encounter drastic changes in internal thermal conditions in scenarios where shipping doors are opened and closed to fulfil frequent deliveries. Therefore, any operation with dock doors faces an uphill battle when it comes to the preservation of the warehouse’s indoor climate.
Another source of air leaks could be manual doors or windows left ajar due to workers’ negligence [24]. This factor sheds light on the significance of the human factor in the thermal conditioning of warehouses. On a different note, the wear and tear of warehouses expose small leaks that disrupt the indoor climate. The HVAC control reacts to any of these encountered conditions to re-calibrate the indoor climate to align with the constraints of the stored inventory. The inaccuracies in estimating the air infiltration rates in industrial buildings, caused by small and large leak sources, increase the energy demands [24].
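To make the link between infiltration estimates and energy demand concrete, the sensible heat load imposed by infiltrating outdoor air can be approximated by the standard relation Q = ρ · c_p · V · |T_out − T_in|, where V is the volumetric infiltration rate. The following minimal Python sketch is not taken from the cited works; the function names, the air property constants, and the example flow rates and temperatures are illustrative assumptions.

```python
# Illustrative sketch (not from the cited works): sensible heat load that
# infiltrating outdoor air imposes on the HVAC system.
AIR_DENSITY_KG_M3 = 1.2            # assumed approximate density of air
AIR_SPECIFIC_HEAT_J_KG_K = 1005.0  # assumed approximate specific heat of air


def infiltration_load_w(flow_m3_s: float, t_out_c: float, t_in_c: float) -> float:
    """Return the sensible load in watts: rho * c_p * flow * |T_out - T_in|."""
    if flow_m3_s < 0:
        raise ValueError("infiltration flow rate must be non-negative")
    delta_t = abs(t_out_c - t_in_c)
    return AIR_DENSITY_KG_M3 * AIR_SPECIFIC_HEAT_J_KG_K * flow_m3_s * delta_t


def load_estimation_gap_w(estimated_flow_m3_s: float, actual_flow_m3_s: float,
                          t_out_c: float, t_in_c: float) -> float:
    """Return the load (W) left unplanned when the infiltration rate is misjudged."""
    planned = infiltration_load_w(estimated_flow_m3_s, t_out_c, t_in_c)
    actual = infiltration_load_w(actual_flow_m3_s, t_out_c, t_in_c)
    return actual - planned


if __name__ == "__main__":
    # Hypothetical refrigerated zone held at 5 C while outdoor air is at 30 C.
    print(infiltration_load_w(2.0, 30.0, 5.0))
    print(load_estimation_gap_w(1.5, 2.0, 30.0, 5.0))
```

Under these assumptions, the gap returned by `load_estimation_gap_w` grows linearly with both the error in the infiltration rate and the indoor-outdoor temperature difference, which is one way to read why misjudged leaks matter most for zones with strict thermal constraints.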
B. Air Distribution

Small and large commercial warehouses typically include very high ceilings of about 9-12 ft. to store as much inventory as possible [25]. Problems with air distribution emerge in an enclosed environment with high ceilings. The air stratification phenomenon occurs in spaces that have been closed for an extended period [26]. In warehouse settings, this can be observed when no shipments are being loaded or unloaded, a condition encountered with seasonal changes in consumption patterns. This phenomenon is depicted in Figure 1. Two prominent air movements occur during this phenomenon. In the first movement, denoted by number 1 in Figure 1, the heavier cold air settles at the bottom of the warehouse. As a result, in the second movement, denoted by number 2, the lighter warm air ascends to the warehouse’s ceiling. The division of the air based on the temperature is detrimental to the preservation of the indoor climate, which can potentially spoil the inventory. To address this challenge, ventilation systems are used to redistribute and recirculate the rising warm air to avoid re-heating the ground level.

Fig. 1: Air Stratification

The work by Amores et al. [27] showcases the influence of outdoor temperature on the stratification of indoor air. Stark differences between warm and cold months were reported. During warm months, the ceiling’s temperature increases, causing the cold air to accumulate on the ground levels, aggravating the temperature division between different warehouse levels. In contrast, the rising hot air mixes with the cold ceiling during cold months and decreases its temperature. Consequently, the temperature of hot air drops, forcing it to settle at the ground level. This study concludes that ventilation systems are integral to preventing this phenomenon and, consequently, preserving the quality of the stored inventory.

C. Multi-zone Space

The type of goods stored in a warehouse depends on the companies the warehouse serves and on its geographical location. Warehouses are inclined to diversify their goods’ portfolios to appeal to the widest array of customers. Therefore, warehouses are usually divided into multiple zones, whereby each zone includes materials that require specific indoor climate conditions, including temperature, humidity, and particle concentrations, to maintain their quality [28]. These zones include the dry zone, refrigerated zone, and flammable zone.

With the zonal conditions in mind, the preservation of indoor climate conditions cannot be taken as a monolith. A single model that characterizes the whole warehouse would be unaware of each zone’s requirements, which necessitates the creation of a unique approach to meet these requirements. This characteristic highlights the inherent differences in HVAC control between residential buildings and warehouses. In that regard, single thermal modelling of a whole residential building, regardless of the thermal preferences of each room or apartment, is still a viable approach. Its viability stems from its ability to provide acceptable thermal comfort to residents. The permissive nature of HVAC control in residential buildings cannot be applied to multi-zone warehouses with their very drastic variations in thermal requirements, causing the energy migration phenomena [29]. As implied by their description, the temperature levels in a dry zone cannot be simply projected to the cold zone.

Thermal leaks from one room to another represent a challenge when developing HVAC control policies in residential buildings [30, 31]. In their study, Pritoni et al. [30] reported that HVAC systems are operational in unoccupied areas because of the thermal flows between occupied and unoccupied zones. This phenomenon is also realized in a warehouse environment, albeit with higher-stakes conditions. Similar to the air distribution challenges, the thermal flows occur in a horizontal fashion between cold and warmer zones. These exogenous variables should also be integrated into the HVAC control to reduce the energy consumption of HVAC systems.

D. Occupancy Profile

The Air Leaks challenge touches upon the contribution of the workers’ activity to the warehouses’ indoor climate. The workers perform a spectrum of duties in warehouses, ranging from simply walking to loading and unloading shipments. The thermal energy and the respiration activity of occupants vary depending on the occupants’ behaviour, thus affecting the indoor climate conditions in occupied spaces. The downstream effect of human presence contributes to activating HVAC systems and increases the warehouses’ energy consumption [32].

In contrast to office, manufacturing, and residential buildings with fixed or predictable occupancy schedules linked to regular working shifts (8-5), workers’ occupancy profile in warehouses displays high volatility.

The working hours in warehouses extend to 12-hour shifts, exceeding the expected number of hours in other types of buildings [33]. These prolonged shifts occur in irregular hours, due to the all-year-round operations of warehouses. Warehouses expect shipments at different hours of the day as they represent a hub connecting various economies and supply chains. As such, workers’ presence is needed to fulfil these deliveries. Data have revealed that half of the warehouse labour hours in North America are spent searching for products in warehouses [34]. The walking distances and the effort exerted in each part of the warehouse disrupt the stability of thermal conditions inside warehouses.

Warehouse workers of big corporations such as Amazon Inc. are challenged by mandatory overtime that is caused by whimsical patterns of consumption [33] and labour shortages [35]. Social media trends and unexpected events, such as global pandemics, are the main contributors to the radical changes in consumption [36]. The high demand for some products results in the volatility of occupancy in some warehouse zones. These patterns are expected in warehouses with diverse products that cater to a large customer base, implying frequent warehouse activities. Receiving and distributing the goods, order picking, and shipping [34] are among the most prevalent activities in warehouses. For HVAC systems, this volatility is translated to the reactive operation of HVAC, contributing to the increase in energy consumption.

The unpredictable patterns of consumption were accentuated by the COVID-19 pandemic, which forced people to resort to online shopping to avoid contracting the virus in retail stores [36]. The pandemic has also introduced the new concept of social distancing, which enforced approaches to track and organize occupancy in closed areas to avoid spreading the COVID-19 virus. Warehouses are heavily involved in this organizational shift that has implicated the supply chains [37]. The insights gathered from applying social distancing prove its contribution to the decline of influenza [38]. Warehouse owners can leverage such insights to limit the future exposure of their workers to flu and address one of the many health hazards associated with warehouse jobs. In conclusion, many factors that contribute to the uncertainty of warehouse occupancy should be considered to maintain the indoor climate of each warehouse zone.

E. Environmental Conditions

The warehouse industry has transcended its crude image as just an “inventory storage” to become the core link of modern logistics and supply chains. Multiple factors contribute to the establishment of warehouses. These factors include the fast-developing economic structures and rapid changes in logistics to meet the regional and local needs of industries and supply chains [39]. Logistical concerns also include the distances between warehouses and commercial and delivery companies and transportation routes [34]. Additionally, the proximity to large cities or urban concentrations determines the construction of new warehouses. These conditions along with the brand of satisfying customer needs contribute to the distribution of warehouses associated with specific industries over a large geographical area. The dispersion of warehouses is favourable for profitable operations.

The consequences of distributed warehouses operating on continent or country levels are numerous for creating global warehouse HVAC systems. From the structural perspective, warehouse dispersion translates to building warehouses that follow the local area regulations. These pre-conditions are set to adapt the warehouses to the dominant climate [40]. The unique regulations create diverse indoor climate profiles when exposed to outdoor environmental changes [41]. With the dispersion of warehouses over the European Union, China, the USA, and Canada, an array of climates can be encountered for warehouses operated by one company. Climates exhibit distinctive characteristics that are inherently integrated by the overarching governmental regulations, which naturally affect the HVAC systems. In that regard, the U.S. Department of Energy (DOE) recommends specific envelope, lighting, and HVAC systems for warehouses in each of its 8 defined climates [41]. When one industry or company is involved, the structural differences of warehouses invoke the need for approaches tailored to the ambient climatic environment.

To illustrate the effect of the dominant climate on the indoor climate, consider the following example. When the warehouse door is opened, a warehouse in the subtropical climate of Austin, Texas, located in the Southern U.S., experiences distinct changes in internal temperature compared to a warehouse in the humid continental climate of Detroit, located in the Northern U.S. This example showcases the effect of climate and seasons on the indoor environmental conditions and highlights the grave effects of the large doors on these conditions, as explained in the Air Leaks challenge. Seifhashemi et al. [40] studied the effect of cool rooftops on energy saving in different Australian climates. The results suggest drastic differences between cool and warm temperatures, which emphasizes the effects of climate on the HVAC systems in warehouses. Therefore, their integration is paramount for the reduction of energy consumption in warehouses. A summary of warehouses’ challenges is presented in Table I.

Warehouse operators can leverage traditional methods to address warehouse environment challenges. This section discusses the advantages and limitations of each method, highlighting the factors and technical aspects that favour specific methods over others and positioning these solutions in light of the DT terminology. Since this work is the first to address warehouse HVAC challenges connected to energy consumption, the surveyed approaches are evaluated in light of the constraints imposed by warehouses’ indoor climate conditioning.

Challenge | Reason | Effect
Large Air Leaks [23, 24] | Large doors to fulfil deliveries; manual doors/windows left ajar; wear and tear | Outdoor airflow disrupts the indoor climate
Air Distribution [26, 27] | Closing spaces for an extended period of time | Air stratification phenomena
Multi-zone Space [29, 30, 31] | Fulfillment of supply chain requirements and diversifying the goods’ portfolio | Multiple models should be created to account for each zone
Occupancy Profile [32, 33] | Fulfillment of deliveries throughout the day; emergence of E-commerce and the effect of unexpected events on the supply chain | Occupancy profiling experiences large drifts and diverges from residential occupancy models
Environmental Conditions [40, 41] | Dispersion of warehouses to adapt to accelerated economic changes | Outdoor airflow effects drastically differ between warehouses of the same industry

TABLE I: Summary of Warehouse Challenges

A. Programmable and Scheduled Control

Programmable and scheduled control allows regular occupants to either manually change the HVAC settings or input fixed occupancy schedules to aid the operation of HVAC systems. Such an approach is common in commercial buildings, whereby the occupancy profiles can be deterministically quantified [42]. In this case, the building operators are responsible for defining the setback or setpoint conditions. The conditions refer to the requirements for controlling a specific space’s indoor climate. The setback conditions are the minimum acceptable requirements for a space when no occupants are expected, which are defined to conserve energy. On the other hand, the setpoint conditions are the levels of conditioning that need to be attained. The manual calibration of thermostats is prevalent in residential houses, allowing the adjustment of internal temperatures based on occupants’ level of comfort.

The literature provides many examples of scheduled control, also referred to as intermittent control, to replace the continuous operation of HVAC systems with the goal of reducing energy consumption. This method was evaluated under different conditions of building structures and climates and varied in complexity and depth of analysis. Works such as [43, 44] evaluated the utility of this approach in small house and office spaces, whereby energy savings of 5% and 30% are reported, respectively. More profound approaches incorporated intermittent control as part of a broader analysis such as the selection of an insulation layer [45], peak energy shifting [46], and selecting the best schedule control strategy [47, 48]. All of these methods displayed an impressive energy-saving potential, gauged at around 42% for [45], in the range of 14-29% and 18-43% for [46, 47], respectively, and 17% for [48].

The implications of such an approach in warehouses are manifold. The unpredictability of workers’ schedules in warehouses causes a discrepancy between the actual and the pre-defined occupancy profiles. This mismatch in schedules can cause the HVAC systems to operate unnecessarily, contributing to energy wastage. In a similar vein, the manual calibration of thermostats by workers entering the warehouses has a similar effect. The workers are compelled to change the setpoints due to the drastic differences between indoor and outdoor environmental conditions. However, these changes do not account for the indoor climate requirements needed to preserve the freshness of the inventory. The confluent effects of this approach render the scheduled control unsuitable for the warehouse environment because it is likely to violate its thermal constraints.

B. Reactive Control

This approach addresses the limitations of the Programmable and Scheduled Control by reacting to any occupancy changes or by setting thresholds that trigger specific changes in HVAC systems, known as the rule-based approach. This method exploits the occupancy profiles built using multiple sensors to react to any occupancy changes. Due to its reactive nature, the HVAC system is triggered upon workers’ arrival to transition from its current setback environmental conditions to its setpoints. Under these definitions, the control method represents the virtual processes that are fed with the sensor data that mirrors the physical environment. In what follows, the focus will be on the temperature setpoints as they represent the most calibrated environmental condition.

The literature adopting reactive approaches for HVAC control differs depending on the occupancy-detection strategy. A common method is the integration of different sensors such as Passive Infrared (PIR) sensors [30, 49, 50], and occupancy counting and presence methods in the works of [51, 52] into the decision-making processes of HVAC systems. The applied methodologies compared to manual and scheduled control achieved remarkable energy savings of 42% for [52], in the range of 2-48% in different climates [51], 20-30% in unoccupied periods in [30], north of 54% for [49], and around 28% for [50].

The seasonal changes result in noticeable effects on the recovery time between setback and setpoint temperatures. The authors in [53] studied the lag-time between the setback and setpoint temperatures in residential houses during the winter

and summer seasons. The lag time can also be referred to as the pre-conditioning time [54]. The study reported that the estimated lag time is between 2.25 hours and 6 hours. These intervals are widened in warehouses that are more spacious than residential houses.

The gradual degradation of the thermal conditions from setpoint to setback temperature or vice versa, upon occupancy changes, can spoil the inventory because of the fluctuation in the inventory’s ambient temperature. Furthermore, the reactive approach does not integrate the long-term occupancy and energy price changes but is more fixated on the instantaneous changes. Such a myopic view of the environment results in HVAC operations of high energy consumption given the unpredictability of occupancy behaviour in warehouses.

C. Model Predictive Control

Optimal control approaches are proposed to address the shortcomings of reactive control, especially those related to overlooking the long-term implications of any control action. These approaches depend on two pillars: accurate modelling of the building’s thermal conditions and predicting stochastic factors. However, the two pillars were not addressed in conjunction. Therefore, the analysis will explain each factor separately and then elaborate on their combined effect in a warehouse environment. Both modelling methods fall under the virtual processes when adopting Model Predictive Control (MPC).

Creating an accurate physical model encompassing all the factors contributing to a building’s thermodynamics is challenging. Building-related factors such as the structure, material, and buildings’ decay and internal factors such as lighting and occupancy affect the thermal conditions of buildings [55]. However, additional factors such as the thermal insulation and flows from each zone exist in a warehouse setting. Given that many factors are involved, devising a mathematical model that explains the buildings’ thermodynamics is a time- and resource-intensive task [56]. If one of these models is available, its accuracy is expected to degrade with time, especially with the changes that the warehouse structure undergoes [57]. Thermal insulation degradation and changes in internal zones’ distribution because of renovations are two prominent examples of such changes.

The stochastic factors involved in an HVAC system encompass occupancy behaviour. While occupancy profiling is not straightforward, several approaches have been applied to predict occupancy patterns within a specific time horizon. Defining this horizon is predicated on the pre-conditioning time that depends on the accurate modelling of the buildings’ thermal dynamics and occupancy profile. Therefore, any accurate HVAC control relies on the accurate prediction of these factors [58]. Given the occupancy models’ stochasticity and the wear and tear of warehouses, the model defining the pre-conditioning either drifts from the actual models or needs to be constantly calibrated.

Model Predictive Control (MPC) represents the convergence of the thermodynamics modelling and the modelling of the stochastic factors [59]. The literature provides ample examples of MPC implementations that jointly consider the thermal requirements and the energy consumption of HVAC systems. In that regard, the contributions to this field target the predictive component, the control component, or the changes in ambient circumstances. For example, Xu et al. [60] incorporate the climate and building information to predict the warm-up time using MPC for a building’s heating system, contributing to 20% energy savings. The work in [61] capitalized on the insights of the previous work by investigating the performance of MPC in a range of climates and outdoor weather conditions, with the intention of evaluating the climatic conditions that can provide the most energy savings. On average, a 10% reduction in energy consumption is reported that varies with climatic conditions. The work by Wang et al. [62] expanded the MPC implementation to multiple zones with different requirements and thermal dynamics. Energy savings of 24% are reported compared to methods without HVAC control. On another note, in [63], MPC was utilized to control HVAC systems, mechanical ventilation and lighting, resulting in energy consumption reduction in the range of 15-20% compared to reactive control.

The gradual degradation of the MPC methods’ models limits their applicability in a highly dynamic warehouse environment. The shortcomings of the MPC approach are not limited to modelling only. Solving MPC problems is time-consuming and resource-intensive, hindering their deployment on IIoT gateways of limited resources and applicability in real-time scenarios given the dynamic warehouse environment [55, 57]. The models built to reflect the thermal dynamics are specific to the environment or zone they were built on. The implications of this supposition are two-fold. First, the developed models cannot be applied to other zones or buildings. The models should be easily transferable to different zones or warehouses for a warehouse setting. Second, the geographical dispersion of warehouses prevents the use of the “one-size-fits-all” model of the environmental conditions of warehouses given the differences in structural regulations in each geographical location. These factors are critical in devising proper HVAC control in warehouses.

In summary, the traditional methods suffer from three salient limitations that undermine their applicability to reducing energy consumption while maintaining the indoor climate in warehouses. The programmable and scheduled control is oblivious to the dynamicity of the occupancy profile in warehouses. The reactive control is more concerned about short-term goals by reacting instantly to any occupancy changes, which disregards the implications of these myopic decisions on the long-term goals of the energy optimization strategy. This effect is more profound considering the fluctuations in energy prices, occupancy, and weather conditions. As for the MPC method, mathematical models developed to mirror the building’s thermodynamics are prone to degradation due to changing conditions. A summary of different methods and their respective pros and cons is provided in Table II.

This section discusses the technical requirements of the warehouse environment and, accordingly, outlines the methodologies that can fulfil these requirements. These solutions

Method | Summary | Advantages | Disadvantages
Programmable and Scheduled Control [45, 46, 47] | The HVAC control relies on occupancy schedules or manual calibration by occupants | – Only requires occupancy modelling; – Easy to implement due to its limited requirements; – Prioritizes occupants’ comfort | – Mismatch between the rigidity of scheduled control and dynamicity of occupants’ profile; – Violation of thermal conditions resulting from thermostats’ manual calibration
Reactive Control [30, 49, 52] | HVAC systems are triggered based on rules defined by internal climate thresholds or occupancy changes | – Requires modelling occupancy profile and defining climate-based thresholds; – Easy to implement | – Prioritizes fleeting changes over the long-term implications of present setpoint changes; – Divergence between present and setpoint conditions due to hysteresis property
Model Predictive Control [61, 62] | Optimization algorithms are employed to produce an optimal set of HVAC control decisions, which depend on the proper modelling of buildings’ thermodynamics and the prediction of stochastic factors | – Creates an all-encompassing model of the buildings’ thermodynamics; – Factors the stochastic conditions using its prediction capability | – Structural and environmental condition changes degrade the developed models; – Solving MPC problems is of resource-intensive nature

TABLE II: Summary of Traditional Methods
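
The scheduled and reactive rows of Table II can be contrasted with a minimal sketch. The function names, the 22 °C setpoint, the 18 °C setback, the working-hour schedule, and the 0.5 °C deadband below are illustrative assumptions and are not taken from the surveyed works.

```python
def scheduled_target(hour, work_start=8, work_end=17, setpoint=22.0, setback=18.0):
    """Scheduled control: the target depends only on a fixed occupancy schedule."""
    return setpoint if work_start <= hour < work_end else setback


def reactive_target(occupied, setpoint=22.0, setback=18.0):
    """Reactive control: the target follows the occupancy sensed right now."""
    return setpoint if occupied else setback


def heater_command(indoor_temp, target, heater_on, deadband=0.5):
    """Rule-based thermostat with hysteresis (heating case, assumed thresholds)."""
    if indoor_temp < target - deadband:
        return True
    if indoor_temp > target + deadband:
        return False
    return heater_on  # inside the deadband: keep the previous heater state


# A worker arrives at 22:00, outside the assumed schedule.
late_shift_scheduled = scheduled_target(22)    # stays at the setback value
late_shift_reactive = reactive_target(True)    # switches to the setpoint value
```

In this toy case the scheduled rule ignores the late arrival while the reactive rule responds to it, but only after the worker is already present, which is where the recovery (lag) time discussed earlier becomes relevant.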

represent the methods that are part of the virtual processes in the realm of DT.

A. Warehouse Requirements

The warehouse HVAC requirements must be translated to constraints for any proposed HVAC control methodologies. To begin, the warehouse environment experiences high dynamicity in occupancy behaviour, renewable energy generation output, outdoor environmental conditions, and electricity prices. Therefore, the controllers’ ambient environment is unknown, which eliminates the utility of physical modelling encompassing all of these conditions [64]. However, these factors should be incorporated into warehouse HVAC operations to preserve indoor climate conditions. Predicting these values within a pre-defined time window is a viable method to mitigate their associated uncertainty [64].

The HVAC control decisions should account for their implications on future decisions, given the decisions’ temporal correlations [7]. For example, activating the HVAC system to full capacity for precooling purposes leads to an immediate spike in consumption and costs; however, in the long term, this action might prove to be useful. The reaped benefits originate from the probable increase in occupancy or electricity prices compared to current conditions. As such, proper HVAC control necessitates balancing the long-term and short-term goals in its decision-making processes.

B. Candidate Solutions

Facilitated by the widespread deployment of IoT sensors, the ubiquity of data, and powerful computing, data-driven models have proven their merit in improving buildings’ energy consumption [27]. These factors along with the control and actuation capabilities provided by IIoT devices and BEMS bolster Machine Learning (ML) techniques as solutions to concerns related to energy efficiency in buildings [20] and in warehouses in particular. ML is a data-driven method that fits into three disciplines: supervised learning, unsupervised learning, and reinforcement learning (RL). Since evaluating any energy consumption methodology falls under the outlined tasks of predicting important factors and controlling HVAC systems, unsupervised learning is eliminated as a candidate method. Therefore, supervised learning and RL techniques are employed for the energy optimization goal in warehouses. The following subsections explain each of these techniques and their contribution to the problem under study.

1) Supervised Learning: Supervised learning is concerned with learning accurate predictions for a set of labelled data by modelling the relationship between a set of variables, denoted by features or predictors, and the output variable of interest. Depending on the type of output variable, the supervised learning task can be either a classification or regression task. With supervised learning, the predictors and the output variables, which serve as ground-truth data, are always available for training ML algorithms.

In regard to HVAC control, supervised learning can be leveraged to predict the future values of different factors that can facilitate the decision-making process. The number of occupants, occupant proxies, energy load predictions, weather conditions, renewable energy, and energy prices are factors that fall under this category. Many studies incorporated this additional information to improve the HVAC control performance. For example, the works of [52, 65] reported significant energy savings with occupancy-driven HVAC control. Similarly, weather, electricity price, and renewable energy forecasts were incorporated in multiple studies [64, 66, 67] with the goal of providing a better overview of the current environment. Developing models that can accurately predict each of these values can diminish some of their uncertainty. The utility of these models is determined by the time window of the resultant predictions, which is dictated by the predicted factor and the space under study [68]. The discussion about lag time or pre-conditioning time in the warehouse requirements subsection presents a good example of the bespoke window.
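
As a rough illustration of how the prediction window can be tied to the pre-conditioning time, the sketch below converts a lag time into a number of samples and pairs each observation with the value observed one horizon later, which is the ground truth a supervised model would be trained on. The helper names, the 15-minute sampling interval, and the toy price series are assumptions made for illustration only.

```python
import math


def horizon_steps(preconditioning_hours, sample_minutes):
    """Number of samples a predictor must look ahead to cover the lag time."""
    return math.ceil(preconditioning_hours * 60 / sample_minutes)


def make_supervised_pairs(series, horizon):
    """Pair each value x_t with the target x_{t+horizon} used as ground truth."""
    return [(series[t], series[t + horizon]) for t in range(len(series) - horizon)]


# 2.25 h is the shortest lag time reported for residential houses in [53];
# the 15-minute sampling interval is an assumption.
steps = horizon_steps(2.25, 15)

prices = [10, 11, 13, 12, 15, 18, 17, 16, 14, 12, 11, 10]  # hypothetical price series
pairs = make_supervised_pairs(prices, horizon=3)
```

A longer pre-conditioning time, as expected in spacious warehouses, increases `steps` and leaves fewer training pairs for a series of the same length.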

While supervised learning addresses some of the issues at hand, it does not enforce any sequential decision-making. For HVAC control, this factor is essential to realize the energy optimization strategy. As such, supervised learning acts as a complementary piece to any HVAC control strategy.

2) Reinforcement Learning: Compared to supervised and unsupervised learning, RL is a data-driven approach used for sequential decision-making. The RL algorithms possess many advantages that set them apart from other ML approaches.

RL’s advantages: First, RL’s main goal is to balance short-term and long-term goals, reflected by immediate feedback and reward mechanisms. In relation to HVAC control, the trade-off between these goals stems from the need to immediately satisfy indoor environmental conditions versus the uncertainty of future occupancy behaviour, energy prices, and renewable energy sources. Second, RL does not require pre-defined training data to learn. Since it is a control-centred algorithm, RL interacts with its environment and learns sequentially using a trial-and-error methodology. This characteristic is desirable when no optimal strategy for HVAC control is available. Lastly, given that RL methods interact with their environment, they instantly receive feedback about the usefulness of their HVAC control actions, a property similar to accuracy measures in supervised learning [69].

RL’s Elements: RL is applied to control HVAC systems by interacting with an environment that follows a Markov Decision Process (MDP). The RL agent starts interacting with the MDP from an initial state and performs an action, which results in rewards that guide the agent’s future actions. Upon the action completion, the MDP transitions to the next environment state based on the MDP’s transition dynamics. The rewards are accumulated in a time-discounted fashion, which means that less weight is attributed to rewards collected further into the future. While this Markovian modelling simplifies the environment, the definition of the current state can be expanded to include either lagged versions of state variables or their predictions achieved using supervised models. The MDP can be represented using a tuple $P = (\mu_0, S, A, T, \lambda, R)$ such that:
• $\mu_0$ is the initial state.
• $S$ is the state space that reflects the studied environment. This set can include the factors deemed necessary for HVAC control decisions.
• $A$ is the action space that represents the decision taken by the agent to control HVAC systems.
• $R : S \times A \times S \to \mathbb{R}$ represents the reward distribution, where $R(s, a, s')$ is the reward gained when applying an action $a$ to a state $s$ to transition to a state $s'$.
• $T : S \times A \times S \to \mathbb{R}$ is the transition probability distribution, such that $T(s' \mid s, a)$ indicates the probability of transitioning to state $s'$ when taking an action $a$ at a state $s$.
• $\lambda$ represents the discount (decay) factor applied to the accumulated rewards.

The agent’s behaviour when interacting with the MDP is determined using a policy $\pi$ that maps states to actions. The notation $\pi(a \mid s)$ represents the probability of taking an action $a$ at state $s$, representing a stochastic policy. The quality of a state $s$ is determined using a value function. Here, the quality assesses the expected reward that an agent can get starting at state $s$ and following the policy $\pi$.

Limitations of Vanilla RL: Trial-and-error methods allow the RL agent to accumulate experiences involving states and actions, producing a comprehensive environment model. The discretization of the state and action pairs to determine their corresponding value function results in a large number of state-action pairs [69]. The tabular methods that assign a value for each of these pairs are inefficient because they consider all combinations of state-action pairs. In that regard, the growth in these combinations is superfluous as many combinations cannot be encountered in real-world scenarios. If tabular methods are followed, continuous values such as energy prices are partitioned into coarse-grained brackets to mitigate the explosion in the number of possible discrete values. Such a method implies less accurate mappings between the actual values of energy prices and the HVAC control decision. To that end, different function approximation methods were developed to address these shortcomings.

Function Approximation methods: Different feature construction methods can be utilized such as linear and coarse coding methods as a basis for function approximation [70]. However, these methods are either limited by their assumptions (linearity) or require expert knowledge or extra processing to decide some parameters [70]. As a result, nonlinear function approximation using Artificial Neural Networks (ANN) provides the tools to approximate value functions and creates high-dimensional combinations of states using the provided raw data. The combination of ANN and RL is referred to as Deep Reinforcement Learning (DRL).

The integration of ANNs and their more profound variants of Deep Neural Networks (DNNs) results in some prominent advantages. High-dimensional feature combinations of state and action pairs could not have been quantified using linear or polynomial interactions or encountered in real-world data obtained from the trial-and-error method. An additional advantage of using deep learning (DL) is the ability to integrate Transfer Learning (TL) into the mapping of value functions and state-action pairs [55]. As a result, DRL acquires transferability, which is a favourable property for HVAC control in warehouses. The integration of DRL methods for HVAC control is garnering increased attention, as manifested by its prominent adoption in the literature through works such as [55, 67, 69].

3) Advantages of Data-driven methods: Supervised and Reinforcement learning techniques can address the challenges imposed by the warehouse environment and the limitations of the traditional approaches. On the one hand, supervised learning can predict some of the future values of the factors contributing to HVAC control, thereby addressing the shortcomings of reactive and scheduled control. On the other hand, RL combines a long-term outlook, reflected by the reward function, and integrates feedback into its decision-making process. These two characteristics are lacking in other traditional methods, which highlights the superiority of the combination of supervised and reinforcement learning. While this combination improves upon the traditional methods, developing a holistic HVAC control model using these methods for the warehouse environment presents its own challenges.
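
The MDP elements and the tabular limitations described in the Reinforcement Learning subsection can be tied together in a small sketch: continuous energy prices are mapped to coarse brackets, rewards are accumulated with the discount factor λ, and a one-step update assigns a value to each state-action pair. Q-learning is used here only as a familiar tabular method and is not named in the text; the bracket edges, action names, state layout, and learning rate are illustrative assumptions.

```python
import bisect
from collections import defaultdict

PRICE_EDGES = [0.08, 0.15]          # hypothetical bracket edges ($/kWh)
ACTIONS = ["setback", "setpoint"]   # hypothetical HVAC actions


def price_bracket(price):
    """Map a continuous price to a coarse bracket index (0=low, 1=mid, 2=high)."""
    return bisect.bisect_right(PRICE_EDGES, price)


def discounted_return(rewards, lam):
    """Sum of rewards where the reward at step t is weighted by lam**t."""
    return sum((lam ** t) * r for t, r in enumerate(rewards))


def q_update(q, state, action, reward, next_state, alpha=0.5, lam=0.9):
    """One-step tabular Q-learning update for the pair (state, action)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + lam * best_next - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)                  # value table, all entries start at 0.0
s = (price_bracket(0.05), True)         # state: (price bracket, occupied flag)
s_next = (price_bracket(0.20), True)
first_value = q_update(q, s, "setpoint", reward=-2.0, next_state=s_next)
```

With three price brackets, a binary occupancy flag, and two actions, the table already holds twelve state-action pairs; every additional state variable or finer bracket multiplies that count, which is the growth that motivates function approximation.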

VI. SUPERVISED LEARNING IN WAREHOUSES: CHALLENGES AND POTENTIAL SOLUTIONS

The optimal HVAC control is predicated on the predictions of specific factors that include outdoor and indoor environmental conditions, energy load predictions, and occupancy profiling. Compared to other factors, occupancy profiling is less straightforward to quantify. The subsequent sections involve an extensive discussion of occupancy profiling methods, the availability of datasets to quantify occupancy, and the adopted feature engineering techniques, which represent the steps necessary for predicting occupancy in a specific space. The discussion analyzes the current state of the art, its challenges, and prospective solutions for each step.

A. SL1: Occupancy Inference Method

Multiple sensors can accurately collect the occupancy behaviour in a specific space. Classifying occupancy inference methods depends on the data acquisition of occupants’ presence and movement, which can be divided into two main categories. The first category requires the direct involvement of occupants by collecting their identities upon arrival or using videos or images of the involved space to infer the number of occupants [71]. The second category is predicated on proxy estimators of occupants using either motion sensors [30], Wi-Fi signal disruptions [72], or CO2 levels [73] in specific residential areas.

The first category of methods is intrusive and raises some privacy concerns. Even if participants’ consent is granted, these adopted occupancy inference methods also expose some technical and practical limitations. Works such as [65, 71] explored image- and video-based occupancy inference. These methods require the adoption of Deep Neural Networks (DNNs) that are data- and resource-intensive algorithms. Additionally, the process of obtaining ground-truth data on occupants’ numbers requires manual labour, which adds monetary concerns to the building operators. Furthermore, this segment of research is in its nascent stages with no performance guarantees. Lastly, surveillance cameras need to be mounted on warehouse zone levels to effectively capture occupants and their activity. To realize this function, significant financial investments should be carried out on the warehouse operators’ end. The confluent factors undermine the image- and video-based occupant inference.

Inferring the number of occupants using proxy indicators has many advantages. The proxy indicators can include temperature, activity levels, CO2 concentrations, and Wi-Fi connectivity. All these indicators preserve the occupants’ privacy, which represents a salient concern for surveillance-based methods. In the case of Wi-Fi signals, anonymization of MAC addresses can address privacy concerns [74]. The listed proxy indicators are already part of the warehouse infrastructure so no new equipment needs to be installed for occupancy inference.

Each of these indicators reflects an aspect of occupancy, which can be leveraged to infer the occupant count [75]. The rise in indoor temperature can be attributed to the thermal energy produced by new occupants. However, multiple factors can affect the indoor temperature such as the ones explained in the Large Air Leaks challenges. The Wi-Fi connectivity, activity levels, and CO2 can serve as better indicators of occupancy. For example, Wang et al. [74] employed ML techniques to predict Wi-Fi connectivity counts and compared them to manually collected ground truth data in an office environment. Their methodology accurately predicted the number of occupants; however, some limitations compromise their approach’s applicability. The association between the Wi-Fi connectivity count and occupancy is based on the assumption that the occupants will undoubtedly use their cell phones when joining a specific area. The reported results showed that this method failed to predict peak occupancy as a result of this strong assumption. The limitations of this method also extend to the adopted feature engineering technique, which extracts time-based features predicated on the existing data periodicity. Again, this assumption precludes the method’s implementation in situations that are not compliant with these periodic trends. Here, the assumptions of Wi-Fi connectivity counts are unfit to be implemented in a warehouse environment. This analysis leaves the CO2 and activity levels as good indicators of occupancy.

The activity levels can be obtained using a Passive Infrared (PIR) sensor that increments an internal counter when a certain activity is captured. The workers’ constant movement to load, unload, or search for inventory boosts the chances of employing these sensors to reflect the number of workers in a warehouse. These sensors’ deployment in a warehouse environment presents a more compelling case than their deployment in an office environment characterized by minimal movements. This fact is challenged by the possibility of counting a single occupant twice while moving in opposite directions during the sampling periods. Therefore, the overestimation of current occupants is possible when employing PIR sensors.

CO2, which occupants produce through respiration, is a lagged indicator of occupancy [68]. The double-counting phenomenon of activity levels and the interference of other influencing factors for temperature are not encountered when utilizing CO2 concentrations. The existence of occupants in a specific area automatically influences the collected CO2 concentrations. Therefore, the variability in CO2 concentrations in a specific time window mirrors the change in the number of occupants in a past time window. The CO2 concentrations are also used to trigger the ventilation system, which means that sensors that record these concentrations are part of the warehouse infrastructure. Since it is a lagged indicator, supervised learning techniques can be leveraged to predict the future values of CO2 concentrations using the current state of the environment. One concern with associating CO2 variability with the occupancy change is linked to the type of activity executed by occupants. The work by Kapalo et al. [76] demonstrates the differences in CO2 production as a result of the physical activity, which can be projected to the warehouse environment. A worker that is unloading or re-racking the inventory drastically affects the CO2 concentrations compared to a worker that is walking around. This factor must be considered when solely relying on CO2 concentration changes to quantify occupancy changes.
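
One way to express the lagged relationship between CO2 and occupancy is to build features from past CO2 readings together with their change over a window. The sketch below is a minimal illustration; the function name, the lag choices, the window length, and the readings (in ppm) are hypothetical.

```python
def co2_lag_features(co2, lags=(1, 2), window=3):
    """For each step t, collect past CO2 readings and the change over `window` samples."""
    start = max(max(lags), window)  # first index with enough history
    rows = []
    for t in range(start, len(co2)):
        lagged = [co2[t - k] for k in lags]   # readings k samples in the past
        delta = co2[t] - co2[t - window]      # rise or fall over the window
        rows.append({"t": t, "lagged": lagged, "delta": delta})
    return rows


readings = [420, 425, 440, 470, 510, 530, 525, 500]  # hypothetical CO2 samples (ppm)
features = co2_lag_features(readings)
```

A positive `delta` hints at occupants who arrived during the preceding window and a negative one at departures; because CO2 production depends on the activity performed, the same `delta` may correspond to different head counts.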

B. SL2: Dataset Availability concentrations. The comprehensive nature of this dataset in

terms of the period of the collected data and the inclusion
The state-of-the-art provides ample publicly available of different rooms are favourable to be projected into the
datasets for quantifying occupancy using either proxy esti- warehouse environment. These properties allow the mapping
mators or manual counting. These datasets include an array of of the warehouse multi-zone spaces to different rooms; thus,
environmental features that were deemed necessary to estimate assessing the transferability of any occupant-based models on
the number of occupants in a specific space. The work of Dong a local level.
et al. [77] gathered a database with datasets including occupant
count and behaviour in buildings across different geographical
locations. The aim of this database is to be employed for
C. SL3: Feature Engineering
advancing the field of building control by linking occupant
behaviour to building energy consumption. Using the raw data as inputs to any supervised ML model
Since these datasets are collected in an office environment, often leads to poor prediction results, which, in the context
it is crucial to analyze them using the warehouse environment of the problem, can prove disastrous to the inventory and
lens. This analysis incorporates the warehouse challenges and the goal of energy optimization. Therefore, creating features
the possible projection or extrapolation of warehouse-specific should follow the requirements of the prediction tasks and
phenomena. As a first step to realizing IoT-based applications the studied environment. Towards that end, multiple layers
in warehouses, researchers and practitioners should exploit the datasets collected in residential buildings. Therefore, the warehouse-based literature can be enriched and the utility of IoT-based applications in warehouses can be exhibited. Together, these factors would serve as proofs of concept for future real-world implementations. The following analysis is not based on comprehensive experimentation with the available datasets, but is rather a shallow overview of the applicability and the technical viability of these datasets with respect to the warehouse environment. An extensive classification and ML-based analysis should be conducted for these datasets, which will be part of our future work.

The literature employing these datasets is dominated by methods predicting occupant behaviour status, door status, lighting status, and plug load, to name a few. These methods are office-specific, which disqualifies any of them from the warehouse-related analysis, as they cannot serve as occupant proxies. For example, an alley lights up with the existence of occupants, which means that the change in lighting status is not indicative of changes in occupant count. On the other hand, the occupancy counts included in some datasets are rare and are collected assuming the existence of a manual labelling procedure. This procedure is instrumental for validating any occupant prediction method. Moreover, the gathered datasets are characterized by their locality, whereby they are collected in either a single space or a single geographical area. These constraints undermine the developed prediction methods because their transferability cannot be evaluated. This property is fundamental to warehouses given the need for local transferability across zones and general transferability across warehouses. Most of the datasets do not capture a whole year’s worth of data, implying the omission of climate- and occupant-specific trends observed in some seasons or months from any developed model. These limitations render these datasets unfit for the warehouse environment. Such circumstances call for a more thorough investigation of the occupancy estimation field.

This investigation led us to a single work that addresses the data scarcity aspect and encompasses data gathered from rooms of different capacities. The work by Kallio et al. [73] made available a dataset that includes a set of environmental features to build models for predicting the future CO2

in the feature engineering step should be explored for proxy occupancy prediction from time-series data.

The literature is dominated by works that extract time-related features due to the inferred periodicity and seasonality of different features evidenced by the exploratory data analysis. These features include the day of the week, the hour, season-related features, and whether the day is a holiday. In the warehouse setting, these features are of little to no importance due to the separation in the working hours of warehouse workers and residential and office occupants, as previously established in the challenges section. Therefore, if 8 a.m. and 5 p.m. are features of high importance for an office job, this saliency is blurred for warehouses. In the case of the holiday feature, the role of the warehouse in the supply chain necessitates its constant operation. These conditions diminish the effect of holidays in determining occupancy. The warehouses are scattered over large geographical areas, which means that local features such as holidays or seasons are not uniform across geographical areas. The incorporation of these features can affect the transferability of the developed supervised models across warehouses.

The time-dependent nature of environmental features in connection with occupancy is an aspect to incorporate into the feature engineering steps. In particular, features such as pressure, temperature, CO2 concentrations, light status, and activity levels are all affected by the existence of occupants either momentarily or after some time. The light status and activity levels belong to the set of features that automatically reflect the existence of occupants, while the other mentioned features lag behind the existence of occupants. These factors necessitate the inclusion of lagged environmental features to predict any occupancy proxy estimators.

The literature related to occupancy count estimation adopted different feature engineering methods, ranging from the use of raw sensory data to time-dependent features and lagged environmental features. Due to the temporal nature of occupancy prediction, methods that use raw data utilize different flavours of Long Short-Term Memory (LSTM) algorithms [74, 78]. On the other hand, works such as [68, 73] used lagged features while [79] utilized different time-dependent features. The challenges and their solutions are summarized in Table III.
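The conventional time-related features discussed above (hour, day of the week, month, holiday flag) can be illustrated with a minimal sketch. It is not taken from the surveyed works: the `HOLIDAYS` calendar and the function name `time_features` are hypothetical, and the holiday set is exactly the kind of location-specific input that the text argues transfers poorly across warehouses.

```python
from datetime import date, datetime

# Hypothetical, location-specific holiday calendar (an assumption for illustration).
HOLIDAYS = {date(2019, 1, 1)}


def time_features(ts: datetime) -> dict:
    """Derive conventional time-related features from a single timestamp."""
    return {
        "hour": ts.hour,                       # 0..23
        "day_of_week": ts.weekday(),           # Monday = 0, Sunday = 6
        "month": ts.month,                     # coarse stand-in for season
        "is_holiday": ts.date() in HOLIDAYS,   # depends on the local calendar
    }


features = time_features(datetime(2019, 1, 1, 8, 30))
```

Because every field except `hour`, `day_of_week`, and `month` depends on a local calendar, a model trained with such features in one region may not carry over unchanged to a warehouse elsewhere.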

Challenge | Summary | Solution

Occupancy Inference Method [30, 65, 71, 74] | – Occupancy can be inferred by the direct involvement of occupants; – Proxy estimators such as CO2 concentrations and activity levels can be used | – Proxy indicators preserve occupants’ privacy with no additional costs; – CO2 concentration is a great candidate to determine occupancy changes

Dataset Availability | – The datasets are collected in residential settings; – The datasets suffer from their limited time frame and assumptions about the ability to manually count occupants | The dataset of [73] and select datasets in [77] involve a comprehensive set of environmental features collected in multiple rooms across the years

Feature Engineering [68, 73, 78] | The unique characteristics of the warehouse environment undermine the conventional time-based feature engineering techniques | Stemming from the co-dependence of environmental features, including lagged versions of environmental features is instrumental for feature engineering

TABLE III: Summary of Supervised Learning Challenges
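The feature engineering solution listed in Table III, adding lagged versions of environmental features, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `add_lagged_features`, the column names, and the choice of lags are hypothetical, and the readings are assumed to be ordered in time at a fixed sampling interval with no gaps.

```python
def add_lagged_features(rows, columns, lags):
    """Append lagged copies of selected columns to each sample.

    rows    : list of dicts, one per time step, ordered in time (assumed gap-free)
    columns : names of the environmental features to lag
    lags    : positive integer offsets, in time steps
    """
    max_lag = max(lags)
    samples = []
    # The first max_lag rows lack a full history and are skipped.
    for t in range(max_lag, len(rows)):
        sample = dict(rows[t])
        for col in columns:
            for lag in lags:
                sample[f"{col}_lag{lag}"] = rows[t - lag][col]
        samples.append(sample)
    return samples


# Hypothetical readings: CO2 in ppm and temperature in degrees Celsius.
readings = [{"co2": 400 + i, "temp": 20 + i} for i in range(5)]
lagged = add_lagged_features(readings, columns=["co2"], lags=[1, 2])
```

Each resulting sample keeps the current readings and gains `co2_lag1` and `co2_lag2`, so a model can relate a slowly responding quantity such as CO2 to its recent past.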

VII. DEEP REINFORCEMENT LEARNING: CHALLENGES AND POTENTIAL SOLUTIONS

This section investigates the challenges of applying DRL to control warehouse HVAC systems. Each subsection details these challenges’ roots and suggests solutions based on the available literature and the authors’ experience.

A. DRL1: Transferability

This subsection explains the warehouse environment conditions that prompt the transferability challenge in DRL. Later, use cases that showcase these challenges in connection with DRL are explained and future directions are investigated.

1) Facilitating Factors: The transferability property is fundamental to any developed model, but it gains more importance in the warehouse environment. Warehouses are in a constant state of updating their business strategies to accommodate new customers or adapt to any disruptions in supply chains caused by changes in the economies or the occurrence of unexpected events. These factors contribute to the expansion or shrinkage of the inventory. As a result, warehouse operators construct new zones or brand-new buildings, which require a fully functioning HVAC control method and potent supervised learning methods without historical data. The changes to the warehouse can also be localized, such as retrofitting efforts meant to address the wear and tear of warehouses. The newly created environments diverge from the old ones in terms of their internal environmental dynamics, which is consequential to a DRL agent with insufficient experience to make an informed decision.

All these conditions facilitate the application of TL methods. TL is a method to transfer the knowledge acquired on a source dataset to initiate learning in the target dataset [80]. TL is applied by transferring the weights learned from the source task to the target task, which in this case represents the value functions. Due to differences between source and target tasks, re-tuning of the models using the target dataset is applied. This process is achieved when some neural network layers are frozen to retain information from the source task while others are retrained to introduce new information to the model. TL is required whenever little to no information is available to train the models specific to the target task. The implications of applying specific HVAC control actions and the partial observability of the environment are prohibitive for exploratory interactions with the environment. TL techniques allow for a smooth transition phase, leveraging the previously acquired knowledge until enough data is accumulated to cater to the new task at hand.

TL methods can be employed in a warehouse environment on two levels. On a general level, the warehouses owned by companies are in a constant state of expansion, driven by the changing needs of economies and supply chains. Therefore, the establishment of warehouses over large geographical areas is favourable for increasing profits. As a result, environmental conditions and local legislation specific to each area challenge the seamless transfer of models to new locations. On the local level, the TL methods can be applied to warehouses’ multi-zone space. A DRL agent developed in a zone can be transferred to another zone.

The direct transfer of developed models to a new building or zone is faced with multiple hurdles. The degree of similarity between the source and the destination task needs to be quantified to avoid negative transfers [81]. This prerequisite entails a thorough investigation of the structural, climatic, and internal warehouse dynamics similarity. Experts can resort to the physical modelling of this similarity when the data is absent, which can potentially meet this requirement. However, if sufficient data exist, this similarity can be quantified using indoor climatic phenomena such as heat transfer [82]. An additional layer of concern is attributed to the selection of a convenient source or reference warehouse, especially when many warehouses or zones are involved, which can be realized using many criteria. For example, when different climates are involved, the warehouse experiencing the full spectrum of seasons can be used as a reference warehouse for other warehouses that experience mild seasonal changes. Another example that better applies to the warehouses pertains to patterns of consumption. A warehouse that is close to populous metropolitan areas aggregates the consumption models of sparsely populated areas, which can better serve as a reference warehouse. This discussion represents a broad analysis of the

transferability aspect in warehouses in general, but it is yet to analyze the TL mechanisms in DRL.

2) Use Cases: In the field of TL applied to DRL, the differences between the source and the target buildings affect any component of the MDP, which includes the state space, action space, reward function, transition dynamics, and policy [83]. Few works have applied TL in the context of DRL for HVAC building control. These works are based on strong assumptions of uniformity of the state and action spaces and of the reward functions [84, 85]. However, such conditions are rarely encountered on a local level across the warehouse zones and on a global level across the warehouses scattered over the globe. These conditions engender two broad challenges, one related to the indoor climate requirements of each zone and the second pertaining to structural and automation level differences between warehouses. Use cases that highlight these aspects will be detailed.

Multi-zone Space: The warehouse’s multi-zone space essentially means that HVAC control systems should account for a distinctive set of environments. For example, the flammable zone should include sensors, known as Metal Oxide Gas (MOX) sensors, that monitor the hazardous gases or toxic vapours that are common in factories and chemical warehouses. These sensors are connected to alarm systems for fast detection of any leakages that pose occupational hazards for workers when inhaled [86]. Meanwhile, these conditions should activate the ventilation systems to curb the instantaneous effects of these gases. On the other hand, the dry zone requires a special set of sensors that does not include hazardous gas detectors. Therefore, if the initial DRL model is developed in the dry zone, the transferability of this model is hampered by the inclusion of the MOX sensors. Another example of the changes in state space is caused by different climates. Differences in climates introduce new dynamics to the indoor environmental conditions and new variables to the state space. While temperature and humidity are sufficient indicators of outdoor conditions in hot arid climates, rainfall and snowfall amounts are crucial to better understanding the outdoor environmental conditions in humid climates. These subtle nuances are fundamental to the HVAC control and are accordingly integrated into the state space. The expansion of the state space has a trickle-down effect on all MDP components that should incorporate this new state, a situation that is foreign to conventional TL techniques.

Warehouse inventory is stacked with items that cater to the requirements of the customer base they serve. For example, agrarian communities are more conscious of their consumption, which explains their disposition for fresh and less processed food with limited shelf life. On a different note, crops produced by these communities should be monitored to maintain their quality for future transportation. Based on these factors, the preferences of the nearby populace and their economic activities are major determinants of the types of stored items. These conditions require integrating sensors that can detect the deterioration of perishable items [87], which is not a common requirement across warehouses. Similar to other cases, these conditions alter the state space, ruling out the use of conventional TL techniques.

Structural and Automation Levels: The distinctive levels of automation and technological advancements in warehouses reflect on the state space and the dynamics of the indoor climate. The process of picking up, loading, and unloading inventory is carried out manually by forklifts or automatically by automated storage and retrieval systems. The transition from one technology to another is yet to be fully materialized, and the path to full automation is still in its nascent stages [88]. The carbon emissions of forklifts alter the indoor dynamics, including the sensor readings, compared to a more automated warehouse. Even if the state and action spaces are uniform, these circumstances cause a divergence of the transition dynamics between the source and the target warehouses. The integration of renewable energy sources in the operation of HVAC control is another use case of the effect of technological advancement. The incentives of local governments, the availability of installation equipment, and the enabling ambient climate are all important factors that dictate the viability of integrating renewable energy sources into the warehouses’ power grid. Examples of these programs include Ontario’s Clean Energy Credit Registry and the European Green Deal, which promote the replacement of traditional energy sources [89, 90]. The effective integration of renewable energy sources augments the state and action spaces. On one end, the action space encompasses two sets of actions. The first set is concerned with the control of HVAC setpoints, which is a common theme in warehouses. The second set involves scheduling the HVAC system to draw on either the conventional electric grid or renewable energy. In a similar vein, the state space is appended with past, present, and future renewable energy levels that affect the decision-making process of the DRL agent.

The last portion of the transferability analysis is related to the MDP’s reward changes. As previously explained, the reward function shapes warehouse operators’ priorities, which can be categorized into two themes: reducing energy consumption and maintaining indoor environmental conditions within predefined setpoints. However, augmenting the electric grid with renewable energy and the thermal requirements of the inventory contribute to changing warehouse operators’ objectives. The storage medium of renewable energy sources is prone to gradual degradation and inefficiencies due to the charging and discharging cycles [9], and its charging levels should be maintained within specific thresholds to avoid possible spoilage. The reward function would integrate this aspect in warehouses with renewable energy generation. The macro-theme of maintaining indoor environmental conditions hides many subtleties that have a consequential effect on perishable items. Extreme fluctuations in indoor climate can result in adverse effects on perishable goods [91]. To achieve a reduction in energy consumption, the indoor climate can be susceptible to large swings in thermal conditions. Therefore, the permissiveness of these changes depends on the type of perishable items, which needs to be factored into the reward function.

Potential Solutions: The challenges of DRL transferability are associated with the warehouse environment and its requirements. Zooming into each of these challenges shows that they

are shared with the challenges of the IoT sensors. At their core, the hurdles of transferability are related to the non-uniformity of the source and target domains, manifested by changes in the state and action spaces. In the DRL framework, such an alteration affects the neural network (NN) architectures that map the state-action pairs to their value, known as value functions. Surveying the literature does not yield any decisive methods for addressing these concerns. Thus, it is important to tackle such concerns from a fundamental angle based on the representation of state and action spaces. Finding a common representation between any different sets has been extensively studied in the literature. These methods predominantly project these sets into a shared dimension achieved using linear methods, such as Kernel Principal Components and its special use cases [92], and non-linear methods such as Autoencoders and their variations [93]. However, applying these methods misses each environment’s shared aspects, which are reflected by their respective NN’s architecture and trained weights. The commonality in feature space is mirrored by the commonality in the NN’s architecture [94]. Therefore, methods that extract these shared aspects are good candidates to address the non-uniformity of NN architectures induced by changes in the action and state spaces. While originally tailored to different data representations, such as audio or video, multi-view representation learning [95] is a suitable candidate to tackle TL in divergent environments. A variation of this method was implemented in a Federated Learning (FL) use case, whereby the local models have heterogeneous NN architectures [96]. The direct application of this type of learning provides insights into the shared facets of these architectures by generating common representations. While not previously implemented in the realm of TL, this suggestion can spawn intriguing applications to address the challenges of TL in the warehouse environment. An extra benefit of these approaches is quantifying NN similarity and addressing different NN architectures with common features.

In terms of the variation in warehouse indoor climate dynamics and alteration of the reward function, reward shaping (RS) [97] is a mature field developed to address these two issues. RS exploits external knowledge to restructure the reward to re-tune the agent’s policy. It is important to find the type of applied TL techniques in all use cases. These techniques are zero-shot transfer, few-shot transfer, and sample-efficient transfer [83]. Inferring the type of transfer is predicated on the similarity between the source and target warehouses under study.

B. DRL2: Scalability

This subsection outlines the root causes of the scalability challenge and its implications on DRL models.

Root Causes: The common approaches in HVAC control-related literature delegate this control to a single RL agent. However, this strategy is impractical and exposes some limitations. First, this strategy is based on loose assumptions of the uniformity of indoor climate dynamics between different rooms or halls. The location of these rooms, their sizes, and the occupancy patterns are all factors that erode the foundations of such an assumption. Second, collecting the sensory data of each room and mapping these states to actions specific to each room causes an explosion in inputs to the NN representing the value function. The convergence to an acceptable solution demands considerable data that would not necessarily be available. Fortunately, the research community proposes multi-agent DRL to address scalability issues by breaking down the MDP environment into co-dependent ones to reduce the complexity of training each agent. No concrete methods are followed to demarcate the state and action spaces of the spawned DRL agents. However, building structures can easily define these agents based on the rooms or thermal zones.

Potential Solutions: On the warehouses’ end, multi-agent DRL offers multiple benefits. First, each agent can train its model based on the requirements and the occupancy behaviours of each zone, which conforms with the needs of warehouse operators. Each agent defines its state and action spaces and reward functions. Second, the decoupling of the requirements of each zone provides more tractable and easier-to-train agents given the reduction in the state and action spaces and the data supplied for training.

The multi-agent algorithms can be divided into three groups: fully cooperative, fully competitive, and a hybrid approach [98]. In short, the cooperative agents work towards optimizing a long-term goal that is represented by the reward function. On the other hand, competitive agents interact with their environments to yield a zero-sum game. Lastly, the mixed model includes competitive and cooperative aspects. Assigning the multi-agent space of the warehouse HVAC control requires an in-depth analysis of the environment dynamics that govern each zone. Despite the prominent distinction in each zone’s indoor requirements, the main goal of HVAC control in each zone is to maintain that zone’s climate within specific thresholds and reduce energy consumption. Therefore, the long-term goals are shared between the agents, alluding to the cooperative nature of the multi-agent environment. This type of cooperative multi-agent DRL is referred to as team-average reward [99]. However, this cooperative regime is defied when renewable energy is integrated into the equation. In this case, the agents have to compete to exploit this rare resource to reduce their contribution to energy consumption, which separates a cooperative multi-agent environment from a hybrid one. The literature provides some methods to address the mixed multi-agent DRL environment such as decentralized Q-Learning [100] and Morse-Smale games [101]. These methods are yet to be applied in building settings, let alone in the warehouse setup.

Under the cooperative or hybrid paradigm, the multi-agent environment faces challenges related to non-stationarity. The environment’s non-stationarity is well-established in the air leaks challenges. In particular, when the agent controlling the cold zone applies an action, this action will affect its zone through HVAC functions and other zones by air diffusion. Therefore, the perceived environment of one agent is affected by other agents’ actions, which invalidates the basic assumptions of stationarity in a single-agent setting. Incorporating the actions of other agents into the state space of one agent is a shortcut that can mitigate the non-stationarity of the

Fig. 2: Summary of DRL Challenges
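Among the DRL challenges summarized in Fig. 2, the change in the reward function when renewable storage and perishable inventory are involved lends itself to a short illustration. The sketch below is only one possible formulation: the weights, the temperature band, the state-of-charge band, and the function name `hvac_reward` are all hypothetical assumptions and are not values or methods reported in this paper.

```python
def band_violation(value, band):
    """Distance by which value falls outside the closed interval band = (low, high)."""
    low, high = band
    return max(low - value, 0.0, value - high)


def hvac_reward(energy_kwh, zone_temp_c, storage_soc,
                temp_band=(2.0, 8.0),     # assumed band tolerated by the stored items
                soc_band=(0.2, 0.9),      # assumed safe state-of-charge range
                w_energy=1.0, w_temp=10.0, w_storage=5.0):
    """Negative weighted cost: energy use plus penalties for leaving either band."""
    cost = (w_energy * energy_kwh
            + w_temp * band_violation(zone_temp_c, temp_band)
            + w_storage * band_violation(storage_soc, soc_band))
    return -cost


# Within both bands, only the consumed energy is penalized.
r_ok = hvac_reward(energy_kwh=1.0, zone_temp_c=5.0, storage_soc=0.5)
# Two degrees above the band adds a temperature penalty.
r_hot = hvac_reward(energy_kwh=1.0, zone_temp_c=10.0, storage_soc=0.5)
```

Narrowing `temp_band` for more sensitive perishable items, or dropping the storage term in warehouses without renewable generation, changes the reward and therefore the MDP that a transferred agent faces.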

environment. However, the connectivity issues that may arise in the IoT environment hinder the applicability of such an approach. If this issue is addressed, the explosion of each agent’s state space is another issue to consider. All of these conditions present considerable challenges to applying multi-agent DRL systems in the warehouse environment. Figure 2 depicts a schematic that summarizes the DRL challenges.

VIII. USE CASE STUDY

This section presents a use case to highlight the above-mentioned environment-specific and solution-specific challenges. An extensive discussion about the supervised learning-related challenges is conducted to motivate their connection to the warehouse environment’s structural challenges.

A. Objectives

This manuscript has extensively explained the myriad challenges related to the field of energy reduction in the warehouse environment. Since the literature and the research community are yet to investigate this theme, the use case study is applied to the residential environment. However, the analysis, discussions, and conclusions are projected onto the warehouse environment. Toward that end, a publicly available dataset collected in a residential building is employed [102]. Based on the detailed structural- and solution-based challenges, the objectives of this use case fall into three main themes:

• Showcase the effect of different feature engineering techniques on the obtained results, which is connected to the supervised learning challenge SL3 (o1);
• Investigate the contribution of the features resulting from the feature engineering technique to the prediction accuracy and draw conclusions linked to the challenges of the warehouse environment (o2); and,
• Analyze the predictions’ divergence in light of the warehouse challenges (o3).

Each of these objectives is studied independently and traced to the physical phenomena taking place in the warehouse environment. The code is available on the GitHub repository¹. Such an analysis is instrumental in demonstrating the challenges of the warehouse environment and their effect on the applied supervised learning procedures. The discussions in connection with this environment spawn many research questions to be addressed by the research community. Addressing these hurdles is crucial not only from the standpoint of intellectual curiosity but even more so because of the challenges’ downstream effect on the supply chains and the Earth’s ecosystem.

¹ https://github.com/Western-OC2-Lab/Data-driven-Methods-for-the-Reduction-of-Energy-Consumption-in-Warehouses-Use-Case.git

B. Dataset Description

The shortcomings of the datasets mentioned in SL2 are partially addressed in the dataset used [103] for this case study. This dataset includes data points collected at one-minute intervals in various seasons across two years in a building that includes 7 floors. This property engenders a comprehensive dataset that mitigates the scarcity of data, allows extensive analysis of the seasons’ effect on any predicted property, and facilitates the analysis of the methods’ transferability on different floors and single-floor zones. Lastly, this dataset

Approach | Algorithm | Parameters

a1 | 1D-CNN | NCL = {32, 64, 128}; FL = {1, 2, 3}; NFL = {32, 64, 128}; CL = {1, 2}

a2 & a3 | Ridge Regression, Lasso Regression | λ = 0.0001 −−→ −0.5 (arrow annotated with 10)

a2 & a3 | Gradient Boosting Trees, Random Forest Trees | MD = {3, 5, 10}; MSS = {2, 5, 10}

TABLE IV: Hyper-parameters of the approaches
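The discrete search spaces in Table IV can be enumerated exhaustively, which is the idea behind a grid search. The sketch below is a generic illustration and not the code used in this study: the helper names `grid` and `grid_search` are hypothetical, and `evaluate` stands in for whatever routine trains a model with the given hyper-parameters and returns its validation error.

```python
from itertools import product

# Search spaces copied from Table IV (the λ range for ridge and lasso is omitted).
CNN_SPACE = {"NCL": [32, 64, 128], "FL": [1, 2, 3], "NFL": [32, 64, 128], "CL": [1, 2]}
TREE_SPACE = {"MD": [3, 5, 10], "MSS": [2, 5, 10]}


def grid(space):
    """Yield every combination of hyper-parameter values as a dict."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(space, evaluate):
    """Return (best_params, best_error) for the combination with the lowest error.

    evaluate is a caller-supplied function (hypothetical here) mapping a
    hyper-parameter dict to a validation error such as the MAE.
    """
    scored = ((params, evaluate(params)) for params in grid(space))
    return min(scored, key=lambda pair: pair[1])


n_cnn = len(list(grid(CNN_SPACE)))    # 3 * 3 * 3 * 2 combinations
n_tree = len(list(grid(TREE_SPACE)))  # 3 * 3 combinations
```

The cost of this brute-force strategy grows with the product of the list lengths, which is manageable for the small spaces above.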

Fig. 3: Floor 6 zonal visualization

encompasses load-based data such as electricity consumption of Air Conditioning (AC) units, lighting, and plug loads, which can be employed as occupancy indicators.

The dataset captures the sensory data in an academic office building at Chulalongkorn University in Thailand. The collected data includes the energy consumption of individual AC units, plug loads, lighting loads, and environmental sensor data, including temperature, humidity, and perceived luminance. Some rooms include multiple AC units, referred to as AC i, such that i represents the number of the AC unit. The data is split over seven files, each representing the data captured on one floor. Floors 1-2 are divided into 4 zones whereas floors 3-7 are split into 5 zones. Floor 6, which is divided into 5 zones, is chosen for analysis purposes as it includes more zones to investigate and enough data to conduct the extensive experimental procedure. However, this same analysis can be applied to other floors that share the same architectural characteristics.

C. Experimental Design and Procedure

The floor under study includes five zones of different sizes and exposures to outside environmental and indoor environmental conditions due to the adjacency to other zones, the available data, and the placement of sensors. The last factor determines how biased the data is toward a particular area in a zone. The decision to choose any of these five zones is predicated on fulfilling these conditions. Based on these conditions and the visual representations of the zones depicted in Figure 3, adopted from [102], the analysis could have been conducted in all zones except zone 3. To limit the purported effect of the adjacency to other zones on the predictions and the effect of the sensors’ placements, the square-shaped zone 2 was chosen.

The analysis is divided into three phases to align with the set objectives. The first objective targets SL3, which suggests three main approaches for Feature Engineering to tackle the nature of the data:

• Approach 1 (a1): This approach uses 1D-CNN fed by raw input data;
• Approach 2 (a2): This approach combines the historical data of each feature and the time-related features that are fed to different ML methods;
• Approach 3 (a3): This approach considers generating lagged versions of environmental features within a specific historical time window.

The analysis applied to the first phase singles out the best approach and discusses the hyper-parameter optimization process. All the methods predict the AC unit’s energy consumption as it is the best indicator of lagged occupancy out of the collected features. The application of the best method is scrutinized in light of the warehouse environment. Next, the second phase investigates the best methods’ features. In particular, this phase is concerned with extracting the feature importance and the set of features that contribute to better predictions of energy consumption. In concert with the first phase, this phase also highlights these features’ relevance in a warehouse environment. The last phase links the prediction inaccuracies to the studied zones’ ambient environment. The possible associations between these divergences and the physical phenomena in multi-zone spaces are pointed out. All of these methods are evaluated using the Mean Absolute Error (MAE) metric.

D. Experimental Parameters

Obtaining satisfactory prediction results requires making the best out of the available features and adopted algorithms. Each algorithm used in any of the three approaches a1, a2, and a3 involves a set of hyper-parameters that need to be tuned based on the available data. This process, referred to as hyper-parameter optimization (HPO), strives to find the ideal ML architecture with the optimal set of hyper-parameters. Many methods can be employed to achieve this goal, ranging from brute-force approaches such as Grid Search to more intelligent, feedback-based approaches such as Bayesian Optimization [104]. This work applies Grid Search to infer the best set of hyper-parameters.

Approach a1 leverages CNNs, which means that the number of convolutional layers (CL), the number of filters per layer (NCL), and the number of layers (FL) and neurons per layer (NFL) in the fully connected layer should be tuned. Approaches a2 and a3 experiment with a host of algorithms ranging from linear algorithms such as lasso and ridge regression to non-linear ones that include Gradient Boosting Trees,

and Random Forest. Each of these algorithms includes a set of hyper-parameters to tune. In particular, the lasso and ridge regressions require tuning the penalty parameter of their objective function (λ). The tree-based algorithms require tuning the maximum depth (MD) of their grown trees and the minimum number of samples required to execute a split (MSS). In addition to the HPO process, it is important to define the history and future time windows. The history time window is important for approaches a1 and a3 that use lagged versions of features. The future time window defines the prediction horizon. The history and future time windows are defined as h-w f-w, such that w ∈ {5, 10, 15, 20}. The summary of these parameters is provided in Table IV.

Window | Method | Algorithm | Parameters | MAE
h-10 f-10 | a1 | 1D-CNN | NCL = [64], NFL = [32] | 1.11
h-10 f-10 | a2 | Gboost | MD = 10, MSS = 2 | 5.71
h-10 f-10 | a3 | Gboost | MD = 3, MSS = 10 | 0.91
h-20 f-20 | a1 | 1D-CNN | NCL = [64], NFL = [64] | 1.13
h-20 f-20 | a2 | Gboost | MD = 10, MSS = 2 | 5.72
h-20 f-20 | a3 | Gboost | MD = 10, MSS = 5 | 1.08

TABLE V: HPO results for a1, a2, and a3
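The history and future windows h-w f-w and the MAE metric can be made concrete with a small sketch. It rests on an assumption that the source does not spell out: here the history window holds the last w readings and the target is the single reading w steps after the end of that window. The function names and the toy series are hypothetical, and the persistence forecast is only a placeholder for a trained model.

```python
def make_windows(series, w):
    """Pair each history window of w readings with the reading w steps ahead.

    Assumes a gap-free series sampled at a fixed interval.
    """
    samples = []
    for t in range(w - 1, len(series) - w):
        history = series[t - w + 1 : t + 1]   # the last w readings up to time t
        target = series[t + w]                # the reading w steps after time t
        samples.append((history, target))
    return samples


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Toy series standing in for AC energy readings (hypothetical values).
series = list(range(10))
samples = make_windows(series, w=2)

# Persistence forecast: predict that the target equals the latest reading.
targets = [target for _, target in samples]
predictions = [history[-1] for history, _ in samples]
mae = mean_absolute_error(targets, predictions)
```

Larger values of w lengthen both the look-back and the prediction horizon, so fewer samples can be cut from a continuous stretch of data.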

E. Experimental Results

This subsection is split into different subsections, each addressing one of the outlined objectives based on the defined experimental design and procedure.

1) Data Preliminaries: Two datasets are available for zone 2 of the 6th floor, one collected in 2018 and the other in 2019. The 2018 dataset is captured over the summer season, whereas the 2019 dataset is collected over the whole year, encompassing different seasons. Here, the 2018 dataset is a manifestation of the many ailments of the publicly available datasets in occupancy prediction, which include scarcity of data and limited exposure to climate seasonality. As such, the 2019 dataset is used for ML methods’ training, while the 2018 dataset is used for further analysis. On the other hand, the definition of the validation set changes depending on the analysis goals.

The exploratory data analysis reveals many missing data points for each feature, which can be remedied using imputation methods. However, selecting the proper imputation methods adds complexity to the parameter selection process and contributes little to the overarching goals of this case study. Therefore, the missing data points are removed from the dataset. Since a1 and a3 involve lagged versions of features, there is a need to find the continuous portions of data with no missing data points. A continuous frame constitutes part of the data, whereby the difference between two consecutive data points is equal to the sampling granularity, and it extends to more than an hour. The resulting dataset is split such that each data point belongs to a frame. This classification facilitates creating inputs that abide by the data’s contiguity requirement to extract lagged versions of features.

2) a1 vs. a2 vs. a3: The analysis’s first phase tackles o1, which investigates the utility of each feature engineering approach. The first step is to apply HPO to obtain the ideal configuration of the best ML algorithm and its hyper-parameters, so that a1, a2, and a3 are compared on a fair basis. The configurations in the HPO process are evaluated on a random partition of the data, referred to as the validation set. After conducting the HPO process resulting in the best set of configurations, the next phase compares the two approaches through a more fine-grained lens. Here, the validation data represents the data points of a single month. The AC unit’s energy consumption of the 2019 dataset is divided into five bins of an equal number of values. The approaches are evaluated on a bin-month combination such that the training and validation datasets are defined accordingly. This new evaluation regime unveils the approaches’ ability to predict consumption in different seasonal conditions and their ability to predict different levels of consumption.

The first phase’s results, which consider a random validation set, are summarized in Table V. Only history and future time windows of 10 and 20 minutes are shown for illustrative purposes; however, the observed trends are consistent with the absent time windows. Across the time windows, method a2 is clearly the worst-performing approach. The combination of single-lagged environmental features and time-based features, such as the hour of the day, the day of the week, and the month, was not a strong predictor of future AC energy consumption. The main motivation of a2 stems from the association between the occupants’ existence and the AC’s activation. However, the poor results show that this hypothesis does not hold when the amount of energy consumption is to be predicted. If the problem is transformed into a classification task, that is, predicting whether the AC units are activated, a2 will be favorable. On the other hand, the results of approaches a1 and a3 are extremely close but show a clear favorite. Both approaches are based on the similar assumption that lagged versions of all features contribute to better predictions. However, a focal difference is that no manual feature engineering process is applied for a1, unlike a3. While a3 outperformed a1 in every time window, this superiority is tainted by two main limitations. The first is the manual feature engineering process, and the second is the transferability of tree-based algorithms across zones and buildings. These conditions should be factored in when selecting the feature engineering process.

The AC’s energy consumption is divided into five equal bins. The resulting bins are bins = {0, 5.82, 11.65, 17.47, 23.3, 29.13}, such that each bin is represented by its index instead of its value. To summarize the information and condense any reported figures, the MAE of the binning process is categorized based on the defined seasons in the Northern hemisphere. As such, the results of March correspond to the Winter season; April, May, and June to the Spring season; July, August, and September to the Summer season; and October to the Fall season. The remaining months are missing from the dataset. For seasons with multiple months, the MAE is

averaged for each month and the standard deviation is reported.

Figures 4a and 4b depict the variation of MAE with respect to each season over prediction horizons of 10 and 20 minutes for approach a3. The AC units are activated over all seasons due to the ambient climate, whereby drastic changes in temperature are not expected. However, these variations are manifested in the adopted binning structure. The highest bin values (bin 5), corresponding to the highest AC units’ energy consumption, are only encountered in the summer season, which aligns with the high-temperature conditions. The figures show the importance of the binning process to better extrapolate the prediction performance, especially if the regression task is skewed towards an interval of values. Generally, a3 successfully predicted the energy consumption for different bins in the Winter, Spring, and Fall seasons for both prediction horizons. An uptick in the MAE is noticeable with the expansion of the prediction time window, an expected observation resulting from the increase in uncertain conditions with such expansion. The increase in MAE follows the increase in the energy consumption bin, which suggests two interpretations. First, instances of higher energy consumption are scarce, preventing the accurate prediction of energy in such circumstances. Second, the feature engineering process of a3 may not encompass all the factors that contribute to the prediction, which can include conditions that are not sensed or exogenous factors. These aspects are investigated to fulfil o3.

The good results obtained in the Winter and Fall seasons, which have limited data points, undermine the time-based feature engineering technique, as a3 successfully extrapolated common aspects between all seasons based on lagged versions of environmental features. The multi-month seasons, such as the spring and summer seasons, experience some fluctuations in their predictions, attributed to the subtle variations between months or the transition from one season to another.

3) Feature Importance Analysis: The comparison between various feature engineering approaches and their performance on various energy consumption bins provides a surface-level understanding of the underlying dynamics of the environment. To better grasp the radical deviations in performance between summer and the other seasons, it is important to zoom into the contributing factors. Since the feature engineering process produces a myriad of features, the importance of these features can be analyzed to explain the reported results. Fortunately, the best a3 configuration involves a tree-based algorithm (Gradient Boosting), which facilitates the feature importance inference process. The equivalent methods for DNNs require a more profound approach involving the calculation of gradients in the absence of a feature or a set of features to infer the features’ importance [105].

Before diving into the specifics of the features’ importance for Gradient Boosting, explaining how their respective values are calculated is instrumental. The feature importance is a weighted factor that gauges the utility of a feature in the boosted trees’ construction. Since the decision trees are formed from many decision nodes, the feature’s importance is measured by its ability to improve the performance measure. After that, the individual importance of each feature is averaged across all boosted trees [106].

The following approach is adopted to answer the lingering questions that arose from the discrepancy in performance between the summer season and other seasons. The feature importance is extracted for each absent month, which took place when the feature engineering approaches were evaluated. After that, the feature importance of the model with no fall months and of the models without one of the summer months is set aside for analysis. Next, the top 6 features that contribute to over 90% of the importance of the model with no fall months are highlighted. Later, the percentage difference of the feature importance between each of the models without one summer month and the model with no fall months is calculated. This percentage is averaged across the three models without one summer month. The adopted process highlights the contribution of each summer month to the deterioration of performance with respect to higher energy consumption bins (3, 4, 5).

Figure 5 depicts the difference in feature importance associated with the top features for a 10-minute prediction horizon. The names on the vertical axis follow this convention: z{zone number} {feature name}(unit) {lag time}. The most drastic changes are experienced by features that are not associated with the AC unit’s energy consumption. This observation alludes to the possible effect of outdoor environments on indoor conditions, which is proven by the features that displayed the greatest deviation. In particular, the importance of relative humidity and temperature and their respective lags has inversely switched. This factor can be attributed to changes in outdoor environmental conditions that are not captured in the available dataset, which contributed to the rise in energy consumption. The increase in the importance of plug energy consumption showcases the effect of occupants on the AC’s energy consumption, which is not part of the captured environment.

4) Warehouse-related Observations: These experiments shed light on many aspects that should be integrated into the supervised learning process of the building HVAC system, which can be extended to the warehouse environment. First, the history and future time windows are critical aspects that determine the utility of predictions. The changes in lagged features’ importance and their contributions to predictions’ accuracy demonstrate the effect of the defined time window. Given that spatial configurations dictate the delayed impact of activating AC units, the variations in these configurations in the warehouse environment highlight the importance of finding a suitable time window. Second, the biased prediction results obtained with the variation of energy consumption bins have shown the interference of exogenous factors. It is hypothesized that conditions such as outdoor environmental conditions and occupancy are affecting these predictions, as reflected by the feature importance changes. Quantifying the occupancy and its inference method is at the heart of the warehouse challenges. Additionally, the effect of environmental conditions detailed in its corresponding challenge is manifested in the conducted study.

To demonstrate the viability of the effect of some external factors on prediction accuracy, some additional experiments










(a) Prediction Window = 10    (b) Prediction Window = 20
Fig. 4: MAE over energy consumption bins for different prediction windows

Fig. 5: Percentage difference of feature importance (horizontal axis: Percentage Difference)

TABLE VI: Correlation results between absolute errors and zone 1 features for the winter season

Feature            Corr AE   r²
AC1                 0.43     0.84
Light               0.39     0.84
Plug                0.41     0.84
Temperature        -0.37     0.84
Relative Humidity  -0.31     0.84
Luminance           0.41     0.84

are conducted. The only external factors available are the data of zones 1 and 3. Since zone 1 involves a more comprehensive set of features, its data is used to study the underlying reasons for some prediction inaccuracies. In this subsection only, the goodness of fit is employed to evaluate the prediction accuracy. In short, the goodness of fit (r²) determines the proportion of the variance in the dependent variable that is predictable from the independent variables. This metric is applied to evaluate the developed model that does not include the winter season. The results of this model (Fig. 4a) are less biased compared to other seasons, which validates the use of r². After that, the pairwise correlation between the absolute prediction errors of the AC’s energy consumption in zone 2 and the features available in zone 1 is calculated (Corr AE). This calculation opens the discussion about any relationship that can explain the prediction errors of zone 2 AC energy consumption. The extracted observations and interpretations should be linked to physical phenomena occurring in multi-zonal spaces.

Table VI summarizes the results of the correlation between prediction errors and r² for the winter season. The features overwhelmingly display high correlation values with errors. Interpreting such phenomena requires diving into the possible reasons for this association. On one hand, the features that exhibit a high positive correlation change instantly with occupancy; these include light, luminance, and plug load. This observation is in harmony with a previous interpretation stating that inaccuracies in predictions can be attributed to the absence of occupancy quantification in the available dataset. On the other hand, temperature and relative humidity displaying high negative correlations can be attributed to two factors. The first factor circles back to the existence of occupants that trigger the activation of AC units in a single zone, which lowers the temperature and humidity values. Based on these results, the existence of occupants is a common phenomenon encountered in each zone, contributing to the AC units’ activation. The second factor can be attributed to heat transfer between zones. The activation of AC units in zone 1 contributed to temperature and humidity changes in zone 2. As a result, the developed models in zone 2 do not account for heat transfer between zones, which means that these models fail to predict energy consumption. This interpretation supports the high positive correlation between zone 1 AC unit energy consumption and absolute errors. The drawn conclusions are real-world examples of two challenges that were previously highlighted. The first pertains to the air leaks between warehouse zones. The second is the invalidity of the stationarity of the underlying environments in multi-zone spaces, affecting the supervised learning techniques and the application of multi-agent DRL algorithms.

IX. CONCLUSION

The increased energy consumption and CO2 emissions from buildings have been raising many environmental concerns because of the associated increased emissions with greenhouse effects. This paper comprehensively discusses energy consumption reduction in warehouses through autonomous HVAC control. An in-depth analysis of the warehouse environment and its effect on HVAC control is provided, highlighting the shortcomings of traditional data-scarce approaches. Data-driven supervised learning and reinforcement learning approaches are proposed to address the limitations of traditional

approaches. An experimental procedure is carried out to showcase the feature engineering aspect of the occupancy inference challenge using a dataset collected in real-world settings. The promise of data-driven methods is faced with its own set of challenges that can be summarized as follows:
• The occupancy inference methods have yet to converge to an indicator or set of indicators to reflect the number of occupants, while the occupancy existence methods settled on exploiting the activity level sensors. For the time being, predictions of CO2 concentrations are the best descriptors of occupancy changes.
• The datasets for occupancy inference are not all-encompassing because they lack the inclusion of all seasons or include features reflecting occupancy existence, not the occupant number.
• The all-year-round operation of warehouses and the dispersion of warehouses over large geographical areas require avoiding time-based feature engineering techniques for supervised learning methods.
• Transferability of the developed Deep Reinforcement Learning (DRL) agents is jeopardized due to the non-uniformity of the state and action spaces between the source and target domains. On a warehouse level, the differences stem from the effective integration of renewable sources of energy. On the zone level, the contrast for DRL agents originates from the climatic requirements of each zone.
• Multi-agent DRL methods utilized to tackle the scalability of the state and action spaces are faced with one main challenge. This challenge pertains to the definition of the dynamics governing the relationship between the agents in the environment, categorized as cooperative, competitive, and hybrid. Each of these dynamics comes with its own set of hurdles, related to the reward function specifications and the non-stationarity of the environment.

While this paper provides a great starting point to tackle the energy efficiency problem of HVAC systems in warehouses, several shortcomings of this manuscript should be addressed. These include the following:
• The paper collated the challenges of the HVAC systems, despite involving different characteristics, requirements, and components.
• As one branch of data-driven techniques, unsupervised learning techniques exhibit the potential to solve scalability and transferability issues. These techniques were not surveyed in this work.
• This manuscript touched upon the occupancy inference dataset without extensive explanations about the features of each dataset and how these features can be leveraged in a warehouse environment.

Future work will address these shortcomings by first surveying the occupancy inference data and providing an in-depth analysis with respect to the warehouse environment following the model set in this paper’s experimental procedure. On the implementation side, the future steps will build a DRL agent managing the HVAC control of a single zone to be extended to multiple zones using multi-agent DRL.

REFERENCES

[1] M. M. Gaber, A. Aneiba, S. Basurra, O. Batty, A. M. Elmisery, Y. Kovalchuk, and M. H. U. Rehman, “Internet of things and data mining: From applications to techniques and systems,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 9, no. 3, p. e1292, 2019.
[2] “Carbon tax basics,” Oct 2021. [Online]. Available: https://www.c2es.org/content/carbon-tax-basics/
[3] U. N. Programme, “Sustainable buildings.” [Online]. Available: https://www.unep.org/explore-topics/resource-efficiency/what-we-do/cities/sustainable-buildings
[4] D. W. Kweku, O. Bismark, A. Maxwell, K. A. Desmond, K. B. Danso, E. A. Oti-Mensah, A. T. Quachie, and B. B. Adormaa, “Greenhouse effect: greenhouse gases and their impact on global warming,” Journal of Scientific research and reports, vol. 17, no. 6, pp. 1–9, 2018.
[5] G. R. Timilsina, “Where is the carbon tax after thirty years of research?” World Bank Policy Research Working Paper, no. 8493, 2018.
[6] R. Gerlagh and B. Van der Zwaan, “Options and instruments for a deep cut in co2 emissions: Carbon dioxide capture or renewables, taxes or subsidies?” The Energy Journal, vol. 27, no. 3, 2006.
[7] L. Lei, Y. Tan, K. Zheng, S. Liu, K. Zhang, and X. Shen, “Deep reinforcement learning for autonomous internet of things: Model, applications and challenges,” IEEE Communications Surveys & Tutorials, vol. 22, no. 3, pp. 1722–1760, 2020.
[8] J. Blazquez, R. Fuentes-Bracamontes, C. A. Bollino, and N. Nezamuddin, “The renewable energy policy paradox,” Renewable and Sustainable Energy Reviews, vol. 82, pp. 1–5, 2018.
[9] M. M. Sandhu, S. Khalifa, R. Jurdak, and M. Portmann, “Task scheduling for energy-harvesting-based iot: A survey and critical analysis,” IEEE Internet of Things Journal, vol. 8, no. 18, pp. 13 825–13 848, 2021.
[10] Y. Wu, V. K. Lau, D. H. Tsang, L. P. Qian, and L. Meng, “Optimal energy scheduling for residential smart grid with centralized renewable energy source,” IEEE Systems Journal, vol. 8, no. 2, pp. 562–576, 2013.
[11] Beanstalk, “Warehouse heating, cooling, & ventilation,” Mar 2021. [Online]. Available: https://www.reznorhvac.com/efficient-warehouse-heating-cooling-ventilation-tips
[12] S. T. Hammond, J. H. Brown, J. R. Burger, T. P. Flanagan, T. S. Fristoe, N. Mercado-Silva, J. C. Nekola, and J. G. Okie, “Food spoilage, storage, and transport: Implications for a sustainable future,” BioScience, vol. 65, no. 8, pp. 758–768, 2015.
[13] “How to reduce the costs of inventory spoilage,” Jan 2021. [Online]. Available: https://www.wisys.com/how-to-reduce-the-costs-of-inventory-spoilage
[14] K. Fikiin, B. Stankov, J. Evans, G. Maidment, A. Foster, T. Brown, J. Radcliffe, M. Youbi-Idrissi, A. Alford,

L. Varga et al., “Refrigerated warehouses as intelligent hubs to integrate renewable energy in industrial food refrigeration and to enhance power grid sustainability,” Trends in Food Science & Technology, vol. 60, pp. 96–103, 2017.
[15] K. Henning, “Recommendations for implementing the strategic initiative industrie 4.0,” 2013.
[16] [Online]. Available: https://www.iea.org/reports/world-energy-outlook-2017
[17] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “Wireless sensor networks: a survey,” Computer networks, vol. 38, no. 4, pp. 393–422, 2002.
[18] M. Sofos, J. T. Langevin, M. Deru, E. Gupta, K. S. Benne, D. Blum, T. Bohn, R. Fares, N. Fernandez, G. Fink et al., “Innovations in sensors and controls for building energy management: Research and development opportunities report for emerging technologies,” National Renewable Energy Lab.(NREL), Golden, CO (United States), Tech. Rep., 2020.
[19] D. Minoli, K. Sohraby, and B. Occhiogrosso, “Iot considerations, requirements, and architectures for smart buildings—energy optimization and next-generation building management systems,” IEEE Internet of Things Journal, vol. 4, no. 1, pp. 269–283, 2017.
[20] M. Manic, D. Wijayasekara, K. Amarasinghe, and J. J. Rodriguez-Andina, “Building energy management systems: The age of intelligent and adaptive buildings,” IEEE Industrial Electronics Magazine, vol. 10, no. 1, pp. 25–39, 2016.
[21] M. Grieves, “Digital twin: manufacturing excellence through virtual factory replication,” White paper, vol. 1, no. 2014, pp. 1–7, 2014.
[22] D. Jones, C. Snider, A. Nassehi, J. Yon, and B. Hicks, “Characterising the digital twin: A systematic literature review,” CIRP Journal of Manufacturing Science and Technology, vol. 29, pp. 36–52, 2020.
[23] X. Liu, X. Liu, T. Zhang, R. Ooka, and H. Kikumoto, “Comparison of winter air infiltration and its influences between large-space and normal-space buildings,” Building and Environment, vol. 184, p. 107183, 2020.
[24] P. Brinks, O. Kornadt, and R. Oly, “Air infiltration assessment for industrial buildings,” Energy and Buildings, vol. 86, pp. 663–676, 2015.
[25] “Commercial and institutional consumption of energy survey.” [Online]. Available: http://library.cee1.org/sites/default/files/library/1907/1092.pdf
[26] X. Wang, Y. Yang, Y. Xu, F. Wang, Q. Zhang, C. Huang, and C. Shi, “Prediction of vertical thermal stratification of large space buildings based on block-gebhart model: Case studies of three typical hybrid ventilation scenarios,” Journal of Building Engineering, vol. 41, p. 102452, 2021.
[27] C. Porras-Amores, F. R. Mazarrón, and I. Cañas, “Study of the vertical distribution of air temperature in warehouses,” Energies, vol. 7, no. 3, pp. 1193–1206, 2014.
[28] P. Baker and M. Canessa, “Warehouse design: A structured approach,” European journal of operational research, vol. 193, no. 2, pp. 425–436, 2009.
[29] Y. Lu, J. Dong, and J. Liu, “Zonal modelling for thermal and energy performance of large space buildings: A review,” Renewable and Sustainable Energy Reviews, vol. 133, p. 110241, 2020.
[30] M. Pritoni, J. M. Woolley, and M. P. Modera, “Do occupancy-responsive learning thermostats save energy? a field study in university residence halls,” Energy and Buildings, vol. 127, pp. 469–478, 2016.
[31] K. Choi, S. Park, J. Joe, S.-I. Kim, J.-H. Jo, E.-J. Kim, and Y.-H. Cho, “Review of infiltration and airflow models in building energy simulations for providing guidelines to building energy modelers,” Renewable and Sustainable Energy Reviews, vol. 181, p. 113327, 2023.
[32] R. Eini, L. Linkous, N. Zohrabi, and S. Abdelwahed, “Smart building management system: Performance specifications and design requirements,” Journal of Building Engineering, vol. 39, p. 102222, 2021.
[33] “Glimpsing the future of work in warehouses,” Aug 2020. [Online]. Available: https://mitsloan.mit.edu/ideas-made-to-matter/glimpsing-future-work-warehouses
[34] J. Bartholdi and S. Hackman, “Warehouse & distribution science,” Aug 2019. [Online]. Available: https://www.warehouse-science.com/book/editions/wh-sci-0.98.1.pdf
[35] K. S. Hald and P. Coslugeanu, “The preliminary supply chain lessons of the covid-19 disruption—what is the role of digital technologies?” Operations Management Research, vol. 15, no. 1, pp. 282–297, 2022.
[36] J. Koch, B. Frommeyer, and G. Schewe, “Online shopping motives during the covid-19 pandemic—lessons from the crisis,” Sustainability, vol. 12, no. 24, p. 10247, 2020.
[37] K. Lakshmi Shree and R. Ashok Kumar, “Location analysis using ensemble approach for warehouses: A study during covid-19,” in Computational Intelligence in Pattern Recognition. Springer, 2022, pp. 749–762.
[38] N. Jones, “How covid-19 is changing the cold and flu season,” Dec 2020. [Online]. Available: https://www.nature.com/articles/d41586-020-03519-3
[39] R. Sugathadasa, D. Wakkumbura, H. N. Perera, and A. Thibbotuwawa, “Analysis of risk factors for temperature-controlled warehouses,” Operations and Supply Chain Management: An International Journal, vol. 14, no. 3, pp. 320–337, 2021.
[40] M. Seifhashemi, B. Capra, W. Miller, and J. Bell, “The potential for cool roofs to improve the energy efficiency of single storey warehouse-type retail buildings in australia: A simulation case study,” Energy and Buildings, vol. 158, pp. 1393–1403, 2018.
[41] A. Vishwanath, V. Chandan, and K. Saurav, “An iot-based data driven precooling solution for electricity cost savings in commercial buildings,” IEEE Internet of Things Journal, vol. 6, no. 5, pp. 7337–7347, 2019.
[42] “U.s. energy information administration - eia - independent statistics and analysis.” [Online]. Available: http://www.eia.gov/consumption/commercial/reports/2012/preliminary/

[43] Z. Wang, B. Lin, and Y. Zhu, “Modeling and measurement study on an intermittent heating system of a residence in cambridgeshire,” Building and Environment, vol. 92, pp. 380–386, 2015.
[44] G. Yu, N. Fang, D. Hu, and W. Zhao, “Research on energy-saving performance of intermittent heating for rooms in hot summer & cold winter zone.”
[45] E. Tunçbilek, A. Komerska, and M. Arıcı, “Optimisation of wall insulation thickness using energy management strategies: Intermittent versus continuous operation schedule,” Sustainable Energy Technologies and Assessments, vol. 49, p. 101778, 2022.
[46] S. Morgan and M. Krarti, “Impact of electricity rate structures on energy cost savings of pre-cooling controls for office buildings,” Building and environment, vol. 42, no. 8, pp. 2810–2818, 2007.
[47] J. Ling, H. Tong, J. Xing, and Y. Zhao, “Simulation and optimization of the operation strategy of ashp heating system: A case study in tianjin,” Energy and Buildings, vol. 226, p. 110349, 2020.
[48] M. S. Kim, Y. Kim, and K.-S. Chung, “Improvement of intermittent central heating system of university building,” Energy and Buildings, vol. 42, no. 1, pp. 83–89, 2010.
[49] C. Wang, K. Pattawi, and H. Lee, “Energy saving impact of occupancy-driven thermostat for residential buildings,” Energy and Buildings, vol. 211, p. 109791, 2020.
[50] J. Lu, T. Sookoor, V. Srinivasan, G. Gao, B. Holben, J. Stankovic, E. Field, and K. Whitehouse, “The smart thermostat: using occupancy sensors to save energy in homes,” in Proceedings of the 8th ACM conference on embedded networked sensor systems, 2010, pp. 211–224.
[51] Z. Pang, Y. Chen, J. Zhang, Z. O’Neill, H. Cheng, and B. Dong, “Nationwide hvac energy-saving potential quantification for office buildings with occupant-centric controls in various climates,” Applied Energy, vol. 279, p. 115727, 2020.
[52] V. L. Erickson, M. A. Carreira-Perpinan, and A. E. Cerpa, “Observe: Occupancy-based system for efficient reduction of hvac energy,” in Proceedings of the 10th ACM/IEEE International Conference on Information Processing in Sensor Networks, 2011, pp. 258–269.
[53] M. M. Manning, M. C. Swinton, F. Szadkowski, J. Gusdorf, and K. Ruest, “The effects of thermostat set-back and set-up on seasonal energy consumption, surface temperatures and recovery times at the ccht twin house facility,” ASHRAE Transactions, vol. 113, no. 1, pp. 1–12, 2007.
[54] M. Esrafilian-Najafabadi and F. Haghighat, “Occupancy-based hvac control systems in buildings: A state-of-the-art review,” Building and Environment, p. 107810, 2021.
[55] S. Xu, Y. Wang, Y. Wang, Z. O’Neill, and Q. Zhu, “One for many: Transfer learning for building hvac control,” in Proceedings of the 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, 2020, pp. 230–239.
[56] J. Kleissl and Y. Agarwal, “Cyber-physical energy systems: Focus on smart buildings,” in Design Automation Conference. IEEE, 2010, pp. 749–754.
[57] A. H. Hosseinloo, A. Ryzhov, A. Bischi, H. Ouerdane, K. Turitsyn, and M. A. Dahleh, “Data-driven control of micro-climate in buildings: An event-triggered reinforcement learning approach,” Applied Energy, vol. 277, p. 115451, 2020.
[58] Y. Jin, D. Yan, A. Chong, B. Dong, and J. An, “Building occupancy forecasting: A systematical and critical review,” Energy and Buildings, vol. 251, p. 111345, 2021.
[59] K. Mason and S. Grijalva, “A review of reinforcement learning for autonomous building energy management,” Computers & Electrical Engineering, vol. 78, pp. 300–312, 2019.
[60] B. Xu, S. Zhou, and W. Hu, “An intermittent heating strategy by predicting warm-up time for office buildings in beijing,” Energy and buildings, vol. 155, pp. 35–42, 2017.
[61] N. S. Raman, B. Chen, and P. Barooah, “On energy-efficient hvac operation with model predictive control: A multiple climate zone study,” Applied Energy, vol. 324, p. 119752, 2022.
[62] H. Wang, S. Bo, C. Zhu, P. Hua, Z. Xie, C. Xu, T. Wang, X. Li, H. Wang, R. Lahdelma et al., “A zoned group control of indoor temperature based on mpc for a space heating building,” Energy Conversion and Management, vol. 290, p. 117196, 2023.
[63] S. Yang, M. P. Wan, B. F. Ng, S. Dubey, G. P. Henze, W. Chen, and K. Baskaran, “Model predictive control for integrated control of air-conditioning and mechanical ventilation, lighting and shading systems,” Applied Energy, vol. 297, p. 117112, 2021.
[64] G. Pinto, D. Deltetto, and A. Capozzoli, “Data-driven district energy management with surrogate models and deep reinforcement learning,” Applied Energy, vol. 304, p. 117642, 2021.
[65] G. Mosaico, M. Saviozzi, F. Silvestro, A. Bagnasco, and A. Vinci, “Simplified state space building energy model and transfer learning based occupancy estimation for hvac optimal control,” in 2019 IEEE 5th International forum on Research and Technology for Society and Industry (RTSI). IEEE, 2019, pp. 353–358.
[66] J. Hou, H. Li, N. Nord, and G. Huang, “Model predictive control under weather forecast uncertainty for hvac systems in university buildings,” Energy and Buildings, vol. 257, p. 111793, 2022.
[67] M. Ahrarinouri, M. Rastegar, and A. R. Seifi, “Multiagent reinforcement learning for energy management in residential buildings,” IEEE Transactions on Industrial Informatics, vol. 17, no. 1, pp. 659–666, 2020.
[68] I. Shaer and A. Shami, “Hierarchical modelling for co2 variation prediction for hvac system operation,” Algorithms, vol. 16, no. 5, p. 256, 2023.
[69] Z. Wang and T. Hong, “Reinforcement learning for building controls: The opportunities and challenges,” Applied Energy, vol. 269, p. 115036, 2020.

[70] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT Press, 2018.
[71] M. Aftab, C. Chen, C.-K. Chau, and T. Rahwan, “Automatic HVAC control with real-time occupancy recognition and simulation-guided model predictive control in low-cost embedded system,” Energy and Buildings, vol. 154, pp. 141–156, 2017.
[72] M. Azam, M. Blayo, J.-S. Venne, and M. Allegue-Martinez, “Occupancy estimation using WiFi motion detection via supervised machine learning algorithms,” in 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP). IEEE, 2019, pp. 1–5.
[73] J. Kallio, J. Tervonen, P. Räsänen, R. Mäkynen, J. Koivusaari, and J. Peltola, “Forecasting office indoor CO2 concentration using machine learning with a one-year dataset,” Building and Environment, vol. 187, p. 107409, 2021.
[74] Z. Wang, T. Hong, M. A. Piette, and M. Pritoni, “Inferring occupant counts from Wi-Fi data in buildings through machine learning,” Building and Environment, vol. 158, pp. 281–294, 2019.
[75] S. Naylor, M. Gillott, and T. Lau, “A review of occupant-centric building control strategies to reduce building energy use,” Renewable and Sustainable Energy Reviews, vol. 96, pp. 1–10, 2018.
[76] P. Kapalo, L. Mečiarová, S. Vilčeková, E. Krídlová Burdová, F. Domnita, C. Bacotiu, and K.-E. Péterfi, “Investigation of CO2 production depending on physical activity of students,” International Journal of Environmental Health Research, vol. 29, no. 1, pp. 31–44, 2019.
[77] B. Dong, Y. Liu, W. Mu, Z. Jiang, P. Pandey, T. Hong, B. Olesen, T. Lawrence, Z. O’Neill, C. Andrews et al., “A global building occupant behavior database,” Scientific Data, vol. 9, no. 1, pp. 1–15, 2022.
[78] Z. Chen, R. Zhao, Q. Zhu, M. K. Masood, Y. C. Soh, and K. Mao, “Building occupancy estimation with environmental sensors via CDBLSTM,” IEEE Transactions on Industrial Electronics, vol. 64, no. 12, pp. 9549–9559, 2017.
[79] K. Imamovic, F. C. Sangogboye, and M. B. Kjærgaard, “Improving occupancy presence prediction via multi-label classification,” in Proceedings of the 2nd ACM International Conference on Embedded Systems for Energy-Efficient Built Environments, 2015, pp. 113–114.
[80] K. Weiss, T. M. Khoshgoftaar, and D. Wang, “A survey of transfer learning,” Journal of Big Data, vol. 3, no. 1, pp. 1–40, 2016.
[81] G. Pinto, Z. Wang, A. Roy, T. Hong, and A. Capozzoli, “Transfer learning for smart buildings: A critical review of algorithms, applications, and future perspectives,” Advances in Applied Energy, p. 100084, 2022.
[82] H. Hu, H. Wang, Z. Zou, and J. Zhu, “Investigation of inter-zonal heat transfer in large space buildings based on similarity: Comparison of two stratified air-conditioning systems,” Energy and Buildings, vol. 254, p. 111602, 2022.
[83] Z. Zhu, K. Lin, and J. Zhou, “Transfer learning in deep reinforcement learning: A survey,” arXiv preprint arXiv:2009.07888, 2020.
[84] S. Xu, Y. Wang, Y. Wang, Z. O’Neill, and Q. Zhu, “One for many: Transfer learning for building HVAC control,” in Proceedings of the 7th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, 2020, pp. 230–239.
[85] Z. Deng and Q. Chen, “Reinforcement learning of occupant behavior model for cross-building transfer learning to various HVAC control systems,” Energy and Buildings, vol. 238, p. 110860, 2021.
[86] M. F. R. Al-Okby, S. Neubert, T. Roddelkopf, and K. Thurow, “Integration and testing of novel MOX gas sensors for IoT-based indoor air quality monitoring,” in 2021 IEEE 21st International Symposium on Computational Intelligence and Informatics (CINTI). IEEE, 2021, pp. 000173–000180.
[87] M. E. Barachi, S. Salman, and S. Mathew, “A sensor-embedded smart carton for the real-time monitoring of perishable foods’ lifetime,” in 2022 7th International Conference on Smart and Sustainable Technologies (SpliTech), 2022, pp. 1–8.
[88] M. Bartolini, E. Bottani, and E. H. Grosse, “Green warehousing: Systematic literature review and bibliometric analysis,” Journal of Cleaner Production, vol. 226, pp. 242–258, 2019.
[89] “Development of a clean energy credit registry.” [Online]. Available: https://ero.ontario.ca/notice/019-5816
[90] “EU renewable energy financing mechanism.” [Online]. Available: https://energy.ec.europa.eu/topics/renewable-energy/financing/eu-renewable-energy-financing-mechanism_en
[91] A. Ilic, T. Staake, and E. Fleisch, “Using sensor information to reduce the carbon footprint of perishable goods,” IEEE Pervasive Computing, vol. 8, no. 1, pp. 22–29, 2008.
[92] B. Schölkopf, A. Smola, and K.-R. Müller, “Kernel principal component analysis,” in International Conference on Artificial Neural Networks. Springer, 1997, pp. 583–588.
[93] P. Baldi, “Autoencoders, unsupervised learning, and deep architectures,” in Proceedings of ICML Workshop on Unsupervised and Transfer Learning. JMLR Workshop and Conference Proceedings, 2012, pp. 37–49.
[94] Y. Li, J. Yosinski, J. Clune, H. Lipson, and J. Hopcroft, “Convergent learning: Do different neural networks learn the same representations?” arXiv preprint arXiv:1511.07543, 2015.
[95] Y. Li, M. Yang, and Z. Zhang, “A survey of multi-view representation learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 31, no. 10, pp. 1863–1883, 2019.
[96] I. Shaer and A. Shami, “CorrFL: Correlation-based neural network architecture for unavailability concerns in a heterogeneous IoT environment,” IEEE Transactions on Network and Service Management, pp. 1–1, 2023.
[97] A. Y. Ng, D. Harada, and S. Russell, “Policy invariance under reward transformations: Theory and application to reward shaping,” in ICML, vol. 99, 1999, pp. 278–287.

[98] K. Zhang, Z. Yang, and T. Başar, “Multi-agent reinforcement learning: A selective overview of theories and algorithms,” Handbook of Reinforcement Learning and Control, pp. 321–384, 2021.
[99] K. Zhang, Z. Yang, H. Liu, T. Zhang, and T. Başar, “Fully decentralized multi-agent reinforcement learning with networked agents,” in International Conference on Machine Learning. PMLR, 2018, pp. 5872–5881.
[100] G. Arslan and S. Yüksel, “Decentralized Q-learning for stochastic teams and games,” IEEE Transactions on Automatic Control, vol. 62, no. 4, pp. 1545–1558, 2016.
[101] E. Mazumdar, L. J. Ratliff, and S. Sastry, “On the convergence of gradient-based learning in continuous games,” arXiv preprint arXiv:1804.05464, 2018.
[102] M. Pipattanasomporn, G. Chitalia, J. Songsiri, C. Aswakul, W. Pora, S. Suwankawin, K. Audomvongseree, and N. Hoonchareon, “CU-BEMS, smart building electricity consumption and indoor environmental sensor datasets,” Scientific Data, vol. 7, no. 1, pp. 1–14, 2020.
[103] M. Pipattanasomporn, G. Chitalia, J. Songsiri, C. Aswakul, W. Pora, S. Suwankawin, K. Audomvongseree, and N. Hoonchareon, “CUBEMS, a smart building electricity consumption and indoor environmental sensor dataset.” Jun. 2020. [Online]. Available: https://figshare.com/articles/dataset/CU-BEMS_Smart_Building_Electricity_Consumption_and_Indoor_Environmental_Sensor_Datasets/11726517
[104] L. Yang and A. Shami, “On hyperparameter optimization of machine learning algorithms: Theory and practice,” Neurocomputing, vol. 415, pp. 295–316, 2020.
[105] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, “Striving for simplicity: The all convolutional net,” arXiv preprint arXiv:1412.6806, 2014.
[106] T. Hastie, R. Tibshirani, and J. H. Friedman, The elements of statistical learning: Data mining, inference, and prediction. Springer, 2009, vol. 2.


