1. Introduction
Indoor/outdoor (IO) context sensing plays a vital role in numerous applications, for example, human localization and tracking [1,2,3,4,5], activity recognition [6,7], transportation mode detection [8,9,10], and power management and medical care [11]. For seamless positioning and navigation [12,13], IO detection is a bridge between indoor and outdoor localization. To improve positioning accuracy and reduce power consumption, a multi-source fusion positioning system triggers specific positioning and fusion strategies according to the results of IO detection. Smartphones can automatically adjust screen brightness according to the IO status and environmental conditions (e.g., time, weather). The IO status also enables personalized services such as adaptively adjusting the device volume.
According to World Bank statistics, the number of current smartphone users is 4.57 billion, and this is expected to grow to 4.78 billion by 2020 [14]. Recent smartphones are equipped with a variety of sensors as well as powerful processors and storage capabilities. Smartphone-based IO detection mainly benefits from the extensive use of smartphones, which consumers carry with them at all times. To provide context-aware information, many methods have been proposed for IO detection in a variety of environments. These methods fall into two categories: techniques based on fixed detection rules or thresholds, and machine learning-based techniques.
The first category uses fixed detection rules and thresholds, for example, treating a sensor reading above a certain value as indicating a particular state. Zhou et al. [15] and IODetector [16] leveraged smartphone built-in sensors including the proximity sensor, light sensor, accelerometer, magnetometer, and cell tower RSS to distinguish between outdoor, semi-outdoor and indoor environments. These two works are similar: both applied hard thresholds in a light detector, a cellular detector and a magnetism detector, then fused the detection results of the sub-modules with a Hidden Markov Model (HMM) [17], achieving recognition accuracies above 88% and 92%, respectively, in their campus and city areas. Both works noted that the algorithm depends on measurements from a large number of visible neighboring cellular towers. However, most current smartphones do not support recording measurements from all neighboring cellular towers. Zou et al. [18] presented an IO detection technique that leveraged low-power iBeacon technology to discriminate between semi-outdoor and indoor environments. Their test environment was a campus area in which beacons had been placed, and the recognition accuracy reached 96.2%. Li et al. [19] presented a lightweight IO detector based on Wi-Fi RSS signals and a light sensor. The Wi-Fi sub-detector used AdaBoost [20] and the light sensor module used a threshold to detect the environment separately, and then a semi-conditional random field (CRF) algorithm was used to aggregate the Wi-Fi and light sensor results. The evaluation results showed that the IOS detector can achieve over 96% accuracy in the FIT Building at Tsinghua University, Xidan Street, and CETC office building environments. SatProbe [21] used only the number of visible GPS satellites as a more direct indicator of IO status. The authors collected 79 segments of raw GPS traces, with 2595 randomly sampled points for the detection test, and the overall detection accuracy of SatProbe was 85.6%. Gao et al. [22] extracted two features based on the availability and strength of GNSS signals (GPS and GLONASS): the number of visible satellites with a CNR above 25 dB-Hz, and the sum of the CNR values of those satellites. A Hidden Markov Model was then used to infer the current environment type (indoor, intermediate or outdoor) from these features. The proposed environmental context detection method was tested in the city of London, achieving an overall accuracy of 88.2%. GNSS signals received by GNSS receivers have also been used for IO detection [23]. SenseIO [24] designed a ubiquitous multi-model system that fuses cell tower, Wi-Fi, activity recognition and light intensity data based on fixed detection rules for IO detection. Experiments on each module and on the full framework showed that SenseIO provides promising detection accuracy (above 92%). In [25], light sensor, magnetic sensor and satellite signals were integrated using fixed rules to identify the IO status and help achieve seamless indoor and outdoor positioning. However, methods based on fixed detection rules or thresholds are difficult to adapt to different environments and devices.
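To make the GNSS-strength features described for Gao et al. [22] concrete, the following is a minimal sketch, not their implementation. It assumes that the CNR values of the currently visible satellites are already available as a list of floats in dB-Hz; the function name `strong_signal_features` is hypothetical.

```python
from typing import Iterable, Tuple

# Threshold quoted in the text for Gao et al. [22], in dB-Hz.
CNR_THRESHOLD_DBHZ = 25.0


def strong_signal_features(
    cnr_values: Iterable[float],
    threshold: float = CNR_THRESHOLD_DBHZ,
) -> Tuple[int, float]:
    """Return (count, sum) of the CNR values strictly above the threshold.

    The input format (one CNR value per visible satellite) is an
    assumption made for illustration.
    """
    strong = [cnr for cnr in cnr_values if cnr > threshold]
    return len(strong), float(sum(strong))


if __name__ == "__main__":
    # Four visible satellites, two of them above 25 dB-Hz.
    print(strong_signal_features([30.0, 20.0, 41.5, 25.0]))
```

A fixed-rule detector would compare such features directly against hand-tuned limits, which is the step that tends not to transfer across environments and devices.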
In the second category, features are extracted from smartphone embedded sensors and the IO status is detected by a machine learning algorithm. SenseMe [26] applied the C4.5 algorithm to data generated from GPS, gyroscope, accelerometer and the Bluetooth module to sense the environmental context as well as the context-aware location. The authors evaluated SenseMe against several metrics with the aid of two two-week-long live deployments involving 15 participants, and the detection accuracy reached 91.23%. Sung et al. [27] proposed a sound-based IO detection method that used acoustic features created by the different reverberation patterns of ambient environments, and then applied a binary classification method to these acoustic features to determine the IO environment. The experiments were conducted at the KAIST campus in Daejeon, South Korea, and the best accuracy (96.79%) was achieved when the score calculation range was 50 and the threshold value was 2000. In addition, the transition time of their method was only 3.81 s on average. Canovas et al. [28] employed a binary classification technique on the received signal strength indicator (RSSI) from 802.11 access points to identify a pedestrian's indoor or outdoor status. They conducted experiments on their campus, with a mean error rate of around 2.5%. Because indoor and outdoor electromagnetic environments differ, the characteristics of magnetic sensor data also differ between indoor and outdoor situations. MagIO [29] applied machine learning algorithms including Support Vector Machines (SVM), Gradient Boosting Machines (GBM), Random Forest (RF), K-Nearest Neighbor (kNN) and Decision Trees (DT) to magnetic signals for IO detection. Experiments showed that Naive Bayes and random forest can achieve an accuracy of 80% and higher with magnetic data alone. An ensemble-based stacking approach was also presented, which achieved an accuracy of 85.30% for a campus area, shopping mall and subway station using three different smartphones. Wang et al. [30] applied a machine learning algorithm to classify the signals of neighboring GSM stations in different environments and identify the user's current context by signal recognition. They tested the algorithm in four different environments on their campus. The results showed that the algorithm can identify open outdoor, semi-outdoor, light indoor and deep indoor environments with 100% accuracy using the signal strength of four nearby GSM stations. Radu [31] employed co-training on features from light, magnetic and cell sensors for detection. This approach can automatically learn the characteristics of new environments and devices, and thereby provides a detection accuracy exceeding 90% even in unfamiliar circumstances. Anagnostopoulos [32] leveraged J48 and other machine learning algorithms to detect the IO state, using multiple contextual features such as activity, barometric pressure, ambient light, GSM and magnetometer variance. Using all sensors, they achieved 99% classification accuracy in a 10-fold cross-validation test. Wi-FiBoost [33] uses AdaBoost [20] to determine quickly and accurately whether a device is inside or outside particular buildings. The authors conducted all their experiments in two facilities located on their campus and reported a mean error rate of around 2.5%. Some of the above algorithms depend on measurements of signals from a large number of visible neighboring cellular towers. However, most current smartphones do not support recording measurements from all neighboring cellular towers. Moreover, among the above algorithms, only Sung et al. [27] evaluated the delay of switching between indoor and outdoor scenes.
On the other hand, it is difficult to obtain satisfactory classification results with only two classification labels, because of the variety of signal variations in complex scenarios and the similarity of indoor and outdoor signal sources in IO transition regions. Because GNSS signals, light intensity, geomagnetism, Wi-Fi and other sensor features differ markedly between open outdoor and deep indoor environments, distinguishing the IO state in these two environments is easy. However, the indoor and outdoor scenes actually encountered in urban areas are not limited to these two ideal cases; examples include overpasses, areas near tall buildings, spaces inside glass curtain walls, and areas close to indoor patios. We define these ideal and non-ideal indoor and outdoor environments together as complex scenes. By defining the categories of complex scenes, we present the diversity of scenes for data collection.
For the above reasons, many previous studies introduced an ambiguous state such as semi-outdoor or shallow indoor to obtain better experimental results. Methods based on fixed detection rules or thresholds achieve satisfactory detection accuracy in ideal open outdoor or closed indoor environments; however, their IO detection accuracy decreases significantly in IO transition areas, or they are unable to identify such areas at all, as shown in Figure 1. Moreover, an uncertain status such as semi-outdoor or shallow indoor is difficult for many applications to interpret, since the environmental characteristics of such a state are not defined. In this paper, we focus on these complex scenarios, but the final detection status includes only indoors and outdoors. We are also concerned with the detection accuracy and the delay of IO transitions in complex scenarios.
To accomplish IO detection in complex scenarios, we must address two major challenges. We define scenes for which no data were collected in the training stage, and scenes without specific parameters, as new environments. First, IO detection performs poorly in new environments because of the variety of signal variations: users may be in scenarios that do not match the training phase, such as in a room, near a window, under an overpass, or in the open outdoors. Detecting all environments with fixed rules and constraints is impractical. Second, it is difficult to detect the transition between indoor and outdoor environments correctly within a short time, because indoor and outdoor signal sources are similar when switching between indoors and outdoors in complex scenarios.
To address both challenges, we leverage Global Navigation Satellite System (GNSS) measurements from Android smartphones to detect the IO status in complex environments. Because the availability and accuracy of satellite signals tend to be less affected by factors other than the environment, we extract spatial geometry distribution, time sequence and statistical features from the GNSS measurements collected by Android smart mobile devices. Then, we apply supervised machine learning algorithms to predict the IO status. Finally, we treat the predicted IO status as the observations of a Hidden Markov Model (HMM) [17] to accurately recognize the IO status and promptly detect transitions between indoors and outdoors in complex scenarios. To the best of our knowledge, this paper is the first to use a stacking model with an HMM for IO detection.
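The idea of treating per-epoch classifier outputs as HMM observations can be illustrated with a minimal two-state Viterbi decoder. This is a sketch under stated assumptions rather than the model used in this paper: the states are indoor (0) and outdoor (1), the prior is uniform, and the probability of staying in the same state (`p_stay`) and the probability that the classifier output matches the true state (`p_correct`) are hypothetical values chosen only for illustration.

```python
import math
from typing import List, Sequence

INDOOR, OUTDOOR = 0, 1
STATES = (INDOOR, OUTDOOR)


def viterbi_smooth(
    predictions: Sequence[int],
    p_stay: float = 0.9,      # assumed probability of remaining in a state
    p_correct: float = 0.8,   # assumed probability the classifier is right
) -> List[int]:
    """Return the most likely IO state sequence given classifier outputs."""
    if not predictions:
        return []

    def log_trans(prev: int, cur: int) -> float:
        return math.log(p_stay if prev == cur else 1.0 - p_stay)

    def log_emit(state: int, obs: int) -> float:
        return math.log(p_correct if state == obs else 1.0 - p_correct)

    # Initialise with a uniform prior over the two states.
    score = [math.log(0.5) + log_emit(s, predictions[0]) for s in STATES]
    backpointers: List[List[int]] = []

    for obs in predictions[1:]:
        new_score, pointers = [], []
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] + log_trans(best_prev, cur) + log_emit(cur, obs)
            )
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    noisy = [0, 0, 0, 1, 0, 0, 0]   # one isolated outdoor prediction
    print(viterbi_smooth(noisy))
```

With these assumed parameters, an isolated contradictory prediction is filtered out, while a sustained change in the classifier output is still followed. A larger `p_stay` suppresses more spurious switches at the cost of a longer transition delay.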
The main contributions of our work are summarized as follows:
- 1) We propose a novel IO detection algorithm employing an ensemble model based on stacking. To further filter out occasional detection errors and improve the reliability of IO detection in complex scenarios, we apply an HMM to the detection results obtained by the ensemble model.
- 2) We focus on IO switching detection to guarantee the continuity of IO detection. To improve IO detection accuracy and reduce IO switching delay, we analyze and extract spatial geometry distribution, time sequence and statistical features of GNSS measurements using different sliding windows on Android smartphones rather than on other GNSS receivers.
- 3) To evaluate the proposed algorithm in typical IO scenarios, we compare it with two state-of-the-art IO detection methods that use GNSS information on four different datasets. The experimental results show that, in complex IO scenarios, the proposed algorithm achieves higher IO detection accuracy and lower switching delay than the other algorithms in new test environments.
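As a rough illustration of the sliding-window feature extraction mentioned in the second contribution, the sketch below computes simple statistics over a time series of one GNSS-derived quantity, for example the number of visible satellites per epoch. The specific statistics (mean, population standard deviation, and first-to-last change) and the function name `sliding_window_features` are assumptions chosen for illustration; they are not the feature set defined in this paper.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def sliding_window_features(
    values: Sequence[float], window: int
) -> List[Dict[str, float]]:
    """Compute per-window statistics over a series of GNSS-derived values.

    One feature dict is produced for each full window; windows advance
    by one sample at a time.
    """
    if window <= 0:
        raise ValueError("window must be positive")

    features = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        features.append(
            {
                "mean": mean(chunk),              # statistical feature
                "std": pstdev(chunk),             # statistical feature
                "delta": chunk[-1] - chunk[0],    # simple time-sequence feature
            }
        )
    return features


if __name__ == "__main__":
    # Hypothetical visible-satellite counts while walking into a building.
    satellite_counts = [10, 10, 4, 4]
    for row in sliding_window_features(satellite_counts, window=2):
        print(row)
```

A short window reacts faster to an IO switch but is noisier, whereas a long window is more stable but delays detection, which is one reason to consider several window lengths.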