Behavior Research Methods

https://doi.org/10.3758/s13428-020-01404-5

Smartphone sensor accuracy varies from device to device in mobile research: The case of spatial orientation
Tim Kuhlmann 1,2 & Pablo Garaizar 3 & Ulf-Dietrich Reips 2

© The Author(s) 2020

Abstract
Smartphone usage is increasing around the globe—in daily life and as a research device in behavioral science. Smartphones offer
the possibility to gather longitudinal data at little cost to researchers and participants. They provide the option to verify self-report
data with data from sensors built into most smartphones. How accurate this sensor data is when gathered via different smartphone
devices, e.g., in a typical experience sampling framework, has not been investigated systematically. With the present study, we
investigated the accuracy of orientation data about the spatial position of smartphones via a newly invented measurement device,
the RollPitcher. Objective status of pitch (vertical orientation) and roll (horizontal orientation) of the smartphone was compared
to data gathered from the sensors via web browsers and native apps. Bayesian ANOVAs confirmed that the deviations in pitch
and roll differed between smartphone models, with mean inaccuracies per device of up to 2.1° and 6.6°, respectively. The
inaccuracies for measurements of roll were higher than for pitch, d = .28, p < .001. Our results confirm the presence of
heterogeneities when gathering orientation data from different smartphone devices. In most cases, measurement via a web
browser was identical to measurement via a native app, but this was not true for all smartphone devices. As a solution to lack
of sensor accuracy, we recommend the development and implementation of a coherent research framework and also discuss the
implications of the heterogeneities in orientation data for different research designs.

Keywords Smartphone . Experience sampling . Ambulatory assessment . Sensor data . Tilt . Pitch . Roll

* Tim Kuhlmann
[email protected]

1 Differential Psychology, Assessment & Research Methods, Department of Psychology, University of Siegen, Adolf-Reichwein-Str. 2a, 57068 Siegen, Germany
2 Research Methods, Assessment, & iScience, Department of Psychology, University of Konstanz, Konstanz, Germany
3 Faculty of Engineering, University of Deusto, Bilbao, Spain

Smartphones present themselves as a powerful tool for researchers. They offer the possibility to gather data from participants in everyday life, largely independent of location and time. In addition, the measurement device is already familiar to participants and causes little to no intrusion or additional costs (Miller, 2012). Smartphones have been widely implemented as part of experience sampling designs (ESMs, e.g., Stieger & Reips, 2019) and in the health sciences (Bert, Giacometti, Gualano, & Siliquini, 2014). In experience sampling designs, smartphones are implemented as tools to gather data from participants at specified times in a diary study or to gather data when events in their lives occur. The topics that can be investigated in real-time include many everyday activities, as most smartphone users carry their device around everywhere they go. In addition to subjective measurements, smartphones offer the availability of physical sensors that are already integrated and easily accessible (Miller, 2012). These include, among others, GPS, Bluetooth, and data on spatial orientation.

Data from these sensors can be gathered via apps and browsers and the advantages of using these data are more and more evident in the behavioral, social, and health sciences. In the health sector, the data are used to recognize physical activity, mostly using data from the accelerometer. Studies have shown that smartphones are capable of achieving similar accuracies for physical activity recognition as dedicated devices, such as smart watches and heart rate monitors (Brajdic & Harle, 2013; Case et al., 2015). Sensor data can also be used to identify falling or other medical emergencies (Yavuz et al., 2010) and to improve accessibility for wheelchair users (Gupta, Holloway, Heravi, & Hailes, 2015). The Bluetooth sensor has been implemented to detect whether a person is in a work or social situation (Lathia, Pejovic, Rachuri, Mascolo, Musolesi, & Rentfrow, 2013). In the social sciences, studies have further been conducted that link a
person's well-being to their surroundings via GPS (MacKerron & Mourato, 2013; Stieger & Reips, 2019). Stieger and Reips (2019) investigated data from both smartphone sensors and from open-access Internet databases on temperature, longitude, latitude, altitude, wind speed, rainfall, and further environment-based variables to predict fluctuations of well-being by using a smartphone-based mobile experience sampling method. In their study, they found a high correlation between smartphone GPS measurement of altitudes and Google Maps measurement of altitudes, but a consistent difference in absolute measurement.

In order for the sensor data to be useful to researchers, it has to be accurate and a valid indicator for the behavior. If the data from the implemented devices shows a large amount of error, conclusions drawn from the data are necessarily unreliable. This fact is much more important in the context of smartphone studies as compared to previous research studies where the measurement device was given out by researchers to the participants (Miller, 2012). Previous studies allowed researchers to pick an adequate device, preprogram all the necessary parts, and check the reliability of the data. Error might still be present, but it can be investigated and potentially mitigated. Furthermore, it is largely homogenous across the sample. In most smartphone studies, participants are using their own device and only download an app or open their web browser to participate. This presents researchers with additional problems, as there may be a large heterogeneity of data across devices. Naturally, over the Internet it is impossible to check all devices for their idiosyncrasies (Reips, Buchanan, Krantz, & McGraw, 2015).

With regard to data from the objective sensors, this problem is rather new to behavioral and social scientists. Data on the devices’ location and orientation telling us indirectly about fine-grained motion behaviors of a large number of people have only been introduced with the development of smartphones. Their implementation has for the most part been focused on games and interactive apps. The implementation of sensor data in behavioral and social science research is only beginning, meaning the requirements for accuracy of measurement via user devices are not well investigated. Blunck et al. (2013) developed a taxonomy of heterogeneities and their sources in mobile phones. The focus of the present study is heterogeneities due to the device, i.e., resulting from the platform, hardware, and OS. Thus, here we are investigating a special case of using heterogeneous consumer-grade equipment in Internet-based research.

Gathering a device’s orientation from sensors

Social and behavioral researchers run their experiments on mobile devices using native and web apps. Native apps run on top of the operating system of the device and use compiled code (i.e., Java/Kotlin for Android devices, Objective-C/Swift for iOS devices). Web apps run within a web browser (Google Chrome, Apple Safari, Mozilla Firefox) and use web APIs (Application Programming Interfaces) through JavaScript code. Moreover, there are some application development frameworks (Xamarin, Appcelerator, Adobe PhoneGap) that are able to port their code to multiple platforms such as Android through different approaches. Embedding a web browser within a native app to run JavaScript code is a common strategy for these cross-platform frameworks. Therefore, we have to know how an app was developed to categorize it as native or web app.

The distinction between native and web apps is important when we are gathering information from mobile devices’ sensors. Native applications are able to collect data from hardware sensors directly while web applications are unable to do so, for security reasons. However, most native mobile applications do not take values directly from sensors but use what mobile operating systems call "software sensors". Software sensors provide estimates of actual position, orientation, and motion values by combining the readings of various hardware sensors such as accelerometers, gyroscopes, magnetometers, or barometers. Hardware sensors in today’s smartphones are similar to circuit chips in appearance and work electronically. Using software sensors is a good development strategy because it allows developers to forget about the peculiarities of each hardware sensor and delegate the integration of their values to the mobile operating system. Considering this is the approach followed by mobile web browsers, the differences between native and web apps should be minimal.

The focus of the current study is sensing the smartphone’s spatial orientation. Data from the orientation software sensor is used to take photo sphere images or when playing games that use the tilt of the phone as input. In the behavioral sciences, this data has been implemented as a proxy measurement for body posture (Kuhlmann & Reips, 2020) and position of wheelchairs (Milenković et al., 2013). In future studies, possible implementations include the measurement of motor tasks in experience sampling designs or the measurement of the environment by placing the phone in the surrounding environment. Questionnaire items or tasks could also be answered by tilting the phone instead of responding on a typical scale.

On-board technology provides information about the tilt of the smartphone across three different axes, x, y, and z. The rotation around one of these axes, z, indicates the cardinal direction of the phone. The other two rotations describe the rotation of the device itself around the other two axes (see Fig. 1). As mentioned before, data about the location of the smartphone is gathered by using a software sensor that integrates information from the accelerometer, gyroscope, and magnetometer of the device to provide acceleration data from gravity. If the device is lying on a flat
surface, this force is aiming to the ground, at a 90° angle to the screen and the two axes around which pitch and roll are measured. No force is present along the axes that are parallel to the long and short side of the phone. When the device is tilted around its axes, the force is no longer vertical to the screen. This deviation is used to calculate the values for pitch and roll, indicating the tilt of the smartphone in relation to its flat position.

Fig. 1 Illustration of orientation measures pitch and roll in relation to smartphone axes

As mentioned before, orientation data is gathered via different APIs and sensor frameworks. Native mobile apps can gather orientation data via sensor-specific frameworks for different operating systems (Blunck et al., 2013). Web apps use the DeviceOrientation Event specification (Tibbet & Volodine, 2017) to access this data. The different APIs and frameworks present a potential source of heterogeneity between different software implementations and devices. How the actual values of orientation are computed already differs depending on the implemented browser and operating system (Deveria, 2018).

Accuracy of smartphone sensor data

The accuracy of other sensor data has been investigated in previous studies. Stisen and colleagues (2015) investigated the heterogeneities of data gathered from accelerometers of different smartphone and smartwatch models. The acceleration sensor measures the acceleration of the devices along different axes and provides useful data to distinguish different activities. The authors were interested in the effect of possible heterogeneities on activity recognition due to different sensors, devices, and workload. In their study, they found that heterogeneities impaired the performance of human activity recognition. The impairments differed significantly between different devices, as sensor data accuracy varied between different models and manufacturers. For some investigated devices, deviations from the actual acceleration were as large as the difference between standing still and accelerating on a fast train (Stisen et al., 2015). They also found some indications of heterogeneities for data from the orientation sensor on activity recognition, but this was not the main part of their investigation, as orientation data is not the best choice for this task.

Data on the accuracy of the spatial orientation is mostly based on the investigation of external influences on the accuracy and natural drifts in values, mostly implemented during production (Grewal & Andrews, 2010). One of these external influences is the temperature at which the orientation sensor operates. Changing temperature results in inaccuracies of the readings (Weinberg, 2011). As this inaccuracy is predictable and quite consistent, most orientation sensors are coupled with a temperature sensor. Another source of inaccuracies is acceleration and vibration. This is especially a problem for compact orientation sensors without much buffer that are implemented in mobile devices (Weinberg, 2011). The orientation sensor itself cannot be calibrated via a simple user prompt as is the case for the cardinal direction. Some studies did implement calibration techniques involving external sensors and expensive setups (e.g., Umek & Kos, 2016).

It has not been investigated so far how the implementation via different applications and frameworks influences the orientation data, e.g., whether the data is gathered via a browser or a native app. As mentioned in the previous section, frameworks and browsers read and transform data on tilt differently (Deveria, 2018; Tibbet & Volodine, 2017). In addition, app development frameworks might perform transformations of the data that suit the intended implementation of the target audience. For example, applications developed via the MIT App Inventor transform the values of pitch and roll when they cross 90° of tilt (MIT App Inventor Public Open Source, 2018).

The current study investigates the accuracy of orientation data in implementations that closely resemble those of actual study designs. We gather data from smartphones that are participants’ actual phones without modifying their settings, installed apps, preferred browser, etc. To ascertain the real values of pitch and roll that the smartphone is rotated to, we designed and built a mounting device for smartphones, RollPitcher, which allows for the independent manipulation of pitch and roll in a controlled lab setting. Our study therefore fills an important gap in knowledge between accuracy measurements of sensors close to production and their accuracy in actual implementations in smartphone studies. Our hypotheses are:

H1: The accuracy of the orientation data differs between smartphone devices
H2: The accuracy of the orientation data differs, to a small degree, between modes of measurement on the same smartphone
H3: The inaccuracies of the orientation data are consistent across measurements of different angles of the same smartphone with the same software, i.e., deviations from real values correlate across measurements

Method

RollPitcher, the smartphone mounting device

Two RollPitcher devices were custom built by the scientific workshop of the University of Konstanz, one made of metal and one entirely out of plastic, except for small parts. They consist of a solid base, on top of which the mount was attached. A technical drawing of the mounts is shown in Fig. 2. They have two different hinges, which allows for the separate adjustment of pitch and roll values. The mount for the smartphone itself is made out of solid plastic in both RollPitchers and has a cut-out in the base to allow all smartphone models to lie flat on the surface despite possible bumps from cameras on the backside.

The actual metal and plastic devices weigh about 8 and 3 kg, respectively. The metal and plastic devices are shown in Fig. 3. The base can be adjusted via four different screws and thus allows leveling of the base precisely to 0°. Levelling out the base was achieved by adjusting the position with a high-precision digital mechanic’s level, the Stabila STB196E-2-60P, with a maximum error of .05° at 0°. To ascertain the precision of the objective angle positions, the mechanic’s level was used on some occasions to measure the pitch and roll of the smartphone by placing it on top of the screen along both axes. These values were within 0.15° of the proposed pitch and roll values, confirming the precision of the mount and procedure.

Sample of smartphones

A total of 56 different smartphones were measured, 31 Android devices, 24 iOS devices, and one Windows 10 device. All devices were measured via the browser implementation. In addition, a subsample of 39 devices was also measured via the native apps. A complete list of the smartphone models with their OS is shown in Table 1. The smartphones were selected to represent a typical sample of smartphones implemented in the research setting during a smartphone study. We used smartphones from participants who took part for course credit or remuneration. The phones cover different manufacturers, models, and operating systems. We did not alter any of the settings, OS updates or installed browsers to follow the logic of simulating a real study situation as closely as possible, apart from the objective position of the device.

Measures

The values of pitch and roll were gathered via two different software implementations, a website and native applications in Android and iOS. The measurement on a website was implemented via the DeviceOrientation Event Specification (Tibbet & Volodine, 2018). This specification provides several DOM events related to the orientation and motion of a device. The deviceorientation event supplies the physical orientation of the device, the devicemotion event supplies the acceleration of the device, and the compassneedscalibration event is used to warn web apps about the need for recalibration of the compass being used to provide data for one of the other two events. Considering this, we created a simple web app, available in the OSF archive, and registered it to receive deviceorientation events. The events provide four attributes, of which two are of interest to the present study (see Fig. 1). The pitch of the device is represented by beta. It describes the top-down orientation around the x-axis, represented in degrees with values ranging from −180 to 180. The roll is represented by gamma, which describes the left–right orientation around the y-axis, represented in degrees with values ranging from −90 to 90. The code to register deviceorientation events via the web app is the following:

window.addEventListener("deviceorientation",
  function(event) {
    // process event.beta and event.gamma
  }, true);

Fig. 2 Technical drawing of RollPitcher, the smartphone mounting device
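As an illustration of what the handler body might do, the sketch below turns an event into pitch and roll readings and computes their deviation from the angles set on the mount. The helper names (readTilt, tiltDeviation) and the example mount setting are hypothetical and are not taken from the web app in the OSF archive. The only API assumed is the pair of beta and gamma attributes described above, which can be null when a device cannot supply them.

```javascript
// Hypothetical helpers for illustration; not the authors' OSF code.

// Extract pitch (beta) and roll (gamma) in degrees from a deviceorientation
// event. Returns null when the device did not supply both angles.
function readTilt(event) {
  if (event.beta == null || event.gamma == null) {
    return null;
  }
  return { pitch: event.beta, roll: event.gamma };
}

// Signed deviation of a sensor reading from the objective angles on the mount.
function tiltDeviation(reading, objective) {
  return {
    pitch: reading.pitch - objective.pitch,
    roll: reading.roll - objective.roll,
  };
}

// Browser-only wiring; skipped where no window object exists (e.g., Node.js).
if (typeof window !== "undefined") {
  const objective = { pitch: 30, roll: 15 }; // example mount setting (assumed)
  window.addEventListener("deviceorientation", function (event) {
    const reading = readTilt(event);
    if (reading !== null) {
      console.log(tiltDeviation(reading, objective));
    }
  }, true);
}
```

Keeping the two helpers free of any reference to window makes them easy to check outside a browser, for example with a plain object such as { beta: 31.5, gamma: 14 }. Some browsers may additionally require a user permission before they deliver these events.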

Fig. 3 RollPitcher metal mounting device with smartphone and mechanic’s level (left) and plastic mounting device (right)

The beta angle is 0° when the device's top and bottom are the same distance from the earth's surface. If the device is in a vertical plane and the top of the screen pointing upwards, the value of beta is 90°. The gamma angle is 0° when the device's left and right sides are the same distance from the surface of the earth.

Table 1 Measured smartphone models

Manufacturer  Model         Number of devices tested  Operating system
Samsung       A5            1                         Android
Samsung       Duos          1                         Android
Samsung       J5            1                         Android
Samsung       S4            1                         Android
Samsung       S5            1                         Android
Samsung       S6            1                         Android
Samsung       S7            2                         Android
Samsung       S8            12                        Android
Samsung       S9+           1                         Android
Huawei        P8            1                         Android
Huawei        P8 Lite       3                         Android
Huawei        P9 Lite       3                         Android
Huawei        P20 Lite      1                         Android
Xiaomi        Pocophone F1  1                         Android
Apple         iPhone 5      1                         iOS
Apple         iPhone 6      9                         iOS
Apple         iPhone 6s     2                         iOS
Apple         iPhone SE     2                         iOS
Apple         iPhone 7      6                         iOS
Apple         iPhone 8      1                         iOS
Apple         iPhone XR     1                         iOS
CAT           S61           1                         Android
Nokia         Lumia 950     1                         Windows 10 Mobile
Honor         9 Lite        1                         Android
Motorola      G4+           1                         Android

We developed the native application for Android ourselves following the guidelines provided by the Android official documentation. Although the orientation sensor was deprecated in Android 2.2 (API level 8), the Android sensor framework provides alternate methods for acquiring device orientation. The orientation angles are computed by using a device's
geomagnetic field sensor in combination with the device's accelerometer. The use of these two hardware sensors provides three orientation angles, two of which are relevant for the present study: pitch describes the degrees of rotation about the x-axis, i.e., top-bottom tilt from −180 to 180 degrees; roll describes the degrees of rotation about the y-axis, i.e., left-right tilt from −90 to 90 degrees. The angles correspond to the aforementioned beta and gamma values from the Device Orientation API. The native application implemented on iOS devices was the sensor reading app “Sensors Multitool”, available free of charge from the Apple AppStore. It provides separate sensor readings for pitch and roll, named x and y, and displays them on-screen.

Procedure

At the start of the measurements, the native apps were installed on participants’ smartphones and their screen was set to portrait mode. Before the smartphone was mounted, any protective case was removed and RollPitcher was levelled out to a precision of ± .05°. The smartphone was then placed in position. The measurements took place according to a scheduled sequence of angle combinations. The pitch angles had the values 0°, 30°, 60°, and 85° in both directions. Roll angles were 0°, 15°, and 30° in both directions. These values were chosen to represent typical locations of smartphones during everyday use. The pitch values typically deviate more from the null point than roll values (Kuhlmann & Reips, 2020). The vertical angle of 90° was not measured, as for this angle there is no meaningful value of roll because the smartphone is standing on the side. Every combination of pitch and roll angles was implemented via the mounting device and the data sent three times, with a pause of 1–2 s between sending, to a Firebase database. This data had a precision of nine decimal points. Data from the iOS native app was recorded by hand to a spreadsheet with a precision of one decimal point. This procedure led to 35 different combinations of angles measured per device, once via native app, and once via web browser. The browser was chosen based on participants’ standard settings for web browsing, representing the most likely way a participant would take part in an Internet-based questionnaire.

Statistical analyses

The data was imported from Firebase and spreadsheets into R. Analyses were carried out in R and JASP (JASP Team, 2018). Data files, R scripts and the browser app are available at https://osf.io/hfcx8/. The three repeated measures of pitch and roll for each angle combination were close to identical, differing by less than .01° on average. The arithmetic mean of the three measurements was used in statistical analyses.

Results

First, we describe the exclusion of suspect roll values. Then the descriptive statistics for deviations of pitch and roll are presented and thereafter the statistical analyses comparing different smartphone devices.

Exclusion of roll values at pitch of 85°

Roll values that were gathered at pitch angles of 85° were excluded from all combined analyses as these showed unusually large deviations from the objective values. The deviations from objective roll values ranged from −28.5° to 55.7° (SD = 9.9°) at pitch angles of 85°. The deviations from objective roll values for all other pitch angles ranged from −11.6° to 10.0° (SD = 1.8°). Possible explanations and interpretations for this qualitative difference are reviewed in the Discussion section. The main hypothesis is that the angle of 85° is too close to 90°, at which there are no meaningful values for roll.

The influence of RollPitcher building material

We conducted measurements via two RollPitchers that differed in the material they were made of, one made from metal and the other from plastic. To assess whether the material of the RollPitcher has an influence on the measurement, we performed the same measurement routine on identical smartphones in both RollPitchers in short succession. This was carried out with four different smartphones. A Bayesian repeated measures ANOVA with RollPitcher device as the repeated measures variable and pitch deviation as the dependent variable was calculated. The Bayes factor was BF10 = .156, providing no evidence for an effect of the device, but moderate evidence in favor of no difference. The effect size of the repeated measures factor was η² = .0004, indicating that less than 0.1% of the variance in pitch deviations could be attributed to the RollPitcher device. The results for roll deviation were similar with BF10 = .136 and η² < .001.

Deviations of pitch and roll from objective values

The distributions of the deviation of sensor measured pitch and roll values from the objective angles are shown in Fig. 4. The distributions are based on data gathered via the browser of the smartphones. The mean deviation was 0.05° for pitch, ranging from −17.8° to 8.1° (SD = 1.2°). For the roll values, the mean deviation was 0.20°, ranging from −11.6° to 10.0° (SD = 1.8°).

The distributions based on data from the native apps are shown in Fig. 5. The distributions are similar to the ones gathered via the browser. The mean deviation was 0.05° for pitch, ranging from −5.71° to 3.48° (SD = 1.1°). For the roll
values, the mean deviation was 0.21°, ranging from −14.7° to 10.3° (SD = 2.0°).

Fig. 4 Browser-measured deviations of the sensor gathered pitch values (top) and roll values (bottom) from the objective position of the smartphones across all devices

Fig. 5 Native app-measured deviations of the sensor gathered pitch values (top) and roll values (bottom) from the objective position of the smartphones across all devices

The correlation between the values from the browser and the native app was r = .91 for pitch and r = .90 for roll. Overall, the results show a high, albeit not perfect, overlap between the two modes of measurement.

Comparison of devices and mode of measurement

Bayesian repeated measures ANOVAs for the absolute deviations of pitch and roll values from the objective angles were calculated. This allows for a comparison of the deviation in both directions and removes the possibility that inaccuracies in both directions cancel each other out. It therefore allows for a better comparison of the heterogeneity between devices and software. The repeated measures factor was mode of measurement, i.e., native app or web browser. The 56 smartphone devices were included as a between factor. Results for the deviation values of pitch are shown in Table 2.

Table 2 Bayesian repeated measures ANOVA of absolute pitch deviations

Models                           BF10            Error %
Null model                       1.000
Mode of measurement (repeated)   75.81           3.248
Smartphone device                1.32 × 10^103   0.195

The heterogeneities in pitch deviations due to smartphone device did show very strong support for an influence of the smartphone device, with a Bayes factor of BF10 = 1.32 × 10^103. The explained variance in pitch deviations by smartphone device amounted to η² = .38. Also, the Bayes factor for the repeated measures supports a difference of pitch
deviations due to mode of measurement, BF10 = 75.81, but the explained variance was very small with η² = .001. When including the smartphone device in the null model and computing the Bayes factors for adding the interaction it showed very strong support for improving the model, BF10 = 9.64 × 10^9. This result signifies that the mode of measurement, browser vs. native app, did not affect all devices equally, with some devices showing larger differences than others. The mean absolute deviations and their standard deviations for the browser values of pitch are shown in Fig. 6.

Fig. 6 Mean browser-based absolute deviations from the objective pitch values and their SDs by smartphone device

Results for the Bayesian repeated measures ANOVA of deviation values of roll are shown in Table 3. The results for roll are consistent with the ones for pitch. The Bayes factor for the repeated measures did provide evidence against a difference of roll deviations due to mode of measurement, BF10 = 0.08. The explained variance was very low with η² < .001. The heterogeneities in roll deviations due to smartphone device did show very strong support for an influence of the smartphone device with a Bayes factor of BF10 = 1.73 × 10^136. The explained variance in roll deviations by smartphone device was higher as compared to pitch deviations, η² = .57.

Table 3 Bayesian repeated measures ANOVA of absolute roll deviations

Models                           BF10            Error %
Null model                       1.000
Mode of measurement (repeated)   0.08            1.042
Smartphone device                1.73 × 10^136   0.212

When including the smartphone device in the null model and computing the Bayes factors for adding the mode of measurement and the interaction, the main effect for mode of measurement did not improve the model, BF10 = 0.08, but the interaction again did, BF10 = 4.23 × 10^13. This signifies that for some smartphone devices, the mode of measurement does change the values, but not as a main effect. The mean absolute deviations and their standard deviations for the browser values of roll are shown in Fig. 7.

The following analyses are only reported for the browser-based measurements because the differences between browser and native measurements were very small and the browser data was available for all devices.

A linear mixed model with smartphone device as a random effect was calculated to compare the deviations in pitch values to the deviations in roll values. The analysis confirmed the impression from the descriptive plots. Deviations from objective roll values were higher than the deviations from pitch values by an average of 0.36°, t(3260) = 10.91, p < .001, d = .28.

Hypothesis 3, the consistency of the deviations within the same smartphone device, was tested via ICCs. We were interested in the consistency of the deviations across the different objective angles that were measured. For the pitch values, the
Fig. 7 Mean browser-based absolute deviations from the objective roll values and their SDs by smartphone device

inaccuracies did show a moderate amount of consistency each device were calculated. The dependent variable
within devices, ICC = .26, p < .001. This signifies that pitch was always the absolute tilt deviation and the angle,
measurement deviations within a device were somewhat con- pitch, or roll, was included as a covariate. As there was
sistent across measurement occasions. For the roll values, the only one device with a different OS than Android or
consistency of inaccuracies within devices was smaller, ICC = iOS, the Nokia Lumia 950, the analysis compared only
.07, p < .001. Roll measurement deviations were not as stable these two operating systems. The OS of the device did
within the measured devices. show an association with the accuracy of measurement,
t(52.94) = – 2.39, p = .021. iOS devices showed slightly
smaller inaccuracies, but the effect size was very small
Comparison of operating systems and manufacturer with η2= .03. The mean inaccuracy for pitch and roll of
both operating systems is shown in Fig. 8.
To compare the impact of the operating system and the The manufacturer of the device, e.g., Samsung, Apple,
manufacturers of the device on the accuracy of measure- Huawei, did not predict inaccuracies in pitch or roll deviation,
ment, linear mixed models with random intercepts for t(53.95) = – 0.174, p = .86.
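As an illustration of the consistency analysis for Hypothesis 3, the following minimal Python sketch computes a one-way random-effects intraclass correlation, ICC(1), with devices as groups and the deviations at different objective angles as repeated measurements. The text does not state which ICC variant was used, so the formula, the function name icc_oneway, and the deviation values are assumptions made for illustration only; they are not the study's data or analysis script.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way random-effects ICC(1) for balanced groups.

    groups: list of lists, one inner list of deviations per device,
    all inner lists having the same length k >= 2.
    """
    n = len(groups)
    k = len(groups[0]) if groups else 0
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need >= 2 devices with equal numbers (>= 2) of measurements")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Mean square between devices and mean square within devices.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical signed pitch deviations (degrees) of three devices,
# each measured at the same four objective angles.
deviations = [
    [0.4, 0.6, 0.5, 0.7],
    [-1.1, -0.8, -1.4, -0.9],
    [0.1, -0.2, 0.3, 0.0],
]
print(round(icc_oneway(deviations), 3))
```

Values near 1 indicate that a device deviates in a similar way at every angle, whereas values near 0 indicate that deviations are not stable within a device.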

Fig. 8 Absolute browser-based deviations of pitch and roll aggregated for operating systems (two panels; y-axes: pitch deviation (absolute) and roll deviation (absolute), each scaled from 0 to 4)
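The aggregation behind a plot such as Fig. 8 can be sketched as follows: the absolute deviation between a measured and an objective angle is computed per measurement and then averaged per operating system. The helper names, the record layout, and the wrap-around handling at ±180° are assumptions for this sketch, and the values are invented for illustration.

```python
from collections import defaultdict


def abs_deviation(measured_deg, objective_deg):
    """Absolute angular deviation in degrees, wrapped into [0, 180]."""
    diff = (measured_deg - objective_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def mean_abs_deviation_by_group(records):
    """records: iterable of (group, measured_deg, objective_deg) tuples.

    Returns a dict mapping each group (e.g., an operating system)
    to its mean absolute deviation in degrees.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, measured, objective in records:
        sums[group] += abs_deviation(measured, objective)
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Hypothetical pitch measurements at an objective angle of 45 degrees.
records = [
    ("Android", 46.0, 45.0),
    ("Android", 42.0, 45.0),
    ("iOS", 45.5, 45.0),
]
print(mean_abs_deviation_by_group(records))
```

Note that such group means ignore the nesting of measurements within devices, which is why the analysis in the text relies on mixed models with random intercepts per device instead.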

Discussion

Our results show that heterogeneities in pitch and roll data are present for the orientation sensor. Devices differ in accuracy, with some showing mean deviations close to 0° and little variance while other devices show mean inaccuracies of up to 2°, on some occasions reaching over 6° compared to the objective tilt. The deviations are higher for measurements of roll than they are for measurements of pitch. Hypothesis 1, referring to the heterogeneities between devices, was therefore supported. The results are in line with findings on the accelerometer (Stisen et al., 2015).

Whether a browser or a native application was used to gather data did not, overall, influence the measurement accuracy of the sensor data. However, there was a significant interaction between smartphone device and mode of measurement, pointing towards some differences between devices. For some devices, the values of pitch and roll were basically identical, regardless of whether they were measured via a web browser or native application. For other devices, the differences were more pronounced. In addition to these heterogeneities, the directions, e.g., signs of angles, differed depending on the software. Depending on the planned study, a reversal of angles in some devices has the potential to seriously alter the results of analyses. The results with regard to mode of measurement partially support our second hypothesis. The software implementation does have an influence, but to a smaller degree than the differences across devices. This is in line with previous suggestions (Blunck et al., 2013) and technical considerations when implementing software sensors (Deveria, 2018).

The magnitude of the deviations is not negligible, but their importance depends on the research question that is investigated. If the orientation sensor is merely supposed to indicate switches between portrait and landscape mode or capture falling behavior, small heterogeneities might not be as impactful. In one of our studies, however, deviations of 4–5° from the actual values are close in magnitude to the effect size when trying to measure body posture (Kuhlmann & Reips, 2020).

The analyses on differences in inaccuracies between different operating systems and manufacturers only revealed small or non-significant effects. The inaccuracies of specific devices contribute more variance than distinguishing by operating system or manufacturer.

For the present study, we excluded roll values that were acquired at 85° of pitch because of their inaccuracy. If this correction is not performed, the heterogeneities and inaccuracies are orders of magnitude higher, to the point of approaching random values of roll. In the Procedure section, we mentioned that at pitch values of 90°, there does not exist a meaningful value for roll because the device is standing on the side. Tilt around the y-axis, i.e., roll, is always at a right angle to the gravitational force and different roll values can therefore not be distinguished. Our results suggest that this problem is already present at pitch angles lower than 90°. This explains the wide range of deviations measured at pitch angles of 85°. Future research should determine the exact angle from which the qualitative difference occurs. Our advice to researchers is to handle roll data at pitch angles approaching 90° with care and, whenever possible, check for unusually large variance or deviations. Another, more complex, solution is to use the raw data from the sensors and calculate quaternions instead of Euler angles (Favre, Jolles, Siegrist, & Aminian, 2006).

Our results indicate that inaccuracies are moderately consistent within devices, meaning that a deviation in pitch in one direction at a certain angle does show a positive correlation with the deviation in pitch at other angles. This supports our third hypothesis, at least for deviations in pitch. When conducting longitudinal studies, this is possibly an important factor as variables are often ipsatized, i.e., centered around the person mean, in these designs to separate between- and within-person effects (Curran & Bauer, 2011). Ipsatizing creates variables for the within-person effect that are centered around a person mean. Stable deviations within one device mean that these ipsatized values are influenced to a lesser degree by heterogeneities and deviations of orientation data. The effect is not completely removed because the correlations within a device are not perfect and vary across devices and tilt. It is still an important fact to consider when evaluating whether and how big of a problem heterogeneities are in the context of a given research design.

Comparisons between persons, i.e., devices, are influenced to a higher degree. Not only are the inaccuracies a bigger problem because they are not consistent across devices, but different software implementations may also cause more profound problems for comparability (Blunck et al., 2013). There is no binding standard on the sign convention for pitch and roll, meaning that a negative pitch for one software solution might be the same value with opposite sign in others. There is no way to ascertain comparability of signs apart from testing them beforehand. Assuming that the number of software implementations is not too large, this should not be too problematic. A bit more problematic is the possibility that certain values are transformed or cut off by the software. For example, MIT App Inventor transforms values with an absolute value of over 90°: with increasing tilt, it either rolls them back towards 0 or freezes them at that angle until the value gets lower again (MIT App Inventor Public Open Source, 2018). Transformations of data are usually automated with certain applications in mind, e.g., games, which may not be in line with researchers’ interests. Furthermore, these transformations are not easy
to find in manuals, as they pertain to a very specific topic not usually of interest to everyday app developers. A recommendation for developers of research-oriented frameworks is to provide a coherent API where researchers can forget about device particularities and get similar values in cross-platform setups. Such an API would make comparisons easier for researchers and solve many of the problems of comparability before they arise in data analysis.

Limitations

The present study is limited by the number of devices that were investigated. Though they were selected to be comparable to the situation in a typical smartphone study, they do not cover the complete range of possible devices in a research design. This was not the aim of the present study, though. We wanted to investigate whether a problem exists when implementing the orientation sensor and to provide researchers with an estimate of possible effects. Our study finds there is a problem and the effects are substantial.

Another limitation concerns the angles of pitch and roll that we investigated. They do not cover the entire range of possible angles, but merely represent a typical combination that reflects common pitch and roll values. Our study is not guaranteed to also reflect extreme cases, e.g., turning the phone upside down, or reflect every other possible combination of angles. We do provide a meaningful number of combinations of pitch and roll, however, that cover the spectrum of a typical smartphone study (Kuhlmann & Reips, 2020).

Conclusions

The present study shows that heterogeneities are present in data on the spatial orientation of smartphones. The inaccuracies are usually between 0.5° and 3° (except for 85° pitch angle), large enough to potentially influence results. They differ depending on the smartphone device. Future studies could expand the number of devices tested, and a database could be created as a reference for researchers. A database would not allow researchers to perfectly adjust their design and analyses, but might provide helpful data and guidelines when implementing orientation data in a study. It might also help other researchers to estimate the stability of their findings. The results of the present study provide an estimate of the magnitude of heterogeneity across different smartphones and indicate research designs in which it matters.

Acknowledgements We would like to thank Stefan Stieger for advice and input during the early stages of study design. We would also like to thank Jonathan Buchholz, Johanna Hoppe, Maria Krasnova, and Leonie Ripper for assistance in measuring. This article was presented as part of the symposium New methods and tools in Internet-based research at the 48th Annual Meeting of the Society for Computers in Psychology (SCiP) in New Orleans, Louisiana, US, November 15, 2018.

Open practices statement The data files, R scripts and the browser app are available at https://osf.io/hfcx8/. The study was not preregistered.

Funding Information Open Access funding provided by Projekt DEAL.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Bert, F., Giacometti, M., Gualano, M. R., & Siliquini, R. (2014). Smartphones and health promotion: a review of the evidence. Journal of Medical Systems, 38, 9995. doi: https://doi.org/10.1007/s10916-013-9995-7
Blunck, H., Bouvin, N. O., Franke, T., Grønbæk, K., Kjaergaard, M. B., Lukowicz, P., & Wüstenberg, M. (2013). On heterogeneity in mobile sensing applications aiming at representative data collection. Proceedings of the 2013 ACM Conference on Pervasive and Ubiquitous Computing Adjunct Publication, 1087–1098. doi: https://doi.org/10.1145/2494091.2499576
Brajdic, A., & Harle, R. (2013). Walk detection and step counting on unconstrained smartphones. Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 225–234. doi: https://doi.org/10.1145/2493432.2493449
Case, M. A., Burwick, H. A., Volpp, K. G., & Patel, M. S. (2015). Accuracy of smartphone applications and wearable devices for tracking physical activity data. JAMA, 313, 625–626. doi: https://doi.org/10.1001/jama.2014.17841
Curran, P. J., & Bauer, D. J. (2011). The disaggregation of within-person and between-person effects in longitudinal models of change. Annual Review of Psychology, 62, 583–619. doi: https://doi.org/10.1146/annurev.psych.093008.100356
Deveria, A. (2018). Can I Use...: Up-to-date browser support tables for support of front-end web technologies on desktop and mobile web browsers. Retrieved December 26, 2018, from https://caniuse.com/#feat=deviceorientation
Favre, J., Jolles, B. M., Siegrist, O., & Aminian, K. (2006). Quaternion-based fusion of gyroscopes and accelerometers to improve 3D angle measurement. Electronics Letters, 42, 612–614. doi: https://doi.org/10.1049/el:20060124
Grewal, M., & Andrews, A. (2010). How good is your gyro [ask the experts]. IEEE Control Systems, 30, 12–86.
JASP Team (2018). JASP (Version 0.9) [macOS 10.14.2].
Kuhlmann, T., & Reips, U.-D. (2020). Smartphone tilt as a measure of well-being? Evidence from two experience sampling studies. Manuscript in preparation.
Lathia, N., Pejovic, V., Rachuri, K. K., Mascolo, C., Musolesi, M., & Rentfrow, P. J. (2013). Smartphones for large-scale behavior change interventions. IEEE Pervasive Computing, 3, 66–73.
MacKerron, G., & Mourato, S. (2013). Happiness is greater in natural environments. Global Environmental Change, 23, 992–1000. doi: https://doi.org/10.1016/j.gloenvcha.2013.03.010
Milenković, A., Milosevic, M., & Jovanov, E. (2013). Smartphones for smart wheelchairs. In 2013 IEEE International Conference on Body Sensor Networks, 1–6.
Miller, G. (2012). The smartphone psychology manifesto. Perspectives on Psychological Science, 7, 221–237. doi: https://doi.org/10.1177/1745691612441215
MIT App Inventor Public Open Source (2018). Retrieved December 26th, 2018, from https://github.com/mit-cml/appinventor-sources/zipball/master
Reips, U.-D., Buchanan, T., Krantz, J. H., & McGraw, K. (2015). Methodological challenges in the use of the Internet for scientific research: Ten solutions and recommendations. Studia Psychologica, 15, 139–148.
Stieger, S., & Reips, U.-D. (2019). Well-being, smartphone sensors, and data from open-access databases: A mobile experience sampling study. Field Methods, 31, 277–291. doi: https://doi.org/10.1177/1525822X18824281
Tibbet, R., & Volodine, T. (2017). DeviceOrientation Event Specification: W3C working group note 30 May 2017. Retrieved from https://www.w3.org/TR/2017/NOTE-orientation-event-20170530/
Tibbet, R., & Volodine, T. (2018). DeviceOrientation Event Specification. Retrieved December 26, 2018, from https://w3c.github.io/deviceorientation/
Umek, A., & Kos, A. (2016). Validation of smartphone gyroscopes for mobile biofeedback applications. Personal and Ubiquitous Computing, 20, 657–666. doi: https://doi.org/10.1007/s00779-016-0946-4
Weinberg, H. (2011). Gyro Mechanical Performance: The Most Important Parameter. Retrieved from http://www.mouser.cn/pdfdocs/ADI_MS2158_TechnicalArticle.PDF
Yavuz, G., Kocak, M., Ergun, G., Alemdar, H. O., Yalcin, H., Incel, O. D., & Ersoy, C. (2010). A smartphone-based fall detector with online location support. Proceedings of the International Workshop on Sensing for App Phones, Zurich, Switzerland, 31–35.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
