Big IoT Data Analytics - Architecture, Opportunities, and Open Research Challenges
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2017.2689040, IEEE Access
Abstract—Voluminous amounts of data have been produced over the past decade as the miniaturization of Internet of things (IoT) devices increases. However, such data are not useful without analytic power. Numerous big data, IoT, and analytics solutions have enabled people to obtain valuable insight into large data generated by IoT devices. However, these solutions are still in their infancy, and the domain lacks a comprehensive survey. This study investigates state-of-the-art research efforts directed toward big IoT data analytics. The relationship between big data analytics and IoT is explained. Moreover, this study adds value by proposing a new architecture for big IoT data analytics. Furthermore, big IoT data analytic types, methods, and technologies for big data mining are discussed. Numerous notable use cases are also presented. Several opportunities brought by data analytics in the IoT paradigm are then discussed. Lastly, open research challenges, such as privacy, big data mining, visualization, and integration, are presented as future research directions.

Index Terms— Big data, Internet of things, Data analytics, Distributed computing, Smart city.

Mohsen Marjani, Abdullah Gani, Aisha Siddiqa, Ibrahim Abaker Targio Hashem, and Ibrar Yaqoob are with the Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia ([email protected]). Fariza Nasaruddin is with the Department of Information System, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia. Ahmad Karim is with the Department of Information Technology, Bahauddin Zakariya University, Multan, Pakistan.

I. INTRODUCTION

The development of big data and the Internet of things (IoT) is rapidly accelerating and affecting all areas of technologies and businesses by increasing the benefits for organizations and individuals. The growth of data produced via IoT has played a major role in the big data landscape. Big data can be categorized according to three aspects: (a) volume, (b) variety, and (c) velocity [1]. These categories were first introduced by Gartner to describe the elements of big data challenges [2]. Immense opportunities are presented by the capability to analyze and utilize huge amounts of IoT data, including applications in smart cities, smart transport and grid systems, energy smart meters, and remote patient healthcare monitoring devices.

The widespread popularity of IoT has made big data analytics challenging because of the processing and collection of data through different sensors in the IoT environment. The International Data Corporation (IDC) report indicates that the big data market will reach over US$125 billion by 2019 [3]. IoT big data analytics can be defined as the steps in which a variety of IoT data are examined [4] to reveal trends, unseen patterns, hidden correlations, and new information [5]. Companies and individuals can benefit from analyzing large amounts of data and managing huge amounts of information that can affect businesses [6]. Therefore, IoT big data analytics aims to assist business associations and other organizations to achieve improved understanding of data, and thus, make efficient and well-informed decisions. Big data analytics enables data miners and scientists to analyze huge amounts of unstructured data that cannot be harnessed using traditional tools [5]. Moreover, big data analytics aims to immediately extract knowledgeable information using data mining techniques that help in making predictions, identifying recent trends, finding hidden information, and making decisions [7].

Techniques in data mining are widely deployed for both problem-specific methods and generalized data analytics. Accordingly, statistical and machine learning methods are utilized. IoT data differ from normal big data collected via systems in their characteristics, which include heterogeneity, noise, variety, and rapid growth, because of the various sensors and objects involved during data collection. Statistics [8] show that the number of sensors will increase by 1 trillion in 2030. This increase will affect the growth of big data. Introducing data analytics and IoT into big data requires huge resources, and IoT has the capability to offer an excellent solution. Appropriate resources and intensive applications of the platforms are provided by IoT services for effective communication among various deployed applications. Such a process is suitable for fulfilling the requirements of IoT applications, and can reduce some challenges in the future of big data analytics. This technological amalgamation increases the possibility of implementing IoT toward a better direction. Moreover, implementing IoT and big data integration
2169-3536 (c) 2016 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
solutions can help address issues on storage, processing, data analytics, and visualization tools. It can also assist in improving collaboration and communication among various objects in a smart city [9]. Application areas, such as smart ecological environments, smart traffic, smart grids, intelligent buildings, and logistic intelligent management, can benefit from the aforementioned arrangement. Many studies on big data have focused on big data management; in particular, big data analytics has been surveyed [10, 11]. However, this survey focuses on IoT big data in the context of the analytics of a huge amount of data. The contributions of this survey are as follows.

a) State-of-the-art research efforts conducted in terms of big data analytics are investigated.
b) An architecture for big IoT data analytics is proposed.
c) Several unprecedented opportunities brought by data analytics in the IoT domain are introduced.
d) Credible use cases are presented.
e) Research challenges that remain to be addressed are identified and discussed.

These contributions are presented from Sections III to VI. The conclusion is provided in Section VII.

II. OVERVIEW OF IOT AND BIG DATA

An overview of IoT technologies and big data is provided before the discussion.

A. IoT

IoT offers a platform for sensors and devices to communicate seamlessly within a smart environment and enables information sharing across platforms in a convenient manner. The recent adoption of different wireless technologies places IoT as the next revolutionary technology by benefiting from the full opportunities offered by Internet technology. IoT has witnessed its recent adoption in smart cities with interest in developing intelligent systems, such as smart office, smart retail, smart agriculture, smart water, smart transportation, smart healthcare, and smart energy [12, 13].

IoT has emerged as a new trend in the last few years, where mobile devices, transportation facilities, public facilities, and home appliances can all be used as data acquisition equipment in IoT. All surrounding electronic equipment that facilitates daily life operations, such as wristwatches, vending machines, emergency alarms, and garage doors, as well as home appliances, such as refrigerators, microwave ovens, air conditioners, and water heaters, is connected to an IoT network and can be controlled remotely. Ciufo [14] stated that these devices "talk" to one another and to central controlling devices. Such devices deployed in different areas may collect various kinds of data, such as geographical, astronomical, environmental, and logistical data.

A large number of communication devices in the IoT paradigm are embedded into sensor devices in the real world. Data collecting devices sense data and transmit these data using embedded communication devices. The continuum of devices and objects is interconnected through a variety of communication solutions, such as Bluetooth, WiFi, ZigBee, and GSM. These communication devices transmit data and receive commands from remotely controlled devices, which allows direct integration with the physical world through computer-based systems to improve living standards.

Over 50 billion devices, ranging from smartphones, laptops, and sensors to game consoles, are anticipated to be connected to the Internet through several heterogeneous access networks enabled by technologies such as radio frequency identification (RFID) and wireless sensor networks. The authors of [15] mentioned that IoT could be recognized in three paradigms: Internet-oriented, sensors, and knowledge [16].

B. Big data

The volume of data generated by sensors, devices, social media, health care applications, temperature sensors, and various other software applications and digital devices that continuously generate large amounts of structured, unstructured, or semi-structured data is strongly increasing. This massive data generation results in "big data" [17]. Traditional database systems are inefficient when storing, processing, and analyzing a rapidly growing amount of data or big data [18]. The term "big data" has been used in the previous literature but is relatively new in business and IT [19]. An example of big data-related studies is the next frontier for innovation, competition, and productivity, in which McKinsey Global Institute [20] defined big data as data sets whose size is beyond the capability of the usual database system tools for capturing, storing, processing, and analyzing such data [18]. "The Digital Universe" study [21] labels big data technologies as a new generation of technologies and architectures that aim to extract value from a massive volume of data with various formats by enabling high-velocity capture, discovery, and analysis. This previous study also characterizes big data into three aspects: (a) data sources, (b) data analytics, and (c) the presentation of the results of the analytics. This definition uses the 3V's (volume, variety, velocity) model proposed by Gartner [2]. The model highlights an e-commerce trend in data management that faces challenges to manage volume or size
of data, variety or different sources of data, and velocity or speed of data creation. Some studies declare volume as a main characteristic of big data without providing a pure definition [22]. However, other researchers introduced additional characteristics for big data, such as veracity, value, variability, and complexity [23, 24]. The 3V's model, or its derivations, is the most common description of the term "big data."

III. BIG DATA ANALYTICS

Big data analytics involves the processes of searching a database, mining, and analyzing data dedicated to improve company performance [25].

Big data analytics is the process of examining large data sets that contain a variety of data types [4] to reveal unseen patterns, hidden correlations, market trends, customer preferences, and other useful business information [5]. The capability to analyze large amounts of data can help an organization deal with considerable information that can affect the business [6]. Therefore, the main objective of big data analytics is to assist business associations to have improved understanding of data, and thus, make efficient and well-informed decisions. Big data analytics enables data miners and scientists to analyze a large volume of data that may not be harnessed using traditional tools [5].

Big data analytics requires technologies and tools that can transform a large amount of structured, unstructured, and semi-structured data into a more understandable data and metadata format for analytical processes. The algorithms used in these analytical tools must discover patterns, trends, and correlations over a variety of time horizons in the data [26]. After analyzing the data, these tools visualize the findings in tables, graphs, and spatial charts for efficient decision making. Thus, big data analysis is a serious challenge for many applications because of data complexity and the scalability of underlying algorithms that support such processes [27].

Talia (2013) highlighted that obtaining helpful information from big data analysis is a critical matter that requires scalable analytical algorithms and techniques to return well-timed results, whereas current techniques and algorithms are inefficient in handling big data analytics. Therefore, large infrastructure and additional applications are necessary to support data parallelism. Moreover, data sources, such as high-speed data streams received from different data sources, have different formats, which makes integrating multiple sources for analytics solutions critical [28]. Hence, the challenge is focused on the performance of current algorithms used in big data analysis, which is not rising linearly with the rapid increase in computational resources [19].

Big data analytics processes consume considerable time to provide feedback and guidelines to users, whereas only a few tools [29] can process huge data sets within a reasonable amount of processing time. By contrast, most of the remaining tools use the complicated trial-and-error method to deal with massive amounts of data sets and data heterogeneity [30]. Big data analytics systems exist. For example, the Exploratory Data Analysis Environment [31] is a big data visual analytics system that is used to analyze complex earth system simulations with large numbers of data sets.

A. Existing analytics systems

Different analytic types are used according to the requirements of IoT applications [32]. These analytic types are discussed in this subsection under real-time, off-line, memory-level, business intelligence (BI) level, and massive level analytics categories. Moreover, a comparison based on analytics types and their levels is presented in Table 1.

Real-time analytics is typically performed on data collected from sensors. In this situation, data change constantly, and rapid data analytics techniques are required to obtain an analytical result within a short period. Consequently, two existing architectures have been proposed for real-time analysis: parallel processing clusters using traditional relational databases and memory-based computing platforms [33]. Greenplum [34] and HANA [35] are examples of real-time analytics architecture.

Off-line analytics is used when a quick response is not required [32]. For example, many Internet enterprises use Hadoop-based off-line analytics architecture to reduce the cost of data format conversion [36]. Such analytics improves data acquisition efficiency. SCRIBE [37], Kafka [38], Time-Tunnel [39], and Chukwa [40] are examples of architectures that conduct off-line analytics and can satisfy the demands of data acquisition.

Memory-level analytics is applied when the size of data is smaller than the memory of a cluster [32]. To date, the memory of clusters has reached terabyte (TB) level [41]. Therefore, several internal database technologies are required to improve analytical efficiency. Memory-level analytics is suitable for conducting real-time analysis. MongoDB [42] is an example of this architecture.

BI analytics is adopted when the size of data is larger than the memory level, but in this case, data may be imported to the BI analysis environment [43]. BI analytics currently supports TB-level data [32]. Moreover, BI can help discover
strategic business opportunities from the flood of data. In addition, BI analytics allows easy interpretation of data volumes. Identifying new opportunities and implementing an effective strategy provide competitive market advantage and long-term stability.

Massive analytics is applied when the size of data is greater than the entire capacity of the BI analysis product and traditional databases [44]. Massive analytics uses the Hadoop distributed file system for data storage and map/reduce for data analysis. Massive analytics helps create the business foundation and increases market competitiveness by extracting meaningful values from data. Moreover, massive analytics obtains accurate data that reduce the risks involved in making any business decision. In addition, massive analytics provides services effectively.
TABLE 1: COMPARISON OF DIFFERENT ANALYTICS TYPES AND THEIR LEVELS

| Analytic Types/Level | Specified Use | Existing Architectures/Tools | Advantages/Category |
|---|---|---|---|
| Real time [33] | To analyze the large amounts of data generated by the sensors | Greenplum, HANA | Parallel processing clusters using traditional databases; memory-based computing platforms |
| Offline [36] | To use for applications where there are no high requirements on response time | Scribe, Kafka, Timetunnel, Chukwa | Efficient data acquisition; reduces the cost of data format conversion |
| Memory level [41] | To use where the total data volume is smaller than the maximum memory of the cluster | MongoDB | Real time |
| Business intelligence level [43] | To use when the data scale surpasses the memory level | Data analysis plans | Both offline and online |
| Massive level [44] | To use when the data scale totally surpasses the capacity of business intelligence products and traditional databases | MapReduce | Mostly belongs to offline |
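The size-based distinctions among the memory, BI, and massive levels can be sketched in a few lines of Python. This is only an illustration of the comparison above: the function names, the byte thresholds, and the treatment of data exactly equal to cluster memory are assumptions made for the example, not part of any surveyed system.

```python
from enum import Enum


class AnalyticsLevel(Enum):
    MEMORY = "memory level"
    BI = "business intelligence level"
    MASSIVE = "massive level"


def choose_level(data_bytes: int, cluster_memory_bytes: int, bi_capacity_bytes: int) -> AnalyticsLevel:
    """Pick an analytics level from data size alone (hypothetical thresholds)."""
    # Memory level: the data volume is smaller than the cluster memory.
    if data_bytes < cluster_memory_bytes:
        return AnalyticsLevel.MEMORY
    # BI level: the data surpass the memory level but still fit the BI product.
    # Assumption: data exactly equal to cluster memory is routed here.
    if data_bytes <= bi_capacity_bytes:
        return AnalyticsLevel.BI
    # Massive level: the data exceed the BI product capacity.
    return AnalyticsLevel.MASSIVE


def choose_mode(needs_quick_response: bool) -> str:
    """Real-time analytics when a quick response is required, offline otherwise."""
    return "real time" if needs_quick_response else "offline"


TB = 1024 ** 4  # bytes in one terabyte (binary convention, an assumption)

if __name__ == "__main__":
    print(choose_level(5 * TB, cluster_memory_bytes=TB, bi_capacity_bytes=10 * TB))
    print(choose_mode(needs_quick_response=False))
```

In practice the choice also depends on tooling and cost, so a real deployment would likely weigh more factors than the two shown here.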
B. Relationship between IoT and big data analytics

Big data analytics is rapidly emerging as a key IoT initiative to improve decision making. One of the most prominent features of IoT is its analysis of information about "connected things." Big data analytics in IoT requires processing a large amount of data on the fly and storing the data in various storage technologies. Given that much of the unstructured data are gathered directly from web-enabled "things," big data implementations will necessitate performing lightning-fast analytics with large queries to allow organizations to gain rapid insights, make quick decisions, and interact with people and other devices. The interconnection of sensing and actuating devices provides the capability to share information across platforms through a unified architecture and develop a common operating picture for enabling innovative applications.
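Processing data "on the fly" can be illustrated with a sliding-window average that is updated as each sensor reading arrives, without storing the full stream. The sketch below is a minimal example under stated assumptions: the class name `SlidingAverage` and the window size are hypothetical and chosen only for illustration.

```python
from collections import deque


class SlidingAverage:
    """Running mean over the most recent `window` readings of a stream."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self._values = deque(maxlen=window)
        self._total = 0.0

    def update(self, value: float) -> float:
        # When the window is full, the oldest reading is about to be evicted,
        # so remove it from the running total first.
        if len(self._values) == self._values.maxlen:
            self._total -= self._values[0]
        self._values.append(value)
        self._total += value
        return self._total / len(self._values)


if __name__ == "__main__":
    avg = SlidingAverage(window=3)
    for reading in [1.0, 2.0, 3.0, 4.0]:
        print(avg.update(reading))
```

Each update costs constant time and memory is bounded by the window size, which is the property that makes this style of computation attractive for continuous sensor streams.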
The need to adopt big data in IoT applications is compelling. These two technologies have already been recognized in the fields of IT and business. Although the development of big data is already lagging, these technologies are inter-dependent and should be jointly developed. In general, the deployment of IoT increases the amount of data in quantity and category, hence offering the opportunity for the application and development of big data analytics. Moreover, the application of big data technologies in IoT accelerates the research advances and business models of IoT. The relationship between IoT and big data, which is shown in Figure 1, can be divided into three steps to enable the management of IoT data. The first step comprises managing IoT data sources, where connected sensor devices use applications to interact with one another. For example, the interaction of devices such as CCTV cameras, smart traffic lights, and smart home devices generates large amounts of data sources with different formats. These data can be stored in low-cost commodity storage on the cloud. In the second step, the generated data are called "big data," which are based on their volume, velocity, and variety. These huge amounts of data are stored in big data files in shared distributed fault-tolerant databases. The last step applies analytics tools such as MapReduce, Spark, Splunk, and Skytree that can analyze the stored big IoT data sets. The four levels of analytics start from training data, then move on to analytics tools, queries, and reports.

C. Big data analytics methods

Big data analytics aims to immediately extract knowledgeable information that helps in making predictions, identifying recent trends, finding hidden information, and ultimately, making decisions [7]. Data mining techniques are widely deployed for both problem-specific methods and generalized data analytics. Accordingly, statistical and machine learning methods are utilized. The evolution of big data also changes analytics requirements. Although the requirements for efficient mechanisms lie in all aspects of big data management [30], such as capturing, storage, preprocessing, and analysis, for our discussion, big data analytics requires the same or faster processing speed than traditional data analytics with minimum cost for high-volume, high-velocity, and high-variety data [45].

Various solutions are available for big data analytics, and advancements in developing and improving these solutions are being continuously achieved to make them suitable for new big data trends. Data mining plays an important role in analytics, and most of the techniques are developed using data mining algorithms according to a particular scenario. Knowledge on available big data analytics options is crucial when evaluating and choosing an appropriate approach for decision making. In this section, we present several methods that can be implemented for several big data case studies. Some of these analytics methods are efficient for big IoT data analytics. Diverse data sets of tremendous size are believed to contribute more to big data insights. However, this belief is not always valid because more data may have more ambiguities and abnormalities [7].

We present big data analytics methods under classification, clustering, association rule mining, and prediction categories. Figure 2 depicts and summarizes each of these categories. Each category is a data mining function and involves many methods and algorithms to fulfill information extraction and analysis requirements. For example, Bayesian network, support vector machine (SVM), and k-nearest neighbor (KNN) offer classification methods. Similarly, partitioning, hierarchical clustering, and co-occurrence are widespread in clustering. Association rule mining and prediction comprise significant methods.
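As a small illustration of the classification category, the following sketch implements a plain k-nearest neighbor (KNN) vote in Python. It is a minimal example rather than a method from the surveyed literature: the function name `knn_predict`, the Euclidean distance choice, and the toy temperature/humidity readings with "normal" and "alert" labels are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; `query` is a feature tuple
    of the same length as each training feature tuple.
    """
    if k <= 0 or k > len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical (temperature, humidity) readings with made-up labels.
    training = [
        ((20.0, 30.0), "normal"), ((21.0, 32.0), "normal"), ((22.0, 31.0), "normal"),
        ((35.0, 80.0), "alert"), ((36.0, 85.0), "alert"), ((34.0, 82.0), "alert"),
    ]
    print(knn_predict(training, (35.5, 81.0)))
```

This brute-force version scans every training point per query, so it is suited to small examples only; larger data sets typically call for indexing or distributed implementations.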
[Figure 2: Big data analytics methods (classification, clustering, prediction, association rule)]

Clustering is another data mining technique used as a big data analytics method. Contrary to classification, clustering uses an unsupervised learning approach and creates groups for given objects based on their distinctive meaningful features [56]. As presented in Figure 2, grouping a large number of objects in the form of clusters makes data manipulation simple. The well-known methods used for clustering are hierarchical clustering and partitioning. The hierarchical clustering approach keeps combining small clusters of data objects to form a hierarchical tree and create agglomerative clusters. Divisive clusters are created in the opposite manner by dividing a single cluster that contains all data objects into smaller appropriate clusters [57].
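The agglomerative (bottom-up) procedure described above can be sketched as follows. This is an illustrative, unoptimized version: the single-linkage distance (closest pair of points between two clusters), the stopping rule based on a target cluster count, and the function name `agglomerative` are assumptions chosen for the example.

```python
import math


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until `n_clusters` remain.

    Uses single linkage: the distance between two clusters is the smallest
    Euclidean distance between any pair of their points.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i; j > i, so index i stays valid.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 5.2), (9.0, 1.0)]
    for cluster in agglomerative(data, 3):
        print(cluster)
```

A divisive method would run in the opposite direction, starting from one cluster holding all points and splitting it; recording each merge instead of stopping early would yield the hierarchical tree.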
of big data mining functionalities that are elaborated in this section, a mark is used to show the support for an application, whereas '-' denotes that it is not obvious whether the method supports an application or not. In particular, Table 2 shows that classification methods are suitable for medical imaging, industry, speech recognition, natural language processing, and e-governance. Clustering and association rule-based data analytics methods are applicable to industry and e-governance and are well adopted in healthcare, e-commerce, and bioinformatics. Predictive analytics are useful for disaster and market predictions, whereas time series analysis is used in disaster forecasting, medical imaging, speech recognition, social network analysis, and e-governance.
TABLE 2: APPLICATIONS OF BIG DATA MINING FOR IOT
Applications
Disaster management
Healthcare
Medical Imaging
Human Genetics
Market Analysis
Industry
Speech Recognition
Bioinformatics
NLP
e-governance
Method
Classification [46] - - - - - -
Clustering [57] - - -
Association rule[58, 65] - - - - - -
Prediction [61] - - - - - - - -
Time Series [62] [63] [64] - - - - - -
has support
- not obvious
D. IoT architecture for big data analytics

The architectural concept of IoT has several definitions based on IoT domain abstraction and identification. It offers a reference model that defines relationships among various IoT verticals, such as smart traffic, smart home, smart transportation, and smart health. The architecture for big data analytics offers a design for data abstraction. Furthermore, this standard provides a reference architecture that builds upon the reference model. Many IoT architectures are found in the literature [66] [67] [13]. For example, [13] offered an IoT architecture with cloud computing at the center and a model of end-to-end interaction among various stakeholders in a cloud-centric IoT framework for better comparison with the proposed IoT architecture. This architecture is achieved by seamless ubiquitous sensing, data analytics, and information representation with IoT as the unifying architecture. However, the current architecture focuses on IoT with regard to communications. To our knowledge, our proposed architecture, which integrates IoT and big data analytics, has not been studied in the current literature. Figure 3 illustrates the IoT architecture and big data analytics. In this figure, the sensor layer contains all the sensor devices and the objects, which are connected through a wireless network. This wireless network communication can be RFID, WiFi, ultra-wideband, ZigBee, and Bluetooth. The IoT gateway allows communication between the Internet and various webs. The upper layer concerns big data analytics, where a large amount of data received from sensors is stored in the cloud and accessed through big data analytics applications. These applications contain API management and a dashboard to help in the interaction with the processing engine.
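The flow through the layers described above (sensor devices, IoT gateway, cloud storage, analytics application) can be mimicked with a few in-memory Python classes. This is a toy sketch under loud assumptions: `Reading`, `CloudStore`, `Gateway`, and `mean_per_sensor` are hypothetical names, and the in-memory list merely stands in for real cloud storage and networking.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    """One measurement produced at the sensor layer."""
    sensor_id: str
    value: float


class CloudStore:
    """Stand-in for the cloud storage layer (an in-memory list here)."""

    def __init__(self) -> None:
        self._rows = []

    def save(self, reading: Reading) -> None:
        self._rows.append(reading)

    def all(self) -> list:
        return list(self._rows)


class Gateway:
    """Stand-in for the IoT gateway: forwards readings to storage."""

    def __init__(self, store: CloudStore) -> None:
        self._store = store

    def forward(self, reading: Reading) -> None:
        self._store.save(reading)


def mean_per_sensor(store: CloudStore) -> dict:
    """Analytics layer: average stored values for each sensor."""
    grouped = defaultdict(list)
    for reading in store.all():
        grouped[reading.sensor_id].append(reading.value)
    return {sensor_id: mean(values) for sensor_id, values in grouped.items()}


if __name__ == "__main__":
    store = CloudStore()
    gateway = Gateway(store)
    for r in [Reading("t1", 20.0), Reading("t1", 22.0), Reading("h1", 55.0)]:
        gateway.forward(r)
    print(mean_per_sensor(store))
```

Keeping each layer behind a small interface means that the storage or analytics part could be swapped for a distributed system without changing the sensor-side code.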
[Figure 3: IoT architecture and big data analytics. Layers, top to bottom: big data analytic, cloud storage, IoT gateway, network devices, IoT devices]
A novel meta-model-based approach for integrating IoT architecture objects is proposed. The concept is semi-automatically federated into a holistic digital enterprise architecture environment. The main objective is to provide adequate decision support for complex business, architecture management with the development of assessment systems, and IT environment. Thus, architectural decisions for IoT are closely connected with code implementation to allow users to understand the integration of enterprise architecture management with IoT.

IV. USE CASES

This section presents a number of use cases for big IoT data analytics. Although all the use cases are relevant to IoT applications, the choices were guided by how commonly they are used in IoT applications and by the amount of data that they can generate for analytics.

A. Smart metering

Smart metering is one of the IoT application use cases that generates a large amount of data from different sources, such as smart grids, tank levels, water flows, and silo stock calculation, in which processing takes a long time even on a dedicated and powerful machine [68]. A smart meter is a device that electronically records the consumption of electric energy and exchanges data between the meter and the control system. Collecting and analyzing smart meter data in an IoT environment assists the decision maker in predicting electricity consumption. Furthermore, the analytics of a smart meter can also be used to forecast demands to prevent crises and satisfy strategic objectives through specific pricing plans. Thus, utility companies must be capable of high-volume data management and advanced analytics designed to transform data into actionable insights.

B. Smart transportation

A smart transportation system is an IoT-based use case that aims to support the smart city concept. A smart transportation system intends to deploy powerful and advanced communication technologies for the management of smart cities. Traditional transportation systems, which are based on image processing, are affected by weather conditions, such as heavy rains and thick fog. Consequently,
the captured image may not be clearly visible. The design of an e-plate system [69] using RFID technology provides a good solution for intelligent monitoring, tracking, and identification of vehicles. Moreover, introducing IoT into vehicular technologies will enable traffic congestion management to exhibit significantly better performance than the existing infrastructure. This technology can improve existing traffic systems in which vehicles can effectively communicate with one another in a systematic manner without human intervention.
Satellite navigation systems and sensors can also be applied in trucks, ships, and airplanes in real time. The routing of these vehicles can be optimized by using the bulk of available public data, such as traffic jams, road conditions, delivery addresses, weather conditions, and locations of refilling stations. For example, in case of a runtime address change, the updated information (route, cost) can be optimized, recalculated, and passed on to drivers in real time. Sensors incorporated into these vehicles can also provide real-time information to measure engine health, determine whether equipment requires maintenance, and predict errors [70].

C. Smart supply chains
Embedded sensor technologies can communicate bidirectionally and provide remote accessibility to over 1 million elevators worldwide [71]. The captured data are used by on- and off-site technicians to run diagnostics and evaluate repair options so that appropriate decisions can be made, which results in increased machine uptime and enhanced customer service. Ultimately, big IoT data analytics allows a supply chain to execute decisions and control the external environment. IoT-enabled factory equipment will be able to communicate within data parameters (e.g., machine utilization, temperature) and optimize performance by changing equipment settings or process workflow [72]. In-transit visibility is another use case that will play a vital role in future supply chains in the presence of IoT infrastructure. Key technologies used by in-transit visibility are RFIDs and cloud-based Global Positioning System (GPS), which provide location, identity, and other tracking information. These data will be the backbone of supply chains supported by IoT technologies. The information gathered by equipment will provide detailed visibility of an item shipped from a manufacturer to a retailer. Data collected via RFID and GPS technologies will allow supply chain managers to enhance automated shipment and accurate delivery information by predicting time of arrival. Similarly, managers will be able to monitor other information, such as temperature control, which can affect the quality of in-transit products.

D. Smart agriculture
Smart agriculture is a beneficial use case in big IoT data analytics. Sensors are the actors in the smart agriculture use case. They are installed in fields to obtain data on moisture level of soil, trunk diameter of plants, microclimate condition, and humidity level, as well as to forecast weather. Sensors transmit obtained data using network and communication devices. These data pass through an IoT gateway and the Internet to reach the analytics layer shown in Fig. The analytics layer processes the data obtained from the sensor network to issue commands. Automatic climate control according to harvesting requirements, timely and controlled irrigation, and humidity control for fungus prevention are examples of actions performed based on big data analytics recommendations.

E. Smart grid
The smart grid is a new generation of power grid in which managing and distributing electricity between suppliers and consumers is upgraded using two-way communication technologies and computing capabilities to improve reliability, safety, and efficiency with real-time control and monitoring [73, 74]. One of the major challenges in a power system is integrating renewable and decentralized energy. Electricity systems require a smart grid to manage the volatile behavior of distributed energy resources (DERs) [75]. However, most energy systems have to follow governmental laws and regulations, as well as consider business analysis and potential legal constraints [76]. Grid sensors and devices continuously and rapidly generate data related to control loops and protection and require real-time processing and analytics along with machine-to-machine (M2M) or human-to-machine (HMI) interactions to issue control commands to the system. The system must also fulfill visualization and reporting requirements.

F. Smart traffic light system
The smart traffic light system consists of nodes that locally interact with IoT sensors and devices to detect the presence of vehicles, bikers, and pedestrians. These nodes communicate with neighboring traffic lights to measure the speed and distance of approaching transportation means and manage green traffic signals [77]. IoT data gathered using the system require real-time analytics processing to perform necessary tasks, such as changing the timing cycles according to traffic conditions, sending informative signals to neighboring nodes, and detecting approaching vehicles that use IoT sensors and devices to prevent long queues or accidents. Moreover, smart traffic light systems can send their collected IoT data to cloud storage for further analytics. Table 3 presents the use cases of IoT big data analytics.
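The adjustment of timing cycles according to traffic conditions, described for the smart traffic light system, can be illustrated with a minimal sketch. It is not taken from [77]; the `Approach` record, the `green_times` function, the proportional-split rule, and the default cycle and minimum green durations are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Approach:
    """Queue observed on one approach to an intersection (hypothetical schema)."""
    name: str
    queued_vehicles: int


def green_times(approaches, cycle_s=90.0, min_green_s=10.0):
    """Split a fixed signal cycle among approaches in proportion to queue length.

    Every approach first receives min_green_s seconds; the remaining time is
    shared in proportion to the number of queued vehicles. If no vehicles are
    queued anywhere, the remaining time is shared equally.
    """
    if cycle_s < min_green_s * len(approaches):
        raise ValueError("cycle too short for the minimum green times")
    spare = cycle_s - min_green_s * len(approaches)
    total = sum(a.queued_vehicles for a in approaches)
    plan = {}
    for a in approaches:
        share = a.queued_vehicles / total if total else 1.0 / len(approaches)
        plan[a.name] = min_green_s + spare * share
    return plan


# Assumed sensor readings: 12 vehicles queued northbound, 4 eastbound.
plan = green_times([Approach("north", 12), Approach("east", 4)], cycle_s=60.0)
print(plan)  # north receives 40 s of green, east receives 20 s
```

A deployed node would recompute such a plan from streaming sensor data and exchange it with neighboring nodes; the proportional rule here only shows where real-time analytics enters the control loop.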
Fig. 4. Example of use cases and opportunities for big IoT data analytics architecture
opportunities to build a smart environment. Big IoT data analytics has widespread applications in nearly every industry. However, the main success areas of analytics are in e-commerce, revenue growth, increased customer size, accuracy of sale forecast results, product optimization, risk management, and improved customer segmentation.

B. Smart cities
Big data collected from smart cities offer new opportunities in which efficiency gains can be achieved through an appropriate analytics platform/infrastructure to analyze big IoT data. Various devices connect to the Internet in a smart environment and share information. Moreover, the cost of storing data has been reduced dramatically after the invention of cloud computing technology. Analysis capabilities have made huge leaps. Thus, the role of big data in a smart city can potentially transform every sector of the economy of a nation. Hadoop with the YARN resource manager is a recent advancement in big data technology that supports numerous workloads, real-time processing, and streaming data ingestion.

C. Retail and logistics
IoT is expected to play a key role as an emerging technology in the area of retail and logistics. In logistics, RFID keeps track of containers, pallets, and crates. In addition, considerable advancements in IoT technologies can facilitate retailers by providing several benefits. However, IoT devices generate large amounts of data on a daily basis. Thus, powerful data analytics enables enterprises to gain insights from the voluminous amounts of data produced through IoT technologies. Applying data analytics to logistic data sets can improve the shipment experience of customers. Moreover, retail companies can earn additional profit by analyzing customer data, which can predict the trends and demands of goods. By examining customer data, pricing plans and seasonal promotions can be optimized and planned efficiently to maximize profit.

D. Healthcare
Recent years have witnessed tremendous growth in smart health monitoring devices. These devices generate enormous amounts of data. Thus, applying data analytics to data collected from fetal monitors, electrocardiograms, temperature monitors, or blood glucose level monitors can help healthcare specialists efficiently assess the physical conditions of patients. Moreover, data analytics enables healthcare professionals to diagnose serious diseases in their early stages to help save lives. Furthermore, data analytics improves the clinical quality of care and ensures the safety of patients. In addition, physician profiles can be reviewed by examining patient treatment histories, which can improve customer satisfaction, acquisition, and retention.

VI. OPEN CHALLENGES AND FUTURE DIRECTIONS
IoT and big data analytics have been extensively accepted by many organizations. However, these technologies are still in their early stages. Several existing research challenges have not yet been addressed. This section presents several challenges in the field of big IoT data analytics.

A. Privacy
Privacy issues arise when a system is compromised to infer or restore personal information using big data analytics tools, even though the data are generated by anonymous users. With the proliferation of big data analytics technologies used in big IoT data, the privacy issue has become a core problem in the data mining domain. Consequently, most people are reluctant to rely on these systems, which do not provide solid service-level agreement (SLA) conditions regarding user personal information theft or misuse. In fact, the sensitive information of users has to be secured and protected from external interference. Although temporary identification, anonymity, and encryption provide several ways to enforce data privacy, decisions have to be made with regard to ethical factors, such as what to use, how to use, and why to use generated big IoT data [7].
Another security risk associated with IoT data is the heterogeneity of the types of devices used and the nature of generated data, such as raw devices, data types, and communication protocols. These devices can have different sizes and shapes outside the network and are designed to communicate with cooperative applications. Thus, to authenticate these devices, an IoT system should assign a non-repudiable identification system to each device. Moreover, enterprises should maintain a meta-repository of these connected devices for auditing purposes. This heterogeneous IoT architecture is new to security professionals, and thus, results in increased security risks. Consequently, any attack in this scenario compromises system security and disconnects interconnected devices.
In the context of big IoT data, security and privacy are the key challenges in processing and storing huge amounts of data. Moreover, to perform critical operations and host private data, these systems rely heavily on third-party services and infrastructure. Therefore, an exponential growth in data rate causes difficulty in securing each and every portion of critical data. As previously discussed, existing security solutions (Karim, 2016) are no longer applicable to providing complete security in big IoT data scenarios. Existing algorithms are not designed for the dynamic observation of data, and thus, cannot be applied effectively. Legacy data security solutions are specifically designed for static data sets, whereas current data requirements are changing dynamically (Lafuente, 2015). Thus, deploying these security solutions is difficult for dynamically increasing data. In addition, legislative and regulatory issues should be considered while signing SLAs.
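Temporary identification, mentioned above as one way to enforce data privacy, can be sketched as a keyed pseudonym that rotates over time. The sketch below is an illustration under stated assumptions, not a method from the cited works: the function name `temporary_id`, the `epoch` rotation counter, the demo key, and the 16-character truncation are hypothetical choices. Only Python's standard `hmac` and `hashlib` modules are used.

```python
import hashlib
import hmac


def temporary_id(device_id: str, secret_key: bytes, epoch: int) -> str:
    """Derive a rotating pseudonym for a device.

    The pseudonym is an HMAC-SHA256 of the real identifier and an epoch
    counter (for example, a day number). Without the key, pseudonyms from
    different epochs cannot easily be linked to each other or to the device.
    """
    message = f"{device_id}|{epoch}".encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability (assumption)


# Hypothetical key and device identifier, for demonstration only.
key = b"demo-key-do-not-use-in-production"
id_epoch_1 = temporary_id("meter-0042", key, epoch=1)
id_epoch_2 = temporary_id("meter-0042", key, epoch=2)
print(id_epoch_1, id_epoch_2)  # two different pseudonyms for the same device
```

Such pseudonymization hides the identifier only; the data attached to it may still allow re-identification, which is why the ethical and SLA considerations discussed in this section remain necessary.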
With regard to data generated through IoT, the following security problems can emerge [78]: (a) timely updates - difficulty in keeping systems up to date, (b) incident management - identifying suspicious traffic patterns among legitimate ones and possible failure to capture unidentifiable incidents, (c) interoperability - proprietary and vendor-specific procedures will pose difficulties in finding hidden or zero-day attacks, and (d) protocol convergence - although IPv6 is currently compatible with the latest specifications, this protocol has yet to be fully deployed. Therefore, security rules applied over IPv4 may not be applicable to protecting IPv6.
At present, no single solution can address these challenges and manage the security and privacy of interconnected devices. However, the following guidelines can help overcome these adversities. (a) First, a true open ecosystem with standard APIs is necessary to avoid interoperability and reliability problems. (b) Second, devices must be well protected while communicating with peers. (c) Third, devices should be hardcoded with the best security practices to protect against common security and privacy threats.

B. Data mining
Data mining methods provide efficient and best-fitting predictive or descriptive solutions for big data that can also be generalized for new data [79]. The evolution of big IoT data and cloud computing platforms has brought the challenges of data exploration and information extraction [80]. For the overall big IoT data architecture, Figure 5 presents the primary challenges related to processing and data mining.
[Figure 5. Recoverable labels: Knowledge Discovery, Data Processing, Accessibility, Parallel]
Exhaustive data reads/writes: The high-volume, high-velocity, and high-variety qualities of big IoT data challenge exploration, integration, heterogeneous communication, and extraction processes. The size and heterogeneity of data impose new data mining requirements, and diversity in data sources also poses a challenge [81-83]. Furthermore, compared with small data sets, large data sets comprise more abnormalities and ambiguities that require additional preprocessing steps, such as cleansing, reduction, and transmission [23, 84]. Another issue lies in the extraction of exact and knowledgeable information from the large volumes of diverse data. Consequently, obtaining accurate information from complex data requires analyzing data properties and finding associations among different data points.
Researchers have introduced parallel and sequential programming models and proposed different algorithms to minimize query response time while dealing with big data. Moreover, researchers have adapted existing data mining algorithms in different manners to (a) improve single-source knowledge discovery, (b) implement data mining methods for multi-source platforms, and (c) study and analyze dynamic data mining methods and stream data [85]. Hence, a parallel k-means algorithm [86] and parallel association rule mining methods [65] have been introduced. However, algorithms that are compatible with the latest parallel architectures still need to be devised. Moreover, synchronization issues may occur in parallel computing while information is exchanged among different data mining methods. This bottleneck of data mining methods has become an open issue in big IoT data analytics that should be addressed.

C. Visualization
Visualization is an important entity in big data analytics, particularly when dealing with IoT systems where data are generated in enormous quantities. Conducting data visualization is difficult because of the large size and high dimension of big data, yet visualization is what reveals underlying trends and gives a complete picture of parsed data. Therefore, big data analytics and visualization should work seamlessly to obtain the best results from IoT applications in big data. However, visualization in the case of heterogeneous and diverse data (unstructured, structured, and semi-structured) is a challenging task. Designing a visualization solution that is compatible with advanced big data indexing frameworks is also difficult. Similarly, response time is an important factor in big IoT data analytics. Consequently, cloud computing architectures supported with rich GUI facilities can be deployed to obtain better insights into big IoT data trends [87].
30. Siddiqa, A., et al., A Survey of Big Data Management: Taxonomy and State-of-the-Art. Journal of Network and Computer Applications, 2016.
31. Steed, C.A., et al., Big data visual analytics for exploratory earth system simulation analysis. Computers & Geosciences, 2013. 61: p. 71-82.
32. Chen, C.P. and C.-Y. Zhang, Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences, 2014. 275: p. 314-347.
33. Pfaffl, M.W., A new mathematical model for relative quantification in real-time RT–PCR. Nucleic acids research, 2001. 29(9): p. e45-e45.
34. Waas, F.M. Beyond conventional data warehousing—massively parallel data processing with Greenplum database. in International Workshop on Business Intelligence for the Real-Time Enterprise. 2008. Springer.
35. Färber, F., et al., SAP HANA database: data management for modern business applications. ACM Sigmod Record, 2012. 40(4): p. 45-51.
36. Cheng, M., et al., Mu rhythm-based cursor control: an offline analysis. Clinical Neurophysiology, 2004. 115(4): p. 745-751.
37. Castro, M., et al., SCRIBE: A large-scale and decentralized application-level multicast infrastructure. IEEE Journal on Selected Areas in communications, 2002. 20(8): p. 1489-1499.
38. Kreps, J., N. Narkhede, and J. Rao. Kafka: A distributed messaging system for log processing. in Proceedings of the NetDB. 2011.
39. Notsu, H., et al. Time-tunnel: Visual analysis tool for time-series numerical data and its extension toward parallel coordinates. in International Conference on Computer Graphics, Imaging and Visualization (CGIV'05). 2005. IEEE.
40. Rabkin, A. and R.H. Katz. Chukwa: A System for Reliable Large-Scale Log Collection. in LISA. 2010.
41. Hong, S. and H. Kim. An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness. in ACM SIGARCH Computer Architecture News. 2009. ACM.
42. Chodorow, K., MongoDB: the definitive guide. 2013: O'Reilly Media, Inc.
43. Jourdan, Z., R.K. Rainer, and T.E. Marshall, Business Intelligence: An Analysis of the Literature. Information Systems Management, 2008. 25(2): p. 121-131.
44. Bifet, A., et al., Moa: Massive online analysis. The Journal of Machine Learning Research, 2010. 11: p. 1601-1604.
45. Mukhopadhyay, A., et al., A survey of multiobjective evolutionary algorithms for data mining: Part I. Evolutionary Computation, IEEE Transactions on, 2014. 18(1): p. 4-19.
46. Estivill-Castro, V., Why so many clustering algorithms: a position paper. ACM SIGKDD Explorations Newsletter, 2002. 4(1): p. 65-75.
47. Bielza, C. and P. Larrañaga, Discrete Bayesian network classifiers: a survey. ACM Computing Surveys (CSUR), 2014. 47(1): p. 5.
48. Chen, F., et al., Data mining for the internet of things: literature review and challenges. International Journal of Distributed Sensor Networks, 2015. 2015: p. 12.
49. Luss, R. and A. d'Aspremont, Predicting abnormal returns from news using text classification. Quantitative Finance, 2015. 15(6): p. 999-1012.
50. Melin, P. and O. Castillo, A review on type-2 fuzzy logic applications in clustering, classification and pattern recognition. Applied soft computing, 2014. 21: p. 568-577.
51. Soualhi, A., K. Medjaher, and N. Zerhouni, Bearing health monitoring based on hilbert–huang transform, support vector machine, and regression. IEEE Transactions on Instrumentation and Measurement, 2015. 64(1): p. 52-62.
52. Larose, D.T., k-Nearest Neighbor Algorithm. Discovering Knowledge in Data: An Introduction to Data Mining, 2005: p. 90-106.
53. Su, M.-Y., Real-time anomaly detection systems for Denial-of-Service attacks by weighted k-nearest-neighbor classifiers. Expert Systems with Applications, 2011. 38(4): p. 3492-3498.
54. Muja, M. and D.G. Lowe, Scalable nearest neighbor algorithms for high dimensional data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014. 36(11): p. 2227-2240.
55. Hu, C., et al., Data-driven method based on particle swarm optimization and k-nearest neighbor regression for estimating capacity of lithium-ion battery. Applied Energy, 2014. 129: p. 49-55.
56. Srivastava, K., et al., Data mining using hierarchical agglomerative clustering algorithm in distributed cloud computing environment. International Journal of Computer Theory and Engineering, 2013. 5(3): p. 520.
57. Berkhin, P., A survey of clustering data mining techniques, in Grouping multidimensional data. 2006, Springer. p. 25-71.
58. Gosain, A. and M. Bhugra. A comprehensive survey of association rules on quantitative data in data mining. in Information & Communication Technologies (ICT), 2013 IEEE Conference on. 2013. IEEE.
59. Fitzwater, M., Efficient mining of maximal sequential patterns using multiple samples. 2005.
60. Yang, Z. and M. Kitsuregawa. LAPIN-SPAM: An improved algorithm for mining sequential pattern. in 21st International Conference on Data Engineering Workshops (ICDEW'05). 2005. IEEE.
61. Gandomi, A. and M. Haider, Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 2015. 35(2): p. 137-144.
62. Kalpakis, K., D. Gada, and V. Puttagunta. Distance measures for effective clustering of ARIMA time-series. in Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on. 2001. IEEE.
63. Kumar, N., et al. Time-series Bitmaps: a Practical Visualization Tool for Working with Large Time Series Databases. in SDM. 2005. SIAM.
64. Ryan, D., High performance discovery in time series: techniques and case studies. 2013: Springer Science & Business Media.
65. Wu, X. and S. Zhang, Synthesizing high-frequency rules from different data sources. IEEE Transactions on Knowledge and Data Engineering, 2003. 15(2): p. 353-367.
66. Duan, R., X. Chen, and T. Xing. A QoS architecture for IOT. in Internet of Things (iThings/CPSCom), 2011 International Conference on and 4th International Conference on Cyber, Physical and Social Computing. 2011. IEEE.
67. Zhang, Y., et al., ICN based Architecture for IoT. IRTF contribution, October, 2013.
68. Darby, S., Smart metering: what potential for householder engagement? Building Research & Information, 2010. 38(5): p. 442-457.
69. Rahman, T.A. and S.K.A. Rahim. RFID vehicle plate number (e-plate) for tracking and management system. in Parallel and Distributed Systems (ICPADS), 2013 International Conference on. 2013. IEEE.
70. Sherly, J. and D. Somasundareswari, INTERNET OF THINGS BASED SMART TRANSPORTATION SYSTEMS. 2015.
71. Tohamy, N., What you need to know about the Internet of Things. MHD Supply Chain Solutions, 2015. 45(3): p. 32.
72. Pettey, C. Five Ways the Internet of Things Will Benefit the Supply Chain. 2015 [cited 2016; Available from: http://www.gartner.com/smarterwithgartner/five-ways-the-internet-of-things-will-benefit-the-supply-chain-2/.
73. Yan, Y., et al., A survey on smart grid communication infrastructures: Motivations, requirements and challenges. IEEE communications surveys & tutorials, 2013. 15(1): p. 5-20.
74. Bera, S., S. Misra, and J.J. Rodrigues, Cloud computing applications for smart grid: A survey. IEEE Transactions on Parallel and Distributed Systems, 2015. 26(5): p. 1477-1494.
75. Dethlefs, T., et al. Energy Service Description for Capabilities of Distributed Energy Resources. in DA-CH Conference on Energy Informatics. 2015. Springer.
76. Neureiter, C., et al. A Standards-based Approach for Domain Specific Modelling of Smart Grid System Architectures. in Proceedings of the 11th International Conference on System of Systems Engineering (SoSE), Kongsberg, Norway. 2016.
77. Bonomi, F., et al. Fog computing and its role in the internet of things. in Proceedings of the first edition of the MCC workshop on Mobile cloud computing. 2012. ACM.
78. Steinklauber, K. Data Protection in the Internet of Things. 2014 [cited 2016 20 June]; Available from: https://securityintelligence.com/data-protection-in-the-internet-of-things.
79. Mukhopadhyay, A., et al., A survey of multiobjective evolutionary algorithms for data mining: Part I. IEEE Transactions on Evolutionary Computation, 2014. 18(1): p. 4-19.
80. Hu, T., et al. A survey of mass data mining based on cloud-computing. in Anti-counterfeiting, Security, and Identification. 2012. IEEE.
81. Sun, Y., et al., Mining knowledge from interconnected data: a heterogeneous information network analysis approach. Proceedings of the VLDB Endowment, 2012. 5(12): p. 2022-2023.
82. Chen, M., et al., Itinerary planning for energy-efficient agent communications in wireless sensor networks. IEEE Transactions on Vehicular Technology, 2011. 60(7): p. 3290-3299.
83. Zhang, D., et al., A Taxonomy of Agent Technologies for Ubiquitous Computing Environments. TIIS, 2012. 6(2): p. 547-565.
84. Chen, M., V.C. Leung, and S. Mao, Directional controlled fusion in wireless sensor networks. Mobile Networks and Applications, 2009. 14(2): p. 220-229.
85. Wu, X., et al., Data mining with big data. IEEE transactions on knowledge and data engineering, 2014. 26(1): p. 97-107.
86. Su, K., et al., A logical framework for identifying quality knowledge from different data sources. Decision Support Systems, 2006. 42(3): p. 1673-1683.
87. Wang, L., G. Wang, and C.A. Alexander, Big data and visualization: methods, challenges and technology progress. Digital Technologies, 2015. 1(1): p. 33-38.
88. Azar, A.T. and A.E. Hassanien, Dimensionality reduction of medical big data using neural-fuzzy classifier. Soft computing, 2015. 19(4): p. 1115-1127.
89. Popov, V.L. and M. Heß, Method of dimensionality reduction in contact mechanics and friction. 2015: Springer.
90. Donalek, C., et al. Immersive and collaborative data visualization using virtual reality platforms. in Big Data (Big Data), 2014 IEEE International Conference on. 2014. IEEE.