A Review On Data Analytics For Supply Chain Management


I.J. Information Engineering and Electronic Business, 2018, 5, 30-39


Published Online September 2018 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijieeb.2018.05.05

A Review on Data Analytics for Supply Chain Management: A Case Study
Anitha P
Department of Information Science and Engineering
JSS Academy of Technical Education, Bengaluru-560060, India
Email: [email protected]

Malini M. Patil
Department of Information Science and Engineering
JSS Academy of Technical Education, Bengaluru-560060, India
Email: [email protected]

Received: 25 February 2018; Accepted: 24 May 2018; Published: 08 September 2018

Abstract: The present study bridges the gap between two intersecting domains, data science and supply chain management. Data can be analyzed for inventory management, forecasting and prediction, and the results take the form of reports, queries and forecasts. Because of price, weather patterns, economic volatility and the complex nature of business, the forecasts may not be accurate. This has resulted in the growth of supply chain analytics, which is the application of qualitative and quantitative methods to solve relevant problems and to predict outcomes by considering the quality of data. Because of issues like increased collaboration between companies, customers, retailers and governmental organizations, companies are adopting Big Data solutions. Big Data applications can be linked to Supply Chain Management across fields like procurement, transportation, warehouse operations, marketing and also smart logistics. As supply chain networks become vast, more complex and driven by demands for more exacting service levels, the type of data that is managed and analyzed also becomes more complex. The present work aims at providing an overview of the adoption of Data Analytics capabilities as part of a “next generation” architecture by developing a linear regression model on sales data. The paper also surveys how big data techniques can be used for storage, processing, managing, interpretation and visualization of data in the field of supply chain.

Index Terms: Big Data, Supply Chain Management, Supply Chain Analytics, Supply Chain Network, regression analysis, Smart logistics, Big Data Management Systems.

I. INTRODUCTION

After the great changes of the 1990s in the operating rules of the world economy and in market competition patterns, enterprises have identified the need for globalization of economic development. Companies need to rely on the integration of their own resources and the external resources available in the market. This integration enables a working pattern that responds quickly to the needs of the market and to the ad-hoc situations that arise in it. Such a system is referred to as Supply Chain Management (SCM) [1]. SCM is defined as managing the flow of information, material and resources across and within the network of upstream and downstream organizations [2]. A supply chain can be defined as a network of flows of products, financial deliverables, customer services, information, material and resources. Management of the multiple relationships in a supply chain is referred to as supply chain management. Factors like product success, customer satisfaction and growth of an organization depend on the successful execution of SCM [5].

The term Big Data was first used by two NASA researchers in 1997 to refer to the visualization challenge for systems with large data sets, which are ubiquitous in nature [6]. Big data can be understood as data that is complex, large in volume and rapidly growing, with numerous autonomous and independent data sources [33]. Data has increased on a large scale in various fields over the past couple of decades, because of which the term “Big Data” has been coined [20]. Big data has an impact in various domains: it helps in renovating the supply chain, managing customer fidelity in marketing, health, optimizing routes and reducing cost in transportation, reducing risk in finance, etc. [7]. Deploying a Big data management system for supply chain management will achieve greater benefits as the system becomes more agile.

Big Data can be defined as a large volume of data, both unstructured and structured. Due to the rise of social media, the Internet of Things and mobile devices, there is a massive increase in real time data generation [1]. As per the survey, more than 1200 exabytes of data are generated every year from different data sources [1]. Most of the data generated are not structured. The amount of unstructured data is approximately 80%, and these data are difficult to store, analyze and process.

Copyright © 2018 MECS I.J. Information Engineering and Electronic Business, 2018, 5, 30-39

The analysis of Big Data leads to insights that help in taking better decisions and strategic business moves, which is termed Analytics.

The complete process of extraction from Big Data consists of two parts: data management and data analytics. Data management comprises “processes and supporting technologies to gather data, store data, to prepare and retrieve it for analysis”. The techniques used to analyze and gather intelligence from Big data are referred to as Big Data Analytics (BDA). Analytics is defined as the collection and analysis of data in qualitative and quantitative terms for decision making. BDA is the application of advanced analytic techniques to huge data sets. Gartner explains that only 15% of Fortune 500 companies will be able to make full use of big data for creating value and only 8% of them are currently using Big Data Analytics [1].

Big data analytics has proved its need in the supply chain management of any organization, yet many companies are struggling to unveil its business value [2]. The challenge in big data analytics is to analyze the irregular patterns of data arriving alongside the existing huge data sets [26]. BDA is important in creating an integrated view of operational performance and customer satisfaction of both sender and recipient in the SCM [16]. It is really challenging to fully address supply chain susceptibility because of the complexity in supply chain components, processes, suppliers and services [25]. Supply chain analytics has developed its own identity in supply chain management by using Business Intelligence tools for the analysis of customer behavior, optimization of upstream and downstream operations and insight into advanced routing solutions. Examples are component complexity, supplier complexity, process complexity and service complexity [3].

According to Waller and Fawcett (2013), research in the field of Big Data intersecting with SCM could illuminate a “great number of new opportunities” for both academia and practitioners [2]. They also pointed out that very little literature about predictive analytics, data science and Big Data is available in the field of SCM. SCM and logistics are not new ideas. Logistics can be defined as the process of managing the procurement, flow of products, and storage of materials, parts and finished inventory to maximize profit through the cost effective fulfillment of orders [10]. SCM is a more extensive field than logistics. The concept of logistics evolved as a subfield of supply chain management whose main vision is creating and presenting a single plan for the flow of materials and information. In 1980, logistics was defined as handling, transportation, loading and unloading, packing and processing between the manufacturer and consumer for commodities, and various other functions [28]. SCM superimposes on the logistics framework the connectivity and coordination between entities like suppliers and customers [10].

The professionalism, skills, flexibility, reliability, attitude, behavior, reputation and integrity of human resources in logistics companies are mainly important from the client’s point of view [23]. Fig. 1 shows the flow of supply chain management. It explains the various activities involved in producing products and services, from planning through delivery of materials to the end customer. Strictly speaking, SCM is not a chain of processes; instead, it is a network of multiple businesses and relationships [8].

Fig.1. Process of Supply Chain Management

II. BIG DATA

The name Big Data was first used by two NASA researchers in 1997 as a challenge for visualizing large data sets. Thereafter, researchers and specialists in the field of Information Management have gradually been paying attention to Big Data. The process for Big Data is classified into 7 phases as shown in Fig. 2. The Data Discovery phase includes accessing the resources needed for solving the Big data analytics problem, and one should become familiar with the data [18]. In the data preparation stage, ELTL (Extract, Load, Transform, Load) operations are applied to the required data. Huge volumes of data from different sources cause a high probability of errors [30], so data needs to be transformed, cleaned and audited before it is loaded into data warehouses [30]. Technologies can be used if required in this phase. In the subsequent phases, especially in the next phase, the project team has to decide the usage of methods, workflows and


techniques required for analyzing the data along with evaluation and interpretation of data. In the model execution phase, the model chosen in the previous phase is executed with the appropriate data sets available. Once the results are available, they are communicated and, if possible, optimized in the remaining phases [18].

Big Data has a positive impact in different domains: it helps in reconditioning the supply chain, increasing sales and marketing, real route optimization, etc. [7]. The big data analytics process has been explained from the perspective of supply chain data [18]. The following section provides the taxonomy of ten main attributes of Big data applications in supply chain management.

1. Volume:
Volume refers to the huge amount of data generated from emails, Twitter, photos and videos every second. In SCM, volume can be related to the data generated from the use of sensors, bar codes, ERP, transport management systems and database technologies. Previously volume was measured in gigabytes; it is now measured in zettabytes (ZB) or even yottabytes (YB). There are different forms and ways of storing the Big data generated by the supply chain industry [20]. One is the relational database management system (RDBMS), a structured model employed to see, analyze, manufacture and store the huge amount of supply chain management data. The data clusters in Big data storage also include components like

• Direct attached storage (DAS), which includes different types of hard disks/hard drives attached to the DBMS
• Network storage (NS), which comes in two forms: Network attached storage (NAS) and Storage Area Network (SAN) [20]

Fig.2. Big Data Process.

2. Velocity:
It mainly refers to the speed at which data is collected, analyzed and transferred. It impacts efficiency, decision making models and algorithms in the field of SCM.

3. Variety:
It refers to the different forms of data: structured, unstructured or semi-structured [35]. It also includes different types of data, from XML to video to SMS. The variety of data in the field of SCM includes data from diverse sources like retailers, distributors, suppliers, inventory, sales, consumers, etc. [9]. The Big Data collection process in SCM includes two kinds of sources: upstream and downstream. Data from upstream sources includes the supplier’s side, through the intermediate stream or warehouse side. Data from downstream sources includes the logistics, distribution or retailer side [20].

4. Veracity:
Correctness or trustworthiness of data is referred to as veracity. It covers verifying the quality of data from SCM, compliance issues, etc.

5. Value:
It refers to the monetary worth of data. It is challenging to monitor the value of reports, statistics, impacts on the insights, etc.

6. Variability:
Variability in big data refers to data that is not consistent or is liable to vary or change. Supply chain variability arises in terms of information sharing, integration, quality control, unexpected delays in the supply process, etc.

7. Visualization:
Analyzing the data graphically is termed visualization. Visualization is more effective in conveying meaning than spreadsheets and reports or numbers and formulas. Supply chain data can be visualized using ERP, custom developed reports or graphical methods.

8. Virality:
It measures the speed of data movement from one network to another. From the supply chain management view, it is essential for the logistics process to be carried out.

9. Viscosity:
It mainly refers to data latency, or the delay in data. It can be understood as an element of velocity.

10. Volatility:
It refers to how long the data is valid and how long it should be stored. It is mainly associated with old and new data.

RESEARCH FINDING 1: From the literature survey it is found that supply chain management is a big network of multiple business strategies and relationships. In the Big data ecosystem, where data is found to be completely ubiquitous, it is challenging to justify the Big data dimensions (all V’s). For example, the Big data dimension volume relates to the data generated from transport management systems, enterprise resource planning and many more. A similar explanation can be found for another dimension of Big data, variety, which is more often referenced under the data collection process in SCM. Other issues related to Big data dimensions are quality of data, information sharing, development of customized reports, logistics process,


validation of data and others, which make it really challenging to achieve the completeness of all dimensions of Big data for SCM.

III. DATA ANALYTICS AND TECHNIQUES USED IN SUPPLY CHAIN MANAGEMENT

Although the largest growth in the field of data analytics has been in customer insight, analytics has many applications across the end-to-end supply chain. As the acquisition and transportation cost per entry is driven to a minimum, corrupted measurements and errors are inevitably found in large scale data [29]. Since the data sources continuously generate data in real time, analytics must often be performed in real time [29]. Applications of advanced analytic techniques to supply chain management have been described in [14]. Supply Chain Data Analytics has been classified into three types: Descriptive, Predictive and Prescriptive analytics.

A. Descriptive Analytics:
Descriptive Analytics (DA) is mainly used to analyze “what is happening” now and to answer the question of “what happened” in the past. This is the first level of analytics, and 90% of organizations apply this strategy for betterment of the future. DA identifies the historical data and analyzes the patterns in it. It mainly aims at identifying the problems and opportunities in the field of SCM within the existing processes and functions [17].
Descriptive Analytics uses techniques like

• Data Modeling
• Regression Analysis
• Visualization
• OLAP (online analytical processing) operations like drill down, up and across to identify the areas.

Big data tools for supply chain analytics are summarized in Table 1. OLAP operations for the supply chain may include shipments, products, logistics, customers, suppliers and other dimensions like rates and cost. The applications of Descriptive analytics provide managers with real time data regarding the quantities and location of goods in the supply chain.

B. Predictive Analytics:
Predictive analytics (PA) uses both quantitative and qualitative methods to analyze real time and historical data to estimate the past and future levels of integration of business processes among functions or companies, as well as the associated costs and service levels [9]. Predictive analytics aims at projecting what will happen in the future and why it may happen [17]. PA includes algorithms/techniques such as [2]

1. Time series methods and advanced forecasting. These methods are used for predicting sales in SCM.
2. Statistical algorithms such as Discriminant Analysis, k-NN, Naive Bayes (NB) and Bayes Networks (BN).
3. Decision trees, CART and Random Forests, which use a hierarchical sequential structure.
4. Clustering algorithms, used to group homogeneous elements in a data set.
5. Frequent pattern mining algorithms.

Predictive analytics mainly focuses on forecasting at strategic, tactical and operational levels, based on the planning process in terms of network design, production planning, inventory management and capacity planning [14]. Predictive analytics uses mathematical algorithms and programming to predict the patterns within data.

To understand Descriptive and Predictive Analytics, an experiment is performed on sales data, which is an available benchmark dataset. The results are found to be interesting. A predictive model is developed based on regression analysis. A detailed explanation is provided in Section VI.

C. Prescriptive Analytics:
DA and PA are focused on what will happen and when, whereas Prescriptive Analytics is concerned with “why it has happened”. It collects data continuously to re-predict events, which enables decision makers to increase prediction accuracy and take better decisions. Prescriptive analytics explains the reasons behind certain events. It is mainly associated with simulation and optimization [2]. The aim of Prescriptive analytics is to improve business performance [17]. Three classes of algorithms used under this analytics method are

• Decision trees
• Fuzzy Rule-Based System
• Switching Neural Networks (Logic Learning Machine)

Prescriptive analytics focuses on optimization with mathematical and simulation techniques in order to provide decision support tools that build on descriptive and predictive methods.

RESEARCH FINDING 2: Big data tools available for supply chain analytics are mainly used for data exploration, integration of data, statistical analysis, proper visualization methods and understanding the data warehouse system. A few of them are R programming, Informatica and PASW. The main observation about LINGO and DSM is that they are mainly used for documentation purposes based on the customer support system. Pentaho is used to handle structured and unstructured volumes of data. To


summarize, the integration of all tools is an important task in SCA for developing a decision support system.

Table 1. Big data tools for Supply Chain Analytics

Name of the Tool | Description
LINGO | Optimization tool for linear, nonlinear and mathematical programming, introduced by John H Thomson in 1989. This tool can be used for easy model expressions and convenient data options, and it also helps with documentation.
DSM | Drop shipping management tool which is mainly used to increase arbitrage opportunities. This is a best tool in the field of drop shipping arbitrage. It can be used for handling customer support based on the sales history, ticketing system and statistical reports.
R programming | An object oriented, free, open source software environment for statistical analysis and graphics, introduced by Ross Ihaka and Robert Gentleman. It is mainly based on command line interfaces. It is used by statisticians and data miners and supports statistical tests, linear and nonlinear modeling and time series analysis.
MongoDB | A document oriented database which replaces the concept of a “row” with a more flexible model. Developed by Dwight Merriman, Eliot Horowitz and Kevin Ryan. MongoDB supports 1. indexing, 2. aggregation, 3. special collection types and 4. file storage.
Pentaho | Pentaho serves as an analytical/software platform for Big Data [20]. It generates reports from both structured and unstructured volumes of data. It provides services for business users for easy access, visualization, integration and data exploration [20].
Informatica | Informatica offers a series of products related to data integration and warehousing. It extracts data from heterogeneous data sources, then combines and standardizes the data so that it is presented in a uniform format for processing.
PASW | Used to perform analysis of data collected from different sources and presentation functions, which include graphical and statistical analysis. Some of the features used for statistical data analysis are descriptive statistics and sophisticated inferential and multivariate statistical procedures.
Hadoop MapReduce | A method used to process vast amounts of data, and a Java-based model for distributed computing. The MapReduce algorithm contains two main tasks, Map and Reduce. The technique splits the input data set into independent chunks which are processed by the map tasks in parallel; the output is sorted and given as input to the reduce tasks. The framework takes care of scheduling the tasks, monitoring them and re-executing the tasks that fail during execution.

IV. CHALLENGES OF SUPPLY CHAIN MANAGEMENT

According to Robak et al. [2014], the open research problems in supply chain management and logistics can be analyzed from the view of stakeholders and executive business components, where the key business functions are forecasting, inventory management, transport management and human resources. Big data can address supply chain issues like timely response, timely delivery, real time planning, supplier and customer relationship management, etc. [15]. Operating an effective supply chain involves a continuous flow of information, which in turn helps to create better material flow [8]. The main focus in the supply chain is the customer, so achieving a good customer focused system is one of the aims of SCM [8]. The key supply chain processes/challenges are listed and shown in Fig. 3.

Fig.3. Big Data Challenges in Logistics

Supply chains need to be aware of the benefits that big data offers for their operations.
A. Distribution Network Optimization
B. Green Logistics
C. Route Optimization
D. Space Optimization
E. Last mile delivery problem
F. Redelivery Consignments
G. Custom clearance time
H. Track & Trace

A. Distribution Network Optimization
A distribution network involves locating warehouses and production plants and identifying the strategy for distribution of products from suppliers to warehouses and from warehouses to customers [12]. The key challenges of distribution network optimization are optimizing the domestic transportation network, achieving flexibility in network redeployment when the network changes, and minimizing the total cost of the distribution network. The benefit of meeting these challenges is an optimized network.

B. Green Logistics
It implies efficiency in using transportation equipment. Some of the key features of green logistics are minimization of carbon emissions, cost reduction from resource saving, and environmental externalities [13]. Benefits by overcoming the mentioned challenges are


better utilization of resources, social and environmental responsibility, etc. Introducing or adopting green logistics is a complex process which requires cross disciplinary coordination and changes in the current operation process [19]. This can also be achieved by introducing new practices in the area of supply and distribution that link them to other participants in the value chain, like suppliers and customers. This link must be supported by management staff, their characteristics and human resources [19].

C. Route Optimization
Route optimization is a very important factor in efficiently controlling the physical flow of the supply chain. Some of the challenges are optimization of a single-route trucking trip, allocation of appropriate resources per trip and cost reduction. The benefit of achieving these goals is route efficiency.

D. Space Optimization
The key parameters in space optimization are maximizing space utilization, improving productivity and minimizing cost. The benefit of space optimization is better utilization of space.

E. Last mile delivery problem
The main challenge in this problem is delivering thousands of packages to customers in an efficient way. Another challenge is time bound delivery of goods. The benefit of overcoming this problem is customer satisfaction. BDA helps with the last mile delivery problem by increasing the level of operational efficiency [15].

F. Redelivery Consignments
Some of the parameters in redelivery consignments are proper packaging and handling and efficient transportation, which reduce the redelivery of products. The benefit of achieving these parameters is monetary efficiency through minimizing redelivery.

G. Custom clearance time
The main parameter in custom clearance time is maintaining proper documentation, which represents the client at the time of custom examination and assessment. The benefit of achieving this parameter is avoiding detention charges.

H. Track and Trace
Some of the parameters under track and trace are “near-real-time” tracking and status and position information [11]. The advantage of achieving these parameters is improved track and trace performance.

RESEARCH FINDING 3: The literature survey reveals that the Big data challenges in logistics management of the supply chain system mainly relate to the stakeholders, that is, customers, and the key business functions. It is found that network optimization, route optimization and space optimization form the basis for proper management of physical space, the physical flow of the supply chain and a proper strategy for distribution of products to customers. Minimization of cost, time and space are the main parameters. The above challenges also reflect the development of customer end activities such as efficient delivery of products, packaging and handling, and documentation processes of related activities. A complete model can be developed by taking real world data and establishing that the above challenges are met using a Big data approach.

V. SUSTAINABLE SMART LOGISTICS

Logistics is a part of supply chain management. Logistics can be defined as the process of managing the procurement, storage and movement of goods along with the related information flow to maximize the profit of the organization through cost effective fulfillment of orders. Logistics has been identified as a core element of supply chain management [21]. The aim of logistics is to serve customers in a cost effective way. Sustainability in logistics can be defined as a cultural issue based on the demonstration of many companies and organizations. It can be a trend setting the business model, or setting up new market opportunities and preparing for future scenarios [22]. The term intelligent or smart logistics can be defined as different logistics operations which are planned, managed and controlled in a smart way compared to conservative solutions [21]. Besides planning, managing and controlling the objects and resources of logistics, the aggregation and processing of the collected data is also an important task of smart logistics [27]. Some of the approaches to improve logistics by making it more intelligent are as follows:

A. Autonomous Logistics
It describes the ability of logistics objects to process information and to provide and execute their own decisions.

B. Product intelligence
The way of storing and transporting any physical order or product instance in an efficient manner.

C. Intelligent transport systems
It mainly refers to innovative services related to transport and traffic management. This enables users to be better informed and to make safer and smarter use of transport networks.

D. Physical Internet
The Physical Internet suggests exploiting the digital internet metaphor in order to develop a physical internet towards meeting the global logistics sustainability challenge.

E. Intelligent cargo
Capabilities under intelligent cargo are self-identification, context detection, access to services, status monitoring and registering.

F. Self-organizing logistics


A self-organized logistics company functions without intervention by managers, engineers or software control. Of the above approaches, intelligent traffic management systems have attracted the greatest interest in recent years. Uckelmann described Smart Logistics as technical components to gain data on the level of material flow and to process these data for monitoring and further purposes [23]. Smart Logistics is a key approach in information logistics across company and international transportation networks to meet the need for robustness, flexibility, resilience and agility [23]. Technical components in smart logistics are as follows:

A. RFID for identification
B. RTLS like GPS and others for location
C. Usage of sensors

RFID is used to ensure secure identification of the different objects at all stages of the supply chain. For example, in the use case of the vehicle development process, up to 50 single items are placed in one vehicle, and each needs to be securely identified. Smart logistics using RFID can be used to identify objects placed in the vehicle. One application of RFID is to record the ID and the temperature sensed by the sensors during transportation [34].

VI. REGRESSION

Regression is a collection of data mining or statistical modelling techniques used to describe the behavior of a random variable by using one or more quantitative variables. Linear regression uses the equation of a straight line (y = mx + c) to predict the value of y by determining appropriate values for m and c based upon the given values of x. Here y is the dependent variable and the other variables are predictors or independent variables. The relationships between the target variable and the predictors are summarized in a model. This relationship can then be applied to a variety of datasets whose target values are unknown. Some of the applications of regression analysis in business are predictive analytics, operational efficiency, supporting decisions, correcting errors and new insights. Regression analysis is mainly used to determine the values of parameters for a function that cause the function to best fit a set of data that we provide. The equation below expresses the relationship in symbols: regression estimates the value of a continuous target Y as a function F of one or more predictors (x1, x2, ..., xn), a set of parameters (θ1, θ2, ..., θn) and a measure of error e [32].

Y = F(x, θ) + e    (1)

The target variable can be understood as the dependent or response variable, and the predictors as independent variables.

A. Linear regression
Linear regression is a statistical method used to learn about the relationship between a response variable and a predictor variable. Regression analysis is a good choice when all of the independent variables are continuous valued. Straight line linear regression analysis is the simplest form of regression; it contains a single predictor variable x and models the response y as a linear function of x.

y = mx + c    (2)

B. Coefficient of determination
The coefficient of determination, denoted by R², is an output of regression analysis. It is interpreted as follows:

• R² ranges from 0 to 1.
• R² = 0 means the dependent variable cannot be predicted from the independent variable.
• R² between 0 and 1 indicates the extent to which the dependent variable is predictable [32].
• R² = 0.10 indicates that 10% of the variance in y is predictable from x, R² = 0.20 indicates that 20% is predictable, and so on.

VII. EXPERIMENTS & RESULTS

This section comprises a discussion of the experiments and results, the data set description, a summary of the scatter plot representation and PSPP.

A. Data Set Description:
The sample sales data set taken for regression analysis has a size of 25. The data set is in .csv format. The attributes of the sales data and their descriptions are provided in Table 2.

Table 2. Data set description

Attribute | Description
Store number | Number for each store
Sales_quantity_Feb | Quantity of sales in Feb
Sales_Feb | Amount of sales in Feb
GP | Gross Profit
Expected_Sales | Result of sales after performing linear regression

B. PSPP:
PSPP is open source software which can be used for the analysis of sampled data. It was originally developed in the late 1990s and is intended as a free alternative to IBM SPSS Statistics. It has a graphical user interface and a conventional command-line interface. It is written in C and uses the GNU Scientific Library for its mathematical routines. PSPP is used for statistical analysis of sampled datasets. The tool reads the data, analyzes it according to the commands and writes the results to the output window. The language is similar to SPSS statistical products. The
independent or explanatory variables. Parameters of
important features of PSPP are frequencies, cross-tabs
regression are also known as regression coefficients [32].


comparison of means (t-tests and one-way ANOVA), linear regression, logistic regression and many more.

C. Scatter plot:

To determine whether a linear relationship exists between the dependent and independent variables, it is suggested to first draw a scatter plot of the given data set. If the plot shows no linear relationship, there is no need for linear regression. In Fig. 4 the points on the graph follow a straight line. This indicates that a linear relationship exists between the variables and that simple regression can be applied.

Fig.4. Scatter plot to check linearity

D. Interpretation of Results:

The results obtained from the regression model run on PSPP are presented in Table 3(a-c) and Table 4. Table 3(a-c) gives the model summary, the ANOVA and the coefficients. Table 4 lists the results generated from simple linear regression, and Fig. 5 shows the corresponding scatter plot. According to Table 4, the predictions are close to the actual values (approximately 80%).

The model summary in Table 3(a) reports R = 0.96 and a coefficient of determination (R Square) of 0.92, both within the range 0 to 1. This indicates that about 92% of the variance in the dependent variable can be predicted from the independent variable. In the present experiment the dependent variable is sales_quan_Feb.

Analysis of variance (ANOVA) is a collection of statistical models and their related procedures. It is used to analyze the differences among group means. Table 3(b) reports the significance of the ANOVA for sales_quan_Feb as .000.

The regression coefficients for the predictor (Expected_sales) depict the difference in response per unit difference in the predictor. They are tabulated in Table 3(c). From the coefficients in Table 3(c), the regression equation is:

DV = 23411.25 + 0.02 * IV    (3)

where
DV = Dependent variable
IV = Independent variable

Table 3. (a): Model Summary (sales_quan_feb)

| R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|
| .96 | .92 | .92 | 32816.24 |

Table 3. (b): ANOVA (sales_quan_feb)

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 282643336179.97 | 1 | 282643336179.97 | 262.46 | .000 |
| Residual | 23691920426.65 | 22 | 1076905473.94 | | |
| Total | 306335256606.63 | 23 | | | |

Table 3. (c): Coefficients (sales_quan_feb)

| | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 23411.25 | 8709.12 | .00 | 2.69 | .013 |
| Sales_feb | .02 | .00 | .96 | 16.20 | .000 |
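The quantities reported in the model summary and ANOVA tables are related by simple arithmetic. The following is a minimal sketch, not part of the original PSPP run, that recomputes them from the sums of squares and degrees of freedom in Table 3(b). The variable names are illustrative assumptions. The input values are copied from the table.

```python
import math

# Sums of squares and degrees of freedom copied from Table 3(b).
# Variable names are illustrative, not PSPP identifiers.
ss_regression = 282643336179.97
ss_residual = 23691920426.65
df_regression = 1
df_residual = 22

ss_total = ss_regression + ss_residual
n = df_regression + df_residual + 1  # number of observations used in the fit

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic, R Square, R, adjusted R Square and standard error of the estimate.
f_statistic = ms_regression / ms_residual
r_square = ss_regression / ss_total
r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
std_error_estimate = math.sqrt(ms_residual)

print(f"F = {f_statistic:.2f}")
print(f"R = {r:.2f}, R Square = {r_square:.2f}, "
      f"Adjusted R Square = {adjusted_r_square:.2f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.2f}")
```

The recomputed values agree with the rounded entries in Table 3(a) and Table 3(b). This confirms that the figure of 0.96 is R, while the coefficient of determination (R Square) is 0.92.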
Table 4. Simple Regression Table

| Sls_Qty_Jan | Sales_Jan | GP_Jan | Sls_Qty_Feb | Sales_Feb | GP_Feb | Expected_Sales |
|---|---|---|---|---|---|---|
| 255205 | 8504700 | 1573382 | 267965 | 8349929 | 1652051 | 7573852 |
| 413922 | 12475303 | 1831377 | 434618 | 21322029 | 1922946 | 12284188 |
| 282774 | 9839422 | 1849295 | 296913 | 9108444 | 1941759 | 8392048 |
| 81951 | 2387691 | 417905 | 86048 | 2639719 | 438801 | 2432089 |
| 91253 | 2819919 | 539118 | 95816 | 2939359 | 566074 | 2708175 |
| 43466 | 1660907 | 193122 | 45640 | 1400098 | 202778 | 1289984 |
| 231654 | 6353325 | 984933 | 243236 | 7461811 | 1034180 | 6874903 |
| 49835 | 1065829 | 507583 | 52326 | 1605223 | 532962 | 1478960 |
| 95128 | 4148480 | 589751 | 99884 | 3064161 | 619238 | 2823155 |
| 89868 | 3755131 | 703207 | 94361 | 2894734 | 738368 | 2667051 |
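To illustrate how the straight-line model Y = mx + c and its coefficient of determination are obtained, the following minimal sketch fits a least-squares line to the ten rows shown in Table 4, with Sales_Feb as the independent variable and Sls_Qty_Feb as the dependent variable. It is written in plain Python rather than PSPP, and the function name fit_simple_linear_regression is a hypothetical helper introduced here for illustration. Because only an excerpt of the data set is used, the fitted coefficients are not expected to match those in Table 3(c).

```python
def fit_simple_linear_regression(x, y):
    """Return slope m, intercept c and R Square of the least-squares line y = m*x + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form ordinary least squares for a single predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    c = mean_y - m * mean_x
    # R Square = 1 - (residual sum of squares / total sum of squares).
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))
    r_square = 1 - ss_residual / ss_total
    return m, c, r_square


# The ten rows shown in Table 4 (an excerpt of the full data set).
sales_feb = [8349929, 21322029, 9108444, 2639719, 2939359,
             1400098, 7461811, 1605223, 3064161, 2894734]
sls_qty_feb = [267965, 434618, 296913, 86048, 95816,
               45640, 243236, 52326, 99884, 94361]

m, c, r_square = fit_simple_linear_regression(sales_feb, sls_qty_feb)
print(f"Sls_Qty_Feb = {c:.2f} + {m:.4f} * Sales_Feb, R Square = {r_square:.2f}")
```

The same function can be applied to any pair of columns in Table 4, for example to relate Expected_Sales to Sls_Qty_Feb.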


Fig.5. Scatter plot after prediction (x-axis: Sales_Quan, y-axis: Expected_Sales).

VIII. CONCLUSION

The bibliographic survey conducted in this paper mainly focuses on the supply chain management activities related to data warehouse, marketing and transportation. Business transformation and improvements in operation cost have contributed to the development of supply chain analytics. A predictive model developed on sample sales data using the linear regression method is found to be 80% nearer to the available results. The model also provides information about the descriptive statistics through which descriptive analytics can be done. The work can be further enhanced using multiple and polynomial regression methods. The present work provides a good platform for Big data analytics (BDA) and the possible integration of BDA and SC. It can act as a good customer support system.

ACKNOWLEDGEMENT

The authors extend their gratitude to their organization, JSS Academy of Technical Education, Bengaluru, for the facilities provided in order to carry out the research work.

REFERENCES

[1] Feki, Mondher, Imed Boughzala, and Samuel Fosso Wamba. "Big Data Analytics-Enabled Supply Chain Transformation: A Literature Review." 2016 49th Hawaii International Conference on System Sciences (HICSS). IEEE, 2016.
[2] Rozados, Ivan Varela, and Benny Tjahjono. "Big data analytics in supply chain management: Trends and related research." 6th International Conference on Operations and Supply Chain Management. 2014.
[3] Kao, Gio, et al. "Supply chain lifecycle decision analytics." Security Technology (ICCST), 2014 International Carnahan Conference on. IEEE, 2014.
[4] Mishra, Deepa, et al. "Big Data and supply chain management: a review and bibliometric analysis." Annals of Operations Research (2016): 1-24.
[5] Kamble, Shridhar, Aaditya Desai, and Priya Vartak. "Data mining and data warehousing for supply chain management." Communication, Information & Computing Technology (ICCICT), 2015 International Conference on. IEEE, 2015.
[6] Cox, Michael, and David Ellsworth. "Application-controlled demand paging for out-of-core visualization." Proceedings of the 8th conference on Visualization'97. IEEE Computer Society Press, 1997.
[7] Benabdellah, Abla Chaouni, et al. "Big Data for Supply Chain Management: Opportunities and Challenges."
[8] Lambert, Douglas M., and Martha C. Cooper. "Issues in supply chain management." Industrial Marketing Management 29.1 (2000): 65-83.
[9] Waller, Matthew A., and Stanley E. Fawcett. "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management." Journal of Business Logistics 34.2 (2013): 77-84.
[10] Christopher, Martin. Logistics & supply chain management. Pearson UK, 2016.
[11] Jakobs K., Pils C., Wallbaum M. (2001) Using the Internet in Transport Logistics - The Example of a Track & Trace System. In: Lorenz P. (eds) Networking - ICN 2001. Lecture Notes in Computer Science, vol 2093. Springer, Berlin, Heidelberg.
[12] Amiri, Ali. "Designing a distribution network in a supply chain system: Formulation and efficient solution procedure." European Journal of Operational Research 171.2 (2006): 567-576.
[13] Dekker, Rommert, Jacqueline Bloemhof, and Ioannis Mallidis. "Operations Research for green logistics–An overview of aspects, issues, contributions and challenges." European Journal of Operational Research 219.3 (2012): 671-679.
[14] Souza, Gilvan C. "Supply chain analytics." Business Horizons 57.5 (2014): 595-605.
[15] Robak, S., Franczyk, B., Robak, M. (2014). Research Problems Associated with Big Data Utilization in Logistics and Supply Chain Design and Management. Annals of Computer Science and Information Systems, 3(1), 245-249.
[16] Mikavica, Branka, Aleksandra Kostić-Ljubisavljević, and Vesna Radonjić. "Big data: challenges and opportunities in logistics systems." 2nd Logistics Intl. Conference. 2015.
[17] Wang, Gang, et al. "Big data analytics in logistics and supply chain management: Certain investigations for research and applications." International Journal of Production Economics 176 (2016): 98-110.
[18] Robak, Silva, Bogdan Franczyk, and Marcin Robak. "Business process optimization with big data analytics under consideration of privacy." Computer Science and Information Systems (FedCSIS), 2016 Federated Conference on. IEEE, 2016.
[19] Seroka-Stolka, Oksana. "Green Initiatives in Environmental Management of Logistics Companies." Transportation Research Procedia 16 (2016): 483-489.
[20] Addo-Tenkorang, Richard, and Petri T. Helo. "Big data applications in operations/supply-chain management: A literature review." Computers & Industrial Engineering 101 (2016): 528-543.
[21] McFarlane, Duncan, Vaggelis Giannikas, and Wenrong Lu. "Intelligent logistics: Involving the customer." Computers in Industry 81 (2016): 105-115.
[22] Ceniga, Pavel, and Viera Sukalova. "Future of logistics management in the process of globalization." Procedia Economics and Finance 26 (2015): 160-166.
[23] Palšaitis, Ramūnas, Kristina Čižiūnienė, and Kristina Vaičiūtė. "Improvement of Warehouse Operations Management by Considering Competencies of Human Resources." Procedia Engineering 187 (2017): 604-613.


[24] Uckelmann, Dieter. "A definition approach to smart logistics." International Conference on Next Generation Wired/Wireless Networking. Springer Berlin Heidelberg, 2008.
[25] Kao, G., Lin, H., Eames, B., Haas, J., Fisher, A., Michalski, J., ... & Wyss, G. (2014, October). "Supply chain lifecycle decision analytics". In Security Technology (ICCST), 2014 International Carnahan Conference on (pp. 1-7). IEEE.
[26] Leveling, J., Edelbrock, M., & Otto, B. (2014, December). Big data analytics for supply chain management. In Industrial Engineering and Engineering Management (IEEM), 2014 IEEE International Conference on (pp. 918-922). IEEE.
[27] Kirch, Martin, Olaf Poenicke, and Klaus Richter. "RFID in Logistics and Production–Applications, Research and Visions for Smart Logistics Zones." Procedia Engineering 178 (2017): 526-533.
[28] Jian-qiang, Wu, Zhang Lei, and Zhu Guo-qing. "Performance-based Evaluation on the Logistics Warehouse." Procedia Engineering 11 (2011): 522-528.
[29] Slavakis, Konstantinos, Georgios B. Giannakis, and Gonzalo Mateos. "Modeling and optimization for big data analytics: (statistical) learning tools for our era of data deluge." IEEE Signal Processing Magazine 31.5 (2014): 18-31.
[30] Abai, Nur Hani Zulkifli, Jamaiah H. Yahaya, and Aziz Deraman. "User requirement analysis in data warehouse design: a review." Procedia Technology 11 (2013): 801-806.
[31] Park, Sung H. "Simple linear regression." International Encyclopedia of Statistical Science. Springer Berlin Heidelberg, 2011. 1327-1328.
[32] Srimani P.K., Patil M.M. (2014) Regression Model for Edu-data in Technical Education System: A Linear Approach. In: ICT and Critical Infrastructure: Proceedings of the 48th Annual Convention of Computer Society of India - Vol II. Advances in Intelligent Systems and Computing, vol 249. Springer, Cham.
[33] Nivedita Das, Leena Das, Siddharth Swarup Rautaray, Manjusha Pandey, "Big Data Analytics for Medical Applications", International Journal of Modern Education and Computer Science (IJMECS), Vol.10, No.2, pp. 35-42, 2018. DOI: 10.5815/ijmecs.2018.02.04.
[34] Ping-Ho, Ting. "An efficient and guaranteed cold-chain logistics for temperature-sensitive foods: applications of RFID and sensor networks." International Journal of Information Engineering and Electronic Business 5.6 (2013):
[35] Pradeep Kumar M. Kanaujia, Manjusha Pandey, Siddharth Swarup Rautaray, "A Framework for Development of Recommender System for Financial Data Analysis", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.9, No.5, pp. 18-27, 2017. DOI: 10.5815/ijieeb.2017.05.03

Authors' Profiles

Dr. Malini M. Patil is presently working as an Associate Professor in the Department of Information Science and Engineering at J.S.S. Academy of Technical Education, Bangalore, Karnataka, INDIA. She received her Ph.D. degree from Bharathiar University in the year 2015. Her research interests are big data analytics, bioinformatics, cloud computing and image processing. She has published more than 20 research papers in many reputed international journals and is guiding four students. She has attended and presented papers in many international conferences in India and abroad. Her published articles include "Performance analysis of Hoeffding trees in data streams by using massive online analysis framework" in International Journal of Data Mining, Modelling and Management, Inderscience Publishers, and "Mining Data streams with concept drift in massive online analysis frame work" in WSES Transactions on Computers. She is a member of IEEE, ISTE, CSI and IEI. She is a recipient of the Distinguished Woman in Science Award for the year 2017 from Venus International Foundation. She has received a best paper presenter award at the Second International Conference of Data Management, Analytics and Innovation – ICDMAI-2018, Pune, India. The award was sponsored by Springer. Contact email: [email protected]

P. Anitha is currently working as an Assistant Professor in the Department of Information Science and Engineering at JSSATE, Bengaluru. She is a research scholar in the field of Supply Chain Data Analytics at JSSATE Research Centre, Department of CSE, affiliated to Visvesvaraya Technological University, Belagavi, India, under the guidance of Dr. Malini M Patil, Associate Professor, Department of ISE, JSSATE, Bengaluru. She completed her M. Tech in 2012 from the Department of Computer Science, BMSCE, Bengaluru, under the Visvesvaraya Technological University. She is a life member of the Indian Society for Technical Education.

How to cite this paper: Anitha P, Malini M. Patil, "A Review on Data Analytics for Supply Chain Management: A Case study", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.10, No.5, pp. 30-39, 2018. DOI: 10.5815/ijieeb.2018.05.05
