Big Data Based Retail Recommender System of Non E-Commerce: IEEE - 33044
ABSTRACT: Recommender systems, as a means of achieving precision marketing, have been widely used and have brought significant benefits to modern e-commerce systems. However, there is a lack of study on applying recommender systems to the traditional non-e-commerce retailing mode. This paper presents a retail recommender model based on collaborative filtering and designs the corresponding distributed computing algorithm on MapReduce, so as to implement a big data based retail recommender system. The big data mechanism lets the system scale its data processing easily. Experimental results show that the system is effective for estimating retail sales for each store and product. As a result, non-e-commerce enterprises could benefit from this novel way of supporting precision marketing.

KEYWORDS: Recommender systems; Collaborative Filtering; Big data; MapReduce; Precision Marketing

I. INTRODUCTION

Today, when we browse an e-commerce website hoping to purchase an item, we are aware of the convenience that personalized recommender systems bring: they can recommend a product that we have not noticed but would probably be fond of. We are therefore often pleasantly surprised: "Yes, although I have not thought of it before, this is exactly what I want!" Rather than requiring users to state clear demands, personalized recommender systems model users' interests by analyzing their historical behavior and recommend information that meets their interests and needs [1]. There are two kinds of applications of recommender systems: recommendation and prediction. Collaborative filtering is one of the most important technologies of recommender systems. Based on a similarity measure between users or items, the algorithm recommends or predicts items that users have not noticed but might be interested in. The system recommends items by filling in the user-item rating matrix [2-3]. This technology is ideally suited to Internet-based e-commerce applications, because enterprises can easily obtain and analyze users' behavior data on products and recommend the right products to users through e-commerce.

The larger the scale of users and products becomes, the more critical the role recommender systems play. But analyzing large-scale data is also difficult, and traditional calculation methods face challenges. In the era of big data [4], big data technology, which developed on the basis of cloud computing, provides excellent solutions for large-scale datasets [5]. Implementations of the popular MapReduce framework [6], such as Apache Hadoop [7], have become part of the standard toolkit for processing large datasets using cloud resources [8]. To solve the scalability problem, Zhao and Shang studied the implementation of collaborative filtering on a cloud computing platform, designed a user-based collaborative filtering algorithm for the MapReduce programming framework, and implemented it on the Hadoop platform [9]. Jiang et al. proposed a method that scales up an item-based CF algorithm on MapReduce in a Hadoop cluster with good concurrent processing efficiency [10].

In the area of e-commerce, recommender technology has been widely studied and applied. Schafer et al. studied how recommender systems help e-commerce sites increase sales and analyzed the recommender systems at several market-leading sites [11]. With the help of an item-based collaborative filtering recommender system, Amazon [12] obtained a 20% to 30% increase in sales [1]. Furthermore, Xiao and Benbasat give a detailed investigation of how e-commerce product recommendation agents improve the quality of the decisions consumers make when searching for and selecting products online [13]. However, this research and these applications target Internet-based e-commerce. Could the traditional retail model get new opportunities for development from recommender systems? At present, there is no other study on the application of recommender systems to the traditional retail model. The work presented here focuses on it.

The contribution of our work has three parts. First, the study presents a retail recommender system based on collaborative filtering. In this system, retail stores are regarded as users with different personalities, and the values obtained by applying a Sales→Rate transform to the product sales are regarded as the users' ratings of the items. Through this mechanism, the sales data of some mature retail stores can be used to train the collaborative filtering recommender model, and sales predictions for other stores can be made before actual product promotion. Non-e-commerce companies can thus get guidance for product promotion at different retail stores and design their sales management strategies more precisely, achieving more efficient sales at lower cost. Second, based on the research of Jiang et al. [10], we further design a big data based retail recommender algorithm. Big data based processing and analysis for the retail recommender is implemented on the Hadoop platform using the MapReduce framework. Finally, we use a set of data from real retail orders for offline tests and verify the validity of the presented system. The results of applying the system model are analyzed with different kinds of transforms and collaborative filtering algorithms. Experimental results show that the retail recommender system can effectively predict sales for a specific retail store and product. The characteristics of big data processing make it easy to cope with the analysis of a large-scale retail dataset.

The rest of this paper is arranged as follows: Section II gives the problem definition and basic algorithms of collaborative filtering; Section III gives a detailed introduction to the retail recommender model and algorithm; Section IV gives offline test results and analysis; Section V summarizes the research results.

II. PRELIMINARIES

[...] two users give the same or similar ratings on their commonly rated items, they are thought to have similar interests. The users who have similar interests to a given user are called that user's neighborhood. The neighborhood users' ratings for a specified item are used to predict how the specified user might rate that item [14]. However, because in practice the number of items is often much smaller than the number of users, and items are usually more stable, item-based CF (Item-CF) was proposed and has seen wider application in actual systems [12,15]. The kNN (k nearest neighbors) Item-CF algorithm predicts a rating by examining the k items most similar to the specified item, as explained below.

The similarity between items is measured using Cosine Similarity or Correlation Similarity (also called Pearson Similarity):

$$ w_{ij}^{(Cos)} = \frac{\sum_{u \in N(i) \cap N(j)} r_{ui} \, r_{uj}}{\sqrt{\sum_{u \in U} r_{ui}^2} \, \sqrt{\sum_{u \in U} r_{uj}^2}} \tag{1} $$

$$ w_{ij}^{(Pearson)} = \frac{\sum_{u \in N(i) \cap N(j)} (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{ui} - \bar{r}_i)^2} \, \sqrt{\sum_{u \in U} (r_{uj} - \bar{r}_j)^2}} \tag{2} $$

Here $w_{ij}^{(Cos)}$ is the Cosine Similarity between two items $i$ and $j$, and $w_{ij}^{(Pearson)}$ is the Pearson Similarity. $N(i)$ is the set of users who have rated item $i$. $U$ is the universal set of users. $r_{uj}$ is the rating user $u$ gives to item $j$, and $\bar{r}_i$ is the mean rating of item $i$.

Let $S(i, K)$ be the set of the $K$ items most similar to item $i$, and let $N(u)$ be the set of items that user $u$ has rated. The rating prediction of user $u$ for item $i$ is then:

$$ \hat{r}_{ui}^{(Cos)} = \frac{\sum_{j \in S(i,K) \cap N(u)} w_{ij}^{(Cos)} \, r_{uj}}{\sum_{j \in S(i,K) \cap N(u)} w_{ij}^{(Cos)}} \tag{3} $$
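As an illustration of Eqs. (1) and (3), the following is a minimal single-machine sketch in Python, not the distributed implementation described later in the paper. The function names, the dictionary layout (user mapped to a dict of item ratings) and the toy data are assumptions made for this example; unrated entries are simply treated as missing.

```python
from math import sqrt


def cosine_similarity(ratings, i, j):
    """Eq. (1). `ratings` maps user -> {item: rating}; absent items are unrated."""
    # Numerator: users who rated both items, i.e. u in N(i) ∩ N(j).
    common = [r for r in ratings.values() if i in r and j in r]
    numerator = sum(r[i] * r[j] for r in common)
    # Denominator: norms over all users who rated each item.
    norm_i = sqrt(sum(r[i] ** 2 for r in ratings.values() if i in r))
    norm_j = sqrt(sum(r[j] ** 2 for r in ratings.values() if j in r))
    if norm_i == 0 or norm_j == 0:
        return 0.0
    return numerator / (norm_i * norm_j)


def predict_rating(ratings, u, i, k):
    """Eq. (3): similarity-weighted average over S(i, K) ∩ N(u)."""
    items = {j for r in ratings.values() for j in r if j != i}
    sims = {j: cosine_similarity(ratings, i, j) for j in items}
    # S(i, K): the K items most similar to i (ties broken by item name).
    top_k = sorted(sims, key=lambda j: (-sims[j], j))[:k]
    user_ratings = ratings.get(u, {})
    rated = [j for j in top_k if j in user_ratings]
    denominator = sum(sims[j] for j in rated)
    if denominator == 0:
        return None  # no usable neighbor item, so the prediction is undefined
    return sum(sims[j] * user_ratings[j] for j in rated) / denominator


# Hypothetical toy data: three users, three items; u3 has not rated item "a".
toy_ratings = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 4, "b": 5, "c": 2},
    "u3": {"b": 4, "c": 5},
}
prediction = predict_rating(toy_ratings, "u3", "a", k=2)
```

In this toy case item "a" is more similar to "b" than to "c", so the prediction for u3 is pulled toward u3's rating of "b". Swapping in Eq. (2) would only require replacing `cosine_similarity` with a mean-centered variant.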
[...] where $T$ is the test set, each element of which is a user's rating on an item, and $|T|$ is the size of $T$, that is, the number of elements in $T$.

MapReduce [6,8,10], presented by Google, is a distributed computing model, and Hadoop is one of its most popular implementations. The MapReduce programming model is as follows:

The computation takes a set of input <key,value> pairs and produces a set of output <key,value> pairs. The user of the MapReduce framework expresses the computation as two functions, Mapper and Reducer, whose types are outlined as follows:

Mapper: <k1,v1> → list<k2,v2>
Reducer: <k2,list<v2>> → list<k3,v3>

The Mapper function takes an input pair and produces a set of intermediate <key,value> pairs. The MapReduce framework groups together all intermediate values associated with the same intermediate key and passes them to the Reducer function.

The Reducer function accepts an intermediate key and the list of values for that key passed on from the Mapper function, and merges these values to form a smaller list of values.

A pair of Mapper and Reducer procedures is called a phase; multiple phases can be chained to perform the complete computing task. The data flow in a phase is illustrated in Figure 1.

[Figure 1: input splits (Split 0 ... Split n) in the input HDFS feed the Mappers; the shuffle stage routes their output to the Reducers, which write Part 0 ... Part m to the output HDFS.]

Figure 1. Data flow in a phase

Input files are stored in a distributed file system, which on Hadoop is usually HDFS. A set of input files is divided into chunks called splits, which are distributed to the mappers on their nodes. The MapReduce framework writes the mapper function's output locally and then aggregates the relevant records at each reducer by having the reducers remotely read the records from the mappers. This process is called the shuffle stage. The reducer function is applied to each key with its respective set of values, and the outputs of the reducers are stored in the output HDFS. The MapReduce framework parallelizes the execution of all functions and ensures fault tolerance. With these distributed and parallel computing mechanisms, big data analysis becomes feasible.

III. BIG DATA BASED RETAIL RECOMMENDER SYSTEM

Today, the pressure to demonstrate Marketing Return On Investment has never been greater, and many companies are taking a more scientific approach to marketing and treating it as a true business discipline. This means applying more rigor to capturing, analyzing and manipulating customer data, and delivering narrowly defined messages designed to resonate with customers' specific wants and needs. This process is called precision marketing [16]. In e-commerce, recommender systems are used to estimate customers' interests, which makes them a good means of precision marketing. In traditional retailing, however, recommender systems have not been exploited because of the lack of direct consumer data. Yet because companies sell products through retail stores, and each store serves a characteristic group of customers in its area, companies that could predict the wants and needs of each store's customer group for each product specification would benefit from precision marketing in the form of targeted product promotion at stores. We propose a retail model that offers a new direction for non-e-commerce enterprises to achieve precision marketing by taking advantage of recommender systems.

In the presented model, a retail store corresponds to the user of the recommendation problem, and a product corresponds to the item. Product sales at a store indicate the needs of its customer group for the product, so sales correspond to the rating of the recommendation problem. Additionally, because stores order products according to their sales, orders can be used in place of sales, since they are easier to obtain. Because sales volumes differ greatly between stores, there is no uniform criterion for the raw sales data, whereas ratings should follow the same standard, which means that the rating range of different users should be the same (e.g. an integer rating range of 1 to 5 in the MovieLens [17] datasets). Therefore, the most important point in the retail recommender model is to define a transform (called the Sales→Rate transform) that maps sales to a uniform rating range. After the rating prediction is computed by collaborative filtering, a Sales←Rate transform is carried out to obtain the sales estimate of the retailing problem. The model is illustrated in Figure 2.

The purpose of the Sales→Rate transform is to map the
sales of different sizes to a uniform rating range, or to give them the same rating distribution, so that the sales of different stores can be compared with each other. In this paper two kinds of transform are presented: a transform based on the relative percentage of total product sales, and a transform based on the relative percentage of maximum product sales.

[Figure 2: the retailing problem mapped onto the recommendation problem. Store corresponds to User and Product corresponds to Item; Sales are converted to Rates by the Sales→Rate transform, collaborative filtering produces the Rate Prediction, and the Sales←Rate transform turns it into the Sales Estimation.]

Figure 2. Illustration of retailing recommender model

Let the sales of store $u$ for a sold product $i$ be $s_{ui}$, and the estimated sales of store $v$ for an unsold product $j$ be $\hat{s}_{vj}$. The two kinds of transform are explained as follows:

1. The transform based on the relative percentage of total product sales (denoted TotalTf) takes, for store $u$ and sold product $i$, the store's percentage of the total sales of product $i$:

$$ r_{ui} = \frac{s_{ui}}{\sum_{v \in U} s_{vi}} \times 100\% \tag{6} $$

Compute the rating prediction $\hat{r}_{vj}$ with the algorithm introduced in Section II, then use the inverse transform to get the sales estimate:

$$ \hat{s}_{vj} = \Big( \hat{r}_{vj} \times \sum_{u \in U} s_{uj} \Big) / 100\% \tag{7} $$

2. The transform based on the relative percentage of maximum product sales (denoted MaxTf) takes, for store $u$ and sold product $i$, the store's sales as a percentage of the maximum single-store sales of product $i$:

$$ r_{ui} = \frac{s_{ui}}{\max_{v \in U} \{ s_{vi} \}} \times 100\% \tag{8} $$

The inverse transform giving the sales estimate is:

$$ \hat{s}_{vj} = \Big( \hat{r}_{vj} \times \max_{u \in U} \{ s_{uj} \} \Big) / 100\% \tag{9} $$

The algorithm flow for computing the sales estimate for store $v$ and product $j$ is as follows:

a) Select a Sales→Rate transform and calculate the ratings for all stores on their sold products;
b) Select a collaborative filtering algorithm and compute the rating prediction for $v$ on $j$ from the ratings calculated in a);
c) Using the inverse of the transform in a) and the rating prediction from b), calculate the sales estimate.

Just as MAE is used to measure the offline error of the common recommendation prediction problem, SMAE (Sales MAE) is defined to measure the error of the retail recommendation problem. The smaller the SMAE, the better the prediction.

$$ SMAE = \frac{\sum_{s_{ui} \in T} \left| \hat{s}_{ui} - s_{ui} \right|}{|T|} \tag{10} $$

To let the retail recommender system work with big data, we improve the item-based CF algorithm on MapReduce designed by Jiang et al. Phases for the Sales/Rate transform, the inverse transform, and similarity sorting for the kNN computation are added to the algorithm, along with a clearer description of the keys and values so as to make full use of the MapReduce framework. Taking the computation with the TotalTf rating transform, Pearson similarity, and Item-CF as an example, the entire implementation flow of our retail recommender algorithm on MapReduce is illustrated in Figure 3.

Figure 3. Diagram of retail recommender algorithm on MapReduce with TotalTf and Pearson similarity

In Figure 3, keys and values inside the angle brackets are separated by a comma, and the items inside the parentheses
represent tuples. The entire process is done in five MapReduce phases.

1. In phase-1, Mapper-1 accepts the given sales data as input, represented by tuples $(u, i, s_{ui})$, extracts product $i$ as the key, and outputs the intermediate key/value pair $\langle i, (u, s_{ui}) \rangle$. The MapReduce framework sends values with the same key $i$ to Reducer-1, which computes the total product sales and outputs the result $\langle i, s_i^{Total} \rangle$.

2. In phase-2, Mapper-2 accepts the output of Reducer-1 and the intermediate key/value pairs output by Mapper-1. According to formula (6), Mapper-2 calculates the Sales→Rate transform and outputs two kinds of intermediate key/value pairs: $\langle (i, j), N_{ij} \rangle$ and $\langle i, (u, r_{ui}) \rangle$. Here $N_{ij}$ is the set of stores that have sold both products $i$ and $j$. Reducer-2 receives $\langle i, (u, r_{ui}) \rangle$, calculates $\|r_i\|^2 = \sum_{u \in U} (r_{ui} - \bar{r}_i)^2$ and the mean rating of product $i$, denoted $\bar{r}_i$, and outputs the results.

3. In phase-3, Mapper-3 accepts the output $\langle (i, j), N_{ij} \rangle$ of Mapper-2 and the output $\langle i, (\bar{r}_i, \|r_i\|^2) \rangle$ of Reducer-2 as inputs. After a mapping transform, Mapper-3 outputs the intermediate key/value pair $\langle (i, j), (\bar{r}_i, \bar{r}_j, \|r_i\|^2, \|r_j\|^2, N_{ij}) \rangle$, which is transmitted to Reducer-3. According to formula (2), Reducer-3 calculates the similarity $\langle (i, j), w_{ij} \rangle$ between products $i$ and $j$.

4. In phase-4, Mapper-4 accepts the similarity output of Reducer-3 as input. After a conversion mapping, Mapper-4 outputs the intermediate key/value pair $\langle i, (j, w_{ij}) \rangle$ to facilitate the reduce step. The key/value pairs are sent to Reducer-4, which sorts them by similarity and outputs $\langle i, S_i^K \rangle$. Here $S_i^K = \{ (j, Sim_{ij}) \mid Sim_{ij}$ is among the $K$ largest similarities of product $i \}$.

5. Finally, Mapper-5 accepts as inputs the pairs $\langle (i, j), N_{ij} \rangle$ output by Mapper-2, the similarities output by Reducer-4, and the tuples $(v, j)$ for which the sales of product $j$ are to be estimated. After a conversion mapping, Mapper-5 outputs the intermediate key/value pairs $\langle j, (v, S_j^K) \rangle$ to Reducer-5. Reducer-5 also receives $\langle i, s_i^{Total} \rangle$ from Reducer-1 and $\langle i, (u, r_{ui}) \rangle$ from Mapper-2 as input. According to formula (4), Reducer-5 calculates the sales estimate and outputs the final result $\langle (v, j), (\hat{r}_{vj}, \hat{s}_{vj}) \rangle$.

IV. EXPERIMENTAL RESULTS AND ANALYSIS

For different models and datasets, different collaborative filtering algorithms perform differently. Results obtained on the MovieLens [17] datasets may differ from those on other types of datasets, or even be completely opposite [1]. Accordingly, the retail prediction model defined in this paper needs a set of real marketing data for offline tests. These tests use different collaborative filtering algorithms and transforms to compare the computed results.

This paper uses order data from all cigarette retail stores in Naning, China, in 2012 as the experimental dataset. Because of the tobacco monopoly policy, all cigarettes must be sold through registered stores and online retail transactions are forbidden; since cigarettes are consumables, customers' needs are directly reflected in the purchase quantity. As described in Section III, the number of orders can represent the product sales, so the cigarette orders can be used as typical experimental data for the retail recommender system.

The original dataset contains 1,304,046 sales records of 277 cigarette brands from 29,079 stores. To reflect the prediction effect of data from mature retail stores, the raw dataset is filtered to meet the following requirement: each store sells at least 20 cigarette brands (or models), and each brand is sold by at least 100 stores. The processed dataset has 23,522 stores, 166 brands and 1,087,215 sales records in total. The average sales per brand and store is 54.6, the largest single-store sales per brand is 44,816, and the data sparsity is 72.2%.

The dataset is randomly divided into disjoint training and test sets: the training set contains 80% of the data and the test set 20%. To prevent over-fitting, we generate five groups of training and test sets for the experiment. Each group randomly selects 20% of the data as the test set and the rest as the training set. The reported result is the average over the five experimental groups.

For comparison, we first consider a common simple model based on the average value for each retail product (denoted Avg). Its prediction is the mean of the sales of all the stores that sell the product.

We use the methods described in Section III to train the retail recommender model and calculate the SMAE value on the test set, as shown in Figure 4. Here, Item-CF is
Figure 4. SMAE of Retail Recommendation

Experimental results show that the retail recommender system can effectively estimate product sales for the retail stores, but the Sales→Rate transform and the similarity used by collaborative filtering must be properly chosen.

When using Item-CF, the transform based on the relative percentage of total product sales works well. The reason is that this transform better reflects a measure of consistency in product sales, capturing not only the product's popularity at different retail stores but also offering better stability (because relative sales among retail stores change, the transform based on the relative percentage of maximum product sales would soon diverge). So the transform yields predictions with smaller error. Overall, for the experimental dataset, the best choice is the transform based on the relative percentage of total product sales, combined with cosine-similarity Item-CF to forecast sales; this algorithm achieves the smallest error, and hence the best prediction accuracy, with the least amount of computation when K=5.

REFERENCES
[1] Liang Xiang. "Recommendation System Practice." Posts & Telecommunications Press, 2012.6. (in Chinese)
[2] Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of collaborative filtering techniques." Advances in Artificial Intelligence, vol. 2009, pp. 4, 2009.
[3] Rajaraman, Anand, and Jeffrey David Ullman. "Mining of massive datasets." Cambridge University Press, 2012.
[4] Mayer-Schönberger, Viktor, and Kenneth Cukier. "Big Data: A Revolution that Will Transform how We Live, Work, and Think." Eamon Dolan/Houghton Mifflin Harcourt, 2013.
[5] Fernández, A., et al. "An Overview on the Structure and Applications for Business Intelligence and Data Mining in Cloud Computing." 7th International Conference on Knowledge Management in Organizations: Service and Cloud Computing. Springer Berlin Heidelberg, pp. 559-570, 2013.
[6] Dean, Jeffrey, and Sanjay Ghemawat. "MapReduce: simplified data processing on large clusters." Communications of the ACM, vol. 51, no. 1, pp. 107-113, 2008.
[7] Hadoop. http://hadoop.apache.org.
[8] Jayalath, Chamikara, Julian Stephen, and Patrick Eugster.
"From the Cloud to the Atmosphere: Running