Big Data Based Retail Recommender System of Non E-Commerce: IEEE - 33044


IEEE - 33044

BIG DATA BASED RETAIL RECOMMENDER SYSTEM OF NON E-COMMERCE

Chen Sun1, Rong Gao1, Hongsheng Xi1


1. Department of Automation, University of Science and Technology of China, Hefei
[email protected], [email protected], [email protected]

5th ICCCNT 2014
July 11-13, 2014, Hefei, China

ABSTRACT: Recommender systems, as a means of achieving precision marketing, have been widely used and have brought significant benefits to modern e-commerce systems. However, there is a lack of study on applying recommender systems to the traditional non e-commerce retailing mode. This paper presents a retail recommender model based on collaborative filtering and designs the corresponding distributed computing algorithm on MapReduce, so as to implement a big data based retail recommender system. The big data mechanism lets the system scale its data processing easily. Experimental results show that the system is effective at estimating retail sales for each store and product. As a result, non e-commerce enterprises could benefit from this novel form of precision marketing support.

KEYWORDS: Recommender systems; Collaborative Filtering; Big data; MapReduce; Precision Marketing

I. INTRODUCTION

Today, when we browse an e-commerce website hoping to purchase an item, we are aware of the convenience that personalized recommender systems bring: they can recommend a product that we had not noticed but would probably be fond of. We are therefore often pleasantly surprised: "Yes, I had not thought of it before, but this is exactly what I want!" Rather than requiring users to state clear demands, personalized recommender systems model the interests of users by analyzing their historical behavior and recommend the information that meets each user's interests and needs [1]. Recommender systems have two kinds of applications: recommendation and prediction. Collaborative filtering is one of the most important technologies of recommender systems. Based on a similarity measure between users or items, the algorithm recommends or predicts items that users have not noticed but might be interested in. The system makes recommendations by filling in the user-item rating matrix [2-3]. This technology is ideally suited to Internet-based e-commerce applications, because enterprises can easily obtain and analyze users' behavior data for products and recommend the right product to each user.

The larger the scale of users and products becomes, the more critical the role recommender systems play. However, analyzing large-scale data is difficult, and traditional calculation methods face challenges. In the era of big data [4], big data technology, which has developed on the basis of cloud computing, provides excellent solutions for large-scale datasets [5]. Implementations of the popular MapReduce framework [6], such as Apache Hadoop [7], have become part of the standard toolkit for processing large datasets using cloud resources [8]. To solve the scalability problem, Zhao and Shang studied the implementation of collaborative filtering on a cloud computing platform, designed a user-based collaborative filtering algorithm for the MapReduce programming framework, and implemented the algorithm on the Hadoop platform [9]. Jiang et al. proposed a method that implements a scaled-up item-based CF algorithm on MapReduce in a Hadoop cluster with good concurrent processing efficiency [10].

In the area of e-commerce, recommender technology has been widely studied and applied. Schafer et al. studied how recommender systems help e-commerce sites increase sales and analyzed the recommender systems at several market-leading sites [11]. With the help of an item-based collaborative filtering recommender system, Amazon [12] obtained a 20% to 30% increase in sales [1]. Furthermore, Xiao and Benbasat investigated in detail how e-commerce product recommendation agents improve the quality of the decisions consumers make when searching for and selecting products online [13]. However, this research and these applications target Internet-based e-commerce. Could the traditional retail model get new opportunities for development from recommender systems? At present, no other study addresses the application of recommender systems to the traditional retail model. The work presented here focuses on it.

Our work makes three contributions. First, the study presents a retail recommender system based on collaborative filtering. In this system, retail stores are regarded as users with different personalities, and the values produced by a Sales→Rate transform of the product sales are regarded as the users' ratings of the items. Through this mechanism, the sales data of mature retail stores can be used to train the collaborative filtering recommender model, and sales predictions for other stores can be made before actual product promotion. Non e-commerce companies thus get guidance for product promotion at different retail stores and can design their sales management strategies more precisely, achieving more efficient sales at lower cost. Second, building on the research of Jiang et al. [10], we design a big data based retail recommender algorithm. Big data based processing and analysis for retail recommendation is implemented on the Hadoop platform using the MapReduce framework. Finally, we use a set of data from real retail orders for offline tests to verify the validity of the presented system. The results of applying the system model are analyzed with different kinds of transforms and collaborative filtering algorithms. Experimental results show that the retail recommender system can effectively predict sales for a specific retail store and product. The characteristics of big data processing make it easy to cope with the analysis of a large-scale retail dataset.

This paper is arranged as follows: Section II gives the problem definition and the basic algorithms of collaborative filtering; Section III gives a detailed introduction to the retail recommender model and algorithm; Section IV gives offline test results and analysis; Section V summarizes the research results.

II. PRELIMINARIES

The problems of recommender systems [1-2,14] are generally TopN recommendation and rating prediction, both of which have similar basic principles and methods. This paper focuses on the rating prediction problem: users rate items according to their degree of preference, and the task of the recommender system is to predict how the users would rate the items that they have not rated. Collaborative filtering is used to solve this problem.

Collaborative filtering (CF) has become the most widely adopted technology of recommender systems, and the neighborhood-based CF algorithm is prevalent. User-based CF was presented first. Its principle is that when two users give the same or similar ratings to the items they have both rated, they are thought to have similar interests. The users who have similar interests to a given user are called that user's neighborhood. The neighborhood users' ratings of a specified item are used to predict how the given user might rate that item [14]. However, because in practice the number of items is often much smaller than the number of users, and items are usually more stable, item-based CF (Item-CF) was presented and has seen more application in actual systems [12,15]. The kNN (k nearest neighbors) Item-CF algorithm makes a rating prediction by examining the k items most similar to the specified item, as explained below.

The similarity between items is measured using Cosine Similarity or Correlation Similarity (also called Pearson Similarity):

$$ w_{ij}^{(Cos)} = \frac{\sum_{u \in N(i) \cap N(j)} r_{ui} \cdot r_{uj}}{\sqrt{\sum_{u \in U} r_{ui}^2} \cdot \sqrt{\sum_{u \in U} r_{uj}^2}} \tag{1} $$

$$ w_{ij}^{(Pearson)} = \frac{\sum_{u \in N(i) \cap N(j)} (r_{ui} - \bar{r}_i) \cdot (r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in U} (r_{ui} - \bar{r}_i)^2} \cdot \sqrt{\sum_{u \in U} (r_{uj} - \bar{r}_j)^2}} \tag{2} $$

Here $w_{ij}^{(Cos)}$ is the Cosine Similarity between two items i and j, and $w_{ij}^{(Pearson)}$ is the Pearson Similarity. $N(i)$ is the set of users who have rated item i. $U$ is the universal set of users. $r_{uj}$ is the rating user u gives to item j, and $\bar{r}_i$ is the mean rating of item i.

Let $S(i, K)$ be the set of the K items most similar to item i, and let $N(u)$ be the set of items that user u has rated. The rating prediction of user u for item i is as follows:

$$ \hat{r}_{ui}^{(Cos)} = \frac{\sum_{j \in S(i,K) \cap N(u)} w_{ij}^{(Cos)} \cdot r_{uj}}{\sum_{j \in S(i,K) \cap N(u)} w_{ij}^{(Cos)}} \tag{3} $$

$$ \hat{r}_{ui}^{(Pearson)} = \bar{r}_i + \frac{\sum_{j \in S(i,K) \cap N(u)} w_{ij}^{(Pearson)} \cdot (r_{uj} - \bar{r}_j)}{\sum_{j \in S(i,K) \cap N(u)} w_{ij}^{(Pearson)}} \tag{4} $$

where $\hat{r}_{ui}^{(Cos)}$ is the prediction using Cosine Similarity and $\hat{r}_{ui}^{(Pearson)}$ is the prediction using Pearson Similarity.

The accuracy of rating prediction is measured using mean absolute error (MAE):

$$ MAE = \frac{\sum_{r_{ui} \in T} |\hat{r}_{ui} - r_{ui}|}{|T|} \tag{5} $$
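As a minimal sketch of the cosine similarity, the kNN Item-CF prediction and the MAE defined in formulas (1), (3) and (5), the Python code below works on a small in-memory rating dictionary. The data layout (`ratings[user][item]`), the function names and the toy values are illustrative assumptions, not part of the original system. Sums over U are taken over the users who actually rated the item. The Pearson variant of formulas (2) and (4) would follow the same pattern with mean-centered ratings.

```python
from math import sqrt

# Toy ratings: ratings[user][item] = rating (hypothetical values).
ratings = {
    "u1": {"a": 5.0, "b": 3.0, "c": 4.0},
    "u2": {"a": 4.0, "b": 2.0},
    "u3": {"b": 5.0, "c": 1.0},
}


def item_vector(ratings, item):
    """Return {user: rating} for every user who rated `item`, i.e. N(item)."""
    return {u: r[item] for u, r in ratings.items() if item in r}


def cosine_similarity(ratings, i, j):
    """Formula (1): cosine similarity between items i and j."""
    ri, rj = item_vector(ratings, i), item_vector(ratings, j)
    common = ri.keys() & rj.keys()  # N(i) ∩ N(j)
    num = sum(ri[u] * rj[u] for u in common)
    den = sqrt(sum(v * v for v in ri.values())) * sqrt(sum(v * v for v in rj.values()))
    return num / den if den else 0.0


def predict(ratings, user, i, k):
    """Formula (3): similarity-weighted average over S(i, K) ∩ N(u)."""
    items = {j for r in ratings.values() for j in r if j != i}
    sims = {j: cosine_similarity(ratings, i, j) for j in items}
    top_k = sorted(sims, key=sims.get, reverse=True)[:k]  # S(i, K)
    rated = [j for j in top_k if j in ratings[user]]      # ... ∩ N(u)
    den = sum(sims[j] for j in rated)
    if not den:
        return None  # no usable neighbours for this user/item pair
    return sum(sims[j] * ratings[user][j] for j in rated) / den


def mae(pairs):
    """Formula (5): mean absolute error over (predicted, actual) pairs."""
    return sum(abs(p - a) for p, a in pairs) / len(pairs)


if __name__ == "__main__":
    print(predict(ratings, "u3", "a", k=2))
    print(mae([(3.0, 5.0), (4.0, 4.0)]))
```

With k=1 the prediction for a user collapses to that user's rating of the single most similar item, which is a quick sanity check on the weighting.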

Here, T is the test set, each element of which is a user's rating of an item, and |T| is the size of T, that is, the number of elements in T.

MapReduce [6,8,10], presented by Google, is a distributed computing model, and Hadoop is its most popular implementation. The MapReduce programming model is as follows:

The computation takes a set of input <key,value> pairs and produces a set of output <key,value> pairs. The user of the MapReduce framework expresses the computation as two functions, Mapper and Reducer. The types of the functions are outlined as follows:

Mapper: <k1,v1> → list<k2,v2>
Reducer: <k2,list<v2>> → list<k3,v3>

The Mapper function takes an input pair and produces a set of intermediate <key,value> pairs. The MapReduce framework groups together all intermediate values associated with the same intermediate key and passes them to the Reducer function.

The Reducer function accepts an intermediate key and the list of values for that key passed on from the Mapper function, and merges these values to form a smaller list of values.

A pair of Mapper and Reducer procedures is called a phase; multiple phases can be nested to perform the complete computing task. The data flow in a phase is illustrated in Figure 1.

[Figure 1: Input HDFS → Split 0 ... Split n → Mapper → Shuffle → Reducer → Part 0 ... Part m → Output HDFS]
Figure 1. Data flow in a phase

Input files are stored in a distributed file system, which on Hadoop is usually HDFS. A set of input files is divided into chunks called splits, and the splits are distributed to the mappers on their nodes. The MapReduce framework writes each mapper function's output locally and then aggregates the relevant records at each reducer by having the reducers remotely read the records from the mappers. This process is called the shuffle stage. The reducer function is applied to each key with its respective set of values, and the outputs of the reducers are stored in the output HDFS. The MapReduce framework parallelizes the execution of all functions and ensures fault tolerance. With these distributed and parallel computing mechanisms, big data analysis becomes practical.

III. BIG DATA BASED RETAIL RECOMMENDER SYSTEM

Today, the pressure to demonstrate Marketing Return On Investment has never been greater, and many companies are taking a more scientific approach to marketing and treating it as a true business discipline. This means applying more rigor to capturing, analyzing and manipulating customer data, and delivering narrowly defined messages designed to resonate with customers' specific wants and needs. This process is called precision marketing [16]. In e-commerce, recommender systems are exploited to estimate customers' interests, which makes them a good means of precision marketing. In traditional retailing, however, recommender systems have not been taken advantage of, because direct consumer data is lacking. Nevertheless, companies sell products through retail stores, and each geographical store faces a characteristic group of customers. If companies could predict the wants and needs of each customer group, identified by its store, for each product specification, they would benefit from precision marketing in the form of targeted product promotion at stores. We propose a retail model that offers non e-commerce enterprises a new direction for achieving precision marketing by taking advantage of recommender systems.

In the presented model, a retail store corresponds to the user of the recommendation problem, and a product corresponds to the item. Product sales at a store indicate the needs of its customer group for the product, so sales correspond to the rating in the recommendation problem. Additionally, because stores order products according to their sales, orders can be used in place of sales, as order data is easier to obtain. Because the sales scales of different stores differ greatly, there is no uniform criterion for the raw sales data, whereas ratings should follow the same standard; that is, the rating range of different users should be the same (e.g. an integer rating range of 1 to 5 in the MovieLens [17] datasets). Therefore, the most important point in the retail recommender model is to define a transform (called the Sales→Rate transform) that maps sales to a uniform rating range. After the rating prediction is computed by collaborative filtering, a Sales←Rate transform is carried out to get the sales estimation for the retailing problem. The model is illustrated in Figure 2.

The purpose of the Sales→Rate transform is to map the
sales of different sizes to a uniform rating range, or to the same rating distribution, so that the sales of different stores can be compared with each other. In this paper two kinds of transform are presented: a transform based on the relative percentage of total product sales, and a transform based on the relative percentage of maximum product sales.

[Figure 2: Retailing Problem and Recommendation Problem side by side. Store corresponds to User; Product corresponds to Item; Sales map to Rate through the Sales→Rate transform; Collaborative Filtering yields the Rate Prediction; the Sales←Rate transform yields the Sales Estimation.]
Figure 2. Illustration of retailing recommender model

Let the sales of store u for a sold product i be $s_{ui}$, and let the estimated sales of store v for an unsold product j be $\hat{s}_{vj}$. The two kinds of transform are explained as follows:

1. The relative percentage of total product sales based transform (marked as TotalTf): for store u and sold product i, take the store's percentage share of the total sales of product i:

$$ r_{ui} = \frac{s_{ui}}{\sum_{v \in U} s_{vi}} \times 100\% \tag{6} $$

Compute the rating prediction $\hat{r}_{vj}$ with the algorithm introduced in Section II, then use the inverse transform to get the sales estimation:

$$ \hat{s}_{vj} = \left( \hat{r}_{vj} \cdot \sum_{u \in U} s_{uj} \right) / 100\% \tag{7} $$

2. The relative percentage of maximum product sales based transform (marked as MaxTf): for store u and sold product i, take the store's sales as a percentage of the maximum single-store sales of product i:

$$ r_{ui} = \frac{s_{ui}}{\max_{v \in U} \{ s_{vi} \}} \times 100\% \tag{8} $$

The inverse transform to get the sales estimation is:

$$ \hat{s}_{vj} = \left( \hat{r}_{vj} \cdot \max_{u \in U} \{ s_{uj} \} \right) / 100\% \tag{9} $$

The algorithm flow for computing the sales estimate for store v and product j is as follows:

a) Select a Sales→Rate transform and calculate the ratings for all stores on their sold products;
b) Select a collaborative filtering algorithm and compute the rating prediction for v on j with the ratings calculated in a);
c) Using the inverse transform corresponding to a) and the rating prediction from b), calculate the sales estimation.

Just as MAE is used to measure the offline error of the common rating prediction problem, SMAE (Sales MAE) is defined to measure the error of the retail recommendation problem. The smaller the SMAE, the better the prediction.

$$ SMAE = \frac{\sum_{s_{ui} \in T} |\hat{s}_{ui} - s_{ui}|}{|T|} \tag{10} $$

To let the retail recommender system work with big data, we improve the item-based CF algorithm on MapReduce designed by Jiang et al. [10]. Phases for the Sales/Rate transform, the inverse transform, and similarity sorting for the kNN computation are added to the algorithm, together with a clearer description of the keys and values so as to make full use of the MapReduce framework. Taking the computation with the TotalTf rating transform, Pearson similarity, and Item-CF as an example, the entire implementation flow of our retail recommender algorithm on MapReduce is illustrated in Figure 3.

Figure 3. Diagram of retail recommender algorithm on MapReduce with TotalTf and Pearson similarity

In Figure 3, keys and values inside the angle brackets are separated by a comma, and the items inside the parentheses
represent tuples. The entire process is done by five groups of MapReduce phases.

1. In phase-1, Mapper-1 accepts the given sales data as input, represented by tuples $(u, i, s_{ui})$, extracts product i as the key, and outputs the intermediate key/value pair $\langle i, (u, s_{ui}) \rangle$. The MapReduce framework sends the values with the same key i to Reducer-1, which computes the total product sales and outputs the result $\langle i, s_i^{Total} \rangle$.

2. In phase-2, Mapper-2 accepts the output of Reducer-1 and the intermediate key/value pairs output by Mapper-1. According to formula (6), Mapper-2 calculates the Sales→Rate transform and outputs two kinds of intermediate key/value pairs: $\langle (i, j), N_{ij} \rangle$ and $\langle i, (u, r_{ui}) \rangle$. Here $N_{ij}$ is the set of stores that have sold both product i and product j. Reducer-2 receives $\langle i, (u, r_{ui}) \rangle$, calculates $\|r_i\|^2 = \sum_{u \in U} (r_{ui} - \bar{r}_i)^2$ and the mean rating of product i, denoted $\bar{r}_i$, and then outputs the results.

3. In phase-3, Mapper-3 accepts the output $\langle (i, j), N_{ij} \rangle$ of Mapper-2 and the output $\langle i, (\bar{r}_i, \|r_i\|^2) \rangle$ of Reducer-2 as inputs. After a mapping transform, Mapper-3 outputs the intermediate key/value pair $\langle (i, j), (\bar{r}_i, \bar{r}_j, \|r_i\|^2, \|r_j\|^2, N_{ij}) \rangle$, which is transmitted to Reducer-3. According to formula (2), Reducer-3 calculates the similarity $\langle (i, j), w_{ij} \rangle$ between products i and j.

4. In phase-4, Mapper-4 accepts the similarity output of Reducer-3 as input. After a conversion mapping, Mapper-4 outputs the intermediate key/value pair $\langle i, (j, w_{ij}) \rangle$ to facilitate the subsequent reduce processing. The key/value pairs are sent to Reducer-4, which sorts the pairs by similarity and outputs $\langle i, S_i^K \rangle$. Here $S_i^K = \{(j, Sim_{ij})\}$, where $Sim_{ij}$ is one of the K largest similarities of product i.

5. Finally, Mapper-5 accepts as inputs $\langle (i, j), N_{ij} \rangle$ output by Mapper-2, the similarities output by Reducer-4, and the tuples $(v, j)$ for which the sales of product j need to be estimated. By a conversion mapping, Mapper-5 outputs the intermediate key/value pairs $\langle j, (v, S_j^K) \rangle$ to Reducer-5. Reducer-5 also receives $\langle i, s_i^{Total} \rangle$ from Reducer-1 and $\langle i, (u, r_{ui}) \rangle$ from Mapper-2 as input. According to formula (4) and the inverse transform (7), Reducer-5 calculates the sales estimation and outputs the final result $\langle (v, j), (\hat{r}_{vj}, \hat{s}_{vj}) \rangle$.

IV. EXPERIMENTAL RESULTS AND ANALYSIS

The effect of a collaborative filtering algorithm differs across models and datasets. Results obtained on the MovieLens [17] datasets may differ from those on other types of datasets, or may even be completely opposite [1]. Accordingly, the retail prediction model defined in this paper needs a set of real marketing data for offline tests. These tests use different collaborative filtering algorithms and transforms and compare the computed results.

This paper uses order data from all cigarette retail stores in Naning of China in 2012 as the experimental dataset. Because of the tobacco monopoly policy, all cigarettes must be sold through registered stores and online retail transactions are forbidden; and because cigarettes are consumables, customers' needs are directly reflected in the purchase quantity. As described in Section III, the number of orders can represent the product sales, so the cigarette orders can be used as typical experimental data for the retail recommender system.

The original dataset contains 1,304,046 sales records of 277 cigarette brands from 29,079 stores. To reflect the prediction effect of data from mature retail stores, the raw dataset is filtered so that each store sells at least 20 cigarette brands (or models) and each brand is sold by at least 100 stores. The processed dataset has 23,522 stores, 166 brands and 1,087,215 sales records in total. The average sales per brand and store is 54.6, the largest single-store sales per brand is 44,816, and the data sparsity is 72.2% (that is, 1 - 1,087,215 / (23,522 × 166)).

The dataset is randomly divided into mutually disjoint training and test sets: the training set contains 80% of the data and the test set contains 20%. To prevent over-fitting, we generate five groups of training and test sets for the experiment. Each group randomly selects 20% of the data as the test set and uses the rest as the training set. The reported result is the average over the five experimental groups.

For comparison, we first consider a common simple model based on the average sales of each retail product (denoted as Avg). Its prediction is the mean of the sales over all the stores that sell the product.

We use the methods described in Section III to train the retail recommender model and calculate the SMAE value on the test set, as shown in Figure 4. Here, Item-CF is
carried out with cosine similarity and Pearson similarity on the transformed ratings.

As illustrated in Figure 4, the sales mean absolute error of Avg (the mean value prediction model) is 41.6, which shows that a simple model can also get fairly good results. The transform based on the relative percentage of maximum product sales performs even worse, with a higher initial value. When K=5, the prediction error using Pearson similarity (MaxTf-Pearson) is 34.5, and the prediction error using cosine similarity (MaxTf-Cos) is 54.2. As the neighborhood size K increases, the prediction rapidly diverges. The transform based on the relative percentage of total product sales performs very well: when K=25, the TotalTf-Pearson method obtains its minimum error of 27.5, and the TotalTf-Cos method obtains the best result of 24.8 when K=5.

Figure 4. SMAE of Retail Recommendation

Experimental results show that the retail recommender system can effectively estimate the product sales for the retail stores, provided that the Sales→Rate transform and the similarity used by collaborative filtering are properly chosen.

When using Item-CF, the transform based on the relative percentage of total product sales works well. The reason is that this transform better reflects a consistent measure of product sales, which captures not only the product popularity at different retail stores but also offers better stability (because the relative sales among retail stores change, the transform based on the relative percentage of maximum product sales soon diverges). The transform therefore yields predictions with smaller error. Overall, for the experimental datasets, the best choice is the TotalTf model with cosine-similarity Item-CF: this algorithm achieves the smallest error, and thus the best prediction accuracy, with the least amount of computation at K=5.

V. CONCLUSIONS

This paper studies the application of recommender system technology in non e-commerce enterprises. Traditional enterprises can divide retail stores into two categories: mature stores and immature stores. Mature stores are those that have been cultivated and developed by the enterprise and have carried out a variety of marketing and promotional activities. These stores have good sales performance, and their customers' real demand for the products is known; that is, the user's ratings of the items are known. Immature stores are those that have not had enough marketing and promotional activities or have no sales. We regard the real demand at these stores as unknown; that is, the user has not rated the items. Because marketing activities are costly, it is important to select the most appropriate store-product pairs for product promotion, so as to get better marketing results while saving cost. Thus, the significance of a retail recommender system is that the model can be obtained by training on mature retail stores and then used to estimate product sales at the immature retail stores. We can then select the stores that are likely to achieve better sales performance to carry out marketing activities and achieve precision marketing. Meanwhile, using the latest big data processing technology, this paper designs a retail recommender algorithm based on MapReduce, so that the retail recommender system can cope with large-scale datasets and gain better scalability.

REFERENCES
[1] Liang Xiang. "Recommendation System Practice." Posts & Telecommunications Press, 2012.6. (in Chinese)
[2] Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of collaborative filtering techniques." Advances in artificial intelligence, vol. 2009, pp. 4, 2009.
[3] Rajaraman, Anand, and Jeffrey David Ullman. "Mining of massive datasets." Cambridge University Press, 2012.
[4] Mayer-Schönberger, Viktor, and Kenneth Cukier. "Big Data: A Revolution that Will Transform how We Live, Work, and Think." Eamon Dolan/Houghton Mifflin Harcourt, 2013.
[5] Fernández, A., et al. "An Overview on the Structure and Applications for Business Intelligence and Data Mining in Cloud Computing." 7th International Conference on Knowledge Management in Organizations: Service and Cloud Computing. Springer Berlin Heidelberg, pp. 559-570, 2013.
[6] Dean, Jeffrey, and Sanjay Ghemawat. "MapReduce: simplified data processing on large clusters." Communications of the ACM, vol. 51.1, pp. 107-113, 2008.
[7] Hadoop. http://hadoop.apache.org.
[8] Jayalath, Chamikara, Julian Stephen, and Patrick Eugster. "From the Cloud to the Atmosphere: Running MapReduce across Datacenters." IEEE Transactions on Computers, vol. 99.1, pp. 1, 2013.
[9] Zhao, Zhi-Dan, and Ming-sheng Shang. "User-based collaborative-filtering recommendation algorithms on hadoop." Knowledge Discovery and Data Mining, Third International Conference on. IEEE, vol. WKDD'10, pp. 478-481, 2010.
[10] Jiang, Jing, et al. "Scaling-up item-based collaborative filtering recommendation algorithm based on hadoop." Services, IEEE World Congress on, vol. SERVICES 2011, pp. 490-497, 2011.
[11] Schafer, J. Ben, Joseph A. Konstan, and John Riedl. "E-commerce recommendation applications." Applications of Data Mining to Electronic Commerce. Springer US, pp. 115-153, 2001.
[12] Linden, Greg, Brent Smith, and Jeremy York. "Amazon.com recommendations: Item-to-item collaborative filtering." Internet Computing, vol. IEEE 7.1, pp. 76-80, 2003.
[13] Bo Xiao, and Izak Benbasat. "E-commerce product recommendation agents: use, characteristics, and impact." Mis Quarterly, vol. 31.1, pp. 137-209, 2007.
[14] Asanov, Daniar. "Algorithms and Methods in Recommender Systems." Berlin Institute of Technology, Berlin, Germany, 2011.
[15] Sarwar, Badrul, et al. "Item-based collaborative filtering recommendation algorithms." Proceedings of the 10th international conference on World Wide Web. ACM, pp. 285-295, 2001.
[16] Zabin, Jeff, and Gresh Brebach. Precision marketing: The new rules for attracting, retaining, and leveraging profitable customers. John Wiley & Sons, 2004.
[17] J. Riedl and J. Konstan. Movielens dataset. http://www.cs.umn.edu/Research/GroupLens.
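Referring back to Section III, the TotalTf and MaxTf transforms of formulas (6) through (9) and the SMAE of formula (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout (`sales[store][product]`), the function names and the toy sales figures are hypothetical and do not come from the experimental dataset.

```python
# Toy sales: sales[store][product] = units sold (hypothetical values).
sales = {
    "s1": {"p1": 120.0, "p2": 30.0},
    "s2": {"p1": 60.0, "p2": 90.0},
    "s3": {"p1": 20.0},
}


def product_totals(sales):
    """Total sales of each product over all stores (denominator of formula (6))."""
    totals = {}
    for per_store in sales.values():
        for product, s in per_store.items():
            totals[product] = totals.get(product, 0.0) + s
    return totals


def product_maxima(sales):
    """Largest single-store sales of each product (denominator of formula (8))."""
    maxima = {}
    for per_store in sales.values():
        for product, s in per_store.items():
            maxima[product] = max(maxima.get(product, 0.0), s)
    return maxima


def sales_to_rate(s, scale):
    """Formulas (6) and (8): sales as a percentage of the product's total or maximum."""
    return s / scale * 100.0


def rate_to_sales(r, scale):
    """Formulas (7) and (9): inverse transform from a predicted rating back to sales."""
    return r * scale / 100.0


def smae(pairs):
    """Formula (10): mean absolute error over (estimated sales, actual sales) pairs."""
    return sum(abs(e - a) for e, a in pairs) / len(pairs)


if __name__ == "__main__":
    totals = product_totals(sales)
    rating = sales_to_rate(sales["s1"]["p1"], totals["p1"])  # TotalTf rating
    print(rating, rate_to_sales(rating, totals["p1"]))
```

Passing the per-product total as `scale` gives TotalTf, and passing the per-product maximum gives MaxTf, so the same two helpers cover both transforms and their inverses.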
