
Case Study- Machine Learning at American Express

BY ISHITA NAGPAL 20609025
SNEHA SINGH 20609006
CHETAN KHATTAR 20609021
INTRODUCTION

American Express has a rich history of using data and analytics to create deeper
relationships with prospects and existing customers, but it is the advent of
machine learning that has allowed scientists to harness the full power of their
data. American Express's Information and Risk Management team, in
partnership with the company's technology team, embarked on a journey to
create world-class big data capabilities nearly five years ago. According to Chao
Yuan, senior vice president and head of Decision Science at American Express,
big data analytics helps American Express drive commerce, serve customers
more efficiently, and detect fraud.

The Enterprise Information Management and Digital Partner team for American
Express explains that American Express' big data ecosystem is built on Hadoop
and other leading technologies and supports the entire business at petabyte
scale, with sort benchmark results (TeraSort and MinuteSort) at 1.65 TB. The
platform is highly scalable, with a flexible architecture to meet growing
business needs.
BENEFITS
An audience of over 300 recently got a peek into this big data story thanks to a
presentation by Chao Yuan, SVP at American Express who heads their
Modeling and Decision Science Team for US Consumer Business, and by co-
presenter Ted Dunning, Chief Application Architect at MapR Technologies, at
an event organized by the Hive Data Think Tank in Palo Alto.
REQUIREMENTS

Chao presented a collection of production big data use cases in which American
Express has seen big benefits from using machine learning to improve decisions
and better leverage its data. Ted then explained what is required of a big data
platform in order to support large-scale machine learning projects such as
these in production settings.

 Data from both sides of business

American Express is used to operating at large scale. In business for 165 years,
it has continued to transform itself to keep up with changing demand. It has
gone from being primarily a shipping company, then a travel business and now
a major credit card issuer, handling over 25% of US credit card spending. And
in 2014, the company reached a milestone: one trillion dollars in transactions.
The nature of the company gives it the opportunity to see data from both the
customer and merchant side of business, in fact, from millions of sellers and
millions of buyers.


Not only is the mass of data increasing, but the data sources are also
changing, as many people now do business online or through their mobile
devices. Chao explained that part of American Express's current journey is to
keep up with these changes in interaction style as well as with the growth in
volume. Part of this involves making a huge number of decisions, millions
every day. If American Express can become a little smarter in these decisions,
that can be a huge advantage for customers and for the business. That is why
they are expanding the use of machine learning on a large scale. With access
to big data, machine learning models can make finer distinctions and thus
better understand customer behavior.

IMPLEMENTATION
1. FRAUD DETECTION
In the case of fraud detection and prevention, machine learning has been helpful
to improve American Express’s already excellent track record, including their
online business interactions. To do this, modeling methods make use of a
variety of data sources including card membership information, spending
details, and merchant information. The goal is to stop fraudulent transactions
before substantial loss is incurred while allowing normal business transactions
to proceed in a timely manner. A customer has swiped their card to make a
purchase, for instance, and expects to get approval immediately. In addition to
accurately finding fraud, the fraud detection system is required to have these
two characteristics:

 Detect suspicious events early

 Make decisions in a few milliseconds against a vast dataset

Large-scale machine learning techniques done correctly are able to meet these
criteria and offer an improvement over traditional linear regression methods,
taking the precision of predictions to a new level.
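As an illustrative sketch of why such decisions can fit inside a millisecond budget (this is not American Express's actual model; the feature names, weights, and threshold below are hypothetical), scoring a transaction with weights learned offline reduces to a handful of multiplications:

```python
import math
import time

# Hypothetical feature weights, assumed learned offline from card-member,
# spending, and merchant data (illustrative values only).
WEIGHTS = {
    "amount_vs_typical": 1.8,   # purchase amount relative to the card's norm
    "new_merchant": 0.9,        # 1.0 if the merchant is new for this card
    "geo_mismatch": 2.4,        # 1.0 if location differs from recent activity
}
BIAS = -4.0

def fraud_score(txn: dict) -> float:
    """Logistic score in [0, 1]; higher means more suspicious."""
    z = BIAS + sum(WEIGHTS[k] * txn.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def decide(txn: dict, threshold: float = 0.5) -> str:
    """Approve the swipe or route it for review before loss is incurred."""
    return "review" if fraud_score(txn) >= threshold else "approve"

# Scoring is a few multiplications and one exponential, so a decision
# comfortably fits the real-time window of a card swipe.
start = time.perf_counter()
decision = decide({"amount_vs_typical": 2.0, "new_merchant": 1.0,
                   "geo_mismatch": 1.0})
elapsed_ms = (time.perf_counter() - start) * 1000
```

In production the model and feature pipeline are of course far richer, but the shape is the same: heavy training happens offline, while the online path is a fast scoring call.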

2. NEW CUSTOMER ACQUISITION


Finding new customers is a widespread need for businesses, and American
Express is no exception. For example, when a prospective customer visits their
website, there are many products (different credit card plans) from which to
choose. Previously, around 90% of new customers came from direct mail
campaigns, but now, with the web and with the advantage of targeted marketing
through machine learning models, the amount of new customer acquisition via
online interactions has risen to 40%. This is advantageous especially because
the costs involved online are less than direct mail contact.

3. RECOMMENDATION FOR IMPROVED CUSTOMER EXPERIENCE


One of Chao's favorite uses of machine learning at American Express is a
mobile phone application that provides customized recommendations for
restaurant choices. When the customer gives permission, the application
uses recent spending histories and card member profile data across a huge
number of transactions to train a recommendation model. The model predicts
which restaurants a particular customer might enjoy and makes those
recommendations. (The technical basis for this approach was further
explained by co-presenter Ted Dunning, as described below.)

The level of success of this improved customer experience is not only of interest
to the card issuer but also to restaurant merchants who get feedback on how
good a particular offer may be.
TOOLS AND TECHNIQUES

Doing these types of large-scale machine learning successfully in production
puts certain requirements on the big data platform that supports them, which
was the main focus of Ted's presentation. Machine learning applications need
to work with large amounts of data from a wide range of sources that have been
prepared and staged in specific ways. The MapR data platform is well suited to
store, stream and facilitate search on data that is big and needs to move fast.
MapR’s real-time read/write file system, integrated NoSQL database and large
array of Hadoop ecosystem tools meet the needs of large-scale machine learning
applications. MapR’s ability to use legacy code directly, to make consistent
snapshots for data versioning and to use remote mirroring for applications
synchronized across multiple data centers are especially useful.
Although the specific design and choice of algorithms differs with different
types of machine learning, these applications do share some commonalities in
the needs that they place on the big data platform that supports them.
Scalability
The quality of recommendations, as with other machine learning applications,
depends in part on the quality and quantity of available data. Models learn from
motifs observed across a large number of historical actions, so one requirement
for the data platform is scalability of storage. This is true for different types of
use cases from fraud detection at secure websites to predictive maintenance in
large industrial settings.
Speed
Another need placed on a data platform by machine learning applications is to
handle large-scale queries fast. Take the example of detecting anomalies in the
propagation of telecom data. When special events occur, large groups of people
may suddenly put a localized burden on telecommunications, such as tens of
thousands of people in a sports arena using their phones to tweet. To
avoid having such a situation overload the communication system, it’s useful to
temporarily activate localized higher bandwidth to serve this “flash mob”. If
you can detect these anomalies quickly, you can prioritize service appropriately
including maintaining service for first responders in an emergency. This need
for speed is similar to the requirement for rapid response when validating a
credit card transaction – either way, the machine learning system must be able
to rapidly query millions of records and return a response in a second or less.
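A minimal sketch of such anomaly detection, assuming a simple rolling-baseline rule over hypothetical per-cell message counts (production systems use far more sophisticated models, but the fast-query shape is the same):

```python
from collections import deque

class RateAnomalyDetector:
    """Flag sudden localized traffic spikes (e.g. a stadium 'flash mob').

    Hypothetical sketch: compares the latest per-cell message count to a
    rolling average of recent counts and flags anything several times above it.
    """
    def __init__(self, window: int = 12, factor: float = 3.0):
        self.history = deque(maxlen=window)  # recent counts, oldest dropped
        self.factor = factor                 # spike threshold multiplier

    def observe(self, count: int) -> bool:
        """Record one reading; return True if it looks anomalous."""
        baseline = (sum(self.history) / len(self.history)
                    if self.history else None)
        self.history.append(count)
        # Anomalous if the count exceeds factor x the rolling baseline.
        return baseline is not None and count > self.factor * baseline

detector = RateAnomalyDetector()
readings = [100, 110, 95, 105, 2000]   # last reading: crowd tweeting in an arena
flags = [detector.observe(r) for r in readings]
```

Each observation is O(window) work, so the same loop scales to querying and flagging millions of cell readings well inside a one-second budget.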
Compatible non-Hadoop File Access
Hadoop is a bit of a revolution, but in order to make the best use of existing
experience as well as new ideas, it’s helpful for a data platform to seamlessly
support both Hadoop and non-Hadoop applications and code, particularly with
other machine learning systems. Ted explained that the fact that MapR has a
real-time fully read/write file system that supports NFS access makes it possible
to use legacy code along with Hadoop applications, a particular advantage in
large-scale machine learning projects.
Data Versioning
One difference between a machine learning project that is an interesting "arts
and crafts" study and one that is a serious work of engineering is the
capability for repeatable processes and, in particular, for version control.
It's a challenge to do
version control at the scale of terabytes and more of data, because it’s too
expensive in space and time to make full copies. What is needed instead are
transactionally consistent snapshots for data versioning, such as those available
with the MapR data platform. Consistent snapshots let you understand and
reference the state of training data from an exact point-in-time, compare results
for old data and new models or vice versa and ultimately see what is really
causing changes in observed behavior.
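A toy sketch of the idea, using a content fingerprint as a stand-in for a platform-level consistent snapshot (the function and field names are hypothetical; real snapshots avoid copying or serializing the data at all):

```python
import hashlib
import json

def data_version(records: list) -> str:
    """Fingerprint a training set so a model can be tied to the exact
    point-in-time data it was trained on."""
    blob = json.dumps(records, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

# Hypothetical training set and model record.
train_v1 = [{"amount": 40, "fraud": 0}, {"amount": 900, "fraud": 1}]
model_card = {"model": "fraud-model-v7", "trained_on": data_version(train_v1)}

# If observed behavior changes later, the recorded version lets you separate
# causes: same model plus same data version means the change came from
# elsewhere (new data, features, or thresholds), not from training drift.
```

The point is referential: old data can be replayed against new models (or vice versa) because the exact training state is identifiable.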

Federation Across Multiple Data Centers


With some large companies, machine learning projects need to be deployed
across multiple data centers, either to share results or more often to share a
consistent collection of data for use in independent development projects. This
requirement for the data platform is met by MapR mirroring, which creates a
consistent remote replica of data that is useful for geographic distribution as
well as to provide the basis for disaster recovery if needed.

CONCLUSION AND ANALYSIS


The model identifies recommendation indicators based on historical co-
occurrence of users and items (or actions). The beauty of this type of design
is that the computational heavy lifting, in which the learning algorithm
trains the model, can be done ahead of time, offline. Then conventional
techniques such as a search engine can be put to work to easily deploy the
system, making it able to deliver recommendations online in real time.
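A toy sketch of that offline co-occurrence step, with hypothetical card members and restaurants (real systems weight co-occurrences statistically, e.g. with log-likelihood ratio tests, and serve the resulting indicators from a search engine rather than a Python dict):

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical interaction history: which restaurants each card member visited.
history = {
    "alice": {"thai_spot", "sushi_bar", "taco_truck"},
    "bob":   {"thai_spot", "sushi_bar"},
    "carol": {"sushi_bar", "taco_truck"},
    "dave":  {"thai_spot", "pizza_place"},
}

# Offline step: count how often pairs of restaurants co-occur across members.
cooccur = defaultdict(int)
for visited in history.values():
    for a, b in combinations(sorted(visited), 2):
        cooccur[(a, b)] += 1
        cooccur[(b, a)] += 1

def recommend(user: str, top_n: int = 2) -> list:
    """Online step: rank unseen restaurants by co-occurrence with history."""
    seen = history[user]
    scores = defaultdict(int)
    for item in seen:
        for (a, b), n in cooccur.items():
            if a == item and b not in seen:
                scores[b] += n
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]
```

The expensive counting runs offline over the full transaction history; the online call is a cheap lookup-and-rank, which is what makes real-time serving practical.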
