10th Chapter 4 - Developing-Managing and Using Customer-Related Databases
Developing, managing and using customer-related databases
Chapter objectives
By the end of this chapter, you will understand:
1. the central role of customer-related databases to the successful delivery of CRM
outcomes
2. the importance of high quality data to CRM performance
3. the issues that need to be considered in developing a customer-related database
4. what data integration contributes to CRM performance
5. the purpose of a data warehouse and data mart
6. how data access can be obtained by CRM users
7. the data protection and privacy issues that concern public policy makers.
Introduction
In this chapter we discuss the importance of developing an intimate
knowledge and understanding of customers. This is essential to achieving
CRM success. Strategic CRM, which focuses on winning and keeping
profitable customers, relies on customer-related data to identify which
customers to target, win and keep. Operational CRM, which focuses on
the automation of customer-facing processes such as selling, marketing
and customer service, needs customer-related data to be able to deliver
excellent service, run successful marketing campaigns and track sales
opportunities. Analytical CRM mines customer-related data for strategic
or tactical purposes. Collaborative CRM involves the sharing of customer-related data with organizational partners, with a view to enhancing
company, partner and customer value. Customer-related databases are the
foundation for the execution of CRM strategy. Proficiency at acquiring,
enhancing, storing, distributing and using customer-related data is critical
to CRM performance.
What is a customer-related database?
You may have already noted that this chapter is not about customer
databases. Rather, it is about customer-related databases. Why?
Companies typically do not have a single customer database; instead, they
have a number of customer-related databases. Large organizations, such
as financial services companies, can have 20 or more customer systems,
each with a separate database. These databases capture customer-related
data from a number of different perspectives. Customer-related databases
Figure 4.1 Building a customer-related database
Contact data
Who is the main contact (name) and who else (other names) is involved
in buying decisions? What are their roles? Who are the decision-makers,
buyers, influencers, initiators and gatekeepers? What are the customer's
invoice addresses, delivery addresses, phone numbers, fax numbers,
e-mail addresses, street addresses and postal addresses?
Contact history
Who has communicated with the customer, when, about what, in which
medium and with what outcome?
Transactional history
What has the customer bought and when? What has been offered to the
customer, but not been purchased?
Current pipeline
What opportunities are currently in the sales pipeline? What is the
value of each opportunity? What is the probability of closing? Is there a
10 per cent, 20 per cent or 90 per cent chance of making a sale? Some
CRM applications enable sales people to allocate red, amber or green
signals to opportunities according to the probability of success.
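As a rough illustration of the pipeline questions above, the sketch below weights each opportunity's value by its probability of closing and assigns a traffic-light signal. The `Opportunity` class, the sample figures and the 30 and 70 per cent thresholds are assumptions made for illustration; they are not taken from any particular CRM application.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    name: str
    value: float        # estimated deal value
    probability: float  # estimated probability of closing, 0.0 to 1.0


def signal(opp: Opportunity, amber_from: float = 0.3, green_from: float = 0.7) -> str:
    """Map a closing probability to red, amber or green (thresholds are assumed)."""
    if opp.probability >= green_from:
        return "green"
    if opp.probability >= amber_from:
        return "amber"
    return "red"


def weighted_pipeline_value(opps: list[Opportunity]) -> float:
    """Sum of value multiplied by probability across all open opportunities."""
    return sum(o.value * o.probability for o in opps)


# Hypothetical pipeline used only to exercise the functions.
pipeline = [
    Opportunity("Fleet renewal", 50_000, 0.9),
    Opportunity("New branch fit-out", 20_000, 0.2),
    Opportunity("Service contract", 10_000, 0.5),
]

for opp in pipeline:
    print(opp.name, signal(opp))
print(round(weighted_pipeline_value(pipeline), 2))
```

A probability-weighted total of this kind gives sales management a more cautious view of the pipeline than the simple sum of opportunity values.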
Opportunities
Whereas transactional history looks backwards, opportunity looks
forwards. This is where opportunities that have not yet been opened or
discussed are recorded.
Products
What products does the customer have? When were these products
purchased, and when are they due for renewal? Have there been any
service issues related to these products in the past?
Communication preferences
What is the preferred medium of communication: mail, telephone, e-mail, face-to-face, etc.? If it is e-mail, is plain text or HTML preferred?
What is the preferred salutation? And the preferred contact time and
location? Customers may prefer you to contact them by phone for some
communications (e.g. an urgent product recall), by mail for others (e.g.
invoicing), by e-mail (e.g. for advice about special offers) and face-to-face for other reasons (e.g. news about new products). These preferences
can change over time. When a customer's preferences are used during
customer communications, it is evidence that the company is responsive
to customer expectations. Many companies allow customers to opt in to,
or out of, different forms of communication. Customers may prefer to
adjust their own preferences. Amazon.com, for example, allows customers
to opt to receive e-mail about six different types of content: terms and
conditions of shopping at Amazon; new products; research surveys;
magazine subscription renewal notices; information about and from
Amazon's partners; and special offers.
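One possible way of holding such preferences is sketched below: a default channel, optional overrides for particular kinds of message, and a set of content types the customer has opted in to. The class name, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CommunicationPreferences:
    default_channel: str = "e-mail"
    # Channel preferred for specific kinds of message, e.g. {"invoice": "mail"}.
    channel_by_purpose: dict[str, str] = field(default_factory=dict)
    # Kinds of content the customer has opted in to receive.
    opted_in: set[str] = field(default_factory=set)

    def channel_for(self, purpose: str) -> Optional[str]:
        """Return the channel to use, or None if the customer has not opted in."""
        if purpose not in self.opted_in:
            return None
        return self.channel_by_purpose.get(purpose, self.default_channel)


# Hypothetical customer: phone for recalls, mail for invoices, e-mail otherwise.
prefs = CommunicationPreferences(
    channel_by_purpose={"product recall": "phone", "invoice": "mail"},
    opted_in={"product recall", "invoice", "special offers"},
)

print(prefs.channel_for("invoice"))
print(prefs.channel_for("special offers"))
print(prefs.channel_for("research surveys"))
```

Because preferences change over time, a production record would normally also store when each preference was last confirmed; that is omitted here for brevity.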
Census data
Census data are obtained from government census records. In different
parts of the world, different information is available. Some censuses
are unreliable; others do not make much data available for non-governmental use.
In the USA, where the census is conducted every ten years, you cannot
obtain census data at the household level, but you can at a more aggregated
geodemographic level, such as zip code, census tract and block group.
Census tracts are subdivisions of counties. Block groups are subdivisions
of census tracts, the boundaries of which are generally streets. In the USA
there are about 225 000 block groups, with an average of over 1000 persons
per group. Census data available at geodemographic level includes:
median income
average household size
average home value
average monthly mortgage
percentage ethnic breakdown
marital status
percentage college educated.
For the UK census there are 155 000 enumeration districts, each
comprising about 150 households and ten postcodes. The enumeration
district is the basis for much geodemographic data.
Individual-level data are better predictors of behaviour than aggregated
geodemographic data. However, in the absence of individual-level data,
census data may be the only option for enhancing your internal data. For
example, a car reseller could use census data about median income and
average household size to predict who might be prospects for a purchase
promotion.
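The car reseller example can be sketched as a simple lookup that appends aggregated census figures to each customer record and then applies a targeting rule. The zip codes are placeholders, the census values are made up, and the income and household-size thresholds are assumptions.

```python
# Hypothetical aggregated census figures keyed by zip code (values are made up).
census_by_zip = {
    "10001": {"median_income": 85_000, "avg_household_size": 2.1},
    "60629": {"median_income": 48_000, "avg_household_size": 3.6},
}

customers = [
    {"customer_id": 1, "name": "A. Jones", "zip": "10001"},
    {"customer_id": 2, "name": "B. Patel", "zip": "60629"},
    {"customer_id": 3, "name": "C. Wong", "zip": "99999"},  # no census match
]


def enrich(customer, census):
    """Return a copy of the customer record with census fields appended."""
    extra = census.get(customer["zip"], {})
    return {**customer, **extra}


def is_prospect(record, min_income=60_000, min_household=2.0):
    """Assumed targeting rule; records without census data are never selected."""
    return (
        record.get("median_income", 0) >= min_income
        and record.get("avg_household_size", 0) >= min_household
    )


enriched = [enrich(c, census_by_zip) for c in customers]
prospects = [r["customer_id"] for r in enriched if is_prospect(r)]
print(prospects)
```

Note that every customer in the same zip code receives the same appended values, which is why aggregated data of this kind predicts behaviour less well than individual-level data.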
Modelled data
Modelled data are generated by third parties from data that they
assemble from a variety of sources. You buy processed, rather than
raw, data from these sources. Often they have performed clustering
routines on the data. For example, Claritas has developed a customer
classification scheme called PRIZM. In Great Britain, PRIZM describes
the lifestyles of people living in a particular postcode. Every postcode is
assigned to one of 72 different clusters on the basis of their responses to
a variety of lifestyle and demographic questions. Eighty per cent of the
data used in the clustering process is less than three years old.
Figure 4.2 provides the PRIZM profile of residents of one postcode
in the London suburb of Twickenham. They are assigned to PRIZM
code A101, which applies to about one-third of one per cent of
households in the country. The figure profiles their occupational status,
living accommodation, car ownership, vacation choices and media
consumption.
Figure 4.2 PRIZM analysis of TW9 1UU, England
Young professionals
Rented accommodation
Above average car ownership
Take foreign holidays
Read the quality press
Assigned to PRIZM code A101
Lifestyle: A (A-D)
Income quintile: 1 (1-5)
Cluster type: 1 (1-72)
0.34% of GB households
Income rank: 5 (1-72)
Age rank: 28 (1-72)
If you want to use external data to enhance your internal data, you'll
need to send a copy of the data that you want to enhance to the external
data source. The source will match its files to yours using an algorithm
that recognizes equivalence between the files (often using names and
addresses). The source then attaches the relevant data to your files and
returns them to you.
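A minimal sketch of this matching step is shown below, assuming a deliberately crude rule: two records are treated as equivalent when their names and addresses are identical after lower-casing and removing punctuation. Commercial matching algorithms are considerably more sophisticated; the function names and sample records here are hypothetical.

```python
import re


def match_key(name: str, address: str) -> str:
    """Build a crude matching key: lower-case, strip punctuation, collapse spaces."""
    text = f"{name} {address}".lower()
    text = re.sub(r"[^a-z0-9 ]", "", text)
    return " ".join(text.split())


def append_external(internal, external, fields):
    """Attach selected external fields to internal records whose keys match."""
    lookup = {match_key(r["name"], r["address"]): r for r in external}
    result = []
    for rec in internal:
        hit = lookup.get(match_key(rec["name"], rec["address"]))
        extra = {f: hit[f] for f in fields} if hit else {}
        result.append({**rec, **extra})
    return result


# Hypothetical internal file and external source file.
internal = [
    {"id": 1, "name": "Mary O'Neil", "address": "12 High St., Leeds"},
    {"id": 2, "name": "John Smith", "address": "4 Mill Lane, York"},
]
external = [
    {"name": "MARY ONEIL", "address": "12 High St Leeds", "lifestyle_code": "A101"},
]

enhanced = append_external(internal, external, ["lifestyle_code"])
print(enhanced)
```

Records with no match are returned unchanged, so the enhanced file always contains the same customers as the file that was sent.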
Relational databases
Relational databases are now the standard architecture for CRM
applications (see Figure 4.3). Relational databases store data in
two-dimensional tables composed of rows and columns. Relational databases
have one or more fields that provide a unique form of identification for
each record. This is called the primary key. For sales databases, each
customer is generally assigned a unique number which appears in the
first column. Therefore, each row has a unique number. Companies also
have other databases for marketing, service, inventory, payments and
so on. The customer's unique identifying number enables linkages to be
made between the various databases.
Let's imagine you are a customer of an online retailer. You buy
a book and supply the retailer with your name, address, preferred
delivery choice and credit-card details. A record is created for you on
the Customer database, with a unique identifying number. An Orders
received database records your purchase and preferred delivery choice.
An Inventory database records that there has been a reduction in
the stock of the item you ordered. This may trigger a re-ordering
process when inventory reaches a critical level. A Payment database
records your payment by credit-card. There will be one-to-many
linkages between your customer record and these other databases.
With the advent of enterprise suites from vendors such as Oracle
and SAP, all of these databases may reside in the one system and be
pre-integrated. The choice of hardware platform is influenced by several
conditions:
1. The size of the databases. Even standard desktop PCs are capable
of storing huge amounts of customer data. However, they are not
designed for this data to be shared easily between several users.
2. Existing technology. Most companies will already have technology
that lends itself to database applications.
3. The number and location of users. Many CRM applications are quite
simple, but in an increasingly global marketplace the hardware may
need very careful specification and periodic review. For example, the
hardware might need to enable a geographically dispersed, multilingual,
user group to access data for both analytical and operational purposes.
Figure 4.3 Relational database model (source: reference 4)
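The online retailer example described earlier can be sketched with Python's built-in `sqlite3` module. The table and column names, the sample customer and the prices are assumptions made for illustration; the point is only to show a primary key on the customer table and one-to-many links from it to orders and payments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the links between tables
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,  -- the unique identifying number
    name        TEXT NOT NULL,
    address     TEXT
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    item        TEXT NOT NULL,
    delivery    TEXT
);
CREATE TABLE payment (
    payment_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL
);
CREATE TABLE inventory (
    item  TEXT PRIMARY KEY,
    stock INTEGER NOT NULL
);
INSERT INTO inventory VALUES ('CRM textbook', 10);
INSERT INTO customer VALUES (1, 'A. Reader', '1 Book Lane');
""")


def place_order(customer_id, item, delivery, amount):
    """Record one purchase across the orders, payment and inventory tables."""
    conn.execute(
        "INSERT INTO orders (customer_id, item, delivery) VALUES (?, ?, ?)",
        (customer_id, item, delivery),
    )
    conn.execute(
        "INSERT INTO payment (customer_id, amount) VALUES (?, ?)",
        (customer_id, amount),
    )
    conn.execute("UPDATE inventory SET stock = stock - 1 WHERE item = ?", (item,))
    conn.commit()


place_order(1, "CRM textbook", "express", 39.99)
place_order(1, "CRM textbook", "standard", 39.99)

# One customer row is now linked to many order rows.
order_count = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1"
).fetchone()[0]
stock = conn.execute(
    "SELECT stock FROM inventory WHERE item = 'CRM textbook'"
).fetchone()[0]
print(order_count, stock)
```

A re-ordering trigger of the kind mentioned in the text could be added by checking `stock` against a critical level after each update.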
range validation: does an entry lie outside the possible range for a field?
missing values: the computer can check for values that are missing in
any column.
check against external sources: you could check postcodes against an
authoritative external listing from the mail authorities.
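The three checks listed above can be sketched as a single validation function. The required fields, the plausible age range of 18 to 120 and the small set standing in for an authoritative postcode listing are all assumptions.

```python
def validate(record, valid_postcodes):
    """Return a list of problems found in one customer record."""
    problems = []

    # Missing values: required fields that are absent or empty.
    for field_name in ("name", "postcode", "age"):
        if record.get(field_name) in (None, ""):
            problems.append(f"missing {field_name}")

    # Range validation: age must lie within a plausible range (assumed 18 to 120).
    age = record.get("age")
    if isinstance(age, int) and not 18 <= age <= 120:
        problems.append("age out of range")

    # Check against an external source: here, a set standing in for an
    # authoritative postcode listing.
    postcode = record.get("postcode")
    if postcode and postcode not in valid_postcodes:
        problems.append("unknown postcode")

    return problems


# Stand-in for an external postcode file.
valid_postcodes = {"TW9 1UU", "LS1 4AP"}

good = {"name": "J. Smith", "postcode": "TW9 1UU", "age": 34}
bad = {"name": "", "postcode": "ZZ1 1ZZ", "age": 240}

print(validate(good, valid_postcodes))
print(validate(bad, valid_postcodes))
```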
It does not take long for databases to degrade. Companies can maintain
data integrity in a number of ways.
1. Ensure that data from all new transactions, campaigns and
communications is inserted into the database immediately. You will
need to develop rules and ensure that they are applied.
2. Regularly de-duplicate databases.
3. Audit a subset of the files every year. Measure the amount of
degradation. Identify the source of degradation: is it a particular data
source or field?
4. Purge customers who have been inactive for a certain period of time.
For frequently bought products, the dormant time period might be six
months or less. For products with a longer repeat purchase cycle, the
period will be longer. It is not always clear what a suitable dormancy
period is. Some credit-card users, for example, may have different
cards in different currencies. Inactivity for a year only indicates that
the owner has not travelled to a country in the previous year. The
owner may make several trips in the coming year.
5. Drip-feed the database. Every time there is a customer contact there is
an opportunity to add new or verify existing data.
6. Get customers to update their own records. When Amazon customers
buy online, they need to confirm or update invoice and delivery details.
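Two of the maintenance steps above, de-duplication and purging dormant customers, can be sketched as follows. Treating the e-mail address as the duplicate key and using a 180-day dormancy period are assumptions; as the text notes, a suitable dormancy period depends on the product.

```python
from datetime import date, timedelta


def deduplicate(records):
    """Keep the most recently active record for each normalized e-mail address."""
    best = {}
    for rec in records:
        key = rec["email"].strip().lower()
        if key not in best or rec["last_active"] > best[key]["last_active"]:
            best[key] = rec
    return list(best.values())


def split_dormant(records, today, dormancy_days):
    """Separate active customers from those inactive for longer than the period."""
    cutoff = today - timedelta(days=dormancy_days)
    active = [r for r in records if r["last_active"] >= cutoff]
    dormant = [r for r in records if r["last_active"] < cutoff]
    return active, dormant


# Hypothetical records; 1 and 2 are the same person entered twice.
records = [
    {"id": 1, "email": "Ann@Example.com", "last_active": date(2024, 1, 10)},
    {"id": 2, "email": "ann@example.com ", "last_active": date(2024, 5, 2)},
    {"id": 3, "email": "bob@example.com", "last_active": date(2023, 3, 15)},
]

unique = deduplicate(records)
active, dormant = split_dormant(unique, today=date(2024, 6, 1), dormancy_days=180)
print([r["id"] for r in unique], [r["id"] for r in dormant])
```

Returning the dormant records rather than deleting them leaves room for a manual review, which matters in cases such as the multi-currency credit-card user described above.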
Data integration
As noted earlier, in most companies there are several customer-related
databases, maintained by different functions or channels. There might
also be customer data in product or production databases, as well as
call centres and websites, as suggested in Figure 4.5. External data from
suppliers, business partners, franchisees and others may also need to be
integrated.
Failure to integrate databases may lead to inefficiency, duplication and
damaged customer relationships. Poor integration is indicated when you
have bought an item online, only to be offered the same item at a later
time through a different channel of the same company.
Figure 4.5 A single view of the customer (sources: retail store, party plan, catalogue store, web-site, home shopping and external data, feeding an integrated customer database that supports data analysis and mining, and CRM strategy development and implementation)
Customer data integration relies on standardization of data across
databases. An indicator of the magnitude of the problem is that when
Dun & Bradstreet was integrating data from several sources to create
a marketing database it found 113 different entries for AT&T alone.
These included ATT, A.T.T., AT and T and so on.
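To illustrate the kind of standardization involved, the sketch below reduces the AT&T variants quoted above to a single comparison key. The normalization rules are illustrative assumptions, not the method Dun & Bradstreet used, and rules this aggressive could also merge names that belong to different companies.

```python
import re


def normalize_company(name: str) -> str:
    """Reduce a company name to a comparison key (illustrative rules only)."""
    key = name.upper()
    key = re.sub(r"\bAND\b|&", " ", key)   # treat "and" and "&" alike
    key = re.sub(r"[^A-Z0-9]", "", key)    # drop spaces, dots and other punctuation
    return key


variants = ["AT&T", "ATT", "A.T.T.", "AT and T"]
keys = {normalize_company(v) for v in variants}
print(keys)
```

All four variants collapse to one key, so the corresponding records can be grouped for review before being merged.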
Companies often face the challenge of integrating data from several
sources into a coherent single view of the customer. Sometimes this
becomes a significant challenge in a CRM project, and a necessary hurdle to
cross before implementing marketing, sales or service CRM applications.
The major on-premise CRM vendors, such as Oracle and SAP,
offer solutions to this problem. SAP, for example, offers Master Data
Management as part of its NetWeaver business integration platform.
This enables companies to capture and consolidate data from different
sources into a centralized database.
For companies with older mainframe (legacy) systems, another
solution to the problem of database integration is to convert to newer
systems with a centralized database that can accept real time inputs from
a number of channels.7 However, where there is considerable investment
in legacy systems and a huge number of records this may not be cost
effective. Legacy systems are typically batch-processing systems. In
other words, they do not accept real time data. Many technology firms
have developed software and systems to allow companies to integrate
databases held on different legacy systems. Sometimes middleware
has to be written to integrate data from diverse sources. Middleware is
a class of software that connects different parts of a system that would
not otherwise be able to communicate with each other. Middleware acts as
a broker of information between systems, receiving information from
source systems, and passing it to destination systems in a format that
can be understood. It is often referred to as a kind of glue that holds a
network together.
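The broker role can be sketched in a few lines: a record arrives from a legacy system in a fixed-width batch format and is passed on as JSON for a destination system. The record layout, field names and choice of JSON are all assumptions made for illustration.

```python
import json

# Assumed fixed-width layout of a legacy batch record: (field, start, end).
LEGACY_LAYOUT = [("customer_id", 0, 6), ("name", 6, 26), ("postcode", 26, 34)]


def parse_legacy(line: str) -> dict:
    """Slice one fixed-width record into named fields."""
    return {name: line[start:end].strip() for name, start, end in LEGACY_LAYOUT}


def to_destination(record: dict) -> str:
    """Re-express the record in the format the destination expects (JSON here)."""
    return json.dumps(
        {
            "id": int(record["customer_id"]),
            "name": record["name"],
            "postcode": record["postcode"],
        }
    )


def broker(lines):
    """Receive records from the source system and pass them on in the new format."""
    return [to_destination(parse_legacy(line)) for line in lines if line.strip()]


# One hypothetical legacy record: 6-character id, 20-character name, 8-character postcode.
sample = f"{'000042'}{'Jane Doe':<20}{'TW9 1UU':<8}"
messages = broker([sample, ""])
print(messages[0])
```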
Case 4.1
Data integration at the American Heart Association
The American Heart Association (AHA) is a not-for-profit US health organization dedicated to
reducing disability and death from heart attack, stroke and related cardiovascular disorders.
One of the AHA's major goals has been improving its relationships with stakeholders,
including many thousands of volunteers conducting unpaid work for the organization,
donors, businesses and the media. However, a challenge facing the AHA in achieving this
goal was integrating the organization's data, which was previously located in over 150
separate databases, often geographically isolated and specific to certain departments within
the organization. These provided a fragmented view of customers' profiles and history of
activities.
AHA chose to implement a CRM software system across the organization to integrate all
existing databases. Since implementation the AHA has found that its staff are far more productive
and that it is able to respond to customers more quickly and provide more personalized service.
Donations have increased by over 20 per cent when the system is used to contact
potential donors, compared with previous activities.
Data warehousing
As companies have grown larger they have become separated both
geographically and culturally from the markets and customers they
serve. Disney, an American corporation, has operations in Europe, Asia
and Australasia, as well as in the USA. Benetton, the Italian fashion
brand, has operations across five continents. In retailing alone it operates
over 7000 stores and concessions. Companies such as these generate
a huge volume of data that needs to be converted into information that
can be used for both operational and analytical purposes.
The data warehouse is a solution to that problem. Data warehouses
are really no more than repositories of large amounts of operational,
historical and other customer-related data. Data volume can reach
terabyte levels, i.e. 2^40 bytes of data. A warehouse is a repository for
data imported from other databases. Attached to the front end of the
warehouse is a set of analytical procedures for making sense out of the
data. Retailers, home shopping companies and banks have been early
adopters of data warehouses.
Watson describes a data warehouse as follows:8
Data standardization
Personal data: m/f, M/F, male/female
Units of measurement: metric/imperial
Field names: sales value, Sale$, $val
Dates: mm/dd/yy, dd/mm/yyyy, yyyy-mm-dd
Data cleaning
De-duplication
Updating and purging
Identify misuse of data entry fields e.g. use of phone field to record e-mail address
Figure 4.6 Data transformation
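A few of the standardization steps listed in Figure 4.6 can be sketched as small conversion functions. The target conventions chosen here (M/F coding, ISO yyyy-mm-dd dates, metric units) are assumptions; what matters is that every source is converted to the same one.

```python
from datetime import datetime

GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def standardize_gender(value: str) -> str:
    """Map m/f, M/F and male/female onto a single M/F coding."""
    return GENDER_CODES[value.strip().lower()]


def standardize_date(value: str, source_format: str) -> str:
    """Convert a date from a known source format to ISO yyyy-mm-dd."""
    return datetime.strptime(value, source_format).date().isoformat()


def inches_to_cm(value: float) -> float:
    """Convert an imperial length to metric."""
    return round(value * 2.54, 2)


print(standardize_gender(" Male "))
print(standardize_date("03/04/07", "%m/%d/%y"))    # US-style source
print(standardize_date("04/03/2007", "%d/%m/%Y"))  # UK-style source
print(inches_to_cm(10))
```

The date format is passed in explicitly because a value such as 03/04/07 cannot be interpreted reliably without knowing which source system it came from.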
After transformation, the data then needs to be uploaded into the
warehouse. Archival data that have little relevance to today's operations
may be set aside, or only uploaded if there is sufficient space. Recent
operational and transactional data from the various functions, channels
and touchpoints will most probably be prioritized for uploading.
Refreshing the data in the warehouse is important. This may be done
on a daily or weekly basis depending upon the speed of change in the
business and its environment.
Data marts
A data mart is a scaled down version, or subset, of the data warehouse,
customized for use in a particular business function or department.
Case 4.2
Data warehousing at Owens & Minor Inc.
Owens & Minor Inc., a Fortune 500 company headquartered in Richmond, VA, is the USA's
leading distributor of national name brand medical/surgical supplies. The company's data
warehouse project was first implemented in April 1997, starting with a single subject area:
sales. Today, the data warehouse environment has grown to integrate over 20 different subject
areas with over ten years of history. The size of the warehouse is just under 2 terabytes of
total space. Internally there are over 900 users out of a total employee base of 3000, which
is a very high percentage of business intelligence users. Externally Owens & Minor has four
different extranet user groups that total around 600 users.
Source: The Data Warehousing Institute10
Standard reports
Standard reports are automatically generated periodically by the CRM
system. Examples include monthly reports to sales management about
sales representatives' activity and performance against quota, and daily
reports of call centre activity. OLAP technologies allow users to drill
down into the data on a screen rather than resorting to a flat, fixed-format report. Starting with aggregated sales data for a region, a sales
Database queries
A number of different types of query languages are available to CRM users
when they want to raise a database query. Some are graphical: users can
click and drag the data they want, and then drill down until they reach
the level of granularity they require. Database managers may prefer to use
SQL, which is now the standard query language for relational databases.
SQL queries employing standard commands, such as SELECT, INSERT,
DELETE, UPDATE, CREATE, DROP, can be used to access required data.
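The six commands named above can be seen together in the short sketch below, which runs them against an in-memory SQLite database through Python's built-in `sqlite3` module. The table, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a table.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
)

# INSERT: add rows.
conn.executemany(
    "INSERT INTO customer (customer_id, name, region) VALUES (?, ?, ?)",
    [(1, "Acme Ltd", "North"), (2, "Bolt plc", "South"), (3, "Crate Co", "North")],
)

# UPDATE: change existing data.
conn.execute("UPDATE customer SET region = 'East' WHERE customer_id = 2")

# DELETE: remove rows.
conn.execute("DELETE FROM customer WHERE customer_id = 3")

# SELECT: retrieve the data required.
rows = conn.execute(
    "SELECT customer_id, name, region FROM customer ORDER BY customer_id"
).fetchall()
print(rows)

# DROP: remove the table altogether.
conn.execute("DROP TABLE customer")
```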
Data mining
In the CRM context, data mining can be defined as follows:
Data mining is the application of descriptive and predictive analytics
to support the marketing, sales and service functions.12
Although data mining can be performed on operational databases, it
is more commonly applied to the more stable datasets held in data marts
or warehouses. Higher processing speeds, reduced storage costs and
better software packages have made data mining more attractive and
economical.
Data mining can provide answers to questions that are important for
both strategic and operational CRM purposes. For example:
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
Case 4.3
Data mining at Marks & Spencer
Data mining has proven to be a successful strategy for the UK retailer Marks & Spencer
(M&S). The company generates large volumes of data from the ten million customers per
week it serves in over 300 stores. The organization claims data mining lets it build one-to-one
relationships with every customer, to the point that whenever individual customers come into
a store the retailer knows exactly what products it should offer in order to build profitability.
Marks & Spencer believes two factors are important in data mining. First is the quality
of the data. This is higher when the identity of customers is known, usually as a result of
e-commerce tracking or loyalty programme membership. Second is to have clear business
goals in mind before starting data mining. For example, M&S uses data mining to identify
high margin, average margin or low margin customer groups. The company then
profiles high margin customers. This is used to guide customer retention activities with
appropriate targeted advertising and promotions. This technique can also be used to profile
average margin or low margin customers who have the potential to be developed into
high margin customers.
Sequential patterns often emerge from data mining. Data miners look
for 'if ... then' rules in customer behaviour. For example, they might find
a rule such as 'If a customer buys walking shoes in November, then there
is a 40 per cent probability that they will buy rainwear within the next
six months', or 'If a customer calls a contact centre to request information
about interest rates, then there is a 50 per cent probability the customer
will churn in the next three months'. Rules such as these enable CRM users
to implement timely tactics. In the first instance, there is an opportunity for
cross-selling. In the second, there may be an opportunity to save the customer.
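The probability attached to a rule of this kind can be estimated from purchase history as the share of customers meeting the 'if' condition who go on to meet the 'then' condition. The sketch below does this for the walking shoes example, using made-up purchase records and approximating six months as 183 days.

```python
from datetime import date, timedelta

# Hypothetical purchase history: (customer id, item, purchase date).
purchases = [
    (1, "walking shoes", date(2023, 11, 5)),
    (1, "rainwear", date(2024, 2, 10)),
    (2, "walking shoes", date(2023, 11, 20)),
    (3, "walking shoes", date(2023, 11, 28)),
    (3, "rainwear", date(2024, 9, 1)),   # outside the window, so not counted
    (4, "rainwear", date(2023, 12, 1)),  # never bought walking shoes
    (5, "walking shoes", date(2023, 11, 2)),
    (5, "rainwear", date(2023, 12, 15)),
]


def rule_confidence(purchases, if_item, if_month, then_item, window_days=183):
    """Share of customers meeting the 'if' condition who also meet the 'then'."""
    triggers = {}
    for cust, item, when in purchases:
        if item == if_item and when.month == if_month:
            # Keep the earliest qualifying purchase per customer.
            triggers[cust] = min(when, triggers.get(cust, when))
    if not triggers:
        return 0.0
    followers = set()
    for cust, item, when in purchases:
        start = triggers.get(cust)
        if start and item == then_item and start < when <= start + timedelta(days=window_days):
            followers.add(cust)
    return len(followers) / len(triggers)


confidence = rule_confidence(purchases, "walking shoes", 11, "rainwear")
print(confidence)
```

In this toy dataset two of the four November walking-shoe buyers bought rainwear within the window, giving an estimate of 50 per cent.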
Data mining also works by classifying. Customers can be classified
into mutually exclusive groups. For example, you might be able to
segment your existing customers into groups according to the value
they produce for your company. You can then profile each group. When
you identify a potential new customer you can judge which group the
prospect most resembles. That will give you an idea of the prospect's
potential value.
You could also classify customers into quintiles or deciles in terms of
important transactional information such as the recency, frequency and
monetary value of the purchases they have made. This is called RFM
Figure 4.7 A recency-frequency-monetary value matrix (axes: recency, frequency and monetary value, each scored from 1 to 5; cells carry three-digit codes such as 111, 321 and 551)
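A minimal sketch of quintile scoring on recency, frequency and monetary value is given below. Each customer receives a score from 1 to 5 on each measure, and the three scores are joined into a three-digit code of the kind shown in Figure 4.7. The customer data are made up, the digit order (recency, frequency, monetary value) is an assumption, and ties are broken arbitrarily by sort order.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value 1 to 5 by rank, 5 being the best fifth."""
    order = sorted(
        range(len(values)), key=lambda i: values[i], reverse=not higher_is_better
    )
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = 1 + (rank * 5) // len(values)
    return scores


# Hypothetical customers: days since last purchase, number of orders, total spend.
customers = {
    "C1": {"days_since": 5, "orders": 12, "spend": 900.0},
    "C2": {"days_since": 40, "orders": 3, "spend": 150.0},
    "C3": {"days_since": 200, "orders": 1, "spend": 40.0},
    "C4": {"days_since": 15, "orders": 8, "spend": 1200.0},
    "C5": {"days_since": 90, "orders": 5, "spend": 300.0},
}

ids = list(customers)
# Fewer days since the last purchase is better, so the ranking is reversed.
r = quintile_scores([customers[i]["days_since"] for i in ids], higher_is_better=False)
f = quintile_scores([customers[i]["orders"] for i in ids])
m = quintile_scores([customers[i]["spend"] for i in ids])

rfm = {cid: f"{r[k]}{f[k]}{m[k]}" for k, cid in enumerate(ids)}
print(rfm)
```

Customers coded 555 are the most recent, most frequent and highest-spending buyers; those coded 111 are the least.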
Privacy issues
Privacy and data protection are major concerns to legislators around
the world. Customers are increasingly concerned about the amount of
information commercial organizations have about them, and the uses to
which that information is put. In fact, consumers are not aware of just
how much information is available to companies. When you use the
Internet, small text files called cookies are stored on your
hard disk by the sites you visit. A very small number of websites obtain
permission from their site visitors prior to the download; most do not.
There have been two major responses to the privacy concerns of
customers. The first is self-regulation by companies and associations.
For example, a number of companies publish their privacy policies and
make a commercial virtue out of their transparency. Professional bodies
in fields such as direct marketing, advertising and market research have
adopted codes of practice that members must abide by.
The second response has been legislation. In 1980, the Organization
for Economic Cooperation and Development (OECD) developed a set
of principles that has served as the foundation for personal data protection
legislation around the world.16 These principles are voluntary guidelines
that member nations can use when framing laws to protect individuals
against abuses by data gatherers. The principles are as follows:
Only collect and process data for legitimate and explicit purposes.
Only collect personal data when individual consent has been granted,
or is required to enter into or fulfil a contract, or is required by law.
Ensure the data is accurate and up to date.
At the point of data collection, to advise the individual of the identity
of the collector, the reason for data collection, the recipients of the
The USA has not adopted these legislative standards, but in order to
enable US companies to do business with EU organizations, the US
Commerce Department has devised a set of Safe Harbor principles.
US organizations in the Safe Harbor are assumed to adhere to seven
principles regarding notice (as in notification, above), choice, onward
transfer (disclosure to third parties), security, data integrity, access and
enforcement (accountability). US companies obtain Safe Harbor refuge
by voluntarily certifying that they adhere to these principles. This
enables data transfers to be made to the USA. Two areas of difference
between the EU Directive and these Safe Harbor principles are in access
and enforcement. The Safe Harbor wording for access is weaker. The
Safe Harbor principle states that individuals must have reasonable access
to personal information about them that an organization holds, and to
be able to correct or amend the information where it is inaccurate. The
enforcement principle is unclear about sanctions should a company
breach the standard and it allows no possibility of enforcement by
government agencies.
In the USA, there is a tendency to rely on self-regulation by individual
or associated companies, rather than legislation at state or federal
level. For example, the World-Wide Web Consortium (W3C) has
developed a Platform for Privacy Preferences (P3P) standard for
improving privacy protection in e-commerce. This comprises three major
elements:
1. A personal profile: each Internet user creates a file consisting of
personal data and privacy rules for use of that data. Personal data
might include demographic, lifestyle, preference and click-stream
data. Privacy rules are the rules that the user prescribes for use of the
data, e.g. opt-in or opt-out rules, and disclosure to third parties. The
profile is stored in encrypted form on the user's hard drive, can be
updated at any time by the user and is administered by the user's
web browser.
2. A profile of website privacy practices: each website discloses what
information has been accessed from the user's personal profile and
how it has been used.
3. Automated protocols for accessing and using the user's data: these
allow either the user or the user's agent (perhaps the web browser)
automatically to ensure that the personal profile and the privacy rules
are being complied with. If compliance is assured, then users can
enter websites and transact without problems.
This is now being complemented with a more rigorous approach to
legislation. In Australia, privacy legislation has been enacted at state and
federal levels.
Summary
In this chapter you've read about the development, management and use of customer-related databases. CRM cannot deliver its promised benefits without appropriate
customer-related data. Customer-related data are used for strategic, operational,
analytical and collaborative CRM purposes. Customer-related databases need to be
constructed with a very clear idea of the applications for which the data are needed.
These applications range across the full territory of CRM strategy development and
implementation. Customer-related data can be used to answer strategic questions such
as 'Which customers should we serve?' and tactical questions such as 'What is the best
day to communicate with a given customer?'
We described a six-step approach to developing a high-quality customer-related
database, consisting of defining the database functions, establishing the information
requirements, identifying the information sources, selecting the database technology
and hardware platform, populating and maintaining the database. We saw how
compiled list data, census data and modelled data can be imported to enhance the basic
data available in company-maintained databases, most of which adopt the standard
relational architecture. Data integration from disparate databases is often a barrier to
the delivery of desired CRM outcomes. Attached to the front end of many databases are
data mining systems that allow users to make sense of the data. We ended by looking
at data warehouses, data marts and privacy issues.
References
1. Based on O'Connor, J. and Galvin, E. (2001) Marketing in the digital age (2nd edn). Harlow, England: Financial Times/Prentice Hall.
2. Drozdenko, R.G. and Drake, P.D. (2002) Optimal database marketing: strategy, development and data mining. Thousand Oaks, CA: Sage.
3. ANSI is the American National Standards Institute.
4. Courtesy of StayinFront Inc., www.stayinfront.com. Used with permission.
5. Courtesy of Intelligent Search Technology Ltd, www.intelligentsearch.com
6. Based on Watson, R.T. (1999) Data management: databases and organizations. New York: John Wiley.
7. Drozdenko, R.G. and Drake, P.D. (2002) Optimal database marketing: strategy, development and data mining. Thousand Oaks, CA: Sage.
8. Watson, R.T. (1999) Data management: databases and organizations. New York: John Wiley.
9. D'Addario, J. (2002) The application revolution. http://www.tdwi.org/Publications/display.aspx?id6460&ty. Accessed 11 September 2007.
10. The Data Warehousing Institute. http://www.tdwi.org/display.aspx?ID7145. Accessed 11 September 2007.