Data Driven Ebook

The data-driven enterprise
By Mark Schwartz,
Enterprise Strategist, AWS
Introduction
There is a lot of talk these days about the data-driven enterprise and the need to become
one. But what exactly does it take to become data-driven, and why is it so important in
today’s digital environment? What practical steps can an enterprise take to make data
fundamental to its mindset and practices? And what is the connection between data and
that other priority of the digital age—business and technical agility?

Until recently, it was common for enterprises to view data solely in the context of
transactions. They locked their data away in siloed databases that were excellent for
transaction processing but less suited to open-ended analysis. Most organizations were
operating under the mental model of the invoice or the order form: “Please give me 20
widgets at a price of $100 per widget.” Or “Please pay me for 20 widgets at $100 per
widget.” Data was performative and imperative—a stimulus or artifact of conducting a
transaction. Today, the value of data goes far beyond its transactional role.

Data-driven organizations strive to base their business decisions on the evidence provided
by data—which requires a certain rigor and, at the same time, an ability to innovate based
on identifying opportunities within the data that can lead to new products or markets.
These organizations also come to treat data as a strategic asset they can use to improve
customer experiences and increase internal efficiencies. In other words, they analyze data
to inform decision-making and use data to serve their customers. Data can be the basis, for
example, for personalization, dynamic pricing, market expansion, product innovation, or
supply chain optimization.

This eBook will address what it means to be data-driven, offering examples of how
companies are using data to drive their businesses. We’ll also connect the dots between
becoming data-driven and driving agility, digital transformation, and continuous innovation.

The business value of data
Data can be used in any number of assessments that will drive business results. For example,
if an enterprise assesses its historical transactions and, as a result, finds ways to optimize
its supply chain, thereby reducing costs, then the data has played a role in enabling that
cost reduction. Consequently, data has a business value that stems from its potential use in
increasing profits or accomplishing mission objectives.

It is easy to find instances of data being used for its non-transactional value. Johnson &
Johnson, for example, uses the transactional data it has stored in the cloud to improve
physician compliance, optimize its supply chain, and discover new medicines. Nike collects
data on customer achievements to drive its customers’ digital experiences in Nike+. Lyft
collects and stores the GPS coordinates of every ride. By analyzing this data, Lyft found that
90 percent of rides overlapped with other rides from nearby locations. This insight led to the
creation of Lyft Line, a service that allows passengers to share a car and receive discounts
of up to 50 percent.1

Because data can lead to profits, even if the profits are not yet being realized, we can think
of data as a financial asset (although not a generally accepted accounting principle [GAAP]
asset in most cases).

It is no surprise, then, that a company’s data can be a factor in its acquisition value or may
enable it to form partnerships with other business ventures. Take, for example, Microsoft’s
acquisition of LinkedIn, with its data on 433 million customers, for $26.2 billion. The
bankruptcy proceedings of Caesars Entertainment Operating Company, Inc. from 2015 to 2017
provide another example, where creditors argued that the data on the 45 million customers in
its Total Rewards customer loyalty program was worth $1 billion and was its most valuable
asset.2

It is helpful to think of data having business value as a kind of financial call option; that
is, it gives us the opportunity to make changes in the supply chain or launch a new product,
but it does not obligate us to do so. We can exercise the option or not, depending on how
valuable the data indicates that the new business will be. It is here that organizations have
had trouble understanding and realizing the value of data, as valuing a call option is
considerably more complicated than calculating the ROI of a projected stream of cash flows. As
a result, enterprises often neglect the value of data. But, as I show in my book War and Peace
and IT,3 many of the techniques of agile IT delivery result in this kind of option value.
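The call-option view of data described above can be illustrated with a small numeric sketch. It compares committing to an investment before the data is consulted with investing only when the data shows the payoff exceeds the cost. The scenario probabilities, payoffs, cost, and the names `Scenario`, `commit_now_value`, and `option_value` are all hypothetical, and this simple expected-value comparison is a teaching aid, not a full option-pricing method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One possible outcome the data might reveal (all figures hypothetical)."""
    probability: float  # chance that this scenario turns out to be true
    payoff: float       # value of the new business in this scenario


def commit_now_value(scenarios, cost):
    """Expected net value if we must invest before the data tells us anything."""
    expected_payoff = sum(s.probability * s.payoff for s in scenarios)
    return expected_payoff - cost


def option_value(scenarios, cost):
    """Expected net value if we invest only in scenarios where payoff exceeds cost."""
    return sum(s.probability * max(s.payoff - cost, 0.0) for s in scenarios)


scenarios = [
    Scenario(probability=0.25, payoff=400.0),  # the data reveals strong demand
    Scenario(probability=0.75, payoff=40.0),   # the data reveals weak demand
]
cost = 100.0

print(commit_now_value(scenarios, cost))  # 0.25*400 + 0.75*40 - 100 = 30.0
print(option_value(scenarios, cost))      # 0.25*(400 - 100) + 0.75*0 = 75.0
```

In this toy case the right to decide after seeing the data is worth more than committing up front, which is the sense in which data carries option value.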

1 See “Becoming a Nimble Giant: How Amazon DynamoDB Serves Nike at Scale,” AWS re:Invent, 2018; “AWS Cloud Databases: Modernize Your Data Infrastructure
with Fully Managed, Purpose-built Databases”; and “Lyft Case Study,” AWS Case Study, 2016
2 Short, J., Todd, S., “What’s Your Data Worth,” MIT Sloan Management Review, March 2017. A detailed analysis of the Caesars bankruptcy can be found here. The
bankruptcy was exceedingly complex, and the value of Total Rewards was included with other assets, so it is not clear what value was ultimately attached to it.
3 Schwartz, M., War and Peace and IT: Business Leadership, Technology, and Success in the Digital Age, IT Revolution Press, 2019
Data and agility
Value is created not just by data but also by tools and processes designed to unify data,
analyze it, and produce those business outcomes. In today’s digital world, fraught with rapid
change, uncertainty, and complexity—disruption, you might say—we need to use data to
support business agility and to respond quickly and flexibly to changing circumstances.
Agility enables organizations to turn rapid change into opportunity and to avoid disruption
by responding nimbly to competitive threats. Enterprises in the digital age have learned
that they need to get early versions of products to market quickly and evolve them through
continuous feedback from the market.4

Over the last few years, techniques for building agility into the product development process
have emerged—including, for example, Agile software development, DevOps, and Lean
software development. The cloud has been used to speed up the delivery of IT capabilities for
both software and hardware. Team-based organizational structures have made it possible to
mobilize the resources to meet changing needs. These developments have helped enterprises
increase the agility of their processes.

But agile processes are only one part of the story. The company’s data must also be unified,
agile, and easily accessible for unexpected and constantly changing uses. To unify data, you
need to enable collaboration on it while providing access to the right users at the right time
with the right controls. Employees must have the proper skills and easily available tools to
work with data. This ability to flexibly use data—i.e., to make it available for new uses that we
don’t know about in advance—is the missing link in achieving enterprise agility. And it is the
distinguishing factor that separates agile organizations from those that have merely adopted
the frameworks and trappings of agile models. Business agility requires data agility. A data-
driven enterprise is a master of both.

This focus on bringing agility to data is new. In the days when data was merely transactional,
we could lock it away in databases with structures that reflected the way the data would
be used for those transactions. The tools of the day were relational database systems, such
as Oracle or SQL Server, whose strengths are in transactional processing. Data was used to
conduct transactions and to produce operational reports that supported those transactions.

4 Ries, E., The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses,
Crown Business, 2011
To the extent that we paid attention to privacy, we enforced it by strictly limiting access to
the data, rather than searching for ways to make it available within the bounds of privacy
guardrails. Instead of “privacy by design,” we practiced “privacy by obscurity.”

There were attempts to free data for ad-hoc analysis with business intelligence (BI)
systems. But today’s tools have advanced far beyond what BI systems were meant to do:
We now have advanced capabilities supporting business intelligence, analytics services
with built-in machine learning (ML), serverless analytics, a range of purpose-built
databases to handle different types of data, algorithms for massively parallel processing
vast amounts of unstructured data like video and speech, Internet of Things (IoT) devices
that deliver streams of sensor-derived data, and…well, just vast amounts of data. With
these tools, we can now free our data from its transactional and operational context.

More importantly, we have realized that being data-driven is not just a technical challenge
but also an organizational one. To be data-driven, an organization must think differently
about how it makes business decisions and how it interacts with customers. It is a
commitment to the value of data, a kind of organizational humility that says “The data
knows better than we do.”

There are really two questions:
1 How can we bring agility to our data?
2 How can we use data to bring agility to our business?

How can we make our data available to be used in unexpected ways; that is, how can we
flexibly use it to give our departments and teams business agility? How can we apply it
to bring rigor and creativity to business decision-making? How can we change business
culture to take advantage of this new flexibility?

And how can we put appropriate control guardrails around the data to safeguard its
privacy—while at the same time allowing it to be used flexibly and quickly?

Agility for data
How can we bring agility to our data?

To achieve business agility, we’ll need to be poised to respond to unexpected changes in the
business and competitive environments, and we’ll need to create innovations that are truly
novel. That means we will need to be able to put our data to work in ways that we don’t
necessarily anticipate when we collect it.

Our challenges:

• Our data is probably locked away in transactional, relational databases and siloed in ways
that make it inaccessible to different parts of our organization.
• We may not have the right analytical tools, or they may not be available to the right
people at the right times.
• Our models for security and privacy are ad hoc, as we perhaps never contemplated using the
data for exploration. Most likely, we are fostering privacy simply by making the data as
inaccessible as possible.

Our goals:

• Maximize data availability, subject to guardrails for privacy and confidentiality.
• Foster transparency across the enterprise by breaking down data and organizational silos.
• Offer employees the appropriate tools to explore the data in unplanned ways and to take
advantage of the latest advances in databases, analytics, and machine learning.
• Gain the expertise to interpret the data, both rigorously and creatively.

In “Analytics Without Limits: FINRA’s Scalable and Secure Big Data Architecture – Part 1,”
John Brady, the CISO of the Financial Industry Regulatory Authority (FINRA), frames these
objectives elegantly by saying that he wants to “Lower the cost of curiosity.” He refers to
cost in its widest sense, including the time it takes to draw inferences from the data and the
risk in making it available. FINRA’s business is to explore the 37 billion or more
transactions that take place in the financial markets every day, looking for patterns of
fraud. Since it doesn’t always know in advance what a pattern of fraud looks like, FINRA must
rely on the expertise of its analysts to spot suspicious behavior. FINRA’s task is all about
curiosity: The organization wants its analysts to examine data with inquisitiveness as to what
patterns appear and why. The task of its IT organization is to reduce the cost of that
curiosity and the effort that an analyst has to exert to explore a hunch.

Brady’s idea applies across organizations and roles. Can a marketer easily explore data to
find unexpected patterns in consumer purchasing activity? Can operations explore data to
identify performance optimizations or to diagnose problems in operating processes? Can finance
explore data to concoct new ways to drive performance or slice and dice data to drive
executive decision-making? Can IT leaders test their hypotheses on optimizing cloud spending
with rigor and creativity?

Curiosity drives innovation and improvement. Agile data allows employees to freely explore
ideas, hunches, hypotheses, and conjectures at the speed of thought and to promote new ideas
with the data to support them.

To make data agile, an enterprise needs to address how and what data it gets, how it preserves
that data, how and under what conditions it makes the data accessible, and what tools and
skills it has for working with that data.

1 Get the data
To use data nimbly, we must first obtain the data. And given the
unknown uses to which we will put it, we need to collect more data
than we know how to use. That, in a nutshell, is what big data is about.
Fortunately, with the cloud, the cost of storing data is low and declining.

We can therefore instrument our business processes to produce
data, lots of it, and make it available for analysis. For example, IoT
applications often include sensors that blast a stream of data points into
the cloud that the enterprise can analyze immediately or store away for
future use. Enterprises can also now work with a much wider range of
data types, such as video, text, and speech. The possibilities for using
this information in novel and interesting ways are tremendous.

GE Oil & Gas, for example, pulls an MRI-like device it calls a “pig”
through its oil pipelines to collect over 750 terabytes of information
to help spot potential problems in pipeline infrastructure. Hudl has
collected about 10 petabytes of video and other data that sports
coaches can review with players. Peloton gathers data from its exercise
cycles and analyzes it to provide insights to customers. And Airbnb
accumulates about 50 gigabytes of data each day for fast analysis in
the cloud using Amazon Elastic MapReduce (Amazon EMR)—a tool that
allows large volumes of data to be analyzed quickly in parallel.5

5 “GE Oil & Gas Case Study,” AWS Case Study, 2016; “Hudl Case Study,” AWS Case Study, 2014; “How Peloton Relies on
the Scalability of AWS,” AWS Case Study, AWS San Francisco Summit, 2018; “Airbnb on AWS,” AWS Case Study, 2016

2 Store the data
Once we acquire the data, we must store it to make it available for analysis. Traditionally,
we stored data in a structured format based on our expectations about how it would be used
transactionally. For example, we might have a field in a database for “quantity ordered” and
another field for “unit price.” We would collect the data to fill these fields and file them
away by slotting them into the appropriate blanks in the database, knowing that we could
always multiply those values to derive a total price. By forcing the data into such a mold, we
made it useful for transactions, but we might have lost information that could have been
useful for analysis. This was the relational database model.

The past few decades have been dominated by the use of these relational databases, which are
very well suited to efficient processing of old-world volumes of transactional data in ways
that are known in advance (“multiply unit price by order quantity”). Whether you are working
with non-transactional data, operating at tremendous internet scales of transactions, or
managing data that does not slot easily into pre-defined “data fields,” there are now much
better database alternatives, which are purpose-built for the cloud.

For example, Amazon Timestream is designed specifically to manage time-series data (like the
data produced over time by an industrial sensor or by tracking market activity over time).
Amazon Quantum Ledger Database (Amazon QLDB) is intended for the type of data used in
blockchain (data whose history must be verifiable, using techniques like cryptography), and
Amazon Neptune is designed for complex connections and relationships, like social networks.
Enterprises are no longer limited to what they can force-fit into a relational model. Better
still (for agility), data that will be used for yet-undetermined analysis can be stored in a
flexible repository called a data lake, where each piece of data is simply stored in the form
in which it was received.

The power of the data lake lies in the tools used to analyze it: tools that let you combine
heterogeneous data (structured and unstructured), data from different organizational silos,
and data in large quantities. Today’s tools can apply machine learning algorithms and
statistical analyses, and they work with natural language text, video, and speech.

In other words, the data lake meets the enterprise’s need for storing data before it knows all
the ways it will be used. We can pour data into the lake from different business silos and
analyze it all together. We can quickly set up a way to pour data from a newly acquired
company into the lake, thereby gaining transparency into its operations, and we can integrate
its data with our own.

The magic that makes this all possible is:
1 Low cost of storage.
2 Availability of tools that work with loosely structured, heterogeneous data.
3 Choice of services that lets you push data into the data lake at high bandwidth and
asynchronously (just send the data toward the data warehouse as you receive it, and it will
get there as quickly as it can, no need to wait, sort of like an email).
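The idea of storing each piece of data in the form in which it was received, and deciding how to read it only later, can be sketched in a few lines. This is a minimal illustration using an in-memory list as a stand-in for a data lake. The record shapes, the field names, and the helper `read_with_schema` are hypothetical, and no particular AWS service is modeled.

```python
import json

# Records arrive from different (hypothetical) sources with different shapes.
raw_events = [
    '{"source": "orders", "quantity": 20, "unit_price": 100}',
    '{"source": "sensor", "device": "pump-7", "temp_c": 81.5}',
    '{"source": "orders", "quantity": 3, "unit_price": 250, "coupon": "SPRING"}',
]

# Stand-in for the lake: keep each record exactly as received, with no schema on write.
lake = list(raw_events)


def read_with_schema(lake, source, fields):
    """Apply a schema at read time: pick one source and project the chosen fields."""
    rows = []
    for line in lake:
        record = json.loads(line)
        if record.get("source") == source:
            rows.append({field: record.get(field) for field in fields})
    return rows


# A transactional-style question, decided after the data was stored.
orders = read_with_schema(lake, "orders", ["quantity", "unit_price"])
total = sum(row["quantity"] * row["unit_price"] for row in orders)
print(total)  # 20*100 + 3*250 = 2750

# A later, unplanned question: which orders carried a coupon?
coupons = read_with_schema(lake, "orders", ["coupon"])
print(coupons)
```

Because nothing was discarded on the way in, the coupon question can be answered even though nobody planned for it when the data was collected.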

3 Make the data available
The next step in bringing agility to data is to make it available when
and where it is useful. (Note that I didn’t say when and where it is
needed. I’m talking about agility and innovation here.) The model
that is often used today is one of self-service provisioning. When an
analyst is curious, he or she can spin up a set of tools and a subset of
the data to analyze without having to request and wait for someone
else to provide it. The resulting freedom lets the analyst pursue a
train of thought, a “flow,” rather than proceeding in a stop-start way
that destroys creativity—or, you could say, that increases the cost of
curiosity. The cloud is an important enabler for this, as it allows new
work environments to be provisioned, used, and then discarded when
no longer needed. It also makes it easy to put guardrails in place to
protect privacy (more on this on the following pages).

4 Provide tools
A data-driven enterprise makes appropriate analytics tools easily and quickly available to its
employees, often through a self-provisioning model as described earlier. A wide variety of
software and services is available: If you want to perform traditionally structured queries
against the data, for example, you can set up a data warehouse based on the data in the data
lake. Or you can provision a tool that lets you do old-school SQL-type queries directly
against the data lake.

Such traditional activities barely scratch the surface of what’s possible, however. To get the
maximum value out of your data, you must make it easy for anyone to make sense of data-driven
insights. With the right business intelligence technology, your entire organization will have
access to a common view of data. That helps empower your teams to, for example, visualize
real-time data with information modeling tools, construct scenarios and ascertain their
consequences, or create, publish, and embed interactive data visualizations and dashboards.

Today’s analytics revolution is all about artificial intelligence (AI) and machine learning,
which open new possibilities for what we can do with our data: Predict outcomes, spot
anomalies, categorize data, analyze sentiment, discover patterns, guide robots, and much more.

For example, Capital One is using machine learning to detect fraud while still maintaining
high levels of customer service. T-Mobile uses machine learning to improve its customer
service by having it predict what articles will be most helpful to the customer and making
them quickly available to customer service agents. Sky News, in its coverage of Britain’s
royal wedding, used Amazon Web Services (AWS) machine learning to recognize the faces of
celebrities in the crowd and identify them for the TV audience. And Formula 1, Major League
Baseball, and the National Football League are all using machine learning to enhance the
viewing experience for sports fans.6

To apply machine learning, you train a model based on earlier datasets and then apply the
model to new data as it is observed.

AWS offers three general approaches to machine learning:
1 Use a pretrained model such as Amazon Rekognition, which has already been trained to
recognize objects in images, or Amazon Lex, which has been trained to understand intentions
expressed in natural language.
2 Train and apply your own model based on any one of the common algorithms used for machine
learning using Amazon SageMaker.
3 Use your own algorithms and training approaches, if you have employees skilled in machine
learning, by working directly with Amazon infrastructure that is optimized for machine
learning.

With tools such as these, enterprises can unleash the creativity of their employees and find
new ways to put data to use.7

6 “At Capital One, Enhancing Fraud Protection With Machine Learning,” AWS Case Study, 2021; “At T-Mobile, AI Humanizes
Customer Service”; “Sky: Something New at the Royal Wedding,” AWS Media Blog, 2019; “AWS Machine Learning
Customers: From the World’s Largest Enterprises to Emerging Start-ups, More Machine Learning Is Built on AWS than
Anywhere Else.”
7 “AWS Machine Learning Customers: From the World’s Largest Enterprises to Emerging Start-ups, More Machine Learning
Is Built on AWS than Anywhere Else.”

5 Upskill
The next important element in extracting value from data is to make sure you have
employees with the right skills—in addition to a sense of curiosity. This is why data
scientists are in such high demand. Yes, there are plenty of tools available for people with
little skill or experience in statistics. But to really make the most of data and to do so with
rigor, it is important to have people with a good understanding of how to make correct
inferences from data.

For a simple example, those of us with less statistical experience tend to over-rely
on averages, even when looking at an entire distribution of values. In one case that I
remember from my time as CIO at USCIS, we were looking to reduce the time it took
us to process certain types of applications. We created dashboards to track the average
amount of processing time, but each change we tried seemed to have only a small impact
on the metric. What we missed was that the small number of applications that raised
national security or fraud concerns took much longer to process, thereby skewing the
average. We had no way to control how long those applications took to process. Although
our improvements applied to the great majority of cases, because of the highly skewed
average, we couldn’t really see their impact. When we realized the problem and began
monitoring, say, the 85th percentile completion time, we could identify the significant
impact our changes had on the majority of cases. We possessed the data, the tools, and the
access—we just lacked the skills to draw the correct inferences.
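The effect described in this example is easy to reproduce with made-up numbers. The sketch below assumes 95 routine cases and 5 slow outliers, then compares the mean with the 85th percentile before and after an improvement that touches only the routine cases. The figures and the helper name `percentile_85` are hypothetical and are not USCIS data.

```python
import statistics

# Hypothetical processing times in days: 95 routine cases and 5 slow outliers.
before = [10] * 95 + [400] * 5
after = [7] * 95 + [400] * 5  # routine cases improve by 3 days; outliers are unchanged


def percentile_85(values):
    """85th percentile using the standard library's inclusive quantile method."""
    return statistics.quantiles(values, n=100, method="inclusive")[84]


for label, data in (("before", before), ("after", after)):
    print(label, "mean:", statistics.mean(data), "p85:", percentile_85(data))
```

With these numbers the mean moves only from 29.5 to 26.65 days (about 10 percent), while the 85th percentile drops from 10 to 7 days (30 percent), so the percentile shows an improvement that the average largely hides.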

Data-driven decisions can also be poorly founded when the data is presented in a
misleading way, even when done so unintentionally. In his book The Visual Display of
Quantitative Information, Edward Tufte shows how data can be distorted or obscured by
the way it is presented.8 Again, an enterprise that wants to be rigorous in its use of data
must ensure that it has the right skills in analysis and presentation, as well as the data.

8 Tufte, E., The Visual Display of Quantitative Information, Graphics Press, 2015
6 Provide guardrails
Before we can make data available for novel uses (to satisfy curiosity, so to speak), we must
put guardrails around it for privacy and confidentiality. Data-driven enterprises practice
“privacy by design,” deliberately establishing safeguards based on planning and foresight.
They gain speed and flexibility down the road by making sure they have already considered what
needs protection and have set up automated ways to do so. In fact, the recent European Union
General Data Protection Regulation (GDPR) requires privacy by design.

The cloud provides many tools for setting up automated access controls at a granular level
that let you give employees access to precisely the data they should have access to. There are
ways to track the provenance and validity of the data, encrypt or obscure it, and restrict
access on a field-by-field or record-by-record basis. In other words, you can specify which
customers’ data an employee has access to and which pieces of data associated with those
customers the employee can view. Amazon Macie uses machine learning to identify which data in
your data lake is personally identifiable information (PII) and tracks how it is used. Or you
can choose to manage data only at an aggregated level or with information masked or
anonymized. The flexibility is there; each data-driven enterprise must make responsible
decisions about privacy given the type of data they steward.

Many other challenges arise in using the vast amounts of data available to enterprises. It is
often a challenge to accurately connect data from different IT systems pertaining to a single
individual, especially in countries like the US that do not have a single national ID system.
Data can be inaccurate not only because of mistakes made in data entry but also because of
limitations in the IT systems that collect the data. For example, there are IT systems that
only allow for a surname and a given name, which imposes inaccuracy for people who have more
than two names.9

Regardless, the goal of a data-driven enterprise is to make data available to drive rigorous
and accurate decision-making and continuous innovation. It requires collecting and storing
data for flexible use later, making it and the right tools available without friction to those
who will use them, ensuring privacy and confidentiality by design, cultivating the skills to
make valid inferences, and solving data hygiene problems that can lead to poorly informed
decisions. This is what it means to bring agility to data.
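Record-by-record and field-by-field restrictions of the kind described above can be sketched as a small policy check. This is an illustrative toy, not how any particular cloud service implements access control. The customer records, the `Policy` class, the `apply_policy` function, and the `"***"` mask value are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical customer records; field names are illustrative only.
CUSTOMERS = [
    {"id": 1, "region": "east", "name": "Ana Ruiz", "email": "ana@example.com", "spend": 120},
    {"id": 2, "region": "west", "name": "Bo Chen", "email": "bo@example.com", "spend": 310},
]


@dataclass(frozen=True)
class Policy:
    """What one role may see: which records, and which fields in the clear."""
    regions: frozenset       # record-level rule
    clear_fields: frozenset  # field-level rule; every other field is masked


def apply_policy(records, policy):
    """Return only the permitted records, masking fields the role may not read."""
    visible = []
    for record in records:
        if record["region"] not in policy.regions:
            continue  # record-level restriction: skip customers outside the role's scope
        visible.append({
            key: (value if key in policy.clear_fields else "***")
            for key, value in record.items()
        })
    return visible


east_analyst = Policy(
    regions=frozenset({"east"}),
    clear_fields=frozenset({"id", "region", "spend"}),
)
print(apply_policy(CUSTOMERS, east_analyst))
```

Defining the rule once as data, rather than scattering checks through each analysis, is what lets the guardrail be applied automatically whenever someone explores the data.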

9 For some great stories about IT systems insensitive to real-world scenarios, consult Gojko Adzic’s Humans vs Computers
How can we use data to bring agility to our business?
An agile business in the digital age proceeds by trying an idea, getting feedback, and then
adjusting course—and doing so repeatedly. This fast-feedback approach lets the company
innovate (at low risk, high speed, and low cost) and reduce investment risk by testing ideas
before committing to them. It results in a good fit between the company’s products and
the markets they are intended to serve and ensures that the company is solving the right
problem in the right way at the right time.

Fast feedback
Feedback, in this sense, does not mean asking customers whether they like a new feature
or product. More commonly, data-driven enterprises use quantitative feedback—the kind
of feedback that is gathered by watching how customers actually act—or by monitoring
changes in market behavior or other metrics.

For example, companies often improve the usability of their websites through A/B testing;
that is, by trying two variations on a piece of the design (usually, one variation is the current
status-quo version and the other is a new piece of design they are considering introducing).
They show some customers version A and some version B. They collect data on the
customers’ activity and analyze it in relation to the outcomes they care about. If they want
to decide whether to make a button green or red to maximize the number of times it is
clicked, then they can show some users a green version and some a red one—and see which
gets more clicks. Expedia and Netflix are examples of companies that routinely conduct A/B
testing, drawing on large amounts of data from a data warehouse in the cloud.10
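The button-color comparison above comes down to asking whether the difference in click rates is larger than chance would explain. One common way to check is a two-proportion z-test, sketched below with the standard library only. The click counts are invented, the function name `two_proportion_z_test` is hypothetical, and the source does not say which statistical method Expedia or Netflix use.

```python
import math


def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for the difference in click rates, B minus A."""
    rate_a = clicks_a / users_a
    rate_b = clicks_b / users_b
    # Pooled click rate under the assumption that both versions perform the same.
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: green button (A) versus red button (B).
z, p = two_proportion_z_test(clicks_a=200, users_a=5000, clicks_b=260, users_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be noise. In practice teams also fix the sample size in advance and consider whether the size of the effect matters to the business.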

10 “A story of Netflix and AB Testing in the User Interface Using DynamoDB,” AWS re:Invent 2017
The powerful approach of learning and adjusting through feedback goes far beyond just A/B
user interface testing. New product ideas, for example, can be tested by creating a “minimum
viable product,” the smallest and simplest version of the product the company can use to
gather information on whether the product will be successful or what needs to be changed to
make it so. Marketing strategies, promotions, technology alternatives—all these can be tested
through trial and measurement to reduce risk and uncertainty. And the key to doing so is
gathering data and making it available for analysis.

The technique of using minimum viable products and fast feedback is described in Eric
Ries’ book The Lean Startup.11 According to Ries, at any given moment, a startup holds
two hypotheses: a value hypothesis, about how its proposed product will create value for
customers, and a growth hypothesis, about how the company will be able to grow its market—
that is, get customers to use the product. The minimum viable product is the smallest product
that will give the startup information to confirm or refute these hypotheses, at which point it
can make changes and retest them with the market.

This set of practices does not just apply to startups or to new product development. It has
become central to the way organizations, including large enterprises, achieve business agility
by changing course based on their learnings. If an enterprise is thinking of developing a new IT
system for use by its own employees, it presumably has a hypothesis about how that IT system
will deliver the business outcomes that are proposed in its business case. That hypothesis
should be tested, and changes should be made based on what the data shows.

As a result, agile practice requires data. To learn and adapt, the enterprise has to collect data
on the impact of its new initiatives and use it to inform those initiatives. Agility further requires
that the enterprise sense changes in its business environment, so it can respond appropriately
to maximize its business outcomes. A data-driven enterprise not only brings agility to its data
but also uses data to support its agility.

11 Ries, E., The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses,
Crown Business, 2011


Culture and process change
Becoming data-driven requires a very different way of making decisions, and it is a deep cultural
change for many organizations. In the past, we might have made decisions by crafting detailed
plans, analyzing options with the available data, and choosing the option that appears to deliver
the best outcomes. In the digital world, we refuse to accept only the data that is available the
instant the plan is created. Instead, we design experiments to yield additional data and then
incorporate that data into our decision-making. We resolve uncertainty by generating new data.

An example is the technique for IT governance that we devised at USCIS. Instead of writing a large
requirements document and handing it over to the technologists for implementation, we simply
handed over a business objective. In one case, we noticed that a skilled case processor (a “status
verifier”) could process about 70 cases a day, and our business objective was to make that number
much higher. In another business case, we found that several paper files got lost in transit as we
moved them between processing locations, and we wanted to eliminate those losses.

For each of these objectives, we began by creating a dashboard that showed the key metric—the
number of daily cases or the number of missing files. Instead of writing a requirements document,
we created a cross-functional team of business operators and IT technologists, and we charged
them with improving the metric. We gave them the tools to make changes to IT systems and
business processes quickly and then monitored the dashboards with them. They tried small,
incremental changes and monitored the results every day. Based on what they saw in the data,
they could decide what to do next to maximize the outcome.

And management could decide whether to continue funding the initiative or to direct the funds
elsewhere. The result was a data-driven, reduced-risk, lightweight governance process that
delivered value quickly.

This leads to another important point: Accountability is enhanced by transparency. By making
data widely available, we made the team’s progress visible. As a result, oversight bodies could
constantly revisit the investment decision, gauging the investment up or down, redefining
objectives, or stopping the investment entirely. Results were the only measure of success, and
results could be achieved quickly. But those results had to be supported by the data.

Spotting patterns
Another area where data can promote agility is through sensing changes or recognizing patterns
in the environment. For example, machine learning can be used to detect and respond to anomalies.
We can train a machine learning model with historical or routine data so it becomes used to what
is “normal” and then apply it to find activity that is not normal. This technique can be used,
for example, to spot fraudulent transactions or network intrusions by hackers. It could also be
used to spot equipment on a factory production line that is diverging from its normal behavior
and might have to be repaired or replaced, and to do so before it actually fails.

When we collect large amounts of data, we may find we can identify relationships we didn’t know
were there. Social media companies build large databases of relationships between people. The US
Department of Homeland Security might find that a potential terrorist once lived at the same
address as someone who is already known to be a terrorist, which might lead them to ask questions
when they next encounter the person. Or perhaps fraudulent immigration applications might turn
out to have all been prepared by the same immigration lawyer. Here, we have moved well beyond
merely using data to process transactions; we can now find important and interesting
relationships between those transactions. Even so, we don’t know exactly what relationships we
might find; agility, flexibility, and curiosity are the keys to deriving value from data.

To cite one more example of using data to “keep an eye on events,” the existence of a data point
can serve as confirmation that an activity took place, for example, when audit trail logs are
created automatically. By following the trail of activities, auditors may be able to validate
compliance or investigate improper activity. Blockchain is often used to store data that confirms
that activities took place, such as a transfer of money between two parties or an approval of a
contract by the parties involved. By using automated guardrails and audit data to establish
compliance, enterprises can often avoid heavyweight compliance processes that reduce agility.

There are, of course, challenges in using data to support business agility. As we noted above, it
requires skill to draw the appropriate inferences from data. The data does not always tell us
what action to take; we must interpret it and make good decisions from those interpretations.
Often, we face a trade-off between false positives and false negatives. For instance, if we use
the data to spot anomalous transactions to identify potential fraud, we run the risk of flagging
too many transactions as anomalous and annoying our customers or flagging too few and allowing
fraud to sneak through. The larger the dataset becomes, the more likely that meaningless patterns
will emerge or that important patterns will become buried in the sheer number of potential
connections. Noise accumulates along with signals.
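
To make the anomaly-spotting idea and the trade-off between false positives and false negatives
concrete, here is a minimal sketch in Python. It is not a machine learning model: it swaps in a
much simpler statistical baseline (distance from the historical mean, measured in standard
deviations) to stand in for "learning what is normal." The transaction amounts, the fraud labels,
the two thresholds, and the function names are all made up for illustration.

```python
from statistics import mean, stdev


def fit_baseline(history):
    """Summarize "normal" as the mean and sample standard deviation of past values."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold):
    """Flag a value whose distance from the mean exceeds `threshold` standard deviations."""
    mu, sigma = baseline
    return abs(value - mu) / sigma > threshold


def count_errors(labeled, baseline, threshold):
    """Return (false_positives, false_negatives) for labeled (value, is_fraud) pairs."""
    false_positives = sum(
        1 for value, is_fraud in labeled
        if is_anomalous(value, baseline, threshold) and not is_fraud
    )
    false_negatives = sum(
        1 for value, is_fraud in labeled
        if not is_anomalous(value, baseline, threshold) and is_fraud
    )
    return false_positives, false_negatives


# Hypothetical historical transaction amounts that define "normal".
history = [98, 102, 100, 97, 103, 101, 99, 100, 104, 96]
baseline = fit_baseline(history)

# Hypothetical new transactions, labeled after the fact as fraud or not.
labeled = [(101, False), (95, False), (108, True), (250, True)]

# A low threshold flags more transactions; a high threshold flags fewer.
for threshold in (1.5, 4.0):
    fp, fn = count_errors(labeled, baseline, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

With these invented numbers, the lower threshold flags one legitimate transaction (a customer
who might be annoyed), while the higher threshold lets one fraudulent transaction through. Where
to set the threshold is a business decision about which kind of error is more costly, not
something the data decides on its own.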

In closing
A data-driven organization is one that puts data to work to improve business outcomes, both by using
data to drive a rigorous decision process and by making the data available for stimulating innovation and
providing value to customers. When data is locked into an inflexible framework, siloed, or difficult to get
at, it becomes a barrier to business agility, preventing the company from responding to opportunities
or from getting products to market quickly. Even worse, when a business doesn’t drive its processes
and investments through the use of data, it is forgoing important contact with the market it is
trying to serve or passing up feedback that could help it succeed better in its initiatives. A data-driven
organization, on the other hand, uses data to gain agility and uses agility to make its data more valuable.

Learn more about how your business can unlock the power of data with AWS ›

About the Author


Mark Schwartz is an enterprise strategist at Amazon Web Services and the
author of The Art of Business Value, A Seat at the Table: IT Leadership in the
Age of Agility, and War and Peace and IT: Business Leadership, Technology,
and Success in the Digital Age. Before joining AWS, he was the CIO of the US
Citizenship and Immigration Services (part of the Department of Homeland
Security), CIO of Intrax, and CEO of Auctiva. He has an MBA from Wharton, a
BS in Computer Science from Yale, and an MA in Philosophy from Yale.

Mark Schwartz
Enterprise Strategist, AWS
Read more from Mark Schwartz ›

© 2022, Amazon Web Services, Inc. or its affiliates. All rights reserved.
