Optimizing Anti-Money Laundering Transaction Monitoring Systems Using SAS® Analytical Tools
ABSTRACT
Financial institutions face a common challenge: meeting the ever-increasing demand from
regulators to monitor and mitigate money laundering risk. Anti-Money Laundering (AML) Transaction
Monitoring systems produce large volumes of work items, most of which do not result in quality
investigations or actionable results. Backlogs of work items have forced some financial institutions to
contract staffing firms to triage alerts spanning back months. Additionally, business analysts struggle to
define interactions between AML models and explain what attributes make a model productive. There is
no single approach to solving this issue. Analysts need several analytical tools to explore model relationships,
improve existing model performance, and add coverage for uncovered risk. This paper will demonstrate
an approach to improve existing AML models and focus money laundering investigations on cases which
are more likely to be productive using analytical SAS tools including SAS Visual Analytics®, SAS
Enterprise Miner®, SAS Studio®, SAS/STAT®, and SAS Enterprise Guide®.
INTRODUCTION
Within populations of customers, different sub-groups of customers can be identified based on
transactional activity and behavior. One way to effectively monitor and mitigate money laundering risk is
to apply a targeted approach to monitoring different groups of customers through customer segmentation.
This paper will discuss a top-down and bottom-up approach that can be applied to AML Transaction
Monitoring systems, enabling investigative teams to prioritize efforts in identifying suspicious activity with
a higher likelihood of success. By applying a top-down methodology based on business knowledge,
customers can be initially characterized based on known attributes. The bottom-up approach is a data-
driven methodology involving data mining and unsupervised modeling methods used to identify
homogeneous customer groups with similar transactional behavior. Following customer segmentation,
scenarios are developed to categorize and alert on various types of suspicious activities. Initial threshold
parameters are established and the scenario tuning process is performed within customer segments to
reduce the false positive rate and surface more productive alerts. This process can be operationalized using
SAS solutions such as SAS Anti-Money Laundering®, SAS Enterprise Case Management®, SAS
Financial Crimes Monitor®, and SAS Social Network Analysis®.
Figure 1. Aggregate Transactions by Transaction Description Per Customer Using SAS Studio
Based on the summary statistics shown in Table 1, on average, these customers appear to have a low
frequency of transactions across all transaction types on a monthly basis. Almost all average monthly
aggregate transaction types appear to have right-skewed distributions, meaning that a few
customers have high average transaction amounts.
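The aggregation behind these summary statistics can be sketched as follows. This is a minimal, language-neutral illustration in Python rather than the SAS Studio task used in the paper; the record layout, the sample values, and the function name `monthly_average_by_customer` are assumptions made for the example. Months in which a customer has no transactions of a given type are simply absent here, not counted as zero.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (customer_id, month, transaction_type, amount).
transactions = [
    ("C1", "2016-01", "WIRE", 500.0),
    ("C1", "2016-01", "WIRE", 250.0),
    ("C1", "2016-02", "WIRE", 250.0),
    ("C2", "2016-01", "WIRE", 100.0),
    ("C2", "2016-02", "WIRE", 300.0),
    ("C3", "2016-01", "WIRE", 9000.0),
    ("C3", "2016-02", "WIRE", 11000.0),
]


def monthly_average_by_customer(records):
    """Return {(customer, type): average of the monthly total amounts}."""
    # Step 1: total amount per customer, type, and month.
    monthly_totals = defaultdict(float)
    for customer, month, txn_type, amount in records:
        monthly_totals[(customer, txn_type, month)] += amount
    # Step 2: average those monthly totals per customer and type.
    per_customer = defaultdict(list)
    for (customer, txn_type, _month), total in monthly_totals.items():
        per_customer[(customer, txn_type)].append(total)
    return {key: mean(totals) for key, totals in per_customer.items()}


averages = monthly_average_by_customer(transactions)
wire_values = [value for (_cust, t), value in averages.items() if t == "WIRE"]

# A mean well above the median is a quick indicator of right skew:
# a few customers with large amounts pull the mean upward.
right_skewed = mean(wire_values) > median(wire_values)
```

Comparing the mean with the median is only a rough check; a skewness statistic or a histogram gives a fuller picture.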
It is important to use business logic to perform “data cleaning” to eliminate transaction types or variables
that are not useful in identifying suspicious activity. Figure 2 summarizes frequency of transaction types
using a histogram. Looking at the frequency of transaction types per customer on a monthly basis, it is
clear that balance inquiries occur most frequently, whereas wire transactions occur at a lower frequency.
Balance inquiries, debit and credit adjustments, and various fees are logically not indicative of suspicious
activity and should be removed from the analysis.
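A minimal sketch of this cleaning step, assuming hypothetical transaction type codes (the paper does not list the actual codes): count how often each type occurs, then drop the types that business logic marks as not indicative of suspicious activity.

```python
from collections import Counter

# Hypothetical type codes for the categories named in the text.
EXCLUDED_TYPES = {"BALANCE_INQUIRY", "DEBIT_ADJUSTMENT", "CREDIT_ADJUSTMENT", "FEE"}

# Hypothetical records: (customer_id, transaction_type, amount).
records = [
    ("C1", "BALANCE_INQUIRY", 0.0),
    ("C1", "BALANCE_INQUIRY", 0.0),
    ("C1", "WIRE", 5000.0),
    ("C2", "FEE", 12.0),
    ("C2", "CASH_DEPOSIT", 900.0),
    ("C2", "BALANCE_INQUIRY", 0.0),
]

# Frequency of each transaction type, as a histogram would summarize it.
type_counts = Counter(txn_type for _cust, txn_type, _amt in records)

# Keep only the transaction types that remain relevant to the analysis.
cleaned = [rec for rec in records if rec[1] not in EXCLUDED_TYPES]
```

Keeping the exclusion list in one named set makes the business decision explicit and easy to review.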
Figure 2. Data Exploration - Histogram of Transaction Types
Following initial variable cleaning and exploration, apply top-down segmentation of the customer base to
divide the population into pre-characterized groups. This requires an understanding of known attributes
based on business logic. Many financial institutions have multiple Lines of Business (LOBs), including
Deposit, Auto, and Mortgage lines of business. Customers within each LOB are expected to exhibit
different behavior, have different payment cycles, and therefore need to be assessed for suspicious
activity using different criteria. In this customer base of 150,000 customers, customers are initially
classified as commercial or personal customers. As shown in Figure 3, the majority of customers are
personal customers. For the remainder of the analysis, only the subset of personal customers will be
used. This is an example of how to divide the data based on top-down methodology. Once the customers
are divided into groups using top-down methodology, each group can be analyzed separately using a
bottom-up methodology. This process is fluid, in that both top-down and bottom-up methodology can be
applied at various stages within the data analysis process and can lead to different results.
Before applying a bottom-up, data-driven approach to identify customer segments, it is important to
standardize the data across transaction types. This not only creates a consistent scale across transaction
types, but also gives all transaction types the same level of importance when developing analytical
models. For example, wire transfers typically involve higher transaction amounts, whereas fees are most
likely smaller and possibly more frequent. There are several different ways of standardizing
data. One method is to rescale the data to have a zero mean and unit standard deviation. Negative
standardized values would indicate a lower than average amount, and positive standardized values would
indicate a higher than average amount. Values across transaction types can be compared and interpreted
more easily using this method. Figure 4 shows the data before and after standardization using SAS
Studio.
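The zero-mean, unit-standard-deviation rescaling described above is the z-score, z = (x - mean) / standard deviation, computed separately for each transaction type. The sketch below is an illustration in Python with made-up values, not the SAS Studio task used for Figure 4. It uses the sample standard deviation (divisor n - 1), which is an assumption; a population standard deviation would work equally well as long as it is applied consistently.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale values to zero mean and unit (sample) standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical monthly average amounts per customer, by transaction type.
monthly_avg = {
    "WIRE": [200.0, 500.0, 10000.0],
    "CASH": [20.0, 40.0, 60.0],
}

standardized = {t: standardize(v) for t, v in monthly_avg.items()}
# After standardization both types share the same scale, so a value of 1.0
# means "one standard deviation above average" for wires and cash alike.
```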
Figure 4. Standardized Monthly Average Sum Per Customer Across Transaction Types Using SAS
Studio
Table 2. K-means 5-Cluster Solution with Number of Customers Per Segment
If one cluster contains a significant portion of the data, the cluster solution has potential for improvement.
Figure 5 exhibits a case in which cluster solutions other than a 5-cluster solution could prove to be a
more useful segmentation of customers.
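To make the clustering step and the cluster-size check concrete, here is a minimal k-means sketch (Lloyd's algorithm) in plain Python. It is not the SAS procedure behind Table 2. The two-dimensional points, the hand-picked initial centroids, and the 60% dominance cutoff `DOMINANCE_SHARE` are all assumptions chosen to keep the example small and reproducible.

```python
from collections import Counter


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids


# Hypothetical standardized values for seven customers, two variables each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1), (0.0, 0.2),
          (5.0, 5.0), (5.1, 4.9)]
labels, centroids = kmeans(points, initial_centroids=[(0.0, 0.0), (5.0, 5.0)])

# Share of customers per cluster; a dominant cluster suggests trying another k.
DOMINANCE_SHARE = 0.6  # hypothetical cutoff
shares = {c: n / len(labels) for c, n in Counter(labels).items()}
dominant = [c for c, share in shares.items() if share > DOMINANCE_SHARE]
```

In practice the initial centroids would be chosen by the clustering procedure itself, and several values of k would be compared before settling on a solution.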
After identifying distinctive clusters of personal customers, it is important to classify the customer sub-
groups based on behavioral differences. It is easier to detect behavioral differences between clusters
using visualizations. SAS Visual Analytics is a great tool for generating both static and interactive plots to
gain insight into the cluster segments.
One useful visualization is a parallel coordinate plot. This is shown in Figure 6 below. This interactive
visualization shows generalized customer attributes within each cluster. The vertical bar on the left-hand
side contains color-coded clusters. The relative cluster bar size indicates the number of customers within
each cluster. The remaining vertical bars each represent a transaction type and form an axis
ranging from the minimum to the maximum standardized value per average customer.
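The numbers behind such a plot are per-cluster averages of the standardized values for each transaction type. The sketch below shows one way to compute them in Python; the field names and values are made up for illustration and are not taken from the paper's data.

```python
from collections import defaultdict
from statistics import mean

TRANSACTION_TYPES = ["cash_credit", "wire_debit"]  # hypothetical field names

# Hypothetical standardized values per customer, with an assigned cluster.
customers = [
    {"cluster": 0, "cash_credit": 1.5, "wire_debit": -0.5},
    {"cluster": 0, "cash_credit": 2.0, "wire_debit": -1.0},
    {"cluster": 1, "cash_credit": -0.5, "wire_debit": 1.0},
    {"cluster": 1, "cash_credit": -1.0, "wire_debit": 2.0},
]


def cluster_profiles(rows, types):
    """Return {cluster: {type: mean standardized value}}."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["cluster"]].append(row)
    return {
        cluster: {t: mean(r[t] for r in members) for t in types}
        for cluster, members in grouped.items()
    }


profiles = cluster_profiles(customers, TRANSACTION_TYPES)

# Each axis of the plot spans the smallest to largest cluster average.
axis_ranges = {
    t: (min(p[t] for p in profiles.values()),
        max(p[t] for p in profiles.values()))
    for t in TRANSACTION_TYPES
}
```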
Figure 6. Parallel Coordinate Plot of Cluster Solution and Transaction Types
By using this interactive visualization, it becomes apparent that one of the clusters contains customers
that have higher average currency amounts for incoming/credit cash transactions but lower average
currency amounts for outgoing/debit wire transfers. This is shown in Figure 7.
Figure 7. Parallel Coordinate Plot of Cluster Solution and Transaction Types Highlighting Cluster 0
Another cluster of customers, shown in Figure 8, has relatively lower credit cash transactions but higher
average wire amounts compared to other clusters.
Figure 8. Parallel Coordinate Plot of Cluster Solution and Transaction Types Highlighting Cluster 1
SCENARIO TUNING
AML Transaction Monitoring scenarios contain threshold parameters that can be further modified and
“tuned” to alter the alert generation process (AGP) in an effort to improve alert productivity. If a scenario is creating a large
number of alerts, it may be capturing normal activity and generating more false positive alerts
as a result. To counteract this, re-adjusting threshold parameters helps ensure that the alerts being
generated are more likely to be productive for the investigative team. Each scenario can have multiple
parameters or thresholds that require tuning.
The scenario implementation and tuning process is an iterative process that includes several steps. The
first step is to remove any outlier alerted aggregate wire transactions to ensure that the appropriate
threshold will be set based on the typical population. Figure 9 summarizes the distribution and outliers
found within segment 4 customer data containing the alerted monthly aggregated wire transactions.
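The paper does not state which outlier rule was used. Since a box plot is shown, one plausible choice is Tukey's fences, which drop values more than 1.5 times the interquartile range beyond the quartiles. The sketch below illustrates that assumption in Python with made-up amounts.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical alerted monthly aggregate wire amounts (in thousands).
alerted_amounts = [100, 120, 130, 140, 150, 160, 170, 180, 2000]

# Q1 = 130 and Q3 = 170, so the fences are 70 and 230; 2000 is dropped.
typical_amounts = remove_outliers_iqr(alerted_amounts)
```

The multiplier `k` is a tuning choice; a larger value keeps more of the tail in the population used to set thresholds.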
Figure 9. Distribution and Box Plot of Aggregated Wire Transactions Per Customer within
Segment 4
The distribution of alerted aggregated wire amounts is visualized within each customer segment. Figure
10 shows the slightly right-skewed distribution of aggregate wire alerts for customer segment 4. The
reference lines show the 60th and 90th percentiles. The alerted transactions within the 60th-90th percentile
range are then divided into 3 tranches. A tranche is essentially a bucket, or collection of observations to
analyze. Customers with aggregate amounts below the 60th percentile most likely will not generate an
alert. Likewise, customers above the 90th percentile will be very likely to alert as shown in Figure 10
below.
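The percentile band and its three buckets can be sketched as follows. The paper does not say whether the buckets are equal in dollar width or in alert count; this Python illustration assumes equal dollar width, and the helper names `percentile`, `tranche_edges`, and `assign_tranche` as well as the amounts are hypothetical.

```python
from statistics import quantiles


def percentile(values, pct):
    """pct-th percentile (1 to 99) using inclusive linear interpolation."""
    return quantiles(values, n=100, method="inclusive")[pct - 1]


def tranche_edges(values, low_pct=60, high_pct=90, tranches=3):
    """Split the [low_pct, high_pct] percentile band into equal-width buckets."""
    low, high = percentile(values, low_pct), percentile(values, high_pct)
    width = (high - low) / tranches
    return [low + i * width for i in range(tranches + 1)]


def assign_tranche(amount, edges):
    """Return the bucket number (1-based), or None outside the band."""
    if amount < edges[0] or amount > edges[-1]:
        return None
    for i in range(1, len(edges)):
        if amount <= edges[i]:
            return i
    return None


# Hypothetical alerted aggregate wire amounts: 0, 1,000, ..., 100,000.
amounts = [1000 * i for i in range(101)]
edges = tranche_edges(amounts)
```

With these amounts the 60th and 90th percentiles are 60,000 and 90,000, so the three buckets are 10,000 wide each.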
Figure 10. Distribution of Alerted Aggregated Wire Transactions and 60th – 90th Percentile
Once the 60th-90th percentile range is divided into tranches, a simple random sample is selected from each
group, consolidated, and provided to an investigative team. The investigative team identifies whether each alert
is productive or non-productive. Implementing tranches ensures that equal numbers of alerts are
randomly selected from across the distribution and that bias does not interfere with results.
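A minimal Python sketch of this sampling step, assuming alerts have already been grouped by bucket. The alert identifiers, the sample size of two per bucket, and the fixed seed are illustrative choices, not values from the paper.

```python
import random


def sample_per_tranche(alerts_by_tranche, per_tranche, seed=42):
    """Draw a simple random sample of equal size from every bucket."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = []
    for tranche in sorted(alerts_by_tranche):
        alerts = alerts_by_tranche[tranche]
        size = min(per_tranche, len(alerts))  # small buckets give what they have
        sample.extend((tranche, alert) for alert in rng.sample(alerts, size))
    return sample


# Hypothetical alert identifiers grouped by bucket number.
alerts_by_tranche = {
    1: ["A01", "A02", "A03", "A04", "A05", "A06"],
    2: ["B01", "B02", "B03", "B04"],
    3: ["C01", "C02", "C03", "C04", "C05"],
}

review_sample = sample_per_tranche(alerts_by_tranche, per_tranche=2)
```

Recording the seed alongside the sample makes it possible to reproduce the selection later when the tuning decision is documented.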
Figure 11 demonstrates how a scatterplot of the sampled and non-sampled alerts can be visualized in
SAS Visual Analytics to show the breakdown of productive, non-productive, and non-sampled alerts. In
this example, samples of alerts were taken from below the 60th percentile and above the 90th percentile
for the purpose of showing that alerts below the 60th percentile are not productive alerts. In addition, alerts
sampled above the 90th percentile have a higher likelihood of being productive. This sampling and
assessment of alerts enables investigators and analysts to establish the initial lower threshold for a
particular customer segment. It is evident that the lower threshold should be set around $155,250. Most
alerts generated are not productive below this threshold. The iterative process of determining the lower
threshold is commonly referred to as “below-the-line” testing.
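One way to turn the investigators' dispositions into a candidate lower threshold is sketched below in Python. The decision rule (pick the highest amount below which the productive rate stays within a tolerance), the 10% tolerance, and the sample data are assumptions for illustration; they do not reproduce the paper's $155,250 figure or its exact procedure.

```python
def productive_rate(alerts):
    """Share of alerts marked productive; 0.0 for an empty list."""
    if not alerts:
        return 0.0
    return sum(1 for _amount, productive in alerts if productive) / len(alerts)


def evaluate_threshold(sampled_alerts, threshold):
    """Return (productive rate below the line, productive rate at or above it)."""
    below = [a for a in sampled_alerts if a[0] < threshold]
    above = [a for a in sampled_alerts if a[0] >= threshold]
    return productive_rate(below), productive_rate(above)


def choose_lower_threshold(sampled_alerts, max_rate_below=0.10):
    """Highest sampled amount whose below-the-line productive rate is tolerable."""
    best = None
    for amount, _productive in sorted(sampled_alerts):
        rate_below, _rate_above = evaluate_threshold(sampled_alerts, amount)
        if rate_below <= max_rate_below:
            best = amount
    return best


# Hypothetical sampled alerts: (aggregate wire amount, productive?).
sampled = [
    (100000, False), (120000, False), (140000, False), (150000, False),
    (160000, True), (170000, False), (180000, True), (200000, True),
]

threshold = choose_lower_threshold(sampled)
rates = evaluate_threshold(sampled, threshold)
```

The tolerance encodes how much productive activity the institution is willing to leave below the line, so it is a risk decision and not a purely statistical one.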
Figure 11. Scatterplot of Alert Disposition by Aggregated Wire Transactions for Segment 4
Customers with SAS Visual Analytics
automating the documentation process through scenario tuning reports. Not all AML transaction
monitoring engines allow for this approach. Selecting an engine that provides custom scenario tuning
flexibility should be a factor considered in AML technology vendor selection. The SAS Anti-Money
Laundering solution provides an open approach that lends itself to the operationalization of the scenario
tuning process.
CONCLUSION
Financial institutions commonly face challenges with the overwhelming task of maintaining an effective
AML monitoring program within tight budgetary guidelines. In order to effectively monitor and mitigate
money laundering risk, business analysts can use SAS products such as SAS Enterprise Guide®, SAS
Studio®, SAS Visual Analytics®, and SAS Enterprise Miner® to apply top-down and bottom-up
methodologies coupled with scenario development and tuning approaches to establish a more targeted
approach to AML Transaction Monitoring. By implementing the methodologies outlined within this paper,
financial institutions can drastically improve the productivity of their investigative teams to effectively
identify and report financial crime.
CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the authors at:
Leigh Ann Herhold
Zencos Consulting
(919) 237-9079
[email protected]
Stephen Overton
Zencos Consulting
(919) 341-9667
http://stephenoverton.net
[email protected]
Eric Hale
Zencos Consulting
(919) 619-6000
[email protected]
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are trademarks of their respective companies.