Geography of Online Scams



The Geography of Online Dating Fraud


Matthew Edwards∗, Guillermo Suarez-Tangil†, Claudia Peersman∗,
Gianluca Stringhini†, Awais Rashid∗, Monica Whitty‡
∗ Cyber Security Group, Department of Computer Science, University of Bristol, UK
† Information Security Group, Department of Computer Science, University College London, UK
‡ Cyber Security Centre, WMG, University of Warwick, UK

Abstract: This paper presents an analysis of online dating fraud’s geography. Working with real romance scammer dating profiles collected from both proxied and direct connections, we analyse geographic patterns in the targeting and distinct characteristics of dating fraud from different countries, revealing several strong markers indicative of particular national origins having distinctive approaches to romance scamming. We augment IP geolocation information with other evidence about the dating profiles. By analysing the resource overlap between scam profiles, we discover that up to 11% of profiles created from proxied connections could be assigned a different national origin on the basis of text or images shared with profiles from direct connections. Our methods allow for improved understanding of the origins of dating fraud, beyond only direct geolocation of IP addresses, with patterns and resource sharing revealing approximate location information which could be used to target prevention campaigns.

I. INTRODUCTION

The online romance scam is one of the most prevalent forms of mass-marketing fraud in many Western countries. False dating profiles are created by scammers as a prelude to a sustained false romance, during which the target is repeatedly defrauded of large sums of money. The impact on victims in terms of both monetary loss and emotional harm can be substantial. However, technical analysis of the methods used by these scammers remains sparse, with few quantitative analyses of attacks and attackers.

Previous work has explored victim understanding of the scam process in interview settings [1], text reuse in romance scammer approaches via Craigslist [2] and strategies deployed in an anonymous Chinese dating site [3]. A major unaddressed hurdle for combatting this fraud is understanding its true global origins, as misrepresentation of location is common. Uncertainty about location and international legal obstacles can hinder investigation and prosecution.

The locations scammers give in their profile are typically regarded as being as false as the profile picture, calculated to attract the interest of their targets [1]. Dating sites record the IP addresses used by scammers in creating and accessing their profiles, and may compare those addresses to blacklists or use the IP geolocation (especially when compared to the profile’s declared location) to inform a judgement about the likelihood that a profile is genuine. In response, most scam profile authors make use of web proxies to disguise their IP address connection information, and so they appear to be using a connection from the location given in their profile information. Dating sites are predictably countering by banning access to their site via known web proxies and similarly allocated IP blocks. There are however limitations to the effectiveness of these countermeasures, with privately hosted or intentionally disguised proxies escaping the checks of proxy listing services.

The real location, even at a national level, of the creators of the scam profiles is of interest both to law enforcement and for other preventative efforts – not only for the purpose of identifying that a given profile is a scam, but for following up with appropriate countermeasures once a significant origin of scams has been identified (e.g., contacting local law enforcement, funding targeted preventative campaigns). This paper is the first study we know of to address this topic.

In this paper, we use a dataset of real online dating scam profiles which includes profiles created via both proxied and direct connections. We set out to answer the following research questions:

• Where does dating fraud come from? What does IP geolocation evidence tell us about the origins of profiles created via direct connections, and how does this connect to the locations given in the profiles?
• Do profile elements get reused internationally? Does reuse suggest different origins for dating profiles? Can we complement IP geolocation by examining profile elements being reused between unproxied and proxied connections?
• Does dating fraud from different regions present different characteristics? Do countries tend towards certain forms of romance scam in a distinctive manner?

In Section II below, we describe the available data, and note its limitations. In Section III below, we outline the significant origin countries within the SOURCE dataset, and the national locations those profiles present. In Section IV we look at text and images being shared between romance scam profiles, and what these patterns suggest about the PROXY dataset. In Section V, we examine the major scam origin nations to identify patterns in other elements of the profiles, before concluding in Section VI with a discussion of the policy implications of this analysis.

II. DATA SOURCE

The data used in this paper comes from a public online dating scamlist maintained at scamdigger.com, which offers up romance scammer profile data for public awareness. An exhaustive collection of the 5,402 scam profile instances, as collected during March 2017, was examined with respect to two sources of geographic information:

1) The location given in the scammer dating profile information.
2) The IP address used to create the profile, as reported by the dating site.

Other profile elements of note include the age, gender, occupation, marital status and self-description, which are analysed in detail in related work. Of the two sources of geographic information, the former was recorded as a string, often specifying location to a city level. This was geocoded to lat/lon coordinates and a standard format through queries to the Open Street Map’s Nominatim service¹. For the sake of brevity, the locations given in profiles are referred to as the presented locations.

The IP address information was mapped to a location through the use of a geolocation service², providing both coordinates and structured address information. Some 368 records contained no IP address information and were excluded, leaving 5,194 profile instances. Of the IP addresses used, many (67.9%) have been identified as known web proxies or VPN end-points by the dating site, raising doubts about the reliability of the inferred geographic location. For this purpose, we separate the data into the SOURCE (i.e., un-proxied users) and PROXY (i.e., proxied users) subsets, of 1,666 and 3,528 profiles respectively. It is possible that IP addresses from the SOURCE dataset are in fact unknown proxies, perhaps shared secretly amongst criminals, and similarly, it is possible that PROXY users are only masking their specific connection information rather than their national origins. We address these possibilities below as they touch upon our results.

Some important limitations of the data source must be acknowledged as context for our analysis. Firstly, the scamdigger.com site is primarily a scam-list for profiles submitted to a particular dating site, datingnmore.com, which reviews submitted profiles with particular focus on online dating fraud, and lists those identified as scammers either at registration or after interaction with members. The profiles presented are thus those of scammers that attempt to target this particular dating site, which may be a source of unknown bias. As with almost all criminal data analysis, these are also those dating fraud profiles from scammers who have been identified or caught, and it is possible that they are not representative of a more skillful subpopulation, which could also be geographically biased. The former issue could be explored further through comparison with statistics from other dating sites, where they can be persuaded to release this information. The latter is an inherent limitation of criminological data.

III. GEOGRAPHIC ORIGINS OF DATING FRAUD

Table I lists the significant origin countries for the SOURCE dataset. The largest single origin by far was Nigeria, at over 30% of the dataset. West Africa in general accounts for over 50% of the SOURCE locations. These proportions closely match previous observations of the national origins of advance-fee fraud, as determined by email header IP addresses [4], [5], suggesting potential commonality between these types of fraud. The next largest origins, Malaysia and South Africa, are also well-known for producing other forms of internet fraud. All of the listed nations score below 50 on the 2016 Corruption Perception Index [6], except for the United States and the United Kingdom, suggesting these may be unusual cases.

    Nation           Count  Proportion
 1  Nigeria            488       0.302
 2  Ghana              216       0.134
 3  Malaysia           178       0.110
 4  South Africa       140       0.087
 5  United Kingdom      86       0.053
 6  United States       57       0.035
 7  Turkey              50       0.031
 8  India               47       0.029
 9  Togo                41       0.025
10  Senegal             40       0.025
11  Philippines         29       0.018
12  Ukraine             28       0.017
13  Russia              24       0.015
14  Ivory Coast         23       0.014
15  Kenya               22       0.014

TABLE I: The SOURCE countries for > 20 scam profiles

Figure 1 plots the major scam origins against their profile’s presented location, as directional arrows weighted by volume of scams. The United States is the location most commonly presented in dating profiles, at 63% of the SOURCE dataset, followed by the UK (11%), Germany (3%) and Canada (2%). As presented locations are usually indicative of the victims’ nationality, we can understand the data as reporting that residents of the US are the major target of romance scams, followed by those of other western nations.

Africa: Most African sources focus their attention on the major western targets reported above. A notable exception is a cluster of profiles from Ghana which appear to report their location accurately. This may be a simple reaction to a scam-detection methodology which uses mismatches between presented and IP-geolocated locations³; or could represent a more ‘honest’ scam format aimed at extracting funds through straight seduction. A similar but smaller group appears in South Africa. Other exceptions include a small cluster of profiles from South Africa and Ghana which present their location as Iraq and Afghanistan. These are classic “military scam” profiles, purporting to be members of the US military stationed overseas. A small number of Nigerian profiles present their location as Malaysia, for unclear reasons.

Europe: Almost all SOURCE profiles from the United Kingdom presented themselves as from the United States, with only 9% targeting the United Kingdom itself, despite this also being an internationally targeted location. Profiles originating in Turkey targeted the United States and Germany, in keeping with the international norm. Most interestingly, profiles from the Ukraine and Russia almost always presented their national location as consistent with their IP address. This marked deviation from the pattern of romance scams originating elsewhere highlights the distinctive nature of Russian and Ukrainian dating fraud.

¹ https://wiki.openstreetmap.org/wiki/Nominatim (March 2017)
² http://freegeoip.net (September 2017)
³ Such a method is in use by the dating site operators

Fig. 1: The major paths from SOURCE IP addresses to the locations given in profiles
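
The arrow weights in Fig. 1 amount to counting profiles per (origin, presented) pair. The following is a minimal sketch of that tally, assuming each SOURCE profile has already been reduced to two country names. The sample records and the helper names `flow_counts` and `mismatch_rate` are illustrative assumptions, not part of the paper's tooling.

```python
from collections import Counter

# Hypothetical records: (country geolocated from the IP, country stated in the profile).
profiles = [
    ("Nigeria", "United States"),
    ("Nigeria", "United States"),
    ("Ghana", "Ghana"),
    ("Ukraine", "Ukraine"),
    ("United Kingdom", "United States"),
]


def flow_counts(records):
    """Count profiles per (origin, presented) pair; each count is one arrow weight."""
    return Counter(records)


def mismatch_rate(records):
    """Fraction of profiles whose presented country differs from the IP country."""
    mismatched = sum(1 for origin, presented in records if origin != presented)
    return mismatched / len(records)


flows = flow_counts(profiles)
for (origin, presented), weight in flows.most_common():
    print(f"{origin} -> {presented}: {weight}")
print(f"mismatch rate: {mismatch_rate(profiles):.2f}")
```

Pairs where origin and presented country coincide (as in the Ghanaian, Russian and Ukrainian clusters described in the text) show up as self-loops in such a tally.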

Asia: India follows the international norm in presenting profiles as from the United States and United Kingdom, although the ratio allocated to each is weighted more in favour of the United Kingdom (2:1 vs the 10:1 in West Africa), perhaps due to closer national ties. There are some small groups of Indian source IPs which present profiles in Singapore or Malaysia. Malaysian scammers also present profiles in the US and UK at the Indian 2:1 ratio, with small secondary clusters presenting from Malaysia and nearby Australia. Scammers in the Philippines split their presentation between the Philippines itself and the US, an unusual pattern that likely reflects the close links between the US and the Philippines.

United States: Almost all SOURCE profiles from the United States gave their location as within the United States. However, the most common presented state locations were New York and Texas, while the source addresses were mostly located in Arizona, California and Virginia, suggesting a degree of location misrepresentation within the nation or else imprecision of unknown proxying attempts.

IV. AUGMENTING GEOLOCATION EVIDENCE

As previously highlighted, SOURCE IP addresses are not necessarily accurate origins – they could be unknown proxies which escaped detection. While this is inherently an unknown factor, we can make use of certain additional evidence as an augmentation. For SOURCE IP information we can assess the likelihood of impersonations, and for the unknown PROXY subset’s true locations we can examine the reuse of text and images with direct connections.

A. Probabilistic Assessment

We can first estimate the likelihood of this possibility by comparing the ratio of SOURCE and PROXY IPs for national locations. It is known that proxy lists will have a certain degree of error or incompleteness, which, under a base assumption that knowledge of proxies is affected similarly despite their location around the globe, means we are searching for an unknown threshold at which to discard the idea that certain origins are genuine – the rate of false negative error in these proxy lists. As we cannot be certain of this rate, no hard conclusions can be drawn from proxy ratios alone, but we can say that a large SOURCE:PROXY ratio is a signal carrying some information about the credibility of location information. Where the number of profiles with an unknown IP address is a small fraction of the number of known proxies for this location, we will regard these locations as suspect. Where this is not the case, we can be more confident that the IP address accurately reflects the origin of the scam profile.

Nation           PROXY  SOURCE:PROXY
United States     1949          0.03
United Kingdom     204          0.42
Russia              47          0.50
Ukraine             23          1.17
Philippines         11          2.42
Turkey              10          4.55
India                5          7.83
Kenya                1         11.00
Ivory Coast          0         23.00
Malaysia             5         29.67
South Africa         3         35.00
Nigeria             12         37.54
Senegal              0         40.00
Togo                 0         41.00
Ghana                4         43.20

TABLE II: Ratio of suspected source IPs to known proxies by country

Table II presents this ratio for the major SOURCE countries. From this, we can say that we have the most reason to be suspicious of the validity of IP addresses situated in the United States, with the observed count of scam IP addresses not known to be proxies being a very small fraction of those from known proxies. We also know that the majority of the SOURCE

dataset from outside the US have presented their location as being in the US, attesting international effort at exactly this form of misinformation. Looking at temporal reporting information, we find that the proportion of SOURCE profiles in the US has been decreasing since 2013, suggestive of gradually improving proxy detection.

The UK is the next most suspect IP location, also attracting a large volume of SOURCE profiles as a falsely presented location, and with more PROXY than SOURCE IP addresses. However, scammers would have to be an order of magnitude more effective at masking their IP addresses as UK locations than as US locations, in order to explain the ratios of scam profiles generated by these IP addresses. It is notable that both SOURCE and PROXY profiles from UK IP addresses most often present themselves as located in the US. This suggests either that the UK supports a population of relatively security-conscious romance scammers targeting the US, or is acting as a significant staging ground for fraud from elsewhere directed at the US. Temporal information here also suggests a downward trend since a spike in 2014.

Russia and the Ukraine are also locations with a significant number of PROXY profiles, but here there is less reason to suspect the SOURCE IP addresses do not reflect the national origins of the scam. Unlike the US and UK, we do not see any significant number of other SOURCE profiles presenting Russia and the Ukraine as their location, and unlike the SOURCE profiles, most PROXY profiles from these locations present their location as the US. The reporting figures appear stable over the observed period. The few presented Russian and Ukrainian PROXY profiles may simply be scammers protecting their individual location and connection information, without interest in masking their national origins. Similarly, known proxies account for just over a quarter of the IP addresses from the Philippines, but there are few profiles traced from outside the country which purport to be located there, so there is little reason to suspect large-scale misrepresentation.

The remaining locations are only lightly populated by IP addresses from known proxies, and we may have confidence that these are genuine national origins of online dating fraud.

Some locations show up neither as significant SOURCE origins nor as presented locations in profiles, but only as transit points in the PROXY dataset. These are locations with significant proxy populations, but apparently of low appeal as targets for international dating fraud. All such profiles predominantly presented as located in the United States, with the proxy country being at best a distant second. Notable transit locations include the Netherlands, Switzerland, Sweden, France, Australia, Romania and Finland.

B. Profile Description Reuse

Previous work has shown that romance scammers engage in substantial reuse of certain profile elements to save on labour, using certain cached images and making use of textual “scripts” which can be copied and pasted with minimal editing [2]. We here seek to explore how these sharing patterns appear geographically. Understanding which sources are sharing resources can help identify cooperating criminals and similar scam types. Geographic clusters of resources can also be useful in identifying the true origins of profiles using proxies to hide their location.

Text reuse is common in scam profiles, with key chunks of text and expressions being observed across different unique profiles. To identify these overlaps, we first preprocessed the textual descriptions to standardise case and remove punctuation, and then used a longest common substring method to cluster texts. Any two texts which shared more than a threshold of 10 tokens (words) were considered to be part of the same cluster. By this method, 899 unique profiles could be assigned to a cluster, sharing text with at least one other profile⁴.

Looking first of all at reuse within the SOURCE subset, the greatest text reuse occurred within nations, with multiple unique profiles originating in Nigeria and South Africa sharing description text. The greatest international text reuse was between Nigeria and South Africa, with multiple profiles in each country sharing elements, and, interestingly, between Nigeria and the United States. Given the previous evidence that the SOURCE profiles in the United States may have been created through undetected proxies, we can take these Nigerian and South African scripts appearing in the US as further evidence of this under-detection. Similarly, scripts appearing in the United Kingdom suggest that there are undetected proxies amongst the SOURCE IP addresses from the UK. Text reuse within Africa, and between Nigeria and (to a lesser extent) each of Malaysia, India and Turkey, suggests a common approach to romance scamming in these nations. Notably, we see little to no direct text reuse from Russia, the Ukraine or the Philippines, either internally or externally, though it is worth noting that we have relatively few examples from these countries in comparison to the numbers from West Africa.

Turning to the PROXY dataset, we find that 241 (11%) share text with SOURCE profiles, meaning that their true location can be indirectly inferred. Table III reveals the results of assigning the majority national label for shared clusters. As well as adding significantly to the totals for the already-dominant West African and Malaysian scam origins, this inference also reveals a number of Italian scam profiles. Combining these discovered origins with the smaller number of Italian SOURCE profiles which enabled this inference, Italy would place 11th in Table I, with more profiles originating here than in Russia or the Ukraine.

Location        Assigned
Nigeria               88
Ghana                 56
Malaysia              41
Italy                 11
South Africa           8
India                  5
United Kingdom         5
Benin                  4
Kenya                  4
Philippines            4
Other                 15

TABLE III: Inferred true locations of PROXY profiles

⁴ This number does not count variants of the same profile identified as such from the dataset, so these 899 reflect 28% of the dataset
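
The text-reuse procedure of Section IV-B can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it reads "longest common substring" as the longest contiguous run of shared word tokens, links profiles transitively with a union-find, and labels each unlabelled (PROXY) profile with the majority SOURCE country in its cluster. The function names and the toy texts are hypothetical.

```python
import string
from collections import Counter


def tokens(text):
    """Lower-case, strip ASCII punctuation, split on whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def longest_common_run(a, b):
    """Length of the longest contiguous token sequence shared by lists a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def cluster(texts, threshold=10):
    """Return a cluster id per text; texts sharing more than `threshold` tokens are linked."""
    parent = list(range(len(texts)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    toks = [tokens(t) for t in texts]
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if longest_common_run(toks[i], toks[j]) > threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(texts))]


def infer_origins(cluster_ids, source_country):
    """Map each PROXY profile index to the majority SOURCE country of its cluster.

    source_country[i] is a country name for SOURCE profiles and None for PROXY ones.
    """
    votes = {}
    for cid, country in zip(cluster_ids, source_country):
        if country is not None:
            votes.setdefault(cid, Counter())[country] += 1
    return {
        i: votes[cid].most_common(1)[0][0]
        for i, (cid, country) in enumerate(zip(cluster_ids, source_country))
        if country is None and cid in votes
    }


# Toy data: one reused "script", one unrelated description.
script = "i am a loving caring and honest man looking for my soul mate to share life with"
texts = [
    script + " in lagos",
    "Hello, " + script.capitalize() + "!",
    "I enjoy hiking and cooking at the weekend.",
]
source_country = ["Nigeria", None, None]

ids = cluster(texts)
inferred = infer_origins(ids, source_country)
print(inferred)
```

The pairwise comparison is quadratic in the number of profiles, which is workable for a few thousand descriptions; a larger corpus would likely need an index over token n-grams first.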

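Returning briefly to the probabilistic assessment of Section IV-A: the SOURCE:PROXY values in Table II can be reproduced from the SOURCE counts in Table I and the PROXY counts in Table II. The published figures are consistent with adding one to the PROXY count before dividing, which also keeps the ratio defined where no proxies were seen. That smoothing is an inference from the numbers, not something the text states, so the sketch below should be read with that assumption in mind.

```python
# (SOURCE count from Table I, PROXY count from Table II) for a few nations.
counts = {
    "United States": (57, 1949),
    "United Kingdom": (86, 204),
    "Russia": (24, 47),
    "Ukraine": (28, 23),
    "Ivory Coast": (23, 0),
    "Nigeria": (488, 12),
}


def source_proxy_ratio(source, proxy):
    """SOURCE:PROXY ratio with one added to the denominator (assumed smoothing)."""
    return source / (proxy + 1)


for nation, (source, proxy) in counts.items():
    print(f"{nation:15s} {source_proxy_ratio(source, proxy):6.2f}")
```

A small ratio (as for the United States) marks a location whose unproxied addresses deserve suspicion; a large one (as for Nigeria) marks a location where the IP address is more likely to reflect the true origin.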
(a) 2d...a8. (b) d4...9e. (c) 6c...89. (d) 15...bf.

Fig. 2: Images reused by scammers in different profiles. Each sub-caption shows an excerpt of the hash of the image. Note that although certain images are perceptually equal, their hashes are different.

C. Profile Image Reuse

The use of images plays an important role in online dating sites. Scammers often reuse profile images that have been shown to attract vulnerable users in other locations. The military, the academic and the medical context are recurrently exploited [1]. Figure 2 shows four examples of image reuse. These images appear in different scammers’ profiles. While some images are perceptually the same picture, their hashes are totally different. This is the case, for instance, of Figure 2a and 2b, where their hashes are 2da1883450f2b74357465d3031cfd2a8 and d43c4519edc110c6a53dd10e40414e9e respectively.

In our work, we use perceptual hashing to fingerprint images. This type of hashing extracts features from the images so that two images will have the same perceptual hash when features are similar. These hash functions can distinguish between dissimilar images, while being robust against different transformations and “attacks” [7]. For the purpose of this paper, we leveraged different perceptual hashing algorithms including the classic perceptual hash function, computed from the Discrete Cosine Transform (DCT) between the different frequency domains of the image, and wavelet hashing, using the Discrete Wavelet Transformation (DWT) [8]. Perceptual hashes within the dataset are compared in a pairwise manner using their Hamming distance, and then tested for equivalence based on a distance threshold, and manually verified to exclude false positives. We observe a total of 187 images which are perceptually equivalent, with some being reused across up to five different scam profiles.

Image clusters were then aggregated from perceptually equivalent images which were connected by being presented on the same profile page (our assumption being that these are attempts to portray the same subject, even if perceptually dissimilar in setting). There were a total of 183 profiles connected by 57 image clusters. Within the SOURCE subset, there were 45 profiles connected by 27 clusters of images.

Examining reuse within the SOURCE subset, images were predominantly shared between profiles created within Nigeria (14 internal connections to 4 external), Ghana (12 to 2), the UK (5 internal) and South Africa (5 internal). The external connections from Nigeria and Ghana were to Ghana, Nigeria, Benin, Kenya and Turkey. Though the numbers here are small, they fit with the more substantial body of text evidence showing resource sharing largely appearing to happen within nations in West Africa. These images might be copied from other scammers, or profiles in our dataset could have been created by the same scammer under an unresolved alias.

Turning to the PROXY subset, 19 image clusters in this data were connected to the SOURCE subset, allowing a total of 48 proxied profiles (1% of the subset) to be connected to profiles from unproxied connections. The major connected locations were Nigeria (22), Ghana (13), Togo (5) and the UK (4), with the majority of the PROXY profiles affected presenting a US location. Again, this is congruent with other evidence of a largely West African scammer population making use of proxied connections to present themselves as US citizens, with some hints of scammers also acting from within the UK.

V. CHARACTERISING GEOGRAPHICAL DIFFERENCES IN SCAM PROFILES

A previous section has explored how the presented location in a scam profile can differ according to the actual location of its creator. Other profile elements may also vary geographically, according to the particular flavours of romance scam being employed in each location. In the section below, we examine how demographic characteristics are distributed according to the origin of scams from the SOURCE dataset.

We survey the demographic information (age, gender, occupation, ethnicity and marital status) for each major scam origin country⁵ in the SOURCE dataset. Z-tests were performed for the age, gender, and topmost category of occupation, ethnicity and marital status, compared to the SOURCE population averages. Table IV presents the results, with statistically significant differences (α = 0.05) highlighted in bold. Bonferroni correction was applied to adjust for multiple comparisons. Gender is presented as the proportion of males.

⁵ Those in Table I, plus Italy, which is promoted to importance when considering text reuse evidence

An immediate division can be drawn between countries which predominantly present male profiles (e.g., Nigeria, Malaysia, South Africa) and those which present mostly female profiles (e.g., the Philippines, Ukraine, Senegal). The age of scam profiles corresponds with their gender, with female scam profiles typically averaging around the age of 30, and male profiles averaging towards 50. The rates by which profiles declare themselves single also appear to be gender-biased, with female profiles being far less likely to use alternative statuses such as divorced or widowed. These would seem to correspond to very different top-level strategies of online dating fraud being pursued in different countries, with, presumably, different targets in mind.

Within nations presenting mostly male profiles, the strategies appear to be fairly similar. They all mostly report white ethnicities, and most frequently use military or engineering occupations. Two exceptions are India, where the all-male scam profiles mostly present themselves as ‘businessmen’, and Italy, where the profiles most commonly report professions in the real estate sector. Marital status provides the most interesting distinctions. It is clear that a heavy use of the ‘widow’ backstory is especially favoured by South African and Turkish scammers, also most evident in the profiles with

Nation (SOURCE IP)    N  Age          Gender      Occupation           Ethnicity         Marital Status
                             x̄     z     x̄     z  x           x̄     z  x        x̄     z  x          x̄     z
Nigeria             488  42.61  0.95  0.73  4.13  military 0.19  1.54  white 0.60 -1.38  single  0.47 -1.73
Ghana               216  40.01 -2.81  0.46 -5.34  military 0.22  2.07  white 0.68  1.65  single  0.63  3.70
Malaysia            178  46.53  5.33  0.79  4.29  engineer 0.30  5.71  white 0.60 -0.86  single  0.46 -1.17
South Africa        140  48.61  6.95  0.81  4.35  engineer 0.22  2.15  white 0.77  3.57  widow   0.57  7.47
UK                   86  46.15  3.38  0.93  5.64  military 0.33  4.09  white 0.66  0.71  divorce 0.33  5.51
USA                  57  47.33  3.56  0.84  3.21  engineer 0.34  3.71  white 0.61 -0.20  widow   0.30  0.18
Turkey               50  46.08  2.53  0.72  1.21  military 0.26  1.82  white 0.86  3.48  widow   0.58  4.66
India                47  42.62  0.30  1.00  5.17  business 0.32  8.14  white 0.62 -0.14  single  0.53  0.39
Togo                 41  39.44 -1.56  0.20 -5.89  military 0.37  3.52  white 0.39 -3.20  single  0.68  2.34
Senegal              40  33.98 -4.67  0.10 -7.07  student  0.57 11.23  black 0.38  7.75  single  0.88  4.81
Philippines          29  27.66 -7.07  0.03 -6.75  sales    0.50 14.85  mixed 0.48  7.81  single  0.97  5.14
Ukraine              28  29.15 -6.11  0.07 -6.23  academic 0.22  9.82  white 1.00  4.24  single  0.89  4.26
Russia               24  29.25 -5.72  0.08 -5.65  accounts 0.43 23.18  white 0.96  3.51  single  0.79  2.94
Ivory Coast          23  36.52 -2.44  0.30 -3.32  student  0.30  3.86  black 0.48  8.02  single  0.65  1.48
Kenya                22  35.73 -2.72  0.45 -1.79  self     0.32  3.28  white 0.55 -0.82  single  0.64  1.30
Italy                19  39.37 -1.09  0.89  2.33  realty   0.63 23.74  white 0.89  2.55  single  0.89  3.59
SOURCE             1666  42.13     -  0.64     -  military 0.17     -  white 0.63     -  single  0.50     -

TABLE IV: Dominant demographic characteristics by origin country. Significant differences highlighted.
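
The z values reported for proportions in Table IV are close to what a one-sample z-test against the SOURCE population proportion gives. The sketch below is offered under that assumption: the paper does not spell out the exact test variant, and the number of comparisons used for the Bonferroni correction (80 here, i.e. 16 nations times 5 characteristics) is a guess made only for illustration. Small differences from the published values are expected because the table's inputs are rounded.

```python
import math


def proportion_z(p_hat, p0, n):
    """One-sample z statistic for a sample proportion against a reference proportion."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def significant(z, alpha=0.05, comparisons=1):
    """Bonferroni correction: compare the p-value with alpha / number of tests."""
    return two_sided_p(z) < alpha / comparisons


# Nigeria, gender column: 0.73 male in 488 profiles vs 0.64 in the SOURCE population.
z_gender = proportion_z(0.73, 0.64, 488)
print(f"z = {z_gender:.2f}, significant: {significant(z_gender, comparisons=80)}")
```

The age column would need the population standard deviation, which the table does not report, so only the proportion case is shown.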

the (suspect) location in the USA. The UK features profiles unusually willing to make use of a ‘divorced’ status. These exceptions mark certain nations as following variant patterns from the “Nigerian” approach. The commonalities between countries could be attributed to larger-scale campaigns of location disguise of the same individuals, or an international criminal group or community following similar patterns of activity, perhaps actively sharing tactics.

Nations presenting mostly female profiles have more markedly distinct strategies. Profiles from Senegal and the Ivory Coast are most likely to state a black ethnicity, and to report being students. Profiles from Russia and the Ukraine are almost universally white, but may be distinguishable by their declared occupation, with accounting professions being particularly distinctive of Russian profiles. Profiles from Togo are notable for including a number of females reporting military occupation, whilst the Philippines often present distinctively as mixed-race, working in sales positions.

Ghana presents an unusual picture, with a distinctive bias towards a more mixed-gender approach to scamming. Further examination reveals that Ghanaian profiles represent a mix of two competing local approaches: male profiles following the preferred pattern from nearby Nigeria, with dominantly military occupations, whilst female profiles borrow from a tradition more akin to that in Senegal and the Ivory Coast, presenting mostly as students – though, interestingly, Ghanaian female profiles are still more likely to be white than black. Kenya also appears to represent a balanced gender mix of scam profiles, but the comparatively small number of profiles from there make it difficult to be confident about this pattern.

The language in profile descriptions also shows some regional characteristics. To analyse the variety of topic categories found in the profile descriptions, we used dictionary terms that are mapped to categories from the LIWC 2015 dictionary [9]. Normalised category frequencies were recorded for each profile description and grouped by country of origin.

Fig. 3: The variety of topic categories [9] found in scammers’ profile descriptions by origin country.

As can be seen in Figure 3 our results showed that scammers based in Russia referred considerably more to personal concerns, such as work, leisure and especially religion, and that they tend to focus on their motives or drivers (e.g. affiliation, status, power), while scammers from Togo, the Philippines, the Ivory Coast and India used more words related to social processes (e.g. references to friendship and the use of personal pronouns such as “I”, “you”, “he”, “she”, “we”, “they”). Interestingly, when analysing the number of references to cognitive processes, such as tentative language use (e.g. “maybe”, “perhaps”, etc.), discrepancies (e.g. “should”, “would”, etc.) and cognitive words (e.g. “know”, “cause”, etc.) the highest number of references was found among Ukrainian scammers. Finally, Kenya, and to a lesser extent, Senegal, stood out for
their language use linked to negative emotions, such as sadness
and affective processes (e.g. “cry”), and for displaying more
informal language forms (including ‘netspeak’ features).
Online dating fraud therefore appears to have distinctive
regional characteristics. Further
work will discuss whether demographic and linguistic features
such as these are robust enough to automatically identify the
origins of scam profiles.

VI. CONCLUSION

We have provided an overview of the geography of online
dating fraud, to the extent that our data source allows us to
explore this topic. While most online dating fraud profiles
present their location as a major location in the US or
another Western country, their origins are mostly West African,
Malaysian or South African, with Nigeria the largest single
contributor. These may be the most important targets for efforts
at preventing online dating fraud. Preventative and disruptive
efforts, working with regional agencies in these locations,
could have significant international impact.
Treating IP location information critically, we observe that
profiles which appear to have been created from the United
States can often share text or images with scam profiles
created from elsewhere. Text reuse indicators can suggest true
locations for up to 11% of data coming from known proxies.
This is only an initial analysis, but refining and applying this
methodology using a database of known scam profiles could
help capture new scam profiles reusing observed elements.
Methods such as these, working from larger evidence bases,
could help form a technological countermeasure to the increasing
use of proxies amongst scammers, using the wider
criminal population to help identify and locate the more careful
elements.
At a national level, most countries producing romance scam
profiles tend toward creating mostly male or mostly female
profiles, suggestive of different cultures motivating the activity.
With a few exceptions, mostly-male-profile countries follow
a similar strategy with profile demographics, while
mostly-female-profile countries appear to have more distinctive
trademark approaches, which may be useful for investigators
assessing the likelihood of particular origins. Further work is needed
to determine how reliable presented characteristics alone can
be for determining the origins of a given profile.
While a certain number of fraudulent dating profiles are
seen to originate in the US, a number of other indicators
suggest that many of these profiles come from undetected
proxies for West African dating fraudsters, and the true rate
for dating fraud originating in the US may be much lower than
IP geolocation evidence alone would suggest.
The findings here might be taken up by law enforcement
and government to guide global efforts at disruption of this
particular crime. Whilst it comes as no surprise that these
crimes emanate from West Africa, according to our data, the
UK also has a share of fraudsters, and proxies across Europe
are being used as transit points to target victims in the US.
Law enforcement agencies in Western countries therefore need to be
concerned with both the crimes that enter the country from
abroad and the criminals located within their own territory.

REFERENCES

[1] M. T. Whitty, “The scammers’ persuasive techniques model: Development of a stage model to explain the online dating romance scam,” British Journal of Criminology, vol. 53, no. 4, pp. 665–684, 2013.
[2] T.-F. Yen and M. Jakobsson, “Case study: Romance scams,” in Understanding Social Engineering Based Scams. Springer, 2016, pp. 103–113.
[3] J. Huang, G. Stringhini, and P. Yong, “Quit playing games with my heart: Understanding online dating scams,” in Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 2015, pp. 216–236.
[4] M. Kienpointner, “How to present fallacious messages persuasively: The case of the Nigeria spam letters,” Considering Pragma-Dialectics, pp. 161–173, 2006.
[5] E. Edelson, “The 419 scam: information warfare on the spam front and a proposal for local filtering,” Computers & Security, vol. 22, no. 5, pp. 392–401, 2003.
[6] Transparency International. (2016) Corruption perceptions index. [Online]. Available: https://www.transparency.org/news/feature/corruption_perceptions_index_2016
[7] C. Zauner, M. Steinebach, and E. Hermann, “Rihamark: perceptual image hash benchmarking,” in Media Forensics and Security, 2011, p. 78800X.
[8] V. Monga and B. L. Evans, “Perceptual image hashing via feature points: performance evaluation and tradeoffs,” IEEE Transactions on Image Processing, vol. 15, no. 11, pp. 3452–3465, 2006.
[9] J. W. Pennebaker, R. L. Boyd, K. Jordan, and K. Blackburn, “The development and psychometric properties of LIWC2015,” Tech. Rep., 2015.
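
The text-reuse indicator discussed in the Conclusion can be illustrated with a minimal Python sketch. It suggests an origin for a proxied profile when its description, after light normalisation, matches the description of a profile created over a direct connection. The record fields (`id`, `country`, `description`), the normalisation step and the exact-match criterion are all assumptions made for illustration; they should not be read as the procedure used in the analysis, and image reuse is not covered.

```python
import re
from collections import Counter, defaultdict


def normalise(text):
    """Lower-case and drop punctuation so trivially edited copies still match."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def suggest_origins(direct_profiles, proxy_profiles):
    """Map each proxied profile id to the most common country among
    direct-connection profiles that share its normalised description."""
    countries_by_text = defaultdict(Counter)
    for profile in direct_profiles:
        key = normalise(profile["description"])
        if key:  # ignore empty descriptions, which would match each other
            countries_by_text[key][profile["country"]] += 1

    suggestions = {}
    for profile in proxy_profiles:
        matches = countries_by_text.get(normalise(profile["description"]))
        if matches:
            suggestions[profile["id"]] = matches.most_common(1)[0][0]
    return suggestions


# Hypothetical toy records, not taken from the dataset.
direct = [
    {"id": "d1", "country": "NG", "description": "I am a caring, honest man."},
    {"id": "d2", "country": "MY", "description": "Looking for my soulmate"},
]
proxied = [
    {"id": "p1", "description": "i am a caring honest man"},
    {"id": "p2", "description": "Something entirely new"},
]

suggested = suggest_origins(direct, proxied)
# Share of proxied profiles for which reuse suggests an origin.
coverage = len(suggested) / len(proxied)
```

Exact matching keeps the sketch simple but misses lightly reworded descriptions; a fuzzier similarity measure would trade recall against false matches, and any suggested origin remains only an indicator.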
