OSINT in the Context of Cyber-Security

Chapter · January 2016


DOI: 10.1007/978-3-319-47671-1_14


Chapter 14
OSINT in the Context of Cyber-Security

Fahimeh Tabatabaei and Douglas Wells

Abstract The impact of cyber-crime has compelled intelligence and law
enforcement agencies across the world to tackle cyber threats. All sectors now
face similar dilemmas of how best to mitigate cyber-crime and how to
promote security effectively to people and organizations. Extracting unique and
high-value intelligence by harvesting public records to create a comprehensive
profile of certain targets is emerging rapidly as an important means for the
intelligence community. As the amount of available open sources rapidly increases,
countering cyber-crime increasingly depends upon advanced software tools and
techniques to collect and process the information in an effective and efficient
manner. This chapter reviews current efforts to employ open source data for
cyber-criminal investigations and develops an integrative OSINT Cybercrime
Investigation Framework.

14.1 Introduction

During the 21st century, the digital world has acted as a ‘double-edged sword’
(Gregory and Glance 2013; Yuan and Chen 2012). Through the revolution of
publicly accessible sources (i.e., open sources), the digital world has provided
modern society with enormous advantages, whilst at the same time, issues of
information insecurity have brought to light vulnerabilities and weaknesses (Hobbs
et al. 2014; Yuan and Chen 2012). The shared infrastructure of the internet creates
the potential for interwoven vulnerabilities across all users (Appel 2011): “The
viruses, hackers, leakage of secure and private information, system failures, and
interruption of services” appeared in an abysmal stream (Yuan and Chen 2012).

F. Tabatabaei
Mehr Alborz University, Tehran, Iran
D. Wells
CENTRIC/Sheffield Hallam University, Sheffield, UK

© Springer International Publishing AG 2016
B. Akhgar et al. (eds.), Open Source Intelligence Investigation,
Advanced Sciences and Technologies for Security Applications,
DOI 10.1007/978-3-319-47671-1_14

Wall (2005, 2007) and Nykodym et al. (2005) argued that cyberspace possesses
four unique features, called ‘transformative keys’, that enable criminals to commit crimes:
1. Globalization, which provides offenders with new opportunities to exceed
conventional boundaries
2. Distributed networks, which create new opportunities for victimization
3. Synopticism and Panopticism, which enable surveillance capability on victims
remotely
4. Data trails, which may allow new opportunities for criminals to commit identity
theft
In addition to the above, Hobbs et al. (2014) claim that one of the main trends of
the recent years’ internet development is that “connection to the Internet may be a
very risky endeavour.”
Alongside the widespread use and advancement of mobile communication
technology, the use of open sources has spread through the fields of intelligence,
politics and business (Hobbs et al. 2014). Whilst traditional sources and information channels
(news outlets, databases, encyclopedias, etc.) have been forced to adapt to the new
virtual space to maintain their presence, many ‘new’ media sources (especially from
social media) disseminate large amounts of user-generated content that has
subsequently reshaped the information landscape. Examples of the scale of
user-generated information include the 500 million Tweets per day on Twitter and the 98
million daily blog posts on Tumblr (Hobbs et al. 2014), as well as millions of
individual personal Facebook pages. With the evolution of the information
landscape, it has become essential for law enforcement agencies to harvest relevant
content through investigations and regulated surveillance in order to prevent and detect
terrorist activities (Koops et al. 2013).
As has been considered in earlier chapters, the term Open Source Intelligence
(OSINT) emanates from national security services and law enforcement agencies
(Kapow Software 2013). OSINT for our purposes here is predominantly defined as
“the scanning, finding, collecting, extracting, utilizing, validation, analysis, and
sharing intelligence with intelligence-seeking consumers of open sources and
publicly available data from unclassified, non-secret sources” (Fleisher 2008;
Koops et al. 2013). OSINT encompasses various public sources such as academic
publications (research papers, conference publications, etc.), media sources
(newspapers, radio channels, television, etc.), web content (websites, social media,
etc.), and public data (open government documents, public company
announcements, etc.) (Chauhan and Panda 2015a, b).
OSINT traditionally meant searching publicly available published
sources (Burwell 2004) such as books, journals, magazines, pamphlets, reports and
the like. This is often referred to as literature intelligence or LITINT (Clark 2004).
However, the rapid growth of digital media sources throughout the web and public
communication airwaves has enlarged the scope of open source activities
(Boncella 2003). Since there are diverse public online sources from which we can
collect intelligence, this type of OSINT is described as WEBINT by many authors.
Indeed, the terms WEBINT and OSINT are often used interchangeably (Chauhan
and Panda 2015a, b). Social media such as social networks, media sharing
communities and collaborative projects are areas where the majority of user-generated
content is produced. Social Media Intelligence or SOCMINT refers to ‘the
intelligence that is collected from social media sites’. Some of this information may be
openly accessible without any kind of authentication required prior to investigation
(Omand et al. 2014, p. 36; Chauhan and Panda 2015a, b).
Many law enforcement and security agencies are turning towards OSINT for the
additional breadth and depth of information to reinforce and help validate
contextual knowledge (see for instance Chap. 13). Unlike typical IT systems, which
can adopt only a limited range of input, OSINT data sources are as varied as the
internet itself and will continue to evolve as technology standards expand (Kapow
Software 2013): “OSINT can provide a background, fill epistemic gaps and create
links between seemingly unrelated sources, resulting in an altogether more
complete intelligence picture” (Hobbs et al. 2014, p. 2).
OSINT increasingly depends on the assimilation of all-source collection and
analysis. Such intelligence is an essential part of “national security, competitive
intelligence, benchmarking, and even data mining within the enterprise” (Appel
2011, p. xvii). The process of OSINT is shown in Fig. 14.1. OSINT has been used
for a long time by the government, military and in the corporate world to keep an
eye on the competition and to have a competitive advantage (Chauhan and Panda
2015a, b). A great number of internet users also enjoy legal activities “from
communications and commerce to games, dating, and blogging” (Appel 2011, p. 6),
and OSINT plays a critical role in this context.

Fig. 14.1 The OSINT process

The current chapter aims to present an in-depth review of the role of OSINT in
the context of cyber security. It explores cybercrime and related concepts such as
the Deep and Dark Web, anonymity and cyber-attacks. Further, it
reviews OSINT collection and analysis tools and techniques, with a glance at
related works, as a main part of its contribution. Finally, these related works are
articulated alongside the cyber threat domain and its open sources to establish a ‘big
picture’ of this topic.

14.2 The Importance of OSINT with a View on Cyber Security

Increases in the quantity and type of challenges for contemporary national security,
intelligence, law enforcement and security practitioners have sped up the use of
open sources on the internet to help draw out a more cohesive picture of people,
entities and activities (Appel 2011; also Chaps. 2, 3, 12 and 13). A 2015 PWC¹
survey of more than 500 executives of US businesses, law enforcement services
and government agencies, entitled “Key findings from the 2015 US State of
Cybercrime Survey”, reports that “cybercrime continues to make headlines and
cause headaches among business executives.” 76% of
cyber-security leaders said they are more concerned about cyber threats this year:
“Cybersecurity incidents are not only increasing in number, they are also becoming
progressively destructive and target a broadening array of information and attack
vectors” (PWC 2015).
In a report of the U.S. Office of Homeland Security, critical mission areas,
wherein the adoption of OSINT is vital, include general-intelligence, advanced
warnings, domestic counter-terrorism, protecting critical infrastructure (including
cyberspace), defending against catastrophic terrorism and emergency preparedness
and response (Chen et al. 2012). Therefore, intelligence, security and public safety
agencies are gathering large volumes of data from multiple sources, including the
criminal records of terrorism incidents and from cyber security threats (Chen et al.
2012).
Glassman and Kang (2012) discussed OSINT as the output of changing human–
information relationships resulting from the emergence and growing dominance of
the World Wide Web in everyday life. Socially inappropriate behaviour has been
detected in Web sites, blogs and online-communities of all kinds “from child
exploitation to fraud, extremism, radicalisation, harassment, identity theft, and
private-information leaks.” Identity theft and the distribution of illegally “copied
films, TV shows, music, software, and hardware designs are good examples of how
the Internet has magnified the impact of crime” (Hobbs et al. 2014).

¹ PricewaterhouseCoopers.

The globalization, speed of dissemination, anonymity and cross-border nature of the
internet, together with the lack of appropriate legislation or international agreements, have
made some of these crimes very widespread and very difficult to litigate (Kim et al.
2011). There exist different types of dark sides of the internet, but also applications
to shed light on the dark sides, comprising both technology-centric and
non-technology-centric ones. Technology-centric dark sides include spam,
malware, hacking, Denial of Service (DoS) attacks, phishing, click fraud and violation
of digital property rights. Non-technology-centric dark sides include online scams
and frauds, physical harm, cyber-bullying, spreading false or private information
and illegal online gambling. Non-technology responses include legislation, law
enforcement, litigation, international collaboration, civic actions, education and
awareness and caution by people (Kim et al. 2011).
Computer crime and digital evidence are growing by orders of magnitude that are as yet
unmeasured except by occasional surveys (Hobbs et al. 2014). To an intelligence
analyst, the internet is pivotal owing to the capabilities of browsers, search engines,
web sites, databases, indexing, searching and analytical applications (Appel 2011).
However, key issues can divert OSINT projects from the right direction, such as
harvesting data from big open records on the internet and integrating that data
within the parameters of an OSINT project (Kapow
Software 2013).

14.3 Cyber Threats: Terminology and Classification

Cyber-crime² is any illegal activity arising from one or more internet components
such as Web sites, chat rooms or e-mail (Govil and Govil 2007) and is commonly
defined as “criminal offenses committed using the internet or another computer
network as a component of the crime” (Agrawal et al. 2014). In 2007, the European
Commission (EC) identified three different types of cyber-crime: traditional forms
of crime committed over electronic networks (for example, forgery and web shop
and e-market types of fraud), illegal content such as child pornography, and ‘crimes unique to
electronic networks’ (e.g., hacking and Denial of Service attacks). Burden and
Palmer (2003) distinguished ‘true’ cybercrime (i.e., dishonest or malicious acts
which would not exist outside of an online environment) from crimes which are
simply ‘e-enabled’. They presented ‘true’ cyber-crimes as hacking, dissemination
of viruses, cyber-vandalism, domain name hijacking and Denial of Service attacks
(DoS/DDoS), in contrast to ‘e-enabled’ crimes such as misuse of credit cards,
information theft, defamation, blackmail, cyber-pornography, hate sites, money
laundering, copyright infringements, cyber-terrorism and encryption. Evidently,
crime has infiltrated the Web 2.0 “along with all other types of human activities”
(Hobbs et al. 2014).

² In this chapter, the terms computer crime, internet crime, online crimes, hi-tech crimes,
information technology crime and cyber-crimes are used interchangeably.

Cyber-attacks are increasingly being considered to be of the utmost severity for
national security. Such attacks disrupt legitimate network operations and include
deliberate detrimental effects on network devices, overloading a network and
denying network services to legitimate users. An attacker may also exploit
loopholes, bugs and misconfigurations in software services to disrupt normal network
activities (Hoque et al. 2014).
The attacker’s goal is to perform reconnaissance by harnessing the power of
freely available information, extracted using different intelligence gathering methods,
before executing a targeted attack (Enbody and Sood 2014). Meanwhile, “secrecy”
is a key part of any organized cyber-attack. Actions can be hidden behind a mask of
anonymity, varying from the use of ubiquitous cyber-cafes to sophisticated efforts at
covert internet routing (Govil and Govil 2007). Cyber-criminals exploit
opportunities for anonymity and disguise over web-based communication to conduct
malicious activities such as phishing, spamming, blackmail, identity theft and drug
trafficking (Gottschalk et al. 2011; Iqbal et al. 2012). Network security tools
assist network attackers as well as network defenders in recognizing network
vulnerabilities and collecting site statistics. Network attackers attempt to identify
security breaches based on common services open on a host, gathering relevant
information for launching a successful attack.
Kshetri (2005) classified cyber-attacks into two types: targeted and opportunistic
attacks. In targeted attacks, specific tools are applied against specific cyber targets,
which makes this type more dangerous than the other. Opportunistic attacks
entail the dissemination of worms and viruses that deploy indiscriminately across the
internet (Hoque et al. 2014). Figure 14.2 provides a taxonomy of cyber-crime types
(what) with their motives (why) and the tools used to commit them (how).
To counter the ability of organized cyber-crime to operate remotely through
untraceable accounts and compromised computers, and to fight online crime
gangs, it is therefore essential to supply LEAs and actors in national security
with tools for the detection and classification of, and defence against, various
types of attacks (Simmons et al. 2014).

14.4 Cyber-Crime Investigations

14.4.1 Approaches, Methods and Techniques

Current information professionals draw from a variety of methods for organizing
open sources including but not limited to web-link analysis, metrics, scanning
methods, source mapping, text mining, ontology creation, blog analysis and pattern
recognition methods. Algorithms are developed using computational topology,
hyper-graphs, social network analysis (SNA), Knowledge Discovery and Data
Mining (KDD), agent based simulations, dynamic information systems analysis,
amongst others (Brantingham 2011).
Fig. 14.2 Cyber-crime types: Which-Why-How (type, motives, committing tools and techniques)

Table 14.1 Tools for the collection, storage and classification of open source data

Tools purpose: Data encoding
Application/description of tool(s): The term encoding refers to the process of
putting a sequence of characters into a special format for transmission or storage
purposes. In a web environment, relevant datasets are recovered from data services
available either locally or globally on the internet. Depending on the service and
the type of information, data can be presented in different formats. Modelling
platforms are required to interact with a mixture of data formats including plain
text, markup languages and binary files (Vitolo et al. 2015; Webopedia.com n.d.).
Examples: The Geoinformatics for Geochemistry System (database web services
adopting plain text format), Base64 online encoder, XML encoder

Tools purpose: Data acquisition
Application/description of tool(s): The automatic collection of data from various
sources (e.g., sensors and readers in a factory, laboratory, medical or scientific
environment). Data acquisition has usually been conducted via data access points
and web links such as http or ftp pages, but required periodical updates. Using a
catalogue allows a screening of available data sources before their acquisition
(Ames et al. 2012; Vitolo et al. 2015).
Examples: Meta-data catalogues

Tools purpose: Data provenance
Application/description of tool(s): This term is used to refer to the process of
tracing and recording the origins of data and its movement between databases.
Behind the concept of provenance is the dynamic nature of data. Instead of creating
different copies of the same dataset, it is important to keep track of changes and
store a record of the process that led to the current state. Data provenance can, in
this way, guarantee reliability of data and reproducibility of results. Provenance is
now an increasingly important issue in scientific databases, where it is central to
the validation of data for inspecting and verifying quality, usability and
reliability of data (particularly in Semantic Web Services) (Buneman et al. 2000;
Szomszor and Moreau 2003; Tilmes et al. 2010; Vitolo et al. 2015).
Examples: Distributed version control systems such as Git and Mercurialᵃ

Tools purpose: Data storage
Application/description of tool(s): This term refers to the practice of storing
electronic data with a third party service accessed via the internet. It is an
alternative to traditional local storage (e.g., disk or tape drives) and portable
storage (e.g., optical media or flash drives). It can also be called ‘hosted
storage’, ‘internet storage’ or ‘cloud storage’. Relational databases (DB) are
currently the best choice in storing and sharing data (Vitolo et al. 2015;
Webopedia.com n.d.).
Examples: PostgreSQL, MySQL, Oracle, NoSQL

Tools purpose: Data curation
Application/description of tool(s): Data curation is aimed at data discovery and
retrieval, data quality assurance, value addition, reuse and preservation over
time. It involves selection and appraisal by creators and archivists; evolving
provision of intellectual access; redundant storage; data transformations. Data
curation is critical for scientific data digitization, sharing, integration, and use
(Dou et al. 2012; Webopedia.com n.d.).
Examples: Data warehouses, data marts, Data Management Plan tools (DMPTool)ᵇ

Tools purpose: Data visualization (and interaction)
Application/description of tool(s): This term refers to the presentation of data in
a pictorial or graphical format (e.g., creating tables, images, diagrams and other
intuitive ways to understand data). Interactive data visualization goes a step
further: moving beyond the display of static graphics and spreadsheets to using
computers and mobile devices to drill down into charts and graphs for more details,
and interactively (and immediately) changing what data you see and how it is
processed (Vitolo et al. 2015; Webopedia.com n.d.).
Examples: Polymaps, NodeBox, FF Chartwell, SAS Visual Analytics, Google Maps

ᵃ Distributed version control systems have been designed to ease the traceability of
changes in documents, code, plain text data sets and, more recently, geospatial content.
ᵇ DMP tools create ready-to-use data management plans for specific funding agencies
to meet funder requirements, provide step-by-step instructions and guidance for your
data, and point to resources and services available at your institution to help fulfil
the data management requirements of your grant.

OSINT analytic tools provide frameworks for data mining techniques to analyse
data, visualize patterns and offer analytical models to recognize and react to
identified patterns. These tools should unify indispensable features and
contain integrated algorithms and methods supporting the typical data mining
techniques, including (but not limited to) classification, regression, association and
item-set mining, similarity and correlation as well as neural networks (Harvey
2012). Such analytics tools are software products which provide predictive and
prescriptive analytics applications, some running on open source big data computing
platforms, commonly parallel processing systems based on clusters of commodity
servers, scalable distributed storage and technologies such as Hadoop and NoSQL
databases. The tools are designed to empower users to analyse large
amounts of data rapidly (Loshin 2015). The most predominant tools and techniques for
OSINT collection and storage are summarised in Table 14.1.
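
As a hedged illustration of two of the purposes listed in Table 14.1, data encoding and data provenance, the following minimal Python sketch Base64-encodes a collected record and builds a hash-chained provenance entry for it. The function names, the record fields and the example.org URL are hypothetical and are not drawn from the cited works; only the Python standard library is assumed.

```python
import base64
import hashlib
import json
from datetime import datetime, timezone


def encode_record(raw: bytes) -> str:
    """Base64-encode raw bytes so they can be stored or sent as plain text."""
    return base64.b64encode(raw).decode("ascii")


def provenance_entry(source: str, raw: bytes, previous_hash: str = "") -> dict:
    """Record where a piece of data came from and chain it to the prior entry."""
    entry = {
        "source": source,  # hypothetical field names, chosen for illustration
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
        "content_sha256": hashlib.sha256(raw).hexdigest(),
        "previous_hash": previous_hash,
    }
    # Hash the entry itself so that later changes to the log are detectable.
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return entry


page = "Example public notice".encode("utf-8")
encoded = encode_record(page)

first = provenance_entry("https://example.org/notice", page)
second = provenance_entry(
    "https://example.org/notice", page + b" (updated)", first["entry_hash"]
)
```

Each entry stores the hash of its predecessor, so the sequence of states of a record can be checked later without keeping a full copy of every version, which is the idea behind the provenance row of the table.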

14.4.2 Detection and Prevention of Cyber Threats

Techniques to make use of open sources involve a number of specific disciplines
including statistics, data mining, machine learning, neural networks, social network
analysis, signal processing, pattern recognition, optimization methods and
visualization approaches (Chen and Zhang 2014; also Chapters in Part 2 of this book).
Gottschalk et al. (2011) presented a four-stage growth model for Knowledge
Discovery to support investigations and the prevention of white-collar³ crime in
business organizations (Gottschalk 2010). The four stages are labelled:
1. Investigator-to-technology
2. Investigator-to-investigator
3. Investigator-to-information
4. Investigator-to-application
Through the proper exercise of knowledge, such processes can assist in problem
solving. This four-part system attempts to validate the conclusions by finding
evidence to support them. In law enforcement this is an important system feature as
evidence determines whether a person is charged or not for a crime (Gottschalk
et al. 2011) and the extent to which proceedings against them will succeed (see
Chaps. 17 and 18).
Lindelauf et al. (2011) investigated the structural position of covert criminal
networks using the secrecy versus information trade-off characterization of covert
networks to identify criminal network topologies. They applied this technique to
evidence from the investigation of Jemaah Islamiyah’s Bali bombing as well as
heroin distribution networks in New York. Danowski (2011) developed a
methodology combining text analysis and social network analysis for locating
individuals in discussion forums who have highly similar semantic networks, based
on watch-list members’ observed message content or on other standards such
as radical content extracted from messages they disseminate on the internet. In the
domain of countering cyber terrorism and incitement to violence, Danowski used a
Pakistani discussion forum with diverse content to extract intelligence on illegal
behaviour. Iqbal et al. (2013) presented a unified data mining solution to address the
problem of authorship analysis in anonymous textual communications, such as
spamming and spreading malware, and to model the writing style of suspects in the
context of cyber-criminal behaviour.
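
The general idea of comparing forum authors by the similarity of their word networks can be sketched as follows. This is a simplified stand-in and not Danowski's (2011) method: adjacent word pairs serve as a crude proxy for the edges of a semantic network, cosine similarity is a swapped-in measure, and the reference text, user names and posts are invented for illustration.

```python
import math
import re
from collections import Counter


def word_pairs(text: str) -> Counter:
    """Count adjacent word pairs, a crude proxy for semantic-network edges."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(words, words[1:]))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical reference content (e.g., text associated with a watch-list member).
reference = word_pairs("join the attack on the server tonight")

# Hypothetical forum posts keyed by author.
posts = {
    "user_a": "we join the attack on the server at dawn",
    "user_b": "selling a used bicycle in good condition",
}

scores = {user: cosine(word_pairs(text), reference) for user, text in posts.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Authors at the top of `ranked` share the most word-pair structure with the reference content; in practice such a ranking would only be a lead for further human analysis, not evidence in itself.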
Brantingham (2011) offered a comprehensive computational framework for
co-offending network mining, which combines formal data modelling with data
mining of large crime and terrorism data sets “aimed towards identifying common
and useful patterns”. Petersen et al. (2011) proposed a node removal algorithm in
the context of cyber-terrorism to remove key nodes of a terrorism network. Fallah
(2010) proposed a puzzle-based strategy of game theory using the solution concept
of the Nash Equilibrium to handle sophisticated DoS attack scenarios. Chonka et al.
(2011) offered a solution through Cloud TraceBack (CTB) to find the source of DoS
attacks and introduced the use of a back propagation neural network, called Cloud

³ White-collar crime is financial crime committed by upper class members of society for personal or
organizational gain. White-collar criminals are individuals who tend to be wealthy, highly
educated, and socially connected, and they are typically employed by and in legitimate organizations.
Table 14.2 Categorization of methods using open source data for cyber-criminal investigations
14

Domain (Which) Author (Who) Methodology description (How)


Data mining Criminal networks Iqbal et al. Proposing a framework that consists of three modules. 1. click miner, 2. topic
(2012) miner and 3. information visualizer. It is a unified framework of data mining
and natural language processing techniques to collect data from chat logs for
intuitive and interpretable evidence that facilitates the investigative process
for crime investigation.
Available from: Online Messages (Chat Logs) extracted from Social
Networks
Activity boom in cyber Ansari et al. Describing a typical fuzzy intrusion detection scenario for information mining
cafes, and anomaly (2007) application in real time that investigates vulnerabilities of computer networks
detection Available from: Data available via ISPs
Malware activities Wu et al. Investigating detection solutions of Fast-flux domains by using Data Mining
detection using fast-flux (2010) techniques (Linear Regression) to detect the FFSNa and analysing the feature
OSINT in the Context of Cyber-Security

services networks (FFSN) attributes


Available from: Data in two classes: white and black lists. The white list
includes more than 60 thousands benign domain names; the black list has
about 100 FFSNs domain names detected by http://dnsbl.abuse.ch
Cyber terrorism resilience Koester and Providing a supporting framework via FCA (Factor Concept Analysis) to find
Schmidt and fill information gaps in Web Information Retrieval and Web Intelligence
(2009) for cyberterrorism resilience
Available from: Small terrorist data sets based on 2002, 2005, London,
Madrid
Text Mining Counter Cyber Terrorism Srihari (2009) Using Unapparent Information Revelation (UIR) method to propose a new
framework for different interpretation. A generalization of this taskinvolves
query terms representing general concepts (e.g. indictment, foreign policy)
Intrusion Detection Adeva and Proposing detection attempts of either gaining unauthorised access or
System Atxa (2007) misusing a web application and introducing an intrusion detection software
component based on text-mining techniques using “Arnas” system
Social Network Cyber terrorism (detecting Chen et al. Providing a novel graph-based algorithm that generates networks to identify
Analysis terrorist networks) (2011) hidden links between nodes in a network with current information available to
223

investigators
(continued)
Table 14.2 (continued)
224

Terrorist network fighting Kock Wiil Offering a novel method to analyse the importance of links and to identify key
et al. (2011) entities in the terrorist (covert) networks using Crime Fighter Assistantb
Available from: Open sources: 9/11 attacks (2001), Bali night club bombing
(2002), Madrid bombings (2004), and 7/7 London bombings (2005)
Network attacks (intrusion He and Using an Automatic Semantic Network with two layers: first mode and second
detection) Karabatis mode networks. The first mode network identifies relevant attacks based on
(2012) similarity measures; the second mode network is modified based on the first
mode and adjusts it by adding domain expertise
Available from: Selected data from the KDD CUP 99 data set made available
at the Third International Knowledge Discovery and Data Mining Tools
Competitionc
Optimization methods Preventing DDoS attacks Spyridopoulos Making a two-player, one-shot, non-cooperative, zero-sum game in which the
(based on game et al. (2013) attacker’s purpose is to find the optimal configuration parameters for the
theory) attack in order to cause maximum service disruption with the minimum cost.
This model attempts to explore the interaction between an attacker and a
defender during a DDoS attack scenario
Available from: A series of experiments based on the Network Simulator
(ns-2) using the dumbbell network topology
Trust management and Li et al. (2009) Proposing a defence technique using two trust management systems (Key
DoS attacks Note and Trust Builder) and credential caching. In their two player zero-sum
game model, the attacker tries to deprive as much resources as possible, while
the defender tries to identify the attacker as quickly as possible
Available from: KeyNote (open-source library for the KeyNote trust
management system) as an example to demonstrate that a DoS attack can
easily paralyze a trust management server
Cyber terrorism Matusitz A model combining game theory and social network theory to model how
(2009) cyber-terrorism works to analyse the battle between computer security experts
and cyberterrorists; all players wish the outcome to be as positive or
rewarding as possible
(continued)
F. Tabatabaei and D. Wells
Table 14.2 (continued)

Related works for conceptual frameworks

Cyber-crime investigation | Katos and Bednar (2008) | Presenting an information system to capture the information provided by the different members during a cyber-crime investigation, adopting elements of the Strategic Systems Thinking Framework (SST). SST consists of three main aspects: (1) intra-analysis, (2) inter-analysis and (3) value-analysis
Computer hacking | Kshetri (2005) | Proposing a conceptual framework based on the factors and motivations that encourage and energize cyber offenders’ behaviour: (1) characteristics of the source nation, (2) motivation of attack, (3) profile of target organization (types of attack)
Preventing white collar crime | Gottschalk (2011) | Developing an organizing framework for knowledge management systems in policing financial crime, containing four stages for the investigation and prevention of financial crimes: (1) officer to technology systems, (2) officer to officer systems, (3) officer to information systems, (4) officer to application systems
Detecting cyber-crime in financial sector | Lagazio et al. (2015) | Proposing a multi-level approach that aims at mapping the interaction of both interdependent and differentiated factors, focusing on system dynamics theory in the financial sector. The factors together can facilitate or prevent cyber-crime, while increasing and/or decreasing its economic and social costs
Capturing and analysing military intelligence to prevent crises | Song (2011) | Proposing a military intelligence early warning mechanism based on open sources with four modules ((1) collection module, (2) early-warning intelligence processing, (3) early-warning intelligence analysis, (4) preventive actions) to help the collection, tracking, monitoring and analysis of crisis signals used by operation commanders and intelligence personnel to support preventive actions
a Creates a fully qualified domain name that has hundreds (or thousands) of IP addresses assigned to it
b A knowledge management tool for terrorist network analysis
c This training dataset was originally prepared and managed by MIT Lincoln Labs
[Figure 14.3: diagram of techniques/methods. Labels recoverable from the figure: Data Mining; Text Mining; Web Mining; Machine Learning; Information Extraction; Link Analysis; Optimization Method; Game Theory; Social Network Analysis; Node Removal; Network Extraction; Semantic Networks Analysis; Statistical Method; Regression Models; Conceptual Knowledge-based Frameworks; Cloud Computing]

Fig. 14.3 Categorization of cyber-crime investigation methods and models

Protector, which was trained to detect and filter such attack traffic.
Mukhopadhyay et al. (2013) suggested a Copula-aided Bayesian Belief Network
(CBBN) for cyber vulnerability assessment (CVA) and for assessing and
quantifying cyber-risk.
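To make the idea of a belief-network risk estimate concrete, the following minimal sketch evaluates a three-node discrete Bayesian belief network by exhaustive enumeration. It is only an illustration under stated assumptions: it does not reproduce the copula-based dependence modelling of Mukhopadhyay et al. (2013), and the node names, probabilities and loss figure are hypothetical values chosen for the example.

```python
from itertools import product

# Hypothetical prior probabilities (illustrative, not taken from the chapter).
P_VULN = 0.3    # P(system has an unpatched vulnerability)
P_ATTACK = 0.2  # P(an attack is attempted)

# Hypothetical conditional probability table: P(breach | vulnerable, attacked).
P_BREACH = {
    (True, True): 0.8,
    (True, False): 0.0,
    (False, True): 0.1,
    (False, False): 0.0,
}


def joint(vuln, attack, breach):
    """Joint probability of one full assignment, using the network factorization
    P(V) * P(A) * P(B | V, A)."""
    p_parents = (P_VULN if vuln else 1 - P_VULN) * (P_ATTACK if attack else 1 - P_ATTACK)
    p_b = P_BREACH[(vuln, attack)]
    return p_parents * (p_b if breach else 1 - p_b)


def prob_breach():
    """Marginal P(breach), summing the joint over both parent nodes."""
    return sum(joint(v, a, True) for v, a in product([True, False], repeat=2))


def prob_vuln_given_breach():
    """Posterior P(vulnerable | breach) via Bayes' rule."""
    numerator = sum(joint(True, a, True) for a in [True, False])
    return numerator / prob_breach()


def expected_loss(loss_per_breach):
    """Expected loss = P(breach) * assumed monetary loss of a single breach."""
    return prob_breach() * loss_per_breach


if __name__ == "__main__":
    print(f"P(breach)              = {prob_breach():.3f}")
    print(f"P(vulnerable | breach) = {prob_vuln_given_breach():.3f}")
    print(f"Expected loss          = {expected_loss(100_000):.0f}")
```

With these assumed inputs the breach probability is 0.3 x 0.2 x 0.8 + 0.7 x 0.2 x 0.1 = 0.062, so observing a breach raises the belief that the system was vulnerable from the prior of 0.3 to 0.048 / 0.062.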
In summary, the field of computational criminology includes a wide range of
computational techniques to identify:
1. Patterns and emerging trends
2. Crime generators and crime attractors
3. Terrorist, organized crime and gang social and spatial networks
4. Co-offending networks
Current models and methods are summarized in Table 14.2 by cyber-crime type
(which), author (who), methodology (how) and the open sources used for
testing.

[Figure 14.4: block diagram centred on "Cyber Crime Investigation". Labels recoverable from the figure: Motivation / Goal; Tools and techniques to commit cyber crime; Domain/Type (Cyber Crime Committing and Combating (Detection and Prevention)); Profile of Targeted System/Organization/Enterprise; Cyber crime Detection Tools and Techniques; Cyber crime Prevention Tools and Techniques; Open source (Records) Collection and Storage Tools and Techniques; Open Source (Records) Collection, Storage, Analysis and Processing; Strategies to protect cyber space; Increasing Open Source types; The Growth of Social Media and Revolution of Big Data]

Fig. 14.4 Cybercrime investigation framework

While many approaches seem to be helpful for cyber-crime investigation,
existing literature suggests that social network analysis (SNA), data mining, text
analysis, correlational studies and optimization methods, specifically with a focus on
big data analysis of open sources, are the most practical techniques to aid
practitioners and security and forensic agencies. Currently available techniques can
be categorized in a schematic diagram such as Fig. 14.3.
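
As one illustration of the social network analysis techniques named above (link analysis followed by node removal), the sketch below ranks actors in a small network by degree centrality, removes the highest-ranked actor, and compares the size of the largest connected group before and after. The network, the actor labels and the choice of degree centrality as the ranking measure are assumptions made for the example; this is not a method taken from any of the reviewed studies.

```python
from collections import deque

# Hypothetical network: an edge means two actors were linked in open-source
# records. Actor labels and links are invented for illustration only.
EDGES = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"), ("D", "F"), ("D", "G"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def remove_node(graph, target):
    """Return a copy of the graph without `target` and its incident edges."""
    return {n: nbrs - {target} for n, nbrs in graph.items() if n != target}


def largest_component(graph):
    """Size of the largest connected component, found by breadth-first search."""
    seen, best = set(), 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best


graph = build_graph(EDGES)
centrality = degree_centrality(graph)
# Sorting first makes the choice deterministic if two actors tie.
key_actor = max(sorted(centrality), key=centrality.get)
pruned = remove_node(graph, key_actor)

if __name__ == "__main__":
    print("Most central actor:", key_actor)
    print("Largest group before removal:", largest_component(graph))
    print("Largest group after removal:", largest_component(pruned))
```

In this toy network, actor D has the most links, and removing D leaves only the A-B-C triangle connected. Real investigations would typically use richer measures (for example betweenness) and weighted or directed links, which this sketch deliberately omits.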

14.5 Conclusions

The impact of cyber-crime has compelled intelligence and law enforcement
agencies across the world to tackle cyber threats. All sectors now face similar
dilemmas: how best to mitigate cyber-crime and how to promote security
effectively to people and organizations (Jahankhani et al. 2014; Staniforth 2014).
Extracting unique, high-value intelligence by harvesting public records to create
a comprehensive profile of certain targets is rapidly emerging as an important
means for the intelligence community (Bradbury 2011; Steele 2006). As the amount
of available open sources rapidly increases, countering cyber-crime increasingly
depends upon advanced software tools and techniques to collect and process the
information effectively and efficiently (Kock Wiil et al. 2011).
This chapter reviewed current efforts of employing open source data for
cyber-criminal investigations. Figure 14.4 provides a summary of the findings in
the form of an integrative Cybercrime Investigation Framework.

References

Adeva JJG, Atxa JMP (2007) Intrusion detection in web applications using text mining. Eng Appl
Artif Intell 20:555–566
Agarwal VK, Garg SK, Kapil M, Sinha D (2014) Cyber crime investigations in India: rendering
knowledge from the past to address the future. ICT and critical infrastructure: proceedings of
the 48th annual convention of CSI, vol 2, Springer International Publishing Switzerland,
pp. 593–600. doi:10.1007/978-3-319-03095-1_64
Ames DP, Horsburgh JS, Cao Y, Kadlec J, Whiteaker T, Valentine D (2012) Hydro desktop: web
services-based software for hydrologic data discovery, download, visualization, and analysis.
Environ Model Software 37:146–156
Ansari AQ, Patki T, Patki AB, Kumar V (2007) Integrating fuzzy logic and data mining: impact on
cyber security. Fourth international conference on fuzzy systems and knowledge discovery
(FSKD 2007). IEEE Computer Society
Appel EJ (2011) Behavior and technology, Internet Searches for Vetting, Investigations, and
Open-Source Intelligence. Taylor and Francis Group, pp. 3–17. ISBN 978-1-4398-2751-2
Boncella RJ (2003) Competitive intelligence and the web. Commun AIS 12:327–340
Bradbury D (2011) In plain view: open source intelligence. Comput Fraud Secur 5–9
Brantingham PL (2011) Computational Criminology. 2011 European intelligence and security
informatics conference. IEEE Computer Society. doi:10.1109/EISIC.2011.79
Burden K, Palmer C (2003) Internet crime: cyber crime—A new breed of criminal? Comput Law
Secur Rep 19(3):222–227
Buneman P, Khanna S, Chiew Tan W (2000) Data provenance: some basic issues. University of
Pennsylvania Scholarly Commons. Retrieved from http://repository.upenn.edu/cgi/viewcontent.cgi?article=1210&context=cis_papers
Burwell HP (2004) Online competitive intelligence: increase your profits using cyber-intelligence.
Facts on Demand Press, Tempe, AZ
Chauhan S, Panda K (2015) Open source intelligence and advanced social media search. Hacking
web intelligence open source intelligence and web reconnaissance concepts and techniques.
Elsevier, pp. 15–32. ISBN: 978-0-12-801867-5
Chauhan S, Panda K (2015) Understanding browsers and beyond. Hacking web intelligence open
source intelligence and web reconnaissance concepts and techniques. Elsevier, pp. 33–52.
ISBN: 978-0-12-801867-5
Chen A, Gao Sh, Karampelas P, Alhajj R, Rokne J (2011) Finding hidden links in terrorist
networks by checking indirect links of different sub-networks. In: Kock Wiil U
(ed) Counterterrorism and open source intelligence. Springer Vienna, pp. 143–158.
doi:10.1007/978-3-7091-0388-3_8
Chen H, Chiang RHL, Storey VC (2012) Business intelligence and analytics: from big data to big
impact. Bus Intell Res 36(4):1–24
Chen LP, Zhang CY (2014) Data-intensive applications, challenges, techniques and technologies:
A survey on Big Data. Inform Sci 314–347
Chertoff M, Simon T (2015) The impact of the dark web on internet governance and cyber
security. Global Commission on Internet Governance. No. 6
Chonka A, Xiang Y, Zhou W, Bonti A (2011) Cloud security defence to protect cloud computing
against HTTP-DoS and XML-DoS attacks. J Netw Comput Appl 34:1097–1107
Clark RM (2004) Intelligence analysis: a target-centric approach. CQ Press, Washington, DC
Danowski JA (2011) Counterterrorism mining for individuals semantically-similar to watchlist
members. In: Kock Wiil U (ed) Counterterrorism and open source intelligence. Springer Berlin
Heidelberg, pp. 223–247. doi:10.1007/978-3-7091-0388-3_12
Dou L, Cao G, Morris PJ, Morris RA, Ludäscher B, Macklin JA, Hanken J (2012) Kurator: a
Kepler package for data curation workflows. International Conference on Computational
Science, ICCS 2012, Procedia Computer Science, vol 9, pp. 1614–1619.
doi:10.1016/j.procs.2012.04.177
Enbody R, Soodo A (2014) Intelligence gathering. Elsevier Inc, Targeted cyber attacks. ISBN
9780128006047
Fallah M (2010) A puzzle-based defence strategy against flooding attacks using game theory.
IEEE Trans Dependable Secure Comput 7:5–19
FlashPoint (2015) Illuminating The Deep & Dark Web: the next Frontier in Comprehensive IT
Security. FlashPoint
Fleisher C (2008) OSINT: its implications for business/competitive intelligence analysis and
analysts. Inteligencia Y Seguridad 4:115–141
Ghel R (2014) Power/freedom on the dark web: A digital ethnography of the Dark Web Social
Network. New media and society
Google (2014) Learn about Sitemaps. ps://support.google.com/webmasters/answer/156184?hl=en
Gottschalk P (2010) White-collar crime: detection, prevention and strategy in business enterprises.
Universal-Publishers, Boca Raton, Florida, USA. ISBN-10: 1599428393, ISBN-13:
9781599428390
Gottschalk P, Filstad C, Glomseth R, Solli-Sæther H (2011) Information management for
investigation and prevention of white-collar crime. Int J Inf Manage 31:226–233
Govil J, Govil J (2007) Ramifications of cyber crime and suggestive preventive measures.
Electro/information technology. Chicago, pp 610–615. IEEE. doi:10.1109/EIT.2007.4374526
Gregory M, Glance D (2013) Cyber-crime, cyber security and cyber warfare. Security and
networked society. Springer, pp 51–95. ISBN: 978-3-319-02389-2
Harvey C (2012) 50 top open source tools for big data. Retrieved 01 July 2015, from
http://www.datamation.com/data-center/50-top-open-source-tools-for-big-data-1(2,3).html
He P, Karabatis G (2012) Using semantic networks to counter cyber threats. IEEE.
doi:10.1109/ISI.2012.6284294
Hobbs Ch, Morgan M, Salisbury D (2014) Open source intelligence in the twenty-first century.
Palgrave, pp. 1–6. ISBN 978-0-230-00216-6
Hoque N, Bhuyan H, Baishya RC, Bhattacharyya DK, Kalita JKV (2014) Network attacks:
taxonomy, tools and systems. J Netw Comput Appl 40:307–324. doi:10.1016/j.jnca.2013.08.001
Iqbal F, Fung BCM, Debbabi M (2012) Mining criminal networks from chat log.
2012 IEEE/WIC/ACM international conferences on web intelligence and intelligent agent
technology. Macau, pp. 332–337. IEEE. doi:10.1109/WI-IAT.2012.68
Iqbal F, Binsalleeh H, Fung BCM, Debbabi M (2013) A unified data mining solution for
authorship analysis in anonymous textual communications. Inf Sci 231:98–112
Jahankhani H, Al-Nemrat A, Hosseinian-Far A (2014) Cybercrime classification and characteristics.
In: Akhgar B, Staniforth A, Bosco F (eds.) Cyber crime and cyber terrorism investigators’
handbook. Elsevier Inc., pp. 149–164. doi:10.1016/B978-0-12-800743-3.00012-8
Kang MJ (2012) Intelligence in the internet age: the emergence and evolution of Open Source
Intelligence (OSINT). Comput Hum Behav 28:673–682. doi:10.1016/j.chb.2011.11.014
Kim W, Jeong OR, Kim Ch, So J (2011) The dark side of the Internet: attacks, costs and responses.
Inform Syst 36:675–705
Kapow Software (2013) Retrieved from
http://www.kofax.com/go/kapow/wp-building-your-osint-capability
Katos V, Bednar PM (2008) A cyber-crime investigation framework. Comput Stand Interfaces
30:223–228. doi:10.1016/j.csi.2007.10.003
Koops BJ, Hoepman JH, Leenes R (2013) Open-source intelligence and privacy by design.
Computer Law and Security Review. 2(9):676–688
Kshetri N (2005) Pattern of global cyber war and crime: a conceptual framework. J Int Manage
11:541–562
Koester B, Schmidt SB (2009) Information superiority via formal concept analysis. In:
Argamon S, Howard N (eds) Computational methods for counterterrorism. Springer,
pp. 143–171. doi:10.1007/978-3-642-01141-2_9
Kock Wiil U, Gniadek J, Memon N (2011) Retraction note to: a novel method to analyze the
importance of links in terrorist networks. In: Wiil UK (ed) Counterterrorism and open source
intelligence. Springer Vienna, p. E1. doi:10.1007/978-3-7091-0388-3_22
Lagazio M, Sherif N, Cushman M (2015) A multi-level approach to understanding the impact of
cyber crime on the financial sector. Comput Secur 45:58–74
Li J, Li N, Wang X, Yu T (2009) Denial of service attacks and defenses in decentralized trust
management. Int J Inf Secur 8:89–101. Springer
Lindelauf R, Borm P, Hamers H (2011) Understanding terrorist network topologies and their
resilience against disruption. In: Kock Wiil U (ed.) Counterterrorism and open source
intelligence. Springer, Vienna, pp 61–72. doi:10.1007/978-3-7091-0388-3_5
Loshin D (2015) How big data analytics tools can help your organization. Retrieved from
http://searchbusinessanalytics.techtarget.com/feature/How-big-data-analytics-tools-can-help-your-organization
Matusitz J (2009) A postmodern theory of cyberterrorism: game theory. Inform Secur J: Glob
Perspect 18:273–281. Taylor and Francis. doi:10.1080/19393550903200474
Mukhopadhyay A, Chatterjee S, Saha D, Mahanti A, Sadhukhan SK (2013) Cyber-risk decision
models: To insure IT or not? Decis Support Syst 56:11–26. Retrieved from
http://dx.doi.org/10.1016/j.dss.2013.04.004
Nykodym N, Taylor R, Vilela J (2005) Criminal profiling and insider cyber crime. Digital Invest
2:261–267. Elsevier
Omand D, Miller C, Bartlett J (2014) Towards the discipline of social media intelligence.
In: Hobbs, Morgan, Salisbury (eds.) Open source intelligence in the twenty-first century.
Palgrave, pp. 24–44. ISBN 978-0-230-00216-6
Petersen RR, Rhodes CJ, Kock Wiil U (2011) Node removal in criminal networks. 2011 European
intelligence and security informatics conference. IEEE Computer Society, pp. 360–365.
PWC cyber security (2015) https://www.pwc.com/us/en/increasing-it-effectiveness/publications/
assets/2015-us-cybercrime-survey.pdf. Retrieved from http://www.pwc.com/cybersecurity
Simmons C, Ellis C, Shiva S, Dasgupta D, Wu Q (2014) AVOIDIT: a cyber attack taxonomy.
Annual symposium on information assurance. Office of Naval Research (ONR).
Song J (2011) The analysis of military intelligence early warning based on open source
intelligence. Int Conf Intell Secur Inform (ISI). p. 226. IEEE
Spyridopoulos T, Karanikas G, Tryfonas T, Oikonomou G (2013) A game theoretic defence
framework against DoS/DDoS cyber attacks. Comput Secur 38:39–50
Staniforth A (2014) Police investigation processes: practical tools and techniques for tackling
cyber crime. In: Akhgar B (ed.) Cyber crime and cyber terrorism investigator’s handbook.
Elsevier, pp. 31–42
Srihari RK (2009) Unapparent information revelation: text mining for counterterrorism. In:
Argamon S, Howard N (eds) Computational methods for counterterrorism. Springer, Berlin
Heidelberg, pp 67–87
Steele RD (2006) Open source intelligence. In Johnson LK (ed.) Strategic intelligence:
understanding the hidden side of government (intelligence and the quest for security).
Praeger, pp. 95–116
Sui D, Cavarlee J, Rudesill D (2015) The deep web and the darknet: a look inside the internet’s
massive black box. Wilson Center, Washington
Szomszor M, Moreau L (2003) Recording and reasoning over data provenance in web and grid
services. On the move to meaningful internet systems, pp. 603–620.
Tilmes C, Yesha Ye, Halem M (2010) Distinguishing provenance equivalence of earth science
data. Int Conf Comput Sci (ICCS). p. 1–9
Vitolo C, Elkhatib Y, Reusser D, Macleod CJA, Buytaert W (2015) Web technologies for
environmental Big Data. Environ Model Softw 63:185–198
Wall DS (2005) The internet as a conduit for criminal activity. In: Pattavina A (ed) Information
technology and the criminal justice system. Sage Publications, USA. ISBN 0-7619-3019-1
Wall DS (2007) Hunting shooting, and phishing: new cybercrime challenges for cybercanadians in
the 21st century. The ECCLES centre for american studies
Wall DS (2008) Hunting shooting, and phishing: new cybercrime challenges for cyber canadians
in the 21st Century. The Eccles Centre for American Studies. www.bl.uk/ecclescentre. The
British Library Publication
Wang SJ (2007) Measures of retaining digital evidence to prosecute computer-based cyber-crimes.
Comput Stand Interfaces 29:216–223. Elsevier
Webopedia.com. (n.d.). Webopedia.com


Wu J, Zhang L, Qu S (2010) A comparative study for fast-flux service networks detection. Netw
Comput Adv Inf Manage (NCM). pp 346–350. IEEE
Yuan T, Chen P (2012) Data mining applications in E-Government information security, 2012
international workshop on information and electronics engineering (IWIEE). Proc Eng 29:235–240
