Statistics: Presented By: Dr. Danica Ommen
5/13/2020 forensicstats.org
What is the “Statistics” group discussion about?
CSAFE 1.0 Accomplishments
Major Accomplishments:
• Convened an international meeting on issues related to assessing forensic databases and
their use in casework, held at CMU in Pittsburgh.
• Framework for evaluating the significance of a forensic comparison published in
Significance, highlighting:
• What data are needed for evaluation of a match with respect to a population of interest.
• The need for a database to assess the probability of seeing similar results by chance, either to
develop a random match probability or estimate the denominator in a likelihood ratio.
• Lack of availability of relevant databases in many pattern disciplines
• PhD dissertation (Tackett, UVA) on Bayesian statistical methods for analyzing fingerprint
database searches
Impact:
• Deepened understanding of need for publicly available databases.
CSAFE 2.0 Statistics Projects and Lead
Investigators
STAT I- Statistical Methods to Assess Reliability, Reproducibility, Accuracy of Categorical
Forensic Opinions
• Lead PI: Hal Stern, UCI
STAT II- Validation and Reliability of Score-Based Likelihood Ratios (SLRs) for Forensic Evidence
• Lead PI: Danica Ommen, ISU
STAT III- Machine Learning Methods for Dependent Score-Data Resulting From Forensic
Evidence Comparisons
• Lead PI: Danica Ommen, ISU
Research Area Objectives
GOAL:
Critically examine statistical issues related to methods for analyzing pattern and digital evidence
that are relevant across CSAFE project areas
• Firearms
• Footwear
• Handwriting
• Digital
(Note: The issues considered here are motivated primarily by statistical tools developed
during the initial CSAFE funding period.)
Projects focus on issues at three different “levels” of the forensic evidence evaluation process
• Framework for evidence interpretation (score-based likelihood ratios)
• Methods of assessing (dis)similarity (inference for machine learning techniques)
• Expressing and assessing conclusions (reliability / validity of categorical conclusions)
Background
• Likelihood ratio (LR) as a summary of the evidence:
LR = Pr(E | Hs) / Pr(E | Hd)
• The numerator assesses how likely the evidence is if Hs is true
• The denominator assesses how likely the evidence is if Hd is true
• Likelihood ratios are challenging to apply with pattern / digital evidence
• Mathematical representation of the evidence (often an image) is very high-dimensional
• Developing probability models for E is challenging
• Score-based likelihood ratios (SLR) have been proposed by some
• Replace the evidence E by a “score” S summarizing differences/similarities of the two
samples
• SLR = Pr(S | Hs) / Pr(S | Hd)
• Several CSAFE 1.0 projects developed a score by using machine learning
algorithms to distinguish between known match pairs and known non-match
pairs; the estimated probability of being a match is a possible score
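The SLR defined above can be sketched numerically. The example below is a minimal illustration, not a CSAFE method: it assumes scores from known match pairs stand in for Hs and scores from known non-match pairs stand in for Hd, estimates each score density with a hand-written Gaussian kernel density estimator, and takes the ratio at an observed score. The synthetic Beta-distributed scores, the bandwidth of 0.05, and the function names gaussian_kde and score_based_lr are all hypothetical choices made for illustration.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centred on each observed score.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def score_based_lr(score, match_scores, nonmatch_scores, bandwidth=0.05):
    """Estimate SLR = Pr(S | Hs) / Pr(S | Hd) at the observed score.

    Assumption: known match pairs represent Hs, known non-match pairs Hd.
    """
    numerator = gaussian_kde(match_scores, bandwidth)(score)
    denominator = gaussian_kde(nonmatch_scores, bandwidth)(score)
    return numerator / denominator


# Synthetic scores in [0, 1], e.g. an estimated probability of being a match.
# These are simulated for illustration only, not real forensic data.
rng = random.Random(0)
match_scores = [rng.betavariate(8, 2) for _ in range(500)]
nonmatch_scores = [rng.betavariate(2, 8) for _ in range(500)]

slr_high = score_based_lr(0.7, match_scores, nonmatch_scores)
slr_low = score_based_lr(0.3, match_scores, nonmatch_scores)
print(f"SLR at score 0.7: {slr_high:.2f}")
print(f"SLR at score 0.3: {slr_low:.4f}")
```

A value above 1 supports Hs over Hd and a value below 1 supports Hd over Hs. The estimate depends on the bandwidth and on how the reference pairs were sampled, which is one reason the validation and reliability of SLRs need study.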
CSAFE 2.0: STAT II
Proposed Activities:
• Explore the strengths and weaknesses of SLRs for quantifying the value of evidence
• Develop framework of evidence interpretation which exploits the strengths of SLRs for pattern and digital
evidence
Potential Impact:
• List of recognized strengths and weaknesses, with supporting statistical arguments, of SLRs
• Framework for expressing conclusions regarding SLR results
• Applicable to a wide range of impression and pattern evidence types (e.g., CSAFE 1.0 projects in footwear and
firearms)
CSAFE 2.0: STAT III
Proposed Activities:
• Explore the extent to which violating the independence assumption affects the performance of ML methods
• Develop ML methods for evaluating comparison scores that accommodate/adjust for the dependency in the data
Potential Impact:
• Provide statistically rigorous methods of computing SLRs for a variety of evidence types
• Critically evaluate CSAFE methods for potential areas of correction/improvement before deploying methods in labs
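One way to see why comparison scores can violate the independence assumption is a small simulation. The sketch below assumes a simple hypothetical model, not one taken from the CSAFE projects: each item has its own random effect, and the score for a pair is the sum of the two item effects plus independent noise. Under that assumption, two scores that share an item are correlated, while two scores with no item in common are not. The function names pearson and simulate_pair_scores are illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate_pair_scores(rng, n_items=4):
    """Score every pair of items as item effects plus independent noise.

    Hypothetical model: score(i, j) = effect_i + effect_j + noise.
    """
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    return {
        (i, j): effects[i] + effects[j] + rng.gauss(0.0, 1.0)
        for i in range(n_items)
        for j in range(i + 1, n_items)
    }


rng = random.Random(1)
runs = [simulate_pair_scores(rng) for _ in range(2000)]

s01 = [r[(0, 1)] for r in runs]  # pair (0, 1)
s02 = [r[(0, 2)] for r in runs]  # shares item 0 with pair (0, 1)
s23 = [r[(2, 3)] for r in runs]  # shares no item with pair (0, 1)

corr_shared = pearson(s01, s02)
corr_disjoint = pearson(s01, s23)
print(f"correlation, pairs sharing an item: {corr_shared:.3f}")
print(f"correlation, disjoint pairs:        {corr_disjoint:.3f}")
```

In this toy model the theoretical correlation between scores that share an item is 1/3, so treating all pairwise scores as independent training examples overstates the amount of information in the data.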
CSAFE 2.0: STAT I
Proposed Activities:
• Develop statistical approaches to the analysis of data from reliability (repeatability and reproducibility) studies
for categorical conclusions in forensic science.
• Develop statistical approaches to the analysis of validity studies for categorical conclusions in forensic science.
• Apply the statistical approaches to available data from the pattern disciplines. Data can include publicly available
data from the FBI fingerprint study (and perhaps additional data from new studies of shoeprints, handwriting).
Potential Impact:
• Most studies of methods have focused on outcomes of binary decisions
• Statistical approaches to assessing reliability and validity for categorical scales will be important in
understanding their properties.
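As a point of reference for what a reproducibility analysis of categorical conclusions might compute, the sketch below calculates Cohen's kappa, a standard chance-corrected agreement measure, for two examiners rating the same comparisons. It is a swapped-in textbook statistic used for illustration, not the approach proposed in this project. The three-level scale (ID, INC, EXC) and the ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of comparisons on which the raters agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_expected = sum(
        (counts_a[c] / n) * (counts_b[c] / n) for c in counts_a.keys() | counts_b.keys()
    )
    if p_expected == 1.0:
        # Both raters used a single identical category; treat as full agreement.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical conclusions on ten comparisons (not real study data).
# ID = identification, INC = inconclusive, EXC = exclusion.
examiner_a = ["ID", "ID", "ID", "INC", "INC", "EXC", "EXC", "EXC", "EXC", "ID"]
examiner_b = ["ID", "ID", "INC", "INC", "EXC", "EXC", "EXC", "EXC", "INC", "ID"]

kappa = cohens_kappa(examiner_a, examiner_b)
print(f"Cohen's kappa: {kappa:.3f}")
```

Kappa treats the categories as unordered, so a disagreement between ID and EXC counts the same as one between ID and INC. That limitation is one motivation for methods designed specifically for ordered categorical scales.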
Resources and Needs
Impact Summary:
• Our results are applicable to a wide range of evidence types
• Statistically sound methods of expressing evidential value
• Better methods for communicating results
• Support for decisions/conclusions