Design of Credit Model Design IIM Fintech Abrg
Models
November, 2023
Beware of the Biases in Decision Making: They can creep into the analytical process via data, ...
• Most common biases and how they creep into organizations and analytics
– Availability: What is easily recalled or data most easily available must be the one most relevant!!
– Confirmation: Looking for data or selectively interpreting evidence; often confused with 'business knowledge'
– Anchoring: An initial value or interpretation persuades future analytical findings to revolve around the past findings.
– Representativeness: Judging by the degree to which characteristics conform to a stereotypical perception of members of that group. Ex: ...
What is data?
Data
• "Raw" observations
• Structured or unstructured
• Limited meaning of its own
• Unprocessed or processed
What is information?
What is knowledge?
Knowledge
• Know well to predict
• Validate understanding from previous relationship
[Figure label: Predicted value]
Context Setting
Framing
– It is essential to define the problem in pure business terms
– Does it require a model to solve the problem?
– Can the problem be solved without building a model?
Context Setting
Mapping the nature of the problem to the algorithm: The choice ranges from pure econometric models to higher-order ANNs; it is a tradeoff between predictability, stability and explainability.
High Value vs Low Value: There is a tendency to go manual/purely judgmental for very high-value decisions; human bias then affects the decision, and decision quality is not rigorously tested. Low-value decisions are more likely to be automated. Bionic (insight-informed) decisions are more optimal.
Decisive vs Interactive: Does the agent interact with the model to fine-tune results (ex: Google search), or does the model return its output after running an algorithm (ex: pattern matching)?
Cost of Wrong Decision: Netflix presenting a movie you do not like vs giving away the wrong loan, or approving a transaction which should have been tagged as 'money laundering' activity.
Element of a risk model design
Improving the risk decision quality
• Exactly how will the risk score be used?
• What is the data, tech & org ability?
• Choosing the model with highest business relevance
• Ensuring adoption
# Variables: 50-5000 to 10-50
Step 1: Missing value treatment
  Activity
  • Diagnose nature of missing ...
  Statistical tool: Imputation
  • Depends on the % of missing ...
Step 2: Trend inspection & transformation of variables
  Activity
  • Identify the intuitively trending ...
  Statistical tool: Bivariate plots
  • PD trends plotted for each ...
Step 3: Identify similarly behaving variables – treat multicollinearity
  Activity
  • Variables that make same prediction and ...
  Statistical tool: Correlation matrix
  • To identify similarly performing ...
Step 4: Variable selection
  Activity
  • Most significant variables post the ...
  Statistical tool: Predictive power – IV value
  • Variable with highest predictive ...
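The "Predictive power – IV value" step can be illustrated with a minimal sketch of the standard Information Value formula, IV = sum over bins of (share of goods − share of bads) × ln(share of goods / share of bads). The bin counts, the variable names and the 0.5 floor for empty bins are assumptions made for illustration; they are not taken from the slides.

```python
import math


def information_value(bins):
    """Information Value of one binned variable.

    bins: list of (good_count, bad_count) tuples, one per bin.
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    iv = 0.0
    for good, bad in bins:
        # Assumption: floor empty cells at 0.5 so the log stays defined.
        dist_good = max(good, 0.5) / total_good
        dist_bad = max(bad, 0.5) / total_bad
        woe = math.log(dist_good / dist_bad)  # weight of evidence of the bin
        iv += (dist_good - dist_bad) * woe
    return iv


# Hypothetical counts: (goods, bads) per bin.
income_bins = [(100, 40), (300, 40), (600, 20)]  # bad share varies by bin
flat_bins = [(100, 10), (300, 30), (600, 60)]    # same bad share in every bin

for name, bins in [("income", income_bins), ("flat", flat_bins)]:
    print(name, round(information_value(bins), 3))
```

A variable whose bins all carry the same mix of goods and bads gets an IV of zero, so ranking candidate variables by IV is one way to shortlist them before the final selection.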
Cohort analysis to identify optimal default definition which has a high ...
[Figure: monthly cohort curves M1 to M5 plotted against Month on Books (MOB), MOB1 to MOB30]
• Monthly cohort plots (loans originated in the same month) are tracked and evaluated w.r.t ...
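The cohort curves described above can be sketched as a cumulative default rate per origination cohort at each month on books. This is a minimal sketch: the input layout (cohort label, MOB of first default or None), the function name and the sample loans are hypothetical, and it assumes every cohort has been observed for the full MOB window.

```python
from collections import defaultdict


def cohort_default_curves(loans, max_mob):
    """Cumulative default rate (%) per cohort for MOB 1..max_mob.

    loans: iterable of (cohort, default_mob); default_mob is the month on
    books at which the loan first met the default definition, or None.
    """
    sizes = defaultdict(int)
    defaults = defaultdict(lambda: [0] * (max_mob + 1))
    for cohort, default_mob in loans:
        sizes[cohort] += 1
        if default_mob is not None and 1 <= default_mob <= max_mob:
            defaults[cohort][default_mob] += 1

    curves = {}
    for cohort, size in sizes.items():
        running = 0
        curve = []
        for mob in range(1, max_mob + 1):
            running += defaults[cohort][mob]  # defaults accumulate over time
            curve.append(100.0 * running / size)
        curves[cohort] = curve
    return curves


# Hypothetical loans: two origination cohorts, a few defaults.
sample_loans = [
    ("M1", 3), ("M1", None), ("M1", 6), ("M1", None),
    ("M2", 2), ("M2", None),
]
for cohort, curve in cohort_default_curves(sample_loans, max_mob=6).items():
    print(cohort, curve)
```

Plotting one curve per cohort against MOB shows where the curves flatten, which is one input when comparing candidate default definitions and performance windows.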
Segmentation
Model calibration
[Figure: three panels, each plotting Default rate]
Calibration challenges
• Overpredicted default
• Underpredicted default
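One simple way to see over- and underprediction is to bucket accounts by predicted PD and compare the mean prediction with the observed default rate in each bucket. The sketch below assumes equal-sized buckets and a hypothetical tolerance of 0.01; the function name and sample data are illustrative only.

```python
def calibration_table(predicted_pd, defaulted, n_buckets=5, tolerance=0.01):
    """Compare mean predicted PD with observed default rate per bucket.

    predicted_pd: predicted probabilities of default.
    defaulted: 0/1 outcomes aligned with predicted_pd.
    Returns a list of (mean_pd, observed_rate, status) from low to high PD.
    """
    pairs = sorted(zip(predicted_pd, defaulted))
    n = len(pairs)
    rows = []
    for k in range(n_buckets):
        chunk = pairs[k * n // n_buckets:(k + 1) * n // n_buckets]
        if not chunk:
            continue
        mean_pd = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(d for _, d in chunk) / len(chunk)
        gap = mean_pd - observed
        if gap > tolerance:
            status = "overpredicted"
        elif gap < -tolerance:
            status = "underpredicted"
        else:
            status = "aligned"
        rows.append((mean_pd, observed, status))
    return rows


# Hypothetical portfolio: low-PD bucket defaults more than predicted,
# high-PD bucket defaults less than predicted.
pd_scores = [0.02, 0.02, 0.02, 0.02, 0.30, 0.30, 0.30, 0.30]
outcomes = [0, 0, 1, 0, 0, 0, 0, 1]
for row in calibration_table(pd_scores, outcomes, n_buckets=2):
    print(row)
```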
Validating model performance outside of the training data is critical for stakeholder confidence
Multi-dimensional model validation
a. Power: Ability of the model to differentiate between good vs bad
   • Gini
     – for accuracy across population
   • KS
b. Rank ordering: Model should be able to rank order defaults in a monotonic fashion. A high power model does not necessitate rank ordering.
   • Bivariate strength to be tested on ...
c. Stability: Power of individual variables and overall model should show reasonable stability across seasons & business cycles
d. Temporal effect: Model should show reasonable predictability outside the defined model time period
   • Temporal decay
e. Calibration: Whether the synthetic 'probability score' coming out of the model actually closely maps to the default rate
   • Hosmer-Lemeshow
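The two power measures named above, Gini and KS, can be computed from scores and outcomes as in the minimal sketch below. It assumes a higher score means higher predicted risk, uses the common definition Gini = 2 × AUC − 1, and takes KS as the maximum gap between the cumulative shares of bads and goods; the function names and sample data are hypothetical. The pairwise AUC loop is written for clarity and is not meant for large portfolios.

```python
def gini(scores, defaulted):
    """Gini = 2 * AUC - 1, with AUC from pairwise bad-vs-good comparisons."""
    bad = [s for s, d in zip(scores, defaulted) if d == 1]
    good = [s for s, d in zip(scores, defaulted) if d == 0]
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5  # ties count as half a win
    auc = wins / (len(bad) * len(good))
    return 2.0 * auc - 1.0


def ks_statistic(scores, defaulted):
    """Maximum gap between cumulative shares of bads and goods."""
    n_bad = sum(defaulted)
    n_good = len(defaulted) - n_bad
    pairs = sorted(zip(scores, defaulted), reverse=True)
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Accounts with the same score are added together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks


# Hypothetical scores (higher = riskier) and 0/1 default flags.
sample_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
sample_defaults = [1, 1, 0, 1, 0, 0, 0, 0]
print(round(gini(sample_scores, sample_defaults), 4))
print(round(ks_statistic(sample_scores, sample_defaults), 4))
```

Running the same two functions on an out-of-time sample and comparing against the development sample is one way to look at the stability and temporal-effect dimensions as well.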