FRA Assignment: (Type The Company Name)
FRA Assignment
Logistic regression model
Finance and Risk Analytics - Assignment
Problem Statement
Create an India credit risk (default) model using the data provided in the spreadsheet raw-data.xlsx,
and validate it on validation_data.xlsx. Use the logistic regression framework to develop the
credit default model.
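The report does not show how the logistic regression was fitted, so the following is only a minimal sketch of the framework itself: a logistic (sigmoid) link fitted by batch gradient descent on the log-loss. The function names, learning rate, epoch count and toy data are all illustrative assumptions, not the settings used for this assignment.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """Probability of default for one row x; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit intercept and coefficients by batch gradient descent on log-loss.

    lr and epochs are assumed values chosen for the toy data below.
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Toy data (made up): one profitability ratio per company, 1 = Defaulter.
X = [[0.20], [0.15], [0.10], [0.05], [-0.05], [-0.10], [-0.15], [-0.20]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w = fit_logistic(X, y)
print("intercept, coefficient:", w)
print("P(default | ratio = -0.20):", predict_proba(w, [-0.20]))
```

In practice a statistics package would also report standard errors and p-values for each coefficient, which this sketch does not compute.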
Data Insights
The data set contains the following financial variables:
Net worth next year, Total assets, Net worth, Total income, Total expenses, Profit after tax, PBDITA,
PBT (Profit Before Tax), Cash profit, PBDITA as % of total income, PBT as % of total income, Cash
profit as % of total income, PAT as % of net worth, Sales, Total capital, Reserves and funds,
Borrowings, Current liabilities & provisions, Capital employed, Net fixed assets, Investments, Net
working capital, Debt to equity ratio (times), Cash to current liabilities (times), Total liabilities.
In addition to the above variables, there are other financial parameters that describe the financial
strength of the organization, bringing the total number of variables to 51.
Data Preparation
Companies were classified as probable Defaulters or Non-Defaulters based on Net worth next
year. Companies with a negative net worth next year were classified as Defaulters (marked as 1) and
the rest as Non-Defaulters (marked as 0).
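The labelling rule above can be sketched in a few lines. The company identifiers and net worth figures are made up for illustration, and `label_default` is a hypothetical helper name.

```python
# Hypothetical sample: company id -> Net worth next year (made-up numbers).
net_worth_next_year = {"A": 120.5, "B": -35.0, "C": 0.0, "D": -0.1}

def label_default(net_worth):
    """Return 1 (Defaulter) for a negative net worth next year, else 0."""
    return 1 if net_worth < 0 else 0

default_flag = {name: label_default(v) for name, v in net_worth_next_year.items()}
print(default_flag)
```

Note that a net worth of exactly zero is treated as Non-Defaulter here, following the "negative" wording of the rule.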
Calculated Min, Max, Mean, Standard Deviation, Median and percentiles (1st to 4th and 97th to 99th).
Based on data understanding, identified a floor and a cap for each variable (Min / Max or a percentile).
Data imputation: transformed the data by replacing values below the floor with the floor and values
above the cap with the cap.
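The floor-and-cap step can be illustrated as follows. The report does not say which percentile was chosen for each variable, so the 1st and 99th percentiles used here are assumptions, as are the helper names and the sample data. The percentile uses linear interpolation between order statistics, which may differ slightly from the spreadsheet's own percentile function.

```python
def percentile(values, p):
    """Linear-interpolation percentile of values, with p in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def floor_and_cap(values, low_p=1, high_p=99):
    """Replace values below the floor / above the cap with the floor / cap.

    low_p and high_p are assumed defaults, not the report's actual choices.
    """
    floor = percentile(values, low_p)
    cap = percentile(values, high_p)
    return [min(max(v, floor), cap) for v in values]

# Made-up variable with one extreme value at each end.
raw = [-500.0] + [float(i) for i in range(1, 99)] + [10000.0]
treated = floor_and_cap(raw)
print("before:", min(raw), max(raw))
print("after: ", min(treated), max(treated))
```

Only the two extreme observations change; every value between the floor and the cap is left as it was.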
Created new variables to be used for model building. Each new variable is a critical variable divided
by Total assets.
New variables created
Net worth / Total assets, Total income / Total assets, Total expenses / Total assets, Profit after tax /
Total assets, PBT / Total assets, Sales / Total assets, Current liabilities & provisions / Total assets,
Capital employed / Total assets, Net fixed assets / Total assets, Investments / Total assets, Total
liabilities / Total assets
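As a sketch of how the ratio variables listed above could be derived for one company record: the record layout (a plain dictionary keyed by variable name), the helper name `add_asset_ratios`, and the choice to leave a ratio empty when Total assets is zero or a value is missing are all assumptions for illustration.

```python
# Numerators taken from the list of new variables in this report.
RATIO_NUMERATORS = [
    "Net worth", "Total income", "Total expenses", "Profit after tax", "PBT",
    "Sales", "Current liabilities & provisions", "Capital employed",
    "Net fixed assets", "Investments", "Total liabilities",
]

def add_asset_ratios(row):
    """Return a copy of row with '<variable> / Total assets' entries added."""
    out = dict(row)
    assets = row.get("Total assets")
    for name in RATIO_NUMERATORS:
        value = row.get(name)
        key = f"{name} / Total assets"
        # Assumption: missing inputs or zero assets yield None, not an error.
        out[key] = None if value is None or not assets else value / assets
    return out

# Made-up company record with only a few variables filled in.
company = {"Total assets": 200.0, "Net worth": 80.0, "Sales": 150.0}
ratios = add_asset_ratios(company)
print(ratios["Net worth / Total assets"], ratios["Sales / Total assets"])
```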
Calculated the mean of each variable separately for Non-Default and Default companies and then
derived the ratio of the two means. Variables with a ratio > 3 or < 1/3 were shortlisted for building
the model.
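The screening rule can be sketched as below. The report does not state which mean is the numerator; because the thresholds 3 and 1/3 are reciprocal, the shortlist is the same either way, and this sketch assumes Non-Default mean divided by Default mean. The column values are made up, and a zero Default mean would need separate handling that is not shown.

```python
from statistics import mean

def mean_ratio(values, labels):
    """Mean over Non-Default rows (label 0) divided by mean over Default rows (label 1)."""
    non_default = [v for v, y in zip(values, labels) if y == 0]
    default = [v for v, y in zip(values, labels) if y == 1]
    return mean(non_default) / mean(default)

def shortlist(columns, labels, high=3.0, low=1.0 / 3.0):
    """Keep variables whose mean ratio is above `high` or below `low`."""
    picked = []
    for name, values in columns.items():
        r = mean_ratio(values, labels)
        if r > high or r < low:
            picked.append(name)
    return picked

# Made-up data: three Non-Default companies followed by two Default companies.
labels = [0, 0, 0, 1, 1]
columns = {
    "PBT / Total assets": [0.10, 0.12, 0.08, 0.02, 0.04],
    "Sales / Total assets": [1.0, 1.2, 0.8, 0.9, 1.1],
}
print(shortlist(columns, labels))
```

In this toy data the PBT ratio differs strongly between the two groups and is kept, while the Sales ratio has nearly equal group means and is dropped.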
The Microsoft Excel workbook in which the calculations were made and the new transformed
variables were created is attached below.
3-PBDITA (Transfrmd), 4-PBT (Transfrmd), 9-Sales (Transfrmd), 14. Default
The data was partitioned so that model accuracy could be tested on a held-out partition before
validating the model on validation_data.xlsx.
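A random partition of this kind can be sketched as follows. The split proportion is not given in the report, so the 70/30 split, the fixed seed and the helper name `partition` are assumptions.

```python
import random

def partition(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and split it into (training, test) lists.

    train_fraction and seed are assumed values; the report does not state them.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Made-up row identifiers standing in for company records.
rows = list(range(10))
train, test = partition(rows)
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, so accuracy figures can be regenerated on the same partitions.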
Model building flow
Data Partitioning
Model Coefficients and Statistics
3-PBDITA (Transfrmd), 4-PBT (Transfrmd), 9-Sales (Transfrmd), 14. Default
Confusion Matrix
Model Stats
Measure                      Value
Sensitivity                  0.956
Specificity                  0.691
Precision                    0.986
Negative Predictive Value    0.397
False Positive Rate          0.31
False Discovery Rate         0.014
False Negative Rate          0.045
Accuracy                     0.945
Measure                      Value
Sensitivity                  0.9649
Specificity                  0.6667
Precision                    0.9693
Negative Predictive Value    0.6349
Accuracy                     0.9399
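For reference, the measures reported above are all derived from the four cells of a confusion matrix. The sketch below shows the standard definitions. The counts are made up, since the underlying confusion matrices are not reproduced in this text, and which class is treated as "positive" is not stated in the report.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures from true/false positive/negative counts."""
    return {
        "Sensitivity": tp / (tp + fn),
        "Specificity": tn / (tn + fp),
        "Precision": tp / (tp + fp),
        "Negative Predictive Value": tn / (tn + fn),
        "False Positive Rate": fp / (fp + tn),
        "False Discovery Rate": fp / (fp + tp),
        "False Negative Rate": fn / (fn + tp),
        "Accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only.
stats = classification_metrics(tp=90, fp=20, tn=40, fn=10)
for measure, value in stats.items():
    print(f"{measure:<28}{value:.4f}")
```

Three pairs are complements of each other: Sensitivity and False Negative Rate, Specificity and False Positive Rate, and Precision and False Discovery Rate each sum to 1, which is a quick consistency check on any reported table.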