Tute Exercises PDF

ACTL3162
General Insurance Techniques
Exercises
Contents
1 Introduction
  1.1 Preliminaries
  1.2 Solutions
2 Collective Risk Modelling
  2.1 Compound distributions
  2.2 Explicit claims count distributions
  2.3 Parameter estimation
  2.4 Solutions
3 Individual Claim Size Modelling
  3.1 Data analysis and descriptive statistics
  3.2 Selected parametric claims size distributions
  3.3 Model selection
  3.4 Calculating within layers for claim sizes
  3.5 Solutions
4 Approximations for Compound Distributions
  4.1 Approximations
  4.2 Algorithms for compound distributions
  4.3 Solutions
5 Ruin Theory and Premium Calculation Principles
  5.1 Ruin theory in discrete time
  5.2 Ruin theory in continuous time
  5.3 Premium Risk-based Principles
  5.4 Solutions
6 Generalised Linear Models (GLMs)
  6.1 Components of a GLM
  6.2 Deviance and Scaled Deviance
  6.3 Fit a GLM and Evaluate the quality of a model
  6.4 Solutions
7 Bayesian Models and Credibility Theory
  7.1 Preliminaries
  7.2 Exact Bayesian models
  7.3 Linear credibility estimation
  7.4 Solutions
8 Claims Reserving
  8.1 Outstanding loss liabilities
  8.2 Claims reserving algorithms
  8.3 Stochastic claims reserving methods
  8.4 Solutions
9 Game and Decision Theory
  9.1 Decision theory
  9.2 Game theory
  9.3 Solutions
Module 1
Introduction
1.1 Preliminaries
Exercise 1.1: [NLI1, Solution][Wuthrich (2014), Exercise 1]
(a) Assume $X \sim N(0, 1)$. Prove that $a + bX \sim N(a, b^2)$ for $a, b \in \mathbb{R}$.
(b) Assume that the $X_i$ are independent with $X_i \sim N(\mu_i, \sigma_i^2)$. Prove that $\sum_i X_i \sim N\left(\sum_i \mu_i, \sum_i \sigma_i^2\right)$.
(c) Assume $X \sim N(0, 1)$. Prove that $E[X^{2k+1}] = 0$ for all $k \in \mathbb{N}_0$ (natural numbers with zero).
Exercise 1.2: [NLI2, Solution][Wuthrich (2014), Exercise 2] Assume that $X_k$ has a $\chi^2$-distribution with $k \in \mathbb{N}$ degrees of freedom, i.e. $X_k$ is absolutely continuous with density
\[ f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} \exp(-x/2), \quad \text{for } x \ge 0. \]
(a) Prove that $f$ is a density (hint: see Section 3.3.3 and proof of Proposition 2.20 in Wuthrich, 2014).
(b) Prove
\[ M_{X_k}(r) = (1 - 2r)^{-k/2} \quad \text{for } r < 1/2. \]
(c) Choose $Z \sim N(0, 1)$ and prove $Z^2 \overset{d}{=} X_1$.
(d) Choose $Z_1, \ldots, Z_k \overset{i.i.d.}{\sim} N(0, 1)$. Prove $\sum_{i=1}^{k} Z_i^2 \overset{d}{=} X_k$ and calculate the first two moments of the latter.
1.2 Solutions
Solution 1.1: [NLI1, Exercise]
(a) The moment generating function of $a + bX$ is
\[ M_{a+bX}(r) = E[e^{r(a+bX)}] = e^{ra} E[e^{rbX}] = e^{ra} M_X(rb) = e^{ra + r^2 b^2/2}, \]
which is the moment generating function of $N(a, b^2)$.
(b) We can write down the moment generating function of $\sum_i X_i$ (using the assumption of independence),
\[ M_{\sum_i X_i}(r) = \prod_i M_{X_i}(r) = \prod_i \exp(r\mu_i + r^2\sigma_i^2/2) = \exp\left( r \sum_i \mu_i + r^2 \sum_i \sigma_i^2/2 \right), \]
which is the moment generating function of $N\left(\sum_i \mu_i, \sum_i \sigma_i^2\right)$.
(c) It is tempting to use the moment generating function to solve this exercise, but obtaining an explicit formula for the n-th derivative of the m.g.f. is not so easy. We resort to the old-fashioned way instead. First denote $E[X^{2k+1}] = I_{2k+1}$ for $k \in \mathbb{N}_0$. Then, for $k \ge 1$, integration by parts gives
\[ I_{2k+1} = \int_{-\infty}^{\infty} x^{2k+1} \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right) dx = \int_{-\infty}^{\infty} -x^{2k} \frac{1}{\sqrt{2\pi}} \frac{d \exp\left(-\tfrac{1}{2}x^2\right)}{dx}\, dx \]
\[ = \frac{1}{\sqrt{2\pi}} \left[ -x^{2k} \exp\left(-\tfrac{1}{2}x^2\right) \right]_{x=-\infty}^{x=\infty} + 2k \int_{-\infty}^{\infty} x^{2k-1} \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}x^2\right) dx = 2k \cdot I_{2k-1}. \]
The boundary term vanishes, and the factor $1/\sqrt{2\pi}$ is already part of $I_{2k-1}$. Iterating this recursion yields $I_{2k+1} = 2k\,(2k-2) \cdots 2 \cdot I_1$. Since $I_1 = E[X] = 0$, we have $E[X^{2k+1}] = 0$ for all $k \in \mathbb{N}_0$.
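Solution 1.2: [NLI2, Exercise]
(a) To show that $f$ is a density function, first note that $f(x) \ge 0$ for all $x \ge 0$. Secondly, $f$ is a gamma density with shape parameter $k/2$ for $k \in \mathbb{N}$ and scale parameter 1/2 (i.e. rate 1/2, so that the density is proportional to $x^{k/2-1} e^{-x/2}$), hence
\[ \int_0^{\infty} \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} \exp(-x/2)\, dx = 1. \]
(b) The moment generating function of $X_k$ is
\[ M_{X_k}(r) = \int_0^{\infty} \exp(rx) \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{k/2-1} \exp(-x/2)\, dx \]
\[ = \frac{1}{2^{k/2}\,\Gamma(k/2)} \int_0^{\infty} x^{k/2-1} \exp(-(1/2 - r)x)\, dx \]
\[ = \frac{(1/2 - r)^{-k/2}\, \Gamma(k/2)}{2^{k/2}\,\Gamma(k/2)}, \quad \text{if } 1/2 - r > 0, \]
\[ = (1 - 2r)^{-k/2}. \]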
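(c) We start with the cumulative distribution function of $Z^2$. For $x \ge 0$,
\[ F(x) = P(Z^2 \le x) = P(-\sqrt{x} \le Z \le \sqrt{x}) = \int_{-\sqrt{x}}^{\sqrt{x}} \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}z^2\right) dz = 2 \int_0^{\sqrt{x}} \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}z^2\right) dz. \]
Let $z = y^{1/2}$, so that $dz = \tfrac{1}{2} y^{-1/2}\, dy$. Under this change of variable, as $z$ runs from 0 to $\sqrt{x}$, $y$ runs from 0 to $x$. The cumulative distribution function $F(x)$ becomes
\[ F(x) = \int_0^x \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2}y\right) y^{-1/2}\, dy = \int_0^x \frac{1}{2^{1/2}\,\Gamma(1/2)} \exp\left(-\tfrac{1}{2}y\right) y^{-1/2}\, dy, \]
using $\Gamma(1/2) = \sqrt{\pi}$. This is the cumulative distribution function of the chi-square distribution with 1 degree of freedom, i.e. of $X_1$.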
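The identity in part (c) can also be checked numerically. The following is a minimal sketch in R, not part of the original solution: it compares $P(-\sqrt{x} \le Z \le \sqrt{x})$ with the chi-square CDF on an arbitrary grid of points. The variable names and the grid are illustrative assumptions.

```r
# Numerical check of Z^2 ~ chi-square(1): compare the two CDFs on a grid.
x <- c(0.1, 0.5, 1, 2, 5, 10)          # arbitrary evaluation points (assumption)
cdf_z2   <- 2 * pnorm(sqrt(x)) - 1     # P(-sqrt(x) <= Z <= sqrt(x))
cdf_chi1 <- pchisq(x, df = 1)          # chi-square CDF with 1 degree of freedom
max_abs_diff <- max(abs(cdf_z2 - cdf_chi1))
print(data.frame(x = x, cdf_z2 = cdf_z2, cdf_chi1 = cdf_chi1))
print(max_abs_diff)                    # should be at floating-point noise level
```

A numerical agreement of this kind does not replace the proof; it only guards against algebra slips in the change of variable.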

(d) Suppose $Z_1, \ldots, Z_k \overset{i.i.d.}{\sim} N(0, 1)$. Then, by part (c), $Z_1^2, \ldots, Z_k^2$ are i.i.d. chi-square random variables with 1 degree of freedom. To prove $\sum_{i=1}^{k} Z_i^2 \overset{d}{=} X_k$, it therefore suffices to show that the sum $\sum_{i=1}^{k} X_1^{i}$ has a chi-square distribution with $k$ degrees of freedom, where $X_1^{1}, X_1^{2}, \ldots, X_1^{k}$ are i.i.d. chi-square random variables with 1 degree of freedom. This can be done using the moment generating function: for $r < 1/2$,
\[ M_{\sum_{i=1}^{k} Z_i^2}(r) = \prod_{i=1}^{k} M_{X_1^{i}}(r) = \prod_{i=1}^{k} (1 - 2r)^{-1/2} = (1 - 2r)^{-k/2}, \]
which by part (b) is the moment generating function of $X_k$.
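Exercise 1.2 (d) also asks for the first two moments of $X_k$. A short derivation, sketched here from the moment generating function of part (b) (differentiation under the expectation is assumed to be valid for $r < 1/2$):

```latex
\begin{align*}
M_{X_k}'(r)  &= k\,(1-2r)^{-k/2-1},        & E[X_k]   &= M_{X_k}'(0)  = k, \\
M_{X_k}''(r) &= k(k+2)\,(1-2r)^{-k/2-2},   & E[X_k^2] &= M_{X_k}''(0) = k(k+2), \\
             &                             & Var(X_k) &= k(k+2) - k^2 = 2k.
\end{align*}
```

The same values follow from $E[Z_i^2] = 1$ and $Var(Z_i^2) = E[Z_i^4] - 1 = 2$ together with independence.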
Module 2
Collective Risk Modelling
2.1 Compound distributions

Exercise 2.1: [los3K, Solution] [Kaas et al. (2008), Problem 2.2.2] Throw a true die and let X
denote the outcome. Then, toss a coin X times. Let Y denote the number of heads obtained.
What are the expected value and the variance of Y ?
Exercise 2.2: [los7, Solution] Let the sum S = X1 + X2 + X3 with X1 , X2 and X3 distributed
as follows:
x    f1(x)   f2(x)   f3(x)
0    0.2     -       0.5
1    0.3     -       0.5
2    0.5     p       -
3    -       1-p     -
4    -       -       -
where 0 < p < 1. You are also given that FS (4) = 0.43. Calculate the value of p.
Exercise 2.3: [los8, Solution] Consider the following three distributions:
x fX1 (x) fX2 (x) fX3 (x)

0 0.1 0.0 0.30
1 0.2 0.2 0.10
2 0.2 0.0 0.05
3 0.2 0.3 0.30
4 0.2 0.4 0.15
5 0.1 0.1 0.10
Calculate
1. The distribution of X1 + X2
2. The distribution of X1 + X2 + X3
Exercise 2.4: [los11, Solution] An insurance portfolio has the following characteristics:
Distribution of N        Distribution of X
n    P(N = n)            x    p(x)
0    0.4                 1    0.6
1    0.3                 2    0.4
2    0.3
Number of claims, N, and claim amounts, X, are mutually independent. Compute the conditional probability that the average claim size exceeds the expected claim size, given that there is at least one claim.
Exercise 2.5: [los12, Solution] Aggregate claims are
S = X1 + X2 + · · · + XN
where:
• Xi has a uniform distribution on (1, 5), i = 1, 2, ...;
• N has the distribution
n P (N = n)
0 0.3
1 0.2
2 0.5
• N , X1 , X2 , . . . are mutually independent.
Determine P (S < 4).
Exercise 2.6: [los13K, Solution] [Kaas et al. (2008), Problem 2.3.1] Calculate Pr[S = s] for
s = 0, 1, . . . , 6 when S = X1 + 2X2 + 3X3 and Xj ∼ Poisson(j).
Exercise 2.7: [los9R, Solution] [R∗ ]
1. Use R to perform the convolutions required in Exercise 2.3, parts 1 and 2.
2. Generalise your code so that it can calculate the distribution of α1 X1 + α2 X2 + α3 X3 as a function of the αi and fXi (x), i = 1, 2, 3, where the αi ’s are non-negative integers and the Xi , i = 1, 2, 3, have any non-negative range. You should not have to change your program if the distributions or the αi ’s change.
To check your program, try to run it with α1 = 3, α2 = 0 and α3 = 2, as well as the
same probabilities as in Exercise 2.3, except for fX1 (5) = fX1 (6) = 0.05. Here is the
distribution you should get:
x Solution
1 0 0.0300
2 1 0.0000
3 2 0.0100
4 3 0.0600
5 4 0.0050
6 5 0.0200
7 6 0.0900
8 7 0.0100
9 8 0.0350
10 9 0.1200
11 10 0.0200
12 11 0.0500
13 12 0.1200
14 13 0.0300
15 14 0.0500
16 15 0.0750
17 16 0.0300
18 17 0.0350
19 18 0.0750
20 19 0.0225
21 20 0.0350
22 21 0.0150
23 22 0.0225
24 23 0.0075
25 24 0.0150
26 25 0.0050
27 26 0.0075
28 27 0.0000
29 28 0.0050
Exercise 2.8: [los21R, Solution][R∗ ] Consider Exercise 2.6 above. Can you use the R code
developed in Exercise 2.7 to get the distribution of S? Why?
Exercise 2.9: [los22R, Solution][R∗ ] Implement the formula
\[ f_S(x) = \sum_{n=0}^{\infty} p^{*n}(x) \Pr[N = n] \]
in R as a function of the distributions of N and X. What are the conditions for this to be
feasible? Print the pmf and df of S, as well as the descriptive statistics of S. Try to make your
code as efficient as possible.
Check your code with Bowers et al. (1997), Example 12.2.2.
[Properties of compound Poisson random variables]
Exercise 2.10: [los23R, Solution][R∗ ] Let S ∼ compound Poisson(λ, p(xi ) = πi ), i = 1, . . . , m.

Use Theorem 12.4.2 to develop a program in R that calculates and displays fS (x) and FS (x)
for x ≤ s, as a function of the inputs s, λ and πi , i = 1, . . . , m.
Check your results with Exercise 1.12 and plot fS (x) for x ≤ 36. You should get:
x f_S F_S
1 0 0.0024787522 0.002478752
2 1 0.0024787522 0.004957504
3 2 0.0061968804 0.011154385
4 3 0.0128068862 0.023961271
5 4 0.0149757944 0.038937065
6 5 0.0243950527 0.063332118
7 6 0.0332600344 0.096592153
[...]
37 36 0.0004111363 0.999152382
[Figure: plot of results$f_S (vertical axis, from 0.00 to 0.06) against results$x (horizontal axis, from 0 to 35).]
2.2 Explicit claims count distributions

Exercise 2.11: [los10, Solution] Calculate E[S], V ar(S) and mS (t) in case N has the following
distribution:
1. Poisson(λ)
2. binomial(n, p)
3. negative binomial(r, p)
Exercise 2.12: [NLI2.7, Solution][Wuthrich (2014), Corollary 2.7] Assume $S_1, \ldots, S_n$ are independent with $S_j \sim \text{CompBinom}(v_j, p, G)$ for all $j = 1, \ldots, n$. Show that the aggregated claim has a compound binomial distribution with
\[ S = \sum_{j=1}^{n} S_j \sim \text{CompBinom}\left( \sum_{j=1}^{n} v_j,\; p,\; G \right). \]
Exercise 2.13: [los14, Solution] Let S ∼ compound Poisson(λ = 2, p(x) = x/10), x =

1, 2, 3, 4. Calculate Pr[S = s] for s ≤ 4:
1. using the basic convolution method for compound distributions;

2. using the sparse vector algorithm.
Exercise 2.14: [los15K, Solution] [Kaas et al. (2008), Problem 3.5.8] Assume that S1 is compound Poisson distributed with parameter λ = 2 and claim sizes p(1) = p(3) = 1/2. Let S2 = S1 + N , where N is Poisson(1) distributed and independent of S1 . Determine the mgf of S2 . What is the corresponding distribution? Determine Pr[S2 ≤ 2.4]. Leave the powers of e unevaluated.
Exercise 2.15: [los16, Solution] You are given S = S1 + S2 , where S1 and S2 are independent
and have compound Poisson distributions with λ1 = 3 and λ2 = 2 and individual claim amount
distributions:
x p1 (x) p2 (x)
1 0.25 0.10
2 0.75 0.40
3 0.00 0.40
4 0.00 0.10
Determine the mean and variance of the individual claim amount for S.
Exercise 2.16: [los17K, Solution] [Kaas et al. (2008), Problem 3.4.3] Assume that S1 is compound Poisson with λ1 = 4 and claims p1 (j) = 1/4, j = 0, 1, 2, 3, and S2 is also compound Poisson with λ2 = 2 and p2 (j) = 1/2, j = 2, 4. If S1 and S2 are independent, then what is the distribution of S1 + S2 ?
Exercise 2.17: [los18, Solution] Suppose that S has a compound Poisson distribution with
parameter λ and with discrete claims distribution
p (xi ) = P (X = xi ) = πi , for i = 1, 2, ..., m.
Now suppose S can be written as
\[ S = \sum_{i=1}^{m} x_i N_i \]
where Ni denotes the frequency of the claim amount xi for i = 1, 2, ..., m.
1. Prove that N1 , N2 , ..., Nm are mutually independent.
2. Prove that each Ni has a Poisson distribution and find its parameter.
[In other words, prove A Theorem 12.4.2.]
Exercise 2.18: [los19, Solution] Suppose that the number of accidents incurred by an insured
driver in a single year has a Poisson distribution with parameter λ. If an accident happens, the
probability is p that the damage amount will exceed a deductible (or excess) amount. On the
assumption that the number of accidents is independent of the severity of the accidents, derive
the distribution of the number of accidents that result in a claim payment.
Exercise 2.19: [sur1K, Solution] [Kaas et al. (2008), Problem 4.2.2] Let {N (t), t ≥ 0} be a
Poisson process with parameter λ, and let $p_n(t) = \Pr[N(t) = n]$ and $p_{-1}(t) \equiv 0$. Show that
\[ p_n'(t) = -\lambda p_n(t) + \lambda p_{n-1}(t), \quad n = 0, 1, 2, \ldots, \]
and interpret these formulas by comparing pn (t) with pn (t + dt).

[Express pn (t + dt) as a function of pn (t) and interpret this relation.]
Exercise 2.20: [NLI6, Solution][Wuthrich (2014), Exercise 6] An insurance company decides

to offer a no-claims bonus to good car drivers, namely,
• a 10% discount after 3 years of no claim, and
• a 30% discount after 6 years of no claim.
How does the base premium need to be adjusted so that this no-claims bonus can be financed?
For simplicity we assume that all risks have been insured for at least 6 years. Answer the
question in the following two situations:
(a) Homogeneous portfolio with i.i.d. risks having i.i.d. Poisson claim counts with frequency
parameter λ = 0.2.
(b) Heterogeneous portfolio with independent risks being characterised by a frequency param-
eter Λ having a gamma distribution with mean λ = 0.2 and Vco(Λ) = 1 (Vco stands for
coefficient of variation). Conditionally, given Λ, the individual years have i.i.d. Poisson
claim counts with frequency parameter Λ.
2.3 Parameter estimation

Exercise 2.21: [Fit3, Solution] Observations (which are number of claims) Y1 , Y2 , ..., Yn are
independent Poisson random variables with $E(Y_i) = \mu_i$, where
\[ \log \mu_i = \begin{cases} \alpha, & \text{for } i = 1, 2, \ldots, m, \\ \alpha + \beta, & \text{for } i = m + 1, m + 2, \ldots, n. \end{cases} \]
Here we assume m < n. Derive the maximum likelihood estimates of α and β.
Exercise 2.22: [Fit4, Solution][Klugman et al. (2012), Exercise 12.52] Consider the Inverse
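Gaussian distribution with density function expressed as
\[ f_X(x) = \left( \frac{\theta}{2\pi x^3} \right)^{1/2} \exp\left[ -\frac{\theta}{2x} \left( \frac{x - \mu}{\mu} \right)^2 \right], \quad \text{for } x > 0. \]
1. Show that
\[ \sum_{j=1}^{n} \frac{(x_j - \mu)^2}{x_j} = \mu^2 \sum_{j=1}^{n} \left( \frac{1}{x_j} - \frac{1}{\bar{x}} \right) + \frac{n}{\bar{x}} (\bar{x} - \mu)^2, \]
where $\bar{x} = (1/n) \sum_{j=1}^{n} x_j$.
2. For a given sample $x_1, x_2, \ldots, x_n$, show that the maximum likelihood estimates of $\mu$ and $\theta$ are
\[ \hat{\mu} = \bar{x} \quad \text{and} \quad \hat{\theta} = \frac{n}{\sum_{j=1}^{n} \left( \dfrac{1}{x_j} - \dfrac{1}{\bar{x}} \right)}. \]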
Exercise 2.23: [Fit5, Solution] The following 20 claim amounts were observed over a period
of time:
132 149 476 147 135 110 176 107 147 165
135 117 110 111 226 108 102 108 227 102
You are interested in estimating the probability that a claim will exceed 200. You are to fit the
Pareto distribution with cumulative distribution function of the form
\[ F_X(x) = 1 - (100/x)^{\alpha} \quad \text{for } x > 100 \text{ and } \alpha > 0. \]
1. Determine the MLE of the Pareto parameter α.

2. Use this to estimate the probability that a claim will exceed 200.
Exercise 2.24: [NLI8, Solution][Wuthrich (2014), Exercise 8] Natural hazards in Switzerland

are covered by the so-called Schweizerische Elementarschaden-Pool (ES-Pool). This is a pool of private Swiss insurance companies which organises the diversification of natural hazards in Switzerland.
For the pricing of these natural hazards one distinguishes between small events and large events, the latter having a total claim amount exceeding CHF 50 million per event. The following 15 storm and flood events were observed in the years 1986-2005 (these are the events with a total claim amount exceeding CHF 50 million).
date          amount in CHF mio.     date          amount in CHF mio.
20.06.1986    52.8                   18.05.1994    78.5
18.08.1986    135.2                  18.02.1999    75.3
18.07.1987    55.9                   12.05.1999    178.3
23.08.1987    138.6                  26.12.1999    182.8
26.02.1990    122.9                  04.07.2000    54.4
21.08.1992    55.8                   13.10.2000    365.3
24.09.1993    368.2                  20.08.2005    1051.1
08.10.1993    83.8
• Fit a Pareto distribution with parameters θ = 50 and α > 0 to the observed claim sizes. Estimate parameter α using the unbiased version of the MLE.
• We introduce a maximal claims cover of M = CHF 2 billion per event, i.e. the individual claims are given by Yi ∧ M = min{Yi , M } (see also Section 3.4.2 in Wuthrich, 2014). For the yearly claim amount of storm and flood events, we assume a compound Poisson distribution with Pareto claim sizes Yi . What is the expected total yearly claim amount?
• What is the probability that we observe a storm and flood event next year which exceeds the level of M = CHF 2 billion?
2.4 Solutions
Solution 2.1: [los3K, Exercise] First, note that X is a discrete uniform random variable over {1, 2, 3, 4, 5, 6}. Therefore we have
\[ E(X) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{7}{2}, \]
\[ Var(X) = E(X^2) - (E(X))^2 = \frac{1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2}{6} - \left( \frac{7}{2} \right)^2 = \frac{35}{12}. \]
We have $Y \mid X = x \sim$ binomial(x, 1/2). Hence,
\[ E(Y) = E[E(Y \mid X)] = E[X/2] = 7/4 \]
and
\[ Var(Y) = Var[E(Y \mid X)] + E[Var(Y \mid X)] = Var[X/2] + E[X/4] = \frac{35}{48} + \frac{7}{8} = \frac{77}{48}. \]
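As a cross-check of the conditioning argument, the joint pmf of (X, Y) is small enough to enumerate exactly. The sketch below is illustrative only (the names `grid`, `mean_y` and `var_y` are not from the original) and uses the fact that Y given X = x is binomial(x, 1/2).

```r
# Exact check of E(Y) and Var(Y) by enumerating the joint pmf of (X, Y).
x_vals <- 1:6
grid <- do.call(rbind, lapply(x_vals, function(x) {
  y <- 0:x
  # P(X = x) * P(Y = y | X = x)
  data.frame(x = x, y = y, prob = (1 / 6) * dbinom(y, size = x, prob = 0.5))
}))
mean_y <- sum(grid$y * grid$prob)
var_y  <- sum(grid$y^2 * grid$prob) - mean_y^2
print(c(mean = mean_y, variance = var_y))  # compare with 7/4 and 77/48
```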
Solution 2.2: [los7, Exercise] First set up the following table:
x    f1(x)   f2(x)   f1+2(x)       f3(x)   f1+2+3(x)       FS(x)
0    0.2     0       0             0.5     0               0
1    0.3     0       0             0.5     0               0
2    0.5     p       0.2p          0       0.1p            0.1p
3    0       1-p     0.2 + 0.1p    0       0.1 + 0.15p     0.1 + 0.25p
4    0       0       0.3 + 0.2p    0       0.25 + 0.15p    0.35 + 0.4p
Thus, 0.35 + 0.4p = 0.43 implies p = 0.2.
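The value of p can also be recovered numerically. The following sketch assumes a small helper `convolve_pmf` (a hypothetical name, written here for illustration) for pmfs on 0, 1, 2, ..., and solves F_S(4) = 0.43 with `uniroot`.

```r
# Discrete convolution of two pmfs on 0, 1, 2, ... (index 1 corresponds to 0).
convolve_pmf <- function(f, g) {
  out <- numeric(length(f) + length(g) - 1)
  for (i in seq_along(f)) {
    idx <- i:(i + length(g) - 1)
    out[idx] <- out[idx] + f[i] * g
  }
  out
}
# P(S <= 4) as a function of the unknown p
cdf_at_4 <- function(p) {
  f1 <- c(0.2, 0.3, 0.5)
  f2 <- c(0, 0, p, 1 - p)
  f3 <- c(0.5, 0.5)
  fS <- convolve_pmf(convolve_pmf(f1, f2), f3)
  sum(fS[1:5])
}
p_hat <- uniroot(function(p) cdf_at_4(p) - 0.43, interval = c(0, 1))$root
print(p_hat)
```

Because F_S(4) is linear in p, the root finder is overkill here, but the same pattern works when the dependence on the parameter is not available in closed form.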
Solution 2.3: [los8, Exercise]
1. The range of X1 + X2 will be [0, 10]. We have
x     fX1(x)  fX2(x)  fX1+X2(x)
0     0.1     0.0     0.00 = 0.1 · 0
1     0.2     0.2     0.02 = 0.1 · 0.2 + 0.2 · 0
2     0.2     0.0     0.04 = 0.1 · 0 + 0.2^2 + 0.2 · 0
3     0.2     0.3     0.07 = 0.1 · 0.3 + 0.2 · 0 + 0.2^2 + 0.2 · 0
4     0.2     0.4     0.14 = 0.1 · 0.4 + 0.2 · 0.3 + 0.2 · 0 + 0.2^2 + 0.2 · 0
5     0.1     0.1     0.19 = 0.1^2 + 0.2 · 0.4 + 0.2 · 0.3 + 0.2 · 0 + 0.2^2 + 0.1 · 0
6                     0.18 = 0.2 · 0.1 + 0.2 · 0.4 + 0.2 · 0.3 + 0.2 · 0 + 0.1 · 0.2
7                     0.16 = 0.2 · 0.1 + 0.2 · 0.4 + 0.2 · 0.3 + 0.1 · 0
8                     0.13 = 0.2 · 0.1 + 0.2 · 0.4 + 0.1 · 0.3
9                     0.06 = 0.2 · 0.1 + 0.1 · 0.4
10                    0.01 = 0.1^2
Check your results by verifying that $\sum_x f_{X_1+X_2}(x) = 1$.
2. In order to get the distribution of X1 + X2 + X3 , it suffices to calculate the convolution of X1 + X2 with X3 . We then have
x     fX3(x)  fX1+X2(x)  fX1+X2+X3(x)
0     0.30    0.00       0
1     0.10    0.02       0.006
2     0.05    0.04       0.014
3     0.30    0.07       0.026
4     0.15    0.14       0.057
5     0.10    0.19       0.0895
6             0.18       0.109
7             0.16       0.132
8             0.13       0.149
9             0.06       0.1355
10            0.01       0.1095
11                       0.085
12                       0.054
13                       0.025
14                       0.0075
15                       0.001
Solution 2.4: [los11, Exercise] Let S = X1 + · · · + XN denote the aggregate claims. We wish to compute the probability $\Pr\left[ \frac{S}{N} > E(X) \mid N > 0 \right]$, where E(X) = 1.4. Thus, we have
\[ \Pr\left[ \frac{S}{N} > E(X) \,\Big|\, N > 0 \right] = \frac{\Pr[S > 1.4N \cap N > 0]}{\Pr[N > 0]} \]
\[ = \frac{\Pr[S > 1.4 \mid N = 1] \Pr[N = 1] + \Pr[S > 2.8 \mid N = 2] \Pr[N = 2]}{1 - 0.4} \]
\[ = \frac{(0.4)(0.3) + \left[ 1 - (0.6)^2 \right](0.3)}{0.6} = 0.52. \]
To understand the term $1 - (0.6)^2$, note that if N = 2 then S ∈ {2, 3, 4} and thus
\[ \Pr[S > 2.8 \mid N = 2] = 1 - \Pr[S = 2 \mid N = 2] = 1 - 0.6^2. \]
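The result can be confirmed by brute-force enumeration of the possible claim outcomes. This is a sketch under the distributions given in Exercise 2.4; the variable names are illustrative assumptions.

```r
# Enumerate (N, claim sizes) to check P(S/N > E[X] | N > 0).
p_n <- c(0.4, 0.3, 0.3)              # P(N = 0), P(N = 1), P(N = 2)
x   <- c(1, 2)
p_x <- c(0.6, 0.4)
mean_x <- sum(x * p_x)               # expected claim size, 1.4
# N = 1: the average claim is X1 itself
prob_n1 <- sum(p_x[x > mean_x])
# N = 2: the average claim is (X1 + X2) / 2; enumerate all pairs
pairs <- expand.grid(i = 1:2, j = 1:2)
avg2  <- (x[pairs$i] + x[pairs$j]) / 2
prob_n2 <- sum((p_x[pairs$i] * p_x[pairs$j])[avg2 > mean_x])
cond_prob <- (prob_n1 * p_n[2] + prob_n2 * p_n[3]) / (1 - p_n[1])
print(cond_prob)
```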
Solution 2.5: [los12, Exercise] Use the equation
\[ F_S(x) = \sum_{n=0}^{\infty} P^{*n}(x) \Pr[N = n] \]
to get the exact value of P(S < 4):
\[ P(S < 4) = \sum_{n=0}^{\infty} P(X_1 + \cdots + X_N < 4 \mid N = n)\, P(N = n) = 0.3 + P(X_1 < 4) \cdot 0.2 + P(X_1 + X_2 < 4) \cdot 0.5, \]
where you can verify
\[ P(X_1 < 4) = \frac{3}{4} \]
and
\[ P(X_1 + X_2 < 4) = \int_1^3 \int_1^{4 - x_1} \frac{1}{4} \cdot \frac{1}{4}\, dx_2\, dx_1 = \frac{1}{8}. \]
Thus,
\[ P(S < 4) = 0.3 + \frac{3}{4} \times 0.2 + \frac{1}{8} \times 0.5 = 0.5125. \]
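The double integral can be checked numerically with base R. The sketch below conditions on X1 and integrates P(X2 < 4 − x1) against the density of X1; the names `p_one`, `p_two` and `p_s_lt_4` are illustrative.

```r
# Numerical check of P(S < 4) with X_i ~ Uniform(1, 5).
p_n <- c(0.3, 0.2, 0.5)                     # P(N = 0), P(N = 1), P(N = 2)
p_one <- punif(4, min = 1, max = 5)         # P(X1 < 4)
# P(X1 + X2 < 4): integrate P(X2 < 4 - x1) * f(x1) over x1;
# the integrand is zero for x1 > 3, so the upper limit is 3.
p_two <- integrate(function(x1) punif(4 - x1, 1, 5) * dunif(x1, 1, 5),
                   lower = 1, upper = 3)$value
p_s_lt_4 <- p_n[1] + p_n[2] * p_one + p_n[3] * p_two
print(c(p_one = p_one, p_two = p_two, p_s_lt_4 = p_s_lt_4))
```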
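Solution 2.6: [los13K, Exercise] We have S = X1 + 2X2 + 3X3 with Xj ∼ Poisson(j). Write f1 , f2 , f3 for the pmfs of X1 , 2X2 , 3X3 , so that f1+2 = f1 ∗ f2 and f1+2+3 = f1+2 ∗ f3 . Each entry below is to be multiplied by the factor shown at the top of its column:
x    f1(x)     f2(x)     f1+2(x)      f3(x)     f1+2+3(x)
     (×e^-1)   (×e^-2)   (×e^-3)      (×e^-3)   (×e^-6)
0    1         1         1            1         1
1    1         0         1            0         1
2    1/2       2         2 1/2        0         2 1/2
3    1/6       0         2 1/6        3         5 1/6
4    1/24      2         3 1/24       0         6 1/24
5    1/120     0         2 41/120     0         9 101/120
6    1/720     1 1/3     2 301/720    4 1/2     13 301/720
Here $f_1(x) = e^{-1}/x!$, $f_2(x) = e^{-2}\, 2^{x/2}/(x/2)!$ for even x, and $f_3(x) = e^{-3}\, 3^{x/3}/(x/3)!$ for x a multiple of 3 (and zero otherwise).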
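The table can be reproduced in R. The sketch below relies on one observation: since all three terms are non-negative, truncating each pmf at s_max gives exact probabilities for S on 0, ..., s_max. The helpers `scaled_pois` and `conv_trunc` are hypothetical names introduced for illustration.

```r
# pmf of S = X1 + 2*X2 + 3*X3, Xj ~ Poisson(j), for s = 0, ..., s_max.
s_max <- 6
# pmf of a * N on 0..s_max, with N ~ Poisson(lambda) (index 1 corresponds to 0)
scaled_pois <- function(lambda, a, s_max) {
  f <- numeric(s_max + 1)
  k <- 0:(s_max %/% a)
  f[a * k + 1] <- dpois(k, lambda)
  f
}
# convolution of two pmfs of equal length, truncated to the same range
conv_trunc <- function(f, g) {
  sapply(seq_along(f), function(i) sum(f[1:i] * g[i:1]))
}
f12  <- conv_trunc(scaled_pois(1, 1, s_max), scaled_pois(2, 2, s_max))
f123 <- conv_trunc(f12, scaled_pois(3, 3, s_max))
# the last column rescales by e^6 for comparison with the table above
print(data.frame(s = 0:s_max, prob = f123, times_exp6 = f123 * exp(6)))
```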
Solution 2.7: [los9R, Exercise]
1. Copy this code and paste it into an R script. If you source it, you should find the same results as in Exercise 2.3, parts 1 and 2 (labelled 1.8.1 and 1.8.2 in the code and output below).
# put probabilities of the three distributions in vectors

fX1 <- c(.1,.2,.2,.2,.2,.1)
fX2 <- c(0,.2,0,.3,.4,.1)
fX3 <- c(.3,.1,.05,.3,.15,.1)
#####################
# convolution 1.8.1 #
#####################
#initialise an empty vector of type double for the probabilities

fX12_8a <- c()
# we need to put 0’s in the probability vectors

# this is to avoid NAs in the recursions afterwards
# the appropriate number depends on the range of the sum, here [0,10]
fX1 <- c(fX1,rep(0,5))
fX2 <- c(fX2,rep(0,5))
#perform the convolutions

for(i in 1:11) # we know that the range of X_1+X_2 is [0,10]
{#now we calculate the probability
fX12_8a[i] <- sum( fX1[1:i]*fX2[i:1] )
}# end of the for loop
#print results
results<-data.frame(x=c(0:10),DistrX1=fX1,DistrX2=fX2,Solution=fX12_8a)
print("Exercise 1.8.1")
print(results)
#####################
# convolution 1.8.2 #
#####################
#use the same algorithm as above, but with fX12_8a and fX3...
#initialise an empty vector of type double for the probabilities

fX123_8b <- c()
# add 0’s
fX12_8a <- c(fX12_8a,rep(0,5))
fX3 <- c(fX3,rep(0,10))

for(i in 1:16) # we know that the range of X_1+X_2+X_3 is [0,15]
{#now we calculate the probability
fX123_8b[i] <- sum(fX3[1:i]*fX12_8a[i:1])
}# end of the for loop
#print results
resultsb<-data.frame(x=c(0:15),DistrX3=fX3,DistrX1X2=fX12_8a,Solution=fX123_8b)
print("Exercise 1.8.2")
print(resultsb)
Here is what your output should look like:
[1] "Exercise 1.8.1"

x DistrX1 DistrX2 Solution
1 0 0.1 0.0 0.00
2 1 0.2 0.2 0.02
3 2 0.2 0.0 0.04
4 3 0.2 0.3 0.07
5 4 0.2 0.4 0.14
6 5 0.1 0.1 0.19
7 6 0.0 0.0 0.18
8 7 0.0 0.0 0.16
9 8 0.0 0.0 0.13
10 9 0.0 0.0 0.06
11 10 0.0 0.0 0.01
[1] "Exercise 1.8.2"
x DistrX3 DistrX1X2 Solution
1 0 0.30 0.00 0.0000
2 1 0.10 0.02 0.0060
3 2 0.05 0.04 0.0140
4 3 0.30 0.07 0.0260
5 4 0.15 0.14 0.0570
6 5 0.10 0.19 0.0895
7 6 0.00 0.18 0.1090
8 7 0.00 0.16 0.1320
9 8 0.00 0.13 0.1490
10 9 0.00 0.06 0.1355
11 10 0.00 0.01 0.1095
12 11 0.00 0.00 0.0850
13 12 0.00 0.00 0.0540
14 13 0.00 0.00 0.0250
15 14 0.00 0.00 0.0075
16 15 0.00 0.00 0.0010
2. Here is one possibility:
#############
# variables #
#############
# vectors of probabilities
fX1 <- c(.1,.2,.2,.2,.2,.05,.05)
fX2 <- c(0,.2,0,.3,.4,.1)
fX3 <- c(.3,.1,.05,.3,.15,.1)
# weights
alpha <- c(3,0,2)
################
# convolutions #
################
#just to be sure..
if(sum(alpha)==1) print("there is no convolution to do!")
#calculate the range of S

Xmax <- c(length(fX1)-1,length(fX2)-1,length(fX3)-1)
rangeS <- sum(Xmax*alpha)
#for convenience, put probabilities in an array, with 0’s where relevant

fX <- array(c(fX1,rep(0,rangeS-Xmax[1]),
fX2,rep(0,rangeS-Xmax[2]), #we complete each column with 0’s
fX3,rep(0,rangeS-Xmax[3])), #end of values

c(rangeS+1,3)) #now the dimension of the array (second argument)
#####
#note that we have at most only two convolutions to do
# if we scale the X’s correctly
#initialising results array - 3 columns for scaled X’s, one for convolution
#of first two, last for the solution
fS <- array(rep(0,(rangeS+1)*5),c(rangeS+1,5))
#appropriately scaling the fX’s

for(i in 1:3) { # for each X
if(alpha[i]==0) next
for(j in 1:(Xmax[i]+1)) {
fS[(j-1)*alpha[i]+1,i] <- fX[j,i]
} # end of j loop
}# end of i loop
for(i in 1:(rangeS+1)) {
if(alpha[2]==0){ # then only 1 and 3 need to be convoluted (see test above)
fS[i,5] <- sum(fS[1:i,1]*fS[i:1,3]) #end of the story
} else { if(alpha[1]==0){ #then only 2 and 3 need to be convoluted
fS[i,5] <- sum(fS[1:i,2]*fS[i:1,3]) #end of the story
} else { fS[i,4] <- sum(fS[1:i,1]*fS[i:1,2]) # we do 1 and 2 and see...
if(alpha[3]==0) { # if alpha 3 is 0, then it is finished
fS[i,5] <- fS[i,4] # ... and we translate results in column 5
# otherwise we do the last convolution 1*2 with 3:
} else {fS[i,5] <- sum(fS[1:i,3]*fS[i:1,4])}
} # end alpha 3 if
} #end second else
} # end for
#print results
results<-data.frame(x=c(0:rangeS),Solution=fS[,5])
print("Solution")
print(results)
plot(results)
#check we have a true distribution..:
if(min(fS[,5])<0) print("Oups some probabilities are negative") else
print("All probabilities are positive")
if(max(fS[,5])>1) print("Oups some probabilities are > 1") else
print("All probabilities are < 1")
print(c("The sum of them is: ",sum(fS[,5])))
And the plot is
[Figure: plot of the pmf, with Solution on the vertical axis (from 0.00 to 0.12) and a horizontal axis running from 0 to 25.]
You can see the effect of the scaling as the pmf is not smooth.
Solution 2.8: [los21R, Exercise] The support of the Xi is not finite, which means that it is not possible to calculate the exact distribution of S without truncating the distribution of the Xi at some point; otherwise the program would treat the support of S as infinite and loop forever. It is nevertheless possible to achieve a decent level of accuracy (if one is patient), but this requires careful additional programming to determine where the distributions of the Xi ’s should be truncated. It also requires analysis to check whether the moments and quantiles of S are reasonably preserved.
Note that it is not possible to develop code for the general case
S = α1 X1 + α2 X2 + α3 X3 , αi ≥ 0 (integers), i = 1, . . . , 3
if some of the Xi ’s have infinite support, unless they are truncated. However, if all the Xi ’s are Poisson, we can use A Theorem 12.4.1 and allow for any number of αi ’s.
Solution 2.9: [los22R, Exercise] First note that this formula can only be used if X and N are
both discrete and with a finite range. One possible code is as follows:
# inputs #
##########
fX <-c(0,.5,.4,.1)
fN <-c(.1,.3,.4,.2)
# program #
###########
# range of X, N and S
rX <- length(fX)-1
rN <- length(fN)-1
rS <- rX*rN
###initialise our results array

#first create a vector with the contents of the array
cont <- c()
#first column
cont <- c(cont,1,rep(0,rS)) #p*0(x)
cont <- c(cont,fN[1]) #last line is distribution of N
#second column
cont <- c(cont,fX,rep(0,rS-rX),fN[2])
#following columns are 0’s, except in the last line
for(i in 1:(rN-1)) { #there will be rN-1 convolutions
cont <- c(cont,rep(0,rS+1),fN[i+2])
} #end of the i loop
#last two columns for pmf and df
#(not in the loop to avoid n/a’s in the last row - a detail)
cont <- c(cont,rep(0,2*(rS+2)))
#and finally the array
distS <- array(cont,c(rS+2,rN+3))
# perform the convolutions

if(fX[1]>0){
for(i in 1:(rN-1)) {
for(j in 1:((1+i)*rX+1)) { #lower side of triangle - see below
distS[j,i+2] <- sum(distS[1:j,2]*distS[j:1,i+1])
} # end j loop
} # end i loop
} else { #for efficiency: if f_X(0)=0, there will only be probabilities
# in a triangle with upper side with slope -1 from col 2
# and lower side with slope -rX (whether mass is >0 or not!)
for(i in 1:(rN-1)) {
for(j in 1:((1+i)*rX+1)) { #first change (lower side)
distS[j,i+2] <- sum(distS[1:max(j-i,1),2]*distS[(j+1)-(1:max(j-i,1)),i+1])
#second change (upper side) is max(j-i,1)
} # end j loop
} # end i loop
}
# calculate pmf and df

# first line (can’t be put in the following loop)
distS[1,rN+2] <- sum(distS[1,1:(rN+1)]*distS[rS+2,1:(rN+1)]) #pmf at i
distS[1,rN+3] <- distS[1,rN+2]
# next lines
for(i in 2:(rS+1)) {
distS[i,rN+2] <- sum(distS[i,1:(rN+1)]*distS[rS+2,1:(rN+1)]) #pmf at i
distS[i,rN+3] <- distS[i-1,rN+3]+distS[i,rN+2]
} # end i loop
# print results #
#################
results <- data.frame(x=0:rS,pmf=distS[1:(rS+1),rN+2],df=distS[1:(rS+1),rN+3])
print(results)
source("path to program of Exercise 11.a")
pmf_to_desc_stats(distS[1:(rS+1),rN+2],1)
#check we have a true distribution..:

if(min(distS[1:(rS+1),rN+2])<0) print("Oups some probabilities are negative") else
print("All probabilities are positive")
if(max(distS[1:(rS+1),rN+2])>1) print("Oups some probabilities are > 1") else
print("All probabilities are < 1")
Efficiency comments:
1. Consider the convolutions part. The bounds for j and for the convolutions reflect the fact that the probability masses of the successive convolutions spread in the shape of a triangle in the table of convolutions. The upper side of the triangle has slope 0 if Pr[X = 0] > 0 and slope -1 otherwise. The lower side of the triangle always has a slope of minus the range of X. So there are probabilities everywhere only if the range of X is infinite (which never happens in this program; see the note at the beginning of the solution) and Pr[X = 0] > 0. We can thus save resources by computing the convolutions (products) only where these probabilities are different from zero, which is achieved by choosing the bounds appropriately.
2. Note the line
distS[j,i+2] <- sum(distS[1:j,2]*distS[j:1,i+1])
Here we use the fact that R is a vector-based language, so we can perform j products in a single line using "1:j" and "j:1" in the index of distS. This does the same as the following loop
temp <- 0
for(k in 1:j) {
temp <- temp + distS[k,2]*distS[j-k+1,i+1]
} # end k loop
distS[j,i+2] <- temp
which is a more traditional way to program it (in VBA or Maple or Mathematica, you
would program the convolution with the loop).
What impact do these considerations have? Here are the approximate processing times if fX (x) = 0.01, x = 1, 2, . . . , 100 and fN (x) = 0.05, x = 0, 1, . . . , 19 (performed on an iMac with a 3.06 GHz Intel Core 2 Duo chip):
          no triangle       triangle
loop      2 min. 56 sec.    1 min. 5 sec.
vector    ≈ 3 sec.          ≈ 1.5 sec.
In this case, vector-based programming matters much more than being smart about the triangle, but this is because we use R and can take advantage of its very powerful algorithms for handling vectors. In VBA, the triangle trick would be crucial.
Finally, note the shape of the pmf of S in this case (we omit here the mass at 0):
[Figure: plot of distS[2:(rS + 1), rN + 2] (vertical axis, from 0.0000 to 0.0012) against 1:rS (horizontal axis, from 0 to 1500).]
You can see the effect of compounding: when probabilities are due to only a few claims, there are irregularities (jumps) in the pmf (lhs of plot). When we look at outcomes of S that can arise from many different numbers of claims, the pmf is much smoother (rhs of plot).
Solution 2.10: [los23R, Exercise] Here is the code with the parameters corresponding to Ex-
ercise 1.12:
# Inputs #
##########
fX <- c(0,1/6,2/6,3/6)
l <- 6
s <- 36
# Program #
###########
#initialise the vectors for the lambda_i’s and alpha_i’s

lambda <- c()
alpha <- c()
# calculate the lambda_i’s and the alpha_i’s

for(i in 2:length(fX)) # we can discard alpha=0
{if(fX[i]>0) {
lambda <- c(lambda,fX[i]*l)
alpha <- c(alpha,i-1)}
} #end i loop
#number of variables
num <-length(alpha)
#number of columns that are necessary
#(the num distributions + num-1 convolutions)
colu <- 2*num -1
#initialise our results array

fS <- array(0,c(s+1,colu))
#initialise the first num columns (Poisson rvs)

for(i in 1:num){
for(j in seq(1,(s+1),by=alpha[i])){ #we spread the probabilities
fS[j,i] <- exp(-lambda[i])*lambda[i]^((j-1)/alpha[i])/factorial(((j-1)/alpha[i]))
} # end of j loop
} # end of i loop

for(i in 1:(num-1)){
for(j in 1:(s+1)){
fS[j,num+i] <- sum(fS[1:j,num+i-1]*fS[j:1,num-i])
} # end j loop
} # end i loop
#calculate the df
FS <- c()
for(i in 1:(s+1)){FS[i]<-sum(fS[1:i,colu])}
#print results
results <- data.frame(x=c(0:s),f_S=fS[,colu],F_S=FS)
plot(results$x,results$f_S)
print(results)
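Solution 2.11: [los10, Exercise] Poisson:
\[ E(N) = \lambda; \qquad \mathrm{Var}(N) = \lambda; \]
\[ m_N(t) = \exp\left[\lambda\left(e^{t} - 1\right)\right]; \]
\[ E(S) = \lambda\mu; \qquad \mathrm{Var}(S) = \lambda\left(\sigma^2 + \mu^2\right); \]
\[ m_S(t) = \exp\left[\lambda\left(m_X(t) - 1\right)\right]. \]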
Binomial:
\[ E(N) = np; \qquad \mathrm{Var}(N) = npq; \]
\[ m_N(t) = \left(q + pe^{t}\right)^{n}; \]
\[ E(S) = np\mu; \qquad \mathrm{Var}(S) = np\sigma^2 + npq\mu^2; \]
\[ m_S(t) = \left[q + p\,m_X(t)\right]^{n}. \]
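Negative Binomial:
\[ E(N) = r(1-p)/p; \qquad \mathrm{Var}(N) = r(1-p)/p^2; \]
\[ m_N(t) = \left\{p/\left[1 - (1-p)e^{t}\right]\right\}^{r}; \]
\[ E(S) = \frac{r(1-p)}{p}\,\mu; \qquad \mathrm{Var}(S) = \frac{r(1-p)}{p}\,\sigma^2 + \frac{r(1-p)}{p^2}\,\mu^2; \]
\[ m_S(t) = \left\{p/\left[1 - (1-p)m_X(t)\right]\right\}^{r}. \]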
Solution 2.12: [NLI2.7, Exercise] Assume $S_1, ..., S_n$ are independent with $S_j \sim \mathrm{CompBinom}(v_j, p, G)$
for all j = 1, ..., n. The moment generating function of $S_j$ (see Proposition 2.6 from Wuthrich,
2014) for j = 1, ..., n is
\[ M_{S_j}(r) = \left(pM_{Y_1}(r) + (1-p)\right)^{v_j} \quad \text{for } r \in \mathbb{R}. \]
Since the $S_j$, j = 1, ..., n, are independent, the moment generating function of the sum is
\[ M_{\sum_j S_j}(r) = \prod_{j=1}^{n}\left(pM_{Y_1}(r) + (1-p)\right)^{v_j} = \left(pM_{Y_1}(r) + (1-p)\right)^{\sum_{j=1}^{n} v_j}. \]
This proves that $\sum_{j=1}^{n} S_j \sim \mathrm{CompBinom}\left(\sum_{j=1}^{n} v_j, p, G\right)$.
Solution 2.13: [los14, Exercise]
1. First, note that:

x      p*0(x)    p*1(x)    p*2(x)    p*3(x)        p*4(x)
0      1         0         0         0             0
1      0         0.1       0         0             0
2      0         0.2       0.01      0             0
3      0         0.3       0.04      0.001         0
4      0         0.4       0.10      0.006         0.0001
p_n    e^{-2}    2e^{-2}   2e^{-2}   (4/3)e^{-2}   (2/3)e^{-2}

Thus,
\[ P(S = 0) = e^{-2}; \]
\[ P(S = 1) = 0.2e^{-2}; \]
\[ P(S = 2) = 0.42e^{-2}; \]
\[ P(S = 3) = e^{-2}(0.6 + 0.08 + 0.0013) = 0.6813e^{-2}; \]
\[ P(S = 4) = e^{-2}\left(0.8 + 0.2 + 0.006 \times \tfrac{4}{3} + 0.000067\right) = 1.008067e^{-2}. \]
2. S ∼ compound Poisson(λ = 2) with p (x) = 0.1x, x = 1, 2, 3, 4. Write $f_j$ for the pmf of
$S_j = 1 \cdot N_1 + \cdots + j \cdot N_j$, j = 1, 2, 3, 4, and $p_j$ for the pmf of $j \cdot N_j$, so that, for x a multiple of j,
\[ p_j(x) = P(j \cdot N_j = x) = \exp(-0.2 \cdot j) \cdot (0.2 \cdot j)^{x/j} / (x/j)!. \]
Entries marked − in the table are zero (x is not a multiple of j).

x    p1      ∗p2     = f2    ∗p3     = f3    ∗p4     = f4
0    .819    .670    .549    .549    .301    .449    .1353
1    .164    −       .110    −       .060    −       .0270
2    .016    .268    .231    −       .127    −       .0568
3    .001    −       .045    .329    .205    −       .0922
4    .000    .054    .048    −       .062    .359    .1364
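As a cross-check of Solution 2.13, the probabilities P(S = 0), ..., P(S = 4) can also be obtained with the Panjer recursion for the compound Poisson case. The following is only a sketch; the variable names are illustrative and are not taken from the exercise.

```r
# Compound Poisson(lambda = 2) with severity pmf p(x) = 0.1 * x on x = 1, ..., 4
lambda <- 2
p <- 0.1 * (1:4)          # p[x] = P(X = x)
s_max <- 4

# Panjer recursion (Poisson case):
# f(0) = exp(-lambda), f(s) = (lambda / s) * sum_{x = 1}^{min(s, 4)} x * p(x) * f(s - x)
f <- numeric(s_max + 1)   # f[s + 1] stores P(S = s)
f[1] <- exp(-lambda)
for (s in 1:s_max) {
  x <- 1:min(s, length(p))
  f[s + 1] <- (lambda / s) * sum(x * p[x] * f[s - x + 1])
}

# probabilities as multiples of exp(-2), to compare with part 1
print(round(f / exp(-lambda), 6))
# probabilities themselves, to compare with the column f4 in part 2
print(round(f, 4))
```

The first printed vector should agree with the coefficients of e^{-2} derived in part 1, which gives an independent check of the convolution table.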
Solution 2.14: [los15K, Exercise] Write S1 = N + 3N3 with

N ∼ Poisson (λ · p(1) = 1) and
N3 ∼ Poisson (λ · p(3) = 1) ,
Now S2 = N1 + 3N3 with N1 being the sum of two independent N ∼ Poisson(1). Thus we have
N1 ∼ Poisson (λ1 = 1 + 1 = 2) and
N3 ∼ Poisson (λ3 = 1) again.
From this and Theorem 12.4.2, we observe that S2 is compound Poisson with λ = 3 and
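\[ P(X = 1) = \frac{\lambda_1}{\lambda_1 + \lambda_3} = 2/3; \qquad P(X = 3) = \frac{\lambda_3}{\lambda_1 + \lambda_3} = 1/3. \]
Next, invoke Panjer to compute
\[ P(S_2 \le 2.4) = P(S_2 \le 2). \]
Or using an ad hoc method:
\[ P(S_2 \le 2.4) = P(N_1 \le 2 \text{ and } N_3 = 0) \]
\[ = P(N_1 \le 2) \cdot P(N_3 = 0) \]
\[ = \left(2^0/0! + 2^1/1! + 2^2/2!\right)e^{-2} \cdot e^{-1} = 5e^{-3}. \]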
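The ad hoc calculation can be checked numerically. The sketch below evaluates the product P(N1 ≤ 2) · P(N3 = 0) with R's Poisson functions and compares it with a brute-force sum over a truncated grid of (N1, N3) values; the grid bound of 50 is an arbitrary choice made for illustration.

```r
# N1 ~ Poisson(2) counts claims of size 1, N3 ~ Poisson(1) counts claims of size 3
prob_adhoc <- ppois(2, lambda = 2) * dpois(0, lambda = 1)

# brute force: sum P(N1 = a) * P(N3 = b) over all (a, b) with a + 3b <= 2.4
grid <- expand.grid(n1 = 0:50, n3 = 0:50)
keep <- grid$n1 + 3 * grid$n3 <= 2.4
prob_grid <- sum(dpois(grid$n1[keep], 2) * dpois(grid$n3[keep], 1))

print(c(adhoc = prob_adhoc, grid = prob_grid, closed_form = 5 * exp(-3)))
```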

Solution 2.15: [los16, Exercise] From Theorem 12.4.1, we know S is compound Poisson with
λ = 5 and individual claims distribution
x    P(x)
1    (3/5)(0.25) + (2/5)(0.10) = 0.19
2    (3/5)(0.75) + (2/5)(0.40) = 0.61
3    (2/5)(0.40) = 0.16
4    (2/5)(0.10) = 0.04

Thus, the mean of the individual claim amount is
\[ 1 \times 0.19 + 2 \times 0.61 + 3 \times 0.16 + 4 \times 0.04 = 2.05 \]
and the variance of the individual claim amount is
\[ (1 - 2.05)^2 \times 0.19 + (2 - 2.05)^2 \times 0.61 + (3 - 2.05)^2 \times 0.16 + (4 - 2.05)^2 \times 0.04 = 0.5075. \]
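The mixing step in Solution 2.15 can be sketched in a few lines of R. The two Poisson parameters (3 and 2) and the two component severity pmfs below are read off the weights and probabilities appearing in the table above; they are inferred from the solution, not quoted from the exercise statement.

```r
# Poisson parameters of the two compound Poisson components (inferred: 3 and 2)
lambda <- c(3, 2)

# rows: component; columns: claim size x = 1, ..., 4
p_comp <- rbind(c(0.25, 0.75, 0.00, 0.00),
                c(0.10, 0.40, 0.40, 0.10))

# severity pmf of the sum: lambda-weighted mixture of the component pmfs
weights <- lambda / sum(lambda)
p_mix <- as.vector(weights %*% p_comp)

x <- 1:4
mean_x <- sum(x * p_mix)
var_x <- sum((x - mean_x)^2 * p_mix)

print(p_mix)
print(c(mean = mean_x, variance = var_x))
```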
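Solution 2.16: [los17K, Exercise] According to Theorem 12.4.1, S is compound Poisson(6)
and
\[ p(0) = \frac{4}{6} \times \frac{1}{4} = \frac{1}{6} \]
\[ p(1) = \frac{4}{6} \times \frac{1}{4} = \frac{1}{6} \]
\[ p(2) = \frac{4}{6} \times \frac{1}{4} + \frac{2}{6} \times \frac{1}{2} = \frac{2}{6} \]
\[ p(3) = \frac{4}{6} \times \frac{1}{4} = \frac{1}{6} \]
\[ p(4) = \frac{2}{6} \times \frac{1}{2} = \frac{1}{6}. \]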
Solution 2.17: [los18, Exercise] Define the sum of the number of claims arising from each
possible claim amount by
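\[ N = \sum_{i=1}^{m} N_i. \]
So conditional on N = n, the numbers of claims $N_1, N_2, ..., N_m$ of each claim amount have a
multinomial distribution with parameters $n, \pi_1, \pi_2, ..., \pi_m$. Its joint moment generating function
(m.g.f.) can be obtained as
\[
\begin{aligned}
E\left[\exp\left(\sum_{i=1}^{m} t_i N_i\right)\right]
&= \sum_{n=0}^{\infty} E\left[\left.\exp\left(\sum_{i=1}^{m} t_i N_i\right)\,\right|\, N = n\right] P(N = n) \\
&= \sum_{n=0}^{\infty} \left(\pi_1 e^{t_1} + \pi_2 e^{t_2} + \cdots + \pi_m e^{t_m}\right)^{n} \frac{e^{-\lambda}\lambda^{n}}{n!} \\
&= e^{-\lambda}\sum_{n=0}^{\infty} \left(\lambda\pi_1 e^{t_1} + \lambda\pi_2 e^{t_2} + \cdots + \lambda\pi_m e^{t_m}\right)^{n} \frac{1}{n!} \\
&= e^{-\lambda}\exp\left(\sum_{i=1}^{m} \lambda\pi_i e^{t_i}\right) \\
&= \prod_{i=1}^{m} \exp\left[\lambda\pi_i\left(e^{t_i} - 1\right)\right].
\end{aligned}
\]
This implies mutual independence of $N_1, N_2, ..., N_m$ [explain why?]. Furthermore, by setting
$t_i = t$ and $t_j = 0$ for all $j \ne i$, we obtain the m.g.f. of $N_i$, which is given below:
\[ E\left[\exp(t_i N_i)\right] = \exp\left[\lambda\pi_i\left(e^{t_i} - 1\right)\right] \]
which is the m.g.f. of a Poisson with parameter $\lambda\pi_i$.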
Solution 2.18: [los19, Exercise] Let N denote the number of accidents, which is given to have a
Poisson(λ) distribution. Now, suppose N1 is the number of these accidents that lead to claims
(i.e. damage amount exceeds the deductible or excess). Then clearly, conditionally on N , N1
has a Binomial (N, p) distribution, since an accident leads to either a claim or no claim. Thus,
we have
\[ P(N_1 = n \mid N = m) = \frac{m!}{n!\,(m-n)!}\,p^{n}(1-p)^{m-n}, \quad n \le m. \]
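By the law of total probability, we find that
\[
\begin{aligned}
P(N_1 = n) &= \sum_{m=n}^{\infty} P(N_1 = n \mid N = m)\,P(N = m) \\
&= \sum_{m=n}^{\infty} \frac{m!}{n!\,(m-n)!}\,p^{n}(1-p)^{m-n}\,\frac{e^{-\lambda}\lambda^{m}}{m!} \\
&= \frac{e^{-\lambda}p^{n}}{n!}\sum_{m=n}^{\infty} \frac{1}{(m-n)!}\,\lambda^{m}(1-p)^{m-n} \\
&= \frac{e^{-\lambda}p^{n}\lambda^{n}}{n!}\sum_{m=n}^{\infty} \frac{1}{(m-n)!}\,[\lambda(1-p)]^{m-n} \\
&= \frac{e^{-\lambda}(\lambda p)^{n}}{n!}\sum_{k=0}^{\infty} \frac{1}{k!}\,[\lambda(1-p)]^{k} \\
&= \frac{e^{-\lambda}(\lambda p)^{n}}{n!}\,e^{\lambda(1-p)} = \frac{e^{-\lambda p}(\lambda p)^{n}}{n!}
\end{aligned}
\]
which gives a Poisson distribution with parameter λp.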
This could have been anticipated (and also shown) as a consequence of Theorem 12.4.2.
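The thinning result can be checked numerically by evaluating the law-of-total-probability sum directly. The values λ = 3 and p = 0.4 below are arbitrary illustrative choices, and the infinite sum is truncated at m = 200, where the Poisson tail is negligible.

```r
lambda <- 3      # illustrative accident frequency
p <- 0.4         # illustrative probability that an accident leads to a claim
n <- 0:10        # values of N1 at which the pmf is compared
m_max <- 200     # truncation point of the sum over m

# P(N1 = k) = sum over m >= k of Binomial(k; m, p) * Poisson(m; lambda)
thinned <- sapply(n, function(k) {
  m <- k:m_max
  sum(dbinom(k, size = m, prob = p) * dpois(m, lambda))
})

# pmf of a Poisson(lambda * p) distribution
direct <- dpois(n, lambda * p)

print(max(abs(thinned - direct)))
```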
Solution 2.19: [sur1K, Exercise] We have
\[ p_n(t) = P(N(t) = n) = e^{-\lambda t}(\lambda t)^{n}/n! \]
with $p_{-1}(t) = 0$. This implies the derivative
\[ p_n'(t) = e^{-\lambda t}(-\lambda)(\lambda t)^{n}/n! + e^{-\lambda t}\,n\lambda(\lambda t)^{n-1}/n! = -\lambda p_n(t) + \lambda p_{n-1}(t) \]
which gives the result for n = 0, 1, ... Note that for all n, we can write
\[ p_n'(t) = \left[p_n(t + dt) - p_n(t)\right]/dt. \]
Using the previous result,
\[ -\lambda p_n(t) + \lambda p_{n-1}(t) = \left[p_n(t + dt) - p_n(t)\right]/dt. \]
Re-arranging, we get
\[ p_n(t + dt) = p_{n-1}(t) \times \lambda\,dt + p_n(t) \times (1 - \lambda\,dt). \]
Thus, the probability of having n jumps till t + dt is equal to the probability of getting n − 1
jumps till t and another one within the next tiny interval dt, plus the probability of getting n
jumps till t and no other one within the next tiny interval dt.
Note that this expression could have been written directly using the law of total probability
and the property of the Poisson process seen in the lecture.
(a) Assume that we are currently evaluating all the policies and we can break them down into
three categories: no claim for 6+ years ($e^{-6 \cdot 0.2} = 0.3012$), no claim for 3-6 years
($e^{-3 \cdot 0.2} - e^{-6 \cdot 0.2} = 0.2476174$) and the rest. So about 30.12% of policies receive a 30% discount and
24.76% of policies receive a 10% discount. Now assume that the new premium is P and
we wish to solve
\[ P\left[0.7 \cdot e^{-6 \cdot 0.2} + 0.9 \cdot \left(e^{-3 \cdot 0.2} - e^{-6 \cdot 0.2}\right) + 1 - e^{-3 \cdot 0.2}\right] = E[Y] \implies P = 1.13 \cdot E[Y]. \]
So in order to finance the bonus, we have to raise the base premium by 13 percent.
(b) For claims in the heterogeneous portfolio, we have assumed that the frequency parameter
Λ follows a Gamma(α, β) distribution. Using the fact that the mean is λ = 0.2 and Vco(Λ) =
1, we can work out that α = 1 and β = 5. Thus, the probability of having zero claims in
one year (conditioning on the value of Λ and then integrating it out) is
\[ E\left[\mathrm{Prob}[\text{Number of Claims} = 0 \mid \Lambda]\right] = \int_0^{\infty} e^{-\lambda} \cdot 5 \cdot e^{-5\lambda}\,d\lambda = 5/6. \]
So the three categories have the following break-down: no claim for 6+ years ($(5/6)^6 =
0.334898$), no claim for 3-6 years ($(5/6)^3 - (5/6)^6 = 0.2438057$) and the rest. Using the
same method, the new premium we wish to solve for is
\[ P\left[0.7 \cdot (5/6)^6 + 0.9 \cdot \left((5/6)^3 - (5/6)^6\right) + 1 - (5/6)^3\right] = E[Y] \implies P = 1.142661 \cdot E[Y], \]
which yields a 14% increase on the base premium.
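The two premium loadings can be reproduced with a short R sketch. It follows the solution's own approach of raising the one-year claim-free probability to the power of the number of years; the function name `discount_loading` is hypothetical.

```r
# Loading factor P / E[Y] given the probability p0 of a claim-free year.
# As in the solution above, k claim-free years are given probability p0^k.
discount_loading <- function(p0) {
  share_30 <- p0^6              # no claim for 6+ years: pays 70% of P
  share_10 <- p0^3 - p0^6       # no claim for 3-6 years: pays 90% of P
  share_00 <- 1 - p0^3          # the rest: pays the full premium P
  1 / (0.7 * share_30 + 0.9 * share_10 + share_00)
}

# (a) homogeneous portfolio: claim-free year has probability exp(-0.2)
load_homog <- discount_loading(exp(-0.2))

# (b) heterogeneous portfolio: Lambda ~ Gamma(shape = 1, rate = 5)
p0_het <- integrate(function(l) exp(-l) * dgamma(l, shape = 1, rate = 5),
                    lower = 0, upper = Inf)$value
load_heterog <- discount_loading(p0_het)

print(c(homogeneous = load_homog, p0_heterogeneous = p0_het,
        heterogeneous = load_heterog))
```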
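Solution 2.21: [Fit3, Exercise] The likelihood can be written as
\[ L(\alpha, \beta; y) = \prod_{i=1}^{n} e^{-\mu_i} \cdot (\mu_i)^{y_i}/y_i! \]
so that the log-likelihood is
\[
\begin{aligned}
\ell(\alpha, \beta; y) = \log L(\alpha, \beta; y)
&= \sum_{i=1}^{n}\left[-\mu_i + y_i\log\mu_i - \log y_i!\right] \\
&= \sum_{i=1}^{m}\left[-e^{\alpha} + y_i\alpha - \log y_i!\right] + \sum_{i=m+1}^{n}\left[-e^{\alpha+\beta} + y_i(\alpha+\beta) - \log y_i!\right] \\
&= -me^{\alpha} + \alpha\sum_{i=1}^{m} y_i - (n-m)e^{\alpha+\beta} + (\alpha+\beta)\sum_{i=m+1}^{n} y_i - \sum_{i=1}^{n}\log y_i!.
\end{aligned}
\]
Differentiating this log-likelihood we get
\[ \frac{\partial\ell(\alpha, \beta; y)}{\partial\alpha} = -me^{\alpha} + \sum_{i=1}^{m} y_i - (n-m)e^{\alpha+\beta} + \sum_{i=m+1}^{n} y_i = 0 \qquad (2.1) \]
and
\[ \frac{\partial\ell(\alpha, \beta; y)}{\partial\beta} = -(n-m)e^{\alpha+\beta} + \sum_{i=m+1}^{n} y_i = 0. \qquad (2.2) \]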
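Equations (2.1) and (2.2) yield
\[ -me^{\alpha} + \sum_{i=1}^{m} y_i = 0 \]
so that the solutions are
\[ \hat{\alpha} = \log\left(\frac{\sum_{i=1}^{m} y_i}{m}\right) \]
and
\[ \hat{\beta} = \log\left(\frac{\sum_{i=m+1}^{n} y_i}{n-m}\right) - \log\left(\frac{\sum_{i=1}^{m} y_i}{m}\right). \]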
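The closed-form estimates can be compared with a Poisson GLM fit on simulated data. This is a sketch only: the sample sizes, the true parameter values and the seed are arbitrary assumptions, and the model is taken to be $\mu_i = e^{\alpha}$ for i ≤ m and $\mu_i = e^{\alpha+\beta}$ for i > m, as implied by the log-likelihood above.

```r
set.seed(1)
m <- 40; n <- 100               # hypothetical group sizes
alpha <- 0.5; beta <- 0.8       # hypothetical true parameters

# group indicator: 0 for i <= m, 1 for i > m
group <- c(rep(0, m), rep(1, n - m))
y <- rpois(n, lambda = exp(alpha + beta * group))

# closed-form MLEs derived above
alpha_hat <- log(mean(y[1:m]))
beta_hat <- log(mean(y[(m + 1):n])) - log(mean(y[1:m]))

# the same model fitted as a Poisson GLM with log link
fit <- glm(y ~ group, family = poisson(link = "log"))

print(rbind(closed_form = c(alpha_hat, beta_hat),
            glm = unname(coef(fit))))
```

The two rows should agree up to the convergence tolerance of the iterative GLM fit.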
Solution 2.22: [Fit4, Exercise]
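1. We have
\[
\begin{aligned}
\sum_{j=1}^{n} \frac{(x_j - \mu)^2}{x_j}
&= \sum_{j=1}^{n}\left(x_j - 2\mu + \frac{\mu^2}{x_j}\right) \\
&= \sum_{j=1}^{n}\left(\frac{\mu^2}{x_j} - \frac{\mu^2}{\bar{x}}\right) + \sum_{j=1}^{n}\left(\frac{\mu^2}{\bar{x}} - 2\mu + x_j\right) \\
&= \mu^2\sum_{j=1}^{n}\left(\frac{1}{x_j} - \frac{1}{\bar{x}}\right) + \frac{n\mu^2}{\bar{x}} - 2n\mu + n\bar{x} \\
&= \mu^2\sum_{j=1}^{n}\left(\frac{1}{x_j} - \frac{1}{\bar{x}}\right) + \frac{n}{\bar{x}}\left(\mu^2 - 2\mu\bar{x} + \bar{x}^2\right) \\
&= \mu^2\sum_{j=1}^{n}\left(\frac{1}{x_j} - \frac{1}{\bar{x}}\right) + \frac{n}{\bar{x}}(\bar{x} - \mu)^2
\end{aligned}
\]
which proves the required result.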
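2. The likelihood of the sample is given by
\[ L(\mu, \theta; x_j) = c \cdot \theta^{n/2}\exp\left[-\frac{\theta}{2\mu^2}\sum_{j=1}^{n}\frac{(x_j - \mu)^2}{x_j}\right] \]
and the corresponding log-likelihood gives
\[
\begin{aligned}
\ell(\mu, \theta; x_j)
&= \log c + \frac{n}{2}\log\theta - \frac{\theta}{2\mu^2}\sum_{j=1}^{n}\frac{(x_j - \mu)^2}{x_j} \\
&= \log c + \frac{n}{2}\log\theta - \frac{\theta}{2\mu^2}\left[\mu^2\sum_{j=1}^{n}\left(\frac{1}{x_j} - \frac{1}{\bar{x}}\right) + \frac{n}{\bar{x}}(\bar{x} - \mu)^2\right] \\
&= \log c + \frac{n}{2}\log\theta - \frac{\theta}{2}\left[\sum_{j=1}^{n}\left(\frac{1}{x_j} - \frac{1}{\bar{x}}\right) + \frac{n}{\bar{x}}\left(\frac{\bar{x}}{\mu} - 1\right)^2\right].
\end{aligned}
\]
Therefore, differentiating we have
\[ \frac{\partial\ell}{\partial\mu} = -\frac{\theta}{2}\,\frac{2n}{\bar{x}}\left(\frac{\bar{x}}{\mu} - 1\right)\left(-\frac{\bar{x}}{\mu^2}\right) = 0 \]
which implies
\[ \hat{\mu} = \bar{x} \]
and, similarly, we have
\[ \frac{\partial\ell}{\partial\theta} = \frac{n}{2}\,\frac{1}{\theta} - \frac{1}{2}\left[\sum_{j=1}^{n}\left(\frac{1}{x_j} - \frac{1}{\bar{x}}\right) + \frac{n}{\bar{x}}\left(\frac{\bar{x}}{\mu} - 1\right)^2\right] = 0 \]
which implies
\[ \hat{\theta} = \frac{n}{\sum_{j=1}^{n}\left(\frac{1}{x_j} - \frac{1}{\bar{x}}\right)}. \]
They both yield the desired results.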
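Both parts of Solution 2.22 can be checked numerically. The sketch below uses a small made-up sample, verifies the identity of part 1 at an arbitrary value of µ, and compares the closed-form estimates of part 2 with a numerical maximiser of the log-likelihood (written up to the constant log c). The data and starting values are assumptions for illustration.

```r
x <- c(1.2, 0.7, 2.5, 3.1, 0.9, 1.8)   # hypothetical positive sample
n <- length(x)
xbar <- mean(x)

# part 1: check the identity at an arbitrary mu
mu0 <- 1.3
lhs <- sum((x - mu0)^2 / x)
rhs <- mu0^2 * sum(1 / x - 1 / xbar) + n / xbar * (xbar - mu0)^2

# part 2: closed-form estimates
mu_hat <- xbar
theta_hat <- n / sum(1 / x - 1 / xbar)

# log-likelihood without the constant log c; par = c(mu, theta)
loglik <- function(par) {
  n / 2 * log(par[2]) - par[2] / (2 * par[1]^2) * sum((x - par[1])^2 / x)
}

# numerical maximisation (minimise the negative log-likelihood)
fit <- optim(c(1, 1), function(par) -loglik(par),
             method = "L-BFGS-B", lower = c(1e-6, 1e-6))

print(c(lhs = lhs, rhs = rhs))
print(rbind(closed_form = c(mu_hat, theta_hat), numerical = fit$par))
```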
Solution 2.23: [Fit5, Exercise] Notice that the density of the given Pareto can be written as
\[ f_X(x) = \alpha\,100^{\alpha}\,x^{-\alpha-1}. \]
1. We maximize the likelihood given by
\[ L(\alpha) = \alpha^{20}\,100^{20\alpha}\prod_{i} x_i^{-\alpha-1} \]
and taking the log likelihood, we have
\[ \ell(\alpha) = 20\log\alpha + 20\alpha\log 100 - (\alpha + 1)\sum_{i}\log x_i. \]
Differentiating, we get
\[ \frac{\partial\ell(\alpha)}{\partial\alpha} = \frac{20}{\alpha} + 20\log 100 - \sum_{i}\log x_i = 0 \]
so that
\[ \hat{\alpha} = \frac{20}{\sum_{i}\log x_i - 40\log 10} = \frac{20}{99.1252 - 92.1034} = 2.848. \]
2. The required probability estimate is therefore
\[ P(X > 200) = (100/200)^{2.848} = 0.139. \]
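The arithmetic of Solution 2.23 can be reproduced from the summary statistic alone. The sketch below takes the value 99.1252 for the sum of log-claims from the solution above, since the raw data are not reproduced here.

```r
n <- 20
sum_log_x <- 99.1252                 # sum of log(x_i), as quoted in the solution

# MLE of alpha for the Pareto with lower bound 100
alpha_hat <- n / (sum_log_x - n * log(100))

# estimated survival probability at 200
tail_prob <- (100 / 200)^alpha_hat

print(c(alpha_hat = alpha_hat, tail_prob = tail_prob))
```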
• From Wuthrich (2014, Lemma 3.7), we have the biased MLE estimator of α (given θ)
\[ \hat{\alpha}^{MLE} = \left(\frac{1}{n}\sum_{i=1}^{n}\log Y_i - \log\theta\right)^{-1}. \]
Then the unbiased version of the MLE estimator for α is
\[ \frac{n-1}{n}\,\hat{\alpha}^{MLE} = 0.9824864. \]
• In the previous part of the question, the fitted distribution for claim sizes $Y_i$ is a Pareto
distribution (50, 0.9824864). Next we count the number of claims for each year and fit
them to a Poisson distribution. There are 2 claims in 1986, 2 in 1987, 1 in 1990, 1 in
1992, 2 in 1993, 1 in 1994, 3 in 1999, 2 in 2000 and 1 in 2005. Other years in the period
1986-2005 do not have any claims. These claim counts yield an MLE estimator for the
Poisson parameter of $\hat{\lambda}^{MLE} = 0.75$. The expected claim amount (per claim) is
\[
\begin{aligned}
E[\min(Y_i, 2000)] &= \int_0^{2000} y\,g(y)\,dy + \int_{2000}^{\infty} 2000\,g(y)\,dy \\
&= I(G(2000)) + 2000\,(1 - G(2000)) \\
&= 1 - \left(\frac{2000}{50}\right)^{-0.9824864+1} + 2000\left(\frac{2000}{50}\right)^{-0.9824864} \\
&= 53.27017.
\end{aligned}
\]
So the expected total yearly claim amount is (using properties of compound Poisson)
0.75 · 53.27017 = 39.95263.
• The probability that we observe a storm and flood event next year which exceeds the
level of 2 billion CHF is the product of the probability of having exactly one claim and the
probability that the claim amount exceeds 2 billion,
\[ \hat{\lambda}^{MLE}\,e^{-\hat{\lambda}^{MLE}} \cdot \int_{2000}^{\infty} g(y)\,dy = 0.00945. \]
Module 3
Individual Claim Size Modelling
3.1 Data analysis and descriptive statistics

Exercise 3.1: [los36R, Solution][R∗ ] Develop a function that will yield a vector with the first
three central moments, γ1 and γ2 as a function of the cgf of a random variable. One possible
beginning is
CMom123Gam12 <- function(cgf,param){
where ”cgf” is an expression and where ”param” is a list with the numerical values of the
parameters of ”cgf”. The following command—for an inverse Gaussian(α = 2, β = 4)
CMom123Gam12(expression(alpha*(1-sqrt(1-2*t/beta))),list(alpha=2,beta=4))
should yield
0.500000 0.125000 0.093750 2.121320 7.500000
[Hint: use the functions ”D” and ”eval”.]
Exercise 3.2: [los20R, Solution][R∗ ]
1. Create a function that will calculate and return (to the assigned item, ”item<-function(·)”)
a vector with E[·], V ar(·), γ1 (·) and γ2 (·) of a non-negative discrete random variable (with
finite range) as a function of its pmf. In addition, a binary variable indicates if these
descriptive statistics should be printed in a data frame or not. Thus, the code (in a separate
R document) should look like
pmf_to_desc_stats <- function(pmf,print) {
[code omitted]
where ”pmf” is a vector with the probabilities and where ”print” is a binary (0-1) variable
indicating if the results should print or not.
2. Add this function to the code developed in Exercise 2.7 part 2 to print the descriptive
statistics. You should get something like
[1] "Descriptive statistics"
mean variance skewness kurtosis
1 12.05 35.1675 0.1213840 -0.4493699
3.2 Selected parametric claims size distributions

Exercise 3.3: [los1K, Solution] [Kaas et al. (2008), Problem 3.8.1] Determine the mean and the
variance of the lognormal and the Pareto distribution. Proceed as follows: if Y ∼ lognormal(µ, σ 2 ),
then ln Y ∼ N (µ, σ 2 ); if Y ∼ Pareto(α, x0 ), then Y /x0 ∼ Pareto(α, 1) and ln(1 + Y /x0 ) ∼
exponential(α).
Exercise 3.4: [NLI7, Solution][Wuthrich (2014), Exercise 7] Assume Y ∼ Γ(γ, c), where its
density is, for y ≥ 0,
\[ g(y) = \frac{c^{\gamma}}{\Gamma(\gamma)}\,y^{\gamma-1}\exp(-cy). \]
• Prove the statements of the moment generating function $M_Y$ and the loss size index
function I(G(y)). Hint: use the trick of the proof of Proposition 2.20 in Wuthrich (2014).
• Prove the statement
\[ e(u) = \mu_Y\,\frac{1 - I(G(u))}{1 - G(u)} - u, \qquad E\left[Y 1_{\{u_1 < Y \le u_2\}}\right] = \mu_Y\left(I(G(u_2)) - I(G(u_1))\right). \]
Exercise 3.5: [los2K, Solution] [Kaas et al. (2008), Problem 2.2.1] Determine the expected
value and the variance of X = IB if the claim probability equals 0.1. First, assume that B
equals 5 with probability 1. Then, let B ∼ Uniform(0,10).
Exercise 3.6: [los4K, Solution] [Kaas et al. (2008), Problem 2.2.5] If X = IB, what is mX (t)?
Exercise 3.7: [los5K, Solution] [Kaas et al. (2008), Problem 2.2.6] Consider the following cdf
F:
\[ F(x) = \begin{cases} 0 & \text{for } x < 2 \\ \frac{x}{4} & \text{for } 2 \le x < 4 \\ 1 & \text{for } 4 \le x \end{cases} \]
Determine independent random variables I, X, and Y such that Z = IX + (1 − I)Y has cdf
F , I ∼ Bernoulli, X is a discrete and Y a continuous random variable.
Exercise 3.8: [los6K, Solution] [Kaas et al. (2008), Problem 2.2.8] Suppose that T = qX +
(1 − q)Y and Z = IX + (1 − I)Y with I ∼ Bernoulli(q). Compare E[T k ] with E[Z k ], k = 1, 2.
3.3 Model selection

Exercise 3.9: [NLI5, Solution][Wuthrich (2014), Exercise 5] Consider the data given in Table
3.1. Estimate the parameters for the Poisson and the negative-binomial models. Which model
is preferred? Does a χ2 -goodness-of-fit test reject the null hypothesis on the 5% significance
level of having Poisson distributions?
t 1 2 3 4 5 6 7 8 9 10
Nt 1000 997 985 989 1056 1070 994 986 1093 1054
vt 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000
Table 3.1: Observed claims counts Nt and corresponding volumes vt
Exercise 3.10: [Fit6R, Solution] The following data are the results of a sample of 250 losses:
Range of loss Number of observations
0 - 25 5
25 - 50 37
50 - 75 28
75 - 100 31
100 - 125 23
125 - 150 9
150 - 200 22
200 - 250 17
250 - 350 15
350 - 500 17
500 - 750 13
750 - 1,000 12
1,000 - 1,500 3
1,500 - 2,500 5
2,500 - 5,000 5
5,000 - 10,000 3
10,000 - 25,000 3
≥ 25,000 2
Consider the inverse exponential distribution with CDF

FX (x) = exp (−θ/x) , for x > 0, θ > 0.
1. Determine the maximum likelihood estimate of θ. (Be careful with the log-likelihood
function, since the data are grouped.)
2. Conduct a χ2 goodness-of-fit test of this inverse exponential distribution model on the
data.
3. Explain the two other types of hypothesis tests, together with their similarities and differ-
ences, that can be conducted in determining the quality of the fit of your chosen model.
Exercise 3.11: [NLI9, Solution][Wuthrich (2014), Exercise 9] Assume we have i.i.d. claim
sizes $Y = (Y_1, ..., Y_n)'$ with n = 1000 which were generated by a gamma distribution, see Figure
3.1. The sample mean and sample standard deviation are given by
\[ \hat{\mu}_n = 0.1039 \quad \text{and} \quad \hat{\sigma}_n = 0.1039. \]
If we fit the parameters of the gamma distribution we obtain the method of moments estimator
and the MLEs
\[ \hat{\gamma}^{MM} = 0.9794 \quad \text{and} \quad \hat{c}^{MM} = 9.4249, \]
\[ \hat{\gamma}^{MLE} = 1.0013 \quad \text{and} \quad \hat{c}^{MLE} = 9.6360. \]
This provides the fitted distributions displayed in Figure 3.2. The fits look perfect and the
corresponding log-likelihoods are given by
\[ \ell_Y(\hat{\gamma}^{MM}, \hat{c}^{MM}) = 1264.013 \quad \text{and} \quad \ell_Y(\hat{\gamma}^{MLE}, \hat{c}^{MLE}) = 1264.171. \]
Figure 3.1: i.i.d. claim sizes $Y = (Y_1, ..., Y_n)'$ with n = 1000; lhs: observed data; rhs: empirical
distribution function.
Figure 3.2: Fitted gamma distribution; lhs: log-log plot; rhs: QQ plot.
(a) Why is $\ell_Y(\hat{\gamma}^{MLE}, \hat{c}^{MLE}) > \ell_Y(\hat{\gamma}^{MM}, \hat{c}^{MM})$ and which fit should be preferred according to
AIC?
(b) The estimates of γ are very close to 1 and we could also use an exponential distribution
function. For the exponential distribution function we obtain MLE $\hat{c}^{MLE} = 9.6231$
and $\ell_Y(\hat{c}^{MLE}) = 1264.169$. Which model (gamma or exponential) should be preferred
according to the AIC and the BIC?
Exercise 3.12: [Fit7R, Solution][R∗ ] The data in the attachment liability.txt contains data
on liability insurance claim sizes (in German former currency - Marks) for the year 1982.
1. Give a summary of the data.
2. Fit a Pareto distribution to these data using maximum likelihood estimation. Provide the
usual graphical comparisons (histogram vs. fitted parametric density function, empirical
CDF vs. fitted parametric CDF, Q-Q plot, P-P plot).
3. Fit a Weibull distribution to these data using maximum likelihood estimation. Provide
the usual graphical comparisons.
4. Fit a lognormal distribution to these data using maximum likelihood estimation. Provide
the usual comparisons.
5. Which distribution would you choose? Justify your answer.
3.4 Calculating within layers for claim sizes

Exercise 3.13: [Fit1, Solution] Consider a set of n right-censored insurance claims observations
where the observed data is represented as
(xj , δj ) for j = 1, 2, ..., n
where xj refers to the claim amount observed and δj is the (right-censor) indicator whether
the applicable policy limit has been reached. You are to fit a (simple) exponential distribution
model to the observed claims with
\[ f_X(x) = \beta e^{-\beta x}, \quad \text{for } x > 0. \]
1. Write down the log-likelihood function for estimating the exponential parameter.
2. Maximize this function and solve for the parameter estimate.
3. Describe how you can derive a standard error of your parameter estimate.
Exercise 3.14: [Fit2, Solution] [Institute of Actuaries, Subject 106, September 2000] An insur-
ance company has a portfolio of policies with a per-risk excess of loss reinsurance arrangement
with a deductible of M > 0. Claims made to the direct insurer, denoted by X, have a Pareto
distribution with cumulative distribution function
\[ F_X(x; \alpha) = 1 - \left(\frac{200}{200 + x}\right)^{\alpha}. \]
There were a total of n claims from the portfolio. Of these claims, I were for amounts less than
the deductible. The claims less than the deductible are
xi , for i = 1, 2, ..., I.
The value of the statistic $\sum_{i=1}^{I}\log(200 + x_i) = y$ is given.
1. Derive an expression for the maximum likelihood estimate of α in terms of n, I, M and y.
2. From last year’s experience, we have the following information:
M = 600, n = 500, I = 400, y = 2209.269.
(a) Verify that the maximum likelihood estimate of α is $\hat{\alpha} = 1.75$.
(b) Assuming that α = 1.75, estimate the average amounts paid by the insurer and the
reinsurer on a claim made during the year.
Exercise 3.15: [sur12K, Solution] [Kaas et al. (2008), Problem 3.8.5] Determine the cdf
Pr[Z ≤ d] and the stop loss premium E[(Z − d)+ ] for a mixture or combination Z of exponential
distributions as in
\[ p(x) = q\alpha e^{-\alpha x} + (1 - q)\beta e^{-\beta x}, \quad x > 0. \]
Also determine the conditional distribution of Z − z, given Z > z.
Exercise 3.16: [sur13, Solution] Show that
\[ E\left[(S - d)_+\right] = E[S] - d + \int_0^{d}(d - x)\,dF(x) = E[S] - \int_0^{d}\left[1 - F(x)\right]dx. \]
Exercise 3.17: [NLI10, Solution][Wuthrich (2014), Exercise 10] In Figure 3.3 we display the
distribution function of loss Y ∼ G and the resulting distribution function
of the loss to the insurer after applying different re-insurance covers to loss Y . Can you explicitly
determine the re-insurance covers from the graphs in Figure 3.3? Note that the functions below
are cumulative distribution functions.
Figure 3.3: Cumulative distribution functions implied by re-insurance contracts.
Exercise 3.18: [NLI11, Solution][Wuthrich (2014), Exercise 11] Assume claims sizes Yi in a
given line of business can be described by a log-normal distribution with mean E[Yi ] = 3000 and
Vco(Yi ) = 4 (coefficient of variation).
Up to now the insurance company was not offering contracts with deductibles. Now it wants to
offer the following three deductible versions d = 200, 500, 1000. Answer the following questions:
1. How does the claims frequency change by the introduction of deductibles?
2. How does the expected claim size change by the introduction of deductibles?
3. By which amount changes the expected total claim amount?
Exercise 3.19: [sur14K, Solution] [Kaas et al. (2008), Problem 3.10.4] Assume that X1 , X2 , . . .
are independent and identically distributed risks that represent the loss on a portfolio in con-
secutive years. We could insure these risks with separate stop loss contracts for one year with a
retention d, but we could also consider only one contract for the whole period of n years with the
retention nd. Show that E [(X1 − d)+ ] + . . . + E [(Xn − d)+ ] ≥ E [(X1 + . . . + Xn − nd)+ ]. If
d ≥ E[Xi ], examine how the total net stop loss premium for the one-year contracts E [(X1 − d)+ ]
relates to the stop loss premium for the n-year period E [(X1 + . . . + Xn − nd)+ ].
Hint [Kaas et al. (2008), Rule of thumb 3.10.1]: For retentions t larger than the expectation
µ = E[U ] = E[W ], we have for the stop loss premiums of risks U and W :
\[ \frac{E\left[(U - t)_+\right]}{E\left[(W - t)_+\right]} \approx \frac{\mathrm{Var}(U)}{\mathrm{Var}(W)}. \]
Exercise 3.20: [sur15, Solution] [2005 Quiz 1 Question 4] An insurer has a portfolio consisting
of 1000 one year term life insurance policies that each pay $100 in the event of death within one
year. The probability of death is 0.002.
The insurer has an EoL reinsurance for each policy in excess of $90.
1. For the insurer, calculate the expected total annual claims and the variance of total annual
claims without the reinsurance.
2. For the insurer, calculate the expected total annual claims and the variance of total annual
claims with the reinsurance.
3. For the reinsurer, calculate the expected total annual claims and the variance of total
annual claims.
Exercise 3.21: [sur16R, Solution][R∗ ] This question is a follow-up from Exercise 4.18
In Exercise 4.18, we have calculated x, fS (x), FS (x), x = 0, 1, 2, . . . , 25. Now using the same
idea, prepare a table for each of the cases i = 1, 2 and 3 including:
1. d, E[(S − d)+ ], E[((S − d)+ )2 ], V ar[(S − d)+ ], d = 0, 1, 2, . . . , 25.
2. Interpret your results (compare the three assumptions i = 1, 2, or 3).
3.5 Solutions
Solution 3.1: [los36R, Exercise] One possible code is as follows:
CMom123Gam12 <- function(cgf,param){

#initialising vectors
dcgf <- c(cgf) # for cgf, cgf’, cgf’’, etc...
kappa <- c() # for the cumulants
param <- data.frame(param,t=0)
for(i in 1:4){
dcgf <- c(dcgf, D(dcgf[i],"t")) # i-th derivative
kappa <- c(kappa, eval(dcgf[i+1],param)) #i-th cumulant
}
# remember that 4-th cumulant is not the 4-th central moment!
#calculating skewness and kurtosis

gamma <- c(kappa[3]/kappa[2]^(3/2),kappa[4]/kappa[2]^2)
#returning results
c(kappa[1:3],gamma)
}
Solution 3.2: [los20R, Exercise]
1. One possible code is as follows:

pmf_to_desc_stats <- function(pmf,print) {
# range of rv
range <- length(pmf)-1
# descriptive statistics
m <- sum(pmf*(0:range))
v <- sum(pmf*(0:range - m)^2)
g1 <- sum(pmf*(0:range - m)^3)/v^(3/2)
g2 <- sum(pmf*(0:range - m)^4)/v^2-3
# print results
if(print==1){
results <- data.frame(mean=m,variance=v,skewness=g1,kurtosis=g2)
print(results)
}
# return vector of results

c(m,v,g1,g2)
}
Note that it is important to return a vector with the results you may need to use later in
your code, because they are lost once the function has been processed. In other words, if
you run the function on one line, ”m” (the mean) will not be available in the subsequent
lines. This also means (and this is the reason why it is so) that you can use ”m” in your
main code and use the function without having a conflict between the two ”m”’s.
2. Add the following lines to the code:
source("[path to file]")
print("Descriptive statistics")
pmf_to_desc_stats(fS[,conv+1],1)
Note that ”[path to the file]” needs to be replaced by the path to the R document where
the code of your function is. If you create several functions, you could put all of them in
a single file that is sourced at the beginning of your normal R files. This way, you can
use all your functions whenever you want. Note that before spending time creating a new
function, it may be advisable to check if this function does not already exist in R. . .
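One way to gain confidence in a function such as the one in Solution 3.2 is to run the same formulas on a pmf whose moments are known in closed form. The sketch below uses a Binomial(10, 0.3) pmf; the helper `desc_stats` is a compact stand-in written for this illustration, not the function from the solution.

```r
# compact stand-in: mean, variance, skewness and excess kurtosis from a pmf on 0, 1, ..., length(pmf) - 1
desc_stats <- function(pmf) {
  x <- 0:(length(pmf) - 1)
  m <- sum(pmf * x)
  v <- sum(pmf * (x - m)^2)
  c(mean = m,
    variance = v,
    skewness = sum(pmf * (x - m)^3) / v^(3/2),
    kurtosis = sum(pmf * (x - m)^4) / v^2 - 3)
}

# Binomial(n, p): closed-form mean, variance, skewness and excess kurtosis
n <- 10; p <- 0.3; q <- 1 - p
stats <- desc_stats(dbinom(0:n, size = n, prob = p))
closed <- c(n * p, n * p * q, (q - p) / sqrt(n * p * q), (1 - 6 * p * q) / (n * p * q))

print(rbind(from_pmf = stats, closed_form = closed))
```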
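Solution 3.3: [los1K, Exercise] If $Y \sim \text{lognormal}(\mu, \sigma^2)$, then $\log Y \sim N(\mu, \sigma^2)$. Use that
$E(Y^j) = m_{\log Y}(j)$. Thus,
\[ E(Y) = E[\exp(\log Y)] = m_{\log Y}(1) = \exp\left(\mu + \tfrac{1}{2}\sigma^2\right) \]
and
\[ E(Y^2) = E[\exp(2\log Y)] = m_{\log Y}(2) = \exp\left(2\mu + 2\sigma^2\right) \]
so that
\[ \mathrm{Var}(Y) = \exp\left(2\mu + \sigma^2\right)\left[\exp\left(\sigma^2\right) - 1\right]. \]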

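If $Y \sim \text{Pareto}(\alpha, x_0)$ then $Y/x_0 \sim \text{Pareto}(\alpha, 1)$ and $\log(1 + Y/x_0) \sim \text{Exponential}(\alpha)$.
Proof
\[
\begin{aligned}
\Pr\left[\frac{Y}{x_0} \le y\right] &= \Pr[Y \le yx_0] \\
&= 1 - \left(\frac{x_0}{x_0 + yx_0}\right)^{\alpha} \\
&= 1 - \frac{1}{(1 + y)^{\alpha}} \quad \text{which is the CDF of a Pareto}(\alpha, 1) \text{ random variable}
\end{aligned}
\]
and
\[
\begin{aligned}
\Pr\left[\ln\left(1 + \frac{Y}{x_0}\right) \le y\right] &= \Pr\left[Y \le (e^{y} - 1)x_0\right] \\
&= 1 - \left(\frac{x_0}{x_0 + x_0(e^{y} - 1)}\right)^{\alpha} \\
&= 1 - e^{-\alpha y} \quad \text{which is the CDF of an Exp}(\alpha) \text{ random variable}
\end{aligned}
\]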
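So
\[
\begin{aligned}
E[Y] &= E[Y + x_0] - x_0 \\
&= x_0\,E\left[\frac{Y}{x_0} + 1\right] - x_0 \\
&= x_0\,E\left[\exp\left(1 \times \ln\left(\frac{Y}{x_0} + 1\right)\right)\right] - x_0 \\
&= x_0\,M_Z(1) - x_0 \quad \text{where } Z \sim \text{Exponential}(\alpha) \\
&= x_0 \times \frac{\alpha}{\alpha - 1} - x_0 \\
&= \frac{x_0}{\alpha - 1}
\end{aligned}
\]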
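Now
\[ E\left[\exp\left(2\ln\left(\frac{Y}{x_0} + 1\right)\right)\right] = M_Z(2) = \frac{\alpha}{\alpha - 2} \quad \text{where } Z \sim \text{Exponential}(\alpha) \]
But
\[ E\left[\exp\left(2\ln\left(\frac{Y}{x_0} + 1\right)\right)\right] = E\left[\left(\frac{Y + x_0}{x_0}\right)^2\right] = \frac{1}{x_0^2}\,E\left[Y^2 + 2Yx_0 + x_0^2\right] \]
So
\[
\begin{aligned}
E\left[Y^2\right] &= x_0^2\,\frac{\alpha}{\alpha - 2} - x_0^2 - 2x_0\,\frac{x_0}{\alpha - 1} \\
&= x_0^2\,\frac{\alpha(\alpha - 1) - (\alpha - 1)(\alpha - 2) - 2(\alpha - 2)}{(\alpha - 1)(\alpha - 2)} \\
&= \frac{x_0^2}{(\alpha - 1)(\alpha - 2)}\left(2(\alpha - 1) - 2(\alpha - 2)\right) \\
&= \frac{2x_0^2}{(\alpha - 1)(\alpha - 2)}
\end{aligned}
\]
Then
\[
\begin{aligned}
\mathrm{Var}[Y] &= \frac{2x_0^2}{(\alpha - 1)(\alpha - 2)} - \left(\frac{x_0}{\alpha - 1}\right)^2 \\
&= \frac{2x_0^2(\alpha - 1)}{(\alpha - 1)^2(\alpha - 2)} - \frac{x_0^2(\alpha - 2)}{(\alpha - 1)^2(\alpha - 2)} \\
&= \frac{x_0^2(2\alpha - 2 - \alpha + 2)}{(\alpha - 1)^2(\alpha - 2)} \\
&= \frac{x_0^2\,\alpha}{(\alpha - 1)^2(\alpha - 2)}
\end{aligned}
\]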
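The Pareto mean and variance formulas can be checked by numerical integration. The sketch below assumes the parametrisation used in the proof above, with cdf $1 - (x_0/(x_0 + y))^{\alpha}$ for y > 0, and uses the arbitrary illustrative values α = 5 and x0 = 3.

```r
alpha <- 5    # illustrative shape (must exceed 2 for a finite variance)
x0 <- 3       # illustrative scale

# density obtained by differentiating the cdf 1 - (x0 / (x0 + y))^alpha
dens <- function(y) alpha * x0^alpha / (x0 + y)^(alpha + 1)

# first and second moments by numerical integration
m1 <- integrate(function(y) y * dens(y), lower = 0, upper = Inf)$value
m2 <- integrate(function(y) y^2 * dens(y), lower = 0, upper = Inf)$value

mean_closed <- x0 / (alpha - 1)
var_closed <- x0^2 * alpha / ((alpha - 1)^2 * (alpha - 2))

print(rbind(numerical = c(mean = m1, variance = m2 - m1^2),
            closed_form = c(mean_closed, var_closed)))
```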
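Solution 3.4: [NLI7, Exercise]
• First we find the moment generating function $M_Y(r)$,
\[
\begin{aligned}
M_Y(r) &= \int_0^{\infty} \exp(ry)\,\frac{c^{\gamma}}{\Gamma(\gamma)}\,y^{\gamma-1}\exp(-cy)\,dy \\
&= \left(\frac{c}{c - r}\right)^{\gamma}\int_0^{\infty} \frac{(c - r)^{\gamma}}{\Gamma(\gamma)}\,y^{\gamma-1}\exp[-(c - r)y]\,dy \\
&= \left(\frac{c}{c - r}\right)^{\gamma}.
\end{aligned}
\]
Next we find the loss size index function for level y, I(G(y)):
\[
\begin{aligned}
I(G(y)) &= E\left[Y 1_{\{Y \le y\}}\right]/\mu_Y \\
&= \frac{c}{\gamma}\int_0^{y} x\,\frac{c^{\gamma}}{\Gamma(\gamma)}\,x^{\gamma-1}\exp(-cx)\,dx \\
&= \int_0^{y} \frac{c^{\gamma+1}}{\Gamma(\gamma + 1)}\,x^{\gamma+1-1}\exp(-cx)\,dx \\
&= \frac{1}{\Gamma(\gamma + 1)}\int_0^{cy} z^{\gamma+1-1}\exp(-z)\,dz \\
&= G(\gamma + 1, cy).
\end{aligned}
\]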
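•
\[
\begin{aligned}
e(u) &= E[Y - u \mid Y > u] = E[Y \mid Y > u] - u \\
&= \int_u^{\infty} \frac{y\,g(y)}{1 - G(u)}\,dy - u = \frac{1}{1 - G(u)}\int_u^{\infty} y\,g(y)\,dy - u \\
&= \frac{1}{1 - G(u)}\left(\int_0^{\infty} y\,g(y)\,dy - \int_0^{u} y\,g(y)\,dy\right) - u \\
&= \frac{1}{1 - G(u)}\left(\mu_Y - \mu_Y I(G(u))\right) - u \\
&= \mu_Y\,\frac{1 - I(G(u))}{1 - G(u)} - u.
\end{aligned}
\]
\[
\begin{aligned}
E\left[Y 1_{\{u_1 < Y \le u_2\}}\right] &= E\left[Y\left(1_{\{Y \le u_2\}} - 1_{\{Y \le u_1\}}\right)\right] \\
&= E\left[Y 1_{\{Y \le u_2\}}\right] - E\left[Y 1_{\{Y \le u_1\}}\right] \\
&= \mu_Y\left(I(G(u_2)) - I(G(u_1))\right).
\end{aligned}
\]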
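Both gamma results can be checked numerically with R's built-in gamma functions. In the sketch below the parameter values and the level u are arbitrary assumptions; `pgamma(cc * u, shape = gam + 1)` plays the role of G(γ + 1, cy), the standardised incomplete gamma function.

```r
gam <- 2.5    # illustrative shape gamma
cc <- 1.5     # illustrative rate c
u <- 2        # illustrative level

g <- function(y) dgamma(y, shape = gam, rate = cc)
mu_Y <- gam / cc

# loss size index: numerical integral versus closed form
I_num <- integrate(function(y) y * g(y), lower = 0, upper = u)$value / mu_Y
I_closed <- pgamma(cc * u, shape = gam + 1)

# mean excess function: numerical integral versus the formula above
surv_u <- 1 - pgamma(u, shape = gam, rate = cc)
e_num <- integrate(function(y) (y - u) * g(y), lower = u, upper = Inf)$value / surv_u
e_closed <- mu_Y * (1 - I_closed) / surv_u - u

print(c(I_num = I_num, I_closed = I_closed, e_num = e_num, e_closed = e_closed))
```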
Solution 3.5: [los2K, Exercise] First note that with B = 5 w.p. 1, we have
E (X |I = 1) = 5, E (X |I = 0) = 0
V ar(X|I = 1) = 0, V ar(X|I = 0) = 0.
Now consider the case q = 0.1, which is the probability of a claim occurring. We have
\[ X = \begin{cases} 5, & \text{w.p. } 0.1 \\ 0, & \text{w.p. } 0.9 \end{cases} \]
hence
\[ E(X) = 0.5 \]
\[ \mathrm{Var}(X) = E(X^2) - (E(X))^2 = 25(0.1) - 0.5^2 = 2.25. \]
Now if B ∼ U (0, 10) then
\[ E(X \mid I = 1) = \frac{10}{2} = 5, \qquad E(X \mid I = 0) = 0 \]
\[ \mathrm{Var}(X \mid I = 1) = \frac{10^2}{12} = \frac{100}{12}, \qquad \mathrm{Var}(X \mid I = 0) = 0. \]
Thus
\[ E(X) = E[E(X \mid I)] = 5(0.1) + 0(0.9) = 0.5 \]
We will use the decomposition of variance result V ar[X] = V ar[E[X|I]] + E[V ar[X|I]] to
calculate the variance. We have
\[ \mathrm{Var}(E(X \mid I)) = E\left[(E(X \mid I) - E(X))^2\right] = (0 - 0.5)^2(0.9) + (5 - 0.5)^2(0.1) = \frac{27}{12} \]
\[ E(\mathrm{Var}(X \mid I)) = 0(0.9) + \frac{100}{12}(0.1) = \frac{10}{12}. \]
Then
\[ \mathrm{Var}(X) = \frac{27}{12} + \frac{10}{12} = \frac{37}{12}. \]
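The two cases of Solution 3.5 can be verified with a few lines of arithmetic in R. The sketch uses the variance decomposition Var(X) = E[Var(X|I)] + Var(E[X|I]) for X = IB with I ∼ Bernoulli(q) independent of B; the variable names are illustrative.

```r
q <- 0.1    # claim probability

# Case 1: B = 5 with probability 1
ex1 <- q * 5
var1 <- q * 5^2 - ex1^2

# Case 2: B ~ Uniform(0, 10), so E(B) = 5 and Var(B) = 100 / 12
EB <- 5
VB <- 100 / 12
ex2 <- q * EB
var2 <- q * VB + q * (1 - q) * EB^2   # E[Var(X|I)] + Var(E[X|I])

print(c(mean_case1 = ex1, var_case1 = var1, mean_case2 = ex2, var_case2 = var2))
```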
Solution 3.6: [los4K, Exercise] Condition on I = 1 and I = 0:
\[
\begin{aligned}
m_X(t) &= E\left[E\left(e^{Xt} \mid I\right)\right] \\
&= (1 - q)\,e^{0} + q\,m_{X \mid I=1}(t) \\
&= 1 - q + q\,m_B(t).
\end{aligned}
\]
Solution 3.7: [los5K, Exercise] F (x) has a jump of size 1/2 at x = 2 and is uniform on (2, 4), so
F is the following mixture of cdf’s:
\[ F(x) = \frac{1}{2}G(x) + \frac{1}{2}H(x) \]
with
\[ G(x) = \begin{cases} 0, & x < 2 \\ 1, & x \ge 2 \end{cases}, \qquad H(x) = \begin{cases} 0, & x < 2 \\ \frac{x - 2}{2}, & 2 \le x < 4 \\ 1, & x \ge 4. \end{cases} \]
The mixed r.v. I · X + (1 − I) · Y has cdf F for I ∼ Bernoulli(1/2), X ≡ 2 and Y ∼ Uniform(2, 4),
independent.
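Solution 3.8: [los6K, Exercise] We have for the r.v. T = qX + (1 − q) Y :
\[ E(T) = qE(X) + (1 - q)E(Y) \]
and
\[
\begin{aligned}
E\left(T^2\right) &= E\left[q^2X^2 + 2q(1 - q)XY + (1 - q)^2Y^2\right] \\
&= q^2E\left(X^2\right) + 2q(1 - q)E(X)E(Y) + (1 - q)^2E\left(Y^2\right).
\end{aligned}
\]
We have random variable Z
\[ Z = IX + (1 - I)Y = \begin{cases} X, & \text{w.p. } q \\ Y, & \text{w.p. } (1 - q) \end{cases} \]
which is a mixture random variable. For I independent of X and Y we have
\[
\begin{aligned}
E(Z) &= E[IX + (1 - I)Y] \\
&= E(I)E(X) + (1 - E(I))E(Y) \\
&= qE(X) + (1 - q)E(Y) = E(T)
\end{aligned}
\]
and
\[
\begin{aligned}
E\left(Z^2\right) &= E\left[I^2X^2 + 2I(1 - I)XY + (1 - I)^2Y^2\right] \\
&= E\left(I^2\right)E\left(X^2\right) + 0 + E\left[(1 - I)^2\right]E\left(Y^2\right) \\
&= qE\left(X^2\right) + (1 - q)E\left(Y^2\right).
\end{aligned}
\]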
Note that I(1 − I) ≡ 0 as either I or 1 − I is 0. Note also that

FZ (x) = P r(Z ≤ x) = P r(Z ≤ x and I = 0) + P r(Z ≤ x and I = 1)
= P r(Z ≤ x|I = 0)P r(I = 0) + P r(Z ≤ x|I = 1)P r(I = 1)
= P r(Y ≤ x|I = 0)P r(I = 0) + P r(X ≤ x|I = 1)P r(I = 1)
= (1 − q) · FY (x) + q · FX (x) .
Generally one can get
\[ E(Z^k) = qE(X^k) + (1 - q)E(Y^k) \]
by integration as
\[ \int x^k\,dF_Z(x) = q\int x^k\,dF_X(x) + (1 - q)\int x^k\,dF_Y(x). \]
Solution 3.9: [NLI5, Exercise] The MLE for the Poisson model is available in closed form:
\[ \hat{\lambda}^{MLE}_{POI} = \frac{1}{\sum_{t=1}^{T} v_t}\sum_{t=1}^{T} N_t. \]
The corresponding chi-square statistic is
\[ \chi^2_{POI} = \sum_{t=1}^{T} \frac{\left(N_t - \hat{\lambda}^{MLE}_{POI}v_t\right)^2}{\hat{\lambda}^{MLE}_{POI}v_t} = 14.83803. \]
We compare this value to the 95%-quantile of the chi-square distribution with 9 degrees of
freedom, 16.91898 (see qchisq(0.95,9) in the code below). Since the value of the test statistic is
smaller than the critical value, we cannot reject the null hypothesis on the 5% significance level.
For the negative-binomial model, we do not have an explicit formula for the MLE and we use
the function fitdistr(...) from the MASS package in R. Note that we can do this since the vt
all take the same value of 10000, so we can find the maximum likelihood estimator for the size
parameter and divide by 10000. Then the test statistic for the negative binomial model is similar
to the Poisson one,
\[ \chi^2_{NB} = \sum_{t=1}^{T} \frac{\left(N_t - \hat{\lambda}^{MLE}_{NB}v_t\right)^2}{\hat{\lambda}^{MLE}_{NB}v_t} = 1955.112. \]
This is far larger than the 95%-quantile of the chi-square distribution with 8 degrees of
freedom, 15.50731. So it is clear that the Poisson model is preferred.
R-code:
# r-code for Exercise 5 in NLI

# read the table first
N.vector<-c(1000,997,985,989,1056,1070,994,986,1093,1054)
v.vector<-rep(10000,10)
# Poisson MLE estimator of Lambda
lambda.Poi<-sum(N.vector)/sum(v.vector)
# chi square test statistics
sum((N.vector-lambda.Poi*v.vector)^2/(lambda.Poi*v.vector))
# we then need the 95%-quantile of the chi-sq with 9 degrees of freedom
qchisq(0.95,9)
# package MASS is needed

library(MASS)
# use fitdistr(...) function to find the MLE
NB.fit<-fitdistr(N.vector,"negative binomial")
lambda.NB<-NB.fit$estimate[1]/10000
# chi square test statistics
sum((N.vector-lambda.NB*v.vector)^2/(lambda.NB*v.vector))
Solution 3.10: [Fit6R, Exercise] Denote the range of losses by $[a_i, b_i)$ for $i = 1, 2, \ldots, 18$, since
there are 18 given intervals of losses. Note that $a_1 = 0$ and $b_{18} = \infty$.
1. Therefore the likelihood of the sample is given by
\[
L(\theta; a_i, b_i, n_i) = \prod_{i=1}^{18} \left[F(b_i) - F(a_i)\right]^{n_i},
\]
where $n_i$ refers to the number of claims observed in the $i$-th range of loss. The
log-likelihood is given by
\[
\ell(\theta; a_i, b_i, n_i) = \sum_{i=1}^{18} n_i \log\left[\exp(-\theta/b_i) - \exp(-\theta/a_i)\right].
\]
Differentiating this w.r.t. $\theta$ gives us
\[
\frac{\partial \ell(\theta; a_i, b_i, n_i)}{\partial\theta}
= \sum_{i=1}^{18} n_i \, \frac{-\frac{1}{b_i}\exp(-\theta/b_i) + \frac{1}{a_i}\exp(-\theta/a_i)}{\exp(-\theta/b_i) - \exp(-\theta/a_i)} = 0,
\]
which can only be solved numerically. We therefore use Excel to find the MLE of θ: in
the sheet Parameter Estimate of Fit6R.xls(x), the derivative is computed and Goal Seek
is used to find the value of θ at which the derivative is zero. The parameter estimate for
θ is 93.18568.
2. In the sheet Chi-sq Test of Fit6R.xls(x), we use Excel to compute the Chi-square test
statistic 16.5232 and the associated p-value 0.34825.
It is clear that the result of the chi-square test above does support the Inverse Exponential
model for the given data. The two other tests that can be conducted are the
Kolmogorov-Smirnov (K-S) and the Anderson-Darling (A-D) tests. Both test, like the chi-square
test, whether the data come from the assumed population. The K-S and A-D tests are quite
similar: both look at the difference between the empirical and model df's, K-S in absolute
value and A-D in squared difference. However, A-D is a weighted average, with more emphasis on
good fit in the tails than in the middle; K-S puts no such emphasis. For the K-S and A-D
tests, no adjustments are made to account for an increase in the number of parameters. The
result is that more complex models will often fare better on these tests. On the other
hand, the χ2 test adjusts the d.f. for increases in the number of parameters. All 3 tests
are sensitive to sample size:
• test statistics tend to increase when the sample size increases;

• a large sample size therefore increases the probability of rejecting all models.
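As an alternative to Goal Seek, the grouped-data maximisation in part 1 of Solution 3.10 could be sketched in R. The interval bounds and counts below are hypothetical (they are not the Fit6R data), and the search interval passed to optimize is an assumption, so the resulting estimate is illustrative only.

```r
# Hypothetical grouped data (NOT the Fit6R data): interval bounds [a_i, b_i) and counts n_i
a <- c(0, 50, 100, 200, 500)
b <- c(50, 100, 200, 500, Inf)
n <- c(12, 20, 25, 18, 10)

# Inverse exponential cdf F(x) = exp(-theta/x); this gives F(0) = 0 and F(Inf) = 1
F.invexp <- function(x, theta) exp(-theta / x)

# Grouped-data log-likelihood: sum of n_i * log[F(b_i) - F(a_i)]
loglik <- function(theta) {
  sum(n * log(F.invexp(b, theta) - F.invexp(a, theta)))
}

# One-dimensional maximisation over an assumed search interval
fit <- optimize(loglik, interval = c(1, 1000), maximum = TRUE)
theta.hat <- fit$maximum
theta.hat
```

Maximising the log-likelihood directly avoids coding the derivative by hand; the first-order condition can still be checked numerically at theta.hat.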
(a) The maximum likelihood estimators $(\hat\gamma^{MLE}, \hat c^{MLE})$ jointly maximise the likelihood
function, and therefore also maximise the log-likelihood function. This is the reason why we have
$\ell_Y(\hat\gamma^{MLE}, \hat c^{MLE}) > \ell_Y(\hat\gamma^{MM}, \hat c^{MM})$. For both methods, two estimated parameters are
involved, so the MLE approach produces the smallest AIC value and is preferred.
(b) We calculate the AIC and the BIC for gamma and exponential fits,
AICexp = −2526.34 and AICgamma = −2524.34,

BICexp = −2521.43 and BICgamma = −2514.53.
According to both the AIC and the BIC criteria, the exponential distribution is preferred
(smallest AIC/BIC).
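The AIC and BIC values in (b) can be reproduced with a short R sketch. The common log-likelihood value 1264.17 and the sample size n = 1000 are not stated in the solution; they are back-solved here from the four reported figures and should be read as assumptions.

```r
# AIC and BIC for a model with log-likelihood loglik, k parameters and n observations
aic <- function(loglik, k) -2 * loglik + 2 * k
bic <- function(loglik, k, n) -2 * loglik + k * log(n)

# Assumed inputs, back-solved from the reported AIC/BIC values
ll <- 1264.17   # log-likelihood (the same to two decimals for both fits)
n  <- 1000      # sample size

aic.values <- c(exp = aic(ll, 1), gamma = aic(ll, 2))
bic.values <- c(exp = bic(ll, 1, n), gamma = bic(ll, 2, n))
round(rbind(AIC = aic.values, BIC = bic.values), 2)
```

Because both fits reach essentially the same log-likelihood, the extra gamma parameter only adds to the penalty term, which is why the exponential model has the smaller AIC and BIC.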
Solution 3.12: [Fit7R, Exercise]
1. The summary of the data is provided below.
data<-read.table("liability.txt",header=T)
attach(data)
source("DataSummStats.R")
DataSummStats(liability)
Value
Number 6.800000e+02
Mean 7.416837e+04
5th Q 3.000000e+04
25th Q 3.500000e+04
Median 5.000000e+04
75th Q 7.500000e+04
95th Q 2.000000e+05
Variance 8.689669e+09
StdDev 9.321839e+04
Minimum 3.000000e+04
Maximum 1.200000e+06
Skewness 6.440000e+00
Kurtosis 5.720000e+01
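2. The Pareto distribution has density
\[
f(x) = \frac{\alpha\lambda^{\alpha}}{x^{\alpha+1}}, \qquad x > \lambda,
\]
with all observations $x_1, \ldots, x_n > \lambda$. Using maximum likelihood estimation, we obtain
\[
\hat\lambda_{mle} = \min(x_1, \ldots, x_n), \qquad
\hat\alpha_{mle} = \frac{n}{\sum_{i=1}^{n} \left(\log(x_i) - \log(\hat\lambda_{mle})\right)}.
\]
Substituting the data yields $\hat\lambda_{mle} = 30000$ and $\hat\alpha_{mle} = 2.012135$. The graphical
comparisons are shown below.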
source("qpareto.R")
source("dpareto.R")
source("ppareto.R")
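#MLE
lambda.hat<-min(liability)
# note: the sample size is hard-coded as 860 here and below,
# whereas the data summary above reports Number = 680
alpha.hat<-860/sum(log(liability)-log(lambda.hat))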
par.hat.p<-c(alpha.hat,lambda.hat)
par.hat.p
par(mfrow=c(2,2))
#histogram
hist(liability,breaks=100,prob=T,xlab="claims",main="Histogram of Claims")
xgrid<-seq(min(liability),max(liability),length=860)
lines(xgrid,dpareto(xgrid,par.hat.p[1],par.hat.p[2]),col=2)
legend(300000,0.00002,legend=c("Pareto Model"),lty=1,col=2)
#empirical distribution function

empirical<-ecdf(liability)
plot(xgrid,empirical(xgrid),type="l",xlab="claims",ylab="cdf",main="Empirical CDF")
lines(xgrid,ppareto(xgrid,par.hat.p[1],par.hat.p[2]),col=2)
legend(300000,0.6,legend=c("Empirical cdf","Estimated cdf"),lty=1,col=1:2)
#qqplot
plot(qpareto(empirical(liability),par.hat.p[1],par.hat.p[2]),liability,
xlab="theoretical quantiles",ylab="sample quantiles",main="Q-Q plot",cex=0.45)
abline(0,1,col=2)
#ppplot
plot(ppareto(liability,par.hat.p[1],par.hat.p[2]),empirical(liability),
xlab="theoretical probability",ylab="sample probability",main="P-P plot",cex=0.45)
abline(0,1,col=2)
3. The cumulative distribution function of the Weibull distribution is
\[
F(x) = 1 - e^{-(x/\lambda)^k}, \qquad x \ge 0.
\]
Using the constrOptim function in R to find the maximum likelihood estimators, we obtain
λ̂mle = 79791.605439
k̂mle = 1.184008.
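# negative log-likelihood; par[1] is the shape k and par[2] the scale lambda
loglike<-function(x,par){
-sum(log(dweibull(x,par[1],par[2])))
}
init.est <- c(1,5000)
# constrain both parameters to be positive (ui %*% par - ci >= 0)
fit.wei<-constrOptim(init.est,loglike,NULL,ui=diag(2),ci=c(0,0),x=liability)
par.hat.w<-fit.wei$par
par.hat.w
par(mfrow=c(2,2))
#histogram
hist(liability,breaks=100,prob=T,xlab="claims",main="Histogram of Claims")
lines(xgrid,dweibull(xgrid,par.hat.w[1],par.hat.w[2]),col=2)
legend(300000,0.00002,legend=c("Weibull Model"),lty=1,col=2)
#empirical distribution function
plot(xgrid,empirical(xgrid),type="l",xlab="claims",ylab="cdf",main="Empirical CDF")
lines(xgrid,pweibull(xgrid,par.hat.w[1],par.hat.w[2]),col=2)
legend(300000,0.6,legend=c("Empirical cdf","Estimated cdf"),lty=1,col=1:2)
#qqplot
plot(qweibull(empirical(liability),par.hat.w[1],par.hat.w[2]),liability,
xlab="theoretical quantiles",ylab="sample quantiles",main="Q-Q plot",cex=0.45)
abline(0,1,col=2)
#ppplot
plot(pweibull(liability,par.hat.w[1],par.hat.w[2]),empirical(liability),
xlab="theoretical probability",ylab="sample probability",main="P-P plot",cex=0.45)
abline(0,1,col=2)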
4. For the log-normal case, there are two parameters to be estimated (µ and σ²). Using
the same method as in the previous part, we obtain
µ̂mle = 10.9373482
σ̂mle = 0.6262126
The graphical comparisons are below.
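# negative log-likelihood; par[1] is mu and par[2] is sigma
loglike<-function(x,par){
-sum(log(dlnorm(x,par[1],par[2])))
}
init.est <- c(10,0.5)
# constrain both parameters to be positive (ui %*% par - ci >= 0)
fit.ln<-constrOptim(init.est,loglike,NULL,ui=diag(2),ci=c(0,0),x=liability)
par.hat.l<-fit.ln$par
par.hat.l
par(mfrow=c(2,2))
#histogram
hist(liability,breaks=100,prob=T,xlab="claims",main="Histogram of Claims")
lines(xgrid,dlnorm(xgrid,par.hat.l[1],par.hat.l[2]),col=2)
legend(300000,0.00002,legend=c("Lognormal Model"),lty=1,col=2)
#empirical distribution function
plot(xgrid,empirical(xgrid),type="l",xlab="claims",ylab="cdf",main="Empirical CDF")
lines(xgrid,plnorm(xgrid,par.hat.l[1],par.hat.l[2]),col=2)
legend(300000,0.6,legend=c("Empirical cdf","Estimated cdf"),lty=1,col=1:2)
#qqplot
plot(qlnorm(empirical(liability),par.hat.l[1],par.hat.l[2]),liability,
xlab="theoretical quantiles",ylab="sample quantiles",main="Q-Q plot",cex=0.45)
abline(0,1,col=2)
#ppplot
plot(plnorm(liability,par.hat.l[1],par.hat.l[2]),empirical(liability),
xlab="theoretical probability",ylab="sample probability",main="P-P plot",cex=0.45)
abline(0,1,col=2)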
5. The log-likelihoods and the goodness-of-fit statistics are computed as follows.
#loglikelihood
sum(log(dpareto(liability,par.hat.p[1],par.hat.p[2])))
sum(log(dweibull(liability,par.hat.w[1],par.hat.w[2])))
sum(log(dlnorm(liability,par.hat.l[1],par.hat.l[2])))
#kolmogorov-smirnoff
max(abs(ecdf(liability)(liability)-ppareto(liability,par.hat.p[1],par.hat.p[2])))
max(abs(ecdf(liability)(liability)-pweibull(liability,par.hat.w[1],par.hat.w[2])))
max(abs(ecdf(liability)(liability)-plnorm(liability,par.hat.l[1],par.hat.l[2])))
#A-D statistics
f.p<-function(x){
860*dpareto(x,par.hat.p[1],par.hat.p[2])*(ecdf(liability)(x)
-ppareto(x,par.hat.p[1],par.hat.p[2]))^2/
(ppareto(x,par.hat.p[1],par.hat.p[2])*(1-ppareto(x,par.hat.p[1],par.hat.p[2])))
}
f.w<-function(x){
860*dweibull(x,par.hat.w[1],par.hat.w[2])*(ecdf(liability)(x)
-pweibull(x,par.hat.w[1],par.hat.w[2]))^2/
(pweibull(x,par.hat.w[1],par.hat.w[2])*(1-pweibull(x,par.hat.w[1],par.hat.w[2])))
}
f.l<-function(x){
860*dlnorm(x,par.hat.l[1],par.hat.l[2])*(ecdf(liability)(x)
-plnorm(x,par.hat.l[1],par.hat.l[2]))^2/
(plnorm(x,par.hat.l[1],par.hat.l[2])*(1-plnorm(x,par.hat.l[1],par.hat.l[2])))
}
sum(f.p(seq(30001,1200000)))
sum(f.w(seq(30001,1200000)))
sum(f.l(seq(30001,1200000)))
#Chi-square statistics
n=100
interval<-seq(30000,1200000,length=n)
# expected counts in each of the n-1 intervals under the three fitted models
# (note that 1:n-1 is parsed as (1:n)-1, so the loops are written as 1:(n-1))
e.p<-c()
for(i in 1:(n-1)){
e.p[i]<-860*(ppareto(interval[i+1],par.hat.p[1],par.hat.p[2])
-ppareto(interval[i],par.hat.p[1],par.hat.p[2]))
}
e.w<-c()
for(i in 1:(n-1)){
e.w[i]<-860*(pweibull(interval[i+1],par.hat.w[1],par.hat.w[2])
-pweibull(interval[i],par.hat.w[1],par.hat.w[2]))
}
e.l<-c()
for(i in 1:(n-1)){
e.l[i]<-860*(plnorm(interval[i+1],par.hat.l[1],par.hat.l[2])
-plnorm(interval[i],par.hat.l[1],par.hat.l[2]))
}
# observed counts in the same intervals
o<-c()
for(i in 1:(n-1)){
o[i]<-860*(ecdf(liability)(interval[i+1])-ecdf(liability)(interval[i]))
}
sum((e.p-o)^2/e.p)
sum((e.w-o)^2/e.w)
sum((e.l-o)^2/e.l)
We will evaluate the models based on the following table.
Model       Loglikelihood   K-S         A-D        χ2
Pareto      -7822.041       0.1820444   205.7854   95384.95
Weibull     -8284.174       0.1876902   30.54793   37484626594
Lognormal   -8083.975       0.1968117   21.46748   6058129
Using the table above, we choose the Pareto distribution, as it has the highest
log-likelihood and the lowest K-S and χ2 test statistics. Its high A-D statistic, however,
is quite baffling and is therefore not considered.
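Solution 3.13: [Fit1, Exercise] Note that there is no truncation in the observations and that
$P(X > x) = e^{-\beta x}$.
1. The likelihood function for the observed data $(x_j, \delta_j)$, $j = 1, 2, \ldots, n$ can be written as
\[
L(\beta; x, \delta) = \prod_{j=1}^{n} \left(\beta e^{-\beta x_j}\right)^{1-\delta_j} \cdot \left(e^{-\beta x_j}\right)^{\delta_j}
\]
so that the log-likelihood function for estimating β is expressed as
\[
\begin{aligned}
\ell(\beta; x, \delta) = \log L(\beta; x, \delta)
&= \sum_{j=1}^{n} \left[(1-\delta_j)(\log\beta - \beta x_j) - \beta x_j \delta_j\right] \\
&= \sum_{j=1}^{n} \left[(1-\delta_j)\log\beta - \beta x_j + \beta x_j\delta_j - \beta x_j\delta_j\right] \\
&= \log\beta \sum_{j=1}^{n} (1-\delta_j) - \beta \sum_{j=1}^{n} x_j.
\end{aligned}
\]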
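2. Maximizing the log-likelihood function in part 1, we find
\[
\frac{\partial \ell(\beta; x, \delta)}{\partial\beta} = \frac{1}{\beta}\sum_{j=1}^{n}(1-\delta_j) - \sum_{j=1}^{n} x_j = 0
\]
so that
\[
\hat\beta = \frac{\sum_{j=1}^{n}(1-\delta_j)}{\sum_{j=1}^{n} x_j}.
\]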
3. Standard errors can be derived based on the second derivative of the log-likelihood. In
this case, it will be
\[
\frac{\partial^2 \ell(\beta; x, \delta)}{\partial\beta^2} = -\frac{1}{\beta^2}\sum_{j=1}^{n}(1-\delta_j) = H(\beta; x, \delta)
\]
which should be negative at the optimum. We have
\[
\widehat{Var}(\hat\beta) = -\left[H(\hat\beta; x, \delta)\right]^{-1},
\]
so the standard error of our parameter estimate will be equal to
\[
\widehat{s.e.}(\hat\beta) = \left[\frac{1}{\hat\beta^2}\sum_{j=1}^{n}(1-\delta_j)\right]^{-1/2}
= \frac{\hat\beta}{\sqrt{\sum_{j=1}^{n}(1-\delta_j)}}
= \frac{\sqrt{\sum_{j=1}^{n}(1-\delta_j)}}{\sum_{j=1}^{n} x_j}.
\]
It is the square root of the negative of the inverse of the Hessian (which is the second
derivative) evaluated at the MLE.
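The closed-form estimator and standard error of Solution 3.13 can be cross-checked numerically. The four observations below are hypothetical, and δj = 1 marks a censored observation, following the likelihood in part 1; this is a sketch of the check, not part of the original solution.

```r
# Hypothetical data: x_j is the recorded value, delta_j = 1 if the observation is censored
x     <- c(1, 2, 3, 4)
delta <- c(0, 0, 1, 0)

# Closed-form MLE and standard error from parts 2 and 3
beta.hat <- sum(1 - delta) / sum(x)
se.hat   <- sqrt(sum(1 - delta)) / sum(x)

# Numerical cross-check: minimise the negative log-likelihood directly
negloglik <- function(beta) -(log(beta) * sum(1 - delta) - beta * sum(x))
fit <- optimize(negloglik, interval = c(1e-6, 10))

# Negative Hessian at the numerical optimum, and the implied standard error
neg.hess <- sum(1 - delta) / fit$minimum^2
c(beta.hat = beta.hat, beta.num = fit$minimum,
  se.hat = se.hat, se.num = 1 / sqrt(neg.hess))
```

With three uncensored claims and total exposure 10, both routes give an estimate of 0.3.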
Solution 3.14: [Fit2, Exercise] Note that the density of the Pareto can be expressed as
\[
f_X(x; \alpha) = \frac{\alpha\, 200^{\alpha}}{(200 + x)^{\alpha+1}}, \qquad x > 0.
\]
Note also that the deductible of the Excess of Loss is equivalent to a policy limit from the point
of view of the insurer: the insurer pays at most M for each risk.
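1. The likelihood of α, given the observed claims, is then
\[
\begin{aligned}
L(\alpha; x) &= \left\{\prod_{i=1}^{I} \frac{\alpha\, 200^{\alpha}}{(200 + x_i)^{\alpha+1}}\right\} \times \left(\frac{200}{200 + M}\right)^{\alpha(n-I)} \\
&= \frac{\alpha^{I}\, 200^{\alpha n}}{(200 + M)^{\alpha(n-I)}} \prod_{i=1}^{I} (200 + x_i)^{-\alpha-1}
\end{aligned}
\]
so that the log-likelihood is
\[
\ell(\alpha; x) = \log L(\alpha; x)
= I\log\alpha + \alpha n \log 200 - \alpha(n - I)\log(200 + M) - (\alpha + 1)\sum_{i=1}^{I}\log(200 + x_i).
\]
First order condition gives us
\[
\frac{\partial \ell(\alpha; x)}{\partial\alpha} = \frac{I}{\alpha} + n\log 200 - (n - I)\log(200 + M) - \sum_{i=1}^{I}\log(200 + x_i) = 0
\]
which gives
\[
\hat\alpha = \frac{I}{(n - I)\log(200 + M) - n\log 200 + \sum_{i=1}^{I}\log(200 + x_i)}.
\]
You may wish to check that this gives the maximum by evaluating the second derivative:
\[
\frac{\partial^2 \ell(\alpha; x)}{\partial\alpha^2} = -\frac{I}{\alpha^2} < 0.
\]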
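2. Based on the values from last year's experience, we have
\[
\hat\alpha = \frac{400}{100\log(800) - 500\log 200 + 2209.269} = \frac{400}{228.57} = 1.75.
\]
The average amount for one claim is
\[
E(X) = \int_0^{\infty} x\, \frac{1.75 \times 200^{1.75}}{(200 + x)^{2.75}}\, dx = \frac{200}{0.75} = 266.67
\]
and the portion of this average claim paid by the reinsurer is
\[
\begin{aligned}
E[(X - M)_+] &= \int_M^{\infty} (x - M) f_X(x)\, dx = \int_0^{\infty} z f_X(z + M)\, dz \\
&= \frac{200^{\alpha}}{800^{\alpha}} \int_0^{\infty} \frac{z\, \alpha\, 800^{\alpha}}{(800 + z)^{\alpha+1}}\, dz \\
&= \left(\frac{1}{4}\right)^{\alpha} \frac{800}{\alpha - 1} = \left(\frac{1}{4}\right)^{1.75} \frac{800}{1.75 - 1} = 94.28.
\end{aligned}
\]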
Alternatively, E[X] and E[(X − M)+] can be calculated using the tail value method
\[
E[X] = \int_0^{\infty} (1 - F_X(x))\, dx,
\qquad
E[(X - M)_+] = \int_M^{\infty} (1 - F_X(x))\, dx.
\]
The rest is paid by the direct insurer which is
266.67 − 94.28 = 172.39.
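The figures in Solution 3.14 can be checked with the tail value method in R. In this sketch I = 400, n = 500 and the log-sum 2209.269 are read off the solution, M = 600 is inferred from 200 + M = 800, and the variable names are illustrative.

```r
# Inputs read off the solution (M = 600 is inferred from 200 + M = 800)
I <- 400; n <- 500; M <- 600; sumlog <- 2209.269

# MLE of alpha from part 1
alpha.hat <- I / ((n - I) * log(200 + M) - n * log(200) + sumlog)

# Survival function of the Pareto with scale 200, at the rounded value alpha = 1.75
alpha <- 1.75
surv <- function(x) (200 / (200 + x))^alpha

EX  <- integrate(surv, 0, Inf)$value   # E[X] via the tail value method
EXM <- integrate(surv, M, Inf)$value   # E[(X - M)+], the reinsurer's share
c(alpha.hat = alpha.hat, EX = EX, reinsurer = EXM, insurer = EX - EXM)
```

Integrating the survival function avoids working with the density and reproduces the split between insurer and reinsurer up to numerical integration error.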
Solution 3.15: [sur12K, Exercise]

We get the same mixtures or combinations of the cdf’s and slp’s with exponential distributions:
\[
P(Z > d) = q e^{-\alpha d} + (1 - q) e^{-\beta d},
\qquad
E[(Z - d)_+] = q e^{-\alpha d}/\alpha + (1 - q) e^{-\beta d}/\beta.
\]
For the conditional distribution of Z − z, given Z > z:
\[
P(Z - z > y \mid Z > z) = q(z) e^{-\alpha y} + (1 - q(z)) e^{-\beta y}
\]
with q(z) the following function:
\[
q(z) = \frac{q e^{-\alpha z}}{q e^{-\alpha z} + (1 - q) e^{-\beta z}}.
\]
q(·) is monotonic with q(0) = q and q(∞) = 1 if 0 < α < β.
Solution 3.16: [sur13, Exercise]

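\[
\begin{aligned}
E[(S - d)_+] &= \int_d^{\infty} (x - d) f(x)\, dx \\
&= \int_d^{\infty} x f(x)\, dx - d \int_d^{\infty} f(x)\, dx \\
&= \int_0^{\infty} x f(x)\, dx - \int_0^{d} x f(x)\, dx - d \int_0^{\infty} f(x)\, dx + d \int_0^{d} f(x)\, dx \\
&= E[S] - d + \int_0^{d} (d - x)\, dF(x) \\
&= E[S] - d + \left[(d - x) F(x)\right]_{x=0}^{d} - \int_0^{d} F(x)\, d(d - x) \\
&= E[S] - d + \int_0^{d} F(x)\, dx \quad \text{(for this step we require } F(0) = 0\text{)} \\
&= E[S] - \int_0^{d} [1 - F(x)]\, dx.
\end{aligned}
\]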
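The identity derived in Solution 3.16 can be verified numerically. The sketch below assumes, purely for illustration, that S is exponential with rate 0.5 (so E[S] = 2 and F(0) = 0) and that the retention is d = 3.

```r
# Assumed example: S ~ exponential(rate = 0.5), retention d = 3
rate <- 0.5
d <- 3

# Left-hand side: E[(S - d)+] by direct integration against the density
lhs <- integrate(function(x) (x - d) * dexp(x, rate), d, Inf)$value

# Right-hand side: E[S] minus the integral of the survival function over [0, d]
rhs <- 1 / rate - integrate(function(x) 1 - pexp(x, rate), 0, d)$value

c(lhs = lhs, rhs = rhs)
```

For the exponential case both sides equal exp(-rate * d) / rate, which gives an independent closed-form check.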

Note that G(y) gives the cumulative distribution of the losses without reinsurance.
First for re-insurance type 1, there is a mass at zero (blue line) and this probability mass
corresponds to GY (1). This means that with probability GY (1) = 0.4, the insurer does not pay
anything. The loss goes to re-insurer when Y ≤ 1. Hence re-insurance type 1 contract has a
policy limit at level 1, Y ∧ 1.
For re-insurance type 2, the probability mass beyond y = 2 is zero. This means that when
Y ≥ 2, the insurer pays $2 only, i.e. the re-insurer pays excess over $2. The distribution
function remains the same as G(y) below y = 2. These imply that the re-insurance type 2 is a
deductible at level 2, (Y − 2)+ .
For re-insurance type 3, the distribution function is the same as the original below y = 1, so
there is no re-insurance contract for this section. There is a mass at y = 2 (discontinuity in
red line) and the resulting distribution function corresponds to GY (2). This means that for
claims of size larger than 1, the re-insurance type 3 provides payment with a limit at Y = 2,
i.e. (Y − 1) ∧ 1, Y > 1.
1. The claims frequency policyholders experience is not affected by the introduction of
deductibles. However, the company will record a lower claims frequency, as fewer non-zero
claims are filed by policyholders.
2. With the introduction of deductibles, the expected claim size will decrease.
3. Recall that E[(Y − d)+ ] = P[Y > d]e(d), where d > 0 is the deductible level and e(d) is
the mean excess above d. For a log-normally distributed claim,
\[
E[(Y - d)_+] = \mu_Y \left[1 - \Phi\left(\frac{\log d - (\mu + \sigma^2)}{\sigma}\right)\right]
- d \left[1 - \Phi\left(\frac{\log d - \mu}{\sigma}\right)\right].
\]
Substituting d = 200, 500, 1000 yields E[(Y − 200)+ ] = 2822.893, E[(Y − 500)+ ] = 2620.95
and E[(Y − 1000)+ ] = 2372.275.
R-codes:
mean<-3000
vco<-4
sigma2<-log(vco^2+1)
sigma<-sigma2^(1/2)
mu<-log(mean)-sigma2/2
# E[(Y-d)+] for a log-normal claim, evaluated for each deductible d
slp<-function(d){
mean*(1-pnorm((log(d)-mu-sigma2)/sigma))-d*(1-pnorm((log(d)-mu)/sigma))
}
slp(c(200,500,1000))
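Solution 3.19: [sur14K, Exercise] Notice that
\[
\left(\sum x_i - nd\right)_+ = \left(\sum (x_i - d)\right)_+ \le \left(\sum (x_i - d)_+\right)_+ = \sum (x_i - d)_+ .
\]
Now, replace $x_i$ by $X_i$ and take the expectation of the left hand side:
\[
E\left[\left(\sum X_i - nd\right)_+\right] = n E\left[(\bar X - d)_+\right].
\]
Also,
\[
Var(\bar X) = \frac{1}{n} Var(X_1).
\]
Using the hint, for d ≥ µ and since $E(\bar X) = E(X_1)$,
\[
E\left[(\bar X - d)_+\right] / E\left[(X_1 - d)_+\right] \approx \frac{Var(\bar X)}{Var(X_1)} = \frac{1}{n}.
\]
This leads to
\[
\frac{E\left[\left(\sum X_i - nd\right)_+\right]}{E\left[(X_1 - d)_+\right]} \approx 1.
\]
Hence, the stop loss premium for a one-year contract with retention d is about as large as the
one for an n-year period with retention nd.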
Solution 3.20: [sur15, Exercise] Let S be the total annual claims paid by the insurer in the
absence of the excess of loss reinsurance, S* be the total annual claims paid by the insurer in
the presence of the excess of loss reinsurance, and S_R be the total annual claims paid by the
reinsurer. We have:
1. Expected total annual claims without the reinsurance:
\[ E(S) = n \times b \times q = 1000 \times 100 \times 0.002 = 200 \]
and variance of total annual claims without the reinsurance:
\[ Var(S) = n \times b^2 \times q \times (1 - q) = 1000 \times 100^2 \times 0.002 \times 0.998 = 19960. \]
2. Expected total annual claims with the reinsurance:
\[ E(S^*) = n \times b^* \times q = 1000 \times [100 - (100 - 90)_+] \times 0.002 = 180 \]
and variance of total annual claims with the reinsurance:
\[ Var(S^*) = n \times b^{*2} \times q \times (1 - q) = 1000 \times [100 - (100 - 90)_+]^2 \times 0.002 \times 0.998 = 16168. \]
3. Expected total annual claims for the reinsurer:
\[ E(S_R) = n \times b_R \times q = 1000 \times (100 - 90)_+ \times 0.002 = 20 \]
and variance of total annual claims for the reinsurer:
\[ Var(S_R) = n \times b_R^2 \times q \times (1 - q) = 1000 \times (100 - 90)_+^2 \times 0.002 \times 0.998 = 199.6. \]
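The six figures of Solution 3.20 follow from one small helper, sketched here in R. The names retention and moments are illustrative and not part of the original solution.

```r
# Portfolio of n independent policies, each paying a fixed benefit with probability q
n <- 1000; b <- 100; q <- 0.002
retention <- 90                       # excess of loss retention per claim

b.reins <- max(b - retention, 0)      # part of each claim paid by the reinsurer
b.ins   <- b - b.reins                # part of each claim retained by the insurer

# Mean and variance of total claims when every claim has the same fixed size
moments <- function(benefit) {
  c(mean = n * benefit * q, var = n * benefit^2 * q * (1 - q))
}

result <- rbind(gross = moments(b), insurer = moments(b.ins), reinsurer = moments(b.reins))
result
```

The variances of the two shares do not add up to the gross variance, because both shares depend on the same claim occurrences.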
Solution 3.21: [sur16R, Exercise] For the solution code, refer to solution of [los40R, Exercise]
1. Below are tables for d, E[(S − d)+ ], E[((S − d)+ )2 ], V ar[(S − d)+ ], d = 0, 1, 2, . . . , 25.
print(c("Distribution of stop-loss premiums if N is Poisson"))

[1] "Distribution of stop-loss premiums if N is Poisson"
print(Poisson[,4:7],print.gap=3)
d E[(S-d)+] E[(S-d)+^2] Var[(S-d)+]
0 18.0000000000 454.000000000 130.000000000
1 17.0407622040 418.959237796 128.571661303
2 16.0978292895 385.820646302 126.680538467
3 15.1663097922 354.556507221 124.539554507
4 14.2547909497 325.135406479 121.936341460
5 13.3594574196 297.521158110 119.046055564
6 12.4886716496 271.673029040 115.706109469
7 11.6397107390 247.544646652 112.061780564
8 10.8208210870 225.084114826 107.993945829
9 10.0303982617 204.232895477 103.624006188
10 9.2766869917 184.925810224 98.868888682
11 8.5673901552 167.081733077 93.681559005
12 7.8896753530 150.624667569 88.377690393
13 7.2449475938 135.490044622 83.000778985
14 6.6340789981 121.611018030 77.600013876
15 6.0574949924 108.919444039 72.226198457
16 5.5162157850 97.345733262 66.917096675
17 5.0096373915 86.819880085 61.723413291
18 4.5387041817 77.271538512 56.671702863
19 4.1017018776 68.631132453 51.807174160
20 3.6992547622 60.830175813 47.145690017
21 3.3292385667 53.801682484 42.717853050
22 2.9879389459 47.484504971 38.556725827
23 2.6745280175 41.822038008 34.668937892
24 2.3877410250 36.759768966 31.058461763
25 2.1264373905 32.245590550 27.723854574
print(c("Distribution of stop-loss premiums if N is Negative Binomial"))

[1] "Distribution of stop-loss premiums if N is Negative Binomial"
print(NegBin[,4:7],print.gap=3)
d E[(S-d)+] E[(S-d)+^2] Var[(S-d)+]
0 18.00000000 535.0000000 211.0000000
1 17.09525987 499.9047401 207.6568301
2 16.21168860 466.5977917 203.7789445
3 15.34164188 435.0444612 199.6784858
4 14.49603081 405.2067885 195.0718792
5 13.66814123 377.0426165 190.2245318
6 12.86859214 350.5058831 184.9052195
7 12.09161076 325.5456802 179.3386293
8 11.34760558 302.1064638 173.3383114
9 10.63176435 280.1270939 167.0926806
10 9.95437163 259.5409579 160.4514433
11 9.32218787 240.2643984 153.3612117
12 8.71665314 222.2255574 146.2455154
13 8.13869304 205.3702112 139.1318868
14 7.58859823 189.6429200 132.0560968
15 7.06644148 174.9878802 125.0532850
16 6.57275805 161.3486807 118.1475323
17 6.10673999 148.6691827 111.3769094
18 5.66900677 136.8934359 104.7557981
19 5.25781821 125.9666109 98.3219586
20 4.87370111 115.8350916 92.0821292
21 4.51463981 106.4467507 86.0647781

22 4.17744116 97.7546697 80.3036551
23 3.86159068 89.7156379 74.8037554
24 3.56626265 82.2877846 69.5695553
25 3.29074389 75.4307780 64.6017827
print(c("Distribution of stop-loss premiums if N is Binomial"))

[1] "Distribution of stop-loss premiums if N is Binomial"
print(Bin[,4:7],print.gap=3)
d E[(S-d)+] E[(S-d)+^2] Var[(S-d)+]
0 1.800000e+01 4.135000e+02 8.950000e+01
1 1.701680e+01 3.784832e+02 8.891185e+01
2 1.604479e+01 3.454216e+02 8.798634e+01
3 1.508165e+01 3.142952e+02 8.683907e+01
4 1.413351e+01 2.850800e+02 8.532380e+01
5 1.319920e+01 2.577473e+02 8.352842e+01
6 1.228477e+01 2.322633e+02 8.134766e+01
7 1.139022e+01 2.085883e+02 7.885132e+01
8 1.052165e+01 1.866765e+02 7.597125e+01
9 9.680273e+00 1.664745e+02 7.276686e+01
10 8.872400e+00 1.479219e+02 6.920238e+01
11 8.106065e+00 1.309434e+02 6.523512e+01
12 7.374416e+00 1.154629e+02 6.108090e+01
13 6.679437e+00 1.014091e+02 5.679419e+01
14 6.022823e+00 8.870681e+01 5.243241e+01
15 5.405522e+00 7.727846e+01 4.805880e+01
16 4.829187e+00 6.704376e+01 4.372271e+01
17 4.293645e+00 5.792092e+01 3.948554e+01
18 3.800209e+00 4.982707e+01 3.538548e+01
19 3.347468e+00 4.267939e+01 3.147385e+01
20 2.936074e+00 3.639585e+01 2.777532e+01
21 2.564032e+00 3.089575e+01 2.432149e+01
22 2.227323e+00 2.610439e+01 2.114343e+01
23 1.924682e+00 2.195239e+01 1.824799e+01
24 1.654269e+00 1.837344e+01 1.563683e+01
25 1.414363e+00 1.530480e+01 1.330438e+01
2. As we have discussed in Exercise 1.31, the variability of S differs under the different
assumptions. If the portfolio has a bigger variance (more risk), then the stop-loss premium
will be higher. Hence we expect the highest stop-loss premium for the most variable
portfolio, i.e. the negative binomial case, and the lowest stop-loss premium for
the least variable portfolio, i.e. the binomial case. This is illustrated by the diagram below.
[Figure: Stop-loss Premium. E[(S−d)+] for the Poi, NegBin and Bin cases.]
Module 4
Approximations for Compound Distributions
4.1 Approximations
Exercise 4.1: [los29, Solution] [2004 Final Exam Question] For a specific type of insurance
risk, the claim amount X is being modeled as the product of two random variables I and B as
X = I · B,
where I is the claim indicator random variable and B is the random variable representing the
amount of the claim, conditional on the event that a claim occurs.
Now, assume that the probability that a claim occurs is P (I = 1) = q, or otherwise, the
probability that no claim occurs is P (I = 0) = 1 − q. Furthermore, denote by µ and σ² the
mean and the variance of B, respectively.
1. Show that the mean and variance of X are given by
\[ E(X) = q\mu \quad\text{and}\quad Var(X) = q\sigma^2 + q(1 - q)\mu^2. \]
2. Suppose that B can be expressed as
B =b·Z
where b is a fixed constant and Z is a Poisson random variable with parameter λ, so that
B takes possible values of 0, b, 2b, 3b, and so on. Determine expressions for the mean,
variance, and probability distribution of X in terms of the parameters λ, q, and b.
3. Consider further two types of the above risk:
Risk Type   Claim Random Variable   Parameters
A           XA                      b = 1, λ = 1.5, q = 0.1
B           XB                      b = 2, λ = 1.0, q = 0.2
Given that the two types of risk are independent, use the convolution formula to calculate
the probability that XA + XB is greater than 3. Do not use any approximations.
4. Consider a portfolio of 100 independent Type A risks (above). Use the Normal (Central
Limit Theorem) approximation to calculate the probability that the average claim amount
per risk is greater than 0.5.
Exercise 4.2: [NLI12, Solution][Wuthrich (2014), Exercise 12] Assume that S has a compound
Poisson distribution with expected number of claims λv > 0 and claim size distribution G having
finite third moment.
1. Prove that the fit of moments approximation by a translated gamma distribution for X
provides the following system of equations
\[
\lambda v E[Y_1] = k + \gamma/c, \qquad \lambda v E[Y_1^2] = \gamma/c^2
\qquad\text{and}\qquad
\frac{E[Y_1^3]}{(\lambda v)^{1/2} E[Y_1^2]^{3/2}} = 2\gamma^{-1/2}.
\]
2. Solve this system of equations for k ∈ R, γ > 0 and c > 0 (assume that G(0) = 0).
Exercise 4.3: [los31K, Solution] [Kaas et al. (2008), Problem 3.6.1] Assume S is compound
Poisson distributed with parameter λ = 12 and uniform(0, 1) distributed claims. Approximate
Pr[S < 10] with the CLT approximation, the translated gamma approximation and the NP
approximation.
Exercise 4.4: [los32K, Solution] [Kaas et al. (2008), Problem 2.5.9] An insurer’s portfolio
contains 2000 one-year life insurance policies. Half of them are characterised by a payment
b1 = 1 and a probability of dying within 1 year of q1 = 1%. For the other half, we have b2 = 2
and q2 = 5%. Use the CLT to determine the minimum safety loading, as a percentage, to be
added to the net premium to ensure that the probability that the total payment exceeds the
total premium income is at most 5%.
Exercise 4.5: [los33K, Solution] [Kaas et al. (2008), Problem 2.5.13] A portfolio consists of
two types of contracts. For type k, k = 1, 2, the claim probability is qk and the number of
policies is nk . If there is a claim, then its size is x with probability pk (x):
         nk     qk     pk(1)   pk(2)   pk(3)
Type 1   1000   0.01   0.5     0       0.5
Type 2   2000   0.02   0.5     0.5     0
Assume that the contracts are independent. Let Sk denote the total claim amount of the
contracts of type k and let S = S1 + S2 . Calculate the expected value and the variance of a
contract of type k, k = 1, 2. Then, calculate the expected value and the variance of S. Use the
CLT to determine the minimum capital that covers all claims with probability 95%.
Exercise 4.6: [los34, Solution] We want to approximate the individual model $\tilde S$ using two
collective models: $S = \sum_{1}^{n} N_j \cdot b_j$, which is based on $\lambda_j = q_j$, and $S^* = \sum_{1}^{n} N_j^* \cdot b_j$, which is based
on $\lambda_j^* = -\ln(1 - q_j)$, with $b_j > 0$ for all j. Compare $E[\tilde S]$ with $E[S^*]$ and $E[S]$, and $Var[\tilde S]$ with
$Var[S^*]$ and $Var[S]$. Also compare $\Pr[I_i = j]$ with $\Pr[N_i^* = j]$ and $\Pr[N_i = j]$, j = 0, 1, 2, . . ..
Exercise 4.7: [los35K, Solution] [Kaas et al. (2008), Problem 3.7.2] Consider a portfolio of
100 one-year life insurance policies. 25 policies have insured amount 1 and probability of dying
within this year 0.01, 25 policies have insured amount 1 and probability of dying within this year
0.02, 25 policies have insured amount 2 and probability of dying within this year 0.01, and 25
policies have insured amount 2 and probability of dying within this year 0.02.
Determine the expectation and the variance of the total claims $\tilde S$. Choose an appropriate
compound Poisson distribution S to approximate $\tilde S$ and compare the expectations and the
variances. Determine for both S and $\tilde S$ the parameters of a suitable approximating translated
gamma distribution.
Exercise 4.8: [los37R, Solution][R∗ ] Develop a function that will return and print (if the user
so wishes) the following approximations of Pr[S ≤ s] as a function of the first three central
moments, as well as γ1 and γ2 :
• Normal Power 1 (CLT)

• Normal Power 2
• Edgeworth 1 (first 2 terms)
• translated gamma
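Note: Edgeworth's approximation (each term improves the accuracy):
\[
\Pr\left[\frac{S - E[S]}{\sqrt{Var(S)}} \le z\right] \approx \Phi(z) - \frac{\gamma_1}{6}\Phi^{(3)}(z) + \frac{\gamma_2}{24}\Phi^{(4)}(z) + \frac{\gamma_1^2}{72}\Phi^{(6)}(z),
\]
where Φ(·) denotes the standard normal distribution function, and
\[
\Phi^{(k)}(x) = \frac{d^k}{dx^k}\Phi(x).
\]
Note
\[
\Phi^{(1)}(x) = \varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}, \qquad
\Phi^{(2)}(x) = -x\varphi(x), \qquad
\Phi^{(3)}(x) = (x^2 - 1)\varphi(x), \qquad \text{etc.}
\]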
Application of this function to an inverse Gaussian(α = 2, β = 4) (moments as in the previous

exercise) with s = 0.6 should yield:
Approximations for Pr[X<=0.6]
NP1 NP2 EW1 EW2 EW3 TG
Pr[X<=0.6] 0.6113513 0.7037253 0.7360268 0.8349541 0.7386938 0.728984
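As a starting point for Exercise 4.8, the derivatives of Φ that enter Edgeworth's formula can be coded from the Hermite polynomial identity Φ^(k)(x) = (−1)^(k−1) He_{k−1}(x) ϕ(x). The sketch below covers only the three-correction-term formula quoted above, takes γ2 to be the excess kurtosis (an assumption), and uses illustrative function names; it is not the full function the exercise asks for.

```r
# k-th derivative of the standard normal cdf, hard-coded for the orders used here:
# Phi^(k)(x) = (-1)^(k-1) * He_{k-1}(x) * phi(x)
Phi.deriv <- function(x, k) {
  he <- switch(as.character(k),
    "1" = 1,
    "2" = x,
    "3" = x^2 - 1,
    "4" = x^3 - 3 * x,
    "6" = x^5 - 10 * x^3 + 15 * x,
    stop("order not implemented"))
  (-1)^(k - 1) * he * dnorm(x)
}

# Edgeworth approximation of Pr[(S - E[S]) / sd(S) <= z];
# gamma1 = skewness, gamma2 = excess kurtosis (assumed meaning of gamma2)
edgeworth <- function(z, gamma1, gamma2) {
  pnorm(z) - gamma1 / 6 * Phi.deriv(z, 3) +
    gamma2 / 24 * Phi.deriv(z, 4) + gamma1^2 / 72 * Phi.deriv(z, 6)
}

edgeworth(0, gamma1 = 1, gamma2 = 0)
```

Setting both γ1 and γ2 to zero recovers the plain normal approximation, which is a convenient sanity check when extending the function.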
Exercise 4.9: [los41R, Solution][R∗ ] [From Assignment question3, 2008]

This question is a follow-up to Exercise 4.18.
1. For each of the three cases i = 1, 2, 3, give the true value of Pr(Si > 50) as well as its
CLT, translated gamma and normal power approximations.
2. Interpret your results (compare the three assumptions i = 1, 2, or 3).
4.2 Algorithms for compound distributions

Exercise 4.10: [los24, Solution] Determine a and b for the (a, b) family (Poisson, negative
binomial and binomial), also called Panjer distributions (see MW Definition 4.6).
Exercise 4.11: [los25, Solution] Let S ∼ compound Poisson(λ = 2, p(x) = x/10), x = 1, 2, 3, 4.
Determine fS (s) for s = 0, 1, 2, 3, 4 using Panjer's recursion algorithm.
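For checking hand calculations, the compound Poisson special case of Panjer's recursion can be sketched in a few lines of R. The function name panjer.poisson is illustrative, and the sketch assumes claim sizes on 1, 2, ..., length(p) with no mass at zero.

```r
# Panjer recursion for a compound Poisson sum with claim sizes on 1, 2, ..., length(p):
#   f_S(0) = exp(-lambda),  f_S(s) = (lambda / s) * sum_x x * p(x) * f_S(s - x)
panjer.poisson <- function(lambda, p, smax) {
  fS <- numeric(smax + 1)              # fS[s + 1] stores f_S(s)
  fS[1] <- exp(-lambda)
  for (s in seq_len(smax)) {
    x <- seq_len(min(s, length(p)))    # claim sizes that can contribute to s
    fS[s + 1] <- lambda / s * sum(x * p[x] * fS[s - x + 1])
  }
  fS
}

fS <- panjer.poisson(lambda = 2, p = (1:4) / 10, smax = 4)
fS
```

The general (a, b) recursion replaces the factor lambda * x / s by a + b * x / s and adjusts the starting value; this sketch deliberately covers only the Poisson case.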
Exercise 4.12: [los26, Solution] Check the results of Exercise 2.5 using Panjer’s recursion
algorithm.
Exercise 4.13: [los27, Solution] The individual claim amount distribution has the following
distribution:
x P (X = x)
1 0.2
2 0.2
3 0.2
4 0.4
Determine fS (s) for s = 0, 1, 2, 3, 4:
1. if S is compound Poisson with parameter λ = 1, and
2. if S is compound Negative Binomial with parameters (r = 2, p = 0.2).
Exercise 4.14: [los28, Solution] Let p(x) be Poisson(1). Verify that p∗n (x) using de Pril’s
algorithm will yield the exact probabilities of a Poisson(n) random variable for x = 0, 1, 2, 3.
Exercise 4.15: [los30R, Solution][R∗ ] Develop a function for Panjer's recursion that returns
fS (x) and FS (x) as functions of a, b, fX (x) and fS (0). In addition, the function should
allow for the choice ("type" below) between computing
• the first s ("vartype" below) recursions to get up to fS (s), or
• whatever number of values is necessary in order to have all FS (x) < 1 − ε and the
first > 1 − ε (ε ≡ vartype below).
Finally, the function should also allow its user to print fS (x) and FS (x) for all the recursions
that were performed (binary "print" below). The function could then begin in the following
way:
Panjer <- function(a,b,fX,fS0,type,vartype,print) {
[code omitted]
Exercise 4.16: [los38R, Solution][R∗ ] Develop a function that will discretise a continuous
distribution in m steps of length h, as a function of its pdf or cdf. One possible beginning is
discretisation <-function(densityorcdf,type,h,m){
where "densityorcdf" is a function and where "type" is binary and defines whether "densityorcdf" is
a cdf (1) or a density (0).
The following code—discretisation of a gamma(α = 2, β = 4)
pmf<-discretisation(function(x){16*x*exp(-4*x)},0,0.002,1500)
plot(pmf)
should yield the following plot:

[Figure: plot of pmf (vertical axis, 0.0000 to 0.0030) against Index (horizontal axis, 0 to 1500).]
Exercise 4.17: [los39R, Solution][R∗ ] Let S ∼ compound Poisson($\lambda = 20$, $p(x) = 0.2e^{-0.2x}$).
We are interested in determining Pr[S ≤ 150]. Use the functions developed in the previous
exercises to compute all 6 approximations mentioned in Exercise 4.8, as well as the equivalent
probability using Panjer’s recursion after having discretised p(x) with h = 0.01 and m =
5000. Calculate the deviation between the approximations and the probability calculated with
Panjer’s recursion.
Here are the outputs you should get:
Approximations for Pr[X<=150]:
NP1 NP2 EW1 EW2 EW3 TG Panjer
Pr[X<=150] 0.9430769 0.93132 0.9295227 0.9306522 0.93277 0.9325275 0.9323526
Deviation with Panjer probability:

NP1 NP2 EW1 EW2 EW3 TG Panjer
Pr[X<=150] 0.01072422 -0.001032627 -0.002829964 -0.001700449 0.0004173918 0.0001748720 0
Exercise 4.18: [los40R, Solution][R∗ ] [From Assignment question3, 2008]

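Let
\[
\Pr(X_j = x) =
\begin{cases}
0.2 & \text{if } x = 0, \\
0.1 & \text{if } x \in \{1, 3, 5, 7, 9, 10\}, \\
0.05 & \text{if } x \in \{2, 4, 6, 8\}, \text{ and} \\
0 & \text{otherwise,}
\end{cases}
\]
and
\[
S_i = X_1 + X_2 + \ldots + X_{N_i}, \qquad i = 1, 2, 3
\]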
with
N1 ∼ Poisson(λ1 = 4),
N2 ∼ Neg Bin(r2 = 4, p2 = 0.5), and
N3 ∼ Bin(n3 = 8, p3 = 0.5),
and X1 , X2 , . . . , Xj , . . . , Ni are mutually independent for i = 1, 2, 3.
1. Using Panjer’s recursion formula, prepare a table for each of the cases i = 1, 2 and 3
including:
• x, fS (x), FS (x), x = 0, 1, 2, . . . , 25
2. Compute the following moments for X and Si , i = 1, 2, 3:
• E[(·)k ], k = 1, 2, 3
• E[(· − E[·])k ], k = 2, 3
• γ1 (·)
For the moments of Si , use the distribution of Si you computed with Panjer’s recursion
formula (after an appropriate number of recursions).
3. Interpret your results in part 1 and 2 (compare the three assumptions, i = 1, 2 or 3).
4.3 Solutions
Solution 4.1: [los29, Exercise] This was a Year 2004 final exam question.
1. Applying the law of iterated expectations, we find
\[
E(X) = E[E(X \mid I)] = E(X \mid I = 1) P(I = 1) + E(X \mid I = 0) P(I = 0) = qE(B) + 0 = q\mu
\]
and applying the variance formula using conditionals, we have
\[
\begin{aligned}
Var(X) &= Var[E(X \mid I)] + E[Var(X \mid I)] \\
&= [E(B)]^2 Var(I) + q Var(B) \\
&= q(1 - q)[E(B)]^2 + q Var(B)
\end{aligned}
\]
after noting that $E(X \mid I) = E(B) I$, $Var(X \mid I = 1) = Var(B \mid I = 1) = Var(B)$,
$Var(X \mid I = 0) = 0$, and $Var(X \mid I) = Var(B) I$. Thus, we have
\[
Var(X) = q(1 - q)\mu^2 + q\sigma^2.
\]
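2. First, notice that E (Z) = V ar (Z) = λ so that
\[ E(X) = qb\lambda \]
and
\[ Var(X) = q(1 - q) b^2\lambda^2 + q b^2 \lambda, \]
where $(E(B))^2 = \mu^2 = b^2\lambda^2$ and $Var(B) = \sigma^2 = b^2\lambda$.

Next, since Z is Poisson, the m.g.f. of B is
\[
m_B(t) = E[e^{tB}] = E[e^{tbZ}] = \exp\left\{\lambda\left(e^{bt} - 1\right)\right\}
\]
and the m.g.f. of X is then
\[
\begin{aligned}
m_X(t) &= E[e^{tX}] = E[e^{tIB}] \\
&= E[e^{tIB} \mid I = 1] P(I = 1) + E[e^{tIB} \mid I = 0] P(I = 0) \\
&= E[e^{tB}]\, q + 1 \times (1 - q) \\
&= q e^{\lambda(e^{bt} - 1)} + (1 - q) \\
&= q e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k e^{kbt}}{k!} + (1 - q) \\
&= (1 - q) + q e^{-\lambda} + q e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{k!} e^{kbt}.
\end{aligned}
\]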
The distribution of X can be identified by using the 1-1 correspondence between
distribution and mgf, i.e. by matching
\[
m_X(t) = E[e^{tX}] = \sum_{y=0}^{\infty} e^{ty} P(X = y)
\]
with the expression above. Thus, by carefully selecting the coefficients of $e^{ty}$, we find
\[
P(X = bx) = q e^{-\lambda} \frac{\lambda^x}{x!}, \qquad \text{for } x = 1, 2, 3, \ldots
\]
and
\[
P(X = 0) = (1 - q) + q e^{-\lambda}.
\]
Alternatively, we have
\[
X =
\begin{cases}
0, & \text{w.p. } 1 - q \\
bZ, & \text{w.p. } q
\end{cases}
\]
where bZ also has mass at 0. Using this will also give the above expression of P (X = 0).
3. Notice that XA takes possible values 0, 1, 2, ... while XB takes possible values 0, 2, 4, ...
Now, doing the convolution for XA + XB , we have
(1)   (2)           (3)           (4) = (2)*(3)       (5)
x     P(XA = x)     P(XB = x)     P(XA + XB = x)      P(XA + XB ≤ x)
0     0.922313      0.873576      0.805710            0.805710
1     0.033470      0             0.029238            0.834949
2     0.025102      0.073576      0.089789            0.924737
3     0.012551      0             0.013427            0.938164
λi    1.5           1.0
Columns (2) and (3) are computed, for x ≥ 1, from
\[
P(X_A = x) = q e^{-1.5} \frac{(1.5)^x}{x!}, \qquad
P(X_B = x) = q e^{-1} \frac{(1)^{x/2}}{(x/2)!} \quad \text{(for integer } x/2\text{)}.
\]
Therefore, the required probability is P (XA + XB > 3) = 1 − P (XA + XB ≤ 3) = 1 −
0.938164 = 0.061836.
4. For Type A, the mean is (per policy)
\[ E(X) = q\lambda = 0.1(1.5) = 0.15 \]
and
\[ Var(X) = q(1 - q)\lambda^2 + q\lambda = 0.3525. \]
Thus, using the Normal (CLT) approximation, we have
\[
\begin{aligned}
P\left(\frac{1}{100}\sum_{i=1}^{100} X_{Ai} > 0.5\right)
&= P\left(\sum_{i=1}^{100} X_{Ai} > 50\right) \\
&= P\left(\frac{\sum_{i=1}^{100} X_{Ai} - E[\sum_{i=1}^{100} X_{Ai}]}{\sqrt{Var[\sum_{i=1}^{100} X_{Ai}]}} > \frac{50 - 100(0.15)}{\sqrt{0.3525(100)}}\right) \\
&= P(N(0,1) > 5.8951) = 1 - P(N(0,1) \le 5.8951) \\
&\approx 1 - 1 = 0.0
\end{aligned}
\]
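1. Let X = k + Z, where Z ∼ Γ(γ, c), such that the three parameters of X fulfil
\[ E[X] = E[S], \quad \mathrm{Var}(X) = \mathrm{Var}(S) \quad \text{and} \quad \varsigma_X = \varsigma_S. \]
Equating the first moment yields
\[ E[S] = E[X] \implies \lambda v\,E[Y_1] = k + \gamma/c. \]
Equating the second moment (i.e. the variance) yields
\[ \mathrm{Var}(S) = \mathrm{Var}(X) \implies \lambda v\,E[Y_1^2] = \gamma/c^2. \]
Lastly, equating the skewness yields
\[
\varsigma_S = \varsigma_X \implies \frac{E[(S - E[S])^3]}{\sigma_S^3} = 2\gamma^{-1/2}
\implies \frac{E[Y_1^3]}{(\lambda v)^{1/2}\,E[Y_1^2]^{3/2}} = 2\gamma^{-1/2}.
\]
2. From equating the skewness, we can first solve for γ:
\[ \gamma = 4\,\frac{(\lambda v)\,E[Y_1^2]^3}{E[Y_1^3]^2}. \]
Next, from equating the variance, we can solve for c:
\[
c^2 = \gamma\,\frac{1}{\lambda v\,E[Y_1^2]} = 4\,\frac{(\lambda v)\,E[Y_1^2]^3}{E[Y_1^3]^2}\,\frac{1}{\lambda v\,E[Y_1^2]}
\implies c = 2\,\frac{E[Y_1^2]}{E[Y_1^3]} > 0.
\]
Finally, substituting the solved γ and c into the equation for the expectation gives k:
\[ k = \lambda v\,E[Y_1] - \frac{\gamma}{c} = \lambda v\,E[Y_1] - 2\,\frac{(\lambda v)\,E[Y_1^2]^2}{E[Y_1^3]}. \]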
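As a quick numerical check of the three formulas for γ, c and k, the following R sketch computes them from the raw claim-size moments. The function name `tg_params` and the example inputs (λv = 10 with Exp(1) claims, so E[Y] = 1, E[Y²] = 2, E[Y³] = 6) are illustrative assumptions, not part of the exercise.

```r
# Hypothetical helper: translated-gamma parameters (k, gamma, c) matched to
# the mean, variance and skewness of a compound Poisson S.
# lambda_v = expected number of claims; m1, m2, m3 = raw moments of Y_1.
tg_params <- function(lambda_v, m1, m2, m3) {
  gam <- 4 * lambda_v * m2^3 / m3^2   # from equating skewness
  cc  <- 2 * m2 / m3                  # from equating variance
  k   <- lambda_v * m1 - gam / cc     # from equating the mean
  c(k = k, gamma = gam, c = cc)
}

# Illustrative inputs (assumed): lambda_v = 10, Exp(1) claim sizes
tg <- tg_params(10, 1, 2, 6)
print(tg)

# The fitted k + Gamma(gamma, c) reproduces the three moments of S
fit_mean <- tg[["k"]] + tg[["gamma"]] / tg[["c"]]
fit_var  <- tg[["gamma"]] / tg[["c"]]^2
fit_skew <- 2 / sqrt(tg[["gamma"]])
```

With these inputs the fitted mean, variance and skewness agree with λv E[Y], λv E[Y²] and λv E[Y³]/(λv E[Y²])^{3/2} respectively.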
Solution 4.3: [los31K, Exercise] Let X ∼ Uniform(0, 1) be the claim severity random variable. Define $\mu_k = E[X^k]$. Using formulas in the F&T book page 16, we have
\[ E(S) = \lambda\mu_1 = 6 \]
and
\[ \mathrm{Var}(S) = \sigma_S^2 = \lambda\mu_2 = 12 \cdot \frac{1}{3} = 4 \]
and
\[ \gamma_S = \lambda\mu_3/\sigma_S^3 = 12 \cdot \frac{1}{4} \Big/ \big(\sqrt{4}\big)^3 = 3/8. \]
(Note that the third moment of a U(a, b) distribution is $\frac{1}{4}\,\frac{b^4 - a^4}{b - a}$.)
CLT:
P (S < 10) ≈ P [Z < (10 − 6) /2] = Φ (2) = 0.977.
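Translated Gamma:
Using the known expression for the skewness of a Gamma(α, β) distribution (skewness $= 2/\sqrt{\alpha}$), we have that $\alpha = 4/\gamma_S^2 = 28\tfrac{4}{9}$. Further, it is also known that
\[ \mathrm{Var}[S] = \frac{\alpha}{\beta^2} = \frac{4/\gamma_S^2}{\beta^2}, \]
hence
\[ \beta = \sqrt{\frac{4}{\gamma_S^2\,\mathrm{Var}[S]}} = \frac{2}{\gamma_S\,\sigma_S} = \frac{8}{3}. \]
Finally, using the expression in the lecture slides, we have that the shift is
\[ k = E[S] - \frac{\alpha}{\beta} = E[S] - \frac{4/\gamma_S^2}{2/(\gamma_S\sigma_S)} = E[S] - \frac{2\sigma_S}{\gamma_S} = -4\tfrac{2}{3}. \]
Thus,
\[ P(S < 10) \approx G\Big(10 - \big(-4\tfrac{2}{3}\big);\ 28\tfrac{4}{9};\ \tfrac{8}{3}\Big) = 0.968. \]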
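NP approximation:
Using the expressions in the lecture notes, we have that
\[
\begin{aligned}
P(S < 10) &= P\Big(\frac{S - 6}{2} < \frac{10 - 6}{2}\Big) \\
&\approx \Phi\left(\sqrt{\frac{9}{(3/8)^2} + \frac{6\,(2)}{3/8} + 1} - \frac{3}{3/8}\right) = \Phi\big(\sqrt{97} - 8\big) = 0.968.
\end{aligned}
\]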
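The three approximations in Solution 4.3 can be evaluated side by side in R. This is a minimal sketch using only the moments derived above (E[S] = 6, Var(S) = 4, γ_S = 3/8); the variable names are illustrative.

```r
ES <- 6; VarS <- 4; gam <- 3 / 8; s <- 10
z <- (s - ES) / sqrt(VarS)               # standardised threshold

# CLT approximation
clt <- pnorm(z)

# Translated gamma: shape alpha, rate beta, shift k
alpha <- 4 / gam^2
beta  <- 2 / (gam * sqrt(VarS))
k     <- ES - alpha / beta
tg    <- pgamma(s - k, shape = alpha, rate = beta)

# Normal power approximation
np <- pnorm(sqrt(9 / gam^2 + 6 * z / gam + 1) - 3 / gam)

print(round(c(CLT = clt, TG = tg, NP = np), 3))
```

Changing `s` gives the corresponding approximations at other thresholds; note that the NP formula as written is intended for thresholds above the mean.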
Solution 4.4: [los32K, Exercise] Denote S to be the total payment. We have
\[
\begin{aligned}
\mu_S &= 1000 \times b_1 \times q_1 + 1000 \times b_2 \times q_2 \\
&= 1000 \times 1 \times 0.01 + 1000 \times 2 \times 0.05 = 110
\end{aligned}
\]
and
\[
\begin{aligned}
\mathrm{Var}(S) &= 1000 \times b_1^2 \times q_1 \times (1 - q_1) + 1000 \times b_2^2 \times q_2 \times (1 - q_2) \\
&= 1000 \times 1^2 \times 0.01 \times 0.99 + 1000 \times 2^2 \times 0.05 \times 0.95 = 199.9.
\end{aligned}
\]
We know Φ(y) = 0.95 for y = 1.645. So, letting P be the smallest premium income, it should satisfy
\[ P\Big(Z \le \frac{P - 110}{\sqrt{199.9}}\Big) \ge 0.95, \]
therefore the premium income must be $P \ge 110 + 1.645 \times \sqrt{199.9} = 133.258$, and the loading therefore has to be at least (133.258 − 110)/110 = 23.258/110 = 21.14%.
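The arithmetic of Solution 4.4 can be checked with the short R sketch below. The vectors `n`, `b` and `q` simply encode the two policy groups used above; the names are illustrative, and `qnorm(0.95)` is used in place of the rounded quantile 1.645.

```r
n <- c(1000, 1000)   # number of policies in each group
b <- c(1, 2)         # benefit amounts
q <- c(0.01, 0.05)   # claim probabilities

mu_S  <- sum(n * b * q)                  # expected total payment
var_S <- sum(n * b^2 * q * (1 - q))      # variance of total payment

# Smallest premium income with 95% probability of covering claims (CLT)
P_min   <- mu_S + qnorm(0.95) * sqrt(var_S)
loading <- (P_min - mu_S) / mu_S
print(c(mu_S = mu_S, var_S = var_S, P_min = P_min, loading = loading))
```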
Solution 4.5: [los33K, Exercise] Let X_1 be a claim of type 1, then P(X_1 = 0) = 1 − q_1, and P(X_1 = j) = q_1 · p_1(j), for j = 1, 3. So
\[ E(X_1) = 1 \times 0.01 \times 0.5 + 3 \times 0.01 \times 0.5 = 0.02 \]
and
\[ E(X_1^2) = 1^2 \times 0.01 \times 0.5 + 3^2 \times 0.01 \times 0.5 = 0.05 \]
so that Var(X_1) = 0.0496. Also,
\[ E(X_2) = 1 \times 0.02 \times 0.5 + 2 \times 0.02 \times 0.5 = 0.03 \]
and
\[ E(X_2^2) = 1^2 \times 0.02 \times 0.5 + 2^2 \times 0.02 \times 0.5 = 0.05 \]
so that Var(X_2) = 0.0491. Then, calculating the expected value and variance of S:
\[ E(S_1) = 1000 \times 0.02 = 20, \quad \mathrm{Var}(S_1) = 1000 \times 0.0496 = 49.6 \]
and
\[ E(S_2) = 2000 \times 0.03 = 60, \quad \mathrm{Var}(S_2) = 2000 \times 0.0491 = 98.2 \]
and thus
\[ E(S_1 + S_2) = 20 + 60 = 80, \quad \mathrm{Var}(S_1 + S_2) = 147.8. \]
The required capital is
\[ E(S) + 1.645\sqrt{\mathrm{Var}(S)} = 99.999. \]
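Solution 4.6: [los34, Exercise] Note that $\lambda_j = q_j < \lambda_j^* = -\log(1 - q_j)$, since $-\log(1 - q_j) = q_j + q_j^2/2 + q_j^3/3 + \cdots$. Since $S^* = \sum_{1}^{n} N_j^* \cdot b_j$, we have
\[ E(S^*) = \sum_{1}^{n} \lambda_j^* \cdot b_j > \sum_{1}^{n} q_j \cdot b_j = E(S) = E(\tilde{S}) \]
and
\[ \mathrm{Var}(S^*) = \sum_{1}^{n} \lambda_j^* \cdot b_j^2 > \sum_{1}^{n} q_j \cdot b_j^2 = \mathrm{Var}(S) > \mathrm{Var}(\tilde{S}). \]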
Since
P (Ii = 0) = 1 − qi = 1 − P (Ii = 1)
P (Ni∗ = n) = (1 − qi ) [− log (1 − qi )]n /n!
which implies
P (Ni∗ = 0) = 1 − qi = P (Ii = 0)
P (Ni∗ = 1) < P (Ii = 1)
P (Ni∗ = 2, 3...) > P (Ii = 2, 3, ...) = 0.
Furthermore,
P (Ni = 0) = exp (−λi ) = exp (−qi ) > P (Ii = 0)

P (Ni = 1) = qi · exp (−qi ) < P (Ii = 1)
P (Ni = 2, 3...) > P (Ii = 2, 3, ...) = 0.
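Solution 4.7: [los35K, Exercise] For $\tilde{S}$, we have
\[ E(\tilde{S}) = \sum n_i q_i b_i = 25\,(0.01 + 0.02 + 0.02 + 0.04) = 2\tfrac{1}{4} \]
and
\[ \mathrm{Var}(\tilde{S}) = \sum n_i q_i (1 - q_i)\, b_i^2 = 3.6875 \]
and
\[ \tilde{\gamma} = \sum n_i q_i (1 - q_i)(1 - 2q_i)\, b_i^3 / \sigma^3 = 0.906 \quad \text{(see footnote 1)}. \]
Take the collective risk model: $S \sim \mathrm{CP}\big(\lambda = 1\tfrac{1}{2},\ p(1) = p(2) = 1/2\big) \Rightarrow$
\[ \mu = 2\tfrac{1}{4}; \quad \sigma^2 = 3.75; \quad \gamma = \mu_3 \big/ \big((\mu_2)^{3/2}\sqrt{\lambda}\big) = 6.75\,\sigma^{-3} = 0.93. \]
For $\tilde{S}$:
α = 4.871; β = 1.149; x_0 = −1.988.
For S:
α = 4.630; β = 1.111; x_0 = −1.917.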
Approx <- function(Mom,s,print){
#standardised s
z<-(s-Mom[1])/Mom[2]^.5
#Vector of results
aprob <- c()
# Normal Power #
################
#Normal Power 1 (CLT)

aprob <- c(aprob,pnorm(s,Mom[1],Mom[2]^.5))
# Footnote 1 (to Solution 4.7): Correction to the skewness formula: made on 21-9-2004.
#Normal Power 2
temp <- sqrt(9/Mom[4]^2+6*z/Mom[4]+1)-3/Mom[4]
aprob <- c(aprob,pnorm(temp))
# Edgeworth #
#############
#Calculating the derivatives of phi

phi<-expression(1/sqrt(2*pi)*exp(-x^2/2))
dphi<-c(phi);
for (i in 2:6){dphi <- c(dphi,D(dphi[i-1],"x"))}
#Edgeworth 1
aprob <- c(aprob,pnorm(z)-Mom[4]/6*eval(dphi[3],list(x=z)))
#Edgeworth 2
aprob <- c(aprob,as.double(aprob[3])+Mom[5]/24*eval(dphi[4],list(x=z)))
#Edgeworth 3
aprob <- c(aprob,as.double(aprob[4])+Mom[4]^2/72*eval(dphi[6],list(x=z)))
# Translated Gamma #
####################
#parameters
tgbeta <- 2*Mom[2]/Mom[3]
tgalpha <- tgbeta^2*Mom[2]
x0 <- Mom[1]-tgalpha/tgbeta
#approximation
aprob <- c(aprob,pgamma(s-x0,shape=tgalpha,rate=tgbeta))
approximations <- data.frame(NP1=aprob[1],NP2=aprob[2],EW1=aprob[3],EW2=aprob[4],EW3=aprob[5],TG=aprob[6])
row.names(approximations) <- paste("Pr[X",as.character(s),"]",sep="")
#print results
if(print==1){
cat("Approximations for Pr[X",as.character(s),"]\n",sep="")
print(approximations)
} # end if
#returning results (the function is later called as Approx(Moments,s,0))
approximations
} # end function
Solution 4.9: [los41R, Exercise] For the solution code, refer to [los40R, Exercise]
1. The following is a comparison of the three approximation methods.
print(c("Comparison of true probability and approximations for S(50)"))

[1] "Comparison of true probability and approximations for S(50)"
print(Proba,digits=5,print.gap=2)
True CLT Gamma NP
If N is Poisson 0.0091035 0.00250348 0.0103916 0.0107226
If N is Negative Binomial 0.0326376 0.01379840 0.0340177 0.0372348
If N is Binomial 0.0011910 0.00035914 0.0020282 0.0020194
2. Recall that we have computed the skewness of the distribution of S under the different
assumptions.
X S if N Poisson S if N Neg Bin S if N Binomial

Var(.) 12.25 130.000 211.00 89.500
E[(.-E[.])^3] 6.00 1050.000 3534.00 354.750
gamma(.) 0.14 0.708 1.15 0.419
We can see that under the Poisson and negative binomial assumptions for N, the coefficients of skewness of S take the values 0.708 and 1.15 respectively. When approximating S in these cases, the translated gamma method produces the most accurate results, as it approximates S with a gamma distribution, which is itself positively skewed. The normal power method performs quite well, while the CLT method performs very poorly.
Under the binomial assumption for N, S has a coefficient of skewness of 0.419. In this case, the normal power method performs best, as it builds on the CLT idea, which preserves a certain degree of symmetry, while at the same time allowing for a certain degree of skewness.
Solution 4.10: [los24, Exercise] For Poisson, we have
\[ \frac{\Pr[N = n]}{\Pr[N = n-1]} = \frac{e^{-\lambda}\lambda^n / n!}{e^{-\lambda}\lambda^{n-1} / (n-1)!} = \frac{\lambda}{n} = 0 + \frac{\lambda}{n} \]
and thus a = 0 and b = λ.
For negative binomial, we have
\[
\frac{\Pr[N = n]}{\Pr[N = n-1]} = \frac{\frac{(n+r-1)!}{n!\,(r-1)!}\,p^r (1-p)^n}{\frac{(n+r-2)!}{(n-1)!\,(r-1)!}\,p^r (1-p)^{n-1}} = \frac{n+r-1}{n}\,(1-p) = (1-p)\Big(1 + \frac{r-1}{n}\Big)
\]
and thus a = 1 − p and b = (1 − p)(r − 1).
For binomial, we have
\[
\frac{\Pr[N = n]}{\Pr[N = n-1]} = \frac{\frac{m!}{n!\,(m-n)!}\,p^n (1-p)^{m-n}}{\frac{m!}{(n-1)!\,(m-n+1)!}\,p^{n-1} (1-p)^{m-n+1}} = \frac{m-n+1}{n}\,\frac{p}{1-p} = \frac{p}{1-p}\Big(\frac{m+1}{n} - 1\Big)
\]
and thus a = −p/(1 − p) and b = (m + 1)p/(1 − p).
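The three (a, b) pairs can be verified numerically against R's built-in pmfs. This is a sketch under assumed, purely illustrative parameter values (λ = 2, r = 3, p = 0.4, m = 12); `check_ab0` is a hypothetical helper name. It relies on `dnbinom(n, size = r, prob = p)` using the same parametrisation as the solution, i.e. pmf proportional to p^r (1 − p)^n.

```r
# Hypothetical helper: largest deviation between the pmf ratio
# Pr[N = n] / Pr[N = n - 1] and a + b / n over n = 1, ..., 10
check_ab0 <- function(pmf, a, b, n = 1:10) {
  ratio <- pmf(n) / pmf(n - 1)
  max(abs(ratio - (a + b / n)))
}

# Illustrative parameter values (assumptions, not from the exercise)
lambda <- 2; r <- 3; p <- 0.4; m <- 12

err_pois <- check_ab0(function(n) dpois(n, lambda), 0, lambda)
err_nb   <- check_ab0(function(n) dnbinom(n, size = r, prob = p),
                      1 - p, (1 - p) * (r - 1))
err_bin  <- check_ab0(function(n) dbinom(n, size = m, prob = p),
                      -p / (1 - p), (m + 1) * p / (1 - p))
print(c(poisson = err_pois, negbin = err_nb, binomial = err_bin))
```

All three deviations should be at the level of floating-point rounding.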
Solution 4.11: [los25, Exercise] For the compound Poisson:
\[ f(s) = \frac{1}{s}\sum_{h=1}^{s} \lambda h\,p(h)\,f(s - h) \]
with λ = 2 and p(j) = j/10 for j = 1, 2, 3, 4. Thus
\[ f(s) = \frac{1}{s}\,\{0.2 f(s-1) + 0.8 f(s-2) + 1.8 f(s-3) + 3.2 f(s-4)\} \]
so that
\[ f(0) = e^{-2}, \qquad f(1) = 0.2 f(0) = 0.2 e^{-2}, \qquad f(2) = 0.42 f(0) = 0.42 e^{-2} \]
and so on.
Note that f(0) is a factor of every f(x), x > 0. It is therefore more efficient to run the recursion with f(0) replaced by 1 and to multiply all the elements by f(0) = e^{−2} only at the end. This also prevents the rounding error in the evaluation of e^{−2} from spreading and amplifying for large x.
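The rescaling idea described in the note can be sketched in R as follows. The vector `g` holds f(s)/f(0), so the exponential is applied only once at the end; the variable names and the cut-off `smax = 10` are illustrative choices.

```r
lambda <- 2
p <- (1:4) / 10          # p(j) = j/10, j = 1, ..., 4
smax <- 10               # illustrative number of recursion steps

# g[s + 1] = f(s) / f(0); start from g(0) = 1
g <- numeric(smax + 1)
g[1] <- 1
for (s in 1:smax) {
  h <- 1:min(s, length(p))
  g[s + 1] <- sum(lambda * h * p[h] * g[s - h + 1]) / s
}

# Multiply by f(0) = exp(-lambda) once, at the end
f <- exp(-lambda) * g
print(round(g[1:3], 4))  # f(0), f(1), f(2) in units of f(0)
```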
Solution 4.12: [los26, Exercise] Note that $iX_i$ is compound Poisson with parameters $(\lambda_i = i,\ p_i(i) = 1)$. It follows from Theorem 12.4.1 that S is compound Poisson with parameters λ = 1 + 2 + 3 = 6 and
\[
p(x) = \begin{cases} 1/6 & x = 1, \\ 2/6 & x = 2, \\ 3/6 & x = 3, \\ 0 & \text{elsewhere.} \end{cases}
\]
We can then apply Panjer's recursion as usual with $\Pr[S = 0] = e^{-6}$.
Solution 4.13: [los27, Exercise] Use Panjer’s recursion equation.
1. Note that p(0) = 0, so that with N ∼ Poisson(1):
\[ f_S(0) = P(N = 0) = e^{-1} \]
and
\[
\begin{aligned}
f_S(x) &= \frac{1}{x}\sum_{h=1}^{x} (ax + bh)\,p(h)\,f_S(x - h) \\
&= \frac{1}{x}\sum_{h=1}^{x} \lambda h\,p(h)\,f_S(x - h) \\
&= \frac{1}{x}\,[0.2 f_S(x-1) + 0.4 f_S(x-2) + 0.6 f_S(x-3) + 1.6 f_S(x-4)]
\end{aligned}
\]
where in the Poisson case, a = 0, b = λ. Therefore:
\[
\begin{aligned}
f_S(1) &= 0.2 f_S(0) = 0.2 e^{-1} \\
f_S(2) &= \tfrac{1}{2}\,[0.2 f_S(1) + 0.4 f_S(0)] = 0.22 e^{-1} \\
f_S(3) &= \tfrac{1}{3}\,[0.2 f_S(2) + 0.4 f_S(1) + 0.6 f_S(0)] = 0.241 e^{-1} \\
f_S(4) &= \tfrac{1}{4}\,[0.2 f_S(3) + 0.4 f_S(2) + 0.6 f_S(1) + 1.6 f_S(0)] = 0.4641 e^{-1} \\
&\ \ \vdots
\end{aligned}
\]
2. In this case we have
\[ a = 1 - p = 0.8 \quad \text{and} \quad b = (1 - p)(r - 1) = 0.8. \]
Since Pr[X = 0] = 0,
\[ f_S(0) = \Pr[N = 0] = p^r = 0.04. \]
We have then
\[ f_S(s) = 0.8 \sum_{j=1}^{\min(4,s)} \frac{s + j}{s}\,p(j)\,f_S(s - j), \quad s = 1, 2, \ldots \]
and thus
\[
\begin{aligned}
f_S(1) &= 0.8 \cdot 2 \cdot 0.2 \cdot 0.04 = 0.0128 \\
f_S(2) &= 0.8\,\{3/2 \cdot 0.2 \cdot 0.0128 + 2 \cdot 0.2 \cdot 0.04\} = 0.015872 \\
f_S(3) &= \cdots = 0.01959936 \\
f_S(4) &= \cdots = 0.03691315 \\
&\ \ \vdots
\end{aligned}
\]
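Both parts of Solution 4.13 can be reproduced with one small R function. This is a minimal sketch that assumes P(X = 0) = 0 (as in this exercise); `panjer_ab` is a hypothetical name, and the claim-size pmf p = (0.2, 0.2, 0.2, 0.4) is read off from the coefficients 0.2, 0.4, 0.6, 1.6 = h·p(h) in part 1.

```r
# Panjer recursion for an (a, b, 0) claim count and claim sizes on 1, 2, ...
# p[h] = P(X = h); f0 = P(S = 0); returns f_S(0), ..., f_S(smax)
panjer_ab <- function(a, b, p, f0, smax) {
  f <- numeric(smax + 1)
  f[1] <- f0
  for (s in 1:smax) {
    h <- 1:min(s, length(p))
    f[s + 1] <- sum((a + b * h / s) * p[h] * f[s - h + 1])
  }
  f
}

p <- c(0.2, 0.2, 0.2, 0.4)

# Part 1: N ~ Poisson(1), so a = 0, b = 1, f_S(0) = exp(-1)
f_pois <- panjer_ab(0, 1, p, exp(-1), 4)
print(round(f_pois / exp(-1), 4))

# Part 2: negative binomial with a = b = 0.8, f_S(0) = 0.04
f_nb <- panjer_ab(0.8, 0.8, p, 0.04, 4)
print(f_nb)
```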
Solution 4.14: [los28, Exercise] Use of de Pril’s algorithm yields

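\[
\begin{aligned}
p^{*n}(x) &= \frac{1}{e^{-1}} \sum_{j=1}^{x} \Big((n+1)\frac{j}{x} - 1\Big) \frac{e^{-1}}{j!}\, p^{*n}(x - j) \\
&= \sum_{j=1}^{x} \Big((n+1)\frac{j}{x} - 1\Big) \frac{1}{j!}\, p^{*n}(x - j).
\end{aligned}
\]
We have then
\[
\begin{aligned}
p^{*n}(0) &= (p(0))^n = e^{-n} \\
p^{*n}(1) &= (n + 1 - 1)\,e^{-n} = n e^{-n} \\
p^{*n}(2) &= e^{-n}\Big(\frac{n+1-2}{2}\,n + \frac{n}{2}\Big) = \frac{n^2}{2}\,e^{-n} \\
p^{*n}(3) &= e^{-n}\Big(\frac{n+1-3}{3}\,\frac{n^2}{2} + \frac{2n+2-3}{3}\,\frac{n}{2} + \frac{n}{3!}\Big) = \frac{n^3}{3!}\,e^{-n}.
\end{aligned}
\]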
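The pattern $p^{*n}(x) = e^{-n} n^x / x!$ suggested by these values (the Poisson(n) pmf) can be checked numerically. The R sketch below implements De Pril's recursion for a generic pmf on 0, 1, 2, ... with p(0) > 0; the function name `de_pril` and the choices n = 3, xmax = 8 are illustrative assumptions.

```r
# De Pril's recursion for the n-fold convolution of a pmf on 0, 1, 2, ...
# p[j + 1] = P(X = j) for j = 0, ..., xmax, with p[1] > 0
de_pril <- function(p, n, xmax) {
  conv <- numeric(xmax + 1)
  conv[1] <- p[1]^n
  for (x in 1:xmax) {
    j <- 1:x
    conv[x + 1] <- sum(((n + 1) * j / x - 1) * p[j + 1] * conv[x - j + 1]) / p[1]
  }
  conv
}

n <- 3; xmax <- 8
res <- de_pril(dpois(0:xmax, 1), n, xmax)

# Compare with the Poisson(n) pmf, e^{-n} n^x / x!
print(round(cbind(x = 0:xmax, de_pril = res, poisson = dpois(0:xmax, n)), 6))
```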
Panjer <- function(a,b,fX,fS0,type,vartype,print) {

# range of X
rX <- length(fX)-1
# initialisation of our results vectors

pmf <- c(fS0)
df <- c(fS0)
if(type==1) { # if we want the first s recursions

for(i in 1:vartype){ # i=s
temp <- sum((a+b*(1:min(i,rX))/i)*fX[(1:min(i,rX))+1]*pmf[i-(1:min(i,rX))+1])
pmf <- c(pmf, temp/(1-a*fX[1])) # we divide only at the end
df <- c(df,df[i]+pmf[i+1]) # the df...
} # end i loop
i <- vartype+1 # useful to know how many recursions we did for below
} else { # if we focus on the df
i <- 1 # since we use while, we need to create our own counter
while(df[i]<(1-vartype)){ # we can use while here
temp <- sum((a+b*(1:min(i,rX))/i)*fX[(1:min(i,rX))+1]*pmf[i-(1:min(i,rX))+1])
pmf <- c(pmf, temp/(1-a*fX[1]))
df <- c(df,df[i]+pmf[i+1])
i <- i+1 # increment the counter
# the number of recursions is simply i in this case (for below)
} # end while
}
# printing results
if(print==1) {
results <- data.frame(x=0:(i-1),fS=pmf,FS=df)
print(results)
} # end if
#returning results
array(c(pmf,df),c(i,2))
}
Note that we need to return an array now in order to be able to refer to the results as "object[i,j]" with "i" for x and "j" for f_S (j = 1) or F_S (j = 2).
discretisation <-function(densityorcdf,type,h,m){
if(type==1) { #cdf
pmf <- c()
pmf <- c(densityorcdf(h/2))
for(i in 1:(m-1)) {
pmf <- c(pmf,densityorcdf(h*i+h/2)-densityorcdf(h*i-h/2))
}
pmf <- c(pmf,1-densityorcdf((m-.5)*h))
pmf } else { pmf <- c() #density
pmf <- c(as.double(integrate(densityorcdf,0,h/2)[1]))
for(i in 1:(m-1)) {
pmf <- c(pmf,as.double(integrate(densityorcdf,h*i-h/2,h*i+h/2)[1]))
}
pmf <- c(pmf,as.double(integrate(densityorcdf,(m-.5)*h,Inf)[1]))
pmf }# end else
} # end function
# loading the functions

source("...09_IRM_tut_wk03_E12.R")
beta <-.2
lambda <- 20
s<-150
###########################################
# Calculate the approximate probabilities #
# Calculate the moments

cgf <- expression(lambda*((beta/(beta-t))^alpha-1))
param <- list(lambda=20,alpha=1,beta=.2)
Moments <- CMom123Gam12(cgf,param)
# Calculate the approximations

Approx<-Approx(Moments,s,0)
##########################################
# Calculate the "Panjer" probability via #
# discretisation and Panjer #
# Discretise our distribution of X

cdf <- function(x){1-exp(-beta*x)}
pmfX <- discretisation(cdf,1,0.01,5000)
# Calculate the cdf of S until s

# we need thus s/h=150*100 recursions
pmfS <- Panjer(0,lambda,pmfX,exp(lambda*(pmfX[1]-1)),1,15000,0)
# Determine the "Panjer" probability

tprob <- pmfS[15001,2]
#################
# print results #
results <- data.frame(Approx,Panjer=tprob)

cat("Approximations for Pr[X",as.character(s),"]:\n",sep="")
print(results)
cat("\n","Deviation with Panjer probability:\n",sep="")
print(results-tprob)
The plot of fS (x) shows that the distribution of S tends to be normal when λ is big, but the
distribution is still slightly skewed (γ1 = 0.474), which is why the CLT approximation performs
badly.
[Figure: plot of pmfS[, 1] (the probability mass function f_S) against Index, for Index from 0 to 15000.]
Solution 4.18: [los40R, Exercise] This is the R-code for this question.
#Assignment 2008, Question 3
################################
# Assumptions and parameters #
################################
#Distribution of losses
p <- c(.2,.1,.05,.1,.05,.1,.05,.1,.05,.1,.1)
########
#this will be the array of the moments of X, S with Poisson, S with NB, S with B
Moments <- array (dim=c(6,4))
rownames(Moments) <- c("E[.]","E[.^2]","E[.^3]","Var(.)","E[(.-E[.])^3]","gamma(.)")
colnames(Moments) <- c("X "," S if N Poisson"," S if N Neg Bin"," S if N Binomial")
# Moments of X #
####################
Moments[1,1] <- sum(c(0:(length(p)-1))*p)

Moments[2,1] <- sum(c(0:(length(p)-1))^2*p)
Moments[3,1] <- sum(c(0:(length(p)-1))^3*p)
Moments[4,1] <- Moments[2,1] - Moments[1,1]^2
Moments[5,1] <- Moments[3,1]-3*Moments[2,1]*Moments[1,1]+2*Moments[1,1]^3
Moments[6,1] <- Moments[5,1]/Moments[4,1]^(1.5)
# Initialization of parameters
P_lambda <- 4
NB_r <- 4
NB_p <- 0.5
B_n <- 8
B_p <- 0.5
#End of calculations
end <- 300
##############################################
# Distribution of S and moments of (S-d)+ #
##############################################
#this will be array for outputs about the different distributions:

Output <- array(dim=c(3,end+1,7))
# index 1: Poisson / Neg. Bin. / Bin

# index 2: s+1
# index 3: "x","fs(x)","Fs(x)","d","E[(S-d)+]","E[(S-d)+^2]","Var[(S-d)+]"
# Preliminaries #
#####################
## masses at 0 = Fs(0) !!! if p(0)=0
#Output[1,1,2] <- c(exp(P_lambda*(p[1]-1)))
#Output[2,1,2] <- c(NB_p^NB_r)
#Output[3,1,2] <- c((1-B_p)^B_n )
# masses at 0 = Fs(0) !!! if p(0)>0
Output[1,1,2] <- c( exp(P_lambda*(p[1]-1)))
Output[2,1,2] <- c( (NB_p/(1-(1-NB_p)*p[1]))^NB_r )
Output[3,1,2] <- c( ((1-B_p)+B_p*p[1])^B_n )
#Masses at 0 in Fs (needed for all three claim count models)
Output[,1,3] <- Output[,1,2]
#Moments of S for the SL Premiums

Output[1,1,5] = P_lambda*Moments[1,1] # expectation
Output[1,1,7] = P_lambda*Moments[2,1] # variance
Output[1,1,6] = Output[1,1,7] + Output[1,1,5]^2 # second moment around the origin
Output[2,1,5] = NB_r*(1-NB_p)/NB_p*Moments[1,1]
Output[2,1,7] = NB_r*(1-NB_p)/NB_p^2*Moments[1,1]^2 + NB_r*(1-NB_p)/NB_p*Moments[4,1]
Output[2,1,6] = Output[2,1,7] + Output[2,1,5]^2
Output[3,1,5] = B_n*B_p*Moments[1,1]
Output[3,1,7] = B_n*B_p*(1-B_p)*Moments[1,1]^2+B_n*B_p*Moments[4,1]
Output[3,1,6] = Output[3,1,7] + Output[3,1,5]^2
# a-b parameters
ab <- array(dim=c(3,2))
ab[1,1] <- 0
ab[1,2] <- P_lambda
ab[2,1] <- 1-NB_p
ab[2,2] <- (NB_r-1)*(1-NB_p)
ab[3,1] <- -B_p/(1-B_p)
ab[3,2] <- (B_n+1)*B_p/(1-B_p)
sum <- c(0,0,0)
#pdf and cdf and SL premium moments for the three cases
for (i in 1:3)
{
Output[i,1,1] <- 0 # column x
Output[i,1,4] <- 0 # column d
for (s in 1:end)
{
sum[i] <- 0
for (h in 1:min(s,length(p)-1))
{
sum[i] <- sum[i] + (ab[i,1]+ab[i,2]*h/s)*p[h+1]*Output[i,s-h+1,2]
}
Output[i,s+1,1] <- s # label column x
Output[i,s+1,2] <- 1/(1-ab[i,1]*p[1])*sum[i] #fs
Output[i,s+1,3] <- Output[i,s,3] + Output[i,s+1,2] # Fs
Output[i,s+1,4] <- s # label column d
Output[i,s+1,5] <- Output[i,s,5] - 1 + Output[i,s,3] # E[(S-d)+]
Output[i,s+1,6] <- Output[i,s,6] - 2*Output[i,s,5]+1-Output[i,s,3]
Output[i,s+1,7] <- Output[i,s+1,6] - Output[i,s+1,5]^2
}
}
####################
# Moments of S #
####################
# Remember: #
###############
##this will be the array of the moments of X, S with Poisson, S with NB, S with B
#Moments <- array (dim=c(6,4))
#rownames(Moments) <- c("E[.]","E[.^2]","E[.^3]","Var(.)","E[(.-E[.])^3]","gamma(.)")
#colnames(Moments) <- c("X "," S if N Poisson"," S if N Neg Bin"," S if N Binomial")
for (i in 2:4) {
Moments[1,i] <- sum(c(0:end)*c(Output[i-1,1:(end+1),2]))
Moments[2,i] <- sum(c(0:end)^2*c(Output[i-1,1:(end+1),2]))
Moments[3,i] <- sum(c(0:end)^3*c(Output[i-1,1:(end+1),2]))
Moments[4,i] <- sum((c(0:end)-Moments[1,i])^2*c(Output[i-1,1:(end+1),2]))
Moments[5,i] <- sum((c(0:end)-Moments[1,i])^3*c(Output[i-1,1:(end+1),2]))
Moments[6,i] <- Moments[5,i]/Moments[4,i]^(1.5)
}
################################
# Probability approximations #
################################
Proba <- array(dim=c(3,4))

rownames(Proba) <- c("If N is Poisson","If N is Negative Binomial","If N is Binomial")
colnames(Proba) <- c("True","CLT","Gamma","NP")
#survival function at the point:

ss <- 50
for (i in 1:3) {
Proba[i,1] <- 1-Output[i,ss+1,3]
Proba[i,2] <- 1-pnorm(ss,Moments[1,i+1],Moments[4,i+1]^.5)
Proba[i,3] <- 1-pgamma(ss-Moments[1,i+1]+2*Moments[4,i+1]^.5/Moments[6,i+1] ,
4/Moments[6,i+1]^2 , 2/Moments[6,i+1]/Moments[4,i+1]^.5)
Proba[i,4] <- 1-pnorm((9/Moments[6,i+1]^2+6*(ss-Moments[1,i+1])/Moments[4,i+1]^.5/Moments[6,i+1]+1)
^.5-3/Moments[6,i+1])
}
###################
# Print results #
###################
# Distributions and stop-loss premium #

#######################################
#End of table
prints <- 80
Poisson <- array(dim=c(prints+1,7))

rownames(Poisson) <- rep("",times=prints+1)
colnames(Poisson) <- c("x","fs(x)","Fs(x)","d","E[(S-d)+]","E[(S-d)+^2]","Var[(S-d)+]")
NegBin <- array(dim=c(prints+1,7))

rownames(NegBin) <- rep("",times= prints +1)
colnames(NegBin) <- c("x","fs(x)","Fs(x)","d","E[(S-d)+]","E[(S-d)+^2]","Var[(S-d)+]")
Bin <- array(dim=c(prints+1,7))

rownames(Bin) <- rep("",times= prints +1)
colnames(Bin) <- c("x","fs(x)","Fs(x)","d","E[(S-d)+]","E[(S-d)+^2]","Var[(S-d)+]")
for (i in 1:7) {
for (j in 1:(prints+1)) {
Poisson[j,i] <- Output[1,j,i]
NegBin[j,i] <- Output[2,j,i]
Bin[j,i] <- Output[3,j,i]
}
}
print(c("Distribution of S and stop-loss premiums if N is Poisson"))

print(Poisson,print.gap=3)
print(c("Distribution of S and stop-loss premiums if N is Negative Binomial"))
print(NegBin,print.gap=3)
print(c("Distribution of S and stop-loss premiums if N is Binomial"))
print(Bin,print.gap=3)
# Moments #
###########
options(scipen=2)
print(c("Moments of the 4 distributions"))

print(Moments,digits=3)
# Probabilities #
#################
print(c("Comparison of true probability and approximations for S(50)"))

print(Proba,digits=5,print.gap=2)
# plots #
#########
par(mfrow=c(2,2)) # 2x2 grid of plots (a bare mfrow=... only assigns a variable)
plot(Poisson[,1],Poisson[,2],main="f(x) of S if N Poisson",xlab="",ylab="",col="1")
plot(NegBin[,1],NegBin[,2],main="f(x) of S if N NegBin",xlab="",ylab="",col="2")
plot(Bin[,1],Bin[,2],main="f(x) of S if N Bin",xlab="",ylab="",col="3")
plot(Poisson[,1],Poisson[,3],main="F(x) of S",xlab="",ylab="",col="1",type="l")
lines(NegBin[,1],NegBin[,3],col="2")
lines(Bin[,1],Bin[,3],col="3")
plot(Poisson[,4],Poisson[,5],type="l",col="1",main="Stop-loss Premium",xlab="d",ylab="E[(S-d)+]")
lines(NegBin[,4],NegBin[,5],col="2")
lines(Bin[,4],Bin[,5],col="3")
legend(60,10,legend=c("Poi","NegBin","Bin"),lty=1,col=1:3)
1. Below are the tables for x, fS (x), FS (x), x = 0, 1, 2, . . . , 25.
print(c("Distribution of S if N is Poisson"))
[1] "Distribution of S if N is Poisson"
print(Poisson[,1:3],print.gap=3)
x fs(x) Fs(x)
0 0.040762203978 0.04076220
1 0.016304881591 0.05706709
2 0.011413417114 0.06848050
3 0.020000654752 0.08848116
4 0.016185312460 0.10466647
5 0.024547760130 0.12921423
6 0.021824859398 0.15103909
7 0.030071258549 0.18111035
8 0.028466826771 0.20957717
9 0.036711555213 0.24628873
10 0.044414433611 0.29070316
11 0.031582034174 0.32228520
12 0.032987043085 0.35527224
13 0.033859163507 0.38913140
14 0.034284589883 0.42341599
15 0.035304798458 0.45872079
16 0.034700813781 0.49342161
17 0.035645183751 0.52906679
18 0.033930905728 0.56299770
19 0.034555188644 0.59755288
20 0.032430919914 0.62998380
21 0.028716574671 0.65870038
22 0.027888692422 0.68658907
23 0.026623935971 0.71321301
24 0.025483357983 0.73869637
25 0.024071665116 0.76276803
print(c("Distribution of S if N is Negative Binomial"))

[1] "Distribution of S if N is Negative Binomial"
print(NegBin[,1:3],print.gap=3)
x fs(x) Fs(x)
0 0.0952598689 0.09525987
1 0.0211688598 0.11642873
2 0.0135245493 0.12995328
3 0.0244356591 0.15438894
4 0.0177214790 0.17211042
5 0.0283404926 0.20045091
6 0.0225677187 0.22301863
7 0.0329761903 0.25599482
8 0.0281639543 0.28415877
9 0.0384485051 0.32260728
10 0.0452089616 0.36781624
11 0.0266490320 0.39446527
12 0.0275746285 0.42203990
13 0.0278652951 0.44990519
14 0.0279380518 0.47784325
15 0.0284733228 0.50631657
16 0.0276653689 0.53398194
17 0.0282848455 0.56226678
18 0.0265446534 0.58881144
19 0.0270714614 0.61588290
20 0.0250558092 0.64093871
21 0.0218626373 0.66280134
22 0.0213481731 0.68414952
23 0.0205224539 0.70467197
24 0.0198092739 0.72448125
25 0.0189021953 0.74338344
print(c("Distribution of S if N is Binomial"))
[1] "Distribution of S if N is Binomial"
print(Bin[,1:3],print.gap=3)
x fs(x) Fs(x)
0 1.679616e-02 0.01679616
1 1.119744e-02 0.02799360
2 8.864640e-03 0.03685824
3 1.500768e-02 0.05186592
4 1.382022e-02 0.06568614
5 1.988766e-02 0.08557380
6 1.986838e-02 0.10544218
7 2.599662e-02 0.13143879
8 2.717911e-02 0.15861790
9 3.350957e-02 0.19212747
10 4.153728e-02 0.23366474
11 3.468674e-02 0.26835149
12 3.666887e-02 0.30502036
13 3.836575e-02 0.34338611
14 3.931281e-02 0.38269892
15 4.096584e-02 0.42366476
16 4.079371e-02 0.46445847
17 4.210595e-02 0.50656442
18 4.069388e-02 0.54725830
19 4.134746e-02 0.58860576
20 3.935274e-02 0.62795850
21 3.533203e-02 0.66329053
22 3.406891e-02 0.69735943
23 3.222769e-02 0.72958712
24 3.050720e-02 0.76009432
25 2.843743e-02 0.78853176
2. The following are moments for X and Si , i = 1, 2, 3:
options(scipen=2)
print(c("Moments of the 4 distributions"))
[1] "Moments of the 4 distributions"
print(Moments,digits=3)
X S if N Poisson S if N Neg Bin S if N Binomial
E[.] 4.50 18.000 18.00 18.000
E[.^2] 32.50 454.000 535.00 413.500
E[.^3] 262.50 13902.000 20760.00 11019.750
Var(.) 12.25 130.000 211.00 89.500
E[(.-E[.])^3] 6.00 1050.000 3534.00 354.750
gamma(.) 0.14 0.708 1.15 0.419
3. Firstly recall that
N1 ∼ Poisson(λ1 = 4),
N2 ∼ Neg Bin(r2 = 4, p2 = 0.5), and
N3 ∼ Bin(n3 = 8, p3 = 0.5).
The random variables N1 , N2 and N3 all have the expected value of 4. But their variances
are different. N1 has variance of 4, N2 has variance of 8 and N3 has variance of 2. Recall
that N in the collective risk model is the random variable that represents the number of
claims in a portfolio. The variability of this random variable will play a vital role in the
distribution of S. We will illustrate this using the following graphs.
[Figure: four panels, each plotted for x from 0 to 80. Panels "Binomial", "Poisson" and "Negative Binomial" show the probability mass function of S under each assumption; panel "Distribution functions" shows the three distribution functions of S.]
The first three graphs represent the distribution of S under each assumption for N_i, i = 1, 2, 3. Notice that under the negative binomial assumption, S has a heavy tail. Under the Poisson assumption, the tail is smaller. Under the binomial assumption, the tail is the smallest. This is caused by the different variances of the random variables N_i under the different assumptions. Under the negative binomial assumption, N has the largest variance, which causes a large variability in the entire portfolio and is reflected in the heavy tail.
This idea is reinforced by the fourth graph, which shows the cumulative distribution function under each assumption for N_i, i = 1, 2, 3. We notice that, under the negative binomial assumption (red line), more probability is assigned to the larger values of x. In comparison, under the binomial assumption, more probability is assigned to small values of x.
Module 5
Ruin Theory and Premium Calculation Principles
5.1 Ruin theory in discrete time

Exercise 5.1: [sur2, Solution] An insurance company has a portfolio of policies with the
following characteristics:
• initial surplus is 1.0;
• individual claim amounts have distribution

x p (x)
2 0.2
4 0.5
6 0.3
• one claim occurs at each of time 1, 2, 3, ...; and
• premiums are paid at the end of each period.
Determine the smallest relative security loading the insurer can choose so that it is certain
that ruin does not occur at time 1.
5.2 Ruin theory in continuous time

Exercise 5.2: [sur3, Solution] Show that all these expressions are equivalent (to determine R)
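\[ 1 + (1 + \theta)\,p_1 r = m_X(r) \qquad (1) \]
\[ \lambda + \pi r = \lambda\, m_X(r) \qquad (2) \]
\[ \int_0^\infty \big[e^{rx} - (1 + \theta)\big]\,[1 - P(x)]\,dx = 0 \qquad (3) \]
\[ e^{r\pi} = E\big[e^{rS(1)}\big] \qquad (4) \]
where $p_1 = E(X)$.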
Exercise 5.3: [sur4, Solution] Show that
\[ E\big[e^{-RC(t)}\big] = e^{-Rc_0}. \]
Exercise 5.4: [sur5A, Solution] [Bowers et al. (1997), Exercise 13.5] Calculate
\[ \lim_{\theta \to 0} R \]
and
\[ \lim_{\theta \to \infty} R. \]
Exercise 5.5: [sur6A, Solution] [A Exercise 13.6] Use
\[ e^{rx} > 1 + rx + \tfrac{1}{2}(rx)^2, \quad r > 0,\ x > 0 \qquad (1) \]
to show that
\[ R < \frac{2\theta p_1}{p_2}. \]
Exercise 5.6: [sur7, Solution] Suppose that claims form a compound Poisson process, with λ = 1 and each claim size X following a Uniform(0, 1) distribution.
Premiums are received continuously at the rate of π = 1.
Find the adjustment coefficient if proportional reinsurance is purchased with a retention α = 0.5 and with reinsurance loading equal to 100%.
Exercise 5.7: [sur8, Solution] An insurance company has aggregate claims that have a compound Poisson distribution with:
• λ = 2;
• each claim size X follows a Uniform(0, 2) distribution; and
• premium collection rate is 6.
The insurer buys a proportional reinsurance that reimburses 20% of each individual claim.
The adjustment coefficient with reinsurance is 1.75. Determine the reinsurance loading.
Exercise 5.8: [sur9R, Solution][R*] Let Pr[X = x] = 1/4, x = 0, 1, 2, 4, and let λ = 1.
1. Find R if c = 3.
2. Plot R for θ = i/100, i = 1, 2, . . . , 100.
Exercise 5.9: [sur10R, Solution][R*] Consider Exercise 5.6 above. Assume now that θ = 0.60 (where this is the loading of the insurer). Find the optimal α that will minimise Lundberg's upper bound for the probability of ruin and plot R for α = 0.5 + i/100, i = 1, 2, . . . , 49.
Exercise 5.10: [sur11, Solution] Consider a Brownian motion process with drift µ and volatility σ². Use the moment generating function to show that the Brownian motion can be approximated by a shifted Poisson process {τN(t) − πt, t ≥ 0}.
[Hint: you need to determine λ(µ, σ, τ) and c(µ, σ, τ)]
Exercise 5.11: [sur12R, Solution][R*] Implement in R the approximation of the Brownian motion by a shifted Poisson process as seen in Exercise 5.10. Plot your process for a number of jumps equal to three times λ (approximately three units of time).
If you set the seed for your exponential pseudo-random variables to 12345 (use "set.seed(12345)") and if τ = 0.03, µ = 3 and σ = 6, you should get the following graph:
[Figure: "Approximation of a Brownian Motion", W(t) plotted against time from 0.0 to 3.0.]
5.3 Premium Risk-based Principles

Exercise 5.12: [NLI13, Solution][Wuthrich (2014), Exercise 13] We would like to insure the following car fleet. Assume that the car fleet can be modelled by a compound Poisson distribution ("Vco" means coefficient of variation).

i               v_i   λ_i   E[Y_1^(i)]   Vco(Y_1^(i))
passenger car   40    25%   2000         2.5
delivery van    30    23%   1700         2.0
truck car       10    19%   4000         3.0

1. Calculate the expected claim amount of the car fleet.
2. Calculate the premium for the car fleet using the variance loading principle with $\alpha = 3 \cdot 10^{-6}$.
Exercise 5.13: [NLI13.1, Solution] Suppose that we have two independent lines of business, S_1 and S_2, which have distributions S_1 ∼ CompPoi(10, G_1 ∼ Exp(10)) and S_2 ∼ CompPoi(20, G_2 ∼ Exp(20)).
(a) Determine the distribution of the sum S = S_1 + S_2.
(b) Calculate the premium using the variance loading principle: π = E[S] + αVar(S) using α = 0.25.
(c) Now assume that we have n lines of business (n being large), S_1, S_2, ..., S_n, and the sum $S = \sum_{i=1}^{n} S_i$. Propose a candidate for α when using the standard deviation loading principle: $\pi = E[S] + \alpha\,\mathrm{Var}(S)^{1/2}$ (hint: Central Limit Theorem).
5.4 Solutions
Solution 5.1: [sur2, Exercise] Note that this surplus process is not a Cramér-Lundberg process
since the number of claims is not random.
The surplus process can still be expressed as C(t) = c0 + πt − S(t) where c0 = 1 and the
premium rate π = E (X) (1 + θ) = 4.2 (1 + θ). Thus,
P (ruin at t = 1) = P (C(1) < 0)
= P (1 + 4.2 (1 + θ) − S(1) < 0)
= P (S(1) > 5.2 + 4.2θ) .
This probability is 0 if 5.2 + 4.2θ ≥ 6 (the largest possible claim), or equivalently θ ≥ 0.8/4.2 ≈ 0.19.
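Solution 5.2: [sur3, Exercise] Recall that the adjustment coefficient R is the positive solution to:
\[ 1 + (1 + \theta)\,p_1 R = m_X(R). \]
If we multiply both sides by λ, we have
\[ \lambda + \lambda(1 + \theta)\,p_1 R = \lambda\, m_X(R), \]
which is equivalent to (2), since $\pi = \lambda(1 + \theta)\,p_1$:
\[ \lambda + \pi R = \lambda\, m_X(R). \]
Again, from equation (1), we have
\[ 1 + (1 + \theta)\,E(X)\,R = E\big[e^{RX}\big] \]
so that
\[ \int_0^\infty \big[e^{Rx} - 1 - (1 + \theta) R x\big]\,dP(x) = 0. \]
This implies
\[
\begin{aligned}
\int_0^\infty \int_0^x \big[R e^{Ry} - (1 + \theta) R\big]\,dy\,dP(x)
&= \int_0^\infty \int_y^\infty \big[R e^{Ry} - (1 + \theta) R\big]\,dP(x)\,dy \\
&= \int_0^\infty (1 - P(y))\,\big[R e^{Ry} - (1 + \theta) R\big]\,dy \\
&= 0.
\end{aligned}
\]
And since R ≠ 0, we have (3). From (2) derived above, we have
\[
\begin{aligned}
& \pi R = \lambda\,[m_X(R) - 1] \\
\iff\ & e^{\pi R} = \exp\{\lambda\,[m_X(R) - 1]\} \quad \text{(which is a compound Poisson m.g.f.)} \\
\iff\ & e^{\pi R} = m_{S(1)}(R) = E\big[e^{RS(1)}\big]
\end{aligned}
\]
which confirms (4).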
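Solution 5.3: [sur4, Exercise] We have
\[ E\big[e^{-RC(t)}\big] = E\big[e^{-R[c_0 + \pi t - S(t)]}\big] = e^{-Rc_0}\, e^{-\pi R t}\, E\big[e^{RS(t)}\big] = e^{-Rc_0}\, e^{-\pi R t}\, \big(E\big[e^{RS(1)}\big]\big)^t \]
and since $e^{R\pi} = E[e^{RS(1)}]$ by definition (of R in Exercise 5.2),
\[ E\big[e^{-RC(t)}\big] = e^{-Rc_0}. \]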
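Solution 5.4: [sur5A, Exercise] Think of the graph of $1 + (1 + \theta)p_1 r$ against $m_X(r)$ seen in the lecture.
As θ → 0+, the slope of the line tends to $p_1$, which is exactly $m_X'(0)$. The line then becomes tangent to the mgf at the origin, so that R → 0+ and thus the bound becomes ψ(u) ≤ 1, u ≥ 0, i.e. the bound is meaningless.
The case θ → ∞ is less clear. The mgf of X is not always defined for all r ∈ [0, ∞). For instance, if X is exponential(β) with $M_X(t) = \frac{\beta}{\beta - t}$, then R → β as θ → ∞. On the other hand, if X is inverse Gaussian, apart from the trivial 0, R may or may not exist depending on the parameters, as θ → ∞. To illustrate this, consider example 13.4.3.e in [A] on page 412: assume that the claim amount distribution is inverse Gaussian with mgf
\[ M_X(t) = e^{\alpha\left(1 - \sqrt{1 - 2t/\beta}\right)} \quad \text{for } t < \beta/2, \]
with $\lim_{t \to \beta/2} M_X(t) = e^{\alpha}$ and $E(X) = \alpha/\beta$.
Case 1: $e^{\alpha} > 1 + (1 + \theta)\,\frac{\alpha}{\beta}\,\frac{\beta}{2} = 1 + (1 + \theta)\,\frac{\alpha}{2}$ (at r = β/2 the mgf lies above the line, so a positive root R < β/2 exists).
Case 2: $e^{\alpha} < 1 + (1 + \theta)\,\frac{\alpha}{2}$ (at r = β/2 the mgf is still below the line, so, by convexity of the mgf, there is no positive root).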
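Solution 5.5: [sur6A, Exercise] Note that the expression on the right hand side of (1) is the beginning of the Taylor expansion of the exponential on the left hand side. Now remember that $m_X(r) = E[e^{rX}]$ and thus
\[
\begin{aligned}
& 1 + (1 + \theta)\,p_1 R = E\big[e^{RX}\big] > 1 + R p_1 + \tfrac{1}{2} R^2 p_2 \\
\Rightarrow\ & \theta p_1 R > \tfrac{1}{2} R^2 p_2 \\
\Rightarrow\ & R < \frac{2\theta p_1}{p_2}.
\end{aligned}
\]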
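The bound can be illustrated numerically. The R sketch below uses, as an assumed example, the claim distribution of Exercise 5.8 (Pr[X = x] = 1/4 for x = 0, 1, 2, 4) with θ = 5/7, finds R with `uniroot`, and compares it with 2θp_1/p_2; the variable names and the search interval are illustrative.

```r
# Assumed example: claim distribution of Exercise 5.8 with theta = 5/7
xs <- c(0, 1, 2, 4)
px <- rep(1 / 4, 4)
theta <- 5 / 7

p1 <- sum(xs * px)      # E(X)
p2 <- sum(xs^2 * px)    # E(X^2)

# Lundberg equation: 1 + (1 + theta) p1 r - m_X(r) = 0
lundberg <- function(r) 1 + (1 + theta) * p1 * r - sum(px * exp(r * xs))
R <- uniroot(lundberg, lower = 0.001, upper = 2, tol = 1e-10)$root

bound <- 2 * theta * p1 / p2
print(c(R = R, bound = bound))
```

The lower end of the search interval is kept strictly positive to avoid the trivial root r = 0.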
Solution 5.6: [sur7, Exercise] First, note that
\[ \pi_h = (1 + \xi)\,\lambda \int_0^1 h(x)\,p(x)\,dx. \]
With ξ = 100%, λ = 1, h(x) = (1 − α)x, this simplifies to
\[ \pi_h = 2 \int_0^1 (1 - \alpha)\,x\,p(x)\,dx = 1 - \alpha \]
as $\int_0^1 x\,p(x)\,dx = E(Y) = 0.5$. Therefore $\pi - \pi_h = 1 - (1 - \alpha) = \alpha$. Lundberg's equation is then
\[ \lambda\,\big[M_{Y - h(Y)}(R_h) - 1\big] = (\pi - \pi_h)\,R_h \]
where $M_{Y - h(Y)}(R_h) = M_{\alpha Y}(R_h) = M_Y(\alpha R_h) = \int_0^1 e^{\alpha R_h x}\,dx$.
Because R solves $1 + R = \frac{e^{R} - 1}{R}$, $\alpha R_h$ solves
\[ 1 + \alpha R_h = \frac{e^{\alpha R_h} - 1}{\alpha R_h}. \]
In the absence of reinsurance (α = 1), the non-trivial solution of this equation can be shown to be R = 1.793. But if α = 0.5, it is clear that
\[ R_h = 1.793/0.5 = 3.587. \]
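The value R = 1.793 quoted above can be obtained numerically. This is a minimal R sketch using `uniroot` on the equation 1 + R = (e^R − 1)/R; the function name and the search interval [0.1, 5] are illustrative choices that bracket the non-trivial root.

```r
# Lundberg equation without reinsurance for Uniform(0, 1) claims,
# lambda = 1 and premium rate 1: 1 + r - (exp(r) - 1) / r = 0
lundberg_unif <- function(r) 1 + r - (exp(r) - 1) / r

R <- uniroot(lundberg_unif, lower = 0.1, upper = 5, tol = 1e-10)$root

# With retention alpha, alpha * R_h solves the same equation
alpha <- 0.5
Rh <- R / alpha
print(c(R = R, Rh = Rh))
```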
Solution 5.7: [sur8, Exercise] Since reinsurance is proportional, h(x) = (1 − α)x = 0.2x, λ = 2, α = 0.8, and the reinsurance premium is
\[ \pi_h = (1+\xi)\,\lambda \int_0^2 h(x)\,p(x)\,dx = (1+\xi)\,2\int_0^2 0.2\,x\,p(x)\,dx = 0.4\,(1+\xi) \]
where ξ denotes the reinsurance loading. Thus, we have
\[ \lambda + (\pi - \pi_h) R_h = \lambda M_{Y-h(Y)}(R_h) = \lambda \int_0^2 e^{R_h (x - h(x))}\,p(x)\,dx \]
\[ 2 + \left[6 - 0.4\,(1+\xi)\right](1.75) = 2\int_0^2 e^{1.75(0.8x)}\,0.5\,dx \]
from which we can solve for ξ = 110%.
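The last equation is linear in ξ, so it can be solved by hand, but a short numerical check is also easy. The sketch below assumes the values used in this solution (λ = 2, π = 6, Rh = 1.75, retained share 0.8, claim density 0.5 on (0, 2)); the names rhs and lhs are illustrative.

```r
# right-hand side: lambda * E[exp(Rh * 0.8 * Y)] with density 0.5 on (0, 2)
rhs <- 2 * integrate(function(x) exp(1.75 * 0.8 * x) * 0.5,
                     lower = 0, upper = 2)$value

# left-hand side as a function of the reinsurance loading xi
lhs <- function(xi) 2 + (6 - 0.4 * (1 + xi)) * 1.75

# solve lhs(xi) = rhs for xi (the bracketing interval is an assumption)
xi <- uniroot(function(xi) lhs(xi) - rhs, lower = 0, upper = 5)$root
print(xi)
```

The root is close to 1.10, in line with the loading of about 110% quoted above.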
Solution 5.8: [sur9R, Exercise] We solve for the positive R which is the solution to λ + πR = λ mX(R). The m.g.f. is given by
\[ m_X(t) = \frac{1}{4}\left(1 + e^{t} + e^{2t} + e^{4t}\right) \]
so that the equation to solve is
\[ 1 + \pi R - \frac{1}{4}\left(1 + e^{R} + e^{2R} + e^{4R}\right) = 0. \]
1. If π = 3, then θ = 3/p1 − 1. Since p1 = 7/4, θ = 5/7. The resulting R is 0.3159.
2. The resulting plot is as follows:
[Figure: R (vertical axis) plotted against theta (horizontal axis), for theta between 0 and 1.]
The code used for these answers is
# objective function: Lundberg's equation with pi = (1+theta)*p1 and p1 = 7/4
eqR <- function(r,theta){
  1+(1+theta)*7/4*r-(1+exp(r)+exp(2*r)+exp(4*r))/4
}
# root of eqR for a given theta (the lower bound 0.001 excludes the trivial root r = 0)
fR <- function(x){
  uniroot(eqR,lower=0.001,upper=2,theta=x)$root
}
# part a
R <- fR(5/7)
print(R)
# part b
theta <- (1:100)/100
Rvals <- sapply(theta,fR) # the y values
plot(theta,Rvals,xlab="theta",ylab="R")
Solution 5.9: [sur10R, Exercise] We have the optimal values
α∗ = 0.71 and R∗ = 1.409.
The plot is as follows:
[Figure: R (vertical axis) plotted against alpha (horizontal axis), for alpha between 0.5 and 1.]
The code used for these answers is
# create our objective function
eqR <- function(r,alpha){
  1+(alpha-.2)*r-(exp(alpha*r)-1)/alpha/r
}
# create a function for finding the root of eqR for given alpha
fR <- function(x){
  uniroot(eqR,lower=0.001,upper=100,alpha=x)$root
}
# finding the optimal alpha and corresponding R
R <- optimise(fR,interval=c(0.5,0.999),maximum=TRUE)
print(R)
# plotting
alpha <- 0.5+(1:49)/100
Rvals <- sapply(alpha,fR)
plot(alpha,Rvals,xlab="alpha",ylab="R")
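Solution 5.10: [sur11, Exercise] Consider the shifted Poisson process {τN(t) − πt, t ≥ 0}. We match the mean and variance of the shifted Poisson process with those of the Brownian motion (over 1 unit of time), i.e.
\[ \mu = \tau\lambda - \pi \quad \text{and} \quad \sigma^2 = \tau^2\lambda. \]
Solving for λ and π yields
\[ \lambda = \frac{\sigma^2}{\tau^2} \quad \text{and} \quad \pi = \frac{\sigma^2}{\tau} - \mu. \]
Next we obtain the moment generating function of the process {τN(t) − πt}:
\[ E\left[e^{k(\tau N(t) - \pi t)}\right] = e^{-\pi t k}\, E\left[e^{k\tau N(t)}\right] = e^{-\pi t k}\, e^{\lambda t\left(e^{k\tau} - 1\right)}. \]
Now we use Taylor's series expansion on $e^{k\tau} - 1$ and obtain
\[ e^{k\tau} - 1 = k\tau + \frac{(k\tau)^2}{2!} + \frac{(k\tau)^3}{3!} + \dots \]
If we substitute the expansion back into the moment generating function and replace λ and π by the expressions above, we obtain
\[ E\left[e^{k(\tau N(t) - \pi t)}\right] = \exp\left\{-\left(\frac{\sigma^2}{\tau} - \mu\right) t k + \frac{\sigma^2}{\tau^2}\, t \left(k\tau + \frac{(k\tau)^2}{2!} + \frac{(k\tau)^3}{3!} + \dots\right)\right\} \]
\[ = \exp\left\{-\frac{\sigma^2}{\tau}\, t k + \mu t k + \frac{\sigma^2}{\tau}\, t k + \frac{\sigma^2}{2}\, t k^2 + o(\tau)\right\}, \]
where o(τ) → 0 when τ → 0. Now if we let τ → 0, the moment generating function becomes
\[ \lim_{\tau \to 0} E\left[e^{k(\tau N(t) - \pi t)}\right] = \exp\left\{(\mu t) k + \frac{\sigma^2}{2}\, t k^2\right\}, \]
which is the moment generating function of a Brownian motion process with drift µ and variance parameter σ².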
Solution 5.11: [sur12R, Exercise] Here is a possible code:
#model: shifted Poisson process
# X(t) = total net loss for constant claims tau
# X(t) = tau*N(t) - ct
# E[X(t)]=(tau*lambda-c) t
# Var(X(t))= tau^2 lambda t
###
# we choose lambda and c such that
# mu = tau lambda - c
# sigma^2 = tau^2 * lambda
###
# we have then
# lambda = sigma^2/tau^2
# c = sigma^2/tau - mu
# and we will let tau -> 0
#initialising parameters
mu <- 3
sigma <- 6
tau <- .03
lambda <- sigma^2/tau^2

c <- sigma^2/tau - mu
# number of jumps
num <- round(3*lambda)
# times between jumps

set.seed(12345)
times <- rexp(num,lambda)
# calculate process
Brownian <- array(c(rep(0,2*(num+1))),c(num+1,2))
for(i in 1:num){
Brownian[i+1,1] <- Brownian[i,1] + times[i]
Brownian[i+1,2] <- tau*i-c*Brownian[i+1,1]
} #end of for loop
# plot Brownian
colnames(Brownian)<-c("t","W(t)")
plot(Brownian,type="l",lwd=0.1,main="Approximation of a Brownian Motion")
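As a possible alternative to the explicit loop, the same path can be built with cumsum. This is a minimal sketch under the same parameter choices as the listing above; the names prem, t_jump, W and path are illustrative (prem plays the role of c, renamed only to avoid clashing with R's c() function).

```r
# same parameters as in the listing above
mu <- 3; sigma <- 6; tau <- 0.03
lambda <- sigma^2/tau^2
prem <- sigma^2/tau - mu              # premium rate, called c above
num <- round(3*lambda)                # number of jumps

set.seed(12345)
t_jump <- cumsum(rexp(num, lambda))   # jump times of the Poisson process
W <- tau*seq_len(num) - prem*t_jump   # process value right after each jump

# first row is the starting point (t, W) = (0, 0)
path <- cbind(t = c(0, t_jump), W = c(0, W))
plot(path, type = "l", lwd = 0.1, main = "Approximation of a Brownian Motion")
```

Because cumsum accumulates the inter-arrival times in the same order as the loop, the two versions trace the same path for a given seed, while avoiding the element-by-element updates.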
Solution 5.12: [NLI13, Exercise] Let $S = \sum_{i=1}^{3} S_i$ be the aggregation of the three individual lines of car fleet business. The moment generating function of S is
\[ M_S(r) = \exp\left\{\sum_{i=1}^{3} \lambda_i v_i \left(M_{Y_1^{(i)}}(r) - 1\right)\right\}. \tag{5.1} \]
1. Using the moment generating function of S, the expected claim amount of the car fleet is
\[ E[S] = \sum_{i=1}^{3} \lambda_i v_i\, E\left[Y_1^{(i)}\right] = 39330. \tag{5.2} \]
2. Using the moment generating function of S, the variance of the claim amount S is
\[ \mathrm{Var}(S) = \sum_{i=1}^{3} \lambda_i v_i\, E\left[\left(Y_1^{(i)}\right)^2\right] = 693705000. \tag{5.3} \]
The premium for the car fleet using the variance loading principle with $\alpha = 3 \cdot 10^{-6}$ is
\[ 39330 + 3 \cdot 10^{-6} \cdot 693705000 = 41411.12. \tag{5.4} \]
Solution 5.13: [NLI13.1, Exercise]
(a) Using the aggregation of compound Poisson distributions, the sum S = S1 + S2 also has a compound Poisson distribution
\[ S = S_1 + S_2 \sim \mathrm{CompPoi}\left(30,\ \tfrac{1}{3} G_1(x) + \tfrac{2}{3} G_2(x)\right). \tag{5.5} \]
(b) We first determine the expected value and variance of S,
\[ E[S] = 30\left(\frac{1}{3}\cdot\frac{1}{10} + \frac{2}{3}\cdot\frac{1}{20}\right) = 2, \tag{5.6} \]
\[ \mathrm{Var}[S] = 30\left(\frac{1}{3}\cdot\frac{2}{10^2} + \frac{2}{3}\cdot\frac{2}{20^2}\right) = \frac{3}{10}. \tag{5.7} \]
Then the premium using the variance loading principle is π = 2 + 0.25 × 0.3 = 2.075.
(c) According to the Central Limit Theorem, the sum $S = \sum_{i=1}^{n} S_i$ can be approximated by the normal distribution with mean nE[Si] = E[S] and variance nVar(Si) = Var(S). A natural candidate for α is 1.96 (the 97.5% quantile of a standard normal random variable). Such a premium corresponds to a 2.5% Value-at-Risk.
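The arithmetic in part (b) can be reproduced in a few lines of R. This is only a sketch: the weights and moments are read off equations (5.6) and (5.7), the loading 0.25 is the one used in the premium above, and all variable names are illustrative.

```r
lambda <- 30               # Poisson parameter of the aggregated portfolio
w  <- c(1/3, 2/3)          # mixing weights of G1 and G2 in (5.5)
m1 <- c(1/10, 1/20)        # first moments of the claim sizes, as in (5.6)
m2 <- c(2/10^2, 2/20^2)    # second moments of the claim sizes, as in (5.7)

ES   <- lambda * sum(w * m1)   # E[S] for a compound Poisson sum
VarS <- lambda * sum(w * m2)   # Var[S] for a compound Poisson sum
premium <- ES + 0.25 * VarS    # variance loading principle
print(c(ES = ES, VarS = VarS, premium = premium))
```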
Module 6
Generalised Linear Models (GLMs)
6.1 Components of a GLM

Exercise 6.1: [glm1, Solution] For the following members of the exponential dispersion family,
give the density (including the domain), the mean and the variance:
1. Normal(µ, σ 2 )
2. Poisson(µ)
3. Binomial(m, p)
4. Negbin(r, p)
5. Gamma(α, β)
6. Inverse Gaussian(α, β)
Exercise 6.2: [glm2K, Solution] [Kaas et al. (2008), Problem 8.6.8] The following is an extract
from Kaas et al. (2008, pp.193–4).
The Esscher transform with parameter h of a continuous distribution f(y) is the density
\[ f_h(y) = \frac{e^{hy} f(y)}{\int e^{hz} f(z)\,dz}, \tag{8.41} \]
provided the denominator is finite, i.e., the mgf with f(y) exists at h. A similar transformation of the density can be performed for discrete distributions. In both cases, the mgf with the transformed density equals $m_h(t) = m(t+h)/m(h)$. For a density f in the exponential dispersion family, the cgf of $f_h$ has the form
\[ \kappa_h(t) = \frac{b(\theta + (t+h)\psi) - b(\theta)}{\psi} - \frac{b(\theta + h\psi) - b(\theta)}{\psi} = \frac{b(\theta + h\psi + t\psi) - b(\theta + h\psi)}{\psi}, \tag{8.42} \]
which is again a cgf of an exponential dispersion family member with parameter $\theta_h = \theta + h\psi$ and the same ψ.
It can be shown that the Esscher transform with parameter h ∈ ℝ transforms
1. N(0, 1) into N(h, 1);
2. Poisson(1) into Poisson($e^h$);
3. binomial(m, 1/2) into binomial(m, $e^h (1 + e^h)^{-1}$);
4. negative binomial(r, 1/2) into negative binomial(r, $1 - e^h/2$) when −∞ < h < log 2;
5. gamma(1, 1) into gamma(1, 1 − h) when −∞ < h < 1;
6. inverse Gaussian(1, 1) into inverse Gaussian($\sqrt{1 - 2h}$, 1 − 2h) when −∞ < h < 1/2.
So we see that all the examples of distributions in the exponential dispersion family that we have given can be generated by starting with prototypical elements of each type, and next taking Esscher transforms . . . .
Prove the six statements immediately above about Esscher transforms.
Exercise 6.3: [glm3, Solution] [Jiwook's Final Exam Question 2002 - modified] The density of the Binomial distribution is given by
\[ f(y; p) = \frac{n!}{(n-y)!\,y!}\, p^{y} (1-p)^{n-y}. \]
Show that the Binomial distribution is a member of the exponential dispersion family with density
\[ f(y; \theta, \psi) = \exp\left[\frac{y\theta - b(\theta)}{\psi} + c(y; \psi)\right]. \]
• Give expressions for b(θ), c(y; ψ) and ψ.
• List the three constituent parts of a generalized linear model.
• Find the expression for the deviance of a binomial model.
6.2 Deviance and Scaled Deviance

Exercise 6.4: [glm4K, Solution] [Kaas et al. (2008), Problem 8.4.1] Verify that
\[ \frac{D}{\psi} = -2\log\frac{L(y; \hat\mu)}{L(y; y)} = \frac{2}{\psi}\sum_i \left[ y_i \log\frac{y_i}{\hat\mu_i} - (y_i - \hat\mu_i) \right] \]
is the scaled deviance for a Poisson distribution.
Exercise 6.5: [glm5K, Solution] [Kaas et al. (2008), Problem 8.4.2] Verify that
\[ \frac{D}{\psi} = -2\log\frac{L(y; \hat\mu)}{L(y; y)} = \frac{2}{\psi}\sum_i \left[ -\log\frac{y_i}{\hat\mu_i} + \frac{y_i - \hat\mu_i}{\hat\mu_i} \right] \]
is the scaled deviance for a gamma distribution.
Exercise 6.6: [glm7, Solution] Show that the deviance for an Inverse Gaussian distribution has the following form:
\[ D = \sum_{i=1}^{n} \frac{1}{\hat\mu_i^{2}}\,\frac{(y_i - \hat\mu_i)^2}{y_i}. \]
6.3 Fit a GLM and Evaluate the quality of a model

Exercise 6.7: [glm8, Solution] Question #9, ACTL3003/5106 Final Exam 2005.
Exercise 6.8: [glm9, Solution] [Institute question, April 2006] An insurance company has a
set of n risks (i = 1, 2, ..., n) for which it has recorded the number of claims per month, Yij , for
m months (j = 1, 2, ..., m).
It is assumed that the number of claims for each risk, for each month, are independent Poisson
random variables with
E (Yij ) = µij .
These random variables are modelled using a Generalized Linear Model, with
log µij = βi , for i = 1, 2, ..., n.
1. Derive the maximum likelihood estimator of βi .
2. Show that the deviance for this model is
\[ 2\sum_{i=1}^{n}\sum_{j=1}^{m}\left[ y_{ij}\log\frac{y_{ij}}{\bar y_i} - (y_{ij} - \bar y_i) \right] \]
where $\bar y_i = \frac{1}{m}\sum_{j=1}^{m} y_{ij}$.
3. A company has data for each month over a 2 year period. For one risk, the average
number of claims per month was 17.45. In the most recent month for this risk, there were
9 claims. Calculate the contribution that this observation makes to the deviance.
Exercise 6.9: [glm10, Solution] [Institute question, Sep 2003] There are m male drivers in
each of three age groups, and data on the number of claims made during the last year are
available. Assume that the numbers of claims are independent Poisson random variables. If
Yij is the number of claims for the jth male driver in group i (i = 1, 2, 3; j = 1, 2, ..., m), let
E(Yij ) = µij and suppose log (µij ) = αi .
1. Show that this is a Generalized Linear Model, identifying the link function and the linear
predictor.
2. Determine the log-likelihood, and the maximum likelihood estimators of αi for i = 1, 2, 3.
3. For a particular data set with 20 observations in each group, several models are fitted, with deviances as shown below:
         Link function                                 Deviance
Model 1  log(µij) = αi                                 60.40
Model 2  log(µij) = α if i = 1, 2, and β if i = 3      61.64
Model 3  log(µij) = α                                  72.53
i. Determine whether or not model 2 is a significant improvement over model 3, and whether or not model 1 is a significant improvement over model 2.
ii. Interpret these three models.
Exercise 6.10: [glm11, Solution] An insurance company tested for claim sizes under two factors, i.e. CAR, the insurance group into which the car was placed, and AGE, the age of the policyholder (i.e. a two-way contingency table). It was assumed that the claim size yi follows a gamma distribution, i.e.
\[ f(y_i) = \frac{1}{\Gamma(\nu_i)\,y_i}\left(\frac{y_i \nu_i}{\mu_i}\right)^{\nu_i} \exp\left(-\frac{y_i \nu_i}{\mu_i}\right) \quad \text{for } y_i \ge 0,\ \mu_i > 0,\ \nu_i = 1, \]
with a log-link function. Analysis of a set of data for which n = 8 provided the following SAS output:
Observation  Claim size  CAR type  Age group  Pred       Xbeta  Resdev
1            27          1         1          25.53      3.24    0.30
2            16          1         2          24.78      3.21   −1.90
3            36          1         1          (missing)  3.41    1.03
4            45          1         2          38.09      3.64    1.11
5            38          2         1          40.85      3.71   −0.46
6            27          2         2          36.97      3.61   −1.73
7            14          2         1          (missing)  2.45    0.69
8            6           2         2          14.59      2.68   −2.55
Calculate the fitted claim sizes missing in the table.
Exercise 6.11: [glm12R, Solution][R∗] In this question, the vehicle insurance data set car.csv (de Jong and Heller, 2008) is used. This data set is based on one-year vehicle insurance policies taken out in 2004 or 2005. There are 67856 policies of which 4624 had at least one claim.
The data frame car.csv contains claim occurrence clm, which takes value 1 if there is a claim and 0 otherwise. The variable veh_value represents the vehicle value, which takes values from $0 to $350,000. We will not be concerned about other variables at the moment.
In this question, we will build a logistic regression model to apply to the vehicle insurance data set. Previous study has shown that the relationship between the likelihood of occurrence of a claim and vehicle value is possibly quadratic or cubic.
1. Suppose the relationship between vehicle value and the probability of a claim is cubic. Formulate the model and test the significance of the coefficients.
2. Use AIC to determine which model is best: linear, quadratic or cubic.
Exercise 6.12: [glm13R, Solution][R∗] Third party insurance (de Jong and Heller, 2008) is a compulsory insurance for vehicle owners in Australia. It insures vehicle owners against injury caused to other drivers, passengers or pedestrians, as a result of an accident.
In this question, the third party claims data set Third party claims.xls is used. This data set records the number of third party claims in a twelve-month period between 1984-1986 in each of 176 geographical areas (local government areas) in New South Wales, Australia.
1. Now consider a model for the number of claims (claims) in an area as a function of the number of accidents (accidents). Produce a scatter plot of claims against accidents. Do you think a simple linear regression model is appropriate?
2. Fit a simple linear regression to the model and use the plot command to produce residual and diagnostic plots for the fitted model. What do the plots tell you?
3. Now fit a Poisson regression model with claims as response and log(accidents) as the predictor (include offset=log(population) in your code). Check if there is overdispersion in the model by computing the estimate of ψ.
4. Now fit the regression model by specifying family=quasipoisson. Comment on the estimates of the parameters and their standard errors.
6.4 Solutions
Solution 6.1: [glm1, Exercise] See the lecture notes (or Table 3.1 in de Jong and Heller, 2008).
The density and the mean are given in the table and the variance can be derived easily from
the table with:
σ 2 = ψ · V (µ) .
Try to map some of the densities into the exponential family formulation.
Solution 6.2: [glm2K, Exercise] (8.6.8) Although this has not been discussed in lectures, it should not be a difficult exercise to show that "exponential dispersion" is preserved under the Esscher transformation. The proof is straightforward using the cgf argument in (8.42), although the mgf can also be used. Now, to prove the statements in Remark 8.6.10, start from
\[ m_h(t) = m_Y(t+h)/m_Y(h), \]
or equivalently, in terms of the cgf,
\[ \kappa_h(t) = \log m_h(t) = \kappa_Y(t+h) - \kappa_Y(h). \]
For example, in the Poisson case from Table A, we have
\[ \kappa_h(t) = \left(e^{t+h} - 1\right) - \left(e^{h} - 1\right) = e^{h}\left(e^{t} - 1\right), \]
which is clearly the cgf of a Poisson($e^h$). As yet another example, in the Gamma(1, 1) case from Table A, we have
\[ \kappa_h(t) = \log\left(\frac{1}{1-t-h}\right) - \log\left(\frac{1}{1-h}\right) = \log\left(\frac{1-h}{1-h-t}\right), \]
which is clearly the cgf of a Gamma(1, 1 − h). For inverse Gaussian(1, 1), we have
\[ \kappa_h(t) = \left(1 - \sqrt{1 - 2(t+h)}\right) - \left(1 - \sqrt{1 - 2h}\right) \]
\[ = \sqrt{1 - 2h} - \sqrt{1 - 2(t+h)} \]
\[ = \sqrt{1 - 2h}\left(1 - \sqrt{\frac{1 - 2t - 2h}{1 - 2h}}\right) \]
\[ = \sqrt{1 - 2h}\left(1 - \sqrt{1 - \frac{2t}{1 - 2h}}\right), \]
which is the cgf of an inverse Gaussian($\sqrt{1 - 2h}$, 1 − 2h), as stated in the exercise.
Solution 6.3: [glm3, Exercise] You ought to be able to verify that the Binomial belongs to the family of Exponential Dispersion with
\[ b(\theta) = n\log\left(1 + e^{\theta}\right), \quad c(y; \psi) = \log\left(\frac{n!}{(n-y)!\,y!}\right), \quad \text{and} \quad \psi = 1. \]
You should also be able to show that
\[ \theta = \log\left(\frac{p}{1-p}\right) = \log\left(\frac{\mu}{n-\mu}\right). \]
The three components of a generalized linear model are: (1) Stochastic Component: The observations Yi are independent and each follows an Exponential Dispersion distribution. (2) Systematic Component: Every observation has a linear predictor $\eta_i = \sum_j x_{ij}\beta_j$ where xij denotes the jth explanatory variable, and (3) Link function: The expected value E(Yi) = µi is linked to the linear predictor ηi by the link function ηi = g(µi). Now to find the deviance of the binomial (the deviance is also the scaled deviance since ψ = 1), we have
\[ \frac{D}{\psi} = -2\log\left[\frac{\prod_i \frac{n!}{(n-y_i)!\,y_i!}\left(\frac{\hat\mu_i}{n}\right)^{y_i}\left(1 - \frac{\hat\mu_i}{n}\right)^{n-y_i}}{\prod_i \frac{n!}{(n-y_i)!\,y_i!}\left(\frac{y_i}{n}\right)^{y_i}\left(1 - \frac{y_i}{n}\right)^{n-y_i}}\right] = -2\log\prod_i \left(\frac{\hat\mu_i}{y_i}\right)^{y_i}\left(\frac{n - \hat\mu_i}{n - y_i}\right)^{n-y_i} \]
\[ = -2\sum_i \left[ y_i\log\left(\frac{\hat\mu_i}{y_i}\right) + (n - y_i)\log\left(\frac{n - \hat\mu_i}{n - y_i}\right) \right] \]
\[ = 2\sum_{i=1}^{n} \left[ y_i\log\left(\frac{y_i}{\hat\mu_i}\right) + (n - y_i)\log\left(\frac{n - y_i}{n - \hat\mu_i}\right) \right]. \]
Solution 6.4: [glm4K, Exercise] (8.4.1) We know that if D denotes the deviance, the scaled deviance is
\[ \frac{D}{\psi} = -2\log\left(\hat L/\tilde L\right) \]
by definition, where $\hat L$ is the likelihood computed using the MLE's $\hat\mu$ under the current model replacing the µ, while $\tilde L$ is the likelihood computed with the µ replaced by the estimates under the "full model", hence the actual observations y, in view of the remarks just below (8.22). To show that (8.23) results from this is basic algebra. To see this, note that
\[ \frac{D}{\psi} = -2\log\left[\frac{\prod_{i=1}^{n} e^{-\hat\mu_i}\hat\mu_i^{y_i}/y_i!}{\prod_{i=1}^{n} e^{-y_i} y_i^{y_i}/y_i!}\right] = -2\log\prod_{i=1}^{n} e^{-(\hat\mu_i - y_i)}\left(\frac{\hat\mu_i}{y_i}\right)^{y_i} \]
\[ = -2\sum_{i=1}^{n}\left[ -(\hat\mu_i - y_i) + y_i\log\left(\frac{\hat\mu_i}{y_i}\right) \right] \]
\[ = 2\sum_{i=1}^{n}\left[ (\hat\mu_i - y_i) - y_i\log\left(\frac{\hat\mu_i}{y_i}\right) \right]. \]
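Solution 6.5: [glm5K, Exercise] (8.4.2) To show that (8.26) results, following the discussion in the previous problem, we can verify that, for exponential dispersion models, the scaled deviance can be expressed as
\[ \frac{D}{\psi} = -2\log\left(\hat L/\tilde L\right) = 2\sum_{i=1}^{n}\left[ \frac{y_i\left(\tilde\theta_i - \hat\theta_i\right)}{\psi} - \frac{b(\tilde\theta_i) - b(\hat\theta_i)}{\psi} \right]. \]
For Gamma, we have θ(µ) = −1/µ and b(θ) = −log(−θ) = log µ, so that
\[ \frac{D}{\psi} = 2\sum_{i=1}^{n}\left[ \frac{y_i\left(1/\hat\mu_i - 1/y_i\right)}{\psi} - \frac{\log y_i - \log\hat\mu_i}{\psi} \right] \]
\[ = \frac{2}{\psi}\sum_{i=1}^{n}\left[ \frac{y_i - \hat\mu_i}{\hat\mu_i} - \log\left(y_i/\hat\mu_i\right) \right]. \]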
Now, if the scale parameter were different for each observation according to some weight wi ,
then it is easy to verify.
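Solution 6.6: [glm7, Exercise] Recall that the scaled deviance for any member of the Exponential Dispersion family has the form
\[ \frac{D}{\psi} = 2\left[\ell(\tilde\theta; y) - \ell(\hat\theta; y)\right] = \frac{2}{\psi}\sum_{i=1}^{n}\left\{\left[y_i\tilde\theta_i - b(\tilde\theta_i)\right] - \left[y_i\hat\theta_i - b(\hat\theta_i)\right]\right\} \]
where for the Inverse Gaussian, we have verified (in lecture) that
\[ \psi = \beta/\alpha^2, \quad \theta = -\frac{1}{2}\left(\frac{\beta}{\alpha}\right)^2 = -\frac{1}{2\mu^2}, \quad \text{and} \quad b(\theta) = -\sqrt{-2\theta} = -1/\mu. \]
Thus, the deviance can be expressed as
\[ D = 2\sum_{i=1}^{n}\left[ y_i\left(-\frac{1}{2y_i^2}\right) + \frac{1}{y_i} - y_i\left(-\frac{1}{2\hat\mu_i^2}\right) - \frac{1}{\hat\mu_i} \right] \]
\[ = 2\sum_{i=1}^{n}\frac{1}{2y_i}\left[ 1 + \frac{y_i^2}{\hat\mu_i^2} - \frac{2y_i}{\hat\mu_i} \right] = \sum_{i=1}^{n}\frac{1}{y_i}\left(1 - \frac{y_i}{\hat\mu_i}\right)^2 \]
\[ = \sum_{i=1}^{n}\frac{1}{y_i}\left(\frac{\hat\mu_i - y_i}{\hat\mu_i}\right)^2 = \sum_{i=1}^{n}\frac{1}{\hat\mu_i^2\, y_i}\left(\hat\mu_i - y_i\right)^2. \]
This gives the desired result.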
Solution 6.7: [glm8, Exercise] See Final Exams solution, Year 2005.
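Solution 6.8: [glm9, Exercise]
1. The likelihood is $\prod_{i,j} \frac{\mu_{ij}^{y_{ij}} e^{-\mu_{ij}}}{y_{ij}!}$ and the log-likelihood is therefore
\[ \ell(\beta) = \sum_{i=1}^{n}\sum_{j=1}^{m}\left(y_{ij}\log\mu_{ij} - \mu_{ij} - \log y_{ij}!\right) \]
\[ = \sum_{i=1}^{n}\sum_{j=1}^{m}\left(y_{ij}\beta_i - e^{\beta_i} - \log y_{ij}!\right) \]
\[ = \sum_{i=1}^{n}\beta_i\sum_{j=1}^{m} y_{ij} - m\sum_{i=1}^{n} e^{\beta_i} - \sum_{i=1}^{n}\sum_{j=1}^{m}\log(y_{ij}!). \]
Applying the first order condition:
\[ \frac{\partial\ell(\beta)}{\partial\beta_i} = \sum_{j=1}^{m} y_{ij} - m e^{\beta_i} = 0 \]
so that
\[ e^{\beta_i} = \frac{1}{m}\sum_{j=1}^{m} y_{ij} = \bar y_i, \]
and the MLE is
\[ \hat\beta_i = \log\bar y_i. \]
2. The deviance is
\[ 2\left[\ell(y; y) - \ell(y; \hat\mu)\right] = 2\left[\sum_{i=1}^{n}\sum_{j=1}^{m}\left(y_{ij}\log y_{ij} - y_{ij} - \log y_{ij}!\right) - \sum_{i=1}^{n}\sum_{j=1}^{m}\left(y_{ij}\log\bar y_i - \bar y_i - \log y_{ij}!\right)\right] \]
\[ = 2\sum_{i=1}^{n}\sum_{j=1}^{m}\left[ y_{ij}\log\frac{y_{ij}}{\bar y_i} - (y_{ij} - \bar y_i) \right]. \]
3. The contribution to the deviance in this case is
\[ D_{ij} = 2\left[ y_{ij}\log\frac{y_{ij}}{\bar y_i} - (y_{ij} - \bar y_i) \right] = 2\left[ 9\log\frac{9}{17.45} - (9 - 17.45) \right] = 4.98. \]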
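The numerical value in part 3 can be reproduced directly from the formula for the Poisson deviance contribution. The sketch below is illustrative only; the function name dev_contrib is an assumption, and the formula as written requires y > 0 (for y = 0 the term y log(y/µ) is taken as zero, which this simple version does not handle).

```r
# contribution of one observation y with fitted mean mu to the Poisson deviance
dev_contrib <- function(y, mu) 2 * (y * log(y/mu) - (y - mu))

# most recent month: 9 claims against a fitted monthly mean of 17.45
d <- dev_contrib(9, 17.45)
print(round(d, 2))
```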
Solution 6.9: [glm10, Exercise]
1. If Y has a Poisson distribution with mean parameter µ, then its density can be written as
\[ f(y; \mu) = e^{-\mu}\mu^{y}/y! = \exp\left[\frac{y\log\mu - \mu}{1} - \log y!\right] \]
which is of the exponential dispersion family form. The link function is the log so that g(µ) = log µ and the linear predictor is
\[ \eta = \log\mu = \alpha_i. \]
So this is a Generalized Linear Model.
2. The likelihood is given by
\[ \prod_{i=1}^{3}\prod_{j=1}^{m}\frac{\mu_{ij}^{y_{ij}} e^{-\mu_{ij}}}{y_{ij}!} \]
so that the log-likelihood is
\[ \sum_{i=1}^{3}\sum_{j=1}^{m}\left(y_{ij}\log\mu_{ij} - \mu_{ij} - \log y_{ij}!\right). \]
In terms of αi, we re-write this as
\[ \ell(\alpha_1, \alpha_2, \alpha_3) = -\sum_{i=1}^{3} m e^{\alpha_i} + \sum_{i=1}^{3} y_{i+}\alpha_i + \text{constant} \]
where $y_{i+}$ refers to the sum of the observations in the ith group. Differentiating, we get
\[ \frac{\partial\ell}{\partial\alpha_i} = -m e^{\alpha_i} + y_{i+} = 0 \]
so that the maximum likelihood estimator of αi is
\[ \hat\alpha_i = \log\left(y_{i+}/m\right). \]
3. In comparing the models, notice the nesting: Model 3 is the smallest and is contained in Model 2, which is contained in Model 1. We may use our Rule of Thumb of significant improvement if the decrease in deviance is larger than twice the number of additional parameters. Here we summarize in table form:
Model    Deviance  First difference  Additional d.f.  D1 − D2 > 2(p − q)?  Significant improvement?
Model 3  72.53     -                 -                -                    -
Model 2  61.64     10.89             1                Yes                  Yes
Model 1  60.40     1.24              1                No                   No
So Model 2 is a significant improvement from Model 3, but Model 1 is not a significant improvement from Model 2.
Now, regarding interpretation of the models: Model 3 says that there is no difference in the average number of claims for the three age groups. Model 2 says that there is no difference in the average number of claims between age groups 1 and 2, but that the third age group may be different. Model 1 gives the possibility of different average number of claims for each age group.
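The comparison in part 3 can be laid out in R as follows. This is a sketch only: the deviances and parameter counts are those of the three models above, while the chi-square comparison at the 5% level is a more formal alternative added here for illustration and is not the method used in the solution (it relies on the usual asymptotic result for nested models with ψ = 1). All variable names are illustrative.

```r
dev  <- c(model3 = 72.53, model2 = 61.64, model1 = 60.40)  # deviances
npar <- c(model3 = 1, model2 = 2, model1 = 3)              # number of parameters

dev_drop <- -diff(dev)    # decrease in deviance: model 3 -> 2, then model 2 -> 1
extra    <- diff(npar)    # additional parameters at each step

rule_of_thumb <- dev_drop > 2 * extra                  # rule used in the solution
chisq_check   <- dev_drop > qchisq(0.95, df = extra)   # assumption: 5% chi-square test

print(data.frame(dev_drop, extra, rule_of_thumb, chisq_check))
```

Both criteria agree here: the step from Model 3 to Model 2 is retained, and the step from Model 2 to Model 1 is not.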
Solution 6.10: [glm11, Exercise] We know that the linear predictor, for the ith observation, is
\[ \eta_i = \log\mu_i = \sum_j x_{ij}\beta_j = x_i^{T}\beta \quad \text{(in vector form)}. \]
Thus,
\[ E(y_i) = \mu_i = e^{x_i^{T}\beta}, \]
and therefore, the predicted values are
\[ E(y_3) = e^{3.41} = 30.27 \]
and
\[ E(y_7) = e^{2.45} = 11.59. \]
Solution 6.11: [glm12R, Exercise]
1. Suppose the cubic model
\[ \ln\left(\frac{\pi}{1-\pi}\right) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 \]
where x is the vehicle value and π is the probability of a claim on the policy.
> car<-read.csv(".../car.csv")
> attach(car)
> names(car)
[1] "veh_value" "exposure" "clm" "numclaims" "claimcst0" "veh_body"
[7] "veh_age" "gender" "area" "agecat" "X_OBSTAT_"
> car.glm<-glm(clm~veh_value+I(veh_value^2)+I(veh_value^3),family=binomial,data=car)
> summary(car.glm)
Call:
glm(formula = clm ~ veh_value + I(veh_value^2) + I(veh_value^3),
family = binomial, data = car)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.4093 -0.3885 -0.3729 -0.3561 2.9462
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.9247606 0.0476282 -61.408 < 2e-16 ***
veh_value 0.2605947 0.0420331 6.200 5.66e-10 ***
I(veh_value^2) -0.0382409 0.0084167 -4.543 5.53e-06 ***
I(veh_value^3) 0.0008803 0.0002752 3.199 0.00138 **
---
Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 33767 on 67855 degrees of freedom

Residual deviance: 33711 on 67852 degrees of freedom
AIC: 33719
Number of Fisher Scoring iterations: 6
The fit shows that all the coefficients are significant, as the p-values are all smaller than 0.01.
2. > car.qua<-glm(clm~veh_value+I(veh_value^2),family=binomial,data=car)
> car.lin<-glm(clm~veh_value,family=binomial,data=car)
> car.lin$aic
[1] 33749.12
> car.qua$aic
[1] 33718.92
> car.glm$aic # cubic model fitted in part 1
[1] 33718.72
The difference between the AIC of the cubic and quadratic models is less than one: including a cubic explanatory variable lowers the AIC by only 0.2. Therefore, by the principle of parsimony, the quadratic model is preferred. Further, the AIC of the quadratic model is much lower than that of the linear model, suggesting that the linear model is inadequate.
Solution 6.12: [glm13R, Exercise]

1. plot(accidents,claims,xlab="Accidents",ylab="Claims")
We can clearly see that there is a concentration of points around the origin, making it difficult to discern the relationship between the predictor and response. The data are also strongly heteroskedastic, i.e. more variable for higher values of the predictor. This is a violation of the homoskedasticity assumption of the linear model.
2. > third.lm<-lm(claims~accidents,offset=log(population))
> plot(third.lm)
The residuals vs fitted plot shows that the residuals clearly do not follow a normal distribution and that their variance seems to inflate as the fitted value increases. Diagnostic checks indicate a clear violation of the homoskedasticity assumption.
3. > third.poi<- glm(claims ~ log(accidents), family=poisson,offset=log(population))
> summary(third.poi)
Call:
glm(formula = claims ~ log(accidents), family = poisson, offset = log(population))
Deviance Residuals:
-38.957 -3.551 0.116 3.842 45.965
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.093809 0.026992 -262.81 <2e-16 ***
log(accidents) 0.259103 0.003376 76.75 <2e-16 ***
---
Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
(Dispersion parameter for poisson family taken to be 1)

AIC: 17066
> sum(resid(third.poi,type="pearson")^2)/third.poi$df.residual
[1] 101.7168
The estimate of ψ takes a value of 101.7168. The inflated dispersion parameter suggests
there is overdispersion in the data.
4. > third.qpoi<- glm(claims ~ log(accidents), family=quasipoisson,offset=log(population))
> summary(third.qpoi)
Call:
glm(formula = claims ~ log(accidents), family = quasipoisson,
offset = log(population))
Deviance Residuals:
-38.957 -3.551 0.116 3.842 45.965
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -7.09381 0.27223 -26.058 < 2e-16 ***
log(accidents) 0.25910 0.03405 7.609 1.66e-12 ***
---
Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
(Dispersion parameter for quasipoisson family taken to be 101.7172)

AIC: NA
Model          Dispersion parameter  β̂0 (se)             β̂1 (se)
Poisson        ψ = 1                 -7.09381 (0.02699)   0.25910 (0.003376)
Quasi-Poisson  ψ̂ = 101.7172         -7.09381 (0.27223)   0.25910 (0.03405)
The quasi-poisson estimates of β are identical to those of the Poisson model, but with
standard errors larger by a factor of ψ̂ 1/2 = 10.085.
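The scaling of the standard errors can be verified from the two summaries above. A minimal sketch, using the printed Poisson standard errors and the estimated dispersion; the variable names are illustrative.

```r
psi_hat <- 101.7172                                    # estimated dispersion parameter
se_pois <- c(intercept = 0.026992, slope = 0.003376)   # Poisson standard errors

# quasi-Poisson standard errors are the Poisson ones scaled by sqrt(psi_hat)
se_quasi <- se_pois * sqrt(psi_hat)
print(round(se_quasi, 5))
```

The results agree with the quasi-Poisson output (0.27223 and 0.03405) up to rounding.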
Module 7
Bayesian Models and Credibility Theory
Let Xjt denote the claim size of policy j during year t, for 1 ≤ j ≤ J and 1 ≤ t ≤ T. This random variable is a function of a risk profile, which cannot be observed and which is assumed to be the same for a given contract (for given j but across t) but different between policies (across j).
The unobservable risk profile is modelled as a random variable Θ, and the risk profile of policy j is a possible outcome θj of Θ (but we cannot observe it). Risk profiles across contracts are assumed to be independent. We will denote
\[ \mu(\Theta) = E\left(X_{jt}\,|\,\Theta\right) \quad \text{and} \quad \sigma^2(\Theta) = \mathrm{Var}\left(X_{jt}\,|\,\Theta\right) \]
as the expectation and variance of claim sizes, as functions of the risk profile, respectively. The moments of these quantities are key quantities and are denoted
\[ \mathrm{Var}\left[\mu(\Theta)\right] = a, \quad E\left[\mu(\Theta)\right] = m \quad \text{and} \quad E\left[\sigma^2(\Theta)\right] = s^2. \]
Note that when we need to consider µ(Θ) or σ²(Θ) for a particular policy j, we will write $\mu_j(\Theta)$ or $\sigma_j^2(\Theta)$. Finally, assume that for the same policy (given a certain risk profile Θ), claim sizes are independent over time (across t). Claim sizes across policies (across j) are always independent.
For nonparametric estimates, we will denote
\[ \bar X_j = \frac{1}{T}\sum_{t=1}^{T} X_{jt} \quad \text{and} \quad \bar X = \frac{1}{J}\sum_{j=1}^{J} \bar X_j \]
as the average claim size of policy j and the overall average claim size (across j as well), respectively. Finally, $X_{j,T+1}$ is the claim size whose expectation we want to estimate for policy j for period T + 1.
7.1 Preliminaries
Exercise 7.1: [cre1, Solution] Prove that
Cov (X, Y + Z) = Cov (X, Y ) + Cov (X, Z)
and that
Cov (X, αY ) = αCov (X, Y ) .
Also, derive the formula
Cov(X, Y ) = E [Cov(X, Y |Z)] + Cov (E[X|Z], E[Y |Z])
for the decomposition into conditional covariances.
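Exercise 7.2: [cre.id1, Solution] Show that
\[ \mathrm{Cov}(X_{it}, X_{jk}) = \begin{cases} a & \text{if } i = j,\ t \ne k, \\ a + s^2 & \text{if } i = j,\ t = k, \\ 0 & \text{if } i \ne j. \end{cases} \]
Exercise 7.3: [cre.id2, Solution] First show that
\[ \mathrm{Cov}(X_{jt}, \bar X_j) = a + \frac{s^2}{T}, \]
and then find $\mathrm{Cov}(X_{j,T+1}, \bar X_j)$.
\[ \mathrm{Cov}(\bar X_i, \bar X_j) = \begin{cases} a + \frac{s^2}{T} & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases} \]
\[ \mathrm{Cov}(\bar X_j, \bar X) = \frac{1}{J}\left(a + \frac{s^2}{T}\right). \]
\[ \mathrm{Var}(\bar X) = \frac{1}{J}\left(a + \frac{s^2}{T}\right). \]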
7.2 Exact Bayesian models

Exercise 7.7: [cre3K, Solution] [Kaas et al. (2008), Problem 7.5.2] Suppose that Λ has a
gamma(α, τ ) prior distribution, and that given Λ = λ, the annual numbers of claims X1 , . . . , XT
are independent Poisson(λ) random variables. Prove that the posterior distribution of Λ, given
X1 = x1 , . . . , XT = xT , is gamma(α + xΣ , τ + T ), where xΣ = x1 + . . . + xT .
Exercise 7.8: [cre4, Solution] [April 2006 Institute of Actuaries CT6 Question] An insurer
has for 2 years insured a number of domestic animals against veterinary costs. In year 1, there
were n1 policies and in year 2, there were n2 policies. The number of claims per policy per year
follows a Poisson distribution with unknown (mean) parameter θ.
Individual claim amounts were a constant c in year 1 and a constant c(1 + r) in year 2. The
average total claim amount per policy was y1 in year 1 and y2 in year 2. Prior beliefs about
θ follow a Gamma distribution with mean α/λ and variance α/λ2 . In year 3, there are n3
policies, and individual claim amounts are c(1 + r)2 . Let Y3 be the random variable denoting
the average total claim amounts per policy in year 3.
1. State the distribution of the number of claims on the whole portfolio over the 2 year
period.
2. Derive the posterior distribution of θ, given y1 and y2 .
3. Show that the posterior expectation of Y3 given y1 and y2 can be written in the form of a credibility estimate
\[ Z \cdot k + (1 - Z)\cdot\frac{\alpha}{\lambda}\cdot c\,(1+r)^2 \]
specifying expressions for k and Z.
4. Describe k in words and comment on the impact the values of n1 , n2 have on Z.
Exercise 7.9: [cre5, Solution] You are given that an individual automobile insured has an
annual claim frequency distribution that follows a Poisson distribution with mean λ, where
because of parameter uncertainty, λ actually follows a Gamma distribution with parameters α
and β. A total of one claim is observed for the insured over a five-year period.
• One actuary assumes that α = 2 and β = 5, and a second actuary assumes the same
mean for the Gamma distribution, but only half the variance.
• Both actuaries determine the Bayesian premium for the expected number of claims in the
next year using their model assumptions.
• Determine the ratio of the Bayesian premium that the first actuary calculates to the
Bayesian premium that the second actuary calculates.
7.3 Linear credibility estimation

Exercise 7.10: [cre7, Solution] Assume that a company only has one policy j and the claim sizes Xj1, Xj2, ..., XjT are identically distributed, and conditionally on Θ, Xj1, Xj2, ..., XjT are independent and identically distributed with
\[ E\left[X_{jt}\,|\,\Theta\right] = \mu(\Theta) \quad \text{and} \quad \mathrm{Var}\left(X_{jt}\,|\,\Theta\right) = s^2(\Theta). \]
We are interested in a linear estimator for $X_{j,T+1}$ of the form
\[ \hat P = g_0 + g\bar X_j. \]
Find values for g0 and g such that P̂ is unbiased and such that it minimises the quadratic error with respect to $X_{j,T+1}$. You may assume that minimising $E\left[\left(X_{j,T+1} - \hat P\right)^2\right]$ is equivalent to minimising $E\left[\left(\mu(\Theta) - \hat P\right)^2\right]$.
Show that P̂ can be expressed in the form of a credibility formula.
Exercise 7.11: [cre8, Solution] In the Bühlmann model, find the variance of the credibility premium
\[ \hat P = z\bar X_j + (1 - z)\bar X \]
as well as its MSE (remember we want to estimate $X_{j,T+1}$).
Exercise 7.12: [cre9, Solution] In the Bühlmann model, recall that MSB, the mean square between risks, is
\[ MSB = \frac{1}{J-1}\sum_{j=1}^{J}\left(\bar X_j - \bar X\right)^2 \]
and MSW, the mean square within risk, is
\[ MSW = \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T}\left(X_{jt} - \bar X_j\right)^2. \]
Show that
\[ E[MSB] = a + \frac{s^2}{T} \quad \text{and that} \quad E[MSW] = s^2. \]
Exercise 7.13: [cre10, Solution] Show that the credibility factor
\[ z = \frac{\mathrm{Cov}\left(X_{T+1}, \bar X\right)}{\mathrm{Var}\left(\bar X\right)} \]
can be re-expressed as
\[ z = \frac{T}{T+k}. \]
Give the expression for the constant k and explain how it will affect the credibility coefficient z.
[This is a past exam question]
Exercise 7.14: [cre11K, Solution] [Kaas et al. (2008), Problem 7.4.1] Let X1, . . . , XT be independent random variables with variances $\mathrm{Var}(X_t) = s^2/w_t$ for certain positive numbers $w_t$, t = 1, . . . , T. Show that the variance $\sum_t \alpha_t^2 s^2/w_t$ of the linear combination $\sum_t \alpha_t X_t$ with $\alpha_\Sigma = \sum_t \alpha_t = 1$ is minimal when we take $\alpha_t \propto w_t$, where the symbol ∝ means 'proportional to'. Hence the optimal solution has $\alpha_t = w_t/w_\Sigma$ where $w_\Sigma = \sum_t w_t$. Prove also that the minimal value of the variance in this case is $s^2/w_\Sigma$.
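Exercise 7.15: [cre12K, Solution] [Kaas et al. (2008), Problem 7.4.2] Prove that in the Bühlmann-Straub model, we have $\mathrm{Var}(\bar X_{zw}) \le \mathrm{Var}(\bar X_{ww})$. (Here,
\[ \bar X_{zw} = \sum_{j=1}^{J}\frac{z_j}{z_\Sigma}\bar X_{jw}, \]
where
\[ z_\Sigma = \sum_{j=1}^{J} z_j, \quad \bar X_{jw} = \sum_{t=1}^{T}\frac{w_{jt}}{w_{j\Sigma}} X_{jt}, \quad \bar X_{ww} = \sum_{j=1}^{J}\frac{w_{j\Sigma}}{w_{\Sigma\Sigma}}\bar X_{jw}, \]
and $z_j$ is the credibility factor.)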
Exercise 7.16: [cre13K, Solution] [Kaas et al. (2008), Problem 7.4.10] Estimate the credibility
premiums in the Bühlmann-Straub setting when the claims experience for three years is given
for three contracts, each with weight wjt ≡ 1. The claims on the contracts are as follows:
       t=1  t=2  t=3
j=1    10   12   14
j=2    13   17   15
j=3    14   10    6
                     1        2        3        4        5
risk class 1  v1,t   729      786      872      951      1019
              S1,t   583      1100     262      837      1630
              X1,t   80.0%    139.9%   30.0%    88.0%    160.0%
risk class 2  v2,t   1631     1802     2090     2300     2368
              S2,t   99       1298     326      463      895
              X2,t   6.1%     72.0%    15.6%    20.1%    37.8%
risk class 3  v3,t   796      827      874      917      944
              S3,t   1433     496      699      1742     1038
              X3,t   180.0%   60.0%    80.0%    190.0%   110.0%
risk class 4  v4,t   3152     3454     3715     3859     4198
              S4,t   1765     4145     3121     4129     3358
              X4,t   56.0%    120.0%   84.0%    107.0%   80.0%
risk class 5  v5,t   400      420      422      424      440
              S5,t   40       0        169      1018     44
              X5,t   10.0%    0.0%     40.0%    240.1%   10.0%
Table 7.1: Observed claim Si,t and corresponding numbers of policies vi,t .
(a) Choose the data of Table 7.1 and calculate the inhomogeneous credibility estimator $\widehat{\widehat{\mu(\Theta_i)}}$
for the claims ratios under the assumption that the collective mean is given by $\mu_0 = 90\%$
and the variance between risk classes is given by $\tau^2 = 0.20$.
(b) What changes if the variance between risk classes is given by $\tau^2 = 0.05$?
Exercise 7.18: [NLI24, Solution][Wuthrich (2014), Exercise 24] Estimate the prediction uncertainty
$E\big[(X_{i,T+1} - \widehat{\widehat{\mu(\Theta_i)}}^{\,hom})^2\big]$ for the data in Table 7.1 under the assumption that the volume
grows 5% in each risk class.

Exercise 7.19: The observed numbers of policies $v_i$ and claim counts $N_i$ in 21 different regions are given in Table
7.2. Calculate the inhomogeneous credibility estimators for each region i under the assumption
that $N_i|\Theta_i$ has a Poisson distribution with mean $\mu(\Theta_i)v_i = \Theta_i\lambda_0 v_i$ and $E[\Theta_i] = 1$. The prior
frequency parameter is given by $\lambda_0 = 8.8\%$ and the prior uncertainty by $\tau^2 = 2.4 \cdot 10^{-4}$.
region i vi Ni
1 50,061 3,880
2 10,135 794
3 121,310 8,941
4 35,045 3,448
5 19,720 1,672
6 39,092 5,186
7 4,192 314
8 19,635 1,934
9 21,618 2,285
10 34,332 2,689
11 11,105 661
12 56,590 4,878
13 13,551 1,205
14 19,139 1,646
15 10,242 850
16 28,137 2,229
17 33,846 3,389
18 61,573 5,937
19 17,067 1,530
20 8,263 671
21 148,872 15,014
total 763,525 69,153
Table 7.2: Observed volumes vi and claims counts Ni in regions i = 1, 2, ..., 21.
7.4 Solutions
Solution 7.1: [cre1, Exercise]
We first have
Cov (X, Y + Z) = E [X (Y + Z)] − E (X) E (Y + Z)

= E (XY ) − E (X) E (Y ) + E (XZ) − E (X) E (Z)
= Cov (X, Y ) + Cov (X, Z)
Remember E[X + Y ] = E[X] + E[Y ] holds irrespective of the dependence structure of X and
Y . Then we have
Cov (X, αY ) = E [X (αY )] − E (X) E (αY )

= αE (XY ) − αE (X) E (Y )
= αCov (X, Y ) .
Lastly, we use the fact that E (W ) = E [E (W |Z )].
E [Cov (X, Y |Z )] = E [E (XY |Z )] − E [E (X |Z ) E (Y |Z )]

= E (XY ) − E [E (X |Z ) E (Y |Z )]
= E (XY ) − E (X) E (Y ) + E (X) E (Y ) − E [E (X |Z ) E (Y |Z )]
= Cov (X, Y ) − {E [E (X |Z ) E (Y |Z )] − E [E (X |Z )] E [E (Y |Z )]}
= Cov (X, Y ) − Cov [E (X |Z ) , E (Y |Z )] .
Notice that when X = Y , then it reduces to the familiar formula for conditional variances:
E [Cov (X, X |Z )] = Cov (X, X) − Cov [E (X |Z ) , E (X |Z )]
or
V ar (X) = E [V ar (X |Z )] + V ar [E (X |Z )] .
Solution 7.2: [cre.id1, Exercise]

Using the result of Exercise 7.1, we have
Cov(Xit , Xjk ) = E [Cov(Xit , Xjk |Θ)] + Cov [E(Xit |Θ), E(Xjk |Θ)]
= E [Cov(Xit , Xjk |Θ)] + Cov [µi (Θ), µj (Θ)] .
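Now, when $i = j$ and $t \neq k$, we have
\[ \mathrm{Cov}(X_{it}, X_{jk}) = \mathrm{Cov}(X_{it}, X_{ik}) = \mathrm{Cov}[\mu_i(\Theta), \mu_i(\Theta)] = \mathrm{Var}[\mu(\Theta)] = a. \]
For the case of $i = j$ and $t = k$, we have
\begin{align*}
\mathrm{Cov}(X_{it}, X_{it}) &= \mathrm{Var}(X_{it}) \\
&= E[\mathrm{Var}(X_{it}|\Theta)] + \mathrm{Var}[E(X_{it}|\Theta)] \\
&= E[s^2(\Theta)] + \mathrm{Var}[\mu(\Theta)] \\
&= s^2 + a.
\end{align*}
For the last case, since policies are independent across multiple lines (when $i \neq j$), we have
\[ \mathrm{Cov}(X_{it}, X_{jk}) = 0. \]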

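Solution 7.3:
Firstly, we have
\begin{align*}
\mathrm{Cov}(X_{jt}, \bar{X}_j) &= E[\mathrm{Cov}(X_{jt}, \bar{X}_j|\Theta)] + \mathrm{Cov}(E[X_{jt}|\Theta], E[\bar{X}_j|\Theta]) \\
&= \frac{1}{T} E[\mathrm{Cov}(X_{jt}, X_{jt}|\Theta)] + \mathrm{Cov}(E[X_{jt}|\Theta], E[X_{jt}|\Theta]) \\
&= \frac{1}{T} E[\mathrm{Var}(X_{jt}|\Theta)] + \mathrm{Cov}(\mu_j(\Theta), \mu_j(\Theta)) \\
&= s^2/T + a,
\end{align*}
where the second line holds since
\[ \mathrm{Cov}(X_{jt}, \bar{X}_j|\Theta) = \mathrm{Cov}\Big(X_{jt}, \frac{1}{T}\sum_{i=1}^{T} X_{ji} \,\Big|\, \Theta\Big) = \frac{1}{T}\mathrm{Cov}(X_{jt}, X_{jt}|\Theta) \]
and since
\[ E[\bar{X}_j|\Theta] = E\Big[\frac{1}{T}\sum_{t=1}^{T} X_{jt} \,\Big|\, \Theta\Big] = E[X_{jt}|\Theta]. \]
Then for $\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)$, note that $X_{j,T+1}$ is not in $\bar{X}_j$, therefore
\begin{align*}
\mathrm{Cov}(X_{j,T+1}, \bar{X}_j) &= E[\mathrm{Cov}(X_{j,T+1}, \bar{X}_j|\Theta)] + \mathrm{Cov}(E[X_{j,T+1}|\Theta], E[\bar{X}_j|\Theta]) \\
&= 0 + \mathrm{Cov}(E[X_{j,T+1}|\Theta], E[X_{jt}|\Theta]) \\
&= \mathrm{Cov}(\mu_j(\Theta), \mu_j(\Theta)) = a.
\end{align*}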

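Solution 7.4:
For the case of $i = j$, we have
\begin{align*}
\mathrm{Cov}(\bar{X}_j, \bar{X}_j) &= \mathrm{Cov}\Big(\frac{1}{T}\sum_{t} X_{jt}, \bar{X}_j\Big) \\
&= \frac{1}{T}\sum_{t} \mathrm{Cov}(X_{jt}, \bar{X}_j) \\
&= \frac{1}{T}\, T\,(s^2/T + a) = s^2/T + a.
\end{align*}
For the case of $i \neq j$, since risk profiles across contracts are assumed to be independent,
$\mathrm{Cov}(\bar{X}_i, \bar{X}_j) = 0$.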

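Solution 7.5:
We have
\begin{align*}
\mathrm{Cov}(\bar{X}_j, \bar{X}) &= E[\mathrm{Cov}(\bar{X}_j, \bar{X}|\Theta)] + \mathrm{Cov}(E[\bar{X}_j|\Theta], E[\bar{X}|\Theta]) \\
&= E\Big[\mathrm{Cov}\Big(\bar{X}_j, \frac{1}{J}\sum_{i} \bar{X}_i \,\Big|\, \Theta\Big)\Big] + \mathrm{Cov}\Big(E[\bar{X}_j|\Theta], E\Big[\frac{1}{J}\sum_{i} \bar{X}_i \,\Big|\, \Theta\Big]\Big) \\
&= \frac{1}{J} E[\mathrm{Var}(\bar{X}_j|\Theta)] + \frac{1}{J}\mathrm{Cov}(E[\bar{X}_j|\Theta], E[\bar{X}_j|\Theta]) \\
&= \frac{1}{J} E\Big[\mathrm{Var}\Big(\frac{1}{T}\sum_{t=1}^{T} X_{jt} \,\Big|\, \Theta\Big)\Big] + \frac{1}{J}\mathrm{Cov}(E[X_{jt}|\Theta], E[X_{jt}|\Theta]) \\
&= \frac{1}{J}\frac{1}{T} E[\mathrm{Var}(X_{jt}|\Theta)] + \frac{1}{J}\mathrm{Var}(E[X_{jt}|\Theta]) \\
&= \frac{1}{J}\Big(\frac{s^2}{T} + a\Big).
\end{align*}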

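Solution 7.6:
We have
\begin{align*}
\mathrm{Var}(\bar{X}) &= \mathrm{Cov}(\bar{X}, \bar{X}) \\
&= \frac{1}{J^2}\sum_{i}\sum_{j} \mathrm{Cov}(\bar{X}_i, \bar{X}_j) \\
&= \frac{1}{J^2}\, J\,\Big(\frac{s^2}{T} + a\Big) \quad \text{(using the result in Exercise 7.4)} \\
&= \frac{1}{J}\Big(\frac{s^2}{T} + a\Big).
\end{align*}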
Solution 7.7: [cre3K, Exercise] We can write the posterior density as
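\begin{align*}
f_{\Lambda|X_1,\ldots,X_T}(\lambda|x_1,\ldots,x_T) &= c \cdot f_{X_1,\ldots,X_T|\Lambda}(x_1,\ldots,x_T|\lambda) \cdot f_\Lambda(\lambda) \\
&= c \cdot f_{X_1|\Lambda}(x_1|\lambda) \cdots f_{X_T|\Lambda}(x_T|\lambda) \cdot f_\Lambda(\lambda) \\
&= c \cdot \prod_{i=1}^{T} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!} \cdot \frac{1}{\Gamma(\alpha)}\tau^{\alpha}\lambda^{\alpha-1}e^{-\tau\lambda} \\
&= c^* \cdot e^{-T\lambda}\lambda^{x_\Sigma} \cdot \lambda^{\alpha-1}e^{-\tau\lambda} \\
&= c^* \cdot e^{-\lambda(T+\tau)}\lambda^{(x_\Sigma+\alpha)-1}
\end{align*}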
where c∗ is a constant that makes this a proper density. Clearly this has the form of a Gamma
density with parameters xΣ + α and T + τ .
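The conjugacy result can be checked numerically. The following is a minimal sketch only: the prior parameters and the claim counts are hypothetical values chosen for illustration, and the normalising constant is approximated with a simple trapezoid rule rather than derived analytically.

```r
# Hypothetical prior parameters and claim counts (illustration only)
alpha <- 2
tau <- 5
x <- c(0, 1, 0, 2, 1)
n_years <- length(x)

# Unnormalised posterior: Poisson likelihood times the Gamma(alpha, tau) prior density
unnorm <- function(lambda) {
  sapply(lambda, function(l) prod(dpois(x, l)) * dgamma(l, shape = alpha, rate = tau))
}

# Normalising constant via the trapezoid rule on [0, 10] (the tail beyond 10 is negligible here)
h <- 0.001
lam <- seq(0, 10, by = h)
u <- unnorm(lam)
const <- sum((u[-1] + u[-length(u)]) / 2) * h

# Compare with the closed-form Gamma(x_Sigma + alpha, T + tau) density on a grid
grid <- seq(0.05, 2, by = 0.05)
posterior_numeric <- unnorm(grid) / const
posterior_closed <- dgamma(grid, shape = sum(x) + alpha, rate = n_years + tau)
max_abs_diff <- max(abs(posterior_numeric - posterior_closed))
```

If the derivation is right, `max_abs_diff` should be very close to zero for any choice of prior parameters and data.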
Solution 7.8: [cre4, Exercise] Exact solutions drawn from the Institute paper/report.
1. The total number of claims has a Poisson distribution with parameter θ (n1 + n2 ) (given
θ).
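2. Let $Y_i$ denote the average total claim amount per policy in year i and let $X_i$ denote the
total number of claims in year i. Then $X_i$ has a Poisson distribution with parameter $\theta n_i$
and
\[ X_1 = \frac{n_1}{c} Y_1 \quad \text{and} \quad X_2 = \frac{n_2}{c(1+r)} Y_2, \]
where $n_1 Y_1$ is the total claim, and c is the amount per claim.
We have
\begin{align*}
f(y_1, y_2|\theta) &= \Pr(Y_1 = y_1|\Theta = \theta)\Pr(Y_2 = y_2|\Theta = \theta) \\
&= \Pr\Big(X_1 = \frac{n_1 y_1}{c} \,\Big|\, \Theta = \theta\Big)\Pr\Big(X_2 = \frac{n_2 y_2}{c(1+r)} \,\Big|\, \Theta = \theta\Big).
\end{align*}
Then
\begin{align*}
f(\theta|y_1, y_2) &\propto f(y_1, y_2|\theta)\,\pi(\theta) \\
&\propto \Big[e^{-\theta n_1}(\theta n_1)^{y_1 n_1/c}\Big]\Big[e^{-\theta n_2}(\theta n_2)^{y_2 n_2/[c(1+r)]}\Big] e^{-\lambda\theta}\theta^{\alpha-1} \\
&\propto e^{-\theta(n_1+n_2+\lambda)}\,\theta^{(\alpha + y_1 n_1/c + y_2 n_2/[c(1+r)])-1}
\end{align*}
which implies that the posterior distribution of $\theta$ is Gamma with parameters
$\tilde{\alpha} = \alpha + y_1 n_1/c + y_2 n_2/[c(1+r)]$ and $\tilde{\beta} = n_1 + n_2 + \lambda$.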
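3. Thus, our predicted value of $Y_3$, given the observed claims $y_1$ and $y_2$, is given by
\begin{align*}
E(Y_3|y_1, y_2) &= \int_0^\infty E(Y_3|\Theta = \theta) f(\theta|y_1, y_2)\,d\theta \\
&= \int_0^\infty E\Big(\frac{c(1+r)^2}{n_3} X_3 \,\Big|\, \Theta = \theta\Big) f(\theta|y_1, y_2)\,d\theta \\
&= c(1+r)^2 \int_0^\infty \theta f(\theta|y_1, y_2)\,d\theta = c(1+r)^2\,\frac{\tilde{\alpha}}{\tilde{\beta}} \\
&= \frac{c(1+r)^2}{n_3} \times n_3 \times \frac{\alpha + y_1 n_1/c + y_2 n_2/[c(1+r)]}{n_1 + n_2 + \lambda} \\
&= \frac{c\alpha(1+r)^2 + n_1 y_1 (1+r)^2 + n_2 y_2 (1+r)}{n_1 + n_2 + \lambda} \\
&= c(1+r)^2 \times \frac{\alpha}{\lambda} \times \frac{\lambda}{n_1 + n_2 + \lambda}
 + \Big(\frac{n_1 y_1 (1+r)^2 + n_2 y_2 (1+r)}{n_1 + n_2}\Big) \times \frac{n_1 + n_2}{n_1 + n_2 + \lambda}
\end{align*}
so that effectively we have
\[ k = \frac{n_1 y_1 (1+r)^2 + n_2 y_2 (1+r)}{n_1 + n_2} \]
and
\[ Z = \frac{n_1 + n_2}{n_1 + n_2 + \lambda}. \]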
4. k is effectively a weighted average of the inflation adjusted average claim amounts for the
previous 2 years, weighted by the number of policies in force. As the number of policies
in force increases, Z becomes closer to 1, and so the more weight is placed on the actual
experience and less on the prior expectations.
Solution 7.9: [cre5, Exercise] It was shown in class that when claim frequencies $X_1, \ldots, X_T$ are
independent Poisson($\lambda$) with $\lambda$ having a Gamma($\alpha, \beta$) prior, then the posterior distribution is
Gamma($\alpha + x_\Sigma, \beta + T$) so that the Bayesian premium is given by
\[ E(\Lambda|X_1, \ldots, X_T) = \frac{\alpha + x_\Sigma}{\beta + T}. \]
According to the first actuary, $\alpha = 2$ and $\beta = 5$, hence the mean is $\alpha/\beta = 2/5$ and the variance is
$\alpha/\beta^2 = 2/25$. The second actuary sets the parameters with equal mean, but only half the variance.
Therefore
\[ \frac{\alpha^*}{\beta^*} = \frac{2}{5} \quad \text{and} \quad \frac{\alpha^*}{(\beta^*)^2} = \frac{1}{25} \]
so that $\alpha^* = 4$ and $\beta^* = 10$. Since there is only one claim in 5 years, $x_\Sigma = 1$ and $T = 5$. The
first actuary sets the premium to
\[ \frac{\alpha + x_\Sigma}{\beta + T} = \frac{2+1}{5+5} = \frac{3}{10} \]
and the second actuary to
\[ \frac{\alpha^* + x_\Sigma}{\beta^* + T} = \frac{4+1}{10+5} = \frac{5}{15} = \frac{1}{3}. \]
The ratio is therefore
\[ \frac{3/10}{1/3} = 90\%. \]
Note that at policy inception, the required premium is calculated using
\begin{align*}
E(X) = E(E(X|\Lambda)) &= \int_0^\infty E(X|\Lambda = \lambda) f(\lambda)\,d\lambda \\
&= \int_0^\infty \lambda f(\lambda)\,d\lambda = E(\Lambda)
\end{align*}
which is the same for both actuaries.
Although both actuaries require the same premium at policy inception, the first actuary, who
assumes larger parameter uncertainty, charges the smaller premium after five years. Intuitively,
larger prior uncertainty means that more credibility is attached to the policyholder's own claims
experience, so the premium is corrected more strongly towards that experience. Here the experience
(one claim in five years) is more favorable than the prior expectation of 2/5 claims per year, so the
first actuary's correction is larger. The magnitude of the correction increases with time, assuming
of course that favorable experience continues.
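The two premiums and their ratio can be reproduced in a few lines of R. This is only a sketch of the arithmetic above; the function name `bayes_premium` is a hypothetical helper, not something defined in the course material.

```r
# Posterior mean of a Gamma(alpha, beta) prior after observing x_sum claims in n_years years
bayes_premium <- function(alpha, beta, x_sum, n_years) {
  (alpha + x_sum) / (beta + n_years)
}

p1 <- bayes_premium(alpha = 2, beta = 5,  x_sum = 1, n_years = 5)  # first actuary
p2 <- bayes_premium(alpha = 4, beta = 10, x_sum = 1, n_years = 5)  # second actuary
ratio <- p1 / p2

# Implied credibility weights T / (T + beta) on the observed frequency 1/5
z1 <- 5 / (5 + 5)
z2 <- 5 / (5 + 10)
```

The larger weight `z1` on the observed frequency of 0.2 is what pulls the first actuary's premium further below the common prior mean of 0.4.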
Solution 7.10: [cre7, Exercise]

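The MSE of the estimator $\hat{P}$ can be expressed as
\begin{align*}
MSE &= E\big[(X_{j,T+1} - \hat{P})^2\big] \\
&= E\big[(X_{j,T+1} - (g_0 + g\bar{X}_j))^2\big] \\
&= \mathrm{Var}\big(X_{j,T+1} - (g_0 + g\bar{X}_j)\big) + \big(E\big[X_{j,T+1} - (g_0 + g\bar{X}_j)\big]\big)^2.
\end{align*}
The unbiasedness condition implies the second term is zero. Now, note that
\begin{align*}
\mathrm{Var}\big(X_{j,T+1} - (g_0 + g\bar{X}_j)\big) &= \mathrm{Var}(X_{j,T+1}) - 2\,\mathrm{Cov}(X_{j,T+1}, g_0 + g\bar{X}_j) + \mathrm{Var}(g_0 + g\bar{X}_j) \\
&= \mathrm{Var}(X_{j,T+1}) - 2g\,\mathrm{Cov}(X_{j,T+1}, \bar{X}_j) + g^2\,\mathrm{Var}(\bar{X}_j).
\end{align*}
Differentiating $MSE$ with respect to $g$ and setting the derivative to 0 yields
\[ -2\,\mathrm{Cov}(X_{j,T+1}, \bar{X}_j) + 2g\,\mathrm{Var}(\bar{X}_j) = 0. \]
Thus, the first order condition implies
\[ g = \frac{\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)}{\mathrm{Var}(\bar{X}_j)} \]
and the unbiasedness condition implies
\begin{align*}
& E(\hat{P}) = E(X_{j,T+1}) \\
\Rightarrow\ & g_0 + g E(\bar{X}_j) = E(X_{j,T+1}) = E[\mu(\Theta)] \\
\Rightarrow\ & g_0 + g E[\mu(\Theta)] = E[\mu(\Theta)]
\end{align*}
or equivalently
\[ g_0 = (1 - g) E[\mu(\Theta)]. \]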
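Thus, the best linear Bayes estimator can be expressed as
\[ \hat{P} = \frac{\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)}{\mathrm{Var}(\bar{X}_j)}\,\bar{X}_j + \Big(1 - \frac{\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)}{\mathrm{Var}(\bar{X}_j)}\Big) E[\mu(\Theta)] \]
where the credibility factor is given by
\[ z = \frac{\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)}{\mathrm{Var}(\bar{X}_j)}. \]
Using Exercise 7.3 and 7.4 yields the familiar credibility formula:
\begin{align*}
z = \frac{\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)}{\mathrm{Var}(\bar{X}_j)} &= \frac{\mathrm{Var}[\mu(\Theta)]}{\frac{1}{T}E[\sigma^2(\Theta)] + \mathrm{Var}[\mu(\Theta)]} \\
&= \frac{T\,\mathrm{Var}[\mu(\Theta)]}{E[\sigma^2(\Theta)] + T\,\mathrm{Var}[\mu(\Theta)]} = \frac{T}{T + \big(E[\sigma^2(\Theta)]/\mathrm{Var}[\mu(\Theta)]\big)}.
\end{align*}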
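Solution 7.11: [cre8, Exercise] From Exercise 7.4, 7.5 and 7.6, we obtained $\mathrm{Var}(\bar{X}_j)$, $\mathrm{Cov}(\bar{X}_j, \bar{X})$ and
$\mathrm{Var}(\bar{X})$. Now, we have the variance of $\hat{P}$:
\begin{align*}
\mathrm{Var}\big(z\bar{X}_j + (1-z)\bar{X}\big)
&= z^2\,\mathrm{Var}(\bar{X}_j) + 2z(1-z)\,\mathrm{Cov}(\bar{X}_j, \bar{X}) + (1-z)^2\,\mathrm{Var}(\bar{X}) \\
&= z^2(a + s^2/T) + 2z(1-z)\frac{a + s^2/T}{J} + (1-z)^2\frac{a + s^2/T}{J} \\
&= (a + s^2/T)\Big\{z^2 + \frac{2z(1-z)}{J} + \frac{(1-z)^2}{J}\Big\}.
\end{align*}
The expression in curly braces is less than 1 as long as $J > 1$ (and $z < 1$), which shows that the variance
of the credibility premium is lower than that of $\bar{X}_j$. Furthermore, choosing $z = 0$ or $z = 1$
yields the variances of $\bar{X}$ and $\bar{X}_j$, respectively, as it should.
The MSE (as an estimator for $X_{j,T+1}$) of the credibility premium can be derived as follows:
\begin{align*}
E\big[(X_{j,T+1} - z\bar{X}_j - (1-z)\bar{X})^2\big]
&= \mathrm{Var}\big(X_{j,T+1} - z\bar{X}_j - (1-z)\bar{X}\big) \quad \text{[because of the unbiasedness of the linear estimator]} \\
&= \mathrm{Var}(X_{j,T+1}) - 2\,\mathrm{Cov}\big(X_{j,T+1}, z\bar{X}_j + (1-z)\bar{X}\big) + \mathrm{Var}\big(z\bar{X}_j + (1-z)\bar{X}\big).
\end{align*}
For the first element, using Exercise 7.2 we have
\[ \mathrm{Var}(X_{j,T+1}) = s^2 + a. \]
For the second element, using Exercise 7.3 we have
\begin{align*}
\mathrm{Cov}\big(X_{j,T+1}, z\bar{X}_j + (1-z)\bar{X}\big) &= z\,\mathrm{Cov}(X_{j,T+1}, \bar{X}_j) + (1-z)\,\mathrm{Cov}(X_{j,T+1}, \bar{X}) \\
&= z\,\mathrm{Cov}(X_{j,T+1}, \bar{X}_j) + \frac{1-z}{J}\,\mathrm{Cov}(X_{j,T+1}, \bar{X}_j) \\
&= za + (1-z)\frac{a}{J}.
\end{align*}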
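Therefore the MSE becomes
\begin{align*}
MSE &= s^2 + a - 2za - 2(1-z)\frac{a}{J} + \Big(a + \frac{s^2}{T}\Big)\Big\{z^2 + \frac{2z(1-z)}{J} + \frac{(1-z)^2}{J}\Big\} \\
&= a\Big\{1 - 2z - \frac{2(1-z)}{J} + z^2 + \frac{2z(1-z)}{J} + \frac{(1-z)^2}{J}\Big\}
 + s^2\Big\{1 + \frac{1}{T}\Big(z^2 + \frac{2z(1-z)}{J} + \frac{(1-z)^2}{J}\Big)\Big\} \\
&= a\Big\{(1-z)^2 + \frac{1}{J}\big[(1-z)^2 + 2z(1-z) - 2(1-z)\big]\Big\}
 + s^2\Big\{1 + \frac{1}{T}z^2 + \frac{1}{TJ}\big[(1-z)^2 + 2z(1-z)\big]\Big\} \\
&= a\Big(1 - \frac{1}{J}\Big)(1-z)^2 + s^2\Big\{1 + \frac{1}{T}z^2 + \frac{1}{TJ}(1 - z^2)\Big\}.
\end{align*}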
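Solution 7.12: [cre9, Exercise] We have that $MSB = \frac{1}{J-1}\sum_{j=1}^{J}(\bar{X}_j - \bar{X})^2$ is an unbiased estimator
of $\mathrm{Var}(\bar{X}_j)$, as the $\bar{X}_j$'s are iid with sample mean $\bar{X}$, where
\begin{align*}
\mathrm{Var}(\bar{X}_j) &= E(\mathrm{Var}(\bar{X}_j|\Theta)) + \mathrm{Var}(E(\bar{X}_j|\Theta)) \\
&= E\Big[\frac{1}{T}\mathrm{Var}(X_{jt}|\Theta)\Big] + \mathrm{Var}(E(X_{jt}|\Theta)) \\
&= \frac{1}{T}s^2 + a.
\end{align*}
To show this, we have
\[ E(MSB) = \frac{1}{J-1}\sum_{j=1}^{J} E\big[(\bar{X}_j - \bar{X})^2\big]. \]
Since $E[\bar{X}_j] = E[\bar{X}]$, we have $E[\bar{X}_j - \bar{X}] = 0$. Then $E(MSB)$ becomes
\begin{align*}
E(MSB) &= \frac{1}{J-1}\sum_{j=1}^{J} E\big[(\bar{X}_j - \bar{X})^2\big] \\
&= \frac{1}{J-1}\sum_{j=1}^{J} \Big[E\big[(\bar{X}_j - \bar{X})^2\big] - \big(E[\bar{X}_j - \bar{X}]\big)^2\Big] \\
&= \frac{1}{J-1}\sum_{j=1}^{J} \mathrm{Var}(\bar{X}_j - \bar{X}) \\
&= \frac{1}{J-1}\sum_{j=1}^{J} \Big[\mathrm{Var}(\bar{X}_j) - 2\,\mathrm{Cov}(\bar{X}_j, \bar{X}) + \mathrm{Var}(\bar{X})\Big] \\
&= \frac{1}{J-1}\sum_{j=1}^{J} \Big[\Big(a + \frac{s^2}{T}\Big) - 2\,\frac{1}{J}\Big(a + \frac{s^2}{T}\Big) + \frac{1}{J}\Big(a + \frac{s^2}{T}\Big)\Big] \\
&= \frac{1}{J-1}\sum_{j=1}^{J} \Big(1 - \frac{1}{J}\Big)\Big(a + \frac{s^2}{T}\Big) \\
&= \frac{1}{J-1}(J-1)\Big(a + \frac{s^2}{T}\Big) = a + \frac{s^2}{T}
\end{align*}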
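which proves the result.
We have that $\frac{1}{T-1}\sum_{t=1}^{T}(X_{jt} - \bar{X}_j)^2$ is an unbiased estimator of $\sigma^2(\Theta_j) = \mathrm{Var}(X_{jt}|\Theta)$. To show this,
consider
\[ E(MSW) = \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} E\big[(X_{jt} - \bar{X}_j)^2\big]. \]
Similarly, since $E(X_{jt} - \bar{X}_j) = 0$, we have
\begin{align*}
E(MSW) &= \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} E\big[(X_{jt} - \bar{X}_j)^2\big] \\
&= \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} \Big[E\big[(X_{jt} - \bar{X}_j)^2\big] - \big(E[X_{jt} - \bar{X}_j]\big)^2\Big] \\
&= \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} \mathrm{Var}(X_{jt} - \bar{X}_j) \\
&= \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} \Big[\mathrm{Var}(X_{jt}) + \mathrm{Var}(\bar{X}_j) - 2\,\mathrm{Cov}(X_{jt}, \bar{X}_j)\Big] \\
&= \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} \Big[\mathrm{Var}(E(X_{jt}|\Theta)) + E(\mathrm{Var}(X_{jt}|\Theta)) + \mathrm{Var}(\bar{X}_j) - 2\,\mathrm{Cov}(X_{jt}, \bar{X}_j)\Big] \\
&= \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} \Big[(a + s^2) + (a + s^2/T) - 2(a + s^2/T)\Big] \\
&= \frac{1}{J(T-1)}\sum_{j=1}^{J}\sum_{t=1}^{T} s^2(1 - 1/T) \\
&= \frac{1}{J(T-1)}\,JT\,s^2(1 - 1/T) = s^2,
\end{align*}
which proves the desired result.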
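Solution 7.13: [cre10, Exercise] The credibility factor is given by
\[ z = \frac{\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)}{\mathrm{Var}(\bar{X}_j)}. \]
To further simplify, see Exercise 7.3 and 7.4:
\[ \mathrm{Cov}(X_{j,T+1}, \bar{X}_j) = a \quad \text{and} \quad \mathrm{Var}(\bar{X}_j) = a + \frac{s^2}{T}. \]
Thus, we have the familiar credibility formula:
\[ z = \frac{\mathrm{Cov}(X_{j,T+1}, \bar{X}_j)}{\mathrm{Var}(\bar{X}_j)} = \frac{a}{a + \frac{s^2}{T}} = \frac{T}{T + \frac{s^2}{a}} \]
and k is therefore
\[ k = \frac{E[s^2(\Theta)]}{\mathrm{Var}[\mu(\Theta)]} = \frac{s^2}{a}. \]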
Recall the formula of the credibility premium,
\[ P^{cred}_{T+1} = z\bar{X}_j + (1 - z)m. \]
Firstly notice that, if we have more experience in the data (i.e. T increases), then z will increase.
This makes sense as more experience means that we will give more credibility to the individual
mean $\bar{X}_j$.
If the heterogeneity of the portfolio increases ($a \uparrow$), that is, risks are quite different amongst
each portfolio, then k decreases. A decreased value of k increases the credibility coefficient z.
So if the portfolios we have are quite different from each other, we will use
more information from the individual mean $\bar{X}_j$, i.e. give it more credibility.
In another situation, where the risk variability within the portfolio decreases ($s^2 \downarrow$),
k also decreases, which results in an increasing value of z. This again makes
sense. If each individual portfolio does not vary dramatically, then we will use more
information from the mean of the individual portfolio, $\bar{X}_j$.
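The three effects described above can be illustrated numerically. The sketch below uses a hypothetical helper `cred_factor` and made-up parameter values; it only evaluates $z = T/(T + s^2/a)$.

```r
# Credibility factor z = T / (T + k) with k = s2 / a
cred_factor <- function(n_years, a, s2) {
  n_years / (n_years + s2 / a)
}

z_base   <- cred_factor(n_years = 5,  a = 1, s2 = 5)    # baseline
z_more_T <- cred_factor(n_years = 10, a = 1, s2 = 5)    # more years of experience
z_more_a <- cred_factor(n_years = 5,  a = 2, s2 = 5)    # more heterogeneity between risks
z_less_s <- cred_factor(n_years = 5,  a = 1, s2 = 2.5)  # less variability within a risk
```

In each of the three variations the credibility factor rises from 1/2 to 2/3, in line with the discussion above.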
Solution 7.14: [cre11K, Exercise] We minimize the variance:
\[ \mathrm{Var}\Big(\sum_t \alpha_t X_t\Big) = \sum_t \alpha_t^2\,\mathrm{Var}(X_t) \]
where $\mathrm{Var}(X_t) = s^2/w_t$ and subject to the condition $\alpha_\Sigma = \sum_t \alpha_t = 1$. The Lagrangian for
this problem can be written as
\[ L = \sum_t \alpha_t^2 s^2/w_t - \lambda(\alpha_\Sigma - 1). \]
Setting the derivatives of L equal to zero gives for $t = 1, 2, \ldots, T$:
\[ \frac{\partial L}{\partial \alpha_t} = 2\alpha_t s^2/w_t - \lambda = 0, \]
which is true if and only if
\[ \frac{\alpha_t}{w_t} = \frac{\lambda}{2s^2}. \]
Since this equality must hold for every t, $\alpha_t$ must then be proportional to $w_t$. This implies
$\alpha_t = w_t/w_\Sigma$. The variance is therefore
\[ \mathrm{Var}\Big(\sum_{t=1}^{T} \alpha_t X_t\Big) = \sum_{t=1}^{T} (w_t/w_\Sigma)^2 \cdot s^2/w_t = s^2/w_\Sigma. \]
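As a quick numerical illustration of this result, the sketch below compares the variance under the optimal weights with the variance under equal weights. The weights `w` and the value of `s2` are hypothetical and chosen only to keep the arithmetic simple.

```r
# Hypothetical inputs: Var(X_t) = s2 / w[t]
s2 <- 1
w <- c(1, 2, 3)

# Variance of the linear combination sum(alpha * X) for independent X_t
lin_var <- function(alpha) sum(alpha^2 * s2 / w)

alpha_opt <- w / sum(w)            # weights proportional to w_t
alpha_eq  <- rep(1 / length(w), length(w))  # equal weights, also summing to 1

var_opt <- lin_var(alpha_opt)      # equals s2 / sum(w)
var_eq  <- lin_var(alpha_eq)
```

Any other set of weights summing to one, such as `alpha_eq`, gives a variance at least as large as `s2 / sum(w)`.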
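Solution 7.15: [cre12K, Exercise] In Exercise 7.14 we showed that if the variances are of the
form $s^2/w_t$, then the weights $\alpha_t$ of the minimum-variance linear combination should be proportional
to the inverse of the variance, that is, to $w_t$ (or equivalently, $w_t/s^2$ since $s^2$ is a constant). For the
unconditional variance of $X_{jw}$ (whose expectation is the collective premium, which we need in the
credibility premium) we have
\[ \mathrm{Var}(X_{jw}) = \mathrm{Var}(\mu(\Theta)) + E[\sigma^2(\Theta)]/w_{j\Sigma} = a + s^2/w_{j\Sigma}, \]
whose inverse is now proportional to $z_j$ and not to $w_{j\Sigma}$. The best unconditional expected value
should then be computed using the $z_j$'s, not the $w_{j\Sigma}$'s, as this will minimise the variance of the
estimator.
Note that $X_{jw} = \sum_{t=1}^{T} \frac{w_{jt}}{w_{j\Sigma}} X_{jt}$ and $X_{ww} = \sum_{j=1}^{J} \frac{w_{j\Sigma}}{w_{\Sigma\Sigma}} X_{jw}$.
Consider the sequence of independent random variables $X_{1w}, \ldots, X_{Jw}$. The variance of $X_{jw}$ is
\begin{align*}
\mathrm{Var}(X_{jw}) &= \mathrm{Var}(E[X_{jw}|\Theta]) + E[\mathrm{Var}(X_{jw}|\Theta)] \\
&= \mathrm{Var}\Big(E\Big[\sum_{t=1}^{T} \frac{w_{jt}}{w_{j\Sigma}} X_{jt} \,\Big|\, \Theta\Big]\Big) + E\Big[\mathrm{Var}\Big(\sum_{t=1}^{T} \frac{w_{jt}}{w_{j\Sigma}} X_{jt} \,\Big|\, \Theta\Big)\Big] \\
&= \mathrm{Var}\Big(\sum_{t=1}^{T} \frac{w_{jt}}{w_{j\Sigma}} E[X_{jt}|\Theta]\Big) + E\Big[\sum_{t=1}^{T} \Big(\frac{w_{jt}}{w_{j\Sigma}}\Big)^2 \mathrm{Var}(X_{jt}|\Theta)\Big] \\
&= \mathrm{Var}\Big(\sum_{t=1}^{T} \frac{w_{jt}}{w_{j\Sigma}} \mu(\Theta)\Big) + E\Big[\sum_{t=1}^{T} \Big(\frac{w_{jt}}{w_{j\Sigma}}\Big)^2 \frac{\sigma^2(\Theta)}{w_{jt}}\Big] \\
&= \mathrm{Var}(\mu(\Theta)) + \sum_{t=1}^{T} \frac{w_{jt}}{w_{j\Sigma}^2} E[\sigma^2(\Theta)] \\
&= a + s^2/w_{j\Sigma} = \frac{a w_{j\Sigma} + s^2}{w_{j\Sigma}} \\
&= \frac{a}{\dfrac{w_{j\Sigma}}{w_{j\Sigma} + s^2/a}} = \frac{a}{z_j}.
\end{align*}
Using the result in Exercise 7.14 above, we can see that when $\alpha_j = z_j/z_\Sigma$, the variance of the
linear combination $\sum_{j=1}^{J} \alpha_j X_{jw}$ is minimized.
Hence the variance of the linear combination $X_{zw} = \sum_{j=1}^{J} \frac{z_j}{z_\Sigma} X_{jw}$ is the smallest among all the
linear combinations with weights summing to one. Therefore,
\[ \mathrm{Var}(X_{zw}) \le \mathrm{Var}\Big(\sum_{j=1}^{J} \frac{w_{j\Sigma}}{w_{\Sigma\Sigma}} X_{jw}\Big) = \mathrm{Var}(X_{ww}). \]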
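Solution 7.16: [cre13K, Exercise] Note that $X_{jw} = \sum_{t=1}^{T} \frac{w_{jt}}{w_{j\Sigma}} X_{jt}$ and $X_{ww} = \sum_{j=1}^{J} \frac{w_{j\Sigma}}{w_{\Sigma\Sigma}} X_{jw}$.
From the formulas we saw in the lecture we get
\[ \tilde{s}^2 = \frac{1}{J(T-1)}\sum_{j,t} w_{jt}(X_{jt} - X_{jw})^2 = 8 \]
and
\[ \tilde{a} = \frac{\sum_j w_{j\Sigma}(X_{jw} - X_{ww})^2 - (J-1)\tilde{s}^2}{w_{\Sigma\Sigma} - \sum_j w_{j\Sigma}^2/w_{\Sigma\Sigma}} = \frac{11}{3}. \]
Thus, the Bühlmann-Straub credibility factor is given by
\[ \tilde{z} = \frac{\tilde{a}T}{\tilde{a}T + \tilde{s}^2} = \frac{11}{19}. \]
The credibility premiums are therefore:
\begin{align*}
\hat{X}_{1,T+1} &= \tilde{z} \cdot 12 + (1 - \tilde{z}) \cdot 12\tfrac{1}{3} = 12.14 \\
\hat{X}_{2,T+1} &= \tilde{z} \cdot 15 + (1 - \tilde{z}) \cdot 12\tfrac{1}{3} = 13.88 \\
\hat{X}_{3,T+1} &= \tilde{z} \cdot 10 + (1 - \tilde{z}) \cdot 12\tfrac{1}{3} = 10.98.
\end{align*}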
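These estimates can be reproduced with a short R script. This is a sketch under the assumptions of the solution above (unit weights, overall mean $X_{ww}$ used as the collective estimate); the variable names are illustrative and not taken from the lecture notes.

```r
# Claims data of Exercise 7.16: rows are contracts j, columns are years t
X <- matrix(c(10, 12, 14,
              13, 17, 15,
              14, 10,  6), nrow = 3, byrow = TRUE)
w <- matrix(1, nrow = 3, ncol = 3)   # all weights equal to 1

J <- nrow(X)
n_years <- ncol(X)
w_j <- rowSums(w)                    # w_{j Sigma}
w_tot <- sum(w)                      # w_{Sigma Sigma}

X_jw <- rowSums(w * X) / w_j         # weighted contract means
X_ww <- sum(w_j * X_jw) / w_tot      # weighted overall mean

# Estimators of s^2 and a (X - X_jw subtracts the contract mean row by row)
s2_hat <- sum(w * (X - X_jw)^2) / (J * (n_years - 1))
a_hat <- (sum(w_j * (X_jw - X_ww)^2) - (J - 1) * s2_hat) /
  (w_tot - sum(w_j^2) / w_tot)

# Credibility factors and credibility premiums
z <- a_hat * w_j / (a_hat * w_j + s2_hat)
premium <- z * X_jw + (1 - z) * X_ww
```

With equal weights all contracts share the same credibility factor, so `z` is a constant vector and weighting the contract means by `z` or by `w_j` gives the same overall mean.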
Solution 7.17: [NLI23, Exercise] From Wuthrich (2014, Theorem 8.17), the inhomogeneous
credibility estimator is given by
\[ \widehat{\widehat{\mu(\Theta_i)}} = \alpha_{i,T}\,\widehat{X}_{i,1:T} + (1 - \alpha_{i,T})\mu_0, \]
with credibility weight $\alpha_{i,T}$ and observation based estimator $\widehat{X}_{i,1:T}$. Since $\tau^2$ and $\mu_0$ are given,
we only need to obtain the estimator for $\sigma^2$, $\hat{\sigma}_T^2 = 261.2$. Then using the above formula, results
are summarised in the following table,
                                                      risk class 1  risk class 2  risk class 3  risk class 4  risk class 5
$\widehat{\alpha}_{i,T}$ ($\tau^2 = 0.2$)                   76.94%        88.64%        76.94%        93.37%        61.72%
$\widehat{\widehat{\mu(\Theta_i)}}$ ($\tau^2 = 0.2$)     0.9866506     0.3702171     1.1623246     0.8988722     0.7169975
$\widehat{\widehat{\mu(\Theta_i)}}$ ($\tau^2 = 0.05$)    0.9512165     0.5048693     1.0550632     0.8990594     0.8148146
R-code:

# rowSums() is part of base R, so no additional package needs to be loaded
# read the data first
v.matrix<-c(729,1631,796,3152,400,786,1802,827,3454,420,872,
2090,874,3715,422,951,2300,917,3859,424,1019,2368,944,4198,440)
dim(v.matrix)<-c(5,5)
S.matrix<-c(583,99,1433,1765,40,1100,1298,496,4145,0,262,326,
699,3121,169,837,463,1742,4129,1018,1630,895,1038,3358,44)
dim(S.matrix)<-c(5,5)
X.matrix<-S.matrix/v.matrix
w.matrix<-v.matrix
tau2<-0.2
mu0<-0.9
# calculating estimator for sigma^2

Xhat<-rep(rowSums(w.matrix*X.matrix)/rowSums(w.matrix),5)
dim(Xhat)<-c(5,5)
s2hat<-rowSums(w.matrix*((X.matrix-Xhat)^2))/4
sigma2hat<-mean(s2hat)
# calculating \hat{alpha}_{i,T}
alphahat<-rowSums(w.matrix)/(rowSums(w.matrix)+sigma2hat/tau2)
# calculating \hat{X}_{i,1:T}
Xhat<-rowSums(w.matrix*X.matrix)/rowSums(w.matrix)
# calculating inhomogeneous credibility estimators

alphahat*Xhat+(1-alphahat)*mu0
tau2<-0.05
mu0<-0.9
# calculating estimator for sigma^2

Xhat<-rep(rowSums(w.matrix*X.matrix)/rowSums(w.matrix),5)
dim(Xhat)<-c(5,5)
s2hat<-rowSums(w.matrix*((X.matrix-Xhat)^2))/4
sigma2hat<-mean(s2hat)
alphahat<-rowSums(w.matrix)/(rowSums(w.matrix)+sigma2hat/tau2)
# calculating \hat{X}_{i,1:T}
Xhat<-rowSums(w.matrix*X.matrix)/rowSums(w.matrix)
# calculating inhomogeneous credibility estimators

alphahat*Xhat+(1-alphahat)*mu0
Solution 7.18: [NLI24, Exercise] From Wuthrich (2014, Equation (8.18)), the prediction
uncertainty is given by
\[ E\Big[\Big(X_{i,T+1} - \widehat{\widehat{\mu(\Theta_i)}}^{\,hom}\Big)^2\Big] = \frac{\sigma^2}{w_{i,T+1}} + (1 - \alpha_{i,T})\tau^2\Big(1 + \frac{1 - \alpha_{i,T}}{\sum_i \alpha_{i,T}}\Big). \]
We have estimators $\hat{\sigma}_T^2 = 261.2$, $\hat{\tau}_T^2 = 0.1021$ and $w_{i,T+1}$ using a 5% increment factor. Results
are tabulated below,
                                                                      risk class 1  risk class 2  risk class 3  risk class 4  risk class 5
$E[(X_{i,T+1} - \widehat{\widehat{\mu(\Theta_i)}}^{\,hom})^2]$           0.28602       0.12675       0.30541       0.07218       0.63043
R-code:

# rowSums() is part of base R, so no additional package needs to be loaded
v.matrix<-c(729,1631,796,3152,400,786,1802,827,3454,420,872,2090,
874,3715,422,951,2300,917,3859,424,1019,2368,944,4198,440)
dim(v.matrix)<-c(5,5)
S.matrix<-c(583,99,1433,1765,40,1100,1298,496,4145,0,262,326,
699,3121,169,837,463,1742,4129,1018,1630,895,1038,3358,44)
dim(S.matrix)<-c(5,5)
X.matrix<-S.matrix/v.matrix
w.matrix<-v.matrix
sigma2hat<-261.2
tau2hat<-0.1021
alphahat<-rowSums(w.matrix)/(rowSums(w.matrix)+sigma2hat/tau2hat)
# calculating weight in period T+1 (5% volume growth)
w.next<-w.matrix[,5]*1.05
# calculating prediction uncertainty, following the formula above
sigma2hat/w.next+(1-alphahat)*tau2hat*(1+(1-alphahat)/sum(alphahat))

Solution 7.19:
In this exercise, we have 21 risk classes and for every risk class we have T = 1 observation.
The data is summarised in Table 7.2 where claims counts are denoted as $N_i$ and the observed
numbers of policies are $v_i$. We consider a BS model on the frequencies $X_i = N_i/v_i$ where the
conditional expectation and variance are given by (using the Poisson assumption),
\[ \mu(\Theta_i) = E[X_i|\Theta_i] = \frac{E[N_i|\Theta_i]}{v_i} = \Theta_i\lambda_0, \]
\[ \frac{\sigma^2(\Theta_i)}{v_i} = \mathrm{Var}[X_i|\Theta_i] = \frac{\mathrm{Var}[N_i|\Theta_i]}{v_i^2} = \frac{\Theta_i\lambda_0}{v_i}. \]
The collective mean is $\mu_0 = E[\mu(\Theta_1)] = \lambda_0 E[\Theta_1] = 8.8\%$. The prior uncertainty is given by
$\tau^2 = 2.4 \cdot 10^{-4}$. The volatility within risk classes also happens to be $\sigma^2 = E[\sigma^2(\Theta_1)] = \lambda_0 E[\Theta_1] =
8.8\%$ (thanks to the Poisson assumption).
region i vi Ni frequency (Ni /vi ) credibility weights credibility estimators

1 50061 3880 0.07750544 0.9927289 0.07758175
2 10135 794 0.07834238 0.9650849 0.07867957
3 121310 8941 0.07370373 0.9969865 0.07374682
4 35045 3448 0.09838779 0.9896456 0.09828023
5 19720 1672 0.08478702 0.9817458 0.08484567
6 39092 5186 0.13266141 0.9907076 0.13224640
7 4192 314 0.07490458 0.9195671 0.07595788
8 19635 1934 0.09849758 0.9816682 0.09830514
9 21618 2285 0.10569895 0.9833217 0.10540377
10 34332 2689 0.07832343 0.9894328 0.07842568
11 11105 661 0.05952274 0.9680372 0.06043295
12 56590 4878 0.08619898 0.9935624 0.08621057
13 13551 1205 0.08892333 0.9736546 0.08889900
14 19139 1646 0.08600240 0.9812020 0.08603995
15 10242 850 0.08299160 0.9654371 0.08316471
16 28137 2229 0.07921953 0.9871362 0.07933248
17 33846 3389 0.10013000 0.9892827 0.10000000
18 61573 5937 0.09642213 0.9940803 0.09637228
19 17067 1530 0.08964669 0.9789679 0.08961205
20 8263 671 0.08120537 0.9575109 0.08149407
21 148872 15014 0.10085174 0.9975431 0.10082016
R-code:

v.vector<-c(50061,10135,121310,35045,19720,39092,4192,19635,21618,
34332,11105,56590,13551,19139,10242,28137,33846,61573,17067,8263,148872)
N.vector<-c(3880,794,8941,3448,1672,5186,314,1934,2285,2689,661,
4878,1205,1646,850,2229,3389,5937,1530,671,15014)
x.vector<-N.vector/v.vector
# collective mean, vol between, vol within
mu0<-0.088
sigma2<-0.088
tau2<-0.00024
# credibility weights
alpha.vector<-v.vector/(v.vector+sigma2/tau2)
# credibility estimators
ans<-cbind(v.vector,N.vector,x.vector,
alpha.vector,alpha.vector*x.vector+(1-alpha.vector)*mu0)
Module 8
Claims Reserving
8.1 Outstanding loss liabilities

Exercise 8.1: [IBNR1Y, Solution] Suppose that an insurer has observations of claims from
year 2009 to year 2011. The observations of the claims arrivals and development are recorded in
two tables: the first one lists the accident date, reporting date, settlement status and settlement
date (if applicable) of each observed claim; the second one records the transaction of the claims.
claim ID accident date reporting date settlement status settlement date

1 2009.01.31 2009.02.01 settled 2010.05.02
2 2009.07.12 2009.12.11 unsettled NA
3 2010.02.15 2010.05.19 settled 2010.06.20
4 2010.12.08 2011.02.08 settled 2011.08.13
5 2011.04.05 2011.06.20 unsettled NA
6 2011.08.22 2011.09.11 settled 2011.12.31
claim ID transaction date paid loss (in AUD)

1 2009.03.15 100
1 2010.01.25 50
2 2010.02.20 60
2 2011.05.10 180
3 2010.05.21 160
3 2010.06.13 70
4 2011.03.22 80
4 2010.08.10 40
5 2011.09.06 90
6 2011.10.12 500
6 2011.11.09 200
Construct a 3-by-3 annual loss triangle based on the above two tables.
8.2 Claims reserving algorithms

Exercise 8.2: [IBNR4K, Solution] [Kaas et al. (2008), Problem 9.2.4] Apply the arithmetic
separation method to the data of Exercise 8.6 (the preceding problem in Kaas et al.). Determine
the missing γ values by linear or loglinear interpolation, whichever seems more appropriate.
Exercise 8.3: [IBNR5, Solution] For a certain portfolio of general insurance policies, denote by
Xij the claims that occur in accident year i, but paid in development year j, where i = 1, 2, ..., t
and j = 1, 2, ..., t with observable claims only for i + j ≤ t + 1. The triangle below shows the
observed incremental claims for this portfolio for a 3-year development period:
Accident Development Year
Year 1 2 3
1 2,541 1,029 217
2 2,824 790
3 1,981
1. Give five (5) reasons for the possible delay between the occurrence and the actual payment
of claims that gives rise to Incurred-but-not-Reported (IBNR) reserves.
2. In the Chain Ladder approach of estimating the bottom-half of the claims run-off triangle,
the claims Xij are assumed to be Poisson distributed with mean αi βj . Derive explicit
forms for the maximum likelihood estimators for the parameters αi and βj .
3. Using the result in (2), calculate the maximum likelihood estimates for αi and βj for
i = 1, 2, 3 and j = 1, 2, 3 and use these to estimate the bottom half of the triangle.
4. Explain the difference between the Chain Ladder approach and the Arithmetic Separation
Method.
Exercise 8.4: [IBNR7, Solution] Estimate the expected outstanding claims reserve for the
data in the table below (figures in $1000), using the Bornhuetter-Ferguson method. Assume
an expected loss ratio of 85%.
Development year (DY)

Accident Earned
Year premium 1 2 3 4
2003 860 473 620 690 715
2004 940 512 660 750
2005 980 611 700
2006 1,020 647
8.3 Stochastic claims reserving methods

Exercise 8.5: [IBNR1K, Solution] [Kaas et al. (2008), Problem 9.2.1] Consider the following
simple situation:
      1     2     3     4              5
1     A1    A2    A3    B1             E
2     A4    A5    A6    B2
3     C1    C2    C3    $\hat{X}_{34}$
4     D1    D2    D*    $\hat{X}_{44}$
5     F
Here, D* is a prediction. Use the Poisson (GLM) chain ladder technique to show that
\[ \hat{X}_{44} = \frac{D_\Sigma \times B_\Sigma}{A_\Sigma} \]
(where the sum $D_\Sigma$ includes D*) and
\[ \hat{X}_{44} = \frac{D_\Sigma \times (B_\Sigma + \hat{X}_{34})}{A_\Sigma + C_\Sigma} \]
indeed produce the same estimate.
Exercise 8.6: [IBNR3K, Solution] [Kaas et al. (2008), Problem 9.2.3] Apply the Poisson
(GLM) chain ladder method to the given IBNR triangle with cumulated figures. What could be
the reason why run-off triangles to be processed through the chain ladder method are usually
given in a cumulated form?
Year of Development Year

Origin 1 2 3 4 5
1 232 338 373 389 391
2 258 373 429 456
3 221 303 307
4 359 430
5 349
Exercise 8.7: [IBNR6, Solution] [Jiwook’s Final Exam Question 2002] The claim payments
(in incremental form) from a general insurance portfolio are represented by the following table
Development Year
2000 2001 2002
Year of Origin
2000 X00 X01 X02
2001 X10 X11
2002 X20
Based on the following assumptions, we would like to estimate the outstanding claims where:
• claim payments for each year of origin and development year have a log-normal distribution,
\[ \ln(X_{ij}) \sim \mathrm{Normal}\Big(\alpha + \sum_{k=1}^{i+j} \lambda_k + \sum_{k=1}^{j} \beta_k,\ \sigma^2\Big) \]
where $X_{ij}$ denotes the claim amount paid in development year j arising from losses occurring
in year of origin i.
• claim payments for each year of origin and development year are independent.
• the expected value of the logarithm of the claim payments in year of origin 0, Year 2000
and development year 0, Year 2000 is α.
• the expected change in the logarithm of the claim payments from one accounting year to
the next is given by λi for each accounting year i.
• for each year of origin, the expected change in the logarithm of the claims payment from
development year j − 1(j = 1, 2...) to development j is equal to βj and this is the same
for each year of origin.
• the logarithm of claim payments have the same variance, σ 2 , regardless of year of origin
or year of development.
The λi values allow for any inflation in values from one accounting year to the next. The βj
values allow for the settlement pattern of claims over time arising from the same policy year.
The run-off triangle of expected values for the logarithm of the claims payments will then be
                  Development Year
Year of Origin    2000             2001                     2002
2000              α                α + λ1 + β1              α + λ1 + λ2 + β1 + β2
2001              α + λ1           α + λ1 + λ2 + β1
2002              α + λ1 + λ2
From the log-likelihood of the run-off triangle, we have obtained
\[ \hat{\alpha} = 7.0632,\ \hat{\lambda}_1 = 0.025,\ \hat{\lambda}_2 = 0.012,\ \hat{\beta}_1 = -0.0208,\ \hat{\beta}_2 = 0.0123 \ \text{and}\ \hat{\sigma}^2 = 1. \]
Assuming that $\hat{\lambda}_3 = \hat{\lambda}_4 = 0.018$, estimate the outstanding claims $X_{12}$, $X_{21}$ and $X_{22}$.
Exercise 8.8: [IBNR2Y, Solution] Consider a 3-by-3 incremental loss triangle where we have
I = t = 3 and J = 2 (that is, we observe $\{x_{ij};\ i + j \le 3,\ 1 \le i \le 3,\ 0 \le j \le 2\}$, and all the
observations are not cumulative). The exposure of the ith accident period is a known constant
$c_i$ ($1 \le i \le 3$). We assume that $X_{ij}$ follows a normal distribution with parameters $\mu_{ij} = \alpha_j c_i$
and $\sigma_{ij} = \sigma_j$ ($1 \le i \le 3$ and $0 \le j \le 2$). In other words, the probability density function of
$X_{ij}$ is
\[ f_{X_{ij}}(x_{ij}) = \frac{1}{\sigma_j\sqrt{2\pi}}\, e^{-\frac{(x_{ij} - \alpha_j c_i)^2}{2\sigma_j^2}}. \]
Furthermore, $X_{ij}$ and $X_{lk}$ are independent (for $i \neq l$, $j \neq k$, $1 \le i \le 3$ and $0 \le j \le 2$).
1. Derive the maximum likelihood estimates of α0 , α1 and α2 .
2. Given the maximum likelihood estimates of σj (denoted by σ̂j , j = 0, 1, 2), estimate the
mean and variance of the outstanding claims liability.
8.4 Solutions
Solution 8.1: [IBNR1Y, Exercise] The construction and result of the loss triangle are shown
in the following table.
Year         Development Year
of Origin    1                    2                3
1            100                  (50+60=)110      180
2            (160+70+40=)270      80
3            (90+500+200=)790
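The same triangle can be built programmatically from the two tables of Exercise 8.1. The sketch below is one possible way to do it in base R; the data frame and column names are hypothetical, and only the calendar years of the accident and transaction dates are used.

```r
# Accident year of each claim (from the first table of Exercise 8.1)
claims <- data.frame(
  id = 1:6,
  accident_year = c(2009, 2009, 2010, 2010, 2011, 2011)
)

# Year and amount of each payment (from the second table of Exercise 8.1)
payments <- data.frame(
  id   = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 6, 6),
  year = c(2009, 2010, 2010, 2011, 2010, 2010, 2011, 2010, 2011, 2011, 2011),
  paid = c(100, 50, 60, 180, 160, 70, 80, 40, 90, 500, 200)
)

# Attach the accident year to every payment and compute the development year
merged <- merge(payments, claims, by = "id")
merged$dev <- merged$year - merged$accident_year + 1

# Sum payments by accident year (rows) and development year (columns);
# cells without any payment are returned as NA
triangle <- tapply(merged$paid,
                   list(origin = merged$accident_year, dev = merged$dev),
                   sum)
```

Cells below the latest diagonal come out as `NA`, which matches the empty cells of the triangle in the table above.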
Solution 8.2: [IBNR4K, Exercise] (9.2.4) The factors $\hat{\beta}_j$ and $\hat{\gamma}_k$ in the Arithmetic Separation
method are:
\[ \hat{\beta}_1 = 0.6701,\ \hat{\beta}_2 = 0.2111,\ \hat{\beta}_3 = 0.0700,\ \hat{\beta}_4 = 0.0444,\ \text{and}\ \hat{\beta}_5 = 0.0044 \]
and
\[ \hat{\gamma}_1 = 346.2170,\ \hat{\gamma}_2 = 413.0731,\ \hat{\gamma}_3 = 390.0336,\ \hat{\gamma}_4 = 515.2749,\ \text{and}\ \hat{\gamma}_5 = 453. \]
These can be obtained as suggested by the maximum likelihood estimates derived in lecture.
For example,
\[ \hat{\gamma}_5 = \frac{\sum_{i+j-1=5} x_{ij}}{\sum_{j=1}^{5} \hat{\beta}_j} = \sum_{i+j-1=5} x_{ij} = 349 + 71 + 4 + 27 + 2 = 453, \]
the sum of the claims in the big diagonal, and so on. You must be able to verify the rest. Then,
we can extrapolate the γ-factors linearly to yield $\hat{\gamma}_k$, for k = 6, ..., 9:
\[ \hat{\gamma}_6 = 518.27,\ \hat{\gamma}_7 = 549.85,\ \hat{\gamma}_8 = 581.43,\ \hat{\gamma}_9 = 613.01 \qquad (\hat{\gamma}_k = 328.79 + 31.58 \cdot k). \]
One can also exponentially extrapolate, leading to
\[ \hat{\gamma}_6 = 526.05,\ \hat{\gamma}_7 = 567.53,\ \hat{\gamma}_8 = 612.29,\ \hat{\gamma}_9 = 660.57 \qquad (\hat{\gamma}_k = e^{5.81 + 0.0759 \cdot k}). \]
As a result, the lower right triangle of estimated values for the first case (linear extrapolation)
becomes:
Year         Development Year                        Row
of Origin    1      2        3       4       5       Total
1
2                                            2.3     2.3
3                                    23.0    2.4     25.4
4                            36.3    24.4    2.6     63.3
5                   109.4    38.5    25.8    2.7     176.4
The total IBNR reserve required would be about 267.4. (Try working out the exponential case!)
Alternatively, one can use linear extrapolation based on the last two γ-factors:
\[ \hat{\gamma}_k = \hat{\gamma}_5 + (\hat{\gamma}_5 - \hat{\gamma}_4)\frac{k-5}{5-4} = \hat{\gamma}_5 + (\hat{\gamma}_5 - \hat{\gamma}_4)(k-5); \quad k = 6, \ldots, 9. \]
One can also use log-linear extrapolation based on the last two γ-factors:
\[ \ln\hat{\gamma}_k = \ln\hat{\gamma}_5 + (\ln\hat{\gamma}_5 - \ln\hat{\gamma}_4)\frac{k-5}{5-4} = \ln\hat{\gamma}_5 + (\ln\hat{\gamma}_5 - \ln\hat{\gamma}_4)(k-5); \quad k = 6, \ldots, 9. \]
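The recursive calculation of the separation factors, followed by the linear extrapolation over all five γ-factors, can be sketched in R as follows. This is an illustration under the assumptions of the solution above (the cumulative triangle of Exercise 8.6, $\sum_j \beta_j = 1$, ordinary least squares for the linear trend); the variable names are hypothetical.

```r
# Cumulative triangle of Exercise 8.6 (NA = not yet observed)
cum <- matrix(c(232, 338, 373, 389, 391,
                258, 373, 429, 456,  NA,
                221, 303, 307,  NA,  NA,
                359, 430,  NA,  NA,  NA,
                349,  NA,  NA,  NA,  NA), nrow = 5, byrow = TRUE)
n <- nrow(cum)

# Incremental claims
inc <- cum
inc[, 2:n] <- cum[, 2:n] - cum[, 1:(n - 1)]

# Calendar year index k = i + j - 1 of every cell
cal <- row(inc) + col(inc) - 1

# Arithmetic separation: work backwards from the latest diagonal
beta_hat <- numeric(n)
gamma_hat <- numeric(n)
for (k in n:1) {
  # diagonal sum divided by (1 - sum of the betas already estimated)
  gamma_hat[k] <- sum(inc[cal == k]) / (1 - sum(beta_hat))
  # column sum divided by the sum of the gammas of calendar years k, ..., n
  beta_hat[k] <- sum(inc[1:(n - k + 1), k]) / sum(gamma_hat[k:n])
}

# Linear extrapolation of the gamma factors to future calendar years
fit <- lm(g ~ k, data = data.frame(k = 1:n, g = gamma_hat))
gamma_all <- c(gamma_hat, predict(fit, newdata = data.frame(k = (n + 1):(2 * n - 1))))

# Predicted incremental claims beta_j * gamma_{i+j-1} and the total reserve
pred <- outer(1:n, 1:n, function(i, j) beta_hat[j] * gamma_all[i + j - 1])
reserve <- sum(pred[cal > n])
```

Replacing the `lm` fit on `gamma_hat` by a fit on `log(gamma_hat)` would give the exponential extrapolation instead.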
Solution 8.3: [IBNR5, Exercise] IBNR problem.
1. Some possible reasons for delay: (1) delay in assessing exact size or amount of claims; (2)
delay in investigating whether claim is valid; (3) long legal proceedings; (4) claims have
occurred, but are filed only later; and (5) claim consists of series of payments (e.g. disability
insurance).
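2. First, notice that you can write the probability:
\[ P(X_{ij} = x_{ij}) = e^{-\alpha_i\beta_j}(\alpha_i\beta_j)^{x_{ij}}/x_{ij}! \]
for i, j satisfying $i + j \le t + 1$. The full likelihood of all observed values can be written as
\[ L(\alpha, \beta) = \prod_{i=1}^{t}\prod_{j=1}^{t-i+1} e^{-\alpha_i\beta_j}(\alpha_i\beta_j)^{x_{ij}}/x_{ij}! \]
\[ \Rightarrow\ l(\alpha, \beta) = \sum_{i=1}^{t}\sum_{j=1}^{t-i+1} \{-\alpha_i\beta_j + x_{ij}(\log\alpha_i + \log\beta_j) - \log x_{ij}!\}. \]
Maximize the log-likelihood $l(\alpha, \beta)$. The solutions will have the form:
\[ \hat{\alpha}_i = \frac{\sum_{j=1}^{t-i+1} x_{ij}}{\sum_{j=1}^{t-i+1} \hat{\beta}_j} \quad \text{and} \quad \hat{\beta}_j = \frac{\sum_{i=1}^{t-j+1} x_{ij}}{\sum_{i=1}^{t-j+1} \hat{\alpha}_i}, \]
but we must impose $\hat{\beta}_1 + \cdots + \hat{\beta}_t = 1$. Observe that the numerators here are actually row
sums ($R_i = \sum_{j=1}^{t-i+1} x_{ij}$) and column sums ($K_j = \sum_{i=1}^{t-j+1} x_{ij}$), respectively.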
3. Notice that we can actually write the expected observed claims in the chain ladder approach as
Accident    Development Year
Year        1         2         3
1           α1 β1     α1 β2     α1 β3
2           α2 β1     α2 β2
3           α3 β1
So we can easily verify, together with the assumption that all claims settle after 3 development
years, i.e. $\beta_1 + \beta_2 + \beta_3 = 1$, the following:
\begin{align*}
\hat{\alpha}_1 &= 2541 + 1029 + 217 = 3{,}787 \\
\hat{\beta}_3 &= \frac{217}{3787} = 0.0573 \\
\hat{\alpha}_2 &= \frac{2824 + 790}{1 - 0.0573} = 3{,}834 \\
\hat{\beta}_2 &= \frac{1029 + 790}{3787 + 3834} = 0.2387 \\
\hat{\alpha}_3 &= \frac{1981}{1 - 0.0573 - 0.2387} = 2{,}814 \\
\hat{\beta}_1 &= \frac{2541 + 2824 + 1981}{3787 + 3834 + 2814} = 0.7040
\end{align*}
Thus using these results to estimate the bottom half, we would have:
\[ \hat{X}_{23} = \hat{\alpha}_2\hat{\beta}_3 = 220; \quad \hat{X}_{32} = \hat{\alpha}_3\hat{\beta}_2 = 672; \quad \hat{X}_{33} = \hat{\alpha}_3\hat{\beta}_3 = 161. \]
4. The chain ladder method assumes that the claims are Poisson distributed, \( X_{ij} \sim \text{Poisson}(\alpha_i \beta_j) \),
where the α's denote the accident year effect and the β's denote the development year
effect. It has no calendar year effect, unlike the arithmetic separation method, where
the claims are assumed to be \( \text{Poisson}(\beta_j \gamma_k) \) with \( k = i + j - 1 \). Here, as in the chain ladder,
the β's denote the development year effect, but the γ's denote the calendar year effect.
Both methods use maximum likelihood to estimate their corresponding parameters. In predicting
unpaid claims with the arithmetic separation method, however, the future calendar years
have not occurred yet, so the γ's may have to be extrapolated from the estimated ones.
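The recursion used in part 3 (work backwards from the last development year, alternating between a row sum and a column sum) can be sketched in Python. This is only an illustrative sketch: the function name `marginal_totals` and the list-of-rows layout are assumptions made here, and the incremental triangle is the one implied by the sums quoted in part 3.

```python
def marginal_totals(triangle):
    """Poisson MLEs (alpha_i, beta_j) for an incremental run-off triangle.

    triangle[i] holds the observed incremental claims of accident year i + 1,
    so row i has t - i entries. Assumes beta_1 + ... + beta_t = 1.
    """
    t = len(triangle)
    row_sums = [sum(row) for row in triangle]
    col_sums = [sum(row[j] for row in triangle if len(row) > j) for j in range(t)]
    alpha = [0.0] * t
    beta = [0.0] * t
    for k in range(t):
        i = k            # accident year handled in this step
        j = t - 1 - k    # its last observed development year
        # alpha_i = row sum / share of development observed so far for row i
        alpha[i] = row_sums[i] / (1.0 - sum(beta[j + 1:]))
        # beta_j = column sum / sum of the alphas of the rows observed in column j
        beta[j] = col_sums[j] / sum(alpha[: i + 1])
    return alpha, beta


# Incremental claims implied by the sums in part 3 (hypothetical layout).
triangle = [[2541, 1029, 217], [2824, 790], [1981]]
alpha, beta = marginal_totals(triangle)

# Predicted lower-right cells X_23, X_32, X_33.
x23 = alpha[1] * beta[2]
x32 = alpha[2] * beta[1]
x33 = alpha[2] * beta[2]
print([round(a) for a in alpha], [round(b, 4) for b in beta])
print(round(x23), round(x32), round(x33))
```

Because each step only needs the β's of later development years and the α's of earlier accident years, no iteration is required for a complete triangle.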
Solution 8.4: [IBNR7, Exercise] First calculate the initial expected total loss as 85% of the
earned premium. This gives figures of 731, 799, 833 and 867.
Now calculate the development factors for individual years in the usual way. We find that the
factors are 1.2406, 1.1250 and 1.0362.
Tackling the years one at a time:
The total expected outgo for Accident Year 2003 is 715, as we are assuming that Accident Year
2003 is fully run-off.
For Accident Year 2004, the expected outgo was initially 799. On this basis we would expect
to have paid out 799/1.0362 = 771.09 so far. So we should have to pay out 799 − 771.09 = 27.91 in
the future. In fact we have incurred 750, so our final figure would be 750 + 27.91 = 777.91.
For Accident Year 2005, the expected outgo was initially 833. On this basis we would expect to
have paid out 833/(1.0362 × 1.125) = 714.58 so far. So we would have to pay out 833 − 714.58 = 118.42
in the future. In fact we have incurred 700 so far, so our final figure should be 700 + 118.42 =
818.42.
For Accident Year 2006, the expected outgo was initially 867. On this basis we would expect
to have paid out 867/(1.0362 × 1.125 × 1.2406) = 599.50 so far. So we would have to pay out
867 − 599.50 = 267.5 in the future. In fact we have incurred 647 so far, so our final figure
would be 647 + 267.5 = 914.5.
So the total payout expected is 3225.83, of which we have already paid 2812. So the balance is
413.83.
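The Bornhuetter-Ferguson steps above can be reproduced with a short Python sketch. The variable and function names are illustrative assumptions; the inputs are the initial expected losses, the development factors and the incurred amounts quoted in the solution (the underlying premium data are not repeated here).

```python
# Figures quoted in Solution 8.4.
factors = [1.2406, 1.1250, 1.0362]   # development year 1->2, 2->3, 3->4
initial_ultimate = {2003: 731, 2004: 799, 2005: 833, 2006: 867}
incurred_to_date = {2003: 715, 2004: 750, 2005: 700, 2006: 647}


def bf_future_outgo(initial, remaining_factors):
    """Expected future outgo: initial * (1 - 1/f), f = product of remaining factors."""
    f = 1.0
    for factor in remaining_factors:
        f *= factor
    return initial * (1.0 - 1.0 / f)


years = sorted(initial_ultimate)
reserve = {}
for k, year in enumerate(years):
    # The oldest year (k = 0) is assumed fully run off, so no factors remain.
    remaining = factors[len(factors) - k:]
    reserve[year] = bf_future_outgo(initial_ultimate[year], remaining)

total_reserve = sum(reserve.values())
total_paid = sum(incurred_to_date.values())
print({y: round(r, 2) for y, r in reserve.items()})
print(round(total_reserve, 2), total_paid, round(total_paid + total_reserve, 2))
```

The future outgo depends only on the initial estimate and the development pattern, not on the claims incurred so far; this is what distinguishes the method from the plain chain ladder.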
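Solution 8.5: [IBNR1K, Exercise] (9.2.1) This is immediate, beginning with the equation
\[ \hat{X}_{44} = \frac{D_\Sigma \times (B_\Sigma + \hat{X}_{34})}{A_\Sigma + C_\Sigma} = \frac{D_\Sigma \times (B_\Sigma + C_\Sigma B_\Sigma / A_\Sigma)}{A_\Sigma + C_\Sigma} \]
\[ = \frac{D_\Sigma \times B_\Sigma / A_\Sigma \times (A_\Sigma + C_\Sigma)}{A_\Sigma + C_\Sigma} \]
\[ = \frac{D_\Sigma \times B_\Sigma}{A_\Sigma}, \]
which gives the result.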
Solution 8.6: [IBNR3K, Exercise] (9.2.3) First, it can be verified that the row and column
totals are as follows (each row lists the cumulative claims by development year; the column
totals are those of the incremental claims):

Year               1      2      3      4      5    Row Total
1                232    338    373    389    391    391
2                258    373    429    456           456
3                221    303    307                  307
4                359    430                         430
5                349                                349
Column Total    1419    374     95     43      2

You can proceed by estimating the parameters as suggested in the book (or applying the
mechanics of using ratios of cumulative claims as discussed in lecture). We have
\[ \hat{\alpha}_1 = 391.0, \; \hat{\alpha}_2 = 458.3, \; \hat{\alpha}_3 = 325.1, \; \hat{\alpha}_4 = 498.0, \; \text{and} \; \hat{\alpha}_5 = 545.5 \]
and
\[ \hat{\beta}_1 = 0.640, \; \hat{\beta}_2 = 0.224, \; \hat{\beta}_3 = 0.081, \; \hat{\beta}_4 = 0.051, \; \text{and} \; \hat{\beta}_5 = 0.0051. \]
As a result, we have the bottom part of the claims run-off triangle:

Year         2       3       4       5    Row Total
1
2                                  2.3      2.3
3                         16.5     1.7     18.2
4                 40.3    25.2     2.5     68.0
5        122.0    44.1    27.6     2.8    196.5

Some differences may exist due to rounding. The total IBNR reserve is about 285.
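The alternative route mentioned above, using ratios of cumulative claims, can be sketched as follows. This is a minimal illustration under the assumption that the table rows are cumulative claims by development year; the function names are chosen here for the example.

```python
# Cumulative claims by year of origin (rows) and development year (columns).
cumulative = [
    [232, 338, 373, 389, 391],
    [258, 373, 429, 456],
    [221, 303, 307],
    [359, 430],
    [349],
]


def development_factors(tri):
    """Ratio of column sums j+1 to j, using only rows observed in both columns."""
    factors = []
    for j in range(len(tri) - 1):
        rows = [row for row in tri if len(row) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    return factors


def reserves(tri):
    """Projected ultimate minus latest cumulative claims, per year of origin."""
    factors = development_factors(tri)
    out = []
    for row in tri:
        ultimate = row[-1]
        for f in factors[len(row) - 1:]:
            ultimate *= f
        out.append(ultimate - row[-1])
    return out


factors = development_factors(cumulative)
res = reserves(cumulative)
print([round(f, 4) for f in factors])
print([round(r, 1) for r in res], round(sum(res), 1))
```

The projected ultimates (latest cumulative claim plus reserve) agree with the estimates of the α's above, up to rounding.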
Solution 8.7: [IBNR6, Exercise] (Jiwook’s solution to Final Exam Question 2002) Note that
we can write the claims run-off model as:
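Year of     Development Year
Origin      0                                  1                                                2
0           \(\alpha\)                         \(\alpha + \lambda_1 + \beta_1\)                             \(\alpha + \lambda_1 + \lambda_2 + \beta_1 + \beta_2\)
1           \(\alpha + \lambda_1\)             \(\alpha + \lambda_1 + \lambda_2 + \beta_1\)                 \(\alpha + \lambda_1 + \lambda_2 + \lambda_3 + \beta_1 + \beta_2\)
2           \(\alpha + \lambda_1 + \lambda_2\) \(\alpha + \lambda_1 + \lambda_2 + \lambda_3 + \beta_1\)     \(\alpha + \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 + \beta_1 + \beta_2\)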
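Firstly,
\[ \mu_{12} = \hat{\alpha} + \hat{\lambda}_1 + \hat{\lambda}_2 + \hat{\lambda}_3 + \hat{\beta}_1 + \hat{\beta}_2 \]
\[ = 7.0632 + 0.025 + 0.012 + 0.018 - 0.0208 + 0.0123 \]
\[ = 7.1097 \]
Hence,
\[ \log X_{12} \sim \text{Normal}(7.1097, 1^2) \]
So
\[ \hat{X}_{12} = E(X_{12}) = e^{\mu + \sigma^2/2} = e^{7.1097 + 0.5} = e^{7.6097} = 2017.67 \]
Secondly,
\[ \mu_{21} = \hat{\alpha} + \hat{\lambda}_1 + \hat{\lambda}_2 + \hat{\lambda}_3 + \hat{\beta}_1 \]
\[ = 7.0632 + 0.025 + 0.012 + 0.018 - 0.0208 \]
\[ = 7.0974 \]
Hence
\[ \log X_{21} \sim \text{Normal}(7.0974, 1^2) \]
So
\[ \hat{X}_{21} = E(X_{21}) = e^{\mu + \sigma^2/2} = e^{7.5974} = 1993.01 \]
Lastly,
\[ \mu_{22} = \hat{\alpha} + \hat{\lambda}_1 + \hat{\lambda}_2 + \hat{\lambda}_3 + \hat{\lambda}_4 + \hat{\beta}_1 + \hat{\beta}_2 \]
\[ = 7.0632 + 0.025 + 0.012 + 0.018 + 0.018 - 0.0208 + 0.0123 \]
\[ = 7.1277 \]
So
\[ \hat{X}_{22} = E(X_{22}) = e^{\mu + \sigma^2/2} = e^{7.6277} = 2054.32 \]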
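The three predictions can be checked numerically with the sketch below. It assumes the cell structure implied by the run-off table (cell (i, j) adds the first i + j calendar effects and the first j development effects), takes the estimate of β1 as −0.0208 (the value that reproduces the totals above) and uses σ² = 1; the function names are illustrative.

```python
import math

alpha = 7.0632
lam = [0.025, 0.012, 0.018, 0.018]   # lambda_1 .. lambda_4
beta = [-0.0208, 0.0123]             # beta_1, beta_2
sigma2 = 1.0                         # variance of log X_ij


def mu(i, j):
    """Mean of log X_ij: alpha + first (i + j) lambdas + first j betas."""
    return alpha + sum(lam[: i + j]) + sum(beta[:j])


def expected_claim(i, j):
    """Lognormal mean E(X_ij) = exp(mu + sigma^2 / 2)."""
    return math.exp(mu(i, j) + sigma2 / 2)


for cell in [(1, 2), (2, 1), (2, 2)]:
    print(cell, round(mu(*cell), 4), round(expected_claim(*cell), 2))
```

The half-variance term matters: exponentiating μ alone would give the median of the claim, not its mean.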
Solution 8.8: [IBNR2Y, Exercise]
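1. The likelihood function is
\[ L(\alpha_0, \alpha_1, \alpha_2, \sigma_0, \sigma_1, \sigma_2) = \prod_{j=0}^{2} \prod_{i=1}^{3-j} \frac{1}{\sigma_j \sqrt{2\pi}} \, e^{-\frac{(x_{ij} - \alpha_j c_i)^2}{2\sigma_j^2}}. \]
Therefore, the log-likelihood function is
\[ \log L(\alpha_0, \alpha_1, \alpha_2, \sigma_0, \sigma_1, \sigma_2) = \sum_{j=0}^{2} \sum_{i=1}^{3-j} \left( -\ln \sigma_j - \frac{(x_{ij} - \alpha_j c_i)^2}{2\sigma_j^2} - \ln \sqrt{2\pi} \right) \]
\[ = -\sum_{j=0}^{2} (3 - j) \ln \sigma_j - \sum_{j=0}^{2} \frac{\sum_{i=1}^{3-j} (x_{ij} - \alpha_j c_i)^2}{2\sigma_j^2} - \sum_{j=0}^{2} \sum_{i=1}^{3-j} \ln \sqrt{2\pi}. \]
Taking the partial derivative of the log-likelihood function with respect to \( \alpha_j \) (j = 0, 1, 2)
gives
\[ \frac{\partial \log L(\alpha_0, \alpha_1, \alpha_2, \sigma_0, \sigma_1, \sigma_2)}{\partial \alpha_j} = \frac{\sum_{i=1}^{3-j} x_{ij} c_i - \alpha_j \sum_{i=1}^{3-j} c_i^2}{\sigma_j^2}. \]
Solving \( \partial \log L / \partial \alpha_j = 0 \) gives the maximum likelihood estimate of \( \alpha_j \), denoted by \( \hat{\alpha}_j \),
\[ \hat{\alpha}_j = \frac{\sum_{i=1}^{3-j} x_{ij} c_i}{\sum_{i=1}^{3-j} c_i^2}. \]
One can easily check that the second derivative is negative.
2.
\[ \hat{E}(X_{22} + X_{31} + X_{32}) = \hat{E}(X_{22}) + \hat{E}(X_{31}) + \hat{E}(X_{32}) = \hat{\alpha}_2 c_2 + \hat{\alpha}_1 c_3 + \hat{\alpha}_2 c_3 \]
\[ \widehat{\text{Var}}(X_{22} + X_{31} + X_{32}) = \widehat{\text{Var}}(X_{22}) + \widehat{\text{Var}}(X_{31}) + \widehat{\text{Var}}(X_{32}) = \hat{\sigma}_1^2 + 2\hat{\sigma}_2^2 \]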
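The second-derivative check mentioned at the end of part 1 can be written out as follows. This is a short sketch that differentiates the score given in part 1 once more; it assumes at least one \( c_i \) in each sum is non-zero.

```latex
\[
\frac{\partial^2 \log L}{\partial \alpha_j^2}
  = \frac{\partial}{\partial \alpha_j}
    \left( \frac{\sum_{i=1}^{3-j} x_{ij} c_i - \alpha_j \sum_{i=1}^{3-j} c_i^2}{\sigma_j^2} \right)
  = -\frac{\sum_{i=1}^{3-j} c_i^2}{\sigma_j^2} < 0,
  \qquad j = 0, 1, 2.
\]
```

Since the cross-derivatives with respect to different \( \alpha_j \) vanish, the log-likelihood is strictly concave in \( (\alpha_0, \alpha_1, \alpha_2) \) for fixed \( \sigma_j \), so the stationary point is a maximum.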
Module 9
Game and Decision Theory
9.1 Decision theory

Exercise 9.1: [DnG1, Solution] [Decisions & Games notes, exercise # 2] The loss function
under a decision problem is summarized below:
             states
decision     \(\theta_1\)    \(\theta_2\)    \(\theta_3\)
\(d_1\)      14              12              13
\(d_2\)      13              15              14
\(d_3\)      11              15              5

1. Determine the minimax solution to the problem.
2. Given the probability distribution (pmf): \( P(\theta_1) = \frac{1}{4} \), \( P(\theta_2) = \frac{1}{4} \), and \( P(\theta_3) = \frac{1}{2} \), determine
the Bayes criterion solution.
Exercise 9.2: [DnG2, Solution] [Decisions & Games notes, exercise # 3] The profit per client-day
made by a privately owned health center depends on the variable costs involved. Variable
costs, over which the owner of the health center has no control, take one of three levels:
\( \theta_1 \) = high, \( \theta_2 \) = most likely, and \( \theta_3 \) = low. The owner has to decide at what level to set the
number of client-days for the coming year. Client-days can be either \( d_1 = 16 \), \( d_2 = 13.4 \), or
\( d_3 = 10 \) (each in '000). The profit ($) per client-day is as follows:

             states
decision     \(\theta_1\)    \(\theta_2\)    \(\theta_3\)
\(d_1\)      85              95              110
\(d_2\)      105             115             130
\(d_3\)      125             135             150

1. Determine the Bayes criterion solution based on the annual profits, given the probability
distribution (pmf) P (θ1 ) = 0.1, P (θ2 ) = 0.6, and P (θ3 ) = 0.3.
2. Determine both the minimax regret solution and the maximin solution to this problem.
Exercise 9.3: [DnG4, Solution] [Decisions & Games notes, exercise # 5] A firm is contemplating
three investment alternatives: stocks, bonds, and a savings account, involving three
potential economic conditions: accelerated, normal, or slow growth. The conditions have
probabilities of occurrence P(accelerated growth) = 0.2, P(normal growth) = 0.5 and P(slow
growth) = 0.3. It is assumed that the decision maker, who has $100,000 to invest, wishes to
invest all the funds in a single investment class. The annual returns ($) yielded from the stocks,
bonds, and savings account are as follows:

                 Economic Conditions
Alternative      accelerated    normal     slow
Investment       growth         growth     growth
Stocks           20,000         13,000     -8,000
Bonds            16,000         12,000     2,000
Savings          10,000         10,000     10,000

1. Determine the Bayes criterion solution.
2. Determine both the minimax regret solution and maximin solution to this problem.
3. Explain briefly when Bayesian decision analysis (i.e. Bayes rule) can be used.
9.2 Game theory

Exercise 9.4: [DnG3, Solution] [Decisions & Games notes, exercise # 4] Consider the following
two-person, zero-sum game:
                              Player B strategies (loss)
                              x       y       z
Player A              1       250     300     150
strategies (gain)     2       50      165     125
                      3       100     275     225

1. Using the rule of dominance, reduce the payoff matrix to a 2-by-2 matrix.
2. Solve algebraically for the mixed-strategy probabilities for players A and B and determine
the expected gain for player A and the expected loss for player B. Discuss the meaning
of this solution value.
9.3 Solutions
Solution 9.1: [DnG1, Exercise] Decisions & Games, Exercise # 2: Solution from Institute.
First, check out the table below:
             states
decision     \(p_1 = 0.25\)    \(p_2 = 0.25\)    \(p_3 = 0.5\)    maximum    expected
             \(\theta_1\)      \(\theta_2\)      \(\theta_3\)     loss       loss
\(d_1\)      14                12                13               14         13
\(d_2\)      13                15                14               15         14
\(d_3\)      11                15                5                15         9

Thus, the minimax solution is \( d_1 \), since it gives the smallest maximum loss (14). The Bayes
criterion solution is \( d_3 \), since it gives the smallest expected loss (9).
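The two criteria used in Solution 9.1 can be checked with a few lines of Python. This is a minimal sketch; the dictionary layout and variable names are illustrative choices, and the entries are the losses from the table.

```python
# Loss table of Exercise 9.1: one row per decision, one column per state.
losses = {"d1": [14, 12, 13], "d2": [13, 15, 14], "d3": [11, 15, 5]}
probs = [0.25, 0.25, 0.5]   # P(theta_1), P(theta_2), P(theta_3)

# Minimax: minimise the worst-case (maximum) loss.
max_loss = {d: max(row) for d, row in losses.items()}
minimax = min(max_loss, key=max_loss.get)

# Bayes criterion: minimise the expected loss under the given pmf.
expected_loss = {d: sum(p * x for p, x in zip(probs, row)) for d, row in losses.items()}
bayes = min(expected_loss, key=expected_loss.get)

print(max_loss, minimax)
print(expected_loss, bayes)
```

The minimax choice ignores the probabilities entirely, which is why it can differ from the Bayes choice.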
Solution 9.2: [DnG2, Exercise] Decisions & Games, Exercise # 3: Solution from Institute.
First convert the table into annual profits as follows:
             annual profits ($'000)
decision     \(p_1 = 0.1\)    \(p_2 = 0.6\)    \(p_3 = 0.3\)    maximum    minimum    expected
             \(\theta_1\)     \(\theta_2\)     \(\theta_3\)     regret     profit     profit
\(d_1\)      1360             1520             1760             47         1360       1576
\(d_2\)      1407             1541             1742             18         1407       1587.9
\(d_3\)      1250             1350             1500             260        1250       1385

Therefore, the Bayes criterion solution is to choose \( d_2 \), as it gives the largest expected profit. The
minimax regret solution is to choose \( d_2 \), and the maximin solution is also to choose \( d_2 \).
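The conversion to annual profits and the regret calculation can be sketched in Python as follows. The names are illustrative assumptions; the inputs are the client-day levels (in '000) and the profits per client-day from Exercise 9.2, so annual profits come out in $'000.

```python
client_days = {"d1": 16.0, "d2": 13.4, "d3": 10.0}   # in '000
profit_per_day = {"d1": [85, 95, 110], "d2": [105, 115, 130], "d3": [125, 135, 150]}
probs = [0.1, 0.6, 0.3]
n_states = len(probs)

# Annual profit ($'000) = client-days ('000) * profit per client-day ($).
annual = {d: [client_days[d] * p for p in profit_per_day[d]] for d in client_days}

# Regret = best achievable profit in that state minus the profit obtained.
best = [max(annual[d][k] for d in annual) for k in range(n_states)]
regret = {d: [best[k] - annual[d][k] for k in range(n_states)] for d in annual}

max_regret = {d: max(r) for d, r in regret.items()}
min_profit = {d: min(a) for d, a in annual.items()}
expected = {d: sum(p * a for p, a in zip(probs, annual[d])) for d in annual}

minimax_regret = min(max_regret, key=max_regret.get)
maximin = max(min_profit, key=min_profit.get)
bayes = max(expected, key=expected.get)
print(minimax_regret, maximin, bayes)
```

Note that the criteria switch direction when moving from losses to profits: the Bayes and maximin rules maximise, while the regret rule still minimises.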
Solution 9.3: [DnG4, Exercise] Decisions & Games, Exercise # 5: Jiwook’s solution, Emil
modified.
1. First, the annual profits:

                 Economic Conditions
probabilities    0.2            0.5        0.3
Alternative      accelerated    normal     slow       Expected
Investment       growth         growth     growth     profit
Stocks           20,000         13,000     −8,000     8,100
Bonds            16,000         12,000     2,000      9,800
Savings          10,000         10,000     10,000     10,000

where, for example, 8,100 = 20,000 × 0.2 + 13,000 × 0.5 + (−8,000) × 0.3. Bayes decision:
choose Savings, which has the largest expected profit.
2. The opportunity losses (amounts of regret) are given below:

                 Economic Conditions
Alternative      accelerated    normal     slow       maximum
Investment       growth         growth     growth     regret
Stocks           0              0          18,000     18,000
Bonds            4,000          1,000      8,000      8,000
Savings          10,000         3,000      0          10,000

Thus, the minimax regret decision is to choose Bonds (i.e. choose the minimum of the maximum regrets).

                 Economic Conditions
Alternative      accelerated    normal     slow       minimum
Investment       growth         growth     growth     payoff
Stocks           20,000         13,000     −8,000     −8,000
Bonds            16,000         12,000     2,000      2,000
Savings          10,000         10,000     10,000     10,000

So the maximin decision is to choose Savings.
3. Bayes rule can be used when we want to revise probabilities of potential states of nature
based on additional information, experiments and personal judgments.
Solution 9.4: [DnG3, Exercise] Decisions & Games, Exercise # 4: Jiwook’s solutions.
1. First, notice that:

                      Player B
                      x           y           z
Player A     1        250    <    300    >    150
                       ∨           ∨           ∨
             2        50     <    165    >    125
                       ∧           ∧           ∧
             3        100    <    275    >    225

So strategies 1 and 3 each dominate strategy 2 for Player A, and strategies x and z each
dominate strategy y for Player B. Therefore, a 2-by-2 matrix would be

                      Player B
                      x       z
Player A     1        250     150
             3        100     225

2. If Player B selects strategy x, the possible payoffs for Player A are 250 and 100. Therefore,
if Player A selects strategy 1 with probability p, Player A's expected gain is given by
\[ 250p + 100(1 - p). \]
On the other hand, if Player B selects z, Player A's expected gain would be
\[ 150p + 225(1 - p). \]
So, we have
\[ 250p + 100(1 - p) = 150p + 225(1 - p) \]
\[ \Rightarrow 225p = 125 \]
\[ \Rightarrow p = \frac{125}{225} = \frac{5}{9}, \quad 1 - p = \frac{4}{9}. \]
Similarly, if Player B selects strategy x with probability q, equating Player B's expected loss
against Player A's strategies 1 and 3, we need
\[ 250q + 150(1 - q) = 100q + 225(1 - q) \]
\[ \Rightarrow 225q - 75 = 0 \]
\[ \Rightarrow q = \frac{75}{225} = \frac{1}{3}, \quad 1 - q = \frac{2}{3}. \]
Hence, Player A's expected gain will be:
\[ \text{if B selects } x: \; 250 \times \frac{5}{9} + 100 \times \frac{4}{9} = 183.33 \]
\[ \text{if B selects } z: \; 150 \times \frac{5}{9} + 225 \times \frac{4}{9} = 183.33 \]
Player B's expected loss will be:
\[ \text{if A selects } 1: \; 250 \times \frac{1}{3} + 150 \times \frac{2}{3} = 183.33 \]
\[ \text{if A selects } 3: \; 100 \times \frac{1}{3} + 225 \times \frac{2}{3} = 183.33 \]
No matter what strategy Player B chooses, if Player A employs the mixed strategy
(\(\frac{5}{9}\) 1st, \(\frac{4}{9}\) 3rd), he is "guaranteed" a gain of 183.33 (in the expected value
sense). No matter what strategy Player A chooses, if Player B employs the mixed strategy
(\(\frac{1}{3}\) x, \(\frac{2}{3}\) z), his expected loss is limited to 183.33. Therefore,

The value of the game                183.33
The optimal strategy for player A    (\(\frac{5}{9}\) 1st, \(\frac{4}{9}\) 3rd)
The optimal strategy for player B    (\(\frac{1}{3}\) x, \(\frac{2}{3}\) z)
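The algebra in part 2 generalises to any 2-by-2 zero-sum game without a saddle point. The sketch below uses exact fractions; the function name `solve_2x2` and its argument order are assumptions made for this illustration, with entries read as gains to the row player.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d):
    """Mixed-strategy solution of the zero-sum game [[a, b], [c, d]].

    Entries are gains to the row player. Assumes there is no saddle point,
    so that a - b - c + d is non-zero and both probabilities lie in (0, 1).
    """
    denom = a - b - c + d
    p = Fraction(d - c, denom)              # row player: probability of first row
    q = Fraction(d - b, denom)              # column player: probability of first column
    value = Fraction(a * d - b * c, denom)  # value of the game
    return p, q, value


# Reduced matrix from part 1: rows are strategies 1 and 3, columns are x and z.
p, q, value = solve_2x2(250, 150, 100, 225)
print(p, q, value, float(value))
```

Here p comes from making the row player's expected gain equal across the two columns, and q from making the column player's expected loss equal across the two rows, exactly as in the derivation above.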
References
Bowers, N. L., Gerber, H. U., Hickman, J. C., Jones, D. A., Nesbitt, C. J., 1997. Actuarial
mathematics. Society of Actuaries.
de Jong, P., Heller, G. Z., 2008. Generalized Linear Models for Insurance Data. Cambridge
University Press.
Kaas, R., Goovaerts, M., Dhaene, J., Michel, D., 2008. Modern Actuarial Risk Theory. Springer-
Verlag Berlin Heidelberg.
Klugman, S. A., Panjer, H. H., Willmot, G. E., 2012. Loss Models: From Data to Decisions.
Wiley.
Ohlsson, E., Johansson, B., 2010. Non-Life Insurance Pricing with Generalized Linear Models.
Springer-Verlag Berlin Heidelberg.
Wuthrich, M. V., 2014. Non-life insurance: Mathematics & statistics. Lecture note, RiskLab,
ETH Zurich; Swiss Finance Institute, available at SSRN: http://ssrn.com/abstract=2319328.