Lecture 1 Quant
Marco Avellaneda
G63.2936.001
-- Projects will deal with real data. They will involve programming
and quantitative financial analysis as well as your
contribution to and interpretation of the theory presented.
-- The grade will be based on the three projects and on class participation.
Consider a stock (e.g., IBM). The return R over a specified period is the
change in price, plus dividend payments, divided by the initial price.
$$R_t = \frac{S_{t+\Delta t} - S_t + D_{t,\,t+\Delta t}}{S_t}$$
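As a minimal sketch of the return definition above, the following function computes the change in price plus dividends, divided by the initial price. The function name `simple_return` and the numbers used are hypothetical, chosen only for illustration.

```python
def simple_return(price_start, price_end, dividends=0.0):
    """Return over one period: (S_{t+dt} - S_t + D_{t,t+dt}) / S_t."""
    if price_start <= 0:
        raise ValueError("initial price must be positive")
    return (price_end - price_start + dividends) / price_start


# Hypothetical numbers, not market data: price moves from 100 to 103
# and a dividend of 1 is paid during the period.
r = simple_return(100.0, 103.0, dividends=1.0)
print(r)
```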
$$R = \sum_{j=1}^{N_f} \beta_j F_j + \varepsilon$$
Explained, or systematic, portion: $\sum_{j=1}^{N_f} \beta_j F_j$
$$R = \beta F + \varepsilon, \qquad \mathrm{Cov}(F, \varepsilon) = 0$$
$$R = \sum_{j=1}^{N_f} \beta_j F_j + \varepsilon, \qquad \mathrm{Corr}(F_j, \varepsilon) = 0$$
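The factor decomposition can be illustrated with an ordinary least-squares regression on synthetic data. This is only a sketch: the factor returns, the loadings `beta_true`, and the noise level are made-up assumptions, and least squares is one possible way to estimate the betas, not necessarily the one used in the course.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_factors = 500, 3

F = rng.normal(size=(T, n_factors))        # synthetic factor returns
beta_true = np.array([1.2, -0.5, 0.3])     # assumed loadings (illustrative)
eps = 0.1 * rng.normal(size=T)             # idiosyncratic residual
R = F @ beta_true + eps                    # R = sum_j beta_j F_j + eps

# Least-squares fit with an intercept column.
X = np.column_stack([np.ones(T), F])
coef, *_ = np.linalg.lstsq(X, R, rcond=None)
beta_hat = coef[1:]
resid = R - X @ coef

# Sample correlation between each factor and the fitted residual.
corr = [np.corrcoef(F[:, j], resid)[0, 1] for j in range(n_factors)]
print(beta_hat, corr)
```

By construction of least squares with an intercept, the fitted residual has zero sample correlation with every factor, which mirrors the condition Corr(F_j, ε) = 0.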
Counter-arguments: (i) How do we actually define the factors? (ii) Is the number
of factors known? (iii) The structure of the stock market and risk-premia
vary strongly (think pre & post WWW) (iv) The issue of correlation of residuals
is intimately related to the number of factors.
Factor decomposition in practice
-- Putting aside normative theories (how stocks should behave), factor
analysis can be quite useful in practice.
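$$\sigma_i^2 = \frac{1}{T-1}\sum_{t=1}^{T}\left(R_{it} - \bar{R}_i\right)^2, \qquad \bar{R}_i = \frac{1}{T}\sum_{t=1}^{T} R_{it}$$
$$Y_{it} = \frac{R_{it}}{\sigma_i}$$
$$\Gamma_{ij} = \frac{1}{T-1}\sum_{t=1}^{T} Y_{it}Y_{jt}$$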
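$$C_{ij} = \frac{1}{T-1}\sum_{t=1}^{T}\left(R_{it} - \bar{R}_i\right)\left(R_{jt} - \bar{R}_j\right) + \gamma\,\delta_{ij}, \qquad \gamma = 10^{-9}$$
$$\Gamma^{\mathrm{reg}}_{ij} = \frac{C_{ij}}{\sqrt{C_{ii}\,C_{jj}}}$$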
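A minimal sketch of the regularized correlation matrix follows, assuming the returns are stored as a (T, N) array with one column per stock. The function name `regularized_correlation` and the synthetic data are hypothetical. The constant-return column illustrates one plausible role of the small γ term (keeping the normalization well defined when a sample variance is zero); the slides do not state the motivation, so treat that reading as an assumption.

```python
import numpy as np


def regularized_correlation(returns, gamma=1e-9):
    """Correlation matrix from C_ij + gamma * delta_ij; returns has shape (T, N)."""
    T = returns.shape[0]
    demeaned = returns - returns.mean(axis=0)
    C = demeaned.T @ demeaned / (T - 1)     # sample covariance
    C = C + gamma * np.eye(C.shape[0])      # add gamma on the diagonal
    d = np.sqrt(np.diag(C))
    return C / np.outer(d, d)               # C_ij / sqrt(C_ii C_jj)


rng = np.random.default_rng(1)
sample = rng.normal(scale=0.01, size=(252, 5))   # synthetic daily returns
sample[:, 4] = 0.0                               # a stock with zero sample variance
gamma_reg = regularized_correlation(sample)
print(np.round(gamma_reg, 3))
```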
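$$V^{(j)} = \left(V_1^{(j)}, V_2^{(j)}, \ldots, V_N^{(j)}\right), \quad j = 1, 2, \ldots, N \quad \text{(eigenvectors)}$$
$$F_{jt} = \sum_{i=1}^{N} V_i^{(j)}\,Y_{it} = \sum_{i=1}^{N} \frac{V_i^{(j)}}{\sigma_i}\,R_{it} \quad \text{(returns of ``eigenportfolios'')}$$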
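The eigenportfolio construction can be sketched in a few lines of NumPy. The data here are synthetic (a common "market" component plus noise), and the variable names are illustrative assumptions; only `np.corrcoef` and `np.linalg.eigh` are relied on.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 252, 20
market = rng.normal(scale=0.01, size=(T, 1))
R = market + rng.normal(scale=0.01, size=(T, N))   # synthetic returns, shape (T, N)

sigma = R.std(axis=0, ddof=1)
Y = R / sigma                                      # Y_it = R_it / sigma_i
corr = np.corrcoef(R, rowvar=False)                # empirical correlation matrix

eigvals, eigvecs = np.linalg.eigh(corr)            # eigh returns ascending order
order = np.argsort(eigvals)[::-1]                  # re-sort, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

F = Y @ eigvecs                                    # F[t, j] = sum_i V_i^(j) Y_it
weights = eigvecs / sigma[:, None]                 # weights[i, j] = V_i^(j) / sigma_i

print(eigvals[:5] / N)                             # share of variance per eigenvalue
```

Column j of `weights` gives the amounts invested in each stock for the j-th eigenportfolio, so `R @ weights` reproduces `F`.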
N~1400 stocks
T=252 days
[Figure: Top 50 eigenvalues for S&P 500 index components, May 1 2007, T=252. Vertical axis: 0% to 30%; horizontal axis: eigenvalue rank, 1 to 50.]
Model Selection Problem:
How many EV are significant?
Need to estimate the significant eigenportfolios which can be used as factors.
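$$\langle R_i R_j \rangle = C_{ij} = \sigma_i \sigma_j \sum_{k=1}^{N} \lambda_k V_i^{(k)} V_j^{(k)}$$
$$F_k \equiv \sum_{i=1}^{N} \frac{V_i^{(k)}}{\sigma_i} R_i, \qquad \tilde{F}_k \equiv \frac{1}{\sqrt{\lambda_k}} \sum_{i=1}^{N} \frac{V_i^{(k)}}{\sigma_i} R_i$$
$$\langle F_k^2 \rangle = \lambda_k, \qquad \langle \tilde{F}_k^2 \rangle = 1, \qquad \langle \tilde{F}_k \tilde{F}_{k'} \rangle = \delta_{kk'}$$
$$R_i = \sum_k \beta_{ik} \tilde{F}_k \;\Rightarrow\; \beta_{ik} = \sigma_i \sqrt{\lambda_k}\, V_i^{(k)}$$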
Karhunen-Loeve Decomposition
Since most eigenvalues vanish or are very small in a real system, the modeling
consists of defining a small number of factors and attributing the rest to "noise".
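To make the decomposition concrete, the sketch below builds unit-variance factors from the eigenvectors of the correlation matrix, checks that all N factors reproduce the returns exactly, and then keeps only the first m factors, leaving the remainder as noise. The data are synthetic and the choice m = 2 is arbitrary; both are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, m = 252, 10, 2
R = rng.normal(scale=0.01, size=(T, 1)) + rng.normal(scale=0.01, size=(T, N))
R = R - R.mean(axis=0)                       # demeaned synthetic returns
sigma = R.std(axis=0, ddof=1)

lam, V = np.linalg.eigh(np.corrcoef(R, rowvar=False))
lam, V = lam[::-1], V[:, ::-1]               # largest eigenvalue first

F_tilde = (R / sigma) @ V / np.sqrt(lam)     # normalized factors, unit variance
beta = sigma[:, None] * np.sqrt(lam) * V     # beta[i, k] = sigma_i sqrt(lam_k) V_i^(k)

R_full = F_tilde @ beta.T                    # all N factors: exact reconstruction
R_m = F_tilde[:, :m] @ beta[:, :m].T         # first m factors only
noise = R - R_m                              # part attributed to noise

noise_var = noise.var(axis=0, ddof=1).sum()  # total variance left unexplained
tail_var = (beta[:, m:] ** 2).sum()          # sum over k > m of lam_k * sum_i sigma_i^2 V_ik^2
print(noise_var, tail_var)
```

The two printed numbers agree because the normalized factors are uncorrelated with unit variance, so the variance left in the noise is the sum of the squared loadings on the discarded factors.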
Bai and Ng 2002, Econometrica
$$I(m) = \min_{\beta}\; \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T} \left( R_{it} - \sum_{k=1}^{m} \beta_{ik} F_{kt} \right)^2$$
$$m^* = \arg\min_{m}\, \bigl( I(m) + m \cdot g(N,T) \bigr)$$
N 2 (k ) 2
( )
N N
J (m ) = ∑λ k also, I (m ) = ∑ λk ∑ σ i Vi
k = m +1 k = m +1 i =1
N
m = arg min ∑ λk + mg (N , T ) )
*
Linear penalty function
m k = m +1
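The selection rule with a linear penalty can be sketched as a direct search over m. In this illustration the penalty g(N,T) is replaced by a plain constant `g`, and the spectrum is a hypothetical list of eigenvalues, not S&P 500 data.

```python
import numpy as np


def select_num_factors(eigvals, g):
    """Return (m_star, objective) for objective[m] = sum_{k>m} lam_k + m * g."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending
    N = lam.size
    # tail[m] = sum of the N - m smallest eigenvalues, for m = 0, ..., N
    tail = np.concatenate([np.cumsum(lam[::-1])[::-1], [0.0]])
    objective = tail + g * np.arange(N + 1)
    return int(np.argmin(objective)), objective


lam = [5.0, 2.5, 1.2, 0.6, 0.4, 0.2, 0.1]   # hypothetical spectrum
m_star, obj = select_num_factors(lam, g=1.0)
print(m_star)  # 3
```

With g = 1 the search keeps the three eigenvalues larger than the penalty; a larger g keeps fewer factors.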
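$$\frac{1}{N} \sum_{k=1}^{m} \lambda_k = \text{explained variance by first } m \text{ eigenvectors}$$
$$\frac{1}{N} \sum_{k=m+1}^{N} \lambda_k = \text{tail}$$
$$\frac{1}{N} \sum_{k=m+1}^{N} \lambda_k + \frac{m}{N}\, g = \text{objective function} = U(m, g)$$
$$\text{Convexity} = \frac{\partial^2 U(m^*(g), g)}{\partial g^2}$$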
[Figure: Objective function U(m,g), surface plot over m and g. Vertical axis: U, 0% to 150%.]
[Figure: Optimal value of U(m,g) for different g. Vertical axis: U(m*(g),g), about 0.3 to 0.8; horizontal axis: g, 1 to 13.]
Implementation of Bai & Ng on SP500 Data

 g    m*   Lambda_m*   Explained Variance   Tail     Objective Function   Convexity
 1   117   0.20%       87.88%               12.12%   0.355                -
 2    59   0.39%       71.44%               28.56%   0.522                -0.085085
 3    29   0.59%       57.11%               42.89%   0.603                -0.041266
 4    16   0.76%       48.51%               51.49%   0.643                -0.018110
 5    10   0.96%       43.52%               56.48%   0.665                -0.007000
 6     7   1.18%       40.43%               59.57%   0.680                -0.003096
 7     6   1.22%       39.25%               60.75%   0.691                -0.004872
 8     4   1.56%       36.56%               63.44%   0.698                 0.001069
 9     4   1.56%       36.56%               63.44%   0.706                 0.000000
10     4   1.56%       36.56%               63.44%   0.714                 0.000000
11     4   1.56%       36.56%               63.44%   0.722                 0.000000
12     4   1.56%       36.56%               63.44%   0.730                 0.000000
13     4   1.56%       36.56%               63.44%   0.738                -
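The columns of the table can be cross-checked against the definition U(m,g) = tail + (m/N) g. The sketch below copies m*, the tail, the objective, and the convexity from the table and assumes N = 500, which is not stated in the table but is the value for which the numbers line up. The convexity is taken as the second difference of the objective in g with unit step, which is also an assumption about how that column was produced.

```python
import numpy as np

N = 500  # assumption: number of index components, not stated in the table
g = np.arange(1, 14)
m_star = np.array([117, 59, 29, 16, 10, 7, 6, 4, 4, 4, 4, 4, 4])
tail = np.array([12.12, 28.56, 42.89, 51.49, 56.48, 59.57, 60.75,
                 63.44, 63.44, 63.44, 63.44, 63.44, 63.44]) / 100
objective_table = np.array([0.355, 0.522, 0.603, 0.643, 0.665, 0.680, 0.691,
                            0.698, 0.706, 0.714, 0.722, 0.730, 0.738])
convexity_table = np.array([-0.085085, -0.041266, -0.018110, -0.007000,
                            -0.003096, -0.004872, 0.001069, 0.0, 0.0, 0.0, 0.0])

# U(m*, g) = tail + (m*/N) * g
objective = tail + g * m_star / N

# Second difference in g (unit step), defined for g = 2, ..., 12.
convexity = objective[:-2] - 2 * objective[1:-1] + objective[2:]

print(np.abs(objective - objective_table).max())
print(np.abs(convexity - convexity_table).max())
```

Both discrepancies stay at the level of the rounding in the table, which supports reading the last column as a discrete second derivative of the optimal objective.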
If we choose the cutoff m* as the one for which the sensitivity to g is zero, then
m*~5 seems appropriate.
This would lead to the conclusion that the S&P 500 corresponds to a 5-factor model.
The number is small in relation to industry sectors and to the amount of variance
explained by industry factors.
The density of states: a useful formalism
Spectral theory as seen by physicists – origins in Quantum Mechanics and
High Energy Physics.
$$F(E) \equiv \frac{\#\{k : \lambda_k / N \le E\}}{N}, \qquad F(E) \text{ is increasing}, \quad F(1) = 1$$
$$f(E) = F'(E) = \frac{1}{N} \sum_{k} \delta\!\left(E - \frac{\lambda_k}{N}\right)$$
f(E): D.O.E.
One way to think about the DOE is as exchanging the x-axis and the y-axis, i.e.,
counting the number of eigenvalues in a neighborhood of any E, 0 < E < 1.
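A minimal sketch of the cumulative count F(E) for a given spectrum follows. The four eigenvalues are hypothetical (they sum to N = 4, as the eigenvalues of a correlation matrix would), and the function name `integrated_density` is an illustrative choice.

```python
import numpy as np


def integrated_density(eigvals, E):
    """F(E) = #{k : lam_k / N <= E} / N."""
    lam = np.asarray(eigvals, dtype=float)
    N = lam.size
    return np.count_nonzero(lam / N <= E) / N


lam = np.array([2.0, 1.0, 0.6, 0.4])   # hypothetical spectrum, sums to N = 4
for E in (0.05, 0.2, 0.3, 1.0):
    print(E, integrated_density(lam, E))
```

Here the scaled eigenvalues are 0.5, 0.25, 0.15 and 0.1, so F steps up by 1/4 each time E passes one of them and reaches 1 at E = 1.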
[Figure: F(E) plotted against E, for E from 0.00 to 1.00.]
In the DOE language…
$$\frac{1}{N} \sum_{k=m+1}^{N} \lambda_k = \int_0^{\lambda_m} E\, f(E)\, dE, \qquad \frac{m}{N} = 1 - F(\lambda_m)$$
$$U(E, g) = \int_0^{E} x\, f(x)\, dx + g\,\bigl(1 - F(E)\bigr)$$
$$\frac{\partial U(E, g)}{\partial E} = E\, f(E) - g\, f(E) = (E - g)\, f(E)$$
If $f(g) \neq 0$, then $E^*(g) = g$.
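In discrete terms, and setting aside the 1/N normalization, the condition E*(g) = g says that eigenvalue k is kept as a factor exactly when λ_k exceeds g. The sketch below compares that threshold rule with a brute-force minimization of the penalized tail sum on a synthetic spectrum; the exponential draws and the values of g are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
lam = np.sort(rng.exponential(size=50))[::-1]   # synthetic spectrum, descending


def brute_force_m(lam, g):
    """Minimize sum_{k>m} lam_k + m * g over m = 0..N by direct search."""
    n = lam.size
    tails = np.array([lam[m:].sum() for m in range(n + 1)])
    return int(np.argmin(tails + g * np.arange(n + 1)))


# For each g: (g, brute-force optimum, number of eigenvalues above g).
checks = [(g, brute_force_m(lam, g), int(np.count_nonzero(lam > g)))
          for g in (0.1, 0.5, 1.0, 2.0)]
print(checks)
```

Adding one more factor changes the objective by g − λ_m, so the objective keeps decreasing as long as the next eigenvalue is larger than g, which is why the two counts coincide.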
Dependence of the problem on g
$$V(g) = U\bigl(E^*(g), g\bigr) = \int_0^{g} x\, f(x)\, dx + g\,\bigl(1 - F(g)\bigr)$$
$$= g F(g) - \int_0^{g} F(x)\, dx + g - g F(g)$$
$$= g - \int_0^{g} F(x)\, dx$$
$$V'(g) = 1 - F(g)$$
$$V''(g) = -f(g)$$
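The identity V'(g) = 1 − F(g) can be checked numerically on a small discrete spectrum, where f is a sum of point masses and the integral of x f(x) becomes a finite sum. The values `E_k` are hypothetical, and g is chosen away from them so that a central finite difference is meaningful.

```python
import numpy as np

E_k = np.array([0.5, 0.25, 0.15, 0.1])   # hypothetical values of lambda_k / N
N = E_k.size


def F(E):
    """Fraction of scaled eigenvalues at or below E."""
    return np.count_nonzero(E_k <= E) / N


def V(g):
    """U(E, g) at E = g: integral of x f(x) up to g, plus g * (1 - F(g))."""
    return E_k[E_k <= g].sum() / N + g * (1.0 - F(g))


g, h = 0.2, 0.01
slope = (V(g + h) - V(g - h)) / (2 * h)   # finite-difference estimate of V'(g)
print(slope, 1.0 - F(g))
```

Between two consecutive eigenvalues V is linear in g with slope 1 − F(g), and the slope drops by 1/N each time g crosses an eigenvalue, which is the discrete counterpart of V''(g) = −f(g).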
-- Industry sectors
-- Market capitalization