Guo Spring 08 P

The document discusses study strategies for exam P and provides shortcuts and insights for various probability distributions commonly seen on the exam.

The document covers topics such as study methods, calculator tips, probability concepts, discrete and continuous random variables, and various probability distributions.

The formula for the variance of a sum of two random variables X and Y is Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y). When X and Y are independent, Cov(X, Y) = 0 and the formula reduces to Var(X + Y) = Var(X) + Var(Y).
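The variance identity can be checked exactly on a small case. The sketch below assumes two independent fair six-sided dice, an illustrative choice that does not come from the book, and uses exact fractions so no rounding is involved. The names `expect`, `var_x`, `cov_xy`, and `var_sum` are hypothetical helpers for this illustration only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 36)  # joint probability of each (x, y) pair for two independent fair dice (assumed example)

def expect(g):
    """Return E[g(X, Y)] under the joint distribution of the two dice."""
    return sum(p * g(x, y) for x, y in product(faces, faces))

ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - ex) ** 2)
var_y = expect(lambda x, y: (y - ey) ** 2)
cov_xy = expect(lambda x, y: (x - ex) * (y - ey))
var_sum = expect(lambda x, y: (x + y - ex - ey) ** 2)

print(var_x, var_y, cov_xy, var_sum)
```

Because the dice are independent, the covariance term is zero and the variance of the sum equals the sum of the variances.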

http://guo.coursehost.com

Deeper Understanding, Faster Calculation: Exam P Insights & Shortcuts
10th Edition
by Yufeng Guo
For SOA Exam P/CAS Exam 1
Exam Dates Spring 2008 Edition

This electronic book is intended for individual buyer use for the sole purpose of preparing for Exam P. This book can NOT be resold to others or shared with others. No part of this publication may be reproduced for resale or multiple copy distribution without the express written permission of the author.

© 2007, 2008 By Yufeng Guo

Yufeng Guo, Deeper Understanding: Exam P

Table of Contents

Chapter 1 Exam-taking and study strategy
  Top Horse, Middle Horse, Weak Horse
  Truths about Exam P
  Why good candidates fail
  Recommended study method
  CBT (computer-based testing) and its implications
Chapter 2 Doing calculations 100% correct 100% of the time
  What calculators to use for Exam P
  Critical calculator tips
  Comparison of 3 best calculators
Chapter 3 Set, sample space, probability models
Chapter 4 Multiplication/addition rule, counting problems
Chapter 5 Probability laws and "whodunit"
Chapter 6 Conditional Probability
Chapter 7 Bayes' theorem and posterior probabilities
Chapter 8 Random variables
  Discrete random variable vs. continuous random variable
  Probability mass function
  Cumulative probability function (CDF)
  PDF and CDF for continuous random variables
  Properties of CDF
  Mean and variance of a random variable
  Mean of a function
Chapter 9 Independence
Chapter 10 Percentile, mean, median, mode, moment
Chapter 11 Find E(X), Var(X), E(X | Y), Var(X | Y)
Chapter 12 Bernoulli distribution
Chapter 13 Binomial distribution
Chapter 14 Geometric distribution
Chapter 15 Negative binomial
Chapter 16 Hypergeometric distribution
Chapter 17 Uniform distribution
Chapter 18 Exponential distribution
Chapter 19 Poisson distribution
Chapter 20 Gamma distribution
Chapter 21 Beta distribution
Chapter 22 Weibull distribution
Chapter 23 Pareto distribution
Chapter 24 Normal distribution
Chapter 25 Lognormal distribution
Chapter 26 Chi-square distribution
Chapter 27 Bivariate normal distribution
Chapter 28 Joint density and double integration
Chapter 29 Marginal/conditional density
Chapter 30 Transformation: CDF, PDF, and Jacobian Method
Chapter 31 Univariate & joint order statistics
Chapter 32 Double expectation
Chapter 33 Moment generating function
  14 Key MGF formulas you must memorize
Chapter 34 Joint moment generating function
Chapter 35 Markov's inequality, Chebyshev inequality
Chapter 36 Study Note "Risk and Insurance" explained
  Deductible, benefit limit
  Coinsurance
  The effect of inflation on loss and claim payment
  Mixture of distributions
  Coefficient of variation
  Normal approximation
  Security loading
Chapter 37 On becoming an actuary
Guo's Mock Exam
Solution to Guo's Mock Exam
Final tips on taking Exam P
About the author
Value of this PDF study manual
User review of Mr. Guo's P Manual

Chapter 1 Exam-taking and study strategy

Top Horse, Middle Horse, Weak Horse

Sun Bin was a military strategy genius in ancient China who wrote the book Art of War. He advised General Tian, who served the Prince of Qi. As a hobby, General Tian raced and bet on horses with the Prince of Qi. The race consisted of three rounds. Whoever won two or more rounds won the bet. Whoever won fewer of the three rounds lost the race and the money he had bet on it.

For many years, General Tian could not win the horse race. General Tian always raced his best horse against Prince Qi's best horse in the first round. Because Prince Qi had more money and could buy the finest horses in the country, his best horse was always faster than General Tian's best horse. As a result, General Tian would lose the first round. Then in the second round, General Tian raced his second best horse against Prince Qi's second best horse. Once again, Prince Qi's middle horse was better than General Tian's middle horse. General Tian would lose the second round of racing. Usually General Tian could only win the third round. Prince Qi generally won the first two rounds and General Tian won the final round. The Prince won the overall race and collected the bet.

After Sun Bin was hired as the chief military strategist for General Tian, he suggested the General should try a different approach. According to Sun Bin's advice, in the next race General Tian pitted his worst horse against Prince Qi's best horse. Although the General lost the first round, he could now race his best horse against Prince Qi's second best or middle horse. General Tian's best horse easily defeated the Prince's slower middle horse. The General's middle horse swept past the Prince's worst horse in the third round and clinched the overall victory and winner's purse at last for General Tian.

Moral of the story:

1. The goal of combating a formidable enemy is to win the war, not to win each individual battle.
The goal in taking Exam P is to pass the exam, not to get every individual problem correct.

2. Avoid an all-out fight with the opponent on the front where you are weak and the opponent is strong. Surrender to preserve your resources when the opponent's strength outweighs your own. When you encounter the most difficult (surprise) problems on the exam, make an intelligent guess if you must, but don't linger over them. Move on to those (repeatable) problems for which you have a solution framework and which you will most likely solve successfully.

Truths about Exam P

1. Few employers will take your resume seriously without knowing that you have at least passed Exam P. If you are considering becoming an actuary, take Exam P as early as you feel reasonably prepared. Exam P is the entrance exam to the actuarial profession. If you fail but are still interested in an actuarial career, take the exam again. The sooner you pass Exam P, the more competitive you are in your job search. You might even want to pass Course FM to improve your chances of landing an entry-level job.

2. In addition to helping you land a job in the actuarial profession, taking Exam P can also help you objectively assess whether you are up to the pressure of actuarial exams. To move up in the actuarial profession from an apprentice to a fellow, a candidate needs to pass a total of eight exams, beginning with Exam P. Preparing for each exam often requires hundreds of lonely study hours. Taking Exam P will give you a taste of the arduous exam preparation process and help you decide whether you really want to become an actuary. Each year, as new people enter the actuarial profession, some actuaries are leaving the profession and finding new careers elsewhere. One of the reasons that people leave the actuarial profession is the intense and demanding exam experience. Many intelligent and hard-working people become actuaries, only to find that the exam process is unbearable: too much time away from social activities, too stressful to prepare for an exam, and too painful to fail and re-take an exam.

3. Accept the fact that most likely you won't be 100% ready for Exam P. The scope of study is enormous and SOA has the right to test any randomly chosen concepts from the syllabus. To correctly solve every tested problem, you would have to spend your lifetime preparing for the exam.
Accept that you won't be able to solve all the tested problems in an exam. And you certainly do NOT have to solve all the problems correctly to pass the exam.

4. Your goal should be to pass the exam quickly so you can look for a job, not to score a 10. Most employers look for potential employees who can pass the exam. They don't care whether you got a 6 or a 10.

5. Exam P problems fall into two categories: repeatable problems (easy) and surprise problems (tough). Repeatable problems are those tested repeatedly in the past. Examples include finding posterior probabilities, calculating the mean and variance of a random variable, applying exponential distributions, or applying the Poisson distribution to a word problem. These problems are the easiest in the exam because you can master them in advance by solving previous Course 1 or P exams. Surprise problems, on the other hand, are brand new types of problems unseen in previous Course 1 and P exams. For example, the May 2005 Exam P had one question about Chebyshev's inequality. This was a brand new problem never tested in the past. For the majority of exam candidates, this problem was a complete surprise. Unless you happened to study Chebyshev's inequality when you prepared for Exam P, there's no way you could solve it in the heat of the exam. Surprise problems are designed to make the exam as a whole more challenging and interesting.

6. Strive to get 100% correct on repeatable problems. Most problems on Exam P are repeatable problems. If you can solve these routine problems 100% correctly, you have a solid foundation for a passing score.

7. When facing a surprise problem, allow yourself one or two minutes to try to figure it out. If you still can't solve it, just smile, guess, and move on. The rule of thumb is that if you can't solve a problem in three minutes, you won't be able to solve it in a reasonable amount of time that will leave you time to finish the repeatable (and easier) problems on the exam. When facing a hopeless problem, never linger. Move on to the next problem.

8. During the exam, regurgitate; do not attempt to invent. You have on average three minutes per problem on the exam. Three minutes is like the blink of an eye in the heat of the exam. In three minutes, most people can, at best, only regurgitate solutions to familiar problems. Even regurgitating solutions to familiar problems can be challenging in the heat of the exam! Most likely, you cannot invent a fresh solution to a previously unseen type of problem. Inventing a solution requires too much thinking and too much time. In fact, if you find yourself having to think too much in the exam, chances are that you may have to take Exam P again.

9. Be a master of regurgitation.
Before the exam, solve and re-solve Sample P problems and any newly released P exams (if any) until you get them 100% right under exam-like conditions. Build a 3-minute solution script for each of the previously tested P problems. This helps you solve all of the repeatable problems in Exam P, setting a solid foundation for passing Exam P.

Why a 3-minute solution script is critical

Creating a script is a common formula for handling many challenging problems. For example, suppose you must speak for fifteen minutes at an important meeting. Realizing that you are not particularly good at making an impromptu speech, you write and rehearse a script as a rough guide to what you want to say in the meeting. Scripts are effective because they change the daunting task of doing something special on the spur of the moment into a routine task of doing something ordinary.

Mental scripts are needed for Exam P (and higher level exams). Even most of the repeatable problems in an exam require you to have a thorough understanding of the complex, often unintuitive, underlying concepts. Many tested questions also require you to do intensive calculations with a calculator. A single error in the calculation will lead to a wrong answer. The mastery of complex concepts and the ability to do error-free calculations can't take place in three minutes. They need to be learned, practiced, and memorized prior to the exam.

After you have developed a 3-minute mental script for the repeatable problems for Exam P, solving a problem at exam time is simply a reactivation of the preprogrammed conceptual thinking and calculation sequences. Your 3-minute script will enable you to solve a repeatable type of problem in three minutes.

Remember that there's no such thing as outperforming yourself in the exam. You always under-perform and score less than what your knowledge and ability deserve. This is largely due to the tremendous amount of pressure you inevitably feel in the exam. If you don't have a script ready for an exam problem, don't count on solving the problem on the spur of the moment.

How to build a 3-minute solution script:

1. Simplify the solution, from concepts to calculations, into a 3-minute repeatable process. Just as fast food restaurants can deliver hot hamburgers and French fries into customers' hands in a couple of minutes, you'll need to deliver a solution to an exam question in three minutes.

2. Think simple. Just as a fast food restaurant uses picture menus to allow customers to intuitively see menu options and quickly make a choice, you must convert the textbook version of a concept (often highly complex) into a quick and simple equivalent which allows access to the core of a problem.
For example, sample space is a complex and unintuitive concept used by pure mathematicians to rigorously define classical probability. If you find sample space a difficult concept, use your own words to define it ("sample space is all possible outcomes").

3. Calculate fast. Conceptually solving a problem is only half the battle. Unless you know how to press your calculator keys to get the final answer, you won't score a point. So for every repeatable problem in the exam, you need to know the precise calculator key sequences for all the calculations involved in that problem. Many people with sufficient knowledge of Exam P concepts will flounder largely because of slow or sloppy calculations.

Why good candidates fail

At each exam sitting, many good candidates who studied hard for Exam P fail. This unsatisfactory result delays their entry into the actuarial profession. Candidates fail for several major reasons:

1. Memorizing formulas without gaining a sophisticated understanding of the core concepts. Exam P appears formula-driven. It seems that as long as you have memorized the myriad of formulas, you should easily pass the exam. Such erroneous thinking has misled many candidates into an exam disaster. Those who have memorized formulas without understanding the nuances and substance of core concepts walk into the exam room, only to find that subtly designed exam problems render many of their memorized formulas useless.

2. Reading too much, but solving too few practice problems (especially previous SOA Course 1 and Sample P problems). Studying for Exam P is like learning how to swim. While reading a book on how to swim is helpful, ultimately you need to immerse yourself in the water and discover how to swim rather than sink. To pass Exam P, you need to do more than read about core concepts. You must immerse yourself in Sample Exam P.

3. Doing busy work instead of learning, progressing, and excelling. If solving too few practice problems can lead to exam failure, the opposite extreme can be true too. A hard-working exam candidate can literally solve hundreds of practice problems (even SOA problems) without learning much and fail the exam miserably. Let's see how an untrained swimmer ultimately becomes a world champion. In the beginning the swimmer has many bad swimming habits such as uncoordinated body movements and poor breathing. To become a better athlete he must gradually shed his bad habits and relearn correct swimming techniques from the ground up, perhaps from a good coach. Gradually, he becomes a more graceful, more coordinated, and more efficient swimmer.
After perfecting his swimming skills for many years, his technique propels him toward the level of stamina and skill which will enable him to compete for and win a gold medal. Now imagine another hard-working swimmer who practices twice as much as the first swimmer but who doesn't bother changing his poor swimming habits. He merely swims the way he already knows how to swim. The more he practices swimming, the more entrenched his poor swimming habits become. Years go by and he is still a bad swimmer. When doing practice problems for Exam P, focus on shedding your poor habits such as lengthy thinking or error-prone calculation. Focus on gaining conceptual insights and learning efficient computation skills. Learn how to solve problems in a systematic and streamlined fashion. Otherwise, solving hundreds of practice problems only reinforces your bad habits and you'll make the same mistakes over and over.

4. Leisure strong, exam weak. Some candidates are excellent at solving problems leisurely without a time constraint, but are unable to solve a problem in 3 minutes during the heat of an exam. The major weakness is that thinking is too lengthy or calculation is too slow. If you are such a candidate, practice previous SOA exams under a strict time constraint to learn how to cut to the chase.

5. Concept strong, calculation weak. Some candidates are good at diagnosing problems and identifying formulas. However, when it's time to plug the data into a calculator, they make too many mistakes. If you are such a candidate, refer to Chapter Two of this book to learn how to do calculations 100% right 100% of the time.

Recommended study method

1. Sense before you study. For any SOA or CAS exam (Exam P or above), always carefully scrutinize one or two of the latest exams to get a feel for the exam style before opening any textbooks. This prevents you from wasting time trying to master the wrong thing. For example, if you look at any previous Sample P problems, you'll find that SOA never asked candidates to define a sample space, a set, or a Poisson distribution. Trying to memorize the precise, rigorous definition of any concept is a complete waste of time for Exam P. SOA is more interested in testing your understanding of a concept than in testing your memorization of the definition of a concept.

2. Quickly go over some textbooks and study the fundamentals (the core concepts and formulas). Don't attempt to master the complex problems in the textbooks. Solve some basic problems to enhance your understanding of the core concepts.

3. Focus on applications of theories, not on pure theories.
One key difference between the probability problems in Exam P and the probability problems tested in a college statistics class is that SOA problems are oriented toward solving actual problems in the context of measuring and managing insurance risks, while a college exam in a statistics class is often oriented toward pure theory. When you learn a new concept in Exam P readings, always ask yourself, "What's the use of such a concept in insurance? How can actuaries use this concept to measure or manage risks?" For example, if you took a statistics class in college, you most likely learned some theories on normal distribution. What you probably didn't learn was why normal distribution is useful for insurance. Normal distribution is a common tool for actuaries to model aggregate loss. If you have n independent identically distributed losses x_1, x_2, ..., x_n, then the sum of these losses S_n = x_1 + x_2 + ... + x_n is approximately normally distributed when n is not too small, no matter whether an individual loss x is normally distributed or not. Because S_n is approximately normally distributed, actuaries can use a normal table and easily find the probability of S_n exceeding a huge loss, such as Pr(S_n > $10,000,000).

4. Focus more on learning the common sense behind a complex theorem or a difficult formula. Focus less on learning how to rigorously prove the theorem or derive the formula. One common pitfall of using college textbooks to prepare for Exam P is that college textbooks are often written in the language of pure mathematical statistics. These textbooks tend to place a paramount emphasis on setting up axioms and then rigorously proving a theorem and deriving a formula. Though scholastically interesting, such a purely theoretical approach is unproductive in preparing for SOA and CAS exams.

5. Master the Sample Exam P and any newly released Exam P (if any). SOA exam problems are the best practice problems. Work and rework each problem until you have mastered it.

CBT (computer-based testing) and its implications

1. There are more exam sittings in 2006 and beyond. 2006 has four sittings for Exam P. If a candidate fails Exam P in one sitting, he can take another exam several months later.

2. Most likely, SOA won't release any CBT Exam P. The old Course 1 exams you can download from the SOA website are all you can get. Unless SOA changes its mind, you won't see any Exam P released in the near future.

3. CBT contains a few pilot questions that are not graded. You don't know which problem is a pilot and which is not. Pilot questions won't add anything to your grade, even if you have solved them 100% right.

4. When taking CBT, learn to tolerate imperfections of CBT and have a peaceful frame of mind. Your assigned CBT center may be several hours away from your home. Be sure to check your CBT center out several days before the exam.

Expect to have computer monitor freezing problems. Many candidates reported that their computer monitors froze up, from time to time, for a few seconds. Learn to cope with a dim and tiny table at your CBT center. You may be assigned to such a table at your CBT center. Learn to tolerate noise in the exam room.

5. Check out the official CBT demo from the SOA website. Get comfortable with CBT. Learn how to navigate from one problem to the next, and how to mark and unmark a problem.
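The normal approximation for aggregate loss mentioned under the recommended study method in this chapter can be sketched numerically. The figures below are assumptions made only for illustration and are not from the book: 100 independent losses, each exponentially distributed with mean $1,000 (so the standard deviation of each loss is also $1,000), and a threshold of $120,000. The sketch uses Python's standard-library `statistics.NormalDist` in place of a printed normal table.

```python
import math
from statistics import NormalDist

# Hypothetical inputs chosen for illustration only
n = 100                # number of independent, identically distributed losses
mean_loss = 1_000.0    # mean of one loss
sd_loss = 1_000.0      # standard deviation of one loss (equals the mean for an exponential loss)
threshold = 120_000.0  # aggregate loss level of interest

# Mean and standard deviation of the sum S_n of n independent losses
mean_total = n * mean_loss
sd_total = math.sqrt(n) * sd_loss

# Standardize the threshold and read the tail probability from the standard normal CDF
z = (threshold - mean_total) / sd_total
prob_exceed = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, Pr(S_n > {threshold:,.0f}) is approximately {prob_exceed:.4f}")
```

The same three steps (mean of the sum, standard deviation of the sum, standardize and look up) are what a candidate would carry out by hand with a normal table.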

Chapter 2 Doing calculations 100% correct 100% of the time

What calculators to use for Exam P

SOA/CAS approved calculators: BA-35, BA II Plus, BA II Plus Professional, TI-30X, TI-30Xa, TI-30X II (IIS solar or IIB battery).

Best calculators for Exam P: BA II Plus, BA II Plus Professional, TI-30X IIS.

I recommend you buy two calculators: (1) TI-30X IIS (because it's solar-powered, you don't need to worry about it running out of battery), (2) either BA II Plus or BA II Plus Professional. TI-30X IIS costs about $15. BA II Plus costs about $30. BA II Plus Professional costs from $48 to $70. If you already have BA II Plus, you don't need the more expensive BA II Plus Professional for Exam P. However, you will need to buy BA II Plus Professional for Exam FM.

BA II Plus Professional is Texas Instruments' new calculator, selling for $48 to $70 in retail stores. SOA just added BA II Plus Professional as one of the approved calculators for the 2005 exams. BA II Plus Professional can do everything BA II Plus does. In addition, BA II Plus Professional has new features currently lacking in BA II Plus. One new feature in BA II Plus Professional is the modified duration calculation. Because the modified duration is on the Exam FM syllabus, you'll need a BA II Plus Professional for Exam FM.

You can buy BA II Plus Professional at amazon.com for less than $50 (typically with free shipping). This price includes a 1-year subscription to Money magazine. If you don't want the subscription, you can return a postage-paid card to Money and ask for a refund. The refund is about $14. So your net cost of BA II Plus Professional is about $36.

Be aware that when entering numbers into a BA II Plus Professional calculator, you have to press the keys a lot harder than you do when entering numbers on a BA II Plus or TI-30X IIS. In this respect I find the BA II Plus and TI-30X IIS easier to use than the BA II Plus Professional.
When you buy the BA II Plus (or BA II Plus Professional) and the TI-30X IIS, one tip is to choose one color for the BA II Plus (or Plus Professional) and a strikingly different color for the TI-30X IIS. For example, buy a black BA II Plus and a purple TI-30X IIS. This way, in the heat of the exam, you know exactly which calculator is which.

I recommend that you spend several hours running the calculation examples listed in the calculator manual (Texas Instruments calls it a "Guidebook") for BA II Plus, BA II Plus Professional, and TI-30X IIS. If your calculator didn't have a guidebook when you bought it, you can go to www.ti.com to download a guidebook.

If you are a borderline student, calculation skills make or break your exam. If you can do messy calculations 100% correct 100% of the time, every problem solved is a point earned. You might have a better chance of passing the exam than someone else who knows more than you about the subject but who makes mistakes here and there.

I recommend that you devote some of your precious study time toward mastering BA II Plus and TI-30X IIS. They are to you in the exam as a weapon is to a soldier in combat. Many candidates make the mistake of spending too much time learning concepts and too little time learning calculation skills.

The guidebooks for BA II Plus (or Professional) and TI-30X IIS are straightforward and easy to learn, so I don't want to repeat the guidebooks. I just want to highlight some of the most important things you need to know about BA II Plus (or Professional) and TI-30X IIS.

Because BA II Plus and BA II Plus Professional are equally good for Exam P, I'll only talk about BA II Plus. Please keep in mind that if BA II Plus can do a job, then BA II Plus Professional can do the same job with identical key strokes. So in the following discussion, if you see the term BA II Plus, it refers to both BA II Plus and BA II Plus Professional.

Contrasting the BA II Plus with TI-30X IIS

Each calculator has its strengths and weaknesses. The first strength of the BA II Plus is that it is faster than TI-30X IIS. Immediately after you enter the data into BA II Plus, it gives you the answer right away. There is no waiting time.
In contrast, the TI-30X IIS takes several seconds before it gives you the final answer, especially when you have a complex formula or have many numbers to insert into a formula.

Second, the BA II Plus often requires fewer key strokes than TI-30X IIS for the same calculation. For example, to calculate √3 = 1.73205081, you have two key strokes for the BA II Plus: 3 √x (which means first you press 3 and then press √x). However, the TI-30X IIS requires four key strokes: 2nd √ 3 = (you first press 2nd, then press √, then enter 3, and finally press =).

The third strength of BA II Plus is that it has separators but TI-30X IIS does not. For example, if you enter one million into BA II Plus, you'll see 1,000,000 (there are two separators "," in this figure to indicate the unit of 1000). However, if you enter one million into TI-30X IIS, you'll see 1000000. Lack of separators in TI-30X IIS increases your chances of entering a wrong number into the calculator or reading a wrong output from the calculator if the number entered or displayed is big.
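When practicing key sequences, it can help to have an independent reference value to compare against. The short sketch below is not part of the book; it simply uses Python's standard library to reproduce the square root of 3 to eight decimal places and to show one million with and without thousands separators.

```python
import math

root_three = math.sqrt(3)
root_three_display = f"{root_three:.8f}"   # rounded to eight decimal places

one_million = 1_000_000
with_separators = f"{one_million:,}"       # grouped in thousands, easier to read
without_separators = str(one_million)      # a bare run of digits, easier to misread

print(root_three_display)
print(with_separators)
print(without_separators)
```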

The fourth and greatest advantage of the BA II Plus is that it has more powerful statistics functions than the TI-30X IIS. In the one-variable statistics mode, you can enter any positive integer as the data frequency on the BA II Plus. In contrast, the maximum data frequency on the TI-30X IIS is 99. If the data frequency exceeds 99, you cannot use the TI-30X IIS to find the mean and variance. This severely limits the use of the statistics function on the TI-30X IIS.

Please note that the TI-30X IIS can accommodate up to 42 distinct values of a discrete random variable $X$. The BA II Plus Statistics Worksheet can accommodate up to 50 distinct values. The restriction of no more than 42 data pairs on the TI-30X IIS and no more than 50 on the BA II Plus is typically not a major concern for us; exam questions asking for the mean and variance of a discrete random variable most likely have fewer than a dozen distinct values of $X$ and $f(x)$, so that candidates can solve each problem in three minutes.

The fifth strength of the BA II Plus is that it has 10 memories labeled M0, M1, ..., M8, M9. In contrast, the TI-30X IIS has only 5 memories labeled A, B, C, D, and E.

Right now, you may wonder whether the TI-30X IIS has any value given its many disadvantages. The power of the TI-30X IIS lies in its ability to display the data and formula entered by the user. This "what you type is what you see" feature allows you to double check the accuracy of your data entry and of your formula. It also allows you to redo calculations with modified data or modified formulas.

For example, if you want to calculate $2e^{-2.5} - 1$, you will see the following display as you enter the data in the calculator:

2e^(-2.5)-1

To find the final result, press the Enter key and you will see:

2e^(-2.5)-1
-0.835830003

So $2e^{-2.5} - 1 = -0.835830003$.

After getting the result of -0.835830003, you realize that you made an error in your data entry.
Instead of calculating $2e^{-2.5} - 1$, you really wanted to calculate $2e^{-3.5} - 1$. To correct the data entry error, you simply change 2.5 to 3.5 on your TI-30X IIS. Now you should see:

2e^(-3.5)-1
-0.939605233
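The edit-and-rerun idea can be sketched in code: write the formula once, then re-evaluate it with the corrected input. This is only an illustration in Python for checking your practice work (it is obviously not available in the exam), and the function name `f` is made up here.

```python
import math

def f(x):
    """Evaluate 2*e^(-x) - 1, the expression keyed into the calculator above."""
    return 2 * math.exp(-x) - 1

# First the mistyped input, then the corrected one; the formula itself is reused.
print(f"{f(2.5):.9f}")  # compare with the calculator display -0.835830003
print(f"{f(3.5):.9f}")  # compare with the calculator display -0.939605233
```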

With this display feature, you can also reuse formulas. For example, a problem requires you to calculate $y = 2e^{x} - 1$ for $x_1 = 5$, $x_2 = 6$, and $x_3 = 7$. There is no need to do three separate calculations from scratch. You enter 2e^(5)-1 into the calculator to calculate $y$ when $x = 5$. Then you modify the formula to calculate $y$ when $x = 6$ and $x = 7$.

This redo-calculation feature of the TI-30X IIS is extremely useful for solving stock option pricing problems in Course 6 and Course 8V (8V means Course 8 for investment). Such problems require a candidate to use similar formulas to calculate a series of present values of a stock, with each calculated present value serving as the input for the next one. Such calculations are too time-consuming on the BA II Plus under exam conditions, when you have about 3 minutes to solve a problem. However, they are a simple matter for the TI-30X IIS.

Please note that the BA II Plus lets you see the data you entered, but only when you are using one of its built-in worksheets, such as TVM (time value of money), the Statistics Worksheet, or the % Worksheet. The BA II Plus does not display your data if you are NOT using its built-in worksheets. Unlike the TI-30X IIS, the BA II Plus never displays any formulas, whether you are using its built-in worksheets or not.

So which calculator should I use for Exam P, BA II Plus or TI-30X IIS?

Different people have different preferences. You might want to use both calculators when doing practice problems to find out which one is better for you. These are my suggestions:

Statistics calculations: If you need to calculate the mean and variance of a discrete random variable, use (1) the BA II Plus Statistics Worksheet, or (2) the TI-30X IIS for its ability to let you redo calculations with different formulas.
When using the TI-30X IIS, you can take advantage of the fact that the formula for the mean and the formula for the variance are very similar:

$E(X) = \sum x f(x)$, $E(X^2) = \sum x^2 f(x)$, $Var(X) = E(X^2) - E^2(X)$

So you first calculate $E(X) = \sum x f(x)$, then modify the mean formula to calculate $E(X^2) = \sum x^2 f(x)$, and finally calculate the variance $Var(X) = E(X^2) - E^2(X)$.

Don't use the TI-30X IIS statistics function to calculate the mean and variance, because it is inferior to the BA II Plus Statistics Worksheet. Later in this book, I will give some examples of how to calculate the mean and variance of a discrete random variable using both calculators.

Other calculations: Most likely, either the BA II Plus or the TI-30X IIS can do the calculations for you. The BA II Plus lets you quickly calculate the results with fewer keystrokes, but it doesn't leave an audit trail for you to double check your data entry and formulas (assuming you are not using the BA II Plus built-in worksheets). If you are not sure about the result given by the BA II Plus, you have to reenter the data and calculate the result a second time. In contrast, the TI-30X IIS lets you see and modify your data and formulas, but you have more keystrokes and sometimes you have to wait several seconds for the result.

I recommend that you try both calculators and find out which calculator is right for you for which calculation tasks.

Critical calculator tips

You might want to read the guidebook to learn how to use both calculators. I'll just highlight some of the most important points.

BA II Plus: always choose AOS.

The BA II Plus has two calculation methods: the chain method and the algebraic operating system (AOS) method. Simply put, under the chain method, the BA II Plus calculates numbers in the order that you enter them. For example, if you enter 2 + 3 × 100, the BA II Plus first calculates 2 + 3 = 5. It then calculates 5 × 100 = 500 as the final result. Under AOS, the calculator follows the standard rules of algebraic hierarchy. Under AOS, if you enter 2 + 3 × 100, the BA II Plus first calculates 3 × 100 = 300. It then calculates 2 + 300 = 302 as the final result. AOS is more powerful than the chain method.
For example, if you want to find $1 + 2e^{3} + 4\sqrt{5}$, under AOS you need to enter:

1 + 2 × 3 2nd e^x + 4 × 5 √x (the result is about 50.1153)

Under the chain method, to find $1 + 2e^{3} + 4\sqrt{5}$, you have to enter:

1 + ( 2 × 3 2nd e^x ) + ( 4 × 5 √x )

You can see that AOS is better because the calculation sequence under AOS is the same as the calculation sequence in the formula. In contrast, the calculation sequence under the chain method is cumbersome.

BA II Plus and TI-30X IIS: always set the calculator to display at least four decimal places.

This way, you'll be able to see the final result of your calculation to at least four decimal places. The BA II Plus can display eight decimal places; the TI-30X IIS can display nine decimal places. I recommend that you set both calculators to display, at minimum, four decimal places.
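To see why the calculation method matters, here is a small sketch in Python that mimics the two evaluation orders for 2 + 3 × 100 and then evaluates $1 + 2e^{3} + 4\sqrt{5}$. The helper `chain_eval` is a made-up illustration of left-to-right evaluation, not a model of the calculator's internals; Python's own arithmetic follows algebraic precedence, which plays the role of AOS here.

```python
import math

def chain_eval(tokens):
    """Apply + and * strictly left to right, as the chain method does."""
    result = tokens[0]
    for op, value in zip(tokens[1::2], tokens[2::2]):
        if op == "+":
            result += value
        elif op == "*":
            result *= value
    return result

# 2 + 3 x 100 keyed in without parentheses
chain_result = chain_eval([2, "+", 3, "*", 100])  # evaluates (2 + 3) * 100
aos_result = 2 + 3 * 100                          # multiplication before addition

print(chain_result, aos_result)

# 1 + 2e^3 + 4*sqrt(5); rounding affects only what is displayed
value = 1 + 2 * math.exp(3) + 4 * math.sqrt(5)
print(round(value, 4))
```

The rounding in the last line changes only what is printed; `value` itself keeps full precision.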

Please note that the internal calculations done in the BA II Plus or TI-30X IIS are NOT affected by the number of decimal places you set the calculator to display. Your choice of decimal places only affects what you can see on the calculator screen.

BA II Plus and TI-30X IIS: be careful about storing and retrieving intermediate values.

If a calculation has several steps, you will need to store the intermediate values somewhere, either on scrap paper (which will be the exam paper in the actual exam) or in the calculator's memories. Both methods have pros and cons.

If you copy intermediate values onto scrap paper (the simpler approach of the two), writing down all the decimal places will be too time-consuming for some heavy-duty calculations, and if you write down fewer, your calculation may lose some precision.

If you store intermediate values in the calculator's memory, you do not have to transfer them back and forth between the scrap paper and your calculator. However, you have the extra burden of keeping track of which memory stores which intermediate value.

Mistakes associated with using the calculator's memories are prevalent. One common mistake is thinking you stored an intermediate value in your calculator's memory when you didn't, in which case your intermediate value is lost and you have to recalculate. Another common mistake is to store an intermediate value in one memory (such as Memory M0 in the BA II Plus or Memory A in the TI-30X IIS) but to retrieve it from another memory (such as Memory M1 in the BA II Plus or Memory B in the TI-30X IIS).

Watch out for the distinction between the minus sign and the negative sign.

Both the BA II Plus and TI-30X IIS have a minus sign (−) and a negative sign (-). They look similar, but are completely different. If you look closely, the minus sign is longer and the negative sign is shorter.
The real difference is that the minus sign is for subtraction (such as $5 - 3 = 2$) and the negative sign indicates that a number is negative (such as $e^{-1} = 0.36787944$). Mistakenly using the subtraction sign for the negative sign will cause a calculation error.

Be careful using % on the BA II Plus.

For example, if you type 100 − 5% in the BA II Plus, you might expect to get 99.95. However, after pressing the [Enter] key, you get 95 instead. The BA II Plus calculates $A \pm X\% = A(1 \pm X\%)$. So if you type 100 + 5%, you'll get 105. If your intention is to calculate 100 − 0.05, you can enter 100 − 1 × 5% (i.e. 100 minus 1 times 5%).

Learn how to reset your calculator.

According to exam rules, when you use a BA II Plus or TI-30X IIS for an SOA or CAS exam, exam proctors on site will need to clear the memories of your calculator. Typically, a proctor will clear your calculator's memories by resetting the calculator to its default settings. This is done by pressing 2nd Reset Enter on the BA II Plus and by simultaneously pressing On and Clear on the TI-30X IIS and TI-30X IIB. You will need to know how to adjust the settings of the BA II Plus and TI-30X IIS to your best advantage for the exam.

In its default settings, the BA II Plus, among other things, uses the chain calculation method, displays only two decimal places, and sets its Statistics Worksheet to the LIN (standard linear regression) mode. After the proctor resets your BA II Plus and before the exam begins, you need to change the settings of your BA II Plus to the AOS calculation method, a display of at least four decimal places, and 1-V (one variable) as the mode for its Statistics Worksheet. Refer to the guidebook on how to change the settings of your BA II Plus.

You need to change only one item on your TI-30X IIS once the proctor resets it. In its default settings, the TI-30X IIS displays two decimal places. Set your TI-30X IIS to display at least four decimal places instead.

When you study for Exam P, practice how to reset your BA II Plus and TI-30X IIS and how to change their default settings to the settings most beneficial for taking the exam.

Finally, practice, practice, practice until you are perfectly comfortable with the BA II Plus and TI-30X IIS. Work through all the examples in the guidebook relevant to Exam P.

Calculator Exercise 1

A group of 23 highly talented actuary students in a large insurance company are taking SOA Exam FM at the next exam sitting. The probability for each candidate to pass Exam FM is 0.73, independent of other students passing or failing the exam. The company promises to give each actuary student who passes Exam FM a raise of $2,500. What's the probability that the insurance company will spend at least $50,000 on raises associated with passing Exam FM?

Solution

If the company spends at least $50,000 on exam-related raises, then the number of students who pass FM must be at least 50,000/2,500 = 20. So we need to find the probability of having at least 20 students pass FM.

Let $X$ = the number of students who will pass FM. The problem does not specify the distribution of $X$. So possibly $X$ has a binomial distribution.
Let's check the conditions for a binomial distribution:

- There are only two outcomes for each student taking the exam: either Pass or Fail.
- The probability of Pass (0.73) or Not Pass (0.27) remains constant from one student to another.
- The exam result of one student does not affect that of another student.

$X$ satisfies the requirements of a binomial random variable with parameters $n = 23$ and $p = 0.73$. We need to find the probability of $X \ge 20$.

$\Pr(X \ge 20) = \Pr(X = 20) + \Pr(X = 21) + \Pr(X = 22) + \Pr(X = 23)$

Applying the formula $f_X(x) = C_n^x p^x (1-p)^{n-x}$, we have

$f(x \ge 20) = C_{23}^{20}(.73)^{20}(.27)^{3} + C_{23}^{21}(.73)^{21}(.27)^{2} + C_{23}^{22}(.73)^{22}(.27) + C_{23}^{23}(.73)^{23} = .09608$

Therefore, there is a 9.6% chance that the company will have to spend at least $50,000 to pay for exam-related raises.

Calculator key sequence for BA II Plus:

Method #1: direct calculation without leaving audit trails

Procedure | Keystrokes | Display
Set to display 8 decimal places (4 decimal places are sufficient, but assume you want to see more decimals) | 2nd Format 8 Enter | DEC=8.00000000
Set AOS (algebraic operating system) | 2nd [FORMAT], keep pressing ↓ until you see Chn, then press 2nd [ENTER] (if you see AOS, your calculator is already in AOS, in which case press [CLR Work]) | AOS
Calculate $C_{23}^{20}(.73)^{20}(.27)^{3}$: | |
  Calculate $C_{23}^{20}$ | 23 2nd nCr 20 × | 1,771.000000
  Calculate $(.73)^{20}$ | .73 y^x 20 × | 3.27096399
  Calculate $(.27)^{3}$ | .27 y^x 3 + | 0.06438238
Calculate $C_{23}^{21}(.73)^{21}(.27)^{2}$: | |
  Calculate $C_{23}^{21}$ | 23 2nd nCr 21 × | 253.0000000
  Calculate $(.73)^{21}$ | .73 y^x 21 × | 0.34111482
  Calculate $(.27)^{2}$ | .27 x² + | 0.08924965
Calculate $C_{23}^{22}(.73)^{22}(.27)$: | |
  Calculate $C_{23}^{22}$ | 23 2nd nCr 22 × | 23.00000000
  Calculate $(.73)^{22}$ | .73 y^x 22 × | 0.02263762
  Calculate $(.27)$ | .27 + | 0.09536181
Calculate $C_{23}^{23}(.73)^{23}$: | |
  Calculate $C_{23}^{23}$ | 23 2nd nCr 23 × | 1.00000000
  Calculate $(.73)^{23}$ and get the final result | .73 y^x 23 = | 0.09608031

Method #2: with audit trails

Procedure | Keystrokes | Display
Set to display 8 decimal places (4 decimal places are sufficient, but assume you want to see more decimals) | 2nd Format 8 Enter | DEC=8.00000000
Set AOS (algebraic operating system) | 2nd [FORMAT], keep pressing ↓ until you see Chn, then press 2nd [ENTER] (if you see AOS, your calculator is already in AOS, in which case press [CLR Work]) | AOS
Clear memories | 2nd MEM 2nd CLR Work | M0=0.00000000
Get back to calculation mode | CE/C | 0.00000000
Calculate $C_{23}^{20}(.73)^{20}(.27)^{3}$ and store it in Memory 0: | |
  Calculate $C_{23}^{20}$ | 23 2nd nCr 20 × | 1,771.000000
  Calculate $(.73)^{20}$ | .73 y^x 20 × | 3.27096399
  Calculate $(.27)^{3}$ | .27 y^x 3 = | 0.06438238
  Store the result in Memory 0 | STO 0 | 0.06438238
  Get back to calculation mode | CE/C | 0.00000000
Calculate $C_{23}^{21}(.73)^{21}(.27)^{2}$ and store it in Memory 1: | |
  Calculate $C_{23}^{21}$ | 23 2nd nCr 21 × | 253.0000000
  Calculate $(.73)^{21}$ | .73 y^x 21 × | 0.34111482
  Calculate $(.27)^{2}$ | .27 x² = | 0.02486727
  Store the result in Memory 1 | STO 1 | 0.02486727
  Get back to calculation mode | CE/C | 0.00000000
Calculate $C_{23}^{22}(.73)^{22}(.27)$ and store it in Memory 2: | |
  Calculate $C_{23}^{22}$ | 23 2nd nCr 22 × | 23.00000000
  Calculate $(.73)^{22}$ | .73 y^x 22 × | 0.02263762
  Calculate $(.27)$ | .27 = | 0.00611216
  Store the result in Memory 2 | STO 2 | 0.00611216
Calculate $C_{23}^{23}(.73)^{23}$ and store it in Memory 3: | |
  Calculate $C_{23}^{23}$ | 23 2nd nCr 23 × | 1.00000000
  Calculate $(.73)^{23}$ and get the final result | .73 y^x 23 = | 0.00071850
  Store the result in Memory 3 | STO 3 | 0.00071850
Recall the values stored in Memory 0, 1, 2, and 3 and sum them up | RCL 0 + RCL 1 + RCL 2 + RCL 3 = | 0.06438238, 0.02486727, 0.08924965, 0.00611216, 0.09536181, 0.00071850, 0.09608031
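When you practice this exercise at home, it can help to cross-check your keystrokes against an independent calculation. The following is a minimal sketch in Python, using the standard-library function `math.comb` for $C_n^x$. The helper name `binom_pmf` is made up for this illustration.

```python
from math import comb

n, p = 23, 0.73

def binom_pmf(k, n, p):
    """Binomial probability C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The four terms Pr(X = 20), ..., Pr(X = 23) and their sum Pr(X >= 20)
terms = {k: binom_pmf(k, n, p) for k in range(20, n + 1)}
tail = sum(terms.values())

for k, t in terms.items():
    print(k, f"{t:.8f}")
print("Pr(X >= 20) =", f"{tail:.8f}")
```

Each printed term corresponds to one value stored in memory under Method #2, so a mismatch points to the specific step that was keyed in wrong.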

Comparing Method #1 with Method #2:

Method #1 is quicker but more risky. Because you don't have an audit history, if you miscalculate one item, you'll need to recalculate everything from scratch.

Method #2 is slower but leaves a good audit trail by storing all your intermediate values in your calculator's memories. If you miscalculate one item, you need to recalculate that item alone and can reuse the results of the other calculations (which are correct).

For example, instead of calculating $C_{23}^{20}(.73)^{20}(.27)^{3}$ as you should, you calculated $C_{23}^{20}(.73)^{3}(.27)^{20}$. To correct this error under Method #1, you have to start from scratch and calculate each of the following four items:

$C_{23}^{20}(.73)^{20}(.27)^{3}$, $C_{23}^{21}(.73)^{21}(.27)^{2}$, $C_{23}^{22}(.73)^{22}(.27)$, and $C_{23}^{23}(.73)^{23}$

In contrast, correcting this error under Method #2 is a lot easier. You just need to recalculate $C_{23}^{20}(.73)^{20}(.27)^{3}$; you don't need to recalculate any of the following three items:

$C_{23}^{21}(.73)^{21}(.27)^{2}$, $C_{23}^{22}(.73)^{22}(.27)$, and $C_{23}^{23}(.73)^{23}$

You can easily retrieve the above three items from your calculator's memories and calculate the final result:

$C_{23}^{20}(.73)^{20}(.27)^{3} + C_{23}^{21}(.73)^{21}(.27)^{2} + C_{23}^{22}(.73)^{22}(.27) + C_{23}^{23}(.73)^{23} = .09608$

I recommend that you use Method #1 for simple calculations and Method #2 for complex calculations.

Calculator keystrokes for TI-30X IIS:

Now let's solve the same problem using the TI-30X IIS. Before entering the formula for $f(x \ge 20)$ into a TI-30X IIS calculator, you might want to calculate the four items:

$C_{23}^{20} = 1{,}771$, $C_{23}^{21} = 253$, $C_{23}^{22} = 23$, $C_{23}^{23} = 1$.

To calculate the first two items above, you can use the combination operator on the BA II Plus or TI-30X IIS (the BA II Plus is faster than the TI-30X IIS). For the TI-30X IIS, the key sequence for $C_{23}^{20}$ is 23 PRB nCr 20 ENTER. You should be able to calculate the last two items without using a calculator.

Now you are ready to enter the formula:

1771(.73^20)(.27^3) + 253(.73^21)(.27^2) + 23(.73^22) × .27 + .73^23

Press [ENTER], wait a few seconds, and you should get 0.09608031.

Calculator Exercise 2: Fraction math

Evaluate $\frac{1}{2} + \frac{5}{3} + \frac{7}{11} - \frac{2}{9}$. Express your answer as a fraction (i.e. you cannot express it as a decimal).

Solution

Fraction math comes up when you do integration in calculus or solve probability problems. It is simple conceptually, but it is easy to make a mistake in the heat of the exam. Here is a shortcut: you can use the TI-30X IIS to do fraction math for you.

One nice feature of the TI-30X IIS is that it can convert a decimal number into a fraction. This feature can save you a lot of time. For example, if we want to convert 0.015625 into a fraction, the TI-30X IIS can quickly do the job for you. The key sequence for the TI-30X IIS is:

Type .015625
Press 2nd PRB (the PRB key is to the right of the LOG key)
Press ENTER

Then the TI-30X IIS (IIB) will display 1/64. So 0.015625 = 1/64.

Now, back to our original problem. To convert $\frac{1}{2} + \frac{5}{3} + \frac{7}{11} - \frac{2}{9}$ into a fraction, the keystrokes are:

Type 1/2+5/3+7/11-2/9
Press ENTER
Press 2nd PRB
Press ENTER

The TI-30X IIS will give you the result: $2\frac{115}{198}$
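If you want to double check a fraction result while practicing, exact rational arithmetic gives the same answer without any rounding. Below is a minimal sketch using Python's standard `fractions` module. The variable names are illustrative only.

```python
from fractions import Fraction

# 1/2 + 5/3 + 7/11 - 2/9 in exact arithmetic
total = Fraction(1, 2) + Fraction(5, 3) + Fraction(7, 11) - Fraction(2, 9)

# Split the improper fraction into a whole part and a proper fraction,
# which is the mixed-number form the calculator displays.
whole, remainder = divmod(total.numerator, total.denominator)
print(total)
print(whole, Fraction(remainder, total.denominator))

# Decimal-to-fraction conversion for the 0.015625 example
print(Fraction("0.015625"))
```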

Calculator Exercise 3

Evaluate $\int_0^1 \left( \frac{1}{2} + \frac{3}{5}x - x^2 + \frac{11}{6}x^3 - \frac{9}{7}x^4 \right) dx$. Express your result as a fraction.

Solution

$\int_0^1 \left( \frac{1}{2} + \frac{3}{5}x - x^2 + \frac{11}{6}x^3 - \frac{9}{7}x^4 \right) dx = \left[ \frac{1}{2}x + \frac{3}{5} \cdot \frac{1}{2}x^2 - \frac{1}{3}x^3 + \frac{11}{6} \cdot \frac{1}{4}x^4 - \frac{9}{7} \cdot \frac{1}{5}x^5 \right]_0^1 = \frac{1}{2} + \frac{3}{5} \cdot \frac{1}{2} - \frac{1}{3} + \frac{11}{6} \cdot \frac{1}{4} - \frac{9}{7} \cdot \frac{1}{5}$

For the TI-30X IIS, enter

1/2 + 3/(5 * 2) - 1/3 + 11/(6 * 4) - 9/(7 * 5)

Press ENTER, and you should get: 0.667857143
Press 2nd PRB ENTER, and you should get: 187/280

So $\int_0^1 \left( \frac{1}{2} + \frac{3}{5}x - x^2 + \frac{11}{6}x^3 - \frac{9}{7}x^4 \right) dx = \frac{187}{280}$

Please note that for the TI-30X IIS to convert a decimal number into a fraction, the denominator of the converted fraction must not exceed 1,000. For example, if you press

0.001 2nd PRB ENTER

you will get 1/1000 (so 0.001 = 1/1000). However, if you want to convert 0.0011 into a fraction, the TI-30X IIS will not convert it. If you press

0.0011 2nd PRB ENTER

you will get 0.0011. The TI-30X IIS will not convert 0.0011 into 11/10,000 because the denominator 10,000 exceeds 1,000. However, this constraint is typically not an issue on the exam.
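The term-by-term integration in Exercise 3 can also be checked exactly. The sketch below is only a study aid; the list `coeffs` and its layout are choices made for this illustration. It relies on the fact that integrating $c\,x^k$ over $[0, 1]$ gives $c/(k+1)$.

```python
from fractions import Fraction

# Coefficients of 1/2 + (3/5)x - x^2 + (11/6)x^3 - (9/7)x^4, indexed by the power of x
coeffs = [Fraction(1, 2), Fraction(3, 5), Fraction(-1), Fraction(11, 6), Fraction(-9, 7)]

# Integrating c * x^k from 0 to 1 contributes c / (k + 1)
integral = sum(c / (k + 1) for k, c in enumerate(coeffs))

print(integral)         # exact fraction
print(float(integral))  # decimal value, for comparison with the calculator display
```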

Comparison of 3 best calculators

Features | BA II Plus Professional | BA II Plus | TI-30X IIS
Approximate cost | $48-$70 | $30 | $15
Calculation speed | Fast | Fast | Slower
Have separators? | Yes | Yes | No
Max # of decimals displayed | 8 | 8 | 9
# of memories | 10 | 10 | 5
Leave audit trail? | Only in its built-in worksheets | Only in its built-in worksheets | All calculations
1-V Statistics: # of distinct values | 50 | 50 | 42
1-V Statistics: data frequency | Any positive integer | Any positive integer | A positive integer not exceeding 99
Ease of entering data | Have to press calculator keys hard | Soft and gentle | Soft and gentle
Good for Exam FM? | Yes. Can calculate Modified Duration. Good for all calculations (including time value of money). | Yes. Cannot calculate Modified Duration, but good for other calculations (including time value of money). | Yes. Not good for time value of money, but good for general calculations
Good for upper exams? | Yes | Yes | Yes
Optimal settings for Exam P | Use AOS; display at least 4 decimals; set statistics to 1-V mode. | Use AOS; display at least 4 decimals; set statistics to 1-V mode. | Display at least 4 decimals

Homework for you: read the BA II Plus guidebook and the TI-30X IIS guidebook. Work through the relevant examples in the guidebooks. Rework the examples in this chapter.

Chapter 3 Set, sample space, probability models

Probability models use set notation extensively. Let's first review set theory. Then we'll look at how to build probability models.

Set theory

Set. A set is a collection of objects. These objects are called elements or members of the set. If an object $x$ is an element of the set $S$, then we say $x \in S$.

Example. The set of all even integers no smaller than 1 and no greater than 9 is $A = \{2, 4, 6, 8\}$. Set $A$ has four elements: 2, 4, 6, and 8. $2 \in A$; $4 \in A$; $6 \in A$; $8 \in A$.

Example. The set of all non-negative even integers is $B = \{0, 2, 4, 6, \ldots\}$. $B$ has an infinite number of elements.

Null set. A null set or an empty set is a set that has no elements. A null set is often expressed as $\emptyset = \{\ \}$.

Subset. For two sets $A$ and $B$, if every element of $A$ is also an element of $B$, we say that $A$ is a subset of $B$. This is expressed as $A \subset B$.

Example. Set $A$ contains the even integers no smaller than 1 and no greater than 9. Set $B$ contains all the non-negative even integers. Set $C$ contains all integers. Set $D$ contains all real numbers. Then $A \subset B$; $B \subset C$; $C \subset D$.

Identical set. If every element in set $A$ is an element in set $B$ and every element in set $B$ is also an element in set $A$, then $A$ and $B$ are two identical sets. This is expressed as $A = B$.

Example. Set $A$ contains all real numbers that satisfy the equation $x^2 - 3x + 2 = 0$. Set $B = \{1, 2\}$. Then $A = B$.

Complement. For two sets $A$ and $U$ where $A$ is a subset of $U$ (i.e. $A \subset U$), the set of the elements in $U$ that are not in $A$ is the complement set of $A$. The complement of $A$ is expressed as $\bar{A}$ (some textbooks express it as $A^c$).
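The definitions so far (membership, subset, identical sets, complement) map directly onto Python's built-in `set` type, which can be handy for checking small examples. In the sketch below, the finite set `B_sample` stands in for the infinite set of non-negative even integers, and the universe `U` is an assumption chosen only for illustration.

```python
# A: even integers no smaller than 1 and no greater than 9
A = {2, 4, 6, 8}

# Finite stand-in for B = {0, 2, 4, 6, ...}; a Python set cannot be infinite
B_sample = set(range(0, 21, 2))

print(2 in A)          # membership: is 2 an element of A?
print(A <= B_sample)   # subset: is every element of A also in B_sample?

# Identical sets: integer solutions of x^2 - 3x + 2 = 0 versus {1, 2}
roots = {x for x in range(-10, 11) if x * x - 3 * x + 2 == 0}
print(roots == {1, 2})

# Complement of A relative to an illustrative universe U = {1, ..., 9}
U = set(range(1, 10))
complement_A = U - A
print(sorted(complement_A))
```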

Example. Let $A = \{1, 7\}$ and $U = \{1, 2, 3, 5, 7, 9\}$. Then $\bar{A} = \{2, 3, 5, 9\}$.

Union. For two sets $A$ and $B$, the union of $A$ and $B$ contains all the elements in $A$ and all the elements in $B$. The union of $A$ and $B$ is expressed as $A \cup B$.

Example. Let $A = \{1, 2, 3\}$, $B = \{2, 5, 7, 9\}$. $A \cup B = \{1, 2, 3, 5, 7, 9\}$.

Example. Let set $C$ represent all the even integers no smaller than 1 and no greater than 9; let set $D$ represent all non-negative integers. $C \cup D = D$.

Intersection. For two sets $A$ and $B$, the intersection of $A$ and $B$ contains all the common elements in both $A$ and $B$. The intersection of $A$ and $B$ is expressed as $A \cap B$.

Example. Let $A = \{1, 2, 3\}$, $B = \{2, 5, 7, 9\}$. $A \cap B = \{2\}$.

Example. Let $A = \{1, 3\}$, $B = \{2, 5, 7, 9\}$. $A \cap B = \emptyset$.

Example. Let set $C$ represent all the even integers no smaller than 1 and no greater than 9; let set $D$ represent all non-negative integers. $C \cap D = C$.

Steps to build a probability model

A probability model is a simplification and abstraction of a real-world experiment. It gives us a framework to assign probability and analyze a random process.

Step 1 Set up an appropriate sample space
Step 2 Set up basic probability laws (called axioms)
Step 3 Assign probability measures

To understand how to build a probability model, let's look at some examples.

Case 1 Toss a fair coin

All probability models are concerned with a random process whose outcomes are not known in advance. This random process is called an experiment. A probability model attempts to answer the question "What's the chance that we observe one outcome out of many possible outcomes?"

In the case of tossing a coin, we don't know whether heads or tails will show up. So tossing a coin is a random process whose outcomes are not known in advance. We want to find out the chance of getting a head or a tail in one toss of a coin. We'll build a probability model using the 3-step process.

Step 1 Determine sample space

The sample space is the set of all possible outcomes of an experiment. When setting up the sample space, we'll want to make sure that:

- We haven't forgotten any possible outcomes. In other words, we must exhaustively list all of the possible outcomes. (exhaustive property)
- We haven't listed the same outcome twice. In other words, the outcomes we have listed should be mutually exclusive. (mutually exclusive property)

When tossing a coin, we'll get either heads (H) or tails (T). So our sample space is {H, T}. H and T are mutually exclusive yet collectively exhaustive.

Step 2 Set up probability axioms

Before we dive into the details of calculating the probability of certain outcomes, we need to set up and follow some high-level probability laws. These laws won't tell us exactly how to calculate a probability. However, by following these laws, we can build a probability model that is consistent and not self-contradictory.

All probability models must follow these 3 axioms (Kolmogorov's axioms):

Axiom 1 Probability is non-negative. For any event $A$, $P(A) \ge 0$.

Axiom 2 Probability is additive. If $A$ and $B$ are two mutually exclusive events, then $P(A \cup B) = P(A) + P(B)$.

Axiom 3 Probabilities must add up to one. If $\Omega$ represents the sample space (which is the list of all possible outcomes), then $P(\Omega) = 1$. In other words, if we perform an experiment, we are 100% certain to observe one of the possible outcomes.

Applying these three axioms to the coin-toss example, we have:

$\Omega = \{H, T\}$
$P(H) \ge 0$, $P(T) \ge 0$
$P(H \cup T) = P(H) + P(T)$ (because H and T are mutually exclusive)
$P(\Omega) = P(H \cup T) = 1$

Please note that these axioms don't tell us how to calculate $P(H)$ or $P(T)$. The purpose of the axioms is to make sure that our model is sound. The calculation of $P(H)$ and $P(T)$ requires additional work, within the confines of the three probability axioms.

If you have trouble seeing what probability axioms are good for, think about writing an essay. Your high school teacher told you millions of times that an essay needs to have 3 parts: opening, body, and conclusion. Having an opening, a body, and a conclusion is like an axiom for writing essays. This axiom gives you a framework for writing essays. If you follow this axiom, your essay is structurally sound. However, following this axiom won't guarantee that your essay is good. Your essay can have an opening, a body, and a conclusion, but it may still be a bad essay. To be good, besides having an opening, a body, and a conclusion, your essay needs to have substance.

Similarly, if you follow the three probability axioms, your probability model is structurally sound. However, this doesn't guarantee that your probability model is good. To be good, your probability model needs to have substance. You must have detailed knowledge of the random process.

Step 3 Assign probability measures

Assigning probability measures requires detailed knowledge of the experiment. Fortunately, tossing a coin is a simple experiment; our common sense is enough to help us assign probabilities. It's reasonable to assume that heads and tails are equally likely to occur for a fair coin:

$P(H) = P(T)$

We combine Steps 2 and 3:

$P(\Omega) = P(H \cup T) = P(H) + P(T) = 1$
$P(H) = P(T)$

Solving the above equations, we have:

$P(H) = P(T) = \frac{1}{2}$

General rule: If an experiment has $n$ outcomes that are equally likely to occur, then the probability of an event $A$ is:

$P(A) = \frac{\text{\# of elements in } A}{n}$
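The general rule for equally likely outcomes is just counting, so it can be sketched in a few lines. The code below is a minimal illustration in Python with a made-up helper `prob`; it assumes every outcome in the sample space is equally likely, as in the fair-coin case.

```python
from fractions import Fraction

def prob(event, sample_space):
    """P(A) = (# of elements in A) / n for n equally likely outcomes."""
    event = set(event) & set(sample_space)  # ignore anything outside the sample space
    return Fraction(len(event), len(sample_space))

omega = {"H", "T"}  # sample space for one toss of a fair coin
p_h = prob({"H"}, omega)
p_t = prob({"T"}, omega)

print(p_h, p_t)            # each outcome gets the same probability
print(p_h + p_t)           # additivity for the mutually exclusive events H and T
print(prob(omega, omega))  # the whole sample space has probability 1
```

Using exact fractions rather than decimals makes it easy to confirm that the probabilities add up to exactly one.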

Case 2 Toss an unfair coin

In a toss of an unfair coin, Step 1 and Step 2 of the model building process are the same as in a toss of a fair coin. The only difference is in Step 3. If a coin is not fair, then $P(H) \ne P(T)$. To calculate $P(H)$ and $P(T)$, we need to know more about the coin. For example, if our analysis indicates that heads are twice as likely to occur as tails, then

$P(H) = 2P(T)$

Combining Steps 2 and 3:

$P(\Omega) = P(H \cup T) = P(H) + P(T) = 1$
$P(H) = 2P(T)$

Solving the above equations, we have:

$P(H) = \frac{2}{3}$, $P(T) = \frac{1}{3}$

Step 3 of the model building process requires one to understand the experiment that is being studied. There is no one correct way to assign probability. People with different experiences may assign different probabilities to the same event. However, no matter how you assign your probability, if you follow the 3 axioms of probability, your model is internally consistent.

Case 3 Tossing two fair coins

Step 1 Determine sample space

We face a choice here. Should we consider the two coins to be distinguishable and set up ordered pairs of H and T? Or should we consider them to be indistinguishable and hence set up unordered pairs of H and T? Let's look at each option.

Option 1. We assume the two coins are different. We can think of one coin as A and the other as B. The sample space is:

$\Omega = \{H^A H^B, H^A T^B, T^A H^B, T^A T^B\}$

$H^A H^B$ represents that Coin A is heads up and Coin B is heads up, and so on. Dropping the superscripts, we have:

$\Omega = \{HH, HT, TH, TT\}$

Option 2. We assume the two coins are identical. Then HT and TH are the same because they both give us one head and one tail. Then we have only three outcomes: two heads, two tails, one head and one tail. The sample space is:

$\Omega$ = {Two heads, Two tails, One head and one tail}

For now let's consider only option 1 (the two coins are distinguishable). The sample space is $\Omega = \{HH, HT, TH, TT\}$. Now let's continue building the model.

Step 2 Apply the 3 axioms

$P(HH) \ge 0$, $P(HT) \ge 0$, $P(TH) \ge 0$, $P(TT) \ge 0$
$P(\Omega) = P(HH \cup HT \cup TH \cup TT) = P(HH) + P(HT) + P(TH) + P(TT) = 1$

Step 3 Assign probability

It's reasonable to assume that each outcome is equally likely to occur. So

$P(HH) = P(HT) = P(TH) = P(TT)$

Applying the axiom $P(HH) + P(HT) + P(TH) + P(TT) = 1$, we have:

$P(HH) = P(HT) = P(TH) = P(TT) = \frac{1}{4}$

Now let's look at the 2nd option of specifying the sample space. Now the two coins are indistinguishable. The sample space is $\Omega = \{HH, HT, TT\}$. Step 2 of the model building process tells us:

$P(HH) \ge 0$, $P(HT) \ge 0$, $P(TT) \ge 0$
$P(\Omega) = P(HH \cup HT \cup TT) = P(HH) + P(HT) + P(TT) = 1$

When we assign probability in Step 3, we run into trouble. Since we treat the two coins as the same, we can't easily assess the probability of having one head and one tail. For example, we can't say that two heads, two tails, and one head and one tail are equally likely.
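One way to see why the three unordered outcomes should not be treated as equally likely is to start from the ordered sample space of option 1, where each outcome has probability 1/4, and merge HT with TH. The sketch below does that bookkeeping in Python. It assumes two fair coins as in option 1, and the labels are chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Option 1: ordered outcomes for two distinguishable fair coins
ordered = list(product("HT", repeat=2))  # ('H','H'), ('H','T'), ('T','H'), ('T','T')
p_each = Fraction(1, len(ordered))       # each ordered outcome is equally likely

# Option 2: forget the order, so ('H','T') and ('T','H') collapse into 'HT'
unordered = Counter()
for outcome in ordered:
    label = "".join(sorted(outcome))
    unordered[label] += p_each

for label, p in unordered.items():
    print(label, p)
```

Under this assumption, one head and one tail carries twice the probability of two heads or two tails, which is why the more detailed sample space is easier to work with.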

This brings up an important point. When specifying the sample space, make sure your outcomes are detailed enough for you to answer the question at hand.

Case 4 Randomly draw a number from [0, 1]

Step 1 Determine sample space

Let $x$ represent the number randomly drawn from the interval [0, 1]. Then the sample space is $\Omega = \{x : 0 \leq x \leq 1\}$.

Step 2 Apply axioms

$P(x) \geq 0$, $P(\Omega) = P(0 \leq x \leq 1) = 1$

Please note that for the above two equations to hold, we must have $P(x) = 0$ for any $0 \leq x \leq 1$. If $P(x)$ is a positive number, then when we sum up the total probability from $x = 0$ to $x = 1$, we'll get an infinite number. To have $P(\Omega) = P(0 \leq x \leq 1) = 1$, we must have $P(x) = 0$.

Step 3 Assign probability

How can we assign a probability measure to satisfy the axioms? We can assign the probability of having $x \in [a, b]$ (where $0 \leq a \leq b \leq 1$) proportional to the length $b - a$:

$P(a \leq x \leq b) = k(b - a)$ (where $0 \leq a \leq b \leq 1$ and $k$ is a constant)
$P(\Omega) = P(0 \leq x \leq 1) = k(1 - 0) = 1$, so $k = 1$
$P(a \leq x \leq b) = 1 \times (b - a) = b - a$
$P(x = a) = P(a \leq x \leq a) = 1 \times (a - a) = 0$

By assigning $P(a \leq x \leq b) = b - a$ (where $0 \leq a \leq b \leq 1$), we'll be able to satisfy the three axioms of probability.

Case 5 Randomly draw two numbers x and y from [0, 1]. Find the probability for $x + y \leq 0.5$.

Step 1 Determine sample space

$\Omega = \{(x, y) : 0 \leq x \leq 1, 0 \leq y \leq 1\}$

Step 2 Satisfy axioms

$P(x, y) \geq 0$ for $0 \leq x \leq 1$, $0 \leq y \leq 1$
$P(\Omega) = 1$

Once again, we need to have $P(x, y) = 0$ for any $(x, y)$. If $P(x, y)$ is positive, when we add up the total probability $P(\Omega)$, we'll get an infinite number. So to have $P(\Omega) = 1$, we must have $P(x, y) = 0$ for any $(x, y)$.

Step 3 Assign probability

Here $(x, y)$ lies in the unit square ABCO. To assign probability such that $P(\Omega) = 1$ is satisfied, we can assign $P(0 \leq x + y \leq \frac{1}{2})$ to be proportional to the area DEO. Line DE represents $x + y = 0.5$; area DEO represents $0 \leq x + y \leq 0.5$, a right triangle with two legs of length $\frac{1}{2}$.

Since area ABCO corresponds to $P(\Omega) = 1$, we have:

$P(0 \leq x + y \leq \frac{1}{2}) = \frac{DEO}{ABCO} P(\Omega) = \frac{DEO}{ABCO} = \frac{\frac{1}{2} \times \frac{1}{2} \times \frac{1}{2}}{1} = \frac{1}{8}$
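The area argument for Case 5 can be sanity-checked by simulation. This is a rough sketch, not part of the original solution: the function name, the trial count, and the seed are arbitrary choices made here for illustration.

```python
import random

def estimate_prob(n_trials=200_000, seed=0):
    """Estimate P(x + y <= 0.5) for x, y drawn uniformly from [0, 1]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(
        1 for _ in range(n_trials)
        if rng.random() + rng.random() <= 0.5
    )
    return hits / n_trials

estimate = estimate_prob()
print(estimate)  # should land close to 1/8 = 0.125
```

The estimate only approximates 1/8; the exact value comes from the ratio of areas.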

Case 6 Flip a coin until a head appears

Step 1 Determine sample space

$\Omega = \{H, TH, TTH, TTTH, TTTTH, \dots\}$

Step 2 Apply axioms

The most important axiom is:

$P(\Omega) = P(H \cup TH \cup TTH \cup TTTH \cup TTTTH \cup \dots) = 1$

Since all the elements in the sample space are mutually exclusive, we have:

$P(\Omega) = P(H) + P(TH) + P(TTH) + P(TTTH) + P(TTTTH) + \dots = 1$

At first glance, it seems impossible to satisfy $P(\Omega) = 1$. We have an infinite number of outcomes, yet the total probability needs to be one. After more thought, however, we realize that it's easy to have $P(\Omega) = 1$. For example, if the probabilities follow a geometric progression, then we can have an infinite number of probabilities, yet the sum of the total probability is finite. For example, if $P(H) = \frac{1}{2}$, $P(TH) = \frac{1}{2^2}$, $P(TTH) = \frac{1}{2^3}$, ..., then the total probability is

$P(\Omega) = \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \dots + \frac{1}{2^n} + \dots = \frac{\frac{1}{2}}{1 - \frac{1}{2}} = 1$

Please note that Case 6 is different from Cases 4 and 5. Cases 4 and 5 have an infinite number of continuous outcomes. The only way to satisfy $P(\Omega) = 1$ is to have zero probability for each single outcome. This is why we need to have $P(x) = 0$ in Case 4 and $P(x, y) = 0$ in Case 5. The outcomes in Case 6, however, are discrete. As a result, each outcome may have nonzero probability, yet the total probability can be finite.
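To see how infinitely many positive probabilities can still add up to one, it may help to look at partial sums of the example above with $P(H) = \frac{1}{2}$. This is a small illustrative sketch; the helper name `partial_sum` is made up here.

```python
from fractions import Fraction

def partial_sum(n):
    """Sum of (1/2)**k for k = 1..n: probability the first head comes within n tosses."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))

for n in (1, 2, 5, 10, 20):
    print(n, partial_sum(n), float(partial_sum(n)))
```

Each partial sum equals $1 - \frac{1}{2^n}$, so the total approaches 1 without ever exceeding it.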

Step 3 Assign probability

To assign probability for each outcome, we need to have detailed knowledge of the experiment. The experiment in this problem is "keep flipping coins until a head shows up." As you'll see later in this book, the probability model for this experiment is a geometric distribution:

$P(H) = p$, $P(TH) = p(1 - p)$, $P(TTH) = p(1 - p)^2$, ..., where $0 < p < 1$

Then,

$P(\Omega) = P(H) + P(TH) + P(TTH) + P(TTTH) + P(TTTTH) + \dots$
$= p + p(1 - p) + p(1 - p)^2 + p(1 - p)^3 + \dots = \frac{p}{1 - (1 - p)} = 1$

For now, don't worry about geometric distribution. Focus on understanding how to build a probability model.

Case 7 Flipping a tack

If we flip a tack many times, we want to find out how often we see the needle pointing up or pointing down.

Step 1 Determine sample space

$\Omega = \{U, D\}$, where $U$ = up, $D$ = down

Step 2 Apply axioms

$P(U) \geq 0$, $P(D) \geq 0$, $P(\Omega) = P(U \cup D) = P(U) + P(D) = 1$

Step 3 Assign probability

It's reasonable to assume that outcomes $U$ and $D$ have fixed probabilities $P(U) = p$ and $P(D) = 1 - p$. However, unlike in tossing a fair coin where it's reasonable to assume $P(H) = P(T)$, here we can't assume $P(U) = P(D)$. We can't easily establish some sort

of relationship between $P(U)$ and $P(D)$. What we can do, perhaps, is to flip a tack many times and determine the % of the times that the needle points up. However, this % is not $P(U)$; it's an estimate of $P(U)$. Even though it's hard for us to assign probability to $U$ and $D$, the axioms still hold. We'll always have

$P(U) \geq 0$, $P(D) \geq 0$, $P(\Omega) = P(U \cup D) = P(U) + P(D) = 1$

Hopefully by now you have an idea of how to build a framework to analyze a random process using set, sample space, and probability axioms. Please note that when you solve problems, you don't have to formally list the 3 steps. As long as you can follow the essence of this 3-step process, you are fine.

Sample Problems and Solutions

Problem 1

John didn't bother learning the sample space or the probability axioms, yet he scored a 9 on Exam P. Explain this paradox.

Solution

This can happen. SOA exams mainly test a candidate's ability to apply theories to real world problems; SOA exams rarely test a concept for the sake of testing a concept. SOA never asked "Tell me what a sample space is." Instead, SOA asks you to solve a problem. If you can solve the problem, SOA assumes that you understand sample space and probability laws.

Though John didn't bother learning sample space or probability axioms, he understood the common sense behind sample space and axioms. For example, when calculating a probability, John always clearly specifies all of the possible outcomes, even though he doesn't know that the complete list of all the outcomes is called "sample space." John always assumes that the total probability for all the possible outcomes must add up to one, even though he doesn't know that this is one of the axioms. So John has sufficient understanding of Exam P core concepts and deserves to pass Exam P.

The key point to take home is this: when studying for Exam P and higher exams, focus on understanding the essence of a concept, not on memorizing jargon.

Problem 2

Let $A = \{1, 3, 7\}$, $U = \{1, 2, 3, 4, 5, 6, 7, 8, 9\}$, $B = \{4, 7\}$. Find

(1) All the subsets of $A$.
(2) $\bar{A}$.
(3) $A \cap B$.
(4) $A \cup B$.

Solution

(1) The subsets of $A$ are as follows: $\emptyset$, $\{1\}$, $\{3\}$, $\{7\}$, $\{1, 3\}$, $\{1, 7\}$, $\{3, 7\}$, $\{1, 3, 7\}$
(2) $\bar{A} = \{2, 4, 5, 6, 8, 9\}$
(3) $A \cap B = \{1, 3, 7\} \cap \{4, 7\} = \{7\}$.
(4) $A \cup B = \{1, 3, 7\} \cup \{4, 7\} = \{1, 3, 4, 7\}$.

Problem 3

You toss a coin and record which side is up. Determine the sample space $\Omega$. Determine the subset of $\Omega$ representing heads. Determine the subset of $\Omega$ representing tails.

Solution

If you toss a coin, you have a total of 2 possible outcomes: either heads or tails. As a result, the sample space is $\Omega = \{H, T\}$. The subset representing heads is $\{H\}$. The subset representing tails is $\{T\}$.

Problem 4

You throw a die and record the outcomes. Determine the sample space $\Omega$. Determine the subset of $\Omega$ representing an even number. Determine the subset of $\Omega$ representing an odd number. Determine the subset of $\Omega$ representing the smallest number. Determine the subset of $\Omega$ representing the largest number. Determine the subset of $\Omega$ representing a number no less than 3.

Solution

If you throw a die, you have a total of 6 possible outcomes: you get 1, 2, 3, 4, 5, or 6. As a result, the sample space is:

$\Omega = \{1, 2, 3, 4, 5, 6\}$

Within the sample space $\Omega$, the subset representing an even number is $\{2, 4, 6\}$.
Within the sample space $\Omega$, the subset representing an odd number is $\{1, 3, 5\}$.
Within the sample space $\Omega$, the subset representing the smallest number is $\{1\}$.
Within the sample space $\Omega$, the subset representing the largest number is $\{6\}$.
Within the sample space $\Omega$, the subset representing a number no less than 3 is $\{3, 4, 5, 6\}$.

Problem 5

An urn contains three balls: one red ball, one black ball, and one white ball. Two balls are randomly drawn from the urn and their colors are recorded. Determine:

(1) The sample space.
(2) The event representing that a red ball and a black ball are drawn.
(3) The probability that, of the two randomly drawn balls, one is a red ball and the other a black ball.

Solution

Let R = red, B = black, and W = white. Listing the two draws in order, the sample space is:

$\Omega = \{(R, B), (B, R), (R, W), (W, R), (B, W), (W, B)\}$

The event representing that a red ball and a black ball are drawn is $\{(R, B), (B, R)\}$.

The probability that, of the two randomly drawn balls, one is red and the other is black is:

$\frac{\text{\# of elements in the event}}{\text{\# of elements in the sample space}} = \frac{2}{6} = \frac{1}{3}$

In the above calculation, we can assume that each outcome in the sample space is equally likely to occur unless told otherwise.

Homework for you: redo all the problems listed in this chapter.
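Problem 5 can be double-checked by listing the sample space mechanically. The sketch below assumes, as in the solution, that the order of the two draws is recorded and that every ordered draw is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R", "B", "W"]

# Ordered draws of two different balls (drawing without replacement).
sample_space = list(permutations(balls, 2))

# Event: one red ball and one black ball, in either order.
event = [draw for draw in sample_space if set(draw) == {"R", "B"}]

prob = Fraction(len(event), len(sample_space))
print(sample_space)
print(event, prob)
```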

Chapter 4 Multiplication/addition rule, counting problems

Exam Advice

Counting problems can be extremely complex and difficult. However, previous SOA exam counting problems have been relatively simple and straightforward. Focus on the basics. Don't spend too much time trying to master difficult problems.

Focus on the essence. Don't attempt to memorize the precise definition of terms such as "sampling with (or without) order and with (or without) replacement." As long as you know how to calculate the number of selections, it is not necessary to know whether a sampling method is with (or without) order and with (or without) replacement.

Multiplication Rule

You are going from City A to City C via City B. There are $m$ different ways from City A to City B. There are $n$ different ways from City B to City C. How many different ways are there from City A to City C?

There are $m \times n$ different ways from City A to City C.

(Diagram: City A to City B by $m$ routes, then City B to City C by $n$ routes.)

Example. A department of a company has 3 managers, 10 professionals, and 5 secretaries. You want to choose one manager, one professional, and one secretary to form a committee. How many different ways can you form a committee?

Solution

There are 3 ways to choose a manager, 10 ways to choose a professional and 5 ways to choose a secretary. As a result, there are $3 \times 10 \times 5 = 150$ different ways to form a committee.
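The committee example can be verified by brute force. This is only an illustrative sketch; the labels M1, P1, S1 and so on are invented placeholders for the employees.

```python
from itertools import product

managers = [f"M{i}" for i in range(1, 4)]        # 3 managers
professionals = [f"P{i}" for i in range(1, 11)]  # 10 professionals
secretaries = [f"S{i}" for i in range(1, 6)]     # 5 secretaries

# Every committee is one choice from each group.
committees = list(product(managers, professionals, secretaries))
print(len(committees))
```

The enumeration produces as many committees as the multiplication rule predicts.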

Addition Rule

If Events $A$ and $B$ are mutually exclusive, then

$\Pr(A \cup B) = \Pr(A) + \Pr(B)$

For any two events $A$ and $B$,

$\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$

If you understand the above formulas, then you have implicitly understood the addition rule without explicitly memorizing it.

Basic terms

Understand the essence. No need to memorize the definition.

Sample with order

The order by which items are listed matters. For example, two of the five equally qualified candidates ($A, B, C, D, E$) are chosen to fill two positions in a company: Vice President and Sales Manager. If we first list the candidate chosen for the Vice President position and then the candidate for the Sales Manager position, we can assume the order by which candidates are listed is important. Different orders represent different choices. For example, $AB$ is different from $BA$. $AB$ and $BA$ are two distinct entities.

Sample without order

The order by which items are listed does not matter. For example, two of the five equally qualified candidates ($A, B, C, D, E$) are chosen to form a committee. Evidently, a committee consisting of $AB$ is the same as a committee consisting of $BA$. $AB$ and $BA$ are an identical entity.

Sample with replacement

After an item is taken out from the pool, it is immediately put back to the pool before the next random draw. As a result, the same item can be selected again and again. In addition, the number of items in the pool stays constant before each draw. For example, you have 3 awards to give individually to 5 potential recipients. If one recipient is allowed to get multiple awards, then this is a sample with replacement.

Sample without replacement

Once an item is taken out of a pool, it permanently leaves the pool and can never be selected again. The number of items in the pool always decreases by one at the end of each random draw. You have 3 awards to give individually to 5 potential recipients.
If no recipients are allowed to get more than one award, then this is a sample without replacement.

Of course, you can sample with (or without) order and with (or without) replacement. These terms are explained below. Once again, when studying these terms, focus on learning the calculation, not the precise definition of the terms.
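As a quick orientation, the four combinations of "order" and "replacement" happen to line up with four functions in Python's standard itertools module. The sketch below counts selections of 3 items from a pool of 5; the pool labels and the choice of r = 3 are illustrative assumptions, not values taken from a specific problem.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)

pool = "ABCDE"  # n = 5 distinct items
r = 3           # number of items selected

counts = {
    "with order, without replacement": len(list(permutations(pool, r))),
    "without order, without replacement": len(list(combinations(pool, r))),
    "with order, with replacement": len(list(product(pool, repeat=r))),
    "without order, with replacement": len(
        list(combinations_with_replacement(pool, r))),
}

for name, count in counts.items():
    print(f"{name}: {count}")
```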

Select $r$ out of $n$ people to form a line (or select $r$ out of $n$ distinct objects): order matters and duplication is NOT allowed (or duplication is NOT possible) (called sampling with order without replacement)

How many different ways can you form the line? You can start filling the empty spots from left to right (you can also form the line from right to left and get the same result). You have $n$ choices to fill the first spot, $(n - 1)$ choices to fill the second spot, ..., and $(n - r + 1)$ choices to fill the $r$-th spot. Applying the multiplication rule, you have a total of

$n(n - 1)(n - 2) \cdots (n - r + 1) = P_n^r$

ways of forming the line. If $r = n$, then we have $n!$ ways of forming the line.

Line up $n = r_1 + r_2 + \dots + r_k$ colored balls where $r_1$ balls have color 1 (such as red), $r_2$ balls have color 2 (such as black), ..., $r_k$ balls have color $k$

How many distinct ways are there to put $n$ balls in a row? We have a total of $n!$ permutations if each ball is distinct. However, we have identical balls and we need to remove the permutations by these identical balls. So $r_1$ balls are identical and have $r_1!$ permutations; $r_2$ balls are identical and have $r_2!$ permutations; ...; $r_k$ balls are identical and have $r_k!$ permutations. Thus, we need to divide $n!$ by $r_1! r_2! \cdots r_k!$ to get the distinct number of permutations. The total number of distinct permutations is:

$\frac{(r_1 + r_2 + \dots + r_k)!}{r_1! r_2! \cdots r_k!} = \frac{n!}{r_1! r_2! \cdots r_k!}$

Example. You are using six letters A, A, A, B, B, C to form a six-letter symbol. How many distinct symbols can you create?

Solution

$\frac{(3 + 2 + 1)!}{3! \, 2! \, 1!} = \frac{6!}{3! \, 2! \, 1!} = 60$
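The division by $r_1! r_2! \cdots r_k!$ can be checked against a brute-force count for the six-letter example. This is an illustrative sketch; `distinct_arrangements` is a helper name invented here.

```python
from itertools import permutations
from math import factorial

def distinct_arrangements(letters):
    """n! divided by the factorial of each letter's multiplicity."""
    counts = {}
    for ch in letters:
        counts[ch] = counts.get(ch, 0) + 1
    total = factorial(len(letters))
    for c in counts.values():
        total //= factorial(c)
    return total

letters = "AAABBC"
by_formula = distinct_arrangements(letters)

# Brute force: generate all 6! orderings and drop the duplicates.
by_enumeration = len(set(permutations(letters)))

print(by_formula, by_enumeration)
```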

Select $r$ out of $n$ (where $n \geq r$) people to form a committee: order does NOT matter and duplication is NOT allowed (called sampling without order and without replacement)

How many different ways can you form a committee? First, you can choose $r$ out of $n$ people to form a line. You have a total of $n(n - 1)(n - 2) \cdots (n - r + 1) = P_n^r$ ways to form a line with exactly $r$ people standing in line. Out of these $P_n^r$ ways, the same committee is repeated $r!$ times. As a result, you have a total of

$\frac{P_n^r}{r!} = \frac{n!}{r!(n - r)!} = C_n^r$

ways of forming a different committee. $C_n^r$ is often called the binomial coefficient.

If you have difficulty understanding why we need to divide $P_n^r$ by $r!$, use a simple example. Say you are selecting two members out of three people $A, B, C$ to form a committee. First, you choose 2 out of 3 people to stand in a line. You have a total of $P_3^2 = 3 \times 2 = 6$ ways of forming a 2-person line:

$AB, BA, AC, CA, BC, CB$

Of these 6 ways, you notice that the committee consisting of $AB$ is the same as the committee consisting of $BA$; $AC$ is the same as $CA$; $BC$ is the same as $CB$. In other words, the number of committees is double counted. As a result, we need to divide $P_3^2$ by $2! = 2$ to get the correct number of committees that can be formed.

Form $r$-lettered symbols using $n$ letters: order matters and duplication is allowed (called sampling with order and with replacement)

Example. Out of five letters $a, b, c, d, e$, you are forming three-lettered symbols such as $aaa$, $aab$, $abc$, .... How many different symbols can you build? There is no relationship between $r$ and $n$: $r$ can be greater than, less than, or equal to $n$.

You can use any of the $n$ letters as the first letter of your symbol. You can use any of the $n$ letters as the second letter of your symbol (because duplication is allowed). You can use any of the $n$ letters as the $r$-th letter of your symbol. Applying the multiplication rule, we find that the number of unique symbols we can create is:

$n \times n \times \cdots \times n = n^r$ ($r$ factors)

Example. You are using four letters $a, b, c, d$ to form a three-letter symbol (for example, $adb$ is one symbol). You are allowed to use the same letter multiple times when building a symbol. How many symbols can you form?

Solution

You can form a total of $4^3 = 64$ different symbols.

Collect a total of $n$ items from $r$ categories: order doesn't matter and duplication is allowed (called sampling without order and with replacement)

This is best explained using an example. You want to collect a dozen silverware items from 3 categories: spoons, forks, and knives. You are not obligated to collect any items from any specific category, but you must collect a total of 12 items. You have many choices. You can have 12 spoons, zero forks and zero knives; you can have 5 spoons, 5 forks, and 2 knives; you can have zero spoons, zero forks, and 12 knives. The only stipulation is that you must collect a total of 12 items. How many different collections can you have?

To solve this problem, we need to use a diagram. We have a total of 14 empty boxes to fill. These 14 boxes consist of 12 silverware items we want to collect and 2 other boxes that indicate the ending of our selection of two categories.

For example, we want to collect 5 spoons, 4 forks, and 3 knives. From left to right, we fill 5 empty boxes with spoons. In the sixth box, we put in an "X" sign indicating the end of our spoon collection. We then start filling the next 4 empty boxes with forks.
Similarly, at the end of our fork collection, we fill an empty box with the "X" sign indicating the end

of our fork collection. Finally, we fill the 3 remaining empty boxes with knives. However, this time we don't need any "X" sign to signal the end of our knife collection (our collection automatically ends there). Because we have a total of 3 categories, we need to have 3 - 1 = 2 "X" signs. This is why we have a total of 14 boxes. See the diagram below.

S S S S S X F F F F X K K K

Similarly, if we want to collect zero spoons, 8 forks, and 4 knives, the diagram will look like this:

X F F F F F F F F X K K K K

If we want to collect zero spoons, zero forks, and 12 knives, the diagram will look like this:

X X K K K K K K K K K K K K

Now we see that by positioning the two "X" boxes differently, we have different collections. We can put our two "X" boxes anywhere in the 14 available spots and each unique positioning represents one unique collection. Because we have a total of $C_{14}^{2}$ different ways of positioning the two "X" boxes, we have a total of $C_{14}^{2}$ different collections.

Generally, we have $C_{n+r-1}^{r-1} = C_{n+r-1}^{n}$ different ways to assemble $n$ items from $r$ categories.

Example. Assume that a grocery store allows you to spend $5 to purchase a total of 10 items from a selection of four different vegetables: cucumbers, carrots, peppers, and tomatoes. Of these four categories, you can choose whatever items you want, but the total number of items you can have is 10. How many different vegetable combinations can you get with $5?

Solution

$n = 10$, $r = 4$, $C_{n+r-1}^{r-1} = C_{13}^{3} = \frac{13 \times 12 \times 11}{3 \times 2 \times 1} = 286$
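The box-and-"X" counting argument can be cross-checked numerically. The sketch below is illustrative only: `collections_count` is a helper name made up here, and it relies on `math.comb`, which is available in Python 3.8 and later.

```python
from itertools import combinations_with_replacement
from math import comb

def collections_count(n_items, n_categories):
    """Ways to place (categories - 1) 'X' boxes among n_items + categories - 1 boxes."""
    return comb(n_items + n_categories - 1, n_categories - 1)

silverware = collections_count(12, 3)  # 12 items from 3 categories
vegetables = collections_count(10, 4)  # 10 items from 4 categories

# Brute force for the vegetable example: unordered picks with repetition.
brute = sum(1 for _ in combinations_with_replacement(range(4), 10))

print(silverware, vegetables, brute)
```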

You have 286 ways of buying 10 items.

SOA Problem (#1 Nov 2001)

An urn contains 10 balls; 4 red and 6 blue. A second urn contains 16 red balls and an unknown number of blue balls. A single ball is drawn from each urn. The probability that both balls are the same color is 0.44. Calculate the number of blue balls in the second urn.

(A) 4   (B) 20   (C) 24   (D) 44   (E) 64

Solution

Let $x$ = # of blue balls in the second urn.

                 1st urn    2nd urn
Red ball         4          16
Blue ball        6          x
Total            10         16 + x
Pr(red ball)     4/10       16/(16 + x)
Pr(blue ball)    6/10       x/(16 + x)

Pr(both red) + Pr(both blue) = 0.44

Because the first urn and second urn are independent,

Pr(both red) $= \frac{4}{10} \times \frac{16}{16 + x}$, Pr(both blue) $= \frac{6}{10} \times \frac{x}{16 + x}$

So we have

$\frac{4}{10} \times \frac{16}{16 + x} + \frac{6}{10} \times \frac{x}{16 + x} = 0.44$

$x = 4$

Homework for you: redo all the problems listed in this chapter.
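One way to confirm the answer to the urn problem is to plug each answer choice into the same-color probability. This is a hedged sketch using exact fractions; the function name `same_color_prob` is invented for illustration.

```python
from fractions import Fraction

def same_color_prob(x):
    """P(both red) + P(both blue) when the second urn holds x blue balls."""
    both_red = Fraction(4, 10) * Fraction(16, 16 + x)
    both_blue = Fraction(6, 10) * Fraction(x, 16 + x)
    return both_red + both_blue

choices = {"A": 4, "B": 20, "C": 24, "D": 44, "E": 64}
matches = [label for label, x in choices.items()
           if same_color_prob(x) == Fraction(44, 100)]

print(matches)
```

Only one answer choice reproduces the stated probability of 0.44.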

Chapter 5 Probability laws and "whodunit"

In virtually every exam sitting, there are several probability problems which require you to sort out "who did what." These problems are more a test of common sense than a test of one's profound understanding of probability concepts.

This is a simple set-up of such a problem: In a group of people, X% did A, Y% did B, Z% did both A and B. What percentage did A but not B? This problem is simple because you have only two basic categories (A and B) and $2^2 = 4$ categories ($AB$, $\bar{A}B$, $A\bar{B}$, $\bar{A}\bar{B}$) to divide the population ($AB$ = doing A and B, $\bar{A}B$ = doing B but not A, $A\bar{B}$ = doing A but not B, and $\bar{A}\bar{B}$ = doing neither A nor B). It is easy to mentally track four possibilities.

A more complex problem requires you to sort people or things under three categories. The problem is set up something like this: Of a group of people, X% did A, Y% did B, Z% did C, M% did A and B, N% did B and C, etc. What percentage of the people did A and B and C? Or did nothing (no A, no B, and no C)? Or did some other combinations of A, B, and C? This problem is much harder. Essentially, you have a total of $2^3 = 8$ possible categories:

$ABC$, $\bar{A}BC$, $A\bar{B}C$, $AB\bar{C}$, $\bar{A}\bar{B}C$, $\bar{A}B\bar{C}$, $A\bar{B}\bar{C}$, $\bar{A}\bar{B}\bar{C}$

It is hard to mentally track 8 possibilities, not only under exam conditions where you have about 3 minutes to solve a problem, but even under conditions where time is not a big constraint. To solve this kind of problem right, you need a way to track the different possibilities.

Is it likely to have an exam question where you have 4 basic categories (A, B, C, and D) and a total of $2^4 = 16$ possible combinations for you to sort out? Not very likely. It takes too much work for you to solve it. It takes too much work for SOA to write this question and come up with an answer too. However, if this kind of difficult problem does show up, you will probably want to skip it and focus on easier problems, unless your goal is to score a 10.
How to tackle the problem in 3 minutes

There are 2 approaches to sort out "who did what": the formula-driven approach and the common sense approach. You might want to familiarize yourself with both approaches and choose one that you like better.

Formula-driven approach

There are a series of formulas you can use to solve the "who did what" problem. Some of the formulas are easier to memorize than others. I recommend that you memorize just the basic formulas; they are more useful and versatile than complex formulas. For complex formulas, don't worry about memorizing them. Instead, get a feel for their logic.

Basic formulas you need to memorize:

$P(A) + P(\bar{A}) = 1$ (some textbooks use $A^c$ for $\bar{A}$)

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

In other words, $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$. On the right hand side of the above formula, we have to subtract $P(A \text{ and } B)$ because it's double counted in both $P(A)$ and $P(B)$.

$P(A \cap B) = P(A) P(B \mid A)$

Intuitively,
$P(A \cap B)$ = % of the population who did both A and B
$P(A)$ = % of the population who did A
$P(B \mid A)$ = % of the people who did A who also did B

$P(A \cap B) = P(B) P(A \mid B)$ says that the % of the population who did both A and B = % of the population who did B $\times$ % of the people who did B who also did A. This makes good sense.

$P(A \cap B) = P(A) P(B)$ if A and B are independent

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$

The proof is simple.

$P(A \cup B \cup C) = P[A \cup (B \cup C)] = P(A) + P(B \cup C) - P[A \cap (B \cup C)]$

$P(B \cup C) = P(B) + P(C) - P(B \cap C)$

$P[A \cap (B \cup C)] = P[(A \cap B) \cup (A \cap C)]$
$= P(A \cap B) + P(A \cap C) - P[(A \cap B) \cap (A \cap C)]$
$= P(A \cap B) + P(A \cap C) - P(A \cap B \cap C)$

Putting everything together, we have:

$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$

The above formula is ugly but SOA likes to test it. So memorize it.

Complex formulas: don't memorize them, but do have a feel for their logic.

Associative Laws:

$A \cup (B \cup C) = (A \cup B) \cup C$

In other words, A or (B or C) = (A or B) or C. This says that if you want to count the % of people who did at least one of the three tasks A, B, and C, you can count it in two ways. The first counting method is to compile two lists of people: one lists those who did at least one of the two tasks, B and C, and the other lists those who did Task A. If you merge these two lists, you should get the total % of people who did at least one task. The second counting method is to come up with the list of people who did at least one of the two tasks A and B and the list of people who did C. Then you merge these two lists. Intuitively, you can see that the two counting methods should generate the identical result.

$A \cap (B \cap C) = (A \cap B) \cap C$

In other words, A and (B and C) = (A and B) and C. The left-hand side and the right-hand side are two ways of counting who did all three tasks A, B, and C. The two counting methods should give identical results.

Commutative Laws:

$A \cup B = B \cup A$ (In other words, A or B = B or A. Makes intuitive sense.)
$A \cap B = B \cap A$ (In other words, A and B = B and A. Makes intuitive sense.)

Distributive Laws:

$(A \cup B) \cap C = (A \cap C) \cup (B \cap C)$
In other words, (A or B) and C = (A and C) or (B and C).

$(A \cap B) \cup C = (A \cup C) \cap (B \cup C)$
In other words, (A and B) or C = (A or C) and (B or C).

Distributive Laws are less intuitive. You might want to use some concrete examples to convince yourself that the laws are true.

DeMorgan's Laws:

$\overline{A_1 \cup A_2 \cup A_3 \cup \dots \cup A_n} = \bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3 \cap \dots \cap \bar{A}_n$

In other words, "Not any of $A_1$ or $A_2$ or $A_3$ ... or $A_n$" is the same as "No $A_1$ and No $A_2$ and No $A_3$ ... and No $A_n$." They are just two ways of counting the % of people who did not do any of the $n$ tasks.

$\overline{A_1 \cap A_2 \cap A_3 \cap \dots \cap A_n} = \bar{A}_1 \cup \bar{A}_2 \cup \bar{A}_3 \cup \dots \cup \bar{A}_n$

The left-hand side and the right-hand side are two ways of counting the % of people who didn't complete all of the $n$ tasks (i.e. who did not do any task at all, or who did some but not all tasks).

Common Sense Approach

If you dislike memorizing formulas, you can just use your common sense to solve the problem. However, you will still need some tools (a table or a diagram called a Venn diagram) to help you sort out who did what. I will show you how.
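For readers who want concrete examples of the set laws in the formula-driven approach, Python's built-in set type can spot-check them. The sketch below reuses the sets U, A, and B from Problem 2 of Chapter 3; the set C and the helper `complement` are arbitrary choices introduced here. A check on one example illustrates the laws but does not prove them.

```python
U = set(range(1, 10))   # universal set {1, ..., 9}
A = {1, 3, 7}
B = {4, 7}
C = {2, 3, 4, 9}        # hypothetical third set, chosen for illustration

def complement(s):
    """Elements of U that are not in s."""
    return U - s

# Distributive laws
assert (A | B) & C == (A & C) | (B & C)
assert (A & B) | C == (A | C) & (B | C)

# DeMorgan's laws
assert complement(A | B | C) == complement(A) & complement(B) & complement(C)
assert complement(A & B & C) == complement(A) | complement(B) | complement(C)

# Three-event formula, stated with counts instead of probabilities
lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
print(lhs, rhs)
```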


Sample Problems and Solutions

Problem 1 (#1 May 2000)

The probability that a visit to a primary care physician's (PCP) office results in neither lab work nor referral to a specialist is 35%. Of those coming to a PCP's office, 30% are referred to specialists and 40% require lab work. Determine the probability that a visit to a PCP's office results in both lab work and referral to a specialist.

(A) 0.05   (B) 0.12   (C) 0.18   (D) 0.25   (E) 0.35

Solution

Formula-driven approach

Let L = lab work and R = referral to specialists. We need to find $P(L \cap R)$. We are given:

$P(L) = 40\%$, $P(R) = 30\%$, $P(\text{neither R nor L}) = 35\%$

The tricky part is "neither R nor L." Generally,

Neither A nor B = (No A) and (No B) $= \bar{A} \cap \bar{B} = \overline{A \cup B}$

You might want to memorize this. Once you understand "neither ... nor," the rest is really simple.

$P(\bar{R} \cap \bar{L}) = 35\%$, $P(R \cup L) = 1 - P(\bar{R} \cap \bar{L}) = 65\%$

$P(R \cup L) = P(R) + P(L) - P(R \cap L)$

$65\% = 40\% + 30\% - P(R \cap L)$

$P(R \cap L) = 5\%$

Common sense approach: table method

You can come up with a table that looks like this:

     A              B           C              D
1                   Referral    No referral    Sum
2    Lab work       ?                          40%
3    No lab work                35%
4    Sum            30%                        100%

Cell(D,2) = % of visits leading to lab work = 40%
Cell(B,4) = % of visits leading to referral = 30%
Cell(C,3) = % of visits leading to no lab work and no referral = 35%

We need to find Cell(B,2) = % of visits leading to both lab work and referral.

Because Pr(referral) + Pr(no referral) = 1 and Pr(lab work) + Pr(no lab work) = 1,

Cell(D,4) = 100% = Cell(D,2) + Cell(D,3) = Cell(B,4) + Cell(C,4)

We update the table with the above info:

     A              B           C              D
1                   Referral    No referral    Sum
2    Lab work       ?                          40%
3    No lab work                35%            60%
4    Sum            30%         70%            100%

Because Cell(B,3) + Cell(C,3) = 60%, Cell(B,3) = 25%.
Because Cell(B,2) + Cell(B,3) = 30%, Cell(B,2) = 5%.

The final table looks like this:

     A              B           C              D
1                   Referral    No referral    Sum
2    Lab work       5%          35%            40%
3    No lab work    25%         35%            60%
4    Sum            30%         70%            100%

You can verify that the numbers in the above table satisfy the following relationships:

Cell(B,2) + Cell(C,2) = Cell(D,2)
Cell(B,3) + Cell(C,3) = Cell(D,3)
Cell(B,2) + Cell(B,3) = Cell(B,4)
Cell(C,2) + Cell(C,3) = Cell(C,4)

Common sense approach: Venn diagram

First draw a graph:

(Figure: Venn diagram of two overlapping circles L and R. Region a lies in L only, b in the overlap of L and R, c in R only, and d outside both circles.)

L = a + b = 40%
R = b + c = 30%
d = 35%
a + b + c + d = 100%

You can easily find all the variables by solving the above equations. The result: a = 35%, b = 5%, c = 25%.

(Figure: the same Venn diagram with a = 35%, b = 5%, c = 25%, d = 35%.)

You can also see from the above diagram:

$a = P(L \cap \bar{R})$, $b = P(L \cap R)$, $c = P(\bar{L} \cap R)$, $d = P(\bar{L} \cap \bar{R})$
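The four Venn-diagram equations for Problem 1 can be solved in a few lines. This is a minimal sketch using exact fractions; the variable names mirror the region labels a, b, c, d, and the helper `pct` is invented for illustration.

```python
from fractions import Fraction

def pct(n):
    """Convert a percentage to an exact fraction."""
    return Fraction(n, 100)

P_L, P_R, d = pct(40), pct(30), pct(35)

# a + b + c = 1 - d, while (a + b) + (b + c) = P_L + P_R counts b twice.
b = P_L + P_R - (1 - d)  # both lab work and referral
a = P_L - b              # lab work only
c = P_R - b              # referral only

print(float(a), float(b), float(c), float(d))
```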

Problem 2 (#5 May 2003)

An insurance company examines its pool of auto insurance customers and gathers the following information:

(i) All customers insure at least one car.
(ii) 70% of the customers insure more than one car.
(iii) 20% of the customers insure a sports car.
(iv) Of the customers who insure more than one car, 15% insure a sports car.

Calculate the probability that a randomly selected customer insures exactly one car and that this car is not a sports car.

Solution

Formula-driven approach

Let C = insuring more than one car, S = insuring a sports car.

Because all customers insure at least one car, $\bar{C}$ = insuring exactly one car.

"Insuring exactly one car and this car is not a sports car" $= \bar{C} \cap \bar{S}$

So we need to find $P(\bar{C} \cap \bar{S})$. Since $\bar{C} \cap \bar{S}$ has two negation operators, we might want to simplify it first:

$\bar{C} \cap \bar{S} = \overline{C \cup S}$ (DeMorgan's law), so $P(\bar{C} \cap \bar{S}) = 1 - P(C \cup S)$

Intuitively, $\bar{C} \cap \bar{S}$ means no C and no S. $C \cup S$ means C or S or both. This is why $P(\bar{C} \cap \bar{S}) = 1 - P(C \cup S)$.

$P(C \cup S) = P(C) + P(S) - P(C \cap S) = P(C) + P(S) - P(C) P(S \mid C)$

$P(C) = 70\%$, $P(S) = 20\%$, and $P(S \mid C) = 15\%$.

$P(C \cup S) = P(C) + P(S) - P(C) P(S \mid C) = 70\% + 20\% - 70\% \times 15\% = 79.5\%$

$P(\bar{C} \cap \bar{S}) = 1 - P(C \cup S) = 1 - 79.5\% = 20.5\%$
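The arithmetic of the formula-driven approach to Problem 2 can be replayed step by step with exact fractions. This is a small sketch; the variable names are chosen here to mirror the notation in the solution.

```python
from fractions import Fraction

P_C = Fraction(70, 100)          # insures more than one car
P_S = Fraction(20, 100)          # insures a sports car
P_S_given_C = Fraction(15, 100)  # sports car, given more than one car

P_C_and_S = P_C * P_S_given_C    # multiplication rule
P_C_or_S = P_C + P_S - P_C_and_S # addition rule
answer = 1 - P_C_or_S            # DeMorgan: neither C nor S

print(float(P_C_and_S), float(P_C_or_S), float(answer))
```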

Common sense approach: table method

Let's develop the following table to sort out who did what.

     A                          B                C                D                  E
1                               Insure zero      Insure one       Insure 2 or more   Sum
                                sports car       sports car       sports cars
2    Insure one car                                               0%                 30%
3    Insure more than one car                                                        70%
4    Sum                                         20%                                 100%

Cell(C,4) = P(insuring one sports car) = 20%
Cell(E,3) = P(insuring more than one car) = 70%
Cell(E,2) = P(insuring one car) = 1 - 70% = 30%
Cell(D,2) = 0 (if you insure only one car, you cannot insure two or more sports cars)

Let's fill out some of the remaining cells.

Cell(C,3) = 70%(15%) = 10.5% (15% of those who insured more than one car insured one sports car.)
Cell(C,2) = Cell(C,4) - Cell(C,3) = 20% - 10.5% = 9.5%
Cell(B,2) = Cell(E,2) - [Cell(C,2) + Cell(D,2)] = 30% - [9.5% + 0] = 20.5%

The updated table:

     A                          B                C                  D                  E
1                               Insure zero      Insure one         Insure 2 or more   Sum
                                sports car       sports car         sports cars
2    Insure one car             20.5%            9.5%               0%                 30%
3    Insure more than one car                    70%(15%) = 10.5%                      70%
4    Sum                                         20%                                   100%

Problem 3

A survey of a group of consumers finds the following result:

59% bought life insurance products.
34% bought annuity products.
48% bought mutual fund products.
14% bought none (no life insurance, no annuity, no mutual fund).
17% bought both life insurance and annuity products.
30% bought both life insurance and mutual fund products.
19% bought both annuity and mutual fund products.

Find the % of the group who bought all three products (life insurance, annuities, and mutual funds).

Solution

Formula-driven approach

Let L, A, and M stand for the events that a surveyed consumer owns life insurance products, annuity products, and mutual fund products respectively.

$P(L) = 59\%$, $P(A) = 34\%$, $P(M) = 48\%$
$P(L \cap A) = 17\%$, $P(L \cap M) = 30\%$, $P(A \cap M) = 19\%$
$P(\bar{L} \cap \bar{A} \cap \bar{M}) = 14\%$ (14% bought none)

$P(L \cup A \cup M) = 1 - P(\overline{L \cup A \cup M}) = 1 - P(\bar{L} \cap \bar{A} \cap \bar{M}) = 1 - 14\% = 86\%$

We can also derive the above formula by reasoning. 14% bought none (no life insurance, no annuity, no mutual fund), so the percentage of customers who bought at least one product is $P(L \cup A \cup M) = 1 - 14\% = 86\%$.

Plug the above data into a memorized formula:

$P(L \cup A \cup M) = P(L) + P(A) + P(M) - P(L \cap A) - P(L \cap M) - P(A \cap M) + P(L \cap A \cap M)$

$86\% = 59\% + 34\% + 48\% - 17\% - 30\% - 19\% + P(L \cap A \cap M)$

Then $P(L \cap A \cap M) = 11\%$.
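As a quick arithmetic check, the inclusion-exclusion formula can be rearranged to solve for the triple intersection. The sketch below assumes nothing beyond the survey percentages; the function name `all_three` is illustrative only.

```python
def all_three(p_union, singles, pairs):
    """Solve the three-set inclusion-exclusion formula for the triple intersection.

    P(L or A or M) = sum(singles) - sum(pairs) + P(L and A and M)
    """
    return p_union - sum(singles) + sum(pairs)

# Work in whole percentage points so the arithmetic stays exact.
p_union = 100 - 14                       # 14% bought none
result = all_three(p_union, singles=[59, 34, 48], pairs=[17, 30, 19])
print(result)  # 11
```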

Venn Diagram:

Draw a Venn diagram with three overlapping circles (Life, Annuity, Mutual Fund). We will keep track of all the components in the diagram (letters A through H):

A = Life only (no Annuity, no Mutual Fund)
D = Annuity only (no Life, no Mutual Fund)
G = Mutual Fund only (no Life, no Annuity)
E = Life and Annuity and Mutual Fund
C = Life and Annuity only, no Mutual Fund
B = Life and Mutual Fund only, no Annuity
F = Mutual Fund and Annuity only, no Life
H = Nothing (i.e. no Life, no Annuity, no Mutual Fund)

C + E = Life and Annuity (whether there's Mutual Fund or not)
B + E = Life and Mutual Fund (whether there's Annuity or not)
F + E = Mutual Fund and Annuity (whether there's Life or not)

Life = A + B + C + E (sum of all the Life purchases)
Annuity = C + D + E + F (sum of all the Annuity purchases)
Mutual Fund = B + E + F + G (sum of all the Mutual Fund purchases)

All eight components together account for 100%: A + B + C + D + E + F + G + H = 100%

We are asked to find E. Next, we use the following information to set equations:

59% bought life insurance products: A + B + C + E = 59% (1)
34% bought annuity products: C + D + E + F = 34% (2)
48% bought mutual fund products: B + E + F + G = 48% (3)
14% bought nothing: H = 14% (4)
17% bought both life insurance and annuity products: C + E = 17% (5)
30% bought both life insurance and mutual fund products: B + E = 30% (6)
19% bought both annuity and mutual fund products: F + E = 19% (7)
Total is 100%: A + B + C + D + E + F + G + H = 100% (8)

We have 8 variables, A through H, and 8 equations. If you have a lot of patience, you should be able to solve these equations and track each component as follows:

A = 23%, B = 19%, C = 6%, D = 9%, E = 11%, F = 8%, G = 10%, H = 14%

So the % of people who purchase every product is E = 11%.

In contrast to the formula-driven approach, the Venn diagram method does not require you to memorize formulas. I recommend that you get comfortable with both methods. Please note that the table approach is too cumbersome to track things if you have 3 categories (like L, A, M in this problem) or more.

Homework for you: #1 May 2000; #9 Nov 2001; #3 Nov 2000; #31 May 2001; #1 May 2003
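For readers who would rather not grind through the eight equations by hand, here is one possible elimination order written out in Python. The order of the steps is a choice made for this sketch, not something the book prescribes; the variable names mirror the letters A through H and the equation numbers (1) to (8).

```python
# Survey data in whole percentage points.
life, annuity, fund, none = 59, 34, 48, 14
life_annuity, life_fund, annuity_fund = 17, 30, 19

H = none                                   # equation (4)
at_least_one = 100 - H                     # A+B+C+D+E+F+G, from equation (8)

# (1)+(2)+(3) counts each two-product region twice and E three times:
#   A + 2B + 2C + D + 3E + 2F + G = life + annuity + fund
# Subtracting A+B+C+D+E+F+G leaves B + C + F + 2E.
overlap = life + annuity + fund - at_least_one

# (5)+(6)+(7) gives B + C + F + 3E, so the difference is E.
E = (life_annuity + life_fund + annuity_fund) - overlap

# Back-substitute into (5), (6), (7), then (1), (2), (3).
C = life_annuity - E
B = life_fund - E
F = annuity_fund - E
A = life - (B + C + E)
D = annuity - (C + E + F)
G = fund - (B + E + F)

print(dict(A=A, B=B, C=C, D=D, E=E, F=F, G=G, H=H))
```

All eight components should add back to 100, which is a useful sanity check on the arithmetic.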

Chapter 6 Conditional Probability

Oftentimes we have partial information about an experiment. This partial information will change our calculation of probability.

Example 1. We throw two dice and sum up the two numbers. We want to find out the probability that the sum is 8. The sample space is:

(1,1) (2,1) (3,1) (4,1) (5,1) (6,1)
(1,2) (2,2) (3,2) (4,2) (5,2) (6,2)
(1,3) (2,3) (3,3) (4,3) (5,3) (6,3)
(1,4) (2,4) (3,4) (4,4) (5,4) (6,4)
(1,5) (2,5) (3,5) (4,5) (5,5) (6,5)
(1,6) (2,6) (3,6) (4,6) (5,6) (6,6)

Of the above 36 outcomes, 5 outcomes, namely (6,2), (5,3), (4,4), (3,5), and (2,6), give us a sum of 8. So the probability of getting a total of 8 is 5/36.

Now suppose we know that the throw of the 1st die gives a 5. Given this information, what's the probability of still getting a sum of 8? Given the 1st number is 5, the possible outcomes are (5,1), (5,2), (5,3), (5,4), (5,5), and (5,6). Of these 6 outcomes, only (5,3) gives us an 8. So the probability of getting an 8 given the 1st number is 5 is 1/6, not 5/36. Knowing that the 1st die gives us a 5 altered the probability.

Why does the new information alter the probability? Because the new information alters the sample space. The sample space before we know the 1st die gives us a 5 is the full set of 36 outcomes listed above.

The sample space after we know the 1st die gives us a 5 is:

(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)

The new sample space is smaller than the original one. The arrival of new information reduces our sample space. Consequently, we have to recalculate probability in the reduced sample space.

Let's look at the formula for conditional probability.

$P(B \mid A) = \dfrac{\text{\# of elements in } A \cap B}{\text{\# of elements in the reduced sample space } A} = \dfrac{(\text{\# of elements in } A \cap B) \,/\, (\text{\# of elements in the original sample space})}{(\text{\# of elements in the reduced sample space } A) \,/\, (\text{\# of elements in the original sample space})}$

But

$\dfrac{\text{\# of elements in } A \cap B}{\text{\# of elements in the original sample space}} = P(A \cap B)$, and $\dfrac{\text{\# of elements in the reduced sample space } A}{\text{\# of elements in the original sample space}} = P(A)$

So

$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$

In this example, let
A = getting a 5 when we throw the 1st die (new information)
B = getting a sum of 8 when we throw two dice.

Then $P(B \mid A)$ is the probability of getting a sum of 8 if the 1st die is 5. In this example, $P(A) = \frac{1}{6}$.

$P(A \cap B) = P(\text{1st \# is 5; 2nd \# is 3}) = P(\text{1st \# is 5}) \times P(\text{2nd \# is 3}) = \frac{1}{6} \times \frac{1}{6}$

$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{\frac{1}{6} \times \frac{1}{6}}{\frac{1}{6}} = \frac{1}{6}$

Please note that all the probability formulas that hold in the original sample space still hold in the reduced sample space. For example:

$P(B \mid A) + P(\bar{B} \mid A) = 1$
$P(B \cup C \mid A) = P(B \mid A) + P(C \mid A) - P(B \cap C \mid A)$

Example 2. We throw a die. Let X represent the # that is facing up. Calculate $P(X = 5 \mid X > 4)$.

Solution

$P(X = 5 \mid X > 4) = \dfrac{P(X = 5 \cap X > 4)}{P(X > 4)} = \dfrac{P(X = 5)}{P(X > 4)} = \dfrac{P(X = 5)}{P(X = 5 \cup X = 6)} = \dfrac{1}{2}$
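The counting argument behind Examples 1 and 2 can be checked by brute-force enumeration. The sketch below assumes fair dice, so that every outcome is equally likely; the helper name `conditional` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

def conditional(outcomes, event, given):
    """P(event | given) by counting equally likely outcomes in the reduced sample space."""
    reduced = [o for o in outcomes if given(o)]        # the reduced sample space
    favorable = [o for o in reduced if event(o)]       # outcomes in both events
    return Fraction(len(favorable), len(reduced))

# Example 1: two dice, B = sum is 8, A = first die shows 5.
two_dice = list(product(range(1, 7), repeat=2))
p_sum8 = Fraction(sum(1 for o in two_dice if sum(o) == 8), len(two_dice))
p_sum8_given_first5 = conditional(two_dice, lambda o: sum(o) == 8, lambda o: o[0] == 5)

# Example 2: one die, P(X = 5 | X > 4).
one_die = list(range(1, 7))
p_5_given_gt4 = conditional(one_die, lambda x: x == 5, lambda x: x > 4)

print(p_sum8, p_sum8_given_first5, p_5_given_gt4)
```

Counting inside the reduced sample space is exactly the first form of the conditional probability formula; dividing $P(A \cap B)$ by $P(A)$ would give the same fractions.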

For Example 2, let's look at how the sample space is altered.

The original sample space: 1 2 3 4 5 6
The new sample space after we know that X > 4: 5 6

The probability of getting a 5 in the new sample space is 0.5.

Example 3. (#39, May 2001) An insurance company insures a large # of homes. The insured value, X, of a randomly selected home is assumed to follow a distribution with density function:

$f(x) = 3x^{-4}$ for $x > 1$, and $f(x) = 0$ elsewhere.

Given that a randomly selected home is insured for at least 1.5, what is the probability that it is insured for less than 2?

Solution

$P(X < 2 \mid X > 1.5) = \dfrac{P(X < 2 \cap X > 1.5)}{P(X > 1.5)} = \dfrac{P(1.5 < X < 2)}{P(X > 1.5)}$

(Because X is continuous, including or excluding the end points 1.5 and 2 does not change these probabilities.)

$P(1.5 < X < 2) = \int_{1.5}^{2} f(x)\,dx = \int_{1.5}^{2} 3x^{-4}\,dx = \left[-x^{-3}\right]_{1.5}^{2} = 1.5^{-3} - 2^{-3}$

$P(X > 1.5) = \int_{1.5}^{+\infty} f(x)\,dx = \int_{1.5}^{+\infty} 3x^{-4}\,dx = \left[-x^{-3}\right]_{1.5}^{+\infty} = 1.5^{-3}$

$P(X < 2 \mid X > 1.5) = \dfrac{P(1.5 < X < 2)}{P(X > 1.5)} = \dfrac{1.5^{-3} - 2^{-3}}{1.5^{-3}} = 0.5781$

Homework for you: #7, Nov 2000; #17, #32, Nov 2001.
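The answer to Example 3 can be double-checked numerically. The sketch below compares the closed form with a simple midpoint-rule integration of the density; the midpoint rule and the helper names are choices made for this illustration, not part of the original solution.

```python
def density(x: float) -> float:
    """f(x) = 3 x^-4 for x > 1, and 0 elsewhere."""
    return 3.0 * x ** -4 if x > 1 else 0.0

def survival(x: float) -> float:
    """P(X > x) = x^-3 for x > 1 (the integral of the density from x to infinity)."""
    return x ** -3 if x > 1 else 1.0

def midpoint_integral(f, a: float, b: float, steps: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

# Closed form: (1.5^-3 - 2^-3) / 1.5^-3
closed_form = (survival(1.5) - survival(2.0)) / survival(1.5)

# Numerical check of the numerator, divided by the same denominator.
numeric = midpoint_integral(density, 1.5, 2.0) / survival(1.5)

print(round(closed_form, 4), round(numeric, 4))
```

Both numbers should round to the 0.5781 found above.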

Chapter 7 Bayes' theorem and posterior probabilities

Prior probability. Before anything happens, as our baseline analysis, we believe (based on existing information we have up to now or using purely subjective judgment) that our total risk pool consists of several homogeneous groups. As a part of our baseline analysis, we also assume that these homogeneous groups have different sizes. Any insured person randomly chosen from the population is charged a weighted average premium.

As an over-simplified example, we can divide all insureds, by the aggressiveness of a person's driving habits, into two homogeneous groups: aggressive drivers and non-aggressive drivers. In regards to the sizes of these two groups, we assume (based on existing information we have up to now or using purely subjective judgment) that the aggressive insureds account for 40% of the total insureds and the non-aggressive insureds account for the remaining 60%. So for an average driver randomly chosen from the population, we charge a weighted average premium rate (we believe that an average driver has some aggressiveness and some non-aggressiveness):

Premium charged on a person randomly chosen from the population
= 40% x (premium rate for an aggressive driver) + 60% x (premium rate for a non-aggressive driver)

Posterior probability. Then after a year, an event changed our belief about the makeup of the homogeneous groups for a specific insured. For example, we found that in one year one particular insured had three car accidents, while an average driver had only one accident in the same time period. So the three-accident insured definitely involved more risk than did the average driver randomly chosen from the population. As a result, the premium rate for the three-accident insured should be higher than an average driver's premium rate.

The new premium rate we will charge is still a weighted average of the rates for the two homogeneous groups, except that we use a higher weighting factor for an aggressive driver's rate and a lower weighting factor for a non-aggressive driver's rate. For example, we can charge the following new premium rate:

Premium rate for a driver who had 3 accidents last year
= 67% x (premium rate for an aggressive driver) + 33% x (premium rate for a non-aggressive driver)

In other words, we still think this particular driver's risk consists of two risk groups, aggressive and non-aggressive, but we alter the sizes of these two risk groups for this specific insured. So instead of assuming that this person's risk consists of 40% of an aggressive driver's risk and 60% of a non-aggressive driver's risk, we assume that his risk consists of 67% of an aggressive driver's risk and 33% of a non-aggressive driver's risk.

How do we come up with the new group sizes (or the new weighting factors)? There is a specific formula for calculating the new group sizes. For any given group,

Group size after an event = K x (the group size before the event) x (this group's probability to make the event happen)

K is a scaling factor to make the sum of the new sizes for all groups equal to 100%.

In our example above, this is how we got the new size for the aggressive group and the new size for the non-aggressive group. Suppose we know that the probability for an aggressive driver to have 3 car accidents in a year is 15%; the probability for a non-aggressive driver to have 3 car accidents in a year is 5%. Then for the driver who has 3 accidents in a year,

the size of the aggressive risk for someone who had 3 accidents in a year
= K x (prior size of the aggressive risk) x (probability of an aggressive driver having 3 car accidents in a year)
= K (40%)(15%)

the size of the non-aggressive risk for someone who had 3 accidents in a year
= K x (prior size of the non-aggressive risk) x (probability of a non-aggressive driver having 3 car accidents in a year)
= K (60%)(5%)

K is a scaling factor such that the sum of the posterior sizes is equal to one. So

K (40%)(15%) + K (60%)(5%) = 1, K = 1 / [40%(15%) + 60%(5%)] = 1 / 0.09 = 11.11

the size of the aggressive risk for someone who had 3 accidents in a year = 11.11 x (40%)(15%) = 66.67%
the size of the non-aggressive risk for someone who had 3 accidents in a year = 11.11 x (60%)(5%) = 33.33%

The above logic should make intuitive sense. The bigger the size of the group prior to the event, the higher the contribution this group will make to the event's occurrence; the bigger the probability for this group to make the event happen, the higher the contribution this group will make to the event's occurrence. So the product of the prior size of the group and the group's probability to make the event happen captures this group's total contribution to the event's occurrence. If we assign the post-event size of a group proportional to the product of the prior size and the group's probability to make the event happen, we are really assigning the post-event size of a group proportional to this group's total contribution to the event's occurrence. Again, this should make sense.

Let's summarize the logic for finding the new size of each group in the following table:

Event: An insured had 3 accidents in a year.

A: Homogeneous group        B: Before-event   C: Group's probability to   D = (scaling factor K) x B x C:
(2 components of a risk)    group size        make the event happen       post-event group size
Aggressive                  40%               15%                         K x 40% x 15% = (40% x 15%) / (40% x 15% + 60% x 5%)
Non-aggressive              60%               5%                          K x 60% x 5% = (60% x 5%) / (40% x 15% + 60% x 5%)

We can translate the above rule into a formal theorem:

If we divide the population into n non-overlapping groups $G_1, G_2, \ldots, G_n$ such that each element in the population belongs to one and only one group, then after the event E occurs,

$\Pr(G_i \mid E) = K \Pr(G_i) \Pr(E \mid G_i)$

K is a scaling factor such that

$\Pr(G_1 \mid E) + \Pr(G_2 \mid E) + \cdots + \Pr(G_n \mid E) = 1$

Or

$K\,[\Pr(G_1)\Pr(E \mid G_1) + \Pr(G_2)\Pr(E \mid G_2) + \cdots + \Pr(G_n)\Pr(E \mid G_n)] = 1$

So

$K = \dfrac{1}{\Pr(G_1)\Pr(E \mid G_1) + \Pr(G_2)\Pr(E \mid G_2) + \cdots + \Pr(G_n)\Pr(E \mid G_n)}$

And

$\Pr(G_i \mid E) = \dfrac{\Pr(G_i)\Pr(E \mid G_i)}{\Pr(G_1)\Pr(E \mid G_1) + \Pr(G_2)\Pr(E \mid G_2) + \cdots + \Pr(G_n)\Pr(E \mid G_n)}$
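The "prior size times likelihood, then rescale" rule translates almost word for word into code. The following is a minimal sketch; the function name `posterior` and its argument names are assumptions made for illustration, and the inputs are the driver figures used above.

```python
def posterior(priors, likelihoods):
    """Return Pr(G_i | E) for each group.

    priors[i]      = Pr(G_i), the before-event size of group i
    likelihoods[i] = Pr(E | G_i), group i's probability to make the event happen
    """
    # Unscaled after-event sizes: prior size x probability of the event.
    unscaled = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unscaled)                  # this is 1 / K
    if total == 0:
        raise ValueError("the event has zero probability under every group")
    # Rescale so the after-event sizes add up to 1.
    return [u / total for u in unscaled]

# Aggressive vs. non-aggressive drivers, event = 3 accidents in a year.
post = posterior(priors=[0.40, 0.60], likelihoods=[0.15, 0.05])
print([round(x, 4) for x in post])
```

Dividing by the sum of the unscaled sizes is the same as multiplying by the scaling factor K, so K never has to be computed separately.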

$\Pr(G_i \mid E)$ is the conditional probability that $G_i$ will happen given that the event E happened, so it is called the posterior probability. $\Pr(G_i \mid E)$ can be conveniently interpreted as the new size of group $G_i$ after the event E happened. Intuitively, probability can often be interpreted as a group size. For example, if the probability that a passing Course 4 candidate is female is 55% and male 45%, we can say that the total pool of the passing candidates consists of 2 groups, female and male, with their respective sizes of 55% and 45%.

$\Pr(G_i)$ is the probability that $G_i$ will happen prior to the event E's occurrence, so it's called the prior probability. $\Pr(G_i)$ can be conveniently interpreted as the size of group $G_i$ prior to the occurrence of E.

$\Pr(E \mid G_i)$ is the conditional probability that E will happen given that $G_i$ has happened. It is group $G_i$'s probability of making the event E happen. For example, say a candidate who has passed Course 3 has a 50% chance of passing Course 4, that is to say:

Pr(passing Course 4 | passing Course 3) = 50%

We can say that the people who passed Course 3 have a 50% chance of passing Course 4.

How to tackle the problem in 3 minutes

Solving a posterior probability problem for a discrete random variable is a simple matter of applying Bayes' theorem to a specific situation. Remember that for Bayes' theorem to work, you first need to divide the population into several non-overlapping groups such that everybody in the population belongs to one and only one group.

Sample Problems and Solutions

Before we jump into the formula, let's look at a sixth-grade level math problem, which requires zero knowledge about probability. If you understand this problem, you should have no trouble understanding Bayes' Theorem.

Problem 1

A rock is found to contain gold. It has 3 layers, each with a different density of gold. You are given:

The top layer, which accounts for 80% of the mass of the rock, has a gold density of only 0.1% (i.e. the amount of gold contained in the top layer is equal to 0.1% of the mass of the top layer).
The middle layer, which accounts for 15% of the rock's mass, has a gold density of 0.05%.
The bottom layer, which accounts for only 5% of the rock's mass, has a gold density of 0.002%.

Questions

What is the rock's density of gold (i.e. what % of the rock's mass is gold)?
Of the total amount of gold contained in the rock, what % of gold comes from the top layer? What % from the middle layer? What % comes from the bottom layer?

Solution

Let's set up a table to solve the problem. Assume that the mass of the rock is one (it can be 1 pound, 1 gram, 1 ton; it doesn't matter).

1   A: Layer   B: Mass of    C: Density of gold   D = B x C: Mass of gold   E = D / 0.000876: Of the total amount of gold
               the layer     in the layer         contained in the layer    in the rock, what % comes from this layer?
2   Top        0.80          0.100%               0.000800                  91.3%
3   Middle     0.15          0.050%               0.000075                  8.6%
4   Bottom     0.05          0.002%               0.000001                  0.1%
5   Total      1.00                               0.000876                  100%

As an example of the calculations in the above table,
Cell(D,2) = 0.80 x 0.100% = 0.000800,
Cell(D,5) = 0.000800 + 0.000075 + 0.000001 = 0.000876,
Cell(E,2) = 0.000800 / 0.000876 = 91.3%.

So the rock has a gold density of 0.000876 (i.e. 0.0876% of the mass of the rock is gold). Of the total amount of gold contained in the rock, 91.3% of the gold comes from the top layer, 8.6% of the gold comes from the middle layer, and the remaining 0.1% of the gold comes from the bottom layer. In other words, the top layer contributes 91.3% of the gold in the rock, the middle layer 8.6%, and the bottom layer 0.1%.

The logic behind this simple math problem is exactly the same logic behind Bayes' Theorem. Now let's change the problem into one about prior and posterior probabilities.

Problem 2

In underwriting life insurance applications for nonsmokers, an insurance company believes that there's an 80% chance that an applicant for life insurance qualifies for the standard nonsmoker class (which has the standard underwriting criteria and the standard premium rate); there's a 15% chance that an applicant qualifies for the preferred nonsmoker class (which has more stringent qualifying standards and a lower premium rate than the standard nonsmoker class); and there's a 5% chance that the applicant qualifies for the super preferred class (which has the highest underwriting standards and the lowest premium rate among nonsmokers).

According to medical statistics, different nonsmoker classes have different probabilities of having a specific heart-related illness:

The standard nonsmoker class has a 0.100% chance of getting the specific heart disease.
The preferred nonsmoker class has a 0.050% chance of getting the specific heart disease.
The super preferred nonsmoker class has a 0.002% chance of getting the specific heart disease.

If a nonsmoking applicant was found to have this specific heart-related illness, what is the probability of this applicant coming from the standard risk class? What is the probability of this applicant coming from the preferred risk class? What is the probability of this applicant coming from the super preferred risk class?

Solution

The solution to this problem is exactly the same as the one to the rock problem.

Event: the applicant was found to have the specific heart disease

1   A: Group          B: Before-event     C: This group's probability    D = B x C: After-event     E = D / 0.000876: After-event size
                      size of the group   of having the specific         size of the group          of the group (scaled, i.e. the
                                          heart illness                  (not yet scaled)           scaling factor = 1 / 0.000876)
2   Standard          0.80                0.100%                         0.000800                   91.3%
3   Preferred         0.15                0.050%                         0.000075                   8.6%
4   Super Preferred   0.05                0.002%                         0.000001                   0.1%
5   Total             1.00                                               0.000876                   100%

So if the applicant was found to have the specific heart disease, then
There's a 91.3% chance he comes from the standard risk class;
There's an 8.6% chance he comes from the preferred risk class;
There's a 0.1% chance he comes from the super preferred risk class.

Problem 3 (continuous random variable)

You are tossing a coin. Not knowing p, the success rate of heads showing up in one toss of the coin, you subjectively assume that p is uniformly distributed over [0,1]. Next, you do an experiment by tossing the coin 3 times. You find that, in this experiment, 2 out of 3 tosses have heads. Calculate the posterior probability distribution of p.

Solution

Event: getting 2 heads out of 3 tosses.

Row 2, group "any p in [0,1]":
B: Before-event size of the group = 1
C: This group's probability to make the event happen = $C_3^2\, p^2 (1-p)$
D = B x C: After-event size of the group (not yet scaled) = $1 \times C_3^2\, p^2 (1-p)$
E = D x scaling factor: After-event size of the group (scaled) = $\dfrac{C_3^2\, p^2 (1-p)}{\int_0^1 C_3^2\, p^2 (1-p)\,dp}$

Row 3, total:
B = 1
D = $\int_0^1 C_3^2\, p^2 (1-p)\,dp$
E = 100%

The key to solving this problem is to understand that we have an infinite number of groups. Each value of p ($0 \le p \le 1$) is a group. Because p is uniform over [0,1], $f(p) = 1$. As a result, for a given group of p, the before-event size is one. And for a given group of p, this group's probability to make the event "getting 2 heads out of 3 tosses" happen is the binomial probability $C_3^2\, p^2 (1-p)$. So the after-event size is

$k \times 1 \times C_3^2\, p^2 (1-p)$

where k is the scaling factor, 1 is the before-event group size, and $C_3^2\, p^2 (1-p)$ is the group's probability to have 2 heads out of 3 tosses.

k is a scaling factor such that the sum of the after-event sizes for all the groups is equal to one. Since we have an infinite number of groups, we have to use integration to sum up all the after-event sizes for each group:

$k \int_0^1 C_3^2\, p^2 (1-p)\,dp = 1$, so $k = \dfrac{1}{\int_0^1 C_3^2\, p^2 (1-p)\,dp}$

Then the after-event size (or posterior probability) is:

$k\, C_3^2\, p^2 (1-p) = \dfrac{C_3^2\, p^2 (1-p)}{\int_0^1 C_3^2\, p^2 (1-p)\,dp} = \dfrac{p^2 (1-p)}{\int_0^1 p^2 (1-p)\,dp}$

It turns out that the posterior probability we just calculated is termed the "Beta distribution."

Let's generalize this problem: You are tossing a coin. Not knowing p, the success rate of heads showing up in one toss of the coin, you subjectively assume that p is uniformly distributed over [0,1]. Next, you do an experiment by tossing the coin m + n times (where m, n are non-negative integers). You find that, in this experiment, m out of these m + n tosses have heads. Then the posterior probability of p is:

$f(p) = \dfrac{p^m (1-p)^n}{\int_0^1 p^m (1-p)^n\,dp}$

The above distribution $f(p)$ is called the Beta distribution. If we set $m = \alpha - 1$ and $n = \beta - 1$, where $\alpha > 0$ and $\beta > 0$, we have

$f(p) = \dfrac{p^{\alpha - 1} (1-p)^{\beta - 1}}{\int_0^1 p^{\alpha - 1} (1-p)^{\beta - 1}\,dp}$, where $0 \le p \le 1$

This is a generalized Beta distribution.

Please note that finding the posterior probability for continuous random variables is not on the Exam P syllabus (it's on Exam C or the new Course 4). However, we introduced it here anyway for two purposes: (1) to get a sense of how to use Bayes' theorem and calculate the posterior probability for a continuous random variable; and more importantly (2) to derive the Beta distribution. The Beta distribution is on the Exam P syllabus. Without using the concept of posterior probability, it is very hard for us to intuitively interpret the Beta distribution. We'll pick up the Beta distribution later.

Final note. The accuracy of the posterior probability under Bayes' Theorem depends on the accuracy of the prior probability. If the prior probability is way off (because it is based on existing data or purely subjective judgment), the posterior probability will be way off.

Homework for you: #2, #33 May 2000; #12, #22, #28 Nov 2000; #6, #23 May 2001; #4 Nov 2001; #8, #31 May 2003.
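To make the coin-tossing posterior concrete, the normalizing integral can be evaluated exactly. The sketch below relies on the standard identity $\int_0^1 p^m (1-p)^n\,dp = \dfrac{m!\,n!}{(m+n+1)!}$ for non-negative integers m and n, which is not stated in the text above; the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial

def beta_normalizer(m: int, n: int) -> Fraction:
    """Integral of p^m (1-p)^n over [0, 1], using m! n! / (m+n+1)! (integer m, n >= 0)."""
    return Fraction(factorial(m) * factorial(n), factorial(m + n + 1))

def posterior_density(p, m: int, n: int):
    """Posterior density of p after m heads and n tails, starting from a uniform prior."""
    return p ** m * (1 - p) ** n / beta_normalizer(m, n)

# 2 heads and 1 tail: the normalizer is 1/12, so f(p) = 12 p^2 (1 - p).
print(beta_normalizer(2, 1))                       # 1/12
print(posterior_density(Fraction(1, 2), 2, 1))     # 3/2
```

Using `Fraction` keeps the arithmetic exact, so for the 2-heads-out-of-3 experiment the scaling factor k times $C_3^2$ comes out as exactly 12.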

Chapter 8 Random variables

A random variable is a function that assigns a number to each element of the sample space. We typically write a random variable in a capital letter such as X.

Example 1. If we flip a coin and observe which side is up, the sample space is {H, T}. If we assign H=1 and T=0, our random variable is:

X = 1 with probability of 0.5
X = 0 with probability of 0.5

There's more than one way to assign a value to each element in the sample space. In the flip of a coin, we can also assign H=0 and T=1. In this case, the random variable is:

X = 0 with probability of 0.5
X = 1 with probability of 0.5

Of course, you can assign H=2 and T=3, etc.

Example 2. If we flip a coin three times, the sample space is:

$\Omega$ = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Each element in the sample space has a 0.125 chance of occurring. If we assign X = # of heads, Y = # of tails, Z = # of successive flips that have the same outcome, then each element corresponds to the following 3 sets of numerical values:

Element   X   Y   Z   Probability
HHH       3   0   3   0.125
HHT       2   1   2   0.125
HTH       2   1   1   0.125
HTT       1   2   2   0.125
THH       2   1   2   0.125
THT       1   2   1   0.125
TTH       1   2   2   0.125
TTT       0   3   3   0.125

Please note that in HHH, we have 3 consecutive heads; so Z=3. In HTH, no two consecutive outcomes are the same; so Z=1. Here X, Y, Z are three random variables.
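Since a random variable is just a function on the sample space, the table in Example 2 can be generated mechanically. In the sketch below, Z is read as the length of the longest run of identical outcomes, which is an interpretation that reproduces every row of the table; the helper name `longest_run` is made up for illustration.

```python
from itertools import groupby, product

def longest_run(flips) -> int:
    """Length of the longest block of identical consecutive outcomes."""
    return max(len(list(group)) for _, group in groupby(flips))

# Map each element of the sample space to the triple (X, Y, Z).
table = {
    "".join(flips): (flips.count("H"), flips.count("T"), longest_run(flips))
    for flips in product("HT", repeat=3)
}

for element, (x, y, z) in table.items():
    print(element, x, y, z, 1 / len(table))
```

Each of the eight elements is printed with its X, Y, Z values and its probability 0.125.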

Example 3. If we roll a die and record the side that's face up, the sample space is $\Omega$ = {1, 2, 3, 4, 5, 6}. We can assign a value to each element in the sample space as follows:

Element of the sample space   1     2     3     4     5     6
X                             1     2     3     4     5     6
Probability                   1/6   1/6   1/6   1/6   1/6   1/6

Here X is a random variable. Of course, you can assign values differently as follows:

Element of the sample space   1     2     3     4     5     6
X                             6     5     4     3     2     1
Probability                   1/6   1/6   1/6   1/6   1/6   1/6

You see that a random variable is really an arbitrary translation of each element in the sample space into a number. Of course, some translation schemes are more useful than others.

What do we gain by such translation? By mapping the entire sample space into a series of numbers, we can extract relevant information from the sample space to solve the problem at hand, while ignoring other details of the sample space.

Example 4. We flip a coin and are interested in finding the # of times heads are up. If we assign 1 and 0 to H and T respectively, then the information about the # of heads up can be conveniently summarized as follows:

X = 1 with probability of 0.5
X = 0 with probability of 0.5

You see that we have reduced the coin flipping process to a simple, elegant math equation. More importantly, this equation answers our question at hand.

Example 5. We flip a coin three times. We are concerned about the # of times heads show up. Let X represent the # of times heads show up; then we have:

X             3       2       2       1       2       1       1       0
Probability   0.125   0.125   0.125   0.125   0.125   0.125   0.125   0.125

Example 6. We roll a die and record the side that's face up. We are interested in finding the probability of getting 1, 2, 3, 4, 5, and 6 respectively. If we let random variable X represent the number that's face up, then we have:

X             1     2     3     4     5     6
Probability   1/6   1/6   1/6   1/6   1/6   1/6

Expressed more succinctly:

$P(X = n) = \frac{1}{6}$, where n = 1, 2, 3, 4, 5, 6

Discrete random variable vs. continuous random variable

If a random variable can take on only discrete values, then it's a discrete random variable. If a random variable can take on any value in a range, then it's a continuous random variable.

Example 7. Let the random variable X represent the # of heads we get from flipping a coin n times. Then X can take on integer values ranging from 0 to n. X is a discrete random variable.

Example 8. Let random variable Y represent the number randomly chosen from the range [0,1]. Then Y can take on any value in [0,1]. Y is a continuous random variable.

PMF and CDF for discrete random variables

Probability mass function

The most important way to describe a discrete random variable is through the probability mass function (PMF). If x is a possible value of the random variable X, the probability mass of x, denoted as $p_X(x)$, is the probability that X = x:

$p_X(x) = P(X = x)$

Example 9. We flip a coin twice and record the # of times we get heads. Let X represent the # of heads in 2 flips of a coin. The probability mass function of X is:

$p_X(x) = 1/4$ if $x = 0$
$p_X(x) = 1/2$ if $x = 1$
$p_X(x) = 1/4$ if $x = 2$

A probability mass function must satisfy the 3 axioms:

$p_X(x) \ge 0$
$p_X(x = a \cup x = b) = p_X(a) + p_X(b)$, where $a \ne b$
$\sum_x p_X(x) = 1$

The second condition is trivial. Since $a \ne b$, we have $p_X(x = a \cap x = b) = 0$. So $p_X(x = a \cup x = b) = p_X(a) + p_X(b)$ automatically holds. So a valid PMF needs to satisfy the following two conditions:

$p_X(x) \ge 0$, $\sum_x p_X(x) = 1$

Example 10. You are given the following PMF:

$p_N(n) = e^{-\lambda} \dfrac{\lambda^n}{n!}$, where $n = 0, 1, 2, \ldots, +\infty$ and $\lambda$ is a positive constant

Verify that this is a legitimate PMF.

Solution

$p_N(n) = e^{-\lambda} \dfrac{\lambda^n}{n!} \ge 0$ for $n = 0, 1, 2, \ldots, +\infty$

$\sum_{n=0}^{+\infty} p_N(n) = \sum_{n=0}^{+\infty} e^{-\lambda} \dfrac{\lambda^n}{n!} = e^{-\lambda} \sum_{n=0}^{+\infty} \dfrac{\lambda^n}{n!}$

$\sum_{n=0}^{+\infty} \dfrac{\lambda^n}{n!} = 1 + \lambda + \dfrac{\lambda^2}{2!} + \dfrac{\lambda^3}{3!} + \cdots + \dfrac{\lambda^n}{n!} + \cdots = e^{\lambda}$ (Taylor series)

$\sum_{n=0}^{+\infty} p_N(n) = e^{-\lambda} \sum_{n=0}^{+\infty} \dfrac{\lambda^n}{n!} = e^{-\lambda}\left(e^{\lambda}\right) = 1$

So $p_N(n) = e^{-\lambda} \dfrac{\lambda^n}{n!}$ is a valid PMF.

Example 11. A special die has 3 sides painted 1, 2, and 3 respectively. If the die is thrown, each side has an equal chance of landing face up on the ground. Two such dice are thrown together; let X represent the sum of the two sides facing up. Find the probability mass function of X.

Solution

                            outcome of the 2nd throw
                            1     2     3
outcome of the        1     2     3     4
1st throw             2     3     4     5
                      3     4     5     6

In the above table, the interior cells represent the values of X. Because each side has a 1/3 chance of landing face up, each cell has a $(1/3)^2 = 1/9$ chance of occurring.

We convert the above table into the new table below:

x          2     3     4     5     6
p_X(x)     1/9   2/9   3/9   2/9   1/9

To understand the above table, let's look at $p_X(3) = 2/9$. There are two ways to have X = 3: you get a 1 from the 1st die and a 2 from the 2nd die (with probability of 1/9); or you get a 2 from the 1st die and a 1 from the 2nd die (with probability of 1/9). So the total probability of having X = 3 is 2/9.

Example 12. Claim payment, X, has the following PMF:

x          $0     $50    $80    $135   $250   $329
p_X(x)     0.32   0.2    0.18   0.1    0.15   0.05

Calculate
1. $P(X > 120)$
2. $P(X \le 300 \mid X > 120)$

Solution

$P(X > 120) = P(X = 135) + P(X = 250) + P(X = 329) = 0.1 + 0.15 + 0.05 = 0.3$

$P(X \le 300 \mid X > 120) = \dfrac{P(X > 120 \cap X \le 300)}{P(X > 120)} = \dfrac{P(120 < X \le 300)}{P(X > 120)}$

$P(120 < X \le 300) = P(X = 135) + P(X = 250) = 0.1 + 0.15 = 0.25$

$P(X \le 300 \mid X > 120) = \dfrac{P(120 < X \le 300)}{P(X > 120)} = \dfrac{0.25}{0.3} = 0.833$

Cumulative probability function (CDF)

The cumulative probability function is defined as $F_X(x) = P(X \le x)$.

Formulas:

$P(a < X \le b) = F(b) - F(a)$
$P(a \le X \le b) = F(b) - F(a) + p(a)$

Proof. The event $\{a < X \le b\}$ is the event $\{X \le b\}$ with the event $\{X \le a\}$ removed, so $P(a < X \le b) = F(b) - F(a)$. Then

$P(a \le X \le b) = P(a < X \le b) + p(a) = F(b) - F(a) + p(a)$

Example 13. If the PMF for X is:

x          2     3     4     5     6
p_X(x)     1/9   2/9   3/9   2/9   1/9

Then,

$F(-\infty) = P(X \le -\infty) = 0$.
$F(0) = P(X \le 0) = 0$. The minimum value of x is 2; there's no way to have $X \le 0$.
$F(1) = P(X \le 1) = 0$. There's no way to have $X \le 1$.
$F(2) = P(X \le 2) = p(2) = 1/9$
$F(3) = P(X \le 3) = p(2) + p(3) = 1/9 + 2/9 = 1/3$

http://guo.coursehost.com F ( 4) = P ( X F ( 5) = P ( X F (6) = P ( X F (7) = P ( X 4 ) = p ( 2 ) + p ( 3) + p ( 4 ) = 1 9 + 2 9 + 3 9 = 2 3 5 ) = p ( 2 ) + p ( 3) + p ( 4 ) + p ( 5 ) = 1 9 + 2 9 + 3 9 + 2 9 = 8 9 6 ) = p ( 2 ) + p ( 3) + p ( 4 ) + p ( 5 ) + p ( 6 ) = 1 9 + 2 9 + 3 9 + 2 9 + 1 9 = 1 7 ) = p ( 2 ) + p ( 3) + p ( 4 ) + p ( 5 ) + p ( 6 ) + p ( 7 ) = 1 9 + 2 9 + 3 9 + 2 9 + 1 9 + 0 = 1 because p ( 7 ) = 0 F (+ ) = P(X + ) = P( X 6) = 1 PDF and CDF for continuous random variables For a continuous random variable X , the probability density function (PDF), f ( x ) , is defined as: P (a x b) = b f ( x )dx a P ( a x b ) is the area under the graph f ( x ) . Because including or excluding the end points doesnt affect the area, including or excluding the end points doe snt affect the probability: P (a < X < b) = P ( a X < b) = P ( a X b) = P (a < X b) = b f ( x )dx a The CDF (cumulative probability function) of the continuous random variable X is defined as: F ( x) = P ( X x ) . This is the same definition when X is discrete. If a random variable is discrete, we say PMF (probability mass function); if a r andom variable is continuous, we say PDF (probability density function). Whether a random variable is discrete or continuous, we always say CDF (cumulative prob ability function). Please note that often for the sake of convenience, people us e f ( x ) to refer to either PMF p X ( x ) or PDF f ( x ) . Yufeng Guo, Deeper Understanding: Exam P Page 79 of 425
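The PMF of Example 11 and the CDF values of Example 13 can be cross-checked by brute-force enumeration. The sketch below is only an illustration (the names `pmf` and `cdf` are made up here, not taken from the book); it uses exact fractions so the results compare directly with the tables above.

```python
from fractions import Fraction
from itertools import product

# PMF of X = sum of two fair 3-sided dice (Example 11).
sides = (1, 2, 3)
pmf = {}
for a, b in product(sides, repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 9)

def cdf(x):
    """F(x) = P(X <= x): add up the PMF over all values not exceeding x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

# P(a < X <= b) = F(b) - F(a), here with a = 2 and b = 5.
prob_2_to_5 = cdf(5) - cdf(2)
print(pmf[3], cdf(3), cdf(5), prob_2_to_5)
```

The same `cdf` helper returns 0 below the smallest value and 1 at or above the largest, matching F(1) = 0 and F(7) = 1 in Example 13.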

Properties of CDF

Rule 1. \( F(x) = P(X \le x) \) for all x. This is just the definition.

Rule 2. A CDF can never be decreasing. If \( a \le b \), then \( F(a) \le F(b) \).

To see why, notice \( F(x) = P(X \le x) \). If \( a \le b \), then \( \{X \le b\} \) contains \( \{X \le a\} \). So \( P(X \le b) \ge P(X \le a) \). In other words, \( F(b) \ge F(a) \).

Rule 3. \( F(-\infty) = 0 \) and \( F(+\infty) = 1 \).

They are true for both discrete and continuous random variables. To see why, notice \( -\infty < X < +\infty \). There's zero chance that X can be smaller than or equal to \( -\infty \), so \( F(-\infty) = P(X \le -\infty) = 0 \). On the other hand, we are 100% certain that X cannot exceed \( +\infty \). So \( F(+\infty) = P(X \le +\infty) = 1 \).

Rule 4. If X is discrete and takes integer values, the PMF and CDF can be obtained from each other by summing or differencing:
\[ F(k) = \sum_{i=-\infty}^{k} p_X(i) \quad \text{(this is the definition of } F(k)\text{)} \]
\[ p_X(k) = P(X \le k) - P(X \le k - 1) = F(k) - F(k - 1) \]

Rule 5. If X is continuous, the PDF and CDF can be obtained from each other by integration or differentiation:
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt, \qquad f(x) = \frac{d}{dx} F(x) \]

By definition, \( F(x) = P(X \le x) = P(-\infty < X \le x) = \int_{-\infty}^{x} f(t)\,dt \). Taking the derivative of both sides of \( F(x) = \int_{-\infty}^{x} f(t)\,dt \) gives us \( f(x) = \frac{d}{dx} F(x) \).

Example 14. X has the following density: \( f(x) = 3x^2 \) where \( 0 \le x \le 1 \). Then, for \( 0 \le x \le 1 \),
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x f(t)\,dt = \int_0^x 3t^2\,dt = x^3 \]
\[ P(0.2 \le X \le 0.6) = F(0.6) - F(0.2) = 0.6^3 - 0.2^3 = 0.208 \]
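Rule 5 says the area under the PDF between two points equals the difference of the CDF at those points. As a rough numerical check of Example 14, the sketch below integrates the density with a simple midpoint rule (the helper names and the step count are arbitrary choices for illustration) and compares the result with F(0.6) - F(0.2).

```python
def pdf(x):
    """Density from Example 14: f(x) = 3x^2 on [0, 1], zero elsewhere."""
    return 3 * x * x if 0 <= x <= 1 else 0.0

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def cdf(x):
    """Closed-form CDF: F(x) = x^3 on [0, 1], clamped to 0 and 1 outside."""
    return min(max(x, 0.0), 1.0) ** 3

area = integrate(pdf, 0.2, 0.6)      # area under the PDF
by_cdf = cdf(0.6) - cdf(0.2)         # difference of the CDF
print(round(area, 6), round(by_cdf, 6))
```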

Example 15. A real number is randomly chosen from [0,1]. Then this number is squared. Let X represent the result. Find the PDF and CDF for X.

Solution

We'll find the CDF first. Let U represent the number randomly drawn from [0,1]. Then \( X = U^2 \).
\[ F_X(x) = P(X \le x) = P(U^2 \le x) = P(U \le \sqrt{x}) \]

Because any number in the interval [0,1] has an equal chance of being drawn, \( P(U \le \sqrt{x}) \) must be proportional to the length of the interval \( [0, \sqrt{x}] \). The total probability is \( P(U \le 1) = 1 \): we are 100% certain that any number taken from [0,1] does not exceed 1. Consequently,
\[ P(U \le \sqrt{x}) = \frac{\text{length of } [0, \sqrt{x}]}{\text{length of } [0, 1]} = \frac{\sqrt{x}}{1} = \sqrt{x} \]
\[ F_X(x) = \sqrt{x} \quad \text{where } 0 \le x \le 1 \]

If \( x \le 0 \), then \( F_X(x) = 0 \); if \( x \ge 1 \), then \( F_X(x) = 1 \). So the CDF is:
\[ F_X(x) = \begin{cases} 0 & \text{if } x \le 0 \\ \sqrt{x} & \text{if } 0 < x \le 1 \\ 1 & \text{if } x \ge 1 \end{cases} \]
\[ f(x) = \frac{d}{dx} F(x) = \frac{d}{dx} \sqrt{x} = \frac{1}{2\sqrt{x}} \quad \text{where } 0 < x \le 1, \]
and f(x) is zero elsewhere.

Please note the following key difference between the PMF of a discrete random variable and the PDF of a continuous random variable: a PMF is a real probability and its value must not exceed one; a PDF is a "fake" probability and can take on any non-negative value. The PDF by itself doesn't have any meaning as a probability. For the PDF to be useful, we must integrate it over a range.

In the example above, the PDF is \( f(x) = \dfrac{1}{2\sqrt{x}} \) for \( 0 < x \le 1 \). When \( x \to 0 \), \( f(x) = \dfrac{1}{2\sqrt{x}} \to +\infty \). So \( f(x) = \dfrac{1}{2\sqrt{x}} \) is not a probability. To get a probability, we must integrate f(x) over a range. For example, if we integrate f(x) over [a, b], we'll get a real probability:
\[ P(a < X \le b) = \int_a^b f(x)\,dx \]

Mean and variance of a random variable

You just have to memorize a series of formulas.

If X is discrete, then
\[ \text{mean: } E(X) = \sum_x x\, p_X(x) \]
\[ \text{variance: } Var(X) = E\left[ (X - E(X))^2 \right] = \sum_x (x - E(X))^2\, p_X(x) = E(X^2) - E^2(X) \]

If X is continuous, then
\[ \text{mean: } E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx \]
\[ \text{variance: } Var(X) = E\left[ (X - E(X))^2 \right] = \int_{-\infty}^{+\infty} (x - E(X))^2 f(x)\,dx = E(X^2) - E^2(X) \]

Standard deviation of X, no matter whether X is continuous or discrete:
\[ \sigma_X = \sqrt{Var(X)} \]

Example 16.

x          1      2      3      4      5
p(x)       0.2    0.15   0.05   0.43   0.17

Then
\[ E(X) = \sum_x x\, p_X(x) = 1(0.2) + 2(0.15) + 3(0.05) + 4(0.43) + 5(0.17) = 3.22 \]
\[ Var(X) = E\left[ (X - E(X))^2 \right] = 0.2(1 - 3.22)^2 + 0.15(2 - 3.22)^2 + 0.05(3 - 3.22)^2 + 0.43(4 - 3.22)^2 + 0.17(5 - 3.22)^2 = 2.0116 \]
\[ \sigma_X = \sqrt{Var(X)} = 1.4183 \]

Alternative way to calculate Var(X):
\[ E(X^2) = \sum_x x^2 p_X(x) = 1^2(0.2) + 2^2(0.15) + 3^2(0.05) + 4^2(0.43) + 5^2(0.17) = 12.38 \]
\[ Var(X) = E(X^2) - E^2(X) = 12.38 - 3.22^2 = 2.0116 \]
\[ \sigma_X = \sqrt{Var(X)} = 1.4183 \]

Example 17. \( f(x) = 3x^2 \) where \( 0 \le x \le 1 \). Then
\[ E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx = \int_0^1 x (3x^2)\,dx = \frac{3}{4} \]
\[ Var(X) = E\left[ (X - E(X))^2 \right] = \int_{-\infty}^{+\infty} (x - E(X))^2 f(x)\,dx = \int_0^1 \left( x - \frac{3}{4} \right)^2 3x^2\,dx = \frac{3}{80} \]
\[ \sigma_X = \sqrt{Var(X)} = \sqrt{\frac{3}{80}} \approx 0.1936 \]

Alternative way to calculate Var(X):
\[ E(X^2) = \int_{-\infty}^{+\infty} x^2 f(x)\,dx = \int_0^1 x^2 (3x^2)\,dx = \frac{3}{5} \]
\[ Var(X) = E(X^2) - E^2(X) = \frac{3}{5} - \left( \frac{3}{4} \right)^2 = \frac{3}{80} \]
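Both routes to the variance in Examples 16 and 17 can be reproduced in a few lines. This is a minimal sketch, not the book's method (the book relies on calculators): the discrete case sums over the PMF, and the continuous case uses the fact that for f(x) = 3x^2 on [0, 1] the k-th moment is the integral of 3x^(k+2), which is 3/(k+3). The variable and function names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Example 16: discrete PMF given as {value: probability}.
pmf = {1: 0.2, 2: 0.15, 3: 0.05, 4: 0.43, 5: 0.17}
mean = sum(x * p for x, p in pmf.items())
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mean ** 2
sd = sqrt(var_shortcut)

# Example 17: f(x) = 3x^2 on [0, 1], so E(X^k) = 3 / (k + 3) exactly.
def moment(k):
    return Fraction(3, k + 3)

var_17 = moment(2) - moment(1) ** 2
print(mean, var_definition, var_shortcut, sd, var_17)
```

The two discrete variance formulas agree up to floating-point rounding, which is the point of the "alternative way" in the text.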

Mean of a function

Many times we need to find \( E[Y] \) where \( Y = g(X) \). One way to find \( E[Y] \) is to find the pdf \( f_Y(y) \) and then calculate \( E(Y) = \int_{-\infty}^{+\infty} y f_Y(y)\,dy \). However, finding \( f_Y(y) \) is not easy. Fortunately, we can calculate \( E[Y] \) without finding \( f_Y(y) \):
\[ E[Y = g(X)] = \sum_x g(x)\, p_X(x) \quad \text{if X is discrete} \]
\[ E[Y = g(X)] = \int_{-\infty}^{+\infty} g(x) f(x)\,dx \quad \text{if X is continuous} \]

Don't worry about how to prove it. Just memorize it.

Example 18. \( Y = X^2 - 1 \), where X has the following distribution: \( f(x) = e^{-x} \), where \( x > 0 \). Then
\[ E(X^2 - 1) = \int_0^{+\infty} (x^2 - 1) e^{-x}\,dx = \int_0^{+\infty} x^2 e^{-x}\,dx - \int_0^{+\infty} e^{-x}\,dx \]

Integrating by parts:
\[ \int_0^{+\infty} x^2 e^{-x}\,dx = -\int_0^{+\infty} x^2\,d(e^{-x}) = -x^2 e^{-x} \Big|_0^{+\infty} + \int_0^{+\infty} e^{-x}\,d(x^2) = 2 \int_0^{+\infty} x e^{-x}\,dx \]
\[ \int_0^{+\infty} x e^{-x}\,dx = -\int_0^{+\infty} x\,d(e^{-x}) = -x e^{-x} \Big|_0^{+\infty} + \int_0^{+\infty} e^{-x}\,dx = 1 \]
\[ \int_0^{+\infty} e^{-x}\,dx = -e^{-x} \Big|_0^{+\infty} = 1 \]

So \( \int_0^{+\infty} x^2 e^{-x}\,dx = 2 \) and
\[ E(X^2 - 1) = \int_0^{+\infty} (x^2 - 1) e^{-x}\,dx = 2 - 1 = 1 \]

Alternative method: \( e^{-x} \) is the exponential pdf (probability density function) with mean \( \theta = 1 \). Consequently,
\[ \int_0^{+\infty} x^2 e^{-x}\,dx = E(X^2) = E^2(X) + Var(X) = \theta^2 + \theta^2 = 2\theta^2 = 2 \]
\[ E(X^2 - 1) = E(X^2) - E(1) = 2 - 1 = 1 \]

Properties of mean:
\[ E(aX + b) = a E(X) + b \]
\[ E(X + Y) = E(X) + E(Y) \]
\[ E(XY) = E(X) E(Y) \quad \text{if X and Y are independent} \]

Example 19. X has the following distribution:

x          1      2      3      4
p_X(x)     0.15   0.2    0.3    0.35

\( Y = X^2 - 1 \). Calculate \( Var(Y) \).

Solution

x                 1      2      3      4
p_X(x)            0.15   0.2    0.3    0.35
Y = X^2 - 1       0      3      8      15
Y^2               0      9      64     225

\[ E[Y = g(X)] = \sum_x g(x)\, p_X(x) = 0(0.15) + 3(0.2) + 8(0.3) + 15(0.35) = 8.25 \]
\[ E[Y^2] = \sum_x g^2(x)\, p_X(x) = 0(0.15) + 9(0.2) + 64(0.3) + 225(0.35) = 99.75 \]
\[ Var(Y) = E(Y^2) - E^2(Y) = 99.75 - 8.25^2 = 31.6875 \]
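Example 19 applies the rule E[g(X)] = sum of g(x) p(x) twice, once for Y and once for Y^2. The sketch below mirrors that table calculation; `expect` and `g` are illustrative names, not anything defined in the book.

```python
# PMF of X from Example 19.
pmf = {1: 0.15, 2: 0.2, 3: 0.3, 4: 0.35}

def expect(g, pmf):
    """E[g(X)] = sum of g(x) * p(x); the PMF of g(X) itself is never needed."""
    return sum(g(x) * p for x, p in pmf.items())

def g(x):
    """The transformation Y = X^2 - 1."""
    return x * x - 1

mean_y = expect(g, pmf)
second_moment_y = expect(lambda x: g(x) ** 2, pmf)
var_y = second_moment_y - mean_y ** 2
print(mean_y, second_moment_y, var_y)
```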

Chapter 9 Independence

Event A and Event B are independent if and only if
\[ P(A \cap B) = P(A) P(B) \]

Events \( A_1, A_2, \ldots, A_n \) are mutually independent if and only if for any \( 2 \le k \le n \) and for any subset of indices \( i_1, i_2, \ldots, i_k \),
\[ P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) P(A_{i_2}) \cdots P(A_{i_k}) \]

Sample Problems and Solutions

Problem 1

A system consists of four independent components connected as in the diagram below. The system works as long as electric current can flow through the system from left to right. Each component has a failure rate of 5%, independent of any other components. Determine the probability that the system fails.

[Diagram: components A and B in series on one branch; components C and D in series on a second, parallel branch]

Solution

Let's first calculate the probability that the system works.

Let the event A represent that the component A works,
let the event B represent that the component B works,
let the event C represent that the component C works,
let the event D represent that the component D works.

The event that the system works is \( S = (A \cap B) \cup (C \cap D) \).

Use the formula \( P(M \cup N) = P(M) + P(N) - P(M \cap N) \):
\[ P(S) = P[(A \cap B) \cup (C \cap D)] = P(A \cap B) + P(C \cap D) - P[(A \cap B) \cap (C \cap D)] \]
\[ P(A \cap B) = P(A) P(B) \quad \text{(because A, B are independent)} \]
\[ P(C \cap D) = P(C) P(D) \quad \text{(because C, D are independent)} \]
\[ P[(A \cap B) \cap (C \cap D)] = P(A \cap B) P(C \cap D) \quad \text{(because } (A \cap B), (C \cap D) \text{ are independent)} \]
\[ P(S) = P(A \cap B) + P(C \cap D) - P[(A \cap B) \cap (C \cap D)] = P(A) P(B) + P(C) P(D) - P(A) P(B) P(C) P(D) \]
\[ = 0.95 \times 0.95 + 0.95 \times 0.95 - 0.95 \times 0.95 \times 0.95 \times 0.95 = 0.99 \]

The probability that the system fails is \( P(\bar{S}) = 1 - P(S) = 1 - 0.99 = 0.01 \).

Problem 2

Two random variables X, Y have zero correlation (i.e. \( \rho_{X,Y} = 0 \)). Are X, Y independent?

Solution

If X, Y are independent, then \( \rho_{X,Y} = 0 \). However, the reverse may not be true; zero correlation doesn't automatically mean independence. The correlation coefficient is defined as
\[ \rho_{X,Y} = \frac{cov(X, Y)}{\sigma_X \sigma_Y} \]

\( \rho_{X,Y} \) measures whether X, Y have a good linear relationship. \( \rho_{X,Y} = 1 \) or \( \rho_{X,Y} = -1 \) means that X, Y have a perfect linear relationship of \( Y = aX + b \) or \( X = cY + d \) (where a, b, c, d are constants). \( \rho_{X,Y} = 0 \) indicates that X, Y don't have any linear relationship. However, X, Y may have a non-linear relationship (such as \( Y = X^2 \)). As a result, even if \( \rho_{X,Y} = 0 \), X, Y may not be independent from each other.
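A small exact example makes the point concrete. The distribution below is a hypothetical one chosen for illustration (it is not from the book): X is uniform on {-1, 0, 1} and Y = X^2. The covariance is exactly zero, yet the joint probability of (X = 0, Y = 0) differs from the product of the marginals, so X and Y are dependent.

```python
from fractions import Fraction

# Hypothetical example: X uniform on {-1, 0, 1} and Y = X^2.
third = Fraction(1, 3)
joint = {(x, x * x): third for x in (-1, 0, 1)}

def expect(g):
    """E[g(X, Y)] under the joint PMF."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

# cov(X, Y) = E(XY) - E(X) E(Y)
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print(cov, p_x0_y0, p_x0 * p_y0)
```

Zero covariance here comes from the symmetry of X around 0, not from any lack of relationship: knowing X determines Y completely.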

One exception to the rule (see Chapter 25 on bivariate normal distribution): when X, Y are jointly normal (bivariate normal) random variables, \( \rho_{X,Y} = 0 \) means that X, Y are indeed independent.

Problem 3

Two events A, B have no intersection. In other words, \( A \cap B = \emptyset \). Does this mean that A, B are independent?

Solution

\( A \cap B = \emptyset \) doesn't mean that A, B are independent. \( A \cap B = \emptyset \) merely means that A, B are mutually exclusive. Mutually exclusive events may be affected by a common factor, in which case A, B are not independent.

For example, let A represent that an exam candidate passes Exam P on the first try; let B represent that the same candidate passes Exam P after at least a second try. We can see that \( A \cap B = \emptyset \), but A, B are not independent. If A occurs, B definitely cannot occur. Likewise, if B occurs, A definitely cannot occur. If A, B are truly independent, then whether A occurs or not does not have any influence on whether B will occur. In other words, if A, B are independent, then the information that A occurs is useless in predicting whether B may occur or not; the information that B occurs is useless in predicting whether A may occur or not.

The only way to test whether A, B are independent is to check whether the condition \( P(A \cap B) = P(A) P(B) \) holds. If \( P(A \cap B) = P(A) P(B) \), then A, B are independent; if \( P(A \cap B) \ne P(A) P(B) \), then A, B are not independent.

If \( A \cap B = \emptyset \), then \( P(A \cap B) = 0 \). \( P(A \cap B) = P(A) P(B) \) holds only if either \( P(A) = 0 \) or \( P(B) = 0 \) or both. We can see that if \( A \cap B = \emptyset \), then the only way for A, B to be independent is that at least one of the two events A, B has zero probability. This conclusion is common sense. An event doomed never to occur (regardless of whether any other events occur) is indeed an independent event.

Problem 4

If \( P(A \cap B \cap C) = P(A) P(B) P(C) \), does this mean that A, B, C are independent?

Solution

\( P(A \cap B \cap C) = P(A) P(B) P(C) \) does not mean that A, B, C are independent. A, B, C are independent if all of the following conditions are met:
\[ P(A \cap B) = P(A) P(B) \]
\[ P(A \cap C) = P(A) P(C) \]
\[ P(B \cap C) = P(B) P(C) \]
\[ P(A \cap B \cap C) = P(A) P(B) P(C) \]

Homework for you: redo all the problems listed in this chapter.
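To see why the triple-product condition in Problem 4 is not enough on its own, it helps to have a counterexample. The sample space and events below are hypothetical, picked only for illustration: eight equally likely outcomes where the triple product holds but one pairwise condition fails.

```python
from fractions import Fraction

# Hypothetical sample space: 8 equally likely outcomes labeled 1..8.
omega = set(range(1, 9))
A = {1, 2, 3, 4}
B = {1, 2, 3, 5}
C = {1, 6, 7, 8}

def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))

# The triple-product condition holds: 1/8 == (1/2)(1/2)(1/2).
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)

# But a pairwise condition fails: P(A and B) = 3/8, while P(A)P(B) = 1/4.
pair_ab_ok = prob(A & B) == prob(A) * prob(B)

print(triple_ok, pair_ab_ok)
```

Since A and B fail the pairwise check, A, B, C are not independent even though the triple product matches.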

Chapter 10 Percentile, mean, median, mode, moment

Mean of X = E(X).

Percentile. If your SAT score is in the 90th percentile, then 90% of the people who took the same test scored below you or got the same score as you did; only 10% of the people scored better than you. For a random variable X, the p-th percentile (denoted as \( x_p \)) means that
\[ \Pr(X \le x_p) = p\%, \qquad F(x_p) = p\%, \qquad \Pr(X > x_p) = 1 - p\% \]

Median = Middle = 50th percentile = \( x_{50} \)

MOde = Most Often = Most Observed. f(x) is maxed at the mode, so
\[ \frac{d}{dx} f(x) \Big|_{\text{Mode}} = 0 \]

k-th moment of X = \( E(X^k) \). 1st moment = \( E(X) \). 2nd moment = \( E(X^2) \).

k-th central moment of X = \( E\{[X - E(X)]^k\} \). 1st central moment = 0. 2nd central moment = \( Var(X) \).

Sample Problems and Solutions

Problem 1 (SOA #22 May 2003)

An insurer's annual weather-related loss, X, is a random variable with density function
\[ f(x) = \begin{cases} \dfrac{2.5 (200)^{2.5}}{x^{3.5}} & \text{for } x > 200 \\ 0 & \text{otherwise} \end{cases} \]

Calculate the difference between the 30th percentile and 70th percentile.

Solution

Let \( x_{30} \) and \( x_{70} \) represent the 30th and 70th percentiles.
\[ F(x) = \int_{200}^{x} f(t)\,dt = 1 - \left( \frac{200}{x} \right)^{2.5} \]
\[ F(x_{30}) = 0.3 \;\Rightarrow\; 1 - \left( \frac{200}{x_{30}} \right)^{2.5} = 0.3 \;\Rightarrow\; x_{30} = 230.67 \]
\[ F(x_{70}) = 0.7 \;\Rightarrow\; 1 - \left( \frac{200}{x_{70}} \right)^{2.5} = 0.7 \;\Rightarrow\; \left( \frac{200}{x_{70}} \right)^{2.5} = 0.3 \;\Rightarrow\; x_{70} = 323.73 \]

The difference is 323.73 - 230.67 = 93.06.

Problem 2 (#18 May 2000)

An insurance company policy reimburses dental expenses, X, up to a maximum benefit of 250. The probability density function for X is:
\[ f(x) = \begin{cases} c e^{-0.004x} & \text{for } x \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
where c is a constant. Calculate the median benefit for this policy.

(A) 161, (B) 165, (C) 173, (D) 182, (E) 250

Solution

First, we need to create a random variable representing benefits payable under the policy. Let Y = benefits payable under the policy.
\[ Y = \begin{cases} X & \text{if } X \le 250 \\ 250 & \text{if } X > 250 \end{cases} \]

Median = Middle = 50th percentile = \( y_{50} \)
\[ \Pr(Y \le y_{50}) = 50\%, \qquad F(y_{50}) = 50\% \]

We need to find \( y_{50} \). It is always a good idea to draw a diagram.

[Diagram: Y plotted against X, rising along Y = X up to X = 250 and flat at 250 thereafter]

We can see that Y is a non-decreasing function of X. A non-decreasing transformation preserves the percentiles. In other words, if \( x_p \) is the p-th percentile of X and \( y_p \) is the p-th percentile of Y, then \( y_p = y(x_p) \).
\[ \Pr(Y \le y_{50}) = \Pr(X \le x_{50}) = 50\% \quad \text{(i.e. } x_{50} \text{ corresponds to } y_{50}\text{)} \]

So we just need to find \( x_{50} \), the 50th percentile of X. Then \( y(x_{50}) \) is the 50th percentile of Y. Please note that X has an exponential distribution. So \( F(x) = 1 - e^{-0.004x} \).
\[ F(x_{50}) = 50\% \;\Rightarrow\; 1 - e^{-0.004 x_{50}} = 50\% \;\Rightarrow\; e^{-0.004 x_{50}} = 50\% \]
\[ -0.004 x_{50} = \ln(50\%) \;\Rightarrow\; x_{50} = -\frac{\ln(50\%)}{0.004} = 173.29 \]
\[ y(x = 173.29) = 173.29 \quad \text{(because } 173.29 < 250\text{)} \]

So \( y_{50} = 173.29 \), which is choice (C).

Final point. \( F(250) = 1 - e^{-0.004 \times 250} = 0.632 \). So 250 is the 63.2th percentile of the dental expense X; 250 is also the 63.2th percentile of Y. Because Y is always 250 once \( X \ge 250 \), any percentile of Y higher than the 63.2th (such as the 70th, 75th, or 99th percentile of Y) is always 250.

Problem 3 (#39 Course 2 Sample Test)

The loss amount, X, for a medical insurance policy has cdf
\[ F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{1}{9} \left( 2x^2 - \dfrac{x^3}{3} \right) & 0 \le x \le 3 \\ 1 & x > 3 \end{cases} \]

Calculate the mode of the distribution.

(A) 2/3, (B) 1, (C) 3/2, (D) 2, (E) 3

Solution

MOde = Most Observed. f(x) is maxed at the mode, so \( \frac{d}{dx} f(x) = 0 \) at the mode.
\[ f(x) = \frac{d}{dx} F(x) = \frac{1}{9} (4x - x^2) \quad \text{for } 0 \le x \le 3 \text{ (anywhere else } f(x) = 0\text{)} \]
\[ \frac{d}{dx} f(x) = \frac{d}{dx} \left[ \frac{1}{9} (4x - x^2) \right] = \frac{1}{9} (4 - 2x) \]

\( \frac{d}{dx} f(x) \) is zero at x = 2. So the mode = 2.

Homework for you: redo all the problems listed in this chapter.
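The percentile calculations in Problems 1 and 2 of this chapter both come down to inverting a CDF. The sketch below is an illustration under that reading (the function names and default parameters are chosen here, not taken from the book): it inverts F(x) = 1 - (200/x)^2.5 for Problem 1 and applies the "non-decreasing transformation preserves percentiles" rule for the capped benefit in Problem 2.

```python
from math import exp, log

def pareto_percentile(p, scale=200.0, shape=2.5):
    """Invert F(x) = 1 - (scale/x)^shape to get the 100p-th percentile."""
    return scale * (1.0 - p) ** (-1.0 / shape)

def capped_exponential_percentile(p, rate=0.004, cap=250.0):
    """Percentile of Y = min(X, cap), where X is exponential with the given rate."""
    x_p = -log(1.0 - p) / rate   # percentile of X from F(x) = 1 - exp(-rate * x)
    return min(x_p, cap)         # a non-decreasing transform preserves percentiles

spread = pareto_percentile(0.7) - pareto_percentile(0.3)
median_benefit = capped_exponential_percentile(0.5)
cap_level = 1.0 - exp(-0.004 * 250.0)   # F(250): percentiles above this equal the cap

print(round(spread, 2), round(median_benefit, 2), round(cap_level, 3))
```

Calling `capped_exponential_percentile(0.9)` returns the cap of 250, in line with the "final point" that every percentile above the 63.2th equals 250.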

Chapter 11 Find E(X), Var(X), E(X | Y), Var(X | Y)

Calculating E(X), Var(X), E(X | Y), Var(X | Y) for a discrete random variable is a commonly tested type of problem. Developing the skill to calculate E(X) and Var(X) quickly and accurately will help you on many SOA/CAS exams.

Because the BA II Plus Statistics Worksheet can calculate E(X) and Var(X) of a discrete random variable X when X has no more than 50 distinct values (most exam problems fall within this limit), there is no need to calculate the mean and variance using the standard formula under exam conditions.

The mean and variance may also be calculated using the TI-30X IIS and the formulas:
\[ E(X) = \sum x f(x), \qquad E(X^2) = \sum x^2 f(x), \qquad Var(X) = E(X^2) - E^2(X) \]

Sample Problems and Solutions

Problem 1 (#8 May 2000)

A probability distribution of the claim sizes for an auto insurance policy is given in the table below:

Claim Size     Probability
20             0.15
30             0.10
40             0.05
50             0.20
60             0.10
70             0.10
80             0.30

What percentage of the claims is within one standard deviation of the mean claim size?

(A) 45%, (B) 55%, (C) 68%, (D) 85%, (E) 100%

Solution

This problem is conceptually easy but calculation-intensive. It is easy to make calculation errors. Always let the calculator do all the calculations for you.

Using BA II Plus

One critical thing to remember about the BA II Plus Statistics Worksheet is that you cannot directly enter the probability mass function \( f(x_i) \) into the calculator to find E(X) and Var(X). The BA II Plus 1-V Statistics Worksheet accepts only scaled-up probabilities that are positive integers. If you enter a non-integer value into the statistics worksheet, you will get an error when attempting to retrieve E(X) and Var(X).

To overcome this constraint, first scale up \( f(x_i) \) to an integer by multiplying \( f(x_i) \) by a common integer.

Claim Size x     Probability Pr(x)     Scaled-up probability = 100 Pr(x)
20               0.15                  15
30               0.10                  10
40               0.05                  5
50               0.20                  20
60               0.10                  10
70               0.10                  10
80               0.30                  30
Total            1.00                  100

Next, enter the 7 data pairs of (claim size, scaled-up probability) into the BA II Plus Statistics Worksheet to get E(X) and \( \sigma_X \).

BA II Plus calculator key sequences:

Procedure: Set the calculator to display 4 decimal places
Keystrokes: 2nd [FORMAT] 4 ENTER
Display: DEC=4.0000

Procedure: Set AOS (Algebraic operating system)
Keystrokes: 2nd [FORMAT], keep pressing ↓ multiple times until you see "Chn". Press 2nd [ENTER] (if you see "AOS", your calculator is already in AOS, in which case press [CLR Work])
Display: AOS

Procedure: Select the data entry portion of the Statistics worksheet
Keystrokes: 2nd [Data]
Display: X01 (old contents)

Procedure: Clear the worksheet
Keystrokes: 2nd [CLR Work]
Display: X01 0.0000

Procedure: Enter the data set
Keystrokes and display:
  20 ENTER ↓ 15 ENTER ↓     X01=20.0000  Y01=15.0000
  30 ENTER ↓ 10 ENTER ↓     X02=30.0000  Y02=10.0000
  40 ENTER ↓ 5 ENTER ↓      X03=40.0000  Y03=5.0000
  50 ENTER ↓ 20 ENTER ↓     X04=50.0000  Y04=20.0000
  60 ENTER ↓ 10 ENTER ↓     X05=60.0000  Y05=10.0000
  70 ENTER ↓ 10 ENTER ↓     X06=70.0000  Y06=10.0000
  80 ENTER ↓ 30 ENTER       X07=80.0000  Y07=30.0000

Procedure: Select the statistical calculation portion of the Statistics worksheet
Keystrokes: 2nd [Stat]
Display: old content

Procedure: Select the one-variable calculation method
Keystrokes: Keep pressing 2nd SET until you see 1-V
Display: 1-V

Procedure: View the sum of the scaled-up probabilities
Keystrokes: ↓
Display: n=100.0000 (Make sure the sum of the scaled-up probabilities is equal to the scaled-up common factor, which in this problem is 100. If n is not equal to the common factor, you've made a data entry error.)

Procedure: View the mean
Keystrokes: ↓
Display: \( \bar{x} \) = 55.0000

Procedure: View the sample standard deviation
Keystrokes: ↓
Display: \( S_x \) = 21.9043 (this is a sample standard deviation; don't use this value). Note that \( S_x = \sqrt{\dfrac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2} \)

Procedure: View the standard deviation
Keystrokes: ↓
Display: \( \sigma_X \) = 21.7945

Procedure: View \( \sum X \)
Keystrokes: ↓
Display: \( \sum X \) = 5,500.0000 (not needed for this problem)

Procedure: View \( \sum X^2 \)
Keystrokes: ↓
Display: \( \sum X^2 \) = 350,000.0000 (not needed for this problem, though this function might be useful for other calculations)
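The worksheet readings above can be cross-checked off the calculator. The following sketch is not a model of the BA II Plus itself; it simply treats the scaled-up probabilities as integer frequencies, as the worksheet does, and computes both the population standard deviation (the value to use) and the sample standard deviation (the value to ignore). All variable names are illustrative.

```python
from math import sqrt

claims = [20, 30, 40, 50, 60, 70, 80]
probs = [0.15, 0.10, 0.05, 0.20, 0.10, 0.10, 0.30]
freqs = [round(100 * p) for p in probs]   # scaled-up integer frequencies

n = sum(freqs)                                           # should equal 100
mean = sum(x * f for x, f in zip(claims, freqs)) / n
sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in zip(claims, freqs))
sigma = sqrt(sum_sq_dev / n)          # population standard deviation
s_x = sqrt(sum_sq_dev / (n - 1))      # sample standard deviation (not wanted here)

# Probability that a claim lies within one standard deviation of the mean.
within = sum(p for x, p in zip(claims, probs) if mean - sigma <= x <= mean + sigma)

print(n, mean, round(sigma, 4), round(s_x, 4), round(within, 2))
```

Dividing by n rather than n - 1 is what makes the scaled-up frequencies behave like probabilities, which is why the worksheet's sample standard deviation must be skipped.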

You should always double check (using ↑ and ↓ to scroll up or down the data pairs of X and Y) that your data entry is correct before accepting the E(X) and \( \sigma_X \) generated by the BA II Plus.

If you have made an error in data entry, you can press 2nd DEL to delete a data pair (X, Y) or 2nd INS to insert a data pair (X, Y). If you typed a wrong number, you can use the backspace key (→) to delete the wrong number and then re-enter the correct number. Refer to the BA II Plus guidebook for details on how to correct data entry errors.

If this procedure of calculating E(X) and \( \sigma_X \) seems more time-consuming than the formula-driven approach, it could be because you are not familiar with the BA II Plus Statistics Worksheet yet. With practice, you will find that using the calculator is quicker than manually calculating with formulas.

Then, we have
\[ (E(X) - \sigma_X,\; E(X) + \sigma_X) = (55 - 21.7945,\; 55 + 21.7945) = (33.21,\; 76.79) \]

Finally, you find
\[ \Pr(33.21 \le X \le 76.79) = \Pr(X = 40) + \Pr(X = 50) + \Pr(X = 60) + \Pr(X = 70) = 0.05 + 0.20 + 0.10 + 0.10 = 0.45 \]

Using TI-30X IIS

Though the TI-30X IIS statistics function can also solve this problem, we will use a standard formula-driven approach. This generic approach takes advantage of the TI-30X IIS "redo calculation" functionality.

First, calculate E(X) using \( E(X) = \sum x f(x) \). Then modify the formula \( \sum x f(x) \) to \( \sum x^2 f(x) \) to calculate Var(X) without re-entering f(x).

To find E(X), we type:

  20*.15+30*.1+40*.05+50*.2+60*.1+70*.1+80*.3

Then press Enter. E(X) = 55.

Next we modify the formula

  20*.15+30*.1+40*.05+50*.2+60*.1+70*.1+80*.3

to

  20²*.15+30²*.1+40²*.05+50²*.2+60²*.1+70²*.1+80²*.3

To change 20 to 20², move the cursor immediately to the right of the number 20 so your cursor is blinking on top of the multiplication sign *. Press 2nd INS x².

You find that

  20²*.15+30²*.1+40²*.05+50²*.2+60²*.1+70²*.1+80²*.3 = 3500

So \( E(X^2) = 3{,}500 \) and
\[ Var(X) = E(X^2) - E^2(X) = 3{,}500 - 55^2 = 475 \]

Finally, you can calculate \( \sigma_X \) and the range \( (E(X) - \sigma_X,\; E(X) + \sigma_X) \).

Keep in mind that you can enter up to 88 digits for a formula in the TI-30X IIS. If your formula exceeds 88 digits, the TI-30X IIS will ignore the digits entered after the 88th digit.

Problem 2 (#19, November 2001)

A baseball team has scheduled its opening game for April 1. If it rains on April 1, the game is postponed and will be played on the next day that it does not rain. The team purchases insurance against rain. The policy will pay 1,000 for each day, up to 2 days, that the opening game is postponed.

The insurance company determines that the number of consecutive days of rain beginning on April 1 is a Poisson random variable with a 0.6 mean. What is the standard deviation of the amount the insurance company will have to pay?

(A) 668, (B) 699, (C) 775, (D) 817, (E) 904

Solution

Let N = number of days it rains consecutively. N can be 0, 1, 2, or any non-negative integer.
\[ \Pr(N = n) = e^{-\lambda} \frac{\lambda^n}{n!} = e^{-0.6} \frac{0.6^n}{n!} \qquad (n = 0, 1, 2, \ldots, +\infty) \]

If you don't understand this formula, don't worry; you'll learn it later. For now, take the formula as given and focus on the calculation.

Let X = payment by the insurance company. According to the insurance contract, if there is no rain (n = 0), X = 0. If it rains for only 1 day, X = $1,000. If it rains for two or more days in a row, X is always $2,000. We are asked to calculate \( \sigma_X \).

If a problem asks you to calculate the mean, standard deviation, or other statistics of a discrete random variable, it is always a good idea to list the variable's values and their corresponding probabilities in a table before doing the calculation, to organize your data. So let's list the data pairs (X, probability) in a table:

Payment X     Probability of receiving X
0             \( \Pr(N = 0) = e^{-0.6} \dfrac{0.6^0}{0!} = e^{-0.6} \)
1,000         \( \Pr(N = 1) = e^{-0.6} \dfrac{0.6^1}{1!} = 0.6 e^{-0.6} \)
2,000         \( \Pr(N \ge 2) = \Pr(N = 2) + \Pr(N = 3) + \cdots = 1 - [\Pr(N = 0) + \Pr(N = 1)] = 1 - 1.6 e^{-0.6} \)

Once you set up the table above, you can use the BA II Plus Statistics Worksheet or the TI-30X IIS to find the mean and variance.

Calculation Method 1 --- Using TI-30X IIS

First we calculate the mean by typing:

  1000*.6e^(-.6)+2000(1-1.6e^(-.6

As mentioned before, when typing e^(-.6) for \( e^{-0.6} \), you need to use the negative sign, not the minus sign, to get -.6. If you type the minus sign in e^(-.6), you will get an error message.

Additionally, for \( 0.6 e^{-0.6} \), you do not need to type 0.6*e^(-.6); just type .6e^(-.6). Also, to calculate \( 2000(1 - 1.6 e^{-0.6}) \), you do not need to type 2000*(1-1.6*(e^(-.6))). Simply type

  2000(1-1.6e^(-.6

Your calculator understands that you are trying to calculate \( 2000(1 - 1.6 e^{-0.6}) \). However, the omission of the closing parentheses works only for the last item in your formula. In other words, if your equation is
\[ 2000(1 - 1.6 e^{-0.6}) + 1000 \times 0.6 e^{-0.6} \]
you have to type the first item with its full parentheses, but can skip typing the closing parenthesis in the 2nd item:

  2000(1-1.6e^(-.6)) + 1000*.6e^(-.6

If you type

  2000(1-1.6e^(-.6 + 1000*.6e^(-.6

your calculator will interpret this as

  2000(1-1.6e^(-.6 + 1000*.6e^(-.6) ) )

Of course, this is not your intention.

Let's come back to the calculation. After you type

  1000*.6e^(-.6)+2000(1-1.6e^(-.6

press ENTER. You should get E(X) = 573.0897. This is an intermediate value. You can store it on your scrap paper or in one of your calculator's memories.

Next, modify your formula to get \( E(X^2) \) by typing:

  1000²*.6e^(-.6)+2000²(1-1.6e^(-.6

You will get 816892.5107. This is \( E(X^2) \). Next, calculate Var(X):
\[ Var(X) = E(X^2) - E^2(X) = 488460.6535 \]
\[ \sigma_X = \sqrt{Var(X)} = 698.8996 \]

Calculation Method 2 --- Using BA II Plus

First, please note that you can always calculate \( \sigma_X \) without using the BA II Plus built-in Statistics Worksheet. You can calculate \( E(X), E(X^2), Var(X) \) in the BA II Plus as you do any other calculations, without using the built-in worksheet.

In this problem, the equations used to calculate \( \sigma_X \) are:
\[ E(X) = 0 \times e^{-0.6} + 1{,}000 (0.6 e^{-0.6}) + 2{,}000 (1 - 1.6 e^{-0.6}) \]
\[ E(X^2) = 0^2 \times e^{-0.6} + 1{,}000^2 (0.6 e^{-0.6}) + 2{,}000^2 (1 - 1.6 e^{-0.6}) \]
\[ Var(X) = E(X^2) - E^2(X), \qquad \sigma_X = \sqrt{Var(X)} \]

You simply calculate each item in the above equations with the BA II Plus. This will give you the required standard deviation.

However, we do not want to do this hard-core calculation in an exam. The BA II Plus already has a built-in statistics worksheet and we should utilize it.

The key to using the BA II Plus Statistics Worksheet is to scale up the probabilities to integers. Scaling the three probabilities \( (e^{-0.6},\; 0.6 e^{-0.6},\; 1 - 1.6 e^{-0.6}) \) is a bit challenging, but there is a way:

Payment X     Probability (assuming you set your BA II Plus       Scaled-up probability (multiply the
              to display 4 decimal places)                        original probability by 10,000)
0             e^{-0.6} = 0.5488                                   5,488
1,000         0.6 e^{-0.6} = 0.3293                               3,293
2,000         1 - 1.6 e^{-0.6} = 0.1219                           1,219
Total         1.0                                                 10,000

Then we just enter the following data pairs into the BA II Plus statistics worksheet:

  X01=0        Y01=5,488
  X02=1,000    Y02=3,293
  X03=2,000    Y03=1,219

Then the calculator will give you \( \sigma_X = 698.8966 \).

Make sure your calculator gives you an n that matches the sum of the scaled-up probabilities. In this problem, the sum of your scaled-up probabilities is 10,000, so you should get n = 10,000. If your calculator gives you an n that is not 10,000, you know that at least one of the scaled-up probabilities is wrong.

Of course, you can scale up the probabilities with better precision (more closely resembling the original probabilities). For example, you can scale them up this way (assuming you set your calculator to display 8 decimal places):

Payment X     Probability                          Scaled-up probability, more precise (multiply the
                                                   original probability by 100,000,000)
0             e^{-0.6} = 0.54881164                54,881,164
1,000         0.6 e^{-0.6} = 0.32928698            32,928,698
2,000         1 - 1.6 e^{-0.6} = 0.12190138        12,190,138
Total                                              100,000,000

Then we just enter the following data pairs into the BA II Plus statistics worksheet:

  X01=0        Y01=54,881,164
  X02=1,000    Y02=32,928,698
  X03=2,000    Y03=12,190,138

Then the calculator will give you \( \sigma_X = 698.8995993 \) (remember to check that n = 100,000,000).

For exam problems, scaling up the original probabilities by multiplying them by 10,000 is good enough to give you the correct answer. Under exam conditions it is unnecessary to scale the probabilities up by multiplying by 100,000,000.

By now the shortcomings of the TI-30X IIS statistics function should be apparent. The TI-30X IIS has a limit that the scaled-up probability must not exceed 99. If you use the TI-30X IIS statistics function to solve the problem, this is your table for scaled-up probabilities:

Payment X     Probability (assuming TI-30X IIS        Scaled-up probability (multiply the original
              displays 2 decimals)                    probability by 100; if you multiply by 1,000, you
                                                      won't be able to enter the scaled-up probabilities
                                                      on the TI-30X IIS)
0             e^{-0.6} = 0.55                         55
1,000         0.6 e^{-0.6} = 0.33                     33
2,000         1 - 1.6 e^{-0.6} = 0.12                 12
Total         1.00                                    100

If you enter the 3 data pairs (X1=0, FRQ=55), (X2=1000, FRQ=33), (X3=2000, FRQ=12) into the TI-30X IIS statistics function, you get:
\[ \sigma_X = 696.49 \]
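The effect of the scaling factor on the answer can be reproduced directly. The sketch below is an off-calculator check, not a simulation of either calculator: `std_dev` is an illustrative helper that accepts weights which need not sum to 1, so the same function handles the exact probabilities and the integer-scaled ones.

```python
from math import exp, sqrt

def std_dev(values, weights):
    """Standard deviation of a discrete distribution; weights need not sum to 1."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    second = sum(v * v * w for v, w in zip(values, weights)) / total
    return sqrt(second - mean ** 2)

payments = [0, 1000, 2000]
p0 = exp(-0.6)
exact = [p0, 0.6 * p0, 1 - 1.6 * p0]   # Pr(N = 0), Pr(N = 1), Pr(N >= 2)

sd_exact = std_dev(payments, exact)
sd_10000 = std_dev(payments, [round(10_000 * p) for p in exact])   # 5488, 3293, 1219
sd_100 = std_dev(payments, [round(100 * p) for p in exact])        # 55, 33, 12

print(round(sd_exact, 4), round(sd_10000, 4), round(sd_100, 2))
```

Scaling by 10,000 moves the answer by only a few thousandths, while scaling by 100 shifts it by more than 2, which is the comparison the three tables above are making.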

Compare the above value with the correct answer 699, which we obtained by multiplying probabilities by 10,000. You see that the TI-30X IIS value is about $2 off from the correct answer. Though being $2 off is not far off, in the exam you want to be more precise. It is for this reason that you should not use the TI-30X IIS statistics function.

Which way is better for calculating E(X) and Var(X) of a discrete random variable: using the TI-30X IIS "recalculation" functionality or using the BA II Plus Statistics Worksheet? Both methods are good. I recommend that you learn both methods and choose the one you think is better.

Next, let's move on to finding the conditional expectation E(X | Y) and the conditional variance Var(X | Y) of a discrete random variable X. If an exam problem gives you a list of data pairs of (X, Y) and asks you to find E(X | Y) and Var(X | Y), you can reorganize the original data pairs into new data pairs of [(X | Y), Prob(X | Y)] and then use the BA II Plus or TI-30X IIS to calculate the conditional mean and conditional variance.

Problem 3

Two discrete random variables X, Y have the following joint distribution:

                       Y
X          2       5       7       8
0          0.08    0.07    0.09    0.06
1          0.04    0.10    0.11    0.03
2          0.01    0.02    0.05    0.12
3          0.05    0.07    0.04    0.06

Find:
(1) E(X), Var(X)
(2) E(X | Y = 5), Var(X | Y = 5)
(3) Var(XY)
(4) Cov(X, Y)
(5) Var(X/Y)
(6) Var(X 2Y)

Solution

(1) E(X), Var(X)

Use the BA II Plus. To make things easier, sum up the probabilities for each row and for each column:

                       Y
X          2       5       7       8       Sum
0          0.08    0.07    0.09    0.06    0.30
1          0.04    0.10    0.11    0.03    0.28
2          0.01    0.02    0.05    0.12    0.20
3          0.05    0.07    0.04    0.06    0.22
Sum        0.18    0.26    0.29    0.27

For example, in the above table, 0.18 = 0.08 + 0.04 + 0.01 + 0.05, and 0.30 = 0.08 + 0.07 + 0.09 + 0.06.

To find E(X) and Var(X), we need to convert the (X, Y) joint distribution table into the X-marginal distribution table:

X          Pr(X)     Scaled-up probability = 100 Pr(X)
0          0.30      30
1          0.28      28
2          0.20      20
3          0.22      22

Next, enter the following data pairs into the BA II Plus 1-V Statistics Worksheet: (0, 30), (1, 28), (2, 20), (3, 22). Specifically, you enter

  X01=0  Y01=30;    X02=1  Y02=28;
  X03=2  Y03=20;    X04=3  Y04=22

Remember that in the BA II Plus 1-V Statistics Worksheet, you always enter the values of the random variable before entering the scaled-up probabilities. Switching the order will produce a nonsensical result. After entering the data into the BA II Plus, you should get:
\[ \bar{X} = 1.34, \qquad \sigma_X = 1.1245 \]
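Part (1) is a row-sum followed by the usual mean and variance formulas, which can be sketched in a few lines. The dictionary layout below is an assumption made for illustration (rows keyed by the value of X, columns ordered as Y = 2, 5, 7, 8).

```python
# Joint PMF from Problem 3: rows are X = 0..3, columns are Y = 2, 5, 7, 8.
y_values = [2, 5, 7, 8]
joint = {
    0: [0.08, 0.07, 0.09, 0.06],
    1: [0.04, 0.10, 0.11, 0.03],
    2: [0.01, 0.02, 0.05, 0.12],
    3: [0.05, 0.07, 0.04, 0.06],
}

# Marginal PMF of X: sum each row across all values of Y.
marginal_x = {x: sum(row) for x, row in joint.items()}

mean_x = sum(x * p for x, p in marginal_x.items())
var_x = sum(x * x * p for x, p in marginal_x.items()) - mean_x ** 2

print({x: round(p, 2) for x, p in marginal_x.items()}, round(mean_x, 4), round(var_x, 4))
```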

To find $\mathrm{Var}(X) = \sigma_X^2$, you do not need to leave the Statistics Worksheet. After BA II Plus displays the value of $\sigma_X$, you simply press the $x^2$ key. This gives you the variance. Please note that squaring $\sigma_X$ does not change the internal value of $\sigma_X$. After squaring $\sigma_X$, you can press the up arrow and then the down arrow. This removes the squaring effect and gives you the original $\sigma_X$.

$$\mathrm{Var}(X) = \sigma_X^2 = 1.1245^2 = 1.2644$$

So we have $E(X) = 1.34$, $\mathrm{Var}(X) = 1.2644$.

(2) $E(X \mid Y = 5)$, $\mathrm{Var}(X \mid Y = 5)$

Once again we'll create a table listing all possible values of $(X \mid Y = 5)$ and the probabilities $\Pr(X \mid Y = 5)$:

  X       Pr(X, Y = 5)   Pr(X | Y = 5)   100 (0.26) Pr(X | Y = 5)
  0       0.07           0.07/0.26       7
  1       0.10           0.10/0.26       10
  2       0.02           0.02/0.26       2
  3       0.07           0.07/0.26       7
  Total   0.26                           26

In the above table, we used the formula

$$\Pr(X \mid Y = 5) = \frac{\Pr(X, Y = 5)}{\Pr(Y = 5)}$$

For example,

$$\Pr(X = 0 \mid Y = 5) = \frac{\Pr(X = 0, Y = 5)}{\Pr(Y = 5)} = \frac{0.07}{0.26}$$

Then we enter the following data into the BA II Plus 1-V Statistics Worksheet:

  X01=0, Y01=7;  X02=1, Y02=10;  X03=2, Y03=2;  X04=3, Y04=7

We have

$$\bar{X} = 1.3462, \qquad \sigma_X = 1.1416, \qquad \mathrm{Var}(X) = \sigma_X^2 = 1.1416^2 = 1.3033$$

So we have $E(X \mid Y = 5) = 1.3462$, $\mathrm{Var}(X \mid Y = 5) = 1.3033$.

You can see that when using the BA II Plus Statistics Worksheet, instead of scaling up $\Pr(X \mid Y = 5) = \dfrac{\Pr(X, Y = 5)}{\Pr(Y = 5)}$, we can scale up $\Pr(X, Y = 5)$ and ignore $\Pr(Y = 5)$, because $\Pr(Y = 5)$ is a constant. Ignoring $\Pr(Y = 5)$, we have:

  X       Pr(X, Y = 5)   100 Pr(X, Y = 5)
  0       0.07           7
  1       0.10           10
  2       0.02           2
  3       0.07           7
  Total   0.26           26

Then we just enter the following into the BA II Plus 1-V Statistics Worksheet:

  X01=0, Y01=7;  X02=1, Y02=10;  X03=2, Y03=2;  X04=3, Y04=7

Ignoring $\Pr(Y = 5)$ gives us the same result. This is a great advantage of using BA II Plus. Next time you are calculating the mean and variance of a random variable in BA II Plus, simply ignore any constant factor common to all the probabilities. This saves time.

(3) $\mathrm{Var}(XY)$

Once again, we need to create a table listing all of the possible values of $XY$ and their corresponding probabilities. All possible values of $XY$ are listed below:

          Y = 2   Y = 5   Y = 7   Y = 8
  X = 0   0       0       0       0
  X = 1   2       5       7       8
  X = 2   4       10      14      16
  X = 3   6       15      21      24

The probability of each value of $XY$ is listed below:

          Y = 2   Y = 5   Y = 7   Y = 8
  X = 0   0.08    0.07    0.09    0.06
  X = 1   0.04    0.10    0.11    0.03
  X = 2   0.01    0.02    0.05    0.12
  X = 3   0.05    0.07    0.04    0.06

We then scale up $\Pr(XY)$ by multiplying it by 100:

          Y = 2   Y = 5   Y = 7   Y = 8
  X = 0   8       7       9       6
  X = 1   4       10      11      3
  X = 2   1       2       5       12
  X = 3   5       7       4       6

Next, we enter $XY$ and the scaled-up $\Pr(XY)$ into the BA II Plus 1-V Statistics Worksheet:

  X01=0,  Y01=8;   X02=0,  Y02=7;   X03=0,  Y03=9;   X04=0,  Y04=6;
  X05=2,  Y05=4;   X06=5,  Y06=10;  X07=7,  Y07=11;  X08=8,  Y08=3;
  X09=4,  Y09=1;   X10=10, Y10=2;   X11=14, Y11=5;   X12=16, Y12=12;
  X13=6,  Y13=5;   X14=15, Y14=7;   X15=21, Y15=4;   X16=24, Y16=6.

Note: you can also consolidate the four entries with value 0 into a single entry X01=0 with Y01=8+7+9+6=30.

BA II Plus should give you:

$$\bar{X} = 8.0800, \qquad \sigma_X = 7.5574, \qquad \mathrm{Var}(X) = \sigma_X^2 = 7.5574^2 = 57.1136$$

So we have $E(XY) = 8.08$, $\mathrm{Var}(XY) = 57.1136$.

Make sure BA II Plus gives you n=100. If n is not 100, at least one of your scaled-up probabilities is wrong.

(4) $\mathrm{Cov}(X, Y)$

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y)$$

We already know $E(XY)$ and $E(X)$. We just need to calculate $E(Y)$.

  Y    Pr(Y)   100 Pr(Y)
  2    0.18    18
  5    0.26    26
  7    0.29    29
  8    0.27    27

We enter the following data into the BA II Plus 1-V Statistics Worksheet:

  X01=2, Y01=18;  X02=5, Y02=26;  X03=7, Y03=29;  X04=8, Y04=27

We should get:

$$\bar{X} = 5.8500, \qquad \sigma_X = 2.1184, \qquad \mathrm{Var}(X) = \sigma_X^2 = 2.1184^2 = 4.4875$$

So we have $E(Y) = 5.85$, $\mathrm{Var}(Y) = 4.4875$. Then

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y) = 8.08 - 1.34 \times 5.85 = 0.2410$$

(5) $\mathrm{Var}(X / Y)$

All of the possible values of $X / Y$ are listed below:

          Y = 2   Y = 5   Y = 7   Y = 8
  X = 0   0       0       0       0
  X = 1   1/2     1/5     1/7     1/8
  X = 2   2/2     2/5     2/7     2/8
  X = 3   3/2     3/5     3/7     3/8

The probability of each value of $X / Y$ is listed below:

          Y = 2   Y = 5   Y = 7   Y = 8
  X = 0   0.08    0.07    0.09    0.06
  X = 1   0.04    0.10    0.11    0.03
  X = 2   0.01    0.02    0.05    0.12
  X = 3   0.05    0.07    0.04    0.06

The scaled-up probabilities are:

          Y = 2   Y = 5   Y = 7   Y = 8
  X = 0   8       7       9       6
  X = 1   4       10      11      3
  X = 2   1       2       5       12
  X = 3   5       7       4       6

Enter the following into the BA II Plus 1-V Statistics Worksheet:

  X01=0,   Y01=8;   X02=0,   Y02=7;   X03=0,   Y03=9;   X04=0,   Y04=6;
  X05=1/2, Y05=4;   X06=1/5, Y06=10;  X07=1/7, Y07=11;  X08=1/8, Y08=3;
  X09=2/2, Y09=1;   X10=2/5, Y10=2;   X11=2/7, Y11=5;   X12=2/8, Y12=12;
  X13=3/2, Y13=5;   X14=3/5, Y14=7;   X15=3/7, Y15=4;   X16=3/8, Y16=6.

You should get:

$$\bar{X} = 0.2784, \qquad \sigma_X = 0.3427$$

This means that

$$E\!\left(\frac{X}{Y}\right) = 0.2784, \qquad \mathrm{Var}\!\left(\frac{X}{Y}\right) = 0.1175$$

(6) $\mathrm{Var}(X^2 Y)$

This problem is for you to solve. Answer: $E(X^2 Y) = 18.2$, $\mathrm{Var}(X^2 Y) = 477.2$.

Please note that in solving (3), (4), (5), and (6), you are using the same scaled-up probabilities. You can reuse the scaled-up probabilities you entered for the previous part; you just enter the new series of X values in the BA II Plus 1-V Statistics Worksheet.

Though tedious, this problem is a good opportunity for you to practice finding the mean and variance of a discrete random variable using the BA II Plus Statistics Worksheet.

Homework for you: Sample P #114. Hint: Enter X01=0, Y01=50; X02=1, Y02=125.
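For readers who want to check all six parts of Problem 3 away from the calculator, the sketch below recomputes them directly from the joint table. The helper `moments` and the variable names are illustrative assumptions, not part of the book's method; the joint probabilities are copied from the problem statement.

```python
xs = [0, 1, 2, 3]
ys = [2, 5, 7, 8]
# joint[i][j] = Pr(X = xs[i], Y = ys[j]), copied from the Problem 3 table
joint = [
    [0.08, 0.07, 0.09, 0.06],
    [0.04, 0.10, 0.11, 0.03],
    [0.01, 0.02, 0.05, 0.12],
    [0.05, 0.07, 0.04, 0.06],
]

def moments(g, cond=lambda x, y: True):
    """Mean and variance of g(X, Y), conditional on the cells where cond holds.

    Dividing by the total weight of the selected cells is the same trick as
    ignoring the common constant Pr(Y = 5) in the worksheet.
    """
    cells = [(g(x, y), joint[i][j])
             for i, x in enumerate(xs)
             for j, y in enumerate(ys) if cond(x, y)]
    total = sum(p for _, p in cells)
    mean = sum(v * p for v, p in cells) / total
    var = sum(p * (v - mean) ** 2 for v, p in cells) / total
    return mean, var

e_x, var_x = moments(lambda x, y: x)                                  # part (1)
e_x5, var_x5 = moments(lambda x, y: x, cond=lambda x, y: y == 5)      # part (2)
e_xy, var_xy = moments(lambda x, y: x * y)                            # part (3)
e_y, var_y = moments(lambda x, y: y)
cov_xy = e_xy - e_x * e_y                                             # part (4)
e_ratio, var_ratio = moments(lambda x, y: x / y)                      # part (5)
e_x2y, var_x2y = moments(lambda x, y: x * x * y)                      # part (6)

for name, value in [("Var(X)", var_x), ("Var(X|Y=5)", var_x5),
                    ("Var(XY)", var_xy), ("Cov(X,Y)", cov_xy),
                    ("Var(X/Y)", var_ratio), ("Var(X^2 Y)", var_x2y)]:
    print(name, round(value, 4))
```

Passing a different function `g` corresponds to entering a new series of X values while reusing the same scaled-up probabilities.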

Chapter 12 Bernoulli distribution

If $X$ is a Bernoulli random variable, then

$$X = \begin{cases} 1 & \text{with probability } p \quad (0 \le p \le 1) \\ 0 & \text{with probability } q = 1 - p \end{cases}$$

Examples:
- The single flip of a coin resulting in heads or tails
- The single throw of a die resulting in a 6 or a non-6 (1 through 5)
- A single contest resulting in winning or losing
- The launch of a product resulting in either success or failure

Key formulas:

$$E(X) = p, \qquad \mathrm{Var}(X) = pq, \qquad M_X(t) = q + p e^t$$

Sample Problems and Solutions

Problem 1

During one policy year, a certain life insurance policy results either in no claims, with a probability of 99%, if the policyholder is still alive, or in a single claim if the policyholder dies. Let $X$ represent the number of claims on the policy during one policy year. Find $E(X)$, $\sigma_X$, and $M_X(t)$.

Solution

$X$ is a Bernoulli random variable with $p = 1\%$.

$$E(X) = p = 1\%, \qquad \sigma_X = \sqrt{\mathrm{Var}(X)} = \sqrt{pq} = \sqrt{1\%\,(99\%)} = 9.95\%$$

$$M_X(t) = q + p e^t = 0.99 + 0.01 e^t$$

Homework for you: rework all the problems listed in this chapter.
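The Bernoulli formulas are simple enough to check by hand, but a small sketch makes the three quantities concrete. The function names below are hypothetical and chosen only for this illustration; the check uses the p = 1% from Problem 1.

```python
import math

def bernoulli_summary(p):
    """Return (mean, variance, standard deviation) of a Bernoulli(p) variable."""
    q = 1 - p
    return p, p * q, math.sqrt(p * q)

def bernoulli_mgf(p, t):
    """Moment generating function M_X(t) = q + p * e^t."""
    return (1 - p) + p * math.exp(t)

mean, var, sd = bernoulli_summary(0.01)
print(mean, var, round(sd, 4))
print(bernoulli_mgf(0.01, 0.0))  # any MGF equals 1 at t = 0
```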

Chapter 13 Binomial distribution

The binomial distribution is one of the most frequently used and most intuitive distributions in Exam P. The easiest way to understand the binomial distribution is to think about tossing coins. You toss the same coin n times and you want to know how many times you get heads. For each toss, there are only two outcomes: either you get heads or you get tails. The probability of getting heads, represented by p, doesn't change from one toss to another. In addition, whether you get heads in one toss doesn't affect whether you get heads in the next toss (so the results of the tosses are independent). If you let X = # of heads you get in n tosses of the coin, then X has a binomial distribution.

Key formulas to remember: for a binomial random variable $X$,

$$p_X(x) = C_n^x\, p^x q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}, \qquad 0 \le p \le 1,\ q = 1 - p,\ x = 0, 1, 2, \dots, n$$

$$E(X) = np, \qquad \mathrm{Var}(X) = npq, \qquad M_X(t) = \left(q + p e^t\right)^n$$

Let's get an intuitive feel for the formula $p_X(x) = C_n^x\, p^x q^{n-x} = \dfrac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$. If there were only one way of having $x$ successes and $(n-x)$ failures, then the probability of having exactly $x$ successes would be $p^x q^{n-x}$, due to the independence among the n trials. However, out of n trials, there are $\dfrac{n!}{x!\,(n-x)!}$ ways of having $x$ successes and $n-x$ failures. So the total probability of having exactly $x$ successes is $\dfrac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$.

Shortcuts

Shortcut #1: How to quickly find the probability mass function of a binomial distribution for n=2 and n=3 (which is often tested).

You can easily memorize the probability mass function for n=2 and n=3, eliminating the need to derive the probability from the generic formula

$$p_X(x) = C_n^x\, p^x q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$$

To memorize the probability mass function for n=2, notice that

$$1 = (p + q)^2 = p^2 + 2pq + q^2$$

In the above formula, each term on the right-hand side is one value of the probability mass function for n=2. In other words,

$$p_X(2) = p^2, \qquad p_X(1) = 2pq, \qquad p_X(0) = q^2$$

Similarly, to quickly find the probability mass function for n=3, notice that

$$1 = (p + q)^3 = p^3 + 3p^2 q + 3pq^2 + q^3$$

Then

$$p_X(3) = p^3, \qquad p_X(2) = 3p^2 q, \qquad p_X(1) = 3pq^2, \qquad p_X(0) = q^3$$

Shortcut #2: How to memorize the mean, variance, and moment generating function of a binomial variable.

A binomial random variable $X$ is simply the sum of n independent identically distributed Bernoulli variables $Y$. $Y$ has the following probability mass function:

$$Y = \begin{cases} 1 & \text{with probability } p \quad (0 \le p \le 1) \\ 0 & \text{with probability } q = 1 - p \end{cases}$$

The mean and variance of $Y$ are:

$$E(Y) = p, \qquad \mathrm{Var}(Y) = pq$$

Since $X = Y_1 + Y_2 + \dots + Y_n$ and $Y_1, Y_2, \dots, Y_n$ are independent identically distributed Bernoulli variables,

$$E(X) = E(Y_1 + Y_2 + \dots + Y_n) = n E(Y) = np$$

$$\mathrm{Var}(X) = \mathrm{Var}(Y_1 + Y_2 + \dots + Y_n) = n\,\mathrm{Var}(Y) = npq$$

$$M_X(t) = M_{Y_1 + Y_2 + \dots + Y_n}(t) = \left[M_Y(t)\right]^n = \left(q + p e^t\right)^n$$

The last equation is based on the rule that the MGF (moment generating function) of the sum of independent random variables is simply the product of the MGFs of the individual random variables.

The mean $E(X) = np$ should make sense to you. In one trial, the number of successes you can expect is $p$, because $p$ is the probability of having a success in one trial. Then if you have n independent trials, you should expect to have a total of $np$ successes.

Application of binomial distribution in insurance

The binomial distribution is frequently used to model the frequency (the number of losses), not the severity (the dollar amount of individual losses), of a random loss variable. For example, the total number of burglaries or fires or some other random event that happens to a group of geographically dispersed homes is approximately a binomial random variable. You can easily see that the assumptions underlying a binomial random variable approximately hold in this case. Because fire is a rare event, you can assume that for a given period of time (6 months or 1 year, for example), a house has either no fire or only 1 fire. You can also assume that the probability of having a fire is constant from one city to another (to keep your model simple). Since the houses you are studying are located in different areas, it is reasonable to assume that whether one house has a fire does not affect the likelihood of another house catching fire. Then you can use the probability mass function of the binomial distribution to calculate the probability of any given total number of fires in a block of houses for a given period of time.

Determine whether you have a binomial random variable.

Interestingly, if the probability distribution is not binomial (Poisson, exponential, or some other distribution unrelated to the binomial), the problem will most likely give you the actual distribution ("The number of claims filed has a Poisson distribution"). However, if the distribution is binomial, negative binomial, or geometric, the exam question most likely won't tell you so and you have to figure out the distribution yourself. This is because binomial, negative binomial, and geometric distributions can often be identified unambiguously from the context of a tested problem.
In contrast, Poisson, exponential, and many other distributions unrelated to the binomial are not self-evident; SOA may need to give you the actual distribution to avoid possible ambiguity around a tested problem.

To verify whether a distribution is indeed binomial, you can check whether the random variable meets the following standards:

1. There are n independent trials. The outcome of one trial does not affect the outcome of any other trial.
2. Each trial has only two outcomes: success or failure.
3. The probability of success remains constant throughout the n trials.
4. You are determining the probability of having x successes, where x is a non-negative integer no greater than n.
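Once these four conditions hold, the probability mass function, mean, and variance follow mechanically. As a hedged illustration (the values n = 3 and p = 0.3 are arbitrary choices for the example, and `binom_pmf` is a name invented here), the sketch below evaluates the generic formula and compares it with the memorized results $np$, $npq$, and the n = 3 shortcut $p_X(1) = 3pq^2$.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 3, 0.3          # arbitrary illustrative values
q = 1 - p
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * f for x, f in enumerate(pmf))
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print("pmf sums to", round(sum(pmf), 12))
print("mean", round(mean, 6), "vs np =", round(n * p, 6))
print("variance", round(var, 6), "vs npq =", round(n * p * q, 6))
print("p_X(1)", round(pmf[1], 6), "vs 3pq^2 =", round(3 * p * q ** 2, 6))
```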

Once you have identified that you have a binomial random variable, just plug the data into the memorized formulas for the probability mass function, the mean, and the variance.

Common Pitfalls to Watch for

Some candidates erroneously drop the combination factor $C_n^x = \dfrac{n!}{x!\,(n-x)!}$ when calculating the probability mass function of a binomial random variable. For example, to find the probability of having one success in two independent trials, some candidates write

$$p_X(1) = pq$$

Their logic is this: the probability of having one success in one trial is $p$ and the probability of having one failure in another trial is $q$. Hence the probability of having one success and one failure is $p_X(1) = pq$ according to the multiplication rule.

The above logic is wrong because there are two ways to have exactly one success out of two independent trials. One way is to succeed in the first trial and fail in the second trial. The other way is to fail in the first trial and succeed in the second trial. So the correct formula for n = 2 is $p_X(1) = 2pq$.

Sample Problems and Solutions

Problem 1

On a given day, each computer in a lab has at most one crash. There is a 5% chance that a computer has a crash during the day, independent of the performance of any other computers in the lab. There are 25 computers in the lab. Find the probability that on a given day, there are
(1) exactly 3 crashes
(2) at most 3 crashes
(3) at least 3 crashes
(4) more than 3 and less than 6 crashes

Solution

Let X = # of computer crashes in a given day. X has a binomial distribution with n = 25 and p = 5%.

$$p_X(x) = C_n^x\, p^x q^{n-x} = C_{25}^x\, (5\%)^x (95\%)^{25-x}$$
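The four probabilities asked for in Problem 1 can be checked numerically from the probability mass function just written down. This is a minimal sketch with illustrative variable names, not a replacement for working the sums by hand on the exam calculator.

```python
from math import comb

N_COMPUTERS = 25   # number of independent trials
P_CRASH = 0.05     # probability that a given computer crashes

def pmf(x):
    """P(exactly x crashes) under the binomial model of Problem 1."""
    return comb(N_COMPUTERS, x) * P_CRASH ** x * (1 - P_CRASH) ** (N_COMPUTERS - x)

exactly_3 = pmf(3)
at_most_3 = sum(pmf(x) for x in range(0, 4))        # x = 0, 1, 2, 3
at_least_3 = 1 - sum(pmf(x) for x in range(0, 3))   # complement of x = 0, 1, 2
between_3_and_6 = pmf(4) + pmf(5)                   # more than 3, less than 6

for label, prob in [("exactly 3", exactly_3), ("at most 3", at_most_3),
                    ("at least 3", at_least_3), ("4 or 5", between_3_and_6)]:
    print(f"{label}: {prob:.1%}")
```

Note that "at least 3" is computed through the complement, which needs three terms instead of twenty-three.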

The probability of having exactly 3 crashes in a day is:

$$p_X(3) = C_{25}^3\, (5\%)^3 (95\%)^{25-3} = 9.3\%$$

The probability of having at most 3 crashes in a day is:

$$P(X \le 3) = p_X(0) + p_X(1) + p_X(2) + p_X(3) = \sum_{x=0}^{3} C_{25}^x\, (5\%)^x (95\%)^{25-x}$$

$$= C_{25}^0 (5\%)^0 (95\%)^{25-0} + C_{25}^1 (5\%)^1 (95\%)^{25-1} + C_{25}^2 (5\%)^2 (95\%)^{25-2} + C_{25}^3 (5\%)^3 (95\%)^{25-3} = 96.6\%$$

The probability of having at least 3 crashes in a day is:

$$P(X \ge 3) = 1 - P(X \le 2) = 1 - \sum_{x=0}^{2} C_{25}^x\, (5\%)^x (95\%)^{25-x}$$

$$= 1 - \left[ C_{25}^0 (5\%)^0 (95\%)^{25-0} + C_{25}^1 (5\%)^1 (95\%)^{25-1} + C_{25}^2 (5\%)^2 (95\%)^{25-2} \right] = 12.7\%$$

Alternatively, the probability of having at least 3 crashes in a day is:

$$P(X \ge 3) = 1 - P(X \le 3) + P(X = 3) = 1 - P(\text{at most 3 crashes}) + P(\text{3 crashes}) = 1 - 96.6\% + 9.3\% = 12.7\%$$

The probability of having more than 3 but less than 6 crashes in a day is:

$$p_X(4) + p_X(5) = C_{25}^4 (5\%)^4 (95\%)^{25-4} + C_{25}^5 (5\%)^5 (95\%)^{25-5} = 3.3\%$$

Problem 2

A factory has 25 machines working separately, of which 15 are of Type A and 10 are of Type B. On a given day, each Type A machine has a 5% chance of malfunctioning, independent of the performance of any other machines; each Type B machine has an 8% chance of malfunctioning, independent of the performance of any other machines. Let $Y$ represent, on a given day, the total number of machines malfunctioning. Find $P(Y = 2)$, the probability of having exactly 2 machines malfunctioning on a given day.

Solution

On a given day, the number of Type A machines malfunctioning, $X_A$, and the number of Type B machines malfunctioning, $X_B$, are two independent binomial random variables. $X_A$ is binomial with n = 15 and p = 5%; $X_B$ is binomial with n = 10 and p = 8%.

$$Y = X_A + X_B$$

There are only 3 combinations of $X_A$ and $X_B$ that make $Y = 2$: two Type A's malfunction and no Type B's malfunction; one Type A and one Type B malfunction; no Type A's malfunction and two Type B's malfunction. In other words,

$$Y = X_A + X_B = 2 \quad \Longleftrightarrow \quad (X_A, X_B) = (2, 0),\ (1, 1),\ (0, 2)$$

We just need to find the probability of each of the above three combinations and calculate the sum. Because $X_A$ and $X_B$ are independent,

$$P(X_A = x_A, X_B = x_B) = P(X_A = x_A)\, P(X_B = x_B)$$

$$P(X_A = 2, X_B = 0) = P(X_A = 2)\, P(X_B = 0)$$

$$P(X_A = 2) = C_{15}^2\, (5\%)^2 (95\%)^{15-2} = 13.48\%$$

$$P(X_B = 0) = C_{10}^0\, (8\%)^0 (92\%)^{10-0} = 43.44\%$$

$$P(X_A = 2, X_B = 0) = 13.48\% \times 43.44\% = 5.85\%$$

$$P(X_A = 1, X_B = 1) = P(X_A = 1)\, P(X_B = 1) = C_{15}^1 (5\%)^1 (95\%)^{15-1} \times C_{10}^1 (8\%)^1 (92\%)^{10-1} = 13.82\%$$

$$P(X_A = 0, X_B = 2) = P(X_A = 0)\, P(X_B = 2) = C_{15}^0 (5\%)^0 (95\%)^{15-0} \times C_{10}^2 (8\%)^2 (92\%)^{10-2} = 6.85\%$$

$$P(Y = 2) = 5.85\% + 13.82\% + 6.85\% = 26.52\%$$

Problem 3 (CAS Exam 3 Spring 2005 #2, wording simplified)

BIB is a new insurer writing homeowner policies. You are given:
- BIB's initial wealth in a savings account = $15
- Number of insured homes = 3
- Annual premium of $10 per homeowner policy, collected at the beginning of each year
- Annual claim per homeowner policy, if any, is $40 and is paid immediately
- Besides claims, BIB doesn't have any other expenses
- BIB's savings account earns zero interest

- Each homeowner files at most one claim per year.
- The probability that each homeowner files a claim in Year 1 is 20%.
- The probability that each homeowner files a claim in Year 2 is 10%.
- Claims are independent.

Calculate the probability that BIB will NOT go bankrupt in the first 2 years.

Solution

BIB will not go bankrupt in the first 2 years (i.e. BIB still stays in business at the end of Year 2) if there is at most 1 claim in the first 2 years.

If there is at most 1 claim in the first two years, then

BIB's total wealth at the end of Year 2 before paying any claims
= initial wealth + premiums collected during Year 1 + premiums collected during Year 2
= $15 + $10(3) + $10(3) = $75

BIB's total expense incurred during Year 1 and Year 2
= cost per claim × total # of claims during the first two years
= $40(1) = $40

BIB's total wealth at the end of Year 2 after paying the claims = $75 - $40 = $35

If BIB incurs two claims in the first two years, then the claim cost during the first two years is $40(2) = $80. BIB's total wealth at the end of Year 2 is $75 - $80 = -$5, so BIB will be bankrupt by the end of Year 2.

So BIB needs to have zero or one claim in the first two years to stay in business. There are three ways to have zero or one claim during the first two years:
- Having zero claims in Year 1 and zero claims in Year 2 (Option 1)
- Having zero claims in Year 1 and one claim in Year 2 (Option 2)
- Having one claim in Year 1 and zero claims in Year 2 (Option 3)

The number of claims in Year 1 has a binomial distribution with parameters n = 3 and p = 0.2. The number of claims in Year 2 has a binomial distribution with parameters n = 3 and p = 0.1.

Next, we set up a table keeping track of claims:

In the table below, A = Year 1 probability, B = Year 2 probability, and C = A × B = total probability.

Option #1 (0 claims in Year 1, 0 claims in Year 2)
  A = $C_3^0\,(0.2^0)(0.8^3) = 0.8^3$
  B = $C_3^0\,(0.1^0)(0.9^3) = 0.9^3$
  C = $0.8^3\,(0.9^3) = 37.32\%$

Option #2 (0 claims in Year 1, 1 claim in Year 2)
  A = $C_3^0\,(0.2^0)(0.8^3) = 0.8^3$
  B = $C_3^1\,(0.1^1)(0.9^2) = 3(0.1)(0.9^2)$
  C = $0.8^3\,(3)(0.1)(0.9^2) = 12.44\%$

Option #3 (1 claim in Year 1, 0 claims in Year 2)
  A = $C_3^1\,(0.2^1)(0.8^2) = 3(0.2)(0.8^2)$
  B = $C_3^0\,(0.1^0)(0.9^3) = 0.9^3$
  C = $3(0.2)(0.8^2)(0.9^3) = 27.99\%$

Total: 77.76%

The probability is 77.76% that BIB will not go bankrupt during the first 2 years.

Problem 4 (CAS Exam 3 Spring 2005 #15)

A service guarantee covers 20 TV sets. Each year, each set has a 5% chance of failing. These probabilities are independent. If a set fails, it is immediately replaced with a new set at the end of the year of failure. This new set is included in the service guarantee. Calculate the probability of no more than one failure in the first two years.

Solution

There are three ways to have zero or one failure during the first two years:
- Having zero failures in Year 1 and zero failures in Year 2 (Option 1)
- Having zero failures in Year 1 and one failure in Year 2 (Option 2)
- Having one failure in Year 1 and zero failures in Year 2 (Option 3)

The # of failures each year has a binomial distribution with parameters n = 20 and p = 5%.

Next, we set up a table keeping track of failures:

In the table below, A = Year 1 probability, B = Year 2 probability, and C = A × B = total probability.

Option #1 (0 failures in Year 1, 0 failures in Year 2)
  A = $C_{20}^0\,(0.05^0)(0.95^{20}) = 0.95^{20}$
  B = $C_{20}^0\,(0.05^0)(0.95^{20}) = 0.95^{20}$
  C = $\left(0.95^{20}\right)^2 = 0.95^{40} = 12.85\%$

Option #2 (0 failures in Year 1, 1 failure in Year 2)
  A = $C_{20}^0\,(0.05^0)(0.95^{20}) = 0.95^{20}$
  B = $C_{20}^1\,(0.05^1)(0.95^{19}) = 20(0.05)(0.95^{19})$
  C = $20(0.05)(0.95^{39}) = 13.53\%$

Option #3 (1 failure in Year 1, 0 failures in Year 2)
  A = $C_{20}^1\,(0.05^1)(0.95^{19}) = 20(0.05)(0.95^{19})$
  B = $C_{20}^0\,(0.05^0)(0.95^{20}) = 0.95^{20}$
  C = $20(0.05)(0.95^{39}) = 13.53\%$

Total: 39.91%

The probability of no more than one failure in the first two years is 39.91%.

Homework for you: #40 May 2000; #23, #37 May 2001; #27 Nov 2001.
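Problems 2, 3, and 4 of this chapter share one pattern: the total count is the sum of two independent binomial counts, and the answer adds up the probabilities of every split of that total between the two. The sketch below applies that pattern to all three problems. The helper names are invented for this illustration, and for Problem 3 it takes as given the solution's reduction that BIB survives exactly when there is at most one claim in total.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def prob_sum_equals(total, first, second):
    """P(X1 + X2 = total) for independent binomials given as (n, p) pairs."""
    n1, p1 = first
    n2, p2 = second
    return sum(binom_pmf(a, n1, p1) * binom_pmf(total - a, n2, p2)
               for a in range(total + 1)
               if a <= n1 and total - a <= n2)

# Problem 2: exactly 2 malfunctions among 15 Type A and 10 Type B machines
p_machines = prob_sum_equals(2, (15, 0.05), (10, 0.08))

# Problem 3: at most 1 claim over Year 1 (p = 0.2) and Year 2 (p = 0.1), 3 homes
p_bib = sum(prob_sum_equals(t, (3, 0.2), (3, 0.1)) for t in (0, 1))

# Problem 4: at most 1 failure over two years, 20 sets covered each year
p_tv = sum(prob_sum_equals(t, (20, 0.05), (20, 0.05)) for t in (0, 1))

print(f"{p_machines:.2%}  {p_bib:.2%}  {p_tv:.2%}")
```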

Chapter 14 Geometric distribution

You perform Bernoulli trials (flipping a coin, throwing a die) repeatedly until you get your first success (for example, getting a first head or a first 6). Then you stop and count the total number of trials you had (including the final trial that brings you the only success). The total number of trials you counted is a geometric random variable.

If we let X = # of independent trials just enough for one success, and p is the constant probability of success in any single trial, then X is a geometric random variable with parameter p.

Please note that one nasty thing about a geometric (and a negative binomial) random variable is that there are always two ways to define the random variable: as the number of trials or as the number of failures. Though the variance formula is identical either way, the mean formulas are different (the two means differ by one). Because exam questions can use either way to define a geometric variable, you need to determine, according to the context of the problem, which mean and probability mass formulas to use.

Here is the second way to define a geometric random variable: if we let Y = # of independent failures before the first success, and p is the constant probability of success in any single trial, then Y is a geometric random variable with parameter p.

Let's look at the probability mass functions of X and Y. Let S = success, F = failure, and p = the probability of success in a single trial. Then

X = # of trials up to and including the 1st success; Y = # of failures before the 1st success.

  Outcome                      X = n   Y = k         Probability P(X = n) = P(Y = k)
  S                            1       0             $p$
  FS                           2       1             $(1-p)\,p$
  FFS                          3       2             $(1-p)^2\,p$
  FFFS                         4       3             $(1-p)^3\,p$
  ...
  FF...FS (n - 1 failures)     n       k = n - 1     $(1-p)^{n-1}\,p = (1-p)^k\,p$

Key formulas for geometric random variable

PMF (probability mass function)
  Formula: $p_X(n) = (1-p)^{n-1}\,p$, $\quad p_Y(k) = (1-p)^k\,p$
  Explanation: To have the 1st success at the n-th trial, you must fail the first (n-1) trials and succeed at the n-th trial. Notice X = Y + 1.

Tail probability
  Formula: $P(X \ge n) = (1-p)^{n-1}$, $\quad P(Y \ge k) = P(X \ge k+1) = (1-p)^k$
  Explanation: To need at least n trials to get one success, you must have zero successes in the first n - 1 trials.

Cumulative probability mass function
  Formula: $F_X(n) = P(X \le n) = 1 - P(X \ge n+1) = 1 - (1-p)^n$, $\quad F_Y(k) = P(Y \le k) = 1 - P(Y \ge k+1) = 1 - (1-p)^{k+1}$
  Explanation: The continuous counterpart of geometric is exponential. The exponential CDF, with rate $\lambda$, is $F_X(x) = 1 - e^{-\lambda x}$. Notice that the CDF for both geometric and exponential is one minus something.

Mean
  Formula: $E(X) = \dfrac{1}{p}$, $\quad E(Y) = E(X) - 1 = \dfrac{1}{p} - 1$
  Explanation: You can derive it using the MGF (next page). Intuitive formula. For example, if $p = 0.2 = 1/5$, then on average every 5 trials bring in one success. So $E(X) = 1/p$.

Variance
  Formula: $\mathrm{Var}(X) = \mathrm{Var}(Y) = \dfrac{1}{p^2} - \dfrac{1}{p}$
  Explanation: You can derive it using the MGF (next page). Not an intuitive formula.

MGF
  Formula: $M_X(t) = \dfrac{p e^t}{1 - (1-p) e^t}$, $\quad M_Y(t) = M_X(t)\, e^{-t} = \dfrac{p}{1 - (1-p) e^t}$
  Explanation: You just have to memorize the first one. To get the 2nd formula, use $M_{X+c}(t) = M_X(t)\, e^{ct}$.

Conditional probability (lack of memory property)
  Formula:

$$P(X \ge a + b \mid X \ge b + 1) = \frac{P(X \ge a + b)}{P(X \ge b + 1)} = \frac{(1-p)^{a+b-1}}{(1-p)^b} = (1-p)^{a-1} = P(X \ge a)$$

$$P(Y \ge a + b \mid Y \ge b) = \frac{P(Y \ge a + b)}{P(Y \ge b)} = \frac{(1-p)^{a+b}}{(1-p)^b} = (1-p)^a = P(Y \ge a)$$

  Explanation: X and Y don't remember their past. After making the initial b trials with no success, your chance of needing at least a more trials to succeed is the same as if you reset your counter to zero after the initial b trials and started counting trials afresh. Why so? If an event is truly random (such as coin tossing), your past success or failure should have no bearing on your future success or failure.
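The two parameterizations in the table are easy to mix up, so a numerical cross-check can help. The sketch below sums the two probability mass functions directly (truncating the infinite series at an arbitrary cutoff `TERMS`, an assumption of this example) and confirms the means, the shared variance, and the lack-of-memory identity for the trials version. The value p = 0.2 and the function names are illustrative choices.

```python
P = 0.2        # illustrative success probability
TERMS = 2000   # truncation point for the infinite sums (the tail beyond is negligible)

def geom_trials_pmf(n, p):
    """P(X = n), X = number of trials up to and including the first success, n >= 1."""
    return (1 - p) ** (n - 1) * p

def geom_failures_pmf(k, p):
    """P(Y = k), Y = number of failures before the first success, k >= 0."""
    return (1 - p) ** k * p

mean_x = sum(n * geom_trials_pmf(n, P) for n in range(1, TERMS))
var_x = sum(n * n * geom_trials_pmf(n, P) for n in range(1, TERMS)) - mean_x ** 2
mean_y = sum(k * geom_failures_pmf(k, P) for k in range(0, TERMS))
var_y = sum(k * k * geom_failures_pmf(k, P) for k in range(0, TERMS)) - mean_y ** 2

def tail_trials(n, p):
    """P(X >= n), computed by direct summation rather than the closed form."""
    return sum(geom_trials_pmf(j, p) for j in range(n, TERMS))

a, b = 3, 4
conditional = tail_trials(a + b, P) / tail_trials(b + 1, P)   # P(X >= a+b | X >= b+1)
unconditional = tail_trials(a, P)                             # P(X >= a)

print(round(mean_x, 6), round(mean_y, 6), round(var_x, 6), round(var_y, 6))
print(round(conditional, 6), round(unconditional, 6))
```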

How to derive the mean and variance formulas using the MGF:

$$M_X(t) = \frac{p e^t}{1 - (1-p) e^t} = \frac{p}{e^{-t} - (1-p)} = p \left[ e^{-t} - (1-p) \right]^{-1}$$

$$\frac{d}{dt} M_X(t) = p\, \frac{d}{dt} \left[ e^{-t} - (1-p) \right]^{-1} = p e^{-t} \left[ e^{-t} - (1-p) \right]^{-2}$$

$$E(X) = \frac{d}{dt} M_X(t) \bigg|_{t=0} = \left\{ p e^{-t} \left[ e^{-t} - (1-p) \right]^{-2} \right\}_{t=0} = p \left[ 1 - (1-p) \right]^{-2} = \frac{1}{p}$$

$$\frac{d^2}{dt^2} M_X(t) = \frac{d}{dt} \left\{ p e^{-t} \left[ e^{-t} - (1-p) \right]^{-2} \right\} = -p e^{-t} \left[ e^{-t} - (1-p) \right]^{-2} + p e^{-t} (-2) \left( -e^{-t} \right) \left[ e^{-t} - (1-p) \right]^{-3}$$

$$E(X^2) = \frac{d^2}{dt^2} M_X(t) \bigg|_{t=0} = -p \left[ 1 - (1-p) \right]^{-2} + 2p \left[ 1 - (1-p) \right]^{-3} = -p \left( p^{-2} \right) + 2p \left( p^{-3} \right) = \frac{2}{p^2} - \frac{1}{p}$$

$$\mathrm{Var}(X) = E(X^2) - E^2(X) = \frac{2}{p^2} - \frac{1}{p} - \frac{1}{p^2} = \frac{1}{p^2} - \frac{1}{p}$$

Sample Problems and Solutions

Problem 1

A candidate is taking a multiple-choice exam. For each tested problem in the exam, there are 5 possible choices: A, B, C, D, and E. Because the candidate has zero knowledge of the subject, he relies on pure guesswork to answer each question, independent of how he answers any previous questions. Find:

(1) The probability that the candidate answers three questions wrong in a row before he finally answers the fourth question correctly.

(2) Let random variable X = # of problems the candidate answers wrong in a row before he finally guesses a correct answer. Find $E(X)$ and $\mathrm{Var}(X)$.

(3) When the candidate's exam is being graded, the grader finds that the candidate has answered the first five exam questions all wrong. Can the grader conclude that the candidate is more or less likely to guess right on the other problems?

(4) When the candidate's exam is being graded, the grader finds that the candidate has answered the first two exam questions both correctly. Can the grader conclude that the candidate is more or less likely to guess right on the other problems?

Solution

(1) The probability of having three wrongs followed by a right on the fourth question.

If you use the number of failures X as the geometric random variable, then you have

$$p_X(x) = p\,(1-p)^x, \quad x = 0, 1, 2, 3, \dots, \quad p = 0.2$$

$$p_X(3) = p\,(1-p)^3 = 0.2\,(1 - 0.2)^3 = 10.24\%$$

If you use the number of trials Y as the geometric random variable, then you have

$$p_Y(y) = p\,(1-p)^{y-1}, \quad y = 1, 2, 3, \dots, \quad p = 0.2$$

$$p_Y(4) = p\,(1-p)^{4-1} = 0.2\,(1 - 0.2)^3 = 10.24\%$$

(2) Find $E(X)$ and $\mathrm{Var}(X)$.

If you use the number of failures X as the geometric random variable, then you have

$$E(X) = \frac{1}{p} - 1 = \frac{1}{0.2} - 1 = 4, \qquad \mathrm{Var}(X) = \frac{1}{p^2} - \frac{1}{p} = \frac{1}{0.2^2} - \frac{1}{0.2} = 20$$

If you use the number of trials Y as the geometric random variable, then you have

$$E(Y) = \frac{1}{p} = \frac{1}{0.2} = 5, \qquad \mathrm{Var}(Y) = \frac{1}{p^2} - \frac{1}{p} = \frac{1}{0.2^2} - \frac{1}{0.2} = 20$$

(3) The geometric distribution lacks memory. Past failures are of no use in predicting future successes or failures. If the candidate really has zero knowledge of the subject and all exam problems are unrelated, then the candidate answering the first 5 questions all wrong is purely by chance. This information is useless for predicting the candidate's performance on other exam questions.

(4) Once again, the geometric distribution lacks memory. Past successes have no bearing on future successes or failures.

Problem 2

You throw a die repeatedly until you get a 6. What's the probability that you need to throw more than 20 times to get a 6?

Solution

If you use the number of trials X as the geometric random variable, then you have

$$P(X \ge n) = (1-p)^{n-1}, \quad p = \frac{1}{6}$$

$$P(X > 20) = P(X \ge 21) = (1-p)^{21-1} = \left(1 - \frac{1}{6}\right)^{21-1} = 2.6\%$$

If you use the number of failures Y as the geometric random variable, then you have

$$P(Y \ge k) = (1-p)^k$$

Throwing a die at least 21 times to get a 6 is the same as having at least 20 failures before a success:

$$P(Y \ge 20) = (1-p)^{20} = \left(1 - \frac{1}{6}\right)^{20} = 2.6\%$$

Problem 3

The annual number of losses incurred by a policyholder of an auto insurance policy, $N$, is geometrically distributed with parameter p = 0.8. If losses do occur, the amount of each loss is either $1,000 with probability 0.4 or $5,000 with probability 0.6. Assume the number of losses and the amounts of losses are independent. Let $S$ = annual aggregate loss incurred by the policyholder. Find $E(S)$ and $\mathrm{Var}(S)$.

Solution

Let $X$ = amount of a single loss (set K = $1,000 to make our calculation easier):

$$X = \begin{cases} 1K & \text{with probability } 0.4 \\ 5K & \text{with probability } 0.6 \end{cases}$$

We write 1K and 5K so we won't forget that the unit is $1,000.

$$S = NX$$

Because $N$ and $X$ are independent, we have

$$E(S) = E(N)\,E(X) \quad \text{(intuitive)}$$

$$\mathrm{Var}(S) = \mathrm{Var}(NX) = \mathrm{Var}(N)\,E^2(X) + E(N)\,\mathrm{Var}(X) \quad \text{(not intuitive)}$$

You should memorize the above formulas. Here $NX$ is shorthand for the aggregate of $N$ independent losses, $S = X_1 + X_2 + \dots + X_N$, each distributed like $X$; the variance formula applies to this random sum.

Note that in the second formula above, the first term on the right-hand side is $\mathrm{Var}(N)\,E^2(X)$, not $\mathrm{Var}(N)\,E(X)$. To see why, notice that $\mathrm{Var}(NX)$ is in dollars squared (the variance of a dollar amount is in dollars squared); $E(N)\,\mathrm{Var}(X)$ is in dollars squared, and $\mathrm{Var}(N)\,E^2(X)$ is also in dollars squared. If you used $\mathrm{Var}(N)\,E(X)$ instead of the correct term $\mathrm{Var}(N)\,E^2(X)$, that term would be a dollar amount, and the units of the equation would not match.

To calculate $E(X)$ and $\mathrm{Var}(X)$, we simply plug the following data (remember to scale up the probabilities) into BA II Plus (by now this should be second nature):

$$X = \begin{cases} 1K & \text{with probability } 0.4 \\ 5K & \text{with probability } 0.6 \end{cases}$$

Set X01=1, Y01=4; X02=5, Y02=6. Using the BA II Plus 1-V Statistics Worksheet, you should get:

$$E(X) = 3.4K, \qquad \mathrm{Var}(X) = 3.84K^2$$

Alternatively,

$$E(X) = 1(0.4) + 5(0.6) = 3.4K$$

$$E(X^2) = 1^2(0.4) + 5^2(0.6) = 15.4K^2$$

$$\mathrm{Var}(X) = E(X^2) - E^2(X) = 15.4 - 3.4^2 = 3.84K^2$$

Next, we need to find the mean and variance of $N$. $N$ has a geometric distribution with parameter p = 0.8. Should we treat $N$ as the number of trials or the number of failures? Because $N$ is the number of losses occurring in a year and a policyholder may have zero losses in a year, $N$ starts from zero, not from one. We should treat $N$ as the number of failures and use the mean formula that produces the lower mean:

$$E(N) = \frac{1}{p} - 1 = \frac{1}{0.8} - 1 = 0.25, \qquad \mathrm{Var}(N) = \frac{1}{p^2} - \frac{1}{p} = \frac{1}{0.8^2} - \frac{1}{0.8} = 0.3125$$

$$E(S) = E(N)\,E(X) = 3.4K\,(0.25) = 0.85K = \$850$$

$$\mathrm{Var}(S) = \mathrm{Var}(N)\,E^2(X) + E(N)\,\mathrm{Var}(X) = 0.3125\,(3.4K)^2 + 0.25\,(3.84K^2) = 4.5725K^2 = 4.5725\,(1{,}000^2) = 4{,}572{,}500 \ (\$^2)$$

Problem 4

The annual number of losses incurred by a policyholder of an auto insurance policy, $N$, is geometrically distributed with parameter p = 0.4. Find $E(N - 2 \mid N > 2)$, $\mathrm{Var}(N - 2 \mid N > 2)$, $E(N \mid N > 2)$, and $\mathrm{Var}(N \mid N > 2)$.

Solution

A geometric random variable $N$ doesn't have any memory of its past. As in Problem 3, $N$ counts losses starting from zero, so $P(N = n) = p\,(1-p)^n$ and $P(N \ge k) = (1-p)^k$. Given that at least $k$ losses have been seen ($k$ is a non-negative integer), if we set the counter to zero and start counting the number of losses incurred from this point on (that is, $N - k$), the count $N - k$ has the identical geometric distribution with the same parameter $p$.

In this problem the condition is $N > 2$, which for an integer count means $N \ge 3$. So the reset takes place after the third loss: given $N > 2$, the count $N - 3$ is geometrically distributed with the identical parameter p = 0.4, and $N - 2 = (N - 3) + 1$. Adding a constant shifts the mean but leaves the variance unchanged.

$$E(N) = \frac{1}{p} - 1 = \frac{1}{0.4} - 1 = 1.5, \qquad \mathrm{Var}(N) = \frac{1}{p^2} - \frac{1}{p} = \frac{1}{0.4^2} - \frac{1}{0.4} = 3.75$$

$$E(N - 2 \mid N > 2) = E(N) + 1 = 1.5 + 1 = 2.5$$

$$\mathrm{Var}(N - 2 \mid N > 2) = \mathrm{Var}(N) = 3.75$$

$$E(N \mid N > 2) = E(N - 2 \mid N > 2) + 2 = 2.5 + 2 = 4.5$$

$$\mathrm{Var}(N \mid N > 2) = \mathrm{Var}(N - 2 \mid N > 2) = 3.75$$

Had the condition been $N \ge 2$ (at least two losses seen) instead of $N > 2$, the corresponding means would be $E(N - 2 \mid N \ge 2) = E(N) = 1.5$ and $E(N \mid N \ge 2) = 3.5$, with the same variance of 3.75.

Homework for you: rework all the problems in this chapter.
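The aggregate-loss results of Problem 3 can be cross-checked without relying on the memorized variance formula. The sketch below conditions on the number of losses, treating the aggregate loss as the sum of N independent losses with the stated two-point severity, and sums over n. The cutoff `MAX_LOSSES` and the variable names are assumptions made for this illustration; amounts are in thousands of dollars (K).

```python
P_GEOM = 0.8                 # geometric parameter for the number of losses N
MAX_LOSSES = 200             # truncation of the sum over n (the tail beyond is negligible)

loss_values = [1.0, 5.0]     # single-loss amounts in K
loss_probs = [0.4, 0.6]

mu = sum(v * q for v, q in zip(loss_values, loss_probs))                  # E(X)
sigma2 = sum(q * (v - mu) ** 2 for v, q in zip(loss_values, loss_probs))  # Var(X)

e_s = 0.0
e_s2 = 0.0
for n in range(MAX_LOSSES):
    prob_n = P_GEOM * (1 - P_GEOM) ** n          # P(N = n), N starts at zero
    # Given N = n, S is a sum of n independent losses:
    #   E(S | N = n) = n * mu,  E(S^2 | N = n) = n * sigma2 + (n * mu)^2
    e_s += prob_n * n * mu
    e_s2 += prob_n * (n * sigma2 + (n * mu) ** 2)

var_s = e_s2 - e_s ** 2
print(round(mu, 4), round(sigma2, 4), round(e_s, 4), round(var_s, 4))
```

Multiplying the variance in K² by 1,000² converts it back to dollars squared.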

Chapter 15 Negative binomial

The negative binomial distribution is one of the more difficult concepts in Exam P, and it is very easy for candidates to miscalculate probabilities related to it. To understand the negative binomial distribution, remember the following critical points.

Critical point #1: The negative binomial distribution is like a "negative" version of our familiar binomial distribution. In a binomial distribution, the number of trials n is fixed and we want to find the probability of having k successes in these n trials (so K is the random variable). In contrast, in a negative binomial distribution, the number of successes k is fixed and we want to find the probability that these k successes are produced in n trials, with the k-th success arriving on the n-th trial (so N is the random variable).

Critical point #2: The probability mass function of a negative binomial distribution with parameters (n, k, p) is a fraction of the probability mass function of the binomial distribution with identical parameters (n, k, p), where n = # of trials, k = # of successes, and p = the probability of success. The fraction is equal to k/n. Expressed in an equation,

\[ f^{NB}(n, k, p) = \frac{k}{n}\, f^{B}(n, k, p), \quad \text{where NB = negative binomial, B = binomial} \]

Stated differently,

\[ P(k \text{ successes are produced by } n \text{ trials}) = \frac{k}{n}\, P(k \text{ successes in } n \text{ trials}) \]

We can easily prove the above equation.

\( f^{NB}(n, k, p) \) = P(k successes are produced by n trials)
= P(first n-1 trials produce k-1 successes and final trial is a success)
= P(first n-1 trials produce k-1 successes) \( \times \) P(final trial is a success)

The last equality holds because the trials are independent.

P(first n-1 trials produce k-1 successes) is a binomial probability:

\[ C_{n-1}^{k-1}\, p^{k-1} (1-p)^{(n-1)-(k-1)} = C_{n-1}^{k-1}\, p^{k-1} (1-p)^{n-k} \]

P(final trial is a success) = p

\[ f^{NB}(n, k, p) = p\, C_{n-1}^{k-1}\, p^{k-1} (1-p)^{n-k} = C_{n-1}^{k-1}\, p^{k} (1-p)^{n-k} \]

However,

\[ C_{n-1}^{k-1} = \frac{(n-1)!}{(k-1)!\,(n-k)!} = \frac{k}{n} \cdot \frac{n!}{k!\,(n-k)!} = \frac{k}{n}\, C_n^k \]

\[ f^{NB}(n, k, p) = C_{n-1}^{k-1}\, p^{k} (1-p)^{n-k} = \frac{k}{n}\, C_n^k\, p^{k} (1-p)^{n-k} = \frac{k}{n}\, f^{B}(n, k, p) \]

We want to express the negative binomial probability mass function (PMF) as a fraction of the binomial PMF. Binomial is one of the easiest probability functions for candidates to understand and calculate. Since you can find the binomial PMF with parameters (n, k, p), you can easily calculate the negative binomial PMF by simply taking a fraction (k/n) of the binomial PMF with the identical parameters. This way, you don't have to memorize the negative binomial PMF. You just need to memorize the fraction (k/n).

Without using the rigorous proof above, we can still intuitively see that, given the identical parameters (n, k, p), the negative binomial PMF is smaller than the binomial PMF. This is because the negative binomial distribution has more constraints than the binomial distribution does. Both distributions have k successes in n trials; the PMF for each distribution is a multiple of \( p^k (1-p)^{n-k} \). However, the negative binomial distribution requires that (1) the first n-1 trials have k-1 successes, and (2) the final trial, the n-th trial, be a success. In contrast, the binomial distribution does not have these two constraints. These additional constraints reduce the number of ways of having k successes in n trials in the negative binomial distribution. Mathematically, the number of ways of having k successes in n trials in a negative binomial distribution is only a fraction (k/n) of the number of ways of having k successes in n trials in a binomial distribution.

Critical point #3: A negative binomial random variable with parameters (n, k, p) is the sum of k independent, identically distributed geometric random variables with parameter p. If you understand this, memorizing the formulas for the mean and variance of a negative binomial distribution becomes easy.
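The k/n relationship in Critical point #2 is easy to confirm numerically. The following Python sketch compares the negative binomial PMF computed directly with k/n times the binomial PMF; the function names and the test values of (n, k, p) are illustrative choices, not part of the text.

```python
from math import comb, isclose

def binomial_pmf(n, k, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def negbin_pmf_direct(n, k, p):
    """P(the k-th success occurs exactly on trial n)."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

def negbin_pmf_fraction(n, k, p):
    """Same probability, written as the fraction k/n of the binomial PMF."""
    return k / n * binomial_pmf(n, k, p)

# Arbitrary illustrative parameter sets (n, k, p) with 1 <= k <= n.
cases = [(20, 2, 1 / 6), (10, 3, 0.4), (55, 5, 0.2), (7, 7, 0.9)]
all_match = all(
    isclose(negbin_pmf_direct(n, k, p), negbin_pmf_fraction(n, k, p), rel_tol=1e-12)
    for n, k, p in cases
)
print("k/n identity holds for all cases:", all_match)
```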
Let's look into Critical point #3 further. Let

\( N_1 \) = # of trials required for the 1st success; \( N_1 \) is a geometric variable with parameter p.
\( N_2 \) = # of additional trials, after the 1st success, required for the 2nd success; \( N_2 \) is a geometric variable with parameter p.
...
\( N_k \) = # of additional trials, after the (k-1)-th success, required for the k-th success; \( N_k \) is a geometric variable with parameter p.

And N = # of trials required for a total of k successes. N has a negative binomial distribution with parameters k and p. Then

\[ N = N_1 + N_2 + \dots + N_k \]

Because the trials are independent, \( N_1, N_2, \dots, N_k \) are independent.

\[ E(N) = E(N_1 + N_2 + \dots + N_k) = k \cdot \frac{1}{p} = \frac{k}{p} \]

\[ \mathrm{Var}(N) = \mathrm{Var}(N_1 + N_2 + \dots + N_k) = k \left( \frac{1}{p^2} - \frac{1}{p} \right) \]

To obtain the moment generating function of a negative binomial distribution, we simply raise the moment generating function of a geometric distribution to the power of k:

\[ M_N(t) = M_{N_1 + N_2 + \dots + N_k}(t) = \left[ M_{N_1}(t) \right]^k = \left[ M_{N_2}(t) \right]^k = \dots = \left[ M_{N_k}(t) \right]^k = \left[ \frac{p e^t}{1 - (1-p) e^t} \right]^k \]

How is the negative binomial distribution used in insurance? It turns out that if the number of claims by the insured has a Poisson distribution with mean \( \lambda \), and \( \lambda \) changes from one risk group to another (i.e. low-risk insureds have a lower average number of claims than a high-risk group) and has a gamma distribution, then the number of claims has a negative binomial distribution. This is called the Gamma-Poisson Model, which is routinely tested in SOA Exam C. The Gamma-Poisson Model is beyond the scope of Exam P, so you do not need to worry about it. As you progress in your actuarial career and exams, you will pick up this concept. For now, just keep in mind that the number of claims by the insured is sometimes modeled with a negative binomial distribution.

Please note that some textbooks define the negative binomial random variable X as the number of failures (instead of the number of trials N) before the k-th success. Instead of memorizing a new set of formulas for the probability mass function, mean, variance, and moment generating function of X, you simply transform X to N by setting X + k = N.
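Critical point #3 can be illustrated by brute force: convolve the geometric PMF with itself k times and compare the result with the negative binomial PMF, mean, and variance. The sketch below assumes illustrative values k = 3 and p = 0.4 and truncates the support at 200 trials (the ignored tail is negligible for these values); the helper names are hypothetical.

```python
from math import comb, isclose

def geometric_pmf(p, max_n):
    """P(N1 = n) for n = 0..max_n, where N1 counts trials (so P(N1 = 0) = 0)."""
    return [0.0] + [p * (1 - p) ** (n - 1) for n in range(1, max_n + 1)]

def convolve(a, b, max_n):
    """PMF of the sum of two independent counts, kept only up to max_n."""
    out = [0.0] * (max_n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > max_n:
                break
            out[i + j] += ai * bj
    return out

def negbin_pmf(n, k, p):
    """P(the k-th success occurs exactly on trial n)."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

k, p, max_n = 3, 0.4, 200  # illustrative values, not from the text

geom = geometric_pmf(p, max_n)
total = geom
for _ in range(k - 1):          # N = N1 + N2 + ... + Nk
    total = convolve(total, geom, max_n)

max_gap = max(abs(total[n] - negbin_pmf(n, k, p)) for n in range(k, max_n + 1))
mean = sum(n * q for n, q in enumerate(total))
var = sum(n * n * q for n, q in enumerate(total)) - mean**2

print("largest PMF difference:", max_gap)
print("mean:", mean, "vs k/p =", k / p)
print("variance:", var, "vs k(1/p^2 - 1/p) =", k * (1 / p**2 - 1 / p))
```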

To find \( f_X(x) \), note that the probability of having x failures before the k-th success is the same as the probability of needing a total of x + k trials to get the k-th success.

\[ f_X(x) = f_N(x+k) = C_{x+k-1}^{k-1}\, p^k (1-p)^x \]

Or

\[ f_X(x) = f_N(x+k) = \frac{k}{x+k}\, [\text{binomial probability of } k \text{ successes in } x+k \text{ trials}] = \frac{k}{x+k}\, C_{x+k}^{k}\, p^k (1-p)^x \]

\[ E(X) = E(N - k) = E(N) - k = \frac{k}{p} - k = k \left( \frac{1}{p} - 1 \right) \]

\[ \mathrm{Var}(X) = \mathrm{Var}(N - k) = \mathrm{Var}(N) = k \left( \frac{1}{p^2} - \frac{1}{p} \right) \]

To find the moment generating function of X, we use the formula \( M_{aX+b}(t) = M_{aX}(t)\, e^{bt} \). Please note that X = N - k.

\[ M_X(t) = M_{N-k}(t) = M_N(t)\, e^{-kt} = \left[ \frac{p e^t}{1 - (1-p) e^t} \right]^k e^{-kt} = \left[ \frac{p}{1 - (1-p) e^t} \right]^k \]

Sample Problems and Solutions

Problem 1

You roll a fair die repeatedly. Let N = the number of rolls needed to get four 5's. Find E(N) and Var(N).

Solution

N is a negative binomial random variable with parameters k = 4 and p = 1/6.

\[ E(N) = \frac{k}{p} = \frac{4}{1/6} = 24 \]

The E(N) formula is intuitive. On average, you need to roll the die 6 times to get one 5. To get four 5's, you need to roll 4*6 = 24 times.

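\[ \mathrm{Var}(N) = k \left( \frac{1}{p^2} - \frac{1}{p} \right) = 4\,(6^2 - 6) = 4\,(30) = 120 \]

The formula for Var(N) is not intuitive. You have to memorize it.

Problem 2

A negative binomial random variable X = 0, 1, 2, ... has parameters k and p. If p is cut in half, Var(X) will increase by 500%. Calculate \( E(X) / \mathrm{Var}(X) \) before p is cut in half.

Solution

Because X = 0, 1, 2, ..., X = # of failures before the k-th success.

\[ E(X) = k \left( \frac{1}{p} - 1 \right), \qquad \mathrm{Var}(X) = k \left( \frac{1}{p^2} - \frac{1}{p} \right) = \frac{k}{p} \left( \frac{1}{p} - 1 \right) \]

\[ \frac{E(X)}{\mathrm{Var}(X)} = \frac{k \left( \frac{1}{p} - 1 \right)}{\frac{k}{p} \left( \frac{1}{p} - 1 \right)} = p \]

To find p, we use the information that "if p is cut in half, Var(X) will increase by 500%."

Var(X) after p is cut in half:

\[ \mathrm{Var}(X) = k \left( \frac{1}{(p/2)^2} - \frac{1}{p/2} \right) = k \left( \frac{4}{p^2} - \frac{2}{p} \right) \]

Var(X) before p is cut in half:

\[ \mathrm{Var}(X) = k \left( \frac{1}{p^2} - \frac{1}{p} \right) \]

\[ k \left( \frac{4}{p^2} - \frac{2}{p} \right) = (1 + 500\%)\, k \left( \frac{1}{p^2} - \frac{1}{p} \right) \]

Multiplying both sides by \( p^2 / k \) gives \( 4 - 2p = 6 - 6p \), so \( p = \frac{1}{2} \).

Before p is cut in half, \( \dfrac{E(X)}{\mathrm{Var}(X)} = p = \dfrac{1}{2} \).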
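As a sanity check on Problem 2, the sketch below plugs p = 1/2 back into the failure-count formulas using exact fractions. The helper names are invented for illustration, and the values of k tried are arbitrary, since k cancels out of both ratios.

```python
from fractions import Fraction

def nb_failures_mean(k, p):
    """E(X) for X = number of failures before the k-th success."""
    return k * (1 / p - 1)

def nb_failures_var(k, p):
    """Var(X) for X = number of failures before the k-th success."""
    return k * (1 / p**2 - 1 / p)

p = Fraction(1, 2)
results = []
for k in (1, 2, 5):  # arbitrary k values; the ratios do not depend on k
    growth = nb_failures_var(k, p / 2) / nb_failures_var(k, p)
    mean_over_var = nb_failures_mean(k, p) / nb_failures_var(k, p)
    results.append((k, growth, mean_over_var))
    print(f"k={k}: Var after / Var before = {growth}, E(X)/Var(X) = {mean_over_var}")
```

A variance ratio of 6 corresponds to the stated 500% increase.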

Problem 3

A fast food restaurant with a huge daily customer base is selling an experimental (and more expensive) low-carb meal choice among three well-established cheaper meal choices. The profit for the new low-carb meal choice is $2 per meal sold; the profit for any of the three well-established meal choices is $1 per meal sold.

Not sure how many customers will order the low-carb meal choice, the restaurant manager decides to set the food supply (which is more expensive) for the low-carb meal to the level that a maximum of 25 low-carb meals can be served on a given day when the low-carb meal is on trial.

Each visiting customer chooses one and only one meal option among the four options (the experimental low-carb meal plus the three well-established meal options). Each customer orders his or her meal independently of the order by any other customer. Customers are three times as likely to order any one of the three existing meal choices as to order the experimental low-carb meal choice.

Suppose the restaurant closes its doors immediately after the 25 low-carb meals are sold out. Calculate the expected profit the restaurant will make on all the meals sold on a given day when the low-carb meal is on trial.

Solution

The problem is set up in such a way that a negative binomial distribution can be used to model the number of visiting customers needed to sell the 25th low-carb meal. Notice the wording "a fast food restaurant with a huge daily customer base." A huge daily customer base reflects the fact that the negative binomial random variable can be very large (theoretically, it has no upper bound). If the customer base is very small, then the negative binomial distribution is not appropriate for modeling the number of visiting customers needed to sell the 25th low-carb meal.

Let
N = # of visiting customers up to and including the customer who buys the 25th low-carb meal
X = total profit made on all the meals sold while the restaurant is still open

\[ X = 25\,(2) + (N - 25)(1) \]

The above equation says that for the 25 customers who have bought the new low-carb meal, the restaurant makes $2 profit per customer; for the remaining (N - 25) customers, the restaurant makes $1 profit per customer. Taking the expectation of the above equation, we have

\[ E(X) = E\left[ 25\,(2) + (N - 25)(1) \right] = E(N) + 25 \]

We know that \( E(N) = \dfrac{k}{p} \).

To find p, the probability that a visiting customer orders a low-carb meal, we use the information that customers are three times as likely to order any one of the three conventional meal choices as to order the new meal choice.

\[ p + 3\,(3p) = 1, \qquad p = 0.1, \qquad E(N) = \frac{25}{0.1} = 250 \]

\[ E(X) = E(N) + 25 = 250 + 25 = 275 \]

So the restaurant can expect to earn $275 total profit by the time the 25 low-carb meals are sold out.

Problem 4

A fair die is thrown repeatedly until the second 5 appears; let n be the number of throws this takes. What is the probability that n = 20? What is the probability of having to throw the die more than 20 times (i.e. \( n \ge 21 \))?

Solution

\[ f^{NB}(n = 20, k = 2, p = 1/6) = \frac{2}{20}\, f^{B}(n = 20, k = 2, p = 1/6) = \frac{2}{20}\, C_{20}^{2} \left( \frac{1}{6} \right)^2 \left( \frac{5}{6} \right)^{18} = 1.982\% \]

To calculate the probability of throwing the die at least 21 times to get the second 5, we could sum up the probabilities for n = 21, 22, ..., \( +\infty \). The sum converges as \( n \to +\infty \). However, this approach is time-consuming. The alternate approach is to calculate the probabilities for n = 1, 2, ..., 20 (the probability for n = 1 is zero, because two 5's need at least two throws):

\[ P(n \ge 21) = 1 - \left[ P(n = 1) + P(n = 2) + P(n = 3) + \dots + P(n = 20) \right] \]

This is a lot of work too.

As a shortcut, you can reason that the probability of having to throw the die at least 21 times to get the second 5 is the same as the probability of having zero or one 5 in 20 trials. Obviously, if you throw the die 20 times but get zero or only one 5, you just have to keep throwing the die in order to get a total of two 5's. Having zero or one 5 in 20 trials is a binomial probability.

P(throwing the die at least 21 times to get the second 5)
= P(having no 5's in 20 trials) + P(having one 5 in 20 trials)

\[ = C_{20}^{0} \left( \frac{1}{6} \right)^0 \left( \frac{5}{6} \right)^{20} + C_{20}^{1} \left( \frac{1}{6} \right)^1 \left( \frac{5}{6} \right)^{19} = 0.1304 \]

Problem 5

The total number of claims in a year incurred by a small block of auto insurance policies is modeled as a negative binomial distribution with p = 0.2 and k = 5. If there is a claim, the claim amount is $2,000 per claim for all claims. Find the probability that the total amount of annual claims incurred by this block of auto insurance will exceed $100,000.

Solution

Let
N = total # of claims in a year by the group of the insured
X = total dollar amount of the annual claims by the group of the insured

Then X = 2,000N.

\[ P(X > 100{,}000) = P(2{,}000N > 100{,}000) = P(N > 50) \]

To find P(N > 50), be careful about how you interpret N, the negative binomial random variable. Because N is the total number of claims in a year, theoretically N can be zero. Thus, you should interpret N as the number of failures, not the number of trials, before the 5th success. If you interpret N as the number of trials, then N = 5, 6, 7, ..., \( +\infty \). You essentially set 5 as the minimum number of claims. This is clearly inappropriate because the problem doesn't say that the annual number of claims must be at least 5.

Next, you convert N to the number of trials needed for the 5th success:

# of failures + k = # of trials
more than 50 failures = more than 55 trials
P(more than 50 failures before the 5th success, p = 0.2)
= P(more than 55 trials needed for the 5th success, p = 0.2)
= P(fewer than 5 successes in 55 trials, p = 0.2)
= \( f^B \)(0 successes in 55 trials, p = 0.2) + \( f^B \)(1 success in 55 trials, p = 0.2) + \( f^B \)(2 successes in 55 trials, p = 0.2) + \( f^B \)(3 successes in 55 trials, p = 0.2) + \( f^B \)(4 successes in 55 trials, p = 0.2)

In other words, to need more than 55 trials to get the 5th success, you must have at most 4 successes in the first 55 trials. The probability of having at most 4 successes in 55 trials is a binomial probability.

\[ P(\text{at most 4 successes in 55 trials}, p = 0.2) = \sum_{x=0}^{4} C_{55}^{x}\, 0.2^x\, 0.8^{55-x} \]

\[ = C_{55}^{0}\, 0.8^{55} + C_{55}^{1}\, 0.2\, (0.8^{54}) + C_{55}^{2}\, (0.2^2)(0.8^{53}) + C_{55}^{3}\, (0.2^3)(0.8^{52}) + C_{55}^{4}\, (0.2^4)(0.8^{51}) = 0.865\% \]

Tip: how to check your result for complex calculations

I often use Microsoft Excel to check complex calculations. For example, the Excel formula for calculating \( \sum_{x=0}^{4} C_{55}^{x}\, 0.2^x\, 0.8^{55-x} \) is

BINOMDIST(# of successes, # of trials, p, indicator) = BINOMDIST(k, n, p, indicator) = BINOMDIST(4, 55, 0.2, True)

If you set the indicator = True, you'll calculate the cumulative probability, the probability of having at most k successes in n trials, \( \sum_{x=0}^{k} C_n^x\, p^x (1-p)^{n-x} \).

If you set the indicator = False, you'll calculate the probability of having exactly k successes in n trials, \( C_n^k\, p^k (1-p)^{n-k} \).

Excel also has formulas for negative binomial, Poisson, and exponential distributions. You can use the Excel Help menu to learn how to use these formulas.

Problem 6

At 4:00 pm on a Friday afternoon, an overworked claim adjustor realized that he still had five pending claims to review. To accelerate his work, the claim adjustor decided to throw a die to determine the dollar amount to be awarded for each claim. He kept throwing a die until he got the second 1 or second 6 or until he had thrown the die 5 times, at which time he stopped throwing the die.
At each throw of the die, he picked a claim file randomly and awarded $10,000 to that claim, regardless of the actual dollar amount that claim deserved. Any claims left after the last throw of the die were awarded $5,000 per claim, regardless of the actual circumstance of the claims.

Find
1. the expected total dollar amount awarded to the five claims.
2. the variance of the total dollar amount awarded to the five claims.

Solution

Let N = # of times the die was thrown to get the second 1 or second 6. Then N has a negative binomial distribution with k = 2 and p = 2/6 = 1/3.

Let X = total dollar amount awarded to the 5 claims.

Let's come up with a table listing all possible values of X(n) and f(n). Then we can calculate the mean and variance of X using the standard formulas:

\[ E(X) = \sum_{n=2}^{+\infty} X(n)\, f(n), \qquad E(X^2) = \sum_{n=2}^{+\infty} X^2(n)\, f(n), \qquad \mathrm{Var}(X) = E(X^2) - E^2(X) \]

For N = n,

\[ P(N = n) = f^{NB}(n, k = 2, p = 1/3) = \frac{2}{n}\, f^{B}(n, k = 2, p = 1/3) = \frac{2}{n} \cdot \frac{n!}{2!\,(n-2)!} \left( \frac{1}{3} \right)^2 \left( \frac{2}{3} \right)^{n-2} \]

With X in units of K = $1,000:

n = 2: \( P(N = 2) = \frac{2}{2} \cdot \frac{2!}{2!\,(2-2)!} \left( \frac{1}{3} \right)^2 \left( \frac{2}{3} \right)^0 = 0.1111 \); X = 10(2) + 5(3) = 35K
n = 3: \( P(N = 3) = \frac{2}{3} \cdot \frac{3!}{2!\,(3-2)!} \left( \frac{1}{3} \right)^2 \left( \frac{2}{3} \right)^1 = 0.1481 \); X = 10(3) + 5(2) = 40K
n = 4: \( P(N = 4) = \frac{2}{4} \cdot \frac{4!}{2!\,(4-2)!} \left( \frac{1}{3} \right)^2 \left( \frac{2}{3} \right)^2 = 0.1481 \); X = 10(4) + 5(1) = 45K
n \( \ge \) 5: \( 1 - \left[ P(N = 2) + P(N = 3) + P(N = 4) \right] = 1 - (0.1111 + 0.1481 + 0.1481) = 0.5927 \); X = 10(5) = 50K

To use the BA II Plus 1-V Statistics Worksheet, we'll scale up the probabilities:

X = 35: P(X = x) = .1111, 10,000 P(X = x) = 1111
X = 40: P(X = x) = .1481, 10,000 P(X = x) = 1481
X = 45: P(X = x) = .1481, 10,000 P(X = x) = 1481
X = 50: P(X = x) = .5927, 10,000 P(X = x) = 5927

Plugging the numbers into the BA II Plus, we get:

\[ E(X) = 46.112\,(K) = \$46{,}112, \qquad \sigma_X = 5.328551\,(K) \]

\[ \mathrm{Var}(X) = 28.393456\,(K^2) = 28.393456\,(1{,}000^2) = 28{,}393{,}456 \text{ (dollars squared)} \]

Please note that the unit for Var(X) is dollars squared. We write the values of X as 35K, 40K, 45K, and 50K instead of $35,000, $40,000, $45,000, and $50,000. This significantly reduces the number of times we have to press calculator keys, hence reducing possible errors of accidentally dropping or adding a zero.

Homework for you: #11 Nov 2001.
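The binomial shortcut used in Problems 4 and 5 of this chapter (the chance of needing more than n trials for the k-th success equals the chance of at most k-1 successes in n trials) can be checked in Python in the same spirit as the Excel tip. This is a sketch with made-up helper names; it also brute-forces Problem 4 by summing the negative binomial PMF for n = 2 to 20.

```python
from math import comb

def binomial_pmf(n, k, p):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(n, k, p):
    """P(at most k successes in n trials)."""
    return sum(binomial_pmf(n, x, p) for x in range(k + 1))

def negbin_pmf(n, k, p):
    """P(the k-th success occurs exactly on trial n)."""
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)

def negbin_tail(n, k, p):
    """P(more than n trials are needed for the k-th success)."""
    return binomial_cdf(n, k - 1, p)

# Problem 4: second 5 on exactly the 20th throw, and on the 21st throw or later.
p4_exact = negbin_pmf(20, 2, 1 / 6)
p4_tail = negbin_tail(20, 2, 1 / 6)
p4_brute = 1 - sum(negbin_pmf(n, 2, 1 / 6) for n in range(2, 21))

# Problem 5: more than 50 failures = more than 55 trials before the 5th success.
p5_tail = negbin_tail(55, 5, 0.2)

print(f"P(n = 20)            = {p4_exact:.5f}")
print(f"P(n >= 21), shortcut = {p4_tail:.5f}")
print(f"P(n >= 21), brute    = {p4_brute:.5f}")
print(f"P(N > 50 claims)     = {p5_tail:.5f}")
```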

Chapter 16 Hypergeometric distribution

Imagine a shipment has seven defective parts and ten good parts. You want to find out, if you randomly take out five parts, how likely it is to get three defective parts and two good parts. Your first reaction might be to use the binomial distribution with parameters n = 5 and p = 7/17.

\[ \Pr(\text{having 3 defective parts out of 5 parts}) = C_5^3 \left( \frac{7}{17} \right)^3 \left( \frac{10}{17} \right)^2 = 0.2416 \]

After close examination, you will find that the binomial distribution is not a good fit here. If you use the binomial distribution, you essentially assume that (1) you have 5 independent draws, and (2) the chance of a defective part being chosen at each of the 5 draws is 7/17. The only way to meet these two conditions is that after you randomly choose your first part, whether it is defective or good, you put it back before your second random draw. Then after you randomly take out the second part, you put the second part back before making the third draw. And so on.

If you do not put the first part back, your chance of getting a defective part in the second draw is 6/16 (if you get a defective part in the first draw) or 7/16 (if you get a good part in the first draw). Now the outcome of a previous trial affects the outcome of the next trial, and the probability of having a success is different in different trials.

You can see that the only way to meet the conditions for the binomial distribution is putting things back before making the next draw. In mathematical terms, putting things back before making the next draw is sampling with replacement. The binomial distribution is sampling with replacement.

In this problem, however, we don't put things back (so here we have sampling without replacement). We permanently take out the first part before making our second draw. We permanently take out the second part before making our third draw, and so on.
To find the probability that exactly 3 out of the 5 randomly chosen parts are defective, we reason that out of the total of \( C_{17}^5 \) ways of getting 5 parts, we have \( C_7^3 \) ways of getting 3 defective parts and \( C_{10}^2 \) ways of getting 2 good parts. Thus, the probability that exactly 3 out of the 5 randomly chosen parts are defective is:

\[ \frac{C_7^3\, C_{10}^2}{C_{17}^5} \]

The general formula for a hypergeometric distribution:

A finite population has N objects, of which K objects are special and N-K objects are ordinary. If m objects are randomly chosen from the population, the probability that x of the m objects chosen are special is:

\[ f(x) = \frac{C_K^x\, C_{N-K}^{m-x}}{C_N^m} \]

When you apply the above formula to a problem, please do two quality checks:

First, make sure the combinatorial elements in the numerator add up to the combinatorial elements in the denominator. Notice in the formula above, we have

\[ x + (m - x) = m, \qquad K + (N - K) = N \]

You want to make sure that these two equations are satisfied. Otherwise, you did something wrong.

Second, make sure you don't have any negative factorials. In \( C_K^x \), make sure \( x \le K \); in \( C_{N-K}^{m-x} \), make sure \( m - x \le N - K \); in \( C_N^m \), make sure \( m \le N \). If you get a negative factorial, you did something wrong.

Please note that if both N and K are big, the hypergeometric distribution and the binomial distribution give roughly the same result. So if N and K are big, we do not have to worry about whether we sample with replacement or without replacement. We can just use the binomial distribution to approximate the hypergeometric distribution by setting p = K/N.

Sample Problems and Solutions

Problem 1

Among 200 people working for a large actuarial department of an insurance company, 120 are actuaries and 80 are support staff. If 8 people are randomly chosen to attend a brainstorming meeting with company executives, what is the probability that exactly 5 actuaries are chosen to attend the meeting?

Solution

\[ \frac{C_{120}^5\, C_{80}^3}{C_{200}^8} = 0.2842 \]

Two quality checks: 5 + 3 = 8, 120 + 80 = 200. In addition, we don't have any negative factorials.

If we use the binomial distribution to approximate, we have p = 120/200 = 0.6.

\[ C_8^5\, 0.6^5\, 0.4^3 = 0.2787 \]

You see that if both N and K are big, the binomial distribution gives a good approximation to the hypergeometric distribution.

I think this is all you need to know about the hypergeometric distribution. Don't bother memorizing the mean and variance formulas for the hypergeometric distribution. You are better off spending your time on something else.

Homework for you: rework the problem in this chapter.
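The hypergeometric formula and its binomial approximation from this chapter can be reproduced with Python's built-in `math.comb`. The sketch below uses a hypothetical helper `hypergeom_pmf(x, m, K, N)` that follows the chapter's notation (x special objects among m drawn, K special objects in a population of N), and evaluates both the shipment example and Problem 1.

```python
from math import comb

def hypergeom_pmf(x, m, K, N):
    """P(x special objects among m drawn without replacement; K special out of N)."""
    return comb(K, x) * comb(N - K, m - x) / comb(N, m)

def binomial_pmf(n, k, p):
    """P(exactly k successes in n independent trials), i.e. drawing with replacement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Shipment example: 7 defective and 10 good parts, draw 5, want exactly 3 defective.
shipment_without = hypergeom_pmf(3, 5, 7, 17)
shipment_with = binomial_pmf(5, 3, 7 / 17)

# Problem 1: 120 actuaries among 200 people, choose 8, want exactly 5 actuaries.
dept_without = hypergeom_pmf(5, 8, 120, 200)
dept_with = binomial_pmf(8, 5, 120 / 200)

# The PMF should add up to one over all feasible x.
total = sum(hypergeom_pmf(x, 5, 7, 17) for x in range(0, 6))

print(f"shipment: without replacement {shipment_without:.4f}, with replacement {shipment_with:.4f}")
print(f"department: without replacement {dept_without:.4f}, with replacement {dept_with:.4f}")
```

For the small shipment the two sampling schemes differ noticeably, while for the 200-person department they are close, which is the point made about large N and K.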

Chapter 17 Uniform distribution

Key formulas

If the random variable X is uniformly distributed over [a, b], where b > a, then

\[ f(x) = \frac{1}{b-a}, \qquad E(X) = \frac{a+b}{2}, \qquad \sigma_X = \frac{b-a}{2\sqrt{3}} \]

Proof.

\[ E(X) = \int_a^b x\, \frac{1}{b-a}\, dx = \frac{1}{b-a} \left[ \frac{1}{2} x^2 \right]_a^b = \frac{1}{b-a} \cdot \frac{1}{2} \left( b^2 - a^2 \right) = \frac{b+a}{2} \]

\[ E(X^2) = \int_a^b x^2\, \frac{1}{b-a}\, dx = \frac{1}{b-a} \left[ \frac{1}{3} x^3 \right]_a^b = \frac{1}{b-a} \cdot \frac{1}{3} \left( b^3 - a^3 \right) = \frac{b^2 + ab + a^2}{3} \]

\[ \mathrm{Var}(X) = E(X^2) - E^2(X) = \frac{b^2 + ab + a^2}{3} - \left( \frac{b+a}{2} \right)^2 = \frac{(b-a)^2}{12} \]

\[ \sigma_X = \sqrt{\mathrm{Var}(X)} = \frac{b-a}{2\sqrt{3}} \]

Sample Problems and Solutions

Problem 1

Loss for an auto insurance policy is uniformly distributed over [0, L]. If the deductible is 500, then the expected claim payment with the deductible is only 1/9 of the expected claim payment without the deductible. Find the expected claim payment without the deductible.

Solution

Let X represent the loss incurred by the auto insurance policy. X is uniformly distributed over [0, L]. We have

\[ f(x) = \frac{1}{L}, \qquad E(X) = \frac{L}{2} \]

Let Y represent the claim payment with a deductible of 500. Y is distributed as follows:

\[ Y = \begin{cases} 0 & \text{if } X \le 500 \\ X - 500 & \text{if } X > 500 \end{cases} \]

\[ E(Y) = \int_0^L y(x)\, f(x)\, dx = \int_0^{500} 0 \cdot f(x)\, dx + \int_{500}^{L} (x - 500)\, f(x)\, dx = \int_{500}^{L} (x - 500)\, \frac{1}{L}\, dx = \frac{1}{2L} \left[ (x - 500)^2 \right]_{500}^{L} = \frac{1}{2L} (L - 500)^2 \]

If there is no deductible, then the expected claim payment is \( E(X) = \dfrac{L}{2} \).

Because the expected claim payment with the deductible is 1/9 of the expected claim payment with no deductible, we have

\[ \frac{1}{2L} (L - 500)^2 = \frac{1}{9} \cdot \frac{L}{2}, \qquad (L - 500)^2 = \left( \frac{L}{3} \right)^2 \]

Please note that L must exceed the deductible 500. So

\[ L - 500 = \frac{L}{3}, \qquad L = 750 \]

The expected claim payment without a deductible is L/2 = 375.

Problem 2

In a policy year, a low-risk auto insurance policyholder can incur either one loss with probability p or no loss with probability 1-p. p is uniformly distributed over [0, 0.3]. If there is a loss, the loss is uniformly distributed over [500, 2,500]. Assume that the probability of having a loss and the loss amount are independent. Find the expected loss incurred by the policyholder in a policy year.

Solution

Let N = # of losses in a year, X = amount of an individual loss in a year, and Y = the total loss amount in a year.

Y = NX, E(Y) = E(N)E(X) (because N and X are independent)

\[ E(X) = \frac{500 + 2{,}500}{2} = 1{,}500 \quad \text{because } X \text{ is uniform over } [500,\ 2{,}500] \]

To find the mean E(N), we'll use the double expectation theorem (to be explained in a future chapter):

\[ E(N) = E_P\left[ E(N \mid p) \right] = \int_0^{0.3} E(N \mid p)\, f(p)\, dp \]

The above equation says that to find E(N), we first calculate \( E(N \mid p) \) assuming that p is a known constant. This gives us \( E(N \mid p) \), the conditional mean given a known constant p. \( E(N \mid p) \) can be interpreted as the contribution to E(N) made by a known constant p. Next, we calculate all p's contributions to E(N) by integrating \( E(N \mid p)\, f(p) \) over \( p \in [0, 0.3] \).

Given p (i.e. if we fix p), N is a Bernoulli random variable with parameter p, so \( E(N \mid p) = p \).

\[ E(N) = E_P\left[ E(N \mid p) \right] = \int_0^{0.3} E(N \mid p)\, f(p)\, dp = \int_0^{0.3} p\, f(p)\, dp = E(P) = \frac{0 + 0.3}{2} = 0.15 \]

Finally, E(Y) = 0.15 * 1,500 = 225.

Alternatively,

Average loss amount = average probability of having a loss \( \times \) average loss amount if there is a loss
average probability of having a loss = 0.3/2 = 0.15
average loss amount if there's a loss = (500 + 2,500)/2 = 1,500
Average loss amount = 0.15(1,500) = 225

Homework for you: #38 May 2000; #29 Nov 2001.
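Both sample problems in this chapter reduce to short formulas, so they are easy to verify in Python. The sketch below assumes a hypothetical helper `expected_payment_with_deductible(L, d)` for a loss that is uniform on [0, L]; it also cross-checks the closed form with a simple midpoint-rule integration.

```python
def expected_payment_with_deductible(L, d):
    """E[max(X - d, 0)] for X uniform on [0, L], assuming 0 <= d <= L."""
    return (L - d) ** 2 / (2 * L)

def midpoint_integral(f, lo, hi, steps=100_000):
    """Plain midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h

# Problem 1: with L = 750 and a deductible of 500, the payment ratio should be 1/9.
L, d = 750, 500
with_deductible = expected_payment_with_deductible(L, d)
numeric_check = midpoint_integral(lambda x: (x - d) / L, d, L)
ratio = with_deductible / (L / 2)

# Problem 2: E(Y) = E(N) E(X), with E(N) = E(P) for p uniform on [0, 0.3].
mean_probability = (0 + 0.3) / 2
mean_loss = (500 + 2500) / 2
expected_total_loss = mean_probability * mean_loss

print("ratio of expected payments:", ratio)
print("expected annual loss:", expected_total_loss)
```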

Chapter 18 Exponential distribution

Key Points

Gain a deeper understanding of the exponential distribution:

Why does the exponential distribution model the time elapsed before the first or next random event occurs?

The exponential distribution lacks memory. What does this mean?

Understand and use the following integration shortcuts. For any \( \theta > 0 \) and \( a \ge 0 \):

\[ \int_a^{+\infty} \frac{1}{\theta}\, e^{-x/\theta}\, dx = e^{-a/\theta} \]

\[ \int_a^{+\infty} x\, \frac{1}{\theta}\, e^{-x/\theta}\, dx = (a + \theta)\, e^{-a/\theta} \]

\[ \int_a^{+\infty} x^2\, \frac{1}{\theta}\, e^{-x/\theta}\, dx = \left[ (a + \theta)^2 + \theta^2 \right] e^{-a/\theta} \]

You will need to understand and memorize these shortcuts to quickly solve integrations in the heat of the exam. Do not attempt to do integration by parts during the exam.
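If you want to convince yourself that the three integration shortcuts are right before memorizing them, a numerical check is quick. The sketch below compares each shortcut with a composite Simpson's rule approximation; the values theta = 2 and a = 3, the cutoff of 60 mean-lengths for the upper limit, and the helper names are all illustrative assumptions.

```python
from math import exp, isclose

def simpson(f, lo, hi, steps=20_000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def numeric_tail_integrals(theta, a, cutoff=60):
    """Approximate the three integrals from a to infinity by stopping at a + cutoff*theta."""
    hi = a + cutoff * theta
    pdf = lambda x: exp(-x / theta) / theta
    return (
        simpson(pdf, a, hi),
        simpson(lambda x: x * pdf(x), a, hi),
        simpson(lambda x: x * x * pdf(x), a, hi),
    )

def shortcut_values(theta, a):
    """The three closed-form shortcuts."""
    e = exp(-a / theta)
    return e, (a + theta) * e, ((a + theta) ** 2 + theta**2) * e

theta, a = 2.0, 3.0  # illustrative values
numeric = numeric_tail_integrals(theta, a)
closed = shortcut_values(theta, a)
for label, n_val, c_val in zip(("pdf", "x*pdf", "x^2*pdf"), numeric, closed):
    print(f"{label}: numeric {n_val:.6f}, shortcut {c_val:.6f}")
```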

Explanations

The exponential distribution is the continuous version of the geometric distribution. While the geometric distribution describes the probability of having N trials before the first or next success (a success is a random event), the exponential distribution describes the probability of having time T elapse before the first or next success.

Let's use a simple example to derive the probability density function of the exponential distribution. Claims for many insurance policies seem to occur randomly. Assume that on average, claims for a certain type of insurance policy occur once every 3 months. We want to find the probability that T months will elapse before the next claim occurs.

To find the pdf of the exponential distribution, we take advantage of the geometric distribution, whose probability mass function we already know. We divide each month into n intervals, each interval being one minute. Since there are 30*24*60 = 43,200 minutes in a month (assuming there are 30 days in a month), we convert each month into 43,200 intervals. Because on average one claim occurs every 3 months, we assume that the chance of a claim occurring in one minute is

\[ \frac{1}{3 \times 43{,}200} \]

How many minutes must elapse before the next claim occurs? We can think of one minute as one trial. Then the number of trials m (i.e. m minutes) before the first success has a geometric distribution with

\[ p = \frac{1}{3 \times 43{,}200} \]

Instead of finding the probability that it takes exactly m minutes to have the first claim, we'll find the probability that it takes m minutes or less to have the first claim. The latter is the cumulative distribution function, which is more useful.

P(it takes m minutes or less to have the first claim) = 1 - P(it takes more than m minutes to have the first claim)

The probability that it takes more than m trials before the first claim is \( (1-p)^m \). To see why, you can reason that to have the first success only after m trials, the first m trials must all end in failure. The probability of having m failures in m trials is \( (1-p)^m \).

Therefore, the probability that it takes m trials or fewer before the first success is \( 1 - (1-p)^m \).

Now we are ready to find the cumulative distribution function of T:

\[ P(T \le t) = P(43{,}200\,t \text{ trials or fewer before the 1st success}) = 1 - \left( 1 - \frac{1}{3 \times 43{,}200} \right)^{43{,}200\,t} = 1 - \left[ \left( 1 - \frac{1}{3 \times 43{,}200} \right)^{3 \times 43{,}200} \right]^{t/3} \approx 1 - e^{-t/3} \]

Of course, we do not need to limit ourselves to dividing one month into intervals of one minute. We can divide, for example, one month into n intervals, with each interval being 1/1,000,000 of a minute. Essentially, we want \( n \to +\infty \).

\[ P(T \le t) = P(nt \text{ trials or fewer before the 1st success}) = 1 - \left( 1 - \frac{1}{3n} \right)^{nt} = 1 - \left[ \left( 1 - \frac{1}{3n} \right)^{3n} \right]^{t/3} \to 1 - e^{-t/3} \quad (\text{as } n \to +\infty) \]

If you understand the above, you should have no trouble understanding why the exponential distribution is often used to model the time elapsed until the next random event happens. Here are some examples where the exponential distribution can be used:

Time until the next claim arrives in the claims office.
Time until you have your next car accident.
Time until the next customer arrives at a supermarket.
Time until the next phone call arrives at the switchboard.

General formula:

Let T = time elapsed (in years, months, days, etc.) before the next random event occurs.

\[ F(t) = P(T \le t) = 1 - e^{-t/\theta}, \qquad f(t) = \frac{1}{\theta}\, e^{-t/\theta}, \qquad \text{where } \theta = E(T) \]

\[ P(T > t) = 1 - F(t) = e^{-t/\theta} \]

Alternatively,

\[ F(t) = P(T \le t) = 1 - e^{-\lambda t}, \qquad f(t) = \lambda e^{-\lambda t}, \qquad \text{where } \lambda = \frac{1}{E(T)} \]

\[ P(T > t) = 1 - F(t) = e^{-\lambda t} \]

Mean and variance:

\[ E(T) = \theta = \frac{1}{\lambda}, \qquad \mathrm{Var}(T) = \theta^2 = \frac{1}{\lambda^2} \]

Like the geometric distribution, the exponential distribution lacks memory:

\[ P(T > a + b \mid T > a) = P(T > b) \]

We can easily derive this:

\[ P(T > a + b \mid T > a) = \frac{P(T > a + b \cap T > a)}{P(T > a)} = \frac{P(T > a + b)}{P(T > a)} = \frac{e^{-\lambda (a+b)}}{e^{-\lambda a}} = e^{-\lambda b} = P(T > b) \]

In plain English, this lack of memory means that if a component's time to failure follows an exponential distribution, then the component does not remember how long it has been working (i.e. does not remember wear and tear). At any moment when it is working, the component starts fresh as if it were completely new. At any moment while the component is working, if you reset the clock to zero and count the time elapsed from then until the component breaks down, the time elapsed before a breakdown is always exponentially distributed with the identical mean.

This is clearly an idealized situation, for in real life wear and tear does reduce the longevity of a component. However, in many real-world situations, the exponential distribution can be used to approximate the actual distribution of time until failure and still give a reasonably accurate result.

A simple way to see why a component can, at least in theory, forget how long it has worked so far is to think about the geometric distribution, the discrete counterpart of the exponential distribution. For example, in tossing a coin, you can clearly see why a coin doesn't remember its past success history. Since getting heads or tails is a purely random event, how many times you have tossed a coin so far before getting heads really should NOT have any bearing on how many more times you need to toss the coin to get heads the second time.

The calculation shortcuts are explained in the following sample problems.

Sample Problems and Solutions

Problem 1

The lifetime of a light bulb follows an exponential distribution with a mean of 100 days. Find the probability that the light bulb's life
(1) Exceeds 100 days
(2) Exceeds 400 days
(3) Exceeds 400 days given it exceeds 100 days.

Solution

Let T = # of days before the light bulb burns out.

\[ F(t) = 1 - e^{-t/\theta}, \quad \text{where } \theta = E(T) = 100 \]

\[ P(T > t) = 1 - F(t) = e^{-t/100} \]

\[ P(T > 100) = 1 - F(100) = e^{-100/100} = e^{-1} = 0.3679 \]

\[ P(T > 400) = 1 - F(400) = e^{-400/100} = e^{-4} = 0.0183 \]

\[ P(T > 400 \mid T > 100) = \frac{P(T > 400)}{P(T > 100)} = \frac{e^{-400/100}}{e^{-100/100}} = e^{-3} = 0.0498 \]
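The limiting argument in the Explanations (slice each month into n trials, each with success probability 1/(3n)) and the three light bulb answers can be sketched numerically. The function names below are made up for illustration, and t = 2 months is an arbitrary choice for the comparison.

```python
from math import exp

def geometric_cdf_in_months(t, n, mean_months=3.0):
    """P(T <= t) when each month is split into n trials with success prob 1/(mean_months*n)."""
    p = 1 / (mean_months * n)
    return 1 - (1 - p) ** (n * t)

def exponential_survival(t, theta):
    """P(T > t) for an exponential distribution with mean theta."""
    return exp(-t / theta)

# The discrete (geometric) CDF approaches 1 - e^{-t/3} as the slices get finer.
t = 2.0  # arbitrary elapsed time in months
limit = 1 - exp(-t / 3)
for n in (10, 1_000, 43_200, 10_000_000):
    print(f"n = {n:>10}: geometric CDF = {geometric_cdf_in_months(t, n):.7f}")
print(f"exponential limit      = {limit:.7f}")

# Light bulb problem: mean lifetime of 100 days.
s100 = exponential_survival(100, 100)
s400 = exponential_survival(400, 100)
conditional = s400 / s100
print(f"P(T>100) = {s100:.4f}, P(T>400) = {s400:.4f}, P(T>400 | T>100) = {conditional:.4f}")
```

With one-minute slices (n = 43,200) the geometric and exponential values already agree to about five decimal places.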

Or use the lack of memory property of exponential distribution:

$P(T > 400 \mid T > 100) = P(T > 400 - 100) = e^{-300/100} = e^{-3} = 0.0498$

Problem 2
The length of telephone conversations follows exponential distribution. If the average telephone conversation is 2.5 minutes, what is the probability that a telephone conversation lasts between 3 minutes and 5 minutes?

Solution

$F(t) = 1 - e^{-t/2.5}$

$P(3 < T < 5) = (1 - e^{-5/2.5}) - (1 - e^{-3/2.5}) = e^{-3/2.5} - e^{-5/2.5} = 0.1659$

Problem 3
The random variable T has an exponential distribution with pdf $f(t) = \frac{1}{2} e^{-t/2}$.
Find $E(T \mid T > 3)$, $\mathrm{Var}(T \mid T > 3)$, $E(T \mid T \le 3)$, $\mathrm{Var}(T \mid T \le 3)$.

Solution
First, let's understand the conceptual thinking behind the symbol $E(T \mid T > 3)$. Here we are only interested in $T > 3$. So we reduce the original sample space $T \in [0, +\infty)$ to $T \in [3, +\infty)$. The pdf in the original sample space $T \in [0, +\infty)$ is $f(t)$; the pdf in the reduced sample space $T \in [3, +\infty)$ is $\frac{f(t)}{P(T > 3)}$. Here the factor $\frac{1}{P(T > 3)}$ is a normalizing constant to make the total probability in the reduced sample space add up to one:

$\int_3^{+\infty} \frac{f(t)}{P(T > 3)} dt = \frac{1}{P(T > 3)} \int_3^{+\infty} f(t) dt = \frac{1}{P(T > 3)} \cdot P(T > 3) = 1$

Next, we need to calculate $E(T \mid T > 3)$, the expected value of T in the reduced sample space $T \in [3, +\infty)$:

$E(T \mid T > 3) = \int_3^{+\infty} t \frac{f(t)}{P(T > 3)} dt = \frac{1}{P(T > 3)} \int_3^{+\infty} t f(t) dt = \frac{1}{1 - F(3)} \int_3^{+\infty} t f(t) dt$

$1 - F(3) = e^{-3/2}$

$\int_3^{+\infty} t f(t) dt = 5e^{-3/2}$ (integration by parts)

$E(T \mid T > 3) = \frac{1}{1 - F(3)} \int_3^{+\infty} t f(t) dt = \frac{5e^{-3/2}}{e^{-3/2}} = 5$

Here is another approach. Because T does not remember wear and tear and always starts anew at any working moment, the time elapsed since $T = 3$ until the next random event (i.e. $T - 3$) has exponential distribution with an identical mean of 2. In other words, $(T - 3 \mid T > 3)$ is exponentially distributed with an identical mean of 2.

So $E(T - 3 \mid T > 3) = 2$.

$E(T \mid T > 3) = E(T - 3 \mid T > 3) + 3 = 2 + 3 = 5$

Next, we will find $\mathrm{Var}(T \mid T > 3)$.

$E(T^2 \mid T > 3) = \frac{1}{P(T > 3)} \int_3^{+\infty} t^2 f(t) dt = \frac{1}{P(T > 3)} \int_3^{+\infty} t^2 \frac{1}{2} e^{-t/2} dt$

$\int_3^{+\infty} t^2 \frac{1}{2} e^{-t/2} dt = 29e^{-3/2}$ (integration by parts)

$E(T^2 \mid T > 3) = \frac{29e^{-3/2}}{e^{-3/2}} = 29$

$\mathrm{Var}(T \mid T > 3) = E(T^2 \mid T > 3) - E^2(T \mid T > 3) = 29 - 5^2 = 4 = \theta^2$
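The two conditional moments above can be sanity-checked by simulation. The sketch below is only an illustration: the seed, the sample size, and the variable names are arbitrary choices that do not come from the text. It draws exponential lifetimes with mean 2, keeps the ones that exceed 3, and estimates the mean and variance of the survivors, which should land near 5 and 4.

```python
import random

THETA = 2.0   # mean of T in Problem 3
A = 3.0       # we condition on T > A

rng = random.Random(150)          # fixed seed so the run is repeatable
# random.expovariate takes the rate (1 / mean), not the mean
draws = (rng.expovariate(1 / THETA) for _ in range(400_000))
survivors = [t for t in draws if t > A]

cond_mean = sum(survivors) / len(survivors)
cond_var = sum((t - cond_mean) ** 2 for t in survivors) / len(survivors)

print(f"E(T | T > 3)   is approximately {cond_mean:.3f}")
print(f"Var(T | T > 3) is approximately {cond_var:.3f}")
```

Because this is a Monte Carlo estimate, the printed values only approximate 5 and 4; they are not exact.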

It is no coincidence that $\mathrm{Var}(T \mid T > 3)$ is the same as $\mathrm{Var}(T)$. To see why, we know $\mathrm{Var}(T \mid T > 3) = \mathrm{Var}(T - 3 \mid T > 3)$. This is because $\mathrm{Var}(X + c) = \mathrm{Var}(X)$ holds for any constant c.

Since $(T - 3 \mid T > 3)$ is exponentially distributed with an identical mean of 2, then $\mathrm{Var}(T - 3 \mid T > 3) = \theta^2 = 2^2 = 4$.

Next, we need to find $E(T \mid T < 3)$.

$E(T \mid T < 3) = \int_0^3 t \frac{f(t)}{P(T < 3)} dt = \frac{1}{F(3)} \int_0^3 t f(t) dt$

$F(3) = 1 - e^{-3/2}$

$\int_0^3 t f(t) dt = \int_0^{+\infty} t f(t) dt - \int_3^{+\infty} t f(t) dt$

$\int_0^{+\infty} t f(t) dt = E(T) = 2$

$\int_3^{+\infty} t f(t) dt = 5e^{-3/2}$ (we already calculated this)

$E(T \mid T < 3) = \frac{1}{F(3)} \int_0^3 t f(t) dt = \frac{2 - 5e^{-3/2}}{1 - e^{-3/2}}$

Here is another way to find $E(T \mid T < 3)$.

$E(T) = E(T \mid T < 3) P(T < 3) + E(T \mid T > 3) P(T > 3)$

The above equation says that if we break down T into two groups, $T > 3$ and $T < 3$, then the overall mean of these two groups as a whole is equal to the weighted average mean of these groups.

Also note that $P(T = 3)$ is not included in the right-hand side because the probability of a continuous random variable at any single point is zero. This is similar to the concept that the mass of a single point is zero.

Of course, you can also write:

$E(T) = E(T \mid T \le 3) P(T \le 3) + E(T \mid T > 3) P(T > 3)$

Or

$E(T) = E(T \mid T < 3) P(T < 3) + E(T \mid T \ge 3) P(T \ge 3)$

You should get the same result no matter which formula you use.

$E(T) = E(T \mid T < 3) P(T < 3) + E(T \mid T > 3) P(T > 3)$

$\Rightarrow E(T \mid T < 3) = \frac{E(T) - E(T \mid T > 3) P(T > 3)}{P(T < 3)}$

$\Rightarrow E(T \mid T < 3) = \frac{\theta - (\theta + 3) e^{-3/2}}{1 - e^{-3/2}} = \frac{2 - 5e^{-3/2}}{1 - e^{-3/2}}$

Next, we will find $E(T^2 \mid T < 3)$:

$E(T^2 \mid T < 3) = \frac{1}{P(T < 3)} \int_0^3 t^2 f(t) dt = \frac{1}{P(T < 3)} \int_0^3 t^2 \frac{1}{2} e^{-t/2} dt$

$\int_0^3 t^2 \frac{1}{2} e^{-t/2} dt = -\left[ (t + 2)^2 + 4 \right] e^{-t/2} \Big|_0^3 = 8 - 29e^{-3/2}$

$E(T^2 \mid T < 3) = \frac{1}{P(T < 3)} \int_0^3 t^2 \frac{1}{2} e^{-t/2} dt = \frac{8 - 29e^{-3/2}}{1 - e^{-3/2}}$

Alternatively,

$E(T^2) = E(T^2 \mid T < 3) P(T < 3) + E(T^2 \mid T > 3) P(T > 3)$

$\Rightarrow E(T^2 \mid T < 3) = \frac{E(T^2) - E(T^2 \mid T > 3) P(T > 3)}{P(T < 3)} = \frac{2 \cdot 2^2 - 29 P(T > 3)}{P(T < 3)} = \frac{8 - 29e^{-3/2}}{1 - e^{-3/2}}$

Here $E(T^2) = \mathrm{Var}(T) + E^2(T) = 2\theta^2 = 8$.

$\mathrm{Var}(T \mid T < 3) = E(T^2 \mid T < 3) - E^2(T \mid T < 3) = \frac{8 - 29e^{-3/2}}{1 - e^{-3/2}} - \left( \frac{2 - 5e^{-3/2}}{1 - e^{-3/2}} \right)^2$

In general, for any exponentially distributed random variable T with mean $\theta > 0$ and for any $a \ge 0$:

$(T - a \mid T > a)$ is also exponentially distributed with mean $\theta$

$\Rightarrow E(T - a \mid T > a) = \theta$, $\mathrm{Var}(T - a \mid T > a) = \theta^2$

$\Rightarrow E(T \mid T > a) = a + \theta$, $\mathrm{Var}(T \mid T > a) = \theta^2$

$E(T - a \mid T > a) = \frac{1}{P(T > a)} \int_a^{+\infty} (t - a) f(t) dt$

$\Rightarrow \int_a^{+\infty} (t - a) f(t) dt = E(T - a \mid T > a) P(T > a) = \theta e^{-a/\theta}$

$E(T \mid T > a) = \frac{1}{P(T > a)} \int_a^{+\infty} t f(t) dt$

$\Rightarrow \int_a^{+\infty} t f(t) dt = E(T \mid T > a) P(T > a) = (\theta + a) e^{-a/\theta}$

$E(T \mid T < a) = \frac{1}{P(T < a)} \int_0^a t f(t) dt$

$\Rightarrow \int_0^a t f(t) dt = E(T \mid T < a) P(T < a) = E(T \mid T < a)(1 - e^{-a/\theta})$

$E(T) = E(T \mid T < a) P(T < a) + E(T \mid T > a) P(T > a)$

$\Rightarrow \theta = E(T \mid T < a)(1 - e^{-a/\theta}) + E(T \mid T > a) e^{-a/\theta}$

$E(T^2) = E(T^2 \mid T < a) P(T < a) + E(T^2 \mid T > a) P(T > a)$

You do not need to memorize the above formulas. However, make sure you understand the logic behind these formulas.

Before we move on to more sample problems, I will give you some integration-by-parts formulas for you to memorize. These formulas are critical to you when solving

exponential distribution-related problems in 3 minutes. You should memorize these formulas to avoid doing integration by parts during the exam.

Formulas you need to memorize: For any $\theta > 0$ and $a \ge 0$

$\int_a^{+\infty} \frac{1}{\theta} e^{-x/\theta} dx = e^{-a/\theta}$   (1)

$\int_a^{+\infty} x \frac{1}{\theta} e^{-x/\theta} dx = (a + \theta) e^{-a/\theta}$   (2)

$\int_a^{+\infty} x^2 \frac{1}{\theta} e^{-x/\theta} dx = \left[ (a + \theta)^2 + \theta^2 \right] e^{-a/\theta}$   (3)

You can always prove the above formulas using integration by parts. However, let me give an intuitive explanation to help you memorize them.

Let X represent an exponential random variable with a mean of $\theta$, and let $f(x)$ be its probability density function. Then for any $a \ge 0$, Equation (1) represents $P(X > a) = 1 - F(a)$, where $F(x) = 1 - e^{-x/\theta}$ is the cumulative distribution function of X. You should have no trouble memorizing Equation (1).

For Equation (2), from Sample Problem 3, we know

$\int_a^{+\infty} x f(x) dx = E(X \mid X > a) P(X > a) = (a + \theta) e^{-a/\theta}$

To understand Equation (3), note that

$P(X > a) = e^{-a/\theta}$

$\int_a^{+\infty} x^2 f(x) dx = E(X^2 \mid X > a) P(X > a)$

$E(X^2 \mid X > a) = E^2(X \mid X > a) + \mathrm{Var}(X \mid X > a)$

$E^2(X \mid X > a) = (a + \theta)^2$, $\mathrm{Var}(X \mid X > a) = \theta^2$

Then

$\int_a^{+\infty} x^2 f(x) dx = \left[ (a + \theta)^2 + \theta^2 \right] e^{-a/\theta}$
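If you want to convince yourself that formulas (1), (2), and (3) are right without redoing the integration by parts, a numerical check is enough. The sketch below is an illustration only: the function names, the Simpson's-rule helper, the truncation of the infinite upper limit at a + 60θ, and the test values θ = 3, a = 2 are all assumptions made for this example.

```python
import math

def closed_form(k, a, theta):
    """Right-hand side of formula (1), (2) or (3) for k = 0, 1, 2."""
    tail = math.exp(-a / theta)
    if k == 0:
        return tail
    if k == 1:
        return (a + theta) * tail
    if k == 2:
        return ((a + theta) ** 2 + theta ** 2) * tail
    raise ValueError("k must be 0, 1 or 2")

def numeric(k, a, theta, steps=20_000):
    """Simpson's rule for the integral of x^k (1/theta) e^(-x/theta)
    over [a, a + 60*theta]; the tail beyond that point is negligible."""
    lo, hi = a, a + 60 * theta
    h = (hi - lo) / steps            # steps must be even for Simpson's rule

    def g(x):
        return x ** k * math.exp(-x / theta) / theta

    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

for k in (0, 1, 2):
    print(k, closed_form(k, 2.0, 3.0), numeric(k, 2.0, 3.0))
```

Each printed pair should agree to many decimal places, which is what the three formulas claim.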

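We can modify Equations (1), (2), (3) into the following equations:

For any $\theta > 0$ and $b > a \ge 0$

$\int_a^b \frac{1}{\theta} e^{-x/\theta} dx = e^{-a/\theta} - e^{-b/\theta}$   (4)

$\int_a^b x \frac{1}{\theta} e^{-x/\theta} dx = (a + \theta) e^{-a/\theta} - (b + \theta) e^{-b/\theta}$   (5)

$\int_a^b x^2 \frac{1}{\theta} e^{-x/\theta} dx = \left[ (a + \theta)^2 + \theta^2 \right] e^{-a/\theta} - \left[ (b + \theta)^2 + \theta^2 \right] e^{-b/\theta}$   (6)

We can easily prove the above equations. For example, for Equation (5):

$\int_a^b x \frac{1}{\theta} e^{-x/\theta} dx = \int_a^{+\infty} x \frac{1}{\theta} e^{-x/\theta} dx - \int_b^{+\infty} x \frac{1}{\theta} e^{-x/\theta} dx = (a + \theta) e^{-a/\theta} - (b + \theta) e^{-b/\theta}$

We can also write Equations (1), (2), (3) as indefinite integrals:

For any $\theta > 0$

$\int \frac{1}{\theta} e^{-x/\theta} dx = -e^{-x/\theta} + c$   (7)

$\int x \frac{1}{\theta} e^{-x/\theta} dx = -(x + \theta) e^{-x/\theta} + c$   (8)

$\int x^2 \frac{1}{\theta} e^{-x/\theta} dx = -\left[ (x + \theta)^2 + \theta^2 \right] e^{-x/\theta} + c$   (9)

Set $\lambda = \frac{1}{\theta}$. For any $\lambda > 0$

$\int \lambda e^{-\lambda x} dx = -e^{-\lambda x} + c$   (10)

$\int x (\lambda e^{-\lambda x}) dx = -\left( x + \frac{1}{\lambda} \right) e^{-\lambda x} + c$   (11)

$\int x^2 (\lambda e^{-\lambda x}) dx = -\left[ \left( x + \frac{1}{\lambda} \right)^2 + \frac{1}{\lambda^2} \right] e^{-\lambda x} + c$   (12)

So you have four sets of formulas. Just remember one set (any one is fine). Equations (4), (5), (6) are most useful (because you can directly apply the formulas), but the formulas are long.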

If you can memorize any one set, you can avoid doing integration by parts during the exam. You definitely do not want to calculate messy integrations from scratch during the exam.

Now we are ready to tackle more problems.

Problem 4
After an old machine was installed in a factory, Worker John is on call 24 hours a day to repair the machine if it breaks down. If the machine breaks down, John will receive a service call right away, in which case he immediately arrives at the factory and starts repairing the machine.

The machine's time to failure is exponentially distributed with a mean of 3 hours. Let T represent the time elapsed between when the machine was first installed and when John starts repairing the machine.

Find $E(T)$ and $\mathrm{Var}(T)$.

Solution
T is exponentially distributed with mean $\theta = 3$. $F(t) = 1 - e^{-t/3}$.

We simply apply the mean and variance formula:

$E(T) = \theta = 3$, $\mathrm{Var}(T) = \theta^2 = 3^2 = 9$

Problem 5
After an old machine was installed in a factory, Worker John is on call 24 hours a day to repair the machine if it breaks down. If the machine breaks down, John will receive a service call right away, in which case he immediately arrives at the factory and starts repairing the machine.

The machine was found to be working today at 10:00 a.m.

The machine's time to failure is exponentially distributed with a mean of 3 hours. Let T represent the time elapsed between 10:00 a.m. and when John starts repairing the machine.

Find $E(T)$ and $\mathrm{Var}(T)$.

Solution

Exponential distribution lacks memory. At any moment when the machine is working, it forgets its past wear and tear and starts afresh. If we reset the clock at 10:00 and observe T, the time elapsed until a breakdown, T is exponentially distributed with a mean of 3.

$E(T) = \theta = 3$, $\mathrm{Var}(T) = \theta^2 = 3^2 = 9$

Problem 6
After an old machine was installed in a factory, Worker John is on call 24 hours a day to repair the machine if it breaks down. If the machine breaks down, John will receive a service call right away, in which case he immediately arrives at the factory and starts repairing the machine.

Today, John happens to have an appointment from 10:00 a.m. to 12:00 noon. During the appointment, he won't be able to repair the machine if it breaks down.

The machine was found working today at 10:00 a.m.

The machine's time to failure is exponentially distributed with a mean of 3 hours. Let X represent the time elapsed between 10:00 a.m. today and when John starts repairing the machine.

Find $E(X)$ and $\mathrm{Var}(X)$.

Solution
Let T = time elapsed between 10:00 a.m. today and a breakdown. T is exponentially distributed with a mean of 3. $X = \max(2, T)$.

$X = \begin{cases} 2, & \text{if } T \le 2 \\ T, & \text{if } T > 2 \end{cases}$

You can also write

$X = \begin{cases} 2, & \text{if } T < 2 \\ T, & \text{if } T \ge 2 \end{cases}$

As said before, it doesn't matter where you include the point $T = 2$ because the probability that a continuous random variable takes any single value is always zero.

The pdf is always $f(t) = \frac{1}{3} e^{-t/3}$ no matter whether $T \le 2$ or $T > 2$.

$E(X) = \int_0^{+\infty} x(t) f(t) dt = \int_0^2 2 \cdot \frac{1}{3} e^{-t/3} dt + \int_2^{+\infty} t \frac{1}{3} e^{-t/3} dt$

$\int_0^2 2 \cdot \frac{1}{3} e^{-t/3} dt = 2(1 - e^{-2/3})$

$\int_2^{+\infty} t \frac{1}{3} e^{-t/3} dt = (2 + 3) e^{-2/3} = 5e^{-2/3}$

$E(X) = 2(1 - e^{-2/3}) + 5e^{-2/3} = 2 + 3e^{-2/3}$

$E(X^2) = \int_0^{+\infty} x^2(t) f(t) dt = \int_0^2 2^2 \cdot \frac{1}{3} e^{-t/3} dt + \int_2^{+\infty} t^2 \frac{1}{3} e^{-t/3} dt$

$\int_0^2 2^2 \cdot \frac{1}{3} e^{-t/3} dt = 2^2 (1 - e^{-2/3}) = 4(1 - e^{-2/3})$

$\int_2^{+\infty} t^2 \frac{1}{3} e^{-t/3} dt = (5^2 + 3^2) e^{-2/3} = 34e^{-2/3}$

$E(X^2) = 4(1 - e^{-2/3}) + 34e^{-2/3} = 4 + 30e^{-2/3}$

$\mathrm{Var}(X) = E(X^2) - E^2(X) = 4 + 30e^{-2/3} - (2 + 3e^{-2/3})^2$

We can quickly check that $E(X) = 2 + 3e^{-2/3}$ is correct:

$X = \begin{cases} 2, & \text{if } T \le 2 \\ T, & \text{if } T > 2 \end{cases} \Rightarrow X - 2 = \begin{cases} 0, & \text{if } T \le 2 \\ T - 2, & \text{if } T > 2 \end{cases}$

$\Rightarrow E(X - 2) = 0 \cdot P(T \le 2) + E(T - 2 \mid T > 2) P(T > 2) = E(T - 2 \mid T > 2) P(T > 2) = 3e^{-2/3}$

$\Rightarrow E(X) = E(X - 2) + 2 = 2 + 3e^{-2/3}$

You can use this approach to find $E(X^2)$ too, but this approach isn't any quicker than using the integration as we did above.
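The Problem 6 results can be cross-checked by simulating X = max(2, T). The sketch below is illustrative only: the seed, the sample size, and the variable names are assumptions of this example, and the simulated values only approximate the closed forms.

```python
import math
import random

THETA = 3.0   # mean time to failure in hours (Problem 6)
C = 2.0       # John is unavailable for the first C hours

# Closed forms from the text, written with formulas (1)-(3)
e = math.exp(-C / THETA)
mean_exact = C * (1 - e) + (C + THETA) * e
second_exact = C ** 2 * (1 - e) + ((C + THETA) ** 2 + THETA ** 2) * e
var_exact = second_exact - mean_exact ** 2

# Monte Carlo estimate of the same quantities
rng = random.Random(2008)
n = 200_000
draws = [max(C, rng.expovariate(1 / THETA)) for _ in range(n)]
mean_sim = sum(draws) / n
var_sim = sum((x - mean_sim) ** 2 for x in draws) / n

print(f"E(X):   exact {mean_exact:.4f}, simulated {mean_sim:.4f}")
print(f"Var(X): exact {var_exact:.4f}, simulated {var_sim:.4f}")
```

Replacing `max` with `min` in the simulation gives the corresponding check for a capped waiting time.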

Problem 7
After an old machine was installed in a factory, Worker John is on call 24 hours a day to repair the machine if it breaks down. If the machine breaks down, John will receive a service call right away, in which case he immediately arrives at the factory and starts repairing the machine.

Today is John's last day of work because he got an offer from another company, but he'll continue his current job of repairing the machine until 12:00 noon if there's a breakdown. However, if the machine does not break by 12:00 noon, John will have a final check of the machine at 12:00. After 12:00 noon John will permanently leave his current job and take a new job at another company.

The machine was found working today at 10:00 a.m.

The machine's time to failure is exponentially distributed with a mean of 3 hours. Let X represent the time elapsed between 10:00 a.m. today and John's visit to the machine.

Find $E(X)$ and $\mathrm{Var}(X)$.

Solution
Let T = time elapsed between 10:00 a.m. today and a breakdown. T is exponentially distributed with a mean of 3. $X = \min(2, T)$.

$X = \begin{cases} T, & \text{if } T \le 2 \\ 2, & \text{if } T > 2 \end{cases}$

The pdf is always $f(t) = \frac{1}{3} e^{-t/3}$ no matter whether $T \le 2$ or $T > 2$.

$E(X) = \int_0^2 t \frac{1}{3} e^{-t/3} dt + \int_2^{+\infty} 2 \cdot \frac{1}{3} e^{-t/3} dt$

$\int_0^2 t \frac{1}{3} e^{-t/3} dt = 3 - (2 + 3) e^{-2/3} = 3 - 5e^{-2/3}$

$\int_2^{+\infty} 2 \cdot \frac{1}{3} e^{-t/3} dt = 2e^{-2/3}$

$E(X) = 3 - 5e^{-2/3} + 2e^{-2/3} = 3 - 3e^{-2/3}$

To find $\mathrm{Var}(X)$, we need to calculate $E(X^2)$.

$E(X^2) = \int_0^{+\infty} x^2(t) f(t) dt = \int_0^2 t^2 f(t) dt + \int_2^{+\infty} 2^2 f(t) dt$

$\int_0^2 t^2 f(t) dt = \left[ (0 + 3)^2 + 3^2 \right] e^{-0/3} - \left[ (2 + 3)^2 + 3^2 \right] e^{-2/3} = 18 - 34e^{-2/3}$

$\int_2^{+\infty} 2^2 f(t) dt = 4e^{-2/3}$

$E(X^2) = 18 - 34e^{-2/3} + 4e^{-2/3} = 18 - 30e^{-2/3}$

$\mathrm{Var}(X) = E(X^2) - E^2(X) = (18 - 30e^{-2/3}) - (3 - 3e^{-2/3})^2$

We can easily verify that $E(X) = 3 - 3e^{-2/3}$ is correct. Notice:

$T + 2 = \min(T, 2) + \max(T, 2)$

$\Rightarrow E(T + 2) = E[\min(T, 2)] + E[\max(T, 2)]$

We know that

$E[\min(T, 2)] = 3 - 3e^{-2/3}$ (from this problem)

$E[\max(T, 2)] = 2 + 3e^{-2/3}$ (from the previous problem)

$E(T + 2) = E(T) + 2 = 3 + 2$

So the equation $E(T + 2) = E[\min(T, 2)] + E[\max(T, 2)]$ holds.

We can also check that $E(X^2) = 18 - 30e^{-2/3}$ is correct.

$T + 2 = \min(T, 2) + \max(T, 2)$

$\Rightarrow (T + 2)^2 = [\min(T, 2) + \max(T, 2)]^2 = [\min(T, 2)]^2 + [\max(T, 2)]^2 + 2 \min(T, 2) \max(T, 2)$

$\min(T, 2) = \begin{cases} t, & \text{if } t \le 2 \\ 2, & \text{if } t > 2 \end{cases}$, $\max(T, 2) = \begin{cases} 2, & \text{if } t \le 2 \\ t, & \text{if } t > 2 \end{cases}$

$\Rightarrow \min(T, 2) \max(T, 2) = 2T$

$\Rightarrow (T + 2)^2 = [\min(T, 2)]^2 + [\max(T, 2)]^2 + 2(2T)$

$\Rightarrow E(T + 2)^2 = E[\min(T, 2)]^2 + E[\max(T, 2)]^2 + E[2(2T)]$

$E(T + 2)^2 = E(T^2 + 4T + 4) = E(T^2) + 4E(T) + 4 = 2(3^2) + 4(3) + 4 = 34$

$E[\min(T, 2)]^2 = 18 - 30e^{-2/3}$ (from this problem)

$E[\max(T, 2)]^2 = 4 + 30e^{-2/3}$ (from the previous problem)

$E[2(2T)] = 4E(T) = 4(3) = 12$

$E[\min(T, 2)]^2 + E[\max(T, 2)]^2 + E[2(2T)] = 18 - 30e^{-2/3} + 4 + 30e^{-2/3} + 12 = 34$

So the equation $E(T + 2)^2 = E[\min(T, 2) + \max(T, 2)]^2$ holds.

Problem 8
An insurance company sells an auto insurance policy that covers losses incurred by a policyholder, subject to a deductible of $100 and a maximum payment of $300. Losses incurred by the policyholder are exponentially distributed with a mean of $200. Find the expected payment made by the insurance company to the policyholder.

Solution
Let X = losses incurred by the policyholder. X is exponentially distributed with a mean of 200, $f(x) = \frac{1}{200} e^{-x/200}$.

Let Y = claim payment by the insurance company.

$Y = \begin{cases} 0, & \text{if } X \le 100 \\ X - 100, & \text{if } 100 \le X \le 400 \\ 300, & \text{if } X \ge 400 \end{cases}$

$E(Y) = \int_0^{+\infty} y(x) f(x) dx = \int_0^{100} 0 \cdot f(x) dx + \int_{100}^{400} (x - 100) f(x) dx + \int_{400}^{+\infty} 300 f(x) dx$

$\int_0^{100} 0 \cdot f(x) dx = 0$

$\int_{100}^{400} (x - 100) f(x) dx = \int_{100}^{400} x f(x) dx - \int_{100}^{400} 100 f(x) dx$

$\int_{100}^{400} x f(x) dx = (100 + 200) e^{-100/200} - (400 + 200) e^{-400/200} = 300e^{-1/2} - 600e^{-2}$

$\int_{100}^{400} 100 f(x) dx = 100 (e^{-100/200} - e^{-400/200}) = 100 (e^{-1/2} - e^{-2})$

$\int_{400}^{+\infty} 300 f(x) dx = 300e^{-400/200} = 300e^{-2}$

Then we have

$E(Y) = 300e^{-1/2} - 600e^{-2} - 100(e^{-1/2} - e^{-2}) + 300e^{-2} = 200(e^{-1/2} - e^{-2})$

Alternatively, we can use the shortcut developed in Chapter 20, where $d = 100$ is the deductible and $L = 300$ is the maximum payment:

$E(Y) = \int_d^{d+L} \Pr(X > x) dx = \int_{100}^{100+300} e^{-x/200} dx = -200 e^{-x/200} \Big|_{100}^{400} = 200(e^{-1/2} - e^{-2})$
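Both routes to the Problem 8 answer can be checked numerically. The sketch below is an illustration under stated assumptions: the Simpson's-rule helper and all names are choices made for this example. It integrates the payment directly, integrates the survival function from d to d + L, and compares both with the closed form 200(e^{-1/2} - e^{-2}).

```python
import math

THETA = 200.0   # mean loss
D = 100.0       # deductible
CAP = 300.0     # maximum payment

def pdf(x):
    return math.exp(-x / THETA) / THETA

def survival(x):
    return math.exp(-x / THETA)          # P(X > x)

def simpson(g, lo, hi, steps=2_000):
    """Composite Simpson's rule; steps must be even."""
    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

# Direct route: (x - D) on [D, D + CAP], plus CAP times the tail probability
direct = simpson(lambda x: (x - D) * pdf(x), D, D + CAP) + CAP * survival(D + CAP)

# Shortcut route: integrate P(X > x) from D to D + CAP
shortcut = simpson(survival, D, D + CAP)

closed = THETA * (math.exp(-D / THETA) - math.exp(-(D + CAP) / THETA))

print(direct, shortcut, closed)
```

All three numbers should agree, at roughly 94.24.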

Problem 9
An insurance policy has a deductible of 3. Losses are exponentially distributed with mean 10. Find the expected non-zero payment by the insurer.

Solution
Let X represent the losses and Y the payment by the insurer. Then $Y = 0$ if $X \le 3$; $Y = X - 3$ if $X > 3$. We are asked to find $E(Y \mid Y > 0)$.

$E(Y \mid Y > 0) = E(X - 3 \mid X > 3)$

$(X - 3 \mid X > 3)$ is an exponential random variable with the identical mean of 10. So $E(X - 3 \mid X > 3) = E(X) = 10$.

Generally, if X is an exponential loss random variable with mean $\theta$, then for any positive deductible d

$E(X - d \mid X > d) = E(X) = \theta$, $E(X \mid X > d) = E(X - d \mid X > d) + d = \theta + d$

Problem 10
Claims are exponentially distributed with a mean of $8,000. Any claim exceeding $30,000 is classified as a big claim. Any claim exceeding $60,000 is classified as a super claim. Find the expected size of big claims and the expected size of super claims.

Solution
This problem tests your understanding that the exponential distribution lacks memory. Let X represent the claim size. X is exponentially distributed with a mean of $\theta = 8,000$. Let Y = big claims, Z = super claims.

$E(Y) = E(X \mid X > 30,000) = E(X - 30,000 \mid X > 30,000) + 30,000 = E(X) + 30,000 = \theta + 30,000 = 38,000$

$E(Z) = E(X \mid X > 60,000) = E(X - 60,000 \mid X > 60,000) + 60,000 = E(X) + 60,000 = \theta + 60,000 = 68,000$

Problem 11
Evaluate $\int_2^{+\infty} (x^2 + x) e^{-x/5} dx$.

Solution

$\int_2^{+\infty} (x^2 + x) e^{-x/5} dx = 5 \int_2^{+\infty} (x^2 + x) \frac{1}{5} e^{-x/5} dx = 5 \int_2^{+\infty} x^2 \frac{1}{5} e^{-x/5} dx + 5 \int_2^{+\infty} x \frac{1}{5} e^{-x/5} dx$

$\int_2^{+\infty} x^2 \frac{1}{5} e^{-x/5} dx = \left[ 5^2 + (5 + 2)^2 \right] e^{-2/5}$, $\int_2^{+\infty} x \frac{1}{5} e^{-x/5} dx = (5 + 2) e^{-2/5}$

$\Rightarrow \int_2^{+\infty} (x^2 + x) e^{-x/5} dx = 5 \left[ 5^2 + (5 + 2)^2 + (5 + 2) \right] e^{-2/5} = 405e^{-2/5}$
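The memorized formulas can be wrapped in one small helper, which makes it easy to re-check answers such as Problem 11 and Problem 2. This is only a sketch: `partial_moment` and `tail_term` are hypothetical names introduced here, and the helper simply encodes formulas (1) through (6).

```python
import math

def partial_moment(k, a, b, theta):
    """Integral of x^k (1/theta) e^(-x/theta) over [a, b] for k = 0, 1, 2.

    Uses formulas (4)-(6); pass b = math.inf to get formulas (1)-(3).
    """
    if k not in (0, 1, 2):
        raise ValueError("k must be 0, 1 or 2")

    def tail_term(x):
        # The bracketed polynomial times e^(-x/theta), evaluated at x
        if math.isinf(x):
            return 0.0
        if k == 0:
            poly = 1.0
        elif k == 1:
            poly = x + theta
        else:
            poly = (x + theta) ** 2 + theta ** 2
        return poly * math.exp(-x / theta)

    return tail_term(a) - tail_term(b)

# Problem 11: integral of (x^2 + x) e^(-x/5) from 2 to infinity
problem_11 = 5 * (partial_moment(2, 2, math.inf, 5) + partial_moment(1, 2, math.inf, 5))

# Problem 2: P(3 < T < 5) for an exponential T with mean 2.5
problem_2 = partial_moment(0, 3, 5, 2.5)

print(problem_11, 405 * math.exp(-2 / 5))
print(problem_2)
```

The factor of 5 in `problem_11` plays the same role as in the solution above: it restores the 1/θ that the formulas expect inside the integrand.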

Problem 12 (two exponential distributions competing)
You have a car and a van. The time-to-failure of the car and the time-to-failure of the van are two independent exponential random variables with mean of 8 years and 4 years respectively.

Calculate the probability that the car dies before the van.

Solution
Let X and Y represent the time-to-failure of the car and the time-to-failure of the van respectively. We are asked to find $P(X < Y)$.

X and Y are independent exponential random variables with mean of 8 and 4 respectively. Their pdf and cdf are:

$f_X(x) = \frac{1}{8} e^{-x/8}$, $F_X(x) = 1 - e^{-x/8}$, where $x \ge 0$

$f_Y(y) = \frac{1}{4} e^{-y/4}$, $F_Y(y) = 1 - e^{-y/4}$, where $y \ge 0$

Method 1
X and Y have the following joint pdf:

$f_{X,Y}(x, y) = f_X(x) f_Y(y) = \frac{1}{8} e^{-x/8} \cdot \frac{1}{4} e^{-y/4}$

The shaded area (the region of integration) is $x \ge 0$, $y \ge 0$, and $x < y$.

$P(X < Y) = \iint_{\text{shaded area}} f(x, y) \, dx \, dy = \int_0^{+\infty} \int_x^{+\infty} f(x, y) \, dy \, dx = \int_0^{+\infty} \int_x^{+\infty} \frac{1}{8} e^{-x/8} \cdot \frac{1}{4} e^{-y/4} \, dy \, dx$

$= \int_0^{+\infty} \frac{1}{8} e^{-x/8} (e^{-x/4}) dx = \int_0^{+\infty} \frac{1}{8} e^{-3x/8} dx = \frac{1}{3}$

Method 2

$P(X < Y) = \int_0^{+\infty} P(x < X \le x + dx) \, P(Y > x + dx)$

The above equation says that to find $P(X < Y)$, we first fix X at a tiny interval $(x, x + dx]$. Next, we set $Y > x + dx$. This way, we are guaranteed that $X < Y$ when X falls in the interval $(x, x + dx]$. To find $P(X < Y)$ when X falls in $[0, +\infty)$, we simply integrate $P(x < X \le x + dx) \, P(Y > x + dx)$ over the interval $[0, +\infty)$.

$P(x < X \le x + dx) = f(x) dx = \frac{1}{8} e^{-x/8} dx$

$P(Y > x + dx) = P(Y > x)$ because dx is tiny

$= 1 - F_Y(x) = 1 - (1 - e^{-x/4}) = e^{-x/4}$

$P(X < Y) = \int_0^{+\infty} f_X(x) P(Y > x) dx = \int_0^{+\infty} \frac{1}{8} e^{-x/8} e^{-x/4} dx = \frac{1}{8} \int_0^{+\infty} e^{-\left( \frac{1}{8} + \frac{1}{4} \right) x} dx = \frac{\frac{1}{8}}{\frac{1}{8} + \frac{1}{4}} = \frac{1}{3}$

This is the intuitive meaning behind the formula $\frac{1/8}{1/8 + 1/4}$. In this problem, we have a car and a van. The time-to-failure of the car and the time-to-failure of the van are two independent exponential random variables with mean of 8 years and 4 years respectively. So on average a car failure arrives at the speed of $\frac{1}{8}$ per year; a van failure arrives at the speed of $\frac{1}{4}$ per year; and total failure (for cars and vans) arrives at a speed of $\frac{1}{8} + \frac{1}{4}$ per year. Of the total failure, car failure accounts for $\frac{1/8}{1/8 + 1/4} = \frac{1}{3}$ of the total failure.

With this intuitive explanation, you should easily memorize the following shortcut:

In general, if X and Y are two independent exponential random variables with parameters of $\lambda_1$ and $\lambda_2$ respectively:

$f_X(x) = \lambda_1 e^{-\lambda_1 x}$ and $f_Y(y) = \lambda_2 e^{-\lambda_2 y}$

Then

$P(X < Y) = \int_0^{+\infty} f_X(x) P(Y > x) dx = \int_0^{+\infty} \lambda_1 e^{-\lambda_1 x} e^{-\lambda_2 x} dx = \lambda_1 \int_0^{+\infty} e^{-(\lambda_1 + \lambda_2) x} dx = \frac{\lambda_1}{\lambda_1 + \lambda_2}$

Similarly, $P(Y < X) = \frac{\lambda_2}{\lambda_1 + \lambda_2}$.

Now you see that $P(X < Y) + P(Y < X) = \frac{\lambda_1}{\lambda_1 + \lambda_2} + \frac{\lambda_2}{\lambda_1 + \lambda_2} = 1$. This means that $P(X = Y) = 0$. To see why $P(X = Y) = 0$, please note that $X = Y$ is a line in the 2-D plane. A line doesn't have any area (i.e. the area is zero). If you integrate the joint pdf over a line, the result is zero.

If you have trouble understanding why $P(X = Y) = 0$, you can think of probability in a 2-D plane as a volume. You can think of the joint pdf in a 2-D plane as the height function. In order to have a volume, you must integrate the height function over an area. A line doesn't have any area. Consequently, it doesn't have any volume.
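The competing-exponentials shortcut is easy to test by simulation. The following sketch is illustrative only (the seed, trial count, and names are arbitrary assumptions): it draws independent failure times for the car and the van from Problem 12 and counts how often the car fails first.

```python
import random

MEAN_CAR = 8.0   # years
MEAN_VAN = 4.0   # years
rate_car = 1 / MEAN_CAR
rate_van = 1 / MEAN_VAN

# Shortcut from the text: P(X < Y) = lambda_1 / (lambda_1 + lambda_2)
exact = rate_car / (rate_car + rate_van)

rng = random.Random(12)
trials = 200_000
car_first = sum(
    rng.expovariate(rate_car) < rng.expovariate(rate_van)
    for _ in range(trials)
)
estimate = car_first / trials

print(f"shortcut: {exact:.4f}, simulated: {estimate:.4f}")
```

The simulated share should sit close to 1/3, with the usual Monte Carlo noise.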

Problem 13 (Sample P #90, also May 2000 #10)
An insurance company sells two types of auto insurance policies: Basic and Deluxe. The time until the next Basic Policy claim is an exponential random variable with mean two days. The time until the next Deluxe Policy claim is an independent exponential random variable with mean three days.

What is the probability that the next claim will be a Deluxe Policy claim?

(A) 0.172 (B) 0.223 (C) 0.400 (D) 0.487 (E) 0.500

Solution
Let $T^B$ = time until the next Basic policy claim. $T^B$ is an exponential random variable with $\lambda^B = \frac{1}{\theta^B} = \frac{1}{2}$.

Let $T^D$ = time until the next Deluxe policy claim. $T^D$ is an exponential random variable with $\lambda^D = \frac{1}{\theta^D} = \frac{1}{3}$.

"The next claim is a Deluxe policy claim" means that $T^D < T^B$.

$P(T^D < T^B) = \frac{\lambda^D}{\lambda^D + \lambda^B} = \frac{\frac{1}{3}}{\frac{1}{3} + \frac{1}{2}} = \frac{2}{5} = 0.4$

The answer is (C).

Homework for you: #3 May 2000; #9, #14, #34 Nov 2000; #20 May 2001; #35 Nov 2001; #4 May 2003.

Chapter 19 Poisson distribution

Poisson distribution is often used to model the occurrence of a random event that happens unevenly (in some time periods it happens more often than other time periods). The occurrences of many natural events can be approximately modeled as a Poisson distribution such as:

- The number of claims that happen in a given time interval (a month, a year, etc.)
- The number of hits at a website in a given time interval
- The number of customers who arrive at a store in a given time interval
- The number of phone calls (or e-mails) you get in a day
- The number of shark attacks in one summer (Professor David Kelton of Penn State University used Poisson distribution to model the number of shark attacks in Florida in one summer.)

To enhance your understanding of Poisson distribution, I recommend that you read the article about using Poisson distribution to model the number of shark attacks: http://www.pims.math.ca/pi/issue4/page12-14.pdf

A random variable X has a Poisson distribution with parameter $\lambda > 0$ if its probability mass function is:

$f(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}$, $k = 0, 1, 2, \ldots$

$\lambda$ is the occurrence rate per unit.

Poisson distribution is a limiting case of binomial distribution. For a binomial distribution with parameters n and p, if we let $n \to +\infty$, $p \to 0$, but $np \to \lambda$, then the binomial distribution becomes Poisson distribution. You can find the proof in many textbooks.

To memorize the mean and variance of Poisson distribution, remember that binomial distribution with parameters n and p has a mean of $np$ and a variance of $np(1 - p)$. Let $p \to 0$ but $np \to \lambda$, then we see that a Poisson distribution has mean of $\lambda$ and variance of $\lambda$.

Previous Course 1 problems on Poisson distribution were straightforward. The major thing to watch out for is that you will need to convert the given occurrence rate of a random event into the occurrence rate of the time horizon in an exam problem.

Sample Problems and Solutions

Problem 1
Customers walk into a store at an average rate of 20 per hour. Find the probability that
(1) no customers have arrived at the store in 10 minutes.
(2) no more than 4 customers have arrived at the store in 30 minutes.

Solution
Customers walk into a store randomly and unevenly. Poisson distribution can be used to model the number of customers who have arrived at the store in a given interval.

To find the probability distribution of the number of customers who have walked into the store in 10 minutes, we need first to convert the arrival rate per hour into the arrival rate per 10 minutes. Because there are 6 ten-minute intervals in an hour, the average # of arrivals per 10 minutes $= 20/6 = 10/3$.

Let k = # of customer arrivals in 10 minutes.

$f(k) = \frac{(10/3)^k}{k!} e^{-10/3}$, $k = 0, 1, 2, \ldots$

$f(0) = \frac{(10/3)^0}{0!} e^{-10/3} = e^{-10/3} = 0.0357$

Let n = # of customer arrivals in 30 minutes. The average # of arrivals in 30 minutes is $20/2 = 10$.

$f(n) = \frac{10^n}{n!} e^{-10}$, $n = 0, 1, 2, \ldots$

$P(n \le 4) = \sum_{n=0}^{4} \frac{10^n}{n!} e^{-10} = e^{-10} \left( 1 + \frac{10}{1} + \frac{10^2}{2} + \frac{10^3}{3!} + \frac{10^4}{4!} \right) = 0.0293$
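The rate conversion in Problem 1 is the step most likely to go wrong, so it is worth seeing it spelled out. The sketch below is an illustration; the function and variable names are chosen for this example and are not from the text.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """P(N <= k) for a Poisson random variable with mean lam."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

RATE_PER_HOUR = 20

# Convert the hourly rate into the rate for the interval in question
lam_10min = RATE_PER_HOUR * 10 / 60    # 10/3 arrivals per 10 minutes
lam_30min = RATE_PER_HOUR * 30 / 60    # 10 arrivals per 30 minutes

p_none = poisson_pmf(0, lam_10min)         # part (1)
p_at_most_4 = poisson_cdf(4, lam_30min)    # part (2)

print(f"P(no arrivals in 10 min)  = {p_none:.4f}")
print(f"P(at most 4 in 30 min)    = {p_at_most_4:.4f}")
```

The same two helpers cover most Poisson exam questions once the rate has been rescaled to the right interval.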

Problem 2
The number of typos in a book has a Poisson distribution with an average of 5 typos per 100 pages. What is the probability that you cannot find a typo in 50 pages?

Solution
Let n = # of typos in 50 pages.

$f(n) = \frac{\lambda^n}{n!} e^{-\lambda}$, $n = 0, 1, 2, \ldots$

$\lambda$ = average # of typos in 50 pages $= 5/2 = 2.5$

$f(0) = \frac{2.5^0}{0!} e^{-2.5} = 8.21\%$

Problem 3
A beach resort buys a policy to insure against loss of revenues due to major storms in the summer. The policy pays a total of $50,000 if there is only one major storm during the summer, a total of $100,000 if there are two major storms, and a total payment of $200,000 if there are more than two major storms.

The number of major storms in one summer is modeled by a Poisson distribution with mean of 0.5 per summer.

Find
(1) the expected premium for this policy during one summer.
(2) the standard deviation of the cost of providing this insurance for one summer.

Solution
Let N = # of major storms in one summer. N has Poisson distribution with the mean $\lambda = 0.5$.

$f(n) = \frac{\lambda^n}{n!} e^{-\lambda}$, $n = 0, 1, 2, \ldots$

Let X = payment by the insurance company to the beach resort. The expected premium is the expected cost of the insurance:

$E[X(n)] = \sum_{n=0}^{+\infty} X(n) f(n)$

We will use BA II Plus to find the mean and variance.

n       Payment X(n)    f(n)                                          10,000 f(n)
0       0               e^{-0.5} = 0.6065                             6,065
1       50,000          0.5 e^{-0.5} = 0.3033                         3,033
2       100,000         (0.5^2 / 2) e^{-0.5} = 0.0758                 758
3+      200,000         1 - (0.6065 + 0.3033 + 0.0758) = 0.0144       144
Total                                                                 10,000

Using the BA II Plus 1-V Statistics Worksheet, you should get:

$E(X) = 25,625$, $\sigma_X = 37,889$

Homework for you: #24, May 2000; #23 Nov 2000; #19 Nov 2001.
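The table above can be reproduced without a calculator worksheet. The sketch below is illustrative, and `payment_moments` is a hypothetical helper name. It computes the mean and standard deviation of the payment twice: once with probabilities rounded to four decimals, as in the table, and once with unrounded probabilities. The rounded run should match the figures quoted above; the unrounded run differs slightly because of the rounding in the table.

```python
import math

LAM = 0.5                                   # mean number of major storms
PAYMENTS = [0, 50_000, 100_000, 200_000]    # for 0, 1, 2, and 3+ storms

def payment_moments(round_to=None):
    """Return (mean, standard deviation) of the payment.

    If round_to is given, the Poisson probabilities for 0, 1, 2 storms
    are rounded to that many decimals first, as in the table.
    """
    probs = [math.exp(-LAM) * LAM ** n / math.factorial(n) for n in range(3)]
    if round_to is not None:
        probs = [round(p, round_to) for p in probs]
    probs.append(1 - sum(probs))            # probability of 3 or more storms
    mean = sum(x * p for x, p in zip(PAYMENTS, probs))
    second = sum(x * x * p for x, p in zip(PAYMENTS, probs))
    return mean, math.sqrt(second - mean ** 2)

mean_table, sd_table = payment_moments(round_to=4)
mean_exact, sd_exact = payment_moments()

print(f"rounded probabilities:   mean {mean_table:,.0f}, sd {sd_table:,.0f}")
print(f"unrounded probabilities: mean {mean_exact:,.0f}, sd {sd_exact:,.0f}")
```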

Chapter 20 Gamma distribution

A continuous random variable X has a gamma distribution if it has the following pdf:

$f(x) = \frac{1}{\theta^\alpha \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\theta}$, $x \ge 0$, $\alpha > 0$, $\theta > 0$

where $\alpha$, $\theta$ are constants (called parameters). $\Gamma(\alpha)$ is defined as

$\Gamma(\alpha) = \int_0^{+\infty} t^{\alpha - 1} e^{-t} dt$ $(\alpha > 0)$

$\Gamma(\alpha)$ is called the gamma function (pay attention here: $\Gamma(\alpha)$ is not called gamma distribution).

If $\alpha$ is a positive integer, then

$\Gamma(\alpha) = (\alpha - 1)!$

Gamma distribution can be very complex, especially when $\alpha$ is not an integer. For example, if $\alpha = \frac{1}{2}$, then the gamma pdf becomes:

$f(x) = \frac{1}{\theta^{1/2} \, \Gamma\left( \frac{1}{2} \right)} x^{-1/2} e^{-x/\theta}$, $x \ge 0$, $\theta > 0$

Then finding just $\Gamma\left( \frac{1}{2} \right)$ can be very challenging:

$\Gamma\left( \frac{1}{2} \right) = \int_0^{+\infty} t^{-1/2} e^{-t} dt = \int_0^{+\infty} \frac{1}{x} e^{-x^2} d(x^2) = 2 \int_0^{+\infty} e^{-x^2} dx$ (let $t = x^2$)

We must use polar transformation to find $2 \int_0^{+\infty} e^{-x^2} dx$ (details are not important and hence not shown here):

$2 \int_0^{+\infty} e^{-x^2} dx = \sqrt{\pi}$, so $\Gamma\left( \frac{1}{2} \right) = \sqrt{\pi}$

You might want to memorize (no need to learn how to prove these) the following basic facts about the gamma function:

$\Gamma\left( \frac{1}{2} \right) = \sqrt{\pi}$

Whether $\alpha$ is an integer or not, as long as $\alpha > 0$, then $\Gamma(\alpha) = \int_0^{+\infty} t^{\alpha - 1} e^{-t} dt$ exists, and $\Gamma(\alpha + 1) = \alpha \Gamma(\alpha)$.

Though gamma distribution can be very complex, here is the good news: most likely you will be tested on a simplified version of gamma distribution where $\alpha$ is a positive integer.

Let's focus only on gamma distributions where $\alpha$ is a positive integer. We'll change $\alpha$ to n.

Simplified gamma distribution (most likely to be tested)

$f(x) = \frac{1}{\theta^n (n - 1)!} x^{n - 1} e^{-x/\theta}$, $x \ge 0$

To help us remember this complex pdf (so we can quickly and correctly write it out during the exam), we rewrite the pdf into:

$f(x) = \lambda \left[ \frac{e^{-\lambda x} (\lambda x)^{n - 1}}{(n - 1)!} \right]$, where $\lambda = \frac{1}{\theta}$

The bracketed term is the Poisson probability of having (n - 1) events during the interval [0, x].

Or

$f(x) = \frac{1}{\theta} \left[ \frac{e^{-x/\theta} (x/\theta)^{n - 1}}{(n - 1)!} \right]$

Again, the bracketed term is the Poisson probability of having (n - 1) events during the interval [0, x].

In other words, we can express the gamma pdf as a multiple of Poisson pdf. The multiple is $\lambda$.

If $n = 1$, then gamma distribution becomes exponential distribution.

One of the easiest ways to understand this simplified gamma distribution is to relate it to exponential distribution. While exponential distribution models the time elapsed until one random event occurs, gamma distribution models the time elapsed until n random events occur. You can find the proof of this in many textbooks.

Having n random events is the same as having a series of one random event. If a machine malfunctions three times during the next hour, the machine is really having three separate malfunctions within the next hour. As such, a gamma random variable is really the sum of n independent, identically distributed exponential random variables.

Let $T_1, T_2, \ldots, T_n$ be independent identically distributed exponential variables with parameter (mean) $\theta$. Then $X = T_1 + T_2 + \ldots + T_n$ has gamma distribution with pdf

$f(x) = \lambda \left[ \frac{e^{-\lambda x} (\lambda x)^{n - 1}}{(n - 1)!} \right]$, where $\lambda = \frac{1}{\theta}$

The bracketed term is the Poisson probability of having (n - 1) events during [0, x].

Once you understand this, you should have no trouble finding the mean and variance of X.

$E(X) = E(T_1 + T_2 + \ldots + T_n) = E(T_1) + E(T_2) + \ldots + E(T_n) = n\theta$

$\mathrm{Var}(X) = \mathrm{Var}(T_1 + T_2 + \ldots + T_n) = \mathrm{Var}(T_1) + \mathrm{Var}(T_2) + \ldots + \mathrm{Var}(T_n) = n\theta^2$

We are not quite done yet. To solve gamma distribution problems, we need a quick way of finding the cdf. Brute force integration

$F(x) = \int_0^x f(t) dt = \int_0^x \lambda \frac{e^{-\lambda t} (\lambda t)^{n - 1}}{(n - 1)!} dt$

can be painful. You will want to memorize the following formula:

$F(x) = \Pr(X \le x)$ = Pr(it takes time x or less to have n random events)
= Pr(the # of events that occurred during [0, x] = n, n + 1, n + 2, ...)
= 1 - Pr(the # of events that occurred during [0, x] = 0, 1, 2, ..., n - 1)

$= 1 - e^{-\lambda x} \left[ 1 + \frac{\lambda x}{1!} + \frac{(\lambda x)^2}{2!} + \ldots + \frac{(\lambda x)^{n - 1}}{(n - 1)!} \right]$

The subtracted term is a sum of Poisson probabilities.

Let's walk through the formula above. $\Pr(X \le x)$ really means that it takes no longer than time x before n random events occur. The only way to make this happen is to have

http://www.guo.coursehost.com n, n + 1, n + 2,...or + events occur during 0, x . The number of events that can occur during 0, x is a Poisson distribution, which we already know how to calculate. P lease note that we use x instead of as the parameter for Poisson distribution. i s the occurrence rate per unit of time (or per unit of something). We have a tot al of x units. Consequently, the occurrence rate during the time interval [ 0, x ] is x . I mentioned this point in the chapter on Poisson distribution. Forgett ing this leads to a wrong result. Similarly, we have Pr( X > x) = Pr(it takes longer than time x to have n random events) = Pr(the # of events that occurred during [0, x] = 0,1, 2,...or n 1) =e x 1+ ( x) n 1 x ( x) 2 + + ... + 1! 2! (n 1)! Poisson distribution Now we are ready to tackle gamma distribution problems. Sample Problems and Solutions Problem 1 For a gamma distribution with parameter n = 10, = 2 , quickly write out the expr ession for pdf and for F (2) . Solution f ( x) = e x ( x) n 1 (n 1)! , where = 1 Poisson probability of having (n-1) events during 0, x f ( x) = e x ( x) n 1 =2e (n 1)! 2x (2 x) 10 1 = 2e (10 1)! 2x (2 x) 9 9!


$F(2) = \Pr(X \le 2)$
= Pr(it takes time length of 2 or less to have 10 random events)
= Pr(the # of events that occurred during $[0, 2]$ = $10, 11, 12, \dots$)
= 1 - Pr(the # of events that occurred during $[0, 2]$ = $0, 1, 2, \dots, 9$)

$$= 1 - e^{-\lambda x}\left[1 + \frac{\lambda x}{1!} + \frac{(\lambda x)^2}{2!} + \dots + \frac{(\lambda x)^{9}}{9!}\right] = 1 - e^{-4}\left[1 + \frac{4}{1!} + \frac{4^2}{2!} + \dots + \frac{4^{9}}{9!}\right]$$

Here $\lambda x = 2 \times 2 = 4$ is the Poisson parameter for the interval $[0, 2]$.

Problem 2

Given that a gamma distribution has the following cdf:

$$F(x) = 1 - e^{-\lambda x}\left[1 + \frac{\lambda x}{1!} + \frac{(\lambda x)^2}{2!} + \dots + \frac{(\lambda x)^{n-1}}{(n-1)!}\right]$$

show that the pdf is indeed as follows:

$$f(x) = \lambda e^{-\lambda x}\,\frac{(\lambda x)^{n-1}}{(n-1)!}$$

Solution

You won't be asked to prove this in the actual exam, but knowing the connection between the gamma $F(x)$ and $f(x)$ enhances your understanding.

Write $S(x) = 1 + \frac{\lambda x}{1!} + \frac{(\lambda x)^2}{2!} + \dots + \frac{(\lambda x)^{n-1}}{(n-1)!}$. Then

$$f(x) = \frac{d}{dx}F(x) = \frac{d}{dx}\left[1 - e^{-\lambda x}S(x)\right] = -S(x)\,\frac{d}{dx}e^{-\lambda x} - e^{-\lambda x}\,\frac{d}{dx}S(x)$$

$$= \lambda e^{-\lambda x}\left[1 + \frac{\lambda x}{1!} + \dots + \frac{(\lambda x)^{n-1}}{(n-1)!}\right] - \lambda e^{-\lambda x}\left[1 + \frac{\lambda x}{1!} + \dots + \frac{(\lambda x)^{n-2}}{(n-2)!}\right]$$

$$= \lambda e^{-\lambda x}\,\frac{(\lambda x)^{n-1}}{(n-1)!}$$

Now we are convinced that our formula for $F(x)$ is correct.
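The Poisson shortcut for the gamma cdf can also be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names are illustrative, the shape n is assumed to be a positive integer, and Simpson's rule stands in for the brute-force integration. It evaluates $F(2)$ for Problem 1 both ways.

```python
import math

def gamma_pdf(x, n, lam):
    """Gamma pdf with integer shape n and rate lam = 1/theta."""
    return lam * math.exp(-lam * x) * (lam * x) ** (n - 1) / math.factorial(n - 1)

def gamma_cdf_via_poisson(x, n, lam):
    """F(x) = 1 - Pr(Poisson(lam * x) takes a value in 0..n-1)."""
    mu = lam * x  # Poisson parameter for the whole interval [0, x]
    head = sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(n))
    return 1.0 - head

def gamma_cdf_numeric(x, n, lam, steps=20000):
    """Brute-force check: composite Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = gamma_pdf(0.0, n, lam) + gamma_pdf(x, n, lam)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gamma_pdf(i * h, n, lam)
    return total * h / 3

# Problem 1: n = 10, lam = 2, x = 2, so the Poisson parameter is 4.
shortcut = gamma_cdf_via_poisson(2, 10, 2)
brute_force = gamma_cdf_numeric(2, 10, 2)
print(shortcut, brute_force)  # both are roughly 0.0081
```

The two numbers agree to many decimal places, which is the content of Problem 2 seen from the numerical side.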

Problem 3

You are solving three math problems. On average, you solve one problem every 2 minutes. What's the probability that you will solve all three problems in 5 minutes?

Solution

Method 1: Forget about gamma distribution and use Poisson distribution

Let n = the number of problems you solve in 5 minutes. n has a Poisson distribution:

$$f(n) = e^{-\lambda x}\,\frac{(\lambda x)^n}{n!}, \quad \text{where } \lambda = \frac{1}{2},\; x = 5$$

$$\Pr(\text{solve all 3 problems in 5 minutes}) = \Pr(n = 3, 4, 5, \dots) = 1 - \Pr(n = 0, 1, 2)$$

$$= 1 - e^{-\lambda x}\left[1 + \frac{\lambda x}{1!} + \frac{(\lambda x)^2}{2!}\right] = 1 - e^{-5/2}\left[1 + \frac{5/2}{1!} + \frac{(5/2)^2}{2!}\right] = 45.62\%$$

Method 2: Use gamma distribution and do integration

Let $T_1$ = the # of minutes it takes you to solve the 1st problem (the 1st problem can be any problem you choose to solve first).
Let $T_2$ = the # of minutes it takes you to solve the 2nd problem.
Let $T_3$ = the # of minutes it takes you to solve the 3rd problem.

$T_1, T_2, T_3$ are exponentially distributed with mean $\theta = 2$. Let $X = T_1 + T_2 + T_3$ = the total # of minutes it takes you to solve all three problems. X has a gamma distribution with $n = 3$, $\lambda = \frac{1}{\theta} = \frac{1}{2}$ and the following pdf:

$$f(x) = \lambda e^{-\lambda x}\,\frac{(\lambda x)^{n-1}}{(n-1)!} = \frac{1}{2}\,e^{-x/2}\,\frac{(x/2)^2}{2!}$$

$$\Pr(\text{solve all 3 problems in 5 minutes}) = \Pr(X \le 5) = \int_0^5 f(t)\,dt = \int_0^5 \frac{1}{2}\,e^{-t/2}\,\frac{(t/2)^2}{2!}\,dt$$

$$= 1 - \int_5^{+\infty} \frac{1}{2}\,e^{-t/2}\,\frac{(t/2)^2}{2!}\,dt = 1 - \int_5^{+\infty} e^{-t/2}\,\frac{(t/2)^2}{2!}\,d\!\left(\frac{t}{2}\right) = 1 - \frac{1}{2!}\int_{5/2}^{+\infty} x^2 e^{-x}\,dx \quad \left(\text{set } x = \frac{t}{2}\right)$$

$$= 1 - \frac{1}{2}\,e^{-5/2}\left[1^2 + (1 + 5/2)^2\right] = 45.62\%$$

(use the shortcut developed in the chapter on exponential distribution)

Method 3: Use gamma distribution and avoid doing integration

We'll use the memorized formula of the gamma cdf:

$$F(x) = 1 - e^{-\lambda x}\left[1 + \frac{\lambda x}{1!} + \frac{(\lambda x)^2}{2!} + \dots + \frac{(\lambda x)^{n-1}}{(n-1)!}\right]$$

We have $n = 3$, $\lambda = \frac{1}{\theta} = \frac{1}{2}$.

$$F(5) = 1 - e^{-5/2}\left[1 + \frac{5/2}{1!} + \frac{(5/2)^2}{2!}\right] = 45.62\%$$

You might want to familiarize yourself with all three methods above.

Problem 3

Solve the following integration:

$$\int_0^{+\infty} e^{-t/\theta}\,t^n\,dt$$

where $\theta$ is a positive constant and n is a non-negative integer.

Solution

The above integration is frequently tested on SOA exams.

$$\int_0^{+\infty} e^{-t/\theta}\,t^n\,dt = \theta^{n+1}\int_0^{+\infty} e^{-t/\theta}\left(\frac{t}{\theta}\right)^n d\!\left(\frac{t}{\theta}\right) = \theta^{n+1}\int_0^{+\infty} e^{-x}x^n\,dx \quad \left(\text{let } x = \frac{t}{\theta}\right)$$

$$\int_0^{+\infty} e^{-x}x^n\,dx = \Gamma(n+1) = n! \quad \text{(this is a gamma function)}$$

$$\int_0^{+\infty} e^{-t/\theta}\,t^n\,dt = n!\,\theta^{n+1}, \quad \text{or} \quad \int_0^{+\infty} \frac{1}{\theta^{n+1}}\,e^{-t/\theta}\,\frac{t^n}{n!}\,dt = 1$$

Alternatively,

$$\int_0^{+\infty} e^{-t/\theta}\,t^n\,dt = \Gamma(n+1)\,\theta^{n+1}\int_0^{+\infty} \frac{1}{\Gamma(n+1)\,\theta^{n+1}}\,e^{-t/\theta}\,t^{(n+1)-1}\,dt = \Gamma(n+1)\,\theta^{n+1} \times 1 = n!\,\theta^{n+1}$$

The integrand in the middle expression is the gamma pdf with parameters $\theta$ and $n+1$. (The integration of a gamma pdf over $[0, +\infty)$ is one.)

This gives us, for $\theta > 0$ and a non-negative integer n:

$$\int_0^{+\infty} e^{-t/\theta}\,t^n\,dt = n!\,\theta^{n+1}$$

To help memorize the above equation, let's rewrite the formula as

$$\int_0^{+\infty} e^{-t/\theta}\,\frac{(t/\theta)^n}{n!}\,d\!\left(\frac{t}{\theta}\right) = 1 \quad \text{for } \theta > 0 \text{ and a non-negative integer } n.$$

Notice that $e^{-t/\theta}\,\frac{(t/\theta)^n}{n!}$ inside the integration sign is the Poisson probability mass function $P(N = n)$ with parameter $\lambda = t/\theta$. Setting $\lambda = t/\theta$, we have:

$$\int_0^{+\infty} e^{-\lambda}\,\frac{\lambda^n}{n!}\,d\lambda = 1$$

So you just need to memorize:

$$\int_0^{+\infty} e^{-\lambda}\,\frac{\lambda^n}{n!}\,d\lambda = 1 \quad \text{or} \quad \int_0^{+\infty} e^{-t/\theta}\,\frac{(t/\theta)^n}{n!}\,d\!\left(\frac{t}{\theta}\right) = 1$$

In both forms the integrand is a Poisson probability mass function.

If we set $\lambda = x$ and $n = 1$, we get $\int_0^{+\infty} e^{-x}x\,dx = 1$. $\int_0^{+\infty} e^{-x}x\,dx$ is the mean of an exponential random variable with parameter $\theta = 1$. So $\int_0^{+\infty} e^{-x}x\,dx = 1$ is correct.

If we set $\lambda = x$ and $n = 2$, we get $\int_0^{+\infty} e^{-x}\,\frac{x^2}{2!}\,dx = 1$ or $\int_0^{+\infty} e^{-x}x^2\,dx = 2$. If we do integration by parts, we get:

$$\int_0^{+\infty} e^{-x}x^2\,dx = -\int_0^{+\infty} x^2\,d\!\left(e^{-x}\right) = \left[-x^2e^{-x}\right]_0^{+\infty} + \int_0^{+\infty} e^{-x}\,d\!\left(x^2\right) = 2\int_0^{+\infty} e^{-x}x\,dx = 2$$

So we know that $\int_0^{+\infty} e^{-x}\,\frac{x^2}{2!}\,dx = 1$ is correct.

In the future, if you see $\int_0^{+\infty} e^{-x}x^n\,dx$, immediately change $e^{-x}x^n$ to a Poisson probability mass function:

$$\int_0^{+\infty} e^{-x}x^n\,dx = n!\int_0^{+\infty} e^{-x}\,\frac{x^n}{n!}\,dx = n!$$

Problem 4

Solve the following integration:

$$\int_0^{+\infty} e^{-3t}\,t^5\,dt$$

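Solution

Let's change $e^{-3t}t^5$ into a Poisson probability mass function:

$$\int_0^{+\infty} e^{-3t}\,t^5\,dt = \frac{5!}{3^6}\int_0^{+\infty} e^{-3t}\,\frac{(3t)^5}{5!}\,d(3t) = \frac{5!}{3^6}$$

Problem 5

Given that $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$, find $\Gamma\!\left(\frac{11}{2}\right)$.

Solution

Generally, $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$ where $\alpha > 0$. Applying this repeatedly:

$$\Gamma\!\left(\tfrac{11}{2}\right) = \Gamma\!\left(\tfrac{9}{2} + 1\right) = \tfrac{9}{2}\,\Gamma\!\left(\tfrac{9}{2}\right)$$
$$\Gamma\!\left(\tfrac{9}{2}\right) = \Gamma\!\left(\tfrac{7}{2} + 1\right) = \tfrac{7}{2}\,\Gamma\!\left(\tfrac{7}{2}\right)$$
$$\Gamma\!\left(\tfrac{7}{2}\right) = \Gamma\!\left(\tfrac{5}{2} + 1\right) = \tfrac{5}{2}\,\Gamma\!\left(\tfrac{5}{2}\right)$$
$$\Gamma\!\left(\tfrac{5}{2}\right) = \Gamma\!\left(\tfrac{3}{2} + 1\right) = \tfrac{3}{2}\,\Gamma\!\left(\tfrac{3}{2}\right)$$
$$\Gamma\!\left(\tfrac{3}{2}\right) = \Gamma\!\left(\tfrac{1}{2} + 1\right) = \tfrac{1}{2}\,\Gamma\!\left(\tfrac{1}{2}\right)$$

$$\Gamma\!\left(\tfrac{11}{2}\right) = \tfrac{9}{2}\cdot\tfrac{7}{2}\cdot\tfrac{5}{2}\cdot\tfrac{3}{2}\cdot\tfrac{1}{2}\,\sqrt{\pi} = \tfrac{945}{32}\sqrt{\pi}$$

Homework for you: Rework all the problems in this chapter.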
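The shortcuts in this chapter are easy to sanity-check with a few lines of Python. The sketch below is illustrative only: the helper names `simpson` and `closed_form` are not from the text, and the infinite upper limit in Problem 4 is truncated at t = 20 on the assumption that the tail beyond that point is negligible.

```python
import math

def simpson(f, a, b, steps=20000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def closed_form(n, theta):
    """Memorized result: integral of e^(-t/theta) * t^n over [0, +inf) is n! * theta^(n+1)."""
    return math.factorial(n) * theta ** (n + 1)

# Problem 3 (three math problems): gamma cdf with n = 3, lambda = 1/2, x = 5.
prob_three_solved = 1 - math.exp(-2.5) * (1 + 2.5 + 2.5 ** 2 / 2)

# Problem 4: theta = 1/3 and n = 5, so the answer should be 5!/3^6.
numeric_p4 = simpson(lambda t: math.exp(-3 * t) * t ** 5, 0.0, 20.0)
exact_p4 = closed_form(5, 1 / 3)

# Problem 5: repeated use of Gamma(a + 1) = a * Gamma(a), starting from Gamma(1/2) = sqrt(pi).
gamma_11_over_2 = (9 / 2) * (7 / 2) * (5 / 2) * (3 / 2) * (1 / 2) * math.sqrt(math.pi)

print(prob_three_solved)                  # about 0.4562
print(numeric_p4, exact_p4)               # both about 0.1646
print(gamma_11_over_2, math.gamma(5.5))   # both about 52.34
```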

Chapter 21 Beta distribution

Let's pick up where we left off on beta distribution in Chapter 7. You are tossing a coin. Not knowing p, the success rate of heads showing up in one toss of the coin, you subjectively assume that p is uniformly distributed over $[0, 1]$. Next, you do an experiment by tossing the coin $m + n$ times (where m, n are non-negative integers). You find that, in this experiment, m out of these $m + n$ tosses have heads. Then the posterior probability of p is:

$$f(p) = \frac{p^m(1-p)^n}{\int_0^1 p^m(1-p)^n\,dp}, \quad \text{where } 0 \le p \le 1;\; m, n \text{ are non-negative integers.}$$

The above distribution $f(p)$ is called beta distribution.

If we set $m = \alpha - 1$ and $n = \beta - 1$ where $\alpha > 0$ and $\beta > 0$, we have

$$f(p) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{\int_0^1 p^{\alpha-1}(1-p)^{\beta-1}\,dp}, \quad \text{where } 0 \le p \le 1$$

This is a generalized beta distribution.

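Key points to remember:

General beta distribution:

pdf:

$$f(p) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{\int_0^1 p^{\alpha-1}(1-p)^{\beta-1}\,dp} = \frac{1}{B(\alpha, \beta)}\,p^{\alpha-1}(1-p)^{\beta-1} = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,p^{\alpha-1}(1-p)^{\beta-1}$$

where $\alpha > 0$, $\beta > 0$, $0 \le p \le 1$, and

$$B(\alpha, \beta) = \int_0^1 p^{\alpha-1}(1-p)^{\beta-1}\,dp = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)} \quad \text{(Beta function)}$$

cdf:

$$F(p) = \int_0^p f(x)\,dx = \frac{1}{B(\alpha, \beta)}\int_0^p x^{\alpha-1}(1-x)^{\beta-1}\,dx = \frac{B_p(\alpha, \beta)}{B(\alpha, \beta)} = I_p(\alpha, \beta)$$

where $B_p(\alpha, \beta) = \int_0^p x^{\alpha-1}(1-x)^{\beta-1}\,dx$ is called the incomplete beta function and $I_p(\alpha, \beta)$ is called the incomplete beta function ratio.

Mean and variance (good if you can memorize them):

$$E(P) = \frac{\alpha}{\alpha+\beta}, \qquad Var(P) = \frac{\alpha}{\alpha+\beta}\cdot\frac{\beta}{\alpha+\beta}\cdot\frac{1}{\alpha+\beta+1}$$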

Simplified Beta distribution (likely to be on the exam): $\alpha$ and $\beta$ are positive integers. Let $m = \alpha - 1$ and $n = \beta - 1$.

pdf:

$$f(p) = \frac{p^m(1-p)^n}{\int_0^1 p^m(1-p)^n\,dp} = (m+n+1)\,C_{m+n}^{m}\,p^m(1-p)^n$$

where $0 \le p \le 1$ and m, n are non-negative integers. Here $(m+n+1)$ is the # of trials + 1, and $C_{m+n}^{m}\,p^m(1-p)^n$ is the binomial probability of m successes and n failures.

cdf: Two methods

Method 1: Do the following integration:

$$F(p) = \int_0^p f(x)\,dx = (m+n+1)\,C_{m+n}^{m}\int_0^p x^m(1-x)^n\,dx$$

Method 2: $F(p)$ = Binomial probability of having more than m successes in $m+n+1$ trials, where the success rate is p in one trial. Or

$$F(p) = \sum_{k=m+1}^{m+n+1} C_{m+n+1}^{k}\,p^k(1-p)^{m+n+1-k}$$

The proof of Method 2 is complex. And there's no intuitive explanation for it. Just memorize it.

Integration shortcut (comes in handy in the heat of the exam):

$$\int_0^1 p^m(1-p)^n\,dp = \frac{1}{(m+n+1)\,C_{m+n}^{m}}$$

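Since Method 2 is presented without proof, a numerical comparison of the two cdf methods may be reassuring. This is a minimal sketch with illustrative function names (not from the text); it assumes non-negative integers m and n and uses Simpson's rule for Method 1.

```python
import math

def beta_cdf_binomial(p, m, n):
    """Method 2: Pr(more than m successes in m + n + 1 trials with success rate p)."""
    trials = m + n + 1
    return sum(math.comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(m + 1, trials + 1))

def beta_cdf_integral(p, m, n, steps=2000):
    """Method 1: integrate (m+n+1) * C(m+n, m) * x^m * (1-x)^n over [0, p] with Simpson's rule."""
    const = (m + n + 1) * math.comb(m + n, m)
    f = lambda x: const * x ** m * (1 - x) ** n
    h = p / steps
    total = f(0.0) + f(p)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

# Example values: m = 3 successes, n = 2 failures, evaluated at p = 1/3.
by_binomial = beta_cdf_binomial(1 / 3, 3, 2)
by_integral = beta_cdf_integral(1 / 3, 3, 2)
print(by_binomial, by_integral)  # both about 0.1001
```

Trying other small values of m, n and p is a quick way to build trust in the formula before memorizing it.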

This is how we get

$$f(p) = (m+n+1)\,C_{m+n}^{m}\,p^m(1-p)^n \quad \text{and} \quad \int_0^1 p^m(1-p)^n\,dp = \frac{1}{(m+n+1)\,C_{m+n}^{m}}.$$

Proof.

$$f(p) = \frac{p^m(1-p)^n}{\int_0^1 p^m(1-p)^n\,dp} = \frac{p^m(1-p)^n}{B(m+1, n+1)}$$

$$B(m+1, n+1) = \frac{\Gamma(m+1)\,\Gamma(n+1)}{\Gamma(m+n+2)} = \frac{m!\,n!}{(m+n+1)!}$$

$$f(p) = \frac{(m+n+1)!}{m!\,n!}\,p^m(1-p)^n = (m+n+1)\,C_{m+n}^{m}\,p^m(1-p)^n$$

$$\int_0^1 f(p)\,dp = 1 \;\Rightarrow\; \int_0^1 (m+n+1)\,C_{m+n}^{m}\,p^m(1-p)^n\,dp = 1 \;\Rightarrow\; \int_0^1 p^m(1-p)^n\,dp = \frac{1}{(m+n+1)\,C_{m+n}^{m}}$$

Sample problems and solutions

Problem 1

A random variable X (where $0 \le X \le 1$) has the following pdf:

$$f(x) = k\,x^5(1-x)^2, \quad \text{where } k \text{ is a constant.}$$

Find $k$, $E(X)$, $Var(X)$.

Solution

Method 1: Using memorized formulas for general beta distribution

X has beta distribution with parameters $\alpha = 6$, $\beta = 3$.

$$k = \frac{1}{B(\alpha, \beta)} = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)} = \frac{\Gamma(6+3)}{\Gamma(6)\,\Gamma(3)} = \frac{\Gamma(9)}{\Gamma(6)\,\Gamma(3)} = \frac{8!}{5!\,2!} = 168$$

$$E(X) = \frac{\alpha}{\alpha+\beta} = \frac{6}{6+3} = \frac{2}{3}$$

$$Var(X) = \frac{\alpha}{\alpha+\beta}\cdot\frac{\beta}{\alpha+\beta}\cdot\frac{1}{\alpha+\beta+1} = \frac{6}{6+3}\cdot\frac{3}{6+3}\cdot\frac{1}{6+3+1} = \frac{1}{45}$$

Method 2: Do integration

First, we'll find k.

$$\int_0^1 f(x)\,dx = 1 \;\Rightarrow\; k\int_0^1 x^5(1-x)^2\,dx = 1$$

$$\int_0^1 x^5(1-x)^2\,dx = \frac{1}{(5+2+1)\,C_{5+2}^{5}} = \frac{1}{8\,C_7^5} = \frac{1}{8\,C_7^2} \;\Rightarrow\; k = 8\,C_7^2 = 168$$

Alternatively, we have # of successes = 5, # of failures = 2.

$$k = (\text{# of trials} + 1)\,C_{\text{# of trials}}^{\text{# of successes}} = (5+2+1)\,C_{5+2}^{5} = 8\,C_7^5 = 8\,C_7^2 = 168$$

$$E(X) = \int_0^1 x f(x)\,dx = k\int_0^1 x^6(1-x)^2\,dx, \qquad \int_0^1 x^6(1-x)^2\,dx = \frac{1}{(6+2+1)\,C_8^6} = \frac{1}{9\,C_8^2}$$

$$E(X) = \frac{k}{9\,C_8^2} = \frac{8\,C_7^2}{9\,C_8^2} = \frac{8\cdot\frac{(7)(6)}{2!}}{9\cdot\frac{(8)(7)}{2!}} = \frac{6}{9} = \frac{2}{3}$$

$$E(X^2) = \int_0^1 x^2 f(x)\,dx = k\int_0^1 x^7(1-x)^2\,dx, \qquad \int_0^1 x^7(1-x)^2\,dx = \frac{1}{(7+2+1)\,C_{7+2}^{7}} = \frac{1}{10\,C_9^2}$$

$$E(X^2) = \frac{k}{10\,C_9^2} = \frac{8\,C_7^2}{10\,C_9^2} = \frac{8\cdot\frac{(7)(6)}{2!}}{10\cdot\frac{(9)(8)}{2!}} = \frac{7}{15}$$

$$Var(X) = E(X^2) - E^2(X) = \frac{7}{15} - \left(\frac{2}{3}\right)^2 = \frac{1}{45}$$

Problem 2

The percentage of defective products in a batch of products, p, is assumed to be uniformly distributed in $[0, 1]$. An engineer randomly chooses 50 items with replacement from this batch and discovers no defective products. Determine the posterior probability that more than 5% of the products in this batch are defective.

Solution

If 50 items are sampled with replacement, then the number of defective items found in this sample has a binomial distribution (if it is a sample without replacement, you'll get a hypergeometric distribution). Given our interpretation of Beta distribution, the posterior probability of p has Beta distribution:

$$f(p) = \frac{p^m(1-p)^n}{\int_0^1 p^m(1-p)^n\,dp} = (m+n+1)\,C_{m+n}^{m}\,p^m(1-p)^n$$

where $(m+n+1)$ is the # of trials + 1 and $C_{m+n}^{m}\,p^m(1-p)^n$ is the binomial probability of m successes and n failures.

In this problem, $m = 0$, $n = 50$.

$$f(p) = (m+n+1)\,C_{m+n}^{m}\,p^m(1-p)^n = (0+50+1)\,C_{0+50}^{0}\,p^0(1-p)^{50} = 51(1-p)^{50}$$

$$\Pr(p > 5\%) = \int_{5\%}^{1} f(p)\,dp = \int_{5\%}^{1} 51(1-p)^{50}\,dp = -\int_{5\%}^{1} d\!\left[(1-p)^{51}\right] = \left[-(1-p)^{51}\right]_{5\%}^{1} = (0.95)^{51} = 7.3\%$$

Alternatively, $\Pr(p > 5\%) = 1 - \Pr(p \le 5\%) = 1 - F(5\%)$.

$F(5\%)$ = Binomial probability of having more than zero successes in 51 trials, where the success rate is 5% per trial.

$1 - F(5\%)$ = Pr(having zero successes in 51 trials) = $(0.95)^{51} = 7.3\%$.

Problem 3

You are tossing a coin. Not knowing p, the success rate of heads showing up in one toss of the coin, you subjectively assume that p has the following distribution over $[0, 1]$:

$$f(p) = \frac{1}{3}\,p^2$$

Next, you do an experiment by tossing the coin 10 times. You find that, in this experiment, 2 out of these 10 tosses have heads. Find the mean of the posterior probability of p.

Solution

Using Bayes' Theorem, we know the posterior probability is:

After-event size of the group = k (scaling factor) × before-event group size × the group's probability to have 2 heads out of 10 tosses

$$= k \times \frac{1}{3}\,p^2 \times C_{10}^{2}\,p^2(1-p)^8$$

So the posterior probability has the following form:

$$\text{constant} \times p^4(1-p)^8, \quad \text{where } 0 \le p \le 1.$$

Every constant factor, including the $\frac{1}{3}$ in the prior, is absorbed into this constant. Without knowing exactly what the constant is, we see that the posterior probability is a Beta distribution with $\alpha = 5$ and $\beta = 9$. Next, we can simply use the following memorized formula:

$$E(P) = \frac{\alpha}{\alpha+\beta} = \frac{5}{5+9} = \frac{5}{14}$$

Alternatively, $f(p) = c\,p^4(1-p)^8$, where c is a constant.

$$\int_0^1 f(p)\,dp = 1 \;\Rightarrow\; c\int_0^1 p^4(1-p)^8\,dp = 1 \;\Rightarrow\; c = \frac{1}{\int_0^1 p^4(1-p)^8\,dp} = (4+8+1)\,C_{4+8}^{4} = 13\,C_{12}^{4}$$

$$E(P) = \int_0^1 p f(p)\,dp = 13\,C_{12}^{4}\int_0^1 p^5(1-p)^8\,dp = \frac{13\,C_{12}^{4}}{(5+8+1)\,C_{5+8}^{5}} = \frac{13}{14}\cdot\frac{\frac{12(11)(10)(9)}{4!}}{\frac{13(12)(11)(10)(9)}{5!}} = \frac{13}{14}\cdot\frac{5}{13} = \frac{5}{14}$$

Problem 4

A random variable X (where $0 \le X \le 1$) has the following pdf:

$$f(x) = k\,x^3(1-x)^2, \quad \text{where } k \text{ is a constant.}$$

Find $\Pr\!\left(X \le \frac{1}{3}\right) = F\!\left(\frac{1}{3}\right)$.

Solution

X has a simplified beta distribution with parameters $m = 3$ and $n = 2$.

Method 1: Do integration.

$$\Pr\!\left(X \le \tfrac{1}{3}\right) = F\!\left(\tfrac{1}{3}\right) = \int_0^{1/3} f(x)\,dx = k\int_0^{1/3} x^3(1-x)^2\,dx$$

$$k = (m+n+1)\,C_{m+n}^{m} = (3+2+1)\,C_{3+2}^{3} = 6\,C_5^3 = 6\,C_5^2 = (6)\,\frac{5(4)}{2!} = 60$$

$$F\!\left(\tfrac{1}{3}\right) = 60\int_0^{1/3} x^3(1-x)^2\,dx = 60\int_0^{1/3} x^3(x^2 - 2x + 1)\,dx = 60\int_0^{1/3} (x^5 - 2x^4 + x^3)\,dx$$

$$= 60\left[\frac{1}{6}x^6 - \frac{2}{5}x^5 + \frac{1}{4}x^4\right]_0^{1/3} \approx 10.0137\%$$

Method 2

$F\!\left(\frac{1}{3}\right)$ = Binomial probability of having more than $m = 3$ successes in $m + n + 1 = 3 + 2 + 1 = 6$ trials, where the success rate is $p = \frac{1}{3}$ in one trial.

$$F\!\left(\tfrac{1}{3}\right) = C_6^4\left(\tfrac{1}{3}\right)^4\left(\tfrac{2}{3}\right)^2 + C_6^5\left(\tfrac{1}{3}\right)^5\left(\tfrac{2}{3}\right) + C_6^6\left(\tfrac{1}{3}\right)^6\left(\tfrac{2}{3}\right)^0 \approx 8.23045\% + 1.64609\% + 0.13717\% \approx 10.0137\%$$

Problem 5

A random variable X (where $0 \le X \le 1$) has the following pdf:

$$f(x) = k\,x^2(1-x)^{3/2}, \quad \text{where } k \text{ is a constant.}$$

Find $\Pr\!\left(X \le \frac{1}{3}\right) = F\!\left(\frac{1}{3}\right)$.

Solution

X is a beta random variable with parameters $\alpha = 2 + 1 = 3$ and $\beta = \frac{3}{2} + 1 = \frac{5}{2}$. Because one parameter is a fraction, we cannot use binomial distribution to calculate $F\!\left(\frac{1}{3}\right)$ as we did in the previous problem. We have to do integration. In addition, we do not like $(1-x)^{3/2}$. Let's set $y = 1 - x$.

$$\Pr\!\left(X \le \tfrac{1}{3}\right) = \int_0^{1/3} k\,x^2(1-x)^{3/2}\,dx = -\int_1^{2/3} k\,(1-y)^2\,y^{3/2}\,dy = k\int_{2/3}^{1} y^{3/2}(1-y)^2\,dy$$

$$k = \frac{\Gamma\!\left(3 + \frac{5}{2}\right)}{\Gamma(3)\,\Gamma\!\left(\frac{5}{2}\right)} = \frac{\Gamma(5.5)}{\Gamma(3)\,\Gamma(2.5)} = \frac{(4.5)(3.5)(2.5)\,\Gamma(2.5)}{(2!)\,\Gamma(2.5)} = \frac{(4.5)(3.5)(2.5)}{2} = 19.6875$$

$$\int_{2/3}^{1} y^{3/2}(1-y)^2\,dy = \int_{2/3}^{1} y^{1.5}(y^2 - 2y + 1)\,dy = \int_{2/3}^{1} \left(y^{3.5} - 2y^{2.5} + y^{1.5}\right)dy$$

$$= \left[\frac{1}{4.5}\,y^{4.5} - \frac{2}{3.5}\,y^{3.5} + \frac{1}{2.5}\,y^{2.5}\right]_{2/3}^{1} \approx 0.804\%$$

$$\Pr\!\left(X \le \tfrac{1}{3}\right) \approx 19.6875\,(0.804\%) \approx 15.83\%$$

Final comment. If you are given the following beta distribution (both parameters are fractions)

$$f(x) = k\,x^{5/2}(1-x)^{3/2}$$

and you are asked to find $\Pr\!\left(X \le \frac{1}{3}\right) = F\!\left(\frac{1}{3}\right)$, then the integration is tough. To do the integration, you might want to try setting $x = \sin^2 t$, $y = 1 - x = \cos^2 t$. This is too much work. It's unlikely that SOA will include this type of heavy calculus in the exam.

Homework for you: Rework all the problems in this chapter.
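The answers to the sample problems in this chapter can be reproduced with exact fractions where the parameters are integers. The sketch below is only an illustration: `beta_integral` is a hypothetical helper name that encodes the integration shortcut, and Problem 5 is checked with floating-point arithmetic because one parameter is fractional.

```python
from fractions import Fraction
import math

def beta_integral(m, n):
    """Integration shortcut: integral of p^m (1-p)^n over [0, 1], for non-negative integers m, n."""
    return Fraction(1, (m + n + 1) * math.comb(m + n, m))

# Problem 1: f(x) = k x^5 (1 - x)^2
k = 1 / beta_integral(5, 2)
mean = k * beta_integral(6, 2)
second_moment = k * beta_integral(7, 2)
variance = second_moment - mean ** 2
print(k, mean, variance)  # 168 2/3 1/45

# Problem 2: posterior pdf 51 (1 - p)^50, so Pr(p > 5%) = 0.95^51
prob_above_5pct = 0.95 ** 51
print(prob_above_5pct)  # about 0.073

# Problem 3: posterior proportional to p^4 (1 - p)^8
posterior_mean = beta_integral(5, 8) / beta_integral(4, 8)
print(posterior_mean)  # 5/14

# Problem 5: f(x) = k x^2 (1 - x)^(3/2), after substituting y = 1 - x
k5 = math.gamma(5.5) / (math.gamma(3) * math.gamma(2.5))

def antiderivative(y):
    """Antiderivative of y^3.5 - 2 y^2.5 + y^1.5."""
    return y ** 4.5 / 4.5 - 2 * y ** 3.5 / 3.5 + y ** 2.5 / 2.5

prob5 = k5 * (antiderivative(1.0) - antiderivative(2 / 3))
print(k5, prob5)  # about 19.6875 and 0.158
```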

Chapter 22 Weibull distribution

We know that a component's time to failure can be modeled using exponential distribution. Now let's consider a machine that has n components. The machine works as long as all the components are working; it stops working if at least one component stops working. What is the probability distribution of the machine's time to failure?

Let's make two simplifying assumptions: (1) each component is independent of any other component, and (2) each component's time to failure is exponentially distributed with identical mean $\theta$.

Let $T_1$ represent the first component's time to failure. $T_1$ is exponentially distributed with mean $\theta$: $\Pr(T_1 > t) = e^{-t/\theta}$.

Let $T_2$ represent the second component's time to failure. $T_2$ is exponentially distributed with mean $\theta$: $\Pr(T_2 > t) = e^{-t/\theta}$.

...

Let $T_n$ represent the n-th component's time to failure. $T_n$ is exponentially distributed with mean $\theta$: $\Pr(T_n > t) = e^{-t/\theta}$.

Let T represent the machine's time to failure. Then $T = \min(T_1, T_2, \dots, T_n)$.

To derive the pdf of T, notice

$$\Pr(T > t) = \Pr\left[\min(T_1, T_2, \dots, T_n) > t\right] = \Pr\left[(T_1 > t) \cap (T_2 > t) \cap \dots \cap (T_n > t)\right]$$

Because $T_1, T_2, \dots, T_n$ are independent, we have

$$\Pr\left[(T_1 > t) \cap (T_2 > t) \cap \dots \cap (T_n > t)\right] = \Pr(T_1 > t)\,\Pr(T_2 > t) \cdots \Pr(T_n > t) = \left(e^{-t/\theta}\right)^n$$

$$F(t) = \Pr(T \le t) = 1 - \Pr(T > t) = 1 - \left(e^{-t/\theta}\right)^n$$
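The survival function of the machine can be checked with a small simulation. This is a rough sketch rather than anything from the text: the function name, the example values (4 components, mean 10, time 2), the number of trials and the fixed seed are all assumptions made for illustration.

```python
import math
import random

def simulate_machine_survival(n, theta, t, trials=100_000, seed=0):
    """Estimate Pr(min(T_1, ..., T_n) > t) for independent exponential lifetimes with mean theta."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(trials):
        # random.expovariate takes the rate, which is 1 / mean.
        machine_life = min(rng.expovariate(1 / theta) for _ in range(n))
        if machine_life > t:
            survived += 1
    return survived / trials

n, theta, t = 4, 10.0, 2.0
estimate = simulate_machine_survival(n, theta, t)
exact = math.exp(-t / theta) ** n  # (e^(-t/theta))^n from the derivation
print(estimate, exact)  # the estimate should land close to 0.449
```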

$F(t) = 1 - \left(e^{-t/\theta}\right)^n$ is very close to Weibull distribution. However, there's one more step. Statisticians found out that if they change

$$F(t) = 1 - \left(e^{-t/\theta}\right)^n = 1 - e^{-nt/\theta} \quad \text{into} \quad F(x) = 1 - e^{-(x/\theta)^{\beta}},$$

the resulting distribution $F(x) = 1 - e^{-(x/\theta)^{\beta}}$ is much more useful.

There's no good theoretical justification why $F(t) = 1 - \left(e^{-t/\theta}\right)^n = 1 - e^{-nt/\theta}$ needs to be changed to $F(x) = 1 - e^{-(x/\theta)^{\beta}}$. The key point is that people who use Weibull distribution don't care much where the cdf $F(x) = 1 - e^{-(x/\theta)^{\beta}}$ comes from. All they care about is that this cdf is very flexible and they can easily fit their data into this cdf. This is pretty much all you need to know about the theories behind Weibull distribution.

If an object's time to failure, X, follows Weibull distribution, then its probability to fail by time x is:

$$\Pr(X \le x) = 1 - e^{-(x/\theta)^{\beta}}, \quad \text{where } \theta > 0,\; \beta > 0.$$

Stated differently, if an object's time to failure, X, follows Weibull distribution, then its probability to survive time x is:

$$\Pr(X > x) = e^{-(x/\theta)^{\beta}}, \quad \text{where } \theta > 0,\; \beta > 0.$$

$\theta$ is called the scale parameter and $\beta$ is called the shape parameter.

Weibull distribution is widely used to describe the failure time of a machine (a car, a vacuum cleaner, light bulbs, etc.) which consists of several components and which fails to work if at least one component stops working.

Please note that many textbooks write this survival function with different symbols for the two parameters. I like to use $\Pr(X > x) = e^{-(x/\theta)^{\beta}}$ to help me remember that Weibull is just a complex version of exponential distribution.

What is so special about Weibull distribution? The Weibull pdf can take a variety of shapes such as a bell curve, a U shape, a J shape, a roughly straight line, or some other shape. To get a feel for the Weibull pdf shape, go to http://www.engr.mun.ca/~ggeorge/3423/demos/ and download the Weibull Probability Distribution Excel spreadsheet. This spreadsheet lets you enter parameters $\theta$ and $\beta$. Then it displays the corresponding graphs for Weibull pdf and cdf. You can play around with the spreadsheet. Enter different parameters in the spreadsheet and watch how Weibull pdf and cdf shapes change.

Because Weibull pdf can take a variety of shapes, Weibull distribution can fit a variety of data. This is why Weibull distribution has many applications. When you use Weibull distribution to fit data, you recognize the fact that your data may not fit into a neat bell curve, a neat exponential distribution, or other standard shape. By using Weibull distribution, you give your data a chance to speak for themselves. In contrast, if you use a distribution with a fixed shape, say a normal distribution, you implicitly force your data to fit into a bell curve.

To solve Weibull distribution-related problems, remember the following key points:

1. Do not be scared. Weibull distribution sounds hard, but the calculation is simple.

2. The bare-bones formula you want to memorize about a Weibull random variable X (where $X \ge 0$) is $\Pr(X > x) = e^{-(x/\theta)^{\beta}}$, where $\theta > 0$, $\beta > 0$. Compare this with the formula for an exponential variable X: $\Pr(X > x) = e^{-x/\theta}$, and you see that exponential distribution is just a simplified version of Weibull distribution by setting $\beta = 1$.

3. About the pdf, mean and variance formula, you have two options. One option is to memorize the formulas; the other is to derive the mean and variance from the bare-bones formula of $\Pr(X > x)$. I recommend that you master both options. You should use Option 1 for the exam and use Option 2 as a backup (in case you forget the formulas).

Method 1: memorize the formulas for $E(X)$, $Var(X)$

Under Method 1, you still do not need to memorize the pdf.
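You can find the pdf using the formula:

$$f(x) = \frac{d}{dx}F(x) = \frac{d}{dx}\Pr(X \le x) = \frac{d}{dx}\left[1 - \Pr(X > x)\right] = -\frac{d}{dx}\Pr(X > x)$$

$$f(x) = \frac{\beta}{\theta}\left(\frac{x}{\theta}\right)^{\beta-1} e^{-(x/\theta)^{\beta}}$$

Here $\theta$ denotes the scale parameter and $\beta$ the shape parameter.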

However, to find the mean and variance, you will want to memorize:

$$E(X^n) = \theta^n\,\Gamma\!\left(1 + \frac{n}{\beta}\right), \quad \text{where } n = 1, 2, \dots$$

If $\frac{n}{\beta}$ is an integer (likely to appear on the exam), then the above formula becomes

$$E(X^n) = \theta^n\left(\frac{n}{\beta}\right)!$$

Method 2: derive $E(X)$, $Var(X)$ using general probability formulas. Method 2 is not hard (I will show you how).

Derive pdf: (same as Method 1)

$$f(x) = \frac{d}{dx}F(x) = \frac{d}{dx}\Pr(X \le x) = \frac{d}{dx}\left[1 - \Pr(X > x)\right] = -\frac{d}{dx}\Pr(X > x)$$

Derive mean and variance:

$$E(X^n) = \int_0^{+\infty} f(x)\,x^n\,dx$$

To simplify our integration, we use the following shortcut: for a non-negative random variable X where $0 \le X < +\infty$,

$$E(X^n) = \int_0^{+\infty} \Pr(X > x)\,d\!\left(x^n\right)$$

Please note that this shortcut works for any non-negative variable with a lower bound of zero and an upper bound of positive infinity.

Proof.

$$f(x) = -\frac{d}{dx}\Pr(X > x) \;\Rightarrow\; f(x)\,dx = -d\left[\Pr(X > x)\right]$$

$$E(X^n) = \int_0^{+\infty} x^n f(x)\,dx = -\int_0^{+\infty} x^n\,d\left[\Pr(X > x)\right] = -\Big[x^n \Pr(X > x)\Big]_0^{+\infty} + \int_0^{+\infty} \Pr(X > x)\,d\!\left(x^n\right) \quad \text{(integration by parts)}$$

$$\Big[x^n \Pr(X > x)\Big]_0^{+\infty} = (+\infty)^n \Pr(X > +\infty) - 0^n \Pr(X > 0) = (+\infty)^n \Pr(X > +\infty)$$

$(+\infty)^n \Pr(X > +\infty)$ cannot be $\infty$. If it is $\infty$, $E(X^n)$ will be undefined. But if you are asked to calculate $E(X^n)$, it must exist in the first place. So $(+\infty)^n \Pr(X > +\infty)$ must be zero.

If you feel uneasy about setting $(+\infty)^n \Pr(X > +\infty)$ to zero, use a human life as the random variable. Let X represent the number of years until the death of a human being. If you choose a large number such as $x = 200$, then you will surely have $\Pr(X > 200) = 0$ because no one can live for 200 years. Clearly, we can set $\Big[x^n \Pr(X > x)\Big]_0^{+\infty} = 0$. So we have

$$E(X^n) = \int_0^{+\infty} \Pr(X > x)\,d\!\left(x^n\right)$$

Sample problems and solutions

Problem 1

For a Weibull random variable X with parameters $\theta = 10$, $\beta = \frac{1}{4}$, find the pdf, $E(X)$, $Var(X)$, the 80th percentile, and $\Pr(X > 20)$.

Solution

We will first solve the problem with Method 2 (the hard way). If you can master this method you will do fine in the exam, even if you forget the formulas.

Method 2: this is the "let's focus on the basic formulas and derive the rest" approach.

For Weibull distribution, we always start from $\Pr(X > x)$ (which is called the survival function), not from $f(x)$. $\Pr(X > x)$ is a lot easier to memorize than the cdf. Let's start with $\Pr(X > x)$. From here we can find everything else.

$$\Pr(X > x) = e^{-(x/\theta)^{\beta}} = e^{-(x/\theta)^{1/4}}, \quad \text{where } x \ge 0$$

$$f(x) = -\frac{d}{dx}\Pr(X > x) = -\frac{d}{dx}\,e^{-(x/\theta)^{1/4}} = e^{-(x/\theta)^{1/4}}\,\frac{1}{4\theta}\left(\frac{x}{\theta}\right)^{-3/4} \quad \text{(not hard)}$$

$$E(X) = \int_0^{+\infty} \Pr(X > x)\,dx = \int_0^{+\infty} e^{-(x/\theta)^{1/4}}\,dx$$

Don't know what to do next? Simply do a transformation.

Let's set $\left(\frac{x}{\theta}\right)^{1/4} = t$. This will change the ugly $e^{-(x/\theta)^{1/4}}$ into a nice $e^{-t}$.

$$\left(\frac{x}{\theta}\right)^{1/4} = t \;\Rightarrow\; \frac{x}{\theta} = t^4 \;\Rightarrow\; x = \theta t^4 \;\Rightarrow\; dx = 4\theta t^3\,dt$$

$$E(X) = \int_0^{+\infty} e^{-(x/\theta)^{1/4}}\,dx = \int_0^{+\infty} e^{-t}\,4\theta t^3\,dt = 4\theta\int_0^{+\infty} e^{-t}t^3\,dt = 4\theta\,(3!) = \theta\,(4!) = 240$$

Please note that we use the formula $\int_0^{+\infty} e^{-t}t^n\,dt = n!$ from the chapter on gamma distribution.

Similarly, we can find $E(X^2)$.

$$E(X^2) = \int_0^{+\infty} \Pr(X > x)\,d\!\left(x^2\right) = \int_0^{+\infty} e^{-(x/\theta)^{1/4}}\,d\!\left(x^2\right) = 2\int_0^{+\infty} e^{-(x/\theta)^{1/4}}\,x\,dx$$

$$= 2\int_0^{+\infty} e^{-t}\left(\theta t^4\right)\left(4\theta t^3\,dt\right) = 8\theta^2\int_0^{+\infty} e^{-t}t^7\,dt = 8\theta^2\,(7!) = \theta^2\,(8!) = 4{,}032{,}000$$

$$Var(X) = E(X^2) - E^2(X) = \theta^2\,(8!) - \left[\theta\,(4!)\right]^2 = 3{,}974{,}400$$

Next, we will find the 80th percentile, $x_{80}$.

$$F(x_{80}) = 80\% \;\Rightarrow\; \Pr(X > x_{80}) = 20\% \;\Rightarrow\; e^{-(x_{80}/10)^{1/4}} = 20\%$$

$$\left(\frac{x_{80}}{10}\right)^{1/4} = -\ln 20\% \;\Rightarrow\; x_{80} = 10\,(-\ln 20\%)^4 \approx 67.1$$

$$\Pr(X > 20) = e^{-(20/10)^{1/4}} = 0.3045$$

Method 1: Use memorized formulas (unpleasant to memorize, but you will work faster in the exam if you have them memorized).

Remember, under Method 1, we still do not want to memorize the pdf. We will derive the pdf instead. Let's start with $\Pr(X > x)$.

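$$\Pr(X > x) = e^{-(x/\theta)^{\beta}} = e^{-(x/\theta)^{1/4}}, \quad \text{where } x \ge 0$$

$$f(x) = -\frac{d}{dx}\Pr(X > x) = -\frac{d}{dx}\,e^{-(x/\theta)^{1/4}} = e^{-(x/\theta)^{1/4}}\,\frac{1}{4\theta}\left(\frac{x}{\theta}\right)^{-3/4} \quad \text{(not hard)}$$

To find the mean and variance, we use the memorized formula:

$$E(X^n) = \theta^n\,\Gamma\!\left(1 + \frac{n}{\beta}\right)$$

We have

$$E(X) = \theta\,\Gamma\!\left(1 + \frac{1}{\beta}\right) = \theta\,\Gamma\!\left(1 + \frac{1}{1/4}\right) = \theta\,\Gamma(5) = \theta\,(4!)$$

$$E(X^2) = \theta^2\,\Gamma\!\left(1 + \frac{2}{\beta}\right) = \theta^2\,\Gamma\!\left(1 + \frac{2}{1/4}\right) = \theta^2\,\Gamma(9) = \theta^2\,(8!)$$

$$Var(X) = E(X^2) - E^2(X) = \dots$$

You see that we get the same result as Method 2. The calculation of $x_{80}$ and $\Pr(X > 20)$ is the same as the calculation in Method 2.

Homework for you: Rework all the problems in this chapter.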
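For readers who like to double-check by machine, the numbers in Problem 1 can be reproduced with a short Python sketch. The function names are illustrative, and the scale and shape arguments (called `theta` and `beta` here) are simply the two Weibull parameters from the survival function $\Pr(X > x) = e^{-(x/\text{scale})^{\text{shape}}}$.

```python
import math

def weibull_survival(x, theta, beta):
    """Pr(X > x) = exp(-(x / theta)^beta)."""
    return math.exp(-((x / theta) ** beta))

def weibull_moment(n, theta, beta):
    """Memorized formula: E(X^n) = theta^n * Gamma(1 + n / beta)."""
    return theta ** n * math.gamma(1 + n / beta)

def weibull_percentile(q, theta, beta):
    """Solve Pr(X > x) = 1 - q for x."""
    return theta * (-math.log(1 - q)) ** (1 / beta)

theta, beta = 10, 0.25
mean = weibull_moment(1, theta, beta)
second_moment = weibull_moment(2, theta, beta)
variance = second_moment - mean ** 2
x80 = weibull_percentile(0.80, theta, beta)
survive_20 = weibull_survival(20, theta, beta)

print(mean, second_moment, variance)  # about 240, 4,032,000 and 3,974,400
print(x80, survive_20)                # about 67.1 and 0.3045
```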

Chapter 23 Pareto distribution

Key points:

1. Pareto distribution is often used to model losses where the probability of incurring a catastrophic loss (such as in a hurricane) is small but not zero.

2. You need to memorize two formulas:

$$\Pr(X > x) = \left(\frac{\theta}{x+\theta}\right)^{\alpha}, \quad \text{where } \theta, \alpha \text{ are positive numbers and } 0 \le x < +\infty.$$

For $k < \alpha$, the k-th moment is

$$E\!\left(X^k\right) = \frac{\theta^k}{C_{\alpha-1}^{k}}, \quad \text{where } C_{\alpha-1}^{k} = \frac{(\alpha-1)!}{k!\,(\alpha-1-k)!} = \frac{(\alpha-1)(\alpha-2)\cdots(\alpha-k)}{k!}$$

Please note that in the k-th moment formula, $\alpha$ is not necessarily an integer. When $\alpha$ is a positive non-integer, we still use the notation $C_{\alpha-1}^{k}$ to help us memorize the formula. However, if $k \ge \alpha$, then $E\!\left(X^k\right)$ is not defined (we will look into this with an example).

Explanations

Let's use both Pareto distribution and exponential distribution to model a loss random variable X.

Exponential: $\Pr(X > x) = e^{-x/\theta}$

Pareto: $\Pr(X > x) = \left(\dfrac{\theta}{x+\theta}\right)^{\alpha}$

In exponential distribution, the probability that loss exceeds a large threshold amount (such as $1,000,000) is close to zero. You can see, mathematically, that when x gets bigger, $\Pr(X > x)$ quickly approaches zero.

For example, if we set $\theta = \$5{,}000$ (average loss amount), then the probability that loss exceeds $1,000,000 is:

$$\Pr(X > x) = e^{-x/\theta} = e^{-1{,}000{,}000/5{,}000} = e^{-200} = 1.384 \times 10^{-87} \approx 0$$
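To see how differently the two tails behave, the two survival functions from the key points can be evaluated side by side. This is a minimal sketch with illustrative function names; the Pareto shape $\alpha = 2$ is an example value chosen for the comparison.

```python
import math

def exponential_survival(x, theta):
    """Pr(X > x) = exp(-x / theta)."""
    return math.exp(-x / theta)

def pareto_survival(x, theta, alpha):
    """Pr(X > x) = (theta / (x + theta))^alpha."""
    return (theta / (x + theta)) ** alpha

threshold = 1_000_000
exp_tail = exponential_survival(threshold, 5_000)
pareto_tail = pareto_survival(threshold, 5_000, 2)

print(exp_tail)     # about 1.384e-87
print(pareto_tail)  # about 2.475e-05
```

The Pareto probability is tiny in absolute terms, yet it is more than eighty orders of magnitude larger than the exponential one, which is what a heavy tail means in practice.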

In contrast, in Pareto distribution, the probability that loss exceeds a large threshold amount, though small, is not zero. For example, let's set $\theta = \$5{,}000$, $\alpha = 2$. The probability that loss exceeds $1,000,000 is:

$$\Pr(X > x) = \left(\frac{\theta}{x+\theta}\right)^{\alpha} = \left(\frac{5{,}000}{1{,}000{,}000 + 5{,}000}\right)^2 = 0.0049752^2 = 0.00002475$$

0.00002475 is small, but is significantly more than $1.384 \times 10^{-87}$.

If you are an actuary trying to fit a bunch of loss data into a neat formula, you need to ask yourself, "How likely is a catastrophic loss? Is the probability so small that I can ignore it?" If "No," then you can use Pareto distribution to fit the loss data. If "Yes," then you can use exponential (or some other models).

This brings up an important concept called a "tail." For two probability distributions A and B (such as exponential and Pareto) and for an identical loss amount x, if $\Pr_A(X > x) > \Pr_B(X > x)$, we say that A has a fatter (or heavier) tail than does B. For example, Pareto distribution has a fatter or heavier tail than does exponential distribution.

Stated differently, if a probability distribution has a fat (or heavy) tail, then the probability that loss exceeds a large threshold amount approaches zero slowly. In contrast, if a probability distribution has a light tail, then the probability that loss exceeds a large threshold amount approaches zero quickly. Pareto, gamma, and lognormal distributions have a heavy tail. Exponential has a light tail. Heavy-tailed distributions are better at modeling losses where there's a small but non-zero chance of having a large loss.

Now let's turn our attention to the k-th moment formula:

$$E\!\left(X^k\right) = \frac{\theta^k}{C_{\alpha-1}^{k}}$$

The derivation of this formula is time-consuming. Just memorize the above formula. However, let's take time to derive the formula for Pareto mean and variance. This way, if you forget the formula in the exam, you can still calculate the mean and variance using basic statistics principles.

To derive the mean and variance formula, we will use the following equation:

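$$E(X^n) = \int_0^{+\infty} \Pr(X > x)\,d\!\left(x^n\right) = \int_0^{+\infty} \left(\frac{\theta}{x+\theta}\right)^{\alpha} d\!\left(x^n\right)$$

$$E(X) = \int_0^{+\infty} \Pr(X > x)\,dx = \int_0^{+\infty} \left(\frac{\theta}{x+\theta}\right)^{\alpha} dx = \theta^{\alpha}\int_{\theta}^{+\infty} t^{-\alpha}\,dt \quad (\text{let } x + \theta = t)$$

If $-\alpha + 1 < 0$ (i.e. $\alpha > 1$), then

$$E(X) = \theta^{\alpha}\int_{\theta}^{+\infty} t^{-\alpha}\,dt = \theta^{\alpha}\left[\frac{t^{-\alpha+1}}{-\alpha+1}\right]_{\theta}^{+\infty} = \theta^{\alpha}\,\frac{\theta^{-\alpha+1}}{\alpha-1} = \frac{\theta}{\alpha-1}$$

If $\alpha = 1$,

$$E(X) = \theta\int_{\theta}^{+\infty} t^{-1}\,dt = \theta\,\big[\ln t\big]_{\theta}^{+\infty} = +\infty \quad \text{(undefined)}$$

If $\alpha < 1$,

$$E(X) = \theta^{\alpha}\int_{\theta}^{+\infty} t^{-\alpha}\,dt = \theta^{\alpha}\left[\frac{t^{-\alpha+1}}{-\alpha+1}\right]_{\theta}^{+\infty} = +\infty \quad \text{(undefined)}$$

For the second moment:

$$E(X^2) = \int_0^{+\infty} \Pr(X > x)\,d\!\left(x^2\right) = \int_0^{+\infty} 2x\left(\frac{\theta}{x+\theta}\right)^{\alpha} dx = 2\theta^{\alpha}\int_{\theta}^{+\infty} (t-\theta)\,t^{-\alpha}\,dt = 2\theta^{\alpha}\left[\int_{\theta}^{+\infty} t^{-\alpha+1}\,dt - \theta\int_{\theta}^{+\infty} t^{-\alpha}\,dt\right]$$

If $\alpha > 2$, then

$$\int_{\theta}^{+\infty} t^{-\alpha+1}\,dt = \left[\frac{t^{-\alpha+2}}{-\alpha+2}\right]_{\theta}^{+\infty} = \frac{\theta^{-\alpha+2}}{\alpha-2}$$

If $\alpha > 1$, then

$$\int_{\theta}^{+\infty} t^{-\alpha}\,dt = \left[\frac{t^{-\alpha+1}}{-\alpha+1}\right]_{\theta}^{+\infty} = \frac{\theta^{-\alpha+1}}{\alpha-1}$$

Finally, for $\alpha > 2$:

$$E(X^2) = 2\theta^{\alpha}\left[\frac{\theta^{-\alpha+2}}{\alpha-2} - \theta\,\frac{\theta^{-\alpha+1}}{\alpha-1}\right] = 2\theta^2\left[\frac{1}{\alpha-2} - \frac{1}{\alpha-1}\right] = \frac{2\theta^2}{(\alpha-1)(\alpha-2)} = \frac{\theta^2}{C_{\alpha-1}^{2}}$$

You can easily verify that $E(X^2)$ is undefined if $\alpha \le 2$.

Sample problems and solutions

Problem 1

A Pareto distribution has the parameters $\theta = 100$ and $\alpha = 2$. Find the 60th percentile.

Solution

$$\Pr(X > x) = \left(\frac{\theta}{x+\theta}\right)^{\alpha} = \left(\frac{100}{x+100}\right)^2$$

Let $x_{60}$ represent the 60th percentile.

$$F(x_{60}) = 60\% \;\Rightarrow\; \Pr(X > x_{60}) = 40\% \;\Rightarrow\; \left(\frac{100}{x_{60}+100}\right)^2 = 40\% \;\Rightarrow\; x_{60} = 58.11$$

Problem 2

Individual claims on an insurance policy have a Pareto distribution with a mean of $3,000 and a variance of 63,000,000 (the unit is dollar squared). This policy has a deductible of $5,000. The maximum amount the insurer will pay on an individual claim is $9,000. Find

(1) The expected claim payment the insurer will pay on this policy.
(2) The expected claim payment the insurer will pay on this policy, given there is a claim.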

http://www.guo.coursehost.com Solution Let X represent the individual loss; let Y represent the claim payment by the insurance company. We are asked to find E ( Y ) and E (Y Y > 0 ) . We first need to find the two parameters of the Pareto di stribution. E X ( k )= Ck k 1 E(X ) = 1 , E(X 2 2 )= 2 C2 = 1 ( 1)( 1)( 2 2 2 2) 2 2 Var ( X ) = E ( X ) E2 ( X ) = ( 2 2) 1

= 1 2 We are given E ( X ) = 3K , Var ( X ) = 63K 2 , where K=$1,000. 2 1 = 3K , 1 2 = 63K 2 7 = , 3 = 4K We use K=$1,000 to keep our calculation simple and fast. Then Pr ( X > x ) = x+ 4 = x+4 7 3 x is the loss amount in thousands of dollars 0, if X 5 K ! Y = "( X 5 ) K , if 5K < X 14 K !9 K , if X > 14 K # The expected claim payment: E (Y ) = 14 ( x 5 ) f ( x ) dx + + 9 f ( x ) dx 5 14 Since we dont know f ( x ) , we will solve the above integration using Pr( X > x) . Yufeng Guo, Deeper Understanding: Exam P Page 203 of 425

http://www.guo.coursehost.com f ( x) = 14 d [ F ( x)] dx 5 ) f ( x ) dx = f ( x ) dx = d [ F ( x)] = d 1 Pr ( X > x ) = d Pr ( X > x ) 14 (X (x 5 )d Pr ( X > x ) = Pr ( X > x )d 5 5 14 5 (x 5) (x 5 ) Pr ( X > x ) 14 5 (integration by parts) = Pr ( X > x )dx 5 14 14 (14 5 ) Pr ( X > 14 ) + ( 5 5 ) Pr ( X > 5 ) = Pr ( X > x )dx 9 Pr ( X > 14 ) 5 + 9 f ( x ) dx = 9 14 + f ( x )dx = 9 Pr ( X > 14 ) + 14 14 14

E (Y ) = ( x 5) f ( x ) dx + 9 f ( x ) dx = Pr ( X > x )dx 5 5 14 Generally, for a random loss variable X , if there is a deductible of d payment of L 0 , then the expected payment is 0 and a maximum E (Y ) = d +L d Pr ( X > x ) dx If there is no deductible and no limit on how much the insurer will pay (i.e. d =0 and L = + ), then the expected payment by the insurer is: E (Y ) = + Pr ( X > x )dx (we looked at this formula in Chapter 20) 0 Now we are ready to find the expected payment: E ( Y ) = Pr ( X > x )dx = 5 14 14 5 4 x+4 7 3 18 dx = 9 4 t 7 3 dt = 4 7 18 3 9 t 7 3 dt =4

7 3 1 t 7 +1 3 18 7 +1 3 9 = 0.61372 K = $613.72 Yufeng Guo, Deeper Understanding: Exam P Page 204 of 425

The expected claim payment the insurer will pay on this policy, given that there is a claim:

\[ E(Y \mid X > 5) = \frac{E(Y)}{\Pr(Y > 0)} = \frac{E(Y)}{\Pr(X > 5)} \]

\[ \Pr(X > 5) = \left(\frac{4}{5+4}\right)^{7/3} = \left(\frac{4}{9}\right)^{7/3} \]

\[ E(Y \mid X > 5) = \frac{E(Y)}{\Pr(X > 5)} = \frac{\$613.72}{(4/9)^{7/3}} = \$4,071.26 \]

Homework for you: Rework all the problems in this chapter.

Chapter 24 Normal distribution

Most students are pretty comfortable with the normal distribution, so I will give a quick outline. Throughout this chapter, \(N(a, b)\) denotes a normal distribution with mean \(a\) and standard deviation \(b\).

1. Why the normal distribution is important

The sample mean is approximately normally distributed, no matter how the population variable is distributed (Central limit theorem). If a random sample of size \(n\) is taken from a population that has a mean \(E(X)\) and a standard deviation \(\sigma\), then the sample mean \(\bar{X} = \frac{1}{n}(X_1 + X_2 + \dots + X_n)\) is approximately normally distributed with a mean of \(E(X)\) and a standard deviation of \(\sigma/\sqrt{n}\) for large \(n\). So

\[ \bar{X} \sim N\!\left(E(X), \frac{\sigma}{\sqrt{n}}\right) \text{ for large } n. \]

The sum of a large number of independent identically distributed random variables is approximately normally distributed (another version of the Central limit theorem). If \(X_1, X_2, \dots, X_n\) are independent identically distributed with a mean of \(E(X)\) and a standard deviation \(\sigma\), then \(S = X_1 + X_2 + \dots + X_n\) is approximately normally distributed with mean \(nE(X)\) and standard deviation \(\sqrt{n}\,\sigma\), regardless of what kind of distribution \(X_1, X_2, \dots, X_n\) have. That is to say,

\[ \sum_{i=1}^{n} X_i \sim N\!\left(nE(X), \sqrt{n}\,\sigma\right). \]

A rule of thumb: If a large number of small effects are acting additively, a normal distribution can probably be assumed (such as the sum of independent identically distributed loss random variables).

In contrast, if a large number of small effects are acting multiplicatively, a normal distribution should NOT be assumed. Instead, a lognormal distribution can probably be assumed (for example, compound interest rates have multiplicative effects and can be modeled with a lognormal distribution).

2. The probability density function (pdf) has two parameters: mean \(E(X)\) and standard deviation \(\sigma\). A normal random variable \(X \sim N(E(X), \sigma)\) has the mean \(E(X)\) and the standard deviation \(\sigma\).

The normal distribution's pdf and cdf are complex and have never been directly tested in SOA exams, so there is no need to memorize the formula:

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left\{\frac{x - E(X)}{\sigma}\right\}^2} \qquad (-\infty < x < +\infty) \]

3. Standard normal distribution

The standard normal distribution \(Z \sim N(0, 1)\) is obtained by transforming a normal distribution \(X \sim N(E(X), \sigma)\) using \(Z = \frac{X - E(X)}{\sigma}\).

\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \qquad (-\infty < z < +\infty) \]

This pdf is a special case of a normal distribution's pdf where \(E(X) = 0\) and \(\sigma = 1\).

4. The cumulative distribution function for the standard normal distribution \(Z \sim N(0, 1)\):

\[ F(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-w^2/2}\,dw = \Phi(z) \]

For \(z < 0\), you'll need the equation \(\Phi(z) + \Phi(-z) = 1\). This equation holds for any \(-\infty < z < +\infty\).

5. A linear combination of two or more independent normal random variables is normal. If \(X \sim N(E(X), \sigma_X)\), \(Y \sim N(E(Y), \sigma_Y)\), and \(X\) and \(Y\) are independent, then

\[ Z = aX + bY + c \sim N\!\left(aE(X) + bE(Y) + c,\ \sqrt{a^2\sigma_X^2 + b^2\sigma_Y^2}\right) \]

6. Continuity correction factor

The binomial distribution \(B(X, n, p)\) approaches \(N\!\left(np, \sqrt{np(1-p)}\right)\) for large \(n\) and \(p\) not too close to 0 or 1. The Poisson distribution \(P(X, \lambda)\) approaches the normal distribution \(N(\lambda, \sqrt{\lambda})\) for large \(\lambda\).

To use the normal approximation to find the probability of having \(x\) occurrences (in a binomial or Poisson distribution), remember to add or subtract the continuity correction factor.

Let \(X\) represent the number of successes, while \(Y\) represents the normal random variable used to approximate \(X\). Then

\[ \Pr(X = k) = \Pr(k - 0.5 < Y < k + 0.5) = \Phi\!\left(\frac{k + 0.5 - E(Y)}{\sigma_Y}\right) - \Phi\!\left(\frac{k - 0.5 - E(Y)}{\sigma_Y}\right) \]

Since \(Y\) is continuous, we have \(\Pr(Y \ge a) = \Pr(Y > a)\) and \(\Pr(Y \le a) = \Pr(Y < a)\). So \(X = k\) corresponds to \(Y \in (k - 0.5, k + 0.5)\). A single point \(k\) becomes a range \((k - 0.5, k + 0.5)\).

\[ \Pr(X < k) = \Pr(Y < k - 0.5) = \Pr(Y \le k - 0.5) = \Phi\!\left(\frac{k - 0.5 - E(Y)}{\sigma_Y}\right) \]

\[ \Pr(X > k) = \Pr(Y > k + 0.5) = 1 - \Phi\!\left(\frac{k + 0.5 - E(Y)}{\sigma_Y}\right) \]

\[ \Pr(X \le k) = \Pr(X < k + 1) = \Pr(Y < k + 1 - 0.5) = \Pr(Y < k + 0.5) = \Phi\!\left(\frac{k + 0.5 - E(Y)}{\sigma_Y}\right) \]

\[ \Pr(X \ge k) = \Pr(X > k - 1) = \Pr(Y > k - 1 + 0.5) = \Pr(Y > k - 0.5) = 1 - \Phi\!\left(\frac{k - 0.5 - E(Y)}{\sigma_Y}\right) \]

Sample Problems and Solutions

Problem 1

The claim sizes for a particular type of policy are normally distributed with a mean of $5,000 and a standard deviation of $500. Determine the probability that two randomly chosen (independent) claims differ by more than $600.

Solution

Let \(X_1, X_2\) represent two claim sizes randomly chosen. \(X_1, X_2\) are normally distributed with a mean of 5,000 and a standard deviation of 500. Let \(Y = X_1 - X_2\). Then \(Y\) is normally distributed (a linear combination of normal random variables is also normal). We are asked to find \(\Pr(|Y| > 600)\).

\[ E(Y) = E(X_1 - X_2) = E(X_1) - E(X_2) = 0 \]

Because \(X_1, X_2\) are independent, we have

\[ Var(Y) = Var(X_1 - X_2) = Var(X_1) + Var(X_2) = 2(500^2) \]

\[ \Pr(|Y| > 600) = \Pr\!\left(|z| > \frac{600 - E(Y)}{\sigma_Y}\right) = \Pr\!\left(|z| > \frac{600 - 0}{500\sqrt{2}}\right) = \Pr(|z| > 0.8485) \]

\[ \Pr(|z| > 0.8485) = 1 - \Pr(|z| < 0.8485) = 1 - \Pr(-0.8485 < z < 0.8485) \]

\[ \Pr(-0.8485 < z < 0.8485) = \Phi(0.8485) - \Phi(-0.8485) = \Phi(0.8485) - [1 - \Phi(0.8485)] = 2\Phi(0.8485) - 1 = 0.6038 \]

\[ \Pr(|Y| > 600) = \Pr(|z| > 0.8485) = 1 - 0.6038 = 0.3962 \]

Problem 2

The annual claim amount for an auto insurance policy has a mean of $30,000 and a standard deviation of $5,000. A block of auto insurance has 25 such policies. Assume individual claims from this block of auto insurance are independent. Using the normal approximation, find the probability that the aggregate annual claims for this block of auto insurance exceed $800,000.

Solution

Let \(X(i)\) = annual claims amount for the i-th auto insurance policy (in $1,000), and \(Y\) = the aggregate annual claims amount for the block of auto insurance (in $1,000).

\(X(1), X(2), \dots, X(25)\) are independent identically distributed with a mean of 30 and a standard deviation of 5. \(Y = X(1) + X(2) + \dots + X(25)\) is approximately normal.

\[ E(Y) = 25E(X(i)) = 25(30) = 750 \]

\[ Var(Y) = 25\,Var(X(i)) = 25(5)^2, \qquad \sigma_Y = \sqrt{Var(Y)} = 25 \]

\[ z = \frac{Y - E(Y)}{\sigma_Y} = \frac{800 - 750}{25} = 2 \]

\[ \Pr(Y < 800) = \Phi(2) = 97.72\% \]

\[ \Pr(Y > 800) = 1 - \Pr(Y < 800) = 1 - 97.72\% = 2.28\% \]

So there is a 2.28% chance that the aggregate annual claim amount for the 25 insurance policies exceeds $800,000.

Homework for you: #9, #19 May 2000; #6, Nov 2000; #19, May 2001; #15, #40 Nov 2001; #13, May 2003.
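As a rough numerical check of this chapter, the sketch below recomputes the two sample problems with Python's standard library and then illustrates the continuity correction. The binomial example (n = 100, p = 0.4, k = 45) is a hypothetical illustration chosen here, not a problem from the text.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1); cdf(z) plays the role of Phi(z)

# Problem 1: difference of two independent N(5000, 500) claims.
sd_diff = math.sqrt(2) * 500
p_diff = 2 * (1 - std_normal.cdf(600 / sd_diff))

# Problem 2: sum of 25 i.i.d. claims with mean 30 and sd 5 (in $1,000).
mean_total = 25 * 30
sd_total = math.sqrt(25) * 5
p_exceed = 1 - std_normal.cdf((800 - mean_total) / sd_total)

# Continuity correction, hypothetical example: X ~ Binomial(100, 0.4), Pr(X <= 45).
n, p, k = 100, 0.4, 45
exact = sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
with_correction = std_normal.cdf((k + 0.5 - mu) / sigma)
without_correction = std_normal.cdf((k - mu) / sigma)

print(round(p_diff, 4), round(p_exceed, 4))
print(round(exact, 4), round(with_correction, 4), round(without_correction, 4))
```

In this example the corrected approximation should land noticeably closer to the exact binomial probability than the uncorrected one, which is the reason for adding or subtracting 0.5.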

Chapter 25 Lognormal distribution

If a positive random variable \(Y\) (\(Y > 0\)) is the product of \(n\) independent identically distributed random variables \(X_1, X_2, \dots, X_n\),

\[ Y = X_1 X_2 \cdots X_n \]

then

\[ \ln Y = \ln X_1 + \ln X_2 + \dots + \ln X_n \]

Because \(\ln X_1, \ln X_2, \dots, \ln X_n\) are independent identically distributed, \(\ln Y\) is approximately normally distributed (Central limit theorem). If the normal random variable \(\ln Y\) has a mean of \(\mu_{\ln Y}\) and a standard deviation of \(\sigma_{\ln Y}\), then:

pdf (no need to memorize):

\[ f(y) = \frac{1}{y} \cdot \underbrace{\frac{1}{\sigma_{\ln Y}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln y - \mu_{\ln Y}}{\sigma_{\ln Y}}\right)^2}}_{\text{normal pdf for } \ln Y} \]

cdf (the most important formula, which you must memorize):

\[ F(y) = \Pr(Y \le y) = \Pr(\ln Y \le \ln y) = \Phi\!\left(\frac{\ln y - \mu_{\ln Y}}{\sigma_{\ln Y}}\right) \]

Mean (good if you can memorize):

\[ E(Y) = e^{\mu_{\ln Y} + \frac{1}{2}\sigma_{\ln Y}^2} \]

Variance (good if you can memorize):

\[ Var(Y) = e^{2\mu_{\ln Y} + \sigma_{\ln Y}^2}\left(e^{\sigma_{\ln Y}^2} - 1\right) = \left(e^{\sigma_{\ln Y}^2} - 1\right) E^2(Y) \]

Please note that in the above formulas, \(\mu_{\ln Y}\) and \(\sigma_{\ln Y}\) are the mean and standard deviation of \(\ln Y\), not the mean and standard deviation of \(Y\). In the heat of the exam, it is very easy for candidates to mistakenly use the mean and standard deviation of \(Y\) as the two parameters for the lognormal random variable \(Y\). To help avoid this kind of mistake, use \(\mu_{\ln Y}\) and \(\sigma_{\ln Y}\) instead of the standard notations \(\mu\) and \(\sigma\).

Let's look at the pdf formula. \(\ln Y\) is normally distributed with mean \(\mu_{\ln Y}\) and standard deviation \(\sigma_{\ln Y}\). Applying the normal pdf formula (see Chapter 24), we get the pdf for \(\ln Y\):

\[ f_{\ln Y}(\ln y) = \frac{1}{\sigma_{\ln Y}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln y - \mu_{\ln Y}}{\sigma_{\ln Y}}\right)^2} \qquad (-\infty < \ln y < +\infty) \]

To find the pdf for \(Y\), notice that \(\frac{d}{dy}\ln y = \frac{1}{y} > 0\). So \(\ln y\) is an increasing function of \(y\).

\[ F_Y(y) = \Pr(Y \le y) = \Pr(\ln Y \le \ln y) = F_{\ln Y}(\ln y) \]

\[ f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} F_{\ln Y}(\ln y) = \frac{dF_{\ln Y}(\ln y)}{d(\ln y)} \cdot \frac{d(\ln y)}{dy} = \frac{1}{y}\, f_{\ln Y}(\ln y) \]

\[ f_Y(y) = \frac{1}{y} \cdot \frac{1}{\sigma_{\ln Y}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln y - \mu_{\ln Y}}{\sigma_{\ln Y}}\right)^2} \]

The cdf formula is self-explanatory. It is the most important lognormal formula and is very likely to be tested on Exam P.

The mean formula is counter-intuitive. We might expect that

\[ E(Y) = e^{E(\ln Y)} = e^{\mu_{\ln Y}}. \]

However, the correct formula is

\[ E(Y) = e^{\mu_{\ln Y} + \frac{1}{2}\sigma_{\ln Y}^2}. \]

The mathematical proof for the mean formula is complex. To get an intuitive feel for this formula without using a rigorous proof, notice that the pdf for \(Y\) is NOT a bell curve: although the pdf for \(\ln Y\) is a symmetric bell, the pdf for \(Y\) is skewed to the right (you can look at a pdf graph from a textbook). As a result, there is a factor of \(e^{\frac{1}{2}\sigma_{\ln Y}^2}\) applied to \(e^{\mu_{\ln Y}}\):

\[ E(Y) = e^{\mu_{\ln Y} + \frac{1}{2}\sigma_{\ln Y}^2} = e^{\mu_{\ln Y}}\, e^{\frac{1}{2}\sigma_{\ln Y}^2} \]

It is hard to find an intuitive explanation for the variance formula. To be safe, you might want to memorize it.

Another point. When solving problems, many candidates find it difficult to determine whether a random variable is normally distributed or lognormally distributed. To avoid the confusion, remember the following two rules:

Rule 1
If \(X\) is normal with parameters of mean \(\mu\) and standard deviation \(\sigma\), then \(e^X\) is lognormal with parameters of mean \(\mu\) and standard deviation \(\sigma\).

Rule 2
If \(Y\) is lognormal with parameters of mean \(\mu\) and standard deviation \(\sigma\), then \(\ln Y\) is normal with parameters of mean \(\mu\) and standard deviation \(\sigma\).

These two rules will come in handy when you solve a difficult problem.

Another point. The product of independent lognormal random variables is also lognormal. If \(X\) is lognormal \((\mu_{\ln X}, \sigma_{\ln X})\), \(Y\) is lognormal \((\mu_{\ln Y}, \sigma_{\ln Y})\), and \(X, Y\) are independent, then \(XY\) is also lognormal with the following parameters:

\[ E[\ln(XY)] = E(\ln X + \ln Y) = E(\ln X) + E(\ln Y) = \mu_{\ln X} + \mu_{\ln Y} \]

\[ Var[\ln(XY)] = Var(\ln X + \ln Y) = Var(\ln X) + Var(\ln Y) = \sigma_{\ln X}^2 + \sigma_{\ln Y}^2 \]

\[ XY \text{ is lognormal } \left(\mu_{\ln X} + \mu_{\ln Y},\ \sqrt{\sigma_{\ln X}^2 + \sigma_{\ln Y}^2}\right) \]

Let's see why. If \(X\) and \(Y\) are lognormal, then \(\ln X\) and \(\ln Y\) are normal (Rule 2). The sum of two independent normal random variables is also normal. So \(\ln X + \ln Y = \ln(XY)\) is normal.

Finally, for a lognormal random variable \(Y\), if you are given the mean \(E(Y)\) and variance \(Var(Y)\), then you will need to find the lognormal parameters by solving the following equations (SOA can easily write a question like this):

\[ E(Y) = e^{\mu_{\ln Y} + \frac{1}{2}\sigma_{\ln Y}^2}, \qquad Var(Y) = \left(e^{\sigma_{\ln Y}^2} - 1\right) E^2(Y) \]

\[ \Rightarrow\quad \sigma_{\ln Y} = \sqrt{\ln\!\left[1 + \frac{Var(Y)}{E^2(Y)}\right]}, \qquad \mu_{\ln Y} = \ln E(Y) - \frac{1}{2}\sigma_{\ln Y}^2 \]

Caution: Don't write \(\sigma_{\ln Y} = \ln\!\left[1 + \frac{Var(Y)}{E^2(Y)}\right]\). This is a common mistake. Make sure you write \(\sigma_{\ln Y}^2 = \ln(\dots)\), not \(\sigma_{\ln Y} = \ln(\dots)\).

The lognormal distribution has wide applications in insurance and other fields. Generally, the product of many independent small effects is likely to be lognormally distributed. The lognormal distribution describes a multiplicative process. In contrast, the normal distribution describes an additive process (the sum of independent identically distributed random variables is approximately normal).

For example, the number of defects in software is a multiplicative (or compounding) process and thus approximately lognormal. The market value of an asset is the compounding effect of an interest rate and is approximately lognormal.

To see why the number of defects or mistakes is a compounding process, think of a simple example. Compare two outcomes: you pass Exam P or you fail Exam P. If you fail Exam P, the financial effect is not simply the sum of the wasted exam fee and the lost income you could have earned while you were studying for Exam P. The effect is far greater. Failing Exam P will delay your job search, which in turn will delay your promotion, which in turn will cause other effects. Defects or mistakes compound.

If you forget everything else about the lognormal distribution, remember this: if \(X\) is lognormal, then \(\ln X\) is normal.
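The conversion between the mean and variance of Y and the parameters of ln Y is easy to get wrong under time pressure, so a small sketch may help. The function names are hypothetical, and the inputs (mean 30, standard deviation 10) are just illustrative values.

```python
import math


def lognormal_params(mean, variance):
    """Return (mu_lnY, sigma_lnY) for a lognormal Y with the given mean and variance."""
    sigma_sq = math.log(1.0 + variance / mean**2)   # this is sigma^2, not sigma
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)


def lognormal_moments(mu, sigma):
    """Return (E(Y), Var(Y)) from the parameters of ln Y."""
    mean = math.exp(mu + 0.5 * sigma**2)
    variance = (math.exp(sigma**2) - 1.0) * mean**2
    return mean, variance


mu, sigma = lognormal_params(30.0, 10.0**2)
print(round(mu, 4), round(sigma, 4))
print(lognormal_moments(mu, sigma))   # should round-trip to about (30, 100)
```

Note the square root in the last line of `lognormal_params`: it is the step that the caution above warns about.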

Sample Problems and Solutions

Problem 1

The value of investing $1 in stocks for 20 years is \(Y\) (dollars). Assume that \(Y\) is lognormal with the parameters \(\mu_{\ln Y} = 2\) and \(\sigma_{\ln Y} = 0.3\). Find
(1) The probability that \(Y\) is between 6 and 8;
(2) \(E(Y)\) and \(Var(Y)\).

Solution

\[ F(y) = P(Y \le y) = P(\ln Y \le \ln y) = \Phi\!\left(\frac{\ln y - \mu_{\ln Y}}{\sigma_{\ln Y}}\right) = \Phi\!\left(\frac{\ln y - 2}{0.3}\right) \]

\[ \Pr(6 < Y < 8) = \Phi\!\left(\frac{\ln 8 - 2}{0.3}\right) - \Phi\!\left(\frac{\ln 6 - 2}{0.3}\right) \]

\[ \Phi\!\left(\frac{\ln 8 - 2}{0.3}\right) = \Phi(0.2648) \approx 0.60, \qquad \Phi\!\left(\frac{\ln 6 - 2}{0.3}\right) = \Phi(-0.6941) = 1 - \Phi(0.6941) \approx 0.24 \]

\[ \Pr(6 < Y < 8) \approx 0.60 - 0.24 = 36\% \]

Please note that \(\Pr(6 < Y < 8) = \Pr(6 \le Y \le 8) = \Pr(6 < Y \le 8) = \Pr(6 \le Y < 8)\) because \(Y\) is continuous.

\[ E(Y) = e^{\mu_{\ln Y} + \frac{1}{2}\sigma_{\ln Y}^2} = e^{2 + \frac{1}{2}(0.3)^2} = 7.729 \]

\[ Var(Y) = \left(e^{\sigma_{\ln Y}^2} - 1\right) E^2(Y) = \left(e^{0.3^2} - 1\right)(7.729)^2 = 5.626 \]
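The numbers in Problem 1 can be reproduced with the standard library. This is only a checking sketch; `lognormal_cdf` is a helper name introduced here, not something from the text.

```python
import math
from statistics import NormalDist

mu_ln_y, sigma_ln_y = 2.0, 0.3


def lognormal_cdf(y, mu, sigma):
    """F(y) = Phi((ln y - mu) / sigma) for a lognormal Y."""
    return NormalDist().cdf((math.log(y) - mu) / sigma)


prob_6_to_8 = lognormal_cdf(8, mu_ln_y, sigma_ln_y) - lognormal_cdf(6, mu_ln_y, sigma_ln_y)
mean_y = math.exp(mu_ln_y + 0.5 * sigma_ln_y**2)
var_y = (math.exp(sigma_ln_y**2) - 1) * mean_y**2

print(round(prob_6_to_8, 4), round(mean_y, 3), round(var_y, 3))
```

The probability comes out slightly above 0.36 because the solution rounds each table lookup to two decimals before subtracting.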

Problem 2

You are given the 20th and 80th percentiles of individual claims \(X\):
(1) the 20th percentile is 61.04;
(2) the 80th percentile is 85.49.

Given that \(X\) is lognormally distributed, find the probability that a claim exceeds 92.

Solution

\[ \frac{d}{dx}\ln x = \frac{1}{x} > 0 \quad (\text{for any } x > 0) \]

So \(Y = \ln X\) is an increasing function of \(X\). Then the 20th and 80th percentiles of \(X\) correspond to the 20th and 80th percentiles of \(Y = \ln X\). We can easily prove this. Let \(x_{0.2}\) and \(x_{0.8}\) represent the 20th and 80th percentiles of \(X\). Let \(y_{0.2}\) and \(y_{0.8}\) represent the 20th and 80th percentiles of \(Y\).

\[ 0.2 = \Pr(X \le x_{0.2}) = \Pr(\ln X \le \ln x_{0.2}) \quad\Rightarrow\quad y_{0.2} = \ln x_{0.2} \]

\[ 0.8 = \Pr(X \le x_{0.8}) = \Pr(\ln X \le \ln x_{0.8}) \quad\Rightarrow\quad y_{0.8} = \ln x_{0.8} \]

From the normal table, we find that the \(z\) for 0.8 is 0.842. In other words, \(\Phi(0.842) = 0.8\). Then \(\Phi(-0.842) = 1 - \Phi(0.842) = 0.2\). Using the formula (\(Y = \ln X\) is normal with mean \(\mu\) and standard deviation \(\sigma\))

\[ F(x) = \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right) \]

we have

\[ \frac{\ln 61.04 - \mu}{\sigma} = -0.842, \qquad \frac{\ln 85.49 - \mu}{\sigma} = 0.842 \]

Solving the equations gives \(\mu = 4.28\), \(\sigma = 0.2\). Then

\[ \Pr(X > 92) = 1 - \Phi\!\left(\frac{\ln 92 - 4.28}{0.2}\right) = 1 - \Phi(1.21) = 1 - 0.89 = 0.11 \]
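The two percentile equations are linear in the parameters of ln X, so they can be solved directly. The sketch below uses `statistics.NormalDist.inv_cdf` in place of the normal table; variable names are my own.

```python
import math
from statistics import NormalDist

std_normal = NormalDist()
z20, z80 = std_normal.inv_cdf(0.20), std_normal.inv_cdf(0.80)   # about -0.842 and +0.842

ln_p20, ln_p80 = math.log(61.04), math.log(85.49)

# (ln_p20 - mu) / sigma = z20 and (ln_p80 - mu) / sigma = z80
sigma = (ln_p80 - ln_p20) / (z80 - z20)
mu = ln_p20 - sigma * z20

prob_exceeds_92 = 1 - std_normal.cdf((math.log(92) - mu) / sigma)
print(round(mu, 3), round(sigma, 3), round(prob_exceeds_92, 3))
```

Because the table value 0.842 and the parameters are rounded in the hand solution, the computed probability agrees with 0.11 only to about two decimals.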

Problem 3

An actuary models losses due to large fires using a lognormal distribution. The average loss due to a large fire is $30 million. The standard deviation of losses due to large fires is $10 million. Calculate the probability that a loss due to a large fire exceeds $35 million.

Solution

Let \(X\) represent the loss amount (in millions of dollars) in a large fire. We are given the following information:

\(X\) is lognormal, \(E(X) = 30\), \(\sigma(X) = 10\).

We are asked to find \(P(X > 35)\). First, we need to solve for the two parameters of the lognormal random variable \(X\):

\[ E(X) = e^{\mu_{\ln X} + \frac{1}{2}\sigma_{\ln X}^2} = 30, \qquad Var(X) = \left(e^{\sigma_{\ln X}^2} - 1\right) E^2(X) = 10^2 \]

\[ \sigma_{\ln X} = \sqrt{\ln\!\left[1 + \frac{Var(X)}{E^2(X)}\right]} = \sqrt{\ln\!\left[1 + \left(\frac{10}{30}\right)^2\right]} = 0.3246 \]

\[ \mu_{\ln X} = \ln E(X) - \frac{1}{2}\sigma_{\ln X}^2 = \ln 30 - \frac{1}{2}(0.3246)^2 = 3.3485 \]

\[ P(X > 35) = P(\ln X > \ln 35) = 1 - P(\ln X \le \ln 35) = 1 - \Phi\!\left(\frac{\ln 35 - 3.3485}{0.3246}\right) = 1 - \Phi(0.637) = 1 - 0.7389 \approx 0.26 \]

Problem 4

The cumulative value of investing $1 for 20 years, \(Y\), is calculated as follows:

\[ Y = (1 + i_1)(1 + i_2) \cdots (1 + i_{20}) \]

where \(i_k\) is the return in the k-th year. You are given that for \(k = 1, 2, \dots, 20\):
(1) \(i_1, i_2, \dots, i_{20}\) are independent;
(2) \((1 + i_k)\) is lognormally distributed;
(3) The expected annual return \(E(i_k) = 5\%\); the standard deviation of the annual return \(\sigma(i_k) = 6\%\).

Find
(1) The expected value of investing $1 after 20 years.
(2) The probability that the value of investing $1 after 20 years is less than 95% of the expected value.

Solution

Many people have difficulty with this problem. The major difficulty is to determine whether \(Y = (1 + i_1)(1 + i_2) \cdots (1 + i_{20})\) is normally distributed or lognormally distributed. To avoid the confusion, use Rule 1 and Rule 2.

We are told that \((1 + i_k)\) is lognormal. This means that \(\ln(1 + i_k)\) is normal (Rule 2). Then \(\ln Y = \ln(1 + i_1) + \ln(1 + i_2) + \dots + \ln(1 + i_{20})\) is normal; the sum of several independent normal random variables is also normal. If \(\ln Y\) is normal, then \(e^{\ln Y} = Y\) is lognormal (Rule 1). So \(Y\) is lognormal with the following parameters:

\[ \ln Y = \ln(1 + i_1) + \ln(1 + i_2) + \dots + \ln(1 + i_{20}) \]

\[ E[\ln Y] = 20\,E[\ln(1 + i)], \qquad Var[\ln Y] = 20\,Var[\ln(1 + i)] \]

The problem didn't give us \(E[\ln(1 + i)]\) and \(Var[\ln(1 + i)]\). However, it did give us \(E(i)\) and \(\sigma(i)\). So we need to find \(E[\ln(1 + i)]\) and \(Var[\ln(1 + i)]\) using \(E(i)\) and \(\sigma(i)\). Let \(X = 1 + i\).

\[ E(X) = E(1 + i) = 1 + E(i) = 105\%, \qquad \sigma(X) = \sigma(1 + i) = \sigma(i) = 6\% \]

Solving the following equations:

\[ E(X) = e^{\mu_{\ln X} + \frac{1}{2}\sigma_{\ln X}^2} = 105\%, \qquad Var(X) = \left(e^{\sigma_{\ln X}^2} - 1\right) E^2(X) = (6\%)^2 \]

\[ \sigma_{\ln X} = \sqrt{\ln\!\left[1 + \frac{Var(X)}{E^2(X)}\right]} = \sqrt{\ln\!\left[1 + \left(\frac{6\%}{105\%}\right)^2\right]} = 5.7096\% \]

\[ \mu_{\ln X} = \ln E(X) - \frac{1}{2}\sigma_{\ln X}^2 = \ln 105\% - \frac{1}{2}(5.7096\%)^2 = 4.716\% \]

Then \(Y\) is lognormal with parameters

\[ E[\ln Y] = 20\,E[\ln X] = 20(4.716\%) = 0.9432 \]

\[ \sigma[\ln Y] = \sqrt{20}\,\sigma[\ln X] = \sqrt{20}\,(5.7096\%) = 0.25534 \]

\[ E(Y) = e^{\mu_{\ln Y} + \frac{1}{2}\sigma_{\ln Y}^2} = e^{0.9432 + \frac{1}{2}(0.25534)^2} = e^{0.9758} = 2.6533 \]

\[ 95\%\,E(Y) = (95\%)(2.6533) = 2.5206 \]

\[ \Pr[Y < 95\%\,E(Y)] = \Pr[Y < 2.5206] = \Pr[\ln Y < \ln 2.5206] = \Phi\!\left(\frac{\ln 2.5206 - E(\ln Y)}{\sigma(\ln Y)}\right) \]

\[ = \Phi\!\left(\frac{\ln 2.5206 - 0.9432}{0.25534}\right) = \Phi(-0.0732) = 1 - \Phi(0.0732) = 1 - 0.53 = 0.47 \]

Homework for you: Rework all the problems in this chapter.
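The chain of steps in Problem 4 (annual moments, then parameters of ln X, then parameters of ln Y, then the probability) can be traced in a few lines. This is a sketch with variable names of my own choosing; it also uses the fact that, by independence, E(Y) must equal 1.05 to the 20th power, which gives a quick sanity check on the lognormal mean formula.

```python
import math
from statistics import NormalDist

mean_x, sd_x, years = 1.05, 0.06, 20      # X = 1 + i for a single year

# Parameters of ln X from the mean and standard deviation of X.
sigma_ln_x = math.sqrt(math.log(1 + (sd_x / mean_x) ** 2))
mu_ln_x = math.log(mean_x) - 0.5 * sigma_ln_x**2

# ln Y is a sum of 20 independent copies of ln X.
mu_ln_y = years * mu_ln_x
sigma_ln_y = math.sqrt(years) * sigma_ln_x

expected_y = math.exp(mu_ln_y + 0.5 * sigma_ln_y**2)
threshold = 0.95 * expected_y
prob_below = NormalDist(mu_ln_y, sigma_ln_y).cdf(math.log(threshold))

print(round(expected_y, 4), round(prob_below, 4))
```

The means multiply while the variances of the logs add, which is why the standard deviation of ln Y grows with the square root of 20 rather than with 20.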

Chapter 26 Chi-square distribution

You have a \(\chi^2\) distribution when you square and sum \(n\) independent standard normal random variables.

Key points:

1. Let \(Z_1, Z_2, \dots, Z_n\) be independent standard normal random variables (i.e. each having a mean of zero and a standard deviation of one). Then

\[ X^2 = Z_1^2 + Z_2^2 + \dots + Z_n^2 \]

where \(n\) is a positive integer, has a Chi-square \(\chi_n^2\) distribution with \(n\) degrees of freedom. Please note that the random variable is \(X^2\), not \(X\).

2. Relationship between the Chi-square \(\chi_n^2\) distribution and the gamma distribution (you can find the proof in some textbooks):

Let \(Y = X^2 = Z_1^2 + Z_2^2 + \dots + Z_n^2\), where \(Z_1, Z_2, \dots, Z_n\) are independent standard normal random variables. Then \(Y\) has a gamma distribution with parameters \(\alpha = \frac{n}{2}\) and \(\theta = 2\):

\[ f(y) = \frac{1}{2^{n/2}\,\Gamma\!\left(\frac{n}{2}\right)}\, y^{\frac{n}{2} - 1}\, e^{-y/2}, \qquad y \ge 0 \]

3. Mean and variance formulas:

\(X^2\) has a gamma distribution with parameters \(\alpha = \frac{n}{2}\) and \(\theta = 2\). Then

\[ E(X^2) = \alpha\theta = \frac{n}{2}(2) = n \quad \text{(mean equal to the degrees of freedom)} \]

\[ Var(X^2) = \alpha\theta^2 = \frac{n}{2}(2^2) = 2n \quad \text{(variance equal to twice the degrees of freedom)} \]

\[ M(t) = \left(\frac{1}{1 - 2t}\right)^{n/2} \quad \text{(moment generating function)} \]

4. NOT symmetric (i.e. not bell-like). The \(\chi_n^2\) distribution looks like a bell whose peak has been pushed toward the left, leaving a long right tail (i.e. the distribution is skewed to the right). As \(n\) increases, the mean and variance of \(\chi_n^2\) increase and the \(\chi_n^2\) diagram becomes less lopsided (more bell-like).

5. You will need to know how to read the chi-square distribution table. A typical table looks like this:

P = the shaded area = \(\Pr(\chi^2 < \chi_0^2)\)

Sample numbers from a chi-square distribution table

Degrees of                         Value of P
freedom     0.005   0.010   0.025   0.050   0.950   0.975   0.990   0.995
    1        0.00    0.00    0.00    0.00    3.84    5.02    6.63    7.88
    2        0.01    0.02    0.05    0.10    5.99    7.38    9.21   10.60

Based on the above table, if the degrees of freedom \(n = 2\), then \(\Pr(\chi^2 < 5.99) = 0.95\) and \(\Pr(\chi^2 < 7.38) = 0.975\).

Sample Problems and Solutions

Problem 1

A chi-square random variable \(Y^2\) has a variance of 10. You are given the following table:

Degrees of                         Value of P
freedom     0.005   0.010   0.025   0.050   0.950   0.975   0.990   0.995
    5        0.41    0.55    0.83    1.15   11.07   12.83   15.09   16.75
   10        2.16    2.56    3.25    3.94   18.31   20.48   23.21   25.19

Find the mean;
Find the 5th percentile of \(Y^2\);
Given that \(\Pr(a \le Y^2 \le b) = 0.005\), find \(b - a\).

Solution

The variance of a chi-square distribution is equal to twice the degrees of freedom \(n\):

\[ 2n = 10 \quad\Rightarrow\quad n = 5 \]

The mean of a chi-square distribution is equal to the degrees of freedom. So the mean is 5.

To find the 5th percentile, we need to solve the equation \(\Pr(Y^2 \le \text{5th percentile}) = 5\%\). From the given chi-square distribution table, we find that the value that corresponds to \(n = 5\) and \(P = 0.05\) is 1.15. Thus, the 5th percentile is 1.15.

There are many combinations of \(a, b\) that satisfy \(\Pr(a \le Y^2 \le b) = 0.005\). However, based on the given chi-square distribution table, only two combinations can be read off for \(n = 5\):

\[ (a, b) = (0, 0.41) \quad \text{or} \quad (a, b) = (0.41, 0.55) \]

So we have \(b - a = 0.41\) or \(b - a = 0.14\).

Please note that the information for \(n = 10\) in the chi-square distribution table is not needed for this problem.

Problem 2

A chi-square random variable \(Y^2\) has a mean of 1. No chi-square distribution table is given.

Find \(\Pr(Y^2 > 1.21)\).

Solution

The mean of a chi-square distribution is equal to the degrees of freedom. So the degrees of freedom is one. If a chi-square random variable \(Y^2\) has one degree of freedom, then \(Y\) is simply a standard normal random variable (based on the definition of the chi-square distribution).

\[ \Pr(Y^2 > 1.21) = 1 - \Pr(Y^2 \le 1.21) = 1 - \Pr(-\sqrt{1.21} \le Y \le \sqrt{1.21}) \]

\[ \Pr(-\sqrt{1.21} \le Y \le \sqrt{1.21}) = \Pr(-1.1 \le Y \le 1.1) = \Phi(1.1) - \Phi(-1.1) = \Phi(1.1) - [1 - \Phi(1.1)] \]

\[ = 2\Phi(1.1) - 1 = 2(0.8643) - 1 = 0.7286 \]

\[ \Pr(Y^2 > 1.21) = 1 - \Pr(-\sqrt{1.21} \le Y \le \sqrt{1.21}) = 1 - 0.7286 = 0.2714 \]

Problem 3

\(Y\) has a Chi-square distribution with \(n = 6\). Find \(\Pr(Y \le 10.15)\).

Solution

\(Y\) has a gamma distribution with parameters \(\alpha = \frac{n}{2} = 3\) and \(\theta = 2\).

\(\Pr(Y \le 10.15)\) = Pr(it takes time 10.15 or less to have 3 random events)
= Pr(the # of events that occurred during [0, 10.15] = 3, 4, 5, ...)
= 1 - Pr(the # of events that occurred during [0, 10.15] = 0, 1, 2)

\[ = 1 - e^{-\frac{10.15}{2}} \left[ 1 + \frac{1}{1!}\left(\frac{10.15}{2}\right) + \frac{1}{2!}\left(\frac{10.15}{2}\right)^2 \right] \quad \text{(Poisson distribution)} \]

\[ = 88.15\% \]

Please note that if the degrees of freedom is an odd number such as \(n = 7\), then \(Y\) has a gamma distribution with parameters \(\alpha = \frac{7}{2} = 3.5\) and \(\theta = 2\).

Because \(\alpha\) is not an integer, we cannot do the following (the number of events must be an integer for the Poisson distribution to work):

\(\Pr(Y \le 10.15)\) = Pr(it takes time 10.15 or less to have 3.5 random events)
= Pr(the # of events that occurred during [0, 10.15] = 3.5, 4.5, ...)

If \(n = 7\), then we have to calculate \(\Pr(Y \le 10.15)\) through other means, such as using a gamma distribution table.

Homework for you: Rework all the problems in this chapter.
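The gamma-Poisson shortcut used in Problem 3 works only for an even number of degrees of freedom, and that restriction can be made explicit in code. The function below is an illustrative sketch (its name is hypothetical); it also rechecks Problem 2 through the standard normal cdf.

```python
import math
from statistics import NormalDist


def chi_square_cdf_even(y, n):
    """Pr(Y <= y) for chi-square Y with an EVEN number n of degrees of freedom.

    Uses the gamma-Poisson link: alpha = n/2 events, Poisson mean y/2.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("shortcut requires a positive even number of degrees of freedom")
    events_needed = n // 2
    poisson_mean = y / 2
    fewer_events = sum(poisson_mean**j / math.factorial(j) for j in range(events_needed))
    return 1 - math.exp(-poisson_mean) * fewer_events


prob_problem_3 = chi_square_cdf_even(10.15, 6)

# Problem 2: one degree of freedom, so Y is standard normal and Y^2 > 1.21 means |Y| > 1.1.
prob_problem_2 = 2 * (1 - NormalDist().cdf(math.sqrt(1.21)))

print(round(prob_problem_3, 4), round(prob_problem_2, 4))
```

Calling `chi_square_cdf_even(10.15, 7)` raises an error on purpose, mirroring the remark above that the Poisson argument breaks down when n/2 is not an integer.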

Chapter 27 Bivariate normal distribution

If two normal random variables \(X \sim N(E(X), \sigma_X)\) and \(Y \sim N(E(Y), \sigma_Y)\) have a correlation coefficient \(\rho\) (\(-1 < \rho < 1\)), then the joint distribution of \(X, Y\) is a bivariate normal distribution with the following pdf:

\[ f(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x - E(X)}{\sigma_X}\right)^2 - 2\rho\,\frac{(x - E(X))(y - E(Y))}{\sigma_X\sigma_Y} + \left(\frac{y - E(Y)}{\sigma_Y}\right)^2 \right] \right\} \]

where \(-\infty < x < +\infty\), \(-\infty < y < +\infty\).

No need to memorize this complex formula. The derivation of this pdf is very complex and involves the transformation of a joint distribution function. I recommend that you do not bother learning how to derive the above formula.

If we integrate over \(Y\), we get the \(X\)-marginal distribution:

\[ f(x) = \int_{-\infty}^{+\infty} f(x, y)\,dy = \frac{1}{\sigma_X\sqrt{2\pi}} \exp\left[ -\frac{1}{2}\left(\frac{x - E(X)}{\sigma_X}\right)^2 \right] \]

You should recognize this pdf. It is just a regular pdf of a normal random variable with the mean \(E(X)\) and the standard deviation \(\sigma_X\).

If we integrate over \(X\), we get the \(Y\)-marginal distribution:

\[ f(y) = \int_{-\infty}^{+\infty} f(x, y)\,dx = \frac{1}{\sigma_Y\sqrt{2\pi}} \exp\left[ -\frac{1}{2}\left(\frac{y - E(Y)}{\sigma_Y}\right)^2 \right] \]

This is just a regular pdf of a normal random variable with a mean \(E(Y)\) and a standard deviation \(\sigma_Y\).

The conditional distribution of \(Y\) given \(X = x\) is also normal:

\[ f_{Y \mid X = x}(y \mid X = x) = \frac{1}{\sigma_{Y \mid X = x}\sqrt{2\pi}} \exp\left[ -\frac{1}{2}\left(\frac{y - E(Y \mid X = x)}{\sigma_{Y \mid X = x}}\right)^2 \right] \]

where

\[ E(Y \mid X = x) = E(Y) + \rho\,\frac{\sigma_Y}{\sigma_X}[x - E(X)] = E(Y) + \frac{Cov(X, Y)}{\sigma_X\sigma_Y}\,\frac{\sigma_Y}{\sigma_X}[x - E(X)] = E(Y) + \frac{Cov(X, Y)}{Var(X)}[x - E(X)] \]

\[ \sigma_{Y \mid X = x}^2 = (1 - \rho^2)\,\sigma_Y^2. \]

You will want to memorize the conditional mean and variance formulas. It would be easy for an exam question to be written on these.

If we examine the formula

\[ E(Y \mid X = x) = E(Y) + \rho\,\frac{\sigma_Y}{\sigma_X}[x - E(X)], \]

we realize that \(E(Y \mid X = x)\) and \(x\) have a simple linear relationship. In other words, if \(X\) is fixed at \(x\), the conditional mean of \(Y\) is simply a linear function of \(x\):

\[ E(Y \mid X = x) = ax + b, \quad \text{where } a = \rho\,\frac{\sigma_Y}{\sigma_X}, \quad b = E(Y) - \rho\,\frac{\sigma_Y}{\sigma_X}\,E(X) \]

The conditional distribution of \(X\) given \(Y = y\) is also normal:

\[ f_{X \mid Y = y}(x \mid Y = y) = \frac{1}{\sigma_{X \mid Y = y}\sqrt{2\pi}} \exp\left[ -\frac{1}{2}\left(\frac{x - E(X \mid Y = y)}{\sigma_{X \mid Y = y}}\right)^2 \right] \]

where

\[ E(X \mid Y = y) = E(X) + \rho\,\frac{\sigma_X}{\sigma_Y}[y - E(Y)] = E(X) + \frac{Cov(X, Y)}{Var(Y)}[y - E(Y)] \]

\[ \sigma_{X \mid Y = y}^2 = (1 - \rho^2)\,\sigma_X^2. \]

Similarly, \(E(X \mid Y = y)\) and \(y\) have a simple linear relationship. Please note that the conditional mean and variance formulas are symmetric in terms of \(X, Y\). If you have memorized the conditional mean and variance formulas for \(Y \mid X = x\), you can get the conditional mean and variance formulas for \(X \mid Y = y\) by switching \(X\) and \(Y\).

For a bivariate normal distribution, if \(\rho = 0\), the pdf becomes

\[ f(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y} \exp\left\{ -\frac{1}{2}\left[ \left(\frac{x - E(X)}{\sigma_X}\right)^2 + \left(\frac{y - E(Y)}{\sigma_Y}\right)^2 \right] \right\} \]

\[ = \frac{1}{\sigma_X\sqrt{2\pi}} \exp\left[ -\frac{1}{2}\left(\frac{x - E(X)}{\sigma_X}\right)^2 \right] \cdot \frac{1}{\sigma_Y\sqrt{2\pi}} \exp\left[ -\frac{1}{2}\left(\frac{y - E(Y)}{\sigma_Y}\right)^2 \right] = f(x)\,f(y) \]

Then \(X, Y\) are independent. So for a bivariate normal distribution, \(\rho = 0 \Rightarrow X, Y\) are independent.

For a distribution that is not bivariate normal, \(\rho = 0\) typically does not guarantee that \(X, Y\) are independent. However, if \(X, Y\) have a bivariate normal distribution, then \(\rho = 0 \Rightarrow X, Y\) are independent. This is an exception to the general rule that if \(\rho = 0\) then \(X, Y\) are not guaranteed to be independent. Conversely, for any distribution (whether bivariate normal or not), if \(X, Y\) are independent, then \(\rho = 0\).

Sample Problems and Solutions

Problem One

Let \(Z_1, Z_2\) represent two independent standard normal random variables. Let \(X = 2Z_1 + 3Z_2 + 4\) and \(Y = 2Z_1 - 5Z_2 + 2\). Find \(E(Y \mid X = 0)\) and \(Var(Y \mid X = 0)\).

Solution

You should have memorized the following formulas:

\[ E(Y \mid X = x) = E(Y) + \frac{Cov(X, Y)}{Var(X)}[x - E(X)], \qquad \sigma_{Y \mid X = x}^2 = (1 - \rho^2)\,\sigma_Y^2. \]

Otherwise, the solution to this problem becomes pure guesswork.

The first step is to calculate \(\rho = \frac{Cov(X, Y)}{\sigma_X\sigma_Y}\). We'll use the following general formulas:

\[ Cov(aX + bY + e,\ cX + dY + f) = ac\,Var(X) + (ad + bc)\,Cov(X, Y) + bd\,Var(Y) \]

\[ Var(aX + bY) = Var(aX) + Var(bY) + 2ab\,Cov(X, Y), \qquad Var(aX) = a^2\,Var(X) \]

\[ Cov(X, Y) = Cov(2Z_1 + 3Z_2 + 4,\ 2Z_1 - 5Z_2 + 2) = Cov(2Z_1 + 3Z_2,\ 2Z_1 - 5Z_2) \]

The above step holds because the covariance between a constant and any random variable is always zero.

\[ Cov(X, Y) = Cov(2Z_1, 2Z_1) + Cov(2Z_1, -5Z_2) + Cov(3Z_2, 2Z_1) + Cov(3Z_2, -5Z_2) \]

\[ Cov(2Z_1, 2Z_1) = 4\,Var(Z_1) = 4, \qquad Cov(3Z_2, -5Z_2) = -15\,Var(Z_2) = -15 \]

\[ Cov(2Z_1, -5Z_2) = -10\,Cov(Z_1, Z_2) = 0 \quad (Z_1, Z_2 \text{ are independent}), \qquad Cov(3Z_2, 2Z_1) = 6\,Cov(Z_1, Z_2) = 0 \]

\[ Cov(X, Y) = 4 - 15 = -11 \]

\[ Var(X) = Var(2Z_1 + 3Z_2 + 4) = 4\,Var(Z_1) + 9\,Var(Z_2) = 4 + 9 = 13, \qquad \sigma_X = \sqrt{13} \]

\[ Var(Y) = Var(2Z_1 - 5Z_2 + 2) = 4\,Var(Z_1) + 25\,Var(Z_2) = 4 + 25 = 29, \qquad \sigma_Y = \sqrt{29} \]

\[ \rho = \frac{Cov(X, Y)}{\sigma_X\sigma_Y} = \frac{-11}{\sqrt{13}\,\sqrt{29}} \]

\[ E(X) = E(2Z_1 + 3Z_2 + 4) = 4, \qquad E(Y) = E(2Z_1 - 5Z_2 + 2) = 2 \]

Next, simply apply the memorized formulas:

\[ E(Y \mid X = 0) = 2 + \frac{-11}{13}(0 - 4) = \frac{70}{13} \]

\[ \sigma_{Y \mid X = 0}^2 = \left[ 1 - \left(\frac{-11}{\sqrt{13}\,\sqrt{29}}\right)^2 \right] 29 = \frac{256}{13} \]

Homework for you: Rework the problem in this chapter.
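The arithmetic in this problem can be verified exactly with rational numbers. The sketch below represents X and Y by their coefficients on the independent standard normals; `cov_linear` is a hypothetical helper that relies on each Z having unit variance and zero covariance with the others.

```python
from fractions import Fraction


def cov_linear(coef_a, coef_b):
    """Covariance of two linear combinations of independent unit-variance variables."""
    return sum(a * b for a, b in zip(coef_a, coef_b))


x_coef, x_const = (2, 3), 4      # X = 2*Z1 + 3*Z2 + 4
y_coef, y_const = (2, -5), 2     # Y = 2*Z1 - 5*Z2 + 2

cov_xy = cov_linear(x_coef, y_coef)
var_x = cov_linear(x_coef, x_coef)
var_y = cov_linear(y_coef, y_coef)

x0 = 0
cond_mean = Fraction(y_const) + Fraction(cov_xy, var_x) * (x0 - x_const)
# (1 - rho^2) * Var(Y) simplifies to Var(Y) - Cov(X, Y)^2 / Var(X)
cond_var = var_y - Fraction(cov_xy**2, var_x)

print(cov_xy, var_x, var_y, cond_mean, cond_var)
```

Writing the conditional variance as Var(Y) minus Cov squared over Var(X) avoids the square roots in the correlation coefficient, so the result stays an exact fraction.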

Chapter 28 Joint density and double integration

Joint density problems are among the more difficult problems commonly tested in Exam P. Many candidates dread these types of problems. To score a point, not only do you need to know probability theory, but you must also be precise and quick at doing double integration.

The good news is that there is a generic approach to this type of problem. Once you understand this generic approach and do some practice problems, double integration and joint density problems are just another source of routine problems where you can easily score points in the exam.

Before finding a general approach to joint density problems, let's go back to the basic idea behind double integration.

Basics of integration

Problem 1 (discrete random variable)

A random variable \(X\) has the following distribution:

x           0     1     2     3
f(X = x)   .25   .25   .25   .25

Find \(E(X)\).

Solution

\[ E(X) = \sum_{x=0}^{3} x f(x) = 0(.25) + 1(.25) + 2(.25) + 3(.25) = 1.5 \qquad \text{(Equation 1)} \]

Here the lower limit of the sum is the minimum of \(x\) (which is 0) and the upper limit is the maximum of \(x\) (which is 3).

Problem 2 (continuous random variable)

A random variable \(X\) has the following probability distribution:

\[ f(x) = \frac{2}{9}x, \qquad 0 \le x \le 3 \]

Find \(E(X)\).

Solution

\[ E(X) = \int_0^3 x f(x)\,dx = \int_0^3 x\left(\frac{2}{9}x\right)dx = \frac{2}{9}\left[\frac{1}{3}x^3\right]_0^3 = 2 \qquad \text{(Equation 2)} \]

Here the lower limit of the integral is the minimum of \(x\) (which is 0) and the upper limit is the maximum of \(x\) (which is 3).

You see that Equation 1 and Equation 2 are very similar. In both places, we add up \(x f(x)\) over all possible values of \(x\), ranging from the minimum to the maximum. The only difference is that we use summation for Equation 1 (because we have a finite number of \(x\)'s) and integration for Equation 2 (because we have an infinite number of \(x\)'s).

What about finding a mean for a joint distribution? Suppose two random variables \(X, Y\) have a joint density \(f(x, y)\). Can you guess the formula for \(E(X)\)? The formula is:

\[ E(X) = \int_{\min y}^{\max y} \left[ \int_{\min x}^{\max x} x f(x, y)\,dx \right] dy \qquad \text{(Equation 3)} \]

You see that Equation 3 is very similar to Equations 1 and 2. The only difference is that now we have double integration because we have to sum things up twice. First, we integrate \(x f(x, y)\) over all possible values of \(X\) (inner integration) while holding \(y\) constant. Then, we integrate over all possible values of \(Y\). You can also write

\[ E(X) = \int_{\min x}^{\max x} \left[ \int_{\min y}^{\max y} x f(x, y)\,dy \right] dx \qquad \text{(Equation 4)} \]

This time, the inner integration sums over \(y\) from the min of \(y\) to the max of \(y\), and the outer integration sums over \(x\) from the min of \(x\) to the max of \(x\).

You should never write (conceptually wrong):

\[ E(X) = \int_{\min x}^{\max x} \left[ \int_{\min y}^{\max y} x f(x, y)\,dx \right] dy \]

Here the inner limits (min/max of \(Y\)) and the inner \(dx\) don't match, and the outer limits (min/max of \(X\)) and the outer \(dy\) don't match. In other words, if your inner integration is \(dx\), then the inner integration must sum over \(x\) from min \(x\) to max \(x\); the outer integration must sum over \(y\) from min \(y\) to max \(y\).

Nor should you write:

\[ E(X) = \int_{\min y}^{\max y} \left[ \int_{\min x}^{\max x} x f(x, y)\,dy \right] dx \]

Here the inner limits (min/max of \(X\)) and the inner \(dy\) don't match, and the outer limits (min/max of \(Y\)) and the outer \(dx\) don't match.

Now we are ready to tackle the joint density problems and related double integrations.

General approach to joint density problems

To illustrate the general approach, let's use a simple example:

Problem 3

The total car-related damage (measured in thousands of dollars) incurred by an auto insurance policyholder in an auto accident can be classified into two categories: \(X\), which is the damage to his own car; and \(Y\), which is the damage to the other driver's car. \(X, Y\) have a joint density of \(f(x, y)\). What is the probability that the total loss does not exceed 2 (thousand dollars)?

Step One: Draw a 2-D region for all the possible combinations of the two random variables \((X, Y)\). You need this 2-D region for integration.

To begin with, your 2-D region should be where the joint density function \(f(x, y)\) exists; any data points \((x, y)\) where \(f(x, y)\) is undefined will be outside the 2-D region. Any additional constraint on \((X, Y)\) will shrink the 2-D region.

In this problem, obviously \(f(x, y)\) exists only in \(x \ge 0, y \ge 0\) (i.e. the first quadrant). So your 2-D region is now the first quadrant before any additional constraints (the shaded area in Figure 1).

The additional constraint is \(x + y \le 2\) (total loss not exceeding 2). Where is \(x + y \le 2\), or \(y \le 2 - x\)? You should remember this rule:

\(y \ge f(x)\) lies above \(y = f(x)\) because \(y\) is equal to or greater than \(f(x)\);
\(y \le f(x)\) lies below \(y = f(x)\) because \(y\) is equal to or less than \(f(x)\).

\(x + y \le 2\) is the shaded area in Figure 2.

Please also note that in Figure 2, the shaded area is also for \(x + y < 2\). In other words, there's no difference between \(x + y \le 2\) and \(x + y < 2\). Why? Double integration integrates over an area; a single point or a line doesn't have any area. Put another way, the probability that \((X, Y)\) falls on a single point or a line is zero. The joint density \(f(x, y)\) yields a probability only when you integrate it over an area. This is similar to the concept that, for a continuous univariate variable \(X\) with density \(f(x)\), the probability at any single point is zero. \(f(x)\) yields a probability only when you integrate it over an interval; for example, when you use \(f(x)\) to get the cumulative distribution function \(F(x)\).

In general, for joint density problems,
the 2-D region for \(y \ge f(x)\) is identical to the 2-D region for \(y > f(x)\);
the 2-D region for \(y \le f(x)\) is identical to the 2-D region for \(y < f(x)\).

Finally, the 2-D region for \(x \ge 0, y \ge 0\) and \(x + y \le 2\) should be the intersection of the shaded region for \(x \ge 0, y \ge 0\) in Figure 1 and the shaded region for \(x + y \le 2\) in Figure 2. This is the shaded triangle AOB in Figure 3.

Figure 1: Shaded area = {x ≥ 0, y ≥ 0}
Figure 2: Shaded area = {x + y ≤ 2}
Figure 3: Shaded area = (x ≥ 0, y ≥ 0) ∩ (x + y ≤ 2)

Step Two: Choose one variable for the outer integration. Always set up the outer integration first. You can choose either X or Y for the outer integration. If you choose X for the outer integration, you can set up the following double integration:

$$\Pr(X + Y \le 2) = \int_{\text{MIN of } X \text{ in the 2-D region}}^{\text{MAX of } X \text{ in the 2-D region}} \left[ \int f(x,y)\, dy \right] dx$$

The outer integral (over x) is the one you always set up first.

You can clearly see that in the 2-D region (x ≥ 0, y ≥ 0) ∩ (x + y ≤ 2), the minimum of X is 0 (at O, the origin) and the maximum of X is 2 (at point A). So we have:

$$\Pr(X + Y \le 2) = \int_0^2 \left[ \int f(x,y)\, dy \right] dx$$

Of course, you can choose Y for the outer integration:

$$\Pr(X + Y \le 2) = \int_{\text{MIN of } Y \text{ in the 2-D region}}^{\text{MAX of } Y \text{ in the 2-D region}} \left[ \int f(x,y)\, dx \right] dy$$

In the 2-D region (x ≥ 0, y ≥ 0) ∩ (x + y ≤ 2), the minimum of Y is 0 (at O, the origin) and the maximum of Y is 2 (at point B). So we have:

$$\Pr(X + Y \le 2) = \int_0^2 \left[ \int f(x,y)\, dx \right] dy$$

Step 3: Find the min/max of the inner integration (a little tricky).

Here is a general rule:

If the inner integration is on Y, we draw a vertical line X = x that cuts through the 2-D region. The Y coordinates of the two cutting points are the lower bound and upper bound for the inner Y-integration.

If the inner integration is on X, we draw a horizontal line Y = y that cuts through the 2-D region. The X coordinates of the two cutting points are the lower bound and upper bound for the inner X-integration.

Let's apply the above rule. For the inner Y-integration, we draw a vertical line X = x that cuts through the 2-D region. The two cutting points are C(x, 0) and D(x, 2 − x). So we integrate Y over the line CD, using C (whose Y coordinate is zero) as the lower bound and D (whose Y coordinate is 2 − x) as the upper bound.

Figure 4: vertical line X = x cutting the triangle AOB at C(x, 0) and D(x, 2 − x)

$$\Pr(X + Y \le 2) = \int_0^2 \left[ \int_{CD} f(x,y)\, dy \right] dx = \int_0^2 \int_0^{2-x} f(x,y)\, dy\, dx$$

Line CD represents all the possible values of Y in the 2-D region given that X = x. So the line CD is really Y | X = x. We will revisit this point later on when we need to calculate f(x), the X-marginal density of f(x, y).

So in the above equation, the inner integration is done over all possible values of Y given X = x. Then, in the outer integration, we integrate over all possible values of X. The inner and outer integrations work together, scanning everywhere within the 2-D region for (X, Y) and summing up the total probability within the 2-D region.

Similarly, for the inner X-integration, we draw a horizontal line Y = y that cuts through the 2-D region. The two cutting points are E(0, y) and F(2 − y, y). So we integrate X over the line EF, using E (whose X coordinate is zero) as the lower bound and F (whose X coordinate is 2 − y) as the upper bound.

Figure 5: horizontal line Y = y cutting the triangle AOB at E(0, y) and F(2 − y, y)

$$\Pr(X + Y \le 2) = \int_0^2 \left[ \int_{EF} f(x,y)\, dx \right] dy = \int_0^2 \int_0^{2-y} f(x,y)\, dx\, dy$$

Line EF represents all the possible values of X in the 2-D region given that Y = y. So the line EF is really X | Y = y. We will also revisit this point later on when we need to calculate f(y), the Y-marginal density of f(x, y).

So in the above equation, the inner integration is done over all possible values of X given Y = y. Then the outer integration is done over all possible values of Y. The inner and outer integrations work together, scanning everywhere within the 2-D region for (X, Y) and summing up the total probability within the 2-D region.

Now we have set up a complete double integration:

$$\Pr(X + Y \le 2) = \int_0^2 \int_0^{2-x} f(x,y)\, dy\, dx$$

or

$$\Pr(X + Y \le 2) = \int_0^2 \int_0^{2-y} f(x,y)\, dx\, dy$$
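The two set-ups can be compared numerically. The sketch below is only an illustration: Problem 3 does not specify f(x, y), so it assumes a hypothetical density (X and Y independent, each exponential with mean 1), and `iterated_integral` is a made-up helper that applies the midpoint rule to an outer integral with variable inner bounds.

```python
import math

def iterated_integral(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate of an iterated integral.

    f(outer, inner) is the integrand; lo and hi bound the outer variable;
    inner_lo(outer) and inner_hi(outer) give the two cutting points.
    """
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step            # midpoint of the outer cell
        a, b = inner_lo(u), inner_hi(u)      # bounds of the inner integration
        h = (b - a) / n
        inner = sum(f(u, a + (j + 0.5) * h) for j in range(n)) * h
        total += inner * step
    return total

# Hypothetical joint density (an assumption, not given in the text).
def density(x, y):
    return math.exp(-x - y)

# Outer integration on X from 0 to 2, inner integration on Y from 0 to 2 - x.
p_dy_dx = iterated_integral(density, 0.0, 2.0, lambda x: 0.0, lambda x: 2.0 - x)

# Outer integration on Y from 0 to 2, inner integration on X from 0 to 2 - y.
p_dx_dy = iterated_integral(lambda y, x: density(x, y), 0.0, 2.0,
                            lambda y: 0.0, lambda y: 2.0 - y)

# Closed form for this particular assumed density: 1 - 3 e^{-2}.
exact = 1.0 - 3.0 * math.exp(-2.0)
print(p_dy_dx, p_dx_dy, exact)
```

Both orders of integration should agree with each other and with the closed form up to the discretization error of the midpoint rule.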

Step 4 (final step): Evaluate the double integration, starting from the inner integration.

$$\Pr(X + Y \le 2) = \int_0^2 \left[ \int_0^{2-x} f(x,y)\, dy \right] dx \quad \text{(do the inner integral first, the outer last)}$$

or

$$\Pr(X + Y \le 2) = \int_0^2 \left[ \int_0^{2-y} f(x,y)\, dx \right] dy \quad \text{(do the inner integral first, the outer last)}$$

Let's summarize the 4 steps for tackling joint density and double integration problems:

Step 1 Draw a 2-D region. Initially, the 2-D region is where the joint density f(x, y) exists. Any additional constraint shrinks the 2-D region. When drawing the 2-D region, remember that y ≥ f(x) is above y = f(x) and y ≤ f(x) is below y = f(x) (f(x) here being any curve).

Step 2 Set up the outer integration first. Choose either X or Y as the outer integration variable. Find the min/max values of the outer integration variable in the 2-D region. They are the lower and upper bounds of the outer integration.

Step 3 Set up the inner integration. To determine the lower and upper bounds, draw a vertical line (if the inner integration is on Y) or a horizontal line (if the inner integration is on X) that cuts through the 2-D region. Use the two cutting points as the lower and upper bounds for the inner integration.

Step 4 Evaluate the double integration. Do the inner integration first.

Let's practice the above steps.

Problem 4

Random variables X and Y have the following joint distribution:

$$f(x,y) = \begin{cases} k(1 - x + 2y), & \text{for } 0 \le x \le y \le 2 - x \\ 0, & \text{elsewhere} \end{cases}$$

Find E(X).

Solution

First, we need to solve for k by using the equation:

$$\iint_{0 \le x \le y \le 2-x} f(x,y)\, dx\, dy = 1$$

Step One: Determine the 2-D region

We break down 0 ≤ x ≤ y ≤ 2 − x into three conditions:

x ≥ 0 (1st and 4th quadrants)
x ≤ y (Figure 6)
y ≤ 2 − x (Figure 7)

Figure 6: Shaded area = {Y ≥ X}
Figure 7: Shaded area = {Y ≤ 2 − X}

Now if you put all the separate constraints together, you'll find the desired 2-D region for 0 ≤ x ≤ y ≤ 2 − x (see the shaded area in Figure 8):

Figure 8: Shaded area = {0 ≤ x ≤ y ≤ 2 − x}

Step Two: Set up the outer integration

If we do the outer integration on X: in the 2-D region, min X = 0, max X = 1.

$$\iint_{0 \le x \le y \le 2-x} f(x,y)\, dx\, dy = \int_0^1 \left[ \int f(x,y)\, dy \right] dx$$

If we do the outer integration on Y: in the 2-D region, min Y = 0, max Y = 2.

$$\iint_{0 \le x \le y \le 2-x} f(x,y)\, dx\, dy = \int_0^2 \left[ \int f(x,y)\, dx \right] dy$$

Step Three: Determine the lower and upper bounds for the inner integration.

If we do the inner integration on Y: to find the lower and upper bounds on Y, we draw a vertical line X = x that cuts through the 2-D region. The two cutting points are M(x, 2 − x) and N(x, x). So the lower bound is x (the Y coordinate of N) and the upper bound is 2 − x (the Y coordinate of M). See Figure 9.

Figure 9: vertical line X = x cutting the 2-D region at N(x, x) and M(x, 2 − x)

$$\int_0^1 \left[ \int f(x,y)\, dy \right] dx = \int_0^1 \int_{NM} f(x,y)\, dy\, dx = \int_0^1 \int_x^{2-x} f(x,y)\, dy\, dx$$

If we do the inner integration on X (more difficult; see Figure 10): to determine the lower and upper bounds of X, we need to draw two horizontal lines, DE and GH, that cut through the 2-D region. This is because the upper bound of X for the triangle ABC (the top portion of the 2-D region) is different from the upper bound of X for the triangle ABO (the lower portion of the 2-D region).

For the line from D(0, y) to E(2 − y, y), the lower bound of X is zero (the X coordinate of D) and the upper bound is 2 − y (the X coordinate of E); for the line from G(0, y) to H(y, y), the lower bound is zero (the X coordinate of G) and the upper bound is y (the X coordinate of H).

As a result, to do the inner and outer integrations, we need to divide the 2-D region into two sub-regions: ABO and ABC.

Figure 10: horizontal lines GH (in triangle ABO) and DE (in triangle ABC)

So the double integration becomes:

$$\int_0^2 \left[ \int f(x,y)\, dx \right] dy = \underbrace{\int_0^1 \int_{GH} f(x,y)\, dx\, dy}_{ABO} + \underbrace{\int_1^2 \int_{DE} f(x,y)\, dx\, dy}_{ABC} = \int_0^1 \int_0^{y} f(x,y)\, dx\, dy + \int_1^2 \int_0^{2-y} f(x,y)\, dx\, dy$$

Step Four: Evaluate the integration, starting from the inner integration.

We'll do the outer integration on X and the inner integration on Y (this is easier).

$$\int_0^1 \int_x^{2-x} f(x,y)\, dy\, dx = 1 \quad\Rightarrow\quad \int_0^1 \int_x^{2-x} k(1 - x + 2y)\, dy\, dx = 1$$

We always do the inner integration first:

$$\int_x^{2-x} k(1 - x + 2y)\, dy \quad \text{(remember that } x \text{ is a constant here)}$$

To help us remember that we are integrating over y, not x, we underline any item that has y in it, treating all other items as constants (not underlined):

$$\int_x^{2-x} k(1 - x + 2\underline{y})\, d\underline{y} = k\left[\underline{y} - x\underline{y} + \underline{y}^2\right]_x^{2-x} = k\left\{ \left[(2-x) - x(2-x) + (2-x)^2\right] - \left[x - x^2 + x^2\right] \right\} = k(2x^2 - 8x + 6)$$

Next, we integrate k(2x² − 8x + 6) over x:

$$\int_0^1 \int_x^{2-x} k(1 - x + 2y)\, dy\, dx = \int_0^1 k(2x^2 - 8x + 6)\, dx = k\left[\tfrac{2}{3}x^3 - 4x^2 + 6x\right]_0^1 = k\left(\tfrac{2}{3} - 4 + 6\right) = \tfrac{8}{3}k$$

$$\tfrac{8}{3}k = 1 \quad\Rightarrow\quad k = \tfrac{3}{8}$$

Now we are ready to find E(X).

$$E(X) = \int_0^1 \int_x^{2-x} x \cdot \tfrac{3}{8}(1 - x + 2y)\, dy\, dx$$

First do the inner integration:

$$\int_x^{2-x} \tfrac{3}{8}\, x (1 - x + 2y)\, dy = \tfrac{3}{8}\, x (2x^2 - 8x + 6) = \tfrac{3}{8}(2x^3 - 8x^2 + 6x)$$

Next do the outer integration:

$$E(X) = \tfrac{3}{8} \int_0^1 (2x^3 - 8x^2 + 6x)\, dx = \tfrac{3}{8}\left[\tfrac{2}{4}x^4 - \tfrac{8}{3}x^3 + \tfrac{6}{2}x^2\right]_0^1 = \tfrac{3}{8}\left(\tfrac{2}{4} - \tfrac{8}{3} + \tfrac{6}{2}\right) = \tfrac{5}{16}$$

Here's a repeat of an exam hint I said before. In the heat of the exam, converting (3/8)(2/4 − 8/3 + 6/2) into a neat fraction is painful and prone to errors. You should use the calculator technique I showed you in Chapter 2 and use your calculator to deal with fractions quickly and accurately.
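As a cross-check away from the exam room, the same four steps can be sketched in Python. This is only an illustration: `iterated_integral` is a hypothetical midpoint-rule helper, not part of the text, and the exact-fraction line simply redoes the last arithmetic step with the standard `fractions` module.

```python
from fractions import Fraction

def iterated_integral(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate; f(outer, inner), inner bounds depend on outer."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step
        a, b = inner_lo(u), inner_hi(u)
        h = (b - a) / n
        total += sum(f(u, a + (j + 0.5) * h) for j in range(n)) * h * step
    return total

# Density of Problem 4 without the constant k.
def g(x, y):
    return 1 - x + 2 * y

# Outer integration on X from 0 to 1, inner on Y from x to 2 - x.
y_low = lambda x: x
y_high = lambda x: 2.0 - x

mass = iterated_integral(g, 0.0, 1.0, y_low, y_high)   # should be close to 8/3
k = 1.0 / mass                                         # should be close to 3/8
e_x = k * iterated_integral(lambda x, y: x * g(x, y), 0.0, 1.0, y_low, y_high)

# Exact arithmetic for the final step: 3/8 * (2/4 - 8/3 + 6/2).
e_x_exact = Fraction(3, 8) * (Fraction(2, 4) - Fraction(8, 3) + Fraction(6, 2))
print(k, e_x, e_x_exact)
```

The numeric estimates should land near k = 0.375 and E(X) = 0.3125, and the exact fraction reduces to 5/16.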

Problem 5 (continued from Problem 4)

Random variables X and Y have the following joint distribution:

$$f(x,y) = \begin{cases} \tfrac{3}{8}(1 - x + 2y), & \text{for } 0 \le x \le y \le 2 - x \\ 0, & \text{elsewhere} \end{cases}$$

Find Var(X).

Solution

If you can accurately find the 2-D region and correctly set up the double integration, the statistical formulas for joint distributions are similar to the formulas for single integration, except that we have to use a joint pdf (instead of a single pdf) and integrate over a 2-D region.

$$Var(X) = E(X^2) - E^2(X) \quad \text{(same formula)}$$

We already know that E(X) = 5/16.

$$E(X^2) = \iint x^2 f(x,y)\, dx\, dy \quad \text{(similar formula)}$$

$$= \int_0^1 \int_x^{2-x} x^2 \cdot \tfrac{3}{8}(1 - x + 2y)\, dy\, dx = \tfrac{3}{8}\int_0^1 x^2 (2x^2 - 8x + 6)\, dx = \tfrac{3}{20}$$

$$Var(X) = \tfrac{3}{20} - \left(\tfrac{5}{16}\right)^2 \approx 0.05234$$

If the problem asks us to find other statistics such as Cov(X, Y), then

$$E(XY) = \iint xy\, f(x,y)\, dx\, dy = \int_0^1 \int_x^{2-x} xy \cdot \tfrac{3}{8}(1 - x + 2y)\, dy\, dx = \dots$$

$$E(Y) = \iint y\, f(x,y)\, dx\, dy = \int_0^1 \int_x^{2-x} y \cdot \tfrac{3}{8}(1 - x + 2y)\, dy\, dx = \dots$$

$$Cov(X, Y) = E(XY) - E(X)E(Y)$$

What about word problems about a joint pdf? Most of the difficulty in such word problems is about finding the right 2-D region. If you can correctly identify the 2-D region, the rest of the work is just (tedious) integration. Let's have some word problems and practice how to pinpoint the 2-D region for double integration.

Problem 6 (word problem)

Claims on homeowners insurance consist of two parts: claims on the main dwelling (i.e. house) and claims on other structures on the property (such as a garage). Let X be the portion of a claim on the house, and let Y be the portion of the same claim on the other structures. X, Y have the joint density function f(x, y) if 0 < x < 2, 0 < y < 3, and 0 elsewhere. Find the probability that the house portion of the claim is more than two times the other structures' portion of the same claim.

Solution

The probability that the house portion of the claim is more than two times the other structures' portion of the same claim is Pr(X > 2Y). I will only show you how to find the 2-D region and how to set up the double integration. You can do the actual integration.

Don't worry about how the joint density function f(x, y) looks, because f(x, y) doesn't affect the 2-D region.

0 < x < 2, 0 < y < 3 is an obvious constraint. The other constraint is that the claim on the house must be more than two times the claim on the other structures (i.e. X > 2Y, or Y < 0.5X). So the 2-D region is formed by:

0 < x < 2, 0 < y < 3, y < 0.5x

Figure 11: 2-D region for Pr(X > 2Y)

The shaded triangle AOB in Figure 11 is the 2-D region for integration.

$$\Pr(X > 2Y) = \int_0^2 \int_0^{0.5x} f(x,y)\, dy\, dx$$

Problem 7

Claims on homeowners insurance consist of two parts: claims on the main dwelling (i.e. house) and claims on other structures on the property (such as a garage). Let X be the portion of a claim on the house, and let Y be the portion of the same claim on the other structures. X, Y have the joint density function f(x, y) if 0 < x < 2, 0 < y < 3, and 0 elsewhere. Find the probability that the total claim exceeds 3.

Solution

The 2-D region is the shaded area below (triangle ADE in Figure 12).

Figure 12: 2-D region for Pr(X + Y > 3)

$$\Pr(X + Y > 3) = \int_0^2 \int_{3-x}^{3} f(x,y)\, dy\, dx$$

Problem 8

Claims on homeowners insurance consist of two parts: claims on the main dwelling (i.e. house) and claims on other structures on the property (such as a garage). Let X be the portion of a claim on the house, and let Y be the portion of the same claim on the other structures. X, Y have the joint density function f(x, y) if 0 < x < 2, 0 < y < 3, and 0 elsewhere. Find the probability that the total claim exceeds 1 but is less than 3 and the other structures' portion of the same claim is less than 1.5.

Solution

We need to find

$$\Pr\left[(X + Y > 1) \cap (X + Y < 3) \cap (Y < 1.5)\right]$$

The 2-D region is the shaded area (ABCDEF) in Figure 13.

Figure 13: 2-D region ABCDEF

To do the double integration, we divide ABCDEF into ABCD and DEFA.

$$\Pr\left[(X + Y > 1) \cap (X + Y < 3) \cap (Y < 1.5)\right] = \int_0^1 \int_{1-y}^{2} f(x,y)\, dx\, dy + \int_1^{1.5} \int_0^{3-y} f(x,y)\, dx\, dy$$

Homework for you: #5, #10 May 2000; #20, #36, #40 Nov 2000; #5, #13, #22, #24 May 2001; #28, #30 Nov 2001; #10, #20, #24 May 2003.
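The three set-ups in Problems 6 to 8 can be sanity-checked by plugging in a concrete density. The text leaves f(x, y) unspecified, so the sketch below assumes, purely for illustration, a uniform density of 1/6 on 0 < x < 2, 0 < y < 3; with that assumption each probability is just the region's area divided by 6. `iterated_integral` is a hypothetical midpoint-rule helper.

```python
def iterated_integral(f, lo, hi, inner_lo, inner_hi, n=200):
    """Midpoint-rule estimate; f(outer, inner), inner bounds depend on outer."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step
        a, b = inner_lo(u), inner_hi(u)
        h = (b - a) / n
        total += sum(f(u, a + (j + 0.5) * h) for j in range(n)) * h * step
    return total

# Assumed (hypothetical) density: uniform on the rectangle 0<x<2, 0<y<3.
def density(x, y):
    return 1.0 / 6.0

# Problem 6: outer X from 0 to 2, inner Y from 0 to 0.5 x.
p6 = iterated_integral(density, 0.0, 2.0, lambda x: 0.0, lambda x: 0.5 * x)

# Problem 7: outer X from 0 to 2, inner Y from 3 - x to 3.
p7 = iterated_integral(density, 0.0, 2.0, lambda x: 3.0 - x, lambda x: 3.0)

# Problem 8: outer Y, inner X, region split at y = 1.
swapped = lambda y, x: density(x, y)
p8 = (iterated_integral(swapped, 0.0, 1.0, lambda y: 1.0 - y, lambda y: 2.0)
      + iterated_integral(swapped, 1.0, 1.5, lambda y: 0.0, lambda y: 3.0 - y))

print(p6, p7, p8)
```

Under the uniform assumption the regions have areas 1, 2 and 2.375, so the three results should be close to 1/6, 1/3 and 19/48.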

Chapter 29 Marginal/conditional density

How to find the marginal density if you know the joint density:

Step 1 Draw the 2-D region for all (X, Y).

Step 2 Set up the single integration.

The X-marginal density is $f(x) = \int_{y \mid x} f(x,y)\, dy$ (integrates over all y's given X = x).

The Y-marginal density is $f(y) = \int_{x \mid y} f(x,y)\, dx$ (integrates over all x's given Y = y).

Step 3 If the integration is on Y, draw a vertical line that cuts through the 2-D region. The Y coordinates of the two cutting points are the lower bound and upper bound of the integration. If the integration is on X, draw a horizontal line that cuts through the 2-D region. The X coordinates of the two cutting points are the lower bound and upper bound of the integration.

You can see that the procedure for finding the marginal density is similar to the procedure for tackling problems on joint density and double integration. The only difference is that when doing double integration, you have to set up the outer integration. In contrast, the marginal density is a single integration. If you take out the step related to the outer integration, the procedure for joint density problems becomes the procedure for finding the marginal density.

Formulas for the conditional density and conditional expectation

The conditional density of X given Y = y:

$$f_{x \mid y}(x \mid y) = \frac{f(x,y)}{f(y)}$$

The conditional expectation of X given Y = y:

$$E(X \mid Y = y) = \int_{x \mid y} x\, f_{x \mid y}(x \mid y)\, dx$$

The conditional density of Y given X = x:

$$f_{y \mid x}(y \mid x) = \frac{f(x,y)}{f(x)}$$

The conditional expectation of Y given X = x:

$$E(Y \mid X = x) = \int_{y \mid x} y\, f_{y \mid x}(y \mid x)\, dy$$

To find the conditional range y | x, we draw a vertical line that cuts through the 2-D region. The two cutting points are the lower and upper bounds for y | x.

To find the conditional range x | y, we draw a horizontal line that cuts through the 2-D region. The two cutting points are the lower and upper bounds for x | y.

By now you should be very good at this.

Problem 1

Random variables X and Y have the following joint distribution:

$$f(x,y) = \begin{cases} \tfrac{3}{8}(1 - x + 2y), & \text{for } 0 \le x \le y \le 2 - x \\ 0, & \text{elsewhere} \end{cases}$$

Find the X-marginal density f(x) and the Y-marginal density f(y).

Solution

We worked on this example before when we were solving joint density related problems. Now we are finding the marginal density.

Step 1 Draw the 2-D region for (X, Y):

Figure 16: Shaded area = {0 ≤ x ≤ y ≤ 2 − x}

Step 2 Set up the integration:

The X-marginal density is

$$f(x) = \int_{y \mid x} f(x,y)\, dy = \int_{y \mid x} \tfrac{3}{8}(1 - x + 2y)\, dy$$

The Y-marginal density is

$$f(y) = \int_{x \mid y} f(x,y)\, dx = \int_{x \mid y} \tfrac{3}{8}(1 - x + 2y)\, dx$$

Step 3 Determine the lower and upper bounds for integration.

If the integration is on Y (when finding the X-marginal): to find the lower and upper bounds on Y, we draw a vertical line X = x that cuts through the 2-D region. The two cutting points are M(x, 2 − x) and N(x, x). So the lower bound is x (the Y coordinate of N) and the upper bound is 2 − x (the Y coordinate of M).

Figure 17: vertical line X = x cutting the 2-D region at N(x, x) and M(x, 2 − x)

$$f(x) = \int_{y \mid x} \tfrac{3}{8}(1 - x + 2y)\, dy = \int_x^{2-x} \tfrac{3}{8}(1 - x + 2y)\, dy$$

$$f(x) = \int_x^{2-x} \tfrac{3}{8}(1 - x + 2y)\, dy = \tfrac{3}{8}\left[y - xy + y^2\right]_x^{2-x} = \tfrac{3}{8}\left\{\left[(2-x) - x(2-x) + (2-x)^2\right] - \left[x - x^2 + x^2\right]\right\} = \tfrac{3}{8}(2x^2 - 8x + 6) = \tfrac{3}{4}(x^2 - 4x + 3)$$

Because 0 ≤ x ≤ 1 in the 2-D region, f(x) = (3/4)(x² − 4x + 3) is defined on 0 ≤ x ≤ 1:

$$f(x) = \tfrac{3}{4}(x^2 - 4x + 3) \quad \text{for } 0 \le x \le 1$$

Double check: we should have $\int_0^1 f(x)\, dx = 1$.

$$\int_0^1 \tfrac{3}{4}(x^2 - 4x + 3)\, dx = \tfrac{3}{4}\left[\tfrac{1}{3}x^3 - 2x^2 + 3x\right]_0^1 = \tfrac{3}{4}\left(\tfrac{1}{3} - 2 + 3\right) = 1 \quad \text{(OK)}$$

If the integration is on X (when finding the Y-marginal): to determine the lower and upper bounds of X, we need to draw two horizontal lines, DE and GH, that cut through the 2-D region. For the line from D(0, y) to E(2 − y, y), the lower bound of X is zero (the X coordinate of D) and the upper bound is 2 − y (the X coordinate of E); for the line from G(0, y) to H(y, y), the lower bound is zero (the X coordinate of G) and the upper bound is y (the X coordinate of H).

Figure 18: horizontal lines GH (for 0 ≤ y ≤ 1) and DE (for 1 ≤ y ≤ 2)

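As a result, we need to do two separate integrations.

For 0 ≤ y ≤ 1, the Y-marginal density is:

$$f(y) = \int_{GH} f(x,y)\, dx = \int_0^{y} \tfrac{3}{8}(1 - x + 2y)\, dx = \tfrac{3}{8}\left[(1 + 2y)x - \tfrac{1}{2}x^2\right]_0^{y} = \tfrac{3}{8}\left[(1 + 2y)y - \tfrac{1}{2}y^2\right] = \tfrac{3}{8}\left(\tfrac{3}{2}y^2 + y\right)$$

For 1 ≤ y ≤ 2, the Y-marginal density is:

$$f(y) = \int_{DE} f(x,y)\, dx = \int_0^{2-y} \tfrac{3}{8}(1 - x + 2y)\, dx = \tfrac{3}{8}\left[(1 + 2y)x - \tfrac{1}{2}x^2\right]_0^{2-y} = \tfrac{3}{8}\left[(1 + 2y)(2 - y) - \tfrac{1}{2}(2 - y)^2\right] = \tfrac{3}{8}\left(-\tfrac{5}{2}y^2 + 5y\right)$$

So the Y-marginal density is:

$$f(y) = \begin{cases} \tfrac{3}{8}\left(\tfrac{3}{2}y^2 + y\right), & \text{for } 0 \le y \le 1 \\ \tfrac{3}{8}\left(-\tfrac{5}{2}y^2 + 5y\right), & \text{for } 1 \le y \le 2 \end{cases}$$

Double check: we should have $\int_0^2 f(y)\, dy = 1$.

$$\int_0^2 f(y)\, dy = \int_0^1 \tfrac{3}{8}\left(\tfrac{3}{2}y^2 + y\right) dy + \int_1^2 \tfrac{3}{8}\left(-\tfrac{5}{2}y^2 + 5y\right) dy = \tfrac{3}{8}\left[\tfrac{1}{2}y^3 + \tfrac{1}{2}y^2\right]_0^1 + \tfrac{3}{8}\left[-\tfrac{5}{6}y^3 + \tfrac{5}{2}y^2\right]_1^2$$

$$= \tfrac{3}{8}\left(\tfrac{1}{2} + \tfrac{1}{2}\right) + \tfrac{3}{8}\left(-\tfrac{5}{6}\cdot 7 + \tfrac{5}{2}\cdot 3\right) = \tfrac{3}{8} + \tfrac{5}{8} = 1 \quad \text{(OK)}$$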
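The marginal densities derived above can be checked numerically. The following is a rough sketch with a hypothetical helper `midpoint` (a one-dimensional midpoint rule); it integrates the joint density along a vertical or horizontal cut and compares the result with the closed-form marginals.

```python
def midpoint(f, a, b, n=2000):
    """One-dimensional midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def joint(x, y):
    """Joint density of Problem 1: 3/8 (1 - x + 2y) on 0 <= x <= y <= 2 - x."""
    return 0.375 * (1 - x + 2 * y) if 0 <= x <= y <= 2 - x else 0.0

def f_x(x):
    """X-marginal derived in the text."""
    return 0.75 * (x * x - 4 * x + 3) if 0 <= x <= 1 else 0.0

def f_y(y):
    """Y-marginal derived in the text (two branches)."""
    if 0 <= y <= 1:
        return 0.375 * (1.5 * y * y + y)
    if 1 < y <= 2:
        return 0.375 * (-2.5 * y * y + 5 * y)
    return 0.0

# Each marginal should integrate to one.
total_x = midpoint(f_x, 0.0, 1.0)
total_y = midpoint(f_y, 0.0, 1.0) + midpoint(f_y, 1.0, 2.0)

# Vertical cut at x0: integrate the joint density over y from x0 to 2 - x0.
x0 = 0.3
fx_numeric = midpoint(lambda y: joint(x0, y), x0, 2 - x0)

# Horizontal cut at y0 in the upper triangle: integrate over x from 0 to 2 - y0.
y0 = 1.4
fy_numeric = midpoint(lambda x: joint(x, y0), 0.0, 2 - y0)

print(total_x, total_y, fx_numeric, f_x(x0), fy_numeric, f_y(y0))
```

The sample points x0 = 0.3 and y0 = 1.4 are arbitrary choices; any point inside the region should give the same agreement.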

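Problem 2

Random variables X and Y have the following joint distribution:

$$f(x,y) = \begin{cases} \tfrac{3}{8}(1 - x + 2y), & \text{for } 0 \le x \le y \le 2 - x \\ 0, & \text{elsewhere} \end{cases}$$

Find $f_{x \mid y}(x \mid y)$, $f_{y \mid x}(y \mid x)$, $E(X \mid Y = y)$, $E(Y \mid X = x)$.

Solution

$$f_{x \mid y}(x \mid y) = \frac{f(x,y)}{f(y)} = \frac{\tfrac{3}{8}(1 - x + 2y)}{\tfrac{3}{8}\left(\tfrac{3}{2}y^2 + y\right)} = \frac{1 - x + 2y}{\tfrac{3}{2}y^2 + y} \quad (\text{for } 0 < y \le 1)$$

$$f_{x \mid y}(x \mid y) = \frac{\tfrac{3}{8}(1 - x + 2y)}{\tfrac{3}{8}\left(-\tfrac{5}{2}y^2 + 5y\right)} = \frac{1 - x + 2y}{-\tfrac{5}{2}y^2 + 5y} \quad (\text{for } 1 \le y < 2)$$

$$f_{y \mid x}(y \mid x) = \frac{f(x,y)}{f(x)} = \frac{\tfrac{3}{8}(1 - x + 2y)}{\tfrac{3}{4}(x^2 - 4x + 3)} = \frac{1 - x + 2y}{2(x^2 - 4x + 3)} \quad (\text{for } 0 \le x < 1)$$

The endpoints y = 2 and x = 1 are excluded because the marginal density in the denominator is zero there.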

Figure 19: horizontal lines GH (for 0 < y ≤ 1) and DE (for 1 ≤ y < 2)

For 0 < y ≤ 1:

$$E(X \mid Y = y) = \int_{x \mid y} x\, f_{x \mid y}(x \mid y)\, dx = \int_{GH} x\, \frac{1 - x + 2y}{\tfrac{3}{2}y^2 + y}\, dx = \frac{1}{\tfrac{3}{2}y^2 + y} \int_0^{y} x(1 - x + 2y)\, dx$$

$$= \frac{1}{\tfrac{3}{2}y^2 + y} \int_0^{y} (x - x^2 + 2xy)\, dx = \frac{1}{\tfrac{3}{2}y^2 + y} \left[\tfrac{1}{2}x^2 - \tfrac{1}{3}x^3 + x^2 y\right]_0^{y} = \frac{\tfrac{1}{2}y^2 - \tfrac{1}{3}y^3 + y^3}{\tfrac{3}{2}y^2 + y} = \frac{\tfrac{1}{2}y^2 + \tfrac{2}{3}y^3}{\tfrac{3}{2}y^2 + y}$$

For 1 ≤ y < 2:

$$E(X \mid Y = y) = \int_{x \mid y} x\, f_{x \mid y}(x \mid y)\, dx = \int_{DE} x\, \frac{1 - x + 2y}{-\tfrac{5}{2}y^2 + 5y}\, dx = \frac{1}{-\tfrac{5}{2}y^2 + 5y} \int_0^{2-y} x(1 - x + 2y)\, dx$$

$$= \frac{1}{-\tfrac{5}{2}y^2 + 5y} \left[\tfrac{1}{2}x^2 - \tfrac{1}{3}x^3 + x^2 y\right]_0^{2-y} = \frac{1}{-\tfrac{5}{2}y^2 + 5y} \left[\tfrac{1}{2}(2 - y)^2 - \tfrac{1}{3}(2 - y)^3 + (2 - y)^2 y\right]$$

One quick check: E(X | Y = y) for 0 < y ≤ 1 and E(X | Y = y) for 1 ≤ y < 2 should generate the identical result for y = 1. You can verify that this is, indeed, the case; both generate the result E(X | Y = 1) = 7/15.

Next, we find

$$E(Y \mid X = x) = \int_{y \mid x} y\, f_{y \mid x}(y \mid x)\, dy = \int_{NM} y\, f_{y \mid x}(y \mid x)\, dy$$

Figure 20: vertical line X = x cutting the 2-D region at N(x, x) and M(x, 2 − x)

$$f_{y \mid x}(y \mid x) = \frac{f(x,y)}{f(x)} = \frac{1 - x + 2y}{2(x^2 - 4x + 3)} \quad (\text{for } 0 \le x < 1)$$

$$E(Y \mid X = x) = \int_{NM} y\, \frac{1 - x + 2y}{2(x^2 - 4x + 3)}\, dy = \int_x^{2-x} y\, \frac{1 - x + 2y}{2(x^2 - 4x + 3)}\, dy = \frac{1}{2(x^2 - 4x + 3)} \int_x^{2-x} y(1 - x + 2y)\, dy$$

$$= \frac{1}{2(x^2 - 4x + 3)} \left[\tfrac{1}{2}(1 - x)y^2 + \tfrac{2}{3}y^3\right]_x^{2-x} = \dots$$

(Further evaluation of E(Y | X = x) is omitted.)

Most likely, Exam P won't ask you to do such hard-core integrations as we did here. However, this problem does give you a good idea of how to find the marginal and conditional density.

Problem 3 (easier calculation)

Two loss random variables X and Y have the following joint distribution:

$$f(x,y) = \begin{cases} \tfrac{3}{8}(1 - x + 2y), & \text{for } 0 \le x \le y \le 2 - x \\ 0, & \text{elsewhere} \end{cases}$$

Without using any of the formulas derived in the previous problem, find Pr(X > 0.5 | Y = 1), E(X | Y = 1), and Var(X | Y = 1).

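Figure 21: horizontal line Y = 1 through the 2-D region, with points J (x = 0), R (x = 0.5) and K (x = 1)

Solution

$$\Pr(X > 0.5 \mid Y = 1) = \frac{\Pr(X > 0.5,\, Y = 1)}{\Pr(Y = 1)}$$

$$\Pr(X > 0.5,\, Y = 1) = \int_{RK} f(x, y = 1)\, dx = \int_{0.5}^{1} f(x, y = 1)\, dx$$

$$f(x, y = 1) = \tfrac{3}{8}(1 - x + 2y)\Big|_{y=1} = \tfrac{3}{8}(3 - x)$$

$$\Pr(X > 0.5,\, Y = 1) = \int_{0.5}^{1} \tfrac{3}{8}(3 - x)\, dx = \tfrac{3}{8}\int_2^{2.5} t\, dt = \tfrac{3}{8}\left[\tfrac{1}{2}t^2\right]_2^{2.5} = \tfrac{27}{64}$$

(We set 3 − x = t to speed up the integration.)

$$\Pr(Y = 1) = \int_{JK} f(x, y = 1)\, dx = \int_0^1 \tfrac{3}{8}(3 - x)\, dx = \tfrac{3}{8}\int_2^{3} t\, dt = \tfrac{3}{8}\left[\tfrac{1}{2}t^2\right]_2^{3} = \tfrac{15}{16}$$

$$\Pr(X > 0.5 \mid Y = 1) = \frac{\Pr(X > 0.5,\, Y = 1)}{\Pr(Y = 1)} = \frac{27/64}{15/16} = \frac{9}{20}$$

Here Pr(Y = 1) and Pr(X > 0.5, Y = 1) are shorthand for the density f(y = 1) and the integral of f(x, y = 1); only their ratio is a probability.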

$$E(X \mid Y = 1) = \int_{JK} x\, \frac{f(x, y = 1)}{f(y = 1)}\, dx = \int_0^1 x\, \frac{f(x, y = 1)}{f(y = 1)}\, dx = \int_0^1 x\, \frac{\tfrac{3}{8}(3 - x)}{\tfrac{15}{16}}\, dx = \tfrac{2}{5}\int_0^1 x(3 - x)\, dx$$

$$= \tfrac{2}{5}\int_0^1 (3x - x^2)\, dx = \tfrac{2}{5}\left[\tfrac{3}{2}x^2 - \tfrac{1}{3}x^3\right]_0^1 = \tfrac{7}{15}$$

E(X | Y = 1) = 7/15 matches the result obtained in the previous problem.

$$Var(X \mid Y = 1) = E(X^2 \mid Y = 1) - E^2(X \mid Y = 1)$$

$$E(X^2 \mid Y = 1) = \int_{JK} x^2\, \frac{f(x, y = 1)}{f(y = 1)}\, dx = \int_0^1 x^2\, \frac{\tfrac{3}{8}(3 - x)}{\tfrac{15}{16}}\, dx = \tfrac{2}{5}\int_0^1 (3x^2 - x^3)\, dx = \tfrac{2}{5}\left[x^3 - \tfrac{1}{4}x^4\right]_0^1 = \tfrac{3}{10}$$

$$Var(X \mid Y = 1) = \tfrac{3}{10} - \left(\tfrac{7}{15}\right)^2 = \tfrac{37}{450}$$

Homework for you: #23 May 2000; #7 Nov 2000; #39 May 2001; #17, #34 Nov 2001; #28 May 2003.
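Because every integrand in Problem 3 is a polynomial in x, the fractions can be reproduced exactly. The sketch below uses Python's `fractions` module; `integrate_poly` is a hypothetical helper that integrates a polynomial given by its coefficient list, and is not part of the text.

```python
from fractions import Fraction as F

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(c * x**i for i, c in enumerate(coeffs))."""
    def antiderivative(x):
        return sum(F(c) / (i + 1) * x ** (i + 1) for i, c in enumerate(coeffs))
    return antiderivative(F(b)) - antiderivative(F(a))

# Slice of the joint density at y = 1: f(x, 1) = 3/8 (3 - x) = 9/8 - 3/8 x.
slice_at_1 = [F(9, 8), F(-3, 8)]

# f(y = 1): integrate the slice over the line JK (x from 0 to 1).
f_y1 = integrate_poly(slice_at_1, 0, 1)

# Pr(X > 0.5 | Y = 1): integrate over RK (x from 0.5 to 1), then normalize.
p_cond = integrate_poly(slice_at_1, F(1, 2), 1) / f_y1

# Multiplying a polynomial by x shifts its coefficient list by one place.
e_x = integrate_poly([0] + slice_at_1, 0, 1) / f_y1
e_x2 = integrate_poly([0, 0] + slice_at_1, 0, 1) / f_y1
var_x = e_x2 - e_x ** 2

print(f_y1, p_cond, e_x, e_x2, var_x)
```

The printed values should reproduce 15/16, 9/20, 7/15, 3/10 and 37/450 from the solution above.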

Chapter 30 Transformation: CDF, PDF, and Jacobian Method

If you are given the pdf of a continuous random variable X, what is the pdf for Y = g(X)? If you are given f(x, y), the joint pdf of X, Y, what's the joint pdf for Z1 = g(X, Y) and Z2 = h(X, Y)?

There are three methods:

Transformation of one variable
  CDF method (good for one-to-one and one-to-many transformations)
  PDF method (good for one-to-one transformations)
Transformation of n variables
  Jacobian method

Transformation of one variable -- CDF method

You first calculate the CDF for the new variable. Then you differentiate the CDF to get the pdf.

Problem 1

Random variable X has the following distribution:

$$f(x) = \tfrac{1}{8}(x + 2), \quad -2 \le x \le 2$$

Find the pdf of a random variable Y, where Y = X² (0 ≤ Y ≤ 4).

Solution

Let's draw a rough diagram of Y = X² (0 ≤ Y ≤ 4). See Figure 14.

Figure 14: the parabola Y = X² for −2 ≤ X ≤ 2

From Figure 14 you can see:

$$F(y) = \Pr(Y \le y) = \Pr(-\sqrt{y} \le X \le \sqrt{y})$$

We easily calculate Pr(−√y ≤ X ≤ √y):

$$\Pr(-\sqrt{y} \le X \le \sqrt{y}) = \int_{-\sqrt{y}}^{\sqrt{y}} f(x)\, dx = \int_{-\sqrt{y}}^{\sqrt{y}} \tfrac{1}{8}(x + 2)\, dx = \tfrac{1}{8}\left[\tfrac{1}{2}x^2 + 2x\right]_{-\sqrt{y}}^{\sqrt{y}} = \tfrac{1}{2}\sqrt{y}$$

Double check: F(0) = (1/2)√0 = 0 and F(4) = (1/2)√4 = 1 (OK).

$$f(y) = \frac{dF(y)}{dy} = \frac{d}{dy}\left(\tfrac{1}{2}\sqrt{y}\right) = \frac{1}{4\sqrt{y}}, \quad 0 < y \le 4$$

Problem 2

X is uniformly distributed over [−10, 10]. Y = |X|. Find the pdf of Y.

Solution

X is uniformly distributed over [−10, 10], so

$$f_X(x) = \tfrac{1}{20}, \qquad F_X(x) = \int_{-10}^{x} f_X(s)\, ds = \int_{-10}^{x} \tfrac{1}{20}\, ds = \tfrac{1}{20}(x + 10)$$

$$\Pr(Y \le y) = \Pr(|X| \le y) = \Pr(-y \le X \le y) = F_X(y) - F_X(-y) = \tfrac{1}{20}(y + 10) - \tfrac{1}{20}(-y + 10) = \tfrac{y}{10}, \quad \text{where } 0 \le y \le 10$$

$$f_Y(y) = \frac{d}{dy}F_Y(y) = \tfrac{1}{10}, \quad \text{where } 0 \le y \le 10$$

Problem 3

A new regulation has a mandatory payment clause, which requires an insurance company to double its bodily injury claim payments to an auto insurance policyholder. Prior to this new regulation, bodily injury-related claim payments to a policyholder, X, have a pdf f_X(x). After the new regulation becomes effective, what is the probability density function of bodily injury claim payments to a policyholder?

Solution

Let Y = bodily injury claim payments made to the policyholder under the new law. Y = 2X. We need to find the pdf f(y).

$$F(y) = \Pr(Y \le y) = \Pr(2X \le y) = \Pr\left(X \le \tfrac{y}{2}\right) = F_X\left(\tfrac{y}{2}\right)$$

$$f(y) = \frac{d}{dy}F_Y(y) = \frac{d}{dy}F_X\left(\tfrac{y}{2}\right) = \frac{d}{d\left(\tfrac{y}{2}\right)}F_X\left(\tfrac{y}{2}\right) \cdot \frac{d}{dy}\left(\tfrac{y}{2}\right) = \tfrac{1}{2}\, f_X\left(\tfrac{y}{2}\right)$$

Problem 4

A machine has two parallel components backing up each other. The machine works as long as at least one of the components is working. Each component's time until failure is independently exponentially distributed, with parameters λ1 = 5 and λ2 = 8 respectively. Find the probability density function of the machine's time until failure.

Solution

Let T1 and T2 represent each component's time until failure. Let T represent the machine's time until failure.

T = max(T1, T2)

$$\Pr(T \le t) = \Pr[\max(T_1, T_2) \le t] = \Pr[(T_1 \le t) \cap (T_2 \le t)]$$

Because T1 and T2 are independent, we have

$$\Pr[(T_1 \le t) \cap (T_2 \le t)] = \Pr(T_1 \le t)\Pr(T_2 \le t) = (1 - e^{-5t})(1 - e^{-8t}) = 1 - e^{-5t} - e^{-8t} + e^{-13t}$$

$$f(t) = \frac{dF(t)}{dt} = \frac{d}{dt}\left(1 - e^{-5t} - e^{-8t} + e^{-13t}\right) = 5e^{-5t} + 8e^{-8t} - 13e^{-13t}$$

Problem 5

A machine has two components working together. The machine works only if both components are working. Each component's time until failure is independently exponentially distributed, with λ1 = 5 and λ2 = 8 respectively. Find the probability density function of the machine's time until failure.

Solution

Let T1 and T2 represent each component's time until failure. Let T represent the machine's time until failure.

T = min(T1, T2)

$$\Pr(T > t) = \Pr[\min(T_1, T_2) > t] = \Pr[(T_1 > t) \cap (T_2 > t)]$$

Because T1 and T2 are independent, we have

$$\Pr[(T_1 > t) \cap (T_2 > t)] = \Pr(T_1 > t)\Pr(T_2 > t) = (e^{-5t})(e^{-8t}) = e^{-13t}$$

Then

$$F(t) = 1 - \Pr(T > t) = 1 - e^{-13t}, \qquad f(t) = \frac{dF(t)}{dt} = \frac{d}{dt}\left(1 - e^{-13t}\right) = 13e^{-13t}$$
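The CDF method in Problems 4 and 5 boils down to "build the CDF, then differentiate". A quick way to catch a sign error is to compare each pdf with a numerical derivative of its CDF. The sketch below does this with a central difference; the function names and the test points are illustrative choices, not part of the text.

```python
import math

RATE_1, RATE_2 = 5.0, 8.0   # the parameters 5 and 8 from Problems 4 and 5

def cdf_max(t):
    """CDF of T = max(T1, T2): both components have failed by time t."""
    return (1 - math.exp(-RATE_1 * t)) * (1 - math.exp(-RATE_2 * t))

def pdf_max(t):
    """pdf of the maximum, as derived in Problem 4."""
    total = RATE_1 + RATE_2
    return (RATE_1 * math.exp(-RATE_1 * t) + RATE_2 * math.exp(-RATE_2 * t)
            - total * math.exp(-total * t))

def cdf_min(t):
    """CDF of T = min(T1, T2): at least one component has failed by time t."""
    return 1 - math.exp(-(RATE_1 + RATE_2) * t)

def pdf_min(t):
    """pdf of the minimum, as derived in Problem 5."""
    total = RATE_1 + RATE_2
    return total * math.exp(-total * t)

def central_difference(cdf, t, h=1e-6):
    """Numerical derivative of a CDF at t."""
    return (cdf(t + h) - cdf(t - h)) / (2 * h)

# Largest gap between the derived pdf and the numerical derivative.
points = [0.05, 0.1, 0.3, 0.7]
max_gap = max(abs(central_difference(cdf_max, t) - pdf_max(t)) for t in points)
min_gap = max(abs(central_difference(cdf_min, t) - pdf_min(t)) for t in points)
print(max_gap, min_gap)
```

Both gaps should be tiny (limited only by the finite-difference error), which confirms that each pdf is the derivative of its CDF.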

Problem 6

X, Y have the following joint pdf:

f(x, y) = x + y, where 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1

Let Z = X + Y. Find f(z), the pdf of Z.

Solution

First, we need to find the cdf F(z) = Pr(Z ≤ z) = Pr(X + Y ≤ z). Then we differentiate F(z) to find f(z). But how are we going to find Pr(X + Y ≤ z)?

Before you feel scared, let's simplify Pr(X + Y ≤ z) by setting z to a constant such as z = 1. Do you know how to find Pr(X + Y ≤ 1)?

If z = 1, the problem is: X, Y have the following joint pdf:

f(x, y) = x + y, where 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1

Find Pr(X + Y ≤ 1).

If this still doesn't ring a bell, we can translate the above simplified problem into a word problem: The claim amount (in units of $1 million) on the house, X, and the claim amount on the garage, Y (in units of $1 million), have the following joint pdf:

f(x, y) = x + y, where 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1

Find the probability that the total claim amount does not exceed 1.

Now you should recognize that this problem is a joint density problem. We have a generic 4-step process described in Chapter 26 to solve this problem:

Determine the 2-D region
Set up the outer integration
Set up the inner integration
Evaluate the double integration

If you know how to solve Pr(X + Y ≤ 1), you can apply the same 4-step process to find Pr(X + Y ≤ z). You just treat z as a constant with the following lower and upper bounds:

0 ≤ z = x + y ≤ 2

However, we'll need to consider two situations: 0 ≤ z ≤ 1 and 1 ≤ z ≤ 2. These two situations have different 2-D regions.

Figure: unit square with the line CE (x + y = 1); area AOB for 0 ≤ z ≤ 1, area OCFGE for 1 ≤ z ≤ 2, and corner triangle FDG

Line CE is x + y = 1.
If 0 ≤ z ≤ 1, 0 ≤ x + y ≤ z is Area AOB.
If 1 ≤ z ≤ 2, 0 ≤ x + y ≤ z is Area OCFGE.

If 0 ≤ z ≤ 1:

$$F(z) = \Pr(X + Y \le z) = \iint_{AOB} f(x,y)\, dx\, dy = \int_0^{z} \int_0^{z-x} (x + y)\, dy\, dx = \tfrac{1}{3}z^3$$

$$f(z) = \frac{d}{dz}F(z) = z^2$$

If 1 ≤ z ≤ 2:

$$F(z) = \Pr(X + Y \le z) = \iint_{OCFGE} f(x,y)\, dy\, dx = 1 - \iint_{FDG} f(x,y)\, dy\, dx = 1 - \int_{z-1}^{1} \int_{z-x}^{1} (x + y)\, dy\, dx$$

$$= 1 - \left(\tfrac{4}{3} - z^2 + \tfrac{1}{3}z^3\right) = -\tfrac{1}{3} + z^2 - \tfrac{1}{3}z^3$$

$$f(z) = \frac{d}{dz}F(z) = \frac{d}{dz}\left(-\tfrac{1}{3} + z^2 - \tfrac{1}{3}z^3\right) = 2z - z^2$$

$$f(z) = \begin{cases} z^2, & \text{for } 0 \le z \le 1 \\ 2z - z^2, & \text{for } 1 \le z \le 2 \end{cases}$$
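The two-branch CDF can be cross-checked by integrating the joint pdf directly. The sketch below is an illustration only: instead of splitting the region the way the solution does, it clamps the inner upper bound to min(1, max(0, z − x)), which describes the same region inside the unit square. `iterated_integral` is a hypothetical midpoint-rule helper.

```python
def iterated_integral(f, lo, hi, inner_lo, inner_hi, n=300):
    """Midpoint-rule estimate; f(outer, inner), inner bounds depend on outer."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * step
        a, b = inner_lo(u), inner_hi(u)
        h = (b - a) / n
        total += sum(f(u, a + (j + 0.5) * h) for j in range(n)) * h * step
    return total

def cdf_numeric(z):
    """Pr(X + Y <= z): outer X from 0 to 1, inner Y from 0 to the clamped bound."""
    return iterated_integral(lambda x, y: x + y, 0.0, 1.0,
                             lambda x: 0.0,
                             lambda x: min(1.0, max(0.0, z - x)))

def cdf_formula(z):
    """Piecewise CDF derived in Problem 6."""
    if z <= 0:
        return 0.0
    if z <= 1:
        return z ** 3 / 3
    if z <= 2:
        return z * z - z ** 3 / 3 - 1.0 / 3.0
    return 1.0

for z in (0.5, 1.0, 1.5, 2.0):
    print(z, cdf_numeric(z), cdf_formula(z))
```

The numeric and closed-form columns should agree to several decimal places at each z, including at the joint z = 1 where both branches give 1/3.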

http://www.guo.coursehost.com 2 Double check: 0 f ( z )dz should be one. 2 2 f ( z )dz = z dz + 2 0 1 (2 z z 2 ) 0 1 1 dz = z 3 3 1 + z 0 2 1 3 z 3 2 1 1 1 3 = + ( 22 1) ( 2 1) = 1 (OK) 3 3 General procedure for finding the pdf for Z = g ( X , Y ) , given X , Y have the joint pdf f ( x, y ) . Step 1 Find Fz ( z ) = Pr ( Z To find Pr g ( X , Y ) z ) = Pr g ( X , Y ) z . z , treat z as a constant. Now the problem becomes Given that X , Y have the joint pdf f ( x, y ) , whats the probability that g ( X , Y ) z ? Use the 4step procedure described in Chapter 26 to find Pr g ( X , Y ) Step 2 Find f Z ( z ) = Problem 7

d FZ ( z ) dz z .

There are two machines A and B. Let $T_A$ = machine A's time until failure; let $T_B$ = machine B's time until failure. $T_A$ and $T_B$ are independent random variables, both exponentially distributed with means of 10 and 5 respectively. Let $X = T_A / T_B$. Find $f(x)$, the pdf of $X$ $(0 < x < +\infty)$.

Solution

Because $T_A$ and $T_B$ are independent, their joint pdf is:

$$f(t_A, t_B) = f(t_A)\, f(t_B) = \left(\tfrac{1}{10} e^{-t_A/10}\right)\left(\tfrac{1}{5} e^{-t_B/5}\right)$$

Next, we need to find the cdf:

$$F_X(x) = \Pr(X \le x) = \Pr\left(\frac{T_A}{T_B} \le x\right) = \Pr(T_A \le x\, T_B)$$

Now the problem becomes: given that $T_A$ and $T_B$ have the joint pdf $\left(\tfrac{1}{10} e^{-t_A/10}\right)\left(\tfrac{1}{5} e^{-t_B/5}\right)$, what's the probability that $T_A \le x\, T_B$, where $x$ is a positive constant?

Now we can use the 4-step process:

Step 1 Determine the 2-D region (shaded area in Figure 15).

Step 2 Set up the outer integration on $T_B$. From Figure 15, we see that the lower bound of $T_B$ is zero; the upper bound is $+\infty$. (You can also choose to have the outer integration on $T_A$.)

Step 3 Set up the inner integration on $T_A$. From Figure 15, we see that the lower bound of $T_A$ is zero. The upper bound is $x\, T_B$.

Step 4 Evaluate the double integration.

Figure 15

$$\Pr(T_A \le x\, T_B) = \int_0^{+\infty}\!\int_0^{x t_B} f(t_A, t_B)\, dt_A\, dt_B = \int_0^{+\infty} \tfrac{1}{5} e^{-t_B/5} \left[\int_0^{x t_B} \tfrac{1}{10} e^{-t_A/10}\, dt_A\right] dt_B = \int_0^{+\infty} \tfrac{1}{5} e^{-t_B/5}\left(1 - e^{-x t_B/10}\right) dt_B$$

$$= \int_0^{+\infty} \tfrac{1}{5} e^{-t_B/5}\, dt_B - \tfrac{1}{5}\int_0^{+\infty} e^{-\frac{x+2}{10} t_B}\, dt_B = 1 - \tfrac{1}{5}\cdot\frac{10}{x+2} = 1 - \frac{2}{x+2}$$

Then $F(x) = 1 - \dfrac{2}{x+2}$ for $0 < x < +\infty$.

Double check: $F(0) = 1 - \dfrac{2}{0+2} = 0$ and $F(+\infty) = 1 - \dfrac{2}{+\infty+2} = 1$. OK.

$$f(x) = \frac{d}{dx} F(x) = \frac{d}{dx}\left[1 - \frac{2}{x+2}\right] = \frac{2}{(x+2)^2}$$

Transformation of one variable: PDF method

Assume we already know the pdf $f_X(x)$ for $X$. We call $X$ the old variable. Then we have a brand new variable $Y = g(X)$.

Question: How can we find $f_Y(y)$, the pdf for $Y$?

Answer: If the transformation is one to one and $J \ne 0$, then

$$f_Y(y) = f_X(x)\,|J|, \quad \text{where } J = \frac{\partial x}{\partial y}.$$

$J$ is called the Jacobian.

Steps to find $f_Y(y)$

Step 1 Recover the old variable. From $y = g(x)$, solve for $x$. Make sure you get only one solution. One solution means that the transformation is one to one.

Step 2 Throw the solution $x(y)$ in $f_X(x)$. You'll get $f_X[x(y)]$.

Step 3 Calculate the Jacobian $J = \dfrac{\partial x}{\partial y}$.

Step 4 $f_Y(y) = f_X[x(y)]\,|J|$
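The four steps above can be sketched numerically. The example below is an illustration only, not one of the book's problems: it assumes $X$ is exponential with mean 1 and $Y = X^2$, and all function names are made up for the sketch. It compares the PDF-method result with a numerical derivative of the CDF-method result.

```python
import math

def pdf_x(x):
    """Old variable X: exponential with mean 1 (an assumption for this sketch)."""
    return math.exp(-x) if x >= 0 else 0.0

def x_of_y(y):
    """Step 1: invert y = x**2 on x >= 0; the single solution is x = sqrt(y)."""
    return math.sqrt(y)

def jacobian(y):
    """Step 3: J = dx/dy = 1 / (2 * sqrt(y))."""
    return 1.0 / (2.0 * math.sqrt(y))

def pdf_y(y):
    """Steps 2 and 4: f_Y(y) = f_X(x(y)) * |J|."""
    return pdf_x(x_of_y(y)) * abs(jacobian(y))

def cdf_y(y):
    """CDF method for comparison: P(Y <= y) = P(X <= sqrt(y))."""
    return 1.0 - math.exp(-math.sqrt(y))

def numeric_derivative(func, y, h=1e-5):
    """Central-difference approximation of func'(y)."""
    return (func(y + h) - func(y - h)) / (2.0 * h)

for y in (0.5, 1.0, 4.0):
    # The two columns should agree to several decimal places.
    print(y, pdf_y(y), numeric_derivative(cdf_y, y))
```

Differentiating the cdf is the fallback whenever Step 1 yields more than one solution.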

Problem 8

$X$ is uniformly distributed over $[-10, 10]$. $Y = |X|$. Find the pdf of $Y$.

Solution

$X$ is uniformly distributed over $[-10, 10]$, so $f_X(x) = \dfrac{1}{20}$.

Step 1 Recover the old variable. $Y = |X|$, so $X = Y$ if $X \ge 0$ and $X = -Y$ if $X \le 0$. Notice that $Y \ge 0$.

Step 2 Throw the solution $x(y)$ in $f_X(x)$. You'll get $f_X[x(y)]$. If $X \ge 0$, $f_X[x(y)] = \dfrac{1}{20}$; if $X \le 0$, still $f_X[x(y)] = \dfrac{1}{20}$.

Step 3 Calculate the Jacobian $J = \dfrac{\partial x}{\partial y}$.

If $X \ge 0$, $J = \dfrac{\partial x}{\partial y} = \dfrac{\partial y}{\partial y} = 1$; if $X \le 0$, $J = \dfrac{\partial x}{\partial y} = \dfrac{\partial (-y)}{\partial y} = -1$. In both cases $|J| = \left|\dfrac{\partial x}{\partial y}\right| = 1$.

Step 4 $f_Y(y) = f_X[x(y)]\,|J|$

If $X \ge 0$, $f_X[x(y)]\,|J| = \dfrac{1}{20}$; if $X \le 0$, $f_X[x(y)]\,|J| = \dfrac{1}{20}$.

$f_Y(y)$ includes both $X \ge 0$ and $X \le 0$. For example, $X = 10$ and $X = -10$ both give us $Y = 10$. Consequently:

$$f_Y(y) = \Big[f_X[x(y)]\,|J|\Big]_{X \ge 0} + \Big[f_X[x(y)]\,|J|\Big]_{X \le 0} = 2\left(\frac{1}{20}\right) = \frac{1}{10}, \quad 0 \le y \le 10$$
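As a cross-check on Problem 8, the same density can be recovered from the cdf of $Y = |X|$. This is only a sketch; the function names are illustrative and the derivative is taken numerically.

```python
def cdf_x(x):
    """CDF of X, uniform over [-10, 10]."""
    if x <= -10.0:
        return 0.0
    if x >= 10.0:
        return 1.0
    return (x + 10.0) / 20.0

def cdf_y(y):
    """P(|X| <= y) = P(-y <= X <= y) for y >= 0."""
    if y < 0.0:
        return 0.0
    return cdf_x(y) - cdf_x(-y)

def pdf_y(y, h=1e-6):
    """Central-difference derivative of the cdf of Y."""
    return (cdf_y(y + h) - cdf_y(y - h)) / (2.0 * h)

# Inside (0, 10) the density should be flat at 1/10.
print(pdf_y(3.0), pdf_y(7.5))
```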

If a transformation is not one to one (as in Problem 8, where two values of $X$ map to the same $Y$), I often find that the CDF method is easier and less prone to errors.

Problem 9 (#13, Nov 2001)

An actuary models the lifetime of a device using the random variable $Y = 10 X^{0.8}$, where $X$ is an exponential random variable with mean 1 year. Determine the probability density function $f(y)$, for $y > 0$, of the random variable $Y$.

Solution

$X$ is an exponential random variable with mean 1 year, so $f(x) = e^{-x}$.

$$Y = 10 X^{0.8}, \quad X^{0.8} = \frac{Y}{10}, \quad X = \left(\frac{Y}{10}\right)^{1/0.8} = (0.1y)^{1.25}$$

$$f_X[x(y)] = e^{-x} = e^{-(0.1y)^{1.25}}$$

$$J = \frac{dx}{dy} = \frac{d}{dy}(0.1y)^{1.25} = 0.1^{1.25}\,(1.25)\, y^{0.25} > 0$$

$$f_Y(y) = f_X[x(y)]\,|J| = e^{-(0.1y)^{1.25}}\; 0.1^{1.25}\,(1.25)\, y^{0.25} = 0.125\,(0.1y)^{0.25}\, e^{-(0.1y)^{1.25}}$$

Problem 10 (#32, Nov 2000)

The monthly profit of Company I can be modeled by a continuous random variable with density function $f$. Company II has a monthly profit that is twice that of Company I. Determine the probability density function of the monthly profit of Company II.

Solution

Let $X$ = Company I's monthly profit; $Y$ = Company II's monthly profit.

$$Y = 2X, \quad X = \tfrac{1}{2} Y, \quad J = \frac{dx}{dy} = \frac{1}{2}, \quad f_Y(y) = f_X[x(y)]\,|J| = \frac{1}{2}\, f\!\left(\frac{y}{2}\right)$$

Transformation of n random variables

First, we consider $n = 2$, the transformation of 2 random variables. The result can be generalized for $n > 2$.

Assume we already know the joint pdf $f_{X_1, X_2}(x_1, x_2)$ of two continuous random variables $X_1$ and $X_2$. We call $X_1$ and $X_2$ the old variables. Then we have two brand new variables $Y_1 = g_1(X_1, X_2)$ and $Y_2 = g_2(X_1, X_2)$.

Question: How can we find $f_{Y_1, Y_2}(y_1, y_2)$, the joint pdf for $Y_1$ and $Y_2$?

Answer: If the transformation is one to one and $J \ne 0$, then

$$f_{Y_1, Y_2}(y_1, y_2) = f_{X_1, X_2}(x_1, x_2)\,|J|$$

where

$$J = \det\begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \dfrac{\partial x_1}{\partial y_2} \\[2ex] \dfrac{\partial x_2}{\partial y_1} & \dfrac{\partial x_2}{\partial y_2} \end{pmatrix} = \frac{\partial x_1}{\partial y_1}\frac{\partial x_2}{\partial y_2} - \frac{\partial x_1}{\partial y_2}\frac{\partial x_2}{\partial y_1}.$$

$J$ is called the Jacobian.

Steps to find $f_{Y_1, Y_2}(y_1, y_2)$

Step 1 Recover the old variables. From $y_1 = g_1(x_1, x_2)$ and $y_2 = g_2(x_1, x_2)$, solve for $x_1$ and $x_2$. Make sure you get only one solution pair $x_1(y_1, y_2)$, $x_2(y_1, y_2)$. One pair means that the transformation is one to one. If you get two or more solution pairs, then the transformation is not one to one, and you can't use the above theorem. A transformation that is not one to one is beyond the scope of Exam P and you don't need to worry about it.

Step 2 Throw the solution $x_1(y_1, y_2)$, $x_2(y_1, y_2)$ in $f_{X_1, X_2}(x_1, x_2)$. You'll get $f_{X_1, X_2}[x_1(y_1, y_2), x_2(y_1, y_2)]$.

Step 3 Calculate the Jacobian

$$J = \det\begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \dfrac{\partial x_1}{\partial y_2} \\[2ex] \dfrac{\partial x_2}{\partial y_1} & \dfrac{\partial x_2}{\partial y_2} \end{pmatrix} = \frac{\partial x_1}{\partial y_1}\frac{\partial x_2}{\partial y_2} - \frac{\partial x_1}{\partial y_2}\frac{\partial x_2}{\partial y_1}.$$

Step 4 $f_{Y_1, Y_2}(y_1, y_2) = f_{X_1, X_2}[x_1(y_1, y_2), x_2(y_1, y_2)]\,|J|$
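Step 3 is where most arithmetic slips happen, so a numerical check of the 2 x 2 determinant can be handy. The sketch below is an illustration under assumptions: the inverse map is the polar-coordinate change of variables (not a problem from this chapter), and the helper names are invented. The determinant is known to equal the radius $y_1$ in that case.

```python
import math

def inverse_map(y1, y2):
    """Hypothetical inverse transformation: old variables (x1, x2) in terms of
    new variables (y1, y2). Here y1 acts as a radius and y2 as an angle."""
    return y1 * math.cos(y2), y1 * math.sin(y2)

def jacobian_det(inv, y1, y2, h=1e-6):
    """Approximate J = dx1/dy1 * dx2/dy2 - dx1/dy2 * dx2/dy1
    with central differences."""
    x1_p, x2_p = inv(y1 + h, y2)
    x1_m, x2_m = inv(y1 - h, y2)
    dx1_dy1 = (x1_p - x1_m) / (2.0 * h)
    dx2_dy1 = (x2_p - x2_m) / (2.0 * h)

    x1_p, x2_p = inv(y1, y2 + h)
    x1_m, x2_m = inv(y1, y2 - h)
    dx1_dy2 = (x1_p - x1_m) / (2.0 * h)
    dx2_dy2 = (x2_p - x2_m) / (2.0 * h)

    return dx1_dy1 * dx2_dy2 - dx1_dy2 * dx2_dy1

# For the polar map the determinant equals the radius y1.
print(jacobian_det(inverse_map, 2.0, 0.7))
```

Swapping in the inverse map from any two-variable problem gives a quick sanity check on a hand-computed $J$.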

Problem 11

$X_1$ and $X_2$ are independent identically distributed exponential random variables with mean of 1. Determine the pdf for $Y = \dfrac{X_1}{X_1 + X_2}$.

Solution

To use the theorem, we first create a fake random variable $Y_2 = X_2$. We define $Y_1 = \dfrac{X_1}{X_1 + X_2}$.

Step 1 Recover the old variables.

$$Y_1 = \frac{X_1}{X_1 + X_2}, \quad Y_2 = X_2 \quad\Longrightarrow\quad x_1 = \frac{y_1 y_2}{1 - y_1}, \quad x_2 = y_2.$$

We get only one solution pair, so the transformation is one to one. We can use the theorem.

Step 2 Throw the solution $x_1(y_1, y_2)$, $x_2(y_1, y_2)$ in $f_{X_1, X_2}(x_1, x_2)$. You'll get $f_{X_1, X_2}[x_1(y_1, y_2), x_2(y_1, y_2)]$.

Let's first find the pdf for the old random variables $X_1$ and $X_2$. Because $X_1$ and $X_2$ are independent, we have:

$$f_{X_1, X_2}(x_1, x_2) = f_{X_1}(x_1)\, f_{X_2}(x_2), \quad f_{X_1}(x_1) = e^{-x_1}, \quad f_{X_2}(x_2) = e^{-x_2}$$

$$f_{X_1, X_2}(x_1, x_2) = e^{-x_1} e^{-x_2} = e^{-(x_1 + x_2)} = e^{-\left(\frac{y_1 y_2}{1 - y_1} + y_2\right)} = e^{-\frac{y_2}{1 - y_1}}$$

Step 3 Calculate the Jacobian

$$J = \det\begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \dfrac{\partial x_1}{\partial y_2} \\[2ex] \dfrac{\partial x_2}{\partial y_1} & \dfrac{\partial x_2}{\partial y_2} \end{pmatrix} = \frac{\partial x_1}{\partial y_1}\frac{\partial x_2}{\partial y_2} - \frac{\partial x_1}{\partial y_2}\frac{\partial x_2}{\partial y_1}.$$

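$$\frac{\partial x_1}{\partial y_1} = \frac{d}{dy_1}\left(\frac{y_1 y_2}{1 - y_1}\right) = \frac{y_2}{(1 - y_1)^2}, \quad \frac{\partial x_1}{\partial y_2} = \frac{d}{dy_2}\left(\frac{y_1 y_2}{1 - y_1}\right) = \frac{y_1}{1 - y_1}, \quad \frac{\partial x_2}{\partial y_1} = \frac{dy_2}{dy_1} = 0, \quad \frac{\partial x_2}{\partial y_2} = \frac{dy_2}{dy_2} = 1$$

$$J = \det\begin{pmatrix} \dfrac{y_2}{(1 - y_1)^2} & \dfrac{y_1}{1 - y_1} \\[2ex] 0 & 1 \end{pmatrix} = \frac{y_2}{(1 - y_1)^2}, \quad |J| = \left|\frac{y_2}{(1 - y_1)^2}\right| = \frac{|y_2|}{(1 - y_1)^2}$$

Because $X_1$ and $X_2$ are exponential random variables, we have $X_1 \ge 0$ and $X_2 \ge 0$. So $y_2 = x_2 \ge 0$, $|y_2| = y_2$, and

$$|J| = \frac{y_2}{(1 - y_1)^2}.$$

Step 4 $f_{Y_1, Y_2}(y_1, y_2) = f_{X_1, X_2}[x_1(y_1, y_2), x_2(y_1, y_2)]\,|J|$

$$f_{Y_1, Y_2}(y_1, y_2) = \frac{y_2}{(1 - y_1)^2}\, e^{-\frac{y_2}{1 - y_1}}$$

Next, we eliminate $y_2$:

$$f_{Y_1}(y_1) = \int_0^{+\infty} \frac{y_2}{(1 - y_1)^2}\, e^{-\frac{y_2}{1 - y_1}}\, dy_2 = \frac{1}{(1 - y_1)^2}\int_0^{+\infty} y_2\, e^{-\frac{y_2}{1 - y_1}}\, dy_2$$

For $\int_0^{+\infty} y_2\, e^{-\frac{y_2}{1 - y_1}}\, dy_2$ to exist, we need to have $\dfrac{1}{1 - y_1} > 0$, or $y_1 < 1$. This is similar to the fact that $\int_0^{+\infty} x e^{-x}\, dx$ exists but $\int_0^{+\infty} x e^{x}\, dx$ doesn't.

To find $\int_0^{+\infty} y_2\, e^{-\frac{y_2}{1 - y_1}}\, dy_2$, we set $\dfrac{y_2}{1 - y_1} = u$. Then $y_2 = u(1 - y_1)$ and $dy_2 = (1 - y_1)\, du$.

$$\int_0^{+\infty} y_2\, e^{-\frac{y_2}{1 - y_1}}\, dy_2 = \int_0^{+\infty} u(1 - y_1)\, e^{-u}\,(1 - y_1)\, du = (1 - y_1)^2\int_0^{+\infty} u e^{-u}\, du, \quad \int_0^{+\infty} u e^{-u}\, du = 1$$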

$$\int_0^{+\infty} y_2\, e^{-\frac{y_2}{1 - y_1}}\, dy_2 = (1 - y_1)^2\int_0^{+\infty} u e^{-u}\, du = (1 - y_1)^2$$

$$f_{Y_1}(y_1) = \frac{1}{(1 - y_1)^2}\int_0^{+\infty} y_2\, e^{-\frac{y_2}{1 - y_1}}\, dy_2 = 1$$

So $Y_1$ is uniformly distributed over $[0, 1]$.

Problem 12

$X_1$ and $X_2$ are independent identically distributed exponential random variables with mean of 1. $Y = \dfrac{X_2}{X_1}$. Find the pdf for $Y$.

Solution

$$f_{X_1}(x_1) = e^{-x_1}, \quad f_{X_2}(x_2) = e^{-x_2}, \quad \text{where } x_1 > 0 \text{ and } x_2 > 0$$

$$f_{X_1, X_2}(x_1, x_2) = f_{X_1}(x_1)\, f_{X_2}(x_2) = e^{-x_1} e^{-x_2} = e^{-(x_1 + x_2)}$$

Let $Y_1 = X_1$ and $Y_2 = \dfrac{X_2}{X_1}$, where $Y_1 > 0$ and $Y_2 > 0$. Then $X_1 = Y_1$ and $X_2 = Y_1 Y_2$.

$$f_{X_1, X_2}(x_1, x_2) = e^{-(x_1 + x_2)} = e^{-y_1(1 + y_2)}, \quad f_{Y_1, Y_2}(y_1, y_2) = |J|\, e^{-y_1(1 + y_2)}$$

$$\frac{\partial x_1}{\partial y_1} = 1, \quad \frac{\partial x_1}{\partial y_2} = 0, \quad \frac{\partial x_2}{\partial y_1} = \frac{\partial (y_1 y_2)}{\partial y_1} = y_2, \quad \frac{\partial x_2}{\partial y_2} = \frac{\partial (y_1 y_2)}{\partial y_2} = y_1$$

$$J = \det\begin{pmatrix} 1 & 0 \\ y_2 & y_1 \end{pmatrix} = y_1, \quad |J| = |y_1| = y_1$$

$$f_{Y_1, Y_2}(y_1, y_2) = y_1\, e^{-y_1(1 + y_2)}$$

$$f_{Y_2}(y_2) = \int_0^{+\infty} f_{Y_1, Y_2}(y_1, y_2)\, dy_1 = \int_0^{+\infty} y_1\, e^{-y_1(1 + y_2)}\, dy_1 = \frac{1}{(1 + y_2)^2}$$

$$f_Y(y) = \frac{1}{(1 + y)^2}, \quad \text{where } y > 0$$

Problem 13

$X_1$ and $X_2$ are independent identically distributed exponential random variables with mean of 1. $Y_1 = X_1 + X_2$ and $Y_2 = \dfrac{X_1}{X_2}$. Find the joint pdf for $Y_1$ and $Y_2$.

Solution

$$f_{X_1}(x_1) = e^{-x_1}, \quad f_{X_2}(x_2) = e^{-x_2}, \quad \text{where } x_1 > 0 \text{ and } x_2 > 0$$

$$f_{X_1, X_2}(x_1, x_2) = f_{X_1}(x_1)\, f_{X_2}(x_2) = e^{-x_1} e^{-x_2} = e^{-(x_1 + x_2)}$$

From $Y_1 = X_1 + X_2$ and $Y_2 = \dfrac{X_1}{X_2}$ we get

$$X_1 = \frac{Y_1 Y_2}{1 + Y_2}, \quad X_2 = \frac{Y_1}{1 + Y_2}$$

$$f_{X_1, X_2}(x_1, x_2) = e^{-(x_1 + x_2)} = e^{-y_1}, \quad f_{Y_1, Y_2}(y_1, y_2) = |J|\, e^{-y_1}$$

$$\frac{\partial x_1}{\partial y_1} = \frac{\partial}{\partial y_1}\left(\frac{y_1 y_2}{1 + y_2}\right) = \frac{y_2}{1 + y_2}$$

$$\frac{\partial x_1}{\partial y_2} = y_1 \frac{\partial}{\partial y_2}\left(\frac{y_2}{1 + y_2}\right) = y_1 \frac{\partial}{\partial y_2}\left(1 - \frac{1}{1 + y_2}\right) = \frac{y_1}{(1 + y_2)^2}$$

$$\frac{\partial x_2}{\partial y_1} = \frac{\partial}{\partial y_1}\left(\frac{y_1}{1 + y_2}\right) = \frac{1}{1 + y_2}, \quad \frac{\partial x_2}{\partial y_2} = y_1 \frac{\partial}{\partial y_2}\left(\frac{1}{1 + y_2}\right) = -\frac{y_1}{(1 + y_2)^2}$$

$$J = \det\begin{pmatrix} \dfrac{y_2}{1 + y_2} & \dfrac{y_1}{(1 + y_2)^2} \\[2ex] \dfrac{1}{1 + y_2} & -\dfrac{y_1}{(1 + y_2)^2} \end{pmatrix} = -\frac{y_1 y_2}{(1 + y_2)^3} - \frac{y_1}{(1 + y_2)^3} = -\frac{y_1}{(1 + y_2)^2}, \quad |J| = \frac{y_1}{(1 + y_2)^2}$$

$$f_{Y_1, Y_2}(y_1, y_2) = |J|\, e^{-y_1} = \frac{y_1}{(1 + y_2)^2}\, e^{-y_1}, \quad \text{where } y_1 > 0 \text{ and } y_2 > 0$$

Homework for you: #4, May 2000; #32, Nov 2000; #26, May 2001; #13, #37, Nov 2001; #23, May 2003.
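The marginal pdfs found in Problems 11 and 12 can be double-checked by integrating the joint pdfs numerically. The sketch below uses a plain midpoint rule; the truncation point of 60 and the helper names are assumptions made for the illustration, chosen so that the neglected exponential tail is negligible.

```python
import math

def integrate(func, a, b, steps=100_000):
    """Midpoint rule on [a, b] with equal-width slices."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def joint_problem_11(y1, y2):
    """Joint pdf from Problem 11: y2 / (1 - y1)^2 * exp(-y2 / (1 - y1))."""
    return y2 / (1.0 - y1) ** 2 * math.exp(-y2 / (1.0 - y1))

def joint_problem_12(y1, y2):
    """Joint pdf from Problem 12: y1 * exp(-y1 * (1 + y2))."""
    return y1 * math.exp(-y1 * (1.0 + y2))

def marginal_11(y1, upper=60.0):
    """Integrate out the fake variable y2; should be 1 for 0 <= y1 < 1."""
    return integrate(lambda y2: joint_problem_11(y1, y2), 0.0, upper)

def marginal_12(y2, upper=60.0):
    """Integrate out y1; should be 1 / (1 + y2)^2."""
    return integrate(lambda y1: joint_problem_12(y1, y2), 0.0, upper)

print(marginal_11(0.3))                      # close to 1
print(marginal_12(2.0), 1.0 / (1 + 2.0) ** 2)  # both close to 1/9
```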

Chapter 31 Univariate & joint order statistics

Let $X_1, X_2, \dots, X_n$ represent $n$ independent identically distributed random variables with a common pdf $f_X(x)$ and a common cdf $F_X(x)$. We assume $X$ is a continuous random variable.

Next, we sort $X_1, X_2, \dots, X_n$ in ascending order:

$$X_{(1)} \le X_{(2)} \le X_{(3)} \le \dots \le X_{(n)}$$

$X_{(1)} = \min(X_1, X_2, \dots, X_n)$. $X_{(1)}$ is called the 1st order statistic of $X_1, X_2, \dots, X_n$.

$X_{(2)} = \min\left[\{X_1, X_2, \dots, X_n\} \setminus \{X_{(1)}\}\right]$. $X_{(2)}$ is the 2nd smallest number. To find $X_{(2)}$, we simply remove $X_{(1)}$ from the original set $(X_1, X_2, \dots, X_n)$ and find the minimum of the remaining set. $X_{(2)}$ is called the 2nd order statistic of $X_1, X_2, \dots, X_n$.

$X_{(3)} = \min\left[\{X_1, X_2, \dots, X_n\} \setminus \{X_{(1)}, X_{(2)}\}\right]$. $X_{(3)}$ is the 3rd smallest number. To find $X_{(3)}$, we simply remove $X_{(1)}$ and $X_{(2)}$ from the original set $(X_1, X_2, \dots, X_n)$ and find the minimum of the remaining set. $X_{(3)}$ is called the 3rd order statistic of $X_1, X_2, \dots, X_n$.

...

$X_{(k)} = \min\left[\{X_1, X_2, \dots, X_n\} \setminus \{X_{(1)}, X_{(2)}, \dots, X_{(k-1)}\}\right]$. $X_{(k)}$ is the k-th smallest number. To find $X_{(k)}$, we simply remove $X_{(1)}, X_{(2)}, \dots, X_{(k-1)}$ from the original set $(X_1, X_2, \dots, X_n)$ and find the minimum of the remaining set. $X_{(k)}$ is called the k-th order statistic of $X_1, X_2, \dots, X_n$.

...

Keep doing this until

$$X_{(n)} = \min\left[\{X_1, X_2, \dots, X_n\} \setminus \{X_{(1)}, X_{(2)}, \dots, X_{(n-1)}\}\right] = \max(X_1, X_2, \dots, X_n)$$

$X_{(n)}$ is called the n-th order statistic of $X_1, X_2, \dots, X_n$.

Example. 5 random samples are drawn from a population and the values of these samples are: 5, 2, 9, 20, and 7. We sort these values into 2, 5, 7, 9, 20. So the 1st order statistic is 2. The 2nd is 5. The 3rd is 7. The 4th is 9. And finally, the 5th is 20.

Now we know what order statistics are.

Question: given $n$ independent identically distributed random variables $X_1, X_2, \dots, X_n$ with a common pdf $f(x)$ and a common cdf $F(x)$, what's the probability distribution (pdf) of the 1st order statistic $X_{(1)}$? The pdf of the 2nd order statistic $X_{(2)}$? The pdf of the k-th order statistic $X_{(k)}$? The pdf of the n-th order statistic $X_{(n)}$?

Most likely, Exam P will ask you to find the pdf of $X_{(1)}$ or $X_{(n)}$. However, we'll derive a generic pdf formula for $X_{(k)}$.

Method 1 (good only for finding the pdf of $X_{(1)}$ and $X_{(n)}$)

$X_{(1)} = \min(X_1, X_2, \dots, X_n)$

$$P\left[X_{(1)} \le x\right] = 1 - P\left[X_{(1)} > x\right] = 1 - P\left[\min(X_1, X_2, \dots, X_n) > x\right]$$

$$P\left[\min(X_1, X_2, \dots, X_n) > x\right] = P\left[(X_1 > x) \cap (X_2 > x) \cap \dots \cap (X_n > x)\right]$$

Because $X_1, X_2, \dots, X_n$ are independent,

$$P\left[(X_1 > x) \cap (X_2 > x) \cap \dots \cap (X_n > x)\right] = P(X_1 > x)\, P(X_2 > x) \cdots P(X_n > x)$$

Because $X_1, X_2, \dots, X_n$ are identically distributed with a common cdf $F_X(x)$,

$$P(X_1 > x) = P(X_2 > x) = \dots = P(X_n > x) = 1 - F_X(x)$$

$$P(X_1 > x)\, P(X_2 > x) \cdots P(X_n > x) = \left[1 - F_X(x)\right]^n$$

$$F_{X_{(1)}}(x) = P\left[X_{(1)} \le x\right] = 1 - P\left[(X_1 > x) \cap (X_2 > x) \cap \dots \cap (X_n > x)\right] = 1 - \left[1 - F_X(x)\right]^n$$

$$f_{X_{(1)}}(x) = \frac{d}{dx} F_{X_{(1)}}(x) = \frac{d}{dx}\left\{1 - \left[1 - F_X(x)\right]^n\right\} = n\left[1 - F_X(x)\right]^{n-1}\frac{d}{dx} F_X(x) = n\left[1 - F_X(x)\right]^{n-1} f_X(x)$$

$$f_{X_{(1)}}(x) = n\left[1 - F_X(x)\right]^{n-1} f_X(x)$$

$X_{(n)} = \max(X_1, X_2, \dots, X_n)$

$$P\left[X_{(n)} \le x\right] = P\left[\max(X_1, X_2, \dots, X_n) \le x\right] = P\left[(X_1 \le x) \cap (X_2 \le x) \cap \dots \cap (X_n \le x)\right]$$

Because $X_1, X_2, \dots, X_n$ are independent,

$$P\left[(X_1 \le x) \cap (X_2 \le x) \cap \dots \cap (X_n \le x)\right] = P(X_1 \le x)\, P(X_2 \le x) \cdots P(X_n \le x)$$

Because $X_1, X_2, \dots, X_n$ are identically distributed with a common cdf $F_X(x)$,

$$P(X_1 \le x) = P(X_2 \le x) = \dots = P(X_n \le x) = F_X(x)$$

$$P(X_1 \le x)\, P(X_2 \le x) \cdots P(X_n \le x) = \left[F_X(x)\right]^n$$

$$F_{X_{(n)}}(x) = P\left[X_{(n)} \le x\right] = \left[F_X(x)\right]^n$$

$$f_{X_{(n)}}(x) = \frac{d}{dx} F_{X_{(n)}}(x) = \frac{d}{dx}\left[F_X(x)\right]^n = n\left[F_X(x)\right]^{n-1}\frac{d}{dx} F_X(x) = n\left[F_X(x)\right]^{n-1} f_X(x)$$

$$f_{X_{(n)}}(x) = n\left[F_X(x)\right]^{n-1} f_X(x)$$

Method 2

$X_{(1)} = \min(X_1, X_2, \dots, X_n)$

$$f_{X_{(1)}}(x)\, dx = P\left[x \le \min(X_1, X_2, \dots, X_n) \le x + dx\right] = P\left[(x \le \text{one of the } X\text{'s} \le x + dx) \cap (\text{all the other } X\text{'s} > x + dx)\right]$$

You might wonder why I didn't write

$$f_{X_{(1)}}(x) = P\left[x \le \min(X_1, X_2, \dots, X_n) \le x + dx\right]$$

The above expression is wrong. As explained before, for a continuous random variable $X$, the pdf $f_X(x)$ is not a real probability. As a matter of fact, we don't require $f_X(x) \le 1$; $f_X(x)$ can approach infinity. However, $f_X(x)\, dx$ is a genuine probability:

$$f_X(x)\, dx = P(x \le X < x + dx)$$

This is why I wrote:

$$f_{X_{(1)}}(x)\, dx = P\left[x \le \min(X_1, X_2, \dots, X_n) \le x + dx\right] = P\left[(x \le \text{one of the } X\text{'s} \le x + dx) \cap (\text{all the other } X\text{'s} > x + dx)\right]$$

Let's continue.

$$P(x \le \text{one of the } X\text{'s} \le x + dx) = n\, f_X(x)\, dx$$

$$P(\text{all the other } (n - 1)\ X\text{'s} > x + dx) = P(\text{all the other } (n - 1)\ X\text{'s} > x) = \left[1 - F_X(x)\right]^{n-1} \quad (dx \text{ is tiny})$$

$$f_{X_{(1)}}(x)\, dx = n\, f_X(x)\left[1 - F_X(x)\right]^{n-1} dx, \quad f_{X_{(1)}}(x) = n\, f_X(x)\left[1 - F_X(x)\right]^{n-1}$$

$X_{(n)} = \max(X_1, X_2, \dots, X_n)$

$$f_{X_{(n)}}(x)\, dx = P\left[x \le \max(X_1, X_2, \dots, X_n) \le x + dx\right] = P\left[(x \le \text{one of the } X\text{'s} \le x + dx) \cap (\text{all the other } X\text{'s} \le x)\right]$$

$$P\left[x \le \text{one of the } X\text{'s} \le x + dx\right] = n\, f_X(x)\, dx, \quad P\left[\text{all the other } (n - 1)\ X\text{'s} \le x\right] = \left[F_X(x)\right]^{n-1}$$

$$f_{X_{(n)}}(x)\, dx = n\, f_X(x)\left[F_X(x)\right]^{n-1} dx, \quad f_{X_{(n)}}(x) = n\, f_X(x)\left[F_X(x)\right]^{n-1}$$
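The two formulas for the minimum and the maximum can be written as small helper functions. The sketch below is an illustration under an assumed common distribution (exponential with mean 1), for which the minimum of $n$ variables is known to be exponential with rate $n$; the function names are invented for the example.

```python
import math

def pdf_min(x, n, pdf, cdf):
    """pdf of X_(1): n * [1 - F(x)]^(n - 1) * f(x)."""
    return n * (1.0 - cdf(x)) ** (n - 1) * pdf(x)

def pdf_max(x, n, pdf, cdf):
    """pdf of X_(n): n * [F(x)]^(n - 1) * f(x)."""
    return n * cdf(x) ** (n - 1) * pdf(x)

# Assumed common distribution for this sketch: exponential with mean 1.
def exp_pdf(x):
    return math.exp(-x)

def exp_cdf(x):
    return 1.0 - math.exp(-x)

n = 3
x = 0.4

# The minimum of n iid exponentials (mean 1) is exponential with rate n.
print(pdf_min(x, n, exp_pdf, exp_cdf), n * math.exp(-n * x))

# For the maximum, compare with a central-difference derivative of [F(x)]^n.
h = 1e-5
numeric_max_pdf = (exp_cdf(x + h) ** n - exp_cdf(x - h) ** n) / (2.0 * h)
print(pdf_max(x, n, exp_pdf, exp_cdf), numeric_max_pdf)
```

Both helpers assume the $n$ variables are independent and identically distributed.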

Extending the same logic to $X_{(k)}$:

$$f_{X_{(k)}}(x)\, dx = P\left\{x \le \text{one of the } X\text{'s} \le x + dx;\ \text{of the remaining } (n - 1)\ X\text{'s},\ (k - 1)\ X\text{'s} < x \text{ and } (n - 1) - (k - 1) = (n - k)\ X\text{'s} > x + dx\right\}$$

See the diagram below.

Diagram: $(-\infty, x)$ holds $k - 1$ X's | $(x, x + dx)$ holds 1 X | $(x + dx, +\infty)$ holds $n - k$ X's

$$P\left[x \le \text{one of the } X\text{'s} \le x + dx\right] = n\, f_X(x)\, dx$$

$$P\left\{\text{of the remaining } (n - 1)\ X\text{'s},\ (k - 1)\ X\text{'s} < x,\ (n - k)\ X\text{'s} > x\right\} = C_{n-1}^{k-1}\left[F_X(x)\right]^{k-1}\left[1 - F_X(x)\right]^{n-k} \quad \text{(binomial distribution)}$$

$$f_{X_{(k)}}(x)\, dx = n\, f_X(x)\, C_{n-1}^{k-1}\left[F_X(x)\right]^{k-1}\left[1 - F_X(x)\right]^{n-k} dx$$

$$f_{X_{(k)}}(x) = n\, f_X(x)\, C_{n-1}^{k-1}\left[F_X(x)\right]^{k-1}\left[1 - F_X(x)\right]^{n-k}$$

Alternatively, we can derive $f_{X_{(k)}}(x)$ as follows:

$$P\left[x \le X \le x + dx\right] = f_X(x)\, dx, \quad P(X < x) = F_X(x), \quad P(X > x) = 1 - F_X(x)$$

We have: 1 X falls in $(x, x + dx)$, $k - 1$ X's fall in $(-\infty, x)$, and $n - k$ X's fall in $(x, +\infty)$. If we don't worry about permutations, the pdf should be:

$$f_X(x)\left[F_X(x)\right]^{k-1}\left[1 - F_X(x)\right]^{n-k}$$

Because we don't know which X falls in $(x, x + dx)$, which $(k - 1)$ X's fall in $(-\infty, x)$, and which $(n - k)$ X's fall in $(x, +\infty)$, the # of permutations is:

$$\frac{n!}{1!\,(k - 1)!\,\left[(n - 1) - (k - 1)\right]!} = n\,\frac{(n - 1)!}{(k - 1)!\,\left[(n - 1) - (k - 1)\right]!} = n\, C_{n-1}^{k-1}$$

$$f_{X_{(k)}}(x) = \underbrace{\frac{n!}{1!\,(k - 1)!\,\left[(n - 1) - (k - 1)\right]!}}_{\text{\# of permutations}}\ \underbrace{f_X(x)}_{1\ X \text{ in } (x,\, x + dx)}\ \underbrace{\left[F_X(x)\right]^{k-1}}_{(k - 1)\ X\text{'s in } (-\infty,\, x)}\ \underbrace{\left[1 - F_X(x)\right]^{n-k}}_{(n - k)\ X\text{'s in } (x + dx,\, +\infty)}$$

The above formula is the same as

$$f_{X_{(k)}}(x) = n\, f_X(x)\, C_{n-1}^{k-1}\left[F_X(x)\right]^{k-1}\left[1 - F_X(x)\right]^{n-k}$$

If you can't remember $f_{X_{(k)}}(x) = n\, f_X(x)\, C_{n-1}^{k-1}\left[F_X(x)\right]^{k-1}\left[1 - F_X(x)\right]^{n-k}$, just draw the following diagram:

Diagram: $(-\infty, x)$ holds $k - 1$ X's | $(x, x + dx)$ holds 1 X | $(x + dx, +\infty)$ holds $n - k$ X's

Then write

$$f_{X_{(k)}}(x) = \underbrace{\frac{n!}{(k - 1)!\,1!\,\left[(n - 1) - (k - 1)\right]!}}_{\text{\# of permutations}}\ \underbrace{\left[F_X(x)\right]^{k-1}}_{(k - 1)\ X\text{'s in } (-\infty,\, x)}\ \underbrace{f_X(x)}_{1\ X \text{ in } (x,\, x + dx)}\ \underbrace{\left[1 - F_X(x)\right]^{n-k}}_{(n - k)\ X\text{'s in } (x + dx,\, +\infty)}$$

Order statistics have applications in the real world. For the purpose of passing Exam P, however, I recommend that you don't worry about how order statistics are used.

Finally, please note that $X_{(1)}$ and $X_{(n)}$ can be different from the minimum and the maximum in transformation. In Chapter 30, we calculated the pdf for the minimum and the maximum of two independent random variables that are not identically distributed. In contrast, $X_{(1)}$ and $X_{(n)}$ refer to the minimum and maximum of several independent identically distributed random variables. If several independent random variables are NOT identically distributed (such as in Chapter 30), you can't use $f_{X_{(1)}}(x) = n\left[1 - F_X(x)\right]^{n-1} f_X(x)$ or $f_{X_{(n)}}(x) = n\left[F_X(x)\right]^{n-1} f_X(x)$ to find the pdf for the minimum or maximum.

Problem 1

A system has 5 duplicate components. The system works as long as at least one component works. The system fails if all components fail. The life time of each component follows a gamma distribution with parameters $n = 2$ and $\lambda$. Find the probability distribution of the life time of the system.

Solution

Let $X_1, X_2, X_3, X_4, X_5$ represent the life time of each of the 5 components. Let $Y$ represent the life time of the whole system.

$$Y = \max(X_1, X_2, X_3, X_4, X_5)$$

$X_1, X_2, X_3, X_4, X_5$ are independent and identically distributed with the following common pdf:

$$f_X(x) = \lambda^2 x\, e^{-\lambda x} \quad \text{(gamma pdf with parameters } n = 2 \text{ and } \lambda)$$

The common cdf is:

$$F_X(x) = \int_0^x \lambda^2 t\, e^{-\lambda t}\, dt = 1 - e^{-\lambda x} - \lambda x\, e^{-\lambda x}$$

$$f_Y(y) = 5\, f_X(y)\left[F_X(y)\right]^4 = 5\,\lambda^2 y\, e^{-\lambda y}\left(1 - e^{-\lambda y} - \lambda y\, e^{-\lambda y}\right)^4$$

Problem 2

A system has 5 components connected in series. The system works if all of the components work. The system fails if at least one component fails. The life time of each component follows a gamma distribution with parameters $n = 2$ and $\lambda$. Find the probability distribution of the life time of the system.

Solution

Let $X_1, X_2, X_3, X_4, X_5$ represent the life time of each of the 5 components.

$$Y = \min(X_1, X_2, X_3, X_4, X_5)$$

This time, we are looking for the distribution of the 1st order (minimum) statistic.

Applying the formula, we have:

$$f_Y(y) = n\left[1 - F_X(y)\right]^{n-1} f_X(y) = 5\left[1 - \left(1 - e^{-\lambda y} - \lambda y\, e^{-\lambda y}\right)\right]^{5-1}\lambda^2 y\, e^{-\lambda y} = 5\left(e^{-\lambda y} + \lambda y\, e^{-\lambda y}\right)^4 \lambda^2 y\, e^{-\lambda y}$$

Problem 3

$X_1, X_2, X_3$ are three independent identically distributed continuous random variables with the following common pdf:

$$f(x) = \frac{8}{x^3}, \quad \text{where } x \ge 2$$

Calculate $E\left[X_{(1)}\right]$, $E\left[X_{(2)}\right]$, and $E\left[X_{(3)}\right]$.

Solution

$$F_X(x) = \int_2^x f_X(t)\, dt = \int_2^x \frac{8}{t^3}\, dt = 1 - \frac{4}{x^2}$$

$$f_{X_{(1)}}(x) = 3\, f_X(x)\left[1 - F_X(x)\right]^2 = 3\left(\frac{8}{x^3}\right)\left[1 - \left(1 - \frac{4}{x^2}\right)\right]^2 = \frac{384}{x^7}, \quad \text{where } x \ge 2$$

$$E\left[X_{(1)}\right] = \int_2^{+\infty} x\, f_{X_{(1)}}(x)\, dx = \int_2^{+\infty} x\,\frac{384}{x^7}\, dx = \frac{12}{5}$$

$$f_{X_{(3)}}(x) = 3\, f_X(x)\left[F_X(x)\right]^2 = 3\left(\frac{8}{x^3}\right)\left(1 - \frac{4}{x^2}\right)^2$$

$$E\left[X_{(3)}\right] = \int_2^{+\infty} x\, f_{X_{(3)}}(x)\, dx = \int_2^{+\infty} x \cdot 3\left(\frac{8}{x^3}\right)\left(1 - \frac{4}{x^2}\right)^2 dx = \frac{32}{5}$$

$$f_{X_{(2)}}(x) = 3\, f_X(x)\, C_{3-1}^{2-1}\, F_X(x)\left[1 - F_X(x)\right] = 3\left(\frac{8}{x^3}\right) C_2^1\left(1 - \frac{4}{x^2}\right)\frac{4}{x^2} = 6\left(\frac{8}{x^3}\right)\left(1 - \frac{4}{x^2}\right)\frac{4}{x^2}$$

$$E\left[X_{(2)}\right] = \int_2^{+\infty} x\, f_{X_{(2)}}(x)\, dx = \int_2^{+\infty} x\,(6)\,\frac{8}{x^3}\left(1 - \frac{4}{x^2}\right)\frac{4}{x^2}\, dx = \frac{16}{5}$$

Problem 4 (CAS Exam 3 #25, Spring 2005, modified)

Samples are selected from a uniform distribution on [0, 10]. Determine the expected value of the 4th order statistic for a sample of size five.

Solution

$$f_{X_{(k)}}(x) = n\, f_X(x)\, C_{n-1}^{k-1}\left[F_X(x)\right]^{k-1}\left[1 - F_X(x)\right]^{n-k}$$

We have $n = 5$ and $k = 4$, with $f_X(x) = \dfrac{1}{10}$ and $F_X(x) = \dfrac{x}{10}$.

$$f_{X_{(4)}}(x) = 5\, f_X(x)\, C_{5-1}^{4-1}\left[F_X(x)\right]^{4-1}\left[1 - F_X(x)\right]^{5-4} = 5\, f_X(x)\, C_4^3\left[F_X(x)\right]^3\left[1 - F_X(x)\right]$$

$$f_{X_{(4)}}(x) = 5\left(\frac{1}{10}\right) C_4^3\left(\frac{x}{10}\right)^3\left(1 - \frac{x}{10}\right) = 2\left(\frac{x}{10}\right)^3\left(1 - \frac{x}{10}\right)$$

$$E\left[X_{(4)}\right] = \int_0^{10} x\, f_{X_{(4)}}(x)\, dx = \int_0^{10} x\,(2)\left(\frac{x}{10}\right)^3\left(1 - \frac{x}{10}\right) dx = \frac{20}{3}$$

Homework for you: redo all the problems in this chapter.
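The answers to Problems 3 and 4 can be reproduced numerically from the general formula for $f_{X_{(k)}}(x)$. The sketch below is an illustration, not the book's method: it uses a midpoint rule, and for Problem 3 it assumes the substitution $x = 2/u$ to turn the infinite range into a finite one. All function names are invented for the example.

```python
import math

def order_stat_pdf(x, k, n, pdf, cdf):
    """n * C(n-1, k-1) * f(x) * F(x)^(k-1) * [1 - F(x)]^(n-k)."""
    big_f = cdf(x)
    return (n * math.comb(n - 1, k - 1) * pdf(x)
            * big_f ** (k - 1) * (1.0 - big_f) ** (n - k))

def integrate(func, a, b, steps=100_000):
    """Midpoint rule on [a, b] with equal-width slices."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

# Problem 3: f(x) = 8 / x^3 and F(x) = 1 - 4 / x^2 for x >= 2, with n = 3.
def pdf3(x):
    return 8.0 / x ** 3

def cdf3(x):
    return 1.0 - 4.0 / x ** 2

def mean_problem3(k):
    """E[X_(k)]; substituting x = 2 / u maps [2, +inf) onto (0, 1],
    with dx = (2 / u^2) du."""
    def integrand(u):
        x = 2.0 / u
        return x * order_stat_pdf(x, k, 3, pdf3, cdf3) * 2.0 / u ** 2
    return integrate(integrand, 0.0, 1.0)

# Problem 4: uniform on [0, 10], n = 5, k = 4.
def pdf4(x):
    return 0.1

def cdf4(x):
    return x / 10.0

mean_problem4 = integrate(
    lambda x: x * order_stat_pdf(x, 4, 5, pdf4, cdf4), 0.0, 10.0)

print([mean_problem3(k) for k in (1, 2, 3)])  # near 12/5, 16/5, 32/5
print(mean_problem4)                          # near 20/3
```

A useful side check: the three means in Problem 3 add up to 12, which is $3\,E(X)$ because $E(X) = \int_2^{+\infty} x \cdot 8/x^3\, dx = 4$.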

The following is an advanced topic of order statistics: the joint pdf for order statistics. I'm not sure whether SOA will test it. If you don't want to learn about this, just skip it.

We'll answer the following two questions:

What's the joint pdf for $X_{(r)}$ and $X_{(s)}$?
What's the joint pdf for $X_{(1)}, X_{(2)}, X_{(3)}, \dots, X_{(n)}$?

First, we'll find the joint pdf for $X_{(r)}$ and $X_{(s)}$. We'll consider $r < s$. Our goal is to find $f_{X_{(r)}, X_{(s)}}(x, y)$.

Please note that if $r < s$, then $X_{(r)} \le X_{(s)}$. So we have $f_{X_{(r)}, X_{(s)}}(x, y) = 0$ if $x > y$.

Let's consider $P\left[x < X_{(r)} < x + dx,\ y < X_{(s)} < y + dy\right]$ where $y > x$. We'll draw three diagrams.

Diagram 1: $x < X_{(r)} < x + dx$

$(-\infty, x)$ holds $r - 1$ X's | $(x, x + dx)$ holds 1 X | $(x + dx, +\infty)$ holds $n - r$ X's

To have $x < X_{(r)} < x + dx$, we need to have one X in $(x, x + dx)$, $(r - 1)$ X's in $(-\infty, x)$, and $(n - r)$ X's in $(x, +\infty)$.

Diagram 2: $y < X_{(s)} < y + dy$

$(-\infty, y)$ holds $s - 1$ X's | $(y, y + dy)$ holds 1 X | $(y + dy, +\infty)$ holds $n - s$ X's

To have $y < X_{(s)} < y + dy$, we need to have one X in $(y, y + dy)$, $(s - 1)$ X's in $(-\infty, y)$, and $(n - s)$ X's in $(y, +\infty)$.

Diagram 3: $x < X_{(r)} < x + dx$ and $y < X_{(s)} < y + dy$ (where $y \ge x$). We combine the above two diagrams into one.

To have $x < X_{(r)} < x + dx$ and $y < X_{(s)} < y + dy$ (where $y \ge x$), we need to have $(r - 1)$ X's in $(-\infty, x)$, one X in $(x, x + dx)$, $(s - r - 1)$ X's in $(x + dx, y)$, one X in $(y, y + dy)$, and $(n - s)$ X's in $(y + dy, +\infty)$.

Diagram 3 can be simplified as follows:

$(-\infty, x)$ holds $r - 1$ X's | $(x, x + dx)$ holds 1 X | $(x + dx, y)$ holds $s - r - 1$ X's | $(y, y + dy)$ holds 1 X | $(y + dy, +\infty)$ holds $n - s$ X's

$$f_{X_{(r)}, X_{(s)}}(x, y)\, dx\, dy = P\left[x < X_{(r)} < x + dx,\ y < X_{(s)} < y + dy\right]$$

$$= A\ \underbrace{\left[F_X(x)\right]^{r-1}}_{(r - 1)\ X\text{'s in } (-\infty,\, x)}\ \underbrace{f_X(x)\, dx}_{1\ X \text{ in } (x,\, x + dx)}\ \underbrace{\left[F_X(y) - F_X(x)\right]^{s-r-1}}_{(s - r - 1)\ X\text{'s in } (x + dx,\, y)}\ \underbrace{f_X(y)\, dy}_{1\ X \text{ in } (y,\, y + dy)}\ \underbrace{\left[1 - F_X(y)\right]^{n-s}}_{(n - s)\ X\text{'s in } (y + dy,\, +\infty)}$$

where $A$ is the # of permutations:

$$A = \frac{n!}{(r - 1)!\,1!\,(s - r - 1)!\,1!\,(n - s)!} = \frac{n!}{(r - 1)!\,(s - r - 1)!\,(n - s)!}$$

Formula:

$$f_{X_{(r)}, X_{(s)}}(x, y) = \frac{n!}{(r - 1)!\,(s - r - 1)!\,(n - s)!}\left[F_X(x)\right]^{r-1}\left[F_X(y) - F_X(x)\right]^{s-r-1}\left[1 - F_X(y)\right]^{n-s} f_X(x)\, f_X(y)$$

where $s > r$ and $y \ge x$.

Problem 5

$X_1, X_2, X_3, \dots, X_n$ are independent random variables uniformly distributed over $[0, 1]$. Find the joint pdf for $X_{(1)}$ and $X_{(n)}$.

Solution

Here $r = 1$, $s = n$. Using the memorized formula, we have:

$$f_{X_{(1)}, X_{(n)}}(x, y) = \frac{n!}{(r - 1)!\,(s - r - 1)!\,(n - s)!}\left[F_X(x)\right]^{r-1}\left[F_X(y) - F_X(x)\right]^{s-r-1}\left[1 - F_X(y)\right]^{n-s} f_X(x)\, f_X(y)$$

$$= \frac{n!}{(1 - 1)!\,(n - 1 - 1)!\,(n - n)!}\left[F_X(x)\right]^{1-1}\left[F_X(y) - F_X(x)\right]^{n-1-1}\left[1 - F_X(y)\right]^{n-n} f_X(x)\, f_X(y)$$

$$= \frac{n!}{(n - 2)!}\left[F_X(y) - F_X(x)\right]^{n-2} f_X(x)\, f_X(y) = n(n - 1)\left[F_X(y) - F_X(x)\right]^{n-2} f_X(x)\, f_X(y)$$

$X$ is uniform over $[0, 1]$. So $f_X(x) = f_X(y) = 1$, $F_X(x) = \int_0^x f(t)\, dt = \int_0^x dt = x$, and $F_X(y) = y$.

$$f_{X_{(1)}, X_{(n)}}(x, y) = n(n - 1)(y - x)^{n-2}, \quad \text{where } 0 \le x \le y \le 1$$
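A quick numerical sanity check of the Problem 5 result: the joint pdf should integrate to 1 over the triangle $0 \le x \le y \le 1$, and the expected range $E\left[X_{(n)} - X_{(1)}\right]$ should equal $E\left[X_{(n)}\right] - E\left[X_{(1)}\right] = \frac{n}{n+1} - \frac{1}{n+1}$. The sketch below assumes $n = 4$ and a coarse midpoint grid; the names and grid size are choices made for the illustration.

```python
def joint_min_max_pdf(x, y, n):
    """Joint pdf of (X_(1), X_(n)) for n iid uniform(0, 1) variables."""
    if 0.0 <= x <= y <= 1.0:
        return n * (n - 1) * (y - x) ** (n - 2)
    return 0.0

def double_midpoint(func, grid=300):
    """Midpoint rule over the unit square using grid x grid cells."""
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) * h
        for j in range(grid):
            y = (j + 0.5) * h
            total += func(x, y)
    return total * h * h

n = 4
total_mass = double_midpoint(lambda x, y: joint_min_max_pdf(x, y, n))
mean_range = double_midpoint(
    lambda x, y: (y - x) * joint_min_max_pdf(x, y, n))

print(total_mass)                       # close to 1
print(mean_range, (n - 1) / (n + 1))    # both close to 0.6
```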

Problem 6

$X_1, X_2, X_3$ are three independent identically distributed continuous random variables with the following common pdf:

$$f(x) = \frac{8}{x^3}, \quad \text{where } x \ge 2$$

Calculate the joint pdf for $X_{(1)}$ and $X_{(2)}$.

Solution

$$F_X(x) = \int_2^x f_X(t)\, dt = \int_2^x \frac{8}{t^3}\, dt = 1 - \frac{4}{x^2}$$

$$f_{X_{(r)}, X_{(s)}}(x, y) = \frac{n!}{(r - 1)!\,(s - r - 1)!\,(n - s)!}\left[F_X(x)\right]^{r-1}\left[F_X(y) - F_X(x)\right]^{s-r-1}\left[1 - F_X(y)\right]^{n-s} f_X(x)\, f_X(y)$$

For the joint pdf of $X_{(1)}$ and $X_{(2)}$, we have $r = 1$, $s = 2$, $n = 3$.

$$f_{X_{(1)}, X_{(2)}}(x, y) = \frac{3!}{(1 - 1)!\,(2 - 1 - 1)!\,(3 - 2)!}\left[F_X(x)\right]^{1-1}\left[F_X(y) - F_X(x)\right]^{2-1-1}\left[1 - F_X(y)\right]^{3-2} f_X(x)\, f_X(y)$$

$$= 6\left[1 - F_X(y)\right] f_X(x)\, f_X(y) = 6\left(\frac{4}{y^2}\right)\frac{8}{x^3}\,\frac{8}{y^3} = \frac{1{,}536}{x^3 y^5}, \quad \text{where } 2 \le x \le y$$

We can find $f_{X_{(1)}}(x)$ from $f_{X_{(1)}, X_{(2)}}(x, y)$:

$$f_{X_{(1)}}(x) = \int_x^{+\infty} f_{X_{(1)}, X_{(2)}}(x, y)\, dy = \int_x^{+\infty} \frac{1{,}536}{x^3 y^5}\, dy = \frac{1{,}536}{x^3}\int_x^{+\infty} y^{-5}\, dy = \frac{1{,}536}{x^3}\left[\frac{y^{-4}}{-4}\right]_x^{+\infty} = \frac{384}{x^7}$$

This result is OK; Problem 3 also gives us

$$f_{X_{(1)}}(x) = 3\, f_X(x)\left[1 - F_X(x)\right]^2 = 3\left(\frac{8}{x^3}\right)\left[1 - \left(1 - \frac{4}{x^2}\right)\right]^2 = \frac{384}{x^7}, \quad \text{where } x \ge 2$$

We can find $f_{X_{(2)}}(y)$ from $f_{X_{(1)}, X_{(2)}}(x, y)$:

$$f_{X_{(2)}}(y) = \int_2^y f_{X_{(1)}, X_{(2)}}(x, y)\, dx = \int_2^y \frac{1{,}536}{x^3 y^5}\, dx = \frac{1{,}536}{y^5}\int_2^y x^{-3}\, dx = \frac{1{,}536}{y^5}\left[\frac{x^{-2}}{-2}\right]_2^y = \frac{768}{y^5}\left(\frac{1}{4} - \frac{1}{y^2}\right) = \frac{192}{y^5}\left(1 - \frac{4}{y^2}\right)$$

The result is correct; from Problem 3, we also have:

$$f_{X_{(2)}}(x) = 6\left(\frac{8}{x^3}\right)\left(1 - \frac{4}{x^2}\right)\frac{4}{x^2} = \frac{192}{x^5}\left(1 - \frac{4}{x^2}\right)$$

The final topic on order statistics. The joint pdf for $X_{(1)}, X_{(2)}, X_{(3)}, \dots, X_{(n)}$ is

$$f_{X_{(1)}, X_{(2)}, \dots, X_{(n)}}(x_1, x_2, \dots, x_n) = \begin{cases} n!\, f_X(x_1)\, f_X(x_2) \cdots f_X(x_n) & \text{if } x_1 \le x_2 \le x_3 \le \dots \le x_n \\ 0 & \text{otherwise} \end{cases}$$

To illustrate the proof, I'll set $n = 3$ and prove that

$$f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z) = \begin{cases} 3!\, f_X(x)\, f_X(y)\, f_X(z) & \text{if } x \le y \le z \\ 0 & \text{otherwise} \end{cases}$$

The proof for $n > 3$ is similar.

$$f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z)\, dx\, dy\, dz = P\left[x < X_{(1)} < x + dx,\ y < X_{(2)} < y + dy,\ z < X_{(3)} < z + dz\right]$$

Once again, we can't write

$$f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z) = P\left[x < X_{(1)} < x + dx,\ y < X_{(2)} < y + dy,\ z < X_{(3)} < z + dz\right]$$

The above expression is wrong because $f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z)$ is not a real probability. However, $f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z)\, dx\, dy\, dz$ is a real probability.

Let's continue. The only way to have $x < X_{(1)} < x + dx$, $y < X_{(2)} < y + dy$, and $z < X_{(3)} < z + dz$, where $x \le y \le z$, is to have one X in $(x, x + dx)$, another X in $(y, y + dy)$, and the last X in $(z, z + dz)$.

Diagram: $(x, x + dx)$ holds 1 X | $(y, y + dy)$ holds 1 X | $(z, z + dz)$ holds 1 X

$$f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z)\, dx\, dy\, dz = P\left[x < X_{(1)} < x + dx,\ y < X_{(2)} < y + dy,\ z < X_{(3)} < z + dz\right]$$

$$= \underbrace{\frac{3!}{1!\,1!\,1!}}_{\text{\# of permutations}}\ \underbrace{f_X(x)\, dx}_{1\ X \text{ in } (x,\, x + dx)}\ \underbrace{f_X(y)\, dy}_{1\ X \text{ in } (y,\, y + dy)}\ \underbrace{f_X(z)\, dz}_{1\ X \text{ in } (z,\, z + dz)} = 3!\, f_X(x)\, f_X(y)\, f_X(z)\, dx\, dy\, dz, \quad \text{where } x \le y \le z$$

$$f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z) = 3!\, f_X(x)\, f_X(y)\, f_X(z)$$

Because $X_{(1)} \le X_{(2)} \le X_{(3)}$, $f_{X_{(1)}, X_{(2)}, X_{(3)}}(x, y, z) = 0$ if $x \le y \le z$ is not satisfied.

Generally,

$$f_{X_{(1)}, X_{(2)}, \dots, X_{(n)}}(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n = P\left[x_1 < X_{(1)} < x_1 + dx_1,\ x_2 < X_{(2)} < x_2 + dx_2,\ \dots,\ x_n < X_{(n)} < x_n + dx_n\right]$$

The only way to have $x_1 < X_{(1)} < x_1 + dx_1$, $x_2 < X_{(2)} < x_2 + dx_2$, ..., $x_n < X_{(n)} < x_n + dx_n$, where $x_1 \le x_2 \le x_3 \le \dots \le x_n$, is to have one X in $(x_1, x_1 + dx_1)$, one X in $(x_2, x_2 + dx_2)$, ..., and one X in $(x_n, x_n + dx_n)$. The # of permutations is $n!$. Consequently, for $x_1 \le x_2 \le x_3 \le \dots \le x_n$:

$$f_{X_{(1)}, X_{(2)}, \dots, X_{(n)}}(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n = n!\, f_X(x_1)\, dx_1\, f_X(x_2)\, dx_2 \cdots f_X(x_n)\, dx_n$$

$$f_{X_{(1)}, X_{(2)}, \dots, X_{(n)}}(x_1, x_2, \dots, x_n) = n!\, f_X(x_1)\, f_X(x_2) \cdots f_X(x_n)$$

Problem 7

$X_1, X_2, X_3$ are three independent identically distributed continuous random variables with the following common pdf:

$$f(x) = \frac{8}{x^3}, \quad \text{where } x \ge 2$$

Calculate the joint pdf for $X_{(1)}$, $X_{(2)}$, and $X_{(3)}$.

Solution

If $2 \le x_1 \le x_2 \le x_3$, then

$$f_{X_{(1)}, X_{(2)}, X_{(3)}}(x_1, x_2, x_3) = 3!\, f_X(x_1)\, f_X(x_2)\, f_X(x_3) = 6\left(\frac{8}{x_1^3}\right)\left(\frac{8}{x_2^3}\right)\left(\frac{8}{x_3^3}\right)$$

Otherwise, $f_{X_{(1)}, X_{(2)}, X_{(3)}}(x_1, x_2, x_3) = 0$.

Problem 8

$X_1$ and $X_2$ are independent identically distributed exponential random variables with mean of 1. Find the pdf for $X_{(2)} - X_{(1)}$, the difference between the 2nd order statistic and the 1st order statistic.

Solution

$X_1$ and $X_2$ are independent identically distributed with the common pdf $f_X(x) = e^{-x}$. The joint pdf for $X_{(1)}$ and $X_{(2)}$ is:

$$f_{X_{(1)}, X_{(2)}}(x, y) = 2!\, f_X(x)\, f_X(y) = 2 e^{-x} e^{-y} = 2 e^{-(x + y)}, \quad \text{where } 0 \le x \le y$$

We'll use the Jacobian method to find the pdf. Let $Y_1 = X_{(1)}$ and $Y_2 = X_{(2)} - X_{(1)}$. $Y_1$ is just a fake random variable; we'll get rid of it in the end.

Recover the old random variables: $X_{(1)} = Y_1$, $X_{(2)} = Y_1 + Y_2$.

$$f_{X_{(1)}, X_{(2)}}(x, y) = 2 e^{-(x + y)} = 2 e^{-(2 y_1 + y_2)}$$

$$\frac{\partial x_{(1)}}{\partial y_1} = 1, \quad \frac{\partial x_{(1)}}{\partial y_2} = 0, \quad \frac{\partial x_{(2)}}{\partial y_1} = 1, \quad \frac{\partial x_{(2)}}{\partial y_2} = 1$$

$$J = \det\begin{pmatrix} \dfrac{\partial x_{(1)}}{\partial y_1} & \dfrac{\partial x_{(1)}}{\partial y_2} \\[2ex] \dfrac{\partial x_{(2)}}{\partial y_1} & \dfrac{\partial x_{(2)}}{\partial y_2} \end{pmatrix} = \det\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = 1, \quad |J| = 1$$

$$f_{Y_1, Y_2}(y_1, y_2) = |J|\, f_{X_{(1)}, X_{(2)}}(x, y) = 2 e^{-(2 y_1 + y_2)}, \quad \text{where } y_1 \ge 0 \text{ and } y_2 \ge 0$$

Next, we get rid of $Y_1$:

$$f_{Y_2}(y_2) = \int_0^{+\infty} f_{Y_1, Y_2}(y_1, y_2)\, dy_1 = \int_0^{+\infty} 2 e^{-(2 y_1 + y_2)}\, dy_1 = 2 e^{-y_2}\int_0^{+\infty} e^{-2 y_1}\, dy_1 = e^{-y_2}$$

Incidentally, we notice that $f_{Y_1, Y_2}(y_1, y_2) = 2 e^{-(2 y_1 + y_2)} = 2 e^{-2 y_1} e^{-y_2} = g(y_1)\, h(y_2)$, where $g(y_1) = 2 e^{-2 y_1}$ and $h(y_2) = e^{-y_2}$. So we know that $Y_1$ and $Y_2$ are independent.

Homework: Redo all the problems in this chapter.
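Problem 8 says the gap $X_{(2)} - X_{(1)}$ between two independent exponentials with mean 1 is itself exponential with mean 1. A simulation is one way to make that plausible. The sketch below is illustrative only: the seed, the number of trials, and the function name are arbitrary choices, and the printed values are estimates that only approximate the exact answers.

```python
import math
import random

def simulate_gaps(trials=200_000, seed=12345):
    """Draw pairs of iid exponentials (mean 1) and return X_(2) - X_(1)."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        a = rng.expovariate(1.0)
        b = rng.expovariate(1.0)
        gaps.append(abs(a - b))  # larger value minus smaller value
    return gaps

gaps = simulate_gaps()
sample_mean = sum(gaps) / len(gaps)
tail_fraction = sum(1 for g in gaps if g > 1.0) / len(gaps)

print(sample_mean)                    # should be near 1, the exponential mean
print(tail_fraction, math.exp(-1.0))  # P(gap > 1) should be near e^(-1)
```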

Chapter 32 Double expectation

Formula

$$E(X) = E_Y\left[E_{X|Y}(X \mid Y)\right]$$

Let's use a simple example to understand the meaning of the above formula. Let $X$ = a group of college students' SAT scores; let $Y$ = gender of a college student (either male or female). Then the above formula becomes:

$$E(\text{SAT scores}) = E_{\text{GENDER}}\left[E_{\text{SAT Scores}|\text{GENDER}}(\text{SAT Scores} \mid \text{GENDER})\right]$$

The formula says that to find the average of a group of college students' SAT scores, we first find

$$E_{\text{SAT Scores}|\text{GENDER}}(\text{SAT Scores} \mid \text{GENDER})$$

The above symbol means the average SAT score conditioned on gender. In other words, we are dividing the college students into two groups by gender (male college students and female college students) and finding the average SAT score of male students and of female students.

Next, we find

$$E_{\text{GENDER}}\left[E_{\text{SAT Scores}|\text{GENDER}}(\text{SAT Scores} \mid \text{GENDER})\right]$$

The above symbol means that we are calculating the weighted average of the male students' average SAT score and the female students' average SAT score. Intuitively, the weighted average of these two averages should be the overall average SAT score of males and females combined, as expressed in the formula below:

$$E(\text{SAT scores}) = E_{\text{GENDER}}\left[E_{\text{SAT Scores}|\text{GENDER}}(\text{SAT Scores} \mid \text{GENDER})\right]$$

$$= \Pr(\text{male})\, E(\text{SAT Scores} \mid \text{GENDER} = \text{male}) + \Pr(\text{female})\, E(\text{SAT Scores} \mid \text{GENDER} = \text{female})$$

Of course, we can divide the college students by other categories. For example, we can divide them by major (English, Philosophy, Math, ...). We then calculate the average SAT score by major. The weighted average of the per-major average SAT scores should be the overall average SAT score for the college students as a whole.

$$E(\text{SAT scores}) = E_{\text{MAJOR}}\left[E_{\text{SAT Scores}|\text{MAJOR}}(\text{SAT Scores} \mid \text{MAJOR})\right]$$

$$= \Pr(\text{English Major})\, E(\text{SAT Scores} \mid \text{Major} = \text{English}) + \Pr(\text{History Major})\, E(\text{SAT Scores} \mid \text{Major} = \text{History}) + \dots$$

Problem 1

A group of 20 graduate students (12 male and 8 female) have a total GRE score of 12,940. The GRE score distribution by gender is as follows:

Total GRE scores of 12 males: 7,740
Total GRE scores of 8 females: 5,200
Total GRE score: 12,940

Find the average GRE score twice. The first time, do not use the double expectation theorem. The second time, use the double expectation theorem. Show that you get the same result.

Solution

(1) Find the mean without using the double expectation theorem.

$$\text{Average GRE score for 20 graduate students} = \frac{\text{Total GRE scores}}{\text{\# of students}} = \frac{12{,}940}{20} = 647$$

(2) Find the mean using the double expectation theorem.

$$E(\text{GRE scores}) = E_{\text{GENDER}}\left[E_{\text{GRE Scores}|\text{GENDER}}(\text{GRE Scores} \mid \text{GENDER})\right]$$

$$= \Pr(\text{male})\, E(\text{GRE Scores} \mid \text{GENDER} = \text{male}) + \Pr(\text{female})\, E(\text{GRE Scores} \mid \text{GENDER} = \text{female})$$

$\Pr(\text{male}) = 12/20 = 0.6$, $\Pr(\text{female}) = 8/20 = 0.4$

$E(\text{GRE Scores} \mid \text{GENDER} = \text{male}) = 7{,}740/12 = 645$

http://www.guo.coursehost.com E (GRE Scores GENDER=female) = 5,200/8=650 E (GRE Scores) =0.6(645)+0.4(650)=647 You can see the two methods produce an identical result. Problem 2 (This problem is one of the more difficult problems. If you can calcul ate E (N ) , most likely that is good enough for Exam P.) The number of claims, N , incurred by a policyholder has the following distribution: n Pr(N = n ) = C 3 p n (1 p )3 n n = 0,1,2,3 p is uniformly distributed over [0, 1]. Find E (N ),Var (N ) . Solution If p is constant, N has the binomial distribution with mean and varianc e: E ( N ) = 3 p, Var ( N ) = 3 p (1 p ) -- if p is constant However, p is not constant. So we cannot directly use the above formula. What sh ould we do? In situations like this, the double expectation theorem comes in han dy. To find E (N ) , we divide N into different groups by p --- just as we divid e the college students into male and female students, except this time we have a n infinite number of groups ( P is a continuous random variable). Each value of P = p is a separate group. For each group, we will calculate its mean. Then we w ill find the weighted average mean for all the groups, with weight being the pro bability of each groups p value. The result should be E (N ) . E ( N ) = E P EN P ( N P = p ) EN P ( N P = p ) = 3 p (This the mean for a given group with P = p ) Next, we need to find the weighted average of each groups mean: Yufeng Guo, Deeper Understanding: Exam P Page 298 of 425

$$E(N) = \int_0^1 \underbrace{1}_{\text{weight (density) of each group}} \cdot \underbrace{3p}_{\text{each group's mean for } N}\, dp = \frac{3}{2}p^2 \Big|_0^1 = \frac{3}{2}$$

The integration is needed because we have an infinite number of groups.

Alternatively,

$$E(N) = E_P\big[\,E_{N|P}(N \mid P = p)\,\big] = E_P(3p) = 3E_P(p) = 3 \cdot \frac{1}{2} = \frac{3}{2}$$

In the above, $E_P(p) = \frac{1}{2}$ because $P$ is uniform over $[0, 1]$.

Next, we find $Var(N)$ using the formula $Var(N) = E(N^2) - E^2(N)$. We can calculate $E(N^2)$ the same way we calculated $E(N)$:

$$E(N^2) = E_P\big[\,E_{N|P}(N^2 \mid P = p)\,\big]$$

For a given group $P = p$, $N$ is binomial with a mean of $3p$ and a variance of $3p(1-p)$:

$$E_{N|P}(N^2 \mid P = p) = E^2_{N|P}(N \mid P = p) + Var_{N|P}(N \mid P = p) = (3p)^2 + 3p(1-p) = 6p^2 + 3p$$

Next, we calculate the weighted average of $E_{N|P}(N^2 \mid P = p)$ over all the groups, with the weight being the probability of each group:

$$E(N^2) = \int_0^1 \underbrace{1}_{\text{weight (density) of each group}} \cdot \underbrace{(6p^2 + 3p)}_{\text{each group's mean for } N^2}\, dp = \left[2p^3 + \frac{3}{2}p^2\right]_0^1 = 2 + \frac{3}{2} = \frac{7}{2}$$

$$Var(N) = E(N^2) - E^2(N) = \frac{7}{2} - \left(\frac{3}{2}\right)^2 = \frac{5}{4}$$

Alternatively, you can use the following formula (if you have memorized it):

$$Var(X) = E_Y\big[\,Var_{X|Y}(X \mid Y = y)\,\big] + Var_Y\big[\,E_{X|Y}(X \mid Y = y)\,\big]$$

To help memorize this formula, we can rewrite it as $Var(X) = EV + VE$, where

$$EV = E_Y\big[\,Var(X \mid Y)\,\big], \quad VE = Var_Y\big[\,E(X \mid Y)\,\big]$$

If you have extra brainpower, you can learn the above formula. However, if you need to work on more basic concepts and problems, forget about this formula; you are better off learning something else.

Applying the variance formula, we have:

$$Var(N) = E_P\big[\,Var_{N|P}(N \mid P = p)\,\big] + Var_P\big[\,E_{N|P}(N \mid P = p)\,\big]$$

Because $N \mid P = p$ is binomial with parameters 3 and $p$, we have

$$E_{N|P}(N \mid P = p) = 3p, \quad Var_{N|P}(N \mid P = p) = 3p(1-p)$$

$$E_P\big[\,Var_{N|P}(N \mid P = p)\,\big] = E_P[3p(1-p)] = 3E_P(p - p^2) = 3\big[E_P(p) - E_P(p^2)\big]$$

$$Var_P\big[\,E_{N|P}(N \mid P = p)\,\big] = Var_P[3p] = 9\,Var_P[p]$$

Because $P$ is uniform over $[0, 1]$,

$$E_P(p) = \frac{1}{2}, \quad Var_P[p] = \frac{1}{12}, \quad E_P(p^2) = Var_P[p] + E_P^2(p) = \frac{1}{12} + \left(\frac{1}{2}\right)^2 = \frac{1}{3}$$

You should remember that if $X$ is uniform over $[a, b]$, then

$$E(X) = \frac{a+b}{2}, \quad \sigma_X = \frac{b-a}{2\sqrt{3}}$$

$$3\big[E_P(p) - E_P(p^2)\big] = 3\left(\frac{1}{2} - \frac{1}{3}\right) = \frac{1}{2}, \quad 9\,Var_P[p] = 9 \cdot \frac{1}{12} = \frac{3}{4}$$

$$Var(N) = E_P\big[\,Var_{N|P}(N \mid P = p)\,\big] + Var_P\big[\,E_{N|P}(N \mid P = p)\,\big] = \frac{1}{2} + \frac{3}{4} = \frac{5}{4}$$

Problem 3

$X$ is a Poisson random variable with mean $\lambda$. $\lambda$ is uniformly distributed over $[0, 6]$. Calculate $E(X)$ and $Var(X)$.

Solution

Given $\lambda$, $X$ is Poisson with parameter $\lambda$. So $E(X \mid \lambda) = Var(X \mid \lambda) = \lambda$.

$$E(X) = E_\lambda\big[\,E(X \mid \lambda)\,\big] = E_\lambda[\lambda] = \frac{6}{2} = 3$$

$$Var(X) = E_\lambda\big[\,Var(X \mid \lambda)\,\big] + Var_\lambda\big[\,E(X \mid \lambda)\,\big] = E_\lambda[\lambda] + Var_\lambda[\lambda] = 3 + \frac{(6-0)^2}{12} = 6$$

Homework for you: #20, May 2000; #10, May 2001.
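The double expectation results for Problem 2 can be cross-checked exactly. The sketch below is only an illustration (the helper names `beta_integral` and `mixed_binomial_pmf` are made up for this example): it averages the binomial pmf over $p$ using the standard identity $\int_0^1 p^a (1-p)^b\,dp = \frac{a!\,b!}{(a+b+1)!}$, then computes the mean and variance of $N$ directly from the resulting unconditional pmf.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(a, b):
    """Exact integral of p^a (1-p)^b over [0, 1] for non-negative integers a, b."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def mixed_binomial_pmf(n_trials=3):
    """Pr(N = n) after averaging the binomial pmf over p ~ Uniform[0, 1]."""
    return {n: comb(n_trials, n) * beta_integral(n, n_trials - n)
            for n in range(n_trials + 1)}


pmf = mixed_binomial_pmf()
mean = sum(n * pr for n, pr in pmf.items())
second_moment = sum(n * n * pr for n, pr in pmf.items())
variance = second_moment - mean ** 2

print(pmf)             # unconditional pmf of N
print(mean, variance)  # should agree with 3/2 and 5/4 from the solution
```

Under these assumptions the unconditional pmf puts equal weight on 0, 1, 2, and 3, which is why the mean lands at the midpoint 3/2.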

Chapter 33 Moment generating function

Moment generating function (MGF) is one of the least intuitive concepts in probability theory. Before we bother to memorize a bunch of MGF formulas, let's understand why MGF was invented and what it is used for.

MGF is to probability theory as catalysts are to chemical reactions. Without a catalyst, a chemical reaction can still take place, but it may take a long time. With a catalyst, the reaction can proceed quickly, because a catalyst reduces the amount of energy needed to start the reaction. Likewise, MGF reduces the amount of energy we need to get things done in probability theory.

Example 1. The # of accidents that happen on a particular highway during a 3-month period (i.e. a quarter) is a Poisson random variable. On average,

1 accident happens during the 1st quarter
2 accidents happen during the 2nd quarter
3 accidents happen during the 3rd quarter
4 accidents happen during the 4th quarter

Assume that the # of accidents in any quarter is independent of the # of accidents in any other quarter. Calculate the probability that at least 4 accidents happen on the highway in a year.

Solution

First, let's define 5 random variables:

$N_1$ is the # of accidents in the 1st quarter. $N_1$ is Poisson with $\lambda_1 = 1$.
$N_2$ is the # of accidents in the 2nd quarter. $N_2$ is Poisson with $\lambda_2 = 2$.
$N_3$ is the # of accidents in the 3rd quarter. $N_3$ is Poisson with $\lambda_3 = 3$.
$N_4$ is the # of accidents in the 4th quarter. $N_4$ is Poisson with $\lambda_4 = 4$.
$N$ is the # of accidents in a year. $N = N_1 + N_2 + N_3 + N_4$.

We are asked to find $P(N \ge 4)$. To find $P(N \ge 4)$, we'll need the probability mass function of $N$. We can guess that $N$ is a Poisson random variable with parameter $\lambda = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 10$. But how can we prove that $N$ is a Poisson random variable with $\lambda = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4$? Proving this is difficult without MGF. However, the proof is a piece of cake if we use MGF, as you'll soon see.


Example 2. Random variables $X$ and $Y$ are two independent normal random variables. We want to know whether their sum, $X + Y$, is also a normal random variable. How can we quickly check whether $X + Y$ is also normal?

With MGF, we can prove, effortlessly, that $X + Y$ is also normal. Without MGF, the proof takes a lot of work.

Key point: MGF enables us to quickly find the distribution of the sum of $n$ independent random variables. This point will be made clear to you later. For now, let's define MGF and write up some key formulas.

14 Key MGF formulas you must memorize

1. $M_X(t) = E(e^{tX})$. This is the definition of MGF.

If $X$ is discrete, then $M_X(t) = E(e^{tX}) = \sum_x e^{tx}\, p_X(x)$

If $X$ is continuous, then $M_X(t) = E(e^{tX}) = \int_{-\infty}^{+\infty} e^{tx} f_X(x)\,dx$

2. $M_{aX}(t) = E\big[e^{(aX)t}\big] = E\big[e^{X(at)}\big] = M_X(at)$

3. $M_b(t) = E\big[e^{bt}\big] = e^{bt}$. Here you can think of $b$ as a random variable that takes on the value $b$ only.

4. If $X$ and $Y$ are independent, then $M_{X+Y}(t) = M_X(t)\,M_Y(t)$

The proof is simple. $M_{X+Y}(t) = E\big[e^{(X+Y)t}\big] = E\big[e^{Xt} e^{Yt}\big]$. Since $X$ and $Y$ are independent,

$$E\big[e^{Xt} e^{Yt}\big] = E\big[e^{Xt}\big]\,E\big[e^{Yt}\big] = M_X(t)\,M_Y(t)$$

Generally, if $Y = X_1 + X_2 + \dots + X_n$ where $X_1, X_2, \dots, X_n$ are independent, then

$$M_Y(t) = M_{X_1}(t)\,M_{X_2}(t) \cdots M_{X_n}(t)$$

$$M_{aX+b}(t) = E\big[e^{t(aX+b)}\big] = E\big[e^{t(aX)} e^{tb}\big] = e^{tb}\,E\big[e^{t(aX)}\big] = e^{tb} M_{aX}(t)$$

Alternatively, imagine you have two independent random variables $aX$ and $b$. Then

$$M_{aX+b}(t) = M_{aX}(t)\,M_b(t) = M_{aX}(t)\,e^{tb} = e^{tb} M_{aX}(t)$$

5. $M_{\frac{X+b}{a}}(t) = M_{\frac{1}{a}X + \frac{b}{a}}(t) = M_{\frac{1}{a}X}(t)\,M_{\frac{b}{a}}(t) = M_X\!\left(\frac{t}{a}\right) e^{\frac{b}{a}t} = e^{\frac{b}{a}t}\,M_X\!\left(\frac{t}{a}\right)$

6. $M_X(t)\big|_{t=0} = 1$. To see why, notice that $M_X(t)\big|_{t=0} = E(e^{0 \cdot X}) = E(e^0) = E(1) = 1$.

7. $\dfrac{d}{dt}M_X(t)\Big|_{t=0} = E(X)$, $\quad \dfrac{d^2}{dt^2}M_X(t)\Big|_{t=0} = E(X^2)$, $\quad \dots, \quad \dfrac{d^n}{dt^n}M_X(t)\Big|_{t=0} = E(X^n)$

To see why, please note

$$e^{tX} = 1 + \frac{1}{1!}(tX) + \frac{1}{2!}(tX)^2 + \frac{1}{3!}(tX)^3 + \dots + \frac{1}{n!}(tX)^n + \dots \quad \text{(Taylor series)}$$

Taking the expectation with respect to $X$ on both sides:

$$M_X(t) = E(e^{tX}) = 1 + tE(X) + \frac{1}{2!}t^2 E(X^2) + \frac{1}{3!}t^3 E(X^3) + \dots + \frac{1}{n!}t^n E(X^n) + \dots$$

Differentiating term by term and setting $t = 0$, we have:

$$\frac{d}{dt}M_X(t)\Big|_{t=0} = E(X), \quad \frac{d^2}{dt^2}M_X(t)\Big|_{t=0} = E(X^2), \quad \dots, \quad \frac{d^n}{dt^n}M_X(t)\Big|_{t=0} = E(X^n)$$

8. MGF for Bernoulli distribution

This is a special case of the binomial distribution with $n = 1$ (i.e. the # of trials is one).

$$X = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } q = 1 - p \end{cases}$$

MGF: $M_X(t) = E(e^{tX}) = p\,e^{t(1)} + q\,e^{t(0)} = p\,e^t + q$

9. MGF for binomial distribution

Probability mass function: $p_X(x) = C_n^x\, p^x q^{n-x}$

MGF: $M_X(t) = E(e^{tX}) = (p\,e^t + q)^n$

How to memorize the MGF

A binomial random variable is the sum of $n$ independent identically distributed Bernoulli random variables. Let $X$ be the binomial random variable with parameters $n$ and $p$. Let $Y_1, Y_2, \dots, Y_n$ be $n$ independent identically distributed Bernoulli random variables with parameter $p$. Then $X = Y_1 + Y_2 + \dots + Y_n$.

For example, let $X$ represent the total # of heads you get if you throw a coin 10 times. Then

$Y_1$ is the # of heads you get in the 1st throw; $Y_1$ is either 0 or 1.
$Y_2$ is the # of heads you get in the 2nd throw; $Y_2$ is either 0 or 1.
...
$Y_{10}$ is the # of heads you get in the 10th throw; $Y_{10}$ is either 0 or 1.

Then clearly $X = Y_1 + Y_2 + \dots + Y_{10}$.

$$M_X(t) = M_{Y_1 + Y_2 + \dots + Y_n}(t) = M_{Y_1}(t)\,M_{Y_2}(t) \cdots M_{Y_n}(t)$$

$$M_{Y_1}(t) = p\,e^t + q, \quad M_{Y_2}(t) = p\,e^t + q, \quad \dots, \quad M_{Y_n}(t) = p\,e^t + q$$

So we have:

$$M_X(t) = M_{Y_1}(t)\,M_{Y_2}(t) \cdots M_{Y_n}(t) = (p\,e^t + q)^n$$

10. MGF for Poisson distribution

$$p_X(x) = e^{-\lambda}\frac{\lambda^x}{x!}, \quad \text{where } x = 0, 1, 2, \dots$$

$$M_X(t) = E(e^{tX}) = \sum_{x=0}^{+\infty} e^{tx}\, e^{-\lambda}\frac{\lambda^x}{x!} = e^{-\lambda}\sum_{x=0}^{+\infty} \frac{(\lambda e^t)^x}{x!} = e^{-\lambda}\, e^{\lambda e^t} = e^{\lambda(e^t - 1)}$$

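So the Poisson MGF is $M_X(t) = e^{\lambda(e^t - 1)}$.

11. MGF for a standard normal distribution

If $X$ has the standard normal distribution with mean 0 and variance 1, then

$$f_X(x) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}, \quad -\infty < x < +\infty$$

$$M_X(t) = E\big[e^{tX}\big] = \int_{-\infty}^{+\infty} e^{tx}\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2}\,dx = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2 + tx}\,dx$$

Complete the square in the exponent:

$$-\frac{1}{2}x^2 + tx = -\frac{1}{2}(x^2 - 2tx) = -\frac{1}{2}(x^2 - 2tx + t^2) + \frac{t^2}{2} = -\frac{1}{2}(x - t)^2 + \frac{t^2}{2}$$

$$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2 + tx}\,dx = e^{\frac{1}{2}t^2} \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-t)^2}\,dx$$

Setting $y = x - t$, we have:

$$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-t)^2}\,dx = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}y^2}\,dy$$

Because $\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}y^2}$ is the density function of a standard normal distribution, we have:

$$\int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}y^2}\,dy = 1$$

$$M_X(t) = E\big[e^{tX}\big] = e^{\frac{1}{2}t^2} \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x-t)^2}\,dx = e^{\frac{1}{2}t^2}$$

If $X$ is a standard normal random variable, then $M_X(t) = e^{\frac{1}{2}t^2}$.

12. MGF for a normal random variable with mean $\mu$ and standard deviation $\sigma$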

Let $X$ represent a normal random variable with mean $\mu$ and standard deviation $\sigma$. Let $Z$ represent the standard normal random variable. Then we have:

$$Z = \frac{X - \mu}{\sigma}, \quad X = \mu + \sigma Z$$

$$M_X(t) = M_{\mu + \sigma Z}(t) = e^{\mu t} M_{\sigma Z}(t) = e^{\mu t} M_Z(\sigma t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$$

13. MGF for exponential distribution

If $X$ is exponentially distributed with parameter $\lambda$, then

$$f_X(x) = \lambda e^{-\lambda x}, \quad \text{where } x \ge 0$$

$$M_X(t) = E(e^{tX}) = \int_0^{+\infty} e^{tx}\,\lambda e^{-\lambda x}\,dx = \lambda\int_0^{+\infty} e^{-(\lambda - t)x}\,dx = \frac{\lambda}{\lambda - t} = \frac{1}{1 - \frac{t}{\lambda}} = \frac{1}{1 - \theta t}, \quad \text{where } t < \lambda$$

Here $\theta = 1/\lambda$ is the mean of $X$.

14. MGF for gamma distribution

Let $X_1, X_2, \dots, X_n$ represent $n$ independent identically distributed exponential random variables with a common parameter $\lambda$. Let $Y = \sum_{i=1}^{n} X_i$. As explained before, $Y$ has the gamma distribution with the following pdf:

$$f_Y(y) = \frac{\lambda^n y^{n-1} e^{-\lambda y}}{(n-1)!}, \quad \text{where } y \ge 0$$

The MGF of $Y$ is:

$$M_Y(t) = M_{X_1 + X_2 + \dots + X_n}(t) = M_{X_1}(t)\,M_{X_2}(t) \cdots M_{X_n}(t) = \left(\frac{\lambda}{\lambda - t}\right)^n = \left(\frac{1}{1 - \frac{t}{\lambda}}\right)^n, \quad \text{where } t < \lambda$$

MGFs for the geometric, negative binomial, and uniform distributions are more complex and less frequently tested. I recommend that you don't memorize their moment generating functions. However, you might want to memorize the MGFs for Bernoulli, binomial, exponential, gamma, Poisson, and normal. At minimum, memorize the MGFs for exponential and Poisson. Exponential and Poisson distributions have elegant MGFs and are frequently tested in Exam P.
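Formula 7 (moments are derivatives of the MGF at $t = 0$) can be illustrated numerically. The sketch below is not part of the book's method: it approximates $M'(0)$ and $M''(0)$ with central finite differences, and the rate 5 and the Poisson mean 3 are arbitrary values picked for illustration. The function names are likewise made up for this example.

```python
import math


def mgf_exponential(t, lam):
    """MGF of an exponential random variable with rate lam (valid for t < lam)."""
    return lam / (lam - t)


def mgf_poisson(t, lam):
    """MGF of a Poisson random variable with mean lam."""
    return math.exp(lam * (math.exp(t) - 1.0))


def first_two_moments(mgf, h=1e-3):
    """Approximate M'(0) and M''(0) with central finite differences."""
    m1 = (mgf(h) - mgf(-h)) / (2 * h)
    m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / (h * h)
    return m1, m2


def mean_and_variance(mgf):
    """E(X) = M'(0) and Var(X) = M''(0) - M'(0)^2."""
    m1, m2 = first_two_moments(mgf)
    return m1, m2 - m1 * m1


exp_mean, exp_var = mean_and_variance(lambda t: mgf_exponential(t, 5.0))
poi_mean, poi_var = mean_and_variance(lambda t: mgf_poisson(t, 3.0))

print(exp_mean, exp_var)  # close to 1/5 and 1/25
print(poi_mean, poi_var)  # close to 3 and 3
```

Finite differences only approximate the derivatives, so the printed values agree with the exact moments up to a small numerical error.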

Sample Problems and Solutions

Problem 1

Random variable $X$ has the following distribution:

X = x     0      1      2
p(x)      0.1    0.4    0.5

Find $M_X(t)$.

Solution

This problem simply tests your knowledge of the definition of the moment generating function.

$$M_X(t) = E(e^{tX}) = \sum_x p(x)\,e^{tx} = 0.1e^{t \cdot 0} + 0.4e^{t \cdot 1} + 0.5e^{t \cdot 2} = 0.1 + 0.4e^t + 0.5e^{2t}$$

Problem 2

A random variable has the following moment generating function:

$$M_X(t) = \frac{1}{7}e^t + \frac{2}{7}e^{2t} + \frac{4}{7}e^{3t}$$

Find the probability mass function, mean, and variance of $X$.

Solution

Because $M_X(t) = E(e^{tX})$, we have

$$f(x) = \begin{cases} 1/7 & x = 1 \\ 2/7 & x = 2 \\ 4/7 & x = 3 \end{cases}$$

$$E(X) = \frac{d}{dt}M_X(t)\Big|_{t=0} = \frac{d}{dt}\left[\frac{1}{7}e^t + \frac{2}{7}e^{2t} + \frac{4}{7}e^{3t}\right]_{t=0} = \left[\frac{1}{7}(1e^t) + \frac{2}{7}(2e^{2t}) + \frac{4}{7}(3e^{3t})\right]_{t=0} = \frac{1}{7}(1) + \frac{2}{7}(2) + \frac{4}{7}(3) = \frac{17}{7}$$

$$E(X^2) = \frac{d^2}{dt^2}M_X(t)\Big|_{t=0} = \frac{d}{dt}\left[\frac{1}{7}(1e^t) + \frac{2}{7}(2e^{2t}) + \frac{4}{7}(3e^{3t})\right]_{t=0} = \left[\frac{1}{7}(1^2 e^t) + \frac{2}{7}(2^2 e^{2t}) + \frac{4}{7}(3^2 e^{3t})\right]_{t=0} = \frac{1}{7}(1^2) + \frac{2}{7}(2^2) + \frac{4}{7}(3^2) = \frac{45}{7}$$

$$Var(X) = E(X^2) - E^2(X) = \frac{45}{7} - \left(\frac{17}{7}\right)^2 = \frac{26}{49}$$

Problem 3

Prove that the linear combination of two independent normal random variables is also normal.

Solution

Assume $X_1$ is normal with mean $\mu_1$ and standard deviation $\sigma_1$, and $X_2$ is normal with mean $\mu_2$ and standard deviation $\sigma_2$. We need to prove that $aX_1 + bX_2$ is also normal for $a \ne 0$.

$$M_{X_1}(t) = e^{\mu_1 t + \frac{1}{2}\sigma_1^2 t^2}, \quad M_{X_2}(t) = e^{\mu_2 t + \frac{1}{2}\sigma_2^2 t^2}$$

Because $X_1$ and $X_2$ are independent,

$$M_{aX_1 + bX_2}(t) = M_{aX_1}(t)\,M_{bX_2}(t)$$

$$M_{aX_1}(t) = M_{X_1}(at) = \exp\left[\mu_1(at) + \frac{1}{2}\sigma_1^2(at)^2\right]$$

$$M_{bX_2}(t) = M_{X_2}(bt) = \exp\left[\mu_2(bt) + \frac{1}{2}\sigma_2^2(bt)^2\right]$$

$$M_{aX_1 + bX_2}(t) = M_{aX_1}(t)\,M_{bX_2}(t) = \exp\left[\mu_1(at) + \frac{1}{2}\sigma_1^2(at)^2\right]\exp\left[\mu_2(bt) + \frac{1}{2}\sigma_2^2(bt)^2\right]$$

$$= \exp\left[\mu_1(at) + \frac{1}{2}\sigma_1^2(at)^2 + \mu_2(bt) + \frac{1}{2}\sigma_2^2(bt)^2\right] = \exp\left[(a\mu_1 + b\mu_2)\,t + \frac{1}{2}\big(a^2\sigma_1^2 + b^2\sigma_2^2\big)\,t^2\right]$$

$\exp\left[(a\mu_1 + b\mu_2)\,t + \frac{1}{2}\big(a^2\sigma_1^2 + b^2\sigma_2^2\big)\,t^2\right]$ is the MGF for a normal random variable that has the following mean and standard deviation:

The mean: $\mu = a\mu_1 + b\mu_2$
The standard deviation: $\sigma = \sqrt{a^2\sigma_1^2 + b^2\sigma_2^2}$

You see that $aX_1 + bX_2$ is normal with $\mu = a\mu_1 + b\mu_2$ and $\sigma = \sqrt{a^2\sigma_1^2 + b^2\sigma_2^2}$.

Problem 4

Prove that the sum of two independent Poisson random variables is also a Poisson random variable.

Solution

Let $N_1$ and $N_2$ represent two independent Poisson random variables with means $\lambda_1$ and $\lambda_2$ respectively.

$$M_{N_1}(t) = \exp\big[\lambda_1(e^t - 1)\big], \quad M_{N_2}(t) = \exp\big[\lambda_2(e^t - 1)\big]$$

Because $N_1$ and $N_2$ are independent, we have:

$$M_{N_1 + N_2}(t) = M_{N_1}(t)\,M_{N_2}(t) = \exp\big[\lambda_1(e^t - 1)\big]\exp\big[\lambda_2(e^t - 1)\big] = \exp\big[(\lambda_1 + \lambda_2)(e^t - 1)\big]$$

$\exp\big[(\lambda_1 + \lambda_2)(e^t - 1)\big]$ is the MGF for a Poisson random variable with parameter $\lambda_1 + \lambda_2$.

More generally, if $N_1, N_2, \dots, N_k$ are independent and $N_i$ is a Poisson random variable with mean $\lambda_i$ (where $i = 1, 2, \dots, k$), then $\sum_{i=1}^{k} N_i$ is also a Poisson random variable with mean $\sum_{i=1}^{k} \lambda_i$.
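The MGF proof can be complemented by a direct numerical check. The sketch below is only an illustration: it convolves two Poisson pmfs (means 1 and 2, chosen arbitrarily) and compares the result with a Poisson pmf with mean 3. It then answers the question from Example 1 of this chapter, the probability of at least 4 accidents when the yearly total is Poisson with mean 10. The function names are made up for this example.

```python
import math


def poisson_pmf(k, lam):
    """Pr(N = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def convolved_pmf(n, lam1, lam2):
    """Pr(N1 + N2 = n) for independent Poisson N1 and N2, by direct convolution."""
    return sum(poisson_pmf(k, lam1) * poisson_pmf(n - k, lam2)
               for k in range(n + 1))


# Largest gap between the convolution and the Poisson(3) pmf over n = 0..14.
max_gap = max(abs(convolved_pmf(n, 1.0, 2.0) - poisson_pmf(n, 3.0))
              for n in range(15))

# Example 1: yearly accidents are Poisson with mean 1 + 2 + 3 + 4 = 10.
prob_at_least_4 = 1.0 - sum(poisson_pmf(k, 10.0) for k in range(4))

print(max_gap)          # essentially zero (floating-point noise)
print(prob_at_least_4)  # roughly 0.99
```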

Problem 5

The # of accidents that happen on a particular highway during a 3-month period (i.e. a quarter) is a Poisson random variable. On average,

1 accident happens during the 1st quarter
2 accidents happen during the 2nd quarter
3 accidents happen during the 3rd quarter
4 accidents happen during the 4th quarter

Assume that the # of accidents in any quarter is independent of the # of accidents in any other quarter. Calculate the probability that at least 4 accidents happen on the highway in a year.

Solution

First, let's define 5 random variables:

$N_1$ is the # of accidents in the 1st quarter. $N_1$ is Poisson with $\lambda_1 = 1$.
$N_2$ is the # of accidents in the 2nd quarter. $N_2$ is Poisson with $\lambda_2 = 2$.
$N_3$ is the # of accidents in the 3rd quarter. $N_3$ is Poisson with $\lambda_3 = 3$.
$N_4$ is the # of accidents in the 4th quarter. $N_4$ is Poisson with $\lambda_4 = 4$.
$N$ is the # of accidents in a year. $N = N_1 + N_2 + N_3 + N_4$.

$N = N_1 + N_2 + N_3 + N_4$ is also a Poisson random variable with mean $\lambda = \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 10$.

$$P(N \ge 4) = 1 - \big[P(N=0) + P(N=1) + P(N=2) + P(N=3)\big] = 1 - e^{-10}\left(\frac{10^0}{0!} + \frac{10^1}{1!} + \frac{10^2}{2!} + \frac{10^3}{3!}\right) = 0.99$$

Problem 6

$$f_X(x) = \frac{1}{2}e^{-|x|}, \quad \text{where } -\infty < x < +\infty$$

Calculate $M_X(t)$.

Solution

$$M_X(t) = E\big[e^{tX}\big] = \int_{-\infty}^{+\infty} e^{tx} f(x)\,dx = \int_{-\infty}^{+\infty} e^{tx}\,\frac{1}{2}e^{-|x|}\,dx = \frac{1}{2}\left[\int_{-\infty}^{0} e^{tx} e^{x}\,dx + \int_{0}^{+\infty} e^{tx} e^{-x}\,dx\right]$$

$$= \frac{1}{2}\left[\int_{-\infty}^{0} e^{(t+1)x}\,dx + \int_{0}^{+\infty} e^{(t-1)x}\,dx\right] = \frac{1}{2}\left[\frac{1}{t+1}e^{(t+1)x}\Big|_{-\infty}^{0} + \frac{1}{t-1}e^{(t-1)x}\Big|_{0}^{+\infty}\right]$$

If $t + 1 > 0$, or $t > -1$,

$$\frac{1}{t+1}e^{(t+1)x}\Big|_{-\infty}^{0} = \frac{1}{t+1}\Big[e^0 - e^{(t+1)(-\infty)}\Big] = \frac{1}{t+1}[1 - 0] = \frac{1}{t+1}$$

If $t - 1 < 0$, or $t < 1$,

$$\frac{1}{t-1}e^{(t-1)x}\Big|_{0}^{+\infty} = \frac{1}{t-1}\Big[e^{(t-1)(+\infty)} - e^0\Big] = \frac{1}{t-1}[0 - 1] = -\frac{1}{t-1}$$

So for $-1 < t < 1$, we have:

$$M_X(t) = \frac{1}{2}\left[\frac{1}{t+1} - \frac{1}{t-1}\right] = \frac{1}{2}\left[\frac{1}{1+t} + \frac{1}{1-t}\right] = \frac{1}{1 - t^2}$$

Problem 7

A discrete random variable $X$ takes 3 possible values: 0, 1, and 2. The 1st moment of $X$ is 1; the 2nd moment is 1.5. Find $M_X(t)$.

Solution

Generally, the $k$-th moment of $X$ refers to $E(X^k)$. The 1st moment of $X$ refers to $E(X)$. The 2nd moment of $X$ refers to $E(X^2)$.

Let $a$, $b$, and $c$ represent the probability that $X$ is equal to 0, 1, and 2 respectively.

X = x       0    1    2
p_X(x)      a    b    c

$$E(X) = 0(a) + 1(b) + 2(c) = b + 2c = 1$$
$$E(X^2) = 0^2(a) + 1^2(b) + 2^2(c) = b + 4c = 1.5$$

Solving the above equations, we have:

$$b = \frac{1}{2}, \quad c = \frac{1}{4}, \quad a = 1 - (b + c) = \frac{1}{4}$$

$$M_X(t) = a\,e^{0 \cdot t} + b\,e^{t} + c\,e^{2t} = \frac{1}{4} + \frac{1}{2}e^t + \frac{1}{4}e^{2t}$$

Problem 8

$X$ is an exponential random variable with parameter $\lambda = 5$. $Y = 2X + 3$. Find $M_Y(t)$.

Solution

$$M_X(t) = \frac{\lambda}{\lambda - t} = \frac{5}{5 - t}$$

$$M_Y(t) = M_{2X+3}(t) = M_{2X}(t)\,M_3(t)$$

$$M_{2X}(t) = M_X(2t) = \frac{5}{5 - 2t}, \quad M_3(t) = e^{3t}$$

$$M_Y(t) = M_{2X+3}(t) = M_{2X}(t)\,M_3(t) = \frac{5}{5 - 2t}\,e^{3t}$$

Problem 9

$X = U + V$. $U$ is exponentially distributed with parameter $\lambda = 2$. $V$ is a Poisson random variable with mean 3. $U$ and $V$ are independent. Find $M_X(t)$.

Solution

$$M_U(t) = \frac{2}{2 - t}, \quad M_V(t) = \exp\big[3(e^t - 1)\big]$$

$$M_X(t) = M_{U+V}(t) = M_U(t)\,M_V(t) = \frac{2}{2 - t}\exp\big[3(e^t - 1)\big]$$
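The closed form from Problem 6, $M_X(t) = \frac{1}{1 - t^2}$ for $-1 < t < 1$, can be sanity-checked by numerical integration. The sketch below is only an illustration under stated assumptions: it uses composite Simpson's rule, truncates the infinite range at a cutoff of 60 (an arbitrary choice where the integrand is negligible for the $t$ used), and evaluates at $t = 0.5$. The function names are made up for this example.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def double_exponential_mgf_numeric(t, cutoff=60.0):
    """Integrate e^{tx} * 0.5 * e^{-|x|}, split at 0 where the density has a kink."""
    def integrand(x):
        return math.exp(t * x) * 0.5 * math.exp(-abs(x))
    return simpson(integrand, -cutoff, 0.0) + simpson(integrand, 0.0, cutoff)


t = 0.5
numeric = double_exponential_mgf_numeric(t)
closed_form = 1.0 / (1.0 - t * t)

print(numeric, closed_form)  # the two values should agree closely
```

Splitting the integral at 0 mirrors the solution, which treats $x < 0$ and $x \ge 0$ separately because of the absolute value.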

Problem 10

$$M_Y(t) = \frac{5}{5 - 2t}\,e^{10t}$$

Find $E(Y)$ and $Var(Y)$.

Solution

Method 1

$M_Y(t)$ is the product of two terms: $\frac{5}{5 - 2t}$ and $e^{10t}$.

$\frac{5}{5 - 2t}$ is the MGF for $2X$, where $X$ is an exponential random variable with parameter $\lambda = 5$. $e^{10t}$ is the MGF for the constant 10. So $Y$ has the same distribution as $2X + 10$.

$$E(Y) = E(2X + 10) = 2E(X) + 10 = 2\left(\frac{1}{5}\right) + 10 = 10.4$$

$$Var(Y) = Var(2X + 10) = 4\,Var(X) = 4\left(\frac{1}{5}\right)^2 = 0.16$$

Please note that the mean and the standard deviation of an exponential distribution with parameter $\lambda$ are both $\frac{1}{\lambda}$. In this problem, $E(X) = \sigma_X = \frac{1}{5}$.

Method 2

$$\frac{d}{dt}M_Y(t) = \frac{d}{dt}\left[\frac{5}{5 - 2t}\,e^{10t}\right] = \frac{5}{5 - 2t}\,\frac{d}{dt}e^{10t} + e^{10t}\,\frac{d}{dt}\frac{5}{5 - 2t} = 10\,\frac{5}{5 - 2t}\,e^{10t} + e^{10t}\,5(-1)(-2)(5 - 2t)^{-2}$$

$$= \left(10 + \frac{2}{5 - 2t}\right)\frac{5}{5 - 2t}\,e^{10t} = \left(10 + \frac{2}{5 - 2t}\right)M_Y(t)$$

$$E(Y) = \frac{d}{dt}M_Y(t)\Big|_{t=0} = \left(10 + \frac{2}{5 - 2t}\right)\Big|_{t=0} M_Y(t)\Big|_{t=0} = 10.4$$

Please note that $M_Y(t)\big|_{t=0} = 1$.

$$\frac{d^2}{dt^2}M_Y(t) = \frac{d}{dt}\left[\left(10 + \frac{2}{5 - 2t}\right)M_Y(t)\right] = \left(10 + \frac{2}{5 - 2t}\right)\frac{d}{dt}M_Y(t) + M_Y(t)\,\frac{d}{dt}\left(10 + \frac{2}{5 - 2t}\right)$$

$$= \left(10 + \frac{2}{5 - 2t}\right)\frac{d}{dt}M_Y(t) + M_Y(t)\,4(5 - 2t)^{-2}$$

$$E(Y^2) = \frac{d^2}{dt^2}M_Y(t)\Big|_{t=0} = \left(10 + \frac{2}{5}\right)(10.4) + (1)\frac{4}{5^2} = 10.4^2 + 0.16$$

$$Var(Y) = E(Y^2) - E^2(Y) = 10.4^2 + 0.16 - 10.4^2 = 0.16$$

Problem 10

Random variable $X$ has the following distribution:

$$X = \begin{cases} U & \text{with probability } 0.25 \\ V & \text{with probability } 0.75 \end{cases}$$

where $U$ is exponentially distributed with parameter $\lambda = 1$ and $V$ is exponentially distributed with parameter $\lambda = 2$. Find $M_X(t)$.

Solution

$$M_X(t) = P(X = U)\,M_U(t) + P(X = V)\,M_V(t) = \frac{1}{4}\cdot\frac{1}{1 - t} + \frac{3}{4}\cdot\frac{2}{2 - t}$$

If the above solution seems difficult to understand, here is another solution:

$$P(X \le x) = P(X = U)\,P(U \le x) + P(X = V)\,P(V \le x) = \frac{1}{4}P(U \le x) + \frac{3}{4}P(V \le x)$$

Taking the derivative with respect to $x$:

$$f(x) = \frac{d}{dx}P(X \le x) = \frac{1}{4}\frac{d}{dx}P(U \le x) + \frac{3}{4}\frac{d}{dx}P(V \le x) = \frac{1}{4}f_U(x) + \frac{3}{4}f_V(x) = \frac{1}{4}e^{-x} + \frac{3}{4}\big(2e^{-2x}\big)$$

$$M_X(t) = E(e^{tX}) = \int_0^{+\infty} e^{tx}\left[\frac{1}{4}e^{-x} + \frac{3}{4}\big(2e^{-2x}\big)\right]dx = \frac{1}{4}\cdot\frac{1}{1 - t} + \frac{3}{4}\cdot\frac{2}{2 - t}$$

Problem 11

$$M_X(t) = e^{7t}\big(0.2e^{6t} + 0.8\big)^{10}$$

Calculate $E(X)$ and $Var(X)$.

Solution

Method 1

$M_X(t)$ is the product of two terms, $e^{7t}$ and $\big(0.2e^{6t} + 0.8\big)^{10}$. $e^{7t}$ is the MGF for the constant 7. $\big(0.2e^{6t} + 0.8\big)^{10}$ is the MGF for $6Y$, where $Y$ is a binomial random variable with parameters $n = 10$ and $p = 0.2$. So $X$ has the same distribution as $6Y + 7$.

$$E(X) = E(6Y + 7) = 6E(Y) + 7 = 6(np) + 7 = 6(10 \times 0.2) + 7 = 19$$

$$Var(X) = Var(6Y + 7) = 6^2\,Var(Y) = 36(npq) = 36(10 \times 0.2 \times 0.8) = 57.6$$

Method 2

The following standard approach is labor intensive:

$$E(X) = \frac{d}{dt}M_X(t)\Big|_{t=0} = \frac{d}{dt}\left[e^{7t}\big(0.2e^{6t} + 0.8\big)^{10}\right]_{t=0}$$

$$E(X^2) = \frac{d^2}{dt^2}M_X(t)\Big|_{t=0} = \frac{d^2}{dt^2}\left[e^{7t}\big(0.2e^{6t} + 0.8\big)^{10}\right]_{t=0}$$

We'll use the following shortcut instead:

$$E(X) = \frac{d}{dt}\ln M_X(t)\Big|_{t=0}, \quad Var(X) = \frac{d^2}{dt^2}\ln M_X(t)\Big|_{t=0}$$

This shortcut is very useful if the MGF is the product of several functions of $t$.

Proof.

$$\frac{d}{dt}\ln M_X(t) = \frac{1}{M_X(t)}\frac{d}{dt}M_X(t) = \frac{M_X'(t)}{M_X(t)}, \quad \frac{d}{dt}\ln M_X(t)\Big|_{t=0} = \frac{M_X'(t)\big|_{t=0}}{M_X(t)\big|_{t=0}}$$

Because $M_X(t)\big|_{t=0} = 1$ is true for any MGF, we have:

$$\frac{d}{dt}\ln M_X(t)\Big|_{t=0} = M_X'(t)\Big|_{t=0} = E(X)$$

$$\frac{d^2}{dt^2}\ln M_X(t) = \frac{d}{dt}\left[\frac{d}{dt}\ln M_X(t)\right] = \frac{d}{dt}\frac{M_X'(t)}{M_X(t)} = \frac{M_X''(t)\,M_X(t) - \big[M_X'(t)\big]^2}{M_X^2(t)}$$

$$\frac{d^2}{dt^2}\ln M_X(t)\Big|_{t=0} = \frac{M_X''(0)\,M_X(0) - \big[M_X'(0)\big]^2}{M_X^2(0)}$$

However, $M_X(0) = 1$, $M_X'(0) = E(X)$, $M_X''(0) = E(X^2)$. So

$$\frac{d^2}{dt^2}\ln M_X(t)\Big|_{t=0} = E(X^2) - E^2(X) = Var(X)$$

Back to the problem.

$$M_X(t) = e^{7t}\big(0.2e^{6t} + 0.8\big)^{10}, \quad \ln M_X(t) = 7t + 10\ln\big(0.2e^{6t} + 0.8\big)$$

$$\frac{d}{dt}\ln M_X(t) = 7 + \frac{10(0.2)(6)e^{6t}}{0.2e^{6t} + 0.8} = 7 + \frac{12e^{6t}}{0.2e^{6t} + 0.8} = 7 + \frac{12}{0.2 + 0.8e^{-6t}}$$

$$\frac{d^2}{dt^2}\ln M_X(t) = \frac{d}{dt}\left[7 + \frac{12}{0.2 + 0.8e^{-6t}}\right] = 12\,\frac{d}{dt}\big(0.2 + 0.8e^{-6t}\big)^{-1} = 12(-1)(0.8)(-6)\,e^{-6t}\big(0.2 + 0.8e^{-6t}\big)^{-2}$$

$$E(X) = \frac{d}{dt}\ln M_X(t)\Big|_{t=0} = \left[7 + \frac{12}{0.2 + 0.8e^{-6t}}\right]_{t=0} = 7 + 12 = 19$$

$$Var(X) = \frac{d^2}{dt^2}\ln M_X(t)\Big|_{t=0} = \left[12(-1)(0.8)(-6)\,e^{-6t}\big(0.2 + 0.8e^{-6t}\big)^{-2}\right]_{t=0} = 57.6$$

Problem 12

A machine has two components. One component is at work. The 2nd component sits idle but is activated after the first one has failed. The machine fails only after both components have failed. Let $X, Y$ represent each component's time until failure. Let $Z$ represent the machine's time until failure.

$X, Y$ have the following moment generating functions:

$$M_X(t) = \frac{1}{1 - t}, \quad M_Y(t) = \frac{1}{1 - 2t}$$

Find $E(Z)$, $Var(Z)$, $E(Z^3)$. Assume $X, Y$ are independent.

Solution

$Z = X + Y$

$$M_Z(t) = M_X(t)\,M_Y(t) = \frac{1}{1 - t}\cdot\frac{1}{1 - 2t}$$

To simplify the calculation, set

$$\frac{1}{(1 - 2t)(1 - t)} = \frac{A}{1 - 2t} + \frac{B}{1 - t} = \frac{A(1 - t) + B(1 - 2t)}{(1 - 2t)(1 - t)} = \frac{(A + B) - (A + 2B)\,t}{(1 - 2t)(1 - t)}$$

For $\frac{1}{(1 - 2t)(1 - t)} = \frac{(A + B) - (A + 2B)\,t}{(1 - 2t)(1 - t)}$ to hold, we must have $A + B = 1$ and $A + 2B = 0$. Solving these equations, we have $A = 2$, $B = -1$.

$$M_Z(t) = M_X(t)\,M_Y(t) = \frac{2}{1 - 2t} - \frac{1}{1 - t} = 2(1 - 2t)^{-1} - (1 - t)^{-1}$$

$$\frac{d}{dt}M_Z(t) = 2^2(1 - 2t)^{-2} - (1 - t)^{-2}$$

$$\frac{d^2}{dt^2}M_Z(t) = 2^4(1 - 2t)^{-3} - 2(1 - t)^{-3}$$

$$\frac{d^3}{dt^3}M_Z(t) = 2^4(3)(2)(1 - 2t)^{-4} - 2(3)(1 - t)^{-4}$$

$$E(Z) = \frac{d}{dt}M_Z(t)\Big|_{t=0} = \Big[2^2(1 - 2t)^{-2} - (1 - t)^{-2}\Big]_{t=0} = 3$$

$$E(Z^2) = \frac{d^2}{dt^2}M_Z(t)\Big|_{t=0} = \Big[2^4(1 - 2t)^{-3} - 2(1 - t)^{-3}\Big]_{t=0} = 14$$

$$Var(Z) = E(Z^2) - E^2(Z) = 14 - 3^2 = 5$$

$$E(Z^3) = \frac{d^3}{dt^3}M_Z(t)\Big|_{t=0} = \Big[2^4(3)(2)(1 - 2t)^{-4} - 2(3)(1 - t)^{-4}\Big]_{t=0} = 90$$
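The three moments can be cross-checked without differentiating the MGF. The sketch below is only an illustration and rests on two assumptions that are not part of the solution above: it reads $\frac{1}{1-t}$ and $\frac{1}{1-2t}$ as exponential MGFs with means 1 and 2, and it uses the standard fact that an exponential random variable with mean $\theta$ has $E(X^k) = k!\,\theta^k$. The function names are made up for this example. Independence lets the binomial expansion of $(X + Y)^k$ factor term by term.

```python
from math import comb, factorial


def exponential_moment(k, mean):
    """E(X^k) = k! * mean^k for an exponential random variable with the given mean."""
    return factorial(k) * mean ** k


def sum_moment(k, mean_x, mean_y):
    """E[(X + Y)^k] for independent exponentials X and Y, via the binomial expansion."""
    return sum(comb(k, j)
               * exponential_moment(j, mean_x)
               * exponential_moment(k - j, mean_y)
               for j in range(k + 1))


m1, m2, m3 = (sum_moment(k, 1, 2) for k in (1, 2, 3))
variance = m2 - m1 ** 2

print(m1, m2, m3, variance)  # should agree with 3, 14, 90, and 5
```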

If you have memorized the moment generating function of the exponential distribution, you will notice that $X$ is exponentially distributed with a mean of 1 and $Y$ is exponentially distributed with a mean of 2. Then

$$E(Z) = E(X + Y) = E(X) + E(Y) = 1 + 2 = 3$$
$$Var(Z) = Var(X + Y) = Var(X) + Var(Y) = 1^2 + 2^2 = 5$$

Though we can quickly calculate $E(Z)$ and $Var(Z)$ without using the moment generating function, we cannot easily calculate $E(Z^3)$ this way. Now you see the power of the moment generating function.

Problem 13

$X, Y$ are two independent random variables with the following moment generating functions:

$$M_X(t) = e^{-t + t^2}, \quad M_Y(t) = e^{2t + \frac{t^2}{2}}$$

Let $Z = X - 5Y$. Find $Var(Z)$.

Solution

If you have memorized the moment generating function of a normal variable $X$,

$$M_X(t) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$$

you can solve this problem quickly. In this problem, $X$ is normal with a mean of $-1$ and a variance of 2; $Y$ is normal with a mean of 2 and a variance of 1. Then

$$Var(Z) = Var(X - 5Y) = Var(X) + 5^2\,Var(Y) = 2 + 5^2(1) = 27$$

If you have not memorized the moment generating function for the normal distribution or for any other distribution:

$$M_{X - 5Y}(t) = M_X(t)\,M_{-5Y}(t) = M_X(t)\,M_Y(-5t)$$

$$M_X(t) = e^{-t + t^2}, \quad M_{-5Y}(t) = M_Y(-5t) = e^{2(-5t) + \frac{(-5t)^2}{2}}$$

$$M_{X - 5Y}(t) = e^{-t + t^2}\,e^{2(-5t) + \frac{(-5t)^2}{2}} = e^{-t + t^2 + 2(-5t) + \frac{(-5t)^2}{2}} = e^{-11t + 13.5t^2}$$

$$\frac{d}{dt}M_{X - 5Y}(t) = \frac{d}{dt}e^{-11t + 13.5t^2} = (-11 + 27t)\,e^{-11t + 13.5t^2}$$

$$\frac{d^2}{dt^2}M_{X - 5Y}(t) = \frac{d}{dt}\Big[(-11 + 27t)\,e^{-11t + 13.5t^2}\Big] = \Big[(-11 + 27t)^2 + 27\Big]e^{-11t + 13.5t^2}$$

$$E(X - 5Y) = \frac{d}{dt}M_{X - 5Y}(t)\Big|_{t=0} = \Big[(-11 + 27t)\,e^{-11t + 13.5t^2}\Big]_{t=0} = -11$$

$$E\big[(X - 5Y)^2\big] = \frac{d^2}{dt^2}M_{X - 5Y}(t)\Big|_{t=0} = \Big\{\Big[(-11 + 27t)^2 + 27\Big]e^{-11t + 13.5t^2}\Big\}_{t=0} = (-11)^2 + 27$$

$$Var(X - 5Y) = E\big[(X - 5Y)^2\big] - \big[E(X - 5Y)\big]^2 = (-11)^2 + 27 - (-11)^2 = 27$$

Alternatively:

$$M_{X - 5Y}(t) = e^{-11t + 13.5t^2}, \quad \ln M_{X - 5Y}(t) = -11t + 13.5t^2$$

$$\frac{d}{dt}\ln M_{X - 5Y}(t) = \frac{d}{dt}\big(-11t + 13.5t^2\big) = -11 + 27t$$

$$\frac{d^2}{dt^2}\ln M_{X - 5Y}(t) = \frac{d}{dt}(-11 + 27t) = 27$$

$$E(X - 5Y) = \frac{d}{dt}\ln M_{X - 5Y}(t)\Big|_{t=0} = -11$$

$$Var(X - 5Y) = \frac{d^2}{dt^2}\ln M_{X - 5Y}(t)\Big|_{t=0} = [27]_{t=0} = 27$$

You see that the shortcut really cuts to the chase.
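The log-MGF shortcut lends itself to a quick numerical check. The sketch below is only an illustration: it approximates the first and second derivatives of $\ln M(t)$ at $t = 0$ with central finite differences (step size chosen arbitrarily) and applies them to the MGFs from Problem 11 and Problem 13. The function names are made up for this example.

```python
import math


def log_mgf_problem_11(t):
    """ln of e^{7t} (0.2 e^{6t} + 0.8)^{10}."""
    return 7 * t + 10 * math.log(0.2 * math.exp(6 * t) + 0.8)


def log_mgf_problem_13(t):
    """ln of the MGF of X - 5Y, i.e. -11t + 13.5 t^2."""
    return -11 * t + 13.5 * t * t


def mean_and_variance_from_log_mgf(log_mgf, h=1e-3):
    """E(X) and Var(X) as the first and second derivatives of ln M(t) at t = 0."""
    mean = (log_mgf(h) - log_mgf(-h)) / (2 * h)
    variance = (log_mgf(h) - 2 * log_mgf(0.0) + log_mgf(-h)) / (h * h)
    return mean, variance


mean_11, var_11 = mean_and_variance_from_log_mgf(log_mgf_problem_11)
mean_13, var_13 = mean_and_variance_from_log_mgf(log_mgf_problem_13)

print(mean_11, var_11)  # close to 19 and 57.6
print(mean_13, var_13)  # close to -11 and 27
```

Because the product of MGFs becomes a sum after taking logs, each factor can be differentiated on its own, which is what makes the shortcut fast by hand as well.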

Problem 14

Random variable $X$ has the following moment generating function:

$$M_X(t) = \frac{1}{2}e^t\big(1 + e^{2t}\big)$$

Calculate $Var(X)$.

Solution

Method 1

$$M_X(t) = \frac{1}{2}e^t\big(1 + e^{2t}\big) = \frac{1}{2}e^t + \frac{1}{2}e^{3t}$$

$e^t$ is the MGF for a constant of 1; $e^{3t}$ is the MGF for a constant of 3. Remember $M_b(t) = e^{bt}$. So we see that $X$ takes on the values 1 and 3, each with probability $\frac{1}{2}$:

$$X = \begin{cases} 1 & \text{with probability } 0.5 \\ 3 & \text{with probability } 0.5 \end{cases}$$

$$E(X) = 0.5(1 + 3) = 2, \quad E(X^2) = 0.5(1^2 + 3^2) = 5, \quad Var(X) = 5 - 2^2 = 1$$

Method 2

$$M_X(t) = \frac{1}{2}e^t\big(1 + e^{2t}\big) = \frac{1}{2}\big(e^t + e^{3t}\big)$$

$$\frac{d}{dt}M_X(t) = \frac{1}{2}\frac{d}{dt}\big(e^t + e^{3t}\big) = \frac{1}{2}\big(e^t + 3e^{3t}\big)$$

$$\frac{d^2}{dt^2}M_X(t) = \frac{d}{dt}\left[\frac{1}{2}\big(e^t + 3e^{3t}\big)\right] = \frac{1}{2}\big(e^t + 9e^{3t}\big)$$

$$E(X) = \frac{d}{dt}M_X(t)\Big|_{t=0} = \frac{1}{2}\big(e^t + 3e^{3t}\big)\Big|_{t=0} = \frac{1}{2}(1 + 3) = 2$$

$$E(X^2) = \frac{d^2}{dt^2}M_X(t)\Big|_{t=0} = \frac{1}{2}\big(e^t + 9e^{3t}\big)\Big|_{t=0} = \frac{1}{2}(1 + 9) = 5$$

$$Var(X) = E(X^2) - E^2(X) = 5 - 2^2 = 1$$

Method 3

$$\ln M_X(t) = \ln\left[\frac{1}{2}e^t\big(1 + e^{2t}\big)\right] = \ln\frac{1}{2} + t + \ln\big(1 + e^{2t}\big)$$

$$\frac{d}{dt}\ln M_X(t) = 1 + \frac{2e^{2t}}{1 + e^{2t}} = 1 + \frac{2}{e^{-2t} + 1} = 1 + 2\big(e^{-2t} + 1\big)^{-1}$$

$$\frac{d^2}{dt^2}\ln M_X(t) = \frac{d}{dt}\Big[1 + 2\big(e^{-2t} + 1\big)^{-1}\Big] = 2(-1)\big(e^{-2t} + 1\big)^{-2}(-2)\,e^{-2t}$$

$$Var(X) = \frac{d^2}{dt^2}\ln M_X(t)\Big|_{t=0} = \Big[2(-1)\big(e^{-2t} + 1\big)^{-2}(-2)\,e^{-2t}\Big]_{t=0} = 4\big(2^{-2}\big) = 1$$

Problem 15

True/False. Two random variables $X$ and $Y$ have identical moment generating functions:

$$M_X(t) = M_Y(t) = \frac{1}{1 - t}$$

This means that $X = Y$.

Solution

False.

$\frac{1}{1 - t}$ is the moment generating function of an exponential random variable with mean 1. $M_X(t) = M_Y(t) = \frac{1}{1 - t}$ means that $X$ and $Y$ have the same pdf:

$$f_X(x) = e^{-x}, \quad f_Y(y) = e^{-y}$$

However, having the same probability distribution doesn't mean that two random variables are the same. To bring this concept home, let's consider two unbiased coins A and B. Let

$X$ represent the number of heads we get if we flip coin A once
$Y$ represent the number of heads we get if we flip coin B once

Clearly, $X$ and $Y$ have identical probability mass functions:

$$X = \begin{cases} 1 & \text{with probability } 0.5 \\ 0 & \text{with probability } 0.5 \end{cases} \qquad Y = \begin{cases} 1 & \text{with probability } 0.5 \\ 0 & \text{with probability } 0.5 \end{cases}$$

$X$ and $Y$ have identical MGFs:

$$M_X(t) = M_Y(t) = E(e^{tX}) = E(e^{tY}) = 0.5\big(e^{t \cdot 0} + e^{t \cdot 1}\big) = 0.5\big(1 + e^t\big)$$

However, it's not true that $X = Y$. For example, if you flip coin A, you may get a head (so $X = 1$). If you flip coin B, you may get a tail (so $Y = 0$).

Key point to remember: if two random variables $X$ and $Y$ have the same probability function or the same MGF, it doesn't mean $X = Y$.

Problem 16

An exam candidate was solving the following problem (Problem 12 in this chapter):

A machine has two components. One component is at work. The 2nd component sits idle but is activated after the first one has failed. The machine fails only after both components have failed. Let $X, Y$ represent each component's time until failure. Let $Z$ represent the machine's time until failure. $X, Y$ have the following moment generating functions:

$$M_X(t) = \frac{1}{1 - t}, \quad M_Y(t) = \frac{1}{1 - 2t}$$

Find $E(Z)$, $Var(Z)$, $E(Z^3)$. Assume $X, Y$ are independent.

This is his approach:

$X$ is an exponential random variable with mean 1. Using the formula $M_{aX}(t) = M_X(at)$:

$$M_{2X}(t) = M_X(2t) = \frac{1}{1 - 2t}$$

$$M_Y(t) = M_{2X}(t) = \frac{1}{1 - 2t} \;\Rightarrow\; Y = 2X$$

$$Z = X + Y = X + 2X = 3X$$

$$E(Z) = E(3X) = 3E(X) = 3$$

$$Var(Z) = Var(3X) = 9\,Var(X) = 9$$

$$E(Z^3) = E(27X^3) = 27E(X^3) = 27\int_0^{+\infty} x^3 e^{-x}\,dx$$

After a bunch of integration by parts:

$$\int_0^{+\infty} x^3 e^{-x}\,dx = 6$$

$$E(Z^3) = E(27X^3) = 27E(X^3) = 27\int_0^{+\infty} x^3 e^{-x}\,dx = 27(6) = 162$$

Explain why this approach is wrong.

Solution

The mistake is in this step:

$$M_Y(t) = M_{2X}(t) = \frac{1}{1 - 2t} \;\Rightarrow\; Y = 2X$$

As explained before, two random variables having the same MGF doesn't mean these two random variables are the same. $Y$ has the same distribution as $2X$, but it's not true that $Y = 2X$. Because $X$ and $Y$ are independent, $Z = X + Y$ does not behave like $3X$: the correct answers from Problem 12 are $Var(Z) = 5$ and $E(Z^3) = 90$, not 9 and 162.

Homework for you: #35, May 2000; #11, #27, Nov 2000
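The key point of Problems 15 and 16 can be made concrete with the two-coin example. The sketch below is only an illustration and assumes the two coins are flipped independently (the text does not state this explicitly). It enumerates the joint outcomes exactly and shows that the two variables share one distribution, and hence one MGF, yet are equal only half of the time.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
pmf_x = {0: half, 1: half}  # coin A: number of heads in one flip
pmf_y = {0: half, 1: half}  # coin B: number of heads in one flip

# Assumption: the two flips are independent, so joint probabilities multiply.
joint = {(x, y): pmf_x[x] * pmf_y[y] for x, y in product(pmf_x, pmf_y)}

same_distribution = pmf_x == pmf_y
prob_equal = sum(pr for (x, y), pr in joint.items() if x == y)

print(same_distribution)  # True: identical pmf, hence identical MGF
print(prob_equal)         # 1/2: X and Y agree only half of the time
```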

Chapter 34 Joint moment generating function

The joint moment generating function of $(X, Y)$ is defined as

$$M_{X,Y}(s, t) = E\big[e^{sX + tY}\big]$$

If $X, Y$ are discrete, then

$$M_{X,Y}(s, t) = E\big(e^{sX + tY}\big) = \sum_{x}\sum_{y} e^{sx + ty} f_{X,Y}(x, y)$$

If $X, Y$ are continuous, then

$$M_{X,Y}(s, t) = E\big[e^{sX + tY}\big] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{sx + ty} f_{X,Y}(x, y)\,dx\,dy$$

Let's focus on the case when $X, Y$ are continuous. If the joint MGF is ever tested, the continuous joint MGF is more likely to be tested.

Important properties of the joint MGF:

(1) If $M_{X,Y}(s, t)$ is finite in a rectangle containing $(0, 0)$, then the joint pdf $f_{X,Y}(x, y)$ is completely determined by $M_{X,Y}(s, t)$.

(2) $M_{X,Y}(s, 0) = M_X(s)$. By setting $t = 0$, the joint MGF becomes the $X$ marginal MGF.

(3) $M_{X,Y}(0, t) = M_Y(t)$. By setting $s = 0$, the joint MGF becomes the $Y$ marginal MGF.

(4) $M_{X,Y}(t, t) = M_{X+Y}(t)$. If we set $s = t$, then the joint MGF becomes $M_{X+Y}(t)$, the MGF of $X + Y$.

(5) $E(X^n Y^m) = \dfrac{\partial^{\,n+m}}{\partial s^n\,\partial t^m}M_{X,Y}(s, t)\Big|_{s=t=0}$

Once you know the definition of a joint MGF, the remaining work is doing double integration. You'll need to apply the process described in Chapter 26 to quickly and correctly complete the double integration.

Course 1 May 2003 #39 is the only problem about the joint MGF.
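Properties (2) and (5) can be illustrated on a small discrete example. The sketch below is only an illustration: the joint pmf is hypothetical (made up for this example, not taken from the text), and the mixed partial derivative at the origin is approximated with a central finite difference rather than computed symbolically. The function names are likewise made up.

```python
import math

# Hypothetical joint pmf of (X, Y), chosen only for illustration.
joint_pmf = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}


def joint_mgf(s, t):
    """M_{X,Y}(s, t) = sum over (x, y) of e^{sx + ty} * p(x, y)."""
    return sum(p * math.exp(s * x + t * y) for (x, y), p in joint_pmf.items())


def mixed_partial_at_origin(f, h=1e-3):
    """Central-difference approximation of d^2 f / (ds dt) at (0, 0)."""
    return (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h * h)


# Property (5) with n = m = 1: the mixed partial at the origin is E(XY).
e_xy_from_mgf = mixed_partial_at_origin(joint_mgf)
e_xy_direct = sum(p * x * y for (x, y), p in joint_pmf.items())

# Property (2): setting t = 0 gives the marginal MGF of X.
# For this pmf, P(X = 0) = 0.3 and P(X = 1) = 0.7.
marginal_from_joint = joint_mgf(0.5, 0.0)
marginal_direct = 0.3 + 0.7 * math.exp(0.5)

print(e_xy_from_mgf, e_xy_direct)
print(marginal_from_joint, marginal_direct)
```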

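Problem 1

Prove that $E(X^nY^m) = \left.\dfrac{\partial^{n+m}}{\partial s^n\,\partial t^m} M_{X,Y}(s,t)\right|_{s=t=0}$.

Solution

If $n = 1$ and $m = 0$: $E(X) = \left.\dfrac{\partial}{\partial s} M_{X,Y}(s,t)\right|_{s=t=0}$.

$$\frac{\partial}{\partial s} M_{X,Y}(s,t) = \frac{\partial}{\partial s} E\left(e^{sx+ty}\right) = E\left(\frac{\partial}{\partial s} e^{sx+ty}\right) = E\left(x e^{sx+ty}\right)$$

(In the above formula, $x, y$ are treated as constants.)

$$\left.\frac{\partial}{\partial s} M_{X,Y}(s,t)\right|_{s=t=0} = \left.E\left(x e^{sx+ty}\right)\right|_{s=t=0} = E\left(x e^{0}\right) = E(X)$$

If $n = 0$ and $m = 1$: $E(Y) = \left.\dfrac{\partial}{\partial t} M_{X,Y}(s,t)\right|_{s=t=0}$.

$$\frac{\partial}{\partial t} M_{X,Y}(s,t) = \frac{\partial}{\partial t} E\left(e^{sx+ty}\right) = E\left(\frac{\partial}{\partial t} e^{sx+ty}\right) = E\left(y e^{sx+ty}\right)$$

(In the above formula, $x, y$ are treated as constants.)

$$\left.\frac{\partial}{\partial t} M_{X,Y}(s,t)\right|_{s=t=0} = \left.E\left(y e^{sx+ty}\right)\right|_{s=t=0} = E\left(y e^{0}\right) = E(Y)$$

If $n = m = 1$: $E(XY) = \left.\dfrac{\partial^2}{\partial s\,\partial t} M_{X,Y}(s,t)\right|_{s=t=0}$.

$$\frac{\partial^2}{\partial s\,\partial t} M_{X,Y}(s,t) = \frac{\partial}{\partial s}\left[\frac{\partial}{\partial t} E\left(e^{sx+ty}\right)\right] = \frac{\partial}{\partial s} E\left(y e^{sx+ty}\right) = E\left(xy\,e^{sx+ty}\right)$$

(In the above formula, $x, y$ are treated as constants.)

$$\left.\frac{\partial^2}{\partial s\,\partial t} M_{X,Y}(s,t)\right|_{s=t=0} = \left.E\left(xy\,e^{sx+ty}\right)\right|_{s=t=0} = E(XY)$$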

If $n = 2$ and $m = 0$: $E(X^2) = \left.\dfrac{\partial^2}{\partial s^2} M_{X,Y}(s,t)\right|_{s=t=0}$.

$$\frac{\partial^2}{\partial s^2} M_{X,Y}(s,t) = \frac{\partial}{\partial s}\left[\frac{\partial}{\partial s} E\left(e^{sx+ty}\right)\right] = \frac{\partial}{\partial s} E\left(x e^{sx+ty}\right) = E\left(x^2 e^{sx+ty}\right)$$

(In the above formula, $x, y$ are treated as constants.)

$$\left.\frac{\partial^2}{\partial s^2} M_{X,Y}(s,t)\right|_{s=t=0} = \left.E\left(x^2 e^{sx+ty}\right)\right|_{s=t=0} = E(X^2)$$

If $n = 0$ and $m = 2$: $E(Y^2) = \left.\dfrac{\partial^2}{\partial t^2} M_{X,Y}(s,t)\right|_{s=t=0}$.

$$\frac{\partial^2}{\partial t^2} M_{X,Y}(s,t) = \frac{\partial}{\partial t}\left[\frac{\partial}{\partial t} E\left(e^{sx+ty}\right)\right] = \frac{\partial}{\partial t} E\left(y e^{sx+ty}\right) = E\left(y^2 e^{sx+ty}\right)$$

(In the above formula, $x, y$ are treated as constants.)

$$\left.\frac{\partial^2}{\partial t^2} M_{X,Y}(s,t)\right|_{s=t=0} = \left.E\left(y^2 e^{sx+ty}\right)\right|_{s=t=0} = E(Y^2)$$

Following this line of reasoning, we see that

$$E(X^nY^m) = \left.\frac{\partial^{n+m}}{\partial s^n\,\partial t^m} M_{X,Y}(s,t)\right|_{s=t=0}$$

Problem 2

$(X, Y)$ is uniformly distributed over $0 < y < x < 1$. Find $M_{X,Y}(s,t)$.

Solution

All you need to do is to find

$$E\left(e^{sX+tY}\right) = \iint_{0<y<x<1} e^{sx+ty} f_{X,Y}(x,y)\,dx\,dy$$

To complete this double integration, you'll use the procedure described in Chapter 26:
Determine the 2-D region.
Set up the outer integration.
Set up the inner integration.
Evaluate the double integration.

I won't draw the 2-D graph for you or show you the other steps; I'll just give you the result. However, you might want to work step by step. Make sure you understand how to do this double integration.

To complete this double integration, we need to find the joint pdf. Because $(X, Y)$ is uniformly distributed over $0 < y < x < 1$, we have $f_{X,Y}(x,y) = k$ where $k$ is a positive constant.

$$\iint_{0<y<x<1} f_{X,Y}(x,y)\,dx\,dy = 1, \qquad k\iint_{0<y<x<1} dx\,dy = 1$$

$$\iint_{0<y<x<1} dx\,dy = \text{Area of } (0<y<x<1) = \frac{1}{2}, \qquad k = 2$$

$$E\left(e^{sx+ty}\right) = \iint_{0<y<x<1} e^{sx+ty} f_{X,Y}(x,y)\,dx\,dy = 2\iint_{0<y<x<1} e^{sx+ty}\,dx\,dy$$

$$\iint_{0<y<x<1} e^{sx+ty}\,dx\,dy = \int_0^1\!\int_0^x e^{sx+ty}\,dy\,dx = \int_0^1 e^{sx}\left(\int_0^x e^{ty}\,dy\right)dx = \int_0^1 e^{sx}\,\frac{1}{t}\left(e^{xt}-1\right)dx = \frac{1}{t}\left[\frac{1}{s+t}\left(e^{t+s}-1\right) - \frac{1}{s}\left(e^{s}-1\right)\right]$$

$$M_{X,Y}(s,t) = \frac{2}{t}\left[\frac{1}{s+t}\left(e^{t+s}-1\right) - \frac{1}{s}\left(e^{s}-1\right)\right] \quad\text{where } s \ne 0,\ t \ne 0,\ s+t \ne 0$$

Problem 3

$(X, Y)$ has the following pdf:

$$f_{X,Y}(x,y) = k e^{-x-2y} \quad\text{where } 0 < x < y < \infty$$

Find $M_{X,Y}(s,t)$.

Solution

First, we need to find $k$.

$$\iint_{0<x<y<\infty} k e^{-x-2y}\,dx\,dy = k\iint_{0<x<y<\infty} e^{-x-2y}\,dx\,dy = 1$$

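$$\iint_{0<x<y<\infty} e^{-x-2y}\,dx\,dy = \int_0^\infty\!\int_0^y e^{-x-2y}\,dx\,dy = \int_0^\infty e^{-2y}\left(\int_0^y e^{-x}\,dx\right)dy = \int_0^\infty e^{-2y}\left(1-e^{-y}\right)dy = \frac{1}{2}-\frac{1}{3} = \frac{1}{6}, \qquad k = 6$$

$$E\left(e^{sx+ty}\right) = \iint_{0<x<y<\infty} 6e^{sx+ty}e^{-x-2y}\,dx\,dy = \int_0^\infty\!\int_0^y 6e^{(s-1)x}e^{(t-2)y}\,dx\,dy$$

$$= 6\int_0^\infty e^{(t-2)y}\,\frac{1}{s-1}\left[e^{(s-1)y}-1\right]dy \quad(\text{where } s \ne 1)$$

$$= \frac{6}{s-1}\left[\int_0^\infty e^{(s+t-3)y}\,dy - \int_0^\infty e^{(t-2)y}\,dy\right] = \frac{6}{s-1}\left[\int_0^\infty e^{-(3-s-t)y}\,dy - \int_0^\infty e^{-(2-t)y}\,dy\right]$$

$$= \frac{6}{s-1}\left[\frac{1}{3-s-t} - \frac{1}{2-t}\right] = \frac{6}{s-1}\left[\frac{1}{t-2} - \frac{1}{s+t-3}\right] \quad(\text{where } s+t<3,\ t<2,\ s \ne 1)$$

$$M_{X,Y}(s,t) = E\left(e^{sx+ty}\right) = \frac{6}{s-1}\left[\frac{1}{t-2} - \frac{1}{s+t-3}\right] \quad\text{where } s+t<3,\ t<2,\ s \ne 1$$

Putting the two fractions over a common denominator, this simplifies to $M_{X,Y}(s,t) = \dfrac{6}{(2-t)(3-s-t)}$. As a check, $M_{X,Y}(0,0) = \dfrac{6}{2 \cdot 3} = 1$.

Problem 4

Random variables $X$ and $Y$ have the following joint pdf:

$$f(x,y) = k \quad\text{where } 0 \le x \le 1,\ 0 \le y \le 1, \text{ and } x + y \le 1$$

Find the joint moment generating function of $X$ and $Y$.
Find the moment generating function for $X$ using the joint MGF.
Find the moment generating function for $Y$ using the joint MGF.

Solution

First, we need to find the constant $k$.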

$$\iint_A f(x,y)\,dx\,dy = 1$$

where $A$ represents the area $0 \le x \le 1$, $0 \le y \le 1$, and $x + y \le 1$.

$$\iint_A f(x,y)\,dx\,dy = 1, \qquad k\iint_A dx\,dy = 1$$

Since $\iint_A dx\,dy = \text{Area of } A$, we have $k = \dfrac{1}{\text{Area of } A}$.

You need to memorize the above result. That is, if $X$ and $Y$ are uniformly distributed over an area, then the joint pdf is:

$$f(x,y) = \frac{1}{\text{Area}}$$

If you draw the 2-D diagram for $0 \le x \le 1$, $0 \le y \le 1$, and $x + y \le 1$, you'll find that the area is 0.5. Then the joint pdf is simply $\dfrac{1}{0.5} = 2$.

Next, we'll find the joint MGF.

$$M_{X,Y}(s,t) = 2\int_0^1\!\int_0^{1-x} e^{sx+ty}\,dy\,dx = 2\int_0^1 e^{sx}\left(\int_0^{1-x} e^{ty}\,dy\right)dx = 2\int_0^1 e^{sx}\,\frac{1}{t}\left(e^{t(1-x)}-1\right)dx$$

$$= \frac{2}{t}\left[e^{t}\int_0^1 e^{(s-t)x}\,dx - \int_0^1 e^{sx}\,dx\right] = 2\,\frac{t e^{s} - s e^{t} + (s-t)}{s\,t\,(s-t)}$$

If we want to find the MGF for $X$ or $Y$, we don't need to start from scratch. We can use $M_{X,Y}(s,t)$.

$$M_X(s) = M_{X,Y}(s, t=0) = \lim_{t \to 0} 2\,\frac{t e^{s} - s e^{t} + (s-t)}{s\,t\,(s-t)}$$

Notice that $t e^{s} - s e^{t} + (s-t) \to 0$ and $s\,t\,(s-t) \to 0$ as $t \to 0$, so the limit has the form $\dfrac{0}{0}$; we'll need to use L'Hôpital's rule. So we have

$$\frac{d}{dt}\,2\left[t e^{s} - s e^{t} + (s-t)\right] = 2\left(e^{s} - s e^{t} - 1\right)$$

$$\frac{d}{dt}\,s\,t\,(s-t) = s\,(s-2t)$$

$$\lim_{t \to 0} 2\,\frac{t e^{s} - s e^{t} + (s-t)}{s\,t\,(s-t)} = \lim_{t \to 0} \frac{2\left(e^{s} - s e^{t} - 1\right)}{s\,(s-2t)} = \frac{2\left(e^{s} - s - 1\right)}{s^2}$$

$$M_X(s) = M_{X,Y}(s, t=0) = \frac{2\left(e^{s} - s - 1\right)}{s^2}$$

Similarly,

$$M_Y(t) = M_{X,Y}(s=0, t) = \frac{2\left(e^{t} - t - 1\right)}{t^2}$$
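A closed-form joint MGF is easy to get wrong by a sign, so it can be worth checking against a numerical integral. The sketch below does this for Problem 4 under stated assumptions: the function names are illustrative, the outer integral uses a plain midpoint rule with an arbitrary number of steps, and the test point $(s, t) = (0.7, -0.3)$ is chosen arbitrarily.

```python
import math

def mgf_closed_form(s, t):
    """Closed form from Problem 4; needs s != 0, t != 0 and s != t."""
    return 2 * (t * math.exp(s) - s * math.exp(t) + (s - t)) / (s * t * (s - t))

def mgf_numeric(s, t, steps=400):
    """Midpoint-rule estimate of 2 * integral of exp(s*x + t*y) over x + y <= 1."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # Inner integral over 0 <= y <= 1 - x, evaluated exactly.
        inner = (math.exp(t * (1 - x)) - 1) / t
        total += math.exp(s * x) * inner * h
    return 2 * total

def marginal_mgf_x(s):
    """M_X(s) = 2 (e^s - s - 1) / s^2, the t -> 0 limit of the joint MGF."""
    return 2 * (math.exp(s) - s - 1) / (s * s)

closed = mgf_closed_form(0.7, -0.3)
numeric = mgf_numeric(0.7, -0.3)
near_limit = mgf_closed_form(0.7, 1e-6)   # joint MGF with t very close to 0
limit = marginal_mgf_x(0.7)
print(closed, numeric, near_limit, limit)
```

If the two pairs of numbers agree to several decimal places, both the joint MGF and the L'Hôpital step for $M_X(s)$ are consistent with the definition.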

Chapter 35 Markov's inequality, Chebyshev inequality

This topic is not on the SOA syllabus. However, SOA annoyingly tested Chebyshev Inequality in May and September 2005. So you might want to learn this topic.

Please note that Markov's inequality and Chebyshev Inequality are almost never used to estimate probability for any real world applications. Given today's computing power, it's never necessary to estimate probability using Markov's inequality and Chebyshev Inequality. Markov's inequality and Chebyshev Inequality are important only theoretically. They give us a crude estimate of probability.

Now let's move on to the topic.

Markov's inequality

If a non-negative random variable $X$ (i.e. $X \ge 0$) has a finite mean (i.e. if $E(X)$ exists), then

$$P(X \ge a) \le \frac{E(X)}{a}, \quad\text{for all } a > 0$$

Markov's inequality really says that if a non-negative random variable has a finite mean, then chances are slim that it will take on a huge value.

It's easy to prove Markov's inequality. We'll prove Markov's inequality when $X$ is a continuous non-negative random variable. The proof is similar if $X$ is a discrete non-negative random variable.

Proof:

$$E(X) = \int_0^{+\infty} x f(x)\,dx = \int_0^{a} x f(x)\,dx + \int_a^{+\infty} x f(x)\,dx \ge \int_a^{+\infty} x f(x)\,dx$$

$$\int_a^{+\infty} x f(x)\,dx \ge \int_a^{+\infty} a f(x)\,dx = a\int_a^{+\infty} f(x)\,dx = a\,P(X \ge a)$$

$$E(X) \ge a\,P(X \ge a), \qquad P(X \ge a) \le \frac{E(X)}{a}$$

Markov's inequality is really simple. The trouble, however, is to memorize the direction of the inequality. It's easy to get confused in the exam and use a wrong formula such as $E(X) \le a\,P(X \ge a)$ or $P(X \ge a) \ge \dfrac{E(X)}{a}$.
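One way to keep the direction straight is to compare the bound with a tail probability you can compute exactly. The sketch below assumes an exponential random variable with rate 1 (so $E(X) = 1$), which is my choice of example and not part of the text; the bound is capped at 1 because a probability bound above 1 carries no information.

```python
import math

def markov_bound(mean, a):
    """Markov upper bound on P(X >= a) for a non-negative X, capped at 1."""
    return min(1.0, mean / a)

def exponential_tail(rate, a):
    """Exact P(X >= a) for an exponential random variable with the given rate."""
    return math.exp(-rate * a)

# Compare the exact tail with the Markov bound for X ~ Exponential(rate=1).
rows = [(a, exponential_tail(1.0, a), markov_bound(1.0, a))
        for a in (0.5, 1, 2, 5, 10)]
for a, exact, bound in rows:
    print(f"a={a:>4}: exact={exact:.4f}  Markov bound={bound:.4f}")
```

In every row the exact probability sits below the bound, often far below it, which matches the remark that these inequalities give only a crude estimate.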

You have two ways to memorize Markov's inequality: one way is to memorize $E(X) \ge a\,P(X \ge a)$; the other is to memorize $P(X \ge a) \le \dfrac{E(X)}{a}$.

Method 1: Memorize $E(X) \ge a\,P(X \ge a)$.

To memorize $E(X) \ge a\,P(X \ge a)$, remember two rules:

Rule 1: The two inequality symbols both point to the right. In other words, you need to use $\ge$ and $\ge$. Don't write:

$E(X) \ge a\,P(X \le a)$ (Wrong. Here the 1st inequality symbol points to the right; the 2nd points to the left.)
$E(X) \le a\,P(X \ge a)$ (Wrong. Here the 1st inequality symbol points to the left; the 2nd points to the right.)
$E(X) \le a\,P(X \le a)$ (Wrong. Here the two inequality symbols both point to the left.)

Rule 2: If we cut off $X \in [0, a]$ and use only $X \ge a$ to calculate the mean $E(X)$, we'll get a lower bound of the mean. So we should have the expression $E(X) \ge$ something, not $E(X) \le$ something. This is why we have

$$E(X) \ge a\,P(X \ge a)$$

And we don't have

$$E(X) \le a\,P(X \ge a) \quad\text{(Wrong!)}$$

Method 2: Memorize $P(X \ge a) \le \dfrac{E(X)}{a}$.

How to memorize: The 1st inequality symbol points to the right; the 2nd inequality symbol points to the left; the letter $a$ is trapped in between. The letter $a$ is being pointed at from both sides. In other words, we need to write $\ge a \le$.

This way, you won't write:

$P(X \ge a) \ge \dfrac{E(X)}{a}$ (Wrong! Two inequality symbols point to the right.)
$P(X \le a) \le \dfrac{E(X)}{a}$ (Wrong! Two inequality symbols point to the left.)

In both wrong versions, the two inequality symbols point in the same direction.

To get a feel of the formula $P(X \ge a) \le \dfrac{E(X)}{a}$, we'll let $a \to 0$. If $a \to 0$, then $\dfrac{E(X)}{a} \to +\infty$ because $E(X)$ is finite. Since $P(X \ge a) \le 1$, evidently we can't have $P(X \ge a) \ge \dfrac{E(X)}{a} \to +\infty$. So we'll need to write $P(X \ge a) \le \dfrac{E(X)}{a}$.

Hope by now you can write out Markov's inequality correctly. Next, let's move onto Chebyshev Inequality.

Chebyshev Inequality

If a random variable $X$ has mean $\mu$ and variance $\sigma^2$, then

$$P\left(|X-\mu| \ge c\right) \le \frac{\sigma^2}{c^2}, \quad\text{for all } c > 0$$

We can easily prove Chebyshev Inequality using Markov Inequality. Consider a new random variable $Y = (X-\mu)^2$. Notice that $Y$ is always non-negative. Using Markov Inequality and setting $a = c^2$, we have:

$$E(Y) \ge c^2\,P\left(Y \ge c^2\right)$$

$$E(Y) = E\left[(X-\mu)^2\right] = Var(X)$$

$$P\left(Y \ge c^2\right) = P\left((X-\mu)^2 \ge c^2\right) = P\left(|X-\mu| \ge c\right)$$

Then it follows:

$$Var(X) \ge c^2\,P\left(|X-\mu| \ge c\right), \qquad P\left(|X-\mu| \ge c\right) \le \frac{Var(X)}{c^2}$$
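The bound just proved can be compared with an exact tail probability on a small discrete example. The sketch below assumes a fair six-sided die, which is my own illustration and not a problem from the text, and uses exact fractions so no rounding is involved.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)                              # probability of each face

mean = sum(p * x for x in faces)                # 7/2
var = sum(p * (x - mean) ** 2 for x in faces)   # 35/12

def exact_tail(c):
    """Exact P(|X - mean| >= c) for the fair die."""
    return sum(p for x in faces if abs(x - mean) >= c)

def chebyshev_bound(c):
    """Chebyshev upper bound Var(X) / c^2 on the same probability."""
    return var / c ** 2

for c in (Fraction(3, 2), 2, Fraction(5, 2)):
    print(c, exact_tail(c), chebyshev_bound(c))
```

For $c = 2$ the exact probability is $1/3$ while the bound is $35/48$, so the inequality holds with plenty of room.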

To memorize $P\left(|X-\mu| \ge c\right) \le \dfrac{Var(X)}{c^2}$, remember that the letter $c$ is being pointed at from both sides. In other words, we need to write $\ge c \le$.

There's another common expression for Chebyshev Inequality. If we let $c = k\sigma$, where $k$ is a positive constant and $\sigma$ is the standard deviation of $X$, then

$$P\left(|X-\mu| \ge k\sigma\right) \le \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}$$

2nd expression of Chebyshev Inequality:

$$P\left(|X-\mu| \ge k\sigma\right) \le \frac{1}{k^2}$$

The probability is at most $\dfrac{1}{k^2}$ that $X$ is at least $k$ standard deviations away from its mean.

Problem 1

$X$ is uniformly distributed over $[-1, 1]$. Calculate $P\left(|X| \ge a\right)$ where $a > 0$ using the following 3 methods:
The exact method
Markov Inequality
Chebyshev Inequality

Solution

The exact method

$X$ is uniformly distributed over $[-1, 1]$. The pdf is $f(x) = \dfrac{1}{1-(-1)} = \dfrac{1}{2}$. The cdf is

$$P(X \le x) = \frac{x+1}{2} \quad\text{where } -1 \le x \le 1$$

$$P\left(|X| \ge a\right) = 1 - P\left(|X| < a\right) = 1 - P\left(|X| \le a\right)$$

Please note that $X$ is continuous. As a result, $P\left(|X| \ge a\right) = P\left(|X| > a\right)$ and $P\left(|X| < a\right) = P\left(|X| \le a\right)$.

$$P\left(|X| \le a\right) = P(-a \le X \le a) = F(a) - F(-a) = \frac{a+1}{2} - \frac{-a+1}{2} = a$$

$$P\left(|X| \ge a\right) = 1 - P\left(|X| \le a\right) = 1 - a \quad\text{where } a \le 1$$

If $a > 1$, then $P\left(|X| \ge a\right) = 0$.

Markov Inequality

If we cut off $|X| \in [0, a]$ when calculating the mean $E|X|$, we'll get a lower bound of $E|X|$.

$$E|X| \ge a\,P\left(|X| \ge a\right), \qquad P\left(|X| \ge a\right) \le \frac{E|X|}{a}$$

Applying the general formula $E[y(x)] = \int_{-\infty}^{+\infty} y(x) f(x)\,dx$ and setting $y(x) = |x|$, we have

$$E|X| = \int_{-1}^{1} |x| f(x)\,dx = \frac{1}{2}\int_{-1}^{1} |x|\,dx = \frac{1}{2}\left[\int_{-1}^{0} (-x)\,dx + \int_{0}^{1} x\,dx\right] = \frac{1}{2}$$

$$P\left(|X| \ge a\right) \le \frac{E|X|}{a} = \frac{1}{2a}$$

If $a \le \dfrac{1}{2}$, then $\dfrac{1}{2a} \ge 1$, and the bound $P\left(|X| \ge a\right) \le \dfrac{1}{2a}$ is at least 1. It doesn't give us any new knowledge. Even without knowing Markov Inequality, we know $P\left(|X| \ge a\right) \le 1$; the probability of anything must not exceed one.

Chebyshev Inequality

$$P\left(|X - E(X)| \ge a\right) \le \frac{Var(X)}{a^2}$$

$X$ is uniformly distributed over $[-1, 1]$. We have:

$$E(X) = 0, \qquad Var(X) = \left(\frac{2}{2\sqrt{3}}\right)^2 = \frac{1}{3}$$

The general formula is: if the random variable $X$ is uniformly distributed over $[a, b]$ where $b > a$, then

$$f(x) = \frac{1}{b-a}, \qquad E(X) = \frac{b+a}{2}, \qquad \sigma_X = \frac{b-a}{2\sqrt{3}}$$

$$P\left(|X - E(X)| \ge a\right) \le \frac{Var(X)}{a^2}, \qquad P\left(|X| \ge a\right) \le \frac{1}{3a^2}$$

If $a \le \dfrac{1}{\sqrt{3}}$, then $3a^2 \le 1$ and $\dfrac{1}{3a^2} \ge 1$. Then the bound $P\left(|X| \ge a\right) \le \dfrac{1}{3a^2}$ is at least 1. This doesn't tell us anything new.

Problem 2

Using Chebyshev Inequality, find $k$ that will guarantee that the probability is 0.95 that the deviation of $X$ from $E(X)$ is no more than $k\sigma$.

Solution

"Find $k$ that will guarantee that the probability is 0.95 that the deviation of $X$ from $E(X)$ is no more than $k\sigma$" means the following:

$$P\left(|X - E(X)| \le k\sigma\right) \ge 0.95$$

$$P\left(|X - E(X)| \le k\sigma\right) = 1 - P\left(|X - E(X)| > k\sigma\right) \ge 0.95, \qquad P\left(|X - E(X)| > k\sigma\right) \le 0.05$$

Since $P\left(|X - E(X)| \ge k\sigma\right) = P\left(|X - E(X)| = k\sigma\right) + P\left(|X - E(X)| > k\sigma\right)$, we'll always have $P\left(|X - E(X)| > k\sigma\right) \le P\left(|X - E(X)| \ge k\sigma\right)$.

So if we can satisfy $P\left(|X - E(X)| \ge k\sigma\right) \le 0.05$, we'll guarantee that $P\left(|X - E(X)| > k\sigma\right) \le 0.05$. Mathematically, this is:

$$P\left(|X - E(X)| > k\sigma\right) \le P\left(|X - E(X)| \ge k\sigma\right) \le 0.05$$

Based on Chebyshev Inequality, we have:

$$P\left(|X - E(X)| \ge k\sigma\right) \le \frac{Var(X)}{(k\sigma)^2} = \frac{\sigma^2}{(k\sigma)^2} = \frac{1}{k^2}$$

So we need to make sure $\dfrac{1}{k^2} \le 0.05$. This gives us $k \ge \sqrt{20}$.

Problem 3

After SOA switched from pencil-and-paper testing to computer-based testing (CBT) for Exam P, some candidates support CBT and others disapprove. Those who support CBT claim that CBT can be offered more often than pencil-and-paper testing. Those who disapprove of CBT cite that they have to pay higher exam fees to take CBT, that they have to drive longer to find a CBT center, and that computer screens often freeze during the exam.

SOA asks you to find the supporting ratio (i.e. the % of the Exam P candidates who favor CBT) of the Exam P candidate population. You know that the Exam P population is huge (over 6,000 people can take one exam) and that there's no way you can find the true supporting ratio of the entire Exam P candidate population. As a result, you decide to randomly survey some Exam P candidates and find the supporting ratio of your sample. You are thinking of using the supporting ratio of your sample as an estimate of the true supporting ratio of the population.

SOA wants to ensure that chances are at least 95% that your sample mean will not differ from the true mean of the population by more than 0.01. How many Exam P candidates do you need to survey?

Solution

Let $p$ represent the true supporting ratio of the Exam P candidate population. Let $\hat{p}$ represent the supporting ratio of the sample. Suppose we sample $n$ Exam P candidates. Of these $n$ candidates surveyed, $m$ candidates support CBT. Then $\hat{p} = \dfrac{m}{n}$. Then $m$ has a binomial distribution with parameters $n$ and $p$.

$$E(m) = np \quad\text{and}\quad Var(m) = npq = np(1-p)$$

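We need to find $n$ such that

$$P\left(|\hat{p} - p| \le 0.01\right) \ge 0.95$$

$$P\left(|\hat{p} - p| \le 0.01\right) = 1 - P\left(|\hat{p} - p| > 0.01\right)$$

Because $P\left(|\hat{p} - p| > 0.01\right) \le P\left(|\hat{p} - p| \ge 0.01\right)$, we have:

$$1 - P\left(|\hat{p} - p| > 0.01\right) \ge 1 - P\left(|\hat{p} - p| \ge 0.01\right)$$

So as long as we can ensure that $1 - P\left(|\hat{p} - p| \ge 0.01\right) \ge 0.95$, we'll be fine. This gives us

$$P\left(|\hat{p} - p| \ge 0.01\right) \le 0.05$$

Using Chebyshev Inequality, we have:

$$P\left(|\hat{p} - E(\hat{p})| \ge 0.01\right) \le \frac{Var(\hat{p})}{0.01^2}$$

$$\hat{p} = \frac{m}{n}, \qquad E(\hat{p}) = E\left(\frac{m}{n}\right) = \frac{E(m)}{n} = \frac{np}{n} = p, \qquad Var(\hat{p}) = Var\left(\frac{m}{n}\right) = \frac{Var(m)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$$

$$P\left(|\hat{p} - p| \ge 0.01\right) \le \frac{Var(\hat{p})}{0.01^2} = \frac{p(1-p)}{0.01^2\,n}$$

So we need to ensure that $\dfrac{p(1-p)}{0.01^2\,n} \le 0.05$. This gives us

$$n \ge \frac{p(1-p)}{0.01^2\,(0.05)}$$

Are we stuck? No. We don't know $p(1-p)$, but we can find the upper bound of $p(1-p)$:

$$p(1-p) = p - p^2 = \frac{1}{4} - \left(p^2 - p + \frac{1}{4}\right) = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4}$$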

So the max value of $\dfrac{p(1-p)}{0.01^2\,(0.05)}$ is $\dfrac{1/4}{0.01^2\,(0.05)} = 50{,}000$. If we let $n$ take on this value, we can guarantee that $P\left(|\hat{p} - p| \ge 0.01\right) \le 0.05$.

So if we survey 50,000 or more exam candidates and find the supporting ratio, chances are at most 5% that the sample supporting ratio will differ from the true supporting ratio by 0.01 or more.

General formula: We want to conduct a survey to estimate the percentage of people who do something. We want to make sure that the probability is at least $b$ that the percentage found from the sample differs from the true percentage in the population by no more than $a$. Based on Chebyshev Inequality, how many people do we need to survey?

Solution

Let $p$ represent the true percentage of the people who do something. Let $\hat{p}$ represent the percentage found in the survey. Suppose we sample $n$ people, and $m$ of the $n$ people surveyed do something. We are asked to find $n$ such that $P\left(|\hat{p} - p| \le a\right) \ge b$.

$$\hat{p} = \frac{m}{n}, \qquad E(\hat{p}) = E\left(\frac{m}{n}\right) = \frac{E(m)}{n} = \frac{np}{n} = p$$

$$Var(\hat{p}) = Var\left(\frac{m}{n}\right) = \frac{Var(m)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n} \le \frac{1}{4n}$$

Because $p = E(\hat{p})$, $P\left(|\hat{p} - p| \le a\right) \ge b$ is equivalent to $P\left(|\hat{p} - E(\hat{p})| \le a\right) \ge b$.

$$P\left(|\hat{p} - E(\hat{p})| \le a\right) = 1 - P\left(|\hat{p} - E(\hat{p})| > a\right) \ge 1 - P\left(|\hat{p} - E(\hat{p})| \ge a\right)$$

So if we make sure $1 - P\left(|\hat{p} - E(\hat{p})| \ge a\right) \ge b$, we'll have $P\left(|\hat{p} - p| \le a\right) \ge b$.

Using Chebyshev Inequality, we have:

$$1 - P\left(|\hat{p} - E(\hat{p})| \ge a\right) \ge 1 - \frac{Var(\hat{p})}{a^2} \ge 1 - \frac{1}{4na^2}$$

To ensure that $1 - P\left(|\hat{p} - E(\hat{p})| \ge a\right) \ge b$, we need to have:

$$1 - \frac{1}{4na^2} \ge b, \qquad n \ge \frac{1}{4a^2(1-b)}$$

So we need to survey at least $\dfrac{1}{4a^2(1-b)}$ people.

Problem 4

Redo Problem 3 using normal approximation.

Solution

Just as before, we need to make sure

$$P\left(|\hat{p} - p| \ge 0.01\right) \le 0.05$$

Next, we'll use normal approximation to calculate $P\left(|\hat{p} - p| \ge 0.01\right)$. We assume that the random variable $\hat{p}$ is approximately normally distributed. As a result, $\hat{p} - E(\hat{p})$ is approximately normal.

$$E\left[\hat{p} - E(\hat{p})\right] = E(\hat{p}) - E(\hat{p}) = 0$$

Remember $E(\hat{p})$ is a constant. From the last problem, we know that $E(\hat{p}) = p$. So the normal random variable $\hat{p} - E(\hat{p})$ has a mean of zero.

The variance of $\hat{p} - E(\hat{p})$ is:

$$Var\left[\hat{p} - E(\hat{p})\right] = Var(\hat{p}) + Var\left[E(\hat{p})\right] = Var(\hat{p})$$

Remember $E(\hat{p})$ is a constant. So its variance is zero. From the last problem, we know that

$$Var(\hat{p}) = Var\left(\frac{m}{n}\right) = \frac{p(1-p)}{n} \le \frac{1}{4n}$$

So we'll use $\dfrac{1}{4n}$ as the variance of $\hat{p}$. This is the worst-case scenario. The $n$ calculated based on this largest variance is safe.

Because $\hat{p} - E(\hat{p})$ is approximately normal with mean zero, its distribution is symmetric about zero and we have

$$P\left(|\hat{p} - p| \ge 0.01\right) = P\left(|\hat{p} - E(\hat{p})| \ge 0.01\right) = 2P\left\{\hat{p} - E(\hat{p}) \ge 0.01\right\}$$

$$P\left\{\hat{p} - E(\hat{p}) \ge 0.01\right\} = 1 - P\left\{\hat{p} - E(\hat{p}) < 0.01\right\} = 1 - \Phi\left(\frac{0.01 - E\left[\hat{p} - E(\hat{p})\right]}{\sqrt{Var\left[\hat{p} - E(\hat{p})\right]}}\right) = 1 - \Phi\left(\frac{0.01}{\sqrt{\dfrac{1}{4n}}}\right) = 1 - \Phi\left(0.02\sqrt{n}\right)$$

$$P\left(|\hat{p} - p| \ge 0.01\right) = 2P\left\{\hat{p} - E(\hat{p}) \ge 0.01\right\} = 2\left[1 - \Phi\left(0.02\sqrt{n}\right)\right]$$

$$P\left(|\hat{p} - p| \ge 0.01\right) \le 0.05 \;\Rightarrow\; 2\left[1 - \Phi\left(0.02\sqrt{n}\right)\right] \le 0.05, \quad \Phi\left(0.02\sqrt{n}\right) \ge 0.975, \quad 0.02\sqrt{n} \ge 1.96, \quad n \ge 9{,}604$$

We see that normal approximation gives us a far smaller $n$ than does Chebyshev Inequality.

Problem 5

A poll is conducted among Exam P candidates to estimate the proportion of Exam P candidates who use Deeper Understanding, Faster Calc (the Guo manual). We want to make sure that the probability is at least 80% that the sample proportion will not differ from the true proportion by more than 0.1. How many people do we need to survey? Use Chebyshev Inequality.

Solution

We want $P\left(|\hat{p} - p| \le 0.1\right) \ge 0.8$. Here $a = 0.1$ and $b = 0.8$.

$$n \ge \frac{1}{4a^2(1-b)} = \frac{1}{4\,(0.1)^2\,(1-0.8)} = 125$$

Problem 6

Let $X$ represent the hours it takes a randomly chosen Exam P candidate to prepare for Exam P. Suppose we know, from other information, that the variance of $X$ is no more than 40, i.e. $Var(X) \le 40$. How large is the sample required to make sure that the probability is at least 0.95 that the sample mean differs from the true mean $E(X)$ by no more than 0.5?

Solution

Here we can't use the formula $n \ge \dfrac{1}{4a^2(1-b)}$. Here we are interested in finding the average # of hours it takes an Exam P candidate to prepare for Exam P. The # of hours is not a binomial random variable, so that formula doesn't apply.

The solution is still simple. Assume we sample $n$ people. Then

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$

Here $X_1, X_2, \dots, X_n$ are independent and identically distributed.

$$E(\bar{X}) = E\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) = \frac{nE(X)}{n} = E(X)$$

$$Var(\bar{X}) = Var\left(\frac{X_1 + X_2 + \dots + X_n}{n}\right) = \frac{nVar(X)}{n^2} = \frac{Var(X)}{n}$$

We need to make sure $P\left(|\bar{X} - E(X)| \le 0.5\right) \ge 0.95$. Because $E(\bar{X}) = E(X)$, we need to ensure that

$$P\left(|\bar{X} - E(\bar{X})| \le 0.5\right) \ge 0.95$$

$$P\left(|\bar{X} - E(\bar{X})| \le 0.5\right) = 1 - P\left(|\bar{X} - E(\bar{X})| > 0.5\right) \ge 1 - P\left(|\bar{X} - E(\bar{X})| \ge 0.5\right)$$

So it is enough to have

$$1 - P\left(|\bar{X} - E(\bar{X})| \ge 0.5\right) \ge 0.95, \quad\text{i.e.}\quad P\left(|\bar{X} - E(\bar{X})| \ge 0.5\right) \le 0.05$$

Using Chebyshev Inequality, we have:

$$P\left(|\bar{X} - E(\bar{X})| \ge 0.5\right) \le \frac{Var(\bar{X})}{0.5^2} = \frac{Var(X)}{0.5^2\,n} \le \frac{40}{0.5^2\,n}$$

We need to make sure

$$1 - \frac{Var(X)}{0.5^2\,n} \ge 1 - \frac{40}{0.5^2\,n} \ge 0.95, \qquad n \ge 3{,}200$$

In general, to ensure $P\left(|\bar{X} - E(\bar{X})| \le a\right) \ge b$, we need to survey at least $n \ge \dfrac{Var(X)}{a^2(1-b)}$ people based on Chebyshev Inequality.

Problem 7

We know that $Var(X) \le 1$. What's the sample size to ensure that $P\left(|\bar{X} - E(\bar{X})| \le 0.2\right) \ge 0.95$?

Solution

$$n \ge \frac{Var(X)}{a^2(1-b)} = \frac{1}{0.2^2\,(1-0.95)} = 500$$

So we need to have a sample size of at least 500.
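The sample-size formulas from Problems 3 through 7 can be collected into three small helpers. This is a sketch with made-up function names; inputs are passed as strings or integers and converted to exact fractions so that results such as 50,000 are not pushed up by floating-point noise before rounding up. The normal-approximation helper uses the worst-case variance $1/(4n)$ and the standard library's `statistics.NormalDist` for the quantile.

```python
import math
from fractions import Fraction
from statistics import NormalDist

def n_chebyshev_proportion(a, b):
    """Smallest n with n >= 1 / (4 a^2 (1 - b)), for estimating a proportion."""
    a, b = Fraction(a), Fraction(b)
    return math.ceil(1 / (4 * a * a * (1 - b)))

def n_chebyshev_mean(var_bound, a, b):
    """Smallest n with n >= Var(X) / (a^2 (1 - b)), for estimating a mean."""
    v, a, b = Fraction(var_bound), Fraction(a), Fraction(b)
    return math.ceil(v / (a * a * (1 - b)))

def n_normal_proportion(a, b):
    """Normal approximation: 2 a sqrt(n) >= z, where Phi(z) = 1 - (1 - b) / 2."""
    z = NormalDist().inv_cdf(1 - (1 - float(Fraction(b))) / 2)
    return math.ceil((z / (2 * float(Fraction(a)))) ** 2)

print(n_chebyshev_proportion("0.01", "0.95"))  # Problem 3
print(n_normal_proportion("0.01", "0.95"))     # Problem 4
print(n_chebyshev_proportion("0.1", "0.8"))    # Problem 5
print(n_chebyshev_mean(40, "0.5", "0.95"))     # Problem 6
print(n_chebyshev_mean(1, "0.2", "0.95"))      # Problem 7
```

The gap between the first two results shows how conservative Chebyshev Inequality is compared with the normal approximation.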

Chapter 36 Study Note "Risk and Insurance" explained

This chapter explains the study note titled "Risk and Insurance" by Anderson and Brown.

Deductible, benefit limit

An insurance contract has a deductible $d$ per loss and maximum payment (called benefit limit) $u$ per loss. Let $X$ represent the loss incurred by the policyholder. Let $Y$ represent the payment made by the insurer to the policyholder. Then

$$Y = \begin{cases} 0 & \text{if } 0 \le X \le d \\ X - d & \text{if } d \le X \le d + u \\ u & \text{if } X \ge d + u \end{cases}$$

Example 1

You bought an insurance policy. The deductible is $200 per loss. The insurance company will pay a maximum of $5,000 per loss. One day you suffered a loss of $150. How much will the insurer pay you? How much do you have to pay out of your own pocket to cover your loss?

Solution

Your loss is less than the deductible. As a result, the insurance benefit won't kick in. You need to pay $150 to cover the loss. The insurance company pays you nothing.

Example 2

You bought an insurance policy. The deductible is $200 per loss. The insurance company will pay a maximum of $5,000 per loss. One day you suffered a loss of $600. How much will the insurer pay you? How much do you have to pay out of your own pocket to cover your loss?

Solution

You will pay the 1st $200. After you have met the deductible, the insurance benefits will kick in. The insurance company will pay you $600-$200=$400.

Example 3

You bought an insurance policy. The deductible is $200 per loss. The insurance company will pay a maximum of $5,000 per loss. One day you suffered a loss of $6,000. How much will the insurer pay you? How much do you have to pay out of your own pocket to cover your loss?

Solution

You will pay the 1st $200 and then the insurance benefit will kick in. If there is no cap on how much the insurance company will pay you per loss incident, you'll get $6,000-$200=$5,800 from the insurer. However, the benefits payment is capped at $5,000 per loss. As a result, the insurance company will pay you only $5,000, not $5,800. You'll need to cover the remaining loss $5,800-$5,000=$800 with your own money. Your total out-of-pocket cost is $200+$800=$1,000.

Example 4

A car owner has a 70% chance of no accidents in a year, 30% chance of having only one accident in a year, and 0% chance of having more than one accident in a year. If there's an accident, the loss amount (also called "severity") is a random variable with the following distribution:

Severity      20     60     150    200
Probability   0.10   0.25   0.30   0.35

There's an annual deductible of $50. The annual maximum payment by the insurer is $145. Calculate
(1) Annual expected loss incurred by a car owner
(2) Standard deviation of the annual loss incurred by the car owner
(3) Annual expected payment made by the insurance company to a car owner
(4) Standard deviation of the payment made by the insurance company to a car owner
(5) Annual expected cost that the insured car owner must cover with his own money
(6) Standard deviation of the cost that the insured car owner must cover with his own money
(7) Correlation coefficient between the insurance company's annual payment and the insured's annual out-of-pocket cost to cover the loss.

Solution

Questions (1) and (2)

Let $X$ represent the annual loss incurred by a car owner.

f(x)                   x
70.0%                  0
30%(0.10) = 3.0%       20
30%(0.25) = 7.5%       60
30%(0.30) = 9.0%       150
30%(0.35) = 10.5%      200

In problems like this one, always thoroughly list all of the possibilities. In addition, make sure the total probabilities add up to one. Here 70%+3%+7.5%+9%+10.5%=100%. Good.

$$E(X) = \sum x f(x) = 0(70\%) + 20(3\%) + 60(7.5\%) + 150(9\%) + 200(10.5\%) = 39.6$$

$$Var(X) = \sum \left[x - E(X)\right]^2 f(x) = (0-39.6)^2(70\%) + (20-39.6)^2(3\%) + (60-39.6)^2(7.5\%) + (150-39.6)^2(9\%) + (200-39.6)^2(10.5\%) = 4{,}938.84$$

$$\sigma_X = \sqrt{Var(X)} = 70.28$$

To avoid the above calculations, scale $f(x)$ to an integer and enter the data pairs of $x, f(x)$ into the BA II Plus 1-V Statistics Worksheet.

Questions (3) and (4)

Let $Y$ represent the annual payment made by the insurer to the insured. Then

f(y)                   y                  when
70.0%                  0                  x = 0
30%(0.10) = 3.0%       0                  x = 20
30%(0.25) = 7.5%       60 - 50 = 10       x = 60
30%(0.30) = 9.0%       150 - 50 = 100     x = 150
30%(0.35) = 10.5%      145                x = 200

When the loss $X = 200$, the insurer will pay $150 if there's no benefit limit. However, since there's a limit of $145, the insurer's payment is capped at $145. The distribution of $Y$ can be simplified as follows:

y       0      10      100    145
f(y)    73%    7.50%   9%     10.50%

Using BA II Plus or memorized formulas, we have:

$$E(Y) = 24.975, \qquad \sigma_Y = 49.91$$

Questions (5) and (6)

Let $Z$ represent how much money the insured needs to take out of his own pocket to cover his annual loss.

f(z)                   z      when
70.0%                  0      x = 0
30%(0.10) = 3.0%       20     x = 20
30%(0.25) = 7.5%       50     x = 60
30%(0.30) = 9.0%       50     x = 150
30%(0.35) = 10.5%      55     x = 200

Please note that when $x = 200$, the insurer will pay only $145. As a result, the insured needs to pay the remainder of $55 to cover his loss. The distribution of $Z$ can be simplified as follows:

z       0      20     50      55
f(z)    70%    3%     16.5%   10.50%

Using BA II Plus or memorized formulas, we have:

$$E(Z) = 14.625, \qquad \sigma_Z = 22.98$$

Question (7)

We are asked to calculate $\rho_{Y,Z} = \dfrac{Cov(Y,Z)}{\sigma_Y\,\sigma_Z}$. You need to remember the core equation:

$$\underbrace{X}_{\text{loss}} = \underbrace{Y}_{\text{insurer's share}} + \underbrace{Z}_{\text{insured's share}}$$

The above equation says that if there's a loss, either the insurer pays or the insured pays or both. As a result, the loss amount must be equal to the sum of what the insurer will pay and what the insured will pay. We can verify that $X = Y + Z$ holds under every scenario:

Probability            x      y                  z
70.0%                  0      0                  0
30%(0.10) = 3.0%       20     0                  20
30%(0.25) = 7.5%       60     60 - 50 = 10       50
30%(0.30) = 9.0%       150    150 - 50 = 100     50
30%(0.35) = 10.5%      200    145                55

$$E(X) = E(Y) + E(Z)$$

As a check, we know $E(X) = 39.6$, $E(Y) = 24.975$, and $E(Z) = 14.625$. 39.6=24.975+14.625. The equation holds. Now we know that our calculations of $E(X)$, $E(Y)$, and $E(Z)$ are OK.

$$Var(X) = Var(Y + Z) = Var(Y) + Var(Z) + 2Cov(Y,Z)$$

$$2Cov(Y,Z) = Var(X) - \left[Var(Y) + Var(Z)\right] = 70.28^2 - \left(49.91^2 + 22.98^2\right) = 1{,}920.19$$

$$Cov(Y,Z) = \frac{1{,}920.19}{2} = 960.095$$

$$\rho_{Y,Z} = \frac{Cov(Y,Z)}{\sigma_Y\,\sigma_Z} = \frac{960.095}{49.91\,(22.98)} = 0.837$$
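The whole of Example 4 can also be reproduced by brute force over the five loss scenarios. The sketch below is an illustration with names of my own choosing; it computes the covariance directly from $E[(Y - E(Y))(Z - E(Z))]$ instead of going through $Var(X)$, so the correlation comes out near 0.8365 unrounded, which agrees with the 0.837 obtained above from rounded standard deviations.

```python
import math

DEDUCTIBLE, LIMIT = 50, 145
# Annual loss distribution from Example 4: loss amount -> probability.
LOSS_PMF = {0: 0.70, 20: 0.03, 60: 0.075, 150: 0.09, 200: 0.105}

def insurer_pays(loss):
    """Loss in excess of the deductible, capped at the benefit limit."""
    return min(max(loss - DEDUCTIBLE, 0), LIMIT)

def insured_pays(loss):
    """Whatever part of the loss the insurer does not cover."""
    return loss - insurer_pays(loss)

def expect(g):
    """E[g(X)] over the discrete loss distribution."""
    return sum(p * g(x) for x, p in LOSS_PMF.items())

e_x, e_y, e_z = expect(lambda x: x), expect(insurer_pays), expect(insured_pays)
sd_x = math.sqrt(expect(lambda x: (x - e_x) ** 2))
sd_y = math.sqrt(expect(lambda x: (insurer_pays(x) - e_y) ** 2))
sd_z = math.sqrt(expect(lambda x: (insured_pays(x) - e_z) ** 2))
cov_yz = expect(lambda x: (insurer_pays(x) - e_y) * (insured_pays(x) - e_z))
rho = cov_yz / (sd_y * sd_z)

print(round(e_x, 3), round(e_y, 3), round(e_z, 3))
print(round(sd_x, 2), round(sd_y, 2), round(sd_z, 2), round(rho, 4))
```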

We shouldn't be surprised that $\rho_{Y,Z}$ is close to 1; if you get $\rho_{Y,Z}$ close to zero, your calculation must be wrong. $Y$ and $Z$ should have a good linear relationship.

Coinsurance

Some insurance contracts have a coinsurance factor. An insurance policy has a deductible of $d$ and a benefit limit (the insurer's max payment) of $u$. The insurer's portion of the payment (the coinsurance factor) is $\alpha$ where $0 < \alpha \le 1$. Let $X$ represent the loss and $Y$ represent the insurer's payment. Then

$$Y = \begin{cases} 0 & \text{if } 0 \le X \le d \\ \alpha\,(X - d) & \text{if } d \le X \le d + \dfrac{u}{\alpha} \\ u & \text{if } X \ge d + \dfrac{u}{\alpha} \end{cases}$$

Example 5

You bought an insurance policy. The deductible is $200 per loss. The insurance company will pay 90% of the loss in excess of the deductible subject to the maximum payment of $5,000 per loss. One day you suffered a loss of $4,000. How much will the insurer pay you? How much do you have to pay out of your own pocket to cover your loss?

Solution

The insurer will pay 90%(4,000-200)=$3,420. You will need to cover the remaining $4,000 - 3,420 = $580 loss out of your own money.

Example 6

You bought an insurance policy. The deductible is $200 per loss. The insurance company will pay 90% of the loss in excess of the deductible subject to the maximum payment of $5,000 per loss. One day you suffered a loss of $5,750. How much will the insurer pay you? How much do you have to pay out of your own pocket to cover your loss?

Solution

The insurer will pay you 90%(5,750-200)=$4,995. You'll need to cover the remaining $5,750 - 4,995 = $755 out of your own pocket.

Example 7

You bought an insurance policy. The deductible is $200 per loss. The insurance company will pay 90% of the loss in excess of the deductible subject to the maximum payment of $5,000 per loss. One day you suffered a loss of $5,760. How much will the insurer pay you? How much do you have to pay out of your own pocket to cover your loss?

Solution

If there's no benefit limit, then the insurer will pay you 90%(5,760-200)=$5,004. However, the insurer's payment is capped at $5,000. So the insurer will pay you $5,000. You'll need to cover the remaining $5,760 - 5,000 = $760 out of your own pocket.

Example 8

A car owner has a 70% chance of no accidents in a year, 30% chance of having only one accident in a year, and 0% chance of having more than one accident in a year. If there's an accident, the loss amount (also called "severity") is a random variable with the following distribution:

Severity      20     60     150    200
Probability   0.10   0.25   0.30   0.35

There's an annual deductible of $50. The annual maximum payment by the insurer is $85. The insurer will pay 60% of the loss in excess of the deductible subject to the max annual payment. Calculate
(1) Annual expected loss incurred by a car owner
(2) Standard deviation of the annual loss incurred by the car owner
(3) Annual expected payment made by the insurance company to a car owner
(4) Standard deviation of the payment made by the insurance company to a car owner
(5) Annual expected cost that the insured car owner must cover with his own money
(6) Standard deviation of the cost that the insured car owner must cover with his own money
(7) Correlation coefficient between the insurance company's annual payment and the insured's annual out-of-pocket cost to cover the loss.

Solution

Questions (1) and (2)

Let $X$ represent the annual loss incurred by a car owner.

f(x)                   x
70.0%                  0
30%(0.10) = 3.0%       20
30%(0.25) = 7.5%       60
30%(0.30) = 9.0%       150
30%(0.35) = 10.5%      200

$$E(X) = \sum x f(x) = 0(70\%) + 20(3\%) + 60(7.5\%) + 150(9\%) + 200(10.5\%) = 39.6$$

$$Var(X) = \sum \left[x - E(X)\right]^2 f(x) = (0-39.6)^2(70\%) + (20-39.6)^2(3\%) + (60-39.6)^2(7.5\%) + (150-39.6)^2(9\%) + (200-39.6)^2(10.5\%)$$

$$Var(X) = 4{,}938.84, \qquad \sigma_X = \sqrt{Var(X)} = 70.28$$

Once again, you can use the BA II Plus 1-V Statistics Worksheet to do the calculations.

Questions (3) and (4)

Let $Y$ represent the annual payment made by the insurer to the insured. Then

f(y)                   y                        when
70.0%                  0                        x = 0
30%(0.10) = 3.0%       0                        x = 20
30%(0.25) = 7.5%       0.6(60 - 50) = 6         x = 60
30%(0.30) = 9.0%       0.6(150 - 50) = 60       x = 150
30%(0.35) = 10.5%      85                       x = 200

When the loss $X = 200$, the insurer will pay 0.6(200-50)=90 if there's no benefit limit. However, since there's a limit of $85, the insurer's payment is capped at $85. The distribution of $Y$ can be simplified as follows:

y       0      6       60     85
f(y)    73%    7.50%   9%     10.50%

Using BA II Plus or memorized formulas, we have:

$$E(Y) = 14.775, \qquad \sigma_Y = 29.45$$

Questions (5) and (6)

Let $Z$ represent how much money the insured needs to take out of his own pocket to cover his annual loss.

f(z)                   z                              when
70.0%                  0                              x = 0
30%(0.10) = 3.0%       20                             x = 20
30%(0.25) = 7.5%       50 + 0.4(60 - 50) = 54         x = 60
30%(0.30) = 9.0%       50 + 0.4(150 - 50) = 90        x = 150
30%(0.35) = 10.5%      200 - 85 = 115                 x = 200

The distribution of $Z$ can be simplified as follows:

z       0      20     54      90     115
f(z)    70%    3%     7.5%    9%     10.5%

Using BA II Plus or memorized formulas, we have:

$$E(Z) = 24.825, \qquad \sigma_Z = 41.62$$

Check: $E(Y) + E(Z) = 14.775 + 24.825 = 39.6 = E(X)$. OK.

Question (7)

$$Var(X) = Var(Y + Z) = Var(Y) + Var(Z) + 2Cov(Y,Z)$$

$$2Cov(Y,Z) = Var(X) - \left[Var(Y) + Var(Z)\right] = 70.28^2 - \left(29.45^2 + 41.62^2\right) = 2{,}339.75$$

$$Cov(Y,Z) = \frac{2{,}339.75}{2} = 1{,}169.88$$

$$\rho_{Y,Z} = \frac{Cov(Y,Z)}{\sigma_Y\,\sigma_Z} = \frac{1{,}169.88}{29.45\,(41.62)} = 0.95$$
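The coinsurance payment rule fits in one line of code, which makes it easy to re-run Examples 5 through 8 with other numbers. The sketch below is only an illustration: `insurer_payment` and its parameter names are my own, and a coinsurance factor of 1.0 reduces it to the plain deductible-and-limit case.

```python
import math

def insurer_payment(loss, deductible, limit, coinsurance=1.0):
    """Pay coinsurance * (loss - deductible), floored at 0 and capped at limit."""
    return min(coinsurance * max(loss - deductible, 0), limit)

# Examples 5-7: deductible 200, limit 5,000 per loss, coinsurance 90%.
per_loss = {loss: insurer_payment(loss, 200, 5000, 0.9)
            for loss in (4000, 5750, 5760)}

# Example 8: deductible 50, annual limit 85, coinsurance 60%.
LOSS_PMF = {0: 0.70, 20: 0.03, 60: 0.075, 150: 0.09, 200: 0.105}

def expect(g):
    """E[g(X)] over the discrete loss distribution."""
    return sum(p * g(x) for x, p in LOSS_PMF.items())

def y(x):
    return insurer_payment(x, 50, 85, 0.6)   # insurer's share

def z(x):
    return x - y(x)                          # insured's share

e_y, e_z = expect(y), expect(z)
sd_y = math.sqrt(expect(lambda x: (y(x) - e_y) ** 2))
sd_z = math.sqrt(expect(lambda x: (z(x) - e_z) ** 2))
rho = expect(lambda x: (y(x) - e_y) * (z(x) - e_z)) / (sd_y * sd_z)

print(per_loss)
print(round(e_y, 3), round(e_z, 3), round(sd_y, 2), round(sd_z, 2), round(rho, 2))
```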

The effect of inflation on loss and claim payment

This topic is covered in depth in Exams M and C. For Exam P, you just need to learn the basics. If you understand the examples in the study note, you should be fine.

Problem 9

Looking again at the 100 insured car owners with a 500 deductible and no benefit limit, assume that there's 10% annual inflation. Over the next 5 years, what would the expected claim payments and the insurer's risk be? The study note gives you the following table. Reproduce this table.

Loss and claim amounts by probability f(y, t):

Year  Item    80%   10%    8%       2%        Expected   Std. dev. of claim payments
1     Loss    $0    $500   $5,000   $15,000   $750
1     Claim   $0    $0     $4,500   $14,500   $650       $2,324
2     Loss    $0    $550   $5,500   $16,500   $825
2     Claim   $0    $50    $5,000   $16,000   $725       $2,568
3     Loss    $0    $605   $6,050   $18,150   $908
3     Claim   $0    $105   $5,550   $17,650   $808       $2,836
4     Loss    $0    $666   $6,655   $19,965   $998
4     Claim   $0    $166   $6,155   $19,465   $898       $3,131
5     Loss    $0    $732   $7,321   $21,962   $1,098
5     Claim   $0    $232   $6,821   $21,462   $998       $3,456

Solution

Year 1

This is the base year, so inflation has no effect. The loss amounts (severities) are $0, $500, $5,000, and $15,000. Since there's a deductible of $500, the insurance company will pay $0, $500 - 500 = $0, $5,000 - 500 = $4,500, and $15,000 - 500 = $14,500 with probabilities of 80%, 10%, 8%, and 2% respectively.

Expected loss:

0(80%) + 500(10%) + 5,000(8%) + 15,000(2%) = $750

Expected claim payments by the insurer:

0(80%) + 0(10%) + 4,500(8%) + 14,500(2%) = $650

Standard deviation of claims:

\[ \mathrm{Var} = (0-650)^2(0.8) + (0-650)^2(0.1) + (4,500-650)^2(0.08) + (14,500-650)^2(0.02) = 5,402,500 \]

\[ \sigma = \sqrt{\mathrm{Var}} = 2,324.33 \]

Year 2

Due to inflation, losses have increased 10%. The loss amounts are: $0(1.1) = $0, $500(1.1) = $550, $5,000(1.1) = $5,500, $15,000(1.1) = $16,500.

The claims after the deductible are: $0, $550 - 500 = $50, $5,500 - 500 = $5,000, $16,500 - 500 = $16,000.

Expected loss: 0(80%) + 550(10%) + 5,500(8%) + 16,500(2%) = $825

Expected claim: 0(80%) + 50(10%) + 5,000(8%) + 16,000(2%) = $725

Standard deviation of the claims:

\[ \sqrt{(0-725)^2(0.8) + (50-725)^2(0.1) + (5,000-725)^2(0.08) + (16,000-725)^2(0.02)} = 2,568 \]

Year 3

Due to inflation, losses have increased another 10%. The loss amounts are: $0(1.1^2) = $0; $500(1.1^2) = $605; $5,000(1.1^2) = $6,050; $15,000(1.1^2) = $18,150.

The claims after the deductible are: $0, $605 - 500 = $105, $6,050 - 500 = $5,550, $18,150 - 500 = $17,650.

Plugging into the mean and variance formulas, you should get:

Expected loss = $908, expected claim = $808, standard deviation of the claim = $2,836.

Years 4 and 5

Just increase the losses by 10% each year. You should be able to reproduce the results (there may be rounding differences).

Observation

Please note that the expected loss increases by 10% each year, yet the expected claim and the standard deviation of the claim increase by more than 10%. Make sure you understand why.

Let's look at the expected loss:

Year            1      2      3      4      5
Expected loss   $750   $825   $908   $998   $1,098

825 = 750(1.1); 908 = 825(1.1) = 750(1.1^2); 998 = 908(1.1) = 750(1.1^3); 1,098 = 998(1.1) = 750(1.1^4).

Why does the expected loss increase by 10% each year? If the losses \( x_i \) increase by a uniform percentage, then \( E(X) = \sum x_i f(x_i) \) increases by the same percentage.

Why do the expected claims increase by more than 10% each year? Let Y represent the claim payment. With a deductible d, the insurer pays the excess of the loss over d, if any:

\[ Y = \max(X - d, 0), \qquad E(Y) = \sum \max(x_i - d, 0) f(x_i) \]

Now the \( x_i \) increase by a uniform percentage each year, but the deductible d is fixed. Consequently, E(Y) increases by more than 10% each year. For example, the $550 loss in Year 2 is 10% larger than the $500 loss in Year 1, but the claim goes from $0 to $50. Similarly, \( \sigma_Y \) also increases by more than 10% each year because we have a fixed deductible.

Problem 10

Looking again at the 100 insured car owners with a 500 deductible and a benefit limit of 12,500, assume that there's 10% annual inflation. Over the next 5 years, what would the expected claim payments and the insurer's risk be? The study note tells us that at the end of Year 5, the expected loss is $1,098; the expected claim is $819; and the standard deviation of the claim is $2,486. Reproduce the results for Year 5.

Solution

Probability   Loss at end of Year 1   Loss at end of Year 5           Claim at end of Year 5
80%           $0                      $0                              $0
10%           $500                    $732.05 = 500(1.1^4)            $232.05 = 732.05 - 500
8%            $5,000                  $7,320.50 = 5,000(1.1^4)        $6,820.50 = 7,320.50 - 500
2%            $15,000                 $21,961.50 = 15,000(1.1^4)      $12,500 = min(21,961.50 - 500, 12,500)

If we let X = loss at the end of Year 5 and Y = claim at the end of Year 5, the above table can be simplified as follows:

Probability   X            Y
80%           $0           $0
10%           $732.05      $232.05
8%            $7,320.50    $6,820.50
2%            $21,961.50   $12,500

Using the BA II Plus 1-V Statistics Worksheet, we have:

\( E(X) = 1,098.075 \), \( E(Y) = 818.845 \), \( \sigma_Y = 2,486.245 \)

Mixture of distributions

Section VIII of the study note touches on a very important concept: mixture of distributions. Since the SOA can easily come up with a similar question in the exam, let's go through this concept. I'll walk you through the examples used in the study note.

Problem 11

Consider an insurance policy that reimburses annual hospital charges for an insured individual. The probability of any individual being hospitalized in a year is 15%. That is,

\( P(H = 1) = 0.15 \). Once an individual is hospitalized, the charges X have a probability density function (pdf) \( f_X(x \mid H = 1) = 0.1 e^{-0.1x} \) for \( x > 0 \).

Determine the expected value, the standard deviation, and the ratio of the standard deviation to the mean (coefficient of variation) of hospital charges for an insured individual.

Solution

I'll provide an alternative, simpler solution. Let Y represent hospital charges. Then

Y = 0 with probability of 85%
Y = X with probability of 15%, where X is distributed as \( f(x) = 0.1 e^{-0.1x} \)

Then

\[ E(Y) = 85\%\,E(0) + 15\%\,E(X) = 15\%\,E(X) \]

\[ E(Y^2) = 85\%\,E(0^2) + 15\%\,E(X^2) = 15\%\,E(X^2) \]

\[ \mathrm{Var}(Y) = E(Y^2) - E^2(Y) = 0.15E(X^2) - [0.15E(X)]^2 = 0.15E(X^2) - 0.15^2 E^2(X) \]

Here we use the following formula: if k is a constant, then \( E(k) = k \) and \( E(k^2) = k^2 \). To understand this formula, imagine you have a random variable X whose probability function is \( f_X(x) = 1 \) if \( x = k \) and zero otherwise. Then \( E(X) = k \) and \( E(X^2) = k^2 \).

Let's continue. X is an exponential random variable with mean of 10.

\[ E(X) = \sigma_X = 10, \qquad E(X^2) = E^2(X) + \sigma_X^2 = 200 \]

\[ E(Y) = 15\%\,E(X) = 15\%(10) = 1.5 \]

\[ \mathrm{Var}(Y) = 0.15E(X^2) - 0.15^2 E^2(X) = 0.15(200) - 0.15^2(10^2) = 27.75 \]

The coefficient of variation is

\[ \frac{\sigma_Y}{E(Y)} = \frac{\sqrt{27.75}}{1.5} = 3.51 \]

In this example, Y is a mixture of random variables 0 and X with the following distribution:


Y = 0 with probability of 85%
Y = X with probability of 15%, where X is distributed as \( f(x) = 0.1 e^{-0.1x} \)

One mistake made by many is to write:

\[ Y = 0.85(0) + 0.15X = 0.15X \quad \text{(Wrong!)} \]

If you write \( Y = 0.85(0) + 0.15X = 0.15X \), then you are treating Y as a fraction of X. Then Y is no longer a mixture of two random variables. The wrong expression \( Y = 0.85(0) + 0.15X = 0.15X \) leads to the following wrong result:

\[ E(Y^2) = E[(0.15X)^2] = 0.15^2 E(X^2) \quad \text{(Wrong!)} \]

The correct formula is

\[ E(Y^2) = 85\%\,E(0^2) + 15\%\,E(X^2) = 0.15E(X^2) \]

Likewise, the following is wrong:

\[ \mathrm{Var}(Y) = \mathrm{Var}(0.15X) = 0.15^2\,\mathrm{Var}(X) = 0.15^2 E(X^2) - 0.15^2 E^2(X) \quad \text{(Wrong!)} \]

The correct formula is:

\[ \mathrm{Var}(Y) = E(Y^2) - E^2(Y) = 0.15E(X^2) - [0.15E(X)]^2 = 0.15E(X^2) - 0.15^2 E^2(X) \]

This brings up a critical point: the difference between a mixture and a sum. Say we have two random variables X and Y. X is distributed as follows:

X = W1 with probability of p
X = W2 with probability of 1 - p

Then you are given \( Y = pW_1 + (1-p)W_2 \). Is X = Y? No. Here X is a mixture and Y is a sum. As said earlier, you can't write \( X = pW_1 + (1-p)W_2 \). To see the difference between X and Y, notice:

\[ E(X) = pE(W_1) + (1-p)E(W_2) \]

\[ E(X^2) = pE(W_1^2) + (1-p)E(W_2^2) \]

\[ E(Y) = E[pW_1 + (1-p)W_2] = pE(W_1) + (1-p)E(W_2) \]

\[ E(Y^2) = E\{[pW_1 + (1-p)W_2]^2\} = E[p^2 W_1^2 + 2p(1-p)W_1 W_2 + (1-p)^2 W_2^2] \]

Though \( E(X) = E(Y) \), in general \( E(X^2) \neq E(Y^2) \).

Problem 12

Consider an insurance policy that reimburses annual hospital charges for an insured individual. The probability of any individual being hospitalized in a year is 15%. That is, \( P(H = 1) = 0.15 \). Once an individual is hospitalized, the charges X have a probability density function (pdf) \( f_X(x \mid H = 1) = 0.1 e^{-0.1x} \) for \( x > 0 \). Assume there's a deductible of 5.

Determine the expected value, the standard deviation, and the ratio of the standard deviation to the mean (coefficient of variation) of the claim payment.

Alternative solution

Let Y represent claims for hospital charges. Then

Y = 0 with probability of 85%
Y = Z = max(X - 5, 0) with probability of 15%, where X is distributed as \( f(x) = 0.1 e^{-0.1x} \)

\[ E(Y) = 85\%\,E(0) + 15\%\,E(Z) = 15\%\,E(Z) \]

\[ E(Y^2) = 85\%\,E(0^2) + 15\%\,E(Z^2) = 15\%\,E(Z^2) \]

\[ \mathrm{Var}(Y) = E(Y^2) - E^2(Y) \]

\[ E(Z) = \int_0^{+\infty} \max(x-5, 0) f(x)\,dx = \int_0^5 \max(x-5, 0) f(x)\,dx + \int_5^{+\infty} \max(x-5, 0) f(x)\,dx \]

If \( x \le 5 \), then \( \max(x-5, 0) = 0 \). In other words, if a charge is $5 or less, the insurer won't pay anything. If \( x > 5 \), then \( \max(x-5, 0) = x - 5 \). The insurer pays the charge above and beyond the deductible of $5.

\[ E(Z) = \int_5^{+\infty} (x-5) f(x)\,dx = \int_5^{+\infty} (x-5)\,\frac{1}{10} e^{-x/10}\,dx \]

Set \( u = x - 5 \).

\[ E(Z) = \int_0^{+\infty} u\,\frac{1}{10} e^{-(u+5)/10}\,du = e^{-0.5} \int_0^{+\infty} u\,\frac{1}{10} e^{-u/10}\,du = 10e^{-0.5} \]

\[ E(Y) = 15\%\,E(Z) = 15\%(10e^{-0.5}) = 1.5e^{-0.5} = 0.91 \]

Similarly,

\[ E(Z^2) = \int_0^{+\infty} [\max(x-5, 0)]^2 f(x)\,dx = \int_5^{+\infty} (x-5)^2\,\frac{1}{10} e^{-x/10}\,dx = e^{-0.5} \int_0^{+\infty} u^2\,\frac{1}{10} e^{-u/10}\,du = 200e^{-0.5} \]

\[ E(Y^2) = 15\%\,E(Z^2) = 15\%(200e^{-0.5}) = 30e^{-0.5} \]

\[ \mathrm{Var}(Y) = E(Y^2) - E^2(Y) = 30e^{-0.5} - (1.5e^{-0.5})^2 = 17.3682, \qquad \sigma_Y = \sqrt{\mathrm{Var}(Y)} = 4.17 \]

The coefficient of variation is:

\[ \frac{\sigma_Y}{E(Y)} = \frac{4.17}{0.91} = 4.58 \]

Coefficient of variation

Problem 13

You are given information about a block of insurance policies:

                        Probability that each        Distribution of the annual loss
                        policy has a loss            per policy if there's a loss
Class   # of policies   in a year                    Mean      Variance
1       100             0.1                          1         3
2       200             0.3                          2         5

Assume all these 300 policies are independent. Let S represent the total annual loss incurred by the 300 policies.

Calculate \( \sigma_S / E(S) \), the coefficient of variation of S.

Solution

This problem looks scary. The key to solving this problem is to thoroughly list all of the random variables.

Let \( X_i \) represent the annual loss incurred by the i-th policy in Class 1. Let \( Y_j \) represent the annual loss incurred by the j-th policy in Class 2. We have i = 1, 2, ..., 100 and j = 1, 2, ..., 200. Then

\[ S = X_1 + X_2 + \dots + X_{100} + Y_1 + Y_2 + \dots + Y_{200} \]

\[ E(S) = E(X_1 + X_2 + \dots + X_{100} + Y_1 + Y_2 + \dots + Y_{200}) \]

The \( X_i \)'s are independent and identically distributed:

\[ E(X_1) = E(X_2) = \dots = E(X_{100}) = E(X), \qquad \mathrm{Var}(X_1) = \mathrm{Var}(X_2) = \dots = \mathrm{Var}(X_{100}) = \mathrm{Var}(X) \]

Here X refers to the annual loss incurred by a policy randomly chosen from Class 1. Similarly, the \( Y_j \)'s are independent and identically distributed:

\[ E(Y_1) = E(Y_2) = \dots = E(Y_{200}) = E(Y), \qquad \mathrm{Var}(Y_1) = \mathrm{Var}(Y_2) = \dots = \mathrm{Var}(Y_{200}) = \mathrm{Var}(Y) \]

Here Y refers to the annual loss incurred by a policy randomly chosen from Class 2.

\[ E(S) = E(X_1) + \dots + E(X_{100}) + E(Y_1) + \dots + E(Y_{200}) = 100E(X) + 200E(Y) \]

Because the policies are independent, the variances add as well:

\[ \mathrm{Var}(S) = 100\,\mathrm{Var}(X) + 200\,\mathrm{Var}(Y) \]

Please note that one common mistake is to write:

\[ S = 100X + 200Y \quad \text{(Wrong)} \]

Though \( X_1, X_2, \dots, X_{100} \) are independent and identically distributed, having the same mean E(X) and the same variance Var(X), \( X_1, X_2, \dots, X_{100} \) are not necessarily identical. In other words, we have \( E(X_1) = E(X_2) \) and \( \mathrm{Var}(X_1) = \mathrm{Var}(X_2) \), but we may have \( X_1 \neq X_2 \). Consequently, we can't write

\[ X_1 + X_2 + \dots + X_{100} = 100X \quad \text{(Wrong)} \]
\[ Y_1 + Y_2 + \dots + Y_{200} = 200Y \quad \text{(Wrong)} \]
\[ S = 100X + 200Y \quad \text{(Wrong)} \]

If you mistakenly write S = 100X + 200Y, you'll get the wrong variance:

\[ \mathrm{Var}(S) = \mathrm{Var}(100X + 200Y) = 100^2\,\mathrm{Var}(X) + 200^2\,\mathrm{Var}(Y) \quad \text{(Wrong)} \]

We know the correct formula is

\[ \mathrm{Var}(S) = 100\,\mathrm{Var}(X) + 200\,\mathrm{Var}(Y) \]

Let's continue. Now we need to find E(X), Var(X), E(Y), and Var(Y). Once again, we want to precisely specify X and Y:

                        Probability that each        Distribution of the annual loss
                        policy has a loss            per policy if there's a loss
Class   # of policies   in a year                    Mean      Variance
1       100             0.1                          1         3
2       200             0.3                          2         5

X = 0 with probability of 0.9
X = U with probability of 0.1, where E(U) = 1, Var(U) = 3

Y = 0 with probability of 0.7
Y = V with probability of 0.3, where E(V) = 2, Var(V) = 5

\[ E(X) = 0.9(0) + 0.1E(U) = 0.1(1) = 0.1 \]

When calculating Var(X), we have to be careful. We can't directly find Var(X). For example, we can't write

\[ \mathrm{Var}(X) = 0.1\,\mathrm{Var}(U) \quad \text{(Wrong)} \]

We need to proceed using the formula \( \mathrm{Var}(X) = E(X^2) - E^2(X) \).


\[ E(X^2) = 0.9(0^2) + 0.1E(U^2) = 0.1[E^2(U) + \mathrm{Var}(U)] = 0.1(1^2 + 3) = 0.4 \]

\[ \mathrm{Var}(X) = E(X^2) - E^2(X) = 0.4 - 0.1^2 = 0.39 \]

Similarly,

Y = 0 with probability of 0.7
Y = V with probability of 0.3, where E(V) = 2, Var(V) = 5

\[ E(Y) = 0.7(0) + 0.3E(V) = 0.3(2) = 0.6 \]

\[ E(Y^2) = 0.7(0^2) + 0.3E(V^2) = 0.3[E^2(V) + \mathrm{Var}(V)] = 0.3(2^2 + 5) = 2.7 \]

\[ \mathrm{Var}(Y) = E(Y^2) - E^2(Y) = 2.7 - 0.6^2 = 2.34 \]

\[ E(S) = 100E(X) + 200E(Y) = 100(0.1) + 200(0.6) = 130 \]

\[ \mathrm{Var}(S) = 100\,\mathrm{Var}(X) + 200\,\mathrm{Var}(Y) = 100(0.39) + 200(2.34) = 507 \]

\[ \frac{\sigma_S}{E(S)} = \frac{\sqrt{507}}{130} = 0.1732 \]

Normal approximation

Example 14

You are given information about a block of insurance policies:

                        Probability that each        Distribution of the annual loss
                        policy has a loss            per policy if there's a loss
Class   # of policies   in a year                    Mean      Variance
1       100             0.1                          3         4
2       400             0.2                          5         6
3       500             0.3                          7         8

Assume all these 1,000 policies are independent. Let S represent the total annual loss incurred by the 1,000 policies.

Calculate \( \sigma_S / E(S) \), the coefficient of variation of S. Use normal approximation to calculate the probability that the total annual loss exceeds 95% of the expected total annual loss, that is, \( P[S > 95\%\,E(S)] \).

Solution

Let X represent the annual loss incurred by a policy randomly chosen from Class 1. Let Y represent the annual loss incurred by a policy randomly chosen from Class 2. Let Z represent the annual loss incurred by a policy randomly chosen from Class 3.

\[ S = X_1 + \dots + X_{100} + Y_1 + \dots + Y_{400} + Z_1 + \dots + Z_{500} \]

\[ E(S) = 100E(X) + 400E(Y) + 500E(Z) \]

\[ \mathrm{Var}(S) = 100\,\mathrm{Var}(X) + 400\,\mathrm{Var}(Y) + 500\,\mathrm{Var}(Z) \]

                        Probability that each        Distribution of the annual loss
                        policy has a loss            per policy if there's a loss
Class   # of policies   in a year                    Mean      Variance
1       100             0.1                          3         4
2       400             0.2                          5         6
3       500             0.3                          7         8

X = 0 with probability of 0.9
X = U with probability of 0.1, where E(U) = 3, Var(U) = 4

Y = 0 with probability of 0.8
Y = V with probability of 0.2, where E(V) = 5, Var(V) = 6

Z = 0 with probability of 0.7
Z = R with probability of 0.3, where E(R) = 7, Var(R) = 8

\[ E(X) = 0.9(0) + 0.1E(U) = 0.1(3) = 0.3 \]

\[ E(X^2) = 0.9(0^2) + 0.1E(U^2) = 0.1[E^2(U) + \mathrm{Var}(U)] = 0.1(3^2 + 4) = 1.3 \]

\[ \mathrm{Var}(X) = E(X^2) - E^2(X) = 1.3 - 0.3^2 = 1.21 \]

\[ E(Y) = 0.8(0) + 0.2E(V) = 0.2(5) = 1 \]

\[ E(Y^2) = 0.8(0^2) + 0.2E(V^2) = 0.2[E^2(V) + \mathrm{Var}(V)] = 0.2(5^2 + 6) = 6.2 \]

\[ \mathrm{Var}(Y) = E(Y^2) - E^2(Y) = 6.2 - 1^2 = 5.2 \]

\[ E(Z) = 0.7(0) + 0.3E(R) = 0.3(7) = 2.1 \]

\[ E(Z^2) = 0.7(0^2) + 0.3E(R^2) = 0.3[E^2(R) + \mathrm{Var}(R)] = 0.3(7^2 + 8) = 17.1 \]

\[ \mathrm{Var}(Z) = E(Z^2) - E^2(Z) = 17.1 - 2.1^2 = 12.69 \]

\[ E(S) = 100E(X) + 400E(Y) + 500E(Z) = 100(0.3) + 400(1) + 500(2.1) = 1,480 \]

\[ \mathrm{Var}(S) = 100(1.21) + 400(5.2) + 500(12.69) = 8,546 \]

\[ \frac{\sigma_S}{E(S)} = \frac{\sqrt{8,546}}{1,480} = 0.06246 \]

Next, we need to find \( P[S > 95\%\,E(S)] \). S is approximately normal.

\[ P[S > 95\%\,E(S)] = 1 - P[S \le 95\%\,E(S)] \]

\[ P[S \le 95\%\,E(S)] = \Phi\!\left[\frac{95\%\,E(S) - E(S)}{\sigma_S}\right] = \Phi\!\left[-5\%\,\frac{E(S)}{\sigma_S}\right] = \Phi[-5\%(16.01)] = \Phi(-0.80) = 1 - \Phi(0.80) \]

\[ P[S > 95\%\,E(S)] = 1 - [1 - \Phi(0.80)] = \Phi(0.80) = 0.7881 \]

Example 15

You are given information about a block of insurance policies:

                        Probability that each policy   Loss amount if
Class   # of policies   has a loss in a year           there's a loss
1       100             0.1                            3
2       200             0.2                            4

Assume all these 300 policies are independent.

Let S represent the total annual loss incurred by the 300 policies. Calculate \( \sigma_S / E(S) \), the coefficient of variation of S. Use normal approximation to calculate the probability that the total annual loss exceeds 105% of the expected total annual loss, that is, \( P[S > 105\%\,E(S)] \).

Solution

Let X represent the annual loss incurred by a policy randomly chosen from Class 1. Let Y represent the annual loss incurred by a policy randomly chosen from Class 2.

\[ S = X_1 + \dots + X_{100} + Y_1 + \dots + Y_{200} \]

\[ E(S) = 100E(X) + 200E(Y), \qquad \mathrm{Var}(S) = 100\,\mathrm{Var}(X) + 200\,\mathrm{Var}(Y) \]

X = 0 with probability of 0.9
X = 3 with probability of 0.1

Y = 0 with probability of 0.8
Y = 4 with probability of 0.2

\[ E(X) = 0(0.9) + 3(0.1) = 0.3, \qquad E(X^2) = 0^2(0.9) + 3^2(0.1) = 0.9 \]

\[ \mathrm{Var}(X) = E(X^2) - E^2(X) = 0.9 - 0.3^2 = 0.81 \]

\[ E(Y) = 0(0.8) + 4(0.2) = 0.8, \qquad E(Y^2) = 0^2(0.8) + 4^2(0.2) = 3.2 \]

\[ \mathrm{Var}(Y) = E(Y^2) - E^2(Y) = 3.2 - 0.8^2 = 2.56 \]

To avoid the above calculation, you can use BA II Plus and quickly calculate the mean and variance for X and Y.

\[ E(S) = 100E(X) + 200E(Y) = 100(0.3) + 200(0.8) = 190 \]

\[ \mathrm{Var}(S) = 100\,\mathrm{Var}(X) + 200\,\mathrm{Var}(Y) = 100(0.81) + 200(2.56) = 593 \]

\[ \frac{\sigma_S}{E(S)} = \frac{\sqrt{593}}{190} = 0.1282 \]

Next, we need to find \( P[S > 105\%\,E(S)] \). S is approximately normal.

\[ P[S > 105\%\,E(S)] = 1 - P[S \le 105\%\,E(S)] \]

\[ P[S \le 105\%\,E(S)] = \Phi\!\left[\frac{105\%\,E(S) - E(S)}{\sigma_S}\right] = \Phi\!\left[5\%\,\frac{E(S)}{\sigma_S}\right] = \Phi[5\%(7.8)] = \Phi(0.39) = 0.6517 \]

\[ P[S > 105\%\,E(S)] = 1 - 0.6517 = 0.3483 \]

Example 16

Two random loss variables, X and Y, have the following joint density function:

\[ f_{X,Y}(x, y) = \frac{4}{81}xy, \quad \text{where } 0 \le x \le 3 \text{ and } 0 \le y \le 3 \]

The insurer pays X + Y in excess of a deductible of 1, subject to a maximum payment of 2. Calculate the expected claim payment by the insurer.

Solution

Let S represent the claim payment by the insurer. Then,

S = 0            if 0 <= X + Y <= 1
S = X + Y - 1    if 1 <= X + Y <= 3
S = 2            if X + Y >= 3

Next, we draw a diagram.

In the diagram, with O = (0,0), A = (1,0), B = (0,1), C = (3,0), D = (0,3), E = (3,3), and F = (1,2) (coordinates consistent with the integration limits used below), the square ODEC is where X and Y exist. AB represents x + y = 1. CD represents x + y = 3.

Area AOB is where \( 0 \le x + y \le 1 \). When (x, y) falls in AOB, the insurer pays nothing. Area ABDC is where \( 1 \le x + y \le 3 \); here the insurer pays x + y - 1. Area CDE is where \( x + y \ge 3 \), \( x \le 3 \), and \( y \le 3 \); here the insurer pays 2.

S = 0            if 0 <= X + Y <= 1
S = X + Y - 1    if 1 <= X + Y <= 3
S = 2            if X + Y >= 3

\[ E(S) = \iint_{AOB} 0 \cdot f(x, y)\,dx\,dy + \iint_{ABDC} (x + y - 1) f(x, y)\,dx\,dy + \iint_{CDE} 2 f(x, y)\,dx\,dy \]

\[ \iint_{AOB} 0 \cdot f(x, y)\,dx\,dy = 0 \]

\[ \iint_{ABDC} (x + y - 1) f(x, y)\,dx\,dy = \iint_{ABDC} (x + y - 1)\,\frac{4}{81}xy\,dx\,dy \]

To do this integration, we divide ABDC into two areas: ABDF and AFC.

\[ \iint_{ABDC} (x + y - 1)\,\frac{4}{81}xy\,dx\,dy = \iint_{ABDF} (x + y - 1)\,\frac{4}{81}xy\,dx\,dy + \iint_{AFC} (x + y - 1)\,\frac{4}{81}xy\,dx\,dy \]

\[ \iint_{ABDF} (x + y - 1)\,\frac{4}{81}xy\,dx\,dy = \int_0^1 \left[ \int_{1-x}^{3-x} (x + y - 1)\,\frac{4}{81}xy\,dy \right] dx = \int_0^1 \left( \frac{56}{243}x - \frac{8}{81}x^2 \right) dx = \frac{20}{243} \]

\[ \iint_{AFC} (x + y - 1)\,\frac{4}{81}xy\,dx\,dy = \int_1^3 \left[ \int_0^{3-x} (x + y - 1)\,\frac{4}{81}xy\,dy \right] dx = \frac{184}{1215} \]

\[ \iint_{ABDC} (x + y - 1) f(x, y)\,dx\,dy = \frac{20}{243} + \frac{184}{1215} = 0.2337 \]

\[ \iint_{CDE} 2 f(x, y)\,dx\,dy = \int_0^3 \left[ \int_{3-x}^{3} 2 \cdot \frac{4}{81}xy\,dy \right] dx = \frac{5}{3} = 1.6667 \]

\[ E(S) = \iint_{ABDC} (x + y - 1) f(x, y)\,dx\,dy + \iint_{CDE} 2 f(x, y)\,dx\,dy = 0.2337 + 1.6667 = 1.90 \]
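Double integrals over split regions are easy to get wrong, so a rough numerical check of Example 16 can be reassuring. The sketch below is not from the study note: it applies a plain midpoint rule on a grid over the square [0,3] x [0,3], and the names payment, joint_density, and expected_payment as well as the grid size n = 400 are arbitrary choices made for this illustration.

```python
def payment(total, deductible=1.0, max_payment=2.0):
    """Insurer's payment: the excess of X + Y over the deductible, capped at the maximum."""
    return min(max(total - deductible, 0.0), max_payment)


def joint_density(x, y):
    """Joint density of Example 16 on 0 <= x <= 3, 0 <= y <= 3."""
    return 4.0 * x * y / 81.0


def expected_payment(n=400, upper=3.0):
    """Midpoint-rule approximation of E(S) over the square [0, upper] x [0, upper]."""
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += payment(x + y) * joint_density(x, y)
    return total * h * h


approx = expected_payment()
print(round(approx, 3))
```

The approximation should land close to the 1.90 obtained above. The same function can be reused for other deductible and limit combinations by changing the defaults in payment.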

Example 17

Two random loss variables, X and Y, have the following joint density function:

\[ f_{X,Y}(x, y) = 2y, \quad \text{where } 0 \le x \le 1 \text{ and } 0 \le y \le 1 \]

The insurer pays X + Y in excess of a deductible of 0.5, subject to a maximum payment of 0.5. Calculate the expected claim payment by the insurer.

Solution

Let S represent the claim payment by the insurer. Then,

S = 0              if 0 <= X + Y <= 0.5
S = X + Y - 0.5    if 0.5 <= X + Y <= 1
S = 0.5            if X + Y >= 1

Let AOB denote the area where \( x + y \le 0.5 \), ABCD the area where \( 0.5 \le x + y \le 1 \), and CDE the area where \( x + y \ge 1 \). Then

\[ E(S) = \iint_{AOB} 0 \cdot f(x, y)\,dx\,dy + \iint_{ABCD} (x + y - 0.5) f(x, y)\,dx\,dy + \iint_{CDE} 0.5 f(x, y)\,dx\,dy \]

\[ E(S) = \iint_{ABCD} (x + y - 0.5) f(x, y)\,dx\,dy + 0.5 \iint_{CDE} f(x, y)\,dx\,dy \]

To calculate \( \iint_{ABCD} (x + y - 0.5) f(x, y)\,dx\,dy \), we divide ABCD into two sub-areas: ABFD (where \( 0 \le y \le 0.5 \)) and BCF (where \( 0.5 \le y \le 1 \)).

\[ \iint_{ABCD} (x + y - 0.5) f(x, y)\,dx\,dy = \iint_{ABFD} (x + y - 0.5) f(x, y)\,dx\,dy + \iint_{BCF} (x + y - 0.5) f(x, y)\,dx\,dy \]

For ABFD, the inner integral is \( 2y \int_{0.5-y}^{1-y} (x + y - 0.5)\,dx = 2y \cdot \frac{1}{8} = \frac{y}{4} \), so

\[ \iint_{ABFD} (x + y - 0.5) f(x, y)\,dx\,dy = \int_0^{0.5} \left[ \int_{0.5-y}^{1-y} (x + y - 0.5)\,2y\,dx \right] dy = \int_0^{0.5} \frac{1}{4}y\,dy = \frac{1}{32} \]

\[ \iint_{BCF} (x + y - 0.5) f(x, y)\,dx\,dy = \int_0^{0.5} \left[ \int_{0.5}^{1-x} (x + y - 0.5)\,2y\,dy \right] dx = 0.05729 \]

\[ \iint_{ABCD} (x + y - 0.5) f(x, y)\,dx\,dy = \frac{1}{32} + 0.05729 = 0.08854 \]

\[ 0.5 \iint_{CDE} f(x, y)\,dx\,dy = 0.5 \int_0^1 \int_{1-x}^{1} 2y\,dy\,dx = 0.5 \int_0^1 [1 - (1-x)^2]\,dx = 0.5 \cdot \frac{2}{3} = \frac{1}{3} \]

\[ E(S) = 0.08854 + \frac{1}{3} = 0.4219 \]

Security loading

Example 18

Losses are uniformly distributed over [0, 4000]. The insurer will pay the loss amount in excess of the deductible of 1,000. The insurer, in setting the premium rate, has a security loading of 10%. Calculate the gross premium.

Solution

The study note talks about the security loading \( \theta \). When an insurer sets its premium rate, it can't set it just equal to the expected claim payment. The insurer needs to charge more to pay expenses and earn a profit.

Let P represent the gross premium. Let X represent the loss and Y the claim payment made by the insurer. Then,

\[ P = (1 + \theta) E(Y) \]

where Y = 0 if \( X \le 1000 \) and Y = X - 1000 if \( X > 1000 \). We need to first calculate E(Y).

\[ E(Y) = \int_0^{1000} 0 \cdot f(x)\,dx + \int_{1000}^{4000} (x - 1000) f(x)\,dx = \int_{1000}^{4000} (x - 1000)\,\frac{1}{4000}\,dx \]

Set x - 1000 = t.

\[ \int_{1000}^{4000} (x - 1000)\,\frac{1}{4000}\,dx = \int_0^{3000} t\,\frac{1}{4000}\,dt = \frac{1}{4000} \left[ \frac{1}{2}t^2 \right]_0^{3000} = 1,125 \]

\[ P = (1 + \theta) E(Y) = (1 + 10\%)(1,125) = 1,237.5 \]
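The integral in Example 18 has a simple closed form: for a loss uniform on [0, b] and a deductible d with 0 <= d <= b, the expected payment is (b - d)^2 / (2b). The sketch below packages that formula together with the loading step. The function names are made up for this illustration, and the closed form is a direct consequence of the integral above, not something quoted from the study note.

```python
def expected_payment_uniform(upper, deductible):
    """E[max(X - d, 0)] for X uniform on [0, upper], assuming 0 <= d <= upper."""
    return (upper - deductible) ** 2 / (2.0 * upper)


def gross_premium(expected_claim, loading):
    """Gross premium P = (1 + loading) * E(Y)."""
    return (1.0 + loading) * expected_claim


expected_claim = expected_payment_uniform(4000.0, 1000.0)
premium = gross_premium(expected_claim, 0.10)
print(expected_claim, premium)
```

With b = 4000, d = 1000, and a 10% loading, this should reproduce E(Y) = 1,125 and P = 1,237.5, up to floating-point rounding.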

Chapter 37 On becoming an actuary

As you contemplate entering the actuarial field, several questions come up again and again from candidates. Let's explore together what challenges, obstacles, and opportunities you might encounter along the way.

Why do people pursue the actuarial career track?

- Bypass an expensive MBA, JD, or MD degree.
- Earn while you learn; you get a raise every time you pass an exam.
- Diligent candidates can reap rich monetary rewards for their hard work by their late 20s or early 30s.
- A steady market demand for actuaries and the highly technical nature of the profession provide a job security few careers can equal in today's volatile workplace.
- White-collar work environment.

What is the biggest challenge candidates face as they work toward becoming a fellow?

EXAMS! Several hundred study hours are typically needed to pass one exam. Becoming a fellow requires you to study for and pass a total of eight exams. You must expect to lead the life of a professional student for several years until you have passed all the exams.

A key character quality that candidates must possess in abundance is perseverance. Sometimes it may take you more than one attempt to finally pass an exam. If you fail, you need to find the motivation to study again for the same exam. Studying for the same exam and reading the same textbooks over and over can be discouraging.

Lastly, exams add a lot of pressure to an already full professional and personal life, especially if you are married and have children. If you are single, try to pass as many exams as you can while you are single. If you are married and have children, you'll need your spouse's support to free up more study time, particularly when the exam is only a few weeks away.

How many exams do I need to pass to get a job?

If you are a young college student, Exam P alone may open the door to employment. Older career switchers or international candidates may have to pass more exams to win a potential employer's attention. Each candidate must weigh his own unique skill set and determine the delicate balance of book knowledge versus job experience.

Having more exams under your belt may impress an employer and hence open more doors. Passing exams is also an excellent confidence-builder. However, don't pass too many exams (such as 5 or 6) before you have a job. Employers want well-balanced candidates whose exam levels are in line with their work experience.

I'm an older candidate who wants to make a switch to the actuarial field. How can I take the leap?

Many companies target college students for their entry-level positions so that by the time they are in their late 20s they become managers and then perhaps move to director by their mid 30s. Older candidates, of course, will not follow this traditional path. In talking with potential employers, older candidates will want to focus on their unique skills and talents as well as the added maturity and responsibility they can bring to bear on their jobs. Study diligently and excel at demonstrating your skills of not only passing exams but also bringing your knowledge to bear on your daily job experience.

Another possibility for making the leap to a new career in the actuarial field is to first get a job as a secretary, technical assistant, or programmer in an actuarial department. Once you are hired, you can study for exams and try to become an actuary.

What computer skills do employers look for in their actuarial job candidates?

The most commonly used software programs in the actuarial field are Excel and Access. Learning how to program in one or both of these will give your resume an extra boost.

Should I specialize in P&C (property and casualty) or life?

As a newcomer to the actuarial field, remain open to both areas until you have worked as an actuary for several years. As you gain experience, your skills and interests will probably point to the area of specialization that fits you best.

Guo's Mock Exam

Allotted time: 180 minutes

To make the best use of this mock exam, please print this PDF, find a quiet place, and take the exam under strict exam-like conditions.

Turn to the next page.

Q1

Random variable X has a Poisson distribution with mean \( \lambda = 3.2 \). Calculate the mode of X.

A 2    B 3    C 4    D 5    E 6

Q2

Random variable X has the following pdf:

\( f(x) = 3x^2 \), where 0 < x < 1

You take a random sample of size 3 with median Y. Find the pdf of Y.

A \( 18y^5(1 - y^3) \)
B \( 4y(1 - y^2) \)
C \( 12y^2(1 - y) \)
D \( 6y^2(1 - y^3) \)
E \( 30y^4(1 - y) \)

Q3

An insurance company divides its large pool of policyholders into three groups: standard class, preferred class, and super preferred class. You are given the following information:

- There are 16 times as many policyholders in the standard class as in the super preferred class.
- The number of policyholders in the super preferred class is one third of the number of policyholders in the preferred class.
- The probability that a super preferred policyholder dies next year is one-sixth of the probability that a standard policyholder dies next year.
- The probability that a preferred policyholder dies next year is two thirds of the probability that a standard policyholder dies next year.

Calculate the conditional probability that a policyholder is super preferred given he dies next year.

A 0.5%    B 1%    C 1.5%    D 2%    E 2.5%

Q4

A multi-line insurer sells 3 types of insurance policies: auto, home, and life. Policyholders who have two or three types of policies get a multi-line discount. Due to the discount, the policyholders who have all three policies have an 85% chance of renewing their policies next year. In contrast, those who have only two policies and those who have only one policy have a 70% and a 50% chance, respectively, of renewing their policies next year.

You are given:

- All policyholders have at least one policy.
- 51% of the policyholders have home policies.
- 37% of the policyholders have life policies.
- 15% of the policyholders have all three policies.
- 22% of the policyholders have both auto and home policies.
- 17% of the policyholders have both auto and life policies.
- 19% of the policyholders have both home and life policies.

Calculate the conditional probability that a randomly selected policyholder has only auto insurance given he renews his policy or policies next year.

A 0.13    B 0.15    C 0.17    D 0.22    E 0.27

Q5

An insurance company sells a special decreasing life insurance policy. The policy provides a $30,000 death benefit if the insured dies in Year 1. The death benefit decreases each year by $5,000. The conditional probability that the insured dies each year, given he's still alive at the beginning of the year, is 0.2. Calculate the pure premium of this policy.

A 14,875    B 15,000    C 15,240    D 15,750    E 16,200

Q6

X is the loss random variable with density function \( f(x) = \frac{1}{4} e^{-x/4} \). Z is the portion of the loss not covered by the insurance. Z is equal to 1 with probability of 0.4 and equal to 0 with probability of 0.6. X and Z are independent. Calculate Var(XZ).

A 7.24    B 8.24    C 9.24    D 10.24    E 11.24

Q7

The MGF for a loss random variable X is \( M_X(t) = \frac{0.2}{0.2 - t} \). The payment is Y = 80% X + 10. What's the MGF for the payment?

A \( \frac{1}{1 - 4t} e^{10t} \)
B \( \frac{0.2}{0.2 - t} e^{10t} \)
C \( \frac{0.2}{0.2 - t} e^{0.8t} \)
D \( \frac{1}{1 - 4t} e^{0.8t} \)
E \( \frac{1}{1 - 0.2t} e^{10t} \)

Q8

The joint pdf is f(x,y) = 2|x|y where -1 < x < 1 and 0 < y < 1. Calculate E ( X 2Y ) .

A 0.0    B 0.25    C 0.33    D 0.50    E 0.75

Q9

April 15 is approaching. Two taxpayers, Adam and Bob, plan to visit the same IRS office in town to ask tax questions. Adam's arrival time at the IRS office is uniformly distributed over [10:00 am, 12:30 pm]. Bob's arrival time at the IRS office is uniformly distributed over [11:00 am, 1:00 pm]. Adam's arrival time is independent of Bob's arrival time. The IRS office hours are from 8:00 am to 4:00 pm except for a 30-minute lunch break, which lasts from 11:30 am to noon.

X = The probability that both Adam and Bob have to wait for no more than 20 minutes before they can approach an IRS clerk to ask a question.

Y = The probability that both Adam and Bob have to wait for no more than 20 minutes before they can approach an IRS clerk to ask a question, given they are already waiting.

Calculate X + Y.

A 0.3    B 0.5    C 0.8    D 1.0    E 1.3

Q10

f ( x, y ) = 5( y x) 4 (y 4y 4 2 )( 4 y ) where 0 < x < y and 2 < y < 4 . Calculate E(X).

A 0.00    B 0.11    C 0.28    D 0.51    E 0.75

Q11, Q12, Q13, Q14 The liability that results from a car accident falls into two categories: the property damage liability and the personal injury liability. Let random variables X and Y represent the dollar amount (in $10,000) of the property damage liability and the personal injury liability respectively. The pdf of X is:

\( f_X(x) = \frac{1}{8} x \), where \( 0 \le x \le k \) and k is a constant.

Given \( X = x \), Y is uniformly distributed over \( [x, 2x] \). An insurance policy is written to cover the sum of X and Y. There's a deductible of $30,000 and security loading of 30%.

Q11 Calculate the probability of having an accident where the property damage liability exceeds $20,000 and the personal injury liability exceeds $30,000.

A 0.688   B 0.712   C 0.733   D 0.752   E 0.801

Q12 Calculate the probability that the insurer doesn't incur any claims.

A 0.078   B 0.082   C 0.094   D 0.102   E 0.124

Q13 Calculate the gross premium.

A 4,000   B 4,300   C 4,600   D 4,900   E 5,200

Q14 Calculate the expected non-zero claim payment.

A 3,250   B 3,750   C 3,985   D 4,000   E 4,150

Q15 You take a bus to work every workday. Your journey to work consists of 3 independent components:

The time for you to walk from home to the bus stop nearby is normally distributed with mean of 5 minutes and standard deviation of 2 minutes. Your bus arrives immediately after you get to the bus stop.
The time for a bus to take you to the stop near your office is normally distributed with mean of 20 minutes and standard deviation of 6 minutes.
The time for you to walk from the bus stop to your company is normally distributed with mean of 3 minutes and standard deviation of 1 minute.

Your boss drives to work every workday. The time for him to drive to work is normally distributed with mean of 26 minutes and standard deviation of 5 minutes. Calculate the probability that in a workday you arrive at work at least 5 minutes earlier than your boss does (assuming you leave your home and your boss leaves his home at the same time).

A 13%   B 19%   C 25%   D 31%   E 37%

Q16 A manufacturing plant purchases a special product defect insurance policy. The insurance provides a payment of $1,000 if the number of defects is 20. Then for each full 5 incremental defects, the insurance pays an additional $500. However, the payment by the insurance policy can be no more than $2,800 regardless of the number of defects.

The probability is 0.19 that the plant will have less than 5 defects.
The probability is 0.36 that the plant will have less than 10 defects.
The probability is 0.51 that the plant will have less than 15 defects.
The probability is 0.64 that the plant will have less than 20 defects.
The probability is 0.75 that the plant will have less than 25 defects.
The probability is 0.84 that the plant will have less than 30 defects.
The probability is 0.91 that the plant will have less than 35 defects.
The probability is 0.96 that the plant will have less than 40 defects.
The probability is 0.99 that the plant will have less than 45 defects.
The probability is 1.00 that the plant will have less than 50 defects.

Let Y represent the payment by the insurance policy. Calculate \( E(Y) / \sigma(Y) \), the ratio of the mean to the standard deviation of Y.

A 0.48   B 0.53   C 0.58   D 0.63   E 0.68

Q17 X is a Poisson random variable. \( \frac{F(2)}{F(1)} = 2.125 \). Find \( E(X) \).

A 3   B 4   C 5   D 6   E 7

Q18 Random variables X and Y are jointly uniformly distributed over the area \( 0 \le |X| + |Y| \le 1 \). Calculate \( \sigma_{XY} \), the standard deviation of XY.

A 0.000   B 0.016   C 0.032   D 0.105   E 0.250

Q19 Random variables X, Y, Z have the following joint pdf: \( f(x,y,z) = \frac{1}{xy} \), where \( 0 < x < y < z < 1 \). Calculate the probability that at least two of the three random variables X, Y, Z are less than 0.1.

A 0.20   B 0.25   C 0.30   D 0.35   E 0.40

Q20 In a small town, 55% are men and 45% are women. 74% of the women read the local newspaper every day. Given that someone reads the local newspaper every day, the probability that the reader is a male is 57.53%. Calculate the percentage of the men who read the local newspaper every day.

A 0.78   B 0.82   C 0.88   D 0.92   E 0.98

Q21 A wife's and a husband's lives are uniformly distributed on (0, 20) in years. Find the conditional probability that the husband outlives the wife given that the husband is still alive 10 years from today.

A 0.55   B 0.65   C 0.75   D 0.85   E 0.95

Q22 X is exponentially distributed with mean of 1. Y = X 4 . Calculate \( f_Y(2) \), the pdf of Y at \( y = 2 \).

A 0.14   B 0.17   C 0.21   D 0.25   E 0.31

Q23 Random variable X has the following moment generating function: \( M_X(t) = (1 - 2t)^{-1/2} \). Calculate the coefficient of variation.

A 0.2   B 0.5   C 0.8   D 1.1   E 1.4

Q24 Random loss variables X and Y have the following joint pdf: \( f(x,y) = x + k y^2 \), where \( 0 \le x \le y \le 1 \). An insurance policy is written to cover \( X + Y \). The maximum claim payment is $1. Calculate the net premium.

A $0.75   B $0.94   C $1.0   D $1.6   E $2.2

Q25 A device has 4 duplicate components working in parallel. The device works as long as at least one component works; it fails only if all four components fail simultaneously. The times-to-failure of the 4 components are independent exponential random variables with mean of 1, 2, 3, and 4 hours respectively. Calculate the probability that the device is still working 5 hours later.

A 0.32   B 0.37   C 0.42   D 0.47   E 0.52

Q26 Loss random variable X has the following pdf: \( f(x) = 0.02x \), where \( 0 < x < 10 \). The insurer has a deductible of 4 per loss. Calculate the expected claim payment net of deductible.

A 2.5   B 2.7   C 2.9   D 3.4   E 4.2

Q27 Two discrete random variables X and Y are jointly distributed over a series of points. The joint probability mass function is \( p_{X,Y}(x,y) = \frac{1}{28} \). You are also given:

X = 0, 1, 2, 3, 4, 5, 6
Y = 0, ..., X (i.e. for each value of X, Y is a non-negative integer ranging from 0 to X)

Calculate \( \frac{Var(X)}{Var(Y)} \).

A 0   B 0.5   C 1   D 1.5   E 2

Q28 A company pays a benefit of 100 for each of its 1000 employees if an employee dies next year. The probability that an employee dies next year is 2%. What is the amount the company needs to have in a fund in order to ensure a 95% chance that it can cover the loss?

A 2,500   B 2,800   C 3,100   D 3,400   E 3,700

Q29 Random variables X and Y have the following joint pdf: \( f(x,y) = k \frac{x}{y} \), where \( 0 < x < y < 1 \) and k is a constant. Calculate \( E\left(X \mid Y = \tfrac{2}{3}\right) + Var\left(X \mid Y = \tfrac{2}{3}\right) \).

A 0.17   B 0.27   C 0.37   D 0.47   E 0.57

Q30 Random variable X has the following moment generating function: \( M_X(t) = \frac{1}{1 - \theta t} \), where \( \theta \) is a positive constant. Calculate the probability that X is within one standard deviation of its mean.

A 0.46   B 0.56   C 0.66   D 0.76   E 0.86

Solution to Guo's Mock Exam

Answer key:

 1 B    2 A    3 B    4 E    5 C    6 D    7 A    8 C    9 E   10 D
11 A   12 C   13 D   14 E   15 B   16 E   17 A   18 D   19 A   20 B
21 C   22 A   23 E   24 B   25 D   26 C   27 C   28 B   29 D   30 E

Each of A, B, C, D, and E is the correct answer 6 times.

If you answered at least 20 questions correctly, you passed this exam.

Q1 Random variable X has a Poisson distribution with mean \( \lambda = 3.2 \). Calculate the mode of X.

A 2   B 3   C 4   D 5   E 6

Solution B

\( p_X(x) = e^{-\lambda} \frac{\lambda^x}{x!} \), where \( x = 0, 1, 2, ..., n, ... \)

Mode = Most Often (Most Observed). At the mode, \( p_X(x) \) reaches its max value. If X is continuous, we'll find the mode by solving the equation \( \frac{d}{dx} f_X(x) = 0 \). If X is discrete, we can't set \( \frac{d}{dx} p_X(x) = 0 \). The general strategy for finding the mode of a discrete random variable is to look at the ratio \( \frac{p(x)}{p(x-1)} \).

In this case,

\( p(x-1) = e^{-\lambda} \frac{\lambda^{x-1}}{(x-1)!} \), \( p(x) = e^{-\lambda} \frac{\lambda^x}{x!} \), \( \frac{p(x)}{p(x-1)} = \frac{\lambda}{x} \)

So \( p(x) \ge p(x-1) \) if \( \frac{\lambda}{x} \ge 1 \), or \( x \le \lambda \). For \( x = 0, 1, 2, ... \), \( p(x) \) will keep increasing as x increases until \( x = int(\lambda) \). Here \( int(\lambda) \) means the integer portion of \( \lambda \): \( int(\lambda) = int(3.2) = 3 \). At \( x = int(\lambda) \), \( p(x) \) reaches its peak. Then \( p(x) \) declines as x exceeds \( int(\lambda) \). So here the mode is \( int(3.2) = 3 \). Let's test a few values:

X = x    \( p_X(x) = e^{-3.2} \frac{3.2^x}{x!} \)
0        0.040762
1        0.130439
2        0.208702
3        0.222616
4        0.178093
5        0.113979
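The ratio argument above can be checked numerically. The following is a minimal Python sketch, not part of the original solution; the function names `poisson_pmf` and `poisson_mode` are illustrative choices. It walks x upward for as long as the ratio \( p(x+1)/p(x) = \lambda/(x+1) \) is at least 1.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability mass function p(x) = e^(-lam) * lam^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_mode(lam: float) -> int:
    """Walk upward while p(x+1)/p(x) = lam/(x+1) is at least 1."""
    x = 0
    while lam / (x + 1) >= 1:
        x += 1
    return x


lam = 3.2
pmf_table = {x: poisson_pmf(x, lam) for x in range(6)}
mode = poisson_mode(lam)

for x, p in pmf_table.items():
    print(x, round(p, 6))
print("mode:", mode)
```

For a non-integer mean such as 3.2 the loop stops at the integer portion of the mean. When the mean is an integer, two adjacent values tie for the maximum and this sketch returns the larger of the two.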

As a bonus point, let's find the mode of a binomial random variable X with parameters n and p.

\( P(x) = C_n^x p^x q^{n-x} \), \( P(x-1) = C_n^{x-1} p^{x-1} q^{n-x+1} \)

\( \frac{P(x)}{P(x-1)} = \frac{C_n^x p^x q^{n-x}}{C_n^{x-1} p^{x-1} q^{n-x+1}} = \frac{n-x+1}{x} \cdot \frac{p}{q} \)

Set \( \frac{n-x+1}{x} \cdot \frac{p}{q} \ge 1 \). Then \( p(n-x+1) \ge qx \), \( pn - px + p \ge qx \), \( px + qx \le p(n+1) \). However, \( p + q = 1 \). So we have \( x \le p(n+1) \).

The mode of a Poisson random variable is \( int(\lambda) \).
The mode of a binomial random variable X is \( int[p(n+1)] \).

Q2 Random variable X has the following pdf: \( f(x) = 3x^2 \), where \( 0 < x < 1 \). You take a random sample of 3 with median Y. Find the pdf of Y.

A \( 18y^5(1 - y^3) \)   B \( 4y(1 - y^2) \)   C \( 12y^2(1 - y) \)   D \( 6y^2(1 - y^3) \)   E \( 30y^4(1 - y) \)

Solution A

\( f(x) = 3x^2 \), \( P(X \le x) = \int_0^x f(t)\,dt = \int_0^x 3t^2\,dt = x^3 \), \( P(X > x) = 1 - x^3 \)

Of the 3 samples taken, one must fall in the range \( (y, y + dy) \), one of the 2 remaining samples must be greater than \( y + dy \), and the 3rd sample must be smaller than y. This way, we are guaranteed that the median of the 3 samples falls in the range \( (y, y + dy) \).

\( f_Y(y)\,dy = 3! \times P(X > y + dy) \times P(y < X < y + dy) \times P(X < y) \)

Here 3! is the number of permutations, \( P(X > y + dy) \) is the probability that one sample is greater than y, \( P(y < X < y + dy) \) is the probability that one sample falls in \( (y, y + dy) \), and \( P(X < y) \) is the probability that one sample is less than y.

\( P(X > y + dy) = P(X > y) = 1 - y^3 \) because dy is tiny
\( P(X < y) = y^3 \), \( P(y < X < y + dy) = f_X(y)\,dy = 3y^2\,dy \)

\( f_Y(y)\,dy = 3!\,(1 - y^3)(3y^2\,dy)(y^3) = 18y^5(1 - y^3)\,dy \), where \( 0 < y < 1 \)

\( f_Y(y) = 18y^5(1 - y^3) \), where \( 0 < y < 1 \)

Double check:

\( \int_0^1 f_Y(y)\,dy = \int_0^1 18y^5(1 - y^3)\,dy = 18 \int_0^1 (y^5 - y^8)\,dy = 18 \left[ \frac{y^6}{6} - \frac{y^9}{9} \right]_0^1 = 18 \left( \frac{1}{6} - \frac{1}{9} \right) = 3 - 2 = 1 \)

So \( f_Y(y) = 18y^5(1 - y^3) \) is a legitimate pdf having a total probability of one.

Q3 An insurance company divides its large pool of policyholders into three groups: standard class, preferred class, and super preferred class. You are given the following information:

There are 16 times as many policyholders in the standard class as in the super preferred class.
The number of policyholders in the super preferred class is one third of the number of policyholders in the preferred class.
The probability that a super preferred policyholder dies next year is one-sixth of the probability that a standard policyholder dies next year.
The probability that a preferred policyholder dies next year is two thirds of the probability that a standard policyholder dies next year.

Calculate the conditional probability that a policyholder is super preferred given he dies next year.

A 0.5%   B 1%   C 1.5%   D 2%   E 2.5%


Solution B

Let x represent the portion of the policyholders that are in the super preferred class. Let y represent the probability that a super preferred policyholder dies next year. To solve the problem, we don't need to know x or y (though we can calculate x). Please note we don't have enough info to calculate y.

Event: A policyholder dies next year

Segment           Segment's size   Segment's prob     Segment's contribution   Segment's contribution %
Standard          16x              6y                 16x(6y) = 96xy           96xy / 109xy = 88.07%
Preferred         3x               (2/3)(6y) = 4y     3x(4y) = 12xy            12xy / 109xy = 11.01%
Super preferred   x                y                  x(y) = xy                xy / 109xy = 0.92%
Total             100%                                (96 + 12 + 1)xy = 109xy  100%

The conditional probability that a policyholder is super preferred given he dies next year is 0.92%.

Q4 A multi-line insurer sells 3 types of insurance policies: auto, home, and life. Policyholders who have two or three types of policies get a multi-line discount. Due to the discount, the policyholders who have all three policies have an 85% chance of renewing their policies next year. In contrast, those who have only two policies and only one policy have a 70% and a 50% chance of renewing their policies next year. You are given:

All policyholders have at least one policy
51% of the policyholders have home policies
37% of the policyholders have life policies
15% of the policyholders have all three policies
22% of the policyholders have both auto and home policies
17% of the policyholders have both auto and life policies
19% of the policyholders have both home and life policies

Calculate the conditional probability that a randomly selected policyholder has only auto insurance given he renews his policy or policies next year.

A 0.13   B 0.15   C 0.17   D 0.22   E 0.27

Solution E

[Venn diagram: Auto 55%, Home 51%, Life 37%. Auto only 31%, home only 25%, life only 16%; auto and home only 7%, auto and life only 2%, home and life only 4%; all three 15%.]

Event: A randomly selected policyholder renews his policy (policies) next year

Segment               Segment size   Segment's probability to make the event happen   Segment's contribution   Segment's contribution %
Auto only             31%            50%                                               31%(50%) = 0.1550        0.1550/0.5785 = 26.79%
Home only             25%            50%                                               25%(50%) = 0.1250        0.1250/0.5785 = 21.61%
Life only             16%            50%                                               16%(50%) = 0.0800        0.0800/0.5785 = 13.83%
Two policies only     13%            70%                                               13%(70%) = 0.0910        0.0910/0.5785 = 15.73%
All three policies    15%            85%                                               15%(85%) = 0.1275        0.1275/0.5785 = 22.04%
Total                 100%                                                             0.5785                   100.00%

The probability is 27% that a randomly selected policyholder has only auto insurance given he renews his policy or policies next year.

Q5 An insurance company sells a special decreasing life insurance policy. The policy provides a $30,000 death benefit if the insured dies in Year 1. The death benefit decreases each year by $5,000. The conditional probability that the insured dies each year, given he's still alive at the beginning of the year, is 0.2. Calculate the pure premium of this policy.

A 14,875   B 15,000   C 15,240   D 15,750   E 16,200

Solution C

The pure premium is the expected death benefit.

Year   Death benefit   Probability of death that year
1      30,000          0.2
2      25,000          0.8 * 0.2
3      20,000          0.8^2 * 0.2
4      15,000          0.8^3 * 0.2
5      10,000          0.8^4 * 0.2
6      5,000           0.8^5 * 0.2

For example, this is how to find the probability that the insured dies in Year 2. For him to die in Year 2, he must (1) be alive at the end of Year 1 (with probability 0.8), and (2) die in Year 2 given he's alive at the end of Year 1 (probability 0.2). So the probability that the insured dies in Year 2 is 0.8 * 0.2. Mathematically, this is:

\( P(\text{die in Year 2}) = P(\text{alive at end of Year 1}) \times P(\text{die in Year 2} \mid \text{alive at end of Year 1}) = 0.8 \times 0.2 \)

The expected death benefit (in $1,000):

\( 30(0.2) + 25(0.8 \times 0.2) + 20(0.8^2 \times 0.2) + 15(0.8^3 \times 0.2) + 10(0.8^4 \times 0.2) + 5(0.8^5 \times 0.2) = 15.24288 \)

So the pure premium is $15,243.

Since this problem doesn't ask you to calculate the variance of the payment, you don't need to use BA II Plus 1-V Statistics Worksheet. If the problem asks you to also find the standard deviation, then you might want to use BA II Plus 1-V Statistics Worksheet. However, you'll need to be very careful if you do use BA II Plus 1-V Statistics Worksheet: you'll need to include the probability of zero payment at the end of Year 6.

The total probability that the policyholder dies in the 6-year period is 0.2 + 0.16 + 0.128 + 0.1024 + 0.08192 + 0.065536 = 0.737856. The probability that he is still alive at the end of Year 6 is 1 - 0.737856 = 0.262144.

A quick way to calculate the probability of being alive at the end of Year 6: \( 0.8^6 = 0.262144 \)

Next, you can set up the following table:

Year         Death benefit   Probability of getting this death benefit   Scaled-up probability (multiply the probability by 1,000)
1            30,000          0.2                                          200
2            25,000          0.8 * 0.2 = 0.16                             160
3            20,000          0.8^2 * 0.2 = 0.128                          128
4            15,000          0.8^3 * 0.2 = 0.1024                         102
5            10,000          0.8^4 * 0.2 = 0.08192                        82
6            5,000           0.8^5 * 0.2 = 0.065536                       66
After 6      0               0.8^6 = 0.262144                             262
Total                        1.0                                          1,000

Next, enter the following in 1-V Statistics Worksheet (setting $1,000 as one unit of money to speed up the calculation):

Year         Death benefit   Scaled-up probability
1            X01=30          Y01=200
2            X02=25          Y02=160
3            X03=20          Y03=128
4            X04=15          Y04=102
5            X05=10          Y05=82
6            X06=5           Y06=66
After 6      X07=0           Y07=262
Total                        1,000

You should get: \( E(X) = 15.24 \) = $15,240, \( \sigma_X = 11.47790922 \) = $11,477.91
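As a cross-check on the worksheet above, the same mean and standard deviation can be computed directly from the unrounded probabilities. This is a small Python sketch rather than the book's calculator method; the variable names are illustrative.

```python
import math

q = 0.2                               # yearly death probability given alive at start of year
benefits = [30, 25, 20, 15, 10, 5]    # death benefit in $1,000 for Years 1..6

# (payment, probability) pairs: die in year k+1 with probability 0.8^k * 0.2
outcomes = [(b, (1 - q) ** k * q) for k, b in enumerate(benefits)]
# survive all 6 years: zero payment with probability 0.8^6
outcomes.append((0, (1 - q) ** len(benefits)))

total_prob = sum(p for _, p in outcomes)
mean = sum(b * p for b, p in outcomes)
second_moment = sum(b * b * p for b, p in outcomes)
std_dev = math.sqrt(second_moment - mean ** 2)

print(round(total_prob, 6), round(mean, 5), round(std_dev, 4))
```

The mean agrees with the pure premium of 15.24288 (in $1,000). The standard deviation comes out near 11.477, slightly different from the worksheet figure because the worksheet uses probabilities rounded to three decimals.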

If you forget to include the zero payment, your table will become (setting $1,000 as one unit of money to speed up the calculation):

Year    Death benefit   Scaled-up probability (multiply the probability by 1,000)
1       X01=30          Y01=200
2       X02=25          Y02=160
3       X03=20          Y03=128
4       X04=15          Y04=102
5       X05=10          Y05=82
6       X06=5           Y06=66
Total                   738

You'll get: \( E(X) = 20.6504065 \) = $20,650.41, \( \sigma_X = 8.17224837 \) = $8,172.25. Wrong!

The \( E(X) \) you got is really \( E(X \mid X > 0) \); the \( \sigma_X \) you got is really \( \sigma_{X \mid X > 0} \):

\( E(X \mid X > 0) = 20.6504065 \) = $20,650.41, \( \sigma_{X \mid X > 0} = 8.17224837 \) = $8,172.25

Q6 X is the loss random variable with density function \( f(x) = \frac{1}{4} e^{-x/4} \). Z is the portion of the loss not covered by the insurance. Z is equal to 1 with probability 0.4 and equal to 0 with probability 0.6. X and Z are independent. Calculate \( Var(XZ) \).

A 7.24   B 8.24   C 9.24   D 10.24   E 11.24

Solution D

Notice that X is an exponential random variable with mean 4. Its mean is 4 and its variance is 16. In addition,

\( E(Z) = 1(0.4) = 0.4 \); \( E(Z^2) = 1^2(0.4) = 0.4 \)

\( Var(XZ) = E\left[(XZ)^2\right] - E^2(XZ) \)

\( E\left[(XZ)^2\right] = E(X^2 Z^2) \). X and Z are independent. Then \( X^2 \) and \( Z^2 \) are independent, so \( E(X^2 Z^2) = E(X^2) E(Z^2) \).

\( E(X^2) = E^2(X) + Var(X) = 4^2 + 4^2 = 32 \), \( E(Z^2) = 0.4 \)

\( E\left[(XZ)^2\right] = E(X^2) E(Z^2) = 32(0.4) = 12.8 \)

\( E(XZ) = E(X) E(Z) = 4(0.4) = 1.6 \)

\( Var(XZ) = E\left[(XZ)^2\right] - E^2(XZ) = 12.8 - 1.6^2 = 10.24 \)

Q7 The MGF for a loss random variable X is \( M_X(t) = \frac{0.2}{0.2 - t} \). The payment is \( Y = 80\% X + 10 \). What's the MGF for the payment?

A \( \frac{1}{1 - 4t} e^{10t} \)   B \( \frac{0.2}{0.2 - t} e^{10t} \)   C \( \frac{0.2}{0.2 - t} e^{0.8t} \)   D \( \frac{1}{1 - 4t} e^{0.8t} \)   E \( \frac{1}{1 - 0.2t} e^{10t} \)

Solution A

\( M_{aX+b}(t) = M_X(at)\, e^{bt} \)

\( M_{0.8X+10}(t) = M_X(0.8t)\, e^{10t} = \frac{0.2}{0.2 - 0.8t}\, e^{10t} = \frac{1}{1 - 4t}\, e^{10t} \)
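The closed form above can be sanity-checked by integrating \( e^{tY} \) against the density of X directly. The sketch below is an illustration only: it assumes X is exponential with mean 5, which is the distribution whose MGF is \( 0.2/(0.2 - t) \), and it uses a plain midpoint rule with an arbitrary truncation point; the function names are hypothetical.

```python
import math


def mgf_payment_numeric(t: float, upper: float = 300.0, steps: int = 300_000) -> float:
    """Midpoint-rule estimate of E[exp(t * (0.8 X + 10))] for X exponential with rate 0.2."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # integrand: exp(t * y) * density of X, with y = 0.8 x + 10
        total += 0.2 * math.exp(t * (0.8 * x + 10) - 0.2 * x)
    return total * h


def mgf_payment_closed(t: float) -> float:
    """Answer A: exp(10 t) / (1 - 4 t), valid for t < 0.25."""
    return math.exp(10 * t) / (1 - 4 * t)


for t in (-0.5, 0.05, 0.1):
    print(t, round(mgf_payment_numeric(t), 6), round(mgf_payment_closed(t), 6))
```

The two columns should agree to several decimals for any t below 0.25; beyond that point the MGF does not exist and the numeric integral diverges.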

Q8 The joint pdf is \( f(x,y) = 2|x|y \), where \( -1 < x < 1 \) and \( 0 < y < 1 \). Calculate \( E(X^2 Y) \).

A 0.0   B 0.25   C 0.33   D 0.50   E 0.75

Solution C

\( E(X^2 Y) = \int_{-1}^{1} \int_0^1 (x^2 y)\, 2|x| y \,dy\,dx = \int_{-1}^{1} 2|x| x^2 \int_0^1 y^2 \,dy\,dx = \frac{2}{3} \int_{-1}^{1} |x| x^2 \,dx \)

\( = \frac{2}{3} \left[ \int_{-1}^{0} (-x^3)\,dx + \int_0^1 x^3\,dx \right] = \frac{2}{3} \left( \frac{1}{4} + \frac{1}{4} \right) = \frac{1}{3} \)

Q9 April 15 is approaching. Two taxpayers, Adam and Bob, plan to visit the same IRS office in town to ask tax questions. Adam's arrival time at the IRS office is uniformly distributed over [10:00 am, 12:30 pm]. Bob's arrival time at the IRS office is uniformly distributed over [11:00 am, 1:00 pm]. Adam's arrival time is independent of Bob's arrival time. The IRS office hours are from 8:00 am to 4:00 pm except for a 30-minute lunch break, which lasts from 11:30 am to noon.

X = The probability that both Adam and Bob have to wait for no more than 20 minutes before they can approach an IRS clerk to ask a question.
Y = The probability that both Adam and Bob have to wait for no more than 20 minutes before they can approach an IRS clerk to ask a question, given they are already waiting.

Calculate X + Y.

A 0.3   B 0.5   C 0.8   D 1.0   E 1.3

Solution E

X = P(Adam's waiting time \( \le \) 20 minutes) * P(Bob's waiting time \( \le \) 20 minutes)
= [1 - P(Adam's waiting time > 20 minutes)] * [1 - P(Bob's waiting time > 20 minutes)]

The only way for Adam or Bob to wait for more than 20 minutes is to arrive during [11:30 am, 11:40 am]. If they arrive by 11:30 am (the start of the IRS lunch break), they will be served immediately without waiting. If they arrive after 11:40 am (11:45 am for example), their waiting time is less than 20 minutes.

P(Adam's waiting time > 20 minutes) = P(Adam arrives between 11:30 am and 11:40 am)

Adam's arrival time is uniformly distributed over [10:00 am, 12:30 pm], a 2.5-hour interval. Then
P(Adam arrives between 11:30 am and 11:40 am) = 10 minutes / 2.5 hours = 10 minutes / (2.5*60 minutes) = 1/15

Similarly, Bob's arrival time is uniformly distributed over [11:00 am, 1:00 pm], a 2-hour interval.
P(Bob arrives between 11:30 am and 11:40 am) = 10 minutes / 2 hours = 10 minutes / (2*60 minutes) = 1/12

X = [1 - P(Adam's waiting time > 20 minutes)] * [1 - P(Bob's waiting time > 20 minutes)] = (1 - 1/15)*(1 - 1/12) = 0.856

Next, let's calculate Y. Here the starting point is that Adam and Bob are both waiting now. To have them both wait, they must each arrive during the interval [11:30 am, noon]. So instead of considering the original arrival intervals [10:00 am, 12:30 pm] for Adam and [11:00 am, 1:00 pm] for Bob, we'll consider the updated arrival interval [11:30 am, noon] for both Adam and Bob. This is called shrinking the sample space (the arrival of new information reduces your original sample space).

So finding Y is reduced to: Adam's arrival time is uniformly distributed over [11:30 am, noon]. Bob's arrival time is uniformly distributed over [11:30 am, noon]. Adam and Bob are independent. What's the probability that Adam and Bob each arrive during [11:40 am, noon]?

The probability that Adam arrives during [11:40 am, noon] given that he arrives during [11:30 am, noon] is: 20 minutes / 30 minutes = 2/3.
The probability that Bob arrives during [11:40 am, noon] given that he arrives during [11:30 am, noon] is: 20 minutes / 30 minutes = 2/3.

The probability that they have to wait at most 20 minutes given that they are already waiting is:
Y = (2/3)*(2/3) = 4/9 = 0.444

X + Y = 0.856 + 0.444 = 1.3
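The interval arithmetic in this solution is easy to reproduce with exact fractions. The following Python sketch is illustrative only; clock times are encoded as minutes after midnight and the helper name `prob_in` is a hypothetical choice.

```python
from fractions import Fraction


def prob_in(window, support):
    """P(a uniform arrival on `support` falls inside `window`); times in minutes after midnight."""
    lo = max(window[0], support[0])
    hi = min(window[1], support[1])
    return Fraction(max(hi - lo, 0), support[1] - support[0])


adam = (600, 750)        # 10:00 am to 12:30 pm
bob = (660, 780)         # 11:00 am to 1:00 pm
long_wait = (690, 700)   # arrivals 11:30 to 11:40 wait more than 20 minutes
lunch = (690, 720)       # arrivals 11:30 to noon have to wait at all
short_wait = (700, 720)  # arrivals 11:40 to noon wait at most 20 minutes

# X: neither Adam nor Bob arrives in the long-wait window
x = (1 - prob_in(long_wait, adam)) * (1 - prob_in(long_wait, bob))

# Y: shrink the sample space to the lunch break for both taxpayers
y = prob_in(short_wait, lunch) ** 2

print(x, y, x + y)
```

Working in exact fractions shows that X + Y is exactly 13/10; the 0.856 and 0.444 in the text are rounded values.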

Q10 \( f(x,y) = \frac{5 (y - x)^4 (y - 2)(4 - y)}{4 y^4} \), where \( 0 < x < y \) and \( 2 < y < 4 \). Calculate \( E(X) \).

A 0.00   B 0.11   C 0.28   D 0.51   E 0.75

Solution D

A lot of candidates will be scared by this problem. They will randomly choose an answer in the exam and move on to the next problem. This is a good strategy. You might want to do it too. However, this problem isn't hard if you know how to manipulate double integration.

Using the formula \( E(X) = \int x f(x)\,dx \) is hard because you have to know \( f(x) \). A fast approach is to use the formula \( E(X) = \iint x f(x,y)\,dx\,dy \).

\( E(X) = \int_2^4 \int_0^y x f(x,y)\,dx\,dy = \int_2^4 \int_0^y x\, \frac{5 (y - x)^4 (y - 2)(4 - y)}{4 y^4}\,dx\,dy = \int_2^4 \frac{5 (y - 2)(4 - y)}{4 y^4} \left[ \int_0^y x (y - x)^4\,dx \right] dy \)

\( \int_0^y x (y - x)^4\,dx = \frac{1}{30} y^6 \)

\( E(X) = \int_2^4 \frac{5 (y - 2)(4 - y)}{4 y^4} \cdot \frac{1}{30} y^6\,dy = \int_2^4 \frac{(y - 2)(4 - y)\, y^2}{24}\,dy = \frac{23}{45} \approx 0.511 \)

Q11, Q12, Q13, Q14 The liability that results from a car accident falls into two categories: the property damage liability and the personal injury liability. Let random variables X and Y represent the dollar amount (in $10,000) of the property damage liability and the personal injury liability respectively. The pdf of X is:

\( f_X(x) = \frac{1}{8} x \), where \( 0 \le x \le k \) and k is a constant.

Given \( X = x \), Y is uniformly distributed over \( [x, 2x] \). An insurance policy is written to cover the sum of X and Y. There's a deductible of $30,000 and security loading of 30%. Calculate:

Q11 The probability of having an accident where the property damage liability exceeds $20,000 and the personal injury liability exceeds $30,000.

A 0.688   B 0.712   C 0.733   D 0.752   E 0.801

Q12 The probability that the insurer doesn't incur any claims.

A 0.078   B 0.082   C 0.094   D 0.102   E 0.124

Q13 The gross premium

A 4,000   B 4,300   C 4,600   D 4,900   E 5,200

Q14 The expected non-zero claim payment

A 3,250   B 3,750   C 3,985   D 4,000   E 4,150

Solution

Q11 A   Q12 C   Q13 D   Q14 E

First, we need to find k. The total probability should be one:

\( \int_0^k \frac{1}{8} x\,dx = \frac{1}{8} \left[ \frac{1}{2} x^2 \right]_0^k = \frac{k^2}{16} = 1 \), \( k = 4 \).

So \( f_X(x) = \frac{1}{8} x \), where \( 0 \le x \le 4 \).

\( f_{X,Y}(x,y) = f(x)\, f(y \mid x) = \frac{1}{8} x \cdot \frac{1}{x} = \frac{1}{8} \), where \( 0 < x \le 4 \) and \( x \le y \le 2x \).

The probability of having an accident where the property damage liability exceeds $20,000 and the personal injury liability exceeds $30,000 is \( P(X > 2 \cap Y > 3) \).

[Figure: Area COD is where \( 0 < x \le 4 \) and \( x \le y \le 2x \). Area ABCDE is where \( 0 < x \le 4 \), \( x \le y \le 2x \), \( x > 2 \), and \( y > 3 \). Coordinates: A(2,3), B(2,4), C(4,8), D(4,4), F(4,3), E(3,3).]

\( P(X > 2 \cap Y > 3) = \iint_{ABCDE} f(x,y)\,dx\,dy = \iint_{ABCDE} \frac{1}{8}\,dx\,dy = \frac{1}{8} \iint_{ABCDE} dx\,dy = \frac{\text{Area ABCDE}}{8} \)

Area ABCDE = Area ABCF - Area DEF

Area ABCF = 0.5*(AB+CF)*AF = 0.5*(1+5)*2 = 6, Area DEF = 0.5*EF*DF = 0.5*1*1 = 0.5

Area ABCDE = 6 - 0.5 = 5.5

\( P(X > 2 \cap Y > 3) = \frac{5.5}{8} = 0.6875 \)

[Figure: The shaded area MON is where \( x + y \le 3 \), \( 0 < x \le 4 \), and \( x \le y \le 2x \). In the shaded area, the total loss doesn't exceed the deductible. Coordinates: M(1, 2) and N(1.5, 1.5).]

The probability that the insurer doesn't incur any claims is:

\( P(X + Y \le 3) = \iint_{MON} f(x,y)\,dx\,dy = \frac{1}{8} \iint_{MON} dx\,dy = \frac{\text{Area MON}}{8} \)

To find Area MON, we need to find the coordinates of M and N by solving the following equations:

\( x + y = 3 \) and \( y = 2x \) give \( x = 1 \) and \( y = 2 \). So M(1, 2).
\( x + y = 3 \) and \( y = x \) give \( x = y = 1.5 \). So N(1.5, 1.5).

Area MON = Area NOP - Area MOP, where P is the point where the line \( x + y = 3 \) meets the y axis, so OP = 3.

Area NOP = 0.5*OP*(x coordinate of N) = 0.5*3*1.5
Area MOP = 0.5*OP*(x coordinate of M) = 0.5*3*1
Area MON = 0.5*3*(1.5 - 1) = 0.75

\( P(X + Y \le 3) = \frac{0.75}{8} = 0.09375 \)
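Because the joint density is the constant 1/8, both Q11 and Q12 reduce to polygon areas. The sketch below checks the two areas with the shoelace formula instead of the trapezoid and triangle decomposition used in the text; it is an independent check, and `polygon_area` is an illustrative helper name.

```python
from fractions import Fraction


def polygon_area(vertices):
    """Shoelace formula; vertices must be listed in order around the boundary."""
    twice_area = 0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(Fraction(twice_area)) / 2


JOINT_DENSITY = Fraction(1, 8)  # f(x, y) = 1/8 on 0 < x <= 4, x <= y <= 2x

# Q11: region ABCDE with A(2,3), B(2,4), C(4,8), D(4,4), E(3,3)
region_q11 = [(2, 3), (2, 4), (4, 8), (4, 4), (3, 3)]
# Q12: triangle with O(0,0), N(1.5,1.5), M(1,2)
region_q12 = [(0, 0), (Fraction(3, 2), Fraction(3, 2)), (1, 2)]

p_q11 = JOINT_DENSITY * polygon_area(region_q11)
p_q12 = JOINT_DENSITY * polygon_area(region_q12)

print(float(p_q11), float(p_q12))
```

The areas come out as 11/2 and 3/4, matching 5.5 and 0.75 in the solutions, so the probabilities are 0.6875 and 0.09375.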

[Figure: The shaded area MCDN is where the insurer has to pay claims, \( x + y > 3 \). It is split into the regions MRN and RCDN. Coordinates: M(1, 2) and N(1.5, 1.5).]

Let Z represent the claim payment. Then \( Z = \max(X + Y - 3, 0) \). The net premium is the expected payment.

\( E(Z) = \iint_{MCDN} (x + y - 3) f(x,y)\,dx\,dy = \iint_{MRN} (x + y - 3) f(x,y)\,dx\,dy + \iint_{RCDN} (x + y - 3) f(x,y)\,dx\,dy \)

\( \iint_{MRN} (x + y - 3) f(x,y)\,dx\,dy = \int_1^{1.5} \int_{3-x}^{2x} (x + y - 3)\, \frac{1}{8}\,dy\,dx = 0.023438 \)

\( \iint_{RCDN} (x + y - 3) f(x,y)\,dx\,dy = \int_{1.5}^{4} \int_{x}^{2x} (x + y - 3)\, \frac{1}{8}\,dy\,dx = 3.737 \)

\( E(Z) = 0.023438 + 3.737 = 3.760438 \) = $3,760.44. This is the net premium.
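The two double integrals above can be confirmed with a brute-force midpoint rule over the whole support, which avoids having to split the region by hand. This is a rough numerical sketch under the same joint density of 1/8; the grid sizes are arbitrary and `expected_payment` is a hypothetical name.

```python
def expected_payment(deductible: float = 3.0, nx: int = 400, ny: int = 400) -> float:
    """Midpoint-rule estimate of E[max(X + Y - deductible, 0)] with f(x, y) = 1/8
    on the support 0 < x <= 4, x <= y <= 2x."""
    total = 0.0
    hx = 4.0 / nx
    for i in range(nx):
        x = (i + 0.5) * hx
        hy = x / ny                      # y runs over [x, 2x], an interval of width x
        for j in range(ny):
            y = x + (j + 0.5) * hy
            payment = max(x + y - deductible, 0.0)
            total += payment * (1.0 / 8.0) * hx * hy
    return total


net_premium = expected_payment()
print(round(net_premium, 4))
```

The estimate should land close to 3.7604, the sum of the two region integrals. Taking the maximum with zero inside the sum handles the region below the deductible automatically.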

The gross premium is: $3,760.44*(1+30%) = $4,888.57

The expected non-zero claim payment (the expected claim payment given there's a claim):

\( E(Z \mid Z > 0) = \frac{E(Z)}{P(Z > 0)} = \frac{E(Z)}{P(X + Y > 3)} = \frac{E(Z)}{1 - P(X + Y \le 3)} = \frac{3{,}760.44}{1 - 0.09375} = 4{,}149.45 \)

To see why \( E(Z \mid Z > 0) = \frac{E(Z)}{P(Z > 0)} \) holds, please note that the unconditional mean is \( E(Z) = \int z f(z)\,dz \). To find the conditional mean, we just need to change the unconditional density \( f(z) \) to the conditional density \( \frac{f(z)}{P(Z > 0)} \). The conditional mean is

\( E(Z \mid Z > 0) = \int z\, \frac{f(z)}{P(Z > 0)}\,dz = \frac{1}{P(Z > 0)} \int z f(z)\,dz = \frac{E(Z)}{P(Z > 0)} \)

Q15 You take a bus to work every workday. Your journey to work consists of 3 independent components:

The time for you to walk from home to the bus stop nearby is normally distributed with mean of 5 minutes and standard deviation of 2 minutes. Your bus arrives immediately after you get to the bus stop.
The time for a bus to take you to the stop near your office is normally distributed with mean of 20 minutes and standard deviation of 6 minutes.
The time for you to walk from the bus stop to your company is normally distributed with mean of 3 minutes and standard deviation of 1 minute.

Your boss drives to work every workday. The time for him to drive to work is normally distributed with mean of 26 minutes and standard deviation of 5 minutes. Calculate the probability that in a workday you arrive at work at least 5 minutes earlier than your boss does (assuming you leave your home and your boss leaves his home at the same time).

A 13%   B 19%   C 25%   D 31%   E 37%

Solution B

Let X represent the number of minutes it takes you to get to work. Let Y represent the number of minutes it takes your boss to get to work. Let \( W = Y - X \). We are asked to calculate \( P(W \ge 5) \).

Let \( X_1 \), \( X_2 \), and \( X_3 \) represent the 3 components of X. \( X = X_1 + X_2 + X_3 \). Because \( X_1 \), \( X_2 \), and \( X_3 \) are independent normal random variables, X is also normal.

\( E(X) = E(X_1 + X_2 + X_3) = E(X_1) + E(X_2) + E(X_3) = 5 + 20 + 3 = 28 \)

\( Var(X) = Var(X_1 + X_2 + X_3) = Var(X_1) + Var(X_2) + Var(X_3) = 2^2 + 6^2 + 1^2 = 41 \)

W is normally distributed.

\( E(W) = E(Y) - E(X) = 26 - 28 = -2 \)

\( Var(W) = Var(Y - X) = Var(Y) + Var(X) = 5^2 + 41 = 66 \)

\( P(W \ge 5) = 1 - P(W < 5) = 1 - \Phi\left( \frac{5 - (-2)}{\sqrt{66}} \right) = 1 - \Phi(0.86) = 19\% \)

Q16 A manufacturing plant purchases a special product defect insurance policy. The insurance provides a payment of $1,000 if the number of defects is 20. Then for each full 5 incremental defects, the insurance pays an additional $500. However, the payment by the insurance policy can be no more than $2,800 regardless of the number of defects.

The probability is 0.19 that the plant will have less than 5 defects.
The probability is 0.36 that the plant will have less than 10 defects.
The probability is 0.51 that the plant will have less than 15 defects.
The probability is 0.64 that the plant will have less than 20 defects.
The probability is 0.75 that the plant will have less than 25 defects.
The probability is 0.84 that the plant will have less than 30 defects.
The probability is 0.91 that the plant will have less than 35 defects.
The probability is 0.96 that the plant will have less than 40 defects.
The probability is 0.99 that the plant will have less than 45 defects.
The probability is 1.00 that the plant will have less than 50 defects.

Let Y represent the payment by the insurance policy. Calculate \( E(Y) / \sigma(Y) \), the ratio of the mean to the standard deviation of Y.

A 0.48   B 0.53   C 0.58   D 0.63   E 0.68

Solution E


Let X represent the number of defects. We set $1,000 as one unit of payment to speed up the calculation. This won't affect the ratio \( E(Y) / \sigma(Y) \).

X >=    X <    Cumulative probability   Incremental probability P(Y)   Raw payout (before max payment is applied)   Payout Y   Scaled-up probability 100*P(Y)
0       20     0.64                     0.64                           0                                            0          64
20      25     0.75                     0.11                           1                                            1          11
25      30     0.84                     0.09                           1.5                                          1.5        9
30      35     0.91                     0.07                           2                                            2          7
35      40     0.96                     0.05                           2.5                                          2.5        5
40      45     0.99                     0.03                           3                                            2.8        3
45      50     1.00                     0.01                           3.5                                          2.8        1
Total                                   1.00                                                                                   100

In BA II Plus 1-V Statistics Worksheet, enter:

X01=0, Y01=64
X02=1, Y02=11
X03=1.5, Y03=9
X04=2, Y04=7
X05=2.5, Y05=5
X06=2.8, Y06=3+1=4

You should get: Mean \( E(Y) = 0.622 \), \( \sigma_Y = 0.91198465 \).

\( E(Y) / \sigma_Y = 0.622 / 0.91198465 = 0.682029 \)

Note (1) The following information is not needed to solve the problem: The probability is 0.19 that the plant will have less than 5 defects. The probability is 0.36 that the plant will have less than 10 defects. The probability is 0.51 that the plant will have less than 15 defects.

Note (2) If you have time to waste in the exam or if you want to make some calculation errors, you can also calculate \( E(Y) \) and \( \sigma_Y \) using the following formulas:

\( E(Y) = \sum y\, p(y) \), \( E(Y^2) = \sum y^2 p(y) \), \( \sigma_Y = \sqrt{E(Y^2) - E^2(Y)} \)

Note (3) Instead of entering X06=2.8, Y06=3+1=4, you can also enter: X06=2.8, Y06=3 and X07=2.8, Y07=1
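The worksheet result can be reproduced with the formulas from Note (2). The short Python sketch below applies the 2.8 cap to the raw payouts and then computes the mean, standard deviation, and their ratio; the variable names are illustrative.

```python
import math

CAP = 2.8  # maximum payment, in units of $1,000

# (raw payout before the cap, incremental probability) for each defect band
bands = [
    (0.0, 0.64),   # fewer than 20 defects
    (1.0, 0.11),   # 20 to 24
    (1.5, 0.09),   # 25 to 29
    (2.0, 0.07),   # 30 to 34
    (2.5, 0.05),   # 35 to 39
    (3.0, 0.03),   # 40 to 44, capped at 2.8
    (3.5, 0.01),   # 45 to 49, capped at 2.8
]

payouts = [(min(raw, CAP), p) for raw, p in bands]

mean = sum(y * p for y, p in payouts)
second_moment = sum(y * y * p for y, p in payouts)
std_dev = math.sqrt(second_moment - mean ** 2)
ratio = mean / std_dev

print(round(mean, 3), round(std_dev, 6), round(ratio, 6))
```

The zero-payout band is included on purpose: dropping it would produce the conditional mean and standard deviation given a positive payment rather than the unconditional ones.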

Note (4) Don't forget to enter X01=0, Y01=64. If you forget to enter the zero value of the random variable and the associated probability, your result will be wrong. When using the BA II Plus 1-V Statistics Worksheet, you must enter the zero value of the random variable X and the associated probability. If you don't enter x = 0, you'll calculate \( E(X \mid X \neq 0) \) and \( Var(X \mid X \neq 0) \) instead of \( E(X) \) and \( Var(X) \).

Q17 X is a Poisson random variable. \( \frac{F(2)}{F(1)} = 2.125 \). Find \( E(X) \).
A 3   B 4   C 5   D 6   E 7

Solution A

\[ \frac{F(2)}{F(1)} = \frac{e^{-\lambda}\left(1 + \lambda + 0.5\lambda^2\right)}{e^{-\lambda}\left(1 + \lambda\right)} = 1 + \frac{0.5\lambda^2}{1 + \lambda} = 2.125 \]

\[ \lambda^2 - 2.25\lambda - 2.25 = 0, \qquad (\lambda - 3)(\lambda + 0.75) = 0, \qquad \lambda = 3 \]

\[ E(X) = \lambda = 3 \]

Q18 Random variables X and Y are jointly uniformly distributed over the area \( 0 \le |X| + |Y| \le 1 \). Calculate \( \sigma_{XY} \), the standard deviation of XY.
A 0.000   B 0.016   C 0.032   D 0.105   E 0.250

Solution D

To solve this problem right, you'll need to correctly identify the 2-D plane for \( 0 \le |X| + |Y| \le 1 \).

If \( X \ge 0 \) and \( Y \ge 0 \), then \( |X| + |Y| = X + Y \); the 2-D plane is \( 0 \le X + Y \le 1 \);
If \( X \le 0 \) and \( Y \le 0 \), then \( |X| + |Y| = -X - Y \); the 2-D plane is \( 0 \le -X - Y \le 1 \), or \( -1 \le X + Y \le 0 \);
If \( X \ge 0 \) and \( Y \le 0 \), then \( |X| + |Y| = X - Y \); the 2-D plane is \( 0 \le X - Y \le 1 \);
If \( X \le 0 \) and \( Y \ge 0 \), then \( |X| + |Y| = -X + Y \); the 2-D plane is \( 0 \le -X + Y \le 1 \), or \( -1 \le X - Y \le 0 \).

So the 2-D plane is ABCD (you should verify this).

[Figure: the diamond ABCD centered at the origin O.]

We'll need to calculate \( f(x, y) = k \). The total probability should be one:

\[ \iint_{ABCD} f(x, y)\,dy\,dx = \iint_{ABCD} k\,dy\,dx = k \iint_{ABCD} dy\,dx = k\,(\text{Area } ABCD) = 1 \]

\[ \text{Area } ABCD = 4\,(\text{Area } AOB) = 4 \cdot \frac{1}{2} = 2, \qquad f(x, y) = k = \frac{1}{2} \]

\[ Var(XY) = E(X^2Y^2) - E^2(XY) \]

\[ E(X^2Y^2) = \iint_{ABCD} x^2y^2 f(x, y)\,dy\,dx = \frac{1}{2} \iint_{ABCD} x^2y^2\,dy\,dx \]

\( x^2y^2 \) is symmetric inside AOB, AOD, COB, and COD.

\[ \iint_{ABCD} x^2y^2\,dy\,dx = 4 \iint_{COD} x^2y^2\,dy\,dx \]

\[ \iint_{COD} x^2y^2\,dy\,dx = \int_0^1 \int_0^{1-x} x^2y^2\,dy\,dx = \int_0^1 x^2 \left( \int_0^{1-x} y^2\,dy \right) dx = \frac{1}{3} \int_0^1 x^2 (1 - x)^3\,dx \]

To quickly solve this integration, you'll want to use the shortcut developed in the chapter on beta distribution (you should memorize this integration shortcut):

\[ \int_0^1 p^m (1 - p)^n\,dp = \frac{1}{(m + n + 1)\,C_{m+n}^{m}} \]

\[ \iint_{COD} x^2y^2\,dy\,dx = \frac{1}{3} \int_0^1 x^2 (1 - x)^3\,dx = \frac{1}{3} \cdot \frac{1}{(2 + 3 + 1)\,C_{2+3}^{2}} = \frac{1}{3\,(6)\,C_5^2} = \frac{1}{180} \]

\[ \iint_{ABCD} x^2y^2\,dy\,dx = 4 \iint_{COD} x^2y^2\,dy\,dx = 4 \cdot \frac{1}{180} \]

\[ E(X^2Y^2) = \frac{1}{2} \iint_{ABCD} x^2y^2\,dy\,dx = \frac{1}{2} \cdot 4 \cdot \frac{1}{180} = \frac{1}{90} \]

Next, we need to calculate \( E^2(XY) \). Please note that XY is positive inside AOB and COD and negative inside AOD and COB. Consequently,

\[ E(XY) = \iint_{ABCD} xy\,f(x, y)\,dy\,dx = \frac{1}{2} \iint_{ABCD} xy\,dy\,dx = \frac{1}{2} \left( \iint_{AOB} xy\,dy\,dx + \iint_{AOD} xy\,dy\,dx + \iint_{COD} xy\,dy\,dx + \iint_{COB} xy\,dy\,dx \right) = 0 \]

\[ Var(XY) = E(X^2Y^2) - E^2(XY) = E(X^2Y^2) = \frac{1}{90} \]

\[ \sigma_{XY} = \sqrt{Var(XY)} = \sqrt{\frac{1}{90}} = 0.105 \]

By the way, if the problem asks you to find \( Var(X) \), you can find the answer using the approach above, starting from

\[ Var(X) = E(X^2) - E^2(X) \]
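The integration shortcut used for \( \sigma_{XY} \) is easy to sanity-check with exact fractions. The sketch below is only an illustration (the helper name `beta_integral` is hypothetical, not from the manual); it uses the identity \( \int_0^1 p^m(1-p)^n\,dp = 1/[(m+n+1)\,C_{m+n}^{m}] \) for non-negative integers m and n.

```python
from fractions import Fraction
from math import comb, sqrt

def beta_integral(m, n):
    """Integral of p**m * (1 - p)**n over [0, 1] for non-negative integers m, n."""
    return Fraction(1, (m + n + 1) * comb(m + n, m))

# One quadrant (x >= 0, y >= 0, x + y <= 1):
# the inner integral of y**2 over [0, 1 - x] is (1 - x)**3 / 3.
quadrant = Fraction(1, 3) * beta_integral(2, 3)

# Four symmetric quadrants, joint density f(x, y) = 1/2.
e_x2y2 = Fraction(1, 2) * 4 * quadrant

# E(XY) = 0 by symmetry, so Var(XY) = E(X^2 Y^2).
var_xy = e_x2y2
sd_xy = sqrt(float(var_xy))
print(quadrant, e_x2y2, sd_xy)
```

Working in `Fraction` keeps every intermediate value exact, so any slip in the shortcut shows up as a mismatched fraction rather than a rounding difference.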

\[ E(X^2) = \iint_{ABCD} x^2 f(x, y)\,dy\,dx = \frac{1}{2} \iint_{ABCD} x^2\,dy\,dx \]

\( x^2 \) is symmetric inside AOB, AOD, COB, and COD.

\[ \iint_{ABCD} x^2\,dy\,dx = 4 \iint_{COD} x^2\,dy\,dx \]

\[ \iint_{COD} x^2\,dy\,dx = \int_0^1 \int_0^{1-x} x^2\,dy\,dx = \int_0^1 x^2 \left( \int_0^{1-x} dy \right) dx = \int_0^1 x^2 (1 - x)\,dx = \frac{1}{12} \]

\[ \iint_{ABCD} x^2\,dy\,dx = 4 \cdot \frac{1}{12} = \frac{1}{3}, \qquad E(X^2) = \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6} \]

X is positive on the right half of ABCD and negative on the left half, and the two halves mirror each other. Consequently, \( E(X) = 0 \).

\[ Var(X) = E(X^2) - E^2(X) = E(X^2) = \frac{1}{6}, \qquad \sigma_X = \sqrt{\frac{1}{6}} = 0.408 \]

Q19 Random variables X, Y, Z have the following joint pdf:

\[ f(x, y, z) = \frac{1}{\sqrt{xy}}, \quad \text{where } 0 < x < y < z < 1 \]

Calculate the probability that at least two of the three random variables X, Y, Z are less than 0.1.
A 0.20   B 0.25   C 0.30   D 0.35   E 0.40

Solution A

"At least two of the three random variables X, Y, Z are less than 0.1" is the same as Y < 0.1. If Y < 0.1, then we are guaranteed to have X < Y < 0.1 regardless of whether Z < 0.1 or Z >= 0.1.

With y < 0.1, the region of integration is \( 0 < x < y \), \( y < z < 1 \), \( 0 < y < 0.1 \).

\[ P(Y < 0.1) = \int_{y=0}^{0.1} \int_{z=y}^{1} \int_{x=0}^{y} x^{-0.5} y^{-0.5}\,dx\,dz\,dy = \int_{y=0}^{0.1} y^{-0.5} \int_{z=y}^{1} \left( \int_{x=0}^{y} x^{-0.5}\,dx \right) dz\,dy \]

\[ = \int_{y=0}^{0.1} y^{-0.5} \int_{z=y}^{1} 2 y^{0.5}\,dz\,dy = 2 \int_{y=0}^{0.1} \int_{z=y}^{1} dz\,dy = 2 \int_{y=0}^{0.1} (1 - y)\,dy = \left[ -(1 - y)^2 \right]_0^{0.1} = 1 - 0.9^2 = 0.19 \]

If the problem asks you to find the probability that at least two of the three random variables X, Y, Z are less than a, where 0 < a < 1, then

\[ P(Y < a) = \left[ -(1 - y)^2 \right]_0^{a} = 1 - (1 - a)^2 = a\,(2 - a) \]

For example, \( P(Y < 0.25) = 0.25\,(2 - 0.25) = 0.4375 \).

Q20 In a small town, 55% are men and 45% are women. 74% of the women read the local newspaper every day. Given that someone reads the local newspaper every day, the probability that the reader is a male is 57.53%. Calculate the percentage of the men who read the local newspaper every day.
A 0.78   B 0.82   C 0.88   D 0.92   E 0.98

Solution B

Event: Someone reads the local newspaper every day.

| Segments | Segment size | Segment's probability to produce the event | Segment's contribution amount | Segment's contribution % |
|---|---|---|---|---|
| Men | 55% | x | 0.55x | 0.55x / (0.55x + 0.333) = 57.53% |
| Women | 45% | 74% | 0.45(0.74) = 0.333 | 0.333 / (0.55x + 0.333) = 1 - 57.53% = 42.47% |
| Total | 100% | | 0.55x + 0.333 | |

Solving \( \frac{0.55x}{0.55x + 0.333} = 57.53\% \), we get \( x = 82\% \).

If you prefer the formula-driven approach, this is how:

\[ P(\text{men} \mid \text{read}) = \frac{P(\text{men} \cap \text{read})}{P(\text{read})} \]

\[ P(\text{read}) = P(\text{men})\,P(\text{read} \mid \text{men}) + P(\text{women})\,P(\text{read} \mid \text{women}) \]

\[ P(\text{men} \cap \text{read}) = P(\text{men})\,P(\text{read} \mid \text{men}) \]

\[ P(\text{men} \mid \text{read}) = \frac{P(\text{men})\,P(\text{read} \mid \text{men})}{P(\text{men})\,P(\text{read} \mid \text{men}) + P(\text{women})\,P(\text{read} \mid \text{women})} \]

\[ 57.53\% = \frac{0.55\,P(\text{read} \mid \text{men})}{0.55\,P(\text{read} \mid \text{men}) + 0.45\,(0.74)}, \qquad P(\text{read} \mid \text{men}) = 82\% \]

Q21 A wife and husband's lives are uniformly distributed on (0, 20) in years. Find the conditional probability that the husband outlives the wife given that the husband is still alive 10 years from today.
A 0.55   B 0.65   C 0.75   D 0.85   E 0.95

Solution C

[Figure: H on the horizontal axis and W on the vertical axis, with labeled points O, A(20,0), B(20,20), C(0,20), D(10,0), E, and F(10,20).]

\[ P(H > W \mid H > 10) = \frac{P(H > W \cap H > 10)}{P(H > 10)} = \frac{\text{Area } ABED}{\text{Area } ABFD} = \frac{0.5\,(10 + 20)(10)}{10\,(20)} = 0.75 \]
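For Q20, the equation from the segment table can be solved in closed form and then plugged back into Bayes' theorem as a check. The following is a hedged sketch; the function name `read_rate_men` and its parameter names are illustrative and not part of the manual.

```python
def read_rate_men(p_men, p_women, read_rate_women, p_men_given_read):
    """Solve p_men*x / (p_men*x + p_women*read_rate_women) = p_men_given_read for x."""
    women_contribution = p_women * read_rate_women
    # Men's contribution relates to women's as posterior odds: m / w = q / (1 - q).
    men_contribution = p_men_given_read * women_contribution / (1 - p_men_given_read)
    return men_contribution / p_men

x = read_rate_men(0.55, 0.45, 0.74, 0.5753)

# Plug x back into Bayes' theorem; this should reproduce the given 57.53%.
posterior_check = 0.55 * x / (0.55 * x + 0.45 * 0.74)
print(x, posterior_check)
```

The first printed value should round to the 82% found above.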

Q22 X is exponentially distributed with mean of 1. \( Y = |X - 4| \). Calculate \( f_Y(2) \), the pdf of Y at y = 2.
A 0.14   B 0.17   C 0.21   D 0.25   E 0.31

Solution A

\[ F_X(x) = 1 - e^{-x} \]

For \( 0 \le y \le 4 \):

\[ F_Y(y) = P(Y \le y) = P(|X - 4| \le y) = P(-y \le X - 4 \le y) = P(-y + 4 \le X \le y + 4) = F_X(y + 4) - F_X(-y + 4) \]

\[ = \left( 1 - e^{-(y + 4)} \right) - \left( 1 - e^{-(-y + 4)} \right) = e^{y - 4} - e^{-(y + 4)} \]

\[ f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} \left( e^{y - 4} - e^{-(y + 4)} \right) = e^{y - 4} + e^{-(y + 4)} \]

\[ f_Y(2) = e^{2 - 4} + e^{-(2 + 4)} = e^{-2} + e^{-6} = 0.14 \]

Q23 Random variable X has the following moment generating function:

\[ M_X(t) = (1 - 2t)^{-\frac{1}{2}} \]

Calculate the coefficient of variation.
A 0.2   B 0.5   C 0.8   D 1.1   E 1.4

Solution E

\[ \frac{d}{dt} M_X(t) = \frac{d}{dt} (1 - 2t)^{-\frac{1}{2}} = (1 - 2t)^{-\frac{3}{2}}, \qquad E(X) = \left. \frac{d}{dt} M_X(t) \right|_{t=0} = (1 - 2 \cdot 0)^{-\frac{3}{2}} = 1 \]

\[ \frac{d^2}{dt^2} M_X(t) = \frac{d}{dt} (1 - 2t)^{-\frac{3}{2}} = 3\,(1 - 2t)^{-\frac{5}{2}} \]

\[ E(X^2) = \left. \frac{d^2}{dt^2} M_X(t) \right|_{t=0} = 3\,(1 - 2 \cdot 0)^{-\frac{5}{2}} = 3 \]

\[ Var(X) = E(X^2) - E^2(X) = 3 - 1^2 = 2 \]

Coefficient of variation:

\[ \frac{\sigma_X}{E(X)} = \frac{\sqrt{2}}{1} = 1.41 \]

Q24 Random loss variables X and Y have the following joint pdf:

\[ f(x, y) = x + k y^2, \quad \text{where } 0 \le x \le y \le 1 \]

An insurance policy is written to cover X + Y. The maximum claim payment is $1. Calculate the net premium.
A $0.75   B $0.94   C $1.0   D $1.6   E $2.2

Solution B

[Figure: the shaded triangle AOB.]

The shaded area AOB is \( 0 \le x \le y \le 1 \). First, we need to calculate k.

\[ \iint_{AOB} f(x, y)\,dx\,dy = 1 \]

\[ \iint_{AOB} f(x, y)\,dx\,dy = \int_0^1 \int_0^y f(x, y)\,dx\,dy = \int_0^1 \int_0^y \left( x + k y^2 \right) dx\,dy = \int_0^1 \left[ \frac{1}{2} x^2 + k x y^2 \right]_0^y dy \]

\[ = \int_0^1 \left( \frac{1}{2} y^2 + k y^3 \right) dy = \left[ \frac{1}{6} y^3 + \frac{k}{4} y^4 \right]_0^1 = \frac{1}{6} + \frac{k}{4} = 1, \qquad k = \frac{10}{3} \]

The insurer pays x + y in AOC, where \( x + y \le 1 \). It pays $1 in ABC, where \( x + y > 1 \). The expected claim is:

\[ \iint_{AOC} (x + y)\,f(x, y)\,dx\,dy + \iint_{ABC} f(x, y)\,dx\,dy \]

\[ \iint_{AOC} (x + y)\,f(x, y)\,dx\,dy = \int_0^{0.5} \int_x^{1-x} (x + y) \left( x + \frac{10}{3} y^2 \right) dy\,dx = 0.22569 \]

Two ways to calculate \( \iint_{ABC} f(x, y)\,dx\,dy \):

\[ \iint_{ABC} f(x, y)\,dx\,dy = \int_{0.5}^{1} \int_{1-y}^{y} \left( x + \frac{10}{3} y^2 \right) dx\,dy = 0.71528 \]

Or

\[ \iint_{ABC} f(x, y)\,dx\,dy = 1 - \iint_{AOC} f(x, y)\,dx\,dy = 1 - \int_0^{0.5} \int_x^{1-x} \left( x + \frac{10}{3} y^2 \right) dy\,dx = 1 - 0.28472 = 0.71528 \]

\[ \iint_{AOC} (x + y)\,f(x, y)\,dx\,dy + \iint_{ABC} f(x, y)\,dx\,dy = 0.22569 + 0.71528 = 0.94097 \]

So the net premium is $0.94.

Q25 A device has 4 duplicate components working in parallel. The device works as long as at least one component works; it fails only if all four components fail. The times-to-failure of the 4 components are independent exponential random variables with means of 1, 2, 3, and 4 hours respectively. Calculate the probability that the device is still working 5 hours later.
A 0.32   B 0.37   C 0.42   D 0.47   E 0.52

Solution D

Let \( X_1, X_2, X_3, X_4 \) represent the times-to-failure of the 4 components. Let Y represent the device's time-to-failure. Then \( Y = \max(X_1, X_2, X_3, X_4) \).

\[ P(Y > 5) = P\left[ \max(X_1, X_2, X_3, X_4) > 5 \right] = 1 - P\left[ \max(X_1, X_2, X_3, X_4) \le 5 \right] \]

If you write \( P(Y \ge 5) \), that's OK. \( P(Y \ge 5) = P(Y > 5) \) because Y is continuous.

\[ P\left[ \max(X_1, X_2, X_3, X_4) \le 5 \right] = P(X_1 \le 5)\,P(X_2 \le 5)\,P(X_3 \le 5)\,P(X_4 \le 5) \]

If X is an exponential random variable with mean \( \theta \), then

\[ f(x) = \frac{1}{\theta} e^{-x/\theta}, \qquad P(X \le x) = F(x) = 1 - e^{-x/\theta} \]

\[ P(X_1 \le 5)\,P(X_2 \le 5)\,P(X_3 \le 5)\,P(X_4 \le 5) = \left( 1 - e^{-5/1} \right) \left( 1 - e^{-5/2} \right) \left( 1 - e^{-5/3} \right) \left( 1 - e^{-5/4} \right) = 0.5276 \]

\[ P(Y > 5) = 1 - P\left[ \max(X_1, X_2, X_3, X_4) \le 5 \right] = 0.4724 \]
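The Q25 calculation generalizes to any number of independent exponential components in parallel. Here is a minimal sketch under that assumption; the function name `prob_parallel_system_alive` is invented for this illustration.

```python
import math

def prob_parallel_system_alive(means, t):
    """P(max of independent exponential lifetimes > t), given each lifetime's mean."""
    prob_all_failed = 1.0
    for mean in means:
        # P(X_i <= t) = 1 - exp(-t / mean) for an exponential with that mean.
        prob_all_failed *= 1.0 - math.exp(-t / mean)
    return 1.0 - prob_all_failed

p_alive = prob_parallel_system_alive([1, 2, 3, 4], 5)
print(p_alive)
```

The complement trick is the whole point: the maximum is at most t only when every component has failed by t, so the product of CDFs comes first and is subtracted from one at the end.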

Q26 Loss random variable X has the following pdf:

\[ f(x) = 0.02x, \quad \text{where } 0 < x < 10 \]

The insurer has a deductible of 4 per loss. Calculate the expected claim payment net of deductible.
A 2.5   B 2.7   C 2.9   D 3.4   E 4.2

Solution C

Let Y represent the claim payment (net of deductible). Then Y = 0 if \( X \le 4 \) and Y = X - 4 if X > 4. We are asked to find \( E(Y) \).

\[ E(Y) = \int_0^{10} y(x)\,f(x)\,dx = \int_0^4 0 \cdot f(x)\,dx + \int_4^{10} (x - 4)\,0.02x\,dx = \int_4^{10} (x - 4)\,0.02x\,dx = 2.88 \]

Q27 Two discrete random variables X and Y are jointly distributed over a series of points. The joint probability mass function is \( p_{X,Y}(x, y) = \frac{1}{28} \). You are also given:
X = 0, 1, 2, 3, 4, 5, 6
Y = 0, ..., X (i.e. for each value of X, Y is a non-negative integer ranging from 0 to X)

Calculate \( \frac{Var(X)}{Var(Y)} \).
A 0   B 0.5   C 1   D 1.5   E 2

Solution C

To find \( Var(X) \), we need to list all the possible values of X and the probability mass function.

| X | Y | \( p_X(x) \) |
|---|---|---|
| 0 | 0 | 1/28 |
| 1 | 0, 1 | 2/28 |
| 2 | 0, 1, 2 | 3/28 |
| 3 | 0, 1, 2, 3 | 4/28 |
| 4 | 0, 1, 2, 3, 4 | 5/28 |
| 5 | 0, 1, 2, 3, 4, 5 | 6/28 |
| 6 | 0, 1, 2, 3, 4, 5, 6 | 7/28 |
| Total | | 28/28 = 1 |

\[ Var(X) = E(X^2) - E^2(X) \]

Then \( E(X) = \sum x\,p_X(x) \) and \( E(X^2) = \sum x^2 p_X(x) \).

However, in the exam, the above formulas are time-consuming and prone to errors. You should use the BA II Plus 1-V Statistics Worksheet. Enter:
X01=0, Y01=1
X02=1, Y02=2
X03=2, Y03=3
X04=3, Y04=4
X05=4, Y05=5
X06=5, Y06=6
X07=6, Y07=7

You should get: mean \( E(X) = 4 \), \( \sigma_X = 1.73205081 \), \( Var(X) = \sigma_X^2 = 3 \).

Similarly, you can find \( Var(Y) \). The value Y = y occurs once for each X = y, ..., 6, so it covers 7 - y of the 28 points:

| Y | \( p_Y(y) \) |
|---|---|
| 0 | 7/28 |
| 1 | 6/28 |
| 2 | 5/28 |
| 3 | 4/28 |
| 4 | 3/28 |
| 5 | 2/28 |
| 6 | 1/28 |
| Total | 28/28 = 1 |

In the BA II Plus 1-V Statistics Worksheet, enter:
X01=0, Y01=7
X02=1, Y02=6
X03=2, Y03=5
X04=3, Y04=4
X05=4, Y05=3
X06=5, Y06=2
X07=6, Y07=1

You should get: mean \( E(Y) = 2 \), \( \sigma_Y = 1.73205081 \), \( Var(Y) = \sigma_Y^2 = 3 \).

\[ \frac{Var(X)}{Var(Y)} = 1 \]

Q28 A company pays a benefit of 100 for each of its 1000 employees if the employee dies next year. The probability that an employee dies next year is 2%. What is the amount the company needs to have in a fund in order to ensure a 95% chance that it can cover the loss?
A 2,500   B 2,800   C 3,100   D 3,400   E 3,700

Solution B

Let K = number of deaths next year. K is a binomial random variable with n = 1000 and p = 0.02. K is approximately normal with \( E(K) = np = 1000\,(0.02) = 20 \) and \( Var(K) = npq = 1000\,(0.02)(0.98) = 19.6 \).

Let X = 100K represent the total payment. We need to calculate the 95th percentile of X, \( X_{95} \). Since X = 100K is an increasing function of K with one-to-one mapping, the 95th percentile of X corresponds to the 95th percentile of K. So we just need to find \( K_{95} \). Then \( X_{95} = 100 K_{95} \).

\[ \Phi\left( \frac{K_{95} - 0.5 - 20}{\sqrt{19.6}} \right) = 0.95, \qquad \frac{K_{95} - 0.5 - 20}{\sqrt{19.6}} = 1.645 \]

\[ K_{95} = 0.5 + 20 + 1.645 \sqrt{19.6} = 27.783, \qquad X_{95} = 100 K_{95} = 100 \times 27.783 = 2{,}778 \]
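The Q28 percentile can be reproduced without a normal table. The sketch below follows the same continuity-corrected equation used in the solution above and takes the z-value from Python's standard library (`statistics.NormalDist`, available from Python 3.8); the variable names are illustrative only.

```python
import math
from statistics import NormalDist

n, p, benefit = 1000, 0.02, 100

mean_deaths = n * p                      # E(K) = np
sd_deaths = math.sqrt(n * p * (1 - p))   # sqrt(npq)

# 95th percentile of the standard normal (about 1.645).
z95 = NormalDist().inv_cdf(0.95)

# Same equation as in the solution: (K95 - 0.5 - 20) / sqrt(19.6) = z95.
k95 = 0.5 + mean_deaths + z95 * sd_deaths
fund = benefit * k95
print(z95, k95, fund)
```

The fund comes out slightly below the 2,800 answer choice, matching the 2,778 computed by hand.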

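For Q27, a brute-force enumeration of the 28 points gives the same variances as the worksheet. This is only a cross-check sketch with made-up names (`points`, `variance`), not the approach the manual recommends for the exam.

```python
from fractions import Fraction

# All equally likely points (x, y) with 0 <= y <= x <= 6.
points = [(x, y) for x in range(7) for y in range(x + 1)]
weight = Fraction(1, len(points))

def variance(values):
    """Variance of equally weighted values, computed as E(V^2) - E(V)^2."""
    mean = sum(v * weight for v in values)
    second_moment = sum(v * v * weight for v in values)
    return second_moment - mean ** 2

mean_x = sum(x * weight for x, _ in points)
mean_y = sum(y * weight for _, y in points)
var_x = variance([x for x, _ in points])
var_y = variance([y for _, y in points])
ratio = var_x / var_y
print(mean_x, mean_y, var_x, var_y, ratio)
```

Enumerating the joint points directly avoids building the two marginal tables by hand, which is where counting mistakes usually creep in.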
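Q29 Random variables X and Y have the following joint pdf:

\[ f(x, y) = \frac{kx}{y}, \quad \text{where } 0 < x < y < 1 \text{ and } k \text{ is a constant.} \]

Calculate \( E\left( X \mid Y = \frac{2}{3} \right) + Var\left( X \mid Y = \frac{2}{3} \right) \).
A 0.17   B 0.27   C 0.37   D 0.47   E 0.57

Solution D

\[ f\left( x \,\middle|\, y = \tfrac{2}{3} \right) = \frac{f\left( x, y = \tfrac{2}{3} \right)}{P\left( y = \tfrac{2}{3} \right)} = \frac{f\left( x, y = \tfrac{2}{3} \right)}{\int_0^{2/3} f\left( x, y = \tfrac{2}{3} \right) dx} = \frac{kx \big/ \tfrac{2}{3}}{\int_0^{2/3} \left( kx \big/ \tfrac{2}{3} \right) dx} = cx, \quad \text{where } 0 < x < \tfrac{2}{3} \]

and \( c = \dfrac{k \big/ \tfrac{2}{3}}{P\left( y = \tfrac{2}{3} \right)} \) is a constant.

You don't need to worry about actually finding k or \( P\left( y = \tfrac{2}{3} \right) \). You just need to find c. \( f\left( x \mid y = \tfrac{2}{3} \right) = cx \) should add up to one over the range \( 0 < x < \tfrac{2}{3} \):

\[ \int_0^{2/3} f\left( x \,\middle|\, y = \tfrac{2}{3} \right) dx = \int_0^{2/3} cx\,dx = 1, \qquad c = 4.5 \]

\[ f\left( x \,\middle|\, y = \tfrac{2}{3} \right) = 4.5x, \quad \text{where } 0 < x < \tfrac{2}{3} \]

So

\[ E\left( X \,\middle|\, Y = \tfrac{2}{3} \right) = \int_0^{2/3} x\,f\left( x \,\middle|\, y = \tfrac{2}{3} \right) dx = \int_0^{2/3} x\,(4.5x)\,dx = \left[ 1.5 x^3 \right]_0^{2/3} = 1.5 \left( \tfrac{2}{3} \right)^3 = \frac{4}{9} = 0.444 \]

\[ E\left( X^2 \,\middle|\, Y = \tfrac{2}{3} \right) = \int_0^{2/3} x^2 f\left( x \,\middle|\, y = \tfrac{2}{3} \right) dx = \int_0^{2/3} x^2 (4.5x)\,dx = \left[ \frac{4.5}{4} x^4 \right]_0^{2/3} = \frac{4.5}{4} \left( \tfrac{2}{3} \right)^4 = \frac{2}{9} \]

\[ Var\left( X \,\middle|\, Y = \tfrac{2}{3} \right) = \frac{2}{9} - \left( \frac{4}{9} \right)^2 = 0.02469 \]

\[ E\left( X \,\middle|\, Y = \tfrac{2}{3} \right) + Var\left( X \,\middle|\, Y = \tfrac{2}{3} \right) = 0.4444 + 0.02469 = 0.47 \]

Q30 Random variable X has the following moment generating function:

\[ M_X(t) = \frac{1}{1 - \theta t}, \quad \text{where } \theta \text{ is a positive constant.} \]

Calculate the probability that X is within one standard deviation of its mean.
A 0.46   B 0.56   C 0.66   D 0.76   E 0.86

Solution E

Notice that X is an exponential random variable with mean \( \theta \) and standard deviation \( \theta \).

\[ P\left( \left| X - E(X) \right| \le \sigma_X \right) = P(\theta - \theta \le X \le \theta + \theta) = P(0 \le X \le 2\theta) = F(2\theta) = 1 - e^{-2\theta/\theta} = 1 - e^{-2} = 0.8647 \]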
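The conditional-density argument in Q29 only needs the fact that \( f(x \mid y = 2/3) \) is proportional to x on \( (0, 2/3) \). Under that assumption, the exact fractions can be checked as follows; the variable names are illustrative only.

```python
from fractions import Fraction

upper = Fraction(2, 3)   # the conditioning value y = 2/3 is also the upper limit of x

# Normalizing constant: integral of c*x over (0, upper) is c * upper**2 / 2 = 1.
c = 2 / upper ** 2

# Conditional moments of the density c*x on (0, upper).
cond_mean = c * upper ** 3 / 3      # integral of x * (c*x)
cond_second = c * upper ** 4 / 4    # integral of x**2 * (c*x)
cond_var = cond_second - cond_mean ** 2

answer = cond_mean + cond_var
print(c, cond_mean, cond_var, float(answer))
```

Note that k never appears: any joint pdf of the form k times x times a function of y leads to the same conditional density.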

Final tips on taking Exam P

1. The goal of combat is to win the war, not individual battles. The goal of taking Exam P is to get a 6 so you can look for a job, not to get a 10.
2. If you need to speak in an important meeting (such as before Congress), always prepare a script well in advance and rehearse your script. When preparing for an important exam such as P, always build a 3-minute solution script ahead of time for each of the tested problems in Sample P and any newly released P exams. Then when taking the exam, just regurgitate the script and solve the repeatedly tested problems.
3. In the exam, if a problem is brand new (this type of problem has not been tested before), make one attempt. If you can't solve it, just guess an answer and move on to the next problem.
4. Focus on mastering the fundamentals. Difficult distributions such as negative binomial, hypergeometric, Weibull, Pareto, beta, Chi-square, and bivariate normal aren't tested in the Sample Exam P problems. Since Sample Exam P reflects the level of knowledge you need to have to pass Exam P, chances are that these difficult distributions won't show up in your exam. As a result, you can probably ignore these distributions. However, if you really can't sleep well when you ignore these distributions, you might learn some basic knowledge about them. Don't over-study these difficult distributions.
5. Master Sample Exam P before taking your exam. Put yourself in exam-like conditions and practice the Sample P exam and any newly released P exams. Work and rework until you can solve them 100% right in exam-like conditions. Never walk into the exam room without mastering the Sample P exam and any newly released P exams.
6. For those wanting additional practice problems, refer to http://www.actuarialoutpost.com/actuarial_discussion_forum/. You'll find many practice problems there.
7. If you failed Exam P, don't give up. Many candidates eventually passed Exam P after failing it multiple times.

About the author

Yufeng Guo was born in central China. After receiving his Bachelor's degree in physics at Zhengzhou University, he attended Beijing Law School and received his Master's of law. He was an attorney and law school lecturer in China before immigrating to the United States. He received his Master's of accounting at Indiana University. He has pursued a life actuarial career and passed exams 1, 2, 3, 4, 5, 6, and 7 in rapid succession after discovering a successful study strategy.

Mr. Guo's exam records are as follows:
Fall 2002: Passed Course 1
Spring 2003: Passed Courses 2, 3
Fall 2003: Passed Course 4
Spring 2004: Passed Course 6
Fall 2004: Passed Course 5
Spring 2005: Passed Course 7

Mr. Guo currently teaches an online prep course for Exam P, FM, and M. For more information, visit http://guo.coursehost.com.

If you have any comments or suggestions, you can contact Mr. Guo at yufeng_g[email protected].

Please note that if I find any errors, I will post the errata at http://guo.coursehost.com.

Value of this PDF study manual

1. Don't pay the shipping fee (can cost $5 to $10 for U.S. shipping and over $30 for international shipping). Big saving for Canadian candidates and other international exam takers.
2. Don't wait a week for the manual to arrive. You download the study manual instantly from the web and begin studying right away.
3. Load the PDF in your laptop. Study as you go. Or if you prefer a printed copy, you can print the manual yourself.
4. Use the study manual as flash cards. Click on bookmarks to choose a chapter and quiz yourself.
5. Search any topic by keywords. From the Adobe Acrobat reader toolbar, click Edit -> Search or Edit -> Find. Then type in a key word.

User review of Mr. Guo's P Manual

Mr. Guo's P manual has been used extensively by many Exam P candidates. For user reviews of Mr. Guo's P manual, see the review of the manual by Guo at http://www.actuarialoutpost.com.

Testimonies:

"Second time I used the Guo manual and was able to do some of the similar questions in less than 25% of the time because of knowing the shortcut." (Testimony #1 of the manual by Guo)

"I just took the exam for the second time and feel confident that I passed. I used Guo the second time around. It was very helpful and gives a lot of shortcuts that I found very valuable. I thought the manual was kind of expensive for an e-file, but if it helped me pass it was well worth the cost." (Testimony #2 of the manual by Guo)

"I took the last exam in Feb 2006, and I ran out of time and I ended up with a five. I needed to do the questions quicker and more efficiently. The Guo's study guide really did the job." (Testimony #3 of the manual by Guo)

Mini Problem Set

Problem 1
A system consists of two machines that work together. The system works only if both machines work. The system fails if either machine fails. The times-until-failure of the two machines are independent of each other. The joint moment generating function of the two machines' times-until-failure is

\[ \frac{1}{20st - 5t - 4s + 1} \]

Calculate the probability that the system is still working after 1 hour.

Problem 2
X and Y are independent random variables. \( E(X) = 0.5\,E(Y) \). The coefficient of variation of X is 5; the coefficient of variation of Y is 8. Find the coefficient of variation of 10(X + Y).

Problem 3
There are 15 people: 10 people from Department A and the other 5 from Department B. The probability of having an accident is p = 0.05 for each person. What is the probability of at least one accident from Department A given that there are 4 accidents?

Problem 4
A married couple buys a joint life insurance policy. Let X and Y denote the respective remaining lifetimes in years of the husband and wife, measured from the time of the policy's inception. X and Y are jointly and uniformly distributed over the region:

\[ 0 < X < 16, \qquad \frac{X^2}{16} < Y < 16 \]

Calculate \( E(X \mid Y = 9) \) and \( Var(X \mid Y = 9) \).

Problem 5
Let X denote the loss amount in an auto collision. Let C denote the portion of X that the insurance company will pay. An actuary determines that X and C are independent with respective density and probability functions:

\[ f(x) = \frac{1}{10} e^{-x/10}, \quad \text{where } x > 0 \]

\[ P(C = c) = \begin{cases} 0.4 & \text{if } c = 1 \\ 0.6 & \text{if } c = 0 \end{cases} \]

Calculate the variance of the insurance company's claim payment Y = CX.

Problem 6
\( f(x, y) = k\,(x - y) \) where 0 < y < x < 1 and k is a constant. Calculate \( F(x, y) \) where 0 < y < x < 1.

Problem 7
\( F(x, y) = 3x^2y - 3xy^2 + y^3 \) where 0 < y < x < 1. Calculate \( E(Y) \).

Problem 8
An insurance company sells a one-year insurance policy that covers fire and theft losses. The variance of the number of fire losses is 4. The variance of the number of theft losses is 5. The covariance between the number of fire and theft losses is 3. Calculate the variance of the total number of fire and theft losses.

Problem 9
Losses under an insurance policy are uniformly distributed on the interval [0, 100]. The policy has a deductible of 20. A loss occurred for which the insurance benefit is less than 40. Calculate the probability that the benefit was more than 30.

Problem 10
The joint probability density function of X and Y is given by \( f(x, y) = c\,(x + y) \), where c is a constant, 0 < x < 4, 0 < y < 4. Calculate the variance of \( \frac{X + Y}{2} \).

Problem 11
Claim amounts are independent random variables with probability density function \( f(x) = k x^{-3} \), where x > 20. Calculate the probability that the largest of 3 randomly selected claims is less than 40.

Mini Problem Set Solutions

Problem 1
A system consists of two machines that work together. The system works only if both machines work. The system fails if either machine fails. The times-until-failure of the two machines are independent of each other. The joint moment generating function of the two machines' times-until-failure is \( \frac{1}{20st - 5t - 4s + 1} \). Calculate the probability that the system is still working after 1 hour.

Solution
Let X and Y represent the times-until-failure of the two machines. Since X and Y are independent, we have:

\[ M_{X,Y}(s, t) = M_X(s)\,M_Y(t) \]

On the other hand, we are given:

\[ M_{X,Y}(s, t) = \frac{1}{20st - 5t - 4s + 1} = \frac{1}{1 - 4s} \cdot \frac{1}{1 - 5t} \]

\( \frac{1}{1 - 4s} \) is the MGF of an exponential random variable with mean 4.
\( \frac{1}{1 - 5t} \) is the MGF of an exponential random variable with mean 5.

Hence X and Y are two independent exponential random variables. One of them has a mean of 4 and the other a mean of 5.

\[ P(X > 1 \cap Y > 1) = P(X > 1)\,P(Y > 1) = e^{-1/4}\,e^{-1/5} = 0.63763 \]

Problem 2
X and Y are independent random variables. \( E(X) = 0.5\,E(Y) \). The coefficient of variation of X is 5; the coefficient of variation of Y is 8. Find the coefficient of variation of 10(X + Y).

Solution
Please note that the definition of the coefficient of variation of a random variable Z is

\[ \mathrm{coef}_Z = \frac{\sqrt{Var(Z)}}{E(Z)} \]

It then follows that for any positive constant m

\[ \mathrm{coef}_{mZ} = \frac{\sqrt{Var(mZ)}}{E(mZ)} = \frac{\sqrt{m^2\,Var(Z)}}{m\,E(Z)} = \frac{\sqrt{Var(Z)}}{E(Z)} = \mathrm{coef}_Z \]

\[ \mathrm{coef}_{10(X+Y)} = \mathrm{coef}_{X+Y} = \frac{\sqrt{Var(X + Y)}}{E(X + Y)} \]

\[ \mathrm{coef}_X = \frac{\sqrt{Var(X)}}{E(X)} = 5 \;\Rightarrow\; Var(X) = 5^2 E^2(X) = 5^2 \left( 0.5^2 \right) E^2(Y) \]

\[ \mathrm{coef}_Y = \frac{\sqrt{Var(Y)}}{E(Y)} = 8 \;\Rightarrow\; Var(Y) = 8^2 E^2(Y) \]

Because X and Y are independent,

\[ Var(X + Y) = Var(X) + Var(Y) = \left[ 5^2 \left( 0.5^2 \right) + 8^2 \right] E^2(Y), \qquad \sqrt{Var(X) + Var(Y)} = \sqrt{5^2 \left( 0.5^2 \right) + 8^2}\;E(Y) \]

\[ E(X + Y) = E(X) + E(Y) = 1.5\,E(Y) \]

\[ \mathrm{coef}_{X+Y} = \frac{\sqrt{Var(X + Y)}}{E(X + Y)} = \frac{\sqrt{5^2 \left( 0.5^2 \right) + 8^2}\;E(Y)}{1.5\,E(Y)} = \frac{\sqrt{5^2 \left( 0.5^2 \right) + 8^2}}{1.5} = 5.5877 \]

\[ \mathrm{coef}_{10(X+Y)} = \mathrm{coef}_{X+Y} = 5.5877 \]

Problem 3
There are 15 people: 10 people from Department A and the other 5 from Department B. The probability of having an accident is p = 0.05 for each person. What is the probability of at least one accident from Department A given that there are 4 accidents?

Solution
The probability that, of 15 people, 4 have accidents and 11 don't is binomial: \( C_{15}^{4}\,p^4 q^{11} \).

The probability that Department B has 4 accidents and Department A has 0 accidents (i.e. all accidents are from Department B) is \( C_5^4\,p^4 q^1 \cdot C_{10}^{0}\,p^0 q^{10} \).

The probability that all accidents are from Department B given there are 4 accidents is:

\[ \frac{C_5^4\,p^4 q \cdot C_{10}^{0}\,p^0 q^{10}}{C_{15}^{4}\,p^4 q^{11}} = \frac{C_5^4\,C_{10}^{0}}{C_{15}^{4}} = \frac{5}{1365} = \frac{1}{273} \]

The probability of at least one accident from A given that there are 4 accidents is

\[ 1 - \frac{5}{1365} = 0.99634 \]

Problem 4
A married couple buys a joint life insurance policy. Let X and Y denote the respective remaining lifetimes in years of the husband and wife, measured from the time of the policy's inception. X and Y are jointly and uniformly distributed over the region \( 0 < X < 16 \), \( \frac{X^2}{16} < Y < 16 \). Calculate \( E(X \mid Y = 9) \) and \( Var(X \mid Y = 9) \).

Solution
This problem is really simple. X and Y are uniformly distributed, so \( f(x, y) = k \) where k is a constant.

\[ f(x \mid y = 9) = \frac{f(x, y = 9)}{P(y = 9)} = \frac{k}{P(y = 9)} = c \]

Because \( P(y = 9) \) is a constant, \( \frac{k}{P(y = 9)} = c \) is also a constant. Hence \( f(x \mid y = 9) = c \) is a constant. So given y = 9, x is uniformly distributed.

To find the range of x given y = 9, notice 0 < x < 16 and \( \frac{x^2}{16} < y < 16 \), so \( 0 < x < \sqrt{16y} \). Given y = 9, \( 0 < x < \sqrt{16 \cdot 9} = 12 \).

\[ x \mid y = 9 \sim U[0, 12], \qquad f(x \mid y = 9) = \frac{1}{12} \]

Next, use the formulas for a uniform distribution:

\[ T \sim U[a, b] \;\Rightarrow\; E(T) = \frac{a + b}{2}, \qquad Var(T) = \frac{(b - a)^2}{12} \]

\[ E(X \mid Y = 9) = \frac{0 + 12}{2} = 6, \qquad Var(X \mid Y = 9) = \frac{(12 - 0)^2}{12} = 12 \]

Key point: If X and Y are jointly uniformly distributed, then the conditional random variable X | Y = a or Y | X = b is also uniformly distributed.

Problem 5
Let X denote the loss amount in an auto collision. Let C denote the portion of X that the insurance company will pay. An actuary determines that X and C are independent with respective density and probability functions:

\[ f(x) = \frac{1}{10} e^{-x/10}, \quad \text{where } x > 0 \]

\[ P(C = c) = \begin{cases} 0.4 & \text{if } c = 1 \\ 0.6 & \text{if } c = 0 \end{cases} \]

Calculate the variance of the insurance company's claim payment Y = CX.

Solution

\[ Y = \begin{cases} X & \text{with probability } 0.4 \\ 0 & \text{with probability } 0.6 \end{cases} \qquad Var(Y) = E(Y^2) - E^2(Y) \]

Using double expectation:

\[ E(Y) = 0.4\,E(X) + 0.6\,E(0) = 0.4\,E(X) = 0.4 \times 10 = 4.0 \]

\[ E(Y^2) = 0.4\,E(X^2) + 0.6\,E(0^2) = 0.4\,E(X^2) = 0.4 \times 2 \times 10^2 = 80 \]

Please note that X is exponential with mean 10. Hence \( E(X) = 10 \) and \( E(X^2) = 2 \times 10^2 \).

\[ Var(Y) = 0.4 \times 2 \times 10^2 - (0.4 \times 10)^2 = 64 \]

Problem 6

\( f(x, y) = k\,(x - y) \) where 0 < y < x < 1 and k is a constant. Calculate \( F(x, y) \) where 0 < y < x < 1.

Solution
First, we calculate k. The 2-D region 0 < y < x < 1 is a triangle formed by the following 3 points: (0, 0), (1, 0), and (1, 1). The total probability over this triangle should be one. If we use y for the outer integration and x for the inner integration, we have:

\[ \int_0^1 \int_y^1 f(x, y)\,dx\,dy = 1 \]

\[ \int_0^1 \int_y^1 f(x, y)\,dx\,dy = \int_0^1 \int_y^1 k\,(x - y)\,dx\,dy = k \int_0^1 \int_y^1 (x - y)\,dx\,dy \]

\[ \int_y^1 (x - y)\,dx = \frac{1}{2} (y - 1)^2, \qquad \int_0^1 \int_y^1 (x - y)\,dx\,dy = \int_0^1 \frac{1}{2} (y - 1)^2\,dy = \frac{1}{6} \]

\[ k \int_0^1 \int_y^1 (x - y)\,dx\,dy = \frac{k}{6} = 1, \qquad k = 6, \qquad f(x, y) = 6\,(x - y) \]

Next, we calculate \( F(x, y) = P(X \le x, Y \le y) \). To understand how, first let's use a concrete example and calculate \( F(0.5, 0.2) = P(X \le 0.5, Y \le 0.2) \). If you understand this calculation, you'll know how to calculate \( F(x, y) \) in general. Using y for the outer integration, we get:

\[ P(X \le 0.5, Y \le 0.2) = \int_0^{0.2} \int_y^{0.5} f(x, y)\,dx\,dy = \int_0^{0.2} \int_y^{0.5} 6\,(x - y)\,dx\,dy \]

\[ \int_y^{0.5} 6\,(x - y)\,dx = 3\,(0.5 - y)^2 \]

\[ \int_0^{0.2} 3\,(0.5 - y)^2\,dy = \left[ -(0.5 - y)^3 \right]_0^{0.2} = (0.5 - 0)^3 - (0.5 - 0.2)^3 = 0.098 \]

So \( F(0.5, 0.2) = 0.098 \). In general,

\[ F(a, b) = P(X \le a, Y \le b) = \int_0^b \int_y^a f(x, y)\,dx\,dy = \int_0^b \int_y^a 6\,(x - y)\,dx\,dy = \int_0^b 3\,(a - y)^2\,dy = 3a^2 b - 3ab^2 + b^3 \]

\[ F(x, y) = P(X \le x, Y \le y) = 3x^2 y - 3xy^2 + y^3 \]

Problem 7
\( F(x, y) = 3x^2y - 3xy^2 + y^3 \) where 0 < y < x < 1. Calculate \( E(Y) \).

Solution

\[ f(x, y) = \frac{\partial^2}{\partial x\,\partial y} F(x, y) = \frac{\partial^2}{\partial x\,\partial y} \left( 3x^2y - 3xy^2 + y^3 \right) = 6\,(x - y) \]

\[ E(Y) = \int_0^1 \int_y^1 y\,f(x, y)\,dx\,dy = \int_0^1 \int_y^1 6y\,(x - y)\,dx\,dy = \frac{1}{4} \]

Shortcut #1:

\[ E(Y) = \int_0^1 y\,f(y)\,dy, \qquad f(y) = \frac{dF(y)}{dy}, \qquad F(y) = P(Y \le y) = P(Y \le y \cap X \le +\infty) \]

To see why, notice that \( X \le +\infty \) automatically holds true, that is, \( P(X \le +\infty) = 1 \). So \( P(Y \le y \cap X \le +\infty) = P(Y \le y) \).

\[ F(y) = \lim_{x \to +\infty} F(x, y) = \lim_{x \to 1} F(x, y) = 3\,(1^2)\,y - 3\,(1)\,y^2 + y^3 = y^3 - 3y^2 + 3y \]

Please note that the maximum value of X is 1. Hence \( x \to +\infty \) means \( x \to 1 \).

\[ f(y) = \frac{dF(y)}{dy} = \frac{d}{dy} \left( y^3 - 3y^2 + 3y \right) = 3\,(y - 1)^2 \]

\[ E(Y) = \int_0^1 y\,f(y)\,dy = \int_0^1 3y\,(y - 1)^2\,dy = \frac{1}{4} \]

Shortcut #2:
Since Y is non-negative, \( E(Y) = \int_0^1 P(Y > y)\,dy = \int_0^1 \left[ 1 - F(y) \right] dy \). As before, \( F(y) = \lim_{x \to 1} F(x, y) = y^3 - 3y^2 + 3y \).

\[ E(Y) = \int_0^1 \left[ 1 - F(y) \right] dy = \int_0^1 \left[ 1 - \left( y^3 - 3y^2 + 3y \right) \right] dy = \frac{1}{4} \]

Problem 8
An insurance company sells a one-year insurance policy that covers fire and theft losses. The variance of the number of fire losses is 4. The variance of the number of theft losses is 5. The covariance between the number of fire and theft losses is 3. Calculate the variance of the total number of fire and theft losses.

Solution
X = the total number of fire losses in one year
Y = the total number of theft losses in one year

\[ Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X, Y) = 4 + 5 + 2 \times 3 = 15 \]

Problem 9
Losses under an insurance policy are uniformly distributed on the interval [0, 100]. The policy has a deductible of 20. A loss occurred for which the insurance benefit is less than 40. Calculate the probability that the benefit was more than 30.

Solution
X = loss, Y = payment, d = deductible = 20

\[ P(Y > 30 \mid Y < 40) = P(X > 30 + d \mid X < 40 + d) = P(X > 50 \mid X < 60) = \frac{P(50 < X < 60)}{P(X < 60)} = \frac{\frac{60 - 50}{100}}{\frac{60}{100}} = \frac{1}{6} \]

Problem 10
The joint probability density function of X and Y is given by \( f(x, y) = c\,(x + y) \), where c is a constant, 0 < x < 4, 0 < y < 4. Calculate the variance of \( \frac{X + Y}{2} \).

Solution
First, we need to find c.

\[ \int_0^4 \int_0^4 f(x, y)\,dx\,dy = 1, \qquad \int_0^4 \int_0^4 f(x, y)\,dx\,dy = c \int_0^4 \int_0^4 (x + y)\,dx\,dy \]

\[ \int_0^4 (x + y)\,dx = 4y + 8, \qquad \int_0^4 \int_0^4 (x + y)\,dx\,dy = \int_0^4 (4y + 8)\,dy = 64 \;\Rightarrow\; c = \frac{1}{64}, \qquad f(x, y) = \frac{1}{64} (x + y) \]

\[ Var\left( \frac{X + Y}{2} \right) = \left( \frac{1}{2} \right)^2 Var(X + Y) = \frac{1}{4}\,Var(X + Y) \]

\[ Var(X + Y) = E\left[ (X + Y)^2 \right] - \left[ E(X + Y) \right]^2 \]

\[ E(X + Y) = E(X) + E(Y), \qquad E\left[ (X + Y)^2 \right] = E\left( X^2 + 2XY + Y^2 \right) = E(X^2) + 2E(XY) + E(Y^2) \]

Due to symmetry, \( E(X) = E(Y) \) and \( E(X^2) = E(Y^2) \). So

\[ E(X + Y) = 2E(X), \qquad E\left[ (X + Y)^2 \right] = 2E(X^2) + 2E(XY), \qquad \left[ E(X + Y) \right]^2 = \left[ 2E(X) \right]^2 = 4 \left[ E(X) \right]^2 \]

\[ Var(X + Y) = 2E(X^2) + 2E(XY) - 4 \left[ E(X) \right]^2 \]

\[ E(X) = \int_0^4 \int_0^4 x\,f(x, y)\,dx\,dy = \frac{1}{64} \int_0^4 \int_0^4 x\,(x + y)\,dx\,dy = \frac{7}{3} \]

\[ E(X^2) = \int_0^4 \int_0^4 x^2 f(x, y)\,dx\,dy = \frac{1}{64} \int_0^4 \int_0^4 x^2 (x + y)\,dx\,dy = \frac{20}{3} \]

\[ E(XY) = \int_0^4 \int_0^4 xy\,f(x, y)\,dx\,dy = \frac{1}{64} \int_0^4 \int_0^4 xy\,(x + y)\,dx\,dy = \frac{16}{3} \]

\[ Var(X + Y) = 2 \cdot \frac{20}{3} + 2 \cdot \frac{16}{3} - 4 \left( \frac{7}{3} \right)^2 = \frac{20}{9}, \qquad Var\left( \frac{X + Y}{2} \right) = \frac{1}{4} \cdot \frac{20}{9} = \frac{5}{9} \]

One common mistake is to write:

\[ Var\left( \frac{X + Y}{2} \right) = \frac{1}{4}\,Var(X + Y) = \frac{1}{4} \left[ Var(X) + Var(Y) \right] \]

X and Y are not independent here. The correct formula is \( Var(X + Y) = Var(X) + Var(Y) + 2\,Cov(X, Y) \) and not \( Var(X + Y) = Var(X) + Var(Y) \).

Problem 11
Claim amounts are independent random variables with probability density function \( f(x) = k x^{-3} \), where x > 20. Calculate the probability that the largest of 3 randomly selected claims is less than 40.

Solution
Total probability should be one:

\[ \int_{20}^{\infty} f(x)\,dx = k \int_{20}^{\infty} x^{-3}\,dx = \frac{k}{800} = 1 \;\Rightarrow\; k = 800, \qquad f(x) = 800 x^{-3} \]

\( X_1, X_2, X_3 \) are 3 random samples. They are independent and identically distributed with the common density \( f(x) = 800 x^{-3} \).

\[ P\left[ \max(X_1, X_2, X_3) < 40 \right] = P(X_1 < 40 \cap X_2 < 40 \cap X_3 < 40) = P(X_1 < 40)\,P(X_2 < 40)\,P(X_3 < 40) = \left[ P(X < 40) \right]^3 \]

\[ P(X < 40) = 1 - \int_{40}^{\infty} 800 x^{-3}\,dx = 1 - \frac{1}{4} = \frac{3}{4} \]

\[ P\left[ \max(X_1, X_2, X_3) < 40 \right] = \left( \frac{3}{4} \right)^3 = \frac{27}{64} \]
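For the largest-of-three-claims problem, it helps to write the CDF explicitly before cubing it. With \( f(x) = k x^{-3} \) on x > 20 and k = 800, the CDF is \( F(x) = 1 - 400/x^2 \), and the tail integral from 40 to infinity is \( P(X > 40) \), not \( P(X < 40) \). The sketch below checks the arithmetic exactly; the helper name `claim_cdf` is hypothetical.

```python
from fractions import Fraction

def claim_cdf(x, lower=20):
    """CDF of the density k * x**-3 on x > lower, where k = 2 * lower**2."""
    if x <= lower:
        return Fraction(0)
    return 1 - Fraction(lower ** 2, x ** 2)

p_single = claim_cdf(40)      # P(X < 40) for one claim
p_tail = 1 - p_single         # P(X > 40), the tail integral from 40 to infinity
p_max_below = p_single ** 3   # all three independent claims are below 40
print(p_single, p_tail, p_max_below)
```

Cubing the tail probability instead would give the chance that all three claims exceed 40, which is a different event.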

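The moments in the \( f(x, y) = c\,(x + y) \) variance problem are all integrals of polynomials over the square 0 < x < 4, 0 < y < 4, so they can be verified exactly. This is a sketch under that setup only; the helper names `power_integral` and `moment` are made up for illustration.

```python
from fractions import Fraction

def power_integral(n):
    """Integral of t**n over [0, 4]."""
    return Fraction(4 ** (n + 1), n + 1)

def moment(a, b):
    """E[X**a * Y**b] for the density (x + y) / 64 on the square (0, 4) x (0, 4)."""
    # (x + y) * x**a * y**b splits into x**(a+1) * y**b + x**a * y**(b+1).
    total = (power_integral(a + 1) * power_integral(b)
             + power_integral(a) * power_integral(b + 1))
    return Fraction(1, 64) * total

e_x, e_x2, e_xy = moment(1, 0), moment(2, 0), moment(1, 1)

# By symmetry E(Y) = E(X) and E(Y^2) = E(X^2).
var_sum = 2 * e_x2 + 2 * e_xy - 4 * e_x ** 2
cov_xy = e_xy - e_x ** 2
var_half_sum = var_sum / 4
print(e_x, e_x2, e_xy, cov_xy, var_half_sum)
```

The covariance comes out nonzero (and negative), which is why dropping the \( 2\,Cov(X, Y) \) term gives the wrong variance.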