Biostatistics and Epidemiology Lecture Notes

JIMMA UNIVERSITY
JIMMA, ETHIOPIA
Online Teaching Course Material
for
BSc -STATISTICS
3rd Year Students -2020
Subject Course : BIOSTATISTICS AND EPIDEMIOLOGY
STAT 3101
Instructor : Dr. Adiveppa S. Kadi,
Professor
Department of Statistics
Jimma University
JIMMA, ETHIOPIA
Chapter 1
BIOSTATISTICS
THE SCIENTIFIC METHOD
* Science is built up with facts, as a house is with stones. But a collection of facts
is no more a science than a heap of stones is a house (Jules Henri Poincaré, 1908).
*The Logic of Scientific Reasoning
*The whole point of science is to uncover the “truth.”
* How do we go about deciding something is true?
We have two tools at our disposal to pursue scientific inquiry:
1. We have our senses, through which we experience the world and make
observations.
2. We have the ability to reason, which enables us to make logical inferences.
In science we impose logic on those observations.
Clearly, we need both tools.
 All the logic in the world is not going to create an observation, and
 All the individual observations in the world won't in themselves create a
theory.
 There are two kinds of relationships between the scientific mind and the
world, and we impose two kinds of logic:
(a) deductive and (b) inductive.
For an illustration, see Figure 1.1.

Deductive Inference
 In deductive inference, we hold a theory and based on it we make a
prediction of its consequences.
 That is, we predict what the observations should be.
For example, we may hold a theory of learning that says that positive
reinforcement results in better learning than does punishment;
that is, “rewards work better than punishments”.
 From this theory we predict that mathematics and statistics students who are
praised for their right answers during the year will do better on their
final exam than those who are punished for their wrong answers.
 We go from the general (the theory) to the specific (the observations).
This is known as the hypothetico-deductive method.
Inductive Inference
In inductive inference, we go from the specific to the general.
We make many observations, discern a pattern, make a generalization, and
infer an explanation.
For example:
It was observed in the Vienna General Hospital in the 1840s that women giving
birth were dying at a high rate of puerperal fever,
a generalization that provoked terror in prospective mothers.
 It was a young doctor named Ignaz Philipp Semmelweis who connected this with the
observation that medical students performing vaginal examinations did so
directly after coming from the dissecting room, rarely washing their hands
in between.
 He inferred the explanation that the cause of death was the introduction of
cadaverous material (that is, material from the corpses being dissected)
into a wound.
 The practical consequence of that creative leap of the imagination was the
elimination of puerperal fever as a scourge of childbirth by requiring that
physicians wash their hands before doing a delivery.
The ability to make such creative leaps from generalizations is the product of
creative scientific minds.
Epidemiologists have generally been thought to use inductive inference.
For example:
 Several decades ago it was noted that women seemed to get heart attacks
about 10 years later than men did.
A creative leap of the imagination led to the inference that it was women’s
hormones that protected them until menopause. EUREKA!
 The investigators deduced that if estrogen was good for women, it must be good for
men, and predicted that the observations would corroborate that
deduction.
 A clinical trial was undertaken which gave men at high risk of heart attack
estrogen in rather large doses, 2.5 mg per day or about four times the
dosage currently used in post-menopausal women. Unsurprisingly, the
men did not appreciate the side effects, but surprisingly to the
investigators, the men in the estrogen group had higher coronary heart
disease rates and mortality than those on placebo.
 What was good for the goose might not be so good for the gander.
The trial was discontinued and estrogen as a preventive measure was
abandoned for several decades.
(Estrogen, or oestrogen, is the primary female sex hormone. It is responsible for
the development and regulation of the female reproductive system and
secondary sex characteristics. There are three major endogenous estrogens in
females that have estrogenic hormonal activity: estrone, estradiol, and estriol.)
CH-2 Biostatistics and Epidemiology

 During that time, many prospective observational studies
indicated that estrogen replacement given to post-menopausal women
reduced the risk of heart disease by 30-50%.
 These observations led to the inductive inference that post-menopausal
hormone replacement is protective, i.e. observations led to theory.
However, that theory must be tested in clinical trials.
 The first such trial of hormone replacement in women who already had
heart disease, the Heart and Estrogen/progestin Replacement Study
(HERS), found no difference in heart disease rates between the active
treatment group and the placebo group, but did find an early increase in
heart disease events in the first year of the study and a later benefit of
hormones after about 2 years.
 Since this was a study in women with established heart disease, it was a secondary
prevention trial and does not answer the question of whether women
without known heart disease would benefit from long-term hormone
replacement. That question has been addressed by the Women’s Health
Initiative (WHI), which is described in a later section.
 The point of the example is to illustrate how observations (that women get
heart disease later than men) lead to theory (that hormones are
protective), which predicts new observations (that there will be fewer
heart attacks and deaths among those on hormones), which may strengthen the
theory, until it is tested in a clinical trial which can either corroborate it or
overthrow it and lead to a new theory, which then must be further tested
to see if it better predicts new observations.
 So there is a constant interplay between
inductive inference (based on observations) and deductive inference (based
on theory), until we get closer and closer to the “truth.”
 However, there is another point to this story.
 Theories don't just leap out of facts. There must be some substrate out of
which the theory leaps.
 Perhaps that substrate is another, preceding theory that was found to be
inadequate to explain these new observations, and that theory, in turn, had
replaced some previous theory. In any case, one aspect of the “substrate”
is the “prepared mind” of the investigator.
 If the investigator is a cardiologist, for instance, he or she is trained to look
at medical phenomena from a cardiology perspective and is knowledgeable
about preceding theories and their strengths and flaws.
 If the cardiologist hadn't had such training, he or she might not have seen
the connection. Or, with different training, the investigator might be led to a
different inference altogether.
 The epidemiologist must work in an interdisciplinary team to bring to
bear various perspectives on a problem and to enlist minds “prepared” in
different ways. The question is, how well does a theory hold up in the
face of new observations?
When many studies provide affirmative evidence in favor of a theory, does that
increase our belief in it?
 Affirmative evidence means more examples that are consistent with the
theory. But to what degree does supportive evidence strengthen an
assertion?
 Those who believe induction is the appropriate logic of science hold the
view that affirmative evidence is what strengthens a theory.
 Another approach is that of Karl Popper, perhaps one of the foremost
theoreticians of science. Popper claims that induction arising from
accumulation of affirmative evidence doesn't strengthen a theory.
 Induction, after all, is based on our belief that the things unobserved will be like those
observed or that the future will be like the past.
For example, we see a lot of white swans and we make the assertion that “all
swans are white”.
 This assertion is supported by many observations.
 Each time we see another white swan we have more supportive evidence.
But we cannot prove that all swans are white no matter how many white swans
we see.
 On the other hand, this assertion can be knocked down by the sighting of
a single black swan.
 Now we would have to change our assertion to say that most swans are
white and that there are some black swans.
 This assertion presumably is closer to the truth.
In other words, we can refute the assertion with one example, but we can't prove
it with many.
The assertion that all swans are white is a descriptive generalization rather than a
theory.
 A theory has a richer meaning that incorporates causal explanations and
underlying mechanisms.
Assertions, like those relating to the color of swans, may be components of a
theory.
According to Popper, the proper methodology is to posit a theory, or a
conjecture, as he calls it, and try to demonstrate that it is false.
 The more such attempts at destruction it survives, the stronger is the
evidence for it.
 The object is to devise ever more aggressive attempts to knock down the
assertion and see if it still survives.
 If it does not survive an attempt at falsification, then the theory is discarded
and replaced by another.
He calls this the method of conjectures and refutations.
 The advance of science toward the “truth” comes about by discarding
theories whose predictions are not confirmed by observations, or theories
that are not testable at all, rather than by shoring up theories with
more examples of where they work.
 Useful scientific theories are potentially falsifiable. Untestable theories
are those where a variety of contradictory observations could each be
consistent with the theory.
 For example: The Oedipus complex theory says that a child is in love with
the parent of the opposite sex. A boy desires his mother and wants to
destroy his father. If we observe a man to say he loves his mother that fits
in with the theory.
 If we observe a man to say he hates his mother that also fits in with the
theory, which would say that it is “reaction formation” that leads him to
deny his true feelings.
 In other words, no matter what the man says, it could not falsify the theory
because it could be explained by it.
 Since no observation could potentially falsify the Oedipus theory, its
position as a scientific theory could be questioned.
 A third, and most reasonable, view is that the progress of science requires
both inductive and deductive inference.
 A particular point of view provides a framework for observations, which
lead to a theory that predicts new observations that modify the theory,
which then leads to new, predicted observations, and so on toward the
elusive “truth,” which we generally never reach.
Asking which comes first, theory or observation, is like asking which comes
first, the chicken or the egg.
 In general then, advances in knowledge in the health field come about
through constructing, testing, and modifying theories.
 Epidemiologists make inductive inferences to generalize from many
observations, make creative leaps of the imagination to infer explanations
and construct theories, and use deductive inferences to test those theories.
 Theories, then, can be used to predict observations. But these observations
will not always be exactly as we predict them, due to error and the inherent
variability of natural phenomena.
 If the observations are widely different from our predictions we will have
to abandon or modify the theory.
How do we test the extent of the discordance of our predictions based on
theory from the reality of our observations? The test is a statistical or
probabilistic test. It is the test of the null hypothesis, which is the cornerstone of
statistical inference and will be discussed later.
CH-3 PROBABILITY
 The theory of probability is at bottom nothing but
common sense reduced to calculus (Pierre-Simon Laplace).
3.1 What Is Probability?
* The probability of the occurrence of an event is indicated by a
number ranging from 0 to 1.
* An event whose probability of occurrence is 0 is certain not
to occur,
whereas an event whose probability is 1 is certain to
occur.
The classical definition of probability is as follows:
“If an event can occur in N mutually exclusive, equally likely
ways and if n of these outcomes have attribute A, then the
probability of A, written as P(A), equals n/N.”
This is an a priori definition of probability, that is, one
determines the probability of an event before it has happened.
 Example: Suppose one were to toss a die and wanted to
know the probability of obtaining a number divisible by
three.
 Here there are six possible ways that the die can land.
 Of these, there are two ways in which the number on the
face of the die is divisible by three, namely 3 and 6.
 Thus, the probability of obtaining a number divisible by
three on the toss of a die is 2/6 or 1/3.
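The classical (a priori) definition can be checked by simple enumeration. The sketch below is only an illustration: the function name classical_probability is hypothetical, and the die is assumed to be fair, so that all six faces are equally likely.

```python
from fractions import Fraction

def classical_probability(outcomes, has_attribute):
    """Return n/N: outcomes with the attribute over all equally likely outcomes."""
    outcomes = list(outcomes)
    favourable = [o for o in outcomes if has_attribute(o)]
    return Fraction(len(favourable), len(outcomes))

# Assumption: a fair six-sided die, so the faces 1..6 are equally likely.
die_faces = range(1, 7)
p_div_by_3 = classical_probability(die_faces, lambda face: face % 3 == 0)
print(p_div_by_3)  # 1/3
```

Using Fraction keeps the answer as an exact ratio (2/6 reduced to 1/3) instead of a rounded decimal.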
 In many cases, however, we are not able to enumerate all
the possible ways in which an event can occur, and,
therefore, we use the relative frequency definition of
probability.
 This is defined as the number of times that the event of
interest has occurred divided by the total number of trials
(or opportunities for the event to occur).
 Since it is based on previous data, it is called the “a
posteriori” definition of probability.
 For instance, if you select at random a white American
female, the probability of her dying of heart disease is
.00287.
This is based on the finding that per 100,000 white American
females, 287 died of coronary heart disease (estimates are for
2001, National Center for Health Statistics).
 When you consider a white American
female who is between ages 45 and 64, the figure drops to
.00088 (or 88 women in that age group out of 100,000),
and when you consider women 65 years and older, the
figure rises to .01672 (or 1672 per 100,000).
 For white men 65 or older it is .0919 (or 9190 per 100,000).
Here two important points are:
(1) to determine a probability, you must specify the
population to which you refer,
for example, all white females, white males between 65
and 74, nonwhite females between 65 and 74, and so on;
and
(2) the probability figures are constantly revised as new data
become available.
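The relative frequency (a posteriori) definition is just a division of observed events by opportunities. As a small sketch, the code below converts the per-100,000 figures quoted above into probabilities; the function name relative_frequency and the dictionary labels are hypothetical and used only for illustration.

```python
def relative_frequency(event_count, trials):
    """A posteriori probability: observed events divided by opportunities."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    return event_count / trials

# Deaths per 100,000 as quoted in the text, keyed by the population referred to.
deaths_per_100k = {
    "white females, all ages": 287,
    "white females, 45-64": 88,
    "white females, 65 and older": 1672,
    "white males, 65 and older": 9190,
}

for group, deaths in deaths_per_100k.items():
    print(f"{group}: {relative_frequency(deaths, 100_000):.5f}")
```

The same event (death from heart disease) gets a different probability for each population, which is why the population must always be specified.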
* This brings us to the notion of “Expected Frequency”.
* If the probability of an event is P and there are N trials (or
opportunities for the event to occur), then we can expect that
the event will occur N×P times.
* It is necessary to remember that probability “works” for large
numbers.
* When in tossing a coin we say the probability of it landing on
heads is 0.50, we mean that in many tosses half the time the
coin will land heads.
* If we toss the coin ten times, we may get three heads (30%)
or six heads(60%), which are a considerable departure from the
50% we expect.
* But if we toss the coin 200,000 times, we are very likely to be
close to getting exactly 100,000 heads or 50%.
* Expected frequency is really the way in which probability
“works.”
* It is difficult to conceptualize applying probability to an
individual.
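The statement that probability “works” for large numbers can be illustrated by simulation. The sketch below assumes a fair coin and uses Python's pseudo-random generator with a fixed seed; the function names are hypothetical, and the exact counts depend on the seed chosen.

```python
import random

def expected_frequency(p, n_trials):
    """Expected number of occurrences: N x P."""
    return n_trials * p

def simulate_heads(n_tosses, p_heads=0.5, seed=0):
    """Count heads in n_tosses simulated tosses (seeded so runs are repeatable)."""
    rng = random.Random(seed)
    return sum(rng.random() < p_heads for _ in range(n_tosses))

for n in (10, 1_000, 200_000):
    heads = simulate_heads(n)
    print(f"tosses={n}  heads={heads}  proportion={heads / n:.4f}  "
          f"expected={expected_frequency(0.5, n):.0f}")
```

With only 10 tosses the observed proportion can be far from 0.50, while with 200,000 tosses it should be very close to it.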
3.2 Probability Laws

* There are two laws for combining probabilities that are
important.
* First, if there are mutually exclusive events (i.e., if one occurs,
the other cannot), the probability of either one or the other
occurring is the sum of their individual probabilities.
Symbolically,
P(A or B) = P(A) + P(B)
An example of this is as follows:
The probability of getting either a 3 or a 4 on the toss of a die is
1/6 + 1/6 = 2/6.
A useful thing to know is that the sum of the individual
probabilities of all possible mutually exclusive events must
equal 1.
For example,
if A is the event of winning a lottery, and not A (written as Ā) is
the event of not winning the lottery, then
P(A) + P(Ā) = 1.0 and
P(A) = 1 - P(Ā).
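The addition law for mutually exclusive events and the complement rule can be verified on the die example. This is a minimal sketch assuming a fair die; the helper prob is a hypothetical name.

```python
from fractions import Fraction

faces = set(range(1, 7))  # assumption: fair die, six equally likely faces

def prob(event):
    """Classical probability of an event given as a set of die faces."""
    return Fraction(len(event & faces), len(faces))

three, four = {3}, {4}
assert three.isdisjoint(four)           # mutually exclusive: cannot both occur

p_3_or_4 = prob(three) + prob(four)     # addition law: 1/6 + 1/6
p_not_3 = 1 - prob(three)               # complement rule: P(not A) = 1 - P(A)
print(p_3_or_4, p_not_3)  # 1/3 5/6
```

Adding the probabilities is valid here only because a single toss cannot show both a 3 and a 4.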
 Second, if there are two independent events (i.e., the
occurrence of one is not related to the occurrence of the
other), then the joint probability of their occurring
together (jointly) is the product of the individual
probabilities. Symbolically,
P(A and B) = P(A) × P(B)
 An example of this is the probability that on the toss of a
die you will get a number that is both even and
divisible by 3.
 This probability is equal to 1/2 × 1/3 = 1/6. (The only
number both even and divisible by 3 is the number 6.)
(Of the six faces 1, 2, 3, 4, 5, 6, three are even, namely 2, 4, 6, so
P(even) = 3/6 = 1/2; two are divisible by 3, namely 3 and 6, so
P(divisible by 3) = 2/6 = 1/3.)
 The joint probability law is used to test whether events are
independent or not.
 If they are independent, the product of their individual
probabilities should equal the joint probability.
 If it does not, then they are not independent.
 It is the basis of the chi-square test of significance, which
we will consider in the next section.
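The use of the joint probability law as a check for independence can be sketched as follows. The example assumes a fair die; the names prob and independent are hypothetical. The pair "even" and "odd" is added here only as a contrasting case of events that are not independent.

```python
from fractions import Fraction

faces = set(range(1, 7))                      # assumption: fair die
even = {f for f in faces if f % 2 == 0}       # {2, 4, 6}
odd = faces - even                            # {1, 3, 5}
div_by_3 = {f for f in faces if f % 3 == 0}   # {3, 6}

def prob(event):
    return Fraction(len(event), len(faces))

def independent(a, b):
    """True when the joint probability equals the product of the individual ones."""
    return prob(a & b) == prob(a) * prob(b)

print(independent(even, div_by_3))  # True: 1/6 equals 1/2 x 1/3
print(independent(even, odd))       # False: 0 does not equal 1/2 x 1/2
```

Note that "even" and "odd" are mutually exclusive, and for that very reason they are not independent: knowing that one occurred tells us the other did not.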
 Let us apply these concepts to a medical example.
 The mortality rate for those with a heart attack in a special
coronary care unit in a certain hospital is 15%.
Thus, the probability that a patient with a heart attack admitted
to this coronary care unit will die is .15 and that he will survive
is .85.
 If two men are admitted to the coronary care unit on a
particular day,
let A be the event that the first man dies and let B be the
event that the second man dies.
The probability that both will die is
P(A and B) = P(A) × P(B) = 15/100 × 15/100 = .15 × .15 = .0225
We assume these events are independent of each other so we
can multiply their probabilities.
Note, however, that the probability that either one or the
other will die from the heart attack is not the sum of their
probabilities because these two events are not mutually
exclusive.
 It is possible that both will die (i.e., both A and B can
occur).
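The coronary care unit example can be worked through numerically. The sketch below uses the general addition rule, P(A or B) = P(A) + P(B) - P(A and B), which is a standard result but is not written out in these notes, so treat it as an assumption of the example; independence of the two deaths is assumed as in the text, and the function names are hypothetical.

```python
def p_both(p_a, p_b):
    """Multiplication law; valid only if A and B are independent."""
    return p_a * p_b

def p_either(p_a, p_b):
    """General addition rule: subtract the overlap that would be counted twice."""
    return p_a + p_b - p_both(p_a, p_b)

p_die = 0.15  # mortality for heart attack patients in this unit (from the text)

print(round(p_both(p_die, p_die), 4))    # 0.0225  both men die
print(round(p_either(p_die, p_die), 4))  # 0.2775  at least one of them dies
print(round((1 - p_die) ** 2, 4))        # 0.7225  both survive
```

As a check, the probability that at least one man dies equals 1 minus the probability that both survive: 1 - .7225 = .2775, not the simple sum .30.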
To make this clearer, a good way to approach probability is
through the use of Venn diagrams, as shown in Figure 2.1.
 Venn diagrams consist of squares that represent the
universe of possibilities and circles that define the events
of interest.
In diagrams 1, 2, and 3, the space inside the square represents
all N possible outcomes.
 The circle marked A represents all the outcomes that
constitute event A;
 The circle marked B represents all the outcomes that
constitute event B.
 Diagram 1 illustrates two mutually exclusive events; an
outcome in circle A cannot also be in circle B.
 Diagram 2 illustrates two events that can occur jointly: an
outcome in circle A can also be an outcome belonging to
circle B. The shaded area marked AB represents outcomes
that are the occurrence of both A and B.
 Diagram 3 represents two events where one (B) is a
subset of the other (A); an outcome in circle B must also be
an outcome constituting event A, but the reverse is not
necessarily true.
3.3 Conditional Probability
* Now let us consider the case where the chance that a
particular event happens is dependent on the outcome of
another event.
* Then the probability of A, given that B has occurred, is called
the conditional probability of A given B, and is written
symbolically as P(A|B).
 An illustration of this is provided by Venn diagram 2.
 When we speak of conditional probability, the
denominator becomes all the outcomes in circle B (instead
of all N possible outcomes) and the numerator consists of
those outcomes that are in that part of A which also
contains outcomes belonging to B.
 This is the shaded area in the diagram labeled AB.
 If we return to our original definition of probability, we
see that
P(A|B) = n(A and B) / n(B)
(the number of outcomes in both A and B, divided by the total
number of outcomes in B).
 If we divide both numerator and denominator by N, the
total number of all possible outcomes, we obtain
P(A|B) = [n(A and B)/N] / [n(B)/N] = P(A and B) / P(B)
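Conditional probability can be computed either by counting outcomes inside B or as the ratio P(A and B) / P(B); the two routes must agree. The sketch below reuses the fair-die example as an assumption (A: divisible by 3, B: even); the function name is hypothetical.

```python
from fractions import Fraction

def conditional_probability(a, b):
    """P(A|B): outcomes in both A and B divided by outcomes in B."""
    if not b:
        raise ValueError("B must contain at least one outcome")
    return Fraction(len(a & b), len(b))

faces = set(range(1, 7))                  # assumption: fair die, N = 6
a = {f for f in faces if f % 3 == 0}      # A: divisible by 3 -> {3, 6}
b = {f for f in faces if f % 2 == 0}      # B: even -> {2, 4, 6}

by_counts = conditional_probability(a, b)
# Same quantity after dividing numerator and denominator by N.
by_ratio = Fraction(len(a & b), len(faces)) / Fraction(len(b), len(faces))
print(by_counts, by_ratio)  # 1/3 1/3
```

Here P(A|B) = 1/3 equals the unconditional P(A) = 1/3, which is another way of seeing that these two events are independent.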
3.4
3.5 Odds and Probability

* When the odds of a particular horse losing a race are said to
be 4 to 1,
the probability of that horse losing the race is 4/5 = .80.
* To convert an odds statement to a probability,
we add 4 + 1 to get our denominator of 5.
 The odds of the horse winning are 1 to 4, which means that
the probability of winning the race is 1/5 = .20.
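The conversion between odds and probability can be written as two small functions. This is only a sketch; the function and parameter names are hypothetical.

```python
from fractions import Fraction

def odds_to_probability(in_favour, against):
    """Odds of 'in_favour to against' -> probability in_favour / (in_favour + against)."""
    return Fraction(in_favour, in_favour + against)

def probability_to_odds(p):
    """Probability p -> odds expressed as the ratio p / (1 - p)."""
    p = Fraction(p)
    return p / (1 - p)

p_lose = odds_to_probability(4, 1)   # odds of losing are 4 to 1
p_win = odds_to_probability(1, 4)    # odds of winning are 1 to 4
print(p_lose, p_win)                 # 4/5 1/5
print(probability_to_odds(p_lose))   # 4
```

The last line recovers the odds of 4 (to 1) from the probability .80, showing that the two statements carry the same information.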
CHAPTER - 4
CH-5
BASIC EPIDEMIOLOGY
1 Introduction
 Before the turn of the 19th century the world witnessed
severe pain and agony due to a sequence of epidemics.
 Now we are facing the Corona pandemic throughout the world.
 Death tolls of several thousands of human lives occurred
because of dreadful infectious diseases.
 The ravages of the Black Death, a plague in the 14th century, are
well known.
 In Europe alone, about 25 million deaths occurred in a
population of around 100 million.
 Similar reports have come from China, the UK, France, Spain, etc., and now
from the USA and other parts of the world, where many people in
local populations are said to have died in widely scattered
areas due to the Corona virus.
Perhaps even more worrying is the occurrence of new challenges on a
wide scale. Among the greatest challenges faced by the present-day
world are the pandemics of AIDS and Corona, which are likely to take on
serious dimensions in the present century.
Since the first cases of AIDS were reported among homosexual men in
the USA in 1981, it has reached pandemic proportions.
Similarly, since the first case of Corona was found in the city of Wuhan, China,
it too has reached pandemic proportions; no country in the world is
free from HIV/AIDS and Corona.
AIDS and its related syndromes have changed virtually every aspect of
medicine and society at large, whereas no medicine has yet been found
for Corona.
AIDS is a condition in which the inbuilt immune mechanism of the
human body breaks down completely. The process is gradual but
ultimately suppresses the immunity of the individual. A similar
situation is now being observed with Corona.
At present the most widespread pandemic disease
in the world is COVID-19 (Corona), which is taking many lives every day,
and for which we have not yet been able to find a medicine or vaccine.
It ranks as one of the most destructive microbial scourges in human
history and poses a formidable challenge to the biomedical research
and public health communities of the world.
Following these, serious investigations have been carried out by
medical, biological and mathematical scientists to find clinical,
biological and etiological reasons and also to formulate a mathematical
theory to describe the phenomenon of the spread of disease.
The clinical questions of diagnosis, prognosis and efficacy of
treatment often depend on the statistical interpretation of
appropriate data.
Again, we are interested in developing mathematical or statistical
models because of the light they throw on some aspects of the
biological mechanism at work, such as the life cycle of the parasite
involved.
 Alternatively, we can use these models to study the large-scale
population phenomena of immediate relevance to any social
and public health measures that might be advocated or
undertaken.
 In particular, we want to know more about the transmission and
spread of infectious disease, about trying to predict the course
of an epidemic and about the recognition of threshold densities
of population which must be surpassed before a flare-up is
likely.
Objectives:
After reading this chapter you should understand:
◆ that the prime focus of epidemiology is on the pattern of disease
and ill-health in the population;
◆ that epidemiology combines elements of clinical, biological, social
and ecological sciences;
◆ that epidemiology is dependent on clinical practice and the clinical
sciences to make a diagnosis, the starting point of epidemiological
work;
◆ that the central goal of epidemiology as a science is to understand
the causes of disease variation and use this knowledge to better the
health of populations and individuals;
◆ that the central goal of epidemiology as a practice is preventing
and controlling disease in populations, guiding health and healthcare
policy and planning, and improving health care in individuals;
◆ that good epidemiological variables should meet the purposes of
epidemiology;
◆ that epidemiology is based on theories though these may not be
made explicit.
Definition of epidemiology:
 The identity of the person who coined the term
epidemiology is unknown but it is derived from the
Greek words meaning study upon populations (epi
= upon, demos = people, ology = study).
 This derivation does not convey what is studied or
the nature of that study.
 Epidemiology is concerned primarily with disease,
and how disease detracts from health.
 A more descriptive word would be
epidemiopathology (pathos is the Greek word for
suffering and disease) but it is too clumsy to
recommend.
 The word epidemic was used by Hippocrates, but
his writings were mainly compilations of the case
histories of affected people and not a study of the
causes or descriptions of the pattern of the
epidemic in the population.
 The early applications of epidemiology were in the
study of infectious disease epidemics,
environmental hazards and nutritional problems.
 The examination of social inequalities in mortality
patterns was also an early focus of epidemiology.
 Most epidemiology is on human populations but
veterinary epidemiology is important both in its
own right and in the interaction of humans and
animals, causing the diseases known as the
zoonoses.
Last’s (2001) dictionary gives a detailed definition of
epidemiology that includes these words
“The study of the distribution and determinants of
health-related states or events in specified populations,
and the application of this study to control of health
problems”.
*Based on what it has done in the last 150 years,
epidemiology is the science and practice which
describes and explains disease patterns in populations,
and puts this knowledge to use to prevent and control
disease and improve health.
* Epidemiology seeks out the differences and
similarities (‘compare and contrast’) in the disease
patterns of populations to gain new knowledge.
* Most epidemiologists are interested in health but
study it indirectly through disease, partly because of
the difficulty of measuring health.
Uses or Scope of Epidemiology:
Ex: Lung diseases, diabetes, heart diseases, blindness.
* Subclinical disease: An illness that stays below the surface of clinical detection. A subclinical disease
has no recognizable clinical findings. It is distinct from clinical disease, which has signs and symptoms
that can be recognized.
Many diseases, including diabetes, hypothyroidism, and rheumatoid

(Clinical change: words used to describe the change in health conditions due to medical
treatments, appliances and medicines)
CH-6
Types of Epidemiology
Descriptive and Analytical Epidemiology
1. Descriptive epidemiology:
• Describes the occurrence of disease (cross-sectional)
2. Analytical epidemiology:
• Observational (cohort, case control, cross-sectional,
ecologic study) – researcher
observes association between exposure
and disease, estimates and tests it
• Experimental (RCT, quasi experiment) –
researcher assigns intervention
(treatment), and estimates and tests its
effect on health outcome
Epidemiologic Study Designs

Description of Disease Distribution in the Population
Person: Disease affects mostly people under five years of age
Place: Disease affects people living alongside the river
Time: Disease reaches its peak in frequency in Week 6
 Aims of Epidemiologic Research

A. Descriptive epidemiology
1. To assess the public health status of a population
2. To assess the importance of diseases
B. Analytical epidemiology
3. To describe the natural history of disease
4. To explain the etiology of disease
5. To predict disease occurrence
6. To evaluate the prevention and control of disease
7. To control the disease distribution
Etiology: the cause, set of causes, or manner of causation of a disease or
condition
Knowledge of how to treat and
How to prevent the disease.
* If we have sufficient knowledge about the causes of the disease and if these
causes are avoidable, then we may be able to propose effective preventive
programs.
* Clinical medicine may have a tendency to focus on rare but
CH-7
Measures of Disease Occurrence
* Public health priorities should be set by the combination of two aspects:
1. How serious diseases are (i.e. the product of the number of persons suffering
from a disease and its impact on those affected and on society).
2. Our ability to change their frequency or severity.
* Public health bodies may focus on the big picture by taking into account the frequency of
disease and considering:
1. What are the best possibilities of saving many lives?
2. How can we prevent ill health and social impairments within available resources?
3. How do we best use these resources?
Measures of Diseases:
* We may use a number of measures to describe the frequency of a disease.
But before this,
1. we must count the number of people with the disease in the population
(this is called the prevalence of the disease), and
2. we must also know how many new cases appear over a given time
period.
The latter requires either
* an estimate of the risk (the cumulative incidence), or
* the rate of getting the disease over a given time span, giving new cases per
unit time (the incidence rate).
* Maternal, infant, and childhood mortality have been monitored in many parts of
the world and they are often considered strong indicators of general health. But
a similar effort has so far not been made to compute other epidemiological measures.
Incidence and Prevalence Rates:
* A person may either have a disease, not have a disease, or have
something in between.
* So when does a person become affected?
* While counting the number of persons having a disease we have to use
a set of criteria that indicates whether the person has the disease or
not.
* For most diseases we use a classification system, and we follow the
International Classification of Diseases (ICD) to place people in one
group or the other.
* Over a lifetime each of us will either get a given disease or not get
the disease in question.
* This probability has a time dimension.
* If you die at the age of 30, you are less likely to have suffered a
stroke in your lifetime.
* On the other hand, if you live to the age of 90, you are more likely to
have suffered a stroke.
* Similarly, we may expect many more cancer cases in developing
countries if life expectancy continues to increase for their populations.
* The risk of getting a disease is usually a function of time, and these
probabilities are estimated from the observation of populations.
By observing the occurrence of diseases in populations over a
period of time, we may be able to estimate the incidence
and prevalence of certain diseases.
* While estimating the incidence rate and prevalence rate we may
consider gender, age, time, ethnic group, social conditions, place of
residence, and information on other risk factors.
* Note that these measures are simply an indicator of risk, not a
destiny.
* It is a prediction with uncertainty. In the end the person will either
get the disease or not.
Example:
1. Suppose a person has a 25% risk of getting the disease within
the next 10 years. It does not mean that he/she will be 25%
diseased.
It means that among, say, 1,000 people with his/her
characteristics we expect about 250 persons to develop the
disease.
* To estimate incidence and prevalence in a given population we need
to identify the population and examine everyone in it,
or
* take a sample of them, examined at a given point in time (to estimate
prevalence) or during a follow-up time period (to estimate incidence).
2. We want to estimate the prevalence of type 1 diabetes in a city
with 100,000 inhabitants. We may call them all in for a medical
examination, or
 We may estimate it based on a sample randomly selected from
that population.
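Thus the Prevalence Proportion (PP) is
PP = (Number of persons with the disease at a given point in time /
Number of persons in the population at that time)
 Some authors define the Prevalence Proportion as a Prevalence
Rate:
Prevalence Rate = (Number of persons with the disease /
Population at the same time) x 10^n
 The calculation of the above rates depends on the availability of
data.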
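As a rough illustration of estimating prevalence from a random sample, the following Python sketch computes a prevalence proportion and an approximate 95% confidence interval using the normal approximation. The sample size (2,000) and the number of cases found (8) are hypothetical values chosen only for illustration, and the function names are not from any library.

```python
import math

def prevalence_proportion(cases, examined):
    """Proportion of examined persons who have the disease."""
    if examined <= 0:
        raise ValueError("examined must be positive")
    return cases / examined

def approx_95_ci(p, n, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical sample drawn at random from a city of 100,000 inhabitants.
sample_size = 2000   # assumed number of persons examined
cases_found = 8      # assumed number found to have the disease

pp = prevalence_proportion(cases_found, sample_size)
low, high = approx_95_ci(pp, sample_size)

print(f"Prevalence proportion: {pp:.4f}")
print(f"Approximate 95% CI: ({low:.4f}, {high:.4f})")
print(f"Implied cases in the city: {pp * 100_000:.0f}")
```

The normal approximation is only one possible choice and can be poor when the number of cases is very small.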
Incidence Rates:
* Incidence rates are the most common measure of the frequency of
disease in a population.
They are also used to compare the frequency of disease in different
populations.
* The incidence rate expresses the probability or risk of illness in a
population over a period of time.
* Since incidence is a measure of risk, when one population has a
higher incidence of disease than another, we say that the first
population is at a higher risk of developing disease than the
second, all other factors being equal.

* An incidence rate (sometimes referred to simply as incidence) is a
measure of the frequency with which an event, such as a new case of
illness, occurs in a population over a period of time.
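* The formula for calculating an incidence rate is:
Incidence Rate = (Number of new cases of the disease during a given time period /
Population at risk during the same time period) x 10^n
where n may be equal to 1, 2, 3, ...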
 Under steady-state conditions the prevalence is a function of the

incidence (I) and the duration of the disease (D).
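A commonly used form of this relationship, valid when the disease is rare and the population is in a steady state, is P ≈ I x D. The sketch below, in Python, evaluates it for hypothetical values of I and D. The second function uses the more general steady-state form P / (1 - P) = I x D. The function names and the numbers are assumptions made for illustration.

```python
def prevalence_rare_disease(incidence_rate, mean_duration):
    """Approximate steady-state prevalence, P = I x D (rare disease)."""
    return incidence_rate * mean_duration

def prevalence_steady_state(incidence_rate, mean_duration):
    """Steady-state prevalence from P / (1 - P) = I x D."""
    product = incidence_rate * mean_duration
    return product / (1 + product)

# Hypothetical values: 2 new cases per 1,000 person-years, mean duration 5 years.
incidence = 0.002   # assumed incidence rate per person-year
duration = 5        # assumed mean duration of disease in years

print(f"Approximate prevalence: {prevalence_rare_disease(incidence, duration):.4f}")
print(f"Steady-state prevalence: {prevalence_steady_state(incidence, duration):.4f}")
```

For a rare disease the two results are nearly equal, which is why the simpler product is often quoted.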
* Again, there are two types of prevalence.
1. Point prevalence
2. Period prevalence
*The amount of disease present in a population is constantly changing
over time.
* Sometimes, we want to know how much of a particular disease is

present in a population at a single point in time—to get
a kind of “stop action” or “snapshot” look at the population with

regard to that disease.
*This requires estimation of point prevalence for that purpose.
* The numerator in point prevalence is the number of persons with a

particular disease or attribute on a particular date.
* Point prevalence is not an incidence rate, because the numerator

includes pre-existing cases; it is some kind of a proportion, because
the persons in the numerator are also in the denominator.
* On the other hand, we may want to know how much of a particular
disease is present in a population over a longer period.
* Then we use period prevalence.
*Here the numerator in period prevalence is the number of persons

who had a particular disease or attribute at any time during a
particular interval.
*The interval can be a week, month, year, decade, or any other

specified time period.
Examples:
1. Example
In a survey of patients at a sexually transmitted disease clinic in San
Francisco, 180 of 300 patients interviewed reported use of a condom
at least once during the 2 months before the interview (1).
Compute the period prevalence of condom use in this population over the
last 2 months.
Solution:
By definition, Period Prevalence Rate =
(Number of persons who had a particular disease or attribute at any
time during a particular interval / Population during the same period)
x 100
Denote x = number of persons who used a condom during the last 2 months = 180
y = total number of patients during the same period = 300
Period Prevalence Rate = (x/y) x 100 = (180/300) x 100 = 60.0%
Thus, the prevalence of condom use in the 2 months before the study
was 60% in this population of patients.
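The same calculation can be written as a short Python sketch. The function name period_prevalence is hypothetical, chosen only for this illustration. The counts 180 and 300 are the ones given in the example.

```python
def period_prevalence(cases_in_period, population, multiplier=100):
    """Persons with the attribute during the interval, per `multiplier` persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases_in_period * multiplier / population

condom_users = 180   # patients reporting use in the last 2 months
patients = 300       # patients interviewed

rate = period_prevalence(condom_users, patients)
print(f"Period prevalence: {rate:.1f}%")
```

Changing the multiplier to 1000 would express the same result per 1,000 persons.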
2. Example: Two surveys were done of the same community 12
months apart. Of 5,000 people surveyed the first time, 25 had
antibodies to histoplasmosis. Twelve months later, 35 had
antibodies, including the original 25. Calculate the prevalence at
the second survey, and compare it with the 1-year incidence.
Solution:
Two surveys were conducted on the same community, and the time gap
between them was 12 months.
By definition, the prevalence rate at the second survey is
Prevalence rate = (x / y) x 10^n
Take n = 3, then
1. Prevalence at the second survey:
x = number antibody positive at the second survey (the numerator) = 35
y = population = 5,000
Prevalence rate = (35/5000) x 1000 = 7 per 1000
2. Similarly, the incidence rate is given by
Incidence rate = (number of new cases / population at risk) x 10^n
Number of new cases reported = x = 35 - 25 = 10
Population at risk during the same time period = y = 5000 - 25 = 4975
Incidence rate = (10/4975) x 1000 = 2.01, or about 2 persons per 1000 population
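The arithmetic of this example can be checked with a small Python sketch. The helper name rate_per is hypothetical. The population at risk is taken as the 5,000 persons surveyed minus the 25 who already had antibodies at the first survey, that is 4,975.

```python
def rate_per(numerator, denominator, n=3):
    """Return (numerator / denominator) x 10^n."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator * 10**n / denominator

population = 5000
positive_first_survey = 25
positive_second_survey = 35

# Prevalence at the second survey counts all existing cases.
prevalence = rate_per(positive_second_survey, population)

# Incidence counts only new cases among those at risk at the start.
new_cases = positive_second_survey - positive_first_survey
population_at_risk = population - positive_first_survey
incidence = rate_per(new_cases, population_at_risk)

print(f"Prevalence at second survey: {prevalence:.2f} per 1000")
print(f"One-year incidence: {incidence:.2f} per 1000")
```

The prevalence is larger than the incidence here because its numerator includes the 25 pre-existing cases.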

CH-8
Other Measures
Attack Rate
An attack rate is a variant of an incidence rate, applied to a narrowly defined

population observed for a limited time, such as during an epidemic. The attack
rate is usually expressed as a percentage.
In the
Risk Ratio :
*A risk ratio, or relative risk, compares the risk of some health-related event such
as disease or death in two groups.
*The two groups are typically differentiated by demographic factors such as sex
(e.g., males versus females) or by exposure to a suspected risk factor (e.g.,
consumption of potato salad or not).
* Of the two groups, the group of primary interest is labeled the “exposed”
group, and the comparison group is labeled the “unexposed” group.
* The group we are primarily interested in is placed in the numerator, and the
other group is placed in the denominator; thus,
Risk Ratio = Risk in the group of primary interest / Risk in the comparison group
* A risk ratio of 1.0 indicates identical risk in the two groups.
* A risk ratio greater than 1.0 indicates an increased risk for the numerator group,
* While a risk ratio less than 1.0 indicates a decreased risk for the numerator
group
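As a minimal sketch of the definition above, the following Python code computes a risk ratio from the number of ill persons and the total in each group. The counts are hypothetical and are not data from any study. The function names are chosen only for this illustration.

```python
def risk(ill, total):
    """Risk of illness in one group: ill persons divided by group size."""
    if total <= 0:
        raise ValueError("total must be positive")
    return ill / total

def risk_ratio(ill_exposed, total_exposed, ill_unexposed, total_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    unexposed_risk = risk(ill_unexposed, total_unexposed)
    if unexposed_risk == 0:
        raise ValueError("risk in the unexposed group is zero")
    return risk(ill_exposed, total_exposed) / unexposed_risk

# Hypothetical counts, for illustration only.
rr = risk_ratio(ill_exposed=20, total_exposed=100,
                ill_unexposed=10, total_unexposed=100)

if rr > 1.0:
    meaning = "increased risk in the exposed group"
elif rr < 1.0:
    meaning = "decreased risk in the exposed group"
else:
    meaning = "identical risk in the two groups"
print(f"Risk ratio = {rr:.2f}: {meaning}")
```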
Example:
The following data come from the classic studies of pellagra by Goldberger.
Pellagra is a disease caused by dietary deficiency of niacin and characterized
by dermatitis, diarrhea, and dementia. The totals for females and males are
also shown in the table below.
To calculate the risk ratio of pellagra for females versus males, we must first
calculate the risk of illness among females and among males.
Consider the table below.
The odds ratio is sometimes called the cross-product ratio, because the
numerator is the product of cell a and cell d, while the denominator is the
product of cell b and cell c. A line from cell a to cell d (for the numerator) and
another from cell b to cell c (for the denominator) creates an x or cross on the
two-by-two table.
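The cross-product described above can be sketched in Python as follows. The cell layout assumed here is a = exposed and ill, b = exposed and well, c = unexposed and ill, d = unexposed and well. The counts are hypothetical and do not come from the pellagra study.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio (a x d) / (b x c) of a two-by-two table."""
    if b * c == 0:
        raise ValueError("cells b and c must both be non-zero")
    return (a * d) / (b * c)

# Hypothetical two-by-two table, for illustration only.
#               ill   well
# exposed       a=20  b=80
# unexposed     c=10  d=90
example_or = odds_ratio(a=20, b=80, c=10, d=90)
print(f"Odds ratio = {example_or:.2f}")
```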
To quantify the relationship between pellagra and sex, the odds ratio is calculated
as:
Problems
HOME ASSIGNMENT
Q.No.1. Discuss Deductive and Inductive Inference in the context of Biostatistics.
Q.No.2. Discuss Karl Popper's claim that “induction arising from accumulation of
affirmative evidence doesn't strengthen a theory”.
Q.No.3. Define Probability and state the additive and multiplicative theorems.
Q.No.4. Define Conditional and Bayesian probability. Also explain the use of Bayesian
probability.
Q.No.5. Define the Chi-Square Statistic and the use of Contingency Tables.
Q.No.6. Define Epidemiology and explain its use and scope.
Q.No.7. Define Incidence Rate and Prevalence Proportion. Also define the
interrelationship between them.
Q.No.8. Define attack rate, secondary attack rate, risk ratio and odds ratio.
Q.No.9. Two surveys were done of the same community 2 years apart. Of 8,000
people surveyed the first time, 45 had malaria. After one year, 65 had the same
disease, including the original 45. Calculate the prevalence at the first survey, and
the incidence rate.
Q.No.10. In a city survey conducted in a year, about 180 out of 1000 people were
suffering from a certain disease. Find the Prevalence Proportion, and find the
Incidence ratio after two years using the prevalence proportion.
Q.No.11. In a sample of size n denoted by Xi, where i = 1, 2, 3, 4, ..., n, each
Xi = 1 if disease is present
Xi = 0 if disease is not present
Estimate E(Xi), Var(Xi), the sample proportion p, Var(p) and also the 95%
confidence interval for p.
