Biostatistics Final
1
The confidence level is the proportion of confidence intervals, constructed from repeated samples, that contain the true value of the unknown parameter.
The confidence level is usually selected before examining the data. The most commonly used level is 95%, although 90% and 99% are also used.
For a mean, the confidence interval is based on the sample mean and the standard deviation. The formula for the CI is
X̄ ± Zα/2 × [ σ / √n ]
Where,
X̄ = Sample mean
Zα/2 = Critical value of the standard normal distribution (confidence coefficient)
α = Significance level (confidence level = 1 − α)
σ = Standard deviation
n = Sample size
The value after the ± symbol is known as the margin of error.
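As a rough illustration of the formula, the sketch below computes a 95% interval with Python's standard library. The numbers (mean 50, σ = 10, n = 100) and the function name z_confidence_interval are hypothetical, chosen only for this example, and σ is assumed to be known.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin) for a z-based CI of the mean."""
    alpha = 1 - confidence                    # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value Z(alpha/2)
    margin = z * sigma / sqrt(n)              # margin of error
    return mean - margin, mean + margin, margin


# Hypothetical sample values, not taken from the notes.
lower, upper, margin = z_confidence_interval(mean=50.0, sigma=10.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f}), margin of error = {margin:.2f}")
# prints: 95% CI: (48.04, 51.96), margin of error = 1.96
```

Raising the confidence level to 99% widens the interval, because the critical value grows from about 1.96 to about 2.58.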
d) Discuss errors in test of significance.
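There are two types of errors in testing significance:
1. Type I error: rejecting the null hypothesis when it is true. Its probability is denoted by α.
2. Type II error: failing to reject the null hypothesis when it is false. Its probability is denoted by β.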
2
a) What is a t-test? What are the assumptions that need to be considered for conducting a t-test?
A t-test is used to assess the difference between the means of two groups, either independent or paired, or between a sample mean and a hypothesized population mean. It assumes that the data are normally distributed.
Assumptions for conducting a t-test:
1. Normality (normal distribution of data)
2. Random sampling
3. Homogeneity of variance
4. Scale of measurement
5. Appropriate sample size
These assumptions are essential for the validity of the t-test results.
b) The body mass index (BMI) of 25 persons was measured. The mean BMI was 20 and the standard deviation was 1.5. Can we conclude that the mean BMI of the population from which the sample was drawn is 36? (Tabulated t value is 1.191.)
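One possible way to work this problem is a one-sample t-test, sketched below. It assumes the standard error is taken as s / √n (some textbooks use √(n − 1) instead) and uses the tabulated value exactly as stated in the question.

```python
from math import sqrt

# Values from the question.
n = 25
sample_mean = 20.0
sample_sd = 1.5
hypothesized_mean = 36.0
tabulated_t = 1.191  # as given in the question

# Assumption: standard error computed as s / sqrt(n).
standard_error = sample_sd / sqrt(n)                          # 1.5 / 5 = 0.3
t_stat = (sample_mean - hypothesized_mean) / standard_error   # -16 / 0.3

# Two-sided decision: compare |t| with the tabulated value.
reject_null = abs(t_stat) > tabulated_t
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
# prints: t = -53.33, reject H0: True
```

Under these assumptions |t| = 53.33 is far larger than the tabulated value, so the null hypothesis is rejected and we cannot conclude that the population mean BMI is 36.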
3.
a) Define parametric test and non-parametric test. What is the difference between these two tests?
Parametric Test:
A statistical test that assumes specific parameters of the population distribution. It requires data
to be normally distributed and have homogeneity of variances. Examples include t-tests, ANOVA,
and parametric correlation.
Non-Parametric Test:
A statistical test that makes fewer assumptions about the population distribution. It is suitable for
non-normally distributed or ordinal/nominal data. Examples include Mann-Whitney U test,
Kruskal-Wallis test, and non-parametric correlation.
The main differences between parametric and non-parametric tests are as follows:
1. Assumptions:
Parametric Test: Assumes specific parameters of the population distribution, such as normality
and homogeneity of variances.
Non-Parametric Test: Makes fewer assumptions about the population distribution. Suitable for non-normally distributed data.
2. Data Type:
Parametric Test: Well-suited for interval or ratio data.
Non-Parametric Test: Suitable for ordinal, nominal, or non-normally distributed data.
3. Power:
Parametric Test: Generally more powerful when its assumptions are met.
Non-Parametric Test: Generally less powerful than parametric tests when the parametric assumptions hold, but more robust when they are violated.
4. Examples:
- Parametric Test: T-tests, ANOVA, parametric correlation.
- Non-Parametric Test: Mann-Whitney U test, Kruskal-Wallis test, non-parametric correlation.
b) What do you mean by power of a test?
The power of a statistical test is the probability that the test will correctly reject a false null hypothesis. It is denoted by 1 − β, where β is the probability of a Type II error.
Power and the probability of a Type II error are complementary: as power increases, the probability of a Type II error decreases.
The power of a test depends on the following factors:
1. Significance Level (α): A lower significance level (e.g., 0.01 instead of 0.05) reduces the
probability of Type I error but may also decrease the power of the test.
2. Effect Size: A larger effect size generally leads to higher power.
3. Sample Size (n): Increasing the sample size tends to increase the power of a test.
4. Variability or Standard Deviation (σ): Lower variability in the data increases the power
of the test.
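The four factors listed above can be explored numerically. The sketch below computes the power of a one-sided (upper-tail) z-test with known σ. That test was chosen only because its power has a simple closed form, and the function name z_test_power and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of an upper-tail z-test of H0: mu = mu0 vs H1: mu = mu0 + effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection cutoff under H0
    shift = effect * sqrt(n) / sigma         # mean of the z statistic under H1
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical inputs: true difference of 2 units, sigma = 5.
for n in (10, 30, 100):
    print(f"n = {n:3d}: power = {z_test_power(effect=2.0, sigma=5.0, n=n):.3f}")
```

Changing one argument at a time shows each effect: a smaller alpha lowers power, while a larger effect, a larger n, or a smaller sigma raises it.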
4
Write short notes:
a) P value
The p-value is the probability of observing results at least as extreme as the ones obtained, assuming the null hypothesis is true. A small p-value (typically < 0.05) is evidence against the null hypothesis and leads to its rejection.
Conversely, a larger p-value means that the observed results are consistent with the null hypothesis, so it is not rejected.
The p-value is a crucial component of hypothesis testing, helping researchers assess the statistical significance of their findings.
b) Z test
The z-test is a parametric statistical test used to assess the difference between the means of two groups, or between a sample mean and a hypothesized population mean. It is suitable for larger sample sizes (typically n > 30) and when the population standard deviation is known.
It involves calculating the z-score. The larger the absolute value of the z-score, the more evidence there is against the null hypothesis.
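To make the z-score and the p-value concrete, here is a minimal sketch of a one-sample, two-sided z-test. The one-sample form is used only for brevity, and the data values (sample mean 52, hypothesized mean 50, σ = 10, n = 100) and the function name one_sample_z_test are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H0: population mean equals mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability of a result at least as extreme as |z| in either tail.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data, not taken from the notes.
z, p_value = one_sample_z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(f"z = {z:.2f}, p = {p_value:.4f}")
# prints: z = 2.00, p = 0.0455
```

Here p is below 0.05, so at the 5% level of significance the null hypothesis would be rejected.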
c) level of significance
To decide whether the variation observed in a sample is significant, we need a standard for comparison. This standard is known as the level of significance. It is denoted by α.
The level of significance divides the whole area under the sampling distribution into two parts:
1. One part containing values with a small or non-significant difference: Acceptance Region.
2. The other part containing values with a large or significant difference: Rejection Region.
Commonly used levels of significance are 0.1, 0.05 and 0.01. The choice of the level of significance
is somewhat arbitrary and depends on the researcher's willingness to tolerate Type I errors.
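As an illustration, the boundary between the acceptance and rejection regions can be computed for each common level of significance. The sketch assumes a two-sided z-test, which the note itself does not specify, and the function name two_sided_critical_value is hypothetical.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """Cutoff c such that |z| > c falls in the rejection region."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    z_crit = two_sided_critical_value(alpha)
    print(f"alpha = {alpha:.2f}: reject H0 when |z| > {z_crit:.3f}")
# prints:
# alpha = 0.10: reject H0 when |z| > 1.645
# alpha = 0.05: reject H0 when |z| > 1.960
# alpha = 0.01: reject H0 when |z| > 2.576
```

A smaller α pushes the cutoff outward, shrinking the rejection region and making a Type I error less likely.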
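d) Errors in test of significance
The two errors are the Type I error (rejecting a true null hypothesis) and the Type II error (failing to reject a false null hypothesis).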
5
a) What is the Chi-square test? When will you conduct a Chi-square test?
The Chi-Square test is a statistical method used to determine if there is a significant association
between two categorical variables. It assumes independence of observations and typically
requires expected frequencies in each category to be sufficiently large.
We can use a chi-square test of independence when we have two categorical variables. It allows
us to test whether the two variables are related to each other.
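The test of independence can be sketched without any external library. The 2 × 2 table below is hypothetical, the function name chi_square_independence is invented for this example, and the critical value 3.841 is the standard chi-square table value for 1 degree of freedom at α = 0.05.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts: rows = exposed / not exposed, columns = disease / no disease.
observed = [[30, 20], [20, 30]]
statistic, df, expected = chi_square_independence(observed)
critical_value = 3.841  # chi-square table value for df = 1, alpha = 0.05
print(f"chi-square = {statistic:.2f}, df = {df}, reject H0: {statistic > critical_value}")
# prints: chi-square = 4.00, df = 1, reject H0: True
```

Every expected count here is 25, which satisfies the requirement that expected frequencies be sufficiently large.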
b) math problem