
FINAL BADM

Uploaded by

Ngoc Anh
Copyright
© All Rights Reserved

FINAL EXAM

BUSINESS ANALYTICS FOR DECISION-MAKING

PART 1: MULTIPLE CHOICE QUESTIONS (3 marks total; 0.5 marks each)


1. What does an influence diagram visually represent?
A) The sequence of steps in a process.
B) The relationships and influences between different entities or variables in a model.(chap10)
C) The financial performance of a company over time.
D) The organizational hierarchy within a company.
2. Which of the following best describes the purpose and construction of a histogram?
A) A histogram uses line graphs to connect data points and show trends over time.
B) A histogram represents the distribution of a continuous variable by dividing the data into
intervals or bins, with each bin’s height indicating the number of observations within that
interval.(chap2)
C) A histogram is used to compare frequencies of different categories by displaying them as
distinct bar heights, with each bar representing a different category.
D) A histogram visualizes categorical data by arranging categories along the horizontal axis
and showing their frequencies as individual dots.
3. What does a Type II error represent in hypothesis testing?
A. Rejecting the null hypothesis when it is actually true.
B. Failing to reject the null hypothesis when it is actually false.(Chap 6, hypothesis)
C. Incorrectly accepting the null hypothesis when the alternative hypothesis is true.
D. Incorrectly accepting the alternative hypothesis when the null hypothesis is true.
4. The data-ink ratio, a concept described by Edward R. Tufte, emphasizes which of the
following principles in data visualization?
A) The ratio of graphical elements to textual explanations in a chart.
B) The proportion of ink used for essential data representation versus ink used for non-
essential elements in a chart or table.(chap3)
C) The ratio of color usage to grayscale in a data visualization.
D) The proportion of data labels to data points in a graph.
5. What does a positive correlation coefficient (r > 0) indicate about the relationship
between two variables?
A) As one variable increases, the other variable tends to decrease.
B) There is no relationship between the two variables.
C) As one variable increases, the other variable tends to increase as well.(chap2)
D) The variables are unrelated and show no discernible pattern.
6. Which of the following statements accurately describes variance in a dataset?
A) Variance measures the average deviation of each observation from the median of the
dataset.
B) Variance is calculated as the difference between each observation and the mean, and these
deviations are squared to compute the variance.(chap2)
C) Variance is the square root of the average of the absolute deviations from the mean.
D) Variance indicates the central tendency of the data by averaging the observations.
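
The variance definition in question 6 can be checked numerically. The sketch below is illustrative only: the function name and the data are made up, and it assumes the sample variance (divide by n - 1), comparing the hand computation with Python's standard library.

```python
import statistics


def sample_variance(values):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


data = [2, 4, 6, 8]  # hypothetical observations, not from the exam
s2 = sample_variance(data)
print(s2, statistics.variance(data))  # both use the n - 1 divisor
```

Dividing by n instead of n - 1 would give the population variance (`statistics.pvariance`).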

Question 1. What is the primary purpose of a box plot?

(A) To summarize the distribution of quantitative data and identify outliers (chap 2)
(B) To display time series data
(C) To display the frequency distribution of categorical data
(D) To show the correlation between two variables
Question 2. What is the purpose of a trend line in a scatter chart?
(A) To approximate the relationship between variables (chap 8)
(B) To highlight outliers
(C) To connect data points
(D) To display individual data points
Question 3. When should tables be used for data presentation?
(A) When the data is best represented visually
(B) When the reader needs to refer to specific numerical values (chap 3)
(C) When the data is large and complex.
(D) When the audience is unfamiliar with the data
Question 4: If P(A) = 0.6 and P(B) = 0.4, what is P(A ∩ B) if A and B are independent?
(A) 0.20
(B) 0.24 (chap 5)
(C) 0.40
(D) 0.10
Question 5: Two events A and B are independent. What is P(A ∩ B)?
(A) P(A) + P(B)
(B) P(A) / P(B)
(C) P(A) * P(B)(chap 5)
(D) P(A) - P(B)
Question 6. Which of the following is recommended for effective table design?
(A) Use horizontal lines only when necessary
(B) Use vertical lines liberally
(C) Avoid unnecessary horizontal lines (chap 3)
(D) Include as much non-data ink as possible
Question 7. The Multiplication Law states that for any two events A and B, what is P(A ∩ B) if
they are not independent?
(A) P(B|A) * P(A)(chap 5)
(B) P(A) + P(B)
(C) P(A) * P(B)
(D) P(B|A) * P(B)
Question 8. Which of the following is an example of an alternative hypothesis (H1)?
(A) H1: p = 0.5
(B) H0: p ≤ 0.3
(C) H0: µ = 50
(D) H1: µ ≠ 50 (chap 6)
Question 9: Which of the following describes a subset of the population?
(A) Population
(B) Observation
(C) Sample (chap 2)
(D) Variable
Question 10: What is the null hypothesis (H0) in hypothesis testing?
(A) It is the hypothesis we aim to support.
(B) It assumes there is no effect or difference. (chap 6)
(C) It represents the effect or change we expect to find.
(D) It is always true.
Question 11: Which term refers to a quantity whose values are not known with certainty?
(A) Deterministic variable
(B) Fixed variable
(C) Constant variable
(D) Random variable
Question 12: If P(B) = 0.5 and P(A|B) = 0.6, what is P(A ∩ B)?
(A) 0.1
(B) 0.6
(C) 0.3(chap 5)
(D) 0.2
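
The independence and multiplication-law questions above all reduce to one product. The following is a minimal sketch with made-up function names; it reuses the numbers from the questions (0.6 and 0.4 for independent events, P(B) = 0.5 with P(A|B) = 0.6 for the conditional case).

```python
def joint_independent(p_a, p_b):
    """P(A ∩ B) when A and B are independent: P(A) * P(B)."""
    return p_a * p_b


def joint_from_conditional(p_a_given_b, p_b):
    """Multiplication law for any two events: P(A ∩ B) = P(A|B) * P(B)."""
    return p_a_given_b * p_b


p_independent = joint_independent(0.6, 0.4)
p_conditional = joint_from_conditional(0.6, 0.5)
print(round(p_independent, 2), round(p_conditional, 2))
```

When the events are independent, P(A|B) equals P(A), so the second function gives the same result as the first.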
Question 13. What is a Type II error?
(A) Rejecting the null hypothesis when it is true
(B) Accepting the alternative hypothesis when it is true
(C) Failing to reject the null hypothesis when it is false
(D) None of the above
Question 14: What is the purpose of a trend line in a scatter chart?
(A) To connect data points
(B) To display individual data points
(C) To approximate the relationship between variables(Chap 3)
(D) To highlight outliers
Question 15: Which type of chart is particularly useful for displaying time series data?
(A) Pie chart
(B) Line chart (chap 3)
(C) Bar chart
(D) Scatter chart

Question 1: What is the primary purpose of data visualization?


(A) To analyze data errors
(B) To summarize and present data effectively(Chap 3)
(C) To collect data
(D) To store data securely

Question 2: If P(A) = 0.4 and P(B) = 0.5, and A and B are mutually exclusive, what is P(A ∪ B)?
(A) 0.9 (chap 5)
(B) 0.1
(C) 0.5
(D) 0

Question 3: If the sample proportion is 0.4 and the sample size is 100, what is the standard
error of the proportion?
(A) 0.04
(B) 0.08
(C) 0.06
(D) 0.05 (chap 6; √(0.4 × 0.6 / 100) ≈ 0.049)
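
The standard error in the question above follows from the formula √(p̂(1 − p̂)/n). A short sketch, with a hypothetical function name, using the question's values:

```python
import math


def standard_error_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


se = standard_error_proportion(0.4, 100)
print(round(se, 3))  # about 0.049, which rounds to 0.05
```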

Question 4: In a histogram, the variable of interest is placed on which axis?


(A) Y-axis
(B) Vertical axis
(C) Horizontal axis(chap 2)
(D) Z-axis

Question 5. When constructing a confidence interval for a population proportion, what
distribution is typically used?
(A) Binomial distribution (counts of successes in a fixed number of trials)
(B) Normal distribution (chap 6)
(C) t-distribution (for means with small sample sizes)
(D) Poisson distribution (counts of events)

Question 6: What do you call a summary of data that shows the number of observations in each
of several nonoverlapping classes?
(A) Percent frequency distribution
(B) Histogram
(C) Relative frequency distribution
(D) Frequency distribution (chap 2)

Question 7: What is the complement of an event A, denoted as A'?

(A) The event that A does not occur (chap 5)
(B) The event that A occurs twice
(C) The event that A occurs
(D) The event that A occurs at least once

Question 8: A 95% confidence interval for a population mean is calculated as (10, 20). What
does this mean?
(A) 95% of the data points lie between 10 and 20.
(B) The sample mean is between 10 and 20.
(C) 95% of the sample means will fall between 10 and 20.
(D) There is a 95% chance the population mean is between 10 and 20. (chap 6)

Question 9: Which type of data is collected from several entities at the same point in time?
(A) Time series data
(B) Longitudinal data
(C) Cross-sectional data (chap 2)
(D) Qualitative data

Question 10: If P(A) = 0.3, what is P(A')?


(A) 0.3
(B) 0.9
(C) 0.7(chap 5)
(D) 0.5

Question 11. According to the Addition Law, if A and B are two events, what is P(A ∪ B)?
(A) P(A) + P(B)
(B) P(A) * P(B)
(C) P(A) + P(B) + P(A ∩ B)
(D) P(A) + P(B) - P(A ∩ B)(chap 5)
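
The Addition Law and the complement rule from the questions above can be sketched in a few lines. The function names are illustrative; the inputs are the values used in the mutually exclusive and complement questions.

```python
def union_probability(p_a, p_b, p_a_and_b=0.0):
    """Addition law: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

    For mutually exclusive events P(A ∩ B) = 0, which is the default here.
    """
    return p_a + p_b - p_a_and_b


def complement_probability(p_a):
    """Complement rule: P(A') = 1 - P(A)."""
    return 1 - p_a


p_union = union_probability(0.4, 0.5)      # mutually exclusive events
p_not_a = complement_probability(0.3)
print(round(p_union, 2), round(p_not_a, 2))
```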

Question 12: What is crosstabulation?

(A) A table for describing data of two variables (chap 3)
(B) A method for collecting data
(C) A type of chart used in presentations

Question 13: Who first described the concept of the data-ink ratio?

(A) George Box
(B) Hans Rosling
(C) John Tukey
(D) Edward R. Tufte (chap 3)

Question 14: What is the primary purpose of a confidence interval for a population mean?

(A) To provide a point estimate of the mean
(B) To determine the exact value of the population mean
(C) To compare two population means
(D) To estimate the range in which the population mean lies (chap 6)

Question 15: What is crosstabulation?

(A) A type of chart used in presentations
(B) A method for collecting data
(C) A table for describing data of two variables (chap 3)
(D) A technique for cleaning data

1. The decisions concerning an organization’s goals and future plans are called
a. financial decisions.
b. tactical decisions.
c. strategic decisions. (chap 1)
d. operational decisions.
2. A forecast that helps direct police officers to areas where crimes are likely to occur based
on past data is an example of
a. predictive analytics. (chap 1)
b. decision analysis.
c. prescriptive analytics.
d. descriptive analytics.
3. Optimization models can be used to
a. assess the risk of investment portfolios.
b. forecast future financial performance.
c. successfully manage commercial real estate risk.
d. decide on how to invest cash received from insurance policies. (chap 12)
4. Tables should be used instead of charts when _____.
a. the reader needs relative comparisons of data
b. there are more than two columns of data
c. the values being displayed have different units or very different magnitudes (chap 3)
d. the reader need not differentiate the columns and rows
5. Bar charts use _____.
a. horizontal bars to display the magnitude of the quantitative variable (chap 3)
b. vertical bars to display the magnitude of the quantitative variable
c. horizontal and vertical bars to display the magnitude of the quantitative variable
d. vertical bars to display the magnitude of the categorical variable
6. This Excel bar chart displays the demographics of a Business Analysis class.
Approximately how many students are in the class?

a. 175

b. 150

c. 105

d. 130
7. The ratio of the amount of ink used in a table or chart that is necessary to convey
information to the total amount of ink used in the table or chart is known as the data-ink
ratio. Using additional ink that is not necessary to convey information has what effect on
the data-ink ratio?
a. It reduces the data-ink ratio. (chap 3)
b. It increases the data-ink ratio.
c. It doesn't change the data-ink ratio.
d. The data-ink ratio becomes zero.
8. _____ acts as a representative of the population.
a. The variable
b. The variance
c. A sample (chap 2)
d. A random variable
9. The letter grades (A, B, C, D, F) of business analysis students are recorded by a
professor. This variable’s classification _____.
a. is quantitative data
b. cannot be determined
c. is categorical data
d. is time series data
10. Compute the geometric mean for the following data on growth factors of an investment
for 10 years. 1.10, 0.50, 0.70, 1.21, 1.25, 1.12, 1.16, 1.11, 1.13, 1.22
a. 1.0221
b. 1.0148 (chap 2; the product of the ten factors is about 1.1577, and its 10th root is about 1.0148)
c. 1.0363
d. 1.1475
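
The geometric mean in question 10 is the n-th root of the product of the growth factors. The sketch below uses the ten factors from the question; the variable names are illustrative.

```python
import math

growth_factors = [1.10, 0.50, 0.70, 1.21, 1.25, 1.12, 1.16, 1.11, 1.13, 1.22]

# Geometric mean: n-th root of the product of the n growth factors.
product = math.prod(growth_factors)
geometric_mean = product ** (1 / len(growth_factors))

print(round(product, 4), round(geometric_mean, 4))  # geometric mean is about 1.0148
```

The arithmetic mean of the same factors is 1.05, which overstates the average annual growth because it ignores compounding.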
11. A _____ determines how far a particular value is from the mean relative to the data
set’s standard deviation.
a. coefficient of variation
b. z-score (chap 2)
c. variance
d. percentile
12. Which graph represents a negative linear relationship between x and y?

a. A

b. B

c. C

d. None of the graphs display a negative linear relationship.
13. Sample space is _____.
a. a process that results in some outcome
b. the collection of all possible outcomes (chap 5)
c. the collection of events
d. a subgroup of a population/the likelihood of an outcome
14. Which statement is true about mutually exclusive events?
a. If events A and B cannot occur at the same time, they are called mutually exclusive. (chap 5)
b. If either event A or event B must occur, they are called mutually exclusive.
c. P(A) + P(B) = 1 for any events A and B that are mutually exclusive.
d. None of these are correct.


15. Which of the following statements is correct?
a. The binomial and normal distributions are both discrete probability distributions.
b. The binomial and normal distributions are both continuous probability distributions.
c. The binomial distribution is a continuous probability distribution, and the normal
distribution is a discrete probability distribution.
d. The binomial distribution is a discrete probability distribution and the normal
distribution is a continuous probability distribution.
16. Fast food restaurants pride themselves in being able to fill orders quickly. A study was
done at a local fast food restaurant to determine how long it took customers to receive their
order at the drive-thru. It was discovered that the time it takes for orders to be filled is
exponentially distributed with a mean of 1.5 minutes. What is the probability that it takes
less than one minute to fill an order?
a. 0.1813

b. 0.4866(chap 5)

c. 0.6321

d. 0.7769
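
For question 16, the exponential cumulative probability is P(X ≤ x) = 1 − e^(−x/μ), where μ is the mean. A minimal sketch with a hypothetical function name, using the question's mean of 1.5 minutes:

```python
import math


def exponential_cdf(x, mean):
    """P(X <= x) for an exponential distribution with the given mean."""
    return 1 - math.exp(-x / mean)


p_under_one_minute = exponential_cdf(1.0, 1.5)
print(round(p_under_one_minute, 4))  # about 0.4866
```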
17. A health conscious student faithfully wears a device that tracks his steps. Suppose that
the distribution of the number of steps he takes in a day is normally distributed with a
mean of 10,000 and a standard deviation of 1,500 steps. One day he took 15,000 steps. What
was his percentile on that day?
a. 95%
b. 97.7%
c. 99.7%
d. 100% (chap 2; z-score = (15,000 − 10,000) / 1,500 ≈ 3.33, and NORM.S.DIST(3.33, TRUE) ≈ 0.9996, which rounds to 100%)
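
The percentile in question 17 can be reproduced with the standard normal cumulative distribution, which is what Excel's NORM.S.DIST returns in cumulative mode. This sketch uses Python's `statistics.NormalDist` as a stand-in; the variable names are illustrative.

```python
from statistics import NormalDist

mean_steps = 10_000
sd_steps = 1_500
observed = 15_000

# z-score: distance from the mean in units of standard deviations.
z = (observed - mean_steps) / sd_steps

# Cumulative probability of the standard normal distribution at z.
percentile = NormalDist().cdf(z)

print(round(z, 3), round(percentile, 4))
```

The result is well above 99.7%, the figure the empirical rule gives for the share of values within three standard deviations of the mean, which is a different quantity from a percentile.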
18. In a normal distribution, which is greater, the mean or the median?
a. Mean
b. Median
c. Neither the mean nor the median (they are equal)
d. Cannot be determined with the information provided
19. For a population with an unknown distribution, the form of the sampling distribution
of the sample mean is _____.
a. approximately normal for small sample sizes
b. exactly normal for large sample sizes
c. exactly normal for small sample sizes
d. approximately normal for large sample sizes (chap 6)

20. In order to determine an interval for the mean of a population with unknown standard
deviation, a sample of 24 items is selected. The mean of the sample is determined to be 23.
The number of degrees of freedom for reading the t value is _____.

a. 21

b. 22

c. 23( chap 6)

d. 24
21. In interval estimation, as the sample size becomes larger, the interval estimate _____.
a. becomes narrower (chap 6)
b. becomes wider
c. remains the same, since the mean is not changing
d. gets closer to 1.96


22. A statistics teacher started class one day by drawing the names of 10 students out of a hat and
asked them to do as many pushups as they could. The 10 randomly selected students averaged 15
pushups per person with a standard deviation of 9 pushups. Suppose the distribution of the
population of number of pushups that can be done is approximately normal. The 95% confidence
interval for the true mean number of pushups that can be done is _____.

a. 5.75 to 24.25

b. 8.56 to 21.40(Chap 6)

c. 11.31 to 18.55

d. 13.02 to 16.98
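
The interval in question 22 is x̄ ± t · s/√n with n − 1 = 9 degrees of freedom. The sketch below is illustrative: the critical value 2.262 is taken from a standard t table (an assumption, since the exam does not state it), and the function name is made up.

```python
import math

# Assumed t-table value for 95% confidence with 9 degrees of freedom.
T_CRIT_95_DF9 = 2.262


def t_interval(sample_mean, sample_sd, n, t_crit):
    """Confidence interval for a mean when the population standard deviation is unknown."""
    margin = t_crit * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


lower, upper = t_interval(15, 9, 10, T_CRIT_95_DF9)
print(round(lower, 2), round(upper, 2))
```

The computed upper limit is about 21.44, so the listed option ending in 21.40 is the closest match.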
23. In a random sample of 400 registered voters, 120 indicated they plan to vote for Trump
for President. Determine a 95% confidence interval for the proportion of all the registered
voters who will vote for Trump.
a. (0.25, 0.34)

b. (0.27, 0.32)

c. (0.29, 0.30)

d. Cannot be determined from the information given.
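
The interval in question 23 follows from p̂ ± z · √(p̂(1 − p̂)/n). A minimal sketch with a hypothetical function name, using the question's counts (120 of 400) and the normal critical value for 95% confidence:

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


ci_lower, ci_upper = proportion_interval(120, 400)
print(round(ci_lower, 3), round(ci_upper, 3))
```

The limits come out near 0.255 and 0.345, so the option listing (0.25, 0.34) is the one consistent with this calculation.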
24. What are the two decisions that you can make from performing a hypothesis test?

a. Reject the null hypothesis; Fail to reject the null hypothesis (chap 6)

b. Accept the null hypothesis; Accept the alternative hypothesis

c. Make a Type I error; Make a Type II error

d. Reject the alternative hypothesis; Accept the null hypothesis


=====================

CHAPTER 8: TIME SERIES ANALYSIS AND FORECASTING

1. What is the primary characteristic of qualitative forecasting methods?

a) Reliance on historical data

b) Involvement of expert judgment


c) Emphasis on quantification
d) Use of advanced statistical models
2. When is it reasonable to use quantitative forecasting methods?

a) When expert judgment is available

b) When past information is not available


c) When past information is quantifiable and assumed to be indicative of the future

d) When the past is not considered a reliable indicator

3. What is the objective of time series analysis in forecasting?


a) To identify expert judgment
b) To quantify historical data
c) To uncover patterns and extrapolate them into the future
d) To eliminate past values
4. How is a stationary time series defined?
a) Fluctuating randomly around a constant mean
b) Exhibiting gradual shifts to higher or lower values
c) Having statistical properties dependent on time
d) Experiencing cyclical patterns
5. What characterizes a trend pattern in time series analysis?
a) Gradual shifts or movements to relatively higher or lower values

b) Random fluctuations around a constant mean

c) Recurring patterns over successive periods


d) Alternating sequence of points below and above the trendline

6. In time series, what is a seasonal pattern?

a) Gradual shifts to higher or lower values


b) Recurring patterns over successive periods
c) Alternating sequence of points below and above the trendline
d) Fluctuating randomly around a constant mean
7. What is the main purpose of using a naïve forecasting method?
a) To incorporate expert judgment
b) To eliminate past values
c) To predict future data using the most recent data
d) To uncover patterns in the time series
8. What is the key concept associated with measuring forecast accuracy?

a) Seasonal pattern
b) Regression analysis
c) Forecast error
d) Exponential smoothing
9. Which measure of forecast accuracy considers both positive and negative forecast
errors?

a) Mean forecast error


b) Mean absolute error (MAE)
c) Mean squared error (MSE)
d) Mean absolute percentage error (MAPE)
10. What is a moving average in forecasting?
a) A method using expert judgment
b) An analysis of time series patterns
c) A technique for quantifying past information
d) An average of a specified number of past observations
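
A moving average forecast, as described in question 10, averages the k most recent observations. The sketch below uses a made-up series and function name purely for illustration.

```python
def moving_average_forecast(series, k):
    """Forecast the next period as the mean of the k most recent observations."""
    if k <= 0 or k > len(series):
        raise ValueError("k must be between 1 and the length of the series")
    return sum(series[-k:]) / k


sales = [17, 21, 19, 23, 18, 16]  # hypothetical weekly data
ma_forecast = moving_average_forecast(sales, 3)
print(ma_forecast)  # mean of 23, 18 and 16

# With k = 1 the method reduces to the naive forecast (the most recent value).
naive_forecast = moving_average_forecast(sales, 1)
```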
11. What is the primary purpose of using exponential smoothing in forecasting?

a) To eliminate past values

b) To incorporate expert judgment


c) To uncover patterns in the time series
d) To assign exponentially decreasing weights to past observations
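
Exponential smoothing, as in question 11, updates the forecast with F(t+1) = α·y(t) + (1 − α)·F(t), which implicitly gives older observations exponentially smaller weights. This is a minimal sketch under one common convention (the first forecast equals the first observation); the data and names are made up.

```python
def exponential_smoothing_forecast(series, alpha):
    """Return the forecast for the period after the last observation.

    Convention assumed here: the forecast for period 2 equals the first observation.
    """
    forecast = series[0]
    for actual in series[1:]:
        # New forecast = alpha * latest actual + (1 - alpha) * previous forecast.
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


demand = [17, 21, 19]  # hypothetical observations
es_forecast = exponential_smoothing_forecast(demand, alpha=0.2)
print(round(es_forecast, 2))
```

A larger α reacts faster to recent changes; a smaller α smooths more heavily.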
12. How is linear trend projection utilized in forecasting?
a) To eliminate past values
b) To uncover patterns in the time series
c) To assign exponentially decreasing weights to past observations
d) To find a best-fitting line to a set of data exhibiting a linear trend
13. What role does regression analysis play in causal forecasting methods?

a) It eliminates multicollinearity

b) It assigns exponentially decreasing weights to past observations


c) It quantifies historical data
d) It models the relationship between independent and dependent variables (chap 7)

14. Why is scatter chart analysis useful in regression modeling?

a) To eliminate multicollinearity
b) To quantify historical data
c) To uncover patterns in the time series
d) To identify linear or nonlinear relationships between variables
15. What is the primary consideration in using software to select the best forecasting
model?
a) The size of the dataset
b) Managerial knowledge
c) Expert judgment
d) Software output and forecast error measures
16. How does dividing data into training and validation sets contribute to forecasting?

a) It quantifies historical data

b) It assigns exponentially decreasing weights to past observations


c) It minimizes forecast error measures
d) It allows testing different models on unseen data
17. What is the main responsibility of the user in selecting a forecasting model?

a) To rely on expert judgment

b) To follow software recommendations


c) To decide based on software output and managerial knowledge
d) To use the most recent data for prediction
18. What does the term " " measure?
a) The fit of the regression model
b) The deviation of the sample mean from the population mean
c) How well a forecasting method reproduces available time series data

d) The probability of Type I error

19. When should regression analysis be used in forecasting a time series?

a) When seasonality is present

b) When a linear trend is absent


c) When the relationship between variables is random
d) When the time series exhibits a linear trend
20. What is the recommended approach for handling large datasets in forecasting?

a) Using the naïve forecasting method

b) Dividing data into training and validation sets


c) Ignoring forecast error measures
d) Relying solely on expert judgment
21. What is the primary consideration in selecting a forecasting method based on software
output?
a) Managerial knowledge
b) Forecast error measures
c) Expert judgment
d) The size of the dataset
22. What distinguishes quantitative forecasting methods from qualitative methods?

a) Reliance on expert judgment

b) Use of historical data


c) Involvement of statistical models
d) Emphasis on seasonality
23. In time series analysis, what does a horizontal pattern indicate?
a) Gradual shifts to higher or lower values
b) Fluctuations around a constant mean
c) Recurring patterns over successive periods
d) Alternating sequence of points below and above the trendline
24. What is the primary purpose of using the naïve forecasting method?
a) To predict future data using the most recent data

b) To uncover patterns in the time series


c) To eliminate past values
d) To incorporate expert judgment
25. What does the coefficient of determination (r²) represent in forecasting?

a) The proportion of the total sum of squares explained by the regression equation

b) The fit of the regression model to the data

c) The percentage of forecast error


d) The probability of Type II error
26. How is the mean absolute percentage error (MAPE) calculated?
a) The absolute value of forecast errors divided by the mean
b) The mean of absolute forecast errors
c) The sum of squared forecast errors divided by the sample size
d) The percentage difference between actual and forecasted values, averaged over all
observations
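
The forecast accuracy measures named in this chapter (mean forecast error, MAE, MSE, MAPE) are all averages of the forecast errors, where each error is the actual value minus the forecast. The sketch below uses made-up data and names for illustration.

```python
def accuracy_measures(actual, forecast):
    """Return mean forecast error, MAE, MSE and MAPE (in percent)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "MFE": sum(errors) / n,                       # signs can cancel out
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sum(e ** 2 for e in errors) / n,
        "MAPE": 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / n,
    }


actual_values = [20, 25, 22, 28]      # hypothetical actuals
forecast_values = [18, 26, 22, 24]    # hypothetical forecasts
measures = accuracy_measures(actual_values, forecast_values)
print(measures)
```

Only the mean forecast error keeps the sign of each error; the other three remove it by taking absolute values or squares.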
27. What does the term "multicollinearity" refer to in regression analysis?

a) The presence of a linear trend

b) The correlation among independent variables


c) The absence of forecast error
d) The seasonal pattern in time series data
28. Why is scatter chart analysis useful in regression modeling?
a) To eliminate multicollinearity
b) To quantify historical data
c) To uncover patterns in the time series
d) To identify linear or nonlinear relationships between variables
29. What is the primary role of validation sets in forecasting?
a) To quantify historical data
b) To assign exponentially decreasing weights to past observations
c) To minimize forecast error measures
d) To test different models on unseen data
30. What should be considered when deciding on the use of complex models in forecasting?

a) The emphasis on qualitative methods


b) The availability of expert judgment
c) The meaningfulness of relationships with dependent variables (chap 7)
d) The size of the dataset

CHAPTER 7: LINEAR REGRESSION

1. What is the primary function of a dependent variable in regression analysis?


a) To predict independent variables
b) To establish the hypothesis
c) To measure the characteristics of a population
d) To be predicted based on independent variables

2. In a simple linear regression model, how is the relationship between x and y assumed?
a) Non-linear
b) Quadratic
c) Exponential
d) Linear

3. What is the least squares method used for in regression analysis?


a) To find the estimated regression equation
b) To calculate the mean of the dependent variable
c) To determine the sample size
d) To eliminate the error term

4. What do the parameters β0 and β1 represent in a simple linear regression model?


a) Independent variables
b) Error terms
c) Slope and intercept
d) Dependent variables

5. How is the slope (b1) interpreted in a simple linear regression model?


a) Estimated value of y when x is 0
b) Estimated change in the mean of y for a one-unit increase in x
c) The random variable in the model
d) Coefficient of determination

6. What does the coefficient of determination (r²) represent in regression analysis?


a) The sum of squares due to error
b) The percentage of the total sum of squares explained by the regression equation
c) The ratio of dependent to independent variables
d) The least squares method applied to regression parameters
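
Questions 3 to 6 describe the least squares estimates b0 and b1 and the coefficient of determination r² = 1 − SSE/SST. The following is a minimal pure-Python sketch on made-up data; the function names are illustrative.

```python
def fit_line(x, y):
    """Least squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope: change in mean of y per one-unit increase in x
    b0 = y_bar - b1 * x_bar    # intercept: estimated y when x is 0
    return b0, b1


def r_squared(x, y, b0, b1):
    """Share of the total sum of squares explained by the fitted line."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


x_values = [1, 2, 3, 4, 5]   # hypothetical data
y_values = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x_values, y_values)
r2 = r_squared(x_values, y_values, b0, b1)
print(round(b0, 2), round(b1, 2), round(r2, 2))
```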

7. What is the purpose of the F test in regression analysis?


a) To test individual regression parameters
b) To evaluate the goodness of fit for the estimated regression equation
c) To address non significant independent variables
d) To identify multicollinearity

8. When is the t-distribution used in hypothesis testing in regression?


a) When dealing with large samples
b) When dealing with small samples or unknown population standard deviation
c) When the regression equation is quadratic
d) When working with discrete probability distributions

9. In testing individual regression parameters, what does a p-value less than the
significance level indicate?
a) Reject the null hypothesis(chap 6)
b) Fail to reject the null hypothesis
c) Accept the alternative hypothesis
d) Reject the alternative hypothesis

10. What is the purpose of interval estimation in regression analysis?


a) To determine the sample size
b) To calculate the mean of the dependent variable
c) To provide a range of values around the point estimate( chap 6)
d) To eliminate the error term
11. What is multicollinearity in multiple regression analysis?
a) The ratio of dependent to independent variables
b) Correlation among the independent variables
c) The least squares method applied to regression parameters
d) Testing for an overall regression relationship

12. What is the primary challenge associated with using piecewise linear regression
models?
a) Overfitting the model
b) Underfitting the model
c) Multicollinearity
d) Lack of independence among observations

13. What is backward elimination in variable selection procedures?


a) Adding all variables to the model
b) Removing variables from the model
c) Including only significant variables
d) Selecting variables based on the smallest p-value

14. How does the forward selection procedure work in variable selection?
a) Removing variables based on p-values
b) Including variables with the largest p-value
c) Allowing variables to enter the model based on a criterion
d) Iteratively modeling procedures without guidance

15. What does the best subsets procedure focus on in variable selection?
a) Including all variables in the model
b) Removing variables based on p-values
c) Selecting the smallest p-value
d) Iterative modeling procedures based on stepwise elimination
This method involves examining all possible combinations of variables to find the subset that
provides the best model fit according to some criterion (like adjusted R-squared, AIC, or BIC),
rather than following a strictly forward or backward stepwise approach. It essentially compares
different models with different combinations of variables to determine which subset of variables
yields the best model.

16. What is the purpose of the random variable in a regression model?


a) To predict independent variables
b) To account for variability in the dependent variable
c) To eliminate sampling error
d) To calculate the mean of the dependent variable

17. In regression, what is the purpose of calculating the sum of squares due to error (SSE)?
a) To measure the fit of the regression model
b) To assess multicollinearity
c) To compute the margin of error
d) To estimate the population parameter

18. What conditions are necessary for valid inference in the least squares
regression model?
a) The sample size is large
b) The independent variables are not correlated
c) The error terms are normally distributed with constant variance
d) The regression parameters are equal to zero
The error terms are normally distributed with constant variance.
Constant variance (homoscedasticity) is one of the Gauss-Markov assumptions, and normality of the
error terms is what justifies the t and F tests, so together they allow valid statistical
inference. Therefore, option c is correct.

19. In the context of regression analysis, what is the null hypothesis for individual
regression parameters?
a) There is a linear relationship between y and x
b) The error term is equal to zero
c) The dependent variable is normally distributed
d) The regression parameter is equal to zero
This hypothesis states that there is no effect of the independent variable on the dependent
variable, meaning the parameter (coefficient) associated with that variable does not contribute to
explaining the variation in the dependent variable.

20. What is the primary objective of the best subsets procedure in variable selection?
a) To include only significant variables
b) To remove variables with large p-values
c) To provide a range of values around the point estimate
d) To guide iterative modeling procedures
To guide iterative modeling procedures.
The best subsets procedure systematically generates and evaluates all possible combinations of
variables to find the subset that provides the best model according to certain criteria, guiding the
user through an iterative process of model selection.
21. What is the purpose of the interaction between independent variables in regression
analysis?
a) To eliminate multicollinearity
b) To improve the fit of the regression model
c) To simplify the regression equation
d) To address nonsignificant variables

Interaction terms in regression allow the effect of one independent variable on the dependent
variable to depend on the level of another independent variable. This can capture more complex
relationships in the data, potentially leading to a better fit of the model by accounting for the
combined effect of variables.

22. How is the quadratic regression model different from a simple linear regression model?
a) It involves multiple independent variables
b) It assumes a non-linear relationship between x and y
c) It uses the least squares method
d) It does not involve an error term

It assumes a non-linear relationship between x and y.


A quadratic regression model includes a squared term of the independent variable, allowing for a
curved, parabolic relationship between x and y, whereas a simple linear regression model
assumes a straight-line relationship.

23. What is the potential risk associated with overfitting a regression model?
a) Increased accuracy
b) Increased bias
c) Increased generalization
d) Decreased flexibility

24. In the context of regression, what does the F test evaluate?


a) The goodness of fit for the estimated regression equation
b) The significance of individual regression parameters
c) The overall regression relationship
d) The multicollinearity among independent variables

25. What is the role of the y-intercept (b0) in a regression equation?


a) Represents the change in the mean value of the dependent variable
b) Estimates the value of the dependent variable when independent variables are zero
c) Measures the variability in the dependent variable
d) Determines the slope of the regression line
26. What does a p-value less than the significance level indicate in hypothesis testing?
a) Fail to reject the null hypothesis
b) Reject the alternative hypothesis
c) Fail to reject the alternative hypothesis
d) Reject the null hypothesis

27. When using the t-distribution in hypothesis testing, what does a large t-value suggest?
a) A significant relationship between variables (Chap 6)
b) A non-significant relationship between variables
c) An error in the regression model
d) A large sample size

A large t-value (in absolute value) indicates that the observed effect, meaning the difference
between the sample estimate and its hypothesized value, is large relative to the standard error.
This typically leads to rejecting the null hypothesis of no effect or no relationship, which
suggests a significant relationship between the variables being tested.
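
A minimal sketch of how a t-value is computed for a test about a population mean, using made-up observations and a hypothetical hypothesized mean of 50. The function name `one_sample_t` is illustrative.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    standard_error = s / math.sqrt(n)
    return (x_bar - mu0) / standard_error


sample = [52, 55, 49, 58, 56, 54, 53, 57]  # made-up observations
t_value = one_sample_t(sample, mu0=50)
print(f"t = {t_value:.3f} with {len(sample) - 1} degrees of freedom")
```

Here the sample mean is more than four standard errors above the hypothesized value, which is the kind of large t-value the explanation refers to.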

28. What is the purpose of backward elimination in variable selection procedures?


a) Including only significant variables
b) Adding all variables to the model
c) Removing variables based on p-values
d) Iteratively modeling procedures without guidance

29. Why is the central limit theorem relevant in regression analysis?


a) It identifies the shape of the sampling distribution (Chap 6)
b) It eliminates multicollinearity
c) It calculates the mean of the dependent variable
d) It addresses nonsignificant variables

30. How does the best-subsets procedure guide variable selection in regression analysis?
a) By including only significant variables
b) By removing variables based on p-values
c) By selecting the smallest p-value
d) By providing guidance on iterative modeling procedures

The best-subsets procedure generates and evaluates all possible combinations of the candidate
variables to find the subset that gives the best model fit under a chosen criterion, which guides
the user through an iterative process of model selection.
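
The enumeration step alone can be sketched as follows. The predictor names are hypothetical, and the sketch only lists the candidate subsets; fitting a regression to each subset and comparing a fit criterion is left out.

```python
from itertools import combinations

candidates = ["x1", "x2", "x3"]  # hypothetical predictor names


def all_subsets(variables):
    """Every non-empty subset of the candidate variables, smallest first."""
    subsets = []
    for size in range(1, len(variables) + 1):
        subsets.extend(combinations(variables, size))
    return subsets


subsets = all_subsets(candidates)
for subset in subsets:
    print(subset)
```

With k candidate variables there are 2^k - 1 non-empty subsets, so the number of models to evaluate grows quickly as variables are added.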

CHAPTER 6: STATISTICAL INFERENCE

1. What is the primary disadvantage of taking a census for collecting data?


a) Inexpensive
b) Time-consuming
c) Efficient
d) Reliable

2. What is statistical inference used for in the context of data collection?


a) Collecting census data
b) Making inferences from sample data
c) Avoiding sampling error
d) Estimating population characteristics accurately

3. What term is used to describe the population from which a sample is drawn?
a) Sampled population
b) Census population
c) Finite population
d) Infinite population

4. What is a frame in the context of sampling?


a) The sample mean
b) A list of elements for sampling
c) The population parameter
d) The standard deviation

5. What is the primary advantage of selecting a probability sample when sampling from a
finite population?
a) Lower cost
b) Faster data collection
c) Valid statistical inferences
d) Simplicity in implementation

6. What is the purpose of point estimation in statistics?


a) Describing population characteristics
b) Estimating population parameters
c) Calculating sampling error
d) Analyzing joint probabilities

7. What is the point estimator of the population mean?
a) Sample mean
b) Sample standard deviation
c) Sample proportion
d) Point estimate

8. What is the sampling distribution of a point estimator based on?


a) Joint probabilities
b) The sample distribution
c) Different random samples
d) Sampling error

9. When is a point estimator considered unbiased?


a) When the sample size is small
b) When the expected value equals the population parameter
c) When the distribution is skewed
d) When the standard deviation is large
A point estimator is unbiased when its expected value equals the population parameter it
estimates.
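
Unbiasedness of the sample mean can be illustrated with a small simulation. The population (the integers 1 to 100), the sample size of 10, and the 5,000 repetitions are assumptions chosen for the sketch.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

population = list(range(1, 101))  # hypothetical population: 1..100
population_mean = statistics.mean(population)

# Draw many random samples and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 10))
    for _ in range(5000)
]

average_of_sample_means = statistics.mean(sample_means)
print(f"population mean         = {population_mean}")
print(f"average of sample means = {average_of_sample_means:.2f}")
```

Individual sample means differ from the population mean, but their long-run average sits very close to it.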

10. What does the Central Limit Theorem state regarding the sampling distribution of the
sample mean?
a) It is always normal regardless of sample size
b) It approximates a normal distribution as sample size becomes large
c) It follows a uniform distribution
d) It is not applicable to small sample sizes
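
One way to see the theorem at work is to sample from a strongly skewed population and check how symmetric the sample means become. This sketch assumes an exponential population with mean 1. If the sampling distribution were normal, about half of the sample means would fall below the population mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable


def share_below_true_mean(n: int, reps: int = 4000) -> float:
    """Share of sample means that fall below the population mean of 1."""
    below = 0
    for _ in range(reps):
        x_bar = statistics.fmean(random.expovariate(1.0) for _ in range(n))
        if x_bar < 1.0:
            below += 1
    return below / reps


shares = {n: share_below_true_mean(n) for n in (1, 10, 100)}
for n, share in shares.items():
    print(f"n = {n:>3}: share of sample means below the mean = {share:.3f}")
```

For single observations the share is well above one half because of the skew, and it moves toward one half as the sample size grows.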

11. Why is interval estimation frequently used in statistics?


a) To eliminate sampling error
b) To calculate sampling distribution
c) To provide a range of possible values for a parameter
d) To simplify statistical analysis

12. What is the margin of error used for in interval estimation?


a) To narrow the confidence interval
b) To reduce the variability of the data
c) To determine the sample size
d) To provide a range of values around the point estimate

13. In interval estimation of the population mean, what is the range within which 95% of
the values lie if the sampling distribution follows a normal distribution?
a) 1.645 standard deviations
b) 1.960 standard deviations
c) 2.576 standard deviations
d) 3.000 standard deviations
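
The multipliers listed in the options correspond to common confidence levels, which can be checked against the standard normal distribution. The values sigma = 20 and n = 100 in the margin-of-error line are hypothetical, and the helper names are illustrative.

```python
import math
from statistics import NormalDist


def z_value(confidence: float) -> float:
    """z such that the middle `confidence` share of a standard normal lies within +/- z."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def margin_of_error(confidence: float, sigma: float, n: int) -> float:
    """Margin of error for the population mean when sigma is known."""
    return z_value(confidence) * sigma / math.sqrt(n)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_value(level):.3f}")

# Hypothetical inputs: sigma = 20, n = 100
print(f"95% margin of error = {margin_of_error(0.95, 20.0, 100):.2f}")
```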

14. When is the t-distribution used in interval estimation?


a) When dealing with large samples
b) When the population standard deviation is known
c) When dealing with small samples or unknown population standard deviation
d) When working with a discrete probability distribution

15. What does the null hypothesis represent in hypothesis testing?


a) A tentative conjecture
b) A proven fact
c) An alternative hypothesis
d) A point estimate

16. What is the alternative hypothesis in hypothesis testing?


a) The complement of the null hypothesis
b) An unrelated statement
c) The same as the null hypothesis
d) An inverse statement

17. What is the purpose of the hypothesis testing procedure?


a) To establish a null hypothesis
b) To collect random samples
c) To test the validity of competing statements about a population
d) To eliminate sampling error

18. How many forms can hypothesis tests about a population parameter take?
a) One
b) Two
c) Three
d) Four

19. What type of test is a two-tailed test in hypothesis testing?


a) Unidirectional
b) Bidirectional
c) One-tailed
d) Three-tailed

20. What is a Type II error in hypothesis testing?


a) Accepting H0 when it is true
b) Rejecting H0 when it is true
c) Accepting H0 when it is false
d) Rejecting H0 when it is false

21. In a two-tailed test, what happens if the conclusion is to accept H0?


a) Type I error
b) Correct conclusion
c) Incorrect conclusion
d) Type II error

22. What is the significance of the degrees of freedom in t-distributions?


a) Determines the width of the distribution
b) Reflects the sample size
c) Affects the shape of the distribution
d) Determines the mean of the distribution

23. What does the Central Limit Theorem state about the sampling distribution of the
sample mean?
a) It is always normally distributed
b) It approaches a normal distribution with increasing sample size
c) It follows a uniform distribution
d) It is not applicable to finite populations

24. In interval estimation, what is the purpose of the margin of error?


a) To reduce sampling error
b) To calculate the population mean
c) To determine the sample size
d) To provide a range of values around the point estimate

25. When is the t-distribution used instead of the standard normal distribution in interval
estimation?
a) When dealing with large samples
b) When the population standard deviation is known
c) When dealing with small samples or unknown population standard deviation
d) When working with discrete probability distributions

27. What does statistical inference aim to achieve using sample data?
a) Making exact predictions
b) Drawing conclusions about the sample only
c) Estimating population characteristics
d) Eliminating sampling error

28. What is the sampled population in the context of statistical sampling?


a) The population from which the sample is drawn
b) The population after the sample is collected
c) The finite population
d) The infinite population

29. What is a frame in the context of sampling methodology?


a) The sample mean
b) A list of elements for sampling
c) The population parameter
d) The standard deviation

30. What is the primary advantage of selecting a probability sample when sampling from a
finite population?
a) Lower cost
b) Faster data collection
c) Valid statistical inferences
d) Simplicity in implementation

31. What is the point estimator of the population standard deviation?


a) Sample mean
b) Sample standard deviation
c) Sample proportion
d) Point estimate

32. What information does the sampling distribution provide about the sample mean?
a) Joint probabilities
b) The mean of the population
c) Different random samples
d) Sampling error
The sampling distribution of the sample mean describes how sample means vary from sample to
sample because of the randomness of sampling. It therefore gives insight into the sampling
error, which is the difference between the sample mean and the population mean.

33. When is a point estimator considered unbiased?


a) When the sample size is small
b) When the expected value equals the population parameter
c) When the distribution is skewed
d) When the standard deviation is large

34. What does the Central Limit Theorem state about the sampling distribution of the
sample mean?
a) It is always normally distributed
b) It approximates a normal distribution as sample size becomes large
c) It follows a uniform distribution
d) It is not applicable to finite populations

35. Why is interval estimation frequently used in statistics?


a) To eliminate sampling error
b) To calculate sampling distribution
c) To provide a range of possible values for a parameter
d) To simplify statistical analysis

36. What is the margin of error used for in interval estimation?


a) To narrow the confidence interval
b) To reduce the variability of the data
c) To determine the sample size
d) To provide a range of values around the point estimate

37. In interval estimation of the population mean, what is the range within which 95% of
the values lie if the sampling distribution follows a normal distribution?
a) 1.645 standard deviations
b) 1.960 standard deviations
c) 2.576 standard deviations
d) 3.000 standard deviations

38. When is the t-distribution used in interval estimation?


a) When dealing with large samples
b) When the population standard deviation is known
c) When dealing with small samples or unknown population standard deviation
d) When working with a discrete probability distribution

39. What does the null hypothesis represent in hypothesis testing?


a) A tentative conjecture
b) A proven fact
c) An alternative hypothesis
d) A point estimate

40. What is the alternative hypothesis in hypothesis testing?


a) The complement of the null hypothesis
b) An unrelated statement
c) The same as the null hypothesis
d) An inverse statement

41. What is the primary purpose of the hypothesis testing procedure?


a) To establish a null hypothesis
b) To collect random samples
c) To test the validity of competing statements about a population
d) To eliminate sampling error

42. How many forms can hypothesis tests about a population parameter take?
a) One
b) Two
c) Three
d) Four

45. In a two-tailed test, what happens if the conclusion is to accept H0?


a) Type I error
b) Correct conclusion
c) Incorrect conclusion
d) Type II error

50. What is the primary challenge associated with taking a census for data collection?
a) Cost-effectiveness
b) Time efficiency
c) Misleading results
d) Unreliable data
