MATH 1280 - Written Assignment Unit 2
February 8, 2023
The "flowers.csv" file contains measurements of iris flowers. Create an R
data frame named "flower.data" that contains the data in the file.
The following R code shows an example of how to round a vector of numbers to zero decimal
places and then calculate some statistics using the rounded numbers. You might need some of
the calculations for this assignment, but you might not need others. You would replace
example$years with the name of the R object that you want to analyze (in other programming
languages, you might call example$years a variable).
> cumsum(rel.freq)
Value: 1 2 3 4 5 6 7
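The code listing that the paragraph above refers to is not reproduced in full in this copy. A minimal sketch of the kind of calculation it describes is given here, assuming a hypothetical data frame `example` with a numeric column `years` (the values are made up for illustration); `freq` and `cum.rel` are illustrative names, while `rel.freq` follows the `cumsum(rel.freq)` call shown above.

```r
# Hypothetical data: only the name example$years comes from the assignment text.
example <- data.frame(years = c(1.2, 2.7, 3.1, 3.4, 4.8, 5.2, 2.2, 3.9))

rounded  <- round(example$years, 0)   # round to zero decimal places
freq     <- table(rounded)            # frequency of each rounded value
rel.freq <- freq / sum(freq)          # relative frequency of each value
cum.rel  <- cumsum(rel.freq)          # cumulative relative frequency

print(freq)
print(rel.freq)
print(cum.rel)
```

To analyze a different variable, example$years would be replaced with the name of that R object.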
Tasks
1. Sometimes it is difficult to understand data if you do not know what the numbers represent.
Provide short definitions of two words: sepal, and petal (be sure to cite your sources even if you
paraphrase):
Sepal: __“The flower's outer parts that surround a developing bud (typically green and
leaf-like)” (Merriam-Webster Dictionary, n.d.)__.
Petal: __“The parts of a flower that are frequently colorful which together form the
2. There is a cumulative relative frequency table printed above for petal lengths (using rounded
values for petal length). Below the number 3 in that table is the number .35. What does .35
represent? (multiple choice)
c. Of all the flowers measured in this sample 35% had a petal length of 3 (after rounding the
petal lengths).
d. Of all the flowers measured in this sample 35% had a petal length of 3 or less (after rounding
the petal lengths).
e. A study of all flowers on the planet would show that about 35% had petal lengths of 3 or less
(after rounding the petal lengths).
3. Using only the cumulative relative frequency table printed above combined with some simple
paper-and-pencil calculations, which petal length occurs most frequently?
Value: 1 2 3 4 5 6 7
Answer: After working out the relative frequencies, I conclude that petal lengths of 4 and 5
are tied as the most frequent values.
4. Describe how you determined your answer to the previous question (describe the calculations
that you used). Do not show R code for this task--it will not be counted as an answer.
Answer: __I recovered each relative frequency by subtracting the preceding cumulative
relative frequency from the current one. For example, the relative frequency of 7 is the
cumulative relative frequency of 7 minus that of 6: 1 - 0.97 = 0.03. Doing the same for 4
and 5 gives 0.58 - 0.35 = 0.23 and 0.81 - 0.58 = 0.23, which are the largest differences, so
petal lengths of 4 and 5 are the most common__.
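Purely as a cross-check of the arithmetic (it is not part of the written answer, which must be done by hand), the subtraction can be sketched in R. The sketch uses only the cumulative values quoted in this document; assigning 0.97 to the value 6 is an inference, since it is the only value left between 0.81 (for 5) and 1 (for 7).

```r
# Cumulative relative frequencies quoted in the answers, for values 3 to 7.
# The entries for values 1 and 2 are not available in this copy.
cum.rel <- c("3" = 0.35, "4" = 0.58, "5" = 0.81, "6" = 0.97, "7" = 1.00)

# Each relative frequency is the difference between neighbouring
# cumulative values; the results are named by the upper value (4, 5, 6, 7).
rel.freq <- diff(cum.rel)
print(round(rel.freq, 2))
```

Because the entries for 1 and 2 are missing here, the sketch only covers the values 4 through 7.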
--------------------------------------------------------------------------------------------------------
5. Assuming that you read the flowers.csv file into an R object called flower.data, run the
following R code (do not paste the ">" character into R) and paste both the command and the
output into your answer (you should see five names, each of which should be enclosed in
quotes--if you do not see this, try again or contact your instructor):
> names(flower.data)
Answer:
> flower.data <- read.csv("flowers.csv")
> names(flower.data)
7. List the variables in the data frame (you can do this by entering the name of the R object that
holds the data that you read using the read.csv command--you should have called it flower.data).
If you do not see five columns of data, then there was a problem reading the input file--try again
or contact your instructor. For each variable identify the type of the variable (factor or numeric).
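A minimal sketch of how the column names and variable types could be inspected. Because flowers.csv is not available here, the built-in iris dataset is written to a temporary file as a stand-in; that the real file has a similar five-column layout is an assumption, and `csv.path` is an illustrative name.

```r
# Stand-in for flowers.csv: write the built-in iris data to a temporary CSV.
# (Assumption: the real file has a similar five-column layout.)
csv.path <- tempfile(fileext = ".csv")
write.csv(iris, csv.path, row.names = FALSE)

# stringsAsFactors = TRUE turns text columns into factors. Since R 4.0.0 the
# default is FALSE, which would read them as character instead.
flower.data <- read.csv(csv.path, stringsAsFactors = TRUE)

print(names(flower.data))           # column names, each in quotes
print(sapply(flower.data, class))   # type of each variable
```

With the real file, csv.path would simply be "flowers.csv".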
8. Round the data for the variable Sepal.Length so that it contains integers, then find the
frequency of the value 7 (not the relative frequency):
> attach(flower.data)
> freq.Sepal.Length <- table(round(Sepal.Length))
> freq.Sepal.Length
  4   5   6   7   8
  5  47  68  24   6
Answer: The frequency of the value 7 is 24.
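The rounding step can also be written without attach(), as in the sketch below. It uses the built-in iris dataset as a stand-in for flower.data (an assumption, since flowers.csv is not available here); `freq.7` is an illustrative name.

```r
# Stand-in for the data read from flowers.csv (assumption).
flower.data <- iris

# round() with no digits argument rounds to integers; exact halves such as
# 6.5 go to the even neighbour (6), following the IEC 60559 convention.
freq.Sepal.Length <- table(round(flower.data$Sepal.Length))
print(freq.Sepal.Length)

# Look up the frequency (not the relative frequency) of the value 7 by name.
freq.7 <- freq.Sepal.Length[["7"]]
print(freq.7)
```

Using flower.data$Sepal.Length keeps the column reference explicit, so the result does not depend on which objects are currently attached.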
-----------------------------------------------------------------------------------------------------------
Assuming that you read the flowers.csv file into an R object called flower.data, run the following
R code (do not paste the ">" character into R). Note that we are not rounding the numbers here.
Use the output for the next five tasks:
> table(flower.data$Sepal.Width)
> plot(table(flower.data$Sepal.Width))
9. What is the sum of the first three frequencies in the frequency table for sepal width? Answer
is 8.
10. What does your answer to the previous question represent (in terms of sepal width and
frequency and the percentage of all sepal measurements)?
11. What is the sum of the last three frequencies in the frequency table for sepal width? Answer
is 3.
12. How many flowers in the sample had sepal widths less than 4 (do NOT round the sepal width
numbers for this, but you can round your final answer to 3 decimal places)?
Answer: Exactly 4 flowers in the sample had sepal widths greater than or equal to 4. The
number of flowers with sepal widths less than 4 is therefore the total number of flowers
minus the number with sepal widths greater than or equal to 4: 150 - 4 = 146.
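The sums and counts used in tasks 9 through 12 can be checked with a short sketch. It uses the built-in iris dataset as a stand-in for flower.data (an assumption); `width.table`, `first.three`, `last.three`, `pct.first`, and `less.than.4` are illustrative names.

```r
# Stand-in for the data read from flowers.csv (assumption).
flower.data <- iris

# Frequency table of the unrounded sepal widths, sorted by width.
width.table <- table(flower.data$Sepal.Width)

first.three <- sum(head(width.table, 3))   # three smallest widths
last.three  <- sum(tail(width.table, 3))   # three largest widths

# Share of all measurements that fall in the three smallest widths.
pct.first <- 100 * first.three / nrow(flower.data)

# Count flowers with sepal width strictly below 4, without rounding.
less.than.4 <- sum(flower.data$Sepal.Width < 4)

print(c(first.three = first.three, last.three = last.three,
        less.than.4 = less.than.4))
print(round(pct.first, 3))
```

Counting directly with a logical comparison avoids having to subtract from the total by hand.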
13. What does the tallest bar in the plot represent? (multiple choice)
a. mean
b. mode
c. median
The correct answer is (b).
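The tallest bar corresponds to the most frequent value, which can be read off the frequency table directly. A minimal sketch, again assuming the built-in iris dataset as a stand-in for flower.data; `mode.width` is an illustrative name.

```r
# Stand-in for the data read from flowers.csv (assumption).
flower.data <- iris

width.table <- table(flower.data$Sepal.Width)

# which.max() returns the position of the largest frequency; the matching
# name is the sepal width that occurs most often (the mode).
mode.width <- names(width.table)[which.max(width.table)]
print(mode.width)
```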
----------------------------------------------------------------------------------------------------------
14. Create a frequency table that shows the frequencies for each species of flower in the sample.
Paste your R command and output into your answer (do NOT display data from a data frame,
display data using the table() command).
Answer:
> attach(flower.data)
> table(Species)
Species
50 50 50
15. Explain two things about the table that you created for the previous task:
Why did the frequency table for flower species contain words in the first row as opposed to
numbers?
Answer: The variable Species is a factor rather than a numeric variable, so the first row of
the table lists its levels (the species names) as words rather than numbers.
What is the meaning of the numbers in the second row of the table?
Answer: Each number is the frequency of the species named above it, that is, how many
flowers of that species appear in the sample (50 of each).
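The two points above can be illustrated with a short sketch. It assumes the built-in iris dataset as a stand-in for flower.data; `species.table` is an illustrative name.

```r
# Stand-in for the data read from flowers.csv (assumption).
flower.data <- iris

# A factor stores category labels, so table() is keyed by words.
print(is.factor(flower.data$Species))

species.table <- table(flower.data$Species)
print(names(species.table))       # first row: the factor levels
print(as.vector(species.table))   # second row: count of flowers per level
```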
References
https://www.collinsdictionary.com/dictionary/english/petal
Merriam-Webster Dictionary. (n.d.). Sepal of a flower. Retrieved from
https://www.merriam-webster.com/dictionary/sepal
Yakir, B. (2011). Introduction to Statistical Thinking (With R, Without Calculus). Jerusalem, IL:
The Hebrew University of Jerusalem, Department of Statistics. Retrieved from
https://my.uopeople.edu/pluginfile.php/1659047/mod_resource/content/5/IntroStat.pdf