Region State Country Median Listing Price (Y) Median $'s Per Square Foot Median Square Feet (X)
This analysis examines the relationship between the listing price of properties and their size in square
feet. The Pacific region was selected, and a random sample of 30 data values was drawn for analysis. A
scatter plot was then created in Excel, and the most suitable trend line was selected, along with its equation
and R-squared value. A prediction is also made using the resulting regression equation.
A random sample of 30 data values is created using the simple random sampling method.
Data analysis
To compare the sample with the population, the following table can be used.
According to the above table, the mean of the sample's median listing price deviates noticeably from the
population mean. The sample median is also significantly less than the national median. The population is more
precise than the sample, as the standard deviation of the national median listing price is less than that of
For the median square feet data, the sample and population statistics deviate less from each other. The
sample mean is smaller than the national mean of median square feet, and the median is also larger in the
population. The sample standard deviation is less than the population standard deviation for the median
square feet data set.
From the above, we can conclude that the median square feet data set is the comparatively better one. The
population of median square feet data is symmetrical and normally distributed, while the population of median
The sample is drawn by the simple random sampling method. First, the data set is numbered from 1 to 89. Paper
chits numbered from 1 to 89 are then mixed thoroughly. After that, one chit is drawn, its number is noted, and
the corresponding data value is selected for the sample. This is repeated 30 times to produce a sample of 30
data values.
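The chit-drawing procedure above can be mimicked in code. The following is only a minimal sketch: it assumes the chits are drawn without replacement (the text does not say whether a drawn chit is returned), and the function name `draw_sample` and the `seed` parameter are hypothetical, introduced here for illustration.

```python
import random

def draw_sample(population_size=89, sample_size=30, seed=None):
    """Return the record numbers picked by simple random sampling."""
    rng = random.Random(seed)
    # Records are numbered 1..population_size, like the paper chits.
    chits = range(1, population_size + 1)
    # random.sample draws without replacement (an assumption here),
    # so every record has the same chance and none is picked twice.
    return sorted(rng.sample(chits, sample_size))

sample_ids = draw_sample(seed=1)
print(len(sample_ids))  # number of records selected
```

Fixing `seed` makes the draw repeatable, which the paper-chit method cannot offer; leaving it as `None` gives a fresh sample on each run.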
Since all data values had an equal chance of being selected and the sample size is 30, we can consider
Scatter plot
[Figure: scatter plot of median listing price ($0 to $600,000) against median square feet (1000 to 2600), showing the sample data (Series2) and its polynomial trend line.]
Regression equation
y = 0.408x^2 − 1720x + 2 × 10^6
The pattern
Median square feet is the independent variable (x), while median listing price is the dependent variable (y).
The independent variable is used to make predictions, since the value of y depends on the value of x.
Because the correlation coefficient is very low, the relationship between these two variables is not strong.
We can therefore conclude that the association between median listing price and median square feet is very weak.
The shape of the curve is nonlinear, so the polynomial curve, which is the best-fit line, is selected to calculate
There are no outliers in this sample. An outlier is a data value that lies more than 1.5 times the interquartile
range below the first quartile or above the third quartile. None of the data values meets that criterion.
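The 1.5 × IQR rule can be checked mechanically. The sketch below is illustrative only: the function name `find_outliers` is hypothetical, and the "inclusive" quartile method is an assumption, since the report does not state which quartile convention was used. Other conventions can give slightly different fences.

```python
import statistics

def find_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences."""
    # Quartiles via the 'inclusive' method (an assumed convention).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Keep only the values that fall beyond either fence.
    return [v for v in values if v < lower_fence or v > upper_fence]
```

Applying this function to the 30 sampled values should return an empty list if, as stated, the sample contains no outliers.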
y = 0.408(1200)^2 − 1720(1200) + 2 × 10^6
y = 523,520
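The substitution above can be reproduced with a short function. This is a sketch that assumes the coefficients exactly as reported in the regression equation; the name `predict_price` is hypothetical, and because the reported coefficients are rounded, the result is approximate.

```python
def predict_price(square_feet):
    """Evaluate the reported trend line y = 0.408x^2 - 1720x + 2e6."""
    a, b, c = 0.408, -1720.0, 2e6  # coefficients as reported (rounded)
    return a * square_feet**2 + b * square_feet + c

print(round(predict_price(1200)))  # 523520
```

The same function can be evaluated at other sizes, though predictions far outside the sampled range of median square feet should be treated with caution given the weak fit.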