Key points

Image: a scatter diagram. Caption: Scatter diagrams are used to explore patterns between two sets of data.

  • A scatter diagram or scatter graph is used to explore patterns between two sets of data (known as 'bi-variate' data).

  • If there is a relationship between the two variables, it is called a correlation.

  • By exploring the patterns between the variables, it may be possible to draw a line of best fit. A line of best fit generalises the trend and can be used to make predictions.


How to draw scatter diagrams and correlation

  • To produce a scatter diagram, data is required. The data often comes in the form of a table.

  • To create a scatter diagram:

  1. Look for the smallest and largest values of both variables in your table.
  2. Draw a horizontal axis on your square paper or graph paper to represent one variable.
  3. Choose an appropriate scale for this axis and label your axis. Decide if you need to use a false origin.
  4. Draw a vertical axis on your square paper or graph paper to represent the other variable.
  5. Choose an appropriate scale for this axis and label your axis. Decide if you need to use a false origin.
  6. Plot each data point carefully on the graph.
  7. Check you have labelled each axis correctly and give your scatter diagram a title.
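
The first steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: it uses the maths and science marks from Example one, and the helper `axis_range` and its step of 10 are assumptions chosen for the sketch.

```python
import math

# Marks for students A to H, taken from Example one.
maths = [36, 94, 67, 24, 62, 51, 73, 56]
science = [32, 88, 73, 35, 68, 45, 68, 62]


def axis_range(values, step=10):
    """Hypothetical helper: round the smallest and largest values
    outwards to the nearest multiple of `step`."""
    start = math.floor(min(values) / step) * step
    end = math.ceil(max(values) / step) * step
    return start, end


# Step 1: find the smallest and largest values for each variable,
# then choose a range that covers them.
x_start, x_end = axis_range(maths)    # horizontal axis: maths mark
y_start, y_end = axis_range(science)  # vertical axis: science mark

# Step 6: each student becomes one (maths, science) point to plot.
points = list(zip(maths, science))

print("Horizontal axis:", x_start, "to", x_end)
print("Vertical axis:", y_start, "to", y_end)
print("Points to plot:", points)
```

If a range starts above zero, starting the axis there means using a false origin. Whether to do so is a judgement about how clearly the points will be spread out.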

Examples

Example one

The table shows the results of eight students' maths and science marks in an assessment. Construct a scatter graph that represents these results.

Student        A   B   C   D   E   F   G   H
Maths mark     36  94  67  24  62  51  73  56
Science mark   32  88  73  35  68  45  68  62

Question

Graph A, Graph B and Graph C each demonstrate a different type of correlation.

One shows no correlation, another a positive correlation and the other a negative correlation.

Decide which correlation belongs to which graph.

Image: three sketches of axes labelled x and y. In graph A, eight data points are plotted such that as one variable increases, the other variable increases. In graph B, eight data points are plotted such that as one variable increases, the other variable decreases. In graph C, eight data points are spread randomly across the axes.
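
The direction of a correlation can also be checked numerically. The sketch below is one possible approach rather than part of the method described here: it calculates a standard correlation coefficient (a number between -1 and 1) and labels the result. The function name `correlation_type` and the cut-off of 0.3 are assumptions made for illustration. The data are the marks from Example one.

```python
import math


def correlation_type(xs, ys, threshold=0.3):
    """Label the trend between two variables.

    `threshold` is an arbitrary cut-off chosen for this sketch.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables move together, and how much each one varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return "none"  # a variable that never changes shows no trend
    r = sxy / math.sqrt(sxx * syy)
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "none"


# Marks for students A to H, taken from Example one.
maths = [36, 94, 67, 24, 62, 51, 73, 56]
science = [32, 88, 73, 35, 68, 45, 68, 62]

result = correlation_type(maths, science)
print(result)
```

For these marks the label is "positive": students with higher maths marks tend to have higher science marks.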

How to draw and use a line of best fit

  • When a scatter diagram has a positive or negative correlation, it is possible to draw a line of best fit. The line of best fit should approximate the trend. The line of best fit does not have to go through the origin.

  • It is possible to use the line of best fit to make predictions.

  • If there is a data point that does not fit the trend it is called an outlier.

Examples

Example one

Image: a scatter diagram titled 'Lengths and widths of twenty birds' eggs'. The horizontal axis is labelled length, measured in millimetres. The vertical axis is labelled width, measured in millimetres. Twenty data points are plotted with crosses at (17, 13), (19, 15), (20, 13), (21, 15), (22, 17), (26, 22), (31, 23), (34, 24), (39, 32), (45, 33), (45, 36), (52, 40), (57, 45), (58, 14), (63, 44), (66, 41), (71, 49), (77, 52), (79, 53) and (113, 74). The point (58, 14) is circled in blue and labelled 'outlier'.

The scatter diagram shows the relationship between the length and width of twenty birds' eggs. The horizontal and vertical scales are going up in increments of 10. Between each multiple of 10 there are five subdivisions. Each subdivision is worth 2 mm. One piece of data does not match the trend. It is a long way from the rest of the data. The item of data at (58, 14) is called an outlier. Outliers need to be identified before drawing the line of best fit.
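
On paper, an outlier is spotted by eye. As a rough cross-check, the sketch below fits a straight line to the egg data by calculation (a least-squares line, which is a different technique from drawing a line of best fit by eye) and reports the point furthest above or below that line. The function names are hypothetical and the approach is an assumption for illustration only.

```python
# (length, width) in millimetres for the twenty eggs in Example one.
eggs = [
    (17, 13), (19, 15), (20, 13), (21, 15), (22, 17),
    (26, 22), (31, 23), (34, 24), (39, 32), (45, 33),
    (45, 36), (52, 40), (57, 45), (58, 14), (63, 44),
    (66, 41), (71, 49), (77, 52), (79, 53), (113, 74),
]


def least_squares_line(points):
    """Return (gradient, intercept) of the least-squares straight line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def furthest_from_line(points):
    """Return the point with the largest vertical distance from the line."""
    gradient, intercept = least_squares_line(points)
    return max(
        points,
        key=lambda p: abs(p[1] - (gradient * p[0] + intercept)),
    )


outlier = furthest_from_line(eggs)
print(outlier)
```

For this data the point furthest from the fitted line is (58, 14), which matches the outlier circled on the diagram.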

Question

The scatter graph shows the relationship between the temperature on a given day and the number of ice creams sold in a café.

A line of best fit has been drawn.

Use the line of best fit to predict how many ice creams will be sold on a day where the temperature is 29°C.

Image: a scatter diagram titled 'Temperature and ice cream sales'. The vertical axis is labelled ice cream sales and runs from 0 to 80. The horizontal axis is labelled temperature, measured in degrees Celsius, and uses a false origin, running from 22 to 32. Ten data points are plotted with crosses at (22, 6), (22, 12), (24, 20), (25, 34), (26, 26), (28, 48), (28, 62), (30, 64), (31, 66) and (32, 74). An orange line of best fit passes through (22, 10) and (32, 79).
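
Reading a prediction from the graph means going up from 29°C to the line of best fit and across to the sales axis. The same reading can be sketched as a calculation, assuming the line passes exactly through the two points given in the image description, (22, 10) and (32, 79). The function name `line_through` is hypothetical.

```python
def line_through(p, q):
    """Return (gradient, intercept) of the straight line through p and q."""
    (x1, y1), (x2, y2) = p, q
    gradient = (y2 - y1) / (x2 - x1)
    intercept = y1 - gradient * x1
    return gradient, intercept


# Two points the line of best fit passes through, read from the diagram.
gradient, intercept = line_through((22, 10), (32, 79))

# Follow the line to a temperature of 29 degrees Celsius.
predicted_sales = gradient * 29 + intercept
print(round(predicted_sales))
```

Under that assumption the prediction is roughly 58 ice creams. A value read from the graph by eye may differ by one or two.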

Real-life maths

Image: six silhouettes of a child growing taller between the ages of one and seventeen. Caption: A data analyst may use scatter diagrams to look for patterns and trends.

A data analyst may use scatter diagrams to look for patterns and trends. These can then be used to make projections. A projection is an estimate or guess at what may happen in the future based on trends.

However, there is no guarantee that a trend will continue indefinitely. For example, as a person's age increases during childhood, their height typically increases. This would be a positive correlation. This trend does not continue forever as, although people continue to grow older, they eventually stop growing taller.
