An Introduction To The Resource Pack: Roger Stern, Eleanor Allan, Carlos Barahona and Ian Dale


An Introduction to the Resource Pack

Roger Stern, Eleanor Allan, Carlos Barahona and Ian Dale

Contents

1. Introduction
1.1. Why this resource pack?
1.2. Who is the resource pack for?
1.3. What does the resource pack contain?
1.4. What's in this introductory guide?
2. An alternative approach to statistics training
2.1. Less theory, more practice
2.2. Good graphs and tables
2.3. Using the statistical software
2.4. Statistical games
2.5. Good-practice guides
2.6. More concepts
2.7. Training in computing for statistics
3. Using a spreadsheet with SSC-Stat
3.1. Data layout
3.2. Exploratory graphics
3.3. Data and results
3.4. Analysis
3.5. Structured data
3.6. Behind the scenes
3.7. Help
3.8. Moving on
3.9. Further use of Excel
4. Using a statistics package: Instat
4.1. Menus
4.2. Analysis
4.3. Help
4.4. Moving on
5. Training courses in statistics
5.1. In-service training
5.2. Tailored training
5.3. Undergraduate training
5.4. Postgraduate training
6. In conclusion

Statistical Services Centre


The University of Reading, UK
Guide to the SSC Resources

1. Introduction

1.1. Why this resource pack?


We live in societies where decision-makers seek solid evidence on which to base their actions.
They therefore often need statistics (the efficient collection and use of data), a subject with which
many people are not comfortable. Sometimes this is because they have attended poorly presented
training courses, where statistics was perceived as being 'difficult', 'too mathematical' or just
irrelevant to their research needs. The fact that many professionals do not use their statistical
knowledge regularly enough to acquire confidence does not help either.
Recent developments provide an opportunity for change. These developments include: statistical
software becoming easier to use, advances in statistical methods that have made the subject more
accessible and better computing skills among users of statistics that allow them to take advantage
of these developments. We have attempted to harness some of these developments in this
resource pack in order to contribute to making the use of statistical methods and their teaching
more accessible and relevant. Our overall aim is to encourage good statistical practice.

1.2. Who is the resource pack for?


This pack can be used by those learning statistics, who can take advantage of it for the analysis of
their own data, and those in charge of teaching statistics, who will find suggestions on how to
integrate the resources into their training practice.
The ultimate target groups however are research scientists, social scientists, technologists,
monitoring and evaluation specialists, or any others who are required to collect and analyse data
and then interpret and present their findings.

1.3. What does the resource pack contain?


The resource pack contains:
SSC-Stat. A statistical add-in for Excel. It was developed to make Excel a more efficient tool
for data management and simple statistical analysis. It can be used in the teaching of basic
statistical practice.
Instat. A statistics package designed as an aid for teaching statistics, that has evolved into a
statistical package that can be useful for data analysis.
A set of statistical games. These are for the teaching of key statistical concepts in surveys
and experiments. Software versions of these games have been produced, but they can also
be played without computers.
A set of good practice guidelines. These were written to support research design and data
analysis but they can be used to support a general understanding of statistics.
Both SSC-Stat and Instat are free for individual use in non-commercial organisations. We provide
the full version of the software, including all the documentation, and there is no time limit on its use.
We plan to continue this policy, and upgrades may be downloaded from our web site
(http://www.ssc.rdg.ac.uk/) as they become available. The other resources, described in Section 2
of this introductory guide, are also available free of charge.

1.4. What's in this introductory guide?


The aim of the resource pack is to encourage a different approach to the teaching and learning of
statistics. In Section 2 we present ideas about how this can be done. We do it in relation to software
and other materials that have been produced by the Statistical Services Centre at the University of
Reading. However, the ideas are general and do not depend entirely on our resources. They can,
for instance, be adapted and used with the user's own software.
The proposed approach:
Builds on the knowledge and familiarity of students and trainers with spreadsheets
Encourages the use of software to discuss concepts rather than the mechanics of how
to generate output
Uses a set of good practice guidelines prepared to help in real research as an aid to
planning and interpretation of statistical analysis
Introduces the use of statistical games that may be used in conjunction with the
software to turn statistics courses into an enjoyable experience.
Some brief consideration is also given to training in efficient statistical computing.
Section 3 outlines the key features of our Excel add-in: SSC-Stat. Section 4 outlines the use of
Instat. Section 5 outlines some training courses in which SSC-Stat, Instat and these other
resources have been used.

2. An alternative approach to statistics training


Many training courses for non-statisticians could be broadened and made less theoretical.
Inadequate time is usually spent on the design of a study, data organization is often omitted, and the
descriptive analysis of realistic-sized data sets is rarely done. Some courses still concentrate more
on formulae than on the key concepts of analysis. This section discusses changes in focus and
tools used in statistical training and how this resource pack can be used to help in teaching
statistics.

2.1. Less theory, more practice


Our view is that courses for non-statisticians should include more on: planning, data organisation
with realistic-sized data sets, practical data analysis, and presentation and interpretation.
If that means that fewer formulae are taught, then so be it. We have many years of evidence that
teaching formulae first is not effective. It may provide a solid foundation for statistical specialists,
but most others find it baffling. We argue that:

There should be less emphasis on formulae and hand calculations, to allow more time for
students to fully understand computer output and key concepts.
Courses tend to cover topics in an old-fashioned sequence of increasing complexity, hence
the methods are seen as a disjointed set of techniques and key concepts are often missed.
Introducing techniques according to their analytical complexity is less important now that
computers will handle the number crunching.
Courses often only include artificially small data sets. This does not prepare students for data
management and for simple descriptive methods with structured data, that are needed later.
Courses often stop too soon. Important subjects, such as regression modelling with non-
normal data, are omitted, because they are not possible to do by hand.
We should stress that we are not against the occasional hand-analysis, but it should be done
together with the corresponding computer analysis. If the computer output can be shown first, the
calculations will help students to understand the results. On the positive side we have found that
constructive integration of computers for demonstrations and practical exercises helps
participants to enjoy their statistics training, often for the first time!

2.2. Good graphs and tables


In many studies, much of the analysis can be handled through the production of good graphs and
tables. This is descriptive statistics. To support this work, this resource pack includes two guides:
Guidelines for good statistical graphics in Excel
Good tables for Excel users
These documents are included within the SSC-Stat help, but are also available separately. The
guides can be used in combination with SSC-Stat for practical work. For those starting to use Excel
for statistical work we have two introductory guides: Excel for statistics: tips and warnings, and
Disciplined use of spreadsheet packages for data entry.
As many people are comfortable with spreadsheets (and some have used nothing else for data
management and analysis), using Excel is often a good starting point for statistics training. We
have chosen to present these topics using Excel because of the increasing number of students
who have at least a basic level of familiarity with it.

2.3. Using the statistical software


In this section we illustrate how statistical software can help in statistics training. We have chosen
three examples from the Instat Introductory Guide.
Example 1. Figure 2d shows data generated by a rice production survey (one of the games
included in the resource pack) where information was collected about yield of rice grown in the
fields of several farmers from four villages. The first example concerns the teaching of descriptive
statistics.
Fig. 2d. An example of survey data

Figure 2e shows a table of percentages 1, while Fig. 2f shows a two-way table of median yields from
the data in Fig. 2d. The tables include a tool-tip that is designed both to illustrate the importance of
understanding the data behind the summary values in a table, and to teach about percentages and
percentiles.

1
Many people lack confidence in their use or understanding of percentages. Calculations of different types of
percentages, including those that result from multiple response data, are described in Chapter 13 of the Instat
Introductory Guide (available from the help menu within Instat).

Fig. 2e. Two-way table of percentages Fig. 2f. Median rice yields by village and variety

The use of realistic examples permits training courses to spend longer on the important ideas of
descriptive statistics. In our experience, when students (and in some cases trainers) are asked to
prepare outline tables and graphs that correspond to the objectives of their study, they often do not
know where to begin. However, when descriptive statistics are calculated and appropriately
presented, they promote interesting discussions about statistics and the topic of analysis. This
enhances the teaching and learning process.
In many situations a descriptive summary, i.e., the appropriate use of tables and graphs, is most of
what is needed. Probability and inferential ideas are needed later.
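As a rough illustration of the kind of two-way summary described for Fig. 2f, the sketch below computes median yields by village and variety in plain Python. The records, village names and variety labels are invented for illustration and are not the survey data. The count behind each cell is printed alongside the median, in the spirit of the tool-tip mentioned above.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records in list format: (village, variety, yield).
# These values are invented for illustration; they are not the survey data.
records = [
    ("Village A", "New", 52.0), ("Village A", "New", 48.0),
    ("Village A", "Old", 35.0), ("Village A", "Old", 41.0), ("Village A", "Old", 38.0),
    ("Village B", "New", 60.0), ("Village B", "Old", 30.0), ("Village B", "Old", 44.0),
]

def two_way_cells(rows):
    """Group the yields by (village, variety)."""
    cells = defaultdict(list)
    for village, variety, value in rows:
        cells[(village, variety)].append(value)
    return cells

cells = two_way_cells(records)
# One median per cell of the two-way table.
table = {key: median(values) for key, values in cells.items()}

for (village, variety), med in sorted(table.items()):
    # Show how many observations lie behind each summary value.
    print(f"{village:10s} {variety:4s} median={med:5.1f} (n={len(cells[(village, variety)])})")
```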
Example 2: Instat can also be used to support the discussion of probability ideas. As a
demonstration, Fig. 2g displays the distribution of the difference between two normal distributions,
and could for example be used to illustrate probability ideas or ideas about variability in data.
Fig. 2g. The difference between two normal distributions (green line)

Further examples, illustrating probability ideas similar to that in Fig. 2g, such as the sampling
distribution of the mean, and the central limit theorem, are given in Chapter 14 of the Instat
Introductory Guide.
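The idea behind Fig. 2g can also be explored by simulation. The sketch below is our own illustration, not taken from Instat: it assumes two independent normal distributions with arbitrarily chosen means and standard deviations, and compares the simulated differences with the standard result that the difference has mean equal to the difference of the means and variance equal to the sum of the variances.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

# Assumed parameters, chosen only for illustration.
MU_X, SD_X = 50.0, 4.0
MU_Y, SD_Y = 45.0, 3.0

# Simulate independent draws from each distribution and take differences.
diffs = [random.gauss(MU_X, SD_X) - random.gauss(MU_Y, SD_Y) for _ in range(20000)]

# Theory for independent normals:
#   mean of difference     = MU_X - MU_Y
#   variance of difference = SD_X**2 + SD_Y**2
theory_mean = MU_X - MU_Y
theory_sd = (SD_X**2 + SD_Y**2) ** 0.5

print(f"simulated mean {mean(diffs):.2f}, theory {theory_mean:.2f}")
print(f"simulated sd   {stdev(diffs):.2f}, theory {theory_sd:.2f}")
```

Note that the variability of the difference is larger than that of either distribution alone, which is often the point that surprises trainees.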
Example 3 is illustrated in Fig. 2h, and concerns statistical inference. Many people have only a
hazy notion of what is meant by a confidence interval. The figure shows the yield of rice from a rice
production survey (Fig. 2d) together with estimates of the 20% and 80% points, and also confidence
limits for these percentage points. This is a good example of how to use software, in a way that is
meaningful to the researcher, to illustrate statistical concepts. And anyone who understands what is
meant by the 95% confidence limits for the 80% point of a set of data will certainly have mastered
the basic ideas of statistical inference!
Fig. 2h. Confidence limits are not just for the mean!
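For readers who want to see how confidence limits for a percentage point can be obtained, here is a minimal sketch of one standard distribution-free approach based on order statistics. We do not claim that this is the method used by Instat. It assumes a continuous measurement, so that the number of observations at or below the true 80% point follows a binomial distribution. The function names and the placeholder yields are our own.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(B <= k) for B ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

def quantile_ci_ranks(n, p, conf=0.95):
    """Return 1-based ranks (lo, hi) such that the interval from the lo-th
    to the hi-th smallest observation covers the true p-quantile with
    probability at least conf."""
    alpha = (1 - conf) / 2
    # Candidate lower ranks: P(B <= r - 1) <= alpha.
    lower = [r for r in range(1, n + 1) if binom_cdf(r - 1, n, p) <= alpha]
    # Candidate upper ranks: P(B >= r) <= alpha.
    upper = [r for r in range(1, n + 1) if 1 - binom_cdf(r - 1, n, p) <= alpha]
    if not lower or not upper:
        raise ValueError("sample too small for this quantile and confidence level")
    return max(lower), min(upper)

# Placeholder yields: 36 arbitrary numbers, not the survey data.
yields = sorted(20 + (7 * i) % 41 for i in range(36))

lo, hi = quantile_ci_ranks(len(yields), 0.8)
print(f"95% limits for the 80% point: {yields[lo - 1]} to {yields[hi - 1]}")
```

Because only ranks are used, the same function gives limits for the median (p = 0.5) or the 20% point (p = 0.2) without any assumption of normality.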

2.4. Statistical games


One way to help broaden courses is to use statistical games. The following games - included in
this resource pack - are simulations of the whole process from study design to reporting, and can
be undertaken by trainees within the time constraints of a short course:
To the woods: a simple survey game to show the benefits of stratification
Paddy: a multistage survey on rice cultivation practices and yields
Mice: a simple experimental design game to introduce the ideas of blocking
Tomato: an experimental design game that covers the ideas of incomplete block structures
and factorial treatments
Chick: an experimental design game involving quantitative treatments, where the choice
and number of levels has to be considered.
Two pictures from the rice survey are shown in Figs 2a and 2b.

These games are available both as hand exercises and on the computer2. One useful feature is
that they can provide different challenges to trainees, depending on their ability or previous
knowledge. For example, the rice survey typifies many real studies that collect data at multiple
levels: village, field, and plot. In a first course we can illustrate how such data may be summarised.
In later courses, planning aspects or more complex parts of the analysis can be discussed.

Fig. 2a. A simulation game to teach sampling

Fig. 2b. The 10 villages in the survey

A guide has recently been prepared to show how these games are being used in teaching statistics
to agriculture students in the University of Nairobi. This includes information on how to generate the
hand versions as well as how to adapt the existing games, or produce new ones. See the
Resources CD or http://www.uonbi.ac.ke/acad-depts/BUCS/ for more details.

2
Initial developments were done at the Department of Applied Statistics in the 1970s. Since then the games have
been used widely, both in the UK and overseas. The games were updated by the Statistical Services Centre of the
University of Reading in the 1990s as part of a teaching initiative in UK universities, and further updating by the School
of Applied Statistics is currently in progress (March 2005).

2.5. Good-practice guides


Another resource in our broadened training is a set of about 20 good statistical practice guides 3.
They provide advice in small doses of about 16 pages. There are two overview guides (Fig. 2c),
while more than half the guides are devoted to planning and data management. These guides often
make useful supporting material on training courses, but they are also relevant to people who are
self-learning statistical ideas.

Fig. 2c. Statistical good practice guides

3
The UK Government's Department for International Development (DFID) supported the production of these guides
and encouraged their wide circulation. In 2004 the set of guides was updated and republished as a book (see page
3).

Two of the guides relate to the use of Excel. Many analyses would be much easier if data were
entered into Excel in a controlled way. We describe what this means in the guide called Disciplined
use of spreadsheet packages for data entry.
The guide entitled The role of a database package in managing research data describes the types
of data that might be too complicated to enter into a spreadsheet. We compare the organising of
data in a spreadsheet with that in a database to provide guidance on the advantages and limitations
of each type of software.
The titles of the guides on data analysis and presentation are also shown in Fig. 2c. We use the
guide called Key concepts of inferential statistics in many of our courses. In some it provides a
summary of the main ideas from the training. On more advanced courses, we assume that these
concepts, that include standard errors, confidence limits and significance tests, are well
understood by the participants. However, the concepts are often poorly understood, and so this
guide can be used as preparatory reading before topics are discussed, if necessary, in a review
session at the start of the course.

2.6. More concepts


While there are many formulae for different analyses, there are relatively few key concepts. Once
these concepts are understood, they can be applied to both standard analyses and to new
situations.
Our ideas are illustrated in four of the good-practice guides listed in Fig. 2c that are concerned with
analysis. The first is called
Key concepts of inferential statistics
In this we provide the baseline for those who need more than descriptive methods. They are either
essential ideas on an elementary course, or are essential revision material for more advanced
courses.
Then there are two guides that describe methods that may be undertaken unaided, called:
Approaches to the analysis of survey data
Modern approaches to the analysis of experimental data
The fourth guide is entitled
Modern methods of analysis
Here we indicate the role of some advanced methods. We are convinced that knowledge of the
existence of these modern methods is helpful in many ways. It enables users to check what is
needed for a particular analysis. It helps dialogue with a statistician, or other support person, when
such analyses are undertaken, or when results are to be interpreted. This indication of what is now
possible in an analysis can also support imaginative design of future studies.

2.7. Training in computing for statistics


A final aspect of a broadened training is to consider how staff use the computer for their statistical
work. Many users of software are so accustomed to menus and dialogues that they cannot
envisage working in any other way. This can be very inefficient, particularly when an analysis
involves processing many similar sets of data in an identical way. The user can become trapped
into doing the routine work by hand (e.g., repetitive use of Copy/Paste or a defined set of menus
and dialogues), whereas computers were made to do such routine tasks. It is also unprofessional,
because there is often then no record of the steps that were taken in an analysis. Thus it can be
difficult to defend a presentation on a later occasion, because there is no record of the process by
which the results were generated.

Trainers should give some consideration to this issue, which is not discussed further here. Elsewhere we
describe alternative ways of using Instat, and any other statistics package, so users can identify an
appropriate strategy, both for themselves and for others.4
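To make the contrast with repetitive menu work concrete, the sketch below shows the general idea in Python rather than in Instat's own command language: one function is applied identically to several data sets, and each step is written to a log that can be kept as a record of the analysis. The data set names, values and the summarise function are hypothetical.

```python
from statistics import mean, median

def summarise(name, values, log):
    """Apply one fixed summary to a data set and record the step in the log."""
    result = {"n": len(values), "mean": mean(values), "median": median(values)}
    log.append(f"summarise({name!r}): n={result['n']}")
    return result

# Hypothetical data sets that all need the same treatment.
datasets = {
    "site_1": [4.1, 5.0, 3.8, 4.6],
    "site_2": [6.2, 5.9, 6.8],
    "site_3": [2.9, 3.3, 3.1, 3.6, 3.0],
}

log = []  # the record of what was done, in order
results = {name: summarise(name, values, log) for name, values in datasets.items()}

for line in log:
    print(line)
```

Adding a fourth data set then needs one extra line of data, not another round of Copy/Paste, and the log shows exactly which steps produced the results.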

3. Using a spreadsheet with SSC-Stat


Excel is commonly used to enter and organise data, for data summary and for graphs or charts.
SSC-Stat is designed to make these tasks easier and more robust.
SSC-Stat is installed in a similar way to any other Excel add-in and then provides an extra menu, as
shown in Fig. 3a. The three main components of this menu are called Manipulation, Visualisation
and Analysis.
These SSC-Stat menus parallel the main menus in a typical statistics package. For comparison,
Fig. 3b shows the main menus of the statistics package SPSS,
where these components are called Transform, Graphs and Analyse. Once spreadsheet users
are familiar with an add-in such as SSC-Stat, they should find it easy to use a statistics package if
one is needed.
Fig. 3a. The SSC-Stat menu in Excel, and an example of data in list format

4
This topic is discussed at greater length in the Instat Introductory Guide: Chapter 4 outlines alternative ways of using
a statistics package, Chapter 10 considers strategies for statistical software in more detail, and Appendix 1 shows
how to write and use a macro to automate or extend an analysis.

Fig. 3b. A view of the SPSS statistics package showing the same data and equivalent menus

3.1. Data layout


Excel's Help recommends that data for analysis be held in list5 format. A list in this context is a
compact rectangle, where each column only contains data of a single type6 and each row has the
data for a single observational unit. There are no blank rows or columns. An example of a list is
shown in Fig. 3a. It is useful to name each column, as shown also in Fig. 3a.
In SSC-Stat, one use of the options in the Manipulation menu is to help users to reorganize data
entered earlier, so that they are in a list. Then the options in the Visualise and Analysis menus are
all designed to make use of the data in list format. Some of the powerful features of Excel for data
analysis, particularly Pivot Tables and Filters, also rely on the use of lists. In addition, if your
analysis needs more than Excel, then data in list format transfers easily into any statistics package.
Data often have structure, and it is essential that good analyses take account of this structure. For
example, in the data shown in Figs 3a and 3b, the main column to be analysed was the yield of rice
(sixth column). Part of the structure was that the farmers used different amounts of fertilizer
(fourth column) and used one of three types of rice (fifth column).
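The definition of a list given above can be turned into a simple check. The sketch below is our own illustration in Python, not a feature of Excel or SSC-Stat. It reports blank rows and columns that mix numbers with text, using invented column names and values.

```python
def check_list_format(header, rows):
    """Return a list of problems that stop the data being a clean 'list'."""
    problems = []
    # A list should have no blank rows.
    for i, row in enumerate(rows, start=1):
        if all(cell is None or cell == "" for cell in row):
            problems.append(f"row {i} is blank")
    # Each column should hold data of a single type.
    for j, name in enumerate(header):
        kinds = {type(row[j]).__name__ for row in rows
                 if row[j] is not None and row[j] != ""}
        # Treat int and float as one numeric type.
        kinds = {"number" if k in ("int", "float") else k for k in kinds}
        if len(kinds) > 1:
            problems.append(f"column {name!r} mixes types: {sorted(kinds)}")
    return problems

# Invented example data with two deliberate faults.
header = ["village", "fertiliser", "variety", "yield"]
rows = [
    ["A", 0, "Old", 35.5],
    ["A", 2, "New", "52 (estimated)"],   # explanatory text typed into a numeric cell
    [None, None, None, None],            # blank row
    ["B", 1, "Old", 40],
]

problems = check_list_format(header, rows)
for problem in problems:
    print(problem)
```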

3.2. Exploratory graphics


Figure 3c uses part of the structure in the data to produce an Excel graph of the mean yield against
fertiliser plotted separately for each of the three types of rice.

5
Look in Excel's Help for the entry called Guidelines for creating a list on a worksheet.
6
Often a column will contain numbers only. Then there should be no explanatory text added in the cell itself. Use
Excel's facility for Comments if you wish to add an explanation. If the column contains text, like 'Yes', 'Maybe', 'No',
that will be used as categories, then make sure they are spelled consistently down the whole column.

Fig. 3c. Graph from data in list format

We describe how to get this graph in the SSC-Stat tutorial. It is a standard Excel graph and could
be constructed in a few extra steps of data manipulation without the SSC-Stat add-in. However,
SSC-Stat makes it very easy to construct this type of graph for the data in list format. For example,
we could also quickly try the same graph for each village instead of each variety. Unless
constructing such exploratory graphs is made easy, we find this step is often omitted from an
analysis. Graphs are an important way both to look at data and to present results in a report.
Fig. 3d. Boxplot of Yield Fig. 3e. Boxplots of Yield by Variety

A boxplot is a commonly used exploratory tool but is not available in Excel. Figures 3d and 3e
show boxplots of the data in Fig. 3c, produced by SSC-Stat, for all the yields together and
separately for each variety.
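The numbers behind a boxplot are easy to compute directly. The sketch below uses Python's standard library and invented yields for three hypothetical varieties. Quartile definitions differ between packages, so the values may not match those drawn by SSC-Stat or any other software exactly.

```python
from statistics import quantiles

def five_number(values):
    """Minimum, lower quartile, median, upper quartile and maximum."""
    # 'inclusive' treats the data as the whole population of interest;
    # other packages may use a different quartile rule.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical yields grouped by variety (invented values and names).
yields_by_variety = {
    "New": [48.0, 52.0, 55.5, 60.0, 61.5],
    "Old": [30.0, 35.0, 38.0, 41.0, 44.0],
    "Traditional": [25.0, 28.5, 29.0, 33.0, 36.0],
}

for variety, values in yields_by_variety.items():
    print(variety, five_number(values))
```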

3.3. Data and results


Statistics packages like SPSS hold the data in a spreadsheet form (as seen in Fig. 3b), but display
results in a separate window. This feature does not exist in Excel, though we recommend that you
save your results on a different sheet to the data. Keeping data and results clearly separate is a key
element of using Excel effectively.

From a results sheet you need to return to the sheet containing the data for the next analysis. This
is so that Excel or SSC-Stat recognizes the data as opposed to the results. In SSC-Stat, an
alternative is to define the data area. This is then remembered each time you use SSC-Stat in the
current session and saves you the trouble of returning to the data sheet.

3.4. Analysis

Fig. 3f. SSC-Stat's menu for analysis

Fig. 3g. The basic dialogue for descriptive statistics

SSC-Stat's menu for analysis is highlighted in Fig. 3f. The first two options give descriptive
summary statistics, while the other options are for simple modelling. The dialogue for the
Descriptive Statistics option is shown in Fig. 3g.
The option in the dialogue to give some of the same statistics shown graphically in Fig. 3d is shown
in Fig. 3h. The results are in Fig. 3i.

Fig. 3h. Additional statistics added Fig. 3i. Results from using the dialogue

3.5. Structured data


Large data sets often include another key element of structure, besides the factor or category
columns mentioned earlier. This is where data are available at different levels of a natural hierarchy.
For example, an educational study might have information at the school level, at the classroom
level, from interviews with teachers and at the pupil level from marks obtained in tests. A survey
might have some information at household level and then further details on each individual within
the household. In such studies the data are often organized in a set of rectangles, with the data
from each level on a different sheet. The analysis is then usually done in stages.
The second option in the Analysis menu, called Summary Statistics, is particularly useful in such
analyses. It is designed to produce summaries that are laid out in list format, and hence can be
used as the new data in a subsequent stage in the analysis.
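The idea of analysing in stages can be sketched as follows. This is a general illustration in Python, not SSC-Stat's implementation, and the schools, classrooms and marks are invented. Each stage returns its summaries in list format, so the output of one stage is the input to the next.

```python
from collections import defaultdict
from statistics import mean

# Pupil-level data in list format: (school, classroom, mark). Invented values.
pupils = [
    ("S1", "C1", 60), ("S1", "C1", 70),
    ("S1", "C2", 50), ("S1", "C2", 54), ("S1", "C2", 58),
    ("S2", "C3", 80), ("S2", "C3", 90),
]

def summarise_level(rows, n_keys):
    """Average the last column within groups defined by the first n_keys columns.

    The result is again a list (one row per group), so it can be fed
    straight into the next stage of the analysis.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[:n_keys]].append(row[-1])
    return [key + (mean(vals),) for key, vals in sorted(groups.items())]

class_means = summarise_level(pupils, 2)        # one row per classroom
school_means = summarise_level(class_means, 1)  # one row per school

print(class_means)
print(school_means)
```

In this sketch the school-level figure is the mean of the classroom means, so each classroom counts equally whatever its size. Whether that is appropriate depends on the objectives of the study.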

3.6. Behind the scenes


Much of the power of SSC-Stat is behind the scenes. For example, almost all the menus offer the
option to either link or copy the data. Linking is a powerful feature of a spreadsheet and means
that the results change automatically if the data are changed. (Excel's own statistics toolkit does not
link. It only produces snapshots of the results.) The facility to link is not available in many statistics
packages. In SPSS for example (Fig. 3b), the data look as if they are in a spreadsheet, but changes
there would not be carried through to changes in the results that were obtained earlier. Similarly,
Instat does not offer the possibility of linking.
When data are in lists, then Excel's Data Filter provides a powerful facility to hide rows of data
on a temporary basis. In SSC-Stat, 'what you see is what you get analysed', as it only processes
the visible data. This makes it simple to analyse user-defined subsets of the data. It is also easy to
temporarily omit rows of data that may include suspect values, to see what effect those values
have on the analysis.
Sometimes data include missing values, and SSC-Stat tries to cope sensibly. Where cells are left
blank, SSC-Stat treats the observation as missing. If you prefer to use a non-numeric symbol for
missing values, like a star (*) or a dot (.), then SSC-Stat will also ignore these values when
processing numeric columns. For text columns, non-blank rows would not be ignored, but you can
always choose to hide these rows before processing the data.
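The behaviour described in this section, processing only the visible rows and skipping blanks or symbols such as * and . in numeric columns, can be mimicked in a few lines. The sketch below is our own Python illustration of the idea, with invented data and function names. It is not SSC-Stat's code.

```python
from statistics import mean

MISSING = {"", "*", "."}   # blank cells and the symbols mentioned in the text

def numeric_values(rows, column, visible=lambda row: True):
    """Collect numeric values from the visible rows, skipping missing cells."""
    values = []
    for row in rows:
        if not visible(row):
            continue                      # row hidden by the filter
        cell = row[column]
        if cell is None or str(cell).strip() in MISSING:
            continue                      # treated as a missing observation
        values.append(float(cell))
    return values

# Invented data: two missing yields, coded in different ways.
rows = [
    {"village": "A", "yield": 40.0},
    {"village": "A", "yield": "*"},
    {"village": "B", "yield": 50.0},
    {"village": "B", "yield": ""},
    {"village": "B", "yield": 60.0},
]

all_yields = numeric_values(rows, "yield")
village_b = numeric_values(rows, "yield", visible=lambda r: r["village"] == "B")

print(len(all_yields), mean(all_yields))
print(len(village_b), mean(village_b))
```

The same filter argument can be used to leave out a suspect value temporarily and see how much the summary changes.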
Some articles have complained that the algorithms used by Excel for statistical calculations are not
sound.7 We explain the implications of this in the SSC-Stat tutorial, but are also confident in the
accuracy of the calculations used in SSC-Stat.

3.7. Help
SSC-Stat includes help at three different levels. There is overall Help, accessed via the General
menu. Then each menu has its own Help, so there is help on Manipulation, Visualization and
Analysis. There is also Help on each individual dialogue. In addition, the good-practice guides
(Section 2) that relate to the use of Excel, are available from within the Help system.

3.8. Moving on
Finally, some users may be disappointed at the lack of more powerful statistical facilities in SSC-
Stat. Where, for example, is the multiple regression, or the powerful analysis of variance? Our view
is that these methods are better handled by a standard statistics package. We have made the
menus in SSC-Stat similar to those in many statistics packages. We hope this will help the learning
process when those who need more than can be done in Excel add a statistics package to their
repertoire. You could be surprised how easy statistics packages have become to use. If you need
more convincing before obtaining one of the standard packages, you could start with Instat, which is
described in the next section.

3.9. Further use of Excel


One of the strengths of Excel is the facility for extending its capabilities by writing VBA programs
(Visual Basic for Applications: Excel's built-in macro language). This is in fact how SSC-Stat was
constructed. Writing simple macros is made easier by the macro recording feature of Excel, which
also provides a useful route to learning about programming in VBA. 8

4. Using a statistics package: Instat


Many non-statisticians, with data to analyse, have never used a statistics package. Such packages are now
as easy to learn as a spreadsheet. This used not to be the case. Formerly, a statistics package
required you to type commands, so in practice you had to learn its language.
Nowadays most statistics packages can be used through menus, in a way that is similar to other
types of software. The commands that used to be a burden have now become a bonus because
they provide a record or log of what you have done.
When using a spreadsheet we recommended in Section 3 that the data be kept in columns, each
with a name. This is obligatory when you use a statistics package. A statistics package is effectively
a column calculator. Thus simple statistics, like a histogram, or a mean, are based on a
single column of data. Other methods, like scatterplots or correlations, use two or more columns.
To clarify this use of columns, we have opened Instat as shown in Fig. 4a, and typed numbers into
the fourth row. Instat immediately assumes that the first three rows must contain missing observations.
Note also that Instat gives the columns default names of X1, X2, and so on. The second
header row allows you to specify a name of your choice. The data start on the next row after that
(labelled 1).

7
See, for example, McCullough and Wilson (1999) On the accuracy of statistical procedures in MS Excel
97, Computational Statistics and Data Analysis 31, 27-37, or http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf. A
more positive view, with which we agree, is in http://www.agresearch.cri.nz/Science/Statistics/exceluse1.htm.
8
Macros are the basis of the SSC's 1-day course Taking Microsoft Excel further: macros for data management and statistics. Notes
and examples used on the course are supplied on the CD.

Fig. 4a. A statistics package is a column calculator

Some users are fearful of learning to use a statistics package, just as they lack confidence in
statistics. One use of Instat is as a starter-package for those who require analyses that exceed
those that are desirable in a spreadsheet. We find that most newcomers are pleasantly surprised
by how easy it now is to use a statistics package.9
Once the step of using a statistics package has been taken, appropriate use of the software can
support data analysis and the teaching of statistics. It can help overcome the blocks that some
users have about the subject. A second use of Instat is to support the teaching of statistics. This was
outlined in Section 2 and is described in detail in the Instat Introductory Guide.10

4.1. Menus
Figure 4b shows the two main windows in Instat, with the same data as in Sections 2 and 3. The
principal menus in Instat and in many other statistics packages are:
Manage: to organize the data for analysis
Graphics: for exploratory and presentation graphs
Statistics: to analyse the data
The menus File, Edit, Window, and Help, will be familiar from other Windows software. The
special menu called Submit is for those who use commands or macros to automate parts of their
analyses. The Climatic menu is optional. It provides special facilities for processing climatic data,
and includes its own user guide and help facilities.
This guide and the Instat Introductory Guide are unusual in that they frequently mention other
statistics packages. Some users may find Instat to be sufficient for their needs, but others will use it
as a stepping-stone to using a more powerful statistics package.
It used to be difficult to mix statistics packages. Not only did you have to learn a new language, but
also it was tedious to transfer data between packages. Now data transfer is easy, and all packages
are similar to use. So, cost apart, it is feasible to use a mix of statistics packages. In Section 5 we
give examples from our training courses.

9 Occasionally we are asked the converse: competent users of a powerful statistics package wonder whether they should spend
time learning to use a spreadsheet to assist in their statistical work. We believe that there is no strong case for this.
10 When Instat is installed, the Instat Introductory Guide is available both as a Windows Help file, and in pdf format for
printing. The pdf version is also available as a separate download from http://www.ssc.rdg.ac.uk/.


Fig. 4b. Instat's Worksheet and Output windows

4.2. Analysis

Fig. 4c. Instat's statistics menu
Fig. 4d. Sub-menu for summary statistics

The main statistics menu is shown in Fig. 4c, together with the sub-menu for summary statistics in
Fig. 4d. As an example, we show the dialogue for a grouped frequency distribution in Fig. 4e, with
some results in Fig. 4f. These show, for instance, that 9 of the 36 farmers (i.e., 25%) did not apply any
fertilizer.
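
The arithmetic behind such a frequency table can be sketched in a few lines of Python. This is an illustration only and not Instat's own method: the class labels, and all counts other than the 9 non-users among 36 farmers, are invented, and the data are assumed to be already grouped into classes.

```python
from collections import Counter

# Hypothetical fertilizer classes for 36 farmers. Only the total (36) and the
# number of non-users (9) come from the text; the other counts are invented.
fertilizer = ["none"] * 9 + ["low"] * 12 + ["medium"] * 10 + ["high"] * 5


def frequency_table(values):
    """Return (category, count, percent) rows in order of first appearance."""
    counts = Counter(values)
    total = len(values)
    return [(cat, n, 100.0 * n / total) for cat, n in counts.items()]


table = frequency_table(fertilizer)
for category, count, percent in table:
    # One row per class: label, frequency, percentage of all farmers
    print(f"{category:<8}{count:>4}{percent:>8.1f}%")
```

The first row reproduces the figure quoted above: 9 of 36 farmers is 25%.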


Fig. 4e. Instat's dialogue for frequencies
Fig. 4f. Results

One way of looking at multiple columns together is to provide tables of summary statistics. Tables
are a powerful feature of Instat, just as Pivot tables are a strength of Excel. Examples were given in
Section 2, Figs 2f and 2g, and are described in more detail in Chapter 13 of the Instat Introductory
Guide.
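
As a rough illustration of what a table of summary statistics contains, the following Python sketch groups a numeric column by a factor and reports the count, mean, minimum and maximum for each group. The group labels and values are hypothetical and do not come from the survey data; Instat and Excel produce such tables through menus rather than code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records of (group, value); labels and values are invented.
records = [("A", 4.0), ("A", 6.0), ("B", 3.0), ("B", 5.0), ("B", 7.0)]


def summary_table(rows):
    """Group values by their label and summarise each group."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return {
        label: {"n": len(vals), "mean": mean(vals), "min": min(vals), "max": max(vals)}
        for label, vals in groups.items()
    }


summary = summary_table(records)
for label, stats in summary.items():
    print(label, stats)
```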
The main menus in Instat for statistical modelling are the second group in Fig. 4g, namely Simple
Models, Analysis of Variance and Regression. We show the sub-menu for regression in Fig. 4g,
which includes options for both simple and multiple regressions. The use of these menus is
described in Chapters 15 to 17 of the Introductory Guide.
Fig. 4g. Instat's Regression sub-menu

4.3. Help
The Help supplied with Instat is extensive. The series of about 20 good-practice guides mentioned
in Section 2 is provided with Instat, in both printable and Windows help file formats.
The Introductory Guide, part of which is shown in Fig. 4h (overleaf), is also supplied, both as a
Windows help file and in printable format. It includes chapters that examine, in more detail, the
teaching ideas mentioned in this Guide.


Fig. 4h. The Instat Introductory Guide is part of the Help

4.4. Moving on
Most statistics packages are designed primarily to support data analysis, but can also be used in
training courses. Instat is designed the other way round. It is intended largely to support training, but
can also be used for data analysis.
Most chapters in the Introductory Guide have an "In Conclusion" section, where we also mention the
limitations of Instat, for those users who require more. Just as it is easy to use Excel together with a
statistics package, so it is simple to integrate the use of multiple statistics packages, both within
training courses and for data analysis.
Within training courses it is indeed valuable that students become familiar with more than one
package for their statistical work. This provides them with the confidence to move to other
packages in the future, should the need arise.

5. Training courses in statistics


In this section we describe some of the training scenarios from which ideas and resources
described in the previous sections have evolved. It illustrates a range of courses in which we have
used these resources, both in the UK and abroad.

5.1. In-service training


The SSC advertises and runs a series of short training courses for both statisticians and
non-statisticians outside The University of Reading. These generally last between 1 and 3 days.
The most popular is a 2-day review of basic statistical concepts. The course moves quickly from
descriptive statistics to cover the basic ideas of statistical inference, for both normal and binomial
probability models. It is a practical course in which participants are invited to use one or more
statistics packages of their choice. This can include Excel (with SSC-Stat) and Instat as well as
other well-known packages. The main aim is to teach statistical ideas rather than computing.
Some participants therefore stick to the package with which they are familiar, while others use the
opportunity to explore different software, either for its own sake, or because it is the simplest for a
particular practical.

5.2. Tailored training


The SSC runs commissioned short courses for clients outside the University. The contents are
tailored to the needs of the client, and examples normally use data that they have supplied. With
overseas training, we usually develop a training strategy and course materials jointly with a local
partner, so we remain involved only as long as our inputs are needed.
An example is a sequence of training courses for the Central Bureau of Statistics (CBS) in Kenya,
developed with the Biometry Unit Consultancy Service (BUCS), our partner organisation within the
University of Nairobi. The first was a 7-day course on the key statistical concepts, designed largely
by CBS to ensure that all staff had a common foundation that could be assumed in any further
courses. The course covered descriptive methods using SSC-Stat, followed by the ideas of
statistical inference, using Instat. It also used data from one of their own poverty surveys involving
about 10,000 households. The course was run six times between 2003 and 2004, for staff from the
CBS headquarters, for the district statisticians and for statisticians working in associated ministries.
A follow-up was a 2-week course on the analysis of survey data. This required more advanced
facilities than are available with SSC-Stat or Instat. The CBS staff decided on Stata as the software
to use, and so there was a 3-day introductory course on Stata prior to the analysis course.

5.3. Undergraduate training


Undergraduates taking agriculture at the University of Nairobi have a statistics course in each of
years 2, 3 and 4. The Year-2 course now teaches only descriptive statistics. It uses a range of real
case studies, so students practise their skills in data organisation, summary and the presentation
of results. Excel (with SSC-Stat), Instat and some of the statistical games described in Section 2
are used.
The concepts of statistical inference and modelling are taught in Year 3, and the whole process,
from design to presentation, is covered in the Year-4 course.

5.4. Postgraduate training


"Statistics made Simple" was a 4-day course, given before the start of term, designed to level the
playing field for incoming postgraduate students. It was intended for those who were previously
taught statistics in ways that emphasized the formulae, rather than the ideas, and for students who
were not taught statistics together with computers.
Descriptive methods were introduced using Excel with SSC-Stat, and Instat was used to cover the
basic ideas of inference and some simple regression modelling. There was extensive practical
work. Apart from teaching the key concepts, the course aimed to show how the teaching of
statistics changes once computers with the appropriate software are used. We also showed
how easy it is to use a mix of software for statistical work.
A different, shorter, version of this course has also been given to incoming PhD students doing
biological work, and Genstat was used, because it would be the main statistical package for their
research.
Life Sciences MSc and research students at The University of Reading followed a statistics course,
taught by the School of Applied Statistics, as part of their training. Our objectives included preparing
the students for their subsequent research project, or thesis research. It was a 20-week course,
with 2 hours of lectures and a 2-hour practical session each week, and we included sessions on
planning and data management within the course. Two of the games mentioned in Section 2 were
also used as part of the assessed practical work. On software, we encouraged students to use a
mix of relevant software including Excel (with SSC-Stat) and Instat as well as software readily
available within their own departments.

6. In conclusion
The target audience for this resource pack is the large number of professionals who need to
analyse data, but lack confidence in their statistical ability. If statistics can be made more relevant
and accessible, it could help research and development projects to collect and process data more
effectively. This, in turn, will help the quality assurance and the process of evidence-based decision
making.
The SSC's experiences, as providers of statistical training and consultancy, have led us to believe
that training in statistics needs to be broadened. More emphasis needs to be placed on statistical
concepts and practice, and less on theory and formulae. We also need to be more imaginative in
teaching study design and data management skills.
This resource pack is a collection of resources that we have developed and used in our own work,
and we offer them here for others to use. They include:
(i) An Excel add-in
(ii) A user-friendly software package developed with teaching in mind
(iii) Several statistical games which simulate different scenarios to help understand key
statistical concepts
(iv) A range of short good practice booklets, many of which discuss statistical concepts
from planning through to presentation of results.
Improved confidence with statistics can be acquired relatively quickly and easily, either via a short
course, provided it is the right type of short course, or by some self-study. We therefore hope that
the materials will be of interest and use to both trainers of statistics and anyone attempting to carry
out their own data analysis. We encourage trainers to adapt our ideas and to incorporate some of
the materials into their own work. Using their own datasets and software are two obvious examples
of adapting our resource pack. The SSC would welcome feedback from trainers on their
imaginative use of the resource pack.
