
Data Quality Audit Tool

GUIDELINES FOR IMPLEMENTATION

MS-08-29 September 2008


This tool was made possible by support from the U.S. Agency for International Development (USAID)
under terms of Cooperative Agreement GPO-A-00-03-00003-00.

Additional financial support was provided by the President’s Emergency Plan for AIDS Relief and the
Global Fund to Fight AIDS, TB and Malaria.

The author’s views expressed in this publication do not necessarily reflect the views of USAID or the
United States Government. This publication can be accessed online at the MEASURE Evaluation Web
site: http://www.cpc.unc.edu/measure.
Acknowledgements

This tool was developed with input from a number of individuals representing various organizations.
Those most directly involved in development of the tool include Ronald Tran Ba Huy of The Global
Fund to Fight AIDS, Tuberculosis and Malaria and Karen Hardee, J. Win Brown, Ron Stouffer,
Sonja Schmidt, Yoko Shimada, David Boone, and Philip Setel of the MEASURE Evaluation
Project. Einar Heldal, TB Consultant and Charlotte Kristiansson of the Swiss Tropical Institute
also contributed to the development of the tool. Others who were instrumental in its development
include: Bernhard Schwartländer, Bernard Nahlen, Daniel Low-Beer, Linden Morrison, John
Cutler, Itamar Katz, Gulshod Allabergenova, Marton Sziraczki, and George Shakarishvili from
The Global Fund to Fight AIDS, TB and Malaria; Kathy Marconi, Michelle Sherlock, and Annie
La Tour from the Office of the Global AIDS Coordinator. Others who provided technical input
and review included: Malgosia Grzemska, Christian Gunneberg, Pierre-Yves Norval, Catherine
Bilger, Robert Makombe, Yves Souteyrand, Tisha Mitsunaga, Cyril Pervilhac, Chika Hayashi,
Abdikamal Alisalad, Evelyn Isaacs, Thuy Nguyen Thi Thanh, Spes C. Ntabangana, Andrea Godfrey,
and Mehran Hosseini of the World Health Organization (WHO); Bilali Camara of PAHO/WHO,
Deborah Rugg and Saba Moussavi of UNAIDS, Bob Pond of Health Metrics Network (HMN),
Pepukai Chikudwa of the International HIV/AIDS Alliance, Arnaud Trebucq of the International
Union Against Tuberculosis and Lung Disease, Rene L’Herminez of KNCV Tuberculosis
Foundation, Rick Steketee of PATH, Verne Kemerer of MEASURE Evaluation, Abdallah Bchir
and Anshu Banerjee of the Global Alliance for Vaccines and Immunization (GAVI); John Novak
from USAID; Scott McGill and Gloria Sanigwa from Family Health International (FHI); Matthew
Lynch from Johns Hopkins University, and Lee Yerkes from the Elizabeth Glaser Pediatrics AIDS
Foundation. In addition, the tool greatly benefited from the participation of a number of individuals
during pilot tests in Tanzania, Rwanda, Vietnam, and Madagascar.



Table of Contents

Acknowledgements
Introduction
   A. Background
   B. Objectives
   C. Conceptual Framework
   D. Methodology
   E. Selection of Sites
   F. Outputs
   G. Ethical Considerations
   H. Implementation
Phase 1. Preparation and Initiation
   Step 1. Select Country, Program/Project(s), Indicator(s), and Reporting Period
   Step 2. Notify Program, Request Documentation and Obtain National Authorizations
   Step 3. Select Sites to be Audited
   Step 4. Prepare for On-Site Audit Visits
   Step 5. Review Documentation
Phase 2. M&E Unit
   Step 6. Assessment of Data Management Systems (at the M&E Unit)
   Step 7. Trace and Verify Results from Intermediate Aggregation Levels (at the M&E Unit)
Phase 3. Intermediate Aggregation Level(s)
   Step 8. Assessment of Data Management Systems (at the Intermediate Aggregation Levels)
   Step 9. Trace and Verify Results from Site Reports (at the Intermediate Aggregation Levels)
Phase 4. Service Delivery Sites
   Step 10. Assessment of Data Collection and Reporting System (at the Service Delivery Points)
   Step 11. Trace and Verify Results from Source Documents (at the Service Delivery Points)
Phase 5. M&E Unit
   Step 12. Consolidate Assessment of Data Management Systems
   Step 13. Draft Preliminary Finding and Recommendation Notes
   Step 14. Conduct a Closeout Meeting
Phase 6. Completion
   Step 15. Draft Audit Report
   Step 16. Review and Collect Feedback from Country and Organization Commissioning the DQA
   Step 17. Finalize Audit Report
   Step 18. Initiate Follow-Up of Recommended Actions
Annex 1. DQA Protocols
Annex 2. Templates for the Organization Commissioning the DQA
Annex 3. Templates for the Audit Agency and Team
Annex 4. Site Selection using Cluster Sampling Techniques
Annex 5. Calculation of the Verification Factor



Introduction

A. Background

National programs and donor-funded projects are working towards achieving ambitious goals
related to the fight against diseases such as Acquired Immunodeficiency Syndrome (AIDS),
Tuberculosis (TB), and Malaria. Measuring the success and improving the management of these
initiatives is predicated on strong monitoring and evaluation (M&E) systems that produce quality
data related to program implementation.

In the spirit of the “Three Ones,” the “Stop TB Strategy,” and the “RBM Global Strategic Plan,”
a number of multilateral and bilateral organizations have collaborated to jointly develop a Data
Quality Assessment (DQA) Tool. The objective of this harmonized initiative is to provide a
common approach for assessing and improving overall data quality. A single tool helps to ensure
that standards are harmonized and allows for joint implementation between partners and with
National Programs.

The DQA Tool focuses exclusively on (1) verifying the quality of reported data, and (2) assessing
the underlying data management and reporting systems for standard program-level output
indicators. The DQA Tool is not intended to assess the entire M&E system of a country’s response
to HIV/AIDS, Tuberculosis, or Malaria. In the context of HIV/AIDS, the DQA Tool relates to
component 10 (i.e., supportive supervision and data auditing) of the “Organizing Framework for a
Functional National HIV M&E System.”1

[Figure: Organizing Framework for a Functional National HIV M&E System – 12 Components.]

Two versions of the DQA Tool have been developed: (1) the
“Data Quality Audit Tool” which provides guidelines to be
used by an external audit team to assess a program/project’s
ability to report quality data; and (2) the “Routine Data
Quality Assessment Tool” (RDQA) which is a simplified
version of the DQA Tool for auditing that allows programs
and projects to assess the quality of their data and strengthen
their data management and reporting systems.

1 UNAIDS (2008). Organizing Framework for a Functional National HIV Monitoring and Evaluation System.
Geneva: UNAIDS.



The objectives of the DQA Tool for auditing are to:

• Verify the quality of reported data for key indicators at selected sites; and
• Assess the ability of data management systems to collect and report
quality data.

In addition, for the programs/projects being audited, the findings of the DQA can also be very
useful for strengthening their data management and reporting systems.

B. Objectives

The DQA Tool for auditing provides processes, protocols, and templates addressing how to:
• Determine the scope of the data quality audit. The DQA Tool begins with suggested
criteria for selecting the country, program/project(s), and indicators to be reviewed. In
most cases, the Organization Commissioning the DQA will select these parameters.
• Engage the program/project(s) and prepare for the audit mission. The DQA Tool
includes template letters for notifying the program/project of the data quality audit (and
for obtaining relevant authorizations), as well as guidelines for preparing the country
mission.
• Assess the design and implementation of the program/project’s data management
and reporting systems. The DQA Tool provides steps and a protocol to identify potential
risks to data quality created by the program/project’s data management and reporting
system.
• Trace and verify (recount) selected indicator results. The DQA Tool provides
protocol(s) with special instructions, based on the indicator and type of Service Delivery
Site (e.g. health facility or community-based). These protocols will direct the Audit Team
as it verifies data for the selected indicator from source documents and compares the
results to the program/project(s) reported results.
• Develop and present the Audit Team’s findings and recommendations. The
DQA Tool provides instructions on how and when to present the DQA findings and
recommendations to program/project officials and how to plan for follow-up activities to
ensure that agreed-upon steps to improve systems and data quality are completed.

Note: While the Data Quality Audit Tool is not designed to assess the quality of services provided,
its use could facilitate improvements in service quality as a result of the availability of better
quality data related to program performance.



C. Conceptual Framework

The conceptual framework for the DQA and RDQA is illustrated in Figure 1 (below). Generally,
the quality of reported data is dependent on the underlying data management and reporting systems;
stronger systems should produce better quality data. In other words, for good quality data to be
produced by and flow through a data management system, key functional components need to be
in place at all levels of the system — the points of service delivery, the intermediate level(s) where
the data are aggregated (e.g. districts, regions), and the M&E unit at the highest level to which data
are reported. The DQA and RDQA tools are therefore designed to:

(1) verify the quality of the data;
(2) assess the system that produces that data; and
(3) develop action plans to improve both.

Introduction – Figure 1. Conceptual Framework for the (R)DQA: Data Management and
Reporting Systems, Functional Areas, and Data Quality.

D. Methodology

The DQA and RDQA are grounded in the components of data quality, namely, that programs and projects
need accurate, reliable, precise, complete and timely data reports that managers can use to effectively
direct available resources and to evaluate progress toward established goals (see Introduction – Table 1
below). Furthermore, the data must have integrity to be considered credible and should be produced
according to standards of confidentiality.



Introduction – Table 1. Data Quality Dimensions

Accuracy: Also known as validity. Accurate data are considered correct: the data measure what they are
intended to measure. Accurate data minimize errors (e.g., recording or interviewer bias, transcription
error, sampling error) to a point of being negligible.

Reliability: The data generated by a program’s information system are based on protocols and procedures
that do not change according to who is using them and when or how often they are used. The data are
reliable because they are measured and collected consistently.

Precision: This means that the data have sufficient detail. For example, an indicator requires the number
of individuals who received HIV counseling & testing and received their test results, by sex of the
individual. An information system lacks precision if it is not designed to record the sex of the individual
who received counseling and testing.

Completeness: Completeness means that an information system from which the results are derived is
appropriately inclusive: it represents the complete list of eligible persons or units and not just a fraction
of the list.

Timeliness: Data are timely when they are up-to-date (current), and when the information is available on
time. Timeliness is affected by: (1) the rate at which the program’s information system is updated; (2) the
rate of change of actual program activities; and (3) when the information is actually used or required.

Integrity: Data have integrity when the system used to generate them is protected from deliberate bias or
manipulation for political or personal reasons.

Confidentiality: Confidentiality means that clients are assured that their data will be maintained according
to national and/or international standards for data. This means that personal data are not disclosed
inappropriately, and that data in hard copy and electronic form are treated with appropriate levels of
security (e.g., kept in locked cabinets and in password-protected files).

Based on these dimensions of data quality, the DQA Tool comprises two components: (1)
assessment of data management and reporting systems; and (2) verification of reported data for
key indicators at selected sites.

Accordingly, the implementation of the DQA is supported by two protocols (see ANNEX 1):
Protocol 1: System Assessment Protocol;
Protocol 2: Data Verification Protocol.

These protocols are administered at each level of the data-collection and reporting system
(i.e., program/project M&E Unit, Service Delivery Sites and, as appropriate, any Intermediate
Aggregation Level – Regions or Districts).



Protocol 1 - Assessment of Data Management and Reporting Systems:
The purpose of Protocol 1 is to identify potential challenges to data quality created by the data
management and reporting systems at three levels: (1) the program/project M&E Unit, (2) the
Service Delivery Sites, and (3) any Intermediate Aggregation Level (at which reports from Service
Delivery Sites are aggregated prior to being sent to the program/project M&E Unit, or other relevant
level).

The assessment of the data management and reporting systems will take place in two stages:
1. Off-site desk review of documentation provided by the program/project;
2. On-site follow-up assessments at the program/project M&E Unit and at selected Service
Delivery Sites and Intermediate Aggregation Levels (e.g., Districts, Regions).

The assessment will cover five functional areas, as shown in Introduction – Table 2.

Introduction – Table 2. Systems Assessment Questions by Functional Area

I. M&E Structures, Functions and Capabilities
   1. Are key M&E and data-management staff identified with clearly assigned responsibilities?
   2. Have the majority of key M&E and data-management staff received the required training?

II. Indicator Definitions and Reporting Guidelines
   3. Are there operational indicator definitions meeting relevant standards that are systematically
      followed by all service points?
   4. Has the program/project clearly documented (in writing) what is reported to whom, and how and
      when reporting is required?

III. Data Collection and Reporting Forms and Tools
   5. Are there standard data-collection and reporting forms that are systematically used?
   6. Are data recorded with sufficient precision/detail to measure relevant indicators?
   7. Are data maintained in accordance with international or national confidentiality guidelines?
   8. Are source documents kept and made available in accordance with a written policy?

IV. Data Management Processes
   9. Does clear documentation of collection, aggregation and manipulation steps exist?
   10. Are data quality challenges identified and are mechanisms in place for addressing them?
   11. Are there clearly defined and followed procedures to identify and reconcile discrepancies in reports?
   12. Are there clearly defined and followed procedures to periodically verify source data?

V. Links with National Reporting System
   13. Does the data collection and reporting system of the program/project link to the National
       Reporting System?



The outcome of this assessment will be a set of identified strengths and weaknesses for each functional
area of the data management and reporting system.
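
To make the idea concrete, the sketch below (in Python, purely illustrative and not part of the official
Excel-based protocols) shows one way the thirteen summary questions could be recorded and tallied by
functional area. The question numbering follows Introduction – Table 2; the three-level answer scale and
the scoring weights are assumptions made only for this illustration.

    # Illustrative sketch only: tallying answers to the 13 summary questions
    # from Introduction - Table 2 by functional area. The Yes/Partly/No scale
    # and the 1.0/0.5/0.0 weights are assumptions, not the official DQA scoring.
    FUNCTIONAL_AREAS = {
        "I. M&E Structures, Functions and Capabilities": [1, 2],
        "II. Indicator Definitions and Reporting Guidelines": [3, 4],
        "III. Data Collection and Reporting Forms and Tools": [5, 6, 7, 8],
        "IV. Data Management Processes": [9, 10, 11, 12],
        "V. Links with National Reporting System": [13],
    }
    WEIGHTS = {"yes": 1.0, "partly": 0.5, "no": 0.0}

    def area_scores(answers):
        """answers: dict mapping question number (1-13) to 'yes'/'partly'/'no'."""
        scores = {}
        for area, questions in FUNCTIONAL_AREAS.items():
            total = sum(WEIGHTS[answers[q]] for q in questions)
            scores[area] = total / len(questions)  # 0.0 (weak) to 1.0 (strong)
        return scores

    # Example: hypothetical answers recorded at one M&E Unit
    example = {1: "yes", 2: "partly", 3: "yes", 4: "no", 5: "yes", 6: "yes",
               7: "partly", 8: "no", 9: "partly", 10: "no", 11: "no",
               12: "no", 13: "yes"}
    for area, score in area_scores(example).items():
        print(f"{area}: {score:.2f}")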

Introduction – Figure 2. Assessment of Data Management System (Illustration).

Protocol 2 - Verification of Reported Data for Key Indicators:


The purpose of Protocol 2 is to assess, on a limited scale, if service delivery and intermediate
aggregation sites are collecting and reporting data to measure the audited indicator(s) accurately
and on time — and to cross-check the reported results with other data sources. To do this, the
DQA will determine if a sample of Service Delivery Sites have accurately recorded the activity
related to the selected indicator(s) on source documents. It will then trace that data to see if it has
been correctly aggregated and/or otherwise manipulated as it is submitted from the initial Service
Delivery Sites through intermediary levels to the program/project M&E Unit.

The data verification exercise will take place in two stages:

1. In-depth verifications at the Service Delivery Sites; and
2. Follow-up verifications at the Intermediate Aggregation Levels and at the program/project M&E Unit.



Introduction – Figure 3. Tracing and Verifying Report Totals from the Service Delivery Site
Through Intermediate Reporting Levels to the Program/Project M&E Unit.

The first stage of the data-verification occurs at the Service Delivery Sites. There are five types of
standard data-verification steps that can be performed at this level (Introduction – Table 3):

Introduction – Table 3. Service Delivery Site: Five Types of Data Verifications

1. Description (required in all cases): Describe the connection between the delivery of services and/or
   commodities and the completion of the source document to record that delivery.
2. Documentation Review (required in all cases): Review availability and completeness of all indicator
   source documents for the selected reporting period.
3. Trace and Verification (required in all cases): Trace and verify reported numbers: (1) recount the
   reported numbers from available source documents; (2) compare the verified numbers to the site
   reported number; (3) identify reasons for any differences.
4. Cross-checks (required in all cases): Perform “cross-checks” of the verified report totals with other
   data sources (e.g., inventory records, laboratory reports, registers, etc.).
5. Spot-checks (if feasible): Perform “spot-checks” to verify the actual delivery of services and/or
   commodities to the target populations.



Because there are significant differences between certain types of indicators and sites—e.g.,
facility-based (clinics) and community-based sites—the DQA includes indicator-specific
protocols to perform these standard data-verification steps (e.g., Antiretroviral Therapy [ART]
Protocol; Voluntary Counseling and Testing [VCT] Protocol; TB Treatment Outcome Protocol(s);
Insecticide-Treated Nets [ITN] Protocol; etc.). These indicator-specific protocols are based on
generic protocols that have been developed for facility-based data sources and community-based
data sources. The Service Delivery Site Worksheets from these generic data-verification protocols
are shown in ANNEX 1.

The second stage of the data-verification occurs at the Intermediate Aggregation Levels (e.g.,
Districts, Regions) and at the program/project M&E Unit. As illustrated in Introduction – Figure
3, the DQA evaluates the ability at the intermediate level to accurately aggregate or otherwise
process data submitted by Service Delivery Sites, and report these data to the next level in a timely
fashion. Likewise, the program/project M&E Unit must accurately aggregate data reported by
intermediate levels and publish and disseminate National Program results to satisfy the information
needs of stakeholders (e.g. donors).

The following verifications (Introduction – Table 4) will therefore be performed at Intermediate
Aggregation Levels. Similar verifications are performed at the M&E Unit.

Introduction – Table 4. Intermediate Aggregation Levels: Two Types of Data Verifications

1. Documentation Review (required in all cases): Review availability, timeliness, and completeness of
   expected reports from Service Delivery Sites for the selected reporting period.
2. Trace and Verification (required in all cases): Trace and verify reported numbers: (1) re-aggregate the
   numbers submitted by the Service Delivery Sites; (2) compare the verified counts to the numbers
   submitted to the next level (program/project M&E Unit); (3) identify reasons for any differences.

The outcome of these verifications will be statistics on the accuracy, availability, completeness,
and timeliness of reported data.
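
As an illustration of the kind of statistics these verifications yield, the short Python sketch below (not
part of the official tool, which generates these figures automatically in its Excel files) computes an
accuracy ratio (recounted versus reported counts) and availability, completeness, and timeliness
percentages from hypothetical site reports; all field names and example values are invented.

    # Illustrative sketch only: summary statistics from hypothetical data
    # verifications. Field names and values are invented for the example;
    # the official DQA computes these figures in its Excel-based protocols.
    site_reports = [
        # one record per Service Delivery Site for the audited reporting period
        {"site": "Clinic A", "reported": 120, "recounted": 115,
         "report_received": True, "on_time": True, "complete": True},
        {"site": "Clinic B", "reported": 80, "recounted": 88,
         "report_received": True, "on_time": False, "complete": True},
        {"site": "Clinic C", "reported": 45, "recounted": 0,
         "report_received": False, "on_time": False, "complete": False},
    ]

    expected = len(site_reports)
    received = [r for r in site_reports if r["report_received"]]

    # Accuracy: ratio of recounted to reported totals for sites that reported
    accuracy = (sum(r["recounted"] for r in received)
                / sum(r["reported"] for r in received))
    availability = len(received) / expected
    timeliness = sum(r["on_time"] for r in received) / expected
    completeness = sum(r["complete"] for r in received) / expected

    print(f"Accuracy (recounted/reported): {accuracy:.2f}")
    print(f"Reports available: {availability:.0%}")
    print(f"Reports on time:   {timeliness:.0%}")
    print(f"Reports complete:  {completeness:.0%}")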



Introduction – Figure 4. Statistics on Data Quality (Illustration).

E. Selection of Sites

There are four methods for selecting sites for the Data Quality Audit:

1. Purposive selection: The sites to be visited are purposely selected, for example based on their
size, their geographical proximity or concerns regarding the quality of their reported data.
In this case, there is no need for a sampling plan. However, the data quality audit findings
produced from such a “purposive” or targeted sample cannot be used to make inferences or
generalizations about all the sites, or a group of sites, in that country.
2. Restricted site design: Only one site is selected for the DQA. The benefit of this approach
is that the team can maximize its efforts in one site and have a high degree of control over
implementation of the audit protocols and knowledge of the site-specific systems from
which the results are derived. This approach is ideal for measuring the change in data quality
attributable to an intervention (e.g. data management training). In this approach, the data
quality audit is implemented in a selected site; the intervention is conducted, and is followed
by another data quality audit in the same site. Any change in the quality of data could therefore
be most likely a result of the intervention.

Data Quality Audit Tool 15


3. Stratified random sampling: This involves the drawing of a stratified random sample of a
sub-national group of sites where a particular variable of interest is chosen as the basis of
the sites to be visited. Examples of such variables include rural sites, extremely large sites,
sites run by a certain type of organization (e.g., nongovernmental organizations [NGOs]) or
sites operating in a specific region or district of a country. Such stratified random sampling
allows the audit team to make inferences from the sample audit findings to all the sites that
belong to the stratification variable of interest (like all the rural sites, all the very large sites,
all NGOs, etc.)
4. Random sampling: It is often desirable to make judgments about data quality for an entire
program or country. However, in most countries, it would be far too costly and time
consuming to audit all the sites reporting to a program. Furthermore, it can be inaccurate
and misleading to draw conclusions for all implementing sites based on the experiences of a
few. Random sampling techniques allow us to select a relatively small number of sites from
which conclusions can be drawn which are generalizable to all the sites in a program/project.
Such sampling relies on statistical properties (e.g., size of the sample, the variability of the
parameter being measured) which must be considered when deciding which DQA approach
to use. Sometimes, the minimally acceptable number of sites (in terms of statistical validity)
dictated by the sampling methodology is still too many sites to realistically pursue in terms
of cost and available staff. Compromising the methodology by including fewer sites than
indicated, or replacing one site for another based on convenience, can yield erroneous or
biased estimates of data quality. However, given the appropriate resources, random sampling
offers the most powerful method for drawing inferences about data quality for an entire
program or country. This method involves the random selection of a number of sites that
together are representative of all the sites where activities supporting the indicator(s) under
study are being implemented. Representative means that the selected sites are similar to the
entire population of sites in terms of attributes that can affect data quality (e.g., size, volume
of service, and location). The purpose of this approach is to produce quantitative estimates
of data quality that can be viewed as indicative of the quality of data in the whole program/
project, and not simply the selected sites.

The number of sites selected for a given DQA will depend on the resources available to conduct
the audit and the level of precision desired for the national level estimate of the Verification Factor.
A more precise estimate requires a larger sample of sites. The Audit Teams should work with the
Organization Commissioning the DQA to determine the right number of sites for a given program
and indicator.
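
As a rough illustration of this precision/sample-size trade-off (the authoritative site-selection
methodology is described in ANNEX 4), the sketch below uses the standard normal-approximation
formula for estimating a proportion-like quantity such as a Verification Factor near 1.0; the confidence
level, expected value, and margins of error are arbitrary example inputs, not DQA requirements.

    # Illustrative only: generic sample-size approximation for estimating a
    # proportion-like quantity (e.g., a Verification Factor close to 1.0).
    # ANNEX 4 describes the actual cluster-sampling methodology of the DQA.
    import math

    def sample_size(expected_p=0.9, margin_of_error=0.05, z=1.96):
        """n = z^2 * p * (1 - p) / e^2, rounded up."""
        return math.ceil(z**2 * expected_p * (1 - expected_p) / margin_of_error**2)

    print(sample_size())                       # +/- 5 points at 95% confidence: about 139 sites
    print(sample_size(margin_of_error=0.10))   # relaxing precision needs far fewer sites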

F. Outputs

In conducting the DQA, the Audit Team will collect and document: (1) evidence related to the
review of the program/project’s data management and reporting system; and (2) evidence related
to data verification. The documentation will include:
• Completed protocols and templates included in the DQA Tool.
• Write-ups of observations, interviews, and conversations with key data quality officials
at the M&E Unit, at intermediary reporting locations, and at Service Delivery Sites.

• Preliminary findings and draft Recommendation Notes based on evidence collected in
the protocols;
• Final Audit Report. The Final Audit Report will summarize the evidence the Audit
Team collected, identify specific audit findings or gaps related to that evidence, and
include recommendations to improve data quality. The report will also include the
following summary statistics that are calculated from the system assessment and data
verification protocols:
1. Strength of the Data Management and Reporting System based on a review of the
program/project’s data collection and reporting system, including responses to questions
on how well the system is designed and implemented;
2. Accuracy of Reported Data through the calculation of Verification Factors2 generated
from the trace and verify recounting exercise performed at each level of the reporting
system (i.e., the ratio of the recounted value of the indicator to the reported value); and
3. Availability, Completeness and Timeliness of Reports through percentages calculated
at the Intermediate Aggregation Level(s) and the M&E Unit.

These summary statistics, which are automatically generated in the Excel files, are developed
from the system assessment and data verification protocols included in this tool.

• All follow-up communication with the program/project and the Organization Commissioning the
DQA related to the results and recommendations of the Data Quality Audit.

G. Ethical Considerations

The data quality audits must be conducted with the utmost adherence to the ethical standards of the
country and, as appropriate, of the Organization Commissioning the DQA. While the audit teams
may require access to personal information (e.g., medical records) for the purposes of recounting
and cross-checking reported results, under no circumstances will any personal information be
disclosed in relation to the conduct of the audit or the reporting of findings and recommendations.
The Audit Team should neither photocopy nor remove documents from sites.

In addition, the auditor shall not accept or solicit directly or indirectly anything of economic value
as a gift, gratuity, favor, entertainment or loan that is or may appear to be designed to in any manner
influence official conduct, particularly from one who has interests that might be substantially
affected by the performance or nonperformance of the auditor’s duty. This provision does not
prohibit the acceptance of food and refreshments of insignificant value on infrequent occasions in
the ordinary course of a meeting, conference, or other occasion where the auditor is properly in
attendance, nor the acceptance of unsolicited promotional material such as pens, calendars, and/or
other items of nominal intrinsic value.

2 Please refer to ANNEX 5 for a description of the methodology for calculating the Composite Verification Factor.



H. Implementation

The Data Quality Audit will be implemented chronologically in 18 steps conducted in six phases,
as shown in Introduction – Figure 5.

Introduction – Figure 5. Data Quality Audit Phases and Steps.

PHASE 1 – Steps 1-5 are performed at the Organization Commissioning the DQA and at
the Audit Team’s Office.
• The Organization Commissioning the DQA determines the country and program/
project(s) to be audited. The Audit Team and/or the Organization Commissioning the
DQA then select(s) the corresponding indicators and reporting period (Step 1).
• The Organization Commissioning the DQA is responsible for obtaining national
authorization to conduct the audit, as appropriate, and for formally notifying the program/
project of the DQA. The Audit Team follows up with a request for documentation for its
review prior to visiting the program/project, including information from which to draw
the sample of sites (Step 2).

• The Audit Team, in collaboration with the Organization Commissioning the DQA,
identifies the number and locations of the Service Delivery Sites and related Intermediate
Aggregation Levels (i.e., districts or regions) at which targeted system assessment and
data verification will be conducted (Step 3).
• The Audit Team prepares for on-site visits, including establishing the timing of the visits,
constituting the Audit Team, and addressing the requisite logistical issues (Step 4).
• The Audit Team conducts a desk review of the documentation provided by the program/
project (Step 5).

PHASE 2 – Steps 6-7 are performed at the program/project’s M&E Unit.


• The Audit Team assesses the data management and reporting system at the level of the
M&E Unit (Step 6). This assessment is designed to identify potential challenges to data
quality created by the program/project’s data management and reporting system.
• The Audit Team begins to trace and verify data for the selected indicator(s) by reviewing
the reports for the selected reporting period submitted by lower reporting levels (such as district
or regional offices) (Step 7).

PHASE 3 – Steps 8-9 are conducted at the Intermediate Aggregation Levels (such as district
or regional offices), if the program/project data management system has such levels.
• The Audit Team assesses the data management and reporting system by determining how
data from sub-reporting levels (e.g., Service Delivery Sites) are aggregated and reported
to the program/project M&E Unit (Step 8).
• The Audit Team continues to trace and verify the numbers reported from the Service
Delivery Sites to the intermediate level (Step 9).

PHASE 4 – Steps 10-11 are conducted at Service Delivery Sites (e.g., in a health facility or a
community).
• The Audit Team continues the assessment of the data management and reporting system
at Service Delivery Sites by determining if a functioning system is in place to collect,
check, and report data to the next level of aggregation (Step 10).
• The Audit Team also traces and verifies data for the selected indicator(s) from source
documents to reported results from Service Delivery Sites (Step 11).

PHASE 5 – Steps 12-14 take place back at the program/project M&E Unit.
• The Audit Team finalizes the assessment of the data management and reporting system by
answering the final Audit Summary Questions (Step 12).
• The Audit Team then drafts its preliminary DQA findings and recommendations (Step
13) and shares them with the program/project M&E officials during an Audit Closeout
Meeting (Step 14). Emphasis is placed on reaching a consensus with M&E officers on
what steps to take to improve data quality.



PHASE 6 – Steps 15-18 are conducted at the Audit Team’s Office and through meetings with
the Organization Commissioning the DQA and the program/project office.
• The Audit Team completes a draft Audit Report (Step 15) which is communicated to the
Organization Commissioning the DQA and the program/project (Step 16).
• Based on the feedback provided, the Audit Team completes the Final Audit Report and
communicates the report to the program/project (Step 17).
• In the final audit step, the Audit Team may be asked to outline a follow-up process to help
assure that improvements identified in the Final Audit Report are implemented (Step 18).



PHASE 1: PREPARATION AND INITIATION

The first phase of the DQA occurs prior to the Audit Team being on site at the location of the
program/project. Responsibility for PHASE 1 rests partly with the Organization Commissioning the
DQA and partly with the Audit Agency. The steps in PHASE 1 are to:

1. Identify the country and program/project and select the indicator(s) and reporting period that will
   be the focus of the actual data verification work at a few Service Delivery Sites.
2. Notify the selected program/project(s) of the impending data quality audit and request
   documentation related to the data management and reporting system that the Audit Team can
   review in advance of the site visits. Obtain national authorization(s), if needed, to undertake the
   audit. Notify key country officials and coordinate with other organizations such as donors,
   implementing partners and national audit agencies, as necessary.
3. Determine the type of sample and the number of sites to be the subject of on-site data quality
   verifications.
4. Prepare for the site visits, including determining the timing of the visit, constituting the Audit
   Team, and addressing logistical issues.
5. Perform a “desk review” of the provided documentation to begin to determine if the program/
   project’s data management and reporting system is capable of reporting quality data if
   implemented as designed.

The steps in PHASE 1 are estimated to take four to six weeks.


Step 1. Select Country, Program/Project(s), Indicator(s),
and Reporting Period

Step 1 can be performed by the Organization Commissioning the DQA and/or the Audit Team.

A – SELECT THE COUNTRY AND PROGRAM/PROJECT(S)

In all likelihood, the Organization Commissioning the DQA will determine which country
and program/project should be the subject of the Data Quality Audit. This DQA Tool presents
strategies for selecting a program/project(s) for an audit by providing a list of relevant criteria and
other issues to be considered. There is no single formula for choosing program/project(s) to be
audited; international, local and programmatic circumstances must be taken into consideration in
the decision. The audit documentation should include information about who made the selection
and, to the extent known, the rationale for that decision.

An illustrative list of criteria to be used for the selection of a country and program/project is shown
below in Step 1 – Table 1. If a National program is having the audit conducted, it can also use
these criteria to select which aspects of the program (e.g. indicators) will be audited.

Step 1 – Table 1. Illustrative Criteria for Selection of a Country, Disease/Health Area, and
Program/Project

1. Amount of funding invested in the countries and programs/projects within the disease/health area.
2. Results reported from countries and programs/projects (such as number of people on ART, ITNs
   distributed, or Directly Observed Treatment, Short Course [DOTS] Detection Numbers).
3. Large differences in results reporting from one period to the next within a country or a program/project.
4. Discrepancies between programmatic results and other data sources (e.g., expenditures for health
   products that are inconsistent with the number of people reported on anti-retroviral [ARV] treatment).
5. Inconsistencies between reported data from a specific project and national results (e.g., reported
   number of ITNs distributed is inconsistent with national numbers).
6. Findings of previous M&E assessments indicating gaps in the data management and reporting systems
   within program(s)/project(s).
7. Opinion/references about perceived data quality weaknesses and/or risks within a program/project.
8. A periodic audit schedule associated with funding or renewal reviews.
9. A desire to have some random selection of countries and programs/projects for audit.



When Organizations Commissioning a DQA select the country and program/project to be the
subject of a data quality audit, they might find it useful to rank the countries (or programs/projects)
by the amount they have invested in them and/or the reported output (results). This could be done
in the following sequence:
• First, rank the countries or program/project(s) by the investment amount for a specific disease;
• Second, identify the indicators relevant for ranking the countries (or the programs/projects) by
  reported results (this list will generally be specific to the particular Organization Commissioning
  the DQA);
• Third, determine the ranking of each country or program/project for each of the identified indicators.

This list should help the Organization Commissioning the DQA prioritize the countries or program/
project(s). ANNEX 2, Step 1 – Template 1 is illustrative of such an analysis.
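
For instance, a commissioning organization that keeps its portfolio data in a spreadsheet could rank
candidates along these lines; the Python sketch below is purely illustrative (the program names, column
names, and figures are invented) and simply mirrors the investment-plus-results ranking described above.

    # Illustrative only: ranking candidate programs/projects by investment and
    # by reported results. All names and figures below are invented examples.
    programs = [
        {"program": "Country A - ART",  "investment_usd": 12_000_000, "reported_result": 45_000},
        {"program": "Country B - ITN",  "investment_usd": 8_500_000,  "reported_result": 900_000},
        {"program": "Country C - DOTS", "investment_usd": 3_200_000,  "reported_result": 15_000},
    ]

    by_investment = sorted(programs, key=lambda p: p["investment_usd"], reverse=True)
    by_results = sorted(programs, key=lambda p: p["reported_result"], reverse=True)

    print("Ranked by investment:")
    for rank, p in enumerate(by_investment, start=1):
        print(f"  {rank}. {p['program']} (USD {p['investment_usd']:,})")

    print("Ranked by reported results:")
    for rank, p in enumerate(by_results, start=1):
        print(f"  {rank}. {p['program']} ({p['reported_result']:,} reported)")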

B – SELECT THE INDICATOR(S)

Other important decisions in preparing for a Data Quality Audit are to determine: (1) which
indicators will be included in the audit; and (2) for what reporting period(s) the audit will be
conducted. It is recommended that up to two indicators be selected within a Disease/Health
Area and that, if multiple Diseases/Health Areas are included in a Data Quality Audit, a
maximum of four indicators be included. More than four indicators could lead to an excessive
number of sites to be evaluated.

The decision regarding which indicators to include will generally be made by the Organization
Commissioning the DQA and can be based on a number of criteria, including an analysis of the
funding levels to various program areas (e.g., ARV, Prevention of Mother-to-Child Transmission
[PMTCT], ITN, DOTS, Behavior Change Communication [BCC]) and the results reported for the
related indicators. In addition, the deciding factor could also be program areas of concern to the
Organization Commissioning the DQA and/or to the National program (e.g., community-based
programs that may be more difficult to monitor than facility-based programs). In some cases, the
Audit Agency may be asked to do an initial selection of indicators to be proposed to the Organization
Commissioning the DQA. The analysis conducted in Step 1 can help guide the selection of indicators
to be included in the Data Quality Audit.

The criteria for selecting the indicators for the Data Quality Audit could be the following:
1. “Must Review” Indicators. Given the program/project(s) selected for auditing, the
Organization Commissioning the DQA may have a list of “must review” indicators that
should be selected first (e.g., indicators related to People on ARV Treatment, ITNs Distributed
[or re-treated], and DOTS Detection Numbers). These are generally the indicators that are
internationally reported to measure the global response to the disease. For example, for audits
undertaken through the Global Fund, the indicators to be audited will generally come from its
list of “Top 10 indicators.”  Under the President’s Emergency Plan for AIDS Relief, the list
will likely come from indicators that most directly relate to the goals of putting two million
people on treatment and providing 10 million people with care and support. Other donors and
National programs may have different lists of important indicators to consider.
2. Relative Magnitude of the Indicators.
a. Relative Magnitude of Resource Investment in Activities Related to the Indicator. For
example, if the program/project invests more than 25% of its funding in a specific program
area, then the key indicator in that area could be selected.
b. Reported Number for an Indicator Relative to the Country Target. If the identified program/
project has “substantial” reporting activity within a country for an indicator, that indicator
should be considered for auditing. Substantial could be defined as generating more than
25% of the country’s total reported numbers for that indicator.
3. “Case by Case” Purposive Selection. In some cases, the Organization Commissioning the
DQA may have other reasons for including an indicator in the DQA. This could be because
there are indicators for which data quality questions exist. It could also be the case for indicators
that are supposedly routinely verified and for which the Organization Commissioning the
DQA wants an independent audit. Those reasons should be documented as justification for
inclusion.

ANNEX 2, Step 1 – Template 2 contains an illustrative template for analyzing the relative magnitude
of the investments and indicator results per program area.
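
To make the two 25% criteria above concrete, the short Python sketch below flags program areas whose
share of funding, or whose share of the country’s total reported results, exceeds the threshold; the
program-area names, figures, and the flag_candidates helper are all hypothetical.

    # Illustrative only: flagging candidate indicators using the two 25%
    # criteria described above. Names and figures are invented examples.
    program_areas = [
        # funding share within the program/project, and the share of the
        # country's total reported result contributed by this program/project
        {"area": "ART",   "funding_share": 0.40, "share_of_country_result": 0.30},
        {"area": "PMTCT", "funding_share": 0.15, "share_of_country_result": 0.10},
        {"area": "BCC",   "funding_share": 0.45, "share_of_country_result": 0.05},
    ]

    THRESHOLD = 0.25

    def flag_candidates(areas, threshold=THRESHOLD):
        """Return areas meeting either relative-magnitude criterion."""
        return [a["area"] for a in areas
                if a["funding_share"] > threshold
                or a["share_of_country_result"] > threshold]

    print(flag_candidates(program_areas))  # ['ART', 'BCC']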

C – SELECT THE REPORTING PERIOD

It is also important to clearly identify the reporting period associated with the indicator(s) to be
audited. Ideally, the time period should correspond to the most recent relevant reporting period
for the national system or to the program/project activities associated with the Organization
Commissioning the DQA. If the circumstances warrant, the time period for the audit could be shorter
(e.g., a fraction of the reporting period, such as the last quarter or month of the reporting period).
For example, the number of source documents in a busy VCT site could be voluminous, audit
staff resources may be limited, or the program/project’s Service Delivery Sites might produce
monthly or quarterly reports related to the relevant source documents. In other cases, the time
period could correspond to an earlier reporting period where large results were reported by the
program/project(s).

D – DOCUMENT THE SELECTION

ANNEX 2, Step 1 – Template 3 provides a tool that can be used to document selection of the
country, program/project(s), indicator(s), and reporting period being audited.



Step 2. Notify Program, Request Documentation and
Obtain National Authorizations

Step 2 is typically performed by the Organization Commissioning the DQA.

A – NOTIFY PROGRAM AND REQUEST DOCUMENTATION

The Organization Commissioning the DQA should notify the program/project about the impending
Data Quality Audit as soon as possible and obtain national and other relevant authorizations. It
should also notify other organizations, as appropriate, about the audit and request cooperation. The
Audit Team is expected to comply with national regulations regarding data confidentiality
and ethics. It is the Audit Team’s responsibility to identify such national regulations and adhere
to them.

ANNEX 2, Step 2 – Template 1 contains draft language for the notification letter. This letter can be
modified, as needed, in consultation with local stakeholders (e.g., the National Disease Commission,
the MOH, the CCM, relevant donors). It is important that the Organization Commissioning the
DQA stress the need for the relevant M&E Unit staff member(s) to accompany the Audit Team
on its site visits. The letter should be accompanied by the initial documentation request from the
M&E Unit, which is found in Step 2 – Table 1.

After the notification letter has been sent, the Organization Commissioning the DQA should send
a copy of the notification letter to all relevant stakeholders, including, for example:
• Host country officials related to the program/project being audited;
• National audit agency, as appropriate; and
• Donors, development partners, international implementing partner organizations, and
relevant M&E working-group representatives.

The Audit Agency should follow up with the selected program/project about the pending audit,
timeframes, contact points, and the need to supply certain information and documentation in
advance.

The Audit Team will need four types of documentation at least two weeks in advance of the country
mission:
1. A list of all service points with latest reported results related to the indicator(s);
2. A description of the data-collection and reporting system;
3. The templates of the data-collection and reporting forms; and
4. Other available documentation relating to the data management and reporting systems and a
description of the program/project (e.g., a procedures manual).



1) List of Service Delivery Sites that offer services related to the indicator(s). The Audit Team
should receive a list of all Service Delivery Sites from which to select a sample of the sites to be
audited. This list of service sites should include:
• Location – region, district, etc., and whether the site is in an urban or rural area.
• Type of facility – if the service site is a health facility (and what type of health facility,
e.g. hospital, primary health care center) or a community-based service site.
• Latest reporting results for each of the Service Delivery Sites (e.g., numbers of
individuals on treatment or cases successfully treated).
• Information on other factors (as necessary) – the Organization Commissioning the
DQA may define other characteristics defining the sample of sites to be drawn. For
example, the selection may include public and private sector sites or may focus on sites
supported by faith-based organizations or non-governmental organizations.

Once Service Delivery Sites and the related Intermediate Aggregation Levels are selected for the
audit, it is critical that the Audit Team work through the program/project to notify the selected
sites and provide them with the information sheets found in ANNEX 3, Step 2 – Templates 1, 2, 3.
This is meant to ensure that relevant staff are available and source documentation is accessible for the
indicator(s) and reporting period being audited.

2) Description of the data-collection and reporting system related to the indicator(s). The Audit
Team should receive the completed template(s) found in ANNEX 2, Step 2 – Template 2 describing
the data-collection and reporting system related to the indicator(s) being audited.

3) Templates of the data-collection and reporting forms. The Audit Team should receive the
templates of all data-collection and reporting forms used at all levels of the data management
system for the related indicator(s) (e.g., patient records, client intake forms, registers, monthly
reports, etc.).

4) Other documentation for the systems review. The other documents requested are needed so
that the Audit Team can start assessing the data collection and reporting system for the selected
indicator(s). These documents are listed in Step 2 – Table 1 below. In the event
the program/project does not have such documentation readily available, the Audit Team should be
prepared to follow up with the program/project management once in country.

In addition, the Organization Commissioning the Audit should also provide the Audit Team with
relevant background documents regarding the country and program/project being audited.



Step 2 – Table 1. List of Audit Functional Areas and Documentation to Request from
Program/Project for Desk Review (if available)

For each functional area, request the documentation listed below and check off each item as it is provided.

Contact Information
• Names and contact information for key program/project officials, including key staff responsible for
  data management activities.

I – M&E Structures, Roles, and Capabilities
• Organizational chart depicting M&E responsibilities.
• List of M&E positions and status (e.g., full time or part time, filled or vacant).
• M&E Training Plan, if one exists.

II – Indicator Definitions and Reporting Guidelines
• Instructions to reporting sites on reporting requirements and deadlines.
• Description of how service delivery is recorded on source documents, and on other documents such as
  clinic registers and periodic site reports.
• Detailed data flow diagram including:
  – from Service Delivery Sites to Intermediate Aggregation Levels (e.g., district offices, provincial
    offices, etc.); and
  – from Intermediate Aggregation Levels (if any) to the M&E Unit.
• National M&E Plan, if one exists.
• Operational definitions of indicators being audited.

III – Data Collection and Reporting Forms and Tools
• Data-collection form(s) for the indicator(s) being audited.
• Reporting form(s) for the indicator(s) being audited.
• Instructions for completing the data collection and reporting forms.

IV – Data Management Processes
• Written documentation of data management processes including a description of all data-verification,
  aggregation, and manipulation steps performed at each level of the reporting system.
• Written procedures for addressing specific data quality challenges (e.g., double-counting, “lost to
  follow-up”), including instructions sent to reporting sites.
• Guidelines and schedules for routine supervisory site visits.

V – Links with National Reporting System
• Documented links between the program/project data reporting system and the relevant national data
  reporting system.



The systems review will be conducted by answering the questions in the DQA Protocol 1: System
Assessment Protocol. The protocol is arranged into five functional areas with thirteen key
summary questions that are critical to evaluating whether the program/project(s) data management
system is well designed and implemented to produce quality data. Performing the desk review
with the documentation provided prior to visiting the program/project will reduce the burden the
audit will place on the data management staff at the M&E Unit.

B – OBTAIN NATIONAL AUTHORIZATION

In certain cases, special authorization for conducting the DQA may be required from another
national body, such as the National Audit Agency. ANNEX 2, Step 2 – Template 3 provides text
for the letter requesting such additional authorization to conduct the Data Quality Audit. This letter
should be sent by the Organization Commissioning the DQA. The recipient(s) of the authorization
letter will vary according to what program or project is being audited. The national authorization
and any other relevant permission to conduct the DQA from donors supporting audited sites or
program/project officials should be included in the Final Audit Report as an attachment.



Step 3. Select Sites to be Audited

Step 3 can be performed by the Organization Commissioning the DQA and/or the Audit Team.

In this section, four alternatives are presented for selecting the sites in which the data quality audit
teams will conduct the work. The alternatives are presented in order of complexity, from Sampling
Strategy A which is completely non-statistical, to Sampling Strategy D which is a multistage cluster
sampling method that can be used to make statistical inferences about data quality on a national
scale. Sampling Strategies B and C represent midpoints between the non-statistical and statistical
approaches and offer the audit team an opportunity to tailor the audit to a specific set of sites based
on need or interest.

The Organization Commissioning the DQA should decide on the sampling strategy based on the
objective of the DQA and available resources. The Audit Agency will determine, based on which
type of sample is used, the sites for the audit. The Organization Commissioning the DQA may want
to be involved in decisions regarding site selection, particularly if the sampling is not random.

A – SELECTION METHOD A: Purposive Selection

This is a pre-determined sample that the Organization Commissioning the DQA dictates to the
Data Quality Audit team. In some cases, there may be a need for a data quality audit to focus
specifically on a set of service delivery points that are predetermined. In this case, there is no
need for a sampling plan. However, the data quality audit findings produced from such a
“purposive” or targeted sample cannot be used to make generalized statements (or statistical
inferences) about the total population of sites in that country. The findings will be limited to
those sites visited by the audit team.

B – SELECTION METHOD B: Restricted Site Selection

Sampling Strategy B is also called a restricted site design. It is commonly used as a substitute for
probability sampling (based on a random algorithm) and is a good design for comparison of audit
results over multiple periods. In the Restricted Site design, the audit team selects one site where all
the work will occur. The benefit of this approach is that the team can maximize its efforts in one
site and have a high degree of control over implementation of the audit protocols and knowledge
of the site-specific systems from which the results are derived. Sampling Strategy B is ideal for
evaluating the effects of an intervention to improve data quality. For example, the DQA is
implemented at a site and constitutes a baseline measurement. An intervention is conducted
(e.g., training), and the DQA is implemented a second time. Since all factors that can influence data quality are the same for both the pre- and post-test (the same site is used), any difference in data quality found on the post-test can most likely be attributed to the intervention. Such a repeated-measures approach using the data quality audit tool might be prohibitively expensive if used in conjunction with a sampling plan that involves many sites.



C – SELECTION METHOD C: Priority Attribute Selection

This sample is drawn by the Data Quality Audit team with the objective of maximizing exposure
to important sites while minimizing the amount of time and money spent actually implementing
the audit. In most cases, Sampling Strategy C involves the random selection of sites from within
a particular group, where group membership is defined by an attribute of interest. Examples
of such attributes include location (e.g. urban/rural, region/district), volume of service, type of
organization (e.g. faith-based, non-governmental), or performance on system assessments (e.g.
sites that scored poorly on the M&E Systems Strengthening Tool).

The stratified random sampling used in Sampling Strategy C allows the audit team to make inferences from the audit findings to all the sites that share the stratification attribute of interest (e.g., all rural sites, all very large sites, all faith-based sites). In this way, the
audit findings can be generalized from the sample group of sites to a larger “population” of sites to
which the sampled sites belong. This ability to generate statistics and make such generalizations
can be important and is discussed in more detail in the section below describing Sampling Strategy
D.

The stratified sampling used in Sampling Strategy C is sub-national: the data quality auditors are
not attempting to make generalizations about national programs. In this sense, the strategy differs
from Sampling Strategy D mainly with respect to its smaller scope. Both strategies use random
sampling (explained in more detail in Annex 4), which means that within a particular grouping of
sites (sampling frame), each site has an equal chance of being selected into the audit sample.
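
To make the mechanics of this stratified random selection concrete, the short sketch below draws an equal-probability sample of sites within each stratum. It is purely illustrative and not part of the DQA protocols; the site names, strata, and the choice of two sites per stratum are invented assumptions.

```python
import random

def stratified_site_sample(sites, sites_per_stratum, seed=None):
    """Illustrative stratified random selection of audit sites.

    `sites` is a list of dicts such as {"name": "Site A", "stratum": "rural"};
    within each stratum every site has an equal chance of selection.
    """
    rng = random.Random(seed)
    strata = {}
    for site in sites:
        strata.setdefault(site["stratum"], []).append(site)

    sample = []
    for stratum, members in strata.items():
        k = min(sites_per_stratum, len(members))
        sample.extend(rng.sample(members, k))  # simple random sample within the stratum
    return sample

# Example: pick up to 2 sites at random from each stratum of interest
sites = [
    {"name": "Site A", "stratum": "rural"},
    {"name": "Site B", "stratum": "rural"},
    {"name": "Site C", "stratum": "urban"},
    {"name": "Site D", "stratum": "urban"},
    {"name": "Site E", "stratum": "faith-based"},
]
print(stratified_site_sample(sites, sites_per_stratum=2, seed=1))
```

Fixing the random seed simply makes the illustrative draw reproducible; an actual audit would document whatever randomization procedure it used.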

A Verification Factor can be calculated that indicates data quality for the group of sites with the attribute of interest, although this factor is not national in scope.

D – SELECTION METHOD D: Cluster Sampling Selection

Sampling Strategy D is used to derive a national level Verification Factor for program-level
indicators. It is complex and requires updated and complete information on the geographical
distribution of sites (for whatever indicators have been selected) as well as the site-specific
reported results (counts) for the indicator that is being evaluated. Sampling Strategy D could also
be referred to as a modified two-stage cluster sample (modified in that a stratified random sample
of sites, rather than a simple random sample, is taken within the selected clusters).

Cluster sampling is a variation on simple random sampling (where all sites would be chosen
randomly) that permits a more manageable group of sites to be audited. Were all sites chosen at
random they would likely be dispersed all over the country and require much time and resources
to audit. Cluster sampling allows for the selection of a few districts, thereby reducing the amount
of travel required by the auditors.



A scientific sampling plan implies the use of probability theory and involves statistics. The purpose
of statistics in this context is to allow the auditors to produce quantitative data quality findings that
can be viewed as estimates of data quality for the whole program/project, and not simply as the
data quality at the selected sites. Furthermore, a scientific sample allows for the quantification of
the certainty of the estimates of accuracy found by the audit (i.e. confidence intervals). The benefits
of such a proportionally representative sampling plan go beyond the calculation of Verification
Factors and apply to all empirical data quality audit findings.

The primary sampling unit for Sampling Strategy D is a cluster, which refers to the administrative, political, or geographic unit in which Service Delivery Sites are located. In practice, the selection
of a cluster is usually a geographical unit like a district. Ultimately, the selection of a cluster
allows the audit team to tailor the sampling plan according to what the country program looks like.

The strategy outlined here uses probability proportionate to size (PPS) to derive the final set of
sites that the audit team will visit. Sampling Strategy D generates a selection of sites to be visited
by the audit team that is proportionately representative of all the sites where activities supporting
the indicator(s) under study are being implemented.

Clusters are selected in the first stage using systematic random sampling, where clusters with
active programs reporting on the indicator of interest are listed in a sampling frame. In the second
stage, Service Delivery Sites from selected clusters are chosen using stratified random sampling
where sites are stratified on volume of service.
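
As an illustration of the first stage only, the sketch below applies systematic sampling with probability proportionate to size to a hypothetical list of districts; the district names and reported counts are invented, and the second (stratified) stage is indicated only in a comment. Annex 4 remains the authoritative description of the procedure.

```python
import random

def pps_systematic_sample(clusters, n_clusters, seed=None):
    """Stage 1: systematic selection of clusters with probability proportionate
    to size (PPS), using reported counts as the size measure."""
    rng = random.Random(seed)
    total = sum(c["reported_count"] for c in clusters)
    interval = total / n_clusters
    start = rng.uniform(0, interval)
    picks = [start + i * interval for i in range(n_clusters)]

    selected, cumulative, i = [], 0, 0
    for cluster in clusters:
        cumulative += cluster["reported_count"]
        while i < n_clusters and picks[i] <= cumulative:
            selected.append(cluster)
            i += 1
    return selected

# Illustrative sampling frame: districts with the counts they reported for the indicator
districts = [
    {"name": "District A", "reported_count": 1200},
    {"name": "District B", "reported_count": 300},
    {"name": "District C", "reported_count": 800},
    {"name": "District D", "reported_count": 150},
    {"name": "District E", "reported_count": 550},
]
print([d["name"] for d in pps_systematic_sample(districts, n_clusters=2, seed=7)])
# Stage 2 (not shown) would then draw a stratified random sample of Service
# Delivery Sites within each selected district, stratifying on volume of service.
```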

The number of sites selected for a given DQA will depend on the resources available to conduct
the audit and the level of precision desired for the national level estimate of the Verification Factor.
The Audit Teams should work with the Organization Commissioning the DQA to determine the
right number of sites for a given program and indicator. Annex 4 contains a detailed discussion
and an illustrative example of Sampling Strategy D for the selection of clusters and sites for the
DQA.

Note: The precision of estimates of the Verification Factor found using the GAVI sampling methodology employed here has been questioned.3 It is strongly advised that the Auditing Agency
have access to a sampling specialist who can guide the development of representative samples and
that the verification factors generated using these methods be interpreted with caution.

3 Woodard S., Archer L., Zell E., Ronveaux O., Birmingham M. Design and Simulation Study of the Immunization Data Quality Audit (DQA). Ann Epidemiol, 2007;17:628–633.



Step 4. Prepare for On-site Audit Visits

Step 4 is performed by the Audit Team.

The Audit Agency will need to prepare for the audit site visits. In addition to informing the
program/project and obtaining a list of relevant sites and requesting documentation (Steps 2-3),
the Audit Agency will need to: (1) estimate the timing required for the audit (and work with the
program/project to agree on dates); (2) constitute an Audit Team with the required skills; and (3)
prepare materials for the site visits. Finally, the Audit Agency will need to make travel plans for
the site visits.

A – ESTIMATE TIMING

Depending on the number and location of the sampled sites to be visited, the Audit Agency will
need to estimate the time required to conduct the audit. As a guideline:
• The M&E Unit will typically require two days (one day at the beginning and one day at
the end of the site visits);
• Each Intermediate Aggregation Level (e.g., District or Provincial offices) will require
between one-half and one day;
• Each Service Delivery Site will require between one-half and two days (more than one day may be required for large sites with reported numbers in the several hundreds, for sites that include satellite centers, or when “spot-checks” are performed).
• The Audit Team should also plan for an extra work day after completion of the site visits
to prepare for the meeting with the M&E Unit.

Step 4 – Table 1 on the following page provides an illustrative daily schedule for the site visits
which will help the Audit Agency plan for the total time requirement.
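
A rough, back-of-the-envelope calculation of total field days can be made from the guideline durations above. The sketch below is illustrative only: it assumes upper-end durations for each level, two days for sites needing spot-checks, and it excludes travel days.

```python
def estimate_audit_days(n_intermediate, n_service_sites, large_or_spotcheck_sites=0):
    """Rough field-time estimate based on the guideline durations above.

    Assumes the upper end of each range: 2 days at the M&E Unit, 1 day per
    Intermediate Aggregation Level, 1 day per ordinary Service Delivery Site
    (2 days where spot-checks or large counts are expected), plus 1 Audit Team
    work day. Travel days are not included.
    """
    me_unit_days = 2
    intermediate_days = 1 * n_intermediate
    site_days = 1 * (n_service_sites - large_or_spotcheck_sites) + 2 * large_or_spotcheck_sites
    team_work_day = 1
    return me_unit_days + intermediate_days + site_days + team_work_day

# Example: 3 districts, 9 service sites of which 2 are large
print(estimate_audit_days(n_intermediate=3, n_service_sites=9, large_or_spotcheck_sites=2))  # 17 days
```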



Step 4 – Table 1. Illustrative Daily Schedule for Data Quality Audit Site Visits and Meetings

Country: ___   Indicator: ___
Date: ___   Disease: ___   Team: ___

Note: Add travel and DQA team work days, as needed. Each activity is listed with its estimated time and suggested timing.

M&E UNIT (Beginning) – 1 day
1. Introduction and presentation of DQA process: 30 min (Morning, day 1)
2. Questions and answers: 15 min (Morning, day 1)
3. Confirm reporting period: 15 min (Morning, day 1)
4. Complete “DQA Protocol 1: System Assessment Protocol”: 2 hrs (Morning, day 1)
   a. Request additional documentation (if needed)
   b. Discuss and get answers to protocol questions
5. Complete “DQA Protocol 2: Data Verification Protocol”: 2-4 hrs (Afternoon, day 1)

SERVICE DELIVERY POINT – between ½ and 2 days
(The time required at the Service Delivery Points will vary between one and two days depending on the size of the reported numbers to be verified and whether or not spot-checks are performed.)
1. Introduction and presentation of DQA process: 30 min (Morning, day 1)
2. Questions and answers: 15 min (Morning, day 1)
3. Discuss reporting period and service observation time: 15 min (Morning, day 1)
4. Complete “DQA Protocol 1: System Assessment Protocol”: 1-2 hrs (Morning, day 1)
   a. Request additional documentation (if needed)
   b. Discuss and get answers to protocol questions
5. Complete “DQA Protocol 2: Data Verification Protocol”: 4-15 hrs
   - Observation/Description: 1 hr (Afternoon, day 1)
   - Documentation review: 1-2 hrs (Afternoon, day 1)
   - Trace and verification: 1-4 hrs (Afternoon, day 1)
   - Cross-checks: 1-2 hrs (Afternoon, day 1)
   - Spot-checks: 0-6 hrs (Day 2, if applicable)

INTERMEDIATE AGGREGATION LEVEL – between ½ and 1 day
1. Introduction and presentation of DQA process: 30 min (Morning, day 1)
2. Questions and answers: 15 min (Morning, day 1)
3. Discuss reporting period: 15 min (Morning, day 1)



4. Complete “DQA Protocol 1: System Assessment Protocol”: 1-2 hrs (Morning, day 1)
   a. Request additional documentation (if needed)
   b. Discuss and get answers to protocol questions
5. Complete “DQA Protocol 2: Data Verification Protocol”: 2-4 hrs (Afternoon, day 1)

AUDIT TEAM WORK DAY
1. Review and consolidate DQA Protocols 1 & 2: 1-2 hrs (Morning)
2. Complete preliminary findings and Recommendation Notes: 3 hrs (Morning)
3. Prepare final presentation for meeting with M&E Unit: 4 hrs (Afternoon)

M&E UNIT (End) – 1 day
1. Conduct closeout meeting: 2-3 hrs (Morning)

B – CONSTITUTE THE AUDIT TEAM

While the Organization Commissioning the DQA will select the organization to conduct the data
quality audit, it is recommended that the following skills be represented in the audit teams:
• Public Health (closely related to the disease area and indicator(s) being audited);
• Program Auditing;
• Program Evaluation (e.g., health information systems, M&E systems design, indicator
reporting);
• Data Management (e.g., strong understanding of and skills in data models and querying/
analyzing databases);
• Excel (strong skills preferred for manipulating, modifying, and/or creating files and worksheets);
and
• Relevant country experience (preferable).

Audit Team members can have a combination of the skills listed above. While the total number of
team members will vary by the size of the audit, it is recommended that the Audit Team comprise
a minimum of two to four consultants, including at least one Senior Consultant. The team may be composed of international and/or regional consultants. In addition, if the consultants do not speak the country language, the Audit Team should hire one or more independent translators.



When visiting the sites, the Audit Team will need to split into sub-teams and pair up with at least one representative of the program/project. Each sub-team will be responsible for visiting a number of sites related to the audit (for example, one sub-team would visit sites A, B, and C, while the second sub-team would visit sites D, E, and F). For sub-teams visiting sites with
computerized systems, one team member should have the capability to conduct queries of the
relevant database.

Finally, the Organization Commissioning the DQA may have other requirements for team members
or skills. It will be important for all Audit Team members to be familiar with the indicator-specific
protocols being used in the audit and to become familiar with the program/project being audited.

C – PREPARE LOGISTICS

Materials to Take on the Audit Visits


When the Audit Team visits the program/project, it should bring all the materials needed to carry out the on-site audit steps. A list of these materials is provided in Annex 3, Step 4 – Template 4.

Note: While the protocols in the DQA are automated Excel files, the Audit Team should be
prepared with paper copies of all needed protocols. In some cases, it may be possible to use
computers during site visits, but in other cases the Audit Team will need to fill out the protocols on
the paper copies and then transcribe the findings to the Excel file.

Planning Travel
The Audit Team should work with the program/project to plan for travel to the country (if the
Audit Team is external) and to the sampled sites — both to set appointments and to coordinate with
program/project staff who will accompany the audit team on the site visits. The Audit Team should
arrange for transportation to the sampled sites and for lodging for the team.



Step 5. Review Documentation

Step 5 is performed by the Audit Team.

The purpose of reviewing and assessing the design of the program/project’s data management and
reporting system is to determine if the system is able to produce reports with good data quality if
implemented as planned. The review and assessment is accomplished in several steps, including a
desk review of information provided in advance by the program/project, and follow-up reviews at
the program/project M&E Unit, at selected Service Delivery Sites, and Intermediate Aggregation
Levels. During the off-site desk review, the Audit Team will begin addressing the questions
in the DQA Protocol 1: System Assessment Protocol based on the documentation provided. The
Audit Team should nevertheless anticipate that not all required documentation will be submitted
by the program/project in advance of the country mission.

Ideally, the desk review will give the Audit Team a good understanding of the Program’s reporting
system — its completeness and the availability of documentation relating to the system and
supporting audit trails. At a minimum, the desk review will identify the areas and issues the Audit Team will need to follow up on at the program/project M&E Unit (Phase 2).

Because the M&E system may vary among indicators and may be stronger for some indicators
than others, the Audit Team will need to fill out a separate DQA Protocol 1: System Assessment
Protocol for each indicator audited for the selected program/project. However, if indicators selected
for auditing are reported through the same data reporting forms and systems (e.g., ART and OI
numbers or TB Detection and Successfully Treated numbers), only one DQA Protocol 1: System
Assessment Protocol may be completed for these indicators.

ANNEX 1 shows the list of 39 questions included in the DQA Protocol 1: System Assessment
Protocol that the Audit Team will complete, based on its review of the documentation and the
audit site visits.

As the Audit Team works, it should keep sufficiently detailed notes, or “work papers,” related to the steps in the audit that will support the Audit Team’s final findings. Space has been provided on the protocols for notes during meetings with program/project staff. In addition, if more detailed notes are needed at any level of the audit to support findings and recommendations, the Audit Team should identify those notes as “work papers,” number them, and reference the relevant “work paper” number in the appropriate column on all DQA templates and protocols. It is also important to maintain notes of key interviews or meetings with M&E managers and staff during the audit. Annex 3, Step 5 – Template 1 provides a format for the notes of those interviews.



PHASE 2: PROGRAM/PROJECT’S M&E UNIT

The second phase of the DQA is conducted at the M&E Unit of the program/project being audited. The steps in PHASE 2 are to:

6. Assess the design and implementation of the data management and reporting system at the M&E Unit.
7. Begin tracing and verifying results reported from Intermediate Aggregation Levels (or Service Delivery Sites) to the M&E Unit.

During PHASE 2, the Audit Team should meet the head of the M&E Unit and other key staff who are involved in data management and reporting.

The steps in PHASE 2 are estimated to take one day.



Step 6. Assess Data Management SYSTEMS
(at the M&E Unit)

Step 6 is performed by the Audit Team.

While the Data Quality Audit Team can determine a lot about the design of the data management and
reporting system based on the off-site desk review, it will be necessary to perform on-site follow-up
at three levels (M&E Unit, Intermediate Aggregation Levels, and Service Delivery Points) before
a final assessment can be made about the ability of the overall system to collect and report quality
data. The Audit Team must also anticipate the possibility that a program/project may have some data
reporting systems that are strong for some indicators, but not for others. For example, a program/
project may have a strong system for collecting ART treatment data and a weak system for collecting
data on community-based prevention activities.

The Excel-based DQA Protocol 1: System Assessment Protocol contains a worksheet for the Audit
Team to complete at the M&E Unit. The Audit Team will need to complete the protocol as well as obtain
documentary support for answers obtained at the program/project’s M&E Unit. The most expeditious
way to do this is to interview the program/project’s key data management official(s) and staff and to
tailor the interview questions around the unresolved systems design issues following the desk review
of provided documentation. Ideally, one meeting will allow the Audit Team to complete the DQA
Protocol 1: System Assessment Protocol section (worksheet) for the M&E Unit.

It is important that the Audit Team include notes and comments on the DQA Protocol 1: System
Assessment Protocol in order to formally document the overall design (and implementation) of the
program/project data management and reporting system and identify areas in need of improvement.
Responses to the questions and the associated notes will help the Audit Team answer the 13 overarching
Audit Team Summary Questions towards the end of the DQA (see Step 12 – Table 2 for the list of
summary questions – which will be completely answered in PHASE 5 - Step 12).

As the Audit Team completes the DQA Protocol 1: System Assessment Protocol, it should keep in
mind the following two questions that will shape the preliminary findings (Step 13) and the Audit
Report (drafted in Step 15 and finalized in Step 17):
1. Does the design of the program/project’s overall data collection and reporting system ensure
that, if implemented as planned, it will collect and report quality data? Why/why not?
2. Which audit findings of the data management and reporting system warrant Recommendation
Notes and changes to the design in order to improve data quality? These should be documented
on the DQA Protocol 1: System Assessment Protocol.

Note: While the Audit Team is meeting with the M&E Unit, it should determine how the audit findings
will be shared with staff at the lower levels being audited. Countries have different communication
protocols; therefore in some countries, the Audit Team will be able to share preliminary findings at
each level, while in other countries, the M&E Unit will prefer to share findings at the end of the audit.
It is important for the Audit Team to comply with the communication protocols of the country. The
communication plan should be shared with all levels.



Step 7. Trace and verify results from Intermediate
Aggregation LEVELS (at the M&E Unit)

Step 7 is performed by the Audit Team.

Step 7 is the first of three data verification steps that will assess, on a limited scale, if Service
Delivery Sites, Intermediate Aggregation Levels (e.g., Districts or Regions), and the M&E Unit
are collecting, aggregating, and reporting data accurately and on time.

The Audit Team will use the appropriate version of the DQA Protocol 2: Data Verification
Protocol—for the indicator(s) being audited—to determine if the sampled sites have accurately
recorded the service delivery on source documents. They will then trace those data to determine
if the numbers have been correctly aggregated and/or otherwise manipulated as the numbers are
submitted from the initial Service Delivery Sites, through Intermediary Aggregation Levels, to
the M&E Unit. The protocol has specific actions to be undertaken by the Audit Team at each level
of the reporting system (for more detail on the DQA Protocol 2: Data Verification Protocol,
see Steps 9 and 11). In some countries, however, Service Delivery Sites may report directly to
the central M&E Unit, without passing through Intermediate Aggregation Levels (e.g., Districts
or Regions). In such instances, the verifications at the M&E Unit should be based on the reports
directly submitted by the Service Delivery Sites.

While the data verification exercise implies recounting numbers from the level at which they are
first recorded, for purposes of logistics, the M&E Unit worksheet of the DQA Protocol 2: Data
Verification Protocol can be completed first. Doing so provides the Audit Team with the numbers
received, aggregated and reported by the M&E Unit and thus a benchmark for the numbers the
Audit Team would expect to recount at the Service Delivery Sites and the Intermediate Aggregation
Levels.

At the M&E Unit, the steps undertaken by the Audit Team on the DQA Protocol 2: Data
Verification Protocol are to:
1. Re-aggregate reported numbers from all Intermediate Aggregation Sites: Reported
results from all Intermediate Aggregation Sites (e.g., Districts or Regions) should be re-
aggregated and the total compared to the number contained in the summary report prepared
by the M&E Unit. The Audit Team should identify possible reasons for any differences
between the verified and reported results.

STATISTIC: Calculate the Result Verification Ratio for the M&E Unit:

Result Verification Ratio =
   (Sum of reported counts from all Intermediate Aggregation Sites) /
   (Total count contained in the Summary Report prepared by the M&E Unit)



2. Copy results for the audited Intermediate Aggregation Sites as observed in the Summary
Report prepared by the M&E Unit. To calculate the Adjustment Factor (which is necessary
to derive a Composite Verification Factor — see ANNEX 5), the Audit Team will need to
find the numbers available at the M&E Unit for the audited Intermediate Aggregation Sites.
These are likely to be contained in the Summary Report prepared by the M&E Unit or in a
database.
3. Review availability, completeness, and timeliness of reports from all Intermediate
Aggregation Sites. How many reports should there have been from all Intermediate
Aggregation Sites? How many are there? Were they received on time? Are they complete?

STATISTIC: Calculate the percentage of all reports that are (A) available; (B) on time; and (C) complete:

A) % Available Reports (available to the Audit Team) =
   (Number of reports received from all Intermediate Aggregation Sites) /
   (Number of reports expected from all Intermediate Aggregation Sites)

B) % On Time Reports (received by the due date) =
   (Number of reports received on time from all Intermediate Aggregation Sites) /
   (Number of reports expected from all Intermediate Aggregation Sites)

C) % Complete Reports =
   (Number of reports that are complete from all Intermediate Aggregation Sites) /
   (Number of reports expected from all Intermediate Aggregation Sites)

For a report to be considered complete, it should include at least: (1) the reported count relevant to the indicator; (2) the reporting period; (3) the date of submission of the report; and (4) the signature of the staff member who submitted the report.
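
The arithmetic behind the Result Verification Ratio and the three report percentages can be expressed in a few lines. The sketch below is not part of the Excel protocol; the list of reports, the field names, and the example figures are hypothetical.

```python
def reporting_statistics(reports, expected, summary_report_total):
    """Illustrative calculation of the Step 7 statistics at the M&E Unit.

    `reports` is a list of dicts for the reports actually on file, e.g.
    {"count": 120, "on_time": True, "complete": True}; `expected` is the
    number of reports that should have been received from all Intermediate
    Aggregation Sites; `summary_report_total` is the total in the M&E Unit's
    summary report.
    """
    recounted_total = sum(r["count"] for r in reports)
    return {
        # Sum of reported counts / total in the M&E Unit summary report
        "result_verification_ratio": recounted_total / summary_report_total,
        "% available": 100 * len(reports) / expected,
        "% on time": 100 * sum(r["on_time"] for r in reports) / expected,
        "% complete": 100 * sum(r["complete"] for r in reports) / expected,
    }

# Example: 3 of 4 expected district reports are on file
reports = [
    {"count": 120, "on_time": True, "complete": True},
    {"count": 80, "on_time": False, "complete": True},
    {"count": 45, "on_time": True, "complete": False},
]
print(reporting_statistics(reports, expected=4, summary_report_total=250))
```

The same calculations apply at the Intermediate Aggregation Level in Step 9, with Service Delivery Point reports in place of district reports.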

Warning: If there are any indications that some of the reports have been fabricated (for the purpose
of the audit), the Audit Team should record these reports as “unavailable” and seek other data
sources to confirm the reported counts (for example, an end-of-year report from the site containing
results for the reporting period being audited). As a last resort, the Audit Team may decide to visit
the site(s) for which reports seem to be fabricated to obtain confirmation of the reported counts. In
any event, if these reported counts cannot be confirmed, the Audit Team should dismiss the reported
counts and record “0” for these sites in the DQA Protocol 2: Data Verification Protocol.

Note: In no circumstances should the Audit Team record personal information, photocopy or
remove documents from the M&E Unit.



PHASE 3: INTERMEDIATE AGGREGATION LEVEL(S)

The third phase of the DQA takes place, where applicable, at one or more intermediary aggregation (reporting) levels (e.g., Districts or Regions) where data reported by the selected Service Delivery Sites may be aggregated with data from other service sites before being communicated to the program/project headquarters. The steps in PHASE 3 are to:

8. Determine if key elements of the program/project’s data management and reporting system are being implemented at the intermediary reporting sites (e.g., Districts or Regions).
9. Trace and verify reported numbers from the Service Delivery Site(s) through any aggregation or other manipulation steps performed at the intermediary sites.

During PHASE 3, the Audit Team should meet with key staff involved in program/project M&E at the relevant Intermediate Aggregation Level, including the staff member(s) in charge of M&E and other staff who contribute to aggregating the data received from Service Delivery Sites and reporting the aggregated (or otherwise manipulated) results to the next reporting level.

NOTE: As stated earlier, in some countries, Service Delivery Sites may report directly to the central M&E Unit, without passing through Intermediate Aggregation Levels. In such instances, the Audit Team should not perform PHASE 3.

The steps in PHASE 3 are estimated to take between one-half and one day.

Step 8. ASSESS DATA MANAGEMENT SYSTEMS


(AT THE INTERMEDIATE AGGREGATION LEVELS)

Step 8 is performed by the Audit Team.

In Step 8, the Audit Team continues the assessment of the data management and reporting system
at the intermediate aggregation levels at which data from Service Delivery Sites are aggregated
and manipulated before being reported to the program/project M&E Unit. Specific instructions
for completing the Intermediate Aggregation Level worksheet of the DQA Protocol 1: System
Assessment Protocol are found in the Excel file of the protocol.



Step 9. TRACE AND VERIFY RESULTS FROM SITE REPORTS
(AT THE INTERMEDIATE AGGREGATION LEVELS)

Step 9 is performed by the Audit Team.

The Audit Team will continue with the DQA Protocol 2: Data Verification Protocol for Steps 9
and 11.

Step 9 – Table 1. Intermediate Aggregation Levels: Two Types of Data Verifications

1. Documentation Review (required in all cases): Review the availability, timeliness, and completeness of expected reports from Service Delivery Sites for the selected reporting period.
2. Trace and Verification (required in all cases): Trace and verify reported numbers: (1) re-aggregate the numbers submitted by the Service Delivery Sites; (2) compare the verified counts to the numbers submitted to the next level (program/project M&E Unit); (3) identify reasons for any differences.

At this stage of the audit, the Data Quality Audit seeks to determine whether the intermediary
reporting sites correctly aggregated the results reported by Service Delivery Points.

The Audit Team will perform the following data quality audit steps for each of the selected
indicators at the Intermediate Aggregation Level(s):
1. Re-aggregate reported numbers from all Service Delivery Points: Reported results from all
Service Delivery Points should be re-aggregated and the total compared to the number contained
in the summary report prepared by the Intermediate Aggregation Site. The Audit Team should
identify possible reasons for any differences between the verified and reported results.

STATISTIC: Calculate the Result Verification Ratio for the Intermediate Aggregation Site:

Result Verification Ratio =
   (Sum of reported counts from all Service Delivery Points) /
   (Total count contained in the Summary Report prepared by the Intermediate Aggregation Site)

2. Review availability, completeness and timeliness of reports from all Service Delivery
Points. How many reports should there have been from all Service Delivery Points? How
many are there? Were they received on time? Are they complete?

STATISTIC: Calculate the percentage of all reports that are (A) available; (B) on time; and (C) complete:



A) % Available Reports (available to the Audit Team) =
   (Number of reports received from all Service Delivery Points) /
   (Number of reports expected from all Service Delivery Points)

B) % On Time Reports (received by the due date) =
   (Number of reports received on time from all Service Delivery Points) /
   (Number of reports expected from all Service Delivery Points)

C) % Complete Reports (i.e., containing all the relevant data to measure the indicator) =
   (Number of reports that are complete from all Service Delivery Points) /
   (Number of reports expected from all Service Delivery Points)

For a report to be considered complete, it should include at least: (1) the reported count relevant to the indicator; (2) the reporting period; (3) the date of submission of the report; and (4) the signature of the staff member who submitted the report.

Warning: If there are any indications that some of the reports have been fabricated (for the purpose
of the audit), the Audit Team should record these reports as “unavailable” and seek other data
sources to confirm the reported counts (for example, an end-of-year report from the site containing
results for the reporting period being audited). As a last resort, the Audit Team may decide to visit
the site(s) for which reports seem to be fabricated to obtain confirmation of the reported counts. In
any event, if these reported counts cannot be confirmed, the Audit Team should dismiss the reported
counts and record “0” for these sites in the DQA Protocol 2: Data Verification Protocol.

Note: In no circumstances should the Audit Team record personal information, photocopy or
remove documents from the Intermediate Aggregation Sites.



PHASE 4: SERVICE DELIVERY SITES

The fourth phase of the DQA takes place at the selected Service Delivery Sites, where the following data quality audit steps are performed:

10. Determine if key elements of the program/project’s data management and reporting system are being implemented at the Service Delivery Sites.
11. Trace and verify reported data from source documents for the selected indicators.

During PHASE 4, the Audit Team should meet with key data collection and management staff at the Service Delivery Site, including the staff involved in completing the source documents, in aggregating the data, and in verifying the reports before submission to the next administrative level.

The steps in PHASE 4 are estimated to take between one-half and two days. More than one day may be required for large sites (with reported numbers in the several hundreds), sites that include satellite centers, or when “spot-checks” are performed.

Step 10. Assess Data Collection and Reporting System


(at the SERVICE DELIVERY POINTS)

Step 10 is performed by the Audit Team.

In Step 10, the Audit Team conducts the assessment of the data management and reporting system
at a selection of Service Delivery Sites at which services are rendered and recorded on source
documents. Data from Service Delivery Sites are then aggregated and manipulated before being
reported to the Intermediate Aggregation Levels. Specific instructions for completing the Service
Delivery Site worksheet of the DQA Protocol 1: System Assessment Protocol are found in the
Excel file of the protocol.



Step 11. Trace and verify results from Source
Documents (at the SERVICE DELIVERY POINTS)

Step 11 is performed by the Audit Team.

At the Service Delivery Site, each indicator-specific protocol begins with a description of the service(s)
provided in order to orient the Audit Team towards what is being “counted” and reported. This will
help lead the Audit Team to the relevant source documents at the Service Delivery Point, which can
be significantly different for various indicators (e.g., patient records, registers, training logs).

Regardless of the indicator being verified or the nature of the Service Delivery Site (health-based/clinical or community-based), the Audit Team will perform some or all of the following data verification steps (Step 11 – Table 1) for each selected indicator:

Step 11 – Table 1. Service Delivery Site: Five Types of Data Verifications

1. Description (required in all cases): Describe the connection between the delivery of services and/or commodities and the completion of the source document that records that delivery.
2. Documentation Review (required in all cases): Review the availability and completeness of all indicator source documents for the selected reporting period.
3. Trace and Verification (required in all cases): Trace and verify reported numbers: (1) recount the reported numbers from available source documents; (2) compare the verified numbers to the site-reported number; (3) identify reasons for any differences.
4. Cross-checks (required in all cases): Perform “cross-checks” of the verified report totals with other data sources (e.g., inventory records, laboratory reports, other registers).
5. Spot-checks (if feasible): Perform “spot-checks” to verify the actual delivery of services and/or commodities to the target populations.

Before starting the data verifications, the Audit Team will need to understand and describe the
recording and reporting system related to the indicator being verified at the Service Delivery
Site (i.e., from initial recording of the service delivery on source documents to the reporting of
aggregated numbers to the next administrative level).

1. DESCRIPTION – Describe the connection between the delivery of the service and/or
commodity and the completion of the source document. This step will give the Audit Team
a “frame of reference” for the link between the service delivery and recording process, and
obtain clues as to whether outside factors such as time delays and/or competing activities
could compromise the accurate and timely recording of program activities.



2. DOCUMENTATION REVIEW – Review availability and completeness of all indicator source documents for the selected reporting period.
   • Review a template of the source document (by obtaining a blank copy) and determine if the site has sufficient supplies of blank source documents;
   • Check availability and completeness of source documents and ensure that all the completed source documents fall within the reporting period being audited;
   • Verify that procedures are in place to prevent reporting errors (e.g., double-counting of clients who have transferred in/out, died, or are lost to follow-up), if applicable.

Note that the indicator-specific protocols have listed likely source document(s). If the Audit
Team determines that other source documents are used, the team can modify the protocol(s)
accordingly and document in its work papers the change that has been made to the protocol.
The Audit Team will need to maintain strict confidentiality of source documents.

3. TRACE AND VERIFICATION – Recount results from source documents, compare the
verified numbers to the site reported numbers and explain discrepancies.

STATISTIC: Calculate the Result Verification Ratio for the Service Delivery Site:

Result Verification Ratio =
   (Verified count at the selected Service Delivery Site) /
   (Reported count at the selected Service Delivery Site)

Possible reasons for discrepancies could include simple data entry or arithmetic errors. The
Audit Team may also need to talk to data reporting staff about possible explanations and
follow-up with program data-quality officials if needed. This step is crucial to identifying
ways to improve data quality at the Service Delivery Sites. It is important to note that the Audit Team could find large mistakes at a site “in both directions” (i.e., over-reporting and under-reporting) that result in a negligible difference between the reported and recounted figures but are indicative of major data quality problems. Likewise, a one-time mathematical error could result in a large difference. Thus, in addition to the Verification Factor calculated for the site, the Audit Team will need to consider the nature of the findings before drawing conclusions about data quality at the site.
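
A small invented example shows how offsetting errors can hide behind a seemingly perfect ratio. The “gross error” figure computed at the end is not a DQA statistic; it is included only to illustrate why the team should look beyond the ratio itself.

```python
# Illustrative only: how offsetting errors can hide behind a "good" ratio.
reported_count = 100          # what the site reported
verified_count = 100          # recount: 10 records over-counted, 10 missed, net zero

result_verification_ratio = verified_count / reported_count
print(result_verification_ratio)        # 1.00 -- looks perfect

over_counted, missed = 10, 10
gross_errors = over_counted + missed
print(gross_errors / reported_count)    # 0.20 -- yet 20% of the records were wrong
```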

4. CROSS-CHECKS – Perform feasible cross-checks of the verified report totals with other
data sources. For example, the team could examine separate inventory records documenting
the quantities of treatment drugs, test-kits, or ITNs purchased and delivered during the reporting
period to see if these numbers corroborate the reported results. Other cross-checks could
include, for example, comparing treatment cards to unit, laboratory, or pharmacy registers.
The Audit Team can add cross-checks to the protocol, as appropriate.

STATISTIC: Calculate the percent difference for each cross-check.
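
The DQA protocol does not spell out the percent-difference formula here, so the sketch below assumes one common convention (difference relative to the cross-check source); the treatment-card and pharmacy-register figures are invented.

```python
def percent_difference(verified_total, cross_check_total):
    """One common convention: difference relative to the cross-check source,
    expressed as a percentage (the DQA protocol may define this differently)."""
    return 100 * (verified_total - cross_check_total) / cross_check_total

# Example: 250 patients recounted from treatment cards vs. 240 doses in the pharmacy register
print(round(percent_difference(250, 240), 1))  # 4.2
```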



5. SPOT CHECKS – Spot-checks to verify the actual delivery of services and/or commodities
can also be done, time and resources permitting. Spot-checks entail selecting a number of
patients/clients (e.g., three to five) from source documents and verifying that they actually
received the services and/or commodities recorded. Spot-checks can be performed in two
ways: (1) the Audit Team obtains the names and addresses of people in the community and
makes an effort to locate them; or (2) the Audit Team requests representatives of the site to
contact the people and ask them to come to the Service Delivery Site (for example the next
day). For reasons of confidentiality, spot-checks will not be possible for indicators related to
some medical services, such as ART treatment for HIV.

As noted above, while the five data verification steps of the DQA Protocol 2: Data Verification Protocol should not change,5 the protocol can be modified within each verification step to better fit the program context (e.g., adding cross-checks or modifying the reference source document). Major modifications should be discussed with the Organization Commissioning the DQA.

Note: In no circumstances should the Audit Team record personal information, photocopy, or
remove documents from sites.

5 That is: 1. description, 2. documentation review, 3. trace and verification, 4. cross-checks, 5. spot-checks.



PHASE 5: M&E UNIT

In the fifth phase of the DQA, the Audit Team will return to the program/project M&E Unit. The steps in PHASE 5 are to:

12. Complete the assessment of the data management and reporting system by answering the 13 overarching summary audit questions.
13. Develop preliminary audit findings and Recommendation Notes.
14. Communicate the preliminary findings and recommendations to the program/project’s M&E officers and senior management during an audit closeout meeting.

The steps in PHASE 5 are estimated to take two days.



Step 12. CONSOLIDATE Assessment of Data Management
Systems

Step 12 is performed by the Audit Team.

By the end of Step 10, the Excel file worksheets of the DQA Protocol 1: System Assessment Protocol related
to the M&E Unit, the Intermediate Aggregation Levels, and the Service Delivery Sites will have
been completed. Based on all responses to the questions, a summary table (Step 12 – Table 1) will
be automatically generated, as will a summary graphic of the strengths of the data management and
reporting system (Step 12 – Figure 1). The results generated will be based on the number of “Yes,
completely,” “Partly,” and “No, not at all” responses to the questions on the DQA Protocol 1:
System Assessment Protocol.

Step 12 – Table 1. Summary Table: Assessment of Data Management and Reporting System (Illustration)

Functional areas:
I   – M&E Structure, Functions, and Capabilities
II  – Indicator Definitions and Reporting Guidelines
III – Data-Collection and Reporting Forms/Tools
IV  – Data Management Processes
V   – Links with National Reporting System

                                        I     II    III   IV    V     Average (per site)
M&E Unit
  National M&E Unit                     1.80  1.83  1.80  1.82  1.67  1.78
Intermediate Aggregation Level Sites
  1   Collines                          2.67  2.50  1.67  1.78  2.00  2.12
  2   Atakora                           3.00  2.25  1.33  1.67  2.50  2.15
  3   Borgu                             2.33  2.00  1.67  1.90  2.50  2.08
Service Delivery Points/Organizations
  1.1 Savalou                           2.67  2.00  1.67  1.86  2.00  2.04
  1.2 Tchetti                           2.00  2.25  1.67  2.13  2.00  2.01
  1.3 Djalloukou                        2.67  1.75  1.67  2.00  2.25  2.07
  2.1 Penjari                           2.33  2.00  2.00  1.86  2.50  2.14
  2.2 Ouake                             2.67  2.25  1.67  1.88  2.50  2.19
  2.3 Tanagou                           2.67  2.75  1.67  1.88  2.75  2.34
  3.1 Parakou                           2.33  2.00  2.00  1.86  2.25  2.09
  3.2 Kandi                             2.33  2.25  1.67  2.00  2.25  2.10
  3.3 Kalale                            2.67  2.25  1.67  1.88  2.50  2.19
Average (per functional area)           2.46  2.15  1.76  1.92  2.30  2.12

Color Code Key:
Green   2.5 - 3.0   Yes, Completely
Yellow  1.5 - 2.5   Partly
Red     < 1.5       No, Not at All



Step 12 – Figure 1. Assessment of Data Management and Reporting System (Illustration).

Interpretation of the Output: The scores generated for each functional area on the Service
Delivery Site, Intermediate Aggregation Level, and M&E Unit pages are an average of the
responses which are coded 3 for “Yes, completely,” 2 for “Partly,” and 1 for “No, not at all.”
Responses coded “N/A” or “Not Applicable,” are not factored into the score. The numerical value
of the score is not important; the scores are intended to be compared across functional areas as a means of prioritizing system strengthening activities. That is, the scores are relative to each other
and are most meaningful when comparing the performance of one functional area to another. For
example, if the system scores an average of 2.5 for ‘M&E Structure, Functions and Capabilities’
and 1.5 for ‘Data-collection and Reporting Forms/Tools,’ one would reasonably conclude that
resources would be more efficiently spent strengthening ‘Data-collection and Reporting Forms/
Tools’ rather than ‘M&E Structure, Functions and Capabilities.’ The scores should therefore not
be used exclusively to evaluate the information system. Rather, they should be interpreted within
the context of the interviews, documentation reviews, data verifications, and observations made
during the DQA exercise.
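
The scoring rule described above is easy to express compactly. The sketch below reproduces the 3/2/1 coding and the color bands from the key; treating a score of exactly 2.5 as Green is an assumption, since the key leaves that boundary ambiguous.

```python
# Coding used by the System Assessment Protocol summary:
# "Yes, completely" = 3, "Partly" = 2, "No, not at all" = 1, "N/A" excluded.
CODES = {"Yes, completely": 3, "Partly": 2, "No, not at all": 1}

def functional_area_score(responses):
    """Average of coded responses for one functional area, ignoring N/A."""
    coded = [CODES[r] for r in responses if r in CODES]
    return sum(coded) / len(coded) if coded else None

def color_band(score):
    """Color coding from the key (a score of exactly 2.5 is treated as Green here)."""
    if score >= 2.5:
        return "Green (Yes, Completely)"
    if score >= 1.5:
        return "Yellow (Partly)"
    return "Red (No, Not at All)"

responses = ["Yes, completely", "Partly", "N/A", "Partly", "No, not at all"]
score = functional_area_score(responses)
print(round(score, 2), color_band(score))  # 2.0 Yellow (Partly)
```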

Using these summary statistics, the Audit Team should answer the 13 overarching questions on
the Audit Summary Question Worksheet of the protocol (see Step 12 – Table 2). To answer
these questions, the Audit Team will have the completed DQA Protocol 1: System Assessment
Protocol worksheets for each site and level visited, as well as the summary table and graph of
the findings from the protocol (see Step 12 – Table 1 and Figure 1). Based on these sources of
information, the Audit Team will need to use its judgment to develop an overall response to the
Audit Summary Questions.



Step 12 – Table 2. Summary Audit Questions

13 OVERARCHING SUMMARY AUDIT QUESTIONS

Program Area: ___
Indicator: ___

For each question, record an answer (Yes, completely / Partly / No, not at all / N/A) and any comments.

1. Are key M&E and data-management staff identified with clearly assigned responsibilities?
2. Have the majority of key M&E and data-management staff received the required training?
3. Has the program/project clearly documented (in writing) what is reported to whom, and how and when reporting is required?
4. Are there operational indicator definitions meeting relevant standards that are systematically followed by all service points?
5. Are there standard data collection and reporting forms that are systematically used?
6. Are data recorded with sufficient precision/detail to measure relevant indicators?
7. Are data maintained in accordance with international or national confidentiality guidelines?
8. Are source documents kept and made available in accordance with a written policy?
9. Does clear documentation of collection, aggregation, and manipulation steps exist?
10. Are data quality challenges identified and are mechanisms in place for addressing them?
11. Are there clearly defined and followed procedures to identify and reconcile discrepancies in reports?
12. Are there clearly defined and followed procedures to periodically verify source data?
13. Does the data collection and reporting system of the program/project link to the National Reporting System?



Step 13. DRAFT PRELIMINARY FINDINGS AND
RECOMMENDATION NOTES

Step 13 is performed by the Audit Team.

By Step 12, the Audit Team will have completed both the system assessment and data verification
protocols on selected indicators. In preparation for its close-out meeting with the M&E Unit, in
Step 13 the Audit Team drafts Preliminary Findings and Recommendation Notes for data quality issues found during the audit. Annex 3, Step 13 – Template 1 provides a format for those
Recommendation Notes. These findings and issues are presented to the program/project M&E
Unit (Step 14) and form the basis for the Audit Report (Steps 15 and 17). The Audit Team should
also send a copy of the Preliminary Findings and Recommendation Notes to the Organization
Commissioning the DQA.

The preliminary findings and Recommendation Notes will be based on the results from the DQA
Protocol 1: System Assessment Protocol and the DQA Protocol 2: Data Verification Protocol
and will be developed by the Audit Team based on:
• The notes columns of the protocols in which the Audit Team has explained findings
related to: (1) the assessment of the data-management and reporting system; and (2) the
verification of a sample of data reported through the system. In each protocol, the final
column requests a check (√) for any finding that requires a Recommendation Note.
• Work papers further documenting evidence of the Audit Team’s data quality audit
findings.

The findings should stress the positive aspects of the program/project M&E system as it relates
to data management and reporting as well as any weaknesses identified by the Audit Team. It is
important to emphasize that a finding does not necessarily mean that the program/project is deficient
in its data collection system design or implementation. The program/project may have in place a
number of innovative controls and effective steps to ensure that data are collected consistently and
reliably.

Nevertheless, the purpose of the Data Quality Audit is to improve data quality. Thus, as the Audit
Team completes its data management system and data verification reviews, it should clearly
identify evidence and findings that indicate the need for improvements to strengthen the design
and implementation of the M&E system. All findings should be backed by documentary evidence
that the Audit Team can cite and provide along with its Recommendation Notes.

Examples of findings related to the design and implementation of data collection, reporting and
management systems include:
• The lack of documentation describing aggregation and data manipulation steps.
• Unclear and/or inconsistent directions provided to reporting sites about when or to whom
report data is to be submitted.
• The lack of designated staff to review and question submitted site reports.



• The lack of a formal process to address incomplete or inaccurate submitted site reports.
• The lack of a required training program for site data collectors and managers.
• Differences between program indicator definitions and the definition as cited on the data
collection forms.
• The lack of standard data collection forms.

Examples of findings related to verification of data produced by the system could include:
• A disconnect between the delivery of services and the filling out of source documents.
• Incomplete or inaccurate source documents.
• Data entry and/or data manipulation errors.
• Misinterpretation or inaccurate application of the indicator definition.

Draft Recommendation Note(s)


In the Recommendation Notes, the Audit Team should cite the evidence found that indicates a
threat to data quality. The team should also provide one or more recommended actions to prevent
recurrence. The Audit Team may propose a deadline for the recommended actions to be completed
and seek concurrence from the program/project and the Organization Commissioning the DQA.
Step 13 – Table 1 provides an example of the content of Recommendation Notes.

Step 13 – Table 1. Illustrative Findings and Recommendations for Country X’s TB Treatment Program:
Number of Smear Positive TB Cases Registered Under DOTS Who Are Successfully Treated

Country X runs an organized and long-established TB treatment program based on international treatment
standards and protocols. The processes and requirements for reporting results of the TB program are
specifically identified and prescribed in its Manual of the National Tuberculosis and Leprosy Programme.
The Manual identifies required forms and reporting requirements by service sites, districts, and regions.
Based on information gathered through interviews with key officials and a documentation review, the
Data Quality Audit Team identified the following related to improving data quality.

Findings and Recommendations for the M&E Unit


1) M&E Training
• FINDING: The Audit Team found a lack of a systematic and documented data management
training plan that identifies training requirements, including necessary data management skills
for all levels of the program from health care workers at Service Delivery Sites to district
coordinators, regional staffers, and M&E Unit data managers. Currently, training is initiated,
implemented, and paid for by different offices at multiple levels throughout the TB program.
RECOMMENDATION: That the National TB M&E Unit develop a plan to coordinate available
training resources and identify training needs throughout the system including those needed to
efficiently achieve data management requirements.



2) Supervisory checks of District Reports
• FINDING: The lack of supervisory checks of the files used to store submitted quarterly reports
from district offices can lead to potential aggregation errors. For example, the Audit Team’s
verification exercise identified duplicate, out-of-date, and annual rather than quarterly reports in
these files that could easily lead to data entry errors.
RECOMMENDATION: That a program management supervisor regularly review the files used
to store regional reports after they are submitted, but before data entry occurs to help reduce the
possibility of errors.

• FINDING: Approximately 2% of the submitted regional reports to the MOH lacked


supervisory signatures. This signature is required to document that the report was reviewed for
completeness and obvious mistakes.

RECOMMENDATION: That the MOH reinforce its requirement that submitted reports contain a
supervisory signature, perhaps by initially rejecting reports that have not been reviewed.

3) Policy on Retention of Source Documents


• FINDING: The TB program has no policy regarding the retention of reporting documents
including patient treatment cards, registers, and related reports. While the documents are
routinely retained for years, good data management requires that a specific document retention
policy be developed.

RECOMMENDATION: That the program office develop a specific document retention policy for
TB program source and key reporting documents in its new reporting system.

Findings and Recommendations for the Intermediate Aggregation Level Sites

4) Quality Control in Data Entry


• FINDING: The Audit Team found that limited measures are taken to eliminate the possibility
of data entry errors at the district level. While there are checks in the reporting software to
identify out-of-range entries, the district staff could not describe any other steps taken to
eliminate data entry errors.

RECOMMENDATION: That the program identify steps to eliminate data entry errors wherever
report numbers are entered into the electronic reporting system.

Findings and Recommendations for the Service Delivery Sites

5) Ability to Retrieve Source Documents


• FINDING: At all service sites, the Audit Team had difficulty completing the data verification
exercise because the site staff found it difficult or were unable to retrieve source documents (e.g., the TB patient treatment cards for patients who had completed treatment). If such
verification cannot be performed, a Data Quality Audit team cannot confirm that the reported
treatment numbers are accurate and valid.

RECOMMENDATION: That TB Service Delivery Sites should systematically file and store TB
treatment source documents by specific reporting periods so that they can be readily retrieved for
audit purposes.



Step 14. CONDUCT A CLOSEOUT MEETING

Step 14 is performed by the Audit Team.

At the conclusion of the site visits, the Audit Team Leader should conduct a closeout meeting with
senior program/project M&E officials and the Director/Program Manager to:
1. Share the results of the data-verifications (recounting exercise) and system review;
2. Present the preliminary findings and Recommendation Notes; and
3. Discuss potential steps to improve data quality.

A face-to-face closeout meeting gives the program/project’s data management staff the opportunity
to discuss the feasibility of potential improvements and related timeframes. The Audit Team Leader
should stress, however, that the audit findings at this point are preliminary and subject to change
once the Audit Team has had a better opportunity to review and reflect on the evidence collected
on the protocols and in its work papers.

The Audit Team should encourage the program/project to share relevant findings with the appropriate
stakeholders at the country level, such as multi-partner M&E working groups and the National program. The Audit Team should also discuss how the findings will be shared
by the program/project M&E officials with the audited Service Delivery Sites and Intermediate
Aggregation Levels (e.g., Regions, Districts).

As always, the closeout meeting and any agreements reached on the identification of findings
and related improvements should be documented in the Audit Team’s work papers in order to be
reflected in the Final Audit Report.



PHASE 6: COMPLETION

The last phase of the DQA takes place at the offices of the DQA Team, and in face-to-face or phone meetings with the Organization Commissioning the DQA and the program/project. The steps in PHASE 6 are to:

15. Draft the Audit Report.
16. Discuss the Draft Audit Report with the program/project and with the Organization Commissioning the DQA.
17. Complete the Final Audit Report and communicate the findings, including the final Recommendation Note(s), to the program/project and the Organization Commissioning the DQA.
18. As appropriate, initiate follow-up procedures to ensure that agreed-upon changes are made.

The steps in PHASE 6 are estimated to take between two and four weeks.



Step 15. DRAFT AUDIT REPORT

Step 15 is performed by the Audit Team.

Within one to two weeks, the Audit Team should complete its review of all the audit documentation
produced during the mission and prepare a draft Audit Report containing all findings and suggested
improvements. Any major changes in the audit findings made after the in-country closeout meeting
should be clearly communicated to the program/project officials. The draft Audit Report
will be sent to the program/project management staff and to the Organization Commissioning the
DQA. Step 15 – Table 1 shows the suggested outline for the Audit Report.

Step 15 – Table 1: Suggested Outline for the Final Data Quality Audit Report

Section | Contents
I | Executive Summary
II | Introduction and Background
• Purpose of the DQA
• Background on the program/project
• Indicators and Reporting Period – Rationale for selection
• Service Delivery Sites – Rationale for selection
• Description of the data-collection and reporting system (related to the indicators audited)
III | Assessment of the Data Management and Reporting System
• Description of the performed system assessment steps
• Dashboard summary statistics (table and spider graph of functional areas – Step 12: Table 1 and Figure 1)
• Key findings at the three levels:
  - Service Delivery Sites
  - Intermediate Aggregation Levels
  - M&E Unit
• Overall strengths and weaknesses of the Data-Management System (based on 13 Summary Audit Questions)
IV | Verification of Reported Data
• Description of the performed data-verification steps
• Data Accuracy – Verification Factor
• Precision and confidentiality of reported data
• Availability, completeness, and timeliness of reports
• Key findings at the three levels:
  - Service Delivery Sites
  - Intermediate Aggregation Levels
  - M&E Unit
• Overall assessment of Data Quality
V | Recommendation Notes and Suggested Improvements
VI | Final Data Quality Classification (if required by the Organization Commissioning the DQA)
VII | Country Response to DQA Findings

Step 16. COLLECT AND REVIEW FEEDBACK FROM COUNTRY AND ORGANIZATION
COMMISSIONING THE DQA

Step 16 is performed by the Audit Team.

To build consensus and facilitate data quality improvements, the Audit Team needs to share the
draft Audit Report with the Organization Commissioning the DQA and with the program/project
management and M&E staff. The program/project will be given an opportunity to provide
a response to the audit findings. This response will need to be included in the Final Audit
Report.

Step 17. FINALIZE AUDIT REPORT

Step 17 is performed by the Audit Team.

Once the program/project and the Organization Commissioning the DQA have reviewed the Draft
Audit Report (within a time limit of two weeks, unless a different period has been agreed) and
provided feedback, the Audit Team will complete the Final Audit Report. While the Audit Team
should elicit feedback, it is important to note that the content of the Final Audit Report is
determined exclusively by the Audit Team.



Step 18. INITIATE FOLLOW-UP OF RECOMMENDED ACTIONS

Step 18 can be performed by the Organization Commissioning the DQA and/or the Audit Team.

The program/project will be expected to send follow-up correspondence once the agreed-upon
changes/improvements have been made. If the Organization Commissioning the DQA wants the
Audit Team to be involved in the follow-up of identified strengthening measures, an appropriate
agreement may be reached. The Organization Commissioning the DQA and/or the Audit Team
should maintain a "reminder" file indicating when these notifications are due (see ANNEX
3, Step 19 – Template 1). In general, minor data quality issues should be remedied within one to
six months and major issues within six to twelve months.



Annexes



ANNEX 1: DQA Protocols
Protocol 1: System Assessment Protocol
Protocol 2: Data Verification Protocol



Protocol 1 – System Assessment Protocol (AIDS and Malaria)

LIST OF ALL QUESTIONS – For reference only (Protocol 1 - System's Assessment)

Each question is followed, in brackets, by the checkmarks (√) for the reporting-system levels at
which it is asked (M&E Unit, Intermediate Aggregation Levels, and/or Service Points) and by
whether supporting documentation is required (Yes or -).

I – M&E Structure, Functions, and Capabilities
1. There is a documented organizational structure/chart that clearly identifies positions that have data management responsibilities at the M&E Unit. [√ | Yes]
2. All staff positions dedicated to M&E and data management systems are filled. [√ | -]
3. There is a training plan which includes staff involved in data-collection and reporting at all levels in the reporting process. [√ | Yes]
4. All relevant staff have received training on the data management processes and tools. [√ √ √ | -]
5. A senior staff member (e.g., the Program Manager) is responsible for reviewing the aggregated numbers prior to the submission/release of reports from the M&E Unit. [√ | -]
6. There are designated staff responsible for reviewing the quality of data (i.e., accuracy, completeness and timeliness) received from sub-reporting levels (e.g., regions, districts, service points). [√ √ | -]
7. There are designated staff responsible for reviewing aggregated numbers prior to submission to the next level (e.g., to districts, to regional offices, to the central M&E Unit). [√ √ | -]
8. The responsibility for recording the delivery of services on source documents is clearly assigned to the relevant staff. [√ | -]

II – Indicator Definitions and Reporting Guidelines
9. The M&E Unit has documented and shared the definition of the indicator(s) with all relevant levels of the reporting system (e.g., regions, districts, service points). [√ | Yes]
10. There is a description of the services that are related to each indicator measured by the program/project. [√ | Yes]
The M&E Unit has provided written guidelines to each sub-reporting level on …
11. … what they are supposed to report on. [√ √ √ | Yes]
12. … how (e.g., in what specific format) reports are to be submitted. [√ √ √ | Yes]
13. … to whom the reports should be submitted. [√ √ √ | Yes]
14. … when the reports are due. [√ √ √ | Yes]
15. There is a written policy that states for how long source documents and reporting forms need to be retained. [√ | Yes]



III – Data-collection and Reporting Forms/Tools
16. The M&E Unit has identified a standard source document (e.g., medical record, client intake form, register, etc.) to be used by all Service Delivery Points to record service delivery. [√ | Yes]
17. The M&E Unit has identified standard reporting forms/tools to be used by all reporting levels. [√ | Yes]
18. Clear instructions have been provided by the M&E Unit on how to complete the data collection and reporting forms/tools. [√ √ √ | Yes]
19. The source documents and reporting forms/tools specified by the M&E Unit are consistently used by all reporting levels. [√ √ | -]
20. If multiple organizations are implementing activities under the program/project, they all use the same reporting forms and report according to the same reporting timelines. [√ √ √ | -]
21. The data collected by the M&E system has sufficient precision to measure the indicator(s) (i.e., relevant data are collected by sex, age, etc., if the indicator specifies disaggregation by these characteristics). [√ | -]
22. All source documents and reporting forms relevant for measuring the indicator(s) are available for auditing purposes (including dated print-outs in case of computerized system). [√ √ √ | -]

IV – Data Management Processes
23. The M&E Unit has clearly documented data aggregation, analysis and/or manipulation steps performed at each level of the reporting system. [√ | Yes]
24. There is a written procedure to address late, incomplete, inaccurate, and missing reports; including following-up with sub-reporting levels on data quality issues. [√ √ | Yes]
25. If data discrepancies have been uncovered in reports from sub-reporting levels, the M&E Unit or the Intermediate Aggregation Levels (e.g., districts or regions) have documented how these inconsistencies have been resolved. [√ √ | -]
26. Feedback is systematically provided to all sub-reporting levels on the quality of their reporting (i.e., accuracy, completeness, and timeliness). [√ √ | -]
27. There are quality controls in place for when data from paper-based forms are entered into a computer (e.g., double entry, post-data entry verification, etc). [√ √ √ | -]



28. For automated (computerized) systems, there is a clearly documented and actively implemented database administration procedure in place. This includes backup/recovery procedures, security administration, and user administration. [√ √ √ | Yes]
29. There is a written back-up procedure for when data entry or data processing is computerized. [√ √ √ | Yes]
30. If yes, the latest date of back-up is appropriate given the frequency of update of the computerized system (e.g., backups are weekly or monthly). [√ √ √ | -]
31. Relevant personal data are maintained according to national or international confidentiality guidelines. [√ √ √ | -]
The reporting system avoids double counting people …
32. … within each point of service/organization (e.g., a person receiving the same service twice in a reporting period, a person registered as receiving the same service in two different locations, etc). [√ √ √ | -]
33. … across service points/organizations (e.g., a person registered as receiving the same service in two different service points/organizations, etc). [√ √ √ | -]
34. The reporting system enables the identification and recording of a “drop out,” a person “lost to follow-up,” and a person who died. [√ √ √ | -]
35. The M&E Unit can demonstrate that regular supervisory site visits have taken place and that data quality has been reviewed. [√ | Yes]

V – Links with National Reporting System
36. When available, the relevant national forms/tools are used for data-collection and reporting. [√ √ √ | Yes]
37. When applicable, data are reported through a single channel of the national information systems. [√ √ √ | -]
38. Reporting deadlines are harmonized with the relevant timelines of the National program (e.g., cut-off dates for monthly reporting). [√ √ √ | -]
39. The service sites are identified using ID numbers that follow a national system. [√ √ √ | -]



Protocol 2 – Data Verification Protocol (Illustration – Community-based Interventions)

ANNEX 2: Templates for the Organization
Commissioning the DQA



Annex 2 – Step 1. Template 1. Illustrative Table for Ranking Countries by Investment and Results Reported

Disease: AIDS
Countries (or programs/projects), ranked by Dollar Invested | Dollar Investment | Ranking of Results Reported (reported value in parentheses) for: Program Area Treatment – Indicator 1: Number of People on ARV; Program Area Behavioral Change Communication – Indicator 2: Number of Condoms Distributed; Program Area OVC – Indicator 3: Number of OVC Receiving Care and Support | Notes/Comments

Country X | $66 Million | Indicator 1: 2 (6,500) | Indicator 2: 4 (3 million) | Indicator 3: 8 (1,879) |
Country Y | $52 Million | Indicator 1: 1 (7,000) | Indicator 2: NA | Indicator 3: 10 (1,254) |

Annex 2 – Step 1. Template 2. Illustrative Analysis of the Relative Magnitude of the Investments and Indicator Results per Program Area

Program/Project: _____________

Program Area | $ Invested in the Program Area | % of Total Invested in the Program/Project | Key Indicator in the Program Area | Target or Reported Result for the Indicator | % of Targets or Results Reported in the Country | Notes/Comments

ART Treatment | $2,000,000 | 80% | Nb. of people on ART | 20,000 | 80% |

Annex 2 – Step 1. Template 3. Documentation of the Selection of the Country, Disease/Health Area, Program/Project(s), Program Area and Indicators

Country | Disease/Health Area | Program/Project | Program Area | Indicator(s) | Reporting Period | Criteria Used for Selection of Indicator and Reporting Period | Persons/Entities Involved in Audit Determination

Annex 2 – Step 2. Template 1. Notification and Documentation Request Letter to the Selected
Program/Project

Date
Address
Dear__________________:

[Your organization] has been selected for a Data Quality Audit by [name of Organization
Commissioning the Audit] related to [Program/Project name].

The purpose of this audit is to: (1) assess the ability of the data management systems of the program/
project(s) you are managing to report quality data; and (2) verify the quality of reported data for
key indicators at selected sites. [Name of Audit Agency] will be conducting the audit and will
contact you soon regarding the audit.

This Data Quality Audit relates to [disease], [program area] and the verifications will focus on the
following indicators:
1 [indicator name]
2 [indicator name]

The audit will:


1. Assess the design of the data management and reporting systems;
2. Check at selected Service Delivery Sites and intermediary aggregation levels (e.g., districts,
regions) if the system is being implemented as designed;
3. Trace and verify past reported numbers for a limited number of indicators at a few sites; and
4. Communicate the audit’s findings and suggested improvements in a formal Audit Report.

Prior to the audit taking place, [list name of Audit Agency] will need:
• A list of all the Service Delivery Sites with the latest reported results (for the above
indicators);
• The completed Template 2 (attached to this letter) describing the data-collection and
reporting system (related to the above indicators);
• Data-collection and reporting forms (related to the above indicators).

This information is critical for beginning the audit; it is therefore requested within two weeks of
receipt of this letter and should be sent to [address of Audit Agency].

To help the Audit Team perform the initial phase of the review of your overall data management
system and to limit the team’s on-site presence to the extent possible, we also request that you
provide the Audit Agency with the existing and available documentation listed in Table 1 (attached
to this letter).

Thank you for submitting the requested documentation to ___________ at ______ by _________.
If any of the documentation is available in electronic form it can be e-mailed to _____________.



Following a desk review of the information and documentation provided, the Audit Agency will
pursue the audit at the office that serves as the M&E management unit for the program/project and
at a small number of your reporting sites and intermediary data management offices (e.g., district
or regional offices). To facilitate site visits, we request that two staff members who are responsible
for M&E, or who receive, review and/or compile reports from reporting entities, accompany the
Audit Team to the sites for the duration of the audit.

Because the time required for the audit depends on the number and location of sampled sites, the
Audit Agency will contact you with more specific information regarding timing after the sample
of sites has been selected. However, you should anticipate that the audit will last between 10 and
15 days (including two days at the M&E Unit and around one day per Service Delivery Site and
Intermediate Aggregation Level — e.g., Districts or Regions).

Finally, since the Audit Team will need to obtain and review source documents (e.g., client records
or registration logs/ledger), it is important that official authorization be granted to access these
documents. However, we would like to assure you that no details related to individuals will be
recorded as part of the audit — the team will only seek to verify that the counts from “source
documents” related to the service or activity are correct for the reporting period. The personal
records will neither be removed from the site nor photocopied.

We would like to emphasize that we will make every effort to limit the impact our audit will have
on your staff and ongoing activities. In that regard, it would be very helpful if you could provide
the Audit Agency with a key contact person early on in this process (your chief data management
official, if possible) so we can limit our communications to the appropriate person. If you have any
questions please contact ___________ at ____________.

Sincerely,

cc: Government Auditing Agency


Donor/Development Partners and Implementing Partners
Other, as appropriate for the country and audit



Table 1 – List of Audit Functional Areas and Documentation to Request from Program/Project for Desk Review (if available)

Functional Areas | General Documentation Requested | Check if provided

Contact Information
• Names and contact information for key program/project officials, including key staff responsible for data management activities.

I – M&E Structures, Roles and Capabilities
• Organizational chart depicting M&E responsibilities.
• List of M&E positions and status (e.g., full time or part time, filled or vacant).
• M&E Training plan, if one exists.

II – Indicator Definitions and Reporting Guidelines
• Instructions to reporting sites on reporting requirements and deadlines.
• Description of how service delivery is recorded on source documents, and on other documents such as clinic registers and periodic site reports.
• Detailed diagram of how data flows:
  - from Service Delivery Sites to Intermediate Aggregation Levels (e.g. district offices, provincial offices, etc.);
  - from Intermediate Aggregation Levels (if any) to the M&E Unit.
• National M&E Plan, if one exists.
• Operational definitions of indicators being audited.

III – Data collection and Reporting Forms and Tools
• Data-collection form(s) for the indicator(s) being audited.
• Reporting form(s) for the indicator(s) being audited.
• Instructions for completing the data-collection and reporting forms.

IV – Data Management Processes
• Written documentation of data management processes including a description of all data-verification, aggregation, and manipulation steps performed at each level of the reporting system.
• Written procedures for addressing specific data quality challenges (e.g. double-counting, “lost to follow-up”), including instructions sent to reporting sites.
• Guidelines and schedules for routine supervisory site visits.

V – Links with National Reporting System
• Documented links between the program/project data reporting system and the relevant national data reporting system.



Annex 2 – Step 2. Template 2. Description of the Data-Collection and Reporting System

Please complete this template form for each indicator being verified by the Data Quality Audit (DQA)

Indicator Name
Indicator Definition

1. Is there a designated person responsible for data management and analysis at the M&E
Management Unit at Central Level?   Yes / No

1.1. If “Yes,” please give the name and e-mail address of the contact person: Name
e-mail

RECORDING OF SERVICE DELIVERY ON SOURCE DOCUMENTS (at Service Delivery Points)

2. Is there a standardized national form that all Service Delivery Points use to record the
delivery of the service to target populations?   Yes / No

2.1. If “No,” how many different forms are being used by the Service Delivery Points? Number

3. What is the name of the form(s) used by the Service Delivery Points?
Name of the Form(s)

4. What are the key fields in the form that are relevant for the indicator? Field 1
Field 2
Field 3
Field 4
Please add …

REPORTING FROM SERVICE DELIVERY POINTS UP TO THE NATIONAL M&E UNIT (through any intermediary levels – Districts,
Regions, etc.)

5. Please use this table to explain the reporting process in your country. In the first row, provide information about reports which are received
in the central office. Show where those reports come from, how many you expect for each reporting period, and how many times per year
you receive these reports.

Reports received by: | Sender | Number of senders (i.e. if reports are sent by districts, put the number of districts here) | Number of times reports are received each year (i.e. quarterly = 4 times)

6. What is the lowest level for which you have data at the M&E Management Unit at Central Level?

Individual patients / Health facilities / Districts / Region / Other … [please specify]

7. At what level is data first computerized (i.e., entered in a computer)?

Health facilities / Districts / Region / National / Other … [please specify]

8. Please provide any other comments (if applicable).

Finally, please attach the templates of the (1) source document; and (2) reports received by each level.

Annex 2 – Step 2. Template 3. Letter to Request National Authorization for the DQA

Date

Address of National Authorizing Agency for Data Quality Audit

Dear__________________:

As part of its ongoing oversight activities, [name of Organization Commissioning the Audit] has selected
[program/project(s)] in [country] for a Data Quality Audit. Subject to approval, the Data Quality Audit will
take place between [month] and [month], [Year].

The purpose of this Data Quality Audit is to assess the ability of the program’s data management system
to report quality data and to trace and verify reported results from selected service sites related to the
following indicators:

1 [indicator name]
2 [indicator name]

[Name of auditing firm] has been selected by [name of Organization Commissioning the Audit] to carry out
the Data Quality Audit.

Conducting this Data Quality Audit may require access to data reported through the national data reporting
system on [Disease and Program Area]. The audit will include recounting data reported within selected
reporting periods, including obtaining and reviewing source documents (e.g. client records or registration
logs/ledgers, training log sheets, commodity distribution sheets). While the Audit Team will potentially
require access to personal patient information, the Team will hold such information in strict confidence and
no audit documentation will contain or disclose such personal information. The purpose of access to such
information is strictly for counting and cross-checking purposes related to the audit. When necessary, the
Audit Team will need to access and use such information at Service Delivery Sites. The personal records
will neither be removed from the site nor photocopied.

If you have any questions about this Data Quality Audit, please contact ______ at ________.

[Name of Organization Commissioning the Audit] hereby formally requests approval to conduct this Data
Quality Audit.

Please indicate approved or not approved below (with reasons for non-approval) and return this letter to
______________________ at ________________________.

Approved/Not approved (please circle one)

Sincerely, Date:

Title

cc: Program/project Director, Donor/Development Partners and Implementing Partners, Other, as
appropriate for the Audit.



ANNEX 3: Templates for the Audit Agency and Team



Annex 3, Step 2 – Template 1. Information Sheet for the M&E Unit Involved in the DQA
1. Objective of the DQA
The objectives of the Data Quality Audit are to:
• Verify that appropriate data management systems are in place; and
• Verify the quality of reported data for key indicators at selected sites.

2. Program Areas Included in the Audit


- to be completed by Audit Team -
3. Tasks Performed by the Audit Team at the M&E Unit
• Interview Program Manager and staff involved in M&E and data-management.
• Review availability, completeness, and timeliness of reports received from reporting sites.
• Re-count numbers from received reports and compare result to the numbers reported by the M&E Unit.
4. Staff to Be Available at the M&E Unit during the DQA
• Program Manager.
• Chief Data-management Official.
• Staff involved in reviewing and compiling reports received from reporting sites.
• IT staff involved in database management, if applicable.
• Relevant staff from partner organizations working on M&E systems strengthening, if applicable.

5. Documentation to Prepare in Advance of Arrival of Audit Team
• Reported results by the M&E Unit for the selected reporting period (see Point 3 above).
• Access to the site summary reports submitted for the period (see Point 3 above).
• Organizational chart depicting M&E responsibilities.
• List of M&E positions and status (e.g., full time or part time, filled or vacant).
• M&E Training Plan, if one exists.
• Instructions to reporting sites on reporting requirements and deadlines.
• Description of how service delivery is recorded on source documents, and on other documents such as clinic registers and periodic site reports.
• Detailed diagram of how data flows from Service Delivery Sites to the M&E Unit.
• National M&E Plan, if one exists.
• Operational definitions of indicators being audited (see Point 2 above).
• Template data-collection and reporting form(s) for the indicator(s) being audited (with the instructions).
• Written documentation of data-management processes including a description of all data-verification, aggregation, and manipulation steps performed at each level of the reporting system.
• Written procedures for addressing specific data quality challenges (e.g., double-counting, “lost to follow-up”), including instructions sent to reporting sites.
• Guidelines and schedules for routine supervisory site visits.



6. Expected Time of Audit Team at the M&E Unit
To be completed by Audit Team
[Guideline: two days – one day at the beginning and one day at the end of the DQA]

WARNING: In no circumstances should reports be fabricated for the purpose of the audit.

Annex 3, Step 2 – Template 2. Information Sheet for the Intermediate Aggregation Levels
Selected for the DQA

1. Objective of the DQA


The objectives of the Data Quality Audit are to:
• Verify that appropriate data management systems are in place; and
• Verify the quality of reported data for key indicators at selected sites.

2. Program Areas Included in the Audit

- to be completed by Audit Team -

3. Tasks Performed by the Audit Team at the Intermediate Aggregation Level


• Interview Site Manager and staff involved in data-management and compilation.
• Review availability, completeness, and timeliness of reports received from reporting sites.
• Re-count numbers from received reports and compare result to the numbers reported to the next level.

4. Staff to Be Available at the Intermediate Aggregation Level during the DQA


• Site Manager.
• Staff involved in reviewing and compiling reports received from reporting sites.
• IT staff involved in database management, if applicable.

5. Documentation to Prepare in Advance of Arrival of Audit Team


• Reported results to the next level for the selected reporting period (see Point 3 above).
• Access to the site summary reports submitted for the period (see Point 3 above).
• Description of aggregation and/or manipulation steps performed on data submitted by reporting sites.

6. Expected Time of Audit Team at the Intermediate Aggregation Level


To be completed by Audit Team
[Guideline: between one-half and one day at each Intermediate Aggregation Level Site]

WARNING: In no circumstances should reports be fabricated for the purpose of the audit.



Annex 3, Step 2 – Template 3. Information Sheet for all Service Delivery Sites Selected for the DQA

1. Objective of the DQA


The objectives of the Data Quality Audit are to:
• Verify that appropriate data management systems are in place; and
• Verify the quality of reported data for key indicators at selected sites.

2. Program Areas Included in the Audit


- to be completed by Audit Team -
3. Tasks Performed by the Audit Team at the Service Delivery Site

• Interview Site Manager and staff involved in data-collection and compilation.
• Understand how and when source documents are completed in relation to the delivery of services.
• Review availability and completeness of all source documents for the selected reporting period.
• Recount the recorded numbers from available source documents and compare result to the numbers reported by the site.
• Compare reported numbers with other data sources (e.g., inventory records, laboratory reports, etc.).
• Verify the actual delivery of services and/or commodities to the target populations (if feasible).

4. Staff to Be Available at the Service Delivery Site during the DQA

• Site Manager.
• Staff responsible for completing the source documents (e.g., patient treatment cards, clinic registers, etc.).
• Staff responsible for entering data in registers or computing systems (as appropriate).
• Staff responsible for compiling the periodic reports (e.g., monthly, quarterly, etc.).

5. Documentation to Prepare in Advance of Arrival of Audit Team

• Reported results to the next level for the selected reporting period (see Point 3 above).
• All source documents for the selected reporting period, including source documents from auxiliary/peripheral/satellite sites (see Point 3 above).
• Description of aggregation and/or manipulation steps performed on data submitted to the next level.

6. Expected Time of Audit Team at the Service Delivery Site


To be completed by Audit Team
[Guideline: between one-half and two days (i.e., more than one day may be required for large sites with
reported numbers in the several hundreds or sites that include satellite centers or when “spot-checks”
are performed).]

WARNING: In no circumstances should source documents or reports be fabricated for the purpose of the
audit.



Annex 3, Step 4 – Template 4. Checklist for Audit Team Preparation for Audit Site Visits

No. | Item | Check when completed (√)
1 | Letter of authorization |
2 | Guidelines for implementation |
3 | DQA Protocol 1: System Assessment Protocol (paper copy of all relevant worksheets and computer file) |
4 | DQA Protocol 2: Data Verification Protocol(s) (paper copy of all relevant worksheets and computer file) |
5 | List of sites and contacts |
6 | Confirmed schedule of site visits |
7 | Laptop computer (at least one per sub-team) |
8 | Plan for logistical support for the audit |
9 | Relevant documentation provided by program/project for the desk review |
10 | Other |



Annex 3, Step 5 - Template 1. Format for Recording Notes of Interviews/Meetings with Key
M&E Managers and Staff

Name and Address of Program/Project:

Contract Number (if relevant):

Name of Person(s) Interviewed:

Auditor: Interview Date:

Program Area: Relevant Indicator(s):

Work Paper Reference or Index Number:

Purpose of the Interview:

Narrative Description of Discussions:

Auditor Signature: Date:



Annex 3, Step 13 - Template 1. Data Quality Audit Recommendation Note

Name and Address of Program/Project:

Contract Number (if relevant):

Contact Person:

Auditor: Audit Date:

Location: Relevant Indicator(s):

Classification: Major/Minor Data Quality Dimension:6

Explanation of Findings (including evidence):

Recommended Action for Correction (complete prior to closeout meeting with the program/project):

Notes from Closeout Meeting Discussion with Program/Project:

Final Recommended Action (complete after closeout meeting with the program/project):

Expected Completion Date (if applicable):

Auditor Signature: Date:

6 The data quality dimensions are: Accuracy, reliability, precision, completeness, timeliness, integrity, and confidentiality.



Annex 3, Step 19 - Template 1: Reminder File for M&E Data Quality Strengthening Activities
of Program/Project

Name and Address of Program/Project:

Contract Number (if relevant):

Contact Person:

Auditor: Audit Date:

Program Area: Relevant Indicator(s):

Activity Title and Description | Estimated Date of Completion | Person(s) Responsible | Date Checked | Outcome



ANNEX 4: Site Selection Using Cluster Sampling
Techniques



Instructions for Sampling using Sampling Strategy D – Cluster Sampling Selection:
1. Determine the number of clusters and sites. The Audit Team should work with the Organization
Commissioning the DQA to determine the number of clusters and sites within clusters.
2. More than one intermediate level. In the event there is more than one Intermediate Aggregation
Level (i.e., the data flows from district to region before going to national level), a three-
stage cluster sample should be drawn. That is, two regions should be sampled and then two
districts sampled from each region.
3. No intermediate level. If the data is reported directly from Service Delivery Sites to the
national level (i.e., no Intermediate Aggregation Sites), the site selection will be conducted as
above (cluster sampling with the district as the primary sampling unit), but the calculation of
the Verification Factor will change. In this case, there is no adjustment for the error occurring
between the district and national level.
4. Prepare the sampling frame. The first step in the selection of clusters for the audit will be to
prepare a sampling frame, or a listing of all districts (or clusters) where the activity is being
conducted (e.g., districts with ART treatment sites). The methodology calls for selecting
clusters proportionate to size, i.e. the volume of service. Often it is helpful to expand the
sampling frame so that each cluster is listed proportionate to the size of the program in the
cluster. For example, if a given cluster is responsible for 15% of the clients served, that
cluster should comprise 15% of the elements in the sampling frame. See the Illustrative
Example Sampling Strategy D (Annex 4, Table 3) for more details. Be careful not to order
the sampling frame in a way that will bias the selection of the clusters. Ordering the clusters
can introduce periodicity; e.g. every 10th cluster is a rural district. Ordering alphabetically is
generally a harmless way of ordering the clusters.
5. Calculate the sampling interval. The sampling interval is obtained by dividing the number of
elements in the sampling frame by the number of elements to be sampled. Using a random
number table (Annex 4, Table 5) or similar method, randomly choose a starting point on the
sampling frame. This is the first sampled district. Then proceed through the sampling frame
selecting districts which coincide with multiples of the sample interval.
6. Randomly select a starting point. Use the random number table in Annex 4, Table 5 to
generate a random starting number. Select a starting point on the table by looking away and
marking a dot on the table with a pencil. Draw a line above the row nearest the dot, and a line
to the left of the column nearest the dot. Moving down and right of your starting point select
the first number read from the table whose last X digits are between 0 and N. (If N is a two
digit number, then X would be 2; if it is a four digit number, X would be 4; etc.).
Example:
N = 300; M = 50; starting point is column 3, row 2 on Random Number Table; read down. You
would select 043 as your starting number.
59468
99699
14043
15013
12600
33122
94169
etc...
7. Select clusters. Move down the ordered and numbered list of clusters and stop at the starting
number. This is the first cluster. Now proceed down the sampling frame a number of elements
equal to the sampling interval. The starting number + sampling interval = 2nd cluster. The
starting number + 2 (sampling interval) = 3rd cluster etc.
8. Stratify Service Delivery Points. Order the Service Delivery Points within each of the sampled
districts by volume of service, i.e. the value of the indicator for the audited reporting period.
Divide the list into strata according to the number of sites to be selected. If possible, select
an equal number of sites from each strata. For example, if you are selecting three sites, create
three strata (small, medium, and large). If selecting two sites, create two strata. For six sites,
create three strata and select two sites per stratum and so on. Divide the range (subtract the
smallest value from the largest) by the number of strata to establish the cut points of the strata.
If the sites are not equally distributed among the strata use your judgment to assign sites to
strata.
9. Select Service Delivery Points. For a large number of sites you can use a random number
table and select sites systematically as above. For a small number of sites, simple random
sampling can be used to select sites within clusters (a brief code sketch following this list
illustrates the selection procedure).
10. Select ‘back up’ sites. If possible, select a back up site for each stratum. Use this site only if
you are unable to visit the originally selected sites due to security concerns or other factors.
Start over with a fresh sampling frame to select this site (excluding the sites already selected).
Do not replace sites based on convenience. The replacement of sites should be discussed
with the Organization Commissioning the DQA if possible.
11. Know your sampling methodology. The sites are intended to be selected for auditing as
randomly (and equitably) as possible while benefiting from the convenience and economy
associated with cluster sampling. You may be asked to explain why a given site has been
selected. Be prepared to describe the sampling methods and explain the equitable selection
of sites.
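
A minimal Python sketch of steps 4 through 10, assuming hypothetical district and site names and service volumes; it builds a frame listed proportionate to size, selects clusters systematically from a random start, and then stratifies Service Delivery Points by volume within a selected cluster.

    import random

    # Hypothetical inputs: service volume per district, and site volumes for one district.
    DISTRICT_VOLUME = {"District A": 300, "District B": 100, "District C": 500, "District D": 600}
    SITE_VOLUME = {"Site 1": 100, "Site 2": 350, "Site 3": 50, "Site 4": 20}

    def select_clusters(volumes, n_clusters):
        """Steps 4-7: expand the frame proportionate to size, then sample systematically."""
        frame = [district for district, volume in volumes.items() for _ in range(volume)]
        interval = len(frame) // n_clusters
        start = random.randint(1, interval)          # random starting point (step 6)
        return [frame[start - 1 + i * interval] for i in range(n_clusters)]

    def split_into_strata(ordered_sites, n_strata):
        """Step 8: split a volume-ordered site list into roughly equal strata."""
        k, m = divmod(len(ordered_sites), n_strata)
        return [ordered_sites[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n_strata)]

    def select_sites(site_volumes, n_sites):
        """Steps 8-9: order sites by volume, then draw one site at random per stratum."""
        ordered = sorted(site_volumes, key=site_volumes.get)
        return [random.choice(stratum) for stratum in split_into_strata(ordered, n_sites) if stratum]

    clusters = select_clusters(DISTRICT_VOLUME, n_clusters=2)
    print("Selected clusters:", clusters)
    print("Selected sites in one cluster:", select_sites(SITE_VOLUME, n_sites=3))

Back-up sites (step 10) can be drawn the same way from a fresh frame that excludes the sites already selected.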

Illustrative Example – Sampling Strategy D: Cluster Sampling Selection

In the following example, Sampling Strategy D (modified two-stage cluster sample) is used to
draw a sample of ART sites in “Our Country” in order to derive an estimate of data quality at
the national level. In a cluster sampling design, the final sample is derived in stages. Each stage
consists of two activities: (1) listing; and (2) sampling. Listing means drawing a complete list of
all the elements from which a number will be selected. Sampling is when a pre-determined number
of elements are chosen at random from the complete listing of elements. A sample is only as good
as the list from which it is derived. The list, also called a sampling frame, is “good” (valid) if it is
comprehensive, i.e. it includes all the known elements that comprise the population of elements.
For ART sites in a country, a good sampling frame means that every single ART site in the country
is properly identified in the list.
1. Illustrative Indicator for this application = Number of Individuals Receiving Anti-Retroviral
Therapy (ART)
2. Audit Objective: to verify the consistency of Our Country’s national reports of ART progress
based on administrative monitoring systems.



3. Sampling Plan: two-stage cluster design is used to select three districts and then to select
three ART sites in each of the selected districts.
4. Sampling Stage 1: (a) list all districts; (b) select three districts.
5. Problem: Listing all districts is inefficient because ART sites may not be located in every
district of Our Country. Therefore, to make sampling of districts more efficient, first find
out which districts have ART sites. In the illustrative grid below (Annex 4, Table 1), the
highlighted cells represent those districts (n=12) in which ART sites are located. These 12
highlighted districts comprise the initial sampling frame.

Annex 4, Table 1. Illustrative Grid Display of All Districts in Our Country

1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
26 27 28 29 30

6. Sampling Frame for Stage 1: The list in Annex 4, Table 2 on the following page is called
a sampling frame. It contains a complete list of districts that are relevant for auditing ART
sites, because only the districts in which ART sites are located are included in the list.
7. The first column of the frame contains a simple numbering scheme beginning with “1” and
ending with the final element in the list, which in this case is 12, because only 12 districts in
“Our Country” contain ART sites.
8. The second column of the frame contains the number of the district that corresponds to the
illustrative grid display shown in the previous table. These were the highlighted cells that
showed which districts contained ART sites. Column 2 (District Number) does not list the
selected districts. Rather, it lists only those districts in “Our Country” where ART sites are
located. The sample of three districts will be drawn from Column 2.
9. The third column shows how many ART sites are located in each district. This is important
because the selection of districts will be proportional to the number of individuals receiving
ART in each district.



Annex 4, Table 2. Sampling Frame for Selection of Districts in Our Country

Sampling Frame (Simple Ascending Number) | District Number | Number of ART Sites per District | Number of Individuals Receiving ART per District
1 1 2 300
2 3 1 100
3 9 2 200
4 12 3 500
5 16 3 500
6 19 1 60
7 20 1 70
8 21 2 300
9 22 1 90
10 26 5 600
11 27 1 80
12 28 2 200
Total 24 3000

10. The next step in this stage of sampling is to use the sampling frame to select the three districts
where the auditors will conduct the audit at specific ART sites. We are attempting to estimate
a parameter (data quality) for all the districts/sites in the country using a select few. Therefore
we would like that the few we select be as ‘typical’ as possible so as to provide an estimate
as close to the actual value as possible. Some districts may contribute more, or less to the
average of data quality in the whole country. Since we are interested in selecting districts
that are representative of all districts with ART sites in the country, and we know that some
districts with ART sites may not be typical (or representative) of all districts with ART sites,
we need to ensure that districts with a high volume of service (which contribute more to the
average data quality of all districts) are included in our sample. Therefore, the sampling
technique will select districts using “probability proportionate to size.”
11. In other words, the chance of a district being selected for the audit depends on the number of
individuals being treated in the district. This information can be found in column 4 of Annex
4, Table 2: “Number of Individuals Receiving ART per District.” Usually this number
corresponds to quarterly reports.
12. One way to link the probability of selection of a district to the volume of service is to inflate
the sampling frame according to the number of individuals receiving ART in each district.
For example, if in District 1 a total of 300 individuals are receiving ART, then District 1
should be listed in the sampling frame 300 times.
13. To make this easier, divide the values in Column 4 (Number of Individuals Receiving ART)
by 10. For example, now District 1 should appear 30 times instead of 300 times. District
3 should appear 10 times instead of 100 times, and so on. This inflated sampling frame is
shown in Table 3 of this section.



14. Using the inflated sampling frame shown in Annex 4, Table 3 we are ready to use systematic
random sampling to select three districts.
15. In systematic random sampling, every kth element in the sampling frame is chosen for
inclusion in the final audit sample. If the list (the sampling frame) contains 1,000 elements
and you want a sample of 100 elements, you will select every 10th element for your sample.
To ensure against bias, the standard approach is to select the first element at random. In this
case, you would randomly select a number between 1 and 10; that number would represent
the first element in your sample. Counting 10 elements beyond that number would represent
the second element in your sample, and so on.
16. In this ART site example, we want to select three districts, and then within each of those three
selected districts we want to select three ART sites. Therefore, our desired sample size is nine
ART sites. It is a two stage sample: the first stage involves listing and sampling districts. The
second stage involves listing and sampling ART sites.
17. Our sampling frame is organized by a Probability Proportionate to Size methodology because
the list is weighted by the number of individuals receiving ART per district. In other words,
we will have a higher probability of selecting a district where a high number of individuals
are receiving ART, because these districts are listed more often (that is what the “inflation” of
the sampling frame accomplished).
18. In systematic random sampling, the sampling interval is calculated by dividing the number of
elements in the sampling frame (300 in the frame shown in Annex 4, Table 3) by the desired
sample size (three districts). So, our sampling interval is 300/3, which equals 100 (the code
sketch following this numbered list reproduces the selection).



Annex 4, Table 3. Sampling Frame for Selection of Districts Based on Probability
Proportionate to Size
# Distr. # Distr. # Distr. # Distr. # Distr. # Distr. # Distr.
1 1 51 9 101 12 151 16 201 21 251 26 301
2 1 52 9 102 12 152 16 202 21 252 26 302
3 1 53 9 103 12 153 16 203 21 253 26 303
4 1 54 9 104 12 154 16 204 22 254 26 304
5 1 55 9 105 12 155 16 205 22 255 26 305
6 1 56 9 106 12 156 16 206 22 256 26 306
7 1 57 9 107 12 157 16 207 22 257 26 307
8 1 58 9 108 12 158 16 208 22 258 26 308
9 1 59 9 109 12 159 16 209 22 259 26 309
10 1 60 9 110 12 160 16 210 22 260 26 310
11 1 61 12 111 16 161 19 211 22 261 26 311
12 1 62 12 112 16 162 19 212 22 262 26 312
13 1 63 12 113 16 163 19 213 26 263 26 313
14 1 64 12 114 16 164 19 214 26 264 26 314
15 1 65 12 115 16 165 19 215 26 265 26 315
16 1 66 12 116 16 166 19 216 26 266 26 316
17 1 67 12 117 16 167 20 217 26 267 26 317
18 1 68 12 118 16 168 20 218 26 268 26 318
19 1 69 12 119 16 169 20 219 26 269 26 319
20 1 70 12 120 16 170 20 220 26 270 26 320
21 1 71 12 121 16 171 20 221 26 271 26 321
22 1 72 12 122 16 172 20 222 26 272 26 322
23 1 73 12 123 16 173 20 223 26 273 27 323
24 1 74 12 124 16 174 21 224 26 274 27 324
25 1 75 12 125 16 175 21 225 26 275 27 325
26 1 76 12 126 16 176 21 226 26 276 27 326
27 1 77 12 127 16 177 21 227 26 277 27 327
28 1 78 12 128 16 178 21 228 26 278 27 328
29 1 79 12 129 16 179 21 229 26 279 27 329
30 1 80 12 130 16 180 21 230 26 280 27 330
31 3 81 12 131 16 181 21 231 26 281 28 331
32 3 82 12 132 16 182 21 232 26 282 28 332
33 3 83 12 133 16 183 21 233 26 283 28 333
34 3 84 12 134 16 184 21 234 26 284 28 334
35 3 85 12 135 16 185 21 235 26 285 28 335
36 3 86 12 136 16 186 21 236 26 286 28 336
37 3 87 12 137 16 187 21 237 26 287 28 337
38 3 88 12 138 16 188 21 238 26 288 28 338
39 3 89 12 139 16 189 21 239 26 289 28 339
40 3 90 12 140 16 190 21 240 26 290 28 340
41 9 91 12 141 16 191 21 241 26 291 28 341
42 9 92 12 142 16 192 21 242 26 292 28 342
43 9 93 12 143 16 193 21 243 26 293 28 343
44 9 94 12 144 16 194 21 244 26 294 28 344
45 9 95 12 145 16 195 21 245 26 295 28 345
46 9 96 12 146 16 196 21 246 26 296 28 346
47 9 97 12 147 16 197 21 247 26 297 28 347
48 9 98 12 148 16 198 21 248 26 298 28 348
49 9 99 12 149 16 199 21 249 26 299 28 349
50 9 100 12 150 16 200 21 250 26 300 28 350



19. Using a random start methodology, let us now select a random number between 1 and 100.
Use the random number table in Annex 4, Table 5 to generate this random number. Select
a starting point on the table by looking away and marking a dot on the table with a pencil.
Draw a line above the row nearest the dot, and a line to the left of the column nearest the dot.
From the starting point (the dot) go down the column to the right of the vertical line until you
arrive at a number less than the sampling interval. This number is your starting point and
first sampled district. In this case the random number equaled 14. This now becomes the first
element selected from the sampling frame, and corresponds to District #1.
20. In a systematic random sample we move systematically down the list based on the sampling
interval. Our calculated sampling interval is 100. Since our random start was 14, the task is
now to move 100 rows down the list to arrive at our next selected district. 14 plus 100 equals
114; this location in our list refers to District #16. This is our next selected district.
21. Moving down the list by our sampling interval (100) from 114 means that our next district is
114 + 100 = 214, which corresponds to District #26. This is our third selected district.
22. Stage 1 of the sampling strategy generated the three districts from which the actual ART sites
to be audited will be drawn in Stage 2.
23. Using the exact same methodology that was used in Stage 1 of this sampling strategy, list all
the ART sites in District 1, District 16, and District 26 (Annex 4, Table 4).

Annex 4, Table 4. The Three Selected Districts and the Listing of ART Sites within District 16

The Districts Selected into the Audit Sample:
District Number | Sites per District | Aggregate Reported Count: Individuals on ART
1 | 2 | 300
16 | 3 | 500
26 | 5 | 600

Illustrative Listing of ART Sites within the Selected Districts (District 16 shown):
District Number | Aggregate Reported Count: Individuals on ART | Site Number | Site-Specific Reported Count
16 | 500 | #1 | 100
16 | 500 | #2 | 350
16 | 500 | #3 | 50
Total |  | 3 sites | 500

24. The task is now to select three ART sites in each of the selected districts. But, as can be seen,
District 1 only has two ART sites; District 16 has three sites; and District 26 has five sites.
25. Depending on the population distribution of the country and the epidemiology of the disease
of interest, there may be many sites per district, or comparatively few. Given the relative
maturity of TB programs and the generalized distribution of both TB and Malaria, sites with
programs addressing these diseases are likely to be fairly numerous per district. On the other
hand, sites with HIV/AIDS programs will be relatively few, particularly in countries with
low prevalence or countries with concentrated epidemics (i.e., cases found primarily in high
risk groups). In our ART example there are very few sites per district. With these small

94 Data Quality Audit Tool


numbers of sites per district, any kind of random (chance) algorithm can be used to derive
the 9 ART sites that will comprise the audit sample. A simple random sample algorithm is
perhaps easiest to use in this case. In the case of many sites per district, sites should be ranked
per district according to the volume of service and three sites chosen using stratified random
sampling. That is, stratify the sites into large, medium and small volume (number of patients
treated, number of commodities distributed) and select one site at random from within each
stratum. This will ensure adequate representation of all sites with respect to the volume of
service.
26. At this point, a sample of 9 ART sites has been drawn. Now the data quality auditors know
which districts to visit and which sites within those districts are to be audited, so the team can
plan its work accordingly. After the Audit Team has completed work at these nine sites, the
next step is to calculate Verification Factors.
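
The district selection walked through in steps 12 to 21 can be reproduced in a few lines of Python, using the volumes from Annex 4, Table 2 divided by 10 (step 13), a sampling interval of 100, and the random start of 14 drawn in step 19.

    # Reproduce the worked example: inflated frame from Annex 4, Table 2 (volumes divided by 10).
    ART_PER_DISTRICT = {1: 300, 3: 100, 9: 200, 12: 500, 16: 500, 19: 60,
                        20: 70, 21: 300, 22: 90, 26: 600, 27: 80, 28: 200}

    frame = [district for district, volume in ART_PER_DISTRICT.items()
             for _ in range(volume // 10)]          # 300 entries, as in Annex 4, Table 3
    interval = len(frame) // 3                      # 300 / 3 = 100
    start = 14                                      # random start drawn in step 19

    selected = [frame[start - 1 + i * interval] for i in range(3)]
    print(len(frame), interval, selected)           # 300 100 [1, 16, 26]

The three ART sites per selected district would then be drawn by simple random sampling, as described in step 25.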

Note: the combination of number of clusters and number of sites within clusters is not fixed; rather,
this combination should be based on the distribution of sites across a programmatic landscape.
Fewer sites per district can be selected when volume of services is heavily concentrated. For
example, in “Our Country” we could have selected four districts and then two sites per district in
order to ensure more geographical representation of sites. While increasing the number of districts
in the sample leads to greater statistical power of the analysis (i.e., greater precision of the estimate
of data quality), the expense and time required for traveling to the additional districts will likely
outweigh the marginal improvement in precision (see Woodard et al.7 for a discussion on the
precision of estimates using the GAVI DQA sampling methodology).

The total number of clusters and sites will be determined by the Organization Commissioning the
DQA in consultation with the Auditing Agency, but is ultimately dependent upon the resources
available to conduct the Data Quality Audit. The main constraints in this regard are: (1) the time that
an Audit Team can devote to the in-country work; (2) the composition (number and training) of the
audit team in-country; and (3) the funding available to support the implementation of the audit.

How Big Should the Sample Be?


There is no right or wrong answer to this question. The question is really asking, “how many
clusters (e.g., districts) should we select and how many sites per cluster should we select in order
to generate statistics that are accurate?”

Accurate statistics in this case mean that the verification factors that are calculated for the sampled
districts are representative of the verification factors for all the districts that were not selected into
the data quality audit sample.

In other words, random sampling allows the DQA team to estimate a national Verification Factor
by verifying reported counts in only a fraction of the total (national) number of sites. How good is
this estimation? How closely do the results found by the auditors at this fraction of sites represent
the results that might be found for the whole?
7 Woodard S., Archer L., Zell E., Ronveaux O., Birmingham M. Design and Simulation Study of the Immunization Data Quality Audit (DQA). Ann Epidemiol, 2007;17:628–633.



The answer lies in sampling errors. A sampling error is a measure of how much the sample
estimates deviate from the so-called true values. (The true values are usually called the parameters.)
Sampling errors are a function of two things: (1) sample size; and (2) variability of the parameter.
Sampling errors decrease as the sample size increases. The larger your sample, the lower your
sampling error, and the more accurate your results are. Sampling error also depends on the
variability of the parameter. For example, if the true national verification factor (data quality
parameter) happens to be 0.95, it is likely a reflection of good reporting practices in the majority of
sites in the country. Therefore, it is probable that a random sample would contain sites with good
reporting performance. In this sample, the data quality is uniformly good and you would not need
a large sample to demonstrate this.

On the other hand, if the true national verification factor is 0.50, then it probably reflects a
combination of good and poor data quality across all sites in the country. It would take a larger
sample to ensure that enough of these “good” and “bad” sites were represented in the sample just
as they are distributed overall in the country.
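
As a simplified illustration of this point, treating the verification factor like a simple proportion and ignoring the cluster design, the familiar standard error sqrt(p(1-p)/n) is largest when the parameter is near 0.50 and shrinks as the sample grows; errors under a cluster design would be somewhat larger. A minimal Python sketch:

    import math

    def simple_standard_error(p, n):
        """Standard error of a proportion under simple random sampling (illustration only;
        the DQA's cluster design would give somewhat larger errors)."""
        return math.sqrt(p * (1 - p) / n)

    for p in (0.95, 0.50):
        for n in (100, 400, 1400):
            print(f"parameter {p:.2f}, sample size {n:5d}: "
                  f"standard error {simple_standard_error(p, n):.3f}")

A parameter near 0.95 varies much less than one near 0.50, which is why a smaller sample suffices when reporting is uniformly good.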

The sampling error is a mathematical construct that permits the calculation of confidence intervals.
It specifically relates to the number of standard deviations (plus or minus) that your sample results
deviate from the “true” results (the parameter). Most statistical textbooks have tables of sampling
errors in appendix form, where the specific value of the sampling error is indicated according to
sample size and variability of the parameter.

The key to reducing sampling errors in the context of the data quality audit is to remember that
sample size is not how many clusters (e.g. districts) are in the sample, nor is it how many sites are
in the sample; rather, sample size pertains to how many instances of a health service (a visit to the
site by an ART patient) are recorded at the site.

In Annex 4, we use an example where three districts are selected and three sites are selected per
district. The auditors are verifying reported counts of ART patients receiving ART services at the
selected sites. The total reported number of ART patients is 1,400. This is the actual number that
the data quality auditors are attempting to verify, and it constitutes the effective sample size when
considering statistical questions of sampling accuracy.

How big is this sample? In Uganda, the total reported number of individuals receiving ART
services directly from sites in 2005 was 49,600. Fourteen hundred individuals is about three
percent of that total, which under most conditions is a reasonable sample size for that population.
In Nigeria, the total number of individuals reached directly with ART services in 2005 was 18,900.
For Nigeria, our hypothetical sample size of 1,400 individuals represents about seven percent of the
total – a sample of that size is robust in most applications.
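The arithmetic behind these fractions is simply the audited count divided by the nationally reported total; the snippet below repeats it with the 2005 figures quoted above, so readers can substitute their own program totals.

    # Audited (sampled) count of ART patients from the Annex 4 example
    audited_count = 1_400

    # Nationally reported totals quoted above (2005)
    national_totals = {"Uganda": 49_600, "Nigeria": 18_900}

    for country, total in national_totals.items():
        share = audited_count / total
        print(f"{country}: {audited_count:,} of {total:,} reported services = {share:.1%}")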

So unless a country has a very large number of sites where important health services are delivered
(e.g., South Africa, Kenya, Uganda), it is usually possible to capture a robust fraction of services
by visiting 8-12 sites selected with a probability proportional to size (PPS) methodology.
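For readers unfamiliar with PPS selection, the sketch below shows one common way to implement it (systematic selection along cumulative cluster sizes). It is an illustration only: the district names and volumes are invented, and the DQA's own selection procedure is the one described in Annex 4.

    import random

    # Hypothetical clusters (districts) with their reported service volumes
    districts = {"A": 300, "B": 500, "C": 600, "D": 150, "E": 450}

    def pps_systematic_sample(sizes, n_select, seed=1):
        """Select n_select clusters with probability proportional to size,
        using systematic sampling along the cumulative totals."""
        random.seed(seed)
        interval = sum(sizes.values()) / n_select
        start = random.uniform(0, interval)
        targets = [start + k * interval for k in range(n_select)]

        selected, cumulative = [], 0
        items = iter(sizes.items())
        name, size = next(items)
        for t in targets:
            # Advance until the cumulative total reaches the target point.
            while cumulative + size < t:
                cumulative += size
                name, size = next(items)
            selected.append(name)   # very large clusters may be selected more than once
        return selected

    print(pps_systematic_sample(districts, n_select=3))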


However, mathematical modeling of the modified two-stage cluster sampling technique described
here has shown that the precision of estimates of the verification factor for immunization
coverage data is too low for realistic use at the national level.7 In simulations, Woodard et al.
found that up to 30 districts would need to be sampled to achieve precision in the neighborhood of
+/-10%. Given the investment of time, staff, and financial resources required to visit 30 districts,
the calculation of a precise national verification factor is unlikely to be feasible.

That said, it is possible to gain insight into the overall quality of data in a program or project
without relying on a national estimate of the verification factor. The qualitative aspects of the DQA
are adequate to determine the strengths and weaknesses of a given reporting system. For example,
if indicator definitions are poorly understood in a majority of a representative sample of sites, it is
quite likely that indicator definitions are poorly understood in non-sampled districts as well. The
recounting of indicators and comparison with reported values for a sample of sites is similarly
adequate to determine, in a general sense, whether data quality is good, mediocre, or poor, even
without the benefit of a precise national estimate. Missing reports, or large disparities between
recounted and reported results in a handful of sites, are indicative of similar problems elsewhere.

Ultimately, the national verification factor should be interpreted with caution. For the purposes of
the Data Quality Audit, it should be used as an indication of data quality (or the lack of it), rather
than as an exact measure.


Annex 4, Table 5. Random Number Table

From The Rand Corporation, A Million Random Digits with 100,000 Normal Deviates
(New York: The Free Press, 1955)


ANNEX 5: Calculation of the Verification Factor


In a data quality audit, one of the most fundamental questions is the extent to which reported results
match verified results. More specifically, “for the indicator being audited, what proportion of sites
in {country name} reported accurate results over the previous time period?” The Verification Factor
represents a way to summarize the answer to this question in a standard, quantitative measure.

Verification Factors can be applied to the full set of health indicators that this Data Quality
Audit Tool is designed to cover, provided that the sampling strategy used by the Audit Team is
statistically representative of the country-wide program (or an important subset of it) and that
the number of sites in the sample is large enough to generate robust estimates of reporting
consistency.

The Verification Factor is an indicator of reporting consistency that is measured at three levels:
(1) the Service Delivery Site level; (2) the district administrative level; and (3) the national
administrative level. It is often called a district-based indicator of reporting consistency because
the primary sampling units for estimating Verification Factors are districts (or “intermediate
aggregation levels”), and because in the GAVI approach Verification Factors are constructed at the
district level and then aggregated to the national level.

The equation to derive Verification Factors consists of four factors:


Factor 1: the Audit Team’s verified count at a selected site.
Factor 2: the observed reported count at a selected Service Delivery Site.
Factor 3: the observed reported count from all sites in a selected cluster (district).*
Factor 4: the reported count of a selected cluster (district) as observed at the national level.**
* Cluster level refers to an administrative/geographical unit like a district, a province, a region, etc.
** National level refers to the final place where aggregation of reported counts occurs, such as the
relevant unit within the host country national government or the Strategic Liaison Officer
within the USG team under the President’s Emergency Plan for AIDS Relief.

Calculation of the Verification Factor consists of three steps.

Step One:

Divide Factor 1 by Factor 2:

Verified count at selected site ÷ Reported count at selected site

This result equals the proportion of reported counts at a selected site that is verified by the Audit
Team. This result can be called the Verified Site Count.


Step Two:

Divide Factor 3 by Factor 4:

Reported count from all sites in selected cluster (district) ÷ Reported count of selected cluster (district) as observed at the national level

This result measures how consistent the selected cluster’s (district’s) reported count is with the
count recorded for that cluster at the national level. It is called the cluster consistency ratio, or
Adjustment Factor.

The adjustment factor answers the following question: “Were the results reported at the selected
district level (for all sites in the selected district — not just those sites that were visited by the Audit
Team) exactly the same as the results (for the selected district) that were observed at the national
level?”

Step Three:

For each sampled district, sum the recounted values for the audited sites and divide by the sum
of the reported values for the audited sites. Multiply this result for each sampled district by the
adjustment factor appropriate for each district. This result, when further adjusted with “district”
weights as shown below, is the National Verification Factor.

It is important to remember that the units of time should be equivalent across each of the factors
used to calculate the Verification Factor. What this means is that if the auditor is tracing and
verifying reported results for the past 12 months at a selected site, then this time period (past 12
months) should be used as the basis for the other factors in the equation.

The Verification Factor can be expressed using statistical notation as follows:

National Verification Factor = [ Σi ( (Σj Xij / Σj Yij) × (Rdi / Rni) × Wi ) ] / Σi Wi

where

i = selected district (i = 1, 2, 3),

j = selected site (j = 1, 2, 3), and

Wi = Σj Xij, the district-level verified count used as the weight for the ith district,


and where

Xij = the validated count from the jth site of the ith district

Yij = the observed reported count from the jth site of the ith district

Rdi = at the district level, the reported count from all the sites in the ith district, as prepared for
submission to the national level

Rni = at the national level, the observed count as reported from the ith district.

In order to derive a National Verification Factor, it is necessary to first calculate Verification Factors
at the district level. The national Verification Factor is calculated as the weighted average of the
district Verification Factors.
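To make these mechanics concrete, the following sketch implements the weighted calculation in Python. The function and variable names are illustrative (they are not part of the tool); the structure follows the three steps and the notation defined above.

    def district_verification_factor(verified, reported):
        """District Verification Factor: pooled verified counts divided by
        pooled reported counts for the audited sites (sum Xij / sum Yij)."""
        return sum(verified) / sum(reported)

    def national_verification_factor(districts):
        """Weighted average of the adjusted district Verification Factors.

        `districts` maps a district id to a dict with:
          x  - verified counts at the audited sites (Xij)
          y  - reported counts at the audited sites (Yij)
          rd - district-level reported count for all sites (Rdi)
          rn - count reported for the district as observed nationally (Rni)
        The weight for each district is its verified count, sum(Xij).
        """
        weighted_sum = weight_total = 0.0
        for d in districts.values():
            vf = district_verification_factor(d["x"], d["y"])
            adjustment = d["rd"] / d["rn"]       # cluster consistency ratio
            weight = sum(d["x"])                 # verified count used as weight
            weighted_sum += vf * adjustment * weight
            weight_total += weight
        return weighted_sum / weight_total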

The example showing how Verification Factors are derived assumes that the Data Quality Audit
Team is working in the three districts that were selected in the random sample section outlined
previously. These three districts (1, 16, 26) and the ART sites embedded within them are shown
in Annex 5, Table 1.

Annex 5, Table 1. The Flow of Reported ART Counts from the Selected Site Level
Up to the Selected District ( i = 1, 16, 26) Level and Up to the National Level

National level (aggregation of reported counts from the selected districts):
(300) + (500) + (700) = 1,500

District level: district identification number (i) and reported count
District 1: (300)     District 16: (500)     District 26: (600)

Site level: selected site identification number (j) and reported ART count (y)
District 1:   Site 1 (150), Site 2 (150)
District 16:  Site 3 (100), Site 4 (350), Site 5 (50)
District 26:  Site 6 (200), Site 7 (100), Site 8 (100), NA* (100), Site 9 (100)

Note that the aggregated ART reported count at District 26 (600) is misreported at the
National Level as (700)
* NA = this site was not randomly selected

Two-stage cluster sampling, as discussed above, resulted in three districts and a total of 10 ART
sites. In accordance with the GAVI approach, this strategy requires a set number of sites to be
selected per district. In this example, three sites are to be selected per district. The problem is that
District #1 has only two ART sites, so it is not possible to select three there.


One solution to this problem is to select both ART sites in District #1, all three sites in District
16, and, at random, four of the five sites in District 26; the fourth site in District 26 keeps the total
at nine audited sites despite the shortfall in District #1. Please note that there are a number of
alternative ways to address this sampling problem – this Data Quality Audit Tool is not the place
to discuss them.

Once a solution to the sampling issue described above is chosen, the Audit Team can begin
to complete the matrix required to calculate Verification Factors. The matrix can be illustrated as
follows:

Illustrative Calculation Matrix for Verification Factors


i = selected district (i = 1, i = 16, i = 26)
j = selected ART site located in the ith district
x = verified count at selected site j
y = reported count at selected site j

Annex 5, Table 2 illustrates the calculations derived from the calculation matrix.

Annex 5, Table 2. Calculations of i, j, x, and y

i     j     x       y       x/y
1     1     145     150     0.96
1     2     130     150     0.86
District 1 total (2 sites):     275     300     0.91
16    3     100     100     1.00
16    4     355     350     1.01
16    5     45      50      0.90
District 16 total (3 sites):    500     500     1.00
26    6     100     200     0.50
26    7     50      100     0.50
26    8     75      100     0.75
26    9     40      100     0.40
District 26 total (4 sites):    265     500     0.53

One row of the matrix merits a closer look to show how the Verification Factor is derived: the
row for District 26 (i = 26) and Site 7 (j = 7). The third column of the matrix shows x, the verified
count of ART patients established by the auditors at the site (50). The fourth column shows y, the
reported count of ART patients at this site (100). This element of the Verification Factor is derived
by simply dividing the verified count (50) by the reported count (100), which equals 0.50.

The matrix illustrates how sites are clustered together within districts, because the verification
factors are calculated at the district level by pooling the audit results from each selected site within
a district. Thus the Verification Factor for District 1 in the matrix is 0.91, which is derived by
pooling the [x/y] results from the two sites in District 1.


Pooling is straightforward: the total of the x column (275) is divided by the total of the y column
(300) to calculate the district level Verification Factor for District 1. This is done for each of the
selected districts.

Judging from these verification factors (based on hypothetical values typed into the x column), the
matrix suggests that District 26 over-reported the number of ART patients served in its sites. Here,
the total number of reported ART patients was 500, while the total verified count that was derived
by the Data Quality Audit Team examining source documents at the four selected sites was 265;
265 divided by 500 equals 0.53, which implies that the auditors were able to verify only about half
of all the ART patients that were reported in this district.

The final two steps in deriving a national Verification Factor are to (1) calculate the adjustment
factor [Rdi/Rni] for each cluster; and (2) multiply each district-level Verification Factor by its
adjustment factor and take the weighted average of the adjusted values.

Calculation of the Adjustment Factor


Annex 5, Table 1 shows the flow of reported ART counts from the selected site level up to the
selected district (or cluster) level, and then finally up to the national (or final aggregate) level.
In our example, the table indicates that the aggregated ART count reported at the district level
for District 26 was not accurately reflected at the national level. Specifically, the 600 ART patients
reported in the District 26 health office records do not match the 700 ART patients recorded for
District 26 at the national health office.

This fact was uncovered by a member of the Data Quality Audit Team who was tracing the district-
level results to what could be observed at the national level. As a result of this work, which takes
place at levels of aggregation higher than the site (namely the intermediate and final levels of
aggregation), we now have what we need to calculate the Adjustment Factor.

Rdi/Rni is equal to:

the reported aggregate count from all sites in a selected district, as observed by the auditor
at the district (or intermediate) level of aggregation,

divided by

the reported aggregate count from all sites in that same district, as observed by the auditor
at the national (or highest) level of aggregation.

In our example, the adjustment factors for each district would be:
• District 1: 300/300 = 1.0
• District 16: 500/500 = 1.0
• District 26: 600/700 = 0.86
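These ratios can be checked in a couple of lines (the dictionaries below simply restate the counts shown in Annex 5, Table 1):

    # Rdi: district-level reported counts; Rni: the same districts' counts as
    # observed at the national level (Annex 5, Table 1)
    rd = {1: 300, 16: 500, 26: 600}
    rn = {1: 300, 16: 500, 26: 700}

    adjustment_factors = {i: rd[i] / rn[i] for i in rd}
    print(adjustment_factors)   # {1: 1.0, 16: 1.0, 26: 0.857...}, reported as 0.86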


The adjustment factor is applied by multiplying it by the Verification Factor for each district.
Thus, the adjusted Verification Factors for each district are:
• District 1: 0.91 x 1.0 = 0.91
• District 16: 1.0 x 1.0 = 1.0
• District 26: 0.53 x 0.86 = 0.46

The next step in the calculation is to weight the adjusted district Verification Factors by the verified
counts at district level. We weight the adjusted district Verification Factors because we want to
assign more importance to a Verification Factor that represents a large number of clients, and
proportionately less importance to a Verification Factor that represents a small number of clients.

In other words, based on our hypothetical example of the three districts, it looks like District 16
has the highest volume of ART patient services and that District 26 has the smallest volume of
ART patient services during this time period. When we construct an average Verification Factor
for all of the three districts, we ideally would like to assign proportionately more weight to the
verification results from District 16, proportionately less weight to District 26, and so on.

The matrix below shows the intermediate and final calculations that are required to construct a
weighted average of all the District Verification Factors.

Annex 5, Table 3. Calculation of the Average and Weighted Average
of the District Verification Factors

                                           i = 1     i = 16    i = 26    Summed Total
District-level Verified Count (x)            275        500       265            1040
District-level Reported Count (y)            300        500       500            1300
District Verification Factor (x/y)          0.91       1.00      0.53            2.44
Adjustment Factor                            1.0        1.0      0.86
Adjusted District Verification Factor       0.91        1.0      0.46            2.37
Weight*                                      275        500       265            1040
Verification Factor (Weight)              250.25     500.00    121.90          872.15
District Average                                                                 0.81
Weighted District Average                                                        0.84
* The weight used here is the verified number of patients on ART (x)

The District Average is calculated by summing the three District Verification Factors
(0.91 + 1.00 + 0.53 = 2.44) and then dividing by three (2.44/3 = 0.813).

Weighted District Average is calculated by first multiplying each of the three adjusted District
Verification Factors by the district-level weight that has been assigned. In this example, the weight
is equal to the district-level verified count (x). In the matrix, this value is shown in the row labeled
Verification Factor (Weight). Next, take the sum of the weighted values, which is shown in the
last column of that row (872.15). Then divide this value by the sum of the weights themselves
(1040): 872.15/1040 = 0.84.
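The same weighted average can be reproduced with a few lines of Python using the hypothetical counts from Tables 1 and 2; full-precision arithmetic gives the same final result (0.84), with the intermediate sum differing slightly from 872.15 only because the table rounds district factors to two decimals.

    # Hypothetical audit results from Annex 5, Tables 1 and 2:
    # (verified site counts x, reported site counts y, Rdi, Rni) for each district
    districts = {
        1:  ([145, 130],          [150, 150],            300, 300),
        16: ([100, 355, 45],      [100, 350, 50],        500, 500),
        26: ([100, 50, 75, 40],   [200, 100, 100, 100],  600, 700),
    }

    weighted_sum = weight_total = 0.0
    for x, y, rd, rn in districts.values():
        adjusted_vf = (sum(x) / sum(y)) * (rd / rn)   # adjusted district Verification Factor
        weighted_sum += adjusted_vf * sum(x)          # weight = district verified count
        weight_total += sum(x)

    print(f"Weighted District Average: {weighted_sum / weight_total:.2f}")   # 0.84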

Based on the calculations shown in Annex 5, Table 3, the simple arithmetic average of the combined
Verification Factors across all three districts is 0.813, while the weighted average is 0.840. The
weighted average is higher because its calculation took into account the fact that District 16 had
more ART patients than the other districts. Since the Verification Factor for District 16 was 1.00,
this (perfect) Verification Factor was applicable to more ART patients and thus it had more influence
on the overall average.
