Are You Being Served?: New Tools for Measuring Service Delivery

About this ebook

This publication presents tools and techniques for measuring service delivery in health and education, along with practitioners' experiences in deploying these methods in the field. It begins by providing an introduction to the different methodological tools available for evaluating the performance of the health and education sectors. Country-specific experiences are then explored to highlight lessons on the challenges, advantages, and disadvantages of using different techniques to measure quality in a variety of contexts and of using the resulting data to effect change. This book is a valuable resource for those who seek to enhance capacity for the effective measurement of service delivery in order to improve accountability and governance and enhance the quality of service delivery in developing countries.
Language: English
Release date: Dec 28, 2007
ISBN: 9780821371862


    PART 1

    Overview

    1

    Introduction

    Why Measure Service Delivery?

    Markus Goldstein

    The beginning of knowledge is the discovery of something we do not understand.

    Frank Herbert

    The services that governments offer citizens are a cornerstone of policies to improve welfare. In most countries, governments provide some form of basic education and health services. Governments also supply a variety of other services, ranging from essential public goods such as police services to administrative services such as drivers' licenses. Taken as a whole, those services are critical for economic growth and the reduction of poverty.

    Although we have an array of tools and techniques to measure ultimate welfare outcomes, our tools for measuring the services aimed at improving these outcomes are less well developed. This book explores some of those tools, their uses, and the way they are implemented in practice. Through those lessons, we may expand our understanding of welfare outcomes and the processes of accountability, governance, and service delivery that help produce these outcomes.

    There is a temptation to view the relationships between welfare outcomes and these processes simplistically: if more money is spent on basic services, welfare outcomes will improve. However, this view flies in the face of the empirical fact that there is a weak correlation between spending and outcomes. See, for example, figure 1.1, which is taken from World Development Report 2004 (World Bank 2003). The charts in figure 1.1 show the relationship between spending and selected health and education outcomes across a large number of countries. Because national wealth might affect those welfare indicators through mechanisms other than government spending, the charts use a measure of expenditure that captures the difference between actual spending in a country and what might be expected, on average, for a country at the same level of gross domestic product. We may see that, even if we account for national income, there is a weak association between spending and outcomes.

    Figure 1.1 Association between Outcomes and Public Spending


    Source: World Bank 2003.
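    To make the income adjustment behind figure 1.1 concrete, the sketch below regresses log public health spending on log GDP per capita across countries and correlates the residual (spending above or below what income predicts) with an outcome. This is a minimal illustration assuming a hypothetical data file and column names; it is not the World Development Report's own code.

```python
# Minimal sketch of income-adjusted spending vs. outcomes.
# File and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("country_panel.csv")

# Regress log spending on log GDP per capita; the residual is spending
# beyond what would be expected for a country at that income level.
X = sm.add_constant(np.log(df["gdp_per_capita"]))
resid = sm.OLS(np.log(df["public_health_spending"]), X).fit().resid

# A correlation near zero with the outcome mirrors the weak association
# shown in figure 1.1.
print(np.corrcoef(resid, np.log(df["under5_mortality"]))[0, 1])
```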

    Unfortunately, this association robs us of the easy solution of simply spending more money to improve welfare. Instead, we need to understand the process by which funds are transformed into outcomes. Many factors intervene between the input of spending and the outcome of individual welfare, including the functioning and failure of markets, the composition of spending (for example, for tertiary versus primary education or health), corruption, and the effectiveness of service delivery. There is a significant and growing literature showing that understanding some of those intervening factors would tell us a great deal about how service delivery institutions affect outcomes. For instance, Lewis (2006) provides cross-country evidence on the correlation between governance and health outcomes, while Wane (chapter 8 in this volume) shows that, once leakages are accounted for, there is, indeed, a positive relationship between central government spending and health outcomes in Chad. Broader discussions, such as Filmer, Hammer, and Pritchett (2000) and World Development Report 2004, look at the multiple ways in which those intervening factors may shape the relationship between inputs and welfare outcomes.

    This volume focuses on one key aspect of the process of transforming inputs into outcomes: the process of service delivery. The key question here is: are citizens being served? Asking this question in the context of the benchmarking of public services raises a host of other questions. Are we talking about quality, efficiency, or another dimension? Should we gather data from clients, providers, surveys, or administrative records? For what purposes should such data be used?

    This book offers a starting point for answering these questions. Building on the authors’ experiences in a wide range of settings, the chapters provide not only an overview of service delivery measurement tools and what they may be used for, but also lessons about what works and what does not work in practice. Those discussions reveal a process that is rarely easy, but one that may provide powerful inputs for making effective policy.

    Knowledge about measuring service delivery is less well developed than knowledge about household surveys. For example, someone interested in designing a household survey may refer to the volumes edited by Grosh and Glewwe (2000), which offer a chapter-by-chapter discussion of potential modules and methodological issues. No such reference exists for measuring service delivery; it may even be argued that not enough consistent and comprehensive attempts have been made to warrant a definitive guide. One might also argue that the tools for measuring service delivery are so diverse, ranging from the use of routine administrative data to the presentation of case studies to doctors to gauge their skills, that analyzing them all in one definitive volume would be impossible.

    Instead, we seek in this volume to bring together a set of lessons arising from the application of some of the various tools, and we have asked people who have engaged with these tools to report on their experiences, on what has worked well and what has not, and on what the data may be used for. In terms of overall success, the experience of the authors is mixed; every author has indicated the difficulty of collecting those types of data. Indeed, during the authors' workshop held as part of the preparations for this volume, one author regaled us with tales of chasing peanut vendors outside a ministry to track down missing records because the records were being recycled as food wrappers. However, there remains a selection bias in this volume: we do not observe complete failures. Given our goal of learning about these tools and demonstrating their application, we have had to exclude cases where attempts to measure service delivery have failed.

    Readers will notice that this volume focuses mostly on health and education. These areas are where these tools are most developed, but they are not the only areas where these tools may be applied. Surveys such as the Indonesia Family Life Survey (discussed in chapter 15 by Beegle), the Living Standards Measurement Study (chapter 16 by Scott), and the Indonesia Governance and Decentralization Survey all measure different areas where the state provides services to citizens. Administrative data may also represent a powerful tool for understanding projects that do not revolve around health or education, as shown in chapter 4 by Lanjouw and Özler. Thus, as you read through chapters that focus on one sector, it is important to keep in mind that many of the tools may be fruitfully applied to other areas of service delivery.

    The final section of this chapter provides a detailed road map to this book, but, now, it is worth spending a bit more time on the conceptual overview of the volume. A better understanding of service delivery will enable policy makers to increase the efficiency and effectiveness with which resources are translated into welfare outcomes. There are four main ways in which the measurement of service delivery may be used to achieve this. First, service delivery information may be used to increase accountability by helping to strengthen the ties through which information and sanctions flow between providers, clients, and the government units that fund providers. Second, service delivery data may be used to deepen our understanding of poverty and inequality and to target a policy response. This information will enable the more effective formulation of policy as the conditions the poor face become better understood, and it will enable resources to be directed more successfully to the poor. Third, the measurement of service delivery is critical to rigorous impact evaluation. Through careful evaluation aimed at answering key questions of design and the resulting effects, existing programs may be adjusted to boost their effect. Finally, service delivery data may be used for policy-relevant research to answer a range of questions about the way providers and clients interact and about the way facilities function. Let us now examine the potential uses of service delivery in greater depth.

    Making Use of Service Delivery Data

    Accountability

    Measures of service delivery represent a vehicle for holding service providers to account for the quality and quantity of the services they provide. This accountability is critical in any well-functioning system, but it is particularly important if the underlying system is in flux because of sectorwide policy changes or large national governance changes such as decentralization. One useful distinction in capturing and using these data is to think of the paths of accountability from the provider to the client and also from the supervising agency (the government body) to the provider. Figure 1.2, taken from World Development Report 2004, provides a framework for thinking about these two paths.

    Figure 1.2 Key Relationships of Power


    Source: World Bank 2003.

    In terms of measurement tools for service provider accountability, we can think of the relationship between providers and citizens as being measured in terms of satisfaction (as well as the realized demand for services), while monitoring offers information for the state-provider relationship.

    Satisfaction

    The measurement of service delivery can represent a powerful mechanism for obtaining feedback from clients to providers. Although a complaint box at a local facility is a good initial step, the tools examined in this volume represent a more organized and balanced way to obtain comprehensive feedback for providers. The satisfaction of users with the services they are relying on may be measured through tools such as citizen report cards (see chapter 3 by Amin and Chaudhury and chapter 7 by Lindelow and Wagstaff), questions in household surveys, and exit polls (see chapter 14 by Lundberg). The satisfaction measures might include opinions on, for example, the length of the waits to see a doctor, the teacher's performance, and the politeness of nurses. This information may be presented directly to facility staff or be used by government supervisors to give voice to clients about their concerns. This type of user feedback, particularly if it is repeated and associated with appropriate incentives by government agencies responsible for oversight, may become a powerful path of accountability from providers to clients. It is important to keep in mind, however, that user satisfaction may not, in fact, be correlated with more objective measures of quality. For example, in chapter 14, Lundberg finds that measures such as whether patients were given physical examinations, whether their pulse was taken, or whether stethoscopes were used are not significantly correlated with patient satisfaction. In addition, as Amin and Chaudhury note, measures of satisfaction also show a built-in bias created by expectations, which may be based not only on the past performance of a facility, but also on promises made by politicians, reports in news media, and other contextual factors.
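    As a rough illustration of the kind of check Lundberg performs, the sketch below correlates an exit-poll satisfaction score with objective indicators of the consultation. The file and column names are illustrative, not drawn from the Ugandan surveys.

```python
# Correlate reported satisfaction with objective consultation measures.
# Column names are hypothetical.
import pandas as pd

exits = pd.read_csv("exit_poll.csv")
for col in ["physical_exam_done", "pulse_taken", "stethoscope_used"]:
    r = exits["satisfaction_score"].corr(exits[col])
    print(f"{col}: r = {r:.2f}")
# Near-zero correlations would echo the finding that satisfaction and
# objective quality need not move together.
```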

    Monitoring

    When the state is seeking to measure the performance of service providers, the first best option is administrative data. There are a few important attributes that these data should possess to be effective. First, they need to be collected routinely to ensure timely feedback for decision making. Second, they should be of sufficient quality to provide useful information. Third, they should be of adequate breadth to capture key activities, but also sufficiently narrow so as to avoid an unnecessary burden on frontline service providers. Finally, once these other conditions have been met, the effectiveness of the data depends on their actual use in monitoring and helping improve the performance of service providers. Chapter 5 by Galasso, chapter 6 by Behrman and King, and chapter 4 by Lanjouw and Özler in this volume show how administrative data may be used to draw inferences about program performance. Indeed, high-quality administrative data represent a key link between monitoring and impact evaluation (discussed next) because they supply information on important aspects of program implementation such as costs and service utilization rates. These data serve important monitoring functions, but they may also be used to enhance the design and results of a more thorough evaluation.

    In a number of contexts, however, administrative data systems are too weak to provide effective monitoring. The primary goal should be to improve these systems, and, in the meantime, facility surveys (which are a much more expensive way of collecting data) may suffice. Indeed, using facility surveys as a replacement for administrative data in a regular program of monitoring would be akin to providing four-wheel drive vehicles for urban driving rather than fixing the roads.

    It is important to bear in mind that facility surveys may be used to address other monitoring tasks and thus are a significant complement to high-quality administrative data. One use of facility surveys involves collecting data in a swath that is much broader than is feasible through regular monitoring (see the discussion in Lindelow and Wagstaff, for example) and at a level of detail that would overwhelm routine reporting. Another complementary use of facility surveys might involve verifying administrative data in cases in which service providers have a strong incentive to misreport (see the discussion of the measurement of absenteeism in Amin and Chaudhury, for instance). A final example is offered in chapter 12 by Serneels, Lindelow, and Lievens, who show how qualitative facility-level work may help us understand the incentives, motivations, and behavior that lie behind the data captured by regular monitoring systems or quantitative facility surveys.

    As noted in several of the chapters in this volume, administrative data almost always only supply information on those people actually using a given service. There are a number of situations in which an analyst would also need data on those people who are not using a service (or at least not the service provided by the government). For instance, such data would be useful if the government were trying to understand why service utilization at a given clinic or school is low: is the underlying issue the awareness among potential users about the existence of a service at a given facility, or is the underlying issue the quality of the service? In those cases, administrative data, combined with a household survey, might provide information on the differences between users and nonusers. Furthermore, to compare a service at a government facility with other services, either administrative data or survey data on private facilities are needed. Ultimately, this type of analysis would lead to a thorough market analysis that may be beyond the purview of most routine monitoring systems, and we therefore discuss it in the section on research uses below.
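    A minimal sketch of combining the two sources follows, assuming hypothetical file names and a shared locality code as the join key.

```python
# Merge facility administrative records with a household survey to
# contrast users and nonusers of a government clinic. Names are
# hypothetical.
import pandas as pd

admin = pd.read_csv("facility_admin.csv")    # one row per facility
hh = pd.read_csv("household_survey.csv")     # one row per household

merged = hh.merge(admin, on="locality_code", how="left")
print(merged.groupby("used_gov_clinic")[
    ["distance_km", "knows_clinic_exists", "rated_quality_good"]
].mean())  # awareness and perceived quality, users vs. nonusers
```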

    Monitoring does not take place only at the level of facilities and of interactions with clients. A key feature of an effective monitoring system is the ability to keep track of goods and other resources as they travel through the system to the end users. This process may be achieved in part through administrative data, but periodic surveys and audits are a critical complement. One example of these types of tools is Public Expenditure Tracking Surveys, which are discussed in depth in the chapters by Filmer (9), Lindelow (7), and Wane (8). Those surveys provide in-depth information on the flows and losses of resources through a system. However, as Lindelow and Wane both point out, a serious attempt to carry out a Public Expenditure Tracking Survey inevitably raises questions about what should be classified as embezzlement or fraud, what as inefficiency, and what as legitimate reallocations of resources. When conducted in an atmosphere of openness to dialogue, those surveys may help shape not only regular monitoring systems but also the thinking about allocation rules in government.
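    The core arithmetic of such a tracking survey is simple, even if the classification questions are not. The stylized sketch below, with illustrative names, compares central disbursements with the sums facilities report receiving and expresses the shortfall as a leakage rate; deciding how much of that shortfall is fraud, inefficiency, or legitimate reallocation is the hard part Lindelow and Wane describe.

```python
# Stylized Public Expenditure Tracking computation; names hypothetical.
import pandas as pd

disbursed = pd.read_csv("central_disbursements.csv")  # district, amount_sent
received = (pd.read_csv("facility_receipts.csv")      # facility receipts
            .groupby("district", as_index=False)["amount_received"].sum())

flows = disbursed.merge(received, on="district", how="left").fillna(0)
flows["leakage_rate"] = 1 - flows["amount_received"] / flows["amount_sent"]
print(flows.sort_values("leakage_rate", ascending=False).head())
```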

    Understanding Poverty and Inequality and Targeting the Policy Response

    Accurate measurements of service delivery are important tools in understanding poverty and inequality and in properly targeting the policy response to these problems. Whether we adopt a multidimensional notion of poverty and are concerned with outcomes such as health and education as ends in their own right, or rely on a narrower definition and believe that health and education are instrumental to improving income, these tools will help us fathom the processes through which health and education are or are not produced. Although the accountability mechanisms discussed above represent a critical means of ensuring that the voices of the poor are heard, they may also be put to diagnostic and analytic use in improving policy. This approach would include measuring service delivery with a particular emphasis on the levels and quality of the services available to the poor and then using this information to develop a targeted policy response.

    Understanding Poverty and the Service Environment of the Poor

    At a fundamental level, the Living Standards Measurement Study surveys, given their focus on the correlates and determinants of poverty, are ideally placed to help us understand poverty. However, as Scott notes in chapter 16, the surveys were not originally designed to provide information on service access or on the use and (especially) the quality of services for the poor and others. This approach has changed somewhat over time, and Scott counts 14 or so cases where a Living Standards Measurement Study survey has included a facility component. The prevailing method in these surveys for assessing services (similar to that of the Indonesia Family Life Survey discussed by Beegle in chapter 15) is to sample those facilities most likely to be used by households, rather than starting from a national sample frame of, for example, all possible facilities. Given that this approach takes the household as the starting point for sampling, but also that it is explicitly designed to link household and facility data, it represents a powerful means for examining the relationship between poverty and service provision.

    Unlike the Living Standards Measurement Study and the Indonesia Family Life Survey wherein facility and household data are integrated by design, the school survey in Ukraine examined by Bekh and others in chapter 11 shows us a case where this design may be achieved (with some difficulty) by adding a facility survey that, after the fact, may be linked with a living standards survey. In this case, one of the uses of the survey was to look at the effects of institutional change and the contribution of the quality of education to equality and household well-being.

    Targeting the Policy Response

    Linked household and facility surveys such as the Living Standards Measurement Study (LSMS) and the Indonesia Family Life Survey provide a powerful set of tools to target a policy response by providing a straightforward method to correlate poverty and service delivery. However, in the development of a policy response that focuses on service delivery, care must be taken to separate out those measures of quality that reflect the underlying poverty (and, hence, that call for a broader development strategy) and those that are due to deficiencies in service delivery. This distinction is clearly made in chapter 13 by Das and Leonard, who use the example of measuring service delivery quality in health through the number of doctors who follow a given protocol. They argue that the results may be driven by the education level of the patients (for example, because the patients encourage the doctors or make their jobs easier) rather than some underlying quality of the physicians. To measure dimensions of quality that directly capture the skills of the underlying service providers, it is necessary to test doctors directly through a tool such as vignettes. Das and Leonard show us that, in the vignette studies they review, the poor are being served by doctors of lower quality and that an appropriate policy response would, therefore, involve finding a way to bring more competent doctors to poor districts.
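    The confound Das and Leonard identify can be made concrete in a regression: relate observed protocol adherence to the provider's vignette score while controlling for patient education, so that provider skill and patient-driven variation are separated. The sketch below uses hypothetical variable names.

```python
# Separate provider skill from patient-driven variation in observed
# quality. Variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

visits = pd.read_csv("observed_visits.csv")  # one row per consultation
fit = smf.ols("protocol_followed ~ vignette_score + patient_education",
              data=visits).fit(cov_type="cluster",
                               cov_kwds={"groups": visits["doctor_id"]})
print(fit.params)  # vignette_score captures skill net of patient mix
```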

    Sometimes, however, the basic physical condition of a facility or the presence of a service provider may be the primary concern. One area where rapidly implemented facility surveys, combined with household data, may be of use in developing a targeted response is in the planning and execution of a reconstruction program following a natural disaster. In chapter 10 by Frankenberg and others, we can see how the surveys used in the Study of the Tsunami Aftermath and Recovery serve this purpose in the case of rebuilding after the recent tsunami in Indonesia. Clearly, the surveys (here combined with geographical information such as satellite pictures) provide evidence on facilities that have been wiped out and providers who have been killed. However, as the authors caution, a well-targeted response to the disaster does not simply consist of rebuilding and repairing previously existing facilities. Indeed, the household and facility surveys are useful in shedding light on population shifts and the resulting changes in the demand for services within the places affected by the tsunami. The surveys thus also provide insights into effects beyond the physically damaged zones. For example, some of the affected populations move to other areas, where they create additional demand for services, and this needs to be taken into account in targeting reconstruction aid.

    Evaluation, Especially Impact Evaluation

    Measurements of service delivery are critical in any program evaluation where the process or output of facilities is a subject of examination. However, when we are trying to establish a clear link between inputs and impacts, the best option, if feasible, is to use the tools of impact evaluation. Impact evaluation tackles one of the fundamental problems of evaluation: what would have happened to the beneficiaries of a program or a policy in the absence of the program? Because individuals either receive or do not receive a program, impact evaluation techniques seek to construct a comparison group or counterfactual that proxies for the outcome among the beneficiaries or treatment group in the absence of the intervention. This comparison group is chosen according to criteria that cause the characteristics of the group (both those characteristics that are observed by the evaluator and those that are not) to be as similar as possible to the characteristics of the group receiving the intervention. By using appropriate methods of statistical analysis to compare the two groups at a point at which the program is expected to have already had some effects, we are able to provide a rigorous link between program inputs and impacts.¹ This type of evidence provides a critical input into an informed policy debate. Indeed, returning to figure 1.2, we may think of this type of evidence as strengthening citizen–government links (by demonstrating the validity of a program) and government–provider links (by demonstrating which interventions have the greatest effect).
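    One standard way to construct such a comparison group on observed characteristics is propensity-score matching. The bare-bones sketch below uses hypothetical variable names; it cannot, of course, balance unobserved characteristics, which is why design features such as randomization matter.

```python
# Nearest-neighbor propensity-score matching; names hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("evaluation_sample.csv")
covars = sm.add_constant(df[["income", "education", "rural"]])

# 1. Probability of treatment given observables.
df["pscore"] = sm.Logit(df["treated"], covars).fit(disp=0).predict(covars)

# 2. Match each treated unit to the nearest untreated unit by score.
treated = df[df["treated"] == 1]
control = df[df["treated"] == 0].reset_index(drop=True)
gaps = pd.DataFrame(treated["pscore"].values[:, None]
                    - control["pscore"].values[None, :]).abs()
matches = gaps.idxmin(axis=1)

# 3. Compare mean outcomes across the matched groups.
att = treated["outcome"].mean() - control.loc[matches, "outcome"].mean()
print(f"Estimated average effect on the treated: {att:.3f}")
```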

    There are three main ways in which the measurement of service delivery may be used in impact evaluation. The first way is the most straightforward: we may use administrative data to estimate whether or not there are any program effects or any effects related to an additional time of exposure to a program. The second way looks at the effect of management on service delivery. The third way examines how the effect of interventions varies in the context of providers of different quality. If we are to make the distinction between these latter avenues of analysis clear, let us think of two types of interventions we might wish to evaluate. In the first case, the management of a facility is the subject of the intervention. For example, such an intervention might include changes in the incentives workers face, how feedback is provided, which transactions are monitored, and a host of other options. In this instance, the measurement of service delivery will provide a core set of data for the evaluation. In the second case, we may think of an intervention that provides added inputs, such as textbooks or medicines, or new inputs altogether. In this case, we may use service delivery data to understand how variations in the quality of the provider (or in the heterogeneity of treatment) affect the impacts we are examining. Let us now examine these three uses in greater depth.

    Evaluating Marginal (or Any) Program Impacts

    In the simplest case of impact evaluation, we want to know whether a program has had any effect on individual welfare outcomes. To do this, we are likely to need individual-level data, from household surveys or administrative records, that cover both the treatment group and the comparison group. Although such data would be the most useful, they are not service delivery measurement tools. However, in some instances, we may use service delivery data as a proximate indicator of program impacts. For example, given a vaccine of proven effectiveness in clinical trials, the evaluation may need to rely instead on measurements of the number of children who are properly vaccinated in both the treatment and control groups. Although this examination does not reveal the welfare impact we are really concerned about (child mortality), it produces a quick assessment of whether the program is more effectively providing a treatment that we know contributes to a reduction in child mortality. For this assessment to be a viable option, however, it is essential that the service delivery data we use include information on both the treated group and the comparison group.
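    The comparison described here reduces to a difference in coverage rates between the two groups. A minimal sketch, assuming a child-level data set with hypothetical column names, follows.

```python
# Difference in full-vaccination rates, treatment vs. comparison, with
# a simple two-proportion standard error. Names are hypothetical.
import numpy as np
import pandas as pd

kids = pd.read_csv("child_records.csv")
t = kids.loc[kids["treatment"] == 1, "fully_vaccinated"]
c = kids.loc[kids["treatment"] == 0, "fully_vaccinated"]

diff = t.mean() - c.mean()
se = np.sqrt(t.mean() * (1 - t.mean()) / len(t)
             + c.mean() * (1 - c.mean()) / len(c))
print(f"coverage gap = {diff:.3f} (SE {se:.3f})")
```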

    Chapter 5 by Galasso highlights another way we may apply service delivery data to measure program impacts. Galasso discusses the measurement of the effect of additional exposure to a program (the marginal effect of increased duration). She examines an intervention that a pilot test has shown to be effective. As the program is brought to scale, her goal is to determine the effectiveness of the intervention in different socioeconomic settings. In this case, the program collected data on the phasing in of the project and implementation over time and also on recipient outcomes (such as malnutrition rates). Combining this information with data on local characteristics allows for estimates of the impact of additional exposure to the program, as well as insights into which communities would benefit most from the intervention.
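    One simple way to operationalize this kind of analysis, sketched below with hypothetical variable names rather than those of Galasso's data, is to regress the outcome on exposure duration interacted with a community characteristic, clustering standard errors by community.

```python
# Marginal effect of additional program exposure, allowed to vary with
# a community characteristic. Names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("program_rollout.csv")
fit = smf.ols("malnutrition ~ months_exposed * community_poverty_rate",
              data=panel).fit(cov_type="cluster",
                              cov_kwds={"groups": panel["community_id"]})
print(fit.summary())
```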

    Evaluating a Change in Management

    If the aim is to evaluate the effect of a change in management, two levels of effect are likely to be of interest. First, the ultimate goal of a change in public service management is to improve the welfare of clients; so, the evaluation will likely include measurements of client outcomes (for example, literacy rates or the incidence of vaccine-preventable disease) through household surveys or administrative data, as discussed above. However, those welfare effects may take a fair amount of time to become manifest; meanwhile, it may be useful to collect proximate indicators that measure improvements in the quality of service delivery. We are also interested in changes in service delivery outcomes in their own right, because a subsidiary goal of a change in management is, hopefully, to encourage facilities to perform more effectively. Chapter 14 by Lundberg discusses data collected as part of an evaluation of an experiment with performance-based contracts in the Ugandan health sector. In the study examined by Lundberg, private not-for-profit providers were randomly allocated to different treatments with varying amounts of financial autonomy and bonus payments. To evaluate the effects of these different contract types, the research team collected household data, conducted exit surveys, and carried out facility surveys.

    Using Service Delivery Data to Capture the Effects of Variations in Treatment

    The two sorts of cases outlined earlier, wherein service delivery data are used to look at the direct effects of an intervention, are common in the literature. Less common is the approach whereby data on service delivery are used to complement data on individual beneficiary outcomes to analyze the effects of heterogeneity in treatments when the intervention under evaluation is focused on providing new inputs rather than on changing the management structure among service providers. The idea is to measure individual impacts through a tool such as a household survey or, if feasible, administrative data. By itself, this approach will usually be applied to estimate an average treatment effect. However, not all individuals face the same service environment: variations in the attributes of providers will lead to variations in program effects on individuals that will not be evident if only the average treatment effect is examined. To understand which elements of service delivery matter more or less in producing these individual-level effects, we may use measurements of the attributes of the delivery of services to determine what is most important in achieving greater program impact.

    For example, take a program that seeks to promote child literacy by providing one textbook for each child instead of obliging children to share textbooks. The government is not sure about the efficacy of this program, so it randomly chooses a group of pilot schools to receive the additional textbooks. In the simplest impact evaluation, we might measure child literacy after enough time has passed and compare the treatment schools (that is, the pilot schools) with the comparison group of schools. Suppose this program increases literacy by 10 percentage points. Although we might be happy with this result, we might make more effective policy recommendations for the scaling-up of this program by complementing our analysis with service delivery data. For instance, as the program is starting, we might collect data on variables such as teacher absenteeism, the quality of roofs in the school, and teacher education levels. It could turn out that the program has a much greater effect if the school has a roof (the books will not become wet, and children will be more likely to use the books anyway if they are not exposed to the elements). Some of those effects will be obvious to anyone familiar with education program design, but some may be counterintuitive. An examination of this type will allow prioritization among scarce resources by showing which factors are more important.
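    In regression terms, this amounts to interacting the treatment indicator with the facility attributes collected at baseline. A minimal sketch with hypothetical variable names follows; the interaction coefficient estimates how much larger the textbook effect is in schools with a sound roof.

```python
# Heterogeneous treatment effects via an interaction term.
# Variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

schools = pd.read_csv("textbook_pilot.csv")
fit = smf.ols("literacy ~ treated * has_roof + teacher_absence_rate",
              data=schools).fit()
print(fit.params[["treated", "treated:has_roof"]])
```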

    Another example of the use of service delivery data to understand the heterogeneity in treatment is contained in chapter 6 by Behrman and King. The authors show the importance of being aware of differences in implementation speed. This awareness would take into account the possibility that providers are initially learning about a program (and, so, implementation is less effective) or, in contrast, that providers may be more motivated because they are pioneering a program (and therefore exert greater effort). Behrman and King apply those considerations in an evaluation of an early childhood development program in the Philippines and show that adequately measuring the duration of treatment is quite important in measuring the impact of a program.

    Policy-Relevant Research

    Data on service delivery permit the examination of a host of research questions. Impact evaluation is one obvious area, but chapter 2 by Lindelow and Wagstaff suggests three other broad areas for consideration. The first of those is the establishment of links between quality and client outcomes. The realm in which we define quality, be it the satisfaction of clients, the quality of facility infrastructure, the skill of the provider, or other dimensions, will guide us in the choice we make among measurement tools. The data may then be linked to client outcomes so that we may understand whether or not the different measures of quality are important and which of them are more important.

    A second area of research involves understanding the demand for services. This work seeks to elucidate the factors that influence the level of utilization of services and what may be causing this level to deviate from the optimum. This analysis requires a significant amount of data on potential clients and actual clients, as well as on the service environment. This environment will likely include not only public and private providers, but also formal and informal providers. As Lindelow and Wagstaff note in the case of health, the accurate description of this environment may be tricky because many surveys only include the nearest facilities. Other surveys (for example, the Indonesia Family Life Survey discussed by Beegle in chapter 15) take a broader approach and sample all known facilities, as well as all the facilities that are used. However, as a number of chapters in this volume indicate, attempts to describe the service environment within which households make their choices require careful attention to survey design, particularly the sampling method.

    A third strand of research seeks to unpack the functioning of facilities in a manner that is more in-depth than basic monitoring. This research tackles efficiency within and across facilities by looking at how inputs are used to provide different levels of service. This type of work is able to address the issue of whether human and physical capital is being used in the appropriate proportions, whether facilities are operating at the optimal size, and what inputs are being wasted. Given the focus on the production process within facilities, this type of research will require in-depth facility data supplied through a well-functioning administrative system or through facility surveys.

    Many of the chapters in this volume discuss these and other research questions that service delivery measurement tools have been used to address (for example, see chapter 4 by Lanjouw and Özler on inequality and project choice). An exciting aspect of these chapters is the fact that they show how, in many cases, the same data may be used for both routine program monitoring and in-depth research, thereby providing a range of feedback for improving the design of policies.

    The Structure of This Book

    This volume does not need to be read straight through; each chapter is meant to be freestanding. Nonetheless, a useful place to start might be chapters 2 and 3 (part I), which are designed to provide an overview of the different tools in health and education. In health, the focus has been more on facility surveys, and Lindelow and Wagstaff, therefore, use this as their point of departure. Amin and Chaudhury provide an introduction to methods for measuring service delivery in education.

    Part II discusses administrative data and how they may be used to understand program design and program effects. In chapter 4, Lanjouw and Özler discuss administrative data and their uses in understanding the link between inequality and project choice. Galasso, in chapter 5, also uses administrative data, in this case to analyze the effects of a nutrition program in Madagascar. In chapter 6, Behrman and King discuss the ways in which the duration of exposure may affect project impacts and examine this in the case of an early childhood development program in the Philippines.

    Part III presents the trials and tribulations involved in tracking public spending data. In chapter 7, Lindelow discusses general problems with defining the concept of leakage and provides an example of a survey in Mozambique. Wane, in chapter 8, offers some insight into complications in comprehending budgets and presents a case in Chad. In chapter 9, Filmer provides a transition to part IV by discussing both Public Expenditure Tracking Surveys and Quantitative Service Delivery Surveys in the context of Indonesia and Papua New Guinea.

    Part IV focuses on a range of facility surveys. In chapter 10, Frankenberg and others open with a discussion of the advantages of combining facility survey data with a range of data sources to assess health and education services in Indonesia following the tsunami. In chapter 11, Bekh and others analyze a school survey in Ukraine. Chapter 12, by Serneels, Lindelow, and Lievens, offers us insights into the use of qualitative work, particularly to guide the ultimate choice among quantitative instruments, in the context of examining absenteeism in Ethiopia and Rwanda. In chapter 13, Das and Leonard review and discuss experiences with the use of vignettes to measure quality in the health sector.

    Part V focuses on cases in which household and facility surveys are combined. In chapter 14, Lundberg compares client satisfaction and perceived quality among health facilities in Uganda using exit polls and household surveys. Beegle, in chapter 15, discusses the Indonesian experience with linking health facility, school, and household surveys. In chapter 16, Scott provides a broad overview of the incorporation of service provider data into Living Standards Measurement Study surveys.

    Part VI concludes by pulling together some of the general lessons that may be derived from the applications of service delivery measurement that are described in the chapters.

    Note

    The author wishes to thank Samia Amin, Louise Cord, Elizabeth King, Kenneth Leonard, and Maureen Lewis for valuable comments and Deon Filmer and Shanta Devarajan for useful conversations.

    1. For more on impact evaluation methods, see Ravallion, forthcoming, and the papers and resources at http://www.worldbank.org/impactevaluation.

    References

    Filmer, Deon, Jeffrey S. Hammer, and Lant H. Pritchett. 2000. "Weak Links in the Chain: A Diagnosis of Health Policy in Poor Countries." World Bank Research Observer 15 (2): 199–224.

    Grosh, Margaret, and Paul Glewwe, eds. 2000. Designing Household Survey Questionnaires for Developing Countries: Lessons from 15 Years of the Living Standards Measurement Study. 3 vols. Washington, DC: World Bank; New York: Oxford University Press.

    Lewis, Maureen. 2006. "Governance and Corruption in Public Health Care Systems." Working Paper 78, Center for Global Development, Washington, DC.

    Ravallion, Martin. Forthcoming. "Evaluating Anti-Poverty Programs." In Handbook of Development Economics, Vol. 4, ed. T. Paul Schultz and John Strauss. Amsterdam: North-Holland.

    World Bank. 2003. World Development Report 2004: Making Services Work for Poor People. Washington, DC: World Bank; New York: Oxford University Press.

    2

    Assessment of Health Facility Performance

    An Introduction to Data and Measurement Issues

    Magnus Lindelow

    Adam Wagstaff

    Introduction

    Over the past 20 years, household surveys have considerably improved our understanding of health outcomes and health-related behavior in developing countries. For example, data from surveys such as the Living Standards Measurement Study (LSMS) surveys and the Demographic and Health Surveys (DHS) program have shed light on the nature and determinants of health outcomes, while providing information on health-related behavior, including household expenditure on health care and the utilization of health services. Although these and other surveys have strengthened the foundations for policy design and implementation, they have also highlighted the need to understand the supply side of service delivery more thoroughly. The supply-side perspective has also gained importance because many health systems have had to grapple with real or perceived problems of inefficiency, low quality, inequality, and unsustainable financing.

    In this context, health facility surveys have come to make up an important source of information about the characteristics and activities of health facilities and about the financing and institutional arrangements that support service delivery. Although those surveys all have the health facility as the focal point, they vary in at least four important dimensions. First, they have different goals. Some of the surveys aim to deepen understanding of the ways in which health facility characteristics influence health-seeking behavior and health outcomes. Others make the facility the focus of analysis and emphasize issues such as cost, efficiency, and quality. Still other surveys are designed to illuminate the broader context of service delivery, including links among providers or between providers and the government. Second, the scope and nature of the data collected by surveys differ. For example, although many surveys collect data on inputs, not all collect data on costs or on the clinical dimensions of health care quality. Third, surveys collect data in different ways and adopt different approaches to measurement. Fourth, surveys vary in the uses to which the data are put. In some cases, the attention is on research. In others, the principal application has been use as a tool in designing interventions or in monitoring and evaluating programs.

    In this chapter, we provide an introduction to health facility surveys and the methodological approaches that underpin them. Although the focus is on health facility surveys, the chapter also draws attention to other sources of data for assessing and understanding the performance of health facilities. For example, household or community surveys are a major tool for shedding light on the perceptions of actual or potential clients. Moreover, in developed countries, administrative data from hospitals and other facilities play an important part in assessing costs and quality. Although administrative data can also be a key source of information in developing countries, surveys tend to play a more significant role given the incompleteness and limited reliability that plague routine data.

    The chapter is organized as follows. In the next section, we discuss the motivation for various health facility surveys. The subsequent section provides details on the types of data that have been collected in the surveys and on measurement issues. The following section outlines some of the uses of facility data. In the last section, we offer conclusions, including a discussion of the lessons learned and of emerging themes. The chapter also includes an annex that summarizes key information about selected health survey programs.

    The Motivation for Health Facility Surveys

    Provider-Household Links

    Health facility surveys have often been motivated by the desire to understand the links among the availability of health facilities, household health-seeking behavior, and health outcomes among households (see figure 2.1). Indeed, a number of health facility surveys have been implemented with the explicit purpose of feeding into analyses of household-level data. Although household-level behavior and outcomes are shaped in large part by factors within the household (income, education, location, and so on), the characteristics of local health care providers may be an important determinant of health service utilization, the effectiveness of health care interventions, and client perceptions.

    Figure 2.1 Provider–Household Links


    Source: Compiled by the authors.

    Work on the links between households and providers dates back to the 1970s, when the World Fertility Survey started to collect data to measure the impact of health service availability on fertility and mortality (see Turner et al. 2001). Initially, data were collected at the community level through interviews with key community informants. This practice was continued in the context of the DHS program, which took over from the World Fertility Survey in 1984. Many LSMS surveys have also included community modules for the collection of information on the availability of public services, for example.

    Visiting the actual providers was a natural extension of the collection of community data on service delivery. In the late 1980s, a number of LSMS surveys (such as the ones on Côte d’Ivoire and Jamaica) experimented with health facility surveys and school surveys to complement the household data. A more systematic approach, called Situation Analysis, was introduced in 1989 by the Population Council, an international nonprofit nongovernmental organization. The focus was on family planning and reproductive health services. At least in part, the approach was motivated by findings emerging from the DHS program that weaknesses on the supply side were important in explaining low contraceptive prevalence (see Miller et al. 1997, 1998). The Situation Analysis was based on a representative sample of service delivery units and included structured interviews with managers and facility staff, inventory reviews, direct observation of clinic facilities and the availability of equipment and consumables, reviews of service statistics, direct observation of client-provider interactions, and interviews with clients of family planning and maternal and child health services. More recently, facility surveys have been implemented in conjunction or in coordination with DHS household surveys. Known as Service Provision Assessments, those surveys are ambitious in scope; they seek to supply information about the characteristics of health services, including extensive information about quality, resource availability, and infrastructure.

    Measuring and Understanding Provider Performance

    In some surveys, the facility rather than the household has been the main object of interest. Facility data may serve in program monitoring or evaluation, but they may also form the basis of empirical work on the determinants of facility performance. For example, how may we account for differences across providers in key dimensions of performance such as quality, costs, and efficiency? The facility surveys designed to explore provider-household links have generally offered little insight into these issues. Figure 2.2 illustrates the complex institutional and organizational environment that influences the performance of health care providers. This environment operates through the financing, support, regulation, and oversight provided by the administrative levels upstream from the facilities; the competitive environment in which the facilities are operating; and the oversight, accountability, and influence exercised by households and communities.
