LSMS Paradata Tools


Overview

This website explains how to use the paradata tools developed by the Living Standards Measurement Study (LSMS) team at the World Bank Group. The tools analyze paradata from household surveys conducted with Survey Solutions and can help improve the efficiency of survey design.


What is Paradata?

Paradata are data generated as a byproduct of computer-assisted data collection that capture the entire process of creating a final survey dataset.

Think of paradata as a highly disaggregated account of the “life” of a survey. It includes time-stamped records of all “events” associated with each interview, such as interview record creation, interview assignment to an enumerator, answer provision, answer modification and comment addition in each questionnaire field, and interview completion.
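To make the idea of an event log concrete, here is a minimal sketch in Python (the LSMS tools themselves are distributed as R code). The event names, field names, and timestamps are hypothetical and do not reproduce the Survey Solutions export schema. The sketch only shows how time-stamped events can be turned into elapsed times.

```python
from datetime import datetime

# Hypothetical paradata events for a single interview: (timestamp, event, field).
# Event labels and field names are illustrative assumptions, not a real export.
events = [
    ("2024-03-01T09:00:00", "InterviewCreated", None),
    ("2024-03-01T09:00:40", "AnswerSet", "hh_size"),
    ("2024-03-01T09:01:10", "AnswerSet", "head_age"),
    ("2024-03-01T09:01:25", "AnswerSet", "hh_size"),  # an earlier answer is modified
    ("2024-03-01T09:03:00", "Completed", None),
]


def seconds_between_events(rows):
    """Return (event, field, seconds since the previous event) for each event after the first."""
    parsed = [(datetime.fromisoformat(ts), ev, field) for ts, ev, field in rows]
    parsed.sort(key=lambda row: row[0])  # order events by timestamp
    gaps = []
    for (prev_ts, _, _), (ts, ev, field) in zip(parsed, parsed[1:]):
        gaps.append((ev, field, (ts - prev_ts).total_seconds()))
    return gaps


gaps = seconds_between_events(events)
total_seconds = sum(seconds for _, _, seconds in gaps)
```

Summing these gaps over the events that belong to a module gives a rough measure of the time spent on that module, which is the kind of quantity the tools below work with.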

The two tools below were developed to help survey practitioners make the most of their paradata. For more information on how paradata can help survey practitioners and researchers, see Hasanbasri et al. (2024), which provides a list of references and examples of use. Folder systems containing R code can be downloaded to produce paradata analysis reports.


Tool 1: Module Stats

Household surveys consist of a wide range of modules covering various topics. While some modules are simple to administer, others can be complicated and difficult, especially given the limited time and resources available to survey practitioners.

This tool can help answer questions such as:

  • Which modules take too long to administer?

  • Which modules have large variation in interview time?

  • Are there teams that are lagging behind in terms of interview time?

  • Are modules being administered consistently across teams?
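The questions above come down to summarizing module durations overall and by team. The following is a minimal Python sketch of that idea, not the tool's own R code. The team names, module names, and durations are made up for illustration, and the coefficient of variation is just one possible measure of spread.

```python
from collections import defaultdict
from statistics import mean, median, pstdev

# Hypothetical per-interview module durations: (team, module, minutes).
durations = [
    ("team_a", "roster", 10.0), ("team_a", "roster", 12.0),
    ("team_b", "roster", 11.0), ("team_b", "roster", 13.0),
    ("team_a", "consumption", 30.0), ("team_a", "consumption", 34.0),
    ("team_b", "consumption", 50.0), ("team_b", "consumption", 62.0),
]


def summarize(rows, key):
    """Group durations by key(row) and report count, mean, median, and coefficient of variation."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {
        name: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "cv": pstdev(values) / mean(values),  # spread relative to the mean
        }
        for name, values in groups.items()
    }


# Which modules take long, and which vary most across interviews?
by_module = summarize(durations, key=lambda r: r[1])

# Do teams differ on the same module?
by_team_module = summarize(durations, key=lambda r: (r[0], r[1]))
```

In this toy data the consumption module has both a higher mean and a higher coefficient of variation than the roster, and team_b takes noticeably longer on it than team_a, which is the kind of pattern that would prompt a closer look.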


Tool 2: Multilevel Model

Enumerator training is essential to ensure that data are collected efficiently and with limited bias. Researchers can estimate interviewer effects on interview length to identify potential problems with the data collection process.

This tool can help answer questions such as:

  • Which modules are most affected by interviewer effects?

  • Which modules might require more interviewer training?
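One common way to express an interviewer effect is the share of variance in module duration that lies between interviewers. The Python sketch below estimates that share with a one-way ANOVA intraclass correlation on balanced data. This is a deliberately simpler stand-in for a fitted multilevel model, not the method implemented in the tool, and the interviewer IDs and durations are hypothetical.

```python
from statistics import mean

# Hypothetical durations (minutes) of one module, grouped by interviewer.
# Assumption: balanced data, i.e. every interviewer has the same number of interviews.
durations_by_interviewer = {
    "int_01": [20.0, 22.0, 21.0, 23.0],
    "int_02": [30.0, 32.0, 31.0, 33.0],
    "int_03": [25.0, 27.0, 26.0, 28.0],
}


def anova_icc(groups):
    """One-way ANOVA estimate of the between-group share of variance (balanced data only)."""
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    k = sizes.pop()   # interviews per interviewer
    g = len(groups)   # number of interviewers
    if g < 2 or k < 2:
        raise ValueError("need at least two groups with two observations each")

    grand_mean = mean(x for values in groups.values() for x in values)
    ss_between = k * sum((mean(values) - grand_mean) ** 2 for values in groups.values())
    ss_within = sum((x - mean(values)) ** 2 for values in groups.values() for x in values)

    ms_between = ss_between / (g - 1)
    ms_within = ss_within / (g * (k - 1))

    # Method-of-moments variance component, truncated at zero.
    var_between = max((ms_between - ms_within) / k, 0.0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0


icc = anova_icc(durations_by_interviewer)
```

A value near 1 means most of the variation in duration is tied to who conducted the interview, while a value near 0 means interviewers look alike on that module. Comparing this share across modules is one rough way to see where interviewer effects are largest. A full multilevel model would additionally handle unbalanced data and covariates.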