This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.

While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions.

We aimed to guide the design of multiplier method population size estimation studies that use respondent-driven sampling surveys, so as to reduce the random error around the resulting estimate.

The population size estimate is obtained by dividing the number of individuals receiving a service or the number of unique objects distributed (

There is high variance in estimates. Random error around the size estimate reflects uncertainty from

We suggest a method for investigating the effects of sample size on the precision of a population size estimate obtained using multiplier methods and respondent-driven sampling. Uncertainty in the size estimate is high, particularly when

Population size estimates (PSE) for those most at risk for human immunodeficiency virus infection are crucial for making epidemic projections, allocating funding, and monitoring coverage of prevention and care programs [

While there has been research into sample size requirements for RDS surveys [

We briefly outline multiplier method size estimation, the approach to estimating uncertainty in the resulting population size estimates, and integrate this with advice on design effects and sample size requirements for RDS surveys.

Multiplier methods use 2 sources of data to estimate population size as described above: (1) a count of unique individuals from the target population receiving a service or unique objects distributed among this population,

Johnston et al. [

Equations for estimating population size, study sample size, and variance of the population size estimate.
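The equations this caption refers to did not survive in this copy. As a hedged sketch only, the standard forms consistent with the surrounding text are given below. They are assumptions, not the authors' exact expressions: the symbols $\hat{P}$, $n$, $N$, and $\mathrm{SE}(\hat{P})$ are introduced here for illustration, the finite population correction is one common form, and the variance line is the usual delta-method approximation that treats the count $M$ as known without error.

```latex
% Assumed standard forms; not quoted from the source.
\begin{align}
\hat{N} &= \frac{M}{\hat{P}}
  && \text{size estimate: count divided by survey proportion} \\
n &= \mathrm{DEFF} \times \frac{\hat{P}\,(1-\hat{P})}{\mathrm{SE}(\hat{P})^{2}}
  && \text{sample size for a target standard error} \\
n_{adj} &= \frac{n}{1 + (n-1)/N}
  && \text{finite population correction (one common form)} \\
\mathrm{Var}(\hat{N}) &\approx \frac{M^{2}\,\mathrm{Var}(\hat{P})}{\hat{P}^{4}}
  && \text{delta-method variance of the size estimate}
\end{align}
```

Under these forms, the relative standard error of $\hat{N}$ equals the relative standard error of $\hat{P}$, so a small proportion estimated with a given absolute precision yields a wide interval around the size estimate.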

RDS is a structured, peer-referral recruitment method that assumes a model for estimating each participant’s probability of inclusion, thus allowing responses to be weighted to approximate a random sample [_{P} is the estimate for the proportion we wish to estimate, and _{adj} that has been corrected for an estimated finite population as Equation 3 (

Rearranging Equation 2, and using _{adj} as obtained in Equation 3,

We examined the relationship between sample size,

To estimate the number of FSW in Harare, we planned an RDS survey of FSW aged 18 and older who had resided in the city for at least the previous 6 months. For service data, we planned to use Sisters with a Voice clinic attendance records. FSW attending this clinic, which provides sexual and reproductive health services for self-identified FSW, are given unique identification numbers, and their visits are recorded and dated (described further elsewhere [

To identify a reasonable estimated FSW population size for sample size calculation, we used previous estimates from a systematic review of FSW prevalence among 15- to 49-year-old women in sites from sub-Saharan Africa (.07%–4.3%) and multiplied them by the number of women of this age in Harare [

We examined the number of sex workers who visited the program for different reference periods up to April 23, 2015 to generate likely values for

For all values of

In

For our Harare example, we were able to review earlier service attendance data to see how the value of

We used previous service attendance data to observe how

Based on changes in the width of the estimated 95% CIs with increasing sample size (

Sample size and width of 95% confidence interval around a fixed population size estimate of 15,000 for different values of

Number of female sex workers attending the Sisters program and effect on

| Reference period to April 23, 2015 | Number of unique female sex workers attending, M | Estimated |
| | 85 | .006 |
| | 560 | .037 |
| | 952 | .063 |
| | 1542 | .103 |
| | 2227 | .148 |
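The tabulated proportions appear to equal each attendance count divided by the fixed population size of 15,000 used in the figure captions. This is an inference from the numbers, not a statement in the text. A minimal Python check of that arithmetic follows; the variable names are illustrative.

```python
# Attendance counts (M) and proportions as they appear in the table.
attendance_counts = [85, 560, 952, 1542, 2227]
reported_proportions = [0.006, 0.037, 0.063, 0.103, 0.148]

# Assumption: the proportions were derived from a population size of 15,000,
# the fixed population size estimate named in the figure captions.
assumed_population = 15000

# Proportion implied by each count, rounded to three decimals as in the table.
implied_proportions = [round(m / assumed_population, 3) for m in attendance_counts]

if __name__ == "__main__":
    for m, implied, reported in zip(attendance_counts, implied_proportions, reported_proportions):
        print(f"M={m}: implied proportion={implied}, tabulated={reported}")
```

If the implied and tabulated values agree, the table can be read as showing how a longer reference period raises the proportion of the population expected to report attending the service.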

Effect of reference period (variations in

Sample size and width of 95% confidence intervals around a population size estimate of 15,000 female sex workers in Harare, for assumed reference period of 6 months and design effects (DEFF) of 2, 3, and 4.
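The relationship described in these captions can be explored with a short sketch. It assumes the RDS variance of the proportion is the simple random sampling variance inflated by the design effect, and that the variance of the size estimate follows the usual delta-method approximation with the service count treated as error-free. The function name and the example count of 952 (one of the tabulated attendance counts, not necessarily the 6-month value) are illustrative choices, and no finite population correction is applied.

```python
import math


def pse_confidence_interval(m, p, n, deff, z=1.96):
    """Approximate CI for a multiplier-method size estimate N = m / p.

    Assumptions (not taken from the source):
      - Var(P) = deff * p * (1 - p) / n
      - m is counted without error
      - delta method: Var(N) ~ m^2 * Var(P) / p^4
      - no finite population correction
    Returns (estimate, lower bound, upper bound).
    """
    estimate = m / p
    var_p = deff * p * (1.0 - p) / n
    se_n = m * math.sqrt(var_p) / p ** 2
    return estimate, estimate - z * se_n, estimate + z * se_n


if __name__ == "__main__":
    m = 952            # illustrative attendance count from the table
    p = m / 15000      # proportion implied by a population of 15,000
    for deff in (2, 3, 4):
        for n in (500, 1000, 2000):
            est, lower, upper = pse_confidence_interval(m, p, n, deff)
            print(f"DEFF={deff}, n={n}: PSE={est:.0f}, CI width={upper - lower:.0f}")
```

Under these assumptions the interval width shrinks with the square root of the sample size and grows with the square root of the design effect. For small samples the normal-approximation lower bound can fall below the service count, which signals that the approximation is too crude at that sample size.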

We have applied current guidance on RDS and multiplier methods to propose an approach to planning population size estimation studies and determining sample size. Our example uses the SMM; similar principles can be applied to the UOM.

Even for large sample sizes, 95% CIs around the PSE are wide. The uncertainty around the PSE is more sensitive to the uncertainty in

We used DEFFs of 2 to 4 in our sample size calculations, but it is possible that a higher value would be more appropriate. Previous research has found that high levels of homophily (similarity) between recruiters and recruitees in RDS surveys are associated with higher DEFFs [

RDS surveys must have sufficient recruitment waves in order to reach stable estimates. There should also be sufficient numbers of seed participants to reflect diversity of the target population [

This short paper considers random error around size estimates and does not address bias resulting from unmet assumptions of the multiplier and RDS methods, which we consider elsewhere [

CI: confidence interval

DEFF: design effect

FSW: female sex workers

PSE: population size estimates

RDS: respondent-driven sampling

SMM: service multiplier method

UOM: unique object multiplier method

This work was supported by the Measurement and Surveillance of Human Immunodeficiency Virus Epidemics Consortium (MeSH), which is funded by the Bill & Melinda Gates Foundation (Funder ID OPP1120138). The Funder has not been involved in manuscript review or approval.

None declared.