HIV-Specific Reported Outcome Measures: Systematic Review of Psychometric Properties

Background: The management of people living with HIV and AIDS is multidimensional and complex. Using patient-reported outcome measures (PROMs) has been increasingly recognized as key to providing patient-centered health care that meets the lifelong needs of people living with HIV and AIDS from diagnosis to death. However, there is currently no consensus on a PROM recommended for health care providers and researchers to assess health outcomes in people living with HIV and AIDS.
Objective: The purpose of this systematic review was to summarize and categorize the available validated HIV-specific PROMs in adults living with HIV and AIDS and to assess these PROMs using the Consensus-Based Standards for the Selection of Health Measurement Instruments (COSMIN) methodology.
Methods: This systematic review followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. A literature search of 3 recommended databases (PubMed, Embase, and PsycINFO) was conducted on January 15, 2021. Studies were included if they assessed any psychometric property of HIV-specific PROMs in adults living with HIV and AIDS and met the eligibility criteria. The PROMs were assessed for 9 psychometric properties, evaluated in each included study following the COSMIN methodology by assessing the following: the methodological quality assessed using the COSMIN risk of bias checklist; overall rating of results; level of evidence assessed using the modified Grading of Recommendations, Assessment, Development, and Evaluation approach; and level of recommendation.
Results: A total of 88 PROMs classified into 8


Background
According to the statistics from the Joint United Nations Program on HIV/AIDS, 28.2 million individuals were accessing antiretroviral therapy (ART) as of mid-2021 [1]. Although effective treatment via ART has improved the life expectancy of people living with HIV and AIDS [2], this population still faces substantial challenges brought by HIV [3][4][5][6]. Therefore, Lazarus et al [7] proposed the Fourth 90 target to ensure that 90% of people living with HIV and AIDS with viral suppression have a good health-related quality of life (HRQoL) after the World Health Organization proposed the 90-90-90 targets. They proposed that HRQoL in people living with HIV and AIDS should be considered as important as viral suppression [8]. For people living with HIV and AIDS, the focus should be shifted toward improving HIV-related care [9].
The management of people living with HIV and AIDS is multidimensional and complex. To overcome the obstacles to achieving the Fourth 90 [10], patient-centered care that can meet the lifelong needs of people living with HIV and AIDS from diagnosis to death is the key requirement [9]. The collection and use of patient-reported outcome (PRO) data is one of the most effective approaches for ensuring that the care reflects the needs and priorities of people living with HIV and AIDS [9]. Compared with clinician-reported outcomes, PROs present a more comprehensive method for assessing the subjective perceptions of people living with HIV and AIDS of their own health that cannot be observed or are not easily observed directly and have been shown to accurately predict health outcomes among this population [11,12]. Furthermore, there is sufficient evidence that PROs can be used to improve the care quality and health outcomes in people living with HIV and AIDS, such as by improving patient-physician communication [13], clinical decision-making [14], and symptom recognition [15].

Why Did This Systematic Review Only Include HIV-Specific PRO Measures?
Patient-reported outcome measures (PROMs) are the actual tool developed for collecting PRO data. There are 2 types of PROMs: generic (designed for use in any population and cover general aspects of outcome measures) and disease specific (designed for use in people with a condition and measure specific aspects of an outcome of importance). Many generic and HIV-specific PROMs have been validated in people living with HIV and AIDS. The advantage of a generic PROM is that it enables researchers to compare the health outcomes of people living with HIV and AIDS with those of other populations based on the same measurements [16]. However, unlike generic PROMs, HIV-specific PROMs do not have a significant ceiling and floor effect and do not overestimate health outcomes in people living with HIV and AIDS [17,18]. Furthermore, HIV-specific PROMs are more closely associated with HIV than are generic PROMs. In addition, they have the sensitivity for detecting and quantifying minor changes and specificity needed for HIV-specific domains, such as HIV-related stigma, comorbidities, and ART-related treatment [19]. Some related reviews have recommended a strategy to combine generic and HIV-specific PROMs to supplement HIV-specific health care outcomes that cannot be obtained with generic PROMs alone [20,21]. Clayson et al [20] suggested that the right combination of generic and HIV-specific PROMs can improve the comprehensiveness of assessment content, such that it includes not only the 3 core domains that generic PROMs focus on, that is, physical function, social or role function, and mental health or emotional well-being, but also the items or domains addressing issues relevant to HIV or AIDS and its treatment. 
Considering that many HIV-specific PROMs were developed before the widespread use of ART, they may not be able to detect the impact of current treatment on people living with HIV and AIDS and serve as an assessment tool for the long-term management of people living with HIV and AIDS [9]. In addition, many poorly designed PROMs lack a standardized development process. Therefore, it is necessary to summarize the existing HIV-specific PROMs and assess their psychometric properties.

Previous Studies
With the rapid development of this field, many HIV-specific PROMs have been developed. After a preliminary literature search in MEDLINE using a comprehensive search strategy (Table S1 in Multimedia Appendix 1), we found some relevant reviews. Wen et al [19] recently conducted a systematic review on a similar topic; however, they only aimed at identifying and assessing the psychometric properties of HRQoL measures in people living with HIV and AIDS. Engler et al [22] identified 117 different HIV-specific PROMs in 2016; however, they did not quantitatively assess the psychometric properties of these PROMs. Cooper [16] reported an overview of the available reviews and summarized the PROMs with <40 items for measuring HRQoL in people living with HIV and AIDS in 2017. Earlier, several researchers conducted nonsystematic reviews of some PROMs in specific contexts [20,23,24]. Although many previous reviews have summarized the content of some existing HIV-specific PROMs, few have comprehensively reported the psychometric properties of these PROMs or given recommendations for their use.
As accurate and reliable PROMs are a precondition for obtaining robust results, PROMs with good psychometric properties are indispensable for research [25]. Lancet HIV also suggested in the special issue of "HIV outcomes beyond viral suppression" that the psychometric properties of the existing PROMs should be assessed in line with the existing guidelines, such as the Consensus-Based Standards for the Selection of Health Measurement Instruments (COSMIN) guidelines [9]. The COSMIN guidelines provide a consecutive procedure to help health care providers and researchers improve the selection of the most suitable PROMs in research and clinical practice [26]. Therefore, we conducted a systematic review to identify studies assessing the psychometric properties of HIV-specific PROMs validated in a population of adults living with HIV and AIDS and categorized these PROMs based on the type of outcome measure. We further assessed the methodological quality and level of evidence of these PROMs in association with their psychometric properties.

Objective
The purpose of this systematic review was to summarize and categorize the available and validated HIV-specific PROMs for adults living with HIV and AIDS. This systematic review also aimed to use the COSMIN methodology to assess the psychometric properties of these PROMs and make an evidence-based and completely transparent recommendation for the use of these PROMs.

Overview
This systematic review was conducted and reported according to the COSMIN guidelines [27] and the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [28]. It included only a secondary data analysis of publicly available content not involving human participants. Therefore, ethics approval was not required for this review.

Search Strategy
Three literature databases (MEDLINE, Embase, and PsycINFO) were searched on January 15, 2021. Two important web databases, PROQOLID and PROMIS, which contain a large number of PROMs and cover a wide range of populations and therapeutic areas, were also searched for PROMs. These 2 databases were developed by the Mapi Research Trust in France and the National Institutes of Health in the United States to facilitate the selection process of PROMs and are now used by many clinical investigators. The reference lists of relevant reviews in the preliminary literature search and the included studies were further examined for relevant publications. The search strategy used three COSMIN-guided search terms in reference to the search for constructs developed by Terwee et al [29]: (1) construct of interest, (2) condition of interest, and (3) psychometric properties (Table S2 in Multimedia Appendix 1). A comprehensive search strategy was developed under the guidance of a senior health research librarian.

Study Selection
The eligibility criteria of the studies were as follows: (1) the study validated HIV-specific PROMs for adults living with HIV or AIDS and assessed at least one of the 9 psychometric properties defined by the COSMIN guidelines: content validity, structural validity, internal consistency, cross-cultural validity or measurement invariance, reliability, measurement error, criterion validity, hypotheses testing for construct validity, and responsiveness [30]; (2) the study was published in English in a peer-reviewed journal; and (3) the study applied self-administered PROMs for patients.
Studies were excluded if (1) they used the PROM mainly for outcome measures rather than for assessing the 9 psychometric properties; (2) they developed and used PROMs for screening or diagnostic purposes only; (3) they were not an original investigation, such as reviews, letters, and editorials; (4) they included generic PROMs or other disease-specific PROMs not related to or only partially related to HIV (such as the 36-Item Short-Form Health Survey Questionnaire); and (5) they provided indirect evidence of psychometric properties (such as studies using a PROM in a validation study of another instrument [30]).
The retrieved literature was imported into the EndNote software (version X9; Clarivate Plc), and duplicates were automatically removed. A 2-stage screening process was used to select eligible studies. First, the titles and abstracts were screened based on the predetermined selection criteria (stage I). Subsequently, the full texts of articles deemed relevant or possibly relevant were obtained and further assessed for eligibility (stage II). Two independent researchers (ZW and YZ) determined study eligibility, and any disagreement was settled by consensus or discussion with a third researcher (BQ).

Data Extraction
For the eligible studies, data were independently extracted by the same 2 researchers (ZW and YZ) using a standardized form, and completeness and correctness were confirmed. Any discrepancy was resolved via a discussion with the third researcher (BQ). The extracted data included the characteristics of PROMs (name of the PROM [abbreviation], year of PROM development, targeted concept, recall period, number of items, each domain and the number of items in each domain, response options and score range, and original language), characteristics of the included studies (first author [year of publication], the total number of patients [N], age, gender, patient description, years since diagnosis, severity of disease, recruitment context, country of research, and effective response rate of the questionnaire), and results of the included studies (COSMIN risk of bias information, evidence of the 9 psychometric properties, and COSMIN summary and rating).

Data Analysis
According to the suggestions mentioned in the COSMIN guidelines, each PROM was assessed via a 4-step process [27]. First, the methodological quality for every psychometric property in each study was assessed using the COSMIN risk of bias checklist based on a four-point response, "very good," "adequate," "doubtful," or "inadequate," and an overall rating of the psychometric property was determined based on the item with the worst rating [30]. Second, the results for every psychometric property in each study were rated based on the updated criteria for good psychometric properties [27], and each result was graded as positive (+), negative (-), or indeterminate (?). Third, the overall results for each psychometric property of a PROM were rated as sufficient (+), insufficient (-), inconsistent (±), or indeterminate (?), and the level of evidence for each psychometric property of a PROM was rated as "high," "moderate," "low," or "very low" by following the Grading of Recommendations, Assessment, Development, and Evaluation approach, which considered the initial level of evidence to be high, with subsequent downgrading based on the score for 4 criteria: risk of bias, inconsistencies, imprecision, and indirectness. Finally, a table summarizing the findings was constructed and used to make recommendations for the selection of the most suitable PROMs.
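The first and third steps above can be illustrated with a minimal Python sketch. It is not the COSMIN tooling itself: the function names are hypothetical, and the numeric encoding of downgrades (0 for no concern, 1 for serious, 2 for very serious, and so on) is an assumption made for illustration only.

```python
# Ordered from best to worst, exactly as listed in the text.
RISK_OF_BIAS_SCALE = ["very good", "adequate", "doubtful", "inadequate"]
EVIDENCE_LEVELS = ["high", "moderate", "low", "very low"]
GRADE_CRITERIA = ("risk of bias", "inconsistency", "imprecision", "indirectness")


def worst_score_counts(item_ratings):
    """Overall methodological quality = the worst rating among checklist items."""
    if not item_ratings:
        raise ValueError("at least one checklist item rating is required")
    # A higher index on the scale means a worse rating.
    return max(item_ratings, key=RISK_OF_BIAS_SCALE.index)


def level_of_evidence(downgrades):
    """Start at 'high' and step down one level per downgrade point.

    `downgrades` maps a criterion name to the number of levels to subtract
    (assumed encoding); missing criteria count as 0. The result never
    drops below 'very low'.
    """
    total = sum(downgrades.get(criterion, 0) for criterion in GRADE_CRITERIA)
    return EVIDENCE_LEVELS[min(total, len(EVIDENCE_LEVELS) - 1)]


if __name__ == "__main__":
    print(worst_score_counts(["very good", "doubtful", "adequate"]))
    print(level_of_evidence({"risk of bias": 1, "imprecision": 1}))
```

Taking the worst item rating means that a single poorly handled design element caps the quality of the whole assessment, which is why many records in a review of this kind end up rated as "doubtful" or "inadequate."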
All assessments were conducted independently by 3 researchers (ZW, HK, and XD), and any disagreement was settled via consensus or discussion with a fourth researcher (YZ). The Cohen κ coefficient was calculated using the SPSS software (version 24.0; IBM) to evaluate the interrater agreement for title and abstract screening, study selection, and data extraction.
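For readers unfamiliar with the statistic, the Cohen κ coefficient can be sketched in a few lines of Python. This is an illustrative reimplementation for 2 raters, not the SPSS procedure used in this review; the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        # Both raters used one identical label throughout; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical screening decisions for 4 abstracts.
    a = ["include", "include", "exclude", "exclude"]
    b = ["include", "exclude", "exclude", "exclude"]
    print(cohen_kappa(a, b))
```

Because κ corrects the raw agreement for chance, it is lower than the simple proportion of matching decisions whenever the raters' label frequencies make accidental agreement likely.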

Characteristics of the Included Records
As 3 of the 152 included studies [34,35,42] each assessed 2 PROMs, 155 records were included. Table S4 in Multimedia Appendix 1 also summarizes the years since diagnosis, the severity of the disease, recruitment context, and effective response rate.

Methodological Quality Assessment
The methodological quality for each psychometric property of every record is summarized in Table S5 in Multimedia Appendix 1 based on the COSMIN risk of bias checklist. As there is no generally accepted "gold standard" for assessing health outcomes in adults living with HIV and AIDS, criterion validity was not considered for any study. Most records assessed internal consistency (146/155, 94.2% of records) and structural validity (96/155, 61.9% of records), and most of them were rated as "very good" or "adequate." Although 79.4% (123/155) of records assessed the hypotheses testing for construct validity, most were rated as "doubtful" or "inadequate." As for the remaining psychometric properties, only a few records assessed them, and most of them were rated as "doubtful" or "inadequate." Table S6 in Multimedia Appendix 1 shows the results of each psychometric property of each record. The overall results and the level of evidence are presented in Table S7 in Multimedia Appendix 1. Except for some well-known PROMs, only a few studies are available for each PROM; accordingly, there is little evidence for psychometric properties.

Recommendations
The following recommendations are presented according to the COSMIN guidelines (Table 2):
• Class A: The PROMs with evidence for "sufficient" content validity (any level) and at least low-quality evidence for "sufficient" internal consistency included the following: Poz Quality of Life (PozQoL) [102], HIV Symptom Index or Symptoms Distress Module of the Adult AIDS Clinical Trial Group (HIV-SI or SDM) [110][111][112], and People Living with HIV Resilience Scale (PLHIV-RS) [151]. These may be recommended for use, and the results obtained may be credible.
• Class B: The remaining PROMs have the potential to be recommended for use; however, further research is required to assess their quality (PROMs not included in class A or C).
• Class C: The PROMs with high-quality evidence for an "insufficient" psychometric property included the following: Multidimensional Quality of Life for patients With HIV and AIDS [72][73][74][75], Patient-Reported Outcome Quality of Life-HIV Questionnaire-38 [101], HIV-Related Fatigue Scale [113][114][115], HIV Stigma Scale-10 [130], HIV or AIDS Stress Scale [144], Screenphiv [146,147], SECope [167], and HIV Exercise Stereotypes Scale [181]. They may not be recommended for use.
Although 3 PROMs have been recommended, they all have some shortcomings, reducing the strength of the recommendation for their routine use. Furthermore, although PozQoL [102] and PLHIV-RS [151] achieved class A, they were developed and assessed based on a single validation study. In addition, some items in HIV-SI or SDM have significant differential item functioning between different cultural groups [111], indicating low-quality evidence for "insufficient" cross-cultural validity.
Table 2 footnotes [182]:
a As there is no generally accepted "gold standard" for assessing health outcomes in adults living with HIV and AIDS, the criterion validity of all studies was not considered. Overall results of PROMs are rated as +: sufficient; ?: indeterminate; ±: inconsistent; and -: insufficient. LoE is rated as H: high; M: moderate; L: low; and VL: very low. Blank cells indicate that the data are not available.
b PROM: patient-reported outcome measure.
c Internal consistency can be rated as "sufficient" if there is at least low evidence for "sufficient" structural validity, and Cronbach α values ≥.70 for each unidimensional scale or subscale; the evidence for "sufficient" structural validity may come from different studies, and the "at least low evidence" was defined by grading the evidence according to the Grading of Recommendations, Assessment, Development, and Evaluation approach.
d CCV or MI: cross-cultural validity or measurement invariance.
e HTCV: hypotheses testing for construct validity.
f The results of all included records should be taken together, and it should then be decided if 75% of the results are in accordance with the hypotheses. Only assessed measurement properties are shown.
g Class A represents evidence for sufficient content validity (any level) and at least low-quality evidence for sufficient internal consistency (PROMs can be recommended for use); class B, PROMs categorized not in class A or C; and class C, high-quality evidence for an insufficient measurement property; PROMs with class B recommendation require further evaluation to assess their quality before recommendation for use; PROMs with class C recommendation are not recommended for use.
h LoE: level of evidence (using the Grading of Recommendations, Assessment, Development, and Evaluations assessment tool).
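The class A, B, and C definitions above can be written as a small decision rule. The following Python sketch is illustrative only: the function name and data layout are hypothetical, and checking class C before class A is an assumption, because the definitions do not state which class takes precedence when a PROM meets both.

```python
# Levels counted as "at least low-quality evidence".
AT_LEAST_LOW = {"high", "moderate", "low"}


def recommendation_class(properties):
    """Assign class 'A', 'B', or 'C' to one PROM.

    `properties` maps a psychometric property name to a tuple
    (overall rating, level of evidence), where the rating is one of
    '+', '-', '±', '?' and the level is one of
    'high', 'moderate', 'low', 'very low'.
    """
    # Class C: high-quality evidence for any insufficient property.
    # Assumption: this check comes first.
    for rating, level in properties.values():
        if rating == "-" and level == "high":
            return "C"

    content = properties.get("content validity")
    consistency = properties.get("internal consistency")
    # Class A: sufficient content validity at any level of evidence ...
    content_ok = content is not None and content[0] == "+"
    # ... plus at least low-quality evidence for sufficient internal consistency.
    consistency_ok = (
        consistency is not None
        and consistency[0] == "+"
        and consistency[1] in AT_LEAST_LOW
    )
    if content_ok and consistency_ok:
        return "A"
    # Class B: everything not in class A or C.
    return "B"


if __name__ == "__main__":
    example = {
        "content validity": ("+", "very low"),
        "internal consistency": ("+", "low"),
        "cross-cultural validity": ("-", "low"),
    }
    print(recommendation_class(example))
```

In the example, low-quality evidence for an insufficient property does not trigger class C, which mirrors how HIV-SI or SDM remains in class A despite its cross-cultural validity result.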

Principal Findings
From the 152 included studies, we identified 88 PROMs in 8 categories for adults living with HIV and AIDS, and the psychometric properties of the majority of the included PROMs were rated with insufficient evidence. The principal finding of this review was the lack of comprehensively validated HIV-specific PROMs for the assessment of health outcomes in adults living with HIV and AIDS. Although 3 available PROMs (PozQoL, HIV-SI or SDM, and PLHIV-RS) have been recommended based on the COSMIN guidelines, they all have some shortcomings. In addition, because of limited evidence, recommendations regarding the use of most of the remaining assessed PROMs (class B recommendation) cannot be made. These findings emphasize the need for a more comprehensive validation of the psychometric properties of the existing PROMs. Furthermore, our findings indicate the need for a robust and rapid validation of PROMs through the use of electronic PROMs (ePROMs) and modern measurement theories (such as Item Response Theory).

Taxonomy of HIV-Specific PROMs
This systematic review updated the review reported by Engler et al [22] and refined the inclusion and exclusion criteria such that many unvalidated PROMs were excluded, because including them would have prevented us from summarizing the overall status of their psychometric properties. In addition, using the 12 categories derived by inductive content analysis in the review by Engler et al [22] as a reference, this review reported 8 integrated categories (Table 1). The 2 categories of "ART and adherence-related views and experiences" and "healthcare-related views and experiences" in the study by Engler et al [22] were integrated into "treatment," and "psychological challenges" and "psychological resources" were integrated into the category "psychological"; the PROMs in the "sexual and reproductive health" category were excluded because they did not meet the inclusion criteria for our study. Finally, the "Disability" category was integrated with "Symptoms." The new taxonomy proposed in this review should help health care providers and researchers in selecting PROMs.
In addition, although some of the PROMs included cognitive function or symptoms to some extent (such as "cognitive functioning" of Medical Outcomes Study-HIV Health Survey and "cognitive symptoms" of HIV Disability Questionnaire), no PROM specifically designed to measure cognitive concerns was included in the analysis. However, considering the high prevalence of HIV-associated neurocognitive disorders and HIV-associated dementia in people living with HIV and AIDS, it is important to assess their cognition via PROMs [184]. Askari [185,186]  Although the related PROMs were not included in this review, we will further explore these cognitive concerns as an independent PROM category in future studies.

Overview
A thorough validation process is important for ensuring the applicability of a PROM to individual patient care [187]. However, in this review, most included PROMs lacked evidence for many psychometric properties, such as content validity, measurement error, cross-cultural validity or measurement invariance, and responsiveness. Therefore, it was difficult to assess the quality of these PROMs.

Content Validity
On the basis of the most up-to-date COSMIN methodology [26], content validity is the most important psychometric property, and the current guidance suggests that it is very important for patients to participate in development and validation studies [25]. As suggested by Selby and Velikova [188], patient and public involvement should appear as a core feature in PROM design and application. In addition, Wilson [189] believed that the perception of patients was essential for providing better insights into how a disease affects HRQoL. However, evidence of patient and public involvement in the development process of the included PROMs was lacking. To determine whether a PROM is well designed, it should be confirmed that the PROM is relevant, comprehensible, and comprehensive from a patient perspective and for its context of use [190]. In addition, PROMs should be able to record the experience of people living with HIV and AIDS and how HIV affects their lives so as to make a study more relevant and have better content validity [191].

Internal Structure
Internal consistency was the most frequently reported psychometric property. However, many studies used internal consistency as the only indicator of reliability, which is insufficient. Structural validity is also one of the most important psychometric properties [192]. The premise for assessing internal consistency is at least "low" evidence for "sufficient" structural validity, and this evidence may come from different studies [27]. However, many studies conducted only exploratory factor analysis, rather than confirmatory factor analysis, to assess structural validity. Accordingly, this property can only obtain the rating of "indeterminate," further affecting the assessment of internal consistency. Moreover, the assessment of structural validity in most studies included in this review was based on classical test theory. Only 2 studies in this review used Rasch analysis to assess the extent of interval-level measurement and unidimensionality [62,67]. However, no guidance has been provided in the COSMIN guidelines with regard to relying on only Rasch analysis without classical test theory statistics to assess the structural validity of PROMs. Therefore, Recchioni [193] suggested that it is necessary to provide additional guidance for studies that use only Rasch analysis, especially in the development of new PROMs.
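As a reminder of what the internal consistency statistic measures, the following Python sketch computes the Cronbach α for one unidimensional scale from raw item scores. It is a minimal illustration using the standard formula; the function name and the example data are hypothetical, and the ≥.70 threshold is the criterion cited in this review.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one unidimensional scale.

    `responses` is a list of respondents, each a list of k item scores.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    if len(responses) < 2:
        raise ValueError("at least 2 respondents are required")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same k >= 2 item scores")
    # Transpose so that each tuple holds all scores for one item.
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


if __name__ == "__main__":
    # Hypothetical scores: 4 respondents, 3 items.
    scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
    alpha = cronbach_alpha(scores)
    print(alpha, alpha >= 0.70)
```

A high α alone does not show that the items form a single dimension, which is why the rating of internal consistency depends on separate evidence for structural validity.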
A PROM developed in one particular context may not be suitable for another. However, direct comparisons between different populations require the use of the same PROM. No positive results for cross-cultural validity or measurement invariance were reported in this review [82,111,122,124,127], showing that the validity and transferability of the included PROMs between different geographies, cultural contexts, and risk populations were still unclear. Many researchers directly use the existing PROMs through simple translations and ignore cross-cultural adaptation [194]. However, there are great differences in the understanding of some concepts among people of different cultures, global regions, genders, ages, and socioeconomic strata [195]. The use of PROMs in different contexts should not depend simply on translating items but should follow a 7-key-step process for comprehensive cross-cultural adaptation [196].

Remaining Psychometric Properties
Measurement error is also important for interpreting PROs. Minimal important change is best calculated from multiple studies and using multiple anchors with an anchor-based longitudinal approach [197]. In this review, only 1 study reported the smallest detectable change, which ranged from 7.3 to 15.0 points, without reporting the minimal important change. Therefore, measurement error was assessed as "indeterminate" [118].
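The reason a missing minimal important change leads to an "indeterminate" rating can be shown with a short Python sketch. It uses the commonly cited formula SDC = 1.96 × √2 × SEM (SEM: standard error of measurement) and a comparison rule in which measurement error is sufficient only when the smallest detectable change is below the minimal important change. The function names, the SEM value, and the minimal important change of 10 points are hypothetical and do not come from the included studies.

```python
import math


def smallest_detectable_change(sem):
    """SDC at the individual level from the standard error of measurement."""
    if sem < 0:
        raise ValueError("the standard error of measurement cannot be negative")
    return 1.96 * math.sqrt(2) * sem


def rate_measurement_error(sdc, mic=None):
    """Rate measurement error as '+', '-', or '?'.

    Without a minimal important change (MIC) there is nothing to compare
    the SDC against, so the rating is indeterminate.
    """
    if mic is None:
        return "?"
    return "+" if sdc < mic else "-"


if __name__ == "__main__":
    print(round(smallest_detectable_change(5.0), 2))
    # SDC values reported in the review, with no MIC available.
    print(rate_measurement_error(7.3), rate_measurement_error(15.0))
    # Hypothetical MIC of 10 points, for illustration only.
    print(rate_measurement_error(7.3, mic=10.0), rate_measurement_error(15.0, mic=10.0))
```

With a hypothetical minimal important change of 10 points, the lower end of the reported range would rate as sufficient and the upper end as insufficient, which shows why the anchor value matters.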
Moreover, only a few studies assessed responsiveness. However, responsiveness is vital for assessing the effectiveness of a clinical intervention designed to improve the health outcomes of people living with HIV and AIDS. This identifies several gaps for future research in the area of HIV. Without such information, it is impossible to understand whether changes in the levels of health outcomes of people living with HIV and AIDS are meaningful and matter to health care providers and researchers.

Clinical Implications
Despite a 64% reduction in HIV-related deaths in 2020 compared with the peak reported in 2004, a total of 680,000 people living with HIV and AIDS still died from HIV-related illnesses in 2020. This was largely due to the unique physical and psychosocial symptoms [1]. These symptoms seriously affect the physical function and clinical outcomes of people living with HIV and AIDS [4,198-200]. PRO data can be used in a variety of ways to improve care and health outcomes at the patient, institution, and population levels [201][202][203][204]. Because many of the concerns of people living with HIV and AIDS are subjective and private, PROs should be the primary outcome or end point. Many regulatory agencies and guidelines also recommend the inclusion of PROMs as the primary or secondary end points in clinical trials [205,206]. In addition, the development of the current ART regimen aims at simplifying the form of administration to meet the needs of long-term ART and maintain viral suppression with minimal toxicity [207]. Therefore, PRO data are becoming increasingly important for determining which ART regimen to use [208]. Accordingly, a reliable, valid, and sensitive PROM is invaluable to health care providers and researchers.
In this systematic review, only 3 available PROMs (PozQoL, HIV-SI or SDM, and PLHIV-RS) were recommended based on the COSMIN guidelines, wherein PozQoL was used to assess HIV-related HRQoL, HIV-SI or SDM was used to assess HIV-related symptoms, and PLHIV-RS was used to assess HIV-related resilience. Health care providers can adopt these 3 PROMs for different application purposes. With regard to PROMs that received class B recommendation, although these PROMs are not recommended in this systematic review, researchers can select the PROMs with relatively good results for psychometric properties and use them according to the research purpose or further validate them for use in their context. For administrators, selecting validated PROMs can aid in the development of continuous quality improvement reports to understand health care providers' performance against the measurement framework and standard key performance indicators [209]. On the basis of the data collected through validated PROMs, policy makers can further evaluate system performance by comparing outcomes over time and support health care policy decision-making [210]. In summary, this review will help health care providers, administrators, policy makers, and researchers to choose suitable PROMs in different contexts, which in turn will promote the systematic use of these PROMs, identify areas that need to be improved from a patient perspective, and improve the quality of assessment for intervention.

Limitations
Our study has some limitations. First, although this systematic review additionally searched 2 important web-based databases of PROMs (PROQOLID and PROMIS) that are considered to be an important source of gray literature, we did not search dissertations, non-English literature, and other gray literature. This may have caused some relevant studies to be left out of our analysis, and these studies may help provide some evidence to support or refute our findings. Furthermore, evidence on the validation of PROMs can be deduced from the results of some studies. However, validation was not the primary purpose of these studies; therefore, they were not included. Furthermore, some other PROMs were not included because they are still under study. Moreover, this systematic review may have ignored PROMs that only assessed a certain domain related to specific comorbidities, such as PROMs specifically designed to measure cognitive concerns. Considering the importance of evaluating these comorbidities in people living with HIV and AIDS, we will conduct further research on these PROMs. Furthermore, because no generally accepted "gold standard" measure for adults living with HIV and AIDS currently exists, the criterion validity of the included PROMs was not assessed. In addition, an insufficient number of studies reporting PROM development and content validity were included in this systematic review. Although we excluded many qualitative studies during the title and abstract screening stage, none of these studies examined content validity. However, this is consistent with other relevant reviews [16,19], which also found an insufficient number of studies reporting on the content validity of HIV-related PROMs.
Another limitation of this review is that the selection of studies, scoring of methodological quality, and grading of evidence were subjective in nature. However, this systematic review strictly followed the steps of the COSMIN guidelines, and the processes mentioned earlier involved multiple researchers. We believe that this could resolve discrepancies and reduce variability in interpretation, thereby minimizing the chance of errors. Furthermore, given that the negative results of many PROMs are less likely to be published, the possibility of publication bias cannot be eliminated. Moreover, some included studies may have reported on only some psychometric properties; accordingly, there may be a selective reporting bias. Finally, quantitative pooled summary or meta-analyses were not performed because of the possible large heterogeneity. These limitations may help to explain why concrete recommendations for the use of some PROMs were not made: there were few included studies for some PROMs, and not all psychometric properties were assessed in these studies.

Future Work
Although there are a large number of PROMs in each category, it is still necessary to validate the existing PROMs, or even develop new PROMs in some categories, because not enough validated PROMs are available. Considering the shortcomings of the 3 class A PROMs, future research should focus on further validating these as well as the class B PROMs. It should be noted that multiple stakeholders, such as patients themselves, their family members, health care providers, and researchers, should participate in the development and validation of all PROMs [211]. In future research on PROMs, researchers should follow the suggestions of the COSMIN guidelines to ensure the complete reporting of research details and accurate interpretation of results [27].
For the existing PROMs, research should focus on evaluating content validity and measurement error to determine the suitability of a PROM for use in the care of people living with HIV and AIDS. Moreover, these PROMs should be applied in different regions or populations to assess their cross-cultural validity or measurement invariance and explore the comparability of the results. In addition, future research should use more longitudinal or experimental study designs to assess the responsiveness of PROMs [9].
With the gradual aging of people living with HIV and AIDS, new and adjusted PROMs should focus on exploring the impact of aging on people living with HIV and AIDS, such as complex complications [212], polypharmacy [213], menopause in older women [214], low social support [215], cognitive impairment [216], and special symptoms of early exposure to HIV [9]. PROMs for children will be summarized in our future research.
In past decades, researchers have mainly used interviewer-administered surveys and self-administered paper questionnaires to collect data [217]. However, several limitations of these methods have been found in practice. ePROMs have become increasingly popular in recent years because they greatly reduce labor and time costs, minimize errors, and support complex survey management [9]. Although ePROMs are developing rapidly, future research should evaluate the equivalence between electronic and paper questionnaires [218]. Some researchers have used advanced technologies to integrate ePROMs into electronic hospital records or routine HIV care, allowing health care providers to easily and conveniently assess the qualitative and quantitative health outcomes of people living with HIV and AIDS. In addition, independent apps and software are used in clinical practice and research.
Moreover, with the development of computer adaptive tests (CATs) in recent years, future research can develop and improve an item bank for people living with HIV and AIDS and use CAT technology to dynamically select items for administration based on the respondent's previous answers and thereby assess their PROs [219][220][221]. However, the item bank of a CAT instrument requires a large number of unidimensional scales, posing a great challenge to the content validity of each PROM and its subconstructs. At the same time, the development of a CAT item bank can promote the improvement of the existing HIV-specific PROMs and the development of new HIV-specific PROMs, further advancing research in related fields.
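The adaptive item selection described above can be illustrated with a minimal sketch. It is not taken from any of the cited CAT instruments: it assumes a 2-parameter logistic item response model, dichotomous responses, an expected a posteriori trait estimate with a standard normal prior, and selection by maximum Fisher information. The item bank, its parameter values, and all function names are hypothetical.

```python
import math

# Hypothetical item bank: item id -> (discrimination a, difficulty b) under a 2PL model.
ITEM_BANK = {
    "item_1": (1.2, -1.0),
    "item_2": (0.8, 0.0),
    "item_3": (1.5, 0.5),
    "item_4": (1.0, 1.5),
}

# Candidate trait levels from -4.0 to 4.0 in steps of 0.1.
GRID = [i / 10 for i in range(-40, 41)]


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate_theta(responses):
    """Expected a posteriori estimate of the trait level on GRID.

    responses maps item id -> 1 (endorsed) or 0 (not endorsed).
    A standard normal prior is assumed.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for item_id, answer in responses.items():
            a, b = ITEM_BANK[item_id]
            p = prob_endorse(theta, a, b)
            w *= p if answer == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(GRID, weights)) / total


def select_next_item(responses):
    """Return the unadministered item with maximum information at the current estimate."""
    remaining = [item_id for item_id in ITEM_BANK if item_id not in responses]
    if not remaining:
        return None  # item bank exhausted
    theta = estimate_theta(responses)
    return max(remaining, key=lambda i: item_information(theta, *ITEM_BANK[i]))
```

In this sketch, each answer updates the trait estimate, and the next item is the one expected to be most informative at that estimate. A real CAT would also need a stopping rule (for example, a target standard error) and, for Likert-type PROM items, a polytomous model; both are omitted here.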

Conclusions
This systematic review provides a detailed assessment of the psychometric properties of the existing HIV-specific PROMs for adults living with HIV and AIDS. A class A rating was achieved for PozQoL, HIV-SI or SDM, and PLHIV-RS. However, all of these have a few shortcomings. Therefore, we believe that future studies should conduct a more comprehensive validation of the psychometric properties of the existing PROMs to provide sufficient evidence for their assessment. These findings may provide a reference for health care providers and researchers selecting high-quality HIV-specific PROMs for clinical practice and research.

Authors' Contributions
ZW, YZ, and BQ conceptualized and designed the study. ZW and HK performed the literature search, screening, and selection. ZW, HK, and XD extracted the data. ZW, HK, and YZ performed the quality appraisal and statistical analysis. ZW, HK, XD, YZ, and BQ contributed to the COSMIN evaluation. YZ and BQ supervised the study. ZW, YZ, and BQ drafted the manuscript. ZW, YZ, and BQ critically revised the manuscript for important intellectual content. ZW, YZ, and BQ provided administrative, technical, or material support. All the authors critically reviewed the manuscript and approved the final version before submission.

Conflicts of Interest
None declared.

Multimedia Appendix 1
Literature search strategy for existing review, literature search strategy for HIV-specific patient-reported outcome measures, subscales of the included PROMs, characteristics of the included records, methodological quality assessment of the included records, results and ratings of each psychometric property of each record, the overall results and the level of evidence, PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 abstract checklist, and PRISMA 2020 main checklist.