Summary of the main findings
Understanding the HR-QoL of people living with predementia AD, MCI and dementia is essential for the accurate evaluation of health interventions from an economic perspective. This systematic review identified 61 studies assessing the HR-QoL of people with MCI or dementia using preference-based utility measures. Of these, 39 reported utility values according to disease severity, seven of which covered at least four severity stages, including MCI. This attempt to capture HR-QoL across the entire span of the disease, with particular focus on preclinical and prodromal AD and MCI, responds to a gap in the current literature that needs to be addressed given that new disease-modifying treatments are expected to act in the earlier stages of the disease. Overall, the studies identified in this review demonstrated heterogeneous HR-QoL utility data, likely because of a combination of factors. This review identified a number of these factors, as discussed in the following sections, including measures used to define disease severity, measures used to assess HR-QoL, underlying disease (both causes of dementia or cognitive impairment and comorbidities), clinical and geographical setting, choice of respondent and other methodological differences.
Instruments used to assess HR-QoL
A variety of instruments were used to measure disease severity, with widespread variation in the reference ranges adopted to define each severity stage. The most commonly used measures of disease severity were the MMSE, which only assesses cognition, and the CDR-G, which reflects both cognitive and functional assessment with inputs from the clinician in the overall score. HR-QoL was assessed using a wide range of preference-based measures. As the domains, scoring and utility valuations vary between HR-QoL instruments, direct comparison of results can be difficult. This heterogeneity could be observed when several HR-QoL instruments were used on the same sample and yielded different utility values. The choice of instrument could therefore result in different quality-adjusted life-years and incremental cost-effectiveness ratios and so have the potential to influence important decision-making processes. In this review, the most commonly used HR-QoL instrument was the EQ-5D. It is short and easy to administer, making it attractive for use in populations with attention difficulties, such as those with dementia. It demonstrates good feasibility, reliability and validity in dementia and is the HR-QoL instrument of choice recommended by the UK National Institute for Health and Care Excellence for use in economic evaluations. The widespread use of the EQ-5D permitted the results to be meta-analysed according to disease severity stage, thus providing a better insight into HR-QoL across the disease continuum.
However, the EQ-5D predominantly assesses functional and emotional impairments. Lack of a specific cognitive domain may explain why, compared with the HUI, the EQ-5D detects less marked differences between mild and severe cognitive impairment. In addition, the EQ-5D is a generic HR-QoL tool that has not been specifically designed for patients with dementia. Indeed, most HR-QoL instruments used in the studies identified in this review were generic rather than disease specific. It can be argued that generic instruments produce less targeted results than those produced by instruments specifically designed to cover more relevant aspects of a disease. However, a recently published study by Ratcliffe et al. compared generic EQ-5D-5L results with those from the dementia-specific DEMQOL-U and DEMQOL-U-Proxy instruments when calculating HR-QoL in an Australian residential care setting. Results suggested that, although both tools captured specific aspects of the disease and thus complemented each other, the EQ-5D was a more suitable instrument in this setting as it was more strongly correlated with function.
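The point made in this section, that the choice of HR-QoL instrument can change quality-adjusted life-years (QALYs) and incremental cost-effectiveness ratios (ICERs), can be illustrated with a minimal sketch. All names and numbers below (the two instruments, the stage utilities, the years per stage and the incremental cost) are hypothetical assumptions chosen for illustration; none are taken from the studies in this review, and discounting is ignored.

```python
# Illustrative only: every number below is hypothetical, not taken from the review.

def total_qalys(utilities, years_in_stage):
    """Undiscounted QALYs: sum of (utility x years) over severity stages."""
    return sum(u * t for u, t in zip(utilities, years_in_stage))


def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio (cost per QALY gained)."""
    return incremental_cost / incremental_qalys


# Hypothetical stage utilities (MCI, mild, moderate, severe) from two instruments.
utilities_by_instrument = {
    "instrument_A": [0.80, 0.70, 0.50, 0.30],
    "instrument_B": [0.85, 0.70, 0.40, 0.15],
}

# Hypothetical years spent in each stage without and with an intervention
# that delays progression (total survival is held equal at 8 years).
years_control = [2.0, 2.0, 2.0, 2.0]
years_treated = [3.0, 2.0, 1.5, 1.5]
incremental_cost = 20_000.0  # hypothetical extra cost of the intervention

results = {}
for name, utilities in utilities_by_instrument.items():
    gain = total_qalys(utilities, years_treated) - total_qalys(utilities, years_control)
    results[name] = (gain, icer(incremental_cost, gain))
    print(f"{name}: QALY gain = {gain:.3f}, ICER = {results[name][1]:,.0f} per QALY")
```

Under these made-up inputs, the same intervention yields a different QALY gain, and hence a different ICER, depending only on which instrument supplied the utilities.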
Potential explanations for observed differences in HR-QoL
A systematic review and meta-regression analysis by Li et al. suggested that, even when using the same HR-QoL instrument, utilities can vary significantly across different samples. This suggests that covariates such as study methodology, study setting, country and patient characteristics can influence utility values. Type of dementia included in the study, for example, might influence reported HR-QoL, as might other demographic factors and disease symptoms beyond the scope of this review. Jönsson et al. found, for example, that when caregivers lived with patients, patients reported higher baseline utility scores, and Schiffczyk et al. found that the rate of cognitive decline over time was associated with reduced utilities. If patient-level data are available to researchers and these covariates have been recorded, such differences can be controlled for, but often neither of these conditions holds.
Although ROADMAP primarily focuses on AD, this review included all types of dementia because of the possibility of overlap and diagnostic uncertainty between dementia types. Lam et al. reported utilities separately for patients with AD and patients with “dementia not AD”. Although not directly compared, the HUI-2 utility values were 0.23 and 0.24, respectively. The sample size for patients with AD in this study was also much smaller than that for patients with other dementia types, contradicting literature reports that AD accounts for the majority of dementia cases. It is likely that misdiagnosis or misreporting means the “dementia not AD” population did actually include a mixture of patients with and without AD, resulting in quite similar scores. Different types of dementia may have different effects on HR-QoL. For example, Boström et al. reported that patients with DLB had significantly lower utility values than those diagnosed with AD, whether self-rated or proxy-rated (P < 0.0001). Additional research is needed to compare the impact of the different types of dementia on HR-QoL.
Differences in HR-QoL between studies can also be caused by factors such as comorbidities. Winter et al. found that the presence of depressive symptoms reduced utility by 14% in patients with AD and VD (P < 0.01), with an overall utility for patients with depression of 0.35 and for those without depression of 0.48. In a population restricted to hospital inpatients, Sheehan et al. found significantly lower utilities in those with self-reported depression (P = 0.001) and in patients with instrumental ADL impairment (P = 0.020), though this was only the case when using the Quality of Life – Alzheimer’s Disease scale. Interestingly, this study reported that self-reported EQ-5D utility values were also significantly associated with carer stress (P = 0.002). Koekkoek et al. reported that patients with type 2 diabetes mellitus (T2DM) were twice as likely to develop cognitive impairment as those without and therefore compared HR-QoL in individuals with T2DM and cognitive impairment with that for individuals with T2DM but no cognitive impairment. Unfortunately, as utilities were not compared between those with and without T2DM, it is difficult to determine the extent to which T2DM itself impacts on HR-QoL.
More research is also required on the effect of study setting on HR-QoL. Olazarán et al. identified no significant difference in utilities between patients with severe dementia in institutions and the community, but Kuo et al. found that individuals in the community had significantly higher utility than those in institutions, who were typically older, were more frequently widowed, had an increased number of chronic medical conditions and were restricted in their functional independence. Hessman et al. also found significantly higher HR-QoL for patients at home than for those living in nursing homes.
Utility values may also be influenced by the perspective from which patient HR-QoL is rated. Overall, this review found that HR-QoL was most often reported by patients and their informal caregivers. We did not observe a significant difference between self-rated and caregiver proxy-rated utility values for people with MCI. However, beyond the MCI stage of the disease, self-rated utilities were significantly higher than caregiver proxy-rated utilities, with an increasing difference in more severe stages of dementia. A recent study by Easton et al. identified a similar trend. HR-QoL is a subjective construct that should ideally be reported by the individual directly affected. However, studies on the validity of self-reported HR-QoL measurement instruments are contradictory, with some arguing that patients with dementia are capable of providing their own self-ratings, whereas Vogel et al. and Schiffczyk et al. suggest that patients provide over-optimistic reports of HR-QoL. The disparities between patient and caregiver scores in mild dementia could be explained by differences in insight into the effect of the disease, adaptation of patients to their condition or censoring bias as patients become increasingly unable to complete HR-QoL questionnaires with progressing disease. Alternatively, caregivers may be experiencing increasing emotional, physical and financial pressures as dementia symptoms emerge. This may decrease their own utility, which could in turn influence their perception of patient HR-QoL. Schiffczyk et al. demonstrated that proxies with depression rate patient HR-QoL worse and report more behavioural and functional impairments than do those without depression. The impact of caregiver HR-QoL on their ratings of the HR-QoL of people with dementia is under-researched and deserves future consideration. This might be a more important factor in informal than in professional carers. Bryan et al. investigated the differences between the utilities reported by different proxies and found that informal carers, who were often spouses living with affected individuals, rated patient HR-QoL significantly worse than clinicians did. Overall, proxy utility data should be interpreted cautiously and not be assumed to provide a direct substitute for patient self-assessment, even when disease severity means that patients are no longer able to meaningfully assess their own HR-QoL [24, 29].
Another element of study design that might affect the HR-QoL findings is the choice of preference weight data. Ideally, preference weights for the calculation of utilities should be derived from the population of the country being studied. In a study of patients with AD in Canada, Oremus et al. found significantly higher mean utility values with US than with Canadian preference weights (0.87 vs 0.81; P < 0.0001). On the other hand, Fang et al. demonstrated no significant difference in mean self-rated utility values (P = 0.63) when comparing Canadian and UK preference weights to rate HR-QoL for Canadian patients and their caregivers. These differences must be considered when interpreting the findings of the meta-analysis.
Overall completeness and quality of evidence
Our review identified 12 studies that reported mean utility values but not by disease severity. The results of these studies are included in Figure S1F (see Additional file 1: Appendix 5), reporting mean utility for all patients, but they are unlikely to contribute useful information to disease models, as they provide no information regarding patients’ location on the disease spectrum. Furthermore, in this group of studies, the self-rated weighted mean is lower than that from studies reporting each of the separate severity stages of dementia, including severe dementia. This was mainly due to the low overall utility values reported in the studies by Boström et al., van de Ven et al. and Winter et al. The severity of the dementia included in the study by van de Ven et al. was unclear, though the utilities may also have been affected by inclusion only of patients living in residential or nursing homes. The low self-rated utilities of 0.38 in the study by Boström et al. for patients with DLB also contributed to the low overall average, whereas Winter et al. stated that 15.3% of individuals in their study had severe dementia and 84.7% had moderate dementia. However, given the sparse information provided, it is difficult to compare the findings of these studies with those of other studies reporting utilities by disease severity. Future studies should focus on providing utility values by disease severity.
The majority of the studies, when assessed using the Effective Public Health Practice Project quality assessment tool, were considered to produce strong/moderate evidence. However, the tool itself rates observational studies as weak in the study design parameter, which affects most of the studies in this review, as only ten were randomised trials. Nevertheless, non-randomised studies may provide more generalisable HR-QoL evidence than some randomised controlled trial populations when parameterising economic models.
Strengths and limitations of the systematic review
Overall, the strength of this study lies in the fact that it is a comprehensive systematic review of the literature. It used rigorous screening techniques to ensure inclusion of all relevant articles and included studies published in several languages to produce globally relevant results. It builds on the study by Shearer et al. by summarising the current instruments used to measure HR-QoL and describing instruments available to measure disease severity. However, it goes further by summarising utility values according to the stage of disease, with the specific inclusion of MCI as well as mild, moderate and severe dementia.
This review has some limitations. Our meta-analysis pooled utility values by disease severity and respondent using fixed effects. Given the heterogeneity across studies, it would have been useful to perform the meta-analysis using random effects, but the small sample sizes precluded this. Also, the data were pooled across all countries despite the acknowledged differences in country-specific value sets; again, given the limited number of studies by country, it was not possible to take this variability into account. The meta-analysis did not differentiate between different types of dementia, which will also have increased heterogeneity across the study results. However, 29 of the 61 studies focussed only on AD, so this group will represent the majority of observations, even in studies including all types of dementia. As described, differences were mainly observed between DLB and AD. A systematic review reported DLB as accounting for approximately 4.2% of all dementia cases in the community and approximately 6.3% of cases in secondary care, and this is reflected in the sample of patients included in studies examining all types of dementia. Another limitation is that we were unable to differentiate between settings in the meta-analysis, but such differences were described in the narrative synthesis.
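For readers less familiar with the distinction drawn above between fixed-effect and random-effects pooling, the following is a minimal sketch of generic inverse-variance pooling alongside a DerSimonian-Laird random-effects estimate. It is not the code used for the meta-analysis in this review; the function names are illustrative, and the study means and standard errors are hypothetical values, not data from the included studies.

```python
import math

# Illustrative only: a generic sketch, not the analysis code used in this review.


def fixed_effect(means, ses):
    """Inverse-variance fixed-effect pooled mean and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def dersimonian_laird(means, ses):
    """DerSimonian-Laird random-effects pooled mean, standard error and tau^2."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled_fe, _ = fixed_effect(means, ses)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (m - pooled_fe) ** 2 for w, m in zip(weights, means))
    df = len(means) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * m for w, m in zip(re_weights, means)) / sum(re_weights)
    return pooled, math.sqrt(1.0 / sum(re_weights)), tau2


# Hypothetical study-level mean utilities and standard errors for one severity stage.
study_means = [0.70, 0.62, 0.78, 0.55]
study_ses = [0.020, 0.030, 0.025, 0.040]

fe_mean, fe_se = fixed_effect(study_means, study_ses)
re_mean, re_se, tau2 = dersimonian_laird(study_means, study_ses)
print(f"fixed effect:   {fe_mean:.3f} (SE {fe_se:.3f})")
print(f"random effects: {re_mean:.3f} (SE {re_se:.3f}), tau^2 = {tau2:.4f}")
```

When the estimated between-study variance is greater than zero, the random-effects weights are more nearly equal across studies and the pooled standard error is larger than under fixed effects. With few studies per stratum, that variance is itself poorly estimated.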