Cultural adaptation and validation of the Health Literacy Questionnaire (HLQ): robust nine-dimension Danish language confirmatory factor model
SpringerPlus volume 5, Article number: 1232 (2016)
Health literacy is an important construct in population health and healthcare requiring rigorous measurement. The Health Literacy Questionnaire (HLQ), with nine scales, measures a broad perception of health literacy. This study aimed to adapt the HLQ to the Danish setting, and to examine the factor structure, homogeneity, reliability and discriminant validity. The HLQ was adapted using forward–backward translation, consensus conference and cognitive interviews (n = 15). Psychometric properties were examined based on data collected by face-to-face interview (n = 481). Tests included difficulty level, composite scale reliability and confirmatory factor analysis (CFA). Cognitive testing revealed that only minor re-wording was required. The easiest scale to respond to positively was ‘Social support for health’, and the hardest were ‘Navigating the healthcare system’ and ‘Appraisal of health information’. CFA of the individual scales showed acceptably high loadings (range 0.49–0.93). CFA fit statistics after including correlated residuals were good for seven scales, acceptable for one. Composite reliability and Cronbach’s α were >0.8 for all but one scale. A nine-factor CFA model was fitted to items with no cross-loadings or correlated residuals allowed. Given this restricted model, the fit was satisfactory. The HLQ appears robust for its intended application of assessing health literacy in a range of settings. Further work is required to demonstrate sensitivity to measure changes.
The complexity of modern healthcare and the many health messages being promoted have led to health literacy being a key consideration for health promotion and improving the quality of health services (Nutbeam 2000; Protheroe et al. 2009; Sørensen et al. 2012; Kickbusch et al. 2013; Norgaard et al. 2014). Furthermore, the need for people to manage their health themselves, including using various health technologies, requires individuals to have a wide range of health literacy competencies.
The World Health Organization defined health literacy as “the cognitive and social skills which determine the motivation and ability of individuals to gain access to, understand and use information in ways which promote and maintain good health” (World Health Organization 1998). From the consumer’s perspective the required competencies include not only being able to read and understand health information, but also being able to navigate the health system, communicate and engage with healthcare providers, engage in critical appraisal of health information, and advocate for one’s right to health services (HLS-EU Consortium 2012; Kickbusch et al. 2013; Osborne et al. 2013). There is, therefore, great potential to improve public health and clinical medicine by addressing health literacy in organisations on many levels through clinician training, organisational policy and planning, health protection, and public health policy.
In Denmark, and many other countries, there is a need for data about population health literacy in order to cultivate an inclusive health system that meets the needs of individuals and diverse population groups (Sorensen et al. 2014).
The focus of previous health literacy measurement methods was on functional health literacy: word recognition tests, reading ability and numeracy (Mårtensson and Hensing 2012; Haun et al. 2014). Recent instruments, such as the Health Literacy Questionnaire (HLQ) (Osborne et al. 2013) and the European Health Literacy Survey (HLS-EU) (HLS-EU Consortium 2012), assess a wider range of health literacy components. Data derived from the HLQ are designed to assist with the diagnosis of the diverse health literacy needs of individuals and communities in both health promotion and clinical settings. It has nine scales that generate comprehensive profiles of the health literacy of individuals and groups. The profiles then assist practitioners, planners and policy makers to understand the health literacy needs of communities and this in turn assists in planning, designing and evaluating interventions.
The HLQ has been widely translated and applied in research, evaluation and monitoring (Deakin University 2016). However, in order for data from a new measure to be regarded as sufficiently robust to make decisions about individuals, communities or organisations, or to compare across settings, psychometric evidence is required that demonstrates that it is culturally and linguistically appropriate and has strong measurement properties.
In the development of the English language HLQ the items were chosen to best represent the constructs that had been generated through a ‘grounded validity-driven’ approach which included both qualitative and quantitative techniques among both consumers and professionals (Buchbinder et al. 2011). The meaning of the constructs was established by this grounded process and validated by the psychometric analyses reported in the original paper (Osborne et al. 2013). A strict translation protocol was then instituted to ensure that this meaning was carried through accurately to all non-English-language versions. The psychometric analyses in the present paper have sought to replicate the results in the original paper and thus validate the meaning of the Danish scales as equivalent to that established by the original procedures.
The aims of this study were to translate the HLQ to Danish and test the psychometric properties of the translated version in a Danish validation sample. Adaptation was done through a rigorous cultural and linguistic translation procedure, followed by psychometric analyses. The outcome will inform stakeholders and the research community whether the Danish HLQ data enable valid and reliable decisions to be made about health literacy in Danish settings.
Setting and data collection
The study was undertaken in primary care settings (community health centres and general practices), public places and workplaces with wide geographic and sociodemographic variation. These settings were consistent with populations and areas in which future application of the HLQ is planned. Data were gathered by health professionals and students by face-to-face interview in both the cognitive testing and validation samples using a standardised protocol.
Health Literacy Questionnaire (HLQ)
The HLQ was designed using a grounded, validity-driven approach and initially tested in diverse samples of individuals in Australian communities and shown to have strong construct validity, reliability and acceptability to clients and clinicians (Osborne et al. 2013; Batterham et al. 2014). It was designed for administration by pen and paper self-administration or by interview to ensure inclusion of people who cannot read or have other difficulties with self-administration.
The HLQ contains 44 questions that cover nine conceptually distinct areas of health literacy:
1. Feeling understood and supported by healthcare providers (four items).
2. Having sufficient information to manage my health (four items).
3. Actively managing my health (five items).
4. Social support for health (five items).
5. Appraisal of health information (five items).
6. Ability to actively engage with healthcare providers (five items).
7. Navigating the healthcare system (six items).
8. Ability to find good health information (five items).
9. Understanding health information well enough to know what to do (five items).
Response options for each scale were determined by the content and nature of the items. For scales 1–5 four-point ordinal response options are used (strongly disagree, disagree, agree and strongly agree), while for scales 6–9 five-point ordinal response options are used (cannot do, very difficult, quite difficult, quite easy and very easy).
Translation, cultural adaptation, and item strength equivalence
A standardised procedure (Hawkins and Osborne 2010) was used including recommendations from a range of organisations (Guillemin et al. 1993; Wild et al. 2005; Koller et al. 2007). The translation process included an initial discussion of cultural differences between the Australian and Danish health systems and the presence of local dialects in Denmark. It involved two forward translators, two backward translators, the authors, and was chaired by one of the HLQ developers (RHO). The process involved three main steps:
Professional translators, with Danish as their native language, were briefed and provided with the English HLQ and detailed information about the intent of each of the nine scale constructs and each of the items within those scales. The item intent document included narratives about precisely what each element of the HLQ items means and, in some cases, what it does not mean.
Back translation was undertaken by two native English speakers with excellent Danish language skills.
Finally, a translation consensus meeting was conducted by an expert panel comprising the four translators and four researchers in the fields of population health, health promotion and patient care (HTM), healthcare systems and medicine (ON, LK), and questionnaire development and translation methodology (RHO). Each translated item was examined in turn. Panel members confirmed that: (a) the principal scale construct embodied in each Danish item matched the original English item; (b) the translated items were suitable for males and females, for a wide age range, for people with low linguistic skills, limited experience of health and the healthcare system, and who either did or did not have a health problem; and (c) the relative ‘difficulty’ (in choosing the positive response options agree/strongly agree and quite easy/very easy) of each item was equivalent between the languages. When a word from a Danish dialect was used in the translation, panel members were required to consider how ubiquitous the word is across the country, or if a more commonly-used word with the same meaning would be more appropriate.
The level of comprehensibility and cognitive equivalence of the preliminary translated Danish HLQ (Willis 1999; Wild et al. 2005) was field tested through face-to-face cognitive interviews with 15 persons across gender and education categories and from different parts of Denmark. The cognitive testing involved initial administration of items using paper and pen format with careful observation of each respondent. The interviewer then reviewed items with the respondent and asked specific questions about items they had hesitated on or appeared to have found difficult in answering. Respondents were asked ‘What were you thinking about when you were answering that question?’ This process elicited the cognitive process behind the answers. A prompt was used if needed: ‘Why did you select that response option?’
The study was approved by the Danish Data Protection Agency (j.no: 2013-41-2270). According to Danish law, when survey-based studies are undertaken in accordance with the Helsinki Declaration, specific approval by an ethics committee and written informed consent are not required. Potential respondents were provided with information about the survey and its purpose, including that participation was voluntary. The completion of the survey by participants was then considered to be implied consent.
Analyses were conducted with Stata versions 13.0 and 14.0 and Mplus version 7.11. Descriptive statistics were generated for each item to determine the extent of missing values and to demonstrate the range of responses by providing difficulty estimates within and across the nine scales. For scales with disagree/agree response options, the relative strength of an item (how difficult it is to score highly) was calculated as the proportion responding disagree or strongly disagree (low scores) as against agree or strongly agree (high scores). For scales with response options cannot do to very easy, the difficulty level was calculated as the proportion responding cannot do, very difficult, or quite difficult as against quite easy or very easy (Raykov and Marcoulides 2011).
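The difficulty estimate described above can be illustrated with a minimal Python sketch. It assumes responses are coded 1–4 or 1–5 in the order of the response options, that the two highest categories count as "high" scores, and that missing answers (coded here as None) are dropped from the denominator; the function name and the handling of missing data are illustrative assumptions, not taken from the study's Stata code.

```python
from typing import Optional, Sequence


def item_difficulty(responses: Sequence[Optional[int]], n_options: int) -> float:
    """Proportion of valid responses that fall in the low-score categories.

    Assumption: the two highest categories are the "high" responses
    (agree/strongly agree, or quite easy/very easy); everything below
    them counts as a low score. None marks a missing answer.
    """
    high_cutoff = n_options - 1  # 3 on a 4-point scale, 4 on a 5-point scale
    valid = [r for r in responses if r is not None]
    if not valid:
        raise ValueError("no valid responses for this item")
    low = sum(1 for r in valid if r < high_cutoff)
    return low / len(valid)


# Hypothetical responses to one 4-point item (not study data)
example = [1, 2, 3, 4, 4, 3, 3, 4, 2, 4]
print(item_difficulty(example, n_options=4))
```

A higher value means that fewer respondents chose the positive response options, that is, the item is harder to score highly on.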
As Cronbach’s α is frequently a biased estimate of population reliability, unbiased estimates of composite reliability were generated (Raykov 2007). However, α was also calculated for comparison with other studies. Given that the HLQ scales were specified a priori, confirmatory factor analysis (CFA) was undertaken. Using one-factor CFA, a model was fitted to the data for each previously confirmed scale (Osborne et al. 2013). The response options were scored as ordinal variables (1–4 for the strongly disagree–strongly agree scales; 1–5 for the cannot do–very easy scales) and the models were fitted using the weighted least squares mean and variance adjusted (WLSMV) estimator available in Mplus. Unstandardised and standardised factor loadings, an estimate of the variance in the measured variable explained by the latent variable (R²), and associated standard errors are provided in Mplus together with fit statistics [χ², comparative fit index (CFI), Tucker–Lewis index (TLI), and root mean square error of approximation (RMSEA)]. Indicative threshold values for the tests of ‘close fit’ used in this analysis were CFI > 0.95; TLI > 0.95; RMSEA < 0.06 while a value of <0.08 for the RMSEA was taken to indicate a “reasonable” fit (Browne and Cudeck 1993; Yu 2002; West et al. 2012).
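To make the two reliability coefficients concrete, the following Python sketch computes Cronbach’s α from raw item scores and a composite reliability coefficient from standardised loadings of a one-factor model with uncorrelated residuals. This is a simplified textbook formula used here as an assumption; the procedure of Raykov (2007) applied in the study is estimated within the fitted model and may differ in detail. The example loadings are hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """items[i][j] is respondent j's score on item i (complete cases only)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum score
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def composite_reliability(std_loadings: Sequence[float]) -> float:
    """Composite reliability from standardised loadings.

    Assumes a single factor and uncorrelated residuals, so each residual
    variance is 1 - loading**2.
    """
    true_part = sum(std_loadings) ** 2
    error_part = sum(1 - loading ** 2 for loading in std_loadings)
    return true_part / (true_part + error_part)


# Hypothetical standardised loadings for a short scale (not study data)
print(round(composite_reliability([0.7, 0.8, 0.9]), 3))
```

When residual correlations are added to a scale model, as was done for some scales here, the simple formula above no longer applies without adjustment.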
Mplus also provides statistics that can be used to facilitate model improvement by suggesting fixed parameters (e.g., in the case of single-factor models, correlations among residual variances) that might be freely estimated. In Mplus, these statistics include standardised residuals, modification indices (MIs) and the associated change in a parameter if the modification is included in the model (standardised expected parameter change—SEPC).
A full nine-factor CFA model with no correlated residuals or cross-loadings was fitted to the data to investigate discriminant validity. As the restrictions that are typically placed on multi-factor CFA models frequently result in a strong upwards bias in the estimation of inter-factor correlations, this analysis was followed up by fitting the nine-factor model using Bayesian structural equation modelling (BSEM) (Marsh et al. 2010). By using small-variance priors, BSEM allows models to be fitted that have the flexibility to estimate small variations from the strictly zero constraints on the residual correlations and cross-loadings in a typical multi-factor CFA model (Muthén and Asparouhov 2012). For this analysis, the variance of the Bayesian priors for the cross-loadings was set at 0.02 such that there was a 95 % probability that the cross-loadings would be within the range ±0.28. Similarly, the variance for the residual correlations was set to give a 95 % probability that the correlations were within the range of ±0.2 (Muthén and Asparouhov 2012).
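The link between the prior variance of 0.02 and the ±0.28 range can be checked with a short Python sketch, assuming a zero-mean normal prior on the standardised cross-loadings. The prior family for the residual correlations is not stated in the text, so the second function is only a normal-approximation illustration of how a variance could be chosen to match a target range; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def prior_halfwidth(prior_variance: float, coverage: float = 0.95) -> float:
    """Half-width of the central interval of a zero-mean normal prior."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 1.96 for 95 %
    return z * math.sqrt(prior_variance)


def variance_for_halfwidth(halfwidth: float, coverage: float = 0.95) -> float:
    """Normal-prior variance whose central interval has the given half-width."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return (halfwidth / z) ** 2


print(round(prior_halfwidth(0.02), 2))         # cross-loading range
print(round(variance_for_halfwidth(0.2), 4))   # illustrative variance for a range of 0.2
```

Under these assumptions a variance of 0.02 corresponds to a standard deviation of about 0.14, and 1.96 standard deviations give the stated range of roughly ±0.28.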
HLQ scale scores were calculated as unit-weighted sums of the constituent items averaged by the number of items in the scale such that the nominal range of the scale scores was 1–4 for scales 1–5 and 1–5 for scales 6–9. One-way ANOVA was used to investigate mean differences on the HLQ scale scores across a range of sociodemographic variables. Effect sizes (ES) and their 95 % confidence intervals (CI) for standardised differences in means between sociodemographic groups were calculated using Cohen’s ‘d’ with interpretation of ES as follows: “small” ES > 0.20–0.50, “medium” ES approximately 0.50–0.80, and “large” ES > 0.80.
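As a minimal sketch of the scoring and effect size calculations, the Python code below averages item scores into a scale score and computes Cohen’s d with a pooled standard deviation. The confidence interval uses a common large-sample normal approximation for the standard error of d; the text does not state which interval method was used, so that part is an assumption, as are the function names and example values.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def scale_score(item_scores: Sequence[float]) -> float:
    """Unit-weighted sum divided by the number of items (complete responses)."""
    return mean(item_scores)


def cohens_d(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (d, lower, upper) for the standardised difference mean(a) - mean(b)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    # Assumed large-sample standard error of d (normal approximation)
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, d - 1.96 * se, d + 1.96 * se


# Hypothetical scale scores for two small groups (not study data)
group_a = [3.0, 3.5, 4.0]
group_b = [2.0, 2.5, 3.0]
print(scale_score([3, 4, 3, 4]))
print(cohens_d(group_a, group_b))
```

Because the score is an average rather than a sum, scales with different numbers of items remain on the same nominal 1–4 or 1–5 range.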
Cognitive testing revealed that almost all items were well understood. Only minor re-wording was required.
Validation sample characteristics
Table 1 shows the sociodemographic characteristics and health conditions of the 481 respondents in the validation sample. The median age was 53 years, with the youngest being 17, and the oldest 92. There were more women than men (a difference of 19.2 percentage points), while 18.9 % of respondents lived alone. Over 50 % had completed high school or higher education and 60 % reported having a longstanding illness or disability. The most frequent chronic conditions reported were: musculoskeletal disorders, cardiovascular disease, diabetes, cancer, chronic obstructive pulmonary disease and mental disorders. Just over a quarter of respondents lived in the capital area of Denmark and 63 % lived in other Danish cities across different geographical regions.
Psychometric properties of the Danish HLQ
Response to the HLQ items was high (missing answers: 0.2–1.7 %). Tables 2 and 3 present the psychometric properties of the items and scales. At the item level, there were few missing data, with the highest proportion being for item 7.6 ‘Work out what is best care for you’ (1.7 %). The relative strength of the items within scales and between scales varied modestly. The scales that were easiest to score highly were 4. ‘Social support for health’ (average item difficulty 0.14) and 2. ‘Having sufficient information to manage my health’ (0.16). The scales that were hardest to score highly were 7. ‘Navigating the healthcare system’ (0.36) and 5. ‘Appraisal of health information’ (0.36). The easiest item was found in scale 4. ‘Social support for health’ [4.4 ‘I have at least one person who can come to medical appointments with me’ (0.07)], and the hardest item was in scale 7. ‘Navigating the healthcare system’ [7.5 ‘Find out what healthcare services you are entitled to’ (0.51)]. The scale with the smallest range of difficulty was 8. ‘Ability to find good health information’ (hardest 0.21, easiest 0.15, range 0.06), whereas 1. ‘Feeling understood and supported by healthcare providers’ had the largest range of difficulties (hardest 0.36, easiest 0.11, range 0.25).
The model fit for all scales (with some inclusion of correlated errors related mostly to linguistic overlap) was generally very good, demonstrating that the scales are homogeneous, although, after model modification, the RMSEA remained unacceptably high for scale 1 (Table 3) suggesting some further association between the item residuals in this scale. Initially four scales (scales 2, 3, 7 and 8) returned a satisfactory close fit for the one-factor models. For five scales, the close fit statistics were initially not satisfactory due to the presence of correlated residuals which, when included in the model (maximum 2), ranged from 0.23 (scale 4) to 0.49 (scale 1).
For each scale there were high loadings on almost all items [all above 0.6 except for one each in scale 5 (0.49) and scale 7 (0.50)]. The median loading was 0.79. A composite reliability and Cronbach’s α of ≥0.8 was observed for all scales except scale 5. ‘Appraisal of health information’ (composite reliability = 0.77, α = 0.76). The median composite reliability was 0.84 (α = 0.83), with the highest for scale 8. ‘Ability to find good health information’ (composite reliability = 0.87, α = 0.87). In summary, the reliability for all scales was good (0.8–0.9) except for scale 5. These findings are in the same range as those observed in the original psychometric studies of the English HLQ.
A nine-factor CFA model was fitted to the 44 items with no cross-loadings or correlated residuals allowed. Given the very restricted nature of the model, the fit was quite satisfactory: WLSMV χ² (866 df) = 2459.31, p < 0.0001, CFI = 0.934, TLI = 0.930 and RMSEA = 0.062. While the CFI and TLI are lower than the pre-specified cut-off, this is not surprising given the large number of parameters in the model set precisely to 0.0. Also, the CFI and TLI tend to underestimate the goodness-of-fit of models with large numbers of measured variables compared with the RMSEA as in this analysis (Kenny and McCoach 2003). The ranges of the factor loadings in this model were: scale 1. ‘Feeling understood and supported by healthcare providers’: 0.78–0.95; scale 2. ‘Having sufficient information to manage my health’: 0.77–0.86; scale 3. ‘Actively managing my health’: 0.69–0.89; scale 4. ‘Social support for health’: 0.62–0.84; scale 5. ‘Appraisal of health information’: 0.58–0.83; scale 6. ‘Ability to actively engage with healthcare providers’: 0.69–0.90; scale 7. ‘Navigating the healthcare system’: 0.57–0.81; scale 8. ‘Ability to find good health information’: 0.79–0.89; and scale 9. ‘Understanding health information well enough to know what to do’: 0.69–0.86.
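The reported degrees of freedom and RMSEA can be reproduced approximately with the Python sketch below. It assumes that, with ordinal indicators, the model is fitted to the item correlations (thresholds being just-identified), that factor variances are fixed for identification, and that the RMSEA uses n − 1 in the denominator with n = 481; some software uses n instead, and the function names are illustrative.

```python
import math


def cfa_degrees_of_freedom(n_items: int, n_factors: int) -> int:
    """df for a simple-structure CFA with no cross-loadings or correlated residuals."""
    n_correlations = n_items * (n_items - 1) // 2          # sample statistics
    n_free = n_items + n_factors * (n_factors - 1) // 2    # loadings + factor correlations
    return n_correlations - n_free


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of the RMSEA (assumes the n - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


print(cfa_degrees_of_freedom(44, 9))         # 866
print(round(rmsea(2459.31, 866, 481), 3))    # 0.062
```

With 44 items there are 946 correlations, and 44 loadings plus 36 inter-factor correlations are estimated, which leaves the 866 degrees of freedom reported above.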
Inter-factor correlations in the nine-factor model ranged from 0.41 (scales 4 and 9) to 0.93 (scales 8 and 9). Inter-factor correlations were >0.80 for scales 6/7 = 0.82, 7/8 = 0.87, 6/9 = 0.84, 7/9 = 0.92 and 8/9 = 0.93. It is frequently argued that an inter-factor correlation above 0.80 to 0.85 indicates a potential lack of discriminant validity (Brown 2006). This suggests that there may be a lack of discriminant validity in the second part of the HLQ, particularly for scales 7, 8 and 9.
By allowing some flexibility in the estimation of residual correlations and cross-loadings the nine-factor BSEM model was an excellent fit to the data (the posterior predictive probability value was 0.49 compared with the target value of 0.50). There were nine statistically significant cross-loadings, the largest being 0.28. Correlations between scales 6–9 were 6/7 = 0.72, 6/8 = 0.63, 6/9 = 0.67, 7/8 = 0.83, 7/9 = 0.85, 8/9 = 0.84.
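The comparison between the CFA and BSEM inter-factor correlations can be summarised with a small Python sketch that flags scale pairs above a chosen cut-off. The correlations are those reported in the text for scales 6–9 (the CFA value for scales 6/8 is not reported and is therefore omitted); the 0.80 and 0.85 cut-offs follow the rule of thumb cited from Brown (2006), and the function name is illustrative.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]

# Inter-factor correlations as reported in the text (scale numbers as keys)
cfa_corr: Dict[Pair, float] = {
    (6, 7): 0.82, (7, 8): 0.87, (6, 9): 0.84, (7, 9): 0.92, (8, 9): 0.93,
}
bsem_corr: Dict[Pair, float] = {
    (6, 7): 0.72, (6, 8): 0.63, (6, 9): 0.67,
    (7, 8): 0.83, (7, 9): 0.85, (8, 9): 0.84,
}


def flag_pairs(corr: Dict[Pair, float], threshold: float) -> List[Pair]:
    """Scale pairs whose correlation strictly exceeds the threshold."""
    return sorted(pair for pair, r in corr.items() if r > threshold)


for cutoff in (0.80, 0.85):
    print(cutoff, "CFA:", flag_pairs(cfa_corr, cutoff),
          "BSEM:", flag_pairs(bsem_corr, cutoff))
```

Under the BSEM model only the pairs among scales 7, 8 and 9 remain above 0.80, and none exceeds 0.85.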
Table 4 shows the pattern of HLQ scores according to sociodemographic variables. Females scored somewhat higher than males only on 8. ‘Ability to find good health information’. Younger people had substantially higher scores on 4. ‘Social support for health’ than older people. There were differences on six HLQ variables for education, all indicating that people with higher education have higher health literacy than those with less education. The strongest effects were seen for 8. ‘Ability to find good health information’. There were no significant differences between those who did or did not have Danish as their mother tongue. Five scales showed that those with a long term illness or disability had higher health literacy; the largest effects were seen for 4. ‘Social support for health’ and 2. ‘Having sufficient information to manage my health’. The strongest pattern of all exogenous variables was with self-rated health. All scales demonstrated a difference between those rating themselves as having ‘Excellent’ or ‘Very good’ health compared with ‘Good’ or worse health. The effect was strongest for 4. ‘Social support for health’, 2. ‘Having sufficient information to manage my health’, and 3. ‘Actively managing my health’.
In this study, we applied rigorous linguistic and cultural adaptation methods to the HLQ to produce a high quality Danish version. We used a process we call measurement adaptation whereby the meanings of the original concepts were carefully reproduced. Consequently, data from a diverse validation sample of the Danish population demonstrated that the translated HLQ has a strong psychometric structure, and the reliability of the original HLQ was maintained. Within each scale, the Danish items have a range of difficulty that should enable the Danish HLQ to be sensitive to differences between groups and change over time. Importantly, respondents clearly understood all the translated items. This series of rigorous psychometric and practical tests indicates that the Danish HLQ is robust and suitable for application in Danish settings, and that the reproduction of the nine scales in a different language and setting attests to the fundamental separate elements embodied in each of the HLQ constructs (Osborne et al. 2013).
The cognitive interviews generated valuable information. It was demonstrated that the translated items were understood according to the item intents by Danish people from a wide range of backgrounds.
Within scales there was some item residual correlation. Such correlations can mean construct complexity (i.e., sub-constructs) or item redundancy (Boyle 1991). In some cases, this seemed to be due to limited choices of words in the Danish language. For example, two separate words exist in English for ‘asking’ and ‘discussing’ (items 6.4 and 6.5) but not in Danish in this context. For items 5.1 and 5.3, the correlation is thought to relate to the positioning of these items on Bloom’s taxonomy (Amer 2006). That is, the items measure the easy and difficult ends of the same dimension: item 5.1 (0.27) is a relatively easy item on which to score highly, and item 5.3 (0.41) is a relatively hard item. While there is some conceptual overlap across these items, this scale, like most HLQ scales, has a good range of item strength and therefore has wide coverage of the target construct.
The majority of the Danish HLQ items load highly on their respective factors and the scales have good reliability. The fit of both single-factor and nine-factor models to the data was generally good, indicating scale homogeneity. Further, scales 1–6 showed clear discriminant validity, while discriminant validity was not clearly established for scales 7–9. By allowing for small correlated item residuals and cross-loadings, the BSEM analysis showed inter-factor correlations among scales 6–9 that were somewhat smaller; however, three remained >0.80 but ≤0.85. The overlap between scales 7–9 suggests that there may be a higher-order factor or causal linkages determining the stronger association between the concepts measured by these scales. Strong associations between these scales have been noted in previous analyses of the English HLQ (Osborne et al. 2013). Scales 8 and 9 focus on the ability to locate (scale 8) and understand (scale 9) health information, while scale 7 (health system navigation) may be seen as a closely linked outcome of these abilities.
Every item clearly loaded on its own factor, with only two of the 44 items loading <0.6. Given that every scale comprised only four, five or six items, this demonstrates parsimony: that is, a minimum number of items are administered to respondents to reliably capture the full breadth of the intended constructs (Boyle 1991). Many questionnaires achieve high reliability through the inclusion of large numbers of items, or of items that have only minor linguistic differences and conceptual redundancy (Boyle 1991). In the Danish HLQ, only one scale had reliability below our nominal 0.8 cut-off: 5. ‘Appraisal of health information’ (0.77). This scale included the most difficult items and asked respondents about a range of challenging tasks such as comparing, checking up and finding out about information. These concepts are high in Bloom’s taxonomy (Krathwohl 2002; Amer 2006) and are likely to be more difficult for respondents to attend to and process successfully. Thus respondents may have found it challenging to judge their own critical appraisal ability, and this possibly contributed to the lower, but acceptable, reliability of the scale.
The wide-ranging difficulty of items within a scale was an important and deliberate step in the development of the HLQ as it is intended to make the questionnaire sensitive to small differences at both the low score (e.g., strongly disagree or cannot do) and high score (e.g., strongly agree or very easy) ends of the scales. The ability to measure differences between subgroups was demonstrated in a recent study that used two of the scales from the Danish HLQ (scales 6 and 9, the engagement and understanding dimensions of health literacy) (Bo et al. 2014). The response options were slightly different because the extreme cannot do category was omitted in this general survey. However, the estimates clearly indicated that 8–20 % of the Danish population have difficulties with these two dimensions of health literacy. Further, the study showed that the elderly, immigrants, those with limited formal education, and low-income groups were more likely to have limited health literacy skills. The findings are in line with Dutch results using the European Health Literacy Survey, where having a low level of education, or a low perceived social status, or being male were found to be modestly related to low health literacy scores, mainly for accessing and understanding health information (van der Heide et al. 2013).
The HLQ scales were related to the sociodemographic subgroups in our sample in reasonably expected ways. While the items were written specifically to avoid potential bias in gender, age and education, the data indicated that women were slightly better than men in their ability to find good information; however, differences were not observed in other domains. Younger people (<65 years), compared with older people (≥65 years), reported more ‘Social support for health’ (scale 4), probably related to isolation in older people. As expected, education was clearly related to health literacy, but primarily in those HLQ domains with content covering finding and processing information (scales 2, 5, 7, 8 and 9) and not with domains related to feeling understood or engaging with health care professionals (scales 1, 3 and 6). In previous research we have seen language variables, such as the language spoken at home, to be strongly related to health literacy (Beauchamp et al. 2015). In the current study, with 93 % having Danish as their mother tongue and where no questionnaires were offered in other languages, almost no association between language and health literacy was observed. Only a small effect was observed for the scale with content requiring the highest language skills, 5. ‘Appraisal of health information’, where those without Danish as their mother tongue reported slightly higher scores. It is possible that this small group included naturalised professionals; however, future studies need to explore this in more detail.
Having a long term illness or disability was moderately and positively related to several health literacy domains. This is consistent with the notion that health literacy develops in individuals as they gain experience with managing illnesses, working with practitioners over many years, and through overcoming information barriers (Paasche-Orlow and Wolf 2007). Finally, we expect that good health literacy is a determinant, antecedent or consequence of good health (Paasche-Orlow and Wolf 2007; Berkman et al. 2011; Batterham et al. 2016). Consistent with this expectation, all nine HLQ domains were positively related to self-rated health.
Strengths and limitations
It is intended that the HLQ can be used in diverse settings, and with people with broad-reaching sociodemographic profiles and various health conditions. A strength of this study is that the validation sample was drawn from communities across a wide range of locations, with an array of health conditions, and who were young and old. The face-to-face administration of the questionnaire ensured a high participation rate, particularly from people with limited reading and writing abilities who usually are excluded from taking part in questionnaire studies.
There are several indications that the Danish HLQ has good psychometric properties. The use of modern psychometric procedures permitted the application of highly demanding tests, and the analysis has clearly shown that the HLQ multi-dimensional construct of health literacy comprises nine separate and cogent scales. A nine-factor CFA model demonstrated good construct validity. A model with no residual correlations or cross-loadings was an acceptable fit to the data while a very good fit was achieved with a BSEM analysis that allowed modest correlated residuals and cross-loadings.
The stringent linguistic, cultural and measurement adaptation procedures are likely to have contributed to the strong psychometric performance of the Danish HLQ. This procedure includes an extensive item intent document that explains what the elements of individual items mean and do not mean. This document, along with the intensive translation consensus meeting, ensures the translators are supported to capture the precise meaning, context, difficulty, and measurement juxtaposition of one item with other items within each scale.
A strength of the HLQ is that it is sensitive to group differences related to illness or disability, and self-rated health. This has also been demonstrated in other settings (Beauchamp et al. 2015; Zhang et al. 2016; Vamos et al. 2016). While it is important that differences between groups are demonstrated, further research is needed to demonstrate sensitivity to change. This issue is being addressed in a range of ongoing intervention studies in which the HLQ is one of the outcome assessments (Batterham et al. 2014; Livingston et al. 2014; Redfern et al. 2014; Banbury et al. 2014; Barker et al. 2015; Griva et al. 2015).
Another possible limitation that needs further investigation is the somewhat high inter-correlations between Danish HLQ scales 7, 8 and 9. These data suggest that there may be a lack of discriminant validity between these scales; however, this may also be explained by higher-order models and causal relationships between these constructs. This provides guidance for further work, particularly in exploring how the scales respond, with careful exploration of rates of change (or no change) over time.
Finally, this study only explored administration of the HLQ in face-to-face interviews. This is a strength of the study because it overcomes respondents’ limited literacy, but future work should explore other modes of administration, including self-administered written formats, and confirm whether the HLQ produces valid information that results in improved services across Danish society.
In conclusion, this study demonstrates that the Danish HLQ has strong construct and content validity, and high composite reliability. As such, the HLQ is now available for application in Denmark with nine scales providing a robust multi-dimensional approach to understanding health literacy. It is incumbent on the developers of questionnaires to demonstrate the measurement properties of new questionnaires, and to show that the data they return to consumers, practitioners, policymakers and researchers are reliable and support valid inferences. The findings of this study may be an important contribution to health literacy research, and the data add to the web of evidence supporting the validity of interpretations of HLQ scores for use in national and international population health and healthcare.
Abbreviations
BSEM: Bayesian structural equation modelling
CFA: confirmatory factor analysis
CFI: comparative fit index
HLQ: Health Literacy Questionnaire
HLS-EU: European Health Literacy Survey
IQR: inter quartile range
RMSEA: root mean square error of approximation
SEPC: standardised expected parameter change
WLSMV: weighted least squares mean and variance adjusted
References
Amer A (2006) Reflections on Bloom’s revised taxonomy. Electron J Res Ed Psychol 4:213–230
Banbury A, Parkinson L, Nancarrow S et al (2014) Multi-site videoconferencing for home-based education of older people with chronic conditions: the Telehealth Literacy Project. J Telemed Telecare 20:353–359. doi:10.1177/1357633X14552369
Barker AL, Cameron PA, Hill KD et al (2015) RESPOND—a patient-centred programme to prevent secondary falls in older people presenting to the emergency department with a fall: protocol for a multicentre randomised controlled trial. Inj Prev 21:e1. doi:10.1136/injuryprev-2014-041271
Batterham RW, Buchbinder R, Beauchamp A et al (2014) The OPtimising HEalth LIterAcy (Ophelia) process: study protocol for using health literacy profiling and community engagement to create and implement health reform. BMC Public Health 14:694. doi:10.1186/1471-2458-14-694
Batterham RW, Hawkins M, Collins PA et al (2016) Health literacy: applying current concepts to improve health services and reduce health inequalities. Public Health 132:3–12. doi:10.1016/j.puhe.2016.01.001
Beauchamp A, Buchbinder R, Dodson S et al (2015) Distribution of health literacy strengths and weaknesses across socio-demographic groups: a cross-sectional survey using the Health Literacy Questionnaire (HLQ). BMC Public Health 15:678. doi:10.1186/s12889-015-2056-z
Berkman ND, Sheridan SL, Donahue KE et al (2011) Low health literacy and health outcomes: an updated systematic review. Ann Intern Med 155:97–107. doi:10.7326/0003-4819-155-2-201107190-00005
Bo A, Friis K, Osborne RH, Maindal H (2014) National indicators of health literacy: ability to understand health information and to engage actively with healthcare providers—a population-based survey among Danish adults. BMC Public Health 14:1095. doi:10.1186/1471-2458-14-1095
Boyle GJ (1991) Does item homogeneity indicate internal consistency or item redundancy in psychometric scales? Pers Individ Differ 12:291–294. doi:10.1016/0191-8869(91)90115-R
Brown TA (2006) Confirmatory factor analysis for applied research. Guilford Press, New York
Browne MW, Cudeck R (1993) Alternative ways of assessing model fit. In: Bollen KA, Long JS (eds) Testing structural equation models. Sage, Newbury Park, CA, pp 136–162
Buchbinder R, Batterham R, Elsworth G et al (2011) A validity-driven approach to the understanding of the personal and societal burden of low back pain: development of a conceptual and measurement model. Arthritis Res Ther 13:R152. doi:10.1186/ar3468
Deakin University (2016) The Ophelia approach. http://www.ophelia.net.au. Accessed 11 June 2016
Griva K, Mooppil N, Khoo E et al (2015) Improving outcomes in patients with coexisting multimorbid conditions—the development and evaluation of the combined diabetes and renal control trial (C-DIRECT): study protocol. BMJ Open 5:e007253. doi:10.1136/bmjopen-2014-007253
Guillemin F, Bombardier C, Beaton D (1993) Cross-cultural adaptation of health-related quality of life measures: literature review and proposed guidelines. J Clin Epidemiol 46:1417–1432
Haun J, McCormack L, Valerio M, Sorensen K (2014) Health literacy measurement: an inventory and descriptive summary of 52 instruments. J Health Commun. doi:10.1080/10810730.2014.936571
Hawkins M, Osborne R (2010) Questionnaire translation and cultural adaptation procedure. Deakin University, Burwood
HLS-EU Consortium (2012) Comparative report on health literacy in eight EU member states: the European Health Literacy Survey (HLS-EU). The European Health Literacy Project Consortium (HLS-EU Consortium), Maastricht
Kenny DA, McCoach DB (2003) Effect of the number of variables on measures of fit in structural equation modeling. Struct Equ Model A Multidiscip J 10:333–351. doi:10.1207/S15328007SEM1003_1
Kickbusch I, Pelikan JM, Apfel F, Tsouros AD (2013) Health literacy: the solid facts. WHO Regional Office for Europe, Copenhagen
Koller M, Aaronson NK, Blazeby J et al (2007) Translation procedures for standardised quality of life questionnaires: The European Organisation for Research and Treatment of Cancer (EORTC) approach. Eur J Cancer 43:1810–1820. doi:10.1016/j.ejca.2007.05.029
Krathwohl DR (2002) A revision of Bloom’s taxonomy: an overview. Theory Pract 41:212–218. doi:10.1207/s15430421tip4104_2
Livingston PM, Osborne RH, Botti M et al (2014) Efficacy and cost-effectiveness of an outcall program to reduce carer burden and depression among carers of cancer patients [PROTECT]: rationale and design of a randomized controlled trial. BMC Health Serv Res 14:5. doi:10.1186/1472-6963-14-5
Marsh HW, Lüdtke O, Muthén B et al (2010) A new look at the big five factor structure through exploratory structural equation modeling. Psychol Assess 22:471–491. doi:10.1037/a0019227
Mårtensson L, Hensing G (2012) Health literacy—a heterogeneous phenomenon: a literature review. Scand J Caring Sci 26:151–160. doi:10.1111/j.1471-6712.2011.00900.x
Muthén B, Asparouhov T (2012) Bayesian structural equation modeling: a more flexible representation of substantive theory. Psychol Methods 17:313–335. doi:10.1037/a0026802
Norgaard O, Sorensen K, Maindal HT, Kayser L (2014) Measuring health literacy can improve communication in health care. Ugeskr Laeger 176:37–39
Nutbeam D (2000) Health literacy as a public health goal: a challenge for contemporary health education and communication strategies into the 21st century. Health Promot Int 15:259–267. doi:10.1093/heapro/15.3.259
Osborne RH, Batterham RW, Elsworth GR et al (2013) The grounded psychometric development and initial validation of the Health Literacy Questionnaire (HLQ). BMC Public Health 13:658. doi:10.1186/1471-2458-13-658
Paasche-Orlow MK, Wolf MS (2007) The causal pathways linking health literacy to health outcomes. Am J Health Behav 31(Suppl 1):S19–S26. doi:10.5555/ajhb.2007.31.supp.S19
Protheroe J, Nutbeam D, Rowlands G (2009) Health literacy: a necessity for increasing participation in health care. Br J Gen Pract 59:721–723. doi:10.3399/bjgp09X472584
Raykov T (2007) Reliability if deleted, not “alpha if deleted”: evaluation of scale reliability following component deletion. Br J Math Stat Psychol 60:201–216. doi:10.1348/000711006X115954
Raykov T, Marcoulides GA (2011) Classical item analysis using latent variable modeling: a note on a direct evaluation procedure. Struct Equ Model A Multidiscip J 18:315–324. doi:10.1080/10705511.2011.557347
Redfern J, Usherwood T, Harris MF et al (2014) A randomised controlled trial of a consumer-focused e-health strategy for cardiovascular risk management in primary care: the Consumer Navigation of Electronic Cardiovascular Tools (CONNECT) study protocol. BMJ Open 4:e004523. doi:10.1136/bmjopen-2013-004523
Sorensen K, Norgaard O, Maindal HT (2014) Need for more research in patients’ health literacy. Ugeskr Laeger 176:40–43
Sørensen K, van den Broucke S, Fullam J et al (2012) Health literacy and public health: a systematic review and integration of definitions and models. BMC Public Health 12:80. doi:10.1186/1471-2458-12-80
Vamos S, Yeung P, Bruckermann T et al (2016) Exploring health literacy profiles of Texas university students. Heal Behav Policy Rev 3:209–225. doi:10.14485/HBPR.3.3.3
van der Heide I, Rademakers J, Schipper M et al (2013) Health literacy of Dutch adults: a cross sectional survey. BMC Public Health 13:179. doi:10.1186/1471-2458-13-179
West SG, Taylor AB, Wu W (2012) Model fit and model selection in structural equation modeling. In: Hoyle RH (ed) Handbook of structural equation modeling. Guilford Press, New York, pp 209–231
Wild D, Grove A, Martin M et al (2005) Principles of good practice for the translation and cultural adaptation process for patient-reported outcomes (PRO) measures: report of the ISPOR task force for translation and cultural adaptation. Value Health 8:94–104. doi:10.1111/j.1524-4733.2005.04054.x
Willis GB (1999) Cognitive interview. A “how to” guide. Research Triangle Institute, Research Triangle Park
World Health Organization (1998) Health promotion glossary. World Health Organization, Geneva
Yu CY (2002) Evaluating cut-off criteria of model fit indices for latent variable models with binary and continuous outcomes. Los Angeles, CA
Zhang Y, Zhang F, Hu P et al (2016) Exploring health literacy in medical university students of Chongqing, China: a cross-sectional study. PLoS One 11:e0152547. doi:10.1371/journal.pone.0152547
Authors’ contributions
HTM, RHO and GE conceived the research question. ON, HTM, LK and RHO undertook the translation and cultural adaptation. HTM, ON and LK were responsible for data collection and HTM was responsible for the data management. HTM, AB, RHO and GE analysed and interpreted the data. HTM and RHO produced the initial draft of the manuscript. All authors read and approved the final manuscript.
Acknowledgements
Our appreciation goes to the individuals who responded to the questionnaire, and to the local sites for helping with recruitment. We express sincere thanks to Kirsten Vinther-Jensen, who conducted cognitive interviews and helped with data collection, and to Karin Rosenkilde Laursen, who chaired the students who conducted interviews.
Competing interests
The authors declare that they have no competing interests.
Funding
The Danish Strategic Research Council funded the translation process and ON as part of the project Experience-oriented Sharing of health knowledge via Information and Communication Technology (ESICT). RHO received a grant from the Nordea Foundation to work in Denmark in the period May to August 2012 to chair the translation process. RHO is a recipient of an Australian National Health and Medical Research Council Senior Research Fellowship #1059122. Three of the six students who conducted the interviews were partly employed through an unrestricted research grant given to HTM by the pharmaceutical company MSD Denmark.
Cite this article
Maindal, H.T., Kayser, L., Norgaard, O. et al. Cultural adaptation and validation of the Health Literacy Questionnaire (HLQ): robust nine-dimension Danish language confirmatory factor model. SpringerPlus 5, 1232 (2016). https://doi.org/10.1186/s40064-016-2887-9