RESEARCH ARTICLE |
a Polisher Research Institute, Philadelphia Geriatric Center
Vicki A. Freedman, Senior Research Scientist, Polisher Research Institute, Philadelphia Geriatric Center, The Pavilion, Suite 427, 261 Old York Road, P.O. Box 728, Jenkintown, PA 19046-7128 E-mail: vfreedman{at}pgc.org.
| Abstract |
|---|
Methods. For 1,054 Asset and Health Dynamics of the Oldest Old Study Wave 2 respondents, we compared responses to questions about difficulty without reference to assistance (ambiguous difficulty) to those about difficulty without help or equipment (underlying difficulty) and difficulty with help or equipment, if used (residual difficulty). We modelled predictors of discordance by means of logistic regression.
Results. Discordance exists for 15% of respondents between summary variables indicating underlying and ambiguous difficulty with one or more activities. Discrepancies are evenly split between respondents reporting (a) underlying but no ambiguous difficulty and (b) ambiguous but no underlying difficulty. Discordance also exists for 15% of respondents between summary variables indicating residual and ambiguous difficulty with one or more activities; most of these discrepancies involve reports of ambiguous but no residual difficulty. Most respondent characteristics investigated are not significant predictors of discrepancies.
Discussion. Analysts should be aware that (a) ambiguously worded questions appear to be a better proxy for underlying than for residual difficulty, (b) discrepancies seem to be lower for separate activities than for summary variables indicating difficulty with one or more activities, and (c) being Hispanic and receiving help may affect reporting discrepancies.
DISABILITY outcome measures have been important in geriatric research for many years (Kane and Kane 1981). Such measures are also useful for describing trends in the general health of the older population (e.g., Crimmins, Saito, and Reynolds 1997; Manton, Corder, and Stallard 1993, 1997) and for planning public long-term care and social service programs (e.g., Spector and Kemper 1994). Analytically, such measures are important predictors of the long-term care decisions, living arrangements, health care utilization, and social service consumption of older Americans (see, e.g., Liu, Coughlin, and McBride 1991; Wolf, Freedman, and Soldo 1997; Wolf and Soldo 1988; Wolinsky and Johnson 1991).
The original Activities of Daily Living (ADL) scale, developed by Katz (Katz, Ford, Moskowitz, Jackson, and Jaffee 1963; Katz, Downs, Cash, and Grotz 1970), was designed to assess the rehabilitative potential of older institutionalized residents. The scale was based on observer-recorded information about a patient's ability to carry out basic activities without assistance. Since that time, surveys of older Americans living in the community have adopted numerous approaches to asking respondents about their activity limitations. Some surveys ask respondents whether they have difficulty with an activity when performing it without help or special equipment; others ask about difficulty given help or equipment, if used; still others ask about difficulty with no reference to whether help or equipment should be considered. For purposes of this study, the last approach is referred to as ambiguous because such questions are unclear as to whether help or equipment should be considered.
Several studies have shown that survey-based estimates of the older disabled population are sensitive to question wording (Jette 1994; Picavet and van den Bos 1996; Rodgers and Miller 1997; Wiener, Hanley, Clark, and Van Nostrand 1990). For example, Rodgers and Miller have shown large differences in the percentage of respondents classified as disabled by using questions about difficulty from different surveys. No research to date has explicitly focused on the implications of using ambiguously worded questions for identifying and studying issues involving the older disabled population.
To better understand the potential consequences of using ambiguously phrased questions about daily activities, it is useful to recognize important conceptual distinctions embedded in the various approaches to operationalizing disability. Particularly useful is the conceptual distinction between intrinsic and actual disability. As Verbrugge and Jette (1994) discussed, intrinsic disability refers to an individual's underlying capability of carrying out an activity without help from another person or the use of equipment, whereas actual disability refers to the residual level of difficulty an individual experiences given help and/or equipment, if used. Others have referred to these concepts as underlying and residual difficulty (see, e.g., Agree, 1998), an approach adopted in this study.
Underlying and residual difficulty measures are conceptually distinct, and measurements of the two constructs often serve very different purposes. In general, measures of residual difficulty and the use of help are more useful for planning for long-term care and social services because they focus on the unmet needs of the disabled population. In contrast, measures of underlying disability are often preferred for identifying respondents to be included in individual-level analyses of care arrangements, and for controlling for respondents' level of disability, because such measures are less influenced by whether respondents use assistance. That is, in analyses of care arrangements, residual disability measures may be endogenous (influenced by whether care is received), whereas underlying disability measures are more likely to be exogenous to care decisions.
Questions that do not specify whether respondents should consider the use of help or special equipment in reporting about difficulty blur these important distinctions. The purpose of this study is to understand the implications of using such ambiguous difficulty measures in analyses of disability and care arrangements of older Americans. To achieve this aim, responses to disability questions that are ambiguously worded are compared to responses to questions about underlying and residual difficulty. Also investigated is whether select subgroups of older Americans are more likely to provide certain types of discrepant responses. To explore these issues, this study examines responses provided by a random subsample of respondents to the second wave of the Asset and Health Dynamics of the Oldest Old study (AHEAD), a national survey of Americans aged 70 and older.
| Methods |
|---|
Data are from the third preliminary release of Wave 2. These data have not been cleaned and may contain errors that will be corrected in the final public release of AHEAD Wave 2. As of this writing, sampling weights for AHEAD Wave 2 have not been made available to the public, so all analyses presented are unweighted. The absence of weights could affect point estimates. Sensitivity analyses, using weights from the first wave of AHEAD, suggest that weighted estimates are unlikely to differ by more than half a percentage point (±0.5) from the unweighted estimates presented here. Further, the absence of weights should not affect comparisons across questions or model coefficients, the latter of which control for major sampling domains.
Questions About Difficulty with Daily Activities
In the core questionnaire, respondents were asked about difficulty with 12 physical tasks: walking one or several blocks, jogging, sitting, transferring, climbing one or several flights of stairs, kneeling, reaching, pulling or pushing, lifting or carrying, and picking up small objects. Those who reported difficulty with at least one of these tasks were then asked about difficulty with six personal care activities: getting across a room, dressing, bathing or showering, eating, getting in and out of bed, and toileting. Respondents were instructed, "Please tell me if you have any difficulty with these because of a physical, mental, emotional or memory problem. Again exclude any difficulties you expect to last less than three months." Then, for each activity respondents were asked, "[Because of a health or memory problem] do you have any difficulty ...?" For two activities (getting across the room and getting in and out of bed), respondents were then asked if they ever use equipment or devices when carrying out the activity. For all activities, respondents who answered they either have difficulty, can't do, or don't do the activity were then asked a follow-up question: "Does anyone ever help you ...?"
AHEAD Wave 2 included several experimental modules administered to random subsamples of core respondents. Experimental Modules 1 and 2 included identical questions about the six activities in the core questionnaire. The modules were prefaced by a statement that some of the questions might sound similar to questions the respondent had already answered. For each activity, respondents were first asked about help received with the activity. Then, for two activities (getting across a room and getting in and out of bed), respondents were asked about the use of special equipment. Finally, respondents were asked about residual difficulty for each of the activities, with the lead-in to the difficulty question dependent on the answer(s) to the preceding question(s) on help and special equipment. Those who reported getting help with an activity were asked, "Even when someone helps you, do you have any difficulty ...?" whereas those who reported using special equipment were asked, "Even when using the [equipment specified], do you have any difficulty ...?" Respondents who reported using neither help nor special equipment were asked, "Without any help or special equipment do you have any difficulty ...?"
In sum, two sets of responses to activity limitation questions are available for the same point in time for a subsample of respondents (n = 1,061), and four important differences exist between the core and the experimental module disability questions (see Table 1).
Analysis Samples
This analysis is restricted to respondents who survived the approximately 24-month period from Wave 1 to Wave 2 and who completed Experimental Module 1 or 2 and core limitation in daily activity questions in Wave 2. Seven respondents were eliminated from the analysis because they reported not doing one or more activities. This yielded a final sample of 1,054 respondents reporting about 6,324 activities (6 per respondent).
Descriptive analyses are presented separately for each activity and for summary variables indicating difficulty with one or more activities (n = 1,054). In addition, descriptive analyses are presented for the full activity-level sample (n = 6,324) and predictors of discordance are modelled using various subsets of the activity-level sample (described in more detail in the Results section). It is important to model the extent of discordance using the activity-level sample for two reasons. First, individuals could have different types of discordance for two different activities. Suppose, for example, an older person reports underlying but no ambiguous difficulty with getting across a room and reports ambiguous but no underlying difficulty with dressing. For that person, a summary variable indicating underlying difficulty with one or more activities would be coded "yes," exactly the same as a summary variable indicating ambiguous difficulty with one or more activities. In other words, the activity-level discrepancies would cancel out in the respondent-level summary variables. Second, individuals may report consistently across most of the activities but have disagreement on only one activity, so the summary variables may overstate the true extent of discordance relative to an activity-specific basis.
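The cancelling-out scenario described above can be illustrated with a minimal sketch. The dictionaries and function names below are hypothetical and are not drawn from the AHEAD data files; the values simply restate the example of one respondent with opposite discrepancies on two activities.

```python
# Hypothetical activity-level reports for a single respondent
# (illustrative values only, not AHEAD data).
underlying = {"getting across a room": True, "dressing": False}
ambiguous = {"getting across a room": False, "dressing": True}


def any_difficulty(reports):
    """Summary variable: difficulty with one or more activities."""
    return any(reports.values())


def discordant_activities(first, second):
    """Activities on which the two sets of reports disagree."""
    return [activity for activity in first if first[activity] != second[activity]]


# The respondent-level summary variables agree (both are True) ...
summary_discordant = any_difficulty(underlying) != any_difficulty(ambiguous)

# ... even though both activities are discordant at the activity level.
activity_discordant = discordant_activities(underlying, ambiguous)
```

Under these assumed inputs the summary comparison shows no discordance while the activity-level comparison flags both activities, which is the reason given for modelling discordance on the activity-level sample.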
Experimental modules were administered only to respondents who answered the core interview for themselves. Although the findings cannot be generalized to situations in which respondents have a proxy respond for them (often involving the most cognitively and physically impaired sample members), this restriction offers the benefit of preventing the analyses from being confounded by the influence of proxy respondent error.
Measures of Disability and Discordance
Three measures of disability were constructed. Responses to the experimental module questions were used to construct measures of underlying and residual disability. For each activity, an individual was classified as having underlying difficulty if he or she (a) got help, (b) used equipment (for getting across the room and in and out of bed only), or (c) had difficulty without help or special equipment. An individual was classified as having residual difficulty if he or she reported having difficulty even with help or special equipment, if used. Finally, responses to the core Wave 2 questions were used to construct a measure of ambiguous difficulty. Individuals who responded that they had difficulty carrying out a given activity were classified as having ambiguous difficulty.
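As a rough illustration of these classification rules, the sketch below codes the three measures for a single activity. The class and field names are hypothetical and do not correspond to AHEAD variable names; the sketch assumes that the equipment indicator is simply false for activities where equipment use was not asked.

```python
from dataclasses import dataclass


@dataclass
class ModuleResponse:
    """Experimental-module answers for one activity (hypothetical field names)."""
    gets_help: bool
    uses_equipment: bool   # asked for two activities only; assumed False otherwise
    has_difficulty: bool   # answer to the tailored difficulty question


def underlying_difficulty(response):
    """Underlying difficulty: gets help, uses equipment, or has difficulty unaided."""
    return response.gets_help or response.uses_equipment or response.has_difficulty


def residual_difficulty(response):
    """Residual difficulty: difficulty even with help or equipment, if used."""
    return response.has_difficulty


def ambiguous_difficulty(core_reports_difficulty):
    """Ambiguous difficulty: the core question, which does not mention assistance."""
    return core_reports_difficulty


# A respondent who gets help and reports no remaining difficulty with that help.
helped = ModuleResponse(gets_help=True, uses_equipment=False, has_difficulty=False)
```

For the `helped` respondent, underlying difficulty is coded yes and residual difficulty is coded no, the pattern the text describes as accommodated underlying deficits.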
From these three disability measures, three variables indicating the type of discordance among responses were created. The first measure considers discordance between underlying difficulty and ambiguously worded questions about difficulty; the second considers discordance between residual and ambiguous difficulty; and the third captures whether ambiguous answers match either underlying and/or residual difficulty responses.
The first measure of discordance is a four-category variable that captures responses to underlying and ambiguous difficulty questions. That is, the categories represent, for a given activity, whether the respondent reported both underlying and ambiguous difficulty (yes, yes), neither underlying nor ambiguous difficulty (no, no), underlying but no ambiguous difficulty (yes, no), or no underlying but ambiguous difficulty (no, yes). The latter two categories were used to explore predictors of different types of discordance between underlying and ambiguous difficulty.
The second measure of discordance is similar to the first variable, but it captures responses to residual (rather than underlying) and ambiguous difficulty. Like the first measure of discordance, the categories represent, for a given activity, whether the respondent reported both (yes, yes), neither (no, no), residual but no ambiguous difficulty (yes, no), and no residual but ambiguous difficulty (no, yes). Again, the last two categories were used to explore predictors of discordance between residual and ambiguous difficulty.
The final measure of discordance captures whether ambiguous difficulty responses match only underlying, only residual, neither underlying nor residual, or both underlying and residual responses. For the last category (matches both), a further distinction was made between those who reported having ambiguous, underlying, and residual difficulty (all yes) and those who reported having no difficulty using all three measurement approaches (all no).
This final measure of discordance is particularly useful for two reasons. First, it identifies respondents not captured by either of the first two variables: those whose responses to ambiguous difficulty questions match neither underlying nor residual difficulty responses. Second, by focusing on the first two groups (matches underlying only and matches residual only), the variable allows for an analysis of whether some groups of respondents were more likely to interpret ambiguous questions consistent with underlying or with residual difficulty questions.
Table 2 summarizes how the six unique combinations of answers to underlying, residual, and ambiguous difficulty questions map into these three measures of discordance. For example, the first row shows that respondents who reported yes to underlying, residual, and ambiguous difficulty questions would be classified as "yes, yes" on the first measure of discordance (because underlying = yes and ambiguous = yes), "yes, yes" on the second measure (because residual = yes and ambiguous = yes), and "both (all yes)" on the third measure (because ambiguous matches both underlying and residual difficulty and because all three responses are positive).
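The mapping from answer combinations to the three discordance measures can be sketched as a small function. The function name and category labels are illustrative assumptions, not the study's variable names; the sketch also assumes, as the classification rules imply, that residual difficulty cannot occur without underlying difficulty, which is why only six of the eight yes/no combinations arise.

```python
from itertools import product


def discordance_measures(underlying, residual, ambiguous):
    """Map one activity's three yes/no answers to the three discordance measures."""
    if residual and not underlying:
        raise ValueError("residual difficulty implies underlying difficulty")

    def label(flag):
        return "yes" if flag else "no"

    measure1 = (label(underlying), label(ambiguous))  # underlying vs. ambiguous
    measure2 = (label(residual), label(ambiguous))    # residual vs. ambiguous

    matches_underlying = ambiguous == underlying
    matches_residual = ambiguous == residual
    if matches_underlying and matches_residual:
        measure3 = "both (all yes)" if ambiguous else "both (all no)"
    elif matches_underlying:
        measure3 = "underlying only"
    elif matches_residual:
        measure3 = "residual only"
    else:
        measure3 = "neither"
    return measure1, measure2, measure3


# Enumerate the combinations that can occur (residual requires underlying).
valid_combinations = [
    combo for combo in product([True, False], repeat=3)
    if combo[0] or not combo[1]
]
first_row = discordance_measures(True, True, True)
```

The `first_row` result corresponds to the example in the text: "yes, yes" on the first two measures and "both (all yes)" on the third.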
Health-related measures included in the analysis are self-rated health (excellent/very good, good, fair/poor), whether the respondent usually had pain, and scores on slightly modified versions (as described in Soldo et al. 1997, and Herzog and Wallace 1997) of the Center for Epidemiological Studies Depression Scale (CES-D; Kohout, Berkman, Evans, and Cornoni-Huntley 1993) and the Telephone Interview for Cognitive Status (TICS; Brandt, Spencer, and Folstein 1988). For purposes of this analysis, respondents with scores of 5 or more out of 8 on the modified CES-D were considered to have depressive symptoms. Respondents answering 7 or fewer items correct out of 10 items on the modified TICS were considered to have some cognitive impairment. Other cutoffs were investigated but did not change the results appreciably; these cutoffs were ultimately selected so that at least 10% of the sample were identified as having depressive symptoms or some cognitive impairment.
A general measure of social support was constructed from a question on whether the respondent had received help in the last 2 years from children or grandchildren with household chores, errands, or transportation. Two activity-level variables were also created, indicating whether the respondent received help or used equipment for one or more activities other than the activity being predicted. (This latter restriction was imposed in order to avoid having help and equipment use on both sides of the logistic regression models.) Finally, all models also control for the specific type of activity (e.g., getting across a room, dressing, or bathing).
Statistical Analysis
Descriptive (chi-square) statistics were calculated in SAS. Odds ratios (ORs) for characteristics associated with discrepant difficulty responses were calculated by exponentiating coefficients from logistic regression models (Hosmer and Lemeshow 1989) estimated with various subgroups of the activity-level sample.
All statistical tests and confidence intervals based on estimates from the activity-level sample (or subgroup thereof) were adjusted to take into account the clustering due to multiple observations per respondent. This was done by taking the variances of the logistic regression coefficients (produced by SAS under the assumption of a simple random sample) and inflating them by an estimated design effect (DEFF). A separate DEFF was calculated for each of the specific subsamples (DEFF ranged from 1.07 to 1.14), using Kish's formula for cluster design effects (Kish 1965, p. 162). Formally, $\mathrm{DEFF} = 1 + (B - 1)\rho$, where $\rho$ is the intraclass correlation coefficient (in this application, assumed to be the average correlation among the activity-specific response patterns) and $B$ is the average cluster size (or average number of activities per person in the given sample). In this application, $\rho$ ranged from 0.15 to 0.29 and $B$ ranged from 1.5 to 1.6, depending on the specific subsample.
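The adjustment can be sketched in a few lines. This is an illustration under stated assumptions rather than the SAS procedure used in the study: the function names are hypothetical, the coefficient and standard error are made-up inputs, and the 95% interval uses the usual normal approximation on the log-odds scale.

```python
import math


def kish_deff(rho, avg_cluster_size):
    """Kish's cluster design effect: DEFF = 1 + (B - 1) * rho."""
    return 1.0 + (avg_cluster_size - 1.0) * rho


def adjusted_odds_ratio(beta, se_srs, deff, z=1.96):
    """Return (OR, lower, upper) after inflating the simple-random-sample variance by DEFF."""
    se_adjusted = math.sqrt(deff * se_srs ** 2)
    return (
        math.exp(beta),
        math.exp(beta - z * se_adjusted),
        math.exp(beta + z * se_adjusted),
    )


# rho and B are taken from the low end of the ranges reported in the text;
# beta and se_srs are hypothetical values chosen only for illustration.
deff = kish_deff(rho=0.15, avg_cluster_size=1.5)
odds_ratio, lower, upper = adjusted_odds_ratio(
    beta=math.log(2.0), se_srs=0.30, deff=deff
)
```

Because DEFF exceeds 1 whenever respondents contribute more than one activity on average and responses are positively correlated, the adjusted interval is always wider than the unadjusted one; the point estimate of the OR is unchanged.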
| Results |
|---|
Discordance Between Underlying and Ambiguous Difficulty
At the individual level, the proportion of discordant responses to ambiguously phrased questions and questions about underlying difficulty ranges from 2.1% to 14.1%, depending on the specific activity (see Table 5). Discordance is highest for getting across a room (14.1%).
When the activity is the unit of analysis, only 7.3% of activities are classified as having discordant underlying and ambiguous responses. Discordance is distributed fairly evenly between the two types (4.2% and 3.1%, respectively).
Not surprisingly, the majority of concordant cases involve respondents who report no difficulty irrespective of question wording: 63.3% of respondents and 86.3% of activities are classified as having/involving no underlying (and therefore no residual) and no ambiguous difficulty. When respondents who report no difficulty to all three measures are eliminated from the analysis, the percentage of discordant responses is much more substantial, ranging from 40% to 64%, depending on the activity (not shown but calculated from Table 5).
Discordance Between Residual and Ambiguous Difficulty
The percentage of discordant responses to residual and ambiguously worded difficulty questions is very similar to that seen for underlying and ambiguous difficulty questions, shown in Table 5. That is, at the individual level, discordance between residual difficulty and questions that do not specify whether to consider help or equipment ranges from 3.5% to 9.9%, depending on the specific activity (see Table 6). Discordance appears to be highest for dressing and getting across a room (9.9% and 9.6%, respectively).
When the activity is the unit of analysis, only 7.2% of activities are classified as having discordant residual and ambiguous responses. Among discrepancies in the activity-level sample, the distribution is skewed toward those reporting ambiguous but no residual difficulty.
Again, when respondents who report no difficulty to all three measures are eliminated from the analysis (not shown but calculated from Table 6), the percentage of discordant responses is much more substantial, ranging from 40% to 75%, depending on the activity.
Discordance Among Underlying, Residual, and Ambiguous Difficulty
Considered next is the percentage of older persons whose responses to ambiguously worded questions match neither underlying nor residual difficulty responses. Depending on the specific activity, 1.4% to 6.5% of respondents answer ambiguous difficulty questions consistent with neither underlying nor residual difficulty (see Table 7). For all activities except eating, this type of discordance falls within a relatively narrow range (between 4.6% and 6.5%).
Focusing on the activity-level analysis, only 4.6% of all activities involve answers to ambiguous questions that match neither underlying nor residual difficulty answers. Among the 5.1% of cases that match only underlying or only residual difficulty responses, about half of the answers to ambiguously worded questions are consistent with only underlying difficulty (2.5%/5.1%), and the other half answer ambiguous questions consistent with residual difficulty (2.6%/5.1%).
Predictors of Discordance
Odds ratios (ORs) and 95% confidence intervals (CIs) from three separate logistic regression models are presented in Table 8. Each model was estimated using a subset of the activity-level sample. Model 1 includes activities for which respondents provide inconsistent answers to underlying and ambiguous difficulty questions. The outcome was coded so that 1 indicates the respondent reported underlying but no ambiguous difficulty and 0 indicates ambiguous but no underlying difficulty. Model 2 includes activities for which respondents provide inconsistent answers to residual and ambiguous difficulty questions. For this model, 1 indicates residual but no ambiguous difficulty and 0 indicates ambiguous but no residual difficulty. The final model includes activities for which responses to ambiguous difficulty questions match only underlying or only residual difficulty responses. In this case, 1 indicates that responses to ambiguously worded questions match only responses to underlying difficulty questions, whereas 0 indicates that responses to ambiguously worded questions match only responses to residual difficulty questions. All three models also include variables indicating the specific activity (e.g., getting across a room or dressing); coefficients for these variables, which in most cases were large and statistically significant, were omitted for the sake of brevity but essentially replicate the activity-specific patterns shown in Tables 5, 6, and 7.
When one considers which response ambiguous difficulty answers match, as shown in Column 3, two additional statistically significant relationships emerge. Among those whose responses to ambiguous difficulty questions match only underlying or only residual difficulty (Column 3), Hispanics are less than one fourth (OR = 0.22) as likely as others and those who receive help with another activity are twice as likely as others (OR = 2.11) to answer ambiguous difficulty questions consistent with underlying, rather than residual, difficulty.
| Discussion |
|---|
Most respondent characteristics investigated were not related to reporting discrepancies. At the same time, a few potentially systematic relationships did emerge that suggest interpretation of ambiguously worded questions might differ for some select subgroups of elderly persons: those who receive help and those of Hispanic origin.
Two limitations of this analysis are worth noting. First, one cannot be sure that all of the discrepancies between the ambiguously worded questions and other measures of difficulty are due to the ambiguities of the former. For example, errors in measuring equipment use and help, coupled with the lack of information on equipment use for activities other than getting across the room and getting in and out of bed, may also explain some of the inconsistencies. In addition, changes to the introduction between the Wave 2 core and experimental module interviews, with a more restrictive introduction preceding the ambiguously worded questions, may also explain some of these discrepancies. The estimates in this study should therefore be considered an upper bound on the extent of measurement error associated with ambiguously worded difficulty questions.
Second, the analysis of predictors of discordance was limited in part by sample size. Discrepancies were reported by fewer than 500 of the 1,000 or so respondents to the experimental module. Thus, the sample was large enough to detect only ORs on the order of 1.6 or higher with a power of .80 or higher. It may be that other respondent characteristics had important effects on response patterns, but there was not enough power to detect such effects.
Despite such limitations, this analysis offers some important insights into the extent and direction of measurement error associated with the use of questions about difficulty that do not specify whether to consider help or equipment. For example, underlying and ambiguous disability questions appear to be interchangeable for aggregate-level, descriptive purposes. In contrast, ambiguously worded difficulty questions are likely to substantially overstate aggregate levels of unmet need. The latter finding suggests that those interested in planning for the needs of older disabled Americans should be especially careful to rely on measures explicitly designed to capture residual difficulty.
At the individual level, the adequacy of ambiguously worded questions as a proxy for underlying and residual difficulty is less clear cut and depends on the specific analysis at hand. For example, analyses that use ambiguous difficulty questions to create a summary variable as a proxy for underlying difficulty with one or more activities will face measurement error for up to 15% of respondents. For analyses that focus on frail elderly persons, however, measurement error will be much more substantial.
For studies that use such a summary variable based on ambiguously worded questions to select a sample of older persons with underlying difficulty, about 75% of those selected into a sample will indeed have underlying difficulty with one or more activities (22.0/[22.0 + 7.1] from the second-to-last line of Table 5). It is somewhat reassuring that those inadvertently included are similar in number to, and do not appear to differ systematically from, those inadvertently excluded, except that the former overrepresent those receiving help.
In contrast, such a summary variable based on ambiguously worded questions was less successful in identifying a sample of older persons with residual difficulty. Only about half of those who report ambiguous difficulty with one or more activities report residual difficulty with one or more activities (15.1/[15.1 + 14.0] from the second-to-last line of Table 6). Thus, ambiguously worded questions are not particularly useful as a proxy for identifying a sample of respondents with residual difficulty.
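The two proportions quoted above follow directly from the percentages cited from Tables 5 and 6. A minimal sketch of the arithmetic, with a hypothetical function name, is:

```python
def share_with_target_difficulty(both_pct, ambiguous_only_pct):
    """Among respondents reporting ambiguous difficulty, the share who also
    report the target (underlying or residual) difficulty."""
    return both_pct / (both_pct + ambiguous_only_pct)


# Percentages of respondents as cited in the text from Tables 5 and 6.
underlying_share = share_with_target_difficulty(22.0, 7.1)
residual_share = share_with_target_difficulty(15.1, 14.0)
```

With these inputs the first share is roughly three quarters and the second roughly one half, matching the "about 75%" and "about half" figures in the text.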
This analysis also provides limited insight into how ambiguous difficulty questions are interpreted by select subgroups of older disabled Americans. For example, among those who report underlying but no residual difficulty (i.e., individuals who somehow accommodated their underlying deficits in functioning), Hispanic older persons (most of whom are of Mexican origin in this sample) are more likely to respond to ambiguous difficulty questions consistent with residual difficulty responses. This finding is consistent with previous research suggesting that cultural norms are important in influencing how one answers questions about physical well-being (for a review, see Angel and Angel 1997). Reports by persons of Hispanic origin may be affected, for example, by culturally conditioned beliefs about what is "normal," by health and functioning of their reference group, and by traditional norms that value independence. Further, those who receive help are more likely to answer ambiguously worded questions consistent with underlying, rather than residual, difficulty and more likely to report having ambiguous difficulty even if they report no underlying difficulty with an activity. This result is consistent with the notion, as discussed by Agree (1999), that reliance on caregivers to bridge deficits fosters a self-view of dependence.
Taken together, these findings provide guidance to researchers interested in minimizing measurement errors associated with ambiguously worded difficulty questions. First, whenever possible, analysts should avoid using such questions as a proxy for residual difficulty. Second, to minimize measurement error, analysts should consider using the ambiguously worded difficulty questions separately for each activity rather than as a summary indicator of having difficulty with one or more activities. Further, those researchers specifically interested in predicting care arrangements of older Americans should be aware that the percentage of cases for whom this measure would be endogenous in studies of care arrangements (that is, those who answer ambiguous questions consistent with residual, rather than underlying, difficulty) is on the order of 6%. Analysts interested in using ambiguously worded measures in studies of care arrangements should therefore consider appropriate methods to take into account such endogeneity.
In sum, this analysis demonstrates that the implications of asking questions about difficulty that do not specify whether to consider assistance can be benign or far-reaching, depending on the aim of the study. Survey researchers interested in improving the validity of disability measures may want to continue to investigate less ambiguous approaches to ascertaining whether older persons have difficulty with daily activities. Whether the costs of less ambiguous approaches (primarily in the form of additional items, which translates into increases in interviewer time and respondent burden and, in the case of future AHEAD waves, the loss of comparability over time) are worth the improvements in validity and reliability is a question for future research.
| Acknowledgments |
|---|
Received for publication April 9, 1999. Accepted for publication April 20, 2000.