BRIEF REPORT |
1 Center for Aging Research, Indiana University, and Regenstrief Institute, Inc., Indianapolis.
2 Health Services R&D Service, Department of Veterans Affairs Medical Center, and College of Public Health and Health Professions, University of Florida, Gainesville.
3 Department of Psychiatry, School of Medicine, Saint Louis University, St. Louis, Missouri.
4 Division of Biostatistics, Washington University School of Medicine, St. Louis, Missouri.
5 College of Public Health, University of Iowa, and Iowa City VAMC, Iowa City.
Address correspondence to Douglas K. Miller, MD, IU Center for Aging Research, 1050 Wishard Blvd., RG-6, Indianapolis, IN 46202. E-mail: dokmille{at}iupui.edu
| Abstract |
|---|
Methods. We examined test-retest reliability of subclinical status using Fried's measurement method of changing either the frequency or method of task performance for five functional limitations, three basic activities of daily living (ADLs), and four instrumental ADLs, as well as summary scales reflecting these three constructs. We also performed sensitivity analyses of test-retest interval and alternative definitional approaches (using only method, only frequency, or both).
Results. Weighted kappas for individual tasks across three performance levels (high functioning, subclinical status, and task difficulty) indicated moderate agreement for one task and substantial agreement for 11 tasks. Intraclass correlation coefficients for the three scales demonstrated outstanding agreement. The most reproducible definition of subclinical status involved the either/or method.
Discussion. Excellent test-retest reproducibility was demonstrated in this population-based sample of late middle-aged African Americans using Fried's method of measuring subclinical status.
Functional limitation and disability are core concepts in health status and health-related quality-of-life trajectories among older adults (Andresen, Rothenberg, & Zimmer, 1996; Wolinsky & Miller, in press). Functional limitation generally refers to reported difficulty performing muscular-skeletal activities (e.g., going up and down stairs, walking a half mile), while disability reflects reported difficulty performing basic activities of daily living (ADLs) and instrumental ADLs (IADLs; Wolinsky & Miller). As originally proposed by Fried and colleagues (Fried, Herdman, Kuhn, Rubin, & Turano, 1991), the concept of subclinical status (or "preclinical disability") is useful in understanding the earlier stages of functional limitation and disability. Theoretically, the ascertainment of subclinical status prior to progression to self-reported difficulty should provide an early warning system that can be successfully used to promote functional recovery and prevent the onset of task disability, which is crucial in the context of the Institute of Medicine's (2003) recent emphasis on disability prevention as a priority area for improving health care quality (Wolinsky, Miller, Andresen, Malmstrom, & Miller, 2005a). Although several approaches to operationalize subclinical status have been proposed (e.g., Binder et al., 2002; Hazuda, Gerety, Lee, Mulrow, & Lichtenstein, 2002), an easy and frequently used method is Fried's (Fried et al., 1996). In it, subclinical status is said to exist if the subject reports no difficulty performing a task but reports having modified his/her task performance in terms of either its frequency or its method.
Using this definition, subclinical status for functional limitations, ADLs, and IADLs has been shown to be prevalent and to be highly predictive of subsequent incident difficulty in samples of community-dwelling individuals either in or near their senior years. Fried and colleagues (1996) demonstrated subclinical status prevalence ranging from 2% to 33% for 27 tasks in a convenience sample of 231 adults aged 59 to 90 years. We showed subclinical prevalence among subjects who reported no task difficulty ranging from 10% to 40% for five functional limitations, three ADLs, and four IADLs in a population-based sample of 998 urban-dwelling African Americans aged 49 to 65 years. Moreover, both Fried and colleagues (Fried, Bandeen-Roche, Chaves, & Johnson, 2000) and we (Wolinsky et al., 2005a) have also shown that task-specific subclinical status is a consistent and strong predictor of the subsequent development of reported task difficulty, even after adjusting for relevant covariates.
Despite these useful attributes of subclinical status for functional limitations and disability, information on the reliability of its measurement is sparse. The only published data that we are aware of come from the Fried and colleagues (1996) study. In that report, the test-retest reliability of five tasks (dressing, preparing meals, self-managing medications, grasping and handling small objects, and walking a half mile) was assessed in 93 subjects selected at random from the sample of 231. Kappas ranged from 0.74 to 0.86. Despite the innovative nature of this work, Fried and colleagues' results have important limitations. Because it was not feasible to have their subjects return for reevaluation, the retest interval they used was only several hours long, increasing the possibility of recall and short-term learning effects and thus of reliability estimates that were biased upward. In addition, only five tasks were evaluated in a convenience sample of primarily Caucasian subjects.
In this study, we extended Fried and colleagues' (1996) work using a population-based, somewhat younger sample from a single, important minority group (African Americans) coupled with a more optimal retest interval (DeVellis, 2003) to examine the reliability of measuring subclinical status for 12 tasks. We also conducted sensitivity analyses to determine whether alternative definitional approaches to subclinical status (change in method only, change in frequency only, or change in both) would be equally or more reproducible than the original either/or approach. If so, the need to ask subjects about both method and frequency modifications might be reduced, assuming that the new definition also retained the predictive validity of the original definition.
| METHODS |
|---|
Fifty interviewers conducted an initial in-home interview and functional assessment averaging about 2.5 hours in length on the 998 participants between September 2000 and July 2001. In a substudy, 114 of the 998 subjects were randomly selected for retesting, and 92 (81%) completed the in-home repeated assessments. For 80 of these subjects, the same interviewer conducted both interviews, and 12 subjects had a different interviewer at retest than at test. Questions in the substudy included the difficulty and subclinical status questions for 12 tasks (five functional limitations, three ADLs, and four IADLs). There were no statistically significant differences between subjects participating in the substudy and other participants in the main study in terms of age, gender, income, education, 11 self-reported medical conditions, or reported level of difficulty (none, subclinical status, or difficulty) for any of the 12 tasks addressed in this article, with the exception of preparing meals and congestive heart failure. For preparing meals, 73% of the subjects in the retest subset had high functioning, 13% had subclinical status, and 14% had difficulty versus 76%, 19%, and 5%, respectively, in the other main study participants (p = .002). Twelve percent of the retest group reported congestive heart failure, while only 5% of those in the main study did (p = .004). Given that 30 comparisons were performed, these two differences were most likely due to chance. The interval between test and retest was 5 to 45 days (M = 18, median = 19, interquartile range = 13–22). For the most part, these intervals fit within the range identified as optimal to minimize subject and interview bias (several days to several weeks; DeVellis, 2003).
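One way to read the statement about 30 comparisons is through a simple multiple-comparison calculation. The sketch below is illustrative only: the significance level of 0.05, the assumption that the 30 tests are independent, and the use of a Bonferroni threshold are all assumptions made here, not procedures reported in the article. The function name `multiple_comparison_summary` is hypothetical.

```python
from math import comb


def multiple_comparison_summary(n_tests, alpha, p_values):
    """Summarize how surprising a set of small p-values is among n_tests.

    Assumes independent tests and a common nominal alpha (both assumptions).
    """
    # Expected number of nominally significant results if every null is true.
    expected_false_positives = n_tests * alpha

    # Bonferroni-adjusted per-test threshold (one possible correction).
    bonferroni_threshold = alpha / n_tests

    # Probability of seeing at least len(p_values) nominally significant
    # results by chance alone, from the binomial distribution.
    k = len(p_values)
    prob_fewer_than_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )

    return {
        "expected_false_positives": expected_false_positives,
        "bonferroni_threshold": bonferroni_threshold,
        "prob_at_least_k_by_chance": 1.0 - prob_fewer_than_k,
        "survive_bonferroni": [p for p in p_values if p < bonferroni_threshold],
    }


# The two reported p-values among the 30 comparisons described in the text.
summary = multiple_comparison_summary(30, 0.05, [0.002, 0.004])
```

Under these assumptions, neither reported p-value falls below the Bonferroni threshold, which is consistent with treating the two differences as chance findings.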
Functional limitation, ADL, and IADL items
Tasks addressed included five functional limitations (stoop, crouch, or kneel; lift and carry 10 pounds; go up and down 10 steps; grasp or handle ["for example, picking up a dime from the table"]; and walking a half mile), three ADLs (bathing or showering, dressing, and getting in and out of bed or chairs), and four IADLs (preparing meals, performing light housework, performing heavy housework, and managing medications). We ascertained task difficulty using the wording and approach of the Second Longitudinal Study on Aging (LSOA-II; National Center for Health Statistics, 1998). For example, we asked subjects, "Because of a health or physical problem do you have any difficulty bathing or showering?" Subjects who reported any difficulty or inability to perform a task were considered to have "difficulty." We then asked subjects who reported no difficulty performing a task whether, because of a health or physical problem, they had altered either the method or frequency of task performance since age 40. Subjects who reported neither difficulty nor task modification were considered to have "high functioning," and those who reported no difficulty but either modified the method or reduced the task frequency were determined to have "subclinical status" for that task. Subjects could also indicate that they did not perform the task for reasons other than health or physical problems, and this opportunity existed at both the test and retest administrations. We excluded subjects who responded, "Don't do for other reasons" at the time of either test or retest from the reliability assessment (see Table 1 for number of subjects included for each task). Thus, for each item, we placed each subject into one of three mutually exclusive and exhaustive functional categories: high functioning, subclinical status, or difficulty.
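The classification rule described above can be sketched as a small function. This is a minimal illustration of the three-category logic as stated in the text; the function name, argument names, and the use of `None` to mark excluded responses are hypothetical choices, not part of the study instrument.

```python
from typing import Optional

HIGH_FUNCTIONING = "high functioning"
SUBCLINICAL = "subclinical status"
DIFFICULTY = "difficulty"


def classify_task(
    has_difficulty: bool,
    changed_method: bool = False,
    changed_frequency: bool = False,
    not_done_other_reasons: bool = False,
) -> Optional[str]:
    """Place one subject's response for one task into a functional category."""
    # "Don't do for other reasons" responses are excluded from the
    # reliability assessment, represented here as None (an assumption).
    if not_done_other_reasons:
        return None
    # Any reported difficulty or inability counts as difficulty.
    if has_difficulty:
        return DIFFICULTY
    # No difficulty, but method or frequency modified: subclinical status.
    if changed_method or changed_frequency:
        return SUBCLINICAL
    # Neither difficulty nor modification.
    return HIGH_FUNCTIONING
```

For example, `classify_task(False, changed_frequency=True)` returns `"subclinical status"`, while a subject reporting difficulty is classified as `"difficulty"` regardless of the modification questions, which in the study were asked only of those without difficulty.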
; Armstrong, White, & Saracci, 1992
0.80 outstanding (Landis & Koch, 1977
| RESULTS |
|---|
Depending on the task, subclinical status was somewhat more prevalent (e.g., bathe, light housework) or less prevalent (e.g., heavy housework, walk a half mile) than difficulty and ranged from 10.8% for managing medications to 23.1% for lift and carry 10 pounds (Table 1). Kappas across the three levels for the individual tasks ranged from 0.40 for managing medications to 0.78 for going up and down 10 steps. Kappas indicated moderate agreement for one task and substantial agreement for 11 tasks. ICCs for the three scales (functional limitations, ADL, and IADL) were higher than the individual item kappas and showed outstanding agreement in each case.
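For readers who want to see how an agreement statistic of this kind is computed, the following is a minimal sketch of a weighted kappa over three ordered levels (0 = high functioning, 1 = subclinical status, 2 = difficulty). The weighting scheme used in the study is not stated in the text available here, so the linear disagreement weights below are an assumption, and the function name and toy ratings are hypothetical.

```python
def weighted_kappa(test, retest, n_levels=3, power=1):
    """Weighted kappa for two sets of ordinal ratings coded 0..n_levels-1.

    power=1 gives linear weights, power=2 quadratic weights (both assumptions
    about the weighting; the source does not specify which was used).
    """
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be non-empty and equal length")
    n = len(test)

    # Observed joint proportions of (test level, retest level).
    observed = [[0.0] * n_levels for _ in range(n_levels)]
    for a, b in zip(test, retest):
        observed[a][b] += 1.0 / n

    # Marginal proportions for each occasion.
    row = [sum(observed[i]) for i in range(n_levels)]
    col = [sum(observed[i][j] for i in range(n_levels)) for j in range(n_levels)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the largest gap.
        return (abs(i - j) / (n_levels - 1)) ** power

    observed_disagreement = sum(
        weight(i, j) * observed[i][j]
        for i in range(n_levels) for j in range(n_levels)
    )
    expected_disagreement = sum(
        weight(i, j) * row[i] * col[j]
        for i in range(n_levels) for j in range(n_levels)
    )
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical ratings for six subjects (not study data).
toy_test = [0, 0, 1, 1, 2, 2]
toy_retest = [0, 1, 1, 1, 2, 2]
toy_kappa = weighted_kappa(toy_test, toy_retest)
```

Weighting penalizes a shift between adjacent levels (for example, high functioning to subclinical status) less than a shift across two levels, which suits the ordered three-level outcome used here.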
To examine the potential effect of the test-retest interval on the results, we examined the item and scale reliability separately in the group whose test-retest interval was 5 to 18 days and in the group with a 19- to 45-day interval. Including all individual items, kappas for the shorter interval group averaged 0.045 higher than for the longer interval group, but kappas for the three domain summary scores were 0.023 higher for the longer interval group (data not shown; available from first author on request).
We also examined the reasons for subclinical status determination (changed method only, changed frequency only, or changed both) at both the test and retest assessments for respondents in the retest subgroup who had no difficulty for that task at either assessment. These analyses were performed to investigate (a) which of the two factors (changed method or changed frequency) was the more common reason for subclinical status determination, (b) whether the reason for subclinical status was stable from test to retest, and thus (c) whether one of the two factors could be dropped without reducing reliability. For all 12 items, change in frequency was the more common reason for subclinical status, but this varied across the items and generally in content-appropriate ways (Table 2). For example, change in frequency was by far the more frequent reason for subclinical status determination for light housework, but change in method was more common for managing medications. However, when kappas could be determined for agreement regarding which reason caused a subclinical status determination (seven items), they ranged from 0.17 to 0.66. These kappas were substandard for five tasks and moderate for two tasks.
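The comparison of definitions can be illustrated with a toy calculation. The sketch below uses invented responses (not study data) for five subjects without task difficulty, and raw percent agreement rather than kappa, to show how the either/or definition can remain stable even when the reported reason switches between occasions. The definition names and the reading of "both" as requiring both changes are assumptions.

```python
# Each tuple: (method changed at test, frequency changed at test,
#              method changed at retest, frequency changed at retest).
# Hypothetical responses for illustration only.
responses = [
    (True, False, False, True),   # reason flips, still modified both times
    (False, True, False, True),   # same reason both times
    (True, True, True, False),    # both at test, method only at retest
    (False, False, False, False), # no modification at either occasion
    (False, True, True, False),   # reason flips again
]

# Candidate definitions of subclinical status (names are assumptions).
definitions = {
    "either": lambda method, frequency: method or frequency,
    "method": lambda method, frequency: method,
    "frequency": lambda method, frequency: frequency,
    "both": lambda method, frequency: method and frequency,
}


def percent_agreement(rule, data):
    """Share of subjects given the same status at test and retest."""
    matches = sum(
        rule(m_test, f_test) == rule(m_retest, f_retest)
        for m_test, f_test, m_retest, f_retest in data
    )
    return matches / len(data)


agreement = {name: percent_agreement(rule, responses)
             for name, rule in definitions.items()}
```

In this toy data the either/or rule classifies every subject consistently, while each single-question rule disagrees for the subjects whose stated reason changed. Percent agreement does not correct for chance, so it is only a rough stand-in for the kappas reported in the article.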
| DISCUSSION |
|---|
= 0.40). The reliability was especially strong when the items were combined into scales according to their functional category. Moreover, by demonstrating this finding for 12 tasks in a population-based group of community-dwelling African Americans who were about 17 years younger on average than Fried's sample, we have extended their findings to a representative sample of different age and race, to a larger group of functional tasks, and to a more appropriate test-retest interval.
Kappas in our sample averaged 0.14 lower for the five tasks that Fried and colleagues (1996) also examined. The biggest difference related to managing medications, for which the reliability in our study was 0.40 versus 0.86 in theirs. If that one item is removed, then the difference in reliability for the other four items between the two studies is only 0.07. The somewhat lower reliability in our study is plausible given that the shorter test-retest interval in their investigation would permit greater recall of prior responses.
Although our examination of the specific reasons for subclinical status determination was limited somewhat by the relatively small number of subjects in each category (Table 2), the results strongly suggest that both the change-in-method and change-in-frequency questions must be used to determine subclinical status, for two reasons. First, the factor that resulted in subclinical status determination varied across tasks. Second, the reason for subclinical status (change in method or change in frequency) varied from test to retest even within each task, so that the either/or combined measure produced excellent reliability, whereas either factor on its own would have generated a less reliable measure. These data suggest that the distinction between change in method and change in frequency may not have been entirely clear to the subjects. Qualitative investigation of subjects' interpretation of and response to these questions, for example using cognitive interviewing techniques (Krause, 2002), would help clarify these issues.
Potential limitations of our investigation should be kept in mind. We examined the test-retest reliability for only 12 tasks in a single minority population of restricted age in a single locale. Moreover, while most of the retest interviews were performed by the same interviewer as the original interview, in 12 cases interviewers differed between test and retest assessments. Despite these potential limitations, the consistency of our findings both internally and in comparison with the findings of Fried and colleagues (1996) suggests robust reliability for this method of determining subclinical status.
| Footnotes |
|---|
Received for publication February 9, 2005. Accepted for publication July 29, 2005.