The Journals of Gerontology Series B: Psychological Sciences and Social Sciences 56:P119-P128 (2001)
© 2001 The Gerontological Society of America


RESEARCH ARTICLE

Encoding Tasks and the Processing of Perceptual Information in Young and Older Adults

Maura Pilotti (a), Tim Beyer (a) and Mariya Yasunami (a)

(a) Department of Psychology, Washington University, St. Louis, Missouri

Maura Pilotti, Department of Psychology, Washington University, One Brookings Drive, Campus Box 1125, St. Louis, MO 63130-4899 E-mail: mpilotti{at}eudoramail.com.

Decision Editor: Margie E. Lachman, PhD


    Abstract
 
This study examined the degree to which different tasks promote the encoding of the characteristics of a talker's voice in young and older adults, and whether these characteristics encoded in long-term memory facilitate spoken word identification under difficult listening conditions. During the encoding phase, participants were given extensive exposure to the voices of two talkers and performed tasks that focused their attention on either voice characteristics (explicitly or incidentally) or linguistic information. Subsequently, participants identified novel words masked by noise, half of which were spoken by one of the familiar talkers and half by an unfamiliar talker. Young adults identified with greater accuracy words spoken in a familiar voice, whereas older adults benefited from voice familiarity only under instructions that promoted attention to voice characteristics either explicitly or incidentally. Age-related declines in sensory uptake (hearing loss) accounted for most of these task-dependent voice effects.

The principal challenge confronting listeners in spoken word identification is that talkers' voices substantially modify the spectral and temporal properties of speech signals (Ladefoged 1980; Peterson and Barney 1952), so that a word spoken by two different talkers results in slightly different acoustic patterns. Several studies (Church and Schacter 1994; Goldinger 1996, 1998; Nygaard and Pisoni 1998; Nygaard, Sommers, and Pisoni 1994; Pilotti, Bergman, Gallo, Sommers, and Roediger 2000; Sommers 1999) have demonstrated that young listeners confront this challenge by encoding in long-term memory the unique characteristics of a talker's voice (e.g., pitch, melodic contours, and speech rate), which are then used to identify words spoken by that talker. The goals of this investigation were (a) to examine the conditions that promote the encoding of voice information in memory, and (b) to assess whether age-related declines exist in this form of encoding.


    Young Adults: Research Findings
 
The bulk of evidence supporting the notion that nonlinguistic properties of the stimulus material (i.e., the distinctive qualities of a speaker's voice) are encoded in long-term memory has emerged primarily from two different versions of the repetition priming paradigm: word and voice priming. In the word-priming paradigm, subjects are exposed to two experimental phases, study (encoding) and test, neither of which involves instructions to explicitly encode or remember information presented during the experiment. At study, subjects are usually asked to judge some relevant linguistic characteristics of a set of words (incidental encoding instructions), such as their meaning or phonetic content. At test, subjects perform tasks that are ostensibly unrelated to the previously presented information. For instance, subjects identify degraded words, half of which were heard during the earlier phase of the experiment (encoding phase) and half of which are novel. In this context, performance is facilitated for previously presented information relative to information encountered only at test, although subjects typically make no conscious attempt at remembering. This facilitation is termed "word priming."

In studies using this paradigm (Church and Schacter 1994; Goldinger 1996, 1998; Pilotti et al. 2000; Sheffert 1998), priming has been found to be greater for words repeated at test in the same voice as at study than for words repeated in a different voice, even though the encoding instructions focused listeners' attention on linguistic information (either phonetic or semantic). These findings indicate that, even with a limited exposure to a talker's voice and without instructions to remember either the words produced by that talker or his/her voice, young adults incidentally encode both forms of information in long-term memory. These findings, however, also suggest that young adults develop memory representations of spoken words in which each word is encoded with the voice characteristics of the talker who said that word (i.e., specific voices saying specific words), implying that any beneficial effect of perceptual information on word identification is dependent on previously encoded phonetic/lexical information.

Interestingly, the voice-priming paradigm devised by Nygaard and associates (see Nygaard and Pisoni 1998; Nygaard et al. 1994) has provided evidence suggesting that voice characteristics encoded in long-term memory can also affect word identification independently of previously encoded phonetic/lexical information. In this paradigm, however, subjects are first exposed to a lengthy voice-encoding phase in which they are explicitly required to become familiar with the voices of a set of talkers (explicit or intentional encoding). Specifically, subjects are asked to learn the association between a set of visually presented talkers' names and their voices by listening to a large number of words spoken by these talkers. At test, subjects identify degraded words, half of which are spoken by the familiar talkers and half by novel talkers. In contrast to the word-priming paradigm, all words used in the identification test are different from those of the voice familiarization set. In this experimental context, novel words spoken by familiar talkers are generally identified more accurately than those spoken by unfamiliar talkers (Nygaard et al. 1994).

At first sight, these findings suggest that the encoding of voice information in long-term memory can be empirically separated into two qualitatively different stages. With a limited exposure to a talker's voice, young adults appear to develop incidental records of spoken words, in which the phonetic/lexical content and perceptual details of each word, although stored separately, remain linked together via associative connections (Schacter and Church 1992) or constitute a single memory unit stored in a context-sensitive word-recognition system (Goldinger 1996). Consequently, young subjects display higher identification for words repeated at test in the same voice as at study than for words repeated in a different voice (see word-priming results). With considerable exposure to a talker's voice, young adults seem to be able to dissociate these two sources of information so as to show a benefit in the identification of novel words that match the encoded perceptual details (as demonstrated by the voice-priming results).

The name–voice association task used by Nygaard and colleagues (1994), however, required subjects to explicitly extract voice characteristics from the speech signals of the encoding phase. Therefore, the possibility exists that an encoding task focusing subjects' attention on voice information could lead subjects to attend to voice characteristics in the subsequent identification test, making the beneficial effect of voice familiarity on word identification observed by Nygaard and coworkers a carryover effect of the encoding task. A similar argument applies to the findings of a recent study conducted by Yonan and Sommers (2000). In this study, a lengthy voice-encoding phase with sentences spoken by several talkers was followed by an identification test with novel sentences masked by noise. During the voice-encoding phase, subjects were instructed to either become familiar with the voices of a set of talkers (explicit encoding) or focus on semantic information (i.e., the meaning of the last word; incidental encoding). At test, subjects were required to identify the final word of novel sentences, half of which were spoken by the familiar talkers, and half by novel talkers. Yonan and Sommers reported that explicit encoding yielded voice effects similar to those observed following incidental encoding. However, prior to the identification test, subjects in both encoding conditions were given a voice-discrimination test with 160 sentences, half of which were spoken by the familiar talkers of the encoding phase and half by novel talkers. Interestingly, incidental encoding yielded considerably lower voice-discrimination scores than explicit encoding, even though it produced equivalent voice effects in the word identification test. Therefore, the possibility exists that the incidental encoding instructions might have yielded little retention of voice characteristics, with the mere length of the discrimination test giving subjects the opportunity to encode these characteristics in long-term memory. The discrimination test might also have led subjects to attend to voice characteristics in the subsequent identification test, nullifying any effect that the encoding manipulation might have had on word identification.


    Older Adults: Research Findings
 
In contrast to the variety of studies showing that phonetic/lexical coding and voice processing are closely tied together in young adults, only a few studies have examined this issue in elderly adults, and their findings have been ambiguous. For example, in the word-priming paradigm discussed above, Schacter, Church, and Osowiecki (1994) found that older adults, but not young adults, were unaffected by voice changes between study and test, yielding virtually equivalent same- and different-voice priming. Sommers (1999), however, found voice effects on priming in similarly aged adults, even though the encoding instructions, as in Schacter and colleagues' investigation, focused elderly adults' attention on linguistic information (either phonetic or semantic). In the voice-priming paradigm discussed above, Yonan and Sommers (2000) also found that voice familiarity benefited the identification of novel words masked by noise across age groups. Interestingly, this pattern of voice effects was observed in spite of elderly subjects' difficulties in identifying the familiar talkers of the voice-discrimination task, which preceded the word identification test where these benefits were observed. Voice effects were also reported to be independent of the encoding instructions (explicit vs incidental), although older adults as well as young adults were overall less proficient in the voice-discrimination task following the incidental encoding condition. As discussed earlier, however, the incidental condition of this investigation can hardly be defined as incidental when followed by a lengthy voice-discrimination task focusing subjects' attention on voice characteristics. Consequently, without additional evidence, the report of task-independent voice effects in the voice-priming paradigm remains questionable.

How can these apparently contradictory findings be explained? Clearly, the uptake of sensory information is reduced and/or disrupted in old age as a result of hearing loss and/or other peripheral impairments (Florentine et al. 1993; Konig 1957; Moore, Peters, and Glasberg 1992; Schneider 1997; Schneider and Pichora-Fuller 2000). Consequently, the possibility exists that phonetic/lexical coding and voice processing, which are closely tied together in young adults, might be altered by age-related changes in the uptake of sensory information. Of course, a compromised sensory uptake provides the word recognition system with a degraded input, making word identification more difficult for elderly subjects (see Schneider and Pichora-Fuller 2000). Schneider (1997) has suggested that a degraded input to the word recognition system leads elderly adults to devote cognitive resources to the recovery of the phonetic information lost at the periphery (top–down processing). Therefore, in encoding tasks promoting linguistic processing, the possibility exists that the diversion of resources to the recovery process, although necessary for word identification, might weaken the processes involved in encoding the attributes of a novel voice in long-term memory. Consequently, in elderly adults these encoding tasks should yield either no voice effects or voice effects smaller than those yielded by tasks explicitly promoting voice processing (e.g., name–voice association and voice-discrimination tasks). The finding of Yonan and Sommers (2000), who reported voice effects in elderly subjects in the voice-priming paradigm, is consistent with this hypothesis if we assume that the effects of the incidental encoding condition were driven by the voice-discrimination test. The finding of Schacter and associates (1994), who reported that elderly adults did not yield voice effects in the word-priming paradigm with encoding instructions focusing subjects' attention on linguistic information, clearly supports this hypothesis. However, the finding of voice effects in elderly adults reported by Sommers (1999) in the same paradigm is difficult to interpret with respect to this hypothesis because there were no instructions explicitly focusing elderly adults' attention on voice processing. Therefore, it is unclear whether the size of these effects would fluctuate with different encoding instructions as predicted by the assumption of age-related declines in voice processing.

In light of these unresolved issues, the first goal of the present study was to reexamine the effects of encoding instructions (intentional/explicit vs incidental) on the identification of novel words displaying either familiar or unfamiliar voice patterns. To this end, we exposed young and older subjects to an extensive voice familiarization (encoding) session to give both age groups the opportunity to develop a memory of each voice. In contrast to the voice familiarization session of Yonan and Sommers (2000), which involved talkers of different genders, our participants heard only two male voices. This permitted us to avoid confounding memory of gender with memory of voice characteristics per se, and to limit the cognitive load involved in processing several voices (Mullennix and Pisoni 1990).

In Experiment 1, the voice familiarization phase was administered under two instructional conditions (explicit and incidental) to assess whether the encoding of voice information is modulated by the type of instructions. In the explicit (E) encoding condition, subjects were to learn the association between a name and a voice as in Nygaard and colleagues' study (1994). Therefore, this condition was intended to promote the processing of voice information irrespective of the specific words spoken by each talker. In the incidental (I) encoding condition, subjects were asked to judge the clarity of enunciation of spoken words. Therefore, this condition, which required subjects to judge the quality of the phonetic content of each word with no explicit reference to voice characteristics, was assumed to focus subjects' attention on linguistic information. Voice characteristics, however, modified the acoustic patterns that listeners were to judge for clarity (Ladefoged 1980; Peterson and Barney 1952). Consequently, the processing of voice information was an integral aspect of the clarity-of-enunciation task, although it was unclear whether this processing would promote the same encoding and use of voice information as the name–voice association task.

After each encoding condition, listeners identified words masked by noise. Because we were interested in the long-term encoding of voice characteristics irrespective of word information, the words used in the identification test were all novel. Although novel, half of the words were spoken by one of the familiar talkers and half by a novel talker (both men). In this context, novel words exhibiting unfamiliar voice patterns served as a control condition. The findings of Nygaard and coworkers (1994) and Yonan and Sommers (2000) led us to expect that with explicit encoding instructions both young and older adults would identify with higher accuracy words spoken by the familiar talker (voice effects). Whether the effect of voice familiarity on word identification would be modulated by the encoding manipulation in either age group was a matter of empirical investigation. Of course, the question of interest here was not whether subjects develop a memory of a talker's voice after extensive exposure to that voice. It is obvious that subjects have knowledge about familiar voices in long-term memory, as illustrated by their ability to recognize the voice of a friend over the telephone. Rather, the question of interest was whether without prompting (i.e., attention to voice characteristics during encoding), subjects would develop a sufficiently detailed knowledge of a talker's voice that could be used to identify novel words spoken in that voice. We hypothesized that if mere exposure is sufficient to induce this form of processing in either age group, encoding instructions focusing subjects' attention on either voice or word information should yield equivalent voice effects at test. We also hypothesized that, if age-related declines in sensory uptake weaken older adults' ability to process voice information, age differences in voice effects should be observed at test primarily following encoding instructions promoting attention to linguistic information.


    Experiment 1
 
Methods
Participants.
Forty-eight young and 48 older adults participated in the experiment. The mean age of young adults was 20.9 years (range: 18–27), and the mean age of older adults was 73.6 years (range: 61–84). The young adults were undergraduate students who participated in the experiment for course credit. The older adults were community members from the aging and development subject pool of Washington University who were paid for participating in the experiment. Young and older participants reported themselves as being in good health for their age, and none wore a hearing aid. Young adults had completed an average of 15.13 (SD = .94) years of formal education and older adults an average of 14.58 years (SD = 2.41); this difference did not reach significance (t = 1.45). All participants were given the vocabulary subsection of the Shipley test (Shipley 1940). The mean vocabulary score of young adults was 31.44 (SD = 3.90), and the mean score of older adults was 35.30 (SD = 2.87). The difference between means, favoring the older adults, was significant, t (94) = 5.52, p = .0001.
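The vocabulary comparison can be checked from the reported summary statistics alone. The following is a minimal sketch, assuming a pooled-variance (Student) independent-samples t test; this choice is consistent with the reported 94 degrees of freedom but is not stated in the text, and the function name `independent_t` is illustrative only.

```python
import math

def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic computed from group summary statistics.

    Assumption: equal-variance (Student) test, df = n1 + n2 - 2.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Shipley vocabulary scores as reported: older adults vs young adults.
t_vocab, df_vocab = independent_t(35.30, 2.87, 48, 31.44, 3.90, 48)
print(f"t({df_vocab}) = {t_vocab:.2f}")
```

With the rounded means and standard deviations given in the text, this yields a t of about 5.52 on 94 degrees of freedom, in agreement with the reported value.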

Because a major concern in studies comparing performance of young and older adults in auditory tests is the decreased uptake of sensory information in the aged, we collected pure-tone air-conduction thresholds for octave frequencies from 250 to 4000 Hz from both young and older adults. Average pure-tone air-conduction thresholds, which provide a measure of participants' hearing acuity, and standard errors are reported in Fig. 1. These data were submitted to a mixed factorial analysis of variance (ANOVA) with frequency and age as factors. This analysis produced main effects of age, F (1,94) = 274.22, MSe = 174.50, and frequency, F (4,376) = 53.91, MSe = 43.40, and a reliable interaction, F (4,376) = 87.01, MSe = 43.40. Although young and older adults differed in auditory acuity at all the selected frequencies (p < .05), high-frequency information yielded the largest group differences (see Appendix, Note 1).



 
Figure 1. Experiments 1 and 2. Average pure-tone thresholds (in dB) and standard errors of young and older participants for octave frequencies from 250 to 4000 Hz.

 
Stimuli.
The stimuli consisted of 300 polysyllabic English words (see Pilotti et al. 2000). The stimulus words were either of low or medium frequency (M = 9.4, SD = 14.8; Kucera and Francis 1967) and had a mean familiarity rating of 6.7 (SD = .6; Nusbaum, Pisoni, and Davis 1984). Fifty additional words were chosen to serve as fillers at the beginning of the familiarization and test sessions.

All the stimuli, recorded by six male talkers in a sound-attenuating booth, were digitized at a sampling rate of 20 kHz on an IBM-compatible computer using a 16-bit analog-to-digital converter equipped with anti-aliasing filters. The amplitude levels of all the stimuli were digitally equated to the same root mean square (RMS) using a software package specifically designed to modify speech waveforms. The stimuli were presented at 80 dB sound pressure level. Prior to experimental implementation, each word token, presented in the clear, was checked independently for mispronunciations and misarticulations by two raters. Tokens that were judged as containing any of these production errors were re-recorded. Approximately 10% of the word tokens of each talker needed to be replaced.

The 300 stimulus words were organized in three lists of 100 words each. Because the familiarization sessions of the explicit (E) and incidental (I) phases involved different tasks and talkers, the same list of 100 words was used for both familiarization sessions. The remaining two lists of 100 words were used in the test sessions. Lists were matched for frequency and familiarity. Both the familiarization and test lists included filler words, placed at the beginning of each list for subjects to practice. The word tokens of the familiarization sessions were presented in the clear, whereas the words of the test lists were masked by white noise. The noise was 5 dB louder than the signal (signal-to-noise ratio, S/N = −5 dB).
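The masking arithmetic can be illustrated with a short sketch. It is not the software used in the study (which the text does not name); it assumes that the signal-to-noise ratio is defined on RMS amplitudes, S/N (dB) = 20 log10(RMS signal / RMS noise), that the white noise is Gaussian, and it uses a synthetic tone as a stand-in for a recorded word token. All function names are hypothetical.

```python
import math
import random

def rms(samples):
    """Root mean square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def scale_to_rms(samples, target_rms):
    """Rescale samples so that their RMS equals target_rms."""
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]

def mask_with_white_noise(signal, snr_db, rng):
    """Add Gaussian white noise at the requested S/N (in dB).

    Returns the masked signal and the scaled noise.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # S/N = 20*log10(rms_signal / rms_noise), solved for rms_noise.
    noise_rms = rms(signal) / (10 ** (snr_db / 20))
    noise = scale_to_rms(noise, noise_rms)
    masked = [s + n for s, n in zip(signal, noise)]
    return masked, noise

def measured_snr_db(signal, noise):
    return 20 * math.log10(rms(signal) / rms(noise))

# Stand-in "word token": 0.5 s of a 440 Hz tone at the 20 kHz sampling rate.
sample_rate = 20000
token = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate // 2)]
token = scale_to_rms(token, 0.1)  # equate RMS across tokens (arbitrary target)

masked, noise = mask_with_white_noise(token, snr_db=-5.0, rng=random.Random(0))
print(round(measured_snr_db(token, noise), 3))
```

At S/N = −5 dB the noise RMS is 10^(5/20), or roughly 1.78 times, the signal RMS, which is what "5 dB louder than the signal" amounts to under this definition.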

The six talkers were first randomly assigned to two sets of three to ensure that the talkers heard in the E phase were never heard in the I phase. A Latin square design was then used to assign talkers to the voice familiarization and test sessions so that in either the I or E phase, a subject would be first familiarized with two voices, one of which would be subsequently used at test along with a novel voice. This procedure produced three unique combinations of talkers (2 familiar voices and 1 unfamiliar voice) for each experimental phase. Therefore, because the voices used in one phase were never used in the other phase, each participant at test always heard one familiar voice and one unfamiliar voice. Forty-eight unique combinations of talkers, lists, and order were obtained by counterbalancing the test lists assigned to each phase, the familiar talkers selected for a given test list, the words of the test lists assigned to familiar and unfamiliar talkers, and the order in which each phase was administered.
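The talker rotation can be sketched as follows. Only the first step (three combinations per set of three talkers, each holding out one talker as the unfamiliar test voice) follows directly from the description; the crossing with four two-level factors is one possible reading of the counterbalancing scheme that happens to give 48 cells, and is an assumption rather than the documented design. Talker labels, list names, and factor names are hypothetical.

```python
from itertools import product

def talker_combinations(talker_set):
    """Hold out each talker in turn as the unfamiliar test voice."""
    combos = []
    for held_out in talker_set:
        familiar = tuple(t for t in talker_set if t != held_out)
        combos.append({"familiar": familiar, "unfamiliar": held_out})
    return combos

# Hypothetical labels; the article does not name the six talkers.
explicit_set = ("T1", "T2", "T3")
combos = talker_combinations(explicit_set)

# Assumed two-level counterbalancing factors (an interpretation of the text).
factors = {
    "test_list_for_E_phase": ("list A", "list B"),
    "familiar_talker_used_at_test": ("first", "second"),
    "word_half_spoken_by_familiar_talker": ("half 1", "half 2"),
    "phase_order": ("E then I", "I then E"),
}

conditions = [
    {"talkers": combo, **dict(zip(factors, levels))}
    for combo in combos
    for levels in product(*factors.values())
]
print(len(combos), len(conditions))
```

Under these assumptions the sketch yields 3 talker combinations and 3 × 2 × 2 × 2 × 2 = 48 counterbalancing cells, one per participant in each age group.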

Procedure and design.
The experiment was presented as an investigation of auditory perception consisting of a series of independent tasks. Participants were exposed to two experimental phases, explicit (E) and incidental (I), counterbalanced across subjects. Each phase involved a voice familiarization (encoding) session followed by a test session in which the task was to identify words masked by noise. The main difference between these two phases was the task of the familiarization session, which required either the explicit or the incidental encoding of voice information.

In the E phase, subjects were first familiarized with two voices by learning to associate a person's name with a voice (explicit encoding). Subjects were given 20 practice trials and 400 randomly presented trials, each involving a word spoken by one of two male talkers. To ensure maximum exposure to the two voices, each talker spoke the same 100 words twice. On any given trial of this session, two names (John and Paul) appeared on the screen before subjects heard a word spoken by one of these two talkers. Subjects were asked to identify the talker who spoke that word by pressing one of two keys on the computer keyboard, which were labeled "John" and "Paul." Participants started the task by guessing, as no preexisting association existed between names and voices. Feedback on each trial provided the opportunity for learning the correct name–voice pairings, which was the prerequisite for entering the next phase of the experiment. Performance in the last 50 trials of the name–voice association task was near ceiling for both age groups (young adults: M = 97%, SD = 3; older adults: M = 96%, SD = 4; t (94) = 1.62, NS), indicating that subjects satisfied this prerequisite. After the explicit voice-familiarization session, subjects were given the identification test, including 5 words for practice and 100 novel words, all masked by noise. Subjects were asked to identify each word and report their answers in a booklet containing 105 numbered blanks (test session). Half of the words were spoken by one of the familiar male talkers and the other half by a novel male talker.

In the I phase, subjects first became familiar with two male voices by performing a clarity rating task (incidental encoding). As in the other phase, subjects were given 20 practice trials at the beginning of the familiarization session and 400 randomly presented trials, each involving a word spoken by one of two male talkers (the same 100 words were spoken by both talkers twice). On any given trial, subjects heard a word spoken by one of these talkers and rated its clarity of enunciation on a 7-point scale (from 1 = poorly enunciated to 7 = very well enunciated) by pressing the key on the computer keyboard that corresponded to their answer. As in the E phase, at test participants heard 5 words for practice and 100 novel words, all masked by noise. Subjects were asked to identify each word and report their answers in a booklet containing 105 numbered blanks. Half of the words were spoken by one of the familiar male talkers and half by a novel male talker.

The experiment lasted approximately 2.5 hours. To minimize fatigue effects, there was a 10-minute break between experimental phases and a 5-minute break between familiarization and test sessions within each phase. Subjects were tested individually in a sound-deadened testing room. The experiment involved a mixed factorial design with encoding task (explicit vs incidental) and test voice (familiar vs unfamiliar) as within-subjects factors. Age was the only between-subjects factor.


    Results and Discussion
 
The percentage of words correctly identified at test and standard errors as a function of encoding task, test voice, and age are displayed in Fig. 2. There are three points to make from Fig. 2. First, overall, young adults identified more words than older adults. Second, irrespective of the encoding task, young adults identified more accurately words pronounced by a familiar talker than words pronounced by an unfamiliar talker. Third, older adults also displayed higher accuracy for words spoken by a familiar talker, but only after the explicit encoding task.



 
Figure 2. Percentage of words correctly identified by young and older adults and standard errors as a function of encoding task and test voice.

 
The observations regarding the effects of voice familiarity on word identification were supported by a 2 (familiar vs unfamiliar test voice) x 2 (explicit vs incidental encoding session) ANOVA conducted on the percentage of words correctly identified at test by each age group (see Appendix, Note 2). Unless otherwise indicated, all statistical tests reported here are significant at the .05 level. In young adults, this analysis yielded only a main effect of test voice, F (1, 47) = 297.28, MSe = 8.41, Eta-squared = .86 (other Fs < 1.50), suggesting that word identification benefited from voice familiarity following both encoding tasks. In older adults, this analysis yielded a main effect of test voice, F (1, 47) = 24.71, MSe = 19.48, Eta-squared = .35, and an interaction between voice and encoding task, F (1,47) = 27.71, MSe = 17.83, Eta-squared = .38 (encoding task: F = 1.87, NS). Tests for simple main effects clarified this pattern of results. Young adults benefited from voice familiarity in both encoding conditions (incidental encoding: t (47) = 10.98; explicit encoding: t (47) = 11.30), yielding voice effects of the same magnitude (t < 1). This finding rules out the notion that attention to voice information during encoding may indirectly promote voice effects in word identification (at least in young adults). Older adults displayed a similar pattern of facilitation in the explicit encoding condition (t (47) = 12.17), but not in the incidental condition (t (47) < 1), indicating an age-related change in the processing of voice information.
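The effect sizes reported with these F tests can be approximated from the F values and their degrees of freedom. The sketch below assumes that "Eta-squared" refers to partial eta-squared, computed as (F × df effect) / (F × df effect + df error); the text does not say which variant was used, and the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Main effect of test voice in young adults, F (1, 47) = 297.28.
young_voice = partial_eta_squared(297.28, 1, 47)
print(round(young_voice, 2))
```

For the young adults' main effect of test voice this gives approximately .86, matching the reported value. Some of the other reported effect sizes differ from this formula by about .01, so it should be read as an approximation of whatever computation the original analysis used.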

Interestingly, the explicit encoding of voice information yielded a similar pattern of facilitation for young and older adults (words spoken by a familiar talker minus words spoken by an unfamiliar talker: young: 7%; older: 6%, t (94) < 1). Age differences, however, emerged in the incidental encoding condition (7% vs 0%; t (94) = 5.20). These findings indicate that although young adults can encode voice information in long-term memory and use it to promote on-line speech perception irrespective of the encoding task, older adults' ability to encode and use voice information is task dependent. Specifically, older adults appear to process the characteristics of a talker's voice only when their attention is explicitly directed to these characteristics.

In this experiment, there were age differences not only in voice effects, but also in overall identification performance. Specifically, older adults were less accurate in identifying words masked by noise across all the encoding conditions and test voices, F (1,94) = 61.89, MSe = 41.49, Eta-squared = .40. Because age differences in hearing acuity existed in our sample of participants, we examined whether hearing loss could account for these patterns of results. Hearing loss was defined as the average decrement in pure-tone sensitivity across all the selected frequencies relative to the normative value of 25 dB, which is the value that defines normal hearing in young adults. As seen in Fig. 1, which displays average pure-tone thresholds for both young and older adults, the elderly adults of our sample exhibited hearing losses primarily in the high-frequency range. A 2 (familiar vs unfamiliar test voice) x 2 (explicit vs incidental encoding session) x 2 (young vs old) analysis of covariance was then conducted on the percentage correct identification scores with hearing loss as the covariate. In this analysis, there were no age differences in performance, F = 2.83, p = .1, indicating that age-related declines in pure-tone sensitivity accounted for the overall lower word-identification rate of the aged (see also Yonan and Sommers 2000). There was, however, an effect of test voice, F (1,93) = 62.06, MSe = 14.02, Eta-squared = .40, indicating that word identification benefited from voice familiarity. Test voice also interacted with encoding task, F (1,93) = 11.95, MSe = 14.95, Eta-squared = .11. Although this effect was primarily driven by the elderly adults' data, the interactions involving age and the other factors were either quite small or did not reach significance (age and test voice: F (1,93) = 4.15, MSe = 14.02, Eta-squared = .04, other Fs < 1). These findings indicate that hearing loss accounted for most of the age differences in voice effects uncovered in this experiment (see Appendix, Note 3).
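The hearing-loss covariate defined above (average decrement in pure-tone sensitivity relative to 25 dB) can be sketched as follows. This is only an illustration, not the authors' scoring procedure: the function name, the example thresholds, and the choice to count thresholds at or better than 25 dB as zero decrement are all assumptions, because the text does not say how such thresholds were handled.

```python
from statistics import mean

NORMATIVE_DB = 25  # normative value named in the text (dB)


def hearing_loss(thresholds_db, normative_db=NORMATIVE_DB):
    """Average decrement in pure-tone sensitivity relative to the normative value.

    Assumption (not stated in the article): a threshold at or better than
    the normative value contributes a decrement of zero.
    """
    decrements = [max(0, t - normative_db) for t in thresholds_db]
    return mean(decrements)


# Hypothetical thresholds (dB) at four test frequencies, worse at high frequencies.
example_thresholds = [20, 25, 35, 45]
print(hearing_loss(example_thresholds))
```

An unclipped variant (a plain mean of threshold minus 25) would give negative scores to listeners with better-than-normative hearing; which variant matches the original analysis is not recoverable from the text.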

How can hearing loss explain elderly adults' impaired ability to encode voice information incidentally and subsequently use it in on-line speech perception? Hearing loss, as indexed by pure-tone sensitivity, is a gross measure of the reduced uptake of sensory information in the aged. We have argued earlier that age-related declines in the uptake of sensory information reduce and/or disrupt the input to the word recognition system. We have also suggested that a reduced and/or disrupted input to this system may lead elderly adults to devote cognitive resources to the recovery of the phonetic information lost at the periphery (top–down processing). On the basis of these assumptions, we have proposed that the recovery process, although necessary for the activation of the appropriate lexical units in the word recognition system, may weaken the processing of voice information. Specifically, the clarity-of-enunciation task of the incidental condition requires that subjects evaluate the quality of the phonetic content of the stimulus material, whereas the name–voice association task of the explicit encoding condition focuses subjects' attention on voice characteristics. These different task requirements make word identification essential to the clarity-of-enunciation task, but irrelevant to the name–voice association task. Consequently, in the incidental encoding condition (clarity-of-enunciation task), where cognitive resources are devoted to word identification, it is reasonable to expect the recovery process to weaken the encoding of voice characteristics. The absence of voice effects in elderly adults following incidental encoding instructions supports this hypothesis. In contrast, in the explicit encoding condition, cognitive resources are specifically devoted to extracting voice information from the stimulus material and developing a distinct memory of each voice. Consequently, it is reasonable to expect voice information stored in long-term memory to facilitate the processes involved in extracting phonemic information from the degraded items of the word identification test. The voice effects observed in this condition support this hypothesis.

The overall lower word-identification rate of elderly adults in the explicit encoding condition also indicates that voice familiarity can weaken, but not eliminate, the effects of age-related declines in sensory uptake on word identification. Therefore, it is reasonable to assume that although voice information encoded in long-term memory helps older adults to disambiguate the degraded items of the identification test, it does not entirely compensate for age-related declines in sensory uptake.


    Experiment 2
 
In Experiment 1, older adults displayed evidence of having encoded voice information in long-term memory only under explicit encoding instructions. However, the name–voice association task used in this experiment required subjects not only to focus on voice characteristics, but also to explicitly retain these characteristics in memory. Therefore, one may ask whether the fine-grained discrimination of voice characteristics (i.e., attention to subtle perceptual features of the stimulus material) required by this task could have promoted by itself the encoding of voice information in older adults.

If the encoding of voice information in older adults is dependent upon the type of analyses conducted on speech signals, an incidental encoding task that promotes a fine-grained analysis of speech signals should produce voice effects in older adults. However, if older adults encode perceptual information only under explicit encoding instructions, even a fine-grained analysis of the stimulus material would not produce voice effects in this subject group.

We tested these contrasting hypotheses in Experiment 2 by exposing older adults to two incidental encoding conditions. In one encoding condition (incidental–fine grained, I–FG), a word spoken by two different talkers was presented on any given trial. The participants' task was to decide which of the two instances of the word was spoken more clearly (perceptual comparison task). It was thought that this comparison would promote a fine-grained analysis of the stimulus material (attention to voice information), even though the encoding task did not involve instructions to explicitly remember or attend to voice characteristics. In the other condition (incidental–phonetic, I–P), subjects performed the clarity-of-enunciation task of Experiment 1, in which only one word was presented on any given trial. This condition was used to assess whether the findings of the incidental encoding condition of Experiment 1 could be replicated with another sample of older adults.


    Methods
 
Participants.
Forty-eight older adults (mean age 73.5 years, range 62–86) participated in the experiment. They were community members from the same subject pool used in Experiment 1 and were paid for their participation. Participants self-reported as being in good health for their age, and none wore a hearing aid. They had completed an average of 14.48 years of formal education (SD = 2.20), and their mean vocabulary score was 34.31 (SD = 3.55). Older adults' average pure-tone thresholds are displayed in Fig. 1 (see Appendix, Note 1).

Stimuli and procedure.
The stimuli of this experiment were the words used in Experiment 1. There were two experimental phases: incidental–fine grained (I–FG) and incidental–phonetic (I–P). As in the earlier experiment, each phase, counterbalanced across subjects, involved an encoding task and a perceptual identification test with words masked by noise. In the I–P phase, subjects were exposed to the voices used in the explicit condition of Experiment 1, whereas in the I–FG phase, subjects were exposed to the voices of the incidental encoding condition of Experiment 1. This was done to ensure that the specific voice patterns heard in the explicit encoding condition of Experiment 1 could not be held responsible for the voice effects observed in this experiment.

In the I–FG encoding session, subjects were presented with 10 practice trials and 200 randomly presented trials, each including two instances of the same word, each spoken by a different talker. The subjects' task was to judge the clarity of enunciation of the two instances of a word and select the one that they judged to be spoken more clearly by pressing the key on the computer keyboard that corresponded to their answer. In the I–P encoding session, subjects were presented with 20 practice trials and 400 randomly presented trials, each including a word spoken by one of two talkers. The participants' task was to rate each spoken word on a 7-point scale for clarity of enunciation. Both encoding tasks were followed by the identification test used in Experiment 1, in which half of the words were spoken by one of the familiar talkers and half by an unfamiliar talker.


    Results and Discussion
 
The percentage of words correctly identified in each experimental condition is presented in Fig. 3, which illustrates two main points. First, when the encoding task promoted a fine-grained analysis of the stimulus material of the familiarization session, even with incidental encoding instructions (I–FG condition), older adults identified words pronounced by a familiar talker more accurately than words pronounced by an unfamiliar talker. Second, when the encoding task did not promote such a fine-grained analysis (I–P condition), no evidence of voice encoding was observed in elderly subjects.



Figure 3. Percentage of words correctly identified by older adults and standard errors as a function of encoding task and test voice.

 
These observations were supported by a 2 (familiar vs unfamiliar test voice) x 2 (phonetic vs fine-grained encoding session) repeated measures ANOVA (see Appendix, Note 2). Unless otherwise reported, all statistical tests reported here are significant at the .05 level. This analysis yielded a main effect of voice, F (1,47) = 78.44, MSe = 8.23, Eta-squared = .63, and a significant interaction of voice and encoding task, F (1,47) = 35.96, MSe = 9.05, Eta-squared = .43 (encoding task: F < 1). Tests for simple main effects indicated that the encoding instructions that required a fine-grained analysis of the stimulus material promoted the encoding of voice information in long-term memory. Indeed, following these encoding instructions, elderly adults displayed higher identification scores for words displaying familiar voice patterns (6%, t (47) = 9.92). As in Experiment 1, the encoding instructions that focused older adults' attention on phonetic information did not promote the encoding of voice characteristics in this subject group (–1%, t (47) = 1.88). The finding that the I–FG condition and the explicit encoding condition of Experiment 1 yielded equivalent voice effects in elderly adults (t < 1) supports the conclusion that attention to voice information promotes the encoding of this information in long-term memory (see Appendix, Note 3).
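The voice effect tested above (familiar minus unfamiliar identification score, evaluated against zero with 47 degrees of freedom for 48 participants) can be sketched as a paired comparison. This is a minimal illustration under stated assumptions: the article reports tests for simple main effects without giving the exact procedure, so the paired t statistic here is an assumed stand-in, the function names are hypothetical, and the scores are invented for the example rather than taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def voice_effect(familiar, unfamiliar):
    """Per-participant voice effect: familiar minus unfamiliar score."""
    return [f - u for f, u in zip(familiar, unfamiliar)]


def paired_t(familiar, unfamiliar):
    """Paired t statistic on the voice effect and its degrees of freedom."""
    diffs = voice_effect(familiar, unfamiliar)
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical percentage-correct scores for four participants.
familiar_scores = [60, 55, 62, 58]
unfamiliar_scores = [54, 50, 55, 53]

t_value, df = paired_t(familiar_scores, unfamiliar_scores)
print(f"mean voice effect = {mean(voice_effect(familiar_scores, unfamiliar_scores)):.2f}%")
print(f"t({df}) = {t_value:.2f}")
```

With 48 participants the same function would return 47 degrees of freedom, matching the t (47) values reported in the text.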


    General Discussion
 
The findings of the present study can be summarized in three main points. First, young adults encoded talker-specific characteristics in long-term memory under both explicit and incidental instructions. Second, older adults encoded perceptual information only under instructions that promoted, either incidentally or explicitly, a fine-grained analysis of the speech signals used in the familiarization session. Third, hearing loss accounted for older adults' lower word identification scores and for most of the age differences in voice effects (familiar minus unfamiliar voice) observed in Experiment 1.

Our findings are consistent with the results of word-priming studies (Church and Schacter 1994; Goldinger 1996; Sheffert 1998) reporting that young adults identify words spoken at test in the same voice as at study more accurately than words spoken in a different voice. They are also consistent with the results of voice-priming studies (see Nygaard et al. 1994) reporting that young adults identify novel words spoken by familiar talkers more accurately than words spoken by unknown talkers. Therefore, our findings corroborate the notion that young listeners encode talker-specific characteristics in long-term memory, and then incidentally retrieve these characteristics to identify novel or repeated phonetic patterns that match these characteristics. Interestingly, in our study, the beneficial effects of voice information on young adults' spoken word identification were independent of the encoding task. These findings support the notion that the encoding of voice characteristics is an automatic byproduct of speech perception in young adults.

Our findings also support the notion that aging involves a change not in the ability of older adults to encode voice characteristics, but in their ability to do so spontaneously (i.e., independently of the encoding task). Interestingly, Schacter and coworkers 1994 found in the word-priming paradigm that older adults were insensitive to voice characteristics when the encoding task focused their attention on linguistic information. Similarly, we observed no beneficial effect of voice familiarity on word identification in the voice-priming paradigm when the encoding task did not focus older adults' attention, either directly (e.g., name–voice association task) or incidentally (e.g., perceptual comparison task), on perceptual information. These findings suggest that attention to voice characteristics fostered by the encoding task can promote the long-term encoding of voice information in the aged. Why would attention to voice characteristics be important for the encoding of this information in old age? We have proposed that age-related declines in the uptake of sensory information lead older adults to shift cognitive resources to the recovery of phonetic information, reducing their opportunities for encoding voice characteristics. Consequently, the encoding of voice information in older adults becomes dependent on tasks that promote either explicitly or indirectly the processing of voice characteristics. This account is consistent with the notion that the encoding of perceptual details becomes a cognitively effortful activity in old age (Kausler and Puckett 1981), and provides a reasonable explanation for our finding of task-dependent voice effects in elderly adults.

Interestingly, Yonan and Sommers 2000 found that voice familiarity aided the identification of novel words displaying familiar voice patterns in both young and older adults, with explicit and incidental encoding conditions yielding equivalent voice effects. Their incidental encoding condition, however, was not truly incidental. Indeed, prior to the identification test, subjects were given a voice discrimination test with 160 sentences, half of which were spoken by the familiar talkers of the encoding session. We have proposed earlier that the discrimination task of their study may have given older adults the opportunity to encode voice characteristics in long-term memory, nullifying the effect of the encoding task manipulation. Our finding of task-dependent voice effects in elderly adults supports this account. However, the voice effects reported by Sommers 1999 in the word-priming paradigm are difficult to interpret with respect to this account because there were no encoding conditions promoting voice processing in that study. Interestingly, in the same paradigm, Pilotti and Beyer 2000 found that older adults' voice effects were dependent on attention to voice characteristics. Specifically, older adults exhibited voice effects only when they were familiarized with the talkers' voices, via a name–voice association task, prior to the encoding session, which involved the clarity-of-enunciation task of Experiments 1 and 2. This finding leads us to predict that the voice effects observed by Sommers 1999 would fluctuate with encoding tasks promoting either voice or linguistic processing.

Although our findings indicate that older adults' compromised sensory uptake may affect the encoding of voice information, there was no evidence in our study that it also affected older adults' reliance on this information at test. If the effect of a compromised uptake of sensory information is to lead elderly adults to focus cognitive resources on phonetic/lexical processing, as we have hypothesized, why did elderly adults have no difficulty in processing voice patterns at test? There are two feasible explanations for this finding. First, voice information encoded in long-term memory facilitates phonetic/lexical coding; thus, it becomes quite useful to older adults, for whom word identification under difficult listening conditions is quite problematic (as demonstrated by the lower identification rate of our elderly subjects; see Findlay and Denenberg 1977; Townsend and Bess 1980). Therefore, it is reasonable to expect older adults to rely on voice information encoded in long-term memory for the identification of the degraded signals of the test session. Second, in older adults, attention to voice information via a fine-grained analysis of the stimulus material promotes the processing of this information (as demonstrated by the task-dependent voice effects observed in the elderly subjects). Therefore, it is reasonable to expect the difficulty of the identification test to enhance older adults' attention to the test items, promoting a fine-grained analysis of the stimulus material, and thus the processing of voice information at test.

It should be noted here that the notion of a compromised uptake of sensory information in the aged, which we have proposed to account for the task-dependent voice effects and the lower word-identification rate of the elderly subjects, was based on the finding of age-related hearing loss. Hearing loss, as measured by a pure-tone audiometric examination, however, is simply a gross measure of elderly adults' reduced and/or disrupted uptake of sensory information. Clearly, age-related slowing of processing (see Stine, Wingfield, and Poon 1986; Wingfield, Poon, Lombardi, and Lowe 1985), and declines in frequency, intensity, and temporal discrimination (Florentine et al. 1993; Konig 1957; Moore et al. 1992; Schneider 1997; Schneider and Pichora-Fuller 2000), albeit not measured here, may also compromise the uptake of sensory information in old age. Therefore, it is reasonable to assume that the age-related declines in the encoding of voice information and word identification reported here have multiple sources, and hearing loss is simply one indicator of elderly adults' compromised uptake of sensory information. Moreover, it should be noted that the age differences in overall test performance observed in our investigation were present despite the generally superior vocabulary scores of the older participants, and were attenuated, but not eliminated, by voice familiarity. This finding indicates that the superior verbal abilities of older subjects cannot serve as a compensatory mechanism for age-related declines in spoken word identification.

Interestingly, the findings of this study are consistent with models of implicit memory processes in which both phonetic information and nonlinguistic details are encoded and stored in long-term memory. The pre-semantic perceptual representation system (PRS) proposed by Schacter and Church 1992 is one of these models. The PRS is assumed to be composed of cortically based subsystems devoted to the encoding and storage of the superficial properties of words such as their phonetic and perceptual context. With respect to spoken words, the PRS model postulates that phonetic information and voice characteristics are represented in separate subsystems. Associative connections between these subsystems represent the co-occurrences of phonetic patterns and voice characteristics in the stimulus material. The PRS model accounts for the voice effects on word identification observed in our study by assuming that the voice familiarization session produces a record of each talker's voice in the voice subsystem. At test, novel words spoken by one of the familiar talkers activate voice patterns preserved in memory, facilitating word identification under difficult listening conditions. The PRS model can also account for the finding that older adults' voice effects are task dependent by postulating age-related changes in the operations that govern the voice and the word subsystems. However, because in this model separate operations govern the encoding of voice and phonetic/lexical information, to account for our findings these operations must be assumed to depend on a common pool of cognitive resources. Given this assumption, the PRS is compatible with our proposal that age-related changes in the uptake of sensory information increase the resources that older adults devote to phonetic/lexical processing, weakening the encoding of voice information in long-term memory.

Our findings are also compatible with episodic memory models (Goldinger 1996, Goldinger 1998; Hintzman 1986; see also Tenpenny 1995), which postulate that each encounter with a spoken word creates a memory record including phonetic/lexical information and idiosyncratic perceptual attributes (e.g., voice information). Therefore, in these models, word identification is assumed to depend on a collection of perceptually specific lexical records stored in the word recognition system. Accordingly, the voice familiarization (encoding) session of our study would produce a large number of voice-specific records in which a talker's voice is represented by the collection of records that share the same voice patterns. These models account for the voice effects observed in this study by postulating that, at test, novel words displaying familiar voice patterns activate the voice-specific records of the encoding session, facilitating word identification. To be able to account for the task-dependent voice effects observed in this study's elderly adults, however, these models must assume that memory records are not mere analogues of incoming stimuli, but complex entities defined by both the physical forms of the stimuli and the operations that subjects perform on them (Van Orden and Goldinger 1994). Given this assumption, episodic models can account for elderly adults' task-dependent voice effects by postulating that age-related declines in the uptake of sensory information compromise the input to the word recognition system of elderly adults. We have proposed earlier that a weakened input to this system is likely to engage a recovery process. Therefore, it is reasonable to assume that this recovery process may lead older adults to discard idiosyncratic information in the speech signals. As a result, the memory records generated under encoding instructions that do not focus elderly adults' attention on voice characteristics may be less voice-specific than those of young adults. This account also entertains the notion that, prior to the experiment, the perceptually specific lexical records that constitute the word recognition system of elderly adults may also be less voice-specific, further biasing elderly adults to discard idiosyncratic information in encoding sessions that promote linguistic processing.

In conclusion, our findings encourage researchers not only to refine existing models of implicit memory phenomena to account for age-related changes in perceptual processing, but also to study the specific environmental conditions (instructions) and peripheral factors that produce these changes. Our findings also alert researchers to the role that both peripheral and cognitive factors may play in determining the age-related declines in spoken word identification documented here and in numerous other studies of aging (see Grady et al. 1984; Willott 1991; Working Group on Speech Understanding and Aging 1988).


    Acknowledgments
 
This work was supported by Grant F32 DC00342 from the National Institute on Deafness and Other Communication Disorders. We thank the speakers for recording the stimuli and the people who participated in this study.

Received for publication February 7, 2000. Accepted for publication October 7, 2000.


    Appendix
 
Notes

  1. Age-related declines in pure-tone sensitivity are among the most extensively documented hearing changes associated with old age. The American National Standards Institute defines normal hearing as the ability to hear pure tones at a threshold of 25 dB, and all young adult subjects satisfied this criterion at all selected frequencies. Although there were age differences at all selected frequencies, the older participants in Experiments 1 and 2 exhibited relatively less age-related decline in pure-tone sensitivity than previously reported in larger samples of similarly aged adults (see Grady et al. 1984; Willott 1991). The primary reason for this finding is that older adults who wore hearing aids were excluded from our study. Furthermore, there were no reliable differences in hearing acuity between the older adults of Experiment 1 and the older adults of Experiment 2. Therefore, the data of these two groups were combined in Fig. 1.
  2. It should be noted here that the order in which the encoding phases were presented to participants did not affect the identification scores in the experiments reported in this study (Fs < 1).
  3. One may ask whether differences in hearing loss among the elderly adults could modulate the pattern of voice effects displayed by this age group in Experiments 1 and 2. To address this issue, within-group hearing loss served as the covariate in a 2 (Test voice) x 2 (Encoding instructions) factorial analysis conducted on elderly adults' percentage correct identification scores in each experiment. This analysis preserved the pattern of results reported above. Experiment 1: test voice: F (1,46) = 23.35, MSe = 19.80, Eta-squared = .34; Test voice x Encoding task: F (1,46) = 23.40, MSe = 18.16, Eta-squared = .34; encoding task: F = 2.33, NS. Experiment 2: test voice: F (1,46) = 73.91, MSe = 8.33, Eta-squared = .62; Test voice x Encoding task: F (1,46) = 34.04, MSe = 9.20, Eta-squared = .43; encoding task: F < 1, NS. These findings, compared to those involving young and elderly subjects, indicate that the presence or absence of hearing loss, rather than slight differences in hearing loss within the elderly subject group, is primarily responsible for elderly adults' task-dependent voice effects.
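The effect sizes reported in Note 3 can be checked against their F ratios. The sketch below assumes the values labeled Eta-squared are partial eta-squared, computed as F × df_effect / (F × df_effect + df_error). The article does not state which variant was used, so this is an inference; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported value) taken from Note 3.
reported = [
    ("Exp. 1 test voice", 23.35, 1, 46, 0.34),
    ("Exp. 1 test voice x encoding task", 23.40, 1, 46, 0.34),
    ("Exp. 2 test voice", 73.91, 1, 46, 0.62),
    ("Exp. 2 test voice x encoding task", 34.04, 1, 46, 0.43),
]

for label, f_value, df_effect, df_error, value in reported:
    computed = round(partial_eta_squared(f_value, df_effect, df_error), 2)
    print(f"{label}: computed {computed:.2f}, reported {value:.2f}")
```

Under this assumption each computed value rounds to the figure given in the note, which suggests the reported statistics are internally consistent.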

