Researchers typically begin by deciding how to measure the topic of interest. For example, the first step toward resolving Leah and Sarah’s discussion about remembering grocery items would be to decide how to measure remembering. Gerontologists usually use one of three approaches: observing systematically, using tasks to sample behavior, and asking people for self-reports. In addition, researchers need to be concerned with how representative the participants in the study are of the larger group of people in question.

Regardless of the kind of method chosen, researchers must show it is both reliable and valid. The reliability of a measure is the extent to which it provides a consistent index of the behavior or topic of interest. A measure of memory is reliable to the extent that it gives a consistent estimate of performance each time you administer it. All measures used in gerontological research must be shown to be reliable, or they cannot be used. The validity of a measure is the extent to which it measures what researchers think it measures. For example, a measure of memory is valid only if it can be shown to actually measure memory (and not vocabulary ability, for example). Validity often is established by showing that the measure in question is closely related to another measure known to be valid. Because it is possible to have a measure that is reliable but not valid (a ruler is a reliable measure of length but not a valid measure of memory), researchers must ensure that measures are both reliable and valid.
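One common way to put numbers on these two ideas is with correlations, and the short Python sketch below illustrates that approach under stated assumptions. All of the scores are invented for illustration, the function name `pearson_r` is hypothetical, and a correlation coefficient is only one of several ways researchers might quantify consistency or relatedness. The sketch compares two administrations of a memory task (consistency over time), compares the task with an established memory test (relatedness to a measure assumed to be valid), and then repeats both checks for a ruler measuring arm length.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented data: grocery items recalled by six people on two
# administrations of the same memory task.
first_session = [12, 9, 15, 7, 11, 14]
second_session = [11, 10, 15, 8, 10, 13]

# Invented data: the same six people's scores on a memory test that is
# assumed, for this illustration, to be already known to be valid.
established_test = [30, 24, 36, 20, 27, 33]

# Invented data: arm length in centimeters, measured twice with a ruler.
arm_first = [62.0, 58.5, 60.0, 64.0, 59.0, 61.5]
arm_second = [62.1, 58.4, 60.0, 64.1, 59.0, 61.4]

# Reliability: do repeated administrations give consistent scores?
task_reliability = pearson_r(first_session, second_session)
ruler_reliability = pearson_r(arm_first, arm_second)

# Validity: is the measure closely related to the established memory test?
task_validity = pearson_r(first_session, established_test)
ruler_validity = pearson_r(arm_first, established_test)

print(f"memory task: reliability={task_reliability:.2f}, validity={task_validity:.2f}")
print(f"ruler:       reliability={ruler_reliability:.2f}, validity={ruler_validity:.2f}")
```

With these made-up numbers, both measures are highly consistent across administrations, but only the memory task is strongly related to the established test. The ruler readings are consistent yet only weakly related to memory scores, which mirrors the point that a measure can be reliable without being valid.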

Systematic Observation. As the name implies, systematic observation involves watching people and carefully recording what they say or do. Two forms of systematic observation are common. In naturalistic observation, people are observed as they behave spontaneously in some real-life situation. For example, Leah and Sarah could be observed in the grocery store purchasing their items as a way to test how well they remember.

Structured observations differ from naturalistic observations in that the researcher creates a setting that is particularly likely to elicit the behavior of interest. Structured observations are especially useful for studying behaviors that are difficult to observe naturally. For example, how people react to emergencies is hard to study naturally because emergencies generally are rare and unpredictable events. A researcher could stage an emergency and watch how people react. However, whether the behaviors observed in staged situations are the same as would happen naturally often is hard to determine, making it difficult to generalize from staged settings to the real world.

Sampling Behavior with Tasks. When investigators can’t observe a behavior directly, another popular alternative is to create tasks that are thought to sample the behavior of interest. For example, one way to test older adults’ memory is to give them a grocery list to learn and remember. Likewise, police training includes putting the candidate in a building in which targets pop up that may be either criminals or innocent bystanders. This approach is popular with gerontological researchers because it is so convenient. The main question with this approach is its validity: Does the task provide a realistic sample of the behavior of interest? For example, asking people to learn grocery lists would have good validity to the extent it matched the kinds of lists they actually use.

Self-Reports. The last approach, self-reports, is a special case of using tasks to sample people’s behavior. Self-reports are simply people’s answers to questions about the topic of interest. When questions are posed in written form, the self-report is a questionnaire; when they are posed verbally, it is an interview. Either way, questions are created that probe different aspects of the topic of interest. For example, if you think imagery and lists are common ways people remember grocery items, you could devise a questionnaire and survey several people to find out.

Although self-reports are very convenient and provide information on the topic of interest, they are not always good measures of people’s behavior, because they can be inaccurate. Why? People may not remember accurately what they did in the past, or they may report what they think the researcher wants to hear.

Representative Sampling. Researchers usually are interested in broad groups of people called populations. Examples of populations are all students taking a course on adult development and aging or all Asian American widows. Almost all studies include only a sample of people, which is a subset of the population. Researchers must be careful to ensure that their sample is truly representative of the population of interest. An unrepresentative sample can result in invalid research. For example, what would you think of a study of middle-aged parents if you learned that the sample consisted entirely of two-parent households? You would, quite correctly, decide that this sample is not representative of all middle-aged parents and question whether its results apply to single middle-aged parents.
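The two-parent household example can be turned into a simple check, sketched below in Python. It compares the makeup of a sample with the makeup of the population and flags categories that are over- or underrepresented. The population shares, the sample, the 10-percentage-point threshold, and the function name `composition_gaps` are all assumptions made up for illustration, not figures from any study.

```python
from collections import Counter


def composition_gaps(sample, population_shares):
    """Return (sample share - population share) for each category."""
    if not sample:
        raise ValueError("sample must contain at least one participant")
    counts = Counter(sample)
    n = len(sample)
    return {
        category: counts.get(category, 0) / n - share
        for category, share in population_shares.items()
    }


# Invented population makeup for middle-aged parents (illustration only).
population_shares = {"two-parent": 0.7, "single-parent": 0.3}

# The sample from the example: every participant is from a two-parent household.
sample = ["two-parent"] * 50

gaps = composition_gaps(sample, population_shares)

# Flag any category whose share is off by more than 10 percentage points
# (an arbitrary threshold chosen for this sketch).
flagged = sorted(cat for cat, gap in gaps.items() if abs(gap) > 0.10)

for category, gap in gaps.items():
    print(f"{category}: sample share differs from population by {gap:+.2f}")
print("Poorly represented categories:", flagged)
```

A check like this covers only the characteristics the researcher thought to record. A sample can match the population on household type and still be unrepresentative in other ways, such as education or ethnicity.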

As you read on, you’ll soon discover that most of the research we consider in this text has been conducted on middle-class, well-educated European Americans. Are these samples representative of all people in the United States? In the world? Sometimes, but not always. Be careful not to assume that findings from this group apply to people of other groups. In addition, some developmental issues have not been studied in all ethnic groups and cultures. For example, the U.S. government does not always report statistics for all ethnic groups. To change this, some U.S. government agencies, such as the National Institutes of Health, now require samples to be representative. Thus, in the future we may gain a broader understanding of aging.