The Mind’s Journey from Novice to Expert

By John T. Bruer

Charles is in the seventh grade. He has an IQ of 70 and reads at the third-grade level. He has had several years of remedial reading instruction in a public school, but seems to make little progress. Charles has sufficient decoding skills to read aloud, but has almost no comprehension of what he reads. He is representative of many students who will fail -- students whom our educational system can't reach.

Teachers report that they often see students who are unable to comprehend written language. What is odd is that these children can tell stories and often have no trouble understanding spoken language. This suggests they do have language comprehension skills and at least some background knowledge about the world, but cannot bring these skills and knowledge to bear on written language. Sometimes they can even read aloud, but still have difficulty understanding what they have read. Obviously, standard reading instruction has failed these students in a fundamental way.

On the first day of a new remedial reading program, the teacher asked Charles to read a short passage about reptiles. To see if he understood the passage, the teacher asked him to formulate a question based on the passage, a question that might appear on a test. Although he tried, he couldn't think of a question and gave up. He had not understood and retained enough of what he had just read to frame a question about it.

After 15 days in the new remedial program, the teacher and Charles repeated this exercise. After Charles had read a passage about Antarctic research, he immediately offered the question "Why do scientists come to the South Pole to study?" By this time he also had raised his comprehension scores on the reading passages from 40 to 75 percent, an average performance level for a seventh-grader. On comprehension tests given in his regular classroom, he improved from the 25th to the 78th percentile in social science and from the 5th to the 69th percentile in science. At the end of the 20-day program he had gained 20 months on standardized reading comprehension tests, and he maintained this improvement long after his remedial instruction ended.

Charles was the beneficiary of reciprocal teaching, a method that applies results of cognitive research to reading instruction. To understand what lies behind this method, we have to go back more than three decades to the beginning of what has come to be called the cognitive science revolution.

In 1956, a group of psychologists, linguists, and computer scientists met at the Massachusetts Institute of Technology for a symposium on information science (Gardner 1985). This three-day meeting was the beginning of the cognitive revolution in psychology, a revolution that eventually replaced behaviorist psychology with a science of the mind. In essence, the revolutionaries claimed that human minds and computers are sufficiently similar that a single theory -- the theory of computation -- could guide research in both psychology and computer science. "The basic point of view inhabiting our work," wrote two of the participants, "has been that the programmed computer and human problem solver are both species belonging to the genus IPS" (Newell and Simon 1972, p. 870). Both are species of the genus information-processing system; both are devices that process symbols.

That scientific revolution became a movement, and eventually a discipline, called cognitive science. Cognitive scientists study how our minds work -- how we think, remember, and learn. Their studies have profound implications for restructuring schools and improving learning environments. Cognitive science -- the science of mind -- can give us an applied science of learning and instruction. Teaching methods based on this research -- methods that result in some sixth-graders' having a better understanding of Newtonian physics than most high school students, or that, as recounted above, help remedial students raise their reading comprehension scores four grade levels after 20 days of instruction -- are the educational equivalents of polio vaccine and penicillin. Yet few outside the educational research community are aware of these breakthroughs or understand the research that makes them possible.

Certainly cognitive science, or even educational research in general, isn't the sole answer to all our educational problems. Yet it has to be part of any attempt to improve educational practice and to restructure our schools. The science of mind can guide educational practice in much the same way that biology guides medical practice. There is more to medicine than biology, but basic medical science drives progress and helps doctors make decisions that promote their patients' physical well-being. Similarly, there is more to education than cognition, but cognitive science can drive progress and help teachers make decisions that promote their students' educational well-being.

In the years following the MIT symposium, cognitive scientists worked to exploit the similarities between thinking and information processing. Allen Newell and Herbert Simon developed the first working artificial intelligence computer program, called the Logic Theorist. It could prove logical theorems using methods a human expert might use. Besides logic, Newell and Simon studied problem solving in other areas, ranging from tic-tac-toe to arithmetic puzzles to chess. Problem solving in each of these areas depends on learning facts, skills, and strategies that are unique to the area. As cognitive scientists say, expertise in each area requires mastery of a distinct knowledge domain. Cognitive research began to have relevance for education as scientists gradually started to study knowledge domains that are included in school instruction -- math, science, reading, and writing.

In their 1972 book Human Problem Solving, Newell and Simon summarized the results of this early research program and established a theoretical outlook and research methods that would guide much of the work that now has educational significance. Newell and Simon argued that if we want to understand learning in a domain, we have to start with a detailed analysis of how people solve problems in that domain. The first step is to try to discover the mental processes, or programs, that individuals use to solve a problem. To do this, cognitive scientists give a person a problem and observe everything the subject does and says while attempting a solution. Newell and Simon prompted their subjects to "think aloud" -- to say everything that passed through their minds as they worked on the problems. Cognitive psychologists call these think-aloud data "protocols." Analysis of the protocols allows cognitive scientists to form hypotheses about what program an individual uses to solve a problem. Cognitive scientists can test these hypotheses by writing computer programs based on them to simulate the subject's problem-solving performance. If the scientists' analysis is correct, the computer simulation should perform the same way the human did on the problem. If the simulation fails, the scientists revise their hypotheses accordingly and try again. After studying and simulating performances from a variety of subjects, Newell and Simon could trace individual differences in problem-solving performance to specific differences in the mental programs the subjects used.

To be sure they could find clear-cut differences among individual programs, Newell and Simon initially compared the problem-solving performances of experts and novices -- which were almost certain to be different -- in a variety of domains. In such studies (now a mainstay of the discipline), cognitive scientists consider any individual who is highly skilled or knowledgeable in a given domain to be an "expert" in that domain. The domains can be ordinary and commonplace; they don't have to be arcane and esoteric. In the cognitive scientists' sense of the word, there are experts at tic-tac-toe, third-grade arithmetic, and high school physics. Comparing experts with novices makes it possible to specify how experts and novices differ in understanding, storing, recalling, and manipulating knowledge during problem solving.

Of course Newell and Simon knew that experts in a domain would be better at solving problems in that domain than novices, but it was not always obvious how experts and novices actually differed in their problem-solving behavior. In one early expert-novice study, Simon and Chase (1973) looked at chess players. One thing we do when playing chess is to choose our next move by trying to anticipate what our opponent's countermove might be, how we might respond to that move, how the opponent might counter, and so on. That is, we try to plan several moves ahead. One might think that experts and novices differ in how far ahead they plan: a novice might look ahead two or three moves, an expert ten or twelve. Surprisingly, Simon and Chase found that experts and novices both look ahead only two or three moves. The difference is that experts consider and choose from among vastly superior moves. When expert chess players look at a board, they see configurations and familiar patterns of pieces; they see "chunks" of relevant information. Novices, in contrast, see individual pieces. The experts' more effective, more information-rich chunks allow them to see superior possible moves and choose the best of these. Chunking, rather than planning farther ahead, accounts for the experts' superiority. Experts process more and better information about the next few moves than novices.

Newell and Simon's emphasis on problem-solving performance and expert-novice differences was a first step toward a new understanding of learning. In short, learning is the process by which novices become experts. As one learns chess, math, or physics, one's problem-solving performance in the domain improves as the programs one uses to solve problems improve. If we know what programs a person first uses to solve problems in a domain, and if we can compare them with the programs the person eventually constructs, we have a measure and a description of what the person learned. We can study learning by tracing changes in the mental processes students use as they progress from novice to higher levels of proficiency. If we have detailed knowledge of these processes, such as the computer simulations give us, we can know not only that learning has occurred but also how it has occurred.

Other investigators joined in the program that Newell and Simon had outlined, and the research developed and expanded along two dimensions.

First, the kinds of problems and tasks the scientists studied became more complex. To play games and solve puzzles, even in logic and chess, one has to know a few rules, but one doesn't need much factual knowledge about the world. As cognitive scientists honed their methods on puzzle problems and accumulated insights into how people solve them, they became more ambitious and began applying their methods to more knowledge-rich domains. They started to study problem solving in physics, mathematics, and medical diagnosis. They began to study language skills, such as reading and writing, and how students use these skills to acquire more knowledge. Extending their research into these domains made it applicable to understanding expert and novice performance in school subjects.

Second, the research evolved from merely comparing novices against experts to studying the process by which novices become experts. Psychologists began to develop intermediate models of problem-solving performance in a variety of domains. The intermediate models describe how domain expertise develops over time and with experience. If learning is the process by which novices become experts, a sequence of intermediate models in a domain traces the learning process in that domain. The intermediate models describe the stages through which students progress in school.

By the mid-1970s, cognitive scientists were studying school tasks over a range of competencies -- from novice to expert, from preschool through college. In many subject areas, our knowledge of students' cognitive processes is now sufficiently detailed that we can begin to describe their performance at every level of competence, from novice to expert. We can describe the normal trajectory of learning in these subject areas. If we understand the mental processes that underlie expert performance in school subjects, we can ask and answer other questions that are important for education. How do students acquire these processes? Do certain instructional methods help students acquire these processes more quickly or more easily? Can we help students learn better? Answers to these questions can guide educational practice and school reform. For example, research in science learning shows that novices -- and all beginning students are novices -- hold naive theories about how the physical world works. These theories so influence how the students interpret school instruction that the instruction is often ineffective. Curricula based on cognitive research that build from and correct these naive theories can overcome this problem.

Later in this article, we will see how researchers and teachers are applying the new learning theory to create classroom environments in which students are successfully moving along the path from novice to expert. But first let's look at how cognitive scientists work and how their results can contribute to better instructional methods.

II. Balance-Scale Problems: A Classic Study of Novice-to-Expert Performance

Research on how children learn to solve balance-scale problems illustrates the main ideas, methods, and instructional applications of cognitive science.

Try to solve the balance-scale problem shown in Figure 1. Assume the scale's arm is locked so that it can't rotate around the fulcrum. If I were to unlock the arm, what would happen? Would the scale tip left, tip right, or balance?

This is a tricky problem. Rule IV in Figure 1 gives a set of rules one might use to solve it. Each rule has an IF clause that states the conditions under which the rule is applicable and a THEN clause that states what to do under those conditions. To use these rules, find the rule whose conditions fit the pattern of weights and distances in the problem. You find that P4 is the only rule whose IF clause fits the problem. Its THEN clause tells you to compute torques for each side; that is, for each side, multiply the number of weights by their distance from the fulcrum. Doing that gives t1 = 5 × 3 = 15 for the left side and t2 = 4 × 4 = 16 for the right. These new data satisfy the condition for P7; executing its THEN clause gives the correct answer, "Right side down." Some readers might remember the THEN clause in P4 from high school physics as a version of the law of torques: Multiply weight by distance on each arm to find the torque, or rotational force; the side with the larger torque goes down. This simple law solves all balance-scale problems.
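The torque computation just described can be written out as a minimal sketch. The function name and the outcome labels are illustrative choices, not taken from Figure 1; the numbers are the ones given in the text (five weights at distance 3 on the left, four weights at distance 4 on the right).

```python
def predict_by_torque(left_weight, left_distance, right_weight, right_distance):
    """Law of torques: the side with the larger weight * distance goes down."""
    left_torque = left_weight * left_distance
    right_torque = right_weight * right_distance
    if left_torque > right_torque:
        return "left down"
    if right_torque > left_torque:
        return "right down"
    return "balance"


# The problem discussed in the text: t1 = 5 * 3 = 15, t2 = 4 * 4 = 16.
figure_1_answer = predict_by_torque(5, 3, 4, 4)
print(figure_1_answer)  # right down
```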

The set of rules is an English-language version of a computer program for solving balance-scale problems. It takes as input data about the weight on each side of the scale and the distance of the weight from the fulcrum. The output is the answer for a balance-scale problem: tip left, tip right, or balance. The program is a series of IF-THEN rules. Computer scientists call the IF clauses conditions, the THEN clauses actions, and the entire IF-THEN statement a production rule. They call computer programs written using only production rules production systems. Computing devices that execute production systems efficiently have a specific internal structure (or architecture, as computer scientists say).
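To make the vocabulary of conditions, actions, and production rules concrete, here is a minimal sketch of a production-system interpreter. It is an illustration under stated assumptions, not Siegler's program: Figure 1 is not reproduced here, so the rule contents, the working-memory keys (wl, dl, wr, dr), and all function names are hypothetical. The sketch covers only the conflict case discussed above, where one rule computes torques and a second rule reads the answer off them.

```python
def run_production_system(rules, memory, max_cycles=10):
    """On each cycle, fire the first rule whose condition matches memory."""
    for _ in range(max_cycles):
        if "answer" in memory:
            return memory["answer"]
        for _name, condition, action in rules:
            if condition(memory):
                action(memory)
                break
        else:
            return None  # no rule's condition matched
    return memory.get("answer")


def is_conflict(m):
    # One side has more weight; the other has its weight farther out.
    return (m["wl"] != m["wr"] and m["dl"] != m["dr"]
            and (m["wl"] > m["wr"]) != (m["dl"] > m["dr"]))


def has_torques(m):
    return "torque_left" in m


def compute_torques(m):
    m["torque_left"] = m["wl"] * m["dl"]
    m["torque_right"] = m["wr"] * m["dr"]


def say(answer):
    def action(m):
        m["answer"] = answer
    return action


# Each production rule is (name, condition, action). Contents are assumed.
RULES = [
    ("compute torques",
     lambda m: not has_torques(m) and is_conflict(m), compute_torques),
    ("left down",
     lambda m: has_torques(m) and m["torque_left"] > m["torque_right"],
     say("left down")),
    ("right down",
     lambda m: has_torques(m) and m["torque_right"] > m["torque_left"],
     say("right down")),
    ("balance",
     lambda m: has_torques(m) and m["torque_left"] == m["torque_right"],
     say("balance")),
]

memory = {"wl": 5, "dl": 3, "wr": 4, "dr": 4}
result = run_production_system(RULES, memory)
print(result)  # right down
```

Two rules fire in sequence here, mirroring the way the new torque data produced by one rule satisfy the condition of the next. A non-conflict problem matches no rule in this reduced set, so the interpreter returns None for it.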

Cognitive scientists claim that the human mind can be described as a computing device that builds and executes production-system programs. In fact, rule IV is a production system an expert would use to solve balance-scale problems. Robert Siegler, a cognitive psychologist, showed that production systems can simulate human performance on such problems (Siegler 1976; Klahr and Siegler 1978; Siegler and Klahr 1982). He also showed that a series of increasingly complex production systems can model the way in which children gradually develop expertise on balance-scale problems from ages 5 through 17. Children learn, says Siegler, by adding better rules to their production systems. Proper instruction, he goes on to show, can help children acquire these better rules.

The beauty of the balance-scale task for developmental psychology is that it is complex enough to be interesting but simple enough for exhaustive task analysis. Two variables are relevant: the amount of weight on each arm and the distance of the weight from the fulcrum. There are three discrete outcomes: tip left, tip right, and balance. There is a simple law, the law of torques, that solves all balance-scale problems, though few of us discover this law on our own. If weight and distance are the only two relevant variables and if the scale either tips or balances, there are only six possible kinds of balance-scale problems:

• balance problems -- equal weight on each side and the weights at equal distances from the fulcrum;

• weight problems -- unequal weight on each side and the weights at equal distances from the fulcrum;

• distance problems -- equal weight on each side and the weights at unequal distances from the fulcrum;

• conflict-weight -- one side has more weight, the other side has its weight at a greater distance from the fulcrum, and the side with greater weight goes down;

• conflict-distance -- one side has more weight, the other side has its weight at a greater distance from the fulcrum, and the side with greater distance goes down;

• conflict-balance -- one side has more weight, the other side has its weight at a greater distance from the fulcrum, and the scale balances.

Siegler called the last three types "conflict" problems, because when one side has more weight but the other side has its weight farther from the fulcrum one can have conflicting intuitions about which variable dominates. (The problem illustrated in Figure 1 is a conflict-distance problem: there is more weight on the left side, the weight is farther from the fulcrum on the right side, and the right side goes down.)

These six possibilities cover all possible cases for how weight and distance influence the action of the scale. The six cases provide a complete theory, or task analysis, of the balance scale. Notice that the six problem types place varying demands on the solver. For a balance problem or a weight problem, a solver need only consider weight. For the conflict problems, a solver has to pay attention to weight, distance, and the ways in which weight and distance interact.
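The task analysis can be sketched as a small classifier that sorts any configuration into one of the six problem types. The function name and argument order are illustrative assumptions. One configuration falls outside the six listed types, namely the same side having both more weight and greater distance; the sketch labels it separately rather than forcing it into a category.

```python
def classify_problem(wl, dl, wr, dr):
    """Classify a problem given left/right weights (wl, wr) and distances (dl, dr)."""
    if wl == wr and dl == dr:
        return "balance"
    if dl == dr:
        return "weight"
    if wl == wr:
        return "distance"
    if (wl > wr) == (dl > dr):
        # Same side has more weight and more distance: not one of the six types.
        return "outside the six types"
    # Conflict: compare torques to see which variable dominates.
    left_torque, right_torque = wl * dl, wr * dr
    if left_torque == right_torque:
        return "conflict-balance"
    heavier_side_is_left = wl > wr
    left_goes_down = left_torque > right_torque
    if heavier_side_is_left == left_goes_down:
        return "conflict-weight"
    return "conflict-distance"


# The Figure 1 problem as described in the text.
print(classify_problem(5, 3, 4, 4))  # conflict-distance
```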

Siegler formulated some psychological hypotheses about how people might solve balance-scale problems. Using the information from the task analysis, he could test his hypotheses by giving subjects problems and observing their performance. Siegler called his hypotheses "rules" and formulated them as four production-system programs (see Figure 1).

The rules make different assumptions about how and when people use weight or distance information to solve the problems. Rule I considers only weight. Rule II considers distance, but only when the weights on the two sides are equal (P3). Rule III attempts to integrate weight and distance information (P4 and P5). Rule IV introduces the law of torques (P4) when one side has more weight but less distance.
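The four rules can be sketched as four small functions, following the prose description above rather than Figure 1, which is not reproduced here. The function names are hypothetical, and returning "uncertain" for rule III on conflict problems is an assumption standing in for the lack of a consistent method when weight and distance disagree.

```python
def side_with_more(left, right):
    """Outcome if the side with the larger value goes down."""
    if left > right:
        return "left down"
    if right > left:
        return "right down"
    return "balance"


def rule_1(wl, dl, wr, dr):
    # Rule I: consider weight only.
    return side_with_more(wl, wr)


def rule_2(wl, dl, wr, dr):
    # Rule II: consider distance, but only when the weights are equal.
    if wl != wr:
        return side_with_more(wl, wr)
    return side_with_more(dl, dr)


def rule_3(wl, dl, wr, dr):
    # Rule III: use both cues; no consistent answer when they conflict.
    by_weight = side_with_more(wl, wr)
    by_distance = side_with_more(dl, dr)
    if by_weight == "balance":
        return by_distance
    if by_distance == "balance" or by_distance == by_weight:
        return by_weight
    return "uncertain"  # assumed placeholder for guessing


def rule_4(wl, dl, wr, dr):
    # Rule IV: like rule III, but resolve conflicts with the law of torques.
    answer = rule_3(wl, dl, wr, dr)
    if answer == "uncertain":
        return side_with_more(wl * dl, wr * dr)
    return answer


# Predictions of each rule on the Figure 1 problem described in the text.
predictions = [rule(5, 3, 4, 4) for rule in (rule_1, rule_2, rule_3, rule_4)]
print(predictions)  # ['left down', 'left down', 'uncertain', 'right down']
```

On this conflict problem only rule IV gives the correct answer, which is why such problems separate the rules so cleanly.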

If children use Siegler's rules, then the pattern of a child's responses to a set of balance-scale problems that contains all six types will reveal what rule that child uses. Children's responses will tell us what they know about the balance-scale task, including what information -- weight, distance, or both -- they use to solve the problem. Siegler tested his hypotheses and predictions by giving a battery of 30 balance-scale problems (five of each of the six kinds) to a group of 40 children that included equal numbers of 5-year-olds, 9-year-olds, 13-year-olds, and 17-year-olds. He showed each child a balance scale that had weights placed on it and asked the child to predict what the scale would do. As soon as the child made a prediction, Siegler rearranged the weights for the next problem. He did not let the children see if their predictions were correct, because he wanted to find out what they knew initially. He wanted to avoid giving the students feedback on their performance so he could be sure they weren't learning about the task during the experiment. He wanted to look at their learning, but only after he assessed their initial understanding.
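The idea that a response pattern reveals the rule in use can be sketched as follows. The expected proportions are derived from the rule descriptions in the text: a rule I user, for example, gets distance problems wrong but gets conflict-weight problems right by attending to weight alone. Treating rule III as guessing among three outcomes on conflict problems, and matching by nearest pattern, are assumptions made for illustration; this is not Siegler's actual scoring criterion.

```python
PROBLEM_TYPES = ["balance", "weight", "distance",
                 "conflict-weight", "conflict-distance", "conflict-balance"]

# Expected proportion correct on each problem type, in PROBLEM_TYPES order.
EXPECTED = {
    "I":   [1, 1, 0, 1, 0, 0],
    "II":  [1, 1, 1, 1, 0, 0],
    "III": [1, 1, 1, 1 / 3, 1 / 3, 1 / 3],  # assumed: chance on conflict problems
    "IV":  [1, 1, 1, 1, 1, 1],
}


def diagnose_rule(correct_counts, per_type=5):
    """Return the rule whose expected pattern is closest to the observed one."""
    observed = [count / per_type for count in correct_counts]

    def squared_distance(rule):
        return sum((o - e) ** 2 for o, e in zip(observed, EXPECTED[rule]))

    return min(EXPECTED, key=squared_distance)


# A hypothetical child: perfect on simple problems, near chance on conflict ones.
diagnosis = diagnose_rule([5, 5, 5, 2, 1, 2])
print(diagnosis)  # III
```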

The children's performance confirmed Siegler's hypotheses. Ninety percent of them made predictions that followed the pattern associated with one of the four rules. There was also a strong developmental trend. The 5-year-olds most often used rule I. The 9-year-olds used rule II or rule III. The 13- and 17-year-olds used rule III. Only two children, a 9-year-old and a 17-year-old, used rule IV.

Taken together, Siegler's four rules constitute a developmental theory that explains development in terms of changes in children's knowledge structures and how they initially encode, or as cognitive psychologists say, represent problems. By age 5, most children are using rule I. By age 13, almost all are using rule III. Few children spontaneously progress to rule IV, the expert-level rule for the balance scale. Thus, the rules chart a course of normal development on the task, from novice to expert performance.

Siegler's rules also tell us what cognitive changes underlie the transition from novice to expert. On tasks like the balance scale, children progress through a series of partial understandings that gradually approach mastery. Performance improves, or learning occurs, when children add more effective production rules to the theories they have stored in their long-term memories. If we know what the developmental stages are and how they differ at the level of detail provided by a cognitive theory, we ought to be able to design instruction to help children advance from one stage to the next.

To investigate how children learn about the balance scale, Siegler conducted a training study. Working with 5-year-olds and 8-year-olds, all of whom used rule I, he had each child make predictions for 16 problems. After each prediction, Siegler released the lock on the balance scale and let the child see if his or her prediction was correct. This feedback experience gave the children an opportunity to learn about the balance scale. Two days later, the children had a retest with no feedback to see if they had learned anything from the training.

In this experiment, there were three training groups. One group of 5- and 8-year-olds served as a control group. Their training session consisted only of balance and weight problems-problems they could solve using rule I. A second group had training on distance problems, where rule II, but not rule I, would work. A third group had training on conflict problems, which require at least rule III for performance even at chance levels. With this training, would the children learn anything? Would they progress from rule I to a more advanced rule?

As expected, the children in the control group made no progress. They learned nothing from training on problems they already knew how to solve. The children in the second group, both 5- and 8-year-olds trained on distance problems, did learn something. Feedback from 16 problems was enough for these children to advance from rule I to rule II. The surprise came with the third group, the children who had training on conflict problems. The 8-year-olds in this group advanced two levels in their mastery of the balance scale, from rule I to rule III. The 5-year-olds in this group either stayed at rule I or became so confused and erratic that it appeared they were no longer using a rule.

To find out why the 8-year-olds learned and the 5-year-olds didn't, Siegler and his collaborators selected several children between 5 and 10 years old for in-depth study (Klahr and Siegler 1978). Each child had a training session with the balance scale that included conflict problems. In the training session the child was asked to make a prediction for each problem and to state his or her reasons for the prediction. The experimenter then unlocked the scale's arm and the child observed the result. If the prediction was not borne out, the experimenter asked "Why do you think that happened?" The researchers videotaped the entire session with each child and transcribed all the children's verbal responses, which provided data for protocol analysis.

Lisa, a typical 5-year-old, took 30 minutes to do 16 problems. Protocols like Lisa's suggested that the younger children were not encoding or representing distance in their initial interpretations of balance-scale problems. For example, when Lisa was given a distance problem (on the left side, one weight on peg 3; on the right side, one weight on peg 1), she predicted the scale would balance -- "They would both stay up," she said. Asked why she thought this, she answered "'Cause they are both the same." When she saw the left side tip down, she was genuinely puzzled: "Well, why are they both the same thing and one's up and one's down?" Lisa did not see any difference between the two sides. She was not including distance information in her initial representation of the problem. She simply did not notice and encode distance information.

An 8-year-old's protocol gave very different data. Jan was given a conflict-distance problem: on the left side, three weights on the first peg; on the right side, two weights on the third peg. She predicted incorrectly that the left side would go down. When shown what really happens (right side down) and asked for an explanation, she gave one involving both weight and distance. For her, pegs 1 and 2 on each side were "near" the fulcrum and pegs 3 and 4 were "far" from the fulcrum. She stated a rule: "If far pegs have weights, then that side will go down." She then pointed out that in this problem the far pegs on the right side had weights but the far pegs on the left had none, so the right side would go down. Jan's is not a perfect explanation, nor is her rule always true. Her protocol shows, though, that she, unlike Lisa, had noticed and encoded both weight and distance information in her representation of the problem.

On the basis of the protocols, the difference between 5-year-olds and 8-year-olds seemed to be that the younger children saw the problems in terms of weight only, whereas the older children could see the problems in terms of weight at a distance from the fulcrum. If the younger children were not encoding distance, they could not learn from training on conflict problems that differences in distance sometimes overcome differences in weight. They could not develop the concepts or -- similar to the chess expert -- build the chunks they needed for the conditions of P4 and P5 in rule III. On the other hand, the older children, even if they were using rule I, appeared to encode distance. They could learn from training on conflict problems how to use that information to build new productions and progress to rule III.

Can 5-year-olds learn to encode both weight and distance, or is it beyond their level of cognitive development? Siegler found that giving 5-year-olds more time to study the configurations or giving them more explicit instructions ("See how the weights are on the pegs? See how many are on each side and how far they are from the center on each side?") made no difference in their ability to reproduce the configurations from memory.

Only one intervention seemed to work. The 5-year-olds had to be told explicitly what to encode and how to encode it. The instructor had to tell them what was important and teach them a strategy for remembering it. The instructor taught the children to count the disks on the left side, count the pegs on the left side, and then rehearse the result (i.e., say aloud "three weights on peg 4"); to repeat this process for the right side; and then to rehearse both results together ("three weights on peg 4 and two weights on peg 3"). The instructor then told the children to try to reproduce the pattern their statement described. The instructor guided each child through this strategy on seven problems. With each problem, the children took more responsibility for executing the strategy.

After this training, the 5-year-olds' performance on reconstructing distance information from memory improved. They now correctly reproduced weight information 52 percent of the time, and distance information 51 percent of the time. Although they now apparently encoded the information, they, like the 8-year-old rule I users, did not spontaneously start using it. They continued to use rule I. However, when these 5-year-olds were given training on conflict problems, they too progressed from rule I to rule III. They had to be taught explicitly what representation, or encoding, to use in order to learn from the training experience.

The results of this study exemplify features of learning that are common to almost all school subjects. Students learn by modifying long-term memory structures, here called production systems. They modify their structures when they encounter problems their current rules can't solve. Some children modify their structures spontaneously; that is how children normally develop through Siegler's four rules. But by giving appropriate training we can facilitate children's development. For some children, presenting anomalous problems is enough. Like the 8-year-olds confronted with conflict problems, some children can build better rules when challenged with hard problems. Other children can't. Some children have inadequate initial representations of the problem. Children have to notice the information they need and encode it if they are to build better rules.

Students who can't learn spontaneously from new experiences need direct instruction about the relevant facts and about the strategies to use. Teaching just facts or teaching strategies in isolation from the facts won't work. To know when and how to intervene, we have to understand, in some detail, what stages children pass through on their mental journeys from novice to expert. Cognitive science tells us how we can then help children progress from relative naivete through a series of partial understandings to eventual subject mastery.

The difficulties children have in learning about the balance scale are highly similar to the difficulties they encounter in learning mathematics, science, and literacy skills. The tasks, representations, and production systems will become more complex -- the progression from novice to expert can't be captured by four rules in every domain. However, our innate cognitive architecture remains the same no matter what domain we try to master, and the methods of cognitive science yield detailed information about how we think and learn. The lessons learned on the simple balance scale apply across the curriculum.

III. What Does Expertise Consist Of?

Imagine that a small, peaceful country is being threatened by a large, belligerent neighbor. The small country is unprepared historically, temperamentally, and militarily to defend itself; however, it has among its citizens the world's reigning chess champion. The prime minister decides that his country's only chance is to outwit its aggressive neighbor. Reasoning that the chess champion is a formidable strategic thinker and a deft tactician -- a highly intelligent, highly skilled problem solver -- the prime minister asks him to assume responsibility for defending the country. Can the chess champion save his country from invasion?

This scenario is not a plot from a Franz Lehar operetta, but a thought experiment devised by David Perkins and Gavriel Salomon (1989). As they point out, our predictions about the chess champion's performance as national security chief depend on what we believe intelligence and expertise are. If the goal of education is to develop our children into intelligent subject-matter experts, our predictions about the chess champion, based on what we believe about intelligence and expertise, have implications for what we should do in our schools.

Since the mid-1950s cognitive science has contributed to the formulation and evolution of theories of intelligence, and so to our understanding of what causes skilled cognitive performance and what should be taught in schools. In this section, we will review how our understanding of intelligence and expertise has evolved over the past two decades and see how these theories have influenced educational policy and practice.

Four theories will figure in this story.

The oldest theory maintains that a student builds up his or her intellect by mastering formal disciplines, such as Latin, Greek, logic, and maybe chess. These subjects build minds as barbells build muscles. On this theory the chess champion might succeed in the national security field. If this theory is correct, these formal disciplines should figure centrally in school instruction.

At the turn of the twentieth century, when Edward Thorndike did his work, this was the prevailing view. Thorndike, however, noted that no one had presented scientific evidence to support this view. Thorndike reasoned that if learning Latin strengthens general mental functioning, then students who had learned Latin should be able to learn other subjects more quickly. He found no evidence of this. Having learned one formal discipline did not result in more efficient learning in other domains. Mental "strength" in one domain didn't transfer to mental strength in others. Thorndike's results contributed to the demise of this ancient theory of intelligence and to a decline in the teaching of formal disciplines as mental calisthenics.

In the early years of the cognitive revolution, it appeared that general skills and reasoning abilities might be at the heart of human intelligence and skilled performance. If this is so, again the chess champion might succeed, and schools should teach these general thinking and problem-solving skills -- maybe even in separate critical-thinking and study-skills classes.

But by the mid-1970s, cognitive research suggested that general domain-independent skills couldn't adequately account for human expertise. Research shows that either the teaching of traditional study skills has no impact on learning or else the skills fail to transfer from the learning context to other situations. Either way, teaching these general skills is not the path to expertise and enhanced academic performance.

A wide variety of books and commercially available courses attempt to teach general cognitive and thinking skills. (For reviews and evaluations see Nickerson et al. 1985, Segal et al. 1985, and Chipman et al. 1985.) Analysis and evaluation of these programs again fail to support the belief that the teaching of general skills enhances students' overall performance.

Most of these programs teach general skills in standalone courses, separate from subject-matter instruction. The assumption is that students would find it too difficult to learn how to think and to learn subject content simultaneously. Like the early artificial intelligence and cognitive science that inspire them, the courses contain many formal problems, logical puzzles, and games. The assumption is that the general methods that work on these problems will work on problems in all subject domains.

A few of these programs, such as the Productive Thinking Program (Covington 1985) and Instrumental Enrichment (Feuerstein et al. 1985), have undergone extensive evaluation. The evaluations consistently report that students improve on problems like those contained in the course materials but show only limited improvement on novel problems or problems unlike those in the materials (Mansfield et al. 1978; Savell et al. 1986). The programs provide extensive practice on the specific kinds of problems that their designers want children to master. Children do improve on those problems, but this is different from developing general cognitive skills. After reviewing the effectiveness of several thinking-skills programs, one group of psychologists concluded that "there is no strong evidence that students in any of these thinking-skills programs improved in tasks that were dissimilar to those already explicitly practiced" (Bransford et al. 1985, p. 202). Students in the programs don't become more intelligent generally; the general problem-solving and thinking skills they learn do not transfer to novel problems. Rather, the programs help students become experts in the domain of puzzle problems.

Researchers then began to think that the key to intelligence in a domain was extensive experience with and knowledge about that domain.

One of the most influential experiments supporting this theory was William Chase and Herb Simon's (1973) study of novice and expert chess players, which followed on earlier work by A.D. De Groot (1965). Chase and Simon showed positions from actual chess games to subjects for 5 to 10 seconds and asked the subjects to reproduce the positions from memory. Each position contained 25 chess pieces. Expert players could accurately place 90 percent of the pieces, novices only 20 percent. Chase and Simon then had the subjects repeat the experiment, but this time the "positions" consisted of 25 pieces placed randomly on the board. These were generally not positions that would occur in an actual game. The experts were no better than the novices at reproducing the random positions: both experts and novices could place only five or six pieces correctly.

Other researchers replicated the Chase-Simon experiment in a variety of domains, using children, college students, and adults. The results were always the same: Experts had better memories for items in their area of expertise, but not for items in general. This shows, first, that mastering a mentally demanding game does not improve mental strength in general. The improved memory performance is domain specific. Chess isn't analogous to a barbell for the mind. Second, it shows that if memory strategies account for the expert's improved memory capacity, the strategies aren't general strategies applicable across all problem-solving domains. Chess experts have better memories for genuine chess positions, but not for random patterns of chess pieces or for strings of words or digits. Thus, experts aren't using some general memory strategy that transfers from chess positions to random patterns of pieces or to digit strings.

From long experience at the game, chess experts have developed an extensive knowledge base of perceptual patterns, or chunks. Cognitive scientists estimate that chess experts learn about 50,000 chunks, and that it takes about 10 years to learn them. Chunking explains the difference between novice and expert performance. When doing this task, novices see the chessboard in terms of individual pieces. They can store only the positions of five or six pieces in their short-term, or working, memory -- numbers close to what research has shown our working memory spans to be. Experts see "chunks," or patterns, of several pieces. If each chunk contains four or five pieces and if the expert can hold five such chunks in working memory, then the expert can reproduce accurately the positions of 20 to 25 individual pieces. Chase and Simon even found that when experts reproduced the positions on the board, they did it in chunks. They rapidly placed four or five pieces, then paused before reproducing the next chunk.
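
The arithmetic behind the chunking account can be made explicit with a small sketch. It is only an illustration of the figures quoted above, not a model from Chase and Simon; the function name recall_capacity is hypothetical, and treating a novice's "chunk" as a single piece is an assumption made for the comparison.

```python
def recall_capacity(chunks_held, pieces_per_chunk):
    """Pieces a player could reproduce if working memory holds
    `chunks_held` chunks of `pieces_per_chunk` pieces each."""
    return chunks_held * pieces_per_chunk


# Novice (assumption: each "chunk" is one piece): five or six items held.
novice_range = (recall_capacity(5, 1), recall_capacity(6, 1))

# Expert: five chunks of four or five pieces each, as in the text.
expert_range = (recall_capacity(5, 4), recall_capacity(5, 5))

print("novice:", novice_range)
print("expert:", expert_range)
```

The same working-memory span of about five items yields very different recall depending on how much each item contains, which is the point of the chunking explanation.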

Expertise, these studies suggest, depends on highly organized, domain-specific knowledge that can arise only after extensive experience and practice in the domain. Strategies can help us process knowledge, but first we have to have the knowledge to process. This suggested that our chess expert might be doomed to failure, and that schools should teach the knowledge, skills, and representations needed to solve problems within specific domains.

In the early 1980s researchers turned their attention to other apparent features of expert performance. They noticed that there were intelligent novices -- people who learned new fields and solved novel problems more expertly than most, regardless of how much domain-specific knowledge they possessed. Among other things, intelligent novices seemed to control and monitor their thought processes. This suggested that there was more to expert performance than just domain-specific knowledge and skills.

Cognitive scientists called this new element of expert performance metacognition -- the ability to think about thinking, to be consciously aware of oneself as a problem solver, and to monitor and control one's mental processing.

As part of an experiment to see which metacognitive skills might be most helpful when learning something new, John Bransford, an expert cognitive psychologist, tried to learn physics from a textbook with the help of an expert physicist. He kept a diary of his learning experiences and recorded the skills and strategies most useful to him (Brown et al. 1983). Among the things he listed were (1) awareness of the difference between understanding and memorizing material and knowledge of which mental strategies to use in each case; (2) ability to recognize which parts of the text were difficult, which dictated where to start reading and how much time to spend; (3) awareness of the need to take problems and examples from the text, order them randomly, and then try to solve them; (4) knowing when he didn't understand, so he could seek help from the expert; and (5) knowing when the expert's explanations solved his immediate learning problem. These are all metacognitive skills; they all involve awareness and control of the learning problem that Bransford was trying to solve. Bransford might have learned these skills originally in one domain (cognitive psychology), but he could apply them as a novice when trying to learn a second domain (physics).

This self-experiment led Bransford and his colleagues to examine in a more controlled way the differences between expert and less-skilled learners. They found that the behavior of intelligent novices contrasted markedly with that of the less skilled. Intelligent novices used many of the same strategies Bransford had used to learn physics. Less-skilled learners used few, if any, of them. The less-skilled did not always appreciate the difference between memorization and comprehension and seemed to be unaware that different learning strategies should be used in each case (Bransford et al. 1986; Bransford and Stein 1984). These students were less likely to notice whether texts were easy or difficult, and thus were less able to adjust their strategies and their study time accordingly (Bransford et al. 1982). Less-able learners were unlikely to use self-tests and self-questioning as sources of feedback to correct misconceptions and inappropriate learning strategies (Brown et al. 1983; Stein et al. 1982).

The importance of metacognition for education is that a child is, in effect, a universal novice, constantly confronted with novel learning tasks. In such a situation it would be most beneficial to be an intelligent novice. What is encouraging is that the research also shows that it is possible to teach children metacognitive skills and when to use them. If we can do this, we will be able to help children become intelligent novices; we will be able to teach them how to learn.

We are just beginning to see what this new understanding of expertise and intelligence might mean for educational practice. The most important implication of the theory is that how we teach is as important as what we teach. Domain-specific knowledge and skills are essential to expertise; however, school instruction must also be metacognitively aware, informed, and explicit. In the next section, we will see a vivid example of this in the teaching of reading comprehension.

IV. Reading Comprehension: Teaching Children the Strategies Experts Use

Reciprocal teaching, the method mentioned in the introduction to this article, improved Charles' classroom reading comprehension by four grade levels in 20 days. This method illustrates how instruction designed on cognitive principles can help children to apply language comprehension skills in their reading and to acquire the metacognitive strategies essential to skilled reading.

Reciprocal teaching also shows how researchers, administrators, and teachers can collaborate to apply the results of research in the classroom. Annemarie Palincsar, Ann Brown, and Kathryn Ransom -- a graduate student, a professor, and a school administrator -- shared the belief that cognitive research could improve classroom practice and that classroom practice can improve research.

After five years working as a special education teacher and administrator, Palincsar returned to the University of Illinois as a doctoral student. She felt her previous training in psychodiagnostics -- training based on a medical model of learning disabilities -- was not meeting the needs of her students. She decided to broaden her academic background and to study how sociocultural factors might influence students' experience in school.

The cognitive revolution was spreading through academic circles, but had not yet reached teacher-practitioners. Palincsar's classroom experience influenced her choice of a thesis project. In her words: "As a teacher, one of the situations I found most baffling was having children who were fairly strong decoders but had little comprehension or recall of what they had read." She was baffled by students like Charles, students who can adequately comprehend spoken, but not written, language.

At first, Palincsar was interested in how self-verbalization might be used to help children regulate their cognitive processing. Donald Meichenbaum (1985) had developed techniques based on self-verbalization to help impulsive children -- children who mentally fail to stop, look, and listen -- pace their actions and develop self-control. At the time, most of the work on self-verbalization had explored how it could be used to regulate social behavior. Palincsar wondered how it might be used to regulate cognitive behavior -- specifically, how it might be used to improve reading comprehension. She wrote to Meichenbaum, who suggested that the application of his ideas to academic subjects might be strengthened by incorporating ideas from research on metacognition. He told Palincsar to discuss her idea with Ann Brown, an authority on metacognition who at that time was also at Illinois.

At their initial meeting, Palincsar showed Brown a design for the pilot study that was to evolve into reciprocal teaching. Brown offered her a quarter-time research appointment to do the study. When it proved successful (Brown and Palincsar 1982), Brown gave Palincsar a full-time research assistantship and supervised her thesis research.

Palincsar and Brown developed reciprocal teaching from the pilot study on a sound theoretical basis. (See Brown and Palincsar 1987.) They analyzed the task's demands, developed a theory of task performance based on expert-novice studies, and formulated a theory of instruction that might improve task performance. This is the same sequence Bob Siegler followed with the balance-scale task. A major difference, of course, is that reading comprehension presents a more complex problem than the balance scale.

From their analysis and a review of previous research, Palincsar and Brown (1984) identified six functions that most researchers agreed were essential to expert reading comprehension: The competent reader understands that the goal in reading is to construct meaning, activates relevant background knowledge, allocates attention or cognitive resources to concentrate on major content ideas, evaluates the constructed meaning (the gist) for internal consistency and compatibility with prior knowledge and common sense, draws and tests inferences (including interpretations, predictions, and conclusions), and monitors all the above to see if comprehension is occurring.

Palincsar and Brown then identified four simple strategies that would together tap all six functions needed for comprehension: summarizing, questioning, clarifying, and predicting. They explained the relation between the four strategies and the six functions as follows (Palincsar and Brown 1986): Summarizing a passage requires that the reader recall and state the gist he or she has constructed. Thus, a reader who can summarize has activated background knowledge to integrate information appearing in the text, allocated attention to the main points, and evaluated the gist for consistency. Formulating a question about a text likewise depends on the gist and the functions needed for summarizing, but with the additional demand that the reader monitor the gist to pick out important points. When clarifying, a reader must allocate attention to difficult points and engage in critical evaluation of the gist. Making predictions involves drawing and testing inferences on the basis of what is in the text together with activated background knowledge. A reader who self-consciously uses all four strategies would certainly appreciate that the goal of reading is to construct meaning.

Expert-novice studies supported the hypothesized connection between comprehension functions and strategies. After completing a comprehension task, expert readers reported that they spent a lot of time summarizing, questioning, clarifying, and predicting. Experts' "comprehend-aloud" protocols substantiated these self-reports. Poor readers did not report using the strategies and showed no evidence of using them in their comprehension protocols. As Palincsar and Brown characterize it, novices executed a "once-over, desperate, nonfocused read."

But can you teach the strategies to novices? And if you can, will it improve their comprehension? To answer these questions, Palincsar and Brown designed a prototype instructional intervention to teach non-experts how to use the strategies. As with all instruction, the primary problem is transfer. How should one teach the strategies to get novices to use them spontaneously? Here Palincsar based her strategy instruction on Brown's work on teaching metacognitive skills. Brown's research had shown that successful strategy instruction must include practice on specific task-appropriate skills (the cognitive aspect), explicit instruction on how to supervise and monitor these skills (the metacognitive aspect), and explanations of why the skills work (the informed instruction aspect).

The research suggests what teachers should do to help students master strategies. First, teachers have to make the strategies overt, explicit, and concrete. Teachers can best do this by modeling the strategies for the students.

Second, to ensure that students will spontaneously use the strategies where needed, teachers should link the strategies to the contexts in which they are to be used and teach the strategies as a functioning group, not in isolation. This suggests that reading-strategy instruction should take place during reading-comprehension tasks, where the explicit goal is to construct meaning from written symbols.

Third, instruction must be informed. The students should be fully aware of why the strategies work and where they should use particular strategies. Thus, instruction should involve discussion of a text's content and students' understanding of why the strategies are useful in that situation.

Fourth, students have to realize the strategies work no matter what their current level of performance. Thus, instruction should include feedback from the teacher about the students' success relative to their individual abilities and encouragement to persist even if a student is not yet fully competent.

Finally, if students are to become spontaneous strategy users, responsibility for comprehension must be transferred from the teacher to the students gradually, but as soon as possible. This suggests that the teacher should slowly raise the demands made on the students and then fade into the background, becoming less an active modeler and more a sympathetic coach. Students should gradually take charge of their learning.

Palincsar designed reciprocal teaching to satisfy all five of these requirements. Reciprocal teaching takes the form of a dialogue. Dialogue is a language game children understand, and it is a game that allows control of a learning session to alternate between teacher and student. Most important, when engaged in dialogue students are using their language-comprehension skills and sharing any relevant background knowledge they have individually with the group. In reciprocal teaching, dialogue directs these skills and knowledge toward reading.

The dialogue becomes a form of cooperative learning, in which teachers model the strategies for the students and then give students guided practice in applying them to a group task of constructing a text's meaning. Teacher and students take turns leading a dialogue about the portion of text they are jointly trying to understand. The dialogue includes spontaneous discussion and argument emphasizing the four strategies.

In reciprocal teaching, the teacher assigns the reading group a portion of a text and designates one student to be the leader for that segment. Initially, the teacher might be the leader. The group reads the passage silently. Then the assigned leader summarizes the passage, formulates a question that might be asked on a test, discusses and clarifies difficult points, and finally makes a prediction about what might happen next in the story. The teacher provides help and feedback tailored to the needs and abilities of the current leader. The student-listeners act as supportive critics who encourage the leader to explain and clarify the text. Each student takes a turn as leader. The group's public goal is collaborative construction of the text's meaning. The teacher provides a model of expert performance. As the students improve, the teacher fades into the background.
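
The turn-taking just described can be sketched as a simple plan of who leads which segment and in what order the four strategies are applied. This is an illustrative outline only, not Palincsar and Brown's procedure in any formal sense: the names STRATEGIES and session_plan are hypothetical, and having the teacher lead the first segment followed by a round-robin rotation of students is an assumption (the text says only that the teacher might lead initially and that each student takes a turn).

```python
# The four strategies, in the order the leader applies them to a segment.
STRATEGIES = ("summarize", "question", "clarify", "predict")


def session_plan(students, segments, teacher="teacher"):
    """Return a list of (segment, leader, strategy) steps.

    Assumption: the teacher leads the first segment to model the
    strategies; students then take turns leading in rotation.
    """
    plan = []
    for i, segment in enumerate(segments):
        if i == 0 or not students:
            leader = teacher
        else:
            leader = students[(i - 1) % len(students)]
        for strategy in STRATEGIES:
            plan.append((segment, leader, strategy))
    return plan


plan = session_plan(["Ann", "Ben"], ["passage 1", "passage 2", "passage 3"])
for step in plan:
    print(step)
```

The sketch leaves out what matters most in practice: the teacher's tailored feedback and the group's discussion, which cannot be reduced to a schedule.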

In the first test of reciprocal teaching, Palincsar served as the teacher and worked with one student at a time. The students were seventh-graders in a remedial reading program who had adequate decoding skills but who were at least three grades behind in reading comprehension. At first, students found it difficult to be the leader, and Palincsar had to do a lot of modeling and prompting, but gradually the students' performance improved. In the initial sessions, over half the questions students formulated were inadequate. Only 11 percent of the questions addressed main ideas, and only 11 percent of the summaries captured the gist of the passage. After ten tutoring sessions, however, students could generate reasonably sophisticated questions and summaries. By the end of training, 96 percent of the students' questions were appropriate, 64 percent of the questions addressed main ideas, and 60 percent of their summaries captured the gists of the passages.

Students' reading comprehension improved along with their performance in reciprocal teaching. On daily comprehension tests, scores improved from 10 percent to 85 percent correct and stayed at this level for at least 6 months after reciprocal teaching ended. Back in the classroom, reciprocal-teaching students improved their performance on other reading tasks from the seventh percentile before reciprocal teaching to the fiftieth percentile after. Palincsar repeated the study working with two children simultaneously and obtained the same results. (Charles, mentioned above, was one of the students in this second study.)

Palincsar and Brown wanted to know if reciprocal teaching was the most efficient way to achieve these gains before they asked teachers to try it in classrooms. Reciprocal teaching demands a great deal of the teacher's time and requires intensive interaction with small groups of students. Both are valuable classroom commodities. Could the same results be achieved more efficiently by a different method? Reciprocal teaching turned out to be superior to all the alternatives tested (Brown and Palincsar 1987, 1989). In all the comparison studies, reciprocal teaching improved remedial seventh-graders' performance on comprehension tests from less than 40 percent before instruction to between 70 and 80 percent after instruction, a level typically achieved by average seventh-graders. The best of the alternative methods -- explicit strategy instruction, where the teacher demonstrated and discussed each strategy and the students then completed worksheets on the strategies -- raised scores from around 40 percent to between 55 and 60 percent (Brown and Palincsar 1987). These studies showed that the intense and prolonged student-teacher interaction characteristic of reciprocal teaching is crucial to its success (Palincsar et al. 1988). This is the investment teachers have to make to cash in on reciprocal teaching's dividends.


Can reciprocal teaching work in a real classroom? Here Kathryn Ransom, Coordinator for Reading and Secondary Education in District 186, Springfield, Illinois, enters the story. Ransom -- a former teacher -- is a veteran professional educator. She makes it clear she has seen many trends come and go, and realizes that neither she nor the schools will please all the people all the time. Nonetheless, Ransom devotes time and effort to get new things happening in the Springfield schools. She has become adept at, as she puts it, "making deals" with research groups. "We can bring in people who have exciting ideas that need to become practical, and as the researchers work with Springfield teachers they can provide staff development experiences I never could."

Springfield's District 186 serves a population of 15,000 students, from kindergarten through high school. The system is 25 to 28 percent minority. On standardized tests, classes at all grade levels score at or above grade level in all subjects. This is a solid achievement, Ransom points out, because the majority of special education children in the district receive instruction in regular classrooms. When it was time for the classroom testing of reciprocal teaching, Palincsar approached Ransom. Ransom saw the potential of reciprocal teaching and recognized in Palincsar a researcher who could make cognitive science meaningful to administrators and teachers. The researcher and the administrator struck a deal advantageous to both.

Together, they decided to approach Springfield's middle school remedial reading teachers. These teachers worked daily with children who had adequate decoding skills but no functional comprehension skills. Ransom and Palincsar collaborated to design a staff development program that would encourage the teachers to think about instructional goals and methods and that would allow the researchers to introduce reciprocal teaching and the theory behind it. The teachers first watched videos of Palincsar conducting reciprocal teaching sessions. Later the teachers took part in reciprocal teaching sessions, playing the roles of teacher and student. Next a teacher and a researcher jointly conducted a reciprocal teaching lesson. The final training consisted of three formal sessions on the method over a three-day period.

In the first classroom study of reciprocal teaching, four volunteer remedial reading teachers used the method with their classes (Palincsar et al. 1988). Class size varied from four to seven students. Before reciprocal teaching, the baseline on daily reading-assessment tests for the students was 40 percent. After 20 days of reciprocal teaching their performance rose to between 70 and 80 percent -- just as in Palincsar's initial laboratory studies. Students maintained this level of performance after reciprocal teaching and also improved their performance on other classroom comprehension tasks, including science and social studies reading. Reciprocal teaching worked in the classroom! Experienced volunteer teachers, after limited training, could replicate the laboratory results in classroom settings.

Palincsar and Ransom obtained similar results in a study that used conscripted teachers, who varied greatly in experience and expertise. The students also were more diverse in their reading deficiencies than the students in the first study. Class size varied from 7 to 15, with an average size of 12. Each teacher taught one reciprocal teaching group and one control group; the latter received standard reading-skills instruction. Again, after 20 days of reciprocal instruction, scores on daily comprehension tests improved to 72 percent for the reciprocal teaching group, versus 58 percent for the control group. Thus, average classroom teachers, working in less-than-ideal circumstances and teaching groups of seven or more students, replicated the original laboratory results. As the ultimate test, the Springfield team ran an experiment in which the strongest student in a remedial group served as the teacher. In this study, the student-teachers improved their scores on comprehension tests from 72 percent to 85 percent correct. The other students in the group improved their scores from 50 percent to 70 percent correct.

Since the study ended, in 1989, reciprocal teaching has become a mainstay in the Springfield schools. It is now used in all remedial reading classes, and its methods have been incorporated in some form into all regular classroom reading programs. Even more encouraging, Springfield teachers exposed to reciprocal teaching and to the importance of strategic thinking attempt to integrate these elements into their teaching of other subjects.

One benefit of reciprocal teaching, and of similar projects in the Springfield system, has been the teachers' participation in extended applied research. This was part of Ransom's original agenda. A project running over 5 years, as reciprocal teaching did, provides a powerful way to change teachers' behavior. Most in-service training for teachers lasts only a day or two and at best can have only a minor impact on their thinking and their performance. Ransom sees collaboration in classroom research as a way for teachers and researchers to interact in a dignified, mutually beneficial way. The teachers gain meaningful in-service experience that is intellectually satisfying. Working closely with fellow teachers and other education professionals helps them overcome the isolation of seven-hour days as the only adult in the classroom. The research team also gains, as the reciprocal teaching researchers will attest. The teachers initially helped refine reciprocal teaching for classroom use, providing important insights into how to make an instructional prototype work in a school. Later, they helped identify new research questions and helped the researchers design ways to test the method's classroom effectiveness. Because of her Springfield experience, Palincsar decided that all her subsequent educational research would be done in close collaboration with classroom professionals.

Interest in reciprocal teaching continues within District 186 through instructional chaining. A network has developed in which teachers who have used reciprocal teaching conduct in-service sessions for other teachers. By the 1987-88 school year, 150 teachers in 23 buildings had taken part in these sessions. Teachers formed peer support groups so they could discuss progress and problems associated with daily use of reciprocal teaching and other strategy instruction. The remedial teachers also helped the district design new reading tests to assess students' use of comprehension strategies. The Springfield experience contributed to ongoing efforts at the state level to revamp reading instruction and to develop reading tests that can measure the skills that methods such as reciprocal teaching try to impart. Veterans of the Springfield experiment now work in other schools and with national educational organizations to improve reading instruction.

In the Springfield schools and in others that have used reciprocal teaching, teachers have a better understanding of what reading is about. As Palincsar and Brown (1986, p. 770) observe, "There was a time not long ago when successful reading was thought to be execution of a series of component subskills." To teach reading one taught the subskills, from word recognition through finding the main idea, often in isolation and in a fixed sequence. Charles and the approximately 60 percent of American 17-year-olds who fail to reach the fourth reading proficiency level of the National Assessment of Educational Progress -- who fail to become adept readers (Mullis and Jenkins 1990) -- show the inadequacy of this approach. Reciprocal teaching works. The strategies it teaches enable students to apply their language-comprehension skills to reading so that they can read for meaning. Reading is more than decoding and more than the mastery of a series of small, isolated subskills.

V. High School Physics: Confronting the Misconceptions of Novices

One of the best places to see a cognitive science approach to teaching is in Jim Minstrell's physics classes at Mercer Island High School. (See Minstrell 1989; Minstrell 1984; Minstrell and Stimpson 1990.)

Mercer Island is an upper-middle-class suburb of Seattle. The high school serves just over 1,000 students in four grades. Jim Minstrell has been teaching there since 1962. He holds bachelor's, master's, and doctoral degrees from the Universities of Washington and Pennsylvania, and during his career he has worked on several national programs to improve high school physics instruction. Although deeply committed to educational research, he prefers the classroom to a university department or a school of education. "I have one of the best laboratories in the world right here," he observes. He adopted what he calls "a cognitive orientation to teaching" for practical, not theoretical, reasons. The cognitive approach addresses a fundamental classroom problem that confronts science teachers: Students' preconceptions influence how they understand classroom material.

In the early 1970s, after a decade of outstanding teaching (as measured both by students' test scores and by supervisors' evaluations), Minstrell became concerned about his effectiveness. His students couldn't transfer their formal book and lecture learning to the physics of everyday situations, and they showed little understanding of basic physical concepts, such as force, motion, and gravity. At first he thought, following Jean Piaget's theory of cognitive development, that his students lacked logical, or formal operational, skills. However, when he tested this hypothesis, he found otherwise.

Minstrell describes a task of Piaget's in which students are given two clay balls of equal size. Students agree that the two balls weigh the same. But if one ball is then flattened into a pancake, many students will then say that the pancake weighs more than the ball. They reason that the pancake weighs more because it has a larger upper surface on which air can press down. This is not a logical error but a conceptual one. Students believe that air pressure contributes to an object's weight.

"Students were bringing content ideas to the situation, ideas that were greatly affecting their performance on questions that were supposed to be testing their reasoning," Minstrell recalls.

Minstrell became actively involved in research on students' misconceptions, and he tried to apply the research in his classroom. First he expanded his classroom agenda. A teacher's primary goals are to control the students and to provide explanations that allow the students to solve textbook problems. A third goal, in the current climate of accountability, is to prepare the students to pass standardized tests of low-level skills. Minstrell maintains the first two goals, minimizes the third, and adds two goals of his own, based on cognitive research: to establish explicit instructional targets for understanding and to help the students actively reconstruct their knowledge to reach that understanding.

Minstrell attempts to diagnose students' misconceptions and to remedy them by instruction. Most teachers aren't trained to recognize and fix misconceptions. How does Minstrell do it?


Minstrell assumes from the first day of school that his students have some knowledge of physics and that they have adequate reasoning ability. Unlike expert scientists who want to explain phenomena with a minimum of assumptions and laws, students are not driven by a desire for conceptual economy. Their knowledge works well enough in daily life, but it is fragmentary and local. Minstrell calls pieces of knowledge that are used in physics reasoning facets. Facets are schemas and parts of schemas that are used to reason about the physical world.

Students typically choose and apply facets on the basis of the most striking surface features of a problem. They derive their naive facets from everyday experience. Such facets are useful in particular situations; however, they are most likely false in general, and for the most part they are only loosely interrelated. Thus, students can quickly fall into contradictions. Two facets Minstrell typically finds students using when reasoning about objects are (1) that larger objects exert more force than smaller objects and (2) that only moving objects exert force. The first facet explains why the smart money was on Goliath and not David; the second explains why a football can "force" its way through a window. But how do you explain what happens when you throw a ball against the side of a building? The first facet suggests that the wall must exert a larger force on the ball than the ball does on the wall, but the second facet says that only the ball can exert a force, not the wall. So how is it that the ball bounces off the wall? As Minstrell sees it, the trick is to identify the students' correct intuitions -- their facets that are consistent with formal science -- and then build on these. As Minstrell says, "Some facets are anchors for instruction; others are targets for change."


At the outset, Minstrell's students are not different from other high school juniors and seniors. Early in each course unit he administers a diagnostic test to assess qualitative, not quantitative, physical reasoning.

Between 50 percent and 75 percent of the students believe that when a heavy object and a light object are dropped or thrown horizontally, the heavier one hits the ground first. As many as half believe that when two moving objects are at the same position they are traveling at the same speed; yet they all know that to pass a car on the highway the overtaking car must be going faster, even when the two cars are side by side. Nearly half believe that air pressure affects an object's weight. Almost all believe that a constant, unbalanced force causes constant velocity. The results of the diagnostic tests give Minstrell a profile of which facets are prevalent, which ones might be anchors, and which ones are targets for change.

Minstrell organizes his course into units, such as measurement, kinematics, gravity, and electromagnetism. Some lessons, usually presented early in a unit, are particularly important in helping students change their reasoning. Minstrell calls these benchmark lessons.

In a benchmark lesson, the teacher and the students dissect their qualitative reasoning about vivid, everyday physics problems into facets. They become aware of the limitations of each facet, and they identify which facets are useful for understanding a particular phenomenon. They can explore how appropriate facets can be combined into powerful explanations that can be used to solve other problems.

The benchmark lesson on gravity begins 6 weeks into the course. By this time Minstrell has established a rapport with his class. He has created an environment conducive to developing understanding, a climate where questioning and respect for diverse opinions prevail, a climate where the process of scientific reasoning can be made explicit and self-conscious. Even veteran teachers marvel at how uninhibited Minstrell's students are in expressing ideas, suggesting hypotheses, and arguing positions.

Minstrell explains to the students that the unit will begin with a three-problem diagnostic quiz, and that their answers will be the subject of discussion for the next two days. He reassures them that the quiz is not intended to embarrass them or show how little they know. He wants to find out what they already know, and he wants them to be aware of what they already know. (Two of the problems from the quiz are reproduced in Figure 2.)

As the students work, Minstrell moves among them and observes their answers and explanations. After 15 minutes he collects the quizzes and goes to the board at the front of the classroom. He reports that on the first question, the scale problem, he saw several answers, and he writes them on the board: 15-20 pounds, a little over 10 pounds, exactly 10 pounds, a little less than 10 pounds, and about 0 pounds. "Now let's hold off on attacking these answers. Rather, let's defend one or more of them," he suggests.

Ethan explains why he thinks the object in the vacuum weighs nothing: "I felt it was zero, because when you're in space you float. It would be related to that." Minstrell helps fill in the argument: "When you're in space things seem weightless. Space is essentially an airless environment, so the object would weigh nothing."

A few students argue that the object weighs the same in the vacuum as in air. One says that when air is present the air above and the air below the scale balance out; some air pushes down and some pushes up, with no net effect. Chris, baseball cap on the back of his head and arms crossed, offers: "Ten pounds. The vacuum inside only has a relation to air pressure, not a relation to mass."

Two students argue that the object in a vacuum weighs slightly more than 10 pounds, because under normal conditions air helps hold up the scale. When you remove the air, the object will weigh more because there is no air supporting the scale.

The most popular student response is that the scale would read slightly less than 10 pounds. These arguments invoke facets involving density and buoyancy. John presents the rationale: "It's gonna be a little less than 10. You remember Bob Beamon. He set a world record in the long jump at the Mexico City Olympics. He jumped really far there because there is less air and it is lighter and so everything weighs less."

In the class period devoted to discussion, over half the students offer explanations for one of the answers. Minstrell is strictly a facilitator, offering no facts, opinions, or arguments himself. He then encourages students to present counter arguments. When the counter arguments and the responses have run their course, Minstrell signals the start of the next lesson segment: "Sounds like there are some pretty good arguments here across the spectrum. So what do we do?" The students urge him to run an experiment. He says, "Luckily, I happen to have a scale, a bell jar, and a vacuum pump here."

Minstrell calls two students to the front to help conduct the crucial experiment. Such demonstrations are dramatic and exciting for the students and allow them to see which prediction is correct. Research also suggests that such experiences have an important cognitive role in inducing conceptual change. They provide an initial experience that places naive and expert theories in conflict. As the students try to resolve the conflict, the dramatic demonstration serves as an organizing structure in long-term memory (an anchor) around which schemas can be changed and reorganized (Hunt 1993).

The first student reports that the object on the scale weighs 1.2 newtons under normal circumstances. Minstrell starts the vacuum pump, and the students watch the gauge as the pressure drops inside the bell jar. The pump stops when the pressure gauge reads nearly zero.

"Did the weight go to zero?" Minstrell asks. Somewhat amazed, the students respond that the weight stayed the same. Minstrell suggests that they see what happens when the air rushes back into the jar. He opens the valve and the air whistles in. A student exclaims, "Air or no air in there, there's not much difference either way!"

Minstrell asks "What does this tell us about gravity and air pressure?" "Air pressure doesn't affect weight," the students respond. They have started to correct a major misconception. Other experiences in the unit and throughout the course reinforce this benchmark discovery that air pressure and gravity are distinct physical phenomena.


A few days later, Minstrell and the class analyze their reasoning about the time it would take a 1-kilogram and a 5-kilogram object to fall the same distance (problem 2 above). They run the crucial experiment -- a miniature replay of Galileo's apocryphal experiment at Pisa. After both balls hit the floor simultaneously, Minstrell returns to the board where he had written the quiz answers. "Some of you were probably feeling pretty dumb with these kinds of answers. Don't feel dumb," he counsels. "Let's see what's valuable about each of these answers, because each one is valuable. Why would you think that heavier things fall faster?"

A student suggests that heavy things (such as barbells) are harder to pull up, so it seems they would fall back to the ground more quickly too. "Right," Minstrell says. "When you lift something heavy, that sucker is heavy. Gravity is really pulling down. 'Aha,' you think, 'big effect there.' A useful rule of daily life is the more of X, the more of Z."

Why would anyone think a heavier object falls more slowly? A student argues that heavier objects are harder to push horizontally than light ones, and that because they are harder to push, one moves them more slowly; thus, when a heavier object is dropped, it must fall more slowly. Minstrell reinforces what is correct about this intuition. He points out that the first argument uses the facet of direct proportional reasoning and the second argument the facet of indirect proportional reasoning. Minstrell and the class will revisit these facets when they grapple with Newton's Second Law, F = ma (i.e., when a force acts on an object the acceleration is directly proportional to the force and inversely proportional to the object's mass).
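A short worked derivation may help show how the two facets combine. It is only a sketch: it assumes air resistance is negligible, and the symbols g (gravitational acceleration near the Earth's surface), h (drop height), and t (fall time) are introduced here for illustration and do not appear in the original lesson.

```latex
\begin{align*}
a &= \frac{F}{m}
  && \text{Second Law: acceleration is proportional to } F \text{ and inversely proportional to } m \\
F &= m g
  && \text{the gravitational pull grows in direct proportion to the mass} \\
a &= \frac{m g}{m} = g
  && \text{the two proportionalities cancel, so } a \text{ does not depend on } m \\
h &= \tfrac{1}{2} g t^{2}
  \;\Longrightarrow\; t = \sqrt{\frac{2h}{g}}
  && \text{the fall time from rest is the same for the 1-kilogram and 5-kilogram objects}
\end{align*}
```

Read this way, each student intuition captures one half of the law: the heavier object is pulled harder (the direct facet) but is also harder to accelerate (the indirect facet), and under these assumptions the two effects offset exactly.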

Minstrell concludes: "So, there are some good rationales behind these answers. Part of what I'm saying is that the rationales you have -- the physics you've cooked up in the past 16 to 19 years of living -- are valuable. But they are valuable only in certain contexts." The trick to becoming a competent physicist is knowing when to use which facet. It's not just a matter of having the pieces of knowledge; what counts is knowing when to use them -- linking conditions of applicability to cognitive actions.

The unit on gravity continues with students doing experiments in the classroom and around the school building. It ends with seven problems, all taken from standard high school texts, which allow the students to assess their mastery of the unit's central facets and concepts.

Throughout the unit, Minstrell has not lectured, expounded, or "taught" in the traditional sense. He has identified students' initial intuitions, made their reasoning explicit by eliciting and debating their positions, provided vivid benchmark experiences to help trigger conceptual change, and encouraged them to reason about these views and experiences. He has taught physics from a cognitive perspective.


In 1986 Minstrell initiated a collaboration with Earl Hunt, a cognitive psychologist at the University of Washington, to assess and refine his classroom method. Hunt, a "basic" cognitive scientist who has developed an interest in an applied science of learning, describes himself as the "wet blanket" of the project. "I'm the professional skeptic who must be convinced that it is the cognitive approach and not just Minstrell that accounts for the effects," he says.

A comparison of students' scores on pretests and posttests makes it clear that Minstrell's method works. The students learn physics. But why does it work?

One concern is whether the method's success depends entirely on Jim Minstrell's pedagogical talents. This was the first issue Hunt and Minstrell investigated. Could someone other than Minstrell use the method successfully?

Minstrell trained Virginia Stimpson and Dorthy Simpson, two math teachers at Mercer Island High who had never taught physics, to use his method. At Mercer Island, as at most high schools, which students end up in which physics sections is due more to scheduling than to student choice or teacher selection. Thus, students of varying abilities are likely to end up in each section. This allowed Minstrell and Hunt to make reliable comparisons between the performances of Minstrell's students and the performances of Stimpson's and Simpson's. Gini's and Dottie's students did at least as well as Jim's, so the effect (at least at Mercer Island High) is not due to Minstrell himself.

Is Minstrell's method better than other instructional methods currently in use? Minstrell himself has shown at Mercer Island High that his method is superior to traditional methods. His students have fewer misconceptions at course's end than do students taught traditionally. For example, on the pretest 3 percent of Minstrell's students showed correct understanding of both Newton's First and Second Laws. When he used the traditional methods and curriculum, Minstrell observed that after instruction 36 percent understood the First Law and 62 percent the Second Law. When he used his cognitive approach, 95 percent of the students ended up with a correct understanding of the First Law and 81 percent with a correct understanding of the Second Law (Minstrell 1984).
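The size of the difference is easier to see as percentage-point gaps. The sketch below simply restates the post-instruction figures reported above (Minstrell 1984); the names `results` and `point_gain` are hypothetical and chosen only for this illustration. The 3 percent pretest figure is left out because it refers to understanding both laws together and so is not directly comparable law by law.

```python
# Post-instruction percentages of students with a correct understanding,
# as reported in the text (Minstrell 1984). The dictionary layout and the
# names used here are illustrative assumptions, not part of the study.
results = {
    "First Law": {"traditional": 36, "cognitive": 95},
    "Second Law": {"traditional": 62, "cognitive": 81},
}


def point_gain(row):
    """Percentage-point gap between the cognitive and traditional results."""
    return row["cognitive"] - row["traditional"]


gains = {law: point_gain(row) for law, row in results.items()}

for law, gain in gains.items():
    print(f"{law}: +{gain} percentage points with the cognitive approach")
```

On these figures the gap is 59 points for the First Law and 19 points for the Second Law.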

Minstrell and Hunt compared Mercer Island students with students at a neighboring, comparable high school that Hunt calls "Twin Peaks." The physics instructor there also uses a conceptual, non-quantitative approach in his course. Performance on standardized math tests is the best predictor of high school physics performance. On this measure, Mercer Island and Twin Peaks students were not significantly different. So, in physics one would expect similar outcomes at the two schools. However, on the same final exam in mechanics, taken after 3 months of studying that topic, the Mercer Island students scored about 20 percent higher than the Twin Peaks students across the entire range of math scores. "This is an important result," skeptic Hunt emphasizes, "because it shows that the method does not selectively appeal to brighter students as measured by math achievement."

For good measure, Minstrell and Hunt also compared Mercer Island students with students in a "nationally known experimental, physics teaching, research and development program." The Mercer Island students consistently outperformed the other experimental group on all topics tested. Hunt adds: "We regard these data as particularly important because the questions we used in this comparison were developed by the other experimental group."

These results have allayed some of Hunt's initial skepticism, but Hunt and Minstrell realize that much remains to be done. The success of Minstrell's theory-based curriculum vindicates the cognitive approach, but for Hunt success raises further theoretical questions. He has begun a research program back in his laboratory to refine the theory underlying Minstrell's method. Why are benchmark lessons so important? How does transfer occur? How do students develop deep representations and make appropriate generalizations? Minstrell's classroom is a good laboratory, but a teacher who is responsible for seeing that his students learn physics is limited in the experiments he can conduct. No doubt, in a few years results from Hunt's basic research will feed back into Minstrell's applied research at Mercer Island High.

The next challenge for Minstrell and Hunt will be to test the method elsewhere. What will happen when teachers who are not under the innovators' direct supervision try to use the method? Instructional materials, including videotapes of benchmark lessons for each unit, will soon be ready for dissemination. The next step will be to assemble an implementation network and conduct applied research in a variety of classroom situations.


Jim Minstrell's students end up with a better understanding of physics, in part, because they learn more expert-like representations and concepts, as well as how to reason with them. There is a price to pay for this deeper understanding. As Earl Hunt points out, "From a traditional perspective one might argue that Minstrell's classes fail, because often students don't get through the standard curriculum. Last year, they did not complete electricity, and atomic physics and waves were barely mentioned." Hunt thinks that changes in curricular time and course coverage will be crucial in making science instruction more effective. Hunt is quick to add that in other countries curricula sometimes allow two to three years to teach what we cram into one.

The applied work of Minstrell and others shows that we can teach in such a way as to make a significant impact on students' scientific understanding. All who have attempted to teach for understanding, though, emphasize that doing so takes time. Minstrell spends over a week developing Newton's laws, not one or two days as in most traditional courses. Reflecting on his classroom experiences, Minstrell (1989, p. 147) advises: "We must provide the time students need for mental restructuring. Hurrying on to the next lesson or the next topic does not allow for sufficient reflection on the implications of the present lesson."

Results from cognitive research indicate that if we want more students to understand science, the instruction should start early in school, and that throughout the curriculum instruction should build on students' correct intuitions and prior understanding. We should try to teach experts' conceptual understandings, not just formulas and equations, and along with this content we should teach students how to reason scientifically. Better science instruction along these lines may require a "less is more" (or at least a "longer is better") approach to the science curriculum.


Learning is the process whereby novices become more expert. Teaching is the profession dedicated to helping students learn, helping them become more expert. Cognitive research has matured to where it can now tell us what is involved in the mental journey from novice to expert, not just in reading and physics, but across a variety of school subject domains. The research can now describe these journeys in sufficient detail -- recall Siegler's exacting, fine-grained analysis of learning the balance scale -- that it can serve as a map and guide for improved learning and teaching. We have at our disposal the basis for an applied science of learning that can inform the design of new materials, teaching methods and curricula. These are the tools students and teachers must have, if, as a nation, we are serious about becoming more productive and helping all students develop their intelligence as fully as possible.

Developing these tools and restructuring our schools to use them won't be easy. We will have to start in the classroom, where teachers interact with students. We will need teachers who can create and maintain learning environments where students have the smoothest possible journey from novice to expert and where they can learn to become intelligent novices. To do this, we will have to rethink, or at least re-evaluate, much of our received wisdom about educational policy, classroom practices, national standards, and teacher training.

Admittedly, there is much we still don't know about how our minds work, how children best learn, and how to design better schools. On the other hand, we already know a great deal that we can apply to improve our schools and our children's futures.



Bransford, J.D., Sherwood, R., Vye, N., and Rieser, J. 1986. Teaching thinking and problem solving. American Psychologist 41(10): 1078-1089.

Bransford, J.D., and Stein, B.S. 1984. The IDEAL Problem Solver. Freeman.

Bransford, J.D., Stein, B.S., Arbitman-Smith, R., and Vye, N.J. 1985. Improved thinking and learning skills: An analysis of three approaches. In J.W. Segal, S.F. Chipman, and R. Glaser, eds., Thinking and Learning Skills, volume 1: Relating Instruction to Research. Erlbaum.

Bransford, J.D., Stein, B.S., Vye, N.J., Franks, J.J., Auble, P.M., Mezynski, K.J., and Perfetto, G.A. 1982. Differences in approaches to learning: An overview. Journal of Experimental Psychology: General 111: 390-398.

Brown, A.L., Bransford, J.D., Ferrara, R.A., and Campione, J.C. 1983. Learning, remembering, and understanding. In P.H. Mussen, ed., Handbook of Child Psychology, volume 3: Cognitive Development. Wiley.

Brown, A.L., and Palincsar, A.S. 1982. Inducing strategic learning from text by means of informed, self-control training. Topics in Learning and Learning Disabilities 2: 1-17.

Brown, A.L., and Palincsar, A.S. 1987. Reciprocal teaching of comprehension strategies: A natural history of one program for enhancing learning. In J.D. Day and J.G. Borkowski, eds., Intelligence and Exceptionality: New Directions for Theory, Assessment, and Instructional Practices. Ablex.

Brown, A.L., and Palincsar, A.S. 1989. Guided, cooperative learning and individual knowledge acquisition. In L.B. Resnick, ed., Knowing, Learning, and Instruction: Essays in Honor of Robert Glaser. Erlbaum.

Chase, W.G., and Simon, H.A. 1973. Perception in chess. Cognitive Psychology 4: 55-81.

Chipman, S.F., Segal, J.W., and Glaser, R. 1985. Thinking and Learning Skills, volume 2: Research and Open Questions. Erlbaum.

Covington, M.V. 1985. Strategic thinking and the fear of failure. In J.W. Segal, S.F. Chipman, and R. Glaser, eds., Thinking and Learning Skills, volume 1: Relating Instruction to Research. Erlbaum.

De Groot, A.D. 1965. Thought and Choice in Chess. Mouton.

Feuerstein, R., Hoffman, M.B., Jensen, M.R., and Rand, Y. 1985. Instrumental enrichment, an intervention program for structural cognitive modifiability: Theory and practice. In J.W. Segal, S.F. Chipman, and R. Glaser, eds., Thinking and Learning Skills, volume 1: Relating Instruction to Research. Erlbaum.

Gardner, H. 1985. The Mind's New Science. Basic Books.

Hunt, E. 1993. Thoughts on Thought: An Analysis of Formal Models of Cognition. Erlbaum.

Klahr, D., and Siegler, R.S. 1978. The representation of children's knowledge. In H. Reese and L.P. Lipsitt, eds., Advances in Child Development and Behavior, volume 12. Academic Press.

Mansfield, R.S., Busse, T.V., and Krepelka, E.J. 1978. The effectiveness of creativity training. Review of Educational Research 48(4): 517-536.

Meichenbaum, D. 1985. Teaching thinking: A cognitive-behavioral perspective. In S.F. Chipman, J.W. Segal, and R. Glaser, eds., Thinking and Learning Skills, volume 2: Research and Open Questions. Erlbaum.

Minstrell, J. 1984. Teaching for the development of understanding of ideas: Forces on moving objects. In Observing Classrooms: Perspectives from Research and Practice. Ohio State University.

Minstrell, J. 1989. Teaching science for understanding. In L.B. Resnick and L.E. Klopfer, eds., Toward the Thinking Curriculum. Association for Supervision and Curriculum Development.

Minstrell, J., and Stimpson, V.C. 1990. A teaching system for diagnosing student conceptions and prescribing relevant instruction. Paper prepared for AERA session "Classroom Perspectives on Conceptual Change Teaching," Boston.

Mullis, I.V.S., and Jenkins, L.B. 1990. The Reading Report Card, 1971-88: Trends from the Nation's Report Card. Office of Educational Research and Improvement, U.S. Department of Education.

Newell, A., and Simon, H.A. 1972. Human Problem Solving. Prentice-Hall.

Nickerson, R.S., Perkins, D.N., and Smith, E.E. 1985. The Teaching of Thinking. Erlbaum.

Palincsar, A.S., and Brown, A.L. 1984. Reciprocal teaching of comprehension-fostering and comprehension-monitoring activities. Cognition and Instruction 1(2): 117-175.

Palincsar, A.S., and Brown, A.L. 1986. Interactive teaching to promote independent learning from text. Reading Teacher 39: 771-777.

Palincsar, A.S., Ransom, K., and Derber, S. 1988. Collaborative research and the development of reciprocal teaching. Educational Leadership 46: 37-40.

Perkins, D.N., and Salomon, G. 1989. Are cognitive skills context-bound? Educational Researcher 18: 16-25.

Savell, J.M., Twohig, P.T., and Rachford, D.L. 1986. Empirical status of Feuerstein's "Instrumental Enrichment" (FIE) technique as a method of teaching thinking skills. Review of Educational Research 56(4): 381-409.

Segal, J.W., Chipman, S.F., and Glaser, R. 1985. Thinking and Learning Skills, volume 1: Relating Instruction to Research. Erlbaum.

Siegler, R.S. 1976. Three aspects of cognitive development. Cognitive Psychology 8: 481-520.

Siegler, R.S., and Klahr, D. 1982. When do children learn? The relationship between existing knowledge and the acquisition of new knowledge. In R. Glaser, ed., Advances in Instructional Psychology, volume 2. Erlbaum.

Simon, H.A., and Chase, W.G. 1973. Skill in chess. American Scientist 61: 394-403.