People often miscalibrate their own absolute or relative performance in a task and appear to be overconfident in difficult tasks. This cognitive bias may lead to serious inefficiencies in educational systems, in which students repeatedly choose whether to exert effort at increasing levels of difficulty and reap returns that depend mainly on whether they pass or fail at each successive level. We reproduce this scenario in a real-effort experiment in which we measure the individual's task-specific ability and subjective probability of success (confidence) at three levels of increasing difficulty.
Our subjects had to solve anagrams. In a long training phase of nine rounds, a maximum of six anagrams could be solved in each eight-minute round. Subjects earned a sum of money if they solved at least two-thirds of the total number of anagrams. The task was relatively easy, since 84% passed the test. Successful participants were then asked to play “double or quits” in two successive sessions of increasing difficulty. Those who chose to quit left the game with the money already earned, while those who chose to double could substantially increase their gains if they succeeded in solving increasing numbers of anagrams under the same rules in two successive levels of three rounds each. However, they would lose part of their earnings and leave the game if they failed to reach any of these levels. The double-or-quits decision was repeated for those who had reached the second level: they could leave the game with their gains or engage in three final rounds in the hope of reaching the third and highest level. 20% passed the second-level test and less than 11% passed the third-level test. Confidence in one's ability to reach a given level was elicited just before the start, again before the fifth round, and finally before the attempt to reach level 2. An accurate objective measure of task-specific ability is provided here by the average number of anagrams solved per minute, computed after the first four rounds.
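The ability measure described above can be illustrated with a minimal Python sketch. The function name, the constants' names and the example counts are hypothetical; only the eight-minute round length and the four-round window come from the text.

```python
ROUND_MINUTES = 8    # each round lasts eight minutes (from the text)
ABILITY_ROUNDS = 4   # ability is computed after the first four rounds


def ability_per_minute(solved_per_round):
    """Average number of anagrams solved per minute over the first four rounds."""
    first_rounds = solved_per_round[:ABILITY_ROUNDS]
    if len(first_rounds) < ABILITY_ROUNDS:
        raise ValueError("need at least four rounds of data")
    return sum(first_rounds) / (ABILITY_ROUNDS * ROUND_MINUTES)


# Hypothetical subject: these counts are invented for illustration only.
example_counts = [4, 5, 3, 4, 6]
print(ability_per_minute(example_counts))  # 16 anagrams / 32 minutes = 0.5
```

Rounds after the fourth are ignored on purpose, so the measure is fixed before the second confidence report.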
Five treatments were considered. The required number of anagrams was imposed in the first two treatments but differed between them. In the “wall” treatment, subjects who chose to double after reaching level 1 faced a wall of 10 anagrams in each of two sessions of three rounds. In the “hill” treatment, they faced a rising slope of 8 anagrams during the first session of three rounds (level 2) and 12 anagrams during the second session of three rounds (level 3). The total number of anagrams required to reach the highest level (level 3) was identical in these two treatments (40 anagrams), but it was attained by two different paths. In the third treatment, designated the “choice” treatment, subjects were offered a choice between these two paths to reach levels 2 and 3. Finally, two more treatments (screening and ranking) were added, in which the choice of track was reserved for the ‘more able’ subjects and the “hill” path was assigned to the ‘less able’. A total of 779 participants were recruited at LEEP (Paris I University) and CIRANO (Montreal).
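The difference between the two paths can be made concrete with a short Python sketch. The per-session requirements are those stated above; the dictionary and function names are hypothetical. The sketch covers only the two sessions after level 1 and does not model how the level 1 requirement enters the 40-anagram total.

```python
# Anagrams required in the level 2 and level 3 sessions (from the text).
PATHS = {
    "wall": (10, 10),
    "hill": (8, 12),
}


def cumulative_requirements(path):
    """Return the anagrams needed beyond level 1 to reach level 2 and level 3."""
    level2, level3 = PATHS[path]
    return level2, level2 + level3


for name in PATHS:
    to_level2, to_level3 = cumulative_requirements(name)
    print(f"{name}: {to_level2} to reach level 2, {to_level3} to reach level 3")
```

Both paths demand the same 20 additional anagrams to reach level 3, but the hill asks for fewer at level 2 (8 against 10) and more at level 3 (12 against 10).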
In accordance with earlier studies, we observe (ability-adjusted) under-confidence at the easy level 1 and overconfidence at the more difficult levels 2 and 3. We show that these effects may be partly attributed to the limited discriminating power of subjects, who do not perceive differences in difficulty between tasks (wall and hill) unless they are extremely salient. We also extend this well-documented ‘hard-easy effect’ by showing that, for a given level, low-ability subjects are more prone to overconfidence than high-ability subjects. In the next step, we study whether this cognitive bias can be eliminated with experience and feedback on the task. For this purpose, we develop an ‘intuitive Bayesian’ model that predicts reported self-confidence as a weighted average of a prior and cues received during the game. We find that individuals behave rationally in a local or intuitive sense, since this model is not rejected by the data. The relative overconfidence of less able subjects is somewhat limited by the experience of their lower performance. However, it is far from eliminated, perhaps because our subjects (like students) received only partial feedback on their ability to pass.
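The weighted-average idea behind the ‘intuitive Bayesian’ model can be sketched in Python as follows. This is only an illustration of a weighted average of a prior and a cue: the function name, the single scalar cue and the fixed weight are assumptions, and the model's actual specification and estimated weights are not given in this summary.

```python
def updated_confidence(prior, cue, weight):
    """Weighted average of a prior confidence and a cue, both on a 0-1 scale.

    `weight` is the (hypothetical) weight placed on the cue; the remainder
    stays on the prior.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return (1.0 - weight) * prior + weight * cue


# Hypothetical values: an optimistic prior revised after a discouraging cue.
prior_confidence = 0.5
observed_cue = 0.0
print(updated_confidence(prior_confidence, observed_cue, weight=0.25))
```

With a weight below one, a discouraging cue lowers reported confidence without erasing the prior, which is one way to read the finding that overconfidence is reduced by experience but not eliminated.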
In a further step, we study the inefficiency caused by the individuals' imperfect knowledge of their own ability. We reproduce experimentally the typical structure of schooling systems. After a long phase of compulsory schooling (level 1), students may quit for the job market or engage in further studies. Those who decide to continue usually have an option between two tracks (or more), a general and a vocational track, which differ in the required level of cognitive ability. The less able students should opt for vocational studies in level 2 while the more able would opt for general studies. If successful, both groups of students would have another choice to quit or engage in further studies (level 3). However, students engaged in general education would normally find it a lot easier to pass this higher level than students engaged in a vocational track.
If students are fully aware of their own cognitive ability by the end of level 1, they will optimally self-select between tracks and grades on the basis of that ability. However, imperfect knowledge of ability may lead to inefficient sorting of students between tracks and grades. This prediction was tested experimentally. We found that subjects who could choose their preferred track failed more frequently on average than those who had no choice. Overconfidence and failure increased among subjects who could choose their track: because they overestimated their chances of future success, they opted for the more difficult path at the middle level more frequently than subjects who had no choice.
In the last step, we introduce screening and ranking of students. Selecting students at the gate on the basis of an index of cognitive ability is commonly considered an efficient way of sorting students who do not know their ability perfectly. Both procedures relegate ‘less able’ students to vocational studies and let the ‘more able’ opt for their preferred track. They differ, however, in the criterion used for selection. In the screening treatment, higher ability was defined as achieving a good level of performance (above the pass level); in the ranking treatment, it was defined as attaining the pass level of performance among the first ranks. Quite surprisingly, the screening and ranking treatments produced worse outcomes than self-selection, and ranking was the worst treatment. Selection at the gate increased inefficiency instead of reducing it! Aggregate performance was maximized, and inefficiency minimized, when subjects were randomly allocated to a track. This was true even for the more able subjects, who are supposed to benefit from screening and ranking.