In the previous section we have examined the hypothesis that the referential ambiguity of the hypernym is what makes possible children’s exclusive comparisons and we have shown that it has strong explanatory power. But to establish the explanation of the performance that we propose, we need empirical evidence supporting two of its claims.
Young children’s referential attribution of the hypernym
The first claim, which is implicit, is that the younger children do understand the referential properties of class names, that is, know that the hypernym can also be used to refer to a subclass. This was demonstrated by Smith and Rizzo (1982). In a first experiment, 4- and 5-year-olds were presented with materials such as three daisies and three roses and asked to say whether a puppet named objects correctly or not (e.g., flowers for the roses, flowers for all the flowers, roses for the roses). About two thirds of the children accepted the reference of the hypernym to both the superclass and the subclass, indicating knowledge of the referential properties of the hypernym.
The results of two other experiments support the notion that the hypernym is ambiguous. In a second experiment, 5-year-olds were asked to get a set of objects, put it back, and then get another set; this was done with two subclasses (e.g., daisies then roses) and with a superclass and a subclass (e.g., flowers then roses) by instructing the child to “get the ... and then get the ...”. Performance was virtually perfect in the first case but did not exceed 14 % in the second, suggesting difficulty in attributing reference to the complementary subclass. However, children may also fail because, as the authors acknowledge, a question that requires taking back some objects already taken is particularly tricky. In a third experiment, one group of 5-year-olds was given the same task as in the second experiment while another group received this task with feedback. In addition, both groups received a class inclusion question as a pretest and as a posttest. The no-feedback group committed three times as many errors as the other, suggesting that the source of the errors is a lack of clarity in the reference of the hypernym, which was remedied by the feedback as the intended reference became progressively fixed across trials. Also, the no-feedback group did not improve from the pretest to the posttest, whereas the other group jumped from 20 to 75 % correct. This suggests that the training was effective in disambiguating the hypernym. This work is important in showing that 5-year-old children know that a hypernym can refer to the subclass and to the superclass, and also in indicating, although indirectly, that the hypernym is ambiguous and that this can be overcome by a training procedure which helps disambiguate the hypernym.
The subclass-to-subclass comparison
The other claim of the present approach, which is explicit, is that the younger children who fail the question make subclass-to-subclass comparisons. Starting from Piaget himself, there is unanimity in favour of this claim, with the only exception of Brainerd and Kaszor (1974). They based their denial on the results of one of their experiments in which they asked children to recall the question. They hypothesised that if children referred to the subclass by the hypernym, one should observe substitutions during recall (the child reformulating the question as “more B or more B’ ”) and such errors should be more frequent after an incorrect response. Because they found few cases of substitution and no differences in frequency in a condition with immediate recall, they rejected the hypothesis. This is clearly too hasty, for the hypothesis rests on the assumption that children should reformulate the question in terms that coincide with their interpretation. This is very doubtful, as it is the experimenter’s role to define the task, give the instructions and fix the use of the vocabulary. If a child hears the name A and interprets it as referring to B’, he is likely to continue to use the experimenter’s word A to refer to B’, especially for an immediate recall.
This is borne out by results obtained by McCabe et al. (1982), who asked five class inclusion questions with various concepts and only then asked the children to recall the questions: Among the 5-year-olds who answered incorrectly, the majority recalled the question in terms of the hyponyms. Further evidence of exclusive comparisons can be found in a study by Ahr and Youniss (1970), who varied the ratios of the number of items in the subclasses (dogs and cats). With eight dogs and no cats, most 6- to 8-year-olds answered “more dogs”, suggesting an unsuccessful search for cats. This interpretation is borne out by the answer to the question formulated with “fewer”, which was “fewer animals” most of the time. Even more significantly, with four dogs and four cats the tendency was to answer “same” (half of the children to the “more” question and the great majority to the “fewer” question). Trabasso et al. (1978) offer further evidence in an investigation in which the standard question (“more A or more B”) was compared with a question of the type “more A or more B’ ”. Whereas the rate of success with the standard question ranged from one third to two thirds, depending on age, it was always above 90 % with the second question. This is easily explained if the children make exclusive comparisons. B is always chosen because there are more B than B’; so, with the standard question B is denoted by the hyponym B and the children answer “B”, whereas with the other question B is denoted by the hypernym A so that they answer “A”, which surreptitiously increases the rate of apparently correct responses. Naturally, the use of B in the formulation of the standard question is motivated by the need to avoid this possibility. Interestingly, McCabe (1987) has shown that even adults may commit errors under time constraint. When requested to identify the question asked, subclass comparisons were falsely recognised 30 % of the time.
In brief, there is overwhelming evidence in support of the claim that participants actually perform an exclusive comparison between subclasses following the class inclusion question.
Demonstration of the referential ambiguity in the standard question: experiment 1
The claim that the hypernym can be used to refer to the subclass as well as to the superclass will now be substantiated by demonstrating that the spontaneous reference attributed by children to a hypernym depends on whether or not it follows the mention of one of its hyponyms. No class inclusion question was asked in this experiment; there were only requests for designation.
Participants and material
Thirty children, aged 6;7 to 7;7 (median: 7;1) from a primary school in a small French city were presented with two kinds of concepts: Flowers (five asters and three tulips), and fruit (four bananas and three apples). For this and the following experiments the classes were drawn in colour on a Bristol board and the children were tested individually in an isolated room. Parents’ consent to the children’s participation was obtained through the school administration.
Design and predictions
There were two experimental conditions with 15 children in each. In the AB-BA condition the children were asked to designate the superclass (“show me the flowers”) by pointing with their finger; immediately after answering, the children were asked to designate the subclass B (“show me the asters”). Then the same requests were made in the reverse order with the fruit (“show me the bananas”, then “show me the fruit”). In the BA-AB condition the order of the requests was: Asters, flowers, then fruit, bananas. This design makes it possible to vary the position of the crucial pair of requests AB (first vs second position) and the concepts (flowers vs fruit). Care was taken to let the children answer at their own pace and make exhaustive choices.
It was predicted that in response to an initial request for A (mention of the hypernym), the designated items would belong to both subclasses because a preference for any one subclass is irrelevant: Children will make an inclusive use of the hypernym. In contrast, when the same request follows a previous request to show B, then there should be cases where children designate B’ exclusively. This is because in the context of a previous request to show one subclass (B), designating the complementary subclass (B’) is now relevant as this materialises the partition and establishes B’ on par with B, which is at the same hierarchical level: If you have asked me to show one subclass, then it is reasonable for me to expect that the next request will be to show the other subclass. These are cases of an exclusive use of the hypernym.
Results and discussion
We are interested in the answers to the request to show the class A, and in comparing these answers as a function of the position of the request, before or after a request to show the subclass B. The results appear in Table 1 and they are clear-cut. Because there was no difference as a function of the type of concept, we consider the totals.
Initially children were overwhelmingly correct in showing the A (B + B’), but in the context of a previous request to show the B, about one half now showed only the B’ (and the other half the B and the B’). The differences in the numbers of choices are significant for both concepts (Fisher test, p < .05). In brief, the reference of the name A has become fully ambiguous between the complementary subclass B’ and the whole class A. Interestingly, following the choice of B’, a few children interrupted themselves (with their hand hovering above the drawing) and then carried on to complete their choice with B, a hesitation which nicely reveals the ambiguity.
The consequence for the formulation of the class inclusion question is straightforward: Because the names A and B are mentioned in the same sentence, the tendency to interpret A as referring to the B’ should be even stronger than it was in the experiment where the names A and B occurred in two separate sentences. Based on the notion that the standard class inclusion question is ambiguous, and having identified the origin of the ambiguity, the next step now is to construct a modified class inclusion question devoid of ambiguity to get the correct performance on the simple judgement of inclusion.
Elaborating a modified question: experiments 2 and 3
A modification to the standard class inclusion question suggests itself, namely mentioning the superclass and the two subclasses in the question. As reported earlier, this was already done by Ahr and Youniss (1970) and by Agnoli (1991), but with inconclusive results. Experiment 2 was designed to test the effect of this manipulation.
Participants and materials
For this and the next experiments, the participants came from a suburban residential area near Paris. Forty-two kindergarten children aged 5;1 to 6;0 (median: 5;6) were presented individually with two kinds of concept: Fruit (five pears and three bananas) and flowers (four tulips and two asters).
Design and predictions for experiment 2
Each child was asked only two questions, one standard (henceforth the standard question), the other modified (the modified question). There were two conditions, with 21 children in each, that served as mutual controls and differed in the order of the questions: standard question first or modified question first. The use of the two concepts (fruit and flowers) was counterbalanced. This design allows both within- and between-participant comparisons. Before both questions the experimenter made sure that the children knew the reference of the subclasses by requesting an initial designation; there was an additional request to designate the superclass before the modified question. The questions were, “Are there more B or more A?” for the standard question, and “Are there more B or more B’ or more A?” for the modified question. No feedback was given after the child’s answer.
It was predicted that performance between- and within-participants would be higher on the modified question than on the standard question because the former question is disambiguated as the references of A, B and B’ have been fixed by designation and by the mention of all three names in the question, so that the hypernym must refer to A and the major hyponym to B.
Results and discussion
Table 2 presents the cross-distribution of the answers.
The between-participant analysis performed on the question presented first shows that three children (14.3 %) passed the standard question (a usual rate for the present age range) compared to 10 (47.6 %) who passed the modified question, an unusually high rate; this difference is significant (Chi square = 5.70, p < .01). The higher performance is confirmed by a nearly significant result within participants: Eight children passed the modified question and failed the standard question against two who had the reverse pattern (binomial test, p = .055). Finally, considering success on the standard question, it appears that 3 children (14.3 %) passed it when presented before the modified question against 8 (38.1 %) when presented after; this is a significant difference (Chi square = 3.07, p < .05) indicating that the modified question helps improve performance on the standard question: Having received the modified question first, some children learned that the hypernym does not refer to the subclass and transferred this to the standard question.
Children’s reaction time to the request to designate the superclass after their designation of the two subclasses was most suggestive. Whereas the response to the request to designate the subclasses was generally immediate, the time taken to designate the superclass (which came after designation of the subclasses) was typically several seconds; in fact, the experimenter often needed to amend the question (“show me all the A”) for the child to answer.
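The within-participant binomial test can be illustrated with a minimal sketch. It assumes a one-tailed exact test on the ten discordant children (eight versus two) with a null probability of .5; the source does not state the directionality of the test, and the function name is only illustrative.

```python
from math import comb


def binomial_upper_tail(successes: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(successes, n + 1)
    )


# Discordant children reported in the text: 8 passed only the modified
# question, 2 passed only the standard question.
only_modified, only_standard = 8, 2

# Assumption: one-tailed test in the predicted direction (modified > standard).
p_one_tailed = binomial_upper_tail(only_modified, only_modified + only_standard)
print(round(p_one_tailed, 3))  # prints 0.055
```

Under these assumptions the probability is 56/1024, which rounds to the reported value of .055.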
In this experiment the modified question was highly effective in increasing performance. Now because a request for designation accompanied the mention of the hypernym, one may question whether the sheer mention of the hypernym is sufficient to improve performance. The next experiment was designed to answer this question.
Experiment 3
Participants, design and materials
The materials, design and procedure were the same as for experiment 2. The participants were fifty-one children aged 5;10 to 6;11 (median 6;5) coming from a primary school in the same residential area. The two questions were again a standard and a modified question. However this time both were preceded by requests for designation. In brief, the two tasks differed only by the presence or the absence of the minority hyponym (B’) in the question. It was predicted that performance would be higher with the modified question than with the standard question because the formulation of the modified question disambiguates the hypernym.
Results and discussion
Table 3 presents the cross-distribution of the answers. The between-participant analysis performed on the first of the two questions shows that, as expected, performance was higher with the modified question than with the standard question, as the numbers of correct answers were 20 (80 %) and 14 (53.8 %) respectively, which is significant (Chi square = 3.91, p < .05). This result is confirmed by the second of the two tasks (88.5 and 48 %, respectively). It is also confirmed by the within-participant analysis, which indicates a highly significant effect of the modified question: 18 children passed it and failed the standard question against only one who passed the standard question but failed the modified question (McNemar test, Chi square = 14.22, p < .0005). These results still obtain for each order of presentation separately (McNemar test, Chi square, p < .01).
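As an illustration of the between-participant comparison, the following sketch computes an uncorrected Pearson chi-square for a 2 × 2 table. Two assumptions are made: the group sizes (25 and 26) are inferred from the reported counts and percentages, and the uncorrected Pearson formula is used, since the source does not say which variant of the test was applied.

```python
def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# First-question results in experiment 3 (rows: question type; columns:
# correct, incorrect). The incorrect counts are inferred from the percentages:
# 20 correct = 80 % implies 25 children; 14 correct = 53.8 % implies 26.
chi2 = pearson_chi2_2x2(20, 5, 14, 12)
print(round(chi2, 2))
```

With these inferred counts the statistic comes out close to the reported 3.91; a small difference could reflect rounding or a different variant of the test.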
In sum, there is a definite advantage in adding the minority-hyponym (B’) in the question, as predicted. It is not clear why this manipulation failed in Agnoli’s (1991) experiments.
The discrepancy may stem from a difference in the order of the three terms in the question. In experiments 2 and 3 the hypernym always came last, whereas its position was counterbalanced in Agnoli’s main experiment (and there is no information for the additional experiment). Another difference is that in experiments 2 and 3 the question was preceded by a request for designation. It is now important to separate the respective contributions to the disambiguation of the request for designation and of the presence of the hyponym in the question. In addition, we wish to establish the developmental trend. The next experiment will attempt to fulfill these objectives by presenting children aged 5–8 with four tasks: The standard question and the modified question, both with and without a previous request for designation.
The developmental trend: experiment 4
The results of experiment 2 suggest that children as young as 5 or 6 years old could pass the question if it was properly interpreted. Consequently in experiment 4 the age range started as early as 4;6 (finishing at 8;9).
Participants and materials
The participants were 386 children from kindergarten and primary schools. The age ranges were 4;6 to 5;5 (N = 59); 5;6 to 6;5 (N = 138); 6;6 to 7;5 (N = 123) and 7;6 to 8;9 (N = 66) with median ages of exactly 5;0, 6;0, 7;0 and 8;0, respectively. Two concepts were used: Fruit (five pears, three bananas) and animals (four lions, two elephants).
Design
The children were presented with two tasks in four conditions as follows:
Condition I: (1) Standard question. (2) Modified question after request for designation of the three classes.
Condition II: (1) Standard question after request for designation of the three classes. (2) Standard question.
Condition III: (1) Modified question. (2) Standard question.
Condition IV: (1) Modified question after request for designation of the three classes. (2) Standard question.
Condition I was an exact replication of one of the conditions of experiment 2. Condition IV differed by the exchange of the order of the two tasks. The first task in condition IV cumulates the disambiguations introduced in the first task of conditions II (designation) and III (modification). In all the conditions the two concepts were used in counterbalanced order.
Conditions II and III were administered to the 5- and 6-year-olds only. Because I was a control and IV the target condition these two were administered to all four age groups.
Predictions
We begin with the first task. Performance should be higher in condition IV (which cumulates two disambiguating procedures) than in conditions that have only one (III and II) or none (I); the latter two comparisons predict replications of the effects observed in experiments 2 and 3. Also performance should be higher with either of the two ways of disambiguating the standard question: By modification of the question (we expect III > I) or by a request for designation (we expect II > I). In brief, the predictions for the performance on the first task can be summarised by five inequalities : IV > I; IV > II; IV > III; III > I; II > I. Notice that no prediction is made between conditions II and III: It is an empirical question to know which of the two disambiguating procedures is the most efficacious.
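The logic behind the five inequalities can be sketched in a few lines. The encoding below is an illustrative assumption, not part of the original method: each condition is represented by the set of disambiguating procedures applied to its first task, and one condition is predicted to outperform another whenever its set strictly contains the other's.

```python
from itertools import permutations

# Disambiguating procedures applied to the FIRST task of each condition
# (hypothetical labels chosen for this sketch).
procedures = {
    "I": frozenset(),
    "II": frozenset({"designation"}),
    "III": frozenset({"modification"}),
    "IV": frozenset({"designation", "modification"}),
}


def predicted_inequalities(procs):
    """Return (higher, lower) pairs where higher's set strictly contains lower's."""
    return sorted(
        (high, low)
        for high, low in permutations(procs, 2)
        if procs[high] > procs[low]  # '>' on sets tests for a proper superset
    )


pairs = predicted_inequalities(procedures)
for high, low in pairs:
    print(f"{high} > {low}")
```

Conditions II and III each apply a single, different procedure, so neither set contains the other and no prediction is generated between them, which matches the text.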
The second task aims to test a secondary hypothesis: A transfer effect as observed in experiment 2 would result in higher performance on the second task in conditions II, III and IV.
Results and discussion
Table 4 presents the percentage of correct responses. All the comparisons that follow are statistically significant using Chi square tests at p < .05 (most of them well beyond this level). We begin with the first task.
The results of experiment 2 are confirmed and generalised: The comparison of columns I and IV shows that by combining the two disambiguating procedures there is a spectacular improvement in performance across all ages. In particular for the 7-year-olds, the rate of success jumps from less than 20 % to near perfection. Also for the 5- and 6-year-olds, the conventional criterion of inclusion (more than 50 % success) is reached. Recall that this is usually attained between 8 and 9 years. Importantly, the rates in condition I are typical of the common results, so that the possibility that the children were particularly advanced in their development can be ruled out.
Next, comparison of column I with columns II and III shows that each disambiguation procedure was effective separately. It was effective to roughly the same extent for the 6-year-olds but for the 7-year-olds the request for designation was the most effective. Finally, comparison of column IV with columns II and III shows that performance is higher when both procedures of disambiguation are cumulated rather than using any one alone.
We now consider the second task. We first compare performance on the standard question when it is asked first and when it is asked second; this is a between-participant comparison. The percentages of success when it is asked second are 28, 29.8, 62, and 84.8 % for the four age groups respectively, to be compared with the figures in the first column of Table 4: 6.6, 5.9, 18.7, and 42.4 %. This indicates a very important transfer effect, showing that children learned the rule of the game, so to speak, on the first task, that is, the conventions used for the names to refer to classes, and then applied it in the second task.
The within-participant analysis is based on Table 5, which presents the cross-distribution of answers when the second task is a standard question. Averaging across the ten 2 × 2 sub-cells, it appears that (i) failure at the disambiguated question almost always implies failure at the standard question (in 94 % of the cases) and this applies at all ages; (ii) success at the disambiguated question most generally implies success at the standard question (in 81 % of the cases) with the exception of the younger children. It is again apparent that cumulating both disambiguating procedures is conducive to the best transfer, followed by the request for designation, which in turn is more efficacious than the modified question.
Because the hypernym is ambiguous, as long as it is optimally relevant for the children to opt for an exclusive interpretation, they will compare the two subclasses. The results of experiment 4 have established that when care is taken to formulate the simple class inclusion question in a way that disambiguates in the intended sense, children as young as 5 years old can pass it because now they can engage in the comparison intended by the experimenter. The results show that the simple judgement of inclusion is made correctly three to four years earlier than is usually claimed in the literature.
There is, however, one possible methodological objection to the results of experiments 2, 3 and 4 that concerns the modified question. Because the modified question has been formulated with the hypernym in the last position, could the improvement in performance reflect only an order effect? This would mean that the child chooses response A more often just because A appears last in the question. Such considerations have some pertinence, as an order effect was observed with the standard question (Kalil et al. 1974): The order B, A yielded higher performance than the order A, B. However, the hypothesis that order is the only factor of facilitation must be rejected because in our experiments the standard question too has been formulated in the B, A order. So, if the child followed a heuristic to select the class whose name comes last, performance should be the same with both questions, but this is not so; consequently there is more in the effectiveness of the modified question than just an effect of order that would reflect a heuristic based, e.g., on an expectation that the experimenter keeps the correct option at the end of the sentence. However, the existence of an order effect with the standard question is intriguing in itself. These considerations lead us to a refinement of the linguistic analysis that we now develop.
More on the psycholinguistic analysis of the question: experiment 5
In the formulation of the modified question the hypernym A was placed at the end on purpose. Indeed, the order of the names is not indifferent from the viewpoint of the linguistic theory. When both hyponyms B and B’ have already appeared in the sentence, the hypernym A is unlikely to be given the same reference as B or B’ because the extension of the subclasses has already been denoted; this optimises the exploitation of the use of B’ to disambiguate A. But if A appears before both B and B’, there are a number of possibilities such as deferring reference until after B and B’ have been mentioned, or giving A a revocable reference that may or may not be revoked at the end: The final assignment of A to B’ is not so straightforward and less warranted. Experiment 5 was designed to test the hypothesis that performance is affected by the position of A in the question.
Participants and materials
Seventy-one primary school children aged 5;10 to 7;0 (median 6;4) were presented with the fruit drawing (five pears and three bananas).
Procedure, design, and predictions
One single question was asked, preceded by a request for designation. The order of the three names in the question was varied according to all six possible permutations constituting six groups of 11 or 12 children:
(1): B’ B A (2): B B’ A (3): B’ A B (4): B A B’ (5): A B’ B (6): A B B’
We have seen that the best performance is expected to occur when A is the last mentioned. When it is not, there is an additional treatment and a load in working memory which is costly, especially for the younger children. As a first approximation, we hypothesise that the difficulty is an increasing function of the distance of A from the end position. Thus, the prediction for the correct response rate is: B’ B A = B B’ A > B’ A B = B A B’ > A B’ B = A B B’
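The six orders and the predicted grouping can be reproduced with a short sketch. It assumes that "distance from the end position" is simply the number of names following A in the question; the function and variable names are illustrative only.

```python
from itertools import permutations

NAMES = ("A", "B", "B'")


def distance_from_end(order):
    """Number of names mentioned after the hypernym A in the question."""
    return len(order) - 1 - order.index("A")


# Group the six possible orders by the distance of A from the final position.
tiers = {}
for order in permutations(NAMES):
    tiers.setdefault(distance_from_end(order), []).append(" ".join(order))

# Smaller distance = higher predicted rate of correct responses.
for distance in sorted(tiers):
    print(distance, sorted(tiers[distance]))
```

Each distance value collects exactly two orders, which corresponds to the three pairs of groups in the predicted ranking (A last, A in the middle, A first).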
Results and discussion
Table 6 presents the numbers of answers for each group. When the position of A is kept constant within the three sub-groups (last, middle, first), the frequency of A answers does not vary. The comparison between the three groups obtained by collapsing (1) and (2), (3) and (4), and (5) and (6) indicates that the position of A is the only factor that yields a variation in the frequency of A responses, with the lowest rate for the first position; the middle and last positions, however, yield equal rates, contrary to the prediction of a decrease from last to middle. Nevertheless, the whole trend is compatible with the prediction of a general decrease (Jonckheere trend test for ordered alternatives, z = 2.06, p < .05).
So, putting A in the mid-position resulted in as much improvement as putting it last. This is compatible with the post hoc hypothesis that the contiguity between A and the last hyponym is necessary for A to remain in working memory and have a better chance of receiving its correct reference, whereas in the first position A is readily lost. Of course, this interpretation needs independent experimental support.
Conclusion of experiments 1–4
The experiments reported offer direct evidence that in the standard class inclusion question the hypernym (A) has referential ambiguity (of the privative variety). Experiment 1 has shown that it can refer with an inclusive denotation to the superclass, but also with an exclusive denotation to the subclass that is not mentioned in the question, that is, the minor subclass B’. The main claim of this paper is that the interpretation of the hypernym is pragmatically determined as a function of the child’s perception of the aim of the standard task, which evolves with age. Depending on their level of development, children may or may not spontaneously adopt the interpretation that enables the experimenter to test their acquisition of the simple inclusion judgement. One interpretation (the exclusive one) does not offer this possibility. Consequently, experimenters who wish to know whether the younger children are capable of the simple inclusion judgement should attempt to disambiguate the hypernym and help the children interpret the question in such a way that the hypernym refers to the superclass, which is its intended meaning in the standard question; only then can it be considered that the children are put to a valid test. The results of experiments 2, 3, and 4 have shown that when one, or even better, two disambiguation procedures are applied, the children reach the critical behavioural criterion of inclusion three to four years earlier than is usually claimed in the literature, that is, as early as five years of age. This is by no means a lower bound; rather, it may be the limit that the present means of investigation is able to reach.