Why the Better-than-Average Effect is a Worse-than-Average Measure of Self-Enhancement: An Investigation of Conflicting Findings from Studies of East Asian Self-Evaluations

Takeshi Hamamura

Steven J. Heine

University of British Columbia

Timothy R. S. Takemoto

Yamaguchi University

Please address correspondence to:

Steven J. Heine

2136 West Mall, University of British Columbia

Vancouver, BC, V6T 1Z4 Canada

Tel: (604) 822-6908. Fax (604) 822-6923

E-mail: heine@psych.ubc.ca

Motivation and Emotion (in press)


A recent meta-analysis on cross-cultural studies of self-enhancement finds that evidence for East Asian self-enhancement is consistently apparent only in studies where participants compare themselves to the average other, aka the “Better-than-Average” Effect (BAE). However, prior research has suggested that the BAE may conflate motivations to view the self in a positive light with non-motivational factors, such as a tendency to evaluate “everyone as better than average” (EBTA; Klar & Giladi, 1997). In two studies, European-Canadian, Asian-Canadian, and Japanese students were asked to evaluate themselves as well as a fictitious student compared to the average. Replicating prior research, evidence for Japanese self-enhancement was found with the BAE, albeit weaker than among Canadians. However, in the measures where the EBTA effect was circumvented, self-enhancement was no longer evident among Japanese. Likewise, within the BAE method, prior research has found that East Asians self-enhance more for important than unimportant traits. When the EBTA effect was circumvented this correlation was also significantly reduced. Findings from this research converge with other sources of evidence that East Asians do not appear to be motivated to self-enhance.

The question of whether people from East Asian cultures (in particular, Chinese, Koreans, and Japanese) exhibit as strong self-enhancement motivations as Westerners has recently received much attention and has generated considerable controversy. Many studies have found evidence that Westerners self-enhance more than East Asians (e.g., Chang & Asakawa, 2003; Heine, Takata, & Lehman, 2000; Norasakkunkit & Kalick, 2002). There does not appear to be much disagreement regarding the existence of this cultural difference. A recent meta-analysis of the published literature (Heine & Hamamura, 2007) revealed a large cultural difference in self-enhancement (d = .84) between the two cultural groups that emerged in 29 of the 30 methods that were employed to investigate this question (the one exception being studies that used the Implicit Association Test; e.g., Kitayama & Uchida, 2003).

However, although cultural differences in self-enhancement emerged quite consistently across studies, evidence for the existence of self-enhancing motivations among East Asians varies considerably across studies. Some researchers have found evidence that East Asians show pronounced self-criticism. For example, Kitayama, Markus, Matsumoto, and Norasakkunkit (1997) found that Japanese were more likely to experience self-esteem decreases than self-esteem increases when they imagined themselves in various situations (whereas Americans showed the opposite effect). Likewise, Heine and Lehman (1995, Study 2) found that Japanese estimated that the absolute likelihood of their experiencing negative life events was greater than that of their peers (whereas Canadians showed the reverse effect). In contrast, other studies have found evidence that East Asians show pronounced self-enhancement, albeit less pronounced than among Westerners. For example, Heine and Lehman (1995, Study 2) found with a relative-likelihood measure that Japanese rated themselves as less likely than the average student to experience various negative life events. Likewise, Brown and Kobayashi (2002) found that Japanese evaluated themselves more positively on a number of personality traits than they did the average student. In contrast to this inconsistent pattern of self-enhancement for East Asian participants, Western participants showed significant evidence for self-enhancement consistently across the different studies (Heine & Hamamura, 2007). In short, the question of whether East Asians self-enhance has been controversial and does not yield a straightforward answer.

A further source of controversy is whether the identified cultural differences in self-enhancement can be accepted at face value. Heine et al. (1999) proposed three alternative accounts for cultural differences in self-enhancement. One account was that East Asians might self-enhance by enhancing their groups, whereas Westerners self-enhance by enhancing their individual selves (for competing arguments on this account see Brown & Kobayashi, 2002; Heine, 2003; Heine & Lehman, 1997; Muramoto & Yamaguchi, 1997). A second account was that East Asians might be feigning their self-criticism (or Westerners might be feigning their self-enhancement), such that their true motivations might not be evident in self-report measures due to cultural differences in self-presentation norms (for arguments on both sides of this debate see Heine & Hamamura, 2007; Heine et al., 2000; Kobayashi & Greenwald, 2003; Kurman, 2003). A third account was that East Asians might self-enhance in domains that were of importance to them. The present paper does not address the first two accounts, but focuses on the third account regarding the importance of traits.

Do East Asians self-enhance more in domains that are important to them than they do in domains that are relatively unimportant? Similar to the conflicting pattern of evidence regarding the existence of self-enhancement among East Asians, there have been conflicting findings regarding this question. Some research has found clear evidence that East Asians are more likely to self-enhance in important domains than they are in less important ones. For example, Sedikides, Gaertner, and Toguchi (2003) found that Japanese were more likely to self-enhance in interdependent domains than they were in independent domains, whereas Americans showed the opposite pattern. Brown and Kobayashi (2002) showed that Japanese, like Americans, showed more self-enhancement for traits that were especially important to them compared with those that were less important. On the other hand, other research has found the opposite pattern. Heine and Lehman (1995) found that Japanese were less likely to self-enhance in interdependent domains than they were in independent ones. Similarly, Heine and Renshaw (2002) found that Japanese were more self-critical for those traits that they viewed to be important compared with those that they viewed to be less important, whereas Americans showed the opposite pattern (also see Heine et al., 2001; Heine & Lehman, 1999; Kitayama et al., 1997; Norasakkunkit & Kalick, 2002). In sum, there has also been controversy over the question of whether East Asians self-enhance for especially important traits.

Variation Across Methods

A recent meta-analysis of cross-cultural studies of self-enhancement between Westerners and East Asians helps to shed light on the conflicting patterns of evidence (Heine & Hamamura, 2007). Overall, across all studies, there were pronounced cultural differences in self-enhancement, with Westerners showing strong evidence of self-enhancement (d = .87), whereas East Asians did not (d = -.01). However, this overall analysis conceals a considerable degree of variation across methods. In particular, there were two methods in which East Asians consistently showed a strong self-enhancement effect: 1) studies in which participants evaluated themselves in comparison with the average other (aka, the “better-than-average effect”; BAE), and 2) studies in which participants evaluated the relative likelihood that they would experience negative future life events (aka, the “future is better-than-average effect”; FBAE). In the remaining 12 methods, in contrast, East Asians showed either evidence for self-criticism or null effects. In contrast to this method-specific pattern among East Asians, Westerners showed significant evidence for self-enhancement in all 14 of the methods (Heine & Hamamura, 2007).

The average effect size for East Asian self-enhancement for the BAE and FBAE was positive (d = .38) while the average for all the other 12 methods in the meta-analysis was negative, showing evidence for self-criticism (d = -.24). Westerners also showed more self-enhancement in the BAE and the FBAE designs (d = 1.31) than in other methods (d = .70). Hence, studies that utilize the BAE and FBAE methods yield stronger self-enhancement effects for both East Asians and Westerners (a difference of approximately d = .60 for both cultures) than other methods of assessing self-enhancement.

Furthermore, studies that have explored whether East Asians self-enhance more for important traits than unimportant ones reveal a similar pattern. Studies that have investigated the relation between self-enhancement and trait importance using the BAE design find significant positive relations (average r = .20; Heine, Kitayama, & Hamamura, 2007), providing evidence for self-enhancing motivations. In contrast, studies that investigated the relation between self-enhancement and trait importance using other methods (e.g., self-peer comparisons, manipulations of success and failure) find non-significant relations (average r = .05; Heine et al., 2007). Again, these analyses reveal that the BAE design provides greater evidence for self-enhancement than the other methods.

Everybody is Better than Their Group’s Average (EBTA)

Why do the BAE and FBAE methods result in stronger self-enhancement effects than those obtained from other methods? Some researchers suggest that the BAE and FBAE capture “a robust and valid signature of self-enhancement” (Sedikides, Gaertner, & Vevea, 2005, p. 548), and thus argue for the authenticity of East Asian self-enhancement captured in these designs. However, a question remains as to why East Asian self-enhancement is not reliably captured in the 12 other methods that have been utilized in the past (Heine & Hamamura, 2007).

A more parsimonious account is that the BAE and FBAE yield an inflated estimate of self-enhancement across cultures. Recent developments in BAE and FBAE research support this account. Kwan, John, Kenny, Bond, and Robins (2004) suggest that self-other comparisons such as the BAE are biased measures of self-enhancement because they do not take into account whether one’s self-evaluation is more positive than others’ evaluations of him or her. Kwan et al. (2004) suggest that methods such as the BAE and FBAE yield an inflated self-enhancement effect for this reason. Furthermore, much recent research has shown that the BAE and FBAE are influenced by non-motivational factors, such as egocentrism (Kruger & Burrus, 2004) and focalism (for a review see Chambers & Windschitl, 2004). This line of research does not rule out that self-enhancement motivation underlies the BAE and FBAE, but it does suggest that the BAE and FBAE are conflated with non-motivational factors.

One such factor, in particular, systematically inflates estimates of self-enhancement captured in the BAE and FBAE designs. This bias arises when people process singular versus distributional information (cf. Kahneman & Tversky, 1973). Klar and colleagues (Klar & Giladi, 1997; Klar, Medding, & Sarel, 1996) have suggested that in making a comparative judgment between a singular target (e.g., the self, a stranger) and a distributional target (e.g., most other students in my university, the average person), people fail to adequately consider the qualities of the group, and the comparison comes to reflect only their absolute evaluations of the singular target. Thus, if people are comparing a fictitious target (e.g., “Jennifer”) with most other members of a positively evaluated group (e.g., university students), participants have a mildly favorable attitude towards Jennifer as a member of this positive group, and they express this favorability by concluding that Jennifer is “better than average.” This effect is termed the “everyone is better than their group’s average effect” (EBTA; Klar & Giladi, 1997). Viewing a random other as better than average is a finding parallel to what is seen in the BAE design, yet it could not be driven by self-enhancing motivations as it has nothing to do with the self.

Likewise, in studies where participants are asked to estimate the likelihood that they will experience certain future events relative to the average other (i.e., the FBAE), their evaluations are also vulnerable to the EBTA effect. The FBAE is prone to the EBTA effect because in estimating the relative likelihood of future life events, people tend not to adequately consider the perceived likelihood for others (Klar et al., 1996). That is, people will reason that “Jennifer” is unlikely to become an alcoholic, and their focus on the specific target of Jennifer, and not the distributional target of the comparison group of “most other students” will lead them to conclude that she is less likely than average to become an alcoholic. People’s judgments of Jennifer’s relative likelihood thus fail to consider the base rates of these events. To the extent the FBAE results from people’s considerations of a target’s absolute likelihood, the FBAE should be larger for future events that are especially unlikely. For this reason, studies find a larger FBAE for negative future life events compared with positive events as the negative events tend to be far less common than positive ones (Price, Pentecost, & Voth, 2002). Indeed, parallel findings emerge from cross-cultural studies of unrealistic optimism. When asked to compare themselves to others, people from both Eastern and Western cultural groups show more of a self-enhancing bias for FBAE judgments of negative events (average ds across 7 studies = .39 and .98 for East Asians and North Americans, respectively) than they do of positive events (average ds across 5 studies = -.20 and .42 for East Asians and North Americans, respectively; Heine & Hamamura, 2007). 
Furthermore, whereas Westerners exhibit significant self-enhancing biases for FBAE judgments of both positive and negative events, East Asians only show a significant self-enhancing bias for negative events (and East Asians do not show evidence of unrealistic optimism when estimating likelihoods in absolute terms; Heine & Lehman, 1995).

To the extent that the EBTA effect is implicated in studies of the BAE and FBAE, self-enhancement effects that are reported from methods that have participants compare themselves to average might thus consist of two components: a motivation to view themselves positively (self-enhancement) as well as a cognitive tendency to neglect the qualities of the group (the EBTA effect). Extending this rationale, it would seem likely that across cultures the BAE and FBAE should be greatly reduced when participants evaluate themselves against a random other, thus circumventing the EBTA effect. A few studies support this consideration. Alicke, Klotz, Breitenbecher, Yurak, and Vredenburg (1995) found that the BAE was attenuated dramatically if instead of comparing themselves to a generalized target people compared themselves to a randomly chosen singular target. Also, when participants are asked to compare themselves to a specific target (i.e., a sibling), the FBAE is greatly reduced for both Westerners and East Asians (Chang & Asakawa, 2003).

Similarly, we hypothesize that the EBTA effect might be implicated in the stronger positive correlations between self-enhancement and importance that are evident in studies of the BAE compared with studies conducted with other methods (Heine et al., 2007). If people evaluate specific others especially favorably in BAE studies because of the EBTA effect, it follows that they should rate specific others as better than average especially for those traits that are most positive. Favorable evaluations of people are most afforded by traits that are strongly valenced. For example, if a person evaluated a target extremely positively on especially valenced traits, such as warm, intelligent, or trustworthy, they would likely have an overall positive view of that target. In contrast, extremely positive evaluations on less valenced traits such as punctual, impulsive, or cautious, would not necessarily translate into an overall positive view of the target. The more desirable and important the trait, the more it will afford a positive evaluation. We thus reason that because the EBTA effect inflates the positivity of evaluations of individual targets compared to average it should also inflate the correlation between self-enhancement and importance. Hence, we hypothesize that if the EBTA effect is circumvented, the magnitude of these correlations should decrease.

In two studies we sought to assess whether Japanese and Canadians still show significant self-enhancement biases when the EBTA effect is taken into account. In the first study we investigated whether the EBTA effect magnified measures of self-enhancement using the BAE and FBAE designs. We also investigated whether the EBTA effect inflated the magnitude of correlations between self-enhancement and the importance of traits in a BAE design.

Study 1


Participants were recruited from several Japanese universities and from the University of British Columbia (UBC). Announcements were made in various classes at the universities inviting students to participate in a survey on the internet. The Japanese sample consisted of 31 students (20 females and 11 males) from Chuo University, International Christian University, Hokkaido University, Kyoto University, Sophia University, and Tokyo Gakugei University. All Japanese participants were born in Japan and had Japanese parents.

The Canadian sample consisted of 98 UBC students (74 females and 24 males). We partitioned the Canadian sample into three groups by ethnic background. Participants were classified as “Asian-Canadian” if they self-identified with an East Asian ethnicity; specifically, Chinese (including those from Taiwan and Hong Kong), Korean, or Japanese. Forty-seven participants (34 females and 13 males) met the criteria for this group. The “European-Canadian” sample consisted of the 40 participants (30 females and 10 males) who reported that they were of European ethnicity. The remaining 11 participants were of varied ethnicities (e.g., Middle Eastern descent, mixed ethnicities) and were excluded from the analyses. A number of participants had missing values on some of their measures, so the degrees of freedom vary slightly across some analyses.


Participants from both countries completed a questionnaire on the internet that assessed the BAE and relative-likelihood unrealistic optimism for negative events (aka the FBAE). The BAE was assessed using the list of 15 attributes developed by Brown and Kobayashi (2002). As in Brown and Kobayashi (2002), participants rated how accurately these 15 attributes characterized themselves on a Likert scale from 1 (Not at all accurate) to 7 (Completely accurate), and how accurately they characterized “most other students” from their university. In between these two sets of evaluations we added one additional rating task: participants rated how accurately those 15 attributes characterized a specific, fictitious individual. Participants read a brief statement which said that “Kate (Yumiko in Japanese) age 20, is a student at your university. Please evaluate Kate on the following scale.” “Kate” and “Yumiko” are common female names among university-aged students in Canada and Japan, respectively, and we chose female names expecting that the overwhelming majority of our participants would be female. Following the ratings of Kate (Yumiko), participants rated most students from their school. The materials and procedure for this study are modeled after Klar and Giladi (1997, Study 2). Lastly, participants rated how important each of the 15 traits was to them on a scale ranging from 1 (Not at all important) to 7 (Very important).

The FBAE was assessed with 10 potential future life events, adopted from Heine and Lehman (1995, Study 1). The 10 events were put into two types of statements, and beneath the description of each event, respondents were presented with a 7-point rating scale which ranged from 1 (Much less likely than the average university student) through 4 (About the same as the average university student) to 7 (Much more likely than the average university student). Participants rated the relative likelihood that they would experience the events, followed by the relative likelihood that Kate (Yumiko) would experience the events, using the same scale.

Translation of Materials. Questionnaires were produced both in English and Japanese, and respondents completed them in their native language. The original English version was translated into Japanese by a bilingual, and two other bilinguals checked the translation to ensure comparability and equivalence in meaning.

Results and Discussion

Comparability of Samples

A significant age difference emerged among the three groups, F(2, 114) = 5.88, p < .01. The Asian-Canadian sample (M = 19.06) was significantly younger than both the European-Canadian sample (M = 21.33) and the Japanese sample (M = 21.30). We calculated the correlations between age and each of the dependent variables and found a significant correlation between age and evaluations of “most other students.” Thus, we included age as a covariate for analyses with this variable.

The Japanese sample consisted of 64.5% females, compared to 75% for the European-Canadians and 72.3% for the Asian-Canadians. These proportions were not significantly different, χ²(2, N = 118) < 1, ns. We report all analyses collapsed across gender but note whenever gender effects emerge.


We examined how members of each cultural group rated themselves, the fictitious other, and most others. Participants’ evaluations were averaged across the 15 attributes to form a composite measure. Reliability analyses conducted within each culture and for each of the 3 rating scales revealed that the average Cronbach’s alpha was .85 (range = .72 to .96), indicating that participants generally viewed the attributes similarly within each type of statement. Analyses of ratings were conducted on these composite measures (see Table 1).
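As an illustration of the reliability analysis described above, the following Python sketch computes Cronbach’s alpha for a participants-by-attributes matrix of ratings, using the standard formula alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy ratings are hypothetical; they are not the data or the analysis script from this study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of participants, each a list of item ratings."""
    k = len(ratings[0])  # number of items (15 attributes in the study)
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants
    item_vars = [variance([row[i] for row in ratings]) for i in range(k)]
    # Sample variance of each participant's total score
    total_var = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 1-7 ratings from four participants on three attributes
toy = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

In the design described here, such a function would be applied separately to the self, fictitious-other, and most-others ratings within each cultural group.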

First, we calculated the BAE by subtracting participants’ ratings of “most other students” from their ratings of themselves. To the extent that participants’ ratings for the statements about themselves were higher than those for most other students, the difference score of these two is an indication of the BAE. An ANOVA of the BAE revealed no cultural difference, F(2, 111) < 1, ns. Japanese (M = .34, SD = .91) showed as pronounced a BAE as both European-Canadians (M = .37, SD = .59) and Asian-Canadians (M = .33, SD = .72). Analyses of the magnitude of the BAE within each culture revealed that both European-Canadians, t(37) = 3.75, p < .001, and Asian-Canadians, t(47) = 3.15, p < .01, showed a significant BAE, and the effect was nearly significant among Japanese, t(30) = 2.03, p < .06.

Next, we calculated the difference between the ratings for the fictitious other and most other students. This represents the EBTA effect (Klar & Giladi, 1997). An ANOVA that was conducted on the EBTA effect revealed an unpredicted cultural difference, F(2, 112) = 7.83, p < .001. Post-hoc comparisons (Tukey’s, which are used throughout the studies) revealed that the EBTA effect was larger among Japanese (M = .65, SD = .80) than among European-Canadians (M = -.01, SD = .63) or Asian-Canadians (M = .25, SD = .65). The difference between the two Canadian groups was not significant. Moreover, the EBTA effect was not significant among European-Canadians, t < 1, ns, although it was significant for both Asian-Canadians, t(47) = 2.69, p < .01, and Japanese, t(30) = 4.48, p < .001. We believe there are two reasons for this unanticipated cultural difference, which we elaborate on in the discussion below.

Next, we calculated the difference between ratings of self and the fictitious other, and the difference score was operationalized as the “Better than a Random Other Effect” (BROE). That is, how participants compare themselves to a random other should provide evidence for motivations to view oneself positively, but, at the same time, circumvent the problems in comparing singular versus distributive targets. If the self is rated more positively than a random other this is evidence for self-enhancement, whereas if the self is rated less positively than a random other this is evidence for self-criticism.
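The three difference scores defined above can be sketched in Python as follows. The function names and toy ratings are hypothetical and serve only to illustrate the arithmetic. Because the BAE is self minus most others, the EBTA effect is the fictitious other minus most others, and the BROE is self minus the fictitious other, any participant with complete data satisfies BAE = BROE + EBTA. The group means reported in this section follow this identity approximately (e.g., for Japanese, -.32 + .65 = .33 against a BAE of .34), with small discrepancies to be expected from missing values and rounding.

```python
from statistics import mean


def composite(ratings):
    """Average one participant's ratings across the attributes."""
    return mean(ratings)


def difference_scores(self_ratings, other_ratings, most_others_ratings):
    """Return the BAE, EBTA, and BROE difference scores for one participant."""
    s = composite(self_ratings)         # self
    o = composite(other_ratings)        # fictitious other (Kate / Yumiko)
    m = composite(most_others_ratings)  # most other students
    return {
        "BAE": s - m,   # self vs. most others
        "EBTA": o - m,  # fictitious other vs. most others
        "BROE": s - o,  # self vs. fictitious other
    }


# Hypothetical 1-7 ratings on three attributes (not study data)
scores = difference_scores([5, 6, 4], [5, 5, 5], [4, 5, 4])
print(scores)
```

In this toy case the participant appears better than average only because the fictitious other is also rated above average; the BROE is zero.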

An ANOVA of the BROE revealed a significant cultural difference, F(2, 113) = 8.69, p < .001. Post-hoc comparisons revealed that the BROE was significantly larger for European-Canadians (M = .41, SD = .72) than for Japanese (M = -.32, SD = .74). Asian-Canadians (M = .08, SD = .71) fell nonsignificantly in between. T-tests revealed that the BROE was significantly positive for European-Canadians, t(38) = 3.49, p < .001, indicating a self-enhancing bias, and significantly negative for Japanese, t(30) = -2.40, p < .05, indicating a self-critical bias. Asian-Canadians showed a nonsignificant trend for self-enhancement, t(46) < 1.
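The within-culture tests above ask whether a mean difference score departs from zero. A minimal sketch of that computation from summary statistics is given below. The function name `one_sample_t` is hypothetical, and the sample size of 31 for the Japanese group is inferred from the reported 30 degrees of freedom. Because the inputs are the rounded means and standard deviations printed in the text, the result should match the reported statistic only approximately.

```python
from math import sqrt


def one_sample_t(mean_diff, sd, n):
    """t statistic and degrees of freedom for H0: mean difference score = 0."""
    if n < 2 or sd <= 0:
        raise ValueError("need n >= 2 and a positive standard deviation")
    standard_error = sd / sqrt(n)
    return mean_diff / standard_error, n - 1


# Japanese BROE from the text: M = -.32, SD = .74; n = 31 is an assumption
t, df = one_sample_t(-0.32, 0.74, 31)
print(f"t({df}) = {t:.2f}")
```

With these rounded inputs the statistic comes out at roughly -2.4, in line with the reported t(30) = -2.40.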

In sum, members of all cultural groups showed a significant or nearly significant BAE. However, when participants instead compared themselves to a random other, Japanese were self-critical whereas European-Canadians were still self-enhancing. This pattern suggests that the BAE found in other studies with Japanese might not be due to self-enhancing motivations but to the EBTA effect.

An unexpected finding of our analysis was that the three cultural groups differed in the magnitude of the EBTA effect. We suspect that this difference may have occurred for two reasons. First, prior research on the EBTA effect suggests a distinction between the EBTA effect as captured by an indirect comparison method (i.e., taking the difference between two separate ratings of average and singular targets), which we used in Study 1, and by a direct comparison method (i.e., asking participants to directly compare the singular target to average). This line of research, which has been conducted exclusively among Westerners, has reported consistent evidence of the EBTA effect with the direct comparison method but not with the indirect comparison method (see Giladi & Klar, 2002; Klar & Giladi, 1997). Hence, the absence of the EBTA effect among European-Canadians might be due to our use of the indirect comparison method. To address this possibility, Study 2 used the direct comparison method of assessing the EBTA effect.

Second, we also suspect that the observed cultural difference in the EBTA effect is due to an interaction between the different self-evaluative motivations of the three samples and the order in which participants evaluated the different targets. That is, the Canadians rated Kate immediately after they rated themselves, and their self-enhancing motivations would suggest that they rate Kate more negatively than themselves to create a favorable contrast. This motivation might have resulted in Kate being rated more negatively than she would have been if participants had not first rated themselves, which would have served to decrease the magnitude of the EBTA effect. Likewise, Japanese self-critical motivations suggest that they would rate Yumiko more positively than themselves to create an upward social comparison, and in so doing, lead to a more favorable rating of Yumiko, and an enhanced EBTA effect. Similarly, the weak self-enhancing motivations of the Asian-Canadians would predict a result in between. We speculate that if participants had rated the random other before they rated themselves, we would not have found a cultural difference in the magnitude of the EBTA effect. We address this point in Study 2.

