How to recover the true meaning of messages exchanged in the laboratory is an important issue for experimental research. The present study investigates, by experimentally comparing self- and third-party evaluations, to what extent self-evaluations by message receivers can be relied on. After a standard public-good game, subjects receive a free-form written message from their counterparts evaluating their decision and self-evaluate its content. Third-party evaluators also evaluate the content independently. A comparison shows that a significant proportion of the two evaluations agree. Firm evidence of a self-serving bias cannot be found.
Non-restricted communication plays an important role in economic decision-making. The experimental literature continues to accumulate evidence that subjects' behavior differs significantly depending on whether they are allowed to send messages to each other (e.g., Cooper and Kagel).
Message receivers' self-reports look sound, but they may be subject to a severe self-serving bias (e.g., Babcock et al.).
However, very little is known about the extent to which subjects' self-evaluations and third-party evaluations agree. This study provides evidence on this question. Using the message exchange experiment described below, subjects' self-evaluations of received messages are compared with third-party evaluations of the same messages. A significant proportion of the two evaluations agree, suggesting that the subjects' self-evaluations are, at least to some extent, reliable. Firm evidence of a self-serving bias cannot be found.
The message exchange experiment comprises two stages. In the first stage, paired subjects play a standard public-good game: each simultaneously decides how much of a 20-unit endowment to invest in the public good. Zero investment is the dominant strategy, while investing the whole endowment leads to a Pareto-efficient allocation (see the appendix for a detailed description of the voluntary contribution mechanism used in the experiment). In the second stage, after their partners' decisions have been revealed, subjects write a free-form message evaluating their partners' contribution and send it to them. The message is typed on a keyboard rather than handwritten. After the subjects have sent their messages, each message is displayed on the partner's computer screen. After confirming its content, the receiving subject classifies the message under one of three evaluation indexes: positive, neutral, or negative.
The experiment was conducted at Osaka University. Twenty subjects participated in each of two sessions. The experiment required approximately one hour, and the average payoff per subject was $23.61.
After the message exchange experiment, an additional 12 students were employed as third-party evaluators. After a detailed description of the message exchange experiment, they simultaneously and independently classified the messages actually written in the experiment, according to their content, under the same three evaluation indexes. Among the 12 evaluators' decisions on each message, the most popular one was adopted as the third-party evaluation.
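The adoption rule for the third-party evaluation amounts to a simple plurality vote over the evaluators' labels. The sketch below is illustrative only (the function name and data layout are our own, not the software used in the experiment); note that the paper does not specify how ties were broken, so the ordering of `Counter.most_common` is an arbitrary choice here.

```python
from collections import Counter

def adopt_third_party_evaluation(votes):
    """Adopt the most popular index among the evaluators' votes.

    `votes` is a list of "positive"/"neutral"/"negative" labels, one per
    evaluator (12 in the experiment). Returns the winning label and the
    number of votes in its favor, which is later used as a regressor.
    """
    label, n = Counter(votes).most_common(1)[0]
    return label, n

# Example: 7 of 12 evaluators read the message as positive.
votes = ["positive"] * 7 + ["neutral"] * 4 + ["negative"] * 1
adopt_third_party_evaluation(votes)  # -> ("positive", 7)
```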
Comparing self- and third-party evaluations of a message, we define a self-serving bias as follows.
Definition 1. We say there exists a self-serving bias if (i) the self-evaluation of a message is neutral or positive but the third-party evaluation of it is negative or (ii) the self-evaluation of a message is positive but the third-party evaluation of it is neutral.
In other words, if the receiver of a message interprets it more positively than third-party evaluators do, a self-serving bias is indicated. Of course, we can also consider the opposite bias, a sort of self-discipline bias.
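Definition 1 and its mirror image can be restated compactly: ordering the three indexes negative < neutral < positive, a self-serving bias is exactly a self-evaluation strictly above the third-party evaluation, and a self-discipline bias one strictly below. The following sketch encodes that reading (function and variable names are our own, not from the paper):

```python
# Order the three evaluation indexes from most negative to most positive.
ORDER = {"negative": 0, "neutral": 1, "positive": 2}

def classify_pair(self_eval: str, third_party_eval: str) -> str:
    """Classify a (self, third-party) evaluation pair per Definition 1."""
    s, t = ORDER[self_eval], ORDER[third_party_eval]
    if s > t:
        return "self-serving"     # receiver reads the message more positively
    if s < t:
        return "self-discipline"  # receiver reads the message more negatively
    return "agreed"

classify_pair("positive", "neutral")   # -> "self-serving"
classify_pair("neutral", "neutral")    # -> "agreed"
classify_pair("negative", "positive")  # -> "self-discipline"
```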
This nonlinear relationship seemingly implies that the messages are distorted by their receivers, but this is not true.
In addition to the direct comparison of self- and third-party evaluations presented above, a further investigation was conducted on the agreement between the two evaluations. In the analysis so far, the most popular evaluation among third-party evaluators was adopted as the average opinion, and the number of votes in its favor was neglected. Here, we investigate the relationship between the number of votes for the most popular evaluation and the probability that both evaluations agree. Specifically, the following probit model was estimated:

Pr(Agree_ij = 1) = Φ(β0 + β1·Votes_ij + β2·PosDiff_ij + β3·NegDiff_ij),

where the dependent variable Agree_ij is a dummy that equals 1 if subject i's self-evaluation of a message received from subject j agrees with the third-party evaluation and 0 otherwise. The independent variables are Votes_ij, the number of votes for the most popular evaluation of the message by third-party evaluators, and PosDiff_ij and NegDiff_ij, the absolute positive and negative differences between the contributions of subjects i and j.
The results are summarized in the table below. First, the estimated coefficient for the number of votes was a small positive value and not significant (p = 0.780).
Second, although the absolute positive and negative differences between contributions were not significant at the 10% level (p = 0.196 and 0.206, respectively), the estimated coefficients for both were small positive values. Some subjects and third-party evaluators might use this information, in addition to the content of a message, to evaluate it.
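To see what the point estimates imply, the fitted agreement probability can be computed directly from the probit index. The coefficient values below are taken from the estimation table; the function names are our own, and the exercise is purely illustrative given that none of the coefficients is statistically significant.

```python
from math import erf, sqrt

def norm_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Point estimates from the probit table: constant, number of votes,
# absolute positive difference, absolute negative difference.
B0, B_VOTES, B_POS, B_NEG = -0.122, 0.042, 0.118, 0.111

def fitted_agreement_prob(votes: int, pos_diff: float, neg_diff: float) -> float:
    """Fitted probability that self- and third-party evaluations agree."""
    return norm_cdf(B0 + B_VOTES * votes + B_POS * pos_diff + B_NEG * neg_diff)

# A unanimous third-party vote (12 of 12) raises the fitted probability
# only modestly relative to a bare majority (7 of 12):
fitted_agreement_prob(12, 0, 0) - fitted_agreement_prob(7, 0, 0)  # roughly 0.08
```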
|  |  | Third-party evaluation |  |  |
|---|---|---|---|---|
|  |  | Positive | Neutral | Negative |
| Self-evaluation | Positive | 8 | (6) | (0) |
|  | Neutral | [ ] | 14 | (1) |
|  | Negative | [ ] | [ ] | 8 |

Note: The data in parentheses, the diagonal entries (underlined in the original), and the data in brackets represent self-serving evaluations, agreed evaluations, and self-discipline evaluations, respectively.

| Independent variable | Coefficient |
|---|---|
| Constant | −0.122 (0.937) |
| Number of votes | 0.042 (0.780) |
| Absolute positive difference | 0.118 (0.196) |
| Absolute negative difference | 0.111 (0.206) |

Notes: The dependent variable equals 1 if a subject's self-evaluation agrees with the third-party evaluation, and 0 otherwise. The numbers in parentheses represent p-values.

The present study provides evidence on the extent to which subjects' self-evaluations of received messages can be relied on. The experimental data confirm that (i) the subjects' self-serving bias is not large and (ii) their self-evaluations accord well with third-party evaluations, even when a message is relatively difficult for third-party evaluators to judge. A positive interpretation of these results implies that experimental researchers can treat subjects' self-evaluations during an experiment as reliable data for analytical purposes, at least to some extent.
However, although the number of observations is small, some subjects interpreted a message more positively (but not more negatively) than third-party evaluators did. When and how often such cheating occurs is still an open question, left for future research.
Finally, a limitation of this study should be mentioned. The discussion thus far implicitly assumes that third-party evaluators evaluate messages objectively and neutrally, at least to some extent. However, once their psychological factors as human beings are taken into account, this may not always be true. For example, many studies point out that people sometimes behave spitefully; that is, they intend their behavior to make others suffer monetary or nonmonetary losses (e.g., Jensen).
Special thanks are due to Yuki Hamada and Keiko Takaoka, who helped conduct the experiment. This research was supported by Grants-in-Aid for JSPS Fellows 211071 and 231657 from the Japan Society for the Promotion of Science.
Kumakawa, T. (2015) An Experimental Comparison between Self- and Third-Party Evaluations. Theoretical Economics Letters, 5, 453-457. doi: 10.4236/tel.2015.54053
The experiment uses the following voluntary contribution mechanism. There are two subjects, a and b, with subject i (=a, b) having wi units of an endowment of a private good. Each subject faces a decision regarding splitting wi between his or her own consumption of the private good (xi) and investment (yi) in the public good (y). From the investment, each subject enjoys y = ya + yb; that is, the level of the public good is the sum of the investments made by the two subjects. Therefore, each subject’s decision problem is to maximize his or her own payoff ui (xi, y), subject to xi + yi = wi. All subjects have the same payoff function, specified as follows:
ui(xi, y) = xi + a·y,

where (wa, wb) = (20, 20) and a = 0.7, the latter of which is the marginal per-capita return from an investment in the public good.
Within these parameters, making no investment in the public good (i.e., complete free-riding) is the dominant strategy for each subject in the one-shot game. Accordingly, the level of the public good is 0 in the dominant-strategy equilibrium. By contrast, the aggregate payoff of the two subjects is maximized when each subject invests all 20 units of his or her endowment (i.e., full cooperation).
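The incentive structure just described can be checked numerically. The sketch below is a minimal illustration of the linear payoff function with the stated parameters (w = 20, a = 0.7); the variable names are ours, not the experimental software's.

```python
A = 0.7   # marginal per-capita return from the public good
W = 20.0  # endowment of the private good

def payoff(own_invest: float, other_invest: float) -> float:
    """u_i(x_i, y) = x_i + a*y, with x_i = w - y_i and y = y_a + y_b."""
    return (W - own_invest) + A * (own_invest + other_invest)

payoff(0, 0)    # -> 20.0  dominant-strategy equilibrium (mutual free-riding)
payoff(20, 20)  # -> 28.0  full cooperation: both subjects are better off
payoff(0, 20)   # -> 34.0  but unilateral free-riding pays even more
payoff(20, 0)   # -> 14.0  ...and unilateral cooperation pays less
```

Because each invested unit costs a subject 1 but returns only a = 0.7 to him or her, zero investment is dominant; because it returns 2 × 0.7 = 1.4 to the pair, full investment maximizes the aggregate payoff.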