informational insights from everyday decision-making

Persons tend to prefer what is familiar.  Political candidates, for example, heavily advertise themselves with signs and bumper stickers that typically include just the candidate’s name and office sought.  While voters find this same information on the ballot when they go to vote, repeated exposure to a candidate’s name evidently induces voters to prefer that candidate.  Brand advertising, which has been highly successful across a wide variety of media ecologies, is oriented toward the same effect.

Preferring the familiar favors survival in a wide range of actual human environments.  Because persons recognize dangers over time and avoid them, familiar surroundings are less likely to be dangerous than unfamiliar surroundings.  Persons who eat familiar foods are less likely to suffer poisoning than persons who eat unrecognized substances.  Familiar persons are more likely to offer help than are strangers.  Preference for the familiar is a simple decision rule that makes sense from evolutionary and ecological perspectives.

Preferring the familiar can produce good decisions on contrived tasks not directly related to familiarity. For example, presented in the laboratory with pairs of cities and told to choose which city is larger, students more often chose as larger a recognized city that was paired with an unrecognized city. Because actual patterns of conversation and media content refer to larger cities more often than smaller cities, choosing the recognized city identifies the larger city with better than random odds. In fact, on the pairwise city-size decision task, American students correctly chose the larger city more often for German city pairs than for American city pairs.  The opposite was true for German students.  This surprising result indicates the merits of the recognition heuristic.  The recognition heuristic can be applied only to city pairs for which one city is recognized and the other isn’t.  City pairs from a foreign country provided more scope for the recognition heuristic, and the recognition heuristic produced better decisions than those made when information could be recalled about both cities.[1]
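The decision rule described above can be sketched in a few lines of Python.  This is only an illustration, not the procedure used in the cited study: the function names, the city names, and the population figures are all invented for the example.  The sketch returns the recognized city when exactly one of the pair is recognized, and otherwise reports that the heuristic does not apply.

```python
from itertools import combinations
from typing import Dict, Optional, Set, Tuple


def recognition_choice(city_a: str, city_b: str, recognized: Set[str]) -> Optional[str]:
    """Pick the recognized city as larger; return None if the heuristic is silent."""
    a_known = city_a in recognized
    b_known = city_b in recognized
    if a_known and not b_known:
        return city_a
    if b_known and not a_known:
        return city_b
    return None  # both or neither recognized: the heuristic cannot be applied


def accuracy_when_applicable(populations: Dict[str, int], recognized: Set[str]) -> Tuple[int, int]:
    """Count the pairs where the heuristic applies and how many it gets right."""
    applicable = 0
    correct = 0
    for a, b in combinations(populations, 2):
        pick = recognition_choice(a, b, recognized)
        if pick is None:
            continue
        applicable += 1
        larger = a if populations[a] > populations[b] else b
        if pick == larger:
            correct += 1
    return applicable, correct


# Hypothetical toy data: these names and populations are made up for illustration.
populations = {"Alphaburg": 900, "Betastadt": 600, "Gammadorf": 300, "Deltaheim": 100}
# Assumption: the decision-maker has heard only of the two largest cities.
recognized = {"Alphaburg", "Betastadt"}

applicable, correct = accuracy_when_applicable(populations, recognized)
print(f"heuristic applied to {applicable} pairs, correct on {correct}")
```

In this toy setup recognition tracks size perfectly, so the heuristic is right whenever it applies.  With real recognition data the link between recognition and size is only probabilistic, and the heuristic says nothing about pairs where both cities or neither city is recognized.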

Actual human decision processes point to important characteristics of practical decision logic. No formal decision logic can determine the scope of information that it considers. Every decision necessarily leaves some possible information unconsidered. An optimal decision is necessarily defined with respect to an assumed structure of information.  Recognition depends on biological capabilities, a wide range of life experiences, and non-problem-specific characteristics of the environment.  Recognition points to the huge scope of possibilities for useful information.[2]

More information, however, can make predictions less accurate.  In the real world, one does not know the data-generating process for the information under consideration.  Nor does one know whether that data-generating process is the same as the data-generating process relevant to the circumstances of the prediction.  Hence over-fitting and non-representative samples are always risks in real-world statistical applications.  More information can lead to a better estimate of the wrong data-generating process and hence worse predictions.[3]  The data-generating process for less information may be implicitly or explicitly better estimated and more consistent over time.  More information makes more known, but does not necessarily provide a better guide to the unknown.
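A small simulation can make this concrete.  The sketch below assumes a hypothetical data-generating process whose mean shifts partway through the sample; the shift size, sample sizes, window length, and function name are all choices made for illustration, not taken from any cited work.  It compares a forecast based on the full sample with a forecast based on only the most recent observations.

```python
import random
from statistics import mean


def simulate_shifted_series(n_old=100, n_new=20, old_mean=0.0, new_mean=5.0, seed=0):
    """Generate observations from a process whose mean shifts after n_old draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    old = [rng.gauss(old_mean, 1.0) for _ in range(n_old)]
    new = [rng.gauss(new_mean, 1.0) for _ in range(n_new)]
    return old + new


series = simulate_shifted_series()
true_next_mean = 5.0  # the process currently in force (assumed known only to the simulator)

# "More information": average over every observation, old regime included.
full_sample_forecast = mean(series)
# "Less information": average over only the last 10 observations.
recent_forecast = mean(series[-10:])

full_error = abs(full_sample_forecast - true_next_mean)
recent_error = abs(recent_forecast - true_next_mean)

print(f"full-sample forecast error: {full_error:.2f}")
print(f"recent-window forecast error: {recent_error:.2f}")
```

The full-sample average is a precise estimate of a blend of two regimes, which is the wrong target for predicting the next observation.  The short window is noisier but aimed at the right process.  In practice the forecaster does not know when, or whether, a shift occurred, so the choice of window is itself a judgment made under uncertainty.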

* * * * *

[1] Goldstein, Daniel G. and Gerd Gigerenzer, “Models of Ecological Rationality: The Recognition Heuristic,” Psychological Review v. 109, n. 1, pp. 75-90.

[2] Processing fluency at a lower level of sense than recognition is also important in decision-making.

[3] Thus, for example, in some situations the median, which uses only ordinal information, provides better predictions than the mean.  Gigerenzer, Gerd, “Why Heuristics Work,” Perspectives on Psychological Science v. 3, n. 1, pp. 20-29, provides a nice overview of how biology (“adaptive toolbox”) and real-world decision-making circumstances (ecological rationality) support fast and frugal heuristics.  Gigerenzer is an eminent academic and research leader in this field.  While I know much less, it seems to me that, in this short article, Gigerenzer doesn’t adequately distinguish between “irrelevant information (or ‘noise’)” and model mis-specification / structural change.  If one knows correctly the data-generating process, larger sample sizes typically serve well to increase prediction accuracy in the presence of noise.  That’s not true for a mis-specified model.  Moreover, there is no statistical test for the true data-generating process for data not yet known.

Update:  Section 2 of Gerd Gigerenzer and Henry Brighton, “Homo Heuristicus: Why Biased Minds Make Better Inferences,” Topics in Cognitive Science 1 (2009) 107–143,  provides a good discussion of model mis-specification.  It uses the terms bias, variance, and noise in a way that might be jarring for someone focused on textbook statistics.  Textbook statistics, however, typically do not adequately recognize the reality that the true data-generating process is always unknown. Moreover, in practical circumstances the law of large numbers confronts important limitations:

  • increasing the sample size is often costly or not feasible
  • a larger sample may create greater model mis-specification because different data-generating processes may apply to different subsets of the sample
  • a larger sample enables greater over-fitting and increases the importance of correct parametrization

On the other hand, big datasets and complex algorithms have been successful in practical domains.