Psychohistory & Big Data


Isaac Asimov introduced the fictional scientific field of psychohistory in his Foundation universe. In this science fiction setting, this science could predict the future by analyzing data and making inductive inferences from this data using various algorithms and formulas. The predictions resulting from the science are not about specific individuals, but rather about broad events. For example, the science could predict the fall of the Empire, but it could not be used to predict which specific person would be the emperor at that time.

Not surprisingly, real thinkers have long striven to make such predictions and have had some success with statistical predictions involving large numbers of people. For example, the number of traffic accidents that will occur in a year can be predicted with a fair degree of accuracy, as can the number of births. However, predictions of the sort made in the Foundation series remain beyond the reach of the current social sciences, though this might change.
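To see why counts like these are easier to predict for large groups, here is a minimal sketch. It assumes, purely for illustration, that each person independently has the same small chance of an event in a year (a simple binomial model). The function name and the 1% figure are my own placeholders, not real accident data.

```python
import math

def relative_spread(population: int, p_event: float) -> float:
    """Standard deviation of the event count divided by its expected value,
    under the assumed binomial model (independent people, same probability)."""
    expected = population * p_event
    std_dev = math.sqrt(population * p_event * (1.0 - p_event))
    return std_dev / expected

# Hypothetical 1% yearly chance per person; only the trend matters here.
for population in (100, 10_000, 1_000_000):
    print(population, relative_spread(population, 0.01))
```

Under these assumptions the relative spread shrinks roughly as one over the square root of the expected count, so the total for a million people is far more predictable, proportionally, than the total for a hundred.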

In many ways, psychohistory would work like weather prediction: data needs to be collected, analyzed, and used to create mathematical models. Ideally, the model would be a perfect duplicate of reality, and time could be accelerated in the model to see what will happen. Of course, making such a model is rather challenging.

One major restrictive factor has been data. After all, the ideal would be a perfect reconstruction of the world, and to the degree that the available data falls short, the model becomes less accurate.

While humans have been gathering and storing information since the advent of writing, we are currently gathering and storing more information than ever before. In fact, gathering, storing, and analyzing data is now a standard business practice that goes by the name “Big Data.” Google was one of the pioneers of modern Big Data, but other companies and organizations have gotten into the game. Some are involved because it is an industry worth billions, while others are involved for other reasons (such as law enforcement). In any case, significant effort is being expended to gather data that would be useful in predicting human behavior, whether the goal is to sell more baby products or to fight terrorism. People are, of course, contributing to this process by handing over massive amounts of data through social networking sites and in other ways, such as trading private information for “free” stuff. As such, there is now a massive quantity of data that would be very useful in modeling the future.

The data will, of course, always be less than complete. In addition to the practical limits, there is the problem that a perfect model would require at least “limited” omniscience: knowing everything that is and was. (Unlimited omniscience would also include knowing what will be, assuming that can be known.) Given human limitations, we will never have such complete information. As such, epistemic limits will prevent a perfect model: there will presumably always be past things that we do not know (and perhaps things that are unknowable), and these will not be in the data.

But, perhaps there is a way around this. If a suitably awesome machine could be built, perhaps it could predict everything from a single truth—a Cartesian machine of sorts. This leads to a second restrictive element.

A second restrictive factor has been a matter of logic. To be specific, there is the problem of creating the “software” to analyze the massive amounts of data so as to make predictions. Much of this involves inductive reasoning. After all, the goal is to make an inference from what is known (the sample) to what is not known (the target). This sort of reasoning is, of course, essentially philosophical. As such, it is hardly surprising that Leibniz was one of the first to explicitly propose creating a model of reality using symbols. Hobbes also believed that the social sciences could be “real” sciences and took geometry as his model.

While the “software” is still not quite up to psychohistory standards, there have been some impressive results in the business world in the field of predictive analysis. Of course, some of these successes have raised concerns, such as Target’s infamous use of such methods to predict pregnancies and then market to women who were statistically likely to be pregnant based on their buying behavior.

As might be imagined, metaphysics becomes a factor with regard to predictive software. One important matter is whether or not humans have free will. After all, if humans do have free will in the classic sense, then predicting human behavior will always be limited by that factor. Of course, it can be argued that even if people do have that mysterious free will, they still behave in ways that are subject to statistical analysis. So, X% of people will freely do Y, while Z% of people will freely not do Y. Though they are all free, the general patterns of behavior would remain predictable. After all, we already make effective statistical predictions, and if these are compatible with our (alleged) free will, then it seems reasonable that the same would apply to other large-scale predictions as well. As such, psychohistory would be consistent with free will. That said, free will could be a factor that “breaks” some predictions, perhaps in very important ways. The “breakage” caused by free will would seem to depend on how much impact individual choice has on the behavior of the whole.
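The claim that individually free choices can still add up to a predictable pattern can be illustrated with a toy simulation. This is only a sketch: it stands in for each person's choice with an independent random draw, which is a modeling assumption and not a claim about what free will is. The function name, the 30% figure, and the group sizes are all hypothetical.

```python
import random

def fraction_choosing_y(population: int, p_choose_y: float, rng: random.Random) -> float:
    """Share of a simulated population that ends up doing Y.

    Each choice is an independent draw (an assumption): no single choice
    can be known in advance, only its probability.
    """
    choices = [rng.random() < p_choose_y for _ in range(population)]
    return sum(choices) / population

rng = random.Random(42)  # fixed seed so repeated runs give the same numbers

# Five repeated "worlds" for a small group and for a large one.
small = [fraction_choosing_y(10, 0.3, rng) for _ in range(5)]
large = [fraction_choosing_y(100_000, 0.3, rng) for _ in range(5)]

print("groups of 10:     ", small)
print("groups of 100,000:", large)
```

In runs like this the small groups swing widely while the large groups stay close to 30%. The sketch leaves out the “breakage” case, since it gives no individual any influence over the others.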

A second important matter is, obviously enough, whether reality is determined or not. If we live in a deterministic world, this would seem to make definitive predictions easier (if that even makes sense to say in a deterministic universe). After all, there would be no random chance or free will to complicate matters. Of course, even if we live in a random universe, predictions would still be possible. They would lack the certainty that would be theoretically possible in a deterministic universe, but such is life in a random universe.

A third important matter is whether or not reality can be adequately modeled. This involves concerns about the nature of reality as well as the capability of humans to develop a means of modeling reality. It seems reasonable to believe that our models will always fall short of reality, thus ensuring that predictions will always potentially be in error.

A third restrictive factor is processing power. Before computers, data analysis was done by humans, and this placed a rather serious limit on the volume of data processed and the speed at which it could be done. While modern computers lack human intelligence, they are well suited to data analysis, at least once they have been properly programmed by humans. While the industry is starting to run into the limits imposed by physics when it comes to improvements in processors, creating massive networks has provided a means to work around this, at least for a while.

Of course, it is probably impossible to build a machine with enough processing power to recreate the world, even in a virtual way and even assuming that the data is complete and completely accurate. As such, this will also limit the efficacy of predictions.

Perhaps someday we will be able to predict the future so as to know whether or not we need to wear shades.

