I recently finished reading Dr. Robert Cialdini’s Influence: The Psychology of Persuasion. In the section discussing the herd mentality, he brings up a poignant statistic:
“After a suicide has made front-page news, airplanes – private planes, corporate jets, commercial airlines – begin falling out of the sky at an alarming rate…immediately following certain kinds of highly publicized suicide stories, the number of people who die in commercial-airline crashes increases by 1,000 percent! Even more alarming: The increase is not limited to airplane deaths. The number of automobile fatalities shoots up as well”
Initially, this relationship was puzzling, but after a brief explanation the sad reality seemed clear.
He begins by raising possible explanations for this odd set of seemingly related events. One possibility is that the suicide-prone are unable to grapple with stressful public events and decide that the best coping mechanism is, sadly, to end it all. Another suggestion is that the front-page suicides involve well-known and respected public figures, whose deaths leave a few people so grief-stricken that they become careless around cars and planes. Although this last theory can account for the connection between suicides and automotive wrecks, it cannot explain why stories of suicide victims who die alone are linked to single-fatality wrecks while public suicide/murder combinations produce an increase in multiple-fatality wrecks only. It must be the case, he concludes, that the public is falling prey to Cialdini’s “informational social influence.” This heart-wringing example clearly supports Cialdini’s argument as he begins this section of his book, but this artistically-derived, and apparently statistically-proved, connection reduces our complex environment to a few simple correlations – I am skeptical to say the least.
Note: I have recently become skeptical of the sloppy use of statistical assumptions in the social sciences. I have seen too many instances of poor statistical reasoning in my own interactions with friends/colleagues/classmates, in the media and elsewhere. I will be doing some reading over the next month investigating the assumptions surrounding order/chaos in nature. In the words of Francis Bacon, “Beware the fallacies into which undisciplined thinkers most easily fall – they are the real distorting prisms of human nature” (i.e. I will be reading some Nassim Taleb hehe). I enjoy systematically pulling information out of data that would otherwise be unrecognizable, but Cialdini’s habit of drawing such vast conclusions to support his arguments is an artform that perturbs me.
A little over a week ago, I met with a small group of Toronto data-minded folks who were interested in using the open data made available by the city of Toronto to search for insights into how/why Toronto has been spending. With the recent issues in Detroit, it sounded like an interesting way to spend a Sunday afternoon.
I first heard about this mini-hackathon on the DataTO Google Group, which I now peruse weekly for new updates. I was able to make the second day of hacking and it proved to be very interesting. While I was not familiar with the data sets prior to meeting, it did not take long to understand our meeting’s goals. The one I found most interesting was this: why was Toronto budgeting more money to certain areas of the city and not to others? The most interesting data set for investigating this question was the huge XML data set archiving every meeting between a lobbyist and councilman (I ended up spending about a half hour teaching myself how to access XML files). Every meeting is documented in obsessive detail – Who is meeting? What is the meeting about? When did they meet? Who is the lobbyist representing? We created a “word cloud” using the information supplied about why the meeting was being held.
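For anyone curious what the counting step behind a word cloud might look like, here is a minimal sketch using only Python's standard library. It is not the code we wrote that afternoon, and the XML layout is an assumption: the `meeting`, `lobbyist` and `subject` tags, the sample records and the stopword list are all made up for illustration, so the real lobbyist registry will need different tag names.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample records; the real data set's tags and fields differ.
SAMPLE_XML = """<meetings>
  <meeting>
    <lobbyist>A. Example</lobbyist>
    <subject>Zoning approval for condo development</subject>
  </meeting>
  <meeting>
    <lobbyist>B. Example</lobbyist>
    <subject>Transit funding and zoning</subject>
  </meeting>
  <meeting>
    <lobbyist>C. Example</lobbyist>
    <subject>Funding for the waterfront</subject>
  </meeting>
</meetings>"""

# A tiny, hand-picked stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "and", "for", "in", "of", "the", "to"}


def subject_word_counts(xml_text, tag="subject"):
    """Count the words found in every <tag> element of the XML text."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for node in root.iter(tag):
        # Lowercase the text and keep runs of letters only.
        words = re.findall(r"[a-z]+", (node.text or "").lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


counts = subject_word_counts(SAMPLE_XML)
print(counts.most_common(3))
```

A word cloud is then just these counts drawn with font size proportional to frequency.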
I would like to eventually see a connection drawn between the number of times a lobbyist meets with a councilman representing a certain area and the amount of money eventually budgeted to this area. How much of an effect do lobbyists in Toronto have on city spending?
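A first pass at that question could be a plain Pearson correlation between meeting counts and budget per ward. The sketch below is only that, a sketch: the ward names and every number in it are invented, and `pearson` is a hand-rolled helper rather than anything from the hackathon. Even a strong correlation here would say nothing about cause, which is exactly the kind of leap I complain about above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


# Made-up numbers for illustration only: lobbyist meetings per ward
# and budget allocated per ward (in millions of dollars).
meetings = {"Ward A": 12, "Ward B": 30, "Ward C": 7, "Ward D": 21}
budget = {"Ward A": 1.5, "Ward B": 3.2, "Ward C": 1.1, "Ward D": 2.0}

wards = sorted(meetings)
r = pearson([meetings[w] for w in wards], [budget[w] for w in wards])
print(f"r = {r:.2f}")
```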
We looked at some other data sets and created a couple of other visualizations that helped make the available data more accessible/comprehensible. For example, the following treemap was created using Tableau (a program which I am not convinced is very useful for my purposes).
The plan is to meet for another weekend soon and try to draw some insights from the data sets we have begun to grasp. The meeting was organized by Gabe Sawhney – check out his site.
Within the past year, my hometown of Toronto posted over 100 open data sets online in an effort to be more “accountable and transparent.” In light of the data trend, they’ve even initiated the use of #dataeh on Twitter to spark interest. I am ecstatic that Toronto has joined other major cities with this release of city data. While most of the data sets are uninspiring, there are a select few that I have been playing with – the first of which contains the number of licensed dogs/cats in each of the GTA’s forward sortation areas (the area that the first three letters of a postal code encompass). Below I have plotted the pet densities relative to the number of households in the given forward sortation area.
Each dot represents a different forward sortation area. A dot of size 0.06 means that there are about 6 dogs/cats per 100 households in the given region.
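The dot size is just licensed pets divided by households. A minimal sketch of that arithmetic follows; the counts are invented for illustration and are not taken from the City of Toronto data, and `pet_density` is a hypothetical helper name.

```python
def pet_density(licensed_pets, households):
    """Licensed pets per household in one forward sortation area."""
    if households <= 0:
        raise ValueError("households must be positive")
    return licensed_pets / households


# Invented (licensed pets, households) pairs keyed by FSA, not real figures.
fsa_counts = {
    "M5M": (600, 10000),
    "M6P": (450, 9000),
}

densities = {
    fsa: pet_density(pets, households)
    for fsa, (pets, households) in fsa_counts.items()
}

for fsa, density in sorted(densities.items()):
    # A density of 0.06 reads as about 6 pets per 100 households.
    print(f"{fsa}: {density:.2f} ({density * 100:.0f} pets per 100 households)")
```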
It seems that there is a ring separating the downtown core and the suburbs to the north and east (North York and Scarborough) where you are most likely to find a household with a pet. While it makes sense that the folks downtown may not have space for a dog or cat, I was expecting those away from the city to be most likely to have a pet. Another thing to note is that the residents in midtown just south of the 401 near Edwards Gardens are total dog people – this is easily the largest difference between the two maps. I am a dog person myself and am representative of my FSA (M5M) – are you? Is your cat a minority in his or her neighbourhood? These are important questions to ponder. Maybe your dog/cat deserves better. Perhaps it’s time for a move to the High Park area where cats have equal representation.
Note: Cat density is probably the closest illustration we will get to “hipster density” in Toronto – hello High Park hipsters :)
Contains public sector Datasets made available under the City of Toronto’s Open Data Licence v2.0.
If you live to be one hundred, you’ve got it made. Very few people die past that age.