Wednesday, 28 May 2014

Week 8 that should be in Week 7 (late) – Playing with statistics to prove any theory…

First, I would like to apologize to my fellow students: I mixed up my calendar and thought that my article was supposed to be posted in a different week…

Since the elections to the EU Parliament took place just a few days ago, I thought I would write a few words about lying with numbers. ;) I’m sure everyone has come across some bizarre “proof” that was based on completely irrelevant data.

Sometimes you look at a plot, and after a while you realize that what you see is different from what you initially thought… Look at “Gun deaths in Florida” – what can you see?

http://freethoughtblogs.com/pharyngula/2014/04/15/thats-a-terrible-chart/






Sometimes the numbers and plots are just fine, but we try to connect data that have no connection whatsoever. Check out this website: http://www.tylervigen.com/
Here are a couple of the charts that I find most surprising.


1. Number of people who drowned while in a swimming pool
correlates with
Power generated by nuclear power plants (US)
Correlation: 0.901
      

2. US spending on science, space, and technology
correlates with
Suicides by hanging, strangulation and suffocation

Correlation: 0.992






3. Divorce rate in Maine
correlates with
Per capita consumption of margarine (US)

Correlation: 0.993

It’s quite amazing what unusual correlations can be found, but what conclusions can we draw from them?
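One possible explanation for such high coefficients is that any two series that both drift steadily in one direction over the same years will correlate strongly, whatever they measure. Here is a minimal Python sketch of that idea. Everything in it is an assumption for illustration: the two series are made up (they are not the data from tylervigen.com), and `pearson`, `series_a` and `series_b` are hypothetical names.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(2014)  # fixed seed so the run is repeatable
years = range(11)          # eleven yearly observations

# Two made-up series, generated independently of each other.
# Each one is just "steady growth plus a little noise".
series_a = [10 + 2 * t + rng.uniform(-1, 1) for t in years]
series_b = [500 + 30 * t + rng.uniform(-15, 15) for t in years]

r_levels = pearson(series_a, series_b)

# Year-over-year changes remove the shared upward drift.
diff_a = [later - earlier for earlier, later in zip(series_a, series_a[1:])]
diff_b = [later - earlier for earlier, later in zip(series_b, series_b[1:])]
r_changes = pearson(diff_a, diff_b)

print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

With these settings the correlation of the levels comes out above 0.9, even though the two series were generated independently. Comparing year-over-year changes instead of raw levels is one common sanity check, because it removes the shared drift; there is no reason for that second number to be high.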

There’s a great article by Andrew Gelman, a professor of statistics at Columbia University –
“Lying with statistics”.
http://www.stat.columbia.edu/~gelman/bag-of-tricks/chap10.pdf
The article cites a newspaper report on studies that measured the reading skills of children from different countries. Prof. Gelman's point is that it is very hard to compare the language skills of people reading in different languages!


Just like comparing someone's manual abilities...


Comparing the abilities of totally different animals makes no sense at all.

1. What is wrong with those correlation charts? The data used here are true. ;)
2. Try to find and give us an example of an illogical/strange conclusion drawn from wrongly interpreted data.
3. What's your opinion on IQ tests? Can someone's intelligence be measured by solving a few tasks? I believe that those tests show the ability to "solve tests" and do not show someone's creativity.

26 comments:

  1. Great, finally an interesting text. In my opinion, this article deserves the award among those I have read on this blog. Apparently the saying "... the last shall be first ..." holds true. Moreover, despite its light form, the content is closely linked to scientific research. A recent article raised doubts about the usefulness of philosophy in scientific research, and this article presents an almost canonical problem of epistemology. After reading it, I encourage you to look at the page about David Hume and reflect on the similarity between the problem with these charts and the passage on the "analysis of the concept of causality."

    1. I will answer the question with a question, so as to drill down into the subject. How exactly do we know which things may be correlated, and how can we know that a comparison does not make sense? Or is it just our intuition?

    2. Set the pendulums of two clocks in motion so that both swing in the same direction. Suppose that someone, X, who does not know what a pendulum clock is or how it works, enters the room where the two clocks stand next to each other. The casual observer X might get the impression that the movement of one pendulum is caused by the movement of the other. He could gather data, such as the deflection of the two pendulums (for example, for five minutes), and plot the correlation. Then he could come to the conclusion that the strength of one pendulum's swing affects the strength of the other's.

    3. In my opinion, even the best IQ tests measure only the "level" of combinatorial ability. The measurement itself is only partly reliable; comparing measurements between people can reduce the reliability even further.

    Replies
    1. Thanks! I'm glad that you liked this topic. :)
      I had never heard of this historian (David Hume); his first work, A Treatise of Human Nature, looks connected to my blog post.

      1. That's a good question, and I'm wondering exactly what useful methods scientists use to find out when data are not only correlated but linked by cause and effect...

      2. That's a great and quite unusual example: a correlation between two events of the same type.

      3. That's what I think about those types of tests, and I'm wondering what can be done to make them more precise...

  2. I have mixed feelings about this article. On the one hand, people have been abusing statistics for their own purposes for quite a while now, and it's nothing new: interpreting data in a beneficial way, "proper" presentation, minor wording differences, and hundreds of other methods that don't really change the results but alter the perception.

    However, while it's easy to discredit statistical analysis as lies/misinformation - including the famous "lies, damned lies, and statistics" - there can be no denying that the raw data itself is extremely useful. One simply has to examine it before blindly accepting it as "truth", taking into account various political/marketing biases and other reasons for "creative interpretation".

    The examples in the text are rather extreme and obviously serve more as a joke than as a true example of "playing with" statistics. There is no subtle wording change used to prove a point, no specific metric, nothing: just two lines of data which merely happen to have a similar shape and thus "correlate".

    It would be like me "proving" that the world has become more dangerous in the last 50 years or so - as the date increases, so does the number of deaths... while conveniently ignoring the rising world population and number of births. Anyone who thought about it for more than two seconds would be able to call me out on this.

    As for IQ testing - the whole idea of IQ is just a very rough metric. Yes, the person with the higher number is "more intelligent" than the person with the lower number, but it's only a general indicator. They can be smarter in some areas while less proficient in others. Or maybe it has less to do with their intelligence and more with learning some "tricks" behind these tests.

    However, it still serves as a (very) rough means of measuring intelligence and, as such, does its job decently.

    Replies
    1. I gave only obvious examples of data just having a "similar shape", in the form of a joke, just to give a clear example. I hope everyone understood it the way you did.

      I think it happens quite often (especially in the "popular" media) that a journalist draws a conclusion on the basis of one chart without a "wider view" of the topic, and then we see information like what you posted - "the world became more dangerous...".

      I disagree on IQ tests. I believe that in some cases the results of an average IQ test won't even rank people in the correct order, and a quantitative comparison is totally impossible.

  3. This comment has been removed by the author.

  4. 1. There's nothing wrong with the charts - it's just that correlation does not equal causation ;)
    It would be a fun exercise to find made up reasons as to why there may be a very indirect, hidden link between the data anyway. For example, looking at the first chart, let's say that as nuclear power plants generate more energy, it gets warmer and because of that, people spend more time in swimming-pools to cool down - and that results in more deaths by drowning... :)

    2. Whenever I read a sensationalistic news article that starts with "Scientists have found that..." I often can't help but assume that the causality was the other way around. I know there are statistical methods for establishing which way the causality runs, and I know that news sites will pick the most surprising of these findings. But when it makes so much more sense to reverse the causality, I have trouble taking those articles seriously. I can't give any specific examples at the moment, because I tend to forget them right after I read them.

    3. I agree, I also think that it's possible to practice and learn how to solve these tests to get a better score. I'm not sure what they are supposed to measure.

    Replies
    1. 1. I guess that's an answer. :) BTW, finding those "hidden links" is typical for conspiracy theory enthusiasts.

      2. But if you find any in the future, it would be cool if you posted it :)

      3. Those are my thoughts too - you can train yourself to get a better score, and I don't think that correlates with an increase in one's IQ.

  5. 1. I do believe that out of the millions of things that we could correlate, we will often find some which are very similar just by chance. Sometimes, though, there is a hidden link that we cannot yet comprehend. Examples include what Wiktor said about rising temperatures causing more people to use their swimming pools, therefore increasing/decreasing the number of people drowning; or, since we spend more money on technology (which promotes remote contacts rather than physical ones), it could lead to people feeling alone and depressed, therefore causing suicides. I can't find a decent explanation for the last one though :) .

    2. I can't come up with one that concerned me personally, but here are some examples:

    Sleeping with one's shoes on is strongly correlated with waking up with a headache.
    Therefore, sleeping with one's shoes on causes headache.

    As ice cream sales increase, the rate of drowning deaths increases sharply.
    Therefore, ice cream consumption causes drowning.

    source: http://en.wikipedia.org/wiki/Correlation_does_not_imply_causation

    3. This is true: most of the IQ tests I have taken were pretty similar (the same tasks all over again), and you can memorize the general approach to solving each of them. But speaking of IQ, there are tests claiming they can also evaluate your emotional intelligence and other kinds. In my opinion it is a very rough representation of someone's intelligence, and it can sometimes be misleading.
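The first point above (with enough candidate pairs, some will match purely by chance) can be sketched in Python. This is only an illustration under stated assumptions: the 300 series are pure random noise with ten points each, and the names `pearson`, `N_SERIES` and `N_POINTS` are made up for the example.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(7)  # fixed seed so the run is repeatable
N_SERIES = 300          # how many unrelated "statistics" we collect
N_POINTS = 10           # ten yearly values per series

# Every series is independent Gaussian noise: there is nothing to find.
series = [[rng.gauss(0, 1) for _ in range(N_POINTS)] for _ in range(N_SERIES)]

best_r, best_pair, n_pairs = 0.0, None, 0
for i, j in combinations(range(N_SERIES), 2):
    n_pairs += 1
    r = pearson(series[i], series[j])
    if abs(r) > abs(best_r):
        best_r, best_pair = r, (i, j)

print(f"pairs compared: {n_pairs}")
print(f"strongest correlation: {best_r:.3f} between series {best_pair}")
```

Only one pair out of tens of thousands needs to line up by accident, and that is the pair that ends up on a chart. The search itself is what produces the "discovery".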

    Replies
    1. Drowning and ice cream - a great example of a "hidden cause". We have more people eating ice cream and more people drowning because it's summer - good one, and well timed (it's getting warm now :).

      3. In my opinion the main problem with IQ tests is that we want to understand and measure how a hammer (brain) works using a hammer (brain) - so it's difficult.

  6. 1. What is wrong with those correlation charts? The data used here are true. ;)
    I don't think that there is anything wrong with those correlation charts; I think that we can find more examples of that kind. In my opinion such examples are more fun than useful.

    2. Try to find and give us an example of an illogical/strange conclusion drawn from wrongly interpreted data.
    I can't find an example right now, but there are a lot of them. Most examples of that kind begin with "American scientists found out..." :)

    3. What's your opinion on IQ tests? Can someone's intelligence be measured by solving a few tasks? I believe that those tests show the ability to "solve tests" and do not show someone's creativity.

    I agree with you and Wiktor that it is possible to learn how to solve IQ tests, so that kind of test only shows the ability to "solve tests".

    Replies
    1. 1. You're right. There's nothing wrong, but sometimes we try to compare data that are totally unconnected - I believe it happens far too often. Of course my examples were a joke.

      3. Agree. :)

  7. 1. What is wrong with those correlation charts? The data used here are true. ;)
    There is nothing wrong with those correlation charts. The data used are probably correct, and the correlation is high. By mixing scales and conditions, anything can be proven.

    2. Try to find and give us an example of an illogical/strange conclusion drawn from wrongly interpreted data.
    The Polish budget for 2014?

    3. What's your opinion on IQ tests? Can someone's intelligence be measured by solving a few tasks? I believe that those tests show the ability to "solve tests" and do not show someone's creativity.

    For me, IQ tests are only useful for negative selection, and people who believe in the ability to measure uniqueness or creativity using these tests are simply naive.

    Replies
    1. 1. The point is - is it proven? And if it is, then what actually was proven here? Because in the charts that I showed - I think nothing at all.

      3. That's probably a good point, IQ tests as negative selection, maybe... but I'm afraid that it's possible to actually choose someone who is better at solving tests than at thinking, so I'm not that convinced after all.

  8. 1. As my classmates have already noticed, correlation does not imply causation. I think that it would not be too difficult to find similar-looking charts in a big database. It might be caused by the existence of some universal rules which work in different aspects of life. One such rule is the well-known Pareto principle.

    2. I will not present a case relating to correlation but something connected with interpreting data. I have a problem with posting a chart, so I will only describe the situation.
    There are many heated discussions in business TV programs. Presenters comment on the current situation using charts of indices (for instance, look at www.nasdaq.com ). The charts seem to show big changes in the indices. But looking at the scale of the Y-axis, anybody can notice that it does not start from zero. So these big changes are often less than one percent.
    Another important issue in interpreting data is the type of scale. Sometimes it is not linear but exponential or logarithmic, which can change everything.

    3. I know that this measure is not perfect, but a very relevant sentence (about measuring the sizes of computer programs) was said by Grzegorz Kędzierski during the last PhD workshop – “Using non-perfect methods is better than not using any methods”. Thus, until we discover better methods, let’s use IQ tests.
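The Y-axis effect described in point 2 above comes down to simple arithmetic, sketched here in Python. The index values and axis limits are hypothetical numbers chosen for illustration (they are not real quotes), and `visual_share` and `percent_change` are made-up helper names.

```python
def visual_share(start, end, axis_min, axis_max):
    """Fraction of the chart height covered by the move from start to end."""
    return abs(end - start) / (axis_max - axis_min)


def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100


# Hypothetical index values at the open and close of a trading day.
start, end = 4190.0, 4230.0

# The same move drawn on a truncated axis and on an axis starting at zero.
truncated = visual_share(start, end, axis_min=4180.0, axis_max=4240.0)
zero_based = visual_share(start, end, axis_min=0.0, axis_max=4240.0)
change = percent_change(start, end)

print(f"chart height used, truncated axis:  {truncated:.1%}")
print(f"chart height used, zero-based axis: {zero_based:.1%}")
print(f"actual change: {change:.2f}%")
```

On the truncated axis the move fills about two thirds of the chart, while the actual change is below one percent. On the zero-based axis the same move takes up less than one percent of the height, which is why such charts are rarely drawn from zero.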

    Replies
    1. 1. Wow - the "Pareto principle" looks like a very strong rule, and it's interesting in how many places it can be noticed. Thanks for pointing that out.

      2. Yes, that's a great point! Often we look at a chart and see extremely big changes on the plot, but after looking at the numbers we can see that those "big changes" are less than 0.001%. :) Stock charts are a good example, but in this case no one is trying to "cheat" us: if they showed us a chart scaled from zero, in most cases we would see a completely horizontal plot and no information could be taken out of it.
      The log scale is also a great example of many people not understanding the basics.

      3. Maybe so, but if a "not perfect method" gives you results opposite to "the reality", then it's no good. What I mean is: having a relatively big error in a measurement is OK as long as those measurements give you some reliable information. In the case of IQ tests it's hard to say whether the "measurement error" is bigger than the measurement itself. We just don't know that... the data used in IQ tests could potentially be totally uninformative...

  9. This comment has been removed by the author.

  10. 1. These correlations are a bit funny; I treat them as an attempt to summarize the facts with a "sense of humor". Of course I understand that the above correlations can prove something for some people, but not for me.
    2. Just type "funny or illogical correlations" into Google and switch to the images view.
    3. There are many opinions that IQ tests may fail to act as an accurate measure of "intelligence" in its broadest sense. IQ tests only examine particular areas embodied by the broadest notion of "intelligence", failing to account for certain areas which are also associated with "intelligence" such as creativity or emotional intelligence. This is a citation from Wikipedia. Of course there will always be criticism, but at the moment we don't have any other tests to check IQ.

    BTW, I have some free and good advice: start responding to the answers above, because even if you have the highest IQ in our group ;) you will screw up and get a 2, even with this second chance to publish the article after the deadline.

    Replies
    1. Of course, please treat this advice as an ironic reference to the correlations mentioned ;)

    2. 1. They were meant to be funny :)
      2. I was hoping for some funny examples...
      3. If we accept that IQ tests measure a narrow range of "intelligence", then I guess we can accept that they could be useful.

    3. In the meantime I read this article:
      http://www.aboutintelligence.co.uk/alternative-brain-tests-for-intelligence.html
      Maybe psychometric tests (aptitude tests) are a good alternative? Because there you can answer in many ways.

  11. 1. What is wrong with those correlation charts? The data used here are true. ;)
    I don't know. I have been trying to find an explanation for these correlations for some time, but it's hard to find anything relevant. In chart no. 1, which shows the number of people who drowned in a swimming pool and the power generated by nuclear power plants, I have no idea what this correlation is for. My intuition is that the high correlation is accidental. The same goes for the divorce rate in Maine and per capita consumption of margarine...
    The correlation of US spending on science with suicides by hanging is somehow self-explanatory: the more unsuccessful research, the more researchers commit suicide...

    2. Try to find and give us an example of an illogical/strange conclusion drawn from wrongly interpreted data.
    General decisions made on the basis of a small sample. For example, when a president is elected with very low voter turnout.

    3. What's your opinion on IQ tests? Can someone's intelligence be measured by solving a few tasks? I believe that those tests show the ability to "solve tests" and do not show someone's creativity.
    I agree with you, but I think it is possible, to some extent, to differentiate people based on that test. However, I am not familiar with such tests, so I have no specific opinion on the topic.

  12. 1. What is wrong with those correlation charts? The data used here are true. ;)
    I think the word we are looking for here is probability :) When we give ourselves a set big enough, for example the universe, we can prove almost anything with statistics.
    2. Try to find and give us an example of an illogical/strange conclusion drawn from wrongly interpreted data.
    Let's take for example the first chart: swimming pools don't have much to do with nuclear plants (besides the latter being cooled with water :) )
    3. What's your opinion on IQ tests? Can someone's intelligence be measured by solving a few tasks? I believe that those tests show the ability to "solve tests" and do not show someone's creativity.
    In my opinion most such tests can be learned in terms of methodology, so if somebody is interested in them and trains for them, he will get better scores, but it does not mean that he is smarter. Nevertheless those tests do somehow measure one's ability to solve advanced problems, so I would not take them in the very strict sense of one number being superior to another, but as a general "guide" to measuring somebody's "capabilities".

  13. I don’t have much experience with correlations. My colleagues have answered the question about wrongly interpreted data, so what can I say: I agree with them. About question 3, generally I agree with Artur Szymanski: it's rather about learning the methodology, not checking our IQ.


  14. The correlation shown on the charts is very high. It's hard for me to compare the results on a graph. The choice of data for a chart should be justified. An IQ test shows only part of my field of knowledge. You cannot determine the level of intelligence on the basis of an IQ test. Each person is very intelligent in their own field. As far as I know, there is no single test that can examine people's level of intelligence. Criticism of IQ started in 1990. I think all the advantages and disadvantages have been presented since then, so I will not repeat them.

  15. 1. What is wrong with those correlation charts? The data used here are true. ;)


    Correlations do not measure any real value; they are just a free interpretation.

    2. Try to find and give us an example of an illogical/strange conclusion drawn from wrongly interpreted data.

    When I drop two things of the same weight from a great height, the one with the larger surface reaches the ground later. One interpretation could be that a different weight distribution affects gravity, which is not true, because there is an additional factor we did not take into account: air resistance.

    3. What's your opinion on IQ tests? Can someone's intelligence be measured by solving a few tasks? I believe that those tests show the ability to "solve tests" and do not show someone's creativity.

    In my opinion, IQ tests do not measure anything. Intelligence defies systematic testing.

  16. 1. What is wrong with those correlation charts? The data used here are true. ;)

    In statistics, something can be true without depicting the real truth. Let’s look at simple measures such as the mean (average) vs. the median. Those numbers can vary significantly depending on the distribution, but all of them can be used, with some things left unsaid, to manipulate the audience.

    2. Try to find and give us an example of an illogical/strange conclusion drawn from wrongly interpreted data.

    The simple statistical values I mentioned in the first answer can cause trouble. If you know the average wage in some company, it does not mean that you know what amount you should ask for when trying to get a raise.

    3. What's your opinion on IQ tests? Can someone's intelligence be measured by solving a few tasks? I believe that those tests show the ability to "solve tests" and do not show someone's creativity.

    An IQ test is solely what it does: it answers the question of how good the participant is at responding to the questions on the form. All the assessments performed afterwards are, to some degree, just speculation.
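The mean vs. median point and the wage example above can be made concrete with a small Python sketch. The payroll is hypothetical (nine employees and one highly paid manager, numbers invented for illustration); only the standard `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical monthly wages: nine regular employees and one manager.
wages = [3000] * 9 + [30000]

average_wage = mean(wages)    # pulled upwards by the single high salary
median_wage = median(wages)   # the "middle" employee's wage

# Share of employees who earn less than the company average.
share_below_average = sum(w < average_wage for w in wages) / len(wages)

print(f"average wage: {average_wage}")
print(f"median wage:  {median_wage}")
print(f"employees below the average: {share_below_average:.0%}")
```

Both numbers are "true", but the average of 5700 describes nobody in this company: nine out of ten people earn 3000. Quoting only the one that suits the argument is exactly the kind of omission described above.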
