    Intuitively Obvious Is Usually Wrong

    Don Shaughnessy

It has been my experience that when I receive summarized data and there is an obviously right conclusion, I should worry. My rule is that when something is intuitively obvious, it is likely wrong. It usually pays to be a skeptic with summarized, and especially averaged, data. Here’s why.

People have intuitive belief systems that are incomplete, and the incompleteness is a source of error. Worse, the gap lies outside their knowledge, so they cannot analyze it effectively.

    For example, if I tell you that product X has an average approval rating of 7 out of 10, what does that mean to you?  Probably pretty good.  At least average satisfaction.  Okay to own it.

    Let’s see.

The population of all those who rated the item gives it an average rating of 7, but the 7 tells us nothing about how those ratings are distributed. We fill in that information by assuming the ratings are normally distributed (a bell curve). What if they are not? Suppose that out of 100 people, 70 rated it 10 and 30 rated it 0: a U-shaped curve. The average is still 7, but it means nothing. You would need to know the characteristics of each group of raters before you could decide whether the item is satisfactory in your context.

    Mistrust averages.
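A minimal sketch in Python, with invented ratings, makes the point: two populations can share the same average while telling opposite stories.

```python
# Two invented rating populations with the same mean but very different shapes.
bell_shaped = [5] * 5 + [6] * 25 + [7] * 40 + [8] * 25 + [9] * 5
u_shaped = [10] * 70 + [0] * 30  # 70 raters love it, 30 hate it

def mean(ratings):
    return sum(ratings) / len(ratings)

print(mean(bell_shaped))  # 7.0 -- most raters actually scored near 7
print(mean(u_shaped))     # 7.0 -- yet no rater scored anywhere near 7
```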

Using statistical information intuitively tends to create policy errors, for individuals and governments alike. It is remarkably common in social policy.

Suppose I tell you that the graduate school at the University of California, Berkeley discriminates against females. As proof, I offer the information that of 1,835 women who applied to graduate school, 30% were admitted, while in the same period, of 2,590 men who applied, 46% were admitted. Should the government intervene with quotas to make the acceptance rates more equal? Pretty clear, right? Assuming you agree with the intervention idea at all.

Actually, not so much. You do not have enough information to make the assessment. The part you are missing is the answer to a question: “To which programs did they apply?” Graduate school is an amalgam of many programs, and they do not all have the same characteristics. You assumed equality of the base information. When the breakdown is known, the answer becomes clearer.

Program      Males                      Females
             Applied  Admitted  Rate    Applied  Admitted  Rate
A                825       512   62%        108        89   82%
B                560       353   63%         25        17   68%
C                325       120   37%        593       202   34%
D                417       138   33%        375       131   35%
E                191        53   28%        393        94   24%
F                272        16    6%        341        24    7%
Total          2,590     1,192   46%      1,835       557   30%

Now we see that in four of the six programs, females were more likely to be admitted than males, and in the other two the rates were close. In every program where more males than females applied, the female acceptance rate was higher.

Here is where it gets interesting. For programs C, D, E, and F combined, 327 of 1,205 male applicants were admitted and 451 of 1,702 female applicants: roughly 27% in each case.

The key to the puzzle is in the relative number of applicants. In the programs with high acceptance rates, A and B, few females applied. In the programs with lower acceptance rates, female applicants outnumbered males.

The conclusion is not that the Berkeley graduate school discriminates against females, but rather that the programs females prefer at Berkeley have inherently lower acceptance rates. A quota system would not fix that. Expanding the facilities for programs C, D, E, and F might.
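This reversal, where a comparison that holds within every subgroup flips when the groups are pooled, is known as Simpson’s paradox, and it can be reproduced directly from the table above. A minimal sketch in Python, using the figures quoted in this post:

```python
# Applicants and admits by program, from the table above:
# program: (males_applied, males_admitted, females_applied, females_admitted)
programs = {
    "A": (825, 512, 108, 89),
    "B": (560, 353, 25, 17),
    "C": (325, 120, 593, 202),
    "D": (417, 138, 375, 131),
    "E": (191, 53, 393, 94),
    "F": (272, 16, 341, 24),
}

# Within each program, females are admitted at a similar or higher rate.
for name, (m_app, m_adm, f_app, f_adm) in programs.items():
    print(f"{name}: males {m_adm / m_app:.0%}, females {f_adm / f_app:.0%}")

# Pooled across all programs, the comparison appears to reverse.
totals = [sum(row[i] for row in programs.values()) for i in range(4)]
m_app, m_adm, f_app, f_adm = totals
print(f"Overall: males {m_adm / m_app:.0%}, females {f_adm / f_app:.0%}")  # 46% vs 30%
```

Pooling discards the information about which programs each group applied to, which is exactly the base information the intuitive reading assumed to be equal.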

    The data is drawn from Wikipedia and P.J. Bickel, E.A. Hammel and J.W. O’Connell (1975). “Sex Bias in Graduate Admissions: Data From Berkeley”. Science 187 (4175): 398–404. doi:10.1126/science.187.4175.398. PMID 17835295.

    I wonder how many quotas are based on faulty but intuitively obvious data?

When you see a summary like this, you are looking at an average of averages, which is almost always misleading. You cannot average averages unless all the components have identical population sizes.
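A tiny example with invented numbers shows how far an unweighted average of averages can drift from the true mean when the groups differ in size:

```python
# Invented numbers: two groups with very different sizes.
group_sizes = [90, 10]
group_means = [1.0, 10.0]

# A naive average of averages ignores the group sizes entirely.
naive = sum(group_means) / len(group_means)  # 5.5

# The true overall mean weights each group mean by its population size.
weighted = sum(s * m for s, m in zip(group_sizes, group_means)) / sum(group_sizes)  # 1.9

print(naive, weighted)  # 5.5 versus 1.9
```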

Statistical information looks intuitive, but it usually is not. Our minds are made for simpler things. It is a bit like compound interest: you need to work it out to see the real underlying idea.

In your financial planning, be very cautious with average yield or average inflation rate, especially over long periods. The averages do not mean what you think they mean.
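A worked example with invented returns: a portfolio that gains 50% one year and loses 50% the next has an arithmetic average return of 0%, yet it ends down 25%. It is the compounded (geometric) average, not the quoted arithmetic average, that describes what your money actually does.

```python
# Invented annual returns: +50% one year, -50% the next.
returns = [0.50, -0.50]

arithmetic_mean = sum(returns) / len(returns)
print(f"{arithmetic_mean:.0%}")  # 0% -- the quoted "average yield"

# What the money actually does: each year multiplies the balance.
balance = 1.0
for r in returns:
    balance *= 1 + r
print(balance)  # 0.75 -- a 25% loss despite the 0% average

# The compounded (geometric) average return per year:
geometric_mean = balance ** (1 / len(returns)) - 1
print(f"{geometric_mean:.1%}")  # -13.4% per year
```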

Don Shaughnessy is a retired partner in an international accounting firm and is presently with The Protectors Group, a large personal insurance, employee benefits and investment agency in Peterborough, Ontario.

    The MONEY® Network