This article was first published in the Fall 2019 Edition of Appraisal Buzz Magazine.
Why won’t statistics work?!
Recent studies, by seemingly responsible companies, have shown such things as red doors, pet doors, $100,000 pools, and flatter roofs as making properties more valuable. My first reaction each time is: “This must be a spoof!” Yet these outlandish, seemingly serious statements keep coming out in full view—embarrassing the very essence of companies, organizations, and people who claim to be on the “forefront of technology.” Why is this happening?
I have been giving talks to investor, lender, and appraiser groups around the country. The topic: Fallacies, Foibles, Fun, and Future of Appraisal. Some statistical silliness and fateful fallacies provide plenty of material for some laughs and some concern.
Examples of Real Estate Statistical Mistakes
One study claimed their “research” proved that homes with more peaked roofs sold for lower prices than those homes with flatter roofs. Another asserted that homes with red doors sold for 20% more than those with black doors. Then another asserted that homes with pet doors were worth much more! Finally, another declared that after “scientific” nationwide statistically significant research, the degrees of view from the backyard were related to higher prices. Amazing! Who would have thought such a thing?
Most of these funny foibles are the result of a handful of basic statistical fallacies. Some of these even rise to the level of ‘groupthink’ culture, where everyone believes something because they think everyone else believes it to be true (but no one checks to see if it really is true).
One overarching belief has to do with ego. It is the belief that “it is just common sense.” The belief that something should be obvious to everyone who has an ounce of brains— “like me.”
Unfortunately, statistics and today’s data science are not always simple. What appears to be true often is not. In the numerous statistics (particularly econometrics) courses I have taken, entire lectures are often focused on a particular fallacy. The good news is that this makes the topic more interesting to wallow through. Sometimes wallowing in my own assumptions, and finding them wrong or even dangerous, brought light and awareness of how the world really works.
While there are many statistical concepts that can be misunderstood, a few stand out. Regression is another complex and misused ‘statistical’ topic. An article “Why Doesn’t Regression Work?” is available through my website: georgedell.com.
By the way, the reason ‘statistical’ is in quotes is that these are not really mistakes of statistics. The math is always right. These are ‘modeling’ mistakes. As The Appraisal of Real Estate says: “Abuse usually falls into one or both of two categories: overt attempts to mislead or ignorance.” In my litigation practice I have found the ‘or both’ category to be the most common of these.
Some Common and/or Irritating Fallacies
- Correlation equals causation
As easy as this one is to understand, people seem to take joy in conclusions which favor their cause, or simply provide drama to their writing.
Why bother with good science, especially when you have strong beliefs and a point to make?
An example of this comes from some pure facts. How about the relationship between increased ice cream eating and drownings? Several studies have been done in different parts of the country, and they all prove the same thing: ice cream kills. We need to ban ice cream immediately!
What’s the problem here? There is high correlation, repeatedly, between ice cream and drownings. The issue is that there is no connection, other than that both tend to occur during the summer, every summer. (Oh, by the way, drowning doesn’t cause increased ice cream eating either.)
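The ice cream example can be sketched with simulated data. This is only an illustration under made-up assumptions: the temperature range, the coefficients, and the noise levels are all hypothetical, and `pearson` is a small helper written for this sketch. Both series are driven by temperature and by nothing else, yet they correlate strongly until the season is held roughly constant.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
temps, ice_cream, drownings = [], [], []
for _ in range(5000):
    t = rng.uniform(0, 35)                           # hypothetical daily temperature
    temps.append(t)
    ice_cream.append(2.0 * t + rng.gauss(0, 10))     # depends on temperature only
    drownings.append(0.1 * t + rng.gauss(0, 1))      # depends on temperature only

# Correlation across the whole year: strong, though neither causes the other.
r_all = pearson(ice_cream, drownings)

# Hold the season roughly constant by keeping only days in a narrow band.
band = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 24 <= t <= 26]
r_band = pearson([i for i, _ in band], [d for _, d in band])

print(f"all days:       r = {r_all:.2f}")
print(f"24-26 deg days: r = {r_band:.2f}")
```

Once temperature is held nearly fixed, the apparent link between ice cream and drownings largely disappears, which is what a common cause looks like in data.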
- Sophisticated “advanced” quantitative methods make credible appraisals
A p-value means something only under certain conditions. First, the problem you are trying to solve must be to describe a population. Second, there are some assumptions:
- You are unable to get all (or nearly all) of the population
- A random sample represents the population very well
- You have a clear and random sampling protocol
Third, you need to know that the result tells you nothing about how good your model is.
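The third point can be illustrated with a small simulation. This is a sketch under stated assumptions, not a recommended test procedure: the data are randomly generated, the true slope of 0.05 is made up, and the p-value uses a normal approximation that is reasonable only because the sample is very large. The result is "statistically significant" by any usual threshold, yet the model explains almost none of the variation.

```python
import random
from statistics import NormalDist

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
n = 50_000
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [0.05 * x + rng.gauss(0, 1) for x in xs]   # tiny real effect, lots of noise

r = pearson(xs, ys)
r_squared = r * r

# Test statistic for "slope is zero"; with n this large the t distribution
# is close to normal, so a normal approximation is used for the p-value.
t_stat = r * ((n - 2) / (1 - r_squared)) ** 0.5
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"p-value   = {p_value:.2g}")
print(f"r-squared = {r_squared:.4f}")
```

A tiny p-value here says only that the slope is unlikely to be exactly zero in the population. It says nothing about whether the model predicts any individual value well.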
- Thinking that the average says anything about the spread
This is a key problem with the ‘point-value opinion.’ What clients want and need is one of three things:
- Risk of loss: for collateral decisions, portfolio management, and stress testing
- Potential for gain: for investments and value in use
- Fairness: for litigation, administrative law, or eminent domain
Each of these is dependent on the sureness of the analytic result. The level of accuracy cannot be estimated by a point value alone, especially one which is an opinion rather than a reproducible result. This may be a ‘groupthink’ problem with the appraisal community, our standards and regulatory agencies, and even our clients who in turn trade off price and speed.
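A minimal sketch of why a point value alone cannot convey sureness. The two sets of sale prices below are invented for illustration (in thousands of dollars). They share the same average, so a point value would be identical, but the spread around it is very different.

```python
from statistics import mean, stdev

# Hypothetical comparable sale prices, in thousands of dollars.
tight_market = [495, 498, 500, 502, 505]
wide_market = [380, 440, 500, 560, 620]

for name, prices in (("tight", tight_market), ("wide", wide_market)):
    print(f"{name}: mean = {mean(prices):.0f}, "
          f"std dev = {stdev(prices):.1f}, "
          f"range = {min(prices)}-{max(prices)}")
```

Both markets support the same point value of 500, yet a lender weighing risk of loss would face very different exposure in each. The average says nothing about that.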
- A high R-squared proves a good regression adjustment
It does not. There are two problems here. First, r² is different from R². The little r is for simple regression, where there is only one predictor variable. The big R is for multiple regression, with several predictors like cap rate, GRM, living area, rooms, site area, building area, and location. In the SGDS1 class we present the four assumptions of simple regression. These are seldom, if ever, true in real estate markets. And we also present the ten additional aspects of the data to be considered alongside the r² or R².
In reality, for some regressions, like for price indexing a market, r-squared is irrelevant on its face, due to the purpose of the regression. R² has been oversold. It looks clever. It feels sophisticated. Unfortunately, it exposes the purveyor as not knowing what they are doing. It is bogus in most all valuation procedures. But it is easy to calculate!
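One way to see that a high r-squared does not certify a model is to fit a straight line to data that is plainly curved. This is a sketch with made-up data (y is exactly x squared, with no noise), and `fit_line` and `r_squared` are helpers written for this illustration. The r-squared comes out at roughly 0.95, yet the residuals follow a systematic pattern and the single fitted slope misstates the marginal change at both ends.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Share of the variation in y explained by the fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = list(range(1, 11))
ys = [x ** 2 for x in xs]          # a curved relationship, not a straight line

slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"slope = {slope:.1f}, r-squared = {r2:.3f}")
print("residuals:", [round(e, 1) for e in residuals])
```

The residuals are positive at both ends and negative in the middle, a sign that the straight-line model is wrong. The true marginal change runs from 2 at the low end to 20 at the high end, so one slope used as an adjustment would be off in most of the range despite the impressive r-squared.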
- A regression coefficient is a good adjustment indicator
This has been one of the most egregious misuses of ‘statistics’ I know. An adjustment, as appraisers apply it, is based on prior experience. Although these can be easily abused, even “adjustment cheat sheets” simplify and align internally as reference points, and chronicle prior experience.
In theory, the adjustment is a marginal rate of change, ceteris paribus (holding other things constant). The problem here is twofold. First, some predictors may go unidentified, interacting with other variables in unidentified ways. Second, we cannot ‘experiment.’ We study markets as they are, not as we might wish them to be so that we could manipulate them for scientific experiments!
Finally, predictor variables act on each other in different ways. In multiple regression this is called collinearity: correlation between variables. It can be positive, where variables move together, like room count and living area. It can be negative, like power-line influence, which always coincides with an open area. And more: some variables interact. For example, large sites sell for more, view sites sell for more, and large sites with a view sell for more than the sum of the site and view premiums. The relationship often is multiplicative because large lots with good views are especially scarce. Worse yet, some variables in and of themselves have little or no predictive power, but they act on other variables in positive or negative ways.
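The collinearity problem can be sketched with a simulated market. Everything here is a hypothetical assumption for illustration: price depends only on living area (150 per square foot plus noise), and room count simply tracks living area. A simple regression of price on rooms still produces a large coefficient, which would be a badly wrong "per room" adjustment.

```python
import random

def slope(xs, ys):
    """Simple-regression slope of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(42)
areas, rooms, prices = [], [], []
for _ in range(4000):
    area = rng.uniform(1000, 3000)                    # living area, sq ft
    room_count = round(area / 400 + rng.gauss(0, 0.5))  # rooms track area
    price = 150 * area + rng.gauss(0, 20000)          # rooms add NOTHING here
    areas.append(area)
    rooms.append(room_count)
    prices.append(price)

# Coefficient on rooms when living area is left out of the model.
rooms_coefficient = slope(rooms, prices)

# Hold living area roughly constant and look at rooms again.
band = [(r, p) for a, r, p in zip(areas, rooms, prices) if 1900 <= a <= 2100]
rooms_coefficient_in_band = slope([r for r, _ in band], [p for _, p in band])

print(f"rooms coefficient, all sales:         {rooms_coefficient:,.0f}")
print(f"rooms coefficient, similar-size only: {rooms_coefficient_in_band:,.0f}")
```

The first coefficient mostly reflects living area riding along with room count. Among homes of similar size, the apparent value of a room shrinks toward zero, which is the true value built into this simulation.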
- Outliers should be eliminated right away
This one may be at the core of why traditional vintage judgment-based appraisal is becoming obsolete. Appraisers have been trained to seek a handful of comps that are “similar, competitive, and comparable.” Yet no one has ever defined it. What is a comparable?
Pick six or seven sales, go look at them, discard all but the best three. Make sure you bracket overall, and bracket the key variables, like sqft, site area, income. Now, make it look good.
So, what’s the problem? The worst mistakes made by appraisers do not come from wrong adjustments. The big mistakes, the ones where appraisers get sued, come from a missed feature: a feature that only shows up in something that looks like an outlier, but happens to have something in common with the subject.
In Evidence Based Valuation (EBV©) we focus on how some piece of data may not fit, rather than only on the data that does. We make sure outliers are explained in some way. Focusing on outliers substantially prevents large mistakes.
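One possible way to put "explain outliers, do not discard them" into practice is to flag unusual sales for review instead of deleting them. The sketch below is an assumption for illustration, not the EBV procedure itself: the prices per square foot are invented, `flag_outliers` is a hypothetical helper, and the cutoff of three median absolute deviations is an arbitrary choice.

```python
from statistics import median

def flag_outliers(values, k=3.0):
    """Return indexes of values more than k median absolute deviations
    from the median. Flagged items are for review, not for deletion."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - med) / mad > k]

# Hypothetical sale prices per square foot for one market segment.
price_per_sqft = [210, 205, 198, 215, 202, 208, 320, 200, 212]

for i in flag_outliers(price_per_sqft):
    print(f"sale #{i} at {price_per_sqft[i]}/sq ft needs an explanation "
          "(view? lot size? condition? data error?)")
```

The flagged sale stays in the data set until someone can say why it is different. If the reason turns out to be a feature the subject shares, that sale may be the most important one in the analysis.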
Discarded data of any type (a core fallacy/problem in the old 1004MC form) causes analytical bias. The use of just three or four comps, instead of all the relevant market data, will always bias a result. The only question is “how much?” When you have all the competitive market data, it is easier to use all of it, thus eliminating data bias up front.
As per USPAP, an appraiser “must collect, verify, and analyze all information necessary for credible assignment results.” And for the sales comparison approach, an appraiser “must analyze such comparable sales data as are available.” Nowhere do our standards suggest that 3 comps or 5 comps is the right number. Nowhere.
To Sum Up
The future is approaching fast. It is likely there will be fewer traditional, opinion-based appraisers. But there will be a greater need for those qualified in dealing with big data challenges and opportunities. The answer is in data science competency – the ability to understand and exploit computer power, not get beat down by it. The answer is in learning how to use algorithms to optimize your market and valuation competency. Appraisers will still be needed, but in a smarter way. You will be of greater value, not less.
Can you lie with statistics? Yes. Can you lie with graphs? Yes. Can you lie with words? Yes. Please tell me: what does each of these have in common?
Perhaps a statistical BS Detector is a possibility, but that is for another day.
For a more academic look at some appraisal statistical fallacies, see my journal article in The Appraisal Journal, Fall 2013, entitled “Common Statistical Errors and Mistakes: Valuation and Reliability.”
Free, understandable access to the article “Why Doesn’t Regression Work?” is available on my blog website: https://georgedell.com. Feel free to contact me there.