It’s pretty obvious that estimating is a profession that requires a lot of math. If you look beyond calculations of all that’s obvious, you’re left with quantifying the unclear and the uncertain. Construction can be a complicated business. Describing the impact that one detail may have on an overall assembly can be very difficult. In the field, we have industry jargon to abbreviate lengthy terms relating to the task at hand. In many ways estimators use statistics as a shorthand to express the complex relationships they’re dealing with. In really simple terms, we’re trying to express two simple concepts about a group of numbers. First, we want to know what’s the “center” of the group. Second, we want to know the “spread” of the group of numbers. We’re looking for a “central tendency” which gives us some assurance that our short-hand answer is representative of the original data. An outlier is a number that’s inconsistent with the rest of the group.

*In some cases the outlier is as threatening as the group…*

**Average**

Average, or mean, is the most common and familiar statistical shorthand to describe the center of a set of numbers. The math is pretty simple: it’s the sum of every number in the set, divided by the count of numbers in the set. When the numbers in the set are similar, the calculated average is close to any individual number in the set. That’s a long way of saying that averages are only meaningful when the numbers in the set are similar.

As estimators, the utility of an average is that you can take the complexities of larger systems and “boil it down” to a handy number. We most often see these applied in unit or parametric pricing. An “average” restroom may cost $X amount based on the average of what restrooms have cost in the past.

**It’s “mean” for a reason**

Averages that are based on sets of numbers with wildly different values can generate misleading information. Picking up on the restroom example, imagine what might happen to your average if your past projects included ten truck stops, a campground, and a queen’s powder room! Outliers can skew your statistics to an incredible degree.

*Bending the curve can lead to errors in judgment…*

**Median**

Median is another way to describe the center of a set of numbers. The median is the middle number of the set when the numbers are arranged from smallest to largest. If there is an even count of numbers in the set, you calculate the average of the middle two. This is a statistic that’s under-utilized because its benefit isn’t immediately obvious. The entire point of a statistic is to create a shorthand, one-number answer to inform what a bunch of numbers are telling us. If we limit our examples to simple and similar number sets, there really isn’t much difference between an average and a median. However, when our number set includes outliers, as in the queen’s powder room, the median more accurately illustrates the central tendency of the data set. That’s a fancy way of saying that the median helps to knock out the outliers in your data.
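To make the restroom example concrete, here’s a minimal Python sketch. The dollar amounts are made up purely for illustration (they aren’t from any real project); the point is only to show how a single extravagant powder room drags the average while the median barely moves.

```python
from statistics import mean, median

# Hypothetical past restroom costs in dollars (illustrative only, not real data):
# ten truck stops, one campground, and one very fancy powder room.
truck_stops = [18_000, 19_000, 19_500, 20_000, 20_000,
               20_500, 21_000, 21_000, 21_500, 22_000]
campground = [9_000]
powder_room = [250_000]

costs = truck_stops + campground + powder_room

avg_cost = mean(costs)       # pulled upward by the single large outlier
median_cost = median(costs)  # average of the two middle values (even count)

print(f"average: {avg_cost:,.0f}")
print(f"median:  {median_cost:,.0f}")
```

With these made-up numbers the average lands far above every truck stop in the list, while the median sits right in the middle of them.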

Students of estimating should notice something here. If the data is really consistent, you probably didn’t need to calculate the average because it’s obvious. If the data has clear outliers, the median’s easier to spot and it’s more useful anyway.

**Range**

Range speaks to the “spread” of a group of numbers. Range is the difference between the largest and the smallest numbers in the set. Simply subtracting the smallest number from the largest will give you your range. Estimating can be likened to a series of approximations. We must confine our approximations by defining the smallest and the largest acceptable answers. Each successive approximation reduces the range of acceptable answers. Broadly speaking, estimators can add up everything that’s obvious to arrive at the smallest acceptable answer; however, we know that the risk of the unknown has value. Estimators often express these concepts in terms of “best case or worst case” scenarios. The range defines the potential risk between the two. It’s entirely possible for the potential risk to exceed the anticipated reward of a project.

*It’s easy to dig yourself into a hole, but it’s a bear getting out.*

Range is also used to define the difference between subcontractor (sub) proposals or bid results. A small range may indicate a consensus view among bidders, whereas a larger range may indicate the opposite.
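As a quick sketch of the arithmetic, the snippet below computes the range for a set of sub proposals. The bid amounts are hypothetical, and expressing the range as a percentage of the low bid is my own assumption about a convenient way to compare trades of different sizes.

```python
# Hypothetical sub proposals for one trade, in dollars (illustrative only).
bids = [48_200, 49_900, 51_000, 52_750, 61_300]

# Range: largest value minus smallest value.
bid_range = max(bids) - min(bids)

# Assumed convenience measure: range as a percentage of the low bid.
range_pct = bid_range / min(bids) * 100

print(f"range: {bid_range:,} ({range_pct:.1f}% of low bid)")
```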

**Standard Deviation**

If every group of numbers has a mean, then every number in the group has a deviation (difference) from the mean. Value – Mean = Deviation. Standard deviation is an estimate of the size of a typical deviation. There are four steps to calculating standard deviation:

- Calculate the mean
- Calculate the deviation for each number in the set and square the result
- Calculate the mean of the squared deviations
- Calculate the square root of the result

The standard deviation informs you of how widely the set of numbers differs from the mean of the set. A small standard deviation indicates that the spread is minimal which implies the mean is more representative of the set.
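The four steps above can be sketched in Python as follows. The unit prices are hypothetical and chosen so the arithmetic is easy to check by hand. Note that taking the plain mean of the squared deviations gives what’s usually called the population standard deviation; Python’s `statistics.pstdev` does the same thing, while `statistics.stdev` divides by one less than the count and will give a slightly larger answer.

```python
from math import sqrt
from statistics import pstdev

# Hypothetical unit prices in dollars per square foot (illustrative only).
values = [20, 40, 40, 40, 50, 50, 70, 90]

# Step 1: calculate the mean.
mean_value = sum(values) / len(values)

# Step 2: calculate each deviation (value - mean) and square it.
squared_deviations = [(v - mean_value) ** 2 for v in values]

# Step 3: calculate the mean of the squared deviations.
mean_squared_deviation = sum(squared_deviations) / len(squared_deviations)

# Step 4: take the square root of the result.
std_dev = sqrt(mean_squared_deviation)

print(f"mean: {mean_value}, standard deviation: {std_dev}")

# Cross-check against the standard library's population standard deviation.
print(f"pstdev: {pstdev(values)}")
```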

Standard deviation becomes particularly useful to identify whether the numbers in a set are clustering around the mean. A set with one outlier will have a lower standard deviation than a set with several outliers even if both sets have the same calculated range and median.

**Mode**

Mode is the most frequently occurring value in a data-set. We really don’t hear people using the term “mode” to refer to estimating, but it’s being used nevertheless. The important thing to understand about mode is that it’s useful outside of numbers. Our work as estimators may be evaluated in many different ways. If you wanted to know which month is the most productive, or which city had the most projects, you’re asking about the mode of your work.

An absolutely astounding amount of information is created during a bid, however many estimators fail to record job characteristics beyond the construction documents. Tracking project parameters like cities, dates, values, construction types, and even required working hours can generate informative feedback about your market. Tracking which competitor won the jobs you lost may reveal trends that speak to your chances on a new opportunity.
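Here’s a rough sketch of what asking for the mode of a bid log might look like. The log, its field names (`city`, `month`, `winner`), and every entry in it are hypothetical; a real tracking sheet would have its own columns.

```python
from collections import Counter

# Hypothetical bid log (illustrative only): one record per estimate.
bid_log = [
    {"city": "City A", "month": "Mar", "winner": "Us"},
    {"city": "City A", "month": "Mar", "winner": "Competitor X"},
    {"city": "City B", "month": "Jun", "winner": "Competitor X"},
    {"city": "City A", "month": "Sep", "winner": "Competitor Y"},
    {"city": "City C", "month": "Mar", "winner": "Competitor X"},
]

def mode_of(field):
    """Return (most frequent value, count) for one field of the bid log."""
    return Counter(rec[field] for rec in bid_log).most_common(1)[0]

for field in ("city", "month", "winner"):
    value, count = mode_of(field)
    print(f"most common {field}: {value} ({count} of {len(bid_log)})")
```

Notice that none of these fields are numbers, which is exactly why mode is handy where an average or median wouldn’t make sense.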

If there was one area of statistics that estimators could improve upon, it would be mode. I think estimate tracking is neglected when markets are good because it’s relatively easy to win work. Lacking sufficient pressure to improve, most companies simply aim to repeat whatever worked on the last thing they bid. When markets shift, these firms often resort to chasing every lead in hopes of landing a fruitful job with leads into future work. Running blind leads to crashing hard. Estimate tracking can be a thankless business, but it can be a profound help to cull the good leads from the bad. Time spent on fruitless bidding begets more fruitless bidding. Sober heads must prevail if anything is to improve.

*Good judgment will make you stand out from the crowd*

**Trade-offs**

Working with statistics involves trade-offs that must be considered. For example, let’s say you’re bidding a chain restaurant that’s similar to several past bids. You could calculate the average square foot cost of your past bids to arrive at a total for the current project. Now, for argument’s sake, let’s suggest that you didn’t win all of those past bids. If you’re strictly working off your past bids, there’s a built-in rate hike.

Even if you only factored the winning bids, do you know which ones went on to be profitable? Many estimators hand off their bids to Project Managers (PMs) for the construction phase. Some PMs don’t track their change orders separately from the original bid, which can make big differences in profitability and production. Estimators must understand that winning an unprofitable job is much worse for their company than losing a bid. Since you can’t count on change orders to save you, every bid should include sufficient overhead and profit to make the work worthwhile.

Finally, there may be features that factor into a project cost in ways that don’t translate to square foot costing. Restrooms are a good example of this because they require many trades and vendors to assemble compared to any other room. An individual restroom is a costly parameter that is not exclusively driven by the building’s square footage. It’s entirely possible that two otherwise identical projects would have different numbers of restrooms. They certainly won’t cost the same amount as a result.

All of these considerations go towards questioning the value of the data, before trusting the statistics on the data. Bad information will give you bad statistics every time.

**Factoring**

So far, I’ve focused on basic statistical analysis. As estimators we might be comparing samples that we know are imperfect for the task at hand, simply because it’s all we have to go on. For example, let’s say you’re trying to get a sense of what it will cost to build out a “white box” space into a bank. If your past experience included ground-up banks and the occasional retail Tenant Improvement (TI), you have some obvious similarities to work with.

Starting with the ground-up bank bids, you need to consider what portion of the total cost was for the “core and shell”. The goal here is to strip out the parts that don’t apply to your current project. The amount we’re left with is a proportion of the original total. That proportion is known as a “factor”. We can take that factor and multiply it with another ground-up bank bid to arrive at the approximate value of the TI portion. You might hear someone say that they factored out the core and shell portion of their ground-up bids.
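A minimal sketch of that factoring step is below. All of the dollar figures are hypothetical, and the variable names are mine; the idea is simply to strip the core and shell out of one ground-up bid, express what’s left as a proportion, and apply that proportion to another ground-up bid.

```python
# Hypothetical ground-up bank bid, in dollars (illustrative only).
ground_up_total = 1_200_000
core_and_shell = 780_000

# Strip out the portion that doesn't apply to a TI project.
ti_portion = ground_up_total - core_and_shell

# The "factor" is what remains as a proportion of the original total.
ti_factor = ti_portion / ground_up_total

# Apply the factor to another (hypothetical) ground-up bank bid.
another_bank_total = 1_000_000
approx_ti_value = ti_factor * another_bank_total

print(f"TI factor: {ti_factor:.2f}")
print(f"approximate TI value of the second bank: {approx_ti_value:,.0f}")
```

A factor taken from a single bid carries that bid’s quirks with it, so it’s probably worth repeating the calculation on several bids to see how much the factor moves.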

The idea here is to cut out anything you don’t need and add where you do. You might hear these operations referred to as “corrections”, “adjustments”, or “compensations for xyz”. Estimators should interpret these terms to mean there’s **built-in uncertainty** in the affected information.

**The weighted average**

A retail TI may have a lot in common with a bank build out. For example, the level of finishes and square footages may be very similar. The differences between a retail TI and a bank TI may be tougher than line-item considerations will allow. Obviously a store selling lawn equipment will have less in common with a bank than a high-end clothier. What we’re looking for is a means to minimize the differences and maximize the similarities. We can achieve this aim by a process known as a *weighted average*. The weighted average is an average resulting from the multiplication of each component by a factor reflecting its importance.

Weighted averages can be factored in several different ways. One approach is to use positive factors that are greater than or equal to 1. Applying this to our example, we might apply a factor of 3 to the adjusted bank estimates, a factor of 2 to the high-end clothier TI, and a factor of 1 to the lawn equipment store TI. Assuming we had four banks, we have the following:

Banks 4 each x factor 3 = 12 weight

High end clothier 1 each x factor 2 = 2 weight

Lawn equip 1 each x factor 1 = 1 weight

Total weight = 15

We then calculate the weighted average by multiplying each component by its factor

Bank #1 250,000 x 3 = 750,000

Bank #2 244,000 x 3 = 732,000

Bank #3 230,000 x 3 = 690,000

Bank #4 226,000 x 3 = 678,000

Clothier 225,000 x 2 = 450,000

Lawn 190,000 x 1 = 190,000

Total = 3,490,000

Now we divide by our total weight of 15 to get our weighted average

3,490,000 / 15 = 232,667

The non-weighted average of this example comes to 227,500, which illustrates the difference weighting can make.
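For anyone who’d rather let a script do the multiplication, here’s a short sketch of the same calculation. It takes the listed bid amounts (with the clothier at 225,000) and the factors of 3, 2, and 1 as inputs, and divides by the total weight of 15. A handy sanity check is built in: with positive factors, a weighted average must land between the smallest and largest amounts in the set.

```python
# (label, bid amount in dollars, importance factor) as listed in the example.
bids = [
    ("Bank #1", 250_000, 3),
    ("Bank #2", 244_000, 3),
    ("Bank #3", 230_000, 3),
    ("Bank #4", 226_000, 3),
    ("Clothier", 225_000, 2),
    ("Lawn", 190_000, 1),
]

total_weight = sum(factor for _, _, factor in bids)
weighted_total = sum(amount * factor for _, amount, factor in bids)

# Weighted average: weighted total divided by the total weight.
weighted_avg = weighted_total / total_weight

# Plain average for comparison: every bid counts equally.
simple_avg = sum(amount for _, amount, _ in bids) / len(bids)

# Sanity check: the result must fall inside the range of the inputs.
amounts = [amount for _, amount, _ in bids]
assert min(amounts) <= weighted_avg <= max(amounts)

print(f"total weight:     {total_weight}")
print(f"weighted average: {weighted_avg:,.0f}")
print(f"simple average:   {simple_avg:,.0f}")
```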

A word of caution is in order here. Factoring and weighted averages can generate flawed information if they’re not used with care. Earlier I used an example of factoring out the core and shell costs on ground-up banks. There’s a built-in assumption that the going rate for ground-up work is the same as TI work. In reality, the TI work may be **more expensive** because there’s less competition for smaller jobs, or because the site logistics are more difficult when you’re surrounded by finished spaces. Lacking first-hand experience, it’s difficult to factor for uncertainty. On the other hand, it’s easy to steer your way to a favorable answer. I see a lot of this kind of thinking employed to rationalize a mistake.

*Edna tries to make her driving position look reasonable with a little help from her husband’s wardrobe*

**Accuracy versus precision**

Estimators have to deal with uncertainty; it’s part of everything we do. We often think of that uncertainty outside of our “answer” to what something costs. In fact, the only time we’re actually certain is when the project’s done, the bills are paid, and the final reckoning is complete. There’s a great temptation to assume that our past work is *fact* so that we’re basing our current decisions on something scientific.

Truthfully, the only accurate price is the winning bid on bid day. It doesn’t matter if you used a supercomputer or a dart board to arrive at a losing number because the outcome is the same. Accuracy is more important than precision.

Accuracy and precision are not interchangeable terms. Accuracy describes how close a measurement system gets to the subject’s actual value. Precision describes a measurement system’s repeatability.

This distinction is important because it’s easy to get hung up on precision when there are opportunities to apply mathematical processes. Being able to repeatedly output the same answer is great for consistency; however, we win jobs by hitting the market price.

If you’re 10% higher than the low bidder every time you bid, you know that your precision is nearly 100%, but your accuracy is only 90%. Unless you find a way to profitably cut 10% on the next bid, you’ve got little reason to expect a win.
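One way to put numbers on this, sketched below, is to compare each of your bids to the low bid on the same job. The bid history is hypothetical, and treating the average overshoot as “accuracy” and the scatter of the ratios as “precision” is my own simplification of the idea, not a standard industry formula.

```python
from statistics import mean, pstdev

# Hypothetical (our bid, low bid) pairs in dollars (illustrative only).
history = [
    (110_000, 100_000),
    (220_000, 200_000),
    (55_500, 50_000),
    (329_000, 300_000),
]

# Ratio of our bid to the low bid on each job (1.00 would be a dead heat).
ratios = [ours / low for ours, low in history]

# Accuracy: how far off the market we are on average.
average_overshoot = mean(ratios) - 1

# Precision: how tightly our results repeat from job to job.
scatter = pstdev(ratios)

print(f"average overshoot: {average_overshoot:.1%}")
print(f"scatter of ratios: {scatter:.1%}")
```

A tiny scatter paired with a steady overshoot of about 10% is the situation described above: very repeatable, and repeatably too high.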

Estimators need both precision and accuracy. Guessing may land you the occasional victory, but it’s a gamble whether the amount bid will lead to profitable work without the precision of a repeatable estimating process.

**Clusters, stratification, and outliers**

Estimators who receive several bids on the same scope of work will be able to recognize some patterns. Arranged in ascending order of value, some bids may tightly cluster around a value. Provided there are enough bids from a diverse group of bidders, you might see several clusters appearing. This stratification can provide some delineation between groups of bidders that is instructive. Finally, you might see proposals that are substantially different from any clusters or strata. Outliers are worrisome because they may imply a bidder mistake.

Every company has an efficiency of scale which tends to make them a market leader when they’re competitively bidding work that fits their key efficiencies. When several companies have similar efficiencies of scale, their bids tend to cluster very closely. With a large enough sample, it’s possible to see bidders stratifying according to their efficiencies of scale. Estimators should be cautious, because stratification can be driven by differences in scope interpretation as well.
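As a rough illustration, the sketch below sorts a set of bids and starts a new group whenever the next bid is more than 5% above the one before it. The bids are hypothetical, the function name `group_bids` is mine, and the 5% threshold is an arbitrary assumption; a group with only one member is a candidate outlier that deserves a phone call, not an automatic rejection.

```python
def group_bids(bids, gap_pct=0.05):
    """Group bids in ascending order, starting a new group whenever a bid
    exceeds the previous bid by more than gap_pct (an assumed threshold)."""
    groups = []
    for bid in sorted(bids):
        if groups and bid <= groups[-1][-1] * (1 + gap_pct):
            groups[-1].append(bid)   # close enough to join the current cluster
        else:
            groups.append([bid])     # gap too large: start a new cluster
    return groups

# Hypothetical bids on the same scope, in dollars (illustrative only).
bids = [120_500, 98_000, 71_000, 101_000, 118_000, 99_500, 121_000]

groups = group_bids(bids)
for group in groups:
    label = "possible outlier" if len(group) == 1 else "cluster"
    print(f"{label}: {group}")
```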

Competitive bidding means we’re constantly looking for the lowest bidder, which may be an outlier of the group. It’s obvious that omitting, excluding, or misunderstanding the project scope can lead to underestimating the cost of the work. Estimators calculate the risk of hiring a low bidder by subtracting the low bid amount from the 2nd low. This is the cost for the GC to “buy” their way to the 2nd low bidder if the low bidder proved unacceptable.
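The calculation is simple enough to sketch in a few lines. The sub names and amounts are hypothetical, and showing the exposure as a percentage of the low bid is my own addition for comparing across trades.

```python
# Hypothetical sub bids for one trade, in dollars (illustrative only).
bids = {"Sub A": 98_000, "Sub B": 71_000, "Sub C": 99_500}

# Rank bidders from lowest to highest price.
ranked = sorted(bids.items(), key=lambda item: item[1])
(low_name, low_bid), (second_name, second_bid) = ranked[:2]

# Cost to "buy" up to the 2nd low bidder if the low bidder proves unacceptable.
exposure = second_bid - low_bid
exposure_pct = exposure / low_bid * 100

print(f"low: {low_name} at {low_bid:,}")
print(f"2nd low: {second_name} at {second_bid:,}")
print(f"exposure: {exposure:,} ({exposure_pct:.1f}% of the low bid)")
```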

**Perception driven perfection**

Lots of GCs have a policy of requiring at least three sub bids per trade as sufficient proof to draw meaningful estimating conclusions. Indeed, if the three market-leading subs of every trade are consulted, the GC may rest assured that they’ll win more than their competitors.

However, some GCs solicit the exact same subs for all their bids regardless of what they pursue. This limited perspective creates very little useful information. Sadly, this practice generates very consistent statistics as losses stack up. Great precision, with little accuracy.

GCs who challenge their perceptions by monitoring what goes on within their local markets may learn what stratifies bidders and predicts market leaders. Be honest with yourself, and admit faults where they exist. Many estimators could improve their hit rates tremendously by targeting only those opportunities they can actually win with the resources they currently possess. GCs who aren’t attracting top subs should take every opportunity to improve their reputation. It should be obvious that subs can see when a GC is constantly losing bids. They can also see which GC is steadily winning bids. Estimates are **not free** so the best sub prices go to the GCs who won’t squander them. This is why bidding less often leads to winning more.

© Anton Takken 2016 all rights reserved

April 16th, 2016 at 9:29 pm

I enjoyed the article. Thanks Anton Takken. There are many issues brought up within your post worthy of more consideration. It was a thoughtful piece. However, I wanted to talk about accuracy and precision a bit more, especially as they relate to uncertainty. First of all, risk is uncertainty. Precision is important because it essentially defines the constraints of your profit margin. If precision is only known to 15%, then there is an issue in bids lost at 10%. Without understanding how precise – how well you understand your work or processes – there can be no certainty in profit & loss on bids. Quality comes from minimized variation from process outcome – precision. Accuracy comes from measuring the outcome correctly where one actually knows the value bid will make a profit if bid 10% less because why bother bidding to lose money (although losing a little can be better than losing a lot)?

April 17th, 2016 at 11:44 am

Steven, I’m glad you enjoyed the article. I think you’re making good points about having a precise process so that you know how you got to your price. The important thing to understand here is that precision relates to our efforts to quantify the real world, and accuracy relates to how well we achieved that aim. We are rewarded (and punished) according to our accuracy.

I think that uncertainty can be interpreted as risk only to the extent that we assume there’s nothing else that can be done to reduce the uncertainty. Asking questions, or finding alternate means of accomplishing a task can often reduce the uncertainty to a great degree. Dealing with uncertainty is an aspect of our craft where a lot of excuses are made. Ultimately, we can’t control everything in our market. It’s not easy, but we must remember that everyone must account for that same uncertainty.

There’s a tendency to focus on moving numbers around simply because it’s under the estimator’s control. It takes a lot of experience and judgment to account for the uncertainty driven by factors out of our control. As you wrote, it’s much better to refine an estimating process that’s consistently growing closer to the going rate.

Thanks for reading!