Estimating IT projects is a pain. Whoever gave promises they couldn’t keep, only to work overtime just to meet the deadline they have set up for themselves?

When I started my path and tried to estimate my time spent while being a developer, I always underestimated things. Every time there would appear a job I didn’t account for. Colleagues told me to multiply my estimates by 2, 3, the number Pi. Only it didn’t help to increase the estimation accuracy, just added other problems. For example, when I had to explain where the high numbers came from.

15 years have passed since then. Over this time, I’ve estimated over 250 projects, got a lot of experience, and now I’m willing to share my thoughts on the topic.

Hopefully, this article will improve the quality of the estimations you’re giving.

Why estimate?

No more than 29% of projects end up in success, according to the research by The Standish Group, conducted in 2015. The other 71% either failed or exceeded the triple limitation system: deadline, functionality, budget.

From these statistics, we can assume that project estimation is often not what it should be. Does it mean that this process is pointless? There’s even a movement on the Internet that invites you to not estimate anything and just write a code, so that whatever happens – happens (search by #noestimates).

Not having any estimations does sound appealing, but let me give you an example. Imagine that you come to a restaurant and order a steak and a bottle of wine but there are no prices on the menu. You ask a waiter: “how much?”, and he goes, “depends on how long it takes the chef to cook. Order, please. You’ll get your food, you’ll eat it and then we’ll tell you how much it cost”.

There can also be an Agile-like option: “The chef will cook your meal, and you’ll be paying as he proceeds. Do this, please, unless you’re out of money. When you have no more money, the chef will stop cooking. The steak won’t be ready, perhaps, or it will be just reaching the state when it’s edible. If it’s not edible, though.. Sorry, it’s your problem”.

This is approximately how customers in the IT-sphere feel when they’re offered to start a job without estimations.

In the example above we’d ideally like to get an exact price for the steak. At the very least, it’d be fine if we just got a price range. This way we can check whether we want to go to this restaurant, choose a cheaper one, go get a cheeseburger or stay home and cook a salad.

Going to the restaurant without any understanding as to what to expect is not the decision that a person in the right mind will take.

I hope I could convince you that estimation is an important part of a decision making process on any project. The estimation can be as close or far from reality as possible, but it’s needed nevertheless.

Reasons for underestimation

Ignoring probability theory

Imagine the following situation. A developer is approached by a manager, and the manager would like to know how long it will take the developer to finish a task. The developer has done something like that in the past and can give the “most probable” estimation. Let it be 10 days. There’s a probability that the completion of the task will last for 12 days, but the chance is lower than that of 10 days. There’s also a chance that the task would be completed in 8 days but this probability is lower as well.

It’s often assumed that estimation for a task or a project is distributed according to the normal distribution law (read more about it here). If you show estimation distributions as a graph, you’ll get this:

X shows estimation while Y shows probabilities that the estimation will turn out to be correct and a task will consume the precise amount of time. In the center, you can see the point of the highest probability. It goes with our 10-day estimation.

The area under the curve gives us a probability of 100%. It means that if we go with the most probable estimation, we’ll finish the project on time or earlier with the 50% chance (the area under the graph before the 10-hour estimation is half of a figure, therefore it’s 50%). So, if we go with this principle, we’ll be able to only miss 50% of the deadlines.

This is only if the distribution of probabilities does go hand-in-hand with the normal distribution. In this case, the possibility of finishing a project earlier than a possible estimation equals to finishing it later than a possible estimation. However, it’s also normal that something goes wrong with the project, and we finish it earlier than a miracle happens and we finish earlier. In other words, the amount of negative possibilities is always higher than that of positive ones.

If we go with this idea in mind, we’ll get the following distribution:

For this to be easier to read, let me represent this information as a cumulative graph which will show a possibility of finishing the project on time or earlier:

Turns out, if we take the “most possible” estimation in 10 days, the possibility of the task being completed in that period or earlier.

Ignoring the current level of uncertainty

During the work on a project/task, we constantly learn new information. We get feedback from the customer, manager, tester, designer, and other team members. This knowledge keeps renewing. We know little about the project from the start, but we learn more as we go along, and we can mention exactly how long it took us once we’ve finished the project.

What we know directly affects how precise our estimation is.

Luiz Laranjeira’s (Ph.D., Associate Professor at The University of Brasilia) research also shows that the accuracy of estimating a program project depends on how clear the requirements are (Luiz Laranjeira 1990). The clearer the requirements are - the more accurate the estimation is. Usually, the estimation isn’t clear because the uncertainty is involved in the project/task. Therefore, the only way of reducing uncertainty is by reducing it in the project/task.

According to this research and common sense, as we decrease the uncertainty on a task/project, we increase the estimation accuracy.

This graph is here to make it easier to understand. In reality, the most possible estimation may change as uncertainty decreases.

Dependency between precise estimation and the project stage

Luiz Laranjeira went on with his research and figured out the number of dependency between and estimation spread and a project stage (level of uncertainty).

Taking optimistic, pessimistic and the most possible estimates (optimistic estimate is the earliest period of finishing the project out of all, while pessimistic is vice versa, the latest) and show how their ratios change over time, from start to finish of the project, we’ll get the following picture:

This is called a cone of uncertainty. The horizontal axis stays for the time between the start and the finish of the project. The main project stages are mentioned there. The vertical axis shows a relative margin of error in the estimation.

So, at the start of the initial concept, the most possible estimation may vary from the optimistic one by 400%. When the UI is ready, the spread of estimation goes between 0.8 and 1,25 relative to the most possible estimation.

This data can be found in the table down below:

It’s very important to note that the cone doesn’t get narrower as time goes by. For it to narrow, one needs to manage the project and take action to lower uncertainty. If one doesn’t do that, they’ll get something like this:

The blue area is called a cloud of uncertainty. The estimation is subject to major deviation up to the very end of the project.

To move on the cone to the most-right point where there’s no uncertainty, we need to create a finished product :). So, unless the product is ready, there is always uncertainty, and the estimation can’t be 100% precise. However, you can affect the estimation accuracy by lowering uncertainty. With this, any action targeted at lowering uncertainty also lowers the estimation spread.

This model is used in many companies, NASA included. Some adapt it to consider volatility in requirements. You can read about that in detail in “Software Estimation: Demystifying the Black Art”.

What is a good estimation?

There are plenty of options to answer this question. However, in reality, if the estimation deviates by more than 20%, the manager doesn’t have a room for maneuver. If the estimation is somewhere around 20%, the project can be finished successfully by managing functionality, deadlines, team size, etc. It does sound logical, so let’s stop at this definition of good estimation, for example. This decision has to be taken on the organizational level. Some risk and a 40-50% deviation is OK for them; others see 10% as a lot.

So, if out estimation differs from the actual result by not more than 20%, we consider it good.

Practice. Estimating a project on various stages

Let’s imagine that a project manager has approached you and asked to estimate a function or a project.

To start with, you have to study available requirements and figure out the life-cycle stage of a project definition.

What you do next depends on the stage you’re on:

Stage 1. Initial concept

If a manager approaches you and asks how long it will take to create an app where doctors will consult patients, you are on Stage 1.

When does it make sense to make estimations on this stage?

On a pre-sale stage. When you need to realize whether it’s worth it to further discuss the project. All in all, it’s better to avoid estimations on this stage and try to lower the uncertainty as soon as possible. After that, we can move on to the next stage.

What do you need to estimate on this stage?

Actual labor time data on a similar finished project.

What tools are the most suitable for this stage?

  • An estimation by analogy

Estimation algorithm

Actually, estimating the project on this stage is an impossible task. You can only see how long a similar project took to launch.

For example, this is how you could put your estimation into words: “I don’t know how long this project will take as I lack data. However, project X which was similar to this one took Y time. To give at least an approximate estimation, it’s imperative to make requirements clearer”.

If there’s no data from similar projects, then lowering the uncertainty and moving to the next stage is the only way to estimate here.

How to move to the next stage?

For this to happen, the requirements must be clarified. You need to understand what the app is for and its functionality.

Ideally, one should have skills in gathering and analyzing requirements.

To improve that skill, it’s recommended to read “Software requirements” by Karl Wiegers and Joy Beatty.

To gather preliminary requirements, you might use this questionnaire:

  • What’s the purpose of the app? What problems will it solve?
  • What is the target audience? (for the task above that could be doctor, patient, administrator)
  • What problems will each type of users solve in the app?
  • What platforms is the app for?

After figuring these things out, you will have an image of the app in your head with all the necessary information. With this, we’re moving to Stage 2.

Stage 2. An agreed definition of the product

We have an understanding here, although not very detailed, about what the app will do and what won’t.

When does it make sense to make estimations on this stage?

Again, on the pre-sale stage. Right when one needs to decide whether it’s worth it to complete the task or project, whether they have enough money, whether the deadlines are affordable. You need to check if the value that the project brings is worth the resources that need to be involved.

What do you need to estimate on this stage?

Quite a few finished projects and their estimations OR huge experience in the area of development to which the project is related. These two combined would be even better!

What tools are the most suitable for this stage?

  • An estimation by analogy
  • A top-to-bottom estimation

Estimation algorithm

If there was a project like this before, the approximate estimation would be the time spent on that project.

If there is no data on projects like that, you need to split the project into the main functional units, then estimate every block according to those that were done on other projects.

For example, with the app where the doctors would consult patients, we could have got something like that:

  • Registration
  • Appointment scheduling system
  • Notification system
  • Video consultation
  • Feedback system
  • Payment system

You could estimate the “registration” block by using something similar from another project and for the “feedback system” a block from a different project.

If there are blocks that were never done before or they lack data, you can either compare the necessary labor time against other blocks or reduce uncertainty and use the estimation method from the next stage.

For example, the “feedback system” module might seem twice as difficult as the “registration” module. Therefore, for the feedback, we could get an estimation twice as high as the registration.

The method of comparing one block against the other is not exactly precise, and it’s better used in the situation where the number of the blocks that were never done isn’t higher than 20% of the blocks that do have historic data. Otherwise, it’s just a guess.

After this, we summarize estimation of all blocks, and it will be the most possible one. The optimistic and pessimistic estimations can be calculated using the coefficients appropriate for the current stage – x0,5 and x2 (check the coefficient sheet).

Ideally, you should let your manager know what’s going on, and they will have to deal with it.

If the manager can’t deal with it and asks for one, single number, there are ways to do that.

How to calculate one estimation out of three? It will be answered down below in the corresponding chapter.

How to move to the next stage?

Prepare a full list of requirements. There are quite a few documentation ways, but we’ll look into a widely used one with a User Story.

We need to understand who will be using each block and what they’ll be doing with the blocks.

For example, for the “feedback system” block we would end up with these bullet points  after gathering and analyzing requirements:

  • A patient can check all feedback about the chosen doctor
  • The patient can leave feedback for the doctor after a video consultation with him
  • The doctor can see feedback from the patients
  • The doctor can leave a comment on feedback
  • An administrator can see all feedback
  • The administrator can edit any feedback
  • The administrator can delete feedback

You will also need to collect and write down all requirements that are not functionality-based. To do that, use this check-list:

  • What platforms is it for?
  • What operating systems need to be supported?
  • What do you need to integrate with?
  • How fast is it supposed to work?
  • How many users at the same time can use the tool?

Clarifying this stage will move you to the next one.

Stage 3. The requirements are gathered and analyzed

This stage has a full list of what each user can do in the system. There is also a list of non-functional requirements.

When does it make sense to make estimations on this stage?

When you need to give an approximate estimation for the project before you begin working with the Time & Materials model. The estimation of tasks from this stage can be used to prioritize some of them on the project, to plan the release dates and the whole project budget. You can also use those to control the team’s efficiency on the project.

What do you need to estimate on this stage?

  • The list of functional requirements
  • The list of non-functional requirements

What tools are the most suitable for this stage?

  • An estimation by analogy
  • A top-to-bottom estimation

Estimation algorithm

You need to decompose each task (split it into components). The smaller the components are, the more precise will the estimation be.

To do it on the best of your abilities, you need to represent everything that needs to be done on paper.

For example, for our User Story that goes like “a patient can see all feedback about the chosen doctor”, we could get something like this:

We split the task here into three parts:

  • Create infrastructure in the database
  • Create the DAL level for data samples
  • Create a UI where the feedback will appear

If you could, you can write down the UI functionality and approve it with whoever asks for estimation. It will eliminate lots of questions, make the estimation more precise, and be a good quality of life change.

If you want to improve your interface design skills, it’s recommended to read two books: “The Humane Interface” by Jef Raskin and “About Face. The essentials of interaction design” by Alan Cooper.

Then you need to imagine what exactly will be done for each task and estimate how long it will take. Here you have to calculate time, not guess it. You have to know what you will do to finish each subtask.

If there are tasks that take more than 8 hours, split them into subtasks.

The estimation received after having done this can be considered optimistic as it most likely uses the shortest path from point A to point B, given that we haven’t forgotten anything.

Now it’s about time we thought about things that we’ve probably missed and correct the estimation accordingly. Usually, the checklist helps here. This is an example of such a list:

  • Testing
  • Design
  • Test data creation
  • Support for different screen resolutions

After completing this list, we have to add the tasks we might have missed to the task list:

Go through each task and subtask and think about what could go wrong, what is missed. Oftentimes, this analysis reveals things without which you can’t end up with a best-case scenario. Add them to your estimation:

After you calculate this, too, your estimation will be even closer to the optimistic one than to the most possible one. If we take a look at the cone, the estimation will be closer to its lowest line.

The exception here might be if you’ve done a similar task before and can speak with authority that you know how it’s done and how long it takes. In that case, your estimation would be called “the most possible” and it’d go along with the 1x line on the cone. Otherwise, your estimation is optimistic.

The other two estimations can be calculated with the coefficients according to this stage: x0,67 and x1.5 (check out the coefficient table).

If you calculate the estimation from the example above, we’ll get this:

  • Optimistic estimation: 14 hours
  • The most possible estimation: 20 hours
  • Pessimistic estimation: 31 hours

How to move to the next stage?

By designing the UI. Creating wireframes would be the best way to go.

There are multiple programs for that but I’d recommend Balsamiq and Axure RP.

Prototyping is another huge topic that is not for this article.

Having a wireframe means that we’re on the next stage.

Stage 4. The interface is designed

We have a wireframe here as well as the full list of what each user will do in the system. We also have a list of non-functional requirements.

When does it make sense to make estimations on this stage?

To create an exact estimation by the Fixed Price model. You can also do that for everything that was mentioned in the previous stage.

What do you need to estimate on this stage?

  • Prepared wireframes
  • A list of functional requirements
  • A list of non-functional requirements

What tools are the most suitable for this stage?

  • An estimation by analogy
  • A top-to-bottom estimation

Estimation algorithm

The same as in the previous stage. The difference is in accuracy. Having a planned interface, you won’t have to think as much and the possibility of missing something is lower.

How to move to the next stage?

Design the app architecture and thoroughly think about the realization. We won’t check that option as it is used quite rarely. With that being said, the estimation algorithm after thinking about architecture won’t differ from one on this stage. The difference is, once again, in accuracy increase.

Retrieving one estimation from the range of estimations

If you have the three types of estimation ready, we can use the knowledge by Tom DeMarco to retrieve one estimation. In his book “Waltzing with bears” he mentioned that absolute possibility can be obtained by integrating the area under the curve (in the graph we had before). The original calculation template can be downloaded from here or from here without registration. You need to insert three numbers in the template and receive a result as a list of estimations with their corresponding probabilities.

For example, for our estimations of 14, 20, and 31 hours we’ll have this result:

You can choose any probability you deem decent for your organization, but I’d recommend 85%,

Don’t know how to estimate? Speak up!

If you don’t know what you’re asked or how to implement the functionality you need to estimate? Let your manager know, give him an approximate estimation, if that’s possible, and suggest actions that will make the estimation more precise.

For example, if you don’t know whether the technology works to finish the task, ask for some time to create a prototype that will either confirm your estimation or show what you’ve missed. If you are not sure that the task is doable, say that from the beginning. These things need to be confirmed before you’ve put their weight on your shoulders.

It’s very important to provide a manager with this information, otherwise, he can blindly trust you and have no idea that there’s a chance of missing the deadline by 500% of the time or simply not finish the product with the technology or requirements you have.

A good manager will always be by your side. You and he are in the same boat, and sometimes his career depends on whether you’ll finish on time even more than yours.

Have doubts? Don’t promise

Many organizations and developers help their projects fail by taking on responsibilities too early on the cone of uncertainty. It’s risky as the possible result jumps between 100% and 1600%.

Efficient organizations and developers postpone decision making up until the moment when the cone is narrower.

Usually, this is normal for organizations that are on the more mature CMMI module level. Their actions to make the cone narrow down are clearly stated and are followed.

You can see the quality of estimations and its increase in the projects of the U.S. Air Force when they moved to the more mature CMMI level:

There’s something to think about here. Other companies’ statistics confirm this correlation.

Even here, accuracy of the estimations can’t be achieved only with estimation methods. It’s inextricably linked to the efficiency of project management. It doesn’t depend only on developers, but also on project managers and senior management.

Conclusion

  • It’s nearly impossible to give an absolutely correct estimation. You can, however, affect the range in which the estimation will fluctuate. To do that, lower the uncertainty level on the project
  • Splitting tasks into components will help increase the estimation accuracy. As you decompose things, you’ll think what and how you will do things in details
  • Use checklists to lower the possibility of missing something as you estimate
  • Use the uncertainty cone to understand the range where your estimation will most probably fluctuate
  • Always be comparing the given estimate with the time that was actually spent on a task. It will help you improve your estimation skill, understand what you’ve missed, and use it as you move further.

Useful books

There is a lot of literature on the topic but I’ll recommend the two that must be read.

  • Software Estimation: Demystifying the Black Art by Steve McConnell
  • Waltzing With Bears: Managing Risks On Software Projects by Tom DeMarco and Timothy Lister.
  • Processes
    Clients' questions