
Key takeaways
• Software estimates are a probability distribution, not a single number. Steve McConnell’s Cone of Uncertainty puts kickoff-stage estimates at ±4× their final value — that range only narrows as scope solidifies.
• The CHAOS data has barely budged in twenty years. Roughly 31% of IT projects succeed, 50% are challenged (late, over budget, or de-scoped), and 19% fail outright. McKinsey’s research on large IT initiatives found average overruns of 45% on budget and 7% on schedule, with the projects delivering 56% less value than promised.
• The two most expensive estimation antipatterns are scope creep and the “padding-vs-unpadding” spiral. Roughly 52% of projects experience scope creep; the developer-pads-50% / manager-strips-30% loop produces estimates that look honest and aren’t.
• An honest estimate is a document, not a number. It includes assumptions, risks and dependencies, a work breakdown, a low/most-likely/high range, milestones and deliverables, and the buyer’s responsibilities. If your proposal does not have all six, demand them before signing.
• Fora Soft writes estimates with the math visible. We have shipped 320+ products across video, AI, telehealth, and fitness; our estimates show ranges, hours, assumptions, and the cut-list we’d propose if the budget tightens. Bring us your scope — we’ll come back with a defensible quote.
Why Fora Soft wrote this estimation playbook
Estimation is the most consequential conversation a buyer has with a vendor. The number you sign sets the budget, the runway, the team size, and the implicit promise to your board or your customer. And yet the entire industry has been bad at it for forty years, with the data to prove it.
Fora Soft has been on both sides of that conversation since 2005. We have written hundreds of estimates that held, and we have lived through the few that did not. We have audited proposals that buyers brought us for second opinions and seen the same red flags repeat: a single round number, no assumptions, no buffer for testing, a salesperson who could not name three risks. The pattern is consistent. The fix is also consistent.
This article is the playbook we wish every buyer had before they compared three competing quotes for the first time. It explains why estimates legitimately differ, what an honest estimate looks like, the most common failure modes, and the questions to ask before you sign anything. Concrete proof of the discipline: CirrusMED shipped a HIPAA telehealth MVP on the budget we wrote in week one; BrainCert launched its WebRTC classroom against the original milestones; Perspire.tv hit its first paid release on the schedule the estimate promised. Same approach across all three.
Got conflicting quotes you want a second opinion on?
Send us the proposals and your project brief. We’ll come back with a 30-minute walkthrough of where each estimate is honest, where it’s padded, and what questions to ask before you sign anything.
Why three vendors quote three wildly different numbers for the same brief
When the same brief gets a $40k quote, an $85k quote, and a $180k quote, none of them are necessarily lying. They are answering different questions inside their heads.
Different scope assumptions. One vendor read “simple chat” as a single one-to-one message thread; another read it as multi-room chat with presence and read receipts; the third read it as moderated chat with audit logs. The brief did not specify; their assumptions filled the gap differently.
Different non-coding allowances. Coding is 40–60% of a real project. Testing, code review, design, deployment, and documentation are the rest. The cheap quote often forgot half of those line items. The expensive quote priced them in.
Different team rates and seniority mix. A senior architect at $180/hr who ships in two weeks costs only slightly more in total than a junior at $40/hr who ships in eight ($180 × 80h ≈ $14.4k vs $40 × 320h ≈ $12.8k), and delivers six weeks earlier with far less rework to pay for later. Compare total cost and calendar time, not the hourly rate.
Different risk margins. A defensible estimate carries 15–30% buffer for unknowns. The cheap quote often has 0%. The buyer pays the buffer back later through change orders, missed deadlines, or a quietly extended bill.
Different sales philosophy. Some shops bid low to win the contract and recoup margin via change requests. Some bid high because they have a deep waitlist. Some bid honestly. Telling them apart is the rest of this article.
The numbers — how often estimates miss, and by how much
The CHAOS Report from Standish Group has been tracking IT project outcomes for three decades. Their long-running findings: roughly 31% of projects succeed (delivered on time, on budget, with intact scope); about 50% are “challenged” (late, over budget, or de-scoped to ship); and around 19% fail outright. Small projects perform far better — closer to 90% success — while large initiatives drop below 10%.
McKinsey’s joint research with the University of Oxford on large IT projects found that initiatives over $15M run 45% over budget on average, 7% over time, and deliver 56% less value than originally promised. Roughly 17% become “black swans” with cost overruns above 200%, threatening the company that commissioned them.
PMI’s 2025 Pulse of the Profession adds a useful counterpoint: teams led by managers with high business acumen achieve 73% budget adherence vs. 68% for the rest, and the failure rate drops from ~11% to 8%. Better project leadership materially moves the needle — not by 20 points, but by enough to matter.
The Cone of Uncertainty — estimates have a legal range
Steve McConnell’s Software Estimation: Demystifying the Black Art introduced the most useful single concept in this conversation. At project kickoff, with only an idea and no requirements, the best possible estimate has ±4× variance (anywhere from a quarter of the true cost to four times it). Once requirements are complete, the variance narrows to about ±1.5×. Once the user-interface design is done, about ±1.25×; by the end of detailed design, roughly ±1.1×.
Translation: a $100k project at the “initial concept” phase could legitimately be priced anywhere from $25k to $400k by an honest estimator. That spread does not mean the estimator is bad; it means scope is genuinely undefined. Asking for a narrower range before committing scope is asking for a guess dressed up as a number.
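A minimal sketch of that arithmetic, assuming the rounded multipliers McConnell publishes for each phase (the phase labels and the helper function are ours, for illustration only):

```python
# Cone of Uncertainty multipliers, per McConnell (rounded).
# The phase labels and this helper are illustrative, not a standard API.
CONE = {
    "initial_concept":       (0.25, 4.0),
    "requirements_complete": (0.67, 1.5),
    "detailed_design_done":  (0.9, 1.1),
}

def estimate_range(point_estimate: float, phase: str) -> tuple[float, float]:
    """Turn a single-number estimate into the honest range for its phase."""
    low_mult, high_mult = CONE[phase]
    return point_estimate * low_mult, point_estimate * high_mult

low, high = estimate_range(100_000, "initial_concept")
print(f"${low:,.0f} to ${high:,.0f}")  # $25,000 to $400,000
```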
The buyer’s job is to push the project into the narrower part of the cone before signing — via a discovery phase, a wireframe pass, or a written specification. The vendor’s job is to be transparent about which part of the cone the estimate lives in. The fixed-price quote at kickoff with three significant figures is the antipattern; the “$80k–$140k pending design freeze” is the honest version.
Three estimate types — match the type to the decision you’re making
The AACE International cost-estimate classification (used in construction and engineering, and adopted by software teams for serious work) defines five estimate classes. Three of them map to the estimate types you should be able to name.
1. Rough Order of Magnitude (ROM, AACE Class 5). Accuracy band roughly −50% / +100%. Used at 0–5% scope definition. The right answer for “is this even feasible at our budget?”. The wrong answer for “sign here”.
2. Budget Estimate (AACE Class 3). Accuracy band roughly ±15–30%. Prepared at ~30–40% design completion. The estimate type that should drive the actual budget allocation. Most signed contracts should be at this level.
3. Definitive Estimate (AACE Class 1). Accuracy band roughly ±5–10%. Requires >80% detailed design. Achievable only late in planning. A vendor offering Class 1 accuracy at kickoff is over-promising.
Estimate types compared — what each one is for
A condensed view of when each estimate type fits, what accuracy you can expect, and the decision it should drive.
| Estimate type | Accuracy band | Scope completeness | Decision it supports | When you should ask for it |
|---|---|---|---|---|
| ROM (Class 5) | −50% / +100% | 0–5% | Feasibility / go-no-go | First conversation |
| Budget (Class 3) | ±15–30% | ~30–40% | Budget allocation, signing | After discovery / wireframes |
| Definitive (Class 1) | ±5–10% | >80% | Final commitment / fixed price | After detailed design freeze |
Ten reasons estimates legitimately fail
1. Underspecified requirements and scope creep. PMI data puts scope creep at ~52% of projects. Each change feels small; the cumulative effect is a different product than the one estimated. Managing scope changes can consume a quarter of the project’s resources.
2. Hidden complexity in third-party APIs and integrations. The Stripe webhook that turns out to need an idempotency layer. The Salesforce sync that needs three custom fields. The OpenAI Realtime fallback when the model rate-limits. Estimates routinely under-budget integration work by 30–50%.
3. Wishful thinking and sales pressure. The salesperson promises the timeline that wins the contract; the engineering team inherits an impossible commitment. The fix is to involve the team that will deliver the work in the estimate that is signed.
4. Padding-vs-unpadding antipattern. The developer adds 50% to cover unknowns; the project manager strips 30% because “that looks like padding”. Run the math: 1.5 × 0.7 = 1.05, so the two moves nearly cancel and the signed number sits roughly 30% below what the developer believed the work needed, before anything starts. The fix is for both sides to declare their buffer in writing.
5. Missing testing, code review, and refactoring buffer. Coding is 40–60% of a real project. Testing is 20–30%. Code review, deployment, security checks, and the inevitable refactor pass are the rest. Estimates that show only “dev hours” under-budget by half.
6. Unaccounted non-coding work. Standups, design reviews, stakeholder demos, deployment runs, documentation, support handoff. Industry research routinely shows developers spend a minority of their time actually writing code. The estimate must price the rest.
7. Optimism bias on individual productivity. Everyone thinks they are a 10× engineer. Most engineers are 1×. The math has to assume average performance under the team’s actual conditions, not the best week of the senior’s life.
8. Brooks’s Law — communication overhead in larger teams. Three engineers carry three communication channels; six carry fifteen. Adding people late to a slipping project does not speed it up — it usually slows it down. The estimate must reflect the team size, not just total hours.
9. Unfamiliar tech stack or first-time integrations. A team writing its first WebRTC implementation, its first HIPAA-compliant audit log, or its first OpenAI Realtime agent will discover the surprises in production rather than in the estimate. Allow 20–40% premium on first-time-on-stack work.
10. Compliance and regulatory hidden cost. HIPAA, SOC 2, GDPR, PCI-DSS, accessibility (WCAG). The compliance posture is often invisible until the audit fails. Estimates that mention compliance only as “we’ll handle that” are undisclosed risk; honest estimates put a separate line item with hours, evidence requirements, and a named compliance lead.
What an honest estimate actually looks like
A defensible estimate is a document, not a number. The structure we use, and the structure you should expect from any vendor:
1. Assumptions. What scope was assumed. Which integrations were treated as “ready” vs. “build”. Which platforms (web, iOS, Android, desktop) are in. Which user types are out for now. The list should be five to fifteen items.
2. Risks and dependencies. Three to seven named risks, each with a probability assessment and a mitigation plan. Dependencies on the buyer (sample data, decision authority, third-party access) named explicitly.
3. Work breakdown. Hours per role: design, front-end, back-end, QA, DevOps, project management. The cheap-quote red flag is “dev hours” as a single line. The honest version separates the line items.
4. Range, not a single number. Low / most likely / high. Three-point or PERT estimation surfaces the variance. A single-number estimate asks you to sign without that information.
5. Milestones and deliverables. A demo cadence (every two weeks is the modern default), a working build at each milestone, and the deliverable you can put in front of a customer or investor by date X.
6. Buyer’s responsibilities. The estimate binds both sides. The buyer’s obligations — provide assets by date Y, decide on copy by date Z, give us API credentials by date W — are part of the math. If the buyer slips, the schedule slips. That should be in writing on both sides.
The estimation techniques behind a real number
Bottom-up (work breakdown structure). Decompose the project into tasks; estimate each in hours; sum. The most accurate technique once scope is reasonable. Labor-intensive but defensible.
Top-down (analogy). Compare to past similar projects. Fast; biased by selection. Useful as a sanity check on bottom-up, dangerous as a primary technique.
Three-point / PERT. Estimate optimistic / most likely / pessimistic; weight as (O + 4M + P)/6. Surfaces variance. Pairs well with bottom-up.
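A minimal sketch of that formula, with made-up task hours and a helper function of our own naming:

```python
def pert(optimistic: float, most_likely: float, pessimistic: float) -> tuple[float, float]:
    """PERT expected value plus the common standard-deviation approximation."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

# Example: a task sized at 40h optimistic, 60h most likely, 120h pessimistic.
expected, std_dev = pert(40, 60, 120)
print(f"{expected:.0f}h expected, +/- {std_dev:.0f}h")  # 67h expected, +/- 13h
```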
Planning poker / story points. Agile teams estimate user stories using consensus and Fibonacci sizes (1, 2, 3, 5, 8, 13). Captures team disagreement and forces explicit discussion of why estimates differ.
T-shirt sizing. Quick S / M / L / XL pass for early triage. Useful in roadmap conversations; not signing-grade.
Function points / COCOMO II. Algorithmic models that derive effort from feature counts and complexity factors. Heavy machinery; useful on enterprise and regulated work.
Eight red flags in proposals (and the questions that surface them)
1. Single-point estimate with no range. Ask: “What’s your low and high? Why?” If they cannot answer, the number is a guess.
2. Estimate identical to a competitor’s. Suggests copying, not analysis. Ask each vendor for their five biggest assumptions; compare the lists.
3. No breakdown of design / dev / QA / PM. Ask for the hours per role. The honest answer fits in a one-page table.
4. No testing or buffer line. Ask: “What percent of the budget is QA, code review, and contingency?” Healthy answer: 20–40% combined.
5. Hourly rate too low to be sustainable. Below ~$50/hr in 2026 the work is usually offshore juniors with thin oversight. Below ~$30/hr it is a coin-flip on quality. The cheap rate often costs more after the rework.
6. Salesperson cannot explain the assumptions. Ask the salesperson the three biggest risks. If they hand you back to the engineering team, the engineering team should be in the room from the next call onward.
7. “We’ll figure it out as we go”. Translation: scope creep, change orders, no commitment to a number. Acceptable only on T&M with weekly demos and an active product owner; never on fixed-price.
8. Fixed-price proposal with vague scope language. “Reasonable iterations on the design”. “Standard integrations”. “Best-effort QA”. Each of these phrases is a future change-order ambush. Demand exact specifications or move to T&M with a cap.
Worried the cheap quote is too good to be true?
Send us all the proposals. We’ll mark the assumptions, find the missing line items, and tell you which questions to take back to each vendor. No pitch — just a sanity check.
Fixed price vs T&M vs hybrid — pick the one that matches your risk profile
Fixed price. The vendor bears scope and timeline risk; the buyer gets predictability. Hides a 15–30% risk premium in the quote (otherwise the vendor would lose money on the unknowns). Right when scope is genuinely locked — rare. Catastrophic when it isn’t, because the vendor cuts corners on testing or quality to recover margin.
Time and Materials (T&M). The buyer bears risk; the vendor is paid for hours worked. Aligns the vendor’s incentives with quality and transparency. Requires active oversight by a product owner on the buyer’s side. Without that, T&M is just open-ended billing.
Hybrid (target with cap). Agreed target price plus a cap; over-runs split or absorbed by vendor up to a defined limit. Aligns both sides on staying within budget while leaving room for genuine surprises. The most common 2026 default for serious work.
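To make the mechanics concrete, here is a hedged sketch of one common target-with-cap variant. The target, cap, and 50/50 split below are placeholders; the real terms live in the contract:

```python
def buyer_cost(actual: float, target: float, cap: float, vendor_share: float = 0.5) -> float:
    """What the buyer pays under a target price with a cap and a shared overrun.

    Below target the buyer pays actuals; between target and cap the overrun is
    split; above the cap the vendor absorbs the rest. Illustrative terms only.
    """
    if actual <= target:
        return actual
    billable_overrun = min(actual, cap) - target
    return target + billable_overrun * (1 - vendor_share)

# Example: $100k target, $130k cap, 50/50 split on overruns.
for actual in (90_000, 115_000, 150_000):
    print(f"{actual:,} actual -> {buyer_cost(actual, 100_000, 130_000):,.0f} billed")
# 90,000 -> 90,000; 115,000 -> 107,500; 150,000 -> 115,000
```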
Per-sprint or outcomes-based. Pay sprint by sprint based on agreed deliverables, or tie compensation to business outcomes (uptime, conversion lift, revenue). Newer pattern in 2025–2026; requires high trust and good metrics. Right for partnerships with established vendors, not first engagements.
How AI tooling has changed estimates in 2026
Agent Engineering and AI code assistants have materially compressed the routine parts of software work. Public benchmarks show modern coding agents resolving 70–80%+ of well-scoped tasks on standard SWE-bench evaluations; senior engineers using them report meaningful weekly time savings on refactors, debugging, and boilerplate.
The practical effect on estimates: a senior engineer with modern tooling can deliver scope in 2026 that would have needed two engineers in 2020. We see this on every project we run, which is part of why our estimates routinely beat 2020-era industry benchmarks. The buyer-side implication: if a vendor’s 2026 quote uses 2020 hourly numbers without acknowledging tooling gains, they are pricing the wrong year.
The honest counterpoint: AI tools accelerate routine work, not unfamiliar work. Compliance, integrations, ambiguous requirements, and design decisions still take roughly the same time. The savings show up on the dev-hours line; the design and PM lines stay roughly flat. A vendor claiming “AI cuts everything in half” is over-selling. A vendor claiming “AI cuts the dev line by 30–40% on routine tasks” is honest.
Mini case — how Fora Soft writes an estimate
A useful proof-of-pattern. We’ll walk through the rough shape of an estimate we recently wrote for a video MVP — numbers approximate, structure verbatim.
Scope. 1-on-1 video call with recording, basic chat, payment integration, web + iOS + Android. Three user types. Eight integration points. Compliance: SOC 2 path, no HIPAA in v1.
Work breakdown. Discovery 80h. Product design 200h. Front-end web 320h. iOS 280h. Android 280h. Back-end + APIs 360h. RTC integration 200h. Recording & storage 120h. QA 240h. DevOps + deployment 80h. Project management 200h. Total 2,360h.
Range. Low 2,000h (if scope holds and integrations are smooth); most likely 2,360h; high 2,900h (if RTC integration surprises us). PERT-weighted central estimate: ~2,400h.
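The same math written out so it can be checked; the hours are copied from the breakdown above, and the dictionary keys are just labels:

```python
# Hours per line from the work breakdown above.
breakdown = {
    "discovery": 80, "product_design": 200, "frontend_web": 320,
    "ios": 280, "android": 280, "backend_apis": 360,
    "rtc_integration": 200, "recording_storage": 120,
    "qa": 240, "devops_deployment": 80, "project_management": 200,
}
total = sum(breakdown.values())  # 2,360h

# PERT-weighted central estimate from the low / most-likely / high range.
low, most_likely, high = 2_000, total, 2_900
central = (low + 4 * most_likely + high) / 6
print(total, round(central))  # 2360 2390 -- quoted as ~2,400h in the estimate
```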
Assumptions. Sample data and brand assets provided by week 2. Decision authority on the buyer’s side responds within 48 hours. Stripe is the payment processor. Agora is the RTC platform. iOS supported on iOS 16+; Android on API 28+.
Risks. RTC integration on iOS could surface AVFoundation surprises (mitigation: 40h buffer in iOS line). SOC 2 evidence collection often takes longer than scoped (mitigation: separate evidence track with named owner).
Buyer’s responsibilities. Sample assets, payment processor account, App Store accounts, decision SLA. The signed estimate has a small table with buyer-side line items and dates. If the buyer slips, the schedule slips, and the estimate gets re-cut on a documented change.
Want a similar estimate for your project?
A buyer’s decision framework in five questions
Q1. Is the scope clear enough for a Class 3 estimate? If you cannot describe the user, the workflow, and the integrations in one page, you are still at Class 5 (ROM). Run a paid discovery sprint first; sign on the budget estimate after.
Q2. Did the proposal name three risks? If yes, the vendor has thought about your project. If no, the number is generic.
Q3. Does the work breakdown show design / dev / QA / PM as separate lines? If yes, the math is defensible. If no, the math is hidden.
Q4. Is the buffer 15–30% of the total? Below 15% is wishful; above 30% is either padding or a sign the vendor does not yet understand the scope, and neither is a number worth signing.
Q5. Will the team that delivers be the team in this scoping call? If yes, the estimate has accountability. If no — "we’ll match you with a delivery team after signing" — you are signing for an unknown team’s assumptions.
KPIs to monitor once the project is running
Schedule variance. Actual hours vs. planned hours per milestone. The first 20% of the project predicts the rest with surprising accuracy — if you are 30% over hours by milestone two, you will be 30% over the budget.
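A minimal sketch of the extrapolation behind that rule of thumb; the milestone numbers below are invented for illustration:

```python
def projected_total(planned_total: float, planned_to_date: float, actual_to_date: float) -> float:
    """If the burn rate so far holds, what the project will actually take."""
    return planned_total * (actual_to_date / planned_to_date)

# Example: 2,360h plan, 470h planned by milestone two, 610h actually burned.
print(round(projected_total(2_360, 470, 610)))  # ~3,063h, roughly 30% over plan
```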
Scope-change rate. Number and size of change orders per month. A healthy project has fewer than two material changes per month after design freeze. More than that is scope creep, and the estimate must be re-cut, not absorbed.
Defect density. Bugs per thousand lines (or per delivered feature). A vendor that under-quotes QA will surface here within four weeks of beta.
Demo cadence. Working build at every demo. If the demo slips or the build is broken, the schedule is slipping — surface it the same week, not at month end.
Frequently asked questions
Why do estimates miss so often if the industry has been doing this for forty years?
Three reasons. First, software work is creative work with embedded uncertainty — you cannot fully estimate something you have not yet designed. Second, the planning fallacy (Kahneman, Tversky) systematically biases humans toward optimistic estimates. Third, sales pressure rewards low quotes that win contracts, even when the engineering team would have priced higher. The fix is process discipline (Cone of Uncertainty, three-point estimation, Class-3 budget estimates) plus organizational courage to say “we don’t know yet”.
Should I always pick the cheapest estimate?
No. Cheap estimates often skip testing, code review, compliance, or buffer. The total cost frequently ends up higher after rework, change orders, and missed deadlines. Compare estimates on the work-breakdown line items, not the bottom-line number. Reject the lowest if it does not name testing, QA, and contingency as separate lines.
What’s a fair buffer percentage in 2026?
15–30% of total hours, depending on scope clarity. Less than 15% is wishful; more than 30% means scope is too unclear to commit yet — a discovery phase is the right answer rather than a padded number. The buffer should be transparent in the estimate, not hidden inside per-line items.
Fixed price or T&M — which is better?
Neither universally. Fixed price is right when scope is genuinely locked and the buyer wants predictability; T&M is right when scope will evolve and the buyer has an active product owner. The 2026 default for serious work is hybrid — target cost with a cap, with both sides taking a share of overage and savings. That model aligns incentives without trapping either party.
How do I read three competing quotes that range $40k / $85k / $180k?
Reconcile their assumptions. Ask each vendor for their five-line scope summary, their work-breakdown table, and their three biggest risks. The cheap quote almost always assumed less scope or less rigor. The expensive quote usually priced more line items. Once the scope assumptions match, compare hours by role — the spread will narrow significantly. We do this exercise for buyers as a paid second-opinion service.
Do AI coding tools really make estimates lower?
On routine work, yes — 30–40% compression on coding hours is realistic for senior engineers using modern AI assistants. On unfamiliar work (compliance, integrations, ambiguous requirements, design), the savings are smaller. Honest 2026 estimates show the dev-hours line lower than the equivalent 2020 estimate, but the design / PM / QA lines roughly the same. A vendor showing “AI cuts everything in half” is selling. A vendor showing “AI cuts dev hours 30–40% on routine tasks” is honest.
What’s the difference between an estimate and a quote?
An estimate is a probabilistic prediction with a range; a quote is a binding commitment. Estimates are right at ROM and Budget stages; quotes are appropriate at Definitive stage (after detailed design). Confusing the two — treating an early estimate as a binding quote — is one of the biggest sources of project disputes. Make sure both sides know which one they are signing.
How do I keep an estimate honest after we sign?
Weekly status updates with hours-burned vs hours-planned. Two-week demo cadence with a working build. A change-order log with explicit cost and schedule impact for any new scope. Monthly retros that compare actuals to plan and surface variance early. The estimate is a living document, not a one-time signing. The vendor should propose this cadence in the original proposal — if they don’t, ask for it.
What to Read Next
Scoping
Why Cut Features and Launch Early — the MVP Playbook
The companion playbook on cutting scope before signing — smaller estimates start with smaller scopes.
Cost
Am I Overpaying for Development?
A line-by-line sanity check on your proposal — what a fair 2026 quote actually looks like.
Cost
How to Cut Costs on a Software Project
Smart cost-cutting moves that preserve quality, and the corners that are dangerous to cut.
Recovery
Deadlines Slipping — What to Do
When the estimate has already failed — a recovery framework that does not throw the budget overboard.
Discovery
What Happens in the Analytical Stage
The discovery work that turns ROM-stage uncertainty into a Budget-class estimate worth signing.
Ready to compare estimates honestly?
Software estimation will keep being hard because software work is creative work with embedded uncertainty. The fix is not better algorithms; it is better discipline. Use the Cone of Uncertainty to know which estimate type you are reading. Use three-point estimation to surface variance. Demand a work breakdown, named risks, and explicit assumptions. Reject single-number quotes from sales teams that cannot explain their math.
If you have proposals on your desk and the spread does not make sense, the most useful next step is a 30-minute audit call with senior engineers who write estimates every week. We will mark the assumptions, find the missing line items, and tell you what to ask each vendor before you sign. The goal is not to win your contract — it is to make sure whatever contract you sign is one you can actually deliver against.
Get a sane second opinion on your proposal
A 30-minute audit call with senior engineers who write estimates every week. Bring the proposals, the brief, and the budget — we’ll come back with the questions to ask each vendor before signing.

