Microburbs Research Whitepaper

How Reliable Are Property Experts?

A first-principles analysis of 15 years of Australian national dwelling price forecasts. Measuring direction accuracy, failure modes, and what expert consensus actually means for property investors.

By Luke Metcalfe · Microburbs Research · February 2026

15 Years analysed

115+ Individual forecasts

20+ Institutions tracked

81% Direction accuracy

0 Executive Summary
1 The Question
2 The 15-Year Scorecard
3 Direction Accuracy in Statistical Context
4 The Two Failure Modes
5 The Investor Utility Test
6 Not All Forecasters Are Equal
7 Why Forecasters Fail
8 The Three-Tier Trust Framework
9 Connection to Street-Level Analysis
10 Limitations
11 Method Appendix

Executive Summary

The headline looks good. The detail does not.

Direction accuracy is 81%. Experts correctly called up or down in 13 of 15 years. But Australian property rose in 12 of those 15 years. A permanent-bull strategy of always predicting ‘prices go up’ would score 75% by default.

The expert edge over that baseline is +6 percentage points. That gap is not statistically significant (p ~0.24). Two failure modes matter most. Bearish calls fail 43% of the time. And magnitude is systematically understated by 3.6 percentage points on average.

Investor utility of following consensus is near-zero. A consensus-following strategy beats buy-and-hold by just $8 on a $100 starting balance over 16 years ($241 against $233). That is +3.4% relative to the buy-and-hold end balance, or roughly +0.2% per annum. After transaction costs, it goes negative.

The 81% accuracy figure is real. But it tells you almost nothing about whether you should listen to expert forecasts when making property decisions.

Section 1

The Question

Every December, Australian banks, economists, and research houses publish national dwelling price forecasts for the year ahead. These forecasts shape billions in investment decisions. Media amplifies them. Buyers agents cite them. Investors trade on them.

But how good are they? Not ‘do they sound reasonable.’ How good are they when measured against what actually happened?

We collected 115+ individual forecasts from 20+ institutions over 15 years (2010 to 2024). Then we tested each one against reality. The answer breaks into four dimensions.

📋

Four dimensions of reliability:

1. Direction. Did experts get up/down right?

2. Failure modes. When and how do experts get it wrong?

3. Magnitude. How far off are the actual numbers?

4. Investor utility. Does following consensus make money?

Each dimension tells a different story. The direction number looks impressive. The rest of it does not.

Section 2

The 15-Year Scorecard

This is the full record. Every year from 2010 to 2024. Consensus forecast range, actual outcome, error, and direction call. No cherry-picking. No omissions.

2025 is excluded. Using current-year data creates a circularity risk where the forecast and measurement overlap.

Year	Consensus	Actual	Error	Direction	Note
2010	+3% to +5%	+5.0%	+0.0pp to +2.0pp	Correct	Stimulus tailwind
2011	+3% to +5%	−3.8%	6.8pp to 8.8pp	Wrong	GFC aftershock, rate rises
2012	Flat to +3%	+0.3%	0.0pp to +2.7pp	Correct	Slow recovery
2013	+3% to +5%	+9.8%	+4.8pp to +6.8pp	Correct	Credit boom begins
2014	+5% to +7%	+7.9%	+0.9pp to +2.9pp	Correct	Sustained credit growth
2015	+4% to +6%	+8.0%	+2.0pp to +4.0pp	Correct	Pre-APRA frenzy
2016	+3% to +5%	+5.6%	+0.6pp to +2.6pp	Correct	Moderate growth
2017	+3% to +5%	+4.2%	−0.8pp to +1.2pp	Correct	APRA tightening begins
2018	−2% to +1%	−6.5%	4.5pp to 7.5pp	Correct	Full APRA squeeze
2019	−5% to −8%	+2.3%	7.3pp to 10.3pp	Wrong	Election reversal, APRA easing
2020	−10% to −20%	+3.0%	13.0pp to 23.0pp	Wrong	Emergency stimulus
2021	+5% to +10%	+22.1%	+12.1pp to +17.1pp	Correct	TFF boom
2022	−15% to −20%	−5.3%	9.7pp to 14.7pp	Correct	Rate rises, bears overshot 2x
2023	−3% to −8%	+8.1%	11.1pp to 16.1pp	Wrong	Net migration 510,000
2024	+4% to +6%	+4.9%	0.0pp to +0.9pp	Correct	Most accurate year

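Scorecard summary: on the 15 rows above, the Direction column shows 11 correct and 4 wrong (2011, 2019, 2020 and 2023). That is 73% on this table alone; the 81% headline equals 13 of 16, the 16-data-point count used in Section 3. But look at the Error column. Even when the direction is right, the magnitude is often far off. The 2021 undershoot was +12 to +17 percentage points. The 2022 overshoot was 10 to 15 percentage points. Correct direction masks large numerical errors.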

Consensus tracks direction tolerably. It misses magnitude badly in volatile years.

Bar chart comparing consensus forecast range against actual national dwelling price change for each year from 2010 to 2024. Shows correct direction in most years but large magnitude gaps in 2013, 2020, 2021, 2022, and 2023.

Consensus forecast range vs actual outcome, 2010-2024. Source: Microburbs Research, CoreLogic.

Section 3

Direction Accuracy in Statistical Context

81% sounds strong. But context matters. Australian property prices rose in 12 of the 15 years studied. A strategy of always predicting ‘up’ scores 75%, counted as 12 of 16 data points: the 16 includes 2025, which is excluded from forecast scoring. (On the 15 scored years alone, 12 of 15 is 80%.) We call this the permanent-bull baseline.

The expert edge is 81% minus 75%. That is +6 percentage points. Is that statistically significant?

No. The p-value is approximately 0.24. That means that if experts had no real edge over the baseline, we would still see at least this much outperformance about 24% of the time by chance. The standard threshold for statistical significance is 0.05. The expert result comes in at 0.24. Not close.
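The p-value can be approximated with an exact one-sided binomial tail. This is a minimal sketch, assuming 13 correct direction calls in 15 years tested against a 75% baseline hit rate. The function name is illustrative and is not taken from the original analysis.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p_null: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumed inputs: 13 correct calls in 15 years, permanent-bull hit rate of 75%.
p_value = binomial_tail(13, 15, 0.75)
print(f"one-sided p-value: {p_value:.3f}")  # about 0.236
```

Under these assumptions the tail probability is about 0.24, well above the 0.05 threshold.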

The confidence intervals overlap

The 95% confidence interval for expert direction accuracy is 54% to 96%. For the permanent-bull baseline it is 48% to 92%. These ranges overlap almost entirely. With only 15 years of data, we cannot separate skill from the market’s upward drift.
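Exact (Clopper-Pearson) intervals of this kind can be computed without any statistics library. The sketch below assumes the counts are 13 of 16 for experts and 12 of 16 for the permanent-bull baseline, which is an inference from the 16-data-point count rather than a stated input. The helper names are illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target: float) -> float:
    """Bisection on [0, 1] for a function that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2,
    # i.e. P(X > k) equals 1 - alpha / 2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1.0 - binom_cdf(k, n, p), 1 - alpha / 2
    )
    return lower, upper


expert_lo, expert_hi = clopper_pearson(13, 16)  # assumed count: 13 of 16
bull_lo, bull_hi = clopper_pearson(12, 16)      # assumed count: 12 of 16
print(f"experts:        {expert_lo:.1%} to {expert_hi:.1%}")
print(f"permanent-bull: {bull_lo:.1%} to {bull_hi:.1%}")
```

On these assumed counts the two intervals land close to the ranges quoted above and overlap across most of their width.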

⚠

Bootstrap test (50,000 resamples): Experts beat the permanent-bull in only 71% of resampled histories. In 29% of simulations, simply always predicting ‘up’ performed equally well or better.
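The mechanics of a bootstrap comparison like this can be sketched in a few lines. The inputs here are only the Direction column and the sign of Actual from the Section 2 scorecard, so the printed share should not be expected to match the 71% figure. The function name and the seed are illustrative assumptions.

```python
import random

# One entry per year, 2010 to 2024, read from the Section 2 scorecard.
# expert_hit: 1 where the Direction column says "Correct".
# bull_hit:   1 where Actual was positive, so an "always up" call was right.
expert_hit = [1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
bull_hit = [1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1]


def share_expert_beats_baseline(expert, baseline, resamples=50_000, seed=0):
    """Share of resampled histories in which experts score strictly higher."""
    rng = random.Random(seed)
    n = len(expert)
    wins = 0
    for _ in range(resamples):
        # Draw n years with replacement, keeping expert and baseline paired.
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(expert[i] for i in idx) > sum(baseline[i] for i in idx):
            wins += 1
    return wins / resamples


share = share_expert_beats_baseline(expert_hit, bull_hit)
print(f"experts beat permanent-bull in {share:.0%} of resamples")
```

Resampling whole years keeps each expert call paired with the outcome it was scored against, which is the point of the comparison.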

The 81% is real. We are not disputing the number. But it does not clear the bar for statistical significance. Given more years of data, it might. Right now, we cannot say experts are significantly better than ‘prices will rise.’

That is a meaningful finding for investors who anchor to the direction accuracy headline.

Section 4

The Two Failure Modes

Not all errors are equal. Two patterns dominate expert failures. Both have direct consequences for investors.

Failure Mode 1: Bearish calls fail 43% of the time

Over 15 years, experts made 7 bearish consensus calls. Four were correct (2011, 2018, 2022, and partially 2019 before the reversal). Three were wrong (2019, 2020, 2023). A 43% failure rate on bearish calls means that nearly half the time experts warn of a downturn, prices actually rise.

All three failures shared the same pattern. Experts correctly identified economic stress. Then a discrete government policy decision overrode the mechanism they were modelling.

❌

2019: Election reversal + APRA lending restriction easing. Experts expected credit squeeze to continue. Policy reversed instead.

2020: HomeBuilder grants + emergency rate cuts + RBA quantitative easing. Experts modelled a pandemic recession. Government printed its way out.

2023: Net migration hit 510,000. Experts modelled rate-driven demand destruction. Immigration overwhelmed that effect.

The pattern is consistent. Expert models handle gradual economic shifts well. They fail when a single policy decision creates a step change that invalidates the model’s assumptions. Experts model economies. Governments make decisions. Those are different things.

Failure Mode 2: Magnitude is systematically understated

In 8 of 9 correctly called bullish years, experts understated the actual gain. The average undershoot was 3.6 percentage points. That is not random noise. It is a structural bias toward conservatism in bullish forecasts.

Year	Consensus (midpoint)	Actual	Undershot by
2010	+4%	+5.0%	1.0pp
2013	+4%	+9.8%	5.8pp
2014	+6%	+7.9%	1.9pp
2015	+5%	+8.0%	3.0pp
2016	+4%	+5.6%	1.6pp
2021	+7.5%	+22.1%	14.6pp
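The undershoot for each year can be recomputed from the Section 2 consensus ranges. This is a minimal sketch, assuming the midpoint is the simple average of the low and high ends of each range. The function name is illustrative.

```python
# (low, high) consensus range and actual outcome, in percent, from Section 2.
bullish_years = {
    2010: ((3.0, 5.0), 5.0),
    2013: ((3.0, 5.0), 9.8),
    2014: ((5.0, 7.0), 7.9),
    2015: ((4.0, 6.0), 8.0),
    2016: ((3.0, 5.0), 5.6),
    2021: ((5.0, 10.0), 22.1),
}


def undershoot(low: float, high: float, actual: float) -> float:
    """Actual minus the range midpoint, in percentage points."""
    midpoint = (low + high) / 2
    return round(actual - midpoint, 1)


for year, ((low, high), actual) in bullish_years.items():
    mid = (low + high) / 2
    print(f"{year}: midpoint {mid:+.1f}%, undershoot {undershoot(low, high, actual):.1f}pp")
```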

The 2021 miss is the largest in the dataset. Experts predicted a recovery of +5% to +10%. Prices rose 22.1%. The Term Funding Facility, combined with rock-bottom rates, unleashed demand that no forecaster captured.

Forecast errors accumulate. Cumulative misses exceed 80 percentage points across 15 years.

Line chart showing cumulative absolute forecast error growing from 2010 to 2024, reaching over 80 percentage points total. Steepest increases occur at 2020 and 2023.

Cumulative absolute error in consensus forecasts, 2010-2024. Source: Microburbs Research.

Section 5

The Investor Utility Test

Accuracy is academic. What matters is whether following expert consensus makes money. We ran a simple wealth simulation.

Start with $100. Two strategies. Always-hold owns Australian residential property for the full 16 years. Consensus-following owns when experts predict growth and moves to cash when experts predict a fall.

Always-Hold

$233

Set and forget for 16 years

Consensus-Following

$241

Switch to cash on bearish calls

The consensus-follower ends with $241 against $233. An advantage of $8. That is +3.4% relative to the always-hold end balance over 16 years, or about +0.2% per annum. Barely measurable.
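The step from the $8 gap to a per-annum figure is a compounding calculation. A short sketch, using only the two end balances and the 16-year horizon stated above:

```python
hold_end = 233.0       # always-hold end balance ($)
consensus_end = 241.0  # consensus-following end balance ($)
years = 16

# Gap relative to the always-hold end balance.
relative_gain = consensus_end / hold_end - 1
# Constant yearly edge that compounds to the same gap over the horizon.
annualised = (consensus_end / hold_end) ** (1 / years) - 1

print(f"{relative_gain:.1%} total, {annualised:.2%} per annum")
```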

And that $8 assumes zero transaction costs. In reality, exiting and re-entering Australian property involves stamp duty, agent fees, conveyancing, and capital gains tax. The consensus-follower would have sold in 2019 and 2023 (both wrong calls) and re-entered later. After those costs, the consensus-follower underperforms.

The leverage amplification problem

Most property investors use debt. At 80% LVR (5x leverage), a wrong bearish call is catastrophic to returns. Not because you lose money sitting in cash. Because you miss gains that are multiplied by leverage.

⚠

2020 wrong bearish call: Forfeited +3.0% property return. At 5x leverage: 3.0% × 5 = 15.0% gross equity return forfeited, or −13pp of equity growth net.

2023 wrong bearish call: Forfeited +8.1% property return. At 5x leverage: 8.1% × 5 = 40.5% gross equity return forfeited, or −38.5pp of equity growth net, lost in a single year by listening to expert consensus. Both net figures are 2pp below the gross leveraged return.

This is the core finding for leveraged investors. The 43% bearish call failure rate is not an abstract statistical curiosity. It is a mechanism that destroys real wealth for investors who act on it.
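The leverage arithmetic can be sketched as follows. This computes the gross equity return forfeited only: it ignores interest costs and any return earned on cash, and the function names are illustrative.

```python
def leverage_multiple(lvr: float) -> float:
    """Property exposure per dollar of equity at a given loan-to-value ratio."""
    return 1.0 / (1.0 - lvr)


def forfeited_equity_return(property_return_pct: float, lvr: float) -> float:
    """Gross equity return (pp) missed by sitting out a year of price growth."""
    return property_return_pct * leverage_multiple(lvr)


# Wrong bearish calls and the property return that followed, from Section 2.
for year, property_return in [(2020, 3.0), (2023, 8.1)]:
    missed = forfeited_equity_return(property_return, lvr=0.80)
    print(f"{year}: {missed:.1f}pp gross equity return forfeited")
```

At 80% LVR the multiple is 5, so the 2023 miss scales from 8.1% on the property to 40.5pp on equity before any offsets.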

Section 6

Not All Forecasters Are Equal

Lumping all experts together hides meaningful differences. We categorised forecasters into four groups and scored each one.

Category	Direction Accuracy	Correct/Total	Avg Sentiment	Structural Bias
Big 4 Banks	73%	8/11	0.00 (neutral)	Structurally bullish due to mortgage exposure
RBA	67%	6/9	+0.14	Tracks consensus, does not lead
Independent	57%	8/14	−0.36	Slight bearish tilt
Regulators & Govt	25%	2/8	−0.73	Very bearish. Category error: signalling policy intent, not predicting

The Big 4 banks are the most accurate category at 73%. This makes sense. They have the best real-time lending data. They also have the strongest incentive to be right because their balance sheets depend on property prices.

The worst performers are regulators and government bodies at 25%. But this is arguably a category error. APRA and Treasury are not trying to predict the market. They are signalling policy intent. When APRA warns of housing risks, that warning often triggers the policy change that prevents the risk from materialising. Their ‘forecasts’ are closer to threat assessments than predictions.

Bank economists outperform. Independent bears and regulators trail.

Grouped bar chart showing direction accuracy and average sentiment by forecaster category. Big 4 Banks lead at 73% accuracy with neutral sentiment. Regulators trail at 25% with strongly bearish sentiment.

Direction accuracy and sentiment bias by forecaster category. Source: Microburbs Research.

The SQM Research exception

One forecaster stands apart. SQM Research publishes scenario ranges rather than point estimates. This approach acknowledges the uncertainty that single-number forecasts pretend does not exist.

Year	SQM Scenario Range	Actual	Result
2013	+7% to +12%	+9.8%	Inside range
2014	+7% to +11%	+7.9%	Inside range
2019	−3% to +3%	+2.3%	Inside range
2022	−3% to −9%	−5.3%	Inside range
2023	0% to −3%	+8.1%	Outside (shock year)
2024	+4% to +6%	+4.9%	Inside range

The actual result fell inside SQM’s scenario range in 5 of 6 years tracked. The only miss was 2023, the same year that broke every other forecaster. Scenario-based forecasting is more honest about uncertainty. And more useful because of it.
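The inside-range count can be checked directly against the table. A small sketch, with the ranges and outcomes copied from the SQM table above:

```python
# (range bounds, actual outcome) in percent, from the SQM table.
sqm = {
    2013: ((7.0, 12.0), 9.8),
    2014: ((7.0, 11.0), 7.9),
    2019: ((-3.0, 3.0), 2.3),
    2022: ((-3.0, -9.0), -5.3),
    2023: ((0.0, -3.0), 8.1),
    2024: ((4.0, 6.0), 4.9),
}


def inside(bounds: tuple[float, float], actual: float) -> bool:
    """True if the outcome falls within the range, whichever order it is written in."""
    return min(bounds) <= actual <= max(bounds)


hits = [year for year, (bounds, actual) in sqm.items() if inside(bounds, actual)]
missed = sorted(set(sqm) - set(hits))
print(f"{len(hits)} of {len(sqm)} inside range; missed: {missed}")
```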

Section 7

Why Forecasters Fail

Fifteen years of data reveal five systematic biases. These are not individual mistakes. They are structural features of how property forecasting works in Australia.

1. Over-weighting interest rates

The rate-price correlation is weaker than most models assume. The standard story is simple: rates go up, prices go down. But by late 2023 the cash rate had risen by roughly 400 basis points since May 2022, and prices still rose 8.1% that year. The mechanism exists. It is just one of many forces acting on prices, and not always the dominant one.

2. Under-weighting supply constraints and immigration

Immigration is hard to model because it is a policy variable, not an economic one. The government can change it in a single budget announcement. Net migration of 510,000 in 2023 was unprecedented. No forecast model had that number as an input because no forecast model could have predicted it.

3. Recency bias

After two good years, consensus drifts bullish. After a fall, it turns bearish. The 2020-2021 sequence is the textbook example. Experts predicted a crash in 2020 (prices rose 3.0%). Then they predicted a moderate recovery in 2021 (prices rose 22.1%). Consensus missed in both years: wrong on direction in 2020, then right on direction but far too low in 2021.

4. Anchoring to models built on limited history

Australian property has had only 3 genuine national corrections since 1990. Models trained on scarce downside data overfit to small samples. When a model has seen three crashes, it treats any similar conditions as crash conditions. But three data points is not a pattern. It is an anecdote.

5. Groupthink

The strongest consensus years produced the largest errors. When every forecaster agrees, that agreement often reflects shared assumptions rather than independent analysis. Twenty economists reading the same RBA minutes and running similar models will converge on similar answers. That convergence feels like confidence. It is actually fragility.

Sentiment heatmap shows herding. Forecasters converge in the same direction, especially during crisis years.

Heatmap showing forecaster sentiment by category and year from 2010 to 2024. Strong clustering visible in 2020 (all bearish) and 2021 (all bullish), illustrating groupthink.

Forecaster sentiment by category and year. Dark red = strongly bearish, dark green = strongly bullish. Source: Microburbs Research.

Strong consensus is not a signal. It is a warning. The years when every expert agreed produced the largest average errors.

Section 8

The Three-Tier Trust Framework

Not all expert output is equally useful. This framework sorts expert forecasts into what you can use, what you should treat with caution, and what you should ignore.

Tier	What to Trust	What to Discard
High confidence	Direction in stable macro. No pending election, no APRA review, no immigration policy change.	Magnitude estimates. Systematically understated by 3.6pp on average.
Conditional	Bearish calls during confirmed credit tightening (2011, 2018, 2022 precedent).	Bearish calls depending on a single policy scenario. Failed in 2019, 2020, 2023.
Best practice	Scenario-range forecasters like SQM Research.	Single-number point forecasts from large institutions.

The framework is not about dismissing experts entirely. It is about knowing which expert outputs deserve weight and which ones do not. Direction calls in calm years are useful. Magnitude estimates are not. Bearish calls in credit-tightening environments have precedent. Bearish calls overall fail 43% of the time, and every one of those failures came in a policy-shock year.

The best forecasters are the ones who admit they do not know. Scenario ranges beat point estimates. In 5 of the 6 years tracked here, the outcome landed inside SQM’s range.

Section 9

Connection to Street-Level Analysis

Everything above analyses national-level forecasts. But property investors do not buy nations. They buy individual properties on individual streets in individual suburbs. How much does the national number actually matter for those decisions?

We measured how much of individual property price variance is explained at each geographic level.

24% National → 61% Suburb → 89% Street → 97.3% Same Street

National-level analysis explains only 24% of individual property price variance. Suburb-level explains 61%. Street-level explains 89%. And same-street properties have a 97.3% chance of moving in the same direction.

The macro consensus is one input. And a weak one at that. The variables that matter most for individual property outcomes operate at the street level. Turnover rates. Owner-occupier ratios. Proximity to public housing. Quiet street scores.

This is why Microburbs exists. Not to replace macro analysis. But to fill the 76% of variance that macro analysis cannot explain.

Sentiment tells a story. Reality tells a different one. The gap between them is where investors lose money.

Dual-axis chart comparing expert sentiment score against actual price outcomes for each year from 2010 to 2024. Largest divergences visible in 2019, 2020, and 2023.

Expert sentiment vs actual price outcomes, 2010-2024. Source: Microburbs Research, CoreLogic.

Section 10

Limitations

We are making claims about expert reliability. Intellectual honesty requires stating what this analysis cannot prove.

Data limitations

n=15 years is too few for statistical significance at conventional thresholds. The expert edge exists but we cannot confirm it is not random noise.
Consensus is a constructed variable. Individual forecasts vary widely. The median may not represent any single forecaster’s view.
Only national dwellings. Capital city and regional splits may show different accuracy patterns. Sydney and Melbourne may be more predictable than Perth or Darwin.
Policy shock classification is subjective. We labelled 2019, 2020, and 2023 as policy-driven reversals. Other analysts might classify them differently.

Method limits

Actual prices sourced from CoreLogic. Alternative indices (ABS, APM, PropTrack) may show slightly different year-on-year figures.
No forecaster-level tracking. We scored categories, not individuals. Some individual forecasters may significantly outperform or underperform their category.
Wealth simulation assumes instant, costless switching. This overstates the consensus-following advantage. Real-world transaction costs would make the consensus strategy even less attractive.

What we would need to strengthen these findings

A 30-year dataset would narrow the confidence intervals by roughly 30%. Interval width shrinks with the square root of the sample size, so halving it would take about four times the data. Forecaster-level tracking (individual analysts rather than institutions) would allow skill measurement at the person level. And extending the analysis to capital city indices would test whether expert accuracy varies by market size and volatility.

Section 11

Method Appendix

Data collection

115+ individual forecasts from 20+ institutions covering the period 2010 to 2024. Sources include Big 4 bank economic research reports, RBA statements on monetary policy, SQM Research Housing Boom and Bust reports, independent economist commentary (published in AFR, The Australian, and domain-specific outlets), and major financial media roundups (AFR year-ahead surveys, The Australian’s economic panel).

Classification

Each forecast classified by direction (bearish, neutral, or bullish) based on the forecast number, not the tone or sentiment of the commentary. A forecast of +2% is classified as bullish even if the accompanying text described ‘challenges ahead.’ Consensus derived as the median forecast for each year.

Error measurement

Absolute error measured in percentage points (pp). MAE is the mean absolute error across all years. Direction scored as correct or wrong. A correct bearish call that overstated the fall (e.g. predicted −15%, actual −5%) scores as direction correct with a magnitude error of 10pp.
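The error measures can be sketched from the Section 2 scorecard. Two assumptions apply: the point forecast is taken as the midpoint of each consensus range (the consensus itself is described as a median, which is not reproduced here), and ‘Flat’ is read as 0%.

```python
# (year, range low, range high, actual), in percent, from the Section 2 scorecard.
scorecard = [
    (2010, 3, 5, 5.0), (2011, 3, 5, -3.8), (2012, 0, 3, 0.3),
    (2013, 3, 5, 9.8), (2014, 5, 7, 7.9), (2015, 4, 6, 8.0),
    (2016, 3, 5, 5.6), (2017, 3, 5, 4.2), (2018, -2, 1, -6.5),
    (2019, -8, -5, 2.3), (2020, -20, -10, 3.0), (2021, 5, 10, 22.1),
    (2022, -20, -15, -5.3), (2023, -8, -3, 8.1), (2024, 4, 6, 4.9),
]

# Absolute error of the range midpoint against the actual outcome, per year.
abs_errors = [abs(actual - (low + high) / 2) for _, low, high, actual in scorecard]

cumulative = sum(abs_errors)
mae = cumulative / len(abs_errors)
print(f"cumulative absolute error: {cumulative:.1f}pp")
print(f"mean absolute error: {mae:.1f}pp")
```

On the midpoint assumption the cumulative error comes out above the 80pp mark cited in Section 4.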

Actual prices

CoreLogic national dwelling value index, calendar year change (January to December). This is the most widely cited residential property index in Australia and the one most forecasters benchmark against.

Statistical tests

Direction significance tested using a one-sided binomial test (experts vs permanent-bull baseline of 75% directional probability). Bootstrap resampling performed with 50,000 iterations to estimate the distribution of expert outperformance. Confidence intervals are standard Clopper-Pearson exact intervals.

Error bars tell the real story. Forecast ranges are tight. Actual outcomes are not.

Chart showing forecast confidence ranges as narrow bands versus actual outcomes as points, often falling well outside the forecast range. Widest misses in 2020, 2021, and 2023.

Forecast ranges vs actual outcomes with error bars, 2010-2024. Source: Microburbs Research.

The macro consensus is one data point.

Microburbs analyses 89 street-level factors across every pocket in Australia. Find the suburbs the experts are not talking about.


Generated 25 February 2026 at 14:32:07