Microburbs

Combined Forecast + Threshold Backtest

Can 1,200+ suburb-level threshold features improve the existing capital growth forecast? Walk-forward backtested from 2013 to 2024.

Original Top-20 Alpha: 7.0%
Combined Top-20 Alpha: 5.6%
Correlation Improvement: +0.027
Threshold Features: 221
Luke Metcalfe
Founder & Chief Data Scientist
15+ years in property data analytics

1. Data Flow Diagram

The combined model merges three data systems. The existing Random Forest forecast produces suburb-level growth predictions. Univariate threshold scores from 1,221 suburb-level metrics and 137 synthetic composite indices provide additional features. These features are joined to the forecast predictions, and the resulting model is evaluated by walk-forward backtesting.

Figure: Data Flow Diagram. End-to-end data pipeline from raw sources through feature assembly to backtested results.

Pipeline Stage Details

| Stage | Source | Input | Output | Description |
|---|---|---|---|---|
| 1. Forecast Data | fj.parquet | 4,268,395 | 1,366,457 | Filter to house 2yr, SAL-type markets, non-excluded suburbs. 6,492 unique suburbs across 216 months. |
| 2. Feature Selection | fields_ttests.parquet + selected_fields.parquet | 21,850 t-test rows | 179 features selected | 42 univariate fields from t-test significance (p < 0.05) and production selection. 137 synthetic composite fields. |
| 3. Feature Assembly | 1,221 thresh_ts + 274 synth_figures parquets | ~192K-2.1M per file | 1,614,841 x 221 | Load value and rank columns. Forward-fill sparse census features (3 dates) to monthly. Join with forecast predictions. |
| 4. Walk-Forward Backtest | Merged feature matrix | 1,366,457 x 226 | 674,868 test predictions | Two training windows (2011, 2016). Correlation filter, RF importance filter (top 50), then RandomForest(depth=6, n=200). |

The pipeline processes 4.3 million forecast rows, filtering to 1.37 million house 2-year predictions across 6,492 suburbs. It loads 42 univariate and 137 synthetic threshold features, forward-fills sparse census data (available at only 3 dates: 2011, 2016, 2021), and assembles a 221-column feature matrix for walk-forward backtesting.
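
The forward-fill step can be illustrated with a minimal sketch. It assumes census values are stored per suburb as a mapping from snapshot date to value; the function name, the August snapshot months, and the sample values are hypothetical and are not taken from the pipeline itself.

```python
from bisect import bisect_right
from datetime import date

def forward_fill_monthly(snapshots, months):
    """Carry each snapshot value forward until the next snapshot.

    snapshots: dict mapping snapshot date -> value (sparse, e.g. census dates)
    months:    list of month-start dates to fill
    Returns a dict month -> value; months before the first snapshot map to None.
    """
    snap_dates = sorted(snapshots)
    filled = {}
    for m in months:
        # Index of the latest snapshot dated on or before month m.
        i = bisect_right(snap_dates, m) - 1
        filled[m] = snapshots[snap_dates[i]] if i >= 0 else None
    return filled

# Hypothetical census share for one suburb at the three census dates.
# The month of each snapshot is an assumption made for illustration.
census = {date(2011, 8, 1): 0.10, date(2016, 8, 1): 0.12, date(2021, 8, 1): 0.15}
months = [date(y, m, 1) for y in range(2011, 2023) for m in range(1, 13)]
monthly = forward_fill_monthly(census, months)
```

Because values are only carried forward, each month sees the most recent census that had already been taken, never a later one.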

2. Overall Performance Comparison

Original Top-5 Alpha: 9.3%
Combined Top-5 Alpha: 6.9%
Original Top-20 Alpha: 7.0%
Combined Top-20 Alpha: 5.6%
Original Hit Rate: 81%
Combined Hit Rate: 79%
Original Correlation: 0.213
Combined Correlation: 0.240
Figure: Alpha Comparison. Top-N alpha: outperformance vs national average for both models.

The original forecast model produces higher alpha at the top of the ranking (top 5 and top 20 picks). However, the combined model achieves a higher overall prediction correlation with actual growth (0.240 vs 0.213). This means the threshold features improve prediction accuracy across all suburbs, but the original model is better at identifying the extreme outperformers. The original forecast remains the superior stock-picking tool.

3. Rolling 12-Month Alpha

Figure: Rolling Alpha. Rolling 12-month top-20 alpha: original forecast vs combined model.

The original forecast (blue) sits above the combined model (dark blue) in most periods. Both models track similar trends, which suggests that the threshold features capture the same underlying market dynamics, but with less precision at the extreme tails.

4. Decile Performance

Figure: Decile Performance. Mean actual growth vs national average by prediction decile (1 = best predicted).

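| Decile | Original | Combined |
|---|---|---|
| 1 | +2.74% | +2.50% |
| 2 | +1.16% | +1.77% |
| 3 | +0.65% | +1.19% |
| 4 | +0.26% | +0.65% |
| 5 | -0.10% | +0.21% |
| 6 | -0.31% | -0.21% |
| 7 | -0.52% | -0.65% |
| 8 | -0.70% | -1.20% |
| 9 | -1.11% | -1.73% |
| 10 | -2.07% | -2.48% |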

The decile chart reveals a key pattern. The combined model has a steeper gradient from top to bottom (2.50% to -2.48%), while the original model has a flatter but higher-at-top profile (2.74% to -2.07%). The combined model better separates good from bad suburbs in the middle deciles, but the original model picks slightly better at the very top.
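
A decile table of this kind can be built roughly as follows. This is a sketch only: it assumes predictions are ranked once across a pooled set of observations, which may differ from how the report groups them (for example, ranking within each month), and the function names are hypothetical.

```python
def assign_deciles(predictions):
    """Return a decile for each prediction (1 = highest predicted, 10 = lowest)."""
    n = len(predictions)
    # Indices ordered from highest to lowest prediction.
    order = sorted(range(n), key=lambda i: predictions[i], reverse=True)
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles

def mean_by_decile(predictions, actuals):
    """Mean actual outcome within each prediction decile."""
    sums, counts = {}, {}
    for d, y in zip(assign_deciles(predictions), actuals):
        sums[d] = sums.get(d, 0.0) + y
        counts[d] = counts.get(d, 0) + 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}
```

Read against the table, the gap between decile 1 and decile 10 is 4.81 percentage points for the original model (2.74 + 2.07) and 4.98 for the combined model (2.50 + 2.48), which is the steeper gradient described above.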

5. Feature Importance (Residual Model)

The residual model predicts what the forecast gets wrong. These features explain the gap between forecast predictions and actual outcomes.

Figure: Feature Importance. Top features by Random Forest importance in the residual prediction model.

| Feature | Importance |
|---|---|
| census_ferry_rnk | 0.4625 |
| prop_list_price_0_5_growth_3yr_rent_house | 0.2089 |
| prop_list_price_0_5_rent_house | 0.1095 |
| prop_list_price_0_5_growth_10yr_buy_house | 0.0938 |
| prop_list_price_0_75_growth_1yr_buy_house | 0.0377 |
| census_overseas_born_parents_val | 0.0151 |
| census_public_housing | 0.0127 |
| census_males_labour_force_15_19 | 0.0105 |
| census_speaks_indo_aryan_languages | 0.0043 |
| prop_list_price_0_5_buy_house | 0.0033 |

The census ferry rank feature dominates (0.46 importance in window 2). This percentile rank of ferry usage within each capital city region proxies for harbour-adjacent suburbs in Sydney and Melbourne. The next most important features relate to rental growth (3-year rent growth, current rent levels) and long-term price history (10-year buy growth). These threshold features capture fundamentals that the hedonic-based forecast model misses.
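
As an illustration of what a within-region percentile rank looks like, here is a minimal sketch. The exact ranking definition behind the `_rnk` columns is not given in this report, so the "share of peers at or below" rule, the function name, and the sample data are all assumptions.

```python
from collections import defaultdict

def percentile_rank_within_group(values, groups):
    """Return each value's percentile rank (0 to 1) among values in the same group."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    ranks = []
    for v, g in zip(values, groups):
        peers = by_group[g]
        # Share of peers in the same region with a value at or below this one.
        ranks.append(sum(p <= v for p in peers) / len(peers))
    return ranks

# Hypothetical ferry-commute shares for five suburbs in two regions.
shares = [0.0, 0.1, 0.3, 0.05, 0.2]
regions = ["Sydney", "Sydney", "Sydney", "Brisbane", "Brisbane"]
ranks = percentile_rank_within_group(shares, regions)
```

Ranking within a region rather than nationally means a suburb is compared only with its local peers, so a modest ferry share can still rank highly in a city where ferries are rare.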

6. Cumulative Excess Growth

Figure: Cumulative Growth. $100K invested in top-20 picks: excess growth above national average.

7. Year-by-Year Comparison

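| Year | Original Top 20 | Combined Top 20 | Original Top 5 | Combined Top 5 |
|---|---|---|---|---|
| 2013 | +5.3% | +5.7% | +7.0% | +5.9% |
| 2014 | +8.6% | +8.3% | +10.7% | +10.8% |
| 2015 | +7.9% | +5.8% | +7.9% | +6.9% |
| 2016 | +4.9% | +2.8% | +12.5% | +7.5% |
| 2018 | +8.1% | +5.0% | +11.0% | +7.4% |
| 2019 | +3.9% | +3.6% | +4.5% | +3.8% |
| 2020 | +4.0% | +4.2% | +5.8% | +1.9% |
| 2021 | +10.5% | +7.6% | +14.9% | +12.0% |
| 2022 | +5.9% | +6.2% | +5.8% | +6.9% |
| 2023 | +7.3% | +4.7% | +9.7% | +6.3% |
| 2024 | +6.1% | +4.9% | +8.6% | +8.1% |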

8. Walk-Forward Window Results

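| Window (train to; test period) | Train rows | Test rows | Features | OOB | Corr (Orig) | Corr (Comb) | Alpha (Orig) | Alpha (Comb) |
|---|---|---|---|---|---|---|---|---|
| 2011-03 to 2013-03:2016-02 | 233,137 | 227,481 | 24 | 0.1136 | 0.2046 | 0.2289 | 7.32% | 6.41% |
| 2016-03 to 2018-03:2024-02 | 612,159 | 426,600 | 36 | 0.1796 | 0.2181 | 0.2474 | 6.88% | 5.23% |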

The residual model predicts forecast errors using threshold features only (no access to the forecast prediction itself). It then adjusts the original forecast up or down based on what it learns. OOB scores of 0.11-0.18 show that threshold features explain 11-18% of the variance in forecast errors. This is meaningful signal that could be used to flag suburbs where the forecast may be too optimistic or too pessimistic.

Methodology

Walk-forward backtesting with two training windows (train to 2011-03 and 2016-03). Features selected within each training window to prevent look-ahead bias. Missing values imputed with training set medians only.
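
The window split and training-only imputation can be sketched as below. The window boundaries are read from the results table; treating the training cutoff as inclusive, the row layout (dicts with a `month` key), and the function names are assumptions made for illustration.

```python
from statistics import median

# (train_end, test_start, test_end) in "YYYY-MM" form, read from the results table.
WINDOWS = [("2011-03", "2013-03", "2016-02"),
           ("2016-03", "2018-03", "2024-02")]

def split_window(rows, train_end, test_start, test_end):
    """Split rows by month. "YYYY-MM" strings compare in chronological order."""
    train = [r for r in rows if r["month"] <= train_end]
    test = [r for r in rows if test_start <= r["month"] <= test_end]
    return train, test

def impute_with_train_median(train, test, feature):
    """Fill missing (None) values in place, using the training-set median only."""
    observed = [r[feature] for r in train if r[feature] is not None]
    fill = median(observed)
    for r in train + test:
        if r[feature] is None:
            r[feature] = fill
    return fill
```

Computing the median from the training rows alone keeps information from the test period out of the inputs. Rows that fall between the training cutoff and the start of the test period belong to neither set.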

Feature selection: (1) drop features with more than 50% missing, (2) correlation filter removing one of pairs with |r| greater than 0.95, (3) RF importance filter keeping top 50 features.

Two model approaches tested per window: (A) Direct model predicting y using forecast pred + threshold features, (B) Residual model predicting (y - pred) using threshold features only. The residual (boosted) approach is used as the combined model.

Growth values are log growth relative to the national average. Alpha = exp(mean(y)) - 1 for top-N picks. Backtest period: March 2013 to February 2024. 654,081 valid test observations across 6,492 suburbs.
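
The alpha formula translates directly into code. In this sketch the function name is hypothetical, and the input is assumed to be the list of log excess growth values for the top-N picks.

```python
import math

def top_n_alpha(log_excess_growth):
    """Alpha = exp(mean(y)) - 1, where y is log growth relative to the national average."""
    mean_y = sum(log_excess_growth) / len(log_excess_growth)
    return math.exp(mean_y) - 1
```

Because the mean is taken in log space, exp(mean(y)) is the geometric mean of the growth ratios, so a single extreme pick moves alpha less than it would under a simple arithmetic average of percentage returns.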
