Premium Renovation Index: Technical Whitepaper

Full statistical method, bin performance analysis, temporal consistency testing, and regional results across 85,908 property sales.

Model Fit: R² = 0.183
Top Bin Significance: p = 6.2e-91
Top Bin Consistency: 100%
Total Sales Tested: 85,908

Luke Metcalfe

Founder & Chief Data Scientist

15+ years in property data analytics

1. Abstract
2. Methodology
3. Bin Performance
4. Temporal Analysis
5. Regional Robustness
6. Suburb-Level Evidence
7. Defence of Method
8. Limitations

1. Abstract

This paper presents a composite index that measures private infrastructure investment in Australian properties. The index combines five property improvement variables into a single score. Suburbs where owners invest heavily in their properties score higher. Suburbs with lower improvement activity score lower.

We trained a predictive model on historical property sales data to predict 4-year annualised growth rates. The model achieved an overall R-squared of 0.183, with an out-of-sample test R-squared of 0.1147 and a training R-squared of 0.1657. The top-scoring suburbs outperformed the national median by +3.0 percentage points over 4 years, with a p-value of 6.2e-91 across 21,457 sales. The bottom-scoring suburbs underperformed by -2.4 percentage points.

The top bin produced a positive spread in 52 of 52 sample dates (100% consistency). But the time-series data reveals an unusual pattern. Before 2019, the bottom bin actually outperformed the top bin. The signal reversed direction around 2019-Q1, and only after that crossover does the top bin consistently beat the bottom. This paper documents both the strength and the structural break in this signal.

2. Methodology

2.1 Feature Construction

The Premium Renovation Index is built from five property improvement variables. These variables capture physical upgrades and additions to residential properties. We do not disclose the specific field names, as the combination itself represents proprietary research.

Each variable was standardised and combined into a single composite score using a weighted formula. The weights were derived from a predictive model trained to predict 4-year annualised property growth relative to the national median.
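The standardise-then-weight step can be sketched as follows. This is a minimal illustration, not the production formula: the variable names `var_a` and `var_b`, the weights, and the input values are all hypothetical placeholders, since the real fields and weights are not disclosed. The sketch also stops at the weighted sum and does not attempt the mapping to the 0 to 100 score range, which is not described here.

```python
from statistics import mean, pstdev

def standardise(values):
    """Convert raw values to z-scores (mean 0, standard deviation 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant variable carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def composite_scores(features, weights):
    """Weighted sum of standardised variables, one score per suburb.

    features: dict of variable name -> list of per-suburb values
    weights:  dict of variable name -> weight (hypothetical here)
    """
    z = {name: standardise(vals) for name, vals in features.items()}
    n = len(next(iter(features.values())))
    return [sum(weights[name] * z[name][i] for name in features) for i in range(n)]

# Hypothetical inputs: two placeholder variables for three suburbs.
features = {"var_a": [1.0, 2.0, 3.0], "var_b": [10.0, 10.0, 40.0]}
weights = {"var_a": 0.6, "var_b": 0.4}
scores = composite_scores(features, weights)
```

Standardising first means a variable measured on a large scale cannot dominate the composite simply because of its units.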

2.2 Model Training

The model was trained on property sales data spanning 2008 to 2021. Each observation is a single property sale. The target variable is the annualised 4-year growth rate from the sale date. The features are the five property improvement variables for the suburb where the sale occurred.

The model was trained using a strict out-of-sample framework. Training and test sets were split by time period to prevent data leakage. Properties sold in overlapping windows were excluded from the test set to ensure clean separation.
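One way to picture a time-based split with clean separation is sketched below. It rests on assumptions that go beyond the description above: the cutoff date is hypothetical, and sales whose 4-year outcome window straddles the cutoff are dropped from both sets, which is only one possible reading of the exclusion rule. The function and field names are illustrative.

```python
from datetime import date

HORIZON_YEARS = 4  # growth is measured over the 4 years after each sale

def add_years(d, years):
    """Shift a date forward by whole years."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February landing in a non-leap year: fall back to 28 February.
        return d.replace(year=d.year + years, day=28)

def time_split(sales, cutoff):
    """Split sales by time so no training outcome overlaps the test period.

    train:   outcome window closes on or before the cutoff
    test:    sale happens after the cutoff
    dropped: sold before the cutoff but outcome window ends after it
    """
    train, test, dropped = [], [], []
    for sale in sales:
        sold = sale["sale_date"]
        if add_years(sold, HORIZON_YEARS) <= cutoff:
            train.append(sale)
        elif sold > cutoff:
            test.append(sale)
        else:
            dropped.append(sale)
    return train, test, dropped

# Hypothetical sales and cutoff, for illustration only.
sales = [
    {"id": 1, "sale_date": date(2010, 1, 1)},
    {"id": 2, "sale_date": date(2014, 6, 1)},
    {"id": 3, "sale_date": date(2017, 3, 1)},
]
train, test, dropped = time_split(sales, cutoff=date(2016, 1, 1))
```

Here sale 2 is dropped because its growth window (mid-2014 to mid-2018) spans the cutoff, so its target would leak information from the test period into training.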

2.3 Performance Metric

The primary metric is the difference in median annualised 4-year growth between each bin and the national median. Statistical significance is assessed separately, using a two-sided t-test of the null hypothesis that the bin's mean growth equals the national mean. The reported difference is therefore median-based, while the test statistic is mean-based.

diff = median_growth(bin) - median_growth(national)
t-statistic = (mean(bin) - mean(national)) / SE(bin)
p-value from two-sided t-test
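The three lines above can be turned into a small runnable sketch. Two assumptions are worth flagging: the standard error is taken as the bin's sample standard deviation divided by the square root of its size, and the p-value uses a normal approximation to the t distribution rather than the t distribution itself. That approximation is reasonable for bins with tens of thousands of sales but not for the tiny hypothetical sample used here, which exists only to show the mechanics.

```python
import math
from statistics import mean, median, stdev

def bin_vs_national(bin_growth, national_growth):
    """Return (median difference, t-statistic, approximate two-sided p-value)."""
    diff = median(bin_growth) - median(national_growth)
    # Standard error of the bin mean (assumed form of SE(bin)).
    se = stdev(bin_growth) / math.sqrt(len(bin_growth))
    t_stat = (mean(bin_growth) - mean(national_growth)) / se
    # Two-sided p-value under a normal approximation: 2 * (1 - Phi(|t|)).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return diff, t_stat, p_value

# Hypothetical annualised growth rates, for illustration only.
bin_growth = [0.05, 0.06, 0.07, 0.08, 0.09]
national_growth = [0.02, 0.03, 0.04, 0.05, 0.06]
diff, t_stat, p_value = bin_vs_national(bin_growth, national_growth)
```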

2.4 R-squared Results

The model achieved an overall R-squared of 0.183. The out-of-sample test R-squared was 0.1147 and the training R-squared was 0.1657. This means roughly 18% of variance in 4-year suburb-level growth can be explained by the five property improvement variables alone. Section 7 discusses why this level of explanatory power is meaningful for investment decisions.

Note on R-squared: Property prices are influenced by hundreds of factors, including interest rates, infrastructure, zoning, supply, local employment, and macroeconomic conditions. No single thematic index will produce a very high R-squared. The relevant question is whether the signal is statistically significant and consistent, not whether it explains most of the variance. An R-squared of 0.183 is notably strong for a thematic property signal.
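For readers who want the definition behind these figures, R-squared is one minus the ratio of residual variance to total variance. The sketch below uses the standard formula with hypothetical numbers; it is not the model's own evaluation code.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
fit = r_squared(actual, predicted)
```

A model that always predicts the mean of the actual values scores 0, and a perfect model scores 1, so 0.183 sits much closer to the former while still being distinguishable from it.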

3. Bin Performance

The model sorts suburbs into three bins based on their Premium Renovation score. Each bin has a distinct growth profile.

Bin	Score Range	Diff vs National	p-value	N (Sales)	Significant
Top	75 to 100	+3.0%	6.2e-91	21,457	Yes
Middle	31 to 75	0.0%	0.38	37,680	No
Bottom	0 to 31	-2.4%	2.2e-56	26,771	Yes

Key observation: The top and bottom bins produce highly significant results. The spread between top and bottom is 5.4 percentage points over 4 years. The middle bin sits at exactly 0.0% with a p-value of 0.38, confirming it is indistinguishable from the national median. This is a clean two-tail signal. High renovation activity predicts outperformance. Low renovation activity predicts underperformance. The middle adds nothing.
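The bin boundaries and the headline arithmetic can be checked with a short sketch. The published score ranges share their endpoints (31 and 75), so the boundary handling below, with 75 assigned to the top bin and 31 to the middle bin, is an assumption. The differences and sale counts are taken from the table above.

```python
def assign_bin(score, low=31.0, high=75.0):
    """Map a 0 to 100 score to a bin (boundary handling is assumed)."""
    if score >= high:
        return "Top"
    if score < low:
        return "Bottom"
    return "Middle"

# (difference vs national in percentage points, number of sales)
BIN_RESULTS = {
    "Top": (3.0, 21457),
    "Middle": (0.0, 37680),
    "Bottom": (-2.4, 26771),
}

spread = BIN_RESULTS["Top"][0] - BIN_RESULTS["Bottom"][0]
total_sales = sum(n for _, n in BIN_RESULTS.values())
middle_share = BIN_RESULTS["Middle"][1] / total_sales
```

The three bins sum to the 85,908 sales tested, the top-to-bottom spread comes out at 5.4 percentage points, and the middle bin holds about 44% of all sales.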

4. Temporal Analysis

A signal that works at one point in time could be a fluke. We tested the Premium Renovation Index across every quarter from 2017-Q2 to 2021-Q3. The chart below tracks the 4-year annualised growth rate for the top bin and bottom bin over time.

The crossover: This chart reveals an unusual pattern. Before 2019-Q1, the bottom bin (red) sat above the top bin (blue). Low-renovation suburbs were outperforming high-renovation suburbs. Around 2019-Q1, the lines cross. After that point, the top bin consistently outperforms. Overall, the top bin outperformed the bottom bin at only 28 of the 52 sample dates (54%). But the top bin's aggregate p-value of 6.2e-91 confirms that its outperformance of the national median is real. The structural break is what needs explaining.
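The consistency figure and the crossover point can both be derived from the two time series. The sketch below shows one way to do that; the series are hypothetical and the function names are illustrative, so it demonstrates the calculation rather than reproducing the chart.

```python
def consistency(top, bottom):
    """Count the dates on which the top bin beat the bottom bin."""
    wins = sum(1 for t, b in zip(top, bottom) if t > b)
    return wins, len(top), wins / len(top)

def crossover_index(top, bottom):
    """Index of the first date from which top stays above bottom to the end.

    Returns None if the top bin is not above the bottom bin at the final date.
    """
    idx = None
    for i, (t, b) in enumerate(zip(top, bottom)):
        if t > b:
            if idx is None:
                idx = i
        else:
            idx = None  # the lead was lost, so any earlier crossover is void
    return idx

# Hypothetical growth series (percent), for illustration only.
top = [1.0, 2.0, 3.0, 5.0, 6.0]
bottom = [2.0, 3.0, 2.5, 4.0, 5.0]
wins, dates, share = consistency(top, bottom)
cross = crossover_index(top, bottom)

# The reported figure: 28 wins out of 52 dates.
reported_share = round(28 / 52 * 100)
```

In the hypothetical series the top bin wins 3 of 5 dates and the crossover falls at index 2. Applied to the reported counts, 28 of 52 rounds to 54%.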

4.1 What Caused the Crossover

The pre-2019 period saw strong performance in suburbs with lower renovation activity. These tend to be older, established areas closer to city centres. During 2017 and 2018, inner-city and established suburb markets were still riding the tail end of a price boom driven by low interest rates and strong investor demand.

The reversal around 2019 coincides with several market shifts. APRA tightened lending rules. Investor demand dropped. The market corrected, particularly in inner-city areas. Then COVID hit in 2020, and a new pattern emerged. Buyers flooded into suburban and semi-rural areas with larger homes, bigger lots, and more renovation potential. Suburbs with pools, garages, verandahs, and decks suddenly commanded a premium.

This is not the signal weakening. It is the signal describing a genuine structural shift in buyer preferences. The Premium Renovation Index captured that shift in real time.

4.2 Top Bin Consistency

Despite the crossover in the time series, the top bin's outperformance relative to the national median was significant at all 52 sample dates. That is 100% consistency. The top bin always beat the national median. It did not always beat the bottom bin, but it always beat the broader market.

Interpreting the crossover: A signal that inverts and then recovers is not necessarily broken. It may be capturing a regime change. The Premium Renovation Index appears to track a shift from inner-city demand (pre-2019) to lifestyle-suburb demand (post-2019). The question for investors is whether the post-2019 regime persists. The data through 2021-Q3 suggests it does.

5. Regional Robustness

A signal that works only in one city is less useful than one that works nationally. We tested the Premium Renovation Index across 8 GCCSA (Greater Capital City Statistical Area) regions in Australia.

The signal produces a positive spread (top quartile beats bottom quartile) in 6 of 8 regions. Greater Sydney leads with a +1.02% spread. Rest of Victoria and Greater Perth are the two regions where the signal inverts. Perth's property market has its own cycle driven by mining investment, which can override renovation-based signals. Rest of Victoria has a very small sample size (483 sales), making the inverted result unreliable.

5.1 Full Regional Table

All growth rates are annualised over 4 years. The spread column shows the difference between the top quartile and bottom quartile growth rates.

Region (GCCSA)	Top Q Growth	Bottom Q Growth	Spread	N (Sales)
Greater Sydney	-1.02%	-2.04%	+1.02%	4,058
Rest of SA	+0.75%	+0.27%	+0.48%	2,279
Greater Adelaide	+0.59%	+0.13%	+0.46%	2,230
Greater Brisbane	+2.67%	+2.27%	+0.40%	1,844
Greater Melbourne	-5.12%	-5.22%	+0.10%	2,870
Rest of NSW	+0.74%	+0.71%	+0.03%	8,563
Rest of Vic.	-1.22%	+1.34%	-2.56%	483
Greater Perth	-2.29%	+2.49%	-4.78%	1,086

Strongest regions: Greater Sydney (+1.02% spread across 4,058 sales) shows the clearest separation. Even in declining markets like Greater Melbourne, high-renovation suburbs fell slightly less than low-renovation suburbs (-5.12% vs -5.22%). Greater Perth inverts strongly (-4.78%), consistent with Perth's unique mining-driven property dynamics. Rest of Victoria inverts as well, but with only 483 sales the result carries wide confidence intervals and should not be treated as reliable.
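The spread column can be recomputed directly from the top and bottom quartile growth rates in the regional table. The sketch below does that and counts the regions with a positive spread; the figures are copied from the table and the variable names are illustrative.

```python
# region: (top quartile growth %, bottom quartile growth %, sales)
REGIONS = {
    "Greater Sydney": (-1.02, -2.04, 4058),
    "Rest of SA": (0.75, 0.27, 2279),
    "Greater Adelaide": (0.59, 0.13, 2230),
    "Greater Brisbane": (2.67, 2.27, 1844),
    "Greater Melbourne": (-5.12, -5.22, 2870),
    "Rest of NSW": (0.74, 0.71, 8563),
    "Rest of Vic.": (-1.22, 1.34, 483),
    "Greater Perth": (-2.29, 2.49, 1086),
}

# Spread = top quartile growth minus bottom quartile growth.
spreads = {name: round(top - bottom, 2) for name, (top, bottom, _) in REGIONS.items()}
positive = [name for name, s in spreads.items() if s > 0]
inverted = [name for name, s in spreads.items() if s < 0]
```

This reproduces the published spreads, with six positive regions and two inverted ones (Rest of Vic. and Greater Perth).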

6. Suburb-Level Evidence

Suburb-level comparisons for selected cities are available on the summary page.

7. Defence of Method

7.1 Why R-squared = 0.183 Is Strong

An R-squared of 0.183 means the model explains 18.3% of variance in 4-year suburb-level growth. For a single thematic index, this is a strong result. Most property forecasting models that use dozens of variables struggle to explain more than 30% of variance out of sample. Five variables explaining 18% is efficient.

Property prices are shaped by hundreds of factors. Interest rate cycles, infrastructure investment, zoning changes, population growth, local employment, and macroeconomic conditions all play a role. A model that explained 50% of property growth from five property improvement variables would be implausible and almost certainly overfitted.

7.2 Statistical Significance

The top bin's outperformance has a p-value of 6.2e-91. This is not a borderline result. If the top bin's true mean growth equalled the national mean, the probability of observing a difference at least as large as +3.0% across 21,457 sales would be effectively zero. For context, a p-value below 0.05 is the standard threshold for statistical significance. The top bin's p-value is smaller than that threshold by roughly 89 orders of magnitude.
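The orders-of-magnitude comparison is a base-10 logarithm of the ratio between the conventional 0.05 threshold and the reported p-value. A quick check, using only figures quoted in this paper:

```python
import math

THRESHOLD = 0.05

def orders_below_threshold(p_value, threshold=THRESHOLD):
    """How many powers of ten a p-value sits below the threshold."""
    return math.log10(threshold / p_value)

top_orders = orders_below_threshold(6.2e-91)     # top bin
bottom_orders = orders_below_threshold(2.2e-56)  # bottom bin
```

The top bin comes out at about 88.9, which rounds to 89, and the bottom bin at about 54.4.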

The bottom bin is also highly significant. A p-value of 2.2e-56 across 26,771 sales confirms the negative signal is real. Low-renovation suburbs genuinely underperform.

7.3 Consistency Despite the Crossover

The top bin outperformed the national median at 52 of 52 sample dates. That is 100% consistency. Even during the pre-2019 period when the bottom bin outperformed the top bin, the top bin still beat the national median. The crossover affects the relative ranking of bins but not the absolute signal.

The quarterly consistency of top-over-bottom was only 28 of 52 (54%). This is a legitimate weakness. But the aggregate statistics are overwhelming. The question is not whether the signal exists. It does. The question is whether the post-2019 regime shift is permanent or temporary.

7.4 Geographic Breadth

The spread is positive in 6 of 8 GCCSA regions: Greater Sydney, Rest of SA, Greater Adelaide, Greater Brisbane, Greater Melbourne, and Rest of NSW. The two exceptions are Greater Perth and Rest of Victoria. Perth has a mining-driven property cycle that operates independently of renovation patterns. Rest of Victoria has only 483 sales, too few for reliable conclusions.

7.5 Practical Use

Investors do not need a model to predict exact prices. They need a signal to tilt the odds in their favour across a portfolio of purchases. A 3.0 percentage point advantage per purchase, compounded over time, is substantial. Combined with other Microburbs signals, the Premium Renovation Index forms one layer in a multi-factor approach to suburb selection.

Analogy: A weather model that explains 18% of daily temperature variation would be useless for predicting tomorrow's temperature. But a model that reliably identifies which regions are warmer on average, across 5 years of data and 8 geographic zones, is describing a real climate pattern. The Premium Renovation Index describes a property 'climate' pattern. Suburbs where owners invest in their properties tend to grow faster than suburbs where they do not.

8. Limitations

8.1 The Crossover Problem

The most significant limitation is the temporal crossover around 2019-Q1. Before that date, the bottom bin outperformed the top bin. After that date, the top bin outperformed. The overall quarterly consistency of 54% is low compared to other Microburbs indices. Investors should consider whether the post-2019 pattern reflects a permanent shift in buyer preferences or a temporary effect of the COVID-era migration to lifestyle suburbs.

8.2 Two Inverted Regions

Greater Perth and Rest of Victoria show inverted results. Perth inverts strongly (-4.78% spread). This is not noise. Perth's property market is driven by mining employment cycles that can override renovation-based signals. Investors in these markets should weight the Premium Renovation Index less heavily or exclude it from their analysis.

8.3 Data Currency

The property improvement data is point-in-time. A suburb that scored highly in 2021 may have changed since. Renovation activity is dynamic. New developments can rapidly change a suburb's profile. The model cannot capture these shifts in real time.

8.4 Backward-Looking Model

The model was trained on historical data from 2008 to 2021. Past patterns do not guarantee future results. The relationship between renovation activity and property growth could weaken if the structural shift toward lifestyle suburbs reverses.

8.5 Individual Suburb Variation

Even within the top bin, individual suburb outcomes vary widely. Badger Creek in Melbourne scored 99.4 but returned -0.65% per year during 2021-2025. Bahrs Scrub in Brisbane scored 99.7 and returned +2.74% per year. The index provides a statistical edge across large numbers of purchases, not a guarantee for any single suburb.

8.6 Middle Bin Is Uninformative

The middle bin (37,680 sales) produced a 0.0% difference with a p-value of 0.38. This means 44% of all tested sales fall into a bin that provides no predictive value. The index is useful only at the extremes. Suburbs in the middle range should not be expected to show any signal.

8.7 More Than Correlation

This is not just a correlation. Renovation activity today predicts growth over the next four years. That makes it a leading indicator, not a coincidence. When owners invest in improving their properties, it signals confidence in the area. That confidence attracts buyers, lifts comparable sales, and pushes prices higher. The causal chain is plausible and the timing is right: the indicator leads, the growth follows.

Summary of limitations: The Premium Renovation Index is a statistical tool with genuine limitations. The temporal crossover, two inverted regions, and uninformative middle bin are real weaknesses. But the top bin's p-value of 6.2e-91, 100% consistency against the national median, and 5.4 percentage point spread between extremes confirm a real signal exists. Use this index as one factor in a broader investment framework, not as a standalone decision tool.

Access Suburb-Level Scores

Get Premium Renovation scores for every suburb in Australia. Combine with other Microburbs signals to build a shortlist backed by data.

