Innovation Economy Index: Technical Whitepaper

Full statistical method, bin performance analysis, temporal consistency testing, and regional results across 294,922 property sales. This index predicts 2-year growth, not 4-year.

Annual Spread: +1.9% p.a.

Top Bin Significance: p = 2.5e-50

Sample Dates Significant: 19/26

Total Sales Tested: 294,922

Luke Metcalfe

Founder & Chief Data Scientist

15+ years in property data analytics

1. Abstract
2. Methodology
3. Bin Performance
4. Temporal Analysis
5. Regional Robustness
6. Suburb-Level Evidence
7. Defence of Method
8. Limitations

1. Abstract

This paper presents a composite index that measures the concentration of innovative knowledge workers in Australian suburbs. The index combines census employment variables into a single score. Suburbs with clusters of knowledge workers in vibrant, fast-growing sectors score higher. Suburbs where these workers are largely absent score lower.

We trained a predictive model on historical property sales data to predict 2-year annualised growth rates. Evaluated out of sample, the model achieved an R-squared of 0.013. Sales in the top-scoring bin outperformed the national median by 1.9 percentage points per annum over the 2-year horizon. Sales in the bottom-scoring bin underperformed by 2.9 percentage points per annum.

The signal was tested across 187 quarterly periods, 26 individual sample dates, and 13 GCCSA regions. It held in 67% of quarters and was statistically significant at 19 of 26 sample dates. The spread was positive in 8 of 13 regions. But the signal breaks down after 2021 and inverts in 2022 to 2023. This paper documents both the strength and the failure modes.

2. Methodology

2.1 Feature Construction

The Innovation Economy Index is built from census employment variables that capture the concentration of innovative knowledge workers at the suburb level. We do not disclose the specific census field names. The combination represents proprietary research.

Each variable was standardised and combined into a single composite score using a weighted formula. The weights were derived from a model trained to predict 2-year annualised property growth relative to the national median.
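The standardise-and-combine step can be sketched as follows. This is a minimal illustration, not the production formula: the variable names (`var_a`, `var_b`), the weights, and the percentile rescaling to a 0 to 100 range are all assumptions, since the actual census fields and weights are not disclosed.

```python
from statistics import mean, pstdev


def standardise(values):
    """Convert raw values to z-scores (mean 0, population std dev 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite_scores(features, weights):
    """Weighted sum of standardised features, one raw score per suburb."""
    z = {name: standardise(col) for name, col in features.items()}
    n = len(next(iter(features.values())))
    return [sum(weights[name] * z[name][i] for name in features) for i in range(n)]


def to_percentile(scores):
    """Rescale raw scores to a 0-100 rank (assumed scaling, not confirmed)."""
    if len(scores) < 2:
        return [50.0 for _ in scores]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for rank, i in enumerate(order):
        ranks[i] = 100.0 * rank / (len(scores) - 1)
    return ranks


# Hypothetical inputs: three suburbs, two placeholder employment variables.
features = {
    "var_a": [0.10, 0.25, 0.40],
    "var_b": [0.05, 0.05, 0.20],
}
weights = {"var_a": 0.7, "var_b": 0.3}  # placeholder weights

raw = composite_scores(features, weights)
index = to_percentile(raw)
```

Standardising first means the weights act on comparable units, so a variable measured on a wider raw scale does not dominate the composite.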

2.2 Model Training

The model was trained on property sales data spanning 2008 to 2023. Each observation is a single property sale. The target variable is the annualised 2-year growth rate from the sale date. The features are census-derived employment variables for the suburb where the sale occurred.

The model was trained using a strict out-of-sample framework. Training and test sets were split by time period to prevent data leakage. Properties sold in overlapping windows were excluded from the test set to ensure clean separation.
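One way to implement a time-based split with overlap exclusion is sketched below. It assumes "overlapping windows" means test sales whose 2-year growth window would overlap the growth window of any training sale; the field name `sale_date` and the cutoff date are hypothetical.

```python
from datetime import date

HORIZON_YEARS = 2  # matches the 2-year growth target


def add_years(d, years):
    """Shift a date by whole years (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def time_split(sales, train_end):
    """Split sales by date and drop test sales inside the overlap period.

    A training sale on train_end has a growth window that runs until
    train_end + 2 years, so test sales must start after that point.
    """
    embargo_end = add_years(train_end, HORIZON_YEARS)
    train = [s for s in sales if s["sale_date"] <= train_end]
    test = [s for s in sales if s["sale_date"] > embargo_end]
    return train, test


# Hypothetical sales records.
sales = [
    {"sale_date": date(2015, 6, 1)},
    {"sale_date": date(2016, 3, 1)},   # after cutoff, inside overlap period
    {"sale_date": date(2017, 1, 1)},   # after cutoff, inside overlap period
    {"sale_date": date(2018, 2, 1)},
]
train, test = time_split(sales, train_end=date(2016, 1, 1))
```

Sales that fall between the cutoff and the end of the overlap period are used in neither set, which trades sample size for a cleaner out-of-sample estimate.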

2.3 Performance Metric

The primary metric is the difference in median annualised 2-year growth between each bin and the national median. Statistical significance is assessed using a two-sided t-test against the null hypothesis that the bin's mean growth equals the national mean.

diff = median_growth(bin) - median_growth(national)
t-statistic = (mean(bin) - mean(national)) / SE(bin)
p-value from two-sided t-test
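The three lines above can be sketched in code as follows. Note that the reported difference uses medians while the test statistic uses means. The p-value here uses a normal approximation to the t-distribution, which is an assumption that is only reasonable for large bins such as those in this paper; the growth figures are made up, and the four-sale bin is far too small for the approximation and serves only to show the mechanics.

```python
import math
from statistics import mean, median, stdev


def bin_vs_national(bin_growth, national_growth):
    """Return (median diff, t-statistic, approximate two-sided p-value)."""
    diff = median(bin_growth) - median(national_growth)
    # Standard error of the bin mean.
    se = stdev(bin_growth) / math.sqrt(len(bin_growth))
    t_stat = (mean(bin_growth) - mean(national_growth)) / se
    # Normal approximation to the two-sided t-test p-value (large-N assumption).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return diff, t_stat, p_value


# Made-up annualised 2-year growth rates, in percent.
national_growth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
bin_growth = [5.0, 6.0, 7.0, 8.0]

diff, t_stat, p_value = bin_vs_national(bin_growth, national_growth)
```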

2.4 Out-of-Sample R-squared

The model achieved an out-of-sample R-squared of 0.013. This means roughly 1.3% of variance in 2-year suburb-level growth can be explained by these census employment variables alone. Section 7 discusses why this level of explanatory power is still useful for investment decisions.
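For reference, an out-of-sample R-squared is commonly computed as one minus the ratio of residual to total sum of squares on the held-out data. The sketch below assumes that definition, with the test-set mean as the baseline; the paper does not state which baseline was used, and the numbers are made up.

```python
def r_squared(actual, predicted):
    """1 - SS_res / SS_tot, using the mean of `actual` as the baseline."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up held-out growth rates and model predictions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.4, 2.5, 2.5, 2.6]

r2 = r_squared(actual, predicted)
```

Predictions that barely move away from the mean, as in this example, give a small but positive R-squared even when they rank the observations correctly.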

Note on R-squared: Property prices are influenced by hundreds of factors, including interest rates, infrastructure, zoning, supply, local employment, and macroeconomic conditions. No single thematic index will produce a high R-squared. The relevant question is whether the signal is statistically significant and consistent, not whether it explains most of the variance.

3. Bin Performance

The model sorts suburbs into three bins based on their Innovation Economy score. Each bin has a distinct growth profile. The table below shows the full results.

Bin	Score Range	Diff vs National	p-value	N (Sales)	Significant
Top	62 to 100	+1.9%	2.5e-50	112,564	Yes
Middle	8 to 62	-0.7%	9.6e-08	159,706	Yes
Bottom	0 to 8	-2.9%	2.1e-25	22,652	Yes

Key observation: All three bins produce statistically significant results. The spread between top and bottom is 4.8 percentage points per annum over the 2-year horizon (+1.9 minus -2.9). The monotonic ordering (top positive, middle slightly negative, bottom most negative) is consistent with the index capturing a real gradient in growth outcomes. The bottom bin is smaller (22,652 sales), but its p-value of 2.1e-25 is still far below any conventional significance threshold.
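The bin assignment and the headline spread can be reproduced from the table with a short sketch. The score ranges and differences are taken from the table; which bin a score of exactly 8 or 62 falls into is not stated, so the boundary handling below is an assumption.

```python
def assign_bin(score):
    """Map a 0-100 index score to a bin (boundary handling is assumed)."""
    if score >= 62:
        return "top"
    if score >= 8:
        return "middle"
    return "bottom"


# Difference vs national median (percentage points) and sales counts per bin.
diff_vs_national = {"top": 1.9, "middle": -0.7, "bottom": -2.9}
n_sales = {"top": 112_564, "middle": 159_706, "bottom": 22_652}

spread = round(diff_vs_national["top"] - diff_vs_national["bottom"], 1)
is_monotonic = (
    diff_vs_national["top"]
    > diff_vs_national["middle"]
    > diff_vs_national["bottom"]
)
total_sales = sum(n_sales.values())
```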

4. Temporal Analysis

A signal that works at one point in time could be a fluke. We tested the Innovation Economy Index across every quarter from 2008-Q1 to 2023-Q3. The chart below tracks the 2-year annualised growth rate for the above-threshold and below-threshold suburbs over time.

The above-threshold suburbs (blue) sit above the below-threshold suburbs (red) in 126 of 187 quarters (67%). The separation is strongest during 2013 to 2017, where the above-threshold suburbs consistently outperform by over 2 percentage points. But the signal inverts after 2021. From 2022-Q1 onwards, the below-threshold suburbs outperform. This is the most important limitation of this index and is discussed further in the Limitations section.
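The consistency rate is a simple hit rate: the share of periods in which the above-threshold series sits above the below-threshold series. A minimal sketch, using a made-up five-period series alongside the counts reported above:

```python
def hit_rate(above, below):
    """Share of periods in which `above` is strictly greater than `below`."""
    if len(above) != len(below):
        raise ValueError("series must cover the same periods")
    wins = sum(1 for a, b in zip(above, below) if a > b)
    return wins / len(above)


# Made-up growth rates (percent) for five periods.
above = [3.1, 2.4, 1.0, 4.2, -0.5]
below = [1.0, 2.0, 1.5, 2.2, 0.3]
example_rate = hit_rate(above, below)

# Counts reported in this section.
reported_rate = 126 / 187
```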

4.1 Date-by-Date Consistency

We tested the top bin's outperformance at 26 individual sample dates between 2012 and 2022. At each date, the model was evaluated independently. The result was statistically significant at 19 of 26 dates.

Sample Window	Outperformance (2yr)	Significance
2012
Apr 2012 → Apr 2014	+1.00%	Not Significant
Oct 2012 → Oct 2014	+1.90%	Significant
Nov 2012 → Nov 2014	+2.00%	Significant
2013
June 2013 → June 2015	+2.70%	Significant
July 2013 → July 2015	+3.00%	Significant
2014
Feb 2014 → Feb 2016	+3.20%	Significant
Mar 2014 → Mar 2016	+3.30%	Significant
July 2014 → July 2016	+2.90%	Significant
Dec 2014 → Dec 2016	+3.30%	Significant
2015
Apr 2015 → Apr 2017	+3.50%	Significant
Sept 2015 → Sept 2017	+3.00%	Significant
2016
July 2016 → July 2018	+2.70%	Significant
Aug 2016 → Aug 2018	+2.60%	Significant
2017
Apr 2017 → Apr 2019	+1.40%	Significant
Sept 2017 → Sept 2019	+1.60%	Significant
2018
Aug 2018 → Aug 2020	+0.60%	Not Significant
Sept 2018 → Sept 2020	+0.60%	Not Significant
Dec 2018 → Dec 2020	+0.80%	Not Significant
2019
Sept 2019 → Sept 2021	+1.90%	Significant
Nov 2019 → Nov 2021	+1.90%	Significant
2020
Jan 2020 → Jan 2022	+2.10%	Significant
Apr 2020 → Apr 2022	+2.40%	Significant
Sept 2020 → Sept 2022	+2.20%	Significant
2021
July 2021 → July 2023	-0.30%	Not Significant
2022
Feb 2022 → Feb 2024	-1.10%	Not Significant
July 2022 → July 2024	-0.80%	Not Significant

Pattern in non-significant dates: Apart from the earliest sample date (Apr 2012, +1.00%), the signal fails in two distinct clusters. The first is 2018 (three dates with small positive spreads of +0.6% to +0.8%). The second is 2021 to 2022, where the signal inverts entirely: all three sample dates from July 2021 onwards show negative outperformance (-0.3% to -1.1%). By 2022, high-innovation suburbs underperform low-innovation suburbs. This coincides with the post-COVID rate-rise cycle, where innovation-sector employment became less correlated with property growth.
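The counts in this section can be checked directly against the table. The sketch below transcribes each row as (window start year, outperformance, significance flag) and tallies the results; the tuple layout is just an illustrative encoding.

```python
from collections import Counter

# (window start year, top-bin outperformance in pct points, significant?)
# Values transcribed from the table above.
SAMPLE_DATES = [
    (2012, 1.0, False), (2012, 1.9, True), (2012, 2.0, True),
    (2013, 2.7, True), (2013, 3.0, True),
    (2014, 3.2, True), (2014, 3.3, True), (2014, 2.9, True), (2014, 3.3, True),
    (2015, 3.5, True), (2015, 3.0, True),
    (2016, 2.7, True), (2016, 2.6, True),
    (2017, 1.4, True), (2017, 1.6, True),
    (2018, 0.6, False), (2018, 0.6, False), (2018, 0.8, False),
    (2019, 1.9, True), (2019, 1.9, True),
    (2020, 2.1, True), (2020, 2.4, True), (2020, 2.2, True),
    (2021, -0.3, False),
    (2022, -1.1, False), (2022, -0.8, False),
]

significant = sum(1 for _, _, sig in SAMPLE_DATES if sig)
positive = sum(1 for _, diff, _ in SAMPLE_DATES if diff > 0)
# Non-significant dates grouped by window start year.
misses_by_year = Counter(year for year, _, sig in SAMPLE_DATES if not sig)
```

The tally gives 19 significant dates out of 26, with the seven non-significant dates falling in 2012 (one), 2018 (three), 2021 (one), and 2022 (two). Note that 23 of the 26 dates show positive outperformance, although only 19 are statistically significant.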

5. Regional Robustness

A signal that works only in one city is less useful than one that works nationally. We tested the Innovation Economy Index across all 13 GCCSA (Greater Capital City Statistical Area) regions in Australia. The results are mixed.

The signal produces a positive spread (above-threshold suburbs beat below-threshold suburbs) in 8 of 13 regions. This is weaker than other Microburbs indices. Five regions show a negative or near-zero spread, including Melbourne (-0.40%) and the ACT (-0.35%). Rest of WA leads with a +3.68% spread, but off a small sample of 5,305 sales.

5.1 Full Regional Table

All growth rates are annualised over 2 years. The spread column shows the difference between the above-threshold (top tier) and below-threshold (bottom tier) growth rates.

Region (GCCSA)	City	Top Tier Growth	Bottom Tier Growth	Spread	N (Sales)
Rest of WA	Regional WA	-2.94%	-6.62%	+3.68%	5,305
Rest of Qld	Regional Qld	+0.17%	-2.90%	+3.07%	14,077
Sydney	Sydney	+1.36%	-0.94%	+2.30%	5,570
Rest of Vic.	Regional Vic.	+3.18%	+1.36%	+1.82%	9,126
Rest of NSW	Regional NSW	+3.44%	+2.17%	+1.27%	12,632
Perth	Perth	-4.28%	-5.52%	+1.24%	3,923
Rest of SA	Regional SA	-1.85%	-2.19%	+0.34%	3,056
Brisbane	Brisbane	+0.81%	+0.75%	+0.06%	4,937
Darwin	Darwin	-5.84%	-5.71%	-0.13%	312
Adelaide	Adelaide	-0.86%	-0.70%	-0.16%	3,643
ACT	ACT	+0.94%	+1.29%	-0.35%	1,005
Melbourne	Melbourne	+0.32%	+0.72%	-0.40%	5,678

Strongest regions: Rest of WA (+3.68% spread across 5,305 sales) and Rest of Queensland (+3.07% spread across 14,077 sales). Sydney also performs well at +2.30%. But Melbourne inverts, with low-innovation suburbs outperforming by 0.40 percentage points. The ACT similarly inverts. Brisbane shows almost no separation at +0.06%. This index is geographically uneven.
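The spread column can be recomputed from the two growth columns as a consistency check. The sketch below transcribes the 12 rows listed in the table above and counts the regions with a positive spread; the tuple layout is an illustrative encoding.

```python
# (region, top tier growth %, bottom tier growth %, reported spread %, N sales)
# Values transcribed from the regional table above.
REGIONS = [
    ("Rest of WA", -2.94, -6.62, 3.68, 5305),
    ("Rest of Qld", 0.17, -2.90, 3.07, 14077),
    ("Sydney", 1.36, -0.94, 2.30, 5570),
    ("Rest of Vic.", 3.18, 1.36, 1.82, 9126),
    ("Rest of NSW", 3.44, 2.17, 1.27, 12632),
    ("Perth", -4.28, -5.52, 1.24, 3923),
    ("Rest of SA", -1.85, -2.19, 0.34, 3056),
    ("Brisbane", 0.81, 0.75, 0.06, 4937),
    ("Darwin", -5.84, -5.71, -0.13, 312),
    ("Adelaide", -0.86, -0.70, -0.16, 3643),
    ("ACT", 0.94, 1.29, -0.35, 1005),
    ("Melbourne", 0.32, 0.72, -0.40, 5678),
]

# Rows where top minus bottom does not match the reported spread.
mismatches = [
    name
    for name, top, bottom, reported, _ in REGIONS
    if abs((top - bottom) - reported) > 0.005
]
positive_regions = [name for name, top, bottom, _, _ in REGIONS if top > bottom]
inverted_regions = [name for name, top, bottom, _, _ in REGIONS if top < bottom]
```

Every reported spread matches top minus bottom, with eight positive regions and four inverted ones among the rows listed.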

6. Suburb-Level Evidence

Suburb-level comparisons for selected cities are available on the summary page.

7. Defence of Method

7.1 Why a Low R-squared Still Matters

An out-of-sample R-squared of 0.013 means the model explains 1.3% of variance in 2-year suburb-level growth. This is low in absolute terms. It would be a poor result for a model trying to predict exact growth rates. But that is not how this index is intended to be used.

Property prices are shaped by hundreds of factors. Interest rate cycles, infrastructure investment, zoning changes, population growth, local employment, and macroeconomic conditions all play a role. No single thematic index will capture the majority of variance. A model that explained 50% of property growth from a handful of census and other government data variables would be implausible and almost certainly overfitted.

The relevant questions are different. Does the signal exist? Is it statistically significant? Does it persist across time? The evidence says yes, with important caveats about the post-2021 period.

7.2 Statistical Significance

The top bin's outperformance has a p-value of 2.5e-50. This is not borderline. If the top bin's true mean growth equalled the national mean, the probability of observing a difference at least this large across 112,564 sales would be effectively zero. For context, a p-value below 0.05 is the standard threshold for statistical significance. The top bin's p-value is smaller than this threshold by about 48 orders of magnitude.
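The orders-of-magnitude figure follows from the ratio of the conventional threshold to the observed p-value, as this short check shows:

```python
import math

threshold = 0.05      # conventional significance threshold
p_top_bin = 2.5e-50   # top bin p-value reported above

# log10 of the ratio gives the gap in orders of magnitude.
orders_of_magnitude = math.log10(threshold / p_top_bin)
```

The ratio is 2e48, so the gap is a little over 48 orders of magnitude.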

7.3 Consistency Over Time (With Caveats)

The signal was significant at 19 of 26 sample dates. But unlike other Microburbs indices, the failures are not random. Six of the seven non-significant dates cluster in two periods: 2018 (weak signal) and 2021-2022 (inverted signal). This is not noise. It suggests the relationship between innovation-economy employment and property growth is conditional on broader economic conditions.

When interest rates are stable or falling, suburbs with high concentrations of innovative knowledge workers tend to outperform. When rates rise sharply (as in 2022-2023), this relationship breaks down. Innovation-heavy suburbs may be more sensitive to rate rises because their residents carry larger mortgages relative to income.

7.4 Geographic Breadth

The spread is positive in 8 of 13 GCCSA regions. This is weaker than other Microburbs indices. The signal works best in regional areas (Rest of WA, Rest of Queensland, Rest of Victoria) and in Sydney. It fails in Melbourne and the ACT. This limits the index to specific geographic applications rather than national use.

7.5 Practical Use

This index works best as a secondary signal. It should not be the primary factor in suburb selection. When combined with other Microburbs indices that have stronger temporal consistency, the Innovation Economy Index adds value by identifying suburbs where knowledge-economy employment supports property demand. But it should be weighted lower than signals with broader geographic and temporal consistency.

Analogy: Think of this like a tailwind indicator for sailing. In normal conditions, it reliably tells you which direction the wind favours. But in a storm (a rate-rise cycle), the wind changes direction entirely. You would not rely on it alone. But in fair weather, it consistently points you toward faster waters.

8. Limitations

8.1 Post-2021 Signal Inversion

This is the most important limitation. The time series chart shows clearly that from 2022 onwards, the below-threshold suburbs outperform the above-threshold suburbs. The three most recent sample dates (2021-07, 2022-02, 2022-07) all show negative spreads. This is not a random failure. It is a structural shift.

The likely explanation is the 2022-2023 interest rate cycle. The Reserve Bank raised rates 13 times between May 2022 and November 2023. Suburbs with high concentrations of innovative knowledge workers tend to have higher house prices and larger mortgages. These suburbs were hit harder by rate rises. The very factor that drove outperformance in 2013-2017 drove underperformance in 2022-2023.

8.2 Narrow Feature Set

The index uses a narrow set of employment variables. This makes it less stable than indices built from six or more variables. A single structural shift in one employment sector can flip the entire signal. Broader indices are more resistant to this type of failure.

8.3 Government Data Is Point-in-Time

The index relies on government data sources including the Australian Census from 2016 and 2021. Census data is collected every five years. Employment patterns can shift between census dates. A suburb that scored highly in 2016 may have lost knowledge workers by 2021. The model cannot capture these shifts in real time.

8.4 Geographic Unevenness

The signal inverts in Melbourne (-0.40%), the ACT (-0.35%), Adelaide (-0.16%), and Darwin (-0.13%). Only 8 of 13 regions show a positive spread. Investors in Melbourne or Canberra should not rely on this index.

8.5 Low R-squared

The model explains 1.3% of out-of-sample variance. The remaining 98.7% is driven by factors outside this index. With a small number of input variables and a 2-year prediction horizon, this was expected. But it means the index should carry less weight than broader, more predictive signals.

8.6 2-Year Horizon

Unlike other Microburbs indices that predict 4-year growth, this index uses a 2-year horizon. Shorter horizons are noisier. A 2-year growth rate is more sensitive to short-term market swings, interest rate movements, and seasonal effects. This partly explains the lower consistency rate of 67% compared to 80% or higher for 4-year indices.

8.7 No Causal Claim

This paper documents a correlation, not a causal mechanism. We hypothesise that suburbs with high concentrations of innovative knowledge workers attract further investment, create local spending demand, and support property prices. But the data does not prove causation. The post-2021 inversion suggests the mechanism is more complex and condition-dependent than a simple causal story would allow.

Summary of limitations: The Innovation Economy Index is a conditional signal. It was statistically significant at most sample dates from 2012 to 2020, with a weak patch in 2018, and positive in most regions. It failed in 2021-2023 and is inverted in several major cities. Use this index as a secondary factor in a multi-signal framework. Weight it lower than indices with stronger temporal and geographic consistency. Do not use it in isolation.


Part of the Threshold Signals research programme