Cultural Integration Index: Technical Whitepaper
Full statistical methodology, bin performance analysis, temporal consistency testing, and regional robustness results across 272,958 property sales.

1. Abstract
This paper presents a composite index that measures community depth in Australian suburbs. The index combines six government data variables capturing different dimensions of how established, layered, and stable a community is. Suburbs with deeply rooted, multi-generational communities score higher. Suburbs where the community is still forming score lower.
We trained a predictive model on historical property sales data to predict 4-year annualised growth rates. The model was evaluated out of sample and achieved an R-squared of 0.071. While this is a modest fit, the signal is statistically significant, with a p-value of 2.0e-47 for the top performance bin. Top-scoring suburbs outperformed the national median by 1.7 percentage points in annualised 4-year growth. Bottom-scoring suburbs underperformed by 2.2 percentage points.
The signal was tested across 163 quarterly periods, 24 individual sample dates, and 12 GCCSA regions. It held in 82% of quarters, was statistically significant at 21 of 24 sample dates, and produced a positive spread in 11 of 12 regions. The ACT was the only region where the signal inverted.
2. Methodology
2.1 Feature Construction
The Cultural Integration Index is built from six government data variables. These variables capture different dimensions of community depth, including demographic layering, residential stability, local institution maturity, and neighbourhood tenure patterns. No single variable carries the signal. The predictive power comes from the interaction between factors.
Each variable was standardised and combined into a single composite score using a weighted formula. The weights were derived from a model trained to predict 4-year annualised property growth relative to the national median.
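The standardise-and-combine step can be sketched as follows. This is a minimal illustration, not the production formula: the variable names, the weights, and the min-max rescaling to a 0 to 100 range are all assumptions made for the example.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardise a list of values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def composite_scores(columns, weights):
    """Weighted sum of standardised variables, rescaled to 0-100.

    columns: variable name -> list of raw values, one per suburb.
    weights: variable name -> weight (hypothetical values in this sketch).
    """
    standardised = {name: zscores(vals) for name, vals in columns.items()}
    n = len(next(iter(columns.values())))
    raw = [sum(weights[k] * standardised[k][i] for k in columns) for i in range(n)]
    lo, hi = min(raw), max(raw)
    # Min-max rescaling is an assumption; the paper only states a 0-100 range.
    return [100 * (r - lo) / (hi - lo) for r in raw]

# Hypothetical inputs: two placeholder variables for four suburbs.
columns = {"var_a": [1.0, 2.0, 3.0, 4.0], "var_b": [10.0, 30.0, 20.0, 40.0]}
weights = {"var_a": 0.6, "var_b": 0.4}
scores = composite_scores(columns, weights)
```

The real index uses six variables and model-derived weights; only the shape of the calculation carries over.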
2.2 Model Training
The model was trained on property sales data spanning 2008 to 2021. Each observation is a single property sale. The target variable is the annualised 4-year growth rate from the sale date. The features are the six variables drawn from census and other government data for the suburb where the sale occurred.
The model was trained using a strict out-of-sample framework. Training and test sets were split by time period to prevent data leakage. Properties sold in overlapping windows were excluded from the test set to ensure clean separation.
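One way to read the split described above is as a time-based split with an embargo. In the sketch below, sales before a cutoff date go to training, and a later sale enters the test set only if it was sold after every training sale's 4-year growth window has closed. The cutoff date, the record layout, and the helper names are hypothetical, and the exact exclusion rule used for the paper may differ.

```python
from datetime import date

HORIZON_YEARS = 4  # growth window length stated in the paper

def add_years(d, years):
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def split_by_time(sales, cutoff):
    """Return (train, test) with no overlap between growth windows.

    Training sales are sold before the cutoff, so their growth windows can
    run until cutoff + 4 years. Later sales sold before that point would
    overlap a training window and are left out of the test set.
    """
    embargo_end = add_years(cutoff, HORIZON_YEARS)
    train = [s for s in sales if s["sold"] < cutoff]
    test = [s for s in sales if s["sold"] >= embargo_end]
    return train, test

# Hypothetical sales records; only the sale date matters for the split.
sales = [
    {"sold": date(2010, 6, 1)},
    {"sold": date(2013, 1, 1)},
    {"sold": date(2015, 3, 1)},
    {"sold": date(2017, 2, 1)},
]
train, test = split_by_time(sales, cutoff=date(2013, 1, 1))
```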
2.3 Performance Metric
The primary metric is the difference in median annualised 4-year growth between each bin and the national median. Statistical significance is assessed using a two-sided t-test against the null hypothesis that the bin's mean growth equals the national mean.
t-statistic = (mean(bin) - mean(national)) / SE(bin)
p-value from two-sided t-test
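The formula above can be sketched in a few lines of Python. To stay within the standard library, this version takes the p-value from a normal approximation rather than the exact t-distribution. That is an assumption of the sketch: it is reasonable for bins containing tens of thousands of sales, but not for the five-value toy sample, which is included only to show the call. The growth values are hypothetical.

```python
import math
from statistics import mean, stdev

def bin_t_test(bin_growth, national_mean):
    """Return (t-statistic, two-sided p-value) for one bin.

    t = (mean(bin) - mean(national)) / SE(bin), where SE(bin) is the
    sample standard deviation divided by sqrt(n).
    """
    n = len(bin_growth)
    se = stdev(bin_growth) / math.sqrt(n)
    t = (mean(bin_growth) - national_mean) / se
    # Normal approximation to the two-sided p-value (large-sample assumption).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# Hypothetical annualised growth rates for a handful of sales in one bin.
t, p = bin_t_test([0.06, 0.08, 0.07, 0.09, 0.05], national_mean=0.05)
```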
2.4 Out-of-Sample R-squared
The model achieved an out-of-sample R-squared of 0.071. This means 7.1% of variance in 4-year suburb-level growth can be explained by the six community depth variables alone. Section 6 discusses why this level of explanatory power is still useful for investment decisions.
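For reference, R-squared on a held-out set can be computed as below. This is a generic sketch with hypothetical numbers. It uses the mean of the test targets as the baseline; some out-of-sample definitions use the training mean instead, and the paper does not say which was applied.

```python
def r_squared(actual, predicted):
    """1 minus the ratio of residual to total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical held-out targets and model predictions.
example_r2 = r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 3.0])
```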
3. Bin Performance
The model sorts suburbs into three bins based on their Cultural Integration score. Each bin has a distinct growth profile.
| Bin | Score Range | Diff vs National | p-value | N (Sales) | Significant |
|---|---|---|---|---|---|
| Top | 62 to 100 | +1.7% | 2.0e-47 | 104,593 | Yes |
| Middle | 9 to 62 | -0.7% | 2.3e-09 | 143,281 | Yes |
| Bottom | 0 to 9 | -2.2% | 7.4e-28 | 25,084 | Yes |
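Assigning a suburb to a bin from the thresholds in the table might look like the sketch below. The table's ranges share their boundary values (9 and 62), so which bin a boundary score falls into is an assumption here: this version places it in the higher bin.

```python
def assign_bin(score, low=9.0, high=62.0):
    """Map a 0-100 Cultural Integration score to Top, Middle, or Bottom.

    Boundary scores go to the higher bin (an assumption of this sketch).
    """
    if score >= high:
        return "Top"
    if score >= low:
        return "Middle"
    return "Bottom"

labels = [assign_bin(s) for s in (100.0, 62.0, 35.5, 9.0, 3.2)]
```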
4. Temporal Analysis
A signal that works at one point in time could be a fluke. We tested the Cultural Integration Index across every quarter from 2008-Q1 to 2021-Q3.
4.1 Date-by-Date Consistency
We tested the top bin's outperformance at 24 individual sample dates between 2011 and 2021. The result was statistically significant at 21 of 24 dates.
| Sample Window | Outperformance (4yr) | Significance |
|---|---|---|
| 2011 | ||
| Dec 2011 → Dec 2015 | +2.20% | Significant |
| 2012 | ||
| Apr 2012 → Apr 2016 | +2.80% | Significant |
| Oct 2012 → Oct 2016 | +3.20% | Significant |
| Nov 2012 → Nov 2016 | +3.30% | Significant |
| 2013 | ||
| June 2013 → June 2017 | +3.90% | Significant |
| 2014 | ||
| Feb 2014 → Feb 2018 | +3.70% | Significant |
| July 2014 → July 2018 | +3.40% | Significant |
| 2015 | ||
| Apr 2015 → Apr 2019 | +2.40% | Significant |
| June 2015 → June 2019 | +2.10% | Significant |
| Sept 2015 → Sept 2019 | +1.70% | Significant |
| 2016 | ||
| May 2016 → May 2020 | +1.80% | Significant |
| Aug 2016 → Aug 2020 | +1.70% | Significant |
| 2017 | ||
| Apr 2017 → Apr 2021 | +2.10% | Significant |
| Sept 2017 → Sept 2021 | +2.20% | Significant |
| 2018 | ||
| Jan 2018 → Jan 2022 | +2.40% | Significant |
| 2019 | ||
| Apr 2019 → Apr 2023 | +1.80% | Significant |
| May 2019 → May 2023 | +1.70% | Significant |
| June 2019 → June 2023 | +1.70% | Significant |
| Nov 2019 → Nov 2023 | +1.30% | Significant |
| Dec 2019 → Dec 2023 | +1.30% | Significant |
| 2020 | ||
| Jan 2020 → Jan 2024 | +1.30% | Significant |
| 2021 | ||
| Jan 2021 → Jan 2025 | +0.50% | Not Significant |
| Mar 2021 → Mar 2025 | +0.30% | Not Significant |
| Aug 2021 → Aug 2025 | +0.00% | Not Significant |
5. Regional Robustness
A signal that works only in one city is less useful than one that works nationally. We tested the Cultural Integration Index across the 12 GCCSA regions covered by the sales data.
5.1 Full Regional Table
All growth rates are annualised over 4 years. The Spread column shows Top Tier Growth minus Bottom Tier Growth for each region.
| Region (GCCSA) | City | Top Tier Growth | Bottom Tier Growth | Spread | N (Sales) |
|---|---|---|---|---|---|
| Rest of Qld | Regional Qld | +1.46% | -1.85% | +3.31% | 16,730 |
| Melbourne | Melbourne | +0.36% | -2.65% | +3.01% | 5,861 |
| Darwin | Darwin | -3.70% | -5.58% | +1.88% | 390 |
| Perth | Perth | -2.05% | -3.76% | +1.71% | 3,874 |
| Rest of WA | Regional WA | -2.01% | -3.60% | +1.59% | 6,566 |
| Sydney | Sydney | -0.49% | -1.61% | +1.12% | 6,573 |
| Rest of NSW | Regional NSW | +2.58% | +1.53% | +1.05% | 11,335 |
| Rest of Vic. | Regional Vic. | +2.79% | +1.93% | +0.86% | 7,189 |
| Brisbane | Brisbane | +1.39% | +0.79% | +0.60% | 5,544 |
| Adelaide | Adelaide | +0.25% | -0.13% | +0.38% | 3,505 |
| Rest of SA | Regional SA | +0.25% | -0.09% | +0.34% | 3,763 |
| ACT | ACT | +0.02% | +0.23% | -0.21% | 1,056 |
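The Spread column can be reproduced from the two growth columns. The short check below uses the figures from the table above; only the variable names are introduced for the example.

```python
# (top tier growth %, bottom tier growth %) per region, copied from the table.
regions = {
    "Rest of Qld": (1.46, -1.85),
    "Melbourne": (0.36, -2.65),
    "Darwin": (-3.70, -5.58),
    "Perth": (-2.05, -3.76),
    "Rest of WA": (-2.01, -3.60),
    "Sydney": (-0.49, -1.61),
    "Rest of NSW": (2.58, 1.53),
    "Rest of Vic.": (2.79, 1.93),
    "Brisbane": (1.39, 0.79),
    "Adelaide": (0.25, -0.13),
    "Rest of SA": (0.25, -0.09),
    "ACT": (0.02, 0.23),
}

# Spread = top tier growth minus bottom tier growth, in percentage points.
spreads = {name: round(top - bottom, 2) for name, (top, bottom) in regions.items()}
positive = [name for name, spread in spreads.items() if spread > 0]
inverted = [name for name, spread in spreads.items() if spread < 0]
```

Here `positive` holds the 11 regions with a positive spread and `inverted` holds only the ACT.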
6. Defence of Method
6.1 Why a Low R-squared Still Matters
An R-squared of 0.071 means the model explains 7.1% of variance in 4-year suburb-level growth. This is low in absolute terms. It would be a poor result for a model trying to predict exact growth rates. But that is not how this index is intended to be used.
Property prices are shaped by hundreds of factors. Interest rate cycles, infrastructure investment, zoning changes, population growth, local employment, and macroeconomic conditions all play a role. No single thematic index will capture the majority of variance.
6.2 Statistical Significance
The top bin's outperformance has a p-value of 2.0e-47. If the top bin's true mean growth equalled the national mean, the probability of observing a gap at least as large as 1.7 percentage points across 104,593 sales would be effectively zero. A p-value below 0.05 is the standard threshold for statistical significance. The top-bin p-value is below this threshold by 45 orders of magnitude.
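The "45 orders of magnitude" figure follows from the ratio of the threshold to the p-value, as this short check shows:

```python
import math

threshold = 0.05       # conventional significance threshold
p_value_top = 2.0e-47  # top-bin p-value reported in the paper

# Number of powers of ten separating the threshold from the p-value.
orders_of_magnitude = math.log10(threshold / p_value_top)
```

The ratio is 2.5e45, so the base-10 logarithm is a little over 45.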
6.3 Consistency Over Time
The signal was significant at 21 of 24 sample dates spanning a decade. The three non-significant dates all fall in 2021, during an extraordinary market environment. A signal that works 88% of the time across a full market cycle is robust.
6.4 Geographic Breadth
The spread is positive in 11 of 12 GCCSA regions. It works in rising markets (Rest of Queensland, Rest of NSW) and falling markets (Perth, Darwin). It works in large metros (Sydney, Melbourne) and regional areas (Rest of Victoria, Rest of Western Australia). The only exception is the ACT.
6.5 Practical Use
Investors do not need a model to predict exact prices. They need a signal to tilt the odds in their favour across a portfolio of purchases. A 1.7 percentage point advantage per purchase, compounded over time, is meaningful.
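To give a sense of scale, the sketch below compounds a 1.7 percentage point annual advantage over one 4-year window. The 5% baseline growth rate is purely hypothetical and is not a figure from this paper. The example also assumes the advantage applies to annualised growth, as in the bin table.

```python
BASELINE_GROWTH = 0.05  # hypothetical annual growth for a median suburb
ADVANTAGE = 0.017       # top-bin edge in annualised growth, from the paper
YEARS = 4

# Value of each $1 invested after 4 years, with and without the edge.
baseline_value = (1 + BASELINE_GROWTH) ** YEARS
tilted_value = (1 + BASELINE_GROWTH + ADVANTAGE) ** YEARS

# Extra value per $1 of purchase price at the end of the window.
gap = tilted_value - baseline_value
```

Under these assumptions the gap is roughly 8 cents per dollar of purchase price after four years. This is an average across many purchases, not a forecast for any single property.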
7. Limitations
7.1 Government Data Is Point-in-Time
The index relies on government data sources, including the 2016 and 2021 Australian Censuses. The Census is collected every five years, so community composition can shift between census dates without the index reflecting it.
7.2 Backward-Looking Model
The model was trained on historical data from 2008 to 2021. Past patterns do not guarantee future results. The signal already weakened during the COVID-era boom of 2020 to 2021.
7.3 Individual Suburb Variation
Even within the top bin, individual suburb outcomes vary widely. Bateau Bay scored 100.0 but returned -1.73% per year during 2021 to 2025. The index provides a statistical edge across large numbers of purchases, not a guarantee for any single suburb.
7.4 Low R-squared
The model explains 7.1% of variance. The remaining 92.9% is driven by factors outside this index. Investors should not use the Cultural Integration Index as their sole decision-making tool.
7.5 Sample Size Variability
Some regions have small sample sizes. Darwin contributed only 390 sales. Results in small-sample regions carry wider confidence intervals.
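The effect of sample size on precision can be approximated from the sales counts in the regional table. The standard error of a mean scales with 1/sqrt(N), so the sketch below compares Darwin with Rest of Qld, assuming (hypothetically) that growth is equally dispersed in both regions.

```python
import math

N_DARWIN = 390          # sales count from the regional table
N_REST_OF_QLD = 16_730  # sales count from the regional table

# With equal dispersion, standard error is proportional to 1 / sqrt(N),
# so the ratio of standard errors is sqrt(N_large / N_small).
se_ratio = math.sqrt(N_REST_OF_QLD / N_DARWIN)
```

Under that assumption, a confidence interval for Darwin is roughly six and a half times as wide as one for Rest of Qld.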
7.6 No Causal Claim
This paper documents a correlation, not a causal mechanism. We hypothesise that deeply layered communities create stable demand, local amenity, and neighbourhood identity that support property values. But the data does not prove causation.
Part of the Threshold Signals research programme