Transport Ecosystem Index: Technical Whitepaper
Full statistical methodology, bin performance analysis, temporal consistency testing, and regional robustness results across 272,958 property sales.

1. Abstract
This paper presents a composite index that captures how residents travel within Australian suburbs. The index combines six transport variables drawn from government data into a single score. The counter-intuitive finding is that suburbs where residents rely more on private transport (spacious, car-friendly areas) outperform denser suburbs with extensive public transport networks.
We trained a model on historical property sales data to predict 4-year annualised growth rates. Evaluated out of sample, it achieved an R-squared of 0.1761; the training R-squared was 0.1694 and the overall R-squared was 0.213. The top-scoring suburbs outperformed the national median by 4.0 percentage points of annualised 4-year growth. The bottom-scoring suburbs underperformed by 3.3 percentage points.
The signal was tested across 163 quarterly periods and 13 GCCSA regions. The top bin was statistically significant at 147 of 163 dates. The above-threshold suburbs beat the below-threshold suburbs in 122 of 163 quarters (75%). The signal produced a positive spread in 8 of 13 regions. These results indicate a persistent, geographically broad relationship between travel mode patterns and subsequent property growth.
2. Methodology
2.1 Feature Construction
The Transport Ecosystem Index is built from six travel mode variables derived from government data. These variables capture commuting patterns and travel mode preferences at the suburb level. We do not disclose the specific census field names, as the combination itself represents proprietary research.
Each variable was standardised and combined into a single composite score using a weighted formula. The weights were derived from a model trained to predict 4-year annualised property growth relative to the national median.
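The standardise-then-weight step can be sketched as follows. This is a minimal illustration, not the production formula: the paper does not disclose the variables or the weights, so the two columns, the weights, and the rank-based rescaling to 0-100 below are all assumptions made for the example.

```python
from statistics import mean, pstdev

def standardise(values):
    """Convert raw values to z-scores (mean 0, standard deviation 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def composite_scores(columns, weights):
    """Weighted sum of standardised columns, one raw score per suburb."""
    z_columns = [standardise(col) for col in columns]
    n_suburbs = len(columns[0])
    return [
        sum(w * z_col[i] for w, z_col in zip(weights, z_columns))
        for i in range(n_suburbs)
    ]

def to_percentile(scores):
    """Rescale raw scores to 0-100 by rank (an assumed scaling)."""
    n = len(scores)
    if n == 1:
        return [100.0]
    order = sorted(range(n), key=lambda i: scores[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        out[i] = 100.0 * rank / (n - 1)
    return out

# Hypothetical inputs: two variables for four suburbs, made-up weights.
# The real index uses six variables with model-derived weights.
columns = [[0.60, 0.75, 0.90, 0.80], [0.30, 0.15, 0.05, 0.10]]
weights = [0.7, -0.3]
raw = composite_scores(columns, weights)
index = to_percentile(raw)
```

Standardising first puts every variable on the same scale, so the weights alone determine how much each variable contributes to the score.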
2.2 Model Training
The model was trained on property sales data spanning 2008 to 2021. Each observation is a single property sale. The target variable is the annualised 4-year growth rate from the sale date. The features are the six transport-derived variables for the suburb where the sale occurred.
The model was trained using a strict out-of-sample framework. Training and test sets were split by time period to prevent data leakage. Properties sold in overlapping windows were excluded from the test set to ensure clean separation.
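One way to read the split described above is sketched below. The cutoff date, the sale records, and the helper names are hypothetical; the only detail taken from the paper is that the target looks 4 years past each sale date, so test sales that fall inside a training sale's outcome window are dropped.

```python
from datetime import date

HORIZON_YEARS = 4  # the growth target looks 4 years past each sale date

def add_years(d, years):
    """Shift a date by whole years (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)

def split_by_time(sales, cutoff):
    """Split (sale_id, sale_date) pairs into train and test lists.

    Train: sales dated before the cutoff.
    Test: sales dated on or after cutoff + 4 years, so no test sale
    falls inside the outcome window of any training sale.
    Sales in between are excluded from both sets.
    """
    test_start = add_years(cutoff, HORIZON_YEARS)
    train = [s for s in sales if s[1] < cutoff]
    test = [s for s in sales if s[1] >= test_start]
    return train, test

# Hypothetical sales and cutoff, for illustration only.
sales = [
    ("a", date(2009, 3, 1)),
    ("b", date(2012, 6, 30)),
    ("c", date(2014, 1, 15)),  # inside the overlap window, so dropped
    ("d", date(2017, 2, 1)),
]
train, test = split_by_time(sales, cutoff=date(2013, 1, 1))
```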
2.3 Performance Metric
The primary metric is the difference in median annualised 4-year growth between each bin and the national median. Statistical significance is assessed using a two-sided t-test against the null hypothesis that the bin's mean growth equals the national mean.
t-statistic = (mean(bin) - mean(national)) / SE(bin)
p-value from two-sided t-test
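The test statistic above can be computed as in the sketch below. It assumes SE(bin) is the sample standard deviation divided by the square root of the bin size, and it uses a normal approximation for the two-sided p-value, which is reasonable only for large bins such as those in this paper. The function name and sample values are hypothetical.

```python
import math
from statistics import mean, stdev

def t_test_vs_national(bin_growth, national_mean):
    """Return (t-statistic, approximate two-sided p-value) for one bin."""
    n = len(bin_growth)
    se = stdev(bin_growth) / math.sqrt(n)  # standard error of the bin mean
    t = (mean(bin_growth) - national_mean) / se
    # Two-sided tail probability under a standard normal distribution.
    # An exact small-sample p-value would use the t distribution instead.
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# Made-up growth rates; five observations is far too few for the normal
# approximation and is used here only to keep the example readable.
t, p = t_test_vs_national([0.05, 0.07, 0.06, 0.08, 0.04], national_mean=0.03)
```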
2.4 Out-of-Sample R-squared
The model achieved an out-of-sample R-squared of 0.1761. The training R-squared was 0.1694 and the overall R-squared was 0.213. This means 17.6% of variance in 4-year suburb-level growth can be explained by the six travel mode variables alone. The close alignment between train and test R-squared suggests the model is not overfitted.
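For reference, R-squared is one minus the ratio of residual to total sum of squares. The sketch below uses the standard definition with the mean of the evaluated sample as the baseline; whether the paper's out-of-sample figure uses the test-set mean or the training-set mean as that baseline is not stated, so treat this choice as an assumption.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical values: predictions that capture part of the variation.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r_squared(actual, predicted)
```

A model that always predicts the sample mean scores 0 under this definition, and a perfect model scores 1.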
3. Bin Performance
The model sorts suburbs into three bins based on their Transport Ecosystem score. Each bin has a distinct growth profile. The table below shows the full results.
| Bin | Score Range | Diff vs National | p-value | N (Sales) | Significant |
|---|---|---|---|---|---|
| Top | 86 to 100 | +4.0% | 3.2e-142 | 38,548 | Yes |
| Middle | 32 to 86 | +0.7% | 4.2e-10 | 146,319 | Yes |
| Bottom | 0 to 32 | -3.3% | 8.3e-174 | 88,091 | Yes |
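The binning in the table can be sketched as a simple threshold rule. The table lists 32 and 86 as edges of two adjacent bins without saying which bin owns them, so assigning a score of exactly 32 to Middle and exactly 86 to Top is an assumption here.

```python
def assign_bin(score, low=32, high=86):
    """Map a 0-100 Transport Ecosystem score to Top, Middle, or Bottom."""
    if score >= high:
        return "Top"
    if score >= low:
        return "Middle"
    return "Bottom"

# Sales counts per bin, copied from the table above.
sales_per_bin = {"Top": 38_548, "Middle": 146_319, "Bottom": 88_091}
total_sales = sum(sales_per_bin.values())  # 272,958, the full sample
```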
4. Temporal Analysis
A signal that works at one point in time could be a fluke. We tested the Transport Ecosystem Index across every quarter from 2008-Q1 to 2021-Q3.
4.1 Date-by-Date Consistency
The top bin was statistically significant at 147 of 163 individual dates. Below is a consistency summary showing how the signal performed across different time windows. Of 163 dates tested, the above-threshold suburbs beat the below-threshold suburbs at 122 dates (75%).
| Sample Window | Quarters in Period | Top > Bottom | Consistency |
|---|---|---|---|
| 2000s | | | |
| Jan 2008 → Oct 2013 | 8 | 3 of 8 | 38% |
| 2010s | | | |
| Jan 2010 → Oct 2015 | 8 | 4 of 8 | 50% |
| Jan 2012 → Oct 2017 | 8 | 8 of 8 | 100% |
| Jan 2014 → Oct 2019 | 8 | 8 of 8 | 100% |
| Jan 2016 → Oct 2021 | 8 | 8 of 8 | 100% |
| Jan 2018 → Oct 2023 | 8 | 8 of 8 | 100% |
| 2020s | | | |
| Jan 2020 → Jul 2025 | 7 | 5 of 7 | 71% |
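The Consistency column is the share of quarters in each window where the top bin beat the bottom bin, rounded to a whole percent. The sketch below reproduces that arithmetic from the counts in the table, along with the headline 122 of 163 figure; the variable names are illustrative.

```python
# (window label, quarters where top beat bottom, quarters in window)
windows = [
    ("Jan 2008 to Oct 2013", 3, 8),
    ("Jan 2010 to Oct 2015", 4, 8),
    ("Jan 2012 to Oct 2017", 8, 8),
    ("Jan 2014 to Oct 2019", 8, 8),
    ("Jan 2016 to Oct 2021", 8, 8),
    ("Jan 2018 to Oct 2023", 8, 8),
    ("Jan 2020 to Jul 2025", 5, 7),
]

# Whole-percent consistency per window.
consistency = [round(100 * wins / quarters) for _, wins, quarters in windows]

# Headline figure quoted in the text: 122 of 163 dates.
headline = round(100 * 122 / 163)
```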
5. Regional Robustness
A signal that works only in one city is less useful than one that works nationally. We tested the Transport Ecosystem Index across 13 GCCSA (Greater Capital City Statistical Area) regions in Australia.
5.1 Full Regional Table
All growth rates are annualised over 4 years. The Spread column is Top Tier Growth minus Bottom Tier Growth.
| Region (GCCSA) | City | Top Tier Growth | Bottom Tier Growth | Spread | N (Sales) |
|---|---|---|---|---|---|
| Darwin | Darwin | -3.42% | -6.65% | +3.23% | 367 |
| Melbourne | Melbourne | -0.13% | -3.15% | +3.02% | 4,945 |
| Rest of WA | Regional WA | -0.77% | -3.14% | +2.37% | 3,603 |
| Rest of Qld | Regional Qld | +0.86% | -1.20% | +2.06% | 13,184 |
| Rest of NSW | Regional NSW | +2.45% | +1.03% | +1.42% | 14,351 |
| Sydney | Sydney | -0.93% | -1.24% | +0.31% | 4,576 |
| Adelaide | Adelaide | +0.15% | -0.08% | +0.23% | 3,216 |
| Rest of Vic. | Regional Vic. | +2.30% | +2.19% | +0.11% | 7,443 |
| ACT | ACT | +0.15% | +0.70% | -0.55% | 1,129 |
| Brisbane | Brisbane | +0.07% | +0.71% | -0.64% | 1,854 |
| Perth | Perth | -3.35% | -2.41% | -0.94% | 2,684 |
| Rest of SA | Regional SA | -1.40% | -0.46% | -0.94% | 2,342 |
| Rest of NT | Regional NT | -6.97% | -2.73% | -4.24% | 147 |
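The spreads in the table can be recomputed directly from the two growth columns. The sketch below does this and counts the regions with a positive spread; the figures are copied from the table and the variable names are illustrative.

```python
# (region, top tier growth %, bottom tier growth %, reported spread %)
regions = [
    ("Darwin", -3.42, -6.65, 3.23),
    ("Melbourne", -0.13, -3.15, 3.02),
    ("Rest of WA", -0.77, -3.14, 2.37),
    ("Rest of Qld", 0.86, -1.20, 2.06),
    ("Rest of NSW", 2.45, 1.03, 1.42),
    ("Sydney", -0.93, -1.24, 0.31),
    ("Adelaide", 0.15, -0.08, 0.23),
    ("Rest of Vic.", 2.30, 2.19, 0.11),
    ("ACT", 0.15, 0.70, -0.55),
    ("Brisbane", 0.07, 0.71, -0.64),
    ("Perth", -3.35, -2.41, -0.94),
    ("Rest of SA", -1.40, -0.46, -0.94),
    ("Rest of NT", -6.97, -2.73, -4.24),
]

# Spread = top tier growth minus bottom tier growth, in percentage points.
spreads = {name: top - bottom for name, top, bottom, _ in regions}

positive = [name for name, s in spreads.items() if s > 0]
negative = [name for name, s in spreads.items() if s < 0]
```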
6. Defence of Method
6.1 Why the R-squared Is Noteworthy
An out-of-sample R-squared of 0.176 means the model explains 17.6% of variance in 4-year suburb-level growth. This is a strong result for a single thematic factor. Property prices are shaped by hundreds of variables. Interest rate cycles, infrastructure investment, zoning changes, population growth, local employment, and macroeconomic conditions all play a role. No single index will capture the majority of variance.
The close alignment between train R-squared (0.1694) and test R-squared (0.1761) suggests the model is not overfitted and that the signal generalises to unseen data. A model that explained 50% of property growth from six government data variables would be implausible and almost certainly overfitted.
6.2 Statistical Significance
The top bin's outperformance has a p-value of 3.2e-142. This is not borderline. If the top bin's true mean growth equalled the national mean, the probability of observing a difference this large across 38,548 sales would be effectively zero. For context, a p-value below 0.05 is the standard threshold for statistical significance. The top bin's p-value clears this threshold by roughly 140 orders of magnitude.
The bottom bin is even more significant at p = 8.3e-174 across 88,091 sales. Both tails of the distribution carry strong statistical weight.
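The orders-of-magnitude comparison is the base-10 logarithm of the ratio between the 0.05 threshold and each reported p-value. A short sketch of that arithmetic, with an illustrative function name:

```python
import math

ALPHA = 0.05  # conventional significance threshold

def orders_below_threshold(p_value, alpha=ALPHA):
    """How many powers of ten the p-value sits below the threshold."""
    return math.log10(alpha / p_value)

top_bin = orders_below_threshold(3.2e-142)     # roughly 140
bottom_bin = orders_below_threshold(8.3e-174)  # roughly 172
```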
6.3 Consistency Over Time
The top bin was significant at 147 of 163 dates. The above-threshold suburbs beat the below-threshold suburbs in 122 of 163 quarters (75%). A signal that works three-quarters of the time across a full market cycle spanning booms, corrections, and the COVID pandemic is robust. The 25% of quarters where the signal did not hold are clustered around the GFC recovery (2008-2011) and the pandemic boom (2020-2021), which are identifiable and understandable market conditions.
6.4 Geographic Breadth
The spread is positive in 8 of 13 GCCSA regions. It holds in declining markets (Darwin with a spread of +3.23%, Rest of WA at +2.37%) and in growing markets (Rest of Queensland at +2.06%, Rest of NSW at +1.42%). Of the five negative-spread regions, one has a very small sample (Rest of NT at 147 sales) and two have unique market dynamics (ACT, Brisbane).
6.5 Practical Use
Investors do not need a model to predict exact prices. They need a signal to tilt the odds in their favour across a portfolio of purchases. A 4.0 percentage point advantage per purchase, compounded over time, is highly meaningful. Combined with other Microburbs signals, the Transport Ecosystem Index forms one layer in a multi-factor approach to suburb selection.
7. Limitations
7.1 Government Data Is Point-in-Time
The index relies on government data sources including the Australian Census from 2016 and 2021. Census data is collected every five years. Transport usage patterns can shift between census dates. A suburb that scored highly in 2016 may have changed by 2021 due to new infrastructure or shifting commute patterns. The model cannot capture these shifts in real time.
7.2 Backward-Looking Model
The model was trained on historical data from 2008 to 2021. Past patterns do not guarantee future results. The relationship between travel mode patterns and property growth could weaken or reverse if underlying market dynamics change. New transport infrastructure or shifts in commuting preferences may alter suburb rankings in ways not captured by the current model.
7.3 Individual Suburb Variation
Even within the top bin, individual suburb outcomes vary widely. The index provides a statistical edge across large numbers of purchases, not a guarantee for any single suburb.
7.4 Five Negative-Spread Regions
The signal inverts in 5 of 13 regions. Brisbane (-0.64%), ACT (-0.55%), Perth (-0.94%), Rest of SA (-0.94%), and Rest of NT (-4.24%) all show negative spreads. Investors in these regions should treat the Transport Ecosystem Index with caution. The signal is strongest in Melbourne, Darwin, and regional areas of WA, Queensland, and NSW.
7.5 Sample Size Variability
Some regions have small sample sizes. Rest of NT contributed only 147 sales. Darwin contributed 367 sales. Results in small-sample regions carry wider confidence intervals and should be interpreted with caution.
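To give a rough sense of scale, the standard error of a mean shrinks with the square root of the sample size. The sketch below compares the small-sample regions against Rest of NSW (14,351 sales), assuming the dispersion of growth rates is the same in every region. The paper does not report regional dispersion, so the ratios are indicative only.

```python
import math

def relative_se(n_small, n_reference):
    """Ratio of standard errors, assuming equal standard deviations."""
    return math.sqrt(n_reference / n_small)

REST_OF_NSW_SALES = 14_351  # largest regional sample in the table

rest_of_nt = relative_se(147, REST_OF_NSW_SALES)  # roughly 9.9x wider
darwin = relative_se(367, REST_OF_NSW_SALES)      # roughly 6.3x wider
```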
7.6 No Causal Claim
This paper documents a correlation, not a causal mechanism. We hypothesise that spacious, car-friendly suburbs offer larger lots, lower density, and lifestyle appeal that sustains buyer demand over time. But the data does not prove causation. Other unmeasured variables (such as land supply constraints, household income, or proximity to employment centres) may explain part or all of the observed relationship.
Part of the Threshold Signals research programme