Journal article

Constructing whole of population cohorts for health and social research using the New Zealand Integrated Data Infrastructure

Statistical computing; Cardiovascular diseases; New Zealand


Objectives: To construct and compare a 2013 New Zealand population derived from Statistics New Zealand’s Integrated Data Infrastructure (IDI) with the 2013 census population and a 2013 Health Service Utilisation population, and to ascertain the differences in cardiovascular disease prevalence estimates derived from the three cohorts.

Methods: We constructed three national populations through multiple linked administrative data sources in the IDI and compared the three cohorts by age, gender, ethnicity, area‐level deprivation and District Health Board. We also estimated cardiovascular disease prevalence based on hospitalisations using each of the populations as denominators.

Results: The IDI population was the largest and most informative cohort. The percentage differences between the IDI and the other two populations were largest for males and for those aged 15–34 years. The percentage differences between the IDI and Census cohorts were largest for people living in the most deprived areas. The ethnic distribution varied across the three cohorts. Using the IDI population as a reference, the Health Service Utilisation population generally overestimated cardiovascular disease prevalence, while the Census population generally underestimated it.

Conclusions and implications: The New Zealand IDI population is the most comprehensive and appropriate national cohort for use in health and social research.
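The comparison described in the Methods and Results can be illustrated with a minimal sketch. It assumes that each cohort's prevalence is the number of cohort members with a qualifying hospitalisation divided by the cohort size, and that differences are expressed relative to a reference cohort. The function names, cohort labels and counts below are all hypothetical and are not the study's figures.

```python
def prevalence_percent(cases: int, denominator: int) -> float:
    """Prevalence as a percentage of the denominator population."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * cases / denominator


def percent_difference(value: float, reference: float) -> float:
    """Signed percentage difference of value relative to reference."""
    return 100.0 * (value - reference) / reference


# Hypothetical (cases, cohort size) pairs, for illustration only.
cohorts = {
    "reference": (1_000, 25_000),
    "cohort_b": (990, 22_000),
    "cohort_c": (900, 24_000),
}

# Prevalence estimate for each cohort, using its own denominator.
estimates = {
    name: prevalence_percent(cases, size)
    for name, (cases, size) in cohorts.items()
}

# Report each estimate and its difference from the reference cohort.
for name, estimate in estimates.items():
    diff = percent_difference(estimate, estimates["reference"])
    print(f"{name}: {estimate:.2f}% ({diff:+.2f}% vs reference)")
```

A positive difference means the cohort overestimates prevalence relative to the reference, and a negative difference means it underestimates it. Because both the numerator and the denominator can change with the cohort definition, a smaller cohort does not necessarily produce a higher estimate.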
