Prospective longitudinal study investigating predictors of childhood injuries from Growing Up in New Zealand cohort: study protocol
  1. Luam Ghebreab1,
  2. Bridget Kool1,
  3. Arier Lee1,
  4. Susan Morton2
  1. 1 Faculty of Medical and Health Sciences, School of Population Health, Section of Epidemiology and Biostatistics, The University of Auckland, Auckland, New Zealand
  2. 2 Faculty of Medical and Health Sciences, School of population health, Department of Social and Community Health, The University of Auckland, Auckland, New Zealand
  1. Correspondence to Dr Luam Ghebreab, School of population health, Section of Epidemiology and Biostatistics, The University of Auckland Faculty of Medical and Health Sciences, Auckland 1142, New Zealand; luam.ghebreab{at}


Background Injury is one of the leading causes of mortality and morbidity worldwide, yet it is both preventable and predictable. In New Zealand (NZ), unintentional injury is the leading cause of emergency department visits, hospitalisations and death among children, making it a significant public health concern.

Objective To identify the factors that place young children in NZ at an increased risk of unintentional injury.

Methods This study will investigate injuries among children from the prospective Growing Up in NZ birth cohort of 6853 children and their families. The primary outcome of interest is injury events where medical treatment was sought. The data sources include parental reports of child injury and injury claims from the Accident Compensation Corporation (NZ’s no-fault injury compensation system). The linked datasets will be used to examine the distribution of life course exposures and outcome data using descriptive statistics. A temporal multilevel model will then be developed to examine relationships between neighbourhood, child and family characteristics and injury from birth to 5 years of age for all children for whom parental consent to link data was obtained.

Discussion The findings of this research will help to identify how the multiplicity of influences between children, family and their broader societal context acting across time affect their risk of experiencing a preschool injury. This information will provide an evidence base to inform context-relevant strategies to reduce and prevent childhood injuries.

  • longitudinal
  • cohort study
  • burden of disease
  • child

Data availability statement

No data are available.


Injuries pose a major public health threat to children throughout the world and are the leading cause of premature morbidity and mortality.1 In New Zealand (NZ), unintentional childhood injury accounts for more than 90% of all injuries and is the leading cause of emergency department visits, hospitalisations and death among children.2 The annual incidence of acute admissions to hospitals due to unintentional injury among under 15-year-olds is 775 per 100 000 children.3 Unintentional injury accounts for approximately 29% of all deaths among children aged 1–14 years, claiming around 35 children per year.4 This associated annual mortality rate is among the highest in high-income countries.5 In 2008 (the latest data available), it was estimated that the total social and economic costs per child fatality in NZ were US$8.1 million.6 The annual Accident Compensation Corporation (ACC)—NZ’s no-fault injury compensation system7—claim expenditure for child injury is around US$175 million.6

The incidence of child injuries in NZ is significantly higher in younger children (excluding neonates) compared with older adolescents (NZ Injury Query System, 2020). Injury patterns vary by age group with suffocation the most common cause of death among neonates, falls and burns among preschool children and motor vehicle-related deaths most common among older children and adolescents.3 8 9 According to the 2013–2017 Injury Prevention Research Unit report, falls are the leading cause of hospitalisation for all children under the age of 14 years in NZ.10

The occurrence of childhood injury in NZ varies by ethnicity and socioeconomic determinants.11 12 For instance, Māori children aged 0–14 years have significantly higher unintentional injury mortality and hospitalisation rates when compared with non-Māori children.13 Additionally, injury rates among children living in areas of high deprivation are higher compared with those living in less deprived areas.14 The causes of child injury are most likely to be multifactorial15 and significantly impacted by broader social and environmental determinants.

There have been numerous descriptive reports and studies in NZ using conventional epidemiological methods to investigate risk factors for unintentional childhood injury.9 16–18 These studies have predominantly focused on short-term, single or a few exposures despite the likely complex, multifaceted and interlinked contributory frameworks to childhood injury.19 20 These studies have highlighted several important modifiable risk factors. The Dunedin Multidisciplinary Study21 and the Christchurch Health and Development Study22 are longitudinal studies commenced in NZ in the 1970s. These studies have made a significant contribution to the field of longitudinal research in NZ. Their findings have provided an evidence base that has informed injury prevention policy and practice23 and have contributed to early childhood intervention programmes.24 However, there is a paucity of evidence in NZ from studies reflecting contemporary children and their environments, NZ’s ethnic diversity and that contain context-relevant data to determine timely points to identify risk factors for child injury. Without such evidence, it remains challenging to develop informed preventive strategies to reduce child injury rates in NZ.

Aims and research questions

The study aims to explore how combinations of individual child characteristics, proximal and distal social environments and macroenvironmental factors, and multiple events acting across the early life course either protect young children (0–5 years) or place them at greater risk of experiencing unintentional injuries requiring medical attention. The study will address the following research questions (RQs):

  1. What are the distribution and patterns of childhood injury events where medical treatment was sought among young (0–5 years) children from the Growing Up in NZ (GUiNZ) cohort?

  2. How reliable is parental/caregiver report of medically attended child injuries?

  3. What are the risk and protective factors for injury among a cohort of young children within different age ranges (0–9 months, 9 months−2 years and 2 years to 4.5 years)?

  4. What is the longitudinal relationship between a wide range of child, social and physical environment-related risk factors and injury among children up to the age of 5 years?



This study will analyse data from GUiNZ, a contemporary and population-relevant NZ longitudinal birth cohort.25 The prospective GUiNZ birth cohort enrolled pregnant mothers with expected delivery dates between 25 April 2009 and 25 March 2010 who resided in three contiguous North Island District Health Board regions during pregnancy. These three regions are where 29% of the NZ population reside and over one-third of live births occur.26 The initial GUiNZ birth cohort consisted of 6822 mothers during pregnancy (4401 partners), and 6853 potential births.27 The longitudinal multidisciplinary information collected from the children and their families includes parental self-reported developmental information across multiple interconnected disciplinary domains and direct objective measures of child development, child and parent health outcomes, dynamic interactions between the child and their parents and environments and biological samples. The study also included an explicit aim, from the beginning, to seek parental consent for linkage to routine administrative data, including health and education information.25 Attrition rates were low across the early years of this study (with more than 90% of the baseline cohort completing the major data collection waves (DCWs) throughout the preschool period) and overall opt-out was less than 5% of the baseline cohort.28 Compared with similar international birth cohort studies, the completion rate from eligible participants remained high (81%) in the latest (8-year) DCW.29

This study adapted the ecological model of injury across the life-course, a ‘lens and telescope’ model developed by Hosking et al.30 The adaptation was developed in conjunction with the GUiNZ experts and aligned to the GUiNZ conceptual framework.25 31 The model outlines a conceptual framework that encompasses a comprehensive picture of the multiple factors that surround childhood injury and provides a foundation for causal pathway representations to link the developmental outcomes at different periods across the life-course.32 This approach also has the potential to direct policy initiatives to more than one specific area at a time where intervention is likely to have the highest impact on reducing injuries.

Data sources

The GUiNZ cohort data have been collected using face-to-face, telephone and computer-mediated interviews as well as through linkage of existing data at different DCWs.25 At each DCW, separate interviews were conducted for the maternal and partner data, and child data were gathered by interviewing the mother and her partner, through direct observation by interviewers and from the children themselves. This research will use data collected from mothers’ face-to-face computer-assisted personal interviews during the antenatal period (DCW0) and at 9 months (DCW1), 24 months (DCW2) and 54 months (DCW5), and from mothers’ computer-assisted telephone interviews (6 weeks, 16 months and 45 months).33 Relevant variables of linked data and child measurements at the perinatal stage will also be included (see table 1).

Table 1

GUiNZ DCWs used for this research and the number of participants

In addition, the study will draw on information from ACC. When medical attention is sought for an injury, an ACC claim form (AC 45) is completed by the patient (or caregiver) and the attending clinician.7

The National Health Index (NHI) number, NZ’s unique health identifier, of children in the GUiNZ cohort will be used to identify injury claims from ACC’s database. The two datasets will then be probabilistically linked using NHI, date of birth and child’s sex at birth. Parental consent has been provided for 97% of the GUiNZ cohort to link their interview data to routine datasets, and this process is undertaken according to strict data access protocols that are applied to all GUiNZ data access.34
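As a rough illustration of the linkage keys named above, the following sketch attaches claims to cohort children only when NHI, date of birth and sex at birth all agree. It is a simplified exact-match check, not the probabilistic linkage the protocol describes, and the field names (`nhi`, `dob`, `sex`, `cid`) and the identifier values are hypothetical.

```python
from datetime import date


def link_claims(cohort, claims):
    """Attach the cohort child ID to each claim whose NHI, date of birth and sex all match."""
    # Build a lookup from the three linkage keys to the cohort child ID.
    index = {(c["nhi"], c["dob"], c["sex"]): c["cid"] for c in cohort}
    linked, unmatched = [], []
    for claim in claims:
        cid = index.get((claim["nhi"], claim["dob"], claim["sex"]))
        if cid is None:
            unmatched.append(claim)  # kept for manual review rather than dropped
        else:
            linked.append({**claim, "cid": cid})
    return linked, unmatched


# Hypothetical records for illustration only.
cohort = [{"cid": 10001, "nhi": "ZZZ0001", "dob": date(2009, 6, 1), "sex": "F"}]
claims = [
    {"nhi": "ZZZ0001", "dob": date(2009, 6, 1), "sex": "F", "cause": "fall"},
    {"nhi": "ZZZ0001", "dob": date(2009, 6, 2), "sex": "F", "cause": "burn"},
]
linked, unmatched = link_claims(cohort, claims)
```

Keeping the unmatched claims in a separate list makes it straightforward to inspect records that disagree on a single key, which is where a probabilistic method would differ from this exact-match sketch.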

Outcomes of interest

Primary outcome

The primary outcome of interest in this research is injury events where medical treatment was sought for children aged 0–5 years in the GUiNZ cohort. This outcome will be obtained from two sources: first, parental self-report of child injury from the GUiNZ data collected at three time points, during the first, second and fifth DCWs, when the children were aged 9 months, 24 months and 54 months old, respectively (see table 2); and second, injury claims contained in the ACC database for children from the GUiNZ cohort.

Table 2

GUiNZ injury-related questions asked at different DCWs

Secondary outcomes

The secondary outcomes of interest, recurrence of injuries (number of injuries) and hospitalisation versus no hospitalisation, will be obtained from parental self-reported injury and ACC claim records.

Information obtained from ACC records will include injury date, cause of injury, scene of injury, primary diagnosis, International Classification of Diseases (ICD-10) code, whether or not hospital admission was required and intent.

Variables of interest

Sociodemographic variables

GUiNZ ethnicity classification for the child and mother uses both the Statistics NZ classification of Ethnicity Level 1 (European; Māori; Pacific; Asian; Middle Eastern, Latin American or African; Other; and New Zealander) and more detailed Level 3 information.35 36 Child sociodemographic variables include sex, obtained from the perinatal linkage, and self-prioritised ethnicity of the child as described by their mother at 54 months. Maternal demographic data were collected during DCW0 and included self-prioritised ethnicity, age and highest education. Maternal age will be categorised into four groups: under 20 years, 20–29 years, 30–39 years and 40 or more years. Highest maternal education will be collapsed into three categories: high school or less, post-secondary and university.37
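The maternal age grouping described above can be written as a small helper. This is a minimal sketch that assumes age is recorded in completed years; the function name and label strings are illustrative rather than taken from the GUiNZ data dictionary.

```python
def maternal_age_group(age_years):
    """Map maternal age in completed years to the four categories named in the protocol."""
    if age_years < 20:
        return "under 20 years"
    if age_years < 30:
        return "20-29 years"
    if age_years < 40:
        return "30-39 years"
    return "40 or more years"


# Example: boundary ages fall into the upper category of each cut-off.
groups = [maternal_age_group(a) for a in (19, 20, 29, 30, 39, 40)]
```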

Explanatory variables

A list of the relevant exposure variables that will be considered in the analyses is displayed in table 3. These variables, which cover aspects of children’s biological, developmental, cultural and broader environmental factors, were selected for inclusion in this study on the basis of the relevant literature38 39 and current expertise in the field of injury prevention. Variables will also be selected on the basis of having an acceptable level of missing data. Explanatory variables will comprise continuous, binary and categorical (>2 groups) data. Where possible, validated cut-offs relevant to contemporary NZ children will be used to transform scale data into binary or categorical variables. Additionally, indices and composite variables will be created based on previous evidence published from GUiNZ data: cognitive,40 self-control,41 developmental concerns,42 safety in household,43 material hardship27 and overall house quality44 from the United Kingdom Millennium Cohort study. The individual variables that form the indices will be further analysed to observe the univariate effect of each variable. Details of all variables included in this study are described in online supplemental table 4.

Table 3

Explanatory variables considered

Study population

Children will be included in the current study if their mothers had provided complete injury data, they were classified as NZ residents during data collection, and their parents had consented to link to their routinely collected health records (including ACC data).

Data management

Merging of several datasets from different waves will follow the GUiNZ Reference and Process User Guidelines.34 Each dataset of the GUiNZ DCWs has main keys for child identification (CID), family ID (FAMID) and/or mother ID (MID), and where relevant partner ID (PID). These identifying numbers have five digits with an additional suffix indicating either the child, primary caregiver or the partner. Following the importing of all datasets required for each DCW, the CID (KEY) dataset, which contains the identification numbers of all 6853 participants who will be included in the study, will be used as a key variable to merge each dataset in each DCW included in this study using R software and STATA. The details of the subsequent process of merging all datasets are illustrated in figure 1. The final dataset will be manually checked for any duplications or loss of data by comparing descriptive statistics of selected variables between the DCW (0–5) with each of the originally created datasets.

Figure 1

GUiNZ dataset merging process. CID, child identification; DCWs, data collection waves; FAMID, family ID; MID, mother ID; PID, partner ID.
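The merge-and-check step can be sketched as a left join of each wave onto the key list of child IDs, with a guard against duplicated IDs. The protocol specifies R and STATA for this work; the Python below is only an illustration of the logic, and the wave names, field names and ID values are hypothetical.

```python
def merge_waves(key_ids, waves):
    """Left-join each wave's records onto the key list of child IDs (CID)."""
    merged = {cid: {"cid": cid} for cid in key_ids}
    for wave_name, rows in waves.items():
        seen = set()
        for row in rows:
            cid = row["cid"]
            if cid in seen:
                # A repeated CID within one wave signals a duplication problem.
                raise ValueError(f"duplicate child ID {cid} in {wave_name}")
            seen.add(cid)
            if cid in merged:  # ignore records not present in the key dataset
                for field, value in row.items():
                    if field != "cid":
                        merged[cid][f"{wave_name}_{field}"] = value
    return list(merged.values())


# Hypothetical example: three children in the key dataset, two waves.
key_ids = [10001, 10002, 10003]
waves = {
    "dcw1": [{"cid": 10001, "injured": True}, {"cid": 10002, "injured": False}],
    "dcw2": [{"cid": 10001, "injured": False}],
}
merged = merge_waves(key_ids, waves)
```

Because every child in the key list keeps a row even when a wave has no record for them, the row count of the merged dataset can be compared directly with the key dataset as a simple check for loss of data.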

Statistical analysis

Data analysis will be performed using STATA and R software. Three forms of data analyses (descriptive, bivariate and multivariable analysis) will be performed based on the steps to address the RQs.

Descriptive analysis

To address RQ1, descriptive analyses of each dataset will be undertaken to explore the characteristics and distribution of exposure variables and injury outcomes from GUiNZ and ACC data. The ACC data will be matched with GUiNZ data using birth month, child sex at birth and NHI number to ensure the correct child and associated information are received prior to analysing the data. Mean and SD, or median and IQR, will be examined for continuous variables depending on the outcomes of several visual and numerical normality tests (histogram, Kolmogorov-Smirnov (K-S) normality test and Shapiro-Wilk’s test). Graphics of the variables will be generated, including Kernel density plots, P–P plots and Q–Q plots to assess visual distribution for continuous variables. Frequencies and percentages will be analysed for all categorical variables.

Matching injury variables from GUiNZ and ACC records

To commence addressing RQ2, the frequencies of injury versus no injury status, hospitalisation versus no hospitalisation, number of injury events and types of unintentional injuries from the maternal reported injury events in the GUiNZ data and from injury events in the ACC records will be compared using bivariate analysis. The injury matching categories for ‘no report’ will each be assigned a numerical value in preparation for assessment of the degree of agreement between the GUiNZ Study maternal reports of injury and the ACC records of injuries.

First, a test of symmetry will be undertaken to identify any existing direction of misclassification between the GUiNZ Study maternal reports of injury and the ACC records by reviewing the symmetry of the discrepant dichotomous injury classification (injury vs no injury). Results will be classified as discordant (under-reporting or over-reporting of ACC injury reports by mothers) or concordant, and their distribution will be explored across different maternal demographic characteristics. Assuming the ACC report is accurate, the validity of maternal reported child injury will be evaluated using sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV).45 Finally, the degree of reliability will be assessed using the prevalence and bias-adjusted kappa statistic (as kappa is commonly affected by the prevalence of an indicator and level of disagreement),46 which will reveal the percentage of agreement beyond chance between maternal reporting of childhood injuries in the GUiNZ study cohort and the records of injuries in the ACC database.
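The agreement measures above can all be computed from a single 2×2 table of maternal report against ACC record. The sketch below assumes the ACC record is the reference standard, uses the McNemar statistic (without continuity correction) as one common test of symmetry, and uses the two-category form of the prevalence and bias-adjusted kappa, PABAK = 2Po − 1, where Po is the observed agreement. The cell counts are invented for illustration.

```python
def agreement_stats(a, b, c, d):
    """Validity and agreement measures from a 2x2 table.

    a: mother reports injury, ACC records injury
    b: mother reports injury, ACC records none (over-reporting)
    c: mother reports none, ACC records injury (under-reporting)
    d: neither source records an injury
    """
    n = a + b + c + d
    observed_agreement = (a + d) / n
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "observed_agreement": observed_agreement,
        "pabak": 2 * observed_agreement - 1,
        # McNemar chi-square on the discordant cells; None if there are none.
        "mcnemar_chi2": (b - c) ** 2 / (b + c) if (b + c) > 0 else None,
    }


# Invented counts for illustration only.
stats = agreement_stats(a=40, b=10, c=20, d=130)
```

In this invented table more discordant pairs fall in cell c than in cell b, which is the pattern that would indicate under-reporting by mothers relative to ACC records.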

Cross-sectional analysis

To address RQ3, further analyses will be conducted to investigate differences between and within age subgroups (0–9 months, 9−24 months, 24−54 months). For binary outcomes such as ‘no injury vs injury’ and ‘hospitalisation vs no hospitalisation’, a binary regression model will be fitted. For the count data for the number of injury events, Poisson regression analysis will be conducted, assuming that the mean of a Poisson random variable is equal to its variance. As a preliminary model, all the variables within the child/individual characteristics, family social environment, home/household physical environment, community and cultural practices and national macrosocial factors and physical environment will be identified using bivariate regression analysis. Individual regression models will be fitted for each domain (see table 3), wherein all variables will be fitted concurrently, and models will be compared against a constant only model.
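The equal mean and variance assumption behind the Poisson model can be given a quick first check by comparing the sample variance of the injury counts with their mean. The sketch below is a crude, unconditional check on made-up counts; it does not replace dispersion diagnostics from the fitted regression model, and the function name is illustrative.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values well above 1 suggest overdispersion."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    return variance(counts) / m


# Made-up injury counts per child for illustration only.
counts = [0, 0, 1, 0, 2, 1, 0, 3, 0, 1]
ratio = dispersion_ratio(counts)
```

A ratio close to 1 is consistent with the Poisson assumption, whereas a markedly larger ratio would usually prompt consideration of an alternative count model.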

A correlation matrix of the continuous predictor variables will be examined before deciding whether to exclude one of two highly correlated variables from the multiple regression models or to combine them into one composite variable, as a way of addressing the potential impact of multicollinearity.47 It is important to make sure no variables are highly correlated as this might imply that both variables are assessing the same underlying construct. As a rule of thumb, a Pearson’s correlation coefficient of 0.7 or above will be interpreted as high.48 49 Additional multicollinearity diagnostics such as the Variance Inflation Factor will be obtained to assess collinearity among all independent variables that will be included in multiple regression models.50 Once the main effects of the selected variables are examined, a further interaction effect model will be employed to explore the two-way interaction terms between the variables that made a significant contribution and the other terms in the model. The statistical significance level will be set at p≤0.05.
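To make the 0.7 rule of thumb concrete, the sketch below computes Pearson’s correlation for a pair of predictors and the corresponding Variance Inflation Factor. The formula VIF = 1/(1 − r²) used here holds only for the special case of exactly two predictors; with more predictors, r² is replaced by the R² from regressing each predictor on all the others. The data values and function names are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(r):
    """Variance Inflation Factor when the model contains exactly two predictors."""
    return 1.0 / (1.0 - r ** 2)


def is_highly_correlated(r, cutoff=0.7):
    """Apply the rule-of-thumb cut-off to the absolute correlation."""
    return abs(r) >= cutoff


# Invented values for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)
flag = is_highly_correlated(r)
```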

Longitudinal analysis

To address RQ4, a temporal model will be developed through the use of multilevel modelling in order to examine relationships between child and family characteristics, physical household and neighbourhood environment and injury from birth to 5 years of age. Where there is sufficient power, the longitudinal analysis will be conducted for specific population subgroups (eg, by ethnicity). Generalised linear mixed effects models will be considered to identify the predictors of unintentional injury from birth to 5 years of age (for the repeated measures taken over time).51 This model will enable the variability within and between the cases to be explored. Analyses undertaken will account for relevant assumptions such as the normal distribution of the random effects, handling outliers, choosing appropriate link functions, plotting the outcome averages against the predicted values and estimating variance dispersion.52
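For orientation, one simple member of the generalised linear mixed effects family is a random-intercept logistic model for a binary injury outcome. The form below is an illustrative assumption rather than the model specified by the protocol: i indexes children, j indexes the repeated measurement periods, x_ij is the vector of child, family and environment covariates, and u_i is a child-level random intercept.

```latex
\operatorname{logit}\,\Pr\left(Y_{ij} = 1 \mid \mathbf{x}_{ij}, u_i\right)
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_i,
\qquad u_i \sim \mathcal{N}\left(0, \sigma_u^2\right)
```

Under this form, the random-intercept variance σ_u² captures variability between children, while the fixed effects β describe how covariates shift the log-odds of injury within each measurement period.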

Missing data analysis

In longitudinal cohort studies, missing data are expected at different stages of DCWs. The most common approach for dealing with missing data is complete case analysis using listwise deletion, which limits analysis to cases with complete data only and assumes that data are missing completely at random (MCAR).53 Given that the GUiNZ study has a large sample size, a complete case analysis will be considered for the outcome and independent variables. However, this might result in several issues such as decreased power due to a reduced sample size, loss of representativeness of the sample, biased estimates, incorrect standard errors with widened CIs and inaccurate inferences, depending on the mechanism and degree of missingness. Hence, deleting all cases with missing values is not always the best strategy in this study as it could remove a high number of cases from the cohort. Descriptive reporting of variables, their proportion of missing data and cross-tabulations of the individual variables with the outcome of interest will be used to indicate patterns of ‘missingness’ (MCAR, missing at random and not missing at random)54 within the data. After exploring the proportions and pattern of missing data, multiple imputation will be employed55 to address the issue of variables MCAR (using the mice procedure56 in R). Prior to imputation, Little’s MCAR test will be applied to confirm that the selected variables are MCAR.57 Where necessary, for categorical variables with an extensive amount of missingness, a missing category will be created.58 59
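The descriptive step described above, reporting the proportion of missing data and cross-tabulating missingness with the outcome, can be sketched as follows. This is an illustrative Python outline only (the protocol names R for imputation); it assumes missing values are stored as `None`, and the variable names and records are hypothetical.

```python
def missing_summary(rows, variables, outcome):
    """Per variable: proportion missing, and outcome rate among missing vs observed cases."""

    def outcome_rate(group):
        # Returns None when the group is empty, so no division by zero occurs.
        return sum(1 for r in group if r[outcome]) / len(group) if group else None

    summary = {}
    for var in variables:
        missing = [r for r in rows if r.get(var) is None]
        observed = [r for r in rows if r.get(var) is not None]
        summary[var] = {
            "prop_missing": len(missing) / len(rows),
            "outcome_rate_if_missing": outcome_rate(missing),
            "outcome_rate_if_observed": outcome_rate(observed),
        }
    return summary


# Hypothetical records for illustration only.
rows = [
    {"injured": True, "hardship": None, "house_quality": 3},
    {"injured": False, "hardship": 2, "house_quality": 4},
    {"injured": True, "hardship": 1, "house_quality": 2},
    {"injured": False, "hardship": None, "house_quality": 5},
]
summary = missing_summary(rows, ["hardship", "house_quality"], outcome="injured")
```

A clear difference between the outcome rate in cases with and without a missing value for a variable would be a descriptive signal that the data are unlikely to be MCAR for that variable.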


The outcome of this research is expected to add to the limited contemporary body of knowledge regarding the life-course determinants of child injury in NZ. The multifaceted nature of the injury burden among young children in NZ requires correspondingly complex, feasible, sustainable and culturally appropriate solutions. The longitudinal aspect of this study will identify the factors or clusters of risk factors that are inevitably influenced by the broader context within which a child lives and will offer population-relevant evidence to determine timely points for the delivery of effective interventions that are contextually relevant to NZ children and their broader environments. The methodological approach developed to identify factors associated with risk of injury among preschoolers serves as a model for future studies analysing injury-related routinely collected data linked to GUiNZ data as the cohort ages.

Ethics statements

Patient consent for publication

Ethics approval

Ethics approval for the GUiNZ study was granted by the NZ Ministry of Health Northern Y Regional Ethics committee (NTY/08/06/055) (www. growing All participants have provided written consent for enrolment into the GUiNZ study. Additional consent to access and link to routine health and education data from hospital and health centres, accident and medical centres, and education and formal childcare providers collected about their child during their preschool years has been obtained at various time points throughout the study. All access to GUiNZ data is strictly controlled by a Data Access Protocol (DAP) and a Data Access Committee (DAC) with responsibility for ensuring the privacy and confidentiality of all individual participants and the sustainability of the cohort. Approval to access GUiNZ data for this study was obtained from the DAC. Approval was also obtained from ACC to access injury data for children within the GUiNZ cohort until the age of 5 years.



  • Contributors LG is the lead author and guarantor of this manuscript. All authors (LG, BK, SM, AL) participated in the conception, planning, reviewing and approval of this protocol. All authors of this paper have read and approved the final version submitted.

  • Funding GUiNZ has been funded by the Ministry of Social Development, supported by the Ministries of Health and Education, as well as Oranga Tamariki; Te Puni Kōkiri; the Ministry of Justice; the Ministry of Business, Innovation and Employment; the Ministry for Pacific Peoples; the Ministry for Women; the Department of Corrections; the New Zealand Police; Sport New Zealand; the Office of the Health and Disability Commissioner; Office of the Children’s Commissioner; Housing New Zealand (now Ministry of Housing and Urban Development); the Office of Ethnic Communities; Statistics New Zealand; the Department of Prime Minister and Cabinet and the Treasury. GUiNZ acknowledges the ongoing support and advice provided by the University of Auckland and Auckland UniServices Limited, as well as the advisory and governance groups involved in the study, including the Steering Group; Policy Forum; Expert Scientific Advisory Group; Kaitiaki Group; Pasifika Advisory Group; and Data Access Committee.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.