This report on the investigation into the release of myki data demonstrates that deficiencies in data governance and risk management can undermine the protection of privacy, even where the project is well-intentioned.
In 2018, Public Transport Victoria (PTV) released a dataset containing 1.8 billion historical records of public transport users’ activity (the dataset) to a group known as Data Science Melbourne for use in the ‘Melbourne Datathon’. The dataset contained the records of ‘touch on’ and ‘touch off’ activity of 15.1 million ‘myki’ cards used over a three-year period up to June 2018.
The Office of the Victorian Information Commissioner (OVIC) was notified, by PTV and by a group of academics, of privacy concerns relating to the release of the dataset. As a result, the Deputy Commissioner decided to initiate a formal investigation under section 8C(2)(e) of the Privacy and Data Protection Act 2014 (PDP Act), to determine whether she should issue a compliance notice to PTV.
During the preliminary stages of OVIC’s investigation, the Deputy Commissioner engaged data science experts from Data61, a division of the Commonwealth Scientific and Industrial Research Organisation (CSIRO), to further examine the dataset. Data61 concluded that the detailed nature of the information in the dataset created a high risk that some individuals could be re-identified by linking the dataset with other information sources.
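The linkage risk described above can be illustrated with a minimal sketch. It is not Data61's method and does not use the released dataset: the records, card identifiers, stop names and the `cards_matching` helper are all hypothetical, made up for illustration. It shows only the general idea that each additional event known from an outside source narrows the set of cards that could belong to a person.

```python
from collections import defaultdict

# Synthetic, made-up records: (card_id, stop, time_slot). Not real myki data.
RECORDS = [
    ("card_A", "Stop 1", "Mon 08:10"),
    ("card_A", "Stop 2", "Mon 17:40"),
    ("card_B", "Stop 1", "Mon 08:10"),
    ("card_B", "Stop 3", "Mon 18:05"),
    ("card_C", "Stop 4", "Tue 09:00"),
]


def cards_matching(records, known_events):
    """Return the card ids whose history contains every known (stop, time) event."""
    events_by_card = defaultdict(set)
    for card_id, stop, time_slot in records:
        events_by_card[card_id].add((stop, time_slot))
    wanted = set(known_events)
    return sorted(card for card, events in events_by_card.items() if wanted <= events)


# One event known from an outside source (hypothetical) still leaves two candidate cards.
one_event = cards_matching(RECORDS, [("Stop 1", "Mon 08:10")])

# A second known event leaves a single card, exposing that card's full travel history.
two_events = cards_matching(
    RECORDS, [("Stop 1", "Mon 08:10"), ("Stop 2", "Mon 17:40")]
)

print(one_event)
print(two_events)
```

In this toy example, removing names and card numbers does not prevent a match, because the travel pattern itself acts as the identifier.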
The Deputy Commissioner found flaws in the process PTV followed in de-identifying the dataset, assessing the risk of re-identification and deciding to provide the dataset for use in the Datathon.
This report makes recommendations to PTV and the Victorian public sector more generally. OVIC also considers it could have provided better regulatory guidance. The recommendations cover:
- Policies and procedures for data release decisions
- The Department of Transport’s rollout of the data governance program initiated by Public Transport Victoria
- Data capability across the Victorian public sector
- Processes to support data release decisions
- Improved Privacy Impact Assessment guidance