
Managing Research Data @MQ

Guidance on Management of Research Data at Macquarie University

Introduction

Certain key characteristics of your data must be declared in the Data Management Plan (DMP), including its governance and source, level of sensitivity, and size.

Data Governance

Data governance must be clearly stated and understood prior to each project. Data Governance includes the policies, processes, contracts, roles, and standards that apply to your data to ensure its security, accountability, and appropriate use. These details might differ between projects, depending on the source of the data.

 

Source of Data

It is important to know what you can do with data used in your research.

Data obtained from a third party may be subject to copyright and other licence terms and conditions for its use and any subsequent communication or reuse. Examples of third-party data could include data obtained from external databases or collections, via contractual arrangements, collaborative research agreements, a data linkage, or from a government source.

Text-based, image, video and multimedia materials used as data sources, including for text and data mining, are subject to copyright law and conditions of use. This includes material available through Macquarie University Library. Access to any material available through Macquarie University Library has been negotiated by the University for a fee under enforceable licence terms.

This table outlines some key governance considerations depending on the source of data:

Data Source or Use: Governance Details
For any data collected, created or collated as part of your research:
  • Specify roles and responsibilities, such as project participants and data custodian (the custodian serves as the first point of contact and is responsible for execution of the DMP).
  • Define the intended sharing and use of active and published data.
  • Declare how data will be licensed and any other terms governing re-use.
For data sourced from external databases, restricted access databases or via contractual arrangements:
  • Specify the location of the original data.
  • State any applicable licences or conditions of use.
  • Supply other key information about the database from which the data would be sourced.
  • Attach contracts, agreements, or other documents.
For data produced by or being used in a collaborative research project with external researchers:
  • Enter into a Data Access Agreement or Collaborative Research Agreement (or equivalent) with all parties involved.
  • This agreement should detail the data and the arrangements for custodianship, storage, retention, access, licensing, use of the data, and the right to produce research outputs based upon the data (refer to the Macquarie University Collaborative Research Standard).

Data Sensitivity

Research data may contain information of a personal or sensitive nature which needs to be protected against unwarranted disclosure. Breaches of sensitive data have a detrimental impact on individuals and organisations. Mitigating that risk by applying appropriate security practices and access conditions is crucial.

Classifying data according to its sensitivity can help you decide:

  • WHERE the data should be stored
  • WHO can access the data
  • HOW the data should be archived or disposed

 

Access to sensitive data must be safeguarded with appropriate data security practices. Protection of sensitive data may be required:

  • for legal or ethical compliance,
  • to ensure personal privacy and welfare,
  • to protect vulnerable environmental or cultural heritage, or
  • on account of Intellectual Property considerations.

 

At Macquarie University, research data can be grouped into three categories depending upon the sensitivity of its information. The categories are: General, Sensitive, and Highly Sensitive.

You will need to classify your data into one of these categories and record this in your DMP.

 

Assessing the sensitivity of data

This section provides examples of data that may be classified as highly sensitive, sensitive or general. A full list of indicators used to classify the data is found in the Research Data Sensitivity, Security and Storage Guideline.

Data is generally considered either sensitive or highly sensitive if it contains identifiable or re-identifiable personal information, but cultural, environmental or proprietary considerations can also make data sensitive. Assessing the type of ‘sensitive information’ contained in the data and its ‘identifiability’ will help with its classification.

 

MQ Research Data Classification Examples

General

  • Published or otherwise publicly available data
  • Aggregated or anonymous human subject data
  • Aggregated or derivative environmental or cultural heritage data that obscures locations
  • Data considered ‘general intellectual property’
  • Unpublished research information not covered by conditions making it more sensitive (e.g., not related to human subjects, not containing sensitive environmental or cultural information, and lacking IP constraints)
Sensitive

  • Culturally sensitive data
  • Environmentally sensitive data
  • Data with explicit IP constraints
  • De-identified research data relating to individuals that cannot plausibly be re-identified in combination with other publicly available data
  • Data relating to individuals that does not contain any of the sensitivity indicators listed below that would make it ‘highly sensitive’
Highly Sensitive
  • Identifiable or re-identifiable data containing any of the following characteristics:
    • Racial or ethnic origin
    • Political opinions
    • Religious beliefs or affiliations
    • Philosophical beliefs
    • Criminal record
    • Health information about an individual
    • Genetic information
    • Biometric information
    • Financial information
  • Data containing information that is subject to regulatory controls and is deemed highly sensitive by a Data Steward or by a relevant Research Management Committee (for example, data relating to controlled technology per the Defence Trade Controls Act 2012, or information which poses a risk to national security).
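
The examples above can be read as a small set of decision rules. The following Python sketch is one possible way to express them, not an official Macquarie University tool: the `DatasetProfile` fields, the indicator identifiers, and the `classify` function are all hypothetical names introduced for illustration, and the sketch covers only the examples listed here, not the full list of indicators in the Guideline.

```python
from dataclasses import dataclass

# Identifiers paraphrase the "Highly Sensitive" characteristics listed above.
# The names themselves are hypothetical.
HIGHLY_SENSITIVE_INDICATORS = frozenset({
    "racial_or_ethnic_origin",
    "political_opinions",
    "religious_beliefs_or_affiliations",
    "philosophical_beliefs",
    "criminal_record",
    "health_information",
    "genetic_information",
    "biometric_information",
    "financial_information",
})


@dataclass
class DatasetProfile:
    """Hypothetical description of a dataset, filled in by the researcher."""
    identifiable: bool = False        # individuals are identifiable or re-identifiable
    individual_level: bool = False    # relates to individuals (not aggregated or anonymous)
    indicators: frozenset = frozenset()
    deemed_highly_sensitive: bool = False  # regulatory controls, as decided by a Data Steward or committee
    culturally_sensitive: bool = False
    environmentally_sensitive: bool = False
    explicit_ip_constraints: bool = False


def classify(profile: DatasetProfile) -> str:
    """Return 'General', 'Sensitive' or 'Highly Sensitive' for the examples above."""
    if profile.deemed_highly_sensitive:
        return "Highly Sensitive"
    if profile.identifiable and profile.indicators & HIGHLY_SENSITIVE_INDICATORS:
        return "Highly Sensitive"
    if (
        profile.identifiable
        or profile.individual_level
        or profile.culturally_sensitive
        or profile.environmentally_sensitive
        or profile.explicit_ip_constraints
    ):
        return "Sensitive"
    return "General"
```

For instance, `classify(DatasetProfile(identifiable=True, indicators=frozenset({"health_information"})))` returns `"Highly Sensitive"`, while a profile with every field left at its default returns `"General"`. A result from a sketch like this is a starting point for discussion and does not replace the Guideline.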

Human Research Ethics and Data

If research is to be conducted on people, their data, or biological samples they provide, then approval must be sought from a Human Research Ethics Committee (HREC). Data management for all projects encompassing human research must meet the terms of the National Statement on Ethical Conduct in Human Research (refer in particular to Section 3: Element 4 Collection, Use and Management of Data and Information pages 32-38).

When applying to the HREC for approval, you are expected to provide a detailed Data Management Plan.

Consent

Consent must be explicitly obtained from human research participants regarding the use of any personal information that might be obtained from them. In addition to lodging a Data Management Plan you must be explicit in your participant information sheet and in your consent forms about data availability, data access, and data use.

Sensitive data can and should be disseminated to the extent possible, provided that appropriate safeguards are in place, all necessary declarations are communicated to potential participants, and their consent is obtained.

You must explain to potential participants what measures will be used to safeguard data privacy and confidentiality. For instance, you must provide details on:

  • The degree to which participants’ data will be identifiable or re-identifiable.
  • How data will be used or disseminated for re-use.
  • How access to data will be controlled or mediated.

Researchers are expected to ensure that research data is made available for sharing and re-use whenever feasible (refer to Data Dissemination). Asking participants for extended or unspecified consent will increase the value and impact of your project. With longer-term data availability and reuse in mind, researchers need to plan for data archiving and dissemination, as well as consider conditions governing access and use of data.

“As open as possible, as closed as necessary”.

Data does not have to be openly accessible to be shared for the benefit of future researchers and other interested groups. In many cases, mediated or restricted access could be appropriate. Even if your data is confidential, you may be able to share it under certain circumstances, for example:

  • if the re-users must register an application with a data custodian
  • if the re-users make an application to an ethics committee or project board, or
  • if a formal re-use agreement protecting participant confidentiality is in place.

 

Disseminating Sensitive Data

The sensitive nature of some research information may result in research datasets that would pose a risk to participants, the environment or cultural resources if the dataset were made publicly available.

In these cases, before dissemination, the dataset can be modified to remove information that might lead to the unwanted identification of the participants, places or resources. This process is often called de-identification. By removing identifying elements, a researcher can still reap the benefits of making their data available, while respecting the privacy of their research subjects, protecting biodiversity or safeguarding culturally sensitive information.
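
As a minimal illustration of removing identifying elements, the Python sketch below drops named fields from a list of records. The function name and the field names in the usage note are hypothetical, and dropping fields alone does not guarantee that records cannot be re-identified from the values that remain.

```python
def remove_direct_identifiers(records, identifying_fields):
    """Return copies of the records without the listed identifying fields.

    `records` is a list of dicts; `identifying_fields` is a collection of
    keys chosen by the researcher (an assumption of this sketch).
    The original records are left unchanged.
    """
    blocked = set(identifying_fields)
    return [
        {key: value for key, value in record.items() if key not in blocked}
        for record in records
    ]
```

For example, calling `remove_direct_identifiers(rows, ["name", "email"])` on survey rows would keep fields such as an age band or a response score and omit the two named columns.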

 

The danger with "de-identification"

Even with identifying elements removed, someone with access to other datasets (now or in the future) might be able to “re-identify” the dataset. Therefore, you need to take great care when preparing sensitive data for publication or dissemination. Read A visual guide to practical data de-identification for more information:

“...de-identification techniques unlock value by enabling important public and private research, allowing for the maintenance and use – and, in certain cases, sharing and publication – of valuable information, while mitigating privacy risk.”

 

Working with Sensitive Data Resources

Retention or Archival Requirements

Data retention or disposal is determined by the NSW State Records Act 1998. Under this Act, State Archives and Records and the General Retention and Disposal Authority (currently GA47) govern the retention of records of research.

 

Data Retention

Permanent retention (archiving) should be considered the default because the significance of research data is not always immediately apparent. If data is not kept permanently, both the reason and procedure for its disposal should be stated and the minimum retention periods outlined below must apply.

The minimum retention period prior to disposal for research data that is not required to be kept as a State archive depends on the type of research being undertaken. Disposal of any research data should be considered carefully and fully justified.

If a research project involves multiple institutions, an agreement should state the retention or disposal requirements at the outset of the project. Collaborators may have additional data retention requirements.

The minimum retention periods for data and datasets created as part of research activities at Macquarie University are as follows:

Research Description and Retention Period

Research data of regulatory or community significance, or research where the dataset:
  • would be part of genetic research, including gene therapy
  • is controversial or of high public interest, or has influence in the research domain
  • is costly or impossible to reproduce or substitute (i.e. with an alternative data set of acceptable quality and usability) if the primary data is not available
  • relates to the use of an innovative technique for the first time.
Note: Because the significance of research data is not always immediately apparent, this is the default for Macquarie University Research Data.
Retention period: Retain permanently.

Research data not meeting the above criteria, but created from clinical trials, or from research with potential long-term effects on humans, as part of research activities within the institution, and not of regulatory or community significance. Includes animal testing for human products.
Retention period: Retain for a minimum of 15 years after completion of the research activity, or until the subject reaches or would have reached the age of 25 years, whichever is longer, then destroy.

Other research data not meeting the above criteria, where disposal is justified.
Retention period: Retain for a minimum of 5 years after project completion or publication.
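
The minimum periods above involve some date arithmetic, particularly the "whichever is longer" rule. The Python sketch below shows one way to work out the earliest date on which disposal could be considered. The category labels, function names, and the use of the youngest subject's date of birth for a multi-subject dataset are assumptions made for illustration; the governing text remains GA47.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Add whole years to a date, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def earliest_disposal_date(
    category: str,
    completed: date,
    youngest_subject_birth: Optional[date] = None,
) -> Optional[date]:
    """Return the earliest disposal date, or None for permanent retention.

    `category` is one of three hypothetical labels:
      "permanent"             - retain permanently (the default)
      "clinical_or_long_term" - 15 years after completion, or until the
                                subject would reach 25, whichever is longer
      "other"                 - 5 years after completion or publication
    `completed` is the completion (or publication) date chosen by the caller.
    """
    if category == "permanent":
        return None
    if category == "clinical_or_long_term":
        candidate = add_years(completed, 15)
        if youngest_subject_birth is not None:
            # Assumption: for several subjects, the youngest sets the limit.
            candidate = max(candidate, add_years(youngest_subject_birth, 25))
        return candidate
    if category == "other":
        return add_years(completed, 5)
    raise ValueError(f"unknown retention category: {category!r}")
```

For a clinical study completed on 30 June 2020 whose youngest participant was born on 15 January 2018, the 15-year rule gives 30 June 2035, but the age-25 rule gives 15 January 2043, so the later date applies. Reaching that date does not by itself justify disposal, which must still be explained in the DMP.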

 

Disposal of Data

Since the significance of research is not always immediately apparent, any disposal of research data must be justified by the principal investigator or designated data custodian and outlined in the data management plan. If data is not retained permanently, the reason for its disposal should be explained, and its disposal protocol should be articulated. 

When disposal is justified, data, information and biospecimens used in research should be disposed of in a manner that is safe and secure, consistent with the consent obtained, following the National Statement on Ethical Conduct in Human Research, any legal requirements and as appropriate for the design of the research.