Research Guides: Research Data Management: FAIR and CARE Principles

FAIR Data Principles

The FAIR principles are a framework for ensuring that data collected by researchers in all disciplines meet specific standards that promote open science and enhance the reusability of data. They were first published in Scientific Data in 2016, in the article "The FAIR Guiding Principles for scientific data management and stewardship".

The following description of the FAIR principles is taken directly from https://www.go-fair.org/fair-principles/

Findability: The first step in (re)using data is to find them. Metadata (the description of the data) and data should be easy to find for both humans and computers. This means assigning a persistent identifier (PID) to the data or dataset, usually in the form of a digital object identifier (DOI). Identifiers typically take the form of an internet link (e.g., a URL that resolves to a web page where the data are located). Identifiers also help others to properly cite your work when reusing your data.
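As an illustration of how a DOI works as an internet link, the sketch below turns a DOI written in several common forms into a URL on the doi.org resolver. The function name doi_to_url and the list of accepted prefixes are assumptions made for this example, not part of any standard library; the DOI shown is that of the 2016 FAIR article.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"

# Prefixes people commonly put in front of a DOI (an assumed, non-exhaustive list).
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI given with or without a prefix."""
    cleaned = doi.strip()
    lowered = cleaned.lower()
    for prefix in KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # Very light sanity check: DOIs start with "10." and contain a slash.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"Not a recognisable DOI: {doi!r}")
    # Percent-encode unusual characters but keep the slash separator.
    return DOI_RESOLVER + quote(cleaned, safe="/")


print(doi_to_url("doi:10.1038/sdata.2016.18"))
```

Citing the resulting link rather than a repository's page address means the citation should keep working even if the hosting web page moves.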

Accessibility: Once the user finds the required data, they need to know how the data can be accessed, possibly including authentication and authorisation. This does not necessarily mean that data should be open. There are many reasons to restrict access to data (e.g., the data contain personally identifiable information (PII), are proprietary or licensed as intellectual property (IP), or contain other sensitive information). Accessibility essentially means that it should be clear under what conditions access is allowed. The rule for accessibility can be distilled to: "As Open as Possible, as Closed as Necessary".

Interoperable: Interoperability refers to the ease with which data can be integrated with other or new data. In practice, storing data in open formats makes it easier to integrate new data later, whereas storing data in proprietary formats hinders this effort. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing. This means that, when possible, it is best practice to use standardized vocabularies, variable labels, and terms.
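A minimal sketch of what this can look like in practice: local shorthand column names are renamed to agreed terms and the result is written as CSV, an open format. The mapping in LABEL_MAP and the sample row are hypothetical; a real project would take its terms from a vocabulary used in its own discipline.

```python
import csv
import io

# Hypothetical mapping from local shorthand labels to shared, descriptive terms.
LABEL_MAP = {
    "dt": "observation_date",
    "site": "site_id",
    "temp": "air_temperature_celsius",
}


def standardize_rows(rows, label_map):
    """Rename every column using label_map; fail loudly on unmapped labels."""
    standardized = []
    for row in rows:
        unknown = set(row) - set(label_map)
        if unknown:
            raise KeyError(f"No standard term for: {sorted(unknown)}")
        standardized.append({label_map[key]: value for key, value in row.items()})
    return standardized


def to_csv(rows):
    """Serialize a list of dictionaries as CSV text (an open, plain-text format)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


local_rows = [{"dt": "2024-01-05", "site": "A1", "temp": 3.2}]
print(to_csv(standardize_rows(local_rows, LABEL_MAP)))
```

Raising an error on an unmapped label, rather than passing it through, is a design choice that keeps undocumented variable names out of the shared file.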

Reusable: The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well described so that they can be replicated and/or combined in different settings. In practice, this involves creating a README file with details on how the data were cleaned, transformed, or managed, if applicable. It also involves applying a license to let others know whether the data are in the public domain or whether copyright is retained in part or in full.
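The sketch below shows one possible way to assemble such a README as plain text, with the processing steps and the license stated explicitly. The function build_readme, the section layout, and the sample dataset details are all assumptions for illustration; there is no single required README format.

```python
def build_readme(title, description, processing_steps, license_name):
    """Assemble a plain-text README describing a dataset."""
    lines = [title, "=" * len(title), "", description, ""]
    lines.append("Processing steps")
    if processing_steps:
        for number, step in enumerate(processing_steps, start=1):
            lines.append(f"{number}. {step}")
    else:
        lines.append("None; the data are provided as collected.")
    lines += ["", f"License: {license_name}"]
    return "\n".join(lines) + "\n"


# Hypothetical dataset details, used only to show the output shape.
readme_text = build_readme(
    title="Stream temperature survey (hypothetical)",
    description="Daily water temperature readings from three sites.",
    processing_steps=[
        "Removed readings flagged as sensor errors.",
        "Converted temperatures from Fahrenheit to Celsius.",
    ],
    license_name="CC0 1.0 Universal",
)
print(readme_text)
```

Saving the returned text as README.txt alongside the data files keeps the description and license together with the data they describe.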

CARE Principles for Indigenous Data Governance

The CARE Principles for Indigenous Data Governance are people- and purpose-oriented, reflecting the crucial role of data in advancing Indigenous innovation and self-determination. These principles complement the existing FAIR principles (www.go-fair.org), encouraging open and other data movements to consider both people and purpose in their advocacy and pursuits.

The following description of the CARE principles is taken directly from: https://www.gida-global.org/care

Collective Benefit: Data ecosystems shall be designed and function in ways that enable Indigenous Peoples to derive benefit from the data.

Authority to Control: Indigenous Peoples’ rights and interests in Indigenous data must be recognised and their authority to control such data be empowered. Indigenous data governance enables Indigenous Peoples and governing bodies to determine how Indigenous Peoples, as well as Indigenous lands, territories, resources, knowledges and geographical indicators, are represented and identified within data.

Responsibility: Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous Peoples’ self-determination and collective benefit. Accountability requires meaningful and openly available evidence of these efforts and the benefits accruing to Indigenous Peoples.

Ethics: Indigenous Peoples’ rights and wellbeing should be the primary concern at all stages of the data life cycle and across the data ecosystem.