Research Data Management

Selecting a Data Repository

Depositing your data in a repository preserves it beyond the life of a research project and makes sharing easier. A variety of repositories are suited to different needs. Your choice of repository may depend on the nature of your data, your discipline, and journal and funder requirements.

Some data repository options are listed below. Please consider contacting Nora Mulvaney, TMU’s Research Data Management Librarian, at nmulvaney@torontomu.ca for advice about selecting the most appropriate data repository for your research data.

Depositing in TMU Dataverse

Toronto Metropolitan University (TMU) Dataverse is part of Borealis, the Canadian Dataverse Repository. Borealis is a bilingual, multidisciplinary, secure Canadian research data repository supported by academic libraries and research institutions across Canada. Datasets in TMU Dataverse are assigned a digital object identifier (DOI), are widely discoverable, and receive monthly integrity checks which, combined with safe storage, protect against data loss and corruption.
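
As an illustration of what an integrity check involves, the sketch below compares a file's checksum against one recorded earlier. This is a general technique shown under assumptions: the source does not say how Borealis performs its checks, and the choice of SHA-256, the sample bytes, and the function names `sha256_of` and `is_intact` are all hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_checksum: str) -> bool:
    """True if the data still match the checksum recorded earlier."""
    return sha256_of(data) == expected_checksum


# Hypothetical file contents, used only for illustration.
original = b"participant_id,age_years\nP001,34\n"

# Checksum recorded at deposit time (assumed workflow).
recorded_checksum = sha256_of(original)

# A later check passes when the bytes are unchanged...
print(is_intact(original, recorded_checksum))

# ...and fails if even one byte differs.
corrupted = original.replace(b"34", b"35")
print(is_intact(corrupted, recorded_checksum))
```

You could apply the same idea to your own files before deposit by reading them in binary mode and recording the checksums alongside the data.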

For more information about depositing in TMU Dataverse, please contact Nora Mulvaney, Research Data Management Librarian: nmulvaney@torontomu.ca 
 

  1. Create an Account
    1. To get started, you’ll need to create a Borealis account. Click on “Sign Up” on the Borealis login page.
    2. Create your account with your TMU email address.
    3. Read and agree to the Borealis “General Terms of Use” and click on “Create Account”. Note: the Terms of Use contain a section on “Sensitive and Confidential Data”, which you should read carefully. 
  2. Log in to your Account
    1. Once you’ve created your account, you’ll be able to log in using your TMU email address as your username, along with your password. Note: TMU does not currently have single sign-on functionality, so you’ll need to click on “Institution not listed?”.
  3. Deposit a Dataset
    1. To deposit a dataset, click on “+ Add Data” and then “New Dataset”.
    2. Select a Dataset Template to choose a license for your dataset. More information about the Creative Commons licenses can be found here, or you can contact the RDM Librarian.
    3. Select the files you want to add to your dataset and click on “Save Dataset”. This saves the dataset as a draft; it is not yet published or publicly visible.
    4. You can edit elements of your dataset at any time using the Files, Metadata, and Terms (license and use permissions) tabs at the bottom of the page.
    5. Files can be restricted or embargoed so that they are available only on request. If you are interested in doing this, please contact the RDM Librarian.
  4. Submit Dataset for Review
    1. Once you're happy with the dataset, click on “Submit for Review”. The draft dataset will then be reviewed by the RDM Librarian, who may contact you with additional questions about the dataset or its metadata.
    2. Once the dataset has been published by the RDM Librarian, a DOI will be assigned to permanently identify it and you will receive an email. 

 

FAIR Data Principles

The FAIR principles are a framework for ensuring that data collected by researchers across all disciplines and fields meet specific standards that promote open science and enhance the reusability of data. They were first published in Scientific Data in 2016: "FAIR Guiding Principles for scientific data management and stewardship".

The following description of the FAIR principles is taken directly from https://www.go-fair.org/fair-principles/ 

Findability: The first step in (re)using data is to find them. Metadata (the description of the data) and data should be easy to find for both humans and computers. This means assigning a persistent identifier (PID) to the data/dataset (usually in the form of a digital object identifier, or DOI). Identifiers consist of an internet link (e.g., a URL that resolves to a web page where the data are located). Identifiers will help others to properly cite your work when reusing your data.
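
To make the link between an identifier and a URL concrete, here is a minimal sketch that turns a DOI into its resolver link through https://doi.org/. The function name `doi_to_url` and the DOI shown are hypothetical; the DOI is a placeholder and does not point to a real dataset.

```python
def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver link for a DOI string."""
    cleaned = doi.strip()

    # Accept common ways a DOI is written and reduce them to the bare identifier.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break

    # A DOI starts with "10." and has a prefix and suffix separated by "/".
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"Not a DOI: {doi!r}")

    return f"https://doi.org/{cleaned}"


# Placeholder DOI for illustration only.
print(doi_to_url("doi:10.5072/FK2/EXAMPLE"))
```

Citing the full resolver link rather than a repository page address means the citation keeps working even if the landing page moves.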

Accessibility: Once the user finds the required data, they need to know how the data can be accessed, possibly including authentication and authorisation. This does not necessarily mean that data should be open. There are many reasons to restrict access to data (e.g. the data contain personally identifiable information (PII), are proprietary/licensed as intellectual property (IP), or contain other sensitive information). Accessibility essentially means that it should be clear under what conditions access is allowed. The rule for accessibility can be distilled to: "As Open as Possible, as Closed as Necessary".

Interoperable: Interoperability refers to the ease with which data can be integrated with other/new data. In practice, storing data in open formats makes it easier to integrate new data later, whereas storing data in proprietary formats hinders this effort. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing. This means that, when possible, it's best practice to use standardized vocabularies/variable labels/terms.
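
As one possible illustration of an open format, the sketch below writes a small table as CSV text with consistent, descriptive variable names. The records and the column names (`participant_id`, `age_years`, `site_code`) are hypothetical and are not a required standard; your discipline may have its own vocabulary.

```python
import csv
import io

# Hypothetical records; the variable names are illustrative only.
records = [
    {"participant_id": "P001", "age_years": 34, "site_code": "TOR"},
    {"participant_id": "P002", "age_years": 29, "site_code": "OTT"},
]


def to_csv_text(rows):
    """Serialise a list of dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(records)
print(csv_text)
```

Plain-text CSV can be opened by most statistical packages and spreadsheet tools, which is the main reason it is often preferred over software-specific file types for deposit.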

Reusable: The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings. In practice, this involves creating a README file with details on how to clean, transform, or manage the data, if applicable. This also involves applying a license to let others know whether the data are in the public domain or whether copyright is retained in part or in full.
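
A README can be as simple as a short plain-text file. The sketch below assembles one from a few fields; the section headings, the function name `build_readme`, and all of the sample values are assumptions for illustration, not a template prescribed by TMU Dataverse or by the FAIR principles.

```python
def build_readme(title, description, processing_steps, license_name):
    """Assemble a plain-text README from a few descriptive fields."""
    lines = [
        title,
        "=" * len(title),
        "",
        "Description",
        description,
        "",
        "Processing steps",
    ]
    # Number each cleaning or transformation step so others can follow them in order.
    for number, step in enumerate(processing_steps, start=1):
        lines.append(f"{number}. {step}")
    lines += ["", "License", license_name]
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
readme = build_readme(
    title="Example survey dataset",
    description="Responses to a hypothetical survey, one row per participant.",
    processing_steps=[
        "Removed incomplete responses.",
        "Recoded age into years.",
    ],
    license_name="CC BY 4.0",
)
print(readme)
```

Saving the returned text as a file named README.txt and uploading it alongside the data files gives reusers the context and license information in one place.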