How to request datasets from dbGaP and other federal repositories
Federal Data Repositories
There are standards that must be in place in order to use controlled-access data from a federal repository. You may need to seek an IT environment that meets these standards before accessing the data. You can (91爆料 NetID required).
For NIH Controlled-Access Data Repositories, review the:
- Required Security and Operational Standard for .
- List of .
dbGaP Overview
The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. Such studies include genome-wide association studies, medical sequencing, molecular diagnostic assays, as well as association between genotype and non-clinical traits. The advent of high-throughput, cost-effective methods for genotyping and sequencing has provided powerful tools that allow for the generation of the massive amount of genotypic data required to make these analyses possible.
dbGaP provides two levels of access, open and controlled, to allow broad release of non-sensitive data while providing oversight and investigator accountability for sensitive datasets involving personal health information.
Before you begin a Data Access Request
Review the following questions and guidance.
- Authorized users must be employees of the 91爆料. If your team includes users at other organizations, they need to request access through their own institution.
- For dbGaP, you will need appropriate system credentials in eRA Commons. If you need access, review Commons Roles at the 91爆料.
- Review .
- Review 91爆料IT guidance on .
- Identify whether you will use an existing 91爆料 environment, set up a new secure environment through 91爆料, or use a third-party environment with 91爆料IT approval. In some cases, it may be possible to use an NIH-hosted environment (e.g., AnVIL or BioData Catalyst).
- Consult with the authorized IT Director for the IT environment you plan to use. If you will be using an NIH-hosted environment, please reach out to your department’s IT administrator.
If you are working with :
- Complete the required training on .
- Save the completion certificate to include with your SAGE request.
- An authorized IT Director for the IT environment being used must provide confirmation that the IT environment meets .
- You will need an Assurance, signed by the Approved User, that the NIH Security Best Practices can be met.
Start your Data Access Request (DAR)
After reviewing the previous guidance, follow these steps to begin your DAR.
- Choose datasets you wish to access.
- Some datasets require IRB approval. See the Human Subjects Division guidance on obtaining IRB approval.
- Select the Signing Official: Select the authorized official.
- Your OSP reviewer will update the Signing Official to themselves after they receive the accompanying SAGE request. See steps to Prepare your Request in SAGE to OSP.
- In the DAR, list the authorized IT Director who has firsthand knowledge of the IT environment you intend to use. This is the same person who signs the IT Director Confirmation.
- If using a Cloud Computing IT Environment (91爆料 Government Community Cloud or 91爆料 GCC), upload the 91爆料 Cloud Computing IT Environment Statement into the DAR.
- Read the attestation language.
- Add other necessary attachments required by NIH, such as IRB Approval.
- Read and agree to the terms and conditions as the “Approved User”:
- Investigators and their institutions are responsible for safeguarding the accessed datasets. Pay close attention to the Data Use Certification (DUC) being made by you as an Approved User.
- Review and approve the Data Access request so it begins routing to the Signing Official.
- Download a copy of the DAR, then proceed with next steps to prepare your SAGE request to OSP.
Prepare your SAGE Request to OSP
The type of SAGE request depends on whether your DAR is associated with an existing sponsored program. If it is associated with a sponsored program, route an OSP & GCA Modification Request (MOD) in SAGE.
If it is not associated with a sponsored program, route a Non-award Agreement (NAA) eGC1.
- Prepare and route the request in SAGE
- For a MOD, select the “Federal Repository Data Access and Submission” subcategory.
- For an eGC1, select the “Non-Award Agreement (new)” application type if you are requesting access to new data, or “Non-Award Agreement (continuation)” if it is a renewal request for data you have already been using.
- Attach the following to your SAGE request: .
- Copy of the DAR.
- A copy of a signed IT Director confirmation. This is the same person who is named as IT Director in your DAR.
- Assurance signed by Approved User that the NIH Security Best Practices can be met.
- If the dataset you wish to access requires IRB approval, a copy of the IRB approval.
- If applicable, copy of the completion certificate for the required training on .
- OSP will review the Award Modification Request (MOD) or NAA eGC1 together with the DAR in eRA Commons.
- Check status on the “My Requests” page in eRA Commons.
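The choice of SAGE request type and the attachment list above can be summarized as a small decision rule. The Python sketch below is illustrative only: the `DataAccessRequest` class, its field names, and both functions are hypothetical and are not part of SAGE, eRA Commons, or any NIH system. It simply restates the steps above in checklist form.

```python
from dataclasses import dataclass


@dataclass
class DataAccessRequest:
    """Hypothetical summary of a DAR; field names are illustrative only."""
    has_sponsored_program: bool  # associated with an existing sponsored program?
    is_renewal: bool             # renewing access to data already in use?
    requires_irb: bool           # does the dataset require IRB approval?
    training_required: bool      # does the required-training certificate apply?


def sage_request_type(dar: DataAccessRequest) -> str:
    """Return the SAGE request type described in the steps above."""
    if dar.has_sponsored_program:
        # Sponsored program: route a MOD with the federal repository subcategory.
        return "MOD: Federal Repository Data Access and Submission"
    if dar.is_renewal:
        return "eGC1: Non-Award Agreement (continuation)"
    return "eGC1: Non-Award Agreement (new)"


def required_attachments(dar: DataAccessRequest) -> list[str]:
    """List the attachments the steps above call for."""
    attachments = [
        "Copy of the DAR",
        "Signed IT Director confirmation",
        "Assurance signed by the Approved User",
    ]
    if dar.requires_irb:
        attachments.append("Copy of the IRB approval")
    if dar.training_required:
        attachments.append("Training completion certificate")
    return attachments


# Example: a renewal with no sponsored program, for a dataset that requires IRB approval.
example = DataAccessRequest(
    has_sponsored_program=False,
    is_renewal=True,
    requires_irb=True,
    training_required=False,
)
print(sage_request_type(example))
for item in required_attachments(example):
    print("-", item)
```

Treat this as a personal pre-submission checklist; the SAGE forms and your OSP reviewer remain the authority on what a given request requires.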
Signing Official (OSP) Review
- DAR is complete.
- An authorized IT Director is identified.
- A signed confirmation statement from the IT Director is attached in SAGE to the NAA eGC1 or Award Modification.
- An Assurance statement signed by the Approved User is attached to the SAGE item.
- If the IT environment used is “GCC High”, the PI has uploaded the 91爆料 Cloud Computing Statement in the DAR.
- IRB approval, if needed, is attached to the DAR, and corresponds to the study in question.