Developing Data Protection Plans

Definitions:

Sensitive data

Information that, if inadvertently released, could place research participants at risk of harm. Harms could be to participants’ relationships, status, employability, or insurability. Participants could face criminal or civil prosecution and, in some cases, physical harm. The assessment of risk must take into account the participants’ culture, age, life experience, and any other relevant characteristics.

Individually identifiable data

Data that contain direct identifiers, such as participants’ names and email addresses, or indirect identifiers (“demographic data”), which are a combination of characteristics about participants that would allow others to deduce their identities.

Examples of indirect identifiers:

  • Position, gender, and length of service in a named company
  • Age, gender, major, ethnicity, and year in school
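One way to see why a combination of characteristics can identify someone is to count how many records share each combination. The following is a minimal sketch, not an IRB or ITSO procedure: the records, field names, and the helper functions `smallest_group_size` and `unique_rows` are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical records containing no names, only indirect identifiers.
records = [
    {"gender": "F", "major": "Physics", "year": "Sophomore"},
    {"gender": "F", "major": "Physics", "year": "Sophomore"},
    {"gender": "M", "major": "History", "year": "Junior"},
    {"gender": "M", "major": "Physics", "year": "Junior"},
    {"gender": "F", "major": "History", "year": "Sophomore"},
]


def group_sizes(rows, fields):
    """Count how many rows share each combination of values for `fields`."""
    return Counter(tuple(row[f] for f in fields) for row in rows)


def smallest_group_size(rows, fields):
    """Return the size of the smallest group; 1 means someone is unique."""
    return min(group_sizes(rows, fields).values())


def unique_rows(rows, fields):
    """Return the rows whose combination of `fields` appears exactly once."""
    sizes = group_sizes(rows, fields)
    return [row for row in rows if sizes[tuple(row[f] for f in fields)] == 1]


# A single characteristic rarely singles anyone out...
print(smallest_group_size(records, ["gender"]))
# ...but a combination of characteristics can.
print(len(unique_rows(records, ["gender", "major"])))
```

In this made-up data set, no participant is unique on gender alone, but three of the five become unique once gender and major are combined.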

Data Classification and protection plans

Consult the IRB about the classification of your data before or during the development of your research protocol. See also the Duke Data Classification Standards.

The online SecureIt tool was designed to help researchers identify approved Duke services they can use to collect, store, and analyze research data.

The Information Technical Security Office (ITSO) will review all data protection plans and will inform researchers and the IRB if any changes need to be made to the data protection procedures described in the protocol. ITSO can be contacted directly at security@duke.edu.

NOTE: The University has determined that any research study that collects or uses direct identifiers (e.g., names) or indirect identifiers (that is, demographic data) about Duke students may meet the "sensitive" data classification. If the data you intend to collect are considered “sensitive,” they must be protected to mitigate institutional risk.

Best Practices:

Sensitive, individually identifiable data must be protected during all phases of a research project. The following are some best practices from the Campus IRB and ITSO to prevent an inadvertent breach of individually identifiable, sensitive data during the collection, transfer, storage, analysis, and reporting phases of your project.

Collection

Data collection using online services should be conducted using a secure platform, such as Qualtrics. If carrying out virtual interviews or focus groups (including recording), researchers must use their Duke-sponsored Zoom accounts.

When using personal devices during data collection, researchers should apply the following best practices:

Laptops and tablets:
  • Must be encrypted; have regular software updates enabled; have anti-virus software (Duke supports Symantec), a password-protected screensaver, and remote wipe and Prey software for anti-theft protection; and meet the other minimum security standards for endpoints
  • Recommendation: local IT support should help with guidance on management of laptops involved in research
Mobile phones:
  • Must be encrypted, be passcode-protected or have fingerprint recognition enabled, and have regular software updates enabled
  • Recommendation: use “Find my iPhone,” remote wipe, and Prey software for anti-theft protection

OIT and OR&I have prepared a more complete list of best practices for using personal devices in research.

If limited resources make it necessary to use pen and paper to collect sensitive data in the field, paper documents should be identified using a unique ID number, not the participants’ names or any other direct identifiers. The key linking direct identifiers to unique ID numbers may be taken into the field on an encrypted device.
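The unique-ID approach can be sketched in a few lines. This is an illustrative example only, under several assumptions: the participant names, the `P` prefix, the ID length, and the helper `assign_study_ids` are hypothetical, and the sketch does not itself encrypt anything. The linking key it produces would still need to be kept on an encrypted device, separate from the field records.

```python
import secrets


def assign_study_ids(names, prefix="P"):
    """Map each participant name to a random, non-sequential study ID."""
    key = {}
    used = set()
    for name in names:
        study_id = prefix + secrets.token_hex(4).upper()
        while study_id in used:  # regenerate on the (unlikely) collision
            study_id = prefix + secrets.token_hex(4).upper()
        used.add(study_id)
        key[name] = study_id
    return key


# Hypothetical participant names, for illustration only.
participants = ["Alex Example", "Sam Sample"]

# The linking key (name -> study ID) is the sensitive piece; it is kept
# apart from the data collected in the field.
linking_key = assign_study_ids(participants)

# Field records carry only the study ID, never a direct identifier.
field_records = [
    {"study_id": linking_key[name], "response": None} for name in participants
]
```

Random IDs are used here rather than sequential numbers so that an ID reveals nothing about enrollment order; that is a design choice of this sketch, not a requirement stated in the guidance.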

Transfer

All sensitive, identifiable data collected in the field should be transferred as soon as possible to a secure environment at Duke (e.g., the Protected Network (“PN”) for Research or Duke Box).

Files containing sensitive, individually identifiable data should never be sent as email attachments. Instead, first upload data to Duke Box and then download it to a secure environment (e.g., PN for Research) where the data may be analyzed.

Any file transfer protocols must use encrypted channels, such as secure file transfer protocol (SFTP).

If your research involves the analysis of existing data and requires a Data Use Agreement (DUA), please append it to your IRB protocol. The IRB will route your protocol and the DUA to the Office of Research Support (ORS; for contractual review) and the Information Technical Security Office (ITSO; for data security review). The Protected Research Support & Compliance unit at the Office of Research & Innovation (OR&I) can assist if needed. You can request a consultation or contact OR&I directly at researchdatasupport@duke.edu.

Some data providers require a DUA as a condition for accessing the data, even when the data are not individually identifiable. In such cases, the data are considered sensitive.

Compliance with data security and DUA requirements is the responsibility of the principal investigator as well as Duke personnel working under such an agreement.

Storage and Analyses

Protected Network (“PN”) for Research. Choosing to use the Duke University Protected Network (“PN”) for Research will expedite the ITSO review. The PN for Research is the recommended secure environment for both data storage and analyses.  The PN for Research:

  • Offers a free tier level that will suffice for most studies
  • Can be used with properly secured Duke-managed or personal computers via a web browser (e.g., Safari or Google Chrome), allowing easy inclusion of students who may not have a Duke-owned device
  • Does not require a VPN (Virtual Private Network)
  • Offers numerous options to easily install analytic software packages (e.g., Stata, RStudio, SAS, and NVivo)
  • May be used in parallel with: Duke's Microsoft OneDrive, Duke's Qualtrics, Duke's Zoom, and Duke's Box
  • Is integrated with Duke’s Box to allow for self-service export of results
  • Is the service for which Duke is working towards adherence to the NIST 800-171 Standard; other Duke services do not offer these specific controls, which allow for compliance with regulated/contracted data
  • Provides controls for safeguarding the de-identification of sensitive data and for appropriately storing associated keys

Duke-managed machine. A Duke-managed machine is recommended only if a more secure option (e.g., the PN for Research) is not available, as Duke-managed laptops may be lost or stolen and Duke-managed desktops may not be appropriately secure for use with sensitive data.

Data provider enclave

Duke's Microsoft OneDrive

Duke's Qualtrics

Duke's Zoom

Duke's Box is recommended for collecting and storing data rather than for ongoing analysis of data.

NOTE: It is not recommended to use Duke’s Box with Box Drive for data analysis (e.g., using statistical software like SAS or RStudio) as sensitive data will no longer be maintained in the secure cloud environment. See Using Duke Box with Sensitive Research Data: Researchers and IT Staff for more information.

Reporting

To reduce the risk of an inadvertent or intentional re-identification of a research participant, the following strategies may be used when reporting your findings:

  • Report data only in aggregate, with cells of sufficient size to prevent indirect identification
  • Depict identifiers in general terms, for example, age or income ranges
  • Use pseudonyms rather than names
  • Use broad group identifiers, such as “tradesperson” rather than “carpenter”
  • Create misleading or vague identifiers, for example, say that the research took place in a midsize city in Western Africa rather than identify the specific city
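The first two strategies (aggregate reporting with a minimum cell size, and general ranges instead of exact values) can be illustrated with a short sketch. The threshold of 5, the ten-year band width, the sample ages, and the helper names `age_range` and `aggregate_with_suppression` are all assumptions made for this example; the guidance above does not specify a cell size, so confirm an appropriate threshold for your study.

```python
from collections import Counter

MIN_CELL_SIZE = 5  # hypothetical threshold; not specified in the guidance


def age_range(age, width=10):
    """Replace an exact age with a range such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregate_with_suppression(ages, min_cell=MIN_CELL_SIZE):
    """Count participants per age range, masking cells below `min_cell`."""
    counts = Counter(age_range(a) for a in ages)
    return {
        band: (n if n >= min_cell else f"<{min_cell}")
        for band, n in sorted(counts.items())
    }


# Hypothetical ages, for illustration only.
ages = [21, 22, 23, 24, 25, 27, 31, 33, 58]
report = aggregate_with_suppression(ages)
print(report)  # {'20-29': 6, '30-39': '<5', '50-59': '<5'}
```

Here the single 58-year-old and the two participants in their thirties are reported only as "<5", so a reader cannot tell exactly how many people fall in those ranges.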
