Data Set Definitions

This page defines the different types of data sets to help you comply with the Privacy Rule and DUHS IRB guidelines.

What is PHI?

"PHI" stands for Protected Health Information. HIPAA privacy regulations define PHI as individually identifiable health information that is maintained or transmitted in any form or medium.

The definition of individually identifiable health information is as follows:

“Individually identifiable health information is information that is a subset of health information, including demographic information collected from an individual, and:

(1) Is created or received by a health care provider, health plan, employer, or health care clearinghouse; and

(2) Relates to the past, present, or future physical or mental health or condition of an individual; the provision of health care to an individual; or the past, present, or future payment for the provision of health care to an individual; and

(i) That identifies the individual; or
(ii) With respect to which there is a reasonable basis to believe the information can be used to identify the individual.” 


What is a “limited data set”?

A Limited Data Set is health information that excludes these 16 direct identifiers:

  1. Names
  2. Postal address information, other than town or city, state, and ZIP Code
  3. Telephone numbers
  4. Fax numbers
  5. Electronic mail addresses
  6. Social security numbers
  7. Medical record numbers
  8. Health plan beneficiary numbers
  9. Account numbers
  10. Certificate/license numbers
  11. Vehicle identifiers and serial numbers, including license plate numbers
  12. Device identifiers and serial numbers
  13. Web universal resource locators (URLs)
  14. Internet protocol (IP) address numbers
  15. Biometric identifiers, including fingerprints and voiceprints
  16. Full-face photographic images and any comparable images

A Limited Data Set may retain the following information:

  1. Geographic data: town or city, state, and ZIP Code, but no street address 
  2. Dates relating to an individual (e.g., birth date, admission and discharge date)
  3. Other unique identifiers: any unique identifying number, characteristic, or code other than the 16 direct identifiers listed above
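
As a rough illustration of the two lists above, the following Python sketch drops direct-identifier fields from a single record while keeping city, state, ZIP Code, and dates. The field names (`mrn`, `street_address`, and so on) and the helper `to_limited_data_set` are hypothetical; a real data dictionary will use its own column names, and filtering by field name does not detect identifiers embedded in free-text fields.

```python
# Hypothetical field names standing in for the 16 direct identifiers.
# These are assumptions for illustration, not a standard schema.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "phone", "fax", "email", "ssn",
    "mrn", "health_plan_id", "account_number", "license_number",
    "vehicle_id", "device_id", "url", "ip_address",
    "biometric_id", "face_photo",
}


def to_limited_data_set(record):
    """Return a copy of record without the direct-identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIER_FIELDS
    }


# Made-up example record; city, state, ZIP Code, and dates are retained.
record = {
    "name": "Jane Doe",
    "street_address": "1 Example St",
    "city": "Durham",
    "state": "NC",
    "zip": "27701",
    "birth_date": "1950-04-12",
    "mrn": "000000",
    "diagnosis": "hypertension",
}
limited = to_limited_data_set(record)
print(sorted(limited))
```

A field-name filter like this is only a first pass; each remaining field still needs to be reviewed against the list of 16 identifiers.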

What is a “de-identified data set”?

Using the “safe harbor” method, a de-identified data set excludes all 18 HIPAA identifiers:

  1. Name
  2. Address (all geographic subdivisions smaller than state, including street address, city, county, and ZIP Code)
  3. All elements (except years) of dates related to an individual (including birth date, admission date, discharge date, and date of death), and exact age if over 89
  4. Telephone numbers
  5. Fax number
  6. Email address
  7. Social Security Number
  8. Medical record number
  9. Health plan beneficiary number
  10. Account number
  11. Certificate or license number
  12. Vehicle identifiers and serial numbers
  13. Device identifiers and serial numbers
  14. Web URL
  15. Internet Protocol (IP) Address
  16. Biometric identifiers, including fingerprints and voiceprints
  17. Photographic images (not limited to images of the face)
  18. Any other characteristic that could uniquely identify the individual
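
Item 3 in the list above is the one most often handled by transformation rather than deletion. The Python sketch below shows one way this might look: dates are reduced to the year, and exact ages over 89 are collapsed into a single category. The function names and the "90+" label are assumptions made for illustration; they are not prescribed by this page.

```python
from datetime import date


def generalize_date(value):
    """Keep only the year of a date, dropping month and day."""
    return value.year


def generalize_age(age):
    """Return the exact age up to 89; otherwise one '90+' category.

    The '90+' label is an illustrative assumption.
    """
    return str(age) if age <= 89 else "90+"


# Made-up example values.
admission_year = generalize_date(date(2021, 3, 15))
print(admission_year)
print(generalize_age(89))
print(generalize_age(93))
```

A year that by itself reveals an age over 89 (for example, a very early birth year) would need the same treatment, which this sketch does not handle.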