
Standard Analytical Files - LDS

Alert: Starting April 9, 2014, CMS is no longer encrypting physician identifiers, e.g., National Provider Identifiers (NPIs), on the SAFs. All new SAF requests will include the actual NPI rather than an encrypted physician identifier. Researchers who have already purchased SAFs and have an active Data Use Agreement (DUA) with CMS may request a crosswalk linking encrypted physician identifiers to NPIs/UPINs. For more information on how to request the crosswalk, visit: http://www.cms.gov/Research-Statistics-Data-and-Systems/Computer-Data-and-Systems/Privacy/DUAs_-_more_actions.html


The Limited Data Set (LDS) is the same as our previous BEFs, except that the HIC field is completely blank in the LDS.

The DESY-SORT-KEY is the unique beneficiary identifier field in the LDS files. It is consistent within a file, across files, and across time, but it cannot be used for any other purpose and will not be accepted as a search key for other CMS data. The DESY-SORT-KEY is provided within a 50-byte segment generated by DESY and appended to the end of the base (or fixed portion) record of each claim. This does not affect the statistical quality of the data: the properties of the data and the relative proportions remain the same. In other words, an identified cohort of patients (identified by diagnosis code, for example) is still the same.

These files are available by type of claim or collectively as a group. The 5% sample is created by selecting records with 05, 20, 45, 70, or 95 in positions 8 and 9 of the Health Insurance Claim (HIC) number. The term 'providers' is used universally to refer to physicians as well as institutions. Medicare institutional provider numbers are not encrypted; however, physician identifiers (e.g., UPINs) are encrypted in the public use files.
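
The 5% selection rule described above can be sketched in a few lines of Python. This is an illustration only, not CMS's implementation: it assumes "positions 8 and 9" are counted 1-based from the left of the HIC string, the function name is hypothetical, and the identifiers are synthetic. Because the HIC is blank in the LDS, users cannot re-apply this rule to the delivered files. Five of the 100 possible two-digit values qualify, which is where the 5% comes from.

```python
# Two-digit values in positions 8-9 of the HIC that place a record in the sample.
SAMPLE_VALUES = {"05", "20", "45", "70", "95"}


def in_five_percent_sample(hic: str) -> bool:
    """Return True if positions 8-9 (1-based) of the HIC hold a sample value."""
    if len(hic) < 9:
        return False
    # 1-based positions 8 and 9 correspond to the 0-based slice [7:9].
    return hic[7:9] in SAMPLE_VALUES


# Synthetic identifiers built so that positions 8-9 are easy to see.
examples = ["1234567" + "05" + "A", "1234567" + "12" + "A", "1234567" + "95" + "B"]
selected = [hic for hic in examples if in_five_percent_sample(hic)]
```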

Physician Identification (ID) numbers are encrypted. The DESY Link Key field provides an encrypted number that enables users to find all claims for a single beneficiary. These files contain final action claims data in which all adjustments have been resolved. Files are scheduled to be released in late November.
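
As a rough sketch of how the encrypted beneficiary key can be used to collect all claims for one beneficiary across claim-type files, the example below groups rows from several CSV sources by that key. The column names (`DESY_SORT_KEY`, `CLAIM_NO`) and the sample rows are assumptions made for illustration; the actual field names are given in the data dictionary.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column name; check the data dictionary for the real one.
KEY_COLUMN = "DESY_SORT_KEY"


def claims_by_beneficiary(sources, key_column=KEY_COLUMN):
    """Group claim rows from several CSV sources by the beneficiary key."""
    grouped = defaultdict(list)
    for claim_type, handle in sources.items():
        for row in csv.DictReader(handle):
            grouped[row[key_column]].append((claim_type, row))
    return grouped


# Synthetic stand-ins for two claim-type files.
inpatient = io.StringIO("DESY_SORT_KEY,CLAIM_NO\n100000001,A1\n100000002,A2\n")
outpatient = io.StringIO("DESY_SORT_KEY,CLAIM_NO\n100000001,B7\n")

grouped = claims_by_beneficiary({"inpatient": inpatient, "outpatient": outpatient})
claim_types_for_first = [claim_type for claim_type, _ in grouped["100000001"]]
```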

Note: The file format of the 2011 LDS SAFs has been modified to accommodate changes in the source data file. Significant changes include the segmentation of the claims files:

Part A (Inpatient, Outpatient, Home Health, Hospice, Skilled Nursing Facility)

  1. Base Claim File
  2. Revenue Center File
  3. Condition Code File
  4. Occurrence Code File
  5. Value Code File

Part B (Carrier, DMERC)

  1. Base Claim File
  2. Line File
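
Because each claim is now split across segment files, analyses typically need to reassemble it. The sketch below attaches Line File rows to their Base Claim File row. It assumes both segments carry the beneficiary key and a shared claim identifier; the column names (`DESY_SORT_KEY`, `CLAIM_NO`, `LINE_NO`) and the rows are hypothetical, so consult the data dictionary for the actual linking fields. Under the same assumption, the pattern would carry over to the Part A segment files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical linking columns shared by the base and line segments.
KEY_COLUMNS = ("DESY_SORT_KEY", "CLAIM_NO")


def attach_lines(base_handle, line_handle, key_columns=KEY_COLUMNS):
    """Return one dict per base claim with its matching line rows attached."""
    lines_by_claim = defaultdict(list)
    for line in csv.DictReader(line_handle):
        lines_by_claim[tuple(line[c] for c in key_columns)].append(line)

    claims = []
    for base in csv.DictReader(base_handle):
        key = tuple(base[c] for c in key_columns)
        # A base claim with no matching lines gets an empty list.
        claims.append({"base": base, "lines": lines_by_claim.get(key, [])})
    return claims


# Synthetic stand-ins for a Base Claim File and a Line File.
base_file = io.StringIO(
    "DESY_SORT_KEY,CLAIM_NO,CLAIM_TYPE\n100000001,C1,carrier\n100000002,C2,carrier\n"
)
line_file = io.StringIO(
    "DESY_SORT_KEY,CLAIM_NO,LINE_NO\n100000001,C1,1\n100000001,C1,2\n"
)

claims = attach_lines(base_file, line_file)
line_counts = [len(claim["lines"]) for claim in claims]
```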


File cost is per year

Media: CD or DVD - check individual file for media offered
Data Format: CSV format with SAS® read-in program

Available: 1999 through 2011

Note: 100% Physician/Supplier Part B (Carrier) File not provided due to file size.

Please follow steps 1 – 8 of "How to Request LDS Data Files" under the Limited Data Sets subheading.

See the links in the "Downloads" section below to find:

- Important Information about the 2008 LDS SAFs

- Availability and Procedures for Obtaining the LDS 'Date' file

- Price List and Media Output for SAFs

- Data Dictionary for SAS and CSV Datasets