BSA Inpatient Claims PUF

This release contains the Basic Stand Alone (BSA) Inpatient Public Use File (PUF), named “CMS 2008 BSA Inpatient Claims PUF”, with information from 2008 Medicare inpatient claims. It is a claim-level file: each record is an inpatient claim incurred by a beneficiary in a 5% sample of Medicare beneficiaries. The PUF provides a few demographic and claim-related variables, detailed below. Because beneficiary identities are not provided, claims that belong to the same beneficiary cannot be linked within the CMS 2008 BSA Inpatient Claims PUF.

Beneficiaries were selected as a 5% simple random sample (without replacement) from the approximately 48 million people eligible for Medicare at any time during 2008, excluding people included in the widely used 5% Medicare research sample. The Inpatient Claims PUF contains demographic information from the beneficiary summary file and claims information from the inpatient claims file.
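
The selection described above can be illustrated with a minimal Python sketch. It is not the CMS procedure: the function name `draw_puf_sample`, the toy population of 1,000 IDs, and the seeds are all hypothetical, and taking the sample size as 5% of the full eligible population (rather than of the remainder after exclusion) is an assumption.

```python
import random

def draw_puf_sample(eligible_ids, research_sample_ids, fraction=0.05, seed=0):
    """Simple random sample without replacement, excluding a prior sample."""
    excluded = set(research_sample_ids)
    # Keep only beneficiaries who are not in the existing research sample.
    candidates = [b for b in eligible_ids if b not in excluded]
    # Assumption: the sample size is a fraction of the full eligible population.
    k = round(fraction * len(eligible_ids))
    rng = random.Random(seed)
    # random.Random.sample draws k distinct items, i.e. without replacement.
    return rng.sample(candidates, k)

# Toy stand-ins for the eligible population and the existing research sample.
eligible = list(range(1000))
research = random.Random(1).sample(eligible, 50)
puf = draw_puf_sample(eligible, research)
```

Because the research sample is removed before drawing, no beneficiary can appear in both samples.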

The file contains seven (7) variables: a primary claim key indexing the records and six (6) analytic variables, listed below. One of the analytic variables, claim cost, is provided in two forms: (a) as an integer category and (b) as a dollar average. These two forms are essentially equivalent and can be treated as one variable, which is why the count of analytic variables is six (6) rather than seven (7), in addition to the claim ID. Counting both forms separately, the file has eight columns.

  • Age (BENE_AGE_CAT_CD), the beneficiary's age, reported in six categories: (1) under 65, (2) 65-69, (3) 70-74, (4) 75-79, (5) 80-84, (6) 85 and above.
  • Gender (BENE_SEX_IDENT_CD), (1) male or (2) female.
  • Base DRG (IP_CLM_BASE_DRG_CD): a set of 311 possible codes, numbered 1-311, derived from MS-DRG codes. Each code identifies a basic diagnosis or a set of diagnoses. A base DRG code may comprise up to three MS-DRG codes.
  • ICD-9 primary procedure code (IP_CLM_ICD9_PRCDR_CD): the primary procedure, coded under the “International Classification of Diseases” version 9. This is a two-digit code reported as 00-99; 85 such codes are observed in the PUF. This is the only variable that has “missing” values (about 47% missing); a missing value means that the claim has no primary procedure.
  • Length (IP_CLM_DAYS_CD), the length of stay, reported in four categories: (1) 1 day, (2) 2-4 days, (3) 5-7 days, and (4) 8 or more days.
  • Amount (IP_DRG_QUINT_PMT_AVG and IP_DRG_QUINT_PMT_CD): This has (up to) five (5) categories for each base DRG code. Within each base DRG code, the original claim amounts in the entire population (excluding negative payments) are divided into approximate quintiles, identified by IP_DRG_QUINT_PMT_CD with values of 1-5. IP_DRG_QUINT_PMT_AVG is the dollar-average form of the same variable.
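
The quintile construction for the Amount variable can be sketched roughly as follows. This is an illustration under stated assumptions, not the method CMS used: the function `quintile_payments`, the rank-based cut rule, the handling of ties, and the choice to average payments within each quintile are all assumptions, and the payment values are made up.

```python
from collections import defaultdict
from statistics import mean

def quintile_payments(claims):
    """claims: iterable of (base_drg, payment) pairs.

    Returns (base_drg, payment, quintile_code, quintile_average) tuples.
    """
    by_drg = defaultdict(list)
    for drg, payment in claims:
        if payment >= 0:  # negative payments are left out
            by_drg[drg].append(payment)

    out = []
    for drg, payments in by_drg.items():
        payments.sort()
        n = len(payments)
        # Rank-based codes from 1 to 5; a small group may use fewer than five.
        codes = [i * 5 // n + 1 for i in range(n)]
        # Assumed "dollar average": mean payment within each quintile.
        averages = {
            c: mean(p for p, pc in zip(payments, codes) if pc == c)
            for c in set(codes)
        }
        out.extend((drg, p, c, averages[c]) for p, c in zip(payments, codes))
    return out

# Made-up payments for a single base DRG, including one negative payment.
toy_claims = [(1, p) for p in range(100, 1100, 100)] + [(1, -50)]
rows = quintile_payments(toy_claims)
```

Each returned row carries both forms of the variable: the integer category and the corresponding dollar average for that base DRG.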

In addition to the General documentation file, there are two documentation files for the Inpatient Claims PUF:

  1. The data dictionary and codebook file contains information about each variable on the file and its values, as well as formatted frequencies for each variable on the data file.
  2. The data users document gives the file layout for the downloadable CSV file, as well as SAS program code for creating a SAS dataset with variable formats.
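
For readers not working in SAS, a rough Python equivalent of reading the CSV and applying the variable formats might look like the sketch below. The two data rows are invented, the claim key column is omitted because its name is not given on this page, and treating a blank procedure field as “missing” is an assumption; the data users document remains the authority on the actual layout.

```python
import csv
import io

# Category labels taken from the variable descriptions on this page.
AGE_LABELS = {"1": "under 65", "2": "65-69", "3": "70-74",
              "4": "75-79", "5": "80-84", "6": "85 and above"}
SEX_LABELS = {"1": "male", "2": "female"}
DAYS_LABELS = {"1": "1 day", "2": "2-4 days", "3": "5-7 days",
               "4": "8 or more days"}

def decode_row(row):
    """Turn one CSV row of codes into labeled values."""
    proc = row["IP_CLM_ICD9_PRCDR_CD"].strip()
    return {
        "age": AGE_LABELS[row["BENE_AGE_CAT_CD"]],
        "sex": SEX_LABELS[row["BENE_SEX_IDENT_CD"]],
        "base_drg": int(row["IP_CLM_BASE_DRG_CD"]),
        # Assumption: a blank field means no primary procedure on the claim.
        "procedure": proc or None,
        "length_of_stay": DAYS_LABELS[row["IP_CLM_DAYS_CD"]],
        "payment_quintile": int(row["IP_DRG_QUINT_PMT_CD"]),
        "payment_average": float(row["IP_DRG_QUINT_PMT_AVG"]),
    }

# Two invented rows standing in for the downloaded CSV file.
sample = io.StringIO(
    "BENE_SEX_IDENT_CD,BENE_AGE_CAT_CD,IP_CLM_BASE_DRG_CD,"
    "IP_CLM_ICD9_PRCDR_CD,IP_CLM_DAYS_CD,IP_DRG_QUINT_PMT_AVG,"
    "IP_DRG_QUINT_PMT_CD\n"
    "1,3,42,,2,4500,2\n"
    "2,6,117,81,4,12000,5\n"
)
records = [decode_row(r) for r in csv.DictReader(sample)]
```

To read the real file, the `io.StringIO` object would be replaced by an open file handle on the downloaded CSV.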

Page Last Modified: 09/06/2023 04:57 PM