PIPDCG Payment Model for 2002 Software

October 20, 2000
November 29, 2001 (revision date)


This memorandum provides documentation for using the PIPDCG 
payment model 2002 software.  It was prepared by DxCG, Inc. under 
a subcontract to Health Economics Research (HER) on HCFA contract 
500-99-0038.

The software consists of two programs:

     PIPMODEL.TXT   program that processes enrollment and claims
                    input data, and calculates relative risk
                    scores and predicted expenditures.
     FMTPROG2.TXT   program that creates the required SAS format
                    library.

Note: The software is SAS source code.  The program takes the form 
needed to run on an IBM mainframe.  Modifications are needed to 
run it on other platforms.  For example, the JCL assigning files 
would be replaced by SAS statements or system-specific code.

PIPMODEL.TXT is the main program that assigns PIPDCGs and 
calculates relative risk scores and predicted expenditures.  
PIPMODEL.TXT is written in the SAS programming language.  It was 
developed in SAS Version 6.09 and should run correctly on that 
version or any more recent version.

To run PIPMODEL.TXT successfully, a SAS FORMAT LIBRARY containing 
the crosswalk from ICD-9-CM diagnosis codes to PIPDCG is 
necessary.  This SAS FORMAT LIBRARY is external to PIPMODEL.TXT.  
PIPMODEL.TXT utilizes the library to map ICD-9-CM diagnosis codes 
into PIPDXGs and to map PIPDXGs into PIPDCGs.

The separate program FMTPROG2.TXT creates the required SAS FORMAT 
LIBRARY containing the crosswalk.  The JCL statement with the 
DDname LIBRARY in PIPMODEL.TXT points to the SAS FORMAT LIBRARY.  

Before running PIPMODEL.TXT, the user must run the program 
FMTPROG2.TXT, which writes the SAS FORMAT LIBRARY as its output.  
How the library is referenced depends on the user's platform.


The remainder of this document explains PIPMODEL.TXT.  The topics 
covered are:

  I.           Input Files
  II.          User-Defined Program Parameters
  III.         Notes on Program Computations
  IV.          Output File
  Appendix A:  HER age/sex diagnosis validity edits


I.   Input Files

Two SAS input files are required for the software:

PERSON file--a person-level file of demographic and enrollment 
information; and
ADMISSN file--an admission-level file of diagnoses and length of 
stay.

Data requirements for the SAS input files:

A.  PERSON Input File 
The person-level input file requires the following variables for 
each person:

1)  IDNO
IDNO must be character or numeric type and unique to an 
individual.  It can, for example, be the Medicare HICNO.  The 
PERSON Data Set must be presorted in ascending order by IDNO.

2)  SEX
One character:  1=male, 2=female.

3)  DOB
Date of birth.  Numeric variable in format YYYYMMDD, where Y 
indicates year, M is month, and D is day.  For example, someone 
born on June 23, 1930, would be coded as 19300623.

Using DOB, SEX, YEAR2, and YR2MONTH (YEAR2 and YR2MONTH are
user-defined program parameters specifying the year for which 
expenditures are being predicted - see Section II. below), the 
software computes a person's age and sex cells, which are defined 
as the fraction of eligible* months in YEAR2 that are spent in 
each cell.  A person's age for each month in Year 2 is determined 
as their age as of the first day of the following month.  Then the 
number of months in each age cell in Year 2 is determined, and 
divided by the number of eligible months to determine the fraction 
of the year in each age/sex cell.  These cells may take on the 
values zero, a fraction, or one, and sum to one for each person.

Example:  A woman was born on June 15, 1929.  Expenditures are 
being predicted for YEAR2 = 1999, YR2MONTH = 1 (January).  For five 
months of YEAR2 (January through May), the woman is 69 years old, 
and for seven months (June through December) she is 70 years old.  
She receives the values 5/12 = 0.42 for the cell female, age 65-69, 
and 7/12 = 0.58 for the cell female, age 70-74.

*Throughout this program, it is assumed that the number of 
eligible months in YEAR2 for each individual is 12.

4)  OREC
Original reason for Medicare entitlement.  One character, which 
may take on the following values:

  0 = old age and survivors insurance (OASI)
  1 = disability insurance benefits (DIB)
  2 = End Stage Renal Disease (ESRD)
  3 = both DIB and ESRD

Using OREC, DOB, YEAR2, and YR2MONTH the software computes 
EVERDISM (ever disabled), which is defined as the fraction of 
YEAR2 eligible months spent in ever disabled status.  EVERDISM may 
take on the value zero, a fraction, or one.  Ever disabled status 
refers to someone who was originally entitled to Medicare by 
disability, but is currently entitled by age (i.e., is currently 65 
years of age or older).  Someone is in ever disabled status if 
they are currently 65 years old or older, and OREC takes on the 
values 1, 2, or 3.
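
A minimal Python sketch of the EVERDISM definition follows.  It is 
not the SAS code in PIPMODEL.TXT.  It assumes that the age for each 
month is determined the same way as for the age/sex cells (age on 
the first day of the following month) and that there are 12 eligible 
months; the function name is hypothetical.

```python
from fractions import Fraction


def everdism(dob, orec, year2, yr2month):
    """Fraction of YEAR2 months spent in ever disabled status.

    dob is numeric YYYYMMDD; orec is the one-character OREC code.
    """
    if str(orec) not in ("1", "2", "3"):
        return Fraction(0)
    b_year, b_month, b_day = dob // 10000, dob // 100 % 100, dob % 100
    months_65_plus = 0
    for i in range(12):
        # Assumption: age for month i is the age on the first day of
        # the following month, as for the age/sex cells.
        nxt = yr2month + i  # zero-based index of the following month
        year, month = year2 + nxt // 12, nxt % 12 + 1
        had_birthday = (month, 1) >= (b_month, b_day)
        age = year - b_year - (0 if had_birthday else 1)
        if age >= 65:
            months_65_plus += 1
    return Fraction(months_65_plus, 12)


# Someone with OREC = 1 who turns 65 on June 15 of YEAR2 = 1999.
print(everdism(19340615, "1", 1999, 1))
```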

5)  MCAID
Numeric variable.  Medicaid status, defined as 1 for year 2 
(YEAR2) if the person was enrolled in Medicaid for at least one 
month in (the base) year 1, 0 otherwise.  This variable takes on 
the values 0 or 1 for all of year 2; it cannot be a fraction.

6)  MSP
Numeric variable.  Medicare as a secondary payer, or working aged 
status, defined as the fraction of eligible months in year 2 that 
the person is in working aged status.  This variable can take the 
values zero, a fraction, or one.  MSP should be coded as 0 for the 
non-working aged.  It should never be left missing.  To derive a 
correct relative risk score and monthly predicted expenditure 
value for someone who is working aged in a given month, set this 
variable to 1.  However, if this person also has non-working-aged 
months during the prediction year, neither the annualized predicted 
expenditure nor the relative risk score the program outputs will be 
correct for the entire year.  To get correct annual results for 
someone with a mixture of working aged and non-working-aged months, 
enter the fraction of eligible months in working aged status in 
year 2.  For example, if a person is in working aged status for 6 
of 12 months, enter 0.5.  In this case, the monthly prediction is 
an average across the months of the year and is incorrect for any 
particular month, because in any given month a person either is or 
is not in working aged status.

7)  NEWENROL
Numeric variable.  1 indicates a new beneficiary (Medicare 
enrollee).  This indicator is used when the beneficiary has not 
been eligible for a full data year for the collection of 
encounters.  0 indicates a continuing beneficiary.  New and continuing 
beneficiaries have different payment formulas.

8)  CHFFLAG
Character variable, length 1.  1 indicates a beneficiary who has 
received a diagnosis of Congestive Heart Failure (CHF) in the year 
prior to the 'base year'.  0 indicates a beneficiary who did not 
receive a CHF diagnosis in the prior year.  A set of adjusted 
output variables will be calculated that may include additional 
payment components for beneficiaries with a prior CHF diagnosis 
identified by CHFFLAG = 1.

B.  ADMISSN File
The hospital admission level input file requires the following 
variables, where each record represents one hospital admission:

1)  IDNO
IDNO of person admitted.  The specifications are the same as for 
PERSON file.  IDNO on the ADMISSN file must be identical to the 
IDNO for the same person on the PERSON file.  It can, for example, 
be a person's Medicare HICNO.  The ADMISSN file must be presorted 
in ascending order by IDNO.

2)  LOS
Numeric variable.  Length of stay, in days, defined as discharge 
date minus admission date.

3)  DIAG1, DIAG2, DIAG3, ..., DIAG[MAXDIAG]
(In the SAS computer language, DIAG1-DIAG&MAXDIAG)
All ICD-9-CM diagnoses from this hospital stay.  The principal 
diagnosis for the stay must be in the first position (DIAG1).  All 
secondary diagnoses must follow without gaps on the record.  The 
order of the secondary diagnoses does not matter.  If a record 
(admission) has fewer than MAXDIAG diagnoses, the remaining diagnosis 
fields must be blank filled through MAXDIAG.  (MAXDIAG is a user-
defined program parameter specifying the maximum number of 
diagnoses allowed on a single admission record - see below.)
Each diagnosis is a 5 character field.  Diagnosis codes should be 
left-justified, include leading zeros, exclude periods, and should 
be right filled with blanks.  Letter codes (e.g., V codes) should 
be UPPER case.  Diagnosis codes not conforming to these 
specifications will be considered invalid.
Examples (b indicates a blank space):
   003.2  should be coded  0032b  NOT  003.2  NOT  0032
          NOT  bb32b  NOT  32bbb
   003.20 should be coded  00320  NOT  003.2b
   650    should be coded  650bb  NOT  65000  NOT  00650
          NOT  bb650
   V57.0  should be coded  V570b  NOT  v570b
   806.21 should be coded  80621  NOT  806.21
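
The layout rules above can be checked before the input files are 
built.  The following Python sketch is illustrative only and is not 
part of the software; both function names are hypothetical.  It 
tests the layout of a field, not whether the code exists in the 
crosswalk, so a wrongly zero-padded value such as 65000 still 
passes the layout check.

```python
import re

# One letter or digit, then 2 to 4 digits, then trailing blanks.
_LAYOUT = re.compile(r"[A-Z0-9][0-9]{2,4} {0,2}")


def format_diag(code):
    """Drop the period, upper-case, and right-fill with blanks to 5.

    Leading zeros must already be present; they cannot be recovered
    from a value such as "32".
    """
    return code.strip().upper().replace(".", "").ljust(5)


def is_well_formed(field):
    """True if a 5-character field follows the layout rules above."""
    return len(field) == 5 and _LAYOUT.fullmatch(field) is not None


for raw in ["003.2", "003.20", "650", "v57.0", "806.21"]:
    field = format_diag(raw)
    print(repr(raw), "->", repr(field), is_well_formed(field))
```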


II.  User-Defined Program Parameters

The user can set the following program parameters in Part 1, Step 
1 of the SAS program.  If they are not changed by the user, they 
assume the default values indicated.

1)  LOS01 = 1 includes diagnoses from all hospital admissions in
              assigning PIPDCG.
          = 0 ignores diagnoses from all hospital admissions with
              LOS of less than 2 in assigning PIPDCG.
          Default = 0

Important note:  the prediction formulas in the software are 
calibrated excluding short stay admissions from PIPDCG assignment, 
whether or not the user-controllable switch is set to include or 
exclude short stays.  That is, there is only one formula in the 
software, based on exclusion of short stays.

2)  PIPADJ  = 1 HER age/sex edits for invalid diagnoses are done
            = 0 HER age/sex edits are not done.
            Default = 0.
See Appendix A for HER age/sex edits.

3)  MAXDIAG = maximum number of diagnoses allowed on an admission 
record.
Code as an integer, e.g., 1 2 ... 10 11 
            Default = 10

4)  YEAR2  = the year for which expenditures are being predicted.
          Format is yyyy.
          Default = 1999
If YEAR2 extends across two calendar years, the calendar year of 
the first month of YEAR2 should be entered.

Important note:  predicted expenditures are always in 1996 
dollars.  They are NOT adjusted for inflation, notwithstanding the 
value of YEAR2.  YEAR2 only affects the computation of the age/sex 
cells, the ever disabled variable, and the age variable used in 
the HER age/sex edits.

5)  YR2MONTH = the first month of YEAR2.
             Format is an integer 1-12.
             Default is  1.
Possible values are 1 = January, 2 = February, ..., 12 = 
December.  E.g., if YEAR2=1999 and YR2MONTH = 1, then the software 
assumes that expenditures are being predicted for calendar year 
1999.  If YEAR2 = 1999 and YR2MONTH = 5, the software assumes that 
expenditures are being predicted for May 1999 - April 2000.

6)  FAGESEX = 1 output file includes 34 age/sex cell variables for
                each person
            = 0 age/sex variables not included in the output file.
            Default = 1

7)  WAM = [value]
Working aged multiplier for continuing beneficiaries.  Multiplier 
used to adjust predicted expenditures and risk score of a person 
in working aged status.  Any number can be entered. 
            Default value = 0.21.

8)  WAM_NE = [value]
Working aged multiplier for new beneficiaries.
            Default value = 0.21.


III. Notes on Program Computations

Using the information on the two input files, the software assigns 
each continuing beneficiary a PIPDCG for Year 2, which ranges from 
4 to 29.  All new beneficiaries (NEWENROL = 1) are assigned a 
value of "missing" for the variable PIPDCG.

The software replaces principal chemotherapy diagnoses by the 
highest-ranked secondary cancer PIPDXG on the same admission 
record.  If an admission has a principal diagnosis of 
chemotherapy, but no cancer diagnosis among the secondary 
diagnoses, the software assigns the admission to PIPDXG 14 - 
breast cancer - which is the lowest-paid cancer PIPDXG.

The software searches all secondary diagnoses for all admissions 
for HIV/AIDS diagnoses (PIPDXG 3).  It assigns an admission to 
PIPDXG 3 if either a principal or secondary diagnosis of HIV/AIDS 
is present for that admission.

Excluding short stays takes precedence over the chemotherapy or 
HIV/AIDS algorithms.  That is, short stays with a principal 
diagnosis of chemotherapy or HIV/AIDS are excluded.
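
The three admission-level rules above can be sketched as follows.  
This is an illustrative Python sketch, not the SAS code in 
PIPMODEL.TXT.  PIPDXG 3 and PIPDXG 14 come from the text; the set 
of chemotherapy PIPDXGs and the cancer ranking are hypothetical 
inputs supplied by the caller, and checking HIV/AIDS before 
chemotherapy is an assumption, since the text does not state an 
order.

```python
HIV_AIDS_PIPDXG = 3        # from the text
DEFAULT_CANCER_PIPDXG = 14  # breast cancer, from the text


def admission_pipdxg(los, dxgs, chemo_dxgs, cancer_rank, los01=0):
    """Return the PIPDXG for one admission, or None if it is ignored.

    los         length of stay in days
    dxgs        PIPDXG of each diagnosis on the record, principal first
    chemo_dxgs  set of PIPDXGs meaning chemotherapy (hypothetical input)
    cancer_rank dict of cancer PIPDXG -> rank, 1 = highest-ranked
                (hypothetical input)
    los01       the LOS01 program parameter
    """
    # Short-stay exclusion takes precedence over everything else.
    if los01 == 0 and los < 2:
        return None
    if not dxgs:
        return None
    # HIV/AIDS in any position assigns the admission to PIPDXG 3.
    # Assumption: this check comes before the chemotherapy rule.
    if HIV_AIDS_PIPDXG in dxgs:
        return HIV_AIDS_PIPDXG
    principal = dxgs[0]
    if principal in chemo_dxgs:
        cancers = [d for d in dxgs[1:] if d in cancer_rank]
        if cancers:
            return min(cancers, key=lambda d: cancer_rank[d])
        return DEFAULT_CANCER_PIPDXG
    return principal


# Placeholder values for illustration only; 900 and the ranking are
# not real PIPDXG assignments.
chemo = {900}
rank = {10: 1, 14: 2}
print(admission_pipdxg(5, [900, 14, 10], chemo, rank))
print(admission_pipdxg(1, [900, 10], chemo, rank))
```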

For each continuing enrollee in the sample, the software computes 
a base relative risk score for year 2 using the payment formula 
that was announced by HCFA in its Report to Congress, March, 1999.  
The relative cost weights used in this formula are a function of 
the age/sex cells, EVERDISM, MCAID, and PIPDCG.  This base 
relative risk score is called RSKSCORB.  It is rounded to the 
nearest .001.

The Base Relative Risk Score, RSKSCORB, is adjusted for working 
aged status, and assigned to RSKSCORA, which denotes the Relative 
Risk Score Adjusted.  The formula used is:
         RSKSCORA = RSKSCORB*(1 - MSP*(1-WAM))
where WAM is the user-defined working aged multiplier parameter, 
set to a default value of 0.21.  Note that when MSP = 0,
RSKSCORA = RSKSCORB, i.e., no change.  Also, when MSP = 1, 
RSKSCORA = WAM*RSKSCORB.  The relative risk score RSKSCORA is 
rounded to the nearest .001.
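
The working aged adjustment can be checked by hand with a few lines 
of Python.  This sketch is illustrative and is not the SAS code; 
the function name is hypothetical, and rounding halves upward is an 
assumption about how the program rounds.

```python
from decimal import Decimal, ROUND_HALF_UP


def rskscora(rskscorb, msp, wam="0.21"):
    """RSKSCORA = RSKSCORB*(1 - MSP*(1 - WAM)), rounded to .001."""
    b = Decimal(str(rskscorb))
    m = Decimal(str(msp))
    w = Decimal(str(wam))
    adjusted = b * (1 - m * (1 - w))
    # Assumption: halves are rounded upward.
    return adjusted.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)


print(rskscora("1.000", 0))      # MSP = 0: no change
print(rskscora("1.000", 1))      # MSP = 1: WAM*RSKSCORB
print(rskscora("1.000", "0.5"))  # half the year in working aged status
```

With RSKSCORB = 1.000 and the default WAM of 0.21, a person with 
MSP = 0.5 receives 1.000*(1 - 0.5*0.79) = 0.605.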

Annual predicted expenditures are calculated as:
         PREDEXPB = RSKSCORB*5100 and 
         PREDEXPA = RSKSCORA*5100.
These amounts are rounded to the nearest .01.

Monthly predicted expenditures are calculated from the annual 
amounts by:
         MPRDEXPB = PREDEXPB/12 and 
         MPRDEXPA = PREDEXPA/12.
These amounts are rounded to the nearest .01.
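
As a worked illustration of the two expenditure formulas, the 
following Python sketch derives the annual and monthly amounts from 
a relative risk score.  It is not the SAS code; the function name is 
hypothetical, rounding halves upward is an assumption, and the 
monthly amount is computed from the already rounded annual amount, 
following the wording above.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def predicted_expenditures(risk_score):
    """Return (annual, monthly) predicted expenditures.

    Pass RSKSCORB to obtain PREDEXPB and MPRDEXPB, or RSKSCORA to
    obtain PREDEXPA and MPRDEXPA.
    """
    annual = (Decimal(str(risk_score)) * 5100).quantize(
        CENT, rounding=ROUND_HALF_UP)
    monthly = (annual / 12).quantize(CENT, rounding=ROUND_HALF_UP)
    return annual, monthly


print(predicted_expenditures("1.000"))
print(predicted_expenditures("0.605"))
```

A risk score of 1.000 gives 5100.00 per year and 425.00 per month.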

For each new beneficiary, the same steps are followed except that 
the working aged multiplier for new beneficiaries, WAM_NE, 
replaces the working aged multiplier for continuing beneficiaries, 
WAM.  Also, relative risk scores for new beneficiaries are derived 
from a different formula that is a function of age, sex, and 
MCAID, but not the PIPDCGs or EVERDISM.

Three additional values, CHFRSKSC (CHF-adjusted risk score), 
PREDEXPC, and MPRDEXPC are calculated for each beneficiary.  These 
values may include an additional payment component if that 
beneficiary has been identified as having received a diagnosis of 
Congestive Heart Failure (CHF) in the year prior to the 'base 
year'.  Beneficiaries having received a CHF diagnosis in the prior 
year are in turn identified by the input variable CHFFLAG = 1.


IV.  Output File

The program outputs a person-level SAS dataset named OUTPUT with 
the following variables:

1)  IDNO

The person's ID number.  Same as input variable on PERSON and 
ADMISSN files.

2)  SEX

Same as input variable.

3)  DOB

Date of birth.  Same as input variable.

4)  OREC

Original reason for Medicare entitlement.  Same as input variable.

5)  EVERDISM 

Fraction of eligible months in Year 2 that the person is in 'ever 
disabled' status.

6)  MCAID

Medicaid status.  Same as input variable.

7)  MSP

Medicare as a secondary payer (working aged) status.  Same as 
input variable.

8)  PIPDCG

A person's Principal Inpatient Diagnostic Cost Group.

9)  AGE

A person's age in years on the first day of YR2MONTH in YEAR2.

10)  NEWENROL

Indicator variable for new Medicare beneficiary.  Same as input 
variable.

11)  PREDEXPB

Annualized base predicted expenditures, in 1996 dollars.

12)  PREDEXPA

Annualized predicted expenditures in 1996 dollars, adjusted for 
working aged status.

13)  MPRDEXPB

Monthly base predicted expenditures, in 1996 dollars.

14)  MPRDEXPA

Monthly predicted expenditures, in 1996 dollars, adjusted for 
working aged status.

15)  RSKSCORB

Base relative risk score.

16)  RSKSCORA

Relative risk score, adjusted for working aged status.

17)  CHFFLAG

Indicator variable for Congestive Heart Failure (CHF) diagnosis 
status in the year prior to the 'base year'.  Same as input 
variable.

18)  PREDEXPC

Annualized predicted expenditures, in 1996 dollars, adjusted for 
CHF status.

19)  MPRDEXPC

Monthly predicted expenditures, in 1996 dollars, adjusted for CHF 
status.

20)  CHFRSKSC

Relative risk score, adjusted for CHF status.

21)  W0_34  W35_44 W45_54 W55_59 W60_64 W65_69
     W70_74 W75_79 W80_84 W85_89 W90_94 W95_GT
     M0_34  M35_44 M45_54 M55_59 M60_64 M65_69
     M70_74 M75_79 M80_84 M85_89 M90_94 M95_GT
     W65  W66  W67  W68  W69
     M65  M66  M67  M68  M69

34 age/sex variables indicating the fraction of eligible months in 
YEAR2 in each age/sex cell.
M0_34  indicates "male, age 0 to 34"
M65    indicates "male, age 65"
M95_GT indicates "male, 95 years or older"
W0_34  indicates "female, age 0 to 34" etc.

The variables M65-M69 and W65-W69 are used for new beneficiaries 
instead of the variables M65_69 and W65_69.

These variables are output only if the user-defined parameter 
FAGESEX is set to 1 (see Section II. above).

Appendix A:         HER Age/Sex Diagnosis Edits

The following age/sex edits were used in model development:

1.  This edit assumes that if neonatal codes occur on the record 
of a female 2 years or older, they are a baby's diagnoses on a 
mother's record.  For males 2 years or older, they are assumed to 
be invalid.

    If age >= 2 and PIPDXG is in interval from 166 to 170
     Then  for Male   (SEX=1) set PIPDXG=-1 (invalid);
           for Female (SEX=2) set PIPDXG=130.


2.  This edit specifies diagnostic categories that are 
inconsistent with sex (e.g., females with prostate diagnoses).

     For Female:
          if SEX=2 and (PIPDXG is one of (18,121,122)
          or (PIPDXG=31 and ICD9 starts with 257)) 
          then set PIPDXG=-1 (invalid).

     For Male: 
          if SEX=1 and (PIPDXG is one of (16,17,123,124,125)
          or (PIPDXG=31 and ICD9 starts with 256))
          then set PIPDXG=-1 (invalid).


3.  This edit specifies pregnancy/infertility diagnoses that are
    inconsistent with age and/or sex:

     if (PIPDXG is in interval from 126 to 132
     or (PIPDXG=124 and ICD9 starts with 628))
     and (SEX='1' or AGE < 8 or AGE > 59)
     then set PIPDXG=-1 (invalid).
