Data at CMS

These sections of the CMS Technical Reference Architecture (TRA) consolidate guidance and policy information regarding data — its management, storage, and consumption by users and application systems.

The content is organized within the following sections:

Introduction to Data Management

The Centers for Medicare & Medicaid Services (CMS) Enterprise Data Environment (EDE) data strategy addresses data management and the Enterprise Data Mesh. The strategy seeks to share costs across multiple CMS Centers to build shared services. It depicts a notional framework that organizes data management capabilities, data governance policies, and data user support services, and it includes core components that support CMS’s infrastructure and enterprise shared data. The overall framework seeks to keep costs down while encouraging data reuse, better data quality, faster DevOps, advanced security management, and improved adaptability. In addition, this framework supports part of the risk-based management framework and the requirements of National Institute of Standards and Technology (NIST) Special Publication (SP) 800-37 Rev. 2, Risk Management Framework for Information Systems and Organizations: A System Life Cycle Approach for Security and Privacy.

Effectively securing and managing enterprise data requires consistent data management methods, including data governance, architecture, quality, and security, as outlined in the HHS Policy for Enterprise Data Management guidelines. Data architecture and consistent data management methods should include, but are not limited to, scalable solutions, minimized data redundancy, and consideration of data virtualization to address data synchronization and integration needs.

Monitoring and managing data security are essential to protect the confidentiality, integrity, and availability (CIA) of data. CMS policies and procedures must include data security requirements that comply with Department of Health and Human Services (HHS) and federal mandates. Such mandates include, but are not limited to, required controls to protect data collected and shared across the enterprise, to maintain a comprehensive inventory of databases and their contents, to encrypt sensitive data at rest and in transit (except data approved for public release), and to protect data against unauthorized use. Data sharing agreements must comply with HHS policy; Office of Chief Information Officer (OCIO) Rules of Engagement for Security, Monitoring, and Collaborative Systems; NIST SP 800-47 Rev. 1, Managing the Security of Information Exchanges; and NIST SP 800-53 Rev. 5, Security and Privacy Controls for Information Systems and Organizations.
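The inventory and encryption requirements above can be illustrated with a minimal Python sketch. It is not a CMS tool or an official control implementation. The `DatabaseRecord` fields, the function names, and the sample database names are all hypothetical, chosen only to show how an inventory might be checked against the rule that sensitive data is encrypted at rest and in transit unless it is approved for public release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseRecord:
    """One entry in a hypothetical database inventory (illustrative fields only)."""
    name: str
    contains_sensitive_data: bool
    approved_for_public_release: bool
    encrypted_at_rest: bool
    encrypted_in_transit: bool


def encryption_gaps(record: DatabaseRecord) -> list[str]:
    """Return the encryption protections this record is missing, if any."""
    # Data approved for public release is exempt from the encryption rule,
    # and the rule only applies to sensitive data in the first place.
    if record.approved_for_public_release or not record.contains_sensitive_data:
        return []
    gaps = []
    if not record.encrypted_at_rest:
        gaps.append("at rest")
    if not record.encrypted_in_transit:
        gaps.append("in transit")
    return gaps


def audit_inventory(inventory: list[DatabaseRecord]) -> dict[str, list[str]]:
    """Map each non-compliant database name to its missing protections."""
    findings = {}
    for record in inventory:
        gaps = encryption_gaps(record)
        if gaps:
            findings[record.name] = gaps
    return findings


# Hypothetical inventory entries, for illustration only.
inventory = [
    DatabaseRecord("claims_db", True, False, True, False),
    DatabaseRecord("public_stats_db", True, True, False, False),
    DatabaseRecord("provider_db", True, False, True, True),
]
print(audit_inventory(inventory))
```

In this sketch only `claims_db` is reported, because it holds sensitive data that is not encrypted in transit; `public_stats_db` is exempt because it is approved for public release.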

Major Related Services and Data Sources

PREFERRED

CMS has invested heavily in the maturity of these solutions, and strongly recommends their use where feasible.

Enterprise Data Mesh (EDM)

The CMS Enterprise Data Mesh (EDM) ecosystem provides value by preventing the replication of data sources, which perpetuates data inefficiency, duplication, inconsistency, inferior quality, and increased costs for associated infrastructure. EDM services include:

  • Enterprise Data Mesh – Hive Metastore is a centralized, managed, and secure connection where systems and individual users can locate, access, and use CMS enterprise information hosted in the AWS cloud with “data in place.” The EDM enables data owners and curators to focus on data and data quality while also enabling consumers to bring their own preferred compute, analytics, and APIs. Additionally, the EDM enables a wide spectrum of programs and consumers to use program data sets with close to zero provisioning time, using the tools and technologies best suited to their use case. CMS’s EDM program includes AWS cloud-based services support, data cataloging, access management, and development and operations support.

  • EDM Launchpad is an AWS-based Platform as a Service (PaaS) solution designed to facilitate the rapid setup of proofs of concept (POCs) and pilot projects. It offers flexibility in terms of supported architectures and leverages various AWS services to provide a comprehensive platform for application development and testing. It also serves as a pre-ATO’d Production environment for consumers who do not have a cloud presence and want to explore cloud capabilities.

  • The EDM Workspace offering provides individual data analysts and data scientists within CMS with a versatile virtual desktop environment equipped with the necessary tools for data analysis, wrangling, and visualization. The technology stack and architectural choices can be customized to ensure a secure, efficient, and personalized analytical workspace for each user. The Workspace offering is designed as an all-inclusive AWS Windows or Linux workspace that caters to the preferences and requirements of individual users. The architecture features AWS WorkSpaces, providing a secure, cloud-based virtual desktop environment.

  • The Enterprise User Data Catalog (EUDC) is a user-facing, centralized repository of metadata that offers a data dictionary, data description/glossary, and other information about the metadata that a data analyst requires. The EUDC enables data consumers to find pertinent information by searching the metadata (column and field names within tables) that contributors provide.

The CMS TRA guidance can be found in the Data Management EDM chapter below.
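The kind of metadata search described for the EUDC can be sketched in a few lines of Python. This is an illustration of the general idea only, not the EUDC's actual interface or data model. The `TableMetadata` structure, the `search_catalog` function, and the sample table and column names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Hypothetical catalog entry contributed by a data owner."""
    table: str
    description: str
    # Maps each column name to its data dictionary description.
    columns: dict[str, str] = field(default_factory=dict)


def search_catalog(catalog: list[TableMetadata], term: str) -> list[tuple[str, str]]:
    """Return (table, column) pairs whose column name or description contains term."""
    needle = term.lower()
    hits = []
    for entry in catalog:
        for column, description in entry.columns.items():
            # Case-insensitive substring match on both the name and the description.
            if needle in column.lower() or needle in description.lower():
                hits.append((entry.table, column))
    return hits


# Hypothetical catalog contents, for illustration only.
catalog = [
    TableMetadata(
        table="claims_summary",
        description="One row per claim",
        columns={
            "bene_id": "Beneficiary identifier",
            "claim_amount": "Total paid amount in dollars",
        },
    ),
    TableMetadata(
        table="provider_roster",
        description="One row per provider",
        columns={
            "provider_id": "Provider identifier",
            "bene_count": "Number of beneficiaries served",
        },
    ),
]
print(search_catalog(catalog, "bene"))
```

A search for "bene" in this sketch returns the `bene_id` column of `claims_summary` and the `bene_count` column of `provider_roster`, showing how a consumer could locate related fields across tables without opening the data itself.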

Integrated Data Repository Cloud

The Integrated Data Repository Cloud (IDRC) is a high-volume data warehouse integrating Medicare claims for Parts A, B, C, D, and Durable Medical Equipment (DME) with beneficiary and provider data sources, as well as ancillary data such as contract information and risk scores. This robust, integrated data supports much-needed analytics across CMS.

IDRC services include:

  • State-of-the-art capabilities for business intelligence and reporting, along with additional data access capabilities
  • Automated Finder File and Data Extract Process
  • Data dictionary, data limitations information, and source-to-target mappings
  • Customer support and assistance
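As a rough illustration of a finder-file extract, the Python sketch below assumes that a finder file is a list of identifiers, one per line, and that the extract keeps only the source rows matching those identifiers. It does not represent the IDRC's automated process. The column names, identifiers, and function names are hypothetical.

```python
import csv
import io


def load_finder_ids(finder_file) -> set[str]:
    """Read one identifier per line, ignoring blank lines and surrounding whitespace."""
    return {line.strip() for line in finder_file if line.strip()}


def extract_matching_rows(source_rows, finder_ids, id_column):
    """Yield only the rows whose identifier appears in the finder file."""
    for row in source_rows:
        if row[id_column] in finder_ids:
            yield row


# In-memory stand-ins for a finder file and a source table (hypothetical data).
finder_file = io.StringIO("B001\nB003\n\n")
source = io.StringIO(
    "bene_id,claim_id,paid_amount\n"
    "B001,C10,125.00\n"
    "B002,C11,80.50\n"
    "B003,C12,42.10\n"
    "B001,C13,15.00\n"
)

finder_ids = load_finder_ids(finder_file)
extract = list(extract_matching_rows(csv.DictReader(source), finder_ids, "bene_id"))
print([row["claim_id"] for row in extract])
```

Holding the identifiers in a set keeps each lookup constant-time, so the source is read once regardless of how many identifiers the finder file contains.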

Center for Medicaid and CHIP Services (CMCS) DataConnect

CMCS DataConnect is an all-in-one analytics platform for CMCS. DataConnect is built on Databricks, with Amazon QuickSight dashboards. It provides read-only access to an expanding set of enterprise datasets, integrated with tools.

CMS Master Data Management

The Master Data Management (MDM) system is a CMS enterprise shared service. MDM provides singular, consolidated, and ID-resolved authoritative sources of Beneficiary, Provider, Organization, Program, and Relationship data for use within CMS and by external agencies and organizations.
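To make "ID-resolved" concrete, the Python sketch below shows one simple way records from several source systems could be consolidated under a single enterprise identifier by using a crosswalk table. It is not a description of how the MDM system works internally. The crosswalk, the identifiers, the `resolve` function, and the first-value-wins merge rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical crosswalk: (source system, local ID) -> enterprise ID.
crosswalk = {
    ("system_a", "A-17"): "E-1001",
    ("system_b", "9942"): "E-1001",
    ("system_b", "9950"): "E-1002",
}


def resolve(records, crosswalk):
    """Consolidate source records by enterprise ID; return (consolidated, unresolved)."""
    consolidated = defaultdict(dict)
    unresolved = []
    for record in records:
        enterprise_id = crosswalk.get((record["source"], record["local_id"]))
        if enterprise_id is None:
            # No known mapping: set the record aside rather than guess.
            unresolved.append(record)
            continue
        for attr, value in record["attributes"].items():
            # Assumed merge rule: the first value seen for an attribute wins.
            consolidated[enterprise_id].setdefault(attr, value)
    return dict(consolidated), unresolved


# Hypothetical source records, for illustration only.
records = [
    {"source": "system_a", "local_id": "A-17", "attributes": {"name": "Pat Doe"}},
    {"source": "system_b", "local_id": "9942", "attributes": {"name": "P. Doe", "state": "MD"}},
    {"source": "system_c", "local_id": "X1", "attributes": {"name": "Unknown"}},
]

consolidated, unresolved = resolve(records, crosswalk)
print(consolidated)
print(unresolved)
```

Here the two records that map to `E-1001` merge into one consolidated view, while the record with no crosswalk entry is returned separately so that it can be reviewed instead of being silently dropped or mismatched.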