State health departments are under significant pressure to ensure that individual records and other health data are up to date and consumable for a wide variety of use cases. A modern approach to building data pipelines can help.
Why? In today’s world, where health records feed all kinds of analysis and research, the number of entities accessing these records is at an all-time high. Yet many health departments are still stuck in the past when it comes to collection and storage, with some still gathering health information with pen and paper and then manually entering the data into a database.
Not only does this consume significant resources, but data added this way is often incomplete or unstandardized. Without coordination across the wide variety of systems involved, records can also be duplicated or delayed.
The CDC partnered with the United States Digital Service to develop a pilot project for the Virginia Department of Health, with the aim of improving data pipelines for public health agencies. It is part of a broader effort by the CDC to reduce the manual effort needed to access public health data at the state and local level.
The team developed a prototype cloud-based data pipeline that can be customized with a set of building blocks that automatically process datasets, such as lab results or case reports. The pipeline standardizes and geocodes incoming data, creating a single source of truth for everything that flows in.
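To make the "building blocks" idea concrete, the sketch below shows one way such a pipeline might be composed: small, single-purpose steps chained together over each incoming record. All function names, record fields, and structure here are illustrative assumptions for this article, not the CDC/USDS project's actual API.

```python
# A minimal sketch of composable pipeline "building blocks", assuming
# records arrive as simple dictionaries. Names and fields are hypothetical.
from typing import Callable, Dict, List

Record = Dict[str, str]
Step = Callable[[Record], Record]

def standardize_name(record: Record) -> Record:
    # Normalize patient names to consistent spacing and casing.
    record["patient_name"] = " ".join(record.get("patient_name", "").split()).title()
    return record

def standardize_date(record: Record) -> Record:
    # Normalize an MM/DD/YYYY collection date to ISO 8601 (YYYY-MM-DD).
    raw = record.get("collection_date", "")
    if "/" in raw:
        month, day, year = raw.split("/")
        record["collection_date"] = f"{year}-{month.zfill(2)}-{day.zfill(2)}"
    return record

def geocode_address(record: Record) -> Record:
    # Placeholder geocoding step; a real pipeline would call a geocoding
    # service here and attach latitude/longitude to the record.
    record["geocoded"] = "true"
    return record

def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    # Apply each building block, in order, to every incoming record.
    results = []
    for record in records:
        for step in steps:
            record = step(record)
        results.append(record)
    return results

if __name__ == "__main__":
    incoming = [{"patient_name": "  jane   DOE ", "collection_date": "3/7/2023"}]
    cleaned = run_pipeline(incoming, [standardize_name, standardize_date, geocode_address])
    print(cleaned)
    # [{'patient_name': 'Jane Doe', 'collection_date': '2023-03-07', 'geocoded': 'true'}]
```

The appeal of this design is that each step can be tested, swapped, or reordered independently, which is what lets a health department customize the pipeline for its own data sources.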
While the system was built with the Virginia Department of Health, it was designed for adoption and reuse by other health authorities. It also signals that the CDC intends to cooperate with and support health agencies' efforts to speed up data processing.
The team aims to apply what it learned from the Virginia prototype to scale the building blocks across a wide range of state and local public health services.