The Framers of our Constitution laid the cornerstone for the federal statistical system by requiring a decennial census as the basis for apportioning the House of Representatives. Over the next 233 years, the Census Bureau’s mission has expanded to serve as the nation’s leading provider of comprehensive, quality data about its people and the economy. To achieve this mission, the Census Bureau has steadily innovated to modernize its data collection and dissemination. Here we share a concept and initial steps forward for its next innovation, the creation of a Curated Data Enterprise.
The Curated Data Enterprise is both an infrastructure and a continuously evolving ambition to empower and enable Census Bureau scientists and their data users to progress from a focus on individual data elements or survey items to one focused on the purpose and use of the information. This can result in new and better measures of America’s people, places, and the economy. It is a new vision for drawing on multiple data sources, including sample surveys, censuses, tribal, federal, state, and local administrative data, and private-sector data, to produce more robust, granular, timely, and comprehensive measures of demographic changes, social trends, and economic activity.
Today, the Census Bureau is innovating its processes to take advantage of new data sources and advances in data science and computing, and to adapt to declining survey response rates that are challenging the public and private survey world. The Bureau is also exploring additional means of producing data to address shortfalls that are increasingly inherent in surveys. It has stood up a team representing a cross-section of its demographic, geographic, and economic programs to break down siloed activities and build a new enterprise infrastructure. One component of this infrastructure is the "Frames Program," which will link four key components of the internal architecture (the Geospatial, Business, Jobs, and Demographic frames) into shared enterprise resources that will bring together, serve, and support all Census Bureau surveys and make appropriate use of administrative and third-party data. This resource will form the foundation for the Curated Data Enterprise, creating a scaffold to hold and link massive amounts of public and private sector data.
With support from the Sloan Foundation and the Census Bureau, we have shared the Curated Data Enterprise concepts with a diverse set of Census Bureau stakeholders, including researchers, economic developers, economists and business leaders, advocates, public policy analysts, and applied demographers. Here we report on their expert viewpoints, including support for the concept, concerns, technical gaps, research challenges, key partnerships that will be needed, and unforeseen opportunities.
Our conclusion is that the innovations offered by the Curated Data Enterprise represent a necessary evolution beyond the survey-only model that has reached scientific and practical limits in an era of increasing demand for more data, more often, and more urgently. The Curated Data Enterprise holds the promise of producing more timely, robust, and accurate findings and of more fully reflecting the diversity of the nation’s racial and ethnic composition.
We welcome your questions and comments – please contact Stephanie Shipp.
Presentation to the U.S. Census Bureau's Research and Methodology Directorate
Dr. Sallie Keller presented the findings from the report "A 21st Century Curated Data Enterprise: A Bold New Approach to Create Official Statistics" to the U.S. Census Bureau's Research and Methodology Directorate on May 19, 2022. In this presentation, she highlights:
- The paradigm shift to focus on purpose and use when creating new federal indicators and data products.
- The forces driving innovations in federal statistics.
- The definition of the Curated Data Enterprise and its framework.
- The history of the Curated Data Enterprise initiative.
- The work plan, stakeholder groups, listening sessions, and early takeaways.
- A summary of the initial research topics:
  - Use cases
  - Access and dissemination
  - Data linkage and integration