Data.gov’s Data Pipeline Explained

Dec 4, 2014

In case you missed it: the Data.gov team recently hosted two DigitalGov University webinars to help agencies and open data advocates better understand how to get data onto Data.gov and how to implement the Open Data Policy’s metadata schema updates. These webinars were designed to assist government data publishers in making more data discoverable to the American people. You can watch the webinars and check out supplemental resources below.

Project Open Data Metadata Schema v1.1 Updates

Executive Order 13642 and OMB Memorandum M-13-13 require all executive departments and agencies to list all agency data that can be made public in a publicly available open data catalog with consistent metadata. In the year-plus since the release of the Open Data Policy, agencies and the public have suggested several updates to the metadata schema. Each proposed change was discussed in its own issue thread and at the July government-wide offsite session dedicated to this update. The result is version 1.1 of the metadata schema required under the Open Data Policy. Federal agencies will be required to present their datasets using version 1.1 starting in February 2015.

Data.gov’s October 15th “Project Open Data Metadata Updates” webinar reviews metadata schema v1.0 as required under the Open Data Policy, walks step by step through the updates to the metadata schema, and rounds up tools and resources to assist data stewards, IT personnel, and all agency staff in their v1.1 metadata updates.

As of December 3, 2014, the catalog supports both version 1.0 and 1.1 of the metadata schema, to provide agencies adequate transition time to version 1.1 by the February 2015 deadline.
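For a publisher checking which version a catalog file currently follows, a minimal sketch is shown here. It assumes, based on the published Project Open Data schema, that a v1.0 file is a bare JSON array of dataset records while a v1.1 file is a JSON object carrying a `conformsTo` URI and a `dataset` array. The function names and the exact URI constant are illustrative assumptions and should be checked against the schema documentation.

```python
import json

# Assumed identifier for the v1.1 schema; verify against the schema documentation.
V1_1_SCHEMA = "https://project-open-data.cio.gov/v1.1/schema"


def detect_schema_version(catalog_text):
    """Guess which metadata schema version a catalog document follows."""
    catalog = json.loads(catalog_text)
    if isinstance(catalog, list):
        # Assumption: v1.0 catalogs are a bare array of dataset records.
        return "1.0"
    if isinstance(catalog, dict) and catalog.get("conformsTo") == V1_1_SCHEMA:
        # Assumption: v1.1 catalogs declare the schema they conform to.
        return "1.1"
    return "unknown"


def list_datasets(catalog_text):
    """Return the dataset records from either catalog layout."""
    catalog = json.loads(catalog_text)
    if isinstance(catalog, list):
        return catalog
    if isinstance(catalog, dict):
        return catalog.get("dataset", [])
    return []
```

A check like this only inspects the outer shape of the file; it does not validate individual dataset records against either version of the schema.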

Data Harvesting 101

The Data.gov team also recently conducted a webinar with more basic information about Data.gov and how agencies’ data is added to the catalog. Data.gov is the United States’ central clearinghouse for searching and discovering over 130,000 open government datasets. Data.gov does not host data directly, but rather aggregates metadata about open data resources in one centralized location.

Once an open data source meets the necessary format and metadata requirements, the Data.gov team can pull directly from it as a Harvest Source, synchronizing that source’s metadata on Data.gov as often as every 24 hours.
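To illustrate what synchronizing a Harvest Source involves, here is a minimal sketch that compares the metadata records from the previous harvest with the records just fetched and works out what to add, update, and remove. It is not the catalog's actual harvester; the function name `plan_sync` and the idea of keying records by a dataset identifier are assumptions made for illustration.

```python
def plan_sync(previous, current):
    """Compare two harvest snapshots, each a dict keyed by dataset identifier.

    Returns the identifiers to add, update, and remove so that the
    catalog's copy of the metadata matches the source again.
    """
    prev_ids = set(previous)
    curr_ids = set(current)
    return {
        # Present at the source now, but not in the last harvest.
        "add": sorted(curr_ids - prev_ids),
        # Present in both snapshots, but the metadata record changed.
        "update": sorted(
            i for i in prev_ids & curr_ids if previous[i] != current[i]
        ),
        # Gone from the source since the last harvest.
        "remove": sorted(prev_ids - curr_ids),
    }
```

For example, if the last harvest held datasets `a` and `b`, and the source now lists a changed `b` plus a new `c`, the plan is to add `c`, update `b`, and remove `a`.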

The November 5th “How to Get Your Agency’s Data onto Data.gov” webinar reviews, step by step, how federal, geospatial, and non-federal data is funneled onto Data.gov, the data requirements for getting your government data onto Data.gov, and tools and resources to assist data stewards, IT personnel, and all agency staff.

Additional Resources

Stay Tuned

Register for Data.gov’s upcoming webinar on how to use the Inventory.Data.gov tool to host your metadata: “How to use Inventory.Data.Gov” on Tuesday, December 16th from 1 pm to 2 pm, and stay tuned for additional Data Harvesting documentation. As always, you can reach the team at