The Data Briefing: Harnessing the Internet of Things and Synthetic Data to Provide Better Flood Warnings and Prevent Veterans Suicides

Sep 14, 2016

Two significant items in federal government data in the last few weeks:

The Department of Commerce released the National Water Model. The National Water Model provides a comprehensive model of river flows so local communities can better prepare for possible flooding events. What is especially amazing about the National Water Model is that it pulls data from over 8,000 stream gauges. Stream gauges are automated stations that measure water flow, water height, surface runoff, and other hydrological data. The National Water Model is a great example of data produced by an Internet of Things: in this case, a nationwide network of scientific sensors.

The Department of Veterans Affairs’ Veterans Affairs Suicide Prevention Innovations (VASPI) event. A community of experts assembled on September 9 and 10 to find solutions to the growing problem of suicide among veterans. The themes for VASPI included:

“1) Improving VA predictive analytic methods for identifying suicide risk.”

“2) Accessing Veterans at risk for suicide who are not receiving VA care.”

“3) Enhancing VA resources and interventions for suicide prevention.”

A first in federal government data was access to a “crowdsourced synthetic suicide dataset.” Synthetic data is used to protect the confidentiality of the original data. A statistical model is fitted to the original data and then used to generate a new dataset that closely resembles the original without reproducing its records. Here, the synthetic veteran suicide dataset can be analyzed to determine which lifestyle choices or behavioral indicators may be associated with suicide events while protecting the identity of the veterans in the original dataset.
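
To make the idea concrete, here is a minimal sketch of the fit-then-generate process in Python. It is not the method used for the VASPI dataset, which this post does not describe. The records, the field names (`age`, `region`), and the helper functions `fit_model` and `generate_synthetic` are all invented for illustration. The sketch also models each column independently, which is a simplifying assumption: it loses relationships between columns and carries no formal privacy guarantee.

```python
import random
import statistics


def fit_model(records, numeric_fields, categorical_fields):
    """Summarize the original records as simple per-column distributions."""
    model = {"numeric": {}, "categorical": {}}
    for field in numeric_fields:
        values = [r[field] for r in records]
        # Keep only the mean and standard deviation, not the raw values.
        model["numeric"][field] = (statistics.mean(values), statistics.stdev(values))
    for field in categorical_fields:
        values = [r[field] for r in records]
        categories = sorted(set(values))
        weights = [values.count(c) for c in categories]
        model["categorical"][field] = (categories, weights)
    return model


def generate_synthetic(model, n, seed=None):
    """Draw n new rows from the fitted model; no original row is copied."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n):
        row = {}
        for field, (mean, stdev) in model["numeric"].items():
            row[field] = rng.gauss(mean, stdev)
        for field, (categories, weights) in model["categorical"].items():
            row[field] = rng.choices(categories, weights=weights, k=1)[0]
        synthetic.append(row)
    return synthetic


# Invented toy records, used only to exercise the sketch.
original = [
    {"age": 34, "region": "north"},
    {"age": 51, "region": "south"},
    {"age": 45, "region": "north"},
    {"age": 62, "region": "west"},
    {"age": 29, "region": "south"},
    {"age": 57, "region": "north"},
]

model = fit_model(original, ["age"], ["region"])
synthetic = generate_synthetic(model, n=1000, seed=0)
```

Analysts would work only with `synthetic`; the `original` records never leave the data owner. Real synthetic data tools typically model the joint structure of the columns rather than treating each one separately.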

Synthetic datasets can also train machine learning algorithms. Machine learning algorithms need large amounts of data to find patterns and build a prediction model. Many synthetic datasets can be created from the original dataset and then distributed to developers and researchers, who can use them to find the most effective machine learning algorithms. The best machine learning algorithms can then be applied to the original dataset to aid suicide prevention programs.
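
The workflow in the paragraph above can be sketched in a few lines of Python. Everything here is hypothetical: the toy `(x, label)` values are invented, the two candidate algorithms (`majority_class` and `midpoint_threshold`) are deliberately trivial stand-ins for real machine learning algorithms, and the per-class Gaussian generator is just one simple way to produce synthetic data. The point is only the shape of the process: generate several synthetic datasets, compare candidates on them, then apply the winner to the original data.

```python
import random
import statistics
from collections import Counter


def make_synthetic(original, n, rng):
    """Sample (x, label) pairs from per-class Gaussians fitted to the original."""
    by_label = {}
    for x, label in original:
        by_label.setdefault(label, []).append(x)
    labels = sorted(by_label)
    weights = [len(by_label[l]) for l in labels]
    params = {l: (statistics.mean(by_label[l]), statistics.stdev(by_label[l]))
              for l in labels}
    rows = []
    for _ in range(n):
        label = rng.choices(labels, weights=weights, k=1)[0]
        mean, stdev = params[label]
        rows.append((rng.gauss(mean, stdev), label))
    return rows


def fit_majority(train):
    """Baseline candidate: always predict the most common label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common


def fit_midpoint_threshold(train):
    """Candidate: split at the midpoint between the two class means."""
    xs0 = [x for x, label in train if label == 0]
    xs1 = [x for x, label in train if label == 1]
    if not xs0 or not xs1:
        return fit_majority(train)  # cannot place a cut with one class missing
    m0, m1 = statistics.mean(xs0), statistics.mean(xs1)
    cut = (m0 + m1) / 2
    if m1 >= m0:
        return lambda x: 1 if x > cut else 0
    return lambda x: 0 if x > cut else 1


def accuracy(predict, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(predict(x) == label for x, label in rows) / len(rows)


CANDIDATES = {
    "majority_class": fit_majority,
    "midpoint_threshold": fit_midpoint_threshold,
}

# Invented toy data: one numeric feature and a binary label.
original = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0), (1.2, 0),
            (4.0, 1), (4.5, 1), (5.0, 1), (5.5, 1)]

# Step 1: create several synthetic datasets that could be shared widely.
rng = random.Random(0)
synthetic_sets = [make_synthetic(original, 200, rng) for _ in range(5)]

# Step 2: compare the candidates using only the synthetic data.
scores = {}
for name, fit in CANDIDATES.items():
    per_set = []
    for rows in synthetic_sets:
        train, test = rows[:150], rows[150:]
        per_set.append(accuracy(fit(train), test))
    scores[name] = statistics.mean(per_set)

# Step 3: the data owner applies the best candidate to the original data.
best_name = max(scores, key=scores.get)
final_model = CANDIDATES[best_name](original)
original_accuracy = accuracy(final_model, original)
```

Only step 3 touches the original records, so outside developers and researchers can take part in steps 1 and 2 without ever seeing them.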

The federal government is using innovative methods to better develop and use its vast data resources. From harnessing the power of the Internet of (sensor) Things to protecting citizens’ privacy while drawing valuable insights from synthetic data, federal agencies are leading the way in cutting-edge data methods. Federal agencies are also pioneering new ways of crowdsourcing data collection, data analysis, and app development. In the coming months, more federal hackathons will showcase future federal agency innovations.

Each week, The Data Briefing showcases the latest federal data news and trends. Visit this blog every week to learn how data is transforming government and improving government services for the American people. If you have ideas for a topic or have questions about government data, please contact me via email.

Dr. William Brantley is the Training Administrator for the U.S. Patent and Trademark Office’s Global Intellectual Property Academy. You can find out more about his personal work in open data, analytics, and related topics at All opinions are his own and do not reflect the opinions of the USPTO or GSA.