2019 Data Science for the Public Good Program

DSPG Research Projects

Skilled Technical Workforce: Demand, Supply, and Pathways
DSPG: Alyssa Fowers, Calvin Isch, and Sarah McDonald
SDAD Mentors: Vicki Lancaster and Samantha Cohen
Sponsor: National Science Foundation (NSF) National Center for Science and Engineering Statistics (NCSES)

People in positions that require a high level of technical knowledge, but not a college education, are part of the skilled technical workforce. Workers in these positions are critical to the success of the American economy. This project explores the variety of skill formation pathways for the skilled technical workforce in Virginia by evaluating local job markets, current workforce characteristics, and training opportunities.


Measuring the Universe of Open Source Software (OSS)
DSPG: Cong Cong, Calvin Isch, and Eliza Tobin
SDAD Mentors: Gizem Korkmaz, J Bayoán Santiago Calderón, Aaron Schroeder, and Brandon Kramer
Sponsor: National Science Foundation (NSF) National Center for Science and Engineering Statistics (NCSES)

Open source software (OSS) is a key part of our economy, yet it is currently not measured because it is ‘free’ and therefore not captured in the nation’s Gross Domestic Product. NCSES would like to better understand the contribution and value of OSS in order to give policy makers a more comprehensive picture of science and engineering indicators. In this project, various data sources, including online repositories such as GitHub and databases of copyrights, trademarks, and patents, are examined to assess whether they provide useful information about the quantity and value of OSS.


Detecting Pharmaceutical Innovations in News Articles Using Machine Learning
DSPG: Quinton Neville and Raghav Sawhney
SDAD Mentors: Devika Nair, Gizem Korkmaz, and Thomas Neil Alexander Kattampallil
Sponsor: National Science Foundation (NSF) National Center for Science and Engineering Statistics (NCSES)

NCSES measures company R&D and innovation and compares them with those of countries around the world as indicators of economic competition and growth. The measurement is based on the annual Business R&D and Innovation Survey (BRDIS). This project aims to develop machine learning techniques to assess the feasibility of measuring business innovation in the US pharmaceutical industry using non-survey data, with a focus on stories in newspapers, trade journals, newsletters, and other sources.


Measuring the Public Funding of R&D: A Feasibility Study
DSPG: Alyssa Fowers and Sean Pietrowicz
SDAD Mentors: Joel Thurston, Samantha Cohen, and Stephanie Shipp
Sponsor: National Science Foundation (NSF) National Center for Science and Engineering Statistics (NCSES)

Federal (public) funding of research and development (R&D) accounts for 25% of total R&D funding. Using publicly available sources of administrative data, the project is documenting the quality and availability of data on public funding of R&D to universities and nonprofit organizations. The goal is to assess the feasibility of enhancing and supplementing the current collection of these data through the NSF Survey of Federal Science and Engineering Support to Universities, Colleges, and Nonprofit Institutions (Fed Support) (FSS).


Broadband Data Validation: Comparing U.S. Broadband Coverage
DSPG: Kateryna Savchyn, Sarah McDonald, and Raghav Sawhney
SDAD Mentors: Teja Pristavec, Josh Goldstein, and Stephanie Shipp
Sponsor: United States Department of Agriculture (USDA) Economic Research Service (ERS)

USDA would like to know whether broadband development has an effect on rural prosperity, as measured by changes in property values and quality of life indicators in rural communities. This DSPG project has two parts that will inform the overall project in the Social and Decision Analytics Division. The first is evaluating and benchmarking the quality of the Federal Communications Commission (FCC) Form 477 data on broadband availability and subscription rates. The second is identifying which factors other than broadband predict property values.


Fairfax County CommunityScapes
DSPG: Cong Cong, Quinton Neville, Eliza Tobin, and Victoria Halewicz
SDAD Mentors: Teja Pristavec, Joy Tobin, Stephanie Shipp, Josh Goldstein, and Brandon Kramer
Sponsors: Fairfax County Health and Human Services and Inova Translational Medicine Institute

We partnered with Fairfax County and Inova Translational Medicine Institute to better understand regional health and well-being through a data science framework. Our research team used an interdisciplinary, data-driven approach to discover, acquire, and statistically integrate publicly available local data and create CommunityScapes – social, economic, environmental, and well-being indicators at sub-county levels. We developed obesogenic environment exposure and economic vulnerability composite indices, mapping social determinants of health that influence neighborhood well-being. Doing so allowed us to contextualize where citizens live, learn, work and play at smaller geographic levels that are actionable for stakeholders, establish a baseline for measuring change, and promote informed policy in Fairfax County.


Measuring Community Embeddedness Near Army Installations: A Feasibility Study
DSPG: Jessica Keast, Sean Pietrowicz, and Allegra Pocinki
SDAD Mentors: Stephanie Shipp, Joel Thurston, Brandon Kramer, Vicki Lancaster, Josh Goldstein, and Nathaniel Ratcliff
Sponsor: Army Research Institute (ARI)

Community embeddedness is the degree to which Soldiers are connected (or not connected) to the larger community and nation that they serve. This feasibility study aims to better understand community embeddedness near two large Army posts in Virginia and Oklahoma and their surrounding counties by developing composite indices using the social determinants of health as the conceptual basis. This DSPG project supports a broader research effort that SDAD is conducting to define and quantify the social behaviors of Soldiers by identifying social constructs that capture Soldier, team, and unit performance.


Economic and Social Impact of Arlington Restaurant Initiative
DSPG: Victoria Halewicz, Kateryna Savchyn, and Jessica Keast
SDAD Mentors: J Bayoán Santiago Calderón, Gizem Korkmaz, and Aaron Schroeder
Sponsor: Arlington County Police Department

The Arlington Restaurant Initiative (ARI), an initiative by the Arlington County Police Department (ACPD), aims to foster a safe environment for customers, businesses, and neighborhood residents in areas with nightlife and entertainment. Building on earlier research, the DSPG team is conducting a cost-benefit analysis of the program. The strategy for measuring the socio-economic benefit of the program is to estimate its effect on occurrences of alcohol-related crimes (by type) and offenses. These estimated effects will be multiplied by an estimated social cost of each occurrence and discounted to obtain a monetary equivalent. The estimated benefit of the program will be a vital component in providing the ACPD with actionable information for policy making and budget allocation.


Returning Citizens Re-entry Program
DSPG: Jessica Keast and Allegra Pocinki
SDAD Mentors: Nathaniel Ratcliff, Joel Thurston, J Bayoán Santiago Calderón, and Aaron Schroeder
Sponsors: Virginia Department of Corrections and University of Virginia, Darden School of Business

Resilience Education at the University of Virginia sends MBA volunteer instructors into prisons to educate qualifying prisoners and prepare them for improved re-entry to society after release. Through a literature review and data discovery to identify data sources, the DSPG team is evaluating the feasibility of measuring the pre- and post-release effects of education during incarceration, including its impact on recidivism.


DSPG Young Scholar Participants