
Inventory of USDA Artificial Intelligence Use Cases

Artificial intelligence (AI) promises to drive the growth of the United States economy and improve the quality of life of all Americans. Executive Order (EO) 13960, Promoting the Use of Trustworthy Artificial Intelligence in the Federal Government, directed federal agencies to inventory their Artificial Intelligence (AI) use cases and share their inventories with other government agencies and the public. As stated in the Executive Order, federal applications of Artificial Intelligence should benefit the US economy and improve the quality of life of all Americans. As such, the growing adoption of AI must coincide with the launch of practices that ensure AI is deployed in a manner that fosters public trust and protects the rights and values of the American people.

In alignment with Executive Order 13960 of December 8, 2020, the Department of Agriculture has prepared an inventory of its artificial intelligence use cases, including current and planned uses, consistent with the agency's mission. The table below summarizes those AI use cases.

USDA AI Inventory (updated July 2022)

Agency

Use Case

Summary of AI Use Case

Agricultural Research Service

4% Repair Dashboard

The model reviews the descriptions of expenses tagged to repairs and maintenance and classifies expenses as "repair" or "not repair" based on keywords in context.
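As a rough illustration of keyword-in-context classification, the sketch below labels an expense description as "repair" when a repair-related keyword appears without an excluding term nearby. The keyword lists, the context window size, and the function name are hypothetical and are not taken from the ARS model.

```python
import re

# Hypothetical keyword lists; the vocabulary of the actual ARS model is not given here.
REPAIR_KEYWORDS = {"repair", "repairs", "repaired", "fix", "replace", "replaced", "maintenance"}
EXCLUDING_CONTEXT = {"new", "purchase", "installation", "training"}
WINDOW = 3  # number of tokens on each side of a keyword that count as its context


def classify_expense(description: str) -> str:
    """Return "repair" or "not repair" for one expense description."""
    tokens = re.findall(r"[a-z]+", description.lower())
    for i, token in enumerate(tokens):
        if token not in REPAIR_KEYWORDS:
            continue
        # Look at the words immediately before and after the keyword.
        context = tokens[max(0, i - WINDOW): i] + tokens[i + 1: i + 1 + WINDOW]
        if not any(word in EXCLUDING_CONTEXT for word in context):
            return "repair"
    return "not repair"
```

Under these assumed lists, "Repair of greenhouse HVAC compressor" is labeled "repair", while "Purchase of new maintenance manuals" is labeled "not repair" because excluding terms appear next to the keyword.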

Agricultural Research Service

ARS Project Mapping

Natural language processing (NLP) of research project plans, including term analysis and clustering, enables national program leaders to work with an interactive dashboard to find synergies and patterns within and across the various ARS research program portfolios.

Agricultural Research Service

NAL Automated indexing

The vendor system Cogito uses machine learning to index publication abstracts and project proposals using terms from the USDA National Agricultural Library Thesaurus.

Animal and Plant Health Inspection Service

Forecasting Grasshopper Outbreaks in the Western United States using Machine Learning Tools

Machine learning tools help us integrate historic grasshopper survey data and grasshopper biology with environmental covariates (e.g., climate, soil, and topography) to generate grasshopper outbreak forecasts for the 17 contiguous states in the western United States. The Maxent machine learning tool helps us generate potential distribution maps for the 12 most damaging and widespread rangeland pest grasshopper species in the western United States.

Agricultural Research Service

Facial recognition

Facial recognition is used as one of several factors for access to secure areas of a facility.

Economic Research Service

Coleridge Initiative, Show US the Data

The purpose of this project is to use AI tools, specifically machine learning and natural language processing, to understand how publicly funded data and evidence are used to serve science and society.

Economic Research Service

Westat

A competition to find automated, yet effective, ways of linking USDA nutrition information to 750K food items in a proprietary data set of food purchases and acquisitions.

Farm Production and Conservation (FPAC)

Land Change Analysis Tool (LCAT)

We employ a random forest machine learning classifier to produce high-resolution land cover maps from aerial and/or satellite imagery. Training data is generated from a custom-built web application. We built and operate a 192-node Docker cluster to parallelize CPU-intensive processing tasks. We are publishing results through a publicly available image service. To date, we have mapped over 600 million acres and have generated over 700 thousand training samples.

Food and Nutrition Service

Retailer Receipt Analysis

The Retailer Receipt Analysis is a Proof of Concept (POC) that applies Optical Character Recognition (OCR), an application of artificial intelligence, to a sample (no more than 1000) of FNS receipt and invoice data. Consultants will use this data to demonstrate how the existing manual process can be automated, saving staff time, ensuring accurate review, and detecting difficult patterns. This POC will pave the way for a review system that (1) has an automated workflow and learns from analyst feedback and (2) can incorporate known SNAP fraud patterns, look for new patterns, and visualize alerts on these patterns on retailer invoices and receipts.

Forest Service

Ecosystem Management Decision Support System (EMDS)

EMDS is a spatial decision support system for landscape analysis and planning that runs as a component of ArcGIS and QGIS. Users develop applications for their specific problem that may use any combination of four AI engines for 1) logic processing, 2) multi-criteria decision analysis, 3) Bayesian networks, and 4) Prolog-based decision trees.

Forest Service

Wildland Urban Interface - Mapping Wildfire Loss

This is a proof-of-concept study to investigate the use of machine learning (deep learning / convolutional neural networks) and object-based image classification techniques to identify buildings, building loss, and defensible space around buildings before and after a wildfire event in wildland-urban interface settings.

Forest Service

National Land Cover Database (NLCD) Tree Canopy Cover Mapping

As a member of the interagency Multi-Resolution Land Characteristics (MRLC) consortium, the USDA Forest Service (FS) is responsible for producing tree canopy cover (TCC) maps with consistent spatial resolution (30 meters), using methods that are consistently applied across the United States. The forest structure maps are generated using over 60,000 training plots with a probabilistic sample design to train statistical machine learning models (random forests) to estimate continuous tree canopy cover (0-100%). Summary and background are available through peer-reviewed papers and reports.

Forest Service

The BIGMAP Project

The project uses machine learning, along with features derived from dense time series of Landsat imagery as well as climatic and topographic data, to impute attributes from FIA's national forest inventory database to 30-meter pixels to produce raster maps of forest resources of the contiguous United States.

Forest Service

DISTRIB-II: Habitat Suitability of Eastern United States Trees

Habitat suitability is modeled for 125 eastern United States tree species under 1981-2010 climate conditions and 8 projected future conditions (2070-2099). Inputs include downscaled climate conditions from general circulation models, elevation, and soil type. Outputs include importance values, derived from tree basal area and number of stems, for the species, modeled under current conditions and the 8 projected future conditions. The AI provides insight into options for managing eastern U.S. forests.

Forest Service

CLT Knowledge Database

The CLT knowledge database catalogs cross-laminated timber information in an interface that helps users find relevant information. The information system uses data aggregator bots that search the internet for relevant information. These bots search for hundreds of keywords and use machine learning to determine if what is found is relevant. The search engine uses intelligent software to locate and update pertinent CLT references, as well as categorize information with respect to common application and interest areas. As of 2/24/2022, the CLT knowledge database has cataloged >3,600 publications on various aspects of CLT. This system fosters growth of mass timber markets by disseminating knowledge and facilitating collaboration among stakeholders, and by reducing the risk of duplication of efforts. Manufacturers, researchers, design professionals, code officials, government agencies, and other stakeholders directly benefit from the tool, thereby supporting the increasing use of mass timber, which benefits forest health by increasing the economic value of forests.

Forest Service

RMRS Raster Utility

RMRS Raster Utility is a .NET object-oriented library that simplifies data acquisition, raster sampling, and statistical and spatial modeling while reducing the processing time and storage space associated with raster analysis. It includes machine learning techniques.

Forest Service

TreeMap 2016

TreeMap 2016 provides a tree-level model of the forests of the conterminous United States. It matches forest plot data from Forest Inventory and Analysis (FIA) to a 30x30 meter (m) grid. TreeMap 2016 is being used in both the private and public sectors for projects including fuel treatment planning, snag hazard mapping, and estimation of terrestrial carbon resources. A random forests machine-learning algorithm was used to impute the forest plot data to a set of target rasters provided by Landscape Fire and Resource Management Planning Tools (LANDFIRE). Predictor variables consisted of percent forest cover, height, and vegetation type, as well as topography (slope, elevation, and aspect), location (latitude and longitude), biophysical variables (photosynthetically active radiation, precipitation, maximum temperature, minimum temperature, relative humidity, and vapor pressure deficit), and disturbance history (time since disturbance and disturbance type) for the landscape circa 2016.

Forest Service

Landscape Change Monitoring System (LCMS)

The Landscape Change Monitoring System (LCMS) is a national Landsat/Sentinel remote sensing-based dataset produced by the USDA Forest Service for mapping and monitoring changes related to vegetation canopy cover, as well as land cover and land use. The process uses temporal change classifications together with training data in a supervised classification process for vegetation gain and loss, as well as land cover and land use.

Forest Service

Forest Health Detection Monitoring

Machine learning models are used (1) to upscale training data collected from both the field and high-resolution imagery, using Sentinel-2, Landsat, MODIS, and lidar imagery, in order to map and monitor stages of forest mortality and defoliation across the United States, and (2) to post-process raster outputs to vector polygons.

Forest Service

Land Cover Data Development

Existing cover data is a key data set for natural resource characterization, and standard methods exist that can readily use the many different models available. Developing existing natural resource data (e.g., land cover data, digital soil data, land use data) initially involves the derivation of a map unit classification (the data labels). Then, using training sites (field and in situ derived) as reference data, supervised classification methods (primarily Random Forests) are applied with remotely sensed satellite imagery (e.g., Landsat, Sentinel), aerial imagery, and other landscape variables (e.g., digital elevation derivations, soils data, geology data) to label segments or pixels with land attributes for a landscape or study area, in both local and national applications. Random forest, a supervised algorithm consisting of many decision trees, is the main tool used for labeling the landscape.

National Agricultural Statistics Service

Cropland Data Layer

A machine learning algorithm is used to interpret readings from satellite-based sensors and classify the type of crop or activity that falls in each 30-meter pixel (a box of fixed size) on the ground. The algorithms are trained on data from USDA's Farm Service Agency and other sources of "ground truth". This allows us not only to produce a classification but also to assess its accuracy. For commodities like corn and soybeans, the CDL is highly accurate. The CDL has been produced with national coverage since 2008. Summary and background information about the CDL is available in a number of peer-reviewed research papers and presentations:
https://www.nass.usda.gov/Research_and_Science/Cropland/othercitations/index.php

National Agricultural Statistics Service

List Frame Deadwood Identification

The deadwood model uses boosted regression trees, with inputs such as administrative linkage data, frame data, and historical response information, to produce a propensity score representing the relative likelihood that a farm operation is out of business. Common tree splits were identified using the model and combined with expert knowledge to develop a recurring process for deadwood cleanup.

National Institute of Food and Agriculture

Climate Change Classification NLP

The model uses natural language processing techniques to classify NIFA-funded projects as climate change related or not climate change related. The model input features include text fields containing the project's title, non-technical summary, objectives, and keywords. The target is a dummy variable that marks each project as climate change related or not climate change related.

Office of Safety, Security, and Protection

Video Surveillance System

The Video Surveillance System (VSS) design will include a video management system (VMS), NVRs, DVRs, encoders, fixed cameras, pan-and-tilt cameras, network switches, routers, IP cables, equipment racks, and mounting hardware. The VSS shall control multiple sources of video surveillance subsystems to collect, manage, and present video clearly and concisely. The VMS shall integrate the capabilities of each subsystem across single or multiple sites, allowing video management of any compatible analog or digital video device through a unified configuration platform and viewer. Disparate video systems are normalized and funneled through a shared video experience. Operators can drag and drop cameras from the Security Management System hardware tree into VMS views and use Security Management System alarm integration and advanced features that help track a target through a set of sequential cameras, with a simplified method for selecting a new central camera and surrounding camera views.

Office of the Chief Information Officer

Acquisition Approval Request Compliance Tool

A natural language processing (NLP) model was developed to utilize the text in procurement header and line descriptions within USDA's Integrated Acquisition System (IAS) to determine the likelihood that an award is IT-related, and therefore might require an AAR. The model uses the text characteristics for awards that have an AAR number entered into IAS and then calculates the probability of being IT-related for those procurements that did not have an AAR Number entered in IAS.
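The idea of scoring procurement text for the likelihood of being IT-related can be sketched with a small naive Bayes classifier. This is only an illustrative stand-in: the inventory does not say which NLP model USDA uses, and the class name, the example descriptions, and the use of a labeled set of non-IT descriptions are all assumptions made for the sketch.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesTextScorer:
    """Two-class multinomial naive Bayes with add-one smoothing (hypothetical helper)."""

    def fit(self, it_texts, other_texts):
        self.counts = {
            "it": Counter(t for text in it_texts for t in tokenize(text)),
            "other": Counter(t for text in other_texts for t in tokenize(text)),
        }
        self.totals = {label: sum(c.values()) for label, c in self.counts.items()}
        self.vocab_size = len(set(self.counts["it"]) | set(self.counts["other"]))
        n_docs = len(it_texts) + len(other_texts)
        self.log_prior = {
            "it": math.log(len(it_texts) / n_docs),
            "other": math.log(len(other_texts) / n_docs),
        }
        return self

    def _log_score(self, label, tokens):
        denom = self.totals[label] + self.vocab_size
        return self.log_prior[label] + sum(
            math.log((self.counts[label][t] + 1) / denom) for t in tokens
        )

    def probability_it(self, text: str) -> float:
        """Estimated probability that a description is IT-related."""
        tokens = tokenize(text)
        log_it = self._log_score("it", tokens)
        log_other = self._log_score("other", tokens)
        # Normalize the two log scores in a numerically stable way.
        m = max(log_it, log_other)
        exp_it = math.exp(log_it - m)
        exp_other = math.exp(log_other - m)
        return exp_it / (exp_it + exp_other)


# Made-up training text: descriptions of awards with an AAR number (treated as
# IT-related) and, as an assumption, descriptions known not to be IT-related.
it_texts = [
    "laptop computers and docking stations",
    "software license renewal",
    "network switch and router maintenance",
]
other_texts = [
    "laboratory glassware and reagents",
    "tractor tires and hydraulic fluid",
    "office furniture delivery",
]
scorer = NaiveBayesTextScorer().fit(it_texts, other_texts)
p_software = scorer.probability_it("annual software license for laptop")
p_tractor = scorer.probability_it("tractor hydraulic fluid")
```

With this toy data, `p_software` comes out well above 0.5 and `p_tractor` well below it, so procurements without an AAR number could be ranked by score for review.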

USDA Natural Resources Conservation Service (NRCS) Snow Survey and Water Supply Forecast (SSWSF) program

Operational water supply forecasting for western US rivers

Western US water management is underpinned by forecasts of spring-summer river flow volumes made using operational hydrologic models. The USDA Natural Resources Conservation Service (NRCS) National Water and Climate Center (NWCC) operates the largest such forecast system regionally, carrying on a nearly century-old tradition. The NWCC recently developed a next-generation prototype for generating such operational water supply forecasts (WSFs), the multi-model machine-learning metasystem (M4), which integrates a variety of AI and other data-science technologies carefully chosen or developed to satisfy specific user needs. The required inputs are snow and precipitation data from the SNOTEL environmental monitoring network of the NRCS Snow Survey and Water Supply Forecast program, though the input set is flexible. In hindcasting test cases spanning diverse environments across the western US and Alaska, out-of-sample accuracy improved markedly over current benchmarks. Various technical design elements, including multi-model ensemble modeling, autonomous machine learning (AutoML), hyperparameter pre-calibration, and theory-guided data science, collectively permitted automated training and operation. Live operational testing at a subset of sites additionally demonstrated the logistical feasibility of workflows, as well as the geophysical explainability of results in terms of known hydroclimatic processes, belying the black-box reputation of machine learning and enabling relatable forecast storylines for NRCS customers.
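The multi-model ensemble idea mentioned above can be illustrated with a minimal sketch that combines several member forecasts into one value plus a spread. The inventory does not describe M4's member models or its combination rule, so the unweighted mean, the model names, and the volumes below are hypothetical.

```python
from statistics import mean


def ensemble_forecast(member_forecasts: dict[str, float]) -> dict[str, float]:
    """Combine member forecasts into an unweighted mean and a min-max spread."""
    values = list(member_forecasts.values())
    return {"mean": mean(values), "min": min(values), "max": max(values)}


# Hypothetical seasonal flow volumes from three member models, in arbitrary units.
members = {"model_a": 410.0, "model_b": 455.0, "model_c": 395.0}
result = ensemble_forecast(members)
```

Reporting the spread alongside the mean gives users a simple sense of how much the member models disagree.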