Risk-Screening Environmental Indicators (RSEI) Geographic Microdata from the United States Environmental Protection Agency (EPA) is available for anyone to use via Amazon S3.
EPA’s RSEI Geographic Microdata is a unique dataset that provides detailed air model results from EPA’s Risk-Screening Environmental Indicators (RSEI) model. RSEI models chemical releases and transfers reported to EPA’s Toxics Release Inventory (TRI), which tracks toxic chemical releases and waste management activities at industrial and federal facilities across the United States and its territories. RSEI contains results for over 400 TRI-listed chemicals and 45,000 TRI-reporting facilities.
You can learn more about this data at the RSEI website, and learn more about TRI reporting at the Toxics Release Inventory Program website.
The results include chemical concentration, toxicity-weighted concentration, and score, calculated for each 810-meter-square grid cell in a 49-km circle around the emitting facility, for every year from 1988 through 2014. As with any model, RSEI is subject to the limitations of the underlying data sources and models that it incorporates. You should carefully consider the impact that the RSEI method may have on the results of any analysis. The RSEI Microdata Guidance contains important information about using the RSEI Microdata, and the RSEI methodology document and appendices contain full documentation of the methods and data sources used in the RSEI model.
The data can be used to examine trends in air pollution from industrial facilities over time and across geographies. Users can examine relationships between RSEI impacts and population demographics for environmental justice analyses. RSEI data are unique because they provide a fully comparable 27-year time series, as well as a national matched source-receptor relationship – that is, users can match the estimated impact on an area to the facilities responsible.
RSEI Microdata is available in the epa-rsei-pds Amazon S3 bucket in the US East region.
If you use the AWS Command Line Interface, you can list the contents of the bucket with this command:
aws s3 ls epa-rsei-pds
This information is also available in the listing for RSEI Microdata on the Registry of Open Data on AWS.
A RSEI Microdata file is a CSV file in which each row represents the metrics for a single chemical release as it affects a single grid cell. RSEI data on AWS provides files that include all Microdata for each year, as well as files that include Microdata broken down by state for each year. All files are compressed using gzip.
Annual RSEI Microdata files are available in s3://epa-rsei-pds/v234/microdata and are named vxxx_micro_yyyy.csv.gz, where xxx refers to the data version number and yyyy refers to the year. For example, the file for 1989 RSEI data is at s3://epa-rsei-pds/v234/microdata/v234_micro_1989.csv.gz.
RSEI Microdata files broken down by state are available in s3://epa-rsei-pds/v234/microdata/<two-character state identifier> and are named vxxx_micro_state_ss_yyyy.csv.gz, where xxx refers to the data version number, ss refers to the state, and yyyy refers to the year.
For example, the file for Utah’s 2002 RSEI data is at s3://epa-rsei-pds/v234/microdata/ut/v234_micro_state_ut_2002.csv.gz.
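The naming patterns above can be turned into a small helper. This is a minimal sketch, not an official client: the function names `annual_key`, `state_key`, and `s3_uri` are hypothetical, the default version `v234` is taken from the examples above, and lowercasing the state identifier is an assumption based on the Utah example.

```python
def annual_key(year: int, version: str = "v234") -> str:
    """Object key of the nationwide Microdata file for one year."""
    return f"{version}/microdata/{version}_micro_{year}.csv.gz"


def state_key(state: str, year: int, version: str = "v234") -> str:
    """Object key of the Microdata file for one state and one year."""
    # Assumption: state identifiers are lowercase in keys, as in the "ut" example.
    ss = state.lower()
    return f"{version}/microdata/{ss}/{version}_micro_state_{ss}_{year}.csv.gz"


def s3_uri(key: str, bucket: str = "epa-rsei-pds") -> str:
    """Full S3 URI for a key in the RSEI bucket."""
    return f"s3://{bucket}/{key}"


print(s3_uri(annual_key(1989)))
print(s3_uri(state_key("UT", 2002)))
```

The two printed URIs should match the 1989 and Utah 2002 examples given above.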
A list of state identifiers is at s3://epa-rsei-pds/v234/data_tables/states.csv.
Each row of a Microdata file refers to a single 810 m grid cell and contains the score, concentration, and toxicity-weighted concentration for a single chemical release. Note that if two releases of the same chemical (either from different facilities, or a stack release and a fugitive release from the same facility) affect the same grid cell, each grid cell/release combination gets its own row.
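Because one grid cell can appear in several rows, a per-cell figure has to be aggregated across releases. The sketch below sums the Score column by GridCode (both columns are described in the table that follows). The rows are made-up values for illustration only, the function name `total_score_by_cell` is hypothetical, and whether a simple sum is the right aggregate for a given analysis is an assumption to check against the RSEI Microdata Guidance.

```python
from collections import defaultdict

# Made-up rows for illustration only: (GridCode, ReleaseNumber, Score)
rows = [
    (1001, 1, 2.0),
    (1001, 2, 0.5),   # a second release reaching the same grid cell
    (1002, 1, 1.25),
]


def total_score_by_cell(rows):
    """Sum Score over all releases that affect each grid cell."""
    totals = defaultdict(float)
    for grid_code, _release_number, score in rows:
        totals[grid_code] += score
    return dict(totals)


totals = total_score_by_cell(rows)
print(totals)
```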
Each Microdata CSV contains 13 columns, described in the table below.
Column Number | Name | Description |
---|---|---|
1 | GridCode | Identifies grid. |
2 | X | X-coordinate of grid. |
3 | Y | Y-coordinate of grid. |
4 | ReleaseNumber | Internal unique identifier for release. Identifiers are described in the "Release" lookup table (46 MB CSV compressed using gzip). |
5 | ChemicalNumber | Internal unique identifier of released chemical. Identifiers are described in the "Chemical" lookup table (208 KB CSV). |
6 | FacilityNumber | Internal unique identifier of releasing facility. |
7 | Media | Code describing media into which chemical is released. Codes are described in the "Media" lookup table (2 KB CSV). |
8 | Conc | Concentration of chemical for release/media at grid cell. |
9 | ToxConc | Concentration multiplied by inhalation toxicity weight. |
10 | Score | Risk-related score (surrogate dose × toxicity weight × population). |
11 | ScoreCancer | Risk-related score (surrogate dose × toxicity weight × population) using only toxicity values for cancer effects. |
12 | ScoreNonCancer | Risk-related score (surrogate dose × toxicity weight × population) using only toxicity values for noncancer effects. |
13 | Pop | Number of people in grid cell (may be interpolated). |
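As a rough illustration of the layout in the table above, the following sketch reads a gzip-compressed Microdata CSV with Python's standard library and labels each field by column name. It assumes the files carry no header row and that columns appear in the order listed (pass `has_header=True` if a file turns out to have one). The function name `read_microdata` is hypothetical, and the sample row is made up rather than taken from real data.

```python
import csv
import gzip
import io

# Column names in the order given in the table above.
COLUMNS = [
    "GridCode", "X", "Y", "ReleaseNumber", "ChemicalNumber",
    "FacilityNumber", "Media", "Conc", "ToxConc", "Score",
    "ScoreCancer", "ScoreNonCancer", "Pop",
]


def read_microdata(fileobj, has_header=False):
    """Yield one dict per row from a gzip-compressed Microdata CSV stream."""
    with gzip.open(fileobj, mode="rt", newline="") as text:
        reader = csv.reader(text)
        if has_header:
            next(reader, None)  # skip the header line if one is present
        for row in reader:
            if len(row) != len(COLUMNS):
                raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
            yield dict(zip(COLUMNS, row))


# Made-up sample row, compressed in memory to stand in for a downloaded file.
sample = "1001,10,20,1,2,3,1,0.5,1.5,2.0,1.0,1.0,12.0\n"
compressed = io.BytesIO(gzip.compress(sample.encode("utf-8")))
records = list(read_microdata(compressed))
print(records[0]["GridCode"], records[0]["Score"])
```

Values are returned as strings; convert the numeric columns (for example with `float`) before doing arithmetic on them.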
The Microdata uses internal identifiers that link to subject lookup tables that contain names and parameters used for chemicals, facilities, etc. Some of those tables are linked to in the table above, and all lookup tables can be found at s3://epa-rsei-pds/v234/data_tables. Field names and descriptions can be found in the RSEI data dictionary.