pyrsig package¶
Subpackages¶
Submodules¶
pyrsig.bin module¶
pyrsig.utils module¶
- pyrsig.utils.customize_grid(grid_kw, bbox, clip=True)[source]¶
Redefine grid_kw to cover bbox by removing extra rows and columns and redefining XORIG, YORIG, NCOLS and NROWS.
- Parameters:
grid_kw (dict or str) – If str, must be a known grid in default grids. If dict, must include all IOAPI grid metadata properties
bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)
clip (bool) – If True, limit grid to original grid bounds
- Returns:
ogrid_kw – IOAPI grid metadata properties with XORIG/YORIG and NCOLS/NROWS adjusted such that it only covers bbox or (if clip) only covers the portion of bbox covered by the original grid_kw.
- Return type:
dict
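The row and column arithmetic described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `trim_grid` is a hypothetical helper, and it assumes the bounding box has already been projected into the grid's own coordinates (meters), whereas the documented function takes longitude and latitude in degrees.

```python
import math


def trim_grid(grid_kw, xy_bbox, clip=True):
    """Return a copy of grid_kw trimmed to cover xy_bbox (hypothetical helper).

    xy_bbox is (xmin, ymin, xmax, ymax) in projected grid units. That is an
    assumption of this sketch; customize_grid itself takes lon/lat degrees.
    """
    xorig, yorig = grid_kw["XORIG"], grid_kw["YORIG"]
    xcell, ycell = grid_kw["XCELL"], grid_kw["YCELL"]
    xmin, ymin, xmax, ymax = xy_bbox
    # Cell edges, counted from the original origin, that bracket the bbox.
    c0 = math.floor((xmin - xorig) / xcell)
    r0 = math.floor((ymin - yorig) / ycell)
    c1 = math.ceil((xmax - xorig) / xcell)
    r1 = math.ceil((ymax - yorig) / ycell)
    if clip:
        # Never extend beyond the rows and columns of the original grid.
        c0, r0 = max(c0, 0), max(r0, 0)
        c1, r1 = min(c1, grid_kw["NCOLS"]), min(r1, grid_kw["NROWS"])
    out = dict(grid_kw)
    out.update(
        XORIG=xorig + c0 * xcell,
        YORIG=yorig + r0 * ycell,
        NCOLS=c1 - c0,
        NROWS=r1 - r0,
    )
    return out


# 12US1-like dimensions, used here only as sample input.
grid = dict(XORIG=-2556000.0, YORIG=-1728000.0, XCELL=12000.0,
            YCELL=12000.0, NCOLS=459, NROWS=299)
small = trim_grid(grid, (-30000.0, -30000.0, 30000.0, 30000.0))
print(small["XORIG"], small["YORIG"], small["NCOLS"], small["NROWS"])
```

With `clip=False`, the same arithmetic can move the origin outside the original grid, so the result may grow rather than shrink.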
- pyrsig.utils.get_file(url, outpath, maxtries=5, verbose=1, overwrite=False)[source]¶
Download file from RSIG using fault tolerance and optional caching when overwrite is False.
- Parameters:
url (str) – path to retrieve
outpath (str) – path to save file to
maxtries (int) – try this many times before quitting
verbose (int) – Level of verbosity
overwrite (bool) – If True, overwrite existing files. If False, reuse existing files.
- Return type:
None
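The combination of retries and caching described above might look like the sketch below. `fetch_with_retries` and its `fetch` and `wait` arguments are hypothetical names introduced for illustration; the real function performs the download itself and may handle errors differently.

```python
import os
import time


def fetch_with_retries(fetch, url, outpath, maxtries=5, overwrite=False, wait=0.0):
    """Call fetch(url, outpath) up to maxtries times (hypothetical helper).

    If outpath already exists and overwrite is False, nothing is downloaded.
    """
    if os.path.exists(outpath) and not overwrite:
        return  # reuse the cached copy
    last_error = None
    for _attempt in range(maxtries):
        try:
            fetch(url, outpath)
            return
        except OSError as err:  # urllib.error.URLError is an OSError subclass
            last_error = err
            time.sleep(wait)
    raise RuntimeError(f"failed after {maxtries} tries: {url}") from last_error
```

A standard-library downloader such as `urllib.request.urlretrieve` has the `(url, filename)` call shape expected for `fetch`.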
- pyrsig.utils.get_proj4(attrs, earth_radius=6370000.0)[source]¶
Create a proj4 formatted grid definition using IOAPI attrs and earth_radius
- Parameters:
attrs (dict-like) – Mappable of IOAPI properties that supports the items method
earth_radius (float) – Assumed radius of the earth. 6370000 is the WRF default.
- Returns:
projstr – proj4 formatted string such that the domain southwest corner starts at (0, 0) and ends at (NCOLS, NROWS)
- Return type:
str
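One way to get a projection whose southwest corner is (0, 0) and whose northeast corner is (NCOLS, NROWS) is to use the negated origin as the false easting and northing and the cell size as the unit. The sketch below assumes a Lambert conformal grid (IOAPI GDTYP 2) with square cells; `lcc_proj4` is a hypothetical name, and the exact string produced by the library may differ.

```python
def lcc_proj4(attrs, earth_radius=6370000.0):
    """Build a proj4 string for an IOAPI Lambert conformal grid (sketch)."""
    props = dict(attrs.items())
    if props["GDTYP"] != 2:
        raise ValueError("this sketch only handles Lambert conformal (GDTYP=2)")
    return (
        "+proj=lcc +lat_1={P_ALP} +lat_2={P_BET} +lat_0={YCENT} +lon_0={P_GAM}"
        " +R={R} +x_0={x_0} +y_0={y_0} +to_meter={XCELL} +no_defs"
    ).format(
        R=earth_radius,
        x_0=-props["XORIG"],  # shift so the southwest corner is x = 0
        y_0=-props["YORIG"],  # shift so the southwest corner is y = 0
        **props,
    )


# 12US1-like attributes, used here only as sample input.
attrs = dict(GDTYP=2, P_ALP=33.0, P_BET=45.0, P_GAM=-97.0, XCENT=-97.0,
             YCENT=40.0, XORIG=-2556000.0, YORIG=-1728000.0,
             XCELL=12000.0, YCELL=12000.0, NCOLS=459, NROWS=299)
print(lcc_proj4(attrs))
```

Because `+to_meter` is set to the cell size, projected coordinates are expressed in cells rather than meters.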
pyrsig.xdr module¶
- pyrsig.xdr.from_xdr(inf, na_values=None, decompress=False, as_dataframe=True)[source]¶
Currently supports profile, site and swath (v2.0). Each is in XDR format with a custom set of header rows in text format. The text header rows also describe the binary portion of the file.
Infers RSIG format using first 40 characters.
Site 2.0: from_site
Profile 2.0: from_profile
Swath 2.0: from_swath
Point 1.0: from_point
CALIPSO 1.0: from_calipso
Polygon 1.0: from_polygon
Grid 1.0: from_grid
Subset 9.0: from_subset
- Parameters:
inf (file) – Data file in XDR format with RSIG headers
na_values (scalar) – Used to remove known missing values.
decompress (bool) – If True, decompress to temporary file. If False, do not decompress to temporary file (was never compressed)
as_dataframe (bool) – If True (default), return data as a pandas.DataFrame. If False, return an xarray.Dataset. Only subset and grid support as_dataframe.
- Returns:
df – Dataframe with XDR content
- Return type:
pd.DataFrame
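The format inference described above amounts to a lookup on the first header line. The sketch below uses the format names and reader names listed in the description; the case-insensitive prefix match and the `infer_reader` helper are assumptions, not the library's actual logic.

```python
# Format tag (lower case) mapped to the reader named in the list above.
READERS = {
    "site 2.0": "from_site",
    "profile 2.0": "from_profile",
    "swath 2.0": "from_swath",
    "point 1.0": "from_point",
    "calipso 1.0": "from_calipso",
    "polygon 1.0": "from_polygon",
    "grid 1.0": "from_grid",
    "subset 9.0": "from_subset",
}


def infer_reader(header_bytes):
    """Return the reader name for the start of an RSIG XDR file (sketch)."""
    text = header_bytes[:40].decode("ascii", errors="replace")
    lines = text.splitlines()
    tag = lines[0].strip().lower() if lines else ""
    for prefix, reader in READERS.items():
        if tag.startswith(prefix):
            return reader
    raise ValueError(f"unrecognized RSIG format: {tag!r}")


print(infer_reader(b"Swath 2.0\nremaining header text"))
```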
- pyrsig.xdr.from_xdrfile(path, na_values=None, decompress=None, as_dataframe=True, decompress_inline=True)[source]¶
Currently supports profile, site and swath (v2.0). Each is in XDR format with a custom set of header rows in text format. The text header rows also describe the binary portion of the file.
- Parameters:
path (str) – Path to file in XDR format with RSIG headers
decompress (bool) – If None, decompress only if path ends in .gz. If True, decompress to temporary file. If False, do not decompress to temporary file (was never compressed).
decompress_inline (bool) – If True (default), use gzip.open to decompress and read the file. If False, decompress the file on disk.
as_dataframe (bool) – If True (default), return data as a pandas.DataFrame. If False, return an xarray.Dataset. Only subset and grid support as_dataframe.
- Returns:
df – Dataframe with XDR content
- Return type:
pd.DataFrame
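The interaction between `decompress=None` and inline decompression can be illustrated with the standard library alone. `open_xdr` is a hypothetical helper that covers only the inline case; it is a sketch of the documented behavior, not the library's code.

```python
import gzip


def open_xdr(path, decompress=None):
    """Open an XDR file for binary reading, decompressing inline if needed."""
    if decompress is None:
        # Infer compression from the file name, as described for decompress=None.
        decompress = path.endswith(".gz")
    if decompress:
        return gzip.open(path, "rb")  # decompresses while reading
    return open(path, "rb")
```

Inline decompression avoids writing a second, uncompressed copy of the file to disk.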
Module contents¶
- class pyrsig.RsigApi(key=None, bdate=None, edate=None, bbox=None, grid_kw=None, tropomi_kw=None, purpleair_kw=None, viirsnoaa_kw=None, tempo_kw=None, pandora_kw=None, calipso_kw=None, server='ofmpub.epa.gov', compress=1, corners=1, encoding=None, overwrite=False, workdir='.', gridfit=False)[source]¶
Bases:
object
RsigApi is a python-based interface to RSIG’s web-based API
- Parameters:
key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)
bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z
edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59
bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)
grid_kw (str or dict) –
If str, must be one of 12US1, 1US1, 12US2, 1US2, 36US3, 108NHEMI2, or 36NHEMI2, and will be used to set parameters based on EPA domains. If dict, must provide IOAPI mapping parameters. For details, look at the defaults:
import pyrsig; print(pyrsig.RsigApi().grid_kw)
The REGRID_AGGREGATE property defines how the regridded values are aggregated in time. Options are None (default), daily, or all.
viirsnoaa_kw (dict) – Dictionary of VIIRS NOAA filter parameters. Default is {‘minimum_quality’: ‘high’}; other options are ‘medium’ or ‘low’.
tropomi_kw (dict) – Dictionary of TROPOMI filter parameters. Default is {‘minimum_quality’: 75, ‘maximum_cloud_fraction’: 1.0}; valid ranges are 0-100 and 0-1, respectively.
purpleair_kw (dict) –
- Dictionary of purpleair filter parameters and api_key.
'out_in_flag': 0,  # options 0, 2, ''
'freq': 'hourly',  # options hourly, daily, monthly, yearly, none
'maximum_difference': 5,  # integer
'maximum_ratio': 0.70,  # float
'agg_pct': 75,  # 0-100
'default_humidity': 50,
'api_key': 'your_key_here'
tempo_kw (dict) –
- Dictionary of TEMPO filter parameters default
'api_key': 'your_key_here'  # 'password'
'minimum_quality': 'normal'
'maximum_cloud_fraction': 1.0
'maximum_solar_zenith_angle': 70.
pandora_kw (dict) – Dictionary of Pandora filter parameters. Default is {‘minimum_quality’: ‘high’}; other options are ‘medium’ or ‘low’.
calipso_kw (dict) – Dictionary of CALIPSO filter parameters. Default is {‘MINIMUM_CAD’: 20, ‘MAXIMUM_UNCERTAINTY’: 99}.
server (str) – ‘ofmpub.epa.gov’ for external users; ‘maple.hesc.epa.gov’ for users on the EPA VPN.
compress (int) – 1 to transfer files with gzip compression; 0 to transfer uncompressed files (slow).
encoding (dict) – If encoding is provided, netCDF files will be stored as NetCDF4 with encoding for all variables. If _FillValue is provided, it will not be applied to TFLAG and COUNT.
overwrite (bool) – If True, overwrite downloaded files in workdir. If False, reuse downloaded files in workdir.
workdir (str) – Working directory (must exist) defaults to ‘.’
gridfit (bool) – Default (False) keep grid as supplied. If True, redefine grid to remove cells outside the bbox.
- Properties:
grid_kw – Dictionary of regridding IOAPI properties. Defaults to 12US1
viirsnoaa_kw – Dictionary of filter properties
tropomi_kw – Dictionary of filter properties
tempo_kw – Dictionary of filter properties
purpleair_kw – Dictionary of filter properties and api_key. Unlike other options, purpleair_kw will not work with the defaults. The user must update the api_key property to their own key. Contact PurpleAir for more details.
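The bdate and edate defaults documented above (yesterday at 0Z, and bdate + 23:59:59) can be reproduced with the standard library. `default_dates` and its `now` argument are hypothetical and exist only to make the arithmetic explicit; the class itself accepts str or pandas datetimes.

```python
from datetime import datetime, timedelta, timezone


def default_dates(bdate=None, edate=None, now=None):
    """Fill in missing dates using the documented defaults (sketch)."""
    if now is None:
        now = datetime.now(timezone.utc)
    if bdate is None:
        # Yesterday at 00:00:00 UTC.
        bdate = (now - timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0
        )
    if edate is None:
        # Inclusive end of the same day.
        edate = bdate + timedelta(hours=23, minutes=59, seconds=59)
    return bdate, edate


bdate, edate = default_dates(now=datetime(2024, 3, 10, 15, 30, tzinfo=timezone.utc))
print(bdate.isoformat(), edate.isoformat())
```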
- capabilities(as_dataframe=True, refresh=False, verbose=0)[source]¶
At this time, capabilities does not list cmaq.* entries.
- describe(key, as_dataframe=True, raw=False)[source]¶
describe returns details about the coverage specified by key. Details include spatial bounding box, time coverage, time resolution, variable label, and a short description.
DescribeCoverage with a COVERAGE should be faster than descriptions because it only returns a small xml chunk. Currently, DescribeCoverage with a COVERAGE specified is unreliable because of malformed xml. If this fails, describe will instead request all coverages and query the specific coverage.
- Parameters:
as_dataframe (bool) – Defaults to True and descriptions are returned as a dataframe. If False, returns a list of elements.
raw (bool) – Return raw xml instead of parsing. Useful for debugging.
- Returns:
coverages – dataframe or list of parsed descriptions
- Return type:
pandas.DataFrame or list
Example
df = rsigapi.describe('airnow.no2')
print(df.to_csv())
# ,name,label,description,bbox_str,beginPosition,timeResolution
# 0,no2,no2(ppb),UTC hourly mean surface measured nitrogen …,
# … -157 21 -51 64,2003-01-02T00:00:00Z,PT1H
- descriptions(refresh=False, verbose=0)[source]¶
Experimental and may change.
descriptions returns details about all coverages. Details include spatial bounding box, time coverage, time resolution, variable label, and a short description.
Currently, parses capabilities using xml.etree.ElementTree and returns coverages from details available in CoverageOffering elements from DescribeCoverage.
Currently cleaning up data xml elements that are bad and doing a per-coverage parsing to increase fault tolerance in the xml.
- Parameters:
refresh (bool) – If True, get new copy and save to ~/.pyrsig/descriptons.xml. If False (default), reload from saved if available.
verbose (int) – If verbose is greater than 0, show warnings from parsing.
- Returns:
coverages – dataframe or list of parsed descriptions
- Return type:
pandas.DataFrame or list
Example
rsigapi = pyrsig.RsigApi()
desc = rsigapi.descriptions()
print(desc.query('prefix == "tropomi"').name.unique())
# ['tropomi.nrti.no2.nitrogendioxide_tropospheric_column'
#  … 43 other names here
#  'tropomi.rpro.ch4.methane_mixing_ratio_bias_corrected']
- get_file(formatstr, key=None, bdate=None, edate=None, bbox=None, grid=False, corners=None, request='GetCoverage', compress=0, overwrite=None, verbose=0)[source]¶
Build the url and outpath, download the file, and return outpath.
- keys(offline=True)[source]¶
- Parameters:
offline (bool) – If True, uses small cached set of tested coverages. If False, finds all coverages from capabilities service.
- resize_grid(clip=True)[source]¶
Update grid_kw property so that it only covers the bbox by adjusting the XORIG, YORIG, NCOLS and NROWS. If clip is True, this has the effect of reducing the number of rows and columns. This is useful when the area of interest is much smaller than the grid defined in grid_kw.
- Parameters:
clip (bool) – If True, limit grid to original grid bounds.
- Return type:
None
- to_dataframe(key=None, bdate=None, edate=None, bbox=None, unit_keys=True, parse_dates=False, corners=None, withmeta=False, verbose=0, backend='ascii', grid=False)[source]¶
All arguments default to those provided during initialization.
- Parameters:
key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)
bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z
edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59
bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)
unit_keys (bool) – If True, keep unit in column name. If False, move last parenthetical part of key to attrs of Series.
parse_dates (bool) – If True, parse Timestamp(UTC)
withmeta (bool) – If True, add ‘GetMetadata’ results as a “metadata” attribute of the dataframe. This is useful for understanding the underlying datasets used to create the result.
verbose (int) – level of verbosity
- Returns:
df – Results from download
- Return type:
pandas.DataFrame
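The unit_keys option refers to the trailing parenthetical in column names such as no2(ppb) or Timestamp(UTC). A minimal sketch of separating the name from the unit is shown below; `split_unit` is a hypothetical helper, and how the library stores the unit afterwards is not reproduced here.

```python
import re

# Capture everything before the last "(...)" as the name and its content as the unit.
_UNIT_RE = re.compile(r"^(?P<name>.*)\((?P<unit>[^()]*)\)$")


def split_unit(column):
    """Split 'name(unit)' into (name, unit); unit is None if there is none."""
    match = _UNIT_RE.match(column)
    if match is None:
        return column, None
    return match.group("name"), match.group("unit")


for col in ["Timestamp(UTC)", "no2(ppb)", "STATION"]:
    print(split_unit(col))
```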
- to_ioapi(key=None, bdate=None, edate=None, bbox=None, withmeta=False, removegz=False, verbose=0)[source]¶
All arguments default to those provided during initialization.
- Parameters:
key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)
bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z
edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59
bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)
withmeta (bool) – If True, add ‘GetMetadata’ results as an attribute “metadata” to the netcdf file. This is useful for understanding the underlying datasets used to create the result.
removegz (bool) – If True, then remove the downloaded gz file. Bad for caching.
- Returns:
ds – Results from download
- Return type:
xarray.Dataset
- to_netcdf(key=None, bdate=None, edate=None, bbox=None, grid=False, withmeta=False, removegz=False, verbose=0)[source]¶
All arguments default to those provided during initialization.
- Parameters:
key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)
bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z
edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59
bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)
grid (bool) – Add column and row variables with grid assignments.
withmeta (bool) – If True, add ‘GetMetadata’ results as an attribute “metadata” to the netcdf file.
removegz (bool) – If True, then remove the downloaded gz file. Bad for caching.
- Returns:
ds – Results from download
- Return type:
xarray.Dataset
- class pyrsig.RsigGui[source]¶
Bases:
object
RsigGui Object designed for IPython with ipywidgets in Jupyter
Example:
gui = RsigGui()
gui.form  # As last line in cell, displays controls for user
gui.plotopts()  # Plots current options
gui.check()  # Check bounding box and date options make sense
rsigapi = gui.get_api()  # Convert gui to standard api
# proceed with normal RsigApi usage
- property bbox¶
- property bdate¶
- property edate¶
- property form¶
- property grid_kw¶
- property key¶
- property workdir¶
- pyrsig.open_ioapi(path, metapath=None, earth_radius=6370000.0, **kwds)[source]¶
Open an IOAPI file, add coordinate data, and optionally add RSIG metadata.
- Parameters:
path (str) – Path to IOAPI formatted files.
metapath (str) – Path to metadata associated with the RSIG query. The metadata will be added as metadata global property.
earth_radius (float) – Assumed radius of the earth. 6370000 is the WRF default.
kwds (mappable) – Passed to xr.open_dataset
- Returns:
ds – Dataset with IOAPI metadata
- Return type:
xarray.Dataset
- pyrsig.open_mfioapi(paths, metapaths=None, earth_radius=6370000.0, **kwargs)[source]¶
Minimal version of open_mfdataset that is compatible with open_ioapi.
preprocess : keyword defaults to add_ioapi_meta
concat_dim : keyword defaults to ‘TSTEP’
- Parameters:
paths (iterable) – Paths to ioapi files to be opened.
metapaths (iterable) – Paths to be added as a string metadata
earth_radius (float) – Radius of the earth for projection.
kwargs – See xr.open_mfdataset