pyrsig package

Subpackages

Submodules

pyrsig.bin module

pyrsig.bin.from_binfile(path)[source]
Parameters:

path (str) – Path to a binary file whose first line is the prefix "shxn shpn dbfn", where shxn is the length of the shx file, shpn is the length of the shp file, and dbfn is the length of the dbf file.

Returns:

df

Return type:

geopandas.GeoDataFrame
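
The header convention described above can be illustrated with a minimal sketch. It assumes the three segments follow the first line in the same order as the header fields (shx, shp, dbf), which the description does not state explicitly; `split_shapefile_payload` is a hypothetical helper and not part of pyrsig.

```python
import io


def split_shapefile_payload(raw: bytes) -> dict:
    """Split a payload whose first line is b'shxn shpn dbfn'."""
    buf = io.BytesIO(raw)
    header = buf.readline()
    shxn, shpn, dbfn = (int(tok) for tok in header.split())
    parts = {}
    # Assumption: segments are stored in the order named in the header.
    for name, size in (("shx", shxn), ("shp", shpn), ("dbf", dbfn)):
        chunk = buf.read(size)
        if len(chunk) != size:
            raise ValueError(f"{name} segment truncated: expected {size} bytes")
        parts[name] = chunk
    return parts


# Tiny synthetic payload: lengths 3, 5 and 2 followed by the three segments.
payload = b"3 5 2\n" + b"XXX" + b"PPPPP" + b"DD"
parts = split_shapefile_payload(payload)
```

Each segment could then be written to a temporary .shx, .shp or .dbf file and read with geopandas.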

pyrsig.utils module

pyrsig.utils.coverages_from_xml(txt)[source]

Based on xml text, create coverage data

pyrsig.utils.customize_grid(grid_kw, bbox, clip=True)[source]

Redefine grid_kw to cover bbox by removing extra rows and columns and redefining XORIG, YORIG, NCOLS and NROWS.

Parameters:
  • grid_kw (dict or str) – If str, must be a known grid in default grids. If dict, must include all IOAPI grid metadata properties

  • bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)

  • clip (bool) – If True, limit grid to original grid bounds

Returns:

ogrid_kw – IOAPI grid metadata properties with XORIG/YORIG and NCOLS/NROWS adjusted such that it only covers bbox or (if clip) only covers the portion of bbox covered by the original grid_kw.

Return type:

dict
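
The origin and count adjustment can be sketched as follows. This is a simplified illustration, not the pyrsig implementation: it assumes the bounding box is already expressed in the grid's projected units (the real function takes longitude and latitude), and `trim_grid` is a hypothetical name.

```python
import math


def trim_grid(grid_kw, xy_bbox, clip=True):
    """Return a copy of grid_kw trimmed to xy_bbox (projected units)."""
    wx, sy, ex, ny = xy_bbox
    dx, dy = grid_kw["XCELL"], grid_kw["YCELL"]
    # Cell indices relative to the original origin; floor/ceil keep whole
    # cells so the bounding box stays fully covered.
    c0 = math.floor((wx - grid_kw["XORIG"]) / dx)
    c1 = math.ceil((ex - grid_kw["XORIG"]) / dx)
    r0 = math.floor((sy - grid_kw["YORIG"]) / dy)
    r1 = math.ceil((ny - grid_kw["YORIG"]) / dy)
    if clip:
        # Never extend beyond the original grid.
        c0, c1 = max(c0, 0), min(c1, grid_kw["NCOLS"])
        r0, r1 = max(r0, 0), min(r1, grid_kw["NROWS"])
    out = dict(grid_kw)
    out["XORIG"] = grid_kw["XORIG"] + c0 * dx
    out["YORIG"] = grid_kw["YORIG"] + r0 * dy
    out["NCOLS"] = c1 - c0
    out["NROWS"] = r1 - r0
    return out


grid = {"XORIG": 0.0, "YORIG": 0.0, "XCELL": 10.0, "YCELL": 10.0,
        "NCOLS": 10, "NROWS": 8}
# The box extends past the top of the grid, so clip limits the row count.
small = trim_grid(grid, (25.0, 15.0, 61.0, 200.0))
```

With clip=False the same call would keep all rows up to the top of the box, extending the grid beyond its original bounds.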

pyrsig.utils.get_file(url, outpath, maxtries=5, verbose=1, overwrite=False)[source]

Download file from RSIG using fault tolerance and optional caching when overwrite is False.

Parameters:
  • url (str) – path to retrieve

  • outpath (str) – path to save file to

  • maxtries (int) – try this many times before quitting

  • verbose (int) – Level of verbosity

  • overwrite (bool) – If True, overwrite existing files. If False, reuse existing files.

Return type:

None
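
The retry and caching behavior can be illustrated with a small sketch. It is not the pyrsig implementation: `fetch` is a hypothetical zero-argument callable standing in for the HTTP request, and the returned status string exists only to make the demonstration observable (the documented function returns None).

```python
import os
import tempfile
import time


def fetch_with_retries(fetch, outpath, maxtries=5, overwrite=False, wait=0.0):
    """Write fetch() bytes to outpath, retrying on OSError."""
    if os.path.exists(outpath) and not overwrite:
        return "cached"  # reuse the existing file
    last_err = None
    for attempt in range(1, maxtries + 1):
        try:
            data = fetch()
            # Write to a temporary name first so a failed attempt never
            # leaves a partial file that would later be treated as cached.
            tmppath = outpath + ".part"
            with open(tmppath, "wb") as f:
                f.write(data)
            os.replace(tmppath, outpath)
            return f"downloaded after {attempt} tries"
        except OSError as err:
            last_err = err
            time.sleep(wait)
    raise RuntimeError(f"failed after {maxtries} tries") from last_err


calls = {"n": 0}


def flaky():
    """Simulated download that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated outage")
    return b"payload"


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "out.bin")
    first = fetch_with_retries(flaky, path)
    second = fetch_with_retries(flaky, path)  # file exists, so no new fetch
```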

pyrsig.utils.get_proj4(attrs, earth_radius=6370000.0)[source]

Create a proj4 formatted grid definition using IOAPI attrs and earth_radius

Parameters:
  • attrs (dict-like) – Mappable of IOAPI properties that supports the items method

  • earth_radius (float) – Assumed radius of the earth. 6370000 is the WRF default.

Returns:

projstr – proj4 formatted string such that the domain southwest corner starts at (0, 0) and ends at (NCOLS, NROWS)

Return type:

str
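
As an illustration of the (0, 0) to (NCOLS, NROWS) convention, the sketch below builds a proj4 string for a Lambert conformal grid only. It is an assumption-laden example rather than the pyrsig code: `lcc_proj4` is a hypothetical name, it uses P_GAM as the central meridian, and the attribute values are illustrative values for a 12 km continental US grid.

```python
def lcc_proj4(attrs, earth_radius=6370000.0):
    """Build a proj4 string whose units are grid cells from the SW corner."""
    if attrs["GDTYP"] != 2:
        raise ValueError("this sketch only handles Lambert conformal (GDTYP=2)")
    # x_0/y_0 shift the southwest corner to (0, 0); to_meter rescales
    # projected meters to cell counts so the grid ends at (NCOLS, NROWS).
    return (
        f"+proj=lcc +lat_1={attrs['P_ALP']} +lat_2={attrs['P_BET']} "
        f"+lat_0={attrs['YCENT']} +lon_0={attrs['P_GAM']} "
        f"+R={earth_radius} +x_0={-attrs['XORIG']} +y_0={-attrs['YORIG']} "
        f"+to_meter={attrs['XCELL']} +no_defs"
    )


# Illustrative attributes (assumed values, not read from pyrsig).
attrs = {
    "GDTYP": 2, "P_ALP": 33.0, "P_BET": 45.0, "P_GAM": -97.0,
    "XCENT": -97.0, "YCENT": 40.0,
    "XORIG": -2556000.0, "YORIG": -1728000.0,
    "XCELL": 12000.0, "YCELL": 12000.0,
}
projstr = lcc_proj4(attrs)
```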

pyrsig.utils.legacy_get(url, *args, **kwds)[source]

pyrsig.xdr module

pyrsig.xdr.from_xdr(inf, na_values=None, decompress=False, as_dataframe=True)[source]

Currently supports profile, site and swath (v2.0). Each is in XDR format with a custom set of header rows in text format. The text header rows also describe the binary portion of the file.

Infers RSIG format using first 40 characters.

  • Site 2.0: from_site

  • Profile 2.0: from_profile

  • Swath 2.0: from_swath

  • Point 1.0: from_point

  • CALIPSO 1.0: from_calipso

  • Polygon 1.0: from_polygon

  • Grid 1.0: from_grid

  • Subset 9.0: from_subset

Parameters:
  • inf (file) – Data file in XDR format with RSIG headers

  • na_values (scalar) – Used to remove known missing values.

  • decompress (bool) – If True, decompress to temporary file. If False, do not decompress to temporary file (was never compressed)

  • as_dataframe (bool) – If True (default), return data as a pandas.DataFrame. If False, return an xarray.Dataset. Only subset and grid support as_dataframe.

Returns:

df – Dataframe with XDR content

Return type:

pd.DataFrame
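
The format inference can be sketched as a lookup on the start of the header. This is a simplified illustration: the table mirrors the list above, but the exact header spelling and the case-insensitive matching are assumptions, and `infer_reader` is a hypothetical helper.

```python
# Map of header tags to the reader names listed above.
READERS = {
    "site 2.0": "from_site",
    "profile 2.0": "from_profile",
    "swath 2.0": "from_swath",
    "point 1.0": "from_point",
    "calipso 1.0": "from_calipso",
    "polygon 1.0": "from_polygon",
    "grid 1.0": "from_grid",
    "subset 9.0": "from_subset",
}


def infer_reader(header: str) -> str:
    """Pick a reader name from the first 40 characters of the header."""
    prefix = header[:40].strip().lower()
    for tag, reader in READERS.items():
        if prefix.startswith(tag):
            return reader
    raise IOError(f"unrecognized RSIG format: {prefix!r}")


swath_reader = infer_reader("Swath 2.0\nhttp://example.invalid/\n")
subset_reader = infer_reader("SUBSET 9.0 CMAQ\n")
```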

pyrsig.xdr.from_xdrfile(path, na_values=None, decompress=None, as_dataframe=True, decompress_inline=True)[source]

Currently supports profile, site and swath (v2.0). Each is in XDR format with a custom set of header rows in text format. The text header rows also describe the binary portion of the file.

Parameters:
  • path (str) – Path to file in XDR format with RSIG headers

  • decompress (bool) – If None, decompress only if path ends in .gz. If True, decompress to temporary file. If False, do not decompress to temporary file (was never compressed).

  • decompress_inline (bool) – If True (default), use gzip.open to decompress and read the file. If False, decompress the file on disk.

  • as_dataframe (bool) – If True (default), return data as a pandas.DataFrame. If False, return an xarray.Dataset. Only subset and grid support as_dataframe.

Returns:

df – Dataframe with XDR content

Return type:

pd.DataFrame
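
The interaction between decompress and decompress_inline can be sketched with the standard library. This is an illustration under stated assumptions, not the pyrsig code: `open_maybe_gz` is a hypothetical helper, and in the on-disk branch the caller is responsible for removing the temporary file.

```python
import gzip
import os
import shutil
import tempfile


def open_maybe_gz(path, decompress=None, decompress_inline=True):
    """Open path for binary reading, handling gzip as described above."""
    if decompress is None:
        decompress = path.endswith(".gz")  # infer from the file name
    if not decompress:
        return open(path, "rb")
    if decompress_inline:
        return gzip.open(path, "rb")  # decompress while reading
    # Otherwise decompress to a temporary file on disk and open that.
    tmp = tempfile.NamedTemporaryFile(suffix=".xdr", delete=False)
    with gzip.open(path, "rb") as src, tmp:
        shutil.copyfileobj(src, tmp)
    return open(tmp.name, "rb")


with tempfile.TemporaryDirectory() as tmpdir:
    gzpath = os.path.join(tmpdir, "demo.xdr.gz")
    with gzip.open(gzpath, "wb") as f:
        f.write(b"Site 2.0\n")
    with open_maybe_gz(gzpath) as f:
        inline = f.read()
    with open_maybe_gz(gzpath, decompress_inline=False) as f:
        ondisk = f.read()
        tmpname = f.name
    os.remove(tmpname)  # clean up the on-disk copy
```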

Module contents

class pyrsig.RsigApi(key=None, bdate=None, edate=None, bbox=None, grid_kw=None, tropomi_kw=None, purpleair_kw=None, viirsnoaa_kw=None, tempo_kw=None, pandora_kw=None, calipso_kw=None, server='ofmpub.epa.gov', compress=1, corners=1, encoding=None, overwrite=False, workdir='.', gridfit=False)[source]

Bases: object

RsigApi is a python-based interface to RSIG’s web-based API

Parameters:
  • key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)

  • bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z

  • edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59

  • bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)

  • grid_kw (str or dict) –

    If str, must be 12US1, 1US1, 12US2, 1US2, 36US3, 108NHEMI2, 36NHEMI2 and will be used to set parameters based on EPA domains. If dict, IOAPI mapping parameters. For details, look at the defaults:

    import pyrsig; print(pyrsig.RsigApi().grid_kw)

    The REGRID_AGGREGATE defines how the regridded values are aggregated in time. Options are None (default), daily, or all.

  • viirsnoaa_kw (dict) – Dictionary of VIIRS NOAA filter parameters. Default: {'minimum_quality': 'high'}; other options are 'medium' or 'low'.

  • tropomi_kw (dict) – Dictionary of TropOMI filter parameters. Default: {'minimum_quality': 75, 'maximum_cloud_fraction': 1.0}; valid ranges are 0-100 and 0-1, respectively.

  • purpleair_kw (dict) –

    Dictionary of purpleair filter parameters and api_key.

    'out_in_flag': 0,  # options 0, 2, ''
    'freq': 'hourly',  # options hourly, daily, monthly, yearly, none
    'maximum_difference': 5,  # integer
    'maximum_ratio': 0.70,  # float
    'agg_pct': 75,  # 0-100
    'default_humidity': 50,
    'api_key': 'your_key_here'

  • tempo_kw (dict) –

    Dictionary of TEMPO filter parameters. Defaults:

    'api_key': 'your_key_here'  # 'password'
    'minimum_quality': 'normal'
    'maximum_cloud_fraction': 1.0
    'maximum_solar_zenith_angle': 70.

  • pandora_kw (dict) – Dictionary of Pandora filter parameters. Default: {'minimum_quality': 'high'}; other options are 'medium' or 'low'.

  • calipso_kw (dict) – Dictionary of CALIPSO filter parameters. Default: {'MINIMUM_CAD': 20, 'MAXIMUM_UNCERTAINTY': 99}.

  • server (str) – 'ofmpub.epa.gov' for external users; 'maple.hesc.epa.gov' for users on the EPA VPN.

  • compress (int) – 1 to transfer files with gzip compression; 0 to transfer uncompressed files (slow).

  • encoding (dict) – If encoding is provided, netCDF files will be stored as NetCDF4 with encoding for all variables. If _FillValue is provided, it will not be applied to TFLAG and COUNT.

  • overwrite (bool) – If True, overwrite downloaded files in workdir. If False, reuse downloaded files in workdir.

  • workdir (str) – Working directory (must exist). Defaults to '.'.

  • gridfit (bool) – Default (False) keep grid as supplied. If True, redefine grid to remove cells outside the bbox.

Properties:
  • grid_kw – Dictionary of regridding IOAPI properties. Defaults to 12US1.

  • viirsnoaa_kw – Dictionary of filter properties.

  • tropomi_kw – Dictionary of filter properties.

  • tempo_kw – Dictionary of filter properties.

  • purpleair_kw – Dictionary of filter properties and api_key. Unlike other options, purpleair_kw will not work with the defaults. The user must update the api_key property to their own key. Contact PurpleAir for more details.
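
A minimal usage sketch follows, assuming pyrsig is installed and the RSIG server is reachable. The helper names (`default_dates`, `make_api_kwargs`, `run_query`) and the bounding box are hypothetical; the date helper only mirrors the documented defaults (yesterday at 0Z through bdate + 23:59:59) so they can be seen explicitly.

```python
from datetime import datetime, timedelta, timezone


def default_dates(now=None):
    """Return (bdate, edate) strings: yesterday 00:00:00Z to 23:59:59Z."""
    now = now or datetime.now(timezone.utc)
    bdate = (now - timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    edate = bdate + timedelta(hours=23, minutes=59, seconds=59)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return bdate.strftime(fmt), edate.strftime(fmt)


def make_api_kwargs(now=None):
    """Collect constructor arguments; the bbox is an arbitrary example."""
    bdate, edate = default_dates(now)
    return dict(
        key="aqs.o3",
        bdate=bdate,
        edate=edate,
        bbox=(-85.0, 30.0, -75.0, 40.0),  # wlon, slat, elon, nlat
        workdir=".",
    )


def run_query():
    """Not called here: requires pyrsig and network access."""
    import pyrsig

    api = pyrsig.RsigApi(**make_api_kwargs())
    return api.to_dataframe()


kwargs = make_api_kwargs(datetime(2024, 3, 10, 15, 30, tzinfo=timezone.utc))
```

Queries against PurpleAir coverages would additionally need purpleair_kw with a valid api_key, as noted above.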

capabilities(as_dataframe=True, refresh=False, verbose=0)[source]

At this time, capabilities does not list cmaq.* coverages.

describe(key, as_dataframe=True, raw=False)[source]

describe returns details about the coverage specified by key. Details include spatial bounding box, time coverage, time resolution, variable label, and a short description.

DescribeCoverage with a COVERAGE should be faster than descriptions because it only returns a small xml chunk. Currently, DescribeCoverage with a COVERAGE specified is unreliable because of malformed xml. If this fails, describe will instead request all coverages and query the specific coverage.

Parameters:
  • as_dataframe (bool) – Defaults to True and descriptions are returned as a dataframe. If False, returns a list of elements.

  • raw (bool) – Return raw xml instead of parsing. Useful for debugging.

Returns:

coverages – dataframe or list of parsed descriptions

Return type:

pandas.DataFrame or list

Example

    df = rsigapi.describe('airnow.no2')
    print(df.to_csv())
    # ,name,label,description,bbox_str,beginPosition,timeResolution
    # 0,no2,no2(ppb),UTC hourly mean surface measured nitrogen …,
    # … -157 21 -51 64,2003-01-02T00:00:00Z,PT1H
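
The fallback behavior described for describe (try the single-coverage response, then search all coverages if the XML is malformed) can be sketched as below. This is a simplified illustration, not the pyrsig implementation: `describe_with_fallback`, `fetch_one` and `fetch_all` are hypothetical names, and the XML here omits the namespaces and structure of real RSIG responses.

```python
import xml.etree.ElementTree as ET


def describe_with_fallback(key, fetch_one, fetch_all):
    """Return (element, source) for key, falling back on malformed XML."""
    try:
        return ET.fromstring(fetch_one(key)), "single"
    except ET.ParseError:
        # The small response was malformed: search the full listing instead.
        root = ET.fromstring(fetch_all())
        for elem in root.iter("CoverageOffering"):
            if elem.findtext("name") == key:
                return elem, "fallback"
        raise KeyError(key)


def bad_single(key):
    """Simulated malformed response (the element is never closed)."""
    return f"<CoverageOffering><name>{key}</name>"


def all_coverages():
    """Simulated well-formed listing of every coverage."""
    return (
        "<root>"
        "<CoverageOffering><name>airnow.o3</name></CoverageOffering>"
        "<CoverageOffering><name>airnow.no2</name>"
        "<label>no2(ppb)</label></CoverageOffering>"
        "</root>"
    )


elem, source = describe_with_fallback("airnow.no2", bad_single, all_coverages)
label = elem.findtext("label")
```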

descriptions(refresh=False, verbose=0)[source]

Experimental and may change.

descriptions returns details about all coverages. Details include spatial bounding box, time coverage, time resolution, variable label, and a short description.

Currently, parses capabilities using xml.etree.ElementTree and returns coverages from details available in CoverageOffering elements from DescribeCoverage.

Currently, malformed xml elements are cleaned up and each coverage is parsed separately to increase fault tolerance.

Parameters:
  • refresh (bool) – If True, get a new copy and save it to ~/.pyrsig/descriptons.xml. If False (default), reload from the saved copy if available.

  • verbose (int) – If verbose is greater than 0, show warnings from parsing.

Returns:

coverages – dataframe or list of parsed descriptions

Return type:

pandas.DataFrame or list

Example

    rsigapi = pyrsig.RsigApi()
    desc = rsigapi.descriptions()
    print(desc.query('prefix == "tropomi"').name.unique())
    # ['tropomi.nrti.no2.nitrogendioxide_tropospheric_column'
    #  … 43 other names here
    #  'tropomi.rpro.ch4.methane_mixing_ratio_bias_corrected']

get_file(formatstr, key=None, bdate=None, edate=None, bbox=None, grid=False, corners=None, request='GetCoverage', compress=0, overwrite=None, verbose=0)[source]

Build url, outpath, and download the file. Returns outpath

keys(offline=True)[source]
Parameters:

offline (bool) – If True, uses small cached set of tested coverages. If False, finds all coverages from capabilities service.

resize_grid(clip=True)[source]

Update the grid_kw property so that it only covers the bbox by adjusting XORIG, YORIG, NCOLS and NROWS. If clip is True, this has the effect of reducing the number of rows and columns. This is useful when the area of interest is much smaller than the grid defined in grid_kw.

Parameters:

clip (bool) – If True, limit grid to original grid bounds.

Return type:

None

set_grid_kw(grid_kw)[source]
to_dataframe(key=None, bdate=None, edate=None, bbox=None, unit_keys=True, parse_dates=False, corners=None, withmeta=False, verbose=0, backend='ascii', grid=False)[source]

All arguments default to those provided during initialization.

Parameters:
  • key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)

  • bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z

  • edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59

  • bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)

  • unit_keys (bool) – If True, keep unit in column name. If False, move last parenthetical part of key to attrs of Series.

  • parse_dates (bool) – If True, parse Timestamp(UTC)

  • withmeta (bool) – If True, add ‘GetMetadata’ results as a “metadata” attribute of the dataframe. This is useful for understanding the underlying datasets used to create the result.

  • verbose (int) – level of verbosity

Returns:

df – Results from download

Return type:

pandas.DataFrame
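
The unit_keys option concerns column names such as no2(ppb), where the last parenthetical part is the unit. A minimal sketch of that split is shown below; `split_unit` is a hypothetical helper and the regular expression is an assumption about the column naming, not the pyrsig implementation.

```python
import re

# Match a trailing "(unit)" with no nested parentheses.
UNIT_RE = re.compile(r"^(?P<name>.*)\((?P<unit>[^()]*)\)$")


def split_unit(column):
    """Return (name, unit); unit is None when no trailing parenthetical."""
    match = UNIT_RE.match(column)
    if match is None:
        return column, None
    return match.group("name"), match.group("unit")


columns = ["Timestamp(UTC)", "LONGITUDE(deg)", "no2(ppb)", "STATION"]
renamed = [split_unit(col) for col in columns]
```

With a pandas DataFrame, the unit could then be stored in each Series' attrs while the column is renamed to the bare name.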

to_ioapi(key=None, bdate=None, edate=None, bbox=None, withmeta=False, removegz=False, verbose=0)[source]

All arguments default to those provided during initialization.

Parameters:
  • key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)

  • bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z

  • edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59

  • bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)

  • withmeta (bool) – If True, add ‘GetMetadata’ results as an attribute “metadata” to the netcdf file. This is useful for understanding the underlying datasets used to create the result.

  • removegz (bool) – If True, then remove the downloaded gz file. Bad for caching.

Returns:

ds – Results from download

Return type:

xarray.Dataset

to_netcdf(key=None, bdate=None, edate=None, bbox=None, grid=False, withmeta=False, removegz=False, verbose=0)[source]

All arguments default to those provided during initialization.

Parameters:
  • key (str) – Default key for query (e.g., ‘aqs.o3’, ‘purpleair.pm25_corrected’, or ‘tropomi.offl.no2.nitrogendioxide_tropospheric_column’)

  • bdate (str or pd.Datetime) – beginning date (inclusive) defaults to yesterday at 0Z

  • edate (str or pd.Datetime) – ending date (inclusive) defaults to bdate + 23:59:59

  • bbox (tuple) – wlon, slat, elon, nlat in decimal degrees (-180 to 180)

  • grid (bool) – Add column and row variables with grid assignments.

  • withmeta (bool) – If True, add ‘GetMetadata’ results as an attribute “metadata” to the netcdf file.

  • removegz (bool) – If True, then remove the downloaded gz file. Bad for caching.

Returns:

ds – Results from download

Return type:

xarray.Dataset

class pyrsig.RsigGui[source]

Bases: object

RsigGui Object designed for IPython with ipywidgets in Jupyter

Example:

    gui = RsigGui()
    gui.form  # As last line in cell, displays controls for user
    gui.plotopts()  # Plots current options
    gui.check()  # Check bounding box and date options make sense
    rsigapi = gui.get_api()  # Convert gui to standard api
    # proceed with normal RsigApi usage

property bbox
property bdate
check(action='return')[source]
date_range()[source]
property edate
property form
classmethod from_api(api)[source]
get_api()[source]
property grid_kw
property key
plotopts()[source]
property workdir
pyrsig.open_ioapi(path, metapath=None, earth_radius=6370000.0, **kwds)[source]

Open an IOAPI file, add coordinate data, and optionally add RSIG metadata.

Parameters:
  • path (str) – Path to IOAPI formatted files.

  • metapath (str) – Path to metadata associated with the RSIG query. The metadata will be added as metadata global property.

  • earth_radius (float) – Assumed radius of the earth. 6370000 is the WRF default.

  • kwds (mappable) – Passed to xr.open_dataset

Returns:

ds – Dataset with IOAPI metadata

Return type:

xarray.Dataset

pyrsig.open_mfioapi(paths, metapaths=None, earth_radius=6370000.0, **kwargs)[source]

Minimal version of open_mfdataset that is compatible with open_ioapi.

  • preprocess – keyword defaults to add_ioapi_meta

  • concat_dim – keyword defaults to 'TSTEP'

Parameters:
  • paths (iterable) – Paths to ioapi files to be opened.

  • metapaths (iterable) – Paths to be added as a string metadata

  • earth_radius (float) – Radius of the earth for projection.

  • kwargs – See xr.open_mfdataset