The ecco_access Python “package”: accessing ECCO output on PO.DAAC

Updated 2024-10-21

Note: The ecco_access library is new as of Oct 2024. If you notice bugs or have any questions or suggestions, please feel free to post an issue on GitHub or contact Andrew Delman at andrewdelman@ucla.edu.

Introduction

In the several years since ECCOv4 release 4 (v4r4) output became available on the Physical Oceanography Distributed Active Archive Center (PO.DAAC), Jack McNelis, Ian Fenty, and Andrew Delman have written a number of Python scripts and functions to facilitate requests for this output. To make access easier and to standardize the format of these requests, the ecco_access library has been made available in the ecco_access folder of the ECCO-v4-Python-Tutorial GitHub repository.

This tutorial will help you set up ecco_access so that you can import it into your notebooks and code as a Python package, i.e., with import ecco_access. The tutorial also introduces the two top-level functions in ecco_access:

  • ecco_podaac_to_xrdataset: takes as input a text query or ECCO dataset identifier, and returns an xarray Dataset

  • ecco_podaac_access: takes the same input, but returns the URLs/paths or local files where the data is located

These functions support a number of access modes for querying the datasets, downloading the datasets, or accessing through S3 cloud storage. For more examples of these modes, you can look at the ECCO access modes tutorial.

ECCO output on PO.DAAC

Setting up Earthdata login credentials

An account with NASA Earthdata is required to access ECCO output hosted in the AWS Cloud by PO.DAAC. Please visit https://urs.earthdata.nasa.gov/home to make an account.

The Earthdata Login provides a single mechanism for user registration and profile management for all EOSDIS system components (DAACs, Tools, Services). Your Earthdata login also helps the EOSDIS program better understand the usage of EOSDIS services to improve user experience through customization of tools and improvement of services. EOSDIS data are openly available to all and free of charge except where governed by international agreements.

Note: some Earthdata password characters may cause problems depending on your system. To be safe, do not use any of the following characters in your password: backslash (\), space, hash (#), quotes (single or double), or greater than (>). Set/change your Earthdata password here: https://urs.earthdata.nasa.gov/change_password

  1. After creating a NASA Earthdata account, create a file called .netrc in your home directory (Linux, Mac):

/home/<username>/.netrc

or _netrc (Windows):

C:\Users\<username>\_netrc

The netrc file must have the following structure and must include your Earthdata account login name and password:

machine urs.earthdata.nasa.gov
    login <your username>
    password <your password>

  2. Set permissions on your netrc file so that it is readable only by the current user. Otherwise, you will receive the error “netrc access too permissive.”

$ chmod 0600 ~/.netrc
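
The two setup steps above can be checked from Python before any data request is made. The following is a minimal sketch that uses only the standard library; the helper name check_netrc is hypothetical and is not part of ecco_access. It reports whether the file contains a login and password for urs.earthdata.nasa.gov, and whether group/other permission bits are cleared (as they are after chmod 0600).

```python
import netrc
import os
import stat

EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def check_netrc(path, host=EARTHDATA_HOST):
    """Return (has_credentials, owner_only) for the netrc file at `path`.

    Hypothetical helper for illustration; not part of ecco_access.
    """
    # authenticators() returns (login, account, password),
    # or None if there is no entry for the host
    auth = netrc.netrc(path).authenticators(host)
    has_credentials = auth is not None and bool(auth[0]) and bool(auth[2])

    # owner_only is True when group/other have no permission bits (e.g., mode 0600)
    mode = stat.S_IMODE(os.stat(path).st_mode)
    owner_only = (mode & 0o077) == 0

    return has_credentials, owner_only
```

On Linux or Mac, this could be called as check_netrc(os.path.expanduser('~/.netrc')). The permission check is only meaningful on POSIX systems.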

ECCO output structure on PO.DAAC

On PO.DAAC and in the NASA Earthdata Cloud, ECCO output is organized in the following hierarchy:

  • Dataset: Typically contains a few variables, spanning the time range of the ECCO v4r4 output (currently 1992-2017). Most datasets are divided (in the time dimension) into hundreds or thousands of granules.

    • Granule: Dataset variables at a specific time (monthly mean, daily mean, or snapshot). The exceptions are 1-D time series, where the entire dataset consists of a single granule.

      • Variable: A specific geophysical parameter (or flux) representing the state of the ocean, atmosphere, or sea ice/snow cover. Individual variables are not visible through the NASA Earthdata website, but can be seen after a granule file has been opened.
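
To make “hundreds or thousands of granules” concrete, here is a small sketch that counts the time steps in the full v4r4 date range. It assumes exactly one granule per month (monthly means) or per day (daily means) from 1992-01-01 through 2017-12-31; the function names are illustrative only.

```python
from datetime import date

# Full ECCO v4r4 date range, as stated in this tutorial
START = date(1992, 1, 1)
END = date(2017, 12, 31)


def n_daily_granules(start=START, end=END):
    """Number of days in [start, end], both endpoints included."""
    return (end - start).days + 1


def n_monthly_granules(start=START, end=END):
    """Number of calendar months touched by [start, end]."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


print(n_monthly_granules())  # 312
print(n_daily_granules())    # 9497
```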

Each dataset has a dataset code called a ShortName which is used to identify it on the cloud. In order to download particular variable(s), you need to identify the ShortName associated with the dataset containing those variables. You can search for the variables in the linked text files below, or download these files for your reference.
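
The ShortNames that appear in this tutorial (e.g., ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4) seem to follow a content/grid/time-resolution/version pattern. The sketch below splits a ShortName along that pattern. The regular expression is inferred from the examples in this tutorial and is an assumption, not an official specification; it will not match ShortNames that lack a time-resolution field, such as those of the grid parameter datasets.

```python
import re

# Pattern inferred from ShortNames shown in this tutorial (an assumption,
# not an official naming specification)
SHORTNAME_PATTERN = re.compile(
    r"^ECCO_L4_(?P<content>.+)_(?P<grid>LLC0090GRID|05DEG)"
    r"_(?P<time_res>MONTHLY|DAILY|SNAPSHOT)_(?P<version>V4R4)$"
)


def split_shortname(shortname):
    """Return a dict of ShortName components, or None if the pattern does not match."""
    match = SHORTNAME_PATTERN.match(shortname)
    return match.groupdict() if match else None


print(split_shortname("ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4"))
# {'content': 'SSH', 'grid': 'LLC0090GRID', 'time_res': 'MONTHLY', 'version': 'V4R4'}
```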

Dataset ShortNames and variables associated with them

ECCO v4r4 llc90 Grid Dataset Variables - Monthly Means

ECCO v4r4 llc90 Grid Dataset Variables - Daily Means

ECCO v4r4 llc90 Grid Dataset Variables - Daily Snapshots

ECCO v4r4 0.5-Deg Interp Grid Dataset Variables - Monthly Means

ECCO v4r4 0.5-Deg Interp Grid Dataset Variables - Daily Means

ECCO v4r4 Time Series and Grid Parameters

Note that unlike earlier releases of ECCO v4, in v4r4 all monthly mean variables are also available for download as daily means. Snapshots (typically at daily intervals) are available for a few variables, and can be used to help close budgets as shown in later tutorials.

Setting up ecco_access

I call the ecco_access library a “package” in quotes because it currently has the core structure of a package that you would install using conda or pip: an __init__.py file allows you to access all of the library’s modules, and the functions within them, using a single import ecco_access command. However, this “package” is not yet available through conda or pip. In the meantime, you can get ecco_access using one of the following methods:

  • git clone the ECCO-v4-Python-Tutorial repository, which contains ecco_access along with the symlinks (soft links) needed to run the tutorials in-place

  • Download the ecco_access folder by clicking on this link.

If you downloaded the folder (second option), make sure the downloaded folder is unzipped/extracted and rename it to ecco_access.

Then you need to make it accessible to your notebook, either by:

Adding the ecco_access parent directory to your Python path

The ecco_access parent directory is the directory containing ecco_access. Add it to your Python path with the following Python code in each notebook where ecco_access is used, replacing {parent_dir} with the path of the parent directory:

import sys
sys.path.append('{parent_dir}')
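
A slightly more defensive variant of the snippet above is sketched here. It confirms that the package folder (with its __init__.py) actually exists under the parent directory, and avoids appending the same path twice when a notebook cell is re-run. The helper name add_package_parent is hypothetical.

```python
import sys
from pathlib import Path


def add_package_parent(parent_dir, package="ecco_access"):
    """Append parent_dir to sys.path if it contains the given package folder.

    Hypothetical helper for illustration; returns the resolved path as a string.
    """
    parent = Path(parent_dir).expanduser().resolve()
    # The package is importable only if its __init__.py is present
    if not (parent / package / "__init__.py").is_file():
        raise FileNotFoundError(f"{package}/__init__.py not found under {parent}")
    # Avoid duplicate entries when this is called more than once
    if str(parent) not in sys.path:
        sys.path.append(str(parent))
    return str(parent)
```

After calling add_package_parent('{parent_dir}') with your own parent directory, import ecco_access should work in the same session.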

Or:

Using the ecco_podaac_to_xrdataset function

Perhaps the most convenient way to use ecco_access is the ecco_podaac_to_xrdataset function; it takes as input a query consisting of NASA Earthdata dataset ShortName(s), ECCO variables, or text strings in the variable descriptions, and outputs an xarray Dataset. Let’s look at the syntax:

[1]:
import numpy as np
import xarray as xr
from os.path import join,expanduser
import matplotlib.pyplot as plt

import ecco_access as ea

# identify user's home directory
user_home_dir = expanduser('~')
[2]:
help(ea.ecco_podaac_to_xrdataset)
Help on function ecco_podaac_to_xrdataset in module ecco_access.ecco_access:

ecco_podaac_to_xrdataset(query, version='v4r4', grid=None, time_res='all', StartDate=None, EndDate=None,\
  snapshot_interval=None, mode='download_ifspace', download_root_dir=None, **kwargs)
    This function queries and accesses ECCO datasets from PO.DAAC. The core query and download functions
    are adapted from Jupyter notebooks created by Jack McNelis and Ian Fenty
    (https://github.com/ECCO-GROUP/ECCO-ACCESS/blob/master/PODAAC/Downloading_ECCO_datasets_from_PODAAC/README.md)
    and modified by Andrew Delman (https://ecco-v4-python-tutorial.readthedocs.io).
    The syntax is similar to ecco_podaac_access, except instead of a list of URLs or files,
    an xarray Dataset with all of the queried ECCO datasets is returned.

    Parameters
    ----------
    query: str, list, or dict, defines datasets or variables to access.
           If query is str, it specifies either a dataset ShortName (if query
           matches a NASA Earthdata ShortName), or a text string that can be
           used to search the ECCO ShortNames, variable names, and descriptions.
           A query may also be a list of multiple ShortNames and/or text searches,
           or a dict that contains grid,time_res specifiers as keys and ShortNames
           or text searches as values, e.g.,
           {'native,monthly':['ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4',
                              'THETA']}
           will query the native grid monthly SSH datasets, and all native grid
           monthly datasets with variables or descriptions matching 'THETA'.

    version: ('v4r4'), specifies ECCO version to query

    grid: ('native','latlon',None), specifies whether to query datasets with output
          on the native grid or the interpolated lat/lon grid.
          The default None will query both types of grids, unless specified
          otherwise in a query dict (e.g., the example above).

    time_res: ('monthly','daily','snapshot','all'), specifies which time resolution
              to include in query and downloads. 'all' includes all time resolutions,
              and datasets that have no time dimension, such as the grid parameter
              and mixing coefficient datasets.


    StartDate,EndDate: str, in 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' format,
                       define date range [StartDate,EndDate] for download.
                       EndDate is included in the time range (unlike typical Python ranges).
                       Full ECCOv4r4 date range (default) is '1992-01-01' to '2017-12-31'.
                       For 'SNAPSHOT' datasets, an additional day is added to EndDate to enable
                       closed budgets within the specified date range.

    snapshot_interval: ('monthly', 'daily', or None), if snapshot datasets are included in ShortNames,
                       this determines whether snapshots are included for only the beginning/end
                       of each month ('monthly'), or for every day ('daily').
                       If None or not specified, defaults to 'daily' if any daily mean ShortNames
                       are included and 'monthly' otherwise.

    mode: str, one of the following:
          'download': Download datasets using NASA Earthdata URLs
          'download_ifspace': Check storage availability before downloading.
                              Download only if storage footprint of downloads
                              <= max_avail_frac*(available storage)
          'download_subset': Download spatial and temporal subsets of datasets
                             via Opendap; query help(ecco_access.ecco_podaac_download_subset)
                             to see keyword arguments that can be used in this mode.
          The following modes work within the AWS cloud only:
          's3_open': Access datasets on S3 without downloading.
          's3_open_fsspec': Use json files (generated with `fsspec` and `kerchunk`)
                            for expedited opening of datasets.
          's3_get': Download from S3 (to AWS EC2 instance).
          's3_get_ifspace': Check storage availability before downloading;
                            download if storage footprint
                            <= max_avail_frac*(available storage).
                            Otherwise data are opened "remotely" from S3 bucket.

    download_root_dir: str, defines parent directory to download files to.
                       Files will be downloaded to directory download_root_dir/ShortName/.
                       If not specified, parent directory defaults to '~/Downloads/ECCO_V4r4_PODAAC/'.

    Additional keyword arguments*:
    *This is not an exhaustive list, especially for
    'download_subset' mode; use help(ecco_access.ecco_podaac_download_subset) to display
    options specific to that mode

    max_avail_frac: float, maximum fraction of remaining available disk space to
                    use in storing ECCO datasets.
                    If storing the datasets exceeds this fraction, an error is returned.
                    Valid range is [0,0.9]. If number provided is outside this range, it is replaced
                    by the closer endpoint of the range.

    jsons_root_dir: str, for s3_open_fsspec mode only, the root/parent directory where the
                    fsspec/kerchunk-generated jsons are found.
                    jsons are generated using the steps described here:
                    https://medium.com/pangeo/fake-it-until-you-make-it-reading-goes-netcdf4-data-on-aws-s3
                    -as-zarr-for-rapid-data-access-61e33f8fe685
                    and stored as {jsons_root_dir}/MZZ_{GRIDTYPE}_{TIME_RES}/{SHORTNAME}.json.
                    For v4r4, GRIDTYPE is '05DEG' or 'LLC0090GRID'.
                    TIME_RES is one of: ('MONTHLY','DAILY','SNAPSHOT','GEOMETRY','MIXING_COEFFS').

    n_workers: int, number of workers to use in concurrent downloads. Benefits typically taper off above 5-6.

    force_redownload: bool, if True, existing files will be redownloaded and replaced;
                            if False (default), existing files will not be replaced.

    Returns
    -------
    ds_out: xarray Dataset or dict of xarray Datasets (with ShortNames as keys),
            containing all of the accessed datasets.
            This function does not work with the query modes: 'ls','query','s3_ls','s3_query'.
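
The storage rule described for the 'download_ifspace' mode (download only if the footprint is <= max_avail_frac*(available storage), with max_avail_frac clamped to [0,0.9]) can be sketched as follows. This is an illustration of the rule as stated in the help text, not the library's actual code; the function names and the default value of 0.5 are assumptions.

```python
import shutil


def clamp_frac(frac, lo=0.0, hi=0.9):
    """Replace an out-of-range fraction with the closer endpoint of [lo, hi]."""
    return min(max(frac, lo), hi)


def fits_in_storage(download_bytes, target_dir, max_avail_frac=0.5):
    """True if download_bytes <= max_avail_frac * (free space on target_dir's disk).

    The default max_avail_frac=0.5 is an arbitrary value chosen for this sketch.
    """
    free_bytes = shutil.disk_usage(target_dir).free
    return download_bytes <= clamp_frac(max_avail_frac) * free_bytes


print(clamp_frac(1.5))   # 0.9
print(clamp_frac(-0.2))  # 0.0
```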

There are a lot of options that you can use to “submit” a query with this function. Let’s consider a simple case, where we already have the ShortName for the monthly native grid SSH from ECCOv4r4 (ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4), and we want to access output from the year 2017. The ShortName goes in the query field, and we can specify start and end dates (in YYYY, YYYY-MM, or YYYY-MM-DD format). The other options that matter most for this request are the mode, and depending on the mode, the download_root_dir or the jsons_root_dir.
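
Since EndDate is included in the time range, a partial date such as '2017-12' covers the whole of December. The sketch below shows one way to read partial StartDate/EndDate strings as an inclusive range of days. It is an interpretation of the help text above, written with a hypothetical helper name; the library's internal handling may differ.

```python
import calendar
from datetime import date


def expand_date_range(start_str, end_str):
    """Expand 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' strings to an inclusive (first, last) day pair."""
    s = [int(part) for part in start_str.split("-")]
    e = [int(part) for part in end_str.split("-")]

    # Missing start fields default to the earliest month/day
    start = date(s[0], s[1] if len(s) > 1 else 1, s[2] if len(s) > 2 else 1)

    # Missing end fields default to the latest month/day
    end_month = e[1] if len(e) > 1 else 12
    end_day = e[2] if len(e) > 2 else calendar.monthrange(e[0], end_month)[1]
    end = date(e[0], end_month, end_day)

    return start, end


print(expand_date_range("2017-01", "2017-12"))
# (datetime.date(2017, 1, 1), datetime.date(2017, 12, 31))
```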

Direct download over the internet (mode = ‘download’)

Let’s try the download mode, which retrieves the data over the Internet using NASA Earthdata URLs (this should work on any machine with Internet access, including cloud environments):

[3]:
# download data and open xarray dataset
SSH_monthly_shortname = 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4'
ds_SSH = ea.ecco_podaac_to_xrdataset(SSH_monthly_shortname,\
                                        StartDate='2017-01',EndDate='2017-12',\
                                        mode='download',\
                                        download_root_dir=join(user_home_dir,'Downloads','ECCO_V4r4_PODAAC'))
created download directory /home/jovyan/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4
DL Progress: 100%|#########################| 12/12 [00:05<00:00,  2.16it/s]

=====================================
total downloaded: 71.02 Mb
avg download speed: 12.73 Mb/s
Time spent = 5.578183174133301 seconds


We specified a root directory for the download (which also happens to be the default setting), and the data files are then placed under download_root_dir / ShortName. We can verify that the contents of the dataset are what we queried:

[4]:
ds_SSH
[4]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2017-01-16T12:00:00 ... 2017-12-16T0...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    Conventions:                  CF-1.8, ACDD-1.3
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            2017-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          2017-01-01T00:00:00
    title:                        ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                         a21a5c30-400c-11eb-a9e0-0cc47a3f49c3
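
Because downloaded files are placed under download_root_dir / ShortName, the local granules can be listed later without repeating the query. A minimal sketch follows, assuming the granule files carry the .nc extension; the helper name list_granule_files is hypothetical.

```python
from pathlib import Path


def list_granule_files(download_root_dir, shortname):
    """Return the sorted .nc files found under download_root_dir/shortname."""
    dataset_dir = Path(download_root_dir).expanduser() / shortname
    return sorted(dataset_dir.glob("*.nc"))
```

For example, list_granule_files('~/Downloads/ECCO_V4r4_PODAAC', 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4') would return the 12 files downloaded above, and that list could then be passed to xr.open_mfdataset.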

In-cloud direct access (mode = ‘s3_open’)

If you are working in the AWS Cloud, you do not have to download the ECCO output files to carry out computations with the data. Let’s try s3_open, which opens the files from S3 (no download necessary).

[5]:
SSH_monthly_shortname = 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4'
ds_SSH_s3 = ea.ecco_podaac_to_xrdataset(SSH_monthly_shortname,\
                                        StartDate='2017-01',EndDate='2017-12',\
                                        mode='s3_open')
{'ShortName': 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2017-01-02,2017-12-31'}

Total number of matching granules: 12
[6]:
ds_SSH_s3
[6]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2017-01-16T12:00:00 ... 2017-12-16T0...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    Conventions:                  CF-1.8, ACDD-1.3
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            2017-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          2017-01-01T00:00:00
    title:                        ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                         a21a5c30-400c-11eb-a9e0-0cc47a3f49c3

Now plot the SSH for Jan 2017 in tile 10 (Python numbering convention; 11 in Fortran/MATLAB numbering convention). Here we use the “RdYlBu” colormap, one of many built-in colormaps that the matplotlib package provides (you can also create your own). The “_r” at the end reverses the direction of the colormap, so red corresponds to the maximum values.

[7]:
ds_SSH_s3.SSH.isel(time=0,tile=10).plot(cmap='RdYlBu_r')
[7]:
<matplotlib.collections.QuadMesh at 0x7f1bced3fcb0>
[Figure: SSH in tile 10, Jan 2017 (_images/ECCO_access_intro_12_1.png)]

We can also use the ecco_v4_py package to plot a global map of Jan 2017 SSH, using the plot_proj_to_latlon_grid function which regrids from the native LLC grid to a lat/lon grid.

[8]:
import ecco_v4_py as ecco

plt.figure(figsize=(12,6), dpi= 90)
ecco.plot_proj_to_latlon_grid(ds_SSH_s3.XC, ds_SSH_s3.YC, \
                              ds_SSH_s3.SSH.isel(time=0), \
                              user_lon_0=-160,\
                              projection_type='robin',\
                              plot_type='pcolormesh', \
                              cmap='RdBu_r',\
                              dx=1,dy=1,cmin=-1.2, cmax=1.2,show_colorbar=True)
plt.title('Monthly mean SSH [m], Jan 2017')
plt.show()
[Figure: global map of monthly mean SSH [m], Jan 2017 (_images/ECCO_access_intro_14_0.png)]

What if you don’t know the ShortName already?

NASA Earthdata datasets are identified by ShortNames, but you might not know the ShortName of the variable or category of variables that you are seeking. One way to find the ShortName is to consult the ECCOv4r4 variable lists. Another option is to specify a text string “query” in ecco_access functions. If “query” does not match an existing NASA Earthdata ShortName, the function conducts a case-insensitive search for exact matches of that text string among the variable names and descriptions in the variable lists.

Note: if using a “query” to search the variable lists, it is recommended to use a single word (or even part of a word) when possible. Since only exact text matches are identified, you are more likely to get results with a shorter query.
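
The reason that shorter queries return more results can be seen in a small sketch of a case-insensitive, exact-substring search. The mini catalog below copies a few variable descriptions from the v4r4 variable lists; the search function itself is a simplified stand-in written for illustration, not the code used in ecco_access.

```python
# A few entries copied from the variable lists (ShortName -> {variable: description})
CATALOG = {
    "ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4": {
        "ETAN": "Model sea level anomaly, without corrections for global mean "
                "density changes, inverted barometer effect, or volume "
                "displacement due to submerged sea-ice and snow. (m)",
    },
    "ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4": {
        "SIarea": "Sea-ice concentration (fraction between 0 and 1)",
        "SIheff": "Area-averaged sea-ice thickness (m)",
    },
    "ECCO_L4_SEA_ICE_VELOCITY_LLC0090GRID_MONTHLY_V4R4": {
        "SIuice": "Sea-ice velocity in the model +x direction (m/s)",
    },
}


def search_catalog(query, catalog=CATALOG):
    """Return ShortNames whose name, variable names, or descriptions contain query."""
    q = query.lower()
    hits = []
    for shortname, variables in catalog.items():
        fields = [shortname, *variables.keys(), *variables.values()]
        if any(q in field.lower() for field in fields):
            hits.append(shortname)
    return hits


print(len(search_catalog("ice")))               # 3
print(search_catalog("concentration"))
# ['ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4']
print(search_catalog("sea ice concentration"))  # []
```

The multi-word query returns nothing because the descriptions spell the term “Sea-ice” with a hyphen, while the short query 'ice' also matches the SSH dataset through the ETAN description.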

For example, perhaps you are looking to open the dataset that has native grid monthly sea ice concentration in 2007. If the query is not identified as a ShortName, then a text search of the variable lists is conducted using query, grid, and time_res. The user is then asked to select one of the identified matches.

[9]:
ds_seaice_conc = ea.ecco_podaac_to_xrdataset('ice',grid='native',time_res='monthly',\
                                               StartDate='2007-01',EndDate='2007-12',\
                                               mode='s3_open')
ShortName Options for query "ice":
                  Variable Name     Description (units)

Option 1: ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  SSH               Dynamic sea surface height anomaly. Suitable for
                                    comparisons with altimetry sea surface height data
                                    products that apply the inverse barometer
                                    correction. (m)
                  SSHIBC            The inverted barometer correction to sea surface
                                    height due to atmospheric pressure loading. (m)
                  SSHNOIBC          Sea surface height anomaly without the inverted
                                    barometer correction. Suitable for comparisons
                                    with altimetry sea surface height data products
                                    that do NOT apply the inverse barometer
                                    correction. (m)
                  ETAN              Model sea level anomaly, without corrections for
                                    global mean density changes, inverted barometer
                                    effect, or volume displacement due to submerged
                                    sea-ice and snow. (m)

Option 2: ECCO_L4_STRESS_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  EXFtaux           Wind stress in the model +x direction (N/m^2)
                  EXFtauy           Wind stress in the model +y direction (N/m^2)
                  oceTAUX           Ocean surface stress in the model +x direction,
                                    due to wind and sea-ice (N/m^2)
                  oceTAUY           Ocean surface stress in the model +y direction,
                                    due to wind and sea-ice (N/m^2)

Option 3: ECCO_L4_HEAT_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  EXFhl             Open ocean air-sea latent heat flux (W/m^2)
                  EXFhs             Open ocean air-sea sensible heat flux (W/m^2)
                  EXFlwdn           Downward longwave radiative flux (W/m^2)
                  EXFswdn           Downwelling shortwave radiative flux (W/m^2)
                  EXFqnet           Open ocean net air-sea heat flux (W/m^2)
                  oceQnet           Net heat flux into the ocean surface (W/m^2)
                  SIatmQnt          Net upward heat flux to the atmosphere (W/m^2)
                  TFLUX             Rate of change of ocean heat content per m^2
                                    accounting for mass (e.g. freshwater) fluxes
                                    (W/m^2)
                  EXFswnet          Open ocean net shortwave radiative flux (W/m^2)
                  EXFlwnet          Net open ocean longwave radiative flux (W/m^2)
                  oceQsw            Net shortwave radiative flux across the ocean
                                    surface (W/m^2)
                  SIaaflux          Conservative ocean and sea-ice advective heat flux
                                    adjustment, associated with temperature difference
                                    between sea surface temperature and sea-ice,
                                    excluding latent heat of fusion (W/m^2)

Option 4: ECCO_L4_FRESH_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  EXFpreci          Precipitation rate (m/s)
                  EXFevap           Open ocean evaporation rate (m/s)
                  EXFroff           River runoff (m/s)
                  SIsnPrcp          Snow precipitation on sea-ice (kg/(m^2 s))
                  EXFempmr          Open ocean net surface freshwater flux from
                                    precipitation, evaporation, and runoff (m/s)
                  oceFWflx          Net freshwater flux into the ocean (kg/(m^2 s))
                  SIatmFW           Net freshwater flux into the open ocean, sea-ice,
                                    and snow (kg/(m^2 s))
                  SFLUX             Rate of change of total ocean salinity per m^2
                                    accounting for mass fluxes (g/(m^2 s))
                  SIacSubl          Freshwater flux to the atmosphere due to
                                    sublimation-deposition of snow or ice (kg/(m^2 s))
                  SIrsSubl          Residual sublimation freshwater flux (kg/(m^2 s))
                  SIfwThru          Precipitation through sea-ice (kg/(m^2 s))

Option 5: ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  SIarea            Sea-ice concentration (fraction between 0 and 1)
                  SIheff            Area-averaged sea-ice thickness (m)
                  SIhsnow           Area-averaged snow thickness (m)
                  sIceLoad          Average sea-ice and snow mass per unit area
                                    (kg/m^2)

Option 6: ECCO_L4_SEA_ICE_VELOCITY_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  SIuice            Sea-ice velocity in the model +x direction (m/s)
                  SIvice            Sea-ice velocity in the model +y direction (m/s)

Option 7: ECCO_L4_SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  ADVxHEFF          Lateral advective flux of sea-ice thickness in the
                                    model +x direction (m^3/s)
                  ADVyHEFF          Lateral advective flux of sea-ice thickness in the
                                    model +y direction (m^3/s)
                  DFxEHEFF          Lateral diffusive flux of sea-ice thickness in the
                                    model +x direction (m^3/s)
                  DFyEHEFF          Lateral diffusive flux of sea-ice thickness in the
                                    model +y direction (m^3/s)
                  ADVxSNOW          Lateral advective flux of snow thickness in the
                                    model +x direction (m^3/s)
                  ADVySNOW          Lateral advective flux of snow thickness in the
                                    model +y direction (m^3/s)
                  DFxESNOW          Lateral diffusive flux of snow thickness in the
                                    model +x direction (m^3/s)
                  DFyESNOW          Lateral diffusive flux of snow thickness in the
                                    model +y direction (m^3/s)

Option 8: ECCO_L4_SEA_ICE_SALT_PLUME_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  oceSPflx          Net salt flux into the ocean due to brine
                                    rejection (g/(m^2 s))
                  oceSPDep          Salt plume depth (m)


Please select option [1-8]:  5
Using dataset with ShortName: ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4
{'ShortName': 'ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2007-01-02,2007-12-31'}

Total number of matching granules: 12

We selected option 5, corresponding to ShortName ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4. Let’s look at the dataset contents.

[10]:
ds_seaice_conc
[10]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2007-01-16T12:00:00 ... 2007-12-16T1...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SIarea     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SIheff     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SIhsnow    (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    sIceLoad   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    Conventions:                  CF-1.8, ACDD-1.3
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            2007-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          2007-01-01T00:00:00
    title:                        ECCO Sea-Ice and Snow Concentration and Thi...
    uuid:                         e6cdf192-400d-11eb-93b5-0cc47a3f49c3

Now plot the sea ice concentration (the fraction of each grid cell covered by ice) in tile 6, which approximately covers the Arctic Ocean. We select September 2007 (time index 8, counting from 0), which at the time was a record minimum for Arctic sea ice.

[11]:
ds_seaice_conc.SIarea.isel(time=8,tile=6).plot(cmap='cool')
[11]:
<matplotlib.collections.QuadMesh at 0x7f1bcc70c1a0>
_images/ECCO_access_intro_20_1.png
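
The integer index time=8 picks September only because this particular dataset starts in January. As a minimal sketch of selecting the same month by its label instead, the example below builds a small synthetic array (the name da, its 4x4 grid and its values are made up for illustration and are not ECCO output) and compares position-based and label-based selection.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for SIarea: 12 monthly time stamps, 13 tiles, a tiny 4x4 grid.
times = pd.to_datetime([f"2007-{month:02d}-16T12:00:00" for month in range(1, 13)])
values = np.arange(12 * 13 * 4 * 4, dtype="float32").reshape(12, 13, 4, 4)
da = xr.DataArray(
    values,
    dims=("time", "tile", "j", "i"),
    coords={"time": times},
    name="SIarea",
)

# Position-based selection: the 9th time step (index 8) and tile 6.
by_position = da.isel(time=8, tile=6)

# Label-based selection: a partial date string matches every time stamp in that
# month, so the result keeps a length-1 time dimension that we squeeze away.
by_label = da.sel(time="2007-09").squeeze("time").isel(tile=6)

print(np.array_equal(by_position.values, by_label.values))
```

With the real dataset the same pattern would read ds_seaice_conc.SIarea.sel(time='2007-09').squeeze('time').isel(tile=6), assuming the time coordinate is a datetime index as in the output shown above. Label-based selection keeps working if the requested date range changes.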

Using the ecco_podaac_access function

In-cloud direct access (mode = ‘s3_open’)

The ecco_podaac_to_xrdataset function used in the previous examples invokes ecco_podaac_access under the hood, and ecco_podaac_access can also be called directly. This is useful if you want to obtain a list of file objects, paths, or URLs that you can then process with your own code. Let’s use this function with mode = 's3_open' (all s3 modes only work from an AWS cloud environment in region us-west-2).

[12]:
files_dict = ea.ecco_podaac_access(SSH_monthly_shortname,
                                    StartDate='2015-01', EndDate='2015-12',
                                    mode='s3_open')
{'ShortName': 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2015-01-02,2015-12-31'}

Total number of matching granules: 12
[13]:
files_dict[SSH_monthly_shortname]
[13]:
[<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-01_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-02_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-03_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-04_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-05_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-06_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-07_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-08_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-09_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-10_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-11_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-12_ECCO_V4r4_native_llc0090.nc>]

The output of ecco_podaac_access is a dictionary with ShortNames as keys. In this case, the value associated with the ShortName is a list of 12 file objects, one per monthly granule. These are files on S3 (AWS’s cloud storage system) that have already been opened, which is a necessary step before the files’ data can be read. The list of open files can be passed directly to xarray.open_mfdataset.

[14]:
ds_SSH_fromlist = xr.open_mfdataset(files_dict[SSH_monthly_shortname],
                                    compat='override', data_vars='minimal', coords='minimal',
                                    parallel=True)
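
The keyword arguments in this call control how the per-month files are combined. In xarray, data_vars='minimal' and coords='minimal' concatenate only the variables that already contain the concatenation dimension, and compat='override' takes the remaining variables from the first file without comparing them across files. The sketch below illustrates that behavior with xr.concat, which accepts the same three options, on small in-memory datasets. The helper make_granule and its values are hypothetical stand-ins for the monthly files, not part of ecco_access.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_granule(month):
    """Hypothetical stand-in for one monthly file: one time step plus a static grid coordinate."""
    time = pd.to_datetime([f"2015-{month:02d}-16T12:00:00"])
    return xr.Dataset(
        data_vars={"SSH": (("time", "j", "i"), np.full((1, 2, 3), float(month)))},
        coords={"time": time, "YC": (("j", "i"), np.zeros((2, 3)))},
    )


granules = [make_granule(month) for month in (1, 2, 3)]

# Only variables that already have a time dimension are concatenated;
# YC is taken from the first granule and not compared with the others.
combined = xr.concat(
    granules,
    dim="time",
    data_vars="minimal",
    coords="minimal",
    compat="override",
)

print(combined["SSH"].dims, combined["YC"].dims)
```

Skipping the comparison of static grid variables is a reasonable choice here because every granule of a dataset is on the same grid, and it avoids reading those variables from each file.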
[15]:
ds_SSH_fromlist
[15]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2015-01-16T12:00:00 ... 2015-12-16T1...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    Conventions:                  CF-1.8, ACDD-1.3
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            2015-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          2015-01-01T00:00:00
    title:                        ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                         a4955186-400c-11eb-8c14-0cc47a3f49c3
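
Because ecco_podaac_access returns a dictionary keyed by ShortName, its contents can be inspected with an ordinary loop before anything is loaded. The sketch below uses a hypothetical stand-in dictionary, files_dict_demo, holding plain path strings modeled on the listing above. In 's3_open' mode the real values are open file objects rather than strings, so treat this only as an illustration of the dictionary layout.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the returned dictionary: one ShortName mapped to
# 12 placeholder path strings (the real 's3_open' values are open file objects).
files_dict_demo = {
    "ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4": [
        "podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/"
        f"SEA_SURFACE_HEIGHT_mon_mean_2015-{month:02d}_ECCO_V4r4_native_llc0090.nc"
        for month in range(1, 13)
    ],
}

# Summarize each entry: number of files and the first and last file names.
for shortname, paths in files_dict_demo.items():
    names = [PurePosixPath(path).name for path in paths]
    print(f"{shortname}: {len(paths)} files")
    print(f"  first: {names[0]}")
    print(f"  last:  {names[-1]}")
```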