The ecco_access Python “package”: accessing ECCO output on PO.DAAC
Updated 2024-10-21
Note: The ecco_access library is new as of Oct 2024. If you notice bugs or have any questions or suggestions, please feel free to post an issue on the GitHub or contact Andrew Delman at andrewdelman@ucla.edu.
Introduction
In the several years since ECCOv4 release 4 output became available on the Physical Oceanography Distributed Active Archive Center (PO.DAAC), Jack McNelis, Ian Fenty, and Andrew Delman have written a number of Python scripts and functions to facilitate requests of this output. To make access easier and to standardize the format of these requests, the ecco_access library has been made available in the ecco_access folder of the ECCO-v4-Python-Tutorial GitHub repository.
This tutorial will help you set up ecco_access so that you can import it into your notebooks and code as a Python package, i.e., with import ecco_access. The tutorial also introduces the two top-level functions in ecco_access:
ecco_podaac_to_xrdataset: takes as input a text query or ECCO dataset identifier, and returns an xarray Dataset
ecco_podaac_access: takes the same input, but returns the URLs/paths or local files where the data is located
These functions support a number of access modes for querying the datasets, downloading the datasets, or accessing through S3 cloud storage. For more examples of these modes, you can look at the ECCO access modes tutorial.
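The access modes can be grouped by where they run. The sketch below is only an illustration: the mode names are copied from the help text of ecco_podaac_to_xrdataset (shown later in this tutorial), while the ACCESS_MODES table and the works_outside_aws helper are hypothetical names introduced here and are not part of ecco_access.

```python
# Mode names are taken from the ecco_podaac_to_xrdataset help text.
# The table and helper are illustrative only; they are not part of ecco_access.
ANYWHERE = "any machine with Internet access"
AWS_ONLY = "AWS cloud only"

ACCESS_MODES = {
    "download": ANYWHERE,
    "download_ifspace": ANYWHERE,
    "download_subset": ANYWHERE,
    "s3_open": AWS_ONLY,
    "s3_open_fsspec": AWS_ONLY,
    "s3_get": AWS_ONLY,
    "s3_get_ifspace": AWS_ONLY,
}


def works_outside_aws(mode):
    """Return True if the access mode does not require an AWS cloud environment."""
    if mode not in ACCESS_MODES:
        raise ValueError(f"unknown access mode: {mode!r}")
    return ACCESS_MODES[mode] == ANYWHERE


# Print a small overview of the modes and where they can be used
for mode, where in ACCESS_MODES.items():
    print(f"{mode:18s} {where}")
```

A check like works_outside_aws(mode) could be used at the top of a notebook to pick a sensible mode depending on where the notebook is running.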
ECCO output on PO.DAAC
Setting up Earthdata login credentials
An account with NASA Earthdata is required to access ECCO output hosted in the AWS Cloud by PO.DAAC. Please visit https://urs.earthdata.nasa.gov/home to make an account.
The Earthdata Login provides a single mechanism for user registration and profile management for all EOSDIS system components (DAACs, Tools, Services). Your Earthdata login also helps the EOSDIS program better understand the usage of EOSDIS services to improve user experience through customization of tools and improvement of services. EOSDIS data are openly available to all and free of charge except where governed by international agreements.
Note: Some Earthdata password characters may cause problems depending on your system. To be safe, do not use any of the following characters in your password: backslash (\), space, hash (#), quotes (single or double), or greater than (>). Set/change your Earthdata password here: https://urs.earthdata.nasa.gov/change_password
After creating a NASA Earthdata account, create a file called .netrc in your home directory (Linux, Mac):
/home/<username>/.netrc
or _netrc (Windows):
C:\Users\<username>\_netrc
The netrc file must have the following structure and must include your Earthdata account login name and password:
machine urs.earthdata.nasa.gov
login <your username>
password <your password>
Set permissions on your netrc file so that it is readable only by the current user. If not, you will receive the error “netrc access too permissive.”
$ chmod 0600 ~/.netrc
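One way to confirm that a netrc file is well formed is to parse it with Python's standard-library netrc module. The sketch below writes a sample file to a temporary directory rather than touching your real ~/.netrc; the helper names write_netrc and read_earthdata_login and the placeholder credentials are made up for illustration.

```python
import netrc
import os
import stat
import tempfile

EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def write_netrc(path, username, password, host=EARTHDATA_HOST):
    """Write a netrc file with a single entry, readable only by the current user."""
    with open(path, "w") as f:
        f.write(f"machine {host}\nlogin {username}\npassword {password}\n")
    # Same effect as `chmod 0600` (only partially honored on Windows)
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def read_earthdata_login(path, host=EARTHDATA_HOST):
    """Return (login, password) for the host, or raise KeyError if it is missing."""
    auth = netrc.netrc(path).authenticators(host)
    if auth is None:
        raise KeyError(f"no entry for {host} in {path}")
    login, _account, password = auth
    return login, password


# Demonstration with placeholder credentials in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, ".netrc")
    write_netrc(sample_path, "example_user", "example_password")
    sample_login = read_earthdata_login(sample_path)
    sample_mode = stat.S_IMODE(os.stat(sample_path).st_mode)

print(sample_login)
```

To check your own file, you could call read_earthdata_login with the path of your .netrc (or _netrc on Windows); a KeyError would indicate that the urs.earthdata.nasa.gov entry is missing.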
ECCO output structure on PO.DAAC
On PO.DAAC and in the NASA Earthdata Cloud, ECCO output is organized in the following hierarchy:
Dataset: Typically contains a few variables, spanning the time range of the ECCO v4r4 output (currently 1992-2017). Most datasets are divided (in the time dimension) into hundreds or thousands of granules.
Granule: Dataset variables at a specific time (monthly mean, daily mean, or snapshot). Exceptions are 1-D time series where the entire dataset only consists of one granule.
Variable: A specific geophysical parameter (or flux) representing the state of the ocean, atmosphere, or sea ice/snow cover. Individual variables are not visible through the NASA Earthdata website, but can be seen after a granule file has been opened.
Each dataset has a dataset code called a ShortName, which is used to identify it on the cloud. In order to download particular variable(s), you need to identify the ShortName associated with the dataset containing those variables. You can search for the variables in the linked text files below, or download these files for your reference.
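The ShortNames that appear in this tutorial follow a visible pattern: a descriptive name, then a grid label, a time resolution, and the version. The helper below splits a ShortName under that assumption. It is a hypothetical convenience function, not part of ecco_access, and the pattern is inferred from examples rather than taken from an official specification, so ShortNames that do not fit it (for example, datasets without a time resolution label) will raise an error.

```python
import re

# Assumed pattern, inferred from ShortNames such as
# ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4; not an official specification.
SHORTNAME_PATTERN = re.compile(
    r"^ECCO_L4_(?P<name>.+)_(?P<grid>LLC0090GRID|05DEG)"
    r"_(?P<time_res>MONTHLY|DAILY|SNAPSHOT)_(?P<version>V4R4)$"
)


def split_shortname(shortname):
    """Split a ShortName into name, grid, time_res, and version parts."""
    match = SHORTNAME_PATTERN.match(shortname)
    if match is None:
        raise ValueError(f"ShortName does not fit the assumed pattern: {shortname!r}")
    return match.groupdict()


parts = split_shortname("ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4")
print(parts)
```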
Dataset ShortNames and variables associated with them
ECCO v4r4 llc90 Grid Dataset Variables - Monthly Means
ECCO v4r4 llc90 Grid Dataset Variables - Daily Means
ECCO v4r4 llc90 Grid Dataset Variables - Daily Snapshots
ECCO v4r4 0.5-Deg Interp Grid Dataset Variables - Monthly Means
ECCO v4r4 0.5-Deg Interp Grid Dataset Variables - Daily Means
ECCO v4r4 Time Series and Grid Parameters
Note that unlike earlier releases of ECCO v4, in v4r4 all monthly mean variables are also available for download as daily means. Snapshots (typically at daily intervals) are available for a few variables, and can be used to help close budgets as shown in later tutorials.
Setting up ecco_access
The ecco_access library is called a “package” in quotes here because it has the core structure of a package that you would install using conda or pip: an __init__.py file allows you to access all of the library’s modules and their functions with a single import ecco_access command. However, this “package” is not yet available through conda or pip. In the meantime, you can get ecco_access using one of the following methods:
git clone the ECCO-v4-Python-Tutorial repository, which contains ecco_access along with the symlinks (soft links) needed to run the tutorials in-place
Download the ecco_access folder by clicking on this link.
If you downloaded the folder (second option), make sure that the downloaded folder is unzipped/extracted, and rename it to ecco_access.
Then you need to make it accessible to your notebook, either by:
Adding the ecco_access parent directory to your Python path
The ecco_access parent directory is the directory containing ecco_access. Add it to your Python path with the following Python code in each notebook where ecco_access is used, replacing {parent_dir} with the path of the parent directory:
import sys
sys.path.append('{parent_dir}')
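A slightly more defensive variant is sketched below. It checks that the parent directory really contains an ecco_access folder with an __init__.py file before appending, and avoids adding the same path twice. The function name add_package_parent_to_path is hypothetical, and the demonstration uses an empty stand-in folder in a temporary directory rather than the real library.

```python
import os
import sys
import tempfile


def add_package_parent_to_path(parent_dir, package="ecco_access"):
    """Append parent_dir to sys.path if it contains the package.

    Returns True if the path was added, False if it was already present.
    """
    parent_dir = os.path.abspath(os.path.expanduser(parent_dir))
    init_file = os.path.join(parent_dir, package, "__init__.py")
    if not os.path.isfile(init_file):
        raise FileNotFoundError(f"{init_file} not found; check parent_dir")
    if parent_dir in sys.path:
        return False
    sys.path.append(parent_dir)
    return True


# Demonstration with a stand-in package folder in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "ecco_access"))
    open(os.path.join(tmp, "ecco_access", "__init__.py"), "w").close()
    first_call = add_package_parent_to_path(tmp)   # path is added
    second_call = add_package_parent_to_path(tmp)  # path already present
    sys.path.remove(os.path.abspath(tmp))          # clean up after the demo

print(first_call, second_call)
```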
Or:
Put in a symlink to the ecco_access location from your current notebook directory
A symlink (or soft link) from the directory containing your notebooks will allow you to use the ecco_access library from any notebook in that directory. Put in the link with the following code in your terminal window, replacing {parent_dir} and {current_notebook_dir} with the respective paths of those directories:
ln -s {parent_dir}/ecco_access {current_notebook_dir}/ecco_access
or by running the following Python code once:
import os
os.symlink('{parent_dir}/ecco_access','{current_notebook_dir}/ecco_access',target_is_directory=True)
After you have done this you should be able to import ecco_access just like any other Python package.
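If the symlink code might be run more than once (for example, at the top of a notebook), it helps to skip the call when the link already exists, since os.symlink raises an error in that case. The sketch below wraps the call in a hypothetical helper, link_package, and demonstrates it on stand-in folders in a temporary directory. On Windows, creating symlinks may require additional privileges.

```python
import os
import tempfile


def link_package(parent_dir, notebook_dir, package="ecco_access"):
    """Create notebook_dir/package as a symlink to parent_dir/package.

    Returns True if a link was created, False if something already exists there.
    """
    target = os.path.join(parent_dir, package)
    link = os.path.join(notebook_dir, package)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"{target} is not a directory")
    if os.path.lexists(link):
        return False  # an existing link or folder is left untouched
    os.symlink(target, link, target_is_directory=True)
    return True


# Demonstration with stand-in folders in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    parent_dir = os.path.join(tmp, "parent")
    notebook_dir = os.path.join(tmp, "notebooks")
    os.makedirs(os.path.join(parent_dir, "ecco_access"))
    os.makedirs(notebook_dir)
    open(os.path.join(parent_dir, "ecco_access", "__init__.py"), "w").close()
    created_first = link_package(parent_dir, notebook_dir)
    created_again = link_package(parent_dir, notebook_dir)
    # The package's __init__.py should be visible through the link
    init_visible = os.path.isfile(
        os.path.join(notebook_dir, "ecco_access", "__init__.py"))

print(created_first, created_again, init_visible)
```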
Using the ecco_podaac_to_xrdataset function
Perhaps the most convenient way to use ecco_access is the ecco_podaac_to_xrdataset function; it takes as input a query consisting of NASA Earthdata dataset ShortName(s), ECCO variables, or text strings in the variable descriptions, and outputs an xarray Dataset. Let’s look at the syntax:
[1]:
import numpy as np
import xarray as xr
from os.path import join,expanduser
import matplotlib.pyplot as plt
import ecco_access as ea
# identify user's home directory
user_home_dir = expanduser('~')
[2]:
help(ea.ecco_podaac_to_xrdataset)
Help on function ecco_podaac_to_xrdataset in module ecco_access.ecco_access:
ecco_podaac_to_xrdataset(query, version='v4r4', grid=None, time_res='all', StartDate=None, EndDate=None,\
snapshot_interval=None, mode='download_ifspace', download_root_dir=None, **kwargs)
This function queries and accesses ECCO datasets from PO.DAAC. The core query and download functions
are adapted from Jupyter notebooks created by Jack McNelis and Ian Fenty
(https://github.com/ECCO-GROUP/ECCO-ACCESS/blob/master/PODAAC/Downloading_ECCO_datasets_from_PODAAC/README.md)
and modified by Andrew Delman (https://ecco-v4-python-tutorial.readthedocs.io).
The syntax is similar to ecco_podaac_access, except instead of a list of URLs or files,
an xarray Dataset with all of the queried ECCO datasets is returned.
Parameters
----------
query: str, list, or dict, defines datasets or variables to access.
If query is str, it specifies either a dataset ShortName (if query
matches a NASA Earthdata ShortName), or a text string that can be
used to search the ECCO ShortNames, variable names, and descriptions.
A query may also be a list of multiple ShortNames and/or text searches,
or a dict that contains grid,time_res specifiers as keys and ShortNames
or text searches as values, e.g.,
{'native,monthly':['ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4',
'THETA']}
will query the native grid monthly SSH datasets, and all native grid
monthly datasets with variables or descriptions matching 'THETA'.
version: ('v4r4'), specifies ECCO version to query
grid: ('native','latlon',None), specifies whether to query datasets with output
on the native grid or the interpolated lat/lon grid.
The default None will query both types of grids, unless specified
otherwise in a query dict (e.g., the example above).
time_res: ('monthly','daily','snapshot','all'), specifies which time resolution
to include in query and downloads. 'all' includes all time resolutions,
and datasets that have no time dimension, such as the grid parameter
and mixing coefficient datasets.
StartDate,EndDate: str, in 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' format,
define date range [StartDate,EndDate] for download.
EndDate is included in the time range (unlike typical Python ranges).
Full ECCOv4r4 date range (default) is '1992-01-01' to '2017-12-31'.
For 'SNAPSHOT' datasets, an additional day is added to EndDate to enable
closed budgets within the specified date range.
snapshot_interval: ('monthly', 'daily', or None), if snapshot datasets are included in ShortNames,
this determines whether snapshots are included for only the beginning/end
of each month ('monthly'), or for every day ('daily').
If None or not specified, defaults to 'daily' if any daily mean ShortNames
are included and 'monthly' otherwise.
mode: str, one of the following:
'download': Download datasets using NASA Earthdata URLs
'download_ifspace': Check storage availability before downloading.
Download only if storage footprint of downloads
<= max_avail_frac*(available storage)
'download_subset': Download spatial and temporal subsets of datasets
via Opendap; query help(ecco_access.ecco_podaac_download_subset)
to see keyword arguments that can be used in this mode.
The following modes work within the AWS cloud only:
's3_open': Access datasets on S3 without downloading.
's3_open_fsspec': Use json files (generated with `fsspec` and `kerchunk`)
for expedited opening of datasets.
's3_get': Download from S3 (to AWS EC2 instance).
's3_get_ifspace': Check storage availability before downloading;
download if storage footprint
<= max_avail_frac*(available storage).
Otherwise data are opened "remotely" from S3 bucket.
download_root_dir: str, defines parent directory to download files to.
Files will be downloaded to directory download_root_dir/ShortName/.
If not specified, parent directory defaults to '~/Downloads/ECCO_V4r4_PODAAC/'.
Additional keyword arguments*:
*This is not an exhaustive list, especially for
'download_subset' mode; use help(ecco_access.ecco_podaac_download_subset) to display
options specific to that mode
max_avail_frac: float, maximum fraction of remaining available disk space to
use in storing ECCO datasets.
If storing the datasets exceeds this fraction, an error is returned.
Valid range is [0,0.9]. If number provided is outside this range, it is replaced
by the closer endpoint of the range.
jsons_root_dir: str, for s3_open_fsspec mode only, the root/parent directory where the
fsspec/kerchunk-generated jsons are found.
jsons are generated using the steps described here:
https://medium.com/pangeo/fake-it-until-you-make-it-reading-goes-netcdf4-data-on-aws-s3
-as-zarr-for-rapid-data-access-61e33f8fe685
and stored as {jsons_root_dir}/MZZ_{GRIDTYPE}_{TIME_RES}/{SHORTNAME}.json.
For v4r4, GRIDTYPE is '05DEG' or 'LLC0090GRID'.
TIME_RES is one of: ('MONTHLY','DAILY','SNAPSHOT','GEOMETRY','MIXING_COEFFS').
n_workers: int, number of workers to use in concurrent downloads. Benefits typically taper off above 5-6.
force_redownload: bool, if True, existing files will be redownloaded and replaced;
if False (default), existing files will not be replaced.
Returns
-------
ds_out: xarray Dataset or dict of xarray Datasets (with ShortNames as keys),
containing all of the accessed datasets.
This function does not work with the query modes: 'ls','query','s3_ls','s3_query'.
There are a lot of options that you can use to “submit” a query with this function. Let’s consider a simple case, where we already have the ShortName for the monthly native grid SSH from ECCOv4r4 (ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4), and we want to access output from the year 2017. The ShortName goes in the query field, and we can specify start and end dates (in YYYY, YYYY-MM, or YYYY-MM-DD format). The other options that matter most for this request are the mode and, depending on the mode, the download_root_dir or the jsons_root_dir.
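The default mode, download_ifspace, is described in the help text as downloading only if the storage footprint is at most max_avail_frac times the available storage, with max_avail_frac restricted to the range [0, 0.9]. The sketch below illustrates that rule with the standard library; it is not the library's implementation. The function names are hypothetical, and the default of 0.5 used here is an arbitrary value chosen for the example (the library's own default is not stated in the help text above).

```python
import shutil


def clamp_max_avail_frac(max_avail_frac):
    """Replace values outside [0, 0.9] with the closer endpoint of the range."""
    return min(max(float(max_avail_frac), 0.0), 0.9)


def fits_in_storage(footprint_bytes, available_bytes, max_avail_frac=0.5):
    """Apply the rule: footprint <= max_avail_frac * (available storage)."""
    return footprint_bytes <= clamp_max_avail_frac(max_avail_frac) * available_bytes


def can_download_here(footprint_bytes, directory=".", max_avail_frac=0.5):
    """Check the rule against the free space on the disk holding `directory`."""
    available_bytes = shutil.disk_usage(directory).free
    return fits_in_storage(footprint_bytes, available_bytes, max_avail_frac)


# Would roughly 71 MB of files fit within half of the free space here?
print(can_download_here(71 * 1024**2))
```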
Direct download over the internet (mode = ‘download’)
Let’s try the download mode, which retrieves the data over the Internet using NASA Earthdata URLs (this should work on any machine with Internet access, including cloud environments):
[3]:
# download data and open xarray dataset
SSH_monthly_shortname = 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4'
ds_SSH = ea.ecco_podaac_to_xrdataset(SSH_monthly_shortname,
                                     StartDate='2017-01', EndDate='2017-12',
                                     mode='download',
                                     download_root_dir=join(user_home_dir, 'Downloads', 'ECCO_V4r4_PODAAC'))
created download directory /home/jovyan/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4
DL Progress: 100%|#########################| 12/12 [00:05<00:00, 2.16it/s]
=====================================
total downloaded: 71.02 Mb
avg download speed: 12.73 Mb/s
Time spent = 5.578183174133301 seconds
We specified a root directory for the download (which also happens to be the default setting), and the data files are then placed under download_root_dir/ShortName. We can verify that the contents of the dataset are what we queried:
[4]:
ds_SSH
[4]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2017-01-16T12:00:00 ... 2017-12-16T0...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:           This research was carried out by the Jet Pr...
    author:                    Ian Fenty and Ou Wang
    cdm_data_type:             Grid
    comment:                   Fields provided on the curvilinear lat-lon-...
    Conventions:               CF-1.8, ACDD-1.3
    coordinates_comment:       Note: the global 'coordinates' attribute de...
    ...                        ...
    time_coverage_duration:    P1M
    time_coverage_end:         2017-02-01T00:00:00
    time_coverage_resolution:  P1M
    time_coverage_start:       2017-01-01T00:00:00
    title:                     ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                      a21a5c30-400c-11eb-a9e0-0cc47a3f49c3
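Since downloaded files are placed under download_root_dir/ShortName, the local granule files can be listed with a few lines of standard-library Python. In the sketch below, list_granule_files is a hypothetical helper (not an ecco_access function), and the demonstration creates empty placeholder files in a temporary directory, with names modelled on the monthly SSH granule names that appear later in this tutorial.

```python
import tempfile
from pathlib import Path


def list_granule_files(download_root_dir, shortname):
    """Return the sorted NetCDF files found in download_root_dir/ShortName."""
    dataset_dir = Path(download_root_dir).expanduser() / shortname
    if not dataset_dir.is_dir():
        raise FileNotFoundError(f"no download directory at {dataset_dir}")
    return sorted(dataset_dir.glob("*.nc"))


# Demonstration with empty placeholder files (no real data is involved)
shortname = "ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4"
with tempfile.TemporaryDirectory() as tmp:
    dataset_dir = Path(tmp) / shortname
    dataset_dir.mkdir()
    for month in range(1, 13):
        name = f"SEA_SURFACE_HEIGHT_mon_mean_2017-{month:02d}_ECCO_V4r4_native_llc0090.nc"
        (dataset_dir / name).touch()
    found_names = [p.name for p in list_granule_files(tmp, shortname)]

print(len(found_names), "files; first:", found_names[0])
```

With real downloads, the first argument would be the same download_root_dir that was passed to ecco_podaac_to_xrdataset.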
In-cloud direct access (mode = ‘s3_open’)
If you are working in the AWS Cloud, you do not have to download the ECCO output files to carry out computations with the data. Let’s try s3_open, which opens the files from S3 (no download necessary).
[5]:
SSH_monthly_shortname = 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4'
ds_SSH_s3 = ea.ecco_podaac_to_xrdataset(SSH_monthly_shortname,
                                        StartDate='2017-01', EndDate='2017-12',
                                        mode='s3_open')
{'ShortName': 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2017-01-02,2017-12-31'}
Total number of matching granules: 12
[6]:
ds_SSH_s3
[6]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2017-01-16T12:00:00 ... 2017-12-16T0...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:           This research was carried out by the Jet Pr...
    author:                    Ian Fenty and Ou Wang
    cdm_data_type:             Grid
    comment:                   Fields provided on the curvilinear lat-lon-...
    Conventions:               CF-1.8, ACDD-1.3
    coordinates_comment:       Note: the global 'coordinates' attribute de...
    ...                        ...
    time_coverage_duration:    P1M
    time_coverage_end:         2017-02-01T00:00:00
    time_coverage_resolution:  P1M
    time_coverage_start:       2017-01-01T00:00:00
    title:                     ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                      a21a5c30-400c-11eb-a9e0-0cc47a3f49c3
Now plot the SSH for Jan 2017 in tile 10 (Python numbering convention; 11 in Fortran/MATLAB numbering convention). Here we use the “RdYlBu” colormap, one of many built-in colormaps that the matplotlib package provides (you can also create your own). The “_r” at the end reverses the direction of the colormap, so red corresponds to the maximum values.
[7]:
ds_SSH_s3.SSH.isel(time=0,tile=10).plot(cmap='RdYlBu_r')
[7]:
<matplotlib.collections.QuadMesh at 0x7f1bced3fcb0>
We can also use the ecco_v4_py package to plot a global map of Jan 2017 SSH, using the plot_proj_to_latlon_grid function, which regrids from the native LLC grid to a lat/lon grid.
[8]:
import ecco_v4_py as ecco
plt.figure(figsize=(12,6), dpi=90)
ecco.plot_proj_to_latlon_grid(ds_SSH_s3.XC, ds_SSH_s3.YC,
                              ds_SSH_s3.SSH.isel(time=0),
                              user_lon_0=-160,
                              projection_type='robin',
                              plot_type='pcolormesh',
                              cmap='RdBu_r',
                              dx=1, dy=1, cmin=-1.2, cmax=1.2, show_colorbar=True)
plt.title('Monthly mean SSH [m], Jan 2017')
plt.show()
What if you don’t know the ShortName already?
NASA Earthdata datasets are identified by ShortNames, but you might not know the ShortName of the variable or category of variables that you are seeking. One way to find the ShortName is to consult the ECCOv4r4 variable lists. Another option is to specify a text string “query” in ecco_access functions. If “query” does not match an existing NASA Earthdata ShortName, the function conducts a case-insensitive search for exact matches of that text string among the variable names and descriptions in the variable lists.
Note: if using a “query” to search the variable lists, it is recommended to use a single word (or even part of a word) when possible. Since only exact text matches are identified, you are more likely to get results with a shorter query.
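To see why a short query tends to work better, here is a minimal sketch of a case-insensitive exact-substring search. It is not the search code in ecco_access: the function search_variable_lists and the small VARIABLE_LISTS table are hypothetical stand-ins, with a few ShortNames, variable names, and descriptions copied from the query output shown later in this tutorial.

```python
# A small excerpt of the variable lists, copied from output shown in this tutorial.
VARIABLE_LISTS = {
    "ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4": {
        "SSHIBC": "The inverted barometer correction to sea surface height "
                  "due to atmospheric pressure loading. (m)",
        "ETAN": "Model sea level anomaly, without corrections for global mean "
                "density changes, inverted barometer effect, or volume "
                "displacement due to submerged sea-ice and snow. (m)",
    },
    "ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4": {
        "SIarea": "Sea-ice concentration (fraction between 0 and 1)",
        "SIheff": "Area-averaged sea-ice thickness (m)",
    },
    "ECCO_L4_SEA_ICE_VELOCITY_LLC0090GRID_MONTHLY_V4R4": {
        "SIuice": "Sea-ice velocity in the model +x direction (m/s)",
    },
}


def search_variable_lists(query, variable_lists=VARIABLE_LISTS):
    """Return ShortNames whose name, variable names, or descriptions contain query."""
    needle = query.lower()
    matches = []
    for shortname, variables in variable_lists.items():
        texts = [shortname, *variables.keys(), *variables.values()]
        if any(needle in text.lower() for text in texts):
            matches.append(shortname)
    return matches


print(search_variable_lists("ice"))            # also matches SSH, via the ETAN description
print(search_variable_lists("concentration"))  # matches only the sea-ice concentration dataset
print(search_variable_lists("sea ice"))        # no match: the descriptions spell it "sea-ice"
```

The last query returns nothing in this sketch because an exact-substring search does not treat "sea ice" and "sea-ice" as the same text, which is the reason for preferring a single word or part of a word.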
For example, perhaps you are looking to open the dataset that has native grid monthly sea ice concentration in 2007. If the query is not identified as a ShortName, then a text search of the variable lists is conducted using query, grid, and time_res. Then of the identified matches, the user is asked to select one.
[9]:
ds_seaice_conc = ea.ecco_podaac_to_xrdataset('ice', grid='native', time_res='monthly',
                                             StartDate='2007-01', EndDate='2007-12',
                                             mode='s3_open')
ShortName Options for query "ice":
Variable Name Description (units)
Option 1: ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
SSH Dynamic sea surface height anomaly. Suitable for
comparisons with altimetry sea surface height data
products that apply the inverse barometer
correction. (m)
SSHIBC The inverted barometer correction to sea surface
height due to atmospheric pressure loading. (m)
SSHNOIBC Sea surface height anomaly without the inverted
barometer correction. Suitable for comparisons
with altimetry sea surface height data products
that do NOT apply the inverse barometer
correction. (m)
ETAN Model sea level anomaly, without corrections for
global mean density changes, inverted barometer
effect, or volume displacement due to submerged
sea-ice and snow. (m)
Option 2: ECCO_L4_STRESS_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
EXFtaux Wind stress in the model +x direction (N/m^2)
EXFtauy Wind stress in the model +y direction (N/m^2)
oceTAUX Ocean surface stress in the model +x direction,
due to wind and sea-ice (N/m^2)
oceTAUY Ocean surface stress in the model +y direction,
due to wind and sea-ice (N/m^2)
Option 3: ECCO_L4_HEAT_FLUX_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
EXFhl Open ocean air-sea latent heat flux (W/m^2)
EXFhs Open ocean air-sea sensible heat flux (W/m^2)
EXFlwdn Downward longwave radiative flux (W/m^2)
EXFswdn Downwelling shortwave radiative flux (W/m^2)
EXFqnet Open ocean net air-sea heat flux (W/m^2)
oceQnet Net heat flux into the ocean surface (W/m^2)
SIatmQnt Net upward heat flux to the atmosphere (W/m^2)
TFLUX Rate of change of ocean heat content per m^2
accounting for mass (e.g. freshwater) fluxes
(W/m^2)
EXFswnet Open ocean net shortwave radiative flux (W/m^2)
EXFlwnet Net open ocean longwave radiative flux (W/m^2)
oceQsw Net shortwave radiative flux across the ocean
surface (W/m^2)
SIaaflux Conservative ocean and sea-ice advective heat flux
adjustment, associated with temperature difference
between sea surface temperature and sea-ice,
excluding latent heat of fusion (W/m^2)
Option 4: ECCO_L4_FRESH_FLUX_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
EXFpreci Precipitation rate (m/s)
EXFevap Open ocean evaporation rate (m/s)
EXFroff River runoff (m/s)
SIsnPrcp Snow precipitation on sea-ice (kg/(m^2 s))
EXFempmr Open ocean net surface freshwater flux from
precipitation, evaporation, and runoff (m/s)
oceFWflx Net freshwater flux into the ocean (kg/(m^2 s))
SIatmFW Net freshwater flux into the open ocean, sea-ice,
and snow (kg/(m^2 s))
SFLUX Rate of change of total ocean salinity per m^2
accounting for mass fluxes (g/(m^2 s))
SIacSubl Freshwater flux to the atmosphere due to
sublimation-deposition of snow or ice (kg/(m^2 s))
SIrsSubl Residual sublimation freshwater flux (kg/(m^2 s))
SIfwThru Precipitation through sea-ice (kg/(m^2 s))
Option 5: ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
SIarea Sea-ice concentration (fraction between 0 and 1)
SIheff Area-averaged sea-ice thickness (m)
SIhsnow Area-averaged snow thickness (m)
sIceLoad Average sea-ice and snow mass per unit area
(kg/m^2)
Option 6: ECCO_L4_SEA_ICE_VELOCITY_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
SIuice Sea-ice velocity in the model +x direction (m/s)
SIvice Sea-ice velocity in the model +y direction (m/s)
Option 7: ECCO_L4_SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
ADVxHEFF Lateral advective flux of sea-ice thickness in the
model +x direction (m^3/s)
ADVyHEFF Lateral advective flux of sea-ice thickness in the
model +y direction (m^3/s)
DFxEHEFF Lateral diffusive flux of sea-ice thickness in the
model +x direction (m^3/s)
DFyEHEFF Lateral diffusive flux of sea-ice thickness in the
model +y direction (m^3/s)
ADVxSNOW Lateral advective flux of snow thickness in the
model +x direction (m^3/s)
ADVySNOW Lateral advective flux of snow thickness in the
model +y direction (m^3/s)
DFxESNOW Lateral diffusive flux of snow thickness in the
model +x direction (m^3/s)
DFyESNOW Lateral diffusive flux of snow thickness in the
model +y direction (m^3/s)
Option 8: ECCO_L4_SEA_ICE_SALT_PLUME_FLUX_LLC0090GRID_MONTHLY_V4R4 *native grid,monthly means*
oceSPflx Net salt flux into the ocean due to brine
rejection (g/(m^2 s))
oceSPDep Salt plume depth (m)
Please select option [1-8]: 5
Using dataset with ShortName: ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4
{'ShortName': 'ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2007-01-02,2007-12-31'}
Total number of matching granules: 12
We selected option 5, corresponding to ShortName ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4. Let’s look at the dataset contents.
[10]:
ds_seaice_conc
[10]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2007-01-16T12:00:00 ... 2007-12-16T1...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SIarea     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SIheff     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SIhsnow    (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    sIceLoad   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:           This research was carried out by the Jet Pr...
    author:                    Ian Fenty and Ou Wang
    cdm_data_type:             Grid
    comment:                   Fields provided on the curvilinear lat-lon-...
    Conventions:               CF-1.8, ACDD-1.3
    coordinates_comment:       Note: the global 'coordinates' attribute de...
    ...                        ...
    time_coverage_duration:    P1M
    time_coverage_end:         2007-02-01T00:00:00
    time_coverage_resolution:  P1M
    time_coverage_start:       2007-01-01T00:00:00
    title:                     ECCO Sea-Ice and Snow Concentration and Thi...
    uuid:                      e6cdf192-400d-11eb-93b5-0cc47a3f49c3
Now plot the sea ice concentration/fraction in tile 6 (which approximately covers the Arctic Ocean) during Sep 2007, which at the time was a record minimum for Arctic sea ice.
[11]:
ds_seaice_conc.SIarea.isel(time=8,tile=6).plot(cmap='cool')
[11]:
<matplotlib.collections.QuadMesh at 0x7f1bcc70c1a0>
Using the ecco_podaac_access function
In-cloud direct access (mode = ‘s3_open’)
The ecco_podaac_to_xrdataset function that was previously used invokes ecco_podaac_access under the hood, and ecco_podaac_access can also be called directly. This can be useful if you want to obtain a list of file objects/paths or URLs that you can then process with your own code. Let’s use this function with mode = s3_open (all s3 modes only work from an AWS cloud environment in region us-west-2).
[12]:
files_dict = ea.ecco_podaac_access(SSH_monthly_shortname,
                                   StartDate='2015-01', EndDate='2015-12',
                                   mode='s3_open')
{'ShortName': 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2015-01-02,2015-12-31'}
Total number of matching granules: 12
[13]:
files_dict[SSH_monthly_shortname]
[13]:
[<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-01_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-02_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-03_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-04_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-05_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-06_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-07_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-08_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-09_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-10_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-11_ECCO_V4r4_native_llc0090.nc>,
<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-12_ECCO_V4r4_native_llc0090.nc>]
The output of ecco_podaac_access is a dictionary with ShortNames as keys. In this case, the value associated with this ShortName is a list of 12 file objects. These are files on S3 (AWS’s cloud storage system) that have been opened, which is a necessary step for the files’ data to be accessed. The list of open files can be passed directly to xarray.open_mfdataset.
[14]:
ds_SSH_fromlist = xr.open_mfdataset(files_dict[SSH_monthly_shortname],
                                    compat='override', data_vars='minimal', coords='minimal',
                                    parallel=True)
[15]:
ds_SSH_fromlist
[15]:
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2015-01-16T12:00:00 ... 2015-12-16T1...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:           This research was carried out by the Jet Pr...
    author:                    Ian Fenty and Ou Wang
    cdm_data_type:             Grid
    comment:                   Fields provided on the curvilinear lat-lon-...
    Conventions:               CF-1.8, ACDD-1.3
    coordinates_comment:       Note: the global 'coordinates' attribute de...
    ...                        ...
    time_coverage_duration:    P1M
    time_coverage_end:         2015-02-01T00:00:00
    time_coverage_resolution:  P1M
    time_coverage_start:       2015-01-01T00:00:00
    title:                     ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                      a4955186-400c-11eb-8c14-0cc47a3f49c3
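Working with the file list directly also makes it possible to filter granules before opening them. The monthly mean file names in the list returned by ecco_podaac_access include the year and month, so a regular expression can pick them out. The sketch below operates on path strings; granule_year_month is a hypothetical helper, and the pattern is inferred from the monthly mean file names in that list, so it would need adjusting for daily or snapshot granules.

```python
import re

# Pattern inferred from names such as
# SEA_SURFACE_HEIGHT_mon_mean_2015-09_ECCO_V4r4_native_llc0090.nc
GRANULE_MONTH = re.compile(r"_mon_mean_(\d{4})-(\d{2})_")


def granule_year_month(path):
    """Return (year, month) parsed from a monthly mean granule path string."""
    match = GRANULE_MONTH.search(str(path))
    if match is None:
        raise ValueError(f"no year-month found in {path!r}")
    return int(match.group(1)), int(match.group(2))


example_path = ("podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/"
                "SEA_SURFACE_HEIGHT_mon_mean_2015-09_ECCO_V4r4_native_llc0090.nc")
example_year_month = granule_year_month(example_path)
print(example_year_month)

# Keep only the boreal summer months (June to August) from a list of paths
paths = [example_path.replace("2015-09", f"2015-{m:02d}") for m in range(1, 13)]
summer_paths = [p for p in paths if granule_year_month(p)[1] in (6, 7, 8)]
print(len(summer_paths))
```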