xarray_to_cdf
- cdflib.xarray.xarray_to_cdf(xarray_dataset, file_name, unix_time_to_cdf_time=False, istp=True, terminate_on_warning=False, auto_fix_depends=True, record_dimensions=['record0'], compression=0, nan_to_fillval=True)
This function converts XArray Dataset objects into CDF files.
- Parameters:
xarray_dataset (xarray.Dataset) – The XArray Dataset object that you’d like to convert into a CDF file
file_name (str) – The path where the newly created CDF file will be written
unix_time_to_cdf_time (bool, optional) – Whether to assume that variables which will become CDF_EPOCH/EPOCH16/TT2000 hold Unix timestamps
istp (bool, optional) – Whether or not to do checks on the Dataset object to attempt to enforce CDF compliance
terminate_on_warning (bool, optional) – Whether to raise an error when a warning is encountered, rather than continuing to try to create the file
auto_fix_depends (bool, optional) – Whether or not to automatically add dependencies
record_dimensions (list of str, optional) – If the code cannot determine which dimensions should be made into CDF records, you may provide a list of them here
compression (int, optional) – The gzip compression level applied to the variable data. The default, 0, means no compression; 6 is the standard level.
nan_to_fillval (bool, optional) – Whether to convert all np.nan and np.datetime64('NaT') values to the standard CDF FILLVALs.
- Returns:
None, but generates a CDF file
- Return type:
None
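As an alternative to relying on unix_time_to_cdf_time, Unix timestamps can be converted to np.datetime64 before the Dataset is built, since np.datetime64 is listed under CDF Data Types as mapping to CDF_TIME_TT2000. The snippet below is a minimal sketch that assumes the timestamps are whole seconds since 1970-01-01 UTC; the sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample: three consecutive Unix timestamps, in whole seconds.
unix_seconds = [1639872000, 1639872001, 1639872002]

# Reinterpret the integer seconds as datetime64 values with second resolution.
epoch = np.asarray(unix_seconds, dtype="int64").astype("datetime64[s]")

print(epoch[0])  # 2021-12-19T00:00:00
```

The resulting array can then be used as the data of the "epoch" variable in place of raw integers.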
- Example CDF file from scratch:
>>> # Import the needed libraries
>>> from cdflib.xarray import xarray_to_cdf
>>> import xarray as xr
>>> import os
>>> import urllib.request

>>> # Create some fake data
>>> var_data = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
>>> var_dims = ['epoch', 'direction']
>>> data = xr.Variable(var_dims, var_data)

>>> # Create fake epoch data
>>> epoch_data = [1, 2, 3]
>>> epoch_dims = ['epoch']
>>> epoch = xr.Variable(epoch_dims, epoch_data)

>>> # Combine the two into an xarray Dataset and export as CDF (this will print out many ISTP warnings)
>>> ds = xr.Dataset(data_vars={'data': data, 'epoch': epoch})
>>> xarray_to_cdf(ds, 'hello.cdf')

>>> # Add some global attributes
>>> global_attributes = {'Project': 'Hail Mary',
...                      'Source_name': 'Thin Air',
...                      'Discipline': 'None',
...                      'Data_type': 'counts',
...                      'Descriptor': 'Midichlorians in unicorn blood',
...                      'Data_version': '3.14',
...                      'Logical_file_id': 'SEVENTEEN',
...                      'PI_name': 'Darth Vader',
...                      'PI_affiliation': 'Dark Side',
...                      'TEXT': 'AHHHHH',
...                      'Instrument_type': 'Banjo',
...                      'Mission_group': 'Impossible',
...                      'Logical_source': ':)',
...                      'Logical_source_description': ':('}

>>> # Let's add a new coordinate variable for the "direction"
>>> dir_data = [1, 2, 3]
>>> dir_dims = ['direction']
>>> direction = xr.Variable(dir_dims, dir_data)

>>> # Recreate the Dataset with these new objects, and recreate the CDF
>>> ds = xr.Dataset(data_vars={'data': data, 'epoch': epoch, 'direction': direction}, attrs=global_attributes)
>>> os.remove('hello.cdf')
>>> xarray_to_cdf(ds, 'hello.cdf')
- Example netCDF -> CDF conversion:
>>> # Download a netCDF file (if needed)
>>> fname = 'dn_magn-l2-hires_g17_d20211219_v1-0-1.nc'
>>> url = ("https://lasp.colorado.edu/maven/sdc/public/data/sdc/web/cdflib_testing/dn_magn-l2-hires_g17_d20211219_v1-0-1.nc")
>>> if not os.path.exists(fname):
...     urllib.request.urlretrieve(url, fname)

>>> # Load in the dataset, and set VAR_TYPE attributes (the most important attribute as far as this code is concerned)
>>> goes_r_mag = xr.load_dataset("dn_magn-l2-hires_g17_d20211219_v1-0-1.nc")
>>> for var in goes_r_mag:
...     goes_r_mag[var].attrs['VAR_TYPE'] = 'data'
>>> goes_r_mag['coordinate'].attrs['VAR_TYPE'] = 'support_data'
>>> goes_r_mag['time'].attrs['VAR_TYPE'] = 'support_data'
>>> goes_r_mag['time_orbit'].attrs['VAR_TYPE'] = 'support_data'

>>> # Create the CDF file
>>> xarray_to_cdf(goes_r_mag, 'hello.cdf')
- Processing Steps:
- Determine the list of dimensions that are time-varying. These ultimately become the “records” of the CDF file
If a dimension is named “epoch” or “epoch_N”, it is considered time-varying
If a variable points to another variable with a DEPEND_0 attribute, it is considered time-varying
If a variable has an attribute of VAR_TYPE equal to “data”, it is time-varying
If a variable has an attribute of VAR_TYPE equal to “support_data” and it is 2 dimensional, it is time-varying
- Determine a list of “dimension” variables within the Dataset object
These are all coordinates in the dataset that are not time-varying
Additionally, variables that a DEPEND_N attribute points to are also considered dimensions
Optionally, if istp=True, automatically add in DEPEND_0/1/2/etc attributes as necessary
Optionally, if istp=True, check that all required variable attributes and global attributes are present
Convert all data into either CDF_INT8, CDF_DOUBLE, CDF_UINT4, or CDF_CHAR
Optionally, convert variables with the name “epoch” or “epoch_N” to CDF_TT2000
Write all variables and global attributes to the CDF file!
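The time-varying rules in the first step can be sketched in plain Python. This is an illustration of the rules as worded above, not the library's implementation. The helper name guess_record_dims and the input layout (a dict of name -> (dims, attrs)) are hypothetical. It also assumes that N in “epoch_N” is an integer and that the first dimension of a time-varying variable is its record dimension.

```python
import re

# Assumption: "epoch_N" means "epoch_" followed by an integer.
EPOCH_NAME = re.compile(r"^epoch(_\d+)?$")


def guess_record_dims(variables):
    """Return the set of dimension names that look time-varying.

    `variables` maps a variable name to a (dims, attrs) pair, where `dims`
    is a tuple of dimension names and `attrs` is a dict of attributes.
    """
    record_dims = set()
    for name, (dims, attrs) in variables.items():
        # Rule 1: dimensions named "epoch" or "epoch_N".
        record_dims.update(d for d in dims if EPOCH_NAME.match(d))

        # Rule 2: the variable named by a DEPEND_0 attribute is time-varying.
        target = attrs.get("DEPEND_0")
        if target in variables and variables[target][0]:
            record_dims.add(variables[target][0][0])

        # Rules 3 and 4: VAR_TYPE "data", or 2-dimensional "support_data".
        var_type = attrs.get("VAR_TYPE")
        is_2d_support = var_type == "support_data" and len(dims) == 2
        if dims and (var_type == "data" or is_2d_support):
            record_dims.add(dims[0])  # assumption: first dimension is the record
    return record_dims


# Hypothetical description of a small dataset.
variables = {
    "epoch": (("epoch",), {"VAR_TYPE": "support_data"}),
    "direction": (("direction",), {"VAR_TYPE": "support_data"}),
    "data": (("epoch", "direction"), {"VAR_TYPE": "data", "DEPEND_0": "epoch"}),
    "counts": (("time", "energy"), {"VAR_TYPE": "data"}),
}
record_dims = guess_record_dims(variables)
print(sorted(record_dims))  # ['epoch', 'time']
```

When a heuristic like this cannot identify the records, the record_dimensions parameter lets you name them explicitly.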
- ISTP Warnings:
If istp=True, these are some of the common problems the function checks for:
Missing or invalid VAR_TYPE variable attributes
DEPEND_N missing from variables
DEPEND_N/LABL_PTR/UNIT_PTR/FORM_PTR are pointing to missing variables
Missing required global attributes
Conflicting global attributes
Missing an “epoch” dimension
DEPEND_N attribute pointing to a variable with incompatible dimensions
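A few of these checks are easy to run yourself before exporting. The sketch below is not the library's own checker. The helper name precheck and its arguments are hypothetical. The set of valid VAR_TYPE values is taken from the ISTP guidelines and is assumed to match what the function accepts. The list of required global attributes is supplied by the caller rather than hard-coded.

```python
import re

# Attribute names that point at other variables, e.g. DEPEND_0 or LABL_PTR_1.
POINTER_ATTR = re.compile(r"^(DEPEND|LABL_PTR|UNIT_PTR|FORM_PTR)_\d+$")
# Assumption: VAR_TYPE values as defined in the ISTP guidelines.
VALID_VAR_TYPES = {"data", "support_data", "metadata", "ignore_data"}


def precheck(var_attrs, global_attrs, required_globals):
    """Return a list of human-readable problems (empty if none were found).

    `var_attrs` maps each variable name to its attribute dict. With a real
    Dataset `ds` this could be built as
    {name: dict(ds[name].attrs) for name in ds.variables}.
    """
    problems = []
    for name, attrs in var_attrs.items():
        if attrs.get("VAR_TYPE") not in VALID_VAR_TYPES:
            problems.append(f"{name}: missing or invalid VAR_TYPE")
        for key, target in attrs.items():
            if POINTER_ATTR.match(key) and target not in var_attrs:
                problems.append(f"{name}: {key} points to missing variable {target!r}")
    for key in required_globals:
        if key not in global_attrs:
            problems.append(f"global attribute {key} is missing")
    return problems


# Hypothetical attributes for three variables and one global attribute.
var_attrs = {
    "epoch": {"VAR_TYPE": "support_data"},
    "data": {"VAR_TYPE": "data", "DEPEND_0": "epoch", "DEPEND_1": "direction"},
    "flag": {},
}
problems = precheck(var_attrs, {"Project": "Hail Mary"}, ["Project", "PI_name"])
for line in problems:
    print(line)
```

This reports the dangling DEPEND_1 on "data", the missing VAR_TYPE on "flag", and the missing PI_name global attribute.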
- CDF Data Types:
All variable data is automatically converted to one of the following CDF types, based on the type of data in the xarray Dataset:
Numpy type       CDF Data Type
-------------    ---------------
np.datetime64    CDF_TIME_TT2000
np.int8          CDF_INT1
np.int16         CDF_INT2
np.int32         CDF_INT4
np.int64         CDF_INT8
np.float16       CDF_FLOAT
np.float32       CDF_FLOAT
np.float64       CDF_DOUBLE
np.uint8         CDF_UINT1
np.uint16        CDF_UINT2
np.uint32        CDF_UINT4
(not given)      CDF_EPOCH16
(not given)      CDF_CHAR
(not given)      CDF_CHAR
object           CDF_CHAR
datetime         CDF_TIME_TT2000
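To preview which CDF type an array would receive by default, the table can be written as a lookup. This is a sketch built only from the rows above that name a Numpy type, not the function's internal code. The helper name default_cdf_type is hypothetical, and it returns None for any dtype the table does not name.

```python
import numpy as np

# Default conversions copied from the table (rows with a named Numpy type).
DEFAULT_CDF_TYPES = {
    np.int8: "CDF_INT1",
    np.int16: "CDF_INT2",
    np.int32: "CDF_INT4",
    np.int64: "CDF_INT8",
    np.float16: "CDF_FLOAT",
    np.float32: "CDF_FLOAT",
    np.float64: "CDF_DOUBLE",
    np.uint8: "CDF_UINT1",
    np.uint16: "CDF_UINT2",
    np.uint32: "CDF_UINT4",
}


def default_cdf_type(values):
    """Return the table's CDF type name for `values`, or None if unlisted."""
    dtype = np.asarray(values).dtype
    if dtype.kind == "M":  # any np.datetime64 resolution
        return "CDF_TIME_TT2000"
    if dtype.kind == "O":  # generic Python objects
        return "CDF_CHAR"
    return DEFAULT_CDF_TYPES.get(dtype.type)


print(default_cdf_type(np.array([1, 2, 3], dtype=np.int32)))        # CDF_INT4
print(default_cdf_type(np.array([1.5], dtype=np.float16)))          # CDF_FLOAT
print(default_cdf_type(np.array(["2021-12-19"], dtype="datetime64[ns]")))  # CDF_TIME_TT2000
```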
If you want to attempt to cast your data to a different type, you need to add an attribute to your variable called “CDF_DATA_TYPE”. xarray_to_cdf will read this attribute and override the default conversions. Valid choices are:
Integers: CDF_INT1, CDF_INT2, CDF_INT4, CDF_INT8
Unsigned Integers: CDF_UINT1, CDF_UINT2, CDF_UINT4
Floating Point: CDF_REAL4, CDF_FLOAT, CDF_DOUBLE, CDF_REAL8
Time: CDF_EPOCH, CDF_EPOCH16, CDF_TIME_TT2000
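Because the override is a free-form string attribute, a typo in it is easy to miss. The following is a small sketch of validating a requested type against the choices listed above before export. The helper name requested_cdf_type is hypothetical, and raising ValueError on an unknown name is a choice made for this example, not documented behavior of xarray_to_cdf.

```python
# Valid override names, copied from the list above.
VALID_CDF_DATA_TYPES = {
    "CDF_INT1", "CDF_INT2", "CDF_INT4", "CDF_INT8",
    "CDF_UINT1", "CDF_UINT2", "CDF_UINT4",
    "CDF_REAL4", "CDF_FLOAT", "CDF_DOUBLE", "CDF_REAL8",
    "CDF_EPOCH", "CDF_EPOCH16", "CDF_TIME_TT2000",
}


def requested_cdf_type(attrs, default):
    """Return the CDF_DATA_TYPE override in `attrs`, else `default`."""
    requested = attrs.get("CDF_DATA_TYPE")
    if requested is None:
        return default
    if requested not in VALID_CDF_DATA_TYPES:
        raise ValueError(f"unknown CDF_DATA_TYPE: {requested!r}")
    return requested


# With a real Dataset the attribute would be set on the variable, e.g.
# ds['data'].attrs['CDF_DATA_TYPE'] = 'CDF_INT2'; plain dicts stand in here.
print(requested_cdf_type({"CDF_DATA_TYPE": "CDF_INT2"}, "CDF_INT8"))  # CDF_INT2
print(requested_cdf_type({}, "CDF_INT8"))                             # CDF_INT8
```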