NetCDF Climate and Forecast (CF) Metadata Conventions
Version 1.1, 17 January, 2008
Many others have contributed to the development of CF through their participation in discussions about proposed changes.
Abstract
This document describes the CF conventions for climate and forecast metadata designed to promote the processing and sharing of files created with the netCDF Application Programmer Interface [NetCDF]. The conventions define metadata that provide a definitive description of what the data in each variable represents, and of the spatial and temporal properties of the data. This enables users of data from different sources to decide which quantities are comparable, and facilitates building applications with powerful extraction, regridding, and display capabilities.
The CF conventions generalize and extend the COARDS conventions [COARDS]. The extensions include metadata that provides a precise definition of each variable via specification of a standard name, describes the vertical locations corresponding to dimensionless vertical coordinate values, and provides the spatial coordinates of non-rectilinear gridded data. Since climate and forecast data are often not simply representative of points in space/time, other extensions provide for the description of coordinate intervals, multidimensional cells and climatological time coordinates, and indicate how a data value is representative of an interval or cell. This standard also relaxes the COARDS constraints on dimension order and specifies methods for reducing the size of datasets.
Table of Contents
- Preface
- 1. Introduction
- 2. NetCDF Files and Components
- 3. Description of the Data
- 4. Coordinate Types
- 5. Coordinate Systems
- 6. Labels and Alternative Coordinates
- 7. Data Representative of Cells
- 8. Reduction of Dataset Size
- A. Attributes
- B. Standard Name Table Format
- C. Standard Name Modifiers
- D. Dimensionless Vertical Coordinates
- E. Cell Methods
- F. Grid Mappings
- G. Revision History
- Bibliography
List of Tables
- 3.1. Supported Units
- A.1. Attributes
- C.1. Standard Name Modifiers
- E.1. Cell Methods
- F.1. Grid Mapping Attributes
List of Examples
- 3.1. Use of standard_name
- 3.2. Instrument data
- 3.3. A flag variable
- 4.1. Latitude axis
- 4.2. Longitude axis
- 4.3. Atmosphere sigma coordinate
- 4.4. Time axis
- 4.5. Perpetual time axis
- 4.6. Paleoclimate time axis
- 5.1. Independent coordinate variables
- 5.2. Two-dimensional coordinate variables
- 5.3. Reduced horizontal grid
- 5.4. Timeseries of station data
- 5.5. Trajectories
- 5.6. Rotated pole grid
- 5.7. Lambert conformal projection
- 5.8. Multiple forecasts from a single analysis
- 6.1. Several parcel trajectories
- 6.2. Northward heat transport in Atlantic Ocean
- 6.3. Model level numbers
- 7.1. Cells on a latitude axis
- 7.2. Cells in a non-rectangular grid
- 7.3. Cell areas for a spherical geodesic grid
- 7.4. Methods applied to a timeseries
- 7.5. Surface air temperature variance
- 7.6. Climatological seasons
- 7.7. Decadal averages for January
- 7.8. Temperature for each hour of the average day
- 7.9. Temperature for each hour of the typical climatological day
- 7.10. Monthly-maximum daily precipitation totals
- 8.1. Horizontal compression of a three-dimensional array
- 8.2. Compression of a three-dimensional field
- B.1. A name table containing three entries
- D.1. Atmosphere natural log pressure coordinate
- D.2. Atmosphere sigma coordinate
- D.3. Atmosphere hybrid sigma pressure coordinate
- D.4. Atmosphere hybrid height coordinate
- D.5. Atmosphere smooth level vertical (SLEVE) coordinate
- D.6. Ocean sigma coordinate
- D.7. Ocean s-coordinate
- D.8. Ocean sigma over z coordinate
- D.9. Ocean double sigma coordinate
- F.1. Albers Equal Area
- F.2. Azimuthal equidistant
- F.3. Lambert azimuthal equal area
- F.4. Lambert conformal
- F.5. Polar stereographic
- F.6. Rotated pole
- F.7. Stereographic
- F.8. Transverse Mercator
- Home page:
Contains links to: previous draft and current working draft documents; applications for processing CF conforming files; email list for discussion about interpretation, clarification, and proposals for changes or extensions to the current conventions. http://www-pcmdi.llnl.gov/cf/
- Revision history:
This document will be updated to reflect agreed changes to the standard and to correct mistakes according to the rules of CF governance. See Appendix G, Revision History for the full revision history. Changes with provisional status use the following mark-up style: new text, deleted text, and [a comment].
The NetCDF library [NetCDF] is designed to read and write data that has been structured according to well-defined rules and is easily ported across various computer platforms. The netCDF interface enables but does not require the creation of self-describing datasets. The purpose of the CF conventions is to require conforming datasets to contain sufficient metadata that they are self-describing in the sense that each variable in the file has an associated description of what it represents, including physical units if appropriate, and that each value can be located in space (relative to earth-based coordinates) and time.
An important benefit of a convention is that it enables software tools to display data and perform operations on specified subsets of the data with minimal user intervention. It is possible to provide the metadata describing how a field is located in time and space in many different ways that a human would immediately recognize as equivalent. The purpose in restricting how the metadata is represented is to make it practical to write software that allows a machine to parse that metadata and to automatically associate each data value with its location in time and space. It is equally important that the metadata be easy for human users to write and to understand.
This standard is intended for use with climate and forecast data, for atmosphere, surface and ocean, and was designed with model-generated data particularly in mind. We recognise that there are limits to what a standard can practically cover; we restrict ourselves to issues that we believe to be of common and frequent concern in the design of climate and forecast metadata. Our main purpose, therefore, is to propose a clear, adequate and flexible definition of the metadata needed for climate and forecast data. Although this is specifically a netCDF standard, we feel that most of the ideas are of wider application. The metadata objects could be contained in file formats other than netCDF. Conversion of the metadata between files of different formats will be facilitated if conventions for all formats are based on similar ideas.
This convention is designed to be backward compatible with the COARDS conventions [COARDS], by which we mean that a conforming COARDS dataset also conforms to the CF standard. Thus new applications that implement the CF conventions will be able to process COARDS datasets.
We have also striven to maximize conformance to the COARDS standard, that is, wherever the COARDS metadata conventions provide an adequate description we require their use. Extensions to COARDS are implemented in a manner such that the content that doesn't depend on the extensions is still accessible to applications that adhere to the COARDS standard.
The terms in this document that refer to components of a netCDF file are defined in the NetCDF User's Guide (NUG) [NUG]. Some of those definitions are repeated below for convenience.
- auxiliary coordinate variable
Any netCDF variable that contains coordinate data, but is not a coordinate variable (in the sense of that term defined by the NUG and used by this standard - see below). Unlike coordinate variables, there is no relationship between the name of an auxiliary coordinate variable and the name(s) of its dimension(s).
- boundary variable
A boundary variable is associated with a variable that contains coordinate data. When a data value provides information about conditions in a cell occupying a region of space/time or some other dimension, the boundary variable provides a description of cell extent.
- CDL syntax
The ASCII format used to describe the contents of a netCDF file is called CDL (network Common Data form Language). This format represents arrays using the indexing conventions of the C programming language, i.e., index values start at 0, and in multidimensional arrays, when indexing over the elements of the array, it is the last declared dimension that is the fastest varying in terms of file storage order. The netCDF utilities ncdump and ncgen use this format (see chapter 10 of the NUG). All examples in this document use CDL syntax.
- cell
A region in one or more dimensions whose boundary can be described by a set of vertices. The term interval is sometimes used for one-dimensional cells.
- coordinate variable
We use this term precisely as it is defined in section 2.3.1 of the NUG. It is a one-dimensional variable with the same name as its dimension [e.g., time(time)], and it is defined as a numeric data type with values that are ordered monotonically. Missing values are not allowed in coordinate variables.
- grid mapping variable
A variable used as a container for attributes that define a specific grid mapping. The type of the variable is arbitrary since it contains no data.
- latitude dimension
A dimension of a netCDF variable that has an associated latitude coordinate variable.
- longitude dimension
A dimension of a netCDF variable that has an associated longitude coordinate variable.
- multidimensional coordinate variable
An auxiliary coordinate variable that is multidimensional.
- recommendation
Recommendations in this convention are meant to provide advice that may be helpful for reducing common mistakes. In some cases we have recommended rather than required particular attributes in order to maintain backwards compatibility with COARDS. An application must not depend on a dataset's adherence to recommendations.
- scalar coordinate variable
A scalar variable that contains coordinate data. Functionally equivalent to either a size one coordinate variable or a size one auxiliary coordinate variable.
- spatiotemporal dimension
A dimension of a netCDF variable that is used to identify a location in time and/or space.
- time dimension
A dimension of a netCDF variable that has an associated time coordinate variable.
- vertical dimension
A dimension of a netCDF variable that has an associated vertical coordinate variable.
No variable or dimension names are standardized by this convention. Instead we follow the lead of the NUG and standardize only the names of attributes and some of the values taken by those attributes. The overview provided in this section will be followed with more complete descriptions in following sections. Appendix A, Attributes contains a summary of all the attributes used in this convention.
We recommend that the NUG defined attribute Conventions be given the string value "CF-1.1" to identify datasets that conform to these conventions.
The general description of a file's contents should be contained in the following attributes: title, history, institution, source, comment and references (Section 2.6.2, “Description of file contents”). For backwards compatibility with COARDS none of these attributes is required, but their use is recommended to provide human readable documentation of the file contents.
Each variable in a netCDF file has an associated description which is provided by the attributes units, long_name, and standard_name. The units and long_name attributes are defined in the NUG and the standard_name attribute is defined in this document.
The units attribute is required for all variables that represent dimensional quantities (except for boundary variables defined in Section 7.1, “Cell Boundaries”). The values of the units attributes are character strings that are recognized by UNIDATA's Udunits package [UDUNITS] (with exceptions allowed as discussed in Section 3.1, “Units”).
The long_name and standard_name attributes are used to describe the content of each variable. For backwards compatibility with COARDS neither is required, but use of at least one of them is strongly recommended. The use of standard names will facilitate the exchange of climate and forecast data by providing unambiguous identification of variables most commonly analyzed.
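As an illustration of the three descriptive attributes, the following header-only CDL sketch describes a single data variable. The variable name tas, the dimension sizes, and the long_name text are invented for this sketch, and air_temperature is assumed to be an entry in the standard name table.

```cdl
netcdf variable_description {
dimensions:
    lat = 73 ;
    lon = 96 ;
variables:
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    // "tas" is an arbitrary name chosen for this sketch; CF standardizes no variable names.
    float tas(lat, lon) ;
        tas:units = "K" ;                            // required: temperature is a dimensional quantity
        tas:long_name = "Surface air temperature" ;  // free text, contents not standardized
        tas:standard_name = "air_temperature" ;      // assumed entry in the standard name table
}
```

An application that does not know the name tas can still identify the quantity from standard_name, and a human reader can fall back on long_name.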
Four types of coordinates receive special treatment by these conventions: latitude, longitude, vertical, and time. Every variable must have associated metadata that allows identification of each such coordinate that is relevant. Two independent parts of the convention allow this to be done. There are conventions that identify the variables that contain the coordinate data, and there are conventions that identify the type of coordinate represented by that data.
There are two methods used to identify variables that contain coordinate data. The first is to use the NUG-defined "coordinate variables." The use of coordinate variables is required for all dimensions that correspond to one-dimensional space or time coordinates. In cases where coordinate variables are not applicable, the variables containing coordinate data are identified by the coordinates attribute.
Once the variables containing coordinate data are identified, further conventions are required to determine the type of coordinate represented by each of these variables. Latitude, longitude, and time coordinates are identified solely by the value of their units attribute. Vertical coordinates with units of pressure may also be identified by the units attribute. Other vertical coordinates must use the attribute positive, which determines whether the direction of increasing coordinate value is up or down. Because identification of a coordinate type by its units involves the use of an external software package [UDUNITS], we provide the optional attribute axis for a direct identification of coordinates that correspond to latitude, longitude, vertical, or time axes.
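The identification rules above can be sketched in CDL as follows. The variable names, dimension sizes, and reference date are invented for illustration. The unit strings degrees_north and degrees_east and the axis values X, Y, Z, and T come from the detailed rules in Section 4 rather than from this overview.

```cdl
netcdf coordinate_types {
dimensions:
    time = UNLIMITED ;
    plev = 17 ;
    height = 10 ;
    lat = 73 ;
    lon = 96 ;
variables:
    double time(time) ;
        time:units = "days since 1990-01-01 00:00:00" ;  // time: identified by units
        time:axis = "T" ;                                // optional direct identification
    float plev(plev) ;
        plev:units = "Pa" ;          // vertical: units of pressure are sufficient
        plev:axis = "Z" ;
    float height(height) ;
        height:units = "m" ;         // units of length do not identify a vertical axis...
        height:positive = "up" ;     // ...so the positive attribute must be given
        height:axis = "Z" ;
    float lat(lat) ;
        lat:units = "degrees_north" ;  // latitude: identified by units
        lat:axis = "Y" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;   // longitude: identified by units
        lon:axis = "X" ;
    // Two hypothetical data variables, one on each vertical coordinate.
    float ta(time, plev, lat, lon) ;
        ta:units = "K" ;
    float wind_speed(time, height, lat, lon) ;
        wind_speed:units = "m s-1" ;
}
```

The axis attributes are redundant with the units here; they are shown because they let an application classify the coordinates without consulting Udunits.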
Latitude, longitude, and time are defined by internationally recognized standards, and hence, identifying the coordinates of these types is sufficient to locate data values uniquely with respect to time and a point on the earth's surface. On the other hand, identifying the vertical coordinate is not necessarily sufficient to locate a data value vertically with respect to the earth's surface. In particular, a model may output data on the dimensionless vertical coordinate used in its mathematical formulation. To achieve the goal of being able to spatially locate all data values, this convention includes the definitions of common dimensionless vertical coordinates in Appendix D, Dimensionless Vertical Coordinates. These definitions provide a mapping between the dimensionless coordinate values and dimensional values that can be uniquely located with respect to a point on the earth's surface. The definitions are associated with a coordinate variable via the standard_name and formula_terms attributes. For backwards compatibility with COARDS use of these attributes is not required, but is strongly recommended.
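A sketch of how standard_name and formula_terms attach a definition to a dimensionless vertical coordinate is given below. It assumes the atmosphere sigma coordinate of Appendix D, whose terms are taken here to be sigma, ps, and ptop. The variable names PS and PTOP and all dimension sizes are chosen only for this illustration.

```cdl
netcdf sigma_coordinate {
dimensions:
    time = UNLIMITED ;
    lev = 18 ;
    lat = 73 ;
    lon = 96 ;
variables:
    double time(time) ;
        time:units = "days since 1990-01-01 00:00:00" ;
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    float lev(lev) ;
        lev:long_name = "sigma at layer midpoints" ;
        lev:positive = "down" ;
        lev:standard_name = "atmosphere_sigma_coordinate" ;   // selects the definition
        lev:formula_terms = "sigma: lev ps: PS ptop: PTOP" ;  // maps terms to variables
    // Variables named by formula_terms.
    float PS(time, lat, lon) ;
        PS:units = "Pa" ;
    float PTOP ;
        PTOP:units = "Pa" ;
    float ta(time, lev, lat, lon) ;
        ta:units = "K" ;
}
```

With these attributes an application can compute a pressure for every sigma value instead of treating lev as an opaque level index.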
It is often the case that data values are not representative of single points in time and/or space, but rather of intervals or multidimensional cells. This convention defines a bounds attribute to specify the extent of intervals or cells. When data that is representative of cells can be described by simple statistical methods, those methods can be indicated using the cell_methods attribute. An important application of this attribute is to describe climatological and diurnal statistics.
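The following header-only CDL sketch shows the two attributes together for a time-mean field. The names nv, time_bnds, and pr are invented for this sketch, precipitation_flux is assumed to be an entry in the standard name table, and the "time: mean" syntax follows the rules given in Section 7.

```cdl
netcdf cell_description {
dimensions:
    time = UNLIMITED ;
    nv = 2 ;        // two vertices (start and end) per time interval
    lat = 73 ;
    lon = 96 ;
variables:
    double time(time) ;
        time:units = "days since 2000-01-01 00:00:00" ;
        time:bounds = "time_bnds" ;    // names the boundary variable
    double time_bnds(time, nv) ;       // boundary variable: no units attribute required
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    float pr(time, lat, lon) ;
        pr:units = "kg m-2 s-1" ;
        pr:standard_name = "precipitation_flux" ;
        pr:cell_methods = "time: mean" ;  // each value is a mean over its time interval
}
```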
Methods for reducing the total volume of data include both packing and compression. Packing reduces the data volume by reducing the precision of the stored numbers. It is implemented using the attributes add_offset and scale_factor which are defined in the NUG. Compression, on the other hand, loses no precision, but reduces the volume by not storing missing data. The attribute compress is defined for this purpose.
These conventions generalize and extend the COARDS conventions [COARDS]. A major design goal has been to maintain backward compatibility with COARDS. Hence applications written to process datasets that conform to these conventions will also be able to process COARDS conforming datasets. We have also striven to maximize conformance to the COARDS standard so that datasets that only require the metadata that was available under COARDS will still be able to be processed by COARDS conforming applications. But because of the extensions that provide new metadata content, and the relaxation of some COARDS requirements, datasets that conform to these conventions will not necessarily be recognized by applications that adhere to the COARDS conventions. The features of these conventions that allow writing netCDF files that are not COARDS conforming are summarized below.
COARDS standardizes the description of grids composed of independent latitude, longitude, vertical, and time axes. In addition to standardizing the metadata required to identify each of these axis types COARDS restricts the axis (equivalently dimension) ordering to be longitude, latitude, vertical, and time (with longitude being the most rapidly varying dimension). Because of I/O performance considerations it may not be possible for models to output their data in conformance with the COARDS requirement. The CF convention places no rigid restrictions on the order of dimensions; however, we encourage data producers to make the extra effort to stay within the COARDS standard order. The use of non-COARDS axis ordering will render files inaccessible to some applications and limit interoperability. Often a buffering operation can be used to minimize performance penalties when axis ordering in model code does not match the axis ordering of a COARDS file.
COARDS addresses the issue of identifying dimensionless vertical coordinates, but does not provide any mechanism for mapping the dimensionless values to dimensional ones that can be located with respect to the earth's surface. For backwards compatibility we continue to allow (but do not require) the units attribute of dimensionless vertical coordinates to take the values "level", "layer", or "sigma_level". But we recommend that the standard_name and formula_terms attributes be used to identify the appropriate definition of the dimensionless vertical coordinate (see Section 4.3.2, “Dimensionless Vertical Coordinate”).
The CF conventions define attributes which enable the description of data properties that are outside the scope of the COARDS conventions. These new attributes do not violate the COARDS conventions, but applications that only recognize COARDS conforming datasets will not have the capabilities that the new attributes are meant to enable. Briefly the new attributes allow:
- Identification of quantities using standard names.
- Description of dimensionless vertical coordinates.
- Associating dimensions with auxiliary coordinate variables.
- Linking data variables to scalar coordinate variables.
- Associating dimensions with labels.
- Description of intervals and cells.
- Description of properties of data defined on intervals and cells.
- Description of climatological statistics.
- Data compression for variables with missing values.
The components of a netCDF file are described in section 2 of the NUG [NUG]. In this section we describe conventions associated with filenames and the basic components of a netCDF file. We also introduce new attributes for describing the contents of a file.
The netCDF data types char, byte, short, int, float or real, and double are all acceptable. The char type is not intended for numeric data. One-byte numeric data should be stored using the byte data type. All integer types are treated by the netCDF interface as signed. It is possible to treat the byte type as unsigned by using the NUG convention of indicating the unsigned range using the valid_min, valid_max, or valid_range attributes.
NetCDF does not support a character string type, so these must be represented as character arrays. In this document, a one dimensional array of character data is simply referred to as a "string". An n-dimensional array of strings must be implemented as a character array of dimension (n,max_string_length), with the last (most rapidly varying) dimension declared large enough to contain the longest string in the array. All the strings in a given array are therefore defined to be equal in length. For example, an array of strings containing the names of the months would be dimensioned (12,9) in order to accommodate "September", the month with the longest name.
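The month-name example can be written in CDL as sketched below. The names month, max_string_length, and month_name are chosen only for this illustration, and the sketch relies on ncgen padding the shorter strings to the length of the last dimension.

```cdl
netcdf month_names {
dimensions:
    month = 12 ;
    max_string_length = 9 ;   // length of "September", the longest name
variables:
    // A one-dimensional array of strings stored as a two-dimensional char array.
    char month_name(month, max_string_length) ;
data:
    month_name = "January", "February", "March", "April", "May", "June",
                 "July", "August", "September", "October", "November", "December" ;
}
```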
Variable, dimension and attribute names should begin with a letter and be composed of letters, digits, and underscores. Note that this is in conformance with the COARDS conventions, but is more restrictive than the netCDF interface which allows use of the hyphen character. The netCDF interface also allows leading underscores in names, but the NUG states that this is reserved for system use.
Case is significant in netCDF names, but it is recommended that names should not be distinguished purely by case, i.e., if case is disregarded, no two names should be the same. It is also recommended that names should be obviously meaningful, if possible, as this renders the file more effectively self-describing.
This convention does not standardize any variable or dimension names. Attribute names and their contents, where standardized, are given in English in this document and should appear in English in conforming netCDF files for the sake of portability. Languages other than English are permitted for variables, dimensions, and non-standardized attributes. The contents of some standardized attributes are string values that are not standardized, and thus are not required to be in English. For example, a description of what a variable represents may be given in a non-English language using the long_name attribute (see Section 3.2, “Long Name”) whose contents are not standardized, but a description given by the standard_name attribute (see Section 3.3, “Standard Name”) must be taken from the standard name table which is in English.
A variable may have any number of dimensions, including zero, and the dimensions must all have different names. COARDS strongly recommends limiting the number of dimensions to four, but we wish to allow greater flexibility. The dimensions of the variable define the axes of the quantity it contains. Dimensions other than those of space and time may be included. Several examples can be found in this document. Under certain circumstances, one may need more than one dimension in a particular quantity. For instance, a variable containing a two-dimensional probability density function might correlate the temperature at two different vertical levels, and hence would have temperature on both axes.
If any or all of the dimensions of a variable have the interpretations of "date or time" (T), "height or depth" (Z), "latitude" (Y), or "longitude" (X) then we recommend, but do not require (see Section 1.4, “Relationship to the COARDS Conventions”), those dimensions to appear in the relative order T, then Z, then Y, then X in the CDL definition corresponding to the file. All other dimensions should, whenever possible, be placed to the left of the spatiotemporal dimensions.
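The recommended ordering can be sketched in CDL as follows. The dimension realization is a hypothetical non-spatiotemporal dimension introduced only for this sketch, and all names and sizes are illustrative. The time dimension is given a fixed size here because it is not the leftmost dimension of the data variable.

```cdl
netcdf dimension_order {
dimensions:
    realization = 3 ;   // hypothetical dimension that is neither space nor time
    time = 12 ;
    lev = 17 ;
    lat = 73 ;
    lon = 96 ;
variables:
    double time(time) ;
        time:units = "days since 1990-01-01 00:00:00" ;
    float lev(lev) ;
        lev:units = "Pa" ;
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    // Other dimensions first, then T, Z, Y, X; lon varies fastest in storage.
    float ta(realization, time, lev, lat, lon) ;
        ta:units = "K" ;
}
```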
Dimensions may be of any size, including unity. When a single value of some coordinate applies to all the values in a variable, the recommended means of attaching this information to the variable is by use of a dimension of size unity with a one-element coordinate variable. It is also acceptable to use a scalar coordinate variable which eliminates the need for an associated size one dimension in the data variable. The advantage of using a coordinate variable is that all its attributes can be used to describe the single-valued quantity, including boundaries. For example, a variable containing data for temperature at 1.5 m above the ground has a single-valued coordinate supplying a height of 1.5 m, and a time-mean quantity has a single-valued time coordinate with an associated boundary variable to record the start and end of the averaging period.
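Both ways of attaching a single-valued coordinate are sketched below for the 1.5 m temperature example. All variable names are invented for this sketch, and the scalar form assumes that the data variable names its scalar coordinate variable through the coordinates attribute.

```cdl
netcdf single_valued_coordinate {
dimensions:
    height = 1 ;
    lat = 2 ;
    lon = 2 ;
variables:
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    // Recommended form: size-one dimension with a one-element coordinate variable.
    float height(height) ;
        height:units = "m" ;
        height:positive = "up" ;
    float tas(height, lat, lon) ;
        tas:units = "K" ;
    // Also acceptable: a scalar coordinate variable, with no size-one dimension.
    float height_scalar ;
        height_scalar:units = "m" ;
        height_scalar:positive = "up" ;
    float tas_alt(lat, lon) ;
        tas_alt:units = "K" ;
        tas_alt:coordinates = "height_scalar" ;
data:
    lat = -45, 45 ;
    lon = 90, 270 ;
    height = 1.5 ;
    height_scalar = 1.5 ;
}
```

The data values of tas and tas_alt are omitted, so ncgen would write fill values for them.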
This convention does not standardize variable names.
NetCDF variables that contain coordinate data are referred to as coordinate variables, auxiliary coordinate variables, scalar coordinate variables, or multidimensional coordinate variables.
The NUG conventions (NUG section 8.1) provide the _FillValue, valid_min, valid_max, and valid_range attributes to indicate missing data.
The NUG conventions for missing data changed significantly between version 2.3 and version 2.4. Since version 2.4 the NUG defines missing data as all values outside of the valid_range, and specifies how the valid_range should be defined from the _FillValue (which has library specified default values) if it hasn't been explicitly specified. If only one missing value is needed for a variable then we strongly recommend that this value be specified using the _FillValue attribute. Doing this guarantees that the missing value will be recognized by generic applications that follow either the conventions in use before version 2.4 or those in use since.
The scalar attribute with the name _FillValue and of the same type as its variable is recognized by the netCDF library as the value used to pre-fill disk space allocated to the variable. This value is considered to be a special value that indicates undefined or missing data, and is returned when reading values that were not written. The _FillValue should be outside the range specified by valid_range (if used) for a variable. The netCDF library defines a default fill value for each data type (NUG section 7.16).
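A minimal CDL sketch of a variable that declares both a fill value and a valid range is given below. The variable name sst and the numeric values are illustrative assumptions, chosen so that the fill value lies outside the valid range.

```cdl
netcdf missing_data {
dimensions:
    lat = 73 ;
    lon = 96 ;
variables:
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    float sst(lat, lon) ;
        sst:units = "K" ;
        sst:valid_range = 270.f, 320.f ;  // values outside this range are treated as missing
        sst:_FillValue = -999.f ;         // same type as sst, and outside valid_range
}
```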
The missing_value attribute is considered deprecated by the NUG and we do not recommend its use. However for backwards compatibility with COARDS this standard continues to recognize the use of the missing_value attribute to indicate undefined or missing data.
The missing values of a variable with scale_factor and/or add_offset attributes (see Section 8.1, “Packed Data”) are interpreted relative to the variable's external values, i.e., the values stored in the netCDF file. Applications that process variables that have attributes to indicate both a transformation (via a scale and/or offset) and missing values should first check that a data value is valid, and then apply the transformation. Note that values that are identified as missing should not be transformed. Since the missing value is outside the valid range it is possible that applying a transformation to it could result in an invalid operation. For example, the default _FillValue is very close to the maximum representable value of IEEE single precision floats, and multiplying it by 100 produces an "Infinity" (using single precision arithmetic).
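The order of operations can be illustrated with the packed variable sketched below. The variable name and the packing parameters are invented for this sketch, which assumes the NUG rule that a stored value is unpacked by multiplying it by scale_factor and then adding add_offset.

```cdl
netcdf packed_with_missing {
dimensions:
    lat = 73 ;
    lon = 96 ;
variables:
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    short tas(lat, lon) ;
        tas:units = "K" ;
        tas:scale_factor = 0.01f ;
        tas:add_offset = 273.15f ;
        // Compared against the stored short values, before any unpacking.
        tas:_FillValue = -32767s ;
}
```

A reader of this file would first compare each stored short with -32767. Values equal to it are reported as missing and left untransformed. Every other stored value s is unpacked as s * 0.01 + 273.15.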
This standard describes many attributes (some mandatory, others optional), but a file may also contain non-standard attributes. Such attributes do not represent a violation of this standard. Application programs should ignore attributes that they do not recognise or which are irrelevant for their purposes. Conventional attribute names should be used wherever applicable. Non-standard names should be as meaningful as possible. Before introducing an attribute, consideration should be given to whether the information would be better represented as a variable. In general, if a proposed attribute requires ancillary data to describe it, is multidimensional, requires any of the defined netCDF dimensions to index its values, or requires a significant amount of storage, a variable should be used instead.
When this standard defines string attributes that may take various prescribed values, the possible values are generally given in lower case. However, application programs should not be sensitive to case in these attributes.
Several string attributes are defined by this standard to contain "blank-separated lists". Consecutive words in such a list are separated by one or more adjacent spaces. The list may begin and end with any number of spaces. See Appendix A, Attributes for a list of attributes described by this standard.
We recommend that netCDF files that follow these conventions indicate this by setting the NUG defined global attribute Conventions to the string value "CF-1.1". The string is interpreted as a directory name relative to a directory that is a repository of documents describing sets of discipline-specific conventions. The conventions directory name is currently interpreted relative to the directory pub/netcdf/Conventions/ on the host machine ftp.unidata.ucar.edu. The web based versions of this document are linked from the netCDF Conventions web page.
The following attributes are intended to provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers. The attribute values are all character strings. For readability in ncdump outputs it is recommended to embed newline characters into long strings to break them into lines. For backwards compatibility with COARDS none of these global attributes is required.
The NUG defines title and history to be global attributes. We wish to allow the newly defined attributes, i.e., institution, source, references, and comment, to be either global or assigned to individual variables. When an attribute appears both globally and as a variable attribute, the variable's version has precedence.
- title
A succinct description of what is in the dataset.
- institution
Specifies where the original data was produced.
- source
The method of production of the original data. If it was model-generated, source should name the model and its version, as specifically as could be useful. If it is observational, source should characterize it (e.g., "surface observation" or "radiosonde").
- history
Provides an audit trail for modifications to the original data. Well-behaved generic netCDF filters will automatically append their name and the parameters with which they were invoked to the global history attribute of an input netCDF file. We recommend that each line begin with a timestamp indicating the date and time of day that the program was executed.
- references
Published or web-based references that describe the data or methods used to produce it.
- comment
Miscellaneous information about the data or methods used to produce it.
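The attributes listed above might appear in a file as in the following CDL sketch. Every attribute value except the Conventions string is a placeholder invented for illustration, including the program names and timestamps in history.

```cdl
netcdf file_description {
dimensions:
    lat = 73 ;
    lon = 96 ;
variables:
    float lat(lat) ;
        lat:units = "degrees_north" ;
    float lon(lon) ;
        lon:units = "degrees_east" ;
    float tas(lat, lon) ;
        tas:units = "K" ;
        tas:long_name = "Air temperature" ;
        // Variable-level attribute: takes precedence over the global comment for tas.
        tas:comment = "Placeholder note that applies to this variable only." ;

// global attributes:
        :Conventions = "CF-1.1" ;
        :title = "Placeholder title for an illustrative dataset" ;
        :institution = "Placeholder institution name" ;
        :source = "Placeholder model name, version X.Y" ;
        // One timestamped line per processing step, separated by embedded newlines.
        :history = "2008-01-17 09:30:00 placeholder_filter input.nc output.nc\n",
                   "2008-01-17 10:15:00 placeholder_regrid input.nc output.nc" ;
        :references = "Placeholder citation or URL" ;
        :comment = "All attribute values in this sketch are placeholders." ;
}
```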
The attributes described in this section are used to provide a description of the content and the units of measurement for each variable. We continue to support the use of the units and long_name attributes as defined in COARDS. We extend COARDS by adding the optional standard_name attribute which is used to provide unique identifiers for variables. This is important for data exchange since one cannot necessarily identify a particular variable based on the name assigned to it by the institution that provided the data.
The standard_name attribute can be used to identify variables that contain coordinate data. But since it is an optional attribute, applications that implement these standards must continue to be able to identify coordinate types based on the COARDS conventions.
The units attribute is required for all variables that represent dimensional quantities (except for boundary variables defined in Section 7.1, “Cell Boundaries” and climatology variables defined in Section 7.4, “Climatological Statistics”). The value of the units attribute is a string that can be recognized by UNIDATA's Udunits package [UDUNITS], with a few exceptions that are given below. The Udunits package includes a file udunits.dat, which lists its supported unit names. Note that case is significant in the units strings.
The COARDS convention prohibits the unit degrees altogether, but this unit is not forbidden by the CF convention because it may in fact be appropriate for a variable containing, say, solar zenith angle. The unit degrees is also allowed on coordinate variables such as the latitude and longitude coordinates of a transformed grid. In this case the coordinate values are not true latitudes and longitudes, which must always be identified using the more specific forms of degrees as described in Section 4.1, “Latitude Coordinate” and Section 4.2, “Longitude Coordinate”.
Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The Udunits package defines a few dimensionless units, such as percent, but is lacking commonly used units such as ppm (parts per million). This convention does not support the addition of new dimensionless units that are not udunits compatible. The conforming unit for quantities that represent fractions, or parts of a whole, is "1". The conforming unit for parts per million is "1e-6". Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see below) rather than the units.
The units level, layer, and sigma_level are allowed for dimensionless vertical coordinates to maintain backwards compatibility with COARDS. These units are not compatible with Udunits and are deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are introduced (see Section 4.3.2, “Dimensionless Vertical Coordinate”).
The Udunits syntax that allows scale factors and offsets to be applied to
a unit is not supported by this standard. The application of any scale
factors or offsets to data should be indicated by the
scale_factor and add_offset
attributes. Use of these attributes for data packing,
which is their most important application,
is discussed in detail in Section 8.1, “Packed Data”.
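As a minimal sketch of how an application might apply these two attributes when reading data, assuming the order of operations described in Section 8.1 (multiply by scale_factor, then add add_offset); the function name unpack and the sample values are hypothetical:

```python
def unpack(packed_values, scale_factor=1.0, add_offset=0.0):
    """Return unpacked values computed as packed * scale_factor + add_offset.

    The defaults (1.0 and 0.0) leave the data unchanged, which corresponds
    to a variable that carries neither attribute.
    """
    return [value * scale_factor + add_offset for value in packed_values]


# Hypothetical packed temperatures stored as small integers.
packed = [0, 100, 250]
unpacked = unpack(packed, scale_factor=0.01, add_offset=273.15)
print(unpacked)
```

Because the scaling lives in attributes rather than in the units string, the units attribute of such a variable stays a plain Udunits-recognizable unit.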
Udunits recognizes the following prefixes and their abbreviations.
Table 3.1. Supported Units
| Factor | Prefix | Abbreviation | Factor | Prefix | Abbreviation |
|---|---|---|---|---|---|
| 1e1 | deca, deka | da | 1e-1 | deci | d |
| 1e2 | hecto | h | 1e-2 | centi | c |
| 1e3 | kilo | k | 1e-3 | milli | m |
| 1e6 | mega | M | 1e-6 | micro | u |
| 1e9 | giga | G | 1e-9 | nano | n |
| 1e12 | tera | T | 1e-12 | pico | p |
| 1e15 | peta | P | 1e-15 | femto | f |
| 1e18 | exa | E | 1e-18 | atto | a |
| 1e21 | zetta | Z | 1e-21 | zepto | z |
| 1e24 | yotta | Y | 1e-24 | yocto | y |
The long_name attribute is defined by the NUG to contain a long descriptive name which may, for example, be used for labeling plots. For backwards compatibility with COARDS this attribute is optional. But it is highly recommended that either this or the standard_name attribute defined in the next section be provided to make the file self-describing. If a variable has no long_name attribute then an application may use, as a default, the standard_name if it exists, or the variable name itself.
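The fallback order described above could be sketched as follows. The function name plot_label and the dictionary representation of a variable's attributes are assumptions made for illustration only:

```python
def plot_label(var_name, attributes):
    """Pick a label: long_name first, then standard_name, then the variable name."""
    for key in ("long_name", "standard_name"):
        value = attributes.get(key)
        if value:
            return value
    return var_name


print(plot_label("psl", {"standard_name": "air_pressure_at_sea_level"}))
```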
A fundamental requirement for exchange of scientific data is the ability to describe precisely the physical quantities being represented. To some extent this is the role of the long_name attribute as defined in the NUG. However, usage of long_name is completely ad-hoc. For some applications it would be desirable to have a more definitive description of the quantity, which would allow users of data from different sources to determine whether quantities were in fact comparable. For this reason an optional mechanism for uniquely associating each variable with a standard name is provided.
A standard name is associated with a variable via the attribute standard_name which takes a string value comprised of a standard name optionally followed by one or more blanks and a standard name modifier (a string value from Appendix C, Standard Name Modifiers).
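A minimal sketch of how an application might separate the two parts of a standard_name value. The function name split_standard_name is hypothetical, and the sketch does not check the name or the modifier against the standard name table or Appendix C:

```python
def split_standard_name(value):
    """Split a standard_name attribute value into (name, modifier).

    The modifier is None when the value holds a standard name alone.
    """
    parts = value.split()  # any run of blanks separates the two parts
    if not parts or len(parts) > 2:
        raise ValueError("expected a standard name and at most one modifier")
    name = parts[0]
    modifier = parts[1] if len(parts) == 2 else None
    return name, modifier


print(split_standard_name("specific_humidity standard_error"))
```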
The set of permissible standard names is contained in the standard name table. The table entry for each standard name contains the following:
- standard name
The name used to identify the physical quantity. A standard name contains no whitespace and is case sensitive.
- canonical units
Representative units of the physical quantity. Unless it is dimensionless, a variable with a standard_name attribute must have units which are physically equivalent (not necessarily identical) to the canonical units, possibly modified by an operation specified by either the standard name modifier (see below and Appendix C, Standard Name Modifiers) or by the cell_methods attribute (see Section 7.3, “Cell Methods” and Appendix E, Cell Methods).
- description
The description is meant to clarify the qualifiers of the fundamental quantities, such as which surface a quantity is defined on or what the flux sign conventions are. We don't attempt to provide precise definitions of fundamental physical quantities (e.g., temperature), which may be found in the literature.
When appropriate, the table entry also contains the corresponding GRIB parameter code(s) (from ECMWF and NCEP) and AMIP identifiers.
The standard name table is located at http://cf-pcmdi.llnl.gov/documents/cf-standard-names/current/cf-standard-name-table.xml , written in compliance with the XML format, as described in Appendix B, Standard Name Table Format. Knowledge of the XML format is only necessary for application writers who plan to directly access the table. A formatted text version of the table is provided at http://cf-pcmdi.llnl.gov/documents/cf-standard-names/current/cf-standard-name-table.html , and this table may be consulted in order to find the standard name that should be assigned to a variable.
Standard names by themselves are not always sufficient to describe a quantity. For example, a variable may contain data to which spatial or temporal operations have been applied. Or the data may represent an uncertainty in the measurement of a quantity. These quantity attributes are expressed as modifiers of the standard name. Modifications due to common statistical operations are expressed via the cell_methods attribute (see Section 7.3, “Cell Methods” and Appendix E, Cell Methods). Other types of quantity modifiers are expressed using the optional modifier part of the standard_name attribute. The permissible values of these modifiers are given in Appendix C, Standard Name Modifiers.
Example 3.1. Use of standard_name
float psl(lat,lon) ;
psl:long_name = "mean sea level pressure" ;
psl:units = "hPa" ;
psl:standard_name = "air_pressure_at_sea_level" ;
The description in the standard name table entry for air_pressure_at_sea_level clarifies that "sea level" refers to the mean sea level, which is close to the geoid in sea areas.
Here are lists of equivalences between the CF standard names and the standard names from the ECMWF GRIB tables, the NCEP GRIB tables, and the PCMDI tables.
When one data variable provides metadata about the individual values of another data variable it may be desirable to express this association by providing a link between the variables. For example, instrument data may have associated measures of uncertainty. The attribute ancillary_variables is used to express these types of relationships. It is a string attribute whose value is a blank separated list of variable names. The nature of the relationship between variables associated via ancillary_variables must be determined by other attributes. The variables listed by the ancillary_variables attribute will often have the standard name of the variable which points to them including a modifier (Appendix C, Standard Name Modifiers) to indicate the relationship.
Example 3.2. Instrument data
float q(time) ;
q:standard_name = "specific_humidity" ;
q:units = "g/g" ;
q:ancillary_variables = "q_error_limit q_detection_limit" ;
float q_error_limit(time) ;
q_error_limit:standard_name = "specific_humidity standard_error" ;
q_error_limit:units = "g/g" ;
float q_detection_limit(time) ;
q_detection_limit:standard_name = "specific_humidity detection_minimum" ;
q_detection_limit:units = "g/g" ;
The attributes flag_values and flag_meanings are intended to make variables that contain flag values self describing. The flag_values attribute is the same type as the variable to which it is attached, and contains a list of the possible flag values. The flag_meanings attribute is a string whose value is a blank separated list of descriptive words or phrases, one for each flag value. If multi-word phrases are used to describe the flag values, then the words within a phrase should be connected with underscores.
Example 3.3. A flag variable
byte current_speed_qc(time, depth, lat, lon) ;
current_speed_qc:long_name = "Current Speed Quality" ;
current_speed_qc:_FillValue = -128b ;
current_speed_qc:valid_range = -127b, 127b ;
current_speed_qc:flag_values = 0b, 1b, 2b ;
current_speed_qc:flag_meanings = "quality_good sensor_nonfunctional
outside_valid_range" ;
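One way an application might decode such a variable is to pair each flag value with its meaning by position. This is a minimal sketch; the function name flag_lookup is hypothetical, and the attribute values are assumed to have already been read from the file:

```python
def flag_lookup(flag_values, flag_meanings):
    """Map each flag value to its meaning, pairing the two lists by position."""
    meanings = flag_meanings.split()  # meanings are blank separated
    if len(meanings) != len(flag_values):
        raise ValueError("flag_values and flag_meanings differ in length")
    return dict(zip(flag_values, meanings))


lookup = flag_lookup(
    [0, 1, 2],
    "quality_good sensor_nonfunctional outside_valid_range",
)
print(lookup[1])  # sensor_nonfunctional
```

The pairing relies on the underscore convention above: a multi-word phrase containing blanks would be split into several meanings and fail the length check.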
Four types of coordinates receive special treatment by these
conventions: latitude, longitude, vertical, and time.
We continue to support the special role that the
units and positive attributes
play in the COARDS convention to identify coordinate type.
We extend COARDS by providing explicit definitions of dimensionless
vertical coordinates. The definitions are associated with a coordinate
variable via the standard_name and
formula_terms attributes. For backwards compatibility
with COARDS use of these attributes is not required, but is strongly recommended.
Because identification of a coordinate type by its units is complicated
by requiring the use of an external software
package [UDUNITS], we provide two optional
methods that yield a direct identification.
The attribute axis may be attached to a coordinate
variable and given one of the values X, Y,
Z or T which stand for a longitude,
latitude, vertical, or time axis respectively.
Alternatively the standard_name attribute may be used
for direct identification. But note that these optional
attributes are in addition to the required COARDS metadata.
Coordinate types other than latitude, longitude, vertical, and time
are allowed. To identify generic spatial coordinates we recommend
that the axis attribute be attached to these
coordinates and given one of the values X,
Y or Z.
We attach no
specific meaning to the axis values in this case,
but note that they may provide a useful hint to an application that
plots spatially oriented data.
The values X and Y
for the axis attribute should be used to identify horizontal coordinate
variables. If both X- and Y-axis are identified, X-Y-up
should define a right-handed coordinate system, i.e. rotation from the
positive X direction to the positive Y direction is anticlockwise if
viewed from above.
We strongly recommend that coordinate
variables be used for all coordinate types whenever they are applicable.
The methods of identifying coordinate types described in this
section apply both to coordinate variables and to auxiliary
coordinate variables named by the coordinates
attribute (see Chapter 5, Coordinate Systems).
Variables representing latitude must always explicitly include the
units attribute; there is no default value.
The units attribute will be a string formatted
as per the
udunits.dat file.
The recommended unit of latitude
is degrees_north. Also acceptable
are degree_north, degree_N,
degrees_N, degreeN,
and degreesN.
Example 4.1. Latitude axis
float lat(lat) ;
lat:long_name = "latitude" ;
lat:units = "degrees_north" ;
lat:standard_name = "latitude" ;
Application writers should note that the Udunits package does not
recognize the directionality implied by the "north" part of the unit
specification. It only recognizes its size, i.e., 1 degree is defined
to be pi/180 radians. Hence, determination that a coordinate is a
latitude type should be done via a string match between the given unit
and one of the acceptable forms of degrees_north.
Optionally, the latitude type may be indicated additionally by providing
the standard_name attribute with the value
latitude, and/or the axis attribute
with the value Y.
Coordinates of latitude with respect to a rotated pole should be given
units of degrees, not degrees_north
or equivalents, because applications which use the units to identify
axes would have no means of distinguishing such an axis from real
latitude, and might draw incorrect coastlines, for instance.
It would
also not generally be appropriate to attach an axis attribute to a
rotated-latitude coordinate variable. Such a variable can be identified
by a standard_name of grid_latitude.
Variables representing longitude must always explicitly include
the units attribute; there is no default value.
The units attribute will be a string formatted
as per the
udunits.dat file.
The recommended unit of longitude is
degrees_east. Also acceptable
are degree_east, degree_E,
degrees_E, degreeE,
and degreesE.
Example 4.2. Longitude axis
float lon(lon) ;
lon:long_name = "longitude" ;
lon:units = "degrees_east" ;
lon:standard_name = "longitude" ;
Application writers should note that the Udunits package has limited
recognition of the directionality implied by the "east" part of the
unit specification. It defines degrees_east to be
pi/180 radians, and hence equivalent to degrees_north.
We recommend the determination that a coordinate is a longitude type
should be done via a string match between the given unit and one of the
acceptable forms of degrees_east.
Optionally, the longitude type may be indicated additionally by
providing the standard_name attribute with the
value longitude, and/or the axis
attribute with the value X.
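The string match recommended for latitude and longitude units could be sketched as below. The set and function names are hypothetical; the unit spellings are those listed in Sections 4.1 and 4.2:

```python
# Acceptable unit strings, copied from the lists in Sections 4.1 and 4.2.
LATITUDE_UNITS = {
    "degrees_north", "degree_north", "degree_N",
    "degrees_N", "degreeN", "degreesN",
}
LONGITUDE_UNITS = {
    "degrees_east", "degree_east", "degree_E",
    "degrees_E", "degreeE", "degreesE",
}


def horizontal_type(units):
    """Return 'latitude', 'longitude', or None using an exact string match.

    The match is case sensitive, as unit strings are case sensitive.
    """
    if units in LATITUDE_UNITS:
        return "latitude"
    if units in LONGITUDE_UNITS:
        return "longitude"
    return None


print(horizontal_type("degrees_east"))  # longitude
print(horizontal_type("degrees"))       # None
```

A plain degrees unit deliberately yields None, so rotated-pole coordinates are not mistaken for true latitude or longitude.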
Coordinates of longitude with respect to a rotated pole should be
given units of degrees, not
degrees_east or equivalents, because applications
which use the units to identify axes would have no means of
distinguishing such an axis from real longitude, and might draw
incorrect coastlines, for instance.
It would also not generally be
appropriate to attach an axis attribute to a rotated-longitude
coordinate variable. Such a variable can be identified by a
standard_name of grid_longitude.
Variables representing dimensional height or depth axes must always
explicitly include the units attribute; there is
no default value.
The direction of positive (i.e., the direction in which the coordinate
values are increasing), whether up or down, cannot in all cases be
inferred from the units. The direction of positive is useful for
applications displaying the data. For this reason the attribute
positive as defined in the COARDS standard is
required if the vertical axis units are not a valid unit of pressure
(a determination which can be made using the udunits routine, utScan)
-- otherwise its inclusion is optional. The positive
attribute may have the value up or
down (case insensitive). This attribute may be
applied to either coordinate variables or auxiliary coordinate
variables that contain vertical coordinate data.
For example, if an oceanographic netCDF file encodes the depth of the surface as 0 and the depth of 1000 meters as 1000 then the axis would use attributes as follows:
axis_name:units = "meters" ;
axis_name:positive = "down" ;
If, on the other hand, the depth of 1000 meters were represented
as -1000 then the value of the positive attribute
would have been up. If the units
attribute value is a valid pressure unit the default value of the
positive attribute is down.
A vertical coordinate will be identifiable by:
- units of pressure; or
- the presence of the positive attribute with a value of up or down (case insensitive).
Optionally, the vertical type may be indicated additionally by
providing the standard_name attribute with an
appropriate value, and/or the axis attribute
with the value Z.
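The identification rules above might be sketched as follows. A real application would ask Udunits whether the units are a pressure unit; here a small hard-coded set (the commonly used pressure units named later in this section, without plural forms) stands in for that query, and the names PRESSURE_UNITS and vertical_direction are hypothetical:

```python
# Stand-in for a Udunits pressure check: an assumption for illustration only.
PRESSURE_UNITS = {
    "bar", "millibar", "decibar", "atmosphere", "atm", "pascal", "Pa", "hPa",
}


def vertical_direction(attributes):
    """Return 'up' or 'down' for a vertical coordinate, else None.

    An explicit positive attribute wins; otherwise pressure units imply
    the default direction 'down'.
    """
    positive = attributes.get("positive")
    if positive is not None:
        direction = positive.lower()  # the value is case insensitive
        if direction not in ("up", "down"):
            raise ValueError("positive must be 'up' or 'down'")
        return direction
    if attributes.get("units") in PRESSURE_UNITS:
        return "down"
    return None


print(vertical_direction({"units": "meters", "positive": "down"}))  # down
print(vertical_direction({"units": "hPa"}))                         # down
print(vertical_direction({"units": "meters"}))                      # None
```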
The units attribute for dimensional coordinates will
be a string formatted as per the
udunits.dat file.
The acceptable units for vertical (depth or height) coordinate variables are:
- units of pressure as listed in the file udunits.dat. For vertical axes the most commonly used of these include bar, millibar, decibar, atmosphere (atm), pascal (Pa), and hPa.
- units of length as listed in the file udunits.dat. For vertical axes the most commonly used of these include meter (metre, m), and kilometer (km).
- other units listed in the file udunits.dat that may under certain circumstances reference vertical position such as units of density or temperature.
Plural forms are also acceptable.
The units attribute is not required for dimensionless coordinates. For backwards compatibility with COARDS we continue to allow the units attribute to take one of the values: level, layer, or sigma_level. These values are not recognized by the Udunits package, and are considered a deprecated feature in the CF standard.
For dimensionless vertical coordinates we extend the COARDS standard by making use of the standard_name attribute to associate a coordinate with its definition from Appendix D, Dimensionless Vertical Coordinates. The definition provides a mapping between the dimensionless coordinate values and dimensional values that can positively and uniquely indicate the location of the data. A new attribute, formula_terms, is used to associate terms in the definitions with variables in a netCDF file. To maintain backwards compatibility with COARDS the use of these attributes is not required, but is strongly recommended.
Example 4.3. Atmosphere sigma coordinate
float lev(lev) ;
lev:long_name = "sigma at layer midpoints" ;
lev:positive = "down" ;
lev:standard_name = "atmosphere_sigma_coordinate" ;
lev:formula_terms = "sigma: lev ps: PS ptop: PTOP" ;
In this example the standard_name value atmosphere_sigma_coordinate identifies the following definition from Appendix D, Dimensionless Vertical Coordinates which specifies how to compute pressure at gridpoint (n,k,j,i) where j and i are horizontal indices, k is a vertical index, and n is a time index:
p(n,k,j,i) = ptop + sigma(k)*(ps(n,j,i)-ptop)
The formula_terms attribute associates the variable lev with the term sigma, the variable PS with the term ps, and the variable PTOP with the term ptop. Thus the pressure at gridpoint (n,k,j,i) would be calculated by
p(n,k,j,i) = PTOP + lev(k)*(PS(n,j,i)-PTOP)
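The calculation can be sketched in plain Python with nested lists standing in for the netCDF arrays. The function name sigma_to_pressure and the sample values are hypothetical; the arithmetic is the formula given above:

```python
def sigma_to_pressure(sigma, ps, ptop):
    """Compute p[n][k][j][i] = ptop + sigma[k] * (ps[n][j][i] - ptop).

    sigma is a list over k, ps is a nested list indexed [n][j][i],
    and ptop is a scalar.
    """
    return [
        [
            [[ptop + s * (ps_value - ptop) for ps_value in row] for row in ps_n]
            for s in sigma
        ]
        for ps_n in ps
    ]


# Hypothetical data: one time step, one latitude, two longitudes.
lev = [0.0, 0.5, 1.0]
PS = [[[1000.0, 900.0]]]
PTOP = 10.0
p = sigma_to_pressure(lev, PS, PTOP)
print(p[0][1][0])  # pressures at the middle level: [505.0, 455.0]
```

With sigma equal to 0 the result is ptop, and with sigma equal to 1 it is the surface pressure, which is a quick sanity check on the mapping.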
Variables representing time must always explicitly include
the units attribute; there is no default value.
The units attribute takes a string value formatted
as per the recommendations in the Udunits package [UDUNITS].
The following excerpt from the Udunits documentation explains the time unit encoding by example:
The specification:
seconds since 1992-10-8 15:15:42.5 -6:00
indicates seconds since October 8th, 1992 at 3 hours, 15
minutes and 42.5 seconds in the afternoon in the time zone
which is six hours to the west of Coordinated Universal Time
(i.e. Mountain Daylight Time). The time zone specification
can also be written without a colon using one or two-digits
(indicating hours) or three or four digits (indicating hours
and minutes).
The acceptable units for time are listed in the
udunits.dat file.
The most commonly used of these strings (and their abbreviations)
include day (d), hour (hr, h),
minute (min) and second (sec, s).
Plural forms are also acceptable. The reference time string
(appearing after the identifier since) may
include date alone; date and time; or date, time, and time zone.
The reference time is required. A reference time in year 0 has a
special meaning (see Section 7.4, “Climatological Statistics”).
Note: if the time zone is omitted the default is UTC, and if both time and time zone are omitted the default is 00:00:00 UTC.
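A minimal sketch of splitting a time units string into its unit and reference time, applying the UTC and 00:00:00 defaults from the note above. The name parse_time_units is hypothetical. The sketch is deliberately narrow: it rejects strings with an explicit time zone, does not validate the unit against udunits.dat, and cannot represent a reference time in year 0 because Python's datetime starts at year 1:

```python
import re
from datetime import datetime, timedelta, timezone

_TIME_UNITS = re.compile(
    r"^\s*(?P<unit>\w+)\s+since\s+"
    r"(?P<year>\d{1,4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})"
    r"(?:[ T]+(?P<hour>\d{1,2}):(?P<minute>\d{1,2})"
    r"(?::(?P<second>\d{1,2}(?:\.\d*)?))?)?"
    r"\s*$"
)


def parse_time_units(units):
    """Return (unit, reference datetime in UTC) for '<unit> since <date> [time]'."""
    match = _TIME_UNITS.match(units)
    if match is None:
        raise ValueError("unsupported time units string: %r" % units)
    second = float(match.group("second") or 0)
    whole_second = int(second)
    reference = datetime(
        int(match.group("year")),
        int(match.group("month")),
        int(match.group("day")),
        int(match.group("hour") or 0),    # missing time defaults to 00:00:00
        int(match.group("minute") or 0),
        whole_second,
        tzinfo=timezone.utc,              # missing time zone defaults to UTC
    ) + timedelta(seconds=second - whole_second)
    return match.group("unit"), reference


print(parse_time_units("days since 1990-1-1 0:0:0"))
```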
We recommend that the unit year be used with caution. The Udunits package defines a year to be exactly 365.242198781 days (the interval between 2 successive passages of the sun through vernal equinox). It is not a calendar year. Udunits includes the following definitions for years: a common_year is 365 days, a leap_year is 366 days, a Julian_year is 365.25 days, and a Gregorian_year is 365.2425 days.
For similar reasons the unit month, which is defined in
udunits.dat
to be exactly year/12, should also be used with caution.
Example 4.4. Time axis
double time(time) ;
time:long_name = "time" ;
time:units = "days since 1990-1-1 0:0:0" ;
A time coordinate is identifiable from its units string alone. The Udunits routines utScan() and utIsTime() can be used to make this determination.
Optionally, the time coordinate may be indicated additionally by providing the standard_name attribute with an appropriate value, and/or the axis attribute with the value T.
In order to calculate a new date and time given a base date, base time and a time increment one must know what calendar to use. For this purpose we recommend that the calendar be specified by the attribute calendar which is assigned to the time coordinate variable. The values currently defined for calendar are:
-
gregorian or standard Mixed Gregorian/Julian calendar as defined by Udunits. This is the default.
-
proleptic_gregorian A Gregorian calendar extended to dates before 1582-10-15. That is, a year is a leap year if either (i) it is divisible by 4 but not by 100 or (ii) it is divisible by 400.
-
noleap or 365_day Gregorian calendar without leap years, i.e., all years are 365 days long.
-
all_leap or 366_day Gregorian calendar with every year being a leap year, i.e., all years are 366 days long.
-
360_day All years are 360 days divided into 30 day months.
-
julian Julian calendar.
-
none No calendar.
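The year lengths implied by most of these calendar values can be sketched as below. The function names are hypothetical. The Julian rule (every fourth year is a leap year) is the usual definition of that calendar rather than something stated in this list, and the mixed gregorian/standard calendar and none are left out because they need the changeover handling or have no year length at all:

```python
def is_leap_year(year, calendar="proleptic_gregorian"):
    """Return True if the year has a leap day in the given calendar."""
    if calendar == "proleptic_gregorian":
        return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0
    if calendar == "julian":
        return year % 4 == 0
    if calendar in ("noleap", "365_day", "360_day"):
        return False
    if calendar in ("all_leap", "366_day"):
        return True
    raise ValueError("calendar not handled by this sketch: %r" % calendar)


def days_in_year(year, calendar="proleptic_gregorian"):
    """Return the number of days in the year for the given calendar."""
    if calendar == "360_day":
        return 360
    return 366 if is_leap_year(year, calendar) else 365


print(days_in_year(1900))            # 365
print(days_in_year(1900, "julian"))  # 366
print(days_in_year(1900, "360_day")) # 360
```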
The calendar attribute may be set to none in climate experiments that simulate a fixed time of year. The time of year is indicated by the date in the reference time of the units attribute. The time coordinates that might apply in a perpetual July experiment are given in the following example.
Example 4.5. Perpetual time axis
variables:
double time(time) ;
time:long_name = "time" ;
time:units = "days since 1-7-15 0:0:0" ;
time:calendar = "none" ;
data:
time = 0., 1., 2., ...;
Here, all days simulate the conditions of 15th July, so it does not make sense to give them different dates. The time coordinates are interpreted as 0, 1, 2, etc. days since the start of the experiment.
If none of the calendars defined above applies (e.g., calendars appropriate to a different paleoclimate era), a non-standard calendar can be defined. The lengths of each month are explicitly defined with the month_lengths attribute of the time axis:
- month_lengths
A vector of size 12, specifying the number of days in the months from January to December (in a non-leap year).
If leap years are included, then two other attributes of the time axis should also be defined:
- leap_year
An example of a leap year. It is assumed that all years that differ from this year by a multiple of four are also leap years. If this attribute is absent, it is assumed there are no leap years.
- leap_month
A value in the range 1-12, specifying which month is lengthened by a day in leap years (1=January). If this attribute is not present, February (2) is assumed. This attribute is ignored if leap_year is not specified.
The calendar attribute is not required when a non-standard calendar is being used. It is sufficient to define the calendar using the month_lengths attribute, along with leap_year, and leap_month as appropriate. However, the calendar attribute is allowed to take non-standard values and in that case defining the non-standard calendar using the appropriate attributes is required.
Example 4.6. Paleoclimate time axis
double time(time) ;
time:long_name = "time" ;
time:units = "days since 1-1-1 0:0:0" ;
time:calendar = "126 kyr B.P." ;
time:month_lengths = 34, 31, 32, 30, 29, 27, 28, 28, 28, 32, 32, 34 ;
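A sketch of how an application might evaluate month lengths for such a non-standard calendar, following the rules for month_lengths, leap_year, and leap_month given above. The function name days_in_month is hypothetical; the month lengths are those of Example 4.6, and the leap_year value in the usage lines is invented for illustration:

```python
def days_in_month(year, month, month_lengths, leap_year=None, leap_month=2):
    """Return the length of a month (1-12) in a user-defined calendar.

    leap_year is one example leap year; every year differing from it by a
    multiple of four is also a leap year. With leap_year=None there are no
    leap years and leap_month is ignored.
    """
    if len(month_lengths) != 12:
        raise ValueError("month_lengths must have 12 entries")
    if not 1 <= month <= 12:
        raise ValueError("month must be in the range 1-12")
    days = month_lengths[month - 1]
    is_leap = leap_year is not None and (year - leap_year) % 4 == 0
    if is_leap and month == leap_month:
        days += 1
    return days


paleo_lengths = [34, 31, 32, 30, 29, 27, 28, 28, 28, 32, 32, 34]
print(sum(paleo_lengths))                                # 365
print(days_in_month(1, 1, paleo_lengths))                # 34
print(days_in_month(8, 2, paleo_lengths, leap_year=4))   # 32
```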
The mixed Gregorian/Julian calendar used by Udunits is explained in the following excerpt from the udunits(3) man page:
The udunits(3) package uses a mixed Gregorian/Julian calendar system. Dates prior to 1582-10-15 are assumed to use the Julian calendar, which was introduced by Julius Caesar in 46 BCE and is based on a year that is exactly 365.25 days long. Dates on and after 1582-10-15 are assumed to use the Gregorian calendar, which was introduced on that date and is based on a year that is exactly 365.2425 days long. (A year is actually approximately 365.242198781 days long.) Seemingly strange behavior of the udunits(3) package can result if a user-given time interval includes the changeover date. For example, utCalendar() and utInvCalendar() can be used to show that 1582-10-15 *preceded* 1582-10-14 by 9 days.
Due to problems caused by the discontinuity in the default mixed Gregorian/Julian calendar, we strongly recommend that this calendar should only be used when the time coordinate does not cross the discontinuity. For time coordinates that do cross the discontinuity the proleptic_gregorian calendar should be used instead.
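The size of the discontinuity can be illustrated by converting dates to Julian Day Numbers (a continuous day count). The sketch below uses a commonly published integer formula that is not part of this standard or of Udunits; the function name julian_day_number is hypothetical:

```python
def julian_day_number(year, month, day, calendar):
    """Return the Julian Day Number of a date in the given calendar.

    calendar is 'julian' or 'proleptic_gregorian'.
    """
    a = (14 - month) // 12          # 1 for January and February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March = 0, ..., February = 11
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if calendar == "julian":
        return jdn - 32083
    if calendar == "proleptic_gregorian":
        return jdn - y // 100 + y // 400 - 32045
    raise ValueError("unsupported calendar: %r" % calendar)


last_julian = julian_day_number(1582, 10, 4, "julian")
first_gregorian = julian_day_number(1582, 10, 15, "proleptic_gregorian")
print(first_gregorian - last_julian)  # 1: the two dates are consecutive days

# Reading 1582-10-14 as a Julian date places it 9 days after 1582-10-15.
print(julian_day_number(1582, 10, 14, "julian") - first_gregorian)  # 9
```

In the mixed calendar the labels 1582-10-5 through 1582-10-14 fall in the gap, which is why a time coordinate crossing the changeover is safer in the proleptic_gregorian calendar.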
A variable's spatiotemporal dimensions are used to locate data values in time and space. This is accomplished by associating these dimensions with the relevant set of latitude, longitude, vertical, and time coordinates. This section presents two methods for making that association: the use of coordinate variables, and the use of auxiliary coordinate variables.
All of a variable's dimensions that are latitude, longitude, vertical, or time dimensions (see Section 1.2, “Terminology”) must have corresponding coordinate variables, i.e., one-dimensional variables with the same name as the dimension (see examples in Chapter 4, Coordinate Types). This is the only method of associating dimensions with coordinates that is supported by [COARDS].
All of a variable's spatiotemporal dimensions that are not latitude,
longitude, vertical, or time dimensions are required to be associated
with the relevant latitude, longitude, vertical, or time coordinates
via the new coordinates attribute of the variable.
The value of the coordinates attribute is
a blank separated list of the names of auxiliary coordinate variables.
There is no restriction on the order in which the auxiliary coordinate
variables appear in the coordinates attribute string.
The dimensions of an auxiliary coordinate variable must be a subset of
the dimensions of the variable with which the coordinate is associated
(an exception is label coordinates (Section 6.1, “Labels”) which
contain a dimension for maximum string length). We recommend that the
name of a multidimensional coordinate variable should not match the name
of any of its dimensions because that precludes supplying an associated
coordinate variable for the dimension. This practice also avoids potential
bugs in applications that determine coordinate variables by only checking
for a name match between a dimension and a variable and not checking that
the variable is one dimensional.
The use of coordinate variables is required whenever they are applicable.
That is, auxiliary coordinate variables may not be used as the only way
to identify latitude and longitude coordinates that could be identified
using coordinate variables. This is both to enhance conformance to COARDS
and to facilitate the use of generic applications that recognize the NUG
convention for coordinate variables. An application that is trying to
find the latitude coordinate of a variable should always look first to
see if any of the variable's dimensions correspond to a latitude
coordinate variable. If the latitude coordinate is not found this way,
then the auxiliary coordinate variables listed by the
coordinates attribute should be checked. Note that it
is permissible, but optional, to list coordinate variables as well as
auxiliary coordinate variables in the coordinates
attribute. The axis attribute
is not allowed for auxiliary coordinate variables. Auxiliary coordinate
variables which lie on the horizontal surface can be identified as such
by their dimensions being horizontal, which can in turn be inferred from
the coordinate variables of those dimensions having an axis attribute of
X or Y, or from their units in the case of latitude and longitude
(see Chapter 4, Coordinate Types).
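The search order described above (coordinate variables first, then the coordinates attribute) could be sketched as follows. The dataset is modelled as a plain dictionary rather than read through a netCDF library, and the names find_latitude and LATITUDE_UNITS, as well as the sample variables, are hypothetical:

```python
# Acceptable latitude unit strings from Section 4.1.
LATITUDE_UNITS = {
    "degrees_north", "degree_north", "degree_N",
    "degrees_N", "degreeN", "degreesN",
}


def find_latitude(dataset, var_name):
    """Return the name of the latitude coordinate of a variable, or None.

    dataset maps variable names to {'dimensions': tuple, 'attributes': dict}.
    """
    var = dataset[var_name]
    # Step 1: a one-dimensional variable named after one of the dimensions.
    for dim in var["dimensions"]:
        coord = dataset.get(dim)
        if (coord is not None and coord["dimensions"] == (dim,)
                and coord["attributes"].get("units") in LATITUDE_UNITS):
            return dim
    # Step 2: auxiliary coordinate variables named by the coordinates attribute.
    for name in var["attributes"].get("coordinates", "").split():
        aux = dataset.get(name)
        if aux is not None and aux["attributes"].get("units") in LATITUDE_UNITS:
            return name
    return None


dataset = {
    "T": {"dimensions": ("yc", "xc"), "attributes": {"coordinates": "lon lat"}},
    "lat": {"dimensions": ("yc", "xc"), "attributes": {"units": "degrees_north"}},
    "lon": {"dimensions": ("yc", "xc"), "attributes": {"units": "degrees_east"}},
}
print(find_latitude(dataset, "T"))  # lat
```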
If the coordinate variables for a horizontal grid are not longitude
and latitude, it is recommended that they be supplied
in addition to the required coordinates.
For example, the Cartesian coordinates of a map projection should be
supplied as coordinate variables in addition to the required
two-dimensional latitude and longitude variables that are identified
via the coordinates attribute.
The use of the axis attribute with
values X and Y is recommended
for the coordinate variables (see Chapter 4, Coordinate Types).
It is sometimes not practical to specify the latitude-longitude location of data which is representative of geographic regions with complex boundaries. For this purpose, provision is made in Section 6.1.1, “Geographic Regions” for indicating the region by a standardized name.
When each of a variable's spatiotemporal dimensions is a latitude, longitude, vertical, or time dimension, then each axis is identified by a coordinate variable.
Example 5.1. Independent coordinate variables
dimensions:
lat = 18 ;
lon = 36 ;
pres = 15 ;
time = 4 ;
variables:
float xwind(time,pres,lat,lon) ;
xwind:long_name = "zonal wind" ;
xwind:units = "m/s" ;
float lon(lon) ;
lon:long_name = "longitude" ;
lon:units = "degrees_east" ;
float lat(lat) ;
lat:long_name = "latitude" ;
lat:units = "degrees_north" ;
float pres(pres) ;
pres:long_name = "pressure" ;
pres:units = "hPa" ;
double time(time) ;
time:long_name = "time" ;
time:units = "days since 1990-1-1 0:0:0" ;
xwind(n,k,j,i) is associated with the coordinate values lon(i), lat(j), pres(k), and time(n).