Personal tools
You are here: Home Documents CF Conventions 1.2 NetCDF Climate and Forecast (CF) Metadata Conventions
Document Actions

NetCDF Climate and Forecast (CF) Metadata Conventions

NetCDF Climate and Forecast (CF) Metadata Conventions

NetCDF Climate and Forecast (CF) Metadata Conventions

Version 1.2, 4 May, 2008

Original Authors

Brian Eaton

NCAR

Jonathan Gregory

Hadley Centre, UK Met Office

Bob Drach

PCMDI, LLNL

Karl Taylor

PCMDI, LLNL

Steve Hankin

PMEL, NOAA

Additional Authors

John Caron

UCAR

Rich Signell

USGS

Phil Bentley

Hadley Centre, UK Met Office

Many others have contributed to the development of CF through their participation in discussions about proposed changes.

 

Abstract

This document describes the CF conventions for climate and forecast metadata designed to promote the processing and sharing of files created with the netCDF Application Programmer Interface [NetCDF]. The conventions define metadata that provide a definitive description of what the data in each variable represents, and of the spatial and temporal properties of the data. This enables users of data from different sources to decide which quantities are comparable, and facilitates building applications with powerful extraction, regridding, and display capabilities.

The CF conventions generalize and extend the COARDS conventions [COARDS]. The extensions include metadata that provides a precise definition of each variable via specification of a standard name, describes the vertical locations corresponding to dimensionless vertical coordinate values, and provides the spatial coordinates of non-rectilinear gridded data. Since climate and forecast data are often not simply representative of points in space/time, other extensions provide for the description of coordinate intervals, multidimensional cells and climatological time coordinates, and indicate how a data value is representative of an interval or cell. This standard also relaxes the COARDS constraints on dimension order and specifies methods for reducing the size of datasets.


List of Examples

3.1. Use of standard_name
3.2. Instrument data
3.3. A flag variable
4.1. Latitude axis
4.2. Longitude axis
4.3. Atmosphere sigma coordinate
4.4. Time axis
4.5. Perpetual time axis
4.6. Paleoclimate time axis
5.1. Independent coordinate variables
5.2. Two-dimensional coordinate variables
5.3. Reduced horizontal grid
5.4. Timeseries of station data
5.5. Trajectories
5.6. Rotated pole grid
5.7. Lambert conformal projection
5.8. Latitude and longitude on a spherical Earth
5.9. Latitude and longitude on the WGS 1984 datum
5.10. British National Grid
5.11. Multiple forecasts from a single analysis
6.1. Several parcel trajectories
6.2. Northward heat transport in Atlantic Ocean
6.3. Model level numbers
7.1. Cells on a latitude axis
7.2. Cells in a non-rectangular grid
7.3. Cell areas for a spherical geodesic grid
7.4. Methods applied to a timeseries
7.5. Surface air temperature variance
7.6. Climatological seasons
7.7. Decadal averages for January
7.8. Temperature for each hour of the average day
7.9. Temperature for each hour of the typical climatological day
7.10. Monthly-maximum daily precipitation totals
8.1. Horizontal compression of a three-dimensional array
8.2. Compression of a three-dimensional field
B.1. A name table containing three entries
D.1. Atmosphere natural log pressure coordinate
D.2. Atmosphere sigma coordinate
D.3. Atmosphere hybrid sigma pressure coordinate
D.4. Atmosphere hybrid height coordinate
D.5. Atmosphere smooth level vertical (SLEVE) coordinate
D.6. Ocean sigma coordinate
D.7. Ocean s-coordinate
D.8. Ocean sigma over z coordinate
D.9. Ocean double sigma coordinate

Preface

Home page:

Contains links to: previous draft and current working draft documents; applications for processing CF conforming files; email list for discussion about interpretation, clarification, and proposals for changes or extensions to the current conventions. http://www-pcmdi.llnl.gov/cf/

Revision history:

This document will be updated to reflect agreed changes to the standard and to correct mistakes according to the rules of CF governance. See Appendix G, Revision History for the full revision history. Changes with provisional status use the following mark-up style: new text, deleted text, and [a comment].

Chapter 1.  Introduction

1.1. Goals

The NetCDF library [NetCDF] is designed to read and write data that has been structured according to well-defined rules and is easily ported across various computer platforms. The netCDF interface enables but does not require the creation of self-describing datasets. The purpose of the CF conventions is to require conforming datasets to contain sufficient metadata that they are self-describing in the sense that each variable in the file has an associated description of what it represents, including physical units if appropriate, and that each value can be located in space (relative to earth-based coordinates) and time.

An important benefit of a convention is that it enables software tools to display data and perform operations on specified subsets of the data with minimal user intervention. It is possible to provide the metadata describing how a field is located in time and space in many different ways that a human would immediately recognize as equivalent. The purpose in restricting how the metadata is represented is to make it practical to write software that allows a machine to parse that metadata and to automatically associate each data value with its location in time and space. It is equally important that the metadata be easy for human users to write and to understand.

This standard is intended for use with climate and forecast data, for atmosphere, surface and ocean, and was designed with model-generated data particularly in mind. We recognise that there are limits to what a standard can practically cover; we restrict ourselves to issues that we believe to be of common and frequent concern in the design of climate and forecast metadata. Our main purpose therefore, is to propose a clear, adequate and flexible definition of the metadata needed for climate and forecast data. Although this is specifically a netCDF standard, we feel that most of the ideas are of wider application. The metadata objects could be contained in file formats other than netCDF. Conversion of the metadata between files of different formats will be facilitated if conventions for all formats are based on similar ideas.

This convention is designed to be backward compatible with the COARDS conventions [COARDS], by which we mean that a conforming COARDS dataset also conforms to the CF standard. Thus new applications that implement the CF conventions will be able to process COARDS datasets.

We have also striven to maximize conformance to the COARDS standard, that is, wherever the COARDS metadata conventions provide an adequate description we require their use. Extensions to COARDS are implemented in a manner such that the content that doesn't depend on the extensions is still accessible to applications that adhere to the COARDS standard.

1.2. Terminology

The terms in this document that refer to components of a netCDF file are defined in the NetCDF User's Guide (NUG) [NUG] NUG. Some of those definitions are repeated below for convenience.

auxiliary coordinate variable

Any netCDF variable that contains coordinate data, but is not a coordinate variable (in the sense of that term defined by the NUG and used by this standard - see below). Unlike coordinate variables, there is no relationship between the name of an auxiliary coordinate variable and the name(s) of its dimension(s).

boundary variable

A boundary variable is associated with a variable that contains coordinate data. When a data value provides information about conditions in a cell occupying a region of space/time or some other dimension, the boundary variable provides a description of cell extent.

CDL syntax

The ascii format used to describe the contents of a netCDF file is called CDL (network Common Data form Language). This format represents arrays using the indexing conventions of the C programming language, i.e., index values start at 0, and in multidimensional arrays, when indexing over the elements of the array, it is the last declared dimension that is the fastest varying in terms of file storage order. The netCDF utilities ncdump and ncgen use this format (see chapter 10 of the NUG ). All examples in this document use CDL syntax.

cell

A region in one or more dimensions whose boundary can be described by a set of vertices. The term interval is sometimes used for one-dimensional cells.

coordinate variable

We use this term precisely as it is defined in section 2.3.1 of the NUG . It is a one-dimensional variable with the same name as its dimension [e.g., time(time)], and it is defined as a numeric data type with values that are ordered monotonically. Missing values are not allowed in coordinate variables.

grid mapping variable

A variable used as a container for attributes that define a specific grid mapping. The type of the variable is arbitrary since it contains no data.

latitude dimension

A dimension of a netCDF variable that has an associated latitude coordinate variable.

longitude dimension

A dimension of a netCDF variable that has an associated longitude coordinate variable.

multidimensional coordinate variable

An auxiliary coordinate variable that is multidimensional.

recommendation

Recommendations in this convention are meant to provide advice that may be helpful for reducing common mistakes. In some cases we have recommended rather than required particular attributes in order to maintain backwards compatibility with COARDS. An application must not depend on a dataset's adherence to recommendations.

scalar coordinate variable

A scalar variable that contains coordinate data. Functionally equivalent to either a size one coordinate variable or a size one auxiliary coordinate variable.

spatiotemporal dimension

A dimension of a netCDF variable that is used to identify a location in time and/or space.

time dimension

A dimension of a netCDF variable that has an associated time coordinate variable.

vertical dimension

A dimension of a netCDF variable that has an associated vertical coordinate variable.

1.3. Overview

No variable or dimension names are standardized by this convention. Instead we follow the lead of the NUG and standardize only the names of attributes and some of the values taken by those attributes. The overview provided in this section will be followed with more complete descriptions in following sections. Appendix A, Attributes contains a summary of all the attributes used in this convention.

We recommend that the NUG defined attribute Conventions be given the string value "CF-1.1" "CF-1.2" to identify datasets that conform to these conventions.

The general description of a file's contents should be contained in the following attributes: title, history, institution, source, comment and references (Section 2.6.2, “Description of file contents”). For backwards compatibility with COARDS none of these attributes is required, but their use is recommended to provide human readable documentation of the file contents.

Each variable in a netCDF file has an associated description which is provided by the attributes units, long_name, and standard_name. The units, and long_name attributes are defined in the NUG and the standard_name attribute is defined in this document.

The units attribute is required for all variables that represent dimensional quantities (except for boundary variables defined in Section 7.1, “Cell Boundaries”. The values of the units attributes are character strings that are recognized by UNIDATA's Udunits package [UDUNITS], (with exceptions allowed as discussed in Section 3.1, “Units”).

The long_name and standard_name attributes are used to describe the content of each variable. For backwards compatibility with COARDS neither is required, but use of at least one of them is strongly recommended. The use of standard names will facilitate the exchange of climate and forecast data by providing unambiguous identification of variables most commonly analyzed.

Four types of coordinates receive special treatment by these conventions: latitude, longitude, vertical, and time. Every variable must have associated metadata that allows identification of each such coordinate that is relevant. Two independent parts of the convention allow this to be done. There are conventions that identify the variables that contain the coordinate data, and there are conventions that identify the type of coordinate represented by that data.

There are two methods used to identify variables that contain coordinate data. The first is to use the NUG-defined "coordinate variables." The use of coordinate variables is required for all dimensions that correspond to one dimensional space or time coordinates. In cases where coordinate variables are not applicable, the variables containing coordinate data are identified by the coordinates attribute.

Once the variables containing coordinate data are identified, further conventions are required to determine the type of coordinate represented by each of these variables. Latitude, longitude, and time coordinates are identified solely by the value of their units attribute. Vertical coordinates with units of pressure may also be identified by the units attribute. Other vertical coordinates must use the attribute positive which determines whether the direction of increasing coordinate value is up or down. Because identification of a coordinate type by its units involves the use of an external software package [UDUNITS], we provide the optional attribute axis for a direct identification of coordinates that correspond to latitude, longitude, vertical, or time axes.

Latitude, longitude, and time are defined by internationally recognized standards, and hence, identifying the coordinates of these types is sufficient to locate data values uniquely with respect to time and a point on the earth's surface. On the other hand identifying the vertical coordinate is not necessarily sufficient to locate a data value vertically with respect to the earth's surface. In particular a model may output data on the dimensionless vertical coordinate used in its mathematical formulation. To achieve the goal of being able to spatially locate all data values, this convention includes the definitions of common dimensionless vertical coordinates in Appendix D, Dimensionless Vertical Coordinates. These definitions provide a mapping between the dimensionless coordinate values and dimensional values that can be uniquely located with respect to a point on the earth's surface. The definitions are associated with a coordinate variable via the standard_name and formula_terms attributes. For backwards compatibility with COARDS use of these attributes is not required, but is strongly recommended.

It is often the case that data values are not representative of single points in time and/or space, but rather of intervals or multidimensional cells. This convention defines a bounds attribute to specify the extent of intervals or cells. When data that is representative of cells can be described by simple statistical methods, those methods can be indicated using the cell_methods attribute. An important application of this attribute is to describe climatological and diurnal statistics.

Methods for reducing the total volume of data include both packing and compression. Packing reduces the data volume by reducing the precision of the stored numbers. It is implemented using the attributes add_offset and scale_factor which are defined in the NUG. Compression on the other hand loses no precision, but reduces the volume by not storing missing data. The attribute compress is defined for this purpose.

1.4. Relationship to the COARDS Conventions

These conventions generalize and extend the COARDS conventions [COARDS]. A major design goal has been to maintain backward compatibility with COARDS. Hence applications written to process datasets that conform to these conventions will also be able to process COARDS conforming datasets. We have also striven to maximize conformance to the COARDS standard so that datasets that only require the metadata that was available under COARDS will still be able to be processed by COARDS conforming applications. But because of the extensions that provide new metadata content, and the relaxation of some COARDS requirements, datasets that conform to these conventions will not necessarily be recognized by applications that adhere to the COARDS conventions. The features of these conventions that allow writing netCDF files that are not COARDS conforming are summarized below.

COARDS standardizes the description of grids composed of independent latitude, longitude, vertical, and time axes. In addition to standardizing the metadata required to identify each of these axis types COARDS restricts the axis (equivalently dimension) ordering to be longitude, latitude, vertical, and time (with longitude being the most rapidly varying dimension). Because of I/O performance considerations it may not be possible for models to output their data in conformance with the COARDS requirement. The CF convention places no rigid restrictions on the order of dimensions, however we encourage data producers to make the extra effort to stay within the COARDS standard order. The use of non-COARDS axis ordering will render files inaccessible to some applications and limit interoperability. Often a buffering operation can be used to miminize performance penalties when axis ordering in model code does not match the axis ordering of a COARDS file.

COARDS addresses the issue of identifying dimensionless vertical coordinates, but does not provide any mechanism for mapping the dimensionless values to dimensional ones that can be located with respect to the earth's surface. For backwards compatibility we continue to allow (but do not require) the units attribute of dimensionless vertical coordinates to take the values "level", "layer", or "sigma_level." But we recommend that the standard_name and formula_terms attributes be used to identify the appropriate definition of the dimensionless vertical coordinate (see Section 4.3.2, “Dimensionless Vertical Coordinate”).

The CF conventions define attributes which enable the description of data properties that are outside the scope of the COARDS conventions. These new attributes do not violate the COARDS conventions, but applications that only recognize COARDS conforming datasets will not have the capabilities that the new attributes are meant to enable. Briefly the new attributes allow:

  • Identification of quantities using standard names.

  • Description of dimensionless vertical coordinates.

  • Associating dimensions with auxiliary coordinate variables.

  • Linking data variables to scalar coordinate variables.

  • Associating dimensions with labels.

  • Description of intervals and cells.

  • Description of properties of data defined on intervals and cells.

  • Description of climatological statistics.

  • Data compression for variables with missing values.

Chapter 2.  NetCDF Files and Components

The components of a netCDF file are described in section 2 of the NUG [NUG]. In this section we describe conventions associated with filenames and the basic components of a netCDF file. We also introduce new attributes for describing the contents of a file.

2.1. Filename

NetCDF files should have the file name extension ".nc".

2.2. Data Types

The netCDF data types char, byte, short, int, float or real, and double are all acceptable. The char type is not intended for numeric data. One byte numeric data should be stored using the byte data type. All integer types are treated by the netCDF interface as signed. It is possible to treat the byte type as unsigned by using the NUG convention of indicating the unsigned range using the valid_min, valid_max, or valid_range attributes.

NetCDF does not support a character string type, so these must be represented as character arrays. In this document, a one dimensional array of character data is simply referred to as a "string". An n-dimensional array of strings must be implemented as a character array of dimension (n,max_string_length), with the last (most rapidly varying) dimension declared large enough to contain the longest string in the array. All the strings in a given array are therefore defined to be equal in length. For example, an array of strings containing the names of the months would be dimensioned (12,9) in order to accommodate "September", the month with the longest name.

2.3. Naming Conventions

Variable, dimension and attribute names should begin with a letter and be composed of letters, digits, and underscores. Note that this is in conformance with the COARDS conventions, but is more restrictive than the netCDF interface which allows use of the hyphen character. The netCDF interface also allows leading underscores in names, but the NUG states that this is reserved for system use.

Case is significant in netCDF names, but it is recommended that names should not be distinguished purely by case, i.e., if case is disregarded, no two names should be the same. It is also recommended that names should be obviously meaningful, if possible, as this renders the file more effectively self-describing.

This convention does not standardize any variable or dimension names. Attribute names and their contents, where standardized, are given in English in this document and should appear in English in conforming netCDF files for the sake of portability. Languages other than English are permitted for variables, dimensions, and non-standardized attributes. The content of some standardized attributes are string values that are not standardized, and thus are not required to be in English. For example, a description of what a variable represents may be given in a non-English language using the long_name attribute (see Section 3.2, “Long Name”) whose contents are not standardized, but a description given by the standard_name attribute (see Section 3.3, “Standard Name”) must be taken from the standard name table which is in English.

2.4. Dimensions

A variable may have any number of dimensions, including zero, and the dimensions must all have different names. COARDS strongly recommends limiting the number of dimensions to four, but we wish to allow greater flexibility. The dimensions of the variable define the axes of the quantity it contains. Dimensions other than those of space and time may be included. Several examples can be found in this document. Under certain circumstances, one may need more than one dimension in a particular quantity. For instance, a variable containing a two-dimensional probability density function might correlate the temperature at two different vertical levels, and hence would have temperature on both axes.

If any or all of the dimensions of a variable have the interpretations of "date or time" (T), "height or depth" (Z), "latitude" (Y), or "longitude" (X) then we recommend, but do not require (see Section 1.4, “Relationship to the COARDS Conventions”), those dimensions to appear in the relative order T, then Z, then Y, then X in the CDL definition corresponding to the file. All other dimensions should, whenever possible, be placed to the left of the spatiotemporal dimensions.

Dimensions may be of any size, including unity. When a single value of some coordinate applies to all the values in a variable, the recommended means of attaching this information to the variable is by use of a dimension of size unity with a one-element coordinate variable. It is also acceptable to use a scalar coordinate variable which eliminates the need for an associated size one dimension in the data variable. The advantage of using a coordinate variable is that all its attributes can be used to describe the single-valued quantity, including boundaries. For example, a variable containing data for temperature at 1.5 m above the ground has a single-valued coordinate supplying a height of 1.5 m, and a time-mean quantity has a single-valued time coordinate with an associated boundary variable to record the start and end of the averaging period.

2.5. Variables

This convention does not standardize variable names.

NetCDF variables that contain coordinate data are referred to as coordinate variables, auxiliary coordinate variables, scalar coordinate variables, or multidimensional coordinate variables.

2.5.1. Missing Data

The NUG conventions (NUG section 8.1) provide the _FillValue, valid_min, valid_max, and valid_range attributes to indicate missing data.

The NUG conventions for missing data changed significantly between version 2.3 and version 2.4. Since version 2.4 the NUG defines missing data as all values outside of the valid_range, and specifies how the valid_range should be defined from the _FillValue (which has library specified default values) if it hasn't been explicitly specified. If only one missing value is needed for a variable then we strongly recommend that this value be specified using the _FillValue attribute. Doing this guarantees that the missing value will be recognized by generic applications that follow either the before or after version 2.4 conventions.

The scalar attribute with the name _FillValue and of the same type as its variable is recognized by the netCDF library as the value used to pre-fill disk space allocated to the variable. This value is considered to be a special value that indicates undefined or missing data, and is returned when reading values that were not written. The _FillValue should be outside the range specified by valid_range (if used) for a variable. The netCDF library defines a default fill value for each data type (NUG section 7.16).

The missing_value attribute is considered deprecated by the NUG and we do not recommend its use. However for backwards compatibility with COARDS this standard continues to recognize the use of the missing_value attribute to indicate undefined or missing data.

The missing values of a variable with scale_factor and/or add_offset attributes (see section Section 8.1, “Packed Data”) are interpreted relative to the variable's external values, i.e., the values stored in the netCDF file. Applications that process variables that have attributes to indicate both a transformation (via a scale and/or offset) and missing values should first check that a data value is valid, and then apply the transformation. Note that values that are identified as missing should not be transformed. Since the missing value is outside the valid range it is possible that applying a transformation to it could result in an invalid operation. For example, the default _FillValue is very close to the maximum representable value of IEEE single precision floats, and multiplying it by 100 produces an "Infinity" (using single precision arithmetic).

2.6. Attributes

This standard describes many attributes (some mandatory, others optional), but a file may also contain non-standard attributes. Such attributes do not represent a violation of this standard. Application programs should ignore attributes that they do not recognise or which are irrelevant for their purposes. Conventional attribute names should be used wherever applicable. Non-standard names should be as meaningful as possible. Before introducing an attribute, consideration should be given to whether the information would be better represented as a variable. In general, if a proposed attribute requires ancillary data to describe it, is multidimensional, requires any of the defined netCDF dimensions to index its values, or requires a significant amount of storage, a variable should be used instead. When this standard defines string attributes that may take various prescribed values, the possible values are generally given in lower case. However, applications programs should not be sensitive to case in these attributes. Several string attributes are defined by this standard to contain "blank-separated lists". Consecutive words in such a list are separated by one or more adjacent spaces. The list may begin and end with any number of spaces. See Appendix A, Attributes for a list of attributes described by this standard.

2.6.1. Identification of Conventions

We recommend that netCDF files that follow these conventions indicate this by setting the NUG defined global attribute Conventions to the string value "CF-1.1" "CF-1.2" . The string is interpreted as a directory name relative to a directory that is a repository of documents describing sets of discipline-specific conventions. The conventions directory name is currently interpreted relative to the directory pub/netcdf/Conventions/ on the host machine ftp.unidata.ucar.edu. The web based versions of this document are linked from the netCDF Conventions web page .

2.6.2. Description of file contents

The following attributes are intended to provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers. The attribute values are all character strings. For readability in ncdump outputs it is recommended to embed newline characters into long strings to break them into lines. For backwards compatibility with COARDS none of these global attributes is required.

The NUG defines title and history to be global attributes. We wish to allow the newly defined attributes, i.e., institution, source, references, and comment, to be either global or assigned to individual variables. When an attribute appears both globally and as a variable attribute, the variable's version has precedence.

title

A succinct description of what is in the dataset.

institution

Specifies where the original data was produced.

source

The method of production of the original data. If it was model-generated, source should name the model and its version, as specifically as could be useful. If it is observational, source should characterize it (e.g., "surface observation" or "radiosonde").

history

Provides an audit trail for modifications to the original data. Well-behaved generic netCDF filters will automatically append their name and the parameters with which they were invoked to the global history attribute of an input netCDF file. We recommend that each line begin with a timestamp indicating the date and time of day that the program was executed.

references

Published or web-based references that describe the data or methods used to produce it.

comment

Miscellaneous information about the data or methods used to produce it.

Chapter 3.  Description of the Data

The attributes described in this section are used to provide a description of the content and the units of measurement for each variable. We continue to support the use of the units and long_name attributes as defined in COARDS. We extend COARDS by adding the optional standard_name attribute which is used to provide unique identifiers for variables. This is important for data exchange since one cannot necessarily identify a particular variable based on the name assigned to it by the institution that provided the data.

The standard_name attribute can be used to identify variables that contain coordinate data. But since it is an optional attribute, applications that implement these standards must continue to be able to identify coordinate types based on the COARDS conventions.

3.1. Units

The units attribute is required for all variables that represent dimensional quantities (except for boundary variables defined in Section 7.1, “Cell Boundaries” and climatology variables defined in Section 7.4, “Climatological Statistics”). The value of the units attribute is a string that can be recognized by UNIDATA"s Udunits package [UDUNITS], with a few exceptions that are given below. The Udunits package includes a file udunits.dat, which lists its supported unit names. Note that case is significant in the units strings.

The COARDS convention prohibits the unit degrees altogether, but this unit is not forbidden by the CF convention because it may in fact be appropriate for a variable containing, say, solar zenith angle. The unit degrees is also allowed on coordinate variables such as the latitude and longitude coordinates of a transformed grid. In this case the coordinate values are not true latitudes and longitudes which must always be identified using the more specific forms of degrees as described in Section 4.1, “Latitude Coordinate” and Section 4.2, “Longitude Coordinate”.

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The Udunits package defines a few dimensionless units, such as percent, but is lacking commonly used units such as ppm (parts per million). This convention does not support the addition of new dimensionless units that are not udunits compatible. The conforming unit for quantities that represent fractions, or parts of a whole, is "1". The conforming unit for parts per million is "1e-6". Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see below) rather than the units.

The units level, layer, and sigma_level are allowed for dimensionless vertical coordinates to maintain backwards compatibility with COARDS. These units are not compatible with Udunits and are deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are introduced (see Section 4.3.2, “Dimensionless Vertical Coordinate”).

The Udunits syntax that allows scale factors and offsets to be applied to a unit is not supported by this standard. The application of any scale factors or offsets to data should be indicated by the scale_factor and add_offset attributes. Use of these attributes for data packing, which is their most important application, is discussed in detail in Section 8.1, “Packed Data”.

Udunits recognizes the following prefixes and their abbreviations.

Table 3.1. Supported Units

FactorPrefixAbbreviation FactorPrefixAbbreviation
1e1deca,dekada 1e-1decid
1e2hectoh 1e-2decicentic
1e3kilok 1e-3millim
1e6megaM 1e-6microu
1e9gigaG 1e-9nanon
1e12teraT 1e-12picop
1e15petaP 1e-15femtof
1e18exaE 1e-18attoa
1e21zettaZ 1e-21zeptoz
1e24yottaY 1e-24yoctoy


3.2. Long Name

The long_name attribute is defined by the NUG to contain a long descriptive name which may, for example, be used for labeling plots. For backwards compatibility with COARDS this attribute is optional. But it is highly recommended that either this or the standard_name attribute defined in the next section be provided to make the file self-describing. If a variable has no long_name attribute then an application may use, as a default, the standard_name if it exists, or the variable name itself.

3.3. Standard Name

A fundamental requirement for exchange of scientific data is the ability to describe precisely the physical quantities being represented. To some extent this is the role of the long_name attribute as defined in the NUG. However, usage of long_name is completely ad-hoc. For some applications it would be desirable to have a more definitive description of the quantity, which would allow users of data from different sources to determine whether quantities were in fact comparable. For this reason an optional mechanism for uniquely associating each variable with a standard name is provided.

A standard name is associated with a variable via the attribute standard_name which takes a string value comprised of a standard name optionally followed by one or more blanks and a standard name modifier (a string value from Appendix C, Standard Name Modifiers).

The set of permissible standard names is contained in the standard name table. The table entry for each standard name contains the following:

standard name

The name used to identify the physical quantity. A standard name contains no whitespace and is case sensitive.

canonical units

Representative units of the physical quantity. Unless it is dimensionless, a variable with a standard_name attribute must have units which are physically equivalent (not necessarily identical) to the canonical units, possibly modified by an operation specified by either the standard name modifier (see below and Appendix C, Standard Name Modifiers) or by the cell_methods attribute (see Section 7.3, “Cell Methods” and Appendix E, Cell Methods).

description

The description is meant to clarify the qualifiers of the fundamental quantities such as which surface a quantity is defined on or what the flux sign conventions are. We don"t attempt to provide precise definitions of fundumental physical quantities (e.g., temperature) which may be found in the literature.

When appropriate, the table entry also contains the corresponding GRIB parameter code(s) (from ECMWF and NCEP) and AMIP identifiers.

The standard name table is located at http://cf-pcmdi.llnl.gov/documents/cf-standard-names/current/cf-standard-name-table.xml , written in compliance with the XML format, as described in Appendix B, Standard Name Table Format. Knowledge of the XML format is only necessary for application writers who plan to directly access the table. A formatted text version of the table is provided at http://cf-pcmdi.llnl.gov/documents/cf-standard-names/current/cf-standard-name-table.html , and this table may be consulted in order to find the standard name that should be assigned to a variable.

Standard names by themselves are not always sufficient to describe a quantity. For example, a variable may contain data to which spatial or temporal operations have been applied. Or the data may represent an uncertainty in the measurement of a quantity. These quantity attributes are expressed as modifiers of the standard name. Modifications due to common statistical operations are expressed via the cell_methods attribute (see Section 7.3, “Cell Methods” and Appendix E, Cell Methods). Other types of quantity modifiers are expressed using the optional modifier part of the standard_name attribute. The permissible values of these modifiers are given in Appendix C, Standard Name Modifiers.

Example 3.1. Use of standard_name

float psl(lat,lon) ;
  psl:long_name = "mean sea level pressure" ;
  psl:units = "hPa" ;
  psl:standard_name = "air_pressure_at_sea_level" ;
      

The description in the standard name table entry for air_pressure_at_sea_level clarifies that "sea level" refers to the mean sea level, which is close to the geoid in sea areas.


Here are lists of equivalences between the CF standard names and the standard names from the ECMWF GRIB tables, the NCEP GRIB tables, and the PCMDI tables.

3.4. Ancillary Data

When one data variable provides metadata about the individual values of another data variable it may be desirable to express this association by providing a link between the variables. For example, instrument data may have associated measures of uncertainty. The attribute ancillary_variables is used to express these types of relationships. It is a string attribute whose value is a blank separated list of variable names. The nature of the relationship between variables associated via ancillary_variables must be determined by other attributes. The variables listed by the ancillary_variables attribute will often have the standard name of the variable which points to them including a modifier (Appendix C, Standard Name Modifiers) to indicate the relationship.

Example 3.2. Instrument data

  float q(time) ;
    q:standard_name = "specific_humidity" ;
    q:units = "g/g" ;
    q:ancillary_variables = "q_error_limit q_detection_limit" ;
  float q_error_limit(time)
    q_error_limit:standard_name = "specific_humidity standard_error" ;
    q_error_limit:units = "g/g" ;
  float q_detection_limit(time)
    q_detection_limit:standard_name = "specific_humidity detection_minimum" ;
    q_detection_limit:units = "g/g" ;
      

3.5. Flags

The attributes flag_values and flag_meanings are intended to make variables that contain flag values self describing. The flag_values attribute is the same type as the variable to which it is attached, and contains a list of the possible flag values. The flag_meanings attribute is a string whose value is a blank separated list of descriptive words or phrases, one for each flag value. If multi-word phrases are used to describe the flag values, then the words within a phrase should be connected with underscores.

Example 3.3. A flag variable

  byte current_speed_qc(time, depth, lat, lon) ;
    current_speed_qc:long_name = "Current Speed Quality" ;
    current_speed_qc:_FillValue = -128b ;
    current_speed_qc:valid_range = -127b, 127b ;
    current_speed_qc:flag_values = 0b, 1b, 2b ;
    current_speed_qc:flag_meanings = "quality_good sensor_nonfunctional 
                                                     outside_valid_range" ;
      

Chapter 4.  Coordinate Types

Four types of coordinates receive special treatment by these conventions: latitude, longitude, vertical, and time. We continue to support the special role that the units and positive attributes play in the COARDS convention to identify coordinate type. We extend COARDS by providing explicit definitions of dimensionless vertical coordinates. The definitions are associated with a coordinate variable via the standard_name and formula_terms attributes. For backwards compatibility with COARDS use of these attributes is not required, but is strongly recommended.

Because identification of a coordinate type by its units is complicated by requiring the use of an external software package [UDUNITS], we provide two optional methods that yield a direct identification. The attribute axis may be attached to a coordinate variable and given one of the values X, Y, Z or T which stand for a longitude, latitude, vertical, or time axis respectively. Alternatively the standard_name attribute may be used for direct identification. But note that these optional attributes are in addition to the required COARDS metadata.

Coordinate types other than latitude, longitude, vertical, and time are allowed. To identify generic spatial coordinates we recommend that the axis attribute be attached to these coordinates and given one of the values X, Y or Z. The values X and Y for the axis attribute should be used to identify horizontal coordinate variables. If both X- and Y-axis are identified, X-Y-up should define a right-handed coordinate system, i.e. rotation from the positive X direction to the positive Y direction is anticlockwise if viewed from above. We strongly recommend that coordinate variables be used for all coordinate types whenever they are applicable.

The methods of identifying coordinate types described in this section apply both to coordinate variables and to auxiliary coordinate variables named by the coordinates attribute (see Chapter 5, Coordinate Systems ).

4.1. Latitude Coordinate

Variables representing latitude must always explicitly include the units attribute; there is no default value. The units attribute will be a string formatted as per the udunits.dat file. The recommended unit of latitude is degrees_north. Also acceptable are degree_north, degree_N, degrees_N, degreeN, and degreesN.

Example 4.1. Latitude axis

float lat(lat) ;
  lat:long_name = "latitude" ;
  lat:units = "degrees_north" ;
  lat:standard_name = "latitude" ;
      

Application writers should note that the Udunits package does not recognize the directionality implied by the "north" part of the unit specification. It only recognizes its size, i.e., 1 degree is defined to be pi/180 radians. Hence, determination that a coordinate is a latitude type should be done via a string match between the given unit and one of the acceptable forms of degrees_north.

Optionally, the latitude type may be indicated additionally by providing the standard_name attribute with the value latitude, and/or the axis attribute with the value Y.

Coordinates of latitude with respect to a rotated pole should be given units of degrees, not degrees_north or equivalents, because applications which use the units to identify axes would have no means of distinguishing such an axis from real latitude, and might draw incorrect coastlines, for instance.

4.2. Longitude Coordinate

Variables representing longitude must always explicitly include the units attribute; there is no default value. The units attribute will be a string formatted as per the udunits.dat file. The recommended unit of longitude is degrees_east. Also acceptable are degree_east, degree_E, degrees_E, degreeE, and degreesE.

Example 4.2. Longitude axis

float lon(lon) ;
  lon:long_name = "longitude" ;
  lon:units = "degrees_east" ;
  lon:standard_name = "longitude" ;
      

Application writers should note that the Udunits package has limited recognition of the directionality implied by the "east" part of the unit specification. It defines degrees_east to be pi/180 radians, and hence equivalent to degrees_north. We recommend the determination that a coordinate is a longitude type should be done via a string match between the given unit and one of the acceptable forms of degrees_east.

Optionally, the longitude type may be indicated additionally by providing the standard_name attribute with the value longitude, and/or the axis attribute with the value X.

Coordinates of longitude with respect to a rotated pole should be given units of degrees, not degrees_east or equivalents, because applications which use the units to identify axes would have no means of distinguishing such an axis from real longitude, and might draw incorrect coastlines, for instance.

4.3. Vertical (Height or Depth) Coordinate

Variables representing dimensional height or depth axes must always explicitly include the units attribute; there is no default value.

The direction of positive (i.e., the direction in which the coordinate values are increasing), whether up or down, cannot in all cases be inferred from the units. The direction of positive is useful for applications displaying the data. For this reason the attribute positive as defined in the COARDS standard is required if the vertical axis units are not a valid unit of pressure (a determination which can be made using the udunits routine, utScan) -- otherwise its inclusion is optional. The positive attribute may have the value up or down (case insensitive). This attribute may be applied to either coordinate variables or auxillary coordinate variables that contain vertical coordinate data.

For example, if an oceanographic netCDF file encodes the depth of the surface as 0 and the depth of 1000 meters as 1000 then the axis would use attributes as follows:

axis_name:units = "meters" ; 
axis_name:positive = "down" ; 	
      

If, on the other hand, the depth of 1000 meters were represented as -1000 then the value of the positive attribute would have been up. If the units attribute value is a valid pressure unit the default value of the positive attribute is down.

A vertical coordinate will be identifiable by:

  • units of pressure; or

  • the presence of the positive attribute with a value of up or down (case insensitive).

Optionally, the vertical type may be indicated additionally by providing the standard_name attribute with an appropriate value, and/or the axis attribute with the value Z.

4.3.1. Dimensional Vertical Coordinate

The units attribute for dimensional coordinates will be a string formatted as per the udunits.dat file. The acceptable units for vertical (depth or height) coordinate variables are:

  • units of pressure as listed in the file udunits.dat. For vertical axes the most commonly used of these include include bar, millibar, decibar, atmosphere (atm), pascal (Pa), and hPa.

  • units of length as listed in the file udunits.dat. For vertical axes the most commonly used of these include meter (metre, m), and kilometer (km).

  • other units listed in the file udunits.dat that may under certain circumstances reference vertical position such as units of density or temperature.

Plural forms are also acceptable.

4.3.2. Dimensionless Vertical Coordinate

The units attribute is not required for dimensionless coordinates. For backwards compatibility with COARDS we continue to allow the units attribute to take one of the values: level, layer, or sigma_level. These values are not recognized by the Udunits package, and are considered a deprecated feature in the CF standard.

For dimensionless vertical coordinates we extend the COARDS standard by making use of the standard_name attribute to associate a coordinate with its definition from Appendix D, Dimensionless Vertical Coordinates. The definition provides a mapping between the dimensionless coordinate values and dimensional values that can positively and uniquely indicate the location of the data. A new attribute, formula_terms, is used to associate terms in the definitions with variables in a netCDF file. To maintain backwards compatibility with COARDS the use of these attributes is not required, but is strongly recommended.

Example 4.3. Atmosphere sigma coordinate

float lev(lev) ;
  lev:long_name = "sigma at layer midpoints" ;
  lev:positive = "down" ;
  lev:standard_name = "atmosphere_sigma_coordinate" ;
  lev:formula_terms = "sigma: lev ps: PS ptop: PTOP" ;
	

In this example the standard_name value atmosphere_sigma_coordinate identifies the following definition from Appendix C, Standard Name Modifiers which specifies how to compute pressure at gridpoint (n,k,j,i) where j and i are horizontal indices, k is a vertical index, and n is a time index:

p(n,k,j,i) = ptop + sigma(k)*(ps(n,j,i)-ptop)
	

The formula_terms attribute associates the variable lev with the term sigma, the variable PS with the term ps, and the variable PTOP with the term ptop. Thus the pressure at gridpoint (n,k,j,i) would be calculated by

p(n,k,j,i) = PTOP + lev(k)*(PS(n,j,i)-PTOP)
	

4.4. Time Coordinate

Variables representing time must always explicitly include the units attribute; there is no default value. The units attribute takes a string value formatted as per the recommendations in the Udunits package [UDUNITS]. The following excerpt from the Udunits documentation explains the time unit encoding by example:

	The specification:

    seconds since 1992-10-8 15:15:42.5 -6:00

indicates seconds since October 8th, 1992  at  3  hours,  15
minutes  and  42.5 seconds in the afternoon in the time zone
which is six hours to the west of Coordinated Universal Time
(i.e.  Mountain Daylight Time).  The time zone specification
can also be written without a colon using one or  two-digits
(indicating hours) or three or four digits (indicating hours
and minutes).
      

The acceptable units for time are listed in the udunits.dat file. The most commonly used of these strings (and their abbreviations) includes day (d), hour (hr, h), minute (min) and second (sec, s). Plural forms are also acceptable. The reference time string (appearing after the identifier since) may include date alone; date and time; or date, time, and time zone. The reference time is required. A reference time in year 0 has a special meaning (see Section 7.4, “Climatological Statistics”).

Note: if the time zone is omitted the default is UTC, and if both time and time zone are omitted the default is 00:00:00 UTC.

We recommend that the unit year be used with caution. The Udunits package defines a year to be exactly 365.242198781 days (the interval between 2 successive passages of the sun through vernal equinox). It is not a calendar year. Udunits includes the following definitions for years: a common_year is 365 days, a leap_year is 366 days, a Julian_year is 365.25 days, and a Gregorian_year is 365.2425 days.

For similar reasons the unit month, which is defined in udunits.dat to be exactly year/12, should also be used with caution.

Example 4.4. Time axis

double time(time) ;
  time:long_name = "time" ;
  time:units = "days since 1990-1-1 0:0:0" ;
      

A time coordinate is identifiable from its units string alone. The Udunits routines utScan() and utIsTime() can be used to make this determination.

Optionally, the time coordinate may be indicated additionally by providing the standard_name attribute with an appropriate value, and/or the axis attribute with the value T.

4.4.1. Calendar

In order to calculate a new date and time given a base date, base time and a time increment one must know what calendar to use. For this purpose we recommend that the calendar be specified by the attribute calendar which is assigned to the time coordinate variable. The values currently defined for calendar are:

gregorian or standard

Mixed Gregorian/Julian calendar as defined by Udunits. This is the default.

proleptic_gregorian

A Gregorian calendar extended to dates before 1582-10-15. That is, a year is a leap year if either (i) it is divisible by 4 but not by 100 or (ii) it is divisible by 400.

noleap or 365_day

Gregorian calendar without leap years, i.e., all years are 365 days long.

all_leap or 366_day

Gregorian calendar with every year being a leap year, i.e., all years are 366 days long.

360_day

All years are 360 days divided into 30 day months.

julian

Julian calendar.

none

No calendar.

The calendar attribute may be set to none in climate experiments that simulate a fixed time of year. The time of year is indicated by the date in the reference time of the units attribute. The time coordinate that might apply in a perpetual July experiment are given in the following example.

Example 4.5. Perpetual time axis

variables:
  double time(time) ;
    time:long_name = "time" ;
    time:units = "days since 1-7-15 0:0:0" ;
    time:calendar = "none" ;
data:
  time = 0., 1., 2., ...;
      

Here, all days simulate the conditions of 15th July, so it does not make sense to give them different dates. The time coordinates are interpreted as 0, 1, 2, etc. days since the start of the experiment.

If none of the calendars defined above applies (e.g., calendars appropriate to a different paleoclimate era), a non-standard calendar can be defined. The lengths of each month are explicitly defined with the month_lengths attribute of the time axis:

month_lengths

A vector of size 12, specifying the number of days in the months from January to December (in a non-leap year).

If leap years are included, then two other attributes of the time axis should also be defined:

leap_year

An example of a leap year. It is assumed that all years that differ from this year by a multiple of four are also leap years. If this attribute is absent, it is assumed there are no leap years.

leap_month

A value in the range 1-12, specifying which month is lengthened by a day in leap years (1=January). If this attribute is not present, February (2) is assumed. This attribute is ignored if leap_year is not specified.

The calendar attribute is not required when a non-standard calendar is being used. It is sufficient to define the calendar using the month_lengths attribute, along with leap_year, and leap_month as appropriate. However, the calendar attribute is allowed to take non-standard values and in that case defining the non-standard calendar using the appropriate attributes is required.

Example 4.6. Paleoclimate time axis

double time(time) ;
  time:long_name = "time" ;
  time:units = "days since 1-1-1 0:0:0" ;
  time:calendar = "126 kyr B.P." ;
  time:month_lengths = 34, 31, 32, 30, 29, 27, 28, 28, 28, 32, 32, 34 ;
	

The mixed Gregorian/Julian calendar used by Udunits is explained in the following excerpt from the udunits(3) man page:

The udunits(3) package uses a mixed Gregorian/Julian  calen-
dar  system.   Dates  prior to 1582-10-15 are assumed to use
the Julian calendar, which was introduced by  Julius  Caesar
in 46 BCE and is based on a year that is exactly 365.25 days
long.  Dates on and after 1582-10-15 are assumed to use  the
Gregorian calendar, which was introduced on that date and is
based on a year that is exactly 365.2425 days long.  (A year
is  actually  approximately 365.242198781 days long.)  Seem-
ingly strange behavior of the udunits(3) package can  result
if  a user-given time interval includes the changeover date.
For example, utCalendar() and utInvCalendar() can be used to
show that 1582-10-15 *preceded* 1582-10-14 by 9 days.
	

Due to problems caused by the discontinuity in the default mixed Gregorian/Julian calendar, we strongly recommend that this calendar should only be used when the time coordinate does not cross the discontinuity. For time coordinates that do cross the discontinuity the proleptic_gregorian calendar should be used instead.

Chapter 5.  Coordinate Systems

A variable's spatiotemporal dimensions are used to locate data values in time and space. This is accomplished by associating these dimensions with the relevant set of latitude, longitude, vertical, and time coordinates. This section presents two methods for making that association: the use of coordinate variables, and the use of auxiliary coordinate variables.

All of a variable's dimensions that are latitude, longitude, vertical, or time dimensions (see Section 1.2, “Terminology”) must have corresponding coordinate variables, i.e., one-dimensional variables with the same name as the dimension (see examples in Chapter 4, Coordinate Types ). This is the only method of associating dimensions with coordinates that is supported by [COARDS].

All of a variable's spatiotemporal dimensions that are not latitude, longitude, vertical, or time dimensions are required to be associated with the relevant latitude, longitude, vertical, or time coordinates via the new coordinates attribute of the variable. The value of the coordinates attribute is a blank separated list of the names of auxiliary coordinate variables. There is no restriction on the order in which the auxiliary coordinate variables appear in the coordinates attribute string. The dimensions of an auxiliary coordinate variable must be a subset of the dimensions of the variable with which the coordinate is associated (an exception is label coordinates (Section 6.1, “Labels”) which contain a dimension for maximum string length). We recommend that the name of a multidimensional coordinate variable should not match the name of any of its dimensions because that precludes supplying an associated coordinate variable for the dimension. This practice also avoids potential bugs in applications that determine coordinate variables by only checking for a name match between a dimension and a variable and not checking that the variable is one dimensional.

The use of coordinate variables is required whenever they are applicable. That is, auxiliary coordinate variables may not be used as the only way to identify latitude and longitude coordinates that could be identified using coordinate variables. This is both to enhance conformance to COARDS and to facilitate the use of generic applications that recognize the NUG convention for coordinate variables. An application that is trying to find the latitude coordinate of a variable should always look first to see if any of the variable's dimensions correspond to a latitude coordinate variable. If the latitude coordinate is not found this way, then the auxiliary coordinate variables listed by the coordinates attribute should be checked. Note that it is permissible, but optional, to list coordinate variables as well as auxiliary coordinate variables in the coordinates attribute. The axis attribute is not allowed for auxiliary coordinate variables. Auxiliary coordinate variables which lie on the horizontal surface can be identified as such by their dimensions being horizontal, which can in turn be inferred from their having an axis attribute of X or Y , or from their units in the case of latitude and longitude (see Chapter 4, Coordinate Types ).

If the coordinate variables for a horizontal grid are not longitude and latitude, it is recommended that they be supplied in addition to the required coordinates. For example, the Cartesian coordinates of a map projection should be supplied as coordinate variables in addition to the required two-dimensional latitude and longitude variables that are identified via the coordinates attribute. The use of the axis attribute with values X and Y is recommended for the coordinate variables(see Chapter 4, Coordinate Types ).

It is sometimes not practical to specify the latitude-longitude location of data which is representative of geographic regions with complex boundaries. For this purpose, provision is made in Section 6.1.1, “Geographic Regions” for indicating the region by a standardized name.

5.1. Independent Latitude, Longitude, Vertical, and Time Axes

When each of a variable's spatiotemporal dimensions is a latitude, longitude, vertical, or time dimension, then each axis is identified by a coordinate variable.

Example 5.1. Independent coordinate variables

dimensions:
  lat = 18 ;
  lon = 36 ;
  pres = 15 ;
  time = 4 ;
variables:
  float xwind(time,pres,lat,lon) ;
    xwind:long_name = "zonal wind" ;
    xwind:units = "m/s" ;
  float lon(lon) ;
    lon:long_name = "longitude" ;
    lon:units = "degrees_east" ;
  float lat(lat) ;
    lat:long_name = "latitude" ;
    lat:units = "degrees_north" ;
  float pres(pres) ;
    pres:long_name = "pressure" ;
    pres:units = "hPa" ;
  double time(time) ;
    time:long_name = "time" ;
    time:units = "days since 1990-1-1 0:0:0" ;
      

xwind(n,k,j,i) is associated with the coordinate values lon(i), lat(j), pres(k), and time(n).

5.2. Two-Dimensional Latitude, Longitude, Coordinate Variables

The latitude and longitude coordinates of a horizontal grid that was not defined as a Cartesian product of latitude and longitude axes, can sometimes be represented using two-dimensional coordinate variables. These variables are identified as coordinates by use of the coordinates attribute.