Opened 6 years ago

Closed 6 years ago

#26 closed enhancement (fixed)

Enhance CF flag definitions to support bit fields

Reported by: gregr
Owned by: russ
Priority: medium
Milestone: Website Transition
Component: cf-conventions
Version: 1.0
Keywords: flags
Cc:

Description

1. Title

Bit field enhancement to CF Flags definition

2. Moderator

???

3. Requirement

CF ought to provide a flag expression for multiple conditions, typically Boolean (binary), that describes one or more status conditions associated with a data variable.

4. Initial Statement of Technical Proposal


A new CF flag attribute, named "flag_masks", would enhance the current CF flags capabilities to describe multiple, independent status conditions using bit fields to define unique conditions or status codes.

5. Benefits

Bit field flag attributes are best suited for describing data variables possessing a number of status conditions that typically occur independently of each other.

Bit field attributes would simplify the description of status flags that don't conveniently map to a unique set of mutually exclusive status codes, currently defined with flag_values attributes.

For example, when describing a precipitation measurement within a particular geo-spatial grid, a bit field flag value may be defined to contain four possible binary status conditions, occupying the four least significant bits, whose values would be defined as follows:
1 to indicate no sensor coverage at the grid location,
2 to indicate observation impairment at that grid location,
4 to indicate mixed-phase precipitation at the grid location,
8 to indicate snow precipitation at the grid location.
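
As a minimal sketch of how a reader might decode such a value, assuming the four masks listed above (the meaning strings and the function name decode_precip_flags are hypothetical, chosen only for illustration):

```python
# Hypothetical meaning strings for the four masks in the example above.
PRECIP_FLAG_MASKS = {
    1: "no_sensor_coverage",
    2: "observation_impairment",
    4: "mixed_phase_precipitation",
    8: "snow_precipitation",
}


def decode_precip_flags(value):
    """Return the meanings of every bit that is set in value."""
    return [meaning for mask, meaning in PRECIP_FLAG_MASKS.items() if value & mask]
```

For instance, a value of 10 (2 + 8) would decode to observation impairment together with snow precipitation, and a value of 0 would decode to no conditions at all.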

6. Status Quo

Current flag values could be defined for every OR'ed combination of bit settings that define all possible status conditions, but the result would be inefficient compared to simple bit field definitions.
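
To give a rough sense of the difference, assume n independent Boolean conditions: an encoding based only on flag_values needs one entry per non-empty OR'ed combination, while a bit field needs one mask per condition. The function names below are illustrative only:

```python
def entries_needed_with_flag_values(n_conditions):
    """One flag_values entry per non-empty OR'ed combination of n bits."""
    return 2 ** n_conditions - 1


def entries_needed_with_flag_masks(n_conditions):
    """One flag_masks entry per independent condition."""
    return n_conditions
```

For the four conditions in the precipitation example, that would be 15 enumerated values against 4 masks.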

Attachments (2)

reTrac (6.5 KB) - added by gregr 6 years ago.
Four suggested changes to the existing CF Conventions document regarding the proposed bit field flags attribute.
flag_masks (9.9 KB) - added by russ 6 years ago.
Enhancement to Flags with addition of flag_masks attribute


Change History (18)

Changed 6 years ago by gregr

Four suggested changes to the existing CF Conventions document regarding the proposed bit field flags attribute.

comment:1 Changed 6 years ago by jonathan

Dear Greg

Thanks for this useful addition, which I support. I hope someone will volunteer to moderate this ticket.

The changes you propose to the conventions text (in your attachment) look clear to me. Could you also propose the required changes to the conformance document (which is implemented by the CF checker)?

Best wishes

Jonathan

comment:2 Changed 6 years ago by russ

  • Owner changed from cf-conventions@… to russ

Greg,

I'm volunteering to moderate this ticket.

--Russ

comment:3 follow-up: Changed 6 years ago by gregr

The following is an update of Section 3.5 (Flags) of the CF Conformance Requirements and Recommendations with my proposed additions:

3.5 Flags

Requirements:

  • The flag_values attribute must have the same type as the variable to which it is attached.

+ The flag_masks attribute must have the same type as the variable to which it is attached.

+ The flag_masks attribute must have a type that is compatible with bit field expression, for example, char, byte, short and int.

+ Floating point decimal flag_masks values are prohibited, specifically, float, real and double.

+ The flag_masks attribute values must be non-zero.

+ The flag_values attribute values must be mutually exclusive among the set of flag_values attribute values defined for that variable.

+ When only the flag_masks and flag_meanings attributes are defined (no flag_values), the bit fields of each flag_masks value must not be shared with any other flag_masks value defined for that variable. For example, a boolean exclusive OR of each flag_masks value with any other flag_masks value must be false (zero).

Recommendations:

+ The number of flag_values attribute values should equal the number of words or phrases appearing in the flag_meanings string.

+ The number of flag_masks attribute values should equal the number of words or phrases appearing in the flag_meanings string.

comment:4 in reply to: ↑ 3 Changed 6 years ago by jonathan

Dear Greg

Your requirements 2-4 are related. Maybe they could be combined and simplified thus:

If there is a flag_masks attribute, the data variable must have a type that is compatible with bit field expression (char, byte, short and int), not floating-point (float, real, double), and the flag_masks attribute must have the same type.

When only the flag_masks and flag_meanings attributes are defined (no flag_values), the bit fields of each flag_masks value must not be shared with any other flag_masks value defined for that variable. For example, a boolean exclusive OR of each flag_masks value with any other flag_masks value must be false (zero).

Should that be "boolean AND"? If XOR gives zero, the masks are the same, I think.
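
A quick check of the two operators, using a few assumed mask values, suggests that AND is the test that detects shared bits (the helper name share_bits is illustrative):

```python
def share_bits(mask_a, mask_b):
    """True when the two masks have at least one bit in common."""
    return (mask_a & mask_b) != 0


# Distinct, non-overlapping masks: AND is zero, XOR is non-zero.
assert (1 & 2) == 0 and (1 ^ 2) == 3
# Identical masks: XOR is zero, yet every bit is shared.
assert (12 ^ 12) == 0 and share_bits(12, 12)
# Partially overlapping masks (binary 0110 and 1100) share bit 2.
assert share_bits(6, 12) and (6 ^ 12) == 10
```

So a zero XOR only says the two masks are identical, whereas a zero AND says they have no bits in common.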

These two:

The number of flag_values attribute values should equal the number of words or phrases appearing in the flag_meanings string.

The number of flag_masks attribute values should equal the number of words or phrases appearing in the flag_meanings string.

should be requirements, I think, i.e. if they are not met it is an error. I don't really know why the first of these isn't already a requirement.

You could have a recommendation that if flag_values and flag_masks are both defined, the Boolean AND of each entry in flag_values with its corresponding entry in flag_masks should leave the value unaffected i.e. the mask selects all the bits required to express the value.

Cheers

Jonathan

comment:5 Changed 6 years ago by gregr

Jonathan,

Good comments. The more consolidation for fewer rules, the better. I bumped both 'number of words or phrases' items into the Requirements list. After confirming that the CF compliance checker currently permits an unequal number of values and meanings, I wasn't sure of the current status, so I had opted to submit them as Recommendations. I agree, they ought to be required. I debated merging these two requirements into one but decided that the result may be ambiguous; perhaps you can suggest a more suitable wording.

I reworded the flag_masks exclusivity requirement to use a Boolean AND, rather than an XOR, as you suggested. They both establish the same restriction, but the AND will probably be clearer to readers.

I also added a recommendation, as you also suggested, regarding the relation of flag_masks and flag_values.

So, the new-and-improved list of Flags requirements and recommendations is as follows.

Thanks,
Greg.

3.5 Flags

Requirements:

+ The number of flag_values attribute values should equal the number of words or phrases appearing in the flag_meanings string.

+ The number of flag_masks attribute values should equal the number of words or phrases appearing in the flag_meanings string.

  • The flag_values attribute must have the same type as the variable to which it is attached.

+ Variables with a flag_masks attribute must have a type that is compatible with bit field expression (char, byte, short and int), not floating-point (float, real, double), and the flag_masks attribute must have the same type.

+ The flag_masks attribute values must be non-zero.

+ The flag_values attribute values must be mutually exclusive among the set of flag_values attribute values defined for that variable.

+ When only flag_masks and flag_meanings attributes are defined (no flag_values), the bit fields of each flag_masks value must not be shared with any other flag_masks value defined for that variable. For example, a boolean AND of each flag_masks value with any other flag_masks value must be false (zero).

Recommendations:

+ When flag_masks and flag_values are both defined, the Boolean AND of each entry in flag_values with its corresponding entry in flag_masks should equal the flag_values entry, i.e., the mask selects all the bits required to express the value.
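
Several of the rules above could be checked mechanically. The following is only a sketch under stated assumptions: the function name and message strings are hypothetical, attributes are passed as plain Python lists and a string, and the type rules and the flag_values exclusivity rule are not covered. It is not the CF checker's implementation.

```python
def check_flag_attributes(flag_meanings, flag_masks=None, flag_values=None):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    n_meanings = len(flag_meanings.split())

    if flag_values is not None and len(flag_values) != n_meanings:
        problems.append("number of flag_values differs from number of flag_meanings")

    if flag_masks is not None:
        if len(flag_masks) != n_meanings:
            problems.append("number of flag_masks differs from number of flag_meanings")
        if any(mask == 0 for mask in flag_masks):
            problems.append("flag_masks contains a zero value")

        if flag_values is None:
            # Masks alone: no bit may be shared between any two masks.
            for i, a in enumerate(flag_masks):
                for b in flag_masks[i + 1:]:
                    if a & b:
                        problems.append(f"flag_masks {a} and {b} share bits")
        else:
            # Recommendation: each mask should cover all bits of its value.
            for mask, value in zip(flag_masks, flag_values):
                if (value & mask) != value:
                    problems.append(f"flag_value {value} has bits outside mask {mask}")

    return problems
```

With masks 1, 2 and 4 and three meanings, the sketch reports nothing; with masks 1 and 3 it reports that the two masks share bits.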

comment:6 follow-up: Changed 6 years ago by gregr

I forgot to replace 'should' with 'must' while changing the first two items from recommendations to requirements:

+ The number of flag_values attribute values must equal the number of words or phrases appearing in the flag_meanings string.

+ The number of flag_masks attribute values must equal the number of words or phrases appearing in the flag_meanings string.

comment:7 Changed 6 years ago by dettling

I would like to voice my support for these modifications to the CF conventions document. We could and will definitely make use of these enhancements to the flags definition for CoSPA (Consolidated Storm Prediction for Aviation) which includes data products from multiple organizations (GSD/NOAA, MIT/LL, and NCAR/RAL).

Sue Dettling
Research Applications Lab
NCAR

comment:8 in reply to: ↑ 6 Changed 6 years ago by russ

Replying to gregr:

Greg,

I was about to suggest we wrap up discussion on this proposal, but in
looking at the Examples 3.3, 3.4, and 3.5 in the attachment, I was
left with some questions about use of the "valid_range" attribute and
the meaning of the combined use of flag_masks and flag_values in
Example 3.5.

Example 3.3 has

current_speed_qc:valid_range = -127b, 127b ;

and

current_speed_qc:flag_values = 0b, 1b, 2b ;

If the only allowable values for current_speed_qc are the enumerated
flag_values, why does the valid_range permit other values? What would
be the meaning of a current_speed_qc value of -1 in this case, which
is within the valid range but not one of the listed flag_values? I
suggest changing the valid_range in this example to

current_speed_qc:valid_range = 0b, 2b ;

Similarly, Example 3.4 has

sensor_status_qc:valid_range = 1b, 127b ;

and

sensor_status_qc:flag_masks = 1b, 2b, 4b, 8b, 16b, 32b ;

where the valid_range is wider than the possible combinations of
flag_masks, so in this example, I would suggest changing valid_range
to

sensor_status_qc:valid_range = 1b, 63b ;

and I think Example 3.5 should have

sensor_status_qc:valid_range = 1b, 15b ;

although I'm unclear about whether that upper limit is intended,
because I'm having trouble understanding the use of both flag_masks
and flag_values attributes for a variable in Example 3.5:

sensor_status_qc:valid_range = 1b, 127b ;
sensor_status_qc:flag_masks = 1b, 2b, 12b, 12b, 12b ;
sensor_status_qc:flag_values = 1b, 2b, 4b, 8b, 12b ;
sensor_status_qc:flag_meanings = "low_battery hardware_fault offline_mode calibration_mode maintenance_mode" ;

The sentence in the Example explanation that seems unclear to me is

Repeated flag_masks define a bit field mask that identifies a number
of status conditions with different flag_values.

Why doesn't the example just use

sensor_status_qc:flag_masks = 1b, 2b, 12b ;

without repeated flag_masks values to convey the same meaning? Or how
would the meaning differ if it just used

sensor_status_qc:flag_masks = 1b, 2b ;

Is it intended that a sensor_status_qc value of 12b indicate
"maintenance_mode" (the flag_meaning corresponding to the flag_value
12b) or a combination of "offline_mode" and "calibration_mode" (since
12b corresponds to a boolean OR of the corresponding flag_values)? Or
maybe the intended meaning is that these are synonymous, providing an
implicit definition of "maintenance_mode".

In this example, would a sensor_status_qc value of 9b be permitted? I
assume not, because it's not a listed flag_value or boolean OR of a
listed flag_mask. But how about 13b, 14b, or 15b? Those are all
valid boolean OR combinations of flag_masks, even though they aren't
in the list of valid flag_values.

comment:9 Changed 6 years ago by gregr

Replying to russ:

Russ,

Thanks for the comments on my CF ticket. Most of your suggested changes in the provided examples made sense and I think the issues you discuss, regarding the combined use of flag_masks and flag_values, probably indicate that some rewording is necessary to make things clearer for readers.

In Example 3.3, I simply "lifted" from the current CF document. However, the valid range and flag values ought to be, as you suggested:

current_speed_qc:valid_range = 0b, 2b ;
current_speed_qc:flag_values = 0b, 1b, 2b ;

In Example 3.4, I max'ed out the valid range, as in Example 3.3. However, the valid range and flag masks ought to be, as you suggested:

sensor_status_qc:valid_range = 1b, 63b ;
sensor_status_qc:flag_masks = 1b, 2b, 4b, 8b, 16b, 32b ;

In Example 3.5, the valid range, flag values and flag masks ought to be:

sensor_status_qc:valid_range = 1b, 15b ;
sensor_status_qc:flag_masks = 1b, 2b, 12b, 12b, 12b ;
sensor_status_qc:flag_values = 1b, 2b, 4b, 8b, 12b ;
sensor_status_qc:flag_meanings =

"low_battery

hardware_fault
offline_mode calibration_mode maintenance_mode" ;

In this case, mutually exclusive values are blended with Boolean values to maximize use of the available bits in a flag value. Perhaps a diagram would better illustrate what is being expressed:

The table below represents the four binary digits (bits) expressed by
the sensor_status_qc variable in Example 3.5.

Bit 0 and Bit 1 are Boolean values indicating a low battery condition
and a hardware fault, respectively.
The next two bits (Bit 2 & Bit 3) express an enumeration indicating
abnormal sensor operating modes.

Thus, if Bit 0 is set, the battery is low and if Bit 1 is set,
there is a hardware fault - independent of the current sensor
operating mode.

+-------------------------------+
| Bit 3 . Bit 2 | Bit 1 | Bit 0 |
| (MSB) . | | (LSB) |
| . | | |
| . | H/W | Low |
| . | Fault | Batt |
| . | | |
+-------------------------------+

The remaining bits (Bit 2 & Bit 3) are decoded as follows:
Bit3:0 Bit2:1 = offline_mode

1 0 = calibration_mode
1 1 = maintenance_mode

The "12b" flag mask is repeated in the sensor_status_qc flag_masks definition to explicitly declare the recommended bit field masks for repeatedly AND'ing with the variable value while searching for matching enumerated values. An application determines if any of the conditions declared in the flag_meanings list are true by simply iterating through each of the flag_masks and AND'ing them with the variable. When a result is non-zero, that condition is true. The repeated flag_masks seemed to enable a simple mechanism for clients to detect all possible conditions.

Finally, to answer your questions regarding value meanings:

  • A sensor_status_qc value of "12b" would indicate that the sensor was in maintenance mode and that no faults existed.
  • A sensor_status_qc value of '9b' would indicate that the sensor was in calibration mode and that the battery was low.

Greg.

comment:10 follow-up: Changed 6 years ago by gregr

Corrections to my previous submission:

Yow! My item lines and bit field table were *seriously* mangled by the Wiki without the necessary blocking. Here's another go at a more readable entry ... feel free to ignore my last submission.

In Example 3.3, I simply "lifted" from the current CF document. However, the valid range and flag values ought to be, as you suggested:

  current_speed_qc:valid_range = 0b, 2b ;
  current_speed_qc:flag_values = 0b, 1b, 2b ;

In Example 3.4, I max'ed out the valid range, as in Example 3.3. However, the valid range and flag masks ought to be, as you suggested:

  sensor_status_qc:valid_range = 1b, 63b ;
  sensor_status_qc:flag_masks = 1b, 2b, 4b, 8b, 16b, 32b ;
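
The upper bound of 63 follows from OR'ing all six masks together. A small sketch of that arithmetic (the helper name max_flag_value is illustrative):

```python
from functools import reduce
from operator import or_


def max_flag_value(flag_masks):
    """Largest value reachable by OR'ing every mask together."""
    return reduce(or_, flag_masks, 0)
```

Called with the masks 1, 2, 4, 8, 16 and 32, this gives 63.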

In Example 3.5, the valid range, flag masks, flag values and flag meanings ought to be:

  sensor_status_qc:valid_range = 1b, 15b ;
  sensor_status_qc:flag_masks = 1b, 2b, 12b, 12b, 12b ;
  sensor_status_qc:flag_values = 1b, 2b, 4b, 8b, 12b ;
  sensor_status_qc:flag_meanings =
    "low_battery
     hardware_fault
     offline_mode calibration_mode maintenance_mode" ;

In this case, mutually exclusive values are blended with Boolean values to maximize use of the available bits in a flag value. Perhaps a diagram would better illustrate what is being expressed:

The table below represents the four binary digits (bits) expressed by the sensor_status_qc variable in Example 3.5.

Bit 0 and Bit 1 are Boolean values indicating a low battery condition and a hardware fault, respectively.
The next two bits (Bit 2 & Bit 3) express an enumeration indicating abnormal sensor operating modes.

Thus, if Bit 0 is set, the battery is low and if Bit 1 is set, there is a hardware fault - independent of the current sensor operating mode.

    +-------------------------------+
    | Bit 3 . Bit 2 | Bit 1 | Bit 0 |
    | (MSB) .       |       | (LSB) |
    |       .       |       |       |
    |       .       | H/W   | Low   |
    |       .       | Fault | Batt  |
    |       .       |       |       |
    +-------------------------------+

The remaining bits (Bit 2 & Bit 3) are decoded as follows:

    Bit 3   Bit 2
        0       1   = offline_mode
        1       0   = calibration_mode
        1       1   = maintenance_mode

The "12b" flag mask is repeated in the sensor_status_qc flag_masks definition to explicitly declare the recommended bit field masks to repeatedly AND with the variable value while searching for matching enumerated values. An application determines if any of the conditions declared in the flag_meanings list are true by simply iterating through each of the flag_masks and AND'ing them with the variable. When a result is non-zero, that condition is true. The repeated flag_masks seemed to enable a simple mechanism for clients to detect all possible conditions.

Finally, to answer your questions regarding value meanings:

  • A sensor_status_qc value of "12b" would indicate that the sensor was in maintenance mode and that no faults existed.
  • A sensor_status_qc value of '9b' would indicate that the sensor was in calibration mode and that the battery was low.

Greg.

comment:11 in reply to: ↑ 10 Changed 6 years ago by russ

Greg,

Thanks for the very clear explanation of the encoding of mixed flag
values and Boolean masks in Example 3.5. Maybe a little more
explanation of this example is needed than in the original text, but
perhaps not as much as was required to make me realize what was going
on with encoding one of several mutually exclusive flag meanings in
contiguous bits of a flag value.

Your explanation for the desirability of repeated flag_masks values
also seems clear if I change "non-zero" to "equal to the corresponding
flag_values element" in

... An application determines if any of the conditions declared in
the flag_meanings list are true by simply iterating through each of
the flag_masks and AND'ing them with the variable. When a result is
non-zero, that condition is true.

so it reads

... An application determines if any of the conditions declared in
the flag_meanings list are true by simply iterating through each of
the flag_masks and AND'ing them with the variable. When a result is
equal to the corresponding flag_values element, that condition is
true.

This is also in accord with the recommendation in the proposal.
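
A minimal sketch of the reworded rule, using the attribute values from Example 3.5 (the function names are hypothetical, and the second function exists only to contrast the earlier "non-zero" wording):

```python
FLAG_MASKS = [1, 2, 12, 12, 12]
FLAG_VALUES = [1, 2, 4, 8, 12]
FLAG_MEANINGS = (
    "low_battery hardware_fault offline_mode calibration_mode maintenance_mode"
).split()


def decode_status(value):
    """Meanings whose masked bits equal the corresponding flag_values entry."""
    return [
        meaning
        for mask, flag_value, meaning in zip(FLAG_MASKS, FLAG_VALUES, FLAG_MEANINGS)
        if (value & mask) == flag_value
    ]


def decode_status_nonzero(value):
    """The earlier 'non-zero' test, kept only to show where it goes wrong."""
    return [
        meaning
        for mask, meaning in zip(FLAG_MASKS, FLAG_MEANINGS)
        if (value & mask) != 0
    ]
```

With the equality test, a value of 12 decodes to maintenance_mode alone and 9 decodes to low_battery plus calibration_mode. The non-zero test would instead report all three operating modes for a value of 12.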

Regards,

Russ

comment:12 Changed 6 years ago by stevehankin

This comment is in some sense orthogonal to the specific discussion of bit map flags. It applies to all flag definitions in CF, namely, consideration of namespaces for the flag definitions.

What our flags encodings provide is a human-readable way for a sub-group of CF-users to define their own flag conventions. What we do not provide (or have I overlooked it?) is a way for an application to identify who the sub-group was that created this file. This is needed for future interoperability -- so that (some day) a generic application can locate the correct mapping between flags that may be available elsewhere.

It seems that CF lacks a global attribute in the spirit of

    :local_conventions = "my_foo_community";

I'd note that the requirement described here is echoed in a slightly different key in the discussions of Common Concept (#24). Maybe we should split out from both of these tickets a more general discussion of how to identify which local conventions are being utilized in a given CF file.

n.b. we need to beware the slippery slope of CF becoming merely a container for "application schema" in the style of the OGC specifications.

comment:13 Changed 6 years ago by stevehankin

I have been discussing flag values offline with some folks in the Argo community. They have chosen an altogether different encoding of flags in their netCDF files -- using a string variable, in fact. I've urged them to look at the CF encodings and think about whether we can converge. I hope that conversation will continue.

The Argo needs revealed a fairly major inconsistency in the flag encodings as defined in the current CF standard (section 3.5). Namely, there are no linkages provided in the file to indicate what variables a particular flag applies to. In fact, in the example given, the association is implied to be through the variable name, "current_speed_qc", and long_name, "Current Speed Quality" -- highly non-CF approaches to defining connections between variables.

I would propose, instead, that any CF variable should be able to contain an attribute, say markup_flags =, that points to flags that are relevant for it. Here is an example -- an expansion of the example from section 3.5:

  float temperature(time, depth, lat, lon) ;
    temperature:markup_flags = "my_flags" ;
  byte my_flags(time, depth, lat, lon) ;
    my_flags:long_name = "Quality Flags" ;
    my_flags:_FillValue = -128b ;
    my_flags:valid_range = -127b, 127b ;
    my_flags:flag_values = 0b, 1b, 2b ;
    my_flags:flag_meanings = "quality_good sensor_nonfunctional outside_valid_range" ;


comment:14 Changed 6 years ago by jonathan

Dear Steve

I don't think there is a deficiency of the kind you suggest in the current CF standard. There are two well-defined methods to make a link between flag data and the variables it applies to, as described in CF 3.4. They are:

  1. Via the standard name, which should include the standard name modifier status_flag. For instance, in your ARGO example the temperature variable would have standard_name="sea_water_temperature", and the my_flags variable would have standard_name="sea_water_temperature status_flag".
  2. Via the ancillary_variables attribute. In your example, the temperature variable would have an attribute ancillary_variables="my_flags".

Greg's proposal (in this ticket) doesn't affect these existing conventions.

I would say that your other point, about local conventions, isn't specific to this proposal. It relates to the discussion in ticket 27.

I think this proposal is fine as it is.

Cheers

Jonathan

Changed 6 years ago by russ

Enhancement to Flags with addition of flag_masks attribute

comment:15 Changed 6 years ago by russ

All,

It's been four weeks since the last discussion of this ticket (ticket:26) to enhance flags with a flag_masks attribute. As moderator and with the intention of closing the ticket, I've edited Greg Rappa's original proposal to incorporate the changes that have been agreed upon. The edited proposal is available as a text attachment that shows the suggested changes to the CF Conventions and to the CF Conformance Requirements and Recommendations documents.

Thanks to Greg Rappa for the original proposal and to the discussion participants for improvements to the text.

--Russ

comment:16 Changed 6 years ago by mlaker1

  • Resolution set to fixed
  • Status changed from new to closed