improver.wxcode.weather_symbols module

Module containing weather symbol implementation.

class WeatherSymbols(wxtree, model_id_attr=None, record_run_attr=None, target_period=None)[source]

Bases: BasePlugin

Definition and implementation of a weather symbol decision tree. This plugin uses a variety of diagnostic inputs and the decision tree logic to determine the most representative weather symbol for each site defined in the input cubes.

Weather symbol decision trees

Weather symbol decision trees use diagnostic fields to diagnose a suitable symbol to represent the weather conditions. The tree comprises a series of interconnected decision nodes. At each node one or more forecast diagnostics are compared to predefined threshold values. The node has an if_true and if_false path on to the next node, or on to a resulting weather symbol. By traversing the nodes it should be possible, given the right weather conditions, to arrive at any of the weather symbols.

The first few nodes of a decision tree are represented in the schematic below.

Schematic of thundery nodes in a decision tree

Two thresholds are used in these nodes. The first is the diagnostic threshold, which identifies the critical value for a given diagnostic. In the first node of the schematic, this threshold is the count of lightning flashes in an hour exceeding 0.0. The second is the probability threshold, which is compared with the probability of exceeding (in this case) the diagnostic threshold. In this first node it is a probability of 0.3 (30%). Overall, the node states that if there is a probability of 30% or greater of any lightning flashes in the hour being forecast, proceed to the if_true node; otherwise move to the if_false node.

Encoding a decision tree

The first node above is encoded as follows:

{
  "lightning": {
      "if_true": "lightning_cloud",
      "if_false": "heavy_precipitation",
      "if_diagnostic_missing": "if_false",
      "probability_thresholds": [0.3],
      "threshold_condition": ">=",
      "condition_combination": "",
      "diagnostic_fields": [
          "probability_of_number_of_lightning_flashes_per_unit_area_in_vicinity_above_threshold"
      ],
      "diagnostic_thresholds": [[0.0, "m-2"]],
      "diagnostic_conditions": ["above"]
  }
}
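As a rough illustration of how such a node might be read, the sketch below parses the encoding above and applies its probability test to a single value. The helper name next_target and the sample probabilities are hypothetical and not part of the plugin, which operates on cubes of probabilities rather than on single floats.

```python
import json
import operator

# Map the threshold_condition strings onto Python comparison functions.
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

NODE_JSON = """
{
  "lightning": {
      "if_true": "lightning_cloud",
      "if_false": "heavy_precipitation",
      "if_diagnostic_missing": "if_false",
      "probability_thresholds": [0.3],
      "threshold_condition": ">=",
      "condition_combination": "",
      "diagnostic_fields": [
          "probability_of_number_of_lightning_flashes_per_unit_area_in_vicinity_above_threshold"
      ],
      "diagnostic_thresholds": [[0.0, "m-2"]],
      "diagnostic_conditions": ["above"]
  }
}
"""


def next_target(node, probability):
    """Hypothetical helper: return the node's if_true or if_false target."""
    compare = COMPARATORS[node["threshold_condition"]]
    passed = compare(probability, node["probability_thresholds"][0])
    return node["if_true"] if passed else node["if_false"]


tree = json.loads(NODE_JSON)
lightning = tree["lightning"]
print(next_target(lightning, 0.45))  # lightning_cloud
print(next_target(lightning, 0.10))  # heavy_precipitation
```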

The key at the first level, “lightning” in this case, names the node so that it can be targeted as an if_true or if_false destination from other nodes. The dictionary accessed with this key contains the essentials that make the node function.

  • if_true (str or int): The next node to test if the condition in this node is true. Alternatively this may be an integer number that identifies which weather symbol has been reached; this is for the leaf (or final) nodes in the tree.

  • if_false (str or int): The next node to test if the condition in this node is false. Alternatively this may be an integer number that identifies which weather symbol has been reached; this is for the leaf (or final) nodes in the tree.

  • if_diagnostic_missing (str, optional): Specifies whether the tree should proceed to the if_true or if_false node when the expected diagnostic is not provided. This can be useful if the tree is to be applied to output from different models, some of which do not provide all the diagnostics that might be desirable.

  • probability_thresholds (list(float)): The probability threshold(s) that must be exceeded or not exceeded (see threshold_condition) for the node to progress to the if_true target. Two values are required if condition_combination is being used.

  • threshold_condition (str): Defines the inequality test to be applied to the probability threshold(s). Inequalities that can be used are “<=”, “<”, “>”, “>=”.

  • condition_combination (str): If multiple tests are being applied in a single node, this value determines the logic with which they are combined. The values can be “AND”, “OR”.

  • diagnostic_fields (List(str or List(str))): The name(s) of the diagnostic(s) that will form the test condition in this node. There may be multiple diagnostics if they are being combined in the test using a condition_combination. Alternatively, if they are being manipulated within the node (e.g. added together), they must be separated by the desired operators (e.g. ‘diagnostic1’, ‘+’, ‘diagnostic2’).

  • diagnostic_thresholds (List(List(float, str, Optional(int)))): The diagnostic threshold value and units being used in the test. An optional third value provides a period in seconds that is associated with the threshold value. For example, a precipitation accumulation threshold might be given for a 1-hour period (3600 seconds). If instead 3-hour symbols are being produced using 3-hour precipitation accumulations then the threshold value will be scaled up by a factor of 3. Only thresholds with an associated period will be scaled in this way. A threshold [value, units] pair must be provided for each diagnostic field with the same nested list structure; as the basic unit is a list of value and unit, the overall nested structure is one list deeper.

  • diagnostic_conditions (as diagnostic_fields): The expected inequality that has been used to construct the input probability field. This is checked against the spp__relative_to_threshold attribute of the threshold coordinate in the provided diagnostic.
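The period scaling described for diagnostic_thresholds can be sketched as follows. The function name scale_threshold is hypothetical, and leaving a period threshold unscaled when no target period is supplied is an assumption of this sketch rather than documented plugin behaviour.

```python
def scale_threshold(threshold, target_period=None):
    """Hypothetical helper: return (value, units), scaled if a period is given.

    threshold is [value, units] or [value, units, period_in_seconds].
    """
    value, units = threshold[0], threshold[1]
    if len(threshold) == 3 and target_period is not None:
        period = threshold[2]
        # Scale by the ratio of the target period to the threshold's period.
        value = value * target_period / period
    return value, units


# A 1-hour accumulation threshold used to produce 3-hour symbols.
print(scale_threshold([1.0, "mm", 3600], target_period=10800))  # (3.0, 'mm')
# A rate threshold has no period, so it is never scaled.
print(scale_threshold([1.0, "mm hr-1"], target_period=10800))  # (1.0, 'mm hr-1')
```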

Every decision tree must have a starting node, and this is taken as the first node defined in the dictionary.
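Because Python dictionaries preserve insertion order, the starting node can be read as the first key. A minimal sketch, in which the leaf symbol codes are placeholders chosen for illustration:

```python
# A cut-down tree; only the branch targets are shown and the
# integer symbol codes are illustrative placeholders.
tree = {
    "lightning": {"if_true": "lightning_cloud", "if_false": "heavy_precipitation"},
    "heavy_precipitation": {"if_true": 1, "if_false": 2},
    "lightning_cloud": {"if_true": 3, "if_false": 4},
}

# The first key defined in the dictionary is taken as the starting node.
start_node = next(iter(tree))
print(start_node)  # lightning
```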

Manipulation of the diagnostics is possible using the decision tree configuration to enable more complex comparisons. For example:

"heavy_rain_or_sleet_shower": {
    "if_true": 14,
    "if_false": 17,
    "probability_thresholds": [0.0],
    "threshold_condition": "<",
    "condition_combination": "",
    "diagnostic_fields": [
        [
            "probability_of_lwe_sleetfall_rate_above_threshold",
            "+",
            "probability_of_lwe_snowfall_rate_above_threshold",
            "-",
            "probability_of_rainfall_rate_above_threshold"
        ]
    ],
    "diagnostic_thresholds": [[[1.0, "mm hr-1"], [1.0, "mm hr-1"], [1.0, "mm hr-1"]]],
    "diagnostic_conditions": [["above", "above", "above"]]
},

This node uses three diagnostics. It combines them according to the mathematical operators that separate the names in the diagnostic_fields list. The resulting value is compared to the probability threshold value using the threshold condition. In this example the purpose is to check whether the probability of the rain rate exceeding 1.0 mm/hr is greater than the combined probability of the same rate being exceeded by sleet and snow.
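The arithmetic in this node can be sketched with plain floats standing in for the probability fields. The helper combine and the sample probabilities are hypothetical; evaluating the operators strictly left to right is an assumption, which makes no difference here as only “+” and “-” are involved.

```python
import operator

ARITHMETIC = {"+": operator.add, "-": operator.sub}
COMPARATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def combine(fields, values):
    """Fold [name, op, name, op, name] left to right using a value lookup."""
    total = values[fields[0]]
    for symbol, name in zip(fields[1::2], fields[2::2]):
        total = ARITHMETIC[symbol](total, values[name])
    return total


node = {
    "if_true": 14,
    "if_false": 17,
    "probability_thresholds": [0.0],
    "threshold_condition": "<",
    "diagnostic_fields": [
        [
            "probability_of_lwe_sleetfall_rate_above_threshold",
            "+",
            "probability_of_lwe_snowfall_rate_above_threshold",
            "-",
            "probability_of_rainfall_rate_above_threshold",
        ]
    ],
}

# Illustrative probabilities at a single site.
values = {
    "probability_of_lwe_sleetfall_rate_above_threshold": 0.125,
    "probability_of_lwe_snowfall_rate_above_threshold": 0.125,
    "probability_of_rainfall_rate_above_threshold": 0.5,
}

combined = combine(node["diagnostic_fields"][0], values)
compare = COMPARATORS[node["threshold_condition"]]
passed = compare(combined, node["probability_thresholds"][0])
target = node["if_true"] if passed else node["if_false"]
print(combined, target)  # -0.25 14
```

Rain is more probable than sleet and snow combined, so the combined value is negative and the node follows its if_true branch.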

__init__(wxtree, model_id_attr=None, record_run_attr=None, target_period=None)[source]

Define a decision tree for determining weather symbols based upon the input diagnostics. Use this decision tree to allocate a weather symbol to each point.

Parameters
  • wxtree (dict) – Weather symbols decision tree definition, provided as a dictionary.

  • model_id_attr (Optional[str]) – Name of attribute recording source models that should be inherited by the output cube. The source models are expected as a space-separated string.

  • record_run_attr (Optional[str]) – Name of attribute used to record models and cycles used in constructing the weather symbols.

  • target_period (Optional[int]) – The period in seconds that the weather symbol being produced should represent. This should correspond with any period diagnostics, e.g. precipitation accumulation, being used as input. This is used to scale any threshold values that are defined with an associated period in the decision tree. It will only be used if the decision tree provided has threshold values defined with an associated period.

float_tolerance defines the tolerance when matching thresholds to allow for the difficulty of float comparisons. float_abs_tolerance defines the tolerance for when the threshold is zero. It has to be sufficiently small that a valid rainfall rate or snowfall rate could not trigger it.
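The role of the two tolerances can be sketched with math.isclose. The numeric values below are illustrative only, and treating float_tolerance as a relative tolerance is an assumption of this sketch.

```python
import math

# Illustrative values; the plugin's own settings may differ.
FLOAT_TOLERANCE = 0.01
FLOAT_ABS_TOLERANCE = 1e-12


def thresholds_match(found, wanted):
    """Return True if a threshold found on a cube matches the one wanted."""
    return math.isclose(
        found, wanted, rel_tol=FLOAT_TOLERANCE, abs_tol=FLOAT_ABS_TOLERANCE
    )


print(thresholds_match(1.005, 1.0))  # True: within the relative tolerance
print(thresholds_match(1e-13, 0.0))  # True: within the absolute tolerance
print(thresholds_match(0.03, 0.0))   # False: a valid small rate is not zero
```

A relative tolerance alone can never match a threshold of exactly zero, which is why a separate, very small absolute tolerance is needed.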

static _invert_comparator(comparator)[source]

Inverts a single comparator string.

Return type

str

check_coincidence(cubes)[source]

Check that all the provided cubes are valid at the same time and that, where input cubes have time bounds, these bounds match.

The last input cube with bounds (or first input cube if none have bounds) is selected as a template_cube for later producing the weather symbol cube.

Parameters

cubes (Union[List[Cube], CubeList]) – List of input cubes used to generate weather symbols

Raises
  • ValueError – If validity times differ for diagnostics.

  • ValueError – If period diagnostics have different periods.

  • ValueError – If period diagnostics do not match target_period.

Return type

Cube
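The checks and the template selection described above can be sketched without iris. FakeCube is a hypothetical stand-in for a cube that carries only a validity time and optional time bounds in seconds; the standalone function below is an illustration and not the plugin's method.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeCube:
    """Hypothetical stand-in for an iris Cube."""

    name: str
    time: int
    bounds: Optional[Tuple[int, int]] = None


def check_coincidence(
    cubes: List[FakeCube], target_period: Optional[int] = None
) -> FakeCube:
    """Validate times and bounds, then return the template cube."""
    if len({cube.time for cube in cubes}) > 1:
        raise ValueError("Validity times differ for diagnostics.")

    bounded = [cube for cube in cubes if cube.bounds is not None]
    if len({cube.bounds for cube in bounded}) > 1:
        raise ValueError("Period diagnostics have different periods.")

    if bounded and target_period is not None:
        lower, upper = bounded[0].bounds
        if upper - lower != target_period:
            raise ValueError("Period diagnostics do not match target_period.")

    # Last cube with bounds, or the first cube if none have bounds.
    return bounded[-1] if bounded else cubes[0]


cloud = FakeCube("cloud_area_fraction", 7200)
rain = FakeCube("rain_accumulation", 7200, (3600, 7200))
template = check_coincidence([cloud, rain], target_period=3600)
print(template.name)  # rain_accumulation
```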

static compare_array_to_threshold(arr, comparator, threshold)[source]

Compare an array element-wise with a threshold and return a boolean array.

Parameters
  • arr (ndarray) –

  • comparator (str) – One of ‘<’, ‘>’, ‘<=’, ‘>=’.

  • threshold (float) –

Return type

ndarray

Returns

Array of booleans.

Raises

ValueError – If comparator is not one of ‘<’, ‘>’, ‘<=’, ‘>=’.

construct_extract_constraint(diagnostic, threshold, coord_named_threshold)[source]

Construct an iris constraint.

Parameters
  • diagnostic (str) – The name of the diagnostic to be extracted from the CubeList.

  • threshold (AuxCoord) – The thresholds within the given diagnostic cube that are needed, including units. Note these are NOT coords from the original cubes, just constructs to associate units with values.

  • coord_named_threshold (bool) – If true, use old naming convention for threshold coordinates (coord.long_name=threshold). Otherwise extract threshold coordinate name from diagnostic name.

Return type

Constraint

Returns

A constraint

create_condition_chain(test_conditions)[source]

Construct a list of all the conditions specified in a single query.

Parameters

test_conditions (Dict) – A query from the decision tree.

Return type

List

Returns

A valid condition chain, which is defined recursively: (1) If each a_1, …, a_n is an extract expression (i.e. a constraint, or a list of constraints, operator strings and floats), and b is either “AND”, “OR” or “”, then [[a_1, …, a_n], b] is a valid condition chain. (2) If a_1, …, a_n are each valid condition chains, and b is either “AND” or “OR”, then [[a_1, …, a_n], b] is a valid condition chain.

create_symbol_cube(cubes)[source]

Create an empty weather symbol cube.

Parameters

cubes (Union[List[Cube], CubeList]) – List of input cubes used to generate weather symbols

Return type

Cube

Returns

A cube with suitable metadata to describe the weather symbols that will fill it and data initialised with the value -1 to allow any unset points to be readily identified.

evaluate_condition_chain(cubes, condition_chain)[source]

Recursively evaluate the list of conditions.

We can safely use recursion here since the depth will be small.

Parameters
  • cubes (CubeList) – A cubelist containing the diagnostics required for the weather symbols decision tree, all valid at coincident times.

  • condition_chain (List) – A valid condition chain is defined recursively: (1) If each a_1, …, a_n is an extract expression (i.e. a constraint, or a list of constraints, operator strings and floats), and b is either “AND”, “OR” or “”, then [[a_1, …, a_n], b] is a valid condition chain. (2) If a_1, …, a_n are each valid condition chains, and b is either “AND” or “OR”, then [[a_1, …, a_n], b] is a valid condition chain.

Return type

ndarray

Returns

An array or masked array of booleans
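The recursive structure of a condition chain can be illustrated with a small standalone evaluator. In this sketch each extract expression is replaced by a tuple of booleans that has already been evaluated (one value per site), so no cubes or constraints are involved; that substitution, and the example conditions, are assumptions made purely for illustration.

```python
def evaluate_condition_chain(chain):
    """Evaluate [[a_1, ..., a_n], b], where b is "AND", "OR" or ""."""
    conditions, combination = chain
    if combination not in ("AND", "OR", ""):
        raise ValueError(f"Invalid condition combination: {combination!r}")

    # Tuples are pre-evaluated leaves; lists are nested condition chains.
    results = [
        item if isinstance(item, tuple) else evaluate_condition_chain(item)
        for item in conditions
    ]
    if combination == "":
        # No combination means there is a single condition to return.
        return results[0]

    combine = all if combination == "AND" else any
    # Combine the results site by site.
    return tuple(combine(values) for values in zip(*results))


# Three sites, three illustrative conditions.
cloudy = (True, True, False)
rainy = (True, False, False)
foggy = (False, False, True)

# (cloudy AND rainy) OR foggy
chain = [[[[cloudy, rainy], "AND"], [[foggy], ""]], "OR"]
print(evaluate_condition_chain(chain))  # (True, False, True)
```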

evaluate_extract_expression(cubes, expression)[source]

Evaluate a single condition.

Parameters
  • cubes (CubeList) – A cubelist containing the diagnostics required for the weather symbols decision tree, all valid at coincident times.

  • expression (Union[Constraint, List]) – Defined recursively: A list consisting of an iris.Constraint or a list of iris.Constraint, strings (representing operators) and floats is a valid expression. A list consisting of valid expressions, strings (representing operators) and floats is a valid expression.

Return type

ndarray

Returns

An array or masked array of booleans

static find_all_routes(graph, start, end, route=None)[source]

Function to trace all routes through the decision tree.

Parameters
  • graph (Dict) – A dictionary that describes each node in the tree, e.g. {<node_name>: [<if_true_name>, <if_false_name>]}

  • start (str) – The node name of the tree root (currently always lightning).

  • end (int) – The weather symbol code to which we are tracing all routes.

  • route (Optional[List[str]]) – A list of node names found so far.

Return type

List[str]

Returns

A list of node names that defines the route from the tree root to the weather symbol leaf (end of chain).

References

Method based upon Python Patterns - Implementing Graphs essay https://www.python.org/doc/essays/graphs/
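A standalone sketch of this kind of route tracing, following the approach in the essay cited above, is given below. It returns one list per route found. The graph contents, including the integer symbol codes, are illustrative placeholders.

```python
def find_all_routes(graph, start, end, route=None):
    """Return every route from start to end as a list of node lists."""
    if route is None:
        route = []
    # Build a new list so that sibling branches do not share state.
    route = route + [start]
    if start == end:
        return [route]
    if start not in graph:
        # A leaf that is not the requested symbol: dead end.
        return []
    routes = []
    for node in graph[start]:
        if node not in route:
            routes.extend(find_all_routes(graph, node, end, route))
    return routes


# {node_name: [if_true_target, if_false_target]}
graph = {
    "lightning": ["lightning_cloud", "heavy_precipitation"],
    "lightning_cloud": [30, 29],
    "heavy_precipitation": [29, 1],
}

for found in find_all_routes(graph, "lightning", 29):
    print(found)
# ['lightning', 'lightning_cloud', 29]
# ['lightning', 'heavy_precipitation', 29]
```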

invert_condition(condition)[source]

Invert a comparison condition to allow positive identification of conditions satisfying the negative case.

Parameters

condition (Dict) – A single query from the decision tree.

Return type

Tuple[str, str]

Returns

  • A string representing the inverted comparison.

  • A string representing the inverted combination.
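The inversion can be sketched with a lookup table in which each value is the logical inverse of its key. The standalone functions below are an illustration, not the plugin's methods, and the exact contents of the plugin's lookup are assumed.

```python
def define_invertible_conditions():
    """Build a lookup where each value is the logical inverse of its key."""
    pairs = {">=": "<", ">": "<=", "OR": "AND", "": ""}
    # Add the reverse direction so the lookup works both ways.
    pairs.update({value: key for key, value in pairs.items()})
    return pairs


def invert_condition(condition):
    """Return (inverted comparison, inverted combination) for a node."""
    inverse = define_invertible_conditions()
    return (
        inverse[condition["threshold_condition"]],
        inverse[condition["condition_combination"]],
    )


node = {"threshold_condition": ">=", "condition_combination": ""}
print(invert_condition(node))  # ('<', '')
```

Swapping “AND” for “OR” alongside the comparator follows De Morgan's laws: the negation of (A AND B) is (not A) OR (not B).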

prepare_input_cubes(cubes)[source]

Check that the input cubes contain all the diagnostics and thresholds required by the decision tree. Sets self.coord_named_threshold to “True” if threshold-type coordinates have the name “threshold” (as opposed to the standard name of the diagnostic), for backward compatibility. A cubelist containing only cubes of the required diagnostic-threshold combinations is returned.

Parameters

cubes (CubeList) – A CubeList containing the input diagnostic cubes.

Return type

Tuple[CubeList, Optional[List[str]]]

Returns

  • A CubeList containing only the required cubes.

  • A list of node names where the diagnostic data is missing and this is indicated as allowed by the presence of the if_diagnostic_missing key.

Raises

IOError – If any of the required input data is missing. The error includes details of which fields are missing.

process(cubes)[source]

Apply the decision tree to the input cubes to produce weather symbol output.

Parameters

cubes (CubeList) – A cubelist containing the diagnostics required for the weather symbols decision tree, all valid at coincident times.

Return type

Cube

Returns

A cube of weather symbols.

remove_optional_missing(optional_node_data_missing)[source]

Some decision tree nodes are optional and have an “if_diagnostic_missing” key to enable passage through the tree in the absence of the required input diagnostic. This code modifies the tree in the following ways:

  • Rewrites the decision tree to skip the missing nodes by connecting the nodes that precede them to the node targeted by “if_diagnostic_missing”.

  • If the node(s) missing are those at the start of the decision tree, the start_node is modified to find the first available node.

Parameters

optional_node_data_missing (List[str]) – List of node names for which data is missing but for which this is allowed.
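The two modifications can be sketched on a plain dictionary. This standalone function returns a new tree and start node rather than modifying plugin state; that design, the “hail” node and the integer symbol codes are all hypothetical and used only for illustration.

```python
def remove_optional_missing(tree, start_node, missing_nodes):
    """Return (new_tree, new_start_node) with the missing nodes bypassed."""
    # Copy each node so that the caller's tree is left untouched.
    tree = {name: dict(node) for name, node in tree.items()}
    for missing in missing_nodes:
        node = tree[missing]
        # if_diagnostic_missing names the branch ("if_true" or "if_false")
        # to follow when the diagnostic is absent.
        target = node[node["if_diagnostic_missing"]]
        # Reconnect every node that pointed at the missing node.
        for other in tree.values():
            for branch in ("if_true", "if_false"):
                if other[branch] == missing:
                    other[branch] = target
        if start_node == missing:
            start_node = target
    # Remove the bypassed nodes only after all links are rewritten, so
    # that chains of consecutive missing nodes resolve correctly.
    for missing in missing_nodes:
        del tree[missing]
    return tree, start_node


tree = {
    "lightning": {
        "if_true": 30,
        "if_false": "hail",
        "if_diagnostic_missing": "if_false",
    },
    "hail": {
        "if_true": 21,
        "if_false": "heavy_precipitation",
        "if_diagnostic_missing": "if_false",
    },
    "heavy_precipitation": {"if_true": 14, "if_false": 1},
}

new_tree, new_start = remove_optional_missing(tree, "lightning", ["hail"])
print(new_start, new_tree["lightning"]["if_false"])
# lightning heavy_precipitation
```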

_define_invertible_conditions()[source]

Returns a dictionary of boolean comparator strings where the value is the logical inverse of the key.

Return type

Dict[str, str]