improver.wxcode.weather_symbols module
Module containing weather symbol implementation.
- class WeatherSymbols(wxtree, model_id_attr=None, record_run_attr=None, target_period=None)
Bases: BasePlugin
Definition and implementation of a weather symbol decision tree. This plugin uses a variety of diagnostic inputs and the decision tree logic to determine the most representative weather symbol for each site defined in the input cubes.
Weather symbol decision trees
Weather symbol decision trees use diagnostic fields to diagnose a suitable symbol to represent the weather conditions. The tree comprises a series of interconnected decision nodes. At each node one or more forecast diagnostics are compared to predefined threshold values. Each node has an if_true and an if_false path leading on to the next node, or to a resulting weather symbol. By traversing the nodes it should be possible, given the right weather conditions, to arrive at any of the weather symbols.
The first few nodes of a decision tree are represented in the schematic below.
Two thresholds are used in these nodes. The first is the diagnostic threshold, which identifies the critical value for a given diagnostic. In the first node of the schematic this threshold is the count of lightning flashes in an hour exceeding 0.0. The second is the probability threshold, which is compared against the probability of exceeding (in this case) the diagnostic threshold. In this first node it is a probability of 0.3 (30%). So the node overall states that if the probability of any lightning flashes in the hour being forecast is 30% or greater, proceed to the if_true node; otherwise move to the if_false node.
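The test applied by this first node can be sketched with NumPy. This is only an illustration: the probability values are invented, and the plugin itself works on iris cubes rather than bare arrays.

```python
import numpy as np

# Hypothetical probabilities of any lightning flashes in the hour,
# one value per site (illustrative data, not real forecasts).
prob_lightning = np.array([0.05, 0.30, 0.75])

# Probability threshold and condition taken from the first node.
probability_threshold = 0.3

# True means follow if_true, False means follow if_false.
node_result = prob_lightning >= probability_threshold
next_branch = np.where(node_result, "if_true", "if_false")
```

Sites with a probability of exactly 0.3 follow the if_true path because the condition is ">=" rather than ">".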
Encoding a decision tree
The first node above is encoded as follows:
{
  "lightning": {
    "if_true": "lightning_cloud",
    "if_false": "heavy_precipitation",
    "if_diagnostic_missing": "if_false",
    "probability_thresholds": [0.3],
    "threshold_condition": ">=",
    "condition_combination": "",
    "diagnostic_fields": [
      "probability_of_number_of_lightning_flashes_per_unit_area_in_vicinity_above_threshold"
    ],
    "diagnostic_thresholds": [[0.0, "m-2"]],
    "diagnostic_conditions": ["above"]
  }
}
The key at the first level, “lightning” in this case, names the node so that it can be targeted as an if_true or if_false destination from other nodes. The dictionary accessed with this key contains the essentials that make the node function.
if_true (str or int): The next node to test if the condition in this node is true. Alternatively this may be an integer number that identifies which weather symbol has been reached; this is for the leaf (or final) nodes in the tree.
if_false (str or int): The next node to test if the condition in this node is false. Alternatively this may be an integer number that identifies which weather symbol has been reached; this is for the leaf (or final) nodes in the tree.
if_diagnostic_missing (str, optional): Whether the tree should proceed to the if_true or if_false node if the expected diagnostic is not provided. This can be useful if the tree is to be applied to output from different models, some of which do not provide all the diagnostics that might be desirable.
probability_thresholds (list(float)): The probability threshold(s) that must be exceeded or not exceeded (see threshold_condition) for the node to progress to the if_true target. Two values are required if condition_combination is being used.
threshold_condition (str): Defines the inequality test to be applied to the probability threshold(s). Inequalities that can be used are “<=”, “<”, “>”, “>=”.
condition_combination (str): If multiple tests are being applied in a single node, this value determines the logic with which they are combined. The values can be “AND” or “OR”; the examples in this section use an empty string where a node applies a single test.
diagnostic_fields (List(str or List(str))): The name(s) of the diagnostic(s) that will form the test condition in this node. There may be multiple diagnostics if they are being combined in the test using a condition_combination. Alternatively, if they are being manipulated within the node (e.g. added together), they must be separated by the desired operators (e.g. ‘diagnostic1’, ‘+’, ‘diagnostic2’).
diagnostic_thresholds (List(List(float, str, Optional(int)))): The diagnostic threshold value and units being used in the test. An optional third value provides a period in seconds that is associated with the threshold value. For example, a precipitation accumulation threshold might be given for a 1-hour period (3600 seconds). If instead 3-hour symbols are being produced using 3-hour precipitation accumulations then the threshold value will be scaled up by a factor of 3. Only thresholds with an associated period will be scaled in this way. A threshold [value, units] pair must be provided for each diagnostic field with the same nested list structure; as the basic unit is a list of value and unit, the overall nested structure is one list deeper.
diagnostic_conditions (as diagnostic_fields): The expected inequality that has been used to construct the input probability field. This is checked against the spp__relative_to_threshold attribute of the threshold coordinate in the provided diagnostic.
Every decision tree must have a starting node, and this is taken as the first node defined in the dictionary.
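How the node links and the starting-node rule fit together can be sketched for a single site. This is a simplified illustration, not the plugin's implementation: the function name traverse, the toy tree, the integer symbol codes and the pre-computed node outcomes are all hypothetical.

```python
def traverse(tree, outcomes):
    """Follow if_true / if_false links until an integer symbol code is reached.

    tree: mapping of node name to a dict holding "if_true" and "if_false".
    outcomes: mapping of node name to the boolean result of that node's test.
    """
    node = next(iter(tree))  # the first node defined is the starting node
    visited = []
    while not isinstance(node, int):  # integers mark leaf (symbol) targets
        visited.append(node)
        branch = "if_true" if outcomes[node] else "if_false"
        node = tree[node][branch]
    return node, visited


# Toy tree with made-up symbol codes, for illustration only.
toy_tree = {
    "lightning": {"if_true": "lightning_cloud", "if_false": "heavy_precipitation"},
    "lightning_cloud": {"if_true": 29, "if_false": 30},
    "heavy_precipitation": {"if_true": 15, "if_false": 1},
}

symbol, route = traverse(
    toy_tree, {"lightning": False, "heavy_precipitation": True}
)
```

The sketch relies on Python dictionaries preserving insertion order, which is what makes "the first node defined" well defined.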
Manipulation of the diagnostics is possible using the decision tree configuration to enable more complex comparisons. For example:
"heavy_rain_or_sleet_shower": {
  "if_true": 14,
  "if_false": 17,
  "probability_thresholds": [0.0],
  "threshold_condition": "<",
  "condition_combination": "",
  "diagnostic_fields": [
    [
      "probability_of_lwe_sleetfall_rate_above_threshold",
      "+",
      "probability_of_lwe_snowfall_rate_above_threshold",
      "-",
      "probability_of_rainfall_rate_above_threshold"
    ]
  ],
  "diagnostic_thresholds": [[[1.0, "mm hr-1"], [1.0, "mm hr-1"], [1.0, "mm hr-1"]]],
  "diagnostic_conditions": [["above", "above", "above"]]
},
This node uses three diagnostics. It combines them according to the mathematical operators that separate the names in the diagnostic_fields list. The resulting value is compared to the probability threshold value using the threshold condition. In this example the purpose is to check whether the probability of the rain rate exceeding 1.0 mm/hr is greater than the combined probability of the same rate being exceeded by sleet and snow.
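The arithmetic performed by this node can be sketched with NumPy. The probability values are invented for illustration, and plain arrays stand in for the thresholded slices of the input cubes.

```python
import numpy as np

# Hypothetical probabilities of each rate exceeding 1.0 mm/hr at three sites.
p_sleet = np.array([0.10, 0.20, 0.00])
p_snow = np.array([0.10, 0.30, 0.00])
p_rain = np.array([0.50, 0.40, 0.00])

# Combine the diagnostics using the operators listed in diagnostic_fields.
combined = p_sleet + p_snow - p_rain

# Apply threshold_condition "<" with probability threshold 0.0.
node_result = combined < 0.0

# if_true leads to symbol 14 and if_false to symbol 17 in the example node.
symbols = np.where(node_result, 14, 17)
```

Only the first site has rain more probable than sleet and snow combined, so only that site takes the if_true path. The third site, where all probabilities are zero, fails the strict "<" test.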
- __init__(wxtree, model_id_attr=None, record_run_attr=None, target_period=None)
Define a decision tree for determining weather symbols based upon the input diagnostics. Use this decision tree to allocate a weather symbol to each point.
- Parameters
wxtree (dict) – Weather symbols decision tree definition, provided as a dictionary.
model_id_attr (Optional[str]) – Name of attribute recording source models that should be inherited by the output cube. The source models are expected as a space-separated string.
record_run_attr (Optional[str]) – Name of attribute used to record models and cycles used in constructing the weather symbols.
target_period (Optional[int]) – The period in seconds that the weather symbol being produced should represent. This should correspond with any period diagnostics, e.g. precipitation accumulation, being used as input. This is used to scale any threshold values that are defined with an associated period in the decision tree. It will only be used if the decision tree provided has threshold values defined with an associated period.
float_tolerance defines the tolerance when matching thresholds to allow for the difficulty of float comparisons. float_abs_tolerance defines the tolerance for when the threshold is zero. It has to be sufficiently small that a valid rainfall rate or snowfall rate could not trigger it.
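The scaling of period-dependent thresholds by target_period can be sketched as below. The function name scale_threshold is hypothetical, and raising an error when a period threshold is given without a target period is an assumption of this sketch rather than documented behaviour.

```python
def scale_threshold(value, period=None, target_period=None):
    """Scale a threshold defined for `period` seconds to `target_period` seconds.

    Thresholds without an associated period are returned unchanged.
    """
    if period is None:
        return value
    if target_period is None:
        # Assumption of this sketch: a period threshold needs a target period.
        raise ValueError("A target_period is needed to scale a period threshold.")
    return value * target_period / period


# A 1-hour accumulation threshold used for 3-hour symbols is scaled by 3.
three_hour_threshold = scale_threshold(1.0, period=3600, target_period=10800)

# A rate threshold has no period, so it is left as it is.
rate_threshold = scale_threshold(1.0, target_period=10800)
```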
- check_coincidence(cubes)
Check that all the provided cubes are valid at the same time and that, if any of the input cubes have time bounds, these match.
The last input cube with bounds (or first input cube if none have bounds) is selected as a template_cube for later producing the weather symbol cube.
- Parameters
cubes (Union[List[Cube], CubeList]) – List of input cubes used to generate weather symbols.
- Raises
ValueError – If validity times differ for diagnostics.
ValueError – If period diagnostics have different periods.
ValueError – If period diagnostics do not match target_period.
- Return type
- static compare_array_to_threshold(arr, comparator, threshold)
Compare two arrays element-wise and return a boolean array.
- Parameters
- Return type
- Returns
Array of booleans.
- Raises
ValueError – If comparator is not one of ‘<’, ‘>’, ‘<=’, ‘>=’.
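A comparison of this kind can be sketched by mapping comparator strings onto functions from Python's operator module. This is a minimal illustration, not the plugin's code: the name compare_sketch is hypothetical and no float tolerance is applied.

```python
import operator

import numpy as np

# Map each supported comparator string to the matching operator function.
COMPARATORS = {
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def compare_sketch(arr, comparator, threshold):
    """Return a boolean array holding the result of arr <comparator> threshold."""
    if comparator not in COMPARATORS:
        raise ValueError(f"Invalid comparator: {comparator!r}")
    return COMPARATORS[comparator](arr, threshold)


probabilities = np.array([0.1, 0.3, 0.6])
above = compare_sketch(probabilities, ">=", 0.3)
below = compare_sketch(probabilities, "<", 0.3)
```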
- construct_extract_constraint(diagnostic, threshold, coord_named_threshold)
Construct an iris constraint.
- Parameters
diagnostic (str) – The name of the diagnostic to be extracted from the CubeList.
threshold (AuxCoord) – The threshold within the given diagnostic cube that is needed, including units. Note these are NOT coords from the original cubes, just constructs to associate units with values.
coord_named_threshold (bool) – If true, use the old naming convention for threshold coordinates (coord.long_name=threshold). Otherwise extract the threshold coordinate name from the diagnostic name.
- Return type
- Returns
A constraint
- create_condition_chain(test_conditions)
Construct a list of all the conditions specified in a single query.
- Parameters
test_conditions (Dict) – A query from the decision tree.
- Returns
A valid condition chain is defined recursively: (1) If each a_1, …, a_n is an extract expression (i.e. a constraint, or a list of constraints, operator strings and floats), and b is either “AND”, “OR” or “”, then [[a_1, …, a_n], b] is a valid condition chain. (2) If a_1, …, a_n are each valid condition chains, and b is either “AND” or “OR”, then [[a_1, …, a_n], b] is a valid condition chain.
- Return type
- evaluate_condition_chain(cubes, condition_chain)
Recursively evaluate the list of conditions.
We can safely use recursion here since the depth will be small.
- Parameters
cubes (CubeList) – A cubelist containing the diagnostics required for the weather symbols decision tree, all valid at coincident times.
condition_chain (List) – A valid condition chain is defined recursively: (1) If each a_1, …, a_n is an extract expression (i.e. a constraint, or a list of constraints, operator strings and floats), and b is either “AND”, “OR” or “”, then [[a_1, …, a_n], b] is a valid condition chain. (2) If a_1, …, a_n are each valid condition chains, and b is either “AND” or “OR”, then [[a_1, …, a_n], b] is a valid condition chain.
- Return type
- Returns
An array or masked array of booleans
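The recursive evaluation of a condition chain can be sketched as follows. To keep the example self-contained, boolean NumPy arrays stand in for already-evaluated extract expressions; the function name evaluate_chain and this simplification are assumptions of the sketch, not the plugin's implementation.

```python
import numpy as np


def evaluate_chain(chain):
    """Evaluate [[a_1, ..., a_n], b] where b is "AND", "OR" or "".

    Each a_i is either a nested chain (a list) or a boolean array that
    stands in for the result of an extract expression.
    """
    conditions, combination = chain
    results = [
        evaluate_chain(item) if isinstance(item, list) else np.asarray(item, dtype=bool)
        for item in conditions
    ]
    if combination == "":
        return results[0]  # a single test, nothing to combine
    if combination == "AND":
        return np.logical_and.reduce(results)
    if combination == "OR":
        return np.logical_or.reduce(results)
    raise ValueError(f"Unknown condition_combination: {combination!r}")


a = np.array([True, False, True])
b = np.array([True, True, False])
c = np.array([False, False, True])

# (a AND b) OR c
result = evaluate_chain([[[[a, b], "AND"], [[c], ""]], "OR"])
```

The recursion depth equals the nesting depth of the chain, which stays small for decision tree nodes.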
- evaluate_extract_expression(cubes, expression)
Evaluate a single condition.
- Parameters
cubes (CubeList) – A cubelist containing the diagnostics required for the weather symbols decision tree, all valid at coincident times.
expression (Union[Constraint, List]) – Defined recursively: A list consisting of an iris.Constraint or a list of iris.Constraint, strings (representing operators) and floats is a valid expression. A list consisting of valid expressions, strings (representing operators) and floats is a valid expression.
- Return type
- Returns
An array or masked array of booleans
- static find_all_routes(graph, start, end, route=None)
Function to trace all routes through the decision tree.
- Parameters
graph (Dict) – A dictionary that describes each node in the tree, e.g. {<node_name>: [<if_true_name>, <if_false_name>]}
start (str) – The node name of the tree root (currently always lightning).
end (int) – The weather symbol code to which we are tracing all routes.
route (Optional[List[str]]) – A list of node names found so far.
- Return type
- Returns
A list of node names that defines the route from the tree root to the weather symbol leaf (end of chain).
References
Method based upon Python Patterns - Implementing Graphs essay https://www.python.org/doc/essays/graphs/
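A route search in the style of the referenced essay can be sketched as below. The function name all_routes, the toy graph and its symbol codes are hypothetical. In this sketch every route is returned as its own list and includes the final symbol code, which may differ from what the plugin returns.

```python
def all_routes(graph, start, end, route=None):
    """Return every route from start to end as a list of visited entries."""
    route = (route or []) + [start]
    if start == end:
        return [route]
    if start not in graph:
        return []  # reached a different leaf, so this branch is a dead end
    routes = []
    for node in graph[start]:
        if node not in route:  # guard against revisiting a node
            routes.extend(all_routes(graph, node, end, route))
    return routes


# Toy graph in the {node: [if_true, if_false]} form, with made-up codes.
toy_graph = {
    "lightning": ["lightning_cloud", "heavy_precipitation"],
    "lightning_cloud": [29, 30],
    "heavy_precipitation": [30, 1],
}

routes_to_30 = all_routes(toy_graph, "lightning", 30)
```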
- invert_condition(condition)
Invert a comparison condition to allow positive identification of conditions satisfying the negative case.
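One way to invert a node's condition is sketched below. It assumes that both the inequality and the combination logic are swapped, the latter following De Morgan's laws (the negation of "A AND B" is "not A OR not B"). The function name invert_sketch and the returned tuple are choices made for this illustration and are not taken from the plugin.

```python
# Each inequality paired with its logical negation.
INVERTED_THRESHOLD = {">=": "<", ">": "<=", "<=": ">", "<": ">="}

# De Morgan: negating AND gives OR and vice versa; "" means a single test.
INVERTED_COMBINATION = {"AND": "OR", "OR": "AND", "": ""}


def invert_sketch(condition):
    """Return the inverted (threshold_condition, condition_combination)."""
    return (
        INVERTED_THRESHOLD[condition["threshold_condition"]],
        INVERTED_COMBINATION[condition["condition_combination"]],
    )


inverted = invert_sketch(
    {"threshold_condition": ">=", "condition_combination": "AND"}
)
```

With the inverted condition, the sites that fail a node's test can be selected directly instead of negating the result afterwards.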
- prepare_input_cubes(cubes)
Check that the input cubes contain all the diagnostics and thresholds required by the decision tree. Sets self.coord_named_threshold to “True” if threshold-type coordinates have the name “threshold” (as opposed to the standard name of the diagnostic), for backward compatibility. A cubelist containing only cubes of the required diagnostic-threshold combinations is returned.
- Parameters
cubes (CubeList) – A CubeList containing the input diagnostic cubes.
- Return type
- Returns
A CubeList containing only the required cubes.
A list of node names where the diagnostic data is missing and this is indicated as allowed by the presence of the if_diagnostic_missing key.
- Raises
IOError – Raises an IOError if any of the required input data is missing. The error includes details of which fields are missing.
- process(cubes)
Apply the decision tree to the input cubes to produce weather symbol output.
- remove_optional_missing(optional_node_data_missing)
Some decision tree nodes are optional and have an “if_diagnostic_missing” key to enable passage through the tree in the absence of the required input diagnostic. This code modifies the tree in the following ways:
Rewrites the decision tree to skip the missing nodes by connecting nodes that precede them to the node targeted by “if_diagnostic_missing”.
If the node(s) missing are those at the start of the decision tree, the start_node is modified to find the first available node.
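The two modifications can be sketched on a plain dictionary. This is an illustration under stated assumptions rather than the plugin's code: the function name skip_missing_nodes, the toy tree and its symbol codes are hypothetical, and the sketch returns a new tree with the missing nodes dropped instead of modifying the tree in place.

```python
def skip_missing_nodes(tree, start_node, missing):
    """Rewire the tree around nodes whose diagnostics are missing.

    Returns a new tree without the missing nodes, plus the new start node.
    """

    def resolve(target):
        # Follow if_diagnostic_missing links until a usable node or leaf.
        while target in missing:
            node = tree[target]
            target = node[node["if_diagnostic_missing"]]
        return target

    new_tree = {}
    for name, node in tree.items():
        if name in missing:
            continue
        rewired = dict(node)
        for branch in ("if_true", "if_false"):
            rewired[branch] = resolve(node[branch])
        new_tree[name] = rewired
    return new_tree, resolve(start_node)


# Toy tree with made-up node names and symbol codes.
toy_tree = {
    "lightning": {"if_true": 30, "if_false": "hail", "if_diagnostic_missing": "if_false"},
    "hail": {"if_true": 21, "if_false": "heavy_precipitation", "if_diagnostic_missing": "if_false"},
    "heavy_precipitation": {"if_true": 15, "if_false": 1},
}

# A missing node in the middle of the tree is bypassed.
bypassed_tree, bypassed_start = skip_missing_nodes(toy_tree, "lightning", {"hail"})

# Missing nodes at the start move the start node forward.
trimmed_tree, trimmed_start = skip_missing_nodes(toy_tree, "lightning", {"lightning", "hail"})
```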