Aggregation
The functions described below aggregate region data for the reduced set of regions that results from spatial grouping of regions.
- aggregation.aggregate_geometries(xr_data_array_in, sub_to_sup_region_id_dict)
For each region group, aggregates the geometries of its member regions to form one super geometry.
- Parameters:
xr_data_array_in (xr.DataArray) – subset of the xarray dataset data that corresponds to the geometry variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region id to its corresponding group of regions
Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
'03_reg_04_reg': ['03_reg', '04_reg']}
- Returns:
xr_data_array_out
Contains new geometries as values
Coordinates correspond to new regions
(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)
- Return type:
xr.DataArray
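As a rough illustration of what forming a super geometry means, the sketch below replaces real geometries with sets of grid cells, so that merging a group reduces to a set union. The helper name `merge_cell_geometries` and the cell representation are assumptions made for this example only; the documented function works on an xr.DataArray of geometry objects, and its internals are not shown here.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # hypothetical stand-in for a real geometry: one grid cell


def merge_cell_geometries(
    geometries: Dict[str, Set[Cell]],
    sub_to_sup_region_id_dict: Dict[str, List[str]],
) -> Dict[str, Set[Cell]]:
    """Union the cell sets of every region group into one super geometry."""
    merged = {}
    for sup_region_id, sub_region_ids in sub_to_sup_region_id_dict.items():
        super_geometry: Set[Cell] = set()
        for sub_region_id in sub_region_ids:
            super_geometry |= geometries[sub_region_id]
        merged[sup_region_id] = super_geometry
    return merged


geometries = {
    "01_reg": {(0, 0), (0, 1)},
    "02_reg": {(1, 0), (1, 1)},
    "03_reg": {(2, 0)},
    "04_reg": {(2, 1), (3, 1)},
}
sub_to_sup_region_id_dict = {
    "01_reg_02_reg": ["01_reg", "02_reg"],
    "03_reg_04_reg": ["03_reg", "04_reg"],
}
merged_geometries = merge_cell_geometries(geometries, sub_to_sup_region_id_dict)
```

The keys of `merged_geometries` play the role of the new region coordinates in the returned array.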
- aggregation.aggregate_time_series_spatially(xr_data_array_in, sub_to_sup_region_id_dict, mode='mean', xr_weight_array=None)
For each region group, aggregates the given time series variable.
- Parameters:
xr_data_array_in (xr.DataArray) – subset of the xarray dataset data that corresponds to a time series variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region id to its corresponding group of regions
Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
'03_reg_04_reg': ['03_reg', '04_reg']}
Default arguments:
- Parameters:
mode (str, one of {"mean", "weighted mean", "sum"}) – Specifies how the time series should be aggregated
* the default value is ‘mean’
xr_weight_array (xr.DataArray) – Required if mode is “weighted mean”, in which case it provides the weights. Its dimensions and coordinates should be the same as those of xr_data_array_in
* the default value is None
- Returns:
xr_data_array_out
Contains aggregated time series as values
Coordinates correspond to new regions
(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)
- Return type:
xr.DataArray
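The three modes can be illustrated with a plain-Python sketch in which dictionaries of lists stand in for the DataArray. The function name `aggregate_series` and the data layout are assumptions for this example, and the weighted mean is taken here as sum(weight * value) / sum(weight) at each time step, which is one common definition and not necessarily the exact formula used by the module.

```python
from typing import Dict, List, Optional

TimeSeries = List[float]


def aggregate_series(
    data: Dict[str, TimeSeries],
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "mean",
    weights: Optional[Dict[str, TimeSeries]] = None,
) -> Dict[str, TimeSeries]:
    """Aggregate per-region time series, time step by time step, for each group."""
    if mode not in ("mean", "weighted mean", "sum"):
        raise ValueError(f"unsupported mode: {mode}")
    if mode == "weighted mean" and weights is None:
        raise ValueError("weights are required for mode 'weighted mean'")

    aggregated = {}
    for sup_region_id, sub_region_ids in sub_to_sup_region_id_dict.items():
        n_steps = len(data[sub_region_ids[0]])
        series = []
        for t in range(n_steps):
            values = [data[r][t] for r in sub_region_ids]
            if mode == "sum":
                series.append(sum(values))
            elif mode == "mean":
                series.append(sum(values) / len(values))
            else:
                # Weighted mean; a zero weight sum is not handled in this sketch.
                w = [weights[r][t] for r in sub_region_ids]
                series.append(sum(v * wi for v, wi in zip(values, w)) / sum(w))
        aggregated[sup_region_id] = series
    return aggregated


data = {"01_reg": [1.0, 3.0], "02_reg": [3.0, 5.0]}
weights = {"01_reg": [1.0, 1.0], "02_reg": [3.0, 1.0]}
groups = {"01_reg_02_reg": ["01_reg", "02_reg"]}

mean_result = aggregate_series(data, groups, mode="mean")
sum_result = aggregate_series(data, groups, mode="sum")
weighted_result = aggregate_series(data, groups, mode="weighted mean", weights=weights)
```

Following the parameter description above, the weights in this sketch have the same shape as the data, one weight per region and time step.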
- aggregation.aggregate_values_spatially(xr_data_array_in, sub_to_sup_region_id_dict, mode='mean')
For each region group, aggregates the given 1d variable.
- Parameters:
xr_data_array_in (xr.DataArray) – subset of the xarray dataset data that corresponds to a 1d variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region id to its corresponding group of regions
Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
'03_reg_04_reg': ['03_reg', '04_reg']}
Default arguments:
- Parameters:
mode (str, one of {"mean", "sum", "bool"}) – Specifies how the values should be aggregated
* the default value is ‘mean’
- Returns:
xr_data_array_out
Contains aggregated 1d variable as values
Coordinates correspond to new regions
(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)
- Return type:
xr.DataArray
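A minimal sketch of the three modes for a 1d variable follows, with a dictionary of scalars standing in for the DataArray. The name `aggregate_values` is hypothetical, and treating "bool" as a boolean OR over the group (true if any member value is non-zero) is an assumption about the intended semantics.

```python
from typing import Dict, List, Union


def aggregate_values(
    values: Dict[str, float],
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "mean",
) -> Dict[str, Union[float, bool]]:
    """Aggregate one scalar per region into one scalar per region group."""
    if mode not in ("mean", "sum", "bool"):
        raise ValueError(f"unsupported mode: {mode}")

    aggregated = {}
    for sup_region_id, sub_region_ids in sub_to_sup_region_id_dict.items():
        group_values = [values[r] for r in sub_region_ids]
        if mode == "sum":
            aggregated[sup_region_id] = sum(group_values)
        elif mode == "mean":
            aggregated[sup_region_id] = sum(group_values) / len(group_values)
        else:
            # Boolean OR: true if any region in the group has a non-zero value.
            aggregated[sup_region_id] = any(v != 0 for v in group_values)
    return aggregated


values = {"01_reg": 0.0, "02_reg": 4.0, "03_reg": 0.0, "04_reg": 0.0}
groups = {
    "01_reg_02_reg": ["01_reg", "02_reg"],
    "03_reg_04_reg": ["03_reg", "04_reg"],
}

mean_values = aggregate_values(values, groups, mode="mean")
sum_values = aggregate_values(values, groups, mode="sum")
bool_values = aggregate_values(values, groups, mode="bool")
```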
- aggregation.aggregate_connections(xr_data_array_in, sub_to_sup_region_id_dict, mode='bool')
For each region group, aggregates the given 2d variable.
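- Parameters:
xr_data_array_in (xr.DataArray) – subset of the xarray dataset data that corresponds to a 2d variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region id to its corresponding group of regions
Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
'03_reg_04_reg': ['03_reg', '04_reg']}
Default arguments: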
- Parameters:
mode (str, one of {"bool", "mean", "sum"}) – Specifies how the connections should be aggregated
* the default value is ‘bool’
- Returns:
xr_data_array_out
Contains aggregated 2d variable as values
Coordinates correspond to new regions
(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)
- Return type:
xr.DataArray
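For a 2d variable, each pair of region groups corresponds to a block of sub-region pairs, and the chosen mode is applied over that block. The sketch below shows this with a dictionary keyed by region pairs; the name `aggregate_connection_matrix` is hypothetical, missing pairs are assumed to mean "no connection" (0), and connections inside a single group are skipped because their treatment is not specified in this description.

```python
from typing import Dict, List, Tuple, Union

Pair = Tuple[str, str]


def aggregate_connection_matrix(
    connections: Dict[Pair, float],
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "bool",
) -> Dict[Pair, Union[float, bool]]:
    """Aggregate region-to-region values into group-to-group values."""
    if mode not in ("bool", "mean", "sum"):
        raise ValueError(f"unsupported mode: {mode}")

    aggregated = {}
    for sup_a, subs_a in sub_to_sup_region_id_dict.items():
        for sup_b, subs_b in sub_to_sup_region_id_dict.items():
            if sup_a == sup_b:
                continue  # intra-group connections are left out of this sketch
            # All sub-region pairs that link group a to group b.
            block = [connections.get((a, b), 0.0) for a in subs_a for b in subs_b]
            if mode == "bool":
                aggregated[(sup_a, sup_b)] = any(v != 0 for v in block)
            elif mode == "sum":
                aggregated[(sup_a, sup_b)] = sum(block)
            else:
                aggregated[(sup_a, sup_b)] = sum(block) / len(block)
    return aggregated


connections = {
    ("01_reg", "02_reg"): 1.0, ("02_reg", "01_reg"): 1.0,
    ("02_reg", "03_reg"): 1.0, ("03_reg", "02_reg"): 1.0,
}
groups = {
    "01_reg_02_reg": ["01_reg", "02_reg"],
    "03_reg_04_reg": ["03_reg", "04_reg"],
}

bool_connections = aggregate_connection_matrix(connections, groups, mode="bool")
sum_connections = aggregate_connection_matrix(connections, groups, mode="sum")
mean_connections = aggregate_connection_matrix(connections, groups, mode="mean")
```

In this data only the 02_reg to 03_reg link crosses the two groups, so one of the four sub-region pairs in the block is connected.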
- aggregation.aggregate_esm_parameters_spatially(param_df_in, old_locations, sub_to_sup_region_id_dict, mode='mean')
For each region group, aggregates the given esm init parameter data.
- Parameters:
param_df_in (pd.DataFrame) – the dataframe with parameter data
old_locations (list) – list of former unaggregated regions
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region id to its corresponding group of regions
Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
'03_reg_04_reg': ['03_reg', '04_reg']}
Default arguments:
- Parameters:
mode (str, one of {"mean", "sum"}) – Specifies how the data should be aggregated
* the default value is ‘mean’
- Returns:
param_df_out
Contains aggregated data
- Return type:
pd.DataFrame
- aggregation.aggregate_based_on_sub_to_sup_region_id_dict(xarray_datasets, sub_to_sup_region_id_dict, aggregation_function_dict)
After spatial grouping, for each region group, spatially aggregates the data.
- Parameters:
xarray_datasets (Dict[str, xr.Dataset]) – The dictionary of xarray datasets holding esM’s info
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region id to its corresponding group of regions
Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
'03_reg_04_reg': ['03_reg', '04_reg']}
aggregation_function_dict (Dict[str, Tuple(str, None/str)]) –
Contains information regarding the mode of aggregation for each individual variable, component, and component class combination.
Aggregation possibilities: mean, weighted mean, sum, bool (boolean OR).
Format of the dictionary:
{<component_class>: {<component_name>: {<variable_name>: (<mode_of_aggregation>, <weights>),
<variable_name>: (<mode_of_aggregation>, None)}}}
<weights> is required only if <mode_of_aggregation> is ‘weighted mean’; in that case, provide the name of the variable that should act as weights. Otherwise it can be None.
- Returns:
aggregated_xr_dataset
New xarray dataset with aggregated information
Coordinates correspond to new regions
(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)
- Return type:
xr.Dataset
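The nested format of aggregation_function_dict can be easier to read as a concrete literal. The sketch below builds one and checks it against the rules stated above (a known mode, and a weights variable name whenever the mode is "weighted mean"). All component class, component, and variable names are illustrative placeholders, and `find_invalid_entries` is a hypothetical helper, not a function of the module.

```python
from typing import Dict, List, Optional, Tuple

AggregationSpec = Tuple[str, Optional[str]]
VALID_MODES = {"mean", "weighted mean", "sum", "bool"}

# Placeholder names: replace with the component classes, components,
# and variables that actually occur in your datasets.
aggregation_function_dict: Dict[str, Dict[str, Dict[str, AggregationSpec]]] = {
    "ExampleComponentClass": {
        "example_component": {
            "example_rate_variable": ("weighted mean", "example_capacity_variable"),
            "example_capacity_variable": ("sum", None),
            "example_eligibility_variable": ("bool", None),
        }
    }
}


def find_invalid_entries(
    spec_dict: Dict[str, Dict[str, Dict[str, AggregationSpec]]]
) -> List[str]:
    """Return a list of problems found; the list is empty if the dict is well formed."""
    problems = []
    for component_class, components in spec_dict.items():
        for component_name, variables in components.items():
            for variable_name, (mode, weights) in variables.items():
                location = f"{component_class}/{component_name}/{variable_name}"
                if mode not in VALID_MODES:
                    problems.append(f"{location}: unknown mode {mode!r}")
                elif mode == "weighted mean" and weights is None:
                    problems.append(f"{location}: 'weighted mean' needs a weights variable name")
    return problems


problems = find_invalid_entries(aggregation_function_dict)
bad_problems = find_invalid_entries({"C": {"c": {"v": ("weighted mean", None)}}})
```

Checking the dictionary before aggregation is a design choice of this sketch; it surfaces a missing weights name early instead of part-way through processing the datasets.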