Aggregation

Descriptions of the basic functions are given below.

Function descriptions:

Functions that aggregate regional data onto the reduced set of regions produced by spatial grouping.

aggregation.aggregate_geometries(xr_data_array_in, sub_to_sup_region_id_dict)

For each region group, merges the geometries of its member regions into one super geometry.

Parameters:

xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to the geometry variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region's id to the list of original region ids in its group
- Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
  '03_reg_04_reg': ['03_reg', '04_reg']}

Returns:

xr_data_array_out

Contains new geometries as values
Coordinates correspond to new regions

(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)

Return type:

xr.DataArray
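
The grouping dictionary drives every function on this page, so it can help to check it before aggregating. The following is a minimal sketch in plain Python, not part of the library: check_region_grouping is a hypothetical helper, and it assumes that every original region must belong to exactly one group, which this page does not state explicitly.

```python
from typing import Dict, List


def check_region_grouping(
    all_region_ids: List[str],
    sub_to_sup_region_id_dict: Dict[str, List[str]],
) -> List[str]:
    """Validate a grouping and return the new region ids (hypothetical helper)."""
    seen = set()
    for sup_id, sub_ids in sub_to_sup_region_id_dict.items():
        for sub_id in sub_ids:
            if sub_id not in all_region_ids:
                raise ValueError(f"{sub_id!r} in group {sup_id!r} is not a known region")
            if sub_id in seen:
                raise ValueError(f"{sub_id!r} is assigned to more than one group")
            seen.add(sub_id)

    # Assumption: no original region may be left without a group.
    missing = [r for r in all_region_ids if r not in seen]
    if missing:
        raise ValueError(f"regions without a group: {missing}")

    # The dictionary keys become the coordinates of the aggregated data.
    return list(sub_to_sup_region_id_dict)


regions = ["01_reg", "02_reg", "03_reg", "04_reg"]
grouping = {
    "01_reg_02_reg": ["01_reg", "02_reg"],
    "03_reg_04_reg": ["03_reg", "04_reg"],
}
new_region_ids = check_region_grouping(regions, grouping)
print(new_region_ids)  # ['01_reg_02_reg', '03_reg_04_reg']
```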

aggregation.aggregate_time_series_spatially(xr_data_array_in, sub_to_sup_region_id_dict, mode='mean', xr_weight_array=None)

For each region group, aggregates the given time series variable.

Parameters:

xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to a time series variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region's id to the list of original region ids in its group
- Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
  '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:

mode (str, one of {"mean", "weighted mean", "sum"}) – Specifies how the time series should be aggregated
* the default value is 'mean'
xr_weight_array (xr.DataArray) – Required if mode is "weighted mean", in which case it provides the weights. Its dimensions and coordinates should be the same as those of xr_data_array_in
* the default value is None

Returns:

xr_data_array_out

Contains aggregated time series as values
Coordinates correspond to new regions

(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)

Return type:

xr.DataArray
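
The three modes can be illustrated without xarray. The sketch below uses plain dictionaries of lists in place of xr.DataArray objects and is not the library's implementation; aggregate_time_series is a hypothetical name, and the weighted mean is assumed to be the usual sum(w * x) / sum(w) evaluated at each time step.

```python
from typing import Dict, List, Optional

TimeSeries = Dict[str, List[float]]  # region id -> one value per time step


def aggregate_time_series(
    data: TimeSeries,
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "mean",
    weights: Optional[TimeSeries] = None,
) -> TimeSeries:
    """Aggregate each group's time series step by step (illustrative only)."""
    if mode not in {"mean", "weighted mean", "sum"}:
        raise ValueError(f"unsupported mode: {mode!r}")
    if mode == "weighted mean" and weights is None:
        raise ValueError("weights are required for mode 'weighted mean'")

    out: TimeSeries = {}
    for sup_id, sub_ids in sub_to_sup_region_id_dict.items():
        n_steps = len(data[sub_ids[0]])
        series = []
        for t in range(n_steps):
            values = [data[r][t] for r in sub_ids]
            if mode == "sum":
                series.append(sum(values))
            elif mode == "mean":
                series.append(sum(values) / len(values))
            else:
                # Assumed definition: sum(w * x) / sum(w); fails if all weights are 0.
                w = [weights[r][t] for r in sub_ids]
                series.append(sum(v * wi for v, wi in zip(values, w)) / sum(w))
        out[sup_id] = series
    return out


data = {"01_reg": [1.0, 2.0], "02_reg": [3.0, 6.0]}
weights = {"01_reg": [1.0, 1.0], "02_reg": [3.0, 1.0]}
grouping = {"01_reg_02_reg": ["01_reg", "02_reg"]}

mean_result = aggregate_time_series(data, grouping)
sum_result = aggregate_time_series(data, grouping, mode="sum")
weighted_result = aggregate_time_series(data, grouping, "weighted mean", weights)
print(mean_result)      # {'01_reg_02_reg': [2.0, 4.0]}
print(sum_result)       # {'01_reg_02_reg': [4.0, 8.0]}
print(weighted_result)  # {'01_reg_02_reg': [2.5, 4.0]}
```

In this sketch the weights share the shape of the data, mirroring the requirement that xr_weight_array has the same dimensions and coordinates as xr_data_array_in.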

aggregation.aggregate_values_spatially(xr_data_array_in, sub_to_sup_region_id_dict, mode='mean')

For each region group, aggregates the given 1d variable.

Parameters:

xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to a 1d variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region's id to the list of original region ids in its group
- Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
  '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:

mode (str, one of {"mean", "sum", "bool"}) – Specifies how the values should be aggregated ("bool" is a boolean OR)
* the default value is 'mean'

Returns:

xr_data_array_out

Contains aggregated 1d variable as values
Coordinates correspond to new regions

(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)

Return type:

xr.DataArray
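
For a 1d variable, each group's values collapse to a single number or flag. The following sketch uses a plain dictionary instead of an xr.DataArray and is only an illustration of the three modes; aggregate_values is a hypothetical name, and "bool" is taken to mean a boolean OR over the group.

```python
from typing import Callable, Dict, List

# One reducer per mode; "bool" is interpreted as a boolean OR over the group.
REDUCERS: Dict[str, Callable[[List[float]], object]] = {
    "mean": lambda values: sum(values) / len(values),
    "sum": sum,
    "bool": any,
}


def aggregate_values(
    data: Dict[str, float],
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "mean",
) -> Dict[str, object]:
    """Reduce the values of each region group to one value (illustrative only)."""
    if mode not in REDUCERS:
        raise ValueError(f"unsupported mode: {mode!r}")
    reducer = REDUCERS[mode]
    return {
        sup_id: reducer([data[r] for r in sub_ids])
        for sup_id, sub_ids in sub_to_sup_region_id_dict.items()
    }


eligibility = {"01_reg": 0, "02_reg": 1, "03_reg": 0, "04_reg": 0}
grouping = {
    "01_reg_02_reg": ["01_reg", "02_reg"],
    "03_reg_04_reg": ["03_reg", "04_reg"],
}

values_mean = aggregate_values(eligibility, grouping)
values_sum = aggregate_values(eligibility, grouping, mode="sum")
values_bool = aggregate_values(eligibility, grouping, mode="bool")
print(values_mean)  # {'01_reg_02_reg': 0.5, '03_reg_04_reg': 0.0}
print(values_sum)   # {'01_reg_02_reg': 1, '03_reg_04_reg': 0}
print(values_bool)  # {'01_reg_02_reg': True, '03_reg_04_reg': False}
```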

aggregation.aggregate_connections(xr_data_array_in, sub_to_sup_region_id_dict, mode='bool')

For each region group, aggregates the given 2d variable.

Parameters:

xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to a 2d variable
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region's id to the list of original region ids in its group
- Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
  '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:

mode (str, one of {"bool", "mean", "sum"}) – Specifies how the connections should be aggregated ("bool" is a boolean OR)
* the default value is 'bool'

Returns:

xr_data_array_out

Contains aggregated 2d variable as values
Coordinates correspond to new regions

(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)

Return type:

xr.DataArray
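
A 2d variable holds one value per pair of regions, so aggregation reduces a block of pairs for every pair of groups. The sketch below shows one plausible reading using nested dictionaries; it is not the library's implementation, aggregate_connection_matrix is a hypothetical name, and how the library treats connections of a group with itself is not stated on this page, so the sketch simply reduces those blocks like any other.

```python
from typing import Callable, Dict, List

Matrix = Dict[str, Dict[str, float]]  # matrix[from_region][to_region]

REDUCERS: Dict[str, Callable[[List[float]], object]] = {
    "bool": any,  # boolean OR over all region pairs in the block
    "mean": lambda values: sum(values) / len(values),
    "sum": sum,
}


def aggregate_connection_matrix(
    data: Matrix,
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "bool",
) -> Dict[str, Dict[str, object]]:
    """Reduce each (group, group) block of the matrix (illustrative only)."""
    if mode not in REDUCERS:
        raise ValueError(f"unsupported mode: {mode!r}")
    reducer = REDUCERS[mode]
    out: Dict[str, Dict[str, object]] = {}
    for sup_x, subs_x in sub_to_sup_region_id_dict.items():
        out[sup_x] = {}
        for sup_y, subs_y in sub_to_sup_region_id_dict.items():
            # All original connections that run from group x to group y.
            block = [data[x][y] for x in subs_x for y in subs_y]
            out[sup_x][sup_y] = reducer(block)
    return out


regions = ["01_reg", "02_reg", "03_reg", "04_reg"]
links = {("01_reg", "02_reg"), ("02_reg", "03_reg")}
# Symmetric 0/1 matrix built from the undirected links above.
connections = {
    a: {b: int((a, b) in links or (b, a) in links) for b in regions}
    for a in regions
}
grouping = {
    "01_reg_02_reg": ["01_reg", "02_reg"],
    "03_reg_04_reg": ["03_reg", "04_reg"],
}

conn_bool = aggregate_connection_matrix(connections, grouping)
conn_sum = aggregate_connection_matrix(connections, grouping, mode="sum")
conn_mean = aggregate_connection_matrix(connections, grouping, mode="mean")
print(conn_bool["01_reg_02_reg"]["03_reg_04_reg"])  # True
print(conn_sum["01_reg_02_reg"]["03_reg_04_reg"])   # 1
print(conn_mean["01_reg_02_reg"]["03_reg_04_reg"])  # 0.25
```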

aggregation.aggregate_esm_parameters_spatially(param_df_in, old_locations, sub_to_sup_region_id_dict, mode='mean')

For each region group, aggregates the given esm init parameter data.

Parameters:

param_df_in (pd.DataFrame) – the dataframe with parameter data
old_locations (list) – list of former unaggregated regions
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region's id to the list of original region ids in its group
- Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
  '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:

mode (str, one of {"mean", "sum"}) – Specifies how the data should be aggregated
* the default value is 'mean'

Returns:

param_df_out

Contains aggregated data

Return type:

pd.DataFrame

aggregation.aggregate_based_on_sub_to_sup_region_id_dict(xarray_datasets, sub_to_sup_region_id_dict, aggregation_function_dict)

After spatial grouping, for each region group, spatially aggregates the data.

Parameters:

xarray_datasets (Dict[str, xr.Dataset]) – The dictionary of xarray datasets holding esM’s info
sub_to_sup_region_id_dict (Dict[str, List[str]]) –
Dictionary mapping each new region's id to the list of original region ids in its group
- Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],
  '03_reg_04_reg': ['03_reg', '04_reg']}
aggregation_function_dict (Dict[str, Tuple(str, None/str)]) –
Specifies the mode of aggregation for each combination of component class, component, and variable.
- Aggregation possibilities: mean, weighted mean, sum, bool (boolean OR).
- Format of the dictionary:
  {<component_class>: {<component_name>: {<variable_name>: (<mode_of_aggregation>, <weights>),
                                          <variable_name>: (<mode_of_aggregation>, None)}}}
  <weights> is required only if <mode_of_aggregation> is 'weighted mean', in which case it is the name of the variable that acts as weights. It can be None otherwise.

Returns:

aggregated_xr_dataset

New xarray dataset with aggregated information
Coordinates correspond to new regions

(In the above example, ‘01_reg_02_reg’, ‘03_reg_04_reg’ form new coordinates)

Return type:

xr.Dataset
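
The nested format of aggregation_function_dict is easy to get wrong, so a small check before calling the function can help. The following sketch is not part of the library: validate_aggregation_function_dict is a hypothetical helper, and the component class, component, and variable names are illustrative and should be checked against your own model.

```python
from typing import Dict, Optional, Tuple

AggregationSpec = Tuple[str, Optional[str]]  # (mode_of_aggregation, weights)
VALID_MODES = {"mean", "weighted mean", "sum", "bool"}

# Illustrative names only: class -> component -> variable -> (mode, weights).
aggregation_function_dict: Dict[str, Dict[str, Dict[str, AggregationSpec]]] = {
    "Source": {
        "Wind turbines": {
            "operationRateMax": ("weighted mean", "capacityMax"),
            "capacityMax": ("sum", None),
        }
    },
    "Transmission": {
        "Cables": {
            "locationalEligibility": ("bool", None),
        }
    },
}


def validate_aggregation_function_dict(
    spec: Dict[str, Dict[str, Dict[str, AggregationSpec]]],
) -> int:
    """Check every entry and return the number of variables (hypothetical helper)."""
    n_variables = 0
    for class_name, components in spec.items():
        for component_name, variables in components.items():
            for variable_name, (mode, weights) in variables.items():
                where = f"{class_name}/{component_name}/{variable_name}"
                if mode not in VALID_MODES:
                    raise ValueError(f"{where}: unknown mode {mode!r}")
                # Weights are mandatory only for the weighted mean.
                if mode == "weighted mean" and weights is None:
                    raise ValueError(f"{where}: 'weighted mean' needs a weights variable")
                n_variables += 1
    return n_variables


n_checked = validate_aggregation_function_dict(aggregation_function_dict)
print(n_checked)  # 3
```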