Aggregation

The functions described below aggregate region data onto the reduced set of regions obtained from spatial grouping of regions.

aggregation.aggregate_geometries(xr_data_array_in, sub_to_sup_region_id_dict)

For each region group, aggregates the geometries of its member regions into one super geometry.

Parameters:
  • xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to the geometry variable

  • sub_to_sup_region_id_dict (Dict[str, List[str]]) –

    Dictionary mapping each new region's id to its corresponding group of regions

    • Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],

      '03_reg_04_reg': ['03_reg', '04_reg']}

Returns:

xr_data_array_out

  • Contains new geometries as values

  • Coordinates correspond to new regions

(In the above example, '01_reg_02_reg' and '03_reg_04_reg' form the new coordinates)

Return type:

xr.DataArray
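
The mapping passed as sub_to_sup_region_id_dict, whose keys become the coordinates of the returned array, can be assembled from a list of region groups. The following is a minimal sketch and not part of the aggregation module: the helper names build_sub_to_sup_region_id_dict and invert_region_dict are hypothetical, and joining the member ids with an underscore is an assumption read off the example ids above.

```python
from typing import Dict, List


def build_sub_to_sup_region_id_dict(groups: List[List[str]]) -> Dict[str, List[str]]:
    """Map each new region id to its group of regions (hypothetical helper).

    Assumption: the new id is the member ids joined with "_", as in the
    example '01_reg_02_reg' -> ['01_reg', '02_reg'].
    """
    return {"_".join(group): list(group) for group in groups}


def invert_region_dict(sub_to_sup: Dict[str, List[str]]) -> Dict[str, str]:
    """Return the reverse lookup: original region id -> new region id."""
    inverse: Dict[str, str] = {}
    for sup_id, sub_ids in sub_to_sup.items():
        for sub_id in sub_ids:
            # A region may only belong to one group.
            if sub_id in inverse:
                raise ValueError(f"{sub_id} belongs to more than one group")
            inverse[sub_id] = sup_id
    return inverse


sub_to_sup_region_id_dict = build_sub_to_sup_region_id_dict(
    [["01_reg", "02_reg"], ["03_reg", "04_reg"]]
)
sup_of = invert_region_dict(sub_to_sup_region_id_dict)
```

The reverse lookup is handy for checking that every original region is assigned to exactly one group before any aggregation function is called.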

aggregation.aggregate_time_series_spatially(xr_data_array_in, sub_to_sup_region_id_dict, mode='mean', xr_weight_array=None)

For each region group, aggregates the given time series variable.

Parameters:
  • xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to a time series variable

  • sub_to_sup_region_id_dict (Dict[str, List[str]]) –

    Dictionary mapping each new region's id to its corresponding group of regions

    • Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],

      '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:
  • mode (str, one of {"mean", "weighted mean", "sum"}) – Specifies how the time series should be aggregated
    * the default value is 'mean'

  • xr_weight_array (xr.DataArray) – Required if mode is "weighted mean", in which case it provides the weights. Its dimensions and coordinates should be the same as those of xr_data_array_in
    * the default value is None

Returns:

xr_data_array_out

  • Contains aggregated time series as values

  • Coordinates correspond to new regions

(In the above example, '01_reg_02_reg' and '03_reg_04_reg' form the new coordinates)

Return type:

xr.DataArray
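
The three modes can be illustrated without xarray. The following is a plain-Python sketch of the arithmetic only, not the library's implementation: the function name aggregate_time_series and the dict-of-lists layout (region id mapped to one value per time step) are assumptions made for the example, and the weights are given per region and time step to mirror the requirement that xr_weight_array has the same dimensions as the data.

```python
from typing import Dict, List, Optional

TimeSeries = Dict[str, List[float]]  # region id -> one value per time step


def aggregate_time_series(
    data: TimeSeries,
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "mean",
    weights: Optional[TimeSeries] = None,
) -> TimeSeries:
    """Aggregate per-region time series onto region groups (illustrative only)."""
    if mode not in {"mean", "weighted mean", "sum"}:
        raise ValueError(f"unsupported mode: {mode}")
    if mode == "weighted mean" and weights is None:
        raise ValueError("weights are required for mode 'weighted mean'")

    out: TimeSeries = {}
    for sup_id, sub_ids in sub_to_sup_region_id_dict.items():
        n_steps = len(data[sub_ids[0]])
        series = []
        for t in range(n_steps):
            values = [data[r][t] for r in sub_ids]
            if mode == "sum":
                series.append(sum(values))
            elif mode == "mean":
                series.append(sum(values) / len(values))
            else:
                # Weighted mean; assumes the weights of a group do not sum to zero.
                w = [weights[r][t] for r in sub_ids]
                series.append(sum(v * wi for v, wi in zip(values, w)) / sum(w))
        out[sup_id] = series
    return out


groups = {"01_reg_02_reg": ["01_reg", "02_reg"], "03_reg_04_reg": ["03_reg", "04_reg"]}
data = {"01_reg": [1.0, 2.0], "02_reg": [3.0, 6.0], "03_reg": [0.0, 4.0], "04_reg": [2.0, 0.0]}
weights = {"01_reg": [1.0, 1.0], "02_reg": [3.0, 1.0], "03_reg": [1.0, 1.0], "04_reg": [1.0, 3.0]}

mean_out = aggregate_time_series(data, groups)
sum_out = aggregate_time_series(data, groups, mode="sum")
weighted_out = aggregate_time_series(data, groups, mode="weighted mean", weights=weights)
```

In this sketch the weighted mean of the first group at the first time step is (1·1 + 3·3) / (1 + 3) = 2.5, whereas the plain mean is 2.0.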

aggregation.aggregate_values_spatially(xr_data_array_in, sub_to_sup_region_id_dict, mode='mean')

For each region group, aggregates the given 1d variable.

Parameters:
  • xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to a 1d variable

  • sub_to_sup_region_id_dict (Dict[str, List[str]]) –

    Dictionary mapping each new region's id to its corresponding group of regions

    • Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],

      '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:
  • mode (str, one of {"mean", "sum", "bool"}) – Specifies how the values should be aggregated
    * the default value is 'mean'

Returns:

xr_data_array_out

  • Contains aggregated 1d variable as values

  • Coordinates correspond to new regions

(In the above example, '01_reg_02_reg' and '03_reg_04_reg' form the new coordinates)

Return type:

xr.DataArray
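
For a 1d variable each region carries a single value, so each group reduces to one number. The sketch below is a plain-Python illustration and not the library's code: the function name aggregate_values and the dict layout are assumptions, and "bool" is read as a boolean OR over the group, as described for aggregate_based_on_sub_to_sup_region_id_dict.

```python
from typing import Dict, List, Union

Number = Union[int, float, bool]


def aggregate_values(
    values: Dict[str, Number],
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "mean",
) -> Dict[str, Number]:
    """Aggregate one value per region onto region groups (illustrative only)."""
    out: Dict[str, Number] = {}
    for sup_id, sub_ids in sub_to_sup_region_id_dict.items():
        group = [values[r] for r in sub_ids]
        if mode == "mean":
            out[sup_id] = sum(group) / len(group)
        elif mode == "sum":
            out[sup_id] = sum(group)
        elif mode == "bool":
            # Assumption: boolean OR, i.e. True if any member value is non-zero.
            out[sup_id] = any(bool(v) for v in group)
        else:
            raise ValueError(f"unsupported mode: {mode}")
    return out


groups = {"01_reg_02_reg": ["01_reg", "02_reg"], "03_reg_04_reg": ["03_reg", "04_reg"]}
values = {"01_reg": 0, "02_reg": 5, "03_reg": 0, "04_reg": 0}

mean_values = aggregate_values(values, groups)
sum_values = aggregate_values(values, groups, mode="sum")
bool_values = aggregate_values(values, groups, mode="bool")
```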

aggregation.aggregate_connections(xr_data_array_in, sub_to_sup_region_id_dict, mode='bool')

For each region group, aggregates the given 2d variable.

Parameters:
  • xr_data_array_in (xr.DataArray) – subset of the xarray dataset that corresponds to a 2d variable

  • sub_to_sup_region_id_dict (Dict[str, List[str]]) –

    Dictionary mapping each new region's id to its corresponding group of regions

    • Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],

      '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:
  • mode (str, one of {"bool", "mean", "sum"}) – Specifies how the connections should be aggregated
    * the default value is 'bool'

Returns:

xr_data_array_out

  • Contains aggregated 2d variable as values

  • Coordinates correspond to new regions

(In the above example, '01_reg_02_reg' and '03_reg_04_reg' form the new coordinates)

Return type:

xr.DataArray
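
A 2d variable holds one value per ordered pair of regions, so aggregating it means reducing, for every pair of groups, the block of values between their member regions. The following plain-Python sketch shows one way to read the three modes; it is not the library's implementation. The function name aggregate_connections_sketch and the dict keyed by region pairs are assumptions, missing pairs are treated as 0, and setting connections inside one group (the diagonal) to 0 is also an assumption, since the description above does not say how they are handled.

```python
from itertools import product
from typing import Dict, List, Tuple

Connections = Dict[Tuple[str, str], float]  # (from region, to region) -> value


def aggregate_connections_sketch(
    connections: Connections,
    sub_to_sup_region_id_dict: Dict[str, List[str]],
    mode: str = "bool",
) -> Connections:
    """Aggregate region-to-region values onto region groups (illustrative only)."""
    if mode not in {"bool", "mean", "sum"}:
        raise ValueError(f"unsupported mode: {mode}")

    out: Connections = {}
    items = sub_to_sup_region_id_dict.items()
    for (sup_a, subs_a), (sup_b, subs_b) in product(items, repeat=2):
        if sup_a == sup_b:
            # Assumption: a new region has no connection to itself.
            out[(sup_a, sup_b)] = 0
            continue
        # All values between members of group a and members of group b.
        block = [connections.get((a, b), 0) for a, b in product(subs_a, subs_b)]
        if mode == "bool":
            out[(sup_a, sup_b)] = int(any(v != 0 for v in block))
        elif mode == "sum":
            out[(sup_a, sup_b)] = sum(block)
        else:
            out[(sup_a, sup_b)] = sum(block) / len(block)
    return out


groups = {"01_reg_02_reg": ["01_reg", "02_reg"], "03_reg_04_reg": ["03_reg", "04_reg"]}
connections = {
    ("01_reg", "02_reg"): 1, ("02_reg", "01_reg"): 1,  # inside the first group
    ("01_reg", "03_reg"): 1, ("03_reg", "01_reg"): 1,  # between the two groups
}

bool_conn = aggregate_connections_sketch(connections, groups)
sum_conn = aggregate_connections_sketch(connections, groups, mode="sum")
mean_conn = aggregate_connections_sketch(connections, groups, mode="mean")
```

With "bool", two new regions are connected as soon as any pair of their member regions is connected; "mean" instead averages over all four member pairs in this example.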

aggregation.aggregate_esm_parameters_spatially(param_df_in, old_locations, sub_to_sup_region_id_dict, mode='mean')

For each region group, aggregates the given esm init parameter data.

Parameters:
  • param_df_in (pd.DataFrame) – the dataframe with parameter data

  • old_locations (list) – list of the original, unaggregated regions

  • sub_to_sup_region_id_dict (Dict[str, List[str]]) –

    Dictionary mapping each new region's id to its corresponding group of regions

    • Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],

      '03_reg_04_reg': ['03_reg', '04_reg']}

Default arguments:
  • mode (str, one of {"mean", "sum"}) – Specifies how the data should be aggregated
    * the default value is 'mean'

Returns:

param_df_out

  • Contains aggregated data

Return type:

pd.DataFrame

aggregation.aggregate_based_on_sub_to_sup_region_id_dict(xarray_datasets, sub_to_sup_region_id_dict, aggregation_function_dict)

After spatial grouping, for each region group, spatially aggregates the data.

Parameters:
  • xarray_datasets (Dict[str, xr.Dataset]) – Dictionary of xarray datasets holding the esM's information

  • sub_to_sup_region_id_dict (Dict[str, List[str]]) –

    Dictionary mapping each new region's id to its corresponding group of regions

    • Ex.: {'01_reg_02_reg': ['01_reg', '02_reg'],

      '03_reg_04_reg': ['03_reg', '04_reg']}

  • aggregation_function_dict (Dict[str, Tuple(str, None/str)]) –

    Contains information regarding the mode of aggregation for each individual variable, component, and component class combination.

    • Aggregation possibilities: mean, weighted mean, sum, bool (boolean OR).

    • Format of the dictionary:

      {<component_class>: {<component_name>: {<variable_name>: (<mode_of_aggregation>, <weights>),

      <variable_name>: (<mode_of_aggregation>, None)}}}

      <weights> is required only if <mode_of_aggregation> is 'weighted mean'; in that case it must be the name of the variable that acts as weights. Otherwise it can be None.

Returns:

aggregated_xr_dataset

  • New xarray dataset with aggregated information

  • Coordinates correspond to new regions

(In the above example, '01_reg_02_reg' and '03_reg_04_reg' form the new coordinates)

Return type:

xr.Dataset
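
The nested format of aggregation_function_dict is easy to get wrong, so it can help to check it before aggregating. The sketch below builds one such dictionary and validates it against the rules stated above. It is an illustration only: the function validate_aggregation_function_dict is hypothetical and not part of the aggregation module, and the component class, component, and variable names are placeholders rather than names the library defines.

```python
from typing import Dict, Optional, Tuple

AggregationSpec = Tuple[str, Optional[str]]  # (mode_of_aggregation, weights)
AggregationFunctionDict = Dict[str, Dict[str, Dict[str, AggregationSpec]]]

ALLOWED_MODES = {"mean", "weighted mean", "sum", "bool"}


def validate_aggregation_function_dict(spec: AggregationFunctionDict) -> None:
    """Raise ValueError if an entry breaks the documented format (hypothetical helper)."""
    for component_class, components in spec.items():
        for component_name, variables in components.items():
            for variable_name, (mode, weights) in variables.items():
                where = f"{component_class}/{component_name}/{variable_name}"
                if mode not in ALLOWED_MODES:
                    raise ValueError(f"{where}: unknown mode {mode!r}")
                # Weights are only mandatory for the weighted mean.
                if mode == "weighted mean" and weights is None:
                    raise ValueError(f"{where}: 'weighted mean' needs a weights variable")


# Placeholder names, chosen for illustration only.
aggregation_function_dict: AggregationFunctionDict = {
    "ExampleComponentClass": {
        "example_component": {
            "example_time_series": ("weighted mean", "example_capacity"),
            "example_capacity": ("sum", None),
        }
    }
}

validate_aggregation_function_dict(aggregation_function_dict)
```

Here the time series variable is averaged with the capacity variable as weights, while the capacity itself is summed over each region group.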