Grouping

Descriptions of the basic functions are given below.

Function descriptions:

Grouping algorithms determine how to reduce the number of input regions to fewer regions while minimizing information loss.

grouping.perform_string_based_grouping(regions, separator=None, position=None)

Groups regions based on their names/ids.

Parameters:

regions (List[str]/np.array(str)) –

List or array of region names.

Ex.: [‘01_es’, ‘02_es’, ‘01_de’, ‘02_de’, ‘03_de’]

Default arguments:

Parameters:

separator (str) –
The character or string in the region IDs that defines where the ID should be split
- Ex.: ‘_’ would split the above IDs at _ and take the last part (‘es’, ‘de’) as the group ID
* the default value is None
position (int/tuple) – Defines the position(s) in the region IDs where the split should happen. An int i means the part from 0 to i is taken as the group ID. A tuple (i, j) means the part from i to j is taken as the group ID.
* the default value is None

Returns:

sub_to_sup_region_id_dict - Dictionary mapping the new regions’ IDs to their corresponding groups of original regions

Ex.: {‘es’ : [‘01_es’, ‘02_es’] , ‘de’ : [‘01_de’, ‘02_de’, ‘03_de’]}

Return type:

Dict[str, List[str]]
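The separator and position rules described above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: the helper name group_by_name is hypothetical, the precedence of separator over position is an assumption, and "the part from 0 to i" is assumed to follow Python slice semantics (end index exclusive).

```python
from collections import defaultdict


def group_by_name(regions, separator=None, position=None):
    """Hypothetical helper: derive a group ID from each region name."""
    groups = defaultdict(list)
    for region in regions:
        if separator is not None:
            # Split at the separator and keep the last part as the group ID.
            group_id = region.split(separator)[-1]
        elif isinstance(position, tuple):
            # Tuple (i, j): characters i up to (not including) j. Assumed slice semantics.
            start, stop = position
            group_id = region[start:stop]
        elif isinstance(position, int):
            # Int i: characters 0 up to (not including) i. Assumed slice semantics.
            group_id = region[:position]
        else:
            raise ValueError("Provide either a separator or a position.")
        groups[group_id].append(region)
    return dict(groups)


regions = ["01_es", "02_es", "01_de", "02_de", "03_de"]
by_separator = group_by_name(regions, separator="_")
by_tuple = group_by_name(regions, position=(3, 5))
by_int = group_by_name(regions, position=2)
```

With the example region names, the separator and the tuple (3, 5) both yield the country suffix as the group ID, while the int position 2 groups by the numeric prefix instead.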

grouping.perform_distance_based_grouping(geom_xr, n_groups=3, skip_regions=None, enforced_groups=None, distance_threshold=None)

Groups regions based on the regions’ centroid distances, using sklearn’s hierarchical clustering.

Parameters: geom_xr (xr.Dataset) – The xarray dataset holding the geometry information

Default arguments:

Parameters:

n_groups (strictly positive int) – The number of region groups to be formed from the original region set
* the default value is 3
distance_threshold (float) – The distance threshold at or above which regions will not be aggregated into one.
* the default value is None. If not None, n_groups must be None
skip_regions – The region IDs to be skipped while aggregating regions
* the default value is None
enforced_groups – The groups that should be enforced when aggregating regions.
* the default value is None

Returns:

aggregation_dict - A nested dictionary containing results of spatial grouping at various levels/number of groups

Ex.: {3: {‘01_reg’: [‘01_reg’], ‘02_reg’: [‘02_reg’], ‘03_reg’: [‘03_reg’]},

2: {‘01_reg_02_reg’: [‘01_reg’, ‘02_reg’], ‘03_reg’: [‘03_reg’]},

1: {‘01_reg_02_reg_03_reg’: [‘01_reg’, ‘02_reg’, ‘03_reg’]}}

Return type:

Dict[int, Dict[str, List[str]]]
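To make the nested return structure concrete, the following sketch merges regions step by step according to centroid distance and records the grouping at every level. It is a hand-rolled agglomerative loop standing in for sklearn's hierarchical clustering, not the library's code: the function name nested_grouping, the complete-linkage choice, the plain (x, y) centroid tuples, and the joined group IDs are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def nested_grouping(centroids):
    """Hypothetical sketch: centroids maps region ID -> (x, y) tuple."""

    def linkage(pair):
        # Complete linkage (assumed): largest pairwise distance between members.
        a, b = pair
        return max(dist(centroids[p], centroids[q]) for p in a for q in b)

    def as_dict(groups):
        # Group ID is the member IDs joined with "_", as in the example above.
        return {"_".join(g): list(g) for g in groups}

    groups = [[region] for region in sorted(centroids)]
    result = {len(groups): as_dict(groups)}
    while len(groups) > 1:
        # Merge the two closest groups and record the new level.
        a, b = min(combinations(groups, 2), key=linkage)
        groups = [g for g in groups if g is not a and g is not b]
        groups.append(sorted(a + b))
        groups.sort()
        result[len(groups)] = as_dict(groups)
    return result


centroids = {"01_reg": (0.0, 0.0), "02_reg": (1.0, 0.0), "03_reg": (5.0, 0.0)}
aggregation_dict = nested_grouping(centroids)
```

Each key of aggregation_dict is a number of groups, and each value maps a group ID to the original regions it contains, mirroring the Dict[int, Dict[str, List[str]]] return type.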

grouping.perform_parameter_based_grouping(xarray_datasets, n_groups=3, aggregation_method='kmedoids_contiguity', weights=None, solver='gurobi')

Groups regions based on the Energy System Model instance’s data. This data may consist of:

a. regional time series variables such as operationRateMax of PVs

b. regional values such as capacityMax of PVs

c. connection values such as distances of DC Cables

d. values constant across all regions such as CommodityConversionFactors

All variables that vary across regions (a, b, and c) belonging to different ESM components are considered when determining the similarity between regions.

Parameters: xarray_datasets (Dict[str, xr.Dataset]) – The dictionary of xarray datasets holding the esM’s information

Default arguments:

Parameters:

n_groups (strictly positive int) – The number of region groups to be formed from the original region set
* the default value is 3
aggregation_method (str) –
The clustering method that should be used to group the regions. Options:
- ‘kmedoids_contiguity’:
  
  kmedoids clustering with added contiguity constraint
  
  Refer to TSAM docs for more info: https://github.com/FZJ-IEK3-VSA/tsam/blob/master/tsam/utils/k_medoids_contiguity.py
- ‘hierarchical’:
  
  sklearn’s agglomerative clustering with complete linkage, with a connectivity matrix to ensure contiguity
  
  Refer to Sklearn docs for more info: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
* the default value is ‘kmedoids_contiguity’
weights (Dict) –
Through the weights dictionary, one can assign weights to variable-component pairs. When calculating the distance corresponding to each variable-component pair, the specified weights are applied; pairs without a specified weight use a weight of 1. The dictionary must be in one of the following formats:
- If you want to specify weights for particular variables and particular corresponding components:
  
  { ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : List[<variable_name>] }
- If you want to specify weights for particular variables, but all corresponding components:
  
  { ‘components’ : {‘all’ : <weight>}, ‘variables’ : List[<variable_name>] }
- If you want to specify weights for all variables, but particular corresponding components:
  
  { ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : ‘all’ }
<weight> can be of type int/float
* the default value is None
solver (str) – The optimization solver to be chosen. Relevant only if aggregation_method is ‘kmedoids_contiguity’
* the default value is ‘gurobi’

Returns:

aggregation_dict - A nested dictionary containing results of spatial grouping at various levels/number of groups

Ex.: {3: {‘01_reg’: [‘01_reg’], ‘02_reg’: [‘02_reg’], ‘03_reg’: [‘03_reg’]},

2: {‘01_reg_02_reg’: [‘01_reg’, ‘02_reg’], ‘03_reg’: [‘03_reg’]},

1: {‘01_reg_02_reg_03_reg’: [‘01_reg’, ‘02_reg’, ‘03_reg’]}}

Return type:

Dict[int, Dict[str, List[str]]]
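The three accepted formats of the weights argument can be written out as plain Python dictionaries. The sketch below also includes a hypothetical helper, resolve_weight, that shows one plausible reading of how a weight is looked up for a variable-component pair, with 1 as the fallback. The helper and the component names "PV" and "Wind" are assumptions for illustration only; the library's internal lookup may differ.

```python
def resolve_weight(weights, variable, component):
    """Hypothetical helper: weight for one variable-component pair (1 if unspecified)."""
    if weights is None:
        return 1
    variables = weights["variables"]
    components = weights["components"]
    # The variable must be listed explicitly, unless 'all' variables are weighted.
    if variables != "all" and variable not in variables:
        return 1
    # An 'all' entry applies the same weight to every component.
    if "all" in components:
        return components["all"]
    return components.get(component, 1)


# Particular variables and particular components.
weights_specific = {
    "components": {"PV": 2, "Wind": 0.5},
    "variables": ["operationRateMax"],
}
# Particular variables, all corresponding components.
weights_all_components = {
    "components": {"all": 3},
    "variables": ["capacityMax"],
}
# All variables, particular components.
weights_all_variables = {
    "components": {"PV": 2},
    "variables": "all",
}
```

Under this reading, any pair not covered by the dictionary keeps the default weight of 1, which matches the behaviour described for the weights parameter.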