Grouping
Descriptions of the basic functions are given below.
Grouping algorithms determine how to merge the input regions into fewer regions while minimizing information loss.
- grouping.perform_string_based_grouping(regions, separator=None, position=None)[source]
Groups regions based on their names/ids.
- Parameters:
regions (List[str]/np.array(str)) –
List or array of region names.
Ex.: [‘01_es’, ‘02_es’, ‘01_de’, ‘02_de’, ‘03_de’]
Default arguments:
- Parameters:
separator (str) –
The character or string in the region IDs that defines where the ID should be split
Ex.: ‘_’ would split the above IDs at _ and take the last part (‘es’, ‘de’) as the group ID
* the default value is None
position (int/tuple) – Used to define the position(s) of the region IDs where the split should happen. An int i would mean the part from 0 to i is taken as the group ID. A tuple (i,j) would mean the part i to j is taken as the group ID.
* the default value is None
- Returns:
sub_to_sup_region_id_dict - Dictionary of the new regions’ IDs and their corresponding groups of regions
Ex.: {‘es’ : [‘01_es’, ‘02_es’] , ‘de’ : [‘01_de’, ‘02_de’, ‘03_de’]}
- Return type:
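The grouping rule described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `group_by_id_part` is a hypothetical helper, and it treats `separator` and `position` as alternatives, which is an assumption (how the real function combines the two is not specified here).

```python
from collections import defaultdict


def group_by_id_part(regions, separator=None, position=None):
    """Hypothetical helper: map each group ID to the region IDs it contains."""
    groups = defaultdict(list)
    for region in regions:
        if separator is not None:
            # Take the last part after splitting, as in the '_' example above.
            group_id = region.split(separator)[-1]
        elif isinstance(position, int):
            # An int i takes the part from 0 to i as the group ID.
            group_id = region[:position]
        elif isinstance(position, tuple):
            # A tuple (i, j) takes the part from i to j as the group ID.
            start, end = position
            group_id = region[start:end]
        else:
            raise ValueError("Provide either a separator or a position.")
        groups[group_id].append(region)
    return dict(groups)


regions = ["01_es", "02_es", "01_de", "02_de", "03_de"]
by_separator = group_by_id_part(regions, separator="_")
by_position = group_by_id_part(regions, position=(3, 5))
```

For these IDs, splitting at `_` and slicing characters 3 to 5 select the same country suffix, so both calls produce the same grouping.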
- grouping.perform_distance_based_grouping(geom_xr, n_groups=3, skip_regions=None, enforced_groups=None, distance_threshold=None)[source]
Groups regions based on the regions’ centroid distances, using sklearn’s hierarchical clustering.
- Parameters:
geom_xr (xr.Dataset) – The xarray dataset holding the geom info
Default arguments:
- Parameters:
n_groups (strictly positive int) – The number of region groups to be formed from the original region set
* the default value is 3
distance_threshold (float) – The distance threshold at or above which regions will not be aggregated into one.
* the default value is None. If not None, n_groups must be None
skip_regions – The region IDs to be skipped while aggregating regions
* the default value is None
enforced_groups – The groups that should be enforced when aggregating regions.
* the default value is None
- Returns:
aggregation_dict - A nested dictionary containing results of spatial grouping at various levels/number of groups
Ex.: {3: {‘01_reg’: [‘01_reg’], ‘02_reg’: [‘02_reg’], ‘03_reg’: [‘03_reg’]},
2: {‘01_reg_02_reg’: [‘01_reg’, ‘02_reg’], ‘03_reg’: [‘03_reg’]},
1: {‘01_reg_02_reg_03_reg’: [‘01_reg’, ‘02_reg’, ‘03_reg’]}}
- Return type:
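To show how a nested result like the one above can arise, here is a small pure-Python stand-in for hierarchical clustering on centroid coordinates. It is only a sketch: the documented function relies on sklearn, whereas `nested_grouping` is a hypothetical helper that merges the two closest groups step by step. The use of complete linkage, the plain coordinate tuples, and the naming of merged groups by joining member IDs are assumptions made for illustration; `skip_regions`, `enforced_groups`, and `distance_threshold` are not modeled.

```python
import math
from itertools import combinations


def nested_grouping(centroids):
    """Hypothetical helper: record the grouping at every number of groups."""
    groups = [[region] for region in sorted(centroids)]

    def snapshot():
        # Name each group by joining its member IDs, as in the example above.
        return {"_".join(members): list(members) for members in groups}

    def complete_linkage(a, b):
        # Distance between two groups = largest pairwise centroid distance.
        return max(math.dist(centroids[p], centroids[q]) for p in a for q in b)

    result = {len(groups): snapshot()}
    while len(groups) > 1:
        # Merge the pair of groups with the smallest linkage distance.
        a, b = min(combinations(groups, 2), key=lambda pair: complete_linkage(*pair))
        groups = [g for g in groups if g is not a and g is not b] + [sorted(a + b)]
        groups.sort()
        result[len(groups)] = snapshot()
    return result


# Illustrative centroids: two nearby regions and one distant region.
centroids = {"01_reg": (0.0, 0.0), "02_reg": (1.0, 0.0), "03_reg": (5.0, 0.0)}
aggregation_dict = nested_grouping(centroids)
```

With these coordinates, `aggregation_dict` has the keys 3, 2, and 1, and `aggregation_dict[2]` groups `01_reg` with `02_reg` while `03_reg` stays on its own.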
- grouping.perform_parameter_based_grouping(xarray_datasets, n_groups=3, aggregation_method='kmedoids_contiguity', weights=None, solver='gurobi')[source]
Groups regions based on the Energy System Model instance’s data. This data may consist of:
a) regional time series variables such as operationRateMax of PVs
b) regional values such as capacityMax of PVs
c) connection values such as distances of DC Cables
d) values constant across all regions such as CommodityConversionFactors
All variables that vary across regions (a, b, and c) belonging to different ESM components are considered while determining similarity between regions.
- Parameters:
xarray_datasets (Dict[str, xr.Dataset]) – The dictionary of xarray datasets holding esM’s info
Default arguments:
- Parameters:
n_groups (strictly positive int) – The number of region groups to be formed from the original region set
* the default value is 3
aggregation_method (str) –
The clustering method that should be used to group the regions. Options:
‘kmedoids_contiguity’:
k-medoids clustering with an added contiguity constraint
Refer to TSAM docs for more info: https://github.com/FZJ-IEK3-VSA/tsam/blob/master/tsam/utils/k_medoids_contiguity.py
‘hierarchical’:
sklearn’s agglomerative clustering with complete linkage, with a connectivity matrix to ensure contiguity
Refer to Sklearn docs for more info: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
* the default value is ‘kmedoids_contiguity’
weights (Dict) –
Through the weights dictionary, one can assign weights to variable-component pairs. These weights are applied when calculating the distance corresponding to each variable-component pair; pairs without a specified weight get a weight of 1. It must be in one of the formats:
If you want to specify weights for particular variables and particular corresponding components:
{ ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : List[<variable_name>] }
If you want to specify weights for particular variables, but all corresponding components:
{ ‘components’ : {‘all’ : <weight>}, ‘variables’ : List[<variable_name>] }
If you want to specify weights for all variables, but particular corresponding components:
{ ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : ‘all’ }
<weight> can be of type int/float
* the default value is None
solver (str) – The optimization solver to be chosen. Relevant only if aggregation_method is ‘kmedoids_contiguity’
* the default value is ‘gurobi’
- Returns:
aggregation_dict - A nested dictionary containing results of spatial grouping at various levels/number of groups
Ex.: {3: {‘01_reg’: [‘01_reg’], ‘02_reg’: [‘02_reg’], ‘03_reg’: [‘03_reg’]},
2: {‘01_reg_02_reg’: [‘01_reg’, ‘02_reg’], ‘03_reg’: [‘03_reg’]},
1: {‘01_reg_02_reg_03_reg’: [‘01_reg’, ‘02_reg’, ‘03_reg’]}}
- Return type:
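The three accepted formats of the weights dictionary can be written out as Python literals. The sketch below also includes `resolve_weight`, a hypothetical helper (not part of the library) that shows one plausible reading of how a weight is looked up for a variable-component pair, falling back to 1 when none is specified. The component names `PV`, `Wind`, and `DC Cables` are made up for illustration.

```python
def resolve_weight(weights, variable, component):
    """Hypothetical helper: weight for one variable-component pair, else 1."""
    if weights is None:
        return 1
    variables = weights["variables"]
    components = weights["components"]
    # The variable must be listed, unless 'all' variables are selected.
    if variables != "all" and variable not in variables:
        return 1
    # The key 'all' applies one weight to every component.
    if "all" in components:
        return components["all"]
    return components.get(component, 1)


# Particular variables and particular components
w1 = {"components": {"PV": 2.0}, "variables": ["operationRateMax"]}
# Particular variables, but all components
w2 = {"components": {"all": 3}, "variables": ["capacityMax"]}
# All variables, but particular components
w3 = {"components": {"DC Cables": 0.5}, "variables": "all"}
```

Under this reading, `resolve_weight(w1, "operationRateMax", "PV")` gives 2.0, while any pair not covered by the dictionary keeps the default weight of 1.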