Manager

Descriptions of the basic functions are given below.

Function descriptions:

Manager function that calls the spatial grouping and aggregation algorithms.

manager.perform_spatial_aggregation(xr_datasets, shapefile, grouping_mode='parameter_based', n_groups=3, distance_threshold=None, aggregatedResultsPath=None, **kwargs)

Performs spatial grouping of regions (by calling the functions in grouping.py) and then representation of the data within each region group (by calling functions in representation.py).

Parameters:

xr_datasets (str/Dict[str, xr.Dataset]) –
Either the path to a netCDF file or the already loaded xarray datasets
- Dimensions in the datasets: ‘time’, ‘space’, ‘space_2’
shapefile (str/GeoDataFrame) – Either the path to the shapefile or the read-in shapefile

Default arguments:

Parameters:

grouping_mode (str, one of {'parameter_based', 'string_based', 'distance_based'}) – Defines how to spatially group the regions. Refer to grouping.py for more information.
* the default value is ‘parameter_based’
n_groups (strictly positive int) – The number of region groups to be formed from the original region set. This parameter is irrelevant if grouping_mode is ‘string_based’.
* the default value is 3
distance_threshold (float) – The distance threshold at or above which regions will not be aggregated into one.
* the default value is None. If not None, n_groups must be None
aggregatedResultsPath (str) – Indicates path to which the aggregated results should be saved. If None, results are not saved.
* the default value is None
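
The constraints stated above for n_groups and distance_threshold can be checked before calling the manager. The following is only a minimal sketch: check_grouping_arguments is a hypothetical helper, not part of the manager module, and the library's own validation may differ.

```python
from typing import Optional


def check_grouping_arguments(
    n_groups: Optional[int], distance_threshold: Optional[float]
) -> None:
    """Hypothetical pre-check mirroring the documented constraints."""
    # Documented rule: if distance_threshold is given, n_groups must be None.
    if distance_threshold is not None and n_groups is not None:
        raise ValueError("n_groups must be None when distance_threshold is set")
    if n_groups is not None:
        # bool is a subclass of int, so it is excluded explicitly.
        is_int = isinstance(n_groups, int) and not isinstance(n_groups, bool)
        if not is_int or n_groups <= 0:
            raise ValueError("n_groups must be a strictly positive int")
```

Because n_groups defaults to 3, a caller who sets distance_threshold would also need to pass n_groups=None explicitly.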

Additional keyword arguments that can be passed via kwargs:

Parameters:

geom_col_name (str) – The name of the geometry column in the shapefile
* the default value is ‘geometry’
geom_id_col_name (str) – The column in the shapefile that contains the geometry IDs
* the default value is ‘index’
separator (str) –
Relevant only if grouping_mode is ‘string_based’. The character or string in the region IDs that defines where the ID should be split.

E.g., for region IDs [‘01_es’, ‘02_es’] and separator=‘_’, the IDs are split at _ and the last part (‘es’) is taken as the group ID.

* the default value is None
position (int/tuple) –
Relevant only if grouping_mode is ‘string_based’. Defines the position(s) in the region IDs at which the split should happen. An int i means the part from 0 to i is taken as the group ID. A tuple (i, j) means the part from i to j is taken as the group ID.

Note

either separator or position must be passed in order to perform string_based_grouping

* the default value is None
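
To illustrate how separator and position turn a region ID into a group ID, here is a minimal sketch. group_id_from_region_id is a hypothetical helper written for this illustration, not the function used in grouping.py. It assumes Python slice semantics (end index excluded) for position and gives separator precedence if both are passed; the library may behave differently in either respect.

```python
from typing import Optional, Tuple, Union


def group_id_from_region_id(
    region_id: str,
    separator: Optional[str] = None,
    position: Optional[Union[int, Tuple[int, int]]] = None,
) -> str:
    """Hypothetical illustration of string-based group ID extraction."""
    if separator is not None:
        # Split at the separator and keep the last part, as in the example above.
        return region_id.split(separator)[-1]
    if isinstance(position, int):
        # Assumption: the part from 0 to i is region_id[:i] (i excluded).
        return region_id[:position]
    if isinstance(position, tuple):
        start, end = position
        # Assumption: the part from i to j is region_id[i:j] (j excluded).
        return region_id[start:end]
    raise ValueError("either separator or position must be passed")


region_ids = ["01_es", "02_es", "01_de"]
group_ids = [group_id_from_region_id(r, separator="_") for r in region_ids]
```

With these inputs, the first two regions share the group ID 'es' and would therefore end up in the same group.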
weights (Dict) –
Relevant only if grouping_mode is ‘parameter_based’. The weights dictionary assigns weights to variable-component pairs. When the distance corresponding to each variable-component pair is calculated, the specified weights are applied; any pair without a specified weight is given a weight of 1.

It must be in one of the formats:
- If you want to specify weights for particular variables and particular corresponding components:
  
  { ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : List[<variable_name>] }
- If you want to specify weights for particular variables, but all corresponding components:
  
  { ‘components’ : {‘all’ : <weight>}, ‘variables’ : List[<variable_name>] }
- If you want to specify weights for all variables, but particular corresponding components:
  
  { ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : ‘all’ }
<weight> can be of type int/float
* the default value is None
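
The three formats could look as follows in practice. This is only an illustration: the component names 'wind_onshore' and 'pv' are hypothetical placeholders, and the variable names are borrowed from the default aggregation dictionary documented further down.

```python
# 1. Particular variables and particular components.
weights_specific = {
    "components": {"wind_onshore": 2, "pv": 0.5},  # hypothetical component names
    "variables": ["operationRateMax", "capacityMax"],
}

# 2. Particular variables, all components.
weights_all_components = {
    "components": {"all": 2.5},
    "variables": ["capacityMax"],
}

# 3. All variables, particular components.
weights_all_variables = {
    "components": {"wind_onshore": 3},  # hypothetical component name
    "variables": "all",
}
```

Any one of these dictionaries would be passed as weights=... together with grouping_mode='parameter_based'.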
aggregation_method (str, one of {'kmedoids_contiguity', 'hierarchical'}) –
Relevant only if grouping_mode is ‘parameter_based’. The clustering method that should be used to group the regions. Options:
- ’kmedoids_contiguity’:
  kmedoids clustering with added contiguity constraint. Refer to TSAM docs for more info: https://github.com/FZJ-IEK3-VSA/tsam/blob/master/tsam/utils/k_medoids_contiguity.py
- ’hierarchical’:
  sklearn’s agglomerative clustering with complete linkage, with a connectivity matrix to ensure contiguity. Refer to Sklearn docs for more info: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
* the default value is ‘kmedoids_contiguity’
skip_regions (List[str]) –
The region IDs to be skipped while aggregating regions

Note

currently only implemented for grouping_mode ‘distance_based’

* the default value is None
enforced_groups (Dict[str, List[str]]) –
The groups that should be enforced when aggregating regions.

Note

currently only implemented for grouping_mode ‘distance_based’

* the default value is None
solver (str) – Relevant only if grouping_mode is ‘parameter_based’ and aggregation_method is ‘kmedoids_contiguity’. The optimization solver to be chosen.
* the default value is ‘gurobi’
aggregation_function_dict (Dict[str, Tuple(str, None/str)]) –
- Contains information regarding the mode of aggregation for each individual variable.
- Possibilities: mean, weighted mean, sum, bool (boolean OR).
- Format of the dictionary:
  
  {<variable_name>: (<mode_of_aggregation>, <weights>), <variable_name>: (<mode_of_aggregation>, None)}
  
  <weights> is required only if <mode_of_aggregation> is ‘weighted mean’. The name of the variable that should act as weights should be provided. Can be None otherwise.
Note

A default dictionary with the following modes is used. If aggregation_function_dict is passed, this default dictionary is updated with it. The default dictionary:

{
    "operationRateMax": ("weighted mean", "capacityMax"),
    "operationRateFix": ("sum", None),
    "locationalEligibility": ("bool", None),
    "capacityMax": ("sum", None),
    "investPerCapacity": ("mean", None),
    "investIfBuilt": ("bool", None),
    "opexPerOperation": ("mean", None),
    "opexPerCapacity": ("mean", None),
    "opexIfBuilt": ("bool", None),
    "interestRate": ("mean", None),
    "economicLifetime": ("mean", None),
    "capacityFix": ("sum", None),
    "losses": ("mean", None),
    "distances": ("mean", None),
    "commodityCost": ("mean", None),
    "commodityRevenue": ("mean", None),
    "opexPerChargeOperation": ("mean", None),
    "opexPerDischargeOperation": ("mean", None),
    "QPcostScale": ("sum", None),
    "technicalLifetime": ("mean", None),
    "balanceLimit": ("sum", None),
    "pathwayBalanceLimit": ("sum", None)
}

* the default value is None
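
The update behaviour described in the note can be pictured with a small sketch. It assumes that "updated" means a plain dictionary update in which user entries override defaults of the same name; only a subset of the default entries is reproduced here, and the variable names default_subset, user_dict and effective are hypothetical.

```python
# Subset of the documented defaults, copied from the note above.
default_subset = {
    "operationRateMax": ("weighted mean", "capacityMax"),
    "capacityMax": ("sum", None),
    "investPerCapacity": ("mean", None),
}

# User-supplied aggregation_function_dict: aggregate investPerCapacity as a
# mean weighted by capacityMax instead of a plain mean.
user_dict = {"investPerCapacity": ("weighted mean", "capacityMax")}

# Assumption: user entries override defaults, untouched defaults are kept.
effective = {**default_subset, **user_dict}
```

Under this assumption, only the variables named in the passed dictionary change their aggregation mode; all others keep the default.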
aggregated_shp_name (str) – Name to be given to the saved shapefiles after aggregation
* the default value is ‘aggregated_regions’
crs (int) – Coordinate reference system (crs) in which to save the shapefiles
* the default value is 3035
aggregated_xr_filename (str) – Name to be given to the saved netCDF file containing aggregated esM data
* the default value is ‘aggregated_xr_dataset.nc’

Returns:

aggregated_xr_dataset – The xarray datasets holding the aggregated data

Return type:

Dict[str, xr.Dataset]
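
A call using the documented signature might be wrapped as in the sketch below. Because this section does not state the import path of the manager module, the module is passed in as an argument; run_string_based_aggregation is a hypothetical wrapper, and the choice of 'string_based' grouping with separator='_' is only an example.

```python
def run_string_based_aggregation(manager, xr_datasets, shapefile):
    """Hypothetical wrapper around manager.perform_spatial_aggregation.

    `manager` is the already imported manager module; `xr_datasets` and
    `shapefile` are paths or already loaded objects, as documented above.
    """
    return manager.perform_spatial_aggregation(
        xr_datasets,
        shapefile,
        grouping_mode="string_based",
        separator="_",  # group regions by the part after the last '_'
        aggregatedResultsPath=None,  # None: results are not saved
    )
```

According to the documentation above, the return value is a Dict[str, xr.Dataset] holding the aggregated data.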