Manager
Descriptions of the basic functions are given below.
Function descriptions:
Manager function that calls the spatial grouping and aggregation algorithms.
- manager.perform_spatial_aggregation(xr_datasets, shapefile, grouping_mode='parameter_based', n_groups=3, distance_threshold=None, aggregatedResultsPath=None, **kwargs)
Performs spatial grouping of regions (by calling the functions in grouping.py) and then representation of the data within each region group (by calling functions in representation.py).
- Parameters:
xr_datasets (str/Dict[str, xr.Dataset]) –
Either the path to a netCDF file or the read-in xarray datasets
Dimensions in the datasets: ‘time’, ‘space’, ‘space_2’
shapefile (str/GeoDataFrame) – Either the path to the shapefile or the read-in shapefile
Default arguments:
- Parameters:
grouping_mode (str, one of {'parameter_based', 'string_based', 'distance_based'}) – Defines how to spatially group the regions. Refer to grouping.py for more information.
* the default value is ‘parameter_based’
n_groups (strictly positive int) – The number of region groups to be formed from the original region set. This parameter is irrelevant if grouping_mode is ‘string_based’.
* the default value is 3
distance_threshold (float) – The distance threshold at or above which regions will not be aggregated into one.
* the default value is None. If not None, n_groups must be None
aggregatedResultsPath (str) – Indicates path to which the aggregated results should be saved. If None, results are not saved.
* the default value is None
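The constraints among these arguments can be summarized in a small check. This is a minimal sketch and not the library's own validation code: `check_grouping_args` is a hypothetical helper, and requiring a strictly positive `n_groups` whenever no `distance_threshold` is given is an assumption drawn from the parameter descriptions above.

```python
VALID_GROUPING_MODES = {"parameter_based", "string_based", "distance_based"}


def check_grouping_args(grouping_mode="parameter_based", n_groups=3,
                        distance_threshold=None):
    """Hypothetical helper: checks the documented argument constraints."""
    if grouping_mode not in VALID_GROUPING_MODES:
        raise ValueError(f"unknown grouping_mode: {grouping_mode!r}")
    if distance_threshold is not None:
        # Documented rule: if distance_threshold is given, n_groups must be None.
        if n_groups is not None:
            raise ValueError("n_groups must be None when distance_threshold is set")
    elif grouping_mode != "string_based":
        # n_groups is irrelevant for 'string_based'; otherwise it is assumed
        # to be a strictly positive int.
        if isinstance(n_groups, bool) or not isinstance(n_groups, int) or n_groups <= 0:
            raise ValueError("n_groups must be a strictly positive int")
    return True


# The defaults pass, and so does a distance threshold with n_groups=None.
check_grouping_args()
check_grouping_args("distance_based", n_groups=None, distance_threshold=5.0)
```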
Additional keyword arguments that can be passed via kwargs:
- Parameters:
geom_col_name (str) – The geometry column name in shapefile
* the default value is ‘geometry’
geom_id_col_name (str) – The column in shapefile containing the geometry IDs
* the default value is ‘index’
separator (str) –
Relevant only if grouping_mode is ‘string_based’. The character or string in the region IDs that defines where the ID should be split.
E.g.: region IDs -> [‘01_es’, ‘02_es’] and separator=’_’, then IDs are split at _ and the last part (‘es’) is taken as the group ID
* the default value is None
position (int/tuple) –
Relevant only if grouping_mode is ‘string_based’. Used to define the position(s) of the region IDs where the split should happen. An int i would mean the part from 0 to i is taken as the group ID. A tuple (i,j) would mean the part i to j is taken as the group ID.
Note
either separator or position must be passed in order to perform string_based_grouping
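The two string_based options can be illustrated with plain string operations. This is a sketch only, not the code in grouping.py: `group_id_from_region_id` and `group_regions` are hypothetical helpers, and reading "0 to i" and "i to j" as the Python slices `[0:i]` and `[i:j]` is an assumption.

```python
def group_id_from_region_id(region_id, separator=None, position=None):
    """Hypothetical helper illustrating the documented string_based rules."""
    if separator is not None:
        # Split at the separator and keep the last part, e.g. '01_es' -> 'es'.
        return region_id.split(separator)[-1]
    if isinstance(position, int):
        # Assumption: "from 0 to i" is read as the slice [0:i].
        return region_id[:position]
    if isinstance(position, tuple):
        start, stop = position
        # Assumption: "i to j" is read as the slice [i:j].
        return region_id[start:stop]
    raise ValueError("either separator or position must be passed")


def group_regions(region_ids, separator=None, position=None):
    """Collect region IDs under their derived group ID."""
    groups = {}
    for region_id in region_ids:
        key = group_id_from_region_id(region_id, separator, position)
        groups.setdefault(key, []).append(region_id)
    return groups


by_separator = group_regions(["01_es", "02_es", "01_de"], separator="_")
by_position = group_regions(["01_es", "02_es", "01_de"], position=(3, 5))
```

Under these assumptions both calls group the IDs by their two-letter suffix, giving one group for 'es' and one for 'de'.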
* the default value is None
weights (Dict) –
Relevant only if grouping_mode is ‘parameter_based’. Through the weights dictionary, one can assign weights to variable-component pairs. When calculating the distance corresponding to each variable-component pair, the specified weights are applied; pairs without a specified weight use a weight of 1.
It must be in one of the formats:
If you want to specify weights for particular variables and particular corresponding components:
{ ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : List[<variable_name>] }
If you want to specify weights for particular variables, but all corresponding components:
{ ‘components’ : {‘all’ : <weight>}, ‘variables’ : List[<variable_name>] }
If you want to specify weights for all variables, but particular corresponding components:
{ ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : ‘all’ }
<weight> can be of type int/float
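The three formats can be written out as Python dictionaries. The sketch below also shows one way such a dictionary might be resolved to a weight for a single variable-component pair. `weight_for` is a hypothetical helper, not part of the library, and the component names 'wind' and 'pv' are made up for illustration.

```python
# Particular variables, particular components.
weights_a = {"components": {"wind": 2, "pv": 0.5}, "variables": ["capacityMax"]}
# Particular variables, all components.
weights_b = {"components": {"all": 3}, "variables": ["operationRateMax"]}
# All variables, particular components.
weights_c = {"components": {"wind": 2}, "variables": "all"}


def weight_for(weights, variable, component):
    """Hypothetical lookup: weight of one variable-component pair (1 if unspecified)."""
    if weights is None:
        return 1
    variables = weights["variables"]
    components = weights["components"]
    if variables != "all" and variable not in variables:
        return 1  # variable not listed, so the default weight applies
    if "all" in components:
        return components["all"]
    return components.get(component, 1)


w_listed = weight_for(weights_a, "capacityMax", "wind")    # listed pair
w_unlisted = weight_for(weights_a, "losses", "wind")       # variable not listed
w_all_components = weight_for(weights_b, "operationRateMax", "pv")
w_all_variables = weight_for(weights_c, "losses", "wind")
```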
* the default value is None
aggregation_method (str, one of {'kmedoids_contiguity', 'hierarchical'}) –
Relevant only if grouping_mode is ‘parameter_based’. The clustering method that should be used to group the regions. Options:
- ’kmedoids_contiguity’:
kmedoids clustering with added contiguity constraint. Refer to TSAM docs for more info: https://github.com/FZJ-IEK3-VSA/tsam/blob/master/tsam/utils/k_medoids_contiguity.py
- ’hierarchical’:
sklearn’s agglomerative clustering with complete linkage, with a connectivity matrix to ensure contiguity. Refer to Sklearn docs for more info: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html
* the default value is ‘kmedoids_contiguity’
skip_regions (List[str]) –
The region IDs to be skipped while aggregating regions
Note
currently only implemented for grouping_mode ‘distance_based’
* the default value is None
enforced_groups (Dict[str, List[str]]) –
The groups that should be enforced when aggregating regions.
Note
currently only implemented for grouping_mode ‘distance_based’
* the default value is None
solver (str) – Relevant only if grouping_mode is ‘parameter_based’ and aggregation_method is ‘kmedoids_contiguity’. The optimization solver to be chosen.
* the default value is ‘gurobi’
aggregation_function_dict (Dict[str, Tuple(str, None/str)]) –
Contains information regarding the mode of aggregation for each individual variable.
Possibilities: mean, weighted mean, sum, bool (boolean OR).
Format of the dictionary:
{<variable_name>: (<mode_of_aggregation>, <weights>), <variable_name>: (<mode_of_aggregation>, None)}
<weights> is required only if <mode_of_aggregation> is ‘weighted mean’. The name of the variable that should act as weights should be provided. Can be None otherwise.
Note
A default dictionary with the following modes is used. If aggregation_function_dict is passed, this default dictionary is updated with it. The default dictionary:
{
"operationRateMax": ("weighted mean", "capacityMax"),
"operationRateFix": ("sum", None),
"locationalEligibility": ("bool", None),
"capacityMax": ("sum", None),
"investPerCapacity": ("mean", None),
"investIfBuilt": ("bool", None),
"opexPerOperation": ("mean", None),
"opexPerCapacity": ("mean", None),
"opexIfBuilt": ("bool", None),
"interestRate": ("mean", None),
"economicLifetime": ("mean", None),
"capacityFix": ("sum", None),
"losses": ("mean", None),
"distances": ("mean", None),
"commodityCost": ("mean", None),
"commodityRevenue": ("mean", None),
"opexPerChargeOperation": ("mean", None),
"opexPerDischargeOperation": ("mean", None),
"QPcostScale": ("sum", None),
"technicalLifetime": ("mean", None),
"balanceLimit": ("sum", None),
"pathwayBalanceLimit": ("sum", None)
}
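To make the four aggregation modes and the update of the default dictionary concrete, here is a minimal sketch on plain Python lists, with one value per region of a group. It is not the code in representation.py, which operates on xarray data. `aggregate` and `aggregate_group` are hypothetical helpers, the sample values are made up, and only an excerpt of the default dictionary is reproduced.

```python
# Excerpt of the default dictionary documented above.
DEFAULT_AGGREGATION = {
    "operationRateMax": ("weighted mean", "capacityMax"),
    "capacityMax": ("sum", None),
    "locationalEligibility": ("bool", None),
    "investPerCapacity": ("mean", None),
}


def aggregate(values, mode, weights=None):
    """Hypothetical helper: combine one value per region into one group value."""
    if mode == "sum":
        return sum(values)
    if mode == "mean":
        return sum(values) / len(values)
    if mode == "weighted mean":
        return sum(v * w for v, w in zip(values, weights)) / sum(weights)
    if mode == "bool":
        return int(any(values))  # boolean OR
    raise ValueError(f"unknown mode of aggregation: {mode!r}")


def aggregate_group(data, aggregation_function_dict=None):
    """Aggregate every variable in data; user entries update the defaults."""
    modes = dict(DEFAULT_AGGREGATION)
    modes.update(aggregation_function_dict or {})
    result = {}
    for variable, values in data.items():
        mode, weight_name = modes[variable]
        weights = data[weight_name] if weight_name is not None else None
        result[variable] = aggregate(values, mode, weights)
    return result


# Two regions merged into one group (illustrative numbers).
data = {
    "capacityMax": [10.0, 30.0],
    "operationRateMax": [0.25, 0.75],
    "locationalEligibility": [0, 1],
    "investPerCapacity": [100.0, 200.0],
}
with_defaults = aggregate_group(data)
with_override = aggregate_group(data, {"investPerCapacity": ("sum", None)})
```

With the defaults, operationRateMax is weighted by capacityMax: (0.25 * 10 + 0.75 * 30) / 40 = 0.625. Passing `{"investPerCapacity": ("sum", None)}` changes only that variable's mode and leaves the other defaults in place.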
* the default value is None
aggregated_shp_name (str) – Name to be given to the saved shapefiles after aggregation
* the default value is ‘aggregated_regions’
crs (int) – Coordinate reference system (crs) in which to save the shapefiles
* the default value is 3035
aggregated_xr_filename (str) – Name to be given to the saved netCDF file containing aggregated esM data
* the default value is ‘aggregated_xr_dataset.nc’
- Returns:
aggregated_xr_dataset - The xarray datasets holding aggregated data
- Return type:
Dict[str, xr.Dataset]