Manager

Descriptions of the basic functions are given below.

Function descriptions:

Manager function that calls the spatial grouping and aggregation algorithms.

manager.perform_spatial_aggregation(xr_datasets, shapefile, grouping_mode='parameter_based', n_groups=3, distance_threshold=None, aggregatedResultsPath=None, **kwargs)

Performs spatial grouping of regions (by calling the functions in grouping.py) and then representation of the data within each region group (by calling functions in representation.py).

Parameters:
  • xr_datasets (str/Dict[str, xr.Dataset]) –

    Either the path to a netCDF file or the already loaded xarray datasets

    • Dimensions in the datasets: ‘time’, ‘space’, ‘space_2’

  • shapefile (str/GeoDataFrame) – Either the path to the shapefile or the read-in shapefile

Default arguments:

Parameters:
  • grouping_mode (str, one of {'parameter_based', 'string_based', 'distance_based'}) – Defines how to spatially group the regions. Refer to grouping.py for more information.
    * the default value is ‘parameter_based’

  • n_groups (strictly positive int) – The number of region groups to be formed from the original region set. This parameter is irrelevant if grouping_mode is ‘string_based’.
    * the default value is 3

  • distance_threshold (float) – The distance threshold at or above which regions will not be aggregated into one.
    * the default value is None. If not None, n_groups must be None

  • aggregatedResultsPath (str) – Indicates path to which the aggregated results should be saved. If None, results are not saved.
    * the default value is None
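The constraints stated for these arguments can be sketched as a small pre-check. This is a minimal illustration, not part of the documented API: the helper name `check_grouping_arguments` is hypothetical, and it only encodes the rules written above (valid `grouping_mode` values, a strictly positive `n_groups`, and `n_groups` being None whenever `distance_threshold` is set).

```python
from typing import Optional

VALID_GROUPING_MODES = {"parameter_based", "string_based", "distance_based"}


def check_grouping_arguments(
    grouping_mode: str = "parameter_based",
    n_groups: Optional[int] = 3,
    distance_threshold: Optional[float] = None,
) -> None:
    """Hypothetical helper: raise ValueError if a documented rule is violated."""
    if grouping_mode not in VALID_GROUPING_MODES:
        raise ValueError(f"unknown grouping_mode: {grouping_mode!r}")
    if distance_threshold is not None:
        # Documented rule: if distance_threshold is not None, n_groups must be None.
        if n_groups is not None:
            raise ValueError("n_groups must be None when distance_threshold is set")
        return
    if grouping_mode == "string_based":
        # n_groups is irrelevant for string-based grouping, so it is not checked.
        return
    if not isinstance(n_groups, int) or n_groups <= 0:
        raise ValueError("n_groups must be a strictly positive int")


# The documented defaults pass the check.
check_grouping_arguments()
```

Running such a check before the call gives an early, readable error instead of a failure deep inside the grouping step.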

Additional keyword arguments that can be passed via kwargs:

Parameters:
  • geom_col_name (str) – The geometry column name in shapefile
    * the default value is ‘geometry’

  • geom_id_col_name (str) – The column in shapefile containing the geometry IDs
    * the default value is ‘index’

  • separator (str) –

    Relevant only if grouping_mode is ‘string_based’. The character or string in the region IDs that defines where the ID should be split.

    E.g., with region IDs [‘01_es’, ‘02_es’] and separator=‘_’, the IDs are split at ‘_’ and the last part (‘es’) is taken as the group ID


    * the default value is None

  • position (int/tuple) –

    Relevant only if grouping_mode is ‘string_based’. Used to define the position(s) in the region IDs where the split should happen. An int i means the part from 0 to i is taken as the group ID. A tuple (i, j) means the part from i to j is taken as the group ID.

    Note

    Either separator or position must be passed in order to perform string-based grouping


    * the default value is None
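The splitting rules for separator and position can be illustrated with a short sketch. It is not the code in grouping.py: the function name `group_id_from_region_id` is hypothetical, and the position bounds are assumed to follow Python slice semantics (end index exclusive), which the description above does not state.

```python
from typing import Dict, List, Optional, Tuple, Union


def group_id_from_region_id(
    region_id: str,
    separator: Optional[str] = None,
    position: Optional[Union[int, Tuple[int, int]]] = None,
) -> str:
    """Hypothetical helper: derive a group ID from a region ID."""
    if separator is not None:
        # Split at the separator and keep the last part.
        return region_id.split(separator)[-1]
    if isinstance(position, int):
        # Assumption: "0 to i" is read as the slice [0:i].
        return region_id[:position]
    if position is not None:
        start, end = position
        # Assumption: "i to j" is read as the slice [i:j].
        return region_id[start:end]
    raise ValueError("either separator or position must be passed")


region_ids = ["01_es", "02_es", "01_de"]
groups: Dict[str, List[str]] = {}
for region_id in region_ids:
    group_id = group_id_from_region_id(region_id, separator="_")
    groups.setdefault(group_id, []).append(region_id)

print(groups)  # {'es': ['01_es', '02_es'], 'de': ['01_de']}
```

With position=(3, 5), the same IDs yield the same groups under the slice assumption, since characters 3 and 4 hold the country code.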

  • weights (Dict) –

    Relevant only if grouping_mode is ‘parameter_based’. Through the weights dictionary, one can assign weights to variable-component pairs. These weights are applied when calculating the distance corresponding to each variable-component pair; any pair without a specified weight gets a weight of 1.

    It must be in one of the formats:

    • If you want to specify weights for particular variables and particular corresponding components:

      { ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : List[<variable_name>] }

    • If you want to specify weights for particular variables, but all corresponding components:

      { ‘components’ : {‘all’ : <weight>}, ‘variables’ : List[<variable_name>] }

    • If you want to specify weights for all variables, but particular corresponding components:

      { ‘components’ : Dict[<component_name>, <weight>], ‘variables’ : ‘all’ }

    <weight> can be of type int/float
    * the default value is None

  • aggregation_method (str, one of {'kmedoids_contiguity', 'hierarchical'}) –

    Relevant only if grouping_mode is ‘parameter_based’. The clustering method that should be used to group the regions. Options: ‘kmedoids_contiguity’, ‘hierarchical’.


    * the default value is ‘kmedoids_contiguity’

  • skip_regions (List[str]) –

    The region IDs to be skipped while aggregating regions

    Note

    currently only implemented for grouping_mode ‘distance_based’


    * the default value is None

  • enforced_groups (Dict[str, List[str]]) –

    The groups that should be enforced when aggregating regions.

    Note

    currently only implemented for grouping_mode ‘distance_based’


    * the default value is None

  • solver (str) – Relevant only if grouping_mode is ‘parameter_based’ and aggregation_method is ‘kmedoids_contiguity’. The optimization solver to be chosen.
    * the default value is ‘gurobi’

  • aggregation_function_dict (Dict[str, Tuple(str, None/str)]) –

    • Contains information regarding the mode of aggregation for each individual variable.

    • Possibilities: mean, weighted mean, sum, bool (boolean OR).

    • Format of the dictionary:

      {<variable_name>: (<mode_of_aggregation>, <weights>), <variable_name>: (<mode_of_aggregation>, None)}

      <weights> is required only if <mode_of_aggregation> is ‘weighted mean’. The name of the variable that should act as weights should be provided. Can be None otherwise.

    Note

    A default dictionary with the following modes is used. If aggregation_function_dict is passed, this default dictionary is updated with its entries. The default dictionary:

    {
        "operationRateMax": ("weighted mean", "capacityMax"),
        "operationRateFix": ("sum", None),
        "locationalEligibility": ("bool", None),
        "capacityMax": ("sum", None),
        "investPerCapacity": ("mean", None),
        "investIfBuilt": ("bool", None),
        "opexPerOperation": ("mean", None),
        "opexPerCapacity": ("mean", None),
        "opexIfBuilt": ("bool", None),
        "interestRate": ("mean", None),
        "economicLifetime": ("mean", None),
        "capacityFix": ("sum", None),
        "losses": ("mean", None),
        "distances": ("mean", None),
        "commodityCost": ("mean", None),
        "commodityRevenue": ("mean", None),
        "opexPerChargeOperation": ("mean", None),
        "opexPerDischargeOperation": ("mean", None),
        "QPcostScale": ("sum", None),
        "technicalLifetime": ("mean", None),
        "balanceLimit": ("sum", None),
        "pathwayBalanceLimit": ("sum", None)
    }


    * the default value is None
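How a user-supplied aggregation_function_dict could combine with the defaults is sketched here. It assumes that "updated" means a plain dictionary update in which user entries override defaults; `merge_aggregation_modes` is a hypothetical name, and only an excerpt of the default dictionary is reproduced.

```python
from typing import Dict, Optional, Tuple

Mode = Tuple[str, Optional[str]]

# Excerpt of the default dictionary listed above.
DEFAULT_MODES_EXCERPT: Dict[str, Mode] = {
    "operationRateMax": ("weighted mean", "capacityMax"),
    "capacityMax": ("sum", None),
    "investPerCapacity": ("mean", None),
    "locationalEligibility": ("bool", None),
}

ALLOWED_MODES = {"mean", "weighted mean", "sum", "bool"}


def merge_aggregation_modes(user_modes: Dict[str, Mode]) -> Dict[str, Mode]:
    """Hypothetical helper: validate user entries, then overlay them on the defaults."""
    for variable, (mode, weights) in user_modes.items():
        if mode not in ALLOWED_MODES:
            raise ValueError(f"{variable}: unknown aggregation mode {mode!r}")
        if mode == "weighted mean" and weights is None:
            # A weighting variable is required only for 'weighted mean'.
            raise ValueError(f"{variable}: 'weighted mean' needs a weights variable")
    merged = dict(DEFAULT_MODES_EXCERPT)
    merged.update(user_modes)  # assumption: user entries override defaults
    return merged


effective = merge_aggregation_modes(
    {"investPerCapacity": ("weighted mean", "capacityMax")}
)
```

Here only investPerCapacity changes; every variable not named by the user keeps its default mode.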

  • aggregated_shp_name (str) – Name to be given to the saved shapefiles after aggregation
    * the default value is ‘aggregated_regions’

  • crs (int) – Coordinate reference system (crs) in which to save the shapefiles
    * the default value is 3035

  • aggregated_xr_filename (str) – Name to be given to the saved netCDF file containing aggregated esM data
    * the default value is ‘aggregated_xr_dataset.nc’

Returns:

aggregated_xr_dataset - The xarray datasets holding aggregated data

Return type:

Dict[str, xr.Dataset]
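A call using string-based grouping might be assembled as below. Because this page does not show the import path of the manager module, the sketch takes the function as an argument instead of importing it. The wrapper name `aggregate_by_id_suffix` is hypothetical, as are any file names passed to it.

```python
from typing import Any, Callable, Dict, Optional


def aggregate_by_id_suffix(
    perform_spatial_aggregation: Callable[..., Dict[str, Any]],
    xr_datasets: Any,
    shapefile: Any,
    results_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical wrapper: group regions by the ID part after '_'."""
    return perform_spatial_aggregation(
        xr_datasets,
        shapefile,
        grouping_mode="string_based",  # n_groups is irrelevant in this mode
        aggregatedResultsPath=results_path,  # None: results are not saved
        separator="_",
        geom_col_name="geometry",
        geom_id_col_name="index",
    )
```

The return value is whatever the passed function returns; per the description above, that is a dictionary of xarray datasets holding the aggregated data.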