hdeeprm.entrypoints.HDeepRMWorkloadManager module

The HDeepRM workload manager evaluates deep reinforcement learning policies within the framework.

class hdeeprm.entrypoints.HDeepRMWorkloadManager.HDeepRMWorkloadManager(options: dict)

Bases: hdeeprm.entrypoints.BaseWorkloadManager.BaseWorkloadManager

Entrypoint for Deep Reinforcement Learning experimentation.

This Workload Manager generates the HDeepRM Environment and passes a reference to it to the Agent, which is in charge of making the decisions. It also orchestrates the simulation flow by handling Batsim events and calling the Agent when necessary. It extends the BaseWorkloadManager for basic event handling.

env

The Workload Management Environment. The Agent observes and interacts with it.

Type: Environment
agent

The Agent in charge of making decisions by observing and altering the Environment.

Type: Agent
optimizer

Optimizer for updating the Agent’s inner model weights at the end of the simulation.

Type: Optimizer
step

Current decision step.

Type: int
flow_flags

Control the event flow. Fields:

jobs_submitted (bool) - Becomes True when at least one job has been submitted.
jobs_completed (bool) - Becomes True when at least one job has been completed.
action_taken (bool) - Becomes True when an action has been taken by the Agent. This triggers the reward procedure.
void_taken (bool) - Becomes True when a void action has been selected.
Type: dict
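
As a rough illustration of how these flags behave, the sketch below builds a dictionary with the four documented fields and flips two of them the way the job handlers are described to do. The initial values and the helper function names are assumptions made for this example; they are not taken from the HDeepRM source.

```python
# Stand-in for the flow_flags attribute. The field names come from the
# documentation; starting every flag at False is an assumption.
flow_flags = {
    'jobs_submitted': False,
    'jobs_completed': False,
    'action_taken': False,
    'void_taken': False,
}


def on_job_submission(flags: dict) -> None:
    """Hypothetical helper mirroring the documented effect of onJobSubmission()."""
    flags['jobs_submitted'] = True


def on_job_completion(flags: dict) -> None:
    """Hypothetical helper mirroring the documented effect of onJobCompletion()."""
    flags['jobs_completed'] = True


# A job arrives but nothing has finished yet.
on_job_submission(flow_flags)
```

After this call only jobs_submitted is True; the remaining flags keep their initial values until the corresponding event or Agent action occurs.
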
create_agent(agent_options: dict, seed: int) → tuple

Generates the Agent based on the agent options.

The agent class is obtained from the user-provided file. It is instantiated according to its parent class. Previously saved models may be loaded if the user requests it on the command line.

Parameters:
  • agent_options (dict) – options for creating the Agent. User-provided.
  • seed (int) – random seed for the torch library, for reproducibility when evaluating.
Returns:

A tuple with the created Agent and the optimizer in case of training.
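
One way to obtain a class from a user-provided file is through the standard importlib machinery. The sketch below is only an illustration of that idea, not the actual create_agent() implementation: the load_class_from_file helper, the 'file' option key, and the MyAgent class are all hypothetical. Seeding and optimizer creation are left out to keep the example free of external dependencies.

```python
import importlib.util
import inspect
import os
import tempfile


def load_class_from_file(path: str) -> type:
    """Import the module at `path` and return the first class it defines."""
    spec = importlib.util.spec_from_file_location('user_agent', path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    for _, obj in inspect.getmembers(module, inspect.isclass):
        # Skip classes that the user file merely imported from elsewhere.
        if obj.__module__ == module.__name__:
            return obj
    raise ValueError(f'no class defined in {path}')


# Write a throwaway agent file so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    agent_file = os.path.join(tmp, 'my_agent.py')
    with open(agent_file, 'w') as handle:
        handle.write('class MyAgent:\n    pass\n')
    agent_options = {'file': agent_file}  # hypothetical option key
    agent_class = load_class_from_file(agent_options['file'])
    agent = agent_class()
```

In a real run the class would presumably be instantiated with whatever arguments its parent class expects, which this sketch does not attempt to reproduce.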

onJobCompletion(job: batsim.batsim.Job) → None

Set the “jobs_completed” flag to True.

See the base onJobCompletion() for further details on this handler.

onJobSubmission(job: batsim.batsim.Job) → None

Set the “jobs_submitted” flag to True.

See the base onJobSubmission() for further details on this handler.

onNoMoreEvents() → None

When there are no more events in the current time step, the following flow occurs:

  1. The Agent observes the Environment, obtaining an approximation of the state.
  2. The Agent processes this observation through its inner model and decides which action to take.
  3. The Agent alters the Environment based on the selected action.
  4. In the next decision step, the Agent will be rewarded for its action.
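
The four steps above can be sketched with toy stand-ins. Everything in this example is hypothetical: ToyEnvironment, ToyAgent, their methods, the reward definition, and the on_no_more_events function are invented for illustration and do not reflect the real HDeepRM classes. The sketch only shows the ordering, in particular that the reward for an action is collected at the start of the following decision step.

```python
class ToyEnvironment:
    """Hypothetical stand-in for the HDeepRM Environment."""

    def __init__(self) -> None:
        self.pending_jobs = 3

    def observe(self) -> list:
        # Step 1: expose an approximation of the state.
        return [self.pending_jobs]

    def apply(self, action: int) -> None:
        # Step 3: action 1 schedules one job, action 0 is a void action.
        if action == 1 and self.pending_jobs > 0:
            self.pending_jobs -= 1

    def reward(self) -> float:
        # Assumed objective: fewer pending jobs means a higher reward.
        return -float(self.pending_jobs)


class ToyAgent:
    """Hypothetical stand-in for the Agent, with a fixed rule as its model."""

    def __init__(self) -> None:
        self.rewards = []

    def decide(self, observation: list) -> int:
        # Step 2: schedule while jobs are pending, otherwise do nothing.
        return 1 if observation[0] > 0 else 0


def on_no_more_events(env: ToyEnvironment, agent: ToyAgent, flags: dict) -> None:
    if flags['action_taken']:
        # Step 4: reward the action taken in the previous decision step.
        agent.rewards.append(env.reward())
    observation = env.observe()
    action = agent.decide(observation)
    env.apply(action)
    flags['action_taken'] = True
    flags['void_taken'] = action == 0


env = ToyEnvironment()
agent = ToyAgent()
flags = {'action_taken': False, 'void_taken': False}
for _ in range(3):
    on_no_more_events(env, agent, flags)
```

Three decision steps yield only two rewards, because the first step has no earlier action to reward.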
onSimulationEnds() → None

Handler triggered when the simulation has ended.

Triggered when receiving a SIMULATION_ENDS event. If the Agent was evaluated in training mode, the loss is calculated to update its inner model weights. The updated model is saved if the user has requested it on the command line. Rewards are also logged so that performance can be observed.