hdeeprm.entrypoints.HDeepRMWorkloadManager module
The HDeepRM workload manager is able to evaluate deep reinforcement learning policies in the framework.
-
class hdeeprm.entrypoints.HDeepRMWorkloadManager.HDeepRMWorkloadManager(options: dict)
Bases: hdeeprm.entrypoints.BaseWorkloadManager.BaseWorkloadManager
Entrypoint for Deep Reinforcement Learning experimentation.
This Workload Manager generates the HDeepRM Environment and provides a reference to it to the Agent, which is in charge of making the decisions. It also orchestrates the simulation flow by handling Batsim events and calling the Agent when necessary. It extends the BaseWorkloadManager for basic event handling.
-
env
The Workload Management Environment. The Agent observes and interacts with it.
Type: Environment
-
agent
The Agent in charge of making decisions by observing and altering the Environment.
Type: Agent
-
optimizer
Optimizer for updating the Agent’s inner model weights at the end of the simulation.
Type: Optimizer
-
flow_flags
Control the event flow. Fields:
- jobs_submitted (bool): Becomes True when at least one job has been submitted.
- jobs_completed (bool): Becomes True when at least one job has been completed.
- action_taken (bool): Becomes True when an action has been taken by the Agent. This triggers the reward procedure.
- void_taken (bool): Becomes True when a void action has been selected.
Type: dict
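As an illustration of how the fields above might be used, the following minimal sketch keeps the four flags in a plain dict. The helper names new_flow_flags and should_reward are hypothetical and are not part of HDeepRM. The initial value of False for every flag is an assumption inferred from the "Becomes True" wording.

```python
# Hypothetical sketch, not the HDeepRM implementation.
def new_flow_flags() -> dict:
    """Return the four documented flags, all assumed to start as False."""
    return {
        "jobs_submitted": False,
        "jobs_completed": False,
        "action_taken": False,
        "void_taken": False,
    }


def should_reward(flags: dict) -> bool:
    """Assumed rule: the reward procedure runs only after an action was taken."""
    return flags["action_taken"]


flags = new_flow_flags()
flags["jobs_submitted"] = True    # e.g. set by a job submission handler
reward_before = should_reward(flags)
flags["action_taken"] = True      # e.g. set once the Agent has acted
reward_after = should_reward(flags)
```

Under these assumptions, reward_before is False and reward_after is True.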
-
create_agent(agent_options: dict, seed: int) → tuple
Generates the Agent based on the agent options.
The Agent class is obtained from the user-provided file and instantiated according to its parent class. Previously saved models may be loaded if the user requests it on the command line.
Parameters: agent_options (dict), seed (int)
Returns: A tuple with the created Agent and, when training, the optimizer.
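Obtaining a class from a user-provided file can be sketched with the standard importlib machinery. This is only an illustration under stated assumptions: the option keys "file", "class_name" and "training", the helper names, and the placeholder optimizer are all hypothetical. The parent-class-dependent instantiation and the loading of saved models are omitted.

```python
import importlib.util
import inspect
import random
import tempfile
from pathlib import Path


def load_agent_class(file_path: str, class_name: str) -> type:
    """Import a Python file by path and return the class called class_name."""
    spec = importlib.util.spec_from_file_location("user_agent", file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    agent_class = getattr(module, class_name)
    if not inspect.isclass(agent_class):
        raise TypeError(f"{class_name} is not a class")
    return agent_class


def create_agent_sketch(agent_options: dict, seed: int) -> tuple:
    """Hypothetical analogue of create_agent; the option keys are assumptions."""
    random.seed(seed)  # seed the RNG so that runs are repeatable
    agent_class = load_agent_class(agent_options["file"], agent_options["class_name"])
    agent = agent_class()
    # A real optimizer would wrap the model weights; a string stands in for it here.
    optimizer = "optimizer-placeholder" if agent_options.get("training") else None
    return agent, optimizer


# Usage: write a tiny agent file to a temporary directory, then load it.
with tempfile.TemporaryDirectory() as tmp:
    agent_file = Path(tmp) / "my_agent.py"
    agent_file.write_text("class MyAgent:\n    name = 'demo'\n")
    agent, optimizer = create_agent_sketch(
        {"file": str(agent_file), "class_name": "MyAgent", "training": True},
        seed=0,
    )
```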
-
onJobCompletion(job: batsim.batsim.Job) → None
Set the “jobs_completed” flag to True. Further details on this handler are given in the base onJobCompletion().
-
onJobSubmission(job: batsim.batsim.Job) → None
Set the “jobs_submitted” flag to True. Further details on this handler are given in the base onJobSubmission().
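The two job handlers can be pictured as thin overrides that set a flag and then defer to the base class. The sketch below assumes this delegation, which the text suggests but does not confirm. BaseManagerSketch and ManagerSketch are hypothetical stand-ins, and jobs are plain strings rather than batsim.batsim.Job objects.

```python
class BaseManagerSketch:
    """Hypothetical stand-in for the base workload manager."""

    def __init__(self) -> None:
        self.pending = []
        self.finished = []

    def onJobSubmission(self, job: str) -> None:
        self.pending.append(job)

    def onJobCompletion(self, job: str) -> None:
        self.pending.remove(job)
        self.finished.append(job)


class ManagerSketch(BaseManagerSketch):
    """Sets the flow flag, then delegates to the base handler (assumed)."""

    def __init__(self) -> None:
        super().__init__()
        self.flow_flags = {"jobs_submitted": False, "jobs_completed": False}

    def onJobSubmission(self, job: str) -> None:
        self.flow_flags["jobs_submitted"] = True
        super().onJobSubmission(job)

    def onJobCompletion(self, job: str) -> None:
        self.flow_flags["jobs_completed"] = True
        super().onJobCompletion(job)


manager = ManagerSketch()
manager.onJobSubmission("job-1")
flags_after_submission = dict(manager.flow_flags)  # snapshot before completion
manager.onJobCompletion("job-1")
```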
-
onNoMoreEvents() → None
When there are no more events in the current time step, the following flow occurs:
- The Agent observes the Environment, obtaining an approximation of the state.
- The Agent processes this observation through its inner model and decides which action to take.
- The Agent alters the Environment based on the selected action.
- In the next decision step, the Agent will be rewarded for its action.
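The four steps above can be sketched as a single decision step that is called once per time step. Everything here is hypothetical: EnvSketch, AgentSketch, the shortest-job-first rule that stands in for the inner model, and the reward defined as the negative queue length are assumptions chosen only to make the flow concrete.

```python
class EnvSketch:
    """Hypothetical environment: a queue of job run times."""

    def __init__(self) -> None:
        self.queue = [5, 3, 8]

    def observation(self) -> list:
        return list(self.queue)

    def alter(self, action) -> None:
        if action is not None:       # None stands for a void action
            self.queue.pop(action)

    def reward(self) -> int:
        return -len(self.queue)      # assumed reward: fewer waiting jobs is better


class AgentSketch:
    """Hypothetical agent: picks the shortest job instead of using a model."""

    def __init__(self) -> None:
        self.rewards = []

    def decide(self, observation: list):
        if not observation:
            return None
        return min(range(len(observation)), key=observation.__getitem__)


def decision_step(env: EnvSketch, agent: AgentSketch, flow_flags: dict) -> None:
    # Reward the action taken in the previous decision step, if any.
    if flow_flags["action_taken"]:
        agent.rewards.append(env.reward())
    observation = env.observation()       # observe
    action = agent.decide(observation)    # decide
    env.alter(action)                     # alter
    flow_flags["action_taken"] = True
    flow_flags["void_taken"] = action is None


env, agent = EnvSketch(), AgentSketch()
flow_flags = {"action_taken": False, "void_taken": False}
decision_step(env, agent, flow_flags)  # schedules the job with run time 3
decision_step(env, agent, flow_flags)  # rewards step one, then schedules 5
```

In this sketch the reward for an action is only recorded at the start of the following step, mirroring the last item of the list.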
-
onSimulationEnds() → None
Handler triggered when the simulation has ended.
Triggered when receiving a SIMULATION_ENDS event. If the Agent has been evaluated in training mode, the loss is calculated to update its inner model weights. The updated model is saved if the user has requested it on the command line. Rewards are also logged to track performance.
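The text does not say which loss is used. As one possible illustration, the sketch below computes a REINFORCE-style policy gradient loss from stored log-probabilities and rewards. The choice of this loss, the discount factor gamma, and the function names are assumptions and do not describe the actual HDeepRM code. The weight update and the saving of the model are omitted.

```python
import math


def discounted_returns(rewards: list, gamma: float) -> list:
    """Return G_t = r_t + gamma * G_{t+1} for every step t."""
    returns = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def policy_gradient_loss(log_probs: list, rewards: list, gamma: float = 0.99) -> float:
    """Assumed REINFORCE-style loss: minus the sum of log-prob times return."""
    returns = discounted_returns(rewards, gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Usage with three decision steps, each action chosen with probability 0.5.
rewards = [1.0, 0.0, 2.0]
log_probs = [math.log(0.5)] * 3
returns = discounted_returns(rewards, gamma=0.5)
loss = policy_gradient_loss(log_probs, rewards, gamma=0.5)
```

In an actual training setup the loss would typically be a differentiable tensor handed to the optimizer. Plain floats are used here to keep the example free of dependencies.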
-