visualizers.viser_app.controllers.sampling.mppi
Module Contents
- class visualizers.viser_app.controllers.sampling.mppi.MPPIConfig
Bases: jacta.visualizers.viser_app.controllers.sampling_base.SamplingBaseConfig
Configuration for the MPPI planner.
- sigma: float = 0.05
- temperature: float = 1.0
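Under the standard MPPI formulation (an assumption about this implementation; the page itself does not spell out the update rule), sigma is the standard deviation of the Gaussian noise used to perturb the nominal control sequence, and temperature is the λ that scales the exponential weighting of sampled rollouts:

```latex
% Standard MPPI reweighting: R_i is the total reward of rollout i,
% \lambda the temperature, U the nominal control sequence, and
% \epsilon_i \sim \mathcal{N}(0, \sigma^2 I) the sampled perturbations.
w_i = \frac{\exp(R_i / \lambda)}{\sum_j \exp(R_j / \lambda)},
\qquad
U \leftarrow \sum_i w_i \,(U + \epsilon_i)
```

Larger temperature values flatten the weights toward a uniform average over rollouts; smaller values concentrate the update on the highest-reward rollout.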
- class visualizers.viser_app.controllers.sampling.mppi.MPPI(task: jacta.visualizers.viser_app.tasks.task.Task, config: MPPIConfig, reward_config: jacta.visualizers.viser_app.tasks.task.TaskConfig)
Bases: jacta.visualizers.viser_app.controllers.sampling_base.SamplingBase
MPPI (model predictive path integral) planner.
- Parameters:
task (jacta.visualizers.viser_app.tasks.task.Task) – task describing the system being controlled; provides the MuJoCo model of the system, its current configuration data, and the reward function mapping batches of states/controls to batches of rewards.
config (MPPIConfig) – configuration object with hyperparameters for the planner.
reward_config (jacta.visualizers.viser_app.tasks.task.TaskConfig) – configuration with parameters used by the reward function.
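A hypothetical construction sketch: the `task` and `reward_config` objects are a Task/TaskConfig pair provided elsewhere by your application, and the keyword-style `MPPIConfig` construction assumes a dataclass-like config, which this page does not confirm.

```python
from jacta.visualizers.viser_app.controllers.sampling.mppi import MPPI, MPPIConfig

# `task` and `reward_config` are placeholders for a Task/TaskConfig pair
# defined by your application.
config = MPPIConfig(sigma=0.05, temperature=1.0)  # assumes dataclass-style kwargs
planner = MPPI(task=task, config=config, reward_config=reward_config)
```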
- update_action(curr_state: numpy.ndarray, curr_time: float, additional_info: dict[str, Any]) → None
Performs rollouts and reward computation from the current state, updating the planner's action sequence.
- Parameters:
curr_state (numpy.ndarray) – current state of the system.
curr_time (float) – current time.
additional_info (dict[str, Any]) – auxiliary information for the planner.
- Return type:
None
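For intuition, here is a minimal NumPy sketch of what one sampling-and-reweighting step of this kind typically looks like. It is a generic MPPI step under the standard formulation, not this class's actual implementation; `mppi_step`, `rollout_fn`, `reward_fn`, and all names inside are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def mppi_step(nominal_controls, rollout_fn, reward_fn, num_samples=64,
              sigma=0.05, temperature=1.0):
    """One generic MPPI sampling + reweighting step over a control plan.

    nominal_controls: (horizon, action_dim) current best control sequence.
    rollout_fn: maps a batch of control sequences to a batch of state trajectories.
    reward_fn: maps the states/controls batch to a (num_samples,) reward vector.
    """
    # Perturb the nominal plan with Gaussian noise of std `sigma`.
    noise = rng.normal(0.0, sigma, size=(num_samples,) + nominal_controls.shape)
    candidates = nominal_controls[None] + noise       # (num_samples, horizon, action_dim)
    states = rollout_fn(candidates)                   # batched rollouts
    rewards = reward_fn(states, candidates)           # (num_samples,)
    # Exponentially weighted average; subtracting the max keeps exp() stable.
    weights = np.exp((rewards - rewards.max()) / temperature)
    weights /= weights.sum()
    return np.tensordot(weights, candidates, axes=1)  # new nominal plan
```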
- action(time: float) → numpy.ndarray
Current best action of the policy at the given time.
- Parameters:
time (float) – time at which to query the action.
- Return type:
numpy.ndarray
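A hypothetical usage loop tying the two methods together: `planner` is the instance constructed above, and the initial state, simulator step, time step, and loop length are placeholders, not part of this API.

```python
import numpy as np

t, dt = 0.0, 0.01                       # placeholder time step
state = np.asarray(initial_state)       # placeholder: state from your simulator
for _ in range(100):
    planner.update_action(state, t, additional_info={})
    u = planner.action(t)               # numpy.ndarray: best action at time t
    # state = simulator.step(state, u)  # placeholder: advance the system
    t += dt
```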