Action Modes#
Action modes define how you control the robot in RoboEval environments. Different action modes are suited for different use cases: data collection, policy learning, or model evaluation.
Available Action Modes#
RoboEval provides a single configurable action mode, JointPositionActionMode, which supports joint-space control and can optionally be configured for end-effector and floating-base control:
JointPositionActionMode#
Control the robot by specifying joint positions (angles).
from roboeval.action_modes import JointPositionActionMode
# Absolute positions (most common)
action_mode = JointPositionActionMode(
absolute=True,
floating_base=False
)
# Delta positions (incremental control)
action_mode = JointPositionActionMode(
absolute=False,
floating_base=False
)
Parameters:
absolute (bool, default=False): If True, actions specify absolute joint positions. If False, actions specify position deltas.
floating_base (bool, default=True): Enable control of the robot's base position (for mobile robots).
ee (bool, default=False): If True, control end-effector poses instead of joint positions.
block_until_reached (bool, default=False): Continue stepping until the target position is reached (or the maximum step count is exceeded).
floating_dofs (List[PelvisDof], optional): Which base DOFs to control. By default includes X, Y, and RZ (rotation around the Z-axis).
Use cases:
Replaying demonstrations
Teleoperation data collection
Position-based policies
Action Mode Details#
Absolute Joint Position#
Specify exact joint angles directly:
action_mode = JointPositionActionMode(absolute=True)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)
import numpy as np

# Action: target joint positions for all joints
action = np.array([
0.0, -0.7, 0.0, -2.4, 0.0, 1.7, 0.8, # Left arm
0.0, -0.7, 0.0, -2.4, 0.0, 1.7, 0.8, # Right arm
0.04, 0.04 # Grippers
])
obs, reward, terminated, truncated, info = env.step(action)
Action space: Box(low=-π, high=π, shape=(n_joints,))
Advantages:
Precise control
Easy to replay demonstrations
Stable for data collection
Disadvantages:
Requires knowledge of joint limits
May need inverse kinematics for end-effector control
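Since absolute mode requires knowing the joint limits, it is common to clamp targets before stepping. A minimal standalone sketch (pure NumPy; the limit values below are illustrative placeholders, not the real Panda limits, which come from the robot model):

```python
import numpy as np

# Illustrative joint limits for a 7-DoF arm (placeholder values)
JOINT_LOW = np.array([-2.9, -1.8, -2.9, -3.0, -2.9, -0.1, -2.9])
JOINT_HIGH = np.array([2.9, 1.8, 2.9, 0.1, 2.9, 3.8, 2.9])

def clamp_to_limits(target, low=JOINT_LOW, high=JOINT_HIGH):
    """Clip an absolute joint-position action into the valid range."""
    return np.clip(np.asarray(target, dtype=float), low, high)

target = np.array([0.0, -0.7, 0.0, -2.4, 0.0, 1.7, 0.8])
safe = clamp_to_limits(target)
```

Clamping keeps out-of-range targets from being rejected or producing abrupt motion at the limit.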
Delta Joint Position#
Specify incremental changes to joint positions:
action_mode = JointPositionActionMode(absolute=False)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)
# Action: change in joint positions
action = np.array([
0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, # Move first joint slightly
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0
])
Action space: Box(low=-0.1, high=0.1, shape=(n_joints,))
Advantages:
Easier for learning (smaller action space)
Smoother trajectories
Disadvantages:
Accumulated errors over time
Harder to replay exact demonstrations
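The accumulated-error problem can be seen in a small standalone sketch (pure NumPy, no RoboEval required): replaying recorded deltas with a tiny per-step execution error drifts away from the original trajectory, whereas an absolute-position replay would correct itself at every step.

```python
import numpy as np

rng = np.random.default_rng(0)

# A recorded trajectory for one joint (100 steps)
recorded = np.cumsum(rng.normal(0.0, 0.01, size=100))
deltas = np.diff(recorded, prepend=0.0)

# Replay the deltas with a small execution error at every step
exec_noise = rng.normal(0.0, 0.001, size=100)
replayed = np.cumsum(deltas + exec_noise)

# Per-step errors compound instead of being corrected
drift = np.abs(replayed - recorded)
```

This is why the best-practices section below recommends absolute mode for demonstration replay.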
End-Effector Control#
Control the robot by specifying end-effector poses (position + orientation):
action_mode = JointPositionActionMode(absolute=True, ee=True, floating_base=False)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)
# For bimanual robot: 12D action (6D per arm)
# [left_x, left_y, left_z, left_roll, left_pitch, left_yaw,
# right_x, right_y, right_z, right_roll, right_pitch, right_yaw]
action = np.array([
0.5, 0.3, 1.0, 0.0, 0.0, 0.0, # Left end-effector
0.5, -0.3, 1.0, 0.0, 0.0, 0.0, # Right end-effector
])
Action space (bimanual): Box(shape=(12,)) - 6D pose per arm (position xyz + euler xyz)
Action space (single-arm): Box(shape=(6,)) - 6D pose (position xyz + euler xyz)
Advantages:
Intuitive task-space control
Easier for teleoperation
No need for manual inverse kinematics
Disadvantages:
IK solver may fail for unreachable poses
Slower than direct joint control
May have singularities
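Building the 12D bimanual action by hand is error-prone, so a small packing helper can make the layout explicit. This is a hypothetical convenience function, not part of the RoboEval API; it only mirrors the ordering documented above:

```python
import numpy as np

def bimanual_ee_action(left_pos, left_euler, right_pos, right_euler):
    """Pack two 6D end-effector poses (position xyz + euler xyz) into
    the 12D layout documented above: left arm first, then right arm."""
    parts = [left_pos, left_euler, right_pos, right_euler]
    return np.concatenate([np.asarray(p, dtype=float) for p in parts])

action = bimanual_ee_action(
    left_pos=[0.5, 0.3, 1.0], left_euler=[0.0, 0.0, 0.0],
    right_pos=[0.5, -0.3, 1.0], right_euler=[0.0, 0.0, 0.0],
)
```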
Floating Base Control#
For robots with mobile bases, enable floating base control:
from roboeval.action_modes import PelvisDof
action_mode = JointPositionActionMode(
absolute=True,
floating_base=True,
floating_dofs=[PelvisDof.X, PelvisDof.Y, PelvisDof.RZ] # Planar movement
)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)
Available DOFs:
PelvisDof.X - Forward/backward
PelvisDof.Y - Left/right
PelvisDof.Z - Up/down
PelvisDof.RZ - Rotation around Z-axis (yaw)
Note: The floating base DOFs use the PelvisDof enum, not strings.
Action space:
# With floating_dofs=[PelvisDof.X, PelvisDof.Y, PelvisDof.RZ]
action_space = Box(shape=(n_floating_dofs + n_joints,))
# First N elements: floating base DOFs
# Remaining: joint positions or gripper positions
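The layout above can be made concrete with a small standalone helper (pure NumPy; `assemble_action` is a hypothetical illustration of the ordering, not a RoboEval function):

```python
import numpy as np

def assemble_action(base_targets, joint_targets):
    """Build a flat action following the layout above:
    floating-base DOF targets first, then joint/gripper targets."""
    return np.concatenate([np.asarray(base_targets, dtype=float),
                           np.asarray(joint_targets, dtype=float)])

# X, Y, RZ base targets plus 16 joint/gripper targets (BimanualPanda)
action = assemble_action([0.5, 0.0, np.pi / 2], np.zeros(16))
```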
Choosing an Action Mode#
For Replaying Demonstrations:
action_mode = JointPositionActionMode(
absolute=True,
floating_base=True, # If demos include base movement
floating_dofs=[] # Fill in with the DOFs used during demo collection
)
For Data Collection:
# Keyboard teleoperation
action_mode = JointPositionActionMode(
absolute=False, # Incremental control
floating_base=True,
floating_dofs=[PelvisDof.X, PelvisDof.Y, PelvisDof.RZ]
)
# VR teleoperation
action_mode = JointPositionActionMode(
absolute=True, # Direct IK control
floating_base=True
)
For Policy Learning:
# Behavior cloning from demos
action_mode = JointPositionActionMode(
absolute=True # Match demo format
)
# Reinforcement learning
action_mode = JointPositionActionMode(
absolute=False # Easier action space
)
Action Normalization#
Actions are automatically normalized/denormalized:
# Your policy outputs actions in [-1, 1]
normalized_action = np.array([0.5, -0.3, 0.8, ...])
# Environment automatically scales to valid range
obs, reward, terminated, truncated, info = env.step(normalized_action)
To access raw (unnormalized) action space:
# Normalized (default, for learning)
env.action_space # Box(low=-1, high=1, ...)
# Unnormalized (actual joint ranges)
env.action_mode.get_action_space() # Box(low=-π, high=π, ...)
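The mapping between the two spaces is the usual affine rescaling. A standalone sketch of that transform (an assumption about RoboEval's internals; treat `env.action_mode` as the authoritative source):

```python
import numpy as np

def denormalize(action, low, high):
    """Map a normalized action in [-1, 1] to the raw range [low, high]."""
    action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    return low + (action + 1.0) * 0.5 * (high - low)

def normalize(raw, low, high):
    """Inverse mapping: raw values back to [-1, 1]."""
    return 2.0 * (np.asarray(raw, dtype=float) - low) / (high - low) - 1.0

low, high = -np.pi, np.pi
mid = denormalize(0.0, low, high)   # center of the joint range
```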
Gripper Control#
Grippers are controlled as part of the action vector:
# For BimanualPanda (14 arm joints + 2 gripper joints)
action = np.zeros(16)
# Arm joints
action[:14] = arm_positions
# Gripper control (last 2 elements)
action[14] = 0.04 # Left gripper (open)
action[15] = 0.04 # Right gripper (open)
# To close grippers
action[14] = 0.0 # Left gripper (closed)
action[15] = 0.0 # Right gripper (closed)
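Hard-coding the gripper indices in every script invites off-by-one bugs, so a tiny helper can keep them in one place. This is a hypothetical convenience function using the slot indices and open/closed values from the example above:

```python
import numpy as np

GRIPPER_OPEN, GRIPPER_CLOSED = 0.04, 0.0  # values from the example above

def set_grippers(action, left_open, right_open):
    """Return a copy of a 16D BimanualPanda action with both gripper
    slots (indices 14 and 15) set to open or closed."""
    out = np.asarray(action, dtype=float).copy()
    out[14] = GRIPPER_OPEN if left_open else GRIPPER_CLOSED
    out[15] = GRIPPER_OPEN if right_open else GRIPPER_CLOSED
    return out

a = set_grippers(np.zeros(16), left_open=True, right_open=False)
```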
Common Patterns#
Smooth Trajectories with Delta Mode:
action_mode = JointPositionActionMode(absolute=False)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)
obs, info = env.reset()
# Move smoothly to target
target = np.array([...]) # Target joint positions
# Extract current positions from proprioception (qpos is first half)
n_joints = len(env.robot.qpos)
current_pos = obs["proprioception"][:n_joints]
for step in range(100):
delta = (target - current_pos) * 0.1 # 10% of distance
delta = np.clip(delta, -0.05, 0.05) # Limit speed
obs, reward, terminated, truncated, info = env.step(delta)
current_pos = obs["proprioception"][:n_joints]
Switching Between Action Modes:
# Collect data with one mode
collect_mode = JointPositionActionMode(absolute=True, floating_base=True)
env_collect = LiftPot(action_mode=collect_mode, robot_cls=BimanualPanda)
# Train with another mode
train_mode = JointPositionActionMode(absolute=False)
env_train = LiftPot(action_mode=train_mode, robot_cls=BimanualPanda)
Action Masking for Subsets of Joints:
# Control only one arm (env created as in the examples above)
obs, info = env.reset()
action = env.action_space.sample()
# Get current joint positions from observation
n_joints = len(env.robot.qpos)
current_qpos = obs["proprioception"][:n_joints]
action[:7] = desired_left_arm_pos # New targets for the left arm
action[7:14] = current_qpos[7:14] # Keep right arm fixed
Best Practices#
Use absolute mode for replaying demonstrations
Use delta mode for smoother learning policies
Match control frequency to action mode (20 Hz for demos, higher for RL)
Enable floating base only if needed (adds action dimensions)
Clip actions when using delta mode to prevent large jumps
Normalize actions to [-1, 1] for neural network policies
Troubleshooting#
Action space mismatch:
# Check action space
print(env.action_space)
# Ensure action matches
action = env.action_space.sample()
Robot moves erratically:
Reduce control frequency
Use smaller deltas in delta mode
Check action normalization
Cannot reproduce demonstrations:
Verify action mode matches demo collection settings
Check floating_base configuration
Ensure control_frequency matches
Next Steps#
Configure Observations for your task
Learn about Data Collection with different action modes
Explore Demonstrations replay and conversion
See Creating Custom Tasks for implementing custom action modes