Action Modes#

Action modes define how you control the robot in RoboEval environments. Different action modes are suited for different use cases: data collection, policy learning, or model evaluation.

Available Action Modes#

RoboEval provides joint position control for the robot:

JointPositionActionMode#

Control the robot by specifying joint positions (angles).

from roboeval.action_modes import JointPositionActionMode

# Absolute positions (most common)
action_mode = JointPositionActionMode(
    absolute=True,
    floating_base=False
)

# Delta positions (incremental control)
action_mode = JointPositionActionMode(
    absolute=False,
    floating_base=False
)

Parameters:

absolutebool, default=False: If True, actions specify absolute joint positions. If False, actions specify position deltas.
floating_basebool, default=True: Enable control of the robot’s base position (for mobile robots).
eebool, default=False: If True, control end-effector poses instead of joint positions.
block_until_reachedbool, default=False: Continue stepping until target position is reached (or max steps exceeded).
floating_dofsList[PelvisDof], optional: Which base DOFs to control. By default includes X, Y, and RZ (rotation around Z-axis).

Use cases:

Replaying demonstrations
Teleoperation data collection
Position-based policies

Action Mode Details#

Absolute Joint Position#

Specify exact joint angles directly:

action_mode = JointPositionActionMode(absolute=True)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)

# Action: target joint positions for all joints
action = np.array([
    0.0, -0.7, 0.0, -2.4, 0.0, 1.7, 0.8,  # Left arm
    0.0, -0.7, 0.0, -2.4, 0.0, 1.7, 0.8,  # Right arm
    0.04, 0.04  # Grippers
])

obs, reward, terminated, truncated, info = env.step(action)

Action space: Box(low=-π, high=π, shape=(n_joints,))

Advantages:

Precise control
Easy to replay demonstrations
Stable for data collection

Disadvantages:

Requires knowledge of joint limits
May need inverse kinematics for end-effector control

Delta Joint Position#

Specify incremental changes to joint positions:

action_mode = JointPositionActionMode(absolute=False)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)

# Action: change in joint positions
action = np.array([
    0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,  # Move first joint slightly
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0
])

Action space: Box(low=-0.1, high=0.1, shape=(n_joints,))

Advantages:

Easier for learning (smaller action space)
Smoother trajectories

Disadvantages:

Accumulated errors over time
Harder to replay exact demonstrations

End-Effector Control#

Control the robot by specifying end-effector poses (position + orientation):

action_mode = JointPositionActionMode(absolute=True, ee=True, floating_base=False)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)

# For bimanual robot: 12D action (6D per arm)
# [left_x, left_y, left_z, left_roll, left_pitch, left_yaw,
#  right_x, right_y, right_z, right_roll, right_pitch, right_yaw]
action = np.array([
    0.5, 0.3, 1.0, 0.0, 0.0, 0.0,  # Left end-effector
    0.5, -0.3, 1.0, 0.0, 0.0, 0.0,  # Right end-effector
])

Action space (bimanual): Box(shape=(12,)) - 6D pose per arm (position xyz + euler xyz)

Action space (single-arm): Box(shape=(6,)) - 6D pose (position xyz + euler xyz)

Advantages:

Intuitive task-space control
Easier for teleoperation
No need for manual inverse kinematics

Disadvantages:

IK solver may fail for unreachable poses
Slower than direct joint control
May have singularities

Floating Base Control#

For robots with mobile bases, enable floating base control:

from roboeval.action_modes import PelvisDof

action_mode = JointPositionActionMode(
    absolute=True,
    floating_base=True,
    floating_dofs=[PelvisDof.X, PelvisDof.Y, PelvisDof.RZ]  # Planar movement
)

env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)

Available DOFs:

PelvisDof.X - Forward/backward
PelvisDof.Y - Left/right
PelvisDof.Z - Up/down
PelvisDof.RZ - Rotation around Z-axis (yaw)

Note: The floating base DOFs use the PelvisDof enum, not strings.

Action space:

# With floating_dofs=[PelvisDof.X, PelvisDof.Y, PelvisDof.RZ]
action_space = Box(shape=(n_floating_dofs + n_joints,))

# First N elements: floating base DOFs
# Remaining: joint positions or gripper positions

Choosing an Action Mode#

For Replaying Demonstrations:

action_mode = JointPositionActionMode(
    absolute=True,
    floating_base=True,  # If demos include base movement
    floating_dofs=[]  # Match demo configuration
)

For Data Collection:

# Keyboard teleoperation
action_mode = JointPositionActionMode(
    absolute=False,  # Incremental control
    floating_base=True,
    floating_dofs=[PelvisDof.X, PelvisDof.Y, PelvisDof.RZ]
)

# VR teleoperation
action_mode = JointPositionActionMode(
    absolute=True,  # Direct IK control
    floating_base=True
)

For Policy Learning:

# Behavior cloning from demos
action_mode = JointPositionActionMode(
    absolute=True  # Match demo format
)

# Reinforcement learning
action_mode = JointPositionActionMode(
    absolute=False  # Easier action space
)

Action Normalization#

Actions are automatically normalized/denormalized:

# Your policy outputs actions in [-1, 1]
normalized_action = np.array([0.5, -0.3, 0.8, ...])

# Environment automatically scales to valid range
obs, reward, terminated, truncated, info = env.step(normalized_action)

To access raw (unnormalized) action space:

# Normalized (default, for learning)
env.action_space  # Box(low=-1, high=1, ...)

# Unnormalized (actual joint ranges)
env.action_mode.get_action_space()  # Box(low=-π, high=π, ...)

Gripper Control#

Grippers are controlled as part of the action vector:

# For BimanualPanda (14 arm joints + 2 gripper joints)
action = np.zeros(16)

# Arm joints
action[:14] = arm_positions

# Gripper control (last 2 elements)
action[14] = 0.04  # Left gripper (open)
action[15] = 0.04  # Right gripper (open)

# To close grippers
action[14] = 0.0  # Left gripper (closed)
action[15] = 0.0  # Right gripper (closed)

Common Patterns#

Smooth Trajectories with Delta Mode:

action_mode = JointPositionActionMode(absolute=False)
env = LiftPot(action_mode=action_mode, robot_cls=BimanualPanda)

obs, info = env.reset()

# Move smoothly to target
target = np.array([...])  # Target joint positions
# Extract current positions from proprioception (qpos is first half)
n_joints = len(env.robot.qpos)
current_pos = obs["proprioception"][:n_joints]

for step in range(100):
    delta = (target - current_pos) * 0.1  # 10% of distance
    delta = np.clip(delta, -0.05, 0.05)  # Limit speed

    obs, reward, terminated, truncated, info = env.step(delta)
    current_pos = obs["proprioception"][:n_joints]

Switching Between Action Modes:

# Collect data with one mode
collect_mode = JointPositionActionMode(absolute=True, floating_base=True)
env_collect = LiftPot(action_mode=collect_mode, robot_cls=BimanualPanda)

# Train with another mode
train_mode = JointPositionActionMode(absolute=False)
env_train = LiftPot(action_mode=train_mode, robot_cls=BimanualPanda)

Action Masking for Subsets of Joints:

# Control only one arm
action = env.action_space.sample()
# Get current joint positions from observation
n_joints = len(env.robot.qpos)
current_qpos = obs["proprioception"][:n_joints]

action[:7] = desired_left_arm_pos
action[7:14] = current_qpos[7:14]  # Keep right arm fixed

Best Practices#

Use absolute mode for replaying demonstrations
Use delta mode for smoother learning policies
Match control frequency to action mode (20 Hz for demos, higher for RL)
Enable floating base only if needed (adds action dimensions)
Clip actions when using delta mode to prevent large jumps
Normalize actions to [-1, 1] for neural network policies

Troubleshooting#

Action space mismatch:

# Check action space
print(env.action_space)

# Ensure action matches
action = env.action_space.sample()

Robot moves erratically:

Reduce control frequency
Use smaller deltas in delta mode
Check action normalization

Cannot reproduce demonstrations:

Verify action mode matches demo collection settings
Check floating_base configuration
Ensure control_frequency matches

Next Steps#

Configure Observations for your task
Learn about Data Collection with different action modes
Explore Demonstrations replay and conversion
See Creating Custom Tasks for implementing custom action modes

Action Modes

Contents

Action Modes#

Available Action Modes#

JointPositionActionMode#

Action Mode Details#

Absolute Joint Position#

Delta Joint Position#

End-Effector Control#

Floating Base Control#

Choosing an Action Mode#

Action Normalization#

Gripper Control#

Common Patterns#

Best Practices#

Troubleshooting#

Next Steps#