ALPHA-α and Bi-ACT Are All You Need: Importance of Position and Force Control and Information in Imitation Learning for Unimanual and Bimanual Robotic Manipulation
Osaka University
Team
ALPHA-α
“A” “L”ow-cost “P”hysical “Ha”rdware Considering Diverse Motor Control Modes for Research in Everyday Bimanual Robotic Manipulation
Overview
We aimed to develop ALPHA-α, a low-cost bimanual robot hardware platform that supports diverse motor control modes, is suitable for robotics research on everyday tasks, and can be easily built by many researchers and developers. We do not claim that ALPHA-α outperforms ALOHA. We compare the two in this paper only to clarify the position of ALPHA-α relative to ALOHA, a widely used bimanual robot platform. ALPHA-α features low cost, ease of use, repairability, ease of assembly, support for various control types, and a high control frequency.
The distinctive features of ALPHA-α are as follows.
- Low Cost:
Note: OpenMANIPULATOR SARA was not developed by us; the leader hand of ALPHA-α is modified from OpenMANIPULATOR SARA parts. Because OpenMANIPULATOR SARA is not available as of 2024/11, the prices in the table assume that it is built by modifying OpenMANIPULATOR-X [3], which is also how we built ours. The price of OpenMANIPULATOR SARA may change in the future.
- [1] https://aloha-2.github.io/
- [2] https://github.com/zang09/open_manipulator_6dof_application
- [3] https://www.robotis.us/openmanipulator-x-rm-x52-tnm/
- Diverse Motor Control Modes:
ALPHA-α builds on and improves a robot composed of motors capable of position, velocity, and torque (current) control, giving researchers flexibility in selecting control methods. Because the motors support all three modes, researchers and developers can implement a variety of control methods themselves.
- Data Collection Frequency:
As the range of available control methods grows, so do the demands on control frequency; for example, delicate manipulation with force control requires a higher control frequency. ALPHA-α therefore employs motors capable of collecting and estimating joint angle, velocity, and current data at 1000 Hz and an RGB camera capable of capturing RGB images at 260 Hz. For stable collection, RGB image data is collected at about 100 Hz in this paper.
We selected and improved a robot that meets these specifications, constructing the physical hardware which we have named ALPHA-α.
ALPHA-α via Bilateral Control
Leader-Follower Control
Unilateral Control (e.g., ALOHA)
The primary difference between ALOHA/ACT and Bi-ACT lies in the information they use and the control method. ALOHA/ACT is based on unilateral control, which relies solely on the robot’s joint positions and uses the joint angles predicted by the ACT learning model directly as command values for ALOHA’s joint position controller. This system prioritizes position targets, which can make it difficult to generate movements that require nuanced control of force. Although force modulation can be simulated in teleoperation using only position control, operators typically need extensive time to master the control of the leader robot.
Bilateral Control (e.g., ALPHA-α via Bilateral Control)
In contrast, our Bi-ACT is based on bilateral control, which considers the robot’s joint positions, velocities, and torques. Bi-ACT uses the leader robot’s joint data (positions, velocities, and torques) together with the same information from the actively operating follower robot to generate command values for current and torque control. This approach allows control that combines both position and force in the robot’s movements. Crucially, the command values are not generated directly by the model; instead, they are computed from the leader-robot values generated by the model in conjunction with the actual values measured on the follower robot. By leveraging four-channel bilateral control, this method generates command values that account for interactions with the environment, thus enabling a broader range of movements.
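As a rough illustration of how a command value can be computed from leader values and measured follower values, the following is a minimal sketch of one common formulation of four-channel bilateral control for a single joint: position tracking drives the leader-follower position difference to zero, and force feedback drives the sum of the two reaction torques to zero. The gains, the inertia value, and all names (`JointState`, `bilateral_torque_refs`, `KP`, `KD`, `KF`, `INERTIA`) are hypothetical and not taken from the paper; the actual controller may differ. During autonomous execution, the leader state would hold the values generated by the model.

```python
from dataclasses import dataclass


@dataclass
class JointState:
    position: float  # joint angle [rad]
    velocity: float  # joint angular velocity [rad/s]
    torque: float    # estimated reaction torque [N*m]


# Hypothetical gains and nominal inertia, chosen only for illustration.
KP = 100.0      # position gain
KD = 20.0       # velocity gain
KF = 1.0        # force gain
INERTIA = 0.01  # nominal joint inertia [kg*m^2]


def bilateral_torque_refs(leader, follower, kp=KP, kd=KD, kf=KF, inertia=INERTIA):
    """Return (leader_ref, follower_ref) torque references for one joint.

    Position channel: drive (leader - follower) position/velocity error to zero.
    Force channel: drive (leader torque + follower torque) to zero,
    i.e. the law of action and reaction.
    """
    pos_err = leader.position - follower.position
    vel_err = leader.velocity - follower.velocity
    torque_sum = leader.torque + follower.torque

    accel = kp * pos_err + kd * vel_err
    follower_ref = 0.5 * inertia * accel - 0.5 * kf * torque_sum
    leader_ref = -0.5 * inertia * accel - 0.5 * kf * torque_sum
    return leader_ref, follower_ref


if __name__ == "__main__":
    # Leader (operator or model output) is slightly ahead of the follower,
    # and the follower feels a reaction torque from contact with an object.
    leader = JointState(position=0.10, velocity=0.0, torque=0.0)
    follower = JointState(position=0.00, velocity=0.0, torque=0.2)
    print(bilateral_torque_refs(leader, follower))
```

Because the follower's measured torque enters the reference, contact with the environment changes the command even when the position targets are unchanged, which is the property that position-only unilateral control lacks.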
Teleoperation Skills: ALPHA-α via Bilateral Control
Bi-ACT
“Bi”lateral Control-Based Imitation Learning via “A”ction “C”hunking with “T”ransformers
Data Collection
Execution
Autonomous Skills
Unimanual Robotic Manipulation
Task
Objects for Pick and Place
We collected joint angle, angular velocity, and torque data from leader-follower robot demonstrations using a bilateral control system. The robot was controlled at a frequency of 1000 Hz. In addition, both the RGB camera mounted on the hand and the top RGB camera on the environment side were recording. To align both sets of data with the system’s operating cycle, we adjusted the data to 100 Hz for use as training data.
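The text does not state how the 1000 Hz joint data was adjusted to 100 Hz. One simple possibility, shown here purely as an assumption, is decimation: keeping every tenth sample so that each joint record lines up with one camera frame. The function name `downsample` and its parameters are hypothetical.

```python
def downsample(samples, source_hz=1000, target_hz=100):
    """Keep every (source_hz // target_hz)-th sample.

    Assumes plain decimation; the source may have used averaging or
    another resampling method instead.
    """
    if target_hz <= 0 or source_hz % target_hz != 0:
        raise ValueError("source_hz must be a positive integer multiple of target_hz")
    step = source_hz // target_hz
    return samples[::step]


if __name__ == "__main__":
    # One second of joint samples logged at 1000 Hz (indices stand in for data).
    one_second = list(range(1000))
    training_rate = downsample(one_second)
    print(len(training_rate), training_rate[:3])
```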
Results (Autonomous)
Pick and Place (Real-Time: 1X)
Why is Bi-ACT (or position and force information/control) important?
In this section, we analyze the Bi-ACT model’s performance on various objects, focusing on joint5, the gripper joint that has the most contact with the objects. The results showed that integrating force information significantly enhanced its effectiveness. Details are as follows.
Difference in hardness
Difference in shape consistency
These results confirmed the importance of position and force information/control when using Bi-ACT.
Bimanual Robotic Manipulation
Task
To examine the applicability of Bi-ACT, experiments were conducted on three tasks, “Put-Cup-Ball,” “Egg Handling,” and “Open Cap,” using ALPHA-α. For each task, we collected 5 demonstrations as training data.
- Put-Cup-Ball
In the “Put-Cup-Ball” task, the left robot arm transports a cup located on the left side, while the right robot arm picks up a ball and places it on top of the cup. The specific steps are as follows: (#0) Initial position (#1) Pick up the cup and ball (#2) Place the cup and move the ball (#3) Place the ball on top of the cup.
This task requires coordinated bimanual robot actions, as each arm must monitor the other’s status and avoid interference, ensuring effective cooperation between the two robot arms.
- Egg Handling
In the “Egg Handling” task, the two robots coordinate to lift two eggs and place them in a designated area. The specific steps are as follows: (#0) Initial position (#1) Pick two eggs (#2) Move to place area (#3) Place two eggs in place area.
This task requires the left and right arms to carefully grasp and transport fragile eggs to the specified location. Proper and delicate bimanual robot actions are essential to avoid breaking the eggs.
- Open Cap
In the “Open Cap” task, two robots are used to open the cap of a plastic bottle. The specific steps are as follows: (#0) Initial position (#1) Grasp the bottle with the right robot (#2) Pass the bottle to the left robot (#3) Open the bottle cap.
This task requires even more careful coordination than the “Put-Cup-Ball” task, as both arms must monitor each other’s status and avoid interference. In particular, it necessitates coordinated bimanual actions, such as passing the bottle between robots and holding the bottle with the left robot arm while the right robot arm opens the cap.
Results (Autonomous)
A little extra
Repairability
Control
Citation
@misc{kobayashi2024alphabiact,
title={ALPHA-$\alpha$ and Bi-ACT Are All You Need: Importance of Position and Force Information/Control for Imitation Learning of Unimanual and Bimanual Robotic Manipulation with Low-Cost System},
author={Masato Kobayashi and Thanpimon Buamanee and Takumi Kobayashi},
year={2024},
eprint={2411.09942},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2411.09942},
}
Contact
Masato Kobayashi (Assistant Professor, Osaka University, Japan)
- X (Twitter)
- English : https://twitter.com/MeRTcookingEN
- Japanese : https://twitter.com/MeRTcooking
- LinkedIn: https://www.linkedin.com/in/kobayashi-masato-robot/
* Corresponding author: Masato Kobayashi