Introduction
I work for a manufacturing company where robots perform tasks like welding and transporting parts. I’m at an interesting intersection where both AI and robotics are my passions, and I’m actively working in the field and building my own robots. I sense that a significant change is taking place in robotics. Traditionally, robots have been built and controlled by electrical engineers and programmers, but this is changing. Just as software developers are evolving into AI engineers or AI-assisted software developers, I believe we are witnessing the emergence of AI robotics engineers and a new stack of robotics technologies.
Hugging Face is at the forefront of open-source AI, and I’m a big fan. They have created an open-source AI stack for robotics, eliminating the need for complicated setups like ROS and Gazebo. All you need is a laptop, a model, a robot, and the Hugging Face GitHub repository. There’s no need to program microcontrollers or deal with complex ROS topics and nodes – it’s a much simpler stack.
Here’s my robot in action:
What can it do?
Watch Remi Cadene, one of the lead developers of the robot stack, demonstrate what two $115 robots can do:
🤖 Open-source AI-powered robot
✅ Cheap ($115)
✅ 3D printed
✅ Controlled by LLM
✅ Trained by imitation learning
Demo: pick and place task
Github: https://t.co/BBkjRPFvMF pic.twitter.com/ATkXzm1YwR
— Remi Cadene (@RemiCadene) February 14, 2024
How to get started: The action plan
- 3D print the robot parts using files from the SO-ARM100 repository
- Purchase the necessary parts (e.g., motors) as specified in the SO-ARM100 repository
- Assemble the robot following instructions from The Robot Studio YouTube channel
- Connect your laptop to the robot
- Clone the Hugging Face LeRobot repository
- Configure and run the repository for your robot using these instructions
- Record the robot’s actions by demonstrating the task (if a trained model doesn’t already exist for your desired task)
- Train the model using the recorded data with this script
- Run the model and let the robot perform its tasks
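The record-then-train steps above boil down to imitation learning: fit a policy that maps observations to the actions a human demonstrated. As a toy illustration (this is not the LeRobot API — its policies are neural networks trained on camera frames and joint states), here is a one-parameter linear policy fit to made-up demonstration data:

```python
# Toy imitation learning: learn action = f(observation) from recorded demos.
# Conceptual sketch only, with invented numbers; LeRobot trains neural
# policies on real camera and joint-state recordings.

# Recorded demonstrations: (observation, action) pairs.
# Here, observation = object position, action = motor target.
demos = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Fit a linear policy action = w * observation by gradient descent
# on the mean squared error against the demonstrations.
w = 0.0
lr = 0.05
for _ in range(500):
    grad = sum(2 * (w * obs - act) * obs for obs, act in demos) / len(demos)
    w -= lr * grad

# The learned policy imitates the demonstrations (here, action = 2 * obs).
print(round(w, 2))  # → 2.0
```

The real pipeline replaces the linear function with a deep network and the scalar observation with images, but the principle — regress demonstrated actions from observations — is the same.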
How this new stack changes things
Overall, it will be much simpler to get started. Once the community has trained models, it will be a matter of running the model and letting the robot do its work. Initially, training will be required, but as the space grows, less training will be needed for common tasks. Here are some of my thoughts on how things might progress:
- Robots will become cheaper to build and buy. Current factory robots, programmed to move to exact coordinates, require very accurate and expensive motors, encoders, and components. This new style of robot doesn’t need to be as precise because it uses machine learning models and vision (cameras) to continuously plan and calculate its movements, adjusting motor positions on the fly as needed. This is similar to how a human catches a ball in cricket or baseball, continuously adjusting their body and arm while watching the ball. This reduces the need for expensive, precise components, lowering the overall cost of the robot.
- Robotics will become more affordable due to the decreasing cost of components and the use of more commonly available parts (e.g., starting at $115 currently). This will attract more people to robotics.
- Increased competition in the robotics market will breed innovation and drive down prices.
- The community will upload more and better datasets to the Hugging Face repository, creating a wealth of data for everyone to train on.
- Models will improve as people train on more diverse datasets.
- Better neural network architectures for training models will be discovered as people experiment with different approaches.
- More people, including home hobbyists, will enter the robotics space as the barrier to entry lowers, leading to increased automation in homes, offices and factories.
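The ball-catching point above — cheap, imprecise motors compensated by continuous feedback — can be made concrete with a tiny simulation. All numbers here are invented for illustration: a proportional feedback loop reaches its target even when every motor command lands with up to 20% error, which is why precision hardware matters less in this style of robot.

```python
import random

# Closed-loop control sketch: an imprecise actuator still converges on the
# target because feedback (e.g., vision) corrects the error at every step.
random.seed(0)

target = 10.0   # where the camera says the object is
position = 0.0  # current gripper position
gain = 0.5      # proportional feedback gain

for _ in range(50):
    error = target - position  # "look" at the remaining error
    command = gain * error     # plan a correcting move
    # Cheap motor: each executed move is off by up to 20%.
    position += command * random.uniform(0.8, 1.2)

print(abs(target - position) < 0.01)  # → True: converges despite sloppy motors
```

An open-loop robot, by contrast, must hit its coordinates on the first try, which is exactly what drives the cost of industrial-grade motors and encoders.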
My thoughts so far based on experience
It’s a simple stack, and I think it will be a game-changer. The latest robot costs $115 to build, making it one of the cheapest robot arms available. While it may not be able to pick up car doors like factory robots, it’s affordable enough to get many people started and capable of performing numerous useful tasks. It also lets people learn how to control robots using this new stack, become efficient at it, and understand its capabilities.
The next step for users would be to buy or build a more capable robot, which will become cheaper for the reasons mentioned above. A robot that can lift 5 kg is available on Alibaba for under $2,000, opening up possibilities for automation and new tasks.
The current challenge is obtaining enough data for the models. This is where Hugging Face excels – they are making it easier to save data, host trained models, and provide access to models. They already have everything in place; what they need now is more people to record data, train models, and, most importantly, bring the community together to start building open-source robotics and models.