Introduction to YOLO Object Detection in ROS2
This document explains how to perform object detection using YOLO (You Only Look Once) on images obtained from robot cameras in a ROS2 environment. YOLO is a high-performance deep learning model that can detect objects in real-time.
Learning Objectives
By the end of this lesson, you will be able to:
- Understand what YOLO is and how it works for object detection
- Integrate YOLO models into ROS2 nodes
- Process camera images and detect objects in real-time
- Visualize and publish object detection results
YOLO Object Detection Overview
This lesson covers:
- Processing images obtained from robot cameras in real-time
- Implementation of object detection using deep learning (YOLO)
- Visualization of detection results and publishing to ROS2 topics
- Working with pre-trained models and understanding model selection
1. What is YOLO
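YOLO (You Only Look Once) is a deep learning model for detecting objects in images. It is a single-stage detector: a single forward pass of the network predicts bounding boxes and class probabilities for the whole image, which is where the name comes from. Compared to traditional object detection methods, it has the following features: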
- High Speed: Can detect objects in real-time
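- High Accuracy: Recent YOLO models achieve high detection accuracy, with larger model sizes trading speed for accuracy
- Ease of Use: Pre-trained models are provided, so detection works without any training
- Diversity: The pre-trained detection models recognize the 80 everyday object classes of the COCO dataset (for example, person, car, chair)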
Main Uses of YOLO
- Object Detection: Detect position and type of objects in images
- Object Tracking: Track objects in videos
- Segmentation: Identify regions of objects in images
- Pose Estimation: Estimate keypoints of detected objects, such as human body joints
2. Basics of YOLO Object Detection in ROS2
To perform YOLO object detection in ROS2, the following steps are required:
- Subscribe to images from the camera
- Convert ROS2 image messages to OpenCV format
- Perform object detection using YOLO model
- Visualize detection results and publish as needed
Required Libraries
- rclpy: ROS2 Python client library
- cv_bridge: Library for converting between ROS2 image messages and OpenCV format images
- OpenCV: Library for image processing
- Ultralytics: Library that provides YOLO models
3. Implementation of YOLO Object Detection
Create a ROS2 node for YOLO object detection. This node subscribes to images from the camera, performs object detection using the YOLO model, and visualizes and publishes detection results.
Implementation Steps
- Create a ROS2 node
- Subscribe to images from the camera
- Load YOLO model
- Perform object detection on images
- Visualize and publish detection results
Code Example
#!/usr/bin/env python3
import json

import cv2
import rclpy
from cv_bridge import CvBridge
from rclpy.node import Node
from rclpy.qos import QoSProfile, QoSReliabilityPolicy, QoSDurabilityPolicy
from sensor_msgs.msg import Image
from std_msgs.msg import String
from ultralytics import YOLO


class ImageYoloDetection(Node):
    def __init__(self):
        super().__init__('image_yolo_detection')

        # Set QoS profile for image data transmission
        qos_profile = QoSProfile(
            depth=5,
            reliability=QoSReliabilityPolicy.BEST_EFFORT,
            durability=QoSDurabilityPolicy.VOLATILE
        )

        # Subscribe to camera image topic
        self.subscription = self.create_subscription(
            Image,
            '/kachaka/front_camera/image_raw',
            self.image_callback,
            qos_profile
        )

        # Create publisher for the annotated detection image
        self.publisher = self.create_publisher(
            Image,
            '/image_yolo_detection/image',
            qos_profile
        )

        # Create publisher for detected object information (JSON in a String message)
        self.object_publisher = self.create_publisher(
            String,
            '/image_yolo_detection/objects',
            qos_profile
        )

        # Create CvBridge for ROS-OpenCV image conversion
        self.bridge = CvBridge()

        # Load YOLOv8 model
        # 'yolov8x.pt' is the largest YOLOv8 model (highest accuracy, slowest)
        # The model file will be downloaded automatically on first use if not present
        # Other options: 'yolov8n.pt' (fastest), 'yolov8s.pt', 'yolov8m.pt', 'yolov8l.pt'
        self.model = YOLO('yolov8x.pt')

        # Record the classes detected in the previous frame to avoid duplicate announcements
        # Using a set ensures each class name is stored only once
        self.last_detected_objects = set()

        self.get_logger().info("YOLOv8 node has started.")

    def image_callback(self, msg):
        try:
            # Convert ROS Image message to OpenCV image
            cv_image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        except Exception as e:
            self.get_logger().error("Error converting image: " + str(e))
            return

        # Perform object detection using YOLOv8
        # The model processes the image and returns detection results
        # Results include: bounding boxes, class labels, confidence scores
        results = self.model(cv_image)

        # Draw detection results on the image
        # plot() draws bounding boxes, labels, and confidence scores on a copy of the image
        annotated_image = results[0].plot()

        # Extract detected object names from the results
        detected_objects = set()  # Use set to avoid duplicates
        for result in results:
            # Iterate through all detected bounding boxes
            for box in result.boxes:
                # Get the class ID (integer) and convert to class name (string)
                class_id = int(box.cls[0])  # Class ID (e.g., 0, 1, 2...)
                class_name = result.names[class_id]  # Class name (e.g., 'person', 'car')
                detected_objects.add(class_name)  # Add to set (automatically handles duplicates)

        # Publish information only if new objects are detected
        # This avoids spamming the topic with the same detections
        # Set difference finds objects in detected_objects but not in last_detected_objects
        new_objects = detected_objects - self.last_detected_objects
        if new_objects:
            # Publish detected object information in JSON format
            object_info = {
                "objects": list(new_objects),
                "timestamp": self.get_clock().now().nanoseconds * 1e-9
            }
            object_msg = String()
            object_msg.data = json.dumps(object_info)
            self.object_publisher.publish(object_msg)
            self.get_logger().info(f"Detected objects: {new_objects}")

        # Remember this frame's detections for comparison with the next frame
        self.last_detected_objects = detected_objects

        # Convert detection results to ROS Image message and publish
        try:
            annotated_msg = self.bridge.cv2_to_imgmsg(annotated_image, encoding='bgr8')
            self.publisher.publish(annotated_msg)
        except Exception as e:
            self.get_logger().error("Error converting detection results: " + str(e))

        # Display detection results in a window (requires a graphical display)
        cv2.imshow("YOLOv8 Detection", annotated_image)
        cv2.waitKey(1)


def main(args=None):
    rclpy.init(args=args)
    node = ImageYoloDetection()
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        node.get_logger().info("Shutting down...")
    finally:
        cv2.destroyAllWindows()
        node.destroy_node()
        # Ctrl+C may already have shut the context down, so avoid a second shutdown
        if rclpy.ok():
            rclpy.shutdown()


if __name__ == '__main__':
    main()
Code Explanation
- QoS Profile: Set a QoS profile suitable for image data transmission and reception
- Subscription: Subscribe to images from the camera
- Publisher: Create topics to publish detection results
- cv_bridge: Convert between ROS2 image messages and OpenCV format images
- YOLO Model: Load the YOLOv8 model using the Ultralytics library
- Object Detection: Execute object detection using model(cv_image)
- Result Visualization: Draw detection results on images using results[0].plot()
- Result Publishing: Convert detection results to ROS Image messages and publish
- Result Display: Display detection results using the cv2.imshow function
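The "publish only when new objects appear" logic in the callback can be tried without ROS2 or a camera. The following is a minimal sketch, not part of the node itself: NewObjectFilter and its update method are hypothetical names introduced here, the per-frame sets of class names are made-up inputs standing in for YOLO results, and sorted() is used instead of list() only to make the printed output deterministic.

```python
import json


class NewObjectFilter:
    """Mimics the node's last_detected_objects bookkeeping (hypothetical helper)."""

    def __init__(self):
        self.last_detected_objects = set()

    def update(self, detected_objects, timestamp):
        """Return a JSON payload for classes absent from the previous frame, or None."""
        detected_objects = set(detected_objects)
        # Set difference: classes seen now but not in the previous frame
        new_objects = detected_objects - self.last_detected_objects
        self.last_detected_objects = detected_objects
        if not new_objects:
            return None
        return json.dumps({"objects": sorted(new_objects), "timestamp": timestamp})


# Made-up class names for four consecutive frames
frames = [{"person"}, {"person", "chair"}, {"person"}, {"person", "chair"}]
object_filter = NewObjectFilter()
payloads = [object_filter.update(frame, float(i)) for i, frame in enumerate(frames)]
for payload in payloads:
    print(payload)
```

Because each frame is compared only with the one before it, the chair that drops out in the third frame is announced again in the fourth. A detector that flickers between frames will therefore still publish repeatedly.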
4. Types of YOLO Models
YOLO provides various models. The main models are as follows:
YOLOv8 Models
- YOLOv8n: Lightest model (fast but lower accuracy)
- YOLOv8s: Small model (slightly slower than YOLOv8n, better accuracy)
- YOLOv8m: Medium model (balance of speed and accuracy)
- YOLOv8l: Large model (high accuracy but slow)
- YOLOv8x: Largest model (highest accuracy but slowest)
Model Selection
- Prioritize Processing Speed: Choose YOLOv8n or YOLOv8s
- Prioritize Accuracy: Choose YOLOv8l or YOLOv8x
- Prioritize Balance: Choose YOLOv8m
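The selection rules above can be written down as a small lookup. This is only a sketch under stated assumptions: the priority labels "speed", "balance" and "accuracy" and the function select_model_file are hypothetical names introduced here, and each label is mapped to one of the models recommended above.

```python
# Hypothetical priority labels mapped to the weight files recommended in this section
PRIORITY_TO_MODEL = {
    "speed": "yolov8n.pt",
    "balance": "yolov8m.pt",
    "accuracy": "yolov8x.pt",
}


def select_model_file(priority):
    """Return the weight file name for a priority label, rejecting unknown labels."""
    try:
        return PRIORITY_TO_MODEL[priority]
    except KeyError:
        expected = sorted(PRIORITY_TO_MODEL)
        raise ValueError(f"unknown priority {priority!r}; expected one of {expected}") from None


print(select_model_file("speed"))
```

In the node, the returned file name could be passed to YOLO(...) in place of the hard-coded 'yolov8x.pt', which makes it easy to compare models without editing the rest of the code.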
5. Package Creation and Execution
Package Creation
# Create YOLO object detection package (run inside the src directory of the workspace)
# Only ROS packages are listed as dependencies; Ultralytics is installed with pip below
ros2 pkg create --build-type ament_python image_yolo_detection --dependencies rclpy sensor_msgs std_msgs cv_bridge --node-name image_yolo_detection
Installing Dependencies
# Install Ultralytics library (this also installs opencv-python as a dependency)
pip install ultralytics
Package Building
cd ~/ros2_ws
colcon build --packages-select image_yolo_detection
# Make the newly built package visible in the current shell
source install/setup.bash
Node Execution
# Run YOLO object detection node
ros2 run image_yolo_detection image_yolo_detection
6. Applications of YOLO Object Detection
Practical Applications
- Object Tracking: Track detected objects
- Object Counting: Count specific objects
- Anomaly Detection: Detect objects that are not normally expected in the scene
- Robot Navigation: Move while avoiding objects
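As an illustration of object counting, the sketch below tallies detections per class with the standard library. It assumes one class name per detected bounding box has already been extracted; count_objects is a hypothetical helper and the class names are made-up example data.

```python
from collections import Counter


def count_objects(class_names):
    """Count how many detections there are per class name."""
    return Counter(class_names)


# Made-up class names, one entry per detected bounding box in a single frame
frame_class_names = ["person", "chair", "person", "cup", "chair", "person"]
counts = count_objects(frame_class_names)
for name, count in counts.most_common():
    print(f"{name}: {count}")
```

Note that the node in section 3 collects class names in a set, which discards duplicates. For counting, the names would need to be appended to a list instead.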
Research and Development Applications
- Training on Custom Datasets: Train YOLO models for specific objects
- Multimodal Detection: Detection combining images and other sensor data
- Real-time Object Tracking: Track objects in videos in real-time
- 3D Object Detection: Estimate 3D object position and pose from 2D images
Exercises
Basic Object Detection
- Run the YOLO object detection node and observe how objects are detected from camera images
- Use different YOLO models (YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x) and observe the differences in detection accuracy and processing speed
Parameter Adjustment
- Change the YOLO model’s confidence threshold and observe changes in detection results
- Limit detection target classes to detect only specific objects
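One possible starting point for these two exercises: Ultralytics accepts the inference arguments conf (minimum confidence) and classes (list of class IDs to keep) when the model is called. The sketch below only builds those arguments, so it runs without the library. The helpers class_ids_for and build_predict_kwargs are hypothetical names, and example_names is a small excerpt of the COCO ID-to-name mapping standing in for model.names.

```python
def class_ids_for(names, wanted):
    """Map class names to class IDs using an ID -> name dict such as model.names."""
    name_to_id = {name: class_id for class_id, name in names.items()}
    missing = [name for name in wanted if name not in name_to_id]
    if missing:
        raise ValueError(f"unknown class names: {missing}")
    return [name_to_id[name] for name in wanted]


def build_predict_kwargs(names, conf=0.25, wanted=None):
    """Build keyword arguments for the model call (0.25 is the Ultralytics default)."""
    if not 0.0 <= conf <= 1.0:
        raise ValueError("conf must be between 0 and 1")
    kwargs = {"conf": conf}
    if wanted:
        kwargs["classes"] = class_ids_for(names, wanted)
    return kwargs


# Excerpt of the COCO mapping, standing in for model.names
example_names = {0: "person", 1: "bicycle", 2: "car"}
predict_kwargs = build_predict_kwargs(example_names, conf=0.5, wanted=["person", "car"])
print(predict_kwargs)
# In the node this would be used as: results = self.model(cv_image, **predict_kwargs)
```

Raising conf removes low-confidence boxes, while classes restricts the output to the listed IDs. Both filters are applied by the library before the results are returned.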
Node Extension
- Add the confidence score of each detection to the published object information
- Add a function to display the number of detected objects
- Calculate and display object position information (where in the image each object was detected)
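For the position exercise, one possible approach is to take the center of each bounding box and map it onto a coarse 3x3 grid. The sketch below assumes the box corners are available as (x1, y1, x2, y2) in pixels, which in Ultralytics results can be read from box.xyxy[0]. The function describe_position, the grid labels, and the example box are all hypothetical.

```python
def describe_position(xyxy, image_width, image_height):
    """Return the box center in pixels and a coarse region label on a 3x3 grid."""
    x1, y1, x2, y2 = xyxy
    center_x = (x1 + x2) / 2.0
    center_y = (y1 + y2) / 2.0

    def grid_index(value, size):
        # Split the axis into three equal parts and clamp to the valid range 0..2
        return max(0, min(int(3 * value / size), 2))

    horizontal = ["left", "center", "right"][grid_index(center_x, image_width)]
    vertical = ["top", "middle", "bottom"][grid_index(center_y, image_height)]
    return {"center": (center_x, center_y), "region": f"{vertical}-{horizontal}"}


# Made-up box (x1, y1, x2, y2) in a 640x480 image
position = describe_position((400.0, 100.0, 600.0, 300.0), 640, 480)
print(position)
```

The same dictionary could be added to the JSON payload published on the objects topic, alongside the class name.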