Introduction to YOLO Object Detection in ROS2

This document explains how to perform object detection using YOLO (You Only Look Once) on images obtained from robot cameras in a ROS2 environment. YOLO is a high-performance deep learning model that can detect objects in real-time.

Learning Objectives

By the end of this lesson, you will be able to:

  • Understand what YOLO is and how it works for object detection
  • Integrate YOLO models into ROS2 nodes
  • Process camera images and detect objects in real-time
  • Visualize and publish object detection results

YOLO Object Detection Overview

This lesson covers:

  • Processing images obtained from robot cameras in real-time
  • Implementation of object detection using deep learning (YOLO)
  • Visualization of detection results and publishing to ROS2 topics
  • Working with pre-trained models and understanding model selection

1. What is YOLO

YOLO (You Only Look Once) is a deep learning model for detecting objects in images. Unlike traditional detectors that run a classifier over many candidate regions, YOLO predicts all bounding boxes and class probabilities in a single forward pass of the network, which gives it the following characteristics:

  • High Speed: A single network pass per image enables real-time detection
  • High Accuracy: Recent YOLO versions achieve high detection accuracy despite their speed
  • Ease of Use: Pre-trained models are provided and easy to use
  • Versatility: The standard pre-trained models detect a wide range of object classes (the COCO-trained YOLOv8 models cover 80 classes)

Main Uses of YOLO

  • Object Detection: Detect the position and class of objects in images
  • Object Tracking: Track detected objects across video frames
  • Segmentation: Identify the pixel regions (masks) of objects in images
  • Pose Estimation: Estimate human body keypoints (poses)

2. Basics of YOLO Object Detection in ROS2

To perform YOLO object detection in ROS2, the following steps are required (steps 1 and 2 are sketched on their own after the list):

  1. Subscribe to images from the camera
  2. Convert ROS2 image messages to OpenCV format
  3. Perform object detection using YOLO model
  4. Visualize detection results and publish as needed
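
Steps 1 and 2 are plain ROS2 image handling and can be tested before any YOLO code is added. The following is a minimal sketch, not part of the final node; the topic name /camera/image_raw and the node name image_viewer are illustrative assumptions and should be adjusted to your robot (the camera topic used later in this lesson is /kachaka/front_camera/image_raw).

#!/usr/bin/env python3
# Minimal sketch of steps 1 and 2: subscribe to a camera topic and convert to OpenCV.
# Assumption: '/camera/image_raw' is a placeholder topic name; change it for your robot.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


class ImageViewer(Node):
    def __init__(self):
        super().__init__('image_viewer')
        self.bridge = CvBridge()
        self.subscription = self.create_subscription(
            Image, '/camera/image_raw', self.image_callback, 10)

    def image_callback(self, msg):
        # Step 2: convert the ROS2 Image message into an OpenCV BGR image
        cv_image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        self.get_logger().info(f"Received image with shape {cv_image.shape}")


def main(args=None):
    rclpy.init(args=args)
    node = ImageViewer()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()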

Required Libraries

  • rclpy: ROS2 Python client library
  • cv_bridge: Library for converting between ROS2 image messages and OpenCV format images
  • OpenCV: Library for image processing
  • Ultralytics: Library that provides YOLO models and pre-trained weights (a minimal standalone usage sketch follows this list)
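
Before combining these libraries in a ROS2 node, it helps to see the Ultralytics API on its own. The following is a minimal standalone sketch, assuming ultralytics and OpenCV are installed; 'test.jpg' is a placeholder for any local image, and yolov8n.pt is chosen only because it downloads and runs quickly.

#!/usr/bin/env python3
# Minimal standalone YOLO sketch (no ROS2 involved).
# Assumptions: ultralytics and opencv-python are installed, and 'test.jpg'
# is a placeholder path to any local image.
import cv2
from ultralytics import YOLO

# Load a small pre-trained model (downloaded automatically if not present)
model = YOLO('yolov8n.pt')

# Read an image with OpenCV (BGR format, the same layout cv_bridge produces)
image = cv2.imread('test.jpg')

# Run detection; the result is a list with one entry per input image
results = model(image)

# Print class name and confidence for each detected box
for box in results[0].boxes:
    class_name = results[0].names[int(box.cls[0])]
    confidence = float(box.conf[0])
    print(f"{class_name}: {confidence:.2f}")

# Show the annotated image until a key is pressed
cv2.imshow("YOLO detection", results[0].plot())
cv2.waitKey(0)
cv2.destroyAllWindows()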

3. Implementation of YOLO Object Detection

Create a ROS2 node for YOLO object detection. This node subscribes to images from the camera, performs object detection using the YOLO model, and visualizes and publishes detection results.

Implementation Steps

  1. Create a ROS2 node
  2. Subscribe to images from the camera
  3. Load YOLO model
  4. Perform object detection on images
  5. Visualize and publish detection results

Code Example

image_yolo_detection.py

#!/usr/bin/env python3
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
import cv2
from ultralytics import YOLO
from rclpy.qos import QoSProfile, QoSReliabilityPolicy, QoSDurabilityPolicy
from std_msgs.msg import String
import json


class ImageYoloDetection(Node):
    def __init__(self):
        super().__init__('image_yolo_detection')
        
        # Set QoS profile for image data transmission
        qos_profile = QoSProfile(
            depth=5,
            reliability=QoSReliabilityPolicy.BEST_EFFORT,
            durability=QoSDurabilityPolicy.VOLATILE
        )
        
        # Subscribe to camera image topic
        self.subscription = self.create_subscription(
            Image,
            '/kachaka/front_camera/image_raw',
            self.image_callback,
            qos_profile
        )
        
        # Create publisher for detection results
        self.publisher = self.create_publisher(
            Image,
            '/image_yolo_detection/image',
            qos_profile
        )

        # Publish detected object information
        self.object_publisher = self.create_publisher(
            String,
            '/image_yolo_detection/objects',
            qos_profile
        )

        # Create CvBridge for ROS-OpenCV image conversion
        self.bridge = CvBridge()

        # Load YOLOv8 model
        # 'yolov8x.pt' is the largest YOLOv8 model (highest accuracy, slowest)
        # The model file will be downloaded automatically on first use if not present
        # Other options: 'yolov8n.pt' (fastest), 'yolov8s.pt', 'yolov8m.pt', 'yolov8l.pt'
        self.model = YOLO('yolov8x.pt')
        
        # Remember the object classes seen in the previous frame so that
        # only newly appearing classes are announced in image_callback
        self.last_detected_objects = set()
        
        self.get_logger().info("YOLOv8 node has started.")

    def image_callback(self, msg):
        try:
            # Convert ROS Image message to OpenCV image
            cv_image = self.bridge.imgmsg_to_cv2(msg, desired_encoding='bgr8')
        except Exception as e:
            self.get_logger().error("Error converting image: " + str(e))
            return

        # Perform object detection using YOLOv8
        # The model processes the image and returns detection results
        # Results include: bounding boxes, class labels, confidence scores
        results = self.model(cv_image)

        # Draw detection results on the image
        # plot() draws bounding boxes, labels, and confidence scores on the image
        # This creates a visual representation of what was detected
        annotated_image = results[0].plot()

        # Extract detected object names from the results
        detected_objects = set()  # Use set to avoid duplicates
        for result in results:
            # Iterate through all detected bounding boxes
            for box in result.boxes:
                # Get the class ID (integer) and convert to class name (string)
                class_id = int(box.cls[0])  # Class ID (e.g., 0, 1, 2...)
                class_name = result.names[class_id]  # Class name (e.g., 'person', 'car')
                detected_objects.add(class_name)  # Add to set (automatically handles duplicates)

        # Publish information only if new objects are detected
        # This avoids spamming the topic with the same detections
        # Set difference finds objects in detected_objects but not in last_detected_objects
        new_objects = detected_objects - self.last_detected_objects
        if new_objects:
            # Publish detected object information in JSON format
            object_info = {
                "objects": list(new_objects),
                "timestamp": self.get_clock().now().nanoseconds * 1e-9
            }
            object_msg = String()
            object_msg.data = json.dumps(object_info)
            self.object_publisher.publish(object_msg)
            
            self.get_logger().info(f"Detected objects: {new_objects}")

        # Update detection results
        self.last_detected_objects = detected_objects

        # Convert detection results to ROS Image message and publish
        try:
            annotated_msg = self.bridge.cv2_to_imgmsg(annotated_image, encoding='bgr8')
            self.publisher.publish(annotated_msg)
        except Exception as e:
            self.get_logger().error("Error converting detection results: " + str(e))

        # Display detection results in a window
        cv2.imshow("YOLOv8 Detection", annotated_image)
        cv2.waitKey(1)

def main(args=None):
    rclpy.init(args=args)
    node = ImageYoloDetection()
    try:
        rclpy.spin(node)
    except KeyboardInterrupt:
        node.get_logger().info("Shutting down...")
    finally:
        cv2.destroyAllWindows()
        node.destroy_node()
        rclpy.shutdown()

if __name__ == '__main__':
    main()

Code Explanation

  • QoS Profile: BEST_EFFORT reliability with a small queue depth, which suits high-rate camera streams where dropping an occasional frame is acceptable
  • Subscription: Subscribe to images from the camera
  • Publisher: Create topics to publish detection results
  • cv_bridge: Convert between ROS2 image messages and OpenCV format images
  • YOLO Model: Load YOLOv8 model using Ultralytics library
  • Object Detection: Execute object detection using model(cv_image)
  • Result Visualization: Draw detection results on images using results[0].plot()
  • Result Publishing: Convert detection results to ROS Image messages and publish
  • Result Display: Display detection results using cv2.imshow function

4. Types of YOLO Models

YOLO provides various models. The main models are as follows:

YOLOv8 Models

  • YOLOv8n (nano): Lightest and fastest model, lowest accuracy
  • YOLOv8s (small): Fast with modest accuracy
  • YOLOv8m (medium): Balanced speed and accuracy
  • YOLOv8l (large): High accuracy, slower
  • YOLOv8x (extra large): Highest accuracy, slowest

Model Selection

  • Prioritize Processing Speed: Choose YOLOv8n or YOLOv8s
  • Prioritize Accuracy: Choose YOLOv8l or YOLOv8x
  • Prioritize Balance: Choose YOLOv8m (the sketch below shows one way to make the model selectable at run time)
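
Switching between these models only requires changing the weight file name passed to YOLO(). One convenient pattern, sketched here as an addition to the node's __init__ from section 3, is to expose the choice as a ROS2 parameter; the parameter name model_name is an illustrative choice, not a standard interface.

# Sketch: replace the fixed model load in __init__ with a parameter
# ('model_name' is an illustrative parameter name)
self.declare_parameter('model_name', 'yolov8n.pt')
model_name = self.get_parameter('model_name').get_parameter_value().string_value
self.model = YOLO(model_name)

The model can then be selected at launch time, for example with ros2 run image_yolo_detection image_yolo_detection --ros-args -p model_name:=yolov8m.pt.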

5. Package Creation and Execution

Package Creation

# Create YOLO object detection package
# (OpenCV and Ultralytics are Python packages installed with pip in the next step,
#  so they are not listed as ROS package dependencies here)
ros2 pkg create --build-type ament_python image_yolo_detection --dependencies rclpy sensor_msgs cv_bridge std_msgs --node-name image_yolo_detection

Installing Dependencies

# Install Ultralytics library
pip install ultralytics

Package Building

cd ~/ros2_ws
colcon build --packages-select image_yolo_detection
source install/setup.bash

Node Execution

# Run YOLO object detection node
ros2 run image_yolo_detection image_yolo_detection
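
Once the node is running and the camera topic is being published, you can check its output from another terminal with the standard ROS2 CLI tools, for example:

# List the topics created by the node
ros2 topic list | grep image_yolo_detection

# Print the JSON detection messages as they arrive
ros2 topic echo /image_yolo_detection/objects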

6. Applications of YOLO Object Detection

Practical Applications

  • Object Tracking: Track detected objects over time
  • Object Counting: Count specific objects
  • Anomaly Detection: Detect objects or situations that deviate from normal conditions
  • Robot Navigation: Navigate while avoiding detected objects

Research and Development Applications

  • Training on Custom Datasets: Train YOLO models for specific objects
  • Multimodal Detection: Detection combining images and other sensor data
  • Real-time Object Tracking: Track objects in videos in real-time
  • 3D Object Detection: Estimate 3D object position and pose from 2D images

Exercises

Basic Object Detection

  1. Run the YOLO object detection node and observe how objects are detected from camera images
  2. Use different YOLO models (YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x) and compare detection accuracy and processing speed (the parameter sketch in section 4 shows one way to switch models without editing the code)

Parameter Adjustment

  1. Change the YOLO model’s confidence threshold and observe changes in detection results (a hint sketch follows this list)
  2. Limit detection target classes to detect only specific objects
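
As a starting point for both exercises, the Ultralytics inference call accepts conf and classes arguments, so the detection line in image_callback from section 3 can be changed as sketched below; the threshold 0.5 and class ID 0 ('person' in the COCO class list) are illustrative values.

# Sketch: replace the detection call in image_callback
# conf raises the minimum confidence threshold (Ultralytics defaults to 0.25)
# classes restricts detection to the listed class IDs (0 is 'person' in COCO)
results = self.model(cv_image, conf=0.5, classes=[0])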

Node Extension

  1. Add detection probability (confidence score) information to the object detection node (a sketch covering these three extensions follows this list)
  2. Add a function to display the number of detected objects
  3. Calculate and display object position information (where in the image each object was detected)
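
A possible sketch for these extensions, again based on the node from section 3: each detected box exposes its confidence score and pixel coordinates, so the inner loop of image_callback can be extended roughly as follows.

# Sketch: extended detection loop for image_callback
for result in results:
    # Number of detected objects in this frame
    self.get_logger().info(f"Number of detections: {len(result.boxes)}")
    for box in result.boxes:
        class_name = result.names[int(box.cls[0])]
        confidence = float(box.conf[0])           # detection probability (0.0 to 1.0)
        x1, y1, x2, y2 = box.xyxy[0].tolist()     # bounding box corners in pixels
        center_x = (x1 + x2) / 2.0                # horizontal position in the image
        center_y = (y1 + y2) / 2.0                # vertical position in the image
        self.get_logger().info(
            f"{class_name}: conf={confidence:.2f}, center=({center_x:.0f}, {center_y:.0f})"
        )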

Summary

Use ROS2 and YOLO to detect objects in real-time from robot camera images!