        YOLOv9: Advancements in Real-time Object Detection (2024)

        June 12, 2024

        Contents

        1. What is YOLOv9?
        2. YOLO Version History
        3. Architecture YOLOv9
        4. YOLOv9 License
        5. Advantages of YOLOv9
        6. YOLOv9 Applications
        7. YOLOv9: Main Takeaways


        The latest installment in the YOLO series, YOLOv9, was released on February 21, 2024. Since its inception in 2015, the YOLO (You Only Look Once) object-detection algorithm has been closely followed by tech enthusiasts, data scientists, ML engineers, and more, gaining a massive following thanks to its open-source nature and community contributions. With every new release, the YOLO architecture becomes easier to use and faster, lowering the barrier to entry for people around the world.

        YOLO was introduced in a research paper by J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. It marked a step forward in real-time object detection, outperforming its predecessor, the Region-based Convolutional Neural Network (R-CNN). It is a single-pass algorithm: one neural network takes a full image as input and predicts bounding boxes and class probabilities.

        Explore ProX Micro Edge Devices at proxpc.com


        What is YOLOv9?

        YOLOv9 is the latest version of YOLO, released in February 2024 by Chien-Yao Wang, I-Hau Yeh, and Hong-Yuan Mark Liao. It is an improved real-time object detection model that aims to surpass all convolution-based and transformer-based methods.

        YOLOv9 is released in four models, ordered by parameter count: v9-S, v9-M, v9-C, and v9-E. To improve accuracy, it introduces programmable gradient information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). PGI prevents data loss and ensures accurate gradient updates, while GELAN optimizes lightweight models with gradient path planning.

        At this time, the only computer vision task supported by YOLOv9 is object detection.

        YOLOv9 concept proposed in the paper: YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information 


        YOLO Version History

        Before diving into the YOLOv9 specifics, let’s briefly recap the other YOLO versions available today.

        YOLOv1

        The YOLOv1 architecture surpassed R-CNN with a mean average precision (mAP) of 63.4 and an inference speed of 45 FPS on the open-source Pascal VOC 2007 dataset. YOLOv1 treats object detection as a regression task, predicting bounding boxes and class probabilities from a single pass over an image.
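To make the regression framing concrete, here is a small sketch of the size of the prediction tensor such a single-pass detector emits. It assumes the configuration described in the original YOLO paper for Pascal VOC (a 7×7 grid, 2 boxes per cell, 20 classes); the function name is ours and purely illustrative.

```python
def yolo_v1_output_shape(grid_size=7, boxes_per_cell=2, num_classes=20):
    """Return the (S, S, depth) shape of a YOLOv1-style prediction tensor.

    Each grid cell predicts `boxes_per_cell` boxes with 5 numbers each
    (x, y, w, h, confidence) plus one shared set of class probabilities.
    Defaults are the assumed Pascal VOC settings from the original paper.
    """
    depth = boxes_per_cell * 5 + num_classes
    return (grid_size, grid_size, depth)


shape = yolo_v1_output_shape()
total_values = shape[0] * shape[1] * shape[2]
print(shape, total_values)  # (7, 7, 30) 1470
```

The whole detection result for an image is therefore one fixed-size block of numbers, which is why a single forward pass is enough.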

        YOLOv2

        Released in 2016, it could detect 9000+ object categories. YOLOv2 introduced anchor boxes – predefined bounding boxes called priors that the model uses to pin down the ideal position of an object. YOLOv2 achieved 76.8 mAP at 67 FPS on the VOC 2007 dataset.
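As an illustration of how anchor boxes (priors) are used, the sketch below decodes one raw prediction into a box, following the parameterization described in the YOLOv2 paper: the center is a sigmoid offset inside the grid cell, and the width and height scale the prior exponentially. The input values are made up, and all quantities are in grid-cell units.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_anchor_box(raw, cell_xy, anchor_wh):
    """Turn raw network outputs (tx, ty, tw, th) into a box (bx, by, bw, bh).

    cell_xy   -- top-left corner of the grid cell that made the prediction
    anchor_wh -- width and height of the prior (anchor box)
    """
    tx, ty, tw, th = raw
    cx, cy = cell_xy
    pw, ph = anchor_wh
    bx = sigmoid(tx) + cx    # center stays inside the predicting cell
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)   # the prior is stretched or shrunk, never negative
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# All-zero outputs reproduce the prior, centered in the cell.
box = decode_anchor_box((0.0, 0.0, 0.0, 0.0), cell_xy=(3, 4), anchor_wh=(2.0, 1.5))
print(box)  # (3.5, 4.5, 2.0, 1.5)
```

Because the network only has to learn small corrections to a sensible prior, training tends to be more stable than predicting box sizes from scratch.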

        YOLOv3

        The authors released YOLOv3 in 2018. It boasted higher accuracy than previous versions, with an mAP of 28.2 at 22 milliseconds per image (measured on the COCO dataset rather than VOC). The model uses Darknet-53 as its backbone and, to predict classes, replaces softmax with independent logistic classifiers trained with binary cross-entropy (BCE) loss.
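The difference between independent logistic classifiers and softmax can be seen in a few lines. This is a plain-Python sketch, not YOLOv3's actual code; the logits and the class names in the comments are invented for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bce_loss(logits, targets):
    """Mean binary cross-entropy over independent per-class sigmoids."""
    eps = 1e-12  # guards against log(0)
    losses = []
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        losses.append(-(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)))
    return sum(losses) / len(losses)


# Hypothetical scores for overlapping labels: "person", "woman", "car".
logits = [2.0, 2.0, -3.0]
independent = [sigmoid(z) for z in logits]  # each class judged on its own
competing = softmax(logits)                 # classes compete for one unit of probability

print([round(p, 2) for p in independent])  # [0.88, 0.88, 0.05]
print([round(p, 2) for p in competing])    # [0.5, 0.5, 0.0]
```

With sigmoids, two overlapping labels can both score high; with softmax they split the probability mass, which is why per-class logistic outputs suit multi-label data.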

        YOLOv3 application for a smart refrigerator in gastronomy and restaurants


        YOLOv4

        In 2020, Alexey Bochkovskiy et al. released YOLOv4, introducing the concepts of a Bag of Freebies (BoF) and a Bag of Specials (BoS). BoF is a set of training-time techniques, such as data augmentation, that increase accuracy at no additional inference cost, while BoS is a set of modules that significantly enhance accuracy with a slight increase in inference cost. The model achieved 43.5 mAP at 65 FPS on the COCO dataset.

        YOLOv5

        Ultralytics released YOLOv5 in 2020 as well, without an official research paper. The model is easy to train since it is implemented in PyTorch. Its architecture uses a Cross-stage Partial (CSP) connection block as the backbone, which improves gradient flow and reduces computational cost. YOLOv5 uses YAML files instead of CFG files for model configuration.

        Small object detection with YOLOv5 in traffic analysis with computer vision


        YOLOv6

        YOLOv6 is another unofficial version introduced in 2022 by Meituan – a Chinese shopping platform. The company targeted the model for industrial applications with better performance than its predecessor. The changes resulted in YOLOv6n achieving an mAP of 37.5 at 1187 FPS on the COCO dataset and YOLOv6s achieving 45 mAP at 484 FPS.

        YOLOv7

        In July 2022, a group of researchers released the open-source model YOLOv7, at the time the fastest and most accurate real-time object detector, with an AP of 56.8% at speeds ranging from 5 to 160 FPS. YOLOv7 is based on the Extended Efficient Layer Aggregation Network (E-ELAN), which improves training by letting the model learn diverse features with efficient computation.

        Applied AI system trained for aircraft detection with YOLOv7


        YOLOv8

        YOLOv8 has no official paper (as with YOLOv5) but boasts higher accuracy and faster speed for state-of-the-art performance. For instance, YOLOv8m has a 50.2 mAP score at 1.83 milliseconds per image on the MS COCO dataset, measured on an A100 GPU with TensorRT. YOLOv8 also ships as a Python package with a CLI, making it easy to use and develop with.
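The speed figures in this history mix two units: frames per second (FPS) and milliseconds per image. The helper below converts between them so the versions are easier to compare. It assumes one image at a time and that the quoted latency covers the whole per-image cost, which the sources do not always spell out.

```python
def latency_ms_to_fps(latency_ms):
    """Convert per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def fps_to_latency_ms(fps):
    """Convert frames per second to per-image latency in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


print(round(latency_ms_to_fps(1.83)))    # 546
print(round(latency_ms_to_fps(22)))      # 45
print(round(fps_to_latency_ms(45), 1))   # 22.2
```

Keep in mind that the figures were measured on different hardware and datasets, so converted numbers are only a rough guide.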

        Segmentation with YOLOv8 applied in smart cities for pothole detection.


        Since YOLOv9’s February 2024 release, another team of researchers has released YOLOv10 (May 2024), for real-time object detection.


        Architecture YOLOv9

        To address the information bottleneck (data loss in the feed-forward process), the YOLOv9 creators propose a new concept: programmable gradient information (PGI). The model generates reliable gradients via an auxiliary reversible branch. The deep features of the main branch still carry out the target task, while the auxiliary branch avoids the semantic loss caused by multi-path features.

        The authors achieved the best training results by applying PGI propagation at different semantic levels. Because the reversible architecture of PGI is built on the auxiliary branch, it adds no inference cost. Since PGI can freely select a loss function suitable for the target task, it also overcomes the problems encountered by mask modeling.
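To give an intuition for what "reversible" means here, the sketch below contrasts a lossy layer with a reversible one. It uses an additive coupling in the style of reversible residual networks, which is a stand-in chosen for illustration and not the auxiliary branch from the YOLOv9 paper; the functions f and g are arbitrary.

```python
def lossy_layer(x):
    """A bottleneck: different inputs can collapse to the same output."""
    return [sum(x)]


def f(x):
    return [3 * v + 1 for v in x]  # arbitrary transform, need not be invertible


def g(x):
    return [v * v for v in x]      # arbitrary transform, need not be invertible


def reversible_forward(x1, x2):
    """Additive coupling: the input is split in two halves and mixed."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def reversible_inverse(y1, y2):
    """Undo the coupling step by step, recovering the exact input."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


print(lossy_layer([1, 2]) == lossy_layer([2, 1]))  # True: information is lost
y1, y2 = reversible_forward([1, 2], [3, 4])
print(reversible_inverse(y1, y2))                  # ([1, 2], [3, 4])
```

When nothing about the input is lost on the way forward, the gradients computed from the output still reflect the full input, which is the property the auxiliary branch is meant to supply during training.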

        The proposed PGI mechanism can be applied to deep neural networks of various sizes. In the paper, the authors designed a generalized ELAN (GELAN) that simultaneously takes into account the number of parameters, computational complexity, accuracy, and inference speed. The design allows users to choose appropriate computational blocks arbitrarily for different inference devices.

        YOLOv9 GELAN Architecture 

        Using the proposed PGI and GELAN, the authors designed YOLOv9. They ran their experiments on the MS COCO dataset, and the results verified that YOLOv9 achieved the top performance in all comparisons.


        Research Contributions

        1. Theoretical analysis of deep neural network architecture from the perspective of reversible function. The authors designed PGI and auxiliary reversible branches based on this analysis and achieved excellent results.
        2. The designed PGI solves the problem that deep supervision can only be used for extremely deep neural network architectures. Thus, it allows new lightweight architectures to be truly applied in daily life.
        3. The GELAN network uses only conventional convolution yet achieves better parameter utilization than depth-wise convolution designs, which makes it light, fast, and accurate.
        4. Combining the proposed PGI and GELAN, the object detection performance of the YOLOv9 on the MS COCO dataset largely surpasses the existing real-time object detectors in all aspects.

        Performance of YOLOv9 against other object detection models on COCO dataset 


        YOLOv9 License

        YOLOv9 was initially released without an official license. A few days later, however, WongKinYiu updated the repository with a GPL-3.0 license. Both YOLOv7 and YOLOv9 are published under WongKinYiu’s GitHub account.

        Advantages of YOLOv9


        YOLOv9 is a powerful model whose innovations are likely to play an important role in the further development of object detection, and maybe even image segmentation and classification down the road. It offers faster, more accurate, and more flexible detection. Its advantages include:

        • Handling the information bottleneck and adapting deep supervision to lightweight neural network architectures by introducing Programmable Gradient Information (PGI).
        • Creating GELAN, a practical and effective neural network. GELAN has shown strong and stable performance in object detection tasks across different convolution and depth settings, and it could be widely adopted as a model suitable for various inference configurations.
        • Combining PGI and GELAN, YOLOv9 has shown strong competitiveness. Its design allows the deep model to reduce the number of parameters by 49% and the number of calculations by 43% compared with YOLOv8, while still achieving a 0.6% Average Precision improvement on the MS COCO dataset.
        • YOLOv9 is superior to RT-DETR and YOLO-MS in terms of accuracy and efficiency. It sets new standards in lightweight model performance by applying conventional convolution for better parameter utilization.

        Model         #Param. (M)  FLOPs (G)  AP50:95 val  APS val  APM val  APL val
        YOLOv7 [63]   36.9         104.7      51.2%        31.8%    55.5%    65.0%
        + AF [63]     43.6         130.5      53.0%        35.8%    58.7%    68.9%
        + GELAN       41.7         127.9      53.2%        36.2%    58.5%    69.9%
        + DHLC [34]   58.1         192.5      55.0%        38.0%    60.6%    70.9%
        + PGI         58.1         192.5      55.6%        40.2%    61.0%    71.4%

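        The table above shows how average precision (AP) changes as components are added to the YOLOv7 baseline. Each row marked with + builds on the row above it, and the final step, adding PGI, raises AP from 55.0% to 55.6% without changing the parameter count or FLOPs.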
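The gains in the table are easier to see as differences. The short script below recomputes them from the parameter, FLOPs, and AP50:95 columns as quoted above. Reading the rows as cumulative steps is our interpretation of the + prefix, and the parameter and FLOPs units (millions and billions) are assumed.

```python
# (name, parameters in millions, FLOPs in billions, AP50:95 in percent),
# copied from the table above.
rows = [
    ("YOLOv7", 36.9, 104.7, 51.2),
    ("+ AF", 43.6, 130.5, 53.0),
    ("+ GELAN", 41.7, 127.9, 53.2),
    ("+ DHLC", 58.1, 192.5, 55.0),
    ("+ PGI", 58.1, 192.5, 55.6),
]


def ap_gains(table):
    """Return (name, gain over previous row, gain over baseline) per row."""
    baseline = table[0][3]
    previous = baseline
    gains = []
    for name, _params, _flops, ap in table:
        gains.append((name, round(ap - previous, 1), round(ap - baseline, 1)))
        previous = ap
    return gains


for name, step, total in ap_gains(rows):
    print(f"{name:<8} step {step:+.1f}  total {total:+.1f}")
# Last line printed: "+ PGI    step +0.6  total +4.4"
```

Note that the "+ PGI" row has the same parameter count and FLOPs as the row before it, so its 0.6-point gain comes from training alone.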


        YOLOv9 Applications

        YOLOv9 is a flexible computer vision model that you can use in different real-world applications. Here we suggest a few popular use cases.

        YOLOv9 object detection for detecting customers in check-out queues

        • Logistics and distribution: Object detection can assist in estimating product inventory levels to ensure sufficient stock levels and provide information regarding consumer behavior.
        • Autonomous vehicles: Self-driving cars can use YOLOv9 object detection to recognize other road users and obstacles and navigate safely.
        • People counting: Retailers and shopping malls can train the model to detect real-time foot traffic in their shops, detect queue length, and more.
        • Sports analytics: Analysts can use the model to track player movements in a sports field to gather relevant insights regarding team performance.
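As a sketch of the people-counting use case, the function below counts confident "person" detections whose box center falls inside a queue zone. The detection format (dictionaries with label, confidence, and box as x1, y1, x2, y2 pixels), the zone coordinates, and the threshold are all hypothetical; a real pipeline would map the detector's own output into this shape first.

```python
def count_people_in_zone(detections, zone, min_confidence=0.5):
    """Count 'person' detections whose box center lies inside `zone`.

    detections -- list of dicts with "label", "confidence", "box" (x1, y1, x2, y2)
    zone       -- rectangle (x1, y1, x2, y2) marking the queue area in pixels
    """
    zx1, zy1, zx2, zy2 = zone
    count = 0
    for det in detections:
        if det["label"] != "person" or det["confidence"] < min_confidence:
            continue
        x1, y1, x2, y2 = det["box"]
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        if zx1 <= cx <= zx2 and zy1 <= cy <= zy2:
            count += 1
    return count


# Made-up detections for a single frame.
detections = [
    {"label": "person", "confidence": 0.91, "box": (100, 200, 160, 380)},
    {"label": "person", "confidence": 0.88, "box": (220, 210, 280, 390)},
    {"label": "person", "confidence": 0.30, "box": (300, 200, 350, 380)},  # too uncertain
    {"label": "person", "confidence": 0.95, "box": (600, 100, 660, 300)},  # outside the zone
    {"label": "handbag", "confidence": 0.80, "box": (150, 300, 180, 340)},  # not a person
]
queue_zone = (50, 150, 400, 450)
print(count_people_in_zone(detections, queue_zone))  # 2
```

Running this per frame and smoothing the count over a few seconds gives a simple queue-length signal.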

        Street view detection with YOLOv9


        YOLOv9: Main Takeaways

        The YOLO models are the standard in the object detection space with their great performance and wide applicability. Here are our first conclusions about YOLOv9:

        • Ease of use: YOLOv9 is available on GitHub, so users can run it quickly from the CLI or a Python IDE.
        • YOLOv9 tasks: YOLOv9 is efficient for real-time object detection with improved accuracy and speed.
        • YOLOv9 improvements: YOLOv9’s main improvements are programmable gradient information (PGI), which supplies reliable gradients through an auxiliary reversible branch, and the GELAN architecture, which makes better use of parameters with conventional convolution.

        In the future, we look forward to seeing if the creators will expand YOLOv9 capabilities to a wide range of other computer vision tasks as well.

        ProX PC is an end-to-end platform for computer vision. It offers a host of pre-trained models to choose from, as well as the option to import or train your own custom AI models. To learn how you can solve your industry’s challenges with computer vision, book a demo with ProX PC.

        For more info visit www.proxpc.com
