
Training a YOLO Model from scratch (20 min prep, bake for 1+ hours)

An introductory guide on how to quickly train a computer vision model

Disclaimer

I do not claim to be an expert in Machine Learning. This is an introductory project and I am sharing my experiences. Use at your own risk.

Project requirements

  1. Python 3.11
  2. OpenCV, Roboflow
  3. Moderately fast GPU

Getting started

  1. Picking a model:

    There are several variations of YOLO. At the time of writing this article, the most advanced version is the newly released YOLO11 (often written YOLOv11). Newer models can come with additional features, such as built-in object tracking, or higher accuracy on benchmarks. In my case I opted to use YOLOv9m, which is a mid-size model. Most YOLO versions ship in several sizes, for example n, s, m, l, and x for YOLOv8. Larger models can reach higher accuracy, at the cost of slower training and inference.

  2. Find a dataset, or collect your own.

    Typically, it is much easier to find an existing dataset than to create your own. Creating one involves careful planning and data processing. In my case, I found the dataset I needed on Roboflow. Once you have found one, you can ask Roboflow for the code snippet that downloads it.

  3. Pulling the dataset:

    To pull the selected dataset, some public dataset repos will provide an API key. For Roboflow, we can use:

        from roboflow import Roboflow
    
        # API key
        rf = Roboflow(api_key="YOUR_API_KEY")
    
        # Specify your project and workspace
        project = rf.workspace("your-workspace-name").project("your-project-name")
    
        # Pick the dataset version and the export format
        dataset = project.version("1").download("yolov8")
    
  4. Assess and understand the dataset through the included .yaml file. The YAML file points to the train, validation, and test sets pulled from Roboflow.

  5. Training
    Training a model is quite straightforward with YOLO. YOLO-compatible datasets include a data.yaml file that points to the training, validation, and test sets and lists the class names.

    
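    from ultralytics import YOLO

    # Start from pretrained weights. 'yolov8n.pt' is the nano YOLOv8 checkpoint;
    # swap in the checkpoint for the model you picked (e.g. 'yolov9m.pt').
    model = YOLO('yolov8n.pt')

    model.train(
        data='path/to/data.yaml',   # dataset config pulled from Roboflow
        epochs=20,                  # full passes over the training set
        imgsz=640,                  # images are resized to this size
        batch=16,                   # images per training step
        name='custom_yolov8_model'  # run name, used for the output folder
    )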
    
  6. Predict

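    from ultralytics import YOLO

    # Load the best weights from the training run. Detection runs are
    # saved under runs/detect/<name> by default.
    model = YOLO('runs/detect/custom_yolov8_model/weights/best.pt')

    results = model.predict(
        source='path/to/image_or_video.jpg',
        conf=0.5,   # minimum confidence for a detection to be kept
        save=True   # write the annotated output to disk
    )

    # One Results object per image or frame; print the detected boxes
    for result in results:
        print(result.boxes)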
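
As a sanity check for step 4, the sketch below counts images, label files, and boxes per class for each split. It assumes the layout that Roboflow's YOLO exports typically use (train, valid, and test folders, each with images and labels subfolders) and the standard YOLO label format, where every line of a .txt file starts with a class id. The names summarize_split and summarize_dataset are made up for this example, and the folder names may differ for your dataset.

```python
from collections import Counter
from pathlib import Path

# Image extensions to count; extend this set if your dataset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def summarize_split(split_dir: Path) -> dict:
    """Count images, label files, and boxes per class id for one split."""
    images = [
        p for p in (split_dir / "images").glob("*")
        if p.suffix.lower() in IMAGE_SUFFIXES
    ]
    label_files = sorted((split_dir / "labels").glob("*.txt"))

    class_counts = Counter()
    for label_file in label_files:
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                class_counts[int(fields[0])] += 1  # first field is the class id

    return {
        "images": len(images),
        "label_files": len(label_files),
        "boxes_per_class": dict(class_counts),
    }


def summarize_dataset(root: str, splits=("train", "valid", "test")) -> dict:
    """Summarize every split folder that exists under the dataset root."""
    summary = {}
    for split in splits:
        split_dir = Path(root) / split
        if split_dir.is_dir():
            summary[split] = summarize_split(split_dir)
    return summary


if __name__ == "__main__":
    # Hypothetical path: point this at the folder the dataset was downloaded to.
    for split, stats in summarize_dataset("path/to/dataset").items():
        print(split, stats)
```

Comparing the image and label counts per split is a quick way to spot missing annotations, and a lopsided boxes_per_class is an early hint that some classes may be underrepresented.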

Bonus Round

Using Google Colab we can speed up the process significantly, thanks to the fast GPUs it offered at the time of writing.

  1. Create a Colab account and acquire the necessary credits based on the estimated training time. This can vary significantly; in my case, the entire dataset totalled nearly 30,000 images of 600x400 pixels.

  2. In Colab, import the Ultralytics YOLO and Roboflow Python packages:


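    # Install the packages first (notebook shell command), then import them
    !pip install roboflow ultralytics
    from roboflow import Roboflow
    from ultralytics import YOLO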

  3. Pull the selected dataset using the Roboflow snippet from earlier. (This will download the dataset into the notebook's working directory.)

  4. Check the CUDA version and ensure that the necessary pieces are installed. You can use the following snippet in Colab to check this:

    import torch
    
    def check_cuda_and_install():
        # Check if CUDA is available
        if torch.cuda.is_available():
            cuda_version = torch.version.cuda
            print(f"CUDA is available. CUDA version: {cuda_version}")
        else:
            print("CUDA is not available on this system. Please ensure you are using a GPU runtime in Colab.")
            return
    
        # Check PyTorch version
        torch_version = torch.__version__
        print(f"PyTorch version: {torch_version}")
    
        # Verify GPU details
        device_count = torch.cuda.device_count()
        print(f"Number of GPUs available: {device_count}")
        if device_count > 0:
            for i in range(device_count):
                print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
        else:
            print("No GPUs are available.")
    
    # Ensure GPU runtime is selected
    !nvidia-smi
    
    # Check CUDA and PyTorch setup
    check_cuda_and_install()
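
For the credit estimate in step 1 of the Bonus Round, a rough back-of-the-envelope calculation can help. The sketch below assumes one training iteration per batch and a per-iteration time that you measure yourself from the first few minutes of training. The function estimate_training and the 0.2 seconds used here are hypothetical placeholders, not measurements. It also treats all 30,000 images as training images, which overstates the real training split.

```python
import math


def estimate_training(num_images: int, batch: int, epochs: int,
                      seconds_per_iteration: float) -> dict:
    """Rough training-length estimate from dataset size and batch settings."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    iterations_per_epoch = math.ceil(num_images / batch)
    total_iterations = iterations_per_epoch * epochs
    hours = total_iterations * seconds_per_iteration / 3600
    return {
        "iterations_per_epoch": iterations_per_epoch,
        "total_iterations": total_iterations,
        "hours": hours,
    }


if __name__ == "__main__":
    # batch=16 and epochs=20 match the training call from step 5;
    # 0.2 s per iteration is a made-up placeholder, measure your own.
    estimate = estimate_training(30_000, batch=16, epochs=20,
                                 seconds_per_iteration=0.2)
    print(estimate["iterations_per_epoch"])  # 1875
    print(estimate["total_iterations"])      # 37500
    print(f"{estimate['hours']:.1f} hours")
```

The estimate ignores validation passes, data loading stalls, and warm-up, so it is best read as a lower bound when deciding how many credits to buy.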