Building Your Own Real-Time Object Detection App: Roboflow(YOLOv8) and Streamlit (Part 1)

Create an image dataset from scratch

10 min read · Jul 30, 2023

Introduction

Object detection is a computer vision task with applications across many industries. It goes beyond traditional image classification, where a model assigns a single label to an entire image: a detector identifies and locates multiple objects within an image, usually marking each one with a bounding box.

When working on custom models for object detection or other machine learning tasks, one of the challenges that researchers and developers may encounter is the lack of a suitable dataset. Overcoming this often requires creativity and resourcefulness, so this post focuses on how to create your own custom dataset.

Labels in an image (Image from Roboflow)

How the 3 Parts of This Blog Series Are Organised

In this series, we will build a YOLOv8 object detection model on a custom dataset and the corresponding application using Streamlit. The main goal of this project is to provide a simple and efficient implementation of real-time object detection that can be easily customized and integrated into other applications.

This blog series is divided into the following three Parts.

Part 1: Introduction and Setup for Roboflow

Welcome to Part 1 of our three-part tutorial series on Building Your Own Real-Time Object Detection App: Roboflow(YOLOv8) and Streamlit. In this series, we will walk you through the process of building an end-to-end object detection app that can identify objects in a photo. The web app works only with images because it is hosted on share.streamlit.io, the Streamlit project hub where you can publish your Streamlit projects for free. The hub limits each app to 1 GB of memory, and a few of the libraries we need take up much of that space. In another post or series I’ll add video and webcam functions to complement this app.

In Part 1, we will introduce the project, give you a demo of the app in action, and explain why I chose Roboflow and Streamlit for this project. We will also guide you through the setup process, including installing dependencies and creating the necessary files and directories.

By the end of this series, you will have the skills to build your own object detection app. So, let’s dive in!

Demo of the Object Detection App

This is a demo of the web app that we are going to build together and deploy on the Streamlit share cloud. In the Object Detection app, you upload an image and the app shows the detected objects.

Object Detection

Object detection is a computer vision solution that identifies instances of objects in visual media. Object detection scripts draw a bounding box around an instance of a detected object, paired with a label to represent the contents of the box. For example, a person in an image might be labeled “person” and a car might be labeled “vehicle”.
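
To make the idea concrete, here is a minimal sketch of how the detections for one image could be represented in Python. The Detection class and its field names are hypothetical and chosen only for illustration; they are not the structures that YOLOv8 or Roboflow actually return.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str         # class name, e.g. "person"
    confidence: float  # model score between 0 and 1
    x_min: int         # left edge of the bounding box, in pixels
    y_min: int         # top edge
    x_max: int         # right edge
    y_max: int         # bottom edge

    def area(self) -> int:
        """Area of the bounding box in square pixels."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


# Unlike classification, a single image can yield several detections.
detections = [
    Detection("person", 0.91, 40, 30, 200, 420),
    Detection("vehicle", 0.78, 220, 180, 600, 400),
]

for det in detections:
    print(f"{det.label} ({det.confidence:.0%}) area={det.area()} px")
```

Each entry pairs a label with the box that locates it, which is exactly the extra information a detector provides over a classifier.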

What is YOLOv8?

YOLOv8 is the newest state-of-the-art YOLO model at the time of writing, and it can be used for object detection, image classification, and instance segmentation tasks. YOLOv8 was developed by Ultralytics, and it is the model used with Roboflow in this series.

Why Should I Use YOLOv8?

Here are a few main reasons why you should consider using YOLOv8 for your next computer vision project:

YOLOv8 has a high rate of accuracy, as measured on COCO and Roboflow 100.
YOLOv8 comes with a lot of developer-convenience features, including a well-structured Python package.
The Roboflow labeling tool is easy to use, and you don’t need to install anything to use it.
Last but not least, it is not difficult to run, and training is faster than using a notebook with TensorFlow. In my case, training the model in Google Colab took 3 hours, but with Roboflow it took a few minutes.

Why Streamlit is a Good Choice for Building a ML App

Streamlit makes it easy to build web-based user interfaces for machine learning applications, enabling data scientists and developers to share their work with non-technical stakeholders.

Streamlit is an open-source framework that simplifies the process of building web applications in Python. It also has its own project cloud, which makes it really easy to deploy your project.

Project Setup: Installing Dependencies and Creating Required Files and Directories

Before diving into the project, make sure you have the following dependencies installed on your system. I’m a Windows user, so everything in this tutorial was working on Windows 11 as of July 2023.

For this project I have Python 3.11. Streamlit cloud only offers versions 3.8 to 3.11, so I recommend using a version in that range. The Python packages that we will use are PyTorch, Ultralytics and Streamlit. We can install these packages using pip into a separate virtual environment.
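
If you are not sure which interpreter you are running, a small check like the following sketch can help. The 3.8 to 3.11 range is simply the one quoted in this post for July 2023, so treat it as an assumption and adjust it if Streamlit cloud has changed since then. The function name is made up for this example.

```python
import sys

# Range quoted in this post for Streamlit cloud (July 2023); an assumption, not a guarantee.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 11)


def is_supported(version=None):
    """Return True if the (major, minor) version falls inside the range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "inside" if is_supported() else "outside"
    print(f"Python {current} is {status} the 3.8 to 3.11 range")
```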

Creating Virtual Environment

When working on a Python project, it’s important to keep your dependencies separate from your global Python environment to prevent conflicts between different projects, especially with PyTorch.

Make sure you have already installed Python, VS Code (or another IDE) and Git. Then follow the next steps:

Create a new virtual environment by running the following command in the terminal. The word after venv is the name of the environment, so you can name it as you wish:

python -m venv env

Then activate the environment:

env\Scripts\activate
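
To confirm that the activation worked, you can ask Python itself. This is a minimal sketch that relies on the standard sys module; the helper name is made up for this example. Save it to a file and run it with the python command from the same terminal.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base Python install.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
print("Interpreter location:", sys.executable)
```

If the first line prints False, the packages you install with pip will go to your global Python instead of the environment.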

The first step is getting our dataset (the images folder). I recommend having at least 200 images. The more pictures you have, the better your model becomes, but don’t use nearly identical pictures. I’m using 4 different hand sign postures, which means around 50 photos per posture. Taking them by hand with any device can take a lot of time, so let’s create an environment only for the script that will take photos with our webcam. In this environment we only need to install OpenCV, so run in your terminal:

pip install opencv-python

Now you can run the following script. You can modify the labels list: each label is used to create a folder, and the script takes the number of images that you declared for it. After finishing with the first label, it continues with the next one until it reaches the end of the list, and it displays a window showing what it is capturing. You can also modify the time between shots and the waiting time before each label starts. Start taking pictures:

import os
import time
import uuid

import cv2

labels = ['thumbsup', 'hi', 'loveyou', 'livelong']  # modify the labels as you need
number_imgs = 20  # number of images to take for each label

IMAGES_PATH = 'images'

# Create one folder per label (this also creates IMAGES_PATH if it is missing)
for label in labels:
    os.makedirs(os.path.join(IMAGES_PATH, label), exist_ok=True)

# Open the default webcam once and reuse it for every label
cap = cv2.VideoCapture(0)

for label in labels:  # loop that takes the pictures for each label
    print('Collecting images for {}'.format(label))
    time.sleep(10)  # time to get ready before the pictures start
    for imgnum in range(number_imgs):
        print('Collecting image {}'.format(imgnum))
        ret, frame = cap.read()
        if not ret:  # the camera returned no frame, so skip this shot
            print('Could not read a frame from the camera')
            continue
        imgname = os.path.join(IMAGES_PATH, label, '{}.{}.jpg'.format(label, uuid.uuid1()))
        cv2.imwrite(imgname, frame)
        cv2.imshow('frame', frame)

        # waitKey lets the preview window refresh; press q to skip to the next label
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        time.sleep(2)  # time between each camera shot

cap.release()
cv2.destroyAllWindows()
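
Once the capture script has finished, it is worth checking how many pictures ended up in each folder before moving on. The following is a small sketch that assumes the same images folder and label names used above; the count_images helper is made up for this example.

```python
import os

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def count_images(images_path, labels):
    """Return a dict mapping each label to the number of image files in its folder."""
    counts = {}
    for label in labels:
        folder = os.path.join(images_path, label)
        if not os.path.isdir(folder):
            counts[label] = 0  # folder missing: nothing captured yet
            continue
        counts[label] = sum(
            1 for name in os.listdir(folder)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


if __name__ == "__main__":
    labels = ['thumbsup', 'hi', 'loveyou', 'livelong']
    counts = count_images('images', labels)
    for label, total in counts.items():
        print('{}: {} images'.format(label, total))
    print('Total: {} images'.format(sum(counts.values())))
```

If the total is below the 200 images recommended earlier, you can run the capture script again to add more.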

At this point we have the number of images that we need, but each file name contains a random identifier, so we rename the files to make each image easier to identify. The next script renames the images in a single folder, so run it once for each folder in your project.

import glob
import os

def rename_images(folder_path, prefix):
    # Get a list of all image files in the folder (no need to change directory)
    image_files = []
    for pattern in ("*.jpg", "*.jpeg", "*.png", "*.gif"):
        image_files.extend(glob.glob(os.path.join(folder_path, pattern)))

    # Sort the list of image files alphabetically
    image_files.sort()

    # Rename each image file with a sequential number, starting at 1
    for counter, old_path in enumerate(image_files, start=1):
        # Get the file extension
        extension = os.path.splitext(old_path)[1]

        # Create the new name with the desired format (e.g., "hi_1.jpg", "hi_2.jpg", etc.)
        new_path = os.path.join(folder_path, f"{prefix}_{counter}{extension}")

        # Rename the file
        os.rename(old_path, new_path)

if __name__ == "__main__":
    folder_path = os.path.join("images", "hi")  # change the path for every folder
    rename_images(folder_path, prefix="hi")  # change the prefix to match the label
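
The script above handles one folder per run. If you prefer to process every label folder in one go, a variation like the following sketch could work. It assumes that each subfolder of images is named after its label, and it uses the folder name as the file prefix. The rename_all_labels function is made up for this example and is not part of the original script.

```python
import os
import uuid

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.gif')


def rename_all_labels(images_path):
    """Rename the images in every label subfolder to <label>_<n><ext>."""
    renamed = {}
    for label in sorted(os.listdir(images_path)):
        folder = os.path.join(images_path, label)
        if not os.path.isdir(folder):
            continue  # ignore loose files next to the label folders

        names = sorted(
            name for name in os.listdir(folder)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )

        # Pass 1: move every file to a unique temporary name so that a final
        # name can never collide with a file that has not been renamed yet.
        temp_names = []
        for name in names:
            extension = os.path.splitext(name)[1]
            temp_name = 'tmp_{}{}'.format(uuid.uuid4().hex, extension)
            os.rename(os.path.join(folder, name), os.path.join(folder, temp_name))
            temp_names.append(temp_name)

        # Pass 2: assign the sequential names, using the folder name as prefix.
        for counter, temp_name in enumerate(temp_names, start=1):
            extension = os.path.splitext(temp_name)[1]
            new_name = '{}_{}{}'.format(label, counter, extension)
            os.rename(os.path.join(folder, temp_name), os.path.join(folder, new_name))

        renamed[label] = len(temp_names)
    return renamed


if __name__ == "__main__":
    if os.path.isdir('images'):
        print(rename_all_labels('images'))
```

The two passes make the function safe to run more than once on the same folder, because a file is never renamed onto a name that another image still holds.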

Create a project with Roboflow

Building a custom dataset can be a painful process. It might take dozens or even hundreds of hours to collect images, label them, and export them in the proper format. Fortunately, Roboflow makes this process straightforward. If you only have images, you can label them in Roboflow Annotate. (When starting from scratch, consider annotating large batches of images via API or use the model-assisted labeling tool to speed things up.)

Before you start, you need to create a Roboflow account. Once you do that, you can create a new project in the Roboflow dashboard.

Make sure to choose the right project type. In this case, choose Object Detection.

Upload your images

Add data to your newly created project. You can do it through the web interface. If you don’t have a dataset, you can grab one from Roboflow Universe.

If you drag and drop a directory with a dataset in a supported format, the Roboflow dashboard will automatically read the images and annotations together. To create a dataset with annotations locally in Windows, check this post.

After all the images are uploaded, you can click Save and Continue.

A pop-up window will then appear, and you only need to click Assign Images. If you are working with a team, this is where you can invite them to add or label images.

Then click Start Annotating to use the Roboflow labeling tool, in case you uploaded images only.
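
As an illustration of what a local annotation can look like, here is a sketch for the YOLO text format, where each line of a label file holds a class index followed by the box centre, width and height, all normalized by the image size. It assumes your local annotations use that format, which is only one of several possibilities. The to_yolo_line helper is made up for this example.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, image_width, image_height):
    """Convert a pixel bounding box to one line of a YOLO-style label file."""
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return '{} {:.6f} {:.6f} {:.6f} {:.6f}'.format(class_id, x_center, y_center, width, height)


# A 200x100 px box whose top-left corner is at (100, 50) in a 640x480 image.
print(to_yolo_line(0, 100, 50, 300, 150, 640, 480))
```

If you label your images in Roboflow instead, you don’t need to write these files yourself.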

Label your images

Use the tool to draw a box around each element and assign it one of the classes that you are going to use in your model. Repeat the same process for all the images.

After you finish labeling all the images, click the back button highlighted in red in the image below.

Now we can add all the images to the Dataset with the button Add n Image to the Dataset.

Now the Add Images option will appear. You can choose between different options; I recommend using the default one.

After our images are loaded into the dataset, another window will appear. Make sure that there are no UNASSIGNED images and that the dataset is ready. Once it looks similar to the image below, you can click Generate New Version.

When we generate a new version, we can use some tools to prepare the data and experiment with it. Go to option 3.

In this option we can apply transformations to all the images, so make sure to configure this depending on your project. Maybe you are using a camera on a Raspberry Pi, or maybe you want to use images with a specific format. For my project this configuration is perfect.

Option 4 is an amazing tool because it generates extra versions of your images, which can double or triple the size of the dataset in the free version. Let’s see the options.

For this project I’ll use the horizontal flip. Try experimenting with it; depending on your project, you can choose the options that you need.

After you choose an augmentation you will see extra options. For my project I only need Horizontal. Check what is best for your custom project, then click Apply.
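
Roboflow applies the augmentation for you, including the labels. Still, it helps to see what a horizontal flip means for an annotation: the box has to be mirrored together with the image. The sketch below shows that arithmetic for a box in pixel coordinates. The function name is made up for this example, and it is not Roboflow’s implementation.

```python
def flip_box_horizontally(x_min, y_min, x_max, y_max, image_width):
    """Mirror a pixel bounding box across the vertical centre line of the image."""
    new_x_min = image_width - x_max
    new_x_max = image_width - x_min
    # The vertical coordinates do not change in a horizontal flip.
    return new_x_min, y_min, new_x_max, y_max


# A box near the left edge of a 640 px wide image ends up near the right edge.
print(flip_box_horizontally(20, 100, 120, 300, image_width=640))
```

Flipping the same box twice returns the original coordinates, which is a quick way to check the arithmetic.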

Then click Continue to go to step 5, the last one.

Select the maximum version size, then click Generate, and the dataset is ready to go.

After this, the next page will appear:

Congratulations, you now have an image dataset ready to train a model.

Conclusion

In this first part of our tutorial series, we introduced you to building an image dataset in Roboflow. This post gave you a script to capture images from your webcam and create the necessary files and directories, along with a full guide to the Roboflow tools in action.

In Part 2, we will focus on training using Roboflow and a notebook in Google Colab, and on getting the necessary file for the Streamlit code. We will explain the concept of object detection and tracking, and guide you through the process of setting up your computer vision environment and integrating YOLOv8 with OpenCV.