In this tutorial, you’ll learn how to create a web application that performs real-time object detection using your webcam. By combining TensorFlow.js—Google’s library for machine learning in JavaScript—with p5.js for creative coding and drawing, you can build an interactive app that detects objects on the fly.
This guide will walk you through setting up your development environment, loading a pre-trained model, capturing video input, and overlaying detection results on the video feed.
Prerequisites
Before you begin, make sure you have the following installed and set up:
- A modern web browser that supports webcam access (Chrome, Firefox, or Edge).
- Basic knowledge of HTML, CSS, and JavaScript.
- Familiarity with p5.js (optional, but helpful).
- A code editor (Visual Studio Code, Sublime Text, etc.).
You do not need any backend setup since everything runs in the browser using TensorFlow.js and p5.js.
Step 1: Setting Up Your Project
Create a new folder for your project and add an index.html file. In this file, we'll include the necessary libraries via CDN:
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Real-Time Object Detection</title>
  <style>
    body {
      text-align: center;
      background: #222;
      color: #fff;
      font-family: sans-serif;
    }
    canvas {
      border: 2px solid #fff;
    }
  </style>
</head>
<body>
  <h1>Real-Time Object Detection Web App</h1>
  <!-- p5.js and TensorFlow.js -->
  <script src="https://cdn.jsdelivr.net/npm/p5@1.6.0/lib/p5.min.js"></script>
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@4.6.0/dist/tf.min.js"></script>
  <!-- Pre-trained model: COCO-SSD -->
  <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/coco-ssd"></script>
  <script src="sketch.js"></script>
</body>
</html>
This HTML file loads p5.js, TensorFlow.js, and the COCO-SSD model library. It also references our custom script file (sketch.js), which will contain the application logic.
Step 2: Capturing Video with p5.js
Create a new file called sketch.js in your project folder. We'll use p5.js to access the webcam and display the video on a canvas:
let video;
let detector;
let detections = [];

function setup() {
  // Create the canvas to match the video dimensions
  createCanvas(640, 480);

  // Capture video from the webcam
  video = createCapture(VIDEO);
  video.size(640, 480);
  video.hide();

  // Load the pre-trained COCO-SSD model
  cocoSsd.load().then(model => {
    detector = model;
    console.log("Model loaded!");
    // Begin the continuous detection loop
    detectObjects();
  });
}

function detectObjects() {
  detector.detect(video.elt).then(results => {
    detections = results;
    // Schedule detection of the next frame
    detectObjects();
  });
}

function draw() {
  // Draw the current video frame
  image(video, 0, 0);

  // Draw a bounding box and label for each detected object
  for (let object of detections) {
    const [x, y, w, h] = object.bbox;
    stroke(0, 255, 0);
    strokeWeight(2);
    noFill();
    rect(x, y, w, h);
    noStroke();
    fill(0, 255, 0);
    textSize(16);
    text(object.class, x + 4, y + 16);
  }
}
Explanation
- Setup: The setup() function initializes the canvas and video capture. The default video element that p5.js creates is hidden so that we can draw the frames onto the canvas ourselves.
- Model Loading: The COCO-SSD model is loaded asynchronously. Once it is ready, we start continuous object detection by calling detectObjects().
- Detection Loop: The detectObjects() function runs the loaded model on the current video frame and stores the results. As soon as each detection finishes, it calls itself again, so new frames are analyzed continuously.
- Drawing: In the draw() loop, the video feed is displayed, and a rectangle and label are drawn for each detected object. The bounding box coordinates and class name are provided by the model.
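Each entry in the detections array returned by COCO-SSD is a plain object with a bbox array ([x, y, width, height] in pixels), a class string, and a score between 0 and 1. Here is a quick sketch of how you might pull those fields apart (the sample values below are invented for illustration):

```javascript
// A sample detection in the shape COCO-SSD returns
// (the values here are made up for illustration).
const detection = {
  bbox: [120, 80, 200, 300], // [x, y, width, height] in pixels
  class: "person",
  score: 0.92,
};

// Destructure the bounding box for drawing
const [x, y, w, h] = detection.bbox;

// Build a label such as "person (92%)"
const label = `${detection.class} (${Math.round(detection.score * 100)}%)`;

console.log(x, y, w, h, label); // → 120 80 200 300 person (92%)
```

Including the score in the label, as shown here, is a handy way to see how confident the model is about each box.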
Step 3: Running Your App
- Open the Project: Open the index.html file in your browser. You may need to serve the files from a local web server (for example, using the VS Code Live Server extension), because browsers often block webcam access for pages opened directly from the file system.
- Grant Webcam Permission: The browser will prompt you to allow access to the webcam. Grant permission to start the video feed.
- Observe the Detection: As the model processes the video stream, you’ll see bounding boxes and labels appear around detected objects (like people, chairs, and more).
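If you prefer not to install an editor extension, any static file server will do. For example, assuming Python 3 is available on your machine (replace the path with your own project folder):

```shell
# Serve the project folder over HTTP on port 8000,
# then open http://localhost:8000 in your browser.
cd path/to/your-project
python3 -m http.server 8000
```

Serving over http://localhost keeps you in a secure context, which is what browsers require before they will expose the webcam.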
Step 4: Experiment and Extend
Now that you have a basic real-time object detection app, consider extending its functionality:
- Filtering Detections: Display only specific classes (e.g., only people or vehicles).
- Custom UI Elements: Use p5.js to add buttons or controls that modify detection settings in real time.
- Performance Optimization: Experiment with frame rate adjustments or model parameters for faster detection.
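Filtering, for instance, is just a matter of checking each detection's class and score before drawing it. A minimal sketch (the allowed class list and the 0.6 threshold are arbitrary example choices):

```javascript
// Keep only confident detections of classes we care about.
// The class names and threshold here are arbitrary examples.
const ALLOWED_CLASSES = new Set(["person", "car", "bicycle"]);
const MIN_SCORE = 0.6;

function filterDetections(detections) {
  return detections.filter(
    d => ALLOWED_CLASSES.has(d.class) && d.score >= MIN_SCORE
  );
}

// Example with made-up detections:
const sample = [
  { bbox: [0, 0, 50, 50], class: "person", score: 0.9 },
  { bbox: [10, 10, 40, 40], class: "chair", score: 0.8 }, // class not allowed
  { bbox: [20, 20, 30, 30], class: "car", score: 0.4 },   // score too low
];
console.log(filterDetections(sample).map(d => d.class)); // → [ 'person' ]
```

In draw(), you would loop over filterDetections(detections) instead of the raw detections array.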
Conclusion
Congratulations! You’ve built a real-time object detection web application using TensorFlow.js and p5.js. This project demonstrates how to integrate machine learning models into a browser-based environment and interact with live video feeds. With further experimentation, you can adapt this tutorial to a variety of creative projects, from interactive art installations to practical surveillance tools.