
Object Detection

  • Shin Yoonah, Yoonah
  • August 17, 2022
  • 2 min read

Last updated: August 25, 2022


Outline

- Sliding Windows

- Bounding Box

- Bounding Box Pipeline

- Score


Image classification predicts the class of an object in an image


Classification and object localization

--> Locate each object in the image, mark its location with a bounding box, and predict its class


Sliding Window

= algorithm


- If we want to detect a dog, we consider a fixed window size

- If chosen properly, the dog will occupy most of the window


The window is essentially a sub-image that we would like to classify as a dog

The other sub-images are classified as background

(sub-images that do not contain the dog)


*Process

  1. Start in one region in the image, classify that sub-image

  2. Then shift the window and classify the next sub-image

  3. Repeat the process -- when the object occupies most of the window, the sub-image will be classified as the object
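The process above can be sketched in a few lines of Python. This is only a minimal sketch: the image is a small nested list, and `toy_classifier` is a hypothetical stand-in for a trained classifier (it simply calls a window "dog" when bright pixels fill most of it). The names `sliding_windows`, `crop` and `classify_windows` are made up for this illustration.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[float]]  # assumption: a grayscale image as rows of pixel values


def sliding_windows(height: int, width: int, window: int, stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the (top, left) corner of every window that fits inside the image."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left


def crop(image: Image, top: int, left: int, window: int) -> Image:
    """Cut out the square sub-image that starts at (top, left)."""
    return [row[left:left + window] for row in image[top:top + window]]


def classify_windows(image: Image, window: int, stride: int,
                     classifier: Callable[[Image], str]) -> List[Tuple[int, int, str]]:
    """Classify every sub-image; return (top, left, label) for each window."""
    results = []
    for top, left in sliding_windows(len(image), len(image[0]), window, stride):
        results.append((top, left, classifier(crop(image, top, left, window))))
    return results


def toy_classifier(sub_image: Image) -> str:
    """Hypothetical classifier: 'dog' if bright pixels fill most of the window."""
    pixels = [p for row in sub_image for p in row]
    return "dog" if sum(pixels) / len(pixels) > 0.5 else "background"


# A 6x6 dark image with a 2x2 bright "object" at rows 2-3, columns 2-3.
image = [[0.0] * 6 for _ in range(6)]
for y in range(2, 4):
    for x in range(2, 4):
        image[y][x] = 1.0

results = classify_windows(image, window=2, stride=1, classifier=toy_classifier)
detections = [(top, left) for top, left, label in results if label == "dog"]
print(detections)  # [(2, 2)]
```

Only the window that the object fills completely is labelled "dog"; every other sub-image is labelled background.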


Problems of Sliding Windows

  1. Overlapping Boxes: object detectors often output many overlapping detections

  2. Object Sizes: the same object can appear in different sizes. Solution: resizing the image

  3. Overlapping Objects: objects that overlap each other can also cause problems for sliding windows
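For the object-size problem, one way to read "resizing the image" is to run the same fixed window over several resized copies of the image (often called an image pyramid). The sketch below is an assumption about how that could look, not the course's implementation: `resize_nearest` is a hypothetical helper that uses simple nearest-neighbour sampling on a nested-list image.

```python
from typing import List

Image = List[List[float]]  # assumption: a grayscale image as rows of pixel values


def resize_nearest(image: Image, scale: float) -> Image:
    """Resize by nearest-neighbour sampling (a simple stand-in for a real resize)."""
    old_h, old_w = len(image), len(image[0])
    new_h = max(1, int(old_h * scale))
    new_w = max(1, int(old_w * scale))
    return [
        [image[min(old_h - 1, int(y / scale))][min(old_w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def image_pyramid(image: Image, scales: List[float]) -> List[Image]:
    """Return one resized copy of the image per scale."""
    return [resize_nearest(image, s) for s in scales]


image = [[float(x + y) for x in range(8)] for y in range(8)]
pyramid = image_pyramid(image, [1.0, 0.5, 0.25])
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)  # [(8, 8), (4, 4), (2, 2)]
```

A large object that overflows the window at full size can fit inside the same window on one of the smaller copies.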


Bounding Box

Bounding box = a rectangular box that can be determined by two corners of the rectangle: the upper-left corner and the lower-right corner, with coordinates typically measured from y=0 and x=0 at the upper-left of the image


Y and X are not the same as the classification labels y and the image x

Upper-left corner = (Ymin, Xmin)

Lower-right corner = (Ymax, Xmax)


They are just to illustrate the coordinates of the Bounding Box

--> The goal of object detection is to predict these points, so we add a "hat" to indicate that a value is a prediction
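A bounding box can be stored as four numbers. The sketch below assumes (y, x) ordering for both corners and y growing downward from the top of the image; the class name `BoundingBox` and the example coordinates are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Box given by its upper-left (ymin, xmin) and lower-right (ymax, xmax) corners."""
    ymin: float
    xmin: float
    ymax: float
    xmax: float

    def __post_init__(self) -> None:
        # The lower-right corner must not lie above or left of the upper-left corner.
        if self.ymax < self.ymin or self.xmax < self.xmin:
            raise ValueError("expected ymin <= ymax and xmin <= xmax")

    @property
    def height(self) -> float:
        return self.ymax - self.ymin

    @property
    def width(self) -> float:
        return self.xmax - self.xmin

    @property
    def area(self) -> float:
        return self.height * self.width


# Illustrative values only: a labelled box and a predicted ("hat") box.
true_box = BoundingBox(ymin=20, xmin=30, ymax=120, xmax=180)
box_hat = BoundingBox(ymin=25, xmin=28, ymax=118, xmax=175)

print(true_box.height, true_box.width, true_box.area)  # 100 150 15000
```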


Bounding Box Pipeline

Like classification, we have the class y and the image x

- we have a dataset of classes and their bounding boxes


Similar to classification, we use the dataset to train the model; we include the box coordinates

--> result: an object detector with updated learnable parameters


Input the image with the objects we would like to detect


We have the predicted class and the box coordinates
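The shape of this pipeline (dataset in, trained detector out, then a predicted class and box for a new image) can be mirrored with a deliberately trivial baseline. This is not a real detector: `BaselineDetector` is a hypothetical toy that ignores the pixels and just remembers the most common class and the average box. The file names and coordinates are invented for illustration.

```python
from collections import Counter
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # assumption: (ymin, xmin, ymax, xmax)
Example = Tuple[str, str, Box]           # (image identifier x, class y, box)


class BaselineDetector:
    """Toy stand-in for a detector: its 'parameters' are one class and one box."""

    def __init__(self) -> None:
        self.class_hat: Optional[str] = None
        self.box_hat: Optional[Box] = None

    def fit(self, dataset: List[Example]) -> "BaselineDetector":
        """'Training': update the parameters from the labelled dataset."""
        labels = [label for _, label, _ in dataset]
        boxes = [box for _, _, box in dataset]
        self.class_hat = Counter(labels).most_common(1)[0][0]
        self.box_hat = tuple(sum(coord) / len(boxes) for coord in zip(*boxes))
        return self

    def predict(self, image_id: str) -> Tuple[Optional[str], Optional[Box]]:
        """Return the predicted class and box (the image itself is ignored here)."""
        return self.class_hat, self.box_hat


# Hypothetical dataset of classes and their bounding boxes.
dataset = [
    ("img_001.jpg", "dog", (10.0, 20.0, 110.0, 120.0)),
    ("img_002.jpg", "dog", (30.0, 40.0, 130.0, 160.0)),
    ("img_003.jpg", "cat", (20.0, 30.0, 120.0, 140.0)),
]

detector = BaselineDetector().fit(dataset)
class_hat, box_hat = detector.predict("new_image.jpg")
print(class_hat, box_hat)  # dog (20.0, 30.0, 120.0, 140.0)
```

A real detector would learn its parameters from the image pixels, but the inputs and outputs have the same form.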


Score

- Many object detection algorithms provide a score letting you know how confident the model prediction is


Each column in the table has an image and its prediction

The first row: the score ranging from 0 to 1

The second row: the class

The third row: the image and its bounding box


For the first column, we see the prediction is dog

--> but the image does not look like a dog


As a result, the score is 0.99

= the model is confident about its prediction


For each detection, a score is provided; we can set a threshold so that we only accept detections above a specific score


*Usually models will only output objects over a specific threshold*
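Filtering detections by score can be sketched as below. The `Detection` tuple, the function name `filter_by_score` and the threshold of 0.7 are assumptions for illustration; apart from the 0.99 score, the values are made up.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    score: float                            # confidence between 0 and 1
    label: str                              # predicted class
    box: Tuple[float, float, float, float]  # assumption: (ymin, xmin, ymax, xmax)


def filter_by_score(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only the detections whose score is above the threshold."""
    return [d for d in detections if d.score > threshold]


detections = [
    Detection(0.99, "dog", (20, 30, 120, 180)),
    Detection(0.40, "dog", (5, 5, 60, 60)),
    Detection(0.75, "cat", (50, 10, 90, 70)),
]

kept = filter_by_score(detections, threshold=0.7)
print([(d.label, d.score) for d in kept])  # [('dog', 0.99), ('cat', 0.75)]
```

Raising the threshold keeps fewer, more confident detections; lowering it keeps more detections, including less certain ones.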


Copyright Coursera All rights reserved



