Cat Counter Watch - Computer Vision Cat Detection

Cat Counter Watch is a containerized computer vision service that monitors a video stream, detects cats, checks whether a cat is inside a defined kitchen-counter region, and sends an HTTP POST event when the condition is met.

It runs entirely on CPU and is designed for Raspberry Pi (64-bit) or any low-power Linux machine. It does not require a GPU.


What the Project Does

  • Connects to a webcam (/dev/video0) or an RTSP/HTTP stream.
  • Runs object detection on each frame.
  • Filters detections to specific classes (default: cat).
  • Checks whether the detected cat’s bounding box center lies inside a configured polygon (the counter area).
  • Requires the condition to hold for a configurable number of consecutive frames.
  • Enforces a cooldown period between triggers.
  • Sends an HTTP POST request to a configured endpoint when triggered.
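
The cooldown and POST steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the class name CooldownTrigger, the function post_event, the JSON payload, and the timeout are all assumptions made for the example.

```python
import json
import time
import urllib.request


class CooldownTrigger:
    """Allows at most one trigger per cooldown window (hypothetical helper)."""

    def __init__(self, cooldown_s, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable so the logic can be tested
        self.last_fired = None      # None means "never fired"

    def try_fire(self):
        """Return True and start a new cooldown if enough time has passed."""
        now = self.clock()
        if self.last_fired is not None and now - self.last_fired < self.cooldown_s:
            return False
        self.last_fired = now
        return True


def post_event(url, payload, timeout_s=2.0):
    """Send payload as a JSON HTTP POST and return the response status code.

    The payload format is an assumption; the real service may send
    something different.
    """
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return resp.status
```

A caller would check try_fire() once the trigger condition is true and call post_event(EVENT_URL, ...) only when it returns True, so repeated detections inside the cooldown window send nothing.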

The service runs continuously and logs status information such as effective FPS and inference time.


Why It Exists

The goal is to create a reliable, local trigger for external hardware (ESP, Arduino, relay, pneumatic valve, etc.) based on a simple visual condition.

Any microcontroller or backend service capable of handling an HTTP POST can receive the event.


How It Works

The detector uses a lightweight YOLO (YOLOv4-tiny) model suitable for CPU inference.

Processing steps per frame:

  1. Resize to IMG_SIZE.
  2. Run forward pass through the model.
  3. Apply confidence threshold and non-max suppression.
  4. Filter detections by class name.
  5. Compute bounding box center.
  6. Check if the center point lies inside the configured ROI polygon.
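
Steps 5 and 6 can be illustrated with a small pure-Python sketch. It assumes boxes are given as (x, y, w, h) with a top-left origin, which may not match the detector's real output, and it uses a standard ray-casting test. The function names are made up for this example; an OpenCV-based implementation could use cv2.pointPolygonTest instead.

```python
def bbox_center(x, y, w, h):
    """Center of a box given as top-left corner plus width and height."""
    return (x + w / 2.0, y + h / 2.0)


def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon.

    polygon is a list of (x, y) vertices in order. Points exactly on an
    edge may be classified either way.
    """
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside
```

With the ROI from the Deployment example (50,50;590,50;590,430;50,430), a box at (300, 200) sized 40x40 has its center at (320, 220), which falls inside the polygon.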

If the cat remains inside the ROI (Region of Interest) for CONSEC_HITS consecutive processed frames, the trigger condition becomes true.
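
The consecutive-frame rule might look like the sketch below. ConsecutiveHits is a hypothetical name, and whether the real service resets the streak after firing is not stated here, so this version simply keeps reporting True while the streak continues.

```python
class ConsecutiveHits:
    """Tracks how many processed frames in a row had a cat inside the ROI."""

    def __init__(self, required):
        self.required = required  # corresponds to CONSEC_HITS
        self.count = 0

    def update(self, in_roi):
        """Feed one processed frame; return True once the streak is long enough."""
        # Any frame without a cat in the ROI breaks the streak.
        self.count = self.count + 1 if in_roi else 0
        return self.count >= self.required
```

A single missed detection resets the count, which filters out one-frame false positives at the cost of a slightly slower reaction.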


Deployment

The application is packaged as a container image that is built and run with Podman.

Build:

podman build -t cat-counter-watch .

Run with a webcam:

podman run --rm -it \
  --network=host \
  --device=/dev/video0 \
  -e EVENT_URL=http://127.0.0.1:8000/event \
  -e ROI_POLY="50,50;590,50;590,430;50,430" \
  cat-counter-watch

Run with an RTSP stream by setting CAM_SOURCE in the podman run command:

-e CAM_SOURCE="rtsp://host:port/path"

All configuration is passed through environment variables.
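
As an illustration of environment-based configuration, the sketch below reads the variables named in this README. The function load_config and every default value (in particular 416 for IMG_SIZE and 3 for CONSEC_HITS) are assumptions for the example, not the service's documented defaults.

```python
import os


def load_config(env=None):
    """Read settings from environment variables (defaults are illustrative)."""
    env = os.environ if env is None else env
    return {
        "cam_source": env.get("CAM_SOURCE", "/dev/video0"),
        "event_url": env.get("EVENT_URL", ""),
        "roi_poly": env.get("ROI_POLY", ""),
        # Numeric settings arrive as strings and must be converted.
        "img_size": int(env.get("IMG_SIZE", "416")),
        "consec_hits": int(env.get("CONSEC_HITS", "3")),
    }
```

Passing a plain dict instead of os.environ makes the loader easy to test without touching the real environment.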


ROI Definition

The counter area is defined as a polygon using pixel coordinates:

x1,y1;x2,y2;x3,y3;...
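
A string in this format could be parsed along these lines. parse_roi is a hypothetical helper, and the three-point minimum is an assumption about what counts as a valid polygon.

```python
def parse_roi(spec):
    """Parse 'x1,y1;x2,y2;...' into a list of (x, y) integer tuples."""
    points = []
    for pair in spec.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        x_str, y_str = pair.split(",")
        points.append((int(x_str), int(y_str)))
    if len(points) < 3:
        raise ValueError("a polygon needs at least 3 points")
    return points
```

For example, parse_roi("50,50;590,50;590,430;50,430") yields the four corners of the rectangle used in the Deployment example.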

A helper script (roi_click_calibrator.py) allows interactive point selection using OpenCV.


RTSP Emulator

RTSP Emulator is a small Java program that publishes a synthetic RTSP video stream showing either a “cat” image or a “no-cat” image. It feeds the static images into a local RTSP server and lets you switch between them at runtime. Switching images with a keypress is far more convenient than finding my cat every time I want to test a change.