=======================
Subsystem Vision
=======================

What the Vision subsystem does
------------------------------

The Vision layer is responsible for turning raw RGB-D sensor streams into a concise **Object Map** that downstream planners and manipulators can use. Its end-to-end workflow is:

1. **Capture**
   The **Camera** node streams synchronized colour and depth frames.

2. **Clean & Normalize**
   **Camera Preprocessing** applies lens undistortion, colour-balance and denoising filters in a reusable pipeline.

3. **Perceive**
   **Object Detection** locates plates, hands and obstacles in each frame at a steady 1 Hz, producing bounding-box hypotheses.

4. **Localize**
   **Distance Estimation** fuses those bounding boxes with depth pixels to compute metric XYZ coordinates, emitting a final `cogar_msgs/ObjectMap`.

.. image:: images/vision_subsystem.png
   :alt: Component diagram of the Vision subsystem
   :align: center
   :width: 90%

Design Patterns
---------------

Strategy
~~~~~~~~

In our Vision subsystem, **Object Detection** uses Strategy to swap between different detection implementations (a placeholder stub, a classical CV method or a deep-learning model) without touching the rest of the pipeline. This keeps detection logic **extensible** and **decoupled** from downstream consumers. A minimal sketch of this wiring appears after the component list below.

Observer
~~~~~~~~

**Camera Preprocessing** subscribes to `/camera` and `/depth` and processes each frame as soon as it arrives. By reacting to new-image events rather than polling, we guarantee that every frame is handled exactly once and with minimal latency.

Adapter
~~~~~~~

**Camera** uses `CvBridge` to adapt ROS `sensor_msgs/Image` messages into OpenCV `numpy` arrays and back. This isolates ROS message formats from CV code, making sensor-to-image integration seamless and maintainable.

Decorator
~~~~~~~~~

**Camera Preprocessing** chains filters (undistort → colour-balance → 5×5 Gaussian blur) by wrapping each output in the next filter. We can add, remove or reorder filters without modifying the subscription logic.

Template Method
~~~~~~~~~~~~~~~

**Distance Estimation** provides a fixed flow (parse detection results, sample depth, back-project to XYZ, publish) while allowing future variants to customize parsing or projection without rewriting the overall pipeline. A sketch of this flow also appears after the component list below.

Component roles
---------------

- **Camera**

  - Publishes `/camera` (`sensor_msgs/Image`) and `/depth` (`sensor_msgs/Image`) at **1,280×720 @ 10 Hz**.
  - Implements the **Adapter** pattern via `CvBridge` for ROS↔OpenCV conversions.

- **Camera Preprocessing**

  - Subscribes to raw streams, applies an undistort + colour-balance + Gaussian blur **pipeline** (Decorator pattern), and republishes on `/camera_processed` and `/depth_processed`.

- **Object Detection**

  - Listens to `/camera_processed`, applies a detection algorithm chosen by the **Strategy** pattern, and publishes `vision_msgs/Detection2DArray` on `/object_hypotheses` at **1 Hz** (sketched below).

- **Distance Estimation**

  - Subscribes to `/object_hypotheses` and `/depth_processed`, uses a **Template Method**–style base class to parse, project via pinhole intrinsics and publish a `cogar_msgs/ObjectMap` on `/object_map` (sketched below).
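To make the Strategy wiring concrete, here is a minimal, ROS-agnostic sketch of how **Object Detection** can delegate to interchangeable detectors. The class and method names (`DetectionStrategy`, `StubDetector`, `ThresholdDetector`) are illustrative assumptions, not the component's actual API.

.. code-block:: python

   # Sketch only: names are assumptions, not the real Object Detection API.
   from abc import ABC, abstractmethod

   import numpy as np


   class DetectionStrategy(ABC):
       """Interface every interchangeable detector implements."""

       @abstractmethod
       def detect(self, image):
           """Return bounding boxes as (x, y, width, height) tuples."""


   class StubDetector(DetectionStrategy):
       """Placeholder used while no real model is wired in."""

       def detect(self, image):
           return []  # no hypotheses


   class ThresholdDetector(DetectionStrategy):
       """Classical-CV example: one coarse box around bright pixels."""

       def __init__(self, threshold=200):
           self.threshold = threshold

       def detect(self, image):
           mask = image.mean(axis=2) > self.threshold
           ys, xs = np.nonzero(mask)
           if xs.size == 0:
               return []
           return [(int(xs.min()), int(ys.min()),
                    int(xs.max() - xs.min()), int(ys.max() - ys.min()))]


   class ObjectDetectionNode:
       """Talks only to the interface; swapping detectors touches nothing else."""

       def __init__(self, strategy):
           self.strategy = strategy

       def on_frame(self, image):
           return self.strategy.detect(image)


   if __name__ == "__main__":
       frame = np.zeros((720, 1280, 3), dtype=np.uint8)
       frame[100:200, 300:400] = 255          # synthetic bright "plate"
       print(ObjectDetectionNode(ThresholdDetector()).on_frame(frame))

Swapping `ThresholdDetector` for a deep-learning wrapper leaves `ObjectDetectionNode` and every downstream subscriber untouched, which is the point of the pattern.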
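In the same spirit, the next sketch shows the **Template Method** flow in **Distance Estimation**: a fixed `process()` sequence with overridable hooks for parsing, depth sampling and publishing, plus the pinhole back-projection `X = (u - cx) * Z / fx`, `Y = (v - cy) * Z / fy`. The hook names, the median depth sampling and the intrinsic values are assumptions for illustration only.

.. code-block:: python

   # Sketch only: hook names and intrinsics are assumed, not taken from the real node.
   from abc import ABC, abstractmethod

   import numpy as np


   class DistanceEstimatorBase(ABC):
       """Fixed pipeline: parse detections, sample depth, back-project, publish."""

       def __init__(self, fx, fy, cx, cy):
           self.fx, self.fy, self.cx, self.cy = fx, fy, cx, cy

       def process(self, detections, depth_image):
           points = []
           for (u, v, w, h) in self.parse(detections):
               z = self.sample_depth(depth_image, u, v, w, h)
               points.append(self.back_project(u + w / 2.0, v + h / 2.0, z))
           self.publish(points)
           return points

       # Hooks a variant may override without rewriting process() ------------
       def parse(self, detections):
           return detections                      # already (x, y, w, h) boxes

       def sample_depth(self, depth_image, u, v, w, h):
           return float(np.median(depth_image[v:v + h, u:u + w]))

       def back_project(self, u, v, z):
           # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
           return ((u - self.cx) * z / self.fx,
                   (v - self.cy) * z / self.fy,
                   z)

       @abstractmethod
       def publish(self, points):
           """Variant-specific output, e.g. filling a cogar_msgs/ObjectMap."""


   class PrintingEstimator(DistanceEstimatorBase):
       def publish(self, points):
           print(points)


   if __name__ == "__main__":
       depth = np.full((720, 1280), 1.5, dtype=np.float32)   # flat 1.5 m scene
       PrintingEstimator(fx=615.0, fy=615.0, cx=640.0, cy=360.0).process(
           [(300, 100, 100, 100)], depth)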
ROS interfaces & KPIs
---------------------

.. list-table::
   :header-rows: 1
   :widths: 30 25 45

   * - Topic / Service
     - Type
     - KPI / Note
   * - **/camera**, **/depth**
     - `sensor_msgs/Image`
     - 10 Hz capture rate; end-to-end acquisition latency ≤ 200 ms
   * - **/camera_processed**, **/depth_processed**
     - `sensor_msgs/Image`
     - Preprocessing adds < 25 ms latency; depth completeness ≥ 99 %
   * - **/object_hypotheses**
     - `vision_msgs/Detection2DArray`
     - Detection mAP ≥ 0.75 (IoU 0.5) on plate set, published ≤ 1 s after frame
   * - **/object_map**
     - `cogar_msgs/ObjectMap`
     - Metric positions published ≤ 300 ms after detections

Implementation modules
----------------------

Detailed API docs for each Vision component:

.. toctree::
   :maxdepth: 1
   :caption: Vision Components

   vision_modules/camera
   vision_modules/camera_preprocessing
   vision_modules/object_detection
   vision_modules/distance_estimation