distance_estimation

distance_estimation.py

Overview

distance_estimation.py is a vertical sensor-fusion component that combines semantic detections with depth imagery to deliver both metric positions and a high-level table-state decision (place, clear, etc.). Internally it uses the Strategy pattern so placement and clearing logic can evolve independently.

Why split placement vs clearing?

  • Placement cares only about free space: a new item can be placed while fewer than 3 objects are on the table.

  • Clearing cares only about specific dirty items (a plate and a cup both present).

Two interchangeable strategies let you tune or replace each behaviour without touching the other.
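As a rough illustration of that split, the sketch below shows how the two strategies could sit behind one abstract base. The class names and return keywords come from the class summaries on this page; the method name analyze, its labels argument, and the MAX_OBJECTS constant are assumptions made for the example, not the module's confirmed API.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class TableAnalysisStrategy(ABC):
    """Abstract base for table-state reasoning."""

    @abstractmethod
    def analyze(self, labels: Sequence[str]) -> str:
        """Return one keyword describing the table state (hypothetical method name)."""


class FindPlacementStrategy(TableAnalysisStrategy):
    MAX_OBJECTS = 3  # assumed constant; threshold taken from the class summary

    def analyze(self, labels: Sequence[str]) -> str:
        # PLACE while fewer than MAX_OBJECTS objects are present, else FULL.
        return "PLACE" if len(labels) < self.MAX_OBJECTS else "FULL"


class ClearingStrategy(TableAnalysisStrategy):
    def analyze(self, labels: Sequence[str]) -> str:
        # CLEAR only when both a plate and a cup were detected, else IGNORE.
        return "CLEAR" if {"plate", "cup"} <= set(labels) else "IGNORE"


# Each strategy is evaluated independently on the same detection batch.
strategies = [FindPlacementStrategy(), ClearingStrategy()]
keywords = [s.analyze(["cup", "plate"]) for s in strategies]
print(keywords)  # ['PLACE', 'CLEAR']
```

Because each strategy only sees the list of labels, either one can be swapped out or re-tuned without changes to the other.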

Interfaces (strongly-typed, stateless)

Direction   Topic                         Message type        Notes
---------   ---------------------------   -----------------   ------------------------------------------------------
Required    /{robot}/camera_detections    std_msgs/String     Example: "Detected: cup, plate"
Required    /{robot}/depth_processed      sensor_msgs/Image   32-bit float depth (same resolution as RGB)
Provided    /{robot}/object_positions     std_msgs/String     XYZ list, e.g. "cup @ [x,y,z]; plate @ [x,y,z]"
Provided    /{robot}/placement_decision   std_msgs/String     "Decision: PLACE, CLEAR" (keywords from both strategies)

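Since three of the four topics carry plain strings, the node has to parse and format them by hand. The helpers below are a minimal sketch of that string handling, based only on the example payloads listed above. The function names and the two-decimal number format are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

PREFIX = "Detected:"  # prefix seen in the example detection payload

XYZ = Tuple[float, float, float]


def parse_detections(text: str) -> List[str]:
    """Turn 'Detected: cup, plate' into ['cup', 'plate']."""
    body = text.strip()
    if body.startswith(PREFIX):
        body = body[len(PREFIX):]
    # Drop empty fragments so 'Detected:' yields an empty list.
    return [label.strip() for label in body.split(",") if label.strip()]


def format_positions(positions: Sequence[Tuple[str, XYZ]]) -> str:
    """Build the 'label @ [x,y,z]; ...' payload (two decimals is an assumption)."""
    return "; ".join(
        f"{label} @ [{x:.2f},{y:.2f},{z:.2f}]" for label, (x, y, z) in positions
    )


def format_decision(keywords: Sequence[str]) -> str:
    """Build the 'Decision: KEYWORD, KEYWORD' payload."""
    return "Decision: " + ", ".join(keywords)


print(parse_detections("Detected: cup, plate"))
print(format_positions([("cup", (0.1, 0.0, 1.25))]))
print(format_decision(["PLACE", "CLEAR"]))
```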
Contract

Pre-conditions

  • The depth image and the RGB image that produced the detections share the same resolution and optical frame.

Post-conditions

  • For every detection batch exactly one positions message and one decision string are published.

  • Each XYZ entry is based on the centre pixel (placeholder) until real bounding-box projection is implemented.

Invariants

  • Publication latency, measured from the moment both inputs are cached to the moment the decision is published, stays below 40 ms.

  • Strategy keywords are always one of: PLACE, FULL, CLEAR, IGNORE.

Protocol

  1. Cache the latest depth and detection messages.

  2. When both are available, fuse → publish → reset detection cache. (Depth can arrive faster; each batch of detections is used once.)
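The two protocol steps can be sketched without any ROS dependency as a small cache that fires a callback once per detection batch. FusionCache and its method names are hypothetical. In the real node the two on_* methods would be the subscriber callbacks and fuse would do the publishing. Keeping the depth image cached after a fuse is an assumption; the protocol above only states that the detection cache is reset.

```python
from typing import Any, Callable, Optional


class FusionCache:
    """Hold the latest depth and detection messages and fuse when both exist."""

    def __init__(self, fuse: Callable[[Any, Any], None]) -> None:
        self._fuse = fuse
        self._depth: Optional[Any] = None
        self._detections: Optional[Any] = None

    def on_depth(self, depth: Any) -> None:
        self._depth = depth  # newer depth frames overwrite older ones
        self._try_fuse()

    def on_detections(self, detections: Any) -> None:
        self._detections = detections
        self._try_fuse()

    def _try_fuse(self) -> None:
        if self._depth is None or self._detections is None:
            return
        self._fuse(self._depth, self._detections)
        self._detections = None  # each detection batch is used exactly once


calls = []
cache = FusionCache(lambda depth, det: calls.append((depth, det)))
cache.on_depth("depth-1")
cache.on_depth("depth-2")                # depth arrives faster, nothing fused yet
cache.on_detections("Detected: cup")     # fuses with depth-2
cache.on_depth("depth-3")                # no pending detections, nothing fused
cache.on_detections("Detected: plate")   # fuses with depth-3
print(calls)
```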

Assumptions & Limitations

  • One RGB-D camera with pinhole intrinsics (fx, fy, cx, cy hard-coded).

  • Only the centre pixel depth is sampled – replace with bounding-box → 3-D projection for production.

  • Detections arrive as a simple string; switch to vision_msgs/Detection2DArray when a real detector is online.
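For reference, the centre-pixel placeholder amounts to a standard pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, with Z read from the depth image. The sketch below assumes depth values in metres and uses placeholder intrinsics; neither the numbers nor the function names are taken from the module.

```python
import math
from typing import Optional, Sequence, Tuple

# Placeholder intrinsics in pixels (assumed values, not the module's constants).
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

XYZ = Tuple[float, float, float]


def pixel_to_xyz(u: float, v: float, depth_m: float,
                 fx: float = FX, fy: float = FY,
                 cx: float = CX, cy: float = CY) -> XYZ:
    """Back-project pixel (u, v) at the given depth into the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def centre_pixel_xyz(depth_image: Sequence[Sequence[float]]) -> Optional[XYZ]:
    """Sample the image centre; return None for invalid (NaN, inf, <= 0) depth."""
    height, width = len(depth_image), len(depth_image[0])
    u, v = width // 2, height // 2
    z = float(depth_image[v][u])  # rows are indexed by v, columns by u
    if not math.isfinite(z) or z <= 0.0:
        return None
    return pixel_to_xyz(u, v, z)


# A pixel two columns right of the principal point, 2 m away, with fx = fy = 2.
print(pixel_to_xyz(3, 1, 2.0, fx=2.0, fy=2.0, cx=1.0, cy=1.0))  # (2.0, 0.0, 2.0)
```

A bounding-box based version would call pixel_to_xyz with each detection's box centre (or a robust depth statistic over the box) instead of the image centre.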

Functions

main()

Register node and spin until shutdown.

Classes

ClearingStrategy()

Return CLEAR if both a plate and a cup are present, else IGNORE.

DistanceEstimator()

Fuse detections + depth into XYZ positions and table-state keywords.

FindPlacementStrategy()

Return PLACE when fewer than 3 objects are present, else FULL.

TableAnalysisStrategy()

Abstract base for table-state reasoning.