Distance Estimation

distance_estimation.py

Overview

distance_estimation.py is a vertical sensor-fusion component that combines semantic detections with depth imagery to deliver both metric positions and a high-level table-state decision (PLACE, FULL, CLEAR or IGNORE). Internally it uses the Strategy pattern so that placement and clearing logic can evolve independently.

Why split placement vs clearing?

  • Placement cares only about free space (fewer than 3 items on the table).

  • Clearing cares only about specific dirty items (plate + cup).

Two interchangeable strategies let you tune or replace each behaviour without touching the other.
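
A minimal sketch of how the two strategies could sit behind one interface. The class names and keywords come from this page; the threshold follows the FindPlacementStrategy description (PLACE when fewer than 3 objects), and build_decision is a hypothetical helper added only for illustration, not part of the module.

```python
from abc import ABC, abstractmethod


class TableAnalysisStrategy(ABC):
    """Common interface: take detected class names, return one keyword."""

    @abstractmethod
    def analyze(self, objects: list[str]) -> str:
        ...


class FindPlacementStrategy(TableAnalysisStrategy):
    # Assumed threshold, taken from the class description (< 3 objects).
    MAX_OBJECTS = 3

    def analyze(self, objects: list[str]) -> str:
        return "PLACE" if len(objects) < self.MAX_OBJECTS else "FULL"


class ClearingStrategy(TableAnalysisStrategy):
    def analyze(self, objects: list[str]) -> str:
        # CLEAR only when both a plate and a cup were detected.
        return "CLEAR" if {"plate", "cup"} <= set(objects) else "IGNORE"


def build_decision(objects, strategies):
    """Hypothetical helper: join each strategy's keyword into one string."""
    return "Decision: " + ", ".join(s.analyze(objects) for s in strategies)


strategies = [FindPlacementStrategy(), ClearingStrategy()]
print(build_decision(["cup", "plate"], strategies))  # Decision: PLACE, CLEAR
```

Because each strategy only sees the list of class names, changing the placement threshold or the set of "dirty" items stays local to one class.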

Interfaces (strongly-typed, stateless)

Direction   Topic                         Message type        Notes
---------   ---------------------------   -----------------   -----------------------------------------------------
Required    /{robot}/camera_detections    std_msgs/String     Example: "Detected: cup, plate"
Required    /{robot}/depth_processed      sensor_msgs/Image   32-bit float depth (same resolution as RGB)
Provided    /{robot}/object_positions     std_msgs/String     XYZ list, e.g. "cup @ [x,y,z]; plate @ [x,y,z]"
Provided    /{robot}/placement_decision   std_msgs/String     "Decision: PLACE, CLEAR" (keywords from both strategies)

Contract

Pre-conditions

  • The depth image and the RGB image the detections were computed on share the same resolution and optical frame.

Post-conditions

  • For every detection batch exactly one positions message and one decision string are published.

  • Each XYZ entry is based on the centre pixel (placeholder) until real bounding-box projection is implemented.

Invariants

  • Publication latency (both topics present → decision publish) < 40 ms.

  • Strategy keywords are always one of: PLACE, FULL, CLEAR, IGNORE.
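
The keyword invariant can be checked mechanically, for instance in a unit test. The sketch below assumes the decision string has the "Decision: A, B" shape shown in the interface notes; decision_keywords_valid is a hypothetical name, not a function of the module.

```python
ALLOWED_KEYWORDS = {"PLACE", "FULL", "CLEAR", "IGNORE"}


def decision_keywords_valid(decision: str) -> bool:
    """Return True if every keyword after 'Decision:' is an allowed one."""
    prefix = "Decision:"
    if not decision.startswith(prefix):
        return False
    keywords = [k.strip() for k in decision[len(prefix):].split(",")]
    return all(k in ALLOWED_KEYWORDS for k in keywords)
```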

Protocol

  1. Cache the latest depth and detection messages.

  2. When both are available, fuse → publish → reset detection cache. (Depth can arrive faster; each batch of detections is used once.)
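
The caching protocol above can be sketched without any ROS dependency. FusionCache and its fuse callback are hypothetical stand-ins for the node's callbacks; only the attribute names latest_depth and latest_detections are taken from this page.

```python
class FusionCache:
    """ROS-free stand-in for the node's two message caches."""

    def __init__(self, fuse):
        self.latest_depth = None
        self.latest_detections = None
        self._fuse = fuse  # hypothetical callback: (detections, depth) -> None

    def on_depth(self, depth):
        self.latest_depth = depth
        self._try_fuse()

    def on_detections(self, detections):
        self.latest_detections = detections
        self._try_fuse()

    def _try_fuse(self):
        # Step 2: fuse only when both inputs have been seen.
        if self.latest_depth is None or self.latest_detections is None:
            return
        self._fuse(self.latest_detections, self.latest_depth)
        # Keep the depth frame; drop detections so each batch is used once.
        self.latest_detections = None
```

Resetting only the detection cache means a fast depth stream cannot trigger repeated publications for the same batch, while the next batch can still be fused with the most recent depth frame.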

Assumptions & Limitations

  • One RGB-D camera with pinhole intrinsics (fx, fy, cx, cy hard-coded).

  • Only the centre pixel depth is sampled – replace with bounding-box → 3-D projection for production.

  • Detections arrive as a simple string; switch to vision_msgs/Detection2DArray when a real detector is online.

class distance_estimation.ClearingStrategy

Bases: TableAnalysisStrategy

Return CLEAR if both plate and cup present, else IGNORE.

analyze(objects)

class distance_estimation.DistanceEstimator

Bases: object

Fuse detections + depth into XYZ positions and table-state keywords.

Internal State

latest_detections : str or None

Raw detection string awaiting fusion.

latest_depth : ndarray or None

Most recent depth frame (32-bit float).

depth_callback(msg)

Convert depth image → ndarray then attempt fusion.

detection_callback(msg)

Store detection string then attempt fusion.

static parse_detections(raw)

Convert "Detected: cup, plate" → ['cup', 'plate'].

Any parsing error returns an empty list so the node keeps running.
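
One way such a tolerant parser could look is sketched below. It assumes the "Detected:" prefix from the example string and treats a missing prefix or a non-string input as a parsing error; the module's actual error handling may differ.

```python
PREFIX = "Detected:"


def parse_detections(raw):
    """Turn 'Detected: cup, plate' into ['cup', 'plate']; [] on any error."""
    try:
        if not raw.startswith(PREFIX):
            return []
        names = raw[len(PREFIX):].split(",")
        # Drop surrounding whitespace and empty fragments.
        return [n.strip() for n in names if n.strip()]
    except (AttributeError, TypeError):  # raw is None or not a str
        return []
```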

static project_to_robot_frame(u, v, depth)

Back-project pixel (u, v, depth) → metric XYZ in camera frame.

Hard-coded intrinsics (fx, fy, cx, cy) assume a 640×480 pinhole model.

Returns:

Coordinates (x, y, z) in metres.

Return type:

tuple[float, float, float]
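
For reference, a pinhole back-projection can be written in a few lines. The intrinsic values below are placeholders for a 640×480 sensor, not the numbers hard-coded in the module, and project_pixel is a hypothetical name.

```python
# Placeholder intrinsics for a 640x480 pinhole camera (assumed values).
FX, FY = 525.0, 525.0   # focal lengths in pixels
CX, CY = 319.5, 239.5   # principal point in pixels


def project_pixel(u, v, depth, fx=FX, fy=FY, cx=CX, cy=CY):
    """Back-project pixel (u, v) with depth in metres to camera-frame XYZ."""
    z = float(depth)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)


print(project_pixel(319.5, 239.5, 1.0))  # (0.0, 0.0, 1.0)
```

A pixel at the principal point maps onto the optical axis, so only z is non-zero; pixels further from the centre move proportionally to their depth.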

try_estimate_positions()

Run localisation + reasoning once both depth and detections exist.

Steps

  1. Parse detection string → list of class names.

  2. For each name, sample the centre pixel depth (placeholder) and back-project to XYZ.

  3. Publish formatted XYZ list.

  4. Run placement + clearing strategies, publish decision string.

  5. Clear detections cache so each batch is processed exactly once.
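
Steps 2 and 3 can be sketched as follows. The helper names, the two-decimal number format, and the project callable are assumptions made for illustration; the depth image only needs to be indexable as depth[row][column], which holds for nested lists as well as ndarrays.

```python
def centre_depth(depth):
    """Return (u, v, z) for the centre pixel of a row-major depth image."""
    height, width = len(depth), len(depth[0])
    u, v = width // 2, height // 2
    return u, v, float(depth[v][u])


def format_positions(names, depth, project):
    """Build the XYZ list; every name gets the centre-pixel position."""
    u, v, z = centre_depth(depth)
    x, y, z = project(u, v, z)
    return "; ".join(f"{name} @ [{x:.2f},{y:.2f},{z:.2f}]" for name in names)


depth = [[1.5] * 4 for _ in range(4)]        # toy 4x4 depth frame, metres
on_axis = lambda u, v, z: (0.0, 0.0, z)      # stand-in for the projection
print(format_positions(["cup", "plate"], depth, on_axis))
# cup @ [0.00,0.00,1.50]; plate @ [0.00,0.00,1.50]
```

The output makes the placeholder's limitation visible: until bounding-box projection is implemented, every detected object is reported at the same position.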

class distance_estimation.FindPlacementStrategy

Bases: TableAnalysisStrategy

Return PLACE when < 3 objects present, else FULL.

analyze(objects)

class distance_estimation.TableAnalysisStrategy

Bases: ABC

Abstract base for table-state reasoning.

analyze(objects: list[str]) -> str must return one of

  • "PLACE" – safe to put another plate

  • "FULL" – table crowded; do nothing

  • "CLEAR" – fetch dirty items

  • "IGNORE" – no action needed

abstract analyze(objects)

distance_estimation.main()

Register node and spin until shutdown.