Distance Estimation
distance_estimation.py
Overview
distance_estimation.py is a vertical sensor-fusion component that combines semantic detections with depth imagery to deliver both metric positions and a high-level table-state decision (place, clear, etc.). Internally it uses the Strategy pattern so placement and clearing logic can evolve independently.
Why split placement vs clearing?
Placement cares only about free space (fewer than 3 items).
Clearing cares only about specific dirty items (plate + cup).
Two interchangeable strategies let you tune or replace each behaviour without touching the other.
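A minimal sketch of how the two strategies might sit behind one interface. The method name `analyze` and the class constants are assumptions for illustration; only the class names and the returned keywords come from this page, and the threshold follows the FindPlacementStrategy description (fewer than 3 objects).

```python
from abc import ABC, abstractmethod


class TableAnalysisStrategy(ABC):
    """Common interface; the method name `analyze` is an assumption."""

    @abstractmethod
    def analyze(self, objects):
        """Return a decision keyword for a list of detected class names."""


class FindPlacementStrategy(TableAnalysisStrategy):
    MAX_ITEMS = 3  # PLACE while fewer than this many objects are present

    def analyze(self, objects):
        return "PLACE" if len(objects) < self.MAX_ITEMS else "FULL"


class ClearingStrategy(TableAnalysisStrategy):
    DIRTY_ITEMS = {"plate", "cup"}  # both must be present to trigger CLEAR

    def analyze(self, objects):
        return "CLEAR" if self.DIRTY_ITEMS.issubset(objects) else "IGNORE"
```

Because each class only depends on the list of names, either threshold or item set can be tuned without touching the other strategy.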
Interfaces (strongly-typed, stateless)
Direction | Topic | Message type | Notes |
---|---|---|---|
Required | | | Example: |
Required | | | 32-bit float depth (same resolution as RGB) |
Provided | | | XYZ list – e.g. |
Provided | | | |
Contract
Pre-conditions
Depth topic and detection topic share the same resolution and optical frame.
Post-conditions
For every detection batch exactly one positions message and one decision string are published.
Each XYZ entry is based on the centre pixel (placeholder) until real bounding-box projection is implemented.
Invariants
Publication latency (both topics present → decision publish) < 40 ms.
Strategy keywords are always one of: PLACE, FULL, CLEAR, IGNORE.
Protocol
Cache the latest depth and detection messages.
When both are available, fuse → publish → reset detection cache. (Depth can arrive faster; each batch of detections is used once.)
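The cache-then-fuse protocol can be sketched without any middleware. The class `FusionCache`, its callback names, and the injected `fuse` callable are hypothetical; the attribute names mirror the internal state documented for DistanceEstimator.

```python
class FusionCache:
    """Holds the latest depth frame and detection batch (hypothetical helper)."""

    def __init__(self, fuse):
        self.fuse = fuse  # callable(detections, depth), invoked once per batch
        self.latest_depth = None
        self.latest_detections = None

    def on_depth(self, depth):
        self.latest_depth = depth
        self._try_fuse()

    def on_detections(self, detections):
        self.latest_detections = detections
        self._try_fuse()

    def _try_fuse(self):
        if self.latest_depth is None or self.latest_detections is None:
            return  # still waiting for the other input
        self.fuse(self.latest_detections, self.latest_depth)
        # Reset only the detections: depth stays cached for the next batch.
        self.latest_detections = None


calls = []
cache = FusionCache(lambda det, depth: calls.append((det, depth)))
cache.on_detections("Detected: cup")
cache.on_depth("frame-1")  # both inputs present: fuse runs once
cache.on_depth("frame-2")  # no pending detections: nothing is fused
```

Resetting only the detection cache is what lets depth arrive at a higher rate while each detection batch is still consumed exactly once.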
Assumptions & Limitations
One RGB-D camera with pinhole intrinsics (fx, fy, cx, cy hard-coded).
Only the centre pixel depth is sampled – replace with bounding-box → 3-D projection for production.
Detections arrive as a simple string; switch to vision_msgs/Detection2DArray when a real detector is online.
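As one possible intermediate step between centre-pixel sampling and a full bounding-box projection, the depth could be taken as the median of the valid readings inside a box. This is a sketch under assumptions: the function name and box parameters are hypothetical, and bounding boxes only become available once detections carry pixel coordinates.

```python
import math
from statistics import median


def bbox_median_depth(depth, x_min, y_min, x_max, y_max):
    """Median of finite, positive depths inside a pixel box, or None.

    `depth` is any 2-D row-major structure (nested lists or an ndarray);
    the box is half-open: rows y_min..y_max-1, columns x_min..x_max-1.
    """
    values = [
        float(d)
        for row in depth[y_min:y_max]
        for d in row[x_min:x_max]
        if math.isfinite(d) and d > 0.0  # skip NaN/inf and zero "no reading" pixels
    ]
    return median(values) if values else None
```

A median is less sensitive than a single pixel to holes and edge noise in the depth image, at the cost of mixing foreground and background when the box is loose.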
- class distance_estimation.ClearingStrategy
Bases: TableAnalysisStrategy
Return CLEAR if both plate and cup are present, else IGNORE.
- class distance_estimation.DistanceEstimator
Bases: object
Fuse detections + depth into XYZ positions and table-state keywords.
Internal State
- latest_detections : str or None
Raw detection string awaiting fusion.
- latest_depth : ndarray or None
Most recent depth frame (32-bit float).
- static parse_detections(raw)
Convert "Detected: cup, plate" → ['cup', 'plate'].
Any parsing error returns an empty list so the node keeps running.
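A minimal sketch of a parser with this behaviour. It assumes the class names are whatever follows the first colon, separated by commas; the actual implementation may differ.

```python
def parse_detections(raw):
    """Turn 'Detected: cup, plate' into ['cup', 'plate']; [] on any error."""
    try:
        # Keep only the text after the first colon (empty if there is none).
        _, _, names = raw.partition(":")
        return [name.strip() for name in names.split(",") if name.strip()]
    except (AttributeError, TypeError):
        # Non-string input (e.g. None): fail soft so the node keeps running.
        return []
```

Returning an empty list rather than raising means a malformed message simply yields no positions for that batch.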
- static project_to_robot_frame(u, v, depth)
Back-project pixel (u, v, depth) → metric XYZ in camera frame.
Hard-coded intrinsics (fx, fy, cx, cy) assume a 640×480 pinhole model.
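The standard pinhole back-projection can be sketched as follows. The intrinsic values here are placeholders chosen for a 640×480 image, not the values hard-coded in the module, and the function name `back_project` is hypothetical.

```python
# Placeholder intrinsics (assumption): focal lengths and principal point in pixels.
FX, FY = 525.0, 525.0
CX, CY = 320.0, 240.0


def back_project(u, v, depth, fx=FX, fy=FY, cx=CX, cy=CY):
    """Map pixel (u, v) with metric depth to (X, Y, Z) in the camera frame.

    Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

With these placeholder values, a pixel at the principal point maps to a point straight ahead of the camera, so `back_project(320, 240, 1.5)` gives `(0.0, 0.0, 1.5)`.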
- try_estimate_positions()
Run localisation + reasoning once both depth and detections exist.
Steps
Parse detection string → list of class names.
For each name, sample the centre pixel depth (placeholder) and back-project to XYZ.
Publish formatted XYZ list.
Run placement + clearing strategies, publish decision string.
Clear detections cache so each batch is processed exactly once.
- class distance_estimation.FindPlacementStrategy
Bases: TableAnalysisStrategy
Return PLACE when fewer than 3 objects are present, else FULL.