Human Object Interaction

2 minute read

Published: June 18, 2022

A review of some papers on human-object interaction.

toolboxes

object bounding box detection
instance segmentation
human from single image
- SMPL, SMPL-H, SMPL-X TODO
- SMPL from image
  - Keep it SMPL, TODO
  - EFT:
  - FrankMocap: what is the output in?
object from single image
- neural mesh renderer
- but all seem to require an object template
  - a SMPL-like object model to deform? risk of overfitting
- scale constraints from Total3D or web search
depth order loss
- for single images, or combining all images (requires video input)
- assume the SMPL output to be more accurate
- assume the foreground is in front of the object and the background is behind it
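
To make the depth order idea above concrete, here is a minimal NumPy sketch. It assumes we already have per-pixel rendered depths for the instance that the 2D mask says is visible (`depth_front`) and for the other instance (`depth_back`). The function name, the arguments, and the softplus-style penalty are illustrative assumptions, not taken from any of the codebases discussed here.

```python
import numpy as np

def ordinal_depth_loss(depth_front, depth_back, valid):
    """Penalize pixels where the instance that should be in front renders behind.

    depth_front: (H, W) depth of the instance the 2D mask marks as visible.
    depth_back:  (H, W) depth of the other instance at the same pixels.
    valid:       (H, W) bool mask of pixels where both instances render.
    """
    diff = depth_front - depth_back
    # Only pixels with the wrong ordering contribute to the loss.
    wrong = valid & (diff > 0)
    # Smooth penalty that grows with the size of the ordering violation.
    return np.sum(np.log1p(np.exp(diff[wrong])))
```

Pixels whose ordering already agrees with the mask contribute nothing, so the loss only pushes on the violations.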
collision
- PHOSA: GPU implementation of an existing method
- MOVER: SDF grid
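
As a rough illustration of the SDF grid approach to collision, the sketch below precomputes a signed distance grid (for a sphere, as a stand-in for a real object) and penalizes query points that fall inside it. Everything here is a hypothetical toy setup: the grid layout, the nearest-neighbor lookup, and the function names are assumptions, and real implementations typically use trilinear interpolation on the GPU.

```python
import numpy as np

def sphere_sdf_grid(res=33, radius=0.5, extent=1.0):
    """Signed distance to a sphere, sampled on a res^3 grid over [-extent, extent]^3."""
    axis = np.linspace(-extent, extent, res)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.sqrt(x**2 + y**2 + z**2) - radius

def collision_loss(sdf, points, extent=1.0):
    """Sum of penetration depths for points that lie inside the surface (SDF < 0)."""
    res = sdf.shape[0]
    # Map world coordinates to the nearest grid index (nearest-neighbor lookup).
    idx = np.round((points + extent) / (2 * extent) * (res - 1)).astype(int)
    idx = np.clip(idx, 0, res - 1)
    values = sdf[idx[:, 0], idx[:, 1], idx[:, 2]]
    # Negative SDF means penetration; points outside contribute zero.
    return np.sum(np.maximum(-values, 0.0))
```

With this layout, human vertices can be tested against an object grid (or the reverse) with one lookup per vertex, which is why a grid is attractive inside an optimization loop.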

definition of interaction

PHOSA

actually quite manual-labeling heavy
- prior on size of objects
- template on objects -> SMPL-ish object
optimization per image? no
weak perspective camera
- orthographic projection onto a plane, then projection to the camera
- so to go back to 3D, assume a fixed focal length for all images
- note this focal length is also applied to objects; as long as the ratio of scales is correct, everything is fine
- weak perspective: (x, y, z) -> (x * sigma + t_x, y * sigma + t_y); perspective with focal = 1: (x, y, z) -> (x / z + t_x, y / z + t_y)
  - this is exactly what is done in the code
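
The relation between the weak-perspective and perspective cameras noted above can be checked numerically. The sketch below uses a common conversion, assumed here rather than copied from the PHOSA code: place the geometry at depth `focal / sigma` and divide the 2D shift by `sigma`. All function names are hypothetical.

```python
import numpy as np

def weak_perspective(points, sigma, t):
    """Scaled orthographic projection: drop z, scale by sigma, shift by t."""
    return sigma * points[:, :2] + t

def to_perspective_translation(sigma, t, focal=1.0):
    """3D translation whose perspective projection approximates the weak-perspective camera."""
    return np.array([t[0] / sigma, t[1] / sigma, focal / sigma])

def perspective(points, translation, focal=1.0):
    """Pinhole projection of translated points with a fixed focal length."""
    p = points + translation
    return focal * p[:, :2] / p[:, 2:3]

# Points on the z = 0 plane project identically under both models.
pts = np.array([[0.1, 0.2, 0.0], [-0.3, 0.4, 0.0]])
sigma, t = 0.5, np.array([0.1, -0.2])
uv_weak = weak_perspective(pts, sigma, t)
uv_persp = perspective(pts, to_perspective_translation(sigma, t))
```

For points off the z = 0 plane the two projections differ slightly, and the gap shrinks as the depth `focal / sigma` grows relative to the extent of the geometry.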
uses SMPL model
- only 15 joints, right? Then how is the hand modeled?
- there are plenty of works on hand-object interaction, right?
- plus an intrinsic scale for the human
  - this can be discarded, right? We are using the human as a ruler
only optimizes w.r.t. the intrinsic scale, global rotation, and translation of objects
interaction loss: distance between centroids
ordinal depth loss
collision loss: a lot of references, sad
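
A minimal sketch of the centroid-based interaction loss together with the object parameters being optimized (intrinsic scale, global rotation, translation). It assumes the interacting human and object parts are given as vertex arrays; the function names and the plain Euclidean distance are illustrative assumptions rather than the PHOSA implementation.

```python
import numpy as np

def transform_object(vertices, scale, rotation, translation):
    """Apply intrinsic scale, global rotation (3x3), and translation to object vertices."""
    return (scale * vertices) @ rotation.T + translation

def interaction_loss(human_part, object_part):
    """Euclidean distance between the centroids of two interacting parts."""
    return np.linalg.norm(human_part.mean(axis=0) - object_part.mean(axis=0))
```

In an actual optimization this term would be summed over the part pairs labeled as interacting and minimized jointly with the depth order and collision terms.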

MOVER

CHORE and BEHAVE

why does CHORE not compare to the BEHAVE baseline?
- ok, the input is an image, not a point cloud
- and from an image there is a lift-to-3D issue and a camera issue
3D representation
fit to SDF: looks wrong, but the SMPL and object templates offer a strong prior

we still have a meeting in july right?
