Human Object Interaction
Published:
A review of some papers on human-object interaction.
toolboxes
- object bounding box detection
- instance segmentation
- human from single image
- SMPL, SMPL-H, SMPL-X TODO
- smpl from image
- keep it smpl, TODO
- EFT:
- frankmocap: what is the output in?
- object from single image
- neural mesh renderer
- but all seem to require an object template
- an SMPL-like object model to deform? risk of overfitting
- scale constraints from total 3D or web search
- depth order loss
- for single images, or combining all images (requires video input)
- assume smpl output to be more accurate
- assume foreground to be in front of the obj, and bg to be behind the obj
- collision
- phosa: GPU implementation of some work
- mover: SDF grid
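A minimal sketch of an SDF-based collision penalty, to make the idea above concrete. An analytic sphere SDF stands in for the voxelized SDF grid mentioned for mover; `sphere_sdf` and `collision_penalty` are hypothetical names for illustration, not code from either paper.

```python
import math


def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, positive outside.

    Stand-in for a precomputed SDF grid lookup (assumption).
    """
    return math.dist(point, center) - radius


def collision_penalty(vertices, sdf):
    """Sum of penetration depths over all vertices that lie inside the shape."""
    return sum(max(0.0, -sdf(v)) for v in vertices)


# Two vertices penetrate a unit sphere, one stays outside.
vertices = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
penalty = collision_penalty(
    vertices, lambda v: sphere_sdf(v, (0.0, 0.0, 0.0), 1.0)
)
```

With a real grid, the lambda would be replaced by an interpolated lookup into the grid; only vertices with a negative signed distance contribute.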
definition of interaction
- phosa: overlapping of bbox, predefined
- mover: human vertices from POSA
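A rough sketch of the bounding-box overlap test as a proxy for "is there an interaction". It assumes axis-aligned 2D boxes in `(x1, y1, x2, y2)` form and an optional expansion margin; the exact criterion in phosa may differ (e.g. 3D boxes, per-category rules), and `boxes_overlap` is a hypothetical helper.

```python
def boxes_overlap(box_a, box_b, margin=0.0):
    """Return True if two axis-aligned boxes overlap after each is
    expanded by `margin` on every side (assumed convention)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    overlap_x = (ax1 - margin) < (bx2 + margin) and (bx1 - margin) < (ax2 + margin)
    overlap_y = (ay1 - margin) < (by2 + margin) and (by1 - margin) < (ay2 + margin)
    return overlap_x and overlap_y


human_box = (0.0, 0.0, 2.0, 2.0)
object_box = (1.0, 1.0, 3.0, 3.0)
interacting = boxes_overlap(human_box, object_box)
```

A positive margin makes the test more permissive, so near-contact pairs also count as candidates for the interaction loss.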
phosa
- actually quite manual-labeling heavy
- prior on size of objects
- template on objects -> SMPL-ish object
- optimization per image? no
- weak perspective camera
- orthographic projection onto a plane, then projection to the camera
- so to go back to 3D, assume a fixed focal length for all images
- note this focal length is also applied to objects; as long as the ratio of scales is correct, everything is fine
- weak perspective: (x, y, z) -> (sigma * x + t_x, sigma * y + t_y); perspective with focal = 1: (x, y, z) -> (x / z + t_x, y / z + t_y), so sigma roughly plays the role of 1 / z
- this is exactly what is done in the code
- uses SMPL model
- only 15 joints, right? then how are the hands modeled?
- there are plenty of works on hand-object interaction, right?
- plus an intrinsic scale of human
- this can be discarded, right? we are using the human as a ruler
- only optimizes w.r.t. the intrinsic scales, global rotations, and translations of objects
- interaction loss: distance between centroids
- ordinal depth loss
- collision loss: a lot of references, sad
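A small sketch of the weak-perspective vs. perspective relation from the camera notes above. It follows the formulas in the notes (translation added in image space, focal = 1 by default); the function names and the `depth_from_scale` helper are assumptions for illustration, not the phosa code.

```python
def weak_perspective(point, sigma, t_x, t_y):
    """(x, y, z) -> (sigma * x + t_x, sigma * y + t_y); depth is ignored."""
    x, y, _ = point
    return (sigma * x + t_x, sigma * y + t_y)


def perspective(point, t_x, t_y, focal=1.0):
    """(x, y, z) -> (focal * x / z + t_x, focal * y / z + t_y)."""
    x, y, z = point
    return (focal * x / z + t_x, focal * y / z + t_y)


def depth_from_scale(sigma, focal=1.0):
    """Depth at which both projections agree, given a fixed focal length."""
    return focal / sigma


sigma = 0.5
z = depth_from_scale(sigma)   # 2.0 when focal = 1
p = (1.0, -2.0, z)
uv_weak = weak_perspective(p, sigma, 0.1, 0.2)
uv_persp = perspective(p, 0.1, 0.2)
```

For points exactly at depth `focal / sigma` the two projections coincide; for a body with some depth extent the match is only approximate, which is why the fixed focal length matters less than keeping the human/object scale ratio consistent.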
mover
- contact vertices predicted by POSA
- contact loss is the Chamfer distance (CD, one- or two-directional), segmentation of the object
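A minimal pure-Python sketch of the one- and two-directional Chamfer distance mentioned above. It uses mean unsquared Euclidean distances on small point lists; the actual mover loss may use squared distances, weighting, or a batched GPU implementation, so treat the names and conventions here as assumptions.

```python
import math


def one_sided_chamfer(src, dst):
    """Mean distance from each point in `src` to its nearest point in `dst`."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer(src, dst, bidirectional=True):
    """One-directional (src -> dst) or two-directional Chamfer distance."""
    d = one_sided_chamfer(src, dst)
    if bidirectional:
        d += one_sided_chamfer(dst, src)
    return d


contact_vertices = [(0.0, 0.0, 0.0)]
object_points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
cd_one = chamfer(contact_vertices, object_points, bidirectional=False)
cd_two = chamfer(contact_vertices, object_points)
```

The one-directional form only pulls the contact vertices toward the object; the two-directional form also penalizes object points that are far from any contact vertex.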
chore and behave
- why does chore not compare to behave baseline?
- OK, the input is an image, not a point cloud
- and from an image, there is a lift-to-3D issue and a camera issue
- 3D representation
- fit to SDF: looks wrong, but the SMPL and object templates offer a strong prior
we still have a meeting in july right?