Xinyu Zhan* · Lixin Yang* · Yifei Zhao · Kangrui Mao · Hanlin Xu · Zenan Lin · Kailin Li · Cewu Lu†
This repo contains the training and evaluation code for TaMF models on the OakInk2 dataset. TaMF aims to generate hand motion sequences that fulfill given object trajectories, conditioned on task descriptions.
object_raw models, which are aligned and downsampled from the objects' raw scans.
-
Setup dataset files.
Download the tarballs from huggingface. You will need the preview version annotation tarball for all sequences, the `object_raw` tarball, the `object_repair` tarball and the `program` tarball. Organize these files as follows:
data
|-- data
|   `-- scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS
|-- anno_preview
|   `-- scene_0x__y00z++00000000000000000000__YYYY-mm-dd-HH-MM-SS.pkl
|-- object_raw
|-- object_repair
`-- program
Refer to oakink2_toolkit for more details.
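As a quick sanity check on the layout above, the sketch below compares sequence directories under `data/data` with annotation files under `data/anno_preview`. It assumes that every sequence directory has a `.pkl` annotation of the same name, which is inferred from the tree and not confirmed elsewhere; the function name `find_unpaired_sequences` is hypothetical and not part of the repo.

```python
from pathlib import Path


def find_unpaired_sequences(root: Path) -> dict[str, list[str]]:
    """Compare sequence names in root/data with root/anno_preview/*.pkl.

    Assumption: each sequence directory is paired with '<name>.pkl'.
    """
    data_dir = root / "data"
    anno_dir = root / "anno_preview"
    # Missing directories are treated as empty so the report still works.
    seq_names = {p.name for p in data_dir.iterdir() if p.is_dir()} if data_dir.is_dir() else set()
    anno_names = {p.stem for p in anno_dir.glob("*.pkl")} if anno_dir.is_dir() else set()
    return {
        "missing_annotation": sorted(seq_names - anno_names),
        "missing_sequence": sorted(anno_names - seq_names),
    }


if __name__ == "__main__":
    # 'data' is the top-level dataset directory from the tree above.
    report = find_unpaired_sequences(Path("data"))
    for kind, names in report.items():
        print(f"{kind}: {len(names)}")
```

Both counts should be zero if the tarballs were extracted as described.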
-
Setup the environment.
-
Create a virtual env of python 3.10. This can be done with either `conda` or the python package `venv`.
-
`conda` approach:
conda create -p ./.conda python=3.10
conda activate ./.conda
-
`venv` approach: first use `pyenv` or other tools to install a python interpreter of version 3.10. Here 3.10.14 is used as an example:
pyenv install 3.10.14
pyenv shell 3.10.14
Then create a virtual environment:
python -m venv .venv --prompt oakink2_tamf
. .venv/bin/activate
-
-
Install the dependencies.
Make sure all bundled dependencies are there.
git submodule update --init --recursive --progress
Use `pip` to install the packages:
pip install -r requirements.dist.txt
-
-
Download the MANO model (version v1.2) and place the files at `asset/mano_v1_2`. The directory structure should be like:
```
asset
`-- mano_v1_2
    `-- models
        |-- MANO_LEFT.pkl
        `-- MANO_RIGHT.pkl
```
-
Download object embeddings and sampled point clouds.
Untar the tarballs into `common`. The directory structure should be like:
common
|-- common/retrieve_obj_embedding/main/embedding
`-- common/retrieve_obj_pointcloud/main/pointcloud
Download grabnet assets. Untar the tarballs into `asset`.
asset
`-- grabnet
These assets are from https://github.com/otaheri/GrabNet and https://github.com/oakink/OakInk-Grasp-Generation.
-
Save cache dict for each split.
Train:
python -m script.save_cache_dict --data.process_range ?(file:asset/split/train.txt) --data.split_name train --commit
Val:
python -m script.save_cache_dict --data.process_range ?(file:asset/split/val.txt) --data.split_name val --commit
Test:
python -m script.save_cache_dict --data.process_range ?(file:asset/split/test.txt) --data.split_name test --commit
All Dataset:
python -m script.save_cache_dict --data.process_range ?(file:asset/split/all.txt) --data.split_name all --commit
-
Train `MF-MDM G`.
bash script/train.sh
-
Sample from `MF-MDM G` for the data cache that will be used in `MF-MDM R` training (press y to proceed).
./script/sample.sh train common/train/main__<timestamp here>/save/model_0099.pt arch_mdm_l__0099
./script/sample.sh val common/train/main__<timestamp here>/save/model_0399.pt arch_mdm_l__0399
./script/sample.sh test common/train/main__<timestamp here>/save/model_0399.pt arch_mdm_l__0399
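The `<timestamp here>` part of the checkpoint path depends on when training was started. One way to locate the run directory without typing the timestamp is sketched below. The helper `latest_run_dir` is hypothetical, and it assumes the newest run is the one with the most recent modification time; the timestamp format itself is not parsed. Note that a downloaded `main__remastered` directory would also match the `main__*` pattern.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(train_root: Path, prefix: str) -> Optional[Path]:
    """Return the most recently modified directory named '<prefix>__*', or None."""
    candidates = [p for p in train_root.glob(f"{prefix}__*") if p.is_dir()]
    if not candidates:
        return None
    # Assumption: modification time is a good proxy for "latest run".
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    run_dir = latest_run_dir(Path("common/train"), "main")
    if run_dir is None:
        print("no main__* run found under common/train")
    else:
        # Checkpoint name taken from the sample commands above.
        print(run_dir / "save" / "model_0399.pt")
```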
-
Train `MF-MDM R`.
bash script/train_refine.sh
-
Sample from `MF-MDM R`.
./script/sample_refine.sh test common/train/refine__<timestamp>/save/model_0399.pt arch_mdm_l__0399
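The steps above, from cache building through refinement sampling, can be collected into an ordered command list for reference. This is only a dry-run sketch: it prints the commands and does not run them, since `script/sample.sh` asks for confirmation and the run directories only exist after training. The function names are hypothetical, and the shell quoting that `shlex.join` adds around the `?(file:...)` token is a defensive assumption (the commands above show it unquoted).

```python
import shlex

SPLITS = ("train", "val", "test", "all")


def cache_commands() -> list[list[str]]:
    """One save_cache_dict invocation per split, as argv lists."""
    return [
        [
            "python", "-m", "script.save_cache_dict",
            "--data.process_range", f"?(file:asset/split/{split}.txt)",
            "--data.split_name", split,
            "--commit",
        ]
        for split in SPLITS
    ]


def generation_commands(main_run: str, refine_run: str) -> list[list[str]]:
    """Train and sample commands in the order listed in this README."""
    return [
        ["bash", "script/train.sh"],
        ["./script/sample.sh", "train", f"{main_run}/save/model_0099.pt", "arch_mdm_l__0099"],
        ["./script/sample.sh", "val", f"{main_run}/save/model_0399.pt", "arch_mdm_l__0399"],
        ["./script/sample.sh", "test", f"{main_run}/save/model_0399.pt", "arch_mdm_l__0399"],
        ["bash", "script/train_refine.sh"],
        ["./script/sample_refine.sh", "test", f"{refine_run}/save/model_0399.pt", "arch_mdm_l__0399"],
    ]


if __name__ == "__main__":
    # Placeholders are kept as written in the README; fill them in by hand.
    commands = cache_commands() + generation_commands(
        "common/train/main__<timestamp here>", "common/train/refine__<timestamp>"
    )
    for argv in commands:
        print(shlex.join(argv))
```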
- (Optional) Download the pretrained model weights for `MF-MDM G` and `MF-MDM R`. The directory structure should be like:
common
|-- common/train/main__remastered/save
`-- common/train/refine__remastered/save
- Download the feature-extraction model weights for FID computation in the paper. You could also use your own feature-extraction model if you would like to. The directory structure should be like:
common
`-- common/train/encoder__fid_1/save
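Before running evaluation, it can help to confirm that the assets and checkpoints listed in this README are in place. The sketch below checks a list of paths relative to the repo root. The paths are copied from the directory layouts above, under the assumption that tree entries such as `common/train/encoder__fid_1/save` are relative to the repo root (as the FID command suggests); `missing_paths` is a hypothetical helper.

```python
from pathlib import Path

# Paths quoted from the directory layouts in this README.
# The two '*__remastered' entries are only needed for the pretrained weights.
REQUIRED_PATHS = (
    "asset/mano_v1_2/models/MANO_LEFT.pkl",
    "asset/mano_v1_2/models/MANO_RIGHT.pkl",
    "asset/grabnet",
    "common/retrieve_obj_embedding/main/embedding",
    "common/retrieve_obj_pointcloud/main/pointcloud",
    "common/train/main__remastered/save",
    "common/train/refine__remastered/save",
    "common/train/encoder__fid_1/save",
)


def missing_paths(repo_root: Path, required=REQUIRED_PATHS) -> list[str]:
    """Return the entries of `required` that do not exist under repo_root."""
    return [rel for rel in required if not (repo_root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_paths(Path(".")):
        print(f"missing: {rel}")
```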
-
Sample from `MF-MDM R`. You can do it multiple times from different cache copies for evaluation.
./script/sample_refine.sh test common/train/refine__remastered/save/model_0399.pt arch_mdm_l__0399
-
Evaluate.
Contact Ratio, CR:
python -m script.compute_score.compute_score_cr
Solid Intersection Volume, SIV:
python -m script.compute_score.compute_score_siv
Power Spectrum Kullback-Leibler divergence of Joints, PSKL-J:
python -m script.compute_score.compute_score_psklj
FID:
python -m script.compute_score.compute_score_fid --cfg config/arch_encoder.yml --debug.encoder_checkpoint_filepath common/train/encoder__fid_1/save/model_0399.pt
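When several sampled trajectory files are evaluated, the four metric commands above can be generated per file using this section's `--sample_refine_filepath` option. The sketch below only builds the argument lists. It assumes that all four scripts accept `--sample_refine_filepath`, which is not verified here, and the sample file paths in the usage part are hypothetical placeholders.

```python
import shlex

METRIC_MODULES = (
    "script.compute_score.compute_score_cr",
    "script.compute_score.compute_score_siv",
    "script.compute_score.compute_score_psklj",
)
FID_MODULE = "script.compute_score.compute_score_fid"
FID_EXTRA_ARGS = (
    "--cfg", "config/arch_encoder.yml",
    "--debug.encoder_checkpoint_filepath", "common/train/encoder__fid_1/save/model_0399.pt",
)


def score_commands(sample_filepaths: list[str]) -> list[list[str]]:
    """Build CR, SIV, PSKL-J and FID commands for each sampled file."""
    commands = []
    for sample_filepath in sample_filepaths:
        # Assumption: every scoring script accepts this flag.
        flag = ["--sample_refine_filepath", sample_filepath]
        for module in METRIC_MODULES:
            commands.append(["python", "-m", module, *flag])
        commands.append(["python", "-m", FID_MODULE, *FID_EXTRA_ARGS, *flag])
    return commands


if __name__ == "__main__":
    # Hypothetical placeholder paths; replace with real sampled outputs.
    for argv in score_commands(["path/to/sample_run_1", "path/to/sample_run_2"]):
        print(shlex.join(argv))
```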
Point `--sample_refine_filepath` to different saved sampled trajectories to do evaluations multiple times.
-
(Optional) Visualize.
python -m script.debug.debug_refine_sample --debug.model_weight_filepath xxx.pt
If you find the OakInk2 dataset or the OakInk2-TAMF repo useful for your research, please consider citing us:
@InProceedings{Zhan_2024_CVPR,
author = {Zhan, Xinyu and Yang, Lixin and Zhao, Yifei and Mao, Kangrui and Xu, Hanlin and Lin, Zenan and Li, Kailin and Lu, Cewu},
title = {{OAKINK2}: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {445-456}
}
Our TaMF model is based on the Motion Diffusion Model (MDM); please also cite:
@inproceedings{
tevet2023human,
title={Human Motion Diffusion Model},
author={Guy Tevet and Sigal Raab and Brian Gordon and Yoni Shafir and Daniel Cohen-or and Amit Haim Bermano},
booktitle={The Eleventh International Conference on Learning Representations },
year={2023},
url={https://openreview.net/forum?id=SJ1kSyO2jwu}
}
