Cadene | 06573d7f67 | online training works (loss goes down), remove repeat_action, eval_policy outputs episodes data, eval_policy uses max_episodes_rendered | 2024-04-10 11:34:01 +00:00
Cadene | 73dfa3c8e3 | tests for tdmpc and diffusion policy are passing | 2024-04-09 02:50:32 +00:00
Cadene | 4371a5570d | Remove latency, tdmpc policy passes tests (TODO: make it work with online RL) | 2024-04-07 16:01:22 +00:00
Cadene | f56b1a0e16 | WIP tdmpc | 2024-04-05 13:40:31 +00:00
Simon Alibert | 1c24bbda3f | WIP Upgrading simxarm from mujoco-py to mujoco python bindings | 2024-03-25 12:28:07 +01:00
Remi Cadene | cfc304e870 | Refactor env queue, Training diffusion works (Still not converging) | 2024-03-04 11:00:51 +00:00
Cadene | cf5063e50e | Add diffusion policy (train and eval works, TODO: reproduce results) | 2024-02-28 15:21:42 +00:00
Cadene | 21670dce90 | Refactor train, eval_policy, logger, Add diffusion.yaml (WIP) | 2024-02-26 01:10:09 +00:00
Cadene | 5a219fed6e | Refactor policy config | 2024-02-25 18:26:44 +00:00