* To run on CPU, add the following arguments: `--sim_device=cpu` and `--rl_device=cpu` (running the simulation on CPU while RL runs on the GPU is also possible).
* To run headless (no rendering) add `--headless`.
* **Important**: To improve performance, press `v` once training starts to stop the rendering. You can re-enable it later to check progress.
* The trained policy is saved in `logs/<experiment_name>/<date_time>_<run_name>/model_<iteration>.pt`, where `<experiment_name>` and `<run_name>` are defined in the train config.
* The following command-line arguments override the values set in the config files:
* `--task TASK`: Task name.
* `--resume`: Resume training from a checkpoint.
* `--experiment_name EXPERIMENT_NAME`: Name of the experiment to run or load.
* `--run_name RUN_NAME`: Name of the run.
* `--load_run LOAD_RUN`: Name of the run to load when `resume=True`. If `-1`, the last run is loaded.
* `--checkpoint CHECKPOINT`: Saved model checkpoint number. If `-1`, the last checkpoint is loaded.
* `--num_envs NUM_ENVS`: Number of environments to create.
* `--seed SEED`: Random seed.
* `--max_iterations MAX_ITERATIONS`: Maximum number of training iterations.
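The override behavior described above can be sketched with `argparse`: flags the user actually passes replace the corresponding config values, while everything else keeps its config default. This is a minimal illustration, not the project's actual parsing code; the config keys and helper name are assumptions.

```python
import argparse

def parse_overrides(argv, config):
    """Sketch (not the real implementation): merge CLI flags over config defaults."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--task", type=str)
    parser.add_argument("--resume", action="store_true")
    parser.add_argument("--experiment_name", type=str)
    parser.add_argument("--run_name", type=str)
    parser.add_argument("--load_run", type=str)
    parser.add_argument("--checkpoint", type=int)
    parser.add_argument("--num_envs", type=int)
    parser.add_argument("--seed", type=int)
    parser.add_argument("--max_iterations", type=int)
    args = parser.parse_args(argv)

    # Start from the config values, then apply only the flags that were
    # explicitly provided (unset flags parse to None / False).
    merged = dict(config)
    for key, value in vars(args).items():
        if value not in (None, False):
            merged[key] = value
    return merged

# Example: override num_envs and seed, keep max_iterations from the config.
config = {"num_envs": 4096, "seed": 1, "max_iterations": 1500}
print(parse_overrides(["--num_envs", "64", "--seed", "7"], config))
```

Note that `--resume` is a boolean switch: it only overrides the config when present, which is why `False` is filtered out alongside `None` in the merge loop.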