Commit Graph

715 Commits

Author SHA1 Message Date
Michel Aractingi daa1480a91 nit 2025-01-22 10:26:52 +01:00
Michel Aractingi 71ec721e48 cleaned eval_on_robot.py; readded policy; fixed doc strings 2025-01-22 10:26:52 +01:00
Michel Aractingi bbb5ba0adf Extend reward classifier for multiple camera views (#626) 2025-01-22 10:26:52 +01:00
Eugene Mironov 844bfcf484 [Port HIL_SERL] Final fixes for the Reward Classifier (#598) 2025-01-22 10:26:52 +01:00
Michel Aractingi 13441f0d98 added temporary fix for missing task_index key in online environment 2025-01-22 10:26:50 +01:00
Michel Aractingi 41b377211c split encoder for critic and actor 2025-01-22 10:25:52 +01:00
KeWang1017 9ceb68ee90 Refine SAC configuration and policy for enhanced performance
- Updated standard deviation parameterization in SACConfig to 'softplus' with defined min and max values for improved stability.
- Modified action sampling in SACPolicy to use reparameterized sampling, ensuring better gradient flow and log probability calculations.
- Cleaned up log probability calculations in TanhMultivariateNormalDiag for clarity and efficiency.
- Increased evaluation frequency in YAML configuration to 50000 for more efficient training cycles.

These changes aim to enhance the robustness and performance of the SAC implementation during training and inference.
2025-01-22 10:23:33 +01:00
KeWang1017 d1baa5a82f trying to get sac running 2025-01-22 10:20:56 +01:00
Michel Aractingi 04da4dd3e3 Added normalization schemes and style checks 2025-01-22 10:19:19 +01:00
Michel Aractingi b0e2fcdba7 added optimizer and sac to factory.py 2025-01-22 10:17:48 +01:00
Eugene Mironov 1e2a757cd3 [Port Hil-SERL] Add unit tests for the reward classifier & fix imports & check script (#578) 2025-01-22 10:14:06 +01:00
Michel Aractingi ab842ba6ae nit in control_robot.py 2025-01-22 10:06:39 +01:00
Michel Aractingi 94a7221a94 Update lerobot/scripts/train_hilserl_classifier.py
Co-authored-by: Yoel <yoel.chornton@gmail.com>
2025-01-22 10:06:39 +01:00
Claudio Coppola 00dadcace0 LerobotDataset pushable to HF from any folder (#563) 2025-01-22 10:06:39 +01:00
berjaoui 81a2f2958d Update 7_get_started_with_real_robot.md (#559) 2025-01-22 10:06:39 +01:00
Michel Aractingi 68b4fb60ad Control simulated robot with real leader (#514)
Co-authored-by: Remi <remi.cadene@huggingface.co>
2025-01-22 10:06:39 +01:00
Remi 96b2b62377 Fix missing local_files_only in record/replay (#540)
Co-authored-by: Simon Alibert <alibert.sim@gmail.com>
2025-01-22 10:06:39 +01:00
Michel Aractingi b5c98bbfd3 Refactor OpenX (#505) 2025-01-22 10:06:39 +01:00
Eugene Mironov 58e12cf2e8 Fixup 2025-01-22 10:06:39 +01:00
Michel Aractingi d8b5fae622 Add human intervention mechanism and eval_robot script to evaluate policy on the robot (#541)
Co-authored-by: Yoel <yoel.chornton@gmail.com>
2025-01-22 10:06:39 +01:00
Yoel 67ac81d728 Reward classifier and training (#528)
Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai>
Co-authored-by: resolver101757 <kelster101757@hotmail.com>
Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com>
Co-authored-by: Remi <re.cadene@gmail.com>
Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>
2025-01-22 10:06:39 +01:00
Michel Aractingi b5f1ea3140 nit 2025-01-22 10:06:39 +01:00
AdilZouitine 4d854a1513 Stable version of rlpd + drq 2025-01-22 09:00:16 +00:00
AdilZouitine 87da655eab Add type annotations and restructure SACConfig class fields 2025-01-21 09:51:12 +00:00
Adil Zouitine a8fda9c61a Change SAC policy implementation with configuration and modeling classes 2025-01-17 09:39:04 +01:00
Adil Zouitine 55505ff817 Add rlpd tricks 2025-01-16 11:53:36 +01:00
Adil Zouitine 20d31ab8e0 SAC works 2025-01-16 11:53:27 +01:00
Adil Zouitine e5b83aab5e remove breakpoint 2025-01-16 11:52:03 +01:00
Adil Zouitine a9d5f62304 [WIP] correct sac implementation 2025-01-16 11:51:18 +01:00
Adil Zouitine 72e1ed7058 Add rlpd tricks 2025-01-16 11:42:24 +01:00
Adil Zouitine d8e67a2609 SAC works 2025-01-16 11:42:24 +01:00
Adil Zouitine 50e12376de remove breakpoint 2025-01-16 11:42:23 +01:00
Adil Zouitine 73aa6c25f3 [WIP] correct sac implementation 2025-01-16 11:42:14 +01:00
Pradeep Kadubandi 380b836eee Fix for the issue https://github.com/huggingface/lerobot/issues/638 (#639) 2025-01-15 10:50:38 +01:00
Philip Fung eec6796cb8 fixes to SO-100 readme (#600)
Co-authored-by: Philip Fung <no@one>
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2025-01-10 11:30:01 +01:00
Mishig 25a8597680 [viz] Fixes & updates to html visualizer (#617) 2025-01-09 11:39:54 +01:00
CharlesCNorton b8b368310c typo fix: batch_convert_dataset_v1_to_v2.py (#615)
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2025-01-09 09:57:45 +01:00
Ville Kuosmanen 5097cd900e fix(visualise): use correct language description for each episode id (#604)
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2025-01-09 09:39:48 +01:00
CharlesCNorton bc16e1b497 fix(docs): typos in benchmark readme.md (#614)
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2025-01-09 09:35:27 +01:00
Simon Alibert 8f821ecad0 Fix Quality workflow (#622) 2025-01-08 13:35:11 +01:00
CharlesCNorton 4519016e67 Update README.md (#612) 2025-01-03 16:19:37 +01:00
Eugene Mironov 59e2757434 Fix broken `create_lerobot_dataset_card` (#590) 2024-12-23 15:05:59 +01:00
Mishig 73b64c3089 [vizualizer] for LeRobodDataset V2 (#576) 2024-12-20 16:26:23 +01:00
s1lent4gnt 66f8736598 fixing typo from 'teloperation' to 'teleoperation' (#566) 2024-12-11 05:57:52 -08:00
Simon Alibert 4c41f6fcc6 Fix example 6 (#572) 2024-12-11 10:32:18 +01:00
Claudio Coppola 44f9b21e74 LerobotDataset pushable to HF from any folder (#563) 2024-12-09 11:32:25 +01:00
berjaoui 03f49ceaf0 Update 7_get_started_with_real_robot.md (#559) 2024-12-09 00:17:49 +01:00
Michel Aractingi 8e7d6970ea Control simulated robot with real leader (#514)
Co-authored-by: Remi <remi.cadene@huggingface.co>
2024-12-03 12:20:05 +01:00
Remi 286bca37cc Fix missing local_files_only in record/replay (#540)
Co-authored-by: Simon Alibert <alibert.sim@gmail.com>
2024-12-03 10:53:21 +01:00
Michel Aractingi a2c181992a Refactor OpenX (#505) 2024-12-03 00:51:55 +01:00