lerobot

Commit Graph

Author	SHA1	Message	Date
KeWang1017	ecb91b37eb	Refactor SACPolicy for improved action sampling and standard deviation handling - Updated action selection to use distribution sampling and log probabilities for better stochastic behavior. - Enhanced standard deviation clamping to prevent extreme values, ensuring stability in policy outputs. - Cleaned up code by removing unnecessary comments and improving readability. These changes aim to refine the SAC implementation, enhancing its robustness and performance during training and inference.	2025-03-24 13:24:23 +01:00
KeWang1017	c89bcc5aa8	trying to get sac running	2025-03-24 13:24:23 +01:00
Michel Aractingi	cc85bca2b5	Added normalization schemes and style checks	2025-03-24 13:24:23 +01:00
Michel Aractingi	3b07766c33	added optimizer and sac to factory.py	2025-03-24 13:23:53 +01:00
Eugene Mironov	287968b418	[HIL-SERL PORT] Fix linter issues (#588 )	2025-03-24 13:23:02 +01:00
Eugene Mironov	c9f1a037e3	[Port Hil-SERL] Add unit tests for the reward classifier & fix imports & check script (#578 )	2025-03-24 13:23:02 +01:00
Michel Aractingi	8a7f74ee65	added comments from kewang	2025-03-24 13:21:05 +01:00
KeWang1017	8220546036	Enhance SAC configuration and policy with new parameters and subsampling logic - Added `num_subsample_critics`, `critic_target_update_weight`, and `utd_ratio` to SACConfig. - Implemented target entropy calculation in SACPolicy if not provided. - Introduced subsampling of critics to prevent overfitting during updates. - Updated temperature loss calculation to use the new target entropy. - Added comments for future UTD update implementation. These changes improve the flexibility and performance of the SAC implementation.	2025-03-24 13:21:05 +01:00
KeWang	214beec994	Port SAC WIP (#581 ) Co-authored-by: KeWang1017 <ke.wang@helloleap.ai>	2025-03-24 13:21:05 +01:00
Michel Aractingi	909ca8d9b6	completed losses	2025-03-24 13:21:05 +01:00
Michel Aractingi	5fe56e0a49	nit in control_robot.py	2025-03-24 13:21:05 +01:00
Yoel	0ebdae8a40	Reward classifier and training (#528 ) Co-authored-by: Daniel Ritchie <daniel@brainwavecollective.ai> Co-authored-by: resolver101757 <kelster101757@hotmail.com> Co-authored-by: Jannik Grothusen <56967823+J4nn1K@users.noreply.github.com> Co-authored-by: Remi <re.cadene@gmail.com> Co-authored-by: Michel Aractingi <michel.aractingi@huggingface.co>	2025-03-24 13:20:43 +01:00
Pepijn	e8159997c7	User/pepijn/2025 03 17 act different image shapes (#870 )	2025-03-18 11:09:05 +01:00
Steven Palma	5e9473806c	refactor(config): Move device & amp args to PreTrainedConfig (#812 ) Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>	2025-03-06 17:59:28 +01:00
Steven Palma	5d24ce3160	chore(doc): add license header to all files (#818 )	2025-03-05 17:56:51 +01:00
Yachen Kang	b80e55ca44	change "actions_id_pad" to "actions_is_pad"(🐛 Bug) (#774 ) Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>	2025-03-05 01:31:56 +01:00
Simon Alibert	a1809ad3de	Add typos checks (#770 )	2025-02-25 23:51:15 +01:00
Simon Alibert	3354d919fc	LeRobotDataset v2.1 (#711 ) Co-authored-by: Remi <remi.cadene@huggingface.co> Co-authored-by: Remi Cadene <re.cadene@gmail.com>	2025-02-25 15:27:29 +01:00
Simon Alibert	c4c2ce04e7	Update pre-commits (#733 )	2025-02-15 15:51:17 +01:00
Simon Alibert	e71095960f	Fixes following #670 (#719 )	2025-02-12 12:53:55 +01:00
Simon Alibert	90e099b39f	Remove offline training, refactor `train.py` and logging/checkpointing (#670 ) Co-authored-by: Remi <remi.cadene@huggingface.co>	2025-02-11 10:36:06 +01:00
Remi	638d411cd3	Add Pi0 (#681 ) Co-authored-by: Simon Alibert <simon.alibert@huggingface.co> Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com> Co-authored-by: Pablo <pablo.montalvo.leroux@gmail.com>	2025-02-04 18:01:04 +01:00
Simon Alibert	3c0a209f9f	Simplify configs (#550 ) Co-authored-by: Remi <remi.cadene@huggingface.co> Co-authored-by: HUANG TZU-CHUN <137322177+tc-huang@users.noreply.github.com>	2025-01-31 13:57:37 +01:00
Hirokazu Ishida	538455a965	feat: enable to use multiple rgb encoders per camera in diffusion policy (#484 ) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-10-30 11:00:05 +01:00
Alexander Soare	a60d27b132	Raise ValueError if horizon is incompatible with downsampling (#422 )	2024-09-09 17:22:46 +01:00
Joe Clinton	f17d9a2ba1	Bug: Fix VQ-Bet not working when n_action_pred_token=1 (#420 ) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-09-09 09:41:13 +01:00
Jack Vial	b2896d38f5	fix(act): n_vae_encoder_layers config parameter wasn't being used (#400 )	2024-09-02 18:29:27 +01:00
NielsRogge	86bbd16d43	Improve discoverability on the hub (#325 ) Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>	2024-08-19 15:16:46 +02:00
Alexander Soare	0f6e0f6d74	Fix input dim (#365 )	2024-08-19 11:42:32 +01:00
Halvard Bariller	7a3cb1ad34	Adjust the timestamps' description in Diffusion Policy (#343 ) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-07-26 12:47:03 +01:00
Alexander Soare	f8a6574698	Add online training with TD-MPC as proof of concept (#338 )	2024-07-25 11:16:38 +01:00
Alexander Soare	abbb1d2367	Make sure policies don't mutate the batch (#323 )	2024-07-22 20:38:33 +01:00
Alexander Soare	c0101f0948	Fix ACT temporal ensembling (#319 )	2024-07-16 10:27:21 +01:00
Alexander Soare	471eab3d7e	Make ACT compatible with "observation.environment_state" (#314 )	2024-07-11 13:12:22 +01:00
Seungjae Lee	64425d5e00	Bug fix: fix error when setting select_target_actions_indices in vqbet (#310 ) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-07-10 17:56:11 +01:00
Alexander Soare	cc2f6e7404	Train diffusion pusht_keypoints (#307 ) Co-authored-by: Remi <re.cadene@gmail.com>	2024-07-09 12:35:50 +01:00
Simon Alibert	74362ac453	Add VQ-BeT copyrights (#299 )	2024-07-04 13:02:31 +02:00
Alexander Soare	342f429f1c	Add test to make sure policy dataclass configs match yaml configs (#292 )	2024-06-26 09:09:40 +01:00
Seungjae Lee	7d1542cae1	Add VQ-BeT (#166 )	2024-06-26 08:55:02 +01:00
Thomas Wolf	48951662f2	Bug fix: missing attention mask in VAE encoder in ACT policy (#279 ) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-06-19 12:07:21 +01:00
Jihoon Oh	b72d574891	fix Unet global_cond_dim to use state dim, not action dim (#278 )	2024-06-17 15:17:28 +01:00
Alexander Soare	15dd682714	Add multi-image support to diffusion policy (#218 )	2024-06-17 08:11:20 +01:00
Wael Karkoub	54c9776bde	Improves Type Annotations (#252 )	2024-06-10 19:09:48 +01:00
Ruijie	b0d954c6e1	Fix bug in normalize to avoid divide by zero (#239 ) Co-authored-by: rj <rj@teleopstrio-razer.lan> Co-authored-by: Remi <re.cadene@gmail.com>	2024-06-04 12:21:28 +02:00
Alexander Soare	cf15cba5fc	Remove redundant slicing operation in Diffusion Policy (#240 )	2024-06-03 13:04:24 +01:00
Remi	d585c73f9f	Add real-world support for ACT on Aloha/Aloha2 (#228 ) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-05-31 15:31:02 +02:00
Alexander Soare	57fb5fe8a6	Improve documentation on VAE encoder inputs (#215 )	2024-05-30 19:16:44 +02:00
Alexander Soare	3d625ae6d3	Handle `crop_shape=None` in Diffusion Policy (#219 )	2024-05-28 18:27:33 +01:00
Radek Osmulski	3b86050ab0	throw an error if config.do_maks_loss and action_is_pad not provided in batch (#213 ) Co-authored-by: Alexander Soare <alexander.soare159@gmail.com>	2024-05-27 09:06:26 +01:00
Alexander Soare	5ec0af62c6	Explain why n_encoder_layers=1 (#193 )	2024-05-17 15:05:40 +01:00

173 Commits