Commit Graph

919 Commits

Author SHA1 Message Date
AdilZouitine 7ad93bdbf1 fix caching and make dataset stats optional 2025-04-09 13:20:51 +00:00
AdilZouitine ab2c2d39fb fix bug 2025-04-08 09:31:29 +00:00
AdilZouitine 9f6f508edb add softmax q network 2025-04-08 09:14:49 +00:00
AdilZouitine a8135629b4 Add rounding for safety 2025-04-08 08:50:02 +00:00
pre-commit-ci[bot] a7be613ee8 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-07 15:48:40 +00:00
AdilZouitine 632b2b46c1 fix sign issue 2025-04-07 15:44:06 +00:00
AdilZouitine 6c10390653 Refactor complementary_info handling in ReplayBuffer 2025-04-07 14:48:42 +00:00
AdilZouitine 4621f4e4f3 Handle gripper penalty 2025-04-07 08:23:49 +00:00
AdilZouitine 7741526ce4 fix caching 2025-04-04 14:29:38 +00:00
pre-commit-ci[bot] 037ecae9e0 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-04 07:59:23 +00:00
AdilZouitine e86fe66dbd fix indentation issue 2025-04-03 16:05:29 +00:00
AdilZouitine 38a8dbd9c9 Enhance SAC configuration and replay buffer with asynchronous prefetching support
- Added async_prefetch parameter to SACConfig for improved buffer management.
- Implemented get_iterator method in ReplayBuffer to support asynchronous prefetching of batches.
- Updated learner_server to utilize the new iterator for online and offline sampling, enhancing training efficiency.
2025-04-03 14:23:50 +00:00
AdilZouitine 51f1625c20 Enhance SACPolicy to support shared encoder and optimize action selection
- Cached encoder output in select_action method to reduce redundant computations.
- Updated action selection and grasp critic calls to utilize cached encoder features when available.
2025-04-03 07:44:46 +00:00
AdilZouitine 0ed7ff142c Enhance SACPolicy and learner server for improved grasp critic integration
- Updated SACPolicy to conditionally compute grasp critic losses based on the presence of discrete actions.
- Refactored the forward method to handle grasp critic model selection and loss computation more clearly.
- Adjusted learner server to utilize optimized parameters for grasp critic during training.
- Improved action handling in the ManiskillMockGripperWrapper to accommodate both tuple and single action inputs.
2025-04-02 15:50:39 +00:00
AdilZouitine 699d374d89 Refactor SACPolicy for improved readability and action dimension handling
- Cleaned up code formatting for better readability, including consistent spacing and removal of unnecessary blank lines.
- Consolidated continuous action dimension calculation to enhance clarity and maintainability.
- Simplified loss return statements in the forward method to improve code structure.
- Ensured grasp critic parameters are included conditionally based on configuration settings.
2025-04-01 15:43:29 +00:00
AdilZouitine 451a7b01db Add mock gripper support and enhance SAC policy action handling
- Introduced mock_gripper parameter in ManiskillEnvConfig to enable gripper simulation.
- Added ManiskillMockGripperWrapper to adjust action space for environments with discrete actions.
- Updated SACPolicy to compute continuous action dimensions correctly, ensuring compatibility with the new gripper setup.
- Refactored action handling in the training loop to accommodate the changes in action dimensions.
2025-04-01 14:22:08 +00:00
AdilZouitine 306c735172 Refactor SAC policy and training loop to enhance discrete action support
- Updated SACPolicy to conditionally compute losses for grasp critic based on num_discrete_actions.
- Simplified forward method to return loss outputs as a dictionary for better clarity.
- Adjusted learner_server to handle both main and grasp critic losses during training.
- Ensured optimizers are created conditionally for grasp critic based on configuration settings.
2025-04-01 11:42:28 +00:00
AdilZouitine 6a215f47dd Refactor SAC configuration and policy to support discrete actions
- Removed GraspCriticNetworkConfig class and integrated its parameters into SACConfig.
- Added num_discrete_actions parameter to SACConfig for better action handling.
- Updated SACPolicy to conditionally create grasp critic networks based on num_discrete_actions.
- Enhanced grasp critic forward pass to handle discrete actions and compute losses accordingly.
2025-04-01 11:32:24 +02:00
Michel Aractingi fe2ff516a8 Added Gripper quantization wrapper and grasp penalty
removed complementary info from buffer and learner server
removed get_gripper_action function
added gripper parameters to `common/envs/configs.py`
2025-04-01 11:08:15 +02:00
pre-commit-ci[bot] 7983baf4fc [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-03-31 16:10:01 +00:00
s1lent4gnt c774bbe522 Add grasp critic to the training loop
- Integrated the grasp critic gradient update to the training loop in learner_server
- Added Adam optimizer and configured grasp critic learning rate in configuration_sac
- Added target critics networks update after the critics gradient step
2025-03-31 18:06:21 +02:00
s1lent4gnt 2c1e5fa28b Add get_gripper_action method to GamepadController 2025-03-31 17:40:00 +02:00
s1lent4gnt 7452f9baaa Add gripper penalty wrapper 2025-03-31 17:38:16 +02:00
s1lent4gnt 007fee9230 Add complementary info in the replay buffer
- Added complementary info in the add method
- Added complementary info in the sample method
2025-03-31 17:36:35 +02:00
s1lent4gnt 4a1c26d9ee Add grasp critic
- Implemented grasp critic to evaluate gripper actions
- Added corresponding config parameters for tuning
2025-03-31 17:35:59 +02:00
pre-commit-ci[bot] 0f706ce543 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-03-31 13:59:32 +00:00
AdilZouitine 026ad463a9 Fix convergence of SAC: multiple torch.compile calls on the same model caused divergence 2025-03-31 13:54:21 +00:00
AdilZouitine 8494634d48 Fix cuda graph break 2025-03-31 07:59:56 +00:00
s1lent4gnt 66c3672738 Fix: Prevent Invalid next_state References When optimize_memory=True (#918) 2025-03-31 09:43:40 +02:00
pre-commit-ci[bot] c05e4835d0 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-03-28 17:20:39 +00:00
Michel Aractingi 808cf63221 Added support for controlling the gripper with the pygame interface of gamepad
Minor modifications in gym_manipulator to quantize the gripper actions
clamped the observations after F.resize in the ConvertToLeRobotObservation wrapper because of a bug in F.resize: images were returned with values exceeding the maximum of 1.0
2025-03-28 17:18:48 +00:00
AdilZouitine 0150139668 Refactor SACPolicy for improved type annotations and readability
- Enhanced type annotations for variables in the `SACPolicy` class to improve code clarity.
- Updated method calls to use keyword arguments for better readability.
- Streamlined the extraction of batch components, ensuring consistent typing across the class methods.
2025-03-28 17:18:48 +00:00
AdilZouitine b3ad63cf6e Refactor SACPolicy and learner_server for improved clarity and functionality
- Updated the `forward` method in `SACPolicy` to handle loss computation for actor, critic, and temperature models.
- Replaced direct calls to `compute_loss_*` methods with a unified `forward` method in `learner_server`.
- Enhanced batch processing by consolidating input parameters into a single dictionary for better readability and maintainability.
- Removed redundant code and improved documentation for clarity.
2025-03-28 17:18:48 +00:00
AdilZouitine 8b02e81bb5 Refactor actor_server.py for improved structure and logging
- Consolidated logging initialization and enhanced logging for actor processes.
- Streamlined the handling of gRPC connections and process management.
- Improved readability by organizing core algorithm functions and communication functions.
- Added detailed comments and documentation for clarity.
- Ensured proper queue management and shutdown handling for actor processes.
2025-03-28 17:18:48 +00:00
AdilZouitine dcce446a66 Refactor learner_server.py for improved structure and clarity
- Removed unused imports and streamlined the code structure.
- Consolidated logging initialization and enhanced logging for training processes.
- Improved handling of training state loading and resume logic.
- Refactored transition and interaction message processing for better readability and maintainability.
- Added detailed comments and documentation for clarity.
2025-03-28 17:18:48 +00:00
AdilZouitine 82a6b69e0e Refactor imports in modeling_sac.py for improved organization
- Rearranged import statements for better readability.
- Removed unused imports and streamlined the code structure.
2025-03-28 17:18:48 +00:00
AdilZouitine 6f7024242a Refactor SACConfig properties for improved readability
- Simplified the `image_features` property to directly iterate over `input_features`.
- Removed unused imports and unnecessary code related to main execution, enhancing clarity and maintainability.
2025-03-28 17:18:48 +00:00
AdilZouitine 3c56ad33c3 fix 2025-03-28 17:18:48 +00:00
AdilZouitine 49baa1ff49 Enhance logging for actor and learner servers
- Implemented process-specific logging for actor and learner servers to improve traceability.
- Created a dedicated logs directory and ensured it exists before logging.
- Initialized logging with explicit log files for each process, including actor transitions, interactions, and policy.
- Updated the actor CLI to validate configuration and set up logging accordingly.
2025-03-28 17:18:48 +00:00
Michel Aractingi 02b9ea9446 Added gripper control mechanism to gym_manipulator
Moved HilSerl env config to configs/env/configs.py
fixes in actor_server and modeling_sac and configuration_sac
added the possibility of ignoring missing keys in env_cfg in get_features_from_env_config function
2025-03-28 17:18:48 +00:00
AdilZouitine 79e0f6e06c Add WrapperConfig for environment wrappers and update SACConfig properties
- Introduced `WrapperConfig` dataclass for environment wrapper configurations.
- Updated `ManiskillEnvConfig` to include a `wrapper` field for enhanced environment management.
- Modified `SACConfig` to return `None` for `observation_delta_indices` and `action_delta_indices` properties.
- Refactored `make_robot_env` function to improve readability and maintainability.
2025-03-28 17:18:48 +00:00
Michel Aractingi d0b7690bc0 Change HILSerlRobotEnvConfig to inherit from EnvConfig
Added support for hil_serl classifier to be trained with train.py
run classifier training with `python lerobot/scripts/train.py --policy.type=hilserl_classifier`
fixes in find_joint_limits, control_robot, end_effector_control_utils
2025-03-28 17:18:48 +00:00
AdilZouitine 052a4acfc2 [WIP] Update SAC configuration and environment settings
- Reduced frame rate in `ManiskillEnvConfig` from 400 to 200.
- Enhanced `SACConfig` with new dataclasses for actor, learner, and network configurations.
- Improved input and output feature management in `SACConfig`.
- Refactored `actor_server` and `learner_server` to access configuration properties directly.
- Updated training pipeline to validate configurations and handle dataset repo IDs more robustly.
2025-03-28 17:18:48 +00:00
AdilZouitine 626e5dd35c Add wandb run id in config 2025-03-28 17:18:48 +00:00
AdilZouitine dd37bd412e [WIP] Non-functional yet
Add ManiSkill environment configuration and wrappers
- Introduced `VideoRecordConfig` for video recording settings.
- Added `ManiskillEnvConfig` to encapsulate environment-specific configurations.
- Implemented various wrappers for the ManiSkill environment, including observation and action scaling.
- Enhanced the `make_maniskill` function to create a wrapped ManiSkill environment with video recording and observation processing.
- Updated the `actor_server` and `learner_server` to utilize the new configuration structure.
- Refactored the training pipeline to accommodate the new environment and policy configurations.
2025-03-28 17:18:48 +00:00
Michel Aractingi b7b6d8102f Change config logic in:
- gym_manipulator
- find_joint_limits
- end_effector_utils
2025-03-28 17:18:48 +00:00
AdilZouitine ee25fd8afe Add .devcontainer to .gitignore for improved development environment management 2025-03-28 17:18:48 +00:00
AdilZouitine 5fbbc65869 Add task field to frame_dict in ReplayBuffer and simplify save_episode calls
- Introduced a new "task" field in frame_dict to meet the requirements of LeRobotDataset.
- Removed task_name parameter from save_episode calls for consistency.
2025-03-28 17:18:48 +00:00
AdilZouitine f483931fc0 Handle new config with sac 2025-03-28 17:18:48 +00:00
AdilZouitine b2025b852c Handle multi optimizers 2025-03-28 17:18:48 +00:00