Commit Graph

927 Commits

Author SHA1 Message Date
AdilZouitine 78c640b6d8 Refactor complementary_info handling in ReplayBuffer 2025-04-18 15:10:22 +02:00
AdilZouitine d5a87f67cf Handle gripper penalty 2025-04-18 15:10:22 +02:00
AdilZouitine 8bcf41761d fix caching 2025-04-18 15:10:22 +02:00
pre-commit-ci[bot] 1efaf02df9 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
AdilZouitine cf58890bb0 fix indentation issue 2025-04-18 15:10:22 +02:00
AdilZouitine 7c2c67fc3c Enhance SAC configuration and replay buffer with asynchronous prefetching support
- Added async_prefetch parameter to SACConfig for improved buffer management.
- Implemented get_iterator method in ReplayBuffer to support asynchronous prefetching of batches.
- Updated learner_server to utilize the new iterator for online and offline sampling, enhancing training efficiency.
2025-04-18 15:10:22 +02:00
AdilZouitine 70130b9841 Enhance SACPolicy to support shared encoder and optimize action selection
- Cached encoder output in select_action method to reduce redundant computations.
- Updated action selection and grasp critic calls to utilize cached encoder features when available.
2025-04-18 15:10:22 +02:00
AdilZouitine 6167886472 Enhance SACPolicy and learner server for improved grasp critic integration
- Updated SACPolicy to conditionally compute grasp critic losses based on the presence of discrete actions.
- Refactored the forward method to handle grasp critic model selection and loss computation more clearly.
- Adjusted learner server to utilize optimized parameters for grasp critic during training.
- Improved action handling in the ManiskillMockGripperWrapper to accommodate both tuple and single action inputs.
2025-04-18 15:10:22 +02:00
AdilZouitine f9fb9d4594 Refactor SACPolicy for improved readability and action dimension handling
- Cleaned up code formatting for better readability, including consistent spacing and removal of unnecessary blank lines.
- Consolidated continuous action dimension calculation to enhance clarity and maintainability.
- Simplified loss return statements in the forward method to improve code structure.
- Ensured grasp critic parameters are included conditionally based on configuration settings.
2025-04-18 15:10:22 +02:00
AdilZouitine d86d29fe21 Add mock gripper support and enhance SAC policy action handling
- Introduced mock_gripper parameter in ManiskillEnvConfig to enable gripper simulation.
- Added ManiskillMockGripperWrapper to adjust action space for environments with discrete actions.
- Updated SACPolicy to compute continuous action dimensions correctly, ensuring compatibility with the new gripper setup.
- Refactored action handling in the training loop to accommodate the changes in action dimensions.
2025-04-18 15:10:22 +02:00
AdilZouitine f83d215e7a Refactor SAC policy and training loop to enhance discrete action support
- Updated SACPolicy to conditionally compute losses for grasp critic based on num_discrete_actions.
- Simplified forward method to return loss outputs as a dictionary for better clarity.
- Adjusted learner_server to handle both main and grasp critic losses during training.
- Ensured optimizers are created conditionally for grasp critic based on configuration settings.
2025-04-18 15:10:22 +02:00
AdilZouitine 7361a11a4d Refactor SAC configuration and policy to support discrete actions
- Removed GraspCriticNetworkConfig class and integrated its parameters into SACConfig.
- Added num_discrete_actions parameter to SACConfig for better action handling.
- Updated SACPolicy to conditionally create grasp critic networks based on num_discrete_actions.
- Enhanced grasp critic forward pass to handle discrete actions and compute losses accordingly.
2025-04-18 15:10:22 +02:00
Michel Aractingi 0cce2fe0fa Added Gripper quantization wrapper and grasp penalty
removed complementary info from buffer and learner server
removed get_gripper_action function
added gripper parameters to `common/envs/configs.py`
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot] 88d26ae976 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
s1lent4gnt 3a2308d86f Add grasp critic to the training loop
- Integrated the grasp critic gradient update to the training loop in learner_server
- Added Adam optimizer and configured grasp critic learning rate in configuration_sac
- Added target critics networks update after the critics gradient step
2025-04-18 15:10:22 +02:00
s1lent4gnt fdd04efdb7 Add get_gripper_action method to GamepadController 2025-04-18 15:10:22 +02:00
s1lent4gnt ff18be18ad Add gripper penalty wrapper 2025-04-18 15:10:22 +02:00
s1lent4gnt 427720426b Add complementary info in the replay buffer
- Added complementary info in the add method
- Added complementary info in the sample method
2025-04-18 15:10:22 +02:00
s1lent4gnt 66693965c0 Add grasp critic
- Implemented grasp critic to evaluate gripper actions
- Added corresponding config parameters for tuning
2025-04-18 15:10:22 +02:00
pre-commit-ci[bot] 334cf8143e [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
AdilZouitine 5b49601072 Fix convergence of sac, multiple torch compile on the same model caused divergence 2025-04-18 15:10:22 +02:00
AdilZouitine 0185a0b6fd Fix cuda graph break 2025-04-18 15:10:22 +02:00
s1lent4gnt 70d418935d Fix: Prevent Invalid next_state References When optimize_memory=True (#918) 2025-04-18 15:10:22 +02:00
pre-commit-ci[bot] eb44a06a9b [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:10:22 +02:00
Michel Aractingi 8eb3c1510c Added support for controlling the gripper with the pygame interface of gamepad
Minor modifications in gym_manipulator to quantize the gripper actions
clamped the observations after F.resize in ConvertToLeRobotObservation wrapper due to a bug in F.resize, images were returned exceeding the maximum value of 1.0
2025-04-18 15:10:22 +02:00
AdilZouitine 4d5ecb082e Refactor SACPolicy for improved type annotations and readability
- Enhanced type annotations for variables in the `SACPolicy` class to improve code clarity.
- Updated method calls to use keyword arguments for better readability.
- Streamlined the extraction of batch components, ensuring consistent typing across the class methods.
2025-04-18 15:10:22 +02:00
AdilZouitine 6e687e2910 Refactor SACPolicy and learner_server for improved clarity and functionality
- Updated the `forward` method in `SACPolicy` to handle loss computation for actor, critic, and temperature models.
- Replaced direct calls to `compute_loss_*` methods with a unified `forward` method in `learner_server`.
- Enhanced batch processing by consolidating input parameters into a single dictionary for better readability and maintainability.
- Removed redundant code and improved documentation for clarity.
2025-04-18 15:10:22 +02:00
AdilZouitine eb710647bf Refactor actor_server.py for improved structure and logging
- Consolidated logging initialization and enhanced logging for actor processes.
- Streamlined the handling of gRPC connections and process management.
- Improved readability by organizing core algorithm functions and communication functions.
- Added detailed comments and documentation for clarity.
- Ensured proper queue management and shutdown handling for actor processes.
2025-04-18 15:10:22 +02:00
AdilZouitine 176557d770 Refactor learner_server.py for improved structure and clarity
- Removed unused imports and streamlined the code structure.
- Consolidated logging initialization and enhanced logging for training processes.
- Improved handling of training state loading and resume logic.
- Refactored transition and interaction message processing for better readability and maintainability.
- Added detailed comments and documentation for clarity.
2025-04-18 15:10:22 +02:00
AdilZouitine 3beab33fac Refactor imports in modeling_sac.py for improved organization
- Rearranged import statements for better readability.
- Removed unused imports and streamlined the code structure.
2025-04-18 15:10:22 +02:00
AdilZouitine c0ba4b4954 Refactor SACConfig properties for improved readability
- Simplified the `image_features` property to directly iterate over `input_features`.
- Removed unused imports and unnecessary code related to main execution, enhancing clarity and maintainability.
2025-04-18 15:10:22 +02:00
AdilZouitine 8fb373aeb2 fix 2025-04-18 15:10:22 +02:00
AdilZouitine 5a0ee06651 Enhance logging for actor and learner servers
- Implemented process-specific logging for actor and learner servers to improve traceability.
- Created a dedicated logs directory and ensured it exists before logging.
- Initialized logging with explicit log files for each process, including actor transitions, interactions, and policy.
- Updated the actor CLI to validate configuration and set up logging accordingly.
2025-04-18 15:10:22 +02:00
Michel Aractingi 05a237ce10 Added gripper control mechanism to gym_manipulator
Moved HilSerl env config to configs/env/configs.py
fixes in actor_server and modeling_sac and configuration_sac
added the possibility of ignoring missing keys in env_cfg in get_features_from_env_config function
2025-04-18 15:10:22 +02:00
AdilZouitine 88cc2b8fc8 Add WrapperConfig for environment wrappers and update SACConfig properties
- Introduced `WrapperConfig` dataclass for environment wrapper configurations.
- Updated `ManiskillEnvConfig` to include a `wrapper` field for enhanced environment management.
- Modified `SACConfig` to return `None` for `observation_delta_indices` and `action_delta_indices` properties.
- Refactored `make_robot_env` function to improve readability and maintainability.
2025-04-18 15:10:22 +02:00
Michel Aractingi b69132c79d Change HILSerlRobotEnvConfig to inherit from EnvConfig
Added support for hil_serl classifier to be trained with train.py
run classifier training by python lerobot/scripts/train.py --policy.type=hilserl_classifier
fixes in find_joint_limits, control_robot, end_effector_control_utils
2025-04-18 15:10:21 +02:00
AdilZouitine db897a1619 [WIP] Update SAC configuration and environment settings
- Reduced frame rate in `ManiskillEnvConfig` from 400 to 200.
- Enhanced `SACConfig` with new dataclasses for actor, learner, and network configurations.
- Improved input and output feature management in `SACConfig`.
- Refactored `actor_server` and `learner_server` to access configuration properties directly.
- Updated training pipeline to validate configurations and handle dataset repo IDs more robustly.
2025-04-18 15:09:46 +02:00
AdilZouitine 0b5b62c8fb Add wandb run id in config 2025-04-18 15:09:46 +02:00
AdilZouitine 056f79d358 [WIP] Non functional yet
Add ManiSkill environment configuration and wrappers

- Introduced `VideoRecordConfig` for video recording settings.
- Added `ManiskillEnvConfig` to encapsulate environment-specific configurations.
- Implemented various wrappers for the ManiSkill environment, including observation and action scaling.
- Enhanced the `make_maniskill` function to create a wrapped ManiSkill environment with video recording and observation processing.
- Updated the `actor_server` and `learner_server` to utilize the new configuration structure.
- Refactored the training pipeline to accommodate the new environment and policy configurations.
2025-04-18 15:09:46 +02:00
Michel Aractingi 114ec644d0 Change config logic in:
- gym_manipulator
- find_joint_limits
- end_effector_utils
2025-04-18 15:09:45 +02:00
AdilZouitine 26ee8b6ae5 Add .devcontainer to .gitignore for improved development environment management 2025-04-18 15:09:27 +02:00
AdilZouitine 38e8864284 Add task field to frame_dict in ReplayBuffer and simplify save_episode calls
- Introduced a new "task" field in frame_dict to meet the requirements of LeRobotDataset.
- Removed task_name parameter from save_episode calls for consistency.
2025-04-18 15:09:27 +02:00
AdilZouitine 80d566eb56 Handle new config with sac 2025-04-18 15:09:27 +02:00
AdilZouitine bb5a95889f Handle multi optimizers 2025-04-18 15:09:27 +02:00
pre-commit-ci[bot] 0ea27704f6 [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:09:25 +02:00
Michel Aractingi 2abbd60a0d Removed depleted files and scripts 2025-04-18 15:07:48 +02:00
pre-commit-ci[bot] 1c8daf11fd [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:07:46 +02:00
AdilZouitine cdcf346061 Update tensor device assignment in ReplayBuffer class
- Changed the device assignment for tensors in the ReplayBuffer class from `device` to `storage_device` for consistency and improved resource management.
2025-04-18 15:06:52 +02:00
pre-commit-ci[bot] 42f95e827d [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
2025-04-18 15:06:52 +02:00
AdilZouitine 618ed00d45 Initialize log_alpha with the logarithm of temperature_init in SACPolicy
- Updated the SACPolicy class to set log_alpha using the logarithm of the initial temperature value from the configuration.
2025-04-18 15:06:52 +02:00