Compare commits

...

13 Commits

Author SHA1 Message Date
Jess Moss 475d5e9a68
Merge 1667b7c1d7 into c10c5a0e64 2025-04-17 15:30:30 +02:00
Alex Thiele c10c5a0e64
Fix --width --height type parsing on opencv and intelrealsense scripts (#556)
Co-authored-by: Remi <remi.cadene@huggingface.co>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2025-04-17 15:19:23 +02:00
Junshan Huang a8db91c40e
Fix Windows HTML visualization so videos can be seen (#647)
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Steven Palma <imstevenpmwork@ieee.org>
2025-04-17 15:07:28 +02:00
HUANG TZU-CHUN 0f5f7ac780
Fix broken links in `examples/4_train_policy_with_script.md` (#697) 2025-04-17 14:59:43 +02:00
pre-commit-ci[bot] 768e36660d
[pre-commit.ci] pre-commit autoupdate (#980)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2025-04-14 21:55:06 +02:00
Caroline Pascal 790d6740ba
fix(installation): adding note on `ffmpeg` version during installation (#976)
Co-authored-by: Simon Alibert <75076266+aliberts@users.noreply.github.com>
2025-04-14 15:36:31 +02:00
jess-moss 1667b7c1d7 Merge branch 'main' of github.com:huggingface/lerobot into user/jmoss/2024_08_19_add_hiwonder_motors 2024-08-28 14:48:50 -05:00
jess-moss 699adc2130 Things are looking good. 2024-08-28 14:42:39 -05:00
jess-moss 8e05497b11 commit. 2024-08-28 09:01:09 -05:00
jess-moss 63d0a9841e Merge branch 'main' of github.com:huggingface/lerobot 2024-08-19 11:10:02 -05:00
jess-moss c0166949ad Merge branch 'main' of github.com:huggingface/lerobot 2024-08-05 11:22:19 -05:00
jess-moss 3fde016246 Merge branch 'main' of github.com:huggingface/lerobot 2024-07-22 08:48:50 -05:00
jess-moss e05066a88b Merge branch 'main' of github.com:huggingface/lerobot 2024-07-09 14:52:30 -05:00
8 changed files with 242 additions and 20 deletions

View File

@ -48,7 +48,7 @@ repos:
- id: pyupgrade
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.11.4
rev: v0.11.5
hooks:
- id: ruff
args: [--fix]
@ -57,7 +57,7 @@ repos:
##### Security #####
- repo: https://github.com/gitleaks/gitleaks
rev: v8.24.2
rev: v8.24.3
hooks:
- id: gitleaks

View File

@ -103,6 +103,13 @@ When using `miniconda`, install `ffmpeg` in your environment:
conda install ffmpeg -c conda-forge
```
> **NOTE:** This usually installs `ffmpeg 7.X` for your platform compiled with the `libsvtav1` encoder. If `libsvtav1` is not supported (check supported encoders with `ffmpeg -encoders`), you can:
> - _[On any platform]_ Explicitly install `ffmpeg 7.X` using:
> ```bash
> conda install ffmpeg=7.1.1 -c conda-forge
> ```
> - _[On Linux only]_ Install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1), and make sure the `ffmpeg` binary on your `PATH` is the one you built (check with `which ffmpeg`).
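As a quick way to verify encoder support (a minimal check, assuming `ffmpeg` is on your `PATH`):

```bash
# Look for libsvtav1 in the encoder list; print a hint if it is missing.
if ffmpeg -encoders 2>/dev/null | grep -q libsvtav1; then
    echo "libsvtav1 is available"
else
    echo "libsvtav1 not found: try 'conda install ffmpeg=7.1.1 -c conda-forge'"
fi
```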
Install 🤗 LeRobot:
```bash
pip install -e .

View File

@ -4,7 +4,7 @@ This tutorial will explain the training script, how to use it, and particularly
## The training script
LeRobot offers a training script at [`lerobot/scripts/train.py`](../../lerobot/scripts/train.py). At a high level it does the following:
LeRobot offers a training script at [`lerobot/scripts/train.py`](../lerobot/scripts/train.py). At a high level it does the following:
- Initializes/loads a configuration for the following steps.
- Instantiates a dataset.
@ -21,7 +21,7 @@ In the training script, the main function `train` expects a `TrainPipelineConfig
def train(cfg: TrainPipelineConfig):
```
You can inspect the `TrainPipelineConfig` defined in [`lerobot/configs/train.py`](../../lerobot/configs/train.py) (which is heavily commented and meant to be a reference to understand any option)
You can inspect the `TrainPipelineConfig` defined in [`lerobot/configs/train.py`](../lerobot/configs/train.py) (which is heavily commented and meant to be a reference to understand any option)
When running the script, command-line inputs are parsed thanks to the `@parser.wrap()` decorator, and an instance of this class is automatically generated. Under the hood, this is done with [Draccus](https://github.com/dlwh/draccus), a tool dedicated to this purpose. If you're familiar with Hydra, Draccus can similarly load configurations from config files (.json, .yaml) and override their values through command-line inputs. Unlike Hydra, these configurations are pre-defined in the code through dataclasses rather than being defined entirely in config files. This allows for more rigorous serialization/deserialization and typing, and lets you manipulate configurations directly in the code as objects rather than as dictionaries or namespaces (which enables nice IDE features such as autocomplete, jump-to-def, etc.)
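The dataclass-plus-dotted-override pattern can be sketched as follows (a minimal illustration, not Draccus internals; the config classes here are simplified stand-ins):

```python
from dataclasses import dataclass, field


# Simplified stand-ins for the real config classes (illustration only).
@dataclass
class DatasetConfig:
    repo_id: str = ""


@dataclass
class TrainPipelineConfig:
    dataset: DatasetConfig = field(default_factory=DatasetConfig)
    steps: int = 100_000


def apply_overrides(cfg, args):
    """Resolve dotted CLI keys like `dataset.repo_id` and set the leaf attribute."""
    for arg in args:
        key, value = arg.lstrip("-").split("=", 1)
        obj = cfg
        *parents, leaf = key.split(".")
        for name in parents:
            obj = getattr(obj, name)
        # Cast to the type of the current value (str stays str, int parses).
        setattr(obj, leaf, type(getattr(obj, leaf))(value))
    return cfg


cfg = apply_overrides(TrainPipelineConfig(), ["--dataset.repo_id=lerobot/pusht", "--steps=20000"])
print(cfg.dataset.repo_id, cfg.steps)  # → lerobot/pusht 20000
```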
@ -50,7 +50,7 @@ By default, every field takes its default value specified in the dataclass. If a
## Specifying values from the CLI
Let's say that we want to train [Diffusion Policy](../../lerobot/common/policies/diffusion) on the [pusht](https://huggingface.co/datasets/lerobot/pusht) dataset, using the [gym_pusht](https://github.com/huggingface/gym-pusht) environment for evaluation. The command to do so would look like this:
Let's say that we want to train [Diffusion Policy](../lerobot/common/policies/diffusion) on the [pusht](https://huggingface.co/datasets/lerobot/pusht) dataset, using the [gym_pusht](https://github.com/huggingface/gym-pusht) environment for evaluation. The command to do so would look like this:
```bash
python lerobot/scripts/train.py \
--dataset.repo_id=lerobot/pusht \
@ -60,10 +60,10 @@ python lerobot/scripts/train.py \
Let's break this down:
- To specify the dataset, we just need to specify its `repo_id` on the hub, which is the only required argument in the `DatasetConfig`. The rest of the fields have default values that suit us in this case, so we can just add the option `--dataset.repo_id=lerobot/pusht`.
- To specify the policy, we can just select diffusion policy using `--policy` appended with `.type`. Here, `.type` is a special argument which allows us to select config classes inheriting from `draccus.ChoiceRegistry` and that have been decorated with the `register_subclass()` method. To have a better explanation of this feature, have a look at this [Draccus demo](https://github.com/dlwh/draccus?tab=readme-ov-file#more-flexible-configuration-with-choice-types). In our code, we use this mechanism mainly to select policies, environments, robots, and some other components like optimizers. The policies available to select are located in [lerobot/common/policies](../../lerobot/common/policies)
- Similarly, we select the environment with `--env.type=pusht`. The different environment configs are available in [`lerobot/common/envs/configs.py`](../../lerobot/common/envs/configs.py)
- To specify the policy, we can just select diffusion policy using `--policy` appended with `.type`. Here, `.type` is a special argument which allows us to select config classes inheriting from `draccus.ChoiceRegistry` and that have been decorated with the `register_subclass()` method. To have a better explanation of this feature, have a look at this [Draccus demo](https://github.com/dlwh/draccus?tab=readme-ov-file#more-flexible-configuration-with-choice-types). In our code, we use this mechanism mainly to select policies, environments, robots, and some other components like optimizers. The policies available to select are located in [lerobot/common/policies](../lerobot/common/policies)
- Similarly, we select the environment with `--env.type=pusht`. The different environment configs are available in [`lerobot/common/envs/configs.py`](../lerobot/common/envs/configs.py)
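The `.type` selection mechanism described above can be sketched like this (illustrative only, not the real `draccus.ChoiceRegistry`; the class and field names are made up):

```python
from dataclasses import dataclass

# Registry mapping a short name to a config class, selectable via --policy.type=<name>.
POLICY_REGISTRY = {}


def register_subclass(name):
    def wrap(cls):
        POLICY_REGISTRY[name] = cls
        return cls
    return wrap


@dataclass
class PolicyConfig:
    pass


@register_subclass("diffusion")
@dataclass
class DiffusionConfig(PolicyConfig):
    horizon: int = 16


@register_subclass("act")
@dataclass
class ACTConfig(PolicyConfig):
    chunk_size: int = 100


def make_policy_config(choice: str) -> PolicyConfig:
    # What `--policy.type=diffusion` resolves to under the hood.
    return POLICY_REGISTRY[choice]()


cfg = make_policy_config("diffusion")
print(type(cfg).__name__, cfg.horizon)  # → DiffusionConfig 16
```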
Let's see another example. Let's say you've been training [ACT](../../lerobot/common/policies/act) on [lerobot/aloha_sim_insertion_human](https://huggingface.co/datasets/lerobot/aloha_sim_insertion_human) using the [gym-aloha](https://github.com/huggingface/gym-aloha) environment for evaluation with:
Let's see another example. Let's say you've been training [ACT](../lerobot/common/policies/act) on [lerobot/aloha_sim_insertion_human](https://huggingface.co/datasets/lerobot/aloha_sim_insertion_human) using the [gym-aloha](https://github.com/huggingface/gym-aloha) environment for evaluation with:
```bash
python lerobot/scripts/train.py \
--policy.type=act \
@ -74,7 +74,7 @@ python lerobot/scripts/train.py \
> Notice we added `--output_dir` to explicitly tell where to write outputs from this run (checkpoints, training state, configs, etc.). This is not mandatory; if you don't specify it, a default directory is created from the current date and time, `env.type`, and `policy.type`. This will typically look like `outputs/train/2025-01-24/16-10-05_aloha_act`.
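The default naming scheme above can be sketched as (an illustration of the pattern, not the exact LeRobot code):

```python
from datetime import datetime
from pathlib import Path


def default_output_dir(env_type: str, policy_type: str, now: datetime) -> str:
    # outputs/train/<date>/<time>_<env.type>_<policy.type>
    return (
        Path("outputs/train")
        / now.strftime("%Y-%m-%d")
        / f"{now.strftime('%H-%M-%S')}_{env_type}_{policy_type}"
    ).as_posix()


print(default_output_dir("aloha", "act", datetime(2025, 1, 24, 16, 10, 5)))
# → outputs/train/2025-01-24/16-10-05_aloha_act
```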
We now want to train a different policy for aloha on another task. We'll change the dataset and use [lerobot/aloha_sim_transfer_cube_human](https://huggingface.co/datasets/lerobot/aloha_sim_transfer_cube_human) instead. Of course, we also need to change the task of the environment as well to match this other task.
Looking at the [`AlohaEnv`](../../lerobot/common/envs/configs.py) config, the task is `"AlohaInsertion-v0"` by default, which corresponds to the task we trained on in the command above. The [gym-aloha](https://github.com/huggingface/gym-aloha?tab=readme-ov-file#description) environment also has the `AlohaTransferCube-v0` task which corresponds to this other task we want to train on. Putting this together, we can train this new policy on this different task using:
Looking at the [`AlohaEnv`](../lerobot/common/envs/configs.py) config, the task is `"AlohaInsertion-v0"` by default, which corresponds to the task we trained on in the command above. The [gym-aloha](https://github.com/huggingface/gym-aloha?tab=readme-ov-file#description) environment also has the `AlohaTransferCube-v0` task which corresponds to this other task we want to train on. Putting this together, we can train this new policy on this different task using:
```bash
python lerobot/scripts/train.py \
--policy.type=act \

View File

@ -830,11 +830,6 @@ It contains:
- `dtRphone:33.84 (29.5hz)` which is the delta time of capturing an image from the phone camera in the thread running asynchronously.
Troubleshooting:
- On Linux, if you encounter any issue during video encoding with `ffmpeg: unknown encoder libsvtav1`, you can:
- install with conda-forge by running `conda install -c conda-forge ffmpeg` (it should be compiled with `libsvtav1`),
> **NOTE:** This usually installs `ffmpeg 7.X` for your platform (check the version installed with `ffmpeg -encoders | grep libsvtav1`). If it isn't `ffmpeg 7.X` or lacks `libsvtav1` support, you can explicitly install `ffmpeg 7.X` using: `conda install ffmpeg=7.1.1 -c conda-forge`
- or, install [ffmpeg build dependencies](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#GettheDependencies) and [compile ffmpeg from source with libsvtav1](https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu#libsvtav1),
- and, make sure you use the corresponding ffmpeg binary to your install with `which ffmpeg`.
- On Linux, if the left and right arrow keys and escape key don't have any effect during data recording, make sure you've set the `$DISPLAY` environment variable. See [pynput limitations](https://pynput.readthedocs.io/en/latest/limitations.html#linux).
At the end of data recording, your dataset will be uploaded to your Hugging Face page (e.g. https://huggingface.co/datasets/cadene/koch_test), whose URL you can obtain by running:

View File

@ -512,13 +512,13 @@ if __name__ == "__main__":
)
parser.add_argument(
"--width",
type=str,
type=int,
default=640,
help="Set the width for all cameras. If not provided, use the default width of each camera.",
)
parser.add_argument(
"--height",
type=str,
type=int,
default=480,
help="Set the height for all cameras. If not provided, use the default height of each camera.",
)

View File

@ -492,13 +492,13 @@ if __name__ == "__main__":
)
parser.add_argument(
"--width",
type=str,
type=int,
default=None,
help="Set the width for all cameras. If not provided, use the default width of each camera.",
)
parser.add_argument(
"--height",
type=str,
type=int,
default=None,
help="Set the height for all cameras. If not provided, use the default height of each camera.",
)

View File

@ -0,0 +1,217 @@
import traceback
import numpy as np
import serial
from lerobot.common.robot_devices.utils import RobotDeviceAlreadyConnectedError, RobotDeviceNotConnectedError
BAUDRATE = 115_200
TIMEOUT_S = 1
COMMAND_HEADER = [0x55, 0x55] # Header for all commands sent to Hiwonder motors
BYTE_MASK = 0xFF
# https://drive.google.com/file/d/1DQGBBng8UFziv5hg15qmyvakItG3w6Ss/view?usp=sharing
# data_name: (command, length)
HIWONDER_CONTROL_TABLE = {
"SERVO_MOVE_TIME_WRITE": (0x01, 0x07),
"SERVO_MOVE_TIME_READ": (0x02, 0x03),
"SERVO_MOVE_TIME_WAIT_WRITE": (0x07, 0x07),
"SERVO_MOVE_TIME_WAIT_READ": (0x08, 0x03),
"SERVO_MOVE_START": (0x0B, 0x03),
"SERVO_MOVE_STOP": (0x0C, 0x03),
"SERVO_ID_WRITE": (0x0D, 0x04),
"SERVO_ID_READ": (0x0E, 0x03),
"SERVO_ANGLE_OFFSET_ADJUST": (0x11, 0x04),
"SERVO_ANGLE_OFFSET_WRITE": (0x12, 0x03),
"SERVO_ANGLE_OFFSET_READ": (0x13, 0x03),
"SERVO_ANGLE_LIMIT_WRITE": (0x14, 0x07),
"SERVO_ANGLE_LIMIT_READ": (0x15, 0x03),
"SERVO_VIN_LIMIT_WRITE": (0x16, 0x07),
"SERVO_VIN_LIMIT_READ": (0x17, 0x03),
"SERVO_TEMP_MAX_LIMIT_WRITE": (0x18, 0x04),
"SERVO_TEMP_MAX_LIMIT_READ": (0x19, 0x03),
"SERVO_TEMP_READ": (0x1A, 0x03),
"SERVO_VIN_READ": (0x1B, 0x03),
"SERVO_POS_READ": (0x1C, 0x03),
"SERVO_OR_MOTOR_MODE_WRITE": (0x1D, 0x07),
"SERVO_OR_MOTOR_MODE_READ": (0x1E, 0x03),
"SERVO_LOAD_OR_UNLOAD_WRITE": (0x1F, 0x04),
"SERVO_LOAD_OR_UNLOAD_READ": (0x20, 0x03),
"SERVO_LED_CTRL_WRITE": (0x21, 0x04),
"SERVO_LED_CTRL_READ": (0x22, 0x03),
"SERVO_LED_ERROR_WRITE": (0x23, 0x04),
"SERVO_LED_ERROR_READ": (0x24, 0x03),
}
def calculate_checksum(data):
"""Calculate the checksum for the given data."""
checksum = ~(sum(data)) & BYTE_MASK
return checksum
def low_byte(value):
"""Extract the low byte of a 16-bit integer."""
return value & BYTE_MASK
def high_byte(value):
"""Extract the high byte of a 16-bit integer."""
return (value >> 8) & BYTE_MASK
class HiwonderMotorsBus:
"""
The HiwonderMotorsBus class allows efficient reads from and writes to the attached motors. It relies on
the [Hiwonder Bus Communication Protocol](https://drive.google.com/file/d/1JKyt_OUg9V6cIBC-SiX6IIAACsvz86aB/view?usp=sharing).
A HiwonderMotorsBus instance requires a port (e.g. `HiwonderMotorsBus(port="/dev/tty.usbmodem575E0031751")`).
Example of usage for 1 motor connected to the bus:
```python
motor_name = "gripper"
motor_index = 6
motor_model = "lx-16a"
motors_bus = HiwonderMotorsBus(
port="/dev/tty.usbmodem575E0031751",
motors={motor_name: (motor_index, motor_model)},
)
motors_bus.connect()
positions = motors_bus.read()
# move a few motor steps as an example
few_steps = 30
motors_bus.write(positions[motor_name] + few_steps)
# when done, consider disconnecting
motors_bus.disconnect()
```
"""
def __init__(self, port: str, motors: dict[str, tuple[int, str]]):
self.port = port
self.serial = None
self.is_connected = False
self.motors = motors
def connect(self):
"""Open the serial port and establish a connection."""
if self.is_connected:
raise RobotDeviceAlreadyConnectedError(
f"HiwonderMotorsBus({self.port}) is already connected. Do not call `motors_bus.connect()` twice."
)
try:
self.serial = serial.Serial(self.port, baudrate=BAUDRATE, timeout=TIMEOUT_S)
self.is_connected = True
print("Connection established")
except Exception:
traceback.print_exc()
print("\nCould not open port.\n")
raise
def disconnect(self):
"""Close the serial port connection."""
if self.serial:
self.serial.close()
self.is_connected = False
print("Connection closed")
@property
def motor_names(self) -> list[str]:
return list(self.motors.keys())
def write(self, values: int | float | np.ndarray, motor_names: str | list[str] | None = None):
if not self.is_connected:
raise RobotDeviceNotConnectedError(f"HiwonderMotorsBus({self.port}) is not connected.")
if motor_names is None:
motor_names = self.motor_names
if isinstance(motor_names, str):
motor_names = [motor_names]
if isinstance(values, (int, float)):
values = [values] * len(
motor_names
) # Replicate value for each motor if a single value is provided
motor_commands = zip(motor_names, values, strict=False)
for motor_name, value in motor_commands:
motor_id, _ = self.motors[motor_name] # Using only the ID for the command
duration = 0 # Get to the next position as soon as possible
# Command structure: 55 55 ID LEN CMD P1 P2 P3 P4 CHK
# Retrieve command and length from the control table
cmd, length = HIWONDER_CONTROL_TABLE["SERVO_MOVE_TIME_WRITE"]
# Prepare the packet
data = [
motor_id,
length,
cmd,
low_byte(value),
high_byte(value),
low_byte(duration),
high_byte(duration),
]
checksum = calculate_checksum(data)
command = COMMAND_HEADER + data + [checksum]
self.serial.write(bytearray(command))
def read(self, motor_names: str | list[str] | None = None):
"""Send a command to read the current position of the motors."""
if not self.is_connected:
raise RobotDeviceNotConnectedError(
f"HiwonderMotorsBus({self.port}) is not connected. You need to run `motors_bus.connect()`."
)
if motor_names is None:
motor_names = self.motor_names
if isinstance(motor_names, str):
motor_names = [motor_names]
motor_ids = [self.motors[name][0] for name in motor_names] # Extract motor IDs based on names
positions = {}
for motor_name, motor_id in zip(motor_names, motor_ids, strict=False):
# Command structure for reading position: 55 55 ID LEN CMD CHK
# Retrieve command and length from the control table
cmd, length = HIWONDER_CONTROL_TABLE["SERVO_POS_READ"]
# Prepare the command packet
data = [motor_id, length, cmd]
checksum = calculate_checksum(data)
command = COMMAND_HEADER + data + [checksum]
# Send the command
self.serial.write(bytearray(command))
# Wait for response and handle it
try:
response = self.serial.read(8) # Adjust size based on expected response
if len(response) == 8:
header1, header2, resp_id, resp_len, resp_cmd, param1, param2, resp_chk = response
# Calculate expected checksum and compare
expected_checksum = calculate_checksum(response[2:-1])
if resp_chk == expected_checksum:
# Combine param1 and param2 to form the actual position
position = param1 + (param2 << 8)
positions[motor_name] = position # Store position with motor ID as the key
else:
print(f"Checksum mismatch for motor {motor_id}.")
else:
print(f"No response or invalid response for motor {motor_id}.")
except Exception as e:
print(f"Failed to read position for motor {motor_id}: {str(e)}")
return positions
# Example usage
if __name__ == "__main__":
pass
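The write path above can be illustrated end to end. The following standalone sketch re-implements the packet helpers so it runs on its own (the motor ID and position are example values, not from the source):

```python
# Standalone sketch of the SERVO_MOVE_TIME_WRITE packet layout used above:
# 55 55 ID LEN CMD POS_L POS_H TIME_L TIME_H CHK
BYTE_MASK = 0xFF
COMMAND_HEADER = [0x55, 0x55]


def calculate_checksum(data):
    # Checksum is the bitwise complement of the byte sum, truncated to one byte.
    return ~sum(data) & BYTE_MASK


def move_packet(motor_id: int, position: int, duration_ms: int = 0) -> list[int]:
    cmd, length = 0x01, 0x07  # SERVO_MOVE_TIME_WRITE entry from the control table
    data = [
        motor_id, length, cmd,
        position & BYTE_MASK, (position >> 8) & BYTE_MASK,
        duration_ms & BYTE_MASK, (duration_ms >> 8) & BYTE_MASK,
    ]
    return COMMAND_HEADER + data + [calculate_checksum(data)]


print(move_packet(1, 500))
# → [85, 85, 1, 7, 1, 244, 1, 0, 0, 1]
```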

View File

@ -174,7 +174,10 @@ def run_server(
dataset.meta.get_video_file_path(episode_id, key) for key in dataset.meta.video_keys
]
videos_info = [
{"url": url_for("static", filename=video_path), "filename": video_path.parent.name}
{
"url": url_for("static", filename=str(video_path).replace("\\", "/")),
"filename": video_path.parent.name,
}
for video_path in video_paths
]
tasks = dataset.meta.episodes[episode_id]["tasks"]
@ -381,7 +384,7 @@ def visualize_dataset_html(
if isinstance(dataset, LeRobotDataset):
ln_videos_dir = static_dir / "videos"
if not ln_videos_dir.exists():
ln_videos_dir.symlink_to((dataset.root / "videos").resolve())
ln_videos_dir.symlink_to((dataset.root / "videos").resolve().as_posix())
if serve:
run_server(dataset, episodes, host, port, static_dir, template_dir)