[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
parent ea81279964
commit 9726f20661
@@ -2,7 +2,7 @@
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for Visuomotor Policy Learning</h1>
This policy is community contributed. For more information about DexVLA, see [the official repository](https://github.com/juruobenruo/DexVLA).
See the [project website](https://dex-vla.github.io/).
## Dataset
### Data format
@@ -141,4 +141,4 @@ python lerobot/scripts/eval.py \
~~~
### Inference Speed
Tested on a single A6000 GPU, DexVLA can infer 3.4 action chunks per second. If 25 actions from each chunk are executed, the effective control frequency reaches 85 Hz (3.4 × 25).
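The arithmetic behind this figure can be sketched as follows (a minimal illustration; `3.4` chunks/s and `25` actions per chunk are the numbers quoted above, not re-measured here):

```python
def effective_control_frequency(chunks_per_second: float, actions_per_chunk: int) -> float:
    """Effective control frequency in Hz when every inferred action chunk
    is executed in full before the next chunk is consumed."""
    return chunks_per_second * actions_per_chunk

# Figures quoted above: 3.4 chunks/s on a single A6000, 25 actions per chunk.
print(effective_control_frequency(3.4, 25))  # 85.0
```

Note this assumes inference and execution overlap perfectly; in practice the achievable rate also depends on robot I/O latency.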
@@ -16,6 +16,7 @@
import torch.nn as nn
class ActionProjector(nn.Module):
    def __init__(self, in_dim, out_dim=1024):
        super().__init__()
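The hunk above shows only the constructor signature of `ActionProjector`. As a hypothetical sketch of how such a module could map action features of size `in_dim` into a 1024-dimensional embedding space (the MLP body here is an assumption for illustration, not the repository's actual implementation):

```python
import torch
import torch.nn as nn


class ActionProjectorSketch(nn.Module):
    """Hypothetical sketch: project action features into a fixed-size
    embedding via a small two-layer MLP. The real ActionProjector in the
    repository may use a different architecture."""

    def __init__(self, in_dim: int, out_dim: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.GELU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# Example: project a batch of 2 seven-dimensional action vectors.
proj = ActionProjectorSketch(in_dim=7)
out = proj(torch.randn(2, 7))
print(out.shape)  # torch.Size([2, 1024])
```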
@@ -19,6 +19,7 @@ import torch
from PIL import Image
from qwen_vl_utils import fetch_image
class Qwen2VLAProcess:
    def __init__(
        self,