From 5389761b06e42421dd7e985383c12a2d30fcbeca Mon Sep 17 00:00:00 2001
From: lihengzhong <lihengzhong@baidu.com>
Date: Sat, 13 Jan 2024 20:12:08 +0800
Subject: [PATCH] update readme

---
 README.md | 44 +++++++++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/README.md b/README.md
index e84773c..8720ad7 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,11 @@ A streaming digital human based on the Ernerf model， realize audio video synch
 
 [![Watch the video]](/assets/demo.mp4)
 
-## Installation
+## 1. Installation
 
-Tested on Ubuntu 18.04, Python3.10, Pytorch 1.12 and CUDA 11.3
+Tested on Ubuntu 20.04, Python 3.10, PyTorch 1.12 and CUDA 11.3
 
-### Install dependency
+### 1.1 Install dependencies
 
 ```bash
 conda create -n nerfstream python=3.10
@@ -23,14 +23,14 @@ For Linux CUDA environment setup, refer to this article https://zhuanlan.zhihu.com/p/6749
 Refer to https://github.com/lipku/python_rtmpstream
 
 
-## Run
+## 2. Run
 
-### Run the RTMP server (SRS)
+### 2.1 Run the RTMP server (SRS)
 ```
 docker run --rm -it -p 1935:1935 -p 1985:1985 -p 8080:8080 registry.cn-hangzhou.aliyuncs.com/ossrs/srs:5
 ```
 
-### Start:
+### 2.2 Start the digital human:
 
 ```bash
 python app.py
@@ -43,7 +43,7 @@ export HF_ENDPOINT=https://hf-mirror.com
 
 Once it is running, open rtmp://serverip/live/livestream in VLC
 
-### Have the digital human speak text entered on a web page
+### 2.3 Have the digital human speak text entered on a web page
 Install and start nginx
 ```
 apt install nginx
@@ -52,24 +52,20 @@ nginx
 In echo.html, update the websocket and video playback addresses, replacing serverip with the actual server IP  
 Then copy echo.html and mpegts-1.7.3.min.js to /var/www/html
 
-Start the digital human
-```python
-python app.py
-```
 
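-Open http://serverip/echo.html，enter any text in the text box, and submit it. The digital human then speaks that text
+Open http://serverip/echo.html, enter any text in the text box, and submit it. The digital human then speaks that text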
 
-### Run with Docker
-First start srs and nginx  
-Start the digital human
+## 3. Docker Run  
+The installation in step 1 is not required; run the image directly.
 ```
-docker run --gpus all -it --network=host --rm  registry.cn-hangzhou.aliyuncs.com/lipku/nerfstream:v1.2
+docker run --gpus all -it --network=host --rm  registry.cn-hangzhou.aliyuncs.com/lipku/nerfstream:v1.3
 ```
+Run srs and nginx as described in 2.1 and 2.3
 
-## Data flow
+## 4. Data flow
 ![](/assets/dataflow.png)
 
-## Digital human model files
+## 5. Digital human model files
 These can be replaced with a model you trained yourself (https://github.com/Fictionarry/ER-NeRF)
 ```python
 .
@@ -80,7 +76,17 @@ docker run --gpus all -it --network=host --rm  registry.cn-hangzhou.aliyuncs.com
 
 ```
 
-## TODO
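+## 6. Performance analysis
+1. Frame rate  
+On a Tesla T4 GPU the overall frame rate is about 18 fps; with audio/video encoding and stream pushing removed, it is about 20 fps. A 4090 GPU should be able to reach 25 fps; data from anyone with access to such hardware is welcome.  
+Optimization: run audio/video encoding and stream pushing in a separate thread  
+2. Latency  
+Overall latency is a little over 5 s  
+(1) TTS latency is about 2 s. edgetts is currently used, and each sentence must be fully converted before it is fed in all at once; this could be improved by switching TTS to streaming input  
+(2) wav2vec latency is a little over 1 s, because 50 frames of audio must be buffered for the computation; latency can be reduced by setting context_size via -m  
+(3) SRS forwarding latency; configure the SRS server to reduce buffering delay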
+
+## 7. TODO
 - Add chatgpt to enable conversation with the digital human
 - Voice cloning
 - Substitute a video clip when the digital human is silent