Setting Up an LLM Runtime Environment with text-generation-webui, and the Pitfalls Encountered

admin • 2024-03-31 19:15 • Frontend

text-generation-webui

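text-generation-webui is an open-source, Gradio-based web UI for LLMs. It can be used to quickly set up and deploy runtime environments for a wide range of large models.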

Environment Setup

Download the project

git clone https://github.com/oobabooga/text-generation-webui.git

Create a conda environment and activate it

conda create -n ui python=3.10

conda activate ui

Install the project dependencies

Manual method

cd text-generation-webui

pip install -r requirements.txt

While installing the dependencies for text-generation-webui, the following error appeared:

 WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7fed6cb7abf0>, 'Connection to github.com timed out. (connect timeout=15)')': /oobabooga/llama-cpp-python-cuBLAS-wheels/releases/download/cpu/llama_cpp_python-0.2.24+cpuavx2-cp310-cp310-manylinux_2_31_x86_64.whl

Solution:

pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

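Note:

The biggest problem here is that requirements.txt references many files hosted in GitHub projects (such as the prebuilt wheel in the error above), so the installation needs access to GitHub, which can be very slow or time out. A PyPI mirror only speeds up packages that come from the index; it does not change where these direct downloads come from. On a cloud server in particular, do not route traffic through a separate proxy server; install the proxy service directly on that server instead.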
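To see in advance which dependencies will be fetched from GitHub rather than from the PyPI index, the requirements file can be scanned for direct URLs. This is a minimal sketch, assuming each such requirement is a line containing an https://github.com/... URL, like the wheel path in the error above; the function name find_github_urls is made up for illustration.

```python
import re
from pathlib import Path

# Matches direct download URLs that point at github.com.
GITHUB_URL = re.compile(r"https://github\.com/\S+")


def find_github_urls(requirements_text: str) -> list[str]:
    """Return every github.com URL found in non-comment requirement lines."""
    urls = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = GITHUB_URL.search(line)
        if match:
            # Strip a ";" that separates the URL from an environment marker.
            urls.append(match.group(0).rstrip(";"))
    return urls


if __name__ == "__main__":
    req = Path("requirements.txt")
    if req.exists():
        for url in find_github_urls(req.read_text(encoding="utf-8")):
            print(url)
```

Run from the project directory, this prints the URLs that need a working route to GitHub. If the list is long, fixing GitHub connectivity first is likely to save more time than switching PyPI mirrors.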

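Script method

Run the start_linux.sh script in the project directory directly. It installs the required dependencies automatically and then starts the project. This is quick and convenient, and is the recommended approach.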

(ui) root@master:~/work/text-generation-webui# ./start_linux.sh 
Downloading Miniconda from https://repo.anaconda.com/miniconda/Miniconda3-py310_23.3.1-0-Linux-x86_64.sh to /root/work/text-generation-webui/installer_files/miniconda_installer.sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 69.7M  100 69.7M    0     0  9639k      0  0:00:07  0:00:07 --:--:-- 13.2M
PREFIX=/root/work/text-generation-webui/installer_files/conda
Unpacking payload ...
                                                                                                                                                                                                     
Installing base environment...


Downloading and Extracting Packages


Downloading and Extracting Packages

Preparing transaction: done
Executing transaction: done
installation finished.
Miniconda version:
conda 23.3.1
Collecting package metadata (current_repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 23.3.1
  latest version: 24.1.2

Please update conda by running

    $ conda update -n base -c defaults conda

Or to minimize the number of packages updated during conda update use

     conda install conda=24.1.2



## Package Plan ##

  environment location: /root/work/text-generation-webui/installer_files/env

  added / updated specs:
    - python=3.11


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    _libgcc_mutex-0.1          |             main           3 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    _openmp_mutex-5.1          |            1_gnu          21 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
 

    xz-5.4.5                   |       h5eee18b_0         646 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    zlib-1.2.13                |       h5eee18b_0         103 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    ------------------------------------------------------------
                                           Total:        60.5 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      anaconda/pkgs/main/linux-64::_libgcc_mutex-0.1-main 
  _openmp_mutex      anaconda/pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu 


  xz                 anaconda/pkgs/main/linux-64::xz-5.4.5-h5eee18b_0 
  zlib               anaconda/pkgs/main/linux-64::zlib-1.2.13-h5eee18b_0 



Downloading and Extracting Packages
                                                                                                                                                                                                     
Preparing transaction: done                                                                                                                                                                          
Verifying transaction: done                                                                                                                                                                          
Executing transaction: done                                                                                                                                                                          
#                                                                                                                                                                                                    
# To activate this environment, use                                                                                                                                                                  
#                                                                                                                                                                                                    
#     $ conda activate /root/work/text-generation-webui/installer_files/env                                                                                                                          
#                                                                                                                                                                                                    
# To deactivate an active environment, use                                                                                                                                                           
#                                                                                                                                                                                                    
#     $ conda deactivate                                                                                                                                                                             
                                                                                                                                                                                                     
                                                                                                                                                                                                     
What is your GPU?                                                                                                                                                                                    
                                                                                                                                                                                                     
A) NVIDIA                                                                                                                                                                                            
B) AMD (Linux/MacOS only. Requires ROCm SDK 5.6 on Linux)                                                                                                                                            
C) Apple M Series                                                                                                                                                                                    
D) Intel Arc (IPEX)                                                                                                                                                                                  
N) None (I want to run models in CPU mode)                                                                                                                                                           
                                                                                                                                                                                                     
Input> A

Do you want to use CUDA 11.8 instead of 12.1? Only choose this option if your GPU is very old (Kepler or older).
For RTX and GTX series GPUs, say "N". If unsure, say "N".

Input (Y/N)> N
CUDA: 12.1


*******************************************************************
* Installing PyTorch.
*******************************************************************


Collecting package metadata (current_repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 23.3.1
  latest version: 24.1.2

Please update conda by running

    $ conda update -n base -c defaults conda

Or to minimize the number of packages updated during conda update use

     conda install conda=24.1.2



## Package Plan ##

  environment location: /root/work/text-generation-webui/installer_files/env

  added / updated specs:
    - git
    - ninja


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    c-ares-1.19.1              |       h5eee18b_0         118 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    curl-7.26.0                |                1         451 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free


    pcre2-10.42                |       hebb0a14_0         1.5 MB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    perl-5.34.0                |       h5eee18b_2        12.4 MB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    ------------------------------------------------------------
                                           Total:        58.4 MB

The following NEW packages will be INSTALLED:

  c-ares             anaconda/pkgs/main/linux-64::c-ares-1.19.1-h5eee18b_0 
  curl               anaconda/pkgs/free/linux-64::curl-7.26.0-1 


  pcre2              anaconda/pkgs/main/linux-64::pcre2-10.42-hebb0a14_0 
  perl               anaconda/pkgs/main/linux-64::perl-5.34.0-h5eee18b_2 



Downloading and Extracting Packages
                                                                                                                                                                                                     
Preparing transaction: done                                                                                                                                                                          
Verifying transaction: done                                                                                                                                                                          
Executing transaction: done                                                                                                                                                                          
Looking in indexes: https://download.pytorch.org/whl/cu121                                                                                                                                           
Collecting torch==2.1.*                                                                                                                                                                              
  Downloading https://download.pytorch.org/whl/cu121/torch-2.1.2%2Bcu121-cp311-cp311-linux_x86_64.whl (2200.7 MB)                                                                                    
     ━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.5/2.2 GB 17.5 MB/s eta 0:01:40    



*******************************************************************
* WARNING: You haven't downloaded any model yet.
* Once the web UI launches, head over to the "Model" tab and download one.
*******************************************************************


02:29:12-860927 INFO     Starting Text generation web UI                                                                                                                                             
02:29:12-865030 INFO     Loading the extension "gallery"                                                                                                                                             
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

Prepare the model

The Llama2-7B model is used as the example here. Place it in the text-generation-webui/models directory.

mv /root/models/llama-2-7b-hf text-generation-webui/models
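Before starting the UI, it can help to confirm that the model landed where the project looks for it. This is a minimal sketch, assuming Hugging Face format models (a folder containing config.json) placed directly under models/; list_model_dirs is a hypothetical helper, not part of text-generation-webui.

```python
from pathlib import Path


def list_model_dirs(models_root: str) -> list[str]:
    """Return names of subfolders of models_root that contain a config.json."""
    root = Path(models_root)
    if not root.is_dir():
        return []
    return sorted(
        child.name
        for child in root.iterdir()
        if child.is_dir() and (child / "config.json").is_file()
    )


if __name__ == "__main__":
    # Path is relative to the directory that contains the cloned project.
    print(list_model_dirs("text-generation-webui/models"))
```

The folder names printed are the ones passed to the --model option later in this post.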

Start the project

Run the following command in the project directory to start the project

(ui) root@instance:~/text-generation-webui-main# python server.py 
15:49:18-962453 INFO     Starting Text generation web UI                                                                                                                                                                                
15:49:18-966915 INFO     Loading the extension "gallery"                                                                                                                                                                                
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

The project can also be started with the script

(ui) root@master:~/work/text-generation-webui# ./start_linux.sh

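Once it is running, open http://127.0.0.1:7860

Note: at this point the project listens only on http://127.0.0.1:7860, so it cannot be reached from any other IP address. Use the --listen flag to change the listen address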

(ui) root@master:~/work/text-generation-webui# ./start_linux.sh --listen


*******************************************************************
* WARNING: You haven't downloaded any model yet.
* Once the web UI launches, head over to the "Model" tab and download one.
*******************************************************************


02:37:44-188412 INFO     Starting Text generation web UI                                                                                                                                             
02:37:44-192088 WARNING                                                                                                                                                                              
                         You are potentially exposing the web UI to the entire internet without any access password.                                                                                 
                         You can create one with the "--gradio-auth" flag like this:                                                                                                                 
                                                                                                                                                                                                     
                         --gradio-auth username:password                                                                                                                                             
                                                                                                                                                                                                     
                         Make sure to replace username:password with your own.                                                                                                                       
02:37:44-194428 INFO     Loading the extension "gallery"                                                                                                                                             
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
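Whether the web UI is actually reachable from another machine can be checked with a plain TCP connection test. This is a minimal sketch using only the standard library; the function name is_port_open and the example address are illustrative, and port 7860 is the one shown in the log above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with the server's address when testing from another machine.
    print(is_port_open("127.0.0.1", 7860))
```

Without --listen the check should succeed only when run on the server itself. With --listen it should also succeed from other hosts, provided no firewall or security group blocks the port. As the warning in the log says, an instance exposed this way should be protected with --gradio-auth.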

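Load the model

After selecting a model, click Load to load it

A successful load is indicated as follows

[Screenshot: model loaded successfully]

Note: various exceptions may occur while loading a model. See the Bug notes below for details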

Bug notes

Bug1

When the following exception appears, the key part is: then set the option trust_remote_code=True to remove this error.

Traceback (most recent call last):
  File "/root/work/text-generation-webui/modules/ui_model_menu.py", line 242, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/work/text-generation-webui/modules/models.py", line 87, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/work/text-generation-webui/modules/models.py", line 140, in huggingface_loader
    config = AutoConfig.from_pretrained(path_to_model, trust_remote_code=params['trust_remote_code'])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/work/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/models/auto/configuration_auto.py", line 1103, in from_pretrained
    trust_remote_code = resolve_trust_remote_code(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/work/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 621, in resolve_trust_remote_code
    raise ValueError(
ValueError: Loading models/chatglm3-6b requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error.

Solution: add the --trust-remote-code flag at startup

./start_linux.sh --listen --trust-remote-code
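Whether a local model is likely to need this flag can be checked before loading. In transformers, the error above is raised for models whose config.json maps the Auto classes to custom code through an auto_map entry. This is a minimal sketch under that assumption; needs_trust_remote_code is a hypothetical helper name and the path is only an example.

```python
import json
from pathlib import Path


def needs_trust_remote_code(model_dir: str) -> bool:
    """Return True if config.json declares custom code via an "auto_map" entry."""
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return False  # no config to inspect, so nothing can be concluded
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return bool(config.get("auto_map"))


if __name__ == "__main__":
    print(needs_trust_remote_code("models/chatglm3-6b"))
```

A True result means the model ships its own Python code, which will run on the machine once the flag is set. As the error message itself advises, read that code before enabling the flag.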

Bug2

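With the following exception, model loading always failed, regardless of which model was used. I spent two days stuck on this. The environment and configuration were not at fault; the root cause was a bug in the project itself, which can be confirmed in the project's Issues

The console error log was as follows: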

(ui) root@instance:~/text-generation-webui# python server.py 
13:38:23-216368 INFO     Starting Text generation web UI                                                                                                                                                                                
13:38:23-224693 INFO     Loading the extension "gallery"                                                                                                                                                                                
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
13:38:57-356736 INFO     Loading Llama-2-7b-hf                                                                                                                                                                                          
Loading checkpoint shards:   0%|                                                                                                                                                                                  | 0/2 [00:00<?, ?it/s]
13:38:57-739003 ERROR    Failed to load the model.                                                                                                                                                                                      
Traceback (most recent call last):
  File "/root/text-generation-webui/modules/ui_model_menu.py", line 214, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "/root/text-generation-webui/modules/models.py", line 90, in load_model
    output = load_func_map[loader](model_name)
  File "/root/text-generation-webui/modules/models.py", line 161, in huggingface_loader
    model = LoaderClass.from_pretrained(path_to_model, **params)
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3706, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4091, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "/root/miniconda3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 503, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
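The MetadataIncompleteBuffer message comes from the safetensors parser and, as far as the file format goes, indicates that the file size does not match what the file's own header declares. Independently of the project bug discussed here, a truncated or partially copied shard produces the same message, so it may be worth ruling that out first. The sketch below relies on the published safetensors layout (an 8-byte little-endian header length, a JSON header, then the tensor data); check_safetensors is a hypothetical helper, not part of any library, and the model path is an example.

```python
import json
import struct
from pathlib import Path


def check_safetensors(path: str) -> bool:
    """Return True if the file size matches what the safetensors header declares."""
    file_size = Path(path).stat().st_size
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False  # too short to even hold the header length
        (header_len,) = struct.unpack("<Q", prefix)
        if 8 + header_len > file_size:
            return False  # the JSON header itself is cut off
        try:
            header = json.loads(f.read(header_len))
        except ValueError:
            return False  # header bytes are not valid JSON
    # Each tensor entry records [start, end) byte offsets into the data section.
    data_end = max(
        (entry["data_offsets"][1] for name, entry in header.items() if name != "__metadata__"),
        default=0,
    )
    return 8 + header_len + data_end == file_size


if __name__ == "__main__":
    for shard in sorted(Path("models/Llama-2-7b-hf").glob("*.safetensors")):
        print(shard.name, "OK" if check_safetensors(str(shard)) else "size mismatch")
```

If every shard reports OK, the files on disk are at least structurally complete, and the cause is more likely to lie in the software versions, as in the solutions below.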

Solution 1: downgrade. Download and use v1.5 of text-generation-webui, then put the model back into the text-generation-webui/models directory

git clone --branch v1.5 https://github.com/oobabooga/text-generation-webui.git

1. Start the project, specifying the chatglm3-6b model

(ui) root@instance:~/text-generation-webui# python server.py --model chatglm3-6b --trust-remote-code
2023-12-26 20:54:04 WARNING:trust_remote_code is enabled. This is dangerous.
2023-12-26 20:54:06 INFO:Loading chatglm3-6b...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:28<00:00,  4.06s/it]
2023-12-26 20:54:38 INFO:Loaded the model in 32.00 seconds.

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

2. Start the project, specifying the Llama-2-7b-hf model

(ui) root@instance:~/text-generation-webui# python server.py --model Llama-2-7b-hf
2023-12-26 21:17:52 INFO:Loading Llama-2-7b-hf...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  4.48it/s]
2023-12-26 21:18:03 INFO:Loaded the model in 11.05 seconds.

Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

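Solution 2: some time later, I tried again with the latest version of the project, installed its dependencies, and loaded the model successfully.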
