Update 'MD/AI大模型ollama结合Open-webui.md'

diandian 2024-12-06 11:24:12 +08:00
parent 796a9ff1b7
commit ca29c236ed


@ -1,389 +1,389 @@
<h1><center>AI Large Models: Ollama Combined with Open-webui</center></h1>
**Author: 行癫 (unauthorized copies will be pursued)**
------
## 1: Introducing Ollama
#### 1. What is Ollama
![image-20241206093753629](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206093753629.png)
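Ollama is an open-source service tool for large language models (LLMs). It simplifies running large language models locally and lowers the barrier to using them, so that developers, researchers, and enthusiasts can quickly experiment with, manage, and deploy the latest models in a local environment.
#### 2. Official website
Official site: https://ollama.com/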
![image-20241206093735653](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206093735653.png)
The large language models Ollama currently supports are listed at https://ollama.com/library
![image-20241206093844006](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206093844006.png)
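Ollama download (Linux amd64): https://ollama.com/download/ollama-linux-amd64.tgz
#### 3. Notes
qwen, qwq, Llama, and the like are all large language models.
Ollama is a convenient tool for managing and operating large language models (not limited to the `Llama` models).
## 2: Installing and Deploying Ollama
#### 1. Installing with the official script
Note: the server must be able to reach external sites such as GitHub.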
![image-20241206094615638](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206094615638.png)
```shell
curl -fsSL https://ollama.com/install.sh | sh
```
#### 2. Binary installation
Reference: https://github.com/ollama/ollama/blob/main/docs/linux.md
Download the binary package
```shell
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
```
Extract and install
```shell
sudo tar -C /usr -xzf ollama-linux-amd64.tgz
```
Create the service user and group
```shell
sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)
```
Create the systemd unit file `/etc/systemd/system/ollama.service`
```ini
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"
[Install]
WantedBy=default.target
```
Enable and start the service
```shell
sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama
```
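Once the service is running, one way to confirm that it is reachable is to query its HTTP endpoint. The sketch below assumes the default listen address `http://localhost:11434` (the same port the container example publishes). The function name `ollama_alive` and the `OLLAMA_URL` and `CURL` variables are illustrative and not part of Ollama.

```shell
#!/usr/bin/env bash
# Base URL of the Ollama API; localhost:11434 is assumed as the default.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Succeeds (exit 0) when the Ollama root endpoint answers within 5 seconds.
# CURL can be overridden, for example CURL=echo for a dry run.
ollama_alive() {
  "${CURL:-curl}" -fsS --max-time 5 "${OLLAMA_URL}/"
}

# Usage (needs a running server):
#   ollama_alive && echo "reachable"
#   systemctl status ollama --no-pager
```

A running server normally answers the root path with a short status line, so a zero exit code is usually enough to tell that the daemon is up.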
Note: to start Ollama manually instead, run the following command directly
```shell
ollama serve
```
![image-20241206095038227](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206095038227.png)
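By default `ollama serve` listens only on the loopback address, so a front end on another host or in a container cannot reach it. One way to change that under systemd is a drop-in file that sets the `OLLAMA_HOST` environment variable. This is a sketch: the helper name `write_ollama_override` and its directory argument are illustrative, and binding to `0.0.0.0` exposes the API to the network, so it assumes a trusted network or a firewall in front of the server.

```shell
#!/usr/bin/env bash
# Write a systemd drop-in that makes Ollama listen on all interfaces.
# $1: drop-in directory (defaults to the standard location for ollama.service).
write_ollama_override() {
  local dir="${1:-/etc/systemd/system/ollama.service.d}"
  mkdir -p "$dir"
  cat > "$dir/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
}

# Usage (as root), then reload and restart the service:
#   write_ollama_override
#   systemctl daemon-reload && systemctl restart ollama
```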
#### 3. Container installation
CPU only
```shell
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```
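With the container running, the `ollama` CLI inside it can be reached through `docker exec`. The wrapper below is a sketch that assumes the container is named `ollama`, as in the command above. The function name and the `DOCKER` and `OLLAMA_CONTAINER` variables are hypothetical conveniences.

```shell
#!/usr/bin/env bash
# Run an ollama subcommand inside the container started above.
# DOCKER swaps the container CLI; OLLAMA_CONTAINER changes the container name.
ollama_in_container() {
  "${DOCKER:-docker}" exec "${OLLAMA_CONTAINER:-ollama}" ollama "$@"
}

# Usage:
#   ollama_in_container pull qwen:0.5b
#   ollama_in_container list
# For an interactive chat, call docker directly with a TTY:
#   docker exec -it ollama ollama run qwen:0.5b
```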
GPU install
Reference: https://github.com/ollama/ollama/blob/main/docs/docker.md
#### 4. Kubernetes installation
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: web
    k8s.kuboard.cn/name: ollama
  name: ollama
  namespace: xingdian-ai
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s.kuboard.cn/layer: web
      k8s.kuboard.cn/name: ollama
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s.kuboard.cn/layer: web
        k8s.kuboard.cn/name: ollama
    spec:
      containers:
        - image: >-
            swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ollama/ollama:latest
          imagePullPolicy: IfNotPresent
          name: ollama
          ports:
            - containerPort: 11434
              protocol: TCP
          volumeMounts:
            - mountPath: /root/.ollama
              name: volume-28ipz
      restartPolicy: Always
      volumes:
        - name: volume-28ipz
          nfs:
            path: /data/ollama
            server: 10.9.12.250
---
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: web
    k8s.kuboard.cn/name: ollama
  name: ollama
  namespace: xingdian-ai
spec:
  clusterIP: 10.109.25.34
  clusterIPs:
    - 10.109.25.34
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: 3scykr
      port: 11434
      protocol: TCP
      targetPort: 11434
  selector:
    k8s.kuboard.cn/layer: web
    k8s.kuboard.cn/name: ollama
  sessionAffinity: None
  type: ClusterIP
```
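Assuming the manifest above is saved as `ollama.yaml` (a hypothetical file name), it could be applied and checked roughly as follows. The function name and the `KUBECTL` override are illustrative. The NFS server and the fixed `clusterIP` in the manifest are specific to the author's cluster and would need adjusting elsewhere.

```shell
#!/usr/bin/env bash
# Apply the Ollama manifest and wait for the Deployment to become ready.
# $1: path to the manifest (hypothetical default: ollama.yaml).
deploy_ollama() {
  local manifest="${1:-ollama.yaml}"
  local kubectl="${KUBECTL:-kubectl}"
  "$kubectl" apply -f "$manifest" &&
    "$kubectl" rollout status deployment/ollama --namespace xingdian-ai --timeout=300s
}

# Usage:
#   kubectl create namespace xingdian-ai   # only if it does not exist yet
#   deploy_ollama ollama.yaml
```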
#### 5. Basic usage
```shell
[root@xingdian-ai ~]# ollama -h
Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.
```
Pull a large language model
```shell
[root@xingdian-ai ~]# ollama pull gemma2:2b
```
Note: gemma is Google's large language model.
List the large language models already downloaded
```shell
[root@xingdian-ai ~]# ollama list
```
![image-20241206101342001](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206101342001.png)
Delete a large language model
```shell
[root@xingdian-ai ~]# ollama rm qwen2:7b
```
Run a large language model
```shell
[root@xingdian-ai ~]# ollama run qwen:0.5b
```
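Besides the interactive CLI, a running Ollama server also exposes an HTTP API, which is what front ends talk to. The sketch below sends one non-streaming prompt to the `/api/generate` endpoint. It assumes the server listens on `http://localhost:11434` and that `qwen:0.5b` has already been pulled. The helper names are illustrative, and the payload builder does no JSON escaping, so it only suits prompts without quotes or backslashes.

```shell
#!/usr/bin/env bash
# Base URL of the Ollama API; localhost:11434 is assumed as the default.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Build the JSON body for /api/generate. No escaping is performed.
# $1: model name, $2: prompt text.
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Send a single prompt and print the raw JSON reply.
ollama_generate() {
  "${CURL:-curl}" -fsS "${OLLAMA_URL}/api/generate" \
    -H 'Content-Type: application/json' \
    -d "$(generate_payload "$1" "$2")"
}

# Usage (needs a running server with the model pulled):
#   ollama_generate qwen:0.5b "Why is the sky blue?"
```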
![image-20241206101618643](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206101618643.png)
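## 3: Introducing Open-webui
#### 1. What is Open-webui
Open-WebUI is commonly used to give users a graphical interface through which they can conveniently interact with machine learning models.
#### 2. Official address
Official repository: https://github.com/open-webui/open-webui
## 4: Installation, Deployment, and Usage
#### 1. Container installation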
```shell
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=https://example.com -e HF_ENDPOINT=https://hf-mirror.com -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```
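Mirror image for mainland China: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/open-webui/open-webui:main
OLLAMA_BASE_URL: the address of the Ollama service.
Because huggingface cannot be reached directly from networks in mainland China, the endpoint has to be switched to a reachable domain, hf-mirror.com. Without it, startup fails with the error below.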
```shell
Error:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like sentence-transformers/all-MiniLM-L6-v2 is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
No WEBUI_SECRET_KEY provided
```
Fix: use the mirror site by passing -e HF_ENDPOINT=https://hf-mirror.com
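When Ollama runs directly on the Docker host rather than at a separate URL, one common pattern is to map `host.docker.internal` to the host gateway and point `OLLAMA_BASE_URL` at it. The sketch below combines that with the mirror setting from this section. The function name and the `DOCKER` and `WEBUI_IMAGE` variables are illustrative, and it assumes Ollama on the host accepts connections from the Docker bridge rather than only from loopback.

```shell
#!/usr/bin/env bash
# Image to run; the mirror address from this section can be substituted.
WEBUI_IMAGE="${WEBUI_IMAGE:-ghcr.io/open-webui/open-webui:main}"

# Start Open WebUI on host port 3000, talking to Ollama on the Docker host.
run_open_webui() {
  "${DOCKER:-docker}" run -d -p 3000:8080 \
    --add-host=host.docker.internal:host-gateway \
    -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
    -e HF_ENDPOINT=https://hf-mirror.com \
    -v open-webui:/app/backend/data \
    --name open-webui --restart always \
    "$WEBUI_IMAGE"
}

# Usage:
#   run_open_webui          # then browse to http://localhost:3000
```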
#### 2. Kubernetes installation
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: ai
  name: ai
  namespace: xingdian-ai
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s.kuboard.cn/layer: svc
      k8s.kuboard.cn/name: ai
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s.kuboard.cn/layer: svc
        k8s.kuboard.cn/name: ai
    spec:
      containers:
        - env:
            - name: OLLAMA_BASE_URL
              value: 'http://10.9.12.10:11434'
            - name: HF_ENDPOINT
              value: 'https://hf-mirror.com'
          image: '10.9.12.201/ollama/open-webui:main'
          imagePullPolicy: Always
          name: ai
          ports:
            - containerPort: 8080
              protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: ai
  name: ai
  namespace: xingdian-ai
spec:
  clusterIP: 10.99.5.168
  clusterIPs:
    - 10.99.5.168
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: cnr2rn
      port: 8080
      protocol: TCP
      targetPort: 8080
  selector:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: ai
  sessionAffinity: None
  type: ClusterIP
```
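Before wiring up an ingress, the Service can be smoke-tested from a workstation with a port-forward. This is a sketch that assumes `kubectl` access to the cluster. The function name, the local port default, and the `KUBECTL` override are illustrative.

```shell
#!/usr/bin/env bash
# Forward a local port to the Open WebUI Service (svc/ai, port 8080).
# $1: local port (default 8080). Blocks until interrupted with Ctrl-C.
forward_webui() {
  local port="${1:-8080}"
  "${KUBECTL:-kubectl}" port-forward svc/ai "${port}:8080" --namespace xingdian-ai
}

# Usage:
#   forward_webui 3000      # then open http://localhost:3000 in a browser
```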
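#### 3. Routing traffic with Kong Ingress
Create an upstream whose target is the Service address, ai.xingdian-ai.svc:8080 (Service `ai` in namespace `xingdian-ai`)
Create a service
Create a route
Configure DNS resolution for the routed hostname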
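The upstream, service, and route steps can also be scripted against the Kong Admin API. The sketch below is one possible way and rests on several assumptions: the Admin API is reachable at `http://localhost:8001`, the object name `open-webui` and the hostname passed in are hypothetical, and the target uses the `<service>.<namespace>.svc` form for the `ai` Service in the `xingdian-ai` namespace. If Kong is managed through a dashboard or the ingress controller's own resources, the equivalent objects would be created there instead.

```shell
#!/usr/bin/env bash
KONG_ADMIN="${KONG_ADMIN:-http://localhost:8001}"   # assumed Admin API address

# POST form fields to a Kong Admin API path: kong_post <path> <field>...
kong_post() {
  local path="$1"; shift
  local args=()
  local field
  for field in "$@"; do args+=(--data "$field"); done
  "${CURL:-curl}" -fsS -X POST "${KONG_ADMIN}${path}" "${args[@]}"
}

# Create upstream, target, service, and route for Open WebUI.
# $1: public hostname to route (hypothetical, e.g. ai.example.com).
setup_kong() {
  local host="$1"
  kong_post /upstreams name=open-webui &&
    kong_post /upstreams/open-webui/targets target=ai.xingdian-ai.svc:8080 &&
    kong_post /services name=open-webui host=open-webui &&
    kong_post /services/open-webui/routes name=open-webui "hosts[]=${host}"
}

# Usage:
#   setup_kong ai.example.com
```

The DNS record for the chosen hostname then has to point at the Kong proxy address so that browser requests reach the route.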
#### 4. Browser access
![image-20241206111757092](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206111757092.png)
![image-20241206111826330](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206111826330.png)
Pull the model in Ollama beforehand. We use qwen:0.5b as the example because the other models consume more resources.
![image-20241206111932885](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206111932885.png)
![image-20241206112136649](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206112136649.png)