Ollama Combined with Open-WebUI for Large AI Models

**Author: 行癫 (unauthorized reproduction prohibited)**

------

## 1: Getting to Know Ollama

#### 1. What is Ollama

![image-20241206093753629](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206093753629.png)

Ollama is an open-source LLM (large language model) serving tool that simplifies running large language models locally. It lowers the barrier to entry, letting developers, researchers, and enthusiasts quickly experiment with, manage, and deploy the latest models in a local environment.

#### 2. Official site

Official site: https://ollama.com/

![image-20241206093735653](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206093735653.png)

Models currently supported by Ollama: https://ollama.com/library

![image-20241206093844006](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206093844006.png)

Download: https://ollama.com/download/ollama-linux-amd64.tgz

#### 3. Notes

qwen, qwq, Llama, and so on are large language models themselves.

Ollama is a convenient tool for managing and operating large language models (it is not limited to the `Llama` family).

## 2: Installing and Deploying Ollama

#### 1. Official install script

Note: the server must be able to reach GitHub and other external sites.

![image-20241206094615638](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206094615638.png)

```shell
curl -fsSL https://ollama.com/install.sh | sh
```

#### 2. Binary install

Reference: https://github.com/ollama/ollama/blob/main/docs/linux.md

Download the binary package:

```shell
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
```

Extract and install:

```shell
sudo tar -C /usr -xzf ollama-linux-amd64.tgz
```

Create a service user and group:

```shell
sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
sudo usermod -a -G ollama $(whoami)
```

Create the systemd unit file `/etc/systemd/system/ollama.service`:

```shell
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target
```

Enable and start the service:

```shell
sudo systemctl daemon-reload
sudo systemctl enable ollama
sudo systemctl start ollama
```

Note: to run the server manually in the foreground instead, just execute:

```shell
ollama serve
```

![image-20241206095038227](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206095038227.png)

#### 3. Container install

CPU only:

```shell
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

GPU install:
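For NVIDIA GPU hosts, the upstream Docker docs use the same command plus a `--gpus=all` flag. The sketch below assumes the NVIDIA drivers and the NVIDIA Container Toolkit are already installed; the `command -v` guard and fallback messages are illustrative additions, not part of the official instructions:

```shell
# GPU-enabled container run: the CPU command above plus --gpus=all.
# Assumes NVIDIA drivers + NVIDIA Container Toolkit on the host.
OLLAMA_IMAGE="ollama/ollama"   # swap in a mirror registry here if Docker Hub is unreachable
if command -v docker >/dev/null 2>&1; then
  docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 \
    --name ollama "$OLLAMA_IMAGE" \
    || echo "docker run failed (is the daemon running with GPU support?)"
else
  echo "docker not found; install Docker and the NVIDIA Container Toolkit first"
fi
```

AMD GPUs use a separate ROCm image (`ollama/ollama:rocm`) with different device flags; see the reference below for details.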
Reference: https://github.com/ollama/ollama/blob/main/docs/docker.md

#### 4. Kubernetes install

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: web
    k8s.kuboard.cn/name: ollama
  name: ollama
  namespace: xingdian-ai
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s.kuboard.cn/layer: web
      k8s.kuboard.cn/name: ollama
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s.kuboard.cn/layer: web
        k8s.kuboard.cn/name: ollama
    spec:
      containers:
        - image: >-
            swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/ollama/ollama:latest
          imagePullPolicy: IfNotPresent
          name: ollama
          ports:
            - containerPort: 11434
              protocol: TCP
          volumeMounts:
            - mountPath: /root/.ollama
              name: volume-28ipz
      restartPolicy: Always
      volumes:
        - name: volume-28ipz
          nfs:
            path: /data/ollama
            server: 10.9.12.250
---
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: web
    k8s.kuboard.cn/name: ollama
  name: ollama
  namespace: xingdian-ai
spec:
  clusterIP: 10.109.25.34
  clusterIPs:
    - 10.109.25.34
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: 3scykr
      port: 11434
      protocol: TCP
      targetPort: 11434
  selector:
    k8s.kuboard.cn/layer: web
    k8s.kuboard.cn/name: ollama
  sessionAffinity: None
  type: ClusterIP
```

#### 5. Basic usage

```shell
[root@xingdian-ai ~]# ollama -h
Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.
```

Pull a large language model:

```shell
[root@xingdian-ai ~]# ollama pull gemma2:2b
```

Note: gemma is Google's large language model family.

List the models already installed:

```shell
[root@xingdian-ai ~]# ollama list
```

![image-20241206101342001](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206101342001.png)

Remove a model:

```shell
[root@xingdian-ai ~]# ollama rm qwen2:7b
```

Run a model:

```shell
[root@xingdian-ai ~]# ollama run qwen:0.5b
```

![image-20241206101618643](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206101618643.png)

## 3: Getting to Know Open-WebUI

#### 1. What is Open-WebUI

Open-WebUI provides users with a graphical interface through which they can conveniently interact with machine learning models.

#### 2. Official site

Official site: https://github.com/open-webui/open-webui

## 4: Installation, Deployment, and Usage

#### 1. Container install

```shell
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=https://example.com -e HF_ENDPOINT=https://hf-mirror.com -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```

Mirror image for users in mainland China: swr.cn-north-4.myhuaweicloud.com/ddn-k8s/ghcr.io/open-webui/open-webui:main

OLLAMA_BASE_URL: the address of the Ollama server (the `https://example.com` above is a placeholder; point it at your actual Ollama endpoint)

Because huggingface.co cannot be reached directly from mainland China, the endpoint must be switched to the accessible mirror hf-mirror.com; otherwise startup fails with:

```shell
Error:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like sentence-transformers/all-MiniLM-L6-v2 is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
No WEBUI_SECRET_KEY provided
```

Fix: use the mirror site by passing `-e HF_ENDPOINT=https://hf-mirror.com` (already included in the `docker run` command above).

#### 2. Kubernetes install

```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: ai
  name: ai
  namespace: xingdian-ai
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s.kuboard.cn/layer: svc
      k8s.kuboard.cn/name: ai
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s.kuboard.cn/layer: svc
        k8s.kuboard.cn/name: ai
    spec:
      containers:
        - env:
            - name: OLLAMA_BASE_URL
              value: 'http://10.9.12.10:11434'
            - name: HF_ENDPOINT
              value: 'https://hf-mirror.com'
          image: '10.9.12.201/ollama/open-webui:main'
          imagePullPolicy: Always
          name: ai
          ports:
            - containerPort: 8080
              protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  annotations: {}
  labels:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: ai
  name: ai
  namespace: xingdian-ai
spec:
  clusterIP: 10.99.5.168
  clusterIPs:
    - 10.99.5.168
  internalTrafficPolicy: Cluster
  ipFamilies:
    - IPv4
  ipFamilyPolicy: SingleStack
  ports:
    - name: cnr2rn
      port: 8080
      protocol: TCP
      targetPort: 8080
  selector:
    k8s.kuboard.cn/layer: svc
    k8s.kuboard.cn/name: ai
  sessionAffinity: None
  type: ClusterIP
```

#### 3. Routing traffic with Kong Ingress

1. Create the upstream (Service address: xingdian-ai.ai.svc:8080)
2. Create the service
3. Create the route
4. Configure DNS resolution

#### 4. Access from a browser

![image-20241206111757092](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206111757092.png)

![image-20241206111826330](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206111826330.png)

Pull the model on Ollama beforehand; here we take qwen:0.5b as the example, since larger models consume considerably more resources.

![image-20241206111932885](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206111932885.png)
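Behind the scenes, the web chat talks to Ollama's REST API. You can exercise the same qwen:0.5b model from a shell; the `OLLAMA_URL` default below is an assumption based on the default port from section 2, so point it at your own host, Docker port mapping, or Kubernetes Service as appropriate:

```shell
# One-shot, non-streaming generation against Ollama's REST API.
# OLLAMA_URL is a placeholder default; override it for remote servers.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
PAYLOAD='{"model": "qwen:0.5b", "prompt": "Introduce yourself in one sentence.", "stream": false}'
curl -s "$OLLAMA_URL/api/generate" -d "$PAYLOAD" \
  || echo "Ollama not reachable at $OLLAMA_URL"
```

With `"stream": false` the server returns a single JSON object whose `response` field holds the full answer, which is easier to script against than the default streamed chunks.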
![image-20241206112136649](https://xingdian-image.oss-cn-beijing.aliyuncs.com/xingdian-image/image-20241206112136649.png)
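As a final smoke test, both services answer simple HTTP probes. The ports below are the defaults used throughout this guide (11434 for Ollama, host port 3000 mapped to Open WebUI's 8080), and the `/health` path is an assumption to verify against your Open WebUI version:

```shell
# Probe both services deployed above; "DOWN" just means not reachable
# at that address, so adjust host/port for your own deployment.
for url in "http://localhost:11434/api/version" "http://localhost:3000/health"; do
  if curl -fsS "$url" >/dev/null 2>&1; then
    echo "UP    $url"
  else
    echo "DOWN  $url"
  fi
done
```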