llm - Page 7 - Tags - 军舰的日志

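Wednesday, January 10, 2024

Building a Free GitHub Copilot in VSCode with the Tabby Plugin

Models used

Code generation: Tabby uses the Deepseek Coder 6.7B model.

Server-side deployment

Building a Free GitHub Copilot in PyCharm with the Tabby and CodeGPT Plugins

Install Visual Studio Code

Install Tabby

Configure Tabby

Click the Tabby icon in the status bar to open the Tabby configuration page.

Parameters

Endpoint: http://172.16.33.66:8080

Using Tabby

Code generation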

2024-01-10 10:00

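Building a Free GitHub Copilot in IntelliJ IDEA with the Tabby and CodeGPT Plugins

Models used

Code generation: Tabby uses the Deepseek Coder 6.7B model.
AI chat: CodeGPT uses the ChatGLM3-6B model. I plan to replace it with Deepseek Coder 6.7B later.

Server-side deployment

Building a Free GitHub Copilot in PyCharm with the Tabby and CodeGPT Plugins

Install IntelliJ IDEA

Install the plugins

Plugins

Code generation: Tabby
AI chat: CodeGPT

Installation

Open IntelliJ IDEA, go to Settings, select Plugins, search for Tabby and CodeGPT, and click Install.

Tabby

CodeGPT

Configure the plugins

Tabby

Parameters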

Endpoint: http://172.16.33.66:8080
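
To check that the endpoint works outside the IDE, a completion request can be sent by hand. The following is a minimal sketch, not the plugin's own code: the `/v1/completions` path and the `language` / `segments` payload shape are assumptions taken from Tabby's public API documentation and should be verified against your server version. The helper names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the plugin settings above; adjust for your own server.
TABBY_ENDPOINT = "http://172.16.33.66:8080"


def build_completion_request(prefix, suffix="", language="python",
                             endpoint=TABBY_ENDPOINT):
    """Build (but do not send) a POST request for Tabby's /v1/completions.

    The path and payload shape are assumptions based on Tabby's public API
    documentation; check them against your server version.
    """
    payload = {
        "language": language,
        "segments": {"prefix": prefix, "suffix": suffix},
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prefix, suffix="", timeout=10):
    """Send the request and return the first suggested completion text."""
    request = build_completion_request(prefix, suffix)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("def fib(n):\n    ")` on a machine that can reach the server should return the model's suggestion for the function body, assuming the response carries a `choices` list as sketched here.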

CodeGPT

Parameters

Service: OpenAI Service
API key: NULL
Model: GPT-3.5(4k)
Base host: http://172.16.33.66:8000
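
These settings point CodeGPT at an OpenAI-compatible server instead of OpenAI itself. As a rough sketch of what such a client sends, the request below targets `/v1/chat/completions` on the Base host. Two things are assumptions here: that the server exposes that path, and that the model is addressed as `gpt-3.5-turbo` (the alias the model worker in the PyCharm post is registered under). The helper names are hypothetical.

```python
import json
import urllib.request

BASE_HOST = "http://172.16.33.66:8000"  # "Base host" from the settings above


def build_chat_request(user_message, model="gpt-3.5-turbo",
                       base_host=BASE_HOST, api_key="NULL"):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_host.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # A local server usually ignores the key, hence the NULL value.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def chat(user_message, timeout=30):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

If `chat("hello")` returns text from a script, the same host and model name should also work from the plugin.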

Using the plugins

AI chat

Code generation

2024-01-10 08:00

github-copilot intellij-idea tabby codegpt openai code-llm llm deepseek-coder chatglm3 ai-coding-assistant

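Tuesday, January 9, 2024

Building a Free GitHub Copilot in PyCharm with the Tabby and CodeGPT Plugins

Models used

Code generation: Tabby uses the Deepseek Coder 6.7B model.
AI chat: CodeGPT uses the ChatGLM3-6B model. I plan to replace it with Deepseek Coder 6.7B later.

Server-side deployment

Tabby service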

docker run -d --runtime nvidia --name tabby -p 8080:8080 \
  -e TABBY_DOWNLOAD_HOST=modelscope.cn \
  -e NVIDIA_VISIBLE_DEVICES=3 \
  -e RUST_BACKTRACE=1 \
  -v `pwd`/.tabby:/data tabbyml/tabby \
  serve --model TabbyML/DeepseekCoder-6.7B  --device cuda

OpenAI service

Start the Controller service

python -m fastchat.serve.controller

Start the Model Worker service

python -m fastchat.serve.model_worker \
  --model-path THUDM/chatglm3-6b --port 21002 \
  --worker-address http://localhost:21002 \
  --model-names chatglm3-6b,gpt-3.5-turbo
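
The commands above can be collected into one start-order list, which also makes the `gpt-3.5-turbo` alias explicit: OpenAI clients ask for that name and the worker answers with ChatGLM3-6B. This is only a sketch under assumptions. The controller address is FastChat's default, which the commands above do not set, and the third process (`fastchat.serve.openai_api_server` on port 8000) does not appear in this excerpt; it is assumed because the IDE plugins point at port 8000. The function name `fastchat_commands` is hypothetical.

```python
import shlex

# Default FastChat controller address; an assumption, since the commands above
# do not set it explicitly.
CONTROLLER_ADDRESS = "http://localhost:21001"


def fastchat_commands(model_path="THUDM/chatglm3-6b",
                      model_names=("chatglm3-6b", "gpt-3.5-turbo"),
                      worker_port=21002, api_port=8000):
    """Return the argv lists for the three FastChat processes, in start order."""
    controller = ["python", "-m", "fastchat.serve.controller"]
    worker = [
        "python", "-m", "fastchat.serve.model_worker",
        "--model-path", model_path,
        "--port", str(worker_port),
        "--worker-address", f"http://localhost:{worker_port}",
        "--controller-address", CONTROLLER_ADDRESS,
        "--model-names", ",".join(model_names),
    ]
    # Assumed third step (not shown above): the OpenAI-compatible REST server.
    api_server = [
        "python", "-m", "fastchat.serve.openai_api_server",
        "--host", "0.0.0.0", "--port", str(api_port),
    ]
    return [controller, worker, api_server]


# Print the commands as copy-pasteable shell lines.
for argv in fastchat_commands():
    print(shlex.join(argv))
```

Each list can be passed to `subprocess.Popen` as-is; printing them first makes it easy to compare against the commands typed by hand.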

2024-01-09 08:00

github-copilot pycharm tabby codegpt fastchat openai code-llm llm deepseek-coder chatglm3

Friday, January 5, 2024

PrivateGPT

Install Python 3.11

brew install python@3.11

Installation

git clone https://github.com/imartinez/privateGPT && cd privateGPT && \
python3.11 -m venv .venv && source .venv/bin/activate && \
pip install --upgrade pip poetry && poetry install --with ui,local && ./scripts/setup

# Launch the privateGPT API server **and** the gradio UI
poetry run python3.11 -m private_gpt

# In another terminal, create a new browser window on your private GPT!
open http://127.0.0.1:8001/
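
The server takes a while to come up, so opening the browser straight away can land on a connection error. A small helper can wait for the port first. This is a sketch with hypothetical helper names, assuming the UI listens on 127.0.0.1:8001 as in the command above.

```python
import socket
import time
import webbrowser


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait and retry until the deadline.
            time.sleep(interval)
    return False


def open_ui_when_ready(host="127.0.0.1", port=8001):
    """Open the privateGPT UI in the default browser once the port is up."""
    if wait_for_port(host, port):
        webbrowser.open(f"http://{host}:{port}/")
        return True
    return False
```

Run `open_ui_when_ready()` in the second terminal in place of the `open` command.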

Quickstart

Installation failed 😭

References

2024-01-05 08:00

privategpt python poetry gradio rag 本地部署 localgpt llm

Monday, January 1, 2024

AI LLM Infrastructure Service Architecture Diagrams

LLM infrastructure service architecture diagram

<center>
<div class="mermaid">
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart TB
  subgraph tool[Chat tools]
    direction TB
    chatgpt-next(ChatGPT Next Web)
    langchain-chatchat(Langchain-Chatchat)
    wechat(chatgpt-on-wechat)
  end
  subgraph business-application[Business application layer]
    direction TB
    app1(Power generation)
    app2(Dispatch)
    app3(Power transmission and transformation)
// ...

Code LLM infrastructure service architecture diagram

2024-01-01 10:00

llm code-llm 大模型基础服务架构图 fastchat tabby llmops dify openai-api chatglm3

AI Large Models

🔶 Large models

SLM

LLM

Chat LLM leaderboard (Open LLM Leaderboard)

2024-01-01 08:00

llm slm code-llm embedding-llm 大模型 leaderboard gguf huggingface qwen

Thursday, December 28, 2023

Combining Langchain-Chatchat and FastChat

FastChat

Installation

# Clone the repository
git clone https://github.com/lm-sys/FastChat
cd FastChat

# Create a virtual environment
python -m venv env
source env/bin/activate

# Install
pip install --upgrade pip
pip install -e ".[model_worker,webui]"

Create links to the models

LLM

mkdir THUDM
ln -s /Users/junjian/HuggingFace/THUDM/chatglm3-6b THUDM/chatglm3-6b

Embedding Model

mkdir BAAI
ln -s /Users/junjian/HuggingFace/BAAI/bge-base-zh-v1.5 BAAI/bge-base-zh-v1.5
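
The same `mkdir` plus `ln -s` pattern can be scripted when there are more models to link. A minimal sketch with a hypothetical helper `link_model`; the source paths are the ones used above and will differ on other machines.

```python
from pathlib import Path


def link_model(source_dir, repo_id, root="."):
    """Create root/<org>/<name> as a symlink to source_dir and return it.

    Mirrors `mkdir <org>` followed by `ln -s <source_dir> <org>/<name>`.
    """
    link = Path(root) / repo_id
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink():
        link.unlink()  # replace a stale link instead of failing
    link.symlink_to(Path(source_dir), target_is_directory=True)
    return link
```

For example, `link_model("/Users/junjian/HuggingFace/THUDM/chatglm3-6b", "THUDM/chatglm3-6b")` recreates the first link above, and the script can be rerun without deleting existing links by hand.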

Start the Controller service

python -m fastchat.serve.controller

Start the Model Worker service

LLM

python -m fastchat.serve.

2023-12-28 08:00

langchain-chatchat fastchat openai llm chatglm3 rag embeddings gradio

Wednesday, December 20, 2023

Microsoft Phi-2

Phi-2: The surprising power of small language models

microsoft/phi-2

Create a virtual environment

conda create -n huggingface python==3.10.9
conda activate huggingface

Install dependencies

conda install pytorch torchvision -c pytorch
pip install transformers
pip install einops

Download the model

huggingface-cli download microsoft/phi-2 --local-dir microsoft/phi-2 --local-dir-use-symlinks False

Code

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.set_default_device("mps")
model = AutoModelForCausalLM.

2023-12-20 10:00

phi-2 llm hugging-face pytorch transformers microsoft apple-silicon mps small-language-models

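Tuesday, December 19, 2023

Text Generation Inference

Introduction to TGI

TGI is a toolkit for deploying and serving large language models (LLMs). It provides high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and T5.

Tensor parallelism for faster inference across multiple GPUs
Continuous batching of incoming requests to increase total throughput
Transformers code optimized for inference with Flash Attention and Paged Attention on the most popular architectures
Quantization with bitsandbytes and GPT-Q
safetensors weight loading
Watermarking of model output
Fine-tuning support: use fine-tuned models tailored to specific tasks for higher accuracy and performance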
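
To see why continuous batching raises throughput, here is a toy simulation. It is not TGI's scheduler, only an illustration under simple assumptions: every request needs a fixed number of decoding steps, one step advances every running request by one token, and the function names are hypothetical.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decoding steps when each batch runs until its longest request ends."""
    steps = 0
    for start in range(0, len(lengths), batch_size):
        steps += max(lengths[start:start + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decoding steps when a freed slot is refilled on the next step."""
    waiting = deque(lengths)
    running = []  # remaining steps for each request currently in the batch
    steps = 0
    while waiting or running:
        # Fill free slots from the queue before running the next step.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        steps += 1
        # Requests with one step left finish now and release their slot.
        running = [remaining - 1 for remaining in running if remaining > 1]
    return steps


requests = [2, 8, 2, 8]  # tokens to generate per request
print(static_batching_steps(requests, batch_size=2))
print(continuous_batching_steps(requests, batch_size=2))
```

With static batching the short requests wait for the long one in their batch; with continuous batching the slot they free is reused immediately, so the same work finishes in fewer steps.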

System architecture

Deploy a model

HuggingFaceH4/zephyr-7b-beta

model=HuggingFaceH4/zephyr-7b-beta
volume=$PWD/data # Avoid downloading weights every run
docker run --

2023-12-19 08:00

text-generation-inference hugging-face inference-serving docker llm flash-attention quantization zephyr model-deployment

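Tuesday, December 12, 2023

TensorRT-LLM: Large Model Inference

TensorRT-LLM

TensorRT-LLM provides an easy-to-use Python API for defining large language models (LLMs) and building TensorRT engines that incorporate state-of-the-art optimizations for efficient inference on NVIDIA GPUs. It also includes components for creating Python and C++ runtimes that execute those TensorRT engines.

Build TensorRT-LLM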

# TensorRT-LLM uses git-lfs, which needs to be installed in advance.
apt-get update && apt-get -y install git git-lfs

git clone https://github.com/NVIDIA/TensorRT-LLM.git
cd TensorRT-LLM
git submodule update --init --recursive
git lfs install
git lfs pull

make -C docker release_build

2023-12-12 08:00

tensorrt-llm triton-inference-server chatglm tensorrt nvidia docker inference deployment llm

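Wednesday, December 6, 2023

Comparing Document Q&A with ChatGLM3 8k and 32k

Document

The document used here is: Rules for IT Support of Partner Personnel Attendance and Settlement Management

I. Attendance clock-in
Attendance clock-in covers regular attendance, business-trip, off-site, and overtime clock-ins.

1. Regular attendance clock-in: clocking in for a normal working day in the office.
(1) Full-day attendance: clock in before 8:30 and clock out after 17:30.
(2) Half-day attendance: for the morning, clock in before 8:30 and clock out after 12:00; for the afternoon, clock in before 13:00 and clock out after 17:30.
(3) The clock-in method (attendance machine or WeCom) follows each department's requirements; the smallest unit counted is half a day.

2. Business-trip clock-in: clocking in while working at the trip destination or while in transit.
(1) Fixed trip destination: clock-in times follow the regular attendance rules in item 1; if your location cannot be placed within a valid range, ask the department administrator to change the work clock-in location. (Follow each department's specific requirements.)
(2) In transit (use the off-site clock-in on your phone): clock in once before boarding at the station and once after arriving at the destination (the same applies to the return trip). If you leave in the afternoon, you must complete a regular attendance clock-in for the morning (both clock-in and clock-out); if you arrive at the destination in the morning, you must make one off-site clock-in or a regular clock-in/clock-out in the afternoon.

3. Off-site clock-in: clocking in while out on business. After submitting an off-site request you may use the off-site clock-in, and it must fall within the requested time window:
(1) Half-day off-site: if the off-site period is in the morning (before 12:00) or the afternoon (after 12:00), the other half day requires a regular attendance clock-in.
(2) Off-site across 12:00: if the off-site period spans 12:00, one off-site clock-in before 12:00 and one after count as valid attendance.
// ...

Prompt template

""" {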

2023-12-06 08:00

chatglm3 chatglm3-6b-32k bge-base-zh rag document-qa long-context embeddings quantization llm-performance llm

Sunday, December 3, 2023

GPT4All

Download the GPT4All client (macOS)

Download a model

Chat

Build a local document collection from a directory

Local service

Enable the API server

Open the server chat window

List the locally downloaded models

ll /Users/junjian/Library/Application\ Support/nomic.ai/GPT4All/*.gguf
-rw-r--r--@ 1 junjian staff 44M 12 3 10:30 /Users/junjian/Library/Application Support/nomic.ai/GPT4All/all-MiniLM-L6-v2-f16.gguf
-rw-r--r--@ 1 junjian staff 1.3G 12 3 12:53 /Users/junjian/Library/Application Support/nomic.ai/GPT4All/incomplete-nous-hermes-llama2-13b.Q4_0.gguf
-rw-r--r--@ 1 junjian staff 3.8G 12 3 10:09 /Users/junjian/Library/Application Support/nomic.ai/GPT4All/mistral-7b-openorca.Q4_0.gguf
-rw-r--r--@ 1 junjian staff 3.

2023-12-03 08:00

gpt4all local-llms rag openai chatgpt llm mistral nomic gguf

Tuesday, October 24, 2023

Deploying Multiple Models with FastChat

* [Chatbot Arena](https://chat.lmsys.org/)
* [FastChat](https://github.com/lm-sys/FastChat)
* [LMSYS BLOG](https://lmsys.org/blog/)
* [Use AutoGen for Local LLMs](https://microsoft.github.io/autogen/blog/2023/07/14/Local-LLMs/)

Installation

pip

pip install "fschat[model_worker,webui]"

From source

Installing from source makes debugging easier and suits developers.

Clone the code

git clone https://github.com/lm-sys/FastChat.git
cd FastChat

Create the environment

python -m venv env
source env/bin/activate

Install

2023-10-24 08:00

fastchat llm model-deployment inference-serving vicuna langchain docker scaling

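Monday, October 16, 2023

Testing Chinese Embedding Models with Private GPT

Document

The document used here is: Rules for IT Support of Partner Personnel Attendance and Settlement Management

I. Attendance clock-in
Attendance clock-in covers regular attendance, business-trip, off-site, and overtime clock-ins.

1. Regular attendance clock-in: clocking in for a normal working day in the office.
(1) Full-day attendance: clock in before 8:30 and clock out after 17:30.
(2) Half-day attendance: for the morning, clock in before 8:30 and clock out after 12:00; for the afternoon, clock in before 13:00 and clock out after 17:30.
(3) The clock-in method (attendance machine or WeCom) follows each department's requirements; the smallest unit counted is half a day.

2. Business-trip clock-in: clocking in while working at the trip destination or while in transit.
(1) Fixed trip destination: clock-in times follow the regular attendance rules in item 1; if your location cannot be placed within a valid range, ask the department administrator to change the work clock-in location. (Follow each department's specific requirements.)
(2) In transit (use the off-site clock-in on your phone): clock in once before boarding at the station and once after arriving at the destination (the same applies to the return trip). If you leave in the afternoon, you must complete a regular attendance clock-in for the morning (both clock-in and clock-out); if you arrive at the destination in the morning, you must make one off-site clock-in or a regular clock-in/clock-out in the afternoon.

3. Off-site clock-in: clocking in while out on business. After submitting an off-site request you may use the off-site clock-in, and it must fall within the requested time window:
(1) Half-day off-site: if the off-site period is in the morning (before 12:00) or the afternoon (after 12:00), the other half day requires a regular attendance clock-in.
(2) Off-site across 12:00: if the off-site period spans 12:00, one off-site clock-in before 12:00 and one after count as valid attendance.
// ...

Prompt template

Use the following context to answer the question at the end.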
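
A template like this is filled with the retrieved chunks and the user's question before it goes to the model. The sketch below shows one possible layout. Only the opening sentence comes from the post; the placeholders, the "Question:" and "Answer:" labels, and the `build_prompt` helper are hypothetical.

```python
# Only the first sentence comes from the post; the rest of the layout
# (placeholders, "Question:" / "Answer:" labels) is a hypothetical example.
PROMPT_TEMPLATE = (
    "Use the following context to answer the question at the end.\n\n"
    "{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(chunks, question):
    """Join retrieved text chunks and fill them into the template."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

For the attendance document above, `chunks` would hold the rule paragraphs returned by the embedding search, and `question` something like "When do I clock in for a half day?".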

2023-10-16 08:00

rag embeddings llm privategpt chinese machine-learning nlp

Tuesday, September 12, 2023

Deploying LLMs

Test results

Model & precision & GPU memory & speed
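
Precision and GPU memory are linked by simple arithmetic: the weights take roughly the parameter count times the bytes per parameter. The sketch below estimates the weights only, so it is a lower bound; the KV cache, activations, and framework overhead come on top. The helper name and the precision table are illustrative, not measurements from the post.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, precision):
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024 ** 3


for precision in ("fp16", "int8", "int4"):
    print(precision, round(weight_memory_gib(7e9, precision), 1))
```

For a 7B-parameter model this gives about 13 GiB at fp16, which is why 8-bit or 4-bit quantization is often needed to fit a single consumer GPU.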

2023-09-12 08:00

llm model-deployment inference-serving deployment docker cuda gpu qwen

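Saturday, September 9, 2023

LLM Leaderboard

LLM

Embedding models

Massive Text Embedding Benchmark (MTEB) Leaderboard

sensenova/piccolo-large-zh

piccolo is a general-purpose embedding model (Chinese) trained by the General Model Group at SenseTime. It borrows the training pipelines of E5 and GTE and uses two-stage training. In the first stage, we collected and crawled 400 million Chinese text pairs (which can be regarded as weakly supervised text-pair data) and optimized the model with a softmax contrastive loss over pairs. In the second stage, we collected and curated 20 million human-labeled Chinese text pairs (finely labeled data) and used a softmax contrastive loss over triplets with hard negatives to help the model optimize further.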
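
The softmax contrastive loss over pairs from the first stage can be sketched in a few lines. This is a generic InfoNCE-style formulation, not piccolo's training code. It assumes that the other passages in the batch serve as negatives, and the temperature value is a hypothetical default.

```python
import math


def in_batch_softmax_loss(similarity, temperature=0.05):
    """InfoNCE-style loss for a batch of text pairs.

    similarity[i][j] is the similarity between query i and passage j;
    the diagonal holds the true pairs and the other entries in each row
    act as negatives.
    """
    losses = []
    for i, row in enumerate(similarity):
        logits = [score / temperature for score in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(
            sum(math.exp(logit - peak) for logit in logits)
        )
        # Negative log-probability of the true pair under the row softmax.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss approaches zero when each query is far more similar to its own passage than to the others, and equals the log of the batch size when the model cannot tell them apart.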

BAAI/bge-large-zh

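FlagEmbedding maps any text to a low-dimensional dense vector that can be used for tasks such as retrieval, classification, clustering, and semantic matching, and it can also support LLMs in calling on external knowledge.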
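
Once texts are vectors, retrieval reduces to ranking documents by similarity to the query vector. The sketch below uses cosine similarity on made-up 3-dimensional vectors; real embeddings come from the model and have far more dimensions. The document ids and helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_documents(query_vector, document_vectors):
    """Return document ids sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query_vector, vector), doc_id)
              for doc_id, vector in document_vectors.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Hypothetical embeddings, hand-written for illustration only.
documents = {
    "attendance": [0.9, 0.1, 0.0],
    "travel": [0.1, 0.9, 0.1],
    "overtime": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
print(rank_documents(query, documents))
```

In a RAG setup the top-ranked documents are the ones passed to the LLM as external knowledge.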

Different tasks

google/owlvit-base-patch32

References

2023-09-09 08:00

llm benchmarks embeddings hugging-face models evals

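Monday, July 24, 2023

AI Large Models

🔥 Large models

🔥 Andrej Karpathy

🔥 Mu Li's paper deep-reading series
How to read a paper
AlexNet
ResNet
A beginner-friendly, illustrated explanation of graph neural networks (GNN/GCN)
GAN
Transformer
BERT Pre-training
ViT
Two inductive biases of convolutional neural networks: 1. locality (neighboring regions have similar features); 2. translation equivariance
local neighborhoods
MAE Autoencoder
Survey of contrastive learning papers
Data augmentation: the combination of Crop and Color is the most effective
MoCo
CLIP
How to Train Really Large Models on Many GPUs?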

2023-07-24 08:00

llm gpt chatgpt openai ai generative-ai machine-learning deep-learning nlp computer-vision

Tuesday, June 13, 2023

Building Systems with the ChatGPT API

Language Models, the Chat Format and Tokens

Load OpenAI API key

import os
import openai
import tiktoken
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']

2023-06-13 08:00

chatgpt openai api deeplearning-ai python llm prompt-engineering machine-learning generative-ai moderation

Friday, June 9, 2023

LangChain for LLM Application Development

LangChain is an open-source framework for building LLM applications.

LangChain: Models, Prompts and Output Parsers

Install dependencies

pip install python-dotenv
pip install openai

ChatCompletion

import os
import openai
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.

2023-06-09 08:00

langchain llm chatgpt openai deeplearning-ai python prompt-engineering tools api generative-ai

Tuesday, May 30, 2023

State of GPT - Andrej Karpathy

Introduction

Learn about the training pipeline of GPT assistants like ChatGPT, from tokenization to pretraining, supervised finetuning, and Reinforcement Learning from Human Feedback (RLHF). Dive deeper into practical techniques and mental models for the effective use of these models, including prompting strategies, finetuning, the rapidly growing ecosystem of tools, and their future extensions.

2023-05-30 08:00

llm gpt fine-tuning andrej-karpathy tokenization machine-learning deep-learning generative-ai ai chatgpt

146 posts are tagged “llm”

