Building a Dedicated Google Gemini Pro Site via the API

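Gemini is Google's newest multimodal large model and can process text, images, audio, video, and code. According to Google's announcement, Gemini Ultra is the first model to outperform human experts on MMLU (Massive Multitask Language Understanding), and Google reports that it also beats GPT-4 on reasoning, math, and coding benchmarks. Gemini Pro can be used in Google Bard. On December 13, 2023, Google opened access to the Gemini Pro API, with free calls.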

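For users in mainland China, "direct local access" to Gemini is not realistic: Google's article "Where Bard with Gemini Pro is available" lists the supported countries and regions, and googleapis.com, Vercel, and similar services are blocked in China as well. This post shows how to deploy an open-source project on a VPS with Docker to try out Gemini quickly, and also covers a manual deployment.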

1. Gemini Versions

Gemini currently comes in three versions: Ultra, Pro, and Nano.

▍Gemini Ultra

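The largest and most capable version, intended for highly complex tasks; the official demo video used the Ultra model. For now it is available only inside Google.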

▍Gemini Pro

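The developer-facing version, described as the best model for scaling across a wide range of tasks, and the one that can be tried today. It is available through Google AI Studio or Google Cloud Vertex AI.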

▍Gemini Nano

The version for Android developers: a model for on-device tasks on mobile hardware. It is available through the AI Core app.

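Bard has also integrated Gemini Pro. As of this post's publication, Bard exposes Gemini only for text prompts, and Gemini Pro can only be used in English. Judging from the report Google published, Gemini Pro is slightly weaker than GPT-4. For reference, see "Where Bard with Gemini Pro is available".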

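2. Applying for an API Key

Gemini Pro is currently free, with a limit of 60 requests per minute. Gemini Pro is aimed at developers building and deploying AI applications, so trying it requires creating and signing in to a Google Cloud Platform account and creating a project there; after that, Gemini Pro can be reached through Google AI Studio or Google Cloud Vertex AI.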
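
To stay under a per-minute limit when sending prompts from a script, a client-side pause between calls is usually enough. The following is a minimal sketch, not part of any official tooling; the names REQUESTS_PER_MINUTE, request_interval, and throttled_each are made up for illustration, and the limit of 60 is taken from the paragraph above.

```shell
# Assumed limit, taken from the free-tier figure quoted in this post.
REQUESTS_PER_MINUTE=60

# Whole seconds to wait between requests, rounded up so the limit is not exceeded.
request_interval() {
  echo $(( (60 + $1 - 1) / $1 ))
}

# Run the given command once per line read from stdin, pausing after each call.
# The command should not read stdin itself, or it will consume the remaining lines.
throttled_each() {
  interval=$(request_interval "$REQUESTS_PER_MINUTE")
  while IFS= read -r line; do
    "$@" "$line"
    sleep "$interval"
  done
}

# Usage (prompts.txt is a hypothetical file with one prompt per line):
#   throttled_each echo "prompt:" < prompts.txt
```

Rounding the interval up trades a little throughput for a margin below the limit.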

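If you only want to use an AI model, you can try Gemini Nano instead. It does not require signing in to Google Cloud Platform: download the AI Core app from the Google Play Store and install the Gemini Nano model inside it.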

▍Google AI Studio

https://makersuite.google.com/?hl=zh-cn

▍Gemini API key application

https://makersuite.google.com/app/apikey

https://ai.google.dev/

▍Gemini introduction

https://deepmind.google/technologies/gemini/#introduction

▍Gemini official documentation

https://ai.google.dev/docs?hl=zh-cn

▍Bard

https://bard.google.com/chat

You can generate multiple API keys in Google AI Studio, and each key must be bound to a Google Cloud Platform project. Choosing the first option, "Create API Key in new project", creates a new project automatically.
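
Before deploying anything, it can help to confirm that a new key works by calling the API once from the VPS. The sketch below assumes the v1beta REST path and the gemini-pro model name from Google's API documentation at the time of writing; the base URL is the default shown in the project's .env template. The function names build_payload and ask_gemini are illustrative only.

```shell
# Assumption: v1beta generateContent endpoint for the gemini-pro model.
GEMINI_ENDPOINT="https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent"

# Build the JSON request body for a single text prompt.
# Only backslashes and double quotes are escaped; newlines are not handled.
build_payload() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"contents":[{"parts":[{"text":"%s"}]}]}' "$escaped"
}

# Send one prompt and print the raw JSON response.
# Requires curl and GEMINI_API_KEY in the environment.
ask_gemini() {
  curl -sS -H 'Content-Type: application/json' \
    -d "$(build_payload "$1")" \
    "${GEMINI_ENDPOINT}?key=${GEMINI_API_KEY:?set GEMINI_API_KEY first}"
}

# Usage:
#   export GEMINI_API_KEY=<your-api-key>
#   ask_gemini "Say hello in one sentence."
```

A JSON response containing generated text suggests the key is valid; an error object usually points to a wrong key or an unsupported region.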

3. Deploying GeminiProChat with Docker

If you want to try deploying an unofficial open-source app, GeminiProChat is one option. Its GitHub repository is:

https://github.com/babaohuang/GeminiProChat

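One thing to be clear about first: GeminiProChat does not currently support Gemini Pro Vision. Pull the GeminiProChat image (currently 500 MB+); if the pull from Docker Hub fails or is slow, switch to the ghcr.io registry.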

docker pull babaohuang/geminiprochat:latest

#docker pull ghcr.io/babaohuang/geminiprochat:main

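Create and start the GeminiProChat container, replacing the content in angle brackets with your API key. Changing the host port is recommended (the container listens on 3000 by default) to avoid conflicts with other projects such as Chat-Next-Web or Chat-Bot. With the command below, GeminiProChat listens and serves on host port 4000.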

docker run --name geminiprochat \
--restart always \
-p 4000:3000 \
-itd \
-e GEMINI_API_KEY=<your-api-key> \
babaohuang/geminiprochat:latest
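
As an alternative to passing the key with -e, which leaves it in the shell history, the key can be loaded from a file through docker's --env-file option. This is a sketch under assumptions: the helper names write_env_file and start_geminiprochat and the file path are hypothetical, and -d is used in place of -itd since no interactive terminal is needed.

```shell
# Write the key to an env file that only the current user can read.
# $1 = target path, $2 = API key
write_env_file() {
  ( umask 077 && printf 'GEMINI_API_KEY=%s\n' "$2" > "$1" )
}

# Start the container with the key loaded from the env file.
# $1 = env file path, $2 = host port (defaults to 4000)
start_geminiprochat() {
  docker run --name geminiprochat \
    --restart always \
    -p "${2:-4000}:3000" \
    -d \
    --env-file "$1" \
    babaohuang/geminiprochat:latest
}

# Usage:
#   write_env_file /root/geminiprochat.env "<your-api-key>"
#   start_geminiprochat /root/geminiprochat.env 4000
```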

You can now open the GeminiProChat page at http://vps_ip_or_domainname:4000 and start using Gemini Pro. Before using Gemini Pro or Gemini Pro Vision, keep two points in mind:

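Gemini Pro and Gemini Pro Vision produce text output only. By comparison, Bing Chat's text-to-image feature is based on DALL-E rather than ChatGPT.
When using Gemini Pro Vision, send the image prompt at the very start of each chat; a chat that starts with a text-only prompt automatically uses the gemini-pro model until it is closed.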

4. Optional: Manual Local Deployment

Install a current Node.js release (20.x) and npm. GeminiProChat requires Node.js 18.x or later.

#rm -f /etc/apt/sources.list.d/nodesource.list
apt update && apt install -y ca-certificates curl gnupg
mkdir -p /etc/apt/keyrings
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg
NODE_MAJOR=20
echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_$NODE_MAJOR.x nodistro main" | tee /etc/apt/sources.list.d/nodesource.list
apt update && apt install nodejs -y
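
After the install, it is worth checking that the Node.js on the PATH actually meets the 18.x minimum, since an older distribution package can shadow the new one. A minimal sketch follows; node_major and node_version_ok are illustrative helper names, not part of Node.js.

```shell
# Extract the major version from a string such as "v20.10.0".
node_major() {
  ver=${1#v}
  printf '%s\n' "${ver%%.*}"
}

# Succeed when the version string meets the minimum major version (default 18).
# $1 = version string, $2 = minimum major version
node_version_ok() {
  [ "$(node_major "$1")" -ge "${2:-18}" ]
}

# Usage:
#   node_version_ok "$(node --version)" && echo "Node.js is new enough"
```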

Install pnpm, which is used to manage the Node.js packages.

npm i -g pnpm

Clone the repository and enter the project directory.

git clone https://github.com/babaohuang/GeminiProChat.git && cd GeminiProChat

Install the dependencies (as listed in the project's package.json, which you can customize).

pnpm install

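Create a .env file and set the API key in it. The file is named exactly .env and has the content below; replace the content in angle brackets with the API key you obtained.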

# Your API Key for GEMINI_API
GEMINI_API_KEY=<your_api_key>
# Custom base url for API. default: https://generativelanguage.googleapis.com
API_BASE_URL=
# Inject analytics or other scripts before </head> of the page
HEAD_SCRIPTS=
# Secret string for the project. Use for generating signatures for API calls
PUBLIC_SECRET_KEY=
# Set password for site, support multiple password separated by comma. If not set, site will be public
SITE_PASSWORD=
# Set the maximum number of historical messages used for contextual contact
PUBLIC_MAX_HISTORY_MESSAGES=
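
A quick sanity check of the .env file can catch an empty key before the app starts, and can flag an empty SITE_PASSWORD, which per the template comments leaves the site public. This is a sketch only; env_get and check_env are hypothetical helpers and handle just simple KEY=value lines without quoting.

```shell
# Print the value of a key from a dotenv-style file (last assignment wins).
# $1 = key, $2 = file
env_get() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Fail if GEMINI_API_KEY is empty; warn if SITE_PASSWORD is empty.
# $1 = path to the env file (defaults to .env)
check_env() {
  f=${1:-.env}
  if [ -z "$(env_get GEMINI_API_KEY "$f")" ]; then
    echo "error: GEMINI_API_KEY is empty"
    return 1
  fi
  if [ -z "$(env_get SITE_PASSWORD "$f")" ]; then
    echo "warning: SITE_PASSWORD is empty, the site will be public"
  fi
  return 0
}

# Usage (from the project directory):
#   check_env .env
```

Note that an unreplaced placeholder such as <your_api_key> still counts as non-empty, so this check does not prove the key is valid.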

Run the project. GeminiProChat runs on port 3000 by default, so the page can be opened at http://vps_ip:3000.

pnpm run dev
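
To confirm from a second terminal on the VPS that the server is answering, a small polling loop with curl can be used. This is an illustrative sketch; wait_for_url is a made-up helper name, and the URL assumes the default port 3000 mentioned above.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# $1 = URL, $2 = attempts (default 30), $3 = seconds between attempts (default 1)
wait_for_url() {
  url=$1
  tries=${2:-30}
  pause=${3:-1}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fs -o /dev/null --max-time 3 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$pause"
  done
  return 1
}

# Usage:
#   wait_for_url http://127.0.0.1:3000/ && echo "GeminiProChat is up"
```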

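5. Better Options

▍Supporting both Gemini Pro and Gemini Pro Vision

GeminiProChat supports only Gemini Pro, not Gemini Pro Vision, which will put many people off. If you need Vision, use Google Bard directly or watch for Lao E's follow-up posts.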

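▍Supporting APIs from multiple model families

As shown above, using Gemini means deploying an open-source project that supports the Gemini API. ChatGPT, Claude 2, and others each have their own API conventions, so N models from different vendors need N applications, which is redundant and inefficient.

So is there an open-source project that converts and unifies APIs so that it supports ChatGPT, Gemini, Claude 2, and other model families at once, and even the APIs of nearly all large models, including Chinese ones such as ERNIE Bot (文心一言), Tongyi Qianwen (通义千问), Spark (星火认知), and Zhipu (智谱)? The answer is yes.

Lao E will introduce such open-source applications in later posts and deploy them on a VPS. In addition, Cloudflare AI Gateway has been available for a while (Cloudflare is subject to heavy interference in China but is not fully blocked). It can be thought of as Cloudflare's official AI proxy; it does not cover every large model, but it at least addresses "direct" access from China to models such as ChatGPT 4, and it will also be covered in a later post.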

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.