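Embedding
Embedding converts text or documents into fixed-dimension floating-point vectors and underpins semantic search, similarity computation, and RAG (retrieval-augmented generation). Spring AI unifies the embedding capabilities of different vendors behind the EmbeddingModel interface.
1. Overview
EmbeddingModel is the entry point for all embedding calls. With a vendor starter on the classpath, it is available through dependency injection.
2. Calling the model
EmbeddingModel is the core interface. It offers several levels of invocation, from one-line convenience methods to full control over the request.
2.1 Interface methods at a glance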
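| Method | Input | Output | Description |
|---|---|---|---|
| `embed(String)` | Single text | `float[]` | Simplest call; embeds one text |
| `embed(List<String>)` | List of texts | `List<float[]>` | Embeds a batch of texts |
| `embed(Document)` | Single document | `float[]` | Embeds one document |
| `embed(List<Document>, EmbeddingOptions, BatchingStrategy)` | List of documents plus options | `List<float[]>` | Embeds documents in batches |
| `embedForResponse(List<String>)` | List of texts | `EmbeddingResponse` | Returns the full response, including metadata |
| `call(EmbeddingRequest)` | Request object | `EmbeddingResponse` | Low-level contract method |
| `dimensions()` | - | `int` | Returns the vector dimension |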
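
The table describes convenience methods layered on top of a single low-level contract. The sketch below illustrates that layering with simplified stand-in types. Every name in it (DelegationSketch, MiniEmbeddingModel, ToyEmbeddingModel) is hypothetical, the toy model just counts characters into buckets, and none of it is the Spring AI source.

```java
import java.util.ArrayList;
import java.util.List;

/** Layering sketch with hypothetical types; this is not the Spring AI source. */
final class DelegationSketch {

    /** Simplified stand-in for an embedding model interface. */
    interface MiniEmbeddingModel {

        /** Low-level contract: one vector per input text, in input order. */
        List<float[]> call(List<String> texts);

        /** Batch convenience method: forwards straight to call(). */
        default List<float[]> embed(List<String> texts) {
            return call(texts);
        }

        /** Single-text convenience method: wraps the text, then unwraps the only result. */
        default float[] embed(String text) {
            return embed(List.of(text)).get(0);
        }

        /** Derives the vector dimension by embedding a short probe text. */
        default int dimensions() {
            return embed("probe").length;
        }
    }

    /** Toy model: counts characters into a fixed number of buckets (no semantics). */
    static final class ToyEmbeddingModel implements MiniEmbeddingModel {

        private final int size;

        ToyEmbeddingModel(int size) {
            this.size = size;
        }

        @Override
        public List<float[]> call(List<String> texts) {
            List<float[]> vectors = new ArrayList<>();
            for (String text : texts) {
                float[] v = new float[size];
                for (int i = 0; i < text.length(); i++) {
                    v[text.charAt(i) % size] += 1f;
                }
                vectors.add(v);
            }
            return vectors;
        }
    }

    public static void main(String[] args) {
        MiniEmbeddingModel model = new ToyEmbeddingModel(8);
        System.out.println("dimensions: " + model.dimensions());
        System.out.println("batch size: " + model.embed(List.of("a", "b", "c")).size());
    }
}
```

With this layering, an implementation only has to supply the low-level method; the shortcuts come for free.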
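2.2 Auto-injection
After you add any vendor's starter, Spring Boot auto-configuration provides the EmbeddingModel bean.

application.yml

    spring:
      ai:
        ollama:
          base-url: http://localhost:11434
          embedding:
            options:
              model: mxbai-embed-large

The bean can then be injected through the constructor:

    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class EmbeddingController {

        private final EmbeddingModel embeddingModel;

        public EmbeddingController(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        // GET /embed?text=... returns the embedding vector of the given text
        @GetMapping("/embed")
        public float[] embed(@RequestParam String text) {
            return embeddingModel.embed(text);
        }
    }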
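2.3 Manual creation
Full example: ManualEmbeddingDemo.java

    import java.util.Arrays;

    import org.springframework.ai.ollama.OllamaEmbeddingModel;
    import org.springframework.ai.ollama.api.OllamaApi;
    import org.springframework.ai.ollama.api.OllamaOptions;

    public class ManualEmbeddingDemo {

        public static void main(String[] args) {
            // Low-level client pointing at a local Ollama server
            OllamaApi ollamaApi = OllamaApi.builder()
                    .baseUrl("http://localhost:11434")
                    .build();

            // Default options applied to every request
            OllamaOptions options = OllamaOptions.builder()
                    .model("mxbai-embed-large")
                    .build();

            OllamaEmbeddingModel embeddingModel = OllamaEmbeddingModel.builder()
                    .ollamaApi(ollamaApi)
                    .defaultOptions(options)
                    .build();

            float[] vector = embeddingModel.embed("Spring AI embedding test");
            System.out.println("Vector dimensions: " + vector.length);
            System.out.println("First 5 components: " + Arrays.toString(Arrays.copyOf(vector, 5)));
        }
    }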
3. Data structures
3.1 Embedding

    Embedding embedding = new Embedding(new float[]{0.1f, 0.2f, 0.3f}, 0);
    float[] vector = embedding.getOutput();  // the vector data
    int index = embedding.getIndex();        // position in the response list

3.2 EmbeddingRequest

    EmbeddingRequest request = new EmbeddingRequest(
            List.of("Hello World", "你好世界"),
            EmbeddingOptionsBuilder.builder()
                    .withModel("mxbai-embed-large")
                    .build()
    );
    List<String> inputs = request.getInstructions();  // texts to embed
    EmbeddingOptions options = request.getOptions();  // request options

3.3 EmbeddingResponse

    EmbeddingResponse response = embeddingModel.call(request);

    // One Embedding per input text
    List<Embedding> results = response.getResults();
    for (Embedding e : results) {
        System.out.println("Index: " + e.getIndex());
        System.out.println("Vector dimensions: " + e.getOutput().length);
    }

    // Response-level metadata, such as the model that produced the vectors
    EmbeddingResponseMetadata metadata = response.getMetadata();
    System.out.println("Model: " + metadata.getModel());
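4. Usage
4.1 Embedding a single text
Full example: SingleEmbedDemo.java

    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class SingleEmbedDemo implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public SingleEmbedDemo(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            float[] vector = embeddingModel.embed("Spring AI makes AI development simple");
            // dimensions() reports the model's vector size; it should match the actual length
            System.out.println("Reported dimensions: " + embeddingModel.dimensions());
            System.out.println("Actual dimensions: " + vector.length);
        }
    }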
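4.2 Embedding a batch of texts
Full example: BatchEmbedDemo.java

    import java.util.List;

    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class BatchEmbedDemo implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public BatchEmbedDemo(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            List<String> texts = List.of(
                    "How Spring Boot auto-configuration works",
                    "Microservice architecture design patterns",
                    "The Java Virtual Machine memory model"
            );

            // Vectors are returned in the same order as the input texts
            List<float[]> vectors = embeddingModel.embed(texts);
            for (int i = 0; i < texts.size(); i++) {
                System.out.println(texts.get(i) + " -> dimensions " + vectors.get(i).length);
            }
        }
    }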
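4.3 Getting the full response
embedForResponse returns the complete EmbeddingResponse, including metadata such as the model name and token usage.
Full example: EmbedForResponseDemo.java

    import java.util.List;

    import org.springframework.ai.embedding.Embedding;
    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.ai.embedding.EmbeddingResponse;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class EmbedForResponseDemo implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public EmbedForResponseDemo(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            EmbeddingResponse response = embeddingModel.embedForResponse(
                    List.of("Artificial intelligence", "Machine learning", "Deep learning")
            );
            System.out.println("Model: " + response.getMetadata().getModel());
            System.out.println("Vector count: " + response.getResults().size());
            for (Embedding e : response.getResults()) {
                float[] v = e.getOutput();
                System.out.printf("  Index %d: dimensions %d, first 3 components [%.4f, %.4f, %.4f]%n",
                        e.getIndex(), v.length, v[0], v[1], v[2]);
            }
        }
    }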
5. Embedding documents
5.1 Basic document embedding
Full example: DocumentEmbedDemo.java

    import org.springframework.ai.document.Document;
    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class DocumentEmbedDemo implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public DocumentEmbedDemo(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            Document doc = Document.builder()
                    .id("doc-001")
                    .text("Spring AI provides a unified EmbeddingModel interface that hides vendor differences")
                    .metadata("source", "spring-ai-docs")
                    .build();

            float[] vector = embeddingModel.embed(doc);
            System.out.println("Document ID: " + doc.getId());
            System.out.println("Vector dimensions: " + vector.length);
        }
    }
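5.2 Batched embedding with TokenCountBatchingStrategy
When the total token count of the documents exceeds the model's limit, TokenCountBatchingStrategy splits them into batches automatically.
Full example: BatchingEmbedDemo.java

    import java.util.List;

    import org.springframework.ai.document.Document;
    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.ai.embedding.EmbeddingOptionsBuilder;
    import org.springframework.ai.embedding.TokenCountBatchingStrategy;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class BatchingEmbedDemo implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public BatchingEmbedDemo(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            List<Document> documents = List.of(
                    new Document("Spring AI ChatClient offers a fluent API that simplifies model calls"),
                    new Document("EmbeddingModel converts text into a vector representation"),
                    new Document("VectorStore implements efficient similarity search"),
                    new Document("RAG injects retrieved results into the prompt to improve generation quality")
            );

            // The batching strategy decides how the documents are grouped into requests
            List<float[]> vectors = embeddingModel.embed(
                    documents,
                    EmbeddingOptionsBuilder.builder()
                            .withModel("mxbai-embed-large")
                            .build(),
                    new TokenCountBatchingStrategy()
            );

            for (int i = 0; i < documents.size(); i++) {
                System.out.println(documents.get(i).getId()
                        + " -> vector dimensions: " + vectors.get(i).length);
            }
        }
    }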
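
To make the idea of token-based batching concrete, here is a minimal pure-Java sketch of greedy batching under a token budget. It is not the TokenCountBatchingStrategy implementation: the class BatchingSketch, the parameter maxTokensPerBatch, and the whitespace-based estimateTokens are all assumptions made for illustration, and a real strategy would count tokens with a proper tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

/** Greedy token-budget batching; hypothetical names, not the Spring AI implementation. */
final class BatchingSketch {

    /** Crude token estimate (assumption): one token per whitespace-separated word. */
    static int estimateTokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Packs texts, in order, into batches whose estimated token total stays within the budget. */
    static List<List<String>> batch(List<String> texts, int maxTokensPerBatch) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentTokens = 0;
        for (String text : texts) {
            int tokens = estimateTokens(text);
            if (tokens > maxTokensPerBatch) {
                // This sketch rejects a text that cannot fit into any batch
                throw new IllegalArgumentException("Single text exceeds the token budget: " + tokens);
            }
            if (currentTokens + tokens > maxTokensPerBatch) {
                // Current batch is full: close it and start a new one
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(text);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("one two three", "four five", "six seven eight nine", "ten");
        // With a budget of 5 tokens, the four texts end up in two batches
        System.out.println(batch(texts, 5));
    }
}
```

Because the texts are packed in input order, the concatenated batch results line up with the original list, which is what lets the caller index vectors by document position.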
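6. Advanced configuration
6.1 EmbeddingOptions
Use EmbeddingOptionsBuilder to set the model name and the output dimensions. The dimensions setting takes effect only with some vendors; Ollama does not support specifying it.
Full example: EmbeddingOptionsDemo.java

    import java.util.List;

    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.ai.embedding.EmbeddingOptions;
    import org.springframework.ai.embedding.EmbeddingOptionsBuilder;
    import org.springframework.ai.embedding.EmbeddingRequest;
    import org.springframework.ai.embedding.EmbeddingResponse;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class EmbeddingOptionsDemo implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public EmbeddingOptionsDemo(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            // Per-request options override the configured default model
            EmbeddingOptions options = EmbeddingOptionsBuilder.builder()
                    .withModel("nomic-embed-text")
                    .build();
            EmbeddingRequest request = new EmbeddingRequest(
                    List.of("Java 21 virtual threads"),
                    options
            );
            EmbeddingResponse response = embeddingModel.call(request);
            System.out.println("Model used: " + response.getMetadata().getModel());
            System.out.println("Vector dimensions: " + response.getResult().getOutput().length);
        }
    }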
6.2 EmbeddingOptionsBuilder methods
| Method | Description |
|---|---|
| `withModel(String)` | Sets the model name |
| `withDimensions(Integer)` | Sets the output dimensions (supported by some vendors only) |
| `build()` | Builds the `EmbeddingOptions` |
7. Ollama embedding configuration
7.1 Configuration properties

application.yml

    spring:
      ai:
        ollama:
          base-url: http://localhost:11434
          embedding:
            options:
              model: mxbai-embed-large
              truncate: true
              keep-alive: 5m

| Property | Default | Description |
|---|---|---|
| `spring.ai.ollama.embedding.options.model` | `mxbai-embed-large` | Name of the embedding model |
| `spring.ai.ollama.embedding.options.truncate` | - | Whether to truncate input that is too long |
| `spring.ai.ollama.embedding.options.keep-alive` | - | How long the model stays loaded in memory |
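7.2 Common embedding models
| Model | Dimensions | Use case |
|---|---|---|
| `mxbai-embed-large` | 1024 | General-purpose embedding, primarily English |
| `nomic-embed-text` | 768 | Lightweight embedding, supports Chinese and English |
| `bge-m3` | 1024 | Multilingual embedding, strong on Chinese |
| `all-minilm` | 384 | Smallest embedding, for resource-constrained environments |

The dimension can also be obtained at runtime via embeddingModel.dimensions(); AbstractEmbeddingModel caches the result of the first lookup.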
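
The caching behavior can be pictured with the following sketch. It only illustrates the general pattern of probing once and remembering the result; CachedDimensions and its embedder function are hypothetical names, and this is not the AbstractEmbeddingModel source.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Probe-once dimension cache; hypothetical names, for illustration only. */
final class CachedDimensions {

    private static final int UNKNOWN = -1;

    // Stand-in for a call to an embedding model (assumption)
    private final Function<String, float[]> embedder;
    private final AtomicInteger cached = new AtomicInteger(UNKNOWN);

    CachedDimensions(Function<String, float[]> embedder) {
        this.embedder = embedder;
    }

    /** Returns the vector length, embedding a probe text only while the value is unknown. */
    int dimensions() {
        int value = cached.get();
        if (value == UNKNOWN) {
            value = embedder.apply("probe").length;
            cached.compareAndSet(UNKNOWN, value);
        }
        return value;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        CachedDimensions dims = new CachedDimensions(text -> {
            calls.incrementAndGet();
            return new float[1024];
        });
        dims.dimensions();
        dims.dimensions();
        System.out.println(dims.dimensions() + " dimensions, probe calls: " + calls.get());
    }
}
```

In this sketch, two threads that race on the very first call may both run the probe, which is harmless because they compute the same value.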
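8. Computing vector similarity
Once texts are embedded, cosine similarity can be used to compare how semantically similar they are.
Full example: SimilarityDemo.java

    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class SimilarityDemo implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public SimilarityDemo(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            String query = "The Java programming language";
            String candidate1 = "Java is an object-oriented programming language";
            String candidate2 = "The weather is nice today, perfect for an outing";

            float[] queryVec = embeddingModel.embed(query);
            float[] vec1 = embeddingModel.embed(candidate1);
            float[] vec2 = embeddingModel.embed(candidate2);

            System.out.println("Query: " + query);
            System.out.println("Similarity to candidate 1: " + cosineSimilarity(queryVec, vec1));
            System.out.println("Similarity to candidate 2: " + cosineSimilarity(queryVec, vec2));
        }

        // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector has zero norm
        private double cosineSimilarity(float[] a, float[] b) {
            if (a.length != b.length) {
                throw new IllegalArgumentException("Vectors must have the same dimension");
            }
            double dotProduct = 0, normA = 0, normB = 0;
            for (int i = 0; i < a.length; i++) {
                dotProduct += (double) a[i] * b[i];
                normA += (double) a[i] * a[i];
                normB += (double) b[i] * b[i];
            }
            double denominator = Math.sqrt(normA) * Math.sqrt(normB);
            return denominator == 0 ? 0 : dotProduct / denominator;
        }
    }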
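
When the same vectors are compared many times, a common optimization is to normalize each vector to unit length once, after which cosine similarity reduces to a plain dot product. The sketch below shows this in pure Java; VectorMath and its method names are illustrative and not part of Spring AI.

```java
/** Unit-length normalization so that cosine similarity becomes a dot product (illustrative helper). */
final class VectorMath {

    /** Returns a unit-length copy of v; a zero vector yields a zero vector. */
    static float[] normalize(float[] v) {
        double sumSquares = 0;
        for (float x : v) {
            sumSquares += (double) x * x;
        }
        double norm = Math.sqrt(sumSquares);
        float[] out = new float[v.length];
        if (norm == 0) {
            return out;
        }
        for (int i = 0; i < v.length; i++) {
            out[i] = (float) (v[i] / norm);
        }
        return out;
    }

    /** Dot product; equals the cosine similarity when both inputs are unit vectors. */
    static double dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] a = normalize(new float[]{3f, 4f});
        float[] b = normalize(new float[]{4f, 3f});
        // (3*4 + 4*3) / (5 * 5) = 0.96, up to float rounding
        System.out.printf("cosine = %.4f%n", dot(a, b));
    }
}
```

Normalizing at write time trades a little work per stored vector for cheaper comparisons at query time.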
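9. Relationship to VectorStore
EmbeddingModel is an underlying dependency of VectorStore: when storing and retrieving documents, VectorStore delegates vectorization to EmbeddingModel internally, so application code only needs to inject VectorStore. See the VectorStore chapter for details.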
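
The delegation can be illustrated with a deliberately tiny in-memory store. This is a sketch under stated assumptions, not Spring AI's VectorStore: TinyVectorStore, its add and search methods, and the letter-count embedder are all hypothetical, and a real store would use an actual embedding model and an indexed database.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical in-memory store that embeds on write and on query. */
final class TinyVectorStore {

    /** A stored text together with the vector computed at insertion time. */
    private static final class Entry {
        final String text;
        final float[] vector;

        Entry(String text, float[] vector) {
            this.text = text;
            this.vector = vector;
        }
    }

    private final Function<String, float[]> embedder; // stand-in for an embedding model
    private final List<Entry> entries = new ArrayList<>();

    TinyVectorStore(Function<String, float[]> embedder) {
        this.embedder = embedder;
    }

    /** Embeds the text internally, so callers never handle vectors themselves. */
    void add(String text) {
        entries.add(new Entry(text, embedder.apply(text)));
    }

    /** Embeds the query, then returns up to k stored texts ordered by descending cosine similarity. */
    List<String> search(String query, int k) {
        float[] q = embedder.apply(query);
        return entries.stream()
                .sorted(Comparator.comparingDouble((Entry e) -> cosine(q, e.vector)).reversed())
                .limit(k)
                .map(e -> e.text)
                .collect(Collectors.toList());
    }

    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            na += (double) a[i] * a[i];
            nb += (double) b[i] * b[i];
        }
        double denominator = Math.sqrt(na) * Math.sqrt(nb);
        return denominator == 0 ? 0 : dot / denominator;
    }

    /** Toy embedder (assumption): counts the letters a to z; it carries no real semantics. */
    static float[] letterCounts(String text) {
        float[] v = new float[26];
        for (char c : text.toLowerCase().toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                v[c - 'a'] += 1f;
            }
        }
        return v;
    }

    public static void main(String[] args) {
        TinyVectorStore store = new TinyVectorStore(TinyVectorStore::letterCounts);
        store.add("java spring");
        store.add("weather today");
        System.out.println(store.search("spring java", 1));
    }
}
```

The caller only ever passes text in and gets text back, which mirrors why injecting the store alone is enough in application code.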
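10. Complete end-to-end example
Full example: EmbeddingCompleteExample.java

    import java.util.List;
    import java.util.Map;

    import org.springframework.ai.document.Document;
    import org.springframework.ai.embedding.EmbeddingModel;
    import org.springframework.ai.embedding.EmbeddingOptionsBuilder;
    import org.springframework.ai.embedding.TokenCountBatchingStrategy;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.stereotype.Component;

    @Component
    public class EmbeddingCompleteExample implements CommandLineRunner {

        private final EmbeddingModel embeddingModel;

        public EmbeddingCompleteExample(EmbeddingModel embeddingModel) {
            this.embeddingModel = embeddingModel;
        }

        @Override
        public void run(String... args) {
            List<Document> docs = List.of(
                    new Document("doc-1", "Spring AI provides ChatClient to simplify model calls", Map.of("topic", "chat")),
                    new Document("doc-2", "EmbeddingModel converts text into high-dimensional vectors", Map.of("topic", "embedding")),
                    new Document("doc-3", "VectorStore supports multiple vector database implementations", Map.of("topic", "vectorstore")),
                    new Document("doc-4", "RAG improves answer accuracy through retrieval-augmented generation", Map.of("topic", "rag"))
            );

            // 1. Embed all documents in token-limited batches
            List<float[]> vectors = embeddingModel.embed(
                    docs,
                    EmbeddingOptionsBuilder.builder().build(),
                    new TokenCountBatchingStrategy()
            );

            // 2. Embed the query
            String query = "How do I convert text into vectors?";
            float[] queryVec = embeddingModel.embed(query);

            // 3. Pick the document with the highest cosine similarity
            double bestScore = -1;
            String bestDoc = "";
            for (int i = 0; i < docs.size(); i++) {
                double score = cosineSimilarity(queryVec, vectors.get(i));
                if (score > bestScore) {
                    bestScore = score;
                    bestDoc = docs.get(i).getText();
                }
            }

            System.out.println("Query: " + query);
            System.out.println("Best matching document: " + bestDoc);
            System.out.println("Similarity: " + String.format("%.4f", bestScore));
        }

        // Cosine similarity; returns 0 if either vector has zero norm
        private double cosineSimilarity(float[] a, float[] b) {
            double dot = 0, na = 0, nb = 0;
            for (int i = 0; i < a.length; i++) {
                dot += (double) a[i] * b[i];
                na += (double) a[i] * a[i];
                nb += (double) b[i] * b[i];
            }
            double denominator = Math.sqrt(na) * Math.sqrt(nb);
            return denominator == 0 ? 0 : dot / denominator;
        }
    }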