Information retrieval

2025-04-18

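In the previous step of your Retrieval-Augmented Generation (RAG) solution, you generated the embeddings for your chunks. In this step, you generate the index in the vector database and experiment to determine your optimal searches. This article covers configuration options for a search index, types of searches, and reranking strategies.

This article is part of a series. See the introduction.

Configure your search index

Note

This section describes specific recommendations for Azure AI Search. If you use a different store, review the appropriate documentation to find the key configurations for that service.

The search index in your store has a column for every field in your data. Search stores generally support nonvector data types, such as string, boolean, integer, single, double, and datetime. They also support collections, such as single-type collections and vector data types. For each column, you must configure information such as the data type and whether the field is filterable, retrievable, or searchable.

Consider the following vector search configurations that you can apply to vector fields: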

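Vector search algorithm: The vector search algorithm searches for relative matches, that is, the stored vectors that are nearest to the query vector. AI Search has a brute-force algorithm option, called exhaustive k-nearest neighbors (KNN), that scans the entire vector space. It also has a more performant algorithm option, called Hierarchical Navigable Small World (HNSW), that performs an approximate nearest neighbor (ANN) search.
Similarity metric: The algorithm uses a similarity metric to calculate nearness. The types of metrics in AI Search include cosine, dot product, and Euclidean. If you use Azure OpenAI Service embedding models, choose cosine.
The efConstruction parameter: This parameter is used during the construction of an HNSW index. It determines the number of nearest neighbors that are connected to a vector during indexing. A larger efConstruction value results in a better-quality index than a smaller value. But a larger value requires more time, storage, and compute. For a large number of chunks, set the efConstruction value higher. For a small number of chunks, set the value lower. To determine the optimal value, experiment with your data and expected queries.
The efSearch parameter: This parameter is used at query time to set the number of nearest neighbors, or similar chunks, that the search uses.
The m parameter: This parameter is the bidirectional link count. The range is 4 to 10. Lower numbers return less noise in the results.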
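To make the difference between the two algorithm options concrete, the following is a minimal sketch of the brute-force approach: it scores every stored vector against the query by using cosine similarity and keeps the best k. The function names and the toy vectors are assumptions for illustration. This isn't how AI Search implements exhaustive KNN internally, and an HNSW index avoids this full scan by walking a graph of neighbors instead.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exhaustive_knn(query: Sequence[float], vectors: List[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k vectors most similar to the query.

    Every vector is scored, so the cost grows linearly with the index size.
    """
    scored = [(cosine_similarity(query, vector), index) for index, vector in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]


# Toy two-dimensional "embeddings" (hypothetical data).
chunk_vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
nearest = exhaustive_knn([1.0, 0.0], chunk_vectors, k=2)
```

In this toy data, the first and third vectors point in nearly the same direction as the query, so they're the two neighbors that the scan returns.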

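In AI Search, the vector configurations are encapsulated in a vectorSearch configuration. When you configure your vector columns, you reference the appropriate configuration for that vector column and set the number of dimensions. The vector column's dimensions attribute represents the number of dimensions that your embedding model generates. For example, the storage-optimized text-embedding-3-small model generates 1,536 dimensions.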
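As a rough illustration of how these settings fit together, the following sketch builds an index definition as a plain Python dictionary. The property names follow my understanding of the AI Search REST index schema, so verify them against the API version that you use. The index name, field names, profile names, and parameter values are all assumptions chosen for the example, not recommendations.

```python
# Number of dimensions produced by the embedding model (text-embedding-3-small).
EMBEDDING_DIMENSIONS = 1536

index_definition = {
    "name": "rag-chunks",  # hypothetical index name
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True, "filterable": True},
        {"name": "content", "type": "Edm.String", "searchable": True, "retrievable": True},
        {
            # Vector column: references a vector search profile and sets the dimensions.
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "searchable": True,
            "dimensions": EMBEDDING_DIMENSIONS,
            "vectorSearchProfile": "hnsw-cosine-profile",
        },
    ],
    "vectorSearch": {
        "algorithms": [
            {
                "name": "hnsw-cosine",
                "kind": "hnsw",
                # Illustrative values only; tune them against your data and queries.
                "hnswParameters": {
                    "m": 4,
                    "efConstruction": 400,
                    "efSearch": 500,
                    "metric": "cosine",
                },
            }
        ],
        "profiles": [{"name": "hnsw-cosine-profile", "algorithm": "hnsw-cosine"}],
    },
}


def undefined_profiles(definition: dict) -> list:
    """Return the names of vector fields that reference a profile that isn't defined."""
    defined = {profile["name"] for profile in definition["vectorSearch"]["profiles"]}
    return [
        field["name"]
        for field in definition["fields"]
        if "vectorSearchProfile" in field and field["vectorSearchProfile"] not in defined
    ]
```

A small check like undefined_profiles can catch a mistyped profile name before you submit the definition to the service.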

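Choose your search approach

When you run queries from your prompt orchestrator against your search store, consider the following factors:

The type of search that you want to perform, such as vector, keyword, or hybrid
Whether you want to query against one or more columns
Whether you want to manually run multiple queries, such as a keyword query and a vector search
Whether you need to break down the query into subqueries
Whether you should use filtering in your queries

Your prompt orchestrator might use a static approach or a dynamic approach that combines the approaches based on context clues from the prompt. The following sections address these options to help you find the right approach for your workload.

Search types

Search platforms generally support full-text and vector searches. Some platforms, like AI Search, support hybrid searches.

Vector search

Vector searches compare the similarity between the vectorized query (prompt) and vector fields. For more information, see Choose an Azure service for vector searches.

Important

Before you embed the query, you should perform the same cleaning operations that you performed on the chunks. For example, if you lowercased every word in your embedded chunk, you should lowercase every word in the query before embedding.

Note

You can perform a vector search against multiple vector fields in the same query. In AI Search, this practice is considered a hybrid search. For more information, see Hybrid search.

The following sample code performs a vector search against the contentVector field.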

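# RawVectorQuery comes from a preview release of azure-search-documents.
# Later releases expose this class as VectorizedQuery (with k_nearest_neighbors),
# so check the version that you install.
from azure.search.documents.models import RawVectorQuery

# Preprocess the query the same way as the chunks, and then embed it.
embedding = embedding_model.generate_embedding(
    chunk=str(pre_process.preprocess(query))
)

# Ask for the k nearest neighbors in the contentVector field.
vector = RawVectorQuery(
    k=retrieve_num_of_documents,
    fields="contentVector",
    vector=embedding,
)

# search_text=None makes this a pure vector search.
results = client.search(
    search_text=None,
    vector_queries=[vector],
    top=retrieve_num_of_documents,
    select=["title", "content", "summary"],
)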

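The code that embeds the query preprocesses the query first. That preprocessing should be the same code that preprocesses the chunks before embedding. You must use the same embedding model that embedded the chunks.

Full-text search

Full-text searches match plain text that's stored in an index. It's a common practice to extract keywords from a query and use those extracted keywords in a full-text search against one or more indexed columns. You can configure full-text searches to return matches if any terms or all terms match.

Experiment to determine which fields to run full-text searches against. As described in the enrichment phase article, you should use keyword and entity metadata fields for full-text searches in scenarios where content has similar semantic meaning but entities or keywords differ. Other common fields to consider for full-text search include title, summary, and chunk text.

The following sample code performs a full-text search against the title, content, and summary fields.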

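results = client.search(
    search_text=query,
    # search_fields limits which fields are searched;
    # select only controls which fields are returned.
    search_fields=["title", "content", "summary"],
    top=retrieve_num_of_documents,
    select=["title", "content", "summary"],
)

# format_results is a helper that's defined elsewhere in the source project.
formatted_search_results = format_results(results)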

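Hybrid search

AI Search supports hybrid queries that can include one or more text searches and one or more vector searches. The platform runs each query, gets the intermediate results, reranks the results by using Reciprocal Rank Fusion (RRF), and returns the top N results.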
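The fusion step can be sketched in a few lines. The following example implements the commonly published form of Reciprocal Rank Fusion, in which each document earns 1 / (k + rank) from every result list that contains it. The constant k = 60 is the value that's typically quoted for RRF. The function name and the document IDs are assumptions for illustration, and the sketch isn't a description of how AI Search implements the step internally.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def reciprocal_rank_fusion(
    ranked_lists: Sequence[Sequence[str]],
    k: int = 60,
    top_n: Optional[int] = None,
) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    A document's fused score is the sum of 1 / (k + rank) over every list
    that contains it, where rank starts at 1 for the best result.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return fused if top_n is None else fused[:top_n]


# Intermediate results from one text query and one vector query (hypothetical IDs).
text_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-b", "doc-d", "doc-a"]
fused = reciprocal_rank_fusion([text_hits, vector_hits], top_n=3)
```

Documents that appear in both lists accumulate two contributions, which is why doc-b and doc-a rise above documents that only one query returned.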

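The following sample code performs a full-text search against the title, content, and summary fields. It also performs vector searches against the contentVector and questionVector fields. AI Search runs all the queries in parallel, reranks the results, and returns the top retrieve_num_of_documents results.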

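# Preprocess the query the same way as the chunks, and then embed it.
embedding = embedding_model.generate_embedding(
    chunk=str(pre_process.preprocess(query))
)

# One vector query per vector field; both reuse the same query embedding.
vector1 = RawVectorQuery(
    k=retrieve_num_of_documents,
    fields="contentVector",
    vector=embedding,
)
vector2 = RawVectorQuery(
    k=retrieve_num_of_documents,
    fields="questionVector",
    vector=embedding,
)

# Passing both search_text and vector_queries makes this a hybrid query.
results = client.search(
    search_text=query,
    # search_fields limits which fields the text query searches.
    search_fields=["title", "content", "summary"],
    vector_queries=[vector1, vector2],
    top=retrieve_num_of_documents,
    select=["title", "content", "summary"],
)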

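Run manual multiple queries

You can run multiple queries, such as a vector search and a keyword full-text search, manually. You aggregate the results, rerank the results manually, and return the top results. Consider the following use cases for manual multiple queries:

You use a search platform that doesn't support hybrid searches. You use manual multiple queries to perform your own hybrid search.
You want to run full-text searches against different queries. For example, you might extract keywords from the query and run a full-text search against your keywords metadata field. You might then extract entities and run a query against the entities metadata field.
You want to control the reranking process.
The query requires you to run decomposed subqueries to retrieve grounding data from multiple sources.

Query translation

Query translation is an optional step in the information retrieval phase of a RAG solution. This step transforms or translates a query into an optimized form to retrieve better results. Query translation methods include augmentation, decomposition, rewriting, and Hypothetical Document Embeddings (HyDE).

Query augmentation

Query augmentation is a translation step that makes the query simpler and more usable and enhances the context. You should consider augmentation if your query is short or vague. For example, consider the query "Compare the earnings of Microsoft." That query doesn't include time frames or time units to compare. It only specifies earnings. Now consider an augmented version of the query, such as "Compare the earnings and revenue of Microsoft for the current year by quarter." The new query is clear and specific.

When you augment a query, you maintain the original query but add more context. Don't remove or alter the original query, and don't change the nature of the query.

You can use a language model to augment a query. But you can't augment all queries. If you have context, you can pass it to your language model to augment the query. If you don't have context, you have to determine whether your language model has information that you can use to augment the query. For example, if you use a large language model, like one of the GPT models, you can determine whether information about the query is readily available on the internet. If so, you can use the model to augment the query. Otherwise, you shouldn't augment the query.

The following prompt uses a language model to augment a query. This prompt includes examples for when the query has context and when it doesn't. For more information, see the RAG experiment accelerator GitHub repository.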

Input Processing:

Analyze the input query to identify the core concept or topic.
Check whether the query provides context.
If context is provided, use it as the primary basis for augmentation and explanation.
If no context is provided, determine the likely domain or field, such as science, technology, history, or arts, based on the query.

Query Augmentation:

If context is provided:

Use the given context to frame the query more specifically.
Identify other aspects of the topic not covered in the provided context that enrich the explanation.

If no context is provided, expand the original query by adding the following elements, as applicable:

Include definitions about every word, such as adjective or noun, and the meaning of each keyword, concept, and phrase including synonyms and antonyms.
Include historical context or background information, if relevant.
Identify key components or subtopics within the main concept.
Request information about practical applications or real-world relevance.
Ask for comparisons with related concepts or alternatives, if applicable.
Inquire about current developments or future prospects in the field.

Other Guidelines:

Prioritize information from provided context when available.
Adapt your language to suit the complexity of the topic, but aim for clarity.
Define technical terms or jargon when they're first introduced.
Use examples to illustrate complex ideas when appropriate.
If the topic is evolving, mention that your information might not reflect the very latest developments.
For scientific or technical topics, briefly mention the level of scientific consensus if relevant.
Use Markdown formatting for better readability when appropriate.

Example Input-Output:

Example 1 (With provided context):

Input: "Explain the impact of the Gutenberg Press"
Context Provided: "The query is part of a discussion about revolutionary inventions in medieval Europe and their long-term effects on society and culture."
Augmented Query: "Explain the impact of the Gutenberg Press in the context of revolutionary inventions in medieval Europe. Cover its role in the spread of information, its effects on literacy and education, its influence on the Reformation, and its long-term impact on European society and culture. Compare it to other medieval inventions in terms of societal influence."

Example 2 (Without provided context):

Input: "Explain CRISPR technology"
Augmented Query: "Explain CRISPR technology in the context of genetic engineering and its potential applications in medicine and biotechnology. Cover its discovery, how it works at a molecular level, its current uses in research and therapy, ethical considerations surrounding its use, and potential future developments in the field."
Now, provide a comprehensive explanation based on the appropriate augmented query.

Context: {context}

Query: {query}

Augmented Query:

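Decomposition

Complex queries require more than one collection of data to ground the model. For example, the query "How do electric cars work and how do they compare to internal combustion engine (ICE) vehicles?" likely requires grounding data from multiple sources. One source might describe how electric cars work, and another might compare them to ICE vehicles.

Decomposition is the process of breaking down a complex query into multiple smaller and simpler subqueries. You run each of the decomposed queries independently and aggregate the top results of all the decomposed queries as an accumulated context. You then run the original query, which passes the accumulated context to the language model.

You should determine whether the query requires multiple searches before you run any searches. If you require multiple subqueries, you can run manual multiple queries for all the queries. Use a language model to determine whether multiple subqueries are recommended.

The following prompt categorizes a query as simple or complex. For more information, see the RAG experiment accelerator GitHub repository.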

Consider the given question to analyze and determine whether it falls into one of these categories:

1. Simple, factual question
  a. The question asks for a straightforward fact or piece of information.
  b. The answer can likely be found stated directly in a single passage of a relevant document.
  c. Breaking the question down further is unlikely to be beneficial.
  Examples: "What year did World War 2 end?", "What is the capital of France?", "What are the features of productX?"

2. Complex, multipart question
  a. The question has multiple distinct components or asks for information about several related topics.
  b. Different parts of the question likely need to be answered by separate passages or documents.
  c. Breaking the question down into subquestions for each component provides better results.
  d. The question is open-ended and likely to have a complex or nuanced answer.
  e. Answering the question might require synthesizing information from multiple sources.
  f. The question might not have a single definitive answer and could warrant analysis from multiple angles.
  Examples: "What were the key causes, major battles, and outcomes of the American Revolutionary War?", "How do electric cars work and how do they compare to gas-powered vehicles?"

Based on this rubric, does the given question fall under category 1 (simple) or category 2 (complex)? The output should be in strict JSON format. Ensure that the generated JSON is 100% structurally correct, with proper nesting, comma placement, and quotation marks. There shouldn't be a comma after the last element in the JSON.

Example output:
{
  "category": "simple"
}
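Because the prompt asks for strict JSON, the orchestrator still has to parse and validate the response before it acts on it. The following is a minimal sketch of that step. The function names and the fallback behavior (treat an unparseable answer as not needing decomposition) are assumptions for illustration, not part of the accelerator.

```python
import json

VALID_CATEGORIES = {"simple", "complex"}


def parse_category(model_output: str) -> str:
    """Extract the category from the model's JSON answer.

    Raises ValueError if the output isn't valid JSON or the category is unknown.
    """
    payload = json.loads(model_output)  # json.JSONDecodeError is a ValueError
    if not isinstance(payload, dict):
        raise ValueError("Expected a JSON object")
    category = str(payload.get("category", "")).strip().lower()
    if category not in VALID_CATEGORIES:
        raise ValueError(f"Unexpected category: {category!r}")
    return category


def needs_decomposition(model_output: str, default: bool = False) -> bool:
    """Return True for complex queries; fall back to `default` on bad output."""
    try:
        return parse_category(model_output) == "complex"
    except ValueError:
        return default


simple_result = needs_decomposition('{"category": "simple"}')
complex_result = needs_decomposition('{"category": "complex"}')
```

Choosing the fallback is a design decision. Defaulting to no decomposition keeps latency low when the model misbehaves, at the cost of possibly weaker grounding for a complex query.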

You can also use a language model to decompose a complex query. The following prompt decomposes a complex query. For more information, see the RAG experiment accelerator GitHub repository.

Analyze the following query:

For each query, follow these specific instructions:

- Expand the query to be clear, complete, fully qualified, and concise.
- Identify the main elements of the sentence, typically a subject, an action or relationship, and an object or complement. Determine which element is being asked about or emphasized (usually the unknown or focus of the question). Invert the sentence structure. Make the original object or complement the new subject. Transform the original subject into a descriptor or qualifier. Adjust the verb or relationship to fit the new structure.
- Break the query down into a set of subqueries that have clear, complete, fully qualified, concise, and self-contained propositions.
- Include another subquery by using one more rule: Identify the main subject and object. Swap their positions in the sentence. Adjust the wording to make the new sentence grammatically correct and meaningful. Ensure that the new sentence asks about the original subject.
- Express each idea or fact as a standalone statement that can be understood with the help of the given context.
- Break down the query into ordered subquestions, from least to most dependent.
- The most independent subquestion doesn't require or depend on the answer to any other subquestion or prior knowledge.
- Try having a complete subquestion that has all information only from the base query. There's no other context or information available.
- Separate complex ideas into multiple simpler propositions when appropriate.
- Decontextualize each proposition by adding necessary modifiers to nouns or entire sentences. Replace pronouns, such as it, he, she, they, this, and that, with the full name of the entities that they refer to.
- If you still need more questions, the subquestion isn't relevant and should be removed.

Provide your analysis in the following YAML format, and strictly adhere to the following structure. Don't output anything extra, including the language itself.

type: interdependent
queries:
- [First query or subquery]
- [Second query or subquery, if applicable]
- [Third query or subquery, if applicable]
- ...

Examples:

1. Query: "What is the capital of France?"
type: interdependent
queries:
    - What is the capital of France?

2. Query: "Who is the current CEO of the company that created the iPhone?"
type: interdependent
queries:
    - Which company created the iPhone?
    - Who is the current CEO of Apple? (identified in the previous question)

3. Query: "What is the population of New York City, and what is the tallest building in Tokyo?"
type: multiple_independent
queries:
    - What is the population of New York City?
    - What is the tallest building in Tokyo?

Now, analyze the following query:

{query}
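The prompt returns a small YAML document, so the orchestrator needs to turn it into a list of subqueries. The following sketch parses only the narrow shape that the prompt requests (a type line followed by a queries list). It's a hand-rolled parser that uses the standard library only, written as an illustration. A full YAML library is likely more robust for production use, and the function name is an assumption.

```python
from typing import List, Tuple


def parse_decomposition(model_output: str) -> Tuple[str, List[str]]:
    """Parse 'type: ...' and the '- item' lines under 'queries:'.

    Raises ValueError if the type or the subqueries are missing.
    """
    query_type = None
    queries: List[str] = []
    in_queries = False
    for raw_line in model_output.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("type:"):
            query_type = line[len("type:"):].strip()
            in_queries = False
        elif line == "queries:":
            in_queries = True
        elif in_queries and line.startswith("-"):
            item = line[1:].strip()
            if item and item != "...":
                queries.append(item)
    if query_type is None or not queries:
        raise ValueError("Model output doesn't match the expected format")
    return query_type, queries


sample_output = """type: multiple_independent
queries:
    - What is the population of New York City?
    - What is the tallest building in Tokyo?
"""
decomposition_type, subqueries = parse_decomposition(sample_output)
```

The returned type tells the orchestrator whether the subqueries can run in parallel (multiple_independent) or must run in order (interdependent).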

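Rewriting

An input query might not be in the optimal form to retrieve grounding data. You can use a language model to rewrite the query and achieve better results. Rewrite a query to address the following challenges:

Vagueness
Missing keywords
Unnecessary words
Unclear semantics

The following prompt uses a language model to rewrite a query. For more information, see the RAG experiment accelerator GitHub repository.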

Rewrite the given query to optimize it for both keyword-based and semantic-similarity search methods. Follow these guidelines:

- Identify the core concepts and intent of the original query.
- Expand the query by including relevant synonyms, related terms, and alternate phrasings.
- Maintain the original meaning and intent of the query.
- Include specific keywords that are likely to appear in relevant documents.
- Incorporate natural language phrasing to capture semantic meaning.
- Include domain-specific terminology if applicable to the query's context.
- Ensure that the rewritten query covers both broad and specific aspects of the topic.
- Remove ambiguous or unnecessary words that might confuse the search.
- Combine all elements into a single, coherent paragraph that flows naturally.
- Aim for a balance between keyword richness and semantic clarity.

Provide the rewritten query as a single paragraph that incorporates various search aspects, such as keyword-focused, semantically focused, or domain-specific aspects.

query: {original_query}

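The HyDE technique

HyDE is an alternative information retrieval technique for RAG solutions. Rather than converting a query into embeddings and using those embeddings to find the closest matches in a vector database, HyDE uses a language model to generate answers from the query. These answers are converted into embeddings, which are used to find the closest matches. This process enables HyDE to perform answer-to-answer embedding similarity searches.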
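The flow can be sketched as three calls in sequence: generate a hypothetical answer, embed that answer, and search with the resulting vector. In the following example, the language model, the embedding model, and the vector store are passed in as plain callables. Those callables and the stand-in implementations are assumptions for illustration, so swap in your own clients.

```python
from typing import Callable, List, Sequence


def hyde_search(
    query: str,
    generate_answer: Callable[[str], str],
    embed: Callable[[str], Sequence[float]],
    vector_search: Callable[[Sequence[float], int], List[str]],
    top: int = 5,
) -> List[str]:
    """Search with the embedding of a generated answer instead of the query."""
    hypothetical_answer = generate_answer(query)  # language model call
    answer_vector = embed(hypothetical_answer)    # embed the answer, not the query
    return vector_search(answer_vector, top)      # answer-to-answer similarity


# Stand-ins that record what they receive (hypothetical, for demonstration only).
calls = []


def fake_generate(question: str) -> str:
    calls.append(("generate", question))
    return "Paris is the capital of France."


def fake_embed(text: str) -> List[float]:
    calls.append(("embed", text))
    return [float(len(text))]


def fake_search(vector: Sequence[float], top: int) -> List[str]:
    calls.append(("search", top))
    return ["chunk-1", "chunk-2"][:top]


results = hyde_search("What is the capital of France?", fake_generate, fake_embed, fake_search, top=1)
```

The key point is that the text passed to the embedding model is the generated answer. The original query never reaches the vector store directly.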

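Combine query translations into a pipeline

You can use multiple query translations. You can even use all four of these translations together. The following diagram shows an example of how you can combine these translations into a pipeline.

The pipeline has the following steps:

The optional query augmenter step receives the original query. This step outputs the original query and the augmented query.
The optional query decomposer step receives the augmented query. This step outputs the original query, the augmented query, and the decomposed queries.
Each decomposed query goes through three substeps. After all the decomposed queries go through the substeps, the output includes the original query, the augmented query, the decomposed queries, and an accumulated context. The accumulated context includes the aggregation of the top N results from all the decomposed queries that go through the substeps. The substeps include the following tasks:
1. The optional query rewriter rewrites the decomposed query.
2. The search index processes the rewritten query or the original query. It runs the query by using search types such as vector, full text, hybrid, or manual multiple. The search index can also use advanced query capabilities, such as HyDE.
3. The results are reranked. The top N reranked results are added to the accumulated context.
The original query, along with the accumulated context, goes through the same three substeps as each decomposed query. But only one query goes through the steps, and the caller receives the top N results.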
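The steps above can be sketched as a small orchestration function. Every stage is injected as a callable so that the optional stages can be omitted. The function and parameter names are assumptions for illustration. In particular, the way the final pass merges the accumulated context with the original query's own hits before reranking is one possible interpretation of the last step, not a documented behavior.

```python
from typing import Callable, List, Optional, Sequence


def run_pipeline(
    query: str,
    search: Callable[[str], List[str]],
    rerank: Callable[[str, List[str]], List[str]],
    augment: Optional[Callable[[str], str]] = None,
    decompose: Optional[Callable[[str], List[str]]] = None,
    rewrite: Optional[Callable[[str], str]] = None,
    top_n: int = 3,
) -> List[str]:
    """Augment, decompose, then rewrite/search/rerank each subquery and the original."""
    augmented = augment(query) if augment else query
    subqueries = decompose(augmented) if decompose else []

    def substeps(current_query: str, extra: Sequence[str] = ()) -> List[str]:
        # 1. Optional rewrite. 2. Search. 3. Rerank and keep the top N.
        rewritten = rewrite(current_query) if rewrite else current_query
        candidates = list(dict.fromkeys([*extra, *search(rewritten)]))  # de-duplicate, keep order
        return rerank(current_query, candidates)[:top_n]

    accumulated: List[str] = []
    for subquery in subqueries:
        for hit in substeps(subquery):
            if hit not in accumulated:
                accumulated.append(hit)

    # Assumption: the original query is reranked together with the accumulated context.
    return substeps(query, extra=accumulated)


# Stand-in stages over a tiny in-memory "index" (hypothetical data).
corpus = {
    "how do electric cars work": ["ev-basics", "battery-guide"],
    "how do ice vehicles work": ["ice-basics"],
    "how do electric cars compare to ice vehicles": ["comparison"],
}


def fake_search(text: str) -> List[str]:
    return corpus.get(text.lower().rstrip("?"), [])


def fake_rerank(text: str, docs: List[str]) -> List[str]:
    return sorted(docs)  # placeholder ordering, not a real relevance model


def fake_decompose(text: str) -> List[str]:
    return ["How do electric cars work?", "How do ICE vehicles work?"]


final = run_pipeline(
    "How do electric cars compare to ICE vehicles?",
    fake_search,
    fake_rerank,
    decompose=fake_decompose,
    top_n=3,
)
```

Keeping each stage behind a callable makes it straightforward to experiment with one translation at a time and measure its effect on retrieval quality.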

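Pass images in queries

Some multimodal models, such as GPT-4V and GPT-4o, can interpret images. If you use these models, you can avoid chunking your images and pass the image as part of the prompt to the multimodal model. You should experiment to determine how this approach performs compared to chunking the images without passing extra context. You should also compare the cost difference and do a cost-benefit analysis.

Filter queries

To filter queries, you can use fields in the search store that are configured as filterable. Consider filtering on keywords and entities for queries that use those fields to help narrow down the results. Use filtering to eliminate irrelevant data and retrieve only the data that satisfies specific conditions from an index. This practice improves the overall performance of the query and provides more relevant results. To determine whether filtering benefits your scenario, run experiments and tests. Consider factors such as queries that don't have keywords or queries that have inaccurate keywords, abbreviations, or acronyms.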
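In AI Search, filters are expressed in OData syntax. The following sketch builds a filter that matches documents whose collection field contains any of the extracted keywords. It assumes a hypothetical filterable field named keywords of type Collection(Edm.String), and the helper names are my own.

```python
from typing import Iterable


def odata_string(value: str) -> str:
    """Quote a string literal for OData; single quotes are escaped by doubling."""
    return "'" + value.replace("'", "''") + "'"


def collection_any_filter(field: str, values: Iterable[str]) -> str:
    """Match documents where the collection field contains at least one value."""
    clauses = [f"{field}/any(v: v eq {odata_string(value)})" for value in values]
    return " or ".join(clauses)


filter_expression = collection_any_filter("keywords", ["HNSW", "O'Reilly"])
```

You can then pass the resulting string as the filter argument of the search call, alongside the text or vector query, so that ranking runs only over the documents that pass the filter.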

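Weight fields

In AI Search, you can weight fields to influence the ranking of results based on criteria.

Note

This section describes AI Search weighting capabilities. If you use a different data platform, research the weighting capabilities of that platform.

AI Search supports scoring profiles that contain parameters for weighted fields and functions for numeric data. Scoring profiles apply only to nonvector fields. Support for vector search and hybrid search is in preview. You can create multiple scoring profiles on an index and optionally choose one to use on a per-query basis.

The fields that you weight depend on the type of query and the use case. For example, if the query is keyword-centric, such as "Where is Microsoft headquartered?", you want a scoring profile that weights entity or keyword fields higher. You might use different profiles for different users, let users choose their focus, or choose profiles based on the application.

In production systems, you should maintain only the profiles that you actively use in production.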
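As an illustration of per-query profile selection, the following sketch defines two scoring profiles as plain dictionaries and picks one based on the kind of query. The dictionary shape follows my understanding of the scoringProfiles section of an AI Search index definition, so verify it against the API version that you use. The profile names, field names, and weights are assumptions, not tuned values.

```python
scoring_profiles = [
    {
        # Favors exact terms: useful for keyword-centric questions.
        "name": "keyword-focused",
        "text": {"weights": {"keywords": 5, "entities": 5, "title": 2, "content": 1}},
    },
    {
        # Favors body text: useful for descriptive, open-ended questions.
        "name": "content-focused",
        "text": {"weights": {"content": 3, "summary": 2, "title": 1}},
    },
]


def choose_profile(query_is_keyword_centric: bool) -> str:
    """Pick the name of the scoring profile to send with the query."""
    return "keyword-focused" if query_is_keyword_centric else "content-focused"


profile_for_hq_question = choose_profile(query_is_keyword_centric=True)
```

How you decide that a query is keyword-centric is up to the orchestrator. A language model classification or a simple heuristic are both options to test.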

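Use reranking

Use reranking to run one or more queries, aggregate the results, and rank those results. Consider the following scenarios that benefit from reranking your search results:

You ran manual multiple queries, and you want to aggregate the results and rank them.
Vector and keyword searches aren't always accurate. You want to increase the number of documents that your search returns so that you include valid results that might otherwise be ignored, and then use reranking to evaluate the results.

You can use a language model or a cross-encoder to perform reranking. Some platforms, like AI Search, have proprietary methods to rerank results. To determine what works best for your scenario, evaluate these options against your data. The following sections provide details about these methods.

Language model reranking

The following sample language model prompt reranks results. For more information, see RAG experiment accelerator.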

Each document in the following list has a number next to it along with a summary of the document. A question is also provided.
Respond with the numbers of the documents that you should consult to answer the question, in order of relevance, and the relevance score as a JSON string based on JSON format as shown in the schema section. The relevance score is a number from 1 to 10 based on how relevant you think the document is to the question. The relevance score can be repetitive. Don't output any other text, explanation, or metadata apart from the JSON string. Just output the JSON string, and strip every other text. Strictly remove the last comma from the nested JSON elements if it's present.
Don't include any documents that aren't relevant to the question. There should be exactly one document element.

Example format:
Document 1:
content of document 1
Document 2:
content of document 2
Document 3:
content of document 3
Document 4:
content of document 4
Document 5:
content of document 5
Document 6:
content of document 6
Question: user-defined question

schema:
{
    "documents": {
        "document_1": "Relevance",
        "document_2": "Relevance"
    }
}
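After the model responds, the orchestrator has to turn the JSON into an ordered list of documents. The following is a minimal sketch of that step, assuming the response follows the schema above with keys of the form document_N. The function name and the optional score threshold are assumptions for illustration.

```python
import json
import re
from typing import List, Tuple


def parse_reranked(model_output: str, min_score: float = 0.0) -> List[Tuple[int, float]]:
    """Return (document number, relevance score) pairs, most relevant first.

    Keys that don't look like 'document_N' are skipped.
    """
    payload = json.loads(model_output)
    ranked = []
    for key, score in payload["documents"].items():
        match = re.fullmatch(r"document_(\d+)", key)
        if match is None:
            continue
        numeric_score = float(score)  # accepts 7 as well as "7"
        if numeric_score >= min_score:
            ranked.append((int(match.group(1)), numeric_score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


sample_response = '{"documents": {"document_1": 4, "document_3": 9, "document_2": "7"}}'
ranking = parse_reranked(sample_response)
strong_only = parse_reranked(sample_response, min_score=5)
```

The document numbers map back to the order in which you listed the documents in the prompt, so keep that list available when you build the final context.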

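Cross-encoder reranking

The following example uses a cross-encoder provided by Hugging Face to load the RoBERTa model. It iterates over each chunk and uses the model to calculate similarity and provide a value. It sorts the results and returns the top N results. For more information, see the RAG experiment accelerator GitHub repository.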

from sentence_transformers import CrossEncoder
...

# Load a RoBERTa-based cross-encoder from Hugging Face.
model_name = 'cross-encoder/stsb-roberta-base'
model = CrossEncoder(model_name)

# Score every (query, chunk) pair; a higher score means a closer match.
cross_scores_ques = model.predict(
    [[user_prompt, item] for item in documents],
    apply_softmax=True,
    convert_to_numpy=True,
)

# Take the indices of the k highest scores, best first.
top_indices_ques = cross_scores_ques.argsort()[-k:][::-1]
sub_context = [documents[idx] for idx in top_indices_ques]

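Semantic ranking

AI Search has a proprietary feature called semantic ranking. This feature uses deep learning models that were adapted from Microsoft Bing to promote the most semantically relevant results. For more information, see How semantic ranker works.

Consider other search guidance

Consider the following general guidance when you implement your search solution:

Return the title, summary, source, and raw content fields from a search.
Determine up front whether you need to break down a query into subqueries.
Run vector and text queries on multiple fields. When you receive a query, you don't know whether vector search or text search is better. You also don't know the ideal fields that the vector search or keyword search should search. You can search on multiple fields, potentially with multiple queries, rerank the results, and return the results that have the highest scores.
Filter on keyword and entity fields to narrow down results.
Use keywords along with vector searches. The keywords filter the results to a smaller subset. The vector store works against that subset to find the best matches.

Evaluate your search results

In the preparation phase, you gathered test queries along with test document information. You can use the following information that you gathered in that phase to evaluate your search results:

The query: The sample query
The context: The collection of all the text in the test documents that addresses the sample query

To evaluate your search solution, you can use the following well-established retrieval evaluation methods:

Precision at K: The percentage of the top K search results that are correctly identified relevant items. This metric focuses on the accuracy of your search results.
Recall at K: The percentage of all possible relevant items that appear in the top K results. This metric focuses on the coverage of your search results.
Mean Reciprocal Rank (MRR): The average of the reciprocal ranks of the first relevant answer in your ranked search results. This metric focuses on where the first relevant result occurs in the search results.

You should test positive and negative examples. For the positive examples, you want the metrics to be as close to 1 as possible. For the negative examples, where your data shouldn't be able to address the queries, you want the metrics to be as close to 0 as possible. You should test all your test queries. Average the positive query results and the negative query results to understand how your search results perform in aggregate.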
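The three metrics are short enough to compute directly. The following sketch uses their standard textbook definitions over lists of document IDs. The function names and sample data are assumptions for illustration. Precision at K here divides by K even when fewer than K results come back, which is one common convention.

```python
from typing import List, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top K results that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant items that appear in the top K results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of 1 / rank of the first relevant result (0 if none) per query."""
    if not runs:
        return 0.0
    total = 0.0
    for retrieved, relevant in runs:
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(runs)


# Two test queries with their ranked results and known relevant chunks (hypothetical).
run_1 = (["d3", "d1", "d7", "d2"], {"d1", "d2", "d9"})
run_2 = (["d5", "d6"], {"d5"})

p_at_3 = precision_at_k(run_1[0], run_1[1], k=3)
r_at_3 = recall_at_k(run_1[0], run_1[1], k=3)
mrr = mean_reciprocal_rank([run_1, run_2])
```

For the first query, only one of the top three results is relevant and only one of the three relevant chunks is found. The first relevant result sits at rank 2 for the first query and rank 1 for the second, so the reciprocal ranks average to 0.75.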

Next step

LLM end-to-end evaluation phase