Topic Preservation

Topic preservation measures how well a summary maintains the key topics and main ideas from the source text. This metric helps ensure that important themes and concepts aren't lost during summarization.

Overview

The topic preservation metric uses an LLM to analyze whether the core topics from the original text are retained in the summary. It's particularly useful for evaluating whether summarization maintains the essential meaning and key points of longer documents.

Usage

Here's how to evaluate topic preservation using Assert LLM Tools:

from assert_llm_tools.core import evaluate_summary
from assert_llm_tools.llm.config import LLMConfig

# Configure LLM provider (choose one)
llm_config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    region="us-east-1",
)

# Alternative: OpenAI (uncomment to use instead of the Bedrock config above)
# llm_config = LLMConfig(
#     provider="openai",
#     model_id="gpt-4o-mini",
#     api_key="your-api-key",
# )

# Example texts
full_text = "Climate change is causing rising sea levels and extreme weather events. Scientists warn that immediate action is needed to reduce carbon emissions and prevent catastrophic environmental damage."
summary = "Scientists emphasize the urgent need to address climate change due to its effects on sea levels and weather patterns."

# Evaluate topic preservation
metrics = evaluate_summary(
    full_text,
    summary,
    metrics=["topic_preservation"],
    llm_config=llm_config,
)

# Print results
print(metrics)

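Expected output

evaluate_summary returns a dictionary containing the score and the topic lists behind it. The sample below illustrates the format only: its topics concern space exploration and do not correspond to the climate example above. Actual topics and scores vary with the input text and the LLM used.

{
    'topic_preservation': 0.4,
    'reference_topics': ['Space exploration', 'Astronomy', 'Astrophysics', 'Cosmology', 'Technological advancements'],
    'preserved_topics': ['Astronomy', 'Technological advancements'],
    'missing_topics': ['Space exploration', 'Astrophysics', 'Cosmology']
}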
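
The result can be unpacked by key for logging or review. The following is a minimal sketch, assuming the result carries the keys shown in the sample above. `format_topic_report` is a hypothetical helper written for this illustration and is not part of Assert LLM Tools.

```python
def format_topic_report(result: dict) -> str:
    """Render a topic preservation result as a short plain-text report.

    Hypothetical helper: assumes the keys 'topic_preservation',
    'preserved_topics' and 'missing_topics' seen in the sample output.
    """
    lines = [f"Topic preservation: {result['topic_preservation']:.2f}"]
    for key, label in (("preserved_topics", "Preserved"), ("missing_topics", "Missing")):
        topics = result.get(key, [])
        lines.append(f"{label} ({len(topics)}): {', '.join(topics) or 'none'}")
    return "\n".join(lines)


# The sample output from this page, reused as input
sample = {
    "topic_preservation": 0.4,
    "reference_topics": ["Space exploration", "Astronomy", "Astrophysics",
                         "Cosmology", "Technological advancements"],
    "preserved_topics": ["Astronomy", "Technological advancements"],
    "missing_topics": ["Space exploration", "Astrophysics", "Cosmology"],
}

report = format_topic_report(sample)
print(report)
```

Listing the missing topics next to the score makes it easier to judge whether a low score reflects a real omission or a topic the LLM weighted too heavily.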

Interpretation

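The topic preservation score ranges from 0 to 1:

1.0: Perfect topic preservation (all key topics maintained)
0.0: No topic preservation (none of the key topics retained)

Values in between indicate partial preservation. In the sample output above, 2 of the 5 reference topics are preserved and the score is 0.4.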
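
One way to read the score is as the fraction of reference topics that survive in the summary. The sketch below assumes that ratio, which is consistent with the sample output (2 preserved out of 5 reference topics gives 0.4). The library's actual scoring logic is not described on this page, and `preservation_ratio` is a hypothetical name used only for illustration.

```python
def preservation_ratio(reference_topics, preserved_topics):
    """Fraction of reference topics that also appear among the preserved topics.

    Assumed formula for illustration; not taken from the library's source.
    Topic names are compared case-insensitively after trimming whitespace.
    """
    reference = {t.strip().lower() for t in reference_topics}
    if not reference:
        return 0.0  # no reference topics: nothing to preserve
    preserved = {t.strip().lower() for t in preserved_topics}
    return len(reference & preserved) / len(reference)


score = preservation_ratio(
    ["Space exploration", "Astronomy", "Astrophysics",
     "Cosmology", "Technological advancements"],
    ["Astronomy", "Technological advancements"],
)
print(score)
```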

When to Use

Use topic preservation metrics when:

Evaluating long-form summarization
Ensuring critical themes aren't lost in compression
Validating content maintains its intended focus
Checking for topic drift in multi-step summarization
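
For the multi-step case, topic drift can be traced by evaluating each intermediate summary against the original text and recording where each topic first goes missing. The sketch below assumes each stage's result has the `missing_topics` key shown in the sample output. `first_dropped_at` is a hypothetical helper, and the stage results are made-up values for illustration, not real evaluation output.

```python
def first_dropped_at(stage_results):
    """Map each missing topic to the first stage (1-based) that reported it missing.

    Hypothetical helper: expects one result dict per summarization stage,
    each evaluated against the same original text.
    """
    dropped = {}
    for stage, result in enumerate(stage_results, start=1):
        for topic in result.get("missing_topics", []):
            dropped.setdefault(topic, stage)  # keep the earliest stage only
    return dropped


# Made-up results for two successive summarization passes
stages = [
    {"topic_preservation": 0.75, "missing_topics": ["Cosmology"]},
    {"topic_preservation": 0.5, "missing_topics": ["Cosmology", "Astrophysics"]},
]

drift = first_dropped_at(stages)
print(drift)
```

Scoring every stage against the original text, not against the previous stage's summary, keeps the results comparable across stages.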

Limitations

Requires an LLM provider, which may incur costs
May not capture nuanced relationships between topics
Performance depends on the LLM's understanding of domain-specific topics
Subjective nature of topic importance can affect scores
