Topic Preservation
Topic preservation measures how well a summary maintains the key topics and main ideas from the source text. This metric helps ensure that important themes and concepts aren't lost during summarization.
Overview
The topic preservation metric uses an LLM to analyze whether the core topics from the original text are retained in the summary. It's particularly useful for evaluating whether summarization maintains the essential meaning and key points of longer documents.
Usage
Here's how to evaluate topic preservation using Assert LLM Tools:
from assert_llm_tools.core import evaluate_summary
from assert_llm_tools.llm.config import LLMConfig
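# Configure the LLM provider. Choose ONE of the two options below.

# Option 1: Amazon Bedrock
llm_config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    region="us-east-1",
)

# Option 2: OpenAI (uncomment to use instead of Bedrock)
# llm_config = LLMConfig(
#     provider="openai",
#     model_id="gpt-4o-mini",
#     api_key="your-api-key",
# )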
# Example texts
full_text = "Climate change is causing rising sea levels and extreme weather events. Scientists warn that immediate action is needed to reduce carbon emissions and prevent catastrophic environmental damage."
summary = "Scientists emphasize the urgent need to address climate change due to its effects on sea levels and weather patterns."
# Evaluate topic preservation
metrics = evaluate_summary(
    full_text,
    summary,
    metrics=["topic_preservation"],
    llm_config=llm_config,
)
# Print results
print(metrics)
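Expected output
The result is a dictionary containing the score and the topic lists behind it. The sample below is illustrative: its topics appear to come from a different source text than the climate example above, and LLM-based results can vary between runs.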
{'topic_preservation': 0.4, 'reference_topics': ['Space exploration', 'Astronomy', 'Astrophysics', 'Cosmology', 'Technological advancements'], 'preserved_topics': ['Astronomy', 'Technological advancements'], 'missing_topics': ['Space exploration', 'Astrophysics', 'Cosmology']}
Interpretation
The topic preservation score ranges from 0 to 1:
- 1.0: Perfect topic preservation (all key topics maintained)
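- 0.0: No topic preservation (none of the key topics from the source appear in the summary)
- Values in between indicate partial preservation; lower scores mean more key topics are missing or altered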
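As a rough illustration of how the score relates to the topic lists, the sketch below recomputes it as the fraction of reference topics that were preserved. This ratio is an assumption: it matches the sample output shown earlier (2 of 5 topics, 0.4), but the source does not state how the library calculates the score. The helper name `preserved_fraction` and the 0.5 threshold are hypothetical and not part of Assert LLM Tools.

```python
def preserved_fraction(result: dict) -> float:
    """Fraction of reference topics found in the summary (assumed formula)."""
    reference = result["reference_topics"]
    if not reference:
        return 0.0  # no reference topics extracted; treat as nothing preserved
    return len(result["preserved_topics"]) / len(reference)


# Sample result copied from the expected output shown earlier.
sample = {
    "topic_preservation": 0.4,
    "reference_topics": [
        "Space exploration",
        "Astronomy",
        "Astrophysics",
        "Cosmology",
        "Technological advancements",
    ],
    "preserved_topics": ["Astronomy", "Technological advancements"],
    "missing_topics": ["Space exploration", "Astrophysics", "Cosmology"],
}

ratio = preserved_fraction(sample)
print(f"recomputed: {ratio:.2f}, reported: {sample['topic_preservation']}")

# Hypothetical quality gate: flag summaries that keep fewer than half the topics.
THRESHOLD = 0.5
if sample["topic_preservation"] < THRESHOLD:
    print("Missing topics:", ", ".join(sample["missing_topics"]))
```

A gate like this can sit in a test suite or batch job, with the threshold tuned to how much topic loss is acceptable for the use case.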
When to Use
Use topic preservation metrics when:
- Evaluating long-form summarization
- Ensuring critical themes aren't lost in compression
- Validating that summarized content maintains its intended focus
- Checking for topic drift in multi-step summarization
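For the multi-step case, one way to spot topic drift is to evaluate each intermediate summary against the original source and compare the `missing_topics` lists between consecutive stages. The sketch below assumes each stage's result has the same dictionary shape as the sample output. The stage results are made-up placeholders, and `newly_lost_topics` is a hypothetical helper, not part of Assert LLM Tools.

```python
def newly_lost_topics(stage_results: list[dict]) -> list[set[str]]:
    """For each stage, return topics missing now that were not missing in the previous stage."""
    lost_per_stage = []
    previously_missing: set[str] = set()
    for result in stage_results:
        missing = set(result["missing_topics"])
        lost_per_stage.append(missing - previously_missing)
        previously_missing = missing
    return lost_per_stage


# Placeholder results for two summarization passes over the same source text.
stages = [
    {"topic_preservation": 0.8, "missing_topics": ["Cosmology"]},
    {
        "topic_preservation": 0.4,
        "missing_topics": ["Cosmology", "Astrophysics", "Space exploration"],
    },
]

for index, lost in enumerate(newly_lost_topics(stages), start=1):
    print(f"stage {index}: newly lost {sorted(lost)}")
```

Comparing against the original source at every stage, rather than against the previous summary, keeps the reference topics fixed so the lists stay comparable.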
Limitations
- Requires an LLM provider, which may incur costs
- May not capture nuanced relationships between topics
- Performance depends on the LLM's understanding of domain-specific topics
- Topic importance is subjective, so scores depend on which topics the LLM treats as key
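Because scores depend on the LLM's judgment, one possible mitigation is to repeat the evaluation and look at the mean and spread of the score. The sketch below assumes nothing about the library beyond the `topic_preservation` key in the result. `evaluate` is a caller-supplied function (for example, a wrapper around `evaluate_summary`), and `fake_evaluate` is a stand-in so the example runs without an LLM. Each repetition is a separate LLM call, so this approach multiplies cost.

```python
from statistics import mean, pstdev
from typing import Callable


def score_spread(evaluate: Callable[[], dict], runs: int = 3) -> tuple[float, float]:
    """Call `evaluate` several times and return (mean, population std dev) of the score."""
    scores = [evaluate()["topic_preservation"] for _ in range(runs)]
    return mean(scores), pstdev(scores)


# Stand-in evaluator that yields fixed scores in order; replace it with a real call.
_fake_scores = iter([0.4, 0.6, 0.5])


def fake_evaluate() -> dict:
    return {"topic_preservation": next(_fake_scores)}


average, spread = score_spread(fake_evaluate, runs=3)
print(f"mean={average:.2f}, std={spread:.2f}")
```

A large spread suggests the topic extraction is unstable for that document, in which case a single score should be read with caution.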