
Redundancy Detection

Redundancy detection measures the presence of repeated or unnecessary information in a generated summary. It evaluates whether the summary contains duplicate content or restates information in ways that don't add value.

Overview

The redundancy metric uses an LLM provider to analyze the internal consistency and information density of a summary. It identifies cases where the same concepts or facts are repeated without adding value, helping keep summaries concise and information-dense.

Usage

Here's how to evaluate redundancy using Assert LLM Tools:

from assert_llm_tools.core import evaluate_summary
from assert_llm_tools.llm.config import LLMConfig

metrics = ["redundancy"]

# Configure LLM provider (choose one)

# Option 1: Amazon Bedrock
llm_config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    region="us-east-1"
)

# Option 2: OpenAI
# llm_config = LLMConfig(
#     provider="openai",
#     model_id="gpt-4o-mini",
#     api_key="your-api-key"
# )

# Example texts
full_text = "The company launched a new product in March. Sales exceeded expectations."
summary = "The company launched a new product in March. The product launch happened in March. Sales were very good, exceeding all expectations and performing above expected levels."

# Evaluate redundancy
results = evaluate_summary(
    full_text,
    summary,
    metrics=metrics,
    llm_config=llm_config
)

# Print results
print("\nEvaluation Metrics:")
for metric, score in results.items():
    print(f"{metric}: {score:.4f}")

Interpretation

The redundancy score ranges from 0 to 1:

  • 1.0: No redundancy (optimal information density)
  • 0.0: High redundancy (significant repetition of information)
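
A simple way to act on the score is to flag summaries that fall below a chosen cutoff. A minimal sketch, assuming the returned dictionary is keyed by metric name as in the usage example above; the 0.7 threshold is an arbitrary illustration, not a value recommended by the library:

# Hypothetical cutoff; tune it for your own use case
REDUNDANCY_THRESHOLD = 0.7

score = results["redundancy"]
if score < REDUNDANCY_THRESHOLD:
    print(f"High redundancy (score {score:.4f}) - consider tightening the summary.")
else:
    print(f"Acceptable redundancy (score {score:.4f}).")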

When to Use

Use redundancy detection when:

  • Optimizing summary length and efficiency (see the sketch after this list)
  • Evaluating text generation quality
  • Improving readability of generated content
  • Fine-tuning summarization models
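
For example, when optimizing for length and efficiency you could score several candidate summaries of the same source and keep the one with the highest (least redundant) score. A minimal sketch, reusing full_text and llm_config from the usage example; the candidate texts are made up for illustration:

candidates = [
    "The company launched a new product in March. Sales exceeded expectations.",
    "The company launched a new product in March. The launch was in March, and sales were above expectations.",
]

best_summary, best_score = None, -1.0
for candidate in candidates:
    result = evaluate_summary(
        full_text,
        candidate,
        metrics=["redundancy"],
        llm_config=llm_config
    )
    if result["redundancy"] > best_score:
        best_summary, best_score = candidate, result["redundancy"]

print(f"Least redundant candidate (score {best_score:.4f}): {best_summary}")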

Limitations

  • Requires an LLM provider, which may incur costs
  • May not catch subtle forms of redundancy
  • Context-dependent interpretation might vary
  • Results can vary based on the LLM model used