# Advanced Usage

## Custom Metric Combinations

### Selecting Specific Metrics

```python
metrics = evaluate_summary(
    full_text,
    summary,
    metrics=["rouge", "bleu", "faithfulness"],
    llm_config=config,
)
```
### Available Metrics

- Non-LLM: `rouge`, `bleu`, `bert_score`, `bart_score`
- LLM-based: `faithfulness`, `topic_preservation`, `redundancy`, `conciseness`
## Stopword Handling

### Adding Custom Stopwords

```python
from assert_llm_tools.utils import add_custom_stopwords

# Add domain-specific stopwords
add_custom_stopwords(["specific", "domain", "terms"])

# Use in evaluation
metrics = evaluate_summary(
    full_text,
    summary,
    remove_stopwords=True,
)
```
## BERT Score Configuration

```python
metrics = evaluate_summary(
    full_text,
    summary,
    bert_model="microsoft/deberta-xlarge-mnli",  # default is "microsoft/deberta-base-mnli"
)
```
## Progress Tracking

### Disabling the Progress Bar

```python
metrics = evaluate_summary(
    full_text,
    summary,
    show_progress=False,
)
```
## Batch Processing

### Processing Multiple Summaries

```python
from assert_llm_tools.core import batch_evaluate_summaries

summaries = [
    "First summary...",
    "Second summary...",
    "Third summary...",
]

results = batch_evaluate_summaries(
    full_text,
    summaries,
    metrics=["rouge", "bleu"],
    show_progress=True,
)
```
## Error Handling

```python
from assert_llm_tools.exceptions import LLMProviderError

try:
    metrics = evaluate_summary(
        full_text,
        summary,
        llm_config=config,
    )
except LLMProviderError as e:
    print(f"LLM provider error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
## Performance Optimization

### Memory Usage

- Use a smaller BERT model to reduce the memory footprint
- Split large batches into smaller chunks
- Clean up resources after batch processing
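The chunking advice can be sketched generically. In the sketch below, `evaluate_fn` stands in for a wrapper around `batch_evaluate_summaries`; the helper and its parameter names are illustrative, not part of the library:

```python
def evaluate_in_chunks(evaluate_fn, full_text, summaries, chunk_size=8):
    """Run a batch evaluator over small chunks to bound peak memory."""
    results = []
    for start in range(0, len(summaries), chunk_size):
        chunk = summaries[start:start + chunk_size]
        results.extend(evaluate_fn(full_text, chunk))
    return results

# Example with a stand-in evaluator; in practice, pass a wrapper around
# batch_evaluate_summaries configured with your chosen metrics.
stub = lambda text, batch: [{"rouge": 0.0} for _ in batch]
scores = evaluate_in_chunks(stub, "full text...", ["s1", "s2", "s3", "s4", "s5"], chunk_size=2)
```

Smaller chunks lower peak memory at the cost of a little extra per-chunk overhead.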
### Speed Optimization

- Disable metrics you do not need
- Use faster (smaller) LLM models during development
- Cache results for repeated evaluations
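Caching can be as simple as memoizing on a hash of the inputs. This sketch wraps any evaluator function; the helper name and cache scheme are illustrative, not a library feature:

```python
import hashlib
import json

_cache = {}

def cached_evaluate(evaluate_fn, full_text, summary, **options):
    """Return a cached result when the same (text, summary, options) was seen before."""
    key = hashlib.sha256(
        json.dumps([full_text, summary, sorted(options.items())], default=str).encode()
    ).hexdigest()
    if key not in _cache:
        _cache[key] = evaluate_fn(full_text, summary, **options)
    return _cache[key]

# A repeated call with identical inputs skips the expensive evaluation.
calls = []
stub = lambda text, summary, **kw: calls.append(1) or {"rouge": 0.0}
cached_evaluate(stub, "text", "summary", metrics=["rouge"])
cached_evaluate(stub, "text", "summary", metrics=["rouge"])
```

Note that caching is only safe for deterministic metrics; LLM-based scores can vary between runs.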
## Logging and Debugging

```python
import logging

# Configure logging
logging.basicConfig(level=logging.DEBUG)

# Evaluate with detailed logs
metrics = evaluate_summary(
    full_text,
    summary,
    show_progress=True,
)
```
## Model Weight Management

### BERT Score Models

- Base model (~500 MB): `microsoft/deberta-base-mnli`
- Large model (~3 GB): `microsoft/deberta-xlarge-mnli`

### BART Score Model

- Default model (~1.6 GB): `facebook/bart-large-cnn`
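The size table above can drive model selection when memory is constrained. The helper below is purely illustrative (not part of the library); its result is passed as the `bert_model` argument shown earlier:

```python
def pick_bert_model(memory_budget_gb: float) -> str:
    """Pick the largest BERT-score model that fits a rough memory budget in GB."""
    if memory_budget_gb >= 3.0:
        return "microsoft/deberta-xlarge-mnli"  # ~3 GB, higher quality
    return "microsoft/deberta-base-mnli"       # ~500 MB, lower footprint

# e.g. evaluate_summary(..., bert_model=pick_bert_model(2.0))
```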
## API Rate Limiting

### OpenAI

```python
from assert_llm_tools.llm.config import LLMConfig

config = LLMConfig(
    provider="openai",
    model_id="gpt-4",
    api_key="your-api-key",
    rate_limit=20,  # requests per minute
)
```

### Amazon Bedrock

```python
config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    rate_limit=10,  # requests per minute
)
```