Advanced Usage
Custom Metric Combinations
Selecting Specific Metrics
metrics = evaluate_summary(
    full_text,
    summary,
    metrics=["rouge", "bleu", "faithfulness"],
    llm_config=config
)
Available Metrics
- Non-LLM: rouge, bleu, bert_score, bart_score, comet_score, comet_qe_score
- LLM-based: factual_alignment, coverage, factual_consistency, faithfulness, topic_preservation, redundancy, conciseness
Verbose Mode
All LLM-based metrics accept a verbose parameter; when set to True, the results include a detailed analysis of the evaluation:
# Summary evaluation with verbose output
results = evaluate_summary(
    full_text,
    summary,
    metrics=["coverage", "factual_consistency", "factual_alignment"],
    llm_config=config,
    verbose=True
)
# Access detailed analysis
if 'claims_analysis' in results:
    for claim_data in results['claims_analysis']:
        print(f"Claim: {claim_data['claim']}")
        print(f"Status: {claim_data['is_covered']}")  # or 'is_supported' depending on metric
Verbose Output by Metric
Coverage (verbose=True):
- Returns claims_analysis: a list of {claim, is_covered} entries showing which reference claims appear in the summary
Factual Consistency (verbose=True):
- Returns claims_analysis: a list of {claim, is_supported} entries showing which summary claims are supported by the source
Factual Alignment (verbose=True):
- Returns coverage_claims_analysis: detailed coverage claim analysis
- Returns consistency_claims_analysis: detailed consistency claim analysis
Topic Preservation (verbose=True):
- Returns topics_analysis: a list of {topic, is_preserved} entries
- Returns preserved_topics: a list of topics found in the summary
- Returns missing_topics: a list of topics missing from the summary
Conciseness (verbose=True):
- Returns statistical_score, llm_score, and compression_ratio
- Returns word and sentence counts
Redundancy (verbose=True):
- Returns redundant_pairs: a list of sentence pairs with similarity scores
Faithfulness (verbose=True):
- Returns claims_analysis: a list of {claim, is_covered} entries for claim verification
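The per-claim lists above are easy to aggregate yourself. The sketch below uses a mock results dict with the claims_analysis shape documented here (not the library itself) to compute an overall coverage ratio:

```python
# Aggregate a verbose claims_analysis output into an overall ratio.
# `results` is a mock with the documented {claim, is_covered} shape.
results = {
    "claims_analysis": [
        {"claim": "Revenue grew 10% in Q3.", "is_covered": True},
        {"claim": "The CEO announced a new product.", "is_covered": False},
        {"claim": "Headcount remained flat.", "is_covered": True},
    ]
}

claims = results["claims_analysis"]
covered = [c for c in claims if c["is_covered"]]
ratio = len(covered) / len(claims)

print(f"Covered {len(covered)}/{len(claims)} claims ({ratio:.0%})")
# Covered 2/3 claims (67%)
```

The same loop works for is_supported or is_preserved keys, depending on the metric.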
Stopword Handling
Adding Custom Stopwords
from assert_llm_tools.utils import add_custom_stopwords
# Add domain-specific stopwords
add_custom_stopwords([
    "specific",
    "domain",
    "terms"
])
# Use in evaluation
metrics = evaluate_summary(
    full_text,
    summary,
    remove_stopwords=True
)
BERT Score Configuration
metrics = evaluate_summary(
    full_text,
    summary,
    bert_model="microsoft/deberta-xlarge-mnli"  # Default is "microsoft/deberta-base-mnli"
)
Progress Tracking
Disable Progress Bar
metrics = evaluate_summary(
    full_text,
    summary,
    show_progress=False
)
Batch Processing
Processing Multiple Summaries
from assert_llm_tools.core import batch_evaluate_summaries
summaries = [
    "First summary...",
    "Second summary...",
    "Third summary..."
]

results = batch_evaluate_summaries(
    full_text,
    summaries,
    metrics=["rouge", "bleu"],
    show_progress=True
)
Error Handling
from assert_llm_tools.exceptions import LLMProviderError
try:
    metrics = evaluate_summary(
        full_text,
        summary,
        llm_config=config
    )
except LLMProviderError as e:
    print(f"LLM provider error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
Performance Optimization
Memory Usage
- Use smaller BERT models for lower memory footprint
- Process large batches in smaller chunks
- Clean up resources after batch processing
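The chunking advice above can be sketched in plain Python. The helper name and chunk size are illustrative, not part of the library:

```python
# Split a large batch of summaries into fixed-size chunks so each
# call to batch_evaluate_summaries (or any evaluator) stays small.
def chunked(items, size):
    """Yield successive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

summaries = [f"Summary {i}" for i in range(10)]

batches = list(chunked(summaries, 4))
print([len(b) for b in batches])
# [4, 4, 2]
```

Evaluating one chunk at a time keeps peak memory bounded by the chunk size rather than the full batch.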
Speed Optimization
- Disable unnecessary metrics
- Use faster LLM models for development
- Cache results for repeated evaluations
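Caching repeated evaluations can be as simple as a hash-keyed dict. The wrapper below is an illustrative sketch, not something the library provides:

```python
import hashlib

_cache = {}

def cached_evaluate(full_text, summary, evaluate_fn):
    """Return a cached result for an identical (text, summary) pair."""
    key = hashlib.sha256(f"{full_text}\x00{summary}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = evaluate_fn(full_text, summary)
    return _cache[key]

calls = []
def fake_evaluate(text, summary):
    calls.append(1)           # count how often we actually evaluate
    return {"rouge": 0.5}     # stand-in for real metrics

cached_evaluate("doc", "sum", fake_evaluate)
cached_evaluate("doc", "sum", fake_evaluate)  # served from cache
print(len(calls))
# 1
```

In real use, evaluate_fn would call evaluate_summary; the cache skips repeat LLM calls for identical inputs.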
Logging and Debugging
import logging
# Configure logging
logging.basicConfig(level=logging.DEBUG)
# Evaluate with detailed logs
metrics = evaluate_summary(
    full_text,
    summary,
    show_progress=True
)
Model Weight Management
BERT Score Models
- Base model (~500MB): microsoft/deberta-base-mnli
- Large model (~3GB): microsoft/deberta-xlarge-mnli
BART Score Model
- Default model (~1.6GB): facebook/bart-large-cnn
API Rate Limiting
OpenAI
from assert_llm_tools.llm.config import LLMConfig
config = LLMConfig(
    provider="openai",
    model_id="gpt-4",
    api_key="your-api-key",
    rate_limit=20  # requests per minute
)
Amazon Bedrock
config = LLMConfig(
    provider="bedrock",
    model_id="anthropic.claude-v2",
    rate_limit=10  # requests per minute
)