Content Sources
Tatry provides access to a wide range of content sources, from free public resources to premium copyrighted content. This guide explains how these sources are managed and used.
Available Sources
Free Sources
- Wikipedia
- Open access academic papers
- Public domain books
- Government documents
- Open documentation
Premium Sources
- Academic journals
- Professional publications
- Industry reports
- News archives
- Specialized databases
Source Management
Listing Available Sources
from tatry import TatryRetriever
retriever = TatryRetriever(api_key="your-api-key")
# List all available sources
sources = retriever.list_sources()
# List only free sources
free_sources = retriever.list_sources(type="free")
# List only premium sources
premium_sources = retriever.list_sources(type="premium")
The same listings are available through the REST API with curl (URLs are quoted so the shell does not interpret the `?`):
# List all sources
curl -X GET "https://api.tatry.dev/v1/sources" \
  -H "Authorization: Bearer your-api-key"
# List free sources
curl -X GET "https://api.tatry.dev/v1/sources?type=free" \
  -H "Authorization: Bearer your-api-key"
# List premium sources
curl -X GET "https://api.tatry.dev/v1/sources?type=premium" \
  -H "Authorization: Bearer your-api-key"
Source Selection
# Use specific sources
retriever = TatryRetriever(
    api_key="your-api-key",
    allowed_sources=["wikipedia", "academic_journals"]
)
# Exclude sources
retriever = TatryRetriever(
    api_key="your-api-key",
    excluded_sources=["news_archives"]
)
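To make the allow/exclude semantics concrete, here is a minimal sketch of how such filtering could behave on a plain list of source names. `filter_sources` is a hypothetical helper, not part of the Tatry client, and the choice that exclusions win over the allow-list is an assumption made for illustration.

```python
from typing import Iterable, Optional


def filter_sources(
    available: Iterable[str],
    allowed: Optional[Iterable[str]] = None,
    excluded: Optional[Iterable[str]] = None,
) -> list[str]:
    """Hypothetical helper: apply an allow-list and an exclude-list to source names."""
    allowed_set = set(allowed) if allowed is not None else None
    excluded_set = set(excluded or ())
    return [
        name
        for name in available
        # Assumption: a name must be allowed (if an allow-list exists) and not excluded.
        if (allowed_set is None or name in allowed_set) and name not in excluded_set
    ]


available = ["wikipedia", "academic_journals", "news_archives"]
only_allowed = filter_sources(available, allowed=["wikipedia", "academic_journals"])
without_news = filter_sources(available, excluded=["news_archives"])
```

Both calls keep `wikipedia` and `academic_journals` here; passing neither argument returns every available source unchanged.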
Source Priority
TatryRetriever selects sources according to three groups of criteria:
- Relevance Based
  - Matches query intent to source type
  - Evaluates source quality
  - Considers content freshness
- Cost Optimization
  - Prefers free sources when appropriate
  - Balances cost vs. quality
  - Respects budget constraints
- User Preferences
  - Honors source restrictions
  - Applies source weights
  - Follows content filters
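The criteria above could be combined in many ways; the actual selection logic is not documented here. As a rough sketch under our own assumptions, the example below scores each candidate as relevance times quality times user weight, subtracts a cost penalty, and drops restricted sources. `SourceCandidate`, `rank_sources`, and the `cost_penalty` parameter are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SourceCandidate:
    name: str
    relevance: float       # 0..1, assumed to come from query/source matching
    quality_score: float   # 0..1, as in the source metadata
    cost_per_query: float  # 0.0 for free sources
    weight: float = 1.0    # user preference weight


def rank_sources(candidates, cost_penalty=1.0, blocked=()):
    """Illustrative ranking, not Tatry's algorithm: higher score comes first."""
    eligible = [c for c in candidates if c.name not in blocked]

    def score(c):
        return c.relevance * c.quality_score * c.weight - cost_penalty * c.cost_per_query

    return sorted(eligible, key=score, reverse=True)


candidates = [
    SourceCandidate("wikipedia", relevance=0.7, quality_score=0.8, cost_per_query=0.0),
    SourceCandidate("academic_journals", relevance=0.9, quality_score=0.95, cost_per_query=0.05),
    SourceCandidate("news_archives", relevance=0.4, quality_score=0.7, cost_per_query=0.02),
]
ranked = [c.name for c in rank_sources(candidates, blocked=["news_archives"])]
cost_sensitive = [c.name for c in rank_sources(candidates, cost_penalty=10.0)]
```

With the default penalty the premium journal source ranks first; raising `cost_penalty` pushes the free source to the top, which mirrors the cost versus quality trade-off described in the list.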
Source Metrics
Each source provides metadata:
# Get source details
source_info = retriever.get_source_details("academic_journals")
print(source_info)
{
"name": "academic_journals",
"type": "premium",
"cost_per_query": 0.05,
"update_frequency": "daily",
"coverage_topics": ["science", "technology", "medicine"],
"quality_score": 0.95
}
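As a rough illustration of how the `cost_per_query` field could feed a budget estimate, the sketch below multiplies it by an expected query count. `estimate_cost` is a hypothetical helper, and the dictionary copies the example output shown above; actual charges depend on the billing model.

```python
def estimate_cost(info: dict, query_count: int) -> float:
    """Hypothetical estimate: per-query cost times the expected number of queries."""
    return round(info["cost_per_query"] * query_count, 2)


# Same shape as the example output of get_source_details
source_info = {
    "name": "academic_journals",
    "type": "premium",
    "cost_per_query": 0.05,
    "update_frequency": "daily",
    "coverage_topics": ["science", "technology", "medicine"],
    "quality_score": 0.95,
}

estimated = estimate_cost(source_info, query_count=200)
```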
Content Quality
Quality Assurance
- Source Verification
  - Validated content providers
  - Regular quality audits
  - Content freshness monitoring
- Content Standards
  - Accuracy requirements
  - Completeness checks
  - Attribution verification
Quality Metrics
# Get quality metrics for a source
metrics = retriever.get_source_metrics("academic_journals")
print(metrics)
{
"accuracy_score": 0.98,
"completeness_score": 0.95,
"update_frequency": "daily",
"peer_reviewed": True
}
Source Configuration
Global Settings
retriever = TatryRetriever(
    api_key="your-api-key",
    source_settings={
        "prefer_free": True,
        "min_quality_score": 0.8,
        "max_age_days": 365
    }
)
Per-Query Settings
docs = retriever.get_relevant_documents(
"quantum computing advances",
source_filters={
"type": ["academic", "research"],
"peer_reviewed": True,
"max_age_days": 90
}
)
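To show what filters of this shape could mean for an individual document, here is a minimal sketch that checks one document's metadata against a `source_filters` dictionary. The function `passes_filters` and the metadata fields `type`, `peer_reviewed`, and `published` are assumptions made for illustration; the service applies its filters server-side and may interpret them differently.

```python
from datetime import date


def passes_filters(doc_meta: dict, filters: dict, today: date) -> bool:
    """Hypothetical check of one document's metadata against source_filters."""
    allowed_types = filters.get("type")
    if allowed_types is not None and doc_meta.get("type") not in allowed_types:
        return False
    if filters.get("peer_reviewed") and not doc_meta.get("peer_reviewed", False):
        return False
    max_age_days = filters.get("max_age_days")
    if max_age_days is not None:
        age_days = (today - doc_meta["published"]).days
        if age_days > max_age_days:
            return False
    return True


filters = {"type": ["academic", "research"], "peer_reviewed": True, "max_age_days": 90}
today = date(2025, 1, 31)  # fixed date so the example is reproducible
recent_paper = {"type": "academic", "peer_reviewed": True, "published": date(2024, 12, 15)}
old_paper = {"type": "academic", "peer_reviewed": True, "published": date(2023, 6, 1)}
news_item = {"type": "news", "peer_reviewed": False, "published": date(2025, 1, 30)}

kept = [d for d in (recent_paper, old_paper, news_item) if passes_filters(d, filters, today)]
```

Only `recent_paper` survives: the older paper exceeds the 90-day age limit, and the news item fails the type and peer-review checks.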
Usage Optimization
Cost Management
# Set source weights for cost optimization
retriever.set_source_weights({
"wikipedia": 1.0, # Free
"academic_journals": 0.5, # Use less due to cost
"news_archives": 0.3 # Use sparingly
})
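How the service interprets these weights is not specified here. One plausible reading, shown purely as a sketch, is that weights are relative and can be normalized into an expected share of queries per source. `normalize_weights` is a hypothetical helper.

```python
def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Hypothetical helper: convert relative weights into shares that sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: weight / total for name, weight in weights.items()}


shares = normalize_weights({
    "wikipedia": 1.0,
    "academic_journals": 0.5,
    "news_archives": 0.3,
})
```

Under this reading, `wikipedia` would receive a little over half of the queries (1.0 out of a total weight of 1.8), with the two paid sources splitting the rest.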
Quality vs Cost
# Optimize for quality
retriever.configure_source_priority(
priority="quality",
budget_limit=10.0
)
# Optimize for cost
retriever.configure_source_priority(
priority="cost",
min_quality_score=0.7
)
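The two priority modes can be pictured as different tie-breaking rules over the same set of eligible sources. The sketch below is an assumption about what such a policy might look like, not the documented behavior: `choose_source` is a hypothetical function, and the source dictionaries reuse the `quality_score` and `cost_per_query` fields from the metadata example.

```python
def choose_source(sources, priority, budget_remaining=None, min_quality_score=0.0):
    """Hypothetical policy: pick one source name by quality or by cost."""
    eligible = [
        s for s in sources
        if s["quality_score"] >= min_quality_score
        and (budget_remaining is None or s["cost_per_query"] <= budget_remaining)
    ]
    if not eligible:
        return None
    if priority == "quality":
        # Highest quality first; cheaper source wins a tie.
        best = max(eligible, key=lambda s: (s["quality_score"], -s["cost_per_query"]))
    elif priority == "cost":
        # Cheapest first; higher quality wins a tie.
        best = min(eligible, key=lambda s: (s["cost_per_query"], -s["quality_score"]))
    else:
        raise ValueError(f"unknown priority: {priority!r}")
    return best["name"]


sources = [
    {"name": "wikipedia", "quality_score": 0.8, "cost_per_query": 0.0},
    {"name": "academic_journals", "quality_score": 0.95, "cost_per_query": 0.05},
]
for_quality = choose_source(sources, "quality", budget_remaining=10.0)
for_cost = choose_source(sources, "cost", min_quality_score=0.7)
nearly_broke = choose_source(sources, "quality", budget_remaining=0.01)
```

Here the quality mode selects the premium source while budget remains, and falls back to the free source once the remaining budget no longer covers a premium query.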
Best Practices
- Source Selection
  - Start with free sources
  - Add premium sources as needed
  - Monitor source usage
- Quality Control
  - Set minimum quality thresholds
  - Regular source evaluation
  - Track content freshness
- Cost Management
  - Use source weights
  - Set budget limits
  - Monitor usage patterns
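Monitoring usage and enforcing a budget can also be done on the client side. The following is a minimal sketch of such a tracker; `UsageTracker` is a hypothetical class that is not part of the Tatry client, and the per-query costs passed to it are assumed to come from the source metadata.

```python
from collections import defaultdict


class UsageTracker:
    """Hypothetical client-side tracker for per-source spend against a budget."""

    def __init__(self, budget_limit: float) -> None:
        self.budget_limit = budget_limit
        self.spend = defaultdict(float)  # source name -> total cost
        self.queries = defaultdict(int)  # source name -> number of queries

    def record(self, source: str, cost: float) -> None:
        self.spend[source] += cost
        self.queries[source] += 1

    @property
    def total_spend(self) -> float:
        return sum(self.spend.values())

    def over_budget(self) -> bool:
        return self.total_spend > self.budget_limit


tracker = UsageTracker(budget_limit=0.10)
tracker.record("wikipedia", 0.0)
tracker.record("academic_journals", 0.05)
tracker.record("academic_journals", 0.05)
within = not tracker.over_budget()   # spend equals the limit, not above it
tracker.record("academic_journals", 0.05)
exceeded = tracker.over_budget()     # a third paid query passes the limit
```

The per-source counters in `queries` and `spend` give a simple view of usage patterns that can inform later adjustments to source weights.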
Next Steps
- Understand our Billing Model
- Explore our Guides for detailed tutorials