v16.1.0
Audio & Image Generation Expansion
Author: Sree
This update adds 10 new image generation models, audio processing features (transcription, text-to-speech, and audio chat), and web search integration for research tasks.
Provider 4 Image Generation Models (5 models)
- Free Tier Models: Cost-effective image generation from leading providers
  - Black Forest Labs:
  - Leonardo AI:
  - Stability AI:
- Basic Tier Models: Professional-grade image generation
  - Leonardo AI:
  - Stability AI:
Provider 5 Image Generation Models (5 models)
- Free Tier Models: Accessible image generation capabilities
  - Black Forest Labs:
- Basic Tier Models: Enhanced generation options
  - Black Forest Labs:
  - HiDream:
- Pro Tier Models: Advanced image generation capabilities
  - HiDream:
  - Qwen:
Web Search Integration Models (6 models)
- Basic Tier Models: Web search with vision and function calling
  - Core Model:
  - Dated Variant:
- Pro Tier Models: Advanced web search capabilities
  - Core Model:
  - Dated Variant:
- Ultra Tier Exclusive: Premium reasoning with web search
  - Core Model:
  - Dated Variant:
Audio Transcription Models (4 models)
- Free Tier Models: Industry-standard speech-to-text
- Basic Tier Models: Advanced transcription capabilities
- Pro Tier Models: Premium transcription with diarization
  - Standard Transcription:
  - With Speaker Identification:
Text-to-Speech Models (4 models)
- Basic Tier Models: High-quality voice synthesis
  - Core Model:
  - Dated Variant:
- Pro Tier Models: Premium HD voice synthesis
  - Core Model:
  - Dated Variant:
Multimodal Audio Chat Models (9 models)
- Basic Tier Models: Native audio processing in conversations
  - Core Model:
  - Dated Variant:
- Pro Tier Models: Advanced multimodal audio capabilities
Model Capabilities
- Image Generation: 10 new models across Provider 4 and Provider 5 from Black Forest Labs, Leonardo AI, Stability AI, HiDream, and Qwen.
- Web Search Integration: Real-time information access through GPT models with vision and function calling support.
- Audio Transcription: Industry-leading speech-to-text with Whisper and GPT-4o models, including speaker diarization capabilities.
- Audio Synthesis: High-quality text-to-speech with standard and HD variants, plus native audio processing in chat conversations.
Platform Enhancements
- Provider 4 Expansion: Enhanced infrastructure with 5 new image generation models from leading AI providers.
- Provider 5 Audio Suite: Comprehensive audio processing capabilities including transcription, synthesis, and multimodal chat.
Important Notes
- Tier Access: Each model is available in the tier it is listed under and in every higher tier. The Free tier includes 4 of the new image models, the Basic tier adds 4 more (8 total), and the Pro tier provides access to all 10.
- Audio Features: Most audio capabilities require at least the Basic tier. Advanced features such as HD synthesis and diarization require the Pro tier.
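
The tier access rule in these notes can be illustrated with a short Python sketch. It is not the platform's implementation: the `Tier` enum, the `has_access` and `image_models_available` helpers, and the Free < Basic < Pro < Ultra ordering are assumptions made for illustration. Only the per-tier image model counts (4, 4, and 2) come from this release.

```python
from enum import IntEnum


class Tier(IntEnum):
    # Hypothetical ordering, assumed from the tier names used in these notes.
    FREE = 0
    BASIC = 1
    PRO = 2
    ULTRA = 3


# New image models introduced at each tier in this release
# (Provider 4 and Provider 5 combined): 4 + 4 + 2 = 10.
NEW_IMAGE_MODELS = {Tier.FREE: 4, Tier.BASIC: 4, Tier.PRO: 2}


def has_access(user_tier: Tier, model_tier: Tier) -> bool:
    """A model is usable at its listed tier and at every tier above it."""
    return user_tier >= model_tier


def image_models_available(user_tier: Tier) -> int:
    """Count the new image models that a given tier can use."""
    return sum(
        count
        for model_tier, count in NEW_IMAGE_MODELS.items()
        if has_access(user_tier, model_tier)
    )


for tier in Tier:
    print(tier.name, image_models_available(tier))
# FREE 4
# BASIC 8
# PRO 10
# ULTRA 10
```

Modeling tiers as an ordered enum keeps the "available in all higher tiers" rule to a single comparison, so a model only needs to record the lowest tier it is listed under.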