v16.1.0 • November 2, 2025

Audio & Image Generation Expansion

Sree, Author

This release expands the platform's creative and multimodal capabilities with 10 new image generation models and a full set of audio features (transcription, text-to-speech, and audio chat), along with web search integration for research tasks.

Provider 4 Image Generation Models

Free Tier Models: Cost-effective image generation from leading providers
- Black Forest Labs:
  - provider-4/flux-schnell
- Leonardo AI:
  - provider-4/phoenix
- Stability AI:
  - provider-4/sdxl-lite
Basic Tier Models: Professional-grade image generation
- Leonardo AI:
  - provider-4/lucid-origin
- Stability AI:
  - provider-4/sdxl

Provider 5 Image Generation Models

Free Tier Models: Accessible image generation capabilities
- Black Forest Labs:
  - provider-5/flux-fast
Basic Tier Models: Enhanced generation options
- Black Forest Labs:
  - provider-5/flux-pro
- HiDream:
  - provider-5/hidream-i1-fast
Pro Tier Models: Advanced image generation capabilities
- HiDream:
  - provider-5/hidream-i1
- Qwen:
  - provider-5/qwen-image

Web Search Integration Models

Basic Tier Models: Web search with vision and function calling
- Core Model:
  - provider-5/gpt-4o-mini-search-preview
- Dated Variant:
  - provider-5/gpt-4o-mini-search-preview-2025-03-11
Pro Tier Models: Advanced web search capabilities
- Core Model:
  - provider-5/gpt-4o-search-preview
- Dated Variant:
  - provider-5/gpt-4o-search-preview-2025-03-11
Ultra Tier Exclusive: Premium reasoning with web search
- Core Model:
  - provider-5/gpt-5-search-api
- Dated Variant:
  - provider-5/gpt-5-search-api-2025-10-14
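The identifiers above appear to follow a `<provider>/<model>` pattern, with dated variants adding a `-YYYY-MM-DD` suffix to the core model name. The sketch below splits an identifier along those lines. The pattern is inferred from the names listed here rather than from a documented specification, and `ModelId` and `parse_model_id` are hypothetical helper names used only for illustration.

```python
import re
from typing import NamedTuple, Optional


class ModelId(NamedTuple):
    provider: str          # e.g. "provider-5"
    base: str              # core model name without any date suffix
    dated: Optional[str]   # "YYYY-MM-DD" suffix if present, else None


# Assumption: a dated variant ends in "-YYYY-MM-DD". Short suffixes such as
# the "-1106" in "tts-1-1106" are deliberately not treated as dates here.
_DATED_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<date>\d{4}-\d{2}-\d{2})$")


def parse_model_id(model_id: str) -> ModelId:
    """Split 'provider/name[-YYYY-MM-DD]' into its parts."""
    provider, _, name = model_id.partition("/")
    if not provider or not name:
        raise ValueError(f"expected '<provider>/<model>', got {model_id!r}")
    match = _DATED_SUFFIX.match(name)
    if match:
        return ModelId(provider, match.group("base"), match.group("date"))
    return ModelId(provider, name, None)
```

Under these assumptions, a dated variant and its core model share the same `base`, which makes it straightforward to group them or to pin a specific dated variant in configuration.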

Audio Transcription Models

Free Tier Models: Industry-standard speech-to-text
- provider-5/whisper-1
Basic Tier Models: Advanced transcription capabilities
- provider-5/gpt-4o-mini-transcribe
Pro Tier Models: Premium transcription with diarization
- Standard Transcription:
  - provider-5/gpt-4o-transcribe
- With Speaker Identification:
  - provider-5/gpt-4o-transcribe-diarize

Text-to-Speech Models

Basic Tier Models: High-quality voice synthesis
- Core Model:
  - provider-5/tts-1
- Dated Variant:
  - provider-5/tts-1-1106
Pro Tier Models: Premium HD voice synthesis
- Core Model:
  - provider-5/tts-1-hd
- Dated Variant:
  - provider-5/tts-1-hd-1106

Multimodal Audio Chat Models

Basic Tier Models: Native audio processing in conversations
- Core Model:
  - provider-5/gpt-4o-mini-audio-preview
- Dated Variant:
  - provider-5/gpt-4o-mini-audio-preview-2024-12-17
Pro Tier Models: Advanced multimodal audio capabilities
- GPT Audio Series:
- GPT-4o Audio Preview Series:

Model Capabilities

Image Generation: 10 new models across Provider 4 and Provider 5 from Black Forest Labs, Leonardo AI, Stability AI, HiDream, and Qwen.
Web Search Integration: Real-time information access through GPT models with vision and function calling support.
Audio Transcription: Industry-leading speech-to-text with Whisper and GPT-4o models, including speaker diarization capabilities.
Audio Synthesis: High-quality text-to-speech with standard and HD variants, plus native audio processing in chat conversations.

Platform Enhancements

Provider 4 Expansion: Enhanced infrastructure with 5 new image generation models from leading AI providers.
Provider 5 Audio Suite: Comprehensive audio processing capabilities including transcription, synthesis, and multimodal chat.

Important Notes

Tier Access: Each model is also available in every tier above the one it is listed under. For image generation, the Free tier includes 4 models, the Basic tier adds 4 more (8 total), and the Pro tier adds the remaining 2, giving access to all 10.
Audio Features: Most audio capabilities require at least the Basic tier (provider-5/whisper-1 is the exception and is available on the Free tier). Advanced features such as HD synthesis and diarization require the Pro tier.
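
The Tier Access note can be illustrated with a short sketch that computes which image models a given tier can use. The model lists are copied from the sections above. The tier ordering (Free, Basic, Pro, Ultra) and the names `TIER_ORDER`, `IMAGE_MODELS_BY_TIER`, and `available_image_models` are assumptions made for illustration, not part of the platform's API.

```python
# Assumed ordering from lowest to highest tier.
TIER_ORDER = ["free", "basic", "pro", "ultra"]

# Image models grouped by the tier at which they first become available,
# as listed in this release.
IMAGE_MODELS_BY_TIER = {
    "free": [
        "provider-4/flux-schnell",
        "provider-4/phoenix",
        "provider-4/sdxl-lite",
        "provider-5/flux-fast",
    ],
    "basic": [
        "provider-4/lucid-origin",
        "provider-4/sdxl",
        "provider-5/flux-pro",
        "provider-5/hidream-i1-fast",
    ],
    "pro": [
        "provider-5/hidream-i1",
        "provider-5/qwen-image",
    ],
}


def available_image_models(tier: str) -> list[str]:
    """Return every image model usable at `tier`, including lower-tier models."""
    rank = TIER_ORDER.index(tier.lower())  # raises ValueError for unknown tiers
    models: list[str] = []
    for lower_or_equal in TIER_ORDER[: rank + 1]:
        models.extend(IMAGE_MODELS_BY_TIER.get(lower_or_equal, []))
    return models
```

With these lists, the Free tier resolves to 4 models, Basic to 8, and Pro to all 10, matching the counts stated in the note. Ultra adds no image models in this release, so it also resolves to 10.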