Latest AI Models in One API: Access GPT, Gemini, Claude & 400+ Models | Best AI API
Wan 2.6 Video
Alibaba Cloud
It enables users to generate cinematic, short-form videos from text descriptions, images, or reference materials, combining visual coherence, realistic motion, and synchronized audio in a single workflow.
Type
Video
Per 1M tokens
0.195
Additional price
0.13
Usage price
Gemini 3 Flash
Google
Gemini 3 Flash Preview is Google’s fast multimodal LLM API for agents, coding, and docs with pro-level control.
Chat
0.65
GPT Image 1.5
OpenAI
GPT Image 1.5 is OpenAI’s image generation model, built to produce crisp images that closely follow your prompt and to support dependable editing and variations.
Image
10.4
6.5
GPT-5.2
With significant upgrades in reasoning, vision, coding, and long-context handling, GPT-5.2 delivers state-of-the-art performance across professional, scientific, and engineering domains, while maintaining strong safety and reliability standards.
2.275
Z-Image Turbo
It delivers professional-grade image synthesis for commercial and creative applications with speed and efficiency at its core.
0.007
Claude Opus 4.5
Anthropic
It excels in software engineering and agentic workflows, and supports advanced tool use and large context windows.
Gemini 3 Pro Image (Nano Banana Pro)
Gemini 3 Pro Image, also known as Nano Banana Pro, is Google DeepMind’s latest state-of-the-art text-to-image generation model.
GPT-5.1
It focuses on making the user experience more natural and the model’s reasoning behaviour more adaptive.
1.625
Qwen3 VL Plus
It is optimized for real-time dialog systems, analytics platforms, and visual assistant applications.
0.26
Veo 3.1
The model produces fully synchronized audiovisual content, with support for various aspect ratios and high-definition output.
0.52
GPT-5 Pro
While it requires more computational resources and higher cost, the payoff is exceptional performance on complex and mission-critical tasks.
19.5
Sora 2
With dedicated Text-to-Video and Image-to-Video APIs, it offers flexible entry points for both concept-driven storytelling and visual iteration.
Qwen Max
Alibaba's Qwen Max is a powerful multimodal AI with an OpenAI-compatible API, notable for instruction-following stability.
2.08
Qwen Plus
Discover Qwen-Plus, Alibaba's advanced multilingual model designed for complex tasks and detailed analysis.
1.56
Grok Imagine Image Pro
xAI
Imagine it. Generate it. Ship it.
0.091
Grok Imagine Image
Imagine Anything. Generate Everything.
0.026
Seedream 5.0 Lite
ByteDance
Every major capability has been redesigned from first principles, moving image generation from a creative toy into a genuine production instrument.
0.046
GPT-5.3 Codex
One model. Everything you do on a computer.
Code
Gemini 3.1 Flash Image (Nano Banana 2)
Google's fastest high-resolution AI image model built on Gemini 3.1 Flash.
78
0.325
MiniMax-M2.5
MiniMax
A production-ready large language model built for text generation and conversational AI, now with a high-speed variant for real-time applications.
0.39
Qwen3.5 Plus
A frontier-class hosted model built for the agentic AI era. One million tokens of context. Native vision-language architecture. Adaptive reasoning at industrial scale.
Gemini 3.1 Pro
Built for complex tasks, creative breakthroughs, and intelligent agents that plan and act across long horizons.
2.6
Claude Sonnet 4.6
Positioned as the core workhorse in the Claude 4.6 family, it powers everything from autonomous agents to enterprise document workflows.
3.9
GLM-OCR
Zhipu AI
GLM-OCR stands out by combining state-of-the-art computer vision with intelligent structure detection, delivering 95%+ accuracy on real-world documents.
OCR
0.013
GLM-5
Designed to power modern AI products, GLM-5 API delivers strong performance across text generation, structured outputs, code understanding, and complex analytical tasks.
1.3
Inworld TTS‑1‑MAX
Inworld AI
With 8.8 billion parameters and a Transformer‑based autoregressive architecture, it delivers near‑human speech quality, fine‑grained emotional control, and rich custom voices tailored to your brand or characters.
Voice
10.5
Inworld TTS-1.5-Mini
For developers building responsive AI characters, this model offers the best trade-off of cost, speed, and quality in Inworld's lineup, outperforming competitors in latency-sensitive environments.
5.25
Kimi K2.5
Moonshot AI
Its native vision and language integration enables seamless content creation, visual analysis, and code generation from design mockups.
0.78
Claude Opus 4.6
With support for massive context windows and scalable agent-based workflows, Claude Opus 4.6 handles tasks beyond standard language models.
48.75
13
Kling Video v3 Standard
Kling AI
It emphasizes speed, accessibility, and predictable results, making it well suited for rapid content creation and iterative workflows.
0.328
0.218
Kling Video v3 Pro
Designed with flexibility in mind, Kling Video v3 Pro supports narrative control, multi-scene composition, and native sound generation, making it a powerful foundation for modern video pipelines.
0.437
0.291
OpenClaw
It runs close to your systems, orchestrates complex workflows, and steadily adapts to the way you operate.
GPT-5.2 Codex
By combining agent-centric design, configurable reasoning depth, and multimodal understanding, it moves beyond traditional code generation and into true software engineering collaboration.
1.8375
ERNIE 4.5 0.3B
Baidu
With a small parameter footprint and modern transformer architecture, it is designed for developers and researchers who need efficient text generation and language understanding without the computational cost of large-scale models.
0
ERNIE X1
ERNIE X1 is Baidu’s deep-thinking reasoning model, engineered to solve complex problems, code, and mathematical tasks with transparent, step-by-step logic while remaining highly cost-efficient for large-scale deployment.
0.2145
ERNIE 5.0
With advanced reasoning, multimodal understanding, and global language support, ERNIE 5.0 is redefining how AI interacts with complex data.
1.2298
MiMo-V2-Flash
Xiaomi
Its long-context capabilities make it an excellent choice for document analysis, knowledge extraction, and large-scale summarization.
0.1107
ERNIE 4.5 VL
ERNIE 4.5 VL is a series of vision‑language models (VLMs) built on Baidu’s ERNIE 4.5 multimodal MoE architecture, jointly trained on text and images for rich perception and reasoning.
0.6435
ERNIE 4.5
The model family includes the reasoning-focused 21B Thinking variant, the standard 21B model, and the high-capacity 300B model, each optimized for different workloads and deployment scales.
0.0936
ByteDance Seed 1.8
It combines information retrieval, coding assistance, and task orchestration in a single framework.
MiniMax-M2.1
Engineered for speed and precision, it delivers top-tier multilingual code generation with clean, actionable outputs.
GLM-4.7
GLM-4.7 is Z.AI’s latest flagship large language model, purpose-built for agentic coding, stable multi-step reasoning, and high-performance interactive workflows.
GPT-5.2 Chat Latest
Optimized for everyday professional and educational workflows that demand speed, clarity, and reliability without sacrificing conversational fluency.
GPT-5.2 Pro
With enhanced long-context understanding, premium vision capabilities, and robust tool integration, it excels in coding, data analysis, and expert-level research.
27.3
DeepSeek-V3.2 Speciale
DeepSeek
The model is optimized for controlled reasoning, interpretability, and developer-focused workflows.
0.36855
Grok 4.1 Fast
It powers complex analytics, fluid chat, and real-time data integration.
Gemini 3 Pro Preview
Engineered for versatility, speed, and intelligence across text, code, audio, images, and structured data.
5.2
GPT-5.1 Chat Latest
It balances intelligence with human-like communication, making AI interactions more enjoyable and effective for diverse use cases.
Hermes 4 405B
NousResearch
Its hybrid reasoning mode allows users to switch between fast, direct responses and deep, step-by-step analysis, making it highly adaptable for diverse use cases.
0.41145
Nemotron Nano 12B V2 VL
NVIDIA
Optimized for low-latency deployment, it excels in optical character recognition (OCR), chart reasoning, document comprehension, and long-form video analysis.
0.2743
Nemotron Nano 9B V2
Designed for developers and enterprises seeking fast inference with minimal hardware overhead, it excels in chat interfaces, content augmentation, and lightweight agents.
0.05486
Kimi K2 0905 Preview
Its ultra-long context window of 262,144 tokens enables deep understanding and processing of extremely large documents and extended multi-turn dialogues.
Qwen3 VL 32B Instruct
Its optimized instruction-following makes it ideal for platforms prioritizing enhanced user experience in visual data understanding, creative content generation, and interactive visual assistance.
0.91
Wan 2.6
Alibaba Cloud’s state-of-the-art diffusion-based image generation model, engineered to produce photorealistic and highly detailed visuals from text prompts.
0.039
FLUX.2 Max Edit
Black Forest Labs
FLUX.2 Max Edit transforms one or multiple reference images using simple prompts, preserving composition, lighting, and identity while applying high-fidelity changes suitable for final delivery, not just drafts.
FLUX.2 Max
FLUX.2 Max is Black Forest Labs' flagship text-to-image model, delivering exceptional photorealism, anatomical accuracy, and prompt fidelity for professional-grade visuals in production pipelines.
Seedream 4.5
It excels in rendering sharp, legible text directly within images, ideal for branded and advertising content.
0.052
Kling Image O1
Built for creators who demand semantic intelligence, visual coherence, and professional-grade results without complex workflows.
0.036
Z-Image Turbo LoRA
It combines Turbo’s speed with LoRA-based style adaptation, allowing flexible control over illustration, toon, and branded visual styles without compromising performance.
0.011
Flux 2 LoRA
Its ability to combine speed with the precision of LoRA fine-tuning sets it apart as a leading model in AI-powered image generation technology.
0.027
FLUX.2 [pro]
Perfect for designers, marketers, and creative teams seeking speed, consistency, and professional-grade results.
FLUX.2
FLUX.2 is an advanced text-to-image generative model developed by Black Forest Labs.
0.016
GPT Image 1 Mini
Its optimized infrastructure ensures high throughput and low latency, ideal for embedding into various digital products and creative pipelines.
0.676
Hunyuan3D Part
Tencent
It enables altering, retouching, or extending specific regions of an image while seamlessly preserving the original scene’s style, lighting, and coherence.
Wan 2.2 Flash
Its optimized architecture ensures smooth, real-time interaction, making it ideal for dynamic applications like content creation tools, live design assists, and AI-powered artistic platforms.
0.033
Wan 2.2 Plus
Designed for creators, artists, and developers, it offers a powerful solution for generating visually striking images from textual descriptions with impressive fidelity and nuance.
0.065
Wan 2.5 Preview
Its flexible dimension support and high-quality output make it ideal for use in creative apps, marketing tools, content management systems, and design software.
Sharpen-Generative
Topaz Labs
Utilize adjustable sharpening strength, noise reduction, and face enhancement parameters to balance creativity and realism per project needs.
0.005
Sharpen | AI Image
Balances speed, precision, and ease of use, making it highly recommended for photographers, creators, and imaging professionals seeking the best sharpening accuracy and efficiency available.
Grok-2 Image
Fast, accurate, and context-aware AI for creative and professional visual content.
HunyuanImage 3.0
The model supports understanding and rendering multi-thousand-word prompts and creates clear, legible text within images, making it ideal for diverse creative applications.
Reve Remix Image
Reve
Its superior handling of detailed prompts and embedded text, combined with commercial rights, makes it an ideal solution for creators and businesses aiming for impactful visual content with minimum hassle.
Reve Edit Image
Its high accuracy in prompt adherence, superior text rendering, and 4K image quality set it apart from competitors.
Reve Create Image
Its distinct advantages in text rendering and multi-object composition make it ideal for professional marketers and designers seeking high-quality, ready-to-use visuals.
0.031
Qwen Image Edit
It supports bilingual text editing in English and Chinese, enabling complex scene adjustments, style transfers, and seamless visual edits while preserving image consistency.
0.059
Flux SRPO Image‑to‑Image
The model supports spatially-aware localized editing, enabling targeted modifications without compromising surrounding areas.
Flux SRPO
The model offers fine-grained control over composition and scene elements, enabling precise and customizable image synthesis.
Imagen 4.0 Ultra Generate
It is ideal for professional use cases requiring exceptional image fidelity and precise textual elements across a variety of formats and creative workflows.
0.078
Imagen 4.0 Fast Generate
This model offers near real-time generation speeds of around 1.5 to 2.5 seconds per image while maintaining excellent fidelity and text rendering quality, ideal for rapid iteration workflows.
Imagen 4.0 Generate
Imagen 4 Generate-001 is ideal for marketing, design, publishing, and real-time content generation applications requiring photorealistic visuals and accurate text rendering.
USO
Its scalable design enables efficient batch processing and on-demand generation for applications ranging from marketing to gaming.
Seedream 4 Edit
The model is designed for professional and enterprise workflows, providing ultra-fast 2K image generation with precise, natural language-driven editing controls.
Magic Video
Magic Inc.
This approach positions Magic as a comprehensive tool compared to more specialized, single-function AI video models.
Veo 3.1 Fast Extend Video
With optional audio synthesis (on/off toggle), Veo 3.1 Fast empowers creators to tailor multimodal experiences without compromising quality or workflow agility.
Veo 3.1 Extend Video
Extend Video focuses on temporal extrapolation, making it uniquely suited for extending real or synthetic clips without abrupt visual shifts or narrative breaks.
Pixverse v5.5 Image-to-Video
PixVerse
This image-to-video model excels in generating high-quality clips up to 10 seconds long across multiple resolutions.
PixVerse V5.5 Text-to-Video
Create dynamic, dialogue-rich scenes with automatic shot transitions, emotional voice delivery, and precise visual framing.
Kling 2.6 Pro Text-to-Video
Kling 2.6 Pro delivers high-fidelity AI video generation with synchronized audio, ideal for social creatives, ads, promos, and rapid prototyping.
0.182
Kling 2.6 Pro Image-to-Video
Designed for creators, marketers, and developers, this model delivers production-ready results with minimal latency.
Kling Video O1 Video-to-Video Reference
Its Video-to-Video Reference mode enables creators to generate new, coherent video clips that preserve motion dynamics, cinematic language, and visual identity from source footage.
Kling Video O1 Image to Video
It leverages a unified multi-modal engine for superior consistency in complex scenes.
0.118
Kling Video O1 Video to Video Edit
It combines multi-angle character replacement, environment swapping, and full motion integrity in a single prompt.
Kling Video O1 Reference-to-Video
It uses advanced feature extraction to preserve visual identity such as appearance, texture, and style across entirely new scenarios and motions.
0.146
LTXV 2 Fast
LTXV
It offers exceptional speed for rapid video iterations without compromising visual quality, enabling users to produce sharp, realistic clips quickly.
0.208
LTXV 2
It powers professional creative workflows with near real‑time generation, 4K‑ready output, and flexible modes optimized for both speed and fidelity.
OmniHuman v1.5
This model excels in synchronizing lip movements, facial expressions, and subtle behavioral cues with the emotional tone and rhythm of the audio, producing lifelike avatars ideal for interactive and multimedia applications.
Kling AI Avatar Pro
It delivers smooth, professional-quality animations that maintain consistent realism.
0.15
Kling AI Avatar Standard
It enables precise lip-syncing, natural facial expressions, and lively articulation, suitable for diverse applications such as video presentations, virtual hosts, customer avatars, and digital dubbing.
0.073
Seedance 1.0 Pro Fast | Bytedance | Video Generation
It's the top choice for rapid social media ads and content prototyping, offering cinematic results.
3.25
Hailuo 2.3 Fast
It delivers professional-quality video clips with rapid turnaround, enabling creative workflows that demand both performance and efficiency.
0.247
0.416
Hailuo 2.3
It enables users to craft visually realistic videos with configurable parameters such as resolution, style, and duration.
0.364
0.728
Ray Flash 2
Luma AI
Its optimized architecture supports fast iteration, preview, and production workflows for professional video, VFX, and multimedia applications.
0.002
Ray 2
Its balance of power and accessibility makes it a standout model for creators seeking to innovate with AI-driven cinematic video production.
0.008
Ray 1.6
It supports natural language instructions to produce dynamic camera movements and seamless animations without the need for complex editing skills.
0.003
Krea WAN 14B
Krea
With Krea WAN 14B, users can generate and edit high-quality videos in real time simply by using text descriptions or storyboards.
Kandinsky 5 Distill
Sber AI
This model is ideal for developers, content creators, and researchers who need to generate video content from text prompts efficiently.
Kandinsky 5 Standard
It specializes in converting textual descriptions into photorealistic video clips featuring rich artistic styles and high-detail animations.
Veo 3.1 Fast
Google’s high-speed AI video generation model, optimized for low-latency and large-scale production workflows.
VEED Fabric-1.0 Fast
VEED
This AI tool revolutionizes video content creation with speed, realism, and cost efficiency.
VEED Fabric 1.0
VEED Fabric 1.0 supports multiple video formats and resolutions and can be combined with other VEED features such as subtitles, voice translation, and video editing to streamline content production pipelines.
0.104
Sora 2 Pro
Designed for filmmakers, content strategists, and enterprise creative teams, Sora 2 Pro blends state-of-the-art video realism with intuitive controls and extended runtime.
Wan 2.2 14B Animate Replace
It enables seamless substitution of people in existing footage, maintaining natural motion, facial expressions, and scene lighting.
Wan 2.2 14B Animate Move
Developed by Alibaba as part of the Wan 2.2 family, it is widely used for AI avatars, virtual influencers, and animation production acceleration.
GPT-5.1 Codex Mini
Its advanced training and multimodal features enable seamless integration into development pipelines, boosting productivity and code quality.
GPT-5.1 Codex
Unlike general-purpose language models, it focuses on producing clean, maintainable, and executable code following developer instructions precisely.
Grok Code Fast 1
Its massive 256,000-token context window allows for handling large codebases and multi-file projects without truncation, making it ideal for complex coding workflows.
Qwen3 Coder
Qwen3-Coder is a cutting-edge AI model with a 262K token context window, designed for advanced text-to-text coding and instruction-based workflows. It offers robust integration support for seamless automation and software development at scale.
1.95
Qwen 2.5 Coder 32B Instruct
Discover Qwen2.5 Coder 32B Instruct: An open-source coding LLM for efficient code solutions.
0.84
Replit-Code-v1 (3B)
Replit
Access Replit's 2.7B-parameter code completion model, along with 100+ AI models. It supports 20 programming languages.
0.105
CodeGen2 (16B)
Salesforce
CodeGen2-16B: A colossal language model developed by Salesforce AI Research for advanced program synthesis tasks.
0.21
CodeGen2 (7B)
Access CodeGen2 (7B) API: A 7 billion parameter autoregressive language model, capable of generating and completing code in 12 programming languages and most popular frameworks.
StarCoder (16B)
BigCode
Explore the power of StarCoder API, a 15.5B parameter model, ideal for generating code across 80+ programming languages with unparalleled depth.
0.315
SQLCoder (15B)
Defog AI
SQLCoder API, a state-of-the-art language model excelling in transforming natural language queries into precise SQL commands. Perfect for developers and data analysts!
Phind Code LLaMA v2 (34B)
Phind
Phind Code LLaMA v2 (34B) API transforms coding by automating generation, debugging, and translating code across multiple languages.
WizardCoder Python v1.0 (34B)
WizardLM
Transform your Python development with WizardCoder Python v1.0 (34B) API, an AI model that revolutionizes code writing, debugging, and optimization with its vast knowledge base and analytical power.
Code Llama Instruct (34B)
Meta
Elevate your coding with Code Llama Instruct (34B) API, an AI model specialized in following complex instructions and generating precise code. Ideal for developers seeking high-level programming assistance.
0.815
Code Llama Instruct (7B)
Elevate your coding experience with Code Llama Instruct (7B) API. This AI model provides precise code generation and instructions compliance, making coding more efficient and accessible for developers of all levels.
Code Llama Python (13B)
The Code Llama Python (13B) API is a high-performance AI model designed to automate and enhance Python programming tasks. With 13 billion parameters, it excels in generating code, debugging, and providing programming insights.
0.231
Code Llama Instruct (13B)
Transform your coding processes with Code Llama Instruct (13B) API, an AI model specialized in understanding and executing programming instructions. With 13 billion parameters, it offers nuanced code generation and problem-solving capabilities, setting new standards in AI-assisted development.
Code Llama Python (34B)
Experience next-level code generation with Code Llama Python (34B) API. This model offers deeper insights and more complex code solutions, enhancing your programming projects with AI efficiency.
Code Llama Python (7B)
Unlock the power of AI in your coding projects with Code Llama Python (7B) API. This model accelerates code writing, debugs, and suggests optimizations effortlessly.
Deepseek Coder Instruct (33B)
Empower your development with Deepseek Coder Instruct (33B) API, a state-of-the-art AI model with 33 billion parameters designed for coding instruction and automation.
Code Llama (70B)
Elevate your coding with Code Llama (70B) API. This massive open source model is designed to understand and generate code across multiple programming languages.
0.945
Code Llama Python (70B)
Unlock the full potential of AI in coding with Code Llama Python (70B) API. This 70 billion parameter model specializes in understanding and generating Python code, offering unparalleled assistance in software development.
Code Llama Instruct (70B)
CodeLlama-70B-Instruct API: Meta's AI model tailored for code tasks, excelling in code completion and chatbot applications. Suitable for research and commercial use.
Inworld TTS-1-Max
Inworld TTS-1-Max is a high-fidelity, transformer-based neural text-to-speech model optimized for interactive and emotionally expressive voice synthesis.
Inworld TTS-1
A next-generation neural text-to-speech (TTS) model developed by Inworld AI, engineered specifically for dynamic, real-time conversational experiences within games, virtual agents, and immersive applications.
GPT-4o Mini Transcribe
Its advanced pretraining and reinforcement learning techniques make it ideal for real-time transcription in voice agents, call centers, and interactive audio applications.
0.63
GPT-4o Transcribe
It excels in handling diverse speech patterns and long audio contexts, making it an excellent choice for developers building accurate and scalable voice-enabled applications.
GPT Audio Mini
It provides robust, natural-sounding speech output while maintaining efficiency, enabling voice interactivity on devices with limited resources.
21
GPT Audio
Whether recognizing complex utterances, synthesizing expressive responses, or reasoning across modalities, it remains remarkably responsive and adaptable.
67.2
33.6
MiniMax Speech 2.6 Turbo
The Turbo version is finely optimized for real-time applications requiring expressive voices with minimal delay.
MiniMax Speech 2.6 HD
The model is optimized for high-definition audio output, supporting studio-grade prosody, breath control, and smooth phrasing.
Octave 2
Hume AI
It comprehends meaning and emotion, delivering unparalleled voice quality and expressiveness.
TTS-1 | Text-to-Speech
It delivers swift, real-time audio generation with minimal latency, making it especially suitable for live conversational agents and interactive applications.
0.02
TTS-1 HD | Text to speech
The model balances quality and latency, making it suitable for demanding voice synthesis applications.
0.032
GPT-4o mini TTS
By enabling dynamic control over voice attributes like accent and emotion, this model surpasses many traditional TTS systems in naturalness and user customization.
0.001
Deepgram Aura 2
Deepgram
With high concurrency support and cost-efficient pricing, Aura 2 enables seamless, clear, and responsive voice AI interactions for industries like finance, healthcare, and customer support.
Qwen3-Omni Captioner
It accepts audio input and returns rich text captions in real-time or batch mode without requiring input prompts.
4.953
3.213
4
Qwen3 TTS Flash
It excels in real-time applications, delivering clear, versatile speech suitable for conversational AI, audiobooks, and accessibility tools.
Eleven v3 Alpha
ElevenLabs
Its flexible prompting and tone control features allow developers to customize outputs for conversational agents, content automation, and multilingual use cases.
0.234
VibeVoice 1.5B
Microsoft
The model supports fine-grained control over tone, pace, emotion, and language, making it an ideal choice for businesses aiming for high-quality, scalable speech generation solutions.
VibeVoice 7B
Its advanced neural architecture enables seamless integration into a wide range of voice-driven applications, from virtual assistants to interactive storytelling and accessibility tools.
MiniMax Speech 2.5 HD
Its cutting-edge technology enables seamless integration across a wide range of voice-driven applications, from interactive assistants to multimedia production.
MiniMax Speech 2.5 Turbo
Designed for scalability, it fits effortlessly into applications spanning media, entertainment, education, and customer service environments.
Universal-2 by Assembly AI
Assembly AI
Universal is designed for seamless integration into diverse speech-to-text workflows, enabling accurate and efficient transcription across multiple languages and audio conditions.
0.006
Slam 1
It offers substantial gains in accuracy and adaptability, directly improving transcription workflows in complex real-world environments.
ElevenLabs Turbo v2.5
With support for 120+ languages and low-latency inference, it sets a new standard for responsive, natural-sounding text-to-speech applications.
0.117
ElevenLabs Multilingual v2
With support for 29+ languages and near-human prosody, it delivers studio-quality audio for global applications.
Chat GPT 4o mini audio preview
GPT-4o Mini Audio adds speech-to-text and text-to-speech abilities to the efficient GPT-4o Mini model, optimized for voice interfaces in smaller applications.
Chat GPT 4o audio preview
GPT-4o Audio Preview is OpenAI's latest flagship model capable of understanding and generating text and audio in real-time, designed for natural conversation and auditory tasks.
Deepgram Aura
Deepgram Aura: A real-time TTS model delivering human-like voices for responsive, high-throughput conversational AI agents and applications via API.
Deepgram Nova-2
Deepgram Nova-2 API features enhanced accuracy, multilingual support, and rapid transcription across various applications.
Whisper
OpenAI's Whisper API offers robust, multilingual speech-to-text capabilities, trained on diverse data, free for commercial use under the MIT license.
MiniMax Music 2.0
The model excels in capturing vocal emotions and instrumental dynamics with realistic sound expression, allowing flexible switching between diverse singing and emotional styles.
Music
MiniMax Music 1.5
With the ability to produce long, fully arranged songs featuring natural vocals and ethnic instruments, it excels in diverse cultural and genre contexts.
Eleven Music
It supports multimodal inputs, diverse genres, and neural audio synthesis for media, gaming, and entertainment applications.
0.455
Google Lyria2 | Text to Audio
Google's Lyria 2 is an advanced AI model that generates professional-grade, instrumental music from text prompts, offering creators fine-tuned controls.
MiniMax Music
Hailuo AI
Discover MiniMax Music, an AI model that transforms text into captivating music with advanced style learning capabilities and multiple genre support.
Stable Audio
Stability AI
Discover Stable Audio by Stability AI, an advanced audio generation model that creates high-quality tracks from text prompts with innovative features.
Suno — AI for Music Creators [Deprecated]
Suno AI
Suno AI API generates realistic music from text prompts, supporting diverse genres, languages, and seamless integration into applications.
Qwen Text Embedding v4
Qwen Text Embedding v4 is a state-of-the-art multilingual embedding model optimized for semantic search and retrieval tasks.
Embedding
0.074
Qwen Text Embedding v3
Built on Qwen3 foundations, it prioritizes long-context understanding and semantic accuracy for real-world applications.
Textembedding-gecko@001
Discover the textembedding-gecko@001 model API: features, technical specifications, usage guidelines, and ethical considerations for developers.
0.0325
Textembedding-gecko@003
Explore Textembedding-gecko@003 API, a powerful text embedding model by Google, designed for diverse NLP applications and high performance.
Textembedding-gecko-multilingual@001
Explore the textembedding-gecko-multilingual@001 model API, its architecture, training data, performance, and applications in NLP tasks.
Text multilingual embedding 002
Discover Text-multilingual-embedding-002 API, a powerful model for multilingual text embeddings, enhancing NLP applications across languages.
Voyage Large 2 Instruct
Voyage AI
Voyage Large 2 Instruct API: A top-performing, instruction-tuned text embedding model for retrieval, classification, and clustering tasks.
0.156
Text Embedding Ada 002
text-embedding-ada-002 API delivers consistent text embeddings, ideal for search, clustering, and recommendation applications at an affordable price.
Text Embedding 3 Large
Text-embedding-3-large API provides top-tier text embeddings with customizable dimensions, delivering exceptional accuracy for complex applications.
0.169
Text Embedding 3 Small
text-embedding-3-small API enhances text representation, offering better accuracy and cost-efficiency compared to its predecessor, text-embedding-ada-002.
BAAI-Bge-Base-1p5
BAAI
Utilize the API of the BAAI-Bge-Base-1p5 model to generate detailed language embeddings, enhancing the accuracy and depth of your linguistic analyses and applications.
Bert Base Uncased
Unlock the potential of natural language processing with BERT Base Uncased API, a fundamental model in AI for creating powerful and nuanced language embeddings, facilitating a deep understanding of text.
Sentence-BERT
Sentence Transformers
Discover Sentence-BERT API, a cutting-edge model designed for creating sentence embeddings that capture deep semantic meanings, facilitating enhanced text comparison and analysis.
M2-BERT-Retrieval-32k
Together
Transform your data search and retrieval processes with M2-BERT-Retrieval-32k, featuring advanced AI capabilities for navigating vast datasets and delivering precise information swiftly.
UAE-Large-V1
WhereIsAI
Leverage the power of Universal Angle Embedding with UAE-Large-V1 API, an AI model designed to provide advanced vector embeddings for a variety of AI applications, enhancing machine learning accuracy and efficiency.
BAAI-Bge-Large-1p5
Elevate your language processing tasks with BGE-Large-EN-v1.5 API, a state-of-the-art embedding model designed to capture nuanced linguistic features and semantics, significantly improving language understanding and analysis.
M2-BERT-Retrieval-8k
Elevate your search capabilities with M2-BERT-Retrieval-8k, an AI model optimized for fast and accurate information retrieval. Ideal for powering advanced search engines and data analysis tools.
M2-BERT-Retrieval-2K
Enhance your search capabilities with M2-BERT-Retrieval-2K API, an AI model optimized for rapid and accurate information retrieval in smaller datasets.
WizardLM 2-8 (22B) (Deprecated)
Discover Microsoft’s WizardLM 2-8 (22B), an advanced language model optimized for multilingual conversations and complex reasoning tasks with high efficiency.
Language
1.26
Llama 3 (8B)
Access the Llama-3 (8B) API along with 100+ AI models. Llama-3 8B is an optimized, open-source language model excelling in dialogue, reasoning, and code generation.
Llama 3 (70B)
Access Meta's Llama-3 (70B) along with 100+ other AI models through our API. Llama 3 is a state-of-the-art open-source language model with enhanced reasoning, coding, and multilingual capabilities for software developers.
Chat GPT-3.5 Turbo Instruct
Get the GPT-3.5-Turbo-Instruct model from OpenAI, along with 100+ open-source AI models, all designed for efficient, accurate, and instruction-driven AI interactions.
StableLM Base Alpha 3B
Dive into the world of enhanced language processing with StableLM-Base-Alpha API, boasting up to 7 billion parameters for superior text generation.
FLAN T5 XL (3B)
Discover the FLAN-T5 XL (3B) API, a variant of the T5 model fine-tuned on over 1000 diverse tasks, excelling in multilingual language processing.
GPT Neox 20B
Eleuther AI
Explore the capabilities of GPT-NeoX-20B API for generating complex, context-aware text across a multitude of domains. Ideal for research and advanced AI applications!
Mixtral 8x22B
Mistral AI
Mixtral 8x22B API, a sparse mixture-of-experts AI model with 141 billion total parameters (about 39 billion active per token), offers strong language processing capabilities.
GPT-JT-Moderation (6B)
Experience efficient, AI-powered content moderation to ensure respectful, safe digital environments with GPT-JT-Moderation (6B) API.
Falcon (7B)
TII
Falcon-7B API offers strong language understanding and generation, trained on 1,500B tokens and released under a flexible Apache 2.0 license.
RedPajama-INCITE (3B)
The AI model from RedPajama-INCITE (3B) API utilizes advanced technology to analyze data and provide valuable insights for business decision-making.
Falcon (40B)
Falcon 40B API leads in natural language generation, offering multilingual support and efficient, scalable AI-driven text creation. Explore unparalleled capabilities.
Qwen (7B)
Qwen (7B) API is a state-of-the-art language model that combines advanced performance with remarkable efficiency, making it a top choice for developers and enterprises.
RedPajama-INCITE Instruct (3B)
Harness the exceptional text generation capabilities of the RedPajama-INCITE Instruct (3B) API. Unlock intelligent, human-like responses for a wide range of applications.
RedPajama-INCITE Instruct (7B)
Enhance decision-making across industries with RedPajama-INCITE Instruct API, providing precise, AI-generated insights for smarter, faster decisions.
RedPajama-INCITE (7B)
RedPajama-INCITE (7B) API is a highly adaptable and customizable AI model that delivers accurate and relevant results, making it an invaluable resource for a wide range of applications.
NexusRaven (13B)
Nexusflow
Step into the future of data-driven decision-making with NexusRaven (13B) API. This model offers unparalleled insights and analyses, empowering businesses to make informed strategic decisions.
0.0315
01-ai Yi Base (6B)
01.AI
Discover the versatility of 01-ai Yi Base (6B) API, an AI model adept in text generation, language understanding, and data analysis. Ideal for businesses and developers seeking AI-powered solutions.
LLaMA-2 (13B)
Unleash the potential of AI with LLaMA-2 (13B) API, a model boasting 13 billion parameters, designed for comprehensive data analysis, deep learning, and sophisticated problem-solving across various domains.
LLaMA-2-32K (7B)
Harness the power of LLaMA-2-32K (7B) API, an AI model with 7 billion parameters and a 32,000-token context window, designed for deep learning and complex problem-solving.
LLaMA-2 (70B)
Unlock unparalleled AI performance with LLaMA-2 (70B) API, a groundbreaking model boasting 70 billion parameters for superior understanding and problem-solving capabilities.
StripedHyena Hessian (7B)
Unlock the power of data with StripedHyena Hessian (7B) API, a cutting-edge AI model designed for intricate data analysis and pattern recognition. With 7 billion parameters, this model provides deep insights and predictive analytics to drive informed decisions.
Qwen 1.5 (1.8B)
Qwen 1.5 (1.8B), a beta version of Qwen2, excels in text generation, chatbots, and content moderation with its transformer-based architecture. It outperforms competitors in benchmarks, offering multilingual support and advanced capabilities across various domains.
Qwen 1.5 (0.5B)
Enter the world of efficient AI communication with Qwen 1.5 (0.5B) API. This model, with 500 million parameters, offers a compact yet effective solution for generating intelligent and context-aware dialogues.
Microsoft Phi-2
Microsoft Phi-2 API, a breakthrough in AI, offers significant computational and AI advancements. Engineered for modern demands, it excels in NLP and various applications, setting new standards in AI efficiency, capability, and safety.
Qwen 1.5 (4B)
Unlock the potential of conversational AI with Qwen 1.5 (4B) API. Boasting 4 billion parameters, this model delivers efficient, high-quality dialogues, making it an excellent choice for streamlined and intelligent communication solutions.
Qwen 1.5 (14B)
Discover unparalleled conversational AI with Qwen 1.5 (14B) API. Empower your applications with deep, nuanced interactions, thanks to its 14 billion parameter architecture.
Qwen 1.5 (7B)
Step into the future of AI communication with Qwen 1.5 (7B) API. With 7 billion parameters, this model offers nuanced, intelligent conversational capabilities that redefine user engagement.
Qwen 1.5 (72B)
Qwen 1.5-72B: Transformer-based language model with multilingual support, 32K context, and strong performance in text completion and reasoning.
Gemma (7B)
Gemma (7B) is an AI model that uses machine learning to predict customer behavior and preferences for personalized recommendations.
Mixtral-8x7B v0.1
The Mixtral-8x7B v0.1 API is a cutting-edge AI model designed to accurately analyze and process data in multiple domains.
Gemma (2B) (Deprecated)
Gemma's AI model utilizes deep learning techniques to accurately predict outcomes in a wide range of industries.
StripedHyena Nous (7B)
The AI model from StripedHyena Nous (7B) API utilizes advanced machine learning algorithms to analyze and interpret complex data sets, enabling organizations to make informed decisions and predictions.
Mixtral 7B
Mixtral 7B API excels in complex tasks like language translation and content creation, surpassing Llama models and matching ChatGPT's capabilities.
LLaMA-2 (7B)
LLaMA-2 (7B) API excels in assistance tasks, delivering great overall performance at a modest price.
Magic Image-to-3D
The model is optimized for fast 3D asset creation, making it suitable for real-time pipelines in game development, AR/VR, e-commerce, and digital content production.
3D Generation
Stable TripoSR 3D
TripoSR API generates high-quality 3D meshes from single images in under 0.5 seconds, using transformer architecture for efficient reconstruction.
0.05
Mistral (7B) v0.1
Discover the power of Mistral (7B) v0.1 API, an AI model with 7 billion parameters designed for versatile, high-performance machine learning tasks.
Qwen2.5 VL 7B Instruct
Its optimized size ensures efficient performance with cost-effective operation, suitable for chatbots, AI assistants, and automated content extraction systems.
Mistral OCR Latest
Mistral OCR (mistral-ocr-latest), developed by Mistral AI, transforms PDFs and images into structured Markdown/JSON, handling text, tables, equations, and multilingual content.