Nova Micro
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Amazon
-
Max Context Tokens
128000
-
Max Generated Tokens
5000
-
Data Freshness
October 2023
$0.14 Per Million Generated Tokens
-
Version
nova_micro
-
Description
Amazon Nova Micro is a text-only model that delivers the lowest-latency responses at very low cost. It is highly performant at language understanding, translation, reasoning, code completion, brainstorming, and mathematical problem-solving. With a generation speed of over 200 tokens per second, Amazon Nova Micro is ideal for applications that require fast responses. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.
$0.035 Per Million Context Tokens
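Because pricing is quoted per million tokens, the cost of a single request is a simple linear function of its token counts. A minimal sketch in Python, using Nova Micro's listed rates (the token counts in the example are hypothetical):

```python
def request_cost(context_tokens: int, generated_tokens: int,
                 context_rate: float, generated_rate: float) -> float:
    """Estimate one request's cost; rates are in $ per million tokens."""
    return (context_tokens * context_rate
            + generated_tokens * generated_rate) / 1_000_000

# Nova Micro: $0.035 per million context tokens, $0.14 per million generated
cost = request_cost(10_000, 1_000, 0.035, 0.14)
print(f"${cost:.5f}")  # $0.00049
```

The same function applies to every model on this page; only the two rates change.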
Nova Lite
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Amazon
-
Max Context Tokens
300000
-
Max Generated Tokens
5000
-
Data Freshness
October 2023
$0.24 Per Million Generated Tokens
-
Version
nova_lite
-
Description
Amazon Nova Lite is a very low-cost multimodal model that is lightning fast at processing image, video, and text inputs. Its accuracy across a breadth of tasks, coupled with that speed, makes it suitable for a wide range of interactive and high-volume applications where cost is a key consideration. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.
$0.06 Per Million Context Tokens
Nova Pro
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Amazon
-
Max Context Tokens
300000
-
Max Generated Tokens
5000
-
Data Freshness
October 2023
$3.2 Per Million Generated Tokens
-
Version
nova_pro
-
Description
Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Pro's capabilities, coupled with its industry-leading speed and cost efficiency, make it a compelling model for almost any task, including video summarization, Q&A, mathematical reasoning, software development, and AI agents that can execute multi-step workflows. In addition to state-of-the-art accuracy on text and visual intelligence benchmarks, Amazon Nova Pro excels at instruction following and agentic workflows as measured by the Comprehensive RAG Benchmark (CRAG), the Berkeley Function Calling Leaderboard, and Mind2Web. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.
$0.8 Per Million Context Tokens
Reasoner-R1
Now Serving
-
Provider
DeepSeek
-
Originator
DeepSeek
-
Max Context Tokens
64000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$2.19 Per Million Generated Tokens
-
Version
deepseek_reasoner
-
Description
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.
$0.55 Per Million Context Tokens
Chat-V3
Now Serving
-
Provider
DeepSeek
-
Originator
DeepSeek
-
Max Context Tokens
64000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$0.28 Per Million Generated Tokens
-
Version
deepseek_chat
-
Description
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.
$0.14 Per Million Context Tokens
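The description above notes that DeepSeek-V3 activates only 37B of its 671B total parameters for each token. The arithmetic behind that efficiency claim is straightforward (a sketch using the figures quoted above):

```python
# Figures from the DeepSeek-V3 description above
total_params = 671e9    # total Mixture-of-Experts parameters
active_params = 37e9    # parameters activated per token

fraction = active_params / total_params
print(f"{fraction:.1%} of parameters active per token")  # 5.5% of parameters active per token
```

This sparsity is what lets an MoE model of this size keep per-token compute close to that of a much smaller dense model.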
GPT-3.5 Turbo
Deprecated
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
16385
-
Max Generated Tokens
4096
-
Data Freshness
Sep 2021
$1.5 Per Million Generated Tokens
-
Version
gpt_3_5_turbo
-
Description
The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$0.5 Per Million Context Tokens
GPT-4
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
8192
-
Max Generated Tokens
8192
-
Data Freshness
Sep 2021
$60.0 Per Million Generated Tokens
-
Version
gpt_4
-
Description
Snapshot of gpt-4 from June 13th 2023 with improved tool calling support. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$30.0 Per Million Context Tokens
GPT-4 Turbo
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Dec 2023
$30.0 Per Million Generated Tokens
-
Version
gpt_4_turbo
-
Description
The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling. Currently points to gpt-4-turbo-2024-04-09. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$10.0 Per Million Context Tokens
GPT-4o
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Oct 2023
$10.0 Per Million Generated Tokens
-
Version
gpt_4o
-
Description
GPT-4o is their most advanced model. It is multimodal (accepting text or image inputs and outputting text), and it has the same high intelligence as GPT-4 Turbo but is much more efficient—it generates text 2x faster and is 50% cheaper. Additionally, GPT-4o has the best vision and performance across non-English languages of any of their models. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$2.5 Per Million Context Tokens
GPT-4o Mini
Now Serving
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
16384
-
Data Freshness
Oct 2023
$0.6 Per Million Generated Tokens
-
Version
gpt_4o_mini
-
Description
GPT-4o mini is their most advanced model in the small models category, and their cheapest model yet. It is multimodal (accepting text or image inputs and outputting text), has higher intelligence than gpt-3.5-turbo but is just as fast. It is meant to be used for smaller tasks, including vision tasks. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$0.15 Per Million Context Tokens
GPT-o1 Preview
Coming Soon
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
32768
-
Data Freshness
Oct 2023
$60.0 Per Million Generated Tokens
-
Version
gpt_o1_preview
-
Description
The o1 series of large language models are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. o1 Preview is the reasoning model designed to solve hard problems across domains. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$15.0 Per Million Context Tokens
GPT-o1 Mini
Coming Soon
-
Provider
OpenAI
-
Originator
OpenAI
-
Max Context Tokens
128000
-
Max Generated Tokens
65536
-
Data Freshness
Oct 2023
$12.0 Per Million Generated Tokens
-
Version
gpt_o1_mini
-
Description
The o1 series of large language models are trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. o1 Mini is the faster and cheaper reasoning model particularly good at coding, math, and science. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
$3.0 Per Million Context Tokens
Mistral Small
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
32000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$0.6 Per Million Generated Tokens
-
Version
mistral_small
-
Description
The first dense model released by Mistral AI, perfect for experimentation, customization, and quick iteration. At the time of its release, it matched the capabilities of models up to 30B parameters. Mistral AI's mission is to make frontier AI ubiquitous and to provide tailor-made AI to all builders. This requires fierce independence, a strong commitment to open, portable, and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$0.2 Per Million Context Tokens
Mixtral 8x22B
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
64000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$6.0 Per Million Generated Tokens
-
Version
mixtral_8_22b
-
Description
A larger sparse mixture-of-experts model. It leverages up to 141B parameters but uses only about 39B during inference, leading to better inference throughput at the cost of more vRAM. Mistral AI's mission is to make frontier AI ubiquitous and to provide tailor-made AI to all builders. This requires fierce independence, a strong commitment to open, portable, and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$2.0 Per Million Context Tokens
Mistral Large
Now Serving
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
128000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$6.0 Per Million Generated Tokens
-
Version
mistral_large
-
Description
Mistral AI's flagship model, with state-of-the-art reasoning, knowledge, and coding capabilities. It's ideal for complex tasks that require large reasoning capabilities or are highly specialized (synthetic text generation, code generation, RAG, or agents). Mistral AI's mission is to make frontier AI ubiquitous and to provide tailor-made AI to all builders. This requires fierce independence, a strong commitment to open, portable, and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$2.0 Per Million Context Tokens
Mistral Nemo
Coming Soon
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
128000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$0.15 Per Million Generated Tokens
-
Version
mistral_nemo
-
Description
A 12B model built in partnership with NVIDIA. It is easy to use and a drop-in replacement for any system using Mistral 7B, which it supersedes. Mistral AI's mission is to make frontier AI ubiquitous and to provide tailor-made AI to all builders. This requires fierce independence, a strong commitment to open, portable, and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$0.15 Per Million Context Tokens
Mistral Codestral
Coming Soon
-
Provider
MistralAI
-
Originator
MistralAI
-
Max Context Tokens
32000
-
Max Generated Tokens
8192
-
Data Freshness
Unpublished
$0.6 Per Million Generated Tokens
-
Version
mistral_codestral
-
Description
A cutting-edge generative model that has been specifically designed and optimized for code generation tasks, including fill-in-the-middle and code completion. Mistral AI's mission is to make frontier AI ubiquitous and to provide tailor-made AI to all builders. This requires fierce independence, a strong commitment to open, portable, and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.
$0.2 Per Million Context Tokens
Gemini1.0 Pro
Now Serving
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Feb 2024
$1.5 Per Million Generated Tokens
-
Version
gemini_1_0_pro
-
Description
Gemini 1.0 Pro is an NLP model that handles tasks like multi-turn text and code chat, and code generation. Google DeepMind's mission is to build AI responsibly to benefit humanity. AI has the potential to be one of the most important and beneficial technologies ever invented. They are working to create breakthrough technologies that could advance science, transform work, serve diverse communities — and improve billions of people's lives.
$0.5 Per Million Context Tokens
Gemini1.5 Pro
Now Serving
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
2097152
-
Max Generated Tokens
8192
-
Data Freshness
May 2024
$5.0 Per Million Generated Tokens
-
Version
gemini_1_5_pro
-
Description
Gemini 1.5 Pro is a mid-size multimodal model that is optimized for a wide range of reasoning tasks. 1.5 Pro can process large amounts of data at once, including 2 hours of video, 19 hours of audio, codebases with 60,000 lines of code, or 2,000 pages of text. Google DeepMind's mission is to build AI responsibly to benefit humanity. AI has the potential to be one of the most important and beneficial technologies ever invented. They are working to create breakthrough technologies that could advance science, transform work, serve diverse communities — and improve billions of people's lives.
$1.25 Per Million Context Tokens
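With a 2,097,152-token context window and an 8,192-token generation cap, a caller can check up front whether a prompt still leaves room for a full-length completion. A minimal sketch, assuming the completion must fit inside the same context window:

```python
# Limits from the Gemini 1.5 Pro card above
MAX_CONTEXT_TOKENS = 2_097_152   # context window
MAX_GENERATED_TOKENS = 8_192     # generation cap

def fits(prompt_tokens: int) -> bool:
    """True if the prompt leaves room for a maximum-length completion."""
    return prompt_tokens + MAX_GENERATED_TOKENS <= MAX_CONTEXT_TOKENS

print(fits(2_000_000))   # True
print(fits(2_090_000))   # False
```

The same check applies to any model on this page once its two limits are substituted; how providers count prompt versus completion tokens against the window can vary, so treat this as a conservative estimate.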
Gemini1.5 Flash
Coming Soon
-
Provider
Google VertexAI
-
Originator
Google
-
Max Context Tokens
1048576
-
Max Generated Tokens
8192
-
Data Freshness
May 2024
$0.6 Per Million Generated Tokens
-
Version
gemini_1_5_flash
-
Description
Gemini 1.5 Flash is a fast and versatile multimodal model for scaling across diverse tasks. Google DeepMind's mission is to build AI responsibly to benefit humanity. AI has the potential to be one of the most important and beneficial technologies ever invented. They are working to create breakthrough technologies that could advance science, transform work, serve diverse communities — and improve billions of people's lives.
$0.15 Per Million Context Tokens
Claude3 Haiku
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
4096
-
Data Freshness
Aug 2023
$1.25 Per Million Generated Tokens
-
Version
claude_v3_haiku
-
Description
Anthropic's fastest and most compact model, built for near-instant responsiveness with quick, accurate, targeted performance. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and to generating research about the opportunities and risks of AI.
$0.25 Per Million Context Tokens
Claude3 Sonnet
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
4096
-
Data Freshness
Aug 2023
$15.0 Per Million Generated Tokens
-
Version
claude_v3_sonet
-
Description
A balance of intelligence and speed. Strong utility, balanced for scaled deployments. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and to generating research about the opportunities and risks of AI.
$3.0 Per Million Context Tokens
Claude3 Opus
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
4096
-
Data Freshness
Aug 2023
$75.0 Per Million Generated Tokens
-
Version
claude_v3_opus
-
Description
A powerful model for highly complex tasks. Top-level performance, intelligence, fluency, and understanding. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and to generating research about the opportunities and risks of AI.
$15.0 Per Million Context Tokens
Claude3.5 Haiku
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
8192
-
Data Freshness
July 2024
$4.0 Per Million Generated Tokens
-
Version
claude_v3_5_haiku
-
Description
Anthropic's fastest model, delivering intelligence at blazing speeds. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and to generating research about the opportunities and risks of AI.
$0.8 Per Million Context Tokens
Claude3.5 Sonnet
Now Serving
-
Provider
Anthropic
-
Originator
Anthropic
-
Max Context Tokens
200000
-
Max Generated Tokens
8192
-
Data Freshness
Apr 2024
$15.0 Per Million Generated Tokens
-
Version
claude_v3_5_sonet
-
Description
Anthropic's most intelligent model, with the highest level of intelligence and capability. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and to generating research about the opportunities and risks of AI.
$3.0 Per Million Context Tokens
Llama3 8B
Deprecated
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
8192
-
Max Generated Tokens
8192
-
Data Freshness
Mar 2023
$0.6 Per Million Generated Tokens
-
Version
llama3_8b_instruct
-
Description
Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open-source AI model you can fine-tune, distill, and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B, and 405B versions.
$0.3 Per Million Context Tokens
Llama3 70B
Deprecated
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
8192
-
Max Generated Tokens
8192
-
Data Freshness
Dec 2023
$3.5 Per Million Generated Tokens
-
Version
llama3_70b_instruct
-
Description
Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open-source AI model you can fine-tune, distill, and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B, and 405B versions.
$2.65 Per Million Context Tokens
Llama3.1 8B
Coming Soon
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Dec 2023
$0.22 Per Million Generated Tokens
-
Version
llama3_1_8b_instruct
-
Description
Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open-source AI model you can fine-tune, distill, and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B, and 405B versions.
$0.22 Per Million Context Tokens
Llama3.1 70B
Coming Soon
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Dec 2023
$0.9 Per Million Generated Tokens
-
Version
llama3_1_70b_instruct
-
Description
Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open-source AI model you can fine-tune, distill, and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B, and 405B versions.
$0.9 Per Million Context Tokens
Llama3.1 405B
Coming Soon
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Dec 2023
$3.0 Per Million Generated Tokens
-
Version
llama3_1_405b_instruct
-
Description
Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open-source AI model you can fine-tune, distill, and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B, and 405B versions.
$3.0 Per Million Context Tokens
Llama3.2 11B
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Dec 2023
$0.16 Per Million Generated Tokens
-
Version
llama3_2_11b_instruct
-
Description
Llama 3.2-Vision is a collection of pretrained and instruction-tuned multimodal image-reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open-source and closed multimodal models on common industry benchmarks. The open-source AI model you can fine-tune, distill, and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B, and 405B versions.
$0.16 Per Million Context Tokens
Llama3.2 90B
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Meta
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Dec 2023
$0.72 Per Million Generated Tokens
-
Version
llama3_2_90b_instruct
-
Description
Llama 3.2-Vision is a collection of pretrained and instruction-tuned multimodal image-reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open-source and closed multimodal models on common industry benchmarks. The open-source AI model you can fine-tune, distill, and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B, and 405B versions.
$0.72 Per Million Context Tokens
Command R
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Cohere
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Unpublished
$1.5 Per Million Generated Tokens
-
Version
command_r
-
Description
Command R is a large language model optimized for conversational interaction and long-context tasks. It targets the “scalable” category of models that balance high performance with strong accuracy, enabling companies to move beyond proof of concept and into production. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. Driven by cutting-edge research, they are pioneering the future of language AI for business.
$0.5 Per Million Context Tokens
Command R+
Now Serving
-
Provider
Amazon Bedrock
-
Originator
Cohere
-
Max Context Tokens
128000
-
Max Generated Tokens
4096
-
Data Freshness
Unpublished
$15.0 Per Million Generated Tokens
-
Version
command_r_plus
-
Description
Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads. It is Cohere's most powerful, scalable large language model (LLM), purpose-built to excel at real-world enterprise use cases. Command R+ joins Cohere's R-series of LLMs, focused on balancing high efficiency with strong accuracy and enabling businesses to move beyond proof of concept and into production with AI. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. Driven by cutting-edge research, they are pioneering the future of language AI for business.