Model Catalog

Browse the selection of large language models and compare their capabilities, knowledge freshness, pricing, and other technical details.

Nova Micro

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Amazon

  • Max Context Tokens

    128000

  • Max Generated Tokens

    5000

  • Data Freshness

    October 2023

$0.14 Per Million Generated Tokens

Generated token cost by provider

Nova Micro

Now Serving

  • Version

    nova_micro

  • Description

    Amazon Nova Micro is a text-only model that delivers the lowest-latency responses at very low cost. It is highly performant at language understanding, translation, reasoning, code completion, brainstorming, and mathematical problem-solving. With a generation speed of over 200 tokens per second, Amazon Nova Micro is ideal for applications that require fast responses. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock. A worked pricing example follows this entry.

$0.035 Per Million Context Tokens

Context token cost by provider
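
To make the per-million-token pricing concrete, here is a minimal cost sketch using the Nova Micro rates listed above ($0.035 per million context tokens, $0.14 per million generated tokens). The token counts in the example are hypothetical placeholders, not measured values, and the same arithmetic applies to every model in this catalog.

```python
# Minimal cost sketch for per-million-token pricing (illustrative only).
# Rates come from the Nova Micro entry above; the token counts below are
# hypothetical placeholders chosen for the example.

CONTEXT_PRICE_PER_MILLION = 0.035    # USD per million context (input) tokens
GENERATED_PRICE_PER_MILLION = 0.14   # USD per million generated (output) tokens

def request_cost(context_tokens: int, generated_tokens: int) -> float:
    """Estimate the cost of a single request in USD."""
    context_cost = context_tokens / 1_000_000 * CONTEXT_PRICE_PER_MILLION
    generated_cost = generated_tokens / 1_000_000 * GENERATED_PRICE_PER_MILLION
    return context_cost + generated_cost

# Example: a 2,000-token prompt that yields a 500-token answer.
print(f"${request_cost(2_000, 500):.6f}")  # -> $0.000140
```

Swap in any other model's two rates from this catalog to compare providers on the same request shape.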

Nova Lite

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Amazon

  • Max Context Tokens

    300000

  • Max Generated Tokens

    5000

  • Data Freshness

    October 2023

$0.24 Per Million Generated Tokens

Generated token cost by provider

Nova Lite

Now Serving

  • Version

    nova_lite

  • Description

    Amazon Nova Lite is a very low-cost multimodal model that is lightning fast at processing image, video, and text inputs. Amazon Nova Lite's accuracy across a breadth of tasks, coupled with its lightning-fast speed, makes it suitable for a wide range of interactive and high-volume applications where cost is a key consideration. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.

$0.06 Per Million Context Tokens

Context token cost by provider

Nova Pro

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Amazon

  • Max Context Tokens

    300000

  • Max Generated Tokens

    5000

  • Data Freshness

    October 2023

$3.2 Per Million Generated Tokens

Generated token cost by provider

Nova Pro

Now Serving

  • Version

    nova_pro

  • Description

    Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Pro's capabilities, coupled with its industry-leading speed and cost efficiency, make it a compelling model for almost any task, including video summarization, Q&A, mathematical reasoning, software development, and AI agents that can execute multi-step workflows. In addition to state-of-the-art accuracy on text and visual intelligence benchmarks, Amazon Nova Pro excels at instruction following and agentic workflows as measured by the Comprehensive RAG Benchmark (CRAG), the Berkeley Function Calling Leaderboard, and Mind2Web. Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry-leading price-performance, available exclusively on Amazon Bedrock.

$0.8 Per Million Context Tokens

Context token cost by provider

Reasoner-R1

Now Serving

  • Provider

    DeepSeek

  • Originator

    DeepSeek

  • Max Context Tokens

    64000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$2.19 Per Million Generated Tokens

Generated token cost by provider

Reasoner-R1

Now Serving

  • Version

    deepseek_reasoner

  • Description

    We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally developed numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally.

$0.55 Per Million Context Tokens

Context token cost by provider

Chat-V3

Now Serving

  • Provider

    DeepSeek

  • Originator

    DeepSeek

  • Max Context Tokens

    64000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$0.28 Per Million Generated Tokens

Generated token cost by provider

Chat-V3

Now Serving

  • Version

    deepseek_chat

  • Description

    We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. DeepSeek-V3 achieves a significant breakthrough in inference speed over previous models. It tops the leaderboard among open-source models and rivals the most advanced closed-source models globally. A minimal sketch of the sparse-expert routing idea follows this entry.

$0.14 Per Million Context Tokens

Context token cost by provider
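
The DeepSeek-V3 description above highlights its sparse Mixture-of-Experts design: of 671B total parameters, only about 37B are activated per token. The snippet below is a generic top-k expert-routing sketch meant purely to illustrate that idea; the expert count, layer sizes, and routing details are toy assumptions, not DeepSeek-V3's actual architecture.

```python
import numpy as np

# Illustrative top-k Mixture-of-Experts routing: each token is processed by
# only k of the n experts, so most parameters stay inactive for any one token.
# All sizes here are toy values, not DeepSeek-V3's real configuration.

rng = np.random.default_rng(0)
n_experts, k, d_model, d_hidden = 8, 2, 16, 32

router_w = rng.normal(size=(d_model, n_experts))
experts_w1 = rng.normal(size=(n_experts, d_model, d_hidden))
experts_w2 = rng.normal(size=(n_experts, d_hidden, d_model))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """x: (tokens, d_model) -> (tokens, d_model), touching only k experts per token."""
    logits = x @ router_w                                   # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]              # indices of the chosen experts
    gate_logits = np.take_along_axis(logits, topk, axis=-1)
    gates = np.exp(gate_logits)
    gates /= gates.sum(axis=-1, keepdims=True)              # softmax over the k experts

    out = np.zeros_like(x)
    for t in range(x.shape[0]):                             # per-token dispatch, written for clarity
        for j, e in enumerate(topk[t]):
            h = np.maximum(x[t] @ experts_w1[e], 0.0)       # expert MLP with ReLU
            out[t] += gates[t, j] * (h @ experts_w2[e])
    return out

tokens = rng.normal(size=(4, d_model))
print(moe_forward(tokens).shape)  # -> (4, 16)
```

Production MoE layers add load balancing (DeepSeek-V3 reportedly uses an auxiliary-loss-free strategy, per the description above) and batched expert dispatch; the sketch keeps only the core routing step.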

GPT-3.5 Turbo

Deprecated

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    16385

  • Max Generated Tokens

    4096

  • Data Freshness

    Sep 2021

$1.5 Per Million Generated Tokens

Generated token cost by provider

GPT-3.5 Turbo

Deprecated

  • Version

    gpt_3_5_turbo

  • Description

    The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$0.5 Per Million Context Tokens

Context token cost by provider

GPT-4

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    8192

  • Max Generated Tokens

    8192

  • Data Freshness

    Sep 2021

$60.0 Per Million Generated Tokens

Generated token cost by provider

GPT-4

Now Serving

  • Version

    gpt_4

  • Description

    Snapshot of gpt-4 from June 13th 2023 with improved tool calling support. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$30.0 Per Million Context Tokens

Context token cost by provider

GPT-4 Turbo

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Dec 2023

$30.0 Per Million Generated Tokens

Generated token cost by provider

GPT-4 Turbo

Now Serving

  • Version

    gpt_4_turbo

  • Description

    The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling (a JSON-mode request sketch follows this entry). Currently points to gpt-4-turbo-2024-04-09. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$10.0 Per Million Context Tokens

Context token cost by provider
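
The GPT-4 Turbo entry notes support for JSON mode and function calling. As a minimal sketch, the request below enables JSON mode with the public OpenAI Python SDK; whether this catalog routes requests through its own gateway, and under which identifier (the listed version key is gpt_4_turbo), is an assumption left to the deployment, so the public model name is used here.

```python
# Minimal JSON-mode request sketch using the OpenAI Python SDK.
# Assumes OPENAI_API_KEY is set in the environment; if this catalog fronts
# the model through its own gateway, substitute its base URL and model id.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",  # public model name; the catalog's version key is gpt_4_turbo
    response_format={"type": "json_object"},  # JSON mode: output is a valid JSON object
    messages=[
        {"role": "system", "content": "Reply with a JSON object containing 'title' and 'summary'."},
        {"role": "user", "content": "Summarize why JSON mode is useful, in one sentence."},
    ],
)

print(response.choices[0].message.content)  # e.g. {"title": "...", "summary": "..."}
```

Note that JSON mode requires the prompt itself to mention JSON, as the system message above does.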

GPT-4o

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Oct 2023

$10.0 Per Million Generated Tokens

Generated token cost by provider

GPT-4o

Now Serving

  • Version

    gpt_4o

  • Description

    GPT-4o is their most advanced model. It is multimodal (accepting text or image inputs and outputting text), and it has the same high intelligence as GPT-4 Turbo but is much more efficient—it generates text 2x faster and is 50% cheaper. Additionally, GPT-4o has the best vision and performance across non-English languages of any of their models. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$2.5 Per Million Context Tokens

Context token cost by provider

GPT-4o Mini

Now Serving

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    16384

  • Data Freshness

    Oct 2023

$0.6 Per Million Generated Tokens

Generated token cost by provider

GPT-4o Mini

Now Serving

  • Version

    gpt_4o_mini

  • Description

    GPT-4o mini is their most advanced model in the small-models category, and their cheapest model yet. It is multimodal (accepting text or image inputs and outputting text) and has higher intelligence than gpt-3.5-turbo while being just as fast. It is meant to be used for smaller tasks, including vision tasks. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$0.15 Per Million Context Tokens

Context token cost by provider

GPT-o1 Preview

Coming Soon

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    32768

  • Data Freshness

    Oct 2023

$60.0 Per Million Generated Tokens

Generated token cost by provider

GPT-o1 Preview

Coming Soon

  • Version

    gpt_o1_preview

  • Description

    The o1 series of large language models is trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. o1 Preview is a reasoning model designed to solve hard problems across domains. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$15.0 Per Million Context Tokens

Context token cost by provider

GPT-o1 Mini

Coming Soon

  • Provider

    OpenAI

  • Originator

    OpenAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    65536

  • Data Freshness

    Oct 2023

$12.0 Per Million Generated Tokens

Generated token cost by provider

GPT-o1 Mini

Coming Soon

  • Version

    gpt_o1_mini

  • Description

    The o1 series of large language models is trained with reinforcement learning to perform complex reasoning. o1 models think before they answer, producing a long internal chain of thought before responding to the user. o1 Mini is a faster and cheaper reasoning model that is particularly good at coding, math, and science. OpenAI is an AI research and deployment company. Their mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.

$3.0 Per Million Context Tokens

Context token cost by provider

Mistral Small

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    32000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$0.6 Per Million Generated Tokens

Generated token cost by provider

Mistral Small

Now Serving

  • Version

    mistral_small

  • Description

    The first dense model released by Mistral AI, perfect for experimentation, customization, and quick iteration. At the time of its release, it matched the capabilities of models of up to 30B parameters. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$0.2 Per Million Context Tokens

Context token cost by provider

Mixtral 8x22B

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    64000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$6.0 Per Million Generated Tokens

Generated token cost by provider

Mixtral 8x22B

Now Serving

  • Version

    mixtral_8_22b

  • Description

    A larger sparse mixture-of-experts model. As such, it leverages up to 141B parameters but only uses about 39B during inference, leading to better inference throughput at the cost of more vRAM. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$2.0 Per Million Context Tokens

Context token cost by provider

Mistral Large

Now Serving

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$6.0 Per Million Generated Tokens

Generated token cost by provider

Mistral Large

Now Serving

  • Version

    mistral_large

  • Description

    Mistral AI's flagship model with state-of-the-art reasoning, knowledge, and coding capabilities. It is ideal for complex tasks that require large reasoning capabilities or are highly specialized (synthetic text generation, code generation, RAG, or agents). Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$2.0 Per Million Context Tokens

Context token cost by provider

Mistral Nemo

Coming Soon

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    128000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$0.15 Per Million Generated Tokens

Generated token cost by provider

Mistral Nemo

Coming Soon

  • Version

    mistral_nemo

  • Description

    A 12B model built in partnership with NVIDIA. It is easy to use and a drop-in replacement for any system using Mistral 7B, which it supersedes. Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$0.15 Per Million Context Tokens

Context token cost by provider

Mistral Codestral

Coming Soon

  • Provider

    MistralAI

  • Originator

    MistralAI

  • Max Context Tokens

    32000

  • Max Generated Tokens

    8192

  • Data Freshness

    Unpublished

$0.6 Per Million Generated Tokens

Generated token cost by provider

Mistral Codestral

Coming Soon

  • Version

    mistral_codestral

  • Description

    A cutting-edge generative model that has been specifically designed and optimized for code-generation tasks, including fill-in-the-middle and code completion (an illustrative fill-in-the-middle sketch follows this entry). Mistral AI's mission is to make frontier AI ubiquitous, and to provide tailor-made AI to all the builders. This requires fierce independence, strong commitment to open, portable and customisable solutions, and an extreme focus on shipping the most advanced technology in limited time.

$0.2 Per Million Context Tokens

Context token cost by provider
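
The Codestral description mentions fill-in-the-middle (FIM): the model is given the code before and after a gap and generates the missing middle. The sketch below only illustrates how such a request and the resulting splice fit together; the payload field names and the call_fim_model() stub are hypothetical placeholders, not Mistral's actual API.

```python
# Illustrative fill-in-the-middle (FIM) flow. The request fields and the
# call_fim_model() stub are hypothetical; consult the provider's API docs
# for the real endpoint and parameter names.

prefix = "def median(values):\n    values = sorted(values)\n"
suffix = "\n    return middle\n"

request = {
    "model": "mistral_codestral",   # version key as listed in this catalog
    "prefix": prefix,               # code before the gap
    "suffix": suffix,               # code after the gap
    "max_tokens": 64,
}

def call_fim_model(req: dict) -> str:
    """Stand-in for the real API call; returns a plausible middle section."""
    return (
        "    n = len(values)\n"
        "    middle = values[n // 2] if n % 2 else (values[n // 2 - 1] + values[n // 2]) / 2"
    )

# Splice the generated middle between the prefix and suffix.
completed = prefix + call_fim_model(request) + suffix
print(completed)
```

The point is the shape of the interaction: the model completes the gap, and the caller stitches prefix + middle + suffix back into a single file.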

Gemini1.0 Pro

Now Serving

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Feb 2024

$1.5 Per Million Generated Tokens

Generated token cost by provider

Gemini1.0 Pro

Now Serving

  • Version

    gemini_1_0_pro

  • Description

    Gemini 1.0 Pro is an NLP model that handles tasks like multi-turn text and code chat, and code generation. Google DeepMind's mission is to build AI responsibly to benefit humanity. AI has the potential to be one of the most important and beneficial technologies ever invented. They are working to create breakthrough technologies that could advance science, transform work, serve diverse communities — and improve billions of people's lives.

$0.5 Per Million Context Tokens

Context token cost by provider

Gemini1.5 Pro

Now Serving

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    2097152

  • Max Generated Tokens

    8192

  • Data Freshness

    May 2024

$5.0 Per Million Generated Tokens

Generated token cost by provider

Gemini1.5 Pro

Now Serving

  • Version

    gemini_1_5_pro

  • Description

    Gemini 1.5 Pro is a mid-size multimodal model that is optimized for a wide range of reasoning tasks. 1.5 Pro can process large amounts of data at once, including 2 hours of video, 19 hours of audio, codebases with 60,000 lines of code, or 2,000 pages of text. Google DeepMind's mission is to build AI responsibly to benefit humanity. AI has the potential to be one of the most important and beneficial technologies ever invented. They are working to create breakthrough technologies that could advance science, transform work, serve diverse communities — and improve billions of people's lives.

$1.25 Per Million Context Tokens

Context token cost by provider

Gemini1.5 Flash

Coming Soon

  • Provider

    Google VertexAI

  • Originator

    Google

  • Max Context Tokens

    1048576

  • Max Generated Tokens

    8192

  • Data Freshness

    May 2024

$0.6 Per Million Generated Tokens

Generated token cost by provider

Gemini1.5 Flash

Coming Soon

  • Version

    gemini_1_5_flash

  • Description

    Gemini 1.5 Flash is a fast and versatile multimodal model for scaling across diverse tasks. Google DeepMind's mission is to build AI responsibly to benefit humanity. AI has the potential to be one of the most important and beneficial technologies ever invented. They are working to create breakthrough technologies that could advance science, transform work, serve diverse communities — and improve billions of people's lives.

$0.15 Per Million Context Tokens

Context token cost by provider

Claude3 Haiku

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    4096

  • Data Freshness

    Aug 2023

$1.25 Per Million Generated Tokens

Generated token cost by provider

Claude3 Haiku

Now Serving

  • Version

    claude_v3_haiku

  • Description

    Fastest and most compact model for near-instant responsiveness. Quick and accurate targeted performance. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$0.25 Per Million Context Tokens

Context token cost by provider

Claude3 Sonnet

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    4096

  • Data Freshness

    Aug 2023

$15.0 Per Million Generated Tokens

Generated token cost by provider

Claude3 Sonnet

Now Serving

  • Version

    claude_v3_sonet

  • Description

    Balance of intelligence and speed. Strong utility, balanced for scaled deployments. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$3.0 Per Million Context Tokens

Context token cost by provider

Claude3 Opus

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    4096

  • Data Freshness

    Aug 2023

$75.0 Per Million Generated Tokens

Generated token cost by provider

Claude3 Opus

Now Serving

  • Version

    claude_v3_opus

  • Description

    Powerful model for highly complex tasks. Top-level performance, intelligence, fluency, and understanding. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$15.0 Per Million Context Tokens

Context token cost by provider

Claude3.5 Haiku

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    8192

  • Data Freshness

    July 2024

$4.0 Per Million Generated Tokens

Generated token cost by provider

Claude3.5 Haiku

Now Serving

  • Version

    claude_v3_5_haiku

  • Description

    Fastest model, delivering intelligence at blazing speeds. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$0.8 Per Million Context Tokens

Context token cost by provider

Claude3.5 Sonnet

Now Serving

  • Provider

    Anthropic

  • Originator

    Anthropic

  • Max Context Tokens

    200000

  • Max Generated Tokens

    8192

  • Data Freshness

    Apr 2024

$15.0 Per Million Generated Tokens

Generated token cost by provider

Claude3.5 Sonnet

Now Serving

  • Version

    claude_v3_5_sonet

  • Description

    The most intelligent Claude model, offering the highest level of capability. Claude is a next-generation AI assistant built for work and trained to be safe, accurate, and secure. They believe AI will have a vast impact on the world. Anthropic is dedicated to building systems that people can rely on and generating research about the opportunities and risks of AI.

$3.0 Per Million Context Tokens

Context token cost by provider

Llama3 8B

Deprecated

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    8192

  • Max Generated Tokens

    8192

  • Data Freshness

    Mar 2023

$0.6 Per Million Generated Tokens

Generated token cost by provider

Llama3 8B

Deprecated

  • Version

    llama3_8b_instruct

  • Description

    Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency (a minimal GQA sketch follows this entry). The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.3 Per Million Context Tokens

Context token cost by provider
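
The Llama 3 entries note that Grouped-Query Attention (GQA) shares key/value heads across groups of query heads to cut inference memory. The sketch below is a minimal, generic illustration of that sharing; the head counts and dimensions are toy values, not Llama 3's actual configuration.

```python
import numpy as np

# Minimal grouped-query attention (GQA) sketch: several query heads share one
# key/value head, shrinking the KV cache. Toy sizes, not Llama 3's real ones.

def grouped_query_attention(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d) with n_q_heads % n_kv_heads == 0."""
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads

    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)

    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)           # (n_q_heads, seq, seq)
    causal_mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(causal_mask, -np.inf, scores)           # mask future positions

    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)            # row-wise softmax
    return weights @ v                                        # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 5, 16))   # 8 query heads
k = rng.normal(size=(2, 5, 16))   # 2 shared KV heads
v = rng.normal(size=(2, 5, 16))
print(grouped_query_attention(q, k, v).shape)  # -> (8, 5, 16)
```

With 2 KV heads serving 8 query heads, the KV cache is a quarter of the multi-head-attention size for the same number of query heads, which is the efficiency gain the description refers to.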

Llama3 70B

Deprecated

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    8192

  • Max Generated Tokens

    8192

  • Data Freshness

    Dec 2023

$3.5 Per Million Generated Tokens

Generated token cost by provider

Llama3 70B

Deprecated

  • Version

    llama3_70b_instruct

  • Description

    Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. Llama 3 uses a tokenizer with a vocabulary of 128K tokens and was trained on sequences of 8,192 tokens. Grouped-Query Attention (GQA) is used for all models to improve inference efficiency. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B and 405B versions.

$2.65 Per Million Context Tokens

Context token cost by provider

Llama3.1 8B

Coming Soon

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Dec 2023

$0.22 Per Million Generated Tokens

Generated token cost by provider

Llama3.1 8B

Coming Soon

  • Version

    llama3_1_8b_instruct

  • Description

    Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.22 Per Million Context Tokens

Context token cost by provider

Llama3.1 70B

Coming Soon

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Dec 2023

$0.9 Per Million Generated Tokens

Generated token cost by provider

Llama3.1 70B

Coming Soon

  • Version

    llama3_1_70b_instruct

  • Description

    Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.9 Per Million Context Tokens

Context token cost by provider

Llama3.1 405B

Coming Soon

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Dec 2023

$3.0 Per Million Generated Tokens

Generated token cost by provider

Llama3.1 405B

Coming Soon

  • Version

    llama3_1_405b_instruct

  • Description

    Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. The open source AI model you can fine-tune, distill and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B and 405B versions.

$3.0 Per Million Context Tokens

Context token cost by provider

Llama3.2 11B

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Dec 2023

$0.16 Per Million Generated Tokens

Generated token cost by provider

Llama3.2 11B

Now Serving

  • Version

    llama3_2_11b_instruct

  • Description

    The Llama 3.2-Vision collection of multimodal large language models (LLMs) comprises pretrained and instruction-tuned image-reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open-source and closed multimodal models on common industry benchmarks. The open source AI model you can fine-tune, distill and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.16 Per Million Context Tokens

Context token cost by provider

Llama3.2 90B

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Meta

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Dec 2023

$0.72 Per Million Generated Tokens

Generated token cost by provider

Llama3.2 90B

Now Serving

  • Version

    llama3_2_90b_instruct

  • Description

    The Llama 3.2-Vision collection of multimodal large language models (LLMs) comprises pretrained and instruction-tuned image-reasoning generative models in 11B and 90B sizes (text + images in / text out). The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image. The models outperform many of the available open-source and closed multimodal models on common industry benchmarks. The open source AI model you can fine-tune, distill and deploy anywhere. Meta's latest instruction-tuned model is available in 8B, 70B and 405B versions.

$0.72 Per Million Context Tokens

Context token cost by provider

Command R

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Cohere

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Unpublished

$1.5 Per Million Generated Tokens

Generated token cost by provider

Command R

Now Serving

  • Version

    command_r

  • Description

    Command R is a large language model optimized for conversational interaction and long-context tasks. It targets the “scalable” category of models that balance high performance with strong accuracy, enabling companies to move beyond proof of concept and into production. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. Driven by cutting-edge research, they are pioneering the future of language AI for business.

$0.5 Per Million Context Tokens

Context token cost by provider

Command R+

Now Serving

  • Provider

    Amazon Bedrock

  • Originator

    Cohere

  • Max Context Tokens

    128000

  • Max Generated Tokens

    4096

  • Data Freshness

    Unpublished

$15.0 Per Million Generated Tokens

Generated token cost by provider

Command R+

Now Serving

  • Version

    command_r_plus

  • Description

    Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads. It is Cohere's most powerful, scalable large language model (LLM), purpose-built to excel at real-world enterprise use cases. Command R+ joins Cohere's R-series of LLMs focused on balancing high efficiency with strong accuracy, enabling businesses to move beyond proof of concept and into production with AI. Cohere empowers every developer and enterprise to build amazing products and capture true business value with language AI. Driven by cutting-edge research, they are pioneering the future of language AI for business.

$3.0 Per Million Context Tokens

Context token cost by provider

Model Details