The Canadian unicorn Cohere recently unveiled "Command A," the latest version of its flagship model. Specifically designed, like its predecessors, to meet enterprise needs, this LLM with 111 billion parameters combines performance and energy efficiency, competing with leading models such as GPT-4o and DeepSeek-V3.
One of the major advantages of Command A for businesses is its minimal hardware footprint. While most comparable models require up to 32 GPUs, Command A runs efficiently on only two A100 or H100 GPUs, which significantly reduces cost and latency and increases execution speed. In addition to a shorter time to first token, it can generate up to 156 tokens/s, which is 1.75 times faster than GPT-4o and 2.4 times faster than DeepSeek-V3.
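To put these ratios in perspective, the short Python sketch below derives the throughput they imply for the two competitors. It assumes that "1.75 times faster" means a throughput ratio of 1.75. The resulting figures are back-of-the-envelope estimates, not measurements published by Cohere, and the 1,000-token response length is a hypothetical value chosen for illustration.

```python
# Figures quoted in the article for Command A.
COMMAND_A_TOKENS_PER_S = 156.0
SPEEDUP_VS = {"GPT-4o": 1.75, "DeepSeek-V3": 2.4}


def implied_tokens_per_s(reference_rate: float, speedup: float) -> float:
    """Rate of the slower model if the reference model is `speedup` times faster.

    Assumption: "N times faster" is read as a plain throughput ratio.
    """
    return reference_rate / speedup


def seconds_to_generate(num_tokens: int, tokens_per_s: float) -> float:
    """Generation time at a steady rate, ignoring the time to first token."""
    return num_tokens / tokens_per_s


if __name__ == "__main__":
    response_tokens = 1_000  # hypothetical response length
    own_time = seconds_to_generate(response_tokens, COMMAND_A_TOKENS_PER_S)
    print(f"Command A: {own_time:.1f} s for {response_tokens} tokens")
    for model, speedup in SPEEDUP_VS.items():
        rate = implied_tokens_per_s(COMMAND_A_TOKENS_PER_S, speedup)
        other_time = seconds_to_generate(response_tokens, rate)
        print(f"{model}: ~{rate:.0f} tokens/s, {other_time:.1f} s")
```

Under that reading, the implied rates are roughly 89 tokens/s for GPT-4o and 65 tokens/s for DeepSeek-V3.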
Performance of Command A
Cohere evaluated the performance of Command A against GPT-4o and DeepSeek-V3 on academic benchmarks: MMLU (general knowledge), MATH, IFEval (instruction following), intelligent agent tests (BFCL, Taubench), and coding benchmarks (MBPPPlus, SQL, RepoQA).
In instruction following, coding (especially SQL), and agent tasks, it outperforms both competitors.
In human evaluation tests, Command A, which covers 23 major languages, outperformed its competitors in several languages, notably in dialectal Arabic, where it proved more coherent and accurate than GPT-4o and DeepSeek-V3. This ability to adapt to local contexts represents a strategic asset for companies operating internationally.
Optimized Capabilities for Businesses
Unlike its predecessor, which supported a context length of 128,000 tokens, Command A features a context length of 256,000 tokens, making it suitable for analyzing long business documents. It integrates advanced functionalities such as retrieval-augmented generation (RAG) with verifiable citations and the use of secure agent tools.
It is particularly effective for:
Analyzing and extracting information from large financial reports;
Managing HR policies according to local specifics;
Verifying and interpreting complex legal regulations.
Thanks to smooth integration with North, Cohere's AI agent platform, Command A allows companies to develop custom AI solutions while maintaining a high level of security and compliance.
Availability and Pricing
Already available on the Cohere platform, with upcoming support from major cloud providers, Command A is offered at a cost of $2.50 for 1 million input tokens and $10.00 for 1 million output tokens. It is also accessible for research purposes on Hugging Face.
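To see what these rates mean for a given workload, here is a minimal Python sketch of a cost estimate. The function name `estimate_cost` and the example token counts are hypothetical. Only the two per-million prices come from the article, and actual billing may differ.

```python
from decimal import Decimal

# Prices quoted in the article, in US dollars per 1 million tokens.
INPUT_PRICE_PER_MILLION = Decimal("2.50")
OUTPUT_PRICE_PER_MILLION = Decimal("10.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in US dollars for one workload (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical job: a long report as input, a short summary as output.
    cost = estimate_cost(input_tokens=200_000, output_tokens=2_000)
    print(f"Estimated cost: ${cost:.2f}")
```

For example, a hypothetical job that sends 200,000 input tokens and receives 2,000 output tokens would come to about $0.52 at these rates.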
To better understand
What is an LLM and why is it important for businesses?
An LLM (Large Language Model) is an artificial intelligence model trained on vast amounts of text data to understand, generate, and manipulate natural language. For businesses, this can transform operations by streamlining customer service, analyzing complex data, and improving communication and decision-making.
How does Retrieval-Augmented Generation (RAG) enhance the capabilities of an LLM like Command A?
RAG, or Retrieval-Augmented Generation, allows an LLM to enrich its responses with current and relevant external data. The model consults external databases or documents when it receives a question and grounds its answer in the retrieved passages, which makes the information verifiable and increases the accuracy and relevance of the generated outputs.
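The retrieval step can be illustrated with a toy Python sketch. It is not Cohere's implementation: it ranks documents by simple word overlap, whereas production systems typically use embeddings or a search index. The `Document` class, the function names, and the sample corpus are all hypothetical. The final step, sending the assembled prompt to a model, is left out on purpose.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: list[Document], top_k: int = 2) -> list[Document]:
    """Rank documents by how many query words they share and keep the best top_k."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc.text)), doc) for doc in corpus]
    # Drop documents with no overlap; the sort is stable, so ties keep corpus order.
    relevant = [(score, doc) for score, doc in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


def build_prompt(query: str, sources: list[Document]) -> str:
    """Assemble a prompt that asks the model to cite sources by their id."""
    context = "\n".join(f"[{doc.doc_id}] {doc.text}" for doc in sources)
    return (
        "Answer using only the sources below and cite their ids in brackets.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    # Hypothetical corpus, invented for this example.
    corpus = [
        Document("hr-1", "Employees in France receive 25 days of paid leave per year."),
        Document("fin-1", "Quarterly revenue grew 12 percent year over year."),
        Document("legal-1", "Contracts must be retained for ten years."),
    ]
    question = "How many days of paid leave do employees in France receive?"
    print(build_prompt(question, retrieve(question, corpus)))
```

Because each retrieved passage carries its identifier into the prompt, the model's answer can point back to the passage it relied on, which is the idea behind verifiable citations.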