Cohere | Documentation

The Cohere component is an AI component that allows users to connect to the AI models served on the Cohere Platform. It can carry out the following tasks:

Text Generation Chat
Text Embeddings
Text Reranking

#Release Stage

Alpha

#Configuration

The component definition and tasks are defined in the definition.json and tasks.json files respectively.

#Setup

In order to communicate with Cohere, the following connection details need to be provided. You may specify them directly in a pipeline recipe as key-value pairs within the component's setup block, or you can create a Connection from the Integration Settings page and reference the whole setup as setup: ${connection.<my-connection-id>}.

Field	Field ID	Type	Note
API Key	`api-key`	string	Fill in your Cohere API key. To find your keys, visit the Cohere dashboard page.
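The two setup options described above can be sketched in Python as follows. This is only an illustration: the helper names, the `COHERE_API_KEY` environment variable, and the `my-cohere` connection ID are hypothetical. Only the `api-key` field ID and the `${connection.<my-connection-id>}` reference syntax come from this page.

```python
import os


def inline_setup(api_key: str) -> dict:
    """Build a setup block that carries the key directly as a key-value pair."""
    if not api_key:
        raise ValueError("api-key must be a non-empty string")
    return {"api-key": api_key}


def connection_setup(connection_id: str) -> str:
    """Build the reference string that points the whole setup at a Connection."""
    if not connection_id:
        raise ValueError("connection ID must be a non-empty string")
    return "${connection." + connection_id + "}"


# Hypothetical usage: the environment variable name is an assumption.
setup = inline_setup(os.environ.get("COHERE_API_KEY", "placeholder-key"))
reference = connection_setup("my-cohere")
```

Referencing a Connection keeps the key out of the recipe text, which is usually preferable when recipes are shared.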

#Supported Tasks

#Text Generation Chat

Cohere's text generation models (often called generative pre-trained transformers or large language models) have been trained to understand natural language and code. The models provide text outputs in response to their inputs. The inputs to these models are also referred to as "prompts". Designing a prompt is essentially how you "program" a large language model, usually by providing instructions or some examples of how to successfully complete a task.

Input	ID	Type	Description
Task ID (required)	`task`	string	`TASK_TEXT_GENERATION_CHAT`
Model Name (required)	`model-name`	string	The Cohere command model to be used. Enum values `command-r-plus` `command-r` `command` `command-nightly` `command-light` `command-light-nightly`
Prompt (required)	`prompt`	string	The prompt text.
System Message	`system-message`	string	The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model uses the generic system message "You are a helpful assistant."
Documents	`documents`	array[string]	The documents to be used by the model. For optimal performance, each document should be shorter than 300 words.
Prompt Images	`prompt-images`	array[string]	The prompt images. Note: as of 2024-06-24, Cohere models are not multimodal, so images will be ignored.
Chat History	`chat-history`	array[object]	Incorporate external chat history, specifically previous messages within the conversation. Each message should adhere to the format: {"role": "The message role, i.e. 'USER' or 'CHATBOT'", "content": "message content"}.
Seed	`seed`	integer	The seed (default=42).
Temperature	`temperature`	number	The temperature for sampling (default=0.7).
Top K	`top-k`	integer	Top k for sampling (default=10).
Max New Tokens	`max-new-tokens`	integer	The maximum number of tokens for the model to generate (default=50).
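As a minimal sketch of how the fields in the table above fit together, the following Python builds an input dictionary for this task and rejects model names or field IDs that are not listed. The `chat_input` helper is hypothetical, and how these fields are nested inside a real pipeline recipe is not shown on this page, so the flat dictionary is an assumption.

```python
# Enum values copied from the Model Name row above.
CHAT_MODELS = {
    "command-r-plus", "command-r", "command",
    "command-nightly", "command-light", "command-light-nightly",
}

# Optional field IDs copied from the table above.
OPTIONAL_FIELDS = {
    "system-message", "documents", "prompt-images", "chat-history",
    "seed", "temperature", "top-k", "max-new-tokens",
}


def chat_input(model_name: str, prompt: str, **optional) -> dict:
    """Build an input dictionary for TASK_TEXT_GENERATION_CHAT (hypothetical helper)."""
    if model_name not in CHAT_MODELS:
        raise ValueError(f"unsupported model-name: {model_name}")
    if not prompt:
        raise ValueError("prompt is required")
    payload = {
        "task": "TASK_TEXT_GENERATION_CHAT",
        "model-name": model_name,
        "prompt": prompt,
    }
    for key, value in optional.items():
        field = key.replace("_", "-")  # max_new_tokens -> max-new-tokens
        if field not in OPTIONAL_FIELDS:
            raise KeyError(f"unknown field: {field}")
        payload[field] = value
    return payload


example = chat_input(
    "command-r",
    "Summarize the attached notes in two sentences.",
    temperature=0.2,
    max_new_tokens=100,
)
```

Fields that are left out fall back to the defaults listed in the table.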

Input Objects in Text Generation Chat

Chat History

Incorporate external chat history, specifically previous messages within the conversation. Each message should adhere to the format: {"role": "The message role, i.e. 'USER' or 'CHATBOT'", "content": "message content"}.

Field	Field ID	Type	Note
Content	`content`	array	The message content.
Role	`role`	string	The message role, i.e. 'USER' or 'CHATBOT'.

Content

The message content.

Field	Field ID	Type	Note
Image URL	`image-url`	object	The image URL.
Text	`text`	string	The text content.
Type	`type`	string	The type of the content part. Enum values `text` `image_url`

Image URL

The image URL.

Field	Field ID	Type	Note
URL	`url`	string	Either a URL of the image or the base64 encoded image data.
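The nested Chat History, Content, and Image URL objects above can be assembled as in the sketch below. The helper names and the sample messages are hypothetical. The roles used are the 'USER' and 'CHATBOT' values named in the message format description, and the `content` field is built as an array of typed parts, following the Content table.

```python
def text_part(text: str) -> dict:
    """A content part of type `text`."""
    return {"type": "text", "text": text}


def image_part(url_or_base64: str) -> dict:
    """A content part of type `image_url`; the value is a URL or base64 data."""
    return {"type": "image_url", "image-url": {"url": url_or_base64}}


def message(role: str, *parts: dict) -> dict:
    """One chat history entry with a role and a list of content parts."""
    if role not in {"USER", "CHATBOT"}:
        raise ValueError(f"unexpected role: {role}")
    return {"role": role, "content": list(parts)}


# Made-up conversation used only to show the shape of the data.
chat_history = [
    message("USER", text_part("What is a reranker?")),
    message("CHATBOT", text_part("A model that sorts texts by relevance to a query.")),
]
```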

Output	ID	Type	Description
Text	`text`	string	Model Output.
Citations (optional)	`citations`	array[object]	Citations.
Usage (optional)	`usage`	object	Token Usage on the Cohere Platform Command Models.

Output Objects in Text Generation Chat

Citations

Field	Field ID	Type	Note
End	`end`	integer	The end position of the citation.
Start	`start`	integer	The start position of the citation.
Text	`text`	string	The text body of the citation.

Usage

Field	Field ID	Type	Note
Input Tokens	`input-tokens`	number	The input tokens used by Cohere Models.
Output Tokens	`output-tokens`	number	The output tokens generated by Cohere Models.
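The output fields above could be consumed as in the following sketch. It assumes that `start` and `end` are character offsets into `text`, with `end` exclusive. This page does not state that convention, so verify it against real output. The sample output is made up for illustration.

```python
def cited_spans(output: dict) -> list:
    """Return (start, end, snippet, matches) for each citation.

    Assumption: start/end are character offsets into output["text"],
    with end exclusive. `matches` reports whether the slice equals the
    citation's own `text` field, which helps check that assumption.
    """
    text = output["text"]
    spans = []
    for citation in output.get("citations", []):
        snippet = text[citation["start"]:citation["end"]]
        spans.append((citation["start"], citation["end"], snippet,
                      snippet == citation["text"]))
    return spans


def total_tokens(output: dict):
    """Sum input and output tokens; `usage` is optional, so default to 0."""
    usage = output.get("usage", {})
    return usage.get("input-tokens", 0) + usage.get("output-tokens", 0)


# Made-up output, not a real model response.
sample_output = {
    "text": "Paris is the capital of France.",
    "citations": [{"start": 0, "end": 5, "text": "Paris"}],
    "usage": {"input-tokens": 12, "output-tokens": 8},
}
```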

#Text Embeddings

An embedding is a list of floating point numbers that captures semantic information about the text that it represents.

Input	ID	Type	Description
Task ID (required)	`task`	string	`TASK_TEXT_EMBEDDINGS`
Embedding Type (required)	`embedding-type`	string	Specifies the return type of the embedding. Note that with the 'binary' and 'ubinary' options the component returns packed unsigned binary embeddings, whose length is 1/8 the length of the float embeddings of the provided model. Enum values `float` `int8` `uint8` `binary` `ubinary`
Input Type (required)	`input-type`	string	Specifies the type of input passed to the model. Enum values `search_document` `search_query` `classification` `clustering`
Model Name (required)	`model-name`	string	The Cohere embed model to be used. Enum values `embed-english-v3.0` `embed-multilingual-v3.0` `embed-english-light-v3.0` `embed-multilingual-light-v3.0`
Text (required)	`text`	string	The text.
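To make the 1/8 length relationship for packed binary embeddings concrete, the sketch below packs one bit per float dimension into bytes. The component returns packed embeddings itself, so this code is not needed in practice. The bit convention used here (positive value gives 1, most significant bit first) is an assumption chosen for illustration, not a statement of how the Cohere models quantize.

```python
def pack_sign_bits(floats: list) -> list:
    """Pack one bit per dimension into unsigned bytes (8 dimensions per byte).

    Assumed convention: a positive value maps to bit 1, anything else to 0,
    and the first dimension of each group of 8 becomes the highest bit.
    """
    if len(floats) % 8 != 0:
        raise ValueError("length must be a multiple of 8")
    packed = []
    for i in range(0, len(floats), 8):
        byte = 0
        for value in floats[i:i + 8]:
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte)
    return packed


# A 1024-dimensional float vector packs down to 1024 / 8 = 128 bytes.
packed_length = len(pack_sign_bits([0.5] * 1024))
```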

Output	ID	Type	Description
Embedding	`embedding`	array[number]	Embedding of the input text.
Usage (optional)	`usage`	object	Token usage on the Cohere platform embed models.

Output Objects in Text Embeddings

Usage

Field	Field ID	Type	Note
Token Count	`tokens`	number	The token count used by Cohere Models.

#Text Reranking

Rerank models sort text inputs by semantic relevance to a specified query. They are often used to sort search results returned from an existing search solution.

Input	ID	Type	Description
Task ID (required)	`task`	string	`TASK_TEXT_RERANKING`
Model Name (required)	`model-name`	string	The Cohere rerank model to be used. Enum values `rerank-english-v3.0` `rerank-multilingual-v3.0`
Query (required)	`query`	string	The query.
Documents (required)	`documents`	array[string]	The documents to be used for reranking.
Top N	`top-n`	integer	The number of most relevant documents or indices to return. Defaults to the length of the documents (default=3).
Maximum Number of Chunks per Document	`max-chunks-per-doc`	integer	The maximum number of chunks to produce internally from a document (default=10).
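A reranking input built from the fields above might look like the following sketch. The `rerank_input` helper and the sample query and documents are hypothetical, and the flat dictionary layout is an assumption, since this page does not show how the fields are nested in a recipe. The example sets `top-n` explicitly rather than relying on a default.

```python
# Enum values copied from the Model Name row above.
RERANK_MODELS = {"rerank-english-v3.0", "rerank-multilingual-v3.0"}


def rerank_input(model_name, query, documents, top_n=None, max_chunks_per_doc=None):
    """Build an input dictionary for TASK_TEXT_RERANKING (hypothetical helper)."""
    if model_name not in RERANK_MODELS:
        raise ValueError(f"unsupported model-name: {model_name}")
    if not query or not documents:
        raise ValueError("query and documents are required")
    payload = {
        "task": "TASK_TEXT_RERANKING",
        "model-name": model_name,
        "query": query,
        "documents": list(documents),
    }
    if top_n is not None:
        if top_n < 1:
            raise ValueError("top-n must be at least 1")
        payload["top-n"] = top_n
    if max_chunks_per_doc is not None:
        payload["max-chunks-per-doc"] = max_chunks_per_doc
    return payload


rerank_example = rerank_input(
    "rerank-english-v3.0",
    "How do I reset my password?",
    ["Billing FAQ", "Password reset guide", "Release notes"],
    top_n=2,
)
```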

Output	ID	Type	Description
Reranked Documents	`ranking`	array[string]	Reranked documents.
Reranked Documents Relevance (optional)	`relevance`	array[number]	The relevance scores of the reranked documents.
Usage (optional)	`usage`	object	Search Usage on the Cohere Platform Rerank Models.

Output Objects in Text Reranking

Usage

Field	Field ID	Type	Note
Search Counts	`search-counts`	number	The search count used by Cohere Models.
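The reranking output can be paired up as in the sketch below. It assumes that `ranking` and `relevance` are aligned by index, so the first score belongs to the first document. The tables above suggest this but do not state it. The sample output is made up.

```python
def ranked_pairs(output: dict) -> list:
    """Pair each reranked document with its relevance score.

    Assumption: `relevance[i]` is the score of `ranking[i]`. Because
    `relevance` is optional, missing scores are returned as None.
    """
    ranking = output["ranking"]
    relevance = output.get("relevance")
    if relevance is None:
        return [(doc, None) for doc in ranking]
    if len(relevance) != len(ranking):
        raise ValueError("ranking and relevance differ in length")
    return list(zip(ranking, relevance))


# Made-up output, not a real model response.
sample_rerank = {
    "ranking": ["Password reset guide", "Billing FAQ"],
    "relevance": [0.91, 0.12],
    "usage": {"search-counts": 1},
}
pairs = ranked_pairs(sample_rerank)
```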