The Cohere component is an AI component that allows users to connect the AI models served on the Cohere Platform.
It can carry out the following tasks:
#Release Stage
Alpha
#Configuration
The component definition and tasks are defined in the definition.json and tasks.json files respectively.
#Setup
In order to communicate with Cohere, the following connection details need to be
provided. You may specify them directly in a pipeline recipe as key-value pairs
within the component's setup
block, or you can create a Connection from
the Integration Settings
page and reference the whole setup
as setup: ${connection.<my-connection-id>}
.
Field | Field ID | Type | Note |
---|
API Key | api-key | string | Fill in your Cohere API key. To find your keys, visit the Cohere dashboard page. |
#Supported Tasks
#Text Generation Chat
Cohere's text generation models (often called generative pre-trained transformers or large language models) have been trained to understand natural language, code, and images. The models provide text outputs in response to their inputs. The inputs to these models are also referred to as "prompts". Designing a prompt is essentially how you “program” a large language model model, usually by providing instructions or some examples of how to successfully complete a task.
Input | ID | Type | Description |
---|
Task ID (required) | task | string | TASK_TEXT_GENERATION_CHAT |
Model Name (required) | model-name | string | The Cohere command model to be used.
Enum valuescommand-r-plus command-r command command-nightly command-light command-light-nightly
|
Prompt (required) | prompt | string | The prompt text. |
System Message | system-message | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model’s behavior is using a generic message as "You are a helpful assistant.". |
Documents | documents | array[string] | The documents to be used for the model, for optimal performance, the length of each document should be less than 300 words. |
Prompt Images | prompt-images | array[string] | The prompt images (Note: As for 2024-06-24 Cohere models are not multimodal, so images will be ignored.). |
Chat History | chat-history | array[object] | Incorporate external chat history, specifically previous messages within the conversation. Each message should adhere to the format: : {"role": "The message role, i.e. 'USER' or 'CHATBOT'", "content": "message content"}. |
Seed | seed | integer | The seed (default=42). |
Temperature | temperature | number | The temperature for sampling (default=0.7). |
Top K | top-k | integer | Top k for sampling (default=10). |
Max New Tokens | max-new-tokens | integer | The maximum number of tokens for model to generate (default=50). |
Input Objects in Text Generation Chat
Chat History
Incorporate external chat history, specifically previous messages within the conversation. Each message should adhere to the format: : {"role": "The message role, i.e. 'USER' or 'CHATBOT'", "content": "message content"}.
Field | Field ID | Type | Note |
---|
Content | content | array | The message content. |
Role | role | string | The message role, i.e. 'system', 'user' or 'assistant'. |
Content
The message content.
Field | Field ID | Type | Note |
---|
Image URL | image-url | object | The image URL. |
Text | text | string | The text content. |
Type | type | string | The type of the content part.
Enum values |
Image URL
The image URL.
Field | Field ID | Type | Note |
---|
URL | url | string | Either a URL of the image or the base64 encoded image data. |
Output | ID | Type | Description |
---|
Text | text | string | Model Output. |
Citations (optional) | citations | array[object] | Citations. |
Usage (optional) | usage | object | Token Usage on the Cohere Platform Command Models. |
Output Objects in Text Generation Chat
Citations
Field | Field ID | Type | Note |
---|
End | end | integer | The end position of the citation. |
Start | start | integer | The start position of the citation. |
Text | text | string | The text body of the citation. |
Usage
Field | Field ID | Type | Note |
---|
Input Tokens | input-tokens | number | The input tokens used by Cohere Models. |
Output Tokens | output-tokens | number | The output tokens generated by Cohere Models. |
#Text Embeddings
An embedding is a list of floating point numbers that captures semantic information about the text that it represents.
Input | ID | Type | Description |
---|
Task ID (required) | task | string | TASK_TEXT_EMBEDDINGS |
Embedding Type (required) | embedding-type | string | Specifies the return type of embedding, Note that 'binary'/'ubinary' options means the component will return packed unsigned binary embeddings. The length of each binary embedding is 1/8 the length of the float embeddings of the provided model.
Enum valuesfloat int8 uint8 binary ubinary
|
Input Type (required) | input-type | string | Specifies the type of input passed to the model.
Enum valuessearch_document search_query classification clustering
|
Model Name (required) | model-name | string | The Cohere embed model to be used.
Enum valuesembed-english-v3.0 embed-multilingual-v3.0 embed-english-light-v3.0 embed-multilingual-light-v3.0
|
Text (required) | text | string | The text. |
Output | ID | Type | Description |
---|
Embedding | embedding | array[number] | Embedding of the input text. |
Usage (optional) | usage | object | Token usage on the Cohere platform embed models. |
Output Objects in Text Embeddings
Usage
Field | Field ID | Type | Note |
---|
Token Count | tokens | number | The token count used by Cohere Models. |
#Text Reranking
Rerank models sort text inputs by semantic relevance to a specified query. They are often used to sort search results returned from an existing search solution.
Input | ID | Type | Description |
---|
Task ID (required) | task | string | TASK_TEXT_RERANKING |
Model Name (required) | model-name | string | The Cohere rerank model to be used.
Enum valuesrerank-english-v3.0 rerank-multilingual-v3.0
|
Query (required) | query | string | The query. |
Documents (required) | documents | array[string] | The documents to be used for reranking. |
Top N | top-n | integer | The number of most relevant documents or indices to return. Defaults to the length of the documents (default=3). |
Maximum Number of Chunks per Document | max-chunks-per-doc | integer | The maximum number of chunks to produce internally from a document (default=10). |
Output | ID | Type | Description |
---|
Reranked Documents | ranking | array[string] | Reranked documents. |
Reranked Documents Relevance (optional) | relevance | array[number] | The relevance scores of the reranked documents. |
Usage (optional) | usage | object | Search Usage on the Cohere Platform Rerank Models. |
Output Objects in Text Reranking
Usage
Field | Field ID | Type | Note |
---|
Search Counts | search-counts | number | The search count used by Cohere Models. |