VoyageAI inference integration

Creates an inference endpoint to perform an inference task with the voyageai service.

Request

```console
PUT /_inference/<task_type>/<inference_id>
```
Path parameters

<inference_id>
    (Required, string) The unique identifier of the inference endpoint.

<task_type>
    (Required, string) The type of the inference task that the model will perform.
    Available task types:
    - text_embedding
    - rerank
Request body

chunking_settings
    (Optional, object) Chunking configuration object. Refer to Configuring chunking to learn more about chunking.

    max_chunk_size
        (Optional, integer) Specifies the maximum size of a chunk in words. Defaults to 250. This value cannot be higher than 300 or lower than 20 (for the sentence strategy) or 10 (for the word strategy).

    overlap
        (Optional, integer) Only for the word chunking strategy. Specifies the number of overlapping words for chunks. Defaults to 100. This value cannot be higher than half of max_chunk_size.

    sentence_overlap
        (Optional, integer) Only for the sentence chunking strategy. Specifies the number of overlapping sentences for chunks. It can be either 1 or 0. Defaults to 1.

    strategy
        (Optional, string) Specifies the chunking strategy. It can be either sentence or word.
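As a sketch, chunking settings are passed at the top level of the create request alongside the service configuration. The endpoint name and parameter values below are illustrative, not defaults you must use:

```console
PUT _inference/text_embedding/voyageai-embeddings-chunked
{
  "service": "voyageai",
  "service_settings": {
    "model_id": "voyage-3-large"
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 200,
    "sentence_overlap": 1
  }
}
```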
service
    (Required, string) The type of service supported for the specified task type. In this case, voyageai.

service_settings
    (Required, object) Settings used to install the inference model. These settings are specific to the voyageai service.
    dimensions
        (Optional, integer) The number of dimensions the resulting output embeddings should have. This setting maps to output_dimension in the VoyageAI documentation. Only for the text_embedding task type.

    embedding_type
        (Optional, string) The data type for the embeddings to be returned. This setting maps to output_dtype in the VoyageAI documentation. Permitted values: float, int8, bit. int8 is a synonym of byte in the VoyageAI documentation; bit is a synonym of binary in the VoyageAI documentation. Only for the text_embedding task type.

    model_id
        (Required, string) The name of the model to use for the inference task. Refer to the VoyageAI documentation for the list of available text embedding and rerank models.
    rate_limit
        (Optional, object) This setting helps to minimize the number of rate limit errors returned from VoyageAI. The voyageai service sets a default number of requests allowed per minute depending on the task type. For both text_embedding and rerank, it is set to 2000. To modify this, set the requests_per_minute setting of this object in your service settings:

        "rate_limit": {
            "requests_per_minute": <<number_of_requests>>
        }

        More information about the rate limits for VoyageAI can be found in your account limits.
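For instance, to cap an endpoint below the default, rate_limit is nested inside service_settings. The endpoint name and the value 100 are illustrative:

```console
PUT _inference/text_embedding/voyageai-embeddings-limited
{
  "service": "voyageai",
  "service_settings": {
    "model_id": "voyage-3-large",
    "rate_limit": {
      "requests_per_minute": 100
    }
  }
}
```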
task_settings
    (Optional, object) Settings to configure the inference task. These settings are specific to the <task_type> you specified.

task_settings for the text_embedding task type

    input_type
        (Optional, string) Type of the input text. Permitted values: ingest (maps to document in the VoyageAI documentation), search (maps to query in the VoyageAI documentation).

    truncation
        (Optional, boolean) Whether to truncate the input texts to fit within the context length. Defaults to false.
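A sketch of a text_embedding endpoint that sets these task settings; the endpoint name is illustrative:

```console
PUT _inference/text_embedding/voyageai-embeddings-search
{
  "service": "voyageai",
  "service_settings": {
    "model_id": "voyage-3-large"
  },
  "task_settings": {
    "input_type": "search",
    "truncation": true
  }
}
```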
task_settings for the rerank task type

    return_documents
        (Optional, boolean) Whether to return the source documents in the response. Defaults to false.

    top_k
        (Optional, integer) The number of most relevant documents to return. If not specified, the reranking results of all documents will be returned.

    truncation
        (Optional, boolean) Whether to truncate the input texts to fit within the context length. Defaults to false.
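A sketch of a rerank endpoint combining these task settings; the endpoint name and values are illustrative:

```console
PUT _inference/rerank/voyageai-rerank-with-settings
{
  "service": "voyageai",
  "service_settings": {
    "model_id": "rerank-2"
  },
  "task_settings": {
    "return_documents": true,
    "top_k": 3,
    "truncation": true
  }
}
```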
VoyageAI service example

The following example shows how to create an inference endpoint called voyageai-embeddings to perform a text_embedding task type. The embeddings created by requests to this endpoint will have 512 dimensions.

```console
PUT _inference/text_embedding/voyageai-embeddings
{
  "service": "voyageai",
  "service_settings": {
    "model_id": "voyage-3-large",
    "dimensions": 512
  }
}
```
The next example shows how to create an inference endpoint called voyageai-rerank to perform a rerank task type.

```console
PUT _inference/rerank/voyageai-rerank
{
  "service": "voyageai",
  "service_settings": {
    "model_id": "rerank-2"
  }
}
```
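Once created, an endpoint can be invoked with the inference API. A sketch of calling the rerank endpoint defined above, with illustrative query and input documents:

```console
POST _inference/rerank/voyageai-rerank
{
  "query": "Which planet is known as the Red Planet?",
  "input": [
    "Venus is the second planet from the Sun.",
    "Mars is often called the Red Planet."
  ]
}
```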