Data Transfer Objects and exceptions#
This page describes several classes that act as Data Transfer Objects (DTOs), used either as input arguments to SDK methods and functions or as their return values.
Input Arguments#
Instances of these classes are expected as input arguments by some methods or functions of the SDK. They contain the configuration and other details about the inference to perform.
- imagine.ModelType[source]#
Supported values: ModelType.EMBEDDING, ModelType.LLM, ModelType.RERANKER, ModelType.TEXT_TO_IMAGE, ModelType.TRANSCRIBE, ModelType.TRANSLATE.
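For illustration, a minimal sketch of referencing a model type (this only assumes the imagine package is importable as shown in the class path above):

```python
from imagine import ModelType

# Pick the model family relevant to the inference you want to perform.
model_type = ModelType.LLM
print(model_type)
```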
- class imagine.ReRankerRequest(*, query, documents, top_n=None, model, return_documents=None)[source]#
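A minimal sketch of building a ReRankerRequest from the signature above; the model name and documents are placeholders, not real deployment values:

```python
from imagine import ReRankerRequest

request = ReRankerRequest(
    query="Which planet is known as the Red Planet?",
    documents=[
        "Mars is often called the Red Planet because of its reddish appearance.",
        "Venus is the second planet from the Sun.",
    ],
    top_n=1,                    # keep only the single best-matching document
    model="my-reranker-model",  # hypothetical model name; use one available on your deployment
    return_documents=True,      # ask the service to echo the ranked documents back
)
```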
- class imagine.ChatCompletionRequest(*, frequency_penalty=None, presence_penalty=None, repetition_penalty=None, stop=None, max_seconds=None, ignore_eos=None, skip_special_tokens=None, stop_token_ids=None, max_tokens=None, temperature=None, top_k=None, top_p=None, messages, model, stream, tools=None)[source]#
Bases: LLMSamplingParams
- frequency_penalty: float | None#
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim. Defaults to None.
- ignore_eos: bool | None#
Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. Defaults to None.
- messages: list[ChatMessage]#
A list of chat messages.
- presence_penalty: float | None#
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics. Defaults to None.
- repetition_penalty: float | None#
Float that penalizes new tokens based on whether they appear in the prompt and the generated text so far. Values > 1 encourage the model to use new tokens, while values < 1 encourage the model to repeat tokens. Defaults to None.
- stop: list[str] | None#
Sequences where the API will stop generating further tokens. The returned text will contain the stop sequence. Defaults to None.
- stop_token_ids: list[list[int]] | None#
List of tokens that stop the generation when they are generated. The returned output will contain the stop tokens unless the stop tokens are special tokens. Defaults to None.
- temperature: float | None#
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Defaults to None.
- top_k: int | None#
Integer that controls the number of top tokens to consider. Set to -1 to consider all tokens. Defaults to None.
- top_p: float | None#
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. Defaults to None.
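A minimal sketch of building a ChatCompletionRequest with a few of the sampling parameters above. The model name is a placeholder, and the ChatMessage import and its role/content fields are assumptions based on the type of the messages field, not confirmed by this page:

```python
from imagine import ChatCompletionRequest, ChatMessage  # ChatMessage import path assumed

request = ChatCompletionRequest(
    model="my-llm-model",  # hypothetical model name
    messages=[
        ChatMessage(role="system", content="You are a concise assistant."),  # fields assumed
        ChatMessage(role="user", content="Explain nucleus sampling in one sentence."),
    ],
    stream=False,
    max_tokens=128,
    temperature=0.2,       # low temperature for focused, deterministic output
    stop=["\n\n"],         # stop generating at the first blank line
)
```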
- class imagine.CompletionRequest(*, frequency_penalty=None, presence_penalty=None, repetition_penalty=None, stop=None, max_seconds=None, ignore_eos=None, skip_special_tokens=None, stop_token_ids=None, max_tokens=None, temperature=None, top_k=None, top_p=None, prompt, model, stream)[source]#
Bases: LLMSamplingParams
- frequency_penalty: float | None#
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim. Defaults to None.
- ignore_eos: bool | None#
Whether to ignore the EOS token and continue generating tokens after the EOS token is generated. Defaults to None.
- presence_penalty: float | None#
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics. Defaults to None.
- repetition_penalty: float | None#
Float that penalizes new tokens based on whether they appear in the prompt and the generated text so far. Values > 1 encourage the model to use new tokens, while values < 1 encourage the model to repeat tokens. Defaults to None.
- stop: list[str] | None#
Sequences where the API will stop generating further tokens. The returned text will contain the stop sequence. Defaults to None.
- stop_token_ids: list[list[int]] | None#
List of tokens that stop the generation when they are generated. The returned output will contain the stop tokens unless the stop tokens are special tokens. Defaults to None.
- temperature: float | None#
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Defaults to None.
- top_k: int | None#
Integer that controls the number of top tokens to consider. Set to -1 to consider all tokens. Defaults to None.
- top_p: float | None#
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or temperature but not both. Defaults to None.
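Similarly, a minimal sketch of a plain-text CompletionRequest; the model name is a placeholder:

```python
from imagine import CompletionRequest

request = CompletionRequest(
    model="my-llm-model",      # hypothetical model name
    prompt="Once upon a time",
    stream=False,
    max_tokens=64,
    temperature=0.8,           # higher temperature for more varied continuations
    repetition_penalty=1.1,    # > 1 nudges the model away from repeating tokens
)
```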
Responses#
Instances of these classes are returned by some methods or functions of the SDK. They contain the data the user is interested in.
- class imagine.EmbeddingResponse(*, id, object, data, model, usage)[source]#
- class imagine.TranslateResponse(*, id, object, created, model, choices, usage=None, generation_time=None)[source]#
- class imagine.CompletionResponse(*, id, object, created, model, choices, usage=None, generation_time=None)[source]#
- class imagine.CompletionStreamResponse(*, id, model, choices, created=None, object=None, usage=None)[source]#
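The fields shown in the constructor signatures above can be read as attributes on a returned instance. A minimal sketch for a CompletionResponse; how the response is obtained (the client call) is not covered on this page and is assumed:

```python
# `response` is assumed to be a CompletionResponse returned by a completion
# call on the SDK client; the call itself is out of scope here.
def summarize_completion(response) -> None:
    print("id:", response.id)
    print("model:", response.model)
    print("choices:", len(response.choices))
    print("usage:", response.usage)                      # may be None
    print("generation_time:", response.generation_time)  # may be None
```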
- class imagine.ChatCompletionResponse(*, id, object, created, model, choices, usage)[source]#
- choices: list[ChatCompletionResponseChoice]#
A list of chat completion choices
- class imagine.ChatCompletionResponseChoice(*, index, message, finish_reason)[source]#
- finish_reason: FinishReason | None#
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, or error in case of an error.
- message: ChatMessage#
A chat completion message generated by the model.
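A minimal sketch of consuming a ChatCompletionResponse through the choice fields documented above; obtaining the response (the client call) is assumed and not shown:

```python
# `response` is assumed to be a ChatCompletionResponse returned by a chat call.
def print_first_choice(response) -> None:
    choice = response.choices[0]                   # ChatCompletionResponseChoice
    print("finish_reason:", choice.finish_reason)  # stop, length, or error
    print("message:", choice.message)              # ChatMessage generated by the model
```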
- class imagine.ChatCompletionStreamResponse(*, id, model, choices, created=None, object=None, usage=None)[source]#
- choices: list[ChatCompletionResponseStreamChoice]#
A list of chat completion choices
- class imagine.ChatCompletionResponseStreamChoice(*, index, delta, finish_reason)[source]#
- delta: DeltaMessage#
A chat completion delta generated by streamed model responses.
- finish_reason: FinishReason | None#
The reason the model stopped generating tokens. This will be stop if the model hit a natural stop point or a provided stop sequence, length if the maximum number of tokens specified in the request was reached, or error in case of an error.
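A minimal sketch of iterating over streamed chunks using the fields above. The iterable of ChatCompletionStreamResponse chunks and the content attribute of DeltaMessage are assumptions, not confirmed by this page:

```python
# `stream` is assumed to be an iterable of ChatCompletionStreamResponse chunks
# returned by a streaming chat call on the SDK client.
def collect_stream_text(stream) -> str:
    parts = []
    for chunk in stream:
        choice = chunk.choices[0]                  # ChatCompletionResponseStreamChoice
        delta = choice.delta                       # DeltaMessage with the incremental update
        content = getattr(delta, "content", None)  # `content` attribute is an assumption
        if content:
            parts.append(content)
        if choice.finish_reason is not None:       # stop, length, or error
            break
    return "".join(parts)
```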
- class imagine.FinishReason(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Enum of the reasons why the model stopped generating tokens, such as stop, length, or error.
Exceptions#
The following exceptions may be raised by the SDK: