Requests

These Pydantic models represent the configuration for a request to a specific OpenAI API endpoint. Each model exposes the parameters that endpoint accepts, such as `model`, `temperature`, and `max_completion_tokens`.

You use these models when defining a `common_request` for the `BatchJobManager` or when creating individual requests via the `BatchCollector`.
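
For instance, the shared configuration for a whole batch might be built once up front. The sketch below only constructs the request object; the actual `BatchJobManager`/`BatchCollector` calls are omitted, and the import path is inferred from the source location shown below rather than confirmed:

```python
from openbatch.model import ChatCompletionsRequest  # import path assumed

# Shared settings for every request in a batch; per-item messages are
# filled in later (e.g. by the collector or manager).
common_request = ChatCompletionsRequest(
    model="gpt-4.1",
    temperature=0.2,
    max_completion_tokens=512,
)
```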


ChatCompletionsRequest

Configuration for a /v1/chat/completions API request.

Bases: TextGenerationRequest

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `model` | `str` | Model ID used to generate the response, like "gpt-4.1". Defaults to "gpt-4.1". |
| `messages` | `List[Dict[str, str]]` | A list of messages in the conversation. |
| `frequency_penalty` | `Optional[float]` | Penalizes new tokens based on frequency (-2.0 to 2.0). |
| `logit_bias` | `Optional[Dict]` | Modifies the likelihood of specified tokens. |
| `logprobs` | `Optional[bool]` | Whether to return log probabilities. |
| `max_completion_tokens` | `Optional[int]` | Upper bound for generated completion tokens. |
| `modalities` | `Optional[List[str]]` | Output types the model should generate. |
| `n` | `Optional[int]` | How many chat completion choices to generate. |
| `prediction` | `Optional[object]` | Configuration for a Predicted Output. |
| `presence_penalty` | `Optional[float]` | Penalizes new tokens based on presence (-2.0 to 2.0). |
| `reasoning_effort` | `Optional[Literal['minimal', 'low', 'medium', 'high']]` | Constrains reasoning effort. |
| `response_format` | `Optional[Dict]` | Specifies the format that the model must output (e.g., JSON schema). |
| `verbosity` | `Optional[Literal['low', 'medium', 'high']]` | Constrains the response verbosity. |
| `web_search_options` | `Optional[object]` | Configuration for the web search tool. |
| `tools` | `Optional[List[object]]` | An array of tools the model may call. |
| `top_p` | `Optional[float]` | An alternative to sampling with temperature (nucleus sampling). |
| `parallel_tool_calls` | `Optional[bool]` | Whether to allow parallel tool calls. |
| `prompt_cache_key` | `Optional[str]` | Used by OpenAI to cache responses. |
| `safety_identifier` | `Optional[str]` | A stable identifier for policy monitoring. |
| `service_tier` | `Optional[Literal['auto', 'default', 'flex', 'priority']]` | Specifies the processing type. |
| `store` | `Optional[bool]` | Whether to store the generated model response. |
| `temperature` | `Optional[float]` | Sampling temperature to use (0 to 2). |
| `tool_choice` | `Optional[str \| object]` | How the model should select which tool to use. |
| `top_logprobs` | `Optional[int]` | Number of most likely tokens to return at each position (0 to 20). |

Source code in `openbatch/model.py`, lines 296-352:

```python
class ChatCompletionsRequest(TextGenerationRequest):
    """
    Configuration for a /v1/chat/completions API request.

    Attributes:
        model (str): Model ID used to generate the response, like "gpt-4.1". Defaults to "gpt-4.1".
        messages (List[Dict[str, str]]): A list of messages in the conversation.
        frequency_penalty (Optional[float]): Penalizes new tokens based on frequency (-2.0 to 2.0).
        logit_bias (Optional[Dict]): Modifies the likelihood of specified tokens.
        logprobs (Optional[bool]): Whether to return log probabilities.
        max_completion_tokens (Optional[int]): Upper bound for generated completion tokens.
        modalities (Optional[List[str]]): Output types the model should generate.
        n (Optional[int]): How many chat completion choices to generate.
        prediction (Optional[object]): Configuration for a Predicted Output.
        presence_penalty (Optional[float]): Penalizes new tokens based on presence (-2.0 to 2.0).
        reasoning_effort (Optional[Literal["minimal", "low", "medium", "high"]]): Constrains reasoning effort.
        response_format (Optional[Dict]): Specifies the format that the model must output (e.g., JSON schema).
        verbosity (Optional[Literal["low", "medium", "high"]]): Constrains the response verbosity.
        web_search_options (Optional[object]): Configuration for the web search tool.
        tools (Optional[List[object]]): An array of tools the model may call.
        top_p (Optional[float]): An alternative to sampling with temperature (nucleus sampling).
        parallel_tool_calls (Optional[bool]): Whether to allow parallel tool calls.
        prompt_cache_key (Optional[str]): Used by OpenAI to cache responses.
        safety_identifier (Optional[str]): A stable identifier for policy monitoring.
        service_tier (Optional[Literal["auto", "default", "flex", "priority"]]): Specifies the processing type.
        store (Optional[bool]): Whether to store the generated model response.
        temperature (Optional[float]): Sampling temperature to use (0 to 2).
        tool_choice (Optional[str | object]): How the model should select which tool to use.
        top_logprobs (Optional[int]): Number of most likely tokens to return at each position (0 to 20).
    """
    messages: List[Dict[str, str]] = Field(None, description="A list of messages comprising the conversation so far.")
    frequency_penalty: Optional[float] = Field(None, ge=-2, le=2, description="Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.")
    logit_bias: Optional[Dict] = Field(None, description="Modify the likelihood of specified tokens appearing in the completion.")
    logprobs: Optional[bool] = Field(None, description="Whether to return log probabilities of the output tokens or not.")
    max_completion_tokens: Optional[int] = Field(None, gt=0, description="An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.")
    modalities: Optional[List[str]] = Field(None, description="Output types that you would like the model to generate.")
    n: Optional[int] = Field(None, description="How many chat completion choices to generate for each input message.")
    prediction: Optional[object] = Field(None, description="Configuration for a Predicted Output, which can greatly improve response times when large parts of the model response are known ahead of time.")
    presence_penalty: Optional[float] = Field(None, ge=-2, le=2, description="Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.")
    reasoning_effort: Optional[Literal["minimal", "low", "medium", "high"]] = Field(None, description="Constrains effort on reasoning for reasoning models.")
    response_format: Optional[Dict] = Field(None, description="An object specifying the format that the model must output.")
    verbosity: Optional[Literal["low", "medium", "high"]] = Field(None, description="Constrains the verbosity of the model's response.")
    web_search_options: Optional[object] = Field(None, description="This tool searches the web for relevant results to use in a response.")

    def set_input_messages(self, messages: List[Message]) -> None:
        self.messages = [m.serialize() for m in messages]

    def set_output_structure(self, output_type: type[T]) -> None:
        schema = type_to_json_schema(output_type)
        self.response_format = {
            "format": {
                "type": "json_schema",
                "name": output_type.__name__,
                "schema": schema,
                "strict": True
            }
        }
```
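
A minimal sketch of structured output with this model, assuming `type_to_json_schema` accepts Pydantic `BaseModel` subclasses (inferred from the source above, not confirmed) and the same import path as before:

```python
from pydantic import BaseModel

from openbatch.model import ChatCompletionsRequest  # import path assumed


class Sentiment(BaseModel):
    label: str
    confidence: float


req = ChatCompletionsRequest(model="gpt-4.1", temperature=0.0)

# messages is a plain List[Dict[str, str]], so it can be assigned directly
# instead of going through set_input_messages with Message objects.
req.messages = [
    {"role": "system", "content": "Classify the sentiment of the user's text."},
    {"role": "user", "content": "The battery life on this laptop is fantastic."},
]

# Fills response_format with the strict JSON-schema block shown in the
# source above, derived from the Sentiment model.
req.set_output_structure(Sentiment)
```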

ResponsesRequest

Configuration for a /v1/responses API request.

Bases: TextGenerationRequest

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `model` | `str` | Model ID used to generate the response, like "gpt-4.1". Defaults to "gpt-4.1". |
| `conversation` | `Optional[str]` | The conversation this response belongs to. |
| `include` | `Optional[List[Literal[...]]]` | Specify additional output data to include. |
| `input` | `Optional[str \| List[Dict[str, str]]]` | Text, image, or file inputs to the model. |
| `instructions` | `Optional[str]` | A system or developer message. |
| `max_output_tokens` | `Optional[int]` | Upper bound for generated tokens. |
| `max_tool_calls` | `Optional[int]` | Maximum number of tool calls allowed. |
| `previous_response_id` | `Optional[str]` | ID of the previous response for multi-turn. |
| `prompt` | `Optional[ReusablePrompt]` | Reference to a prompt template and its variables. |
| `reasoning` | `Optional[ReasoningConfig]` | Configuration for reasoning models. |
| `text` | `Optional[object]` | Configuration options for a text response from the model (e.g., JSON schema). |
| `truncation` | `Optional[Literal['auto', 'disabled']]` | The truncation strategy to use. |
| `tools` | `Optional[List[object]]` | An array of tools the model may call. |
| `top_p` | `Optional[float]` | An alternative to sampling with temperature (nucleus sampling). |
| `parallel_tool_calls` | `Optional[bool]` | Whether to allow parallel tool calls. |
| `prompt_cache_key` | `Optional[str]` | Used by OpenAI to cache responses. |
| `safety_identifier` | `Optional[str]` | A stable identifier for policy monitoring. |
| `service_tier` | `Optional[Literal['auto', 'default', 'flex', 'priority']]` | Specifies the processing type. |
| `store` | `Optional[bool]` | Whether to store the generated model response. |
| `temperature` | `Optional[float]` | Sampling temperature to use (0 to 2). |
| `tool_choice` | `Optional[str \| object]` | How the model should select which tool to use. |
| `top_logprobs` | `Optional[int]` | Number of most likely tokens to return at each position (0 to 20). |

Source code in `openbatch/model.py`, lines 242-294:

```python
class ResponsesRequest(TextGenerationRequest):
    """
        Configuration for a /v1/responses API request.

        Attributes:
            model (str): Model ID used to generate the response, like "gpt-4.1". Defaults to "gpt-4.1".
            conversation (Optional[str]): The conversation this response belongs to.
            include (Optional[List[Literal[...]]]): Specify additional output data to include.
            input (Optional[str | List[Dict[str, str]]]): Text, image, or file inputs to the model.
            instructions (Optional[str]): A system or developer message.
            max_output_tokens (Optional[int]): Upper bound for generated tokens.
            max_tool_calls (Optional[int]): Maximum number of tool calls allowed.
            previous_response_id (Optional[str]): ID of the previous response for multi-turn.
            prompt (Optional[ReusablePrompt]): Reference to a prompt template and its variables.
            reasoning (Optional[ReasoningConfig]): Configuration for reasoning models.
            text (Optional[object]): Configuration options for a text response from the model (e.g., JSON schema).
            truncation (Optional[Literal["auto", "disabled"]]): The truncation strategy to use.
            tools (Optional[List[object]]): An array of tools the model may call.
            top_p (Optional[float]): An alternative to sampling with temperature (nucleus sampling).
            parallel_tool_calls (Optional[bool]): Whether to allow parallel tool calls.
            prompt_cache_key (Optional[str]): Used by OpenAI to cache responses.
            safety_identifier (Optional[str]): A stable identifier for policy monitoring.
            service_tier (Optional[Literal["auto", "default", "flex", "priority"]]): Specifies the processing type.
            store (Optional[bool]): Whether to store the generated model response.
            temperature (Optional[float]): Sampling temperature to use (0 to 2).
            tool_choice (Optional[str | object]): How the model should select which tool to use.
            top_logprobs (Optional[int]): Number of most likely tokens to return at each position (0 to 20).
        """
    conversation: Optional[str] = Field(None, description="The conversation that this response belongs to.")
    include: Optional[List[Literal["code_interpreter_call.outputs", "computer_call_output.output.image_url", "file_search_call.results", "message.input_image.image_url", "message.output_text.logprobs", "reasoning.encrypted_content"]]] = Field(None, description="Specify additional output data to include in the model response.")
    input: Optional[str | List[Dict[str, str]]] = Field(None, description="Text, image, or file inputs to the model, used to generate a response.")
    instructions: Optional[str] = Field(None, description="A system (or developer) message inserted into the model's context.")
    max_output_tokens: Optional[int] = Field(None, gt=0, description="An upper bound for the number of tokens that can be generated for a response, including visible output tokens and reasoning tokens.")
    max_tool_calls: Optional[int] = Field(None, gt=0, description="The maximum number of total calls to built-in tools that can be processed in a response.")
    previous_response_id: Optional[str] = Field(None, description="The unique ID of the previous response to the model. Use this to create multi-turn conversations.")
    prompt: Optional[ReusablePrompt] = Field(None, description="Reference to a prompt template and its variables.")
    reasoning: Optional[ReasoningConfig] = Field(None, description="Configuration options for reasoning models.")
    text: Optional[object] = Field(None, description="Configuration options for a text response from the model.")
    truncation: Optional[Literal["auto", "disabled"]] = Field(None, description="The truncation strategy to use for the model response.")

    def set_input_messages(self, messages: List[Message]) -> None:
        self.input = [m.serialize() for m in messages]

    def set_output_structure(self, output_type: type[T]) -> None:
        schema = type_to_json_schema(output_type)
        self.text = {
            "format": {
                "type": "json_schema",
                "name": output_type.__name__,
                "schema": schema,
                "strict": True
            }
        }
```
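
As a quick sketch under the same import-path assumption, a single-turn request can set `instructions` and a plain-string `input` directly; `set_input_messages` fills `input` from `Message` objects instead, and `set_output_structure` populates `text` with the same strict JSON-schema block as its chat-completions counterpart:

```python
from openbatch.model import ResponsesRequest  # import path assumed

req = ResponsesRequest(
    model="gpt-4.1",
    instructions="You are a terse technical summarizer.",
    input="Summarize the trade-offs between batch and real-time inference.",
    max_output_tokens=400,
)
```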

EmbeddingsRequest

Configuration for a /v1/embeddings API request.

Bases: BaseRequest

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `model` | `str` | Model ID used to generate the response, like "text-embedding-3-small". |
| `input` | `Union[str \| List[str]]` | Input text or array of tokens to embed. |
| `dimensions` | `Optional[int]` | The desired number of dimensions for the resulting embeddings. |
| `encoding_format` | `Optional[Literal['base64', 'float']]` | The format to return the embeddings in. |
| `user` | `Optional[str]` | A unique identifier representing the end-user. |

Source code in `openbatch/model.py`, lines 354-371:

```python
class EmbeddingsRequest(BaseRequest):
    """
    Configuration for a /v1/embeddings API request.

    Attributes:
        model (str): Model ID used to generate the response, like "text-embedding-3-small".
        input (Union[str | List[str]]): Input text or array of tokens to embed.
        dimensions (Optional[int]): The desired number of dimensions for the resulting embeddings.
        encoding_format (Optional[Literal["base64", "float"]]): The format to return the embeddings in.
        user (Optional[str]): A unique identifier representing the end-user.
    """
    input: Union[str | List[str]] = Field(None, description="Input text to embed, encoded as a string or array of tokens.")
    dimensions: Optional[int] = Field(None, ge=1, description="The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.")
    encoding_format: Optional[Literal["base64", "float"]] = Field(None, description="The format to return the embeddings in. Can be either float or base64.")
    user: Optional[str] = Field(None, description="A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. ")

    def set_input(self, inp: Union[str | List[str]]) -> None:
        self.input = inp
```
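
A minimal sketch, again assuming the `openbatch.model` import path:

```python
from openbatch.model import EmbeddingsRequest  # import path assumed

req = EmbeddingsRequest(
    model="text-embedding-3-small",
    dimensions=256,  # only supported by text-embedding-3 and later models
)
req.set_input(["first passage to embed", "second passage to embed"])
```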