Important
The Azure OpenAI extension for Azure Functions is currently in preview.
The Azure OpenAI assistant post input binding lets you send prompts to assistant chat bots.
For information on setup and configuration details of the Azure OpenAI extension, see Azure OpenAI extensions for Azure Functions. To learn more about Azure OpenAI assistants, see Azure OpenAI Assistants API.
Note
References and examples are provided only for the Node.js v4 model and the Python v2 model. While both C# process models are supported, only isolated worker model examples are provided.
Example
This C# example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
/// <summary>
/// HTTP POST function that sends user prompts to the assistant chat bot.
/// </summary>
[Function(nameof(PostUserQuery))]
public static IActionResult PostUserQuery(
    [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = "assistants/{assistantId}")] HttpRequestData req,
    string assistantId,
    [AssistantPostInput("{assistantId}", "{Query.message}", ChatModel = "%CHAT_MODEL_DEPLOYMENT_NAME%", ChatStorageConnectionSetting = DefaultChatStorageConnectionSetting, CollectionName = DefaultCollectionName)] AssistantState state)
{
    return new OkObjectResult(state.RecentMessages.Any()
        ? state.RecentMessages[state.RecentMessages.Count - 1].Content
        : "No response returned.");
}
This Java example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
/*
 * HTTP POST function that sends user prompts to the assistant chat bot.
 */
@FunctionName("PostUserResponse")
public HttpResponseMessage postUserResponse(
    @HttpTrigger(
        name = "req",
        methods = {HttpMethod.POST},
        authLevel = AuthorizationLevel.ANONYMOUS,
        route = "assistants/{assistantId}")
    HttpRequestMessage<Optional<String>> request,
    @BindingName("assistantId") String assistantId,
    @AssistantPost(name="newMessages", id = "{assistantId}", chatModel = "%CHAT_MODEL_DEPLOYMENT_NAME%", userMessage = "{Query.message}", chatStorageConnectionSetting = DEFAULT_CHATSTORAGE, collectionName = DEFAULT_COLLECTION) AssistantState state,
    final ExecutionContext context) {
    List<AssistantMessage> recentMessages = state.getRecentMessages();
    String response = recentMessages.isEmpty() ? "No response returned." : recentMessages.get(recentMessages.size() - 1).getContent();
    return request.createResponseBuilder(HttpStatus.OK)
        .header("Content-Type", "application/json")
        .body(response)
        .build();
}
The following JavaScript and TypeScript examples demonstrate an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
const { app, input } = require("@azure/functions");

// Assumed values, matching the chat storage settings used in the
// function.json example elsewhere on this page.
const CHAT_STORAGE_CONNECTION_SETTING = "AzureWebJobsStorage";
const COLLECTION_NAME = "ChatState";

const assistantPostInput = input.generic({
    type: 'assistantPost',
    id: '{assistantId}',
    chatModel: '%CHAT_MODEL_DEPLOYMENT_NAME%',
    userMessage: '{Query.message}',
    chatStorageConnectionSetting: CHAT_STORAGE_CONNECTION_SETTING,
    collectionName: COLLECTION_NAME
})

app.http('PostUserResponse', {
    methods: ['POST'],
    route: 'assistants/{assistantId}',
    authLevel: 'anonymous',
    extraInputs: [assistantPostInput],
    handler: async (_, context) => {
        const chatState = context.extraInputs.get(assistantPostInput)
        const content = chatState.recentMessages[0].content
        return {
            status: 200,
            body: content,
            headers: {
                'Content-Type': 'text/plain'
            }
        };
    }
})
Here's the equivalent TypeScript example:
import { app, input } from "@azure/functions"

// Assumed values, matching the chat storage settings used in the
// function.json example elsewhere on this page.
const CHAT_STORAGE_CONNECTION_SETTING = "AzureWebJobsStorage";
const COLLECTION_NAME = "ChatState";

const assistantPostInput = input.generic({
    type: 'assistantPost',
    id: '{assistantId}',
    chatModel: '%CHAT_MODEL_DEPLOYMENT_NAME%',
    userMessage: '{Query.message}',
    chatStorageConnectionSetting: CHAT_STORAGE_CONNECTION_SETTING,
    collectionName: COLLECTION_NAME
})

app.http('PostUserResponse', {
    methods: ['POST'],
    route: 'assistants/{assistantId}',
    authLevel: 'anonymous',
    extraInputs: [assistantPostInput],
    handler: async (_, context) => {
        const chatState: any = context.extraInputs.get(assistantPostInput)
        const content = chatState.recentMessages[0].content
        return {
            status: 200,
            body: content,
            headers: {
                'Content-Type': 'text/plain'
            }
        };
    }
})
This PowerShell example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
Here's the function.json file for the post user query function:
{
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "Request",
      "route": "assistants/{assistantId}",
      "methods": [
        "post"
      ]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "Response"
    },
    {
      "name": "State",
      "type": "assistantPost",
      "direction": "in",
      "dataType": "string",
      "id": "{assistantId}",
      "userMessage": "{Query.message}",
      "chatModel": "%CHAT_MODEL_DEPLOYMENT_NAME%",
      "chatStorageConnectionSetting": "AzureWebJobsStorage",
      "collectionName": "ChatState"
    }
  ]
}
For more information about function.json file properties, see the Configuration section.
using namespace System.Net

param($Request, $TriggerMetadata, $State)

$recent_message_content = "No recent messages!"
if ($State.recentMessages.Count -gt 0) {
    $recent_message_content = $State.recentMessages[0].content
}

Push-OutputBinding -Name Response -Value ([HttpResponseContext]@{
    StatusCode = [HttpStatusCode]::OK
    Body = $recent_message_content
    Headers = @{
        "Content-Type" = "text/plain"
    }
})
This Python example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
import json

import azure.functions as func

# Assumed setup: a v2-model function app (or blueprint) named `apis`, and
# chat storage settings matching the other examples on this page.
apis = func.FunctionApp()
DEFAULT_CHAT_STORAGE_SETTING = "AzureWebJobsStorage"
DEFAULT_CHAT_COLLECTION_NAME = "ChatState"


@apis.function_name("PostUserQuery")
@apis.route(route="assistants/{assistantId}", methods=["POST"])
@apis.assistant_post_input(
    arg_name="state",
    id="{assistantId}",
    user_message="{Query.message}",
    chat_model="%CHAT_MODEL_DEPLOYMENT_NAME%",
    chat_storage_connection_setting=DEFAULT_CHAT_STORAGE_SETTING,
    collection_name=DEFAULT_CHAT_COLLECTION_NAME,
)
def post_user_response(req: func.HttpRequest, state: str) -> func.HttpResponse:
    # Parse the JSON string into a dictionary
    data = json.loads(state)
    # Extract the content of the most recent message
    recent_message_content = data["recentMessages"][0]["content"]
    return func.HttpResponse(
        recent_message_content, status_code=200, mimetype="text/plain"
    )
Attributes
Apply the AssistantPostInput attribute to define an assistant post input binding, which supports these parameters:
Parameter | Description |
---|---|
Id | The ID of the assistant to update. |
UserMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
AIConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
ChatModel | Optional. Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
Temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either Temperature or TopP, but not both. |
TopP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either Temperature or TopP, but not both. |
MaxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
IsReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
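For example, the optional parameters can be set directly on the attribute. The following is an illustrative sketch only, not one of the product samples: the Temperature, MaxTokens, and chat storage values are placeholder choices that you would adjust for your own app.

[Function(nameof(PostUserQuery))]
public static IActionResult PostUserQuery(
    [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = "assistants/{assistantId}")] HttpRequestData req,
    string assistantId,
    [AssistantPostInput(
        "{assistantId}",
        "{Query.message}",
        ChatModel = "%CHAT_MODEL_DEPLOYMENT_NAME%",
        Temperature = "0.2",  // Placeholder value; use either Temperature or TopP, not both.
        MaxTokens = "200",    // Placeholder value; prompt tokens plus MaxTokens must fit the model's context length.
        ChatStorageConnectionSetting = "AzureWebJobsStorage",
        CollectionName = "ChatState")] AssistantState state)
{
    // Return the assistant's most recent reply, as in the earlier example.
    return new OkObjectResult(state.RecentMessages.Any()
        ? state.RecentMessages[state.RecentMessages.Count - 1].Content
        : "No response returned.");
}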
Annotations
The AssistantPost annotation enables you to define an assistant post input binding, which supports these parameters:
Element | Description |
---|---|
name | The name of the input binding. |
id | The ID of the assistant to update. |
userMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
aiConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chatModel | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or topP, but not both. |
topP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or topP, but not both. |
maxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
isReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Decorators
During the preview, define the assistant post input binding as a generic_input_binding binding of type assistantPost, which supports these parameters:
Parameter | Description |
---|---|
arg_name | The name of the variable that represents the binding parameter. |
id | The ID of the assistant to update. |
user_message | Gets or sets the user message for the chat completion model, encoded as a string. |
ai_connection_name | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chat_model | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or top_p, but not both. |
top_p | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or top_p, but not both. |
max_tokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
is_reasoning_model | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Configuration
The binding supports these configuration properties that you set in the function.json file.
Property | Description |
---|---|
type | Must be assistantPost. |
direction | Must be in. |
name | The name of the input binding. |
id | The ID of the assistant to update. |
userMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
aiConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chatModel | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or topP, but not both. |
topP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or topP, but not both. |
maxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
isReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Configuration
The binding supports these properties, which are defined in your code:
Property | Description |
---|---|
id | The ID of the assistant to update. |
userMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
aiConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chatModel | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or topP, but not both. |
topP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or topP, but not both. |
maxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
isReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Usage
See the Example section for complete examples.
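In these examples, the userMessage binding expression is {Query.message}, so the prompt is read from the message query string parameter of the HTTP request. As a minimal sketch of calling the function, assuming the default local Functions host URL (http://localhost:7071) and a hypothetical assistant ID of assistant123 that was already created:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class PostUserQueryClient
{
    static async Task Main()
    {
        using var client = new HttpClient();

        // The {assistantId} route parameter selects the assistant; the prompt
        // travels in the `message` query parameter, so the POST body can be empty.
        var url = "http://localhost:7071/api/assistants/assistant123?message=" +
                  Uri.EscapeDataString("How many Azure Functions hosting plans are there?");

        var response = await client.PostAsync(url, content: null);

        // The function returns the assistant's reply in the response body.
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}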