Important
The Azure OpenAI extension for Azure Functions is currently in preview.
The Azure OpenAI assistant post input binding lets you send prompts to assistant chat bots.
For information on setup and configuration details of the Azure OpenAI extension, see Azure OpenAI extensions for Azure Functions. To learn more about Azure OpenAI assistants, see Azure OpenAI Assistants API.
Note
References and examples are provided only for the Node.js v4 model and the Python v2 model. While both C# process models are supported, only isolated worker model examples are provided.
Example
This C# example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
/// <summary>
/// HTTP POST function that sends user prompts to the assistant chat bot.
/// </summary>
[Function(nameof(PostUserQuery))]
public static IActionResult PostUserQuery(
    [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = "assistants/{assistantId}")] HttpRequestData req,
    string assistantId,
    [AssistantPostInput("{assistantId}", "{Query.message}", ChatModel = "%CHAT_MODEL_DEPLOYMENT_NAME%", ChatStorageConnectionSetting = DefaultChatStorageConnectionSetting, CollectionName = DefaultCollectionName)] AssistantState state)
{
    return new OkObjectResult(state.RecentMessages.Any()
        ? state.RecentMessages[state.RecentMessages.Count - 1].Content
        : "No response returned.");
}
This Java example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
/*
 * HTTP POST function that sends user prompts to the assistant chat bot.
 */
@FunctionName("PostUserResponse")
public HttpResponseMessage postUserResponse(
    @HttpTrigger(
        name = "req",
        methods = {HttpMethod.POST},
        authLevel = AuthorizationLevel.ANONYMOUS,
        route = "assistants/{assistantId}")
    HttpRequestMessage<Optional<String>> request,
    @BindingName("assistantId") String assistantId,
    @AssistantPost(name="newMessages", id = "{assistantId}", chatModel = "%CHAT_MODEL_DEPLOYMENT_NAME%", userMessage = "{Query.message}", chatStorageConnectionSetting = DEFAULT_CHATSTORAGE, collectionName = DEFAULT_COLLECTION) AssistantState state,
    final ExecutionContext context) {
    List<AssistantMessage> recentMessages = state.getRecentMessages();
    String response = recentMessages.isEmpty() ? "No response returned." : recentMessages.get(recentMessages.size() - 1).getContent();
    return request.createResponseBuilder(HttpStatus.OK)
        .header("Content-Type", "application/json")
        .body(response)
        .build();
}
The following JavaScript and TypeScript examples demonstrate an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
const { app, input } = require("@azure/functions");

// Assumed values, matching the chat storage settings used in the
// function.json example elsewhere on this page.
const CHAT_STORAGE_CONNECTION_SETTING = "AzureWebJobsStorage";
const COLLECTION_NAME = "ChatState";

const assistantPostInput = input.generic({
    type: 'assistantPost',
    id: '{assistantId}',
    chatModel: '%CHAT_MODEL_DEPLOYMENT_NAME%',
    userMessage: '{Query.message}',
    chatStorageConnectionSetting: CHAT_STORAGE_CONNECTION_SETTING,
    collectionName: COLLECTION_NAME
})

app.http('PostUserResponse', {
    methods: ['POST'],
    route: 'assistants/{assistantId}',
    authLevel: 'anonymous',
    extraInputs: [assistantPostInput],
    handler: async (_, context) => {
        const chatState = context.extraInputs.get(assistantPostInput)
        const content = chatState.recentMessages[0].content
        return {
            status: 200,
            body: content,
            headers: {
                'Content-Type': 'text/plain'
            }
        };
    }
})
Here's the equivalent TypeScript example:
import { app, input } from "@azure/functions"

// Assumed values, matching the chat storage settings used in the
// function.json example elsewhere on this page.
const CHAT_STORAGE_CONNECTION_SETTING = "AzureWebJobsStorage";
const COLLECTION_NAME = "ChatState";

const assistantPostInput = input.generic({
    type: 'assistantPost',
    id: '{assistantId}',
    chatModel: '%CHAT_MODEL_DEPLOYMENT_NAME%',
    userMessage: '{Query.message}',
    chatStorageConnectionSetting: CHAT_STORAGE_CONNECTION_SETTING,
    collectionName: COLLECTION_NAME
})

app.http('PostUserResponse', {
    methods: ['POST'],
    route: 'assistants/{assistantId}',
    authLevel: 'anonymous',
    extraInputs: [assistantPostInput],
    handler: async (_, context) => {
        const chatState: any = context.extraInputs.get(assistantPostInput)
        const content = chatState.recentMessages[0].content
        return {
            status: 200,
            body: content,
            headers: {
                'Content-Type': 'text/plain'
            }
        };
    }
})
This PowerShell example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
Here's the function.json file for the post user query function:
{
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "Request",
      "route": "assistants/{assistantId}",
      "methods": [
        "post"
      ]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "Response"
    },
    {
      "name": "State",
      "type": "assistantPost",
      "direction": "in",
      "dataType": "string",
      "id": "{assistantId}",
      "userMessage": "{Query.message}",
      "chatModel": "%CHAT_MODEL_DEPLOYMENT_NAME%",
      "chatStorageConnectionSetting": "AzureWebJobsStorage",
      "collectionName": "ChatState"
    }
  ]
}
For more information about function.json file properties, see the Configuration section.
using namespace System.Net

param($Request, $TriggerMetadata, $State)

$recent_message_content = "No recent messages!"
if ($State.recentMessages.Count -gt 0) {
    $recent_message_content = $State.recentMessages[0].content
}

Push-OutputBinding -Name Response -Value ([HttpResponseContext]@{
    StatusCode = [HttpStatusCode]::OK
    Body = $recent_message_content
    Headers = @{
        "Content-Type" = "text/plain"
    }
})
This Python example demonstrates an HTTP POST function that sends user prompts to the assistant chat bot. The response to the prompt is returned in the HTTP response.
import json

import azure.functions as func

# Assumed setup: a v2-model function app (or blueprint) named `apis`, and
# chat storage settings matching the other examples on this page.
apis = func.FunctionApp()
DEFAULT_CHAT_STORAGE_SETTING = "AzureWebJobsStorage"
DEFAULT_CHAT_COLLECTION_NAME = "ChatState"


@apis.function_name("PostUserQuery")
@apis.route(route="assistants/{assistantId}", methods=["POST"])
@apis.assistant_post_input(
    arg_name="state",
    id="{assistantId}",
    user_message="{Query.message}",
    chat_model="%CHAT_MODEL_DEPLOYMENT_NAME%",
    chat_storage_connection_setting=DEFAULT_CHAT_STORAGE_SETTING,
    collection_name=DEFAULT_CHAT_COLLECTION_NAME,
)
def post_user_response(req: func.HttpRequest, state: str) -> func.HttpResponse:
    # Parse the JSON string into a dictionary
    data = json.loads(state)
    # Extract the content of the most recent message
    recent_message_content = data["recentMessages"][0]["content"]
    return func.HttpResponse(
        recent_message_content, status_code=200, mimetype="text/plain"
    )
Attributes
Apply the AssistantPostInput attribute to define an assistant post input binding, which supports these parameters:
Parameter | Description |
---|---|
Id | The ID of the assistant to update. |
UserMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
AIConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
ChatModel | Optional. Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
Temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either Temperature or TopP, but not both. |
TopP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either Temperature or TopP, but not both. |
MaxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
IsReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
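For example, the optional parameters can be set directly on the attribute. The following is an illustrative sketch only, not one of the product samples: the Temperature, MaxTokens, and chat storage values are placeholder choices that you would adjust for your own app.

[Function(nameof(PostUserQuery))]
public static IActionResult PostUserQuery(
    [HttpTrigger(AuthorizationLevel.Anonymous, "post", Route = "assistants/{assistantId}")] HttpRequestData req,
    string assistantId,
    [AssistantPostInput(
        "{assistantId}",
        "{Query.message}",
        ChatModel = "%CHAT_MODEL_DEPLOYMENT_NAME%",
        Temperature = "0.2",  // Placeholder value; use either Temperature or TopP, not both.
        MaxTokens = "200",    // Placeholder value; prompt tokens plus MaxTokens must fit the model's context length.
        ChatStorageConnectionSetting = "AzureWebJobsStorage",
        CollectionName = "ChatState")] AssistantState state)
{
    // Return the assistant's most recent reply, as in the earlier example.
    return new OkObjectResult(state.RecentMessages.Any()
        ? state.RecentMessages[state.RecentMessages.Count - 1].Content
        : "No response returned.");
}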
Annotations
The AssistantPost annotation enables you to define an assistant post input binding, which supports these parameters:
Element | Description |
---|---|
name | The name of the input binding. |
id | The ID of the assistant to update. |
userMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
aiConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chatModel | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or topP, but not both. |
topP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or topP, but not both. |
maxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
isReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Decorators
During the preview, define the assistant post input binding as a generic_input_binding binding of type assistantPost, which supports these parameters:
Parameter | Description |
---|---|
arg_name | The name of the variable that represents the binding parameter. |
id | The ID of the assistant to update. |
user_message | Gets or sets the user message for the chat completion model, encoded as a string. |
ai_connection_name | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chat_model | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or top_p, but not both. |
top_p | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or top_p, but not both. |
max_tokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
is_reasoning_model | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Configuration
The binding supports these configuration properties that you set in the function.json file.
Property | Description |
---|---|
type | Must be assistantPost. |
direction | Must be in. |
name | The name of the input binding. |
id | The ID of the assistant to update. |
userMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
aiConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chatModel | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or topP, but not both. |
topP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or topP, but not both. |
maxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
isReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Configuration
The binding supports these properties, which are defined in your code:
Property | Description |
---|---|
id | The ID of the assistant to update. |
userMessage | Gets or sets the user message for the chat completion model, encoded as a string. |
aiConnectionName | Optional. Gets or sets the name of the configuration section for AI service connectivity settings. For Azure OpenAI: if specified, looks for "Endpoint" and "Key" values in this configuration section. If not specified or the section doesn't exist, falls back to the environment variables AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_KEY. For user-assigned managed identity authentication, this property is required. For the OpenAI service (non-Azure), set the OPENAI_API_KEY environment variable. |
chatModel | Gets or sets the ID of the model to use as a string, with a default value of gpt-3.5-turbo. |
temperature | Optional. Gets or sets the sampling temperature to use, as a string between 0 and 2. Higher values, like 0.8, make the output more random, while lower values, like 0.2, make it more focused and deterministic. You should use either temperature or topP, but not both. |
topP | Optional. Gets or sets an alternative to sampling with temperature, called nucleus sampling, as a string. In this sampling method, the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. You should use either temperature or topP, but not both. |
maxTokens | Optional. Gets or sets the maximum number of tokens to generate in the completion, as a string with a default of 100. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2,048 tokens (except for the newest models, which support 4,096). |
isReasoningModel | Optional. Gets or sets a value indicating whether the chat completion model is a reasoning model. This option is experimental and associated with reasoning models until all models have parity in the expected properties, with a default value of false. |
Usage
See the Example section for complete examples.
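In these examples, the userMessage binding expression is {Query.message}, so the prompt is read from the message query string parameter of the HTTP request. As a minimal sketch of calling the function, assuming the default local Functions host URL (http://localhost:7071) and a hypothetical assistant ID of assistant123 that was already created:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class PostUserQueryClient
{
    static async Task Main()
    {
        using var client = new HttpClient();

        // The {assistantId} route parameter selects the assistant; the prompt
        // travels in the `message` query parameter, so the POST body can be empty.
        var url = "http://localhost:7071/api/assistants/assistant123?message=" +
                  Uri.EscapeDataString("How many Azure Functions hosting plans are there?");

        var response = await client.PostAsync(url, content: null);

        // The function returns the assistant's reply in the response body.
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}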