Building a Fully Functional AI Assistant with OpenAI (Series)

OpenAI
AI
Assistant
Agent
RAG
Tools
TypeScript
2025-01-26

A critical step in unlocking your assistant’s potential is function calling. By leveraging function calling, you can give your assistant “superpowers” like calling external APIs, submitting forms, or searching data sources—without having to manually parse freeform text responses.

In a nutshell, we can define specialized functions, let the model decide when to call them, parse the structured response, and incorporate that result back into chat. This pattern is a game-changer for bridging the gap between your assistant’s replies and programmatic actions or data fetches in real time.

Here's the basic process with function calling and structured outputs:

  1. Define a tool (for instance, a function to search the weather).
  2. The model sees your definitions and, if it decides a function is relevant, generates a tool call with JSON-encoded arguments.
  3. Your server (NestJS, Express, etc.) interprets this “function call” block, executes the actual logic, and returns the results.
  4. The assistant then uses those results in the conversation, presenting them neatly to the user.

Below, I’ll review how this shakes out in actual code, show why a “shared abilities” or “base class” approach is super helpful, and explain how structured outputs can ensure reliability by delivering strictly valid JSON your code can easily parse.

To illustrate, let's say we have a function called get_weather that fetches current temperatures. We define a tool object in JavaScript that includes “parameters” as a JSON Schema with a strict mode. In that schema, we specify the exact type of arguments this function expects. For example:

Defining a get_weather tool
import { OpenAI } from "openai";

const openai = new OpenAI();

const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current temperature for a given location.",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "City and country e.g. Bogotá, Colombia"
          }
        },
        required: ["location"],
        additionalProperties: false
      },
      strict: true
    }
  }
];

async function askAboutWeather() {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "user", content: "What is the weather like in Paris today?" }
    ],
    tools,
    store: true
  });

  console.log(completion.choices[0].message.tool_calls);
}

If the model decides to use our get_weather function, it will produce a structured JSON snippet:

Sample function call response
[ { "id": "call_12345xyz", "type": "function", "function": { "name": "get_weather", "arguments": "{\"location\":\"Paris, France\"}" } } ]

We can parse arguments, execute the external call, and feed the results back in to finalize the conversation. This beats searching for arguments in raw, unstructured text.

The assistant might generate more than one tool call in a single run, especially if the user’s request is broad:

Multiple calls in one response
[ { "id": "call_12345xyz", "type": "function", "function": { "name": "get_weather", "arguments": "{\"location\":\"Paris, France\"}" } }, { "id": "call_67890abc", "type": "function", "function": { "name": "get_weather", "arguments": "{\"location\":\"Bogotá, Colombia\"}" } } ]

We simply iterate over each call, parse the arguments, run the corresponding function, and gather the results so the next conversation message can incorporate them. This concept generalizes to sending email, searching knowledge bases, or any other specialized function.
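
To make that concrete, here’s a minimal round-trip sketch. It assumes the tools array from the earlier snippet is in scope, and getWeather is a hypothetical local stub standing in for a real weather API:

Handling tool calls and feeding results back
import { OpenAI } from "openai";

const openai = new OpenAI();

// Hypothetical stub standing in for a real weather API call.
async function getWeather(location: string): Promise<string> {
  return JSON.stringify({ location, temperature: 18, unit: "C" });
}

async function runConversation(userMessage: string) {
  const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
    { role: "user", content: userMessage }
  ];

  // First round trip: the model may answer directly or request tools.
  const first = await openai.chat.completions.create({
    model: "gpt-4o",
    messages,
    tools // the get_weather definition from the earlier snippet
  });
  const reply = first.choices[0].message;
  if (!reply.tool_calls?.length) return reply.content;

  // Echo the assistant's tool-call message, then answer each call with a
  // role: "tool" message keyed by its tool_call_id.
  messages.push(reply);
  for (const call of reply.tool_calls) {
    if (call.type !== "function") continue;
    const args = JSON.parse(call.function.arguments);
    // A real app would dispatch on call.function.name.
    const result = await getWeather(args.location);
    messages.push({ role: "tool", tool_call_id: call.id, content: result });
  }

  // Second round trip: the model now sees the tool results.
  const second = await openai.chat.completions.create({
    model: "gpt-4o",
    messages,
    tools
  });
  return second.choices[0].message.content;
}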

Suppose we have a BaseAssistant class that defines a set of built-in tools (Weather, Email, GPT-based code interpreter, etc.). All specialized assistants—like a TravelBot, FinanceBot, or ShoppingBot—extend that base class. Now, each derived assistant automatically gets:

  • Consistent function calling logic.
  • Shared “tool-handling” code that interprets function args.
  • DRY expansions—add a new tool once, all your assistants can use it.

This approach is especially handy if you integrate many different function calls, from “search_knowledge_base” to “send_email” or “submit_form.” Keep the code consistent in one place, and your chat-based apps remain super flexible.
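
As a rough sketch, here’s one way that base class might look. The names BaseAssistant, registerTool, and handleToolCall are illustrative choices of mine, not anything from the OpenAI SDK, and the handler reuses the getWeather stub from the previous snippet:

A shared BaseAssistant sketch
// Local types for this sketch; a real app might use the SDK's tool types.
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description?: string;
    parameters: Record<string, unknown>;
    strict?: boolean;
  };
}

abstract class BaseAssistant {
  private handlers = new Map<string, ToolHandler>();
  protected tools: ToolDefinition[] = [];

  // Register a tool definition together with the code that backs it.
  protected registerTool(definition: ToolDefinition, handler: ToolHandler) {
    this.tools.push(definition);
    this.handlers.set(definition.function.name, handler);
  }

  // One shared code path for interpreting any tool call the model emits.
  protected async handleToolCall(call: {
    id: string;
    function: { name: string; arguments: string };
  }): Promise<string> {
    const handler = this.handlers.get(call.function.name);
    if (!handler) throw new Error(`Unknown tool: ${call.function.name}`);
    return handler(JSON.parse(call.function.arguments));
  }
}

// A derived assistant only declares its tools; the calling logic is inherited.
class TravelBot extends BaseAssistant {
  constructor() {
    super();
    this.registerTool(
      {
        type: "function",
        function: {
          name: "get_weather",
          description: "Get current temperature for a given location.",
          parameters: {
            type: "object",
            properties: { location: { type: "string" } },
            required: ["location"],
            additionalProperties: false
          },
          strict: true
        }
      },
      async (args) => getWeather(String(args.location)) // stub from earlier
    );
  }
}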

Sometimes, you may want the assistant’s final response to come in strict JSON form, not just freeform text. Structured outputs help you guarantee the model’s final response is valid JSON that matches a JSON Schema you’ve defined. That means no more guesswork about whether the model might produce partial or invalid JSON.

With strict mode, the model is forced to respect your schema exactly: an object containing specific keys, each with a declared type or an enum of allowed string values. This is perfect for building a custom UI, a data extraction pipeline, or anything that demands guaranteed structured output. For example:

Strict structured output schema
{ "type": "object", "properties": { "title": { "type": "string" }, "authors": { "type": "array", "items": { "type": "string" } }, "abstract": { "type": "string" }, "keywords": { "type": "array", "items": { "type": "string" } } }, "required": ["title", "authors", "abstract", "keywords"], "additionalProperties": false }

Then, when you include that schema as the response_format in your OpenAI call, the assistant will produce exactly those keys. Because you can nest arrays, objects, and references (via $defs), you can represent practically any data structure you need.
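
For instance, here is how that schema might be wired into a call. The names paperMetadataSchema and paperText are placeholders of mine; the json_schema response format itself is standard:

Requesting structured output
// Assumes the openai client from earlier; paperMetadataSchema holds the
// JSON Schema shown above.
async function extractMetadata(paperText: string) {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "Extract metadata from the paper provided by the user." },
      { role: "user", content: paperText }
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "paper_metadata", // an arbitrary label for this schema
        strict: true,
        schema: paperMetadataSchema
      }
    }
  });

  // Barring a refusal, the content is guaranteed to match the schema.
  return JSON.parse(completion.choices[0].message.content!);
}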

While function calling is about letting your AI trigger server-side code, structured outputs ensure that the AI’s final user-facing or internal data is consistently formatted. Often we use both: function calls to do “actions” and structured outputs for the final user response. This synergy is an incredible leap in building robust, production-ready AI-driven apps.
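
The two compose in a single request. Here’s a quick sketch reusing the earlier tools array, with finalAnswerSchema standing in for whatever final-response schema you’ve defined; once any tool-call round trips complete, the closing message conforms to that schema:

Combining tool calls with a structured final response
// tools lets the model act; response_format constrains the final answer.
// finalAnswerSchema is a placeholder name, not part of the SDK.
const completion = await openai.chat.completions.create({
  model: "gpt-4o",
  messages,
  tools, // the model may call get_weather first
  response_format: {
    type: "json_schema",
    json_schema: { name: "final_answer", strict: true, schema: finalAnswerSchema }
  }
});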

By now, our assistant flow includes:

  • User asks for something. The assistant sees the system instructions (including available tools & function schemas).
  • The assistant may produce a function call if needed—like get_weather({ location: 'Paris' }).
  • Our backend executes that function, returning structured data or a success/failure message.
  • The assistant “absorbs” that tool result into its next response. If we also want the final output to be a certain schema, we instruct the model to format it as strict JSON.
[Figure: Assistant flow chart]

This combined approach—function calling plus optionally structured outputs—delivers reliability. No more hacking around the AI’s sometimes freestyle text. You get the best of both worlds: the natural conversation power of LLMs and the reliability of typed function calls and JSON structures.

  • Function Calling is brilliant for bridging “AI conversation” with real-world data fetching and side-effect actions in your app.
  • Structured Outputs help you reliably parse final assistant responses (or function calls) in a strict JSON format you define.
  • A Base Class / Shared Abilities approach helps you maintain a single code path for function invocation and result handling. That means less duplication and easier expansions over time.

With these patterns, you can do everything from building a customer support agent that retrieves user data from your database, to an analytics bot that runs code transformations on the fly, to a knowledge-base agent that always returns a valid JSON structure. The possibilities are limitless—and in my experience, it’s unbelievably empowering the first time you see your “assistant” fetch real data in code and respond seamlessly.