# Tools

You can register tools (also known as functions) that the LLM may decide to call as part of assembling the answer. See OpenAI functions, Ollama tools, or Anthropic tool use.

Not all LLM models support tools natively. In those cases, GenAIScript also supports a fallback mechanism that implements tool calls through system prompts (see Fallback Tools).


`defTool` is used to define a tool that can be called by the LLM. It takes a name, a description, a parameters schema, and a callback that returns a string. The parameters can be declared as a full JSON schema or as an example object from which the schema is inferred, as in the snippet below.

The LLM decides to call this tool on its own!

```js
defTool(
    "current_weather",
    "get the current weather",
    {
        city: "",
    },
    (args) => {
        const { city } = args
        if (city === "Brussels") return "sunny"
        else return "variable"
    }
)
```

In the example above, we define a tool called `current_weather` that takes a city as input and returns the weather.
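A prompt that mentions a city is usually enough for the model to decide to call the tool. As a minimal sketch (the prompt wording here is illustrative):

```js
// the model may call current_weather on its own to answer this
$`What is the current weather in Brussels?`
```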

## Weather tool example


This example uses the `get_current_weather` tool to get the weather for Brussels.

```js
script({
    model: "small",
    title: "Weather as function",
    description:
        "Query the weather for each city using a dummy weather function",
    temperature: 0.5,
    files: "src/cities.md",
    tests: {
        files: "src/cities.md",
        keywords: "Brussels",
    },
})

$`Query the weather for each listed city and return the results as a table.`

def("CITIES", env.files)

defTool(
    "get_current_weather",
    "get the current weather",
    {
        type: "object",
        properties: {
            location: {
                type: "string",
                description: "The city and state, e.g. San Francisco, CA",
            },
        },
        required: ["location"],
    },
    (args) => {
        const { context, location } = args
        const { trace } = context
        trace.log(`Getting weather for ${location}...`)
        let content = "variable"
        if (location === "Brussels") content = "sunny"
        return content
    }
)
```

## Math tool example


This example defines `sum` and `divide` tools that the LLM chains to evaluate a math expression.

```js
script({
    title: "math-agent",
    model: "small",
    description: "A port of https://ts.llamaindex.ai/examples/agent",
    parameters: {
        question: {
            type: "string",
            default: "How much is 11 + 4? then divide by 3?",
        },
    },
    tests: {
        description: "Testing the default prompt",
        keywords: "5",
    },
});

defTool(
    "sum",
    "Use this function to sum two numbers",
    { a: 1, b: 2 },
    ({ a, b }) => {
        console.log(`${a} + ${b}`);
        return `${a + b}`;
    }
);

defTool(
    "divide",
    "Use this function to divide two numbers",
    {
        type: "object",
        properties: {
            a: {
                type: "number",
                description: "The first number",
            },
            b: {
                type: "number",
                description: "The second number",
            },
        },
        required: ["a", "b"],
    },
    ({ a, b }) => {
        console.log(`${a} / ${b}`);
        return `${a / b}`;
    },
);

$`Answer the following arithmetic question:

${env.vars.question}
`;
```

## Reusing tools in system scripts


You can define tools in a system script and include them in your main script like any other system script or tool.

```ts
system({ description: "Random tools" })

export default function (ctx: ChatGenerationContext) {
    const { defTool } = ctx
    defTool("random", "Generate a random number", {}, () => Math.random())
}
```

- Make sure to use `system` instead of `script` in the system script.

```js
script({
    title: "Random number",
    tools: ["random"],
})

$`Generate a random number.`
```

## Multiple instances of the same system script


You can include the same system script multiple times in a script with different parameters.

```js
script({
    system: [
        "system.agent_git", // git operations on current repository
        {
            id: "system.agent_git", // same system script
            parameters: { repo: "microsoft/genaiscript" }, // but with new parameters
            variant: "genaiscript", // appended to the identifier to keep tool identifiers unique
        },
    ],
})
```

## Model Context Protocol Tools


Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools.

You can leverage MCP servers to provide tools to your LLM.

```js
defTool({
    memory: {
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-memory"],
    },
})
```

See Model Context Protocol Tools for more information.

## Fallback Tools

Some LLM models do not have built-in tool support. For those models, it is possible to enable tool support through system prompts. The performance may be lower than with built-in tools, but tools remain usable.

Fallback tool support is implemented in `system.tool_calls`, which "teaches" the LLM how to call tools. When this mode is enabled, you will see the tool call tokens being emitted by the LLM.

GenAIScript maintains a list of well-known models that do not support tools, so the fallback is enabled automatically for those models.

To enable this mode, you can either

- add the `fallbackTools` option to the script:

```js
script({
    fallbackTools: true,
})
```

- or add the `--fallback-tools` flag to the CLI:

```sh
npx genaiscript run ... --fallback-tools
```

## Prompt Injection Detection


A tool may retrieve data that contains prompt injection attacks; for example, a tool that fetches a URL may return a page containing one.

To prevent this, you can enable the `detectPromptInjection` option. It runs your configured content safety scanner services on the tool output and erases the answer if an attack is detected.

defTool("fetch", "Fetch a URL", {

url: {

type: "string",

description: "The URL to fetch",

},

}, async (args) => ...,

{

detectPromptInjection: "always",

})

## Output Intent validation


You can configure GenAIScript to execute an LLM-as-a-Judge validation of the tool result, based on the tool description or a custom intent. The validation happens on every tool response, using the `intent` model alias, which maps to `small` by default.

The `description` intent is a special value that gets expanded to the tool description.

```js
defTool(
    "fetch",
    "Gets the live weather",
    {
        location: "Seattle",
    },
    async (args) => { ... },
    {
        intent: "description",
    }
)
```

## Packaging as System scripts


To pick and choose which tools to include in a script, you can group them in system scripts. For example, the `current_weather` tool can be packaged in the `system.current_weather.genai.mjs` script.

```js
system({
    title: "Get the current weather",
})
defTool("current_weather", ...)
```

Then use the tool identifier in the `tools` field.

```js
script({
    ...,
    tools: ["current_weather"],
})
```

Let’s illustrate how tools come together with a question answering script.

In the script below, we add the `retrieval_web_search` tool. This tool calls into `retrieval.webSearch` as needed.

```js
script({
    title: "Answer questions",
    tools: ["retrieval_web_search"],
})

def("FILES", env.files)

$`Answer the questions in FILES using a web search.

- List a summary of the answers and the sources used to create the answers.
`
```

We can then apply this script to the questions.md file below.

```md
- What is the weather in Seattle?
- What laws were voted in the USA congress last week?
```

After the first request, the LLM asks to call `retrieval_web_search` for each question. The web search answers are added to the message history and the request is made again. The second request yields the final result, which includes the web search results.
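Assuming the script is saved under a hypothetical identifier such as `answer-questions`, you could run it against the questions file with the CLI:

```sh
# "answer-questions" is a hypothetical script id; use your own
npx genaiscript run answer-questions questions.md
```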

## Builtin tools

- `fetch`: Fetch data from a URL from allowed domains.
- `fs_ask_file`: Runs an LLM query over the content of a file. Use this tool to extract information from a file.
- `fs_diff_files`: Computes a diff between two different files. Use git diff instead to compare versions of a file.
- `fs_find_files`: Finds files matching a glob pattern. Use pattern to specify a regular expression to search for in the file content. Be careful about asking for too many files.
- `fs_read_file`: Reads a file as text from the file system. Returns undefined if the file does not exist.
- `fs_write_file`: Writes text content to a file in the workspace. The file will be created if it doesn't exist, and parent directories will be created as needed. Only files within the current workspace are allowed to be written.
- `git_status`: Generates a status of the repository using the git client.
- `git_diff`: Computes file diffs using the git diff command. If the diff is too large, it returns the list of modified/added files.
- `github_actions_job_logs_get`: Downloads a GitHub workflow job log. If the log is too large, use 'github_actions_job_logs_diff' to compare logs.
- `math_eval`: Evaluates a math expression. Do NOT try to compute arithmetic operations yourself; use this tool.
- `md_find_files`: Gets the file structure of the documentation markdown/MDX files. Returns filename, title, and description for each match. Use pattern to specify a regular expression to search for in the file content.
- `meta_prompt`: Tool that applies OpenAI's meta prompt guidelines to a user prompt. Modified from https://platform.openai.com/docs/guides/prompt-generation?context=text-out.
- `meta_schema`: Generates a valid JSON schema for the described JSON. Source: https://platform.openai.com/docs/guides/prompt-generation?context=structured-output-schema.
- `node_test`: Builds and tests the current project using `npm test`.
- `python_code_interpreter_run`: Executes Python 3.12 code for data analysis tasks in a Docker container. The process output is returned. Do not generate visualizations. The only packages available are numpy===2.1.3, pandas===2.2.3, scipy===1.14.1, matplotlib===3.9.2. There is NO network connectivity. Do not attempt to install other packages or make web requests. You must copy all the necessary files or pass all the data because the Python code runs in a separate container.
- `resource_list`: Lists available resources from the host. Returns a list of available resource URIs and their descriptions.
- `resource_read`: Reads the content of a resource from a URL. Resolves various protocols and returns the content of the files found at the URL.
- `think`: Use the tool to think about something. It will not obtain new information or change the database, but just append the thought to the log. Use it when complex reasoning or some cache memory is needed.
- `transcribe`: Generates a transcript from an audio/video file using a speech-to-text model.
- `video_probe`: Probes a video file and returns the metadata information.
- `video_extract_audio`: Extracts audio from a video file into an audio file. Returns the audio filename.
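Builtin tools are enabled like any other packaged tool through the `tools` field. As a minimal sketch (the title and prompt are illustrative), this script lets the model do arithmetic and read workspace files:

```js
script({
    title: "Builtin tools demo", // illustrative title
    tools: ["math_eval", "fs_read_file"],
})

// the model may call fs_read_file to load the file, then math_eval for the arithmetic
$`Count the lines in README.md and multiply the count by 2.`
```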