Defining Tool Schemas
A tool's inputSchema is how the LLM learns what arguments to supply. A well-written schema produces reliable tool calls; a poorly written one produces hallucinated or missing arguments. This page covers JSON Schema best practices for MCP tools.
How the LLM Uses Your Schema
When Octopus registers a tool with the LLM, it passes:
- The tool name — what to call
- The tool description — when to call it and what it returns
- The inputSchema — exactly what arguments to supply
The LLM reads the description field on every property to decide what value to provide. Missing or vague descriptions force the LLM to guess, which results in incorrect or hallucinated argument values at runtime.
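Because every property needs a description, it can help to check for gaps mechanically. The following is a minimal sketch, not part of Octopus: `missing_descriptions` and `example_schema` are hypothetical names introduced here, and the walker only follows `properties` and `items`.

```python
def missing_descriptions(schema, path=""):
    """Return dotted paths of properties that lack a non-empty description."""
    missing = []
    for name, prop in schema.get("properties", {}).items():
        prop_path = f"{path}.{name}" if path else name
        if not str(prop.get("description", "")).strip():
            missing.append(prop_path)
        # Recurse into nested objects.
        if prop.get("type") == "object":
            missing.extend(missing_descriptions(prop, prop_path))
        # Recurse into array item schemas, marking the path with [].
        items = prop.get("items")
        if isinstance(items, dict):
            missing.extend(missing_descriptions(items, prop_path + "[]"))
    return missing


# Hypothetical schema used only to exercise the function.
example_schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "description": "The contact's primary email address"},
        "date_range": {
            "type": "object",
            "description": "Inclusive date range for the query",
            "properties": {
                "from": {"type": "string", "format": "date"},
                "to": {"type": "string", "format": "date", "description": "End date (YYYY-MM-DD)"},
            },
        },
    },
}

print(missing_descriptions(example_schema))
```

Running a check like this in a test suite surfaces undocumented properties before the LLM ever sees the schema.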
Schema Anatomy
{
  "type": "object",
  "properties": {
    // Each key is an argument the LLM can supply
    "email": {
      "type": "string",
      "format": "email",
      "description": "The contact's primary email address (e.g. jane@example.com)"
    },
    "include_notes": {
      "type": "boolean",
      "description": "Set to true to include internal CRM notes in the response",
      "default": false
    },
    "max_results": {
      "type": "integer",
      "description": "Maximum number of records to return (1-100)",
      "minimum": 1,
      "maximum": 100,
      "default": 10
    }
  },
  "required": ["email"],
  "additionalProperties": false
}
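To make the effect of each keyword concrete, here is a rough sketch of how a handler might check incoming arguments against the schema above. It is an illustration only: `CONTACT_SCHEMA` and `check_arguments` are hypothetical names, the function covers just `type`, `required`, `additionalProperties`, `minimum` and `maximum`, and a full JSON Schema validator would normally take its place.

```python
# The schema above as a Python dict (descriptions omitted for brevity).
CONTACT_SCHEMA = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "format": "email"},
        "include_notes": {"type": "boolean", "default": False},
        "max_results": {"type": "integer", "minimum": 1, "maximum": 100, "default": 10},
    },
    "required": ["email"],
    "additionalProperties": False,
}

PYTHON_TYPES = {"string": str, "boolean": bool, "integer": int}


def check_arguments(schema, args):
    """Return a list of problems found in args; an empty list means they pass."""
    problems = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            # Unknown keys are only an error when additionalProperties is false.
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected argument: {name}")
            continue
        expected = PYTHON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for integers explicitly.
        wrong_type = expected is not None and (
            not isinstance(value, expected)
            or (expected is int and isinstance(value, bool))
        )
        if wrong_type:
            problems.append(f"{name}: expected {spec['type']}")
            continue
        if isinstance(value, int) and not isinstance(value, bool):
            if "minimum" in spec and value < spec["minimum"]:
                problems.append(f"{name}: must be >= {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                problems.append(f"{name}: must be <= {spec['maximum']}")
    return problems


print(check_arguments(
    CONTACT_SCHEMA,
    {"email": "jane@example.com", "max_results": 500, "verbose": True},
))
```

The call at the end reports the out-of-range `max_results` and the undeclared `verbose` key, which are the two kinds of error that `maximum` and `additionalProperties` exist to catch.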
Property Types and Formats
| JSON Schema Type | Use For | Useful Constraints |
|---|---|---|
| string | Text, IDs, names, free-form input | minLength, maxLength, pattern, format |
| string + enum | Fixed-value choices | List every valid value in the enum array |
| integer | Whole numbers, counts, IDs | minimum, maximum |
| number | Decimal amounts, percentages | minimum, maximum, multipleOf |
| boolean | Flags, include/exclude toggles | Always add a default |
| array | Lists of values (emails, tags) | items schema, minItems, maxItems |
| object | Nested structures | Define properties + required recursively |
Useful String Formats
| Format Value | Meaning | Example |
|---|---|---|
| email | RFC 5321 email address | "jane@example.com" |
| date | ISO 8601 date (YYYY-MM-DD) | "2024-06-15" |
| date-time | ISO 8601 date-time with a UTC offset (RFC 3339) | "2024-06-15T14:30:00Z" |
| uri | Absolute URI | "https://example.com/page" |
| uuid | UUID (RFC 4122) | "550e8400-e29b-41d4-a716-..." |
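Depending on the validator and how it is configured, `format` may be treated as an annotation rather than enforced, so a handler may still want to parse formatted strings itself. The sketch below uses only the Python standard library; `parse_formatted` is a hypothetical helper, not an Octopus or MCP API, and it handles only three of the formats in the table.

```python
import uuid
from datetime import date, datetime


def parse_formatted(value, fmt):
    """Parse a string according to a JSON Schema format name.

    Raises ValueError when the value does not match the format.
    Formats not handled here are returned unchanged.
    """
    if fmt == "date":
        return date.fromisoformat(value)
    if fmt == "date-time":
        # Older Python versions reject a trailing "Z", so normalise it first.
        if value.endswith("Z"):
            value = value[:-1] + "+00:00"
        return datetime.fromisoformat(value)
    if fmt == "uuid":
        return uuid.UUID(value)
    return value


print(parse_formatted("2024-06-15", "date"))
print(parse_formatted("2024-06-15T14:30:00Z", "date-time"))

try:
    parse_formatted("next Tuesday", "date")
except ValueError as exc:
    print("rejected:", exc)
```

Parsing in the handler turns a loosely formatted argument into an explicit error the LLM can react to, rather than a bad value passed downstream.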
Writing Effective Tool Descriptions
The tool-level description tells the LLM when to call the tool. The property-level description tells the LLM what value to supply. Both must be written for the LLM, not for human documentation.
// GOOD: the tool description tells the LLM when to call the tool and what it returns
{
  "name": "crm_get_contact",
  "description": "Look up a CRM contact by their email address. Call this when the user asks about a specific person, customer, or lead. Returns the contact's name, account, phone number, and open opportunities."
}
// BAD: too vague; the LLM may not know when to use it
{
  "name": "crm_get_contact",
  "description": "Gets a contact."
}
// GOOD: the property description tells the LLM exactly what to supply
"start_date": {
  "type": "string",
  "format": "date",
  "description": "Start of the date range in YYYY-MM-DD format. Use the date the user mentioned, or today's date if not specified."
}
// BAD: the LLM may not format the date correctly
"start_date": {
  "type": "string",
  "description": "Start date"
}
Arrays and Nested Objects
// Array of email addresses
"recipients": {
  "type": "array",
  "description": "List of recipient email addresses (at least one required)",
  "items": {
    "type": "string",
    "format": "email"
  },
  "minItems": 1,
  "maxItems": 10
}
// Nested object: date range
"date_range": {
  "type": "object",
  "description": "Inclusive date range for the query",
  "properties": {
    "from": {
      "type": "string",
      "format": "date",
      "description": "Start date (YYYY-MM-DD)"
    },
    "to": {
      "type": "string",
      "format": "date",
      "description": "End date (YYYY-MM-DD, inclusive)"
    }
  },
  "required": ["from", "to"]
}
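The date_range schema above constrains each field on its own but does not say that "from" must not come after "to". One option is to check that relationship in the handler. The following sketch assumes a hypothetical helper named `check_date_range` and uses only the standard library.

```python
from datetime import date


def check_date_range(date_range):
    """Parse a {"from": ..., "to": ...} object and verify that from <= to.

    Returns the two parsed dates, or raises ValueError with a message
    that can be passed back to the LLM.
    """
    start = date.fromisoformat(date_range["from"])
    end = date.fromisoformat(date_range["to"])
    if start > end:
        raise ValueError(f"'from' ({start}) is after 'to' ({end})")
    return start, end


print(check_date_range({"from": "2024-06-01", "to": "2024-06-15"}))

try:
    check_date_range({"from": "2024-06-15", "to": "2024-06-01"})
except ValueError as exc:
    print("rejected:", exc)
```

Returning a specific error message gives the LLM enough information to correct the arguments and retry.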
Optional vs Required Parameters
Put genuinely optional parameters outside the required array and give them a meaningful default. The LLM will omit them when not needed, which makes tool calls shorter and cheaper.
{
  "type": "object",
  "properties": {
    "query": {
      "type": "string",
      "description": "Search query (required)"
    },
    "status": {
      "type": "string",
      "enum": ["open", "closed", "all"],
      "description": "Filter by status. Default is 'open'.",
      "default": "open"
    },
    "max_results": {
      "type": "integer",
      "description": "Maximum tickets to return (1–50). Default is 10.",
      "default": 10,
      "minimum": 1,
      "maximum": 50
    }
  },
  "required": ["query"] // only query is mandatory
}
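In JSON Schema, `default` is an annotation: validators generally do not insert the value into the data, so when the LLM omits an optional argument the handler still has to fill it in. A minimal sketch of that step follows, assuming a hypothetical `apply_defaults` helper and the ticket-search schema above reduced to the relevant keywords.

```python
# The schema above as a Python dict (descriptions omitted for brevity).
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "status": {"type": "string", "enum": ["open", "closed", "all"], "default": "open"},
        "max_results": {"type": "integer", "default": 10, "minimum": 1, "maximum": 50},
    },
    "required": ["query"],
}


def apply_defaults(schema, args):
    """Return a copy of args with schema defaults filled in for omitted keys."""
    merged = dict(args)
    for name, spec in schema.get("properties", {}).items():
        if name not in merged and "default" in spec:
            merged[name] = spec["default"]
    return merged


print(apply_defaults(TICKET_SCHEMA, {"query": "login error"}))
print(apply_defaults(TICKET_SCHEMA, {"query": "login error", "status": "all"}))
```

Reading defaults from the schema keeps the value stated in the description and the value used by the handler from drifting apart.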
Common Schema Mistakes
| Mistake | Effect | Fix |
|---|---|---|
| No description on required fields | LLM guesses the value; wrong calls at runtime | Add clear, LLM-readable descriptions to every property |
| Required fields missing from required array | LLM may omit them; handler throws null ref | List every mandatory param in required |
| Using any type or no type | LLM sends arbitrary JSON; handler fails | Always specify "type" |
| Enum values not matching handler expectations | Handler receives unexpected string; runtime error | Sync enum values with your handler's switch/match cases |
| Date fields typed as string without format | LLM sends "next Tuesday" instead of a date | Add "format": "date" or "date-time" |
| No additionalProperties: false | Extra LLM-generated fields silently ignored or stored | Add "additionalProperties": false to top-level schema |
Test your schemas by calling GET /tools and validating each inputSchema with a JSON Schema validator (e.g. Newtonsoft.Json.Schema or the jsonschema Python library). Malformed schemas are surfaced to the LLM as-is and can cause silent failures.
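Several of the mistakes in the table can also be caught with a small lint pass over the tool definitions. The sketch below is an assumption-laden illustration: `lint_tool` is a hypothetical function, and it takes tool definitions as plain dicts with `name` and `inputSchema` keys because the shape of the GET /tools response is not described here. It complements a JSON Schema validator and does not replace one.

```python
def lint_tool(tool):
    """Return a list of common schema mistakes found in one tool definition."""
    schema = tool.get("inputSchema")
    if not isinstance(schema, dict):
        return ["inputSchema is missing or not an object"]
    problems = []
    if schema.get("type") != "object":
        problems.append('top-level "type" should be "object"')
    props = schema.get("properties", {})
    # Every name in "required" must be declared under "properties".
    for name in schema.get("required", []):
        if name not in props:
            problems.append(f'"{name}" is required but not defined in properties')
    # Every property should declare a type.
    for name, spec in props.items():
        if "type" not in spec:
            problems.append(f'"{name}" has no "type"')
    if schema.get("additionalProperties") is not False:
        problems.append('"additionalProperties" is not set to false')
    return problems


# Hypothetical tool definitions used only to exercise the lint.
tools = [
    {
        "name": "crm_get_contact",
        "inputSchema": {
            "type": "object",
            "properties": {"email": {"type": "string", "format": "email"}},
            "required": ["email"],
            "additionalProperties": False,
        },
    },
    {
        "name": "search_tickets",
        "inputSchema": {
            "type": "object",
            "properties": {"q": {"description": "Search query"}},
            "required": ["q", "limit"],
        },
    },
]

for tool in tools:
    print(tool["name"], lint_tool(tool))
```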