Building Conversational AI with LLMs and Agents
Appendix P: Semantic Kernel: Enterprise AI Orchestration

Planners: Sequential, Stepwise, and Handlebars

Big Picture

Planners are Semantic Kernel's mechanism for automatic orchestration. Instead of manually chaining function calls, you describe a goal in natural language and the planner figures out which functions to call and in what order. SK has evolved through several planner generations: the legacy SequentialPlanner and StepwisePlanner, the Handlebars planner, and the current recommended approach of automatic function calling with OpenAI-compatible models. This section covers all four approaches so you can work with both legacy and modern codebases.

1. Why Planners Exist

In Section P.1, you learned to invoke functions manually: call the converter, then pass the result to the writer. This works for simple pipelines, but real applications often require dynamic routing. The user might ask "What is the weather in Paris in Fahrenheit?" which requires calling a weather API (to get Celsius), then a unit converter, then a natural language formatter. A planner inspects the available functions, interprets the user's goal, and assembles the execution plan automatically.

The core tradeoff is control versus flexibility. Manual orchestration gives you full control but requires anticipating every workflow. Planners give you flexibility but introduce LLM reasoning into the control flow, which can be unpredictable.

2. Automatic Function Calling (Recommended)

The modern approach in Semantic Kernel uses the model's native function-calling capability (supported by GPT-4o, GPT-4 Turbo, and compatible models). You register your plugins, enable auto function calling in the execution settings, and the model decides which functions to invoke.

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from semantic_kernel.connectors.ai.open_ai import OpenAIChatPromptExecutionSettings
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.contents import ChatHistory

kernel = sk.Kernel()
kernel.add_service(
    OpenAIChatCompletion(service_id="chat", ai_model_id="gpt-4o")
)

# Register plugins (from Section P.1)
kernel.add_plugin(MathPlugin(), plugin_name="Math")
kernel.add_plugin(WeatherPlugin(api_key="..."), plugin_name="Weather")

# Enable automatic function calling
settings = OpenAIChatPromptExecutionSettings(
    service_id="chat",
    function_choice_behavior=FunctionChoiceBehavior.Auto(),
)

# Create a chat history and ask a question
history = ChatHistory()
history.add_user_message(
    "What is the weather in Tokyo? Also, what is 42 times 17?"
)

from semantic_kernel.functions import KernelArguments

result = await kernel.invoke_prompt(
    prompt="{{$chat_history}}",
    arguments=KernelArguments(chat_history=history, settings=settings),
)
print(result)
# The model calls Weather.get_current_weather("Tokyo")
# and Math.multiply(42, 17) automatically
Key Insight

FunctionChoiceBehavior.Auto() lets the model call zero or more functions per turn. Use FunctionChoiceBehavior.Required() to force at least one function call, or FunctionChoiceBehavior.NoneInvoke() to send function schemas without allowing calls (useful for dry runs).

3. Controlling Function Visibility

You may not want every registered function to be available for automatic calling. SK lets you filter which functions the model can see using include and exclude lists.

# Only expose specific functions to the model
settings = OpenAIChatPromptExecutionSettings(
    service_id="chat",
    function_choice_behavior=FunctionChoiceBehavior.Auto(
        filters={
            "included_plugins": ["Weather"],
            # Or use "excluded_functions": ["Math.add"]
        }
    ),
)

# Now the model can only call Weather plugin functions,
# even though Math is still registered on the kernel

This filtering is essential for security. If your kernel has an "AdminPlugin" with a "delete_user" function, you certainly do not want a customer-facing chatbot to discover and invoke it.
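The include/exclude semantics can be sketched without SK itself. A minimal sketch, assuming a registry of "Plugin.function" names (the helper and its signature are illustrative, not an SK API):

```python
def visible_functions(registry, included_plugins=None, excluded_functions=None):
    """Return the fully qualified names the model is allowed to see.

    registry: iterable of "Plugin.function" names.
    An include list, when present, restricts visibility to those plugins;
    an exclude list then removes individual functions by qualified name.
    """
    visible = []
    for name in registry:
        plugin, _, _func = name.partition(".")
        if included_plugins is not None and plugin not in included_plugins:
            continue
        if excluded_functions and name in excluded_functions:
            continue
        visible.append(name)
    return visible

registry = ["Weather.get_current_weather", "Math.add", "Admin.delete_user"]

# Only the Weather plugin is exposed, even though all three are registered
print(visible_functions(registry, included_plugins=["Weather"]))

# Or keep everything except one dangerous function
print(visible_functions(registry, excluded_functions=["Admin.delete_user"]))
```

Allow-listing (`included_plugins`) is the safer default for customer-facing surfaces: new plugins added to the kernel stay invisible until you opt them in.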

4. The Handlebars Planner

The Handlebars planner generates a Handlebars template that orchestrates function calls. Unlike automatic function calling (which relies on the model's built-in tool-use protocol), the Handlebars planner asks the LLM to produce a plan as structured text, then executes that plan.

from semantic_kernel.planners.handlebars_planner import HandlebarsPlanner

planner = HandlebarsPlanner(
    service_id="chat",
    options={
        "excluded_plugins": ["Admin"],
        "max_tokens": 2000,
    },
)

# Generate a plan from a natural language goal
plan = await planner.create_plan(
    kernel=kernel,
    goal="Get the weather in London and convert the temperature to Fahrenheit",
)

# Inspect the generated plan (it's a Handlebars template)
print(plan.generated_plan)
# {{#with (Weather-get_current_weather city="London")}}
#   {{#with (Units-celsius_to_fahrenheit celsius=this.temp)}}
#     The temperature in London is {{this}} F.
#   {{/with}}
# {{/with}}

# Execute the plan
result = await plan.invoke(kernel)
print(result)

The Handlebars planner is useful when you need to inspect, modify, or cache plans before execution. Since the plan is a text template, you can log it, show it to a user for approval, or store it for repeated use.

Tip

If a Handlebars plan fails, examine the generated template. Common issues include the LLM using incorrect function names, passing wrong argument types, or misunderstanding function descriptions. Improving your @kernel_function descriptions is usually the most effective fix.
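Descriptions matter because they are the only signal the planner has when matching functions to a goal. A minimal sketch of what "good" metadata looks like, using a hypothetical stand-in decorator rather than SK's @kernel_function:

```python
def described(name, description):
    """Hypothetical stand-in for @kernel_function: attaches the metadata
    a planner reads when deciding which function fits a goal."""
    def wrap(fn):
        fn.sk_name = name
        fn.sk_description = description
        return fn
    return wrap

@described(
    name="celsius_to_fahrenheit",
    description=(
        "Convert a temperature from degrees Celsius to degrees Fahrenheit. "
        "Input: celsius (number). Output: fahrenheit (number)."
    ),
)
def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32

# A vague description like "converts stuff" gives the LLM nothing to match
# the goal against; spelling out units and types is what fixes bad plans.
print(celsius_to_fahrenheit.sk_description)
print(celsius_to_fahrenheit(100))
```

The same rule applies to parameter names: `celsius` tells the planner far more than `value` does.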

5. Legacy: Sequential Planner

The SequentialPlanner was SK's original planning approach. It generates an XML-based plan that lists function calls in order, piping the output of each step to the next. While deprecated in favor of automatic function calling, you may encounter it in older codebases.

# Legacy approach (SK v0.x / early v1.x)
from semantic_kernel.planners import SequentialPlanner

planner = SequentialPlanner(kernel)
plan = await planner.create_plan(
    goal="Summarize the latest news about AI and translate it to Spanish"
)

# The plan is a sequence of steps:
# Step 1: News.get_latest(topic="AI")
# Step 2: Writer.summarize(input=$step1_result)
# Step 3: Translator.translate(input=$step2_result, target_lang="Spanish")

result = await plan.invoke(kernel)
print(result)

The sequential planner works well for linear pipelines but struggles with branching logic or parallel execution. For new projects, prefer automatic function calling.
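The piping behavior behind `$step1_result` and `$step2_result` can be sketched in plain Python, with hypothetical lambdas standing in for the News, Writer, and Translator functions:

```python
def run_sequential_plan(steps, initial_input=""):
    """Execute steps in order, piping each step's output into the next,
    like the $step1_result / $step2_result variables in an SK plan."""
    result = initial_input
    for step in steps:
        result = step(result)
    return result

# Hypothetical stand-ins for News.get_latest, Writer.summarize,
# and Translator.translate
steps = [
    lambda _: "AI labs released new reasoning models this week.",
    lambda text: text.split(".")[0] + ".",   # crude "summary"
    lambda text: "[es] " + text,             # pretend translation
]
print(run_sequential_plan(steps))
```

The strictly linear shape is also the weakness: there is no place in this loop for a branch or a parallel pair of calls.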

6. Legacy: Stepwise Planner

The StepwisePlanner implements a ReAct-style loop: the LLM reasons about what to do next, calls a function, observes the result, and repeats until the goal is met. This is more flexible than the sequential planner because it can adapt mid-execution.

# Legacy approach
from semantic_kernel.planners import StepwisePlanner

planner = StepwisePlanner(kernel, max_iterations=10)
plan = await planner.create_plan(
    goal="Find out if it's warmer in Tokyo or London right now"
)

# The planner runs a loop:
# Thought: I need to check Tokyo's weather
# Action: Weather.get_current_weather(city="Tokyo")
# Observation: Tokyo: 22C, Sunny
# Thought: Now I need London's weather
# Action: Weather.get_current_weather(city="London")
# Observation: London: 15C, Cloudy
# Thought: 22 > 15, so Tokyo is warmer
# Final Answer: Tokyo is currently warmer at 22C vs London's 15C.

result = await plan.invoke(kernel)

The stepwise approach is the predecessor of what automatic function calling now handles natively. Models like GPT-4o perform this reasoning internally when you use FunctionChoiceBehavior.Auto().
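The loop itself is easy to sketch without an LLM. A minimal sketch of the thought/action/observation cycle, with a hard-coded reasoner and canned weather data standing in for the model and the Weather plugin:

```python
def stepwise_loop(goal, tools, reasoner, max_iterations=10):
    """ReAct-style loop: ask the reasoner for the next step, run the chosen
    tool, feed the observation back, and stop at a final answer."""
    observations = []
    for _ in range(max_iterations):
        decision = reasoner(goal, observations)
        if decision["action"] == "final_answer":
            return decision["answer"]
        result = tools[decision["action"]](**decision["args"])
        observations.append((decision["action"], result))
    raise RuntimeError("max_iterations reached without a final answer")

# Canned data standing in for a real Weather plugin
temps = {"Tokyo": 22, "London": 15}
tools = {"get_current_weather": lambda city: temps[city]}

def reasoner(goal, observations):
    # Stand-in for the LLM: query each city, then compare the results.
    if len(observations) == 0:
        return {"action": "get_current_weather", "args": {"city": "Tokyo"}}
    if len(observations) == 1:
        return {"action": "get_current_weather", "args": {"city": "London"}}
    tokyo, london = observations[0][1], observations[1][1]
    winner = "Tokyo" if tokyo > london else "London"
    return {"action": "final_answer", "answer": f"{winner} is warmer"}

print(stepwise_loop("Is it warmer in Tokyo or London?", tools, reasoner))
```

With automatic function calling, the reasoner role moves inside the model; the `max_iterations` guard survives in SK as a cap on auto-invoked function calls per turn.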

7. Plan Caching and Reuse

For performance-sensitive applications, you can cache generated plans and reuse them with different inputs. This avoids the latency and cost of asking the LLM to generate a new plan for every request.

# Generate a plan once
plan = await planner.create_plan(
    kernel=kernel,
    goal="Get weather and convert temperature",
)

# Serialize the plan
plan_json = plan.generated_plan  # string representation

# Later, reload and execute with new inputs
# (The specific API depends on the planner type)
cached_plan = HandlebarsPlanner.load_plan(plan_json)
result = await cached_plan.invoke(kernel, city="Berlin")
Warning

Cached plans are fragile. If you rename a function, remove a plugin, or change a function's signature, cached plans will break at execution time. Implement a versioning scheme for your plans, and invalidate the cache when the kernel's function registry changes.
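One way to implement that versioning scheme (a sketch, not an SK API): key the cache on a fingerprint of the function registry, so any rename, removal, or signature change produces a new key and silently orphans stale plans.

```python
import hashlib
import json

def registry_fingerprint(functions):
    """Hash the qualified names and parameter lists of all registered
    functions; any change to them changes the fingerprint."""
    canonical = json.dumps(sorted(functions.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

plan_cache = {}

def cache_plan(goal, plan_text, functions):
    plan_cache[(registry_fingerprint(functions), goal)] = plan_text

def lookup_plan(goal, functions):
    # Returns None after any registry change, forcing a fresh plan
    return plan_cache.get((registry_fingerprint(functions), goal))

funcs = {
    "Weather.get_current_weather": ["city"],
    "Units.celsius_to_fahrenheit": ["celsius"],
}
cache_plan("Get weather and convert temperature", "<handlebars template>", funcs)
assert lookup_plan("Get weather and convert temperature", funcs) is not None

funcs["Weather.get_current_weather"] = ["city", "units"]  # signature change
assert lookup_plan("Get weather and convert temperature", funcs) is None
```

A cache miss after a registry change costs one extra plan generation; a stale hit costs a runtime failure in front of the user, so erring toward invalidation is the right bias.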

8. Choosing the Right Orchestration Strategy

Selecting the appropriate orchestration approach depends on your requirements for control, predictability, and flexibility. As a rule of thumb: use manual orchestration when the workflow is fixed and must be fully auditable; use automatic function calling as the default for new projects where the user drives the workflow; use the Handlebars planner when plans must be inspected, approved, or cached before execution; and reach for the legacy Sequential and Stepwise planners only when maintaining older codebases.

In practice, many production systems use a hybrid approach: manual orchestration for the critical path, with automatic function calling for handling edge cases and user-driven exploration within guardrailed boundaries.
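The hybrid pattern can be sketched as a simple router (the keyword matching and handler names are hypothetical; real systems typically classify intent with a model):

```python
def route(user_message, manual_pipelines, auto_handler):
    """Dispatch known intents to hand-written pipelines; everything else
    falls through to automatic function calling within guardrails."""
    for keyword, pipeline in manual_pipelines.items():
        if keyword in user_message.lower():
            return pipeline(user_message)
    return auto_handler(user_message)

# Hypothetical handlers standing in for real orchestration code
manual = {
    "refund": lambda msg: "manual:refund_pipeline",
    "cancel": lambda msg: "manual:cancellation_pipeline",
}
auto = lambda msg: "auto:function_calling"

print(route("I want a refund for order 1234", manual, auto))
print(route("What's the weather like today?", manual, auto))
```

The critical path (refunds, cancellations) stays deterministic and testable, while the auto handler absorbs the long tail of requests you did not anticipate.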