Building Conversational AI with LLMs and Agents
Appendix P: Semantic Kernel: Enterprise AI Orchestration

Prompt Templates and Semantic Functions

Big Picture

Prompt engineering is powerful, but raw string concatenation becomes unmaintainable as prompts grow. Semantic Kernel (SK) provides a template engine that separates prompt structure from runtime data, supports conditional logic, and integrates with the kernel's function system. This section covers SK's template syntax, input variables, rendering pipeline, and best practices for building reusable semantic functions that scale across teams and projects.

1. The Semantic Kernel Template Language

SK templates use a double-brace syntax inspired by Handlebars. Variables are prefixed with $, and function calls use a dotted pluginName.functionName notation. The template engine resolves all placeholders before the prompt is sent to the LLM.

# Basic variable substitution
template = """You are a helpful assistant specializing in {{$domain}}.

The user asks: {{$question}}

Provide a clear, concise answer."""

# At invocation time, $domain and $question are replaced
# with the values you pass as keyword arguments.

Variables can hold any string value. The engine performs simple string substitution; it does not enforce types at the template level. Type checking happens at the function boundary, where you declare input variables with descriptions and optional default values.
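To make the substitution semantics concrete, here is a toy stand-in for this step in plain Python. It is not SK's actual engine, just a sketch of what resolving `{{$name}}` placeholders amounts to, including the error on a missing variable:

```python
import re

def render(template: str, **variables: str) -> str:
    """Toy model of SK-style variable substitution: swap each
    {{$name}} placeholder for the matching string value."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return variables[name]
    return re.sub(r"\{\{\$(\w+)\}\}", substitute, template)

prompt = render(
    "You are a helpful assistant specializing in {{$domain}}.",
    domain="astronomy",
)
print(prompt)
# You are a helpful assistant specializing in astronomy.
```

Note that the pattern only matches `$`-prefixed names, so function-call placeholders like `{{Time.now}}` would pass through untouched in this sketch.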

2. Declaring Input Variables

Each semantic function declares its expected inputs through a PromptTemplateConfig. This configuration tells the kernel (and any planner) what arguments the function needs, enabling auto-discovery and composition.

from semantic_kernel.prompt_template import PromptTemplateConfig, InputVariable

config = PromptTemplateConfig(
    template_format="semantic-kernel",
    template="""Translate the following text from {{$source_lang}} to {{$target_lang}}.

Text: {{$input}}

Translation:""",
    input_variables=[
        InputVariable(
            name="input",
            description="The text to translate",
            is_required=True,
        ),
        InputVariable(
            name="source_lang",
            description="Source language name",
            default="English",
        ),
        InputVariable(
            name="target_lang",
            description="Target language name",
            default="French",
        ),
    ],
)

translate_fn = kernel.add_function(
    plugin_name="Translator",
    function_name="translate",
    prompt_template_config=config,
)

When source_lang is not provided at invocation, the engine falls back to the default value "English". Required variables without defaults raise an error when missing, preventing silent prompt corruption.
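The resolution rule just described can be sketched in a few lines of plain Python (a toy model of the behavior, not SK internals): required names must be supplied, and optional names fall back to their declared defaults.

```python
REQUIRED = {"input"}
DEFAULTS = {"source_lang": "English", "target_lang": "French"}

def resolve_arguments(provided: dict) -> dict:
    """Toy model of input-variable resolution: fail fast on missing
    required variables, merge defaults under the provided values."""
    missing = REQUIRED - provided.keys()
    if missing:
        raise ValueError(f"missing required variables: {sorted(missing)}")
    return {**DEFAULTS, **provided}

print(resolve_arguments({"input": "Good morning"}))
# {'source_lang': 'English', 'target_lang': 'French', 'input': 'Good morning'}
```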

Key Insight

Always write meaningful description strings for input variables. These descriptions serve double duty: they document your code for human developers, and they inform the LLM planner about what data each function expects. Vague descriptions lead to poor planning decisions.

3. Calling Functions Inside Templates

One of SK's most powerful features is the ability to call other kernel functions directly from within a template. This enables prompt composition without writing orchestration code.

# Assume we have a "Time.now" function that returns the current date
template = """Today's date is {{Time.now}}.

Given this context, answer the user's question about upcoming events.

Question: {{$question}}

Answer:"""

# When the template is rendered, SK:
# 1. Calls the Time.now function
# 2. Inserts its return value into the prompt
# 3. Substitutes $question
# 4. Sends the complete prompt to the LLM

Function calls in templates are resolved during the rendering phase, before the prompt reaches the AI service. This means the LLM sees a fully rendered prompt with concrete values, not template syntax.
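As a rough sketch of that resolution step (again a toy model, not SK's engine), function-call placeholders are looked up in a registry and replaced by their return values before anything else happens:

```python
import re
from datetime import date

# Toy registry standing in for kernel plugins; assumes a Time.now
# function has been registered elsewhere as a native function.
FUNCTIONS = {"Time.now": lambda: date.today().isoformat()}

def resolve_function_calls(template: str) -> str:
    """Replace {{Plugin.function}} placeholders with the value
    returned by the corresponding registered function."""
    return re.sub(
        r"\{\{(\w+\.\w+)\}\}",
        lambda m: FUNCTIONS[m.group(1)](),
        template,
    )

print(resolve_function_calls("Today's date is {{Time.now}}."))
```

Because `$`-prefixed variables do not match the `Plugin.function` pattern, the two resolution passes never interfere with each other.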

4. File-Based Prompt Organization

For projects with many semantic functions, SK supports a convention-based directory layout. Each function lives in its own folder with two files: the prompt template and a JSON configuration file.

# Directory structure:
# plugins/
#   WriterPlugin/
#     Summarize/
#       skprompt.txt       # The prompt template
#       config.json        # Execution settings and variable declarations
#     Rewrite/
#       skprompt.txt
#       config.json

# config.json example:
# {
#   "schema": 1,
#   "description": "Summarizes text into bullet points",
#   "execution_settings": {
#     "default": {
#       "max_tokens": 500,
#       "temperature": 0.3
#     }
#   },
#   "input_variables": [
#     {
#       "name": "input",
#       "description": "Text to summarize",
#       "is_required": true
#     }
#   ]
# }

# Load the entire plugin from disk
kernel.add_plugin(
    parent_directory="./plugins",
    plugin_name="WriterPlugin",
)

This approach keeps prompts in plain text files that are easy to review in pull requests, diff across versions, and edit without touching Python code. Non-technical team members (such as prompt engineers or domain experts) can modify prompts independently.
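Under the hood, loading such a plugin amounts to walking the directory and pairing each skprompt.txt with its config.json. A minimal sketch of that convention (illustrative only, not SK's loader):

```python
import json
from pathlib import Path

def load_function_folder(folder: Path) -> dict:
    """Read one function folder: prompt template plus JSON config."""
    return {
        "template": (folder / "skprompt.txt").read_text(encoding="utf-8"),
        "config": json.loads((folder / "config.json").read_text(encoding="utf-8")),
    }

def load_plugin(plugin_dir: Path) -> dict:
    """Collect every function folder under a plugin directory,
    keyed by folder name (e.g. 'Summarize', 'Rewrite')."""
    return {
        fn.name: load_function_folder(fn)
        for fn in sorted(plugin_dir.iterdir())
        if (fn / "skprompt.txt").exists()
    }
```

A loader like this also makes it easy to lint prompts in CI, for example checking that every `{{$variable}}` in skprompt.txt is declared in config.json.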

Tip

Store your prompt templates in version control alongside your application code. Use a CI pipeline that runs regression tests against a known set of inputs whenever a prompt file changes. This catches regressions early, before they reach production.

5. Execution Settings and Model Parameters

Each semantic function can specify its own model parameters (temperature, max tokens, top-p, and others). These settings can be defined at the function level, overridden at invocation time, or set as defaults on the service.

from semantic_kernel.connectors.ai.open_ai import OpenAIChatPromptExecutionSettings

# Define settings at function creation time
settings = OpenAIChatPromptExecutionSettings(
    service_id="chat",
    max_tokens=1000,
    temperature=0.7,
    top_p=0.95,
    presence_penalty=0.0,
    frequency_penalty=0.0,
)

creative_fn = kernel.add_function(
    plugin_name="Writer",
    function_name="creative_write",
    prompt="Write a short story about {{$topic}} in the style of {{$author}}.",
    prompt_execution_settings=settings,
)

# Override settings at invocation time for a specific call
from semantic_kernel.functions import KernelArguments

result = await kernel.invoke(
    creative_fn,
    KernelArguments(
        settings=OpenAIChatPromptExecutionSettings(
            service_id="chat",
            temperature=1.0,
        ),
        topic="a robot learning to paint",
        author="Hemingway",
    ),
)

Temperature and top-p control randomness. For factual tasks (summarization, extraction), use low temperature (0.0 to 0.3). For creative tasks (story writing, brainstorming), use higher values (0.7 to 1.0). The max_tokens parameter caps the response length and directly affects cost.
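One practical pattern is to encode that guidance as per-task presets so settings stay consistent across functions. The numbers below are illustrative defaults following the ranges above, not prescriptive values:

```python
# Hypothetical per-task presets; tune the numbers for your workload.
TASK_PRESETS = {
    "extraction":    {"temperature": 0.0, "max_tokens": 300},
    "summarization": {"temperature": 0.2, "max_tokens": 500},
    "story_writing": {"temperature": 0.9, "max_tokens": 1200},
}

def preset_for(task: str) -> dict:
    """Fall back to conservative settings for unknown task types."""
    return TASK_PRESETS.get(task, {"temperature": 0.3, "max_tokens": 500})

print(preset_for("extraction"))
# {'temperature': 0.0, 'max_tokens': 300}
```

Centralizing presets like this also gives you one place to adjust cost caps (max_tokens) across every function that shares a task type.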

6. Chat-Based Semantic Functions

Modern LLMs work best with structured chat messages rather than raw text prompts. SK supports chat-formatted templates where you define system, user, and assistant messages explicitly.

chat_template = """<message role="system">
You are an expert code reviewer. Analyze code for bugs, security
issues, and style violations. Be concise and actionable.
</message>

<message role="user">
Review this {{$language}} code:

```{{$language}}
{{$code}}
```

Focus on: {{$focus_areas}}
</message>"""

review_fn = kernel.add_function(
    plugin_name="CodeReview",
    function_name="review",
    prompt=chat_template,
    template_format="semantic-kernel",
)

result = await kernel.invoke(
    review_fn,
    language="python",
    code="def calc(x): return eval(x)",
    focus_areas="security vulnerabilities",
)
print(result)

The <message role="..."> tags are parsed by SK's rendering engine and converted into the appropriate chat message format for the target API (OpenAI, Azure OpenAI, or others). This ensures the system prompt is properly separated from user content.
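What the parser produces can be pictured as a list of role/content pairs, the shape most chat APIs expect. A toy sketch of that conversion (not SK's actual parser, which also handles nesting and other roles):

```python
import re

def parse_messages(rendered: str) -> list[dict]:
    """Toy model of SK's message parsing: turn <message role="...">
    blocks into the role/content dicts a chat API expects."""
    pattern = r'<message role="(\w+)">\s*(.*?)\s*</message>'
    return [
        {"role": role, "content": content}
        for role, content in re.findall(pattern, rendered, flags=re.DOTALL)
    ]

msgs = parse_messages(
    '<message role="system">Be concise.</message>\n'
    '<message role="user">Review this code.</message>'
)
print(msgs)
# [{'role': 'system', 'content': 'Be concise.'},
#  {'role': 'user', 'content': 'Review this code.'}]
```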

7. Prompt Rendering Pipeline

Understanding the rendering pipeline helps you debug template issues. When you invoke a semantic function, SK processes the template through these stages in order: variable substitution, function call resolution, message parsing, and execution settings application.

# You can render a template without invoking the LLM,
# which is useful for debugging and testing.

from semantic_kernel.functions import KernelArguments
from semantic_kernel.prompt_template import (
    KernelPromptTemplate,
    PromptTemplateConfig,
)

template_obj = KernelPromptTemplate(
    prompt_template_config=PromptTemplateConfig(
        template="Hello {{$name}}, today is {{Time.now}}.",
        template_format="semantic-kernel",
    ),
)

# Render the template to see the final prompt
rendered = await template_obj.render(
    kernel,
    KernelArguments(name="Alice"),
)
print(rendered)
# Output: "Hello Alice, today is 2026-04-04."

Warning

Template rendering is not sandboxed. Function calls in templates execute real code with real side effects. Never allow untrusted user input to control which functions are called in a template. Treat template authoring as a privileged operation, similar to writing server-side code.

8. Best Practices for Semantic Functions

Well-designed semantic functions follow a set of principles that improve reliability, testability, and reuse across your organization.