Building Conversational AI with LLMs and Agents
Appendix T: Distributed ML: PySpark, Databricks, and Ray

Feature Stores: Feast, Tecton, and Databricks Feature Engineering

Big Picture

Feature stores solve the "training-serving skew" problem by providing a single system that manages feature computation, storage, and retrieval for both training and inference. In LLM applications, features include user embeddings, document metadata, retrieval scores, and contextual signals that augment prompts or drive routing logic. This section covers three leading feature store platforms: Feast (open source), Tecton (managed), and Databricks Feature Engineering (integrated with Unity Catalog from Section T.1).

T.6.1 The Training-Serving Skew Problem

When building ML-powered applications, the features a model sees during training must exactly match the features available at inference time. Without a feature store, teams often end up with two separate codebases: a batch pipeline (Python/Spark) that computes features for training, and a real-time service (Java/Go) that recomputes the same features for inference. Subtle differences between these implementations, such as different timestamp handling, rounding behavior, or null treatment, create training-serving skew that silently degrades model performance.
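As a toy illustration of how skew creeps in, consider two hypothetical implementations of the same "average rating" feature, one from a batch pipeline and one from a serving path, that disagree only on null handling:

```python
# Hypothetical illustration: the same "average rating" feature,
# implemented twice with different null handling.

def avg_rating_training(ratings):
    # Batch pipeline: drop missing ratings before averaging
    valid = [r for r in ratings if r is not None]
    return sum(valid) / len(valid) if valid else 0.0

def avg_rating_serving(ratings):
    # Serving path: treat missing ratings as 0.0 -- a silent divergence
    cleaned = [0.0 if r is None else r for r in ratings]
    return sum(cleaned) / len(cleaned) if cleaned else 0.0

ratings = [4.0, None, 5.0]
print(avg_rating_training(ratings))  # 4.5
print(avg_rating_serving(ratings))   # 3.0
```

Both functions look reasonable in isolation; the model trained on the first will quietly see a different feature distribution at inference time.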

Feature stores address this by defining features once and serving them consistently across both contexts. A feature definition specifies the computation logic, the data source, and the freshness requirements. The feature store then materializes these features into an offline store (for training) and an online store (for low-latency inference), guaranteeing consistency.

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Data Sources  │────▶│  Feature Store   │────▶│   Consumers     │
│                 │     │                  │     │                 │
│  Event streams  │     │ ┌──────────────┐ │     │  Training jobs  │
│  Batch tables   │     │ │ Feature Defs │ │     │  (offline)      │
│  APIs           │     │ └──────────────┘ │     │                 │
│                 │     │ ┌──────────────┐ │     │  Serving APIs   │
│                 │     │ │ Offline Store│ │     │  (online)       │
│                 │     │ └──────────────┘ │     │                 │
│                 │     │ ┌──────────────┐ │     │  Notebooks      │
│                 │     │ │ Online Store │ │     │  (exploration)  │
│                 │     │ └──────────────┘ │     │                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
        
Figure T.6.1: Feature store architecture. Features are defined once, materialized into both offline (batch) and online (low-latency) stores, and consumed consistently by training and serving pipelines.

T.6.2 Feast: Open-Source Feature Store

Feast is the most widely adopted open-source feature store. It provides a Python SDK for defining features, materializing them from batch or streaming sources into online stores (Redis, DynamoDB, or SQLite for development), and retrieving them with point-in-time correctness for training. Feast integrates with data warehouses like BigQuery, Snowflake, and Redshift as offline stores, and with Delta Lake tables for Databricks environments.

# feature_repo/feature_definitions.py
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Array, Float32, Int64, String

# Define the entity (the join key for feature lookups)
user = Entity(
    name="user",
    join_keys=["user_id"],
    description="Unique user identifier",
)

# Define a data source (Parquet file, BigQuery table, etc.)
user_features_source = FileSource(
    path="s3://bucket/features/user_stats.parquet",
    timestamp_field="event_timestamp",
    created_timestamp_column="created_at",
)

# Define a feature view (a group of related features)
user_features = FeatureView(
    name="user_features",
    entities=[user],
    ttl=timedelta(days=7),  # Features older than 7 days are stale
    schema=[
        Field(name="total_conversations", dtype=Int64),
        Field(name="avg_response_rating", dtype=Float32),
        Field(name="preferred_language", dtype=String),
        Field(name="user_embedding", dtype=Array(Float32)),
    ],
    source=user_features_source,
    online=True,  # Materialize to online store
)
With the definitions in place, the Feast CLI drives the rest of the workflow:

# Initialize the Feast feature repository
feast init feature_repo
cd feature_repo

# Apply feature definitions to the registry
feast apply

# Materialize features from the offline to the online store
feast materialize 2024-01-01T00:00:00 2025-04-01T00:00:00

# Start a local feature server for online serving
feast serve --port 6566
Tip

For LLM applications, the most common Feast use case is serving user context at inference time. When a user sends a message, your application fetches their feature vector (conversation history length, preferred response style, topic preferences) from the online store and includes it in the system prompt. This personalizes the LLM response without any model fine-tuning.
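Under the feature definitions above, this inference-time lookup might look like the following sketch; the prompt template and helper function are illustrative, and the Feast call (shown commented out) requires a running online store:

```python
def build_user_context(features):
    """Format online feature values into a system-prompt fragment."""
    return (
        f"The user prefers responses in {features['preferred_language']} "
        f"and has had {features['total_conversations']} prior conversations."
    )

# At request time (requires a materialized Feast online store):
# from feast import FeatureStore
# store = FeatureStore(repo_path="feature_repo/")
# row = store.get_online_features(
#     features=["user_features:total_conversations",
#               "user_features:preferred_language"],
#     entity_rows=[{"user_id": "user_001"}],
# ).to_dict()
# context = build_user_context({k: v[0] for k, v in row.items()
#                               if k != "user_id"})

print(build_user_context(
    {"preferred_language": "German", "total_conversations": 12}
))
```

The resulting context string is prepended to the system prompt before the LLM call, personalizing responses without touching model weights.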

T.6.3 Point-in-Time Joins for Training

One of the most valuable features of a feature store is point-in-time correct joins. When building a training dataset, you need to know the value of each feature as it existed at the time of the training event, not the current value. For example, if a user had 10 conversations when a particular interaction occurred but now has 500, the training data should reflect the value 10. Feast handles this automatically through its get_historical_features API.

from feast import FeatureStore
import pandas as pd

store = FeatureStore(repo_path="feature_repo/")

# Entity DataFrame: the events you want to enrich with features
entity_df = pd.DataFrame({
    "user_id": ["user_001", "user_002", "user_003", "user_001"],
    "event_timestamp": pd.to_datetime([
        "2025-01-15 10:00:00",
        "2025-01-15 11:30:00",
        "2025-02-01 09:00:00",
        "2025-03-01 14:00:00",  # Same user, different time
    ]),
})

# Fetch features with point-in-time correctness
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "user_features:total_conversations",
        "user_features:avg_response_rating",
        "user_features:preferred_language",
    ],
).to_df()

print(training_df)
# user_001 at Jan 15 gets their Jan 15 feature values
# user_001 at Mar 01 gets their Mar 01 feature values (different!)
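Conceptually, a point-in-time join is an as-of merge: for each training event, take the most recent feature snapshot at or before the event timestamp. A minimal pandas sketch of the semantics (not Feast's actual implementation):

```python
import pandas as pd

# Feature snapshots over time for one user
feature_log = pd.DataFrame({
    "user_id": ["user_001", "user_001", "user_001"],
    "event_timestamp": pd.to_datetime(
        ["2025-01-01", "2025-02-01", "2025-03-01"]),
    "total_conversations": [10, 120, 500],
})

# Training events to enrich with features
events = pd.DataFrame({
    "user_id": ["user_001", "user_001"],
    "event_timestamp": pd.to_datetime(["2025-01-15", "2025-03-01"]),
})

# As-of join: most recent feature value at or before each event time
joined = pd.merge_asof(
    events.sort_values("event_timestamp"),
    feature_log.sort_values("event_timestamp"),
    on="event_timestamp",
    by="user_id",
)
print(joined["total_conversations"].tolist())  # [10, 500]
```

The January event sees the January snapshot (10), never the later value, which is exactly the leakage-prevention guarantee the feature store provides.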

T.6.4 Tecton: Managed Feature Platform

Tecton is a managed feature platform built by the team behind Uber's Michelangelo ML platform, and its engineers are major contributors to Feast. It extends the feature store concept with a feature pipeline engine that handles both batch and real-time feature computation. Unlike Feast, which requires you to compute features externally and point to the results, Tecton defines the transformation logic alongside the feature definition and orchestrates the computation automatically. Tecton is available as a managed service and integrates with Databricks, Snowflake, and Spark.

# Tecton feature definition with built-in transformation
from datetime import datetime, timedelta

from tecton import Entity, BatchSource, batch_feature_view
from tecton.types import Field, String, Float64, Int64

# Entity definition
user = Entity(name="user", join_keys=["user_id"])

# conversation_logs_source is a BatchSource registered elsewhere in the
# feature repository, pointing at the raw conversation log table.

# Batch feature view with transformation logic
@batch_feature_view(
    sources=[conversation_logs_source],
    entities=[user],
    mode="spark_sql",
    batch_schedule=timedelta(days=1),
    ttl=timedelta(days=30),
    online=True,
    offline=True,
    feature_start_time=datetime(2024, 1, 1),
    schema=[
        Field("user_id", String),
        Field("total_conversations", Int64),
        Field("avg_response_length", Float64),
        Field("avg_quality_rating", Float64),
    ],
)
def user_conversation_stats(conversation_logs):
    return f"""
        SELECT
            user_id,
            COUNT(*) AS total_conversations,
            AVG(response_length) AS avg_response_length,
            AVG(quality_rating) AS avg_quality_rating
        FROM {conversation_logs}
        GROUP BY user_id
    """
Key Insight

The key difference between Feast and Tecton is where the computation happens. Feast is a "feature serving" layer: you compute features externally and register the results. Tecton is a "feature platform" that also manages computation. For teams that already have robust data pipelines (perhaps using Databricks or Airflow), Feast provides the serving layer without duplicating orchestration. For teams building from scratch, Tecton's integrated computation reduces the number of systems to manage.

T.6.5 Databricks Feature Engineering

Databricks provides its own feature engineering capabilities through the databricks-feature-engineering SDK, tightly integrated with Unity Catalog (see Section T.1). Feature tables are standard Delta tables with additional metadata that marks specific columns as features and one or more columns as the lookup key. The advantage of this approach is that feature governance, lineage, and access control are inherited directly from Unity Catalog.

from databricks.feature_engineering import FeatureEngineeringClient, FeatureLookup

fe = FeatureEngineeringClient()

# Create a feature table from a Spark DataFrame
user_features_df = spark.sql("""
    SELECT
        user_id,
        COUNT(*) as total_conversations,
        AVG(response_rating) as avg_response_rating,
        COLLECT_LIST(topic) as topic_history
    FROM ml_catalog.llm_data.conversations
    GROUP BY user_id
""")

fe.create_table(
    name="ml_catalog.features.user_conversation_stats",
    primary_keys=["user_id"],
    df=user_features_df,
    description="Aggregated conversation statistics per user",
)

# Create a training dataset with automatic feature lookup
training_events = spark.read.table("ml_catalog.llm_data.training_events")

training_set = fe.create_training_set(
    df=training_events,
    feature_lookups=[
        FeatureLookup(
            table_name="ml_catalog.features.user_conversation_stats",
            lookup_key=["user_id"],
            feature_names=["total_conversations", "avg_response_rating"],
        ),
    ],
    label="quality_label",
)

training_df = training_set.load_df()
print(f"Training set with features: {training_df.columns}")

T.6.6 Choosing a Feature Store for LLM Applications

The right feature store depends on your existing infrastructure and the complexity of your feature pipelines. The table below summarizes the tradeoffs for each platform.

Feature Store Comparison for LLM Applications

Criterion              Feast                           Tecton                       Databricks FE
---------------------  ------------------------------  ---------------------------  ---------------------------
Deployment             Self-hosted (open source)       Managed SaaS                 Integrated with Databricks
Feature computation    External (bring your own)       Built-in (Spark, Rift)       Spark notebooks/jobs
Online store options   Redis, DynamoDB, SQLite         DynamoDB, Redis              Databricks Online Tables
Streaming features     Via push sources                Native (Spark Streaming)     Via Delta Live Tables
Governance             Manual                          Built-in RBAC                Unity Catalog
Best for               Teams with existing pipelines   Greenfield ML platforms      Databricks-centric orgs
Warning

Feature stores add operational complexity. If your LLM application only uses prompt-time context (retrieved documents, conversation history) without computed features, you may not need a feature store at all. Adopt one when you find yourself maintaining duplicate feature computation logic between training and serving, or when feature freshness and consistency become reliability risks.

Summary

Feature stores bridge the gap between training and serving by providing a single source of truth for computed features. Feast offers an open-source, bring-your-own-compute approach that integrates with any data stack. Tecton provides a managed platform with built-in feature computation and orchestration. Databricks Feature Engineering leverages Unity Catalog for governance and integrates seamlessly with the broader Databricks ecosystem. For LLM applications, feature stores are most valuable when your system uses computed user or document features alongside retrieval-augmented generation. In the next section, we bring all these components together into production data pipelines and model serving architectures at scale.