Instrumenting Pay-i for Databricks

Pay-i instruments your Databricks inference calls at the client level. Instrumentation is configured once at application startup—no changes are needed per request.

How host mappings work

Standard providers such as OpenAI expose a single global endpoint, and by default Pay-i associates inference calls made with the OpenAI client with OpenAI models. When you invoke Databricks-hosted models through the OpenAI client, you must override that default association, and a client host mapping is how you specify the override. Without a host mapping, the Pay-i SDK cannot attribute inference calls to your workspace with the correct Databricks model and Databricks host cloud.

A host mapping tells Pay-i: "traffic to this workspace URL should be attributed to this category." Once the mapping is in place, all inference calls from your configured clients are captured and attributed correctly.
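Conceptually, a host mapping is a dictionary keyed by workspace URL. The sketch below illustrates the lookup; the `resolve_category` helper and the example workspace URL are illustrative only, not part of the Pay-i SDK:

```python
from typing import Optional

# Illustrative only: shows the shape of a host mapping, not Pay-i internals.
host_mappings = {
    "https://adb-1234567890.11.azuredatabricks.net/": {
        "price_as_category": "system.databricks.azure"
    }
}

def resolve_category(url: str) -> Optional[str]:
    """Look up the pricing category for a workspace URL (hypothetical helper)."""
    entry = host_mappings.get(url)
    return entry["price_as_category"] if entry else None

print(resolve_category("https://adb-1234567890.11.azuredatabricks.net/"))
# system.databricks.azure
```

Note that the lookup is an exact string match, which is why the trailing slash matters in the steps below.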

Prerequisites

Before you begin, confirm you have the following:

  • A Pay-i Application and an API key (see Pay-i API Keys)
  • A Databricks workspace capable of serving the required models and the ability to authenticate to it
  • Python 3.9 or later

Step 1: Getting Started with Pay-i

Follow the guidance on Getting Started with Pay-i.

Step 2: Set environment variables

Create a .env file in your project directory:

DATABRICKS_HOST=<YOUR_WORKSPACE_HOST>
DATABRICKS_TOKEN=<YOUR_DATABRICKS_TOKEN>
PAYI_API_KEY=<YOUR_PAYI_API_KEY>
PAYI_BASE_URL=<YOUR_PAYI_BASE_URL>

DATABRICKS_HOST is your full workspace URL, including the trailing slash (for example, https://<workspace-id>.azuredatabricks.net/). The trailing slash must be present and must match exactly what you use in Step 4.

PAYI_BASE_URL is required only if Pay-i is deployed in your environment at a custom endpoint. Omit it to use the default Pay-i endpoint.

Step 3: Configure payi_instrument() with host mappings

Call payi_instrument() once at application startup before making an inference call. The host_mappings configuration tells Pay-i which Category to associate with traffic from your workspace.

The full category name is system.databricks.<hyperscaler>, where <hyperscaler> is aws, azure, or google, depending on your cloud. The Pay-i SDK provides a helper class, PayiCategories, with constants for these string values. In the examples below, <databricks cloud host category> appears in code with a comment indicating how to substitute your cloud-specific value.

from dotenv import load_dotenv
from payi.lib.helpers import PayiCategories
from payi.lib.instrument import payi_instrument
import os

load_dotenv()

host = os.getenv("DATABRICKS_HOST")

# <databricks cloud host category> is PayiCategories.databricks_azure, PayiCategories.databricks_aws, or PayiCategories.databricks_google
payi_instrument(
    config={
        "databricks": {
            "host_mappings": {
                host: {
                    "price_as_category": <databricks cloud host category>
                }
            }
        }
    }
)

Client examples

After calling payi_instrument(), use your Databricks client as normal. Pay-i captures usage automatically.

Standard OpenAI client

Use this pattern when running outside Databricks—for example, in a local development environment, container, or CI pipeline. Authentication resolves via WorkspaceClient.

from dotenv import load_dotenv
from databricks.sdk import WorkspaceClient
from openai import OpenAI
from payi.lib.helpers import PayiCategories
from payi.lib.instrument import payi_instrument
import os

load_dotenv()

host = os.getenv("DATABRICKS_HOST")  # Must include trailing slash

# Initialize Pay-i before creating any clients
# <databricks cloud host category> is PayiCategories.databricks_azure, PayiCategories.databricks_aws, or PayiCategories.databricks_google
payi_instrument(
    config={
        "databricks": {
            "host_mappings": {
                host: {
                    "price_as_category": "system.databricks.<hyperscaler>"  # aws, azure, or google
                }
            }
        }
    }
)

# Resolve auth from your Databricks credentials
w = WorkspaceClient(host=host)
headers = w.config.authenticate()
token = headers["Authorization"].removeprefix("Bearer ")

client = OpenAI(
    api_key=token,
    base_url=f"{host}serving-endpoints",
    timeout=120.0,  # Increase from the default for pay-per-token cold-start tolerance
)

response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[{"role": "user", "content": "Your prompt here."}],
    max_tokens=256,  # Always set explicitly — see Important Considerations
)
print(response.choices[0].message.content)

Note: w.config.authenticate() returns a short-lived OAuth access token, not a static credential. For interactive development this is fine, but for long-running services or production deployments, use OAuth M2M or re-authenticate per request rather than caching the token at startup.
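One way to handle this in a long-running service is to wrap token acquisition in a small cache that refreshes shortly before expiry. The sketch below is illustrative: you supply the `fetch` callable (for example, one that extracts a token via `w.config.authenticate()` as above), and the TTL and refresh margin are arbitrary assumptions you should tune to your token lifetime:

```python
import time
from typing import Callable, Optional

class TokenCache:
    """Caches a short-lived token and refreshes it near expiry (illustrative)."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float, margin: float = 60.0):
        self._fetch = fetch          # callable that acquires a fresh token
        self._ttl = ttl_seconds      # assumed token lifetime
        self._margin = margin        # refresh this many seconds before expiry
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

With the standard OpenAI client, you would then call `cache.get()` to obtain the api_key per request rather than caching a single token at startup.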

Databricks OpenAI client

Use this pattern when running inside Databricks notebooks or jobs. The DatabricksOpenAI client resolves the host and token automatically from the notebook context. If you want to run this pattern outside Databricks, set DATABRICKS_HOST and DATABRICKS_TOKEN in your .env file instead.

from dotenv import load_dotenv
from databricks_openai import DatabricksOpenAI
from payi.lib.instrument import payi_instrument

load_dotenv()

host = "<YOUR_WORKSPACE_HOST>"  # Must include trailing slash

# Initialize Pay-i before creating any clients
payi_instrument(
    config={
        "databricks": {
            "host_mappings": {
                host: {
                    "price_as_category": "system.databricks.<hyperscaler>"  # aws, azure, or google
                }
            }
        }
    }
)

client = DatabricksOpenAI()  # Resolves host and token from the environment

response = client.chat.completions.create(
    model="<MODEL_NAME>",
    messages=[{"role": "user", "content": "Your prompt here."}],
    max_tokens=256,  # Always set explicitly — see Important Considerations
)
print(response.choices[0].message.content)

Databricks SDK: workspace.serving_endpoints.query()

If you're using the Databricks SDK directly via workspace.serving_endpoints.query(), automatic instrumentation is not available for this path. See Manual Event Submission to submit usage events explicitly.


Verifying instrumentation

After configuring Pay-i and making a few test calls, confirm your data is flowing correctly:

  1. Sign in to the Pay-i platform and open your Application.
  2. On the Usage tab, confirm requests appear attributed to the system.databricks.<hyperscaler> category.
  3. Confirm that token counts and cost data are populated.

If calls are not appearing, work through the checklist in Troubleshooting & Support before contacting support.


Important considerations

These are the integration issues most likely to cause silent failures or confusing results in production.

The host mapping key must match exactly

The string used as the host_mappings key must be character-for-character identical to the string passed to WorkspaceClient and used in your client's base_url. A difference in trailing slash, casing, or any other character causes Pay-i to miss the match: calls are silently attributed to the wrong category, and no error is raised.

Define host as a single variable and reference it everywhere:

# Correct — single source of truth
host = "<YOUR_WORKSPACE_HOST>"  # Include trailing slash
payi_instrument(config={"databricks": {"host_mappings": {host: {"price_as_category": "system.databricks"}}}})
w = WorkspaceClient(host=host)
client = OpenAI(api_key=token, base_url=f"{host}serving-endpoints")

# Incorrect — subtle mismatch will break the mapping without raising an error
payi_instrument(config={"databricks": {"host_mappings": {"<YOUR_WORKSPACE_HOST>/": {"price_as_category": "system.databricks"}}}})
client = OpenAI(base_url="<YOUR_WORKSPACE_HOST>serving-endpoints")  # Missing trailing slash
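As an additional safeguard, you can normalize the host once before using it anywhere, so a missing or doubled trailing slash cannot cause a mismatch. A minimal sketch; the helper is illustrative, not part of the Pay-i SDK:

```python
def normalized_host(host: str) -> str:
    """Ensure exactly one trailing slash so the mapping key and base_url agree."""
    return host.rstrip("/") + "/"

# Normalize once, then reference the same variable everywhere:
# host = normalized_host(os.getenv("DATABRICKS_HOST"))
# payi_instrument(config={"databricks": {"host_mappings": {host: {...}}}})
# client = OpenAI(api_key=token, base_url=f"{host}serving-endpoints")
```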

payi_instrument() must run before client initialization

Pay-i patches the OpenAI client at instrumentation time. If you create an OpenAI or DatabricksOpenAI client before calling payi_instrument(), that client is not instrumented and calls will not be tracked.

# Correct
payi_instrument(config={...})
client = OpenAI(...)

# Incorrect — client is created before instrumentation runs
client = OpenAI(...)           # Not instrumented
payi_instrument(config={...})  # Too late