Track application versions with models - Weights & Biases Documentation

A Model is a combination of data (which can include configuration, trained model weights, or other information) and code that defines how the model operates. By structuring your code to be compatible with this API, you benefit from a structured way to version your application so you can more systematically keep track of your experiments. This guide shows you how to define a Weave Model, call it to capture inputs and outputs, take advantage of automatic versioning when your code or parameters change, serve the model behind a local API, and tag production calls for filtering. It’s for developers who build LLM-powered applications with Weave and want a repeatable way to track and compare iterations of their app.


To create a model in Weave, you need the following:

A class that inherits from weave.Model
Type definitions on all parameters
A typed predict function decorated with @weave.op()

import weave


class YourModel(weave.Model):
    # Typed attributes hold the model's settings
    attribute1: str
    attribute2: int

    @weave.op()
    def predict(self, input_data: str) -> dict:
        # Model logic goes here
        prediction = self.attribute1 + ' ' + input_data
        return {'pred': prediction}

You can call the model as usual with:

import weave
weave.init('intro-example')

model = YourModel(attribute1='hello', attribute2=5)
model.predict('world')

This tracks the model settings along with the inputs and outputs anytime you call predict(). You now have a versioned Weave Model that records every prediction it makes, which the following sections build on.

Automatic versioning of models

When you change the parameters or the code that defines your model, Weave logs these changes and updates the version. This lets you compare predictions across versions of your model. Use this to iterate on prompts or to try a different LLM and compare predictions across settings. For example, here you create the model again with different attribute values:

import weave
weave.init('intro-example')

model = YourModel(attribute1='howdy', attribute2=10)
model.predict('world')

After this call, the UI shows two versions of the model, each with its own tracked calls.

Serve models

Serving a model exposes its predict function over HTTP, which is useful for testing it from other applications or sharing it with teammates without distributing the underlying code. To start a FastAPI server for a Weave model, replace [MODEL-REF] with the reference to your model and run:

weave serve [MODEL-REF]

For additional instructions, see serve.
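
Once the server is running, another application can call it over HTTP. The following is a minimal client sketch that uses only the Python standard library. The host, port (9996), the /predict path, and the JSON body keyed by the predict parameter name are assumptions not stated in this guide, so check them against the address and routes that weave serve prints on startup. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumptions (not confirmed by this guide): the server listens on
# localhost:9996 and exposes predict at POST /predict, accepting a JSON
# body keyed by predict's parameter names.
SERVER_URL = "http://localhost:9996/predict"


def build_predict_request(input_data: str, url: str = SERVER_URL) -> urllib.request.Request:
    """Build a JSON POST request for the served predict function."""
    body = json.dumps({"input_data": input_data}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def call_served_model(input_data: str, url: str = SERVER_URL) -> dict:
    """Send the request and decode the JSON response from the server."""
    request = build_predict_request(input_data, url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from weave serve running, call_served_model('world') would send the same input used in the earlier examples. Keeping request construction separate from sending makes it possible to inspect the payload without a live server.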

Track production calls

Once you use your model in more than one environment, it helps to distinguish production traffic from development or evaluation runs so you can analyze them separately. To separate production calls, add an attribute to the predictions for filtering in the UI or API.

with weave.attributes({'env': 'production'}):
    model.predict('world')
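
Rather than hard-coding 'production', you may prefer to derive the attribute from the deployment environment so the same code runs everywhere. The following sketch assumes a hypothetical APP_ENV environment variable and hypothetical helper names; only weave.attributes and model.predict come from this guide.

```python
import os


def call_attributes(default_env: str = "development") -> dict:
    """Return the attributes to attach to calls, based on the environment."""
    # APP_ENV is a hypothetical variable name chosen for this sketch.
    return {"env": os.environ.get("APP_ENV", default_env)}


def predict_with_env(model, input_data: str):
    """Call model.predict with the current environment attached as an attribute."""
    # Imported here so call_attributes() can be used without Weave installed.
    import weave

    with weave.attributes(call_attributes()):
        return model.predict(input_data)
```

Setting APP_ENV=production in the production deployment then tags every call made through predict_with_env, while local runs fall back to 'development'.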

This feature is not available in TypeScript yet.
