Ideology - Docs

This section describes the core ideas of Aligned and explains how it differs from other approaches.

Describe What not How

The core idea behind Aligned is that we start with our model's high-level business goals, and do not care about how the model works.

This also applies to all of our data: we describe which data and technologies we have to work with, not how they will be glued together.

In other words, Aligned is a declarative package for describing how the parts of your ML system behave and how they relate to each other.

An Example

To make this a bit clearer, let's look at an example.

@model_contract(
    input_features=[
        review_embedding.embedding,
    ],
    exposed_model=mlflow_server(
        host="http://movie-review-is-negative:8080",
        model_name="movie_review_is_negative",
        model_alias="champion",
    ),
    output_source=FileSource.csv_at("preds.csv")
)
class MovieReviewIsNegative:
    review_id = String().as_entity()
    predicted_sentiment = review.is_negative.as_classification_label()

The code above describes a model that predicts whether a review is positive or negative (predicted_sentiment), and each prediction has a review_id associated with it.

Furthermore, we declare that review.is_negative is the ground truth that the model should learn from.

The prediction is based on the data in review_embedding.embedding.

We can use the model through an MLflow server at http://movie-review-is-negative:8080, which serves the "champion" alias of movie_review_is_negative.

Lastly, we are going to store our predictions in a CSV format at preds.csv.
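To illustrate why a declaration like this is enough to derive behavior, here is a minimal plain-Python sketch. It does not use Aligned's API: the ToyContract class and the lineage function are hypothetical names invented for this illustration, and only the string values are taken from the example above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyContract:
    """Hypothetical stand-in for a model contract; not an Aligned class."""

    name: str
    entity: str
    input_features: tuple[str, ...]
    ground_truth: str
    output_source: str


def lineage(contract: ToyContract) -> dict[str, list[str]]:
    """Derive what the model reads and writes from the declaration alone."""
    return {
        "reads_from": sorted({*contract.input_features, contract.ground_truth}),
        "writes_to": [contract.output_source],
    }


# The same facts as the contract above, written as plain data.
contract = ToyContract(
    name="movie_review_is_negative",
    entity="review_id",
    input_features=("review_embedding.embedding",),
    ground_truth="review.is_negative",
    output_source="preds.csv",
)

print(lineage(contract))
```

Because the contract is data rather than glue code, a generic function can answer questions about it without knowing anything about the model itself.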

Free features

Because we describe this in a declarative way, we get all of the following features without writing anything else.

Automatic joining of ground truths to training datasets
Use the model - store.model("...").predict_over(...)
Automatic read, write and upserts to the prediction source - store.model("...").all_predictions()
Data validation of prediction formats
Data type conversion - such as ISO or Unix timestamps
Data lineage for models and features
Online model performance monitoring
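
As a rough illustration of how validation and type conversion can follow from a declared schema, here is a small plain-Python sketch. It is not how Aligned implements these features: PREDICTION_SCHEMA, to_datetime, and validate_row are hypothetical names, and the predicted_at column is an assumption added only to show timestamp handling.

```python
from datetime import datetime, timezone

# Hypothetical declared schema for prediction rows.
# "predicted_at" is an assumed column, not part of the contract above.
PREDICTION_SCHEMA = {
    "review_id": str,
    "predicted_sentiment": bool,
    "predicted_at": datetime,
}


def to_datetime(value):
    """Accept a datetime, a Unix timestamp in seconds, or an ISO 8601 string."""
    if isinstance(value, datetime):
        return value
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    return datetime.fromisoformat(value)


def validate_row(row: dict) -> dict:
    """Check a row against the schema and convert timestamp columns."""
    missing = PREDICTION_SCHEMA.keys() - row.keys()
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    clean = {}
    for column, expected in PREDICTION_SCHEMA.items():
        value = row[column]
        if expected is datetime:
            value = to_datetime(value)
        elif not isinstance(value, expected):
            raise TypeError(
                f"{column}: expected {expected.__name__}, got {type(value).__name__}"
            )
        clean[column] = value
    return clean


row = validate_row(
    {"review_id": "r-1", "predicted_sentiment": True, "predicted_at": 0}
)
print(row["predicted_at"].isoformat())
```

The point of the sketch is that the checks are driven entirely by the schema, so adding a column to the declaration extends validation without any new code.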