Deploy an ML inference service on a budget in fewer than 10 lines of code.

ebhy Last update: Feb 21, 2024

BudgetML: Deploy ML models on a budget

Installation • Quickstart • Community • Docs

Notice: This library is not being actively maintained, and we're looking for someone to update it and keep it going! Reach out to me directly if you would like to help!

Why

BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end.

We built BudgetML because it's hard to find a simple way to get a model in production fast and cheaply.

Cloud functions are limited in memory and cost a lot at scale.
Kubernetes clusters are overkill for one single model.
Deploying from scratch involves learning too many different concepts like SSL certificate generation, Docker, REST, Uvicorn/Gunicorn, backend servers etc., that are simply not within the scope of a typical data scientist.

BudgetML is our answer to this challenge. It is supposed to be fast, easy, and developer-friendly. It is by no means meant to be used in a full-fledged production-ready setup. It is simply a means to get a server up and running as fast as possible with the lowest costs possible.

BudgetML lets you deploy your model on a Google Cloud Platform preemptible instance (which is ~80% cheaper than a regular instance) with a secured HTTPS API endpoint. The tool sets things up so that the instance automatically restarts whenever it is shut down (which happens at least once every 24 hours), with only a few minutes of downtime. BudgetML aims for the cheapest possible API endpoint with the lowest possible downtime.
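To put "a few minutes of downtime" in perspective, here is a minimal arithmetic sketch. It assumes the instance goes down once per 24 hours for a fixed number of minutes; real preemptions can happen more often, so treat the numbers as illustrative. The function names are made up for this example and are not part of BudgetML.

```python
MINUTES_PER_DAY = 24 * 60


def daily_uptime_percent(downtime_minutes_per_day: float) -> float:
    """Return the uptime percentage for a given downtime per 24 hours."""
    if not 0 <= downtime_minutes_per_day <= MINUTES_PER_DAY:
        raise ValueError("downtime must be between 0 and 1440 minutes")
    return 100.0 * (1 - downtime_minutes_per_day / MINUTES_PER_DAY)


def max_downtime_minutes(target_uptime_percent: float) -> float:
    """Return how many minutes per day may be lost at a target uptime."""
    return MINUTES_PER_DAY * (1 - target_uptime_percent / 100.0)


if __name__ == "__main__":
    # Assumed example: one restart per day that takes 5 minutes.
    print(f"{daily_uptime_percent(5):.2f}% uptime")
    # Downtime budget per day for a 99% uptime target.
    print(f"{max_downtime_minutes(99):.1f} minutes/day")
```

Under these assumptions, a 99% uptime target leaves a budget of about 14.4 minutes of downtime per day, so a single restart of a few minutes fits within it.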

Key Features

Automatic FastAPI server endpoint generation (it's faster than Flask).
Fully interactive docs via Swagger.
Built-in SSL certificate generation via LetsEncrypt and docker-swag.
Uses cheap preemptible instances but has 99% uptime!
Complete OAuth2 secured endpoints with Password and Bearer pattern.

Cost comparison

BudgetML uses Google Cloud preemptible instances under the hood to reduce costs by 80%. This can potentially mean hundreds of dollars' worth of savings. Here is a screenshot of the e2-highmem GCP series, which is a regular family of instances to use for memory-intensive tasks like ML model inference. See the following price comparison (as of Jan 31, 2021 [source]).

Even with the lowest machine_type, the savings are $46/month, and with the highest configuration they reach $370/month!
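The arithmetic behind such a comparison can be sketched as follows. The hourly price used here is a hypothetical placeholder, not a real GCP price, and the 80% discount is the approximate figure quoted above; check the current GCP price list before relying on any number.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def monthly_cost(hourly_price_usd: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of running one instance continuously for a month."""
    return hourly_price_usd * hours


def monthly_savings(
    on_demand_hourly_usd: float,
    discount: float = 0.80,  # assumed preemptible discount (~80%)
    hours: float = HOURS_PER_MONTH,
) -> float:
    """Difference between the regular and the preemptible monthly cost."""
    regular = monthly_cost(on_demand_hourly_usd, hours)
    preemptible = regular * (1 - discount)
    return regular - preemptible


if __name__ == "__main__":
    hypothetical_price = 0.10  # USD per hour, illustrative only
    print(f"regular: ${monthly_cost(hypothetical_price):.2f}/month")
    print(f"savings: ${monthly_savings(hypothetical_price):.2f}/month")
```

Because the discount is proportional, the absolute savings grow with the size of the machine, which is why the larger configurations show the biggest monthly difference.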

Installation

BudgetML is available for easy installation into your environment via PyPI:

pip install budgetml

Alternatively, if you’re feeling brave, feel free to install the bleeding edge:

NOTE: Do so at your own risk; no guarantees given!

pip install git+https://github.com/ebhy/budgetml.git@main --upgrade

Quickstart

BudgetML aims for as simple a process as possible. First set up a predictor:

# predictor.py
class Predictor:
    def load(self):
        from transformers import pipeline
        self.model = pipeline(task="sentiment-analysis")

    async def predict(self, request):
        # We know we are going to use the `predict_dict` method, so we use
        # the request.payload pattern
        req = request.payload
        return self.model(req["text"])[0]
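Before deploying, it can help to exercise the predictor locally. The sketch below is not part of BudgetML: `FakeRequest` is a hypothetical stand-in for whatever request object the server passes in (only the `payload` attribute is assumed, based on the snippet above), and `StubPredictor` swaps the transformers pipeline for a trivial function so the check runs without downloading a model.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FakeRequest:
    """Hypothetical stand-in for the request passed to predict()."""

    payload: Dict[str, Any]


class StubPredictor:
    """Same load/predict shape as the Predictor above, with a stub model."""

    def load(self):
        # Stub in place of transformers.pipeline: like the real pipeline,
        # it returns a list with one dict per input text.
        def fake_model(text):
            label = "POSITIVE" if "good" in text.lower() else "NEGATIVE"
            return [{"label": label, "score": 1.0}]

        self.model = fake_model

    async def predict(self, request):
        req = request.payload
        return self.model(req["text"])[0]


def run_once(predictor, text: str) -> Dict[str, Any]:
    """Load the predictor and run one prediction outside any server."""
    predictor.load()
    request = FakeRequest(payload={"text": text})
    return asyncio.run(predictor.predict(request))


if __name__ == "__main__":
    print(run_once(StubPredictor(), "This library is good"))
```

The same `run_once` helper should also work with the real `Predictor` class, provided `transformers` and a backend are installed locally.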

Then launch it with a simple script:

# deploy.py
import budgetml
from predictor import Predictor

# Add your GCP project name here. The client gets its own name so that it
# does not shadow the imported `budgetml` module.
budget_ml = budgetml.BudgetML(project='GCP_PROJECT')

# launch endpoint
budget_ml.launch(
    Predictor,
    domain="example.com",
    subdomain="api",
    static_ip="32.32.32.32",  # placeholder; use your own static IP
    machine_type="e2-medium",
    requirements=['tensorflow==2.3.0', 'transformers'],
)

For a deeper dive, check out the detailed guide in the examples directory. For more information about the BudgetML API, refer to the docs.
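Once the endpoint is up, a client has to follow the OAuth2 Password and Bearer pattern: exchange credentials for a token, then send the token with each prediction request. The sketch below only illustrates that pattern with the standard library. The paths `/token` and `/predict_dict/`, the `{"payload": ...}` body shape, and the host `api.example.com` are assumptions for illustration; check the interactive Swagger docs of your deployment for the real routes and schemas.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # subdomain + domain from the quickstart
TOKEN_PATH = "/token"                 # assumption: token route
PREDICT_PATH = "/predict_dict/"       # assumption: prediction route


def build_token_request(username: str, password: str) -> urllib.request.Request:
    """OAuth2 password flow: credentials are sent as a form-encoded POST."""
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + TOKEN_PATH,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_predict_request(token: str, payload: dict) -> urllib.request.Request:
    """Bearer pattern: the token travels in the Authorization header."""
    body = json.dumps({"payload": payload}).encode("utf-8")  # assumed schema
    return urllib.request.Request(
        BASE_URL + PREDICT_PATH,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage against a live deployment (not executed here):
#   token = send(build_token_request("user", "secret"))["access_token"]
#   print(send(build_predict_request(token, {"text": "BudgetML is great"})))
```

Building the requests separately from sending them keeps the sketch easy to inspect without a running server.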

Screenshots

Interactive docs to test endpoints. Support for Images.

Password-protected endpoints:

Simple prediction interface:

Projects using BudgetML

We are proud that BudgetML is actively being used in the following live products:

ZenML: For production scenarios

BudgetML is for users on a budget. If you're working in a more serious production environment, then consider using ZenML as the perfect open-source MLOps framework for ML production needs. It does more than just deployments, and is more suited for professional workplaces.

Proudly built by two brothers

We are two brothers who love building products, especially ML-related products that make life easier for people. If you use this tool for any of your products, we would love to hear about it and potentially add it to this space. Please get in touch via email.

Oh and please do consider giving us a GitHub star if you like the repository - open-source is hard, and the support keeps us going.
