bentoml/BentoML


From a model in a Jupyter notebook to a production API service in 5 minutes

BentoML

Getting Started | Documentation | Gallery | Contributing | Releases | License | Blog

BentoML is a flexible framework that accelerates the workflow of serving and deploying machine learning models in the cloud.

Check out our 5-minute Quickstart notebook, which uses BentoML to turn a trained scikit-learn model into a containerized REST API server and then deploys it to AWS Lambda.

If you use BentoML for production workloads or want to contribute, join our Slack channel to hear the latest development updates.


Getting Started

Installation with pip:

pip install bentoml

Defining a prediction service with BentoML:

import bentoml
from bentoml.handlers import DataframeHandler
from bentoml.artifact import SklearnModelArtifact

@bentoml.env(pip_dependencies=["scikit-learn"])
@bentoml.artifacts([SklearnModelArtifact('model')])
class IrisClassifier(bentoml.BentoService):

    @bentoml.api(DataframeHandler)
    def predict(self, df):
        return self.artifacts.model.predict(df)

Train a classifier on the Iris dataset and pack the trained model with the IrisClassifier BentoService defined above:

from sklearn import svm
from sklearn import datasets

clf = svm.SVC(gamma='scale')
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)

# Create an iris classifier service with the newly trained model
iris_classifier_service = IrisClassifier.pack(model=clf)

# Save the entire prediction service to a file bundle
saved_path = iris_classifier_service.save()

A BentoML bundle is a versioned file archive, containing the BentoService you defined, along with trained model artifacts, dependencies and configurations.
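To see what a saved bundle contains, you can walk the directory that `save()` returned. The helper below is a minimal sketch using only the standard library. `list_bundle_files` is a hypothetical name, not part of BentoML, and it assumes nothing about which files BentoML writes:

```python
from pathlib import Path


def list_bundle_files(bundle_dir):
    """Return the sorted relative paths of every file under bundle_dir.

    Hypothetical helper for illustration; this is not a BentoML API.
    """
    root = Path(bundle_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{bundle_dir!r} is not a directory")
    # rglob("*") walks the tree recursively; keep files only, skip directories
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
```

Calling `list_bundle_files(saved_path)` with the path from the example above should list the service code, model artifacts, and configuration files in the bundle.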

Now you can start a REST API server based on the saved BentoML bundle from the command line:

bentoml serve {saved_path}

If you are running this on your local machine, visit http://127.0.0.1:5000 in your browser to try out the API server's web UI for debugging and testing. You can also send a prediction request with curl from the command line:

curl -i \
  --header "Content-Type: application/json" \
  --request POST \
  --data '[[5.1, 3.5, 1.4, 0.2]]' \
  http://localhost:5000/predict
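The same request can be sent from Python. The sketch below mirrors the curl call using only the standard library. The helper names `build_predict_request` and `predict` are hypothetical, and the code assumes the server started by `bentoml serve` is listening on localhost:5000 as shown above:

```python
import json
import urllib.request


def build_predict_request(rows, base_url="http://localhost:5000"):
    """Build a POST request for the /predict endpoint with a JSON body."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, base_url="http://localhost:5000", timeout=10):
    """Send the request and return the raw response body as text."""
    request = build_predict_request(rows, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the API server running, `predict([[5.1, 3.5, 1.4, 0.2]])` returns whatever body the server sends back. The response format is not specified here, so the sketch leaves it as plain text.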

The saved BentoML bundle can also be loaded directly from the command line for inference:

bentoml predict {saved_path} --input='[[5.1, 3.5, 1.4, 0.2]]'

# alternatively:
bentoml predict {saved_path} --input='./iris_test_data.csv'
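If you want to call this CLI from a Python script, for example in a pipeline step, one option is to build the argument list programmatically and hand it to `subprocess`. This is a minimal sketch: the helper names are hypothetical, and it uses only the `bentoml predict` command and `--input` flag shown above:

```python
import json
import shlex
import subprocess


def build_predict_command(saved_path, rows):
    """Return the argument list for `bentoml predict` with inline JSON input."""
    return ["bentoml", "predict", str(saved_path), "--input=" + json.dumps(rows)]


def run_predict(saved_path, rows):
    """Run the command and return its standard output (requires bentoml installed)."""
    args = build_predict_command(saved_path, rows)
    # Passing a list avoids shell quoting problems with the JSON payload
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout


def as_shell_string(args):
    """Render the argument list as a safely quoted shell command for logging."""
    return shlex.join(args)
```

Passing the arguments as a list means the brackets and spaces in the JSON input need no manual quoting.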

A BentoML bundle is pip-installable and can be distributed directly as a PyPI package:

# After the bundle is installed with pip, the BentoService class name becomes the package name
import IrisClassifier

installed_svc = IrisClassifier.load()
installed_svc.predict([[5.1, 3.5, 1.4, 0.2]])

A BentoML bundle is structured to work as a Docker build context, so you can build a Docker image for this API server by using the bundle as the build context directory:

docker build -t my_api_server {saved_path}

To learn more, try out the Getting Started with BentoML notebook.

Examples

FastAI

Scikit-Learn

PyTorch

Keras

XGBoost

H2O

Visit the bentoml/gallery repository for more example projects demonstrating how to use BentoML.

Deployment guides:

Project Overview

BentoML provides two sets of high-level APIs:

  • BentoService: Turn your trained ML model into a versioned file bundle that can be deployed as a containerized REST API server, PyPI package, CLI tool, or batch/streaming job

  • YataiService: Manage and deploy your saved BentoML bundles as prediction services on a Kubernetes cluster or on cloud platforms such as AWS Lambda, SageMaker, Azure ML, and GCP Function

Feature Highlights

  • Multiple Distribution Formats - Easily package your machine learning models and preprocessing code into the format that works best for your inference scenario:

    • Docker Image - deploy as containers running REST API Server
    • PyPI Package - integrate into your python applications seamlessly
    • CLI tool - put your model into Airflow DAG or CI/CD pipeline
    • Spark UDF - run batch serving on a large dataset with Spark
    • Serverless Function - host your model on serverless platforms such as AWS Lambda
  • Multiple Framework Support - BentoML supports a wide range of ML frameworks out of the box, including TensorFlow, PyTorch, Keras, Scikit-Learn, XGBoost, H2O, and FastAI, and can be easily extended to work with new or custom frameworks.

  • Deploy Anywhere - A BentoML bundle can be easily deployed with platforms such as Docker, Kubernetes, Serverless, Airflow, and Clipper, on cloud platforms including AWS, Google Cloud, and Azure.

  • Custom Runtime Backend - Easily integrate your Python preprocessing code with a high-performance deep learning runtime backend, such as tensorflow-serving.

Documentation

Full documentation and API references can be found at bentoml.readthedocs.io

Usage Tracking

By default, the BentoML library reports basic usage data using Amplitude. This helps the BentoML authors understand how people use the tool and improve it over time. You can opt out by running the following command from a terminal:

bentoml config set usage_tracking=false

Contributing

Have questions or feedback? Post a new GitHub issue or join our Slack channel.

Want to help build BentoML? Check out our contributing guide and the development guide.

Releases

BentoML is under active development and is evolving rapidly. It is currently a beta release, and APIs may change in future releases.

Read more about the latest features and changes in BentoML from the releases page.

License

Apache License 2.0
