Pipelines Libraries

14 Components & Libraries

An orchestration platform for the development, production, and observation of data assets.

With Dagster, you declare—as Python functions—the data assets that you want to build. Dagster then helps you run your functions at the right time and keep your assets up-to-date. Here is an example o…

The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

A simple YAML API to get started quickly, a powerful Python API for total flexibility. Automatically cache your pipeline’s previous results and only re-compute tasks that have changed since your last…

Turns Data and AI algorithms into production-ready web applications in no time.

Taipy is designed for data scientists and machine learning engineers to build full-stack apps. To install Taipy stable release run: Below is our filter function. This is a typical Python …

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈

These three functionalities enable a variety of use cases for data scientists, machine learning engineers, and data engineers: From here, you can quickly log a dataset: And there you have it, you now…

[UNMAINTAINED] Automated machine learning for analytics & production

Automated machine learning for production and analytics. auto_ml is designed for production. Here's an example that includes serializing and loading the trained model, then getting predictions on sing…

Pythonic tool for running machine-learning/high performance/quantum-computing workflows in heterogeneous environments.

Covalent is a Python library for AI/ML engineers, developers, and researchers. It provides a straightforward approach to running compute jobs, like LLMs, generative AI, and scientific research, on v…

(AAAI '20) A Python Toolbox for Machine Learning Model Combination

Alternatively, you could clone the repository and run the setup.py file. Initialize a group of classifiers as base estimators. Initialize, fit, predict, and evaluate with Stacking. See a sample output of classifier_s…

pypyr task-runner CLI & API for automation pipelines. Automate anything by combining commands, scripts in different languages, and applications into one pipeline process.

pypyr is a free & open-source task-runner that lets you define and run sequential steps in a pipeline. Like a turbo-charged shell script, but less finicky. Less annoying than a makefile. pypyr ru…

One framework to develop, deploy and operate data workflows with Python and SQL.

Spinnaker Pipeline/Infrastructure Configuration and Templating Tool - Pipelines as Code.

Foremast is a Spinnaker pipeline and infrastructure configuration and templating tool. Just create a couple of JSON configuration files, and manually creating Spinnaker pipelines becomes a thing of…

Distributed Machine Learning Patterns from Manning Publications by Yuan Tang https://bit.ly/2RKv8Zo

This book teaches you how to take machine learning models from your personal laptop to large distributed clusters. You’ll explore key concepts and patterns behind successful distributed machine learn…

The easiest way to use Machine Learning. Mix and match underlying ML libraries and data set sources. Generate new datasets or modify existing ones with ease.

As we all know, the machine learning space has a lot of tools and libraries for creating pipelines to train, test & deploy models, and dealing with these many different APIs can be cumbersome. Our…

Beneath is a serverless real-time data platform ⚡️

Beneath is a serverless real-time data platform. Our goal is to create one end-to-end platform for data workers that combines data storage, processing, and visualization with data quality management …