The open source active learning toolkit to find failure modes in your computer vision models, prioritize data to label next, and drive data curation to improve model performance.

encord-team · Last update: Apr 12, 2023

Documentation | Try it Now | Website | Blog | Slack Channel

Encord Active

What is Encord Active?

Encord Active is an open-source active learning toolkit that helps you find failure modes in your models and improve your data quality and model performance.

Use Encord Active to visualize your data, evaluate your models, surface model failure modes, find labeling mistakes, prioritize high-value data for re-labeling and more!

💡 When to use Encord Active?

Encord Active helps you understand and improve your data, labels, and models at all stages of your computer vision journey.

Whether you've just started collecting data, labeled your first batch of samples, or have multiple models in production, Encord Active can help you.

🔖 Documentation

Our full documentation is available on the Encord Active documentation site.

Installation

The simplest way to install the CLI is using pip in a suitable virtual environment:

pip install encord-active

We recommend using a virtual environment, such as venv:

python3.9 -m venv ea-venv
source ea-venv/bin/activate
pip install encord-active

encord-active requires Python 3.9. If you have trouble installing encord-active, you can find more detailed installation instructions in the documentation.
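
Because the install depends on the interpreter version, it can help to check the active interpreter before running pip. The following is a minimal sketch, not part of Encord Active: the helper name `is_supported_python` is hypothetical, and reading "requires Python 3.9" as "exactly 3.9" is an assumption.

```python
import sys

# Taken from the requirement stated above; treating it as an exact
# major.minor match is an assumption, not something the package enforces here.
REQUIRED = (3, 9)


def is_supported_python(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor equals `required`."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {current} matches the documented requirement.")
    else:
        print(f"Python {current} detected; this README asks for Python 3.9.")
```

Run it with the interpreter you plan to use for the virtual environment, for example `python3.9 check_python.py` (the file name is arbitrary).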

👋 Quickstart

Get started immediately by sourcing your environment and running the code below. This downloads a small dataset and launches the Encord Active App for you to explore:

encord-active quickstart

Alternatively, run it with Docker:

docker run -it --rm -p 8501:8501 -v ${PWD}:/data encord/encord-active quickstart

After opening the UI, we recommend heading to the workflow documentation to see some common workflows.

⬇️  Download a sandbox dataset

Another way to start quickly is by downloading an existing dataset from the sandbox. The download command will ask which pre-built dataset to use and will download it into a new directory in the current working directory.

encord-active download
cd /path/of/downloaded/project
encord-active visualize

The app should then open in the browser. If not, navigate to localhost:8501. Our docs contain more information about what you can see on the page.
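
If the browser does not open, it can be useful to confirm that something is actually listening on the port before debugging further. Here is a small sketch using only the Python standard library; `port_is_open` is a hypothetical helper, and it only tests that a TCP connection succeeds, not that the responding process is Encord Active.

```python
import socket


def port_is_open(host="localhost", port=8501, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:8501.")
    else:
        print("Nothing is reachable on localhost:8501 yet.")
```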

Import your dataset

Quick import Dataset

To import your data (without labels), save it in a directory and run the command:

# within venv
encord-active init /path/to/data/directory

A project will be created using the data in the directory.

To visualize the project, run:

cd /path/to/project
encord-active visualize

You can find more details on the init command in the documentation.
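
Before pointing `encord-active init` at a directory, you may want a quick inventory of what it contains. The sketch below counts files by extension and separates the formats listed under "Supported data" in this README from everything else. It is illustrative only: `summarize_data_dir` is a hypothetical helper, and both the case-insensitive extension matching and the recursive directory walk are assumptions rather than documented behavior of the init command.

```python
from collections import Counter
from pathlib import Path

# Extensions taken from the "Supported data" table in this README.
SUPPORTED_EXTENSIONS = {".jpg", ".png", ".tiff", ".mp4"}


def summarize_data_dir(data_dir):
    """Count files per lowercase extension, split into supported and other."""
    supported, other = Counter(), Counter()
    for path in Path(data_dir).rglob("*"):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        if ext in SUPPORTED_EXTENSIONS:
            supported[ext] += 1
        else:
            other[ext] += 1
    return supported, other


if __name__ == "__main__":
    supported, other = summarize_data_dir(".")
    print("supported:", dict(supported))
    print("other:", dict(other))
```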

Import from COCO

To import your data, labels, and predictions from COCO, save your data in a directory and run the command:

# install COCO extras (quoted so shells such as zsh do not expand the brackets)
python -m pip install "encord-active[coco]"

# import samples with COCO annotations
encord-active import project --coco -i ./images -a ./annotations.json

# import COCO model predictions
encord-active import predictions --coco results.json
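
A malformed annotations file is a common reason for an import to fail, so a quick structural check beforehand can save time. The following sketch relies only on the standard COCO layout (top-level `images`, `annotations`, and `categories` lists, with each annotation carrying `image_id` and `category_id`). The function names are hypothetical, and this is not the validation Encord Active itself performs.

```python
import json


def find_coco_problems(coco):
    """Return a list of human-readable problems in a COCO-style dict."""
    problems = []
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            problems.append(f"missing top-level key: {key}")

    image_ids = {image["id"] for image in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}

    for ann in coco.get("annotations", []):
        if ann.get("image_id") not in image_ids:
            problems.append(
                f"annotation {ann.get('id')} references unknown image {ann.get('image_id')}"
            )
        if ann.get("category_id") not in category_ids:
            problems.append(
                f"annotation {ann.get('id')} references unknown category {ann.get('category_id')}"
            )
    return problems


def load_and_check(path):
    """Load a COCO JSON file from disk and return its problems."""
    with open(path, encoding="utf-8") as handle:
        return find_coco_problems(json.load(handle))
```

For the command above, `load_and_check("./annotations.json")` returning an empty list means every annotation points at a known image and category.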

Import from Encord Platform

This section requires setting up an SSH key with Encord, so it is slightly more technical.

To import an Encord project, use this command:

encord-active import project

The command will allow you to search through your Encord projects and choose which one to import.

Concepts and features

Quality metrics:

Quality metrics are applied to your data, labels, and predictions to assign them quality metric scores. Plug in your own or rely on Encord Active's prebuilt quality metrics. The quality metrics automatically decompose your data, label, and model quality to show you how to improve your model performance from a data-centric perspective. Encord Active ships with 25+ metrics and more are coming; contributions are also very welcome.
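
To make the idea concrete, a quality metric can be thought of as a function that maps each data item to a score, which is then used to sort or filter the dataset. The sketch below is purely conceptual and does not use the Encord Active metric API: `brightness_score` and `rank_by_score` are hypothetical names, and images are represented as nested lists of 8-bit grayscale values to keep the example dependency-free.

```python
def brightness_score(pixels):
    """Mean intensity of an 8-bit grayscale image, scaled to [0, 1]."""
    values = [value for row in pixels for value in row]
    if not values:
        raise ValueError("image has no pixels")
    return sum(values) / (255 * len(values))


def rank_by_score(images, metric):
    """Return (name, score) pairs sorted from lowest to highest score."""
    scored = [(name, metric(pixels)) for name, pixels in images.items()]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    images = {
        "bright": [[255, 255], [255, 255]],
        "dark": [[0, 0], [0, 0]],
        "mid": [[0, 255]],
    }
    for name, score in rank_by_score(images, brightness_score):
        print(f"{name}: {score:.2f}")
```

Sorting by such a score surfaces the extremes of the dataset first, which is where unusual or problematic samples tend to sit.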

Core features:

Visit our documentation to learn more.

Supported data:

| Data | Labels          | Project sizes         |
| ---- | --------------- | --------------------- |
| jpg  | Bounding Boxes  | Images 50,000         |
| png  | Polygons        | Videos 50,000 frames  |
| tiff | Segmentation    |                       |
| mp4  | Classifications |                       |
|      | Polylines 🟡    |                       |

🧑🏽‍💻 Development

🛠 Build your own quality metrics

Encord Active is built with customizability in mind, so you can easily build your own custom metrics 🔧. See the Writing Your Own Metric page in the docs for details. If you need help or guidance, feel free to ping us in the Slack channel!

👪 Community and support

Join our channel on Slack to connect with the team behind Encord Active.

Also, feel free to suggest improvements or report problems via GitHub issues.

🎇 Contributions

If you're using Encord Active in your organization, please consider adding your company name to ADOPTERS.md. It really helps the project gain momentum and credibility. It's a small contribution back to the project with a big impact.

If you want to share your custom metrics or improve the tool, please see our contributing docs.

🦸 Contributors

@Javi Leguina

Licence

This repository is published under the Apache 2.0 licence.