Create naive (no temporal loss) NST for videos with person segmentation. Just place your videos in data/, run and you get your stylized and segmented videos.

gordicaleksa Last update: May 01, 2024

Naive Video - Fast NST 🎥 + ⚡💻 + 🎨 = ❤️

This repo is a wrapper around my implementation of fast NST (for static images) and it additionally provides:

Support for creating (naive - no temporal loss included) videos
Support for creating segmentation masks for the person talking

Just place your videos in the data/ directory and you get stylized/segmented videos - easy as that.

It's an accompanying repo for this video series on YouTube.

The first video of the series was created exactly using this method (I also used ReCoNet for 1 part of the video).

Combining stylized frames with original frames (via seg masks)

On the left you can see typical NST output and the 2 other images on the right were created using masks.

They were created using this segmentation mask (and original frame as the overlay):

It's not perfect but it was created in a fully automatic fashion.

Note: I intentionally showcased an imperfect segmentation mask here to display some of the problems I had (part of the world map behind me had a skin-like color).
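The combining step described above boils down to a per-pixel blend driven by the mask. Here is a minimal sketch of that idea, not the repo's actual code: the function name combine_with_mask is hypothetical, and it assumes uint8 frames of shape (H, W, 3) and a uint8 mask of shape (H, W) where 255 marks the person.

```python
import numpy as np


def combine_with_mask(stylized, overlay, mask):
    """Blend two frames: `overlay` where the mask is set, `stylized` elsewhere.

    Hypothetical helper for illustration only.
    Assumes uint8 frames of shape (H, W, 3) and a uint8 mask of shape (H, W)
    where 255 marks person pixels and 0 marks background.
    """
    # Scale the mask to [0, 1] and add a channel axis so it broadcasts over RGB.
    alpha = (mask.astype(np.float32) / 255.0)[..., np.newaxis]
    blended = alpha * overlay.astype(np.float32) + (1.0 - alpha) * stylized.astype(np.float32)
    # Round back to valid 8-bit pixel values.
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

Swapping the two frame arguments gives the opposite effect (stylized person over the original background), and the overlay does not have to be the original frame: it can just as well be a frame stylized by another model.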

Combining 2 types of stylized frames (via seg masks)

Similarly, instead of using the original frame as the overlay, you can use some other style:

Setup

git clone --recurse-submodules https://github.com/gordicaleksa/pytorch-naive-video-nst
cd pytorch-naive-video-nst
Run conda env create from the project directory (this will create a brand new conda environment).
Run activate pytorch-video-naive (for running scripts from your console, or set the interpreter in your IDE).
Run resource_downloader.py (from the pytorch-nst-feedforward submodule); it will download 4 pretrained models.
Make sure you have ffmpeg in your system path (used for creating videos)
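
If you are not sure whether ffmpeg is visible from your environment, a quick check like the one below can help. It is a small standalone sketch (find_executable is a hypothetical helper, not part of this repo) that only uses shutil.which from the Python standard library.

```python
import shutil


def find_executable(name="ffmpeg"):
    """Return the full path of `name` if it is on the system path, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path:
        print(f"ffmpeg found at {path}")
    else:
        print("ffmpeg is NOT on your system path")
```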

That's it! It should work out of the box, since the environment.yml file takes care of the dependencies.

Note: There is 1 git submodule (the fast NST project) in this repo. That's why you need --recurse-submodules.
Check out this SO link if you run into any problems.

The PyTorch package will pull some version of CUDA with it, but it is highly recommended that you install system-wide CUDA beforehand, mostly because of GPU drivers. I also recommend using the Miniconda installer as a way to get conda on your system.

Follow through points 1 and 2 of this setup and use the most up-to-date versions of Miniconda and CUDA/cuDNN (I recommend CUDA 10.1 or 10.2 as those are compatible with PyTorch 1.5, which is used in this repo, and newest compatible cuDNN).

Usage

After you're done with the setup you can just run this: python naive_video_pipeline.py
And it will create results for the default example.mp4 video in data/clip_example/

To run the pipeline on your own videos do the following:

Place them under data/
Specify which ones you want to process via the --specific_videos argument, like: ['my_video1.mp4', 'my_video2.mp4']

That's it! If you bump into CUDA out of memory errors check out the Debugging section (easy to fix).

The output you can expect after processing my_video.mp4 (it will be placed in data/clip_my_video/):

frames/ - dumped frames from your video
masks/ and processed_masks/ - contain segmentation masks for the person in the video
my_video.aac - sound clip from your video
<model_name>/ - contains stylized and combined imagery and videos (this is what you want)
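
To quickly verify that a run produced the layout listed above, you could use something like the following sketch. The helper summarize_outputs is hypothetical (it is not part of this repo), and it only checks for the entries named in the list; the <model_name>/ directory is left out because its name depends on the model you used.

```python
from pathlib import Path


def summarize_outputs(clip_dir, video_stem):
    """Report which of the expected pipeline outputs exist in `clip_dir`.

    Hypothetical helper; the expected names come from the list above.
    """
    clip_dir = Path(clip_dir)
    expected = {
        "frames": clip_dir / "frames",
        "masks": clip_dir / "masks",
        "processed_masks": clip_dir / "processed_masks",
        "audio": clip_dir / f"{video_stem}.aac",
    }
    return {name: path.exists() for name, path in expected.items()}


if __name__ == "__main__":
    # Example: inspect the outputs for data/clip_my_video/
    for name, found in summarize_outputs("data/clip_my_video", "my_video").items():
        print(f"{name}: {'ok' if found else 'missing'}")
```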

If you want to combine your videos with some other style instead of overlaying the original frame,
set --other_style to the name of the model whose frames you want to use as the overlay.

Debugging

Q: I'm getting a CUDA out of memory error in the segmentation/stylization stage. What should I do?
A: You have 2 options: a) make the image width smaller, or b) make the batch size smaller.
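
Both options work because they reduce how many pixels sit on the GPU at once. The sketch below illustrates the arithmetic with two hypothetical helpers (resize_dims and batched are not part of this repo, and preserving the aspect ratio when shrinking the width is an assumption):

```python
def resize_dims(width, height, new_width):
    """Option a): shrink the width and scale the height to keep the aspect ratio."""
    new_height = round(height * new_width / width)
    return new_width, new_height


def batched(items, batch_size):
    """Option b): yield `items` in chunks of at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    # Halving the width of a 1920x1080 frame cuts the pixel count by 4x.
    print(resize_dims(1920, 1080, 960))
    # Five frames processed two at a time instead of all at once.
    print(list(batched(["f0", "f1", "f2", "f3", "f4"], 2)))
```

Peak memory grows roughly with batch size times pixels per frame, so halving either one is usually enough to get past the error.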

Citation

If you find this code useful for your research, please cite the following:

@misc{Gordić2020-naive-video-nst,
  author = {Gordić, Aleksa},
  title = {pytorch-naive-video-nst},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/gordicaleksa/pytorch-naive-video-nst}},
}

Connect with me

If you'd love to have some more AI-related content in your life 🤓, consider:

Subscribing to my YouTube channel The AI Epiphany 🔔
Following me on LinkedIn and Twitter 💡
Following me on Medium 📚 ❤️
