Training Your Own Flux LoRA Model

FluxProArt · 2 years ago

Training Your Own Flux LoRA Model for Realistic AI Image Generation

Flux is one of the most advanced AI image generation models available today, capable of producing highly realistic results on par with Midjourney. A powerful new feature allows you to train your own likeness into the Flux model using a technique called LoRA (Low-Rank Adaptation). This enables generating creative AI images of yourself in any scenario, such as appearing as a wizard, astronaut, or superhero.

There are currently four main ways to train a custom Flux LoRA model:

Training on FluxPro.art
Using the OstrisAI toolkit on Google Colab
Through the Fal.ai platform
Through Replicate.com

Let's dive into the details of each method, along with some tips and tricks for optimal results.

1. FluxPro.art Custom LoRA Training

Compared to the other options below, FluxPro.art makes it easy to train custom models without technical expertise.

Go to https://fluxpro.art/models and click "Train new model"
Provide a name for your model and select the subject type you wish to train (options include Man, Woman, Product, Style).
Upload 10-20 high-quality example images
Our system will train a custom model in just 20-40 minutes, saving you time and effort.
Once training is complete, you can click the "Generate image" button on the model page to create stunning, personalized images that capture your unique essence and style.

2. Training Flux LoRA with OstrisAI Toolkit on Google Colab

The OstrisAI toolkit provides a script for training and inference of Flux LoRA models that is used by both Fal.ai and Replicate under the hood. You can run it directly from Google Colab using this notebook: https://colab.research.google.com/drive/1r09aImgL1YhQsJgsLWnb67-bjTV88-W0?usp=sharing

Requirements

To train a Flux LoRA model, you need a GPU with at least 24GB of VRAM. If the GPU is also controlling your monitors, set low_vram: true in the config file under model: to quantize the model on CPU. The toolkit has been tested on Linux, with some reports of bugs on Windows.

Model License

Currently, training only works with the FLUX.1-dev model, which has a non-commercial license. You must accept the license on Hugging Face before use. Here are the steps:

Sign in to Hugging Face and accept the access terms for black-forest-labs/FLUX.1-dev
Get a READ API key from Hugging Face
Place the key in a file named .env in the root folder of the toolkit, like this: HF_TOKEN=your_key_here
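
Before starting a long training run, it can help to confirm that the .env file really contains the token. The following is a minimal sketch, not part of the toolkit: the function name read_hf_token is hypothetical, and it assumes the file uses plain KEY=VALUE lines as in the example above.

```python
from pathlib import Path


def read_hf_token(env_path):
    """Return the HF_TOKEN value from a simple KEY=VALUE .env file, or None."""
    path = Path(env_path)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "HF_TOKEN":
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip("\"'") or None
    return None


if __name__ == "__main__":
    token = read_hf_token(".env")
    print("HF_TOKEN found" if token else "HF_TOKEN missing from .env")
```

Run it from the toolkit's root folder. It only reports whether a token is present and never prints the key itself.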

Training Steps

Copy the example config file config/examples/train_lora_flux_24gb.yaml to the config folder and rename it (for example, to your_config_file.yml)
Edit the config file following the comments
Run the training with python3 run.py config/your_config_file.yml

A folder will be created with checkpoints and generated images. You can stop and resume training anytime.
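
The copy-and-run steps above can be sketched in Python. This is only an illustration under stated assumptions: prepare_config is a hypothetical helper, repo_root is assumed to be the folder where the toolkit was cloned, and your_config_file.yml is a placeholder name. The helper copies the example config and returns the training command without running it, so you can still edit the config first.

```python
import shutil
from pathlib import Path

# Path of the example config inside the toolkit, as given in the steps above.
EXAMPLE_CONFIG = Path("config/examples/train_lora_flux_24gb.yaml")


def prepare_config(repo_root, new_name="your_config_file.yml"):
    """Copy the example config into config/ and return the training command."""
    root = Path(repo_root)
    source = root / EXAMPLE_CONFIG
    target = root / "config" / new_name
    if not target.exists():
        # Never overwrite a config that has already been edited.
        shutil.copy(source, target)
    # The command is meant to be run from the toolkit's root folder.
    return ["python3", "run.py", f"config/{new_name}"]
```

Calling prepare_config(".") from the toolkit folder leaves a copy at config/your_config_file.yml. After editing it, the returned list can be passed to subprocess.run, or the same command can be typed into a terminal.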

Dataset Preparation

Your dataset should be a folder of images (jpg, jpeg, png) and matching .txt caption files, e.g. image1.jpg and image1.txt. The word [trigger] in captions will be replaced by the trigger_word from your config. Images will be automatically resized and cropped.
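
A quick check of the dataset folder can catch missing captions before training starts. The sketch below is not part of the toolkit, and both function names are hypothetical. It assumes the layout described above, where each image sits next to a .txt file with the same base name. The toolkit performs the [trigger] substitution itself, so preview_caption exists only to show how a caption will read.

```python
from pathlib import Path

# Image types the dataset section above lists as supported.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def images_missing_captions(dataset_dir):
    """Return sorted names of images that have no matching .txt caption."""
    folder = Path(dataset_dir)
    missing = []
    for image in folder.iterdir():
        if image.suffix.lower() in IMAGE_SUFFIXES:
            # image1.jpg is expected to have image1.txt beside it.
            if not image.with_suffix(".txt").is_file():
                missing.append(image.name)
    return sorted(missing)


def preview_caption(caption, trigger_word):
    """Show how a caption reads once [trigger] is substituted."""
    return caption.replace("[trigger]", trigger_word)
```

For example, images_missing_captions("my_dataset") returns an empty list when every image has a caption. Here my_dataset is a placeholder folder name.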

3. Training Flux LoRA on Fal.ai

Fal.ai provides a simple interface for training Flux LoRA models at https://fal.ai/models/fal-ai/flux-lora-general-training. The process is:

Upload 12-15 images of yourself
Set training steps to 1,000 and specify a trigger word
Start training (takes 25-30 minutes, costs ~$5)
When complete, copy the URL of the generated .safetensors file

To generate images with your trained LoRA:

Go to https://fal.ai/models/fal-ai/flux-general
Paste the .safetensors URL in the "LoRAs -> Path" box
Use your trigger word in the prompt

Each image generation costs $0.075. You'll need a Fal.ai account with credits.
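
To budget credits ahead of time, the figures above can be combined into a rough estimate. This is a sketch only: it hard-codes the approximate prices quoted in this section, which may change, and estimate_fal_cost is a hypothetical helper rather than anything provided by Fal.ai.

```python
TRAINING_COST_USD = 5.00    # approximate cost of one training run, quoted above
COST_PER_IMAGE_USD = 0.075  # cost per generated image, quoted above


def estimate_fal_cost(num_images, training_runs=1):
    """Rough total in USD for training plus generating num_images images."""
    total = training_runs * TRAINING_COST_USD + num_images * COST_PER_IMAGE_USD
    # Round to cents to hide floating-point noise.
    return round(total, 2)
```

With these figures, one training run plus 100 generated images comes to about $12.50.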

4. Training Flux LoRA on Replicate

Luca Taco's AI Toolkit model on Replicate provides a simple interface for training Flux LoRA models at https://replicate.com/lucataco/ai-toolkit/train. Here are the steps:

Gather 12-20 high-quality photos of your face
Rename them as "a photo of {your name} 1.jpg", "a photo of {your name} 2.jpg", etc.
Zip the photos into a single file
Create an account on Replicate.com
Go to the "Luca Taco AI Toolkit" model page
Under "Train", create a new model and give it a name
Upload your zip file of photos
Set training steps to 1000
Obtain a Hugging Face token and paste it in
Click "Create Training" (takes ~25 minutes, costs ~$2.10)
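
The renaming and zipping steps above can be automated with Python's standard library. The sketch below is an illustration under assumptions: zip_training_photos is a hypothetical helper, and accepting .jpeg and .png files alongside .jpg is an assumption, since the steps only mention .jpg. The original files are left untouched. Only the names inside the zip follow the "a photo of {your name} N" pattern.

```python
import zipfile
from pathlib import Path

# Assumption: these image types are acceptable; the steps above only show .jpg.
PHOTO_SUFFIXES = {".jpg", ".jpeg", ".png"}


def zip_training_photos(photo_dir, subject_name, zip_path):
    """Zip photos under names like 'a photo of NAME 1.jpg'; return the names."""
    photos = sorted(
        p for p in Path(photo_dir).iterdir()
        if p.suffix.lower() in PHOTO_SUFFIXES
    )
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, photo in enumerate(photos, start=1):
            # Keep each file's own extension; only the name in the zip changes.
            new_name = f"a photo of {subject_name} {index}{photo.suffix.lower()}"
            archive.write(photo, arcname=new_name)
            written.append(new_name)
    return written
```

For example, zip_training_photos("photos", "alex", "alex.zip") produces a single alex.zip ready to upload. The folder name and subject name here are placeholders.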

Once training completes, use the "Flux Dev LoRA" model on Replicate to generate images:

Go to https://replicate.com/lucataco/flux-dev-lora
Provide a prompt like "{your name} as a wizard"
Set the LoRA scale to 1

The creator of the Replicate model has provided a one-time $10 coupon code to cover initial training and generation costs.

Tips and Tricks for Optimal Results

Through experimentation, the following tips have been found to improve Flux LoRA training results:

Avoid mixing data types: Combining realistic photos with cartoon/anime style training images tends to produce plastic-looking, uncanny results. Stick to one consistent image style in your dataset.
Use natural language captions: Comma-separated tags (like those used on Danbooru) are less effective than full, narrative-style captions written in natural language. Describe the images as you would explain them to another person.
Keep datasets small: Surprisingly, smaller datasets of 20-30 images provide more flexible prompting compared to larger sets. Adding just 15 more images drastically reduced the prompting flexibility in one test.
Aim for 20 images at 1000 steps: Excellent likeness capture has been achieved using the Kohya trainer with a dataset of 20 well-chosen images trained for 1000 steps and 1 epoch. This seems to be a sweet spot.
Train at 512x512 resolution: While higher resolutions like 1024x1024 are often used, training at 512x512 produces great results in much less time and at lower cost. The model is able to handle different resolutions regardless.
Put trigger/name first in prompt: When generating images, place your trigger word or name token at the very start of the prompt for best results.
Use AI prompt generators: Tools like Claude can help optimize your prompts to get more aesthetic, creative results from your model.
Animate with Runway Gen-3: For the ultimate output, consider animating the generated images using Runway's Gen-3 model to bring them to life.
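
The tip about putting the trigger word first is easy to enforce in code if you generate prompts programmatically. The following is a minimal sketch. build_prompt is a hypothetical helper, and "ohwx" in the usage note is a placeholder trigger word, not something the tools require.

```python
def build_prompt(trigger_word, description):
    """Return a prompt that starts with the trigger word."""
    trigger = trigger_word.strip()
    text = description.strip()
    lowered = text.lower()
    # Leave the prompt alone if it already leads with the trigger word.
    if lowered == trigger.lower() or lowered.startswith(trigger.lower() + " "):
        return text
    return f"{trigger} {text}".strip()
```

For example, build_prompt("ohwx", "as a wizard in a misty forest") puts "ohwx" at the very start of the prompt, while a prompt that already begins with the trigger is returned unchanged.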

With just a dozen photos and 30 minutes of setup using any of these four methods, you can harness the power of Flux to realistically insert yourself into any imagined scenario. By following the tips above, you can achieve impressive results rivaling the best AI image generation available today. The LoRA technique opens up a world of creative possibilities for generating high-fidelity, personalized AI images.