Training Your Own Flux LoRA Model
FluxProArt · Training Your Own Flux LoRA Model for Realistic AI Image Generation
Flux is one of the most advanced AI image generation models available today, capable of producing highly realistic results on par with Midjourney. A powerful new feature allows you to train your own likeness into the Flux model using a technique called LoRA (Low-Rank Adaptation). This enables generating creative AI images of yourself in any scenario, such as appearing as a wizard, astronaut, or superhero.
There are currently three main ways to train a custom Flux LoRA model:
- Using the OstrisAI toolkit on Google Colab
- Through the Fal.ai platform
- Via Replicate.com
Let's dive into the details of each method, along with some tips and tricks for optimal results.
Training Flux LoRA with OstrisAI Toolkit on Google Colab
The OstrisAI toolkit provides a script for training and inference of Flux LoRA models that is used by both Fal.ai and Replicate under the hood. You can run it directly from Google Colab using this notebook: https://colab.research.google.com/drive/1r09aImgL1YhQsJgsLWnb67-bjTV88-W0?usp=sharing
Requirements
To train a Flux LoRA model, you need a GPU with at least 24GB of VRAM. If the GPU is also controlling your monitors, set low_vram: true in the config file under model: to quantize the model on CPU. The toolkit has been tested on Linux, with some reports of bugs on Windows.
Model License
Currently, training only works with the FLUX.1-dev model, which has a non-commercial license. You must accept the license on Hugging Face before use. Here are the steps:
- Sign in to Hugging Face and accept the access terms for black-forest-labs/FLUX.1-dev
- Get a READ API key from Hugging Face
- Place the key in a file named .env in the root folder, like: HF_TOKEN=your_key_here
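As a rough illustration of the last step, the sketch below writes the token to a .env file and reads it back to confirm the format. The helper names write_env_token and read_env_token are hypothetical and are not part of the toolkit, which reads the file on its own; the token value shown is a placeholder.

```python
from pathlib import Path
from typing import Optional


def write_env_token(root: Path, token: str) -> Path:
    """Write HF_TOKEN=<token> to <root>/.env (overwrites any existing file)."""
    env_path = root / ".env"
    env_path.write_text(f"HF_TOKEN={token}\n", encoding="utf-8")
    return env_path


def read_env_token(env_path: Path) -> Optional[str]:
    """Return the HF_TOKEN value from a .env file, or None if it is absent."""
    for line in env_path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == "HF_TOKEN":
            return value.strip()
    return None


if __name__ == "__main__":
    # Hypothetical usage: run from the toolkit's root folder.
    path = write_env_token(Path("."), "your_key_here")
    print(read_env_token(path))
```

Keep the .env file out of version control, since it holds a credential.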
Training Steps
- Copy the example config file config/examples/train_lora_flux_24gb.yaml to the config folder and rename it
- Edit the config file following the comments
- Run the training with python3 run.py config/your_config_file.yml
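The copy-and-run steps above can be sketched in Python as follows. This is only a convenience wrapper around the same file copy and the same python3 run.py command; the function names and the toolkit_root argument are assumptions made for illustration, and the config still has to be edited by hand before training.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List

# Example config path taken from the steps above, relative to the toolkit root.
EXAMPLE_CONFIG = Path("config/examples/train_lora_flux_24gb.yaml")


def prepare_config(toolkit_root: Path, new_name: str) -> Path:
    """Copy the example config into <toolkit_root>/config under a new name."""
    target = toolkit_root / "config" / new_name
    shutil.copyfile(toolkit_root / EXAMPLE_CONFIG, target)
    return target


def training_command(config_path: Path) -> List[str]:
    """Build the command from the steps above: python3 run.py <config>."""
    return ["python3", "run.py", config_path.as_posix()]


def run_training(toolkit_root: Path, config_path: Path) -> None:
    """Launch training from the toolkit root; raises if the run fails."""
    relative = config_path.relative_to(toolkit_root)
    subprocess.run(training_command(relative), cwd=toolkit_root, check=True)
```

A typical sequence would be prepare_config(root, "my_lora.yml"), then editing the copied file, then run_training(root, root / "config" / "my_lora.yml").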
A folder will be created with checkpoints and generated images. You can stop and resume training anytime.
Dataset Preparation
Your dataset should be a folder of images (jpg, jpeg, png) and matching .txt caption files, e.g. image1.jpg and image1.txt. The word [trigger] in captions will be replaced by the trigger_word from your config. Images will be automatically resized and cropped.
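Before starting a run, it can help to check that every image has a caption file. The sketch below does that and also previews how a caption would read once [trigger] is substituted. Both helpers are hypothetical illustrations; the toolkit performs the actual substitution itself during training, and the trigger word "p3r5on" is a made-up example.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def missing_captions(dataset_dir: Path) -> List[str]:
    """Return the names of images that have no matching .txt caption file."""
    missing = []
    for path in sorted(dataset_dir.iterdir()):
        is_image = path.suffix.lower() in IMAGE_SUFFIXES
        if is_image and not path.with_suffix(".txt").exists():
            missing.append(path.name)
    return missing


def preview_caption(caption: str, trigger_word: str) -> str:
    """Show a caption as it would read after [trigger] is replaced."""
    return caption.replace("[trigger]", trigger_word)


if __name__ == "__main__":
    print(preview_caption("a photo of [trigger] hiking at sunset", "p3r5on"))
```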
Training Flux LoRA on Fal.ai
Fal.ai provides a simple interface for training Flux LoRA models at https://fal.ai/models/fal-ai/flux-lora-general-training. The process is:
- Upload 12-15 images of yourself
- Set training steps to 1,000 and specify a trigger word
- Start training (takes 25-30 minutes, costs ~$5)
- When complete, copy the URL of the generated .safetensors file
To generate images with your trained LoRA:
- Go to https://fal.ai/models/fal-ai/flux-general
- Paste the .safetensors URL in the "LoRAs -> Path" box
- Use your trigger word in the prompt
Each image generation costs $0.075. You'll need a Fal.ai account with credits.
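To budget credits, the figures quoted above can be combined into a simple estimate. This is a back-of-the-envelope sketch: the constants are the approximate prices mentioned in this article, actual pricing may differ, and the function name is hypothetical.

```python
TRAINING_COST_USD = 5.00     # approximate cost of one training run (the "~$5" above)
COST_PER_IMAGE_USD = 0.075   # per-image generation price quoted above


def estimate_fal_cost(num_images: int, training_runs: int = 1) -> float:
    """Estimate total spend in USD for training runs plus generated images."""
    if num_images < 0 or training_runs < 0:
        raise ValueError("counts must be non-negative")
    total = training_runs * TRAINING_COST_USD + num_images * COST_PER_IMAGE_USD
    return round(total, 2)


if __name__ == "__main__":
    # One training run plus 100 generated images.
    print(estimate_fal_cost(100))  # 12.5
```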
Training Flux LoRA on Replicate
The AI Toolkit trainer by Luca Taco on Replicate provides a simple interface for training Flux LoRA at https://replicate.com/lucataco/ai-toolkit/train. Here are the steps:
- Gather 12-20 high-quality photos of your face
- Rename them as "a photo of {your name} 1.jpg", "a photo of {your name} 2.jpg", etc.
- Zip the photos into a single file
- Create an account on Replicate.com
- Go to the "Luca Taco AI Toolkit" model page
- Under "Train", create a new model and give it a name
- Upload your zip file of photos
- Set training steps to 1000
- Obtain a Hugging Face token and paste it in
- Click "Create Training" (takes ~25 minutes, costs ~$2.10)
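The renaming and zipping steps above can be automated with a short script. The sketch below leaves the original files untouched and stores each photo in the archive under the "a photo of {your name} N" pattern, keeping its original extension. The function name and arguments are hypothetical, and uploading the resulting zip is still done through the Replicate page.

```python
import zipfile
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def zip_renamed_photos(photo_dir: Path, name: str, zip_path: Path) -> List[str]:
    """Zip photos as 'a photo of <name> 1.jpg', ...; return the names used."""
    photos = sorted(
        p for p in photo_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    archived = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, photo in enumerate(photos, start=1):
            new_name = f"a photo of {name} {index}{photo.suffix.lower()}"
            # arcname sets the name inside the zip; the file on disk is not renamed.
            archive.write(photo, arcname=new_name)
            archived.append(new_name)
    return archived
```

For example, zip_renamed_photos(Path("photos"), "alex", Path("alex.zip")) would produce a single archive ready to upload.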
Once training completes, use the "Flux Dev LoRA" model on Replicate to generate images:
- Go to https://replicate.com/lucataco/flux-dev-lora
- Provide a prompt like "{your name} as a wizard"
- Set the LoRA scale to 1
The creator of the Replicate model has provided a one-time $10 coupon code to cover initial training and generation costs.
Tips and Tricks for Optimal Results
Through experimentation, the following tips have been found to improve Flux LoRA training results:
- Avoid mixing data types: Combining realistic photos with cartoon/anime style training images tends to produce plastic-looking, uncanny results. Stick to one consistent image style in your dataset.
- Use natural language captions: Comma-separated tags (like those used on Danbooru) are less effective than full, narrative-style captions written in natural language. Describe the images as you would explain them to another person.
- Keep datasets small: Surprisingly, smaller datasets of 20-30 images provide more flexible prompting compared to larger sets. Adding just 15 more images drastically reduced the prompting flexibility in one test.
- Aim for 20 images at 1000 steps: Excellent likeness capture has been achieved using the Kohya trainer with a dataset of 20 well-chosen images trained for 1000 steps and 1 epoch. This seems to be a sweet spot.
- Train at 512x512 resolution: While higher resolutions like 1024x1024 are often used, training at 512x512 produces great results in much less time and at lower cost. The model is able to handle different resolutions regardless.
- Put trigger/name first in prompt: When generating images, place your trigger word or name token at the very start of the prompt for best results.
- Use AI prompt generators: Tools like Claude can help optimize your prompts to get more aesthetic, creative results from your model.
- Animate with Runway Gen-3: For the ultimate output, consider animating the generated images using Runway's Gen-3 model to bring them to life.
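The tip about placing the trigger word first is easy to enforce with a small helper. The sketch below is only an illustration: build_prompt is a hypothetical function and "p3r5on" is a made-up trigger word.

```python
def build_prompt(trigger_word: str, description: str) -> str:
    """Return a prompt that starts with the trigger word."""
    trigger = trigger_word.strip()
    rest = description.strip()
    if not trigger:
        raise ValueError("trigger_word must not be empty")
    # Leave the prompt alone if it already leads with the trigger word.
    if rest.lower().startswith(trigger.lower()):
        return rest
    return f"{trigger} {rest}".strip()


if __name__ == "__main__":
    print(build_prompt("p3r5on", "as a wizard casting a spell in a misty forest"))
```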
With just a dozen photos and 30 minutes of setup using any of these three methods, you can harness the power of Flux to realistically insert yourself into any imagined scenario. By following the tips above, you can achieve impressive results rivaling the best AI image generation available today. The LoRA technique opens up a world of creative possibilities for generating high-fidelity, personalized AI images.