Kohya_ss Finetune
This Python utility provides code to run the Diffusers fine-tuning version described in this note: https://note.com/kohya_ss/n/nbf7ce8d80f29
Required Dependencies
Python 3.10.6 and Git:
- Python 3.10.6: https://www.python.org/ftp/python/3.10.6/python-3.10.6-amd64.exe
- git: https://git-scm.com/download/win
Give unrestricted script access to PowerShell so venv can work:
- Open an administrator PowerShell window
- Type `Set-ExecutionPolicy Unrestricted` and answer A
- Close the administrator PowerShell window
Installation
Open a regular PowerShell terminal and run the following commands:
git clone https://github.com/bmaltais/kohya_diffusers_fine_tuning.git
cd kohya_diffusers_fine_tuning
python -m venv --system-site-packages venv
.\venv\Scripts\activate
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install --upgrade -r requirements.txt
pip install -U -I --no-deps https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/f/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl
cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
accelerate config
Answers to accelerate config:
- 0
- 0
- NO
- NO
- All
- fp16
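Once the steps above have finished, it can be useful to confirm that the virtual environment really contains the pinned packages. The following is a minimal sketch, not part of this repo: `installed_version` and `check_environment` are hypothetical helper names, and the expected versions are taken from the `pip install` line above with the `+cu116` suffix ignored.

```python
from importlib import metadata

# Versions pinned by the pip command in the installation steps.
# The "+cu116" local version tag is stripped before comparing.
EXPECTED = {"torch": "1.12.1", "torchvision": "0.13.1"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to (installed_version, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        base = found.split("+")[0] if found else None
        report[name] = (found, base == wanted)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: {found or 'not installed'} [{status}]")
```

Run it with the venv activated so that it inspects the packages installed there rather than the system-wide ones.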
Optional: CUDNN 8.6
This step is optional but can improve the training speed for NVIDIA 4090 owners.
Due to their file size, I can't host the DLLs needed for CUDNN 8.6 on GitHub. I strongly advise downloading them for a speed boost in sample generation (almost 50% on a 4090). You can download them from here: https://b1.thefileditch.ch/mwxKTEtelILoIbMbruuM.zip
To install, simply unzip the archive and place the cudnn_windows folder in the root of the kohya_diffusers_fine_tuning repo.
Run the following command to install:
python .\tools\cudann_1.8_install.py
Upgrade
When a new release comes out you can upgrade your repo with the following command:
.\upgrade.bat
or you can do it manually with
cd kohya_diffusers_fine_tuning
git pull
.\venv\Scripts\activate
pip install --upgrade -r requirements.txt
Once the commands have completed successfully you should be ready to use the new version.
Folders configuration
Simply put all the images you will want to train on in a single directory. It does not matter what size or aspect ratio they have. It is your choice.
Captions
Each image file needs to be accompanied by a caption file describing what the image is about. For example, if you want to train on cute dog pictures, you can put `cute dog` as the caption in every file. You can use the `tools\caption.ps1` sample code to help with that:
$folder = "sample"
$file_pattern = "*.*"
$caption_text = "cute dog"
# Collect the image files that may need a caption
$files = Get-ChildItem "$folder\$file_pattern" -Include *.png, *.jpg, *.webp -File
foreach ($file in $files) {
    # Create a caption file only if one does not already exist
    if (-not (Test-Path -Path "$folder\$($file.BaseName).txt" -PathType Leaf)) {
        New-Item -ItemType File -Path $folder -Name "$($file.BaseName).txt" -Value $caption_text
    }
}
You can also use the `Captioning` tool found under the `Utilities` tab in the GUI.
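For readers who prefer to stay in Python, the logic of `tools\caption.ps1` can be approximated as below. This is a sketch under assumptions, not code shipped with the repo: `write_missing_captions` and `IMAGE_SUFFIXES` are hypothetical names, and it only looks at files directly inside the given folder.

```python
from pathlib import Path

# Same image extensions that caption.ps1 looks for.
IMAGE_SUFFIXES = {".png", ".jpg", ".webp"}


def write_missing_captions(folder, caption_text):
    """Create <image name>.txt next to each image that has no caption yet.

    Returns the list of caption files that were created.
    """
    created = []
    # sorted() materializes the listing before any new files are written.
    for image in sorted(Path(folder).iterdir()):
        if not image.is_file() or image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption = image.with_suffix(".txt")
        if not caption.exists():  # never overwrite an existing caption
            caption.write_text(caption_text, encoding="utf-8")
            created.append(caption)
    return created
```

Calling `write_missing_captions("sample", "cute dog")` mirrors the values used in the PowerShell sample. As in that script, existing caption files are left untouched, so it is safe to rerun after adding new images.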
GUI
GUI-based training is supported through Gradio. You can start the GUI by running:
.\finetune.bat
CLI
You can find various examples of how to use fine_tune.py in this folder: https://github.com/bmaltais/kohya_ss/tree/master/examples
Support
Drop by the discord server for support: https://discord.com/channels/1041518562487058594/1041518563242020906
Change history
- 12/20 (v9.6) update:
- Fix issue with saving and opening config files
- 12/19 (v9.5) update:
- Fix file/folder dialog opening behind the browser window
- Update GUI layout to be more logical
- 12/18 (v9.4) update:
- Add WD14 tagging to utilities
- 12/18 (v9.3) update:
- Add logging option
- 12/18 (v9.2) update:
- Add BLIP Captioning utility
- 12/18 (v9.1) update:
- Add Stable Diffusion model conversion utility. Make sure to run `pip install -U -r requirements.txt` after updating to this release, as it introduces new pip requirements.
- 12/17 (v9) update:
- Save model as option added to fine_tune.py
- Save model as option added to GUI
- Retirement of CLI-based documentation. Attention will now focus on GUI-based training.
- 12/13 (v8):
- WD14Tagger now works on its own.
- Added support for fp16 training up to and including the gradients. See "Building the environment and preparing scripts for Diffusers" for more info.
- 12/10 (v7):
- We have added support for Diffusers 0.10.2.
- In addition, we have made other fixes.
- For more information, please see the section on "Building the environment and preparing scripts for Diffusers" in our documentation.
- 12/6 (v6): Addressed reports that some models raise an error when saving in SafeTensors format.
- 12/5 (v5):
- .safetensors format is now supported. Install SafeTensors with `pip install safetensors`. When loading, the format is detected automatically from the file extension. Specify the use_safetensors option when saving.
- Added the log_prefix option, which prepends an arbitrary string to the date-and-time log directory name.
- Cleaning scripts now work without either captions or tags.
- 11/29 (v4):
- Diffusers 0.9.0 is required. Update with `pip install -U diffusers[torch]==0.9.0` in the virtual environment, and update the dependent libraries with `pip install --upgrade -r requirements.txt` if other errors occur.
- Compatible with Stable Diffusion v2.0. Add the --v2 option when training (and pre-fetching latents). If you are using 768-v-ema.ckpt or stable-diffusion-2 instead of stable-diffusion-v2-base, add --v_parameterization as well when training. Learn more about other options.
- The minimum resolution and maximum resolution of the bucket can be specified when pre-fetching latents.
- Corrected the calculation formula for loss (fixed that it was increasing according to the batch size).
- Added options related to the learning rate scheduler.
- Diffusers models can now be downloaded directly from Hugging Face and trained. In addition, Diffusers models can be saved during training.
- clean_captions_and_tags.py now works even when only captions or only tags are available.
- Other minor fixes such as changing the arguments of the noise scheduler during training.
- 11/23 (v3):
- Added WD14Tagger tagging script.
- A log output function has been added to fine_tune.py. Also, fixed the double shuffling of data.
- Fixed the misspelled option name in each script (caption_extention → caption_extension; the old spelling will still work for the time being).