KohyaSS/README_finetune.md
bmaltais 706dfe157f
Merge dreambooth and finetuning in one repo to align with kohya_ss new repo (#10)
* Merge both dreambooth and finetune back in one repo
2022-12-20 09:15:17 -05:00


# Kohya_ss Finetune
This Python utility provides code to run the Diffusers fine-tuning version described in this note: https://note.com/kohya_ss/n/nbf7ce8d80f29
## Required Dependencies
Python 3.10.6 and Git:
- Python 3.10.6: https://www.python.org/ftp/python/3.10.6/python-3.10.6-amd64.exe
- git: https://git-scm.com/download/win
Give unrestricted script access to powershell so venv can work:
- Open an administrator powershell window
- Type `Set-ExecutionPolicy Unrestricted` and answer A
- Close admin powershell window
## Installation
Open a regular Powershell terminal and type the following inside:
```powershell
git clone https://github.com/bmaltais/kohya_diffusers_fine_tuning.git
cd kohya_diffusers_fine_tuning
python -m venv --system-site-packages venv
.\venv\Scripts\activate
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install --upgrade -r requirements.txt
pip install -U -I --no-deps https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/f/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl
cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
accelerate config
```
Answers to accelerate config:
```txt
- 0
- 0
- NO
- NO
- All
- fp16
```
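After the installation finishes, it can be useful to confirm that the packages are visible from inside the virtual environment. The following is a minimal sketch, not part of this repo: the `check_environment` helper and the `REQUIRED_PACKAGES` list are hypothetical names, and the package list is only inferred from the install commands above.

```python
import importlib.util
import sys

# Assumption: package names inferred from the installation commands above.
REQUIRED_PACKAGES = ["torch", "torchvision", "xformers", "bitsandbytes", "accelerate"]


def check_environment(packages=REQUIRED_PACKAGES):
    """Return a dict mapping each package name to True if it can be located."""
    # find_spec looks the package up without importing it
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    version = sys.version_info
    print(f"Python {version.major}.{version.minor}.{version.micro}")
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it with the venv activated. A package reported as `MISSING` suggests the matching `pip install` step did not complete.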
### Optional: CUDNN 8.6
This step is optional but can improve the learning speed for NVIDIA 4090 owners.
Due to the file size, I can't host the DLLs needed for CUDNN 8.6 on GitHub. I strongly advise downloading them for a speed boost in sample generation (almost 50% on a 4090). You can download them from here: https://b1.thefileditch.ch/mwxKTEtelILoIbMbruuM.zip
To install, unzip the archive and place the `cudnn_windows` folder in the root of the kohya_diffusers_fine_tuning repo.
Run the following command to install:
```
python .\tools\cudann_1.8_install.py
```
## Upgrade
When a new release comes out you can upgrade your repo with the following command:
```
.\upgrade.bat
```
or you can do it manually with
```powershell
cd kohya_ss
git pull
.\venv\Scripts\activate
pip install --upgrade -r requirements.txt
```
Once the commands have completed successfully you should be ready to use the new version.
## Folders configuration
Simply put all the images you will want to train on in a single directory. It does not matter what size or aspect ratio they have. It is your choice.
## Captions
Each image file needs to be accompanied by a caption file describing what the image is about. For example, if you want to train on cute dog pictures, you can put `cute dog` as the caption in every file. You can use the `tools\caption.ps1` sample code to help with that:
```powershell
$folder = "sample"
$file_pattern = "*.*"
$caption_text = "cute dog"

# Collect the image files to caption
$files = Get-ChildItem "$folder\$file_pattern" -Include *.png, *.jpg, *.webp -File
foreach ($file in $files) {
    # Create a caption file only if one does not already exist
    $caption_path = Join-Path $folder "$($file.BaseName).txt"
    if (-not (Test-Path -Path $caption_path -PathType Leaf)) {
        New-Item -ItemType File -Path $caption_path -Value $caption_text
    }
}
```
You can also use the `Captioning` tool found under the `Utilities` tab in the GUI.
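For readers who prefer to stay in Python, the same caption step can be sketched as below. This is an illustrative equivalent rather than a script shipped with the repo: `write_missing_captions` and `IMAGE_EXTENSIONS` are hypothetical names, and the extension list simply mirrors the PowerShell sample.

```python
from pathlib import Path

# Same image extensions as the tools\caption.ps1 sample
IMAGE_EXTENSIONS = {".png", ".jpg", ".webp"}


def write_missing_captions(folder, caption_text):
    """Create a .txt caption next to each image that lacks one; return the new files."""
    created = []
    # sorted() snapshots the listing before any caption files are written
    for image in sorted(Path(folder).iterdir()):
        if not image.is_file() or image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption = image.with_suffix(".txt")
        if not caption.exists():  # never overwrite an existing caption
            caption.write_text(caption_text, encoding="utf-8")
            created.append(caption)
    return created
```

Calling `write_missing_captions("sample", "cute dog")` would add a `cute dog` caption for every image in `sample` that does not already have one, leaving existing caption files untouched.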
## GUI
GUI-based training is supported through Gradio. You can start the GUI by running:
```powershell
.\finetune.bat
```
## CLI
You can find various examples of how to use `fine_tune.py` in this folder: https://github.com/bmaltais/kohya_ss/tree/master/examples
## Support
Drop by the discord server for support: https://discord.com/channels/1041518562487058594/1041518563242020906
## Change history
* 12/20 (v9.6) update:
- Fix issue with saving and opening config files
* 12/19 (v9.5) update:
- Fix file/folder dialog opening behind the browser window
- Update GUI layout to be more logical
* 12/18 (v9.4) update:
- Add WD14 tagging to utilities
* 12/18 (v9.3) update:
- Add logging option
* 12/18 (v9.2) update:
- Add BLIP Captioning utility
* 12/18 (v9.1) update:
- Add Stable Diffusion model conversion utility. Make sure to run `pip install -U -r requirements.txt` after updating to this release, as it introduces new pip requirements.
* 12/17 (v9) update:
- Save model as option added to fine_tune.py
- Save model as option added to GUI
- Retire the CLI-based documentation. Attention will focus on GUI-based training
* 12/13 (v8):
- WD14Tagger now works on its own.
- Added support for fp16 training that extends to the gradients. See "Building the environment and preparing scripts for Diffusers" for more info.
* 12/10 (v7):
- We have added support for Diffusers 0.10.2.
- In addition, we have made other fixes.
- For more information, please see the section on "Building the environment and preparing scripts for Diffusers" in our documentation.
* 12/6 (v6): Fixed an error, reported for some models, that occurred when saving in SafeTensors format.
* 12/5 (v5):
- The .safetensors format is now supported. Install SafeTensors with `pip install safetensors`. When loading, the format is determined automatically from the file extension. Specify the `use_safetensors` option when saving.
- Added the `log_prefix` option to prepend any string to the date-and-time log directory name.
- Cleaning scripts now work without either captions or tags.
* 11/29 (v4):
- Diffusers 0.9.0 is required. Update with `pip install -U diffusers[torch]==0.9.0` in the virtual environment, and update the dependent libraries with `pip install --upgrade -r requirements.txt` if other errors occur.
- Compatible with Stable Diffusion v2.0. Add the `--v2` option when training (and when pre-fetching latents). If you are using `768-v-ema.ckpt` or `stable-diffusion-2` instead of `stable-diffusion-v2-base`, also add `--v_parameterization` when training. Learn more about other options.
- The minimum resolution and maximum resolution of the bucket can be specified when pre-fetching latents.
- Corrected the calculation formula for loss (fixed that it was increasing according to the batch size).
- Added options related to the learning rate scheduler.
- Diffusers models can now be downloaded directly from Hugging Face and trained. In addition, Diffusers models can be saved during training.
- `clean_captions_and_tags.py` now works even when only captions or only tags are present.
- Other minor fixes such as changing the arguments of the noise scheduler during training.
* 11/23 (v3):
- Added WD14Tagger tagging script.
- Added a log output function to `fine_tune.py`. Also fixed the double shuffling of data.
- Fixed the misspelled option name in each script (`caption_extention` → `caption_extension`; the old spelling still works for the time being).