045750b46a
- Increase max LoRA rank (dim) size to 1024.
- Update finetune preprocessing scripts.
    - ``.bmp`` and ``.jpeg`` are supported. Thanks to breakcore2 and p1atdev!
    - The default weights of ``tag_images_by_wd14_tagger.py`` are now ``SmilingWolf/wd-v1-4-convnext-tagger-v2``. You can specify another model id from ``SmilingWolf`` with the ``--repo_id`` option. Thanks to SmilingWolf for the great work.
        - To change the weights, remove the ``wd14_tagger_model`` folder and run the script again.
    - ``--max_data_loader_n_workers`` option is added to each script. This option uses the DataLoader for data loading, which makes loading about 20% to 30% faster.
        - Please specify 2 or 4, depending on the number of CPU cores.
    - ``--recursive`` option is added to ``merge_dd_tags_to_metadata.py`` and ``merge_captions_to_metadata.py``. It only works with ``--full_path``.
    - ``make_captions_by_git.py`` is added. It uses [GIT microsoft/git-large-textcaps](https://huggingface.co/microsoft/git-large-textcaps) for captioning.
        - ``requirements.txt`` is updated. If you use this script, [please update the libraries](https://github.com/kohya-ss/sd-scripts#upgrade).
        - Usage is almost the same as ``make_captions.py``, but the batch size should be smaller.
        - ``--remove_words`` option removes as much text as possible (such as ``the word "XXXX" on it``).
    - ``--skip_existing`` option is added to ``prepare_buckets_latents.py``. With this option, images that already have npz files are skipped.
    - ``clean_captions_and_tags.py`` is updated to remove duplicated or conflicting tags, e.g. ``shirt`` is removed when ``white shirt`` exists. If ``black hair`` appears together with ``red hair``, both are removed.
- Tag frequency is added to the metadata in ``train_network.py``. Thanks to space-nuko!
    - __All tags and the number of occurrences of each tag are recorded.__ If you do not want this, disable metadata storing with the ``--no_metadata`` option.
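The tag cleanup described for ``clean_captions_and_tags.py`` can be illustrated with a minimal sketch. This is not the script's actual code: the function name ``clean_tags`` and the ``HAIR_COLORS`` set are hypothetical, and the suffix-matching rule is an assumption based only on the ``shirt`` / ``white shirt`` and ``black hair`` / ``red hair`` examples above.

```python
from typing import List

# Hypothetical conflict group for illustration; the real script's list is not shown here.
HAIR_COLORS = {"black hair", "red hair", "blonde hair"}


def clean_tags(tags: List[str]) -> List[str]:
    """Remove duplicated, conflicting, and redundant tags (illustrative only)."""
    # Drop exact duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(tags))

    # If more than one tag from the conflict group is present, remove all of them.
    present_colors = [t for t in unique if t in HAIR_COLORS]
    if len(present_colors) > 1:
        unique = [t for t in unique if t not in HAIR_COLORS]

    # Drop a generic tag when a more specific tag ending with it exists,
    # e.g. "shirt" is dropped when "white shirt" is present.
    result = []
    for tag in unique:
        redundant = any(
            other != tag and other.endswith(" " + tag) for other in unique
        )
        if not redundant:
            result.append(tag)
    return result


if __name__ == "__main__":
    print(clean_tags(["1girl", "shirt", "white shirt", "1girl"]))
    print(clean_tags(["black hair", "red hair", "smile"]))
```

Removing both conflicting tags, rather than picking one, follows the behavior stated in the note above; which tags count as conflicting is an assumption here.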
24 lines
333 B
Plaintext
accelerate==0.15.0
transformers==4.26.0
ftfy
albumentations
opencv-python
einops
diffusers[torch]==0.10.2
pytorch_lightning
bitsandbytes==0.35.0
tensorboard
safetensors==0.2.6
gradio==3.16.2
altair
easygui
tk
# for BLIP captioning
requests
timm
fairscale
# for WD14 captioning
tensorflow<2.11
huggingface-hub
# for kohya_ss library
.