Metadata-Version: 2.1
Name: DiffEdit
Version: 0.0.2rc6
Summary: An implementation of the DiffEdit algorithm for prompt-based mask creation and inpainting. For more information, see the README file.
License: Apache-2.0
Author: Gennaro Farina
Requires-Python: >=3.10,<4.0
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Provides-Extra: dev
Provides-Extra: test
Requires-Dist: accelerate (==0.26.1)
Requires-Dist: black[jupyter] (==23.3.0) ; extra == "dev"
Requires-Dist: diffusers (>=0.25)
Requires-Dist: docformatter (==1.5.1) ; extra == "dev"
Requires-Dist: docstr-coverage (==2.2.0) ; extra == "dev"
Requires-Dist: fastai (==2.7.13)
Requires-Dist: numpy (==1.26.2)
Requires-Dist: opencv_python (>=4.9,<5.0)
Requires-Dist: pre-commit (==3.3.1) ; extra == "dev"
Requires-Dist: pytest (==7.3.1) ; extra == "test"
Requires-Dist: pytest-cov (==4.0.0) ; extra == "test"
Requires-Dist: pytest-sugar (==0.9.7) ; extra == "test"
Requires-Dist: pytest-xdist (==3.2.1) ; extra == "test"
Requires-Dist: requests (==2.27.1)
Requires-Dist: ruff (==0.0.264) ; extra == "dev"
Requires-Dist: setuptools (>=69.1.1,<70.0.0)
Requires-Dist: torch (==2.0.1)
Requires-Dist: tqdm
Requires-Dist: transformers (==4.36.2)
Requires-Dist: twine (>=4.0.0,<5) ; extra == "dev"
Description-Content-Type: text/markdown

# DiffEdit
___
[![pypi wheel](https://github.com/Gennaro-Farina/DiffEdit/actions/workflows/publish-wheel-pypi.yml/badge.svg)](https://github.com/Gennaro-Farina/DiffEdit/actions/workflows/publish-wheel-pypi.yml)
[![Python package](https://github.com/Gennaro-Farina/DiffEdit/actions/workflows/python-package.yml/badge.svg?branch=main)](https://github.com/Gennaro-Farina/DiffEdit/actions/workflows/python-package.yml)

An unofficial implementation of <a href="https://arxiv.org/abs/2210.11427">DiffEdit</a> based on <a href="https://huggingface.co">🤗 Hugging Face</a>, <a href="https://github.com/johnrobinsn/diffusion_experiments/blob/main/DiffEdit.ipynb">this repo</a> and PyTorch.
The method leverages the diffusion process to automatically extract a mask from an image, given a pair of text prompts (what to remove and what to add). The mask is then used to inpaint the image with the new content.
To get a clearer overview of the process, you can take a look at the <a href="https://github.com/Gennaro-Farina/diffusion-nbs/blob/master/DiffEdit.ipynb"> DiffEdit.ipynb</a> notebook.
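
The mask-extraction idea can be illustrated with a small NumPy sketch. This is a simplified illustration, not this package's code: `estimate_mask`, `toy_predict_noise`, and the `noise_strength` and `threshold` parameters are hypothetical names, and the toy predictor stands in for a real text-conditioned UNet. The sketch noises the latents several times, compares the noise predicted under each prompt, averages the absolute difference, rescales it to [0, 1], and binarizes it.

```python
import numpy as np


def estimate_mask(latents, predict_noise, remove_prompt, add_prompt,
                  num_samples=10, noise_strength=0.5, threshold=0.5, seed=0):
    """Estimate a binary edit mask from prompt-conditioned noise predictions.

    latents: array of shape (C, H, W).
    predict_noise: callable (noisy_latents, prompt) -> array of shape (C, H, W).
    All parameter names and defaults here are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    diff_sum = np.zeros(latents.shape[1:], dtype=np.float64)
    for _ in range(num_samples):
        # Partially noise the latents (simple variance-preserving mix).
        noise = rng.standard_normal(latents.shape)
        noisy = np.sqrt(1.0 - noise_strength) * latents + np.sqrt(noise_strength) * noise
        # Predict the noise under each prompt and compare the two predictions.
        eps_remove = predict_noise(noisy, remove_prompt)
        eps_add = predict_noise(noisy, add_prompt)
        diff_sum += np.abs(eps_remove - eps_add).mean(axis=0)  # average over channels
    diff = diff_sum / num_samples
    # Rescale to [0, 1], then binarize.
    span = diff.max() - diff.min()
    diff = (diff - diff.min()) / span if span > 0 else np.zeros_like(diff)
    return (diff > threshold).astype(np.uint8)


def toy_predict_noise(noisy, prompt):
    """Hypothetical stand-in for a UNet: the two prompts disagree only in one patch."""
    out = noisy.copy()
    if prompt == "dog":
        out[:, 2:6, 2:6] += 1.0
    return out


latents = np.zeros((4, 8, 8))
mask = estimate_mask(latents, toy_predict_noise, "lion", "dog", num_samples=10)
print(mask)
```

In a real pipeline, the resulting mask would be upscaled to the image resolution and passed to an inpainting model together with the "add" prompt.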

## Results

<table>
<head>
<th>Prompt (<i>remove</i> ⟶ <i>add</i>)</th><th>Original image</th> <th>Mask</th> <th>Edited</th>
</head>
<body>
<tr>
<td>"lion" ⟶ "dog"</td>
<td><img src="static/ai_gen_lion.jpeg" width="256" height="256"></td>
<td><img src="static/ai_gen_lion_mask.png" width="256" height="256"></td>
<td><img src="static/ai_gen_lion_result.png" width="256" height="256"></td>
</tr>
<tr>
<td>"house" ⟶ "3-floor hotel"</td>
<td><img src="static/ai_gen_house.jpeg" width="256" height="256"></td>
<td><img src="static/ai_gen_house_mask.png" width="256" height="256"></td>
<td><img src="static/ai_gen_house_result.png" width="256" height="256"></td>
</tr>
<tr>
<td>"an F1 race" ⟶ "a motogp race"</td>
<td><img src="static/ai_gen_f1.jpeg" width="256" height="256"></td>
<td><img src="static/ai_gen_f1_mask.png" width="256" height="256"></td>
<td><img src="static/ai_gen_f1_result.png" width="256" height="256"></td>
</tr>
</body>
</table>

All of the masks above were generated with `--num-samples 10`.

## Installation

```bash
pip install -e .
```

## Usage

For a quick evaluation, use the <a href="https://github.com/Gennaro-Farina/DiffEdit/blob/main/src/diff_edit/examples/image_edit.py">image_edit.py</a> script:

```bash
python image_edit.py --image <path_to_image> --save-path <path_to_output_image> --remove-prompt <remove_prompt> --add-prompt <add_prompt>
```

For example, the following command produces <a href="https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion_result.png">this image</a>:

```bash
python image_edit.py --remove-prompt "lion" --add-prompt "dog" --image-link "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg" --num-samples 10
```
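
Note that the link in the example above points to a GitHub "blob" page rather than to the image file itself. This README does not say whether `image_edit.py` resolves such links on its own, so the following is only a sketch of how a blob link could be turned into a direct download link before it is passed to `--image-link`. The helper name `github_blob_to_raw` is hypothetical and is not part of this package.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Convert a github.com 'blob' page URL into a raw.githubusercontent.com URL.

    URLs that do not match the expected layout are returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/blob/<ref>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url
    owner, repo, _, *rest = segments
    raw_path = "/" + "/".join([owner, repo, *rest])
    return urlunsplit((parts.scheme, "raw.githubusercontent.com", raw_path, "", ""))


raw_url = github_blob_to_raw(
    "https://github.com/Gennaro-Farina/DiffEdit/blob/main/static/ai_gen_lion.jpeg"
)
print(raw_url)
```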

You can further customize the process by tuning the parameters of the script. Here is the full list of parameters that you can tune:

```bash
python image_edit.py --help
usage: image_edit.py [-h] [--remove-prompt REMOVE_PROMPT] [--add-prompt ADD_PROMPT] [--image IMAGE] [--image-link IMAGE_LINK] [--device {cpu,cuda,mps}]
                     [--vae-model VAE_MODEL] [--tokenizer TOKENIZER] [--text-encoder TEXT_ENCODER] [--unet UNET] [--scheduler SCHEDULER]
                     [--scheduler-start SCHEDULER_START] [--scheduler-end SCHEDULER_END] [--num-train-timesteps NUM_TRAIN_TIMESTEPS] [--beta-schedule BETA_SCHEDULE]
                     [--inpainting INPAINTING] [--seed SEED] [--num-samples N] [--save-path SAVE_PATH]

Diffusion Image Editing arguments

options:
  -h, --help            show this help message and exit
  --remove-prompt REMOVE_PROMPT
                        What you want to remove from the image
  --add-prompt ADD_PROMPT
                        What you want to add to the image
  --image IMAGE         Path to the image to edit
  --image-link IMAGE_LINK
                        Link to the image to edit
  --device {cpu,cuda,mps}
  --vae-model VAE_MODEL
                        Model name. E.g. stabilityai/sd-vae-ft-ema
  --tokenizer TOKENIZER
                        Tokenizer to tokenize the text. E.g. openai/clip-vit-large-patch14
  --text-encoder TEXT_ENCODER
                        Text encoder to encode the text. E.g. openai/clip-vit-large-patch14
  --unet UNET           UNet model for generating the latents. E.g. CompVis/stable-diffusion-v1-4
  --scheduler SCHEDULER
                        Noise scheduler. E.g. LMSDiscreteScheduler
  --scheduler-start SCHEDULER_START
                        Scheduler start value
  --scheduler-end SCHEDULER_END
                        Scheduler end value
  --num-train-timesteps NUM_TRAIN_TIMESTEPS
                        Number of training timesteps
  --beta-schedule BETA_SCHEDULE
                        Beta schedule
  --inpainting INPAINTING
                        Inpainting model. E.g. runwayml/stable-diffusion-inpainting
  --seed SEED           Random seed
  --num-samples N       Number of diffusion steps to generate the mask
  --save-path SAVE_PATH
                        Path to save the result. Default is <script_folder>/result.png
```
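
For readers who want to wrap or extend the script, the sketch below shows how a subset of the options listed above might be declared with `argparse`. It is reconstructed from the help text only and is not the script's actual source: the `build_parser` name, the `int` types for `--seed` and `--num-samples`, and the absence of defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a subset of the documented options (types are assumptions)."""
    parser = argparse.ArgumentParser(
        prog="image_edit.py",
        description="Diffusion Image Editing arguments",
    )
    parser.add_argument("--remove-prompt", help="What you want to remove from the image")
    parser.add_argument("--add-prompt", help="What you want to add to the image")
    parser.add_argument("--image", help="Path to the image to edit")
    parser.add_argument("--image-link", help="Link to the image to edit")
    parser.add_argument("--device", choices=["cpu", "cuda", "mps"])
    parser.add_argument("--seed", type=int, help="Random seed")
    parser.add_argument("--num-samples", type=int, metavar="N",
                        help="Number of diffusion steps to generate the mask")
    parser.add_argument("--save-path", help="Path to save the result")
    return parser


# Parse the same options used in the lion/dog example above.
args = build_parser().parse_args(
    ["--remove-prompt", "lion", "--add-prompt", "dog", "--num-samples", "10"]
)
print(args.remove_prompt, args.add_prompt, args.num_samples)
```

`argparse` maps dashes in option names to underscores, so `--remove-prompt` is read back as `args.remove_prompt`.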


