Xuting Zhang
committed on
Revise README with new project details and instructions
Updated project title and added a detailed overview, features, installation instructions, and usage examples.
README.md
CHANGED
|
@@ -1,4 +1,91 @@
|
|
| 1 |
# TextEraser
|
| 2 |
-
Text-Guided Precise Object Removal with YOLOv8 + Stable Diffusion Inpainting
|
| 3 |
|
| 4 |
-
| 1 |
# TextEraser
|
|
|
|
| 2 |
|
| 3 |
+
**Text-Guided Object Removal using SAM2 + CLIP + Stable Diffusion XL**
|
| 4 |
+
|
| 5 |
+
Final Project for COMPSCI372: Intro to Applied Machine Learning @ Duke University (Fall 2025)
|
| 6 |
+
|
| 7 |
+
---
|
| 8 |
+
|
| 9 |
+
## Overview
|
| 10 |
+
|
| 11 |
+
TextEraser removes objects from images based on natural language descriptions. Type what you want to remove (e.g., "bottle", "person", "car"), and the pipeline segments the image, finds the matching object, and inpaints the region it occupied.
|
| 12 |
+
|
| 13 |
+
### Key Features
|
| 14 |
+
|
| 15 |
+
- **Natural language control** - Remove objects by describing them in plain text
|
| 16 |
+
- **Smart segmentation** - Uses SAM2 to generate candidate object masks across the image
|
| 17 |
+
- **Intelligent matching** - CLIP identifies which segments match your description
|
| 18 |
+
- **Seamless inpainting** - Stable Diffusion XL fills in the removed area naturally
|
| 19 |
+
- **Multi-part object handling** - Automatically merges related segments (e.g., cat + tail)
|
| 20 |
+
- **Interactive web interface** - Real-time Gradio UI with debug visualization
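
One way to picture the multi-part handling listed above is as a pixel-wise union of the masks selected for a single query. The following is a minimal sketch under stated assumptions: masks are equally sized 2D grids of booleans, and `merge_masks` is a hypothetical helper written for illustration, not a function from this repository. How TextEraser actually decides which segments are related is not described here.

```python
from typing import List

# Hypothetical mask type: rows of booleans, True marks an object pixel.
Mask = List[List[bool]]


def merge_masks(masks: List[Mask]) -> Mask:
    """Return the pixel-wise union of equally sized masks."""
    if not masks or not masks[0]:
        raise ValueError("need at least one non-empty mask")
    height, width = len(masks[0]), len(masks[0][0])
    for m in masks:
        if len(m) != height or any(len(row) != width for row in m):
            raise ValueError("all masks must share the same shape")
    # A pixel belongs to the merged mask if any input mask covers it.
    return [
        [any(m[y][x] for m in masks) for x in range(width)]
        for y in range(height)
    ]


# Toy example: a "cat" body and its "tail" detected as separate segments.
cat_body = [[True, True, False], [False, False, False]]
cat_tail = [[False, False, True], [False, False, False]]
merged = merge_masks([cat_body, cat_tail])
```

The merged mask can then be passed to the inpainting stage as one region, so the object and its detached parts are removed together.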
|
| 21 |
+
|
| 22 |
+
---
|
| 23 |
+
|
| 24 |
+
## How It Works
|
| 25 |
+
|
| 26 |
+
The pipeline has three stages:
|
| 27 |
+
|
| 28 |
+
1. **Segmentation (SAM2)** - Generates candidate object masks across the image
|
| 29 |
+
2. **Matching (CLIP)** - Scores each segment against your text query
|
| 30 |
+
3. **Inpainting (SDXL)** - Fills the masked region with contextually appropriate content
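
The matching stage above can be sketched as a similarity ranking. CLIP compares image and text embeddings by cosine similarity; the sketch below applies that idea to toy vectors. The function names, the segment labels, and the `threshold` value are all assumptions made for illustration, and the real pipeline may score and select segments differently.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_segments(
    segment_embeddings: Dict[str, Sequence[float]],
    text_embedding: Sequence[float],
    threshold: float = 0.8,  # hypothetical cutoff, not taken from the project
) -> List[str]:
    """Return segment names scoring at or above the threshold, best match first."""
    scores = {
        name: cosine_similarity(vec, text_embedding)
        for name, vec in segment_embeddings.items()
    }
    kept = [name for name, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda name: -scores[name])


# Toy 2D embeddings standing in for CLIP outputs.
text = [1.0, 0.0]
segments = {"bottle": [0.9, 0.1], "table": [0.1, 0.9], "cap": [0.8, 0.3]}
matches = select_segments(segments, text)
```

In this toy data, `bottle` and `cap` both point in roughly the same direction as the query and are kept, while `table` is discarded.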
|
| 31 |
+
|
| 32 |
+
---
|
| 33 |
+
|
| 34 |
+
## Installation
|
| 35 |
+
|
| 36 |
+
### Requirements
|
| 37 |
+
|
| 38 |
+
- Python 3.8+
|
| 39 |
+
- CUDA GPU with 12GB+ VRAM (recommended)
|
| 40 |
+
- ~10GB disk space for models
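
The requirements above can be checked before installing with a small preflight script. This is a sketch using only the standard library; the helper names are hypothetical, and it does not check GPU memory, which would need a CUDA-aware library such as PyTorch.

```python
import shutil
import sys
from typing import List

MIN_PYTHON = (3, 8)   # from the requirements list
MIN_FREE_GB = 10      # approximate model download size


def check_python(version_info=sys.version_info) -> bool:
    """True if the interpreter is at least Python 3.8."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def free_disk_gb(path: str = ".") -> float:
    """Free space on the filesystem containing `path`, in gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def preflight(path: str = ".") -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if not check_python():
        problems.append("Python 3.8 or newer is required")
    if free_disk_gb(path) < MIN_FREE_GB:
        problems.append("less than ~10GB of free disk space for models")
    return problems
```

Calling `preflight()` from the repository root and printing the returned list gives a quick summary of anything that needs attention.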
|
| 41 |
+
|
| 42 |
+
### Setup
|
| 43 |
+
|
| 44 |
+
```bash
|
| 45 |
+
# Clone repository
|
| 46 |
+
git clone https://github.com/lxzcpro/TextEraser.git
|
| 47 |
+
cd TextEraser
|
| 48 |
+
|
| 49 |
+
# Install dependencies
|
| 50 |
+
pip install -r requirements.txt
|
| 51 |
+
|
| 52 |
+
# Run the app
|
| 53 |
+
python app.py
|
| 54 |
+
```
|
| 55 |
+
|
| 56 |
+
On first run, the models are downloaded automatically from Hugging Face (~10GB total).
|
| 57 |
+
|
| 58 |
+
---
|
| 59 |
+
|
| 60 |
+
## Usage
|
| 61 |
+
|
| 62 |
+
### Web Interface
|
| 63 |
+
|
| 64 |
+
1. Launch the app: `python app.py`
|
| 65 |
+
2. Upload an image
|
| 66 |
+
3. Enter what to remove (e.g., "bottle", "car", "person")
|
| 67 |
+
4. Optionally enter a prompt describing what should fill the removed area (default: "background")
|
| 68 |
+
5. Click "Run Pipeline"
|
| 69 |
+
6. Check the debug tab to see what was detected
|
| 70 |
+
|
| 71 |
+
### Python API
|
| 72 |
+
|
| 73 |
+
```python
|
| 74 |
+
from src.pipeline import ObjectRemovalPipeline
|
| 75 |
+
from PIL import Image
|
| 76 |
+
import numpy as np
|
| 77 |
+
|
| 78 |
+
# Initialize pipeline
|
| 79 |
+
pipeline = ObjectRemovalPipeline()
|
| 80 |
+
|
| 81 |
+
# Load and process image
|
| 82 |
+
image = np.array(Image.open("photo.jpg"))
|
| 83 |
+
result, mask, message = pipeline.process(
|
| 84 |
+
image=image,
|
| 85 |
+
text_query="bottle",
|
| 86 |
+
inpaint_prompt="table surface"
|
| 87 |
+
)
|
| 88 |
+
|
| 89 |
+
# Save result
|
| 90 |
+
Image.fromarray(result).save("result.jpg")
|
| 91 |
+
```
|
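
The Python API call above could also be wrapped in a small command-line script. The sketch below is hypothetical and not part of the repository: the flag names are invented for illustration, it assumes `pipeline.process` accepts an RGB NumPy array and returns a `(result, mask, message)` tuple as in the example above, and it assumes `message` is printable text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the (hypothetical) command-line interface."""
    parser = argparse.ArgumentParser(description="Remove an object described by text.")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument("query", help='object to remove, e.g. "bottle"')
    parser.add_argument(
        "--inpaint-prompt",
        default="background",
        help="description of what should fill the removed area",
    )
    parser.add_argument("--output", default="result.jpg", help="where to save the result")
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)

    # Heavy imports are deferred so that --help works without loading models.
    import numpy as np
    from PIL import Image
    from src.pipeline import ObjectRemovalPipeline

    pipeline = ObjectRemovalPipeline()
    # Assumption: the pipeline expects a 3-channel RGB array.
    image = np.array(Image.open(args.image).convert("RGB"))
    result, mask, message = pipeline.process(
        image=image,
        text_query=args.query,
        inpaint_prompt=args.inpaint_prompt,
    )
    print(message)
    Image.fromarray(result).save(args.output)
```

With this saved as a script in the repository root, calling `main(["photo.jpg", "bottle", "--inpaint-prompt", "table surface"])` would mirror the API example above.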