Stable-Lime-v1.0

Stable-Lime-v1.0 is an unconditional diffusion model based on the Denoising Diffusion Probabilistic Models (DDPM) architecture. It has been trained specifically to generate images representing the "essence of Lime."

Model Details

  • Model Type: Unconditional Image Generation (Diffusion)
  • Architecture: UNet2DModel with DDPMScheduler
  • Framework: PyTorch & Hugging Face Diffusers
  • Resolution: $64 \times 64$ pixels
  • Channels: 3 (RGB)
  • License: MIT (assumed from open-source usage; not explicitly stated)

Intended Use

This model is designed for:

  • Generating $64 \times 64$ images of limes (or lime-like textures).
  • Educational purposes regarding the implementation of DDPM loops.
  • Low-resolution, "retro" aesthetic generation.

Out of Scope:

  • Text-to-Image generation (this model does not accept text prompts).
  • High-resolution photorealism (limited by the 64px architecture).

Training Data

The model was trained on a proprietary dataset located at dataset_lime/processed.

  • Preprocessing: Images were resized to $64 \times 64$ and normalized to the range $[-1, 1]$.
  • Augmentation: Random horizontal flips were applied during training to improve generalization.

Training Procedure

Hyperparameters

The model was trained using the following configuration ("The Lime Settings"):

| Parameter | Value | Description |
| --- | --- | --- |
| Batch Size | 16 | Small batch size suitable for consumer GPUs. |
| Learning Rate | $1 \times 10^{-4}$ | Optimizer step size (AdamW). |
| Epochs | 5 | Note: this is a very short training duration. |
| Timesteps | 1000 | Number of diffusion noise steps. |
| Image Size | 64 | Output resolution. |

Architecture Specification

The U-Net architecture uses a deep structure with an attention block in the deepest downsampling stage, near the bottleneck:

  • Block Output Channels: (128, 128, 256, 256, 512, 512)
  • Downsampling: 4x DownBlock2D, 1x AttnDownBlock2D, 1x DownBlock2D
  • Upsampling: Mirror of downsampling blocks.

Loss Function

The model optimizes the Mean Squared Error (MSE) between the actual noise added and the predicted noise:

$$L = \text{MSE}\big(\epsilon,\; \epsilon_\theta(x_t, t)\big)$$

Where $\epsilon$ is the Gaussian noise and $\epsilon_\theta$ is the model's prediction at timestep $t$.


Limitations & Biases

  • Undertraining Risk: With only 5 epochs of training, the model may not have fully converged. Generated images might appear blurry or retain significant noise (static) rather than clear lime features.
  • Resolution: The output is strictly $64 \times 64$, resulting in pixelated, low-fidelity images.
  • Dataset Bias: The model's output is entirely dependent on the variety found in dataset_lime. If the dataset contained only green limes, it will not generate yellow limes (lemons).
