What is it?

#1
by qpqpqpqpqpqp - opened

Pruned Chroma.1?

Is chroma based on Z-image possible?

Oops, it seems it's another model. Also, do you remember the Lumina2 "Chroma"? lodestones stopped updating it.

Z-image?

No? It says:
RuntimeError: Error(s) in loading state_dict for NextDiT:
size mismatch for x_embedder.weight: copying a param with shape torch.Size([3840, 128]) from checkpoint, the shape in current model is torch.Size([3840, 64]).
I tried to load it in fp8.
Newbie 0.1?

it's a modified z-image with the flux 2 vae, some slight arch changes, and a custom loss.
this model is not ready yet; it literally just started training yesterday.
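
For context on the size-mismatch error above: in NextDiT-style models, x_embedder is typically a Linear layer whose input width is patch_size² × latent_channels, so swapping in a VAE with more latent channels widens that layer and breaks loading into a stock z-image definition. A minimal sketch of the arithmetic; the patch size and channel counts are assumptions on my part, not confirmed numbers:

```python
import torch.nn as nn

# Hypothetical NextDiT-style patch embedder: a Linear over flattened patches.
# nn.Linear stores its weight as [out_features, in_features].
dim = 3840   # hidden size, taken from the error message
patch = 2    # assumed patch size, common in DiT variants

# 16 latent channels (assumed for the stock z-image VAE): 2*2*16 = 64
stock = nn.Linear(patch * patch * 16, dim)
print(stock.weight.shape)    # torch.Size([3840, 64])

# 32 latent channels (assumed for the flux 2 vae): 2*2*32 = 128
widened = nn.Linear(patch * patch * 32, dim)
print(widened.weight.shape)  # torch.Size([3840, 128])
```

If those assumptions hold, the checkpoint simply expects twice as many latent channels as the loader's model definition, which matches the [3840, 128] vs [3840, 64] mismatch; loading in fp8 wouldn't change that.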

Am I dreaming? A new year, a new surprise?

I have a question: why not use the z-Image-De-Turbo model as the base model?

This is exciting, but @lodestones, will LoRAs made for Chroma1-HD still work on this, or will I have to retrain all of my LoRAs? I'm pretty sure my Chroma LoRAs don't work with z-image.

"I'm pretty sure my Chroma LoRAs don't work with z-image" Yes, of course they don't work.

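One way to check this concretely rather than guessing is to diff the module names a LoRA targets against the base model's state dict. A rough sketch using safetensors; the file names are placeholders, and the key-suffix convention varies between trainers (peft uses lora_A/lora_B, kohya-style trainers use lora_up/lora_down):

```python
from safetensors import safe_open

def keys_of(path: str) -> set[str]:
    # safe_open only reads the header, so no tensor data is loaded.
    with safe_open(path, framework="pt") as f:
        return set(f.keys())

lora_keys = keys_of("chroma_style_lora.safetensors")  # placeholder path
base_keys = keys_of("z_image_model.safetensors")      # placeholder path

# Strip the LoRA suffix to recover the module names the adapter patches.
targets = {k.rsplit(".lora_", 1)[0] for k in lora_keys if ".lora_" in k}
missing = sorted(t for t in targets if not any(b.startswith(t) for b in base_keys))

print(f"{len(missing)} of {len(targets)} LoRA targets are absent from the base model")
```

If most targets come back missing, the adapter was trained against a different architecture and retraining is the only real option.
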
@lodestones by the way, why did you decide not to wait for the base version of z-image? It seemed to me that the distilled version is less flexible for fine-tuning.

@Yndear this is not fine-tuning, it's closer to pretraining: instead of a random init, i'm using z-image as the "initial seed"

the arch is different: it uses a DeCo head and the flux 2 vae. the loss is different too: it's an fm-x0 loss instead of fm-velocity

this arch + loss function combo has the huge benefit of ridiculously fast convergence. but even with that, it's still costly to pretrain one from scratch.
it's better to have some residual knowledge as the initial seed than to start from a blank slate.
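
For anyone wondering what the fm-x0 vs fm-velocity distinction means in practice, here is a toy sketch of the two regression targets. The linear interpolant and plain MSE weighting are my assumptions; the actual loss used here (and any DeCo-specific terms) isn't public in this thread:

```python
import torch

def fm_targets(x0: torch.Tensor):
    """Toy flow-matching step on a linear path x_t = (1 - t) * x0 + t * noise."""
    noise = torch.randn_like(x0)
    t = torch.rand(x0.shape[0], *([1] * (x0.dim() - 1)))  # per-sample timestep
    x_t = (1 - t) * x0 + t * noise

    # fm-velocity: regress the path's constant velocity, d(x_t)/dt = noise - x0.
    velocity_target = noise - x0

    # fm-x0: regress the clean sample itself. At sampling time a velocity can
    # still be recovered from an x0 prediction via v = (x_t - x0_hat) / t.
    x0_target = x0

    return x_t, t, velocity_target, x0_target

# A model is trained against one of the two targets, e.g. x0-parameterized:
#   pred = model(x_t, t)
#   loss = ((pred - x0_target) ** 2).mean()
```

Unlike the velocity target, the x0 target stays on the data distribution at every timestep, which is one plausible (my own) reading of why this parameterization can converge quickly.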

Is the text encoder still using Qwen3-4B, or will it be replaced with a larger-parameter text encoder?

still qwen

is there any workflow for comfyui that works with this new model? also, the same for the wip radiance model; its comfyui workflow is very old and i'm wondering if there is a newer one that works better?

@QuickscopingFTW for radiance, the default workflow in the latest comfyui should work and it will auto-detect the model just fine.
for this model though, i haven't pushed any comfy support just yet because it's literally still in a very early pretraining phase.

I AM SO EXCITED FOR THIS MODEL, Thank you @lodestones. hope it will have support for comfyui. also, i was getting errors when using the model with the qwen text encoder and fluxUltra VAE.

"fluxUltra Vae"

That VAE is trash, why do you use it? Can't you see the artifacts it generates?
