What is it?

#1
by qpqpqpqpqpqp - opened

Pruned Chroma.1?

Is chroma based on Z-image possible?

Oops, it seems it's another model. Also, do you remember the Lumina2 "Chroma"? lodestones stopped updating it.

Z-image?

No? It says:
RuntimeError: Error(s) in loading state_dict for NextDiT:
size mismatch for x_embedder.weight: copying a param with shape torch.Size([3840, 128]) from checkpoint, the shape in current model is torch.Size([3840, 64]).
I tried to load it in fp8.
Newbie 0.1?

it's a modified z-image with the flux 2 vae, some slight arch changes, and a custom loss.
this model is not ready yet; it literally just started training yesterday.
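
For context on the size-mismatch error above: in NextDiT-style models, x_embedder is typically a Linear layer whose input width is patch_size² × latent_channels, so swapping in a VAE with more latent channels widens that layer and breaks loading into a stock z-image definition. A minimal sketch of the arithmetic; the patch size and channel counts are assumptions on my part, not confirmed numbers:

```python
import torch.nn as nn

# Hypothetical NextDiT-style patch embedder: a Linear over flattened patches.
# nn.Linear stores its weight as [out_features, in_features].
dim = 3840   # hidden size, taken from the error message
patch = 2    # assumed patch size, common in DiT variants

# 16 latent channels (assumed for the stock z-image VAE): 2*2*16 = 64
stock = nn.Linear(patch * patch * 16, dim)
print(stock.weight.shape)    # torch.Size([3840, 64])

# 32 latent channels (assumed for the flux 2 vae): 2*2*32 = 128
widened = nn.Linear(patch * patch * 32, dim)
print(widened.weight.shape)  # torch.Size([3840, 128])
```

If those assumptions hold, the checkpoint simply expects twice as many latent channels as the loader's model definition, which matches the [3840, 128] vs [3840, 64] mismatch; loading in fp8 wouldn't change that.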

Am I dreaming? A new year, a new surprise?

I have a question: why not use the z-Image-De-Turbo model as the base model?

This is exciting, but @lodestones, will LoRAs made for Chroma1-HD still work on this, or will I have to retrain all of my LoRAs? I'm pretty sure my Chroma LoRAs don't work with z-image.

"I'm pretty sure my Chroma LoRAs don't work with z-image" Yes, of course they don't work.

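One way to check this concretely rather than guessing is to diff the module names a LoRA targets against the base model's state dict. A rough sketch using safetensors; the file names are placeholders, and the key-suffix convention varies between trainers (peft uses lora_A/lora_B, kohya-style trainers use lora_up/lora_down):

```python
from safetensors import safe_open

def keys_of(path: str) -> set[str]:
    # safe_open only reads the header, so no tensor data is loaded.
    with safe_open(path, framework="pt") as f:
        return set(f.keys())

lora_keys = keys_of("chroma_style_lora.safetensors")  # placeholder path
base_keys = keys_of("z_image_model.safetensors")      # placeholder path

# Strip the LoRA suffix to recover the module names the adapter patches.
targets = {k.rsplit(".lora_", 1)[0] for k in lora_keys if ".lora_" in k}
missing = sorted(t for t in targets if not any(b.startswith(t) for b in base_keys))

print(f"{len(missing)} of {len(targets)} LoRA targets are absent from the base model")
```

If most targets come back missing, the adapter was trained against a different architecture and retraining is the only real option.
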
@lodestones by the way, why did you decide not to wait for the base version of z-image? It seemed to me that the distilled version is less flexible for fine-tuning.

@Yndear this is not fine-tuning, it's closer to pretraining: instead of a random init, i'm using z-image as the "initial seed"

the arch is different: it uses a DeCo head and the flux 2 vae. the loss is different too: it's an fm-x0 loss instead of fm-velocity

this arch + loss function combo has the huge benefit of ridiculously fast convergence. but even with that, it's still costly to pretrain one from scratch.
it's better to have some residual knowledge as the initial seed than to start from a blank slate.
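
For anyone wondering what the fm-x0 vs fm-velocity distinction means in practice, here is a toy sketch of the two regression targets. The linear interpolant and plain MSE weighting are my assumptions; the actual loss used here (and any DeCo-specific terms) isn't public in this thread:

```python
import torch

def fm_targets(x0: torch.Tensor):
    """Toy flow-matching step on a linear path x_t = (1 - t) * x0 + t * noise."""
    noise = torch.randn_like(x0)
    t = torch.rand(x0.shape[0], *([1] * (x0.dim() - 1)))  # per-sample timestep
    x_t = (1 - t) * x0 + t * noise

    # fm-velocity: regress the path's constant velocity, d(x_t)/dt = noise - x0.
    velocity_target = noise - x0

    # fm-x0: regress the clean sample itself. At sampling time a velocity can
    # still be recovered from an x0 prediction via v = (x_t - x0_hat) / t.
    x0_target = x0

    return x_t, t, velocity_target, x0_target

# A model is trained against one of the two targets, e.g. x0-parameterized:
#   pred = model(x_t, t)
#   loss = ((pred - x0_target) ** 2).mean()
```

Unlike the velocity target, the x0 target stays on the data distribution at every timestep, which is one plausible (my own) reading of why this parameterization can converge quickly.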

Is the text encoder still using Qwen3-4B, or will it be replaced with a larger-parameter text encoder?

still qwen

is there any workflow for comfyui that works with this new model? also, the same for the wip radiance model; its comfyui workflow is very old and i'm wondering if there is a newer one that works better?

@QuickscopingFTW for radiance, the default workflow in the latest comfyui should work and it will auto-detect the model just fine.
for this model though, i haven't pushed any comfy support just yet because it's literally still in a very early pretraining phase.

I AM SO EXCITED FOR THIS MODEL, Thank you @lodestones. hope it will have support for comfyui. also, i was getting errors when using the model with the qwen text encoder and fluxUltra VAE.

"fluxUltra Vae"

That VAE is trash, why do you use it? Can't you see the artifacts it generates?
