I have a problem when training ControlNet.
I find that the model doesn't generate images that follow the input condition, even after more than 50k iterations.
Maybe the reason is that learning directly in pixel space is harder than learning in latent space.
Thanks!