FLUX Chroma: https://huggingface.co/lodestones/Chroma

FLUX Chroma (Tensor Art): https://tensor.art/models/886764918794154122

Unlike base FLUX Schnell, FLUX Chroma uses NAG (Normalized Attention Guidance): https://huggingface.co/spaces/ChenDY/NAG_FLUX.1-dev

TL;DR: NAG is a way of adding negative prompts to the FLUX model.

Paper: https://arxiv.org/abs/2505.21179
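
For the curious, here is a rough sketch of what NAG does inside an attention layer, based on a reading of the paper; the parameter names and default values are assumptions, not the reference implementation:

```python
import torch

def nag(z_pos, z_neg, scale=5.0, tau=2.5, alpha=0.25):
    """Normalized Attention Guidance, per a reading of arXiv:2505.21179.

    z_pos / z_neg: attention outputs for the positive and negative prompt
    branches, shape (batch, tokens, dim). Defaults here are guesses; check
    the paper/demo space for the real values.
    """
    # 1. Extrapolate away from the negative branch -- like CFG, but in
    #    attention-feature space instead of noise-prediction space.
    z_ext = z_pos + scale * (z_pos - z_neg)

    # 2. Normalize: cap how far the extrapolated features may drift from
    #    the positive branch, measured by a per-token L1 norm ratio.
    ratio = (z_ext.norm(p=1, dim=-1, keepdim=True)
             / z_pos.norm(p=1, dim=-1, keepdim=True).clamp(min=1e-8))
    z_norm = z_ext * (ratio.clamp(max=tau) / ratio)

    # 3. Blend back toward the positive branch for stability.
    return alpha * z_norm + (1 - alpha) * z_pos
```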

See the FLUX Chroma Hugging Face repo for the other changes from the base FLUX Schnell model.

To help with creating prompts for FLUX, use JoyCaption: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one

And Danbooru tags: https://donmai.moe/wiki_pages/help:home

The prompt can be up to 512 tokens long, which can be checked at https://sd-tokenizer.rocker.boo/
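
To count tokens offline instead, something like this works (assuming a T5 tokenizer, since T5 handles FLUX's long prompts; the exact checkpoint name is a guess):

```python
from transformers import AutoTokenizer

# Any T5 tokenizer gives the same count; the web tokenizer linked above
# is the zero-setup alternative.
tok = AutoTokenizer.from_pretrained("google/t5-v1_1-xxl")

prompt = "a red fox curled up in fresh snow, golden hour, shallow depth of field"
n_tokens = len(tok(prompt).input_ids)
print(f"{n_tokens} of 512 tokens used")
```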

    • Iwaku@lemmy.world · 2 months ago

      So I just tried what you suggested. Perchance’s is on the left and mine on the right.

      As you can see mine is still not there yet and I need more info. Here is the workflow:

      If you could replicate Perchance’s exactly and provide your workflow, I would be glad to see it.

      • RandomPerchanceUser@lemmy.worldOP · 2 months ago

        Try the settings listed on the Hugging Face Chroma page. If problems persist, use the online hosted version on Tensor Art, or ask on the Chroma Discord.

        • Iwaku@lemmy.world · 2 months ago

          ??? This is just the example workflow from their HF model card. I’m not sure what you mean here as it doesn’t come out well at all with the same settings I used with Perchance.

            • podvornyakva@lemmy.world · edited · 1 month ago

              First things first: it IS a Chroma model… whatever version it is. The base seed matches exactly. About that prompt: it is too thin. You can’t replicate anything with that. Nobody can.

              To the point: ‘text-to-image’ uses a simple approach to drive the Chroma model. It is two-pass sampling, where the first-pass image is used by the second pass. ‘text-to-image’ gives the user only the first/base seed; the second seed is unknown, it is random. The second seed makes a variation of the base image. With only the base seed you can’t reproduce the same result, neither in ComfyUI nor on the ‘text-to-image’ web page, even with a very detailed prompt (sketch below). Hope I’ve made it clear.
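
              If it helps, here is that structure in code: a minimal sketch using diffusers’ Flux pipelines as stand-ins for the two KSampler nodes. Whether a given diffusers version loads Chroma this way is an assumption on my part; the two-pass, two-seed structure is the point.

              ```python
              import torch
              from diffusers import FluxPipeline, FluxImg2ImgPipeline

              # Assumption: a diffusers-format Chroma checkpoint loadable via the
              # Flux pipelines; adjust the loading call for your setup.
              pipe = FluxPipeline.from_pretrained(
                  "lodestones/Chroma", torch_dtype=torch.bfloat16).to("cuda")
              img2img = FluxImg2ImgPipeline.from_pipe(pipe)

              prompt = "your detailed prompt here"
              base_seed = 123456          # the seed 'text-to-image' reports
              second_seed = torch.seed()  # the seed it never shows you

              # Pass 1: fully determined by the base seed.
              base = pipe(
                  prompt, guidance_scale=3.5, num_inference_steps=16,
                  generator=torch.Generator("cuda").manual_seed(base_seed),
              ).images[0]

              # Pass 2: re-noises the pass-1 image with an unknown random seed,
              # so the final image cannot be reproduced from the base seed alone.
              final = img2img(
                  prompt, image=base, strength=0.5,
                  guidance_scale=3.5, num_inference_steps=32,
                  generator=torch.Generator("cuda").manual_seed(second_seed),
              ).images[0]
              ```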

              Now the details. “Seven is the number in magic” (Shocking Blue).

              So there are two KSampler nodes in the workflow. The first KSampler passes its latent output to the second node’s latent input. The base KSampler is set to 3.5 guidance and 1.0 denoise, per the Chroma creators’ recommendation. To mix in a second-pass image you have to lower the denoise factor of the second KSampler. Let’s say we want the final image to take its noise equally from both passes. How do we do that? Simple: we set the second KSampler’s denoise to 50%, i.e. 0.5. But here is the “magic”: 50% denoise does not produce equal influence. In fact, we get only 25% influence from the second pass. Why? Because of guidance: it is halved too. We are halving the overall noise factor, like laying a transparent image on top of another image: the transparent image keeps only 50% of ALL its values (hue, luma, color, etc.). The same happens with KSampler denoise.

              And what should we do to solve this? Right: we set guidance to 7, doubling the influence of the second-pass noise (see the sketch below). Simple. The recommended step count for sampling is 26; I use 16 for the first pass and 32 for the second. About resolution: to match the ‘text-to-image’ seed, the base image must be the same size and orientation. The second image’s size may differ, but that leads to errors in the image (extra fingers, extra hands and so on), which happen because the latent image gets stretched. To avoid those errors the scaling factor must be 0.25, 0.5, 2.0, 4.0, etc.
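
              Put as arithmetic, my reasoning above looks like this (it encodes the mental model, not any official Chroma formula):

              ```python
              def second_pass_influence(denoise, guidance, base_guidance=3.5):
                  # Mental model: the second pass contributes its denoise fraction,
                  # but its guidance is effectively scaled down by that same
                  # fraction, so the contribution shrinks twice.
                  effective_guidance = denoise * guidance
                  return denoise * (effective_guidance / base_guidance)

              print(second_pass_influence(0.5, 3.5))  # 0.25 -> only a quarter of the mix
              print(second_pass_influence(0.5, 7.0))  # 0.50 -> doubling guidance restores 50/50

              # Safe latent rescale factors between the passes, to avoid the
              # stretched latents that cause extra fingers/hands:
              SAFE_SCALES = (0.25, 0.5, 2.0, 4.0)
              ```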

              Probably the Chroma setup behind the “text-to-image” plugin is much more complex than that. But it’s all up to you. Hope it was useful. SYA.

              P.S. As you can see here, a high step count changes almost nothing, while the second pass makes drastic changes to the base image. Still, a high-step base image makes the final image’s details sharper and clearer.

              (first image: 16 steps; third image: 52 steps; second image: pass 2, 32 steps)

              Here is Chroma fp8 e4m3fn scaled, with a detailed prompt and 32/32 steps across the two passes. It was not necessary to run 32 steps for the first pass; that’s just an example. 16/32 steps produce almost the same result, and even 8/16 can give better output than a single pass with 32 steps.

              One issue that I can’t solve is a CYAN cast. All outputs, more or less, have a bluish look. I’m trying to get rid of it with the help of LoRAs, but no success yet. Any suggestions?