I have heard many times that if statements in shaders slow down the gpu massively. But I also heard that texture samples are very expensive.
Which one is more tolerable? Which one has less performance impact?
I am asking because I need to decide whether I should multiply a value by 0 or use an if statement.
I think there is no obvious way of telling this, because it depends on how your if statement is constructed and, in the end, on what machine code is generated from your code.
So the best thing would probably be to implement both and measure the results. I would argue that’s how performance optimisations work. Don’t just trust what a forum post tells you.
However chances are high that both will have similar performance in a range that doesn’t matter for your use case… Without knowing your use case :)
The impact of if statements depends on how you use them. GPUs are massively parallel and sacrifice per-thread complexity to fit more parallel compute. Threads aren’t fully independent: they run in lockstep groups, so when threads in a group disagree on the condition, the whole group usually has to step through both branches.
Pixels that take the then-branch idle while the others run the else-branch, and vice versa. That’s precious GPU time wasted doing nothing. Nested if statements can make this much worse, up to exponentially in the nesting depth, because the group has to wait for every path that any of its pixels takes.
Can’t say if it’s slower than your other expensive job, though. Try it out and measure.
@Smorty Link doesn’t load for me and I don’t know the answer in general, but one thing I can say is that _sometimes_ if statements aren’t an issue at all, which is when the condition evaluates to the same thing for all pixels/fragments. E.g. an “if sin(TIME) < 0.0” costs you almost nothing, whereas “if COLOR.r > 0.5” causes execution to branch and slows you down. But I can’t say how that case compares to a texture lookup; I assume it depends on many things.
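To make the uniform versus per-pixel distinction concrete, here is a toy CPU-side model in Python, not shader code. It assumes a group of 32 pixels that run in lockstep (real group sizes vary by GPU), and the function name `wave_cost` and the cycle counts are made up for illustration. It only captures the idea from the posts above: a group pays for every branch that at least one of its pixels takes.

```python
def wave_cost(conditions, then_cost, else_cost):
    """Toy cost of one lockstep group of pixels running an if/else.

    conditions: one boolean per pixel in the group.
    then_cost / else_cost: invented cycle counts for each branch.
    """
    any_then = any(conditions)       # at least one pixel takes the then-branch
    any_else = not all(conditions)   # at least one pixel takes the else-branch
    cost = 0
    if any_then:
        cost += then_cost            # pixels that failed the test idle here
    if any_else:
        cost += else_cost            # pixels that passed the test idle here
    return cost


# Uniform condition, like sin(TIME) < 0.0: every pixel agrees.
uniform = [True] * 32

# Per-pixel condition, like COLOR.r > 0.5: pixels disagree.
reds = [i / 31 for i in range(32)]
divergent = [r > 0.5 for r in reds]

print(wave_cost(uniform, then_cost=10, else_cost=40))    # 10
print(wave_cost(divergent, then_cost=10, else_cost=40))  # 50
```

In this model the uniform case costs the same as having no else-branch at all, while the divergent case costs the sum of both branches. Real hardware and compilers are more subtle than this, so measuring is still the only reliable answer.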
I’ve heard that using mix() instead (or whatever GDShader calls that GLSL function) can be more performant, since it doesn’t branch. Is that true?
@PoolloverNathan Afaik that is true, yes! mix is the same instruction for all fragments, so if you can replace a branching if with a mix, that should be an improvement.
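As a rough illustration of what replacing an if with mix() means, here is a CPU-side Python sketch, not shader code. `mix` and `step` below are hand-written stand-ins that follow the GLSL definitions, and the `shade_*` names and the 0.5 threshold are hypothetical. The sketch shows only that the two forms give the same result; it says nothing about which is faster on a given GPU.

```python
def mix(x, y, a):
    """Linear blend, following the GLSL definition x * (1 - a) + y * a."""
    return x * (1.0 - a) + y * a


def step(edge, x):
    """GLSL-style step: 0.0 if x < edge, otherwise 1.0."""
    return 0.0 if x < edge else 1.0


def shade_branch(red, base, highlight):
    """Branching version: each pixel picks one of the two values."""
    if red > 0.5:
        return highlight
    return base


def shade_mix(red, base, highlight):
    """Branchless version: every pixel runs the same instructions."""
    mask = 1.0 - step(red, 0.5)   # 1.0 where red > 0.5, else 0.0
    return mix(base, highlight, mask)


def shade_mul(red, value):
    """The multiply-by-zero variant from the question."""
    mask = 1.0 - step(red, 0.5)
    return value * mask           # value is always computed, then masked


for red in (0.0, 0.5, 0.51, 1.0):
    assert shade_branch(red, 0.2, 0.9) == shade_mix(red, 0.2, 0.9)
```

Note the trade-off: the branchless forms always evaluate both inputs, so an expensive texture sample feeding `value` is paid for on every pixel. Also, multiplying an infinity or NaN by zero does not give zero, so the masked-out value needs to be finite.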
@Smorty I don’t know for sure, but experience tells me to go with multiplying by zero.
@Smorty because GPUs can’t feasibly do speculative execution, branching is more expensive than a lookup, which can be done in parallel and cached. But of course, it depends on what you’re testing and what you’re sampling.
Testing a single equality is not the same as testing the result of a complex function call, and sampling a small texture is not the same as sampling a big one, with or without mipmap levels, aggregation, etc.