How to define a function that processes N-vectors?

Is there a way to define a new OpenCL function that processes N-vectors, for various N, like the built-in math functions?

For example, if overloading were supported:
float square(float x) { return x * x; }
float2 square(float2 x) { return x * x; }
float4 square(float4 x) { return x * x; }
but overloading of user-defined functions doesn’t seem to be allowed.

It could be done via a macro:
#define square(x) ((x)*(x))
but in this case x is expanded twice, so an argument with side effects (or an expensive expression) gets evaluated twice. It is not the solution I’d like.

Thanks for any suggestions!

Not that I’m aware of. OpenCL’s kernel language is based on C99, which does not support function overloading.
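One workaround that stays within plain C99 is to let a macro generate one real function per type, each with a type-suffixed name. This is only a sketch: DEFINE_SQUARE and the square_<type> names are made up for illustration, and the #ifndef shim is an assumption that exists solely so the snippet also compiles as host C with GCC or Clang (OpenCL C already provides float2 and float4). Because each generated square_<type> is an ordinary function, its argument is evaluated once.

```c
/* Host-side shim (assumption, for testing outside OpenCL only):
   OpenCL C defines __OPENCL_VERSION__ and already has float2/float4. */
#ifndef __OPENCL_VERSION__
typedef float float2 __attribute__((vector_size(8)));
typedef float float4 __attribute__((vector_size(16)));
#endif

/* Hypothetical helper macro: stamps out one function per type, named
   square_<type>. The argument is evaluated once, as in any function call. */
#define DEFINE_SQUARE(T) T square_##T(T x) { return x * x; }

DEFINE_SQUARE(float)   /* defines square_float  */
DEFINE_SQUARE(float2)  /* defines square_float2 */
DEFINE_SQUARE(float4)  /* defines square_float4 */
```

Inside a kernel you would then call square_float4(v) and so on. The drawback is that the call site has to name the type explicitly. Some Clang-based OpenCL compilers also accept the non-standard __attribute__((overloadable)) on function declarations, which would allow a single square name, but as far as I know that is a compiler extension rather than something the OpenCL C specification guarantees, so check your platform before relying on it.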