Is there a way to define a new OpenCL function that accepts vectors of various widths N (float, float2, float4, ...), the way the built-in math functions do?

For example, if overloading were supported:
float square(float x) { return x * x; }
float2 square(float2 x) { return x * x; }
float4 square(float4 x) { return x * x; }
but overloading of user-defined functions doesn't seem to be allowed in OpenCL C (?)

It could be done via a macro:
#define square(x) ((x)*(x))
but in this case the argument x is evaluated twice, which is wasteful for an expensive expression and wrong for one with side effects, so it is not the solution I'd like...
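
To make the double-evaluation concern concrete, here is a minimal sketch in plain host-side C rather than OpenCL C, so that it can be compiled and run anywhere. The names next_value, SQUARE_MACRO, square_fn, calls_via_macro and calls_via_function are made up for this illustration and are not part of any OpenCL API.

```c
/* Hypothetical counter: records how often next_value() is called. */
int call_count = 0;

/* Hypothetical argument expression with a side effect. */
float next_value(void) {
    call_count++;
    return 3.0f;
}

/* Same shape as the macro in the question. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Ordinary function: its argument is evaluated exactly once. */
static inline float square_fn(float x) { return x * x; }

/* Number of next_value() calls when the argument goes through the macro. */
int calls_via_macro(void) {
    call_count = 0;
    float r = SQUARE_MACRO(next_value()); /* expands to (next_value()) * (next_value()) */
    (void)r;
    return call_count;
}

/* Number of next_value() calls when the argument goes through the function. */
int calls_via_function(void) {
    call_count = 0;
    float r = square_fn(next_value());
    (void)r;
    return call_count;
}
```

With the macro the argument expression appears twice in the expansion, so next_value() runs twice. With a real function it runs once, which is the behaviour I'd like to have for every vector width.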

Thanks for any suggestions!