The OpenCL specification is the best place to find answers to questions like this.
For example, see section 6.1.1, “Built-in Scalar Data Types”.
Work sizes are size_t, whose width matches the device's address bits, i.e. it is 64 bits only on 64-bit devices. My GTX 480 is a 32-bit device, so 2^56 would not be possible (and I presume it would be the same for all GPU cards).
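As a rough illustration of that limit, here is a minimal host-side sketch in plain C. The helper name fits_device_size_t is made up for this example; in a real program the address_bits value would come from clGetDeviceInfo with CL_DEVICE_ADDRESS_BITS. It only checks the size_t width, and a given device or driver may impose further limits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of OpenCL): returns 1 if a work size of n
   fits in an unsigned integer that is address_bits wide, 0 otherwise. */
int fits_device_size_t(uint64_t n, unsigned address_bits)
{
    assert(address_bits > 0);
    if (address_bits >= 64) {
        return 1; /* any uint64_t value fits in 64 bits */
    }
    /* Largest value representable in address_bits bits. */
    uint64_t max_value = (UINT64_C(1) << address_bits) - 1;
    return n <= max_value;
}
```

With address_bits = 32, a work size of 2^56 is rejected, while the same value passes for address_bits = 64.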
For a problem of that size you would probably run loops within each work-group yourself instead of leaving the whole index space to the hardware.
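One common way to do that is a strided loop: launch a modest number of work-items and let each one step through the large index space with a 64-bit counter. The sketch below emulates this on the host in plain C so it can be compiled anywhere. The function names are invented for the example; in an actual kernel, gid and global_size would come from get_global_id(0) and get_global_size(0), and the counter would be a ulong.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates one work-item: visits indices gid, gid + global_size, ...
   below total and returns how many it handled. Assumes
   total + global_size does not overflow 64 bits. */
uint64_t work_item_count(uint64_t gid, uint64_t global_size, uint64_t total)
{
    assert(global_size > 0 && gid < global_size);
    uint64_t visited = 0;
    for (uint64_t i = gid; i < total; i += global_size) {
        visited++; /* placeholder for the real per-element work */
    }
    return visited;
}

/* Emulates the whole launch by running every work-item in turn and
   summing the number of indices they covered. */
uint64_t emulate_launch(uint64_t global_size, uint64_t total)
{
    uint64_t sum = 0;
    for (uint64_t gid = 0; gid < global_size; gid++) {
        sum += work_item_count(gid, global_size, total);
    }
    return sum;
}
```

Each index below total is visited by exactly one work-item, so emulate_launch(7, 100) covers all 100 indices even though only 7 work-items were "launched". The same pattern lets a 32-bit global work size cover a much larger problem, at the cost of more iterations per work-item.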