We can query some information about a device, such as the number of compute units and the clock speed. However, the number of compute units on its own says little about a device's computational power. A CPU typically reports each core as one compute unit, whereas an NVIDIA card reports each streaming multiprocessor as one compute unit, and a streaming multiprocessor contains either 8 or 32 streaming processors (cores), depending on the architecture.
When we start a program with multiple devices present, we want to run it on the fastest device available. The number of compute units alone cannot tell us which device that is.
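
To make the problem concrete, here is a minimal sketch in plain C that compares two ways of ranking devices. It assumes the compute unit count and clock speed have already been read from each device (in OpenCL these would come from `clGetDeviceInfo` with `CL_DEVICE_MAX_COMPUTE_UNITS` and `CL_DEVICE_MAX_CLOCK_FREQUENCY`). The `device_info` struct, the `cores_per_unit` field, and all three function names are hypothetical and exist only for this illustration. In particular, the number of cores per compute unit is not something the query gives us, so it has to be supplied by hand, and the "units × cores × clock" product is only a crude estimate, since a CPU core and a GPU streaming processor do not do the same amount of work per cycle.

```c
#include <stddef.h>

/* Hypothetical record of values read back from a device query.
   Field names are illustrative, not part of any API. */
typedef struct {
    const char *name;
    unsigned compute_units;   /* as reported by the device */
    unsigned cores_per_unit;  /* NOT reported; assumed known from the hardware */
    unsigned clock_mhz;       /* as reported by the device */
} device_info;

/* Naive choice: index of the device with the most compute units.
   The caller must pass n >= 1. Ties keep the earlier device. */
size_t pick_by_compute_units(const device_info *devs, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (devs[i].compute_units > devs[best].compute_units)
            best = i;
    }
    return best;
}

/* Crude throughput estimate: compute units * cores per unit * clock (MHz).
   This is an assumption made for illustration, not a real benchmark. */
unsigned long long estimated_throughput(const device_info *d)
{
    return (unsigned long long)d->compute_units
         * d->cores_per_unit
         * d->clock_mhz;
}

/* Choice based on the crude estimate above.
   The caller must pass n >= 1. Ties keep the earlier device. */
size_t pick_by_estimate(const device_info *devs, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (estimated_throughput(&devs[i]) > estimated_throughput(&devs[best]))
            best = i;
    }
    return best;
}
```

With made-up numbers, say a CPU entry of 4 compute units with 1 core each at 3000 MHz and a GPU entry of 2 compute units with 8 cores each at 1500 MHz, `pick_by_compute_units` selects the CPU while `pick_by_estimate` selects the GPU. The two rankings disagree, which is exactly why the compute unit count cannot be used alone.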