Sorry for the grammatical errors. Re-posting after correction.

Hi. Is there any way for the GPU to keep data persistently across kernel invocations, so that we don't have to resend the data every time we invoke the kernel?

For example, suppose the kernel takes two arguments: constant_data (say 10000 bytes, identical on every call) and variable_data (say 1000-10000 bytes, different on every call). If the host program invokes the kernel 10000 times, each time with the same constant_data but different variable_data, I have to bear the overhead of sending the same constant_data every time, in spite of it never changing. So, across multiple kernel invocations, can the constant data stay resident on the GPU, so that for each subsequent invocation I only send the variable data?
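
To make the question concrete, here is a rough sketch of the call pattern I am hoping for. It assumes CUDA, which is only my assumption for illustration (the same idea would apply to a buffer object in OpenCL). The kernel body and the names process_chunk, d_const, d_var and d_hits are hypothetical placeholders, not a real pattern matcher, and error checking is omitted for brevity. The sketch relies on memory obtained from cudaMalloc staying valid across kernel launches until it is freed.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel: counts positions where the variable data
// equals the constant data (indexed modulo its length).
__global__ void process_chunk(const unsigned char* constant_data, int const_len,
                              const unsigned char* variable_data, int var_len,
                              unsigned int* hits)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < var_len && variable_data[i] == constant_data[i % const_len]) {
        atomicAdd(hits, 1u);
    }
}

int main()
{
    const int kConstBytes  = 10000;  // size of constant_data (from the example above)
    const int kVarBytes    = 10000;  // upper bound on variable_data per call
    const int kInvocations = 10000;  // number of kernel invocations

    std::vector<unsigned char> h_const(kConstBytes), h_var(kVarBytes);
    for (int i = 0; i < kConstBytes; ++i) h_const[i] = (unsigned char)(i % 251);

    unsigned char *d_const = nullptr, *d_var = nullptr;
    unsigned int *d_hits = nullptr;
    cudaMalloc((void**)&d_const, kConstBytes);
    cudaMalloc((void**)&d_var, kVarBytes);
    cudaMalloc((void**)&d_hits, sizeof(unsigned int));

    // Send the constant data ONCE, before the loop.
    cudaMemcpy(d_const, h_const.data(), kConstBytes, cudaMemcpyHostToDevice);

    unsigned int h_hits = 0;
    const int threads = 256;
    const int blocks = (kVarBytes + threads - 1) / threads;

    for (int call = 0; call < kInvocations; ++call) {
        // Stand-in for fresh input arriving on the host for this invocation.
        for (int i = 0; i < kVarBytes; ++i) h_var[i] = (unsigned char)(call + i);

        // Only the variable data is transferred inside the loop.
        cudaMemcpy(d_var, h_var.data(), kVarBytes, cudaMemcpyHostToDevice);
        cudaMemset(d_hits, 0, sizeof(unsigned int));

        process_chunk<<<blocks, threads>>>(d_const, kConstBytes, d_var, kVarBytes, d_hits);

        // Blocking copy; also waits for the kernel above to finish.
        cudaMemcpy(&h_hits, d_hits, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    }
    printf("hits in last chunk: %u\n", h_hits);

    cudaFree(d_const);
    cudaFree(d_var);
    cudaFree(d_hits);
    return 0;
}
```

The point is that d_const is filled once before the loop and each invocation only pays for the transfer of variable_data.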

A case where this issue comes up is pattern matching. Suppose I have a kernel that implements a pattern matching algorithm, and the host receives data as a stream. The host invokes the kernel repeatedly, each time passing as arguments a chunk of the data stream and the set of patterns (which is always the same). As things stand, one has to bear the cost of sending the patterns again every time the kernel is invoked with the next chunk of data. In such a case, it would be useful to store the patterns on the GPU once, so that each kernel invocation from the host only supplies the next chunk of data to be matched.