Hi everyone,

I'm trying to implement a simple iterated convolution kernel with edge wrapping. I have a few questions about how to do this most effectively.

What is the best way to implement edge wrapping in a kernel? Should I use modulo? Should I restrict my buffer widths and heights to powers of two and use a bitwise AND instead? Or should I avoid per-access wrapping altogether by using a larger, padded buffer and copying the left edge to the right side and vice versa (and likewise for top and bottom)? If so, what's the best way to do that copy?
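
To make the three options concrete, here is a rough sketch in plain C of what I have in mind. The function names (wrap_mod, wrap_mask, fill_halo) and the parameter r (the kernel radius) are placeholders I made up for illustration, and this is host-side code rather than an actual device kernel, so please treat it as a sketch and not a finished implementation.

```c
#include <assert.h>

/* Option 1: modulo wrap. Works for any n > 0. The extra "+ n" is needed
   because C's % can yield a negative result for a negative i. */
static int wrap_mod(int i, int n) {
    return ((i % n) + n) % n;
}

/* Option 2: bitmask wrap. Only valid when n is a power of two.
   The unsigned casts make the result well defined for negative i. */
static int wrap_mask(int i, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);
    return (int)((unsigned)i & (unsigned)(n - 1));
}

/* Option 3: padded buffer. buf holds (w + 2r) x (h + 2r) floats, with the
   real w x h data stored at offset (r, r). This fills the border cells
   with wrapped copies of the interior, so the convolution itself can read
   neighbours without any index wrapping. */
static void fill_halo(float *buf, int w, int h, int r) {
    int pw = w + 2 * r;  /* padded width */
    int ph = h + 2 * r;  /* padded height */
    for (int py = 0; py < ph; ++py) {
        for (int px = 0; px < pw; ++px) {
            int x = px - r;
            int y = py - r;
            if (x >= 0 && x < w && y >= 0 && y < h) {
                continue;  /* interior cell, leave as is */
            }
            int sx = wrap_mod(x, w);  /* source column inside the interior */
            int sy = wrap_mod(y, h);  /* source row inside the interior */
            buf[py * pw + px] = buf[(sy + r) * pw + (sx + r)];
        }
    }
}
```

With options 1 and 2 every neighbour read goes through a wrap function. With option 3 the wrapping cost moves into a separate halo-fill pass, which would have to be repeated before each iteration because the interior changes every time.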