On Sun, 25 Mar 2012 21:11:31 +0400, "Alexander Kiselyov" wrote:
> I found a frustrating problem in PyOpenCL: after each kernel execution,
> host memory consumption increases by approx. 1.5 MB. Given the program's
> workflow (simulating a number of iterations on the card, waiting for the
> kernel to finish, reading the data back via enqueue_copy, writing it to
> a file, then starting the kernel again),
> kernel again), I run out of my 4 GB of RAM after some thousands of such
> In a previous project, written in C++, I solved this problem as follows.
> An event object was dynamically allocated and passed to
> enqueueNDRangeKernel(). After the data had been read back from the GPU,
> the event object was deleted. Obviously this method can't be used in
> Python.
> It's also worth noting that the memory leak occurs on both the CPU and
> the GPU. I'm using an Intel CPU and an nVidia GPU.
> What can be done to fix the problem?
You had me scared there for a second. I was able to reproduce the
phenomenon you describe. Fortunately, it has nothing to do with
The issue is that OpenCL does not limit the size of the queue you build
up, and this is what's causing the growth in used memory. In my
experiments, if I run a 'queue.finish()' every 100 or so submitted
kernels, a) the resulting code runs faster and b) memory usage is flat.
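
To make that concrete, here is a minimal sketch of the pattern. The helper
name 'run_with_periodic_finish' and its 'submit' and 'finish_every'
parameters are made up for illustration and are not part of PyOpenCL; the
only thing it assumes about 'queue' is that it has a 'finish()' method, as
a pyopencl.CommandQueue does. The interval of 100 is simply the value from
my experiments, not a tuned recommendation.

```python
def run_with_periodic_finish(queue, submit, n_launches, finish_every=100):
    """Call submit(i) n_launches times, draining the queue periodically.

    queue        -- any object with a finish() method
                    (e.g. a pyopencl.CommandQueue)
    submit       -- hypothetical callable that enqueues one kernel launch
    finish_every -- how many launches to allow between queue.finish() calls
    """
    if finish_every < 1:
        raise ValueError("finish_every must be at least 1")

    for i in range(n_launches):
        submit(i)  # enqueue work; this is expected to return immediately
        if (i + 1) % finish_every == 0:
            # Block until everything submitted so far has completed, so
            # the backlog of pending commands cannot grow without bound.
            queue.finish()

    # Drain whatever is still pending after the last launch.
    queue.finish()
```

In your program, 'submit' would be whatever function enqueues your kernel
on 'queue'; the read-back and the file write can stay where they are.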
It would be possible to stick this type of auto-finish behavior into
PyOpenCL as a non-default option, but I'm not fully convinced I should
Hope this helps,