You could use the Wolfram Language for your project. There is a free Wolfram Engine for Developers, and with the Wolfram Client Library for Python you can call Wolfram Language functions from Python.
Either CUDALink (see the CUDALink Overview and CUDALink Guide) or OpenCLLink (see the OpenCLLink Overview and OpenCLLink Guide) will enable you to run code on your GPU. However, CUDALink has image processing functions built in, so it is perhaps best to start there for your project.
from wolframclient.evaluation import WolframLanguageSession
from wolframclient.language import wl, wlexpr
Start the Wolfram session.
wolfSession = WolframLanguageSession()
The CUDA package is included with the Wolfram Engine but has to be explicitly loaded.
wolfSession.evaluate(wl.Needs('CUDALink`'))
Then use CUDAQ to check whether the engine has located one or more compatible GPUs.
print(wolfSession.evaluate(wl.CUDALink.CUDAQ()))
True
There is a one-time download of CUDA binaries the first time CUDAQ is evaluated. To watch the progress of that download, you can make the first call directly in the kernel instead of through Python: run wolframscript in the terminal and evaluate the Needs and CUDAQ functions at the input prompt.

Detailed information can be returned with CUDAInformation.
print(wolfSession.evaluate(wl.CUDALink.CUDAInformation(1,'Core Count')))
160
Grab an image for some CUDA image function examples.
imgWolfram = wolfSession.evaluate(
    wl.Rasterize(
        wl.Import('https://i2.wp.com/oemscorp.com/wp-content/uploads/2015/07/wolfram-language-logo.png'),
        'Image'
    )
)
The image can be Exported in one of the supported Raster Image Formats or Vector Graphics Formats.
wolfSession.evaluate(
    wl.Export(
        '<path with image filename>',
        imgWolfram
    )
)

Example CUDA image functions include CUDAImageConvolve:
wolfSession.evaluate(
    wl.Export(
        '<path with image filename>',
        wl.CUDALink.CUDAImageConvolve(imgWolfram, [[-1,0,1],[-2,0,2],[-1,0,1]])
    )
)
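The kernel passed to CUDAImageConvolve above is the 3x3 Sobel kernel for the horizontal gradient, so the result highlights vertical edges. To see what that arithmetic does, here is a minimal pure-Python sketch on a tiny grayscale array. The function name convolve2d_valid and the sample arrays are made up for illustration. It only computes positions where the kernel fits entirely inside the image, so padding, scaling and clipping in CUDAImageConvolve may well differ. It is not how CUDALink implements the operation.

```python
def convolve2d_valid(image, kernel):
    """True 2-D convolution (kernel flipped), only where the kernel fully fits."""
    kh, kw = len(kernel), len(kernel[0])
    # Flipping the kernel in both directions is what makes this a
    # convolution rather than a correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(len(image) - kh + 1):
        out_row = []
        for c in range(len(image[0]) - kw + 1):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += flipped[i][j] * image[r + i][c + j]
            out_row.append(total)
        out.append(out_row)
    return out


# Same kernel as in the CUDAImageConvolve call.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

# Hypothetical test data: a dark-to-bright vertical step, and a flat region.
step_edge = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
flat = [[5] * 4 for _ in range(3)]

edge_response = convolve2d_valid(step_edge, SOBEL_X)
flat_response = convolve2d_valid(flat, SOBEL_X)

print(edge_response)  # non-zero across the step
print(flat_response)  # zero, because the kernel entries sum to 0
```

Flat regions give zero because the kernel entries sum to zero, while the step gives a non-zero response. That is why the exported image looks like an edge map.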



Terminate the Wolfram session.
wolfSession.terminate()
Since you are processing many image files, you would get slightly better performance by processing them directly in a wolframscript script with the Wolfram Engine rather than by calling the engine from Python.
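If you do stay in Python, it probably helps to start the session once and reuse it for every file rather than launching a kernel per image. The sketch below shows that loop. The helper process_all, its parameters and the '*.png' pattern are made up for illustration. It only assumes the session object has an evaluate method, as WolframLanguageSession does.

```python
from pathlib import Path


def process_all(session, build_expr, in_dir, out_dir, pattern="*.png"):
    """Evaluate one expression per matching file, reusing a single session.

    build_expr(src, dst) is a caller-supplied function that returns the
    expression to evaluate for one input/output path pair.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # Sort so the processing order is deterministic.
    for src in sorted(Path(in_dir).glob(pattern)):
        dst = out_dir / src.name
        session.evaluate(build_expr(str(src), str(dst)))
        written.append(dst)
    return written
```

With the session and kernel from the examples above, build_expr could be something like lambda src, dst: wl.Export(dst, wl.CUDALink.CUDAImageConvolve(wl.Import(src), [[-1,0,1],[-2,0,2],[-1,0,1]])). Wrapping the call in try/finally with wolfSession.terminate() in the finally clause ensures the kernel is shut down even if one file fails.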
Hope this helps.