Fast basic linear algebra in Cython for recurrent calls
I'm trying to program a function in Cython for a Monte Carlo simulation. The function involves multiple small linear algebra operations, like dot products and matrix inversions. As the function is being called hundreds of thousands of times, the NumPy overhead is getting a large share of the cost. Three years ago someone asked this question: calling dot products and linear algebra operations in Cython? I have tried to use the recommendations from both answers, but the first, scipy.linalg.blas, still goes through a Python wrapper and I'm not really getting any improvements. The second, using the GSL wrapper, is also fairly slow and tends to freeze my system when the dimensions of the vectors are very large. I also found the Ceygen package, which looked very promising, but it seems that the installation file broke in the last Cython update. On the other hand, I saw that SciPy is working on a Cython wrapper for LAPACK, but it looks like it's still unavailable (scipy-cython-lapack). In the end I can also code my own C routines for those operations, but that seems kind of like re-inventing the wheel.
So to summarize: is there a new way to do this kind of operation in Cython? (Hence I don't think this is a duplicate.) Or have you found a better way to deal with this sort of problem that I haven't seen yet?
Obligatory code sample (this is just an example; of course it can still be improved, but it should give the idea):
    cimport numpy as np
    import numpy as np

    cpdef double risk(np.ndarray[double, ndim=2, mode='c'] x,
                      np.ndarray[double, ndim=1, mode='c'] v1,
                      np.ndarray[double, ndim=1, mode='c'] v2):

        cdef np.ndarray[double, ndim=2, mode='c'] tmp, sumx
        cdef double ret

        # Element-wise exponential, then divide by the tiled row sums
        tmp = np.exp(x)
        sumx = np.tile(np.sum(tmp, 1).reshape(-1, 1), (1, tmp.shape[0]))
        tmp = tmp / sumx
        # Bilinear form v1 . (x . v2)
        ret = np.inner(v1, np.dot(x, v2))
        return ret
Thanks!!
tl;dr: how to do linear algebra in Cython?
The answer you link to is still a good way to call BLAS functions from Cython. It is not really a Python wrapper: Python is merely used to get the C pointer to the function, and this can be done at initialization time. So you should get essentially C-like speed. I could be wrong, but I think the upcoming SciPy 0.16 release will provide a convenient BLAS Cython API based on this approach; it will not change things performance-wise.
If you didn't experience any speed-up after porting the repeatedly called BLAS functions to Cython, either the Python overhead of doing this in NumPy doesn't matter (e.g. if the computation itself is the most expensive part) or you are doing something wrong (unnecessary memory copies, etc.).
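One rough way to tell which of the two cases applies is to time the same NumPy call at a small and a large size. This is only a sketch: the helper name `per_call_seconds` and the sizes 4 and 512 are arbitrary choices for illustration, not something from the question.

```python
import timeit

import numpy as np


def per_call_seconds(n, repeats=2000):
    """Average wall-clock time of one np.dot(x, v) call for an n-by-n matrix."""
    x = np.ones((n, n))
    v = np.ones(n)
    total = timeit.timeit(lambda: np.dot(x, v), number=repeats)
    return total / repeats


# The large product does (512 / 4) ** 2 = 16384 times more arithmetic.
small = per_call_seconds(4)
large = per_call_seconds(512)

# If the measured ratio is far below 16384, the small call is dominated
# by fixed per-call overhead rather than by the arithmetic itself.
print(f"4x4: {small:.2e} s, 512x512: {large:.2e} s, ratio: {large / small:.0f}")
```

If the small case turns out to be mostly overhead, moving the call into Cython has something to gain; if not, the computation itself is the cost.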
I would say that this approach should be faster and easier to maintain than GSL, provided of course that you compiled SciPy with an optimized BLAS (OpenBLAS, ATLAS, MKL, etc.).
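Before porting, the BLAS call sequence for the last line of the question's function can be checked from plain Python. The sketch below looks the routines up once with `scipy.linalg.blas.get_blas_funcs` and compares the result against NumPy. The function name `bilinear` and the test data are made up for illustration. This version still goes through the Python-level wrappers, so it shows the call pattern and not the speed you would get from Cython.

```python
import numpy as np
from scipy.linalg.blas import get_blas_funcs

# Look the routines up once, at initialization time, not inside the hot loop.
# The float64 probe array makes SciPy select the double-precision variants.
_probe = np.empty(1, dtype=np.float64)
dgemv, ddot = get_blas_funcs(("gemv", "dot"), (_probe,))


def bilinear(x, v1, v2):
    """Return v1 . (x @ v2) using BLAS gemv followed by BLAS dot."""
    xv2 = dgemv(1.0, x, v2)  # y = 1.0 * (x @ v2)
    return float(ddot(v1, xv2))


# Small random inputs (illustrative only) to compare against NumPy.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))
v1 = rng.standard_normal(4)
v2 = rng.standard_normal(4)

reference = float(np.inner(v1, np.dot(x, v2)))
result = bilinear(x, v1, v2)
print(abs(result - reference))
```

Once the two agree, the same gemv and dot calls are the ones to issue from the Cython side through the C function pointers.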