Khushi Agrawal
@khushi-411
@khushi__411
This talk on GitHub: khushi-411/gsoc-cupy-talk
CuPy: NumPy/SciPy-compatible Array Library for GPU-accelerated Computing with Python. CuPy speeds up some operations more than 100x. cupy.dev
Projects using CuPy: https://github.com/cupy/cupy/wiki/Projects-using-CuPy
Mentor: Masayuki Takagi
Optimized GPU implementation of NumPy/SciPy APIs
Documentation and tests
Performance benchmarking (using NVIDIA Nsight Systems)
BarycentricInterpolator
KroghInterpolator
log_ndtr
expn
softmax
log_softmax
boxcox_llf
zmap
zscore
Add complex number support for nanvar & nanstd
logsumexp
barycentric_interpolate
krogh_interpolate
support multiple output args in ReductionKernel
Enhanced CuPy’s coverage of NumPy/SciPy functions. My contributions:
Introduced the interpolate module in CuPy.
Before:
import cupy
from cupyx.scipy import interpolate
# throws error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'interpolate' from 'cupy' (/home/khushi/Documents/cupy/cupy/__init__.py)
Now:
import cupy
from cupyx.scipy import interpolate
# works well
BarycentricInterpolator & barycentric_interpolate
Paper: https://people.maths.ox.ac.uk/trefethen/barycentric.pdf
Added _Interpolator1D class: handles the features common to all interpolation functions (implemented on the CPU, due to the smaller number of points). It supports the following methods:
__call__: evaluates the interpolator at the given points
_prepare_x: reshapes the input into a 1-D array
_finish_y: reshapes the result back to the original shape
_reshape_yi: reshapes the updated yi values to a 1-D array
_set_yi: creates the y-coordinates when y values are not provided
_set_dtype: sets the dtype of the newly created yi points
_evaluate: evaluates the polynomial; not implemented in the base class for reasons of numerical stability
Added BarycentricInterpolator class: constructs the polynomial. It supports the following methods:
set_yi: updates the next y-coordinate (implemented on the CPU due to the smaller number of data points)
add_xi: adds the next x value to the polynomial (implemented on the CPU, as suggested in the paper)
__call__: calls the _Interpolator1D machinery to evaluate the polynomial at the points x
_evaluate: evaluates the polynomial
Added the barycentric_interpolate wrapper (usage sketch below)
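A minimal usage sketch, assuming a CuPy release that already ships cupyx.scipy.interpolate (the sample points here are illustrative, not from the talk):

import cupy
from cupyx.scipy import interpolate

x = cupy.linspace(0, 2 * cupy.pi, 9)
y = cupy.sin(x)
x_new = cupy.linspace(0, 2 * cupy.pi, 50)

# class interface: build the interpolator once, evaluate many times
interp = interpolate.BarycentricInterpolator(x, y)
y_new = interp(x_new)

# functional wrapper: one-shot interpolation
y_new2 = interpolate.barycentric_interpolate(x, y, x_new)
print(cupy.allclose(y_new, y_new2))   # True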
KroghInterpolator & krogh_interpolate
Added _Interpolator1DWithDerivatives class: calculates derivatives. Its parent class is _Interpolator1D. It supports the following methods:
derivatives: evaluates several derivatives at the points x
derivative: evaluates a single derivative at the points x
Added KroghInterpolator class: constructs the polynomial and calculates its derivatives. It supports the following methods:
_evaluate: evaluates the polynomial
_evaluate_derivatives: evaluates the derivatives of the polynomial
Added the krogh_interpolate wrapper (usage sketch below)
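A similar sketch for the Krogh interpolator (again assuming a CuPy version with cupyx.scipy.interpolate; the sample points are illustrative and chosen so the interpolant is exactly x**2):

import cupy
from cupyx.scipy import interpolate

xi = cupy.asarray([0.0, 1.0, 2.0])
yi = cupy.asarray([0.0, 1.0, 4.0])       # samples of f(x) = x**2

krogh = interpolate.KroghInterpolator(xi, yi)
x = cupy.asarray([0.5, 1.5])
print(krogh(x))                  # interpolated values: [0.25 2.25]
print(krogh.derivative(x))       # first derivative: [1. 3.]
print(krogh.derivatives(x, 3))   # value, 1st and 2nd derivatives at each x

print(interpolate.krogh_interpolate(xi, yi, x))   # functional wrapper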
Enhanced the coverage of special functions in CuPy.
log_ndtr: PR #6776
log_softmax: PR #6823
logsumexp: PR #6773
support multiple output args in reduction kernel: PR #6813
expn: PR #6790
softmax: PR #6890
log_ndtr & expn (both numerically stable)
# universal function
log_ndtr = _core.create_ufunc(
    'cupyx_scipy_special_log_ndtr',
    (('f->f', 'out0 = log_ndtrf(in0)'), 'd->d'),  # float32 routine; float64 uses the default below
    'out0 = log_ndtr(in0)',                       # default element-wise operation
    preamble=log_ndtr_definition,                 # C++ function call, defined in the preamble
    doc="""
    ....
    """
)
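A quick illustration of the numerical-stability point (a hedged sketch, assuming a CuPy build that ships both cupyx.scipy.special.ndtr and log_ndtr; the input values are illustrative):

import cupy
from cupyx.scipy import special

x = cupy.asarray([-40.0, 0.0, 5.0])
# the naive log(ndtr(x)) underflows to -inf for large negative x,
# while log_ndtr evaluates the logarithm directly and stays finite
print(cupy.log(special.ndtr(x)))   # [-inf      -0.693... -2.87e-07]
print(special.log_ndtr(x))         # [-804.6... -0.693... -2.87e-07]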
map & reduce operations in Python.
# function call
def log_softmax(...):
    ...
    _log_softmax_kernel(tmp, axis=axis, keepdims=True)
    return ...
# ReductionKernel
_log_softmax_kernel = cp._core.ReductionKernel(
    'T x1',            # input params
    'T y',             # output params
    'exp(x1)',         # map expression, applied to each element
    'a + b',           # reduce expression (pairwise)
    'y = log(a)',      # post-map expression, applied to the reduced value
    '0',               # identity value of the reduction
    name='log_softmax'
)
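For context, a minimal sketch of how such a kernel can be combined into a complete log_softmax. It uses the public cupy.ReductionKernel so it is self-contained; the wrapper name, signature and variable names are illustrative, not necessarily those of CuPy's source:

import cupy as cp

_log_softmax_kernel = cp.ReductionKernel(
    'T x1', 'T y',
    'exp(x1)',      # map
    'a + b',        # reduce
    'y = log(a)',   # post-map
    '0',            # identity
    name='log_softmax')

def log_softmax(x, axis=-1):
    # subtract the max for numerical stability before exponentiating
    tmp = x - cp.amax(x, axis=axis, keepdims=True)
    # the kernel computes log(sum(exp(tmp))) along the reduction axis
    return tmp - _log_softmax_kernel(tmp, axis=axis, keepdims=True)

x = cp.asarray([[1.0, 2.0, 3.0]])
print(log_softmax(x, axis=1))   # [[-2.4076... -1.4076... -0.4076...]]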
softmax function
def make_expander(shape, axis):
    axis = internal._normalize_axis_indices(axis, len(shape))
    expander = []
    for i, s in enumerate(shape):
        if i in axis:
            # reduced axis: insert a new axis so the result broadcasts back
            expander.append(None)
        else:
            expander.append(slice(None))
    return tuple(expander)
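For intuition, a small check of what the expander looks like (a minimal sketch, assuming make_expander above is in scope together with `from cupy._core import internal`; the shapes are illustrative):

import cupy
from cupy._core import internal   # used by make_expander above

x = cupy.arange(6.0).reshape(2, 3)
expander = make_expander(x.shape, axis=1)
print(expander)                 # (slice(None, None, None), None)

x_max = cupy.amax(x, axis=1)    # shape (2,)
print(x_max[expander].shape)    # (2, 1) -> broadcasts against x of shape (2, 3)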
TODO: Add a shape method in cupy.fusion, to use make_expander inside the fusion kernel
@_util.memoize(for_each_device=True)
def _softmax_fuse(shape, axis):
    expander = make_expander(shape, axis)

    @_core.fusion.fuse()
    def softmax_fuse(x):
        # shift by the max for numerical stability, then normalize
        x_max = cupy.amax(x, axis=axis)
        exp_x_shifted = cupy.exp(x - x_max[expander])
        return exp_x_shifted / cupy.sum(exp_x_shifted, axis=axis)[expander]

    return softmax_fuse

def softmax(x, axis=None):
    # the fused kernel is memoized per (shape, axis) and per device
    fused_softmax = _softmax_fuse(shape=x.shape, axis=axis)
    return fused_softmax(x)
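A quick check of the result (hedged: assumes a CuPy build that ships cupyx.scipy.special.softmax; the input values are illustrative):

import cupy
from cupyx.scipy import special

x = cupy.asarray([[1.0, 2.0, 3.0],
                  [2.0, 2.0, 2.0]])
s = special.softmax(x, axis=1)
print(s)                 # each row is a probability distribution
print(s.sum(axis=1))     # [1. 1.]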
Enhanced the coverage of statistical functions in CuPy.
boxcox_llf: PR #6849
zmap & zscore: PR #6855
nanvar & nanstd (complex number support): PR #6869
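A short usage sketch of these additions (assuming a CuPy version that includes cupyx.scipy.stats and complex support in nanvar/nanstd; the values are illustrative):

import cupy
from cupyx.scipy import stats

a = cupy.asarray([1.0, 2.0, 3.0, 4.0])
print(stats.zscore(a))                            # standardize a against its own mean/std
print(stats.zmap(a, cupy.asarray([0.0, 2.0])))    # standardize a against another sample

# nanvar/nanstd now also accept complex input
c = cupy.asarray([1 + 1j, cupy.nan, 3 - 2j])
print(cupy.nanvar(c), cupy.nanstd(c))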
Thanks a million to Masayuki Takagi for guiding me throughout the GSoC timeline and even before joining GSoC. I really appreciate you being an incredibly supportive mentor. I am glad to have gotten in touch with you!