GSoC '2022 Successfully Ends

on under Google Summer of Code

Originally published on CuPy’s official blog: https://medium.com/cupy-team/gsoc-2022-successfully-ends-dd7fef553d61.

Hey there! I’m on the edge of my seat to share my successful journey over the past thrilling months. Summer 2022 was amazing. I got an opportunity to work with the CuPy Team (under the NumFOCUS org) at the Google Summer of Code internship this summer.

icon_1st_page

Ahead of that, I’d appreciate it if you go through my final presentation. Here’s the link: https://khushi-411.github.io/gsoc-cupy-talk/#/. I strongly recommend flipping upside-down & right-left to see complete details about my work as I prepare nested slides. Don’t forget to check the complete slides!

Let’s talk about the work I’ve been indulged in these days. I structured my blog post In the following pattern:

  1. About CuPy
  2. My Project
  3. Post-GSoC Time
  4. Accomplished Work
    • Interpolate
    • Special
    • Stats
  5. My Experience
  6. Acknowledgment

About CuPy

cupy_logo

Here’s the compact description that explains everything about CuPy. Credits to the official site cupy.dev.

NumPy/SciPy-compatible Array Library for GPU-accelerated Computing with Python. CuPy speeds up some operations more than 100X.

If you want to know more about CuPy you can check it’s documentation, site, or CuPy’s GitHub. I wrote an introductory post about getting into GSoC. Here’s the link: Woohoo, GSoC’22 soon!.

My Project

I proposed to work on Project 1: CuPy coverage of NumPy/SciPy functions under the mentorship of Masayuki Takagi. Here’s the related issue: #6324. of my proposal.

I’m glad I accomplished all the functions I proposed for the project. I introduced the interpolate module and enhanced the coverage of the special and stats functions in CuPy. Gladly, I accomplished other goals apart from the proposed work (marked with “*”) during the GSoC timeline. I have also worked on experimental kernel fusion methods, Custom Kernels in CuPy (mainly Reduction Kernel), and universal functions. I documented my code well and added all the possible test cases. I also created the performance comparison for all the functions and used NVIDIA Nsight System for calculating benchmarks in interpolating functions. Below are the link and a short description of my work.

Pre-GSoC Time

Before getting into the GSoC, I enhanced CuPy’s coverage of NumPy functions. My contributions:

Accomplished Work

Interpolate

Interpolation is finding new data points from the given discrete set of data points. Interpolation is used in a variety of applications. It is used in engineering and science to estimate new values from the given points. The interpolation method can be used where it isn’t easy to use the Gaussian process. Interpolation can also be applied to higher sampling rates in digital signal processing techniques. There are many more important applications of interpolation techniques. These are mainly the following types of interpolation: Univariate interpolation, Multivariate interpolation, and Spline Interpolation (1-D Splines, 2-D Splines). To know more about interpolation, you can check Wikipedia’s article on interpolation.

In my project, I introduced the interpolate module in CuPy and worked on Univariate Interpolation.

Before:

import cupy
from cupyx.scipy import interpolate

# throws error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'interpolate' from 'cupy' (/home/khushi/Documents/cupy/cupy/__init__.py)

Now:

import cupy
from cupyx.scipy import interpolate

# works well
  • APIs covered:
  • Documentations and tests
  • Performance Benchmark
    • NVIDIA Nsight System

BarycentricInterpolator & barycentric_interpolate

Paper: https://people.maths.ox.ac.uk/trefethen/barycentric.pdf

  • Added _Interpolator1D class: deals with standard features for all interpolation functions (implemented in CPU, due to a smaller number of points). It supports the following methods:
    • __call__: use to call the next points
    • _prepare_x: change into a 1-D array
    • _finish_y: reshape to the original shape
    • _reshape_yi: reshape the updated yi values to a 1-D array
    • _set_yi: if y values are not provided, this method is used to create a y-coordinate
    • _set_dtype: sets the dtype of the newly created yi point
    • _evaluate: evaluates the polynomial, but for reasons of numerical stability, currently, it is not implemented.
  • Added BarycentricInterpolator class: constructs polynomial. It supports the following methods:
    • set_yi: update the next y coordinate, implemented in CPU due to smaller number of data points
    • add_xi: add the next x value to form a polynomial, implemented in CPU due to smaller number as mentioned in the paper
    • __call__: calls the _Interpolator1D class to evaluate all the details of the polynomial at point x
    • _evaluate: evaluate the polynomial
  • Added barycentric_interpolate* wrapper

KroghInterpolator & krogh_interpolate

  • Added _Interpolator1DWithDerivatives class: calculates derivatives. Its’ parent class is _Interpolator1D. It supports the following methods:
    • derivatives: evaluates many derivatives at point x
    • derivative: evaluate a derivative at point x
  • Added KroghInterpolator class: constructs polynomial and calculate derivatives. It supports the following methods:
    • _evaluate: evaluates polynomial
    • _evaluate_derivatives: evaluates the derivatives of the polynomial
  • Added krogh_interpolate* wrapper

Special

Special functions are mathematical functions that are known in mathematical and functional analysis. These are used in various aspects of geometrical applications and many more. These functions are used for error analysis problems and solve many classical problems. Therefore special functions have a high weightage in terms of importance. To know about special functions, you can check Wikipedia’s article on special functions.

In my project, I enhanced the coverage of the following special functions in CuPy.

  • APIs covered:
    • log_ndtr PR #6776
      • For more details: https://khushi-411.github.io/GSoC-week1/
    • log_softmax PR #6823
    • logsumexp PR #6773*
      • support multiple output args in reduction kernel PR #6813*
    • expn PR #6790
      • GitHub repository for more details: https://github.com/khushi-411/expn
    • softmax PR #6890
  • Documentations and tests
  • Performance Benchmark

Universal Functions

  • Implemented in log_ndtr & expn (both numerically stable)
  • Wraps into C++ snippet for speedy implementation.
# universal function
log_ndtr = _core.create_ufunc(
    'cupyx_scipy_special_log_ndtr',
    (('f->f', 'out0 = log_ndtrf(in0)'), 'd->d'),
    'out0 = log_ndtr(in0)',
    preamble=log_ndtr_definition, # C++ function call
    doc="""
    ....
    """
)

Custom Kernels (ReductionKernel)

  • Similar to map & reduce operations in python.
  • Wraps into C++ snippet for speedy implementation.
# function call
def(...):
    ...
    _log_softmax_kernel(tmp, axis=axis, keepdims=True)
    return ...

# ReductionKernel
_log_softmax_kernel = cp._core.ReductionKernel(
    'T x1',
    'T y',
    'exp(x1)',
    'a + b',
    'y = log(a)',
    '0',
    name='log_softmax'
)

Kernel Fusion (Experimental)

  • Compiles into a single kernel instead of a series of kernels
  • Experimental implementation to fuse softmax function
  • Drawback here: does not generate competitive codes, therefore, switched to the normal implementation
def make_expander(shape, axis):
    axis = internal._normalize_axis_indices(axis, len(shape))
    expander = []
    for i, s in enumerate(x.shape):
        if i in axis:
            expander.append(None)
        else:
            expander.append(slice(None))
    return tuple(expander)

@_util.memoize(for_each_device=True)
def _softmax_fuse(shape, axis):
    expander = make_expander(shape, axis)
    @_core.fusion.fuse()
    def softmax_fuse(x):
        x_max = cupy.amax(x, axis=axis)
        exp_x_shifted = cupy.exp(x - x_max[expander])
        return exp_x_shifted / cupy.sum(exp_x_shifted, axis=axis)[expander]
    return softmax_fuse

def softmax(x, axis=None):
    fused_softmax = _softmax_fuse(shape=x.shape, axis=axis)
    return fused_softmax(x)

TODO: Add shape method in cupy.fusion, to use make_expander inside fusion kernel

Stats

Stats is one of the most popular and user-demanding modules in most libraries. These functions are essential in organizing, interpreting, and calculating values. Therefore I proposed to enhance the coverage of the following statistical functions in CuPy.

  • boxcox_llf PR #6849
  • zmap & zscore PR #6855
    • Complex number support in nanvar and nanstd PR #6869*
      • For more details: https://khushi-411.github.io/GSoC-week6/

My Experience

I love open source. I started contributing to CuPy in late November. Participating in GSoC helped me continue contributing to the open source community. I’m glad that the approach to solving the problem changed after getting involved in GSoC. GSoC helped me to engage with the CuPy community. It helped me to learn new techniques and tricks to solve problems. It’s fascinating to see how CuPy team & GSoC helped to improve myself, get an opportunity to learn from talented people and be a successful contributor in the future. I’ve been active in discussion forums, and various issues and helped newbie contributors review their PR. I learned how to prioritize my work and connect with people from different nations, which is my favorite part of participating in GSoC.

Acknowledgement

Special thanks to my mentor Masayuki Takagi, for guiding me throughout the GSoC timeline. Thank you for answering my doubts and for all your explanations. I really appreciate you for being an incredibly supportive mentor.

Thanks very much to Kenichi Maehashi for providing valuable suggestions to improve the blog!

Thanks to the fantastic CuPy team and Google Summer of Code for this awesome summer! I am pleased and glad to get in touch with you!

I hope my work will bring an impact on the growing community. Signed-off-by:

Khushi Agrawal

gsoc, cupy
comments powered by Disqus