[P] 2,000x Faster RAPIDS TSNE – 3 hours down to 5 seconds on NVIDIA GPUs
TSNE is a very popular data visualization algorithm, used alongside PCA and UMAP. scikit-learn's TSNE is very effective for small datasets, but on the 60,000-row MNIST Digits dataset, expect to wait 1 hour. With RAPIDS cuML, TSNE on MNIST runs in 3 seconds! On 200,000 rows, scikit-learn takes a whopping 3 hours, whilst RAPIDS takes 5 seconds (about 2,000x faster)!

Figure 1. cuML TSNE on Fashion MNIST takes 3 seconds. scikit-learn takes 1 hour.

Check out my blog post showing how cuML achieves this massive performance boost, and how NVIDIA GPUs can help scientists and engineers save their precious time: https://medium.com/rapids-ai/tsne-with-gpus-hours-to-seconds-9d9c17c941db

Figure 2. TSNE on the 60,000-row Fashion MNIST dataset (3 seconds).

Give cuML a try! You might know me as the author of HyperLearn, and I can say cuML is the gold standard package for machine learning on GPUs! https://github.com/rapidsai/cuml

Linear Regression, UMAP, K-Means, DBSCAN, etc. are all sped up on the GPU! If you have any questions, feel free to ask!

Table 1. cuML's TSNE running times on an NVIDIA DGX-1, using 1 V100 GPU.

Finally, a big drawback of current GPU implementations is their memory consumption. With cuML TSNE, we use 30% less GPU memory! In a future release, this will be shaved by a further 33%, for a total memory reduction of about 50%. We will also support PCA initialization.

submitted by /u/danielhanchen
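As a quick sanity check on the figures quoted in the post, the sketch below recomputes the speedup and the compounded memory saving. It uses only the rounded numbers from the text (3 hours, 5 seconds, 30%, 33%), not fresh measurements. The variable names are illustrative, and it assumes the future 33% cut applies to the already-reduced memory footprint.

```python
# Rounded figures quoted in the post (not fresh measurements).
sklearn_seconds = 3 * 60 * 60   # scikit-learn on 200,000 rows: about 3 hours
cuml_seconds = 5                # cuML on the same data: about 5 seconds

speedup = sklearn_seconds / cuml_seconds
print(f"speedup: {speedup:.0f}x")

# Memory: 30% less today, then a further 33% cut in a future release.
# Assumption: the second cut applies to the already-reduced footprint.
current_fraction = 1.0 - 0.30
future_fraction = current_fraction * (1.0 - 0.33)
total_reduction = 1.0 - future_fraction
print(f"total memory reduction: {total_reduction:.1%}")
```

Under these assumptions the speedup works out to 2,160x, which the post rounds to 2,000x, and the compounded memory saving is a little over half, in line with the "total of 50%" figure.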