Parallel Jaccard with RcppParallel


Parallel Jaccard Benchmarks

We have implemented a naive but fast parallel version of Jaccard coefficient estimation for the Phenograph method, obtaining a speed boost of roughly 20-30X compared with the previous serial implementation.
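
As a reminder, the Jaccard coefficient between two cells is the size of the intersection of their neighbor sets divided by the size of their union. A minimal serial R sketch of the per-pair computation on a k-nearest-neighbor index matrix (purely illustrative; the package's implementation shown in the benchmark below is parallel and based on RcppParallel) could look like this:

# Illustrative only: Jaccard coefficient between the neighbor sets of cells i and j,
# where neigh.idx is a k-nearest-neighbor index matrix (one row per cell).
jaccard_pair <- function(neigh.idx, i, j) {
  a <- neigh.idx[i, ]
  b <- neigh.idx[j, ]
  length(intersect(a, b)) / length(union(a, b))
}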

Tests were performed on a dataset of 32,900 cells, using different numbers of neighbors per cell (i.e. 15, 25, 50, 100, 250). Each Jaccard estimation method (serial and parallel) was run twice. See the code below for details.

test              elapsed (s)  relative  neighbors
Parallel Jaccard       54.822     1.000        250
Serial Jaccard       1293.185    23.589        250
Parallel Jaccard        7.192     1.000        100
Serial Jaccard        161.533    22.460        100
Parallel Jaccard        1.614     1.000         50
Serial Jaccard         38.642    23.942         50
Parallel Jaccard        0.372     1.000         25
Serial Jaccard          9.546    25.661         25
Parallel Jaccard        0.114     1.000         15
Serial Jaccard          3.366    29.526         15

Below is the code used to generate the table above, using the purified PBMC dataset from HERE.

library(gficf)
library(uwot)
library(RcppParallel)
library(rbenchmark)

# Step 1: load the PBMC dataset
pbmc = readRDS("/path/to/Purified.PBMC.RAW.rds")
data = gficf::gficf(M = pbmc, cell_proportion_max = 1, cell_proportion_min = .05, storeRaw = F, normalize = F)
rm(pbmc); gc() # free memory

# Step 2: Reduce data with PCA
data = gficf::runPCA(data = data,dim = 10)

# Step 3: Benchmarks (parallel version runs on 6 threads)
RcppParallel::setThreadOptions(numThreads = 6)
res = NULL
for (neigh.number in c(15,25,50,100,250))
{
  message(paste("Testing neigh = ",neigh.number))
  # Find the (neigh.number+1)-nearest neighbors of each cell in PCA space
  neigh.idx = uwot:::find_nn(data$pca$cells, k = neigh.number+1, include_self = T, n_threads = 6, verbose = F, method = "annoy")$idx
  # Drop the first column (each cell is its own nearest neighbor)
  neigh.idx = neigh.idx[,-1]

  # Compare performance of the serial and parallel implementations
  tmp <- benchmark(gficf:::jaccard_coeff(neigh.idx, T),
                   gficf:::rcpp_parallel_jaccard_coef(neigh.idx, T),
                   replications = 2,
                   order = "elapsed")
  tmp$neigh_number = neigh.number
  res = rbind(tmp,res)
}

print(res[,c("test","elapsed","relative","neigh_number")])
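
As a quick sanity check (not part of the benchmark above), one may also want to verify that the serial and parallel implementations produce the same result. Assuming both functions return their output in the same format and order, something along these lines should work:

# Sanity check (assumes both functions return results in the same format/order)
neigh.idx = uwot:::find_nn(data$pca$cells, k = 16, include_self = T, n_threads = 6, verbose = F, method = "annoy")$idx[,-1]
serial.res   = gficf:::jaccard_coeff(neigh.idx, T)
parallel.res = gficf:::rcpp_parallel_jaccard_coef(neigh.idx, T)
all.equal(serial.res, parallel.res)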