ComplexHeatmap

Tutorial with Tommy

Author

Jaakko

Published

November 19, 2025

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

First time setup

install.packages("BiocManager")
BiocManager::install(version = "3.16")
BiocManager::install("ComplexHeatmap")

Load the library

library(ComplexHeatmap)

Make dummy data

h1 <- c(10,20,10,20,10,20,10,20,10,20)
h2 <- c(20,10,20,10,20,10,20,10,20,10)

l1 <- c(1,3,1,3,1,3,1,3,1,3)
l2 <- c(3,1,3,1,3,1,3,1,3,1)

mat <- rbind(h1,h2,l1,l2)
colnames(mat) <- paste0("timepoint_", 1:10)
mat
   timepoint_1 timepoint_2 timepoint_3 timepoint_4 timepoint_5 timepoint_6
h1          10          20          10          20          10          20
h2          20          10          20          10          20          10
l1           1           3           1           3           1           3
l2           3           1           3           1           3           1
   timepoint_7 timepoint_8 timepoint_9 timepoint_10
h1          10          20          10           20
h2          20          10          20           10
l1           1           3           1            3
l2           3           1           3            1

Visualize the data

par(mfrow = c(1,1), mar = c(4,4,1,1))
plot(1:10, rep(0,10), ylim = c(0,35), pch = "", xlab = "Time", ylab = "Gene Expression")

for (i in 1:nrow(mat)) {
  lines(1:10, mat[i,], lwd = 3, col = i)
}

legend(1,35,rownames(mat), 1:4, cex = 0.7)

Making a heatmap is easy!

Don’t cluster the columns (auto-setting); genes are in column -> cluster false.

Heatmap(mat, cluster_columns = F)

quantile(mat, c(0, 0.1, 0.5, 0.9))
  0%  10%  50%  90% 
 1.0  1.0  6.5 20.0 

Change color mapping

Color interpolation will be helpful.

col_fun <- circlize::colorRamp2(c(0, 3, 20), c("blue", "white", "red"))

Heatmap(mat, cluster_columns = F, col = col_fun)

Outlier

Outliers can nullify heatmaps misleading colors. Set the correct colors range-scale.

mat2 <- mat
mat2[1,1] <- 1000

Heatmap(mat2, cluster_columns = F)

Heatmap(mat2, cluster_columns = F, col = col_fun)

Scaling

Let’s scale the data/gene expression level across columns (time points) first = scale the same gene over the time.
N.B. scale() works on columns -> double t().

scaled_mat <- t(scale(t(mat)))

scaled_mat
   timepoint_1 timepoint_2 timepoint_3 timepoint_4 timepoint_5 timepoint_6
h1  -0.9486833   0.9486833  -0.9486833   0.9486833  -0.9486833   0.9486833
h2   0.9486833  -0.9486833   0.9486833  -0.9486833   0.9486833  -0.9486833
l1  -0.9486833   0.9486833  -0.9486833   0.9486833  -0.9486833   0.9486833
l2   0.9486833  -0.9486833   0.9486833  -0.9486833   0.9486833  -0.9486833
   timepoint_7 timepoint_8 timepoint_9 timepoint_10
h1  -0.9486833   0.9486833  -0.9486833    0.9486833
h2   0.9486833  -0.9486833   0.9486833   -0.9486833
l1  -0.9486833   0.9486833  -0.9486833    0.9486833
l2   0.9486833  -0.9486833   0.9486833   -0.9486833
attr(,"scaled:center")
h1 h2 l1 l2 
15 15  2  2 
attr(,"scaled:scale")
      h1       h2       l1       l2 
5.270463 5.270463 1.054093 1.054093 
Heatmap(scaled_mat, cluster_columns = F)

Note - after scaling, h1 and l1, as h2 and l2, are close each other!

Understanding clustering?

In order to understand heatmap you need to understand clustering.
In order to cluster different rows or different columns together you first have to define distance between the rows or the columns -> dist() = distance matrix (euclidian distance) = pair-wise [2D plane d=sqrt{(x2-x1)^2 + (y2-y1)^2} ].

c1 c2 c3
r1 x
r2 x x
r3 x x x

N.B. There are different distance methods used in different scenarios.
Then you do the clustering.
N.B. Different methods of hierarchical clustering.

  1. define distance
?dist

d <- dist(mat)
d
          h1        h2        l1
h2 31.622777                    
l1 43.011626 45.276926          
l2 45.276926 43.011626  6.324555
  1. do clustering and plot
?hclust

hclust(d)

Call:
hclust(d = d)

Cluster method   : complete 
Distance         : euclidean 
Number of objects: 4 
plot(hclust(d))

Look after scaling.

d2 <- dist(scaled_mat)

plot(hclust(d2))

Take home message

  1. Color mapping is critical
  2. Scaling is critical
  3. Making heatmap is easy, but better to understand the details
?Heatmap