R/LinkedCharts Tutorial
Customising your chart
Here, we show how one can use various built-in properties to customise charts. The main goal of this tutorial is to give an overview of the adjustable aspects in R/LinkedCharts, (colours, axes, labels, etc.). Therefore, we will use well-known example data sets such as Iris flower data set or randomly generated data.
data("iris") # load Iris data set
head(iris)
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
We assume that you are already familiar with R/LinkedCharts and its central ideas and principles, and you want to explore more possibilities of the library. Otherwise, we recommend first to go through this tutorial.
library(rlc) # load the library
Colour
This section describes the following properties:
colourValue
colourDomain
palette
colour
fill
stroke
colourLegendTitle
globalColourScale
addColourScaleToLegend
The simplest way to colour elements of a chart is to use the colourValue
property. It takes numbers or strings for each point (line, bar, etc.) and generates a continuous or categorical colour scale based on that.
openPage(layout = "table1x2")
# a scatter plot with a categorical colour scale
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Species),
width = 300, height = 300, # change width and height of the chart to 300px
place = "A1")
# a scatter plot with a continuous colour scale
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Petal.Width),
width = 300, height = 300,
place = "A2")
To further specify colours, one can use palette
and colourDomain
properties.
In the case of a categorical colour scale, colourDomain
is a set of all possible colour values. Thus for the chart on the left by default colourDomain = c("setosa", "versicolor", "virginica")
. If a value is not included in the colour domain, the corresponding points will be black.
For continuous colour scales, colourDomain
defines the range in which colour values can change. All values outside this range will produce colours corresponding to the maximal or minimum value of the colourDomain
.
To illustrate all this, let’s add colourDomain
to our example.
openPage(layout = "table1x2")
# a scatter plot with a categorical colour scale
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Species,
colourDomain = c("setosa", "something else", "virginica")),
width = 300, height = 300,
place = "A1")
# a scatter plot with a continuous colour scale
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Petal.Width,
colourDomain = c(-1, 1)),
width = 300, height = 300,
place = "A2")
All the “versicolor” points on the left plot are now black since this value is no longer present in colourDomain
. At the same time, the orange colour is reserved for “something else”. On the plot to the right, the colour scale now varies between -1 and 1. All the points with colourValue
greater than 1 are just purple.
Finally, palette
defines what colours are used. It is always a vector of colours (their names or hexadecimal codes). For categorical colour scale, palette
must have a colour for each element from colourDomain
. For continuous scales, palette
is a set of “reference points” for the colour scale. By default, they are spread evenly withing colourDomain
, but one can also specify intermediate points.
openPage(layout = "table1x2")
# a scatter plot with a categorical colour scale
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Species,
colourDomain = c("setosa", "something else", "virginica", "versicolor")),
palette = c("gold", "hotpink", "dodgerblue"),
width = 300, height = 300,
place = "A1")
# a scatter plot with a continuous colour scale
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Petal.Width,
colourDomain = c(0, 0.3, 2.6),
palette = c("red", "grey", "black")),
width = 300, height = 300,
place = "A2")
Now chart to the left has more elements in colourDomain
than colours in palette
, and therefore colours are repeated, starting from the first element in palette
. For the chart to the right, we’ve provided palette
with three colours, and also we’ve added an intermediate point in colourDomain
. This tells R/LinkedCharts to interpolate colours from red to grey between 0 and 0.3 and from grey to black between 0.3 and 2.6.
Instead of using colourValue
and palette
, one can also assign colours directly. colour
is similar to colourValue
, but it assigns colours without generating a colour scale. Here is an example. Note how the fifth circle is coloured black since some_strange_colour
doesn’t correspond to any colour.
lc_scatter(dat(x = 1:5, # x-coordinates of the points are 1, 2, 3, 4, 5
y = rep(1, 5), # all 5 points has 1 as their y-coordinates
size = 15, # Let's have big points! (Default point size is 6)
width = 350, # Our chart will be 350px wide...
height = 200, # ...and 200px high.
colour = c("red", "#123456", rgb(0.4, 0.8, 0.1), "#aaa", "some_strange_colour"))
)
Besides that, you can also change stroke colour (stroke
) in scatters, bar charts and ribbons and fill
lines (lc_line
, lc_abLine
, etc.) with some colour. These two properties work the same way as colour
.
openPage(layout = "table1x2")
# some filled lines
points <- seq(0, 6.5, 0.1)
x <- cos(points)
y <- sin(points)
lc_path(dat(x = sapply(0:2, function(i) x + i), # coordinates for three circles
y = sapply(0:2, function(i) y + i),
lineWidth = 5, # make lines thicker
fill = c("blue", "red", "black"),
# colour of the elements (in this case - lines)
colour = c("cornflowerblue", "coral", "grey")),
width = 300, height = 300,
place = "A1")
# same plot, but using lc_scatter
lc_scatter(dat(x = 0:2, y = 0:2, size = 55, # three huge points instead of circles
# change default axes limits to those of the chart to the left
domainX = c(-1, 3), domainY = c(-1, 3),
strokeWidth = 5, # make strokes thicker
stroke = c("blue", "red", "black"),
# colour of the elements (in this case - points)
colour = c("cornflowerblue", "coral", "grey")),
width = 300, height = 300,
place = "A2")
colourLegendTitle
allows giving a meaningful name to your legend.
globalColourScale
can be helpful when your chart has several layers with coloured elements. If globalColourScale
is TRUE
(default value), a single global scale is created for all the layers. Otherwise, each layer gets its own colour scale.
addColourScaleToLegend
(by default, TRUE
) defines whether or not to display a legend for the colour scale of this layer.
Imagine we have three types of data samples. Samples of type “a” are randomly scattered, samples of types “b” and “c” are distributed along some lines, but for type “c”, we don’t have any points, only the area where they are likely to be found. This artificial example can help us illustrate the meaning of having a global colour scale. So we are going to have a chart with three layers: one with points of type “a” and “b”, another for lines along which type “b” and “c” samples are scattered, and one more to highlight the area of most likely location for type “c” samples.
# generate 20 randomly distributed points, and 20 that
# are scattered along y = 3 * x line
pointsX <- runif(40)
pointsY <- c(runif(20, 0, 3), pointsX[21:40] * 3 + rnorm(20, sd = 0.2))
x <- seq(0, 1, 0.05)
openPage(layout = "table1x2")
# first we add filled area first to put it under all other elements
lc_ribbon(dat(
x = x,
# lc_ribbon fills area between ymax and ymin values
ymax = x * 2 + abs(x - 0.5),
ymin = x * 2 - abs(x - 0.5),
colourValue = "c",
# properties that influence the entire chart can be
# set in any of its layers
width = 300, height = 300),
place = "A1")
lc_scatter(dat(
x = pointsX,
y = pointsY,
# first 20 points are of class "a"
# the other 20 are of class "b"
colourValue = c(rep("a", 20), rep("b", 20))),
# to add a layer one need either define "lyaerId"
# or set "addLayer = TRUE"
place = "A1", addLayer = T)
lc_abLine(dat(
# we can add 'n' lines by defining 'n' values
# of slope ('a') and intercept ('b')
a = c(3, 2), b = c(0, 0),
colourValue = c("b", "c")),
place = "A1", addLayer = T)
# The same chart but without global colour scale
lc_ribbon(dat(
x = x,
ymax = x * 2 + abs(x - 0.5),
ymin = x * 2 - abs(x - 0.5),
colourValue = "c",
globalColourScale = F,
addColourScaleToLegend = F,
width = 300, height = 300),
place = "A2")
lc_scatter(dat(
x = pointsX,
y = pointsY,
colourLegendTitle = "scatter",
colourValue = c(rep("a", 20), rep("b", 20))),
place = "A2", addLayer = T)
lc_abLine(dat(
a = c(3, 2),
b = c(0, 0),
colourLegendTitle = "lines",
colourValue = c("b", "c")),
place = "A2", addLayer = T)
The chart to the left uses a single colour scale for all the layers, even if some colour values of one layer are not present in the others. In the right one, where we didn’t use global colour scale (globalColourScale = F
), each layer has its own colourDomain
and the same default palette
. lc_ribbon
has only one element with colourValue = "c"
. Hence, it uses the first colour of the default palette (which is blue) for this data type. lc_abLine
has data of types “b” and “c”, but it has no idea that blue has already been used for the data type “c”. “b” comes first in the data and it gets the first colour (blue), and “c” is now orange. Similarly, colours are defined for lc_scatter
. You can make the chart to the right look like the one to the left by adding to each layer colourDomain = c(“a”, “b”, “c”)`.
R/LinkedCharts shows a colour legend for each layer. It is helpful for the chart to the right, where each colour’s meaning changes from layer to layer but can be excessive in some cases. If globalColourScale = T
, only one legend (of the first layer with the colourValue
property) will be used. In other cases, one should manually set addColourScaleToLegent = F
(the ribbon layer of the chart to the right).
So far, we didn’t mention heatmaps (lc_heatmap
), but their colouring is defined by the already mentioned palette
and colourDomain
properties the same way it happens for all other charts. It can also be interesting to use an interactive lc_colourSlider
instead of a static colour scale.
# if you want to plot 150x150 correlation matrix, it's better to
# use your browser instead of RStudio Viewer.
openPage(useViewer = F)
lc_heatmap(dat(
values = cor(t(iris[, 1:4])),
colourDomain = c(-1, 1),
palette = RColorBrewer::brewer.pal(11, "RdBu"),
# if we use colour slider, we don't need the static legend
showLegend = F
))
lc_colourSlider(chart = "Chart1")
Shape
This section describes the following properties:
symbol
symbolValue
symbolLegendTitle
size
strokeWidth
lineWidth
opacity
dasharray
Scatters and beeswarms in R/LinkedCharts can display points as one of seven standard d3 symbols. symbolValue
, which is similar to colourValue
, generates a scale that converts some user-provided values to one of the symbols. symbol
property directly assigns symbol types to each element. Possible symbol types are "Circle", "Cross", "Diamond", "Square", "Star", "Triangle", "Wye"
. symbolLegendTitle
adds a title to the symbol legend.
openPage(layout = "table1x2")
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
symbolLegendTitle = "Species",
symbolValue = iris$Species),
width = 300, height = 300,
place = "A1")
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
symbol = ifelse(iris$Species == "setosa", "Star", "Cross")),
width = 300, height = 300,
place = "A2")
On the left chart R/LinkedCharts automatically puts a symbol for each of the present species symbolValue = iris$Species
. On the chart to the right, we do it manually symbol = ifelse(iris$Species == "setosa", "Star", "Cross")
(for the sake of simplicity, we only distinguish “setosa” from everything else).
We can also change size
of the points (default is 6), opacity
(value from 0 to 1), or strokeWidth
(by default, 0.1 * size
).
lc_scatter(dat(x = iris$Sepal.Length,
y = iris$Petal.Length,
size = iris$Sepal.Width * 2,
colourValue = iris$Petal.Width,
strokeWidth = 3,
opacity = runif(150))
)
Lines can have different widths (lineWidth
, default is 1.5) and dashes pattern (dasharray
). dasharray
is defined the same way as CSS stroke-dasharray attribute. It is a list of numbers that specify the lengths of alternating dashes and gaps. The first number is the length of the first dash, second - of the first gap, third - of the second dash, and so forth.
lc_hLine(dat(
h = 1:5,
lineWidth = 1:5 * 2,
dasharray = c("", "10", "10 2", "15 3 8", "3 6 9 12"),
domainY = c(0, 6)
))
Titles and labels
This section describes the following properties:
label
title
titleSize
titleX
/titleY
axisTitleX
/axisTitleY
axisTitlePosX
/axisTitlePosY
colourLegendTitle
symbolLegendTitle
label
defines the text that you see when hovering the mouse over some element. If vectors that specify x
or y
coordinates have names, these names will be used as labels. Otherwise, the element’s index is used as its label (note that indices start from 0). One can also define the main title
of a plot, its size (titleSize
) and position. titleX
specifies the horizontal position of the title midpoint, while titleY
is the vertical position of its bottom.
axisTitleX
and axisTitleY
set title to x and y axes respectively. By default, the titles are positions above the axis (inside the plotting area) around its end. One can change the default positioning with axisTitlePosX
and axisTitlePosY
. These properties accept a combination of the keywords. "up"
and "down"
defines whether the title should be above or below the axis (inside or outside the plotting area), "start"
, "middle"
and "end"
specify positioning along the axis. So, the default value of this property is "up end"
. You can use one or two keywords at once.
colourLegendTitle
and symbolLegendTitle
have already been mentioned in the sections above - they specify titles for colour and symbol legends.
lc_scatter(dat(
x = iris$Sepal.Length,
y = iris$Petal.Length,
size = iris$Sepal.Width * 2,
colourValue = iris$Petal.Width,
symbolValue = iris$Species,
title = "Iris dataset",
titleX = 100,
titleY = 500,
titleSize = 30,
axisTitleX = "Sepal Length",
axisTitleY = "Petal Length",
axisTitlePosX = "middle",
axisTitlePosY = "down start",
colourLegendTitle = "Petal Width",
symbolLegendTitle = "Species"
))
Interactivity
This section describes the following properties:
on_click
on_mouseover
on_mouseout
on_marked
on_clickPosition
R/LinkedCharts is designed as an easy-to-use framework to create sets of interactive linked charts. From this tutorial, you can already get a pretty good idea of how it works, but let’s quickly go through it again here.
All the interactivity in R/LinkedCharts is based on the two main ideas.
First is the dat()
function. Properties, defined inside this function (e.g. dat(x = somePoints$x, y = somePoints$y)
) can be reevaluated any moment by calling the updateCharts
function and the chart will be changed accordingly. For example, if somePoints
is changed, then updateCharts()
will make the points move to new positions.
lc_scatter(dat(x = rnorm(10)),
y = rnorm(10))
updateCharts()
Note how each time you call updateCharts
, points are moved along the x-axis to some new randomly generated locations. At the same time, the y-coordinates of each point remain unchanged.
The second one is a system of callback functions. A user-defined function is called whenever something happens on the opened web page (e.g. a point is clicked). The most straightforward way to use this function is to change some global variable that the charts depends on and then call updateCharts
. Of course, you can also run calculations, print some information to console, make static plots or whatever else you want.
So here is a basic example with reaction to click (on_click
). In this example, we have a set of ten colours and a scatter plot with ten points. When any of the points is clicked, all of them change colour, and the index of the clicked point is printed in the console. To do this, we create a variable that stores the currently selected colour selColor
and then use it to set the colour
property. When a point is clicked, the function assigned to the on_click
property is called. It gets the index of the clicked point and prints it (print(i)
). It also changes selColour
and updates the chart. Note that we use the global assignment operator <<-
inside this function instead of the usual <-
. <-
will just create a local variable selColour
inside the function.
colours <- RColorBrewer::brewer.pal(10, "Set3")
selColour <- 1
lc_scatter(dat(
x = 1:10,
y = 1:10,
colour = colours[selColour],
on_click = function(i) {
print(i)
selColour <<- i
updateCharts()
}
))
on_mouseover
and on_mouseout
specify what happens when the user moves the mouse over and out an element. on_mouseover
like on_click
gets an index of the point (line, cell, etc.), on_mouseout
doesn’t get anything. Here is an example similar to the one above, but now, one can just move the mouse over it instead of clicking on a point. When the mouse moves out, all points become black.
colours <- c(RColorBrewer::brewer.pal(10, "Set3"), "black")
selColour <- 1
lc_scatter(dat(
x = 1:10,
y = 1:10,
colour = colours[selColour],
on_mouseover = function(i) {
selColour <<- i
updateCharts()
},
on_mouseout = function() {
selColour <<- 11
updateCharts()
}
))
In any chart, you can select elements by drawing a rectangle with the mouse or by clicking on an element while holding the Shift
key. Double mouse click with the Shift
key pressed will deselect all the elements. Whenever any element is selected or deselected, the function assigned to the on_marked
property is called. At any moment, you can get indices of currently selected elements of any chart by calling getMarked
. Let’s make a brushing example. When the on_marked
event is triggered for one of the charts, we get indices of selected points (getMarked("A1")
) and select them on the other chart (mark(getMarked("A1"), "A2")
). Note that when we use mark
function, on_marked
is not called to prevent creating an infinite stack of calls. You can change that by setting preventEvent = FALSE
.
openPage(layout = "table1x2")
lc_scatter(dat(
x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Species,
on_marked = function() {
mark(getMarked("A1"), "A2")
}
), "A1", width = 300, height = 300)
lc_scatter(dat(
x = iris$Sepal.Width,
y = iris$Petal.Width,
colourValue = iris$Species,
on_marked = function() {
mark(getMarked("A2"), "A1")
}
), "A2", width = 300, height = 300)
Try it by selecting points on one of the charts (the Shift
key must be pressed in order to select points).
If you want to see more use cases of these properties, check this tutorial.
So far, we talked about the event handlers that react to user interactions only with a chart element that represents a data instance: a point, a line, a heatmap cell, etc. Unlike them, on_clickPosition
reacts to any user’s click on the plot. It receives coordinates of the click based on the current axes scales. If the axis is discrete, the closest value will be used as the corresponding coordinate. The example below shows how this property works.
x <- 0
y <- "a"
lc_scatter(dat(
x = x,
y = y),
domainX = c(-10, 10),
domainY = letters[1:10],
on_clickPosition = function(d) {
x <<- c(x, d[1])
y <<- c(y, d[2])
updateCharts()
})
Here, we start with a scatter plot that has a single point. Clicking on any position in this plot adds another value to x
and y
variables and, thus, another point to the plot. Note the difference between a continuous X-axis and a discrete Y-axis. As an X-coordinate, the exact click position is returned, and, therefore, a new point is placed exactly where the mouse points. As a Y-coordinate, the nearest tick is returned, and the points are placed only at one of the 10 specified positions. At any moment, you can return to your R session to explore the current state of the x
and y
variables and modify them, if needed. However, since R doesn’t allow values of different types inside a single vector, both coordinates are returned as characters. Base R as.numeric
function will turn the values in the x
variables into numbers. The rlc
package also doesn’t allow a scatter plot without x
or y
values; therefore, in this example, we start with one point.
Axes Settings
This section describes the following properties:
shiftX
/shiftY
jitterX
/jitterY
logScaleX
/logScaleY
layerDomainX
/layerDomainY
domainX
/domainY
aspectRatio
ticksRotateX
/ticksRotateY
ticksX
/ticksY
Overplotting can become an annoying issue when you want to put too many points on a scatter plot. It means that several points have the same (or almost the same) coordinates and are plotted on top of each other. In this case, it’s impossible to say how many points are there at some coordinates. What seems to be one point can be two, or ten, or a hundred. One way to address this problem is to make points transparent (e.g. opacity = 0.4
). Another is to add noise along one of the axes, which can be especially helpful when one of the axes is categorical or discrete, and there are noticeable gaps between agglomerations of points.
shiftX
, shiftY
, jitterX
, jitterY
can add this noise. jitterX
and jitterY
are numbers that specify the amplitude of the random noise that will be added to each point along one of the axes. 0 stands for no noise, 1 is distance between x
and x + 1
for linear scale, x
and b*x
for logarithmic scale (where b
is a base of the logarithm), and between neighbouring ticks for categorical scale. shiftX
and shiftY
specify shift for each point separately. jitterX = 0.3
is equivalent to shiftX = runif(length(x), -0.3, 0.3)
.
This example demonstrates how jitterX
works and how one can use shiftX
to create a violin plot. We generate 1500 points divided into three groups "a", "b"
and "c"
. Y-values are normally distributed within each group, but with different means.
x <- rep(c("a", "b", "c"), each = 500)
y <- c(rnorm(500), rnorm(500, 3), rnorm(500, 7))
openPage(useViewer = F, layout = "table1x2")
# scatterplot with jitter
lc_scatter(dat(
x = x,
y = y,
jitterX = 0.3,
size = 2.5
), "A1")
# simple function to scale a vector into unitary range
rescale <- function(x, min = 0, max = 1) {
(x - min(x)) / (max(x) - min(x)) * (max - min) + min
}
# generate random noise that is proportional to ker
shift <- unlist(tapply(y, x, function(points) { # for every group of points
d <- density(points) # calculate density distribution of y-values
runif(length(points), -0.3, 0.3) * # multiply random noise
rescale(approx(d$x, d$y, xout = points)$y) # by value from 0 to 1 proportional
# to density at this point
}))
# use generated noise as shift along x-axis
lc_scatter(dat(
x = x,
y = y,
shiftX = shift,
size = 2.5
), "A2")
You can make your X or Y axis logarithmic by setting logScaleX
and logScaleY
respectively to the base of the desired logarithm.
lc_scatter(dat(x = seq(1, 128, length.out = 20),
y = seq(1, 128, length.out = 20),
logScaleY = 2))
By default, limits of the axes are set so that to include all the user-provided values. One can change them simply by changing domainX
or domainY
. If an axis is continuous, the corresponding domain should be a vector with minimal and maximal value to display. Domain for a logarithmic scale must contain only positive values. Domain for a categorical axis is a vector of all possible values to display. One can also specify domain not for the entire chart but only for a given layer using layerDomainX
and layerDomainY
. The resulting domain then will be something that includes domains of all the layers.
No matter how you set the axes limits, they define only the initial state of the chart. Afterwards, you can always zoom in or out. Just draw a rectangle with your mouse, and the chart will zoom in on the selected area. Double click will return the chart to its original scale. You can also use +
and -
buttons on the tools panel (click on the grey triangle in the left upper corner) to zoom in or out.
x1 <- runif(40, 0, 10)
x2 <- runif(40, -5, 5)
lc_scatter(dat(
x = x1,
y = x1 * 3 + rnorm(40),
layerDomainX = c(3, 9),
domainY = c(0, 20)),
chartId = "chart")
lc_scatter(x = x2, y = -x2 + rnorm(40),
colour = "red",
chartId = "chart",
addLayer = T) # new scatter plot will be added as a new layer
When the first scatter plot is generated, it uses c(3, 9)
as limits for the x-axis and c(0, 20)
- for the y-axis. There is no difference between domainX
and layerDomainX
when the chart has only one layer. Then we add another layer that has points outside the limits of both axes. And now, the x-axis changes to fit all the new (red) points, while the y-axis remains the same. That is the difference between the two properties. domainY
specify limits of the y-axis for the entire chart, no matter what else will be there. layerDomainX
, on the other hand, is combined with other layers’ domains to define the final limits of the axis.
And here is how domain works for categorical axes.
lc_scatter(dat(
x = iris$Species,
y = iris$Sepal.Length,
jitterX = 0.2,
colourValue = iris$Petal.Length,
domainX = c("virginica", "something else", "setosa", "versicolor")
))
domainX = c("virginica", "something else", "setosa", "versicolor")
not only specifies the order of ticks but also tells the chart to expect "something else"
as one of the species, even though there are no points with this x-value.
aspectRatio
allows controlling the aspect ratio of the axes. Note that it’s possible only if both axes are linear and continuous. In all other cases, this property will be ignored.
lc_scatter(x = 1:10, y = 1:10,
height = 200, # make the chart wide
aspectRatio = 1)
ticksRotateX
and ticksRotateY
allow to rotate ticks. The angle of rotation must be set in degrees and lie between 0 and 90. Any values outside this range will be put into it. It is also possible to define what ticks to show and to replace their labels with something else. To this end, one of the ticksX
and ticksY
should be set to one or several columns of values. The first column is always where to put ticks. The next columns are optional, and they allow to specify with what to replace tick values (one tick can be replaced with several values simultaneously as if you have several axes at the same time).
Imagine now some values in the Iris dataset are missing. By default, these points are moved to the left upper corner of the plot, but what if we want to show them as well? We can replace NAs with some numeric value that is smaller than all our real values and then change tick labels to indicate that.
values <- iris$Sepal.Length
#add some NAs
values[sample(length(values), 10)] <- NA
values[is.na(values)] <- 3
lc_scatter(dat(
x = iris$Species,
y = values,
ticksY = cbind(3:8, c("NA", 4:8)),
ticksRotateX = 45,
jitterX = 0.2,
size = 4,
colourValue = iris$Petal.Length
))
Global chart Settings
This section describes the following properties:
width
height
paddings
showLegend
showPanel
transitionDuration
mode
width
and height
specify the chart size in pixels (the default value for both is 500). It is possible to change default paddings
that are used for axes, titles, labels and dendrograms (for heatmaps). paddings
must be a list with at least one of the following four elements: "top", "right", "bottom"
and "left"
. showLegend
is a boolean property, which specifies whether or not to show any legend at all. Similarly, with showPanel
, one can show or hide the instrument panel (grey triangle in the right upper corner).
lc_scatter(dat(
x = iris$Sepal.Length,
y = iris$Petal.Length,
colourValue = iris$Sepal.Width,
symbolValue = iris$Species,
showLegend = F,
showPanel = F,
width = 600,
height = 300,
paddings = list(left = 10)
))
As you can see, this chart has no instrument panel or legend, it’s wider than a default-sized one, and its left padding is too small for the y-axis ticks.
You could have noticed by now that when you update a chart, zoom in or zoom out, there is an animated transition between the states that allows you, for example, to track the movement of each point. The duration of this transition is defined by transitionDuration
property (in ms). If it is set to 0, there will be no transition effect, which can considerably save performance time in case of cumbersome calculations. It is also helpful to turn the transition off if you plan to make rapid changes to the chart. For example, you can change the colour of the points depending on which point the mouse is hovering right now.
pca <- prcomp(iris[, 1:4]) #get principle components
#fucntion that calculates a distance from a given pint to
#all other points
getDinstance <- function(p) {
sqrt(rowSums(t((t(iris[, 1:4]) - unlist(iris[p, 1:4])))^2))
}
selPoint <- 1
lc_scatter(dat(
x = pca$x[, 1],
y = pca$x[, 2],
colourValue = getDinstance(selPoint),
transitionDuration = 0,
on_mouseover = function(i) {
selPoint <<- i
updateCharts(updateOnly = "ElementStyle")
}
))
This example allows checking how well a dimensionality reduction approach (in this case, PCA) preserves the original distance. As it usually happens, we plot the first two principal components. The points are coloured by the distance from the selected one (selPoint
). When the user moves the mouse over one of the points, a mouseover event fires. When it happens, we change the selected point (selPoint <<- i
) and call the updateCharts
function. updateOnly = "ElementStyle"
tells the chart to change only the style of the points and nothing else. In case of many points, this may save some performance.
Scatters, beeswarms and heatmaps can also work in either "svg"
or "canvas"
mode. In the "svg"
mode, each point or cell is a separate element, which allows you to be more efficient when you want to change only some of them or only a specific aspect of a chart (e.g. change the colour of the points without changing their location). Yet if you have too many points, rendering each of them as a separate element can require too much memory and considerably slows down your browser or RStudio Viewer. An alternative to SVG is HTML-Canvas. In “canvas” mode, all the points or cells are parts of a single image. It makes rendering much faster, but any plot change, no matter how small, requires the image to be completely redrawn. No transition effect is available for the “canvas” mode. It also can’t be used in RStudio Viewer. If you want to use “canvas” mode, you need first to open a page in your browser (openPage(useViewer = FALSE)
). There are three options available for the mode
property: "svg"
and "canvas"
specifies the mode, and "default"
allows the chart to select the mode automatically.
Heatmaps
This section describes the following properties:
rowLabel
/colLabel
clusterRows
/clusterCols
showDendogramRow
/showDendogramCol
rankRows
/rankCols
on_labelClickRow
/on_labelClickCol
rowTitle
/colTitle
showValue
valueTextColour
Each row and column of heatmap has labels shown next to the row or column and on the label that appears when the mouse hovers over one of the cells. These labels can be automatically picked from column and row names of the value
matrix, or you can specify them using colLabel
and rowLabel
. When neither is available, numbers are used instead. clusterRows
and clusterCols
specify whether rows and columns should be clustered when the heatmap is generated. Even if these two are set to FALSE
, you can always cluster rows and columns later, using the instrument panel (click on the grey triangle in the upper-left corner to open/hide the instrument panel). Note that hierarchical clustering is slow and it may cause the page to go down. When rows or columns are clustered, an interactive dendrogram appears on the top or to the left from the heatmap. You can click on a branch of the, for example, row dendrogram to cluster columns, using only rows of the selected branch as features.
openPage(useViewer = F)
lc_heatmap(dat(
value = as.matrix(dist((iris[1:4]))),
rowLabel = iris$Species, # labels do not have to be unique
colLabel = iris$Species,
clusterRows = T, # we cluster rows and columns
clusterCols = T,
showDendogramRow = F # but hide row dendogram
))
By default, rows and columns are ordered as they are given. If clusterRows
or clusterCols
is TRUE
, then rows or columns are ordered according to the hierarchical clustering. It is also possible to set row and column order by setting rankRows
and rankCols
properties. By now, you may have noticed that clicking on a row label causes columns to rearrange themselves so that cells in the clicked row are sorted by their values. The same goes for column labels. Now, we are going to replace that with a different kind of behaviour. When a row label is clicked, it will be placed on top of the heatmap, and all other rows will be ordered by correlation value with the selected one.
openPage(useViewer = F)
rnk <- 1:nrow(iris) # initial rank of rows
lc_heatmap(dat(
value = iris[, 1:4],
rowLabel = iris$Species,
colLabel = colnames(iris),
height = 1000,
rankRows = rnk,
on_labelClickRow = function(i) {
rnk <<- -cor(unlist(iris[i, 1:4]), t(iris[, 1:4]))
updateCharts()
}
))
There are other types of information one can add to a heatmap.
rowTitle
and colTitle
define titles to rows and columns. They work like axisTitleY
and axisTitleX
for plots with axes, but for heatmaps. One can also add title to the colour legend with legendTitle
. By setting showValue = T
, you can force a heatmap to display the corresponding values inside each cell. The colour of the text is, by default, set individually for each cell to make it visible. You can also change the colour with valueTextColour
property.
openPage(layout = "table1x2")
lc_heatmap(value = matrix(rnorm(100), nrow = 10),
showValue = T,
rowTitle = "Here are the rows!",
colTitle = "And these are the columns.",
legendTitle = "Another title",
place = "A1")
lc_heatmap(value = matrix(rnorm(100), nrow = 10),
showValue = T,
palette = viridisLite::mako(10),
valueTextColour = "brown",
rowTitle = "Here are the rows!",
colTitle = "And these are the columns.",
place = "A2")