For all examples the movies data set contained in the package will be used.
library(UpSetR)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE, sep = ";")
Before producing examples with the movies dataset, it is worth knowing
the alternative input formats, since your data may not be in the form of
a file. UpSetR has two built-in converter functions, fromList and
fromExpression, that accept other formats. The fromList function takes a
list of named vectors and converts it into a data frame compatible with
UpSetR. The fromExpression function takes a named vector that acts as an
expression: each name lists the sets in an intersection, separated by an
ampersand (&), and each value is the number of elements in that
intersection.
#example of list input (list of named vectors)
listInput <- list(one = c(1, 2, 3, 5, 7, 8, 11, 12, 13), two = c(1, 2, 4, 5, 10), three = c(1, 5, 6, 7, 8, 9, 10, 12, 13))
#example of expression input
expressionInput <- c("one" = 2, "two" = 1, "three" = 2, "one&two" = 1, "one&three" = 4, "two&three" = 1, "one&two&three" = 2)
Note that both of these inputs contain the same data. To generate an
UpSet plot from either input, set the data parameter to either
fromList(listInput) or fromExpression(expressionInput).
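As a hedged sketch of that last step, assuming the UpSetR package is installed: the two inputs defined above are converted, their set sizes are printed so the conversions can be compared, and each is then plotted. The order.by = "freq" argument and the variable names fromListData and fromExprData are illustrative choices only.

```r
library(UpSetR)

# Same inputs as defined above, repeated so the example is self-contained
listInput <- list(one = c(1, 2, 3, 5, 7, 8, 11, 12, 13),
                  two = c(1, 2, 4, 5, 10),
                  three = c(1, 5, 6, 7, 8, 9, 10, 12, 13))
expressionInput <- c("one" = 2, "two" = 1, "three" = 2,
                     "one&two" = 1, "one&three" = 4, "two&three" = 1,
                     "one&two&three" = 2)

# Convert both inputs to the data frame format that upset() expects
fromListData <- fromList(listInput)
fromExprData <- fromExpression(expressionInput)

# Compare the total size of each set across the two conversions
setNames <- c("one", "two", "three")
print(colSums(fromListData[, setNames]))
print(colSums(fromExprData[, setNames]))

# Plot each converted input (order.by = "freq" is an illustrative choice)
upset(fromListData, order.by = "freq")
upset(fromExprData, order.by = "freq")
```

If the two inputs really describe the same data, the printed set sizes match and the two plots look the same.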
When no specific sets are given, nsets selects the n largest sets from
the data. number.angles sets the angle (in degrees) of the numbers above
the intersection size bars. point.size changes the size of the circles
in the matrix, and line.size changes the size of the lines connecting
them. mainbar.y.label and sets.x.label change the axis labels on the
intersection size bar plot and the set size bar plot, respectively. The
more recently added text.scale scales all axis titles, tick labels, and
numbers above the intersection size bars. It takes either a single
number applied to all text, or a vector of specific scales in the format:
c(intersection size title, intersection size tick labels, set size title, set size tick labels, set names, numbers above bars)
upset(movies, nsets = 6, number.angles = 30, point.size = 3.5, line.size = 2, mainbar.y.label = "Genre Intersections", sets.x.label = "Movies Per Genre", text.scale=c(1.3, 1.3, 1, 1, 2, 0.75))
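For comparison, text.scale can also be given as a single number. The following is a minimal sketch, assuming the UpSetR package is installed; the factor 1.5 is an arbitrary value chosen for illustration.

```r
library(UpSetR)

# Load the movies data set shipped with the package
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                   header = TRUE, sep = ";")

# A single number scales every text element by the same factor
# (1.5 is an arbitrary illustrative value)
upset(movies, nsets = 6, text.scale = 1.5)
```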
To look at specific sets, pass a vector of set names to the sets
parameter. To change the proportions of the plot height assigned to the
matrix and the intersection size bar plot, use the mb.ratio parameter,
entered as a pair of proportions, for example c(0.55, 0.45). If no order
is specified, the matrix will be ordered by degree, then frequency. The
three plots below show different ways the data can be ordered.
upset(movies, sets = c("Action", "Adventure", "Comedy", "Drama", "Mystery", "Thriller", "Romance", "War", "Western"), mb.ratio = c(0.55,0.45), order.by = "freq")
upset(movies, sets = c("Action", "Adventure", "Comedy", "Drama", "Mystery", "Thriller", "Romance", "War", "Western"), mb.ratio = c(0.55,0.45), order.by = "degree")
upset(movies, sets = c("Action", "Adventure", "Comedy", "Drama", "Mystery", "Thriller", "Romance", "War", "Western"), mb.ratio = c(0.55,0.45), order.by = c("degree", "freq"))
To keep the sets in the order entered using the sets parameter
(Example 3), set the keep.order parameter to TRUE.
upset(movies, sets = c("Action", "Adventure", "Comedy", "Drama", "Mystery", "Thriller", "Romance", "War", "Western"), mb.ratio = c(0.55,0.45), order.by = "freq", keep.order = TRUE)
Instead of the default method of grouping by degree, grouping by sets
can be achieved using group.by. To set a cutoff for the number of
intersections shown per group of sets, use cutoff.
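The grouping described above might look like the following sketch, assuming the UpSetR package is installed. The string "sets" as the value of group.by, the cutoff of 7, and nintersects = 70 are illustrative choices that are not stated in the text above.

```r
library(UpSetR)

# Load the movies data set shipped with the package
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                   header = TRUE, sep = ";")

# Group intersections by set rather than by degree, showing at most
# 7 intersections per group (70 and 7 are illustrative values)
upset(movies, nintersects = 70, group.by = "sets", cutoff = 7)
```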
There may be times when an intersection you are looking for is not
present in the matrix. This may be because too few intersections are
shown, which can be changed with nintersects, or because the
intersection contains no elements. To also show empty intersections,
turn on empty.intersections.
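A minimal sketch of showing empty intersections, assuming the UpSetR package is installed and that the option is switched on by passing the string "on". The order.by = "freq" argument is an illustrative choice.

```r
library(UpSetR)

# Load the movies data set shipped with the package
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                   header = TRUE, sep = ";")

# Include intersections that contain no elements
# (assumption: "on" is the value that enables the option)
upset(movies, empty.intersections = "on", order.by = "freq")
```

If an expected intersection is still missing, raising nintersects is the other adjustment the text describes.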