---
title: "Introduction to gglite"
output:
  html:
    meta:
      css: ['@default', '@article', '@copy-button', '@heading-anchor', '@pages']
      js: ['@sidenotes', '@appendix', '@toc-highlight', '@copy-button', '@heading-anchor', '@pages']
    options:
      toc: true
      number_sections: true
vignette: >
  %\VignetteIndexEntry{Introduction to gglite}
  %\VignetteEngine{litedown::vignette}
  %\VignetteEncoding{UTF-8}
---
```{r, setup, include = FALSE}
if (!exists('penguins')) {
  load(con <- url('https://cdn.jsdelivr.net/gh/r-devel/r-svn/src/library/datasets/data/penguins.rda'))
  close(con)
}
```
The **gglite** package provides a lightweight R interface to the
[AntV G2](https://g2.antv.antgroup.com/) JavaScript visualization library. It
follows the _Grammar of Graphics_ framework---the same theoretical foundation
behind **ggplot2**---but renders interactive, web-based charts powered by G2.
A visualization in gglite is built by composing independent layers:
1. **Data** -- the data frame you want to visualize.
2. **Marks** (geometries) -- the visual shapes representing data (points,
lines, bars, ...).
3. **Encodings** (aesthetics) -- mappings from data columns to visual channels
(position, color, size, ...).
4. **Scales** -- control how data values translate to visual values.
5. **Coordinates** -- the coordinate system (Cartesian, polar, ...).
6. **Transforms** -- statistical or layout transforms applied to the data.
7. **Facets** -- split data into multiple panels.
8. **Themes** -- overall visual styling.
9. **Components** -- axes, legends, titles, tooltips, and labels.

Layers are added with the pipe operator `|>`, so a chart definition reads from
left to right. If you prefer the ggplot2 convention, you can use `+` instead of
`|>`; both operators produce identical results.
## Data and encodings
Every chart starts with `g2()`, which accepts a data frame and aesthetic
mappings as R formulas:
```{r}
library(gglite)
g2(mtcars, hp ~ mpg)
```
You can also set encodings later with `encode()`:
```{r}
g2(mtcars) |> encode(x = ~ mpg, y = ~ hp, color = ~ cyl)
```
### Formula interface
You can use R formulas as a shorthand for aesthetic mappings. The left-hand
side maps to `y` and the right-hand side maps to `x`:
```{r}
g2(mtcars, hp ~ mpg)
```
Additional aesthetics like `color` can be passed alongside the formula:
```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species)
```
Use `|` for faceting:
```{r}
g2(iris, Sepal.Length ~ Sepal.Width | Species)
```
A one-sided formula maps only `x` (useful for histograms or counts):
```{r}
g2(mtcars, ~ mpg)
```
Use `+` on the right-hand side for multiple position fields (parallel
coordinates):
```{r}
g2(iris, ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  color = ~ Species)
```
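The formula shapes above follow R's own formula structure, which can be inspected with base R. The chunk below is a minimal sketch of that decomposition. `split_formula()` is a hypothetical helper written for this illustration; it is not a gglite function and does not claim to reproduce how gglite parses formulas internally. It only collects variable names.

```{r}
# Hypothetical helper (not part of gglite): split a formula into y, x, and
# facet variable names using base R only.
split_formula = function(f) {
  rhs = f[[length(f)]]  # the right-hand side is always the last element
  facet = NULL
  if (is.call(rhs) && identical(rhs[[1]], as.name('|'))) {
    facet = all.vars(rhs[[3]])  # variables after `|`
    rhs = rhs[[2]]              # position variables before `|`
  }
  list(
    y = if (length(f) == 3) all.vars(f[[2]]),  # NULL for one-sided formulas
    x = all.vars(rhs),
    facet = facet
  )
}
str(split_formula(hp ~ mpg))
str(split_formula(Sepal.Length ~ Sepal.Width | Species))
str(split_formula(~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width))
```

Because `~` has lower precedence than `|`, R parses `y ~ x | g` as `y ~ (x | g)`, which is why the facet variable sits inside the right-hand side of the formula.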
### Character string interface
All aesthetic channels also accept plain character strings instead of formulas.
The two forms are equivalent: `color = 'cyl'` produces the same result as
`color = ~ cyl`:
```{r}
g2(mtcars, x = 'mpg', y = 'hp', color = 'cyl')
```
The `encode()` function also accepts character strings:
```{r}
g2(mtcars) |> encode(x = 'mpg', y = 'hp', color = 'cyl')
```
## Marks (geometries)
Marks are the visual building blocks. gglite provides 35+ mark types. Here are
the most common ones.
### Points (scatter plot)
```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  mark_point()
```
### Lines
```{r}
df = data.frame(
  x = rep(1:5, 2), y = c(3, 1, 4, 1, 5, 2, 7, 1, 8, 3),
  group = rep(c('A', 'B'), each = 5)
)
g2(df, y ~ x, color = ~ group) |> mark_line()
```
### Bars (intervals)
```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x) |> mark_interval()
```
### Areas
```{r}
df = data.frame(x = 1:10, y = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3))
g2(df, y ~ x) |> mark_area()
```
### Box plots
```{r}
g2(iris, Sepal.Width ~ Species) |> mark_boxplot()
```
### Combining marks
Multiple marks can be layered on the same chart:
```{r}
df = data.frame(x = c('A', 'B', 'C'), y = c(3, 7, 2))
g2(df, y ~ x) |>
  mark_interval() |>
  mark_text(encode = list(text = 'y'))
```
## Automatic marks
When no `mark_*()` is added to the pipeline, gglite automatically chooses a
mark based on the types of the `x` and `y` variables:

| `x` type | `y` type | Mark | Chart type |
|-----------|----------|------|------------|
| numeric | numeric | `point` | Scatter plot |
| categorical (unique) | numeric | `interval` | Bar plot |
| categorical (repeated) | numeric | `beeswarm` | Beeswarm plot |
| categorical (repeated, n ≥ 30) | numeric | `beeswarm` + `density` | Beeswarm + density |
| numeric | categorical (unique) | `interval` (transposed) | Horizontal bar plot |
| numeric | categorical (repeated) | `beeswarm` (transposed) | Horizontal beeswarm |
| numeric | categorical (repeated, n ≥ 30) | `beeswarm` + `density` (transposed) | Horizontal beeswarm + density |
| categorical | categorical | `cell` + `group` | Contingency table |
| Date | numeric | `line` | Line chart |
| `ts`/`mts` | _(auto)_ | `line` | Time series line chart |
| numeric | _(none)_ | `interval` + `binX` | Histogram |
| categorical | _(none)_ | `interval` + `groupX` | Count bar chart |
| _(position)_ | _(none)_ | `line` + parallel | Parallel coordinates |

When `x` (or `y`) is categorical, the choice depends on whether the categories
are unique in the data. If every category appears exactly once, a bar plot
(`interval`) is drawn. If categories are repeated, a beeswarm plot shows the
individual data points. When every group has at least 30 observations, a
density curve is overlaid on the beeswarm as a summary.

Because of these defaults, you can often omit the mark entirely, as in the
following examples.
### Scatter plot (numeric × numeric)
```{r}
g2(penguins, bill_len ~ bill_dep, color = ~ species)
```
### Bar plot (categorical × numeric, unique categories)
When each category appears once, a bar chart is drawn:
```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x)
```
### Beeswarm plot (categorical × numeric, repeated categories)
When categories are repeated, individual points are shown in a beeswarm layout:
```{r}
g2(chickwts, weight ~ feed)
```
### Beeswarm + density (categorical × numeric, large groups)
When every group has at least 30 observations, a density curve is overlaid on
the beeswarm:
```{r}
g2(penguins, bill_len ~ species)
```
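To see ahead of time which of the three categorical cases a variable falls into, the rules described above can be written as a small base R check. This is only a sketch: `categorical_case()` is a hypothetical helper, not a gglite function, and details such as how gglite treats missing values or unused factor levels are assumptions here.

```{r}
# Hypothetical helper (not part of gglite): name the mark the rules above
# would select for a categorical variable.
categorical_case = function(x, min_n = 30) {
  n = table(x)
  n = n[n > 0]  # ignore unused factor levels (an assumption)
  if (all(n == 1)) return('interval')
  if (all(n >= min_n)) return('beeswarm + density')
  'beeswarm'
}
categorical_case(c('A', 'B', 'C', 'D'))  # each category appears once
categorical_case(chickwts$feed)          # repeated, groups of 10 to 14
categorical_case(iris$Species)           # repeated, 50 per group
```

The threshold of 30 comes from the description above; it is exposed as `min_n` only to make the rule explicit.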
### Horizontal beeswarm (numeric × categorical)
```{r}
g2(penguins, species ~ bill_len)
```
### Contingency table (categorical × categorical)
Cells are automatically colored by the count of each combination:
```{r}
g2(penguins, island ~ species)
```
### Line chart (Date × numeric)
```{r}
set.seed(1)  # fixed seed so the random values are the same on every build
df = data.frame(date = Sys.Date() + 0:9, value = cumsum(rnorm(10)))
g2(df, value ~ date)
```
### Histogram (numeric only)
```{r}
g2(penguins, ~ bill_len)
```
### Count bar chart (categorical only)
```{r}
g2(penguins, ~ species)
```
### Parallel coordinates (multiple position fields)
```{r}
g2(penguins, ~ bill_len + bill_dep + flipper_len + body_mass, color = ~ species)
```
You can still add scales, themes, titles, and other components as usual:
```{r}
g2(mtcars, hp ~ mpg, color = ~ cyl) |>
  scale_color(type = 'ordinal') |>
  titles('Motor Trend Cars')
```
If you add any `mark_*()`, automatic detection is skipped entirely, so explicit
marks always take priority.
### Time series
`g2()` also accepts R time series (`ts` and `mts`) objects directly. Univariate
series are converted to a data frame with `time` and `value` columns;
multivariate series are reshaped to long format with `time`, `series`, and
`value` columns. The auto-mark feature draws a line chart automatically:
```{r}
g2(sunspot.year) |> titles('Yearly Sunspot Numbers (1700--1988)')
```
Multivariate time series produce one line per series:
```{r}
g2(EuStockMarkets) |> titles('EU Stock Markets (1991--1998)')
```
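The column layout described above can be approximated in base R, which helps when you want to inspect or filter the data before plotting. This is a sketch of the shape only; the exact column types gglite produces (for example, whether `time` is numeric or date-like) are not confirmed here.

```{r}
# Approximate the univariate conversion: one row per observation.
uni = data.frame(
  time = as.numeric(time(sunspot.year)),
  value = as.numeric(sunspot.year)
)
head(uni, 3)

# Approximate the multivariate conversion: stack the series into long format.
m = EuStockMarkets
long = data.frame(
  time = rep(as.numeric(time(m)), times = ncol(m)),
  series = rep(colnames(m), each = nrow(m)),
  value = as.numeric(m)  # column-major order matches the `series` labels
)
head(long, 3)
```

A data frame shaped like `long` has one row per time point and series, which is the form a `color = ~ series` mapping expects.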
## Scales
Scales control how data values map to visual properties. Use helpers like
`scale_x()`, `scale_y()`, and `scale_color()` to configure scales:
```{r}
g2(mtcars, hp ~ mpg, color = ~ wt) |>
  scale_y(type = 'log') |>
  scale_color(palette = 'viridis')
```
Custom domain and range:
```{r}
g2(mtcars, hp ~ mpg) |>
  scale_x(domain = c(10, 35)) |>
  scale_y(domain = c(0, 400))
```
## Coordinates
Coordinate systems change how positional encodings are interpreted. gglite
supports Cartesian (default), polar, theta, and radial coordinates.
### Polar coordinates (rose chart)
```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x, color = ~ x) |>
  mark_interval() |>
  coord_polar()
```
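### Theta coordinates (pie chart)
This example reuses `df` from the previous chunk. Setting `innerRadius` leaves
the center empty, so the pie is drawn as a donut: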
```{r}
g2(df, y ~ x, color = ~ x) |>
  mark_interval() |>
  transform('stackY') |>
  coord_theta(innerRadius = 0.5)
```
### Transposing axes
`coord_transpose()` swaps x and y (similar to ggplot2's `coord_flip()`):
```{r}
g2(df, y ~ x) |>
  mark_interval() |>
  coord_transpose()
```
## Transforms
Transforms modify the data before rendering. Use `transform()` to apply
statistical or layout transforms. When using `+`, the first argument must be
unnamed; see `?transform.g2` for details.
### Stacked bars
```{r}
df = data.frame(
  x = rep(c('A', 'B', 'C'), each = 2), y = c(3, 2, 5, 4, 1, 6),
  color = rep(c('a', 'b'), 3)
)
g2(df, y ~ x, color = ~ color) |>
  mark_interval() |>
  transform('stackY')
```
### Dodged bars
```{r}
g2(df, y ~ x, color = ~ color) |>
  mark_interval() |>
  transform('dodgeX')
```
### Stacked area chart
```{r}
df = data.frame(
  x = rep(1:5, 2), y = c(3, 1, 4, 1, 5, 2, 7, 1, 8, 3),
  group = rep(c('A', 'B'), each = 5)
)
g2(df, y ~ x, color = ~ group) |>
  mark_area() |>
  transform('stackY')
```
## Facets
Faceting splits data into panels. Use `facet_rect()` for a grid layout:
```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  facet_rect(~ Species)
```
The formula interface supports faceting with `|`. Use `| var` for column facets,
`| 0 + var` for row facets, and `| var1 + var2` for both:
```{r}
# Column facet: panels arranged in columns by species
g2(penguins, bill_len ~ bill_dep | species)
```
```{r}
# Row facet: panels arranged in rows by island
g2(penguins, bill_len ~ bill_dep | 0 + island)
```
```{r}
# Both: columns by species, rows by island
g2(penguins, bill_len ~ bill_dep | species + island)
```
## Themes
Themes change the overall look. Built-in themes include `theme_classic()`
(default), `theme_classic_dark()`, `theme_light()`, `theme_dark()`, and
`theme_academy()`:
```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  theme_academy()
```
## Components
Components are the non-data elements of a chart: titles, tooltips, axes,
legends, and labels.
### Titles
```{r}
g2(mtcars, hp ~ mpg) |>
  titles('Motor Trend Cars', subtitle = 'mpg vs horsepower')
```
### Tooltips
```{r}
g2(sunspot.year) |> tooltip(crosshairs = TRUE)
```
### Labels
Use `labels()` to add text annotations. When using `+`, the first argument
must be unnamed; see `?labels.g2` for details.
```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x) |>
  mark_interval() |>
  labels(text = ~ y)
```
## Interactions
Interactions add user-driven behaviors like hovering, brushing, and filtering:
```{r}
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  interact('tooltip') |>
  interact('legendFilter') |>
  interact('brushHighlight')
```
## Putting it all together
Here is a more complete example combining several grammar layers:
```{r}
df = data.frame(
  x = rep(c('Q1', 'Q2', 'Q3', 'Q4'), each = 2),
  y = c(120, 80, 150, 90, 180, 110, 200, 130),
  product = rep(c('Widget', 'Gadget'), 4)
)
g2(df, y ~ x, color = ~ product) |>
  mark_interval() |>
  transform('dodgeX') |>
  scale_color(range = c('#5470c6', '#91cc75')) |>
  titles('Quarterly Sales', subtitle = 'By product line') |>
  interact('tooltip') |>
  interact('elementHighlightByX') |>
  theme_classic()
```
## Using `+` instead of `|>`
If you are used to ggplot2, you can replace `|>` with `+`. Both operators
produce identical charts:
```{r}
# Pipe style
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) |>
  scale_color(palette = 'set2') |>
  titles('Iris Dataset')
```
```{r}
# ggplot2 style
g2(iris, Sepal.Length ~ Sepal.Width, color = ~ Species) +
  scale_color(palette = 'set2') +
  titles('Iris Dataset')
```
You can mix modifiers freely---marks, scales, coordinates, themes, facets,
transforms, and components all work with `+`:
```{r}
df = data.frame(x = c('A', 'B', 'C', 'D'), y = c(3, 7, 2, 5))
g2(df, y ~ x, color = ~ x) +
  mark_interval() +
  coord_polar() +
  titles('Polar Bar Chart') +
  theme_academy()
```
You can even freely mix `+` and `|>` in the same expression---due to R's
operator precedence (`|>` binds tighter than `+`), any combination produces
the same result:
```{r}
# These two are equivalent:
g2(mtcars, hp ~ mpg) |>
  scale_x(type = 'log') + theme_dark()
g2(mtcars, hp ~ mpg) +
  scale_x(type = 'log') + theme_dark()
```
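The precedence rule can be checked without drawing anything. R (4.1 or later) rewrites the native pipe when the code is parsed, so quoting an expression shows the call that would be evaluated. In this sketch, `x`, `f()`, and `g()` are placeholder names, not gglite functions, and nothing is actually called.

```{r}
# quote() captures the parsed expression; deparse() turns it back into text.
first = deparse(quote(x |> f() + g()))
second = deparse(quote(x + f() |> g()))
first
second
```

The first result shows the pipe being applied before `+`, so `x |> f() + g()` is read as `f(x) + g()`. In the second, the pipe only connects the two calls on either side of it.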