Visualize relationships between numeric variables and categorical groupings using parallel coordinate plots.

Usage

ggparallel(
  data,
  col_id = NULL,
  col_colour = NULL,
  highlight = NULL,
  interactive = TRUE,
  order_columns_by = c("appearance", "random", "auto"),
  order_observations_by = c("frequency", "original"),
  verbose = TRUE,
  palette_colour = palette.colors(palette = "Set2"),
  palette_highlight = c("red", "grey90"),
  convert_binary_numeric_to_factor = TRUE,
  scaling = c("uniminmax", "none"),
  return = c("plot", "data"),
  options = ggparallel_options()
)

Arguments

data

A data frame containing the variables to plot.

col_id

The name of the column to use as an identifier. If NULL, artificial IDs will be generated based on row numbers. (character)

col_colour

Name of the column to use for coloring lines in the plot. If NULL, no coloring is applied. (character)

highlight

A level from col_colour to emphasize in the plot. Ignored if col_colour is not set. (character)

interactive

Whether to produce an interactive ggiraph visualisation. (flag)

order_columns_by

Strategy for ordering columns in the plot. Options include:

  • "appearance": Order columns by their order in data (default).

  • "random": Randomly order columns.

  • "auto": Automatically order columns based on context:

    • If highlight is set, columns are ordered to maximize separation between the highlighted level and all others, using mutual information.

    • If col_colour is set but highlight is not, columns are ordered based on mutual information with all classes in col_colour.

    • If neither highlight nor col_colour is set, columns are ordered to minimize the estimated number of crossings, using a repetitive nearest neighbour approach with two-opt refinement.
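To make the mutual-information ordering concrete, here is a minimal base R sketch. It is not the package's implementation: the helper `mutual_info()`, the equal-width binning, and the use of the built-in `iris` data are all assumptions made for illustration. It scores each numeric column against a class label and sorts columns from most to least informative.

```r
# Hypothetical helper (not part of the package): mutual information, in nats,
# between a numeric column discretised into equal-width bins and a label.
mutual_info <- function(x, label, bins = 5) {
  keep <- !is.na(x) & !is.na(label)
  x_binned <- cut(x[keep], breaks = bins)
  joint <- table(x_binned, label[keep]) / sum(keep)  # joint probabilities
  expected <- outer(rowSums(joint), colSums(joint))  # product of the marginals
  nonzero <- joint > 0                               # skip empty cells (0 * log 0 = 0)
  sum(joint[nonzero] * log(joint[nonzero] / expected[nonzero]))
}

# Score every numeric column against all classes (the col_colour-only case).
numeric_cols <- names(iris)[sapply(iris, is.numeric)]
mi <- sapply(numeric_cols, function(col) mutual_info(iris[[col]], iris$Species))
ordered_cols <- names(sort(mi, decreasing = TRUE))

# One-vs-rest variant (the highlight case): compare one level to all others.
mi_setosa <- sapply(
  numeric_cols,
  function(col) mutual_info(iris[[col]], iris$Species == "setosa")
)
```

The binning choice affects the scores, so a different discretisation may produce a different column order.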

order_observations_by

Strategy for ordering lines in the plot. Options include:

  • "frequency": Draw the largest groups first (default).

  • "original": Preserve the original order in data.

Ignored if highlight is set.
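As an illustration of what "largest groups first" means, the sketch below sorts observations by the size of the group they belong to, so that members of smaller groups come later in the draw order and end up on top. The toy vector `groups` is invented, and this is only a plausible reading of the option, not the package's internal code.

```r
# Toy grouping vector (invented for illustration).
groups <- c("a", "b", "b", "c", "b", "a")

# Size of each group, looked up for every observation.
group_sizes <- table(groups)
size_per_row <- as.integer(group_sizes[groups])

# Largest groups first; ties keep their original relative order.
draw_order <- order(-size_per_row)
groups[draw_order]
#> [1] "b" "b" "b" "a" "a" "c"
```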

verbose

Logical; whether to display informative messages during execution. (default: TRUE)

palette_colour

A named vector of colors for categorical levels in col_colour. (default: Set2 palette)

palette_highlight

A two-color vector for highlighting: the first color is used for the highlighted level, the second for all other levels. (default: c("red", "grey90"))

convert_binary_numeric_to_factor

Logical; whether to convert numeric columns containing only 0, 1, and NA to factors. (default: TRUE)
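The conversion rule described above can be sketched in base R as follows. The helper name `is_binary_numeric()` and the toy data frame are assumptions for illustration, not the package's own code.

```r
# Hypothetical helper: TRUE for numeric vectors containing only 0, 1, and NA.
is_binary_numeric <- function(x) {
  is.numeric(x) && all(x %in% c(0, 1, NA))
}

# Toy data: `flag` is binary, `size` is not.
df <- data.frame(flag = c(0, 1, NA, 1), size = c(0.5, 1, 2, 3))

# Convert only the binary numeric columns to factors.
binary_cols <- vapply(df, is_binary_numeric, logical(1))
df[binary_cols] <- lapply(df[binary_cols], factor)

str(df)
```

Note that under this sketch a numeric column consisting entirely of NA would also count as binary; how the package treats that edge case is not stated here.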

scaling

Method for scaling numeric variables. Options include:

  • "uniminmax": Rescale each variable to range [0, 1] (default).

  • "none": No rescaling. Use raw values.
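For reference, min-max rescaling to [0, 1] can be sketched in base R as below. The function name `rescale_minmax()` is hypothetical, and the handling of constant columns (mapped to 0.5) is an assumption of this sketch rather than documented package behaviour.

```r
# Hypothetical helper: rescale a numeric vector to [0, 1], ignoring NAs
# when computing the range and leaving NAs in place.
rescale_minmax <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # Assumption: a constant column has no spread, so place it mid-axis.
    return(ifelse(is.na(x), NA_real_, 0.5))
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

rescale_minmax(c(10, 20, NA, 30))
#> [1] 0.0 0.5  NA 1.0
```

Rescaling puts variables with very different units on a common axis, which is why raw values ("none") are mainly useful when the columns already share a scale.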

return

What to return. Options include:

  • "plot": Return the ggplot object (default).

  • "data": Return the processed data used for plotting.

options

A list of additional visualization parameters created by ggparallel_options().

Value

A ggplot object or a processed data frame, depending on the return parameter.

Examples

ggparallel(
  data = minibeans,
  col_colour = "Class",
  order_columns_by = "auto"
)
#>  Ordering columns based on mutual information with [Class]
#>  Making plot interactive since `interactive = TRUE`
[Plot: parallel coordinate plot of minibeans coloured by Class. Axis order: Perimeter, Convex Area, Area, Equiv Diameter, Shape Factor2, Major Axis Length, Compactness, Shape Factor3, Aspect Ratio, Eccentricity, Roundness, Minor Axis Length, Shape Factor1, Shape Factor4, Solidity, Extent. Classes: DERMASON, SIRA, SEKER, HOROZ, CALI, BARBUNYA, BOMBAY.]

ggparallel(
  data = minibeans,
  col_colour = "Class",
  highlight = "DERMASON",
  order_columns_by = "auto"
)
#> Ordering columns based on how well they differentiate 1 group from the rest [DERMASON] (based on mutual information)
#> Making plot interactive since `interactive = TRUE`
[Plot: parallel coordinate plot of minibeans with DERMASON highlighted. Axis order: Area, Equiv Diameter, Convex Area, Perimeter, Minor Axis Length, Shape Factor1, Major Axis Length, Shape Factor2, Roundness, Compactness, Shape Factor3, Aspect Ratio, Eccentricity, Shape Factor4, Solidity, Extent. Classes: SIRA, SEKER, HOROZ, CALI, BOMBAY, BARBUNYA, DERMASON.]

# Customise appearance using options argument
ggparallel(
  data = minibeans,
  col_colour = "Class",
  order_columns_by = "auto",
  options = ggparallel_options(show_legend = FALSE)
)
#> Ordering columns based on mutual information with [Class]
#> Making plot interactive since `interactive = TRUE`
[Plot: parallel coordinate plot of minibeans coloured by Class, legend hidden. Axis order: Perimeter, Convex Area, Area, Equiv Diameter, Shape Factor2, Major Axis Length, Compactness, Shape Factor3, Aspect Ratio, Eccentricity, Roundness, Minor Axis Length, Shape Factor1, Shape Factor4, Solidity, Extent.]