Annotation Taxonomy in Grouped Bar Charts · Information Visualization 2025

01 · Overview

Research Overview

This study evaluates how visualization students annotate grouped bar charts when answering high-level analytical questions. Through qualitative coding, we identify a taxonomy of annotation types and characterize how different types support specific low-level analytic tasks.

Taxonomy. We coded annotations from 20 student submissions and identified five primary annotation types: enclosure, connector, text, mark, and color, plus a residual category for special symbols. Each type is mapped to the low-level analytic tasks it was used to support.
Ensemble Annotations. We found that participants frequently combined types into ensembles, including 2-, 3-, and 4-annotation combinations for a single task. We characterize one-way and two-way dependency relationships between the constituent types.
Design Space Observations. We report cross-cutting patterns: within-subject consistency in type selection, task-driven ensemble use (sort always required ensembles), and annotation strategies specific to non-chronological data ordering.

02 · Study Design

Methodology

We used a course assignment in a mixed undergraduate/graduate data visualization course at the University of South Florida (Spring 2022). Students annotated three grouped bar charts based on the Georgia Department of Public Health (GDPH) COVID-19 visualization to support answering 12 high-level questions.

Enrolled students: 39
Completed: 38
Submissions analyzed: 20
Charts per submission: 3
Total questions: 12
Annotation types: 5
Avg. annotations / student: ~45

Participants & Assignment

We recruited students from an Interactive Data Visualization elective at the University of South Florida (USF), with 21 undergraduate and 18 graduate students (39 total). The assignment was given halfway through the semester, before any lecture on bar charts or annotations.

Each student received three grouped bar charts with four high-level questions each (12 total). They were asked to annotate each chart to make the questions easy to answer without writing the answer directly. Any tool was permitted; work was done individually outside class.

Inclusion: Of 38 completed submissions, 13 had no annotations and 5 had minimal unrelated marks. We analyzed the remaining 20 submissions.

Low-Level Analytic Tasks

Students also identified which low-level tasks each question required. These tasks form the axis of the taxonomy: each annotation type is characterized by the tasks it supports.

Retrieve Value (RV): look up a specific data value.
Filter: identify a subset satisfying a condition.
Compute Derived Value (CDV): calculate a value from multiple points.
Find Extremum (FE): identify the maximum or minimum in a set.
Sort: rank data items by value.

Data Coding

Two coders independently labeled each annotation, starting from a prior five-type taxonomy and refining iteratively.

Step 1 · Labeling

Each annotation received a type label and a task label.

Step 2 · Ensembles

Co-occurring annotations for one task were coded as ensembles, with constituent count (2, 3, or 4) and dependency type (one-way or two-way) recorded.

Step 3 · Resolution

Coder disagreements were resolved through discussion after each round.
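The coding steps above can be pictured as a small record structure. The following is a minimal sketch, not the coders' actual tooling: the class names, the `task_instance` key, and the sample records are all hypothetical and introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class AnnotationType(Enum):
    ENCLOSURE = "enclosure"
    CONNECTOR = "connector"
    TEXT = "text"
    MARK = "mark"
    COLOR = "color"
    OTHER = "other"  # residual category for special symbols


class Task(Enum):
    RV = "retrieve value"
    FILTER = "filter"
    CDV = "compute derived value"
    FE = "find extremum"
    SORT = "sort"


@dataclass(frozen=True)
class CodedAnnotation:
    task_instance: str    # hypothetical key: one task on one chart of one submission
    kind: AnnotationType  # Step 1: type label
    task: Task            # Step 1: task label


def find_ensembles(annotations):
    """Step 2: collect the distinct types used per task instance and
    keep only the instances that combine two or more types."""
    types_by_instance = defaultdict(set)
    for a in annotations:
        types_by_instance[a.task_instance].add(a.kind)
    return {k: v for k, v in types_by_instance.items() if len(v) >= 2}


# Invented records for illustration; not data from the study.
records = [
    CodedAnnotation("s01-c1-q2", AnnotationType.CONNECTOR, Task.SORT),
    CodedAnnotation("s01-c1-q2", AnnotationType.TEXT, Task.SORT),
    CodedAnnotation("s01-c1-q3", AnnotationType.ENCLOSURE, Task.FILTER),
]
ensembles = find_ensembles(records)
for instance, kinds in ensembles.items():
    # The constituent count is the number of distinct types in the group.
    print(instance, len(kinds), sorted(k.value for k in kinds))
```

In this sketch the dependency type (one-way or two-way) is left out, since it was a judgment made by the coders rather than something derivable from the labels alone.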

Chart Conditions

We used two bar ordering conditions to investigate how chart structure affects annotation strategy.

Chronological (Chrono): Bars ordered chronologically by date.
Non-chronological (Non-C): Bars ordered highest to lowest, replicating the original GDPH chart.

This produced 31 chronological and 29 non-chronological charts across the 20 analyzed submissions.
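The sample sizes reported in this section are mutually consistent, which a few lines of arithmetic can confirm. The numbers are taken from the text; the variable names are illustrative assumptions.

```python
# Counts reported in the text; variable names are illustrative.
completed = 38
no_annotations = 13
minimal_unrelated_marks = 5
charts_per_submission = 3

# Submissions left after both exclusion criteria are applied.
analyzed = completed - no_annotations - minimal_unrelated_marks
total_charts = analyzed * charts_per_submission

chronological, non_chronological = 31, 29

print(analyzed)      # 20
print(total_charts)  # 60
print(chronological + non_chronological == total_charts)  # True
```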

Figure 2 · From the Paper

Student-Annotated Bar Charts

Three representative submissions showing how differently students annotated the same kind of grouped bar chart. The notes below point to the exact boxes, arrows, labels, and highlights visible in each figure.

Student annotation example (a): freehand rectangles, arrows, and text annotations on the GDPH COVID-19 grouped bar chart

Enclosure Connector Text

Large blue brackets, arrows, and handwritten notes are drawn directly on the chart. The long bracket under Apr 30-May 2 and the smaller boxes around May 3 and May 8 show which dates to focus on, while the arrows and notes like #, sort, and last explain what to compare or retrieve.

Fig. 2(a) · Rahman et al. 2025

Student annotation example (b): ellipses, rectangles, reference lines, and text annotations

Enclosure Mark Text

This example combines circles, guide lines, and handwritten questions. The red and green circles under the x-axis pick out specific dates, the horizontal lines around 40 and 48 mark thresholds to check against, and the box around May 8 isolates the bar group the student wants to sort.

Fig. 2(b) · Rahman et al. 2025

Student annotation example (c): colored rectangular enclosures, data value labels, and a text legend

Enclosure Color Text Mark

This student uses clean color-coded boxes instead of freehand marks. Matching boxes at the top and bottom highlight Apr 30-May 2, May 3, and May 5, the numbers above selected bars give exact values, and the note at May 9 states the main takeaway.

Fig. 2(c) · Rahman et al. 2025

03 · Taxonomy

Five Annotation Categories

Five annotation types were identified. Usage counts reflect instances observed across all 20 analyzed submissions. Primary tasks are those most frequently associated with each type; secondary tasks were observed less often.

Enclosure

144 instances across all submissions

Shapes surrounding one or more data elements. Ellipses and rectangles were most common; brackets marked axis-aligned ranges. We found enclosures to be self-sufficient: a rectangle around a bar can communicate a filter or RV task without requiring a second type.

Subtypes

Ellipse Rectangle Half-box Bracket

Tasks supported

RV Filter CDV › FE ›


Connector

201 instances · most frequently used type

Lines and arrows. Lines marked bar heights or trends; arrows pointed from text or enclosures to specific bars. We found that connectors almost never appeared alone; without an anchoring type, a connector has no referent and cannot communicate a task.

Subtypes

Line (undirected) Arrow (directional)

Tasks supported

RV Filter FE › Sort ›


Text

156 instances across all submissions

Words, phrases, and sentences. We identified three subtypes: description (explanatory prose about a task or derivation), value (a specific numeric label), and legend (a label for annotation-based groupings, often paired with color). Text was the only type we observed across all five tasks.

Subtypes

Description Value Legend

Tasks supported

RV Filter CDV FE Sort


Mark

33 instances · least frequent primary type

Small, non-enclosing symbols placed on or near data elements to indicate them without surrounding them. We observed checkmarks, circles, underlines, T-shapes, and Roman numerals. Marks were the least-used type despite functional overlap with enclosures for RV and filter tasks.

Subtypes

Checkmark Small circle Underline T-shape Roman numeral / letter

Tasks supported

RV Filter CDV ›


Color

194 instances · second most frequent type

Color modifications applied to annotations or chart elements. We found two subtypes: highlight (translucent overlay, filter tasks only) and hue variation (distinct color to differentiate groups or questions). We did not observe color used for Sort tasks.

Subtypes

Highlight (overlay) Hue variation

Tasks supported

RV Filter CDV FE


Secondary tasks (observed less often) are marked with ›; all other listed tasks are primary.
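The type-to-task mapping in the five cards above can be written down as a lookup table. This is a minimal sketch: the task sets are transcribed from the "Tasks supported" lines (tasks followed by › are treated as secondary), while the dictionary layout and the `types_supporting` helper are hypothetical.

```python
# Task sets transcribed from the cards; structure and names are illustrative.
PRIMARY = {
    "enclosure": {"RV", "Filter"},
    "connector": {"RV", "Filter"},
    "text": {"RV", "Filter", "CDV", "FE", "Sort"},
    "mark": {"RV", "Filter"},
    "color": {"RV", "Filter", "CDV", "FE"},
}
SECONDARY = {
    "enclosure": {"CDV", "FE"},
    "connector": {"FE", "Sort"},
    "text": set(),
    "mark": {"CDV"},
    "color": set(),
}
ALL_TASKS = {"RV", "Filter", "CDV", "FE", "Sort"}


def types_supporting(task, include_secondary=True):
    """Return the annotation types observed for a task, sorted by name."""
    result = []
    for name in PRIMARY:
        tasks = PRIMARY[name] | (SECONDARY[name] if include_secondary else set())
        if task in tasks:
            result.append(name)
    return sorted(result)


print(types_supporting("Sort"))                           # ['connector', 'text']
print(types_supporting("Sort", include_secondary=False))  # ['text']

# Types whose primary and secondary tasks together cover all five tasks.
covers_all = [n for n in PRIMARY if PRIMARY[n] | SECONDARY[n] == ALL_TASKS]
print(covers_all)  # ['text']
```

The last query reproduces the observation that text was the only type seen across all five tasks, and the first shows that color does not appear for Sort.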

04 · Ensembles

Ensemble Annotations

An ensemble is two or more annotation types used together for a single task instance. Ensembles arose when individual annotations were insufficient, either because the task was complex (e.g., sort) or because visual clutter required spatial separation between an annotation and its referent.

Ensembles were divided into three categories by the number of constituent annotation types: 2-annotation, 3-annotation, and 4-annotation. Within 2-annotation ensembles, we further distinguish one-way dependency (one type can stand alone; the other cannot) from two-way dependency (neither type is meaningful without the other for that task).

One-way: One annotation type can independently convey the task; the second type complements it for clarity (e.g., a half-box can filter on its own, but adding text makes the intent explicit).
Two-way: Neither type alone is sufficient; both are required (e.g., arrow + text for a sort task: without the arrow, the text has no directional reference; without the text, the arrow has no label).
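The one-way and two-way definitions reduce to a simple rule over whether each constituent type could convey the task alone. The sketch below is an illustration under that reading: the function name and boolean inputs are hypothetical, and whether a type "stands alone" for a given task was a coder judgment, not something computed.

```python
from typing import Optional


def classify_dependency(first_standalone: bool, second_standalone: bool) -> Optional[str]:
    """Classify a 2-annotation ensemble from per-type standalone judgments."""
    if not first_standalone and not second_standalone:
        return "two-way"   # neither type is sufficient alone
    if first_standalone != second_standalone:
        return "one-way"   # exactly one type can stand alone
    return None            # both standalone: a case the text does not describe


# Half-box + text for a filter task: the half-box can filter on its own.
print(classify_dependency(True, False))   # one-way

# Arrow + text for a sort task: neither is meaningful alone.
print(classify_dependency(False, False))  # two-way
```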

Connector + Color · Most common

Colored lines or arrows used to filter or point to bars of interest. A common pattern: colored connectors filtered data before a CDV computation was performed on the selected subset.

RV Filter CDV FE

Connector + Text · Two-way

An arrow or line paired with explanatory text. Neither type alone conveys the full task: the connector provides directionality, the text provides the label or instruction. Frequent for RV and filter tasks.

RV Filter

Enclosure + Text · One-way

A shape (e.g., half-box) marking a region of interest, paired with text that clarifies the task intent. The enclosure is independent; the text provides supplemental specificity. Used heavily for CDV and filter tasks.

Filter CDV

Enclosure + Other · One-way

A bracket or half-box accompanied by a special symbol (e.g., # to signal computation). The enclosure marks the range; the symbol indicates that a calculation must be performed on the enclosed data. Used for CDV tasks.

CDV

3-annotation ensembles were used for compound tasks requiring multiple layers of communication, typically identifying a data region, pointing to it, and labeling or explaining it simultaneously.

Enclosure + Connector + Text · Most common

The most frequently observed 3-annotation ensemble. A shape marks a region, a connector (often an arrow) links it to another element or provides directionality, and text labels or explains the intent.

RV Filter FE Sort

Connector + Text + Color · Observed

A colored connector with accompanying text. Color disambiguates which connector belongs to which task or question when multiple connectors are present on the same chart.

RV FE

Text + Mark + Color · Observed

A colored mark (e.g., colored Roman numeral) paired with text. Color encodes question identity; mark and text together describe what computation to perform and on which data item.

CDV

Text + Color + Mark · Observed

A variant of the above, used primarily for filter tasks. Color groups items; mark and text together identify and label the filtered subset.

Filter

4-annotation ensembles were rare and arose in particularly complex task situations, typically when a task involved multiple data regions, required both grouping and retrieval, and needed a legend to disambiguate the different annotation roles.

Enclosure + Connector + Text + Color · Example

Four annotation types operating on the same task instance. A colored enclosure groups a data region; a connector links it spatially; text labels the task or result; color differentiates multiple co-occurring instances (e.g., separate question groups on the same chart). This combination was observed for filter and CDV tasks involving multiple date ranges or groups that required visual separation to remain legible.

Filter CDV

05 · Findings

Key Findings

Patterns observed across the 20 coded submissions, organized by annotation usage, ensemble behavior, and task-specific constraints.

Usage was balanced across four types

Four types saw heavy use: connector (201), color (194), text (156), and enclosure (144). Mark was the outlier at 33 instances. The comparable usage across these four types suggests that participants chose among several viable options by personal preference rather than following a dominant convention.
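To make the comparison concrete, each count can be expressed relative to the most-used type. The counts are those reported above; the normalization to the maximum is one illustrative choice among several.

```python
# Instance counts reported in the text.
counts = {"enclosure": 144, "connector": 201, "text": 156, "mark": 33, "color": 194}

# Scale each count by the largest one (connector).
peak = max(counts.values())
relative = {name: n / peak for name, n in counts.items()}

# Print types from most to least used.
for name, share in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {counts[name]:>3}  {share:.2f}")
```

On this scale the four heavily used types all fall above 0.7, while mark sits well below 0.2.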

Connectors almost never appeared alone

Connectors were the most-used type overall, yet rarely appeared independently. Without an anchoring type (text, enclosure, or color), a connector cannot communicate which element is referenced or what task is intended. The high connector count reflects ensemble prevalence, not standalone use.

Sort always required ensembles

We found no standalone annotation sufficient for sort tasks. Every sort-task instance in our data was an ensemble. Arrow + text (two-way dependent) was the dominant pattern: the arrow conveys direction; the text identifies the items being ranked.

Color was absent from sort tasks

We observed color across RV, filter, CDV, and FE tasks, but not Sort. Color encodes grouping or identity, not order or direction, so it cannot satisfy the directional cues that sort tasks require.

Participants used a stable personal type subset

We observed within-subject consistency: most participants reused the same limited set of annotation types across all three charts. One used only rectangle, text, and color; another used only rectangle, text, and mark. Type selection appeared to be driven by individual vocabulary rather than task-specific optimization.

Non-chronological ordering required workarounds

Date-range questions were harder to annotate on non-chronological charts because the relevant dates were not adjacent. Participants responded in four ways: no annotation (7 cases), separate enclosures per date (8 cases), individual marks per date (8 cases), and ensembles (2 cases). On chronological charts, a single spanning enclosure was sufficient.
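The adjacency problem can be illustrated by counting how many separate groups of bars a date range occupies under each ordering. This is a sketch with invented data: the function name is hypothetical, and the non-chronological order below is made up rather than taken from the GDPH chart.

```python
def contiguous_runs(bar_order, targets):
    """Count maximal runs of adjacent bars whose labels are in targets."""
    targets = set(targets)
    runs = 0
    in_run = False
    for label in bar_order:
        if label in targets:
            if not in_run:
                runs += 1  # a new group of adjacent target bars starts here
            in_run = True
        else:
            in_run = False
    return runs


chronological = ["Apr 28", "Apr 29", "Apr 30", "May 1", "May 2", "May 3", "May 4"]
# Hypothetical value-sorted order; the real GDPH ordering is not reproduced here.
non_chronological = ["May 1", "Apr 28", "May 4", "Apr 30", "Apr 29", "May 2", "May 3"]
date_range = ["Apr 30", "May 1", "May 2"]

print(contiguous_runs(chronological, date_range))      # 1
print(contiguous_runs(non_chronological, date_range))  # 3
```

One run corresponds to a single spanning enclosure; several runs correspond to the per-date enclosures or marks that participants fell back on.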

06 · Citation

Cite This Work

@article{rahman2025annotationtaxonomy,
  title   = {Exploring annotation taxonomy in grouped bar charts:
             {A} qualitative classroom study},
  author  = {Rahman, Md Dilshadur and Quadri, Ghulam Jilani
             and Szafir, Danielle Albers and Rosen, Paul},
  journal = {Information Visualization},
  volume  = {24},
  number  = {1},
  pages   = {79--94},
  year    = {2025},
  publisher = {SAGE},
  doi     = {10.1177/14738716241270247}
}