This document is the Tableau implementation of the more general, conceptual discussion regarding data visualizations of relationships.
Categorical Variables
Define
Data analysis, regardless of the data analysis system, always begins with the identification and definition of categorical variables.
Before analysis begins, first define all categorical variables, what Tableau calls “dimensions”.
Tableau automatically classifies a variable as categorical only if its data values are text. Define the remaining categorical variables yourself as needed: drag each categorical variable for analysis to the list of dimensions, order the levels, and attach meaningful labels to the levels. The visualization attaches a numerical value to each group, where a group is one combination of levels of the categorical variables.
Stacked Bar Chart
For more specific guidance, assume a vertical bar chart. Switch the column and row orientation for a horizontal bar chart.
- Select x-axis variable: Drag a categorical variable (dimension) to the Columns shelf.
- Select y-axis variable: Drag a continuous variable (measure) to the Rows shelf, which results in a bar chart of one categorical variable.
  - If the y-axis variable is the pre-defined Count variable, then the aggregation is CNT.
  - If the y-axis is another variable, then Tableau defaults to aggregating the SUM of that variable for each level of the categorical variable on the x-axis. You may want to change that aggregation to the AVG.
- Add the second categorical variable: Drag another categorical variable to the Color mark.
- Add labels to the bars [optional]: Select the numerical variable and drag to the Label mark.
  - Tableau will again default to the SUM aggregation for the labels, even if the aggregation on the bar chart y-axis is AVG.
  - Usually change the aggregation for the labels to match the aggregation on the y-axis for consistency.
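The difference between the CNT, SUM, and AVG aggregations can be illustrated outside of Tableau. The following is a minimal Python sketch, not Tableau code: the data rows and the names `rows` and `aggregate` are hypothetical, chosen only to show what value gets attached to each group (each combination of levels of the two categorical variables).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data rows: (Dept, Gender, Salary).
# Dept plays the role of the x-axis dimension, Gender the Color dimension,
# and Salary the measure on the y-axis.
rows = [
    ("Sales", "F", 50.0),
    ("Sales", "M", 70.0),
    ("Sales", "F", 60.0),
    ("Admin", "M", 40.0),
    ("Admin", "F", 80.0),
]

def aggregate(rows, how):
    """Apply the aggregation function `how` to the measure within each group."""
    groups = defaultdict(list)
    for dept, gender, salary in rows:
        groups[(dept, gender)].append(salary)
    return {key: how(values) for key, values in groups.items()}

counts = aggregate(rows, len)   # analogous to CNT
sums = aggregate(rows, sum)     # analogous to SUM, the default
avgs = aggregate(rows, mean)    # analogous to AVG

for key in sorted(sums):
    print(key, counts[key], sums[key], avgs[key])
```

With more than one row in a group, SUM and AVG differ, which is why labels left at the default SUM can disagree with bars that display AVG.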
Figure 1 shows the resulting stacked bar chart with the specified Marks parameters.
Unstacked Bar Chart
First, create the stacked bar chart.
To unstack the bars, drag the second categorical variable over to the Columns shelf (or whatever shelf contains the first categorical variable).
Figure 2 shows the resulting unstacked bar chart with the specified Marks parameters.
100% Stacked Bar Chart
First create the regular stacked bar chart, except leave the aggregation of the numerical variable as SUM.
To convert to a 100% stacked bar chart:
- Right-click on the numerical variable name in the respective shelf
- Select Quick Table Calculation
- In the drop-down menu, select Percent of Total
- Again, right-click on the numerical variable name
- Select Compute Using
- In the drop-down menu, select Cell
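The arithmetic behind these steps can be sketched in Python. This is an illustration under assumptions, not Tableau's implementation: it assumes that computing Percent of Total by cell divides each segment's SUM by the total of the bar that contains it, so the segments of every bar add to 100%. The values and the names `sums` and `percent_within_bar` are hypothetical.

```python
from collections import defaultdict

# Hypothetical SUM of the measure for each (x-axis level, color level) group.
sums = {
    ("Sales", "F"): 110.0,
    ("Sales", "M"): 70.0,
    ("Admin", "F"): 80.0,
    ("Admin", "M"): 40.0,
}

def percent_within_bar(sums):
    """Express each segment as a percentage of its own bar's total."""
    bar_totals = defaultdict(float)
    for (bar, _segment), value in sums.items():
        bar_totals[bar] += value
    return {
        key: 100.0 * value / bar_totals[key[0]]
        for key, value in sums.items()
    }

percents = percent_within_bar(sums)

for key in sorted(percents):
    print(key, round(percents[key], 1))
```

The sketch starts from sums because a percentage of a total is defined on additive quantities, which is consistent with leaving the aggregation as SUM before applying the table calculation.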
Figure 3 shows the resulting 100% stacked bar chart with the specified Marks parameters.
Treemap
One way to proceed is to first create the regular stacked bar chart.
Then, select the Treemap icon on the Show Me panel.
The boxes are shaded according to the value of the numerical variable for each box.
Figure 4 shows the resulting treemap with the specified Marks parameters. In this example, the aggregation is the count of the number of occurrences in each group.
Bubble Plot
One way to proceed is to first create the regular stacked bar chart.
Then select the Bubble plot icon on the Show Me panel. Usually, change the aggregation of the numerical variable to AVG.
Figure 5 shows the resulting bubble plot with the specified Marks parameters.
The link to the video of examples of these processes follows.
Video: Two categorical variables. [4:25]
Continuous Variables
Two-Variable Scatterplot
- Select x-axis variable: Drag one variable, a measure, to the Columns shelf.
- Select y-axis variable: Drag the other variable, also a measure, to the Rows shelf.
- Disaggregate: On the Main Menu at the top of the screen, select Analysis, then select Aggregate Measures, which will uncheck the menu option, turning off aggregation.
- Fit line: Select the Analytics tab, next to the Data tab at the top of the list of variables. Under Model, select the misnamed Trend line option (misnamed because “trend” as generally understood applies to an orientation over time). Then, if a linear trend line is desired, choose the displayed Linear option.
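Two ideas in the steps above can be illustrated with a short Python sketch: why disaggregation is needed, and what a linear fit computes. The sketch assumes the fitted line is an ordinary least-squares line, which may differ in detail from what Tableau computes. The data and the name `fit_line` are hypothetical.

```python
# Hypothetical paired data values, one (x, y) pair per data row.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# With aggregation left on, each measure collapses to a single value,
# so the plot would contain only one point.
aggregated_point = (sum(xs), sum(ys))

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# With aggregation off, every row is its own point and a line can be fit.
slope, intercept = fit_line(xs, ys)
print(aggregated_point, slope, intercept)
```

The single aggregated point shows why the scatterplot is uninformative until Aggregate Measures is unchecked.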
Figure 6 shows the resulting scatterplot with the specified Marks parameters.
Stratification
Trellis plot: Drag the categorical variable (dimension) label over to one of the shelves.
Figure 7 shows the resulting Trellis (facet) scatterplot with the specified Marks parameters.
Same panel plot: Drag the categorical variable (dimension) label over to Color mark.
Figure 8 shows the resulting scatterplot with the specified Marks parameters.
Bubble Scatterplot
Drag the continuous variable (measure) over to the Size mark. Perhaps adjust the sizes, though there is apparently no way to better differentiate among the different sizes other than making them all larger or smaller at the same scale.
Figure 9 shows the resulting scatterplot with the specified Marks parameters.
The link to the video of examples of these processes follows.
Video: Continuous variables. [3:31]