Published on July 19

Taiwan mangoes interactive visualization write-up

By Julia Janicki

Here is the GitHub repo if you want to refer to the code.

Part 1. Background research & ideation

1.1 Goal & Target audience

To create a visualization of the most common mango varieties in Taiwan that shows off their diversity: the user can sort the mangoes by sweetness or size, and click on a mango to see more details about it.

1.2 What are some initial ideas

Arrange all mangoes in a circle, ordered by a specific attribute starting from the top of the circle.

1.3 Direction after doing some research

Some considerations regarding the type of story & data viz:

Exploration vs presentation → Exploration
Static vs interactive → Interactive
2D vs 3D → 2D but with 3D rendered mangoes
Charts vs maps vs graphics → A mix of chart plus graphics

1.4 What would be our tools?

→ JavaScript, more specifically D3 for the interactive visualization (By Julia Janicki)

→ Cinema4D for the 3D mangoes (By Jane Guan)

→ Adobe XD for the interface design (By Daisy Chung)

Part 2. Data cleaning / exploration

2.1 Data collection / data sources

Normally I would start by looking for an existing dataset. But since no curated dataset exists for Taiwanese mangoes, I had to create my own. The compiled dataset below is based on over ten different sources, most of which were in Chinese.

2.2 Compile data & do some data cleaning

2.2.1 Description of dataset:

We set up a Google Sheet to store the mango data with the following fields (columns), with each row corresponding to a mango variety:

name, name_en
size_cm
sweetness_brix
color (didn’t end up using)
origin_en, origin
feature_en, feature
region, region_en
year

Here is the link to the CSV & JSON datasets on GitHub.

2.2.2 Add attributes:

To arrange the mangoes into a circle, there are a couple of options:

Option 1 is to calculate their positions manually using sine and cosine, which I will not go into in detail here.
Option 2, which is what I did in this case, is to use D3’s cluster layout.
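
For reference, Option 1 can be sketched in a few lines. This is a minimal illustration and not code from the project: the function name circlePositions, the radius of 270, and the choice to start at the top and move clockwise are all assumptions.

```javascript
// Hypothetical helper (not from the project): place n items evenly on a circle
// centred at (0, 0). Angles start at the top (12 o'clock) and increase clockwise.
// SVG's y axis points down, so "up" is negative y.
function circlePositions(n, radius) {
  const positions = [];
  for (let i = 0; i < n; i++) {
    const angle = (i / n) * 2 * Math.PI; // radians, 0 = top of the circle
    positions.push({
      x: radius * Math.sin(angle),
      y: -radius * Math.cos(angle),
    });
  }
  return positions;
}

// Example: 23 mangoes on a circle of radius 270 (the radius is an assumed value).
const positions = circlePositions(23, 270);
```

The first position lands at the top center (x = 0, y = -radius), and each following item is rotated by 360° / n.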

d3.cluster creates node-link diagrams that place the leaf nodes of the tree at the same depth. If you pass a root to the cluster layout, it adds x & y values to each descendant, which in our case is all the mangoes. The x and y attributes can then be treated as an angle and a radius to produce a radial layout. Since we only have one level of child nodes, these attributes can be used to arrange the mangoes in a circle.

Here are my steps:

(1) I manipulated the data by adding a parent data point. To do so, I added a column “parent” as well as a new row (first data row above) representing the parent to all the mango varieties.

(2) I converted the CSV file into a JSON file, and stored the array of objects in a JavaScript variable (mangoes) in its own file.

(3) Next I needed to convert the data from tabular to hierarchical data. This part I did directly in JavaScript using D3. I transformed the data from a flat list into a hierarchy by using d3.stratify, so that there is one root, which is the parent node, and each mango is a leaf node. Some layouts in D3 require data in a hierarchical format, such as d3.tree or, in our case, d3.cluster.

(4) D3’s cluster layout is used to produce dendrograms. Since we want a radial layout, we can pass [360, radius] as the size, which corresponds to a breadth of 360° and a depth of radius. In this case I passed in null for the radius, since the final x & y positions are calculated in the next step. Then we pass the root from the previous step to the cluster layout and keep only the leaf nodes (we only have the one parent node plus one level of child nodes), which gives us the mango data we will work with.
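
For a tree with a single parent and n leaves, the x values the cluster layout assigns work out to evenly spaced angles. The sketch below reproduces that arithmetic in plain JavaScript, assuming the layout's default separation. It illustrates the result and is not D3's implementation; clusterLeafAngles is a made-up name.

```javascript
// Illustration only (not D3 code): for one root with n leaf children, a cluster
// layout of the given breadth spaces the leaves evenly and leaves half a step
// of padding before the first leaf and after the last one.
function clusterLeafAngles(n, breadth = 360) {
  const step = breadth / n; // angular distance between neighbouring leaves
  return Array.from({ length: n }, (_, i) => (i + 0.5) * step);
}

// Example with 22 leaves: the first angle is half a step, i.e. 360 / 22 / 2.
const angles = clusterLeafAngles(22);
```

With 22 leaves the first angle is about 8.18°, which agrees with the x value shown for the first node in section 3.7. The half-step offset at the start is what the next step adjusts for.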

(5) Next I added further attributes to each node:

The centerAdjustment is subtracted from the x position so that the first mango starts at the top center. The distance between two neighboring mangoes can then be calculated as the difference in that attribute between the first two mangoes (or any two adjacent mangoes).

I wrote a function to calculate the other attributes we need for each node, since this is done multiple times: the data updates whenever a user sorts the mangoes or clicks on one.
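
A minimal sketch of this kind of per-node bookkeeping is below. It is not the project's function: the names polarToXY and addPolarAttributes, the default radius of 270, and the reading of cluster as a position index are assumptions. The attribute names centerAngle, startAngle and endAngle follow the sample node in section 3.7, whose values (centerAngle ≈ 0.2737, x ≈ 72.98, y ≈ -259.95) are consistent with this sine/cosine convention at a radius of 270.

```javascript
// Hypothetical sketch (function names and the default radius are assumptions).

// Convert an angle (radians, clockwise from the top) and a radius to SVG x/y.
function polarToXY(angle, radius) {
  return { x: radius * Math.sin(angle), y: -radius * Math.cos(angle) };
}

// nodes: leaf nodes in circle order, each with x = angle in degrees from the layout.
function addPolarAttributes(nodes, radius = 270) {
  const toRad = Math.PI / 180;
  const centerAdjustment = nodes[0].x; // shift so the first node sits at the top
  // Angular distance between neighbours, taken from the first two nodes.
  const step = nodes.length > 1 ? nodes[1].x - nodes[0].x : 360;
  const halfStep = (step / 2) * toRad;
  return nodes.map((node, i) => {
    const centerAngle = (node.x - centerAdjustment) * toRad;
    return {
      ...node,
      cluster: i, // assumed meaning: position index along the circle
      centerAngle,
      startAngle: centerAngle - halfStep, // where this node's slice begins
      endAngle: centerAngle + halfStep, // where this node's slice ends
      ...polarToXY(centerAngle, radius), // overwrites x/y with screen coordinates
    };
  });
}
```

Keeping this in one function means the same step can be re-run unchanged after every sort or click.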

Here are the variables I declared to store the radius of various elements in the visualization

A resource I referenced for setting up the cluster layout is Data Sketches’s Cardcaptor Sakura project.

Part 3. Dataviz Development

3.1 Set up SVG

3.2 Draw circles

Instead of drawing images right away, I wanted to try drawing circles first. To draw the circles onto the screen, you can use D3’s join method to join the data to the circle elements, and set the cx, cy, and r attributes. The r attribute is scaled by the size_cm property of the mangoes dataset. At this point the circles are drawn in the order of the dataset.

3.3 Replace circles with images

Now that the circles draw correctly, I can replace them with images. So instead of appending circle elements with cx, cy and r attributes, I appended image elements with x, y, width, height, and xlink:href attributes.

The images have a set width and height, since the mangoes are already scaled based on their size and placed on an artboard with the same width & height, so I didn’t need to scale them dynamically.

I wanted to add a transition where the circles are not visible on page entry, then slowly display them one by one. So I added the following transition and delay:

Now the mango visualization should look like this:

3.4 Draw text arcs

Instead of displaying the mango names as straight text near each mango, I decided that text arced along a path is more aesthetically pleasing. Visual Cinnamon has a really nice blog post that goes into great detail on how to do this. I also referred to the same project as above (Data Sketches’s Cardcaptor Sakura project) for some of the implementation details.

In a nutshell, to render text along the shape of a path element (such as the arcs), you would need to enclose the text in a textPath element that has an href attribute that references the path element.

The steps I took include:

(1) Created an arc path first for each mango based on each mango’s position

I created a function to calculate the arc paths for each mango, since we will be drawing the arcs on multiple occasions

(2) Gave each of the arcs a unique id

(3) Then created text elements for each mango

(4) And for each text element appended a textPath

(5) For the xlink:href attribute of the textPath, made reference to the id of the corresponding arc
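
The steps above can be sketched without D3 as plain string building. This is an illustration under assumptions and not the project's code: arcPath and labelMarkup are hypothetical names, angles are taken in radians clockwise from the top, and the circle is centred at (0, 0).

```javascript
// Hypothetical helper: SVG path for a clockwise arc on a circle centred at (0, 0).
// Angles are in radians, measured clockwise from the top of the circle.
function arcPath(startAngle, endAngle, radius) {
  const x0 = radius * Math.sin(startAngle);
  const y0 = -radius * Math.cos(startAngle);
  const x1 = radius * Math.sin(endAngle);
  const y1 = -radius * Math.cos(endAngle);
  const largeArc = endAngle - startAngle > Math.PI ? 1 : 0;
  // Arc command: "A rx ry x-axis-rotation large-arc-flag sweep-flag x y".
  // sweep-flag = 1 draws clockwise on screen.
  return (
    `M ${x0.toFixed(2)} ${y0.toFixed(2)} ` +
    `A ${radius} ${radius} 0 ${largeArc} 1 ${x1.toFixed(2)} ${y1.toFixed(2)}`
  );
}

// Markup for one label: the text follows the path whose id the textPath references.
function labelMarkup(id, name, d) {
  return (
    `<path id="${id}" d="${d}" fill="none"/>` +
    `<text><textPath href="#${id}">${name}</textPath></text>`
  );
}

// Example: a quarter-circle arc with a label (id and name are made-up values).
const exampleArc = arcPath(0, Math.PI / 2, 100);
const exampleLabel = labelMarkup("arc-0", "XiaXue", exampleArc);
```

The unique id from step (2) is what ties each textPath to its own arc, so every mango needs a distinct one.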

3.5 Identify interactions

Drawing all the elements onto the screen was the easy part. The interactions were a bit trickier.

There are two types of interactions for this visualization:

The user can click on a mango, after which the selected mango moves to the center of the circle while the other mangoes slide over to fill in the vacated space, keeping an even distance between them.
The user can select to sort the mangoes by size or by sweetness. By default, the mangoes are arranged by size.

3.6 Sort mangoes and redraw

Let’s start with the sorting part, since it is simpler: it doesn’t involve adding (entering) or removing elements, just updating them.

I added two buttons that let the user sort the mangoes either by size or by sweetness. Once a button is clicked, the sortMangoByAttribute function is called with either “size_cm” or “sweetness_brix” as the argument (both are attributes attached to each mango from the original dataset, i.e. columns in the original CSV data). The function sorts the mangoes by the passed-in attribute, then repeats some of the steps from the beginning (see section 2.2.2): it passes the root to the cluster function to add x & y attributes, obtains the leaf nodes, then adds the further attributes for each node.
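
The sorting step on its own can be sketched as follows. This is a simplified stand-in rather than sortMangoByAttribute itself: the name sortByAttribute, the descending order, and the sample rows are assumptions, and the layout step that follows the sort is left out.

```javascript
// Hypothetical sketch: return a sorted copy of the rows, largest value first.
// The sort direction is an assumption; CSV values may be strings, hence Number().
function sortByAttribute(mangoes, attribute) {
  return [...mangoes].sort(
    (a, b) => Number(b[attribute]) - Number(a[attribute])
  );
}

// Made-up sample rows, using the column names from the dataset description.
const sample = [
  { id: "A", size_cm: 12, sweetness_brix: 15 },
  { id: "B", size_cm: 20, sweetness_brix: 13 },
  { id: "C", size_cm: 9, sweetness_brix: 18 },
];

const bySize = sortByAttribute(sample, "size_cm").map((m) => m.id);
const bySweetness = sortByAttribute(sample, "sweetness_brix").map((m) => m.id);
```

Copying the array before sorting leaves the original order intact, which is convenient when the same dataset is re-sorted by a different attribute later.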

Then once I had the new and sorted set of nodes, I passed it into the drawSortedMangoes function.

This function updates the positions of the mangoes & the text of the arcs based on the new dataset, since at this point the number of mangoes arranged in the circle is still 23.

An important point: when passing the updated dataset to .data, you should also pass in a key function that returns a unique key, such as id, like this: .data(data, d=>d.id). The key function controls how the data is joined to the elements, as opposed to the default join-by-index. With the key, D3 matches each existing mango element to the datum with the same id, so every mango keeps its own element and animates from its old position to its new one.
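
To make the idea of a keyed join concrete, here is a conceptual sketch in plain JavaScript. It is not D3's implementation, and keyedJoin and the sample ids are made up; it only shows how matching by key splits the data into update, enter and exit groups.

```javascript
// Conceptual sketch of a keyed join (not D3's implementation).
// oldData: data currently bound to elements; newData: the incoming data.
function keyedJoin(oldData, newData, key) {
  const oldKeys = new Set(oldData.map(key));
  const newKeys = new Set(newData.map(key));
  return {
    update: newData.filter((d) => oldKeys.has(key(d))), // already on screen
    enter: newData.filter((d) => !oldKeys.has(key(d))), // needs a new element
    exit: oldData.filter((d) => !newKeys.has(key(d))), // element to remove
  };
}

// Sorting only reorders the data, so every item matches an existing element.
const before = [{ id: "a" }, { id: "b" }, { id: "c" }];
const afterSort = [{ id: "c" }, { id: "a" }, { id: "b" }];
const joined = keyedJoin(before, afterSort, (d) => d.id);
```

In the sorted case all three items end up in update and nothing enters or exits, which is why sorting only needs to move existing elements.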

Here are some good posts related to passing in a key to .data():

Another note: if I use d.cluster as the key, D3 throws an error, while d.id works. Refer to the following post for more details:

DOMException: Failed to execute 'insertBefore' on 'Node'

A note on drawing the arcs: when joining the data, I needed to handle the enter & update selections separately in order for the arcs to draw correctly. For the update selection use select, while for the enter selection use append. See this link for further details.

Note: If I click on a mango so that it goes to the center, then click on sortBySize, the last arc doesn't draw if I use d3.selectAll as opposed to mangoNameG.selectAll… If anyone knows why please let me know.

3.7 Click on mango to move to center

When a mango is clicked, a couple of things happen.

I created a function called rearrageMangoData that takes in a dataset, filters out the selected mango, gives that mango some attributes manually (x = 0, y = 0, cluster = 22), then calculates the attributes for the remaining 22 mangoes. It returns an array of two elements: the first is the 22 mangoes that are left to fill the circle, and the second is those 22 mangoes plus the selected mango, now in the center.
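
The shape of that return value can be sketched as below. This is a simplified stand-in, not the project's function: rearrangeSketch is a hypothetical name, the radius of 270 is assumed, and the remaining mangoes are placed evenly by hand instead of re-running the cluster layout.

```javascript
// Hypothetical sketch of the "move one mango to the center" data step.
function rearrangeSketch(mangoes, selectedId, radius = 270) {
  const selected = mangoes.find((m) => m.id === selectedId);
  if (!selected) throw new Error(`No mango with id ${selectedId}`);
  const remaining = mangoes.filter((m) => m.id !== selectedId);

  // Spread the remaining mangoes evenly around the circle, starting at the top.
  const onCircle = remaining.map((m, i) => {
    const angle = (i / remaining.length) * 2 * Math.PI;
    return {
      ...m,
      cluster: i,
      x: radius * Math.sin(angle),
      y: -radius * Math.cos(angle),
    };
  });

  // The selected mango sits at the center and takes the last index
  // (22 when there are 23 mangoes in total).
  const centered = { ...selected, cluster: remaining.length, x: 0, y: 0 };

  // First array: mangoes on the circle. Second: the same plus the centered one.
  return [onCircle, [...onCircle, centered]];
}
```

Returning both arrays lets the mango images be drawn from the full set while the labels use only the mangoes that remain on the circle.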

In this function, before passing the root to cluster, the nodes include the following attributes (plus values of the first row as an example): data: {parent: 'All', name: '夏雪', name_en: 'XiaXue' ...}, depth: 1, height: 0, id: "夏雪"

After passing the root to cluster, the nodes also have parent, x and y attributes: parent: pd {data: {…}, height: 1, depth: 0, parent: null, id: 'All'...}, x: 8.181818181818182, y: 0

Then, after calling addAttributes, the nodes also include the following attributes, with x and y updated: centerAngle: 0.27369935997183803, cluster: 0, endAngle: 0.4164990260441014, endAngle2: 0.39269908169872414, startAngle: 0.13089969389957468, startAngle2: 0.1546996382449519, x: 72.97963350023863, y: -259.94994343944535

I then essentially pass these two arrays to the mangoClicked function as below:

The mangoClicked function redraws the mangoes based on the full dataset (with 23 mangoes), placing the clicked mango at the center and the other 22 mangoes along the circle. The labels are also updated, but with the filtered dataset that excludes the mango in the middle.

Final Note

If I wanted more interesting interactions, I could have randomized the order of the data first, which I didn’t end up doing in this iteration.

Part 4. Style based on design

I then updated the data story based on Daisy’s UI design below:

This piece was quite simple: I used Bootstrap for the layout, then added some colors and button styling using CSS.

Part 5. Publish

The final data story can be found here, while the GitHub repo can be found here.