Published on July 19
Taiwan mangoes interactive visualization write-up
By Julia Janicki
Here is the GitHub repo if you want to refer to the code.
Part 1. Background research & ideation
1.1 Goal & Target audience
To create a visualization of the most common mango varieties in Taiwan that shows off their diversity. Users can sort the mangoes by sweetness or size, and click on a mango to see more details about it.
1.2 What are some initial ideas
Arrange all mangoes in a circle, ordered by a specific attribute starting from the top of the circle.
1.3 Direction after doing some research
Some considerations regarding the type of story & data viz:
Exploration vs presentation → Exploration
Static vs interactive → Interactive
2D vs 3D → 2D but with 3D rendered mangoes
Charts vs maps vs graphics → A mix of chart plus graphics
1.4 What would be our tools?
→ Cinema4D for the 3D mangoes (By Jane Guan)
→ Adobe XD for the interface design (By Daisy Chung)
Part 2. Data cleaning / exploration
2.1 Data collection / data sources
Normally I would start by looking for an existing dataset. But since no curated dataset exists for Taiwanese mangoes, I had to create my own. The compiled dataset below is based on over ten different sources, most of which were in Chinese.
2.2 Compile data & do some data cleaning
2.2.1 Description of dataset:
We set up a Google Sheet to store the mango data, with each row corresponding to a mango variety and the following fields (columns):
color (didn’t end up using)
Here is the link to the CSV & JSON datasets on GitHub.
2.2.2 Add attributes:
To arrange the mangoes into a circle, there are a couple of options:
Option 1 is to calculate their positions manually using sine and cosine, which I will not go into in detail here.
Option 2, what I did in this case, is to use D3’s cluster layout.
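For reference, Option 1 could look roughly like the sketch below. This is not the code used in the project; the helper name circlePositions and its parameters are made up for illustration. It assumes screen coordinates (y grows downward), starts at the top of the circle and moves clockwise.

```javascript
// Hypothetical helper (not from the project): place `count` items evenly
// on a circle of the given radius around the center (cx, cy).
// Angle 0 points to the top of the circle; angles increase clockwise
// in screen coordinates, where y grows downward.
function circlePositions(count, radius, cx = 0, cy = 0) {
  const positions = [];
  for (let i = 0; i < count; i++) {
    const angle = (i / count) * 2 * Math.PI; // radians, 0 = top
    positions.push({
      x: cx + radius * Math.sin(angle),
      y: cy - radius * Math.cos(angle),
    });
  }
  return positions;
}

// Four items on a circle of radius 100: top, right, bottom, left.
const points = circlePositions(4, 100);
```

The cluster layout in Option 2 produces the angle for each mango for you, which is why the manual version is not needed here.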
d3.cluster creates node-link diagrams that place the leaf nodes of the tree at the same depth. If you pass a root to the cluster layout, it adds x & y values to each descendant, which in our case means all the mangoes. The x and y attributes can then be treated as an angle and a radius to produce a radial layout. Since we only have one level of child nodes, these attributes can be used to arrange the mangoes in a circle.
Here are my steps:
(1) I manipulated the data by adding a parent data point: I added a “parent” column as well as a new row (the first data row above) that represents the parent of all the mango varieties.
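In the project this was done directly in the spreadsheet, but the same reshaping can be sketched in code. Everything below is illustrative: the rows are placeholders, and the field name `name`, the root label "mangoes" and the helper addParent are assumptions rather than the actual dataset columns. The root row gets an empty parent, which is a convention d3.stratify accepts for the root node.

```javascript
// Hypothetical helper (not from the project): give every row a `parent`
// field pointing at a single root, and prepend a row for that root.
function addParent(rows, rootName = "mangoes") {
  const root = { name: rootName, parent: "" }; // empty parent marks the root
  const children = rows.map((row) => ({ ...row, parent: rootName }));
  return [root, ...children];
}

// Placeholder rows, not real mango data.
const mangoRows = [{ name: "Mango A" }, { name: "Mango B" }];
const withParent = addParent(mangoRows);
```

The input rows are copied rather than modified, so the original array can still be used elsewhere.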
(4) D3’s cluster layout is normally used to produce dendrograms. Since we want a radial layout, we can pass [360, radius] as the size, which corresponds to a breadth of 360° and a depth of radius. In this case I passed in null for the radius, since the x & y positions will be calculated in the next step. We then pass the root from the previous step to the cluster layout and keep only the leaf nodes (we only have one parent node plus one level of child nodes), which gives us the mango data we will work with.
(5) Next I added further attributes to each node:
The centerAdjustment should be subtracted from the x position so that the first mango starts at the top center. The distance between two neighboring mangoes can then be calculated as the difference in that attribute between the first two mangoes (or any two adjacent mangoes).
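A minimal sketch of that adjustment, assuming each leaf node carries an `x` value in degrees. The attribute name `angle`, the helper names and the sample values are hypothetical, and here centerAdjustment is simply taken to be the first node's x so that the first mango lands at 0°.

```javascript
// Hypothetical helpers (not from the project).
// Shift every node's x (in degrees) so the layout starts at the top center.
function adjustAngles(nodes, centerAdjustment) {
  return nodes.map((node) => ({ ...node, angle: node.x - centerAdjustment }));
}

// Angular distance between neighbors, read off the first two nodes.
// Assumes the nodes are evenly spaced and ordered around the circle.
function angularStep(nodes) {
  return nodes[1].angle - nodes[0].angle;
}

// Placeholder leaf nodes, evenly spaced over 360 degrees.
const leaves = [{ x: 45 }, { x: 135 }, { x: 225 }, { x: 315 }];
const adjusted = adjustAngles(leaves, leaves[0].x);
const step = angularStep(adjusted);
```

Because the spacing is even, any adjacent pair gives the same step, so the first two nodes are enough.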
I wrote a function to calculate the other attributes we need for each node, since this has to be done every time the data updates, i.e. whenever a user sorts the mangoes or clicks on one.