Zach: data enthusiast, options trader.

Making better graphs - a journey toward improving Data Visualization

Introduction

Constructive feedback always has a way of sharpening my focus. A while back, I posted a graph on an internet forum that compared the test scores of primary school students in Singapore. I was bored, trawling through data.gov.sg (the government’s data repository), found this dataset, and wanted to plot it. The problem was that I had no clear intent, so the resulting plot was unclear: it didn’t convey a key message. As with all things on the internet, I got constructive feedback as well as people who were downright nasty about it. The constructive feedback was really good, though, and I want to record some of it here to remind myself of the need to improve.

On that note, I have taken practical steps toward improving and am slowly working through this book on data visualization. Anyway, here’s the graph that I posted, followed by three key pieces of feedback (along with my thoughts on each).

PSLE

Feedback

1. No Clear Message

Comment: There was no clear message to the graph and because of that, people were not sure what the takeaway was.

My thoughts: I agree. I didn’t merge any of the plots or make any obvious comparison because I wasn’t sure exactly what I wanted people to take away from the graph. Hence, I faceted them out individually and used bar graphs (instead of line graphs) because I wanted each piece of information to have its own filled-out space.

Improvement in future: I will analyze the data and make sure I know the story I want to tell before I present it. In this case, the story could be that socio-economic inequality results in certain races faring more poorly at this test than others. If that were the key message, I could have used a line plot to make comparisons easier.

2. Inability to compare

Comment: It was hard to compare the plots since they were all segregated. A line plot would have been better.

My thoughts: I think this is related to the issue of not having a clear message. Because I wanted to present each piece of data on its own, I chose a bar plot over a line plot so that each would have its own space.

Improvement in future: Again, I could have thought longer about why I was presenting the data instead of just graphing it at random. That would have given me a better sense of the comparisons I wanted to make and, consequently, the type of graph that would have made them clearer.
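To make the line-plot idea concrete, here is a minimal sketch in Python. It assumes the data comes as one percentage per group per year, which may not match the actual dataset. The helper names (to_series, plot_lines), the axis labels, and the numbers are all hypothetical placeholders, not the data.gov.sg figures. The point is only that putting every group on one shared set of axes makes the gaps between them directly comparable.

```python
from collections import defaultdict


def to_series(records):
    """Group (year, group, pct) records into {group: (years, pcts)}, sorted by year."""
    grouped = defaultdict(list)
    for year, group, pct in records:
        grouped[group].append((year, pct))
    series = {}
    for group, points in grouped.items():
        points.sort()  # order each group's points by year
        years = [y for y, _ in points]
        pcts = [p for _, p in points]
        series[group] = (years, pcts)
    return series


def plot_lines(series, path="comparison.png"):
    """Draw every group as a line on ONE set of axes so they can be compared."""
    # Imported here so the reshaping above works even without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for group, (years, pcts) in series.items():
        ax.plot(years, pcts, marker="o", label=group)
    ax.set_xlabel("Year")            # assumed x-axis
    ax.set_ylabel("Percentage (%)")  # assumed y-axis
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Made-up placeholder numbers for illustration only.
records = [
    (2015, "Group A", 90.0), (2014, "Group A", 89.0),
    (2014, "Group B", 61.0), (2015, "Group B", 63.0),
]
series = to_series(records)
```

Calling plot_lines(series) would write a single chart to comparison.png, with one line per group instead of one facet per group.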

3. Truncation of the y-axis

Comment: Some felt I had truncated the y-axis to make the comparisons starker and to exaggerate the differences in achievement levels.

My thoughts: I really had no such intent. I thought a 50% to 100% scale was better because it allowed the differences to be seen more clearly. Furthermore, referencing this paper, I felt it wasn’t manipulation because the difference was between 60% and 90%, which is large, and not between 35.1% and 35.5% scaled to look four times as big.

Improvement in future: I don’t feel there was anything wrong with truncating the y-axis, but I could have made the truncation clearer in my graph. In future, I could add a note at the bottom to inform viewers that the y-axis has been rescaled.
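As a rough illustration of why such a note helps, the sketch below compares how tall two bars are drawn when the axis starts at 0 versus at 50. It uses the 60% and 90% figures mentioned above purely as example inputs. The function names (bar_height_ratio, truncation_note) and the wording of the note are my own assumptions, not anything taken from the original graph.

```python
def bar_height_ratio(low, high, axis_min=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at axis_min."""
    if low <= axis_min:
        raise ValueError("the smaller value must sit above the axis minimum")
    return (high - axis_min) / (low - axis_min)


def truncation_note(axis_min, axis_max):
    """Footnote text telling viewers the y-axis does not start at zero."""
    return f"Note: y-axis runs from {axis_min:g}% to {axis_max:g}%, not from 0%."


true_ratio = bar_height_ratio(60, 90)                # axis starting at 0
drawn_ratio = bar_height_ratio(60, 90, axis_min=50)  # axis starting at 50
note = truncation_note(50, 100)
```

Here true_ratio is 1.5 while drawn_ratio is 4.0, so on a 50% to 100% axis the taller bar is drawn four times as high even though the value is only one and a half times as large. With matplotlib, the note could be attached to a figure with something like fig.text(0.01, 0.01, note, fontsize=8).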

Conclusion

I am always hungry for feedback because I recognize that I am but a nascent explorer of data and that there is so much more to be gleaned from others who are more experienced than I am. Data visualization is a key component of presenting data, and I am excited to get better at it.