7 days of r/dataisbeautiful

Data Is Beautiful Topics

That’s a word cloud.  It took me about 30 seconds to build.

It shows the relative word frequencies in the titles of posts to r/dataisbeautiful over the last 7 days.
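For anyone curious what "relative word frequency" means in practice, here is a minimal sketch in Python. It is not how the word cloud above was made (that was done with a tool, not code); the stop-word list, the function name and the sample titles are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "for", "on", "by", "is"}

def relative_word_frequencies(titles, stop_words=STOP_WORDS):
    """Return {word: share of all counted words} across the given titles."""
    counts = Counter()
    for title in titles:
        # Lower-case the title and treat runs of letters/apostrophes as words.
        words = re.findall(r"[a-z']+", title.lower())
        counts.update(w for w in words if w not in stop_words)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Divide each count by the total so the shares add up to 1.
    return {word: n / total for word, n in counts.items()}

# Made-up titles for illustration only, not real posts.
sample_titles = [
    "Citi Bike trips by hour of day",
    "A map of Citi Bike stations",
    "Population of US states",
]
freqs = relative_word_frequencies(sample_titles)
```

A word cloud then simply scales each word's font size by its share in `freqs`.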


One thing that jumps out immediately is how often the word “Citi” appears.

“Citi” is presumably a reference to Citi Bike, New York’s bike-sharing service sponsored by Citibank.

It turns out that there were 4 data visualisations about Citi Bikes last week, and they were all submitted by the same Reddit user: wefollocitibike

How we talk about data

The word cloud suggests some interesting patterns about how data visualisations are talked about.

But it is difficult to draw clear conclusions from a word cloud, because it relies on the viewer comparing font sizes, which is hard to do accurately.

To make it easier for you, here is a bar chart of the top 10 words ranked by relative frequency.

Bar chart: top 10 words by relative frequency
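If you would rather rank the words yourself, the idea behind the bar chart can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are hypothetical, the frequencies are made up, and the bars are drawn as plain text rather than as a proper chart.

```python
def top_words(freqs, n=10):
    """Return the n (word, frequency) pairs with the highest frequency."""
    # Sort by descending frequency, then alphabetically so ties are stable.
    return sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

def text_bar_chart(ranked, width=40):
    """Render (word, frequency) pairs as '#' bars scaled to the largest value.

    Assumes every frequency is positive.
    """
    if not ranked:
        return ""
    largest = max(freq for _, freq in ranked)
    label_width = max(len(word) for word, _ in ranked)
    lines = []
    for word, freq in ranked:
        # Bar length is proportional to the frequency, with at least one '#'.
        bar = "#" * max(1, round(width * freq / largest))
        lines.append(f"{word.ljust(label_width)} {bar} {freq:.1%}")
    return "\n".join(lines)

# Made-up frequencies for illustration only.
example = {"data": 0.20, "map": 0.10, "citi": 0.05, "us": 0.10}
ranked = top_words(example, n=3)
chart = text_bar_chart(ranked, width=20)
print(chart)
```

Because every bar starts at the same point and differs only in length, the ranking is much easier to read than the font sizes in a word cloud.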

Build a word cloud in 30 seconds

I was inspired to give this a go after reading today’s tutorial that shows how to use Import.io Magic to extract written content from a blog and then visualise the results in a word cloud, in the shape of a logo.

logo word clouds

It was very quick.  It took longer to write this post than it did to create the visualisation.
