Information Technology and Services | York, United Kingdom
The Information Lab is a team of passionate Tableau software professionals. We are one of the longest-standing Tableau partners in the UK, with experience across all aspects of the Tableau product suite.
Our team are skilled at working with data and we are all certified Tableau consultants.
Like most Tableau evangelists, I first discovered the power of Tableau through constant frustration with authoring data reports in classic spreadsheet applications. Since then I've thoroughly enjoyed helping businesses see and understand their data, as well as making full use of Tableau Public to bring public datasets to life.
Specialties: Tableau Data Visualisation, Dashboard Design, SQL, Excel
2011 - Present
CTO / The Information Lab
Helping clients implement, create and understand exciting dashboards with Tableau data visualisation software. Whether it's training or authoring, my goal is to help people make sense of data.
Maps in Tableau are a powerful tool that can quickly show the user geographical data at a glance. In this post I will show you how to build a Filled Map.
What is a Filled Map?
Filled Maps are maps with defined polygon shapes that can be filled with colours dependent on the data.
Where to start
The first thing you are going to need is geographical dimensions that match up with the built-in roles within Tableau. If we take Tableau’s Superstore Sales data, these come in the form of Country, State, City and Postcode. You can tell these have been matched up to a role by the icon next to the name, and you will also see the generated Latitude-Longitude fields in the Measures section.
This is not the only way to make a filled map: you can also use polygons to create your own areas and then colour them with your data. I will show this method in the second example.
For the first example I will use the Super Store data set available with Tableau Desktop.
As you can see we already have the Country, State, City and Postal Codes matched to the roles so any one or all of these can be used.
Drag the Country dimension onto the Rows shelf and the Profit measure onto the Columns shelf, and you will get a bar chart like the one below. From here you just need to pick the Filled Map option from the Show Me card.
You could of course do this without Show Me by dragging the fields into their final positions; Tableau will automatically try to build the best visualization it can for you. Drag the Country dimension onto the Detail mark and then drag the Profit measure onto the Color mark, and you should instantly see a Filled Map of the world wherever there is profit data in the source.
We can tell from the first map that no countries have a negative profit value. You could develop the view further by drilling down to the State level of detail by clicking the + sign on the Country pill in the Marks card. This will draw a polygon shape over each state it has data for.
Now we can see that there are states with a negative profit, but it is hard to see the difference between them as they are spread apart. First we can add a filter to the Profit measure to include only negative values. This clears up the view a bit, but it is still hard to tell the difference between them.
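(If you prefer a calculated field to filtering the measure range directly, a minimal sketch – the field name is my own – is a boolean field dropped onto the Filters shelf and set to True:)

// [Negative Profit?] – keep only the loss-making states
SUM([Profit]) < 0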
What we can do is take these polygons and make a scatter plot of sales vs profit.
NOTE – If you used the second, automatic option to build your filled map, you will need to change the Marks card from “Automatic” to “Filled Map” to keep the polygons.
To do this, replace the Latitude-Longitude fields on the Rows and Columns shelves with the Sales and Profit fields. As these are all negative profit values, we can change the colour to be by country, increase the size of the polygons (as they are no longer over a map) and add the State field to the Label mark.
The second example uses non-geographical dimensions
This method relies on the Latitude-Longitude being available in the data rather than generated by Tableau. It also requires the data to have the structure for the shapes (polygons) to be defined, using a Polygon mark and a “Path” field.
I have connected to a data source that has the “non-generated” Latitude-Longitude in the data, with no dimensions linked to geographical roles.
To start with I just need to add my Latitude-Longitude to the sheet by double clicking or dragging them out. This gives us a single point in the middle of the map.
Now to add the fields I need to make Tableau draw the shapes: first change the Marks type to Polygon, then drag out the fields to build up the shapes.
I have not gone into too much detail, as this is just an overview of the feature in Tableau, but you should find more detailed posts on our site.
We’re very aware that the Higher Education sector will be trawling through the REF 2014 data published yesterday, to find out how well they did, and what impacts this will have on their league table rankings and their public funding.
We’ve put together a simple dashboard to help you look into the results, and published it to Tableau Public. Remember, like other Tableau Public visualizations, you can download the report and use the data in your own reports. If you’d like to see anything else in it, please get in touch.
Bar charts are surely well known, but let’s spend a few words on treemaps now. A treemap is a chart type that displays hierarchical or part-to-whole relationships via rectangles. In the case of hierarchical (tree-structured) data these rectangles are nested. The space in the view is divided into rectangles that are sized and ordered by a measure. Nested rectangles mean that hierarchy levels in the data are expressed by larger rectangles (above in the hierarchy) containing smaller ones (below in the hierarchy).
The rectangles in a treemap range in size from the top left corner of the chart to the bottom right corner, with the largest rectangle positioned in the top left and the smallest in the bottom right. In the case of hierarchical data – when the rectangles are nested – the same ordering of the lower-level rectangles is repeated within each higher-level rectangle. So the size, and thus the position, of a rectangle that contains other rectangles is determined by the sum of the areas of the rectangles it contains.
Let’s now see an example of a treemap showing part-to-whole relationships, and a nested treemap built from hierarchical data.
(All the charts will be built on the ‘Superstore data’ provided with Tableau Desktop and will consistently use the Sales measure for the sake of simplicity.)
A) Simple, ‘part-to-whole’ treemap (1 dimension only, now Region)
B) Nested treemap (more dimensions, now Region & Product Category)
So much for the theory of treemaps; let’s see if they stand a chance compared to bar charts…
Treemaps vs bar charts – what are the differences?
My green/red coloring stands for an advantage/disadvantage. Although I did not color anything in terms of the ‘strength’ of the chart, bar charts can be further flavoured with a running total line on a dual axis (moving towards a Pareto chart), which gives them a slight edge over the treemap even in this case.
Concluding the advantages of the bar chart, the next logical question is: can we always replace a treemap with an equivalent bar chart? Let’s try…
A treemap is created from 1 or more dimensions and 1 or 2 measures.
Let’s see the basic variant, 1 dimension and 1 measure: Product Sub-Category and Sales.
The order of Product Sub-Categories is clearer in the bar chart and the differences between the individual values are also better displayed.
I will skip the case of 1 dimension and 2 measures, as the second measure is used only for coloring the treemap. Later on, the variants with an additional measure on the color button are likewise not discussed, as the number of dimensions will be more important for us.
Let’s move on to including 2 dimensions and 1 measure in the view. We may also use the higher level dimension on color to visually group the marks.
Treemap (hierarchy of dimensions)
Bar chart (hierarchy of dimensions)
Individual region-product category combinations are still more accurately shown on the bar chart. What the treemap does well is organizing the cells (region-product category combinations) by region in descending order. Still, can you tell whether East or West is bigger in total if you do not know that the second-biggest region should appear below the first and the third to its right?
There is another bar chart solution with 2 dimensions and 1 measure, namely a stacked bar.
A similar view is produced by combining treemaps and bar charts to create a bar chart where each bar is a treemap itself.
Notice that the length of the bars corresponds with the total value of Sales in the region, just like in the stacked bar chart.
Does the treemap outperform the stacked bars? No. Bar sizes still convey the differences in sales more clearly. Another aspect worth mentioning is that dimension members are automatically ordered on treemaps, while this is not straightforward on a stacked bar chart – though it can be achieved simply.
Just create a combined field of the dimensions (in our case Region & Customer Segment), then drag the new combined field to Level of Detail, above the color dimension (here Customer Segment), and finally sort that combined field by SUM(Sales).
So far every case of a treemap could be reproduced with a bar chart that proved more effective. So when is a treemap superior to a bar chart?
What if we are dealing with a very large number of dimension members?
Treemap (large number of items)
Bar chart (large number of items)
To be honest, none of these look appropriate, although the treemap groups the items well using color, and labels are visible in the larger rectangles. This many data points are better visualized in a text table if looking up the individual values is important.
The last resort for the treemap is that due to the rectangular shapes it may be more user friendly on mobile. Yet if a reasonable number of marks are involved, there is no reason why a bar chart would not do the job.
All in all, the treemap may be the new pie chart. It looks fancy and the software capabilities are luring us to go for it – it is so simple to create. Please consider the good old bar chart first.
“Our ultimate goal is to create a ‘one version of the truth’ data source!” – a statement a modern data consultant will get used to hearing, along with questions such as “how can Tableau Server be used as a centralised, one-version-of-the-truth repository?” or “how can we restrict what people can create with Alteryx to ensure it doesn’t disrupt our one-version-of-the-truth philosophy?”. Honestly, I do not like the phrase “one version of the truth”; upon its whispered utterance I immediately ask “one version of whose truth?”
Seeking out everybody’s truth
So what’s wrong with everybody singing from the same hymn book? (…and yes, if you’re going to throw IT catchphrases at me, I’m going to throw corny phrases back). Well, in theory nothing. If you’re analysing your company’s profit and loss for an official report, especially if it’s going to Her Majesty’s Government, then you need to be certain that the underlying figures are as good as they can be. But that’s one data source, and a very specific one at that. What advocates of “one version of the truth” tend to promote is the idea that everybody’s question can be answered with just a single data source.
Everybody’s question… from one data source…? This is a logical and even a physical impossibility. Like those who travelled to seek out Deep Thought in The Hitchhiker’s Guide to the Galaxy, only to hear that the answer to the ultimate question of life, the universe and everything was 42, those who seek to answer every analytical question that every person in a company could possibly have will inevitably be disappointed with the output. Either the dataset has been overly simplified for ‘user friendliness’, so that no real questions can be answered, or it has been made so large and overly complicated that nobody has a chance of understanding how to use it.
The Data Team, The Challenge
So this brings us back to “whose version of the truth?” Let’s think about how one version of the truth comes into being in a company. Somebody in middle to upper management decides that the BI capability needs to be improved and believes that information should be democratised across the organisation. So they set up a meeting to bring together potential end users, line managers of critical departments and the heads of IT. They scope out the known major data systems within the organisation, argue about the complexity of matching these all together, discuss the known unknowns of each, try to ignore the possible existence of unknown unknowns, and many hours, jugs of coffee and Visio diagrams later they come out with a methodology to achieve success. But that’s just one part of the problem: how should they get the data to the users? Well, you can’t just present it as SQL tables and views – most users don’t understand those (big assumption there) – and anyway, imagine the anarchy of giving users raw, row-level data which they can join and blend at will! Insane!
So what to do? Well you can either present multiple single views on the data through some kind of delivery mechanism (a locked down datamart, a data exporter such as Business Objects) or you try to mash everything together into a single cube. That second one sounds really clean. One source, a controlled logical machine which will present correct figures at different drill-down levels, sweet! So the team go with that and 9 to 12 months (18 in reality after discovering those unknown unknowns) later their cube is ready to be presented to the company.
The outcome? Well the designer couldn’t decide how many different date fields people would want, so they’ve added them all. Then there’s the finance guys who want everything broken down by department, which is actually a cost centre, but that’s not the same cost centre as what everybody else uses on the internal CapEx & OpEx planning system. Then of course there’s the holes in the data which were never resolved from combining all these systems into one single source. So you can analyse headcount by calendar year, but not financial year, that will generate an error.
And that’s just what can be seen in the data connection. What about all the logic written into the system to calculate the values? Some present percentages, while a quick switch of a dimension will turn that same measure into an average with a total average line included… which can’t be hidden. The result? Confusion, blind assumptions, and request for change, after request for change, after request for change. But if people are building reports on existing measures, how can you change the underlying logic? So a new measure and dimensional hierarchy is added, and another, and another. Doesn’t sound very ‘one version of the truth’, does it? Of course they could have just denied every change request, but then nobody would end up using the system.
Data to the People
So where did it all go wrong? In my opinion it all happened back when the first assumption was made: that users couldn’t connect directly to the data and combine datasets as they needed. This is the new age of data analysis, and it’s here. Tools such as Alteryx for data blending and Tableau for visual analysis mean end users no longer need to be SQL gurus to make use of row-level data. Yes, I said it… row-level data! And before you argue that you couldn’t possibly make row-level data available, send somebody on a fact-finding mission around different departments. They’re already doing it! People are becoming experts in sending just the right query to your cube through Excel to get row-level data out. They’re picking the Business Objects report to export the past 30 days of data into Access, their “new” datamart.
What could the team have done differently? Well besides trusting that the people who have the questions which can be answered by the data aren’t stupid enough to report the sales which they’re responsible for as being twice the amount they really are, they could have used the working group and all that meeting time to transform the IT department. With data blending and analysis moving out of IT and to line of business, IT needs to get back to what it’s good at…keeping systems up and performing as well as they possibly can. Companies who can transform their IT departments into the Amazon Web Services of the corporate world will find they can make full use of a workforce used to working with technology that doesn’t require a user manual or week long training course. A workforce that has the freedom to do what they need to do to get the answers to the questions they have.
So no, I’m not a great fan of one version of the truth. Instead I’d prefer a conversation which allows truths to be tested, refined and made into fact by scientific testing and organic peer review. To make it happen, accept that IT don’t need to report on data any more; they don’t have to understand the entire organisation. Instead they just need to understand query times, server clustering and IOPS.
Maps in Tableau are a powerful tool that can quickly show the user geographical data at a glance. In this post I will show you how to build a Symbol Map quickly using the “Show Me” feature in Tableau Desktop.
What is a Symbol Map?
Symbol Maps are simply maps that have a mark displayed over a given Longitude and Latitude. Using the “Marks” card in Tableau you can quickly build up a powerful visual that informs users about their data in relation to its location. These maps can be as simple or as complex as you need them to be and I will show you some examples to get you started.
Where to start
The first thing you are going to need is geographical data. This means you either have the longitude and latitude for the data points you want to show, or you have fields of data that you can match up to the geographical roles in Tableau – for example Country, City, Postcode etc.
For the first example I will use the Super Store data set available with Tableau Desktop.
As you can see we already have the Country, State, City and Postal Codes matched to the roles so any one or all of these can be used.
Double click on the Country field and Tableau will move it onto the Level of Detail and also move the generated Longitude and Latitude onto the Rows and Columns shelves. This is the most basic of symbol maps, as it shows your data in relation to its location.
From here we can do any number of things to improve the visualisation to help inform our audience.
For example, we may want to look at how our profit is doing per country.
All we need to do is drop the Profit field from our measures section onto the Size shelf on the Marks card.
Now that we know which countries make us the most profit, we may want to know what it is that makes us all this profit.
We can change the Marks type from Automatic to Pie and then drop Department onto the Color shelf on the Marks card. This will split up each mark to show each Department’s profit contribution per country.
The next example uses GPS data gathered from a mobile phone app
The following is an example from some GPS data I had; it only provided lat/longs, so there is no field with a city or postcode to match against the inbuilt geographical roles.
Each mark is placed on its lat/long point as a visit, and the size of the mark is the amount of time spent at each location.
The last example just shows that you do not need to be on land to use the lat/long method
As you do not need a country or city level of detail, you can actually chart data points in the middle of the sea, or even against an image of your own that is not a map at all.
The following is a screenshot from Chris Love’s entry into The Volvo Ocean Race Alteryx and Tableau Challenge we posted earlier. You can see all of our attempts here: The Volvo Ocean Race Tableau Challenge – and have a go with the data provided if you wish.
The last point I would like to make on symbol mapping is that you can use this method to add marks on top of an image that is not a map. Using the Adding Images page from the help file (Adding Background Images) you can specify an X and Y axis over the image and use this to build your visualization on. I could not find any good examples to link here, but if I find any, or if you know of any, please let me know and I’ll post them here for everyone to see.
Before we get into the meat of the blog I wanted to give you a short test: see if you can guess where the data used in the visualisation below came from. I removed the axis labels to make it harder. I’ve also highlighted one series, but at random – highlight another and see if you can work out the dataset…
Stocks and shares, right? You know what that share that’s dropping is? That high flyer? Now hit F5 to refresh the page and watch the data…
I’m sorry to say that this was all randomly generated data, not stocks and shares – I made it all up. Each line started off at the same value and I gave it 100 random movements (1 point up, 1 point down or no movement – all equally likely) before showing you the chart. Want to check? Here are the axes and the full chart (the chart above starts at x = 100).
So next time you’re telling yourself you’re onto a sure-fire stock market winner, or a “can’t lose” streak at the roulette / blackjack table (Vegas TCC 15 anyone?), just double check that you’re not looking at a random increase. I find it amazing how different each of these lines of data is after just 100 generations of randomness.
It’s randomness in Tableau, and specifically generating random numbers that I want to explore in this post.
Why Generate Random Numbers in Tableau?
Before we get into the HOW, let’s explore the WHY. The main reason for introducing randomness into a dataset might be to “jitter” data points in the view. Steve Wexler of Data Revelations has already written on this subject, and I recommend his excellent article for details of one approach. However, another approach, for when using INDEX() isn’t appropriate, might be to use a random number. We’ll visit one particular use case later in this article.
Secondly, you may wish to model processes that include a random probability or chance; if you do, then obviously random numbers offer an approach.
Aside from methods using RAWSQL functions or SCRIPT functions to call out to SQL/JET and R respectively, there is no function that will allow you to bring a random number into your Tableau workbook. Instead you’re going to have to use a random number generator such as a linear congruential generator (LCG) – a pseudo-random number generator that is incredibly simple because its algorithm is linear.
You can read about LCGs here, and I will also show you an implementation stolen from the Tableau genius that is Joshua Milligan (author of the blackjack game referenced above – I advise you to check out his Tableau Public visualisations).
The actual random number calculation is recursive – the calculation takes its previous value as one of its inputs – using PREVIOUS_VALUE, a table calculation:
Random Number (one method – many variants involving different values exist)
In the calculations above, [Seed] could be anything to start off the series, but I have chosen a completely random seed based on the date and time; this ensures a different random number series each time. I could instead have used a fixed number or a parameter to control the series, or to give the user control; if we follow this route the same [Seed] will generate the same series of random numbers, allowing repeatability.
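As a concrete sketch, the calculated fields might look like the following. The field names and the multiplier/modulus constants (the classic glibc LCG values) are my own illustrative choices, not necessarily those used in the workbook above:

// [Seed] – a "random" starting value derived from the current time
(DATEPART('second', NOW()) + 1) * (DATEPART('minute', NOW()) + 1)

// [Random Number] – recursive LCG; PREVIOUS_VALUE returns [Seed] on the
// first row and the previous result of this calculation thereafter
(PREVIOUS_VALUE([Seed]) * 1103515245 + 12345) % 2147483648

// [Random 0-1] – normalise the result into the range 0 to 1
[Random Number] / 2147483648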
Implementing a Random Number for “Jittering”
To show you how to implement jittering I want to return to an old post of mine, Health Check your Data using Alteryx and Tableau. In that post I showed a “DNA” profile of data; I now want to show you an alternative method using jittering. In this case using INDEX() wasn’t appropriate, as the row number was quite possibly related to the data type, so I used a random number.
Though here I used another version of the LCG formula (just for fun):
I also created a seed and integer version as I detailed above, then added the integer version as a continuous row after my existing column names. Hiding the axis of this “jitter” row then left me with what I needed (after some formatting) – click below to see the jittered result. Backwards engineer the viz to see the exact details (a highly recommended way of learning).
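For reference, a minimal sketch of that integer version, assuming the [Random 0-1] field from above (the field name and scale are mine):

// [Jitter] – integer version of the random number, placed as a
// continuous pill on Rows to spread overlapping marks apart
INT([Random 0-1] * 100)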
More Random Adventures
Doing random stuff in Tableau is how I get my kicks, so to add a bit more randomness to this post here’s a video of a visualisation I built that I call “Tableau Life” – I’d like to think this is how new features get propagated in Tableau.
As a bit of fun while you’re watching this try and guess how many rows of data were used in making this visualisation – answer at the bottom of this blog.
Building this visualisation was a challenge and fun, but perhaps a little too complicated to explain as part of this post; it’s probably enough to say I took inspiration from the inspirational Noah Salvaterra and his amazing fractal images in Tableau. Take a look at his blog post to explore his methods; mine are fairly similar (if less advanced).
My approach was to create two sets of random numbers to check for + and – movement (or none) on each axis (an equal chance of each), and then to iterate across an X value – for the path – and a Y value – for each “node”. The result was a set of random walks, which you can play with and recreate here:
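A minimal sketch of one such movement calculation, assuming the [Random 0-1] field from earlier (the field names are mine; the thresholds give each movement an equal one-third chance):

// [Step] – map a 0–1 random number to -1, 0 or +1
IF [Random 0-1] < 1/3 THEN -1
ELSEIF [Random 0-1] < 2/3 THEN 0
ELSE 1
END

// [Walk] – the cumulative position along the axis
RUNNING_SUM([Step])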
How many rows of data? The answer is only 2! Here’s the dataset I used to create both random datasets in this post – the random “stocks and shares” and the “random walk”.
If I’ve just blown your mind then I suggest you read Noah’s post, download my workbooks and backwards engineer them. Welcome to the world of Tableau – the rabbit hole just got deeper. If you have any questions on this – I have left it unexplained as it is a little off the beaten track for most Tableau users – then please tweet me @ChrisLuv or comment below.
We often get asked about how to start influencing people who just want the numbers, to start moving them towards more visualised dashboards. For me, that answer includes Highlight Tables.
Let’s remember that data visualisation is focused on letting users of a dashboard analyse their data to find the meaning and story in it quickly and easily. Highlight Tables do exactly as their name suggests – they add highlights that let the user read the table more quickly.
Find the highest value in the table below:
Both tables are the same, just one is using colour to give visual clues as to where to look.
Colour in Highlight Tables
Colour is really where the magic is added within a Highlight Table. However, the ability to add that ‘magic’ means you have to think carefully about your colour choices. So what are those choices?
Andy Cotgreave (Tableau and gravyanecdote.com) advocates showing your visualisation in grey first of all to allow you to see what is going to fundamentally stand out. I like this technique in Highlight Tables to work out what your table is showing before you apply the storytelling power of colour.
Colour adds a greater intensity than the greyscale as we can see by adding blue to the same table.
But there are limitations with this. If there are negative figures, they don’t stand out to the consumer of your visualisation as they lack a presence on the screen due to the absence of colour. That leads us nicely to our second option.
Adding a second colour gives the reader more visual clues about what the table is showing. In the version below, I have used red to show negative values and black for positive values.
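One hedged sketch of how that encoding can be driven – the field name and measure are my own assumptions, not from the original post – is a discrete calculated field dropped on Colour, with black and red assigned to its two values:

// [Pos/Neg] – discrete field separating negative from positive values
IF SUM([Profit]) >= 0 THEN "Positive (black)" ELSE "Negative (red)" END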
Tim from The Information Lab notes this is a great technique for financial highlight tables, as it lends itself to leading the user to see whether something is in the “red” or the “black”.
This table lets me see the negative figures pop out in the same way that I see the profitable figures. Therefore, if I am trying to understand both the highest and the lowest values, a diverging colour set works nicely.
Adapting Highlight Tables
There is a lot more you can do with Highlight Tables by adding additional Dimensions in Tableau to the visualisation. But just like anything in visualisation, adding more often does not mean better. Be careful with the level of granularity you give your consumer: if they have to wade through hundreds of rows of data, it’s going to be harder to visually analyse.
We have also pulled together this video guide to help you see more about the techniques mentioned in this post.
Hopefully this is a useful guide to how you can start your audience on their journey to visual analysis if they are currently reluctant.
It is slightly ironic that while preparing last week’s post for the Show Me How series on Heat Maps in Tableau I was also preparing this rather more complex post on another form of heat mapping in Tableau – this time in the form of choropleth maps.
Heat mapping in this sense is straightforward in Tableau – polygon datasets or points take just a few clicks – however it can be difficult to achieve any further geographic analysis such as thematic gradients. In this post I want to explore how I have worked around that problem using Alteryx, and show you how you can use a simple web application built in Alteryx to do the same.
Easy in Tableau
(stay tuned to our Show Me How series to find out how to produce these)
Hard / Impossible
The thematic gradient map below is difficult to achieve, especially where we only have point data. It shows areas of high concentration as deep red and areas of low concentration as blue – hence the heat map naming (hot -> cold). In this case the map below actually shows temperature, but similar maps can be used to show population, etc.
Where something is hard or impossible in Tableau then I like to take it on as a challenge.
Alteryx to the Rescue
Many of our regular readers will be familiar with my love of Alteryx and what it allows me to do with Tableau that just wouldn’t be possible otherwise. This is even more true in the geospatial world: Alteryx beats any other piece of BI / analytics software you care to name hands down when it comes to producing data for mapping.
If we have some points of data we want to convert into a heat map, the process is actually quite complex. Here are the steps:
1. Build a grid of data points; the size of the grid squares depends on the resolution of the map needed.
2. For each grid square (X):
a. Find the data points within that grid square X and sum their “heat” (e.g. sum their population)
b. For the neighbouring grid squares, add their data points’ heat, reduced by a factor that is inversely proportional to the distance from grid square X (see the illustrative weighting below).
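(For illustration only – the exact falloff the Alteryx tool applies isn’t shown here – the weighting might take a form like: heat(X) = sum of value(p) for points p inside X, plus sum of value(p) / (1 + dist(p, X)) for points p in neighbouring squares.)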
Imagine doing this for the many thousands of grid squares that might make up the UK or US and you can quickly see why it may be difficult and time-consuming. I’ve flirted with the table calculations needed and it’s not straightforward, especially if you don’t have a pregenerated grid of data – which in most cases we don’t!
Thankfully a lot of this logic is already written into the “Heat Map” tool in Alteryx; however it only creates 5 polygons containing points of similar “heat”, for use as shape files or reports in Alteryx – useless from a Tableau point of view.
Customising Alteryx Tools
One of the many things I love about Alteryx is that many Alteryx tools are in fact “macros” – meaning we can customise and edit them – so we can edit the Heat Map tool for our purposes. That is exactly what I have done. The macro was well annotated, so I removed the pieces I didn’t need – mainly a case of simply removing the “Tile” tool and the latter part of the module – and saved a custom version. I love that Alteryx makes it that easy to steal and customise their experts’ work.
Then I imported my data and tested it; here was my test module:
The result for this test file in Tableau was a start, but with more work I knew I could improve it. As you can see, each point is replicated as a small dot in Tableau – like a pixel.
A Real Use Case
To really test out this macro I wanted to put it through its paces with some real data, so I downloaded data from police.uk of every crime recorded in the UK from November last year. I then broke out the results by crime and used a “batch macro” in Alteryx to run the process multiple times, looking at differences weighted by population density (so it didn’t just show a map of population, as can happen when showing frequency of occurrences), and produced the data for several categories – which was then combined into a single file and visualised in a Tableau workbook. Click to see the workbook on Tableau Public.
You can download my macro and modules for the Test data here.
Build it yourself via this Free Alteryx Web Application
If you want to experience Alteryx yourself you can use the free, easy-to-install 14-day trial, but you can also experience the power of Alteryx via their web gallery. Just register and you can use the app I’ve put together to generate the data behind these maps from your own data, simply and easily. Give it a try here – fill in the details and run it to get your tde of heat points ready to use in Tableau; use the workbook in the link above for details of how I then generated the viz in Tableau:
Enjoy creating your own heat maps, and please pass on any feedback via Twitter (@ChrisLuv) or in the comments below.
Continuing our “Show Me How” series in this post we look at Heat Maps in Tableau.
Heat Maps are a very quick way to get a high level look at a couple of measures or KPIs at a glance across a range of dimensions. They do so by encoding measure values in two ways:
1. The SIZE of a square, i.e. its area
2. The COLOUR of a square
By doing this they allow a viewer to quickly see the relative performance of those two measures across a group.
Having said this, I have a problem with Heat Maps in Tableau: using the size of an area to encode information is a relatively poor choice when it comes to precisely comparing the relative values of two measures. With this in mind I tend to use a bar rather than a square in Tableau, as we are better at comparing the lengths of bars. However, having asked my colleagues how they use Heat Maps, the general answer came back that they are good as a very quick overview, perhaps with just a few dimension values, allowing a quick comparison before drill-down (via action filters) to more precise visualisations.
Heat Maps can be produced via Show Me when there are one or two measures in the view.
When there is one measure then the default encoding applied by Show Me will be on SIZE, as below (typically Show Me will shrink the size of the cells so I’ve enlarged them for the image below).
You can quickly see that the South is doing poorly relative to the other regions, and the Corporate Segment is doing very well in the Central Region; these kinds of comparisons are easy. However, as you can see from the legend, comparing precisely how much is difficult. How much better is West than South in the Corporate Segment? Approximately five times is the answer – could you judge that from this chart? So, as mentioned previously, avoid leaving the user to make those comparisons: use this chart as a brief overview, supported by drill-down, etc.
In the image below I’ve used Show Me with two measures: here Sales is shown on SIZE and Profit on COLOUR. The result means I can see both measures at once with ease, e.g. Sales and Profit are both high in the Central Region in Corporate.
Setting up these Heat Maps is easy without Show Me simply by dragging measures to the Size and Colour buttons. See the video below where I walk through how to create these visualisations quickly and easily.
I’ve said a couple of times that I prefer to use a bar chart in this instance, but how might you do that? Simply change the mark type to Bar. Consider the below – it’s a very small change, but it makes comparisons slightly easier in my opinion:
Below are my top tips for beginners to help you get the most out of Alteryx and start getting answers in minutes not hours.
Turn on Connection progress to see record counts across your workflow
While connection progress shows when you run a module, it’s often very useful to see it all the time, so make sure you change your properties to show it permanently. To do so, bring up the properties for your workflow (click on the background of your canvas) and select the Show option under “Show Connection Progress”.
Use Dynamic Rename to sort out header issues
Do you ever get data with the headers in the second row rather than the first? Or receive files with no headers and a separate field layout? Slightly hidden away from beginners in the Developer tools is the Dynamic Rename tool – it has a couple of modes that can be very useful in this situation. Download a sample here.
Use a record limit or yxdb in your module while developing with databases to cut down load times
If you’re working with a large database it can be slightly frustrating to wait for large datasets to download into Alteryx every time you run your module. Either use the record limit option in the Input tool (shown below) to temporarily limit the data you’re downloading, or export the database to a .yxdb file and use that while building the workflow, then switch back to the database to pull down the live data when you’re happy with it.
Use Containers to hide and disable investigative workflows
When you’re doing some predictive analysis, investigating the best model and which variables to include, it’s tempting once you’re done to save the module away somewhere and start again elsewhere. However, if someone picks up your module they won’t have the full story, so use Containers (drag and drop tools onto them, then minimise) and you’ll keep all that important investigation in the module where anyone can see it.
Only keep what you need
Don’t carry baggage around that you don’t need: make sure you cut down data as you go using Select tools (or the inbuilt Select options in the likes of the Join and Spatial Match tools). This will speed up your module, in some cases dramatically. Pay particular attention to polygon fields, and ensure you’re not duplicating these many times after a Spatial Match – that can create many GBs of data (the record counts from tip 1 can be useful here).
Use Multi-Field Formula to work with many fields at once
The Multi-Field Formula tool can really shortcut a formula you need to perform many times. E.g. if you need to clean up NULL records in your data, you can write:
IF ISNULL([_CurrentField]) then 0 else [_CurrentField] endif
The special [_CurrentField] placeholder will work across every field you’ve selected in your dataset and run the formula, changing NULLs to zeros in each instance.
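In the same spirit – a hypothetical example of my own rather than one from the original tips – the same tool can tidy every selected text field in one pass:

// Trim stray whitespace from both ends of every selected string field
Trim([_CurrentField])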