Information Technology and Services | York, United Kingdom, GB
The Information Lab is a team of passionate Tableau software professionals. We are one of the longest standing Tableau partners in the UK with experience of all aspects of the Tableau product suite.
Our team are skilled at working with data and we are all certified Tableau consultants.
Like most Tableau evangelists, I first discovered the power of Tableau through constant frustration with data report authoring in classic spreadsheet applications. Since then I've thoroughly enjoyed helping businesses see and understand their data, as well as making full use of Tableau Public to bring public datasets to life.
Specialties: Tableau Data Visualisation, Dashboard Design, SQL, Excel
2011 - Present
CTO / The Information Lab
Helping clients implement, create and understand exciting dashboards with Tableau data visualisation software. Whether it's training or authoring, my goal is to help people make sense of data.
Alteryx has a vast number of tools, and it’s easy to miss some functionality that might be useful, so for this new series of blog posts we’re going to take readers through three tools per blog post, detailing functionality as well as hints and tips for each tool. This is part two; you can find part one here.
The Filter Tool
The Filter tool queries records from the input against specified criteria. Two outputs are created – True and False. Criteria are set in the configuration box by creating an Expression.
The tool is extremely useful in the data preparation stage. I’m going to use the National Student Survey data as an example. This data set contains survey results for all Higher Education institutions. But what if I wanted data just for my particular institution? Then we can use the filter tool to do just that.
Select your data file using the input tool, then connect up the filter tool. I used the Expression [Institution] == “University of Manchester”. In the True output I get all survey results for that Institution, and in the False output I get everything that does not match the expression.
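The same True/False split can be sketched in pandas (the data below is illustrative; Alteryx does this inside the Filter tool):

```python
import pandas as pd

# Hypothetical survey rows mirroring the NSS example above.
df = pd.DataFrame({
    "Institution": ["University of Manchester", "University of York",
                    "University of Manchester", "University of Leeds"],
    "Score": [4.1, 3.9, 4.4, 4.0],
})

# The Filter tool's two outputs: rows matching the expression (True)
# and everything else (False).
mask = df["Institution"] == "University of Manchester"
true_output = df[mask]
false_output = df[~mask]

print(len(true_output), len(false_output))  # 2 2
```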
The Sample & Random % Sample Tools
The Sample tool selects only a specified portion of the records from the input. The choices are quite broad ranging too: from the First or Last N Records, to selecting a truly random set of records.
The Random % Sample tool allows you to select a random number of records, or a random percentage of records.
There are some great use cases for this when carrying out surveys. You may have a dataset of hundreds of thousands (or more) of contact details, but you only want a random sample of them. Using the Random % Sample option in the configuration you select the random number or percentage of records to select. The output could then be your contact list.
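The two sampling options can be sketched with pandas (a sketch of the behaviour, not the Alteryx implementation; the contact list is hypothetical):

```python
import pandas as pd

# Hypothetical contact list of 100,000 records.
contacts = pd.DataFrame({"contact_id": range(100_000)})

# Random N records (the "random number of records" option):
survey_batch = contacts.sample(n=500, random_state=42)

# Random percentage of records (the "random percent" option):
ten_percent = contacts.sample(frac=0.10, random_state=42)

print(len(survey_batch), len(ten_percent))  # 500 10000
```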
Date Time Tool
The Date Time tool will standardise date/time data for use in other tools downstream, or it’s a great way of formatting your dates before you plug your data in to Tableau.
The tool works both ways – you can convert a date/time field to a string, or you can convert a string to a date/time. Using the list of date/time formats in the help files you can convert pretty much any format.
In the example I have connected some data I have exported from Runkeeper. However, the time field is text and annoyingly has characters in it (e.g. 2013-08-17T08:30:13Z). I just want to pull the date out of this field. Connect the input to the Date Time tool and select whether you are converting from Date/Time to String or vice versa. You then type in the field name you wish to format, and the format that the field is in. The output includes a new field with lovely parsed dates in.
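The string-to-date conversion for that Runkeeper timestamp looks like this in Python (the format specifier plays the same role as the one you type into the Date Time tool):

```python
from datetime import datetime

# The Runkeeper timestamp from the example above.
raw = "2013-08-17T08:30:13Z"

# Parse the full string with a format specifier, then keep just the date
# (the equivalent of the Date Time tool's string-to-date/time conversion).
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
print(parsed.date())  # 2013-08-17
```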
The alternative to the Date Time tool is to use a Formula tool with a date/time calculation. I have also included this in the attached workflow. This can be a bit trickier, but doesn’t limit you to the formats set in the Date Time tool.
One thing it makes good sense to do when choosing a new BI/Analytics tool is to check out what’s happening in the market. What tools are your peer companies choosing? What do the analysts say?
With that in mind I’ve created a couple of visualisations using Tableau intended to help you. The first of these shows the growth of license revenue in recent years with particular emphasis on what are termed the ‘Business Discovery’ vendors. The second is an interactive take on the Gartner Magic Quadrant.
I would suggest that there are two ways of engaging with this blog post. The first would be to carefully analyse the relative positions of all of the vendors in the charts below and then select the one that looks like the star performer. Alternatively you could just ask the salesperson of any given vendor (as I am) to reproduce the Gartner MQ using their product.
You’ll find that you’ll arrive at the same place whichever approach you take!
Alteryx has a vast number of tools, and it’s easy to miss some functionality that might be useful, so for this new series of blog posts we’re going to take readers through three tools per blog post, detailing functionality as well as hints and tips for each tool.
The Unique tool is a Data Preparation tool, and as you can imagine from the name its function is to return a unique dataset for the fields provided. I think a clearer name for the tool would be “First Distinct”, because that’s essentially what it does. For the field(s) selected, the tool groups the rows based on those fields (sorting on them in the process), then takes the first row of each group into the U (Unique) stream; the remaining rows go into the D (Duplicate) stream. This means that sorting is incredibly important if we want to control the output. For example, if we first sort a recordset by a Profit field descending, then run a Unique on Product Category, we get the record with the highest Profit for each Product Category.
Remember that if we simply want to know the maximum Profit per Product Category then we could use a Summarize tool: grouping by Product Category and selecting Maximum Profit will produce that result. The difference is that the Unique tool keeps all the fields, while the Summarize tool only keeps the aggregations you specify.
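The sort-then-first-distinct pattern, and how it differs from a Summarize, can be sketched in pandas (illustrative data; not the Alteryx engine):

```python
import pandas as pd

# Hypothetical sales rows.
df = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Technology", "Technology"],
    "Product":  ["Chair", "Desk", "Phone", "Laptop"],
    "Profit":   [120, 300, 80, 450],
})

# Unique tool behaviour: sort first, then keep the first row per group.
# All columns survive, so we get the full best-profit record per category.
unique_stream = (df.sort_values("Profit", ascending=False)
                   .drop_duplicates(subset="Category", keep="first"))

# Summarize tool behaviour: only the grouping key and aggregate remain.
summarised = df.groupby("Category", as_index=False)["Profit"].max()

print(sorted(unique_stream["Product"]))  # ['Desk', 'Laptop']
```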
This is probably the tool we get the most support questions on; it’s a complex tool and it isn’t immediately obvious how to use it. As such it requires more explanation than probably any other tool in the Data Cleansing / Preparation categories.
We’re likely to reach for Fuzzy Matching when we have a dataset which needs to be string-matched against some kind of reliable “master” dataset, but the strings don’t provide an exact match – we call this a “Merge” match in Alteryx (here we don’t care about looking for duplicates within the master file). Alternatively you might have one dataset and want to find all the duplicates across it, keeping one master record’s details – this is a “Purge” match. The data might be Product Names, Company Names, or Name and Address; the approach is the same. Crucially, we can tweak the matching algorithm to get the best match for our use case. For example, if I’m matching customer names to a mailing list to send an offer or promotion, I might accept a looser match than if I’m, say, approving them for a loan.
The first thing you’ll notice when you drag the tool onto the canvas is that it only has one input, which immediately seems counter intuitive. How can I match my data if I only have one input? Well in requiring a single input stream Alteryx have actually given us limitless options in how we do our matching, and it’s this flexibility that makes the tool appear daunting to new users. However by approaching the problem step by step we can remove the complexity.
Before the Fuzzy Match Tool
Step 1: Get rid of any identical matches using the Join tool (or the Unique tool if you only have one dataset); this will aid the matching process by reducing the number of possible matches.
Step 2: Build a single dataset – add a column to each dataset to give each source a unique identifier if you are doing a merge match – then use the Union Tool to combine your source datasets with your master data.
Step 3: Give each row a unique ID field using the Record ID tool. Top tip: given that the tool returns these IDs, if you’re matching on just one field I find it helps to create an ID incorporating the field you want to match – that way it’s easy to read the results.
Step 4: Connect the Fuzzy match tool and specify the RecordID field. Also specify whether you are doing a merge or a purge, as defined above. In a merge you will need to specify the source id field.
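Steps 2 and 3 can be sketched in pandas (the company names are hypothetical; in the real workflow the Formula, Union and Record ID tools do this):

```python
import pandas as pd

# Hypothetical master and candidate datasets for a merge match.
master = pd.DataFrame({"CompanyName": ["Acme Ltd", "Globex PLC"]})
candidates = pd.DataFrame({"CompanyName": ["ACME Limited", "Globex plc"]})

# Step 2: tag each source with an identifier, then union into one stream.
master["Source"] = "master"
candidates["Source"] = "candidate"
combined = pd.concat([master, candidates], ignore_index=True)

# Step 3: a RecordID that embeds the match field makes results readable.
combined["RecordID"] = [f"{i}_{name}"
                        for i, name in enumerate(combined["CompanyName"])]

print(list(combined["RecordID"]))
```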
Finding the right match algorithm is an iterative process, likely to be dependent on the data you are feeding through the tool. Once you’ve connected to the Fuzzy Match tool your first step is to decide which fields to create your match on. If it’s just Company Name then you’re likely to have only one match field, but if you have Name, Address and Postcode you might have several: First Name, Surname, Address1, Address2, Town, County, Postcode. Once you’ve decided, follow this process for each field:
Step 5: Add the match field to the “Match Fields” selection at the bottom of the tool.
Step 6: Choose a match style. There are some predefined ones, so pick the most relevant one for your match. If you’d like to understand the matching algorithm or tweak it then you can hit “Custom” (and we would recommend you do, to get the most benefit out of the tool – see the Appendix of this post for details of what you will find and how the matching is performed).
After the Fuzzy Match Tool
Step 7: Out of the tool will come pairs of Record IDs showing which records match each other; a match will only appear once – you won’t get the reverse match shown. Unless you made the RecordID include the match field (as suggested in Step 3), you will then need a couple of joins to rejoin the original data and work with the results.
There’s a lot, relatively speaking, to getting the Fuzzy Match process working compared to other Alteryx tools, but it’s a very powerful tool. Check out the Appendix for ways to tweak the results and build a very powerful custom algorithm.
The Make Group tool identifies groups within your dataset, and works very well in combination with the Fuzzy Match tool. You won’t use it much – interestingly, this isn’t a tool you’ll find anywhere in the v8.6 Samples – but it’s worth having in your back pocket.
The tool looks across pairs of matches and forms them into groups, e.g. if A links to B and B links to C then A, B and C will all be in the same group. This is useful in fuzzy matching because you are going to get paired results, and you’ll want to form them into groups to identify where to purge the results. For example, if Alan Smith matches Alan Smythe, who matches A Smythe, but Alan Smith didn’t match A Smythe directly, it’s likely they are all duplicates.
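The transitive grouping the Make Group tool performs can be sketched with a small union-find structure (a sketch of the idea, not Alteryx's implementation):

```python
# Transitively merge matched pairs into groups using union-find.
def make_groups(pairs):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)  # union the two groups

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

# The example from the text: A-B and B-C link all three into one group.
matches = [("Alan Smith", "Alan Smythe"), ("Alan Smythe", "A Smythe")]
print([sorted(g) for g in make_groups(matches)])
# [['A Smythe', 'Alan Smith', 'Alan Smythe']]
```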
Let’s look at these tools in action in a 20 minute example video.
There are several elements to the matching process, the most important are the Generate Keys and Match Style.
This is the first “rough” match that is then further refined later. Each record is assigned a key and if two records have the same key they will be matched. The options are
None: This field will not be used in deciding which records match – use this if you are using multiple match fields but want to ignore a field in the key generation phase.
Digits Only: Use only the digits to form the key, e.g. useful for phone numbers which may contain punctuation or spaces.
Digits Only Reverse: This is useful where phone numbers may contain a country code, combined with the maximum key length you can ensure you only match on the last x digits of the phone number.
Double Metaphone / Double Metaphone with Digits: A useful algorithm that can match words based on what they sound like, so it can match even where there are spelling errors. It works phonetically, reducing words to 12 consonant sounds.
Soundex / Soundex with Digits: Another sounds-like algorithm, useful for European names.
Whole Field: Only records where the entire field matches will be matched. Case must also match.
Whole Field (Case Insensitive): Only records where the entire field matches will be matched. Case is ignored.
Alphanumeric Only: Looks only at alphanumeric characters to make a match. Case must also match.
Alphanumeric Only (Case Insensitive): Looks only at alphanumeric characters to make a match. Case is ignored.
Generate Keys for each word: Ignore the order of words, e.g. University Birmingham and Birmingham University will match.
Maximum Key Length: Specify the maximum length of the key to consider for the match
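Two of the key-generation styles above can be sketched like this (assumed behaviour for illustration, not Alteryx's exact implementation):

```python
import re

def digits_only(value, max_len=None):
    # Keep only the digits to form the key.
    key = re.sub(r"\D", "", value)
    return key[:max_len] if max_len else key

def digits_only_reverse(value, max_len=None):
    # Reverse the digits so a maximum key length keeps the LAST n digits,
    # which ignores any country-code prefix.
    key = re.sub(r"\D", "", value)[::-1]
    return key[:max_len] if max_len else key

# Two phone numbers that differ only in country-code formatting:
a = "+44 (0)115 496 0123"
b = "0115 496 0123"
# Reversed keys capped at 10 digits match; plain digit keys do not.
print(digits_only_reverse(a, 10) == digits_only_reverse(b, 10))  # True
print(digits_only(a) == digits_only(b))  # False
```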
The Match function is the actual matching process that does the bulk of the work, after the rough cut of Key Generation. Records are only compared where keys match.
None – Key Match Only: Looks only at the Key Generation specifications.
Levenshtein Distance / Jaro Distance : Two measures of similarity between two strings. The Jaro Distance is more forgiving than the Levenshtein Distance.
With a character-based match the whole string is compared; use a word-based match to compare individual words (i.e. the order of words within the string won’t matter).
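For intuition, here is a minimal character-based Levenshtein (edit) distance, one of the two similarity measures named above: the number of insertions, deletions and substitutions needed to turn one string into the other (a textbook sketch, not Alteryx's code):

```python
def levenshtein(s, t):
    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(levenshtein("Smith", "Smythe"))  # 2 (substitute i->y, insert e)
```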
Use this setting to set the tolerance of the match. We tend to recommend returning the score using the options on the Fuzzy Match tool, then sorting by the score and viewing the data to determine the best threshold to use. If using the match threshold, ensure the overall threshold in the main settings for the tool is also set to a similar value; these settings work independently and so both will limit the matches returned.
I’m going to return to the concept of Radar Charts today, with part 3 of the series – you’d never imagine that we could keep talking about such a Tableau-taboo chart type, but we have!
To refresh, in Part 1 I showed how you could use a background image for the ‘web’ of the chart, whereas in Part 2, Chris used Alteryx to create the background image. I was considering showing how you could create the web in R, but Phillip Burger made that unnecessary.
So, where can we go from here? The obvious step is to do away with the background images, and bring the web creation into Tableau itself. Sounds simple, and in theory it is. However, I also wanted to make something that could easily be picked up and dropped into any workbook without requiring a high level of skill to implement. I wanted the solution to have that “Tableau” feel to it, which is a concept I have come to appreciate more and more recently.
So without further ado, I present my solution – I’ll go into more detail after. One point – since I’m producing a tool, I’ve purposefully used fake data, so don’t try to gain great insight into the cars!
Still here? Hopefully the viz has sparked your interest. Next steps – how it is made and how to use it
How the viz was made
The first point is obviously the data – without the right data, the viz can’t happen. The method I went for is to separate the background data and the chart data in an excel file, so that the chart data can easily be updated without worrying about breaking the background image. The two strands of data are then stitched together with custom SQL so that we have a single data source in Tableau that contains all the data.
The sharp eyed among you may have noticed that the image above shows the underlying data containing far more radials than the resultant viz. This is to allow the radar chart to automatically grow to as many metrics are included in the data, up to a total of 20 radials. There are two parts to the background – the radials and the rings.
These exist on the zeroth ring, and use the Spindle Point dimension to go from 0 (centre of the chart) to 1 (110% of the extent of the data). The number of radials depends on a count of the number of metrics that are present in the view at any one moment. This is done by using one table calculation to determine the number of metrics for each of the data dimensions, and then we wrap that in another to allow all elements of the viz to access this number.
Once we know the number of radials we are plotting, we then need to know how to arrange them. Using the previously calculated number of radials, we can just divide that into 2π to get the angle between each one. Note that Tableau works in radians, so for those of you used to 360 degrees, sorry!
Using this, we can now work out the X and Y co-ordinate of the points on the radial line using basic trigonometry:
X = length * SIN(angle)
Y = length * COS(angle)
A line is then drawn for each radial and we move on.
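The radial maths above can be sketched in Python (assuming 6 metrics in view; angle 0 points straight up, which is why X uses SIN and Y uses COS):

```python
from math import sin, cos, pi

n_radials = 6
angle_step = 2 * pi / n_radials  # Tableau works in radians

for i in range(n_radials):
    angle = i * angle_step
    length = 1.0  # the outer end of each radial
    x = length * sin(angle)
    y = length * cos(angle)
    print(f"radial {i}: ({x:+.2f}, {y:+.2f})")
```

The first radial lands at (0, 1), i.e. straight up, and the rest fan out clockwise at 60-degree intervals.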
The next step is to add the rings that connect the radials. The last ring sits at 100% of the maximum of each measure, and the rest of the rings are equally spaced between the maximum and the centre. The number of rings is determined by a parameter, which is currently surfaced on the viz. To make the calculations easier, the rings don’t need to know what the maximum value is, as the length for the maximum is fixed at 1, so the length for each ring is just the ring number divided by the total number of rings.
Using the same trigonometry calculations as before, we can get the X and Y co-ordinates that we join up for each ring.
Finally, we move on to the data. The idea is to map each metric onto the corresponding radial. To do this we need to number the metrics, but as we want to be able to filter the metrics on the fly, we need to make the numbering dynamic, after the filtering. This means table calculations, and in this case, we turn to the INDEX function.
IF ISNULL(MIN([Radial])) THEN INDEX() END
Utilising similar steps as for the background, we use the metric number to work out the angle, the metric value vs the max value to work out the length, and then the angle and length to provide the X and Y coordinates.
These points are then used to create a polygon for each dimension.
Putting it all together
Using the combined X field and the background Y field, we start off by creating the background lines, and then dual axis this with the Y field for the metrics.
Things are looking good, but we need some additional tweaks to bring a bit more polish.
The metrics are added as a dimension filter
The dimensions are added as a table calculation filter. This is to allow dimensions to be selected without resetting the maximum value for each metric
Labels to indicate what the maximum for each metric is. This is a little tricky, with each background radial looking for the maximum for the corresponding metric, and what that metric is called
How to use this viz
As I mentioned earlier, this was designed to be easy to move into another workbook. The way to do this is to modify the underlying data.
Download the twbx workbook from Tableau Public.
Download the Radar Chart Data and open the Excel file. Do not touch the Radials or Rings tabs. Modify the data within the Data tab without changing the structure, though you can alter the number of metrics for each dimension.
Save the Excel file.
Open the workbook. Try to refresh the extract. At this point it will ask you to provide the new file location. Just point it at your modified excel. The extract will be refreshed, and the dashboard will show your data on a radar chart.
I hope that was clear and that now if you are asked to produce a radar chart, you can have this sitting up your sleeve for an easy win. Of course, this again doesn’t cover the fact that a radar may not be the best way to visualise the data, but this was more about being able to easily make what you need to for your audience. And hopefully I succeeded in making it have the Tableau feel.
And there we go. Radar Charts done in Tableau. Let me know what you think.
Often people think of Dual Axes in Tableau as a way of showing different measures on the same chart, but they can be just as effective, if not more so, in adding an extra level of information or detail to a chart in a more interesting, or clearer, way.
As an example let’s take a look at the Gantt Charts below, they show predicted and actual times for several stages of a project. The second is a lot clearer and enables us to quickly look at the actual vs target/predicted times for each stage.
Similarly, dual axes can be used to overlay information such as averages onto a more detailed, granular level of information – as in the case of the scatter plot below, where the averages per region are shown by the transparent bubbles.
Putting together dual axes charts like the above is straightforward: simply pick up the pill you want to duplicate while holding the control key and drag it next to itself on the relevant shelf. Then right-click on the second, new pill and select the “Dual Axis” option. Straight away you should also right-click on the new axis and select “Synchronize Axes”; this ensures everything lines up.
You will notice now, by moving to a dual axes chart, you will have a second “Marks” pane for the second measure, as well as an “All” pane. You can use these panes to change how the two sets of marks are shown, the “All” pane will affect both sets of marks, while the others work independently meaning you can have different types, sizes, colours, etc.
Here’s some top tips for working with dual axes charts like this:
1. Remember to sync Axes – easily the biggest mistake when working with dual axes charts in this way is forgetting to synchronize the axes.
2. Mix up your marks – if things are too similar they will be confusing, ensure you use different marks and also use help text and/or tooltips to give the user guidance in what they are being shown.
3. Don’t add “non-data ink” – whatever you add should be to either give the user new information, or help them interpret the existing data, not just “bling”
4. You may need to reshape your data – e.g. in the first Gantt example we needed to use table calculations to create new dimensions and measures.
5. Use your imagination (and other people’s) – dual axes bring a new dimension you can apply to your visualisations so you can be as creative as you like, but looking at examples on Viz of the Day and Tableau Public and de-constructing them will help give you new ideas.
Long time followers of the Information Lab blog may remember Andrew Ball’s post - Radar Charts in Tableau – part 1 - back in May. It’s long overdue that we return to the subject and look with fresh eyes now we have Alteryx in our toolkit.
We’ve been documenting a lot recently how Alteryx can help with common Tableau data cleansing tasks and creating radar charts is no different. For the Top Trumps Viz below I decided to use radar charts to compare the different characters in the set of cards and their attributes.
Firstly I needed to standardise the value attributes, so I summarised the variables to find the minimum and maximum values and attached those back to the data; creating a standardised value between 0 and 1 was then simple using the formula:
Next I simply needed to number each attribute 1 – 6 according to the spoke they took on the radar and, remembering SOHCAHTOA from my high school maths, then write the following formulae in the tool:
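A sketch of what those two steps compute (my reading of the elided formulae, since the original screenshots aren't reproduced here: min-max scaling to [0, 1], then SOHCAHTOA to turn spoke number and value into coordinates):

```python
from math import sin, cos, pi

def standardise(value, vmin, vmax):
    # Min-max scale a value onto [0, 1].
    return (value - vmin) / (vmax - vmin)

def spoke_xy(spoke, n_spokes, length):
    # Spokes numbered 1..n; angle 0 points straight up.
    angle = (spoke - 1) * 2 * pi / n_spokes
    return length * sin(angle), length * cos(angle)

# Attribute 2 of 6, with value 30 on a 10-50 scale:
length = standardise(30, 10, 50)  # 0.5
x, y = spoke_xy(2, 6, length)
print(round(x, 3), round(y, 3))  # 0.433 0.25
```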
Andrew’s post lays out the rest of the details and so I won’t go into them here, here is my final Alteryx implementation to clean and build the data for the Viz:
The only things to add to Andrew’s post are that I created a dummy value (a hero called “Chris”), with all the values as 1, which made lining up the template radar picture in Tableau an easier process. It meant I could tweak the X, Y coordinates and get the radar exactly right – then hide the dummy row in my final analysis:
The “Top Trumps” Viz
The idea for the viz came from a game I was playing with my eldest son. I was trying to teach him some of the nuances of Top Trumps: he’s only 5 years old, so in his mind choosing the highest value (Height in this case) was the best way of playing, even when the height on that card was the lowest in the overall set of heights. I thought about ways of showing him the comparisons between the cards, and thus the viz was born. Thankfully the data was online and publicly available, so I was able to produce the viz in Tableau Public and share it.
Radar vs Parallel Plots
Radar charts were a choice I made based on my requirements: I wanted to see which attributes on my card are better as a whole, based on a quick, holistic view in comparison to other cards. Whether a parallel plot is better will, I think, be a matter of debate; on the whole it’s certainly easier to quickly compare a given attribute across the five chosen heroes, so I’ve also included an alternative view with a parallel plot for those who prefer to work with those.
Post title pun aside, when you bring Alteryx for data blending & advanced analytics, Tableau for the visual component, and Mapbox for custom maps into your data project anything is possible.
While sitting at home last week listening to the storm outside, hoping the roof would still be attached by morning, I took to Google to find data for hourly wind speeds in the UK. No dice. What I did find was an open weather API openweathermap.org which produces some fantastic live weather data from around the world. After some data capture I managed to produce the following dashboard, read on if you want to know how it was done.
Get the Data
Tableau doesn’t natively connect to custom APIs, let alone live ones, which leaves two choices for capturing the data: the Tableau TDE API, or our favourite data blending & advanced analytics tool, Alteryx. For this project I went with Alteryx, simply for speed and the ability to easily schedule the process using Alteryx Server.
Open Weather Data Alteryx App
The Alteryx app takes a list of cities from my SQL Server, queries each one in turn against the API, and then assembles the results back into the SQL database for analysis by Tableau. There are three different XML Parse tools in play, as the XML returned by the API has a number of layers, with each parse tool dealing with a different layer.
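One of those parsing layers can be sketched in Python (the element and attribute names below are illustrative, not the API's exact schema; the real workflow uses Alteryx's Download and XML Parse tools):

```python
import xml.etree.ElementTree as ET

# Illustrative sample of the kind of per-city XML the API returns.
sample = """
<current>
  <city name="York" country="GB"/>
  <wind>
    <speed value="14.9"/>
    <direction value="250" name="West-southwest"/>
  </wind>
</current>
"""

def parse_wind(xml_text):
    # Pull the city name plus wind speed and direction out of one layer.
    root = ET.fromstring(xml_text)
    wind = root.find("wind")
    return {
        "city": root.find("city").get("name"),
        "speed_mps": float(wind.find("speed").get("value")),
        "direction_deg": float(wind.find("direction").get("value")),
    }

print(parse_wind(sample))
# {'city': 'York', 'speed_mps': 14.9, 'direction_deg': 250.0}
```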
Tell the Story with Shapes
There really are only two primary pieces of information relating to wind: speed and direction. Speed’s no problem to encode in Tableau, and in this example I made use of both size and colour – the larger and more purple, the greater the speed. But what about direction? Clearly arrows are a great way to express direction, but without an angle/rotation shelf in Tableau another solution was needed.
When trying to figure out how to tackle these problems in Tableau I start to look at what I haven’t used on the marks card. I could switch to the polygon mark type and effectively draw whatever I like, but that seems like overkill, especially when the shape shelf is sitting there unused. The challenge is how to get the arrow to rotate, or at least look like it’s rotating.
Back when I created a viz to compare the ratings of Larry King vs Piers Morgan I decided to show an image of every guest… all 477 of them! Assigning them one by one simply wasn’t an option, so I had to use the assign palette function. What’s key is that Tableau assigns each shape in turn and lists the shapes in alphabetical order. So if you can get the shapes palette in the same order as your data, you can in effect assign 360 rotated arrows to their corresponding angles.
So how did I go about creating 360 rotated arrows? Again taking to Google, I found the command line image editor ImageMagick, for which I wrote a bat script to incrementally rotate the arrow and save the output to a new file. Boom – 360 rotated arrows that can now be assigned to the wind direction.
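The batch of rotation commands can be generated like this (a sketch using ImageMagick's `convert` with its `-rotate` option; the filenames are zero-padded so the shape palette sorts in angle order, which is what makes the assign-palette trick work):

```python
# Build one ImageMagick command per degree of rotation.
commands = [
    f"convert arrow.png -background none -rotate {deg} arrow_{deg:03d}.png"
    for deg in range(360)
]

print(len(commands))  # 360
print(commands[45])
# convert arrow.png -background none -rotate 45 arrow_045.png
```

Running each line (e.g. by writing them into a .bat or shell script) produces the 360 arrow images to load as a custom shapes palette.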
Take Mapping to the Next Level
About a month ago we announced our latest partnership with map delivery experts Mapbox. Amongst other things the new capabilities brought to Tableau mapping via Mapbox are a fully customised map palette and custom mapping layers. Within a few minutes I’d created a map which becomes part of the dashboard by matching the ocean colour to the dashboard background.
Thanks also to a well timed email from Paul Chapman (@cheeky_chappie) I had a shapefile of flood alert areas in England to add an extra dimension to the storm analysis.
Best of Breed
Working with (or for) The Information Lab today means being able to take advantage of three awesome, best of breed companies each bringing together everything you need to complete your data viz project.
SIGN UP TODAY for our FREE Alteryx introduction for Tableau users. By popular demand we’ve added two more one day training sessions with Tableau Zen Master Craig Bloodworth and Alteryx Grand Prix champion Chris Love.
This unique training opportunity is brought to you courtesy of The Information Lab.
Are you a Tableau user who has been wondering what all the recent chat about Alteryx software is all about? You may have seen their recent partnership with Tableau, or announcements about our partnership with Alteryx, or just noticed an increase in social media posts including the #tableau and #alteryx hashtags.
So what’s all this about? It’s about taking the best data visualisation software on the market, and combining it with the best data preparation tool on the market to provide the complete toolkit for today’s analyst. Users of BOTH these software products are the most self reliant data professionals we know.
As a Tableau user, you stopped asking for help with reporting and analysis a long time ago; Tableau users are self-reliant when it comes to data analysis. However, there are still times when you need to reach out for help… What if you want to join data sources of different types? Connect to data sources unsupported by Tableau (such as XML, KML, shape files, etc.)? Perform complex predictive analysis, or complex spatial analysis such as calculating drive times?
Alteryx to the rescue
We want as many Tableau users to understand the capabilities of Alteryx as possible, so are running a series of courses to make this happen. These courses are FREE OF CHARGE.
To give you the very best chance of understanding how Alteryx and Tableau come together, we’re running a full day session with some of the most knowledgeable Tableau and Alteryx users around.
Trainers on the day will include:
Craig Bloodworth, one of the original crop of Tableau Zen Masters, a certified Tableau trainer and regular contributor to the Tableau community.
Robin Kennedy, also a certified Tableau trainer and 4 year veteran of the Tableau world.
Chris Love has been using Alteryx for longer than he cares to remember. Chris is the current holder of the coveted Alteryx Grand Prix trophy, basically making him the best Alteryx user in the world.
Where and when?
Where: Imparando training centre, Commercial Road, London
When: 21st March 2014 & 25th April 2014 10am until 4pm (only one session per person)
Lunch: Provided, let us know if you have special dietary requirements
Hardware: All provided, but bring your laptop if you prefer to use your own machine
Recently I used the powers of Alteryx to make National Student Survey (NSS) results available as a Tableau .tde. Then I thought I would use the data set to create a viz.
I decided I wanted to create an explorer viz: where users could navigate their own way through the data. My end user in mind was a student who might soon be applying to University and wondering where to study. Users can see a league table of institutions, with the option to narrow down by subject group and/or subject. Hovering over an institution then brings up the option to see the results in more detail. And lastly you can then compare the detailed results with a second institution.
[click image to go to viz]
If you’re wondering how I was able to map the institutions, I did this by finding a data source online of HE names and postcodes and then using Alteryx to join these to my original data set. Then I joined a set of full UK postcodes with latitudes and longitudes (available from our Tableau mapping site) and output to .tde. If anyone is interested in the mechanics of this, please get in touch. Or maybe that could be another blog post… Of course if you want the data set with postcodes you can download the viz from Tableau Public and create your own analysis. If you do, please post a link to your work in the comments!
A client recently came to me with an interesting challenge. They wanted to embed Tableau Server dashboards in Salesforce (nicely demonstrated by Ellie Fields); however, instead of using Tableau Online they intended to install Tableau Server on an Amazon EC2 server alongside Amazon Redshift. Here’s where it gets difficult: it needed to be a seamless single sign-on (SSO) experience from Salesforce to Tableau using enterprise Active Directory (AD) usernames and passwords.
Clearly the username-password authentication challenge had to be handled by Active Directory in order to keep usernames and passwords consistent; however, until Tableau Server 8.1 our only option would have been to explore a custom trusted ticket authentication interface. With the introduction of SAML in Tableau 8.1, and Active Directory Federation Services (AD FS) 2.0 on Windows Server 2008 (as an update) & 2012, authentication can pass seamlessly from Active Directory to Salesforce and then on to Tableau.
So how do you go about configuring a Tableau Server, not situated anywhere near your enterprise domain, to work with active directory? Here’s how….
It’s assumed you have a working instance of Tableau Server & Windows Active Directory. The instructions are based on Server 2012 but should also carry across to Server 2008 with the AD FS 2.0 update. You’ll need a valid SSL certificate key pair for the AD FS server & Tableau Server; I’ll point you in the right direction when it comes to it. It is also important that usernames configured on your Tableau Server match those on your Active Directory, so if you haven’t created an admin account on Tableau which matches the username of your own or test AD account, I’d do that now.
Getting a single AD FS instance up and running
I’m going to assume you don’t have AD FS already installed & configured; if you do, you can skim read this section, but make sure you save the FederationMetadata XML at the end.
We’re going to start in the top right of Server Manager -> Manage -> Add Roles and Features
Follow the wizard to install Active Directory Federation Services (may require server restart)
AD FS requires a certificate to be in place in IIS under the Default Site. If you don’t already have a certificate you’ll need to generate a certificate signing request and submit it to a fully configured enterprise or third party certificate authority (I’d recommend RapidSSL as they have lots of instructions about how to do this). Don’t forget the CN (common name) needs to be the fully qualified domain name of your AD FS server as it’ll be seen by the Tableau Server. e.g. adfs.theinformationlab.co.uk
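If you’re generating the request from the command line rather than through IIS, the openssl equivalent looks roughly like this. It’s a sketch only; the CN and the organisation details are examples, so substitute the FQDN of your own AD FS server.

```shell
# Generate a 2048-bit private key for the AD FS server
openssl genrsa -out adfs.key 2048

# Create the certificate signing request; the CN here is an example and
# must be replaced with the FQDN of your own AD FS server
openssl req -new -key adfs.key -out adfs.csr \
  -subj "/C=GB/O=The Information Lab/CN=adfs.theinformationlab.co.uk"

# Sanity-check the request before submitting it to your CA
openssl req -in adfs.csr -noout -verify -subject
```

The resulting adfs.csr is what you submit to the certificate authority; keep adfs.key safe, as it pairs with the certificate they return.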
Edit the IIS Default Site bindings to assign the server certificate to the https protocol
Start AD FS Management under Administrative Tools
On the first run of AD FS Management you’ll be presented with a wizard to configure the server. The questions are simple; at this point you’re just looking to install a single instance of AD FS, not configure a farm (if you are configuring a farm, why are you reading my step-by-step guide?).
When the wizard completes you should see a bunch of green ticks. You can then test that all’s well by going to https://yourserver.yourdomain.com/FederationMetadata/2007-06/FederationMetadata.xml
Save the FederationMetadata.xml file in a safe place, ideally transfer it now to your Tableau Server.
Finally, Tableau Server requires forms-based authentication to serve logins via a web browser, Tableau Desktop and the Tableau Mobile app. In web.config, found in the C:\inetpub\adfs\ls directory, modify the tag order under <localAuthenticationTypes> so that <add name="Forms" page="FormsSignIn.aspx" /> appears first in the list. You can customise the login form by editing FormsSignIn.aspx in the same directory.
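For reference, the relevant section of web.config ends up looking something like this. The non-Forms entries shown are what I’d expect from a default AD FS install; your file may differ slightly, so reorder rather than copy-paste.

```xml
<!-- C:\inetpub\adfs\ls\web.config (illustrative fragment) -->
<!-- Forms moved to the top so AD FS presents a login form by default -->
<localAuthenticationTypes>
  <add name="Forms" page="FormsSignIn.aspx" />
  <add name="Integrated" page="auth/integrated/" />
  <add name="TlsClient" page="auth/sslclient/" />
  <add name="Basic" page="auth/basic/" />
</localAuthenticationTypes>
```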
Congrats! AD FS is up and running… make yourself a coffee.
Configure Tableau Server for SAML authentication
While these steps are specific to AD FS SAML integration they can be translated to any SAML identity provider (IdP).
If you don’t already have an SSL certificate for your Tableau Server follow these steps. Again you’ll need to have the request signed by a fully configured enterprise or a third party CA. Don’t forget the CN (common name) needs to be the fully qualified domain name of your Tableau Server as it’ll be seen by the AD FS Server. e.g. tableau.theinformationlab.co.uk
Open the Tableau Server configuration window and select the SSL tab.
If SSL isn’t already enabled tick the box and assign the certificate and key files created in the previous step
Stop the Tableau Server (tabadmin stop), click OK on the configuration box, start the Tableau Server (tabadmin start), then check that the server is fully operational (tabadmin status --verbose) and is now accessed in a web browser via https instead of http
Open the Tableau configuration window again, this time selecting the SAML tab
The Tableau Server return URL is the URL the user will be sent to after authenticating with SAML. It should be the URL of your Tableau Server beginning with https (to comply with AD FS endpoint requirements).
You can give the server any SAML entity ID, although it must be unique on your AD FS
The SAML certificate and key files can be those used by your SSL config, or they can be unique. Importantly, the CN must match the FQDN of the return URL
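If you’re unsure what CN a certificate carries, openssl will tell you. The sketch below generates a throwaway self-signed pair purely so the inspection command is runnable as-is; on a real server you’d point openssl at your actual certificate file, and the CN printed must equal the host name in your return URL.

```shell
# Throwaway self-signed pair, only to demonstrate the inspection command;
# the CN here is an example FQDN
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout saml.key -out saml.crt \
  -subj "/CN=tableau.theinformationlab.co.uk"

# Print the subject; the CN shown must match the return URL's FQDN
openssl x509 -in saml.crt -noout -subject
```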
Once they’re all set you can export your metadata file; this file needs to be copied across to your AD FS server
Remember the FederationMetadata.xml file we saved earlier? Select it as the SAML IdP Metadata file
Stop Tableau Server (tabadmin stop), click OK on the config screen, start the Tableau Server (tabadmin start)
While Tableau’s starting go back to your AD FS server with the newly exported Metadata xml file.
Add Tableau to AD FS
In AD FS Management, under Trust Relationships, right click on Relying Party Trust and Add Relying Party Trust
In the second wizard screen select “Import data about the relying party from a file” and browse to the exported Tableau SAML metadata xml file (the one you most recently copied)
Click Next… any error alerts at this point need to be rectified; the import should be seamless
Give the config a friendly name, e.g. Tableau SAML, and finish off the wizard
You should be asked to create some claim rules. If not, right-click on your new Relying Party Trust and select Edit Claim Rules.
Under the Issuance Transform Rules tab click the Add Rule button
For SAML authentication Tableau requires two attributes to be returned, Name ID and username, both of which can make use of the same AD attribute, SAM-Account-Name
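One way to satisfy this is the “Send LDAP Attributes as Claims” rule template, mapped roughly as below. Treat this as a sketch of my configuration rather than gospel; in particular, the lowercase “username” outgoing claim type isn’t in the dropdown and has to be typed in manually.

```
Claim rule template : Send LDAP Attributes as Claims
Attribute store     : Active Directory

LDAP Attribute          Outgoing Claim Type
----------------        -------------------
SAM-Account-Name   ->   Name ID
SAM-Account-Name   ->   username   (typed in manually, lower case)
```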
Once configured all should be ready to go. Try logging in with a username already authorised on the Tableau Server (don’t forget to use the AD password)
Most errors are caused by poorly configured certificates. Make sure your certificates appear valid to both servers (during the FederationMetadata.xml or the Tableau SSL test you shouldn’t receive any browser warnings about your certificates)
If the AD FS accepts your login but doesn’t forward you on to Tableau, check the AD FS server logs. Some certificates require hash algorithm SHA-1 rather than SHA-256; this is set in the trust partner properties
The following log files are useful (they can be found under ProgramData for installations on the C drive, and under Program Files for all other drives):
tabadmin.log (\Tableau\Tableau Server\logs\) – As the Tableau Server comes back online and “Waiting for wgserver on 8000 to become ready” appears in the log, look out for any 404 errors; these would indicate a problem with your certificates
error.log (\Tableau\Tableau Server\data\tabsvc\logs\httpd\) – Certificate errors referencing localhost:443 are OK; however, any other certificate errors would indicate a problem
wgserver-#.log (\Tableau\Tableau Server\data\tabsvc\logs\wgserver\) – This is where you’ll find SAML specific errors such as missing claim attributes
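If you’d rather not eyeball the logs line by line, a quick sweep with grep does the job. This is a sketch using a stand-in log file so the commands run as-is; on the server you’d point grep at the actual files listed above (and since Tableau Server is Windows, you’d run this from something like Git Bash or Cygwin, or use findstr instead).

```shell
# Stand-in log file so the commands below are self-contained; on a real
# server, replace sample.log with tabadmin.log, error.log or wgserver-#.log
cat > sample.log <<'EOF'
Waiting for wgserver on 8000 to become ready
GET /FederationMetadata 404
SAML response missing claim attribute: username
EOF

# 404s appearing as wgserver comes up suggest a certificate problem
grep -n "404" sample.log

# SAML-specific errors, e.g. missing claim attributes
grep -ni "saml" sample.log
```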