Power Up Your Tableau Prep using Scripts

2 0 0
                                        


While Tableau Prep has been around for quite some time, not every Tableau user can vouch for using it extensively. Does that sound like you? That was definitely me a couple of months ago until I discovered the power of scripts: a feature that was released just last year. For all the data prep nerds out there, I wanted to walk you through a couple of examples of when Scripts on Tableau Prep can prove to be extremely handy. For more details Best Tableau Books

For most of the cases, Joins & Blends, Level of Detail (LOD) Calculations, and limited Data manipulation that Tableau Desktop allows might suffice. However, when you are dealing with data where fields might change (courtesy: collaborative worksheets) followed by a pivoting of rows to columns/ unpivoting of columns to rows, it becomes prone to breaking when the source file gets refreshed. Tableau Prep allows you to handle these changes independently and transition into a cleaned output file for analysis.

Case 1: Using Tableau Prep Features & Python Script interchangeably:

I would strongly encourage you to go check this out if you want to challenge yourself with the best ways to prep data (& compare with the provided solutions). You can also download the source files, Tableau Prep flows, and Python scripts in case you would like to follow along. My intention here is not to solve the challenge but to use an excerpt from this flow and show how the data manipulation could be achieved either with in-built features or using the Script feature.

Notice the Stands table which gives the list of which stands can be accessed by which gates at an airport terminal. It is in a format that needs to be changed to a more ingestible format for

This implies: splitting the column [Accessed_by_Gates] so that there is one entry for each (Stand + Gate) combination. The way to achieve this with Tableau Prep would be a succession of two steps: 1.Custom Split on ',' for the ['Accessed_by_Gates'] column (since the max no. of options is 7, this will lead to 7 columns) and 2. Select the 7 columns resulting from the split, click the icon with three dots and Pivot Columns to Rows plus remove the rows with blank gates allotted. This is represented below with the highlighted steps.

However, if you happen to hold a stronger faith in data cleaning by converting them to dataframes first, then you have to set up TabPy to be able to run your Python scripts to bring the one-step transformation. I will not go into the details of the setup as you will find clear instructions. Learn more skills from

It is worthwhile to go over the syntax of the simple Python script that enables Tableau Prep to process the desired output. Make sure you have run 'tabpy' on your command prompt after setup and have the output as shown below:

As for the Python script, it should contain 3 chunks: the libraries, the defined function, and the desired output schema. While the first two are pretty much the norm for anyone who has used pandas dataframes before, the 'get_output_schema' is used to define the structure of the output with the datatypes (which are a bit different for Tableau Prep and denoted as prep_string/prep_int and so on). The primary function defined 'CleanStandsDataset' consists of the steps where I have used a string split method and pandas melt function to pivot the split columns, thus yielding the same output as the two highlighted steps above, albeit in a single step.

Once you have the script ready, it is time to add it to the Stands excel sheet by clicking on the + icon and selecting the 'Script' option.

This will open up the Settings pane at the bottom where you will select TabPy Server and add the localhost connection by going to Help (at the top)> Settings & Performance> Manage Analytics Extensions Connection. Select your server as localhost and Port as 9004 (what you got in the command prompt window) and hit Sign in. This should establish the connection as you see here. Browse to your .py file here and select the function name you defined.

The script should now run successfully to process your dataset and clean it the way you defined it in your script. Enhance your tableau skills with

Case 2: Using Tableau Prep & TabPy to make API requests

Tableau Prep with its entire suite of connectors can pretty much connect to data coming from various databases/ big data warehouses in the form of SQL queries or Cloud services thus ensuring your dashboards are up to date depending on the refresh frequency. On similar lines, you can also leverage the power of scripts to make API calls and analyze constantly changing data.

Here I have imported the requests library to be able to make a get request on the URL. Note that in my 'response.get' call, I have not defined parameters since the CoinGecko website already provided me the curated URL with parameters included. Since the output is in JSON format, I have next converted it into a dataframe to do any other data manipulation if needed. Finally, I have ended with a get_output_schema function, defining only the fields I needed with their respective datatypes.

Before you run this script, you would have to import an excel sheet with dummy values i.e. a column with 'dummy' as a column header and 1,2,3 as values since Prep needs a source file to start. Now you can simply import this excel on Tableau Prep, add a script just as before and you would be good to output all the desired information to a file at your desired location using the 'Output' option when you hit the + icon as the last step. Each time you run this flow (which can be completely scheduled if you publish this flow to your Tableau Server), this will reflect the current data as on the website.

To recap, here are some of the features that are you can take advantage of:

It allows you to view the output after every step or basically decouples the data prep process so as to not interfere with the existing dashboards. Some of the really cool new features include 'group & replace' & 'wildcard matching' which lets you alter the data in bulk without any convoluted calculations. With the latest Tableau releases, the focus certainly seems to shift towards making Prep more powerful and accessible.You have the liberty to leverage Python or R to make API requests or to carry out statistical analysis/ build Machine Learning models on top of the live data followed by directing these outputs to Tableau dashboards.

For more advanced skills on tableau learn with 

Power Up Your Tableau Prep using ScriptsDonde viven las historias. Descúbrelo ahora