Today’s blog we will learn about running R codes with Tableau. R is the most used statistical tools by data scientests and researchers across the globe. Together, R & Tableau could be extremely useful in data science arena as it can help organisations to unravel end to end discovery needs.
Here in the blog we will go step by step process of integrating R & Tableau
Prerequisite: R studio & Tableau already installed on machine
Step 1
Install Rserve Package on your R software
install.packages("Rserve", repos="https://cran.rstudio.com/bin/windows/contrib/3.4/Rserve_1.7-3.zip")
## Error in install.packages : Updating loaded packages
Once the package is installed, we will run below command to load the package
library(Rserve)
Rserve()
## Starting Rserve... ## "C:\Users\HP-LAP~1\DOCUME~1\R\WIN-LI~1\3.4\Rserve\libs\x64\Rserve.exe"
This will start the server in the background to enable Tableau to execute commands on R
Step 2
Now we will connect Tableau to R server using following
Help –> Setting and Performance –> Manage external Service connection
This will open the below shown window with auto populated values for Server as “Localhost” and port as “6311”.
Click on Test connection to check the connection, it will show successfully connected to Rserve in pop-up
click on ok
This will help Tableau to execute script commands like:
Script_BOOL
Script_INT
Script_STR
Script_REAL
Step 3
We will take below FMCG example as our data source and import into Tableau
Step 4
Now, we will create an calculated field “Outliers” using following code to find outliers in the FMCG data:
IF SCRIPT_REAL(“library(pracma); a <- rep(1, length(.arg1)); a[findpeaks(.arg1,threshold=quantile(.arg1,.99),sortstr=FALSE)[,2]]=0;a;”, SUM([Sales])) == 0
THEN “Outlier”
ELSE “OK”
END
One more calculated field “Cluster” with following code
SCRIPT_INT(
‘set.seed(42);result <- kmeans(data.frame(.arg1,.arg2,.arg3,.arg4), .arg5[1]);result$cluster;’,
max([Sales]), SUM([Profit]),min([Freight Expenses]),MIN([Discount offered]),3)
Step 5
Drag order date on column shelf and sales on row shelf then drag “Outliers” into color
As you can see all the outlier values are highlighted in orange while other values are in blue color. This way we can find outliers in our data very easily by using dream team of Tableau and R
Step 6
Lets go one step further and see if we can do clustering in Tableau with R.
Drag Sales into column shelf and profit into row shelf (unselect analysis –> aggregate measure) then drag calculated field “Cluster” to color in marks area and add trendline in the chart. We will see it creates 3 clusters as shown below:
Hope this blog helps to understand how to run Rserver in background and execute R scripts using Tableau.
you can download tableau file from here
Thanks for reading the blog
Do subscribe to my blog and keep receiving new posts 🙂