A Practical Tip To Validate Your Approach

data science tableau check

I have just enrolled in a Data Science course on Udemy  and I learned good stuff.

How was the A/B test « Number Of Product » ? Easy or difficult ?

Here is the result I found.

data science tableau check bar chart

You probably noticed something odd: there is an anomaly. We would expect that the more products a client has, the more satisfied the client is with the bank, so this type of client should stay with the bank.

In the first 2 bars we can see that a client who has 1 product is more likely to leave the bank than a client who has 2 products. But when a client has 3 or 4 products, we see a huge rate of clients leaving the bank.

Look, there is one odd little detail. In the 2nd bar, we can’t see the « Exited » label. This is because there is no room in the orange part for the text. To make it simpler, we’ll remove the « Exited » label. Drag the « Exited » text label and drop it outside.

data science tableau check bar chart

data science tableau check bar chart

Perfect, we can read the percentages. On the 1st bar, we can see that among the clients who have 1 product, 28% left the bank. On the 2nd bar, we can see that among clients who have 2 products, 8% left the bank. This shows us that clients who have 1 product are more likely to leave the bank than clients with 2 products.

And for the next bars, we observe an anomaly. On the 3rd bar, we can see that among the clients who have 3 products, 83% left the bank. On the 4th bar, we can see that among clients who have 4 products, 100% left the bank. We clearly see that there is a problem and we need to do a deeper analysis to understand what is going on.
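If you want to double-check the percentages outside Tableau, the same calculation can be sketched in a few lines of Python. This is only a minimal sketch: the column names « NumOfProducts » and « Exited » are assumed to match the dataset, and the sample rows are made up to show the shape of the data.

```python
from collections import defaultdict

def exit_rate_by_category(rows, category_key, exited_key="Exited"):
    """Return {category: share of rows whose exited flag is 1}."""
    totals = defaultdict(int)
    exits = defaultdict(int)
    for row in rows:
        category = row[category_key]
        totals[category] += 1
        exits[category] += row[exited_key]
    return {category: exits[category] / totals[category] for category in totals}

# Tiny made-up sample, NOT the bank's data.
sample = [
    {"NumOfProducts": 1, "Exited": 1},
    {"NumOfProducts": 1, "Exited": 0},
    {"NumOfProducts": 1, "Exited": 0},
    {"NumOfProducts": 1, "Exited": 0},
    {"NumOfProducts": 2, "Exited": 0},
    {"NumOfProducts": 2, "Exited": 0},
    {"NumOfProducts": 4, "Exited": 1},
]

rates = exit_rate_by_category(sample, "NumOfProducts")
for category, rate in sorted(rates.items()):
    print(category, f"{rate:.0%}")
```

Each bar of the chart corresponds to one entry of `rates`.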

As data scientists, we need to explain what happens in bars 3 and 4. Usually, when a client has 3 or 4 banking products, that means he/she is satisfied and loyal to the bank. But in our case, it’s the opposite: a high rate of these clients left the bank. This is the time to do a deeper analysis.

The first thing to analyze is the quality of the data. There is a very big anomaly, and it may be because something insignificant in our data is distorting the statistics. For example, it’s possible that when the bank selected the clients for this sample, there were very few clients with 4 products and all of them left the bank. Sometimes chance creates anomalies, and you have to pay attention to these effects of chance: they don’t seem important, but they can lead to false interpretations.

To start, we will check the number of clients with 4 products.

In « Measures », drag « Number Of Records » (which gives the number of observations) onto « Label ».

data science tableau check bar chart

data science tableau check bar chart

On the first 2 bars, we observe that many clients with 1 or 2 products were selected for our sample. Far fewer clients with 3 or 4 products were selected.

There are 220 clients with 3 products and 60 clients with 4 products. These small numbers of clients probably explain why we observe these anomalies.

In this sample of randomly selected clients, there are very few clients with 4 products and they all left the bank. In this situation, the result may simply be an effect of chance. When things like that happen, you have to be very careful not to jump to conclusions and misinterpret the data.

The conclusion is that a lot of clients were selected for categories 1 and 2. For categories 3 and 4, few clients were selected, so we can’t compute accurate statistics. We need a deeper analysis for these clients with 3 and 4 products.
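The same « count the observations first » reflex can be sketched in Python. This is a minimal sketch under assumptions: the `min_count` threshold is a hypothetical value you choose yourself, and the rows below are made up for illustration.

```python
from collections import Counter

def low_observation_categories(rows, category_key, min_count):
    """Return the categories that have fewer than min_count rows."""
    counts = Counter(row[category_key] for row in rows)
    return sorted(c for c, n in counts.items() if n < min_count)

# Made-up rows, NOT the bank's data: many clients with 1 or 2 products,
# very few with 3 or 4.
rows = (
    [{"NumOfProducts": 1}] * 50
    + [{"NumOfProducts": 2}] * 45
    + [{"NumOfProducts": 3}] * 3
    + [{"NumOfProducts": 4}] * 2
)

# min_count=10 is an arbitrary threshold chosen for this example.
flagged = low_observation_categories(rows, "NumOfProducts", min_count=10)
print("categories to treat with caution:", flagged)
```

Any percentage computed on a flagged category deserves a second look before you interpret it.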

Now, let’s put the percentage back on the bar chart. Click on the « Back » button.

data science tableau check bar chart

Or click and drag « SUM(Number of Records) » to the outside.

data science tableau check bar chart

data science tableau check bar chart

We saw that there is an anomaly, so it’s worth adding a comment as a reminder to do a more in-depth analysis of columns 3 and 4.

Right-click between the bar chart’s title and the bars. Select « Annotate » then « Areas… ».

data science tableau check bar chart

A window appears. In this window, you write « Low observation in last 2 categories » and click on the « OK » button.

data science tableau check bar chart

data science tableau check bar chart

Click on the comment and move it over bars 3 and 4.

data science tableau check bar chart

data science tableau check bar chart

The next time you work on this bar chart, you will see this comment, which will remind you to seriously analyze the clients who have 3 or 4 products.

Validate our approach

It’s time to show you how to validate an approach and how to validate the data. For this we will create a new A/B test.

Duplicate this worksheet with a right-click on the « NumberOfProducts » tab and select « Duplicate ».

data science tableau check bar chart

And rename the tab « Validation ».

data science tableau check bar chart

For this tab, we will erase the comment. Select the comment and press the « Delete » button on your keyboard.

data science tableau check bar chart

data science tableau check bar chart

Everything is ready. The idea is to find a variable that doesn’t affect our results, that is, a variable that has no impact on a client’s decision to leave or stay in the bank.

Take, for example, the variable « Customer Id ». A client’s identification number has no influence on the client’s decision to stay or leave the bank.

We’ll do an A/B test with the last digit of the « Customer Id » and we’ll check that the same proportion of clients leaves the bank in each of the 10 categories of the last digit of the « Customer Id ». The 10 categories are the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.

Let’s go. To start, we will create the variable that contains the last digit of the « Customer Id ». To get this variable, we will create a « Calculated Field ».

Right-click on « Customer Id », select « Create » and click on « Calculated Field ».

data science tableau check bar chart

data science tableau check bar chart

Name the calculated field « LastDigitOfCustID ». In the text field, we use the « RIGHT » function with « Customer Id » in parentheses to select the last character of the « Customer Id ». In our case, the last character of the « Customer Id » is the last digit.

Here is the code to write in the text field: RIGHT([Customer Id], 1)

data science tableau check bar chart

data science tableau check bar chart

Oops, you see there is a small mistake: « The calculation contains errors ».

There is an error in the formula because « Customer Id » is a number variable and the « RIGHT » function applies to a variable of type « STRING ».

To use the « RIGHT » function, we will convert « Customer Id » into a string. We will use the « STR » function with « Customer Id » in parentheses.

Here is the code to write in the text field: RIGHT(STR([Customer Id]), 1)

Then click on the « OK » button.
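The convert-then-take-the-last-character idea is not specific to Tableau. Here is a minimal Python sketch of the same logic; the `last_digit` helper and the example ID are invented for illustration.

```python
def last_digit(customer_id):
    """Return the last digit of an ID as a one-character string."""
    # Convert to text first, then keep the last character,
    # mirroring the STR-then-RIGHT idea of the calculated field.
    return str(customer_id)[-1]

example_id = 15634602  # arbitrary example value

try:
    example_id[-1]  # indexing a number directly fails, much like RIGHT on a number
except TypeError as error:
    print("error:", error)

print(last_digit(example_id))
```

Just like in Tableau, the number has to become text before you can pick a character out of it.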

data science tableau check bar chart

Now, you can see that our calculated field « LastDigitOfCustID » is in « Dimensions ».

Click on « LastDigitOfCustID » and move it on top of « NumOfProducts » in « Columns ».

data science tableau check bar chart

data science tableau check bar chart

Now we have a new bar chart and we see that for every last digit of the « Customer Id » there is about the same proportion of clients leaving the bank. All these proportions don’t correspond exactly to the average of 20% but these slight variations aren’t important.

Seeing this uniform distribution allows us to validate our data because the data is homogeneous.
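To get a feel for what « slight variations » look like, here is a minimal Python sketch of the same check. Everything in it is an assumption made for illustration: the IDs are synthetic, and each made-up client leaves with a probability of 20% regardless of the ID, so the last digit has no influence by construction.

```python
import random
from collections import defaultdict

def exit_rate_by_last_digit(clients):
    """clients: iterable of (customer_id, exited) pairs, exited being 0 or 1."""
    totals = defaultdict(int)
    exits = defaultdict(int)
    for customer_id, exited in clients:
        digit = str(customer_id)[-1]
        totals[digit] += 1
        exits[digit] += exited
    return {digit: exits[digit] / totals[digit] for digit in sorted(totals)}

# Synthetic data, NOT the bank's dataset: 10 000 made-up IDs,
# each client leaving with probability 0.2 whatever the ID is.
rng = random.Random(0)
clients = [(15_000_000 + i, 1 if rng.random() < 0.2 else 0) for i in range(10_000)]

rates = exit_rate_by_last_digit(clients)
overall = sum(exited for _, exited in clients) / len(clients)
largest_gap = max(abs(rate - overall) for rate in rates.values())

for digit, rate in rates.items():
    print(digit, f"{rate:.1%}")
print("largest gap from the overall rate:", f"{largest_gap:.1%}")
```

If one digit stood far away from the overall rate on your real data, that would be the signal to go back and inspect the dataset.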

Conclusion

Here’s how you can check the homogeneity of your data. You take a variable that has no impact on whether a client leaves or stays in the bank. The example we did with the last digit of the « Customer Id » works well. We were able to verify that each of the categories taken by this variable has about the same proportion of clients leaving the bank. Since that is the case, we can validate our data.

Imagine another result. When we do the test with the last digit of the « Customer Id », we observe that for one of the digits, the rate of clients who left is much higher than the average. This would show us that there is a problem in our data because it indicates an anomaly.

You can find other ways to verify your data by using other « insignificant variables » to see if the distribution is homogeneous. But be careful when you select an « insignificant variable » because there may be traps.

Here is an example. If you create a variable that takes the first letter of the first name, the distribution will not be homogeneous. The reason is simple, there are many more people who have a name that starts with the letter « M » than with the letter « Y ».

Share this article if you think it can help someone you know. Thank you.

-Steph

Connect Tableau to An Excel File

tableau connect excel file geographic map

I have just enrolled in a Data Science course on Udemy  and I learned good stuff.

Now that you have downloaded the dataset in Excel format, we’ll use Tableau to analyze it.

We’ll connect to the dataset using the « Excel » option.

tableau connect excel file geographic map

Select the Excel file you downloaded and click on the « Open » button.

tableau connect excel file geographic map

And as you can see, there is only one tab.

tableau connect excel file geographic map

There is only one tab because in the Excel file there is only one tab. If in the Excel file there were several tabs, they would all have been listed here.

tableau connect excel file geographic map

We need to check that all the data is « OK ». Scroll through the rows and columns to check. Everything is good: there are 10 000 rows, as in the Excel file.

tableau connect excel file geographic map

Excellent, we connected our Excel source file to Tableau.

Now, click on the « Sheet1 » tab to access the Worksheet.

tableau connect excel file geographic map

tableau connect excel file geographic map

We’ll have a little fun.

For example, let’s look at what we have with « Geography ».

tableau connect excel file geographic map

« Geography » is the dimension that gives us the country, so we’ll make a map to see where the clients from the bank come from.

Move « Geography » to this area.

tableau connect excel file geographic map

tableau connect excel file geographic map

Ah, it’s odd, nothing happens ?!? Why ? Look at « Geography »: Tableau doesn’t recognize it as a geographic dimension. Here, you can see that Tableau recognized « Geography » as a text dimension, with the « ABC » label.

tableau connect excel file geographic map

Don’t worry, we can fix it quickly. Click on the arrow of « Geography ».

tableau connect excel file geographic map

Select « Geographic Role » then « Country/Region » so that the « Geography » dimension becomes a geographic type.

tableau connect excel file geographic map

Now remove the table that « Geography » created, with a click-and-drag.

tableau connect excel file geographic map

tableau connect excel file geographic map

Look, we have a globe next to « Geography ». This means that Tableau recognizes « Geography » as a geographic dimension.

tableau connect excel file geographic map

Since « Geography » is a dimension of geography type, there are 2 new measures that have appeared : Latitude (generated) and Longitude (generated).

tableau connect excel file geographic map

Put « Geography » in this space with a click and drag.

tableau connect excel file geographic map

Look, this time there is a map.

tableau connect excel file geographic map

You have the possibility of zooming with these buttons.

tableau connect excel file geographic map

The map is fine but we’ll remove the blue dots and modify the map so that it’s easier to read.

We’ll color the countries and display the number of clients in each country.

We know that in the dataset each row corresponds to a client. What we can do is use « Number Of Records », which is the total number of observations. In this way, we can visualize the number of rows attributed to each country, and that number is the number of clients per country.
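Counting the rows per country is all that « Number Of Records » does here. A minimal Python sketch of the same idea, with a handful of made-up rows rather than the real dataset:

```python
from collections import Counter

# Made-up rows, one per client, NOT the real dataset.
rows = [
    {"Geography": "France"},
    {"Geography": "Germany"},
    {"Geography": "France"},
    {"Geography": "Spain"},
    {"Geography": "France"},
    {"Geography": "Germany"},
]

# One row per client, so counting rows per country counts clients per country.
clients_per_country = Counter(row["Geography"] for row in rows)

for country, count in clients_per_country.most_common():
    print(country, count)
```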

Then, take « Number Of Records » and move it to « Color ».

tableau connect excel file geographic map

Boom ! Each country has a color.

tableau connect excel file geographic map

Look at the color contrasts. France has a darker color, which indicates that it is the country with the most clients. Germany and Spain have almost the same color, which indicates that they have almost the same number of clients.

But we want to know the number of clients per country without having to hover the cursor over the country.

To do this, we’ll add a label. Take « Number Of Records » and move it to « Label ».

tableau connect excel file geographic map

tableau connect excel file geographic map

We’ll increase the text size and make it bold. Click on « Label », click on « Font » and select « 12 » and bold.

tableau connect excel file geographic map

It’s cool, we can see the number of clients per country. You can also zoom in on a region. Click on « Zoom area », then click and drag to select the region on the map.

tableau connect excel file geographic map

tableau connect excel file geographic map

Now we can see that most clients are in France, which represents almost half of the total number of clients in the dataset. Germany and Spain have almost the same number of clients.

Share this article if you think it can help someone you know. Thank you.

-Steph

Export Worksheet

data science tableau export worksheet chart

I have just enrolled in a Data Science course on Udemy  and I learned good stuff.

Now is the time to learn how to export your worksheet as an image into another document.

You have made an excellent report that can be shown to your customers or your managers and you need to export it to a Word, PowerPoint or other document.

The only way to do this with the free version of Tableau (Tableau Public) is to take a screenshot (print screen).

In the paid version of Tableau, there is a function to copy the bar chart with a right-click.

But before doing this, it’s necessary to change the worksheet’s title because at the top of the bar chart, you can read the title « Sheet 1 ».

data science tableau export worksheet chart

« Sheet 1 » is the tab’s name at the bottom of the screen.

data science tableau export worksheet chart

To change the tab’s name, right-click on it and select « Rename Sheet ».

data science tableau export worksheet chart

You rename the tab « Annual Bonus Analysis ».

data science tableau export worksheet chart

You can show or hide the title on the bar chart. Click on « Worksheet » and activate/deactivate « Show Title ».

data science tableau export worksheet chart

The worksheet’s title is at the top of the bar chart.

data science tableau export worksheet chart

You can edit the title by double-clicking on it.

data science tableau export worksheet chart

If you want to change the title border, right-click on it and select « Format Title ».

data science tableau export worksheet chart

On the left, a tab appears to change the border.

data science tableau export worksheet chart

If you want to change the text’s format, you need to double-click on « Annual Bonus Analysis ». A window appears to change the size, etc.

data science tableau export worksheet chart

To export the bar chart to a Word document, take a screenshot with the Print Screen key (PrtSc) on your keyboard, or Command+Shift+3 on a Mac.

Personally, I use the Paint.net software  to remove unnecessary elements from the image.

Once you have saved the bar chart’s image, insert or paste it into the Word file.

data science tableau export worksheet chart

In this image, you can see that the bar chart’s title is « Annual Bonus Analysis » instead of « Sheet1 ». It’s perfect for presenting to your clients or managers.

Then you can add text below the bar chart to present your insights.

To export the bar chart into a PowerPoint document, it’s the same principle. Take a screenshot with the Print Screen key (PrtSc) on your keyboard, or Command+Shift+3 on a Mac.

Then you remove the unnecessary elements of the image.

data science tableau export worksheet chart

Once you have saved the bar chart’s image, insert or paste it into the PowerPoint file. Be sure that the title in PowerPoint is at the top of the page, otherwise the title will be under the image.

data science tableau export worksheet chart

To save your workspace, click on the « File » menu and select « Save to Tableau Public As… »

data science tableau export worksheet chart

The « Tableau Public Sign In » window appears so that you can connect to your account.

data science tableau export worksheet chart

A tab will open in your web browser to display your Tableau Public account, where you can see your workspace saved on the server.

data science tableau export worksheet chart

Caution: Tableau doesn’t automatically save your work, so it’s up to you to save regularly.

This section is now finished. It was an introduction to Tableau with a simple dataset, to discover features like calculated fields, adding colors, adding labels and formatting.

Now you know the basics of Tableau and later we will use complex functions.

Share this article if you think it can help someone you know. Thank you.

-Steph

Navigate In Tableau

front boat

I have just enrolled in a Data Science course on Udemy  and I learned good stuff.

We’ll explore Tableau’s tools.

From the connection manager, we’ll go into Tableau’s workspace.

Click on the « Sheet1 » tab at the bottom of the window.

data science tableau screenshot

Here is Tableau’s workspace.

data science tableau screenshot

The 2 important elements of the workspace are « Data » on the left and the workspace on the right. It’s in the workspace that you’ll create tables and charts.

We’ll start with « Data » on the left.

data science tableau screenshot

« Data » is divided into 2 zones : dimensions and measures.

Dimensions and measures are 2 different roles that allow you to manipulate data.

Tableau puts the numerical values in « measures » and the categorical or qualitative variables in « dimensions ». These are Tableau’s default settings.

There is also another way to explain « dimension » and « measures ». The « dimensions » are independent variables and the « measures » are dependent variables.

For example, « Units » is a measure: it’s the number of items sold per product. « Region » is a dimension: it’s the geographic region where the product was sold. With these 2 elements, we can know how many items were sold by region. This means that « Region » is an independent variable and « Units » is a dependent variable because it will be grouped by region.

But if you don’t like the default, you can move the fields from dimensions to measures, and the other way around, with a click and drag.

In the menu bar, at the top, there is « File », where you can open and save files.

data science tableau screenshot

« Data » to connect to new source files.

data science tableau screenshot

« Worksheet » is the workspace where you create analyses.

data science tableau screenshot

« Dashboard » is a combination of worksheets.

data science tableau screenshot

« Story » is a combination of worksheets and dashboards.

data science tableau screenshot

« Analysis » to specify how you want to do your analysis in your workspace.

data science tableau screenshot

« Map » to add maps to the workspace

data science tableau screenshot

« Format » contains formatting options

data science tableau screenshot

Now, let’s study the workspace.

In the workspace, the main elements are « Columns » and « Rows ». This is where you decide which data goes in columns and rows in your worksheet.

You can also choose different formats for these elements, like colors, size, text, level of detail and tooltips (a useful, optional tool).

data science tableau screenshot

Let’s do a test. Use the data from « Region » (which is in « dimensions »). Drag and drop « Region » to the center of your workspace. Now, « Region » is in the « Rows » element.

A table appears in your workspace.

data science tableau screenshot

You put a dimension in your workspace. Now put a measure in your workspace.

Use the « Units » data. Drag and drop « Units » next to the « Region » column.

data science tableau screenshot

As you can see, Tableau automatically put « Region » in the « Rows » element and aggregated the « Units » data by region. In this way, you can tell how many items were sold by region.
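What Tableau does here is a sum of the measure grouped by the dimension. As a minimal sketch of that idea in Python, with made-up sales rows (the region names and unit counts are invented for the example):

```python
from collections import defaultdict

# Made-up sales rows, NOT the tutorial's dataset.
sales = [
    {"Region": "East", "Units": 10},
    {"Region": "West", "Units": 4},
    {"Region": "East", "Units": 5},
    {"Region": "Central", "Units": 7},
    {"Region": "West", "Units": 6},
]

# « Region » plays the dimension role (the grouping key),
# « Units » plays the measure role (the value being summed).
units_by_region = defaultdict(int)
for row in sales:
    units_by_region[row["Region"]] += row["Units"]

for region in sorted(units_by_region):
    print(region, units_by_region[region])
```

This is the same result as the « SUM(Units) » column of the table, one line per region.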

Now, what you can do is to move « SUM(Units) » to the « Columns » element.

data science tableau screenshot data science tableau screenshot

And then, you have a « bar chart » to see how many items have been sold by region. You can enlarge the chart with a click and drag.

Let’s look at the tools that are in the « Show Me » zone.

data science tableau screenshot

Click on « Pie chart » to get this type of chart.

data science tableau screenshot

Click on the « Size » icon and drag from left to right to increase the chart’s size.

data science tableau screenshot

In this chart, each region has a color and a slice proportional to the items sold in that region.

You can also test the « bubble chart ». Tableau organizes the data automatically and everything is placed in « Marks ».

data science tableau screenshot

You can test the « Treemaps » chart. This is the same principle as the « bubble chart » but with rectangles instead of circles.

data science tableau screenshot

As you can see in « Show Me », some charts are disabled. This is because you need certain elements in your data to be able to activate them.

For example, for the « Area chart », you need « date » data to activate it.

Share this article if you think it can help someone you know. Thank you.

-Steph