## A Practical Tip To Validate Your Approach

I have just enrolled in a Data Science course on Udemy and I learned good stuff.

How was the A/B test on « Number Of Products »? Easy or difficult?

Here is the result I found.

I think you noticed something bizarre: there is an anomaly. We would expect that the more products a client has, the more satisfied the client is with the bank, so this type of client should stay with the bank.

In the first 2 bars, we can see that a client who has 1 product is more likely to leave the bank than a client who has 2 products. But when a client has 3 or 4 products, we see a huge rate of clients leaving the bank.

Look, there is a small odd detail. In the 2nd bar, we can’t see the « Exited » label because there is no room in the orange part for the text. To keep it simple, we’ll remove the « Exited » label: drag the « Exited » text label and drop it outside.

Perfect, we can read the percentages. On the 1st bar, we can see that among the clients who have 1 product, 28% left the bank. On the 2nd bar, we can see that among the clients who have 2 products, 8% left the bank. This shows us that clients who have 1 product are more likely to leave the bank than clients with 2 products.

For the next bars, we observe an anomaly. On the 3rd bar, we can see that among the clients who have 3 products, 83% left the bank. On the 4th bar, we can see that among the clients who have 4 products, 100% left the bank. We clearly see that there is a problem and we need to do a deeper analysis to understand what is going on.

As data scientists, we need to explain what happens in bars 3 and 4. Usually, when a client has 3 or 4 banking products, it means he/she is satisfied and loyal to the bank. But in our case, it’s the opposite because a high rate of these clients left the bank. This is the time to do a deeper analysis.

The first thing to analyze is the quality of the data. There is a very big anomaly, and it may be because something statistically insignificant in our data distorts the statistics. For example, it’s possible that when the bank selected the clients for this sample, there were very few clients with 4 products and all of them left the bank. Sometimes chance creates anomalies, and you have to pay attention to these effects of chance: they don’t seem important, but they can lead to false interpretations.

To start, we will check the number of clients with 4 products.

In « Measures », drag « Number Of Records » (which gives the number of observations) onto « Label ».

We observe on the first 2 bars that many clients with 1 or 2 products were selected for our sample. For clients with 3 or 4 products, we can see that far fewer clients were selected.

There are 220 clients with 3 products and 60 clients with 4 products. These small numbers of clients probably explain why we observe these anomalies.

In this sample of randomly selected clients, there are very few clients with 4 products and they all left the bank. In this situation, the result is probably an effect of chance. When things like that happen, you have to be very careful not to draw conclusions too fast and make misinterpretations.

The conclusion is that a lot of clients were selected for categories 1 and 2. For categories 3 and 4, few clients were selected, so we can’t compute accurate statistics. We need to do a deeper analysis of the clients with 3 and 4 products.
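Outside Tableau, the same check can be sketched in a few lines of Python. This is only a minimal illustration: it assumes each client is a pair (number of products, exited flag), the sample is made up for the example, and the helper name `churn_by_category` is hypothetical. It prints the number of observations next to the churn rate so that small categories are easy to spot.

```python
from collections import defaultdict

def churn_by_category(rows):
    """Return {category: (count, churn_rate)} from (category, exited) pairs."""
    counts = defaultdict(int)
    exits = defaultdict(int)
    for category, exited in rows:
        counts[category] += 1
        exits[category] += exited  # exited is 1 (left) or 0 (stayed)
    return {c: (counts[c], exits[c] / counts[c]) for c in sorted(counts)}

# Made-up sample: several clients with 1 or 2 products, only one with 4.
sample = [(1, 1), (1, 0), (1, 0), (1, 0), (2, 0), (2, 0), (2, 0), (2, 1), (4, 1)]

for category, (count, rate) in churn_by_category(sample).items():
    print(f"{category} product(s): {count} clients, {rate:.0%} left")
```

A category with a very high rate and a very low count, like the single client with 4 products here, is the kind of result to treat with caution.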

Now, let’s put the percentage back on the bar chart. Click on the « Back » button.

Or drag « SUM(Number of Records) » and drop it outside.

We saw that there is an anomaly, so it is useful to add a comment that reminds us to do a more in-depth analysis of columns 3 and 4.

Right-click between the bar chart’s title and the bars. Select « Annotate » then « Areas… ».

A window appears. In this window, write « Low observation in last 2 categories » and click on the « OK » button.

Click on the comment and move it onto bars 3 and 4.

The next time you work on this bar chart, you will see this comment, which will remind you to seriously analyze the clients who have 3 and 4 products.

# Validate our approach

It’s time to show you how to validate an approach and how to validate the data. For this, we will create a new A/B test.

Duplicate this worksheet: right-click on the « NumberOfProducts » tab and select « Duplicate ».

Then rename the tab « Validation ».

For this tab, we will erase the comment. Select the comment and press the « Delete » key on your keyboard.

Everything is ready. The idea is to find a variable that doesn’t affect our results, that is, a variable that has no impact on a client’s decision to leave or stay with the bank.

Take, for example, the variable « Customer Id ». A client’s identification number has no influence on the client’s decision to stay with or leave the bank.

We’ll do an A/B test with the last digit of the « Customer Id » and we’ll check that the same proportion of clients leave the bank in each of the 10 categories of the last digit. The 10 categories are the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.

Let’s go. To start, we will create the variable that contains the last digit of the « Customer Id ». To get this variable, we will create a « Calculated Field ».

Right-click on « Customer Id », select « Create » and click on « Calculated Field ».

Name the calculated field « LastDigitOfCustID ». In the text field, we use the « RIGHT » function with « Customer Id » in parentheses to select the last character of the « Customer Id ». In our case, the last character of the « Customer Id » is the last digit.

Here is the code to write in the text field: RIGHT([Customer Id], 1)

Oops, you see there is a small mistake: « The calculation contains errors ».

There is an error in the formula because « Customer Id » is a numeric variable and the « RIGHT » function applies to a variable of type « STRING ».

To use the « RIGHT » function, we will convert « Customer Id » into a string. We will use the « STR » function with « Customer Id » in parentheses.

Here is the code to write in the text field: RIGHT(STR([Customer Id]), 1)

Then click on the « OK » button.

Now, you can see that our calculated field « LastDigitOfCustID » is in « Dimensions ».

Drag « LastDigitOfCustID » and drop it on top of « NumOfProducts » in « Columns ».

Now we have a new bar chart and we see that for every last digit of the « Customer Id » there is about the same proportion of clients leaving the bank. These proportions don’t match the average of 20% exactly, but these slight variations aren’t important.

Seeing this uniform distribution allows us to validate our data because the data are homogeneous.
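For readers who want to try the same validation outside Tableau, here is a minimal Python sketch. It is not the course’s method, only an illustration: the data is simulated (each client leaves with a probability of 0.20), and the function names `last_digit` and `churn_rate_by_last_digit` are hypothetical.

```python
import random
from collections import defaultdict

def last_digit(customer_id):
    """Same idea as RIGHT(STR(...), 1): last character of the id as text."""
    return str(customer_id)[-1]

def churn_rate_by_last_digit(clients):
    """clients: iterable of (customer_id, exited) with exited equal to 0 or 1."""
    counts = defaultdict(int)
    exits = defaultdict(int)
    for customer_id, exited in clients:
        digit = last_digit(customer_id)
        counts[digit] += 1
        exits[digit] += exited
    return {d: exits[d] / counts[d] for d in sorted(counts)}

# Simulated data (not the real dataset): 10 000 ids, each leaving with probability 0.20.
random.seed(0)
clients = [(15_000_000 + i, int(random.random() < 0.20)) for i in range(10_000)]

for digit, rate in churn_rate_by_last_digit(clients).items():
    print(digit, f"{rate:.1%}")
```

If the data is homogeneous, the ten rates stay close to the overall churn rate, with only slight variations from one digit to the next.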

# Conclusion

Here’s how you can check the homogeneity of your data. You take a variable that has no impact on whether a client leaves or stays with the bank. The example with the last digit of the « Customer Id » is excellent. We were able to verify that each category of this variable has about the same proportion of clients leaving the bank. Since that is the case, we can validate our data.

Imagine another result. When we do the test with the last digit of the « Customer Id », we observe that for one of the digits, the rate of clients who left is much higher than the average. This would show us that there is a problem in our data because it indicates an anomaly.

You can find other ways to verify your data by using other « insignificant variables » to see if the distribution is homogeneous. But be careful when you select an « insignificant variable » because there may be traps.

Here is an example. If you create a variable that takes the first letter of the first name, the distribution will not be homogeneous. The reason is simple: many more people have a name that starts with the letter « M » than with the letter « Y ».

-Steph

I have just enrolled in a Data Science course on Udemy and I learned good stuff.

In the previous article, we learned how to work with aliases. Now we will learn how to add a reference line to our bar chart.

Before I start, I’ll show you a trick in Tableau.

In our bar chart, we can see the labels in this order: the percentage and, below it, « Stayed » or « Exited ».

We will reverse this order. Go to this rectangle.

And place the « Exited » label above the « SUM(Number of Records) » label.

Look, the label « Stayed » is now above the percentage.

With that, we can understand the bar chart more easily.

Let’s add a reference line. But first, I think you’d like to know why I’m talking to you about a reference line.

A reference line helps us to compare bar chart results with a benchmark. This benchmark is represented by this reference line.

In our case, the benchmark is the percentage of clients who left the bank in our sample of 10 000 people.

The first thing to do is find this percentage in our bar chart. To be able to do that, remove « Gender » from « Columns ».

Boom, we have a new bar chart.

Look, we only have the percentage of clients who left the bank and the percentage of clients who stayed in the bank.

We see that in our sample of 10 000 people, 20% of the clients left the bank and 80% of the clients stayed. This means that the churn rate (client departure rate) is 20%.

What we’re going to do is add this churn rate to our A/B test. To return to our A/B test, press Ctrl+Z or Command+Z twice, or click twice on the « Back » button in the menu bar.

Now we know that, on average, 20% of the clients left the bank.

We will add a horizontal line on the Y axis (Y = 20%) to compare the 20% churn rate with the 2 categories, male and female.

Let’s go. Right-click on the vertical axis (Y axis) and select « Add Reference Line ».

A window appears with several options.

You have the choice to add a line, a band, a distribution or a box plot.

We will use the line for the entire table.

Click on the « Line » button and select the « Entire Table » option. In « Value », select « Constant ».

The constant is 20%, so you need to enter 0.20 in « Value ».

It’s possible to put a label on this reference line. For example, if the reference line corresponds to a formula, the label displays the formula. But in our case, the constant is 20% and it’s already displayed on the vertical axis, so we will select « None ».

For the format of the line, select the continuous line and click on the « OK » button.

Our reference line is now added to our chart.

Here is what we can see. Female clients are more likely to leave the bank than average clients. Male clients are less likely to leave the bank than average clients.
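The comparison against a benchmark can also be sketched in Python. This is a minimal illustration with made-up flags (1 = left the bank, 0 = stayed), so the percentages differ from the real sample; the helper name `churn_rate` is hypothetical.

```python
def churn_rate(exited_flags):
    """Share of clients who left, given a list of 0/1 flags."""
    return sum(exited_flags) / len(exited_flags)

# Made-up flags per gender (1 = left the bank, 0 = stayed).
groups = {
    "Female": [1, 0, 0, 1],
    "Male": [0, 0, 0, 0, 1, 0],
}

# The benchmark is the churn rate over everyone: the role of the reference line.
all_flags = [flag for flags in groups.values() for flag in flags]
benchmark = churn_rate(all_flags)

for name, flags in groups.items():
    rate = churn_rate(flags)
    position = "above" if rate > benchmark else "at or below"
    print(f"{name}: {rate:.0%} is {position} the benchmark of {benchmark:.0%}")
```

The benchmark is computed over all clients together, not as the average of the two group rates, which matters when the groups have different sizes.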

In our case, it’s easy to see because there are only 2 categories, men and women.

Now you know how to add a reference line in a bar chart.

-Steph

## Dataset For Data Mining

I have just enrolled in a Data Science course on Udemy and I learned good stuff.

To get the dataset for Data Mining, go to the superdatascience website. In « Part.1 Visualization », you will see the section « How to use Tableau for Data Mining ». Click on « Churn Modeling » to download the file.

Once you have downloaded the file, move it to the directory you created for the course. In this directory, create a new directory (unless you have already done it) named « 2.Chunk investigation ».

Open this file with Excel or other spreadsheet software.

Note that we use this dataset for the visualization part, and we will also use it for the modeling part.

Let’s analyze the data of this dataset.

This dataset is quite large because it contains 10 000 rows and a few columns. It is a list of a bank’s clients. The client information is:

• Surname (last name)

• Credit score (the measure that indicates the client’s ability to borrow)

• Geography (client’s country)

• Gender (male or female)

• Age

• Tenure (the number of years the client has been with the bank)

• Balance (balance of the client’s bank account)

• NumOfProducts (number of products that the client has in the bank: credit card, contract, account)

• HasCrCard (does the client have a credit card ?)

• IsActiveMember (did the client use his/her credit card during the last month ?)

• EstimatedSalary (the bank’s estimate of the client’s annual salary)

• Exited (did the client leave the bank ?)

Now, I will explain the context of this dataset. This bank has branches in several countries, like Germany, Spain and France. The bank noticed that lately many clients have left. The bank tracks a « churn rate », which is the rate of customers who leave the bank, and for a few months the « churn rate » has been much higher than usual. That is why the bank needs a data scientist (you) to find the problem and propose solutions.

This dataset is a small sample of the bank’s clients: 10 000 randomly selected clients.

The « Exited » column didn’t exist before. It was created when the bank realized that an abnormal number of clients were leaving.

Then the bank observed these clients for 6 months to see which clients left the bank.

In the « Exited » column, the number « 1 » means that the client left the bank and the number « 0 » means that the client stayed in the bank.

To analyze this dataset, you’ll need to do A/B tests. For example, a classic A/B test is to see if women are more likely to leave the bank than men. That means counting the men who left the bank, counting the women who left the bank, and then normalizing each count by the total number of clients of that gender. It’s important to normalize because the sample doesn’t contain the same proportion of women and men. Then, based on the last column, « Exited », you’ll find out whether men or women are more likely to leave the bank.
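To see why the normalization matters, here is a small Python sketch. The counts are made up for the illustration (they are not taken from the dataset), and the function name `normalized_rates` is hypothetical.

```python
# Made-up counts: the sample contains many more men than women.
left = {"Female": 30, "Male": 40}
total = {"Female": 100, "Male": 400}

def normalized_rates(left, total):
    """Divide the number of leavers in each group by that group's size."""
    return {group: left[group] / total[group] for group in total}

rates = normalized_rates(left, total)
for group, rate in rates.items():
    print(f"{group}: {left[group]} left out of {total[group]} ({rate:.0%})")
```

With these numbers, more men than women left in absolute terms, but once each count is divided by the size of its group, women turn out to be more likely to leave. The raw counts alone would give the wrong picture.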

Once you have relevant results, you can show your report to the bank, and with this report you should be able to propose solutions. For example, if the report says that women leave the bank in large numbers, there is a problem, and it’s necessary to check whether the bank’s offer suits women. Or it’s possible that another bank has a much more attractive offer for women, or something else.

You will learn how to investigate the dataset and find answers in the client information with A/B tests.

-Steph

## The Importance Of Counting Calories

I watched a Jamcore DZ video and I learned good stuff.

As you already know, food is our energy. We measure the energy from food in calories. Calories come from macronutrients: carbohydrates, proteins and lipids.

# Count the number of calories

The minimum number of calories we need to consume from carbohydrates, proteins and lipids is the Basal Metabolic Rate (BMR). BMR is the number of calories your body needs to function without any physical activity. That’s the number of calories you burn if you stay in bed all day.

From your BMR, you can find your Total Daily Energy Expenditure (TDEE). TDEE is the number of calories your body needs in a day when you train.

TDEE varies according to your level of activity during the day (the intensity of the tasks you have to do during the day and the intensity of your training).

TDEE is always dynamic because it’s influenced by Non-Exercise Activity Thermogenesis (NEAT). NEAT is the number of calories you burn in a day through activities other than workouts. But note that it doesn’t count the calories you burn when you sleep and eat. It only counts the calories you burn when you walk, read, write, work, study, etc.

As the intensity of your non-sport tasks changes all the time, your TDEE changes all the time too. Here are 2 scientific studies to better understand NEAT, here and there.

Now that you’ve seen that your TDEE changes all the time, you need to know that it also changes your caloric deficit when you want to lose weight (get shredded) and your caloric surplus when you want to gain muscle.
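To make the arithmetic concrete, here is a rough Python sketch. It relies on an assumption that is not from the video: TDEE is approximated by multiplying BMR by an activity factor, which is a common simplification. Every number (the BMR, the factors, the 300-calorie adjustment) is hypothetical and only serves the illustration.

```python
def tdee(bmr, activity_factor):
    """Approximate daily expenditure by scaling BMR with an activity multiplier."""
    return bmr * activity_factor

def calorie_target(tdee_value, adjustment):
    """Negative adjustment = caloric deficit, positive = caloric surplus."""
    return tdee_value + adjustment

# Hypothetical numbers, for illustration only.
quiet_day = tdee(bmr=1700, activity_factor=1.4)
busy_day = tdee(bmr=1700, activity_factor=1.8)

print(round(calorie_target(quiet_day, -300)))  # target on a quiet day when losing weight
print(round(calorie_target(busy_day, +300)))   # target on a busy day when gaining muscle
```

The same person gets a different target on a quiet day and on a busy day, which is why a fixed calorie number doesn’t follow your real expenditure.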

# Lose weight or gain muscle

The phases for losing weight (getting shredded) and gaining muscle are influenced by 2 things: morphology (endomorph, ectomorph and mesomorph) and genetics.

Take the example of an endomorph. An endomorph is a person who stores fat easily because he/she has greater resistance to leptin. Leptin is a hormone that manages your metabolism and your appetite (leptin controls your feeling of satiety). An endomorph who has this type of information has good tools to manage his/her Total Daily Energy Expenditure (TDEE).

In my case, I’m a mesomorph, which means that I lose weight fast and gain muscle fast. At first, I didn’t count my calories because it was easy for me to maintain a good physique. But since I’ve been studying my body to find out how it works, my results are much better over the long term.

That’s why, no matter what your body type is, I advise you to take the time to count your calories to have the best body possible in the long term.

I use an app on my smartphone to count my calories (you can scan barcodes). It’s MyFitnessPal. Try it, it’s really helpful.

If you intentionally neglect your health, you will accidentally end up with an illness.

-Steph

## Masturbation And Sport

I watched a Jamcore DZ video and I learned good stuff.

My intention is to explain masturbation’s effect on sport activity without being vulgar. Masturbation is a normal activity for some people and abnormal for others. In some religions, it’s a forbidden or ignoble activity, but it’s an activity that most people do.

# Types of person

The open minded person

For this person, masturbation is a normal activity because it’s natural and puts you in a good mood. It’s not something bad.

The guilty person

For this person, masturbation is an abnormal, ignoble activity, but despite that, this person does it and feels guilty afterwards.

The obsessed person

For this person, masturbation is an obsessive activity. This person could masturbate every hour, every day.

The bored person

For this person, masturbation is an activity to kill boredom. This is a person who has nothing to do in life or during the day and who, not knowing what else to do, masturbates.

# How the body reacts

1. The intensity of masturbation increases (black line)

2. There is a release of a hormone, dopamine. Before masturbation, a first release of dopamine causes sexual arousal, and there are other dopamine releases during masturbation (blue line).

3. Sexual arousal during masturbation starts the process that leads to the explosion, which is the orgasm (red line).

4. Once the orgasm has happened, there is a release of another hormone, prolactin. Prolactin causes the loss of interest in continuing. You are satisfied and you don’t want to continue to masturbate. In some situations, prolactin can make you feel guilty (green line).

# Stop masturbation for 1-3 weeks

There is a black line in the middle of the chart. This is the testosterone line. When you stop masturbation for 1-3 weeks, there is an increase in your testosterone level. But after 3 weeks, the testosterone level returns to normal. You have to use your normal testosterone level as a baseline to see if your testosterone is affected by anything.

It’s possible that during masturbation, your testosterone level is below your normal level. In this case, it’s another problem and it’s necessary to see your doctor for more details.

Most people think that masturbation makes you lose testosterone, and that if it makes you lose testosterone, it also makes you lose muscle. The truth is that this is a myth.

# Masturbation and muscle gain

Does masturbation affect muscle gain? The answer is « No » because the elements that influence muscle gain are the training program, nutrition and rest/sleep. The only thing that masturbation does before or after a training session is increase the heart rate.

Warning

Now that you have this information, it doesn’t mean that I encourage you to masturbate several times a day, every day. Do it when you want, according to your desires, but in moderation.

Here are several scientific studies on masturbation and testosterone:

• Endocrine response to masturbation-induced orgasm in healthy men following a 3-week sexual abstinence. Click here.
• 3 Weird but Surprisingly Effective Tricks & Tips to Raise Natural Testosterone Levels. Click here.
• Lack of sexual activity from erectile dysfunction is associated with a reversible reduction in serum testosterone. Click here.
• Studies on the relationship between plasma testosterone levels and human sexual activity. Click here.
• Acute changes in plasma testosterone levels and their relation to measures of sexual behavior in the male house mouse (Mus musculus). Click here.
• Effects of ejaculation on levels of testosterone, cortisol, and luteinizing hormone in peripheral plasma of rhesus monkeys. Click here.
• Relationship between sexual satiety and brain androgen receptors. Click here.
• Increased estrogen receptor alpha immunoreactivity in the forebrain of sexually satiated rats. Click here.
• A study of the prostate, androgens and sexual activity of male rats. Click here.
• Androgen Receptors: 5 Ways to Increase the Density and Sensitivity of the AR. Click here.
• Scientists visit sex club for research into testosterone levels. Click here.
• Male and female salivary testosterone concentrations before and after sexual activity. Click here.
• Sex and Testosterone: Most Enjoyable Way to Boost Male T-Levels Naturally. Click here.

-Steph

## Back To The Source Of Human Strength (Part 1)

I read a Nerd Fitness article and I learned good stuff.

When I’m at the gym, people tell me why they started to train. There are several reasons: to lose weight, to gain muscle or to be in shape for a special event. There is a golden rule to get results in strength training: eat clean and lift heavy.

Everybody says it, eat clean and lift heavy, but what does that mean? I think it’s time to go back to the source.

# Why

Life is easier when we’re stronger, did you notice? For example, carrying bags from the supermarket, carrying a piece of furniture, running to be on time, etc. Everything is easier.

Strength training is extremely efficient for building muscle and burning fat. Whether you want to lose 7kg (15lbs) or 45kg (100lbs), it’s the same thing.

Strength training helps to build muscle and lose fat, but it also helps to stop or even reverse sarcopenia (the loss of skeletal muscle with aging). Training our skeletal muscles allows us to stay independent (and therefore avoid nursing homes) and live longer.

The list of benefits isn’t finished yet.

A nice athletic body

Strength training leads to a higher oxygen consumption after exercise than aerobic exercises. After training, your body needs to make a lot of effort to recover and return to its normal state, that is, the state you had before the training. Scientific studies have shown that your metabolism stays elevated for 38 hours after your training.

Strength training increases your metabolism and your Resting Metabolic Rate (RMR) because your body needs more calories to maintain your muscles than to maintain your fat. It’s estimated that for each 0.45kg (1lb) of muscle, your RMR increases by 30-35 calories.

Strength training helps you to improve your balance and coordination, improve your cholesterol levels, control blood sugar, stop muscle loss, improve blood flow, reduce your resting blood pressure, build a stronger heart and increase bone density.

Feel better

It’s clear that with strength training you have more self-confidence, more energy, a better mood, less anxiety and less stress. It also improves the quality of your thinking: a scientific study has shown that it increases cognitive function.

It’s not advisable to do strength training 1 hour before sleeping. On the other hand, training early in the day helps to prevent sleep apnea and insomnia.

Prevents diseases and degenerative diseases

You’ve noticed that many men and women die because of heart disease. Strength training helps to address risk factors of heart disease such as inactivity, diabetes, obesity, high blood pressure and cholesterol. Cardiologists are beginning to recommend strength training to people who have had a heart attack.

Strength training helps to manage and improve the lifestyle of people with clinical depression, cancer survivors, people who have had a spinal cord injury or who have recently had a stroke, and people with fibromyalgia, lymphedema, Down syndrome, Parkinson’s disease, osteoporosis and arthritis.

It’s fun

Strength training helps you reach your goals, whether it’s an effective 20-minute workout to look good naked or training for a sports competition. It’s fun to see our progress because it’s like leveling up in a video game. If you want to be better at a sport like badminton or rock climbing, strength training is a good choice.

People who shouldn’t train

The only people I’ve found who shouldn’t train are people who have an injury, and it’s a break that doesn’t last long. We’re human beings and it’s natural to carry things and move. Strength training is recommended for pregnant women, children and teenagers, and even paraplegics.

Obviously, it’s recommended to see your doctor before beginning a strength training program so that the program can be adjusted.

# Basic objections

But I’m too old, it’s not reasonable

When people between 30 and 60 years old tell me that they’re too old, I laugh because it’s a lie. It’s like saying « I don’t have time » and later I see a message on Facebook like « Yesterday, Game of Thrones was epic! ». Scientific studies have shown that people between 70 and 90 years old had impressive results in 10 weeks.

Other scientific studies have shown that it helps to avoid dementia and to delay Alzheimer’s.

People who think they’re too old to train are exactly the people who should train.

But I just want to be better for a sport and I want to stay fit

Strength training improves your muscular endurance; scientific studies have shown it. Resistance exercises help to increase overall muscular endurance, fix nervous system problems and increase the activation of motor units within your muscles.

I don’t want to bulk too much

It’s a great myth of strength training: you won’t get a bodybuilder’s body in 1 year, no matter whether you’re a man or a woman.

For women, here are 2 examples of moms who have become strong and fit without having a bodybuilder’s body: the stories of Veronica and Bronwyn.

Having the body of a famous bodybuilder doesn’t happen overnight. Naturally, we don’t have the amount of hormones needed to have the body of a famous bodybuilder. It takes a huge amount of food (8 000 – 10 000 calories per day) and a huge amount of drugs (no, no, it’s another type of pharmacy, it’s true).

I’m fat, I only want to lose weight

It’s a good reason to start strength training. While losing weight, you surely want to keep your muscles. I mean, you want to keep your muscles while you lose fat. With strength training, you quickly lose centimeters (inches) on different parts of your body. It’s true that the overall weight loss may seem slow, which is why it’s important to track your measurements. When you combine strength training with a caloric deficit, it increases your metabolism so you lose fat.

It’s boring

What I like about strength training is that we can see our progress quickly. It’s like leveling up in a video game. I think people get bored when they have nothing to do between 2 sets. This is why I advise people to bring headphones and listen to their own music selection or an audiobook to increase motivation.

# Muscles and strength training

Don’t be fancy. It’s interesting to understand how our muscles work to avoid falling into the Matrix’s trap. What I mean is that the only person you have to impress is yourself, not the others.

Muscle fibers are small muscle cells that build up our muscles. They are long, cylindrical and about the size of a strand of hair. They are composed of myofibrils surrounded by sarcoplasm. I really summarized that, but if you want to see it in detail, click here.

We have around 642 skeletal muscles that work together to move our body. Imagine: when you bend your arm, your biceps contracts and your triceps does the opposite (elongates) to let the elbow bend. Every muscle of your body works as a team to make you move.

We have different types of muscle fibers:

Slow twitch (Type 1 fibers)

Slow twitch fibers are used to convert oxygen into energy over a long period of time. They don’t move quickly, so they are resistant to fatigue. These are the fibers we use most for marathons.

Fast twitch (Type 2 fibers)

Fast twitch fibers move quickly, so they’re not resistant to fatigue. Our body has 2 categories of fast fibers. The type 2A fiber has an endurance characteristic used for long sprints. The type 2X fiber has a « super fast » characteristic used for short sprints or lifting weights. I really summarized that, but if you want to see it in detail, click here.

Each individual has a different percentage of slow and fast fibers. It’s for this reason that people are naturally better at running long distances or sprints. It’s funny because we can see it in strength training with people who are better at sets with high or low repetitions.

It’s the end of the first part, and the second part is even more interesting.