Photo by Angiola Harry on Unsplash

Cancer. The mere utterance of the word sends shivers down the spine. The disease is the second leading cause of death across the world, with over 9.56 million deaths, behind only cardiovascular diseases.

If we look just at breast cancer, WHO reports that in 2018 alone the disease accounted for 2,088,849 cases, or 11.6% of the cancer pie, second only to lung cancer. It was also the fifth leading cause of cancer deaths that year, with 626,679 deaths, or 6.6% of the cancer pie.

The above numbers are dismal.

But is there hope? …

A case-based approach towards the method and its applications

Photo by Cathy Mü on Unsplash

Introduction —

We shall cover the following aspects in this article:

  1. Exploratory Data Analysis (EDA) to build our intuitions
  2. K-means Clustering in action
  3. What do marketers do with these clusters?

Let’s take a dataset and try to demystify concepts on the go.

Our data set is a collection of customer data at a shopping mall. This type of data is usually collected through a myriad of sources — Customer Relationship Management tools (CRM), transaction data, parking tickets, lucky draw coupons etc.

Our data frame has 200 rows of customer data across 5 features or variables.
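To make the idea concrete before looking at real data, here is a minimal sketch of what K-means does. It does not use the mall data set described above. It assumes synthetic data: 200 made-up customers with two hypothetical features (think annual income and spending score), and k = 4 is an arbitrary choice for illustration.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, labels


# Synthetic stand-in for the customer data: 4 groups of 50 customers each.
# The group centres and the noise level are assumptions, not real figures.
data_rng = random.Random(42)
centres = [(20, 20), (20, 80), (80, 20), (80, 80)]
customers = [
    (data_rng.gauss(cx, 5), data_rng.gauss(cy, 5))
    for cx, cy in centres
    for _ in range(50)
]

centroids, labels = kmeans(customers, k=4)
cluster_sizes = [labels.count(c) for c in range(4)]
print(centroids)
print(cluster_sizes)
```

Because the starting centroids are picked at random, K-means can settle on different groupings for different seeds, which is why library implementations typically rerun it from several starting points and keep the best result.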

Photo by Stephanie LeBlanc on Unsplash

Building and evaluating ML models is arguably the most sought-after stage of the data analysis pipeline. The rising anticipation while the code is executing, followed by the sheer joy of discovering a high-accuracy model, is a great feeling.

Recently, I was working on building a predictive model that would estimate house prices in Bengaluru, India. Restless as I was, I was particularly eager to check the performance of ML models very early in the pipeline. I knew I would end up with poorly performing models, but I did not know how poorly.

And that’s when it…

Photo by Saransh Sinha on Unsplash

The Struggle is real.

In the process of buying a house, the customer carries out extensive market research to identify the best properties. To collect market data, they talk to friends and family, property brokers or real estate agents, and carry out their own online research. The process can be excruciatingly long (over a few months!). But despite such efforts, customers at times end up buying a property that was not, after all, the steal that their agent suggested.

So how can data science come to the rescue in all this?

Enter predictive modelling.

Imagine how convenient it would be for such customers if they could stumble upon a website…

Photo by Franki Chamaki on Unsplash

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”

Sir Arthur Conan Doyle, Sherlock Holmes

The Business problem –

A Portuguese bank is facing a decline in revenues. Upon investigation, it has identified the root cause: clients are not depositing as frequently as before.

A term deposit allows the bank to hold onto a deposit for a specific amount of time, so that it can invest in higher-gain financial products to make a profit. Further, the bank typically stands a greater chance of convincing long-term deposit holders to buy other products, such as funds or insurance, to further increase its revenues.

As a result, the bank wishes to identify…

Gade Venkatesh

Love to read, write and share interesting stuff | Data science | Marketing
