Cancer: the mere utterance of the word sends shivers down the spine. According to ourworldindata.org, the disease is the second leading cause of death worldwide, with over 9.56 million deaths, behind only cardiovascular diseases.
If we look at breast cancer alone, WHO reports that in 2018 the disease accounted for 2,088,849 cases, or 11.6% of all cancer cases, second only to lung cancer. It was also the fifth leading cause of cancer deaths that year, with 626,679 deaths, or 6.6% of the total.
The above numbers are dismal.
But is there hope? …
We shall cover the following aspects in this article:
Let’s take a dataset and try to demystify concepts on the go.
Our dataset is a collection of customer data at a shopping mall. This type of data is usually collected from a myriad of sources: Customer Relationship Management (CRM) tools, transaction data, parking tickets, lucky draw coupons, etc.
Our data frame has 200 rows of customer data across 5 features or variables.
Building and evaluating ML models is arguably the most sought-after stage of the data analysis pipeline. The rising anticipation while the code is executing, followed by the sheer joy of discovering a high-accuracy model, is a great feeling.
Recently, I was working on building a predictive model that would estimate house prices in Bengaluru, India. Restless, I was particularly eager to check the performance of ML models very early in the pipeline. I knew I would end up with poorly performing models, but I did not know how bad.
And that’s when it…
In the process of buying a house, the customer carries out extensive market research to identify the best properties. To collect market data, they talk to friends and family, property brokers or real estate agents, carry out their own online research, and so on. The process can be excruciatingly long (over a few months!). But despite such efforts, customers at times end up buying a property that was not, after all, the steal their agent suggested.
Enter predictive modelling.
Imagine how convenient it would be for such customers if they could stumble upon a website…
― Sir Arthur Conan Doyle, Sherlock Holmes
A Portuguese bank is facing a decline in revenues. Upon investigation, the bank has identified the root cause: clients are not depositing as frequently as before.
A term deposit allows the bank to hold onto a deposit for a specific amount of time, so it can invest in higher-gain financial products to make a profit. The bank also typically stands a greater chance of persuading long-term deposit holders to buy other products, such as funds or insurance, to further increase its revenues.
As a result, the bank wishes to identify…
Love to read, write and share interesting stuff | Data science | Marketing