Algorithm & Analytics: Propel Digitization to deliver Business Value

“Data is inherently dumb... Algorithms are where the real value lies. Algorithms define action.”
- Peter Sondergaard, SVP of Gartner research
While a good number of businesses are in a daze over the hype around technology buzzwords like “Big Data” and “IoT”, often fuelled by IT vendors and industry analysts, a few have set their eyes on something even older than computing: algorithms. More precisely, machine learning algorithms. Gartner predicts that
 
  • By 2017, enterprises making decisions controlled by differentiated algorithms will achieve a 5% to 10% increase in revenue
  • By 2018, 40% of digital commerce sites will use price optimization algorithms and configure/price/quote tools to dynamically calculate and deliver product pricing.
  • By 2020, poor customer experiences will destroy 30% of digital business projects.

The obvious question, then, is how do we mitigate this risk? Application design and architecture are undergoing a rapid transformation: away from systems built to a fixed requirements specification, towards systems that evolve by observing human behaviour and its nuances. Digital products should be designed to evolve in response to the behaviour of their users.

The pith of this transformation is the emphasis on machine learning algorithms. Imagine an e-commerce system that can administer itself and adapt to the behaviour of its users. Such a system not only rolls out custom offers automatically to maximise conversion, but does so at the most opportune moment, because it has learnt the characteristics of its users. We can confidently predict that in the near future the primary target platform for digital applications will be digital personal assistants, the likes of which in today’s terms are Alexa, Siri and so on, but appreciably more advanced. The programming model and solution design will shift considerably, since your target is no longer the “dumb” browser but an intelligent "Personal Assistant", however artificial or digital it might be.

It is also ever more important to build intelligence into your offerings because of the rise of “IoT”. There is a slight difference in how we interpret IoT, though: IoT should essentially be viewed as the “Intelligence of Things”. It is remarkable how easily endless possibilities can be conceptualized with this framing, because we start speaking directly in terms of the outcomes of connecting things. The last thing you want is for your digital product to be the dumbest piece in the puzzle while everything around it is smart and intelligent.

In many digital transformation initiatives we have observed too much emphasis on doing a great many things all at once rather than building them organically. The result, in all such cases, is bloated machinery and systems that add little value to the business. In our strategic engagements in such transformations, the core emphasis has been on short feedback loops that collect as much data as early as possible, because it is only through meticulous evaluation of data at the right time that we can stay on, or steer towards, the right course in the digital transformation journey. In almost all cases, the course taken by an enterprise will be unique and difficult to replicate.

An enterprise can start right now with the traditional BI systems it already has, exploring and analyzing the data it already holds. It can then put machinery in place to collect data from many more sources and pool it, along with the data in the BI systems, into a bare-bones Big Data or real-time analytics system. Analytics is the discovery, interpretation and communication of meaningful patterns in data. If BI is about making decisions, analytics is about asking the right questions to explore the data; it complements BI systems with a deeper, exploratory perspective. Business analytics spans data mining, predictive analytics, applied statistics and advanced algorithms to recommend actions, delivered as an application suitable for a business user. Once you have enough data, you can run a predictive model over it to gather insights, enabling the enterprise to make quick decisions and take business action. Big data and analytics can turn massive volumes of raw data into smart, actionable business insights, and streaming analytics makes it possible to capture, analyze and act on all relevant data in real time, so decision makers can act instantly and with confidence.

The insights and value derived from this initiative point to the areas where intelligence can be introduced, be it customer experience, forecasting or process mining. This data trove can be used to train models or build neural networks that can become the basis for machine learning or deep learning modules for the enterprise. Your digital offerings need to be intelligent enough to adapt and learn autonomously if you don’t want to fall into the 30% that Gartner points out. Investment in machine learning and deep learning is of absolute importance to the success of any digital strategy in the years to come. In the rest of this blog, our machine learning group has curated a selection of ML algorithm families to give an insight into the possibilities, discussing each in layman’s terms. For those who would like to go deeper, a detailed analysis and documentation is available here, along with sample implementations in R & Python.

Regression

Widely used for prediction and forecasting, Regression Analysis is a statistical process for investigating the relationship between variables. In the Victorian era, Sir Francis Galton showed that ‘when dealing with the transmission of stature from parents to children, the average height of the two parents, … is all we need care to know about them’ (1886).

In the context of Galton’s analysis, Regression tries to answer the following questions:

  1. Can the height of a child be predicted from the parents’ heights?
  2. Is variation in the heights of children related to variation in the parents’ heights?
  3. What is the extent to which the estimated relationship is close to the true relationship?

Applications of regression are numerous, ranging from predicting health outcomes in medicine, stock prices in finance and future demand for a product in business, to power usage effectiveness in high-performance computing and the optimization of manufacturing processes in engineering. Regression has been central to the field of econometrics; for instance, Okun’s law uses a linear regression model to represent the relationship between the unemployment rate and GDP growth. In the marketing industry it can be used to assess marketing effectiveness and inform product pricing, and in the insurance domain regression has been applied effectively to predict claims from demographics. Linear regression yields well-interpretable models that can be used to estimate trends and forecast sales.
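As a taste of what the linked implementation covers, here is a minimal sketch of fitting a linear trend with scikit-learn. The advertising-spend and sales figures are synthetic stand-ins for illustration, not real data.

```python
# Minimal linear regression sketch: fit a trend on synthetic sales data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
ad_spend = rng.uniform(10, 100, size=200).reshape(-1, 1)        # hypothetical feature
sales = 3.5 * ad_spend.ravel() + 20 + rng.normal(0, 15, 200)    # noisy linear signal

model = LinearRegression().fit(ad_spend, sales)
print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print("forecast for spend=120:", model.predict([[120]])[0])
```

The fitted coefficients are directly interpretable, which is exactly why linear regression remains the workhorse for trend estimation and sales forecasting.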

View code: Regression

Classification

All industries suffer from voluntary churn. It is important for a telecom company to identify dissatisfied customers early to reduce revenue loss. In machine learning, the problem of predicting customer churn can be viewed as a classification problem with two classes: high-risk customers and low-risk customers. Classification is an example of pattern recognition: the details of a customer form the input data, and a machine learning algorithm known as a classifier maps this input to a category. A similar problem is consumer loan default prediction, which helps reduce the risk of insolvency. Another example is spam classification, where emails have to be categorized as spam or non-spam. Some of the popular classification algorithms are:

  • Logistic Regression
  • Support Vector Machines
  • LDA
  • Decision Trees
  • Perceptrons

The choice of algorithm is made based on the complexity of the problem. Logistic regression has been widely used in medicine and the health industry; many medical scales (TRISS, MPI, PSS, etc.) have been developed and analyzed using logistic regression. Perceptrons and SVMs find applications in OCR, cutting-tool state classification during metal turning, detection of faults in a process plant, credit-card fraud detection, predicting delayed flights and so on. Artificial Neural Networks are suitable for solving complex problems that can’t be addressed directly by rule-based programming models; they are extensively employed in robotics (e.g. for object recognition) and in Computer Numerical Control systems wherever handwritten rules seem impractical.
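To make the churn example concrete, here is a minimal sketch of a logistic regression churn classifier. The features, the labels and the rule generating them are synthetic assumptions for illustration only, not a real telecom dataset.

```python
# Sketch: churn prediction as binary classification with logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.normal(50, 20, n),      # monthly usage (minutes)
    rng.integers(0, 10, n),     # support calls in the last quarter
    rng.uniform(0, 60, n),      # months as a customer
])
# Assumed rule for the synthetic labels: frequent support calls plus low tenure -> churn.
y = ((X[:, 1] > 5) & (X[:, 2] < 24)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```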


Clustering

Clustering is the process of partitioning a dataset into distinct groups such that objects in the same group are more similar to each other than to objects in other groups. Consider the following scenarios:

  1. A cellular company wants to find the locations to place its towers so that users within a cluster receive optimum signal strength from the tower.
  2. Group accident-prone areas in a region from road accident data, to investigate the circumstances associated with accidents or to decide where to set up emergency care wards.

To apply clustering, we have to define what it means for two or more objects to be similar. Different cluster models and algorithms exist, each tailored to different applications for unearthing hidden patterns in data. Market researchers can use clustering algorithms to perform market segmentation by identifying the right subgroups of customers to target. Clustering is popular in many fields and has been extensively applied in medicine and biology for imaging, clustering genes and proteins, and sequence analysis; in climate analysis for identifying patterns in the atmosphere and ocean; in information systems for detecting anomalies in a process; in social network analysis for grouping people into communities; in e-commerce for recommending new items for purchase; in image segmentation; and more.
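A minimal sketch of the tower-placement scenario, assuming randomly generated user coordinates and k-means as the clustering algorithm:

```python
# Sketch: k-means clustering of user coordinates to suggest tower locations.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Three synthetic "towns" of users on a 2-D map.
users = np.vstack([
    rng.normal(loc=(0, 0), scale=1.0, size=(100, 2)),
    rng.normal(loc=(8, 3), scale=1.5, size=(100, 2)),
    rng.normal(loc=(2, 9), scale=1.0, size=(100, 2)),
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=1).fit(users)
print("suggested tower locations (cluster centroids):")
print(kmeans.cluster_centers_)
```

Here similarity is simply Euclidean distance; a different application would call for a different distance measure or cluster model.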

View code: Clustering

Dimensionality Reduction

Dimensionality reduction refers to the techniques for transforming data to remove redundant features from a dataset while retaining the important ones, so that:

  1. Complex data can be visualized and interpreted more easily.
  2. The performance of the machine learning model improves once redundant data is eliminated.
  3. Storage space and the time required to process the data are reduced.
  4. Noise is removed from the data.

Principal Component Analysis (PCA) is an extensively used dimensionality reduction algorithm. One of its most successful applications is in computer vision and image recognition, extracting from an image database the significant features that best discriminate between individual images. Independent Component Analysis (ICA) is another popular algorithm, used to separate a mixed signal into its sources, for example to extract individual speech signals when a group of people are talking in a noisy room, to decompose EEG recorded by multiple sensors, or to separate interfering radio signals arriving at a mobile phone.

The sensor signals from the embedded accelerometer and gyroscope in a Samsung Galaxy S II can be used to predict whether the user is walking, sitting, standing or lying down. While preprocessing these signals, many new variables are generated; dimensionality reduction helps get rid of the unwanted variables or dimensions, which ultimately improves the accuracy of the final prediction model.
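A minimal PCA sketch, assuming synthetic sensor-style data in place of the real accelerometer and gyroscope signals:

```python
# Sketch: PCA compresses many correlated features into a few components.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
latent = rng.normal(size=(500, 3))                  # 3 underlying motion patterns
mixing = rng.normal(size=(3, 60))                   # expanded into 60 correlated features
signals = latent @ mixing + rng.normal(0, 0.1, (500, 60))   # plus measurement noise

pca = PCA(n_components=0.95)                        # keep 95% of the variance
reduced = pca.fit_transform(signals)
print("original features:", signals.shape[1], "-> reduced:", reduced.shape[1])
print("explained variance ratios:", np.round(pca.explained_variance_ratio_, 3))
```

Because only three latent patterns drive the sixty features, PCA recovers a handful of components that carry almost all of the information.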

View code: Dimensionality Reduction

Rule based Learning

Rule based learning is the acquisition of structured knowledge in the form of rules. Rules are human readable and widely used in expert systems. A sample rule deduced from a customer transaction database might look like {milk, bread} → {butter}, which can be interpreted as “those who buy milk and bread together will also buy butter”. Learning algorithms typically use a statistical evaluation function to pick the rules. One of the principal areas of application is health information technology, where rules can be used to create Clinical Decision Rules that help increase the accuracy of a clinician’s diagnostic and prognostic assessments. In Clinical Decision Support Systems, rule learning eliminates the need for hand-writing rules and for expert input, and it can aid any situation where producing rules manually would be too labor intensive. Other real-world applications include preventive maintenance of electromechanical devices in chemical plants, creating rules to set the parameters of a system controlling the separation of crude oil from natural gas in an oil refinery, and deciding, in a customer support system, which technician to assign when a customer reports a problem.
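A minimal sketch of the arithmetic behind such rules, computing the support and confidence of {milk, bread} → {butter} over a toy transaction database (the transactions are made up for illustration; a rule learner searches for such rules automatically):

```python
# Sketch: support and confidence of the rule {milk, bread} -> {butter}.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter", "eggs"},
    {"milk", "eggs"},
]

antecedent, consequent = {"milk", "bread"}, {"butter"}
n = len(transactions)
support_ante = sum(antecedent <= t for t in transactions) / n            # P(milk, bread)
support_rule = sum((antecedent | consequent) <= t for t in transactions) / n   # P(milk, bread, butter)
confidence = support_rule / support_ante                                 # P(butter | milk, bread)

print(f"support = {support_rule:.2f}, confidence = {confidence:.2f}")
```

Support and confidence are exactly the kind of statistical evaluation functions a learner uses to keep strong rules and discard weak ones.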

View code: Rule Learning

Machine Learning in Practice: Predictive Maintenance

“Anything that can go wrong, will go wrong.”
- Murphy’s law
Predictive maintenance focuses on the detection, diagnosis and anticipation of machine failure so that maintenance can be performed in advance. A commercial flight generates about 20 TB of sensor data per engine per hour of operation. This data can be made useful and insightful, providing alerts and enabling the detection or prediction of engine failure. The rise of IoT and big data analytics gives us a plethora of ways to reduce the cost of operating complex machines and to improve system availability and responsiveness by minimizing the risk of unexpected failures, thereby increasing the productivity of assets. Predictive maintenance techniques have also been used for root-cause analysis of failures and to schedule repair before a machine breaks.

State-of-the-art systems employ a machine learning model at their core. In the naive preventive maintenance approach, machines are inspected at fixed time intervals and repaired as needed. In predictive maintenance, apart from failure statistics and test results, we draw additional data from sensors and devices, including environmental conditions, to improve the predictive accuracy of the model. Such a model can compute metrics like time between failures and time to repair, predict whether a component will fail within a certain time frame, or assess the probability of a total system failure when multiple components fail. The model is trained with an online learning algorithm that can re-optimize it instantly and adapt dynamically to new patterns whenever a failure goes undetected, ensuring that performance does not degrade, and may even improve, over time.
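A minimal sketch of the online-learning idea, assuming synthetic sensor batches and scikit-learn’s SGDClassifier as the incremental learner; the features, thresholds and failure pattern are illustrative assumptions, not a real maintenance dataset.

```python
# Sketch: an online failure-within-window classifier, updated incrementally
# as new batches of sensor readings arrive.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(3)
clf = SGDClassifier()                              # linear model trained with stochastic gradient descent
classes = np.array([0, 1])                         # 0 = healthy, 1 = fails within the window

def next_batch(size=256):
    """Simulate a batch of readings: temperature, vibration, hours since service."""
    X = np.column_stack([
        rng.normal(70, 10, size),                  # temperature
        rng.gamma(2.0, 1.0, size),                 # vibration level
        rng.uniform(0, 5000, size),                # hours since last service
    ])
    y = ((X[:, 1] > 4) | (X[:, 2] > 4500)).astype(int)   # assumed failure pattern
    return X, y

for step in range(50):                             # stream of incoming batches
    X, y = next_batch()
    clf.partial_fit(X, y, classes=classes)         # re-optimize the model incrementally

X_eval, y_eval = next_batch(1000)
print("held-out accuracy:", (clf.predict(X_eval) == y_eval).mean())
```

Because the model is refreshed batch by batch rather than retrained from scratch, it can absorb newly observed failure patterns as soon as they appear in the sensor stream.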