Algorithm & Analytics: Propel Digitization to Deliver Business Value

Data is inherently dumb... Algorithms are where the real value lies. Algorithms define action.
- Peter Sondergaard, SVP of Gartner Research
 
While many businesses remain dazed by the hype around technology buzzwords like “Big Data” and “IoT”, much of it fuelled by IT vendors and industry analysts, a few have set their sights on something even older than computing itself—algorithms. More precisely, machine learning algorithms. Gartner predicts that:
  • By 2017, enterprises making decisions controlled by differentiated algorithms will achieve a 5% to 10% increase in revenue.
  • By 2018, 40% of digital commerce sites will use price optimization algorithms and configure/price/quote tools to dynamically calculate and deliver product pricing.
  • By 2020, poor customer experiences will destroy 30% of digital business projects.
The obvious question, then, is how we can mitigate this risk. Application design and architecture are undergoing a rapid transformation: instead of being built to a fixed set of specified requirements, systems are now expected to evolve by observing human behavior and its nuances. Digital products should be designed to evolve in response to the behavior of their users.
 
The pith of this transformation is the emphasis on machine learning algorithms. Imagine an e-commerce system that is able to self-administer and adapt to the behavior of its users. Such a system not only rolls out custom offers automatically to maximize conversion, but does so at the most opportune moment, because it has learnt the characteristics of its users. We can confidently predict that, in the near future, the primary target platform for digital applications will be digital personal assistants—the likes of today’s Alexa or Siri, but appreciably more advanced. The programming model and solution design will shift considerably, as your target is no longer the “dumb” browser but an intelligent personal assistant, however artificial or digital it might be. The rise of the IoT makes it ever more important to build intelligence into your offerings. There is a slight difference in how we interpret IoT, though: IoT should essentially be viewed as the “Intelligence of Things”. It is remarkable how easy it is to conceptualize endless possibilities with this framing, because we start speaking directly in terms of the outcomes of connecting things. The last thing you want is for your digital product to be the dumbest piece of the puzzle while everything around it is smart and intelligent.
 
We have observed in many digital transformation initiatives that too much emphasis is placed on doing many things all at once rather than building capabilities organically. The result, in all such cases, is bloated machinery and systems that add little value to the business. In all our strategic engagements in such transformations, the core emphasis has been on short feedback loops that collect as much data as early as possible, because only through meticulous evaluation of data at the right time can we stay on, or steer back to, the right course in this digital transformation journey. And, in almost all cases, the course taken by an enterprise will be unique and difficult to replicate. An enterprise can start right now with the traditional BI systems it already has in place and begin exploring and analyzing the data it already owns. It can then put in place the machinery to collect data from many more sources and pool it, along with the data in the BI systems, into a bare-bones big data analytics or real-time analytics system. Analytics is the discovery, interpretation, and communication of meaningful patterns in data. If BI is about making decisions, analytics is about asking the right questions of the data. Analytics can complement BI systems with a deeper, exploratory perspective on data. Business analytics includes data mining, predictive analytics, applied statistics, and advanced algorithms to recommend actions, and is delivered as an application suitable for a business user. Once you have enough data, you can run a predictive model over it to gather insights, enabling the enterprise to make quick decisions and take business action. Big data and analytics can turn massive volumes of raw data into smart, actionable business insights. Streaming analytics makes it possible to capture, analyze, and act on all relevant data in real time, enabling decision makers to act instantly and with confidence.
 
The insights and value derived from this initiative serve as pointers to the areas where we can bring in intelligence, be it customer experience, forecasting, or process mining. The data trove can be used to train models or build neural networks that can become the basis for the enterprise’s machine learning or deep learning modules. Your digital offerings need to be intelligent enough to adapt and learn autonomously if you don’t want to fall into the 30% that Gartner points out. Investment in machine learning and deep learning is of absolute importance to the success of your digital strategy for the years to come. Our machine learning group has curated a class of ML algorithms in the rest of this blog to provide insight into the possibilities, discussing the various algorithms in layman’s terms. For those who would like to go deeper, a detailed analysis and documentation is available here, along with sample implementations in R & Python.
 
Regression
Widely used for prediction and forecasting, Regression Analysis is a statistical process for investigating the relationship between variables. In the Victorian era, Sir Francis Galton showed
that ‘when dealing with the transmission of stature from parents to children, the average height of the two parents, . . .  is all we need care to know about them’ (1886).
 
In the context of Galton’s analysis, Regression tries to answer the following questions:
  1. Can the height of a child be predicted from the parents’ heights?
  2. Is variation in the heights of children related to variation in the parents’ heights?
  3. What is the extent to which the estimated relationship is close to the true relationship?
Applications of regression are enormous: predicting health outcomes in medicine, stock prices in finance, future demand for a product in business, power usage effectiveness in high performance computing, or the optimization of a manufacturing process in engineering. Regression has been central to the field of econometrics; Okun’s law, for instance, uses a linear regression model to represent the relationship between unemployment rate and GDP growth. In the marketing industry, it can be used to assess marketing effectiveness or the pricing of a product. In the insurance domain, regression has been applied effectively to predict claims from demographics. Linear regression yields readily interpretable models, which can be used to estimate trends and forecast sales.
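To make this concrete, here is a minimal sketch of fitting a simple linear regression with scikit-learn to forecast product demand; the spend and sales figures are invented purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: monthly marketing spend (in $1000s) vs. units sold.
spend = np.array([[10], [15], [20], [25], [30], [35]])
units_sold = np.array([120, 150, 205, 240, 290, 330])

# Fit an ordinary least squares model: units_sold ~ intercept + slope * spend.
model = LinearRegression().fit(spend, units_sold)
print("intercept:", model.intercept_, "slope:", model.coef_[0])

# Forecast demand for a planned spend of $40k.
print("forecast:", model.predict(np.array([[40]]))[0])
```

The fitted slope and intercept are directly interpretable—for example, as the incremental units sold per extra $1,000 of spend—which is exactly the quality that makes linear regression attractive for trend estimation and sales forecasting.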
 
 
Classification
All industries suffer from voluntary churn. It is important for a telecom company to identify dissatisfied customers early to reduce revenue loss. In machine learning, the problem of predicting customer churn can be viewed as a classification problem with two classes: high-risk customers and low-risk customers. Classification is an example of pattern recognition: the details of a customer form the input data, and a machine learning algorithm known as a classifier maps this input to a category. A similar problem is consumer loan default prediction, which helps reduce the risk of insolvency. Another example is spam filtering, where emails have to be categorized as spam or non-spam. Some of the popular classification algorithms are:
  1. Logistic Regression
  2. Support Vector Machines
  3. Linear Discriminant Analysis (LDA)
  4. Decision Trees
  5. Perceptrons
The choice of algorithm is made based on the complexity of the problem. Logistic regression has been widely used in medicine and the health industry; many medical scales (TRISS, MPI, PSS, etc.) have been developed and analyzed using logistic regression. Perceptrons and SVMs find applications in OCR, cutting-tool state classification during metal turning, detection of faults in a process plant, credit-card fraud detection, prediction of delayed flights, etc. Artificial Neural Networks are suitable for solving complex problems that cannot be addressed directly by rule-based programming models; they are extensively employed in robotics (e.g., for object recognition) and Computer Numerical Control systems wherever handwritten rules are impractical.
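As a concrete illustration of the churn example above, here is a minimal sketch of a two-class classifier built with logistic regression in scikit-learn; the feature names and the tiny synthetic dataset are hypothetical, chosen only to show the workflow.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical features per customer: [monthly_bill, support_calls, tenure_months]
X = np.array([
    [80, 5, 3], [20, 0, 48], [95, 7, 6], [30, 1, 36],
    [75, 4, 2], [25, 0, 60], [90, 6, 5], [35, 1, 30],
])
# Labels: 1 = churned (high risk), 0 = stayed (low risk)
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))

# Estimated probability that a new customer is high-risk.
print("churn probability:", clf.predict_proba([[85, 6, 4]])[0][1])
```

In practice the input would be hundreds of behavioral and billing attributes rather than three, but the pattern is the same: the classifier maps a customer's details to a churn-risk category.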
 
 
Clustering
Clustering is the process of partitioning a dataset into distinct groups such that objects in the same group are more similar to each other than to objects in other groups. Consider the following scenarios:
  1. Find suitable locations for a cellular company to place its towers so that users within a cluster receive optimum signal strength from the tower.
  2. Group accident prone areas in a region from road accidents data to investigate various circumstances associated with the occurrence of an accident, or to set up emergency care wards.
To apply clustering, we have to define what it means for two or more objects to be similar. There exist different cluster models and algorithms, each tailored to different applications, to unearth hidden patterns in data. Market researchers can use clustering algorithms to perform market segmentation by identifying the right subgroups of customers to target. Clustering is popular in many fields and has been extensively applied in medicine and biology for imaging, clustering genes and proteins, and sequence analysis; in climate analysis for identifying patterns in the atmosphere and ocean; in information systems for detecting anomalies in a process; in social network analysis for grouping people into communities; in e-commerce for providing accurate recommendations of new items for purchase; and in computer vision for image segmentation.
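For the cell-tower scenario above, a minimal k-means sketch with scikit-learn might look like the following; the user coordinates are synthetic and the number of towers is an assumption made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical (x, y) locations of mobile users in a region,
# drawn from three loose population centres.
rng = np.random.default_rng(0)
users = np.vstack([
    rng.normal(loc=[0, 0], scale=1.0, size=(50, 2)),
    rng.normal(loc=[10, 10], scale=1.0, size=(50, 2)),
    rng.normal(loc=[0, 10], scale=1.0, size=(50, 2)),
])

# Assume we can afford 3 towers; place each at the centroid of a user cluster,
# so users within a cluster are close to "their" tower.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(users)
print("proposed tower locations:\n", kmeans.cluster_centers_)
print("cluster assignment of first 5 users:", kmeans.labels_[:5])
```

Here similarity is simply Euclidean distance between coordinates; other applications would choose a distance measure and cluster model suited to their data.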
 
 
Dimensionality Reduction
Dimensionality reduction refers to the techniques associated with transforming data to remove redundant features from the dataset and retain important features so that:
  1. Complex data can be easily visualized and interpreted.
  2. Performance of the machine learning model can be improved by eliminating redundant data.
  3. Storage space and time required for processing the data can be reduced.
  4. Noise can be removed from data.
Principal Component Analysis (PCA) is an extensively used dimensionality reduction algorithm. One of its most successful applications is in computer vision and image recognition, where it extracts the significant features from an image database that give maximum discrimination between individual images. Independent Component Analysis (ICA) is another popular algorithm, which can be used to separate a mixed signal into its sources: for example, to extract individual speech signals when a group of people are talking in a noisy room, to decompose EEG recorded by multiple sensors, or to separate interfering radio signals arriving at a mobile phone.
 
The sensor signals from the embedded accelerometer and gyroscope in a Samsung Galaxy S II can be used to predict whether the user is walking, sitting, standing or lying down. While preprocessing these signals, many new variables are generated. Dimensionality reduction helps get rid of the unwanted variables or dimensions, which ultimately improves the accuracy of the final prediction model.
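A minimal PCA sketch with scikit-learn, applied to synthetic sensor-like features, might look like this; the data and the choice of ten components are illustrative assumptions rather than values from the study above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical preprocessed sensor features: 200 time windows x 50 derived variables.
rng = np.random.default_rng(0)
signals = rng.normal(size=(200, 50))

# Project the 50 variables onto the 10 directions of greatest variance.
pca = PCA(n_components=10)
reduced = pca.fit_transform(signals)

print("reduced shape:", reduced.shape)  # (200, 10)
print("variance explained by 10 components:",
      pca.explained_variance_ratio_.sum())
```

The reduced features can then be fed to any of the classifiers discussed earlier, typically with faster training and, when the discarded dimensions were mostly noise or redundancy, better accuracy.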
 
 
Rule based Learning
Rule based learning is the acquisition of structured knowledge in the form of rules. Rules are human readable and widely used in expert systems. A sample rule deduced from a customer transaction database would look like {milk, bread} → {butter}, which can be interpreted as “those who buy milk and bread together will also buy butter”. Learning algorithms typically use a statistical evaluation function to pick the rules. One of the principal areas of application is health information technology, where they can be used to create Clinical Decision Rules that help increase the accuracy of clinicians’ diagnostic and prognostic assessments. In Clinical Decision Support Systems, rule learning eliminates the need to write rules by hand or to rely on expert input. Rule learning can aid situations where producing rules manually is too labor intensive. Other real-world applications include preventive maintenance of electromechanical devices in chemical plants; creation of rules for setting the parameters of a system controlling the separation of crude oil from natural gas in an oil refinery; and deciding, in a customer support system, which technician to assign when a customer reports a problem.
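To illustrate how a rule such as {milk, bread} → {butter} is scored, here is a minimal sketch that computes the standard support and confidence measures over a toy transaction list; the transactions are invented, and a real system would use an algorithm such as Apriori or FP-Growth over a far larger database.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"milk", "bread", "butter", "eggs"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent)."""
    return support(antecedent | consequent) / support(antecedent)

lhs, rhs = {"milk", "bread"}, {"butter"}
print("support:", support(lhs | rhs))        # 2 of 5 baskets -> 0.4
print("confidence:", confidence(lhs, rhs))   # 2 of 3 milk+bread baskets -> ~0.67

# Enumerate single-item rules whose confidence clears a threshold.
items = set().union(*transactions)
for a, b in combinations(items, 2):
    for ante, cons in (({a}, {b}), ({b}, {a})):
        if support(ante) > 0 and confidence(ante, cons) >= 0.8:
            print(ante, "->", cons, f"(conf={confidence(ante, cons):.2f})")
```

Support and confidence play the role of the statistical evaluation function mentioned above: rules that are both frequent and reliable are kept, and the rest are discarded.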
 
 
Machine Learning in Practice: Predictive Maintenance
Anything that can go wrong, will go wrong.
-Murphy’s law
Predictive maintenance focuses on the detection, diagnosis, and anticipation of machine failure so that maintenance can be performed in advance. A commercial flight generates about 20 TB of sensor data per engine per hour of operation. This data can be put to use to provide alerts and to enable detection or prediction of engine failure. The rise of the IoT and big data analytics gives us a plethora of ways to reduce the cost of operating complex machines and to improve system availability and responsiveness by minimizing the risk of unexpected failures, thereby increasing the productivity of assets. Predictive maintenance techniques have also been used for root cause analysis of failures and to schedule repairs before a breakdown occurs. State-of-the-art systems employ a machine learning model at their core. In the naive preventive maintenance approach, machines are inspected periodically at fixed time intervals to perform repairs. In predictive maintenance, apart from failure statistics and test results, we draw additional data from sensors and devices, including environmental conditions, to improve the predictive accuracy of the model. Such an intelligent model can compute metrics such as time between failures and time to repair, predict whether a component will fail within a certain time frame, or assess the probability of a total system failure when multiple components fail. The model is trained using an online learning algorithm, which can reoptimize the model on the fly and adapt dynamically to new patterns whenever a failure goes undetected. This ensures that performance does not degrade, and may even improve, over time.
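A minimal sketch of the online-learning idea, using scikit-learn's SGDClassifier and its partial_fit method to update a linear failure predictor as new sensor readings stream in; the feature layout, the failure rule, and the data are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Hypothetical sensor features per reading: [temperature, vibration, pressure]
# Labels: 1 = component failed within the next maintenance window, 0 = healthy.
rng = np.random.default_rng(0)

def next_batch(size=32):
    """Simulate a streaming batch of labelled sensor readings."""
    X = rng.normal(loc=[70.0, 0.5, 30.0], scale=[10.0, 0.2, 5.0], size=(size, 3))
    # Invented ground truth: hot, highly vibrating components tend to fail.
    y = ((X[:, 0] > 75) & (X[:, 1] > 0.55)).astype(int)
    return X, y

# A linear classifier trained by stochastic gradient descent supports
# incremental updates via partial_fit, so no full retraining is needed.
model = SGDClassifier(random_state=0)
classes = np.array([0, 1])

for step in range(100):            # each step represents new data from the field
    X, y = next_batch()
    model.partial_fit(X, y, classes=classes)

X_val, y_val = next_batch(200)
print("held-out accuracy:", model.score(X_val, y_val))
```

Each call to partial_fit nudges the model toward the latest observations, which is what lets a deployed predictor absorb a missed failure and adjust its behavior without being taken offline.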