Category Archives: Data Science

Azure Data Factory

By: Vineeta Tawney
Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows in the cloud for orchestrating and automating data movement and data transformation. Azure Data Factory does not store any data itself.

Data Factory is an enabler for almost any cloud project. Nearly every cloud project needs to move data across networks (on-premises and cloud) and across services (for example, between different Azure storage services).

Data Factory is particularly important for organizations that are taking their first steps in the cloud and therefore need to connect on-premises data with the cloud. For this, Azure Data Factory provides the Integration Runtime, a gateway service that can be installed on-premises and is designed to provide performant, secure transfer of data to and from the cloud.


Components of Data Science

By: Vineeta Tawney



Data science is a multidisciplinary field that applies scientific techniques, methods, algorithms, and practices to extract knowledge and insight from data in different forms, both structured and unstructured, similar to data mining.

Data science is a “theory to unify statistics, data analysis, machine learning, and their related methods” in order to “explain and analyze actual phenomena” with data. It uses techniques and theories drawn from many fields within mathematics, statistics, information science, and computer science.

Components of Data Science:


  1. Data analysis:

Data analysis, the term for analyzing data, is essentially a questioning activity.

It breaks the macro picture of the data down into a micro picture and deals with the complications introduced by humans.



Levels of Big Data


The concept of Big Data is continuously being developed and reconsidered, as it remains the driving force behind many ongoing waves of digital transformation, including AI, data science and IoT.

Data that is unstructured, time-sensitive or simply very large cannot be processed efficiently by relational database engines. This type of data requires a different handling approach, called big data, which uses massive parallelism on readily available hardware.

Big data approaches are used when established data mining and data handling methods cannot uncover the insights and meaning in the underlying data.

Big Data Analytics and Healthcare

By: Vineeta Tawney

What Is Big Data in Healthcare?

Nowadays we live longer, primarily because treatment methods have changed, and many of these changes are tracked by data. Doctors try to understand as much as they can about a patient, as early as possible, so they can pick up warning signs of serious diseases as they arise; treating a disease at an early stage is far simpler and cheaper. In healthcare data analytics, prevention is better than cure, and drawing a complete picture of a patient lets insurers offer a personalized package.

Data Mining – An Introduction

By: Vineeta Tawney

Data mining can be described as the process of improving decision-making by identifying useful patterns and insights in data. It is particularly useful for revealing hidden patterns during analysis, for example, understanding how many people will be affected by specific changes. It involves examining large volumes of data from varying viewpoints and summarising the data so that useful patterns and connections can be established. It may involve the use of dashboards and reports that facilitate visual communication of results. The main challenge with data mining usually lies in securing the right type, volume and quality of data needed to draw insights.

Three variants of data mining outcomes can be distinguished:

Descriptive: This involves the use of clustering to display patterns within a set of data, for example, similarities between suppliers can be displayed visually.

Diagnostic: With this approach, techniques such as decision trees and segmentation can be employed to show why a pattern or relationship exists within the data set. An example here is identifying the attributes of the most successful suppliers within a region.

Predictive: This approach involves the use of techniques such as regression to show the probability of an event occurring in the future.
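As a rough illustration of the descriptive variant above, the following sketch groups suppliers with a plain k-means loop. The supplier figures, the two features (unit price and delivery days) and the choice of starting centroids are all hypothetical, and k-means is only one of several clustering techniques that could be used here.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

# Hypothetical suppliers described by (unit price, delivery days).
suppliers = [(10.0, 2.0), (11.0, 3.0), (9.5, 2.5), (30.0, 10.0), (32.0, 12.0), (29.0, 11.0)]
labels, centers = kmeans(suppliers, centroids=[suppliers[0], suppliers[3]])
print(labels)   # cluster index per supplier
print(centers)  # one centroid per cluster
```

Suppliers that share a label are similar on the chosen features; plotting the points coloured by label would give the kind of visual display mentioned above.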

If you are an analyst charged with a data mining exercise, ensure the following steps are followed at the minimum:

Define the goal and extent of the data mining exercise. What questions are to be answered?

Prepare the data set to be used as the basis for analysis. Is the data sufficient and accurate?

Analyse the data using a variety of statistical measures and visualisation tools so that observations can be made around how data values are distributed and missing data identified. Examples of data mining techniques that can be employed include linear regression, decision tree analysis, predictive scorecards, etc.
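The preparation and analysis steps above can be sketched in a few lines. This is a minimal example on made-up records (order volume against delivery days, both hypothetical); it flags rows with missing values and then fits a simple linear regression by ordinary least squares, one of the techniques named above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical records: (order volume, delivery days); None marks a missing value.
records = [(1, 3.0), (2, 5.0), (3, None), (4, 9.0), (5, 11.0)]

# Identify missing data before analysing.
missing = [row for row in records if None in row]
complete = [row for row in records if None not in row]

# Fit the regression on complete rows only.
slope, intercept = fit_line([x for x, _ in complete], [y for _, y in complete])
print(len(missing), slope, intercept)
```

Dropping incomplete rows is the simplest possible treatment of missing data; whether it is appropriate depends on the goal defined in the first step.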

Intelligent Automation – The future of Robotic Process Automation

By: Vineeta Tawney

Robotic Process Automation (RPA) is considered a revolution in the field of business automation. It is the use of software robots (bots) with Artificial Intelligence (AI) and Machine Learning to carry out high-volume tasks that were previously carried out by humans. RPA is used to streamline business operations, automate standard business practices, and reduce costs.

RPA bots fall into three broad categories:

Probots: Probots are bots that follow repeatable rules to process data.
Knowbots: Knowbots gather and store user-specified information.
Chatbots: Chatbots are virtual agents that can respond to customer queries through chat in real time.

Artificial Intelligence (AI) and Machine Learning

Artificial Intelligence, also termed machine intelligence, refers to machines that think and work like humans. AI performs complex tasks such as problem-solving and speech recognition. Machine Learning is a method of data analysis that automates the building of analytical models; it is like teaching a computer how to make accurate predictions from the data it is fed. Advances in machine learning (ML) and artificial intelligence have paved the way for Intelligent Automation (IA).

Capsule Networks


By: Vineeta Tawney

What are Capsule Networks?
A Capsule Network, also known as a Capsule Neural Network and commonly abbreviated CapsNet, is a machine learning system designed to better model hierarchical relationships.
Definition of Capsule Networks
In simpler words, a CapsNet is composed of many capsules. Every capsule is a group of neurons that learns to identify an object (e.g., a square) in a given region of the image.
Each capsule outputs a vector (e.g., an 8-dimensional vector) whose length represents the estimated probability that the object is present, and whose orientation (e.g., in 8D space) encodes the object’s pose parameters (e.g., precise position, rotation, etc.). If the object changes a little (e.g., it is shifted, rotated, or resized), the capsule outputs a vector of the same length but with a slightly different orientation.
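To make the "length as probability, orientation as pose" idea concrete, here is a minimal sketch of a squashing nonlinearity. It assumes the squash function from the original CapsNet paper, v = (|s|² / (1 + |s|²)) · s / |s|; the input vectors are made up for illustration.

```python
import math

def squash(s):
    """Scale vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in s]

def length(v):
    """Euclidean length, read here as the estimated presence probability."""
    return math.sqrt(sum(x * x for x in v))

strong = squash([3.0, 4.0])   # large raw activation: length close to 1
weak = squash([0.1, 0.0])     # small raw activation: length close to 0
print(length(strong), length(weak))
```

Because only the length is rescaled, the direction of the vector, which carries the pose information, is left untouched.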
A CapsNet is arranged in multiple layers, very much like a regular neural network. The lowest-layer capsules are called primary capsules: each of them receives a small region of the image as input (called its receptive field) and tries to detect the presence and pose of a specific pattern, for example, a square. The higher-layer capsules, called routing capsules, identify larger and more complex objects, such as boats.
What do Capsule Networks do:
The purpose behind Capsule Networks is to perform computer vision as inverse graphics. In graphics, an object is represented as a tree of parts. A specific transformation, such as a rotation, describes the conversion from the viewpoint of a part to the viewpoint of its parent.
CapsNets are inspired by these tree-like representations and try to learn the transformations relating the parts of an object to the whole. Capsules can be viewed as parts or objects whose parent parts or objects are also capsules.
Capsule Networks Deep Learning
Deep Learning is a subfield of artificial intelligence (AI). In simple words, Deep Learning is a way to automate Predictive Analytics. Whereas many traditional machine learning algorithms are linear, Deep Learning algorithms are stacked in a hierarchy of increasing complexity and abstraction.

In simple terms, a CapsNet is composed of capsules, and a capsule is a group of artificial neurons that learns to detect a specific object in a given region of the image. It produces a vector whose length represents the estimated probability of the object’s presence and whose orientation encodes the object’s position, size, and rotation. If the object is transformed (for example, translated, rotated, or resized), the capsule produces a vector of the same length but with a slightly different orientation.
Capsule Networks: Deeper Analysis
A CapsNet is organized in multiple layers. The lowest layer is composed of primary capsules, each of which receives a small portion of the input image and detects the presence and pose of a pattern, such as a square.
The higher-layer capsules, more commonly known as routing capsules, are capable of detecting larger and more complex objects. Capsules communicate mostly through an iterative “routing-by-agreement” mechanism: a lower-level capsule prefers to send its output to higher-level capsules whose activity vectors have a large scalar product with the prediction coming from the lower-level capsule.
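The routing-by-agreement loop described above can be sketched in plain Python. This is an illustrative toy under stated assumptions, not a full CapsNet: the prediction vectors in `u_hat` are hypothetical inputs (in a real network they come from learned transformation matrices), and the update rule follows the dynamic routing procedure as commonly described, with softmax coupling coefficients and the squash nonlinearity.

```python
import math

def squash(s):
    """Shrink s to a length in [0, 1) without changing its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def route(u_hat, iterations=3):
    """u_hat[i][j] is lower capsule i's prediction vector for higher capsule j."""
    n_lower, n_higher, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_higher for _ in range(n_lower)]  # routing logits, start neutral
    for _ in range(iterations):
        c = [softmax(row) for row in b]  # coupling coefficients per lower capsule
        v = []
        for j in range(n_higher):
            # Weighted sum of the predictions for higher capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_lower)) for d in range(dim)]
            v.append(squash(s_j))
        # Agreement: a large scalar product raises the logit for that route.
        for i in range(n_lower):
            for j in range(n_higher):
                b[i][j] += dot(u_hat[i][j], v[j])
    return c, v

# Three lower capsules agree on higher capsule 0 and disagree on higher capsule 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
coupling, outputs = route(u_hat)
print(coupling)
print(outputs)
```

In this toy input every lower capsule ends up routing most of its output to higher capsule 0, where the predictions agree, and that capsule's output vector is correspondingly longer.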
“Lower level capsule can send its input to the higher-level capsule that ‘agrees’ with its input. This is the essence of the dynamic routing algorithm.” Most professionals working with the Capsule Networks paper believe CapsNets to be an improvement on convolutional neural networks (CNNs).
CapsNets attempt to solve issues caused by max pooling in deep neural networks, such as the loss of information about the order and orientation of features. For example, a CNN used for face recognition will extract certain facial features from the image, such as eyes, eyebrows, a mouth, and a nose. The higher-level layers (the ones deeper within the network) then merge those features and check whether all of them were found within the image, regardless of order.

The mouth and nose may have switched places and the eyes may be sideways in the picture, but the CNN can still combine the features and classify the image as a face. This problem worsens as the network gets deeper, because the features become more and more abstract and also shrink in size through pooling and filtering. The idea behind CapsNets is that the low-level features must also be arranged in a certain order for the object to be classified as a face.
For example, a CapsNet would learn that the nose must be between the two eyes and the mouth must be below it. Images with these features in that specific order can then be classified as a face; everything else will be rejected.
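The loss of arrangement information can be shown with a toy example. The 3x3 feature maps below are hypothetical detector outputs, and `plausible_face` is a hand-written stand-in for the spatial relationships a CapsNet would have to learn; it is not how capsules are actually implemented.

```python
def global_max_pool(feature_map):
    """Keep only the strongest activation of a 2-D feature map; its position is discarded."""
    return max(max(row) for row in feature_map)

def strongest_row(feature_map):
    """Row index of the strongest activation (the positional cue that pooling drops)."""
    return max(range(len(feature_map)), key=lambda r: max(feature_map[r]))

def plausible_face(maps):
    """Eyes above nose above mouth, the ordering described in the text."""
    return strongest_row(maps["eyes"]) < strongest_row(maps["nose"]) < strongest_row(maps["mouth"])

# Hypothetical detector outputs: 1 where the feature fires.
face = {
    "eyes":  [[1, 0, 1], [0, 0, 0], [0, 0, 0]],
    "nose":  [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    "mouth": [[0, 0, 0], [0, 0, 0], [0, 1, 0]],
}
scrambled = {
    "eyes":  [[0, 0, 0], [0, 0, 0], [1, 0, 1]],
    "nose":  [[0, 1, 0], [0, 0, 0], [0, 0, 0]],
    "mouth": [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
}

pooled_face = {name: global_max_pool(m) for name, m in face.items()}
pooled_scrambled = {name: global_max_pool(m) for name, m in scrambled.items()}
print(pooled_face == pooled_scrambled)                   # pooling cannot tell them apart
print(plausible_face(face), plausible_face(scrambled))   # position information can
```

After pooling, both images look identical to later layers, whereas a check that still has access to feature positions separates them.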
The publication of “Dynamic Routing Between Capsules” has led many researchers to work intensively on refining algorithms and implementations, and advances have been published at a rapid pace.
Advantages of using CapsNets for Deep Learning:
1. Good preliminary results.
2. Requires less training data.
3. Works well with overlapping objects.
4. Potentially good in crowded scenes.
5. Can detect partially visible objects.
6. Results are interpretable; the component hierarchy can be mapped.
7. Equivariance (the classifier adapts to small changes in input).
Disadvantages of using CapsNets
1. Accuracy on large images is not yet known.
2. Slow training time (so far).
3. Nonlinear squashing may not accurately reflect a probability.
Future of Capsule Networks
Capsule Networks have introduced a new building block that can be used in Deep Learning to better model hierarchical relationships within a neural network's internal knowledge representation.
To learn more about Capsule Networks in deep learning, refer to critical essays on CapsNet models, to the Capsule Networks paper for expert discussion, and to papers that discuss Hinton’s Capsule Networks.