-
Action
- How should engineers change the manufacturing process to generate
higher product yield? How should an insurance company choose
which policies to offer to whom and at what price? The output of
computation should enable actions that align with the goals of the data
product. Results that do not support or inspire action are nothing but
interesting trivia.
Given the fractal nature of Data Science analytics in time and
construction, there are many opportunities to choose fantastic or
shoddy analytic building blocks. The Analytic Selection Process offers
some guidance.
-
Computation
- Computation aligns the data to goals through the process of creating
insights. Through divide and conquer, computation decomposes
into several smaller analytic capabilities with their own goals, data,
computation and resulting actions, just like a smaller piece of broccoli
maintains the structure of the original stalk. In this way, computation
itself is fractal. Capability building blocks, each of which accomplishes
a small task, may use different execution models, such as batch or
streaming computation. When properly combined, the small tasks
produce complex, actionable results.
-
Data
- Data dictates the potential insights that analytics can provide. Data
Science is about finding patterns in variable data and comparing those
patterns. If the data is not representative of the universe of events you
wish to analyze, you will want to collect representative data by carefully
planning variations in events or processes, using A/B testing or
design of experiments. Datasets are never perfect, so don’t wait for
perfect data to get started. A good Data Scientist is adept at handling
messy data with missing or erroneous values. Just make sure to spend
the time upfront to clean the data or risk generating garbage results.
-
Goal
- You must first have some idea of your analytic goal and the end state
of the analysis. Is it to Discover, Describe, Predict, or Advise? It is
probably a combination of several of those. Before you start, be sure
to define the business value of the data and how you plan to use the
insights to drive decisions; otherwise you risk ending up with
interesting but non-actionable trivia.
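
The four components above can be sketched as one small pipeline. This is only an illustration under stated assumptions: the manufacturing records, the setting labels "A" and "B", and the function names `clean`, `mean_yield_by_setting`, and `recommend` are all hypothetical, and a real analytic would be decomposed into many more building blocks than this.

```python
from statistics import mean

# Hypothetical manufacturing runs: (process_setting, yield_fraction).
# None marks a missing value; 1.70 is an erroneous reading (over 100% yield).
RAW_RUNS = [
    ("A", 0.91), ("A", 0.89), ("A", None),
    ("B", 0.94), ("B", 0.96), ("B", 1.70),
]


def clean(runs):
    """Data: drop records whose yield is missing or outside [0, 1]."""
    return [(s, y) for s, y in runs if y is not None and 0.0 <= y <= 1.0]


def mean_yield_by_setting(runs):
    """Computation: one small building block that summarizes yield per setting."""
    grouped = {}
    for setting, y in runs:
        grouped.setdefault(setting, []).append(y)
    return {setting: mean(values) for setting, values in grouped.items()}


def recommend(summary):
    """Action: turn the insight into a decision someone can act on."""
    best = max(summary, key=summary.get)
    return f"Run the process with setting {best}"


# Goal: advise which process setting gives the higher product yield.
summary = mean_yield_by_setting(clean(RAW_RUNS))
recommendation = recommend(summary)
print(recommendation)
```

Skipping the `clean` step here would let the erroneous 1.70 reading inflate the average for setting B, which is the "garbage results" risk the Data component warns about.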