2. Self-Service Analytics Technologies Today – What you need to know
What is the problem with traditional ETL/Data Warehousing? [i]
Data warehouses are inflexible. Once the data model
has been defined and the data has been loaded into
the warehouse, the paths for analyzing the data get
frozen into place, limiting the number of potential
insights that can be derived from it. Analysts find data
warehouses difficult to access, and requests for data
are time-consuming.
Data warehouses emphasize reporting over ad hoc
exploration. Data warehouses are architected to
support scheduled reports or real-time dashboards
created by the IT team. The rigidly structured ways in
which they store data lend themselves to reports and
dashboards that track pre-defined KPIs, but aren’t as
well suited to exploratory analysis. For instance, the
particular data attributes in which a business analyst
is interested in order to answer a one-time inquiry
from the CEO may not have been considered when the
data was transformed into a normalized format and
loaded into the warehouse.
Only IT can prepare this data. ETL tools aren’t
designed for business users, not even business analysts.
Moreover, since one of the points of the data
warehouse is to centralize and normalize the
organization’s data in standardized formats, it makes
organizational sense to task a single unit with
preparing it.
- Too many autonomous, one-off projects
- Hard to build and maintain, need IT to help resolve
- Many compliance requirements
- Takes too long to get, clean and organize the data
- Lacks operational repeatability
- Lack of TRUST in the data
What is Self-Service Data Prep?
Data Preparation is the iterative, agile process of exploring, collecting, and manipulating data
into a form suitable for analysis (reporting or processing) by cleaning and often combining or
consolidating data into one file or data table. Data preparation includes transforming raw data
into curated datasets for operational processes, data science, data visualization and
BI/analytics, and is most often used when business analysts are challenged with:
- Limited access to data sources, dependency on IT for access to datasets
- Trying to combine data from multiple sources
- Manual data entry into spreadsheets, reporting on error-prone data
- Dealing with data that was pulled from an unstructured source, such as PDF documents, enterprise application reports or web pages
Data Prep can be broken down into three simple and fundamental steps:
1. Data Acquisition: Identifying and obtaining access to the data within your sources
2. Data Cleansing: Manipulating and preparing data into a usable, functional format and correcting or removing any bad data
3. Data Blending: Combining and enriching data with other datasets for detailed analysis or process improvements
A self-service data prep tool enables non-IT users to repeat this three-step process as often as
necessary, adding or removing data sources as needed. In addition, users can extract and blend a
variety of data from disparate data sources they typically wouldn’t have access to. Data preparation
is critical to both operational process efficiency and self-service analytics.
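As a rough illustration of acquisition, cleansing and blending, the following minimal Python sketch walks through all three steps using only the standard library. The two CSV sources, their column names, and the function names (`acquire`, `cleanse`, `blend`, `prepare`) are hypothetical and chosen purely for illustration; a real self-service tool would hide these steps behind a visual interface.

```python
import csv
import io

# Hypothetical raw inputs standing in for two separate data sources.
ORDERS_CSV = """order_id,customer_id,amount
1001, C1 ,250.00
1002,C2,not available
1003,C3,75.50
1003,C3,75.50
"""

CUSTOMERS_CSV = """customer_id,region
C1,East
C2,West
"""


def acquire(text):
    """Step 1, Data Acquisition: read rows from a source."""
    return list(csv.DictReader(io.StringIO(text)))


def cleanse(rows):
    """Step 2, Data Cleansing: trim whitespace, drop bad amounts and duplicates."""
    seen = set()
    clean = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        try:
            row["amount"] = float(row["amount"])
        except ValueError:
            continue  # remove rows whose amount is not numeric
        if row["order_id"] in seen:
            continue  # remove duplicate orders
        seen.add(row["order_id"])
        clean.append(row)
    return clean


def blend(orders, customers):
    """Step 3, Data Blending: enrich each order with its customer's region."""
    region_by_customer = {
        c["customer_id"].strip(): c["region"].strip() for c in customers
    }
    return [
        {**order, "region": region_by_customer.get(order["customer_id"], "unknown")}
        for order in orders
    ]


def prepare():
    """Run the three steps in order; rerun whenever a source is added or changed."""
    orders = cleanse(acquire(ORDERS_CSV))
    customers = acquire(CUSTOMERS_CSV)
    return blend(orders, customers)


prepared = prepare()
```

Because each step is a separate function, the whole process can be rerun when a source changes, which is the repeatability the text describes. Orders whose customer is missing from the second source are kept and labeled "unknown" rather than dropped; that is a design choice of this sketch, not a rule of data prep.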