A flexible decision-support architecture provides insight.
by Debbie Smith
Data is the lifeblood of an enterprise. Investigating and analyzing data provides a
corporation with insight on where it has been and how the business got to where it is,
while studying data trends provides an understanding of where the organization is going and
what might be needed for the next step or to change direction.
Because an organization's information is locked in its data, users are passionate
about its availability and accessibility, and they may go to creative lengths to get what
they need through data marts, operational data stores (ODSs) or sandboxes. All may have
a role to play, but these analytic methods have the potential to become time-consuming
data management challenges if not kept under control. IT must stay ahead of the users by
providing a methodology and infrastructure that will ensure the data is acceptable,
accessible, flexible and timely.
The best approach to establishing a reliable decision-support infrastructure is through an
enterprise data warehouse (EDW). This may not always be feasible if, for instance,
the data has not yet been integrated into the EDW. Creating a data mart, ODS or
sandbox should be based on business reasons, such as:
- Internal proof of concept (POC) for potential applications
- Testing and exploring new business ideas in a sandbox environment
- Storing data of minimal value that does not merit its integration into the data warehouse
- Temporary placeholder for data pending an application implementation
- Specialized short-term data requests, such as an analytical point application
- Data issues, data profiling investigations
The figure shown illustrates a best practices architecture for a high-value decisioning
environment. As a centralized repository, an EDW provides business users direct access to
the integrated data they need, when they need it.
In situations where a data mart might be necessary—when testing a new application, for instance,
or validating that data gained through a merger is compatible with the current
environment—a data mart can be implemented virtually, inside the EDW with combinations
of logical views and physical structures for specific use, performance or workload
requirements. Users will have access to the organization's data without physically
duplicating it into a separate data mart (or platform).
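As a minimal sketch of the virtual data mart idea, the example below defines a view over a warehouse table instead of copying rows into a separate store. It rests on assumptions that are not in the article: an in-memory SQLite database stands in for the EDW, and the `sales` table, the `east_sales_mart` view and the `east_mart_rows` helper are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the EDW (an assumption for illustration only).
# All table, column and view names below are hypothetical.
edw = sqlite3.connect(":memory:")
edw.executescript("""
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY,
        region  TEXT NOT NULL,
        product TEXT NOT NULL,
        amount  REAL NOT NULL
    );
    INSERT INTO sales (region, product, amount) VALUES
        ('East', 'widget', 120.0),
        ('East', 'gadget',  80.0),
        ('West', 'widget', 200.0);

    -- The "virtual data mart": a logical view over warehouse data, no copy made.
    CREATE VIEW east_sales_mart AS
        SELECT product, SUM(amount) AS total_amount
        FROM sales
        WHERE region = 'East'
        GROUP BY product;
""")


def east_mart_rows(conn):
    """Return the mart's rows, ordered by product, as a list of tuples."""
    return conn.execute(
        "SELECT product, total_amount FROM east_sales_mart ORDER BY product"
    ).fetchall()


print(east_mart_rows(edw))

# A new warehouse row is visible through the view at once, with no reload step.
edw.execute(
    "INSERT INTO sales (region, product, amount) VALUES ('East', 'widget', 30.0)"
)
print(east_mart_rows(edw))
```

Because the view is evaluated against the warehouse table at query time, the second call reflects the new row without any data movement, which is the property that separates a virtual data mart from a physically duplicated one.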
When business needs are not fulfilled with a virtual data mart, one of two types of external
data marts might be needed. The first and most favored choice is a dependent data mart.
Since this is sourced from the EDW, data issues are minimized. The second type, the
independent data mart, is typically only used when needed data is not in the EDW.
Issues with going outside the EDW
Data marts are seldom the answer for long-term success. Like ODSs and sandboxes, data
marts that are external to an EDW can encounter challenges when capturing and storing data.
The spiral factor
Consider this scenario from an IT perspective: Business users request a method that will
enable them to readily interrogate their data. After a few discussions with the users in
which various alternatives are suggested, IT realizes that the data is not available in a
useful format. To quickly satisfy the users' needs, IT pulls the data together and makes
it accessible to them in a data mart.
Here is the concern: Creating a data mart establishes a precedent and a process.
Business users will soon want additional data elements, or similar data with slightly
different data elements or at different aggregate levels—or even another data mart.
The volume of data marts soon spirals out of control. What started out as a simple
request results in drastic and time-consuming consequences. Management challenges become
uncontrollable. Data becomes inconsistent as it moves across the data marts, resulting
in queries and generated reports being unaligned. Concerns arise about the data's validity,
and IT must devote extra time to either confirm the data is accurate or research and explain
the discrepancies.
When researching the data validity issues in this scenario, some in IT suggest a
hub-and-spoke architecture as a feasible solution. This type of architecture consists of
a centralized data store (the hub) with the data (spokes) radiating out to the business
users in the form of a data mart.
But this architecture brings new challenges. Updating the data in the hub, packaging
it and sending it down the spokes takes considerable time. Because of IT's processing
requirements, the data is not updated or accessible to the users during critical periods.
In addition, once the IT networking group finds that the data movement negatively affects
LAN bandwidth, it limits when data is sent down the spokes. So the window to move the data
narrows and users have even less access to updated data.
Data mart costs
Meanwhile, the growing number of data marts is affecting the organization's bottom line.
Taking into account prices for tools and licensing fees, reduced number of processes,
data reusability, maintenance and infrastructure support, the total cost of ownership
(TCO) for an EDW is equivalent to or less than the estimated cost of five data marts.
Add in the costs of time-to-market for new applications, sourcing the data and
developing the processes, and the savings are even greater for each additional data mart.
More critical is the damage to business users' confidence when the data they need is
unavailable or inaccessible, or when its accuracy and validity are questioned. These
factors support the EDW's best practice appeal.
Benefits of going inside the EDW
These are just some of the reasons that a centralized EDW is the optimal solution.
By working toward integrating all of the enterprise's data into a centralized data
warehouse and allowing users direct access to that platform, issues of timeliness
and propagation can be eliminated.
In addition, an EDW's value and use, as well as its benefit to the organization,
will increase exponentially as data elements are integrated into it. The more data
subjects in an EDW, the greater the number of logical combinations among the data
elements. Queries limited to one data subject produce fewer and less intricate results
than queries with a greater range of data subjects.
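One hedged way to make the growth claim concrete is to count how many distinct groups of two or more data subjects a query could draw on. The sketch below assumes every such group is meaningful, which makes it an upper bound rather than a measurement, and the function name `subject_combinations` is hypothetical.

```python
from math import comb


def subject_combinations(n_subjects: int) -> int:
    """Count the ways to combine two or more of n data subjects in one query.

    Assumes every grouping of subjects is meaningful, so this is an upper bound.
    """
    return sum(comb(n_subjects, k) for k in range(2, n_subjects + 1))


for n in (2, 3, 5, 10):
    print(n, subject_combinations(n))
```

Under that assumption the count equals 2^n - n - 1, so each added subject roughly doubles the number of possible cross-subject groupings.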
As a simple example, if we create an Orders data mart and a separate Inventory
data mart, the questions we could ask would be specific to each data subject, such as:
- What product orders were placed?
- What products are on back order?
- What is the quantity of the products in inventory?
- What are their expiration dates?
If we integrate these data marts, we could ask additional questions that reach beyond
these narrow subject areas and expand to the subject areas that intersect and connect them:
Combined Orders and Inventory
- Which orders can be filled with existing inventory?
- How many days of inventory are required for each location?
- How will a large order affect current inventory levels?
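The first combined question above can be sketched as a single join once both subjects live in the same database. This is an illustration under stated assumptions: SQLite stands in for the integrated warehouse, the `orders` and `inventory` tables and their columns are hypothetical, and each order is checked against on-hand stock individually rather than netted against competing orders.

```python
import sqlite3

# In-memory SQLite stands in for the integrated warehouse (assumption).
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product TEXT, quantity INTEGER);
    CREATE TABLE inventory (product TEXT PRIMARY KEY, on_hand INTEGER);
    INSERT INTO orders VALUES (1, 'widget', 5), (2, 'gadget', 12), (3, 'widget', 40);
    INSERT INTO inventory VALUES ('widget', 20), ('gadget', 10);
""")


def fillable_orders(conn):
    """Return IDs of orders whose quantity is covered by on-hand inventory.

    Each order is checked on its own; competing orders for the same
    product are not netted against each other.
    """
    rows = conn.execute("""
        SELECT o.order_id
        FROM orders AS o
        JOIN inventory AS i ON i.product = o.product
        WHERE o.quantity <= i.on_hand
        ORDER BY o.order_id
    """).fetchall()
    return [order_id for (order_id,) in rows]


print(fillable_orders(conn))
```

With the two subjects held in separate data marts, the same question would require extracting and reconciling data from both before any comparison could be made.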
As the data warehouse environment grows and expands, the time it takes to develop
and deliver new applications will be dramatically shortened. Established processes
and procedures can be leveraged, and the data that the EDW already holds for existing
applications can be reused as new applications are implemented.
Then, as additional data subjects are integrated into the EDW, query possibilities are
increased and application delivery schedules are shortened. All of this leads to savings
in time and money by building the application in the EDW rather than creating a data mart
from the ground up.
Having valid reasons for creating external data marts does not diminish the long-term challenges.
However, the goal is to develop the data mart infrastructure in such a way that maximum
benefit is gained while management challenges are minimized. Organizations will want to
ensure their data mart platforms can leverage existing tools, training and practical
experience. When possible, data should be loaded from the EDW so that it is consistent
with the rest of the organization's data. Finally, when the data is either integrated into the centralized
EDW or simply eliminated, these data mart platforms can be redeployed.
Overall, businesses want an architecture that promotes vast growth and provides
data—the lifeblood of an organization—in an accessible, flexible and timely format.
Why the Teradata platform family for extended decision-support opportunities?
Teradata supports a best practices architecture that can satisfy an organization's end-to-end decision-support needs.
Each platform in the Teradata family is designed to serve different data warehousing and business intelligence
(BI) paradigms that allow for virtual or external data marts:
- Teradata Active Data Warehouse 5550 is an active data warehouse platform that provides up to twice the performance of legacy systems and supports all types of virtual data marts, operational data stores and analytical sandboxes.
- Teradata Data Warehouse Appliance 2550 is an entry-level data warehouse used either by companies just starting out in data warehousing or by companies with an established enterprise data warehouse (EDW) that require a complementary external analytical sandbox.
- Teradata Data Mart Appliance 550 is a departmental data warehouse. This single-application system can be used as a small data mart for temporary or quick data analytics or to test and develop new applications, data types or tools.
The platforms provide complete decision-support life cycle management. While the Teradata 550
is a small-scale platform, the Teradata 2550 can be upgraded to the Teradata 5550—the EDW. In addition,
they all support Teradata Database Views, which enable creation of virtual data marts.
The family's standardized architecture enables organizations to leverage existing resources
and tools. Applications used in the Teradata 550 and Teradata 2550 can easily migrate into the EDW.
This flexibility enables the same data, data models, table structures, views, load jobs and queries
to meet short- and long-term data integration requirements.
Furthermore, database and system administrators and application developers can support multiple
systems across the organization with standardized BI and extract,
transform and load tools without additional training.
Whatever your organization's needs may be, the Teradata platform family provides for
them—all with the benefits of the Teradata Database.
Debbie Smith, a senior data warehouse consultant, was a Teradata retail customer
for 14 years before joining Teradata in 2000.
Teradata Magazine-September 2008