An effective data warehouse must balance ease of use and complexity of function.
by Dan Higgins, Director of Teradata Warehouse sales support
Simplicity is a frequent topic among platform and tool vendors, data warehouse architects, designers, implementers
and users. Platform vendors claim lower total cost of ownership through ease of setup, management and use of their
systems. Business intelligence (BI) tool vendors frequently build their products around a paradigm of simplicity that works
for some needs but may not for others. Data warehouse architects wrestle with addressing complex business needs while
creating an architecture that is easy to use and cost-effective to manage and evolve. End users want the flexibility and power of
robust data models but sometimes struggle with the resulting complexity.
Simplicity is essential to ease of use, ease of manageability and effective exploitation and extension of the data warehouse, but
not when it is achieved by severely limiting the scope, flexibility or capabilities of the data warehouse solutions. As an
analogy, consider sports action photography. Cameras have come a long way from the days when you manually measured
the available light, adjusted the lens opening (aperture) and shutter speed and then focused on a quickly moving and highly
unpredictable subject.
Today, dozens of cameras on the market are simple to use; thus the phrase "point and shoot." But these cameras lack the
speed, accuracy and capabilities necessary to capture fast, unpredictable action. At the other end of the spectrum are extremely
sophisticated digital cameras—some of which automatically adapt to complex and dynamic shooting situations, freeing the
photographers to concentrate on the action and the image they want to capture, not on how the camera functions.
A paradox: Complex technology limits capability
Consider the following paradox: If the technology architecture or platform is overly
complex, you need to simplify or limit the business solution to accommodate the challenges of
working with the underlying platform. A platform/architecture that hides its complexity frees
you to focus your effort on developing more sophisticated and powerful business solutions.
—D.H.
Get the picture?
The point is that not all simplicity is equal. (See table below.) In some cases simplicity
is merely the product of inadequate capability, which in data warehousing can result in:
- Limited and sometimes dead-end business solutions
- Significant additional custom work on the part of IT to build a substitute for the missing functionality
- Overly complex and costly architectures as it becomes necessary to use multiple platforms, each servicing a narrow scope of needs
At the other extreme are platforms, architectures and data models that attempt to provide considerable capability and
flexibility but do so in a manner that requires extensive and deep knowledge of the inner workings of the technology. In
these situations IT workers must constantly monitor the system and make adjustments to ensure the technology is stable and performing optimally.
Table: Some data warehousing products and architectural approaches are simple because they lack capability. Other products have extensive capability but pass the associated complexity on to IT and end users, increasing cost and risk.
At some point the complexity of the technology platform, architecture or data model becomes a burden and an inhibitor
to delivering business solutions. These overly complex platforms and architectures place far greater demands on IT,
overwhelm end users and increase the cost, risk and the time it takes to deliver business solutions.
The ideal is to find the balance between these extremes: lots of capability to ease each person's task—database administrator,
application developer, power user, etc.—in a manner that:
- Virtualizes most of the underlying computing resources
- Automates the management and, in many cases, the use of those computing resources
- Exposes underlying capability only when it is needed to address more complex business solutions
Zoom in on simplicity and capability
When we select technology platforms and tools, and when we define the architecture or the data models for our decisioning
infrastructure, we need to find that balance. We need to insulate operations personnel and end users from technology complexity.
When we design our platform architecture, we need to minimize the number of architectural components and platforms to
simplify management, administration and the use of integrated data. With data models we need to provide semantic layers to hide
the complexity and ease the exploitation of complex data models.
Tips on 'capturing' a data warehouse solution
- Start by choosing trustworthy technology. Data warehousing technology needs to provide the speed and flexibility to keep pace with the needs of a dynamic business environment.
- Leverage detailed data to "zoom in" on critical information. Effective data warehousing needs both detailed and summary data. Summary data alone will not allow you to "zoom in" on the business information.
- Allow the data warehouse users to freely ask a wide variety of business questions. Many queries may be throw-aways, but you often won't know until you see the answer.
- Make questions relevant. A fast query is meaningless if it does not answer the business question.
- Take full advantage of the system's innate capabilities. For example, with Teradata you do some minor tuning based on your anticipated workload, start using the system, and then occasionally monitor resource utilization, making minor adjustments when required.
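The "zoom in" tip can be illustrated with a small sketch. The Python example below uses a handful of made-up detail rows (the regions, stores, products and amounts are hypothetical) to show that a summary answers only the question it was built for, while detail rows can be re-cut at whatever grain a new question requires. It is an illustration of the idea, not a description of how any particular warehouse stores or aggregates data.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, store, product, amount).
detail = [
    ("East", "Store 1", "Widget", 100.0),
    ("East", "Store 1", "Gadget", 40.0),
    ("East", "Store 2", "Widget", 60.0),
    ("West", "Store 3", "Widget", 75.0),
]

def summarize(rows, key_indexes):
    """Total the amount column, grouped by the chosen columns."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[3]
    return dict(totals)

# A pre-built summary answers only the question it was built for.
by_region = summarize(detail, [0])

# With detail available, the same data can be re-cut at a finer grain,
# for example to see which store and product drive the East total.
east_rows = [r for r in detail if r[0] == "East"]
east_by_store_product = summarize(east_rows, [1, 2])

print(by_region)
print(east_by_store_product)
```

If only `by_region` had been kept, the follow-up question about stores and products could not be answered without going back to the source systems.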
Consider where your data warehousing support team is spending its time. How much of it is spent dealing with the
complexities of technology rather than helping the business more effectively use the data warehouse? How much time
is spent filling in functionality gaps or building surrounding infrastructure for the data warehouse platform because it
was not included in the products you are using?
In the case of architectures, independent decision-support platforms can add complexity, cost and risk, turning a "divide and
conquer" approach into "divide and complicate." Attempts to make things simple for one part of the business can make things
more complex for the enterprise overall.
As for your approach to modeling your data, it is possible to simplify your data model to the point at which you have lost
the ability to answer many potentially valuable business questions. Have you limited the
ability of the business to exploit your investment in the data warehouse? On the other
hand, a data model may accurately represent the information within the enterprise but be overwhelming to the end user. Are
you using views, BI tools and other forms of an end-user semantic layer to narrow the scope of what a particular end user may
see and to simplify the use of that complex data model?
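As a minimal sketch of the view-based semantic layer described above, the following Python example uses SQLite's in-memory database as a stand-in for the warehouse (it is not Teradata) to show how a view can encapsulate join logic and narrow what an end user sees. The table, column and view names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Illustrative only: SQLite stands in for the warehouse; all table,
# column and view names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE sales_detail (
        sale_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        sale_date TEXT,
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
    INSERT INTO sales_detail VALUES
        (10, 1, '2007-11-01', 100.0),
        (11, 1, '2007-11-02', 50.0),
        (12, 2, '2007-11-02', 75.0);

    -- The view hides the join and exposes only the columns this user needs.
    CREATE VIEW regional_sales AS
        SELECT c.region AS region, s.sale_date AS sale_date, s.amount AS amount
        FROM sales_detail AS s
        JOIN customer AS c ON c.customer_id = s.customer_id;
""")

def sales_by_region(connection):
    """Query the view as if it were a simple table; no join knowledge needed."""
    rows = connection.execute(
        "SELECT region, SUM(amount) FROM regional_sales GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)

totals = sales_by_region(conn)
print(totals)
```

The end user's query touches only `regional_sales`; the underlying tables can stay fully detailed, and the join can change, without the user's query having to change.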
Technologies that lack capability or use overly simplistic data models will limit the ability to deliver business solutions and
create value from the data warehouse. On the other hand, technologies, architectures and data models that are overly complex
increase cost, effort and risk while becoming a barrier to delivering business solutions and business value.
To fully exploit the data warehouse in the most cost-effective manner, we need to choose technologies and architectures that
provide extensive functionality, allowing us to address a wide variety of business problems while minimizing the effort required to
manage and use that technology. And we need to provide tools and semantic layers for the end users that insulate them from underlying
data and technology complexities until they are necessary.
Why Teradata?
Since its beginning more than 25 years ago, Teradata's simplicity and ease of management have been key enablers for its customers' industry-leading data warehouses and decision-support capabilities. Many characteristics and features of Teradata contribute to this overall simplicity:
- Software-based shared-nothing architecture eliminates the challenges and risks of managing shared data in a clustered architecture.
- Virtualization of CPU, memory and data capacity enables users to view and manage the Teradata configuration as a single system, whether it is one node or 300.
- A cost-based optimizer with query rewrite simplifies writing well-performing queries.
- Parallelism is always on and integral to all system operations. Parallelism does not need to be configured, managed or monitored as with other technologies.
- Automated management of physical disk space eliminates requirements for reorganizations or for managing physical disk objects such as files, extents and tablespaces.
- Teradata's database views can be used to hide potentially confusing portions of the data model and encapsulate complex join logic.
- Teradata's parallel load and update utilities provide powerful functionality and scalable performance without the need to develop custom solutions.
—D.H.
Teradata Magazine-December 2007