Data Warehousing
The Business Need:
Most organisations will have a system for tracking orders, or running the manufacturing process, or financial reporting. These systems are designed to be efficient for data entry and detailed querying on a single item, but not for analysis of very large queries.
For example: consider a company with a system to track orders, deliveries, returns & payments. This system will implement a classic order header+detail structure for orders; when a customer calls to find out the status of an order the order number is the reference, and the system pulls out the detail against that order. Customer data is held elsewhere, product data in other tables.
Databases such as this do their function very well, but struggle with large-scale queries that management need to know, such as the percentage share of a given product compared to the rest of the range, sold over the whole of last year, broken down by region and customer gender. In fact it is unlikely that such a query could be answered by the operational system; sales region definitions are normally held somewhere else in the organisation – typically on a spreadsheet in the marketing department.
Unfortunately these are exactly the sort of questions that a manager who wants to grow the business needs to ask. In order to make a decision such as “Do we launch a new product of type X?”, then a manager needs to look at long-term trends: “How have costs and revenue changed over the past two years?” “Does this vary by region?” The answers to these questions will raise other questions, and so on until the manager has built up an understanding of the business, and can make the decision confidently.
Without a data warehouse, the manager trying to analyse how the business is performing has to ask the IS department to write bespoke queries against the system – a new query each time he or she has a slightly different idea to check. Not only is this time consuming for the IS department, it will slow down the operational system which is the backbone of the company.
Companies that have several sites may have several different systems, each using different terms for the same business entity, probably each implementing slightly different business logic. In this case the manager has to hope that the different IS departments interpret his questions in the same way. And when several managers turn up for a meeting, each with their own data, there are inevitably differences in the figures.
The Data Warehouse from a Business View:
A data warehouse can solve these problems, and provide a single, reliable source of data that everyone in the company can use for making business decisions.
During the design phase of the project, all the departments are required to agree on the data sources, the definitions, how metrics such as profit, cost, margin, are defined and calculated. This guarantees, for example, that the customer data includes everything needed by marketing, that product data includes marketing and production data as required, and that the sales, costs, profit and so on are calculated in the correct way – agreed by everyone including finance.
The warehouse is typically loaded overnight from all the different operational systems in the company, plus other systems such as CRM as required. Data is cleaned, and transformed into a consistent format.
Customer data is checked, cleaned, and where the same customer exists in multiple systems this is identified and standardised; product data includes all the physical data, but also costs, sale price and non-operational data such as the product hierarchy and date of introduction; duff records (of which there are inevitably a few) are removed and flagged for later processing. The data warehouse itself is available before work starts, and so there is no impact on the operational systems during the day.
Creating a Star Schema Database is one of the most important, and sometimes the final, step in creating a data warehouse. Given how important this process is to our data warehouse, it is important to understand how me move from a standard, on-line transaction processing (OLTP) system to a final star schema (OLAP - Online Analytical Processing system).
The Star-Schema structure of the warehouse is such that queries over the data are very much faster – but actually the management departments run their analysis themselves using Excel Pivot tables or other analytic tools. All of the management teams use the same data warehouse, so everyone has the same figures, and everyone knows that these figures are reliable.
A typical data warehouse is a relational database built with what is known as a Star-Schema.
To discover how Altimus can help you with you Data Warehouse/Mart call and speak to one of our experts on 0800 804 6442, or contact us through Information Request |