Data Warehousing


By

Isha Budhiraja
B.Tech, MBA, M.Phil (Computer Science)
Lecturer
K.R. Mangalam Institute of Management
Kailash Colony, New Delhi
 


Working Title:-"Data Warehousing"

Data is a very important resource in today's competitive environment. It contains the day-to-day information which is required to run our life, so, first of all what is Data? A Data is a collection of raw material which we collected from many different sources and after interpreting, it becomes necessary information. To effectively manage our services, we must have the capability to effectively analyze our data, and use it in proper form. A complete, high-quality data warehouse, coupled with the right data mining applications for our life could be the key strategic advantage needed to stay ahead in the universe. Now what is the term mean "Data Warehousing".

According to Bill Inmon "Data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process".

So, what is exactly Data warehousing? In simple words we can say data warehousing is that warehouse where we can store the information in a Database in the form of Dimensions, Tables, Lookup Tables, Aggregated Fact tables. Basically Data warehouse only focus on data storage. For that first we make the Data cleaned, transformed, catalogued and then make it available for use by managers and other business professionals for data mining, online analytical processing, market research and decision support. Data warehouse puts data from many different sources and from that data it tries to resolve such problems as naming conflicts and inconsistencies among units of measure. And when they achieve in their target, they said data warehousing as an integrated form.

However, to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary all are considered essential components of a data warehousing system. Many references to data warehousing use this broader context. Thus, an expanded definition for data warehousing includes business intelligence tools, tools to extract, transform and load data into the repository, and tools to manage and retrieve metadata. Data warehousing is Nonvolatile in nature because once data has entered in warehouse then it cannot be changed. This is logical because the purpose of a warehouse is to enable us to analyze what has occurred. In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. A data warehouse's focus on change over time is what is meant by the term time variant.

What makes a Data warehouse different from other traditional applications

Data warehouses are additive in nature, which means they don't conform to standard accounting  rules for financial data and are tolerant of redundancy, a construct normal relational databases avoid at all costs. 

Legacy applications typically have standard interfaces and prescribed reporting packages, but data wareho uses are mostly accessed via adhoc single use queries. Data warehousing is essentially what you need to do in order to create a data warehouse, and what you do with it. It is the process of creating, populating, and then querying a data warehouse and can involve a number of discrete technologies. Data warehouses are only designed to help you to analyze the data. For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. Using this warehouse, we can easily check out the status of the sales that how much we have targeted in prior years and from that we can generate our reports and from that reports we can take the decisions for our future business.

Designing of Data warehouse:-

Before designing our data warehouse, first we will decide that what contents are to be made for data warehousing. For e.g.  If you want to build a data warehouse containing monthly sales numbers for multiple store locations, across time and across products then we have to set our dimensions like: Location, Time and Product.

Each dimension table contains data for one dimension. From above example we get all our stored information and we put that data into one single table called Location. We can spanned our stored location data across multiple tables in our OLTP system (Online transaction processing), but it needs to de-normalize all that data into one single table.

Dimensional modeling is the design concept used by many data warehouse designers to build their data warehouse. In this model, all data is contained in two types of tables called Fact Table and Dimension Table and all the framework of the measurement is represented in dimens

n tables. In the business process Sales, the characteristics of the 'monthly sales number' measurement can be a Location (Where), Time (When) and Product Sold (What). The Dimension Attributes are the various columns in a dimension table. If we take an example of Location dimensions we can make attributes of location as the Location Code, State, Country and the Location Zip code. Generally the Dimension Attributes are used in reports, and query constraints such as where Country='INDIA'. The dimension attributes also contain one or more hierarchical relationships that depend on the data.

Need of these Data warehousing:-

1) Design a data warehouse or data mart to present information needed by management in a form that is usable for management clients.

2) After implementing a high quality data warehouse or data mart, work can be done in easier form.

3) Effectively administer a corporate Data resource in such a way that it will truly meet Management's needs.

4)  To evaluate standards and new technologies to determine the potential impact on our information resource.

Advantages of data warehousing are:

1)  High query performance

2)  Queries not visible outside warehouse

3)  Local processing at sources unaffected

4)  It can operate when source is unavailable

5)  Can query data which is not stored in DBMS( Data Base Management System)

Difference between Database and Data warehousing:-

1) Database is a collection of inter related data. It is a storage place where we store the Data. But Data warehousing is a typical relational database which is used for analyzing and querying business requirements.

2) Database is the root of any data related operations where as data warehousing is a schema or process how effectively we access or utilize a database.

3) Normally Database is used for storing the data in the form of tables. And the table will be updated or overwritten depending upon the business requirement. But in Data Ware housing Database is used for query and analysis i.e. the data is never deleted. We can compare the old data as more and more data added.

4) Database holds only current data and it has a dynamic nature because according to the surroundings and need the data in the Data base can be changed but Data warehousing holds historical data. It stores lightly and highly summarized data and it has a static in nature. The data cannot be changed if once inserted in Data warehousing.

5) Database is Application-oriented and Support day-to-day decision, but Data wore housing is Subject-oriented and support strategic decisions.

References:-

1) Willim. H. Inmon, Building the Data Warehouse, Fourth edition, 2004

2) Peter Marwedel, Data warehousing - Springer, Netherlands, 2007

3) Wikipedia, Data warehousing.

4) Michael Venerable, Christopher Adamson, Data Warehouse Design Solutions.
 


Isha Budhiraja
B.Tech, MBA, M.Phil (Computer Science)
Lecturer
K.R. Mangalam Institute of Management
Kailash Colony, New Delhi
 

Source: E-mail June 30, 2011

          

Articles No. 1-99 / Articles No. 100-199 / Articles No. 200-299 / Articles No. 300-399 / Articles No. 400-499/ Articles No. 500-599
Articles No. 600-699 / Articles No. 700-799 / Articles No. 800-899 / Articles No. 900-1000 / Articles No. 1001-1100
Articles No. 1101-1200 / Articles No. 1201-1300 / Articles No. 1301 Onward / Faculty Column Main Page