Last month’s article covered databases and data warehousing at a very basic level. This month, let’s take a closer look at the terminology associated with data warehouses. Knowing these search terms, keywords, and concepts will improve your ability to source and screen these high-demand technologists.
A traditional warehouse stores a company’s products or goods. A data warehouse likewise stores an organization’s assets, but in this case the asset is information, or data. At the most basic level, a data warehouse gathers information from external and internal sources. This information is then analyzed to understand trends, such as sales or customer behavior, and to support forecasting. Four typical processes for a data warehouse are:
- Warehouse generation, which involves designing and loading data.
- Data management, which concerns the storing of data and staging for user access.
- Information access, which is the ability to retrieve data.
- Information analysis, which is the use of the data to support decision-making.
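As a rough illustration of the four processes listed above, the sketch below walks a handful of invented sales records through generation, management, access, and analysis. The function names (`load_sources`, `store`, `retrieve`, `analyze`) and the sample figures are hypothetical and chosen only for this example; a real warehouse would rely on dedicated loading and database tools.

```python
from collections import defaultdict

# Hypothetical source records: one internal system, one external feed.
INTERNAL_SALES = [
    {"region": "East", "month": "1999-01", "amount": 100},
    {"region": "West", "month": "1999-01", "amount": 80},
]
EXTERNAL_SALES = [
    {"region": "East", "month": "1999-02", "amount": 120},
]


def load_sources(*sources):
    """Warehouse generation: gather records from every source into one list."""
    rows = []
    for source in sources:
        rows.extend(source)
    return rows


def store(rows):
    """Data management: stage rows by region so users can reach them."""
    staged = defaultdict(list)
    for row in rows:
        staged[row["region"]].append(row)
    return staged


def retrieve(staged, region):
    """Information access: fetch the rows for one region."""
    return staged.get(region, [])


def analyze(rows):
    """Information analysis: total sales, a figure a decision-maker might use."""
    return sum(row["amount"] for row in rows)


warehouse = store(load_sources(INTERNAL_SALES, EXTERNAL_SALES))
east_total = analyze(retrieve(warehouse, "East"))
print(east_total)  # 220
```

Each function maps to one bullet, which is also roughly how the job roles divide: some candidates will have spent most of their time on loading, others on access and analysis.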
These processes help define the roles associated with data warehousing. The skills these technologists need are different from those required of professionals working on operational databases, since the data spans the entire organization and is used differently. Understanding the basic terms and job functions relating to data warehousing will help you grasp the skill set and requisite experience of the technologist candidate.
For example, a database designer is concerned with files, records, and fields and their relationships in an operational database. In data warehousing, a Data Modeler, or Data Architect, works with enterprise-wide data. Understanding the user’s selection and analysis criteria is a critical skill for this role. The data model is crucial to a good data warehouse and incorporates both warehouse generation and data management. It usually involves a blueprint of all data from all departments as well as metadata. Metadata is a summary defining the data in the warehouse, much like a card catalog in a library, in which the cards contain summary information.
Additional keywords that are critical to understanding data modeling are logical and physical design. Logical design defines the data and their relationships. Multidimensional, star schema, and snowflake model all refer to types of logical design. Physical design relates to the hardware and how the data is stored.
Information access and analysis define how a user accesses, views, and analyzes the data. Data can be accessed and viewed using OLTP and OLAP. OLTP, online transaction processing, is most frequently used for operational databases. OLTP shows detailed, day-to-day data that reflects current information. In contrast, OLAP, online analytical processing, is most commonly used for decision support. The data is typically summarized and draws on historical and near-current data. OLAP builds models for forecasting, spotting trends, and statistical analysis.
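To make the star schema and the OLTP/OLAP contrast more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_date`) and the sample rows are assumptions invented for illustration, and SQLite is only a stand-in for whatever database a candidate actually used. The first query is an OLTP-style lookup of one detailed record; the second is an OLAP-style summary across historical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the product and date of each sale.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, year INTEGER);
    -- The fact table sits at the center of the star and holds the measures.
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id INTEGER REFERENCES dim_date(date_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_date VALUES (1, '1998-06-01', 1998), (2, '1999-06-01', 1999);
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 10.0), (2, 1, 2, 15.0), (3, 2, 2, 7.5);
""")

# OLTP-style access: one detailed, current record.
detail = conn.execute(
    "SELECT amount FROM fact_sales WHERE sale_id = ?", (3,)
).fetchone()

# OLAP-style access: summary figures across historical data.
summary = conn.execute("""
    SELECT p.name, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.name, d.year
    ORDER BY p.name, d.year
""").fetchall()

print(detail)   # (7.5,)
print(summary)  # [('Gadget', 1999, 7.5), ('Widget', 1998, 10.0), ('Widget', 1999, 15.0)]
```

The central fact table joined to surrounding dimension tables is what gives the star schema its name. A snowflake model would further split the dimension tables into related sub-tables.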
End users query the warehouse to retrieve, manipulate, and analyze data. Data mining looks for hidden patterns within the data.
A few questions that can help you qualify this type of technologist include:
- What were your primary responsibilities with the data warehouse?
- What size was the data warehouse?
- What is logical design? Physical?
- Who were the key users? How was it used?
- How was the data brought into the warehouse? What were the sources?
- How was the data accessed?
According to Dataquest, the data warehousing software market is expected to reach $6.9 billion by the year 2000. With numerous predictions of explosive growth in this area, the competition for talent can be staggering. However, a recruiter who knows the marketplace will be better able to locate the talent, be better equipped to interview candidates, and have greater credibility with the technologists themselves.