Course Description
Data Integrator enables the implementation of ETL (Extract, Transform, and Load) projects from disparate data sources to deliver more timely and accurate data that end users in an organization can trust. In this four-day course, you will learn about batch data transformation jobs, techniques for capturing changes in data, handling errors, multi-user environment tasks, administering server and migration basics. The course also shows you how to use tools to audit, profile data, and manage metadata to assist in the implementation of an ETL project.
Smaller activities in this course focus on the tools and features discussed in each lesson and enable you to create dimensions and a fact table. You will have a chance to apply different concepts learned from a few combined lessons in two one-hour comprehensive workshops. At the end of the course, you will be able to put into practice data flow design concepts in a final two-three hour workshop.
As a business benefit, by being able to create efficient ETL projects, you can use the transformed data to help improve operational and supply chain efficiencies, enhance customer relationships, create new revenue opportunities, and optimize return on investment from enterprise applications.
Course Audience
This course is designed for individuals responsible for implementing ETL projects (batch-mode), administering and managing projects that involve Data Integrator.
Course Topics
Lesson #1: Datawarehousing Concepts
- Describe dimensional modeling
Lesson #2: Understanding Data Integrator
- Describe components, management tools, and the development process
- Explain object relationship
Lesson #3 Defining Source and Target Metadata
- Create a database datastore and import metadata
- Create a new file format and handle errors in file formats
Lesson #4 Validating, Tracing and Debugging Jobs
- Use descriptions and annotations
- Validate and trace jobs
- Use View Data and the Interactive Debugger
Lesson #5 Creating a Batch Job
- Create a project, job, work flow, and data flow
- Use the Query transform in a data flow
- Use template tables
Lesson #6 Using Built-in Transforms and Nested Data
- Use the case, merge, and validation transforms
- Import metadata from XML documents
- Use the XML_Pipeline in a data flow
Lesson #7 Using Built-in Functions
- Use date and time functions and the date generation transform to build a dimension table
- Use the lookup functions to look up status in a table
- Use match pattern functions to compare input strings to patterns
- Use database type functions to return information on data sources
Lesson #8 Using Data Integrator Scripting Language and Variables
- Explain differences between global and local variables
- Create global variables and custom functions
- Use strings and variables in Data Integrator scripting language
Lesson #9 Capturing Changes in Data
- Use Changed Data Capture (CDC) with time-stamped sources
- Create an initial and delta load job
- Use history preserving transforms
Lesson #10 Handling Errors and Auditing
- Recover a failed job
- Create a manual, recoverable work flow
- Define audit points, rules and actions on failures
Lesson #11 Supporting a Multi-user Environment
- Describe terminology and repository types in a multi-user environment
- Create and activate the central repository
- Work with objects in the central repository
Lesson #12 Migrating Projects
- Create multiple configurations in a datastore
- Work with projects in the central repository
- Create a secure central repository
- Implement and modify group permissions
Lesson #13 Using the Administrator
- Add a repository and user roles
- Set the job status interval and log retention period
- Execute, schedule, and monitor batch jobs
- Understand architecture, load balance index, and job execution in server groups
Lesson #14 Profiling Data
- Set up the Data Profiler and users
- Submit a profiling task
- Monitor profiling tasks in the Administrator
Lesson #15 Managing Metadata
- Import and export metadata
- Use Metadata Reports
Prerequisite Education
N/A
Prerequisite Knowledge/Experience
Experience with these products or technologies will be helpful:
- Knowledge of data warehousing and ETL concepts, SQL language
- Experience with Microsoft SQL Server
- Experience using functions, elementary procedural programming and flow-of-control statements, for example: If then Else, and While Loop statements
It is recommended you review these articles prior to attending the course:
http://www.rkimball.com/html/articles.html
- Data Warehouse Fundamentals: TCO Starts with the End User and Fact Tables and Dimension Tables
- Data Warehouse Architecture and Modeling: There Are No Guarantees
- Architecture/Modeling: Advance Dimension Topics Surrogate Keys: It’s Time for Time and Slowly Changing Dimensions
- Architecture/Modeling - Industry- and Application-Specific Issues: Think Globally, Act Locally
- Data Staging and Data Quality: Dealing with Dirty Data
Additional Education
N/A
Delivery & Duration
This instructor-led offering is a four-day course.
Applicable Certifications
N/A
|