Share via


Master Data Services in the cloud

Introduction

*"Master data is information that is key to the operation of a business. It is the primary focus of the Information Technology (IT) discipline of Master Data Management (MDM), and can include reference data. This key business information may include data about customers, products, employees, materials, suppliers, and the like. While it is often non-transactional in nature, it is not limited to non-transactional data, and often supports transactional processes and operations. For example, analysis and reporting is greatly dependent on an organization's master data. Because master data may not be stored and referenced centrally, but is often used by several functional groups and stored in different data systems across an organization, master data may be duplicated and inconsistent (and if so, inaccurate)." *Wikipedia

In this article, we will be talking about master data services in the cloud. This article is being presented in the following order:

  1. What challenges and problems can the combined  cloud computing and MDS solve?
  2. How are we going to solve those problems?
  3.  Technical and achitectural overview 
  4.  Installation and configuration of Master data services
  5.  Basic concept of Master data services
  6.  Master data services basic operation
  7. Advantages of the new SQL server 2012 insightful computing and the cloud

Sample Scenario

Fictional company "Westwind" is operating business in 98 countries around the world with 50,000 employees and more than 2,800 enterprise client.

The company wants a consistent record of employees, client, contacts and subcontractors across the entire enterprise. These data will be used by more than in 500 different departments in different delivery centers around the globe.

 These department level user are known as application team or "app team" in short.

An example of this is the "mylearning" team - a team similar to Microsft's mylearning team. They are know as app team because they build a specific application for the entire "westwind" enterprise. "Westwind" has an existing data warehouse solution and they will make use of Master Data Service and Data Quality Service (DQS) for master data management.

Reference tables, dimensions and look-ups such as employees, client, and accounts will be supplied to department level application consumer as a "data as a service" Some department such as sales, finance and human resource supplies data to the central data warehouse. They are referred to as "inbound" applications becuase they supply data to master database. While the data from these departments are being loaded, the warehouse is almost unavailable to the user.

However, "Westwind" requires that the data be available at the soonest possible time and they require an uptime of 99.9999 percent. "Westwind" requires that there should be one source of truth. In the current setting, the only source of truth is the data warehouse. However, the data warehouse is used for reporting and application team that may need that data are only allowed to pull the data from the warehouse using integration packages. This will later result in 1,000 copies of the employee's table alone and 1,000 SSIS package being maintained and monitored by the integration team. There are at least seven reference tables in the company which will be used by 500 different departments. Each department having different application.

This scenario will result to deployment of integration team around the world for 24/7 operation coverage Westwind would like to conserve hardisk space and insists to limit the number of copies of employee tables, accounts, and other reference table.

Another problem that "Westwind" encounters is that when inbound application changes their metadata. The change should be coordinated to the outbound or downstream application. The downstream application may or may not need the data.

The real headache comes in for example the seven reference table changes their metadata each reference with at least three fields and each seven reference tables having at least 500 to 1,000 downsteam applications. With the above metioned problems, the integration team also faces an extreme challenge commencing from requirements gathering, build deadlines, testing and port openings due to changes in the server or server migration. Westwind are also facing problems with the environment aside from port opening. These are patching, VM backup retention of three days only, hard disk capacity planning and expansion and high availability solution. "Westwind" stakeholders are well aware of the cloud however they are extra paranoid about putting their data on the public cloud. "Westwind" want to simplify all of these complexity and reduce their total cost of ownership.

Master Data Services Solution

Westwind implemented master data services to solve many of their problems above. Master data services provided some sort of governance over the data that goes into the dimension table. To resolve downtime issue, "Westwind" initially implemented three MDS database in the cloud. While the main MDS server is being populated the rest of the MDS database are kept alive supplying data to the enterprise. There are three MDS implementation, one each for EMEA, APAC and ASIA. EMEA servers are populated during EMEA off peak hours and APAC are replicated during APAC off peak hours and so it is in ASIA. Synchronization of data in every region does not affect each other. With cloud implementation "Westwind" business insight team are now empowered to scale-up their cloud VM roles by adding compute power or scale-out master data implementation by provisioning new servers. To resolve the problems on schema changes from the inbound applications, Westwind has eliminated all the reference tables in the downstream application and forces all of them to use the MDS reference table that is geographically nearest to them. EMEA user, for example, uses the EMEA-MDS to minimize bandwidth. To rapidly respond to changing inbound schema changes, Westwind consolidates this changes to a monthly release. To facilitate faster testing westwind provisions new VMRoles for a particular release leaving the current MDS online. When the release passes all the testing requirements The server is then tagged as the new production servers and the current servers are archieved and decommisioned from the cloud. All applications using the MDS table repoint to the new production server by changing their configuration. While most of the work of the integration team concerning integration were all gone, the team now focusses on integration that is not involving master data. The problems with requirements gathering, port opening, metadata familiarity and sample data are still present. To resolve this problem "Westwind" business insight team an implemented metadata store in which all participating applications/department can query and explore. Application team are required to connect their database to the metadata store before requesting any integration activity. This metadata allows "Westwind" business insight team to explore enterprise database and architecture even before an official integration request. With the new "Westwind" cloud enabled BI. Most of the environment and configurations were gone. Cloud environment delivery team can expand or shrink computing resources such as CPU, hard disk storage and test environment provisioning were all done in a matter of minutes. "Westwind" stakeholder paranoia concerning their data on the cloud were secured by implementing some sort of DMZ in the cloud. Their servers can be only accessed onsite. This has been implemented via the VMrole internal and virtual IP.

Master Data Service and the Business Insight Architecture

Below is the westwind proposed architecture.

To gain more  understanding the new BI and Cloud architecture please take your time to watch the video below: Cloud-Ready Data Services

 

Configuring Master Data Service in the Cloud

To set up a master data service in the cloud you should create a windows server 2012 64- bit VMrole. Master data services works only on a 64-bit operating system.

 

STEP #1. Provision a "Windows Server 2012  Cloud VM role"

The first step  is to provision a cloud "VM role". For this step please refer to the first demo in the Cloud-Ready Data Services video published above.

 

Step #2. Install MDS by launching the SQL Server installer and check on MDS.

Launch the MDS installer on the Windows Server 2012 VM role.

 

Step #3. Verify that MDS is Installed.

 

Figure 2. Showing MDS tiles in Windows Server 2012

 

Step #4. Enable IIS

To be able to configure Master Data services you should configure IIS. The following links show how to Configure IIS in Windows Server 2012.

For a complete list of web application requirements, check here.

 

Step #5. Configure Master Data service database

Once you have completed the IIS requirements you will then configure the MDS database. Click on the MDS Icon to begin installation.

 

Step #6. Configure the MDS Website

Next step is to configure the MDS web site

 

Step #7. Connect the mds website to an MDS database

 

Step #8. Open the master Data service administration pages

Master Data Services Basic Concept

Before proceeding lets review some of the basic terminology of MDS.

A. Model.

Models are the highest level of data organization in Master Data Services. A model defines the structure of   data in your master data management solution. A model contains the following objects: Entities, Attributes  and attribute groups, Explicit and derived hierarchies, Collections.  MDS model is pretty much like a database in Microsoft DBMS terminology. 

B. Entity 

Entities are objects that are contained in Master Data Services models. Each entity contains members, which are the rows of master data that you manage. Entities is very much like a table in Microsoft RDBMS terminology.

C. Attributes

Attributes are objects that are contained in Master Data Services entities. Attribute values describe the members of the entity. An attribute can be used to describe a leaf member, a consolidated member, or a collection.

Master Data Service Basic Operations

A. Managing Data Warehouse Dimensions with MDS

One of the main objective of MDS is to manage warehouse dimension. In these videos you will learn how to manage warehouse dimension using MDS

 

B. Master Data management with Excel addin

One of the newest feature of MDS 2012 is to manage your data using MDS addin for excel.  This tool will allow you to  perform a lot of work such as designing your entity, doing picklist and managing entity relationship.

For more   information on this feature you can consult the following videos below: Using MDS Excel Add-In to Model and Manage your Master Data 

 

C. Managing MDS with Web User Interface Explorer overview

Below is an overview of The MDS Web User Interface. 

SQL 2012 Master Data Services - Web User Interface Explorer overview 

 

D. Data Quality Service and MDS working together

This TechNet video will give you an overview of DQS and MDS working together

DQS Integration with MDS: Data Matching using MDS

 

E. Using DQS and SSIS for building "MDS inbound" integration package

Although you can manage data entry with excel add-in you will be using SSIS to "inbound" the data to MDS database especially if you have an OLTP system as source. In the video link below, you will be shown how SSIS, DQS and MDS integrate with each other.  Enterprise Information Management (EIM): Bringing Together SSIS, DQS, and MDS