I have been using Microsoft’s Master Data Services (MDS) on a daily basis for the last 8 months. I have also been using Microsoft’s Data Quality Services (DQS) for the last couple of months. While both products are very good and are a great solution for Master Data Management (MDM), there are a few areas that could be better. To the rescue is a product called Master Data Maestro by Profisee. It is a platform built on top of MDS that uses the MDS API, along with Profisee’s own composite APIs, to add many features that MDS does not support. Profisee is the original developer of MDS (the product was originally called Enterprise Dimension Manager (EDM) and was built by a company called Stratature, which Microsoft purchased in 2007) and has essentially extended the roadmap that existed at the time of the purchase.
In addition, Profisee also has industry models for metadata management for healthcare, insurance, oil & gas, and retail to greatly speed up delivery of your MDM solution.
For more info and a free demo, click here.
Below is more detail on the capabilities and benefits of Master Data Maestro:
Maestro consists of two components: a desktop client application (Maestro Desktop) designed to appeal to master data stewards, and a server module (Maestro Server) to administer advanced data quality functionality including merge, match, deduplication, survivorship, harmonization, and location and mailing address cleansing, all without coding or scripting.
Capabilities
Maestro Desktop provides the following capabilities:
- A responsive and intuitive user interface (with a tab structure). Maestro uses its own composite API, which extends and builds on top of the MDS API, behind a much easier to use interface. It also uses a faster connection (net.tcp instead of http). It replaces the Excel MDS add-in by providing easier to use Excel-like features for viewing data without all the extra Excel features getting in the way, which gives significantly better performance even on small entities
- Interface customization via personalization and workspaces (including personal attribute groups)
- Bi-directional cut, copy, and paste to and from Microsoft Excel
- Data quality validation, issues review, and change auditing
- Statistical data quality results derived from the matching process
- A tailored matching interface to review, approve, or reject results of the matching process without writing code or scripting
- Address verification, standardization, and geocoding without writing code or scripting
- Reporting integration with SQL Server Reporting Services
- Ability to clone members
- Ability to compare members side by side, see the differences between them, and pin records so they can be updated from a good member
- Metadata caching, deferred publishing (edit data offline and publish it at once, with undo), and management of MDS connections
Maestro Server provides the following capabilities:
- Adaptive modeling to quickly create and populate MDS models, entities, and attributes – what can take 4 days using the MDS web UI can be done in 4 hours with Maestro
- Matching strategy creation and execution without writing code or scripting
- Address location verification strategy creation and execution with interfaces to Bing Maps and Melissa Data without coding or scripting. Bing Maps integration is included at no cost, and the Melissa Data integration allows 75 million unique record lookups a year for $10,000
- Excel and OLEDB integration capabilities, including importing data via OLEDB into a model (for example, building the model by pulling metadata and data from SQL Server)
- A software development kit (SDK), web parts, and workflow integration components to expedite the creation and delivery of more customized, real-time master data interfaces and applications. This SDK is much easier to use than the Microsoft MDS API because the developer works with strongly typed classes and records instead of the request/response model the MDS API requires. This can mean months of development cut down to weeks or even days
Benefits over MDS
The items below describe key benefits that Profisee clients are realizing through the adoption of Master Data Maestro.
- Productivity. Data stewards enjoy higher levels of productivity given the product’s more intuitive interface and personalization, with support for hierarchy management. The time to create and edit models, entities, attributes, and hierarchies shrinks significantly for master data administrators.
- Scalability, to support increased usage of MDS
- Integrated Data Quality. Location verification, matching, master record creation (golden record), and survivorship are integrated with the master data hub. This integration removes the need to create and manage complex ETL scripts to integrate with 3rd party solutions.
- Lower Cost of Ownership. Maestro’s lower price points relative to the competition, together with Profisee’s relationships with Microsoft and Melissa Data, enable customers to take advantage of attractive transaction discounts.
- Profisee Partnership. Profisee’s relationship with Microsoft and intimate knowledge of Master Data Services benefit customers through knowledge transfer and proactive product support. Maestro’s release cycle, which takes considerable customer input, enables customers to benefit from the product’s ongoing innovation around MDM and data quality. Profisee beta tests service packs (SPs) and cumulative updates (CUs) and can recommend to its clients whether to apply them or not
Benefits over DQS
- Improved workflow
- Layered matching strategy that is easier to use than defining multiple rules in DQS
- Match Groups feature
- Synonyms that are easier to use than Domain Values and Term-Based Relations in DQS
- Business Rules support. Unlike Maestro, DQS has no business rules to use in conjunction with matching (matching is done in Excel, with the limitations that brings). DQS has some support via Domain Rules, but it is limited. For example, a Domain Rule can’t send an email alert or execute a workflow
- Supports golden records, survivorship, and harmonization. In DQS these would all be ETL/coding tasks. DQS somewhat supports golden records through cluster_id, which you would use as your surrogate key
- Unlimited workload. In DQS, I think the current recommendation is no more than 1 million items. Profisee currently has customers running 25 million records in a single entity and has tested up to 70 million records
- More matching scenario support than DQS
- For specific domains like party (people and companies), you need address cleansing. Doing that with DQS and the Azure Data Marketplace is expensive, and you throw a lot of information away. Maestro is much cheaper and returns more relevant info
- The DQS interface is not as slick as Maestro’s; it is too small and difficult to navigate
- Batch cleansing requires a DQS task in SSIS (DQS task does not do any matching, only cleansing)
- Assigning records to different golden records is easy in Maestro. In DQS you must copy the cluster_id
- With DQS you must export the matching/cleansing results (including “score”) and do something with them. There is no way to update the data as-is in the model: you must import into DQS, correct, export, and feed back into the model. So the data goes to the rules instead of the rules going to the data
- When using Maestro you should create reference lists in MDS for everything, instead of validating against a list held outside of MDS (as DQS does with its Knowledge Base). You would likely need those lists for managing that domain as an MDM subject area anyway. Even if you were just doing a reference list, DQS has no web service calls, so you would have to write that yourself for real-time queries. You get that plus database access methods with MDS/Maestro
- The data steward experience for matching in Maestro is many times better than doing it via Excel with DQS
- DQS only allows you to bring data in or export it via a SQL Server instance on the same server where DQS is installed (as a workaround, you must create a view pointing to a linked server)
- DQS has three activities: Knowledge Discovery (builds a knowledge base), Domain Management (verifies and modifies the knowledge that is in knowledge base domains), and Matching Policy (defines how DQS processes records to identify potential duplicates and non-matches). So you need to bring reference data into the knowledge base, as well as keep a copy in MDS, since you will be using that to transfer to the data warehouse
- The bottom line is DQS is totally separate from MDS and this introduces more coding in SSIS to make it all work together. Also, DQS does not include a web service interface for real time access or update
- Since there is no SSIS package for DQS matching, it all must be done interactively. Plus you can’t refresh the DQS knowledge base in SSIS either, which may mean fuzzy matching in SSIS is a better option
- Think of Maestro as the “knowledge base”: there is no need for the importing that DQS requires. In DQS you create a knowledge base, run a data quality project in which you make decisions on project values, and then import those decisions back into the knowledge base
- The only benefit DQS may have is with its interface to the Azure Data Marketplace
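
To make the cluster_id, golden record, and survivorship ideas from the list above more concrete, here is a minimal Python sketch. It is not DQS or Maestro code and uses neither product's API. The record fields, the crude name-normalization match key, and the "most recently updated non-empty value wins" survivorship rule are all assumptions chosen for illustration; a real matching strategy would be far more sophisticated.

```python
from collections import defaultdict

# Hypothetical source records; the field names are illustrative only.
records = [
    {"id": 1, "name": "Acme Corp", "phone": None, "city": "Dallas", "updated": "2013-01-05"},
    {"id": 2, "name": "ACME Corp.", "phone": "555-0100", "city": None, "updated": "2013-03-10"},
    {"id": 3, "name": "Globex", "phone": "555-0199", "city": "Austin", "updated": "2013-02-01"},
]


def match_key(record):
    """Crude normalization standing in for a real matching strategy (assumption)."""
    return "".join(ch for ch in record["name"].lower() if ch.isalnum())


def assign_clusters(rows):
    """Give every record a cluster_id; records with the same match key share one."""
    cluster_ids = {}
    clustered = []
    for row in rows:
        key = match_key(row)
        cluster_id = cluster_ids.setdefault(key, len(cluster_ids) + 1)
        clustered.append({**row, "cluster_id": cluster_id})
    return clustered


def build_golden_records(clustered):
    """Survivorship: per attribute, keep the most recently updated non-empty value."""
    groups = defaultdict(list)
    for row in clustered:
        groups[row["cluster_id"]].append(row)

    golden = {}
    for cluster_id, members in groups.items():
        # ISO-formatted dates sort correctly as plain strings.
        ordered = sorted(members, key=lambda r: r["updated"], reverse=True)
        survivor = {"cluster_id": cluster_id}
        for field in ("name", "phone", "city"):
            survivor[field] = next(
                (m[field] for m in ordered if m[field] is not None), None
            )
        golden[cluster_id] = survivor
    return golden


clustered = assign_clusters(records)
golden = build_golden_records(clustered)
```

In this sketch the two Acme rows end up in the same cluster, and the surviving record takes its name and phone from the newer row and its city from the older one. Doing this by hand is the kind of ETL/coding work described above for DQS, and what Maestro is described as handling without code.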