Data Profiling and Mapping Suite : FAQ


This section answers frequently asked questions about the Global IDs Data Profiling and Mapping Suite.

Is there a limit to the amount of data that can be profiled?

Can the tool profile documentation and other content?

What types of profiling can be done using the tool?

Is ad hoc execution of profiling processes via user interface possible?

Can the software present profiling results in textual report format?

Does Global IDs provide pre-built functionality to analyze trends in profiling results over time?

Can Global IDs track the flow and transformations of data through the complete data lifecycle?

Can the Global IDs tool classify data based on default or user-defined rules?

How do you establish linkages or relationships between metadata objects from different development tools? Is this done manually or automatically? Do you have any specialized "data rationalization" or "auto-linking" facilities that create relationships?

 

Is there a limit to the amount of data that can be profiled?

No. The tool imposes no limit on the amount of data that can be profiled. However, we recommend that customers avoid profiling very large transaction tables (e.g. billion-row tables).


Can the tool profile documentation and other content?

Yes. We have unstructured data scanners that can profile and index textual content from a variety of document formats.


What types of profiling can be done using the tool?

The tool supports ten types of profiling: statistical profiling, domain profiling, pattern profiling, dependency profiling, ID profiling, relationship profiling, time series profiling, subtable profiling, direct XML profiling and direct file profiling. A number of additional profiling capabilities can be found in our 2012 product roadmap.
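
To make two of these profiling types concrete, the following is a minimal Python sketch of what statistical profiling and pattern profiling of a single column might compute. It is an illustration only, not the Global IDs implementation; the function names (pattern_of, profile_column) and the pattern notation (9 for a digit, A for a letter) are assumptions made for this example.

```python
from collections import Counter


def pattern_of(value: str) -> str:
    """Map each character to a class: digit -> 9, letter -> A, anything else kept as-is."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values):
    """Return basic statistics and pattern frequencies for one column of values."""
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "patterns": Counter(pattern_of(str(v)) for v in non_null),
    }


# Hypothetical sample data for illustration.
sample = ["AB-1234", "CD-5678", None, "AB-1234", "12345"]
result = profile_column(sample)
```

A column whose values mostly share one pattern but contain a few outliers (here, "12345" among "AA-9999" values) is a typical candidate for a data quality review.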


Is ad hoc execution of profiling processes via user interface possible?

Yes. The UI of the Profiling Application Module (Metadata Crawler) lets users run ad hoc profiling on any schema, table or column.

 

Can the software present profiling results in textual report format?

Yes. Profiling Reports can be generated in Excel, HTML or PDF formats.

 

Does Global IDs provide pre-built functionality to analyze trends in profiling results over time?

Yes. The software profiles data sources periodically and maintains a profiling history, so results can be compared over time.

 

Can Global IDs track the flow and transformations of data through the complete data lifecycle?

Yes. The Global IDs Product Suite can trace the flow of semantic domains across the data landscape and through the complete data lifecycle, from creation to archiving.

Note: The Global IDs software cannot guarantee traceability in situations where the data is transformed in complex ways (e.g. through program code, ETL tools or message buses).

 

Can the Global IDs tool classify data based on default or user-defined rules?

Yes. The Global IDs Product Suite can classify data based on default rules or user-defined rules. The classification functionality can be used to construct business taxonomies and ontologies.
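
As a rough illustration of rule-based classification, the Python sketch below labels a column by testing its values against a list of rules, and shows how a user-defined rule could be appended to the defaults. The rule format, the rule names, the regular expressions and the 0.8 match threshold are all assumptions made for this example; they do not describe how the Global IDs Product Suite stores or evaluates its rules.

```python
import re

# Hypothetical default rules: (label, regular expression the whole value must match).
DEFAULT_RULES = [
    ("email", re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
    ("us_zip_code", re.compile(r"\d{5}(-\d{4})?")),
]


def classify_column(values, rules=DEFAULT_RULES, threshold=0.8):
    """Return the first label whose rule matches at least `threshold` of the non-null values."""
    non_null = [str(v) for v in values if v is not None]
    if not non_null:
        return "unclassified"
    for label, pattern in rules:
        matches = sum(1 for v in non_null if pattern.fullmatch(v))
        if matches / len(non_null) >= threshold:
            return label
    return "unclassified"


# A user-defined rule added on top of the defaults (hypothetical format).
custom_rules = DEFAULT_RULES + [("order_id", re.compile(r"ORD-\d{6}"))]

label_default = classify_column(["a@b.com", "c@d.org", None])
label_custom = classify_column(["ORD-000123", "ORD-004567"], rules=custom_rules)
```

Rules are checked in order, so placing more specific rules before more general ones decides which label wins when several rules match.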

 

How do you establish linkages or relationships between metadata objects from different development tools? Is this done manually or automatically? Do you have any specialized "data rationalization" or "auto-linking" facilities that create relationships?

A related module ("Data Lineage Analyzer") is responsible for establishing relationships and mappings across the enterprise data landscape. The Data Lineage Analyzer only works on physical data environments, since lineage can only be established by computing similarity between populated data columns. (Metamodels usually do not have adequate information to establish lineage.)
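
The idea of linking columns by comparing their populated values can be sketched as follows. The similarity measure used by the Data Lineage Analyzer is not specified here, so this example assumes Jaccard similarity over distinct values; the function names, column names and the 0.5 threshold are likewise hypothetical.

```python
def jaccard_similarity(col_a, col_b):
    """Share of distinct non-null values that two columns have in common (0.0 to 1.0)."""
    set_a = {v for v in col_a if v is not None}
    set_b = {v for v in col_b if v is not None}
    if not set_a and not set_b:
        return 0.0  # Empty columns carry no evidence of a relationship.
    return len(set_a & set_b) / len(set_a | set_b)


def candidate_links(columns, threshold=0.5):
    """Compare every pair of columns and keep the pairs whose similarity meets the threshold."""
    names = sorted(columns)
    links = []
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            score = jaccard_similarity(columns[left], columns[right])
            if score >= threshold:
                links.append((left, right, score))
    return links


# Hypothetical populated columns from three systems.
columns = {
    "crm.customer.id": [1, 2, 3, 4],
    "billing.account.cust_id": [2, 3, 4, 5],
    "hr.employee.id": [100, 101],
}
links = candidate_links(columns)
```

Because the comparison relies on actual values, it only works when the columns are populated, which is consistent with the restriction to physical data environments described above.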


Data Discovery Features

Data Discovery provides the ability to automatically scan and identify information assets within an organization. The DPS Product Suite offers the following features to aid the Data Discovery process:

#   Requirement                                                          Available  Comments
1   Create a Metadata Repository without significant manual involvement  Yes
2   Scan all structured databases of interest                            Yes        See database types supported
3   Scan all unstructured data of interest                               Yes        See content formats supported
4   Scan all semi-structured data                                        No         In development (beta). See supported formats
5   Provide web access to metadata repository                            Yes
6   Profile the structured data                                          Yes        10 types of profiling are supported. See details
7   Automatically detect data quality problems in structured data        Yes
8   Create a data dictionary and glossary for critical business data     Yes        Requires some degree of manual input
9   Continuously monitor the metadata environment for changes            Yes
10  Create metadata reports and profile reports on demand                Yes

Based on customer requests, the Global IDs Product Roadmap also includes the following new features:
  • It can accomplish the above tasks with a minimum of manual involvement
  • It can scale to extremely complex environments (enterprise or global levels)
  • It can meet these requirements in systematic and repeatable ways