General
PDQ manages platform-level settings located in the General section of the settings page. This page is key to defining the platform's overall functionality, and a thorough understanding of each section is essential to getting the most out of profiling and reporting on your assets.
The semantics feature lets users execute a Semantic data discovery job on a predefined schedule at the platform level. When configuring a connection, users can enable the Push semantics option, which triggers the semantic job to classify data and apply the matching semantic terms to assets.
Additionally, users have the flexibility to manually initiate Semantic discovery for individual assets by clicking the compass icon near the Terms field on an asset page.
To streamline the process and avoid the inconvenience of navigating to each asset, users can utilize this setting option to run Semantic discovery for all assets at a scheduled interval.
Users can deactivate the Semantic option if they prefer not to execute Semantic jobs for any of their assets. Disabling this setting also deactivates the Semantic Run option on the assets page. To learn more about semantic terms, please read the article Semantic Term and KPI Monitoring Standardization.
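The scheduled job described above can be pictured as a loop over every asset, applying the same classification that the compass icon triggers for a single asset. The sketch below is purely illustrative: the classifier, asset structure, and function names are hypothetical stand-ins, not PDQ's actual internals.

```python
# Hypothetical stand-ins for PDQ internals; names are illustrative only.
def discover_semantics(asset):
    """Pretend classifier: tag any column whose name contains 'email'."""
    return {col: "Email Address" for col in asset["columns"] if "email" in col}

def run_scheduled_discovery(assets, semantics_enabled=True):
    """Run semantic discovery for every asset, as the scheduled job would,
    instead of clicking the compass icon on each asset page."""
    if not semantics_enabled:  # setting disabled: no semantic jobs run at all
        return {}
    return {a["name"]: discover_semantics(a) for a in assets}

assets = [
    {"name": "customers", "columns": ["id", "email_address", "city"]},
    {"name": "orders", "columns": ["order_id", "amount"]},
]
results = run_scheduled_discovery(assets)
```

When the setting is disabled, the loop simply never runs, which mirrors the Semantic Run option disappearing from the assets page.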
This parameter specifies how long PDQ retains metadata and other asset-related data in its backend database (Postgres) after an asset is deleted from the platform. It is important to understand the two asset-deletion options that PDQ implements:
Soft Delete - Maintains history (keeps the asset details, all metadata information, and audit logs for future reference)
Hard Delete (Yes and Purge) - Removes history; the backend database is completely cleared of asset details, metadata information, and audit logs.
In contrast, the soft-delete option stores this information for a defined number of days. This setting applies to several types of information, including asset logs, audit logs, and user logs, and users can define the storage period based on their specific use case.
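The retention window can be sketched as a periodic cleanup that purges soft-deleted history only once it ages past the configured number of days. PDQ's actual Postgres schema is not documented here, so the table and column names below are hypothetical; sqlite3 stands in for Postgres to keep the sketch runnable.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical history table; PDQ's real backend schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE asset_history (asset TEXT, deleted_at TEXT)")

now = datetime(2024, 6, 30)
conn.executemany("INSERT INTO asset_history VALUES (?, ?)", [
    ("old_asset", (now - timedelta(days=40)).isoformat()),     # past window
    ("recent_asset", (now - timedelta(days=5)).isoformat()),   # within window
])

RETENTION_DAYS = 30  # the value configured in this setting
cutoff = (now - timedelta(days=RETENTION_DAYS)).isoformat()

# Soft delete + retention: history survives until it ages past the window.
conn.execute("DELETE FROM asset_history WHERE deleted_at < ?", (cutoff,))
remaining = [r[0] for r in conn.execute("SELECT asset FROM asset_history")]
```

With a 30-day window, only the asset deleted 5 days ago still has history; the 40-day-old entry is purged on the next cleanup pass.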
The preview setting enables users to observe data in a measure without requiring the activation of external database reporting, provided the necessary permissions are granted for the role. Users can determine the number of records to be displayed in the preview.
Users can also decide which columns are displayed for any configured measure:
● All columns - Displays all columns in the dataset.
● Primary Key - Displays only the primary key sensed by the platform or the identifier mapped by the user.
The preview column choice is not applicable for Standalone measures and aggregate queries.
Once primary keys are sensed by the platform or manually entered by the user under Identifiers, the preview sections show only the primary key, based on the asset configuration defined under Identifiers:
● Measure Preview
● Preview After Measure Validation
If Identifiers are not selected, the preview shows the following error message: “This asset does not have a unique identifier and query cannot be executed due to security settings, please see your admin”
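The preview behaviour above amounts to building a limited query from either all columns or only the mapped identifiers, and refusing to run in primary-key mode when no identifier exists. The helper below is a hypothetical sketch of that logic, not PDQ's actual implementation:

```python
def build_preview_query(table, columns, identifiers, mode, limit=50):
    """Sketch of the preview column choice (hypothetical helper).
    mode: 'all' shows every column; 'pk' shows only mapped identifiers."""
    if mode == "pk":
        if not identifiers:
            # Matches the documented security error when no identifier is set.
            raise ValueError(
                "This asset does not have a unique identifier and query "
                "cannot be executed due to security settings, "
                "please see your admin")
        cols = identifiers
    else:
        cols = columns
    # limit corresponds to the configured number of preview records
    return f"SELECT {', '.join(cols)} FROM {table} LIMIT {limit}"
```

For example, `build_preview_query("customers", ["id", "email"], ["id"], "pk", 10)` previews only the identifier column, while the same call with `mode="all"` previews every column.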
Reporting is enabled when a user wants the failed records of measures to be pushed to a preferred external database. The following options are key to setting up the reporting database:
Measure - Lets users choose the type of measures whose failed records will be pushed. Users can choose auto-measures, custom measures, or both.
Export is supported only for the following auto-measures:
■ Blank Count
■ Duplicate
■ Duplicates
■ Enum
■ Length
■ Length Range
■ Long Pattern
■ Null Count
■ Regular Expressions
■ Short Pattern
■ Zero Values Count
■ Value Range
Reporting Connection - The target connection to which the failed rows are pushed
Reporting Database - The database to which the failed rows are exported (mandatory for the Google BigQuery connector)
Reporting Schema - The target schema in which the failed-rows table is created
Retention Period - Defines the maximum number of runs/days to store the failed rows (up to 24 months). Older records are deleted from the write-back tables based on the Created_Date column.
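The day-based side of the retention period works the same way as the asset retention setting: records in the write-back table are removed once their Created_Date falls outside the window. The Created_Date column name comes from the text above; the rest of the schema is illustrative, with sqlite3 standing in for the reporting database.

```python
import sqlite3
from datetime import datetime, timedelta

# Illustrative failed-rows write-back table; only Created_Date is documented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE failed_rows (record_id INT, Created_Date TEXT)")
now = datetime(2024, 6, 30)
conn.executemany("INSERT INTO failed_rows VALUES (?, ?)", [
    (1, (now - timedelta(days=400)).isoformat()),  # older than retention
    (2, (now - timedelta(days=10)).isoformat()),   # within retention
])

RETENTION_DAYS = 365  # up to 24 months per the setting
cutoff = (now - timedelta(days=RETENTION_DAYS)).isoformat()
conn.execute("DELETE FROM failed_rows WHERE Created_Date < ?", (cutoff,))
kept = [r[0] for r in conn.execute("SELECT record_id FROM failed_rows")]
```

A run-count-based retention would work analogously, keeping only the failed rows from the most recent N runs instead of the most recent N days.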