Release 2024.02.15

These notes catch up with several releases since our last published update on 2024.01.11 and are current as of the 2024.02.15 version of the platform.

Feature Enhancements

  • New Check – “Expected Schema”
  • This replaces the now-retired Required Fields rule and adds significant additional functionality. Expected Schema asserts that all selected fields are present and that their data types match expectations, resulting in comprehensive schema validation. Removing a field always triggers an anomaly, and users can choose whether additional fields are allowed.
  • Incremental Scan Starting Threshold:
  • Mirroring an already-existing feature of Profile Operations, we have introduced a “Starting Threshold” option for Incremental Scans. This feature allows users to manually set a starting value for the incremental field, giving users more control over which records are evaluated.
  • External Scans:
  • Have you ever wanted to run a stand-alone file against your existing data quality rules without adding it to your file system or entering the information into your database? We now support that use case. Users can upload individual Excel or CSV files (and JSON via the API) and apply all of a container’s existing checks ad hoc.

Note: The uploaded file must match the container’s schema (e.g., column headers in a CSV must exactly match the names of fields within the container so the checks know what to look for and evaluate).
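To make the Expected Schema behavior described above more concrete, here is a minimal Python sketch of that kind of comparison. It is not the platform's implementation: the `compare_schema` function, the `SchemaResult` class, the `allow_additional_fields` flag name, and the type names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SchemaResult:
    missing: list          # expected fields that are absent
    type_mismatches: dict  # field -> (expected type, actual type)
    unexpected: list       # fields present but not expected

    def is_anomalous(self, allow_additional_fields: bool) -> bool:
        # A removed field or a type mismatch always counts as an anomaly;
        # extra fields count only when the user has disallowed them.
        if self.missing or self.type_mismatches:
            return True
        return bool(self.unexpected) and not allow_additional_fields


def compare_schema(expected: dict, actual: dict) -> SchemaResult:
    """Compare two {field_name: type_name} mappings (illustrative only)."""
    missing = sorted(f for f in expected if f not in actual)
    mismatches = {
        f: (expected[f], actual[f])
        for f in expected
        if f in actual and expected[f] != actual[f]
    }
    unexpected = sorted(f for f in actual if f not in expected)
    return SchemaResult(missing, mismatches, unexpected)


# Hypothetical schemas: the actual data has gained a "notes" field.
expected = {"id": "integer", "email": "string", "created_at": "timestamp"}
actual = {"id": "integer", "email": "string", "created_at": "timestamp", "notes": "string"}
result = compare_schema(expected, actual)
```

In this sketch the added `notes` field is reported as an anomaly only when additional fields are disallowed, while a missing or retyped field is reported regardless of that setting.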

  • “Run Now” for Scheduled Operations
  • Don’t want to wait for your next scan to run to get results? You no longer have to. Based on user feedback, we’ve added the ability to execute any scheduled operation on demand.
  • Simplified Customization of Notification Messages:
  • We’ve removed the “use custom message” toggle from the notification form, making the message input field always editable.
  • We’ve also significantly overhauled the canned messages in an attempt to make your notifications as relevant as possible without the need for human input. For example, notifications that are triggered by the creation of an anomaly will now include a hyperlink to the anomaly by default. The new standard payload for each notification type is listed below:
  • An Operation Completes:

Notification: {{rule_name}}

A {{operation_type}} operation on {{datastore_name}} has completed.

Result: {{operation_result}}

For more details, check the link below:

{{target_link}}

  • An Anomaly is Detected:

Notification: {{rule_name}}

A {{anomaly_type}} anomaly was detected on {{datastore_name}}.

Message: {{anomaly_message}}

For more details, check the link below:

{{target_link}}

Note: This notification triggers based on the tags related to the notification and the anomaly itself.

  • Anomalies are Detected in a Table or File:

Notification: {{rule_name}}

A scan on {{scan_target_name}} within the datastore {{datastore_name}} detected {{anomaly_count}} anomalies.

Message: {{anomaly_message}}

For more details, check the link below:

{{target_link}}

Note: This notification triggers based on the tags related to the notification and the container, regardless of the tags present on the individual anomaly.

  • A Freshness SLA Violation Occurs:

Notification: {{rule_name}}

A freshness SLA violation occurred on {{container_name}} in the datastore {{datastore_name}}.

The violation started at {{freshness_violation_started}} and was identified based on the last modified time of {{container_last_modified_time}}.
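The default payloads above use `{{placeholder}}` variables. As a rough illustration of how such a template might be filled in, the following Python sketch substitutes values into the "An Anomaly is Detected" message. The `render` helper and the sample values are hypothetical; the platform's actual rendering logic is not described in these notes.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace {{name}} placeholders; unknown names are left untouched."""
    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)
    return PLACEHOLDER.sub(substitute, template)


template = (
    "Notification: {{rule_name}}\n\n"
    "A {{anomaly_type}} anomaly was detected on {{datastore_name}}."
)

# Hypothetical values, for illustration only.
message = render(
    template,
    {"rule_name": "Nightly alerts", "anomaly_type": "shape", "datastore_name": "warehouse"},
)
```

Leaving unknown placeholders untouched is a design choice in this sketch: it makes a misspelled variable visible in the delivered message instead of silently dropping it.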

  • Improvements for Profile and Scan Operation Dialogs:
  • We’ve made significant upgrades with the goal of improved clarity and a more intuitive flow. Key improvements include:
  • Incremental fields and most recently used incremental values are now displayed in the Scan configuration modal. As a result, users will know exactly where in their underlying data the operation will begin its evaluation.
  • We have reordered Profile and Scan steps to be more intuitive. Users will now select the operation targets before moving on to other details.
  • We’ve added additional language that clarifies the distinction between “Starting Threshold” and “Limit” settings.
  • We now allow users to choose whether to proceed to Schedule Options or start Scan and Profile operations immediately.
  • Naming for Scheduled Operations:
  • We’ve added a Schedule Name field to scheduled operations, enabling users to assign descriptive names or aliases. This feature aids in distinguishing and managing multiple scheduled operations more effectively.
  • Container Name Filters for Operations:
  • Users can now filter operations and scheduled operations by container name, creating the ability to quickly locate relevant information.
  • Locked/Unlocked Status Filter in Library Page:
  • Added a new filter feature to the Library page, enabling users to view check templates based on their Locked or Unlocked status.
  • Archiving Anomalies:
  • We’ve implemented the capability of archiving anomalies. Users can now remove anomalies from view without permanently deleting them, providing greater control and flexibility in anomaly management.
  • Improved Messaging for Locked Template Properties:
  • Informative messages have been added that explain why certain inputs are disabled when a check is associated with a locked template.
  • Enhanced Archive Template Confirmation:
  • When archiving a Check Template, we now list the total number of checks that would be impacted.
  • Refined Tree Navigation Experience:
  • Previously, making a selection in the Tree View would automatically expand all underlying tables or fields, which made quick navigation between elements difficult. We have changed this behavior so that items expand only at the user’s command.
  • Excluded Fields Inclusion in Drop-downs:
  • We now persist excluded fields in the UI, giving users visibility into which fields, if any, are not being evaluated by Profile and Scan operations. In addition, a warning message has been added to notify users if a profile operation is required when including fields that were previously excluded.
  • Performance Improvement in User Notifications Management:
  • We’ve implemented infinite scrolling pagination for the user notifications side panel, resulting in a much smoother UX when reviewing multiple items.
  • Improved Design for Field Identifier Tooltips:
  • Improved Interaction with Computed Tables:
  • We now allow users to immediately view Details of newly created Computed Tables without the need to wait for the rudimentary Profiling Operation to complete.
  • Optimized DFS File Reading:
  • Streamlined file reading in Distributed File Systems by storing and utilizing the ‘file_format’ identified during the Catalog operation. This change will result in significantly more efficient operations, especially for partitioned file types.
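The distinction between the “Starting Threshold” and “Limit” settings mentioned above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's logic: the `select_records` function is hypothetical, and the sketch assumes that records strictly after the starting value are evaluated and that the limit caps how many of them are read.

```python
def select_records(records, incremental_field, starting_threshold=None, limit=None):
    """Pick the records an incremental scan would evaluate (illustrative only)."""
    ordered = sorted(records, key=lambda r: r[incremental_field])
    if starting_threshold is not None:
        # Assumption: only records strictly after the threshold are evaluated.
        ordered = [r for r in ordered if r[incremental_field] > starting_threshold]
    if limit is not None:
        # The limit caps the number of records, independent of where the scan starts.
        ordered = ordered[:limit]
    return ordered


# Ten hypothetical records whose incremental field is "id".
rows = [{"id": i} for i in range(1, 11)]
selected = select_records(rows, "id", starting_threshold=6, limit=3)
selected_ids = [r["id"] for r in selected]
```

Here the threshold decides where evaluation begins (after `id` 6) and the limit decides how many records are taken from that point (three).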

General Fixes

  • Repetitive Release Notification and Live Update Fixes:
  • Under certain circumstances, our release notification would annoyingly continue popping up after it had been dismissed. We have corrected this behavior.
  • Resolved DFS Reading Issues with Special Character Headers:
  • Fixed a DFS reading issue where columns with headers containing special characters (like pipes |) adversely affected field profiling, including inaccuracies in histogram generation.
  • Corrected Insights Metrics for Check Templates:
  • Fixed an issue where check templates were incorrectly counted as checks in related metrics and counts on the Insights page. Templates are now appropriately filtered out, ensuring accurate representation of check-related data.
  • Enabled Template Creation with Calculated Rules:
  • Resolved a limitation that prevented the creation of templates using calculated rules like ‘Satisfies Expression’ and ‘Aggregation Comparison’. This fix expands the capabilities and flexibility of template creation.
  • Preventing Unrelated Entity Selection in Check Form:
  • Fixed an issue in the Check Form where users could inadvertently select unrelated entities. Selecting datastores, containers, and fields is now restricted while data is loading, preventing mismatched entity selections.
  • Performance enhancements for BigQuery and Snowflake, removing the need for count operations during full table analysis.
  • Linkable Scan Results for Direct Access:
  • Made Scan Results dialogs accessible via direct URL links, addressing previous issues with broken anomaly notification links. This enhancement provides users with a straightforward path to detailed scan outcomes.
  • Property Display Refinement for Various Field Types:
  • Corrected illogical property displays for specific field types like Date/Timestamp. The system now intelligently displays only properties relevant to the selected data type, eliminating inappropriate options. This update also includes renaming ‘Declared Type’ to ‘Inferred Type’ and adjusting the logic for accurate representation.
  • Timezone Consistency in Insights and Activity Pages:
  • Implemented improvements in timezone handling across Insights and Activity pages. These changes ensure that date aggregations are accurately aligned with the user’s local time, eliminating previous inconsistencies compared to the Operations list results.
  • Fixed breadcrumb display in the datastore for members with restricted permissions
  • Enhanced the datastore interface to address issues faced by members with limited permissions. This update also fixes misleading breadcrumb displays and ensures that correct datastore enhancement information is visible.
  • Resolved State Issue in Bulk Check Archive:
  • Addressed a bug in the bulk selection process for archiving checks. The fix corrects an issue where the system recognized individual selections instead of the intended group selection due to an overlooked edge case.
  • Improved Operation Modal State Management:
  • Tackled state management inconsistencies in Operation Modals. Fixes include resetting the remediation strategy to its default and ensuring ‘include’ options do not erroneously carry over previous states.
  • Eliminating Infinite Load for Non-Admin Enrichment Editing:
  • Solved a persistent loading issue in the Enrichment form for non-admin users. Updates ensure a smoother, error-free interaction for these users, improving accessibility and functionality.
  • External Scan Rollup Threshold Correction:
  • Fixed an issue in external scans where the rollup threshold was not applied as intended. This correction ensures that anomalies exceeding the threshold are now accurately consolidated into a single shape anomaly, rather than being reported as multiple individual record anomalies.
  • Corrected Field Input Logic in Check & Template Forms:
  • Addressed a logic error that incorrectly disabled field inputs for certain rules in check and template forms. This correction re-enables the necessary field input, removing a significant barrier that previously prevented users from creating checks affected by this issue.
  • Addressed Absence of Feedback for No-Match Field Filters on Explore Page:
  • Rectified the absence of feedback when field filters on the Explore Page yield no results, ensuring users receive a clear message indicating no items match the specified filter criteria.
  • General Fixes and Improvements
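The timezone fix listed above concerns which calendar day an event is counted under. The following Python sketch shows why the same UTC timestamps can aggregate differently once they are converted to a user's local time. The `count_by_local_date` helper, the sample timestamps, and the fixed UTC-5 offset are hypothetical and do not reflect how the Insights and Activity pages are implemented.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_by_local_date(utc_timestamps, local_tz):
    """Count events per calendar date as seen in local_tz."""
    return Counter(ts.astimezone(local_tz).date().isoformat() for ts in utc_timestamps)


# Two hypothetical events on the same UTC day.
events = [
    datetime(2024, 2, 15, 1, 30, tzinfo=timezone.utc),
    datetime(2024, 2, 15, 13, 0, tzinfo=timezone.utc),
]

user_tz = timezone(timedelta(hours=-5))  # hypothetical fixed UTC-5 offset

by_utc = count_by_local_date(events, timezone.utc)
by_local = count_by_local_date(events, user_tz)
```

Grouped by UTC, both events fall on 2024-02-15. Grouped by the UTC-5 offset, the first event moves to the evening of 2024-02-14, which is the kind of discrepancy that appears when two pages aggregate in different timezones.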

As usual, our User Guide and accompanying Change Log capture more details about this release.
