Reporting On A Pipeline

Reporting on a pipeline means monitoring what is happening in the pipeline, how it is performing and any errors that occur. This helps to identify issues as far upstream as possible, so that they can be stopped before they move downstream and reach the end user. Noticing issues late leads to inaccurate data, lost trust from stakeholders and lost time.

What to Monitor?

There are many things we can report on to maintain the accuracy of a pipeline, and each business will have its own requirements and priorities. Error logs record which errors have occurred and when. A persistent error is a signal for engineers to investigate its root cause and review that part of the pipeline. Monitoring errors lets engineers confirm that pipelines are running as intended and demonstrate accuracy to other areas of the business. Encouraging staff to record how they resolved an error, and how long it took, is also invaluable for future troubleshooting.
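As a rough illustration of spotting a persistent error, the sketch below counts how often the same error appears for the same stage in a list of log entries. The entry fields (`stage`, `error`) and the threshold of three occurrences are assumptions made for the example, not a standard log format.

```python
from collections import Counter


def persistent_errors(log_entries, min_occurrences=3):
    """Return {(stage, error): count} for errors seen at least min_occurrences times.

    Each log entry is assumed to be a dict with hypothetical keys
    "stage" and "error".
    """
    counts = Counter((entry["stage"], entry["error"]) for entry in log_entries)
    return {key: n for key, n in counts.items() if n >= min_occurrences}


# Illustrative log entries: one error repeats, one is a one-off.
entries = [
    {"stage": "load", "error": "timeout"},
    {"stage": "load", "error": "timeout"},
    {"stage": "extract", "error": "auth failed"},
    {"stage": "load", "error": "timeout"},
]
repeated = persistent_errors(entries)
```

Here only the repeated `load` timeout is returned, which is the kind of error worth a root cause investigation.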

Monitoring the latest runtime of each stage of a pipeline and the most recent update of each dataset ensures data freshness. Column properties worth monitoring to check that data is being processed as expected include null rates, uniqueness, allowed values and allowed value ranges. The volume of data moving through a pipeline is a strong sense check of whether it is working as expected. Downtime of the systems that rely on a pipeline is critical to many businesses in its own right, and it can also flag potential errors. Anomaly detection is another useful way to alert engineers to potential issues within a pipeline.
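The column checks described above can be sketched in a few lines of plain Python. This is a minimal example, assuming rows arrive as a list of dicts; the function name `column_report` and the sample columns are hypothetical, and a real pipeline would more likely run equivalent checks in its own tooling.

```python
def column_report(rows, column, allowed=None, min_value=None, max_value=None):
    """Summarise null rate, uniqueness and invalid values for one column."""
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    total = len(values)

    def is_invalid(value):
        # A value is invalid if it breaks any rule that was supplied.
        if allowed is not None and value not in allowed:
            return True
        if min_value is not None and value < min_value:
            return True
        if max_value is not None and value > max_value:
            return True
        return False

    return {
        "row_count": total,  # volume check
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "is_unique": len(set(non_null)) == len(non_null),
        "invalid_values": [v for v in non_null if is_invalid(v)],
    }


# Illustrative data only.
rows = [
    {"id": 1, "status": "open", "amount": 10},
    {"id": 2, "status": "closed", "amount": -5},
    {"id": 2, "status": None, "amount": 20},
    {"id": 4, "status": "pending", "amount": 30},
]
id_report = column_report(rows, "id")
status_report = column_report(rows, "status", allowed={"open", "closed"})
amount_report = column_report(rows, "amount", min_value=0)
```

Each report can then be compared against agreed thresholds, for example a maximum acceptable null rate per column.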

Business validation checks are a reporting practice not to be overlooked. Sense checks, or comparing figures from a new pipeline against an established single source of truth, improve the reliability of a pipeline.
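One way to compare a new pipeline against a trusted source is a simple reconciliation with a tolerance. The sketch below is illustrative: the monthly keys, the figures and the 1% relative tolerance are all assumptions, and the right tolerance is a business decision.

```python
import math


def reconcile(new_totals, trusted_totals, rel_tolerance=0.01):
    """Return {key: (new, trusted)} for figures that disagree or are missing."""
    mismatches = {}
    for key in sorted(set(new_totals) | set(trusted_totals)):
        new = new_totals.get(key)
        trusted = trusted_totals.get(key)
        if new is None or trusted is None:
            # Present on one side only: always worth reporting.
            mismatches[key] = (new, trusted)
        elif not math.isclose(new, trusted, rel_tol=rel_tolerance):
            mismatches[key] = (new, trusted)
    return mismatches


# Hypothetical monthly revenue figures.
new_pipeline = {"2024-01": 1000.0, "2024-02": 1500.0, "2024-03": 900.0}
source_of_truth = {"2024-01": 1005.0, "2024-02": 1200.0}
differences = reconcile(new_pipeline, source_of_truth)
```

January passes because it is within 1% of the trusted figure, while February and the unmatched March are reported.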

Monitoring for Security

Engineers can monitor a pipeline not just to ensure its accuracy, but also for security purposes. Monitoring access - who is accessing data, what they are accessing, where they are accessing it from and what new access is being granted - acts as a flag for potential security breaches. Monitoring users' access patterns also lets engineers identify concerning activity, for example a user accessing databases in an unrelated part of the business.
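A very simple version of access pattern monitoring is to compare each access event with the databases a user is expected to use. The event fields and the `expected` mapping below are hypothetical; in practice they would come from audit logs and an access policy.

```python
def unusual_accesses(events, expected):
    """Return events where a user touched a database outside their expected set.

    events:   list of dicts with assumed keys "user", "database", "source_ip"
    expected: dict mapping user -> set of databases they normally use
    """
    flagged = []
    for event in events:
        allowed = expected.get(event["user"], set())
        if event["database"] not in allowed:
            flagged.append(event)
    return flagged


# Illustrative policy and audit events.
expected = {"amy": {"sales"}, "raj": {"finance", "sales"}}
events = [
    {"user": "amy", "database": "sales", "source_ip": "10.0.0.5"},
    {"user": "amy", "database": "hr", "source_ip": "10.0.0.5"},
    {"user": "zed", "database": "finance", "source_ip": "203.0.113.9"},
]
flagged = unusual_accesses(events, expected)
```

Users with no entry in `expected` are flagged on every access, which also surfaces newly granted access that has not yet been reviewed.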

Reports on resource use can be analysed to identify issues and potential security breaches in a pipeline. For example, a sudden surge in resource use might be an indication of a security breach.
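A sudden surge can be detected with a basic statistical check. The sketch below flags the latest reading if it sits more than a chosen number of standard deviations above recent history; the threshold of 3 and the sample figures are assumptions, and dedicated anomaly detection tools use more robust methods.

```python
import statistics


def is_spike(history, latest, threshold=3.0):
    """Return True if latest is far above the mean of recent history."""
    if len(history) < 2:
        return False  # not enough data to estimate variation
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # Perfectly flat history: any increase counts as a spike.
        return latest > mean
    return (latest - mean) / stdev > threshold


# Hypothetical daily CPU-hours for a pipeline.
recent_usage = [100, 102, 98, 101, 99]
surge = is_spike(recent_usage, 150)
normal = is_spike(recent_usage, 102)
```

The same check could be applied to other metrics that are reported as a regular series, such as daily cost.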

Billing – for example the cost of cloud services – is essential to monitor. Billing can be monitored by department to identify where costs are coming from and to keep them under control. Unexpected spikes in bills can be an indication of a security breach.

Tracking excess permissions supports the principle of least privilege (PoLP) – only granting users the privileges that they need to carry out their jobs. PoLP improves security and can save on costs. Some tools monitor for users holding permissions that they are not using, and can be set up to send alerts about these cases or to remove such permissions after a set time period.
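The idea of flagging unused permissions can be sketched as follows. The data structures, the permission names and the 90-day idle window are assumptions for illustration and do not reflect any particular tool.

```python
from datetime import datetime, timedelta


def unused_permissions(grants, last_used, now, max_idle_days=90):
    """Return (user, permission) pairs never used or idle longer than the window.

    grants:    iterable of (user, permission) pairs currently granted
    last_used: dict mapping (user, permission) -> datetime of last use
    """
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for user, permission in grants:
        used_at = last_used.get((user, permission))
        if used_at is None or used_at < cutoff:
            stale.append((user, permission))
    return stale


# Illustrative grants and usage history.
now = datetime(2025, 6, 1)
grants = [("amy", "read:sales"), ("amy", "write:sales"), ("raj", "read:finance")]
last_used = {
    ("amy", "read:sales"): datetime(2025, 5, 20),
    ("amy", "write:sales"): datetime(2024, 11, 1),
}
stale = unused_permissions(grants, last_used, now)
```

The resulting list could feed an alert or a review queue before any permission is actually revoked.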

Ways To Implement Reporting

Email alerts can be automated so that engineers are notified whenever a new error appears in the error logs.
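A minimal sketch of such an alert, using Python's standard `email` and `smtplib` modules, might look like this. The addresses, the log entry fields and the SMTP host are placeholders, and many teams would route alerts through their orchestration or monitoring tool instead.

```python
import smtplib
from email.message import EmailMessage


def build_alert(errors, sender, recipient):
    """Return an EmailMessage summarising errors, or None if there are none."""
    if not errors:
        return None
    message = EmailMessage()
    message["Subject"] = f"Pipeline alert: {len(errors)} error(s)"
    message["From"] = sender
    message["To"] = recipient
    # Error entries are assumed to have "time", "stage" and "error" keys.
    lines = [f"{e['time']} [{e['stage']}] {e['error']}" for e in errors]
    message.set_content("\n".join(lines))
    return message


def send_alert(message, host="localhost", port=25):
    """Send the alert. The host and port are placeholders for a real SMTP server."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(message)


# Placeholder addresses and a made-up error entry.
errors = [{"time": "2025-06-01 02:00", "stage": "load", "error": "timeout"}]
alert = build_alert(errors, "pipeline@example.com", "engineers@example.com")
```

Building the message separately from sending it keeps the formatting easy to test without a mail server.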

Testing is built into many tools. For example, dbt has native testing features that can be used to check the structure of datasets and to validate chunks of code. Airbyte provides logs which can be retrieved and analysed. pytest is a Python testing framework for testing parts of a project.
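To give a flavour of pytest, the sketch below tests a small, hypothetical transformation step called `deduplicate_orders`. pytest discovers functions whose names start with `test_` and reports any failing `assert`.

```python
def deduplicate_orders(rows):
    """Hypothetical pipeline step: keep the first row seen for each order_id."""
    seen = set()
    result = []
    for row in rows:
        if row["order_id"] not in seen:
            seen.add(row["order_id"])
            result.append(row)
    return result


def test_deduplicate_keeps_first_occurrence():
    rows = [
        {"order_id": 1, "amount": 10},
        {"order_id": 1, "amount": 99},
        {"order_id": 2, "amount": 20},
    ]
    assert deduplicate_orders(rows) == [
        {"order_id": 1, "amount": 10},
        {"order_id": 2, "amount": 20},
    ]


def test_deduplicate_handles_empty_input():
    assert deduplicate_orders([]) == []
```

Saved in a file such as `test_orders.py` (an assumed name), these tests can be run from the command line with `pytest`.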

A reporting dashboard gives a centralised overview of how a pipeline is performing. The most critical metrics can be highlighted on the dashboard, so that users can quickly find the information they need.

Author: Zoe Reed
Powered by The Information Lab
© 2025 The Information Lab