My Philosophy of Report Development

Process-Oriented vs. Data-Oriented Report Design

I think data professionals often approach report design from a data-oriented angle. In a data-oriented approach, the data analyst focuses on producing a report that matches its technical specifications as closely as possible. There's a focus on the facticity of the data, on the SQL being technically accurate.

Reports do need to be technically accurate, but it is critical to remember that reports are situated within business processes that actual human beings need to conduct, processes that have an impact on your customers, who are also actual human beings. As such, having data that perfectly matches the technical specifications of the report does not guarantee that the report is useful.

Here are a few reasons that data consumption's presence within a process can complicate report writing.

If data professionals do not adopt a process-oriented approach to report development, they may produce reports that rest on incorrect data integrity assumptions, or that their users don't trust. You could also call this approach outcome-oriented, since the report is built around a process that has a positive outcome for the business. The pithy way I think about this is that we often take a lot of time to develop the intelligence side of the business intelligence equation, with time spent developing sophisticated data warehouses and stored procedures, and we don't think enough about the business side of the equation.

I would say that the most critical skill I have developed as a data professional is not learning neat new tricks with SQL, but developing my customer service skills, my instinct for understanding processes, and my ability to ask questions that inform me about business processes. Certainly, my SQL chops help make sure that reports get delivered in a timely fashion, and that they are correct. But it's my people skills, the ability to coax out the true needs of the report, that have served me best. This leads me to my next point...

The Travails of Customer Service

Report writing is a customer service problem as much as a technical problem. I have written about customer service elsewhere. Customer service, in my mind, involves three interrelated problems, each of which applies well to report writing.

Report requestors are like customers anywhere. They don't always know what they want. When they do, it does not follow that they have the data literacy or general rhetorical skills needed to communicate those needs. And oftentimes, fulfilling the stated request does not actually satisfy the business requirements.

Some examples from other fields may help highlight this. Graphic designers often struggle with poorly communicated and evolving needs. I love reading listicles on weird graphic design requests. A favorite: "I really like the colour but can you change it." Another good one: "The sandwich needs to be more playful." What the hell does that mean? And did the client know how playful they wanted their sandwich when they commissioned their designer, or did this strange need only evolve later?

Too Many Stakeholders

But if a single person has a hard time figuring out what they need from a report, how can we ensure that a group of stakeholders gets what it needs from a report? The synthesis of our two problems above shows the true difficulty of satisfying report requests. Processes necessarily involve multiple people, and all of them will have complicated and self-contradictory ideas. Therefore, a report writer doesn't just need to be a customer service expert; they need to be a product manager and project manager capable of managing expectations and needs.

There are a few ways to ensure that these too many cooks don't spoil the broth. An approach I have seen work is pairing technically minded data analysts with a project manager capable of organizing meetings and drawing up Gantt charts. I would also propose that a good report intake process can help orient these multiple stakeholders from the get-go.

A Simple Example

A seemingly simple example from my recent work exemplifies the above. I was asked to troubleshoot a report that would return providers who resigned this month. Simple indeed! But the correct providers were not being returned. Let's sketch out some of the processes, stakeholders, and evolving needs of this project to figure out why this happened.

Stakeholders: This report would be presented to the heads of hospital departments looking to make staffing decisions for their part of the hospital. Knowing who resigned is critical to knowing how many new providers the hospital should try to hire. Because of this, the report had to be simple and straightforward. The heads of hospital departments, like the heads of departments anywhere, are busy people.

But there were two hidden stakeholders of this report, whose existence should help orient us to the processes that surround the report. The first of these stakeholders were the project leads who initially commissioned this resignation report at the beginning of my client's installation of my company's software. Prior to installation, they had identified how they would record a resigned provider.

Processes: Three processes have been mentioned thus far. Department heads would review the report to make staffing decisions. But two other unaccounted-for processes led to incorrect data being returned.

The first was the installation of the database itself, and the selection of the data elements that would be used to define a resignation. Our project leads had decided that the resignation date field for a provider would be used to determine when a provider resigned.

This seems simple enough, but the process of installation, just like the process of report requesting, often has evolving scope and evolving needs. This leads us to our final process, that of recording provider resignations. In the end, the medical staff office decided not to record resignation dates for providers, but to instead simplify their data entry field by only recording the committee review date for a provider. If a provider had a committee review date, and had a status of "resignation" on their application, then the assumption was that the committee review date was the resignation date. There was no need to double document by further specifying a resignation date. No resignation dates had been filled out since install.

Evolution of needs: The delight of report requests! Initially, the project lead had requested a report, and had been given a report, that used this resignation date field. But, upon installation, this field was never used. As such, they needed to have a report that looked to the new field.

A technically focused report writer may note that this was a "data flow issue," and that the report simply needed to be redirected to look at a new date field. They would be correct. And when the actual nature of resignation documentation was spelled out for me, the change to my code was quite simple. But no amount of SQL genius would allow you to identify the problem at hand, and deeming it a "data flow issue" hides the messy, beautiful process problems under the hood.
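To make the size of that change concrete, here is a minimal sketch of what such a fix might look like. It is not my client's actual code or schema: the table name provider_application, its column names, and the sample rows are all hypothetical, and I am using Python's built-in sqlite3 module only so the example runs on its own. The original query filters on the resignation date field that was specified at install; the revised query applies the medical staff office's rule that a status of "resignation" plus a committee review date stands in for the resignation date.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption
# made for illustration, not the client's actual data model.
SCHEMA = """
CREATE TABLE provider_application (
    provider_name         TEXT,
    status                TEXT,
    resignation_date      TEXT,  -- specified at install, never filled in
    committee_review_date TEXT   -- what the medical staff office records
);
"""

# Original report logic: filter on the dedicated resignation date field.
ORIGINAL_QUERY = """
SELECT provider_name
FROM provider_application
WHERE status = 'resignation'
  AND strftime('%Y-%m', resignation_date) = :month
ORDER BY provider_name;
"""

# Revised logic: for a record with status 'resignation', the committee
# review date is treated as the resignation date.
REVISED_QUERY = """
SELECT provider_name
FROM provider_application
WHERE status = 'resignation'
  AND strftime('%Y-%m', committee_review_date) = :month
ORDER BY provider_name;
"""


def resigned_providers(conn, query, month):
    """Return provider names matching the query for a 'YYYY-MM' month."""
    rows = conn.execute(query, {"month": month}).fetchall()
    return [name for (name,) in rows]


# Made-up sample rows, purely to exercise the two queries.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO provider_application VALUES (?, ?, ?, ?)",
    [
        ("Dr. Adams", "resignation", None, "2024-03-12"),
        ("Dr. Baker", "resignation", None, "2024-02-20"),
        ("Dr. Chen", "active", None, "2024-03-05"),
    ],
)

print(resigned_providers(conn, ORIGINAL_QUERY, "2024-03"))  # []
print(resigned_providers(conn, REVISED_QUERY, "2024-03"))   # ['Dr. Adams']
```

The two queries differ by a single column name, which is the point: the original query is perfectly valid SQL and runs without error. It simply returns nothing, because the field it reads was never populated, and nothing in the code itself tells you that.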

(For those who like database theory: I love that the medical staff office deemed a field irrelevant! They noted that resignation dates were dependent on the combination of two elements: the status of the record, and the committee review date. If you knew the provider had resigned, and the value for the committee review date, you knew the resignation date. As such, they removed an extraneous element, moving the database closer to 3NF.)

Conclusion

Quickly, here's what I think is most important when thinking about report writing: