Silos, silos everywhere. No, I’m not talking about grain silos. Or missile silos. Or even organizational silos. I’m talking about data silos. You know, those pockets of data in an enterprise that only one group owns and can access, and that most of the rest of the organization has no idea even exist.
Silos are a remnant of a time when business units or teams had control of their own data and infrastructure and produced or collected the data that they needed to accomplish their tasks and achieve their goals. However, as organizations changed, teams kept their fiefdoms and never relinquished the data that they control.
With tight margins and little room for error, pharmaceutical companies are moving to a state in which every decision has to be optimally informed. Whether it’s on the front lines of molecular discovery or in sales and distribution, having access to data and information that impacts these decisions is crucial.
Data democratization means that anyone who has a valid use case and decision point should have access to the data and/or information that they require to make that decision. It means having access to the right data at the right time in the right form, while respecting the enterprise constructs around privacy, security, and access. When those constructs get lost, the result is “data anarchy.”
Why haven’t organizations pressed toward democratization? Why don’t most organizations want to unencumber their data and make it available to the larger enterprise? Fear of data anarchy is one reason, but it’s not the principal reason. The biggest resistance is corporate inertia and the amount of work that goes into developing a data democracy in any organization, especially in pharma.
Within this inertia, three major factors preclude the democratization of enterprise data: organizational governance, data governance, and technological infrastructure.
Organizational governance is the first major step toward data democracy. And it’s usually the most difficult. Organizational governance means putting in place standards and requirements for creating, managing, storing, and sharing data throughout the organization. Although it may sound simple, the organizational agreement and change management that this effort requires are usually the most time-consuming part of the process.
To get to democratization, it’s crucial to get all of the data stakeholders together to agree on use cases and intended uses of data and to generate standards and policies about data ownership, curation, and stewardship. Models for this agreement range from complete centralization and control to distributed ownership and community access. The right answer is specific to the organization that is building the data democracy, because its data needs dictate its organizational governance.
Data governance is the complement of organizational governance. Organizational governance looks at the people and processes involved with data management. Data governance focuses on the rules and standards around the data itself, from how it’s generated to how it’s stored and consumed. Again, this process requires data stakeholders to determine the use cases and intended uses of the enterprise’s data and then determine how that data can and should be used.
Although not as potentially painful in terms of interpersonal engagements, data governance can be exceptionally difficult from a volume, variety, and duplication perspective. Even though it seems daunting, data governance can be done incrementally by focusing on priority use cases that are aligned with operational needs and projects, and it can follow an xOps or agile paradigm.
Technological infrastructure is typically the easiest part of the process to deal with, but it can be the most expensive. Technology that supports democratization generally falls into one of two categories: governance or access. Governance tools generally hold the metadata, the rules about the data, that are established in the two governance processes. Access is where the implementation of governance comes into play. Any paradigm, from data ponds to lake houses to virtual lakes, can be implemented, as long as the technology enables and empowers both organizational and data governance.
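To make the split between governance and access a little more concrete, here is a minimal sketch in Python. It is not any particular product’s API. Every name in it (`DatasetPolicy`, `GovernanceCatalog`, `can_access`, the dataset and use-case labels) is hypothetical and chosen only for illustration. The catalog plays the role of a governance tool that holds the rules about the data, and the `can_access` check stands in for the access layer that enforces those rules.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetPolicy:
    """Governance metadata for one dataset (all fields are hypothetical)."""
    owner: str                     # steward agreed on in organizational governance
    classification: str            # sensitivity label set by data governance
    approved_use_cases: frozenset  # use cases the stakeholders signed off on


@dataclass
class GovernanceCatalog:
    """Holds the rules about the data, keyed by dataset name."""
    policies: dict = field(default_factory=dict)

    def register(self, dataset: str, policy: DatasetPolicy) -> None:
        self.policies[dataset] = policy

    def can_access(self, dataset: str, use_case: str, clearances: set) -> bool:
        """Access-layer check: deny unknown datasets, then apply the policy."""
        policy = self.policies.get(dataset)
        if policy is None:
            return False  # data with no governance record is not served
        return (use_case in policy.approved_use_cases
                and policy.classification in clearances)


# Illustrative registration; the names and labels are made up.
catalog = GovernanceCatalog()
catalog.register(
    "assay_results",
    DatasetPolicy(
        owner="discovery-team",
        classification="internal",
        approved_use_cases=frozenset({"molecular-discovery"}),
    ),
)
```

In this sketch, a request is granted only when the use case has been approved for the dataset and the requester is cleared for its classification. Anything the catalog doesn’t know about is denied by default, which is one simple way to keep democratization from sliding into data anarchy.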
Infrastructure that supports democratization can span multiple tenants and range from on premises to public cloud. A lot of thought must be given to the architecture chosen so that it meets organizational needs and strategies while remaining cost effective and achievable. This kind of finesse represents the difference between a true data democracy with pristine data and a data swamp that does not support any organizational outcomes. Unfortunately, the latter is how many organizations feel about their democratization efforts.
A simple way to look at the infrastructure is to define a data fabric, a set of capabilities that manage and store data consistently and effectively across the entire data estate. NetApp® ONTAP® data management software does just that, allowing seamless management of data on premises or in the cloud. A data fabric powered by NetApp optimizes the storage of files, blocks, and objects. Tools like the NetApp StorageGRID® object-based storage solution and NetApp Cloud Volumes ONTAP help ensure that data is where it needs to be, when it needs to be there. The infrastructure must also be considered holistically and robustly. NetApp delivers industry-leading data protection with Cloud Secure, data management with Cloud Insights, and capabilities for data science production work streams with the Data Science Toolkit.
In pharma, where a holistic perspective is a distinct advantage and necessary to accelerate the business, a data democracy provides the foundation for insights and decision making. Done right, it should be the trusted source of truth for the entire enterprise, eliminating duplication of work, errors, and other incidents that lengthen time to market.