Data Quality in Discovery and Systems Thinking
What do network discovery and data quality have to do with systems thinking? If you are now wondering what systems thinking is, and what the author is talking about anyway, take the time to read on. Today we delve into a slightly philosophical contemplation of relations in the large and in the small.
Network discovery tries to form a complete picture of the IT infrastructure of a company or department for the purposes of IT asset management (ITAM), IT documentation, and governance regarding security, license compliance and internal company guidelines. It does so retroactively. Retroactively? It is not exactly logical to need IT management solutions such as discovery or monitoring to find out which assets exist and to keep them running. The environment was set up, installed and taken into operation at some point, but more often than not it was only known at that point in time which hardware, software and configuration the system was composed of.
This knowledge has often been lost because it exists only as a service specification in an office document that now rots on the file share of a team that no longer exists. So discovery tries to regain an overview of all the systems in order to keep them manageable. This may be an extreme and harsh statement, but when you think about it, it is often true.
Would there have been an alternative to this "approach"? Well, yes: there is the concept of infrastructure as code, which is probably already in use in one corner or another of the IT infrastructure, especially in the public cloud. The basic idea is to treat infrastructure as code by automating its setup with tools such as Terraform or Ansible, and to keep the scripts and configuration files strictly under version control (e.g. in a Git repository). This way one can explicitly trace changes and control what is rolled out into the environment.
In such a case one would not necessarily have to discover what is in the environment, but could instead look into the deployment code and extract all the necessary information from it. The problem is that this code usually comes as declarative, tool-specific configuration, such as a Terraform configuration file (HCL) or an Ansible playbook (YAML).
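To illustrate what extracting information from the deployment could look like, here is a minimal Python sketch. It assumes the infrastructure is managed with Terraform and that the machine-readable output of `terraform show -json` is available. The embedded sample document, the resource names and the helper `list_resources` are hypothetical and only serve the illustration.

```python
import json

# Hypothetical, heavily shortened sample of `terraform show -json` output.
SAMPLE_STATE = """
{
  "format_version": "1.0",
  "values": {
    "root_module": {
      "resources": [
        {"address": "aws_instance.web", "type": "aws_instance", "name": "web",
         "values": {"instance_type": "t3.micro"}}
      ],
      "child_modules": [
        {"address": "module.db",
         "resources": [
           {"address": "module.db.aws_db_instance.main",
            "type": "aws_db_instance", "name": "main",
            "values": {"engine": "postgres"}}
         ]}
      ]
    }
  }
}
"""


def list_resources(module):
    """Yield (address, type) for every resource in a module and its children."""
    for resource in module.get("resources", []):
        yield resource["address"], resource["type"]
    for child in module.get("child_modules", []):
        yield from list_resources(child)


state = json.loads(SAMPLE_STATE)
inventory = list(list_resources(state["values"]["root_module"]))
for address, resource_type in inventory:
    print(f"{resource_type:20} {address}")
```

Even in this small sketch the result is a flat, tool-specific list of parts. It says nothing yet about who owns or uses them.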
The original developers understand that code, of course, but those responsible for ITAM, documentation or GRC (governance, risk and compliance) in the company unfortunately often do not. Theirs is a different perspective: not that of provisioning a system, but that of overarching transparency. Furthermore, there is usually no single company-wide repository, and no uniform structure or technical standard for infrastructure as code, that would make it easy for those IT management teams to use the information in their tools and systems. It is simply a different silo, so to speak; it is only part of the story.
Parts: that is exactly the problem here. Our IT infrastructure is, like all areas of science and business, extremely specialized. For every part of the infrastructure there is a separate group of experts, standards, technologies, and even its own nomenclature and language. This is called reductionism: the world is divided into smaller units that are easier to handle. Every unit is studied separately, and experts form specialized companies, industry groups and business areas that define optimal, or at least suitable, standards and formats for these specific units, develop suitable tools, and invent a suitable language that is then used in the documentation.
Divide et impera (Latin for "divide and conquer") is a suitable technique for tackling complexity in the details of a system. It is the predominant approach that we learn at school, that is taught and enforced at university, and that we continue to follow at work.
The result, to stay with IT infrastructure, is the heterogeneity and perceived complexity that we see every day. Every system is different, with different concepts, configuration, commands and software.
What does that have to do with network discovery? A discovery solution, itself a specialist area within the domain of IT management, which in turn is a discipline of enterprise software, and so on, is supposed to re-establish the lost overview of this heterogeneous environment, isn’t it?
Every discovery solution has a long list of vendors, technologies, platforms and solutions that it can detect, understand and collect details about. And that list will never be complete, so large and diverse are our IT environments.
Well, so there is a special type of software that collects and understands all the parts of our environment and provides them to higher-level functions. Job done, no problem, right?
Systems Thinking
At this point the concept of systems thinking comes into play, because every system, be it the human being, nature, the economy or the IT infrastructure, is more than the sum of its parts. It is more than just a collection of individual systems, applications, users and so on.
With the traditional approach of reductionism we do not get any closer to understanding a system. Why?
Let us look at an example. What good does it do to know which VMs there are in the company, which software is installed, which users use which system and which networks exist? What now? You can generate nice documentation that lists the parts and their details as well as the discovery information allows, and then you have yet another office document that gets lost and ignored on a file share.
What got lost in all that reductionism are the relations between the parts and their dynamics. Many hints and crumbs about these relations still exist in their respective silos, but these pieces need to be brought together again to regain a complete picture of the system that provides some value.
In the world of systems thinking, those relations are essential to understanding a system and may even be more important than the individual parts and their details.
Back to the example: suppose we know which VMs run on which host, which applications are deployed on them, which users use them, which files are actually used and which are not, which department manages each application, which networks connect the servers they run on, which connections between which components run over those networks, which users can access them and the shares on which the files are located, which IT components on which infrastructure are used by which services, plus further details such as the interfaces used. Then we can use this information for the management of IT.
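A minimal sketch in Python of what such relations make possible: once parts are stored together with their relations, a question like "what is affected if this host fails?" becomes a simple graph traversal. All object names, relation names and the function `impacted_by` are hypothetical, and the sketch assumes that every relation reads as "source depends on target".

```python
from collections import defaultdict

# Hypothetical discovery result: (source, relation, target) triples,
# each read as "source depends on target".
RELATIONS = [
    ("vm-web-01", "runs_on", "esx-host-a"),
    ("vm-db-01", "runs_on", "esx-host-a"),
    ("webshop", "deployed_on", "vm-web-01"),
    ("orders-db", "deployed_on", "vm-db-01"),
    ("webshop", "connects_to", "orders-db"),
    ("sales-dept", "manages", "webshop"),
]


def build_reverse_index(relations):
    """Map every target to the objects that directly depend on it."""
    dependents = defaultdict(set)
    for source, _relation, target in relations:
        dependents[target].add(source)
    return dependents


def impacted_by(failed, relations):
    """Return everything that transitively depends on `failed`."""
    dependents = build_reverse_index(relations)
    impacted, todo = set(), [failed]
    while todo:
        current = todo.pop()
        for dependent in dependents[current]:
            if dependent not in impacted:
                impacted.add(dependent)
                todo.append(dependent)
    return impacted


impact = impacted_by("esx-host-a", RELATIONS)
print(sorted(impact))
```

A plain list of VMs, applications and departments could not answer this question. The answer comes entirely from the relations.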
If we finally also capture the changes in the systems and the differences between areas, we can start to understand the IT environment as a system and start building tools that make statements about it. From there we can derive decisions about changing settings and the like in the system, because we finally understand the relations.
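Capturing changes can be sketched just as simply. The following Python example assumes that each discovery run yields a set of relation triples and compares two such runs. The snapshot contents and the function `diff_snapshots` are hypothetical.

```python
def diff_snapshots(old, new):
    """Compare two sets of relation triples and report what changed."""
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
    }


# Two hypothetical discovery runs on consecutive days.
monday = {
    ("vm-web-01", "runs_on", "esx-host-a"),
    ("webshop", "deployed_on", "vm-web-01"),
}
tuesday = {
    ("vm-web-01", "runs_on", "esx-host-b"),
    ("webshop", "deployed_on", "vm-web-01"),
}

changes = diff_snapshots(monday, tuesday)
for kind, triples in changes.items():
    for triple in triples:
        print(kind, triple)
```

In this sketch the move of a VM to another host shows up as one removed and one added relation, while the parts themselves stay the same.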
The idea of systems thinking is by no means new: it was coined as early as 1956 by Prof. Jay W. Forrester of the System Dynamics Group at the Sloan School of Management at MIT. It is a way of thinking for analyzing and predicting the behaviour of the complex systems that surround us. Still, it has not found widespread use in the design of complex systems, or at least so it seems. Yet replacing linear thinking with thinking in self-reinforcing feedback loops and the like is critical.
Data Management
At the very start of the chain of system understanding is the quality of the data, which in turn is currently provided by network discovery. This quality covers not only the timeliness, accuracy, completeness, consistency, validity and uniqueness of the individual parts (see the article https://blog.jdisc.com/2020/12/18/data-quality-in-discovery/), but also the quality of the relations in the object model delivered by discovery.
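What the quality of relations could mean in practice can be sketched with a few simple checks in Python. The example assumes an object model made of discovered objects and relation triples. The data, the three checks and the function `check_relations` are hypothetical and by no means a complete quality model.

```python
from collections import Counter

# Hypothetical discovered objects and relations.
OBJECTS = {"vm-web-01", "esx-host-a", "webshop"}
RELATIONS = [
    ("vm-web-01", "runs_on", "esx-host-a"),
    ("webshop", "deployed_on", "vm-web-01"),
    ("webshop", "deployed_on", "vm-web-01"),  # reported twice
    ("webshop", "connects_to", "orders-db"),  # target was never discovered
]


def check_relations(objects, relations):
    """Run three simple quality checks on the relations of an object model."""
    counts = Counter(relations)
    # Uniqueness: the same relation should not be reported more than once.
    duplicates = sorted(r for r, n in counts.items() if n > 1)
    # Validity: both ends of a relation must be known objects.
    dangling = sorted(
        r for r in counts if r[0] not in objects or r[2] not in objects
    )
    # Completeness: an object without any relation is a hint of a gap.
    related = {end for s, _, t in relations for end in (s, t)}
    isolated = sorted(objects - related)
    return {"duplicates": duplicates, "dangling": dangling, "isolated": isolated}


report = check_relations(OBJECTS, RELATIONS)
for check, findings in report.items():
    print(check, findings)
```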
This information is the basis of a configuration management database (CMDB), which in turn is the basis of IT service management (ITSM) applications. It becomes obvious that IT management systems are built on their underlying components and rely on them in order to form a reliable IT management and governance system. The same is true for data. Data is even more important: software may offer useful functionality, but according to the GIGO principle (garbage in, garbage out) it all achieves nothing when the data it uses is of bad quality.
So a network discovery solution needs to synthesize the discovered data into information by combining the details of individual parts and relating them to each other. Focusing on this synthesis is an important aspect of systems thinking.
According to the DIKW pyramid (data, information, knowledge, wisdom), individual pieces of data are brought into relation and into context so that they become information. Interpreting the information generates knowledge, and deriving strategies from it is then the task of the IT management and IT governance systems that process that information.
Conclusion
Discovery information is a fundamental asset that is crucial in this area too and needs more attention: it needs data management. Data management means treating data and information as valuable assets and giving more thought to data quality across individual silos.
This is a trend in many areas, where it is acknowledged that one can extend business models into data-driven business models as well as optimize internal processes in order to save costs.
Network discovery is one of the important pillars for the second part, because it already collects cross-silo information and thus delivers data synthesis in the sense of systems thinking.
Finally, let us conclude that looking at the overall IT management system, with special consideration of the data and information, their details and their relations, is fundamental for a modern, efficient enterprise IT architecture. IT architecture should not be built on sand; it is better to invest in the quality of network discovery. It is a long-term investment, and with the right tools that take the right perspective, such as JDisc Discovery, it is not too expensive and ultimately worth the effort.