2.1 Definition of terms used - From the point of view of Integral Security
Based on the analysis of sources [1-6,8-76], the following is a summary of the knowledge and references related to the terms used and their definitions, as applied in the presented dissertation.
2.1.1 Assets
An asset is understood as a physical, logical, or cyber item that defines the structure and behavior of the monitored system [6]. The results of the works [1,4] provide lists of identified assets of the model metro station and the control system of the Prague metro (i.e., people, property including technologies, energy, information, and material flows), primarily based on the analysis of metro documentation [1,4,10,11]. Since this is an open system of systems, it is necessary to consider, in addition to the technical parts discussed in [1,4], also other aspects, such as organizational, financial, functional, logical connections, and others. For the purposes of further analyses, we consider the following groups of assets: construction, technology, personnel, locations, functions, connections and flows, organization, and economy.
2.1.2 Disasters
The causes of risks are disasters of all kinds, which is the basis of the so-called All-Hazard-Approach [12]; in technological systems, they also include failure states caused by random or systematic errors of the system [4,13]. The occurrence of a single extreme disaster can trigger a chain of further disasters, i.e., secondary impacts or even a whole cascade of impacts. Secondary, tertiary, and higher-order impacts are referred to as indirect impacts. Figure 1 shows how the impacts of an extreme disaster on individual protected assets trigger further impacts on other assets, i.e., indirect impacts that take the form of cascades (the cascade effect).
Figure 1: Effects of extreme disasters on public assets [14].
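The cascade of direct and indirect impacts can be illustrated with a small sketch. It assumes that the dependencies between assets are available as a directed graph; the function name `impact_orders`, the asset names, and the dependency data are hypothetical and are not taken from [14].

```python
from collections import deque


def impact_orders(dependencies, directly_hit):
    """Return {asset: order}, where order 1 is a direct impact,
    2 a secondary impact, 3 a tertiary impact, and so on.

    dependencies maps an asset to the assets that depend on it
    (hypothetical input format).
    """
    order = {asset: 1 for asset in directly_hit}
    queue = deque(directly_hit)
    while queue:
        asset = queue.popleft()
        for dependent in dependencies.get(asset, []):
            if dependent not in order:  # keep the lowest order found
                order[dependent] = order[asset] + 1
                queue.append(dependent)
    return order


# Hypothetical dependency data for illustration only.
dependencies = {
    "power supply": ["ventilation", "signalling"],
    "signalling": ["train operation"],
    "ventilation": ["passengers"],
    "train operation": ["passengers"],
}
orders = impact_orders(dependencies, ["power supply"])
# Every asset with order > 1 suffers an indirect impact.
indirect = sorted(a for a, o in orders.items() if o > 1)
```

A breadth-first traversal is used so that each asset is assigned the lowest order at which the cascade reaches it.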
According to the size of damages and losses on public assets and the probability of occurrence, i.e., based on risk analysis and evaluation using the risk matrix method according to [13], disasters are divided in safety management into three categories:
Critical Disasters: can trigger a critical situation in the monitored area or part of it, where, according to current Czech legislation, a crisis situation may be declared, and therefore, asset recovery after the crisis situation will be required. From a safety management perspective, preventive and mitigating measures should be implemented in land use planning, design, construction, and operation of civil and technological buildings and infrastructure.
Specific Disasters: can trigger emergency situations, and therefore, response and preparedness (mitigation measures) must take them into account. From a safety management perspective, preventive measures should be implemented in land use planning, design, construction, and operation of civil and technological buildings and infrastructure, as well as mitigating measures as part of preparedness for responses.
Relevant Disasters: all other disasters that may affect an entity and are neither critical nor specific. They should be managed using standard conventional means, i.e., prevention applied in practice. From a safety management perspective, the existing measures applied in land use planning, design, construction, and operation of civil and technological buildings and infrastructure are sufficient, and therefore, only regular checks of their effectiveness are necessary.
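The categorization above can be sketched as a lookup in a risk matrix. The risk matrix of [13] is not reproduced here, so the 5 x 5 scale, the product score, and both thresholds in the sketch are assumptions chosen only for illustration.

```python
def categorize_disaster(probability_level, impact_level,
                        specific_from=8, critical_from=15):
    """Classify a disaster as 'relevant', 'specific', or 'critical'.

    Both levels are integers on an assumed 1..5 scale. The score is
    their product; the thresholds are hypothetical, not those of [13].
    """
    for level in (probability_level, impact_level):
        if not 1 <= level <= 5:
            raise ValueError("levels must lie between 1 and 5")
    score = probability_level * impact_level
    if score >= critical_from:
        return "critical"
    if score >= specific_from:
        return "specific"
    return "relevant"


category = categorize_disaster(probability_level=2, impact_level=4)
```

With the assumed thresholds, a score of 8 falls into the specific category; in a real application, the thresholds would be taken from the risk matrix used by the responsible authority.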
For the purposes of the presented dissertation, the following disasters, identified in [4] through the analysis of archival documents from the capital city of Prague [15], were used:
Results of processes occurring inside and outside the Earth: flood, hurricane, earthquake, liquefaction of the subsoil, gas eruption to the Earth’s surface.
Results of processes in the human body, human behavior, and processes in human society: epidemic, pandemic, collapse of societal stability, attack, terrorist attack, attack using chemical, nuclear, radiological, and biological (CBRNE) weapons, armed conflict, war.
Results of processes and activities installed by humans: industrial accident, accident during transportation or storage of hazardous substances, transportation accident, disaster in critical infrastructure, economic disaster, disaster in territorial infrastructure, disaster in cyber infrastructure, disaster in service infrastructure, supply and communication, technology failure, loss of operability.
Interaction between the Earth and the environment on human activities: violation of subsoil stability due to vibrations, air contamination, water contamination, rapid climate variations, migration of large groups of people.
Internal dependencies in the human system, whether natural or man-made: organizational failure, disruption of material and product flows, disruption of energy flows, disruption of information flows.
Table 1 classifies the disasters relevant to the capital city of Prague into these categories. The classification follows the All-Hazard-Approach [12,13] and uses the data in [15]; details are given in [4,12,13,16].
Table 1: Distribution of disasters – Relevant, Specific, Critical.
Relevant
Specific
Critical
Results of processes occurring inside and outside the Earth
Flood
yes
yes
yes
Hurricane
yes
yes
Earthquake
yes
Liquefaction of the subsoil
yes
yes
yes
Gas eruption to the Earth’s surface
yes
Results of processes in the human body, human behavior, and processes in human society
Epidemic
yes
yes
yes
Pandemic
yes
yes
yes
Collapse of societal stability
yes
yes
Criminality
yes
yes
Attack
yes
yes
Terrorist attack
yes
yes
yes
Attack using chemical, nuclear, radiological, and biological (CBRNE) weapons
yes
yes
yes
Armed conflict
yes
yes
yes
War
yes
yes
yes
Results of processes and activities installed by humans
Industrial accident
yes
Accident during transportation or storage of hazardous substances
yes
Transportation accident
yes
yes
yes
Disaster in critical infrastructure
yes
yes
Disaster in the economy
yes
Disaster in territorial infrastructure
yes
Disaster in cyber infrastructure
yes
yes
Disaster in service infrastructure, supply, and communication
yes
Technology failure
yes
yes
yes
Loss of operability
yes
Interaction between the Earth and the environment on human activities
Violation of subsoil stability due to vibrations
yes
yes
yes
Air contamination
yes
yes
Water contamination
yes
yes
Rapid climate variations
yes
Migration of large groups of people
yes
Internal dependencies in the human system, whether natural or man-made
Organizational failure
yes
yes
yes
Failure of material and product flows
yes
Failure of energy flows
yes
yes
yes
Failure of information flows
yes
yes
yes
2.1.3 Risk and Criticality
The concept of risk has different and non-unified interpretations across various fields; some definitions of risk are based on probability, while others are based on expected value or uncertainty and indeterminacy [17]. In the context of project management and control systems, risk has generally been defined as the “effect of uncertainty” [18]. The effect of uncertainty, when realized, can take on both negative and positive characteristics (i.e., opportunities) [18].
In engineering disciplines such as system risk management, reliability management, and safety risk management, risk expresses the probable magnitude of unacceptable (i.e., undesirable) impacts (losses, damages, and harm) that a catastrophe of a given threat size (i.e., the normatively determined potential of the catastrophe) has on protected interests (assets) over a specified time interval at a given location [17].
The sources of these risks are the disasters (catastrophes) described in Section 2.1.2. They pose risks to people, their property, the environment, critical infrastructure, and, last but not least, the state. Risks can be categorized by the protected assets considered: partial risk relates to a single protected asset, integrated risk to a set of protected assets, and complex (integral) risk to a set of protected assets together with the interconnections and flows between them.
Additionally, risks are divided based on the catastrophes or sources of catastrophes that are taken into account (only some catastrophes, part of their scenarios, or all relevant catastrophes, etc.).
In practice, especially for transport systems, risks are usually treated as partial or integrated risks and expressed as the product of two factors: the probability (or frequency) of occurrence of a catastrophe (incident or failure) and the magnitude of its impacts (losses, damages, harm) on the monitored entity or selected set of entities. Many quantities are used to calculate risk depending on the monitored area, but the result is usually the product of these two factors. More detailed studies also consider the vulnerability and sometimes the controllability of a harmful event, for example in the automotive industry [19].
Definitions of risk (R) differ considerably; what they have in common is that risk arises from concern about an uncertain future. Formulations found in the literature include [5,17]:
R = frequency ∙ consequences;
R = severity ∙ likelihood of occurrence;
R = threat ∙ vulnerability;
R = threat ∙ vulnerability ∙ impacts;
R = threat ∙ vulnerability / capacities;
R = (threat ∙ vulnerability) / countermeasures ∙ impacts;
R = f (threat ∙ vulnerability / capacities);
R = f (assets (protected interest) ∙ threat ∙ vulnerability);
R = frequency ∙ population ∙ vulnerability.
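Two of the formulations listed above can be written as simple functions. The sketch is illustrative only: the function names, the units (events per year, monetary loss), and the numerical values are assumptions and do not come from [5,17].

```python
def risk_frequency_consequences(frequency_per_year, consequences):
    """R = frequency * consequences (expected loss per year)."""
    if frequency_per_year < 0 or consequences < 0:
        raise ValueError("frequency and consequences must be non-negative")
    return frequency_per_year * consequences


def risk_threat_vulnerability_capacities(threat, vulnerability, capacities):
    """R = threat * vulnerability / capacities."""
    if capacities <= 0:
        raise ValueError("capacities must be positive")
    return threat * vulnerability / capacities


# Hypothetical values: one event per 50 years, loss of 5 million per event.
annual_risk = risk_frequency_consequences(1 / 50, 5_000_000)

# Doubling the coping capacities halves the risk in this formulation.
base = risk_threat_vulnerability_capacities(0.8, 0.5, 1.0)
improved = risk_threat_vulnerability_capacities(0.8, 0.5, 2.0)
```

The formulations are not interchangeable: their inputs are measured on different scales, so results from different formulas should not be compared directly.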
To ensure a safe area, or larger technological units or systems, it is necessary to consider complex risk, i.e., integral risk based on a system view of reality [2]. Integral risk includes multiple protected assets, including life, health, and safety of people, property, public welfare, the environment, and technologies and infrastructure, and also includes the influence of connections between the listed protected assets (known as interdependencies) [4,17].
Integral risk, denoted as R, for all catastrophes in a given area is expressed by the following relationship [17]:
(1)
Rk expresses the risk for the k-th catastrophe:
(2)
Pk represents the probability of occurrence of the k-th catastrophe, and Di,k is the impact of the k-th catastrophe on the i-th protected interest. Similar relationships apply to integrated risk; the difference is that, for integral risk, the impacts Di,k include not only the direct impacts DDi,k but also the indirect (secondary, tertiary, and further) impacts DIi,k, which according to [19] are given as follows:
(3)
Vi is the value of the protected interest, S is the monitored area or object, Zi,k is the vulnerability of the i-th protected interest for the k-th catastrophe, and Ii,k is the function of interdependencies. The interdependencies depend on the specific structure of protected interests in the area and the specific connections between protected interests and the catastrophe, i.e., according to [17]:
(4)
VDk is the characteristic of the degree of the k-th catastrophe that affects the impacts on protected assets. VPi,k is the characteristic of the degree of mutual connectivity of protected interests in the given area. Determining VPi,k is the subject of detailed research based on Boolean logic or, for more complex dependencies, based on operational analysis methods [17,19,20].
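The difference between integrated and integral risk can be illustrated numerically. The sketch below assumes a simple additive reading of the symbols defined above: the risk of the k-th catastrophe is Pk multiplied by the sum of the impacts Di,k over all protected interests, the total risk is the sum over all catastrophes, and Di,k = DDi,k + DIi,k for integral risk. This aggregation is an assumption made for illustration; the exact relationships are those given in [17,19]. All names and values are hypothetical.

```python
def catastrophe_risk(probability, direct, indirect=None):
    """Assumed form: R_k = P_k * sum_i (DD_ik + DI_ik).

    direct and indirect map a protected interest to its impact;
    omitting indirect gives the integrated-risk variant.
    """
    indirect = indirect or {}
    interests = set(direct) | set(indirect)
    total_impact = sum(direct.get(i, 0.0) + indirect.get(i, 0.0)
                       for i in interests)
    return probability * total_impact


def total_risk(catastrophes, include_indirect=True):
    """Assumed form: R = sum_k R_k over all catastrophes considered."""
    return sum(
        catastrophe_risk(c["P"], c["DD"],
                         c["DI"] if include_indirect else None)
        for c in catastrophes
    )


# Hypothetical data: probabilities and impacts in arbitrary loss units.
catastrophes = [
    {"P": 0.01,
     "DD": {"people": 100.0, "technology": 50.0},
     "DI": {"energy flows": 30.0}},
    {"P": 0.1, "DD": {"technology": 10.0}, "DI": {}},
]
integrated = total_risk(catastrophes, include_indirect=False)
integral = total_risk(catastrophes, include_indirect=True)
```

In this reading, the integral risk can never be lower than the integrated risk, because the indirect impacts only add to the direct ones.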
For technical systems [21], the following relationship applies:
(5)
where H is the threat associated with the given catastrophe at the location of the object; Ai are the values of the monitored assets, Zi the vulnerabilities of the assets, and Pi the conditional probabilities of asset damage, for i = 1, 2, …, n; F is the loss function; O is the vulnerability of protective measures; S is the size of the monitored object; t is the time since the harmful event occurred; T is the time during which losses occur; and τ is the recurrence period of the catastrophe. Since the loss function is usually not known, failure scenarios are created, and multicriteria methods, typically decision support systems, are used for risk evaluation [22].
From the above and given the complexity of systems, it follows that integral security can be increased only by considering and managing integral risks, which cover not only the sum of partial risks but also the connections and flows between assets [13].
For the purposes of safety management, asset criticality (K) is understood as the function of the importance and vulnerability of the monitored asset or the entire entity, expressed as the product [13,17]:
K = importance ∙ vulnerability (6)
Criticality with regard to a specific catastrophe can be expressed by the relationship:
C = S ∙ O ∙ B (7)
where S is the severity of the greatest impact of the catastrophe (harmful event), O is the probability of occurrence of the catastrophe, and B is the conditional probability that the most severe impact will occur [13,23].
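Relationships (6) and (7) translate directly into code. In the sketch below, the function names and the scales of the inputs (importance and severity as scores, vulnerability and probabilities between 0 and 1) are assumptions; the source does not prescribe them.

```python
def asset_criticality(importance, vulnerability):
    """Relationship (6): K = importance * vulnerability."""
    return importance * vulnerability


def catastrophe_criticality(severity, occurrence_probability,
                            conditional_probability):
    """Relationship (7): C = S * O * B.

    severity               S, severity of the greatest impact
    occurrence_probability O, probability that the catastrophe occurs
    conditional_probability B, probability that the most severe
                            impact occurs, given the catastrophe
    """
    for p in (occurrence_probability, conditional_probability):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return severity * occurrence_probability * conditional_probability


# Hypothetical values on assumed scales.
k = asset_criticality(importance=4, vulnerability=0.5)
c = catastrophe_criticality(severity=10, occurrence_probability=0.5,
                            conditional_probability=0.5)
```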
As mentioned at the beginning of this section, risk refers to the effect of uncertainty, i.e., how often (or how probably) substantial losses will occur. By reducing risk, we reduce the frequency of unfavorable events (where this is within our power) or their impacts. Risk is thus related to safety, but it is not defined by safety. Criticality relates to the threshold between two states; in the field of safety, these are the undesirable state (danger) and the desirable state (safety). By reducing criticality, i.e., by shifting the threshold between danger and safety, we enlarge the part of the system’s state space that lies in the safe area, i.e., we increase safety. Criticality is therefore a complementary variable to safety, although it is a consequence of risk factors and may share input parameters with risk (e.g., vulnerability) [27].
2.1.4 Security
In current practice, the term “security” is assigned several different meanings. In transport systems, security is associated with the protection of people without considering their interconnections with the system, or with the resilience of the system against disruption by an adverse event (calamity) or by internal faults. In connection with protective or security systems, security is understood as functional safety, i.e., the implementation of a safe function or process in the event of anticipated situations [24]. In reality, these meanings have the same goal: to protect human health and life and to ensure the development of human society. All these meanings are therefore part of integral security, which combines them.
Systemic security in the context of integral security means that the system is protected against both internal and external calamities, including the human factor, i.e., the system has sufficient resilience and adaptability to expected conditions. Moreover, a secure system must not jeopardize its surroundings, even under critical conditions [20,25,26,27], as illustrated in Figure 2 [20].
Potential impacts of system failures shown in Figure 2 will manifest in other systems as a disaster in their surroundings, creating a chain of calamities, i.e., a cascading effect.
Figure 2. The relationship between system security and protection [20].
According to current knowledge, the term “security” refers to the set of means and measures by which humanity ensures its own safety and sustainable development. Figure 3 depicts a concept focused on safety as the higher goal: the aim is not just to reduce risk but to increase the safety of people and of the other public assets on which people depend [27].
Figure 3. The relationship between safety and security as a tool for ensuring safety [27].
From the above, it follows that security and risk are related but are not complementary quantities, because security can be increased through organizational measures that do not affect the size of the risk. The complementary quantity to security is criticality: by reducing criticality, we increase the security of the monitored object.
2.1.5 Human security and integral security
Human Security, whose goal is to ensure people’s safety, has been a concern since the dawn of humanity. As a concept in the security sciences, however, it has been defined only recently. The United Nations defined Human Security as a concept that means:
“… protecting the essential foundation of all human lives in a way that enriches people’s freedom and self-realization. Human security means protecting the foundations of freedoms – freedoms that are the essence of life. This means protecting people from critical (severe) and pervasive (widespread) threats and situations. It means using processes that build on human strengths and desires. It means creating political, social, environmental, economic, military, and cultural systems that together provide basic building blocks for people’s survival, livelihood, and dignity…” [8].
Thus, it is primarily a shift in approach from mere state protection against threats from hostile armed forces to an approach that emphasizes people’s lives and their protection from other known threats. The main areas of the concept of “Human Security and its threats,” according to the UN, are as follows [8]:
personal security (physical violence, crime, terrorism, domestic violence, child abuse),
community security (inter-ethnic, religious, and other identity-based tensions),
political security (political repression, abuse of human rights).
From the perspective of economic security, the concept of Human Security emphasizes the restoration (rehabilitation) of transport routes. Transport is a prerequisite for achieving the goals of the individual areas of security, i.e., of the Human Security concept. At the same time, through its own errors and weaknesses, it can harm the very subjects these goals protect. Transport systems thus create new threats, such as pollution and direct impacts on people’s lives, health, and property [19].
States ensure the security of people and individual security goals through the so-called main functions of the state. One of the means is infrastructure [2]. The focus of the present work is on transport infrastructure and related critical infrastructure (e.g., critical information infrastructure).
A tool for ensuring human security is integral security, which is ensured by various security methods and technologies. It encompasses other engineering areas, such as reliability management, functional safety, cybersecurity, technical and physical security, surveillance, occupational safety, ensuring safe locations, people’s safety, etc. Integral security addresses the safety of multiple assets within a monitored area that interact with each other, are interconnected, and have various types of connections with superior and surrounding systems. The concept of integral security also considers the occurrence of all possible sources of threats that may affect the monitored entity [2]. Integral security management works with the management of integral risks [19].
The real world that we perceive is not ideal, and therefore conflicts arise due to imperfections and differences. Conflicts also arise in individual areas of security, safety, and interdisciplinary contexts. As a result, increasing the security of one element of the monitored system may indirectly worsen the security of another element, thus affecting integral security and overall human security.
From the above, it is clear that ensuring integral security requires more than just increasing the security or protection of individual system elements, which, through their interconnections, form a complex system. We must ensure a more effective management system capable of dealing with the complexities of the real world as best as possible [19].
Increasing integral security is based on process and project management, whose goal is to continuously improve quality and maintain a certain level of system security under the dynamically changing conditions of the real world (surrounding physical conditions, interconnections with other systems, changes in the culture and behavior of individuals or groups of people, etc.). In the European Union, management of the Total Quality Management (TQM) type is used [28]. Standards of the ISO 9000 and ISO 14000 series, among others, have been created to support it.
The TQM approach relies on the requirement that the process of improving the quality of an entity involves all employees, from rank-and-file workers to top management. The process of quality improvement (i.e., at its highest level, essentially increasing integral security) comes from impulses based on the needs of the customer, or the citizen [29,30]. TQM assumes that sustained quality (excellence) of products and services cannot be ensured through directives, control, partial programs, organizational or economic measures, but through targeted search, measurement, and evaluation of the causes of why productivity and quality are not improving. It is essentially about a safety culture (in other words, the way measures and actions are applied by people). Attention is focused on the processes occurring within the entity. When implementing TQM, the specifics of the entity are taken into account, because, for effectiveness, all measures must correspond to the entity’s structure, i.e., they must be locally specific [19,30].
In addition to the standardized management systems (ISO standards) built on its principles, TQM includes principles and attitudes for managing soft socio-technical systems with simple, idealized goals, so that they are understood by all affected personnel or by the residents of the area concerned. From a security perspective, TQM builds so-called Total Safety Systems (TSS). TSS introduces the concept of zero risks, which underpins the zero defects strategy and the principle of doing things right the first time.
TSS also incorporates specific prevention into the safety of the socio-technical elements of organizational systems. It draws on the so-called Total Prevention Systems (TPS), which include the zero breakdown principle, and on the Human Development System, designed to educate and train workers in the “right first time” principle [28]. Total Prevention Systems include, for example, the implementation of Total Productive Maintenance (TPM) [28].
Overall (integrally) safe systems include three basic elements:
site safety (disposition, management of environmental aspects, emergency procedures, fire safety measures, first aid provision, lighting, social facilities, etc.),
process safety (physical security, emergency stop elements, the “fail safely” principle, perimeter protection),
human resource safety (safety training, personal protective equipment, supervision, health checks).
The EU has issued a checklist, widely used primarily for inspections, covering the three areas mentioned above [2]. The TQM system, together with TSS, significantly exceeds the legislative requirements in the Czech Republic in many areas. For increasing safety, the fundamental prerequisite of the presented systems is risk reduction through proactive programs with continuous measurement and elimination of so-called near-misses. Near-misses are events that, based on current knowledge, would typically lead to an accident or disaster, but in this case, no issue occurred, e.g., due to the operator’s presence of mind [19,21,28].
Current trends in the field of safety sciences and risk engineering are based on the principles of engineering risk management, taking into account the complexity of systems arising from the nature, properties, and uncertainties of socio-technical, cyber-physical systems, referred to as systems of systems (SoS) [2,19,26,30,31].
2.1.6 Critical infrastructure and its security
Council Directive 2008/114/EC on the identification and designation of European critical infrastructures and the assessment of the need to improve their protection defines critical infrastructure as “an asset, system or part thereof located in Member States which is essential for the maintenance of vital societal functions, health, safety, security, economic or social well-being of people, and the disruption or destruction of which would have a significant impact in a Member State as a result of the failure to maintain those functions.” According to [33], critical infrastructure can be defined as systems of various kinds (technical, organizational, cyber, territorial, educational, etc.) that may affect the functioning of the economy, the state, and the management of emergency and critical situations. In the Czech Republic, critical infrastructure comprises infrastructures in the following nine areas [33]:
Energy supplies (electricity, gas, heat, oil, and petroleum products).
Water (provision of drinking and utility water, management of surface and groundwater resources, wastewater system).
Food supply and agriculture (food production, food care, agricultural production).
Healthcare (emergency pre-hospital care, hospital care, public health protection, production, storage, and distribution of pharmaceutical products and medical devices).
Transport (road, rail, air, and water).
Cyber, communication, and information systems (fixed and mobile telecommunications network services, radio communication and navigation, television and satellite communication, postal and courier services, internet and data services).
Banking and financial sector (management of public finances, banking, insurance, capital markets).
Rescue systems (Fire Rescue Service of the Czech Republic, fire protection units, Police of the Czech Republic, Czech Army, radiation monitoring, forecasts, warning systems, etc.).
Public administration (state and local government, social security and employment, state social support and assistance, judicial and prison system functions).
The area of critical infrastructure is governed by the Crisis Act [34]. An object or element of critical infrastructure is understood as a building, facility, means, or public infrastructure designated according to cross-sectional and sectoral criteria [35]. In the railway system, critical infrastructure objects include stations, metro stations, significant bridges and tunnels, technological equipment, and the flows of information, material, and energy in the systems, as determined by the methodology for assessing the criticality of objects in [33].
The protection of health and property is one of the basic functions of the state enshrined in the Constitution of the Czech Republic (Act No. 1/1993 Coll.). Disasters can not only disrupt the proper functioning of a critical infrastructure element but also threaten people’s health, property, and the environment. Therefore, appropriate measures are taken according to the category of disaster described in Section 2.1.2 [4,13,33].
2.1.7 Modern approaches: All-Hazard-Approach and Defence-In-Depth
The All-Hazard-Approach [12] means considering all possible types of hazards when managing security, i.e., all phenomena that may cause damage, loss, and harm to the monitored assets (people and relevant entities in the given area) [2].
Defence-In-Depth is a comprehensive security philosophy that began to be applied in technology in the 1980s [27]. In general, this approach can be understood as protecting a system through measures in multiple layers of the system.
According to [36], Defence-In-Depth is a comprehensive approach that ensures that both people and the environment are protected even under critical conditions within the facility. It covers all activities focused on the security of the facility and of the area in which it is located, from siting through design and planning, construction, commissioning, and operation to decommissioning of the facility. To ensure a secure system of systems, barrier systems and procedural measures are used.
The Defence-In-Depth approach is also known in cybersecurity and in the protection of control systems; see, e.g., [37] and Figure 4.
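The benefit of protecting a system in multiple layers can be illustrated with a short calculation. The sketch assumes that the barriers fail independently of each other, which is a simplifying assumption not made in [36]; the function name and the failure probabilities are hypothetical.

```python
from math import prod


def breach_probability(barrier_failure_probabilities):
    """Probability that a harmful event passes all barriers.

    Assumes independent barrier failures (a simplification); with
    common-cause failures the real value can be much higher.
    """
    probabilities = list(barrier_failure_probabilities)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("failure probabilities must lie in [0, 1]")
    return prod(probabilities)


# Hypothetical per-demand failure probabilities of three layers.
two_layers = breach_probability([0.1, 0.1])
three_layers = breach_probability([0.1, 0.1, 0.05])
```

Adding a layer never increases the breach probability under this assumption. With no barriers, the product is 1, i.e., every harmful event reaches the protected assets.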
Figure 4 illustrates the Defence-In-Depth approach as a security management strategy including the following areas:
security directives,
security requirements specifications,
security through design,
secure implementation,
security verification and validation,
Defence-In-Depth strategy.
The generalized layered model for security management based on the Defence-In-Depth approach, used in the dissertation, is described further in section 2.3.4.
2.1.8 Systems of systems (SoS), project and over-project phenomena
A System of Systems (SoS) is defined in systems engineering [38] as a set of independent systems integrated into a larger system that provides unique properties. The independent systems, called constituent systems, cooperate to produce global behavior that they cannot produce alone. According to [39], the classical concept of a system and the concept of an SoS differ primarily in the following elements:
autonomy – autonomy is exercised by the constituent systems to fulfill the purpose of the global system, i.e., SoS,
affiliation – individual constituent systems choose their affiliation based on cost-benefit considerations, in order to fulfill their own purpose and because they believe in the higher purpose of the SoS; in the classical system concept, affiliation is determined by the nature of the parts and cannot be changed arbitrarily (e.g., membership in a family),
connectivity – countless possible connections of systems and their parts to improve SoS capabilities,
diversity – higher diversity in the capabilities of SoS achieved through the autonomy of various constituent systems, chosen affiliation, and open connectivity,
emergence – in the SoS concept, the intentionally increased unpredictability of the system and the creation of conditions in which emergence can occur are of crucial importance, in both the negative sense (the occurrence of unpredictable negative events, disasters) and the positive sense (early detection and elimination of adverse system behavior).
The element of emergence has a significant impact on the choice of methods for working with systems, with an emphasis on exact methods for classical systems and heuristic methods for SoS, including the use of Artificial Intelligence (AI), etc.
For the purposes of this dissertation, we understand SoS as a set of open, mutually connected systems [33], further consisting of subsystems and objects (components) of various properties and their locations. The links between subsystems and objects ensure the necessary functions and behaviors of the entire SoS [40]. Mutual links and dependencies, i.e., interdependencies, are, according to their nature, physical, cyber, local, and logical [6]. Additionally, the interdependencies of SoS can be divided into:
desired: improve the properties of systems, devices, and infrastructures,
undesired:
a) under normal and abnormal conditions: managed by the project according to the requirements of legislation [41],
b) under critical conditions (beyond design):
lead to system losses,
cause systems to fail in performing their functions,
cause systems to endanger themselves and their environment.
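The division of interdependencies described above can be captured in a small data structure. The sketch is a minimal illustration; the class names, field names, and example links are hypothetical and are not the model used in the dissertation.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Nature of an interdependency, following the four types in [6]."""
    PHYSICAL = "physical"
    CYBER = "cyber"
    LOCAL = "local"
    LOGICAL = "logical"


@dataclass(frozen=True)
class Interdependency:
    source: str    # system or object the link starts from
    target: str    # system or object that depends on the source
    kind: Kind
    desired: bool  # True if the link improves system properties


def undesired_links(links, kind=None):
    """Return the undesired links, optionally filtered by their kind."""
    return [link for link in links
            if not link.desired and (kind is None or link.kind is kind)]


# Hypothetical links of a metro station for illustration only.
links = [
    Interdependency("power supply", "ventilation", Kind.PHYSICAL, True),
    Interdependency("control system", "signalling", Kind.CYBER, False),
    Interdependency("tunnel", "cable duct", Kind.LOCAL, False),
]
cyber_undesired = undesired_links(links, Kind.CYBER)
```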
Under certain conditions, some situations in the system under consideration can be addressed using exact methods. The conditions for ensuring safety are defined in the project (design) according to the system’s lifespan and criticality; these are the design criteria. Unfavorable phenomena or accidents that do not exceed the design criteria or conditions are referred to as design events (accidents). Safety primarily concerns the area beyond the specified limits and conditions of systems, i.e., beyond-design events or accidents.
The terms design basis accident and beyond design basis accident are formally defined, for example, by the International Atomic Energy Agency (IAEA) [42], but they are also commonly used in other areas of safety management for technical works [41].
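The distinction between design and beyond-design events can be expressed as a comparison with a design criterion. The sketch assumes that the load and the design criterion are scalar values in the same unit (for example, a flood water level in metres); the function name and the values are hypothetical.

```python
def classify_event(load, design_limit):
    """Classify an event against a single scalar design criterion.

    Events that do not exceed the design criterion are design events;
    all others are beyond-design events.
    """
    if load <= design_limit:
        return "design event"
    return "beyond-design event"


# Hypothetical design criterion: water level of 2.5 m at the station.
within = classify_event(load=1.8, design_limit=2.5)
beyond = classify_event(load=3.1, design_limit=2.5)
```

Real design criteria usually combine several quantities, so a practical classification would compare a whole set of loads with the corresponding limits.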