ICCK Transactions on Systems Safety and Reliability | Volume 2, Issue 1: 36-53, 2026 | DOI: 10.62762/TSSR.2025.806733
Abstract
The ultra-large scale and prolonged runtime of Large Language Model (LLM) training—often involving thousands of GPUs and spanning weeks—render reliability a pivotal bottleneck. Hardware failures, stragglers, and runtime issues can waste over 30% of GPU resources, delaying model rollout and driving up costs. This survey focuses on reliability optimization for LLM training systems. The discussion centers on three pillars of reliability: fault detection, fault recovery, and straggler mitigation. For each pillar, we dissect innovative mechanisms, which range from communication-aware fault detection to adaptive load balancing, and we assess their impact on critical reliability metrics such as...