Browsing by Subject "reliability"
Now showing 1 - 17 of 17
Item Algorithms and Data Representations for Emerging Non-Volatile Memories (2014-04-29) Li, Yue
The evolution of data storage technologies has been extraordinary. Hard disk drives that fit in current personal computers have a capacity that would have required tons of transistors to achieve in the 1970s. Today, we are at the beginning of the era of non-volatile memory (NVM). NVMs provide excellent performance, such as random access, high I/O speed, and low power consumption. The storage density of NVMs keeps increasing following Moore's law. However, higher storage density also brings significant data reliability issues. When chip geometries scale down, memory cells (e.g. transistors) are aligned much closer to each other, and noise in the devices is no longer negligible. Consequently, data will be more prone to errors and devices will have much shorter longevity. This dissertation focuses on mitigating the reliability and endurance issues for two major NVMs, namely NAND flash memory and phase-change memory (PCM). Our main research tools include a set of coding techniques for the communication channels implied by flash memory and PCM. To approach the problems, at the bit level we design error-correcting codes tailored for the asymmetric errors in flash and PCM, propose a joint coding scheme for endurance and reliability and error scrubbing methods for controlling storage channel quality, and study codes that are inherently resistant to typical errors in flash and PCM; at higher levels, we analyze the structures and meanings of the stored data, and propose methods that pass such metadata to help further improve the coding performance at the bit level.
The highlights of this dissertation include the first set of write-once memory code constructions which correct a significant number of errors, a practical framework which corrects errors utilizing the redundancies in texts, the first report of the performance of polar codes for flash memories, and the emulation of rank modulation codes in NAND flash chips.
Item Dynamic reliability using entry-time approach for maintenance of nuclear power plants (2009-05-15) Wang, Shuwen
Entry-time processes are finite-state continuous-time jump processes with transition rates depending only on the two states involved in the transition, the calendar time, and the most recent arrival time, which is termed the entry time. Entry-time processes have the potential to provide a significantly greater range of applicability and flexibility than traditional reliability tools for case studies related to equipment and components in nuclear power plants. In this dissertation, the finite difference approximation of the integrodifferential Chapman-Kolmogorov equations for the entry-time processes was developed, and then it was verified by application to some hypothetical examples that are solved by alternative means, either (semi-)analytically or via simulation. To demonstrate the applicability of the entry-time model to nuclear power plants in a RIAM-based scenario, the entry-time approach is applied to the maintenance of main generators in nuclear power plants using data from the INPO-EPIX database. In this application, both the reliability and the financial performance obtained using the entry-time approach under different maintenance policies are presented and discussed to help plant management make maintenance decisions.
The ability of the EPIX database to provide time-dependent failure rates is demonstrated and the techniques for extraction of failure rates from the database for main generators are also discussed.
Item Electrochemical characterization and time-variant structural reliability assessment of post-tensioned, segmental concrete bridges (2010-07-14) Pillai Gopalakrishnan, Radhakris
In post-tensioned (PT) bridges, prestressing steel tendons are the major load carrying components. These tendons consist of strands, ducts, and cementitious grout that fills the interstitial space between the strands and ducts. However, inspections on PT bridges have reported the presence of voids, moisture, and chlorides inside grouted ducts as the major cause of accelerated corrosion of strands. Corrosion of the strands has resulted in PT bridge failures in Europe and tendon failures in the United States. As most of the PT bridges have high importance measures and the consequences of failure are significant, it is important to maintain high levels of safety and serviceability for these bridges. To meet this goal, bridge management authorities are in dire need of tools to quantify the long-term performance of these bridges. Time-variant structural reliability models can be useful tools to quantify the long-term performance of PT bridges. This doctoral dissertation presents the following results obtained from a comprehensive experimental and analytical program on the performance of PT bridges.
1) Electrochemical characteristics of PT systems
2) Probabilistic models for tension capacity of PT strands and wires exposed to various void and environmental conditions
3) Time-variant structural reliability models (based on bending moment and stress limit states) for PT bridges
4) Time-variant strength and service reliabilities of a typical PT bridge experiencing HS20 and HL93 loading conditions and different exposure conditions for a period of 75 years
The experimental program included exposure of strand specimens to wet-dry and continuous-atmospheric conditions. These strand specimens were fabricated to mimic void and/or grout-air-strand (GAS) conditions inside the tendons. It was found that the GAS interface plays a major role in strand corrosion. The GAS interfaces that are typically located in the anchorage zones of harped PT girders or vertical PT columns can cause aggressive strand corrosion. At these locations, if voids are present and the environment is relatively dry, then limited corrosion of the strands occurs. However, if high relative humidity or water (uncontaminated or chloride-contaminated) is present at these interfaces, then corrosion activity can be high. The strands were exposed for periods of 12, 16, and 21 months, after which the remaining tension capacity was determined. The analytical program included the development of probabilistic strand capacity models (based on the experimental data) and the structural reliability models. The time-variant tension capacity predicted using the developed probabilistic models was reasonably consistent with the tendon failures observed in PT bridges in Florida and Virginia. The strength reliability model was developed based on the moment capacity and demand at midspan. The service reliability model was developed based on the allowable and applied stresses at midspan.
Using these models, the time-variant strength and service reliabilities of a typical PT bridge were determined based on a set of pre-defined constant and random parameters representing void, material, exposure, prestress, structural loading, and other conditions. The strength and service reliabilities of PT bridges exposed to aggressive environmental conditions can drop below the recommended values at relatively young ages. In addition, under similar conditions the service reliability drops at a faster rate than the strength reliability.
Item Fault tolerant control of homopolar magnetic bearings and circular sensor arrays (Texas A&M University, 2006-04-12) Li, Ming-Hsiu
Fault tolerant control can accommodate faults in the components of a control system, such as sensors, actuators, and plants. This dissertation presents two fault tolerant control schemes to accommodate the failures of power amplifiers and sensors in a magnetic suspension system. The homopolar magnetic bearings are biased by permanent magnets to reduce the energy consumption. One control scheme adjusts system parameters by swapping current distribution matrices for magnetic bearings and weighting gain matrices for sensor arrays, while keeping the MIMO-based control law invariant before and after the faults. Current distribution matrices are evaluated based on the set of poles (power amplifier plus coil) that have failed and the requirements for uncoupled force/voltage control, linearity, and specified force/voltage gains to be unaffected by the failure. Weighting gain matrices are evaluated based on the set of sensors that have failed and the requirements for uncoupling x1 and x2 sensing, runout reduction, and voltage/displacement gains to be unaffected by the failure. The other control scheme adjusts the feedback gains on-line or off-line, while the current distribution matrices remain invariant before and after the faults.
Simulation results have demonstrated the fault tolerant operation by these two control schemes.
Item Improving Distribution System Reliability Through Risk-based Optimization of Fault Management and Improved Computer-based Fault Location (2013-11-07) Dong, Yimai
Distribution system utilities are now under pressure to improve the reliability of power supply, driven not only by the urge to increase revenue but also by the requirements of their customers and the Independent Service Organization (ISO)'s regulation on power quality. Optimization in fault management tasks has the potential of improving system reliability by reducing the duration and scale of outages caused by faults through fast fault isolation and service restoration. The research reported in this dissertation aims at improving distribution system reliability through optimized fault management. Three questions are explored and answered: 1) how to establish the cause-and-effect relationship between fault management and system reliability; 2) how individual fault management tasks can benefit from newly emerged smart grid technologies; and 3) how to improve the overall performance of fault management under new operating conditions. Optimization of the fault management is done through minimizing a risk function representing system reliability.
The improvement in system reliability is approached in the following steps: 1) a risk function consisting of distribution reliability indices is defined as the criterion for system reliability; 2) a new fault location method is proposed that can accurately locate faults with the assistance of voltage-sag measurements from system-wide Intelligent Electronics Devices (IEDs); 3) the fault management task of field inspection is optimized using the risk function and the probability model of the true fault location established using results from fault location; 4) the decision making on the execution of during-fault service restoration is optimized through Monte Carlo simulation; 5) the optimized fault management is utilized in processing the faults and the improvement in system reliability is assessed by the reduction of costs associated with these faults. The proposed optimization is demonstrated on a realistic distribution system. The stochastic model of faults in the system is built with consideration of normal and extreme weather conditions. Results show that the proposed optimization is capable of improving system reliability by reducing the mean and variance of outage cost calculated over the simulated years.
Item Measuring eating disorder attitudes and behaviors: a reliability generalization study (2009-05-15) Pearson, Crystal Anne
I used reliability generalization procedures to determine the mean score reliability of the Eating Disorder Inventory (EDI), the Eating Attitudes Test (EAT), and the Bulimia Test (BULIT). Reliability generalization is a type of meta-analysis used to examine the mean score reliability of a measure across studies and to explore study factors that influence mean score reliability. Score reliability estimates were included in 28.67% of 293 studies using the EDI, 36.28% of 215 studies using the EAT, and 41.46% of 41 studies utilizing the BULIT.
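Score reliability in studies of this kind is usually reported as an internal-consistency coefficient such as Cronbach's alpha. As a rough illustration only, alpha for k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total scores. The function name `cronbach_alpha` and the toy scores are assumptions made for this sketch; they are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items.

    item_scores: list of k lists, one per item, each holding the scores of
    the same respondents in the same order (hypothetical input layout).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(resp) for resp in zip(*item_scores)]
    item_var_sum = sum(pvariance(item) for item in item_scores)
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Toy data: two items answered by four respondents.
alpha = cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]])
print(round(alpha, 2))
```

Population variances are used for both the items and the totals; using sample variances throughout would give the same ratio.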
For the EDI, mean Cronbach's alphas for the subscales ranged from .52 to .89 and the mean estimate for the total score was .91. For the EAT-40 and EAT-26, mean estimates of internal consistency were .81 and .86 respectively. Mean estimates of internal consistency for the EAT-26 subscales ranged from .56 to .80. The mean estimate of internal consistency for the BULIT-R was .93. Overall, the mean reliability of scores on all three measures and their subscales/factors was acceptable except for the Asceticism subscale of the EDI and the Oral Control factor on the EAT-26, which had mean internal consistency estimates of .52 and .56 respectively. For the EDI, the majority of the subscales that measure specific eating disorder attitudes and behaviors, such as Bulimia and Perfectionism, displayed higher score reliability in clinical eating disorder samples than in nonclinical samples. This difference was not found in the Drive for Thinness and Body Dissatisfaction subscales, perhaps because these attitudes are common in both eating disorder and nonclinical samples. Score reliability information for the EAT and BULIT was primarily reported for nonclinical samples; therefore, it is difficult to characterize the effect of type of sample on these measures. There was a tendency for mean score reliability for all the measures to be higher in the adult samples than in adolescent samples and in the female samples compared to the male samples. This study highlights the importance of assessing and reporting internal consistency every time a measure is used because reliability is affected by characteristics of the participants being examined.
Item Prognostic Control and Load Survivability in Shipboard Power Systems (2011-02-22) Thomas, Laurence J.
In shipboard power systems (SPS), it is important to provide continuous power to vital loads so that their desired missions can be completed successfully.
Several components exist between the primary source and the vital load, such as transformers, cables, or switching devices. These components can fail due to mechanical stresses, electrical stresses, and overloading, which could lead to a system failure. If the normal path to a vital load cannot supply power to it, then it should be powered through its alternate path. The process of restoring, balancing, and minimizing power losses to loads is called network reconfiguration. Prognostics is the ability to predict precisely and accurately the remaining useful life of a failing component. In this work, the prognostic information of the power system components is used to determine if reconfiguration should be performed if the system is unable to accomplish its mission. Each component will be analyzed using the Weibull distribution to compute the conditional reliability from the present time to the end of the mission. To determine if reconfiguration is needed, all components to a given load will be utilized in structure functions to determine if a load will be able to survive during a time period. Structure functions are used to show how components are interconnected, and also provide a mathematical means for computing the total probability of a system. This work will provide a method to compute the conditional survivability to a given load, and the results indicate the top five loads that have the lowest conditional survivability during a mission in a known configuration. The results show the computed conditional survivability of loads on an all-electric navy ship. The loads' conditional survivability is computed at a high/medium voltage level and a low voltage level to show how loads are affected by failing components along their path.
Item Reliability Evaluation of Composite Power Systems Including the Effects of Hurricanes (2011-02-22) Liu, Yong
Adverse weather such as hurricanes can significantly affect the reliability of composite power systems.
Predicting the impact of hurricanes can help utilities prepare better and make appropriate restoration arrangements. In this dissertation, the impact of hurricanes on the reliability of composite power systems is investigated. Firstly, the impact of adverse weather on the long-term reliability of composite power systems is investigated by using the Markov cut-set method. The algorithms for the implementation are developed. Here, a two-state weather model is used. An algorithm for sequential simulation is also developed to achieve the same goal. The results obtained by using the two methods are compared. The comparison shows that the analytical method can obtain comparable results and can also be faster than the simulation method. Secondly, the impact of hurricanes on the short-term reliability of composite power systems is investigated. A fuzzy inference system is used to assess the failure rate increment of system components. Here, different methods are used to build two types of fuzzy inference systems. Considering the fact that hurricanes usually last only a few days, a short-term minimal cut-set method is proposed to compute the time-specific system and nodal reliability indices of composite power systems. The implementation demonstrates that the proposed methodology is effective and efficient and is flexible in its applications. Thirdly, the impact of hurricanes on the short-term reliability of composite power systems including common-cause failures is investigated. Here, two methods are proposed to achieve this goal. One of them uses a Bayesian network to alleviate the dimensionality problem of the conditional probability method. The other method extends the minimal cut-set method to accommodate common-cause failures. The implementation results obtained by using the two methods are compared and their discrepancy is analyzed.
Finally, the proposed methods in this dissertation are also applicable to other applications in power systems.
Item Reliability-yield allocation for semiconductor integrated circuits: modeling and optimization (Texas A&M University, 2005-11-01) Ha, Chunghun
This research develops yield and reliability models for fault-tolerant semiconductor integrated circuits and develops optimization algorithms that can be directly applied to these models. Since defects cause failures in microelectronics systems, accurate yield and reliability models considering these defects as well as optimization techniques determining efficient defect-tolerant schemes are essential in semiconductor manufacturing and nanomanufacturing to ensure manufacturability and productivity. The defect-based yield model considers various types of failures, fault-tolerant schemes such as hierarchical redundancy and error correcting code, and burn-in effects, simultaneously. The reliability model counts on carry-over single-cell failures accompanied by the failure rate of the semiconductor integrated circuits under the assumption of an error correcting code policy. The redundancy allocation problem, which seeks to find an optimal allocation of redundancy that maximizes system reliability, is one of the representative problems in reliability optimization. The problem is typically formulated as a nonconvex integer nonlinear programming problem that is nonseparable and coherent. Two iterative heuristics, tree and scanning heuristics, and variants are studied to obtain local optima and a branch-and-bound algorithm is proposed to find the global optimum for redundancy allocation problems. The proposed algorithms engage a multiple-search paths strategy to accelerate efficiency. Experimental results of these algorithms indicate that they are superior to the existing algorithms in terms of computation time and solution quality.
An example of memory semiconductor integrated circuits is presented to show the applicability of both the yield and reliability models and the optimization algorithms to fault-tolerant semiconductor integrated circuits.
Item Safety assured financial evaluation of maintenance (Texas A&M University, 2004-09-30) Erguina, Vera
Management decisions in complex industrial facilities usually consider both the economic and environmental aspects of the plant's performance. For nuclear power plants (NPPs), safety is also a very substantial issue. The objectives of this dissertation are to develop and demonstrate a novel and useful conceptual model that could be used to allocate maintenance funds for a nuclear power plant in such a way as to meet all specified safety requirements and objectives, while achieving a high degree of economic performance. The model is based on the general theory that the reliability of a plant at any time is a function of its initial reliability and the maintenance history of the individual plant components (Smith, 1997). Such a model can assist in evaluating strategic management decisions regarding allocation of funds for nuclear power plant maintenance. It could be used as a simulation tool; various scenarios could be studied to answer "what if" questions. Simulations of this type will allow a better understanding of the relationship between maintenance, economic performance, and safety, and consequently will lead to better decision making. The novelty of this model is tied to the intimate relationship that it develops between maintenance activities at a nuclear plant, and their relationship to prescribed safety requirements and to the economic performance of that plant.
Item Scheduling screening inspections for replaceable and non-replaceable systems (2009-05-15) Aral, Bahadir
This dissertation focuses on developing inspection schedules to detect non-self-announcing events, which can only be detected by inspections.
Failures of protective systems, such as electronic equipment, alarms, and stand-by systems, incipient failures, and the emergence of certain medical diseases are examples of such events. Inspections are performed at pre-determined times to detect whether or not the event has occurred, and necessary actions are taken upon the detection. In this research, my interest is in developing effective inspection schedules to detect non-self-announcing events that balance system downtime and inspection effort. To evaluate the quality of an inspection schedule, I use the availability (for replaceable systems) and the detection delay (for non-replaceable systems) as performance measures. When the monetary cost of inspection and the cost of delay are difficult to determine, non-monetary performance measures become more meaningful. In this research, the focus is on maximizing availability or minimizing detection delay given a limited number of inspections or a limited inspection rate. I show that for replaceable and non-replaceable systems, it is possible to construct inspection schedules that perform better than periodic inspection with respect to these performance measures. The occurrence of the event I would like to detect may be influenced by certain individual characteristics. For instance, the risk of developing a certain type of disease might be different for different subgroups within the population. In this case, because of the non-homogeneity in the population, the benefits of performing screening tests may not be fully achieved for each subgroup by using an inspection strategy developed for the entire population. Thus, it may be of value for an individual to learn more about his/her likelihood of having the disease.
To address this issue, I analyze the change in the expected delay if schedules are based on the whole population information or the individual information and provide numerical results.
Item Short time scale thermal mechanical shock wave propagation in high performance microelectronic packaging configuration (Texas A&M University, 2004-11-15) Nagaraj, Mahavir
The generalized theory of thermoelasticity was employed to characterize the coupled thermal and mechanical wave propagation in high performance microelectronic packages. Application of a Gaussian heat source of spectral profile similar to high performance devices was shown to induce rapid thermal and mechanical transient phenomena. The stresses and temporal gradient of stresses (power density) induced by the thermal and mechanical disturbances were analyzed using the Gabor Wavelet Transform (GWT). The arrival time of frequency components and their magnitude was studied at various locations in the package. Comparison of the results from the classical thermoelasticity theory and generalized theory was also conducted. It was found that the two theories predict vastly different results in the vicinity of the heat source but that the differences diminish within a larger time window. Results from both theories indicate that the rapid thermal-mechanical waves cause high frequency, broadband stress waves to propagate through the package for a very short period of time. The power density associated with these stress waves was found to be of significant magnitude indicating that even though the effect, titled short time scale effect, is short lived, it could have significant impact on package reliability. The high frequency and high power density associated with the stress waves indicate that the probability of sub-micron cracking and/or delamination due to short time scale effect is high.
The findings demonstrate that in processes involving rapid thermal transients, there is a non-negligible transient phenomenon worthy of further investigation.
Item Short-term and long-term reliability studies in the deregulated power systems (Texas A&M University, 2006-04-12) Li, Yishan
The electric power industry is undergoing a restructuring process. The major goals of the change of the industry structure are to motivate competition, reduce costs and improve the service quality for consumers. In the meantime, it is also important for the new structure to maintain system reliability. Power system reliability is comprised of two basic components, adequacy and security. In terms of the time frame, power system reliability can mean short-term reliability or long-term reliability. Short-term reliability is more a security issue while long-term reliability focuses more on the issue of adequacy. This dissertation presents techniques to address some security issues associated with short-term reliability and some adequacy issues related to long-term reliability in deregulated power systems. Short-term reliability is for operational purposes and is mainly concerned with security. Thus the way energy is dispatched and the actions the system operator takes to remedy an insecure system state such as transmission congestion are important to short-term reliability. Our studies on short-term reliability are therefore focused on these two aspects. We first investigate the formulation of the auction-based dispatch by the law of supply and demand. Then we develop efficient algorithms to solve the auction-based dispatch with different types of bidding functions. Finally we propose a new Optimal Power Flow (OPF) method based on sensitivity factors and the technique of aggregation to manage congestion, which results from the auction-based dispatch. The algorithms and the new OPF method proposed here are much faster and more efficient than the conventional algorithms and methods.
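To make the idea of an auction-based dispatch concrete, the sketch below clears a single-period market by merit order: supply offers are sorted by price and accepted until demand is met, and the last accepted offer sets a uniform clearing price. This is a textbook simplification with block offers, fixed demand, and no network constraints, so it says nothing about congestion; it is not the dissertation's algorithm, and the name `clear_auction` and the sample offers are hypothetical.

```python
def clear_auction(bids, demand_mw):
    """Merit-order clearing of block supply offers against a fixed demand.

    bids: list of (generator_name, price_per_mwh, quantity_mw) tuples.
    Returns (dispatch dict in MW, uniform clearing price or None).
    """
    dispatch = {}
    remaining = demand_mw
    clearing_price = None
    # Accept the cheapest offers first (law of supply and demand).
    for name, price, qty in sorted(bids, key=lambda b: b[1]):
        if remaining <= 0:
            break
        take = min(qty, remaining)
        dispatch[name] = dispatch.get(name, 0.0) + take
        remaining -= take
        clearing_price = price  # marginal (last accepted) offer sets the price
    if remaining > 1e-9:
        raise ValueError("offers cannot cover demand")
    return dispatch, clearing_price


offers = [("G1", 30.0, 50.0), ("G2", 20.0, 40.0), ("G3", 45.0, 60.0)]
dispatch, price = clear_auction(offers, 70.0)
print(dispatch, price)
```

In a real market the accepted schedule would then be checked against transmission limits, which is where a congestion-management step such as an OPF comes in.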
With regard to long-term reliability, the major issues are adequacy and its improvement. Our research thus is focused on these two aspects. First, we develop a probabilistic methodology to assess composite power system long-term reliability with both adequacy and security included by using the sequential Monte Carlo simulation method. We then investigate new ways to improve composite power system adequacy in the long-term. Specifically, we propose to use Flexible AC Transmission Systems (FACTS) such as Thyristor Controlled Series Capacitor (TCSC), Static Var Compensator (SVC) and Thyristor Controlled Phase Angle Regulator (TCPAR) to enhance reliability.
Item Simulation and Optimization of Wind Farm Operations under Stochastic Conditions (2011-08-08) Byon, Eunshin
This dissertation develops a new methodology and associated solution tools to achieve optimal operations and maintenance strategies for wind turbines, helping reduce operational costs and enhance the marketability of wind generation. The integrated framework proposed includes two optimization models for enabling decision support capability, and one discrete event-based simulation model that characterizes the dynamic operations of wind power systems. The problems in the optimization models are formulated as a partially observed Markov decision process to determine an optimal action based on a wind turbine's health status and the stochastic weather conditions. The first optimization model uses homogeneous parameters with an assumption of stationary weather characteristics over the decision horizon. We derive a set of closed-form expressions for the optimal policy and explore the policy's monotonicity. The second model allows time-varying weather conditions and other practical aspects. Consequently, the resulting strategies are season-dependent. The model is solved using a backward dynamic programming method.
The benefits of the optimal policy are highlighted via a case study that is based upon field data from the literature and industry. We find that the optimal policy provides options for cost-effective actions, because it can be adapted to a variety of operating conditions. Our discrete event-based simulation model incorporates critical components, such as a wind turbine degradation model, power generation model, wind speed model, and maintenance model. We provide practical insights gained by examining different maintenance strategies. To the best of our knowledge, our simulation model is the first discrete-event simulation model for wind farm operations. Last, we present the integration framework, which incorporates the optimization results in the simulation model. Preliminary results reveal that the integrated model has the potential to provide practical guidelines that can reduce the operation costs as well as enhance the marketability of wind energy.
Item Single-Phase Inverter and Rectifier for High-Reliability Applications (2014-05-01) Harb, Souhib
With the depletion of fossil fuels and skyrocketing levels of CO2 in our atmosphere, Renewable Energy Resources, generated from natural, sustained, clean, and domestic resources, have in recent years caught the eye of both industries and governments worldwide. In addition to finding these energy resources, new technologies are being sought to improve the efficiency of consuming the generated energy. Power Electronics is the key technology for both generation and the efficient consumption of energy. The recent trend in power electronics is to integrate the electronics into the source (Photovoltaic (PV)) or the load (light). For PV and outdoor lighting applications, this imposes a harsh, wide-range operating environment on the power electronics. Thus, the reliability of power electronics converters becomes a very crucial issue.
It is required that the power electronics used in such environments have reliability indices, such as lifetime, that match those of the source or load. This eliminates the recurring cost of power electronics replacement. Relatively high efficiencies have been reported in the literature, and standards have been developed to measure efficiency. However, the reliability aspect has not received the same level of scrutiny. In this study, two main aspects have been investigated: (1) a new methodology to evaluate the reliability of integrated power electronics, which becomes a more involved task; and (2) new topology and control schemes for the single-phase DC/AC and AC/DC converters, which will improve the reliability. The proposed methodology has been applied to different PV Module-Integrated-Inverters (MIIs) that employ different power decoupling techniques. The results showed that the decoupling capacitor is the lifetime-limiting component in all the studied topologies. Moreover, topologies that use film capacitors instead of electrolytic capacitors showed an order of magnitude improvement in lifetime. This clearly suggests that replacing the electrolytic capacitor with a high-reliability film capacitor will enhance the reliability of the PV MII. In the second part of this study, the ripple-port concept is applied to the single-phase DC/AC inverter and AC/DC rectifier, which allows for the usage of the minimum required decoupling capacitance. Consequently, film capacitors can be used, which improves the overall reliability and lifetime.
Item Spatial stochastic processes for yield and reliability management with applications to nano electronics (Texas A&M University, 2005-02-17) Hwang, Jung Yoon
This study uses the spatial features of defects on the wafers to examine the detection and control of process variation in semiconductor fabrication. It applies spatial stochastic processes to semiconductor yield modeling and the extrinsic reliability estimation model.
New yield models of integrated circuits based on the spatial point process are established. The defect density, which varies according to location on the wafer, is modeled by a spatial nonhomogeneous Poisson process. To capture the variations in defect patterns between wafers, a random coefficient model and model-based clustering are applied. Model-based clustering is also applied to fabrication process control to detect defect clusters that are generated by assignable causes. An extrinsic reliability model using defect data and a statistical defect growth model are developed based on the new yield model.Item The Impact of Protection System Failures on Power System Reliability Evaluation(2012-11-05) Jiang, KaiThe reliability of protection systems has emerged as an important topic because protection failures have a critical influence on the reliability of power systems. The goal of this research is to develop novel approaches for modeling and analyzing the impact of protection system failures on power system reliability. It is shown that repairable and non-repairable assumptions make a remarkable difference in reliability modeling. A typical all-digital protection system architecture is modeled and numerically analyzed. If an all-digital protection system is indeed repairable but is modeled in a non-repairable manner for analysis, the calculated values of reliability indices could be grossly pessimistic. The smart grid is emerging with the penetration of information-age technologies, and the development of the Special Protection System (SPS) will be greatly influenced. A conceptual all-digital SPS architecture is proposed for the future smart grid. Calculation of important reliability indices by the network reduction method and the Markov modeling method is illustrated in detail. Two different Markov models are proposed for reliability evaluation of the 2-out-of-3 voting gates structure in a generation rejection scheme. 
If the model that considers both detectable and undetectable logic gate failures is used as a benchmark, the simple model, which considers only detectable failures, will significantly overestimate the reliability of the 2-out-of-3 voting gates structure. The two types of protection failures, undesired-tripping mode and fail-to-operate mode, are discussed. A complete Markov model for current-carrying components is established and its simplified form is then derived. The simplified model can appropriately describe the overall reliability of individual components when complex interactions arise between components due to protection failures. New concepts of the self-down state and the induced-down state are introduced and used to build the composite unit model. Finally, a two-layer Markov model for power systems with protection failures is proposed. It can quantify the impact of protection failures on power system reliability. Using the developed methodology, we can see that the assumption of perfectly reliable protection can introduce errors in the reliability evaluation of power systems.