APPLICATION OF THE BOOTSTRAP METHOD FOR STATISTICAL CHARACTERISTICS ASSESSMENT OF AIRCRAFT COMPONENTS’ SMALL SAMPLES


Abstract

Estimating an adequate service life for aircraft instruments is of great importance in aircraft operation. Changing the instrument service interval affects both reliability (shorter intervals make it easier to locate malfunctions of components and assemblies as early as possible) and economic performance (shorter intervals increase operating costs). Increasing the service interval without reducing reliability is therefore an economically important task. To determine the optimal time to maintenance for aviation components and assemblies, their service life must be determined as precisely as possible. Calculating such estimates is complicated by the fact that the data on component failures are scattered and incomplete, which makes it difficult to assess their statistical characteristics accurately. The purpose of this article is to find an effective method for assessing the statistical characteristics of small samples as the first stage of modeling the reliability of aircraft components and assemblies. The need arises from the specific conditions of component exchange at small airlines operating Soviet-era aircraft. The article examines two resampling methods, the bootstrap and the jackknife. It also estimates the expected mean time to failure of pressure gauges, together with its variance and root-mean-square deviation. The bootstrap method is proposed as applicable for assessing the statistical characteristics of the mean time to failure of aircraft components and assemblies analysed in small samples (pressure gauges were chosen as an example of such analysis). The assessments and calculations can be used by airlines to establish the failure-free service time of a variety of components and assemblies.

Full text

Introduction. The development of methods and means to reduce the number of aircraft and helicopter system failures, and thereby to improve flight safety, has always been a priority. This is connected with economic factors, such as reducing aircraft maintenance costs and accordingly increasing the economic efficiency of air transportation [1], and with psychological factors as well [2]. To increase the technical reliability of aircraft components, one needs an effective reliability assessment of the components already in operation.

The failure of aircraft components and assemblies is a stochastic process. To model such processes adequately, one must know their statistical characteristics. Unfortunately, the collection, accumulation and storage of information on the status and failures of aircraft in general, and of their specific components, is not systematic at the moment [3]. There is an acute lack of end-to-end documentation on instruments and spare parts. Logs often contain no entries about previous failures or repairs of parts and components, and the record forms of these components are often filled in with errors. For all these reasons it is difficult to collect and process statistical information on aircraft failures and defects, which makes it hard to assess component reliability.

Under these circumstances, methods aimed at the statistical analysis of small samples are of particular importance.

Statistical methods of small sample analysis. Statistics has no clear definition of a small sample. Typically, a sample is called small if its size does not exceed 30 units [4]. The “criterion of smallness” of a sample was outlined by D. V. Gaskarov and V. I.
Shapovalov in their work “Small Sample”, which states that a sample should be considered small when processing it by methods based on data grouping cannot attain the specified accuracy and reliability [5; 6].

The development of new statistical methods for processing a limited number of observations began when the traditional methods of mathematical statistics proved unsuitable for samples of such size. Special methods were proposed for estimating the parameters of small samples by extending the characteristics of the sample to the whole population. These include the methods of direct recounting and of correction factors [7], as well as the method of rectangular contributions, described in [8] and examined in [9]. The essence of the latter method is the assumption that the random variable fluctuates: $x_i$ is taken to be not the only possible value but merely the most probable value within a certain interval. Accordingly, when the empirical density is constructed, each $x_i$ is represented by a finite density function called a “contribution” rather than by a single point value. The physical meaning of this construction is the assumption that the probability density is nonzero not only at the observed value but also in its close vicinity. The method of reducing the uncertainty, the method of successive medians [10] and some other methods were also used, as described in the thesis abstract by E. B. Gorbunova [11].

When parameters are evaluated, the analysed value is in most cases assumed to be normally distributed. In assessing the reliability of technical systems, some quantities are often regarded as exponentially distributed. However, this assumption often serves only to simplify further calculations. Under such an approach, interval estimates of statistical characteristics become more important than point ones.
In the second half of the twentieth century, statistical methods focused on similar tasks appeared, namely the jackknife method (named after a folding knife) and the bootstrap method [12]. These methods belong to the resampling and randomization group: they make it possible to obtain both point and interval estimates of the characteristics of the original population by forming new samples from the small sample already available.

At the time of their development these methods saw limited application because of their relatively high computational complexity and the lack of suitable computers. They were given a new life at the end of the 20th century, when computer technology became widely accessible [13]. The advantages of these methods include relatively high efficiency, while the drawback is the absence of a strong theoretical justification.

The data were processed and the calculations performed using the following software: Python, Pandas, NumPy and Jupyter Notebook.

Python is a high-level general-purpose programming language designed to improve developer productivity and code readability. It is widely used in research calculations.

Pandas is a Python library for data manipulation and analysis, used, for example, with multivariate time series and cross-sectional data sets commonly found in statistics and experimental results.

NumPy is a Python library that provides support for large multidimensional arrays and matrices, together with a large collection of high-level mathematical functions operating on these arrays.

Jupyter Notebook is an environment for interactive computing. It can be used not only with Python but also with other programming languages, such as Julia, R, Haskell and Ruby. It is often used for data processing, statistical modeling and machine learning [14; 15].

Source data for the analysis.
The data selected for the analysis were the failures of components installed in the An-24RV aircraft of the “KrasAvia” airline. The available information on the aircraft components and assemblies was:
- product name;
- product code;
- factory code number;
- date of manufacture;
- date of repair;
- operation time since initial installation (hours);
- operation time since the latest repair (hours).

As far as the reliability of newly released aircraft is concerned, the failure-free operation time is of primary importance, above all the failure-free time since the aircraft was first put into operation.

The available data contained information on various aircraft components and assemblies. The data selected were those on failures of the MA-250M aircraft pressure gauge (mainly because the amount of data on failures of this instrument was well known).

The available data contained information about 3 instances of the instrument replacement (see table).

Operation time since initial installation

No.   Operation time since initial installation (hours)
1     3707
2     10520
3     3707
4     10520
5     21993
6     3707
7     10520
8     3707
9     10520
10    21993

The coincidence of operating times for different pressure gauges results from their replacement during scheduled aircraft maintenance.

Application of traditional and resampling methods for estimating aviation systems’ failure rate parameters. The table above indicates that there may not be enough data for an adequate statistical analysis. This statement can be verified experimentally by computing statistical characteristics from the available data.

The histogram of the original sample and the estimates obtained by the traditional method are shown in fig. 1. The mathematical expectation of the mean time to failure is 10 089.40 hours. The root-mean-square deviation is 6686.35 hours. The given distribution is not normal.
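The traditional point estimates quoted above can be reproduced directly from the table. The following is a minimal sketch: the array name `times` is introduced here for illustration, and the use of the population form of the standard deviation (NumPy's default, `ddof=0`) is an assumption, chosen because it matches the reported 6686.35 hours.

```python
import numpy as np

# Operating times since initial installation (hours), copied from the table
times = np.array([3707, 10520, 3707, 10520, 21993,
                  3707, 10520, 3707, 10520, 21993], dtype=float)

mean = times.mean()              # arithmetic mean of the sample
std_pop = times.std()            # population form, ddof=0 (assumed here)
std_sample = times.std(ddof=1)   # sample form, shown for comparison only

print(f"mean time to failure: {mean:.2f} h")
print(f"RMS deviation (ddof=0): {std_pop:.2f} h")
print(f"RMS deviation (ddof=1): {std_sample:.2f} h")
```

With only ten observations the two forms of the standard deviation differ noticeably, so it is worth stating which one a reported value refers to.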
According to the traditional estimate, 47.5 % of the gauges are either already out of order when installed in the aircraft or fail for the first time within 10 000 hours of operation.

The use of resampling methods solves the problem.

We first consider the jackknife method, one of the resampling methods (a linear approximation of the statistical bootstrap); it is used to estimate the error of statistical inference. The method works as follows: for each element, the sample mean is calculated with that element left out, and then the average of all the values thus obtained is taken. For a sample of N elements, each estimate is obtained by calculating the mean of the remaining N-1 elements.

To assess this method, a function generating subsamples was developed. Its code is given below:

```python
import numpy as np

# manometers: pandas DataFrame holding the source table
# (loading it is not shown in the original listing)

def form_jack_sample(source_sample, i):
    # The original listing omits this helper; it is reconstructed from the
    # description above: return the sample with element i left out.
    return source_sample[:i] + source_sample[i + 1:]

# build the array of samples to analyze
manom1 = manometers.iloc[:, 2].tolist()

means = []
stds = []
for i in range(len(manom1)):
    a = form_jack_sample(manom1, i)
    means.append(np.mean(a))
    stds.append(np.std(a))

hist, bins = np.histogram(means, bins=10)
hist_s, bins_s = np.histogram(stds, bins=10)

print(np.mean(means))
print(np.std(means))
print(np.mean(stds))
print(np.mean(manom1))  # the original printed the undefined name `manom`
print(np.std(manom1))
print(means)
print(stds)
```

For the available sample, as in the previous example, the statistical characteristics were computed using the following code (form_sample is the resampling function listed with the bootstrap code further below):

```python
# Jupyter magic: draw plots inside the notebook
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# build the array of samples to analyze
manom = manometers.iloc[:, 2].tolist()

n = 10000
means = []
stds = []
for i in range(n):
    a = form_sample(manom)
    means.append(np.mean(a))
    stds.append(np.std(a))

hist, bins = np.histogram(means, bins=10)

print(np.mean(means))
print(np.std(means))
print(np.mean(stds))
print(np.mean(manom))
print(np.std(manom))

# histogram of the means
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width)
plt.show()
```

The resulting histogram of the distribution is shown in fig. 2.
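The leave-one-out procedure described above can also be written compactly with NumPy. The sketch below is illustrative only: `times` and `loo_means` are names introduced here, the data are copied from the table, and the standard-error formula is the textbook jackknife estimate, which the article itself does not state.

```python
import numpy as np

# Operating times since initial installation (hours), copied from the table
times = np.array([3707, 10520, 3707, 10520, 21993,
                  3707, 10520, 3707, 10520, 21993], dtype=float)
n = len(times)

# Leave-one-out means: drop element i and average the remaining n-1 values
loo_means = np.array([np.delete(times, i).mean() for i in range(n)])

# Jackknife estimate of the mean: the average of the leave-one-out means
jack_mean = loo_means.mean()

# Textbook jackknife standard error (an addition, not from the article)
jack_se = np.sqrt((n - 1) / n * np.sum((loo_means - jack_mean) ** 2))

print(f"jackknife mean: {jack_mean:.1f} h")
print(f"jackknife standard error of the mean: {jack_se:.1f} h")
```

For the sample mean, this standard error coincides with the classical value s/√n (with s computed using `ddof=1`), so on this statistic the jackknife adds little beyond the traditional calculation.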
With the jackknife method, the mathematical expectation of the mean time to failure equaled 10 090 hours. The root-mean-square deviation was 6298 hours.

Fig. 1. Histogram of data obtained through traditional calculation

Fig. 2. Histogram obtained through the jackknife calculation

Fig. 3. Distribution of mean time to failure

As we see, this method is ineffective for analyzing samples of such small size.

We now turn to the assessment of statistical characteristics obtained through the bootstrap method.

Bootstrap method. Let the sample be $(z_1, z_2, \ldots)$, and let it be required to estimate a parameter $\theta$. To do this, $N$ pseudosamples are formed from the elements of the original sample by drawing with replacement. For each pseudosample $(z^*_1, z^*_2, \ldots)_n$, $n = 1, 2, \ldots, N$, the pseudostatistic $\theta^*_n$ is calculated. The pseudostatistics $\theta^*_1, \theta^*_2, \ldots, \theta^*_N$ are ordered from smallest to largest. The quantiles $q^*_{\lambda_1}$ and $q^*_{1-\lambda_2}$ are taken to be the ordered values $\theta^*_{[N\lambda_1]}$ and $\theta^*_{[N(1-\lambda_2)+1]}$. The confidence interval is calculated on this basis.

For the available data, 10 000 pseudosamples were generated.
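The percentile construction just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function `percentile_interval`, the array `times`, the fixed seed and the choice λ1 = λ2 = 0.025 are all introduced here, and the statistic θ is taken to be the sample mean.

```python
import numpy as np

# Operating times since initial installation (hours), copied from the table
times = np.array([3707, 10520, 3707, 10520, 21993,
                  3707, 10520, 3707, 10520, 21993], dtype=float)

def percentile_interval(sample, n_boot=10_000, lam1=0.025, lam2=0.025, seed=0):
    """Percentile bootstrap interval for the mean (hypothetical helper)."""
    rng = np.random.default_rng(seed)   # fixed seed so the sketch is repeatable
    n = len(sample)
    # Each row of idx selects one pseudosample, drawn with replacement
    idx = rng.integers(0, n, size=(n_boot, n))
    # Pseudostatistics (means), ordered from smallest to largest
    theta = np.sort(sample[idx].mean(axis=1))
    # Order-statistic positions [N*lam1] and [N*(1-lam2)+1], 1-based -> 0-based
    lo = max(int(n_boot * lam1), 1) - 1
    hi = min(int(n_boot * (1 - lam2) + 1), n_boot) - 1
    return theta[lo], theta[hi], theta

lower, upper, theta = percentile_interval(times)
print(f"mean of bootstrap means: {theta.mean():.0f} h")
print(f"std of bootstrap means:  {theta.std():.0f} h")
print(f"percentile interval:     {lower:.0f} to {upper:.0f} h")
```

Unlike an interval built from the mean and a multiple of the standard deviation, the percentile interval does not assume that the bootstrap means are normally distributed; the exact numbers depend on the seed and on the number of pseudosamples.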
The code that generates the pseudosamples and computes their statistics is given below:

```python
# Jupyter magic: draw plots inside the notebook
%matplotlib inline
import random

import numpy as np
import matplotlib.pyplot as plt

# form the sample from the source data
def form_sample(source_sample):
    sample_len = len(source_sample)
    # draw sample_len elements with replacement
    res_sample = [source_sample[random.randint(0, sample_len - 1)]
                  for i in range(sample_len)]
    return res_sample

# build the array of samples to analyze
# (manometers: pandas DataFrame holding the source table)
manom = manometers.iloc[:, 2].tolist()

n = 10000
means = []
stds = []
for i in range(n):
    a = form_sample(manom)
    means.append(np.mean(a))
    stds.append(np.std(a))

hist, bins = np.histogram(means, bins=10)

print(np.mean(means))
print(np.std(means))
print(np.mean(stds))
print(np.mean(manom))
print(np.std(manom))

# histogram of the bootstrap means
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2
plt.bar(center, hist, align='center', width=width)
plt.show()
```

The resulting mean operation time equaled 10 106 hours. The root-mean-square deviation of this value equaled 2122 hours (values are rounded to integers).

Provided that this random value is normally distributed, the mean time to failure can be stated to lie between 5862 and 14 350 hours (10 106 ± 2 × 2122) at the 95 % confidence level. The distribution of this value is shown in fig. 3.

It can be stated that the mathematical expectation of the mean time to failure is close to 10 000 hours (a more precise value cannot be obtained, because the bootstrap is a probabilistic method).

Conclusion. The article examines methods of statistical analysis based mainly on computer calculations. Their advantage over the classical methods is that there is no need to adopt a hypothesis about the form of the distribution law of the random variable. They also make numerical assessment of statistical parameters possible for small data samples.

The traditional calculation of the mathematical expectation of the mean time to failure gives 10 089 hours, which is very close to the value obtained through the bootstrap method.
However, the standard deviation calculated for the original sample is 6686 hours (rounded to an integer), which is quite far from the value obtained through the bootstrap method. The bootstrap result looks more plausible, although it requires further verification. The traditional estimates are evidently less accurate: according to them, 47.5 % of the pressure gauges are either already out of order when installed in the aircraft or fail for the first time within 10 000 hours of operation.

Therefore, the bootstrap method makes it possible to obtain more adequate estimates. Among its disadvantages are its stochastic nature (in particular, the method does not provide a single point estimate of the mean time to failure, which varies slightly from run to run) and the lack of a rigorous proof of its correctness.

Further studies are needed to confirm the applicability of this method for assessing the statistical characteristics that describe the reliability of aircraft components and assemblies.

A software product is currently being developed that will allow wider application of the methods analysed in this article; it can help to solve specific problems that airlines often come across.

About the authors

D. Gerasimova

Reshetnev Siberian State University of Science and Technology

Email: Wolhidka@mail.ru
31, Krasnoyarsky Rabochy Av., Krasnoyarsk, 660037, Russian Federation

A. Sayapin

Reshetnev Siberian State University of Science and Technology

31, Krasnoyarsky Rabochy Av., Krasnoyarsk, 660037, Russian Federation

A. Palukhin

Reshetnev Siberian State University of Science and Technology

31, Krasnoyarsky Rabochy Av., Krasnoyarsk, 660037, Russian Federation

A. Katsura

Reshetnev Siberian State University of Science and Technology

31, Krasnoyarsky Rabochy Av., Krasnoyarsk, 660037, Russian Federation

References

  1. Акимова Н. Управление коммерческой эксплуатацией на основе системы показателей экономической эффективности деятельности авиакомпании : автореф. дис. … канд. экон. наук : (08.00.14). М., 2000. С. 56-101.
  2. Филипьева Т. Психологическое содержание труда бортпроводника воздушного судна гражданской авиации : автореф. дис. … канд. психол. наук : (19.00.03). М., 2006. С. 198-201.
  3. Князьков П. Анализ и обеспечение надежности воздушных судов гражданской авиации в процессе их эксплуатации : автореф дис. … канд. техн. наук : (05.22.14). СПб., 2001. С. 67-98.
  4. Efron B. and Tibshirani R. J. An Introduction to the Bootstrap. New York : Chapman & Hall, 1993. P. 338-352.
  5. Гаскаров Д. В., Шаповалов В. И. Малая выборка. М. : Статистика, 1978. 248 с.
  6. Колмогоров А. Н. Три подхода к определению понятия «количество информации» // Проблемы передачи информ. 1965. Т. 1, вып. 1. С. 3-11.
  7. Buhlmann P. Sieve bootstrap with variable-length Markov chains for stationary categorical time series (with discussions) // J. Amer. Stat. Assoc. 2002. P. 443-455.
  8. Чавчанидзе В. В., Кумсишвили В. А. Об определении законов распределения на основе малого числа наблюдений // Применение вычислительной техники для автоматизации производства : тр. совещаний 1959 г. М. : Машгиз, 1961. С. 71-75.
  9. Гузик В. Ф., Кидалов В. И., Самойленко А. П. Статистическая диагностика неравновесных объектов. СПб. : Судостроение, 2009. 304 с.
  10. Efron B. Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction. Cambridge University Press. 2012. P. 89-112.
  11. Горбунова Е. Б. Метод статистической обработки малых выборок данных в задачах прогнозирования и контроля состояния сложных систем : автореф дис. … канд. техн. наук : (05.13.01). Таганрог, 2017. C. 33-75.
  12. Davison A. C., Hinkley D. V. Bootstrap Methods and Their Application / Har Dskt edition. Cambridge University Press, 1997. P. 529-552.
  13. Devore L. J. Probability and statistics for engineering and the science. Duxbury press. 2003.
  14. Россум Г., Дрейк Ф. Л. Дж., Откидач Д. С. Язык программирования Python. 2001. 454 c.
  15. Judgment under Uncertainty: Heuristics and Biases / D. Kahneman [et al.]. 21st. Cambridge University Press, 2005. 255 p.

© Gerasimova D.S., Sayapin A.V., Palukhin A.A., Katsura A.V., 2018

This article is available under the Creative Commons Attribution 4.0 International License.