Materials data validation and imputation with an artificial neural network

  • File type: Book
  • Language: English
  • Publisher: Elsevier
  • Edition and year / country: 2018

Details

Related fields: Materials Engineering, Information Technology
Related specializations: Computer Networks
Journal: Computational Materials Science
University: University of Cambridge, Cambridge CB3 0HE, United Kingdom
DOI: https://doi.org/10.1016/j.commatsci.2018.02.002
Published in an Elsevier journal
Keywords: Materials data, Neural network, Alloys, Polymers

Description

1. Introduction

Through the Stone, Bronze, and Iron Ages, the discovery of new materials has chronicled human history. The coming of each age was sparked by the chance discovery of a new material. However, materials discovery is not the only challenge: selecting the correct material for a purpose is also crucial [1]. Materials databases curate and make available properties of a vast range of materials [2–6]. However, not all properties are known for all materials, and not all sources of data are consistent or correct, which introduces errors into the data set. To overcome these shortcomings we use an artificial neural network (ANN) to uncover and correct errors in the commercially available databases MaterialUniverse [5] and Prospector Plastics [6].

Many approaches have been developed to understand and predict materials properties, including direct experimental measurement [7], heuristic models, and first principles quantum mechanical simulations [8]. We have developed an ANN algorithm that can be trained from materials data to rapidly and robustly predict the properties of unseen materials [9]. Our approach has a unique ability to handle data sets that have incomplete data for input variables, as is typical of materials data. Such incomplete entries would usually be discarded, but the approach presented exploits them to gain deeper insights into material correlations. Furthermore, the tool can exploit the correlations between different materials properties to enhance the quality of predictions. The tool has previously been used to propose new optimal alloys [9–14], but here we use it to impute missing entries in a materials database and search for erroneous entries.

Often, material properties cannot be represented by a single number, as they are dependent on other test parameters such as temperature. They can be considered as a graphical property, for example yield stress versus temperature curves for different alloys [15].
In order to handle this type of data more efficiently, we treat the data for these graphs as vector quantities, and provide the ANN with information about that curve as a whole when operating on other quantities during the training process. This requires less data to be stored than the typical approach of regarding each point of the graph as a new material, and allows a generalized fitting procedure that is on the same footing as the rest of the model.

Our proposed framework is first tested and validated using generated exemplar data, and afterwards applied to real-world examples from the MaterialUniverse and Prospector Plastics databases. The ANN is trained on both the alloys and polymers data sets, and then used to make predictions to identify incorrect experimental measurements, which we correct using primary source data. For materials with missing data entries, for which the database provides estimates from modeling functions, we also provide predictions, and observe that our ANN results offer an improvement over the established modeling functions, while also being more robust and requiring less manual configuration.

In Section 2 of this paper, we cover in detail the novel framework that is used to develop the ANN. We compare our methodology to other approaches, and develop the algorithms for computing the outputs from the inputs, iteratively replacing missing entries, promoting graphing quantities to become vectors, and the training procedure. Section 3 focuses on validating the performance of the ANN. The behavior as a function of the number of hidden nodes is investigated, and a method of choosing the optimal number of hidden nodes is presented. The capability of the network to identify erroneous data points is explained, and a method to determine the number of erroneous points in a data set is presented. The performance of the ANN for training and running on incomplete data is validated, and tests with graphing data are performed.
Section 4 applies the ANN to real-world examples, where we train the ANN on the MaterialUniverse [5] alloy and Prospector Plastics [6] polymer databases, use the ANN's predictions to identify erroneous data, and extrapolate from experimental data to impute missing entries.
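
The two uses described above, imputing missing entries and flagging erroneous ones, can be illustrated with a minimal sketch. This is not the paper's algorithm: an ordinary least-squares model stands in for the ANN, and every name here (`fit_predict`, `iterative_impute`, `flag_suspect_entries`, the `n_sigma` threshold, the synthetic data) is hypothetical and chosen only for illustration. The sketch assumes missing values are stored as NaN and that each property can be predicted from the others.

```python
import numpy as np


def _design(X):
    """Append a constant column so the linear model has an intercept."""
    return np.column_stack([X, np.ones(len(X))])


def fit_predict(X_train, y_train, X_new):
    """Hypothetical stand-in for the ANN: ordinary least squares."""
    w, *_ = np.linalg.lstsq(_design(X_train), y_train, rcond=None)
    return _design(X_new) @ w


def iterative_impute(data, n_iter=10):
    """Fill NaN entries by repeatedly predicting each column from the others."""
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    # Initial guess: the mean of the observed values in each column.
    col_means = np.nanmean(data, axis=0)
    filled[missing] = col_means[np.nonzero(missing)[1]]
    for _ in range(n_iter):
        for j in range(filled.shape[1]):
            rows = missing[:, j]
            if not rows.any():
                continue
            others = np.delete(filled, j, axis=1)
            # Train on rows where column j was measured, then re-predict its gaps.
            filled[rows, j] = fit_predict(others[~rows], filled[~rows, j], others[rows])
    return filled


def flag_suspect_entries(data, column, n_sigma=3.0):
    """Return row indices where `column` deviates from its prediction
    by more than n_sigma standard deviations of the residuals."""
    data = np.asarray(data, dtype=float)
    others = np.delete(data, column, axis=1)
    residual = data[:, column] - fit_predict(others, data[:, column], others)
    return np.nonzero(np.abs(residual) > n_sigma * residual.std())[0]


# Synthetic "database": the third property depends linearly on the first two.
rng = np.random.default_rng(0)
x0, x1 = rng.normal(size=(2, 200))
truth = np.column_stack([x0, x1, 2.0 * x0 - x1 + 1.0])

# Imputation: hide some entries of the third property and fill them back in.
data = truth.copy()
data[:20, 2] = np.nan
filled = iterative_impute(data)

# Validation: corrupt one entry and ask which rows look inconsistent.
corrupted = truth.copy()
corrupted[5, 2] += 50.0
suspects = flag_suspect_entries(corrupted, column=2)
```

The repeated passes in `iterative_impute` matter when several properties have gaps, because each pass re-predicts a column using the latest estimates of the others. A real application would replace `fit_predict` with a trained nonlinear model and choose the flagging threshold from the data, as the validation study in the paper's Section 3 is said to do.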