are described. At the end of the section, the overall performance of the two combined estimation approaches is presented. The outcomes are compared with the femur configuration obtained from manually marked keypoints.

Appl. Sci. 2021, 11, 10 of

3.1. PS Estimation

As a result of training more than 200 networks with various architectures, the one ensuring the minimum loss function value (7) was selected. The network architecture is presented in Figure 8. The optimal CNN architecture [26] consists of 15 layers, 10 of which are convolutional. The size of the last layer represents the number of network outputs, i.e., the coordinates of keypoints k1, k2, k3.

Figure 8. The optimal CNN architecture. Each rectangle represents one layer of the CNN. The following colors are used to distinguish the essential elements of the network: blue (fully connected layer), green (activation functions, where HS stands for hard sigmoid and LR denotes leaky ReLU), pink (convolution), purple (pooling), white (batch normalization), and yellow (dropout).

After 94 epochs of training, the early stopping rule was met and the learning procedure was terminated. The loss function of the development set was equal to 8.45 px². The results for all learning sets are gathered in Table 2.

Table 2. CNN loss function (7) values for different learning sets.

Learning Set    Proposed Solution    U-Net [23] (with Heatmaps)
Train           7.92 px²             9.04 px²
Development     8.45 px²             10.31 px²
Test            6.57 px²             6.43 px²

Loss function values for all learning sets are within an acceptable range, given the overall complexity of the assigned task. The performance was slightly better for the train set compared to the development set. This feature commonly correlates with overfitting of the train data.
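The definition of loss (7) is not restated in this section; assuming it is the mean squared pixel distance between predicted and ground-truth keypoint coordinates (consistent with the px² units reported in Table 2), it can be sketched as follows. The function name and array shapes are illustrative, not taken from the paper:

```python
import numpy as np

def keypoint_loss_px2(pred: np.ndarray, true: np.ndarray) -> float:
    """Mean squared pixel error between predicted and ground-truth
    keypoint coordinates, in px^2.

    pred, true: arrays of shape (n_images, n_keypoints, 2) holding the
    (x, y) pixel coordinates of keypoints such as k1, k2, k3.
    """
    sq_err = np.sum((pred - true) ** 2, axis=-1)  # squared distance per keypoint
    return float(np.mean(sq_err))                 # averaged over keypoints and images

# Toy check: every prediction is off by 2 px in x and 1 px in y,
# so the loss equals 2^2 + 1^2 = 5 px^2.
true = np.zeros((4, 3, 2))
pred = true + np.array([2.0, 1.0])
print(keypoint_loss_px2(pred, true))  # → 5.0
```

Under this reading, the Table 2 values of roughly 7–10 px² correspond to an average keypoint localization error on the order of 3 px.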
Fortunately, the low test set loss function value confirmed that the network performs accurately on previously unknown data. Interestingly, the test set achieved the lowest loss function value, which is not common for CNNs. There may be several causes for that. First, the X-ray images used during training had a slightly different distribution than those in the test set. The train set consisted of images of children varying in age and, consequently, of a different knee joint ossification level, whereas the test set included adult X-rays. Second, the train and development sets were augmented using common image transformations to constitute a valid CNN learning set (as described in Table 1). The corresponding loss function values in Table 2 are calculated for the augmented sets. Some of the randomly chosen image transformations resulted in high-contrast, close-to-binary images. Consequently, those images were validated with high loss function values, influencing the overall performance on the set. On the other hand, the test set was not augmented, i.e., the X-ray images were not transformed before validation. The optimization of the hyperparameters of the CNN, as described in Appendix A, improved the process of network architecture tuning, in terms of both processing time and low loss function value (7). The optimal network architecture (optimal in the sense of minimizing the assumed criterion (7)) consists of layers with various window sizes, for both convolution and pooling. This is not consistent with the widely popular heuristic of small window sizes [33]. In this particular case, small window sizes in the CNN resulted in a higher loss function or exceeded the maximum network size limit.
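To illustrate the augmentation effect described above, a randomly parameterized contrast transform can push an 8-bit radiograph toward a near-binary image. The sketch below is a generic example under that assumption, not the paper's actual augmentation pipeline; the function name and gain range are hypothetical:

```python
import numpy as np

def random_contrast(img: np.ndarray, rng: np.random.Generator,
                    gain_range: tuple = (0.5, 3.0)) -> np.ndarray:
    """Rescale intensities around the image mean by a random gain and
    clip to the 8-bit range. Gains near the upper bound saturate most
    pixels at 0 or 255, producing the near-binary images mentioned in
    the text as a source of inflated loss values on augmented sets."""
    gain = rng.uniform(*gain_range)
    mean = img.mean()
    return np.clip((img - mean) * gain + mean, 0.0, 255.0)

rng = np.random.default_rng(42)
img = np.linspace(0, 255, 64 * 64).reshape(64, 64)  # synthetic gradient stand-in for an X-ray
out = random_contrast(img, rng)
print(out.shape, out.min(), out.max())
```

Because the test set skipped such transformations, its images never exhibit this artificially inflated contrast, which is consistent with the lower test loss reported in Table 2.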