
In this case, given the scarcity of the data, we carried out a k-fold cross-validation process with 5 folds. The images of the dataset were randomly distributed into 5 sections and, starting from the pre-trained model, 4 of these sections were used to re-train the network while the remaining one was held out for testing. This process was repeated 5 times, rotating the sections (an illustrative code sketch of this loop is given at the end of Section 2.2).

Moreover, data augmentation was performed during the training of the CNN, as another way of mitigating the problems of limited data. This means that subtle random transformations were applied to the training images (tweaks in color or size, rotations, translations…), so that they differ every time and the model does not suffer from overfitting (17). Presenting varying images, instead of the same images over and over again, helps the model learn to ignore differences caused by illumination or perspective and to focus on the important features that define the class shown in the image. For this work in particular, we tweaked the image orientation and included random shifts in height and width (also sketched at the end of Section 2.2).

2.2.2 Explainability methods

To enhance the interpretability of the CNN, and therefore enable a better evaluation of its functioning, a post-hoc explainability method was used. Techniques of this type, which fall within the field of XAI, allow explanations to be obtained for the predictions of a model by visualizing its inner workings, by building simplified surrogate models that are more understandable, or by perturbing the inputs and predictions to identify important features (18). For CNNs and other image analysis models, the most popular methods are currently those that produce some kind of heat map showing which parts of the image are most influential in the final outcome of the network, or where the network's attention is focused. In particular, the method of choice for this work is Grad-CAM (19), a technique that visualizes the last convolutional layer of the CNN.

Usually, while the first layers of these networks recognise simple features, the deepest layers identify the most complex concepts, and the final one leads directly to the classification by transferring its values to a traditional shallow neural network classifier. Moreover, the position of the artificial neurons inside the convolutional layers is directly related to the original position of the pixels in the image. These two properties imply that, by visualizing the values produced by the artificial neurons of the last convolutional layer, a heat map can be generated indicating the regions of the image that are important for predicting the chosen class. To do so, the classifier's output score for a particular class is differentiated with respect to these raw values from the convolutional layer to obtain a series of importance weights. These weights are then used to weight and pool those same values from the last layer of the CNN, yielding a class-specific map in which higher values mean more influence on that prediction (a sketch of this computation follows below).
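The following is a minimal sketch of the 5-fold cross-validation loop described above; it is not the authors' code, and `build_pretrained_model`, `images` and `labels` are hypothetical placeholders (scikit-learn and Keras are assumed for illustration).

```python
import numpy as np
from sklearn.model_selection import KFold

# 5 folds, shuffled once so each image lands in exactly one test section.
kfold = KFold(n_splits=5, shuffle=True, random_state=42)

fold_scores = []
for train_idx, test_idx in kfold.split(images):
    # Re-train a fresh copy of the pre-trained CNN on 4 of the 5 sections...
    model = build_pretrained_model()
    model.fit(images[train_idx], labels[train_idx], epochs=10)
    # ...and evaluate on the section that was held out for testing
    # (assumes the model was compiled with an accuracy metric).
    _, accuracy = model.evaluate(images[test_idx], labels[test_idx])
    fold_scores.append(accuracy)

print(f"Mean accuracy over the 5 folds: {np.mean(fold_scores):.3f}")
```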
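The augmentation used in this work (orientation tweaks plus random height and width shifts) could be expressed, for instance, with Keras' `ImageDataGenerator`; the framework and the exact parameter values below are assumptions, not taken from the paper.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=20,       # random tweaks to the image orientation
    width_shift_range=0.1,   # random horizontal shifts (fraction of width)
    height_shift_range=0.1,  # random vertical shifts (fraction of height)
)

# `x_train` and `y_train` are placeholders. flow() applies fresh random
# transformations every epoch, so the network never sees the exact same
# image twice, which helps against overfitting.
train_batches = augmenter.flow(x_train, y_train, batch_size=32)
```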
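Finally, a compact sketch of the Grad-CAM computation just described, assuming a functional `tf.keras` model; the function and argument names are illustrative, and details such as the normalisation step are our assumptions rather than the authors' implementation.

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index):
    # Sub-model returning both the last conv activations and the predictions.
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        # `image` is a single HxWxC array; add a batch dimension.
        conv_out, preds = grad_model(image[np.newaxis, ...])
        class_score = preds[:, class_index]
    # Differentiate the class score w.r.t. the conv layer's raw values.
    grads = tape.gradient(class_score, conv_out)
    # One importance weight per channel: the global average of its gradients.
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    # Weight and pool the activation maps, keeping positive influence only.
    heatmap = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    # Normalise to [0, 1] so the map can be overlaid on the input image.
    return (heatmap / (tf.reduce_max(heatmap) + 1e-8)).numpy()
```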
Figure 1. Schematic representation of a convolutional neural network.

2.3. Data collection and annotation web tool

As an addition to the deep learning model, it was decided to develop a web tool to enable further data collection for this project. This tool consists of a web service, accessed through a web page, where clinicians can register as users and then upload images, labelling them with the condition shown and adding further data if necessary (a purely hypothetical sketch of such a service is given below).
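As an illustration only, a minimal upload endpoint for such a service might look as follows; the paper does not specify the framework, so Flask and every name below are assumptions.

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/upload", methods=["POST"])
def upload_labelled_image():
    # A registered clinician sends an image plus its diagnostic label.
    image = request.files["image"]
    condition = request.form["condition"]
    # Store the image under its label for later use as training data.
    image.save(f"uploads/{condition}_{image.filename}")
    return {"status": "stored", "condition": condition}, 201

if __name__ == "__main__":
    app.run()
```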
