Description
This paper presents a comprehensive performance evaluation of several AI architectures for the classification of holes drilled in melamine-faced chipboard: a custom-designed Convolutional Neural Network (CNN-designed), a five-fold version of that CNN, VGG19, single and five-fold VGG16, an ensemble combining CNN-designed, VGG19, and 5xVGG16, and Vision Transformers (ViT). The models were compared on classification accuracy, with the Vision Transformers, particularly the B_32 model trained for 8000 epochs, performing best at 71.14%.
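Although the paper itself is not accompanied by code, a minimal sketch of the kind of setup it describes, fine-tuning a ViT-B/32 classifier head and soft-voting over an ensemble of trained models, might look as follows in PyTorch/torchvision. The class count, the choice of pretrained weights, and the `ensemble_predict` helper are illustrative assumptions, not details taken from the study.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_32, ViT_B_32_Weights

NUM_CLASSES = 3  # assumption: placeholder for the paper's hole-quality classes

# Load an ImageNet-pretrained ViT-B/32 backbone and replace its
# classification head with one sized for the drilled-hole classes.
model = vit_b_32(weights=ViT_B_32_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

# Soft-voting ensemble: average the softmax outputs of several trained
# models (e.g. the custom CNN, VGG19, and five VGG16 folds) and take
# the most probable class per image.
@torch.no_grad()
def ensemble_predict(models: list[nn.Module], images: torch.Tensor) -> torch.Tensor:
    probs = torch.stack([m(images).softmax(dim=-1) for m in models])
    return probs.mean(dim=0).argmax(dim=-1)
```

Soft voting is only one common way to combine heterogeneous models; the ensembling scheme actually used in the study may differ.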
Despite this achievement, the study underscores the need to balance model performance against other considerations such as computational resources, model complexity, and training time. The results highlight the importance of careful model selection and fine-tuning, guided not only by performance metrics but also by the specific requirements and constraints of the task and its context. The study provides a strong foundation for further exploration of other transformer-based models and encourages deeper investigation into model fine-tuning to harness the full potential of these AI architectures for image classification tasks.