AI and Artworks: Object Detection, Image Classification and Iconographic Analysis
Seminar room 1&2
Postsparkasse
AI is transforming how we approach images of the past and present. This workshop is dedicated to the subject of AI-supported object detection, image classification and iconographic analysis, with a focus on drawn, painted and printed content.
We invite proposals for contributions that draw on large- and small-scale datasets and explore the current possibilities and limitations of AI in this area. Potential topics include but are not limited to:
· How AI is currently employed in art historical research and writing to detect, classify and analyse artworks
· Challenges surrounding pretrained models (mostly trained on modern photographic material) and the creation of new training data and models
· Controlled vocabularies and their application in AI-based art historical research
· The development of AI tools specifically tailored to art history
· Ethical and legal considerations surrounding the use of AI in art historical research.
The event is jointly organised by Doris Gruber, head of the Art History Research Unit of the Institute for Habsburg and Balkan Studies (IHB), Maria Theisen, head of the Department for Paleography and Codicology of the Institute for Medieval Research (IMAFO), and the thematic platform Machine Learning (MLA2S) at the Austrian Academy of Sciences in Vienna, Austria.
Confirmed keynote speakers:
· Peter Bell (Marburg University)
· Etienne Posthumus (FIZ Karlsruhe) & Hans Brandhorst (Iconclass)
The call for papers is now closed.
Registration for in-person attendance is now open. Participation is free of charge. However, due to space limitations, guest participation is limited. Seats are allocated on a first come, first served basis.
Online participation: To join online, please follow this link. No registration is needed. In case of problems with online participation, please contact MyrjamHelena.Raich@oeaw.ac.at.
Figure: Hieronymus Bosch, The Tree-Man, detail, ca. 1500, Albertina, Vienna.

-
08:45
Arrival and Registration
-
Introduction
-
Models
-
1
Meta-Acervos
In this talk we'll present our methodology and prototype for a meta-collection system called Meta-Acervos that uses different AI models and computer vision techniques to recombine existing archive metadata. We’ll share the challenges we faced when using pre-trained models and how we addressed questions of vocabulary and legibility in the art history field.
This work was carried out within the context of a digital humanities research group (Digital Collections and Archives) that aims to create strategies for visibility of underrepresented narratives.
The current interface focuses on combining and expanding information from 17 Brazilian museum collections available in public datasets and APIs such as Wikimedia and the Brasiliana Museus project, covering 4,000 artworks from the 13th to the 21st century.
Our methodology makes use of open-source multimodal models like CLIP, SigLIP and OWLv2 to annotate, cluster and classify paintings and drawings according to their visual and semantic characteristics.
The augmented data and interface not only allow for new ways of searching and organizing the collections, but also enable new ways of visualizing their content, like creating composite grids with every instance of particular objects found in the artworks. Another possibility is to place cropped objects on a blank canvas, sized and positioned relative to their original location in the artworks.
This can be used to visually explore aggregate characteristics and patterns in groupings like: all the hands in religious paintings, or all the hands and faces from artworks in a specific collection, or all of the palm trees extracted from paintings created within a given period:

Hands in religious paintings
Faces in portraits
Palm trees in the 1800s

Nevertheless, the diversity of the profiles of the collections available in Meta-Acervos leads to significant ambivalences in the use of artificial intelligence for the treatment of museum collections and, consequently, within the field of art history. On one hand, the system, based on computer vision tools, enables image-based searches that allow combinations (by color, shape, and internal visual elements such as flora and fauna) that break with the traditional canons of art-historical databases (author, date, and style). These features suggest speculative and exploratory curatorial approaches that bring forth alternative aesthetic and historical narratives. On the other hand, the results of the visualization and search filters reveal the fragility of AI models when dealing with contemporary artworks (most of which are non-figurative) and the weight of biases in the interpretation of images depicting Black and Indigenous people, thereby reinforcing stereotypes rooted in conservative historiographies.
However, visualization resources—such as the latent space and the distribution of artworks over time—are tools that reveal historiographical layers absent from online digital archives. The visualization of the latent space, for instance, makes explicit the ways in which AI models have systematized information, functioning as an invisible map of the emphases and priorities established by machine learning. In turn, the timeline allows users to understand the dynamics of acquisition and the patterns of interest that have shaped the institutions featured in Meta-Acervos.
Code and extracted metadata are available on GitHub, and a public interface for navigating the results is available here.
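As a minimal illustration of the kind of embedding-based retrieval described above (a toy sketch with made-up vectors and titles, not the Meta-Acervos pipeline), artworks can be ranked by the cosine similarity of the image embeddings a CLIP-style model would produce:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def most_similar(query, catalogue, k=2):
    """Rank catalogue entries by cosine similarity to the query embedding."""
    ranked = sorted(catalogue.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [title for title, _ in ranked[:k]]

# Hypothetical 3-dimensional stand-ins for real CLIP embeddings.
embeddings = {
    "religious painting A": [0.9, 0.1, 0.0],
    "religious painting B": [0.8, 0.2, 0.1],
    "abstract work C":      [0.0, 0.1, 0.9],
}
print(most_similar([1.0, 0.0, 0.0], embeddings))
# → ['religious painting A', 'religious painting B']
```

In practice the same ranking step runs over thousands of high-dimensional embeddings, and the top matches feed the composite grids and canvas views mentioned above.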
Speakers: Thiago Hersan (Parsons School of Design), Dr Giselle Beiguelman (University of São Paulo), Dr Ana Gonçalves Magalhães (University of São Paulo) -
2
A Benchmark Dilemma: Between Representativeness and Domain-Specificity
When is a machine learning model performing well? From a computer science perspective, this question can be answered quite simply with evaluation metrics. A statistically well-performing model forms the basis for any further research, in real-world as well as humanities domains. However, domain-specific tasks and in-depth case studies, such as those in digital art history, often reveal specific biases and shortcomings, especially with pre-trained models. Therefore, additional domain-specific benchmarking is often necessary to evaluate a model's performance. And suddenly, performing well becomes not just a statistical problem but a humanities problem as well. How representative is the dataset for art-historical research questions? What role does the canonicity of certain painters play in the training data as well as in the benchmarking dataset? How generalizable are the features of a domain as heterogeneous as art, especially in more nuanced subsets?
In this submission, we present a systematic benchmarking of various pre-trained state-of-the-art transformer models for the visual arts domain. Multimodal vision-language models such as CLIP have previously shown good performance across natural-image domain tasks. This motivates the need for a systematic evaluation of their usability in the visual arts domain. Addressing the gap between statistics and art history, this is accompanied by a discussion of dataset- and architecture-specific challenges. With large, deeply annotated datasets lacking, the evaluation task often boils down to available metadata such as style, genre and artist classification, often neglecting nuanced differences between sub-genres or artists and opting for the well-documented information already obvious to art historians. Additionally, the quality of image data may also affect the quality of a benchmark. Museum databases, for example, often offer higher-quality images than what is presented in large datasets such as WikiArt. Furthermore, the transformer model architecture adds additional challenges, such as subject-specific terms in image descriptions. We are going to address this problem with domain-specific tasks, proposing a framework for additional evaluation on a more qualitative, art-historical level.
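The tension between aggregate metrics and canon bias can be made concrete with a toy example (hypothetical labels, not the submission's actual benchmark): a classifier that effectively only ever predicts the canonical, over-represented artist still posts a respectable overall accuracy, while the rare class fails silently.

```python
from collections import defaultdict

def overall_and_per_class_accuracy(y_true, y_pred):
    """Return overall accuracy plus a per-class accuracy breakdown."""
    correct = 0
    per_class = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for t, p in zip(y_true, y_pred):
        per_class[t][1] += 1
        if t == p:
            per_class[t][0] += 1
            correct += 1
    overall = correct / len(y_true)
    per = {label: c / n for label, (c, n) in per_class.items()}
    return overall, per

# Invented example: 8 works by a canonical painter, 2 by a rare one.
y_true = ["Monet"] * 8 + ["Šubic"] * 2
y_pred = ["Monet"] * 8 + ["Monet"] * 2   # the rare artist is never predicted
overall, per = overall_and_per_class_accuracy(y_true, y_pred)
print(overall)        # 0.8 looks respectable...
print(per["Šubic"])   # ...but the under-represented class scores 0.0
```

This is one reason the per-class, humanities-informed breakdown the authors call for matters: the single headline number hides exactly the canonicity effects under discussion.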
Speakers: Dr Marta Kipke (Center for Humanities Computing, Aarhus University), Louise Brix Pilegaard Hansen (Center for Humanities Computing, Aarhus University) -
3
Learning from Small Data: Adapting Pretrained Diffusion Models for 17th-Century Painting
Most AI models used in art-historical analysis or image generation are trained on large photographic datasets whose statistical structure differs fundamentally from painted images. This raises a key methodological problem: how can such pre-trained models be adapted to small, historically specific corpora while retaining interpretive reliability?
This paper presents results from the ongoing project ARTofAI, which investigates the integration of generative diffusion models into art-historical research. Focusing on the 17th-century painter Almanach, active in Carniola, we fine-tuned Stable Diffusion using Low-Rank Adaptation (LoRA) on a dataset of 99 image fragments derived from four preserved works. Each fragment was paired with one of four captioning strategies – simple keywords, expert art-historical descriptions, CLIP-generated tags, and Vision-Language Model (VLM) sentences – to examine how different textual inputs affect what the model learns.
The experiments show that while LoRA successfully transfers stylistic surface features such as palette, texture, and brushwork, caption variation has minimal impact on the visual outcome. Models trained on all caption types produced comparable stylistic results, but none reproduced the compositional or iconographic coherence characteristic of historical painting. These findings highlight three broader challenges central to AI-based art history:
1. The domain gap between pre-trained photographic models and historical pictorial data;
2. The need for controlled vocabularies that translate art-historical categories into machine-readable form;
3. The importance of lightweight, transparent methods enabling art historians to experiment with fine-tuning on small, curated datasets.

Rather than aiming at reconstruction or authenticity, this study treats fine-tuning as a diagnostic tool for assessing what diffusion models can and cannot learn from historical material – an approach that informs the development of AI methods genuinely adapted to art-historical inquiry.
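Why LoRA suits a 99-fragment corpus can be shown with a back-of-the-envelope calculation (generic layer sizes, not Stable Diffusion's actual dimensions): instead of updating a full weight matrix W of shape d x k, LoRA trains a low-rank update B @ A with B of shape d x r and A of shape r x k, so W' = W + BA.

```python
def full_params(d, k):
    """Trainable weights when fine-tuning the full d x k matrix."""
    return d * k

def lora_params(d, k, r):
    """Trainable weights for a rank-r LoRA update: B (d x r) plus A (r x k)."""
    return d * r + r * k

# Hypothetical layer size and a typical small rank.
d, k, r = 768, 768, 4
print(full_params(d, k))                         # 589824 weights
print(lora_params(d, k, r))                      # 6144 trainable weights
print(lora_params(d, k, r) / full_params(d, k))  # ~0.01, i.e. ~1 % of the layer
```

Training roughly one percent of each adapted layer is what makes fine-tuning on a small, curated corpus tractable and keeps the experiment lightweight and transparent.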
Speakers: Dr Katarina Mohar (Research Centre of the Slovenian Academy of Sciences and Arts, ZRC SAZU), Dr Rok Vrabič (University of Ljubljana)
-
11:00
Coffee Break
-
Augmented Data
-
4
Beyond Vision: Metadata-Augmented AI for Iconclass Classification in Medieval Manuscripts
This paper presents a practical approach to applying the Iconclass system to medieval manuscript imagery by combining image analysis with existing textual metadata. Building on our earlier large-language-model (LLM) pipeline for assigning Iconclass codes to early modern woodcuts, we extend the method to the Wenzelsbibel, a richly illuminated fourteenth-century German Bible.
The project tests how combining image data with existing metadata, such as image captions, editorial notes, and TEI transcriptions, can improve automated iconographic classification. Each miniature is processed through a Retrieval-Augmented Generation (RAG) workflow in which an LLM generates a description informed by both the image and its associated texts. These descriptions are then used to retrieve candidate Iconclass entries from a vector database. The model selects the best match, taking into account both specific and broader correspondences within Iconclass’s hierarchy. Our evaluation framework explicitly values partial matches, recognising that identifying a correct parent category (e.g. “Story of David and Goliath”) is still highly meaningful for cataloguing and search.
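The partial-match idea can be sketched as a simple prefix-based score over hierarchical notations (the codes below are placeholders, and a production metric would respect Iconclass's notation segments rather than raw characters):

```python
def partial_credit(gold: str, predicted: str) -> float:
    """Fraction of the gold notation matched by the shared prefix.
    A prediction landing on a parent of the gold code earns partial credit."""
    common = 0
    for g, p in zip(gold, predicted):
        if g != p:
            break
        common += 1
    return common / len(gold)

print(partial_credit("71H14", "71H14"))  # exact match -> 1.0
print(partial_credit("71H14", "71H"))    # correct parent category -> 0.6
print(partial_credit("71H14", "48A9"))   # unrelated branch -> 0.0
```

This captures the evaluation stance described above: identifying the correct parent category is rewarded rather than scored as a flat failure.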
We also test text-only variants, using captions or descriptions without the image, to explore whether language alone can yield reliable iconographic assignments—a scenario relevant for digitized collections that already possess textual metadata but lack visual embeddings.
The broader aim is pragmatic as well as scholarly: to demonstrate a scalable, low-resource method that heritage institutions can use to enrich or standardize metadata without extensive manual annotation or machine-learning training. For collections with descriptive records but no controlled vocabulary, the pipeline offers a practical route to apply Iconclass consistently and transparently.
By integrating vision, language, and metadata, the project shows how AI can support rather than replace art-historical expertise, providing both a proof of concept for the Wenzelsbibel and a transferable model for iconographic description across collections.

Speakers: Dr Drew Thomas (University of Salzburg), Ms Julia Hintersteiner (University of Salzburg) -
5
Digital Iconology. Classification and Association in Visual Knowledge Systems
Almost every art historian has heard of Iconclass. Less well known is that its creator, Henri van de Waal (1910–1972), worked for the greater part of his life on another classification of the arts, titled “Beeldleer” (Iconology). It was intended as a tool for “iconological exploration”, for mapping still uncharted territories of the arts. In line with Aby Warburg and his followers, Van de Waal rejected the common rigid categorization of art by genre and period and instead proposed new, more associative and cross-cultural ways of arranging images, blurring the boundaries between disciplines such as art history, philosophy, and anthropology.
The claim is put forward that linking Van de Waal’s unfinished classification of the arts to Iconclass and potentially future ontologies in the domain of the digital humanities will enhance the quality of the classification of results in computer vision experimentation. The potential of Iconclass for describing such experimental results, foremost those based on figurative Western art, has already been outlined in digital art historical studies. In this paper the additional value of linking to Beeldleer (Iconology) for classifying cross-cultural, non-figurative and more syntactic aspects of the arts will be explored as well. Similar to the automated, step-by-step approach of the deep learning underlying computer vision, which recognizes features and patterns in data that are more meaningful than others, our human reading of the produced results is a gradual process of understanding images in the context of our knowledge. Therefore, we propose associative systems that allow for the pre-classification, classification and post-classification (contextualization) of images in order to compare, to benchmark and to contextualize (intermediate) results of computer vision in a meaningful way. A beta version of such an associative system will be presented.
Speakers: Charles van den Heuvel (University of Amsterdam / Huygens Institute), Etienne Posthumus (FIZ Karlsruhe), Hans Brandhorst (ICONCLASS) -
6
Exploring AI Approaches on Image Data Mining in Collections of the Austrian National Library
“Reading Images, Writing Metadata” is an ongoing project of the Austrian National Library (ONB) that aims to enrich metadata by applying various computer vision techniques, including AI models and machine learning, to a diverse collection of graphics and images.
The pictures and graphics available in the online portal ONB Digital are to be made more accessible through automatic object detection and classification, enhancing general retrievability via search. The metadata generated in this way will be published and made available for other research applications in the future. Furthermore, images in the digitized book collection ABO (Austrian Books Online) and in ANNO (Austrian Newspapers Online) are to be identified and extracted using AI, expanding the collection of searchable items. Digitized images with metadata enriched through object detection will enable users to find similar images. These milestones will give users new approaches to the different collections of the ONB, foster serendipitous exploration, and enhance fundamental research on the use of digitized artworks in different contexts.
In a first and already completed step, various models and classification systems like ICONCLASS were tested for their robustness with regard to the diversity of art styles and iconographic content in the testing datasets. Even though these datasets do not contain artworks in the conventional sense of the word, the challenges faced by the project team – such as depth(s) and degree(s) of description, the librarians’ expectations vs. the models’ capabilities, ontological borders between object detection and contextual classification, as well as implementation in (library) interfaces to ultimately benefit different user groups – are inherently transferable to other domains and therefore well suited to a broader discussion with peers during this conference. The presentation will briefly explain the institutional background of the project, then focus on the main questions and decisions made so far, before finally giving an outlook on the tasks and challenges ahead.
Speakers: Carla Maria Schnedlitz (Austrian National Library), Christoph Steindl (Austrian National Library), Johannes Knüchel (Austrian National Library), Simon Mayer (Austrian National Library)
-
13:00
Lunch Break
-
Layout
-
7
Computational Analysis of Medieval Pen Flourishing
A particular form of medieval book decoration is so-called pen flourishing, a term describing delicate penwork with floral and geometric motifs. Pen flourishing typically appears in decorated initials inserted, usually in red and blue, after the main text had been copied. As book production became increasingly specialized in the later Middle Ages, this task was performed by rubricators, calligraphers, gilders, or painters. Such initials carried no narrative function but served as visual anchor points within the text and allowed for a wide range of ornamental expression. Pen flourishing is used by art historians as a marker of individual hands, regional styles, and workshop networks.
USTP and the research center of Klosterneuburg abbey have teamed up to investigate computational analysis of medieval pen flourishing. We are investigating methods to analyze large amounts of material with the use of digital tools and thus to raise new questions on sources from medieval libraries that have been largely unexplored so far.
Our goal is to develop a prototype that can be used to support art historical research in the processing of mass sources. The progress so far is a transparent method to estimate pen flourishing similarity based on local patterns [1]. Currently, we are developing an interactive tool allowing for two tasks: a) to travel through our training data corpus alongside the similarity of individual pen work as well as local patterns and b) to upload pen flourishing and retrieve similar pen flourishing and local patterns. At the conference, we will be presenting our similarity estimation method [1] and demonstrating a functional prototype of our interactive tool.

[1] Florian Kibler, Monica Apellaniz-Portos, Max Theisen, Victor-Adriel De-Jesus-Oliveira, Martin Haltrich, Matthias Zeppelzauer, and Markus Seidl. 2025. Transparent Similarity Estimation of Medieval Pen Flourishing via Local Visual Patterns. In Proceedings of the 7th International Workshop on analySis, Understanding and proMotion of heritAge Contents (SUMAC '25). Association for Computing Machinery, New York, NY, USA, 3–11. https://doi.org/10.1145/3746273.3760203
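As a purely illustrative reduction of similarity estimation from local visual patterns (a toy sketch, not the method published in [1]): a binarized flourish can be summarized as a histogram of small pixel patches and two flourishes compared by the cosine similarity of their histograms.

```python
from math import sqrt

def pattern_histogram(img):
    """Histogram of 2x2 binary patches, each encoded as a 4-bit pattern code."""
    hist = [0] * 16
    for y in range(len(img) - 1):
        for x in range(len(img[0]) - 1):
            code = (img[y][x] << 3) | (img[y][x + 1] << 2) | \
                   (img[y + 1][x] << 1) | img[y + 1][x + 1]
            hist[code] += 1
    return hist

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

# Tiny made-up binarized strokes (1 = ink, 0 = parchment).
flourish_a = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]   # cross-like stroke
flourish_b = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]   # plain vertical stroke
print(cosine(pattern_histogram(flourish_a), pattern_histogram(flourish_b)))
# → 0.0 (these two toy strokes share no 2x2 local pattern)
```

A histogram over local patterns keeps the comparison transparent: one can inspect exactly which patterns two flourishes do or do not share, which is the spirit of the interactive tool described above.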
Speakers: Markus Seidl (University of Applied Sciences St. Pölten), Florian Kibler (University of Applied Sciences St. Pölten), Martin Haltrich (Research Center Stift Klosterneuburg), Max Theisen (Research Center Stift Klosterneuburg), Victor-Adriel De-Jesus-Oliveira (University of Applied Sciences St. Pölten) -
8
AI and Automatic Visual Recognition: Some Thoughts on the New Digital Methodologies for Image Retrieval
AI-driven methodologies can transform the study of digital images. Focusing on the Lyon16ci project, which catalogs over 10,000 printed illustrations from Lyon (1480–1600), the paper explores the potential of automatic image recognition software, such as Imagematching (VGG, Oxford), to detect varying degrees of visual similarity across large iconographic corpora.
By integrating these tools with established iconographic indexing systems like the Warburg Institute Iconographic Database and ICONCLASS, the paper will discuss the opportunities and limitations of using AI-based software for art historical research, also presenting the challenges related to the long-term sustainability of these digital projects. The new digital art historical approach developed during projects such as "The Illustrated Book in Lyon" (CNRS, Equipex Biblissima) testifies to the new possibilities for the analysis of visual material, not only facilitating large-scale iconographic analysis but also encouraging a critical reflection on the methodological, ethical, and epistemological implications of employing AI in art history.

Speaker: Barbara Tramelli (Freie Universität Bozen) -
9
Large-scale Study of Text-Image Layout in 20th-Century Periodicals – As a Reaction to the Absence of Archival Documents
The literary, philological and historical study of illustrated magazines encounters numerous problems, especially when it comes to nineteenth-century or early twentieth-century periodicals. The main problem is due to the scarcity of preserved editorial archives and correspondence between publishers and writers/illustrators, documents that could record the production processes and editorial practices that guided the construction of the periodicals themselves.
The core idea behind this proposal rests on the observation that, even where editorial archives do not survive, much, we suggest, can be deduced from the periodicals themselves. For this reason, the research we wish to bring to your attention aims to study the mutation and/or persistence of layout within individual illustrated magazine issues: the recognition of recurring or discontinuous templates (i.e. “similar layouts”) could indeed provide philological data on the material history of periodicals – in particular on the evolution of the text-image relationship and the visual pervasiveness of advertising – and suggest deductions regarding editorial practices, such as the recognition of an “editorial line” instead of an “authorial intent” in the display of images. This could allow us to conjecture the type of relationship between writer(s), publisher(s) and illustrator(s), but also to trace the presence of a more or less strong “author’s will” rather than a “publisher’s will”. In other words, the study of periodical layouts and templates could offer a sideways glance at the backstage of editorial practices.
We intend to study this on a selected dataset comprising several early 20th-century Italian magazines published in Turin, considered as case studies. The layout analysis is performed using a pretrained segmentation-based model that partitions images, text and captions with bounding boxes. The dimensions, position and number of the identified bounding boxes are then evaluated. From this data, the percentage of space occupied by images within each page is computed. Together with the number of images on the page, this serves as one of the two main variables used to evaluate similarities between page layouts. In this way, similar page layouts will be highlighted, providing explainability of their similarity based on quantitative features. Moreover, an attempt will be made to label the clusters of similar pages with specific templates.
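The two-variable comparison described above can be sketched as follows (illustrative page sizes and bounding boxes, not the project's data): each page is reduced to (share of page area covered by images, number of images), and pages are matched by distance in that feature space.

```python
from math import hypot

def page_features(page_w, page_h, image_boxes):
    """image_boxes: list of (x, y, w, h) bounding boxes from layout analysis.
    Returns (fraction of page area covered by images, number of images)."""
    covered = sum(w * h for _, _, w, h in image_boxes)
    return covered / (page_w * page_h), len(image_boxes)

def layout_distance(f1, f2):
    """Euclidean distance between two pages' feature pairs."""
    return hypot(f1[0] - f2[0], f1[1] - f2[1])

# Invented pages: p1 and p2 follow the same template, p3 does not.
p1 = page_features(1000, 1400, [(0, 0, 500, 700), (500, 700, 500, 700)])
p2 = page_features(1000, 1400, [(0, 0, 700, 500), (300, 900, 700, 500)])
p3 = page_features(1000, 1400, [(0, 0, 200, 200)])
print(layout_distance(p1, p2) < layout_distance(p1, p3))  # True
```

Note that the coverage fraction lies in [0, 1] while image counts can be large, so a real implementation would normalize or weight the two variables before clustering similar layouts.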
The conference will be an opportunity to present the approach used and the early critical results obtained.

Speakers: Marta Pizzagalli (Università della Svizzera italiana / Cambridge University), Mr Rocco Felici (Università della Svizzera italiana / Scuola universitaria professionale della Svizzera italiana)
-
15:30
Coffee Break
-
Poster Session
-
10
Artificially Intelligent Art History? A Transcultural Evaluation of Algorithmic Systems Building on Aby Warburg’s Mnemosyne Atlas
Currently existing AI systems – ranging from MLLMs to specialized models like GalleryGPT or CLIP – are increasingly used for art historical research, but being mostly trained on surface-level, image-intrinsic characteristics, they fail to approximate deep semantic and contextual art historical methods such as Aby Warburg’s Mnemosyne Atlas. This study asks two questions: Can we currently build an AI art history and – more importantly – do we want to?
To answer this, Aby Warburg’s Mnemosyne Atlas is proposed as a heuristic method, as it illustrates research as a process of creating constellations between objects, of contextualization, and of visualizing the voids in between. Warburg’s research practice is then put into dialogue with Monica Juneja’s concept of critical globality in order to unmoor it from its Eurocentric foundations and to foreground transculturation as the key concept of cultural processes. To critically assess algorithmic systems, Karen Barad’s agential realism and its ethical implications of accountability are employed: images, researchers, and their tools are entangled in the very phenomena art history may seek to study and thereby co-produce perceived realities. From this, five criteria for a possible AI art history are derived: it must be driven by a research question; operate from a transcultural standpoint; understand art as a multilayered product of cultural processes; foreground relational constellations instead of linear narratives; and take account of itself as a transformative actor.
An examination of recent AI models, benchmark tests, and David G. Stork’s notion of computer-assisted art connoisseurship shows that current AI methods can succeed in specialized areas and automate certain aspects of research, but miss the transcultural and contextual polysemy that Warburg mobilizes. Innovation therefore shifts from the level of models to the level of methods: by which criteria do we declare AI to be art historically relevant in the first place?
The result is ambivalent. Current systems can accelerate partial tasks and survey large image corpora, but they fulfill neither transcultural nor ethical dimensions of the proposed framework. A fully automated AI art history would therefore be not only technically challenging to achieve but also socially problematic.
Speaker: Mert Özdemir (Heinrich Heine University Düsseldorf) -
11
Faces of Nobility. AI-Assisted Image Analysis of the Wiener Salonblatt (1870–1938)
The Wiener Salonblatt was one of the most popular society magazines of fin-de-siècle Vienna. The highly illustrated weekly was published from 1870 to 1938 and primarily featured short notices about personal achievements, travels, and family matters – mostly contributed by members of the nobility. These messages fulfilled a function similar to that of posts on today’s social media platforms: maintaining social connections and cultivating a public persona.
The content of the Wiener Salonblatt is currently being digitized using AI-assisted methods as part of a dissertation project. The overarching research goal is to better understand the self-representation and transformation of the late Habsburg nobility by identifying topical trends, geospatial patterns, and historical networks. A central component of this process is the extraction of all images, resulting in a corpus of approximately 30,000 photographs and xylographs published between 1870 and 1938 – consisting mainly of portraits of members of the Habsburg nobility.
This period is crucial in three respects: first, for the development of portrait photography, encompassing technological innovations such as the focal-plane shutter (1883), roll film (1888), and the emergence of the 35mm camera (1920s). Second, printing techniques like halftone and intaglio printing, both developed in the late 19th century, greatly contributed to the rise of illustrated magazines. Third, the Habsburg nobility experienced a series of transformative events, including the electoral reform of 1907 and the law abolishing nobility in 1919.
With these three perspectives in mind, the resulting image corpus offers a unique opportunity to analyse the visual self-representation of the Habsburg nobility, to identify photographic trends, and to trace the evolution of portrait photography. Research interests include the appearance of smiles in portraits, the changing composition of aristocratic family photographs, and the depiction of pets.
To process this extensive visual material, various AI-assisted methods are employed. These include layout recognition with Transkribus, image analysis using CLIP, and facial recognition with DeepFace. The poster will present the methodological workflow of this ongoing project and showcase preliminary results.
Speaker: Mr Christian Lendl (Austrian Academy of Sciences) -
12
Toward Digital Iconologies in Architecture
Ever-accelerating technologies, such as 3D scanning and AI, open up new approaches to researching history while challenging the development and narratives of visual representations. The correlation of novel digital innovations and art history resulted in ‘digital art history’, a collaboration between art history, digital humanities, and computer science (A. Bentkowska-Kafel et al. 2005; K. Brown 2020). Similar formulations remain largely absent in the historical debate surrounding architecture. Nevertheless, authors like Kemp (2006) and Carpo (2011; 2013) alluded to transhistorical connections between past technological advancements and the digital turn, yet scholarship has not proposed an investigative framework for architecture as digital art history did. Hence, this paper explores how interdisciplinary digital approaches to architectural history help reconsider visual narratives while changing our engagement with, perception of, and analysis of past architecture. To materialise such positions, the paper relies on Panofsky’s famed Iconography & Iconology (1939) as a principle for assessing architectural form and meaning, using 3D scans and AI-generated images as illustrative cases. Such a conjoint reading allows for defining digital iconography in architecture as a response to ‘digital art history.’ Iconography’s oscillation between form and context revalidates its position in contemporary architectural debates and grounds well-established methods of architectural history in the digital realm. Thus, this paper questions and scrutinises the relationship between architectural history and digital technology by combining art historical methods of iconography with digital applications and computer science. The proposal of digital iconography posits a theoretical exploration providing a framework and knowledge base to explore interdisciplinarity in architectural history, mediating future possibilities for investigating form and meaning in the field.

Keywords
Architectural History; Digital Art History; Iconology; Interdisciplinarity; Generative AI

Speaker: Nick Mols (Royal Museums of Art and History, Brussels)
-
16:30
Coffee Break
-
Bias
-
13
Disentangling Bias – Model, Corpus, or Both?
This paper will showcase multiple computational approaches to explore the pictorial conventions of landscape paintings in the late 19th century, taking a comparative approach between collections from Japan, China, and the UK. These datasets will be the basis of an investigation into the use of latent space to explore cultural frameworks within museum collections.
The use of such models raises a number of methodological issues for researchers. Most notably, they can introduce – or exacerbate – biases within datasets (Bode, 2020). Concerningly, these biases are highly variable and difficult to predict. As a result, it can be difficult to separate the historical patterns of interest from the impact of the model’s training data and architecture. We propose a new line of inquiry focusing on latent spaces as a place to quantify these distortions. During the talk we will explore the limitations of these models within digital art history; what can or cannot be projected in the latent space? What is amplified by the model, and what is underrepresented? And how do changes while training a model shift the relationships between artworks and our consequent understanding of the collections? Ultimately, we will highlight the trade-off of various approaches, exploring which methods are most appropriate for different forms of cultural analysis.
Therefore, our study of landscape paintings will form the foundation for a methodological critique as we attempt to open new avenues of inquiry for exploring similarity within collections.
Referenced Works
Bode, K. (2020) ‘Why You Can’t Model Away Bias’, Modern Language Quarterly, 81(1), pp. 95–124. Available at: https://doi.org/10.1215/00267929-7933102.
Speakers: Ellen Charlesworth (University of Luxembourg), Ms Ludovica Schaerf (University of Zurich)
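The latent-space comparison described in the abstract above can be illustrated in miniature. The following is a minimal sketch under stated assumptions: it presumes embedding vectors for each collection already exist, and here simulates them with random data, so the collection names are placeholders rather than real datasets.

```python
import numpy as np

def centroid_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance between the centroids of two sets of embeddings."""
    ca, cb = a.mean(axis=0), b.mean(axis=0)
    cos = np.dot(ca, cb) / (np.linalg.norm(ca) * np.linalg.norm(cb))
    return 1.0 - float(cos)

rng = np.random.default_rng(0)
# Simulated 512-dimensional embeddings for three hypothetical collections.
collection_jp = rng.normal(0.0, 1.0, size=(100, 512))
collection_cn = rng.normal(0.1, 1.0, size=(100, 512))
collection_uk = rng.normal(0.5, 1.0, size=(100, 512))

print(centroid_distance(collection_jp, collection_cn))
print(centroid_distance(collection_jp, collection_uk))
```

Any distance computed this way conflates historical signal with the distortions of the model that produced the embeddings, which is precisely the entanglement the talk proposes to examine.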
14
The Algorithmic Canon and the Politics of Non-Western Visibility in the Age of AI
Artificial intelligence has become an unexpected curator of global art. Tools such as ChatGPT, image generators, and search engines increasingly act as cultural institutions that reshape how artworks are perceived, classified, and archived. They decide, often invisibly, which works are seen, how they are described, and which narratives are amplified. Yet the databases behind these systems are far from neutral. Built largely from Euro-American collections, vocabularies, and scholarly conventions, they reproduce long-standing hierarchies of art history in which Western categories remain the default frame of reference. This digitally enforced hierarchy of visibility, privileging existing distributions of authority, is what I refer to as the Algorithmic Canon.
This paper examines what such a canon means for art histories that developed outside Western frameworks. Using nineteenth-century Qajar painting from Iran as an example, it explores how algorithmic systems absorb and reframe non-Western materials. Qajar painting often appears stripped of its cultural and historical context; the interpretive knowledge needed to read its imagery is either absent or reduced to decorative surface cues. This process functions as a form of soft colonialism: an epistemic dominance maintained not by direct rule but through digital infrastructures of knowledge.
Although Qajar art forms the central case study, the implications extend more broadly. The paper argues that digital archives and AI systems, while offering new modes of access, also reassert older asymmetries of power by redefining what counts as legitimate or valuable art. Recognizing AI as a new arbiter of cultural value invites a critical rethinking of how global art histories are being rewritten and who participates in writing them.
Speaker: Elham Etemadi (Arkin University of Creative Arts and Design)
17:45
Coffee Break
Keynote 1
15
The Stables of Augeas: Standardizing Metadata with Iconclass Will Benefit AI
King Augeas of Elis possessed the largest herd of cattle, goats and horses in all of Greece. For 30 years he did nothing to prevent his animals from polluting the floors of his stables with a mountain of dung. It took the strength and ingenuity of Hercules to clean up the mess.
For 30 years museums, libraries and archives have put an enormous effort into digitizing their image collections. In most cases they used local vocabulary systems to provide subject metadata describing image content. As a result, the AI models that might help us with the analysis of iconography have to be trained on messy metadata.
Unfortunately, we cannot call on Hercules to clean up our stables. We can, however, use an accepted standard for iconography - ICONCLASS - to produce better-organized metadata and boost the quality of AI models.
The ICONCLASS platform we are presently developing will facilitate collaboration, allowing groups of researchers to actually work together on shared datasets of images. The improved metadata will then be used to re-train our AI model, so that with every loop the performance of the AI agent we are developing for image analysis should improve.
In April we shall demonstrate where we are and invite others to test what we have made and collaborate with us to improve it further.
Speakers: Dr Etienne Posthumus (FIZ Karlsruhe), Dr Hans Brandhorst (ICONCLASS)
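The standardization step at the heart of this keynote can be sketched as a lookup that maps heterogeneous local subject terms onto shared ICONCLASS notations. The mapping below is an illustrative toy table only; the notations follow the ICONCLASS code style but should be verified against the official ICONCLASS data before any real use.

```python
# Toy mapping from messy local vocabulary terms to ICONCLASS notations.
# The notations shown are illustrative; check them against the official
# ICONCLASS system before relying on them.
LOCAL_TO_ICONCLASS = {
    "lion": "25F23(LION)",
    "löwe": "25F23(LION)",
    "leone": "25F23(LION)",
    "st. jerome": "11H(JEROME)",
    "hieronymus": "11H(JEROME)",
}

def normalize(term):
    """Map a local subject term to an ICONCLASS notation, or None if unknown."""
    return LOCAL_TO_ICONCLASS.get(term.strip().lower())

print(normalize("Löwe"))  # → 25F23(LION): one notation across local languages
```

Training data normalized this way lets a model learn from one consistent label per concept instead of dozens of local variants, which is the "cleaning of the stables" the talk describes.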
08:45
Keynote 2
16
Attributes, Objects, Poses, Scenes and Bias: Retrospective and Future Challenges of Art History and Computer Vision
Strong AI seems to solve every question and task in a chat interface that scholars of the humanities find easy to use. Multimodal approaches in AI have lately obscured the sharp distinctions within machine learning and its research field of computer vision, with methods such as object recognition, pose estimation and scene understanding. Art historians who test this promise of strong AI are either disappointed by false or superficial results, or so stimulated by the recognizable potential that they await the commercial development of LLMs.
In order to analyze to what extent current AI is a game changer for art history, I will describe the ante quem situation and the developments of the past 15 years, comparing the difficulties of earlier approaches with the new problems and biases.
In a certain sense, our work on gesture and pose detection, object recognition, and iconography seems like a set of trial runs. However, it was our data, our annotations, and our models that, despite black-box effects, gave us more transparent results. Formerly, researchers in the field of explainable AI used conventional computer vision methods (e.g., SIFT) to evaluate convolutional neural networks (CNNs). Must we now use self-trained models of this generation to try to better understand commercial LLMs? To what extent do our art-historical problems remain divergent from the learned conceptions of an LLM?
Speaker: Prof. Peter Bell (Philipps-University Marburg)
10:00
Coffee Break
Relations
17
From Data to Context: AI-Based Style Attribution in Art History
We present an exploratory approach to a relationally conceived art history, which does not consider its central categories of order in isolation, but models them in their interconnection. One of these categories is the concept of style. With the advent of AI, we must redefine “style,” which until now has been thought of as a definite and epochal entity. Recent developments in Digital Art History have addressed this problem through the multimodal input and recognition of visual image content. In the reality of art history, however, many works are only preserved in textual form (i.e., as titles or descriptions). Even the nearly 208,000 catalog entries (i.e., the identity of exhibited works and their style) recorded in the Database of Modern Exhibitions (DoME for short; http://exhibitions.univie.ac.at/) are mostly no longer identifiable due to changed titles, missing reproductions, loss, or destruction. In order to still be able to capture stylistic developments, within the ArtVis project (https://www.cvast.tuwien.ac.at/projects/artvis) we generated annual style attributions, including probability values, for all artists in DoME using LLMs (Gemma 3/12b) and transferred them to a graph database. A knowledge graph based on this data will make it possible to link stylistic developments to time, place, and context. For quality assurance and further development, a visual interface is being developed that enables subject matter experts to review, comment on, and validate stylistic attributions. In doing so, they can use the defined rules and restrictions from the knowledge graph to semantically correct or refine the generated style attributions.
Once successfully validated, the resulting knowledge graph will serve as a basis for exploring networks of style attributions and as a reference dataset for developing and testing new LLM prompts and various generative models and their application in the field of art history. It also provides an empirical basis for investigating generative problems such as hallucinations, style bias, or temporal drift in the context of AI-supported style attribution.
Speakers: Ms Teresa Kamencek (University of Vienna), Dr Velitchko Filipov (Technical University of Vienna), Ms Michaela Tuscher (Technical University of Vienna), Prof. Silvia Miksch (Technical University of Vienna), Prof. Raphael Rosenberg (University of Vienna)
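The annual, probability-weighted style attributions described in the abstract above could be represented, before entering a graph database, as simple records like the following. This is a minimal sketch with invented data; the field names and the confidence threshold are assumptions for illustration, not the project's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StyleAttribution:
    artist: str         # artist name as recorded in the exhibition database
    year: int           # year of the annual attribution
    style: str          # LLM-generated style label
    probability: float  # model confidence between 0 and 1

# Invented example records of the kind such a pipeline might produce.
attributions = [
    StyleAttribution("Artist A", 1905, "Fauvism", 0.71),
    StyleAttribution("Artist A", 1912, "Cubism", 0.64),
]

def above_threshold(records, t=0.5):
    """Keep only attributions confident enough to enter the knowledge graph."""
    return [r for r in records if r.probability >= t]
```

In a knowledge graph, each record would become an edge linking an artist node to a style node, carrying the year and probability as properties; that structure is what allows stylistic developments to be tied to time, place, and context.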
18
Embedding-Based Image Analysis for Art Historical Research: Integrating AI into Digital Catalogues Raisonnés
Navigating.art is developing an AI-assisted image analysis feature within an existing platform that enables art researchers to record, manage, and publish digital catalogues raisonnés anchored in a relational database. The new feature leverages amazon.titan-embed-image-v1 (AWS Bedrock) to generate 384-dimensional vector embeddings, representing individual artworks and detected subregions in a multidimensional semantic space. These embeddings are designed to support similarity search, clustering, and visual discovery across large digitized collections, extending the platform’s capabilities for reverse image search and comparative analysis.
The embedding pipeline is implemented using a serverless architecture with AWS Step Functions, orchestrating image acquisition, computer vision–based detection of rectangular regions, OCR via Textract, and embedding generation. Parallel processing, exponential backoff, and automated optimization ensure scalability and robustness, while results are consolidated and made accessible via the platform’s relational database and callback mechanisms.
A key challenge remains: the large language models with vision capabilities currently used interpret an entire page as a single image, complicating reverse image search for individual works. Addressing this limitation is central to ongoing development, alongside improving subregion detection and embedding quality. By April 2026, we anticipate presenting a working prototype and preliminary results, demonstrating how embedding-based AI can enrich catalogues raisonnés by enabling visual pattern recognition, iconographic comparison, and cross-collection research.
This work illustrates a pragmatic, ethically grounded approach to integrating AI into art historical scholarship, bridging computational methods with established scholarly practice in a scalable and accessible manner.
Speaker: Kiersten Thamm (HPF Innovations GmbH / Navigating.art)
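The similarity search that such embeddings support can be sketched as a cosine nearest-neighbour lookup. The vectors below are random stand-ins rather than real Titan embeddings; only the 384-dimensional shape mirrors the setup described above.

```python
import numpy as np

def top_k_similar(query, index, k=3):
    """Indices of the k embeddings in `index` most cosine-similar to `query`."""
    q = query / np.linalg.norm(query)
    m = index / np.linalg.norm(index, axis=1, keepdims=True)
    sims = m @ q                      # cosine similarity to every indexed item
    return list(np.argsort(-sims)[:k])

rng = np.random.default_rng(1)
index = rng.normal(size=(1000, 384))             # stand-ins for artwork embeddings
query = index[42] + 0.05 * rng.normal(size=384)  # slightly perturbed copy of item 42

print(top_k_similar(query, index))  # item 42 should rank first
```

Production systems typically replace the brute-force matrix product with an approximate nearest-neighbour index, but the ranking logic behind reverse image search and visual discovery is the same.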
11:20
Coffee Break
Collections
19
Object Detection for Visual Analysis of Medieval Charters: Decoration and Its Makers in Papal Documents
This paper addresses the application of deep learning tools to the charter-specific platform Monasterium.net, with particular focus on the “Illuminated Charters” collection. Building on the object detection pipeline developed within the DiDip project, we extend its application toward art-historical and diplomatic analysis, a task that previously required prohibitive manual effort.
The DiDip pipeline applies object detection and layout classification to over one million charter images, assigning detected regions to ten defined classes. This allows distinguishing ‘Writable Area’ from photographic artifacts, or ‘Seal’ from later additions like archival stamps (‘NewOther’). Our contribution lies in making these detections usable for scholarly inquiry: we integrate model predictions into the upcoming Monasterium virtual research environment, providing a faceted browsing and discovery interface by exploiting associated metadata. This enables experts to evaluate results qualitatively—revealing, for instance, that ‘NewOther’ contains an almost complete set of rotae, the distinctive authentication signs of solemn papal privileges. Such systematic access opens new avenues for tracing decorative conventions. For the sake of demonstration, results concerning litterae and their makers are presented and critically discussed.
From an art-historical and diplomatic perspective, such interfaces prove productive for identifying visual patterns across a given corpus, yet meaningful results require iterative refinement through expert guidance. Model-generated classifications remain provisional; their scholarly utility depends on applications enabling users to reorganize, filter, and interpret results in domain-appropriate ways. Nonetheless, it is an example of transforming raw detections into a discovery tool for comparative iconographic research.
Keywords: object detection, illuminated charters, Monasterium.net, layout classification, diplomatic studies, papal chancery
Speakers: Florian Atzenhofer-Baumgartner (University of Graz), Martin Roland (Austrian Academy of Sciences)
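The faceted filtering over model detections described in this abstract can be sketched as follows. The records are invented and only loosely shaped like the DiDip layout classes named above; field names are assumptions for illustration.

```python
# Invented detection records, loosely modelled on the layout classes
# mentioned above ('Writable Area', 'Seal', 'NewOther').
detections = [
    {"charter": "ch-001", "cls": "Seal", "score": 0.93},
    {"charter": "ch-001", "cls": "Writable Area", "score": 0.99},
    {"charter": "ch-002", "cls": "NewOther", "score": 0.81},
]

def facet(dets, cls, min_score=0.5):
    """Filter detections by layout class and a confidence threshold."""
    return [d for d in dets if d["cls"] == cls and d["score"] >= min_score]

print([d["charter"] for d in facet(detections, "NewOther")])  # → ['ch-002']
```

A faceted interface composes such filters with the charters' descriptive metadata, which is what lets an expert surface, for example, the regions classified as 'NewOther' that turn out to contain rotae.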
20
Between Potential and Practicality: Exploring AI’s Potential in an Incomplete, Low-Resource Art Collection
This contribution examines what AI-supported object detection, image classification, and data enrichment look like from the vantage point of a small 19th–20th-century art collection that is only partially inventoried, unevenly digitized, and structurally under-resourced. The Art Collection of the Hungarian Academy of Sciences – comprising mainly painted and sculpted portraits, prints, furniture, and applied arts – recently emerged from a major facility renovation that finally provides adequate storage and the possibility of restarting systematic inventorying, revision, and data registration and cleaning in the CMS (Museum+). Yet the database is still incomplete, inconsistently structured, and only sparsely enriched with high-resolution images or controlled vocabularies.
This situation reveals a central tension in current AI discourses. While institutions with abundant clean metadata can experiment with pretrained models or develop bespoke training sets, collections like ours confront an epistemological gap: AI requires interoperable data before it can “discover” anything, but data interoperability requires sustained human labor, resources, and expertise that AI cannot replace. As a result, existing inequalities risk being amplified. The painstaking, decades-long work of documentation becomes the raw material exploited by larger institutions and companies, while smaller collections – those most in need of support – are least able to benefit.
Archival sources further complicate the picture. Minutes and administrative documents of the Academy containing crucial provenance information survive primarily in handwritten form. Without high-resolution digitization, HTR and automated information extraction remain largely aspirational. Yet in principle, AI could help identify artwork references in these documents and link them to Museum+ records, creating a cyclical process in which improved metadata strengthens future AI applications.
Rather than offering solutions, this paper reflects on the methodological implications of working with incomplete collections, the risks of black-box automation, and the value of slow, connoisseurial research. It approaches AI with skepticism – but also with curiosity – seeking collaborations that might eventually help transform infrastructural weaknesses into opportunities for more responsible, context-aware applications in collection management and art historical research.
Speaker: Dr Zsuzsa Sidó (Institute of Art History, ELTE Research Center for the Humanities – Art Collection of the Hungarian Academy of Sciences)
21
Using AI for Icon Analysis
The paper examines the AI analysis of a selection of icons exhibited at "The Light of the Logos," an international, juried exhibition of sacred art held in Belgrade in August 2025. This exhibition showcased the works of some of the most significant and talented icon painters from Serbia and around the globe. Featuring over a hundred meticulously crafted icons, the event included contributions from 97 artists hailing from a diverse array of countries, each presenting their unique interpretations of sacred themes. Notably, the exhibition displayed the finest works created by local artists within the past year, shining a spotlight on contemporary iconography while honoring traditional techniques. The selected artists display a wide range of artistic interests. Some specialize in icon painting and focus solely on this form of expression, while others engage in various types of art and occasionally create icons.
This small-scale dataset is used for analyzing artworks through AI applications, with results compared to the actual characteristics of the pieces. This selection offers an intriguing opportunity for several reasons. First, it allows for an iconographical analysis of the paintings, which, while adhering to certain iconographic canons and conventions, also display the artistic freedom of their creators. Additionally, as the paintings are created on diverse surfaces such as canvas, wood, or stone, they present interesting and challenging material for AI analysis due to the various techniques used and the wide range of materials involved. The outcome reveals various obstacles and limitations inherent in the AI analysis of visual elements and design principles. It highlights the challenges that AI faces in accurately interpreting artistic context and conveying the deeper meaning behind each work.
Speaker: Ljudmila Djukic (Belgrade)