Program
Accepted Papers
Long Papers
Beyond chat: towards greater user involvement and agency in human-AI co-creativity through shared collaborative spaces
Abstract: As artificial intelligence (AI) tools become increasingly common for assisting with writing tasks, chat-based interfaces have emerged as the dominant mode of interaction. We hypothesise that relying solely on chat interfaces can reduce users’ involvement in the writing process, relegating them to an instructive role rather than that of an active co-author. To investigate this, we developed two prototype systems that pair a chat window with a collaborative writing editor where both the human and the AI can make contributions, and compared them to a chat-only system. Across two user studies, participants who had access to the collaborative workspace reported higher levels of personal contribution, perceived a more balanced partnership with the AI, and adopted more engaged roles than participants using the chat-only interface. These findings suggest that offering a collaborative editor alongside chat can mitigate the risk of overreliance on AI and help preserve creative skills. As generative AI tools become more pervasive, designing interfaces that maintain user involvement is crucial to supporting meaningful human–AI co-creation.
Do LLMs Agree on the Creativity Evaluation of Alternative Uses?
Abstract: This paper investigates whether large language models (LLMs) agree in assessing creativity in responses to the Alternative Uses Test (AUT). While LLMs are increasingly used to evaluate creative content, previous studies have primarily focused on a single model assessing responses generated by itself or humans. This paper explores whether LLMs can impartially and accurately evaluate creativity in outputs from both themselves and other models. Using an oracle benchmark set of AUT responses, categorized by creativity level (common, creative, and highly creative), we experiment with four state-of-the-art LLMs evaluating these outputs. Both scoring and ranking methods are tested under two evaluation settings (comprehensive and segmented) to assess the LLMs’ agreement on the creativity evaluation of alternative uses. Results reveal high inter-model agreement, with Spearman correlations averaging above 0.7 across models and reaching over 0.77 with respect to the oracle, indicating a high level of agreement and validating the reliability of LLMs in creativity assessment of alternative uses. Notably, LLMs do not favor their own responses; they assign similar creativity scores or rankings to alternative uses from other models. These findings suggest that LLMs exhibit impartiality and high alignment in creativity evaluation, offering promising implications for their use in automated creativity assessment.
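To make this kind of inter-model agreement analysis concrete, the minimal Python sketch below computes pairwise Spearman correlations between creativity scores from several judges; the model names and score values are illustrative placeholders, not the paper's data or pipeline.

```python
# Sketch of inter-rater agreement between LLM judges via Spearman correlation.
# Model names and scores are illustrative placeholders, not the paper's data.
from itertools import combinations
from scipy.stats import spearmanr

# Creativity scores (1-5) assigned by four hypothetical LLM judges
# to the same ten Alternative Uses Test responses.
scores = {
    "model_a": [1, 2, 2, 3, 4, 5, 3, 2, 4, 5],
    "model_b": [1, 1, 2, 3, 5, 5, 3, 2, 4, 4],
    "model_c": [2, 2, 3, 3, 4, 4, 3, 1, 5, 5],
    "model_d": [1, 2, 2, 4, 4, 5, 2, 2, 4, 5],
}

# Pairwise Spearman rank correlations between judges.
rhos = []
for m1, m2 in combinations(scores, 2):
    rho, p = spearmanr(scores[m1], scores[m2])
    rhos.append(rho)
    print(f"{m1} vs {m2}: rho={rho:.2f} (p={p:.3f})")

# Average agreement across all judge pairs.
print("mean pairwise rho:", sum(rhos) / len(rhos))
```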
Psychologically-inspired generative AI videos for supporting creativity
Abstract: Generative AI has become a powerful tool for supporting creativity across a breadth of disciplines. Nonetheless, applications are generally domain- and/or task-specific, providing targeted support for creative tasks like design ideation. Here, we present a domain-general approach to using generative AI to amplify human creativity by targeting its underlying psychological processes at high temporal resolution. We describe a generative AI workflow for creating two-minute video interventions designed to increase open-minded, flexible thinking, a central domain-general component of creative processes. We also present human participants’ ratings of and reactions to these videos, detailing implications for future work.
Me, Myself and Irony: Modelling the deceptive creativity of irony with Large Language Models
Abstract: Modern generative AI systems certainly excel at generation, whether of images, audio or text, and can now shoulder so much of the creative burden that they may already meet a popular definition of computational creativity, all without actually embodying any explicit theory or model of creativity. For many tasks, the issue of whether these systems can appreciate what they generate, or whether they are merely generative, is a moot one, given the human-like quality of their outputs. Yet for some creative tasks, this question still matters. This paper explores the capacity of large language models (LLMs) to both speak ironically and to appreciate the irony of what they produce. Irony requires a contrast between a speaker’s thoughts and a speaker’s words; the user of irony holds something back, something unsaid, that undermines what is actually said. We compare and contrast creative comparisons from humans on the web and the outputs of LLMs such as GPT4o-mini, with a focus on the “X is the Y of Z” construction, to quantify the biases, divergence, and scope for deliberate irony in each. Our aim is to quantify the extent to which an LLM can self-assess and appreciate the irony of its own outputs, and thus filter any unsuccessful outputs for itself.
When Hallucinations are Good: Building AI Agents for Co-Creation of Improvised Stories
Abstract: This work focuses on human-agent co-creation of improvised stories, investigating whether Large Language Models can effectively engage in an improvisational practice known as the “Yes! and…” game. We demonstrate how AI systems can participate successfully in improvisational co-creation, moving beyond response generation to collaborative story-telling. We provide a systematic framework for evaluating creative AI outputs in improvisational contexts, combining human evaluation with computational metrics. Our evaluations show that stories co-created with an AI agent are indistinguishable from stories co-created with a human in terms of novelty, value, and surprise. This shows how “hallucinations” – typically considered problematic in AI systems – can serve as creative assets in collaborative storytelling. More generally, our approach presents the “Yes! and…” game as a novel model-system for studying improvised co-creativity in a well-defined and measurable setup.
Creative as IDEO experts: LLM-agent-based design thinking workshop
Abstract: Design thinking is a well-acknowledged process for generating innovative ideas. In the design thinking process, workshops that invite experts from different domains are critical to the fluency and quality of idea generation. Due to participants’ varying expertise, their involvement could bring bias, leading to inconsistent workshop results. This research introduces an LLM-agent-based workflow to simulate design thinking workshops. Using IDEO’s well-known shopping cart exercise as a benchmark, we dynamically simulate participants’ interactions in a “Redesigning Shopping Cart” task by employing multiple LLM agents. We first assigned each agent a specific role, and then followed the Empathize, Define, Ideate, and Prototype phases of the Stanford d.school design thinking methodology. Our findings indicate that the workflow generated both expected and novel ideas, with the final concept surpassing the quality of those developed by IDEO experts across all dimensions, including originality, flexibility, complexity, practicality, and functionality, as assessed by human experts. These results demonstrate the potential of our LLM-agent-based workflow for conducting design thinking workshops and provide preliminary evidence of its utility for analyzing complex creative processes.
Experimenting with Large Language Models for Poetic Scansion in Portuguese: A Case Study on Metric and Rhythmic Structuring
Abstract: Poetic scansion — segmenting verses into syllables and identifying stressed positions — is an essential yet challenging analytical step in structured poetry analysis. Despite significant progress in language modeling, automatic scansion in Portuguese poetry remains underexplored, particularly considering rhythmic complexity and phonetic ambiguity. This study investigates how Large Language Models (LLMs) can effectively identify poetic syllables and metric patterns in Portuguese texts. We compare four approaches (zero-shot prompting, few-shot prompting, chain-of-thought reasoning, and fine-tuning) to assess accuracy in syllabic segmentation and rhythm detection. While prompting-based techniques exhibit moderate limitations due to phonetic variability and contextual nuances, a targeted fine-tuning approach yields better results, achieving an 88.6% syllabic segmentation accuracy and a 97.4% metric correspondence within a ±1 syllable tolerance threshold. Findings underscore both the promise of fine-tuned LLMs for computational poetic analysis and the challenges posed by linguistic variability. As part of developing the Pajeú platform, which aims to empower Northeastern Brazilian folk poetry communities through computational tools, this research sets a foundation for future investigations into culturally informed computational scansion methods. It highlights avenues for further improvements, such as addressing phonemic disambiguation and reducing computational training costs.
Co-Designing Fashion with AI: A Small-Data Approach to Generative Garment Design
Abstract: This paper details a case study of implementing a co-design framework for integrating artificial intelligence (AI) into fashion design through a collaboration between the researcher and a garment designer. Using the designer’s fashion development as a practical example, this research investigates AI’s potential to enhance creative workflows, improve efficiency, and facilitate marketing strategies that reduce the necessity for physical prototyping. The project implements a small-data approach, exclusively fine-tuning AI models on the designer’s sketches and the researcher’s photography to ensure highly personalised outputs. AI architectures, such as Stable Diffusion and ControlNet, are fine-tuned and used to generate sketches and photorealistic visualisations that work as an extension of the designer’s artistic vision while promoting sustainable awareness and ethical data practices. Through continuous dialogue and a final interview, the designer’s perspective was analysed, demonstrating AI’s efficacy as a creative co-partner that can significantly streamline design iteration by enabling rapid prototyping and ideation, while also addressing concerns about preserving artisanal expertise. This research contributes to the growing field of AI in the creative industries by demonstrating the value of co-design, small-data strategies, and ethical AI development to build tools that empower designers.
Creative Gameplay For All: A Development and Teaching Framework for Codenames
Abstract: A significant challenge associated with developing creative agents is the necessity of reconciling tractable evaluation metrics with the subjective nature of creativity. Traditional approaches often rely on bespoke methodologies that require significant abstraction and domain-specific engineering, creating barriers to rapid iteration. Language games like Codenames offer an alternative paradigm by providing a structured environment with inherent evaluation mechanisms, enabling clearer feedback loops for agent development. This work introduces an open-source resource designed to streamline creative agent development for the game Codenames. The resource consists of a web application for real-time game simulation and evaluation and a modular Python client supporting both spymaster and guesser agent roles. Initially presented at a [venue anonymized] workshop, the system was refined through a code jam session with computational creativity researchers before being deployed in a pilot study set in a graduate-level computational creativity class. Validation with 25 students demonstrates the resource’s effectiveness as both a development tool and educational instrument. Quantitative and qualitative analysis of a user survey shows that the resource facilitates meaningful iterative improvements, exploration of approaches to clue generation and semantic reasoning, and application and understanding of fundamental CC concepts.
Battle Rap as a Framework for Human-Machine Co-Creativity
Abstract: This research introduces a human-in-the-loop GAN framework for rap battles, where a human artist (MC – Master of Ceremonies) acts as the generator, and AI as an adaptive discriminator. The AI provides real-time feedback on rhyme complexity, coherence, and stylistic alignment, challenging the MC’s improvisational skill. Fine-tuned Large Language Models (LLMs) emulate diverse rap styles, while voice cloning creates adversarial loops; this enables the MC to compete against AI-generated versions of their own voice and style, creating a dynamic, self-reflective rap duel. The framework follows a dual learning process: (i) an Emulation Phase, where AI imitates known styles to reinforce the MC’s techniques, and (ii) an Improvisational Phase, where AI challenges the MC to freestyle, prompting spontaneous, original stylistic choices. This ensures the MC’s mastery of existing patterns before developing new ones through competition. Success is measured through the MC’s and the audience’s ratings of the AI as an opponent. Beyond computational science, we explore cultural evolution in rap by modeling peer influence and historical trends using Natural Language Processing (NLP). Additionally, the framework expands access to computational creativity for underrepresented communities. This work advances computational creativity by demonstrating how AI can function as both collaborator and competitor in human-centred artistic expression.
How far afield should you go when being creative? Semantic area as a metric of AI’s effects on creative ideation
Abstract: Generating creative ideas that are both novel and useful is one of the most important goals of any innovation process. But most measures of creativity focus only on the output of a creative process, not on the process itself. Here, we present a new metric for objectively evaluating the creative process itself. The metric, called semantic area, combines semantic embedding models, dimensionality reduction, and 2D area calculations to measure the semantic space covered during a creative task. We then use this method to quantify the relationship between what experimental participants input, see, and select during a creative task and the results of their creative efforts. We find that people who see a more diverse set of ideas (a) select more diverse ideas for their results and (b) have more subjective satisfaction with the results. But, surprisingly, they also select ideas that objective evaluators find less innovative, inspiring, and enjoyable. We close with a discussion of how this method might support interventional approaches that are able to help constructively expand a user’s exploration space without distracting them from end goals during a creative process.
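A metric of this shape could be sketched roughly as follows, under assumed choices that are mine rather than the authors': generic sentence embeddings (here random stand-ins), PCA for the dimensionality reduction, and a convex hull for the 2D area.

```python
# Minimal sketch of a "semantic area"-style metric: embed ideas, reduce to 2D,
# and measure the area covered. PCA and a convex hull are assumed stand-ins;
# the paper's actual embedding model and reduction method may differ.
import numpy as np
from sklearn.decomposition import PCA
from scipy.spatial import ConvexHull

def semantic_area(embeddings: np.ndarray) -> float:
    """Area of the convex hull of ideas projected to 2D."""
    points_2d = PCA(n_components=2).fit_transform(embeddings)
    return ConvexHull(points_2d).volume  # in 2D, .volume is the polygon area

# Illustrative stand-in: random vectors in place of real sentence embeddings.
rng = np.random.default_rng(0)
ideas_narrow = rng.normal(0.0, 0.1, size=(20, 384))  # tightly clustered ideas
ideas_broad = rng.normal(0.0, 1.0, size=(20, 384))   # widely scattered ideas

print("narrow exploration area:", semantic_area(ideas_narrow))
print("broad exploration area:", semantic_area(ideas_broad))
```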
LuminAI: Embodied AI as a Catalyst, Constraint, and Co-Creator in Dance Improvisation Class
Abstract: As co-creative AI systems gain traction in artistic fields, their integration into formal improvisational pedagogy remains underexplored. This study examines how LuminAI, an AI dance partner, mediates improvisational practices within a structured educational setting, investigating how it reshapes norms and contradictions between pedagogy and AI-mediated co-creation. Using Cultural-Historical Activity Theory as an analytical framework, we conducted a three-month diary study with ten dancers and one instructor in an undergraduate improvisational dance course, capturing student reflections and instructor interviews. Our findings reveal that LuminAI functioned as both a catalyst and constraint in improvisation, reshaping movement decision-making, spatial awareness, and collaboration. Dancers adapted strategies to accommodate AI tracking limitations, adjusting tempo, spatial positioning, and clarity of movements. While some students experienced frustration due to technological constraints, others reported increased self-reflection, creative expansion, and novel co-creation dynamics. From an instructional perspective, the AI disrupted class structures, requiring the instructor to navigate tensions between AI engagement and human improvisational flow. Key contradictions emerged—between spontaneity and constraint, autonomy and responsiveness, and pedagogy versus AI-mediated co-creation. By analyzing these adaptations, we contribute to discussions on AI in co-creative education, advocating for systems that enhance rather than restrict embodied pedagogy, fostering new movement possibilities within improvisational dance.
Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features
Abstract: Artistic inspiration remains one of the least understood aspects of the creative process, yet plays a crucial role in producing works that resonate with audiences. This paper introduces a novel computational framework for modeling individual artistic preferences in poetic lines through key linguistic and stylistic features. Our approach consists of two components: (1) a feature extraction module that quantifies poetic imagery, word energy, abstraction level, emotional valence, and linguistic complexity, and (2) a calibration network that learns to predict what content will inspire specific individuals. To evaluate our framework, we present EvocativeLines, a dataset of poetic lines annotated as either “inspiring” or “not inspiring” across diverse preference profiles. Experiments demonstrate that our framework significantly outperforms state-of-the-art language models, surpassing LLaMA-3-70b by nearly 18 percentage points in accuracy. The framework’s design prioritizes interpretability and flexibility, making it adaptable to analyzing various types of artistic preferences across different creative domains and skill levels. By formalizing the measurement of subjective aesthetic responses, our work provides a foundation for computational systems that can support the early stages of the creative process.
Is Prompt Engineering the Creativity Knob for Large Language Models?
Abstract: The increasing use of large language models (LLMs) to generate creative artifacts raises critical questions about effective methods for guiding their output. While prompt engineering has emerged as a key control mechanism for LLMs, the impact of different prompting strategies on the quality and novelty of creative artifacts remains underexplored. This paper systematically compares four prompting strategies of increasing methodological complexity: basic prompts, human-engineered prompts, automatically generated prompts, and chain-of-thought (CoT) prompting. We generate ten examples in each of four textual domains (jokes, short poems, six-word stories, and flash fiction), evaluating outputs through both a human survey and GPT-4o-based automatic evaluations. Our analysis reveals that advanced prompting techniques such as OPRO (an automatic prompting method) and R1 (a chain-of-thought prompting model) surprisingly do not produce artifacts of significantly higher quality, greater novelty, or greater creativity than artifacts produced through basic prompting. The results reveal some limitations of using GPT-4o for automatic evaluation; provide empirical grounding for selecting prompting methods for creative text generation; and raise important questions about the creative limitations of large language models and prompting.
Evaluating Creative Short Story Generation in Humans and Large Language Models
Abstract: Story-writing is a fundamental aspect of human imagination, relying heavily on creativity to produce narratives that are novel, effective, and surprising. While large language models (LLMs) have demonstrated the ability to generate high-quality stories, their creative story-writing capabilities remain under-explored. In this work, we conduct a systematic analysis of creativity in short story generation across 60 LLMs and 60 people using a five-sentence creative story-writing task. We use measures to automatically evaluate model- and human-generated stories across several dimensions of creativity, including novelty, surprise, diversity, and linguistic complexity. We also collect creativity ratings and Turing Test classifications from non-expert and expert human raters and LLMs. Automated metrics show that LLMs generate stylistically complex stories, but tend to fall short in terms of novelty, surprise and diversity when compared to average human writers. Expert ratings generally coincide with automated metrics. However, LLMs and non-experts rate LLM stories to be more creative than human-generated stories. We discuss why and how these differences in ratings occur, and their implications for both human and artificial creativity.
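One common way to operationalise a diversity measure of the kind mentioned above is the mean pairwise distance between story embeddings; the sketch below is an assumed illustration (generic embeddings, cosine distance), not necessarily the metric used in the paper.

```python
# Illustrative diversity measure: mean pairwise cosine distance between story
# embeddings. Higher values indicate a more varied set of stories. The embedding
# model and exact metric are assumptions, not necessarily the paper's.
from itertools import combinations
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def diversity(story_embeddings: np.ndarray) -> float:
    pairs = combinations(range(len(story_embeddings)), 2)
    return float(np.mean([cosine_distance(story_embeddings[i], story_embeddings[j])
                          for i, j in pairs]))

# Stand-in data: random vectors in place of real story embeddings.
rng = np.random.default_rng(1)
print("diversity:", diversity(rng.normal(size=(10, 384))))
```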
Integrating Computational Creativity and Climate Data for Environmental Awareness: Design and Analysis of an Ongoing Project
Abstract: Climate change is often understood through scientific data, but how can digital technologies be used to make climate change more accessible to non-expert publics? This article presents a project that explores the intersection of computational creativity and environmental awareness through a web platform that transforms real-time climate data into interactive audiovisual art. By collecting live climate data from satellites and processing it into unique visual and musical compositions, the platform translates global environmental phenomena into local, personal aesthetic experiences. Using generative algorithms, data-driven creativity and interactive design, the platform aims to bridge the vast scale of climate change with the intimate scale of individual perception. Based on observation of users engaging with the platform, we analyze the potential of digital art for environmental education, the challenges of representing environmental data in non-figurative ways, and the technical challenges of adapting scientific data infrastructure for artistic exploration. We conclude by suggesting that art can play an important role in climate communication and that computational creativity can significantly contribute to this by connecting objective data with subjective representations.
COCOA: A Framework for Control Optimization in Co-Creative AI
Abstract: Striking the appropriate balance between humans and co-creative AI is an open research question in computational creativity. Co-Creativity, a form of hybrid intelligence where both humans and AI take action proactively, is a process that leads to shared creative artifacts and ideas. Achieving a balanced dynamic in co-creativity requires characterizing control and identifying strategies to distribute control between humans and AI. We define control as the power to determine, initiate, and direct the process of co-creation. Informed by a systematic literature review of 172 full-length papers, we introduce COCOA (Control Optimization in Co-Creative AI), a novel framework for characterizing and balancing control in co-creation. COCOA identifies three key dimensions of control: autonomy, initiative, and authority. We supplement our framework with control optimization strategies in co-creation. To demonstrate COCOA’s applicability, we analyze the distribution of control in six existing co-creative AI case studies and present the implications of using this framework.
Invisible Strings: Revealing Latent Dancer-to-Dancer Interactions with Graph Neural Networks
Abstract: Dancing in a duet often requires a heightened attunement to one’s partner: their orientation in space, their momentum, and the forces they exert on you. Dance artists who work in partnered settings might have a strong embodied understanding in the moment of how their movements relate to their partner’s, but typical documentation of dance fails to capture these varied and subtle relationships. Working closely with dance artists interested in deepening their understanding of partnering, we leverage Graph Neural Networks (GNNs) to highlight and interpret the intricate connections shared by two dancers. Using a video-to-3D-pose extraction pipeline, we extract 3D movements from curated videos of contemporary dance duets, apply a dedicated pre-processing to improve the reconstruction, and train a GNN to predict weighted connections between the dancers. By visualizing and interpreting the predicted relationships between the two movers, we demonstrate the potential for graph-based methods to construct alternate models of the collaborative dynamics of duets. Finally, we offer some example strategies for how to use these insights to inform a generative and co-creative studio practice.
A Full Pipeline for Context-Aware Pun Generation
Abstract: Among the different forms of humor, puns are the most researched in Computational Creativity and Natural Language Generation. However, existing systems generate puns by taking a pair of ambiguous words as input, without addressing how such pairs are obtained, which disconnects the generation from any contextual background. Furthermore, the majority of the works focus on English, leaving other languages, such as Portuguese, behind. In this paper, we present a full pipeline for creating puns in Portuguese, including the creation of homographic and homophonic word pairs from news headlines. We evaluate ten different Transformer-based approaches for generating jokes given these word pairs — fine-tuned T5 models and different LLM prompting techniques — through a questionnaire with 23 participants from Brazil and Portugal, who rated the puns in terms of humor and relation to the base headline. Results suggest that including the words’ definitions in the prompt can harm humor ratings and that few-shot prompting outperformed zero-shot. Additionally, the T5 model fine-tuned without word definitions produced texts that were more closely related to the base headline, but at the expense of humor.
Refining Metrical Constraints in LLM-Generated Poetry with Feedback
Abstract: Among many other tasks, general-purpose pretrained Large Language Models (LLMs) can be prompted to generate poetry in a zero-shot scenario with relative success, even if on many occasions they fail to fully meet all the constraints in a prompt. In this work, we use LLMs to produce poetry and, when some expected metrical constraints are not met, feedback is given and the LLM is asked to produce a poem that fulfills the constraints. We prompt LLMs to generate poems with a specific number of lines and syllables, analyse the result with another LLM or a rule-based poetry evaluation system, and, if the given constraints are not met, we prompt the model again with the results of the analysis, so that it generates a poem that matches the expected structure. This can go through several iterations. Using this methodology, we not only analyse the ability of LLMs to produce metrical poetry but also test how good they are at producing feedback on some metrical aspects. We conclude that feedback does improve adherence to the metrical constraints in some LLMs, but at the cost of less rhyme. We also observe that feedback from the rule-based system works better than feedback from an LLM.
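A generate–analyse–re-prompt loop of this kind could be structured roughly as sketched below; generate_poem and count_syllables are hypothetical placeholders standing in for the LLM call and the rule-based scansion tool, and the prompts are illustrative rather than those used in the paper.

```python
# Sketch of the iterative feedback loop: generate a poem, check its metre with a
# rule-based analyser, and re-prompt with the analysis until constraints are met.
# generate_poem() and count_syllables() are placeholders, not the paper's systems.

def generate_poem(prompt: str) -> str:
    raise NotImplementedError("stand-in for an LLM completion call")

def count_syllables(line: str) -> int:
    raise NotImplementedError("stand-in for a rule-based syllable counter")

def poem_with_feedback(n_lines: int, n_syllables: int, max_iters: int = 5) -> str:
    prompt = f"Write a poem with {n_lines} lines of {n_syllables} syllables each."
    poem = generate_poem(prompt)
    for _ in range(max_iters):
        lines = [l for l in poem.splitlines() if l.strip()]
        errors = []
        if len(lines) != n_lines:
            errors.append(f"the poem has {len(lines)} lines instead of {n_lines}")
        for i, line in enumerate(lines, start=1):
            count = count_syllables(line)
            if count != n_syllables:
                errors.append(f"line {i} has {count} syllables instead of {n_syllables}")
        if not errors:
            return poem  # all metrical constraints satisfied
        # Feed the analysis back to the model and ask for a corrected poem.
        feedback = "The previous poem did not meet the constraints: " + "; ".join(errors)
        poem = generate_poem(prompt + "\n" + feedback + "\nPlease rewrite the poem.")
    return poem  # best effort after max_iters rounds of feedback
```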
A Creativity Assessment Scale for Text-to-Image Prompting: Challenges & Observations
Abstract: Online platforms offering Text-to-Image (TTI) generation services are dynamic environments where the creative process of ideation and the verbal skills of users can be observed over time. User interactions on these platforms yield extensive behavioral data, encompassing patterns of engagement, verbal expression, visual imagination, and prompt engineering, a specialized skill set essential for image generation. Consequently, these platforms provide a unique opportunity to examine the verbal creativity of users. However, the creativity assessment literature is restricted to specific conditions, such as strictly controlled user studies or product-based assessments that are not generalizable. As such, there is a clear need for an assessment method suited to the prompting language used to generate images. In this study, we provide a framework for prompt creativity assessment in the form of an online tool. The tool is developed based on the Creative Product Semantic Scale. We test this framework with 8 trained annotators. We analyze samples of prompts from Midjourney and Stable Diffusion models while evaluating how user engagement on these platforms, measured by time spent, number of prompts, and prompt length, influences overall creativity scores.
Poems to Lyrics: Automated Rephrasing with Beat Alignment
Abstract: This paper explores the generation of English lyrical lines that align with the rhythm perceived from language. We use the ByT5 transformer model, which processes text at the byte level, to rephrase poetry lines according to a given beat pattern. Our approach builds on earlier studies on beat patterns perceived from English conversations by integrating a guided paraphrasing task with rhythmic constraints. Additionally, we introduce ParaPoetry, a large-scale parallel dataset of automatically generated poetry line rephrases. Our results demonstrate that the proposed model can effectively rephrase lyrics to align with specific beat patterns using only textual data. A human evaluation study further confirms that English-speaking participants largely agree with our model’s beat alignment. We further assess other qualities, including fluency, meaningfulness, and poeticness.
Do Conversational Interfaces Limit Creativity? Exploring Visual Graph Systems for Creative Writing
Abstract: We present a graphical, node-based system through which users can visually chain generative AI models for creative tasks. Research in the area of chaining LLMs has found that while chaining provides transparency, controllability and guardrails to approach certain tasks, chaining with pre-defined LLM steps prevents free exploration. Using cognitive processes from creativity research as a basis, we create a system that addresses the inherent constraints of chat-based AI interactions. Specifically, our system aims to overcome the limiting linear structure that inhibits creative exploration and ideation. Further, our node-based approach enables the creation of reusable, shareable templates that can address different creative tasks. In a small-scale user study, we find that our graph-based system supports ideation and allows some users to better visualise and think through their writing process when compared to a similar conversational interface. We further discuss the weaknesses and limitations of our system, noting the benefits to creativity that user interfaces with higher complexity can provide for users who can effectively use them.
Hidden Layer Interaction: Image Synthesis with Neural Semantics
Abstract: While the inner workings of neural networks have been explored to aid interpretability, their role in co-creating with generative AI remains underexplored. Inspired by feature visualization—where neural activation reveals the semantics captured by image recognition models—we propose an algorithmic method to manipulate generated images by directly altering neural activation. Providing an early image synthesis model trained to generate fashion images as a prototype, we demonstrate how the model’s neural semantics can provide novel forms of control over the generative process and discuss its potential role in human-AI co-creativity.
Short Papers
Creativity Rivalry: Human, Artificial Intelligence, and Co-Design
Abstract: This study investigates whether a machine, specifically Artificial Intelligence (AI), can exercise creative ability by comparing design solutions developed for a real-world competition. The competition involved designing a light fixture for a pediatric waiting room and featured three approaches: an AI-generated design, a co-design effort, and a solution produced by a human designer. Design solutions were documented as they evolved through three distinct stages: initial sketches (S), three-dimensional renderings (3D), and fully developed models placed in virtual waiting rooms (VR). Evaluators recruited from Amazon Mechanical Turk and Prolific observed the design process and rated each solution using the Creative Product Semantic Scale (CPSS). They assessed each stage on three criteria: novelty (originality and surprise), resolution (logic and utility), and style (craftsmanship and elegance). Despite some demographic discrepancies, evaluators expressed overall satisfaction and calmness, aligning with the competition’s objectives. Statistical analysis of CPSS ratings revealed that while AI excelled in style during the 3D stage, the human designer outperformed in novelty during both the S and VR stages. Surprisingly, co-design efforts finished last. These results challenge prevailing assumptions about AI’s creative capacity and offer practical guidance for designers and educators seeking to integrate AI thoughtfully into the design process.
Indeterminacy, human-computer co-creation and AI in the creative process with musical interactive systems
Abstract: This article presents a practical analysis of the application of artificial intelligence (AI) and machine learning (ML) tools in the piece “XXXXXXXXXXXXX XXXXXXXXXXX”, for saxophone, bandoneon, electronics, and video (2025). The study focuses on investigating the paradigms emerging in human-machine co-creation processes within the context of artistic practice involving interactive music systems. To this end, SOMAX2 (Fiorini and Malt 2023) software was used, enabling real-time interactions during the performance involving improvisation between human musicians and the machine. The audiovisual interactive system was developed in the Max environment. The experiment revealed that the degree of indeterminacy between inputs and outputs is a central paradigm in human-machine co-creation, highlighting its relevance to understanding these processes. The article outlines the theoretical framework, describes the technical and creative processes, and reflects on the outcomes in relation to human-machine co-creation concepts.
Making Mountains out of Molehills: Abstraction without Information Loss in Analogical Mapping
Abstract: Analogy is a cognitive process that propels many of our most creative leaps, from the cross-domain forays of scientific discovery and case-based reasoning to the poetry of metaphor, bisociation and blending. By concerning itself with the shape of meanings, and the structural arrangements of their parts, analogy allows us to unite different meanings with similar shapes across distant domains. Crucial to this unification, or mapping, of domains is the ability to abstract over structural representations that capture the broad sweep of an idea without getting bogged down in details. This paper presents a large new resource for analogical mapping that defines structured representations to support abstraction at multiple levels, but without information loss. This allows the mapping process to be flexible in its reconciliation of different meanings, while also preserving the distinctions that make abstraction necessary in the first place. This resource, named ATLAS, is a wide-ranging database of symbolic structures for lexical concepts (the ideas behind common words), and is designed to support explicit analogical reasoning in an era where symbolic reasoning is giving way to the statistics of LLMs.
Measuring Creativity in Co-Writing with AI: Rhyme Density and the Limits of Computational Proxies
Abstract: The evaluation of creativity in AI-assisted writing remains a challenge, often reliant on subjective user ratings. This paper investigates the use of computational metrics, specifically rhyme density and lexical distance, as proxies for creative quality in human–AI collaborative writing. Drawing on data from a mixed-methods study, we analyse the relationship between quantitative measures of rhyme structure, divergent thinking scores, and participant evaluations of writing fluency, creativity, and accuracy. Our results demonstrate that analysis of both rhyme density and structural complexity can partially predict human assessments of creative output, and also reveal significant discrepancies between computational metrics and participant perceptions. We discuss the limitations of such automated scoring approaches, and argue for a multi-modal evaluation framework that combines computational analysis with human assessment. This work contributes to ongoing debates within computational creativity on how to meaningfully assess co-created work.
Making the Familiar Strange: A Computational Approach to Defamiliarization in Creativity Support
Abstract: Defamiliarization—originating in literary theory and later adapted into Critical and Speculative Design—seeks to disrupt habitual perception and invite renewed engagement via “slight strangeness” and has been extensively used to provoke critical engagement in design and interactive systems. However, existing approaches remain subjectively designer-controlled, limiting generalizability in interactive creative systems, while generative-AI tools introduce unexpected elements without offering structured, user-level control over estrangement. To address this gap, we present Familiarity–Estrangement Space, which projects textual materials (e.g., design concepts and invented stories) along three personalized dimensions—Familiarity, Positive Estrangement, and Negative Estrangement—empowering creators to modulate the exact degree of strangeness they desire. By extracting latent topics and inviting users to identify which topics to retain, amplify, or minimize, our approach transforms defamiliarization from a designer-imposed intervention into an adjustable parameter for creativity support. This enables creators to navigate and discover fresh thematic collisions or reinforce familiar patterns, offering a structured framework for AI-assisted creative exploration that balances coherence with deliberate estrangement.
Quantitative Measures of Task-Oriented Creativity in Popular Image Generators
Abstract: Creativity of generative AI models has been a subject of scientific debate in recent years, without a conclusive answer. In this paper, we study creativity from a practical perspective and introduce quantitative measures that help the user to choose a suitable AI model for a given task. We evaluated our measures on a number of popular img2img generation models, and the results suggest that our measures conform to intuition.
Brave: Engineering an Embedded Network-Bending Instrument, Manifesting Output Diversity in Neural Audio Systems
Abstract: As neural audio synthesis becomes more widely adopted there is a growing risk that its limitations could impact the content, quality and diversity of music. Some musicians, artists, and researchers perceive an increased risk of cultural homogenisation and qualitative degeneration due to poor-quality training data and parameterisation. This work seeks to explore new methods for addressing these challenges by contributing to the developing field of “network-bending”. Network-bending employs direct manipulation of internal ML architectures to enable active divergence from the training corpus, increasing the statistical variability and capability of model outputs. We present “Brave”: an embedded, network-bending hardware instrument, which can provide a novel blueprint for embedding a network-bending system on a stand-alone system. Through a process of iterative musician-led feedback, drawing on Proof-of-Concept Media and Arts Technology approaches, this work seeks to stimulate further interest in network-bending frameworks applied to the field of AI-driven sound synthesis.
Are AI-generated Jokes Truly Original? Charting the “Joke Space”
Abstract: This paper tackles the challenge of evaluating the originality of jokes generated by Large Language Models (LLMs), which operate as opaque “black boxes” with non-transparent algorithms, potentially relying excessively on their training datasets to produce humor. While LLMs excel at rephrasing, raising concerns about plagiarism, existing studies often assume their outputs are original without verification. We propose a novel framework to assess joke originality by characterizing the “Joke Space”—the set of all possible verbal jokes in English. Drawing on the General Theory of Verbal Humor (GTVH), we define a joke’s “essence” as its script opposition (SO) and logical mechanism (LM), which are critical for determining similarity and potential plagiarism. We estimate the size of the Joke Space using combinatorial and pragmatic approaches, suggesting that the vast number of possible jokes (at minimum 500 million to 50 billion) exceeds the capacity of LLM training datasets, implying potential for novel outputs. To ensure originality, we recommend prompting LLMs with randomly sampled noun pairs to generate jokes, enabling comparative evaluation with human outputs. This framework offers a systematic method to verify the originality of AI-generated humor, providing practical recommendations for researchers.
Fairness as a Creative Resource: Challenges and Opportunities in Creative Computing
Abstract: Artificial Intelligence (AI) technologies and generative models have significantly expanded the field of computational creativity. Traditionally, fairness has been addressed as a criterion for mitigating social and cultural biases in AI systems, particularly in the context of Natural Language Processing (NLP) and machine learning. This work, however, proposes a new perspective: fairness as a creative resource capable of enhancing the aesthetic, narrative, and cultural diversity of AI-generated content. Rather than treating fairness solely as a corrective mechanism, we explore its potential to enrich content generation by fostering representativity and innovation. In the context of language models, recent research highlights their few-shot learning capabilities — the ability to adapt behavior based on a limited number of examples provided in the prompt. We investigate whether this property can be leveraged to induce fairness efficiently, influencing the diversity of generated outputs through minimal prompt modifications. To this end, we propose a theoretical framework that positions fairness as a catalyst for creativity and conduct exploratory experiments using prompting techniques. These experiments examine whether small adjustments to input prompts lead to more diverse and representative outputs, with a focus on narrative generation as a form of computational creativity. Preliminary results suggest that fairness can function as an active driver of inclusion and originality in creative AI systems, challenging the conventional view that fairness is limited to bias mitigation. We argue that this approach opens new directions for the development of more equitable and innovative generative models, enhancing their ability to reflect and engage with diverse cultural contexts.
Embedding visual thinking into an AI-driven furniture design critiquing system
Abstract: The critique is a cornerstone of design and arts education. However, current AI-driven critiquing systems remain ill-equipped due to their inability to sense, adapt, and question meaningfully. This paper presents fCrit, a critiquing chatbot that collaborates with human designers to reflect on the visual forms of furniture designs. Our work involved reconstructing the process behind enabling AI to analyze visual concepts and patterns using a hand-crafted expert knowledge base, and developing a functional, critique-ready prototype. We envision a critiquing chatbot that resonates with human creativity and empowers designers in form-finding. This work-in-progress is part of a broader initiative that aims to demonstrate how Human-Centred Explainable AI solutions can support creativity for artists and designers.
Turning Linear Stories into Thrillers: The Impact of Story Reordering and Familiarity
Abstract: This study explores how modifying narrative structure affects suspense perception. Drawing on suspense theory, we systematically reorder key events—specifically the Initiating Event (IE) and Outcome Event (OE)—and introduce nondiegetic symbolic delays to heighten anticipatory tension. We also examine how prior familiarity with a story influences suspense. An online experiment on Amazon Mechanical Turk (MTurk) tested linear versus suspense-modified versions of four short narratives—two familiar and two original. Statistical analyses (ANOVA, ANCOVA, regression) show that reordering events and adding delays significantly boost reported suspense and engagement, with prior familiarity moderating these effects.
When AI Says No: Investigating the Creative Power of Dissent
Abstract: In recent years, co-creative AI has emerged as a novel approach to facilitate ideation tasks. Contemporary AI systems tend to exhibit conforming behavior. For example, well-known chatbots such as ChatGPT will almost exclusively provide affirmative feedback on ideas, regardless of their quality. In human collaboration, having dissenting ideation partners often enhances creativity by fostering divergent thinking and preventing groupthink. This study examines whether the same applies to AI, investigating whether interaction with a dissenting co-creative AI leads to more creative story-writing outcomes and how it affects adoptability. Participants engaged in a co-creative story-writing task, after which expert evaluators assessed the creativity of the written stories using the consensual assessment technique. The likelihood of adoption was examined using the Unified Theory of Acceptance and Use of Technology (UTAUT). The results indicate that, on average, participants produced more creative output when collaborating with a dissenting AI than a conforming AI. However, the difference did not reach statistical significance, rendering the results inconclusive. The UTAUT analysis revealed a slight variation in adoption tendencies between the two AI conditions. These variations were minor, however, and did not provide significant evidence that adoption likelihood differed for the two conditions. These findings suggest that, in co-creative contexts, dissenting AI systems are equally adoptable as conforming AI systems. However, further research is necessary to determine whether dissenting behavior in AI meaningfully enhances creative performance.
Foundation Models as Agents of Austerity
Abstract: This article examines the role of Foundation Models within the contemporary landscape of generative Artificial Intelligence, arguing that they should not be regarded as neutral technological developments, but rather as instruments embedded in the political economy of neoliberal austerity. It contends that these models facilitate the displacement of socially embedded practices, favouring efficiency-oriented, privatised, and platform-governed operations.
Introducing Pathomalgametry: Conceptual Blending with Geometric Path-finding and Amalgamation
Abstract: Conceptual blending, where new concepts are created through a selective combination of known ideas, is a widely known option for concept invention within computational creativity research. However, existing approaches to implementing blending often either neglect conceptual aspects, as in image morphing, or suffer from the high complexity of creating the blend. We therefore propose a new neuro-symbolic approach for conceptual blending of ontologies, based on knowledge-graph embeddings. Here, the inherent structure of the embedding space is used both to identify a generic space and to guide the blending process, interpreting blending as path search in the embedding space by iteratively relaxing the input concepts. Our approach is efficient, as it can reuse existing embedding techniques created for tasks such as link prediction. Its suitability is discussed both theoretically and on a toy example.
The “What” Space. Prosodic Variability and Affective Virtual Environments
Abstract: What? What! What… How Many Ways to Say the Word What. The “What” Space is a research project that investigates the emotional variability and expressive potential of spoken language. Focusing on the word “what,” the project examines how shifts in emotional prosody shape its interpretation. Drawing from the IEMOCAP database, we identified 54 instances of “what,” each annotated with categorical emotion labels and continuous values in the valence–arousal–dominance (VAD) space. To explore these variations, we developed a series of multimodal visualizations: an interactive 3D visualization in Unity3D, network-based structures in Blender, a real-time speech-to-texture web application, and an immersive virtual reality installation hosted on Onland.io. These experiences allow users to navigate a spatialized soundscape of “what” utterances and directly experience their emotional diversity. This work contributes to affective computing, speech emotion recognition, and human–computer interaction by proposing novel frameworks for interpreting and visualizing emotional expression through speech. By revealing the rich affective range embedded in a single word, The “What” Space underscores the role of prosody in communication and demonstrates the potential of speech-based affective virtual environments.
The truth is no diaper: Human and AI-generated associations to emotional words
Abstract: Human word associations are a well-known method of gaining insight into the internal mental lexicon, but the responses spontaneously offered by human participants to word cues are not always predictable, as they may be influenced by personal experience, emotions or individual cognitive styles. The ability to form associative links between seemingly unrelated concepts can be a driving mechanism of creativity. We compare the associative behaviour of humans with that of large language models. More specifically, we explore associations to emotionally loaded words and try to determine whether large language models generate associations in a similar way to humans. We find that the overlap between humans and LLMs is moderate, but also that the associations of LLMs tend to amplify the underlying emotional load of the stimulus, and that they tend to be more predictable and less creative than human ones.
Adapting Proppian Morphology for Generating Narrative Structures
Abstract: Vladimir Propp’s “Morphology of the Folk Tale” presented a meticulous analysis of a basic type of story into several layers of constituents that could be recombined to make new stories. Early adaptations of a very limited portion of his account to generate stories were criticised on the grounds that the representation was too simple. The present paper considers how the more complex aspects of Propp’s analysis might be implemented computationally. The original descriptive representation of stories is augmented with solutions inspired by reverse-engineering some of the more complex examples of stories analysed by Propp in his book. The revised computational model includes important aspects of narrative such as the distinction between fabula and discourse, the use of embedded stories, and the construction of narrative structures with more than one plot line.
Transformational Creativity in Science: A Graphical Theory
Abstract: Creative processes are typically divided into three types: combinatorial, exploratory, and transformational. Here, we provide a graphical theory of transformational scientific creativity, synthesizing Boden’s insight that transformational creativity arises from changes in the “enabling constraints” of a conceptual space and Kuhn’s structure of scientific revolutions as resulting from paradigm shifts. We prove that modifications made to axioms of our graphical model have the most transformative potential and then illustrate how several historical instances of transformational creativity can be captured by our framework.
Spark: A System for Scientifically Creative Idea Generation
Abstract: Recently, large language models (LLMs) have shown promising abilities to generate novel research ideas in science, a direction which coincides with many foundational principles in computational creativity (CC). In light of these developments, we present an idea generation system named Spark that couples retrieval-augmented idea generation using LLMs with a reviewer model named Judge trained on 600K scientific reviews from OpenReview. Our work is both a system demonstration and intended to inspire other CC researchers to explore grounding the generation and evaluation of scientific ideas within foundational CC principles. To this end, we release the annotated dataset used to train Judge, inviting other researchers to explore the use of LLMs for idea generation and creative evaluations.
Between codes and dreams: hallucinatory cut-ups as a poetic in creation with AI
Abstract: This text discusses the results obtained from experimenting with Artificial Intelligence technologies as assistive tools for artistic creation, carried out during the course “Artificial Intelligence as a Platform for Artistic Creation,” held in the Visual Arts Graduate Program at the Institute of Arts of UNESP (São Paulo State University) during the first semester of 2024. Throughout the course, the aim was to discuss the use of artificial intelligence (AI) technologies in the field of Art and Technology, exploring their limits and creative possibilities. The central objective of the classes was to consider the use of AI, especially Large Language Models (LLM), as intelligent tools that can assist artists in the creative process. After the course period, an Artificial Intelligence study group was established so that these reflections could be further explored. Presented here is the result of an experiment conducted during the course, its conceptual proposal, and reflections arising from that practice.
Every Luzia is Now Luzia: A Case Study on Material Heritage in Times of Digital Dispersion
Abstract: Following the 2018 National Museum fire which destroyed the Luzia skull, its pre-existing digital images circulated widely. This paper examines the paradox where such digital dispersion obscured the material loss it ostensibly compensated for. Using Luzia as a case study, we investigate the interplay of digital technology, memory politics, and documentary practices following heritage destruction. We employ “archaeology of digital dispersion,” applying inverse photogrammetry to heterogeneous, socially-sourced online images, prioritizing affective resonance over institutional criteria. The resulting fragmented 3D models visualize the digital and affective dispersion and the trauma of loss. The study critiques idealized digital replicas, arguing for valuing fragmentation and imperfection in collective digital memory. It promotes embracing digital dispersion for a critical understanding of heritage, memory, and representation today.
Abductive Computational Systems: Creative Abduction and Future Directions
Abstract: Abductive reasoning, the process of inferring explanations for observations, is often mentioned in scientific, design-related and artistic contexts, but its understanding varies across these domains. This paper reviews how abductive reasoning is discussed in epistemology, science and design, and then analyses how various computational systems use abductive reasoning. Our analysis shows that neither theoretical accounts nor computational implementations of abductive reasoning adequately address the generation of creative hypotheses. Theoretical frameworks do not provide a straightforward model for generating creative abductive hypotheses, and computational systems largely implement syllogistic forms of abductive reasoning. We break abductive computational systems down into components and conclude by identifying specific directions for future research that could advance the state of creative abductive reasoning in computational systems.
The Wizard in the Town Plaza: Voice-Based Interactive Storytelling in Public Spaces
Abstract: We present The Wizard in the Town Plaza: a field report and a demonstration of a system that generates narrative content using advanced voice cloning, a community-informed knowledge base, and a structured story world. In March 2025 (spring equinox), we deployed this system in a small tourist and college town in the USA as part of a seasonal festival. A human actor served as a live intermediary between the AI system and visitors, delivering AI-generated responses to audience questions. This deployment demonstrated the potential for AI-driven storytelling to operate in oral, performative, and social contexts, offering cultural and educational value. We detail the system architecture, including cloned voice, narrative knowledge graphs, and LLMs, while addressing ethical considerations and community response. We also outline its expansion into an interactive mural project designed to amplify local Indigenous voices and history through AI-supported storytelling.
Automatic Narrative Knowledge Base Generation
Abstract: Traditional symbolic CC systems like [ANON] often require handcrafted knowledge bases. To advance the development of the [ANON] project, this paper introduces methods for the automatic creation of a knowledge base of short stories. The methods include a series of requests to Deepseek’s R1 model to extract relevant structured data from a narrative, using the model to validate and correct the extracted data, and then parsing the structured data and formatting it into the required [ANON] artifacts. The process is validated by evaluating the quality of the extracted narrative data through a human survey. The results show that the process was effective at extracting conceptually accurate structured narrative data from a set of test stories. This work removes a significant bottleneck for [ANON], a step necessary for the system to advance toward a deeper understanding of narrative generation, and demonstrates a unique symbiosis between symbolic and generative AI systems.
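A minimal Python sketch of the extract-validate-parse loop this abstract describes, assuming a generic call_llm helper in place of the actual Deepseek R1 requests; the JSON schema shown is illustrative and not the [ANON] artifact format.

import json

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM endpoint and return its text reply."""
    raise NotImplementedError

EXTRACT_PROMPT = (
    "Read the story below and return JSON with keys "
    "'characters', 'locations', and 'events' (each a list of strings).\n\n{story}"
)
VALIDATE_PROMPT = (
    "Check this JSON against the story. Fix any missing or incorrect entries "
    "and return corrected JSON only.\n\nStory:\n{story}\n\nJSON:\n{data}"
)

def build_knowledge_entry(story: str) -> dict:
    extracted = call_llm(EXTRACT_PROMPT.format(story=story))                    # step 1: extract structured data
    corrected = call_llm(VALIDATE_PROMPT.format(story=story, data=extracted))   # step 2: validate and correct
    return json.loads(corrected)                                                # step 3: parse into a structured artifact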
Automatic Aesthetic Evaluation in Generative Image Models
Abstract: Recent art assessment neural models have shown promising results as aesthetic evaluators of paintings, computing objective metrics through handcrafted features and deep learning techniques. While these models have been applied to evaluate human-produced pieces, their use in measuring the aesthetic quality of artificial intelligence (AI)-generated artistic pieces remains unexplored. This paper employs ArtClip, a recent art assessment neural model, to evaluate the aesthetic quality of paintings produced by state-of-the-art image generation models. Our methodology describes human artworks automatically with an image-to-text model and generates new images from these prompts, allowing us to pair each human artwork with a generated piece. We compare the distributions of nine different aesthetic scores given by ArtClip for human and AI-generated paintings. We considered models of two considerably different sizes to measure the impact of model size on aesthetic quality. The results showed that human artwork has higher aesthetic quality on all nine metrics, although the larger generators perform similarly.
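An illustrative outline of the pairing-and-scoring pipeline described above; caption_image, generate_image, and artclip_scores are hypothetical placeholders standing in for the image-to-text model, the generative image models, and the ArtClip evaluator.

import numpy as np

def caption_image(path):
    """Placeholder for the image-to-text model."""
    raise NotImplementedError

def generate_image(prompt):
    """Placeholder for the text-to-image generator."""
    raise NotImplementedError

def artclip_scores(image):
    """Placeholder: returns a dict of the nine aesthetic scores."""
    raise NotImplementedError

def compare(human_paths):
    human, generated = [], []
    for path in human_paths:
        prompt = caption_image(path)        # describe the human artwork
        gen_img = generate_image(prompt)    # produce a paired AI-generated artwork
        human.append(artclip_scores(path))
        generated.append(artclip_scores(gen_img))
    # Mean of each aesthetic metric for the human and generated groups.
    return {m: (np.mean([h[m] for h in human]), np.mean([g[m] for g in generated]))
            for m in human[0]}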
Reimagining Dance: Real-time Music Co-creation between Dancers and AI
Abstract: Dance performance traditionally follows a unidirectional relationship where movement responds to music. While AI has advanced in various creative domains, its application in dance has primarily focused on generating choreography from musical input. We present a system that enables dancers to dynamically shape musical environments through their movements. Our multi-modal architecture creates a coherent musical composition by intelligently combining pre-recorded musical clips in response to dance movements, establishing a bidirectional creative partnership where dancers function as both performers and composers. Through correlation analysis of performance data, we demonstrate emergent communication patterns between movement qualities and audio features. This approach reconceptualizes the role of AI in the performing arts as a responsive collaborator that expands possibilities for both professional dance performance and improvisational artistic expression across broader populations.
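The correlation analysis mentioned above can be illustrated with a small, self-contained sketch; the movement and audio feature names, and the random data, are assumptions made purely for illustration.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
frames = 500
movement = {"velocity": rng.random(frames), "expansion": rng.random(frames)}   # per-frame movement qualities
audio = {"loudness": rng.random(frames), "tempo": rng.random(frames)}          # per-frame audio features

# Correlate every movement quality with every audio feature.
for m_name, m_vals in movement.items():
    for a_name, a_vals in audio.items():
        r, p = pearsonr(m_vals, a_vals)
        print(f"{m_name} vs {a_name}: r={r:+.2f} (p={p:.3f})")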
Prompting AI in Co-Creation: The Role of Syntax and Sentiment in Shaping AI-Generated Content
Abstract: As Generative AI has advanced at an incredible pace and has been increasingly integrated into creative domains, understanding how users’ prompting styles shape AI-generated creative content is critical. Despite a growing interest in human-AI co-creation, there is limited research exploring how linguistic and emotional cues in user prompts affect AI-generated responses in creative domains. This study addresses this gap by analyzing the syntactic and sentiment patterns of user prompts and their impact on ChatGPT’s responses in a co-creative storytelling study with 100 participants. We found distinct stylistic patterns in both prompts and responses, and observed that ChatGPT often amplified users’ sentiment and positivity. The findings offer insights into how AI can be optimized for more appropriate interactions in co-creation.
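A hedged sketch of prompt-level sentiment and surface-syntax measurement of the kind the study describes; VADER and the toy syntax features are used here only as examples, not necessarily the tools used in the paper.

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

def analyze_prompt(prompt: str) -> dict:
    tokens = prompt.split()
    return {
        "n_tokens": len(tokens),                                # crude length / syntactic-complexity proxy
        "is_question": prompt.strip().endswith("?"),            # interrogative vs. imperative phrasing
        "sentiment": sia.polarity_scores(prompt)["compound"],   # -1 (negative) .. +1 (positive)
    }

print(analyze_prompt("Could you please write a cheerful opening for our story?"))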
Controlling the image generation process with parametric activation functions
Abstract: As image generative models continue to increase not only in their fidelity but also in their ubiquity, the development of tools that leverage direct interaction with their internal mechanisms in an interpretable way has received little attention. In this work, we introduce a system that allows users to develop a better understanding of the model through interaction and experimentation. By giving users the ability to replace activation functions of a generative network with parametric ones and a way to set the parameters of these functions, we introduce an alternative approach to controlling the network’s output. We demonstrate the use of our method on StyleGAN2 and BigGAN networks trained on FFHQ and ImageNet, respectively.
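A generic PyTorch sketch (not the paper's implementation) of the core idea: swapping a generator's fixed activations for parametric ones whose parameters a user can set to steer the output. The ParametricActivation form and the toy network are assumptions.

import torch
import torch.nn as nn

class ParametricActivation(nn.Module):
    """y = scale * tanh(slope * x) + bias, with user-controllable parameters."""
    def __init__(self, slope=1.0, scale=1.0, bias=0.0):
        super().__init__()
        self.slope, self.scale, self.bias = slope, scale, bias

    def forward(self, x):
        return self.scale * torch.tanh(self.slope * x) + self.bias

def replace_activations(model: nn.Module, **params):
    # Recursively swap every ReLU/LeakyReLU for the parametric variant.
    for name, child in model.named_children():
        if isinstance(child, (nn.ReLU, nn.LeakyReLU)):
            setattr(model, name, ParametricActivation(**params))
        else:
            replace_activations(child, **params)

# Toy "generator" standing in for StyleGAN2/BigGAN layers.
gen = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))
replace_activations(gen, slope=2.0, scale=0.5)
print(gen(torch.randn(1, 8)).shape)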
Making the Familiar Unfamiliar: AI-Driven Synectic Metaphor Generation for Computational Creativity
Abstract: Metaphors play a central role in human creativity, enabling abstract reasoning and novel conceptual connections. This paper explores how AI can generate and evaluate metaphors using Synectics-based creativity techniques, a structured framework that systematically makes the familiar unfamiliar. We introduce a computational pipeline leveraging GPT-4 to generate metaphors across four structured analogy types: Direct, Personal, Symbolic, and Fantasy. A comparative analysis against a dataset of 1,000 human-created metaphors drawn from literature and cultural memory is conducted using a three-tier evaluation framework: semantic coherence (BERT embeddings), novelty (TF-IDF similarity), and GPT-4 self-assessed creativity. Our findings show that AI-generated metaphors exhibit higher novelty but lower coherence compared to human-created ones, with Fantasy analogies achieving the highest creativity at the expense of logical structure. Thematic abstraction significantly influences metaphor quality, with abstract themes such as Dreams and Knowledge fostering more imaginative outputs. This study contributes to computational creativity research by demonstrating how structured AI-driven conceptual shifts enhance metaphor generation. We discuss practical applications in creative writing, education, and ideation, and propose future directions including hybrid evaluation methods combining computational and human assessment for deeper insight into AI-generated figurative language.
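Two of the three evaluation tiers described above can be sketched as follows; the encoder choice and the exact scoring formulas are assumptions rather than the paper's definitions.

from sentence_transformers import SentenceTransformer, util
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # example BERT-style encoder

def coherence(source: str, target: str) -> float:
    # Higher cosine similarity between the metaphor's two concepts ~ more coherent.
    emb = embedder.encode([source, target])
    return float(util.cos_sim(emb[0], emb[1]))

def novelty(candidate: str, human_metaphors: list[str]) -> float:
    # 1 - max TF-IDF similarity to the human corpus ~ more novel.
    tfidf = TfidfVectorizer().fit(human_metaphors + [candidate])
    vecs = tfidf.transform(human_metaphors + [candidate])
    sims = cosine_similarity(vecs[-1], vecs[:-1])
    return 1.0 - float(sims.max())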
Automatically Detecting Amusing Games in Wordle
Abstract: We explore automatically predicting which Wordle games Reddit users find amusing. We scrape approximately 80k reactions by Reddit users to Wordle games, classify the reactions as expressing amusement or not using OpenAI’s GPT-3.5 with few-shot prompting, and verify that GPT-3.5’s labels roughly correspond to human labels. We then extract features from Wordle games that can predict user amusement. We demonstrate that the features indeed provide a (weak) signal for user amusement as labelled by GPT-3.5. Our results indicate that user amusement at Wordle games can be predicted computationally to some extent, and we explore which features of the game contribute to it. We find that user amusement is predictable, indicating a measurable aspect of creativity infused into Wordle games through humour.
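A sketch of the prediction step: hand-crafted game features fed to a simple classifier. The features, random data, and classifier here are invented for illustration; in the paper, amusement labels come from GPT-3.5 few-shot classification of Reddit reactions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def game_features(guesses: list[str], answer: str) -> list[float]:
    # Example hand-crafted features of a single Wordle game.
    return [
        len(guesses),                                   # number of guesses used
        float(guesses[0] == answer),                    # lucky first-guess solve
        sum(g[-2:] == answer[-2:] for g in guesses),    # repeated near-miss endings
    ]

rng = np.random.default_rng(0)
X = rng.random((200, 3))          # stand-in feature matrix
y = rng.integers(0, 2, 200)       # stand-in amusement labels
print(cross_val_score(LogisticRegression(), X, y, cv=5).mean())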