
Cognitive aesthetics in program texts

In this section, we show how the aesthetics of source code can be understood through the dual lenses of spatial navigation and semantic compression. We start by highlighting how previous scholars have engaged with the semantic ambiguities that source code presents, including linguistic, poetic and functional perspectives.
We complement this approach by suggesting that semantic compression is tied to the spatial navigation of a program text, i.e. its non-linear, active reading patterns. To do so, we consider source code aesthetics along a logic of levels: working from structure, across syntax and towards vocabulary, these levels carry different connotations in terms of abstraction. Positively valued aesthetic manifestations at each of these levels facilitate reasoning about more or less abstract parts [195] of the program text. Aesthetic manifestations of source code thus provide different levels of granularity when it comes to describing the what, how and why of a program.
Furthermore, we show how this is complemented by semantic compression, understood as the ability to reference concepts from multiple domains (hardware, software, problem) in order to minimize the cognitive effort necessary to grasp all the implications denoted by a given token. As different practices of different programmers might prioritize different aspects of source code aesthetics, we provide an overview of how values such as abstraction, transparency, openness, function and emotion are best seen in certain kinds of program texts, but are not exclusive to them.

Between humans and machines

The ambivalence of source code has been explored in the literature under different names. As we will see, all of these argue for the intertwining of human interpretation and machine execution. This ambivalence is first taken up by Mateas and Montfort in their study of weird programming languages (Mateas, 2005); in it, they highlight an aesthetics of code that goes beyond the mainstream "literate programming" (see Code as text). Rather than making code clear and elegant, they inquire about the aesthetic effects of obfuscation in esoteric languages, an aesthetics which departs from the requirement that source code be understandable to both humans and computers, and ultimately argue that esoteric languages do so by playing with the more traditional understanding of double-coding as "open to multiple interpretations" [196].
Also focusing on the more fringe and creative uses of code, Camille Paloque-Bergès presents the related concept of double-meaning in her work on networked texts and code poetics (Paloque-Bergès, 2009). She defines it as the affordance provided by the English-like syntax of keywords reserved for programming to act as natural-language signifiers. As we have seen in Black Perl, the Perl functions can indeed be interpreted as regular words when the source is read as a human text. As she continues her analysis of codeworks, a body of literature centered around a créole language halfway between humanspeak and computerspeak [197], the concept can be extended to the aesthetically productive overlap of syntactic realms, while however leaving aside any functional or productive aspect of source code.
Such a layered approach echoes the strata that N. Katherine Hayles envisions when discussing the medium-specificity of electronic hypertexts [198]. While the object of her study is the electronic hypertext, meaning a text written and accessed via a computer but not necessarily written exclusively in source code, she makes several points which corroborate our work: these texts operate in three dimensions (and are thus navigable), are hybrids of programming and natural languages, and rely on distributed cognition for their reading and writing. This "cyborg reading practice" involves digital apparatuses such as the IDE (see Extended cognition) in order to fully access the aforementioned points of spatiality and hybridity. As such, these cyborg apparatuses act as a (software) interface to what is a (linguistic) interface: source code.
Previous research by Philippe Bootz has also highlighted the concept of the double-text in the context of computer poetry: a text which exists both in a prototypal, virtual, imagined form, as its source manifestation, and in an executed, instantiated, realized form (Bootz, 2005). However, he asserts that, in its virtual form, "a work has no reality", specifically because it is not realized. Here, we encounter the dependence of the source on its realized output, indeed a defining feature of the generative aesthetics of computer poetry. A work of code poetry, by contrast, can very much exist in its prototypal form, with its output providing only additional meaning, further qualifying the themes laid out in the source beforehand. From this perspective, a code poem would have a drastically diminished semantic richness if it were only read, or only executed. For this double-meaning to take place, we can say that the situation is inverted: the output becomes the virtual, imagined text, while the source is the concrete instantiation of the poem.
The role of execution is even more embedded in Geoff Cox and Alex McLean's take on double-coding (Cox, 2013). According to them, double-coding "exemplifies the material aspects of code both on a functional and an expressive level" (p.9). Cox and McLean's work, a thorough exploration of source code as an expressive medium, focuses on the political features of speaking through code as a subversive praxis. They work on the broad social implications of written and spoken code, with particular attention to the practice of live coding, rather than exclusively on the specific features which make source code expressive in the first place. Double-coding nonetheless helps us identify the unique structural features of programming languages which support this expressivity, such as reserved keywords, data types and control flow. As we show below, notably through the use of data types such as symbols and arrays in source code poetry, programming languages and their syntax hold within them a specific kind of semantics which grants expressive power to those who are familiar with them, once the computer semantics are understood both in their literal and in their metaphorical sense. The succinct and relevant use of these linguistic features can thicken the meaning of a program text and, in the case of code poetry, bring into the realm of the thinkable ways to approach metaphysical topics.
Finally, the tight coupling of the source code and the executed result brings up Ian Bogost's concept of procedural rhetoric (Bogost, 2008). Bogost presents procedures as a novel means of persuasion, alongside verbal and visual rhetorics. Working within the realm of videogames, he outlines that the design and execution of processes afford particular stances which, in turn, shape a specific worldview, and therefore argue for the validity of that worldview. Further work has shown that the examination of source code can already represent these procedures, and hence construct a potential dynamic world from the source (Tirrell, 2012; Brock, 2019). If procedures are expressive, if they can map to particular versions of a world which the player/reader experiences, then it can be said that their textual description can also already be persuasive, and elicit both rational and emotional reactions due to its depiction of higher-order concepts (e.g. consumption, urbanism, identity, morality). As its prototypal version, source code acts as the prerequisite for such a rhetoric, and part of its expressive power lies in the procedures it deploys (whether through value assignment, execution jumps or its overall paradigms). Manifested at the surface level through code, these procedures nevertheless run deeper into the conceptual structure of the program text, and such conceptual structures can in turn be echoed in the lived experiences of the reader.
The ambivalence between human meaning and machine meaning is thus at the core of source code aesthetics. We build on this work in our analysis below by showing some of the configurations of source code which can elicit an aesthetic experience during reading and writing. Through the investigation of levels of abstraction, spatial navigation, metaphorical expression and functional correspondence, we offer concrete examples of the different ways semantic layerings can be manifested in program texts.

Matters of scale

The discourses of programmers in our corpus do not contain uni-dimensional criteria, but rather criteria which can be applied at multiple levels of reading. Some relate more to the over-arching design of the code examined, while others focus on the specific formal features exhibited by a given token or succession of tokens in a source code snippet. To address this variability of focus, we borrow from John Cayley's distinction between structures, syntaxes and vocabularies (Cayley, 2012). Cayley's framework allows us to take into account an essential aspect of source code: the scales at which aesthetic judgment operates. Beyond literary studies, a comparable framework is used by Dijkstra when he introduces his approach to structured programming, from the high level of the program taken as a whole down to the details of line-by-line syntactic choices (Dijkstra, 2007). From the psychological accounts of understanding source code in The psychology of programming to the uses of space in domain-specific aesthetics in Words in space or Architectural beauty, one of the specificities of source code is the multiple dimensions of its deep structure hidden behind the two-dimensional layout of a text file, and the need for programmers to navigate such a space.
Structure is defined by the relative location of a particular statement within the broader context of the program text, as well as the groupings of particular statements in relation to each other and in relation to the other groups of statements within the program-text, whether it is across the same file, series of files, or a sprawling network of folders and files. This also includes questions of formatting, indenting and linting as purely pattern-based formal arrangements, as seen in Tools as a cognitive extension, since these affect the whole of the program-text.
Syntax concerns the local arrangement of tokens within a statement or a block statement, including control flow, iterator statements, function declarations, etc., which can be referred to as the "building blocks" of a program-text. Syntax also includes language-specific choices (idioms) and, more generally, the type of statements needed to best represent the task required (e.g. using an array or a struct as a particular data structure for representing entities from the problem domain).
Finally, vocabulary refers to the user-defined elements of the source code, in the form of variable, function, class and interface names. Source code vocabulary is made up of both reserved keywords (which the computer "understands" because they are explicitly defined by the language designers) and user-defined names: the single words which the writer defines themselves, and which are not known to a reader who would already know the language's keywords. Unlike the two preceding categories, this is therefore the only one where the writer can come up with new tokens, and it is the closest to metaphors in traditional literature.

Structure

At the highest level, the structure of a program-text can be examined at the surface level and at the deep level. The criterion for beauty in surface structure is layout, understood as the spatial organization of statements through the use of line breaks and indentation. While serving additional ends towards understanding, proper layout (whether following stylistic conventions, or deliberately positioned against them) seems to be the first requirement for beautiful code. In terms of aiding understanding, blank space creates semantic groupings which enable the reader to grasp, at a glance, the decisive moments (Sennett, 2009) in the code's execution; these groupings are presented by some as akin to paragraphs in literature (Matsumoto, 2007). Such groupings also fit Détienne's identification of beacons (Detienne, 2001) as visual markers that indicate important actions in the program text, whether these actions can be described in a single line, in a block or, more rarely, in a series of blocks. Any cursory reading of source code first and foremost judges layout.
This aid to understanding is further highlighted by a deep-structure criterion of conceptual distancing: statements that have to do with each other are either located close to each other or, in the case of more complex program texts, separated into coherent units (such as folders) and connected by single syntactic expressions, such as import or use statements. Such statements establish an intratextual dimension insofar as they act as aliases for larger pieces of code, making the most of practices of abstraction and of non-linear reading. As such, visual appearance at the level of the file can reflect the conceptual structure of the code. At the level of the folder(s), that is, at the level of a collection of files located at different levels of nestedness, one can also highlight stylistic agreements or disagreements. Conventional architectures such as Model-View-Controller, or the use of lib, bin or data folders, also act as aesthetic markers establishing the mental space of the programmer ahead of the reading of the actual source code [199].
Another instance of intratextual interfacing is the limitation of function arguments, according to which the arguments given to a function should be either few, or grouped under another name. Going back to the structural criterion above of limiting input/output and keeping groups of statements conceptually independent, function arguments meet this requirement at the level of vocabulary, demonstrating in passing the relative porosity of those categories. Indeed, the naming of variables also reveals the choice of adequate data structures, echoing those who claim that the data on which the code operates can never be ignored, and that beautiful code is code which takes that data into account and communicates it, and its mutations, in the clearest, most intelligible way possible. Such an echo relates to our discussion of problem-domain modelling, analyzed in Modelling complexity; ultimately, the structure, syntax and vocabulary of a program text are necessarily involved with the problem domain.
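As a minimal sketch of this criterion, the Python fragment below contrasts a function taking four loose arguments with one taking a single, named grouping. The Viewport type and both function names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewport:
    """Groups four related values under a single name (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def area_from_values(x: int, y: int, width: int, height: int) -> int:
    # Many positional arguments: the reader must remember their order.
    return width * height


def area_of(viewport: Viewport) -> int:
    # One named argument: the data structure is visible in the signature.
    return viewport.width * viewport.height


print(area_of(Viewport(x=0, y=0, width=640, height=480)))
```

Both functions compute the same value; the second one names the data it operates on, which is where the structural criterion meets vocabulary.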
Within the file, structural aesthetic criteria are vague enough to be open to interpretation by practitioners, and are therefore unable to act as strict normative criteria, but they are nonetheless a solid heuristic of the quality of the software. For instance, a program text should follow the stepdown rule of function declaration (in which each function is followed by the lower-level functions it calls) rather than the alphabetical rule when writing in a language which doesn't enforce it. As for variable declarations, global variables should all be declared at the beginning of the highest scope to which they belong (e.g. at the beginning of the file), rather than at the closest location of their next use. Program-texts therefore tend to be more aesthetically pleasing when semantic groupings are respected by the human writer (such as global variable declaration) and syntactic groupings are respected by the machine writer (through linting and formatting).
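The two conventions can be sketched in a short Python file, a language which enforces neither of them. All names are hypothetical; the point is only the ordering: globals first, then functions from the most abstract to the most concrete.

```python
# Global constants are declared once, at the top of the file.
GREETING = "Hello"
SEPARATOR = ", "


def main() -> str:
    """Highest-level function first: it reads like a summary of the file."""
    names = load_names()
    return format_greeting(names)


def load_names() -> list[str]:
    """One level of abstraction below main()."""
    return ["Ada", "Grace"]


def format_greeting(names: list[str]) -> str:
    """Lowest level: the actual string manipulation."""
    return GREETING + " " + SEPARATOR.join(names)


if __name__ == "__main__":
    print(main())
```

An alphabetical ordering (format_greeting, load_names, main) would run identically, but the reader would meet the details before the overview.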
This uncovers the related criterion of local coherence: things that are next to each other should be related to each other. Local coherence reveals what Goodman calls semantic density, in which tokens grouped together obtain a greater denotative power, while remaining open to further modification through insertions within this grouping. Local coherence operates in balance with the undesirable but unavoidable entanglement of code, as proponents of local coherence in source code imply that a beautiful piece of code should not have to rely on input and output (i.e. not be entangled) and should therefore be entirely autotelic. Such an assumption runs contrary to the reality of software development as a practice, and of software as an object embedded in the world; entirely autotelic code would thus not be "usable" by software developers. This balancing issue can be resolved by writing source code whose blocks are structured in such a way that they are related to, but not dependent on, each other [200].
As we travel down from structure to syntax, we can point to a correlate of conceptual distancing in the form of conceptual symmetry, according to which groups of statements which do the same thing should look the same. It then becomes possible to catch a glimpse of patterns, through which readers get a grasp of what each pattern does. Conceptual distancing can be further improved by conceptual uniqueness, which demands that all the statements that are grouped together refer to one single action: complex enough to be useful, and simple enough to be graspable. Aesthetically pleasing code is thus code that "does the job" while using the least number of different ideas, which implies a linear relationship between the number of lines of code and the amount of conceptual information to be understood [201], and a relation to elegance.
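Conceptual symmetry can be illustrated with a small, hypothetical Python sketch: the first two functions do the same thing and look the same, while the third does the same thing under a different shape.

```python
def normalize_width(width: float, max_width: float) -> float:
    if max_width == 0:
        return 0.0
    return width / max_width


def normalize_height(height: float, max_height: float) -> float:
    if max_height == 0:
        return 0.0
    return height / max_height


def normalize_depth_asymmetric(depth: float, max_depth: float) -> float:
    # Same behavior, different shape: the reader has to check that it matches.
    return depth / max_depth if max_depth else 0.0
```

The symmetric pair can be recognized as one pattern at a glance; the third function has to be read on its own terms before the reader can tell that it belongs to the same family.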
Finally, it should be noted that these aesthetic criteria of structure are most relevant for a particular class of program texts, those written by software engineers. In the case of hackers, poets or scientists [202], program texts are limited in number of lines when compared to the code bases of, for instance, large open-source projects. And yet, as we will see below, in their case, aesthetic code is still code which manages to pack the maximum number of ideas into a minimal number of lines, whether in obfuscation practices, one-liners, poetic depictions or demonstrations of algorithmic ideas.

Syntax

Syntax, as the mid-level group of criteria, deals most explicitly with two important components of the implementation: the algorithm and the programming language. Beautiful syntax seems to denote a conceptual understanding of the computational entities and of the tools at hand to solve a particular problem, and implies an expertise (i.e. both a knowing-what and a knowing-how).
Here, we consider both algorithms and languages as tools since they are part of the implementation process: a process which turns an idea into form and ensures that this form is functional, and can thus subsequently be examined for aesthetic purposes. While algorithms exist independently from languages, their aesthetic value in the context of this research cannot be separated from the way they are written, itself affected by the language they are written in. Indeed, most algorithms are expressed first as pseudo-code and then implemented in the language that is most suited to a variety of factors (e.g. speed, familiarity of the author, suitability of the syntax, nature of the intended audience); this seems to be a contemporary version of the practice of the 1950s, when computer scientists would devise those algorithms with pencil and paper, and then leave their implementation in the hands of entirely different individuals: women computers (Chun, 2005).
Beautiful syntax in code responds to this limitation. It aims at resolving the tension between clarity and complexity, with the intent to minimize the number of lines of code, while maximizing both the conceptual implications and the specific affordances for modification. Since algorithms must be implemented in a certain context, with a certain language, it is the task of the writer to best do so with respect to the language that they are currently working in. In this case, knowledge of the language from both writers and readers makes idiomatic syntax a beautiful syntax (see Styles and idioms in programming above). This involves knowing the possibilities that a given language offers and, in the spirit of the craftsmanship ethos noted previously, working with the language rather than against it. These sets of aesthetic criteria thus become dependent on the syntactical context of the language, itself dependent on its suitability for the problem at hand, and can only be established with regard to each language. Specifically, this involves knowing which keywords should traditionally not be used, such as unless in Perl or * in C, and knowing when to use decorators in Python, or the spread ... operator in ECMAScript, etc. A common feature shared by these keywords is their tendency to cause more cognitive friction than ease of comprehension.
Here, syntax also engages with the ideal of conciseness: a writer can only be concise if they know how the language enables them to be concise. Knowing the algorithms implemented and the problem domain addressed also influences the overall experience of the program text, as the goal is to optimize these three components. The extent to which a syntax is idiomatic and the extent to which the problem domain is accurately represented [203] are therefore good indicators of the aesthetic value of a program-text. Conversely, in software engineering, quality syntax is also syntax which refrains from being too idiomatic for the purposes at hand; overly idiomatic syntax is referred to as "clever code" and is generally frowned upon. In the case of hacking code or poetry, cognitive friction is, on the contrary, seen as a positive aesthetic experience. By doing the most with the least, complex hacker code and code poetry enable understandings of polar opposites: hacker syntax displays insight into the highly technical hardware or machine-linguistic environment, while poet syntax offers access to broad human concepts (e.g. of the self, of religion, or history) through a minimal number of lines of code.
A programmer who finds that she can best communicate her ideas in Java will find Java beautiful. A developer who finds that she can best communicate her ideas while writing in Go will find Go beautiful, and so on [204]. Ultimately, a syntactical criterion which acts as a response to these discussions is consistency. While there might be specific, personal preferences as to why one would want to write code one way or another (e.g. calling functions on objects rather than calling functions from objects in order to prevent output arguments), this minor increase in aesthetic value through subjective satisfaction, through the display of individual skill and personal knowledge, does not compensate for the possible increase in cognitive noise in a collaborative environment. If those different ways of writing are used alternately in an arbitrary manner, this requires unnecessary mental gymnastics from the reader. In this context, consistency prevails over efficacy, which confirms that aesthetics in source code is here a game of tradeoffs. Again, hacker and poet aesthetics stand in opposition: the highly localized and personal function of the program text implies a tolerance for idiosyncrasy, since personal knowledge and preference are part of the aesthetic value of the program text.
Beyond the state of syntactic consistency and idiomatic writing, another aesthetic criterion is linguistic reference, meaning bringing practices from one language into another. Being able to implicitly reference another language in a program-text [205], a code-switching of sorts, can communicate a deep understanding of not just a language, but an ecosystem of languages, while satisfying the purpose of maintaining clarity, at the expense of, again, assuming a certain skill level in the reader. This communicates a feeling of higher understanding, akin to perceiving all programming languages as ultimately just "tools for the job" whose purpose is always to get a concept across minds as fully and clearly as possible. However, a misguided intention of switching between two languages, or a mishandled implementation, can push a program-text further down the gradient of ugliness. The concept communicated would in such a case be obscured by the conflicting idioms, reveal a lack of mastery of the unique aspects of the working language(s), and therefore fail to fulfill the aesthetic criterion of being true to one's material.
Moving down to the level of vocabulary, a final syntactic criterion with high aesthetic value is the preference for natural language reading flow. For instance, of the two alternatives in Ruby, if people.include? person vs. if person.in? people, the second one is to be considered more beautiful than the first one, since it adapts to the reader's habit of reading human languages. However, the essential succinctness and clarity of source code is not to be sacrificed for the sake of human-like reading qualities, such as when writers tend to be overly explicit in their writing. Indeed, a definite criterion for ugliness in a program-text is verbosity, or the useless addition of statements without an equivalent addition of functionality or clarity. This is, once again, an example of source code aesthetics being a balance between machine idioms and human idioms; here, the resolution of this balance is the point at which machine idioms are presented as human-readable.
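A rough Python analogue of the Ruby pair above, together with an example of verbosity, might look as follows. The people and person values and the function names are hypothetical, and operator.contains is used here only to mimic the container-first word order.

```python
import operator

people = {"ada", "grace"}
person = "ada"

# Container-first phrasing, close to the Ruby `people.include? person`.
container_first = operator.contains(people, person)

# Subject-first phrasing, close to `person.in? people`: it reads like a sentence.
subject_first = person in people


def is_member_verbose(person: str, people: set[str]) -> bool:
    """Verbose variant: extra statements, no extra functionality or clarity."""
    if (person in people) == True:
        return True
    else:
        return False


def is_member(person: str, people: set[str]) -> bool:
    """Concise variant with the same behavior."""
    return person in people
```

Both membership tests return the same result, and both functions behave identically; what differs is how much the reader has to traverse before reaching the meaning.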

Vocabulary

Vocabulary, as the only component in this framework which directly involves words that can be chosen by the writers themselves, is often the most discussed in the literature on beautiful code among software developers: it is the closest to human aesthetics, its understanding does not require existing knowledge of programming, and its functional impact is the easiest to assess (Oliveira, 2022). Aesthetics here exist at the level of the name and most directly affect the readership of a program text.
Of the two big problems of programming, the most frequent one is naming [206]. One reason why might be that naming is an inherently social activity, because a name is an utterance which only makes sense when made in the expectation of someone else's comprehension of that name (Voloshinov, 1986). This is supported by the fact that the process of creating a variable or function name is often more time-consuming when done alone than when discussed with others. Naming, furthermore, aims not just at describing, but at capturing the essence of an object, or of a concept. This is a process that is already familiar in literary studies, particularly in the role of poetry in naming the elusive. Here, we remember how Vilém Flusser sees poetry as the bringing-forth of that which is conceivable but not yet speakable through its essence in order to make it speakable through prose, using the process of naming through poetry in order to allow for its use and function in prose (see Literary Beauty for a mention of Flusser's conception of prose and poetry). In this light, good, efficient and beautiful names in code are those which can communicate the essence of the concept that is being operated upon in the program text, implying both what they are and how they are used, while omitting extraneous details.
On a purely sensory level, surface-level aesthetic criteria related to naming are those of character length and pronounceability. Visually, character length can indicate the relative importance of a named concept within the greater structure of the program-text. Variables with longer names are variables that are more important, demand more cognitive attention, and offer greater intelligibility in comparison with shorter variable names, which only need to be "stored in memory" by the reader for a shorter amount of time. This length also signifies the level of abstraction at which the variable operates: longer names denote more global variables that reflect the program text's structure, while shorter variable names indicate that we are currently working at a lower level of abstraction. Variables and functions with longer names thus exist in a broader scope than their counterparts with shorter names, re-introducing a component of scale within our existing framework of scale.
Pronounceability, meanwhile, takes into account the basic human action of "speaking into one's head", as an internal dialogue, and therefore participates in the requirement for communicability of source code amongst human readers. For instance, mnPtCld and meanPointCloud both refer to the same entity, but the second provides an easier cognitive access to it at the very minimal expense of a few characters.
Equally visual, but aesthetically pleasing for both typographical and cognitive reasons, is the casing of names. Dealing with the constraint that variable names cannot contain whitespace characters, casing has resulted in the establishment of conventions which pre-exist the precise understanding of what a word denotes, by first bringing that word into a category (all-caps denotes a constant, camelCasing denotes a multi-word variable and first-capitalized words indicate classes or interfaces). By using multiple cues (here, typographical, then semantic), casing helps with understandability and, in this specific instance, there seems to be quantitative evidence that camelCasing facilitates the scanning of a program text (Binkley, 2009). Again, casing, by its existence first as a convention, implies that it exists within a social community of writers and readers, acknowledges the mutual belonging of both writer and reader to such a community, and turns the program-text from a readerly text further into a writerly one (Barthes, 1984).
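The categorizing role of casing can be sketched in Python, with the caveat that Python's community style guide (PEP 8) writes multi-word variables in snake_case rather than in the camelCase mentioned above; the principle of reading a category off the typography is the same. All names are hypothetical.

```python
MAX_POINTS = 3  # all-caps: a constant


class PointCloud:  # capitalized words: a class
    def __init__(self, points):
        # Keep at most MAX_POINTS points.
        self.points = list(points)[:MAX_POINTS]


# lowercase multi-word name: a variable (snake_case in Python)
mean_point_cloud = PointCloud([(0, 0), (2, 4)])
```

Before knowing what any of these three names denotes, a reader familiar with the convention already knows which one is fixed, which one is a type and which one holds a value.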
Following these visual, auditory and typographical criteria, an aesthetically pleasing vocabulary includes a strict naming of functions as verbs and variables as nouns. Continuing this correspondence between machine language and human language, there is here a clear mapping between syntax and semantics: functions do things and variables are things. If written the other way around, while this would respect the criterion of consistency, functions as nouns and variables as verbs hint at what they are not, are counter-intuitive and are ultimately confusing, and confusion brings ugliness.
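A minimal, hypothetical Python sketch of this mapping, followed by its inversion:

```python
prices = [3, 4, 5]  # noun: a thing


def sum_prices(prices: list[int]) -> int:  # verb: an action
    return sum(prices)


total = sum_prices(prices)  # noun: the result of the action


# Inverted naming: consistent, yet each name hints at what it is not.
calculate = [3, 4, 5]  # a verb naming a thing


def price_total(calculate: list[int]) -> int:  # a noun naming an action
    return sum(calculate)
```

Both halves compute the same value; only the first lets the reader guess, from the names alone, which tokens can be called and which can be read.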
The noun given to a variable should be a hint towards the concept addressed, and ideally address what it is, how it is used, and why it is present, things that cannot be deduced from the environment of the program text [207]. These three aims are not necessarily easily achieved at the same time, but finding one word which, through multiple means, points to the same end, is an aesthetic goal of source code writers and another testimony of elegant writing. A beautiful name is a name which differentiates between value (obvious, decontextualized, and therefore unhelpful) and intention, informing the reader not just about the current use, but also about future possible use, in code that is written or yet to be written. This is particularly salient in the general distaste for the use of magic numbers, as pure values without a semantic label are called. We see here a paradox between the direct conceptual relationship between a name and what it denotes, and the multiple meanings that it embodies (its description, its desired immediate behaviour, and its purpose).
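The difference between value and intention can be sketched in Python with a hypothetical cache-expiry check; the one-day lifetime is an assumption made for the example.

```python
def is_expired_magic(age_in_seconds: int) -> bool:
    # Magic number: the value is visible, the intention is not.
    return age_in_seconds > 86400


SECONDS_PER_DAY = 24 * 60 * 60
CACHE_LIFETIME_SECONDS = SECONDS_PER_DAY  # hypothetical policy: one day


def is_expired(age_in_seconds: int) -> bool:
    # Named value: says what it is, how it is used and why it is there.
    return age_in_seconds > CACHE_LIFETIME_SECONDS
```

The two functions are equivalent for the machine; only the second tells a reader what would need to change if the policy did.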
Indeed, in the community of software developers, variable names should then have a direct mapping with the object or concept they denote. This is not the case in other communities, whether those that rely on obfuscation, in which confusion becomes beautiful, or in poetic code, in which double-meaning brings an additional, different understanding which ultimately enriches the complexity of the reading, by moving it away from strict functionality.
The contextual nature of source code aesthetics proposes a slightly adjacent standard for source code poetry, in which the layering of meanings is a positive aesthetic trait in the community of code poets. These ambivalent semantics allow writers to offer metaphors, and provide an entry point to the metaphorical tendencies of source code. This aesthetic criterion of double-meaning comes from poetry in human languages, in which layered meanings are aesthetically pleasing, because they point to the un-utterable, and as such, perhaps, the sublime (Aquilina, 2015).

Comments

Before moving on to another aspect in which the aesthetics of source code are involved with the spatialization of mechanically-realized meaning, we touch on a specific case in machine languages: comments. Comments in code do not seem to fall clearly in any of the three categories above. Since they are, by definition, ignored by the compiler or interpreter, comments can contain erroneous statements which persist in an otherwise functional codebase, and are therefore not entirely trusted by experienced, professional software practitioners. In this configuration, comments seem to exist as a compensation for a lack of functional aesthetic exchange.
By functional aesthetic exchange we mean an exchange in which a skilled writer is able to be understood by a skilled reader with regards to what is being done and how, only through executable source code. If any of these conditions fail (the writer isn't skilled enough and relies on comments to explain what is going on and how it is happening, or the reader isn't skilled enough to understand it without comments), then comments are there to remedy that failure, and are therefore a symptom of non-beautiful code, specifically because such code relies on extraneous devices that the computer does without. Nonetheless, they can act as a locus for social expression, or human creativity: the rendering-borders listing is an example of such a display of ingenuity and of helpful mental scaffolding for understanding the code.
rendering-borders

if (style == StyleBorderStyle::Dotted) {
    // If only this side is dotted, other side draws the corner.
    //
    //  otherBorderWidth + borderWidth / 2.0
    // |<------->|
    // |         |
    // +------+--+--------
    // |##  ##| *|*  ###
    // |##  ##|**|**#####
    // |##  ##|**+**##+##
    // |##  ##|* P *#####
    // |##  ##| ***  ###
    // |##  ##+-----------
    // |##  ##|  ^
    // |##  ##|  |
    // |##  ##| first dot is not filled
    // |##  ##|
    //
    //      radius.width
    // |<----------------->|
    // |                   |
    // |             ___---+-------------
    // |         __--     #|#       ###
    // |       _-        ##|##     #####
    // |     /           ##+##     ##+##
    // |   /             # P #     #####
    // |  |               #|#       ###
    // | |             __--+-------------
    // ||            _-    ^
    // ||          /       |
    // |         /        first dot is filled
    // |        |
    // |       |
    // |      |
    // |      |
    // |      |
    // +------+
    // |##  ##|
    // |##  ##|
    // |##  ##|
    Float minimum = otherBorderWidth + borderWidth / 2.0f;
    if (isHorizontal) {
      // ...
    } else {
      // ...
    }
    return P;
  }


- An example of useful comments complementing the source code in Mozilla's layout engine, literally drawing out the graphical task executed by the code.
Comments seem to be tolerated when they provide contextual information, thereby (re-)anchoring the code in a broader world. For instance, this is achieved by offering an indication as to why a given action is being taken at a particular moment of the code, in what are called contractual comments, again pointing at the social existence of source code. This particular use of comments seems to bypass the aesthetic criterion of source code being self-explanatory. However, it also integrates the criterion of being writable: a piece of code which, by its appearance, invites the reader to contribute to it, to modify it. As such, in an educational setting (from a classroom to an open-source project), comments are welcome, but they are rarely quoted as a criterion for beautiful code in other communities.
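A comment of this contextual kind might look like the following sketch. The function, the ceiling value and the hardware limitation it mentions are all invented for the purpose of illustration; the point is only that the comment states a reason which the executable code cannot state by itself.

```python
def normalize_volume(level):
    """Clamp a requested volume level to the range the player accepts."""
    # Why, not what: the (hypothetical) hardware mixer distorts above 80,
    # so the ceiling is deliberately lower than the nominal maximum of 100.
    ceiling = 80
    return max(0, min(level, ceiling))


print(normalize_volume(120))
```

Without the comment, a later contributor could reasonably "fix" the ceiling to 100; with it, the code invites modification while stating the constraint that any modification has to respect.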
In conclusion, we have seen that aesthetic standards for source code can be laid out along a logic of scale, from a macro-level of structure all the way to a micro-level of vocabulary, through an appropriate use of syntax. Throughout, the aesthetic principles of consistency, elegance and idiomaticity are recurring concepts against which a value judgment can be given. Such aesthetic manifestations enable the traversal of the program text, from the micro- to the macro-, from file to file, class to class, or layer to layer. However, this approach of a linear scale also hints at another dimension: that of the interface between human concepts and machine concepts, as source code aesthetics enhance the communication of separate semantic layers.

Semantic layers

The specificity of source code is that it acts as a techno-linguistic interface between two meaning-makers: the human and the machine. While the machine has a very precise, operational definition of meaning (see Programming languages above), programmers tend to mobilise different modalities in order to make sense of the system they are presented with through this textual interface (see The psychology of programming). Among those modalities are the resort to literary techniques (in the form of metaphors), to architecture (in the form of pattern-based structural organization), to mathematics (in the form of symbolic elegance) and to craft (in the form of material adequacy and reliability).
As a formal manifestation involving, in the context of a crafted object, a producer and a receiver, aesthetics contribute to the establishment of mental spaces. In domains such as mathematics or literature, mental spaces can represent theorems or emotions; within source code, however, they acquire a more functional dimension. As such, they also communicate states and processes.
Building on our discussion of understanding software in  Understanding computation   , we now highlight concrete instances of complex computational objects interfaced through source code. We take three examples to highlight the multiple manifestations of semantic layers at play in source code, operating in different socio-technical contexts, yet all sharing the same properties of using structure, syntax and vocabulary in order to communicate implicitly a relatively complex idea. We can consider a semantic layer to be an abstraction over relatively disorganized data in order to render it relatively organized, by providing specific reference points which contribute to the establishment of a mental space, based on the computational space of the program. In the words of Peter Neumann:
A challenge in computer system design is that the representation of the functionality at any particular layer of abstraction should exhibit just those characteristics that are essential at that layer, without the clutter of notational obfuscation and unnecessary appearance of underlying complexity. ( Neumann, 1990)
First, we look at the abstract data type called a semaphore, and how it operates as the interface between the computational reality of concurrency and the human associations of traffic, drawing primarily from the scientific domain. Second, we turn to a fragment of open-source software which uses abstraction and configuration in order to signify modification, in the context of the collaborative, read-write texts characteristic of software development. Finally, we discuss a code poem, as an illustration of the role of programming languages and creative metaphors in supporting a different kind of functional communication, in Goodman's sense of working. Each of these examples has been chosen to reflect the different practices of programming, and they should thus be considered in a complementary manner, rather than independently. While each was chosen for how well it illustrates its respective concept, abstraction, transparency and execution are all parameters that come into play in each practice of programming.
Programming can be seen as an act of encryption and decryption across layers, in which human meaning is encrypted in machine language for computational execution, with the possibility for such meaning of being decrypted later on for study and modification ( Ledgard, 2011) . We argue that this encryption process involves different aesthetic modalities which act as a heuristic for writing functionally good code and provide keys for decrypting the intent and processes represented in source code, moving across human and machine layers. Each in their own way, these modalities represent different semantic layers which bridge machine-meaning and human-meaning.

Abstraction and metaphors

An early and recurring problem in computer science is that of concurrency. Concurrency, or the overlapping execution of multiple interacting computational tasks, emerged along with the development of time-sharing and multi-core hardware. From the 1964 development of MULTICS, a time-sharing operating system that multiple users could use at the same time, to the popularization of multi-processor architecture in the early 1980s, computers moved from executing one task for one user at a time, to multiple tasks for multiple users. The issue that arises then is that of shared memory: how can one design a program in which two parallel operations can access and modify a shared resource, while at the same time guaranteeing the integrity of the data?
While this problem arose from hardware development, whether synchronization is of a universal nature—that is, whether there exists a solution which can be applied to all practical synchronization problems—has been an ongoing research investigation ( Leppäjärvi, 2008) . This tension between hardware innovation and fundamental computing problems illustrates the multiplicity of layers at play, from matter to ideal, in programming practice.
Specifically for concurrency, it turns out to be quite difficult for human programmers to model the different actions taking place in parallel, with overlapping consequences on common data. Just like programming languages can nudge their users into safety, there are technical systems which can be designed to help humans both think through a problem and mechanically implement this thinking. One such system is Edsger Dijkstra's semaphores.
Writing in 1965, Dijkstra describes a data type which prevents such issues of simultaneous access to critical data by two threads of the same process. This data type possesses two behaviours: post and wait. When a thread is about to access a critical part of data, it calls wait, and when it is done, it calls post. If another thread also calls wait before accessing the critical part, wait checks its internal value to see if another thread is currently processing that data. If it is, then it puts the requesting thread to sleep, and waits until post is called to wake it up (Dijkstra, 1965). A textbook implementation of such a data structure is described in the semaphore-pseudocode listing.
For instance, say two separate users (P and J) want to watch an online video at the same time, and the website wants to keep track of the number of views. Let viewCount be the variable holding the number of views, equal to 2046 before the visit. Parallel execution means that, if P and J view the video at the same time, they might both separately increase viewCount from 2046 to 2047, rather than one waiting for the other to complete the increase, which would yield the correct viewCount value of 2048. In this case, a construct such as a semaphore would be used by the first user, P, calling wait on the semaphore which is attached to the database, before updating viewCount in the database. When the second user, J, wants to increase the view count as well, they see that the semaphore is raised, and they cannot access the database immediately. Once P is done with the database update, they call post. At this point, J is allowed to operate on the database, with a guarantee of data integrity.
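The view-count scenario can be sketched in Python with the semaphore provided by the standard threading module, where acquire and release play the roles of wait and post. This is a minimal sketch under stated assumptions: the names view_count, guard and register_view are ours, and a plain variable stands in for the website's database.

```python
import threading

view_count = 2046                 # value before P and J visit the page
guard = threading.Semaphore(1)    # one thread at a time may update the count


def register_view():
    global view_count
    guard.acquire()               # "wait": sleep if another update is running
    try:
        current = view_count      # read ...
        view_count = current + 1  # ... then write, without interleaving
    finally:
        guard.release()           # "post": let the next thread proceed


visitors = [threading.Thread(target=register_view, name=name)
            for name in ("P", "J")]
for visitor in visitors:
    visitor.start()
for visitor in visitors:
    visitor.join()

print(view_count)
```

Whichever of the two threads arrives first, the second can only read the count after the first has written it, so both increments are preserved.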
semaphore-pseudocode

int sem_wait(sem_t *s)
{
    // decrement the value of semaphore s by one
    // wait if value of semaphore s is negative
}

int sem_post(sem_t *s)
{
    // increment the value of semaphore s by one
    // if there are one or more threads waiting, wake one
}

- A textbook semaphore description in pseudo-code  ( Arpaci-Dusseau, 2018)
Such a piece of code is interesting for its use of multifaceted engagement with metaphors, its existence between abstract and concrete and its involvement with functional reliability.
Many introductory texts, from Dijkstra's original paper to the Wikipedia article on semaphores, rely on analogies in the problem domains to describe the problems implied by a concurrent use of shared resources. Dijkstra refers to problems such as the sleeping barber or the banker's algorithm as use-cases that are unrealistic but mentally graspable (Dijkstra, 1965), while the Wikipedia entry refers to the dining philosophers problem (Wikipedia, 2023). In this case, we see a macro-level aesthetic device in the form of storytelling, which introduces the reader to the origins and implications of concurrent use in computing systems.
Then comes the use of the term semaphore itself. In its mechanical form, a semaphore is a device which signals information to a running train, such as whether the tracks ahead are blocked or clear, whether the train should stop or proceed at reduced speed, or whether it should exercise caution. There are multiple properties from this source domain that are applied to the target domain of programming. First, there is the property of a continuously running train, whose alternate state is that of waiting, before starting again, and whose state change is dependent on the state change of the semaphore. In the programming context, a thread also assumes a linear continuous execution, and can under certain circumstances be put to sleep, or woken up, by the process. As we get further away from the concept, and closer to the implementation, the aesthetics of the programming expression switches domains, and becomes more fine-grained.
Ultimately, the micro-specific details of the implementation stop making use of metaphors at all and, in doing so, rely on a different kind of representation. The two operations, denoted above as wait and post, are actually left to implementation details. In Dijkstra's original paper, such operations are denoted P and V, and it is still a matter of debate what those letters stand for, due to the author's use of his native Dutch language (Wikipedia, 2023). Since these appear as arbitrary marks, we argue that their aesthetic properties in communicating the abstract data type of the semaphore change; they take on the appearance of a single letter: one which is very concrete to the machine, and very abstract to the human.
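The two behaviours that the textbook listing leaves as comments can be sketched in Python. This is a minimal sketch under stated assumptions, and neither Dijkstra's nor the textbook's implementation: the class name CountingSemaphore, the worker function and the choice of threading.Condition as the underlying primitive are ours, and the aliases P and V only mirror the single-letter names of the original paper.

```python
import threading


class CountingSemaphore:
    """Illustrative semaphore; the class name and internals are assumptions."""

    def __init__(self, value=1):
        self._value = value        # a negative value counts sleeping threads
        self._wakeups = 0          # wake-up tokens handed out by post()
        self._cond = threading.Condition()

    def wait(self):
        with self._cond:
            self._value -= 1                 # decrement the value by one
            if self._value < 0:              # negative: the caller must sleep
                while self._wakeups == 0:
                    self._cond.wait()
                self._wakeups -= 1           # consume one wake-up token

    def post(self):
        with self._cond:
            self._value += 1                 # increment the value by one
            if self._value <= 0:             # at least one thread is sleeping
                self._wakeups += 1
                self._cond.notify()          # wake one of them

    # Single-letter aliases, as in the 1965 notation.
    P, V = wait, post


semaphore = CountingSemaphore(0)
events = []


def worker():
    semaphore.P()                  # blocks until the main thread calls V
    events.append("woken")


thread = threading.Thread(target=worker)
thread.start()
semaphore.V()
thread.join()
print(events)
```

Read side by side, semaphore.wait() and semaphore.P() run exactly the same instructions; only the former gives a human reader a foothold in the train metaphor.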

Openness and transparency

This next example, taken from Adafruit's pi_video_looper  , exhibits interesting features in terms of openness and transparency, hinting at the reader's implied ability to write.
The program text, published by the Adafruit company, is written in Python and is the source code for a video application running on the Raspberry Pi hardware platform. Both the company and the hardware platform it manufactures are strongly rooted in the ethos of open-source, meaning that it is not just meant to be used, but also to be modified by its users. In this context, the pi_video_looper  project is made available on the GitHub platform, which facilitates re-use by other users.
The particular section of the program text, whose presentation enables understanding for further modifying, concerns two similar functions, _load_player() and _load_file_reader(), in the video_looper.py file, reproduced in the pi-video-looper listing. These two functions are member methods of the VideoLooper class and return the specific video playback processes (such as VLC media player or OMXplayer) and file reading drivers (such as a USB drive or a network filesystem).
pi-video-looper

def _load_player(self):
    """Load the configured video player and return an instance of it."""
    module = self._config.get('video_looper', 'video_player')
    return importlib.import_module('.' + module, 'Adafruit_Video_Looper').create_player(self._config, screen=self._screen, bgimage=self._bgimage)

def _load_file_reader(self):
    """Load the configured file reader and return an instance of it."""
    module = self._config.get('video_looper', 'file_reader')
    return importlib.import_module('.' + module, 'Adafruit_Video_Looper').create_file_reader(self._config, self._screen)

- Abstracting hardware specific resources via configuration options in an open-source project. Both of these rely on getting a variable from the configuration file, before loading the file whose name corresponds to the value of that variable. This architectural choice enables broad generalization via a simple loading mechanism.  ( Dicola, 2015)
To do so, these two methods operate similarly. Based on the current configuration of the running software, they load actual files through Python's importlib module, and call an expected method to return an instantiated object. Since this engages directly with modules in the form of files, rather than through pre-registered abstractions, it gives the end user a glimpse into the workings of the source code as latent scripts of plain text, rather than interpreted code. This might be considered confusing, since these are also the only two occurrences of such a technique in a file that is 500 lines long.
And yet, this architectural choice enables the reader to grasp a couple of fundamental concepts never made explicit otherwise. First, the use of the _ character prefix for both methods signals, by Python convention, that these are private methods, and are therefore meant to be used directly only in the current class, and not in other, invisible places in the rest of the program text. Second, in a program text whose intention is to be user-friendly, given the number of comments and the culture of the organization from which it stems, an explicit unveiling of dynamic module loading signifies the potential for other modules to be loaded, without having to modify the loading function itself. This expresses the feeling of habitability discussed in Compression and habitability in functional structures, in that readers are invited, in turn, to write into the text and make it their own; here, to use a different video playback or file reading system.
Particularly, the source code is written in such a way that there are hints at the existing parts of the computational environment (the configuration file, the method to be called on the module). This presents a structure in which writers can insert themselves without modifying anything that was not meant to be modified. With three lines of code, each of these methods presents an elegant interface between the problem domain (e.g. the media player) and the hardware domain (e.g. omx vs. hello_video). By revealing this loading of files, the program text never states how to add to the source, but rather shows that adding a new playback engine is as simple as writing the playback engine in a new file with at least one specific method as an entrypoint (e.g. create_player), and changing the configuration file value to the new filename, without having to touch these functions themselves. Furthermore, by acting as this textual location through which multiple computational processes interact, this is an example of a beacon, lighting the way along a non-linear reading process by establishing signposts from which to proceed.
This abandoning of abstraction at a certain level, in order to reveal what should be revealed to a reader-as-potential-writer, builds on a community ethos of hacking, where the machine's workings are laid bare in order to support unexpected changes by unknown individuals. This textual hint at both multiple realities (i.e. how the playback is actually done, inside the VideoLooper  abstraction) and particular possibilities (i.e. using, or changing it), creates a particularly welcoming space for newcomers.

Description and execution

To complement our examples of scientist and software developer code, we now look at how source code can evoke a certain sense of the aesthetic by accentuating, rather than reducing, the semantic gap between human and machine.
The poem presented in the self_inspect listing, written in Ruby by Macario Ortega in 2011 and titled self_inspect.rb, opens up this additional perspective on the relationship between aesthetics and expressivity in source code. Immediately, the layout of the poem is reminiscent both of obfuscated works and of free-verse poetry, such as E.E. Cummings' and Stéphane Mallarmé's works. This particular layout highlights the ultimately arbitrary nature of whitespace use in source code formatting: self_inspect.rb breaks away from the implicit rhythm embraced in Black Perl, and links to the topics of the poem (introspection and unheimlichkeit) by abandoning what are, ultimately, social conventions, and reorganizing the layout to emphasize both keyword and topic, exemplified in the end keyword, pushed away to the end of its line.
self_inspect

class Proc
                            def in_discomfort?; :me; end
                                                                end
you_are = you = 
   
   ->(you) do
       self.inspect until true
         until nil
               break you
                                                  end
           puts you.in_discomfort?
             you_are[you]
                                                end

you[
          you_are
]

- A code poem written in Ruby, exhibiting complex interactions between human reference, machine reference, language idioms, source code description and runtime execution.  ( Ortega, 2011)
From a computer perspective, the program reopens the class Proc, a generic and essential construct in Ruby which also underlies lambda expressions, and adds to it a single member method in_discomfort? returning the symbol :me. The core of the program then takes place in the declaration of the two variables you_are and you, both assigned the same lambda expression. This lambda contains four statements. The first, self.inspect, is never executed, since its until true condition is already satisfied; the second, break you, sits in an until nil loop and exits that loop on its first pass, so that neither has a lasting effect. The third prints the result of calling in_discomfort?, and the fourth recursively calls the lambda expression stored in you_are with the argument you. Finally, the whole execution of the program is due to the last call to you with the argument you_are, the symmetric opposite of the last statement of the lambda expression. Functionally, this program text is then a series of recursive function calls.
When read aloud, the poem includes a first mention of the self, before reiterating mentions of you, and inviting tones of uncertainty, through the mention of inspection and discomfort. It thus evokes intimacy, individuality, feelings of absolute, by referring to terms such as true, end or nil (meaning nothing), and short, imperative orders such as do or break. As such, the poem, as read and pronounced by a human, evokes feelings of identity and introspection, felt as negative forces.
The poem also presents features which operate on another level, halfway between the surface and deep structures of the program text. First, the writer makes expressive use of the syntax of Ruby by involving data types. While Black Perl remained evasive about the computer semantics of the variables, such semantics take here an integral part. Two data types, the lambda expression and the symbol, are used not exclusively as syntactical necessities (since they don't immediately fulfill any essential purpose), but rather as (human) semantic ones. The use of :me on line 2 is the only occurrence of the first-person pronoun, standing out in a poem littered with references to you. Symbols, unlike variable names, stand for variable or method names. While you refers to a (hypothetically-)defined value, a symbol refers to a variable name, a variable name which is here undefined, and would default to a literal me. Such a reference to a first-person pronoun implies at the same time its perpetual elusiveness. It is here expressed through this specific syntactic use of this particular data type, while the second-person is referred to through regular variable names, possibly closer to an actual definition. It is a subtlety which does not have an immediate equivalent in natural language, and by relying on the concept of reference, hints at an essential différance between you and me.
Reinforcing this theme of the elusiveness of the self, the author maca plays with the ambiguity of the value and type of you and you_are. The square brackets in you_are[you] and you[you_are] read as array indexing, and arrays, basic data structures consisting of sequential values, suggest the multiplicity of the self, adding another dimension to the theme of elusiveness. Since both names hold a lambda expression, however, Ruby treats the brackets as a call: what looks like selecting an element is in fact an invocation. The discomfort of the poem's voice comes, finally, from this lack of clear definition of who you is. Using you_are between brackets, as one would an index, subverts the role suggested by the declarative syntax of you are. The index, here, doesn't define anything, and yet always refers to something, because of the assignment of its value to what the lambda expression -> returns. This further complicates the poem's attempt at defining the self, calling the reverse expression you_are[you]. While such an expression might have clear, even simple, semantics when read out loud from a natural language perspective, knowledge of the programming language reveals that such a way to assign value contributes significantly to the poem's expressive abilities.
A final feature exhibited by the poem is the execution of the procedure. When running the code, the result is an endless output of print statements of "me", since Ruby prints a symbol as its literal name, as seen in the self_inspect_output listing.
self_inspect_output

...
me
me
me
me
me
me
me
me
me
me
me
me
me
me
me
me
Traceback (most recent call last):                 
        11913: from poem.rb:16:in `<main>'
        11912: from poem.rb:13:in `block in <main>'
        11911: from poem.rb:13:in `block in <main>'
        11910: from poem.rb:13:in `block in <main>'
        11909: from poem.rb:13:in `block in <main>'
        11908: from poem.rb:13:in `block in <main>'          
        11907: from poem.rb:13:in `block in <main>'
        11906: from poem.rb:13:in `block in <main>'                             
         ... 11901 levels...                                                    
            4: from poem.rb:13:in `block in <main>'                             
            3: from poem.rb:13:in `block in <main>'                             
            2: from poem.rb:12:in `block in <main>'                             
            1: from poem.rb:12:in `puts'                                        
self_inspect.rb:12:in `puts': stack level too deep (SystemStackError)  

- The executed output from the self_inspect listing
The computer execution of the poem provides an additional layer of meaning to our human interpretation. Through the recursive call of you_are, the result is an endless succession of the literal interpretation of the symbol :me, the actual result of being in discomfort. While we have seen that a symbol only refers to something else, the concrete output of the poem evokes an insistence of the literal self, exhibiting a different tone than a source in which the presence of the pronoun you is clearly dominant. Such a duality of concepts is thus represented in the duality of a concise source and of an extensive output, and is punctuated by the ultimate impossibility of the machine to process the accumulation of these intertwined references to me and you, resulting in a stack overflow error.
We now understand that the undefined symbol me is to be taken literally, while the output of the program is the result from the mutual recursive calls of you[you_are]  and you_are[you]  , creating an infinite mantra (you are you are you are you are you are you, etc.) which is heard first by the computer, and only viscerally understood through the execution of the program. Ultimately, an additional theme of the poem can be deciphered: through recursion, the entanglement of individuals depending on each other leads to a semantic and computational overload.
The added depth of meaning from this code poem goes beyond the syntactic and semantic interplay immediately visible when reading the source, as the execution provides a result whose meaning depends on the co-existence of both source and output. Beyond keywords, variable names and data structures, it is also the procedure itself which gains expressive power: a poem initially about you results in a humanly infinite, but hardware-bounded, series of me.
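The mechanism behind this bounded infinity can be approximated in Python. This is a loose analogue and not a translation of the poem: Python has no symbols, the list printed stands in for Ruby's output, and the number of lines produced depends on the interpreter's recursion limit rather than on the Ruby stack.

```python
printed = []


def you_are(you):
    printed.append("me")   # stands in for `puts you.in_discomfort?`
    return you_are(you)    # stands in for `you_are[you]`


you = you_are

try:
    you(you_are)           # stands in for the final `you[you_are]`
except RecursionError as error:
    print(f"{len(printed)} lines of 'me', then: {error}")
```

As in the Ruby original, nothing in the source announces where the series ends; the limit is supplied by the machine at execution time.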
A final idea here is that the writing of code is an artistic enterprise, both in its traditional understanding of craft, and its contemporary understanding of art. The emphasis on executable code reveals aesthetic possibilities of source code as a medium, in which form, content and function are closely aligned. Poems such as self_inspect.rb are fascinating because they are variably accessible and inaccessible to readers, a function of their readers' knowledge of programming languages and facility with poetry. They also provide means of expression in multiple ways: the visual impression of the code on the page, an aural dimension if read aloud, and the output rendered by the code when compiled. Their possibilities for interpretation, then, are fragmentary, requiring negotiation on these many fronts to appreciate and understand (Risam, 2015).
If code poems are not immediately functional in the industrial sense of the term, they are nonetheless dependent on the functioning of the program that they describe for a part of their expressive power. This computational function is therefore always a part of the meaning of a program text.
-
We've seen through this section that the expressivity of program texts relies on several aesthetic mechanisms, connected in a spatial way between a metaphorical understanding of humans and a functional understanding of machines. From layout to double-meaning through variables and procedure names, double-coding and the integration of data types and functional code into a program text and a rhetoric of procedures in their written form, all of these activate the connection between programming concepts and human concepts to bring the unthinkable within the reach of the thinkable. While these techniques are deployed differently according to the socio-technical environment in which the program text is being written and read, they nonetheless all contribute to facilitating the navigation of the program text, be it at the same level of abstraction across parts of the text (such as in the pi-video-looper listing, where the patterns of writerly text exist across the codebase), or at different levels of abstraction in the same locations (such as in the semaphore-pseudocode listing, where the syntax abstracts away the unnecessary signifiers of parallel computing).
Ultimately, these aesthetic manifestations of source code in a program text are all tightly coupled to the execution of that program text. The next section concludes this research by assessing the relation between such function(s) of a program text and its aesthetic manifestations.