
Understanding computation

Software, computation and source code are all related components; respectively object, theory and medium. The ability to dematerialize software (from firmware, to packaged CDs, to cloud services) and the status of source code as intellectual property point to an ambiguous nature: it is both there and not there, idea and matter. This section makes explicit some of the affordances of software which make it a challenging object to grasp, in order to lay out what programmers are dealing with when they read and write source code.
In order to reconcile the different tensions highlighted in the various kinds of complexity that software exhibits, we first turn to an ontological stance. In particular, we will build on Nurbay Irmak's proposal that software exists as an abstract artifact, simultaneously on the ideal, practical and physical planes, and see how Simondon's technical and aesthetic modes of existence can reconcile fragmented practice with unified totality.
We then shift to the practical specificities of software, particularly in terms of levels and types of complexity. This will highlight some of the properties that make it hard to understand, such as its relation to hardware, its relation to a specification, and its existence in time and space.
With this in mind, we will conclude this section by looking specifically at the source code representation of software, and at how programmers deploy strategies to understand it. Approaching it from a cognitive and psychological perspective, we will see how understanding software involves the construction of programming plans and mental models; the tools and aids used to construct them will be made explicit in the next section.

Software ontology

Before we clarify what software complexity consists of, we first frame these difficulties in a philosophical context, more specifically the philosophy of technology. We will investigate how these complexities can be seen as stemming from the nature of technology itself, and how this connects to an aesthetic stance. Before moving back to practical inquiries into how specific individuals engage with this nature, this section will help provide a theoretical background, framing technology as a relational practice, complementing other modes of making sense of and taking action on the world. This conceptual framework will start with an investigation into the denomination of software as an abstract artifact , followed by an analysis of technology as a specific mode of being, and concluding on how it is related to an aesthetic mode of being.

Software as abstract artifact

When he coins the phrase abstract artifact , Nurbay Irmak addresses software partly as an abstract object, similar in his sense to Platonic entities, and partly as a concrete object which holds spatio-temporal properties ( Irmak, 2012) . This is based on the fact that software requires an existence as a textual implementation, in the form of source code ( Suber, 1988) ; it is composed of files, has a beginning (start) and an end (exit); but software also represents ideas of structure and procedure which go beyond these limitations of being written to a disk, having a compilation target or an execution time. Typically, the physical aspects of software (its manifestation as source code) can be changed without changing any of the ideas expressed by the software73 .
Irmak complements Colburn's consideration of software as a concrete abstraction , an oxymoron which echoes the tensions denoted by the concept of the abstract artifact. Colburn grounds these tensions in the distinction between a medium of execution (a potentially virtual machine) and a medium of description (source code). He considers that, while any high-level programming language is already the result of layers of abstraction, such language gets reduced to the zeroes and ones input to the central processing unit ( Colburn, 2000) . Here, he sees the abstraction provided by languages ultimately bound to the concrete state of being of hardware and binary. And yet, if we follow along his reasoning, these representations of voltage changes into zeroes and ones are themselves abstractions over yet another concrete, physical event. Concrete and abstract are recursively tangled properties of software.
Writing on computational artefacts, of which software is a subset, Raymond Turner formalizes this specificity as a three-way relationship. Namely, an abstract artefact A is an implementation in medium M of the definition F. For instance, concerning the medium:
Instead of properties such as made from carbon fiber , we have properties such as constructed from arrays in the Pascal programming language, implemented in Java .  ( Turner, 2018)
This metaphor provides an accurate but limited account of the place of source code within the definition of software: the Java implementation is itself a definition implemented in a specific bytecode, while arrays in Pascal are different abstractions than arrays implemented in C, etc. Nonetheless, source code is that which gives shape to the ideas immanent in software—through a process of concretization—and which hides away the details of the hardware—through abstraction. This metaphor of abstract artifact thus helps to clarify the tensions within software, and to locate the specific role of source code within the different moving parts of definition, medium and model.
Software, like other artefacts, has a relation between its functional properties (i.e. the purposes that are intended to be achieved through its use) and its structural ones (the conceptual and physical configurations involved in the fulfilment of the functional purpose) ( Turner, 2018) . As such, it also belongs to the broader class of technology, and thus holds some of the specificities of this lineage, into which we extend our inquiry.

Software as a relational object

The technological object underwent a first qualitative shift during the European Industrial Revolution, and a second one with the advent of computing technologies. The status of its exact nature is therefore a somewhat recent object of inquiry. Here, we will start from Gilbert Simondon's understanding of technology as a mode , in order to ultimately contrast it with the aesthetic mode .
According to Simondon, the technical object is a relation between multiple structures and the result of a complex operation of various knowledges ( Simondon, 1958) , some scientific, some practical, some social, some material. The technical object is indeed a scientific object, but also a social object and an artistic object at the same time. Differentiated in its various stages (object, individual, system), it is therefore considered as relational, insofar as its nature changes through its dependence, and its influence, on its environment.
Technology is a dynamic of organized, but inorganic matter ( Stiegler, 1998) . Following Latour, we also extend the conception of organized inorganic matter to include social influences, personal practices, and forms of tacit and explicit knowledges ( Latour, 2007) . That is, the ambiguity of the technical object is that it extends beyond itself as an object, entering into a relation with its surrounding environment, including the human individual(s) who shape and make use of it.
Technology is generally bound to practical matter, even though such matter could, under certain circumstances, take on a symbolic role of manifesting the abstract. This is the case of the compass, the printing press, or the clock. The clock is a technology which produces seconds: its action reached into another domain, that of mechanical operation on abstract ideas ( Mumford, 1934) . The domain of abstract ideas was hitherto reserved to different modes than technology, those of religion and philosophy, and technology holds a particularly interesting relation with these two. According to Simondon, philosophy followed religion as a means of relating to, and making sense of, the abstract, such as the divine and the ethical. Tracing back the genesis of the technological object, he writes that the technical mode of existence is therefore just another mode through which the human can relate to the world, similar to the religious, the philosophical, and the aesthetic mode ( Simondon, 1958) .
Technical objects imply another mode of being, consequential to the recognition of the limitations of magic, humanity's primary mode of being. Technicity, according to Simondon, focuses on the particular, on the elements, a contrario to the religious mode of being, which finds more stability in a perspective of totality, rather than a focus on individuals74 .
This technical mode of existence, based on particulars, can nonetheless circle back to a certain totality through the means of induction; that is, deriving generals from the observed particulars. As such, technical thinking, as inverted religious thinking, stems from practice, but also provides a theory. Technology, religion and philosophy are all, according to Simondon, combinations of a theory of knowledge and a theory of action, compensating for the loss of magic's totalizing virtues. While the religious, followed by the philosophical, approach from theory to deduce a practice, and thus lack grounding, technology reverses the process and induces theory from operations on individual elements.
Simondon complements the technical with the aesthetic mode, and as such counter-balances the apparent split between technics and religion by striving for unity and totality, for the balance between the objective and the subjective. Yet, rather than being a monadic unity of a single principle, Simondon considers the aesthetic mode as a unifying network of relationships75 . He further argues that the aesthetic mode goes beyond taste and subjective preference, into a fundamental aspect of the way in which human beings relate to the world around them. An aesthetic object therefore acquires the property of being beautiful by virtue of its relationships, of its connections between the subject and the objective, between one's history and one's perceptions, and the various elements of the world, and the actions of the individual. Finally, the aesthetic thought when related to the technical object consists in preparing the communication between different communities of users, between different perspectives on the world, and different modes of action upon this world. Ultimately, the aesthetic mode can therefore be seen as the revealing of a nexus of relationships found in its environment, highlighting the key-points in the structure of the object76 . How aesthetics enables a holistic thought through the use of sensual markers will be the subject of  Beauty and understanding   .
Computation, as a particular kind of technology, is thus both a theory and a practice, and can also be subject to an aesthetic impression. Particularly, one can think of computers as a form of technology through which meaning is mechanically realized 77 .
Software is a manifestation of technology as both knowledge and action. Furthermore, it also enables ways to act mechanically on knowledge and ideas, an affordance named epistemic action by David Kirsh and Paul Maglio ( Kirsh, 1994) . They define epistemic actions as actions which facilitate thinking through a particular situation or environment, rather than having an immediate functional effect on the state of the world. As technology changes the individual's relationship to the world, software does so by being the dynamic, manipulable notion of a state of a process, ever evolving around a fixed structure, and by changing the conceptual understanding of said world ( Rapaport, 2005) . Such examples of world relate to the environment in which software exists, e.g. the social environment, or hardware environment, or the environment which has been recreated within software. David M. Berry investigates this encapsulation of world in his Philosophy of Software :
The computational device is, in some senses, a container of a universe (as a digital space) which is itself a container for the basic primordial structures which allow further complexification and abstraction towards a notion of world presented to the user. ( Berry, 2011)
Software-as-world is the material implementation of a proposed model, itself derived from a theory. It therefore primarily acts at the level of episteme , sometimes even limiting itself to it78 . Paradoxically, it is only through peripherals that software can act as a mechanical technology in the industrial sense of the word.
Along with software's material and theoretical natures (i.e. in contemporary digital computers, it consists of electrons, copper and silicon and of logical notations), another environment remains: that of the intent of the humans programming such software. Indeed, thinking through the function of computational artefacts, Turner states that it is agency which determines what the function is. He defines agency as the resolution of the difference between the specification (intent-free, external to the program) and semantic interpretation (intent-rich, internal to the programmer) ( Turner, 2018) . In order to understand a computer program, to understand how it exists in multiple worlds, and how it represents the world, we need to give it meaning. To make sense of it, a certain amount of interpretation is required in relation to that of the computer's, such that the question "what does a Turing machine do?" has n+1 answers: 1 syntactic, and n semantic (i.e. as many interpretations as there can be human interpreters) ( Rapaport, 2005) . In his investigation into what software is, Suber corroborates:
This suggests that, to understand software, we must understand intentions, purposes, goals, or will, which enlarges the problem far more than we originally anticipated. [...] We should not be surprised if human compositions that are meant to make machines do useful work should require us to posit and understand human purposiveness. After all, to distinguish literature from noise requires a similar undertaking. ( Suber, 1988)
-
In conclusion, we have seen that while software can be given the particular status of an abstract artifact , these tensions are shared across technological objects, as they connect theory and practice. Technology, as a combination of a theory of knowledge and a theory of action, as an interface to the world and a recreation of the world, is furthermore related to other modes of existence, and in particular the aesthetic mode. We have seen how Simondon suggests that the aesthetic mode has totalizing properties: through the sensual perception of perfected execution, it compensates for technology's fragmented mode of existence.
What do these tensions and paradoxes look like in practice? In the next section, we examine more carefully the specific properties of software, and the complexities that this specific object entails. Specifically, we will see how software's various levels of existence, types of complexities, and kinds of actions and interpretations that it allows, all contribute to the cognitive hurdles encountered when attempting to understand software.

Software complexity

What is there to know about software? Looking at the skills that novice programmers have to develop as they learn their trade, one can include problem solving, domain modelling, knowledge representation, efficiency in problem solving, abstraction, modularity, novelty or creativity ( Fuller, 2007) . The variety of these skills and their connection to intellectual work (for instance, there is no requirement for manual dexterity or emotional intelligence) suggests that making and reading software is a complex endeavor.
Indeed, software exhibits several particularities, as it possesses several independent components which interact with each other in non-trivial, and non-obvious ways. In order to clarify those interactions, we start by looking at the different levels at which software exists, before turning to the different kinds of complexity which make software hard to grasp, concluding on its particular existence in time and space.
Along with different levels of existence needed to be taken into account by the programmer, software also exhibits specific kinds of complexity. Our definition of complexity will be the one proposed by Warren Weaver. He defines problems of (organized) complexity as those which involve dealing simultaneously with a sizable number of factors which are interrelated into an organic whole ( Weaver, 1948) 79 . Specifically, there are three different types of software complexity that we look at: technical complexity, spatio-temporal complexity and modelling complexity.

Levels of software

Software covers a continuum from an idea to a bundled series of distinct binary marks. One of the essential steps in this continuum is that of implementation . Implementation is the realization of a plan, the concrete manifestation of an idea, and therefore hints at a first tension in software's multiple facets. It can happen through individuation, instantiation, exemplification and reduction ( Rapaport, 2005) . On the one side, there is what we will call here ideal software, often existing only as a shared mental representation by humans (not limited to programmers), or as printed documentation, as a series of specifications, etc. On the other side, we have actual software, which is manifested into lines of code, written in one or more particular languages, and running with more or fewer bugs.
The relationship between the ideal and the actual versions of the same software is not straightforward. Ideal software only provides an intent, a guidance towards a goal, assuming, but not guaranteeing, that this goal will be reached80 .
Actual software, as most programmers know, differs greatly from its ideal version, largely due to the process of implementation, translating the purpose of the software from natural and diagrammatic languages, into programming languages, from what it should do, into what it actually does.
Writing on the myths of computer science, James Moor ( Moor, 1978) allows us to think through this distinction between ideal and practical along the lines of the separation between a theory and a model. The difference between a model and a theory is that both can exist independently of one another—one can have a theory for a system without being able to model it, while one can also model a system using ad hoc programming techniques, instead of a coherent general theory.
Most of the practice of programmers (writing and reading code for the purposes of creating, maintaining and learning software) depends on closing this gap between the ideal and the practical existences of software.
The third level at which software exists is that of hardware. While the ideal version of software is presented in natural language, diagrams or pseudo-code, and while the practical version of software exists as executable source code, software also exists at a very physical level, that of transistors and integrated circuits. To illustrate the chain of material levels at which software exists, the series of listings level_text, level_c, level_asm and level_byte express the same function, getting the difference in character length between two words, respectively in pseudo code, in C, in Assembly and in bytecode.
level_text

how to get the difference in character length between two words

store the first word in a variable
store the second word in a variable

store the difference between the number of characters in the first word
and the number of characters in the second word

print the difference to the console

- Example of a program text represented in pseudo code. See level_c, level_asm and level_byte for lower level representations.
level_c

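#include <string.h>
#include <stdio.h>

int main(void){
    /* the two words to compare */
    const char* a_word = "Gerechtigkeit";
    const char* an_unword = "Menschenmaterial";

    /* strlen returns an unsigned size_t: cast before subtracting,
       so that a negative difference does not wrap around */
    int difference = (int)strlen(a_word) - (int)strlen(an_unword);

    printf("%d\n", difference);

    return 0;
}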

- Example of a program text represented in a high level language. See level_text for a higher level representation and level_asm and level_byte for lower level representations.
level_asm

push   %rbp
mov    %rsp,%rbp
movl   $0xa,-0xc(%rbp)
movl   $0x2,-0x8(%rbp)
mov    -0xc(%rbp),%eax
sub    -0x8(%rbp),%eax
mov    %eax,-0x4(%rbp)
mov    $0x0,%eax
pop    %rbp
ret

- Example of a program text represented in an Assembly language. See level_text and level_c for higher level representations and level_byte for a lower level representation.
level_byte

1119:       55
111a:       48 89 e5
111d:       c7 45 f4 0a 00 00 00
1124:       c7 45 f8 02 00 00 00
112b:       8b 45 f4
112e:       2b 45 f8
1131:       89 45 fc
1134:       b8 00 00 00 00
1139:       5d
113a:       c3

- Example of a program text represented in bytecode. See level_text, level_c and level_asm for higher level representations.
The gradient across software and hardware has been examined thoroughly ( Kittler, 1997; Chun, 2008; Rapaport, 2005) , but never strictly defined. Rather, the distinction between what is hardware and what is software is relative to where one draws the line: to a front-end web developer writing JavaScript, the browser, operating system and motherboard might all be considered hardware. For a RISC-V assembly programmer, only the specific CPU chip might be considered hardware, while the operating system being implemented in C, itself compiled through Assembly, would be considered software. A common definition of hardware, as the physical elements making up the computer system, overlooks the fact that software itself is, ultimately, physical changes in the electrical charge of the components of the computer.
Software can be characterized as the dynamic evolution of logical processes, described as an ideal specification in natural languages, as a practical realization in programming languages, and in specific states of hardware components. Furthermore, the relations between these levels are not straightforward: the ideal and the practical can exist independently of each other, while the practical cannot exist independently of a machine. For instance, the machine on which a given program text is executed can be a virtual machine or, conversely, a real machine managing virtual memory.
In any case, these are only the technical components underpinning software, its specifications and formalizations. Another dimension of complexity is introduced by the fact that software is supposed to interact with entities that are not already formalized nor quantized, such as physical reality and its actors.

Spatio-temporal complexity

A rough way of describing computers is that they are extremely stupid, but extremely fast ( Muon Ray, 1985) . The use of a programming language is therefore a semantic translation device between a natural problem, the formalization of the problem in such a language, and the binary expression of the program which can be executed by the CPU at very high speeds.
This very high speed of linear execution involves another dimension to be taken into account by programmers. For instance, the distinction between endurants and perdurants by Lando et al. focuses on the temporal dimension of software components (i.e. a data structure declaration has a different temporal property than a function call) ( Lando, 2007) . Whether something changes over time, and when such a thing changes, becomes an additional cognitive load for the programmer reading and writing source code, a load which can be alleviated by type qualifiers (such as the const  keyword, marking a variable as unchangeable), or by aesthetic marks (such as declaring a variable in all capital letters to indicate that it should not change).
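The contrast between these two ways of marking immutability can be sketched in C. The names below (max_words, WORD_LIMIT and the two helper functions) are hypothetical and chosen only for illustration; the point is that the first mark is checked by the compiler, while the second is a visual convention addressed to the human reader.

```c
#include <stddef.h>

/* Enforced by the compiler: assigning to max_words is a compile error. */
static const size_t max_words = 2;

/* Signalled by convention only: the capitals say "do not change me",
   but nothing prevents a later assignment. */
static size_t WORD_LIMIT = 2;

/* Returns 1 while both marks still hold the same value, 0 otherwise. */
int limits_agree(void) {
    return max_words == WORD_LIMIT;
}

/* The convention can be broken silently; the qualifier cannot. */
void break_convention(void) {
    WORD_LIMIT = 3;      /* compiles without complaint */
    /* max_words = 3; */ /* would be rejected by the compiler */
}
```

Before break_convention is called, limits_agree returns 1; afterwards it returns 0, without the capitalized name having offered any resistance.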
Temporal complexity relates to the discrepancy between the way the computer was first thought of (i.e. as a Turing machine which operates linearly, on a one-dimensional tape) and further technological developments. The hardware architecture of a computer, and its specification as a Turing machine, involve the ability for the head of the machine to jump to different locations. This means that the execution and reading of a program would be non-linear, jumping from one routine to another across the source code. Such an entanglement is particularly obvious in Ben Fry's Distellamap series of visualizations of source code (pacman-visualization represents the execution of the source code for the arcade game Pac-Man)81 .
pacman-visualization
Visualization of the execution of Pac-Man's source code
Furthermore, the machine concept of time is different from the human concept, and different machines implement different concepts. For instance, operations can be synchronous or asynchronous, thus positing opposite frames of reference, since the only temporal reference is the machine itself82 . While humans have somewhat intuitive conceptions of time as a linearly increasing dimension, computer hardware actually includes multiple clocks, used for various track-keeping purposes and structuring various degrees of temporality ( Mélès, 2017) .
Later on, the introduction of multi-core architecture for central processing units in the late 2000s enabled the broad adoption of multithreading and threaded programming. As a result, source code has transformed from a single non-linear execution to a multiple non-linear process, in which several of these non-linear executions happen in parallel. Keeping track of what is executing when on which resource is involved in problems such as race conditions , where understanding the scheduling of events (on a 3.0 GHz CPU, a single clock cycle lasts a third of a nanosecond) becomes crucial to ensuring the correct behaviour of the software.
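What a race condition does to a program can be sketched without actual threads. The following C fragment is a deliberately single-threaded simulation, under the assumption that incrementing a shared counter takes two separate steps (a read, then a write); the names Counter, read_step, write_step, run_sequential and run_interleaved are hypothetical. It only replays, by hand, two possible orderings that a scheduler could produce.

```c
/* A shared counter, incremented in two separate steps. */
typedef struct { int shared; } Counter;

/* Step 1: a thread reads the current value. */
static int read_step(const Counter *c) { return c->shared; }

/* Step 2: the same thread writes back the value it saw, plus one. */
static void write_step(Counter *c, int seen) { c->shared = seen + 1; }

/* Threads A and B increment one after the other: both updates count. */
int run_sequential(void) {
    Counter c = {0};
    int a = read_step(&c);
    write_step(&c, a);
    int b = read_step(&c);
    write_step(&c, b);
    return c.shared;
}

/* A and B both read before either writes: one update is lost. */
int run_interleaved(void) {
    Counter c = {0};
    int a = read_step(&c);
    int b = read_step(&c);
    write_step(&c, a);
    write_step(&c, b);
    return c.shared;
}
```

The two functions contain the same four operations; only their order differs, and with it the result (2 in the first case, 1 in the second). On a real multi-core machine this order is not written anywhere in the source code, which is what makes such behaviour hard to read.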
Conversely, the loci of the execution of software contribute to those issues. Even at its simplest, a program text does not necessarily exist as a single file, and is never read linearly. Different parts can be re-edited and re-arranged to facilitate the understanding of readers83 . Modern programming languages also have the feature of including other files, not directly visible to the user. The existence of those files has a textual manifestation, such as the #include  line in C or import  in Python, but the contents of the file can remain elusive.
Where exactly these files exist is not always immediately clear, as their reference by name or by Uniform Resource Locator (URL) can obfuscate whether or not a file exists on the current machine. As such, software can be (dis-)located across multiple files on a single machine, on multiple processes on a single machine, or on multiple processes on multiple machines (on a local-area or wide-area network) ( Berry, 2011) . Facilitating navigation between files through the references that files hold to one another is one way that the tools of the programmers alleviate cognitive burden, as we will see in  Tools as a cognitive extension   .
Additionally, time and space in computation can interact in unexpected ways, and fragment the interface to the object of understanding. For instance, the asynchronicity of requesting and processing information from distinct processes is a spatial separation of code which has temporal implications (e.g. due to network latency). When and where a certain action takes place becomes particularly hard to follow.

Modelling complexity

Modelling complexity addresses the hurdles in translating a non-discrete, non-logical object, event, or action, into a discrete, logical software description through source code. Indeed, the history of software development is also the history of the extension of the application of software, and the hurdles to be overcome in the process. From translation of natural languages ( Poibeau, 2017) , to education ( Watters, 2021) or psychological treatment ( Weizenbaum, 1976) , it seems that problems that appear somewhat straightforward from a human perspective become more intricate once the time for implementation has come.
This translation process involves the development of models ; these are abstract descriptions of the particular entities which are considered to be meaningful in the problem domain. The process of abstracting elements of the problem domain into usable computational entities is an essential aspect of software development, as it composes the building blocks of software architectures (see  Software developers   for discussion of software architects). Abstraction encompasses different levels, at each of which some aspect of the problem domain is either hidden or revealed, and finding the right balance of such showing or hiding in those models does not rely on explicit and well-known rules, but rather on cognitive principles. Starting from the observation that there are no generalizable rules for modelling classes in computer science, Parsons and Wand suggest that cognitive principles can be a productive way forward84 . They base their proposal on the theories of Lakoff and Johnson, insofar as metaphors operate cognitively by mapping two entities abstracted at the same level; such a tool for understanding is further explored in  Metaphors in computation   .
For a banking system, this might involve a Client  model, an Account  model, a Transfer  model and a Report  model, among others. The ability to represent a Client  model at a productive abstraction level is then further complicated by the conceptual relations that the model will hold with other models. Some of these relations can be made explicit or implicit, and interact in unexpected ways, since they differ from our personal conception of what a Client  is and of what it can do85 .
Working at the "right" layer of abstraction then becomes a contextual choice of reflecting the problem accurately, taking into account particular technical constraints, or the social environment in which the code will circulate. For instance, choosing to represent a color value as a three-dimensional vector might be efficient and elegant for an experienced programmer, but might prove confusing to beginner programmers. The key aspect of being a triplet might be lost to someone who focuses on the suggested parallels between points in space and a shade of red.
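As a rough illustration of this choice, the same triplet can be written in C either as a generic vector or as a named color. The types Vec3 and Color, and the functions vec3_make and color_from_vec3, are hypothetical names introduced only for this sketch; both structures hold exactly the same three numbers, and differ only in what their field names suggest to the reader.

```c
/* Generic three-component vector: compact, reusable, but silent
   about what each component stands for. */
typedef struct { float x, y, z; } Vec3;

/* Same three values, with names that state the intended meaning. */
typedef struct { float red, green, blue; } Color;

/* Builds a vector from three components. */
Vec3 vec3_make(float x, float y, float z) {
    Vec3 v = { x, y, z };
    return v;
}

/* Makes the suggested parallel explicit: x is read as red,
   y as green, z as blue. */
Color color_from_vec3(Vec3 v) {
    Color c = { v.x, v.y, v.z };
    return c;
}
```

Calling color_from_vec3(vec3_make(1.0f, 0.0f, 0.0f)) yields a pure red. Neither representation is more correct for the machine; the difference lies in which reader each one addresses.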
Let us consider a simple abstraction, such as having written publications, composed of three components: the name of an author, the date of publication, and the content of the publication. This apparently useful and practical abstraction becomes non-straightforward once the system that uses it changes in scale. With a hundred publications, it is easy to reason about them. With a million publications, the problems themselves start to change, and additional properties such as tags, indexes or pages should be considered in modelling the publication for the computer ( Cities, 2022) .
The aphorism " All models are wrong, but some are useful "  ( Box, 1976) captures the ambiguity of abstraction of a model from real-world phenomena. The aim of a model is to reduce the complexity of reality into a workable, functional entity that both the computer and the programmer can understand. This process of abstraction is the result of judging which parts of a model are essential, and which are not and, as we have seen in  Practical attempts at implementing formal understanding   , judgments involve a certain amount of subjectivity ( Weizenbaum, 1976) .
Ultimately, the concrete representation of a model involves concrete syntax through the choice of data types, the design of member functions and the decision to hide or reveal information to other models. Which individual tokens and which combination of tokens are used in the representation process then contribute to communicate the judgment that was made in the abstraction process.
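These three decisions can be sketched on the banking example mentioned earlier. The following C fragment is only one possible rendering, under assumptions that are not in the text: the names Account, can_withdraw and account_transfer are hypothetical, balances are stored as integer cents, and C's static keyword stands in for the hiding of information from other models.

```c
#include <stdbool.h>

/* Choice of data type: integer cents rather than floating-point
   amounts, a judgment that rounding errors are not acceptable here. */
typedef struct { long balance_cents; } Account;

/* Hidden from other files by static: how a withdrawal is validated
   is judged to be an internal detail of the model. */
static bool can_withdraw(const Account *a, long amount_cents) {
    return amount_cents > 0 && a->balance_cents >= amount_cents;
}

/* Revealed to other models: the single operation they may rely on.
   Returns false and leaves both accounts untouched when refused. */
bool account_transfer(Account *from, Account *to, long amount_cents) {
    if (!can_withdraw(from, amount_cents)) {
        return false;
    }
    from->balance_cents -= amount_cents;
    to->balance_cents += amount_cents;
    return true;
}
```

With an account holding 1000 cents, a transfer of 400 succeeds and a subsequent transfer of 700 is refused. Each token carries part of the judgment: long instead of double, static instead of a public function, a boolean result instead of a silent failure.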
-
Software involves, through programming languages, the expression of human-abstracted models for machine interpretation, which in turn is executed at a scale of time and space that is difficult to grasp for individuals. These properties make it difficult to understand, from conception to application: software in the real world goes through a process of implementation of concepts that lose something in translation, interfacing the world through discrete representations, and following the execution of these representations through space and time. Still, source code is the material representation of all of these dynamics and the only point of contact between the programmer's agency and the machine execution and, as such, remains the locus of understanding. Programmers have been understanding software as long as they have been writing and reading it. We now turn to the attempts at studying the concrete cognitive processes deployed by source code readers and writers as they engage meaningfully with program texts.

The psychology of programming

In practice, programmers manage to write, read and understand source code as a prerequisite for producing reliable source code. Being able to write a program effectively requires a thorough understanding of the problem, intent and platform, making the programming activity a form of applied understanding86.
How programmers deal with such a complex object as software is a research topic which appeared much later than software itself. The field of software psychology aims at understanding how programmers process code, with what level of success, and under which conditions. How do they build up their understanding(s), in order to afford appropriate modification, re-use or maintenance of the software? What cognitive abilities do they summon, and what kind of technical apparatuses play a role in this process? In answering these questions, we will see how the process of understanding a program text is akin to constructing a series of mental models, populating a cognitive map.
The earliest studies of how computer programmers understand the code they are presented with consisted mostly in pointing out the methodological difficulties in doing so (Sheil, 1981; Shneiderman, 1977; Weinberg, 1998). This is mainly due to three parameters. First, programming is an intertwined combination of notation, practices, tasks and management, each of which has its own impact on the extent to which a piece of source code is correctly understood, and it is hard to clearly establish the impact of each of these. Second, program comprehension is strongly influenced by practice, so that the skill level of the programmer also influences experimental conditions87. Third, these early studies found that programmers have organized knowledge bases, if informal and immaterial. This means that, while programmers demonstrate epistemic mastery, they are limited in their ability to explain the workings of such ability, that is, the constitution and use of their own mental models.
Marian Petre and Alan Blackwell attempted in their 1992 study to identify these mental models and their uses. They asked 10 expert programmers from North America and Europe to describe their thought processes when solving source code-related problems and designing solutions in code. Because this study investigated the design of code, before any writing happens, one of its limitations is that it did not investigate the understanding of code, which takes place once the writing has been done (by oneself or by someone else) and the code needs to be read.
The main conclusion of their study is that, although each programmer described their mental process slightly differently, there are commonalities in what happens in someone's thoughts as they start to design software. The behaviour of this imagery is dynamic, but controlled. Its resolution is also dynamic, with some aspects coming in and out of focus at the will of the programmer, providing more or less uncertainty, level of detail and fuzziness on demand. These images also co-exist with other images, such that one representation can be compared with another representation of a different nature (Petre, 1997). Finally, while most imagery was non-verbal, all programmers talked about the need to have elements of this imagery labelled at all times, hinting at a relationship between syntax and semantics to be translated into source code.
Françoise Détienne, in her study of how computer programmers design and understand programs (Detienne, 2001), defines the activity of designing and understanding programs as activating schemas, mental representations that are abstract enough to encompass a wide use (web servers all share a common schema in terms of dealing with requests and responses), but nonetheless specific enough to be useful (requests and responses are qualitatively different subsets of the broader concept of inputs and outputs). An added complexity in the task of programming comes from the dual nature of the mental models needing to be activated: the computer's actions and responses range from the prescriptive (what the computer should do) to the effective (what the computer actually does). In order to deal with it appropriately, then, programmers must activate and refine mental models of a program which resolve this tension. To do so, they seem to resort to spatial activities, such as chunking and tracing (Cant, 1995), thus hinting at a need to delimit some cognitive objects with a material metaphor, and to connect those concepts with a spatial metaphor.
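The tension between the prescriptive and the effective can be sketched with a deliberately flawed function. The example below is an assumption made for illustration, not taken from the cited studies: `mean` and `trace_mean` are hypothetical names, and the tracing shown is a simple recording of intermediate states.

```python
def mean(values):
    """Prescriptive: return the arithmetic mean of `values`."""
    total = 0
    for v in values:
        total += v
    # Effective: floor division silently discards the fractional part.
    return total // len(values)


def trace_mean(values):
    """Tracing: record each intermediate state so that the effective
    behaviour can be compared with the prescribed one."""
    steps = []
    total = 0
    for v in values:
        total += v
        steps.append((v, total))
    effective = total // len(values)
    prescribed = total / len(values)
    return steps, effective, prescribed
```

The docstring states what the program should do, the body determines what it does, and the trace is one way for a reader to reconcile the two.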
In programming, within a given context (which includes goals and heuristics), elements are perceived and processed through existing knowledge schemas in order to extract meaning. Starting from Kintsch and Van Dijk's approach to text understanding (Kintsch, 1978), Détienne nonetheless highlights some differences with natural language understanding. In program texts, she finds, there is an entanglement of the plan, of the arc, of the tension, which does not happen so often in most traditional narrative texts. A programmer can jump between lines and files in a non-linear, explorative manner, following the features of computation rather than those of textuality. Program texts are also dynamic, procedural texts, which exhibit complex causal relations between states and events, and these relations need to be kept track of in order to resolve the prescriptive/effective discrepancies. Finally, the understanding of a program text is first a general one, which only subsequently applies to a particular situation (a fix or an extension needing to be written), while narrative texts tend to focus on specific instances of protagonists, scenes and descriptions, leading to broad thematic appreciation.
Conversely, a similarity between understanding program texts and understanding narrative texts is that the sources of information are the same in both cases: the text itself, the individual experience and the broader environment in which the text is located (e.g. technical, social). Building on Chomsky's concepts, the activity of understanding in programming can be seen as understanding the deep structure of a text through its surface structure (Chomsky, 1965). One of the heuristics deployed to achieve such a goal is looking out for what she calls beacons, thematic organizers which structure the reading and understanding process (Wiedenbeck, 1991; Koenemann, 1991). For instance, in traditional narrative texts, beacons might be represented by section headings, or by the beginning or end of paragraphs. However, one of the questions that her study has not answered specifically is how the specific surface structure in programming results in the understanding of the deep structure; in other terms, what the connection is between source code syntax, programmer semantics and program behavior.
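To give a sense of what a beacon might look like in a program text, consider the sketch below. The swap of two adjacent elements is an example commonly associated with sorting routines; its use here is an illustrative assumption rather than a reproduction of the cited experiments, and the function name `arrange` is deliberately uninformative.

```python
def arrange(items):
    """Deliberately vague name: the reader must rely on the body."""
    result = list(items)
    n = len(result)
    for i in range(n):
        for j in range(n - 1 - i):
            if result[j] > result[j + 1]:
                # The swap below is the kind of line that can act as a
                # beacon: it suggests "sorting" before the loops are traced.
                result[j], result[j + 1] = result[j + 1], result[j]
    return result
```

A reader who recognizes the swap can form the hypothesis that the function sorts its input, and only then verify the loop bounds, which is a surface cue giving access to the deeper structure.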
Détienne's work ushers in the concept of a mental model as a means of understanding in programmers, which proved to be a fruitful, if not settled, field of research. Mental models are dynamic representations formed in working memory as a result of using knowledge from long-term memory and the environment (Cañas, 2001). As such, they are a kind of internal symbolic representation of an external reality, and form a rigorous, personal and conceptual structure. They are related to knowledge, since the construction of accurate and useful mental models through the process of understanding is shaped by, and also underpins, knowledge acquisition. However, mental models need not be correlated with empirical truth, due to their personal nature, but are extensive enough to be described by formal (logical or diagrammatical) means. Mental models can be informed, constructed or further qualified by the use of metaphors, but they are nonetheless more precise than other cognitive structures such as metaphors: a mental model can be seen as a more specific instance of a conceptual structure than a metaphor.
Further research on mental model acquisition has established a few parameters which influence the process. First, programmers have a background knowledge that they activate through the identification of specific recurring patterns in the source code, confirming Détienne's characterization of the role of beacons. Second, mental models seem to be organized either as a layered set of abstractions, providing alternative views of the system as needed, or as groups or sets of heuristics. Finally, programmers use both top-down processes of recognizing familiar patterns and bottom-up techniques to infer knowledge from which they can then construct or refine a mental model (Heinonen, 2023).
Epistemic actions, the kinds of actions which change one's knowledge of the object on which the actions are taken, contribute to reducing the kinds of complexities involved with software. Concretely, this involves refining the idea that one has of the software system at hand, by comparing the result of the actions taken with the current state of the idea(s) held. In their work on computer-enabled cognitive skills, Kirsh and Maglio elaborate on the use of epistemic actions:
More precisely, we use the term epistemic action to designate a physical action whose primary function is to improve cognition by:
  • reducing the memory involved in mental computation, that is, space complexity;
  • reducing the number of steps involved in mental computation, that is, time complexity;
  • reducing the probability of error of mental computation, that is, unreliability.
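In a programming context, one plausible example of such an action is instrumenting a function so that its calls are recorded rather than simulated in one's head. The sketch below is an assumption made for illustration and does not come from Kirsh and Maglio's study; `observe`, `calls` and `gcd` are hypothetical names.

```python
import functools


def observe(log):
    """Wrap a function so that every call is written to `log`:
    the list holds what the reader would otherwise keep in mind."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            result = fn(*args)
            log.append((fn.__name__, args, result))
            return result
        return wrapper
    return decorator


calls = []


@observe(calls)
def gcd(a, b):
    """Euclid's algorithm, written recursively."""
    return a if b == 0 else gcd(b, a % b)
```

After running `gcd(48, 18)`, the list `calls` contains one entry per recursive call, innermost first. The external record takes over the memory, the stepping and the error-checking that a purely mental execution would require.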
Since epistemic actions rely on engaging with a text at the level of syntax and semantics, programmers and researchers have often assumed that reading and writing code is akin to reading and writing natural language. Recent research on the cognitive responses to programming tasks, conducted by Ivanova et al., does not appear to settle the question of whether programming depends on language-processing brain functions or on functions related to mathematics (which do not rely on the language part of the brain) (Ivanova, 2020), but it contributes empirical evidence to that debate. They conclude that, while language processing might not be one of the essential ways in which we process code (which excludes the code is text hypothesis), code processing also does not rely exclusively on mathematical functions. Since it stimulates in particular the so-called multi-demand system, programming seems to be a polymorphous activity involving multiple exchanges between different brain functions. What this implies, though, is that neither literature, linguistics nor mathematics should be the only lens through which we look at code.
-
In a way, then, programming is a sort of fiction, in that its source of existence is difficult to pinpoint, and in that it affords the experience of imagining contents of which one is not the source, and whose certainty is not defined, through a particular syntactic configuration. Both programming and fiction suggest surface-level guiding points which help the process of constructing mental models as a sort of conceptual representation. It is also something other than fiction, in that it deals with concrete issues and rational problems88, and in that it provides a pragmatic frame for processing representations, in which assumptions stemming from burgeoning mental models can be easily verified or falsified through the taking of epistemic actions. It might then be appropriate to treat it as such: simultaneously fiction and non-fiction, knowledge and action, mathematical and artistic. Indeed, it is also an artistic activity which, in Goodman's terms, might be seen as an "analysis of [artistic] behavior as a sequence of problem-solving and planning activities." (Goodman, 1972).
There remains the interpretation issue mentioned above: the interpretation of the machine is different from the interpretation of the human, of which there are many, and therefore what also needs to be interpreted is the intent of the author(s). Such a tension between the computer's position as an extremely fast executor and the programmer's position as a cognitive agent is summed up by Niklaus Wirth in Beauty Is Our Business, Dijkstra's festschrift: "What the computer interprets, I wanted to understand." (Wirth, 1990).
One key aspect of the acquisition process seems to be mapping or linking features of the actual target system to its mental representation. The results of this mapping have been referred to as cognitive maps or knowledge maps.
The complexities of software are echoed in how programmers evoke their experience of either designing or comprehending code. They have been shown to use multiple cognitive abilities, without being strictly limited to narrative or mathematical frames of understanding, and to make use of notions of scale and focus to disentangle complexity. For the remainder of this chapter, we will focus on two specific means that contribute to this process of building a mental model of software-as-source code. Based on the reports that programmers use mental images and play with dynamic mental structures to comprehend the functional and structural properties of software, we can now say that the understanding of a program text involves the construction of mental models. This happens through a process of mapping textual cues to background knowledge at various layers of abstraction, resulting in a cognitive cartography which allows a program text to be made intelligible, and thus functional, to the programmer.
We conclude this chapter with a look at two practical ways in which sense is made of computational systems. From a linguistic perspective, we look at the role that metaphors play in translating computational concepts into ones which can be grasped by an individual. From a technical perspective, we start from the role of layout (indentation, typography) and build on the concept of extended cognition to see how understanding is also located in a programmer's tools.