Vagueness and Biological Individuality

John Boyer
Mar 3, 2022

Around 2400 years ago, a contemporary of Aristotle’s named Eubulides of Miletus formulated a famous thought experiment called the Sorites Paradox. It is a strange problem that has bothered philosophers for thousands of years, and more recently parallels some quandaries in biology and artificial intelligence.

The name Sorites comes from the Greek word for “pile” or “heap,” and some of the more famous philosophers of the 20th century had a crack at solving it. The paradox stems from linguistic vagueness: words like “heap,” “tall,” or even “red” can lead us to contradictory conclusions. These kinds of words are perfectly useful in daily life, but issues with vagueness and classification start to arise around edge cases and borderlines.

To be more specific, let’s imagine a heap of sand made up of 10,000 grains. If we take away one grain of sand from the heap, our new collection of 9,999 grains of sand should certainly still be called a “heap.” It does not seem like that small change should make a difference to how we refer to the thing. Conversely, we would also not consider a collection of 2 grains of sand to be a “heap,” and adding a third grain of sand to the collection would not suddenly make it a heap.

This leads us to think that for any pair of sand collections A and B that differ in quantity by only one grain, either both A and B are heaps or neither should be called a heap. It seems a valid rule that two things which differ only by a very small degree should not be classified differently. This makes intuitive sense: in the same way, two women who differ in height by just 1 millimeter are either both tall or neither is. However, repeatedly applying this rule leads to impossible conclusions.

With this rule in mind, take one grain from the heap of 10,000: the resulting collection of 9,999 is still a heap. Apply the rule again to remove one grain from the heap of 9,999, and the collection of 9,998 must be a heap too. And so on: keep removing sand while following the categorization rule that collections differing by just one grain must be of the same type. The paradox arises once 9,999 grains have been removed; following the chain of reasoning, we have to say that a “collection” of just one grain is a “heap.” The categorization rules initially laid out seem valid, yet they lead to this absurd conclusion. Furthermore, this kind of classification paradox applies to many other vague words, like “tall” or even what color a thing is.
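
To see how mechanical the slide into absurdity is, here is a minimal sketch in Python (purely illustrative, not part of the original argument) that applies the “one grain makes no difference” rule all the way down:

```python
# Purely illustrative: mechanically apply the rule that removing one
# grain never changes whether a collection counts as a "heap".

def sorites(start: int = 10_000) -> None:
    labels = {start: True}          # premise: 10,000 grains is a heap
    for n in range(start, 1, -1):
        labels[n - 1] = labels[n]   # inductive step: n - 1 inherits n's label
    print(f"Is 1 grain a heap? {labels[1]}")  # prints True, absurdly

sorites()
```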

A heap of sand under a microscope

Philosophical analysis of the problem leads to few satisfying conclusions. As with any paradox, we can either reject some of the premises, reject the reasoning linking them together, or accept the conclusions. For our purposes it is most useful to bite the bullet and accept that, as the paradox demonstrates, vague descriptions are deeply flawed in the first place. One way of sorting this out is to replace vague descriptions with ones that rely on empirical data and some comparison between elements. Replacing “Is Alice tall?” with “Is Alice above average height?” helps to eliminate some vagueness and makes the question more answerable. Although adding relative terms provides a practical solution in many cases, it does nothing to help with the logical issue. We are left searching for a borderline between heaps and non-heaps that does not obviously exist.

The problem of vagueness and classification is not unique to ancient Greeks contemplating heaps of sand. Similar issues arise in AI systems that need to perform categorization, and the problem parallels an ongoing line of inquiry in biology. In much the same way that “heap” proves to be a vague descriptor for collections of sand, the notion of an “individual” is a difficult one for biologists to define. However, there is an interesting contrast between the type of vagueness that plagued the minds of ancient Greek philosophers and the kind some biologists are interested in today.

The issue for biologists is partly linguistic vagueness around what an individual is, but also the fact that the boundaries between individuals and their environments are themselves vague. The history of the field of biology is essentially the study of what look like individuals. Developments in ecology made it clear how important the systems in which individuals are embedded really are, while improvements in microbiology and genetics brought the definition of the word itself into question.

Portuguese Man-O-War; a colony not a jellyfish

Individuals are traditionally defined by several criteria. The word implies that they have a defined anatomy and a spatial structure that they occupy. They are also assumed to have their own genetic identity and the ability to reproduce, via either sexual or asexual means. Cells, or the subcomponents of cells, must cooperate to form larger structures, and sophisticated systems can even distinguish the self from others in the form of an immune response.

These criteria work reasonably well for describing the macroscopic world, but they break down when molecular techniques reveal the complexity of relationships between individuals. Polymerase chain reaction techniques and bioinformatics allowed scientists to better understand the intricacies of symbiosis and the flow of genes through time.

In fact, there are exceptions to every one of the criteria for individuality. In some sponges, over 40% of the organism’s body is composed of internal bacteria critical to its survival. For that matter, the human body contains more microbial cells than human cells, so it is hardly anatomically distinct. When mammals reproduce and the parents’ genes replicate, the offspring also inherit the mother’s microbiome in a parallel line of inheritance alongside the genome. It is also clear that cells do not need to be genetically identical to work together as a physiological system. The division of labor can be performed by interacting species, as in lichens, or by colonies of separate organisms in one body, like the Portuguese man-o-war.

Exceptions to the conventional categorization of individuals are causing something of a paradigm shift in theory. Perhaps the old concept of individuality just comes from human perceptual bias and pattern seeking. A refresh of an old framework in biology should not be surprising given the improvement in observational tools over time.

Complex behavior emerging from simple individuals

The Sorites paradox has become a popular illustration of the problem with defining individuals. Much like the boundary between what is or is not a heap, the boundary between what is or is not an individual is vague. An ongoing trend in biology is the formalization of individuality in terms that can accommodate these kinds of counterexamples and even generalize to alien life. In his 2020 article on the information theory of individuality, David Krakauer takes a cross-disciplinary approach, characterizing biological individuality in terms borrowed from information theory.

The difficulties in discriminating between organisms and their environments stem from the complex relations in living systems. While information theory has historically been applied to transmissions like telegraph messages or signals between computers, it also helps to clarify some of these complex interactions. Traditional biological definitions view organisms in terms of their structures and the spaces they occupy. Information theory allows biologists to treat “individual” more like a verb than a noun, looking at individuals as processes rather than as things that persist through time.

Electrical engineers can use Claude Shannon’s information entropy equation to measure the susceptibility of information transmitted between devices to alteration by noise in the system. Biologists can use the same equation to measure an organism’s ability to transmit and preserve genetic information through time. This can mean coded genetic information at two different times and places, similar to a telegraph message sent from one place to another, or it could be some other quantifiable description of the system, like size or rate of metabolism.
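
For reference, the equation in question, Shannon’s entropy of a discrete source X whose symbols x appear with probabilities p(x), measures the average uncertainty of the source in bits:

$$H(X) = -\sum_{x} p(x)\,\log_2 p(x)$$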

Informational entropy can be understood roughly as the richness of information that is possible with a communication channel. Consider two people trying to discuss the weather outside by passing coded information to each other underneath a door. If they can only slide a coin under the door, they might just be able to answer the yes/no question “Is it raining?” with heads for yes and tails for no. If they communicate with 20 scrabble tiles, they can say more interesting things. The information contained in DNA and RNA is much richer still, and therefore has higher informational entropy.
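
As a rough illustration of that ordering, here is a small Python sketch comparing the maximum number of bits each of those three messages could carry, under the simplifying assumption that every symbol is equally likely (the 1,000-base DNA strand is an arbitrary length chosen just for comparison):

```python
import math

def max_bits(alphabet_size: int, message_length: int) -> float:
    """Upper bound on information content: length * log2(alphabet size),
    assuming every symbol is equally likely."""
    return message_length * math.log2(alphabet_size)

print(f"one coin slid under the door: {max_bits(2, 1):7.1f} bits")    # heads or tails
print(f"20 Scrabble tiles (A-Z):      {max_bits(26, 20):7.1f} bits")  # ignoring blanks
print(f"1,000-base DNA strand:        {max_bits(4, 1000):7.1f} bits") # A, C, G, T
```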

Claude Shannon’s original signal/interference diagram (1948)

Informational decay is also low when a message is conveyed between two states with minimal changes, like a text message received without any errors or a genetic sequence perfectly copied in a clonal organism. Bits of information traveling through an insulated wire would have a much lower susceptibility to interference by the channel they are conveyed through than the messages that children whisper to each other in the game Telephone.
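
A standard textbook model of this susceptibility, not taken from any of the sources here, is the binary symmetric channel, in which each transmitted bit is flipped with some probability. The sketch below shows how quickly the information that survives per bit falls off as that flip probability grows:

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a biased coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bits_preserved(flip_prob: float) -> float:
    """Mutual information per bit through a binary symmetric channel
    with a uniform input: 1 - H(flip_prob)."""
    return 1.0 - binary_entropy(flip_prob)

for p in (0.0, 0.01, 0.1, 0.25, 0.5):
    print(f"flip probability {p:4.2f}: {bits_preserved(p):.3f} bits preserved per bit sent")
```

An insulated wire sits near the top of that table; a chain of whispering children sits near the bottom.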

One can gauge an organism’s influence on itself through time, that is, its ability to maintain its own internal states. Additionally, some slight alterations to the math allow one to examine the two-way relationship between individuals and their environment. The more individual an organism, the more its past states will correlate with its future states. Likewise, the stronger an individual’s ability to maintain its own states through time, the more it will influence its environment rather than the other way around.
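
A toy version of this idea, which is my own simplification rather than the measure defined in Krakauer’s article, is to estimate the mutual information between a system’s state now and its state one step later. A system that largely determines its own future scores high; one whose next state is set by its environment scores near zero:

```python
import math
import random
from collections import Counter

def mutual_information(pairs):
    """Estimate I(past; future) in bits from a list of (past, future) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    past = Counter(p for p, _ in pairs)
    future = Counter(f for _, f in pairs)
    return sum((c / n) * math.log2((c / n) / ((past[p] / n) * (future[f] / n)))
               for (p, f), c in joint.items())

def simulate(persistence: float, steps: int = 50_000):
    """Binary-state process: with probability `persistence` the current state
    carries over to the next step; otherwise the environment resets it at random."""
    random.seed(0)
    state, pairs = 0, []
    for _ in range(steps):
        nxt = state if random.random() < persistence else random.randint(0, 1)
        pairs.append((state, nxt))
        state = nxt
    return pairs

for persistence in (0.0, 0.5, 0.95):
    mi = mutual_information(simulate(persistence))
    print(f"persistence {persistence:.2f}: I(past; future) is roughly {mi:.3f} bits")
```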

David Krakauer’s two-way individual/environment diagram (2020)

Given the tremendous range of information that DNA could encode, the vast majority of possible states are meaningless and would not code for anything. The tendency of living things to preserve the specific information that perpetuates life, in spite of all the interference and noise in the world around them, can be thought of as a mark of individuality.

Three general types of individuals emerge when the informational relationships between them and their environments are mapped. The first is fairly conventional individuals like animals which maintain their own states in spite of large informational changes in the environment. The second is colonial organisms which have some ability to maintain their microstates through time, but which are closely intertwined with neighbors in their environments. The third class is organisms which have little endogenous ability to maintain their own individual states, and exist largely because of nuances in their environment.

Complex systems of individuals are also able to overlap and exist at different levels of hierarchy within this model. Take an ant colony for instance. Ants are biologically distinct individuals under the classical criteria of each one possessing a distinct anatomy, unique genes, and discrete developmental history. However, the vast majority of the ants in a colony never reproduce. Their labors are a form of kin selection that helps to support the queen and the group as a whole. Moreover, they act in close enough coordination that the collective produces a sort of emergent intelligence. Much of an ant’s physiology would also not be possible without its microbiome.

Each level has individuality, and each level is critical to the sophisticated behavior that emerges at the higher levels. The individual bacteria in the gut of an ant are certainly individuals, but they also play a large role in the larger system of an individual ant. Individual ants likewise preserve their own informational states through time despite not reproducing, and at a higher level still, the collective actions of the colony constitute an individual in that they preserve the information of the collective across generational time scales. The innovation of information theory is that all of these nested systems can be treated as individuals from a quantitative perspective.

Nested levels of individuality are likened to Gestalt images, where the same image can be interpreted in different ways by an observer. Our brains make quick decisions about whether the image is of vases or faces, but we can readjust our focus to interpret the same image in another way. It is likewise possible to look at the system of an ant colony and see the individual ants, or refocus and see the colony as a whole as an emergent individual.

Vases vs Faces — Classical vs. Informational Individuals

Our brains are constantly classifying the world around us, and all of our perceptions are in some way biased by our own place in the world. The classical way of classifying biological organisms may end up being one example of a flawed categorization system biased by the way we see the world. An improved view based on informational abstraction helps to generalize classification. Thankfully, one of the chief benefits of science is its ability to cut through our perceptions and intuitions to reveal facts about the world that we could not see before. Technical and theoretical improvements are able to elucidate empirical systems like biology, but these are exactly the kinds of tools that are not able to help with linguistic quandaries such as the Sorites paradox.

The kind of linguistic problem that comes from words like “heap” or “tall” cannot be readily resolved by looking for some threshold on a continuum. The only way of really clarifying what is meant in a scientific way is to define the categories in relation to an emergent threshold like an average. Using relational classification like averages allows the observer to tap into knowledge about the whole set of things being compared.

Saying that something is “tall” or a “heap” is a fundamentally subjective categorization that is useful in day-to-day life, where the classification is normatively correct insofar as others agree with it. More scientific phrasing like “above average height” or “a collection of more than 1,000 grains” has some verifiable epistemic basis in reality beyond mere linguistic agreement.
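
A tiny sketch makes the contrast concrete (the names and heights are hypothetical, chosen only for illustration): a relational predicate like “above average height” can be checked against the whole set, whereas the bare predicate “tall” has no threshold built in to appeal to.

```python
# Hypothetical heights in centimeters, chosen only for illustration.
heights = {"Alice": 172, "Bea": 158, "Carol": 165, "Dina": 181}

def is_above_average_height(name: str, group: dict[str, int]) -> bool:
    """Relational classification: compare one member against the whole set."""
    average = sum(group.values()) / len(group)
    return group[name] > average

print(is_above_average_height("Alice", heights))  # True (the group average is 169.0)
# There is no analogous way to settle "Is Alice tall?"; the bare predicate
# offers no comparison class and no threshold.
```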

Biology has also benefited tremendously over time from more quantifiable methods of taxonomy, and theoretical improvements like the information theory of individuality promise to help further. Asking a question without knowing what form an answer might take is not particularly scientific. Currently, asking “Is X an individual?” is challenging because of difficult edge cases that straddle the borderline between individual and not. The information-theoretic approach allows for methods that are much more quantitative and less comparative and, hopefully, provides fruitful avenues of research toward new insights.

References

  • “The Information Theory of Individuality” by David Krakauer
  • “Paradoxes” by R. M. Sainsbury
  • “What Is an Individual? Biology Seeks Clues in Information Theory” by Jordana Cepelewicz
  • “A Symbiotic View Of Life: We Have Never Been Individuals” by Scott F. Gilbert
  • “A Mathematical Theory of Communication” by C. E. Shannon
