Molecular Machines and Evolution, Part 1
I will be writing a number of essays addressing the argument that co-option is a plausible evolutionary mechanism for the origin of molecular machines. Briefly, there are many molecular machines that carry out functions requiring the interaction of multiple protein components. How could these biological systems have evolved through purely non-teleological mechanisms? Co-option is often offered as a plausible evolutionary mechanism that can give rise to such molecular machines. But there are a number of problems with invoking non-teleological co-option as a general solution to the origin of molecular machines, and these problems can be summarized as follows:
1. Complementary conformations.
2. Pre-adaptation of components.
3. A specific sequence of co-option events is required.
4. Other considerations.
I will discuss the first problem in this article; the other problems will be considered in following articles.
The Co-option Mechanism
Molecular machines are composed of specific protein components that interact to produce biological function. Below is a figure describing a hypothetical molecular machine composed of protein components A, B, C, and D.
Figure 1. Components A, B, C, and D interact to produce a biological function.
If this biological function can only be carried out by the interaction of multiple protein components, then co-option must be invoked. Simply put, co-option involves a protein that originally carried out function X undergoing a shift in function such that it now carries out function Y. “Normal” Darwinian evolution, where, for example, gene duplication gradually increases the efficiency of the system, cannot explain the origin of molecular machines whose functions require the interaction of multiple protein components. (Note: by “normal Darwinian evolution” I simply mean random mutation and natural selection gradually increasing the efficiency of a system; my use of the term “normal” does not imply that co-option is not a normal process.) This is because the very nature of the function requires that multiple components interact, so you couldn’t start with a single component carrying out this function. That component would have to carry out another function, then interact with other proteins over evolutionary time and undergo a shift in function such that the new function arises. This is the essence of co-option, illustrated in Figure 2.
Figure 2. Components originally carrying out unrelated functions associate with each other step-by-incremental-step, eventually producing a novel function (function 6).
Now let’s examine if co-option offers a general solution to the origin of molecular machines.
The Problem of Protein Shapes
Molecular machines are constructed from individual protein parts. For example, the bacterial flagellum of Salmonella consists of 42 protein parts, such as MotA and MotB (the motor proteins), FlgE (universal joint), FliD (cap protein), etc. These protein parts interact in a tightly-integrated, specific manner. And to interact in precise ways, these protein parts have specific, complementary shapes consisting of knobs, crevices, protrusions, etc. Thus, in order for these molecular machines to have been co-opted from precursor parts, the precursor proteins would need to independently evolve complementary shapes prior to the co-option of the molecular machine, despite carrying out different functions. Consider Figure 3.
In Figure 3 we see a “protein complex” composed of 5 components: A, B, C, D, and E. Importantly, the shapes of these proteins are complementary to each other. Component A is complementary to B, C, and D, and D is complementary to A, C, and E. Yet the co-option scenario would have us believe that parts A through E (which are shown apart from the complex in the above figure) were originally functioning in different contexts and were independently shaped by natural selection such that their shapes just happened to be complementary. But there is no reason why, without teleology, these 5 proteins should be shaped just right prior to their co-option into the novel molecular machine. Is it really plausible to expect several proteins to independently match a given pattern, the pattern of complementarity? This pattern would have to be shaped by chance alone, since there is nothing in natural selection that would drive towards matching a pattern that will only be beneficial in the future. It is important to remember that non-teleological evolution, unlike an engineer, has no foresight. And since evolutionary mechanisms cannot peer into the future, it is entirely unreasonable to expect non-teleological processes to shape these proteins in just the right way such that when they do associate, novel function appears.
How specific do protein shapes need to be?
The parts depicted in Figure 3 have very specific shapes and fit very well. But one could argue that during the evolution of a molecular machine, the parts were not quite so complementary but still managed to elicit the function. Then, over time, the parts became more tightly integrated, performing the function more efficiently. This situation is seen in Figure 4.
In the above figure, components A through E are somewhat complementary to each other, but not very much. These parts associate to form the molecular complex, and then over time (represented by the large red arrow), the parts become more tightly integrated, resulting in a molecular machine that is composed of tightly integrated components. So, in this scenario, the shapes of the precursor parts do not need to be that complementary to each other – they only need to be suited to each other well enough so that there is function, even if it is only minimal. Let us now consider this possibility.
The first point I will make here is that random protein-protein binding almost never produces new biochemical functions. Of all the possible ways for 2 or more proteins to bind together, the vast majority will not offer novel biological functionality. And since we are talking about 3D space here, there are trillions of different possible protein-protein interactions between just 2 proteins.
However, there is another point I want to make, and that is that as the complexity of the system increases, and more parts are co-opted into the system, the greater the constraints on evolutionary mechanisms, and the less plausible it is for that biological machine to increase in complexity. And we can tie this into the discussion of complementary shapes. Two proteins could feasibly, through chance, have roughly complementary shapes, and loosely bind, producing a novel but inefficient function. This loose conglomeration could grow in complexity by the co-option of more proteins. But as the system becomes more complex and more tightly integrated, simple binding to the system by a protein will not produce novel function. The protein must bind specifically to particular components of the complex, and of course, the more components there are, the greater the number of possible protein-protein binding interactions – the vast majority of which will be non-functional. Indeed, you might “gum up the works” if your new protein does not bind specifically enough and if its shape is not fully complementary to the proteins it will interact with. John Bracht highlighted this way back in 2002 in a response to Ursula Goodenough on the evolution of the bacterial flagellum:
“Evolutionary explanations must describe how a new protein integrates into an old system in such a way as to allow continued functionality overall (often, both the incoming protein and the pre-existing system must be extensively modified to fit together in a coordinated way), and enhance functionality of the entire system in such a way as to provide selective advantage.”
Furthermore, if a new protein component will interact with multiple components of the system, there are even more severe constraints on what protein shapes are allowed. The blind watchmaker would have to independently shape this new protein precisely so that when it is incorporated into the molecular machine, its shape fits well. We can construct a hypothesis that goes as follows:
The more components a protein interacts with in a biological machine, the more specific its shape must be and, consequently, the more specific its sequence must be (since the sequence is what codes for the protein's shape).
Now, let’s test this hypothesis.
ATP synthase and Protein Conformation Specificity
To test this hypothesis, we will begin with the following premise: protein conformation specificity is determined by amino acid sequence specificity. In other words, since it is ultimately the amino acid sequence of the protein that determines its shape (there are other factors, but I won’t get into that right now), we’d predict that a protein that interacts with multiple components will have a greater degree of sequence conservation across taxa than a protein that only interacts with one protein.
Here’s where ATP synthase comes in. Bacterial F1F0 ATP synthases are composed of 8 components: the alpha subunit, the beta subunit, the a subunit, the b subunit, the c subunit, the gamma subunit, the delta subunit, and the epsilon subunit (see Figure 5).
Figure 5. Diagram of the ATP synthase system.
I retrieved the sequences of each of these components from UniProt. The sequences were all from three different bacterial genera: Escherichia, Shigella, and Bacillus. So, there were 3 alpha sequences, 3 beta sequences, 3 subunit a sequences, etc. The sequences of each component were then aligned using ClustalO, and the percent identity was recorded. Below is a table of the subunits, the percent sequence identity shared among the 3 sequences for each subunit, and the number of ATP synthase (ATPase) components each of these proteins interacts with.
| Name of Protein | Percent Sequence Identity | Number of Components Protein Interacts With |
|---|---|---|
| ATPase subunit alpha | 52.621% | 4 (beta, gamma, delta, epsilon) |
| ATPase subunit beta | 65.962% | 4 (alpha, gamma, delta, epsilon) |
| ATPase subunit a | 24.468% | 2 (c, b) |
| ATPase subunit b | 24.571% | 2 (a, delta) |
| ATPase subunit c | 39.241% | 3 (a, gamma, epsilon) |
| ATPase subunit delta | 22.162% | 3 (alpha, beta, b) |
| ATPase subunit epsilon | 34.532% | 4 (alpha, beta, gamma, c) |
| ATPase subunit gamma | 36.426% | 4 (alpha, beta, epsilon, c) |
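For readers who want to reproduce figures like those in the table, here is a minimal Python sketch of one way to compute a single percent identity for a set of aligned sequences. It assumes that "percent identity" means the share of alignment columns in which every sequence carries the same residue, with gapped columns counted as non-identical. That definition is an assumption on my part, since the exact formula behind the table is not stated here, and the three short sequences are hypothetical toy data, not real ATP synthase sequences.

```python
def percent_identity(aligned):
    """Percent of alignment columns where all sequences share one residue.

    `aligned` is a list of equal-length strings taken from a multiple
    sequence alignment; '-' marks a gap. Columns containing a gap are
    never counted as identical (an assumption of this sketch).
    """
    if not aligned or len(aligned[0]) == 0:
        raise ValueError("need at least one non-empty aligned sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    identical_columns = sum(
        1
        for column in zip(*aligned)  # walk the alignment column by column
        if "-" not in column and len(set(column)) == 1
    )
    return 100.0 * identical_columns / length


# Hypothetical toy alignment of three sequences (not real data).
toy_alignment = [
    "MKTAY-LV",
    "MKSAYGLV",
    "MRTAY-LI",
]

identity = percent_identity(toy_alignment)
print(f"{identity:.3f}% identity")
```

In the toy alignment, 4 of the 8 columns are fully conserved, so the function reports 50% identity. Dividing by the full alignment length rather than by the shortest sequence is a design choice; other conventions would give somewhat different numbers.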
The first feature that I’d like you to notice is that ATPase subunits a and b both interact with only 2 components, and they show almost exactly the same degree of sequence conservation (24.468% and 24.571%, respectively, a difference of about 0.1 percentage points). However, we do see some exceptions to the hypothesis I described above. Subunit delta interacts with 3 components but has the lowest degree of sequence conservation. And subunits gamma and epsilon both interact with 4 components but have a lower degree of sequence conservation than ATPase subunit c, which interacts with only 3. Nevertheless, if we average the degrees of sequence conservation among the ATPase subunits that interact with the same number of components, we do indeed find that, on average, the greater the number of components an ATPase subunit interacts with, the greater the degree of sequence conservation, and hence the more conserved the 3D structure of the protein (see graph, below).
Graph. This graph lists the mean degree of sequence conservation among ATPase subunits that interact with 2, 3, and 4 components respectively.
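The averaging behind the graph can be checked with a few lines of Python. The sketch below takes the percent identity values and interaction counts straight from the table above; the variable and function names are my own and purely illustrative.

```python
from collections import defaultdict
from statistics import mean

# (percent sequence identity, number of interacting components),
# copied from the table of ATPase subunits above.
subunits = {
    "alpha": (52.621, 4),
    "beta": (65.962, 4),
    "a": (24.468, 2),
    "b": (24.571, 2),
    "c": (39.241, 3),
    "delta": (22.162, 3),
    "epsilon": (34.532, 4),
    "gamma": (36.426, 4),
}


def mean_identity_by_partners(data):
    """Group subunits by interaction count and average their identities."""
    groups = defaultdict(list)
    for identity, partners in data.values():
        groups[partners].append(identity)
    return {partners: mean(values) for partners, values in sorted(groups.items())}


means = mean_identity_by_partners(subunits)
for partners, value in means.items():
    print(f"{partners} components: {value:.2f}% mean identity")
```

This gives mean identities of about 24.52% for the two subunits that interact with 2 components, 30.70% for the two that interact with 3, and 47.39% for the four that interact with 4.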
From the above graph we can see that, generally speaking, our hypothesis is correct. The greater the number of components a protein interacts with, the more specific its sequence and shape must be. And this adds another constraint on what kinds of proteins are and are not tolerated for being co-opted into a multi-part molecular machine.
There is one more detail I would like to add here, regarding the matter of complementary shapes: not only must the proteins be fairly complementary to one another, but these complementary-shaped proteins must also be localized to the same subcellular location. If they are not, then the molecular complex cannot be co-opted from these precursor proteins. This, again, adds another constraint on the co-option scenario, and diminishes its plausibility as a general solution to the origin of molecular machines.
I will summarize the conclusions of this article in brief:
- In order to function properly, molecular machines require the interaction of protein components that interlock and bind together. The shapes of the proteins are what allow them to fit snugly with each other, producing biological function.
- Although loosely complementary shapes will produce function, there is a threshold at which there will be either novel functionality or no novel functionality; and the vast majority of physically possible protein shapes will be below this threshold. Further, the precursor proteins which independently evolve complementary shapes must just happen to be localized to the same subcellular location.
- If a protein that will be co-opted into a multi-part complex will interact with multiple components of the molecular complex, then its shape must be very specific. And there are many more ways to clog up, gum up, and destroy the function of a molecular machine by tossing a protein into the mix than there are ways to enhance the function of the machine by the addition of a new protein.
To be continued…