February 04, 2007

Case Notes 7

The liver is a large and complex organ with some rather unusual characteristics. For one thing, it can regenerate damaged tissue; for another, the majority of its blood flow comes from veins rather than arteries. Both of these features relate to its functions, which include processing and storing various nutrients and also, famously, detoxification. As the first port of call for blood that's just absorbed all kinds of junk from the gastrointestinal lining, the liver is constantly being damaged in the line of duty.

It also has substantial responsibility for glucose metabolism -- maintaining the levels of glucose in circulation by managing conversion to and from the storage form glycogen. Glucose homeostasis is critically important for the brain, which -- unlike many body tissues -- can't readily make use of other energy sources.

The maintenance process is mediated by two principal hormones, insulin and glucagon, manufactured and released by the pancreas in response to blood glucose levels. Glucagon stimulates the release of glucose via glycogenolysis, while insulin stimulates its uptake from the blood. When this process fails, as with diabetes, the consequences can be disastrous.
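
By way of illustration, here is a toy Python sketch of that feedback loop. Everything in it -- the rate constants, the set-point, the meal -- is invented for the purpose, so it captures only the shape of the regulation, not the physiology.

```python
# Toy sketch of the glucose-insulin-glucagon feedback loop.
# All rate constants, units and the set-point are illustrative, not physiological.

def simulate(hours=12.0, dt=0.01, meal_at=2.0):
    G, I, N = 5.0, 0.0, 0.0   # glucose (mM), insulin and glucagon (arbitrary units)
    setpoint = 5.0
    t, history = 0.0, []
    while t < hours:
        meal = 2.0 if meal_at <= t < meal_at + 0.5 else 0.0  # glucose influx from the gut
        # Pancreas: insulin rises when glucose is above the set-point, glucagon when below.
        dI = 0.5 * max(G - setpoint, 0.0) - 0.3 * I
        dN = 0.5 * max(setpoint - G, 0.0) - 0.3 * N
        # Liver: insulin drives uptake and storage as glycogen, glucagon drives glycogenolysis.
        dG = meal - 0.4 * I * G + 0.3 * N
        G, I, N = G + dG * dt, I + dI * dt, N + dN * dt
        history.append((t, G, I, N))
        t += dt
    return history

if __name__ == "__main__":
    for t, G, I, N in simulate()[::200]:   # print every couple of simulated hours
        print(f"t={t:5.2f}h  glucose={G:5.2f}  insulin={I:5.2f}  glucagon={N:5.2f}")
```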

The functional units of the liver are the lobules, hexagonal arrangements of hepatocytes through which the incoming blood diffuses. In the course of the journey from the periportal exterior of the lobule to the perivenous interior, the hepatocytes clean the blood up, rebalance it and send it on its way with a spring in its step.

Data on how the liver functions comes from many sources. Although the process is extremely delicate, hepatocytes can be grown in the laboratory and various measurements made to determine what they do, and to a certain extent how they do it, both individually and en masse.1 Laboratory experiments in model organisms such as mice run alongside endless clinical studies of human liver pathology.

As with most such experimental data, the results do not in themselves explain anything; for that you need models, which the data informs. For a complex system like the liver, which does a bunch of different things in different ways and involves the interactions of many interdependent processes at a variety of scales, there will inevitably be a great deal of data from a great many uncoordinated investigations, each predicated on a different model. Integrating such a disparate patchwork of sources into any kind of comprehensive explanation is a job for the trendy discipline of systems biology.2

Liver systems biology is difficult at least in part because the system is essentially chemical in nature, rather than electrical or mechanical -- the latter tend to lead more naturally to quantitative results. The chemical behaviour of the liver is non-uniform, with both systematic structural variations, like the periportal-perivenous gradient, and variation that is apparently random; for example, 30-40% of adult hepatocytes are bi-nuclear, leading to differences in biochemical behaviour. Hepatocytes work together, communicating via connexin gap junctions, so local models cannot adequately capture their emergent behaviour, and the liver itself does not work in isolation -- for glucose homeostasis it is closely bound to the pancreas.

Accurate and detailed simulations of the entire system are unlikely to materialise anytime soon. Nevertheless, the effort to develop composite and multi-scale models can lead to improved understanding and in turn point the experimentalists toward phenomena worth investigating. Iterate the process a few thousand times and we'll be away.

The problem of integrative modelling is, essentially, an engineering one; a software engineering problem, in fact. While it is possible to tailor a single monolithic composite of a number of existing sub-models by bespoke reimplementation, such an approach leads inexorably to crisis: as soon as you want to add something to the structure, it all falls down. Computing may be in its infancy compared to most of science, but if there's one single lesson we have learnt in the last half century3 it's that complexity is unmanageable without modularisation.

A core tenet of software engineering is that the coupling between distinct components should be kept to a minimum. Things that exist in separate realms should be kept as separate as possible. Communication should take place through clearly defined interfaces and what goes on behind those interfaces is no-one else's damn business.

Biological models, at whatever scale and with whatever purpose, are seldom if ever designed with such principles in mind. Models tend to be structured to address a specific circumstance rather than to provide a service, and even where the results are universal the implementation almost certainly won't be.

To recruit such models into a larger composite, you can either reimplement them altogether or try to create an infrastructure that allows existing implementations, suitably wrapped, to interoperate. The latter task is non-trivial4 but surely better than rewriting everything again and again; and certainly more readily adaptable, allowing component models to be mixed and matched according to need. Individual models can still be reimplemented where necessary.5
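
To make the wrapping concrete, here's a minimal Python sketch under invented names: a common interface that composite machinery could rely on, and an adapter around a hypothetical legacy hepatocyte model whose own API and units were never meant for interoperation.

```python
# A sketch of the wrapping idea. The legacy model, its units and the
# quantity names are all hypothetical; only the pattern matters.

from typing import Dict, Protocol

class ComponentModel(Protocol):
    """What the composite needs: advance one step, exchanging named quantities."""
    def step(self, inputs: Dict[str, float], dt: float) -> Dict[str, float]: ...

class LegacyHepatocyteModel:
    """Stand-in for an existing implementation with its own idiosyncratic interface."""
    def advance(self, plasma_glucose_mg_dl: float, minutes: float) -> float:
        return 0.01 * plasma_glucose_mg_dl * minutes   # pretend glycogen flux

class HepatocyteAdapter:
    """Wraps the legacy model behind the common interface, converting units on the way."""
    def __init__(self) -> None:
        self._model = LegacyHepatocyteModel()

    def step(self, inputs: Dict[str, float], dt: float) -> Dict[str, float]:
        glucose_mg_dl = inputs["glucose_mM"] * 18.0           # mM -> mg/dl
        flux = self._model.advance(glucose_mg_dl, dt * 60.0)  # hours -> minutes
        return {"glycogen_flux": flux}

adapter: ComponentModel = HepatocyteAdapter()
print(adapter.step({"glucose_mM": 5.0}, dt=1.0))
```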

A crucial aspect of any such infrastructure is meta-data: information describing the models -- e.g., what they are implemented in, how they are invoked, their inputs and outputs, their assumptions. If these meta-data can be made available then at least some of the nitty-gritty of managing interactions between different models can be automated. An XML-based language for this, CMDL, has been developed, along with "orchestrator" software that can run models together, although this process remains somewhat crude and ad hoc.
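
CMDL itself is XML and rather more involved, but the gist can be faked in a few lines of Python: each model declares its named inputs and outputs, and a naive orchestrator works out who feeds whom. The model names and quantities below are made up for the example.

```python
# Illustrative stand-in for model meta-data and its use -- not the real CMDL.

from typing import Dict, List

METADATA = [
    {"name": "pancreas", "inputs": ["glucose_mM"], "outputs": ["insulin", "glucagon"]},
    {"name": "liver",    "inputs": ["insulin", "glucagon"], "outputs": ["glucose_mM"]},
]

def wire(metadata: List[Dict]) -> Dict[str, List[str]]:
    """For each model, list the models whose declared outputs feed its inputs."""
    producers = {out: m["name"] for m in metadata for out in m["outputs"]}
    return {m["name"]: sorted({producers[i] for i in m["inputs"] if i in producers})
            for m in metadata}

print(wire(METADATA))   # {'pancreas': ['liver'], 'liver': ['pancreas']}
```

The circular wiring in that little example is no accident; it leads straight to the next difficulty.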

A particular problem arises when different models have circular dependencies: the output of one affects the input of another and vice versa. In these cases an approach known as waveform relaxation is used, which basically involves re-running the models against each other's previous outputs and iterating until the exchanged signals converge. However, this method may exhibit non-uniform convergence, meaning that you can't rely on it to always reach the right answer. This may be so even where the separate models would have a simple analytic solution if treated together as a single system of equations. In specific cases this may constitute an argument against modularisation; that assumes there is a choice, though, which there may not be.
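
Here's a toy Python version of the idea for two deliberately simple linear models that depend on each other. Each sweep solves one model over the whole time window using the other's waveform from the previous sweep, and the loop stops when the exchanged waveforms stop moving. This particular pair is well-behaved and converges quickly; real composites are not always so obliging.

```python
# Toy waveform relaxation for two models with a circular dependency:
#   model A:  dx/dt = -2*x + y      model B:  dy/dt = x - 2*y
# Each sweep re-solves one model over the whole window against the other
# model's previous waveform, iterating until the waveforms settle.

N, T = 200, 2.0
dt = T / N

def solve_A(y_wave, x0=1.0):
    x, xs = x0, [x0]
    for n in range(N):
        x += dt * (-2.0 * x + y_wave[n])   # explicit Euler step for model A
        xs.append(x)
    return xs

def solve_B(x_wave, y0=0.0):
    y, ys = y0, [y0]
    for n in range(N):
        y += dt * (x_wave[n] - 2.0 * y)    # explicit Euler step for model B
        ys.append(y)
    return ys

x_wave, y_wave = [1.0] * (N + 1), [0.0] * (N + 1)   # initial guesses for the exchanged waveforms
for sweep in range(50):
    x_new, y_new = solve_A(y_wave), solve_B(x_wave)
    drift = max(abs(a - b) for a, b in zip(x_new + y_new, x_wave + y_wave))
    x_wave, y_wave = x_new, y_new
    if drift < 1e-9:
        print(f"converged after {sweep + 1} sweeps; x(T) = {x_wave[-1]:.4f}")
        break
```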

An advantage of using a modular framework is that individual models can be as detailed or as simple as necessary, with more complex models swapped in where required without disturbing the rest of the composite. The result is a new modelling approach, a model ecology in which multiple possible models coexist. Such a system is much less prone to the last cog effect, where the whole meta-model produces nothing until every aspect is perfectly finished. Instead, the gross structure can be blocked out with simple placeholders and some results obtained even before stepwise refinement begins.

In the particular case under consideration, the main focus is to model liver function, so more complex components are used for problem-specific elements such as glucagon receptors, while relatively trivial models suffice for the blood and pancreas. If these prove inadequate, they can be replaced by more sophisticated ones.
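
As a sketch of the swapping, here are two interchangeable -- and entirely hypothetical -- pancreas components behind the same interface: a crude placeholder and a slightly more elaborate version with a fast and a slow response. The rest of the composite needn't know or care which one is plugged in.

```python
# Two swappable pancreas components behind one interface; both invented.

class PlaceholderPancreas:
    """Crude placeholder: insulin simply proportional to glucose above a set-point."""
    def step(self, glucose_mM: float, dt: float) -> dict:
        return {"insulin": max(glucose_mM - 5.0, 0.0)}

class BiphasicPancreas:
    """Slightly more detailed: a fast transient plus a slower sustained response."""
    def __init__(self) -> None:
        self.slow = 0.0
    def step(self, glucose_mM: float, dt: float) -> dict:
        excess = max(glucose_mM - 5.0, 0.0)
        self.slow += dt * (excess - self.slow)         # slow pool relaxes toward demand
        return {"insulin": 2.0 * excess + self.slow}   # fast + sustained components

def run(pancreas, glucose_trace, dt=0.1):
    """The composite only ever calls step(); either component will do."""
    return [round(pancreas.step(g, dt)["insulin"], 3) for g in glucose_trace]

trace = [5.0, 7.0, 9.0, 8.0, 6.0]
print(run(PlaceholderPancreas(), trace))
print(run(BiphasicPancreas(), trace))
```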

Models typically require two different kinds of content: their systematic/structural/topological characteristics -- which correspond to the underlying ideas of how things actually work -- and their parameters, specific numerical values that are estimated from experimental data. The latter are a source of potential confusion because of disparities of definition and method across experimental studies.
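
A small Python sketch of keeping the two apart: the rate law below is the structural claim (saturating, Michaelis-Menten-style uptake), while the parameter sets stand in for estimates from two hypothetical studies whose units had to be reconciled before they could be compared.

```python
# Structure versus parameters, kept separate. The rate law is the structural
# part; the numbers are hypothetical estimates from different (invented) studies.

from dataclasses import dataclass

@dataclass
class UptakeParameters:
    vmax: float   # maximal uptake rate, mmol per hour
    km: float     # half-saturation constant, mM
    source: str   # where the estimate came from

def uptake_rate(glucose_mM: float, p: UptakeParameters) -> float:
    """Structural claim: saturating (Michaelis-Menten) uptake kinetics."""
    return p.vmax * glucose_mM / (p.km + glucose_mM)

# The second set was reported in different units and converted to match the first.
lab_a = UptakeParameters(vmax=10.0, km=7.0, source="study A (hypothetical)")
lab_b = UptakeParameters(vmax=12.0, km=9.2, source="study B (hypothetical, converted)")

for p in (lab_a, lab_b):
    print(p.source, round(uptake_rate(5.0, p), 2))
```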

There are several criteria one might apply to determine the success of a model. An obvious one is that it should produce the expected results; that is, its predictions should match the experimental data that went into it. However, this is not always very informative: it is often possible to fiddle model parameters to match the training data without the model having any more general applicability, a process known as over-fitting. More significant is how well the model's predictions match future data from different experiments.
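
A quick Python illustration, with polynomial curve-fitting standing in for the parameter-fiddling and synthetic data standing in for the experiments: as parameters are added, the fit to the training data keeps improving, while the fit to held-out data stalls and then tends to deteriorate.

```python
# Over-fitting in miniature: more parameters always flatter the training data,
# but beyond a point they fit its noise and tend to do worse on held-out data.

import numpy as np

rng = np.random.default_rng(0)
truth = lambda x: np.sin(2 * np.pi * x)       # the "real" underlying behaviour
x_train = np.linspace(0.0, 1.0, 12)
x_test = np.linspace(0.02, 0.98, 50)
y_train = truth(x_train) + rng.normal(0.0, 0.2, x_train.size)   # noisy "experiments"
y_test = truth(x_test) + rng.normal(0.0, 0.2, x_test.size)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_rmse = np.sqrt(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
    test_rmse = np.sqrt(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    print(f"degree {degree}: training RMSE {train_rmse:.3f}, held-out RMSE {test_rmse:.3f}")
```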

A good model should generate interesting science: it should suggest new experiments and avenues of exploration, making predictions for things that have not yet been tried. It should provide a conceptual framework within which a problem can be better understood. If a composite model does so, and its predictions prove correct, then there are better grounds for supposing some correspondence between its components and the elements of the real world system it describes. However, it will always remain merely a model.


1 One of the problems that has to be faced when dealing with hepatocytes, and indeed all cells, in vitro is that they can de-differentiate, losing their liverness and tending to more generic cell behaviour. Also, most cells have a shelf-life: after some number of generations they just die. To get around this, cells are often cultured from immortalised lines -- often from tumours, since avoidance of cell death is one of the characteristics of cancer -- but it's not always clear that these cells exactly reproduce the behaviour of their mortal cousins.
2 Systems biology is hip, groovy, important and spectacularly difficult; but, though informed by the genomic revolution, its territory is not exactly new. Cynics may see it as something of a rebranding exercise: physiology for the 21st century.
3 No, you're right: there isn't. Not even one.
4 Litotes is alive and well in the hacker vocabulary.
5 In the peer-reviewed scientific process you would expect replicability to be part of the deal, but that doesn't always go down to the software. Depending on the journal, a paper presenting a biomathematical model and its computational results may not have to include source code. Reviewers will be asked to referee the theory and results without unpicking the implementation. This is an obvious failure of transparency; while it is unlikely to substantially mislead the long-term progress of science, it may throw a few caltrops onto the road.

Posted by matt at February 4, 2007 09:33 PM