Friday, November 14, 2008

Making Sense in Biology

 
When I teach students how to read the scientific literature, I caution them not to believe everything they read. Science is, by its very nature, tentative and exploratory. Much of what is published doesn't get confirmed and is quietly ignored. Many of the ideas and speculations that are published never amount to anything. Some experiments are flawed. Many skim over hidden assumptions so that the conclusions aren't valid.

How do you tell the difference between the wheat and the chaff? Well, for one thing, you ask yourself whether the results "make sense" in light of what you already know. Are there any basic principles of biology that conflict with the conclusions? You always have to be on the lookout for papers that just don't fit in with your current model of how things work.

There are two potential problems with this approach. First, your model may not be correct. Maybe you don't know enough to make a judgment. Second, it prevents you from recognizing truly novel results that may change your idea of what makes sense.

The first problem is curable. The second is more serious. Science is basically conservative in its acceptance of new ideas. This may seem like a bad thing but, in fact, it's the only way to do good science. You simply can't afford to believe in several paradigm shifts every day before breakfast because most of them will turn out to be wrong. Today, when scientists want to convince their colleagues of something new that may not "make sense", they are obliged to present solid evidence that will convince the skeptics. It's an uphill fight. And it should be.

One of my colleagues has been following the discussion about alternative splicing and he directed my attention to a paper he just published in Nature Genetics. He pointed out that far from being an overestimate of alternative splicing, the EST data actually underestimates the extent of alternative splicing.

The paper by Pan et al. (2008) makes two extraordinary claims.
  1. Their data indicates that about 95% of all multiexon human genes undergo alternative splicing.

  2. They estimate that there are, on average, seven (7) alternative splicing events per multiexon human gene.

Neither of these claims makes sense. It's not reasonable to assume that most conserved housekeeping genes produce variants by alternative splicing, yet that's exactly what would have to happen if 95% of all genes undergo alternative splicing. It means that most genes for things like metabolic enzymes, RNA polymerase, ribosomal proteins and transport proteins will have variants due to alternative splicing. This doesn't make sense from an understanding of biochemistry and it doesn't make sense in light of evolution.

That's good reason to be skeptical.

But surely the data must be convincing? Surely the proponents of these extraordinary claims have extraordinary data to back their case?

Frankly, I don't know. I can't evaluate the Pan et al. (2008) paper because I have no idea how they actually do their experiments and whether those experiments are reliable. Part of the problem is that the authors don't tell me enough and part of it is that this is unfamiliar technology (to me).

All I know is that it doesn't make sense. I've asked the author to give me some specific examples of alternative splicing predictions for common genes, like those in the citric acid cycle. By looking at specific, rather than global, data it might be possible to see whether the results make sense.


Pan, Q., Shai, O., Lee, L.J., Frey, B.J. and Blencowe, B.J. (2008) Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nature Genetics, published online Nov. 2, 2008. [DOI:10.1038/ng.259]
