Open QSAR
Open QSAR is a new site for the development and assessment of comparative QSAR modelling methodology (meta QSAR), as well as being available for QSAR model building and multi-property prediction. The site uses an automated QSAR building and prediction software system, known as the Discovery Bus, that has been developed over the last few years.
At this stage the site displays 15,000 QSAR models as well as validation data for around 100 properties extracted from the WOMBAT database of Prof. Tudor Oprea of the University of New Mexico. We evaluate the models against the criteria outlined by Tropsha et al. in their paper “The Importance of Being Earnest”, QSAR Comb. Sci. 22 (2003), 69.
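To give a feel for what evaluating a model against such criteria involves, here is a minimal Python sketch. It is not the Discovery Bus code. The thresholds and formulas are the commonly quoted form of the Tropsha-style external validation checks as we understand them (q2 > 0.5, R2 > 0.6, R2 close to the through-origin R0^2, and a through-origin slope k between 0.85 and 1.15), so treat them as assumptions and check them against the paper before relying on them. The function name `tropsha_checks` and the example numbers are purely illustrative.

```python
from statistics import mean


def tropsha_checks(observed, predicted, q2):
    """Compute external-validation statistics for one model and flag each check.

    observed / predicted: activities for the external test set.
    q2: cross-validated q2 from the training set, supplied by the caller.
    Thresholds are assumed values, not taken from the Open QSAR site.
    """
    y_bar, p_bar = mean(observed), mean(predicted)
    s_yy = sum((y - y_bar) ** 2 for y in observed)
    s_pp = sum((p - p_bar) ** 2 for p in predicted)
    s_yp = sum((y - y_bar) * (p - p_bar) for y, p in zip(observed, predicted))

    # Squared Pearson correlation between observed and predicted values.
    r2 = s_yp ** 2 / (s_yy * s_pp)

    # Slope of the regression of observed on predicted, forced through the origin.
    k = sum(y * p for y, p in zip(observed, predicted)) / sum(p * p for p in predicted)

    # Coefficient of determination for that through-origin regression.
    r0_2 = 1 - sum((y - k * p) ** 2 for y, p in zip(observed, predicted)) / s_yy

    checks = {
        "q2 > 0.5": q2 > 0.5,
        "R2 > 0.6": r2 > 0.6,
        "(R2 - R0^2)/R2 < 0.1": (r2 - r0_2) / r2 < 0.1,
        "0.85 <= k <= 1.15": 0.85 <= k <= 1.15,
    }
    return {"r2": r2, "k": k, "r0_2": r0_2, "checks": checks, "passed": all(checks.values())}


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    result = tropsha_checks([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.1], q2=0.7)
    for name, ok in result["checks"].items():
        print(f"{name}: {'pass' if ok else 'fail'}")
```

The full criteria also allow the same tests on the reverse regression (predicted on observed); this sketch checks only one direction to keep it short.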
We will shortly be extending this to include the remainder of the WOMBAT database as well as the recently released ChEMBL database, comprising 2 million data points for over 5,000 targets. This study, which we are calling the “Mother of All QSAR’s”, is supported by Microsoft Research, who are providing the Azure cloud computing nodes that we need to reduce the time required from several years to a few weeks. We expect this study to create a comprehensive QSAR model library that can be used for multi-property predictions in drug discovery. We are now testing its use for predictions and will make that available shortly. It is also our intention to offer the site for training models.
This is the first of a series of posts on the technical capabilities of this new resource and announcements as it develops. We would be grateful for feedback and questions.
The Micro Pharma
Pre-clinical drug discovery has traditionally been the preserve of large fully integrated pharmaceutical companies (FIPCO’s) in established geographical centres in the UK, US and Europe. It relied on tight integration of laboratory facilities and drug design expertise within monolithic mega-research facilities. As laboratory science has become more industrialised and experimental data resources have become more widely available through the internet, drug discovery is moving out from the mega-Pharma into thousands of small and mid-size Biotech or Pharma research businesses. These new “mini-Pharma” entrants usually focus on developing some biological insight emerging from academic research into potential drug development candidates, and then license those to the mega-Pharmas that have the financial muscle to carry the costs of clinical development.
Although the mini-Pharma companies are smaller and operate more cost-effectively, they often operate as smaller versions of their larger role models. Their productivity may sometimes be higher, but they are not radically different. They remain expensive to set up, fixed costs are high and the majority fail commercially. Nevertheless, driven by unmet medical needs, growing populations and increasing wealth, there is a very large market for new clinical development candidates to fill the empty pipelines of the large FIPCO’s.
The revolution in information technology, together with emerging developments such as the semantic web, cloud computing and community data and software resources, is now creating the right circumstances for a second wave of change. This change will be driven by solutions for managing information, extracting knowledge and making decisions in virtual organisations, which will create opportunities for new entrants to this very valuable market, especially those with strong IT expertise. The new entrants will be “micro-Pharmas” that minimise fixed costs by using pay-as-you-go laboratory and computing services and exploit Web 2.0 technologies to access globally available services and expertise on demand. Driven by their expertise in information systems, they will deliver new medicines at low unit cost and lower risk. These new micro-drug discovery companies will stimulate local markets for services and grow core expertise in biotechnology activities that create wealth for their investors and communities.
In this emerging landscape for Pharmaceutical research, the mega-Pharmas will increasingly focus on clinical development and commercialisation, in-licensing their products from high productivity mini- and micro-Pharmas. Services will be acquired from the best providers that operate globally and offer services on demand.
This emerging landscape creates new business opportunities in the provision of on-demand services and in the formation of micro-Pharmas, which may operate as businesses, charitable foundations or potentially as participants in open source drug discovery.
Independent Science
In a post earlier in the year I talked about the independent scientist and how very experienced people were stepping out of their careers with sufficient income to engage in science as independents, assuming that they could find ways to reproduce the networks and tools that they needed. In a somewhat plaintive cry from a commenter, I was asked how that might work if you didn’t have a made-up pension or other income following a well paid career but wanted to continue working in research. Roughly translated, I think that means “this may work for you old gits, but what about us that still have our hair and eyesight”. Having confidently said I would reply soon, it has taken me 5 months to think of an answer. Hubris.
Because I am not sure you can. At least, I can see ways to make an independent living as a young scientist, but it may not end up being research and it may mean that you do a lot of things that you don’t want to do. It depends a lot on what your research area is, what motivates you and how committed you are. So how could you do this?
Nutters on the Bus
Having argued in the previous post that open science lets the general public engage in science and that this is a Good Thing, I’d like to backtrack slightly and qualify my remarks. I think open science will be a bit like public transport. For some things it can get you places quicker and it has a community value, but is not for everyone or for all journeys. It means that you have to mix with the public, and you can meet some interesting people that way, but you do have to be a little circumspect. Most importantly, you don’t want to sit near the Nutter on the Bus and if you do, whatever else happens, don’t make eye contact.
Beyond the Pale
Enthusiasts for open science come from many backgrounds and see different advantages. For me, open science is what science is supposed to be: open publishing of results and data in such a way that the “educated layman” can repeat the work, check its validity and potentially come to alternative conclusions. After all, science is the process, the debate, not the corpus. The corpus changes with new data, new interpretations and new theories; it’s the continuity of debate which is constant. Not all scientific research should be open. It requires confidentiality to take research through to a commercial product, but if there is an expectation of gaining credit from the work, then it should be. The other argument is that open data and open publishing create radically new opportunities for mining information and realising the potential of the semantic web in building new kinds of scientific understanding.
All of these are noble goals, and scientists that go open and publish their data, software, methods and conclusions, especially if they do it “live” as in Open Notebook Science, are pioneers in an exciting new world of collaborative, transparent scientific discovery. One such pioneer is Steve McIntyre, whose blog I link to here. He operates fully “live”, openly publishing and archiving his data, software and interpretations as he does the work, and leaving himself open to continuous peer review by all comers at his blog as well as elsewhere. If he makes a mistake (which is unusual) it is quickly challenged. There are many good examples of his work to see on his blog and a recent analysis of his is representative. As an experiment in the sociology of open science and a window into what open science might look like, I find it fascinating to observe.
However, despite his site winning the Science blog of the year in 2008, his pioneering of open notebook science and his blog’s very high traffic, I don’t see any recognition of what he does in open science “circles” or in the mainstream scientific debate about open publishing. I don’t know why that is, because I see the same comments on his site about the importance of open data, transparency and rigorous analysis as elsewhere. What’s more, he doesn’t just talk about it, he does it for real and does it well. I can’t help wondering whether this is because he is a climate change denier, someone whose scientific analysis undermines the basis of claims for man-made global warming. As such it is possible he is largely ignored because he is seen as “Beyond the Pale” and excluded from the respect I think he (and others) are due for their courage in working openly and publishing live.
I agree he is “Beyond the Pale”, but not in the modern sense of being outside what should be regarded as acceptable to polite society. Rather, I think he is Beyond the Pale as in its original meaning and if interpreted in that way, what he does, how he works and the reaction he gets has very important lessons for any scientist, whether working in the open or not.
Bacon and Eggs
The great Northern cities of England grew out of the world’s first industrial revolution and its specialisations for Cotton (Manchester), Wool (Bradford), Ships (Newcastle) and other products. The Universities, often started and funded by successful Victorian entrepreneurs to do research to devise newer products, also contributed to better educational opportunities for many and the growth of new knowledge that improved health care, nutrition and the environment. They and their founders saw themselves as Civic Universities, adding to the culture of the city as well as its wealth. They committed their time, energy, knowledge and innovation to their local citizens, and they did it because that is what they were for.
Today’s Universities operate as businesses, selling their time, energy, knowledge and innovation to their fellow citizens for as much as they can get. They compete with real businesses. They may say that they are Civic Universities, and they may appear to be involved in the life of their city, but they don’t show the commitment of their predecessors. They may be involved, but they are not committed.
An investor of mine once explained the difference between involvement and commitment when I had said something about “being involved” in the new business we were launching. “Involvement’s no good to me”, he said, “I need commitment. Just like with Bacon and Eggs, the Hen may be involved, but the Pig is committed”.
Walled Gardens
There is a nice quote in Wikinomics which I like, particularly the bit about walled gardens
The quote is largely referring to new-style on-line business and the importance of engaging users and helping them to establish communities, but it seems to me that this is also very relevant advice for the modern University. Universities make money from Teaching, Research and Technology Transfer. They are usually good at teaching and can turn a profit, because they generally know what they are doing. Research Contracts bring in cash, although less than they might if they got their overheads down. However, most Universities lose money from spin-out and licensing, and this is bad for them, bad for the academics and most importantly, it is very bad for the community in which they live.
Why is this? They build Walled Gardens.
Heroes and Villains
I am part way through watching Series 3 of “Heroes”, and enjoying the interesting twists of character and plot. For those that don’t know it, the basic premise is that an unusual set of genetic mutations has produced new kinds of individuals that are able to suspend the laws of physics and fly, move in time, read minds, or whatever else the scriptwriters can come up with in time for the next episode. Very enjoyable stuff, even if (or because of) the science being so “creative”. Still, where would popular culture be without warp drive and inertial dampeners? The latest series has an interesting twist in that some of the Heroes of the first and second series are now turning into Villains and some of the Villains are becoming Heroes. Even so, I still don’t buy the argument that Sylar should be let off his conduct (i.e. slicing off the skulls of his victims in order to suck out the brains) in the earlier series, just because he has an eating disorder and his mother had him adopted. So it is getting confusing and unsettling, but very watchable.
When it comes to the Pharmaceutical industry and health care, popular culture has no such uncertainty about who is the hero and who is the villain. The litany of complaint is well known. Drug companies extort obscene profits from the sick, corrupt honest scientists and doctors through bribes, abuse the poor of the third world as human guinea pigs, and force dangerous, untested new drugs on us. We know that because we read it in the news and watch films like “The Fugitive” and “The Constant Gardener”.
Still, it’s only entertainment, no harm done, and anyway, everyone knows who is the villain …
Inkspot Steps Up
We are now working with the first closed Beta version of the Inkspot Science on demand site as well as the workflow design desktop client that currently acts as an IDE for adding services, managing data and sharing both. The Inkspot desktop client is a very nice implementation of a workflow design tool that makes it easy to build workflows from standard blocks and manage these on the Inkspot server. We have been adding specialist blocks for data mining and cheminformatics and plan to extend these depending on what the beta test users need. I particularly like the scripting block which lets me write simple R scripts which are implemented using an Rserve server on the Inkspot site. The (currently) small number of blocks for cheminformatics were added using a few bits of the CDK toolkit along with some charting, modelling and data manipulation tools. We are expecting to be adding a great deal more in the near future connected with research projects that we are keen to support and would be happy to work with collaborator groups to tailor the content, particularly if those groups would be willing to help us improve the site and help us direct its development.
The server site has a simplified version of the workflow editor as well as a “lab notebook” blogging mechanism that makes it easy to add scientific content such as data, workflows, references and so on. This is an early focus of what we are doing to simplify the process of data analysis and communication. Of course the site allows the user to form groups and share objects, although the default is private. We have two major research collaborations already underway and I hope to be able to say more about both soon. Although it is early days for us, the site is making excellent progress and we are excited about the prospects in the coming year.
If you have a research project that could benefit from using Inkspot as a hosting and collaboration mechanism, or if you’d like to use the site in your business, get in touch and we will arrange a web demo and conversation.
The Winner’s Curse
Most of us mix up the meaning of Biotech and Pharma, since both do pretty much the same thing: invent new drugs. We tend to use Pharma to describe the FIDDCO, i.e. a fully integrated Drug Discovery and Development Company, whereas Biotech companies are more likely to be emergent and on some pathway towards full integration, driven by their own research. But as I remember it, Biotech was originally used to describe the new “big molecule” companies inspired by the early success of Amgen and Genentech in inventing protein drugs. However, when this early success proved hard to reproduce, most biotechs stuck to the small classical organic molecules which still dominate clinical practice. This is now changing, as the majors move to acquire or develop a protein drug component for their pipelines. Some argue that we are likely to see protein drugs as 50% of Pharma pipelines. Indeed, with the acquisitions of CAT and MedImmune, AstraZeneca now has 27% of its current pipeline as proteins, and that’s a big change for a company that grew out of manufacturing paints and dyestuffs.
Why is this happening now? No doubt the science has improved and people have gained much more experience in the technologies required to develop and manufacture proteins to the demanding standards required of a new drug, but there are also significant commercial drivers: proteins are expensive and the technology barrier to cloning by generics is higher.
If this continues there will be big changes coming …

