‘World Wide Web of cancer research’ exploits human genome map

April 15, 2008
By Heather Havenstein


March 26, 2008 (Computerworld) In June 2000, President Bill Clinton and British Prime Minister Tony Blair unveiled what amounted to a "rough draft" of the deciphered human genome, a milestone in the effort to crack the complex genetic code that shapes human development.

The mapping of the human genome, whose completion was announced in April 2003, depended heavily on advanced computing for the data-intensive task of sequencing its 3 billion base pairs.

Ironically, getting that genetic data into the hands of biomedical researchers has created another major computer quandary: the need for even more advanced systems that can keep up with an increasing number of disease subcategories being discovered through genetic research.

The National Cancer Institute took on that issue in 2003 by launching what it called the largest IT project in the history of biomedical research. The NCI created what is, in essence, a World Wide Web of cancer research.

The Cancer Biomedical Informatics Grid, or caBIG, promises to help researchers, physicians and patients across the country to better share more-detailed information about diseases and thus speed the development of new drugs and treatments for them.

The government-funded effort costs about $20 million a year, the NCI said.

To date, 42 of the institute’s 63 national cancer centers are either linked to the caBIG grid or are installing the necessary infrastructure to participate. Many are already building applications that can be shared by members of the grid.

The need for wider data sharing became obvious as genetics research found more subcategories of cancers that would require specific treatment methods.

Traditionally, cancer researchers focused on studying a relatively small number of disease categories, such as lung cancer, breast cancer or colon cancer. But as the genome work expanded, many disease subtypes were discovered within those categories, and each may require a different treatment.

Cancer researchers quickly saw the need to assemble as much information as possible to help in the development of new disease-specific treatment options. So, to broaden the number of data sources, the NCI has begun expanding the grid to include the community hospitals and physicians that treat 80% of U.S. cancer patients.

Interoperability

Project backers said that researchers decided early on to focus on improving interoperability rather than forcing research organizations to standardize on expensive new IT systems and software.

To accomplish that, the developers used the Globus Toolkit, a set of open-source tools for building grid systems and applications that run on top of Web services that are open for anyone with a node on the system. The Globus tools are distributed by the Globus Alliance.

Developers also created a collection of tools that serve up semantic descriptions of vocabulary and data so that both humans and machines can interpret data from dissimilar systems. And a common security model was built to allow research centers to run caBIG as a distributed infrastructure that lets each participant create individual policies to determine who can author or access data.
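
The idea of common vocabularies can be illustrated with a minimal sketch in Python. This is not caBIG's actual tooling: the site names, field names and the set of common data elements below are all hypothetical, chosen only to show how records from dissimilar systems might be translated into one shared vocabulary so that they become comparable.

```python
# Hypothetical common data elements; the names are illustrative, not caBIG's.
COMMON_ELEMENTS = {"patient_id", "diagnosis", "age_years"}

# Each (made-up) site declares how its local field names map to the shared ones.
SITE_MAPPINGS = {
    "site_a": {"pid": "patient_id", "dx": "diagnosis", "age": "age_years"},
    "site_b": {
        "PatientNumber": "patient_id",
        "Diagnosis": "diagnosis",
        "AgeAtVisit": "age_years",
    },
}


def to_common(site, record):
    """Translate one site's record into the shared vocabulary.

    Fields with no mapping are dropped; a record that cannot supply
    every common element is rejected.
    """
    mapping = SITE_MAPPINGS[site]
    translated = {mapping[k]: v for k, v in record.items() if k in mapping}
    missing = COMMON_ELEMENTS - set(translated)
    if missing:
        raise ValueError(f"{site} record lacks common elements: {sorted(missing)}")
    return translated


record_a = to_common("site_a", {"pid": "A-17", "dx": "breast cancer", "age": 54})
record_b = to_common(
    "site_b",
    {"PatientNumber": "B-03", "Diagnosis": "breast cancer", "AgeAtVisit": 61, "Ward": "4E"},
)
```

After translation, both records carry the same keys, so a query written against the shared vocabulary works regardless of which site supplied the data. Each site keeps its own system and only publishes a mapping, which reflects the interoperability-first approach described above.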

In addition, Ken Beutow, director of NCI’s Center for Bioinformatics, said the NCI has set up "workspaces" — groups of people that meet regularly to discuss specific domains of work, such as tissue banks and pathology tools. The workspace groups provided input on building the common vocabularies and data elements, he noted.

Robert Annechiarico, director of cancer center information systems at Duke University, which has already helped build applications for the grid, said that creating the common data elements is particularly important for academic researchers. "Academic medical centers are a community of fiefdoms bound together by a common parking problem," he explained.

Researchers at Duke contributed to the development of two caBIG applications, the Cancer Central Clinical Database and the Cancer Central Clinical Participant Registry.

The latter application, a Web-based tool for managing clinical trial data across multiple cancer centers, can provide researchers with access to records about patients suffering from one of the new subcategories of cancer.

"Where I might see five patients a year with a particular disease, now I can see 50," Annechiarico said.

Duke is using the former application in a $6.8 million research project, funded by the U.S. Department of Defense’s Breast Cancer Research Program, to study how genomic profiling can be used to guide treatment plans for women with newly diagnosed breast cancer, he added.

In addition to expanding access to specific data sets, caBIG can increase the safety of clinical trials for patients, noted Warren Kibbe, director of bioinformatics at the Robert H. Lurie Comprehensive Cancer Center at Northwestern University in Evanston, Ill.

For example, he said, development of a caBIG clinical trial management application would allow researchers to determine the adverse effects of a single medication used in multiple clinical trials. "That is one example of how caBIG is starting to touch patients in a way that hasn’t been possible," Kibbe added.
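
The kind of cross-trial lookup Kibbe describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the caBIG application: the report format, trial identifiers, drug names and events are all invented, and a real system would draw the reports from the participating centers instead of a hard-coded list.

```python
from collections import Counter

# Hypothetical adverse-event reports pooled from several trials.
# Trial IDs, drug names and events are made up for illustration.
REPORTS = [
    {"trial": "T1", "drug": "drug_x", "event": "nausea"},
    {"trial": "T2", "drug": "drug_x", "event": "nausea"},
    {"trial": "T2", "drug": "drug_x", "event": "rash"},
    {"trial": "T3", "drug": "drug_y", "event": "fatigue"},
]


def adverse_events_for(drug, reports):
    """Count each adverse event reported for one drug across all trials."""
    return Counter(r["event"] for r in reports if r["drug"] == drug)


def trials_reporting(drug, event, reports):
    """List the trials in which a given drug produced a given event."""
    return sorted(
        {r["trial"] for r in reports if r["drug"] == drug and r["event"] == event}
    )


summary = adverse_events_for("drug_x", REPORTS)
nausea_trials = trials_reporting("drug_x", "nausea", REPORTS)
```

Pooling reports this way surfaces an event that appears only once or twice in any single trial but repeatedly across trials of the same medication.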

The open-source Patient Study Calendar application now in development at the center will be used for patients in clinical trials, he noted. Among other things, the application will be able to tell patients how much medication to take and when.

The single application could define patient management parameters, eliminating some of the problems that result when doctors with different types of training — a surgeon versus an oncologist, for example — interpret rules differently, Kibbe said.

Implementing caBIG has not been without challenges, according to an NCI-commissioned review of the project that was released late last year.

The report found that over the life of the effort, from 2003 to 2007, developers have not focused enough on the needs of end users and have too often released buggy products.

Beutow said the report prompted the NCI to "redouble" its efforts to provide better technical support to users. The agency now sends updates on the program to user e-mail lists, has created Web sites with caBIG information and launched a telephone help line to provide technical support to users.

Long Road Ahead

At the same time, the caBIG program is in the midst of an expansion that will link community health care providers to the grid and its 40-plus applications. To date, 1 have signed up to join the program.

And national cancer centers in the U.K. are in the process of building an infrastructure to become "caBIG-enabled," Beutow added.

He urged that health care organizations use caBIG and other IT resources to further extend biomedical research, following the lead of the financial services industry and the Department of Defense.

Len Lichtenfeld, deputy chief medical officer of the American Cancer Society, noted that projects like caBIG are critical to science but still have a long way to go.

"We haven’t even begun to scratch the surface of how we can cooperate and share data," Lichtenfeld said. Taking advantage of the "explosion of information" generated by genomic research is going to take a tremendous amount of infrastructure development — and time, he added.

"I am 61 years old, [and] I would hope we are able to see some of this connectivity before I am gone from this earth," he noted. "It is going to take us another generation until we see the type of applications where we can put it directly into affecting patient care."

Nonetheless, the NCI’s parent organization, the National Institutes of Health, is already holding up caBIG as a model for sharing research and treatment data associated with other illnesses, such as cardiovascular disease.

"This change in medicine is revolutionary," said NCI’s Beutow. "We have the capacity now to look and see how an individual might respond to a particular therapeutic approach."

David Steffen, director of the Bioinformatics Research Center at Baylor College of Medicine in Houston, noted that his organization is now working under caBIG’s auspices to find a way to use the grid to share cardiovascular disease research data.

Steffen said he envisions a time when this type of technology could evolve to support some of the genetic advances shown in the 1997 science fiction film Gattaca, in which DNA analysis at birth could predict the likelihood of disease.

"The goal is to look at this [genetic] sequence and say, ‘Aha, you have this combination of genes which predisposes you to heart disease,’" Steffen noted. "It won’t be much longer before we’ll be able to routinely do that at birth. [The caBIG grid] is going to have complete, unexpected and very dramatic impacts on the pace of medical research."

CaBIG is also working with President George W. Bush’s Office of the National Coordinator for Health IT, which oversees the development of electronic health records, to ensure that the EHRs can include details about a person’s genetic makeup.
