Australia's largest cancer research agency, Melbourne's Peter MacCallum Cancer Centre, is linking up with the national information and communications technology agency, NICTA, to develop adaptive software to sift huge datasets from microarray chips for patterns of gene activity distinctive to different cancers.
Peter Mac geneticists will collaborate with NICTA's Statistical Machine Learning (SML) research program to develop software that can accurately diagnose cancers and select therapies appropriate to each patient's personal prognosis.
Moore's Law is driving microarray technology down into the low-micron realm, threatening to drown cancer geneticists in a tidal wave of data.
Peter Mac research director Prof David Bowtell said that until recently, cancer researchers used simple microarrays with only 5000 to 10,000 elements to profile gene-expression patterns in cancerous tissues.
Bowtell said researchers now used 30,000-element microarrays, and were moving to 100,000-element chips. Affymetrix's latest GeneChip Exon 1.0 ST array has more than 5.5 million elements, allowing researchers to perform whole-genome scans of activity within exon clusters.
The company is already working on a 1.5-micron chip with 100 million elements that will allow whole-genome scans at a resolution capable of identifying the myriad permutations of alternatively spliced cDNAs from individual genes.
Peter Mac has been accumulating high-quality datasets involving hundreds of samples with detailed clinical annotation, obtained using relatively basic analytical techniques. They carry information about gene expression, gene copy number, and gene deletions.
Opportunity to innovate
Dr Adam Kowalczyk, a senior researcher with NICTA's SML program, said the challenge of unlocking information from such large datasets provided an excellent opportunity to develop innovative approaches to data analysis and data mining.
Kowalczyk said initial efforts would concentrate on developing an IT methodology for discerning biologically meaningful and clinically useful knowledge from microarray profiles of tissue, with a focus on cancer diagnosis and treatment. Researchers will then move to develop practical algorithms and software tools for testing, demonstration and clinical use.
Bowtell said the interaction with NICTA's machine-learning experts was vital, given the need for sophisticated computer-learning techniques to achieve accurate diagnoses.
The Peter Mac already had 250 annotated samples from early, 10,000-element arrays, and 230 samples from ovarian cancers analysed with 58,000-element arrays, Bowtell said. "Our data set from the Australian Ovarian Cancer Study is much larger than anything published, by a large margin -- it's the largest biospecimen set in the world," he said.
Analysing datasets from 100 million-element microarray chips would pose a "massive problem" in pattern recognition. "The problem of over-fitting becomes enormously difficult, because there are so many opportunities for chance associations -- we need very clever software to detect real associations in gene activity," Bowtell said.
"It's not just gene activity. We're also searching for information in regions of gene amplification, duplication and loss in cancerous cells, and then trying to associate the signals with the patient's clinical response -- whether they responded, or failed to respond, to a particular therapy.
"If we see a gain in activity in one region, and a loss in another, is it clinically relevant? When we start to look at the potential pair-wise combinations, the numbers become very challenging."
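To give a rough sense of the scale Bowtell describes, the short Python sketch below counts the pair-wise combinations for the array sizes mentioned in this article and estimates how many of those pairs would look associated purely by chance. The function names and the 0.001 significance threshold are illustrative assumptions, not part of the Peter Mac or NICTA work, and treating every pair as an independent test is a simplification.

```python
import math


def pairwise_combinations(n_elements: int) -> int:
    """Number of unordered pairs among n_elements, i.e. n * (n - 1) / 2."""
    return math.comb(n_elements, 2)


def expected_chance_hits(n_tests: int, alpha: float) -> float:
    """Expected number of tests that pass threshold alpha by chance alone,
    assuming no real associations exist and the tests are independent."""
    return n_tests * alpha


if __name__ == "__main__":
    alpha = 0.001  # illustrative threshold (an assumption, not from the article)
    # Array sizes quoted in the article: early arrays, ovarian-study arrays,
    # and the planned 100 million-element chip.
    for n in (10_000, 58_000, 100_000_000):
        pairs = pairwise_combinations(n)
        hits = expected_chance_hits(pairs, alpha)
        print(f"{n:,} elements: {pairs:,} pairs, about {hits:,.0f} chance hits")
```

Even for a 10,000-element array the count runs to tens of millions of pairs, far more than the few hundred annotated samples available, which is why chance associations are so easy to find and over-fitting is so hard to avoid.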
Bowtell said that by establishing links between patterns of gene activity in individual patients and their response to particular therapies, oncologists could perform prognostic analyses on individual patients and tailor therapies accordingly. For instance, breast cancers over-expressing the HER2 receptor would be candidates for treatment with Herceptin, a monoclonal antibody that blocks that receptor.
"Clinical decision-making now involves lots of molecular information. We now need to do proof-of-principle experiments, and translate them into something that alters clinical practice," Bowtell said.