Interdisciplinary Faculty Seminar in Bioinformatics*
Department of Computer & Information Science
A Frontier of Bioinformatics: Investigation into the Feasibility of Detecting Microscopic Disease Using Machine Learning
Dr. Jack Yang
Harvard University
Time: Thursday, November 5th at 5:30pm
Place: Fordham University, Bronx, NY, John Mulcahy Hall (JMH) 112
Refreshments: served at 4:30pm in JMH 312
Brief Bio:
Dr. Jack Yang received both his Ph.D. and M.S. degrees from Purdue University, West Lafayette main campus, and his postdoctoral training from Harvard Medical School and Indiana University School of Medicine. He also received training in biostatistics and bioinformatics from Johns Hopkins University, and he was a faculty member of Indiana University. Dr. Yang was trained as a combined experimental and computer scientist and has more than 15 years of teaching, research and engineering practice experience in biomedical engineering and computational science. He has received a number of outstanding achievement and best paper awards.
Dr. Yang is the Editor-in-Chief of the International Journal of Functional Informatics and Personalized Medicine and an honorary consulting editor of the International Journal of Computational Biology and Drug Design. He has also been an editor of more than a dozen journals and proceedings books. He was the general chair of IEEE Bioinformatics and Bioengineering at Harvard Medical School in 2007. Dr. Yang has delivered many invited talks, including a number of keynote lectures, to promote the emerging field of functional informatics and personalized medicine. He has published more than 100 peer-reviewed papers and book chapters. He specializes in cancer biology and artificial intelligence. He is the chair of the board of directors of the International Society of Intelligent Biological Medicine.
http://en.wikipedia.org/wiki/Jack_Yang
http://www.spl.harvard.edu/pages/People/jyang
Abstract:
The prognosis for many cancers could be improved dramatically if they could be detected while still at the microscopic disease stage. We are investigating the possibility of detecting microscopic disease using statistical learning approaches based on features derived from gene expression levels and metabolic profiles. We use immunochemistry and QRT-PCR to measure the gene expression profiles of a number of antigens, such as cyclin E, P27KIP1, FHIT, Ki-67, PCNA, Bax, Bcl-2, P53, Fas, FasL and hTERT, in several particular types of neuroendocrine tumors, such as pheochromocytomas and paragangliomas, and in the adrenocortical carcinomas (ACC), adenomas (ACA), and hyperplasia (ACH) in Cushing's syndrome. We provide statistical evidence that higher expression levels of antigens such as hTERT, PCNA and Ki-67 are associated with a higher risk that the tumors are malignant or borderline, as opposed to benign.
We also investigated whether higher expression levels of antigens such as P27KIP1 and FHIT are associated with a decreased risk of adrenomedullary tumors. While no significant difference was found in the expression levels of cell-arrest antigens such as P27KIP1 among malignant, borderline, and benign tumors, there was a significant difference between the expression levels of such antigens in normal adrenal medulla samples and in adrenomedullary tumors. It follows from a comprehensive statistical analysis that a number of antigens such as hTERT, PCNA and Ki-67 can be considered as cancer markers, while another set of antigens such as P27KIP1 and FHIT are possible markers for normal tissue. Because more than one marker must be considered to obtain a classification of cancer or no cancer, and, if cancer, to classify it as malignant, borderline, or benign, we must develop an intelligent decision system using machine learning techniques, including variants of support vector machines, neural networks, decision trees, self-organizing feature maps (SOFM) and recursive maximum contrast trees (RMCT).
The variants and algorithms we developed tended to work very well, yielding an average accuracy that was generally in excess of 90%. Our framework focused not only on different classification schemes and feature selection algorithms but also on ensemble methods such as boosting and bagging, in an effort to improve upon the accuracy of the individual classifiers. It is evident that when machine learning and statistical learning techniques are combined appropriately into one integrated intelligent medical decision system, the prediction power can be enhanced significantly.
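For readers unfamiliar with bagging, the idea of combining several weak classifiers by majority vote can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the speaker's system: the two "marker" columns, their values, and the labels are synthetic, and the helper names (fit_stump, fit_bagging, predict_bagging) are hypothetical. Each base classifier is a one-threshold decision stump trained on a bootstrap resample of the data.

```python
import random
from collections import Counter

def fit_stump(rows, labels):
    """Return the (feature, threshold, polarity) stump with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for polarity in (1, -1):
                preds = [1 if polarity * (r[f] - t) >= 0 else 0 for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:]

def predict_stump(stump, row):
    """Classify one row with a single stump (1 = positive class, 0 = negative class)."""
    f, t, polarity = stump
    return 1 if polarity * (row[f] - t) >= 0 else 0

def fit_bagging(rows, labels, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return stumps

def predict_bagging(stumps, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]

# Synthetic, made-up scores for two hypothetical markers; label 1 = "tumor-like".
rows = [[0.1, 0.8], [0.2, 0.7], [0.3, 0.9], [0.4, 0.6],
        [0.6, 0.3], [0.7, 0.2], [0.8, 0.4], [0.9, 0.1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

stumps = fit_bagging(rows, labels)
high_marker_sample = predict_bagging(stumps, [0.95, 0.05])
low_marker_sample = predict_bagging(stumps, [0.05, 0.95])
```

Because every stump sees a different resample, individual thresholds vary, and the vote smooths out that variation. Boosting differs in that later classifiers are trained with extra weight on the examples earlier ones got wrong.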
This research has many potential applications, not only in providing an alternative diagnostic tool and a better understanding of the mechanisms involved in malignant transformation subject to environmental changes, but also in providing information that is useful for treatment planning and cancer prevention.
For more information and directions, please contact Ms. Danielle Aprea at (718) 817-4480 or aprea@cis.fordham.edu.
*This talk is sponsored by the Dean of Fordham College at Rose Hill under the Interdisciplinary Faculty Seminar Program 2009-2010.