Fordham University            The Jesuit University of New York

Interdisciplinary Faculty Seminar in Bioinformatics*

Department of Computer & Information Science

A Frontier of Bioinformatics: Investigation into the Feasibility of Detecting Microscopic Disease Using Machine Learning

Dr. Jack Yang
Harvard University

Time:  Thursday, November 5th, at 5:30pm

Place: Fordham University, Bronx, NY, John Mulcahy Hall (JMH) 112

Refreshments: served at 4:30pm in JMH 312


Brief Bio:

Dr. Jack Yang received his Ph.D. and M.S. degrees from Purdue University (West Lafayette main campus) and completed postdoctoral training at Harvard Medical School and Indiana University School of Medicine. He also received training in biostatistics and bioinformatics from Johns Hopkins University. He was a faculty member at Indiana University. Dr. Yang was trained as both an experimental and a computer scientist, with more than 15 years of teaching, research, and engineering practice in biomedical engineering and computational science. He has received a number of outstanding achievement and best paper awards.

Dr. Yang is the Editor-in-Chief of the International Journal of Functional Informatics and Personalized Medicine and an honorary consulting editor of the International Journal of Computational Biology and Drug Design. He has also been an editor of more than a dozen journals and proceedings books. He was the general chair of IEEE Bioinformatics and Bioengineering at Harvard Medical School in 2007. Dr. Yang has delivered many invited talks, including a number of keynote lectures, to promote the emerging field of functional informatics and personalized medicine. He has published more than 100 peer-reviewed papers and book chapters. He specializes in cancer biology and artificial intelligence. He is the chair of the board of directors of the International Society of Intelligent Biological Medicine.


The prognosis for many cancers could be improved dramatically if they could be detected while still at the microscopic disease stage. We are investigating the possibility of detecting microscopic disease using statistical learning approaches based on features derived from gene expression levels and metabolic profiles. We use immunochemistry and QRT-PCR to measure the expression profiles of a number of antigens, such as cyclin E, P27KIP1, FHIT, Ki-67, PCNA, Bax, Bcl-2, P53, Fas, FasL and hTERT, in several types of neuroendocrine tumors, such as pheochromocytomas and paragangliomas, and in the adrenocortical carcinomas (ACC), adenomas (ACA), and hyperplasia (ACH) in Cushing's syndrome. We provide statistical evidence that higher expression levels of antigens such as hTERT, PCNA and Ki-67 are associated with a higher risk that the tumors are malignant or borderline, as opposed to benign.

We also investigated whether higher expression levels of antigens such as P27KIP1 and FHIT are associated with a decreased risk of adrenomedullary tumors. While no significant difference was found in the expression of cell-arrest antigens such as P27KIP1 among malignant, borderline, and benign tumors, there was a significant difference between expression levels of such antigens in normal adrenal medulla samples and in adrenomedullary tumors. It follows from a comprehensive statistical analysis that a number of antigens such as hTERT, PCNA and Ki-67 can be considered as cancer markers, while another set of antigens such as P27KIP1 and FHIT are possible markers for normal tissue. Because more than one marker must be considered to obtain a classification of cancer or no-cancer, and if cancer, to classify it as malignant, borderline, or benign, we must develop an intelligent decision system using machine learning techniques, including variants of support vector machines, neural networks, decision trees, self-organizing feature maps (SOFM) and recursive maximum contrast trees (RMCT).

The variants and algorithms we developed tended to work very well, yielding an average accuracy that was generally in excess of 90%. Our framework focused not only on different classification schemes and feature selection algorithms but also on ensemble methods, such as boosting and bagging, in an effort to improve upon the accuracy of the individual classifiers. It is evident that when machine learning and statistical learning techniques are combined appropriately into one integrated intelligent medical decision system, the predictive power can be enhanced significantly.
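
As a rough illustration of the bagging idea mentioned above, the following minimal Python sketch trains several one-feature threshold classifiers on bootstrap resamples and combines them by majority vote. It is not the speaker's system: the data are synthetic stand-ins for a single marker's expression score, and the names `train_stump`, `bagging_fit`, `bagging_predict`, and `SYNTHETIC` are hypothetical.

```python
import random
from collections import Counter

# Synthetic (score, is_tumor) pairs; NOT real expression data.
SYNTHETIC = [(0.1, False), (0.2, False), (0.3, False), (0.4, False),
             (0.6, True), (0.7, True), (0.8, True), (0.9, True)]

def train_stump(samples):
    """Fit a threshold classifier: predict tumor when score >= cut.

    Tries each observed score as a cut and keeps the one with the
    fewest training errors (first such cut on ties).
    """
    best_errors, best_cut = None, None
    for cut, _ in samples:
        errors = sum((x >= cut) != label for x, label in samples)
        if best_errors is None or errors < best_errors:
            best_errors, best_cut = errors, cut
    return best_cut

def bagging_fit(samples, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of samples."""
    rng = random.Random(seed)
    cuts = []
    for _ in range(n_models):
        # Bootstrap: draw len(samples) items with replacement.
        boot = [rng.choice(samples) for _ in samples]
        cuts.append(train_stump(boot))
    return cuts

def bagging_predict(cuts, x):
    """Majority vote over the individual stump predictions."""
    votes = Counter(x >= cut for cut in cuts)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    model = bagging_fit(SYNTHETIC)
    print(bagging_predict(model, 0.95))  # score above every training score
    print(bagging_predict(model, 0.05))  # score below every training score
```

Each stump sees a slightly different resample, so the individual thresholds vary; the vote averages out that variation, which is the sense in which bagging can improve on a single classifier.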

This research has many potential applications, not only in providing an alternative diagnostic tool and a better understanding of the mechanisms involved in malignant transformation subject to environmental changes, but also in providing information that is useful for treatment planning and cancer prevention.


For more information and directions, please contact Ms. Danielle Aprea at (718) 817-4480 or


*This talk is sponsored by the Dean of Fordham College at Rose Hill under the Interdisciplinary Faculty Seminar Program 2009-2010.

