Chi-Shuan Huang, Harn-Jing Terng, Yu-Chin Chou, Sui-Lung Su, Yu-Tien Chang, Chin-Yu Chen, Woan-Jen Lee, Chung-Tay Yao, Hsiu-Ling Chou, Chia-Yi Lee, Chien-An Sun, Ching-Huang Lai, Lu Pai, Chi-Wen Chang, Kang-Hwa Chen, Thomas Wetter, Yun-Wen Shih and Chi-Ming Chu
Background: Optimal molecular markers for detecting colorectal cancer (CRC) in a blood-based assay were evaluated. Microarray technology has shown great potential in colorectal cancer research. Genes significantly associated with cancer in microarray studies were selected as candidate genes for this study. Pooling publicly available microarray data sets can overcome the small sample sizes that limited previous studies. Objective: To verify gene expression profiles for colorectal cancer using public microarray data sets. Methods: Logistic regression analysis was performed, and odds ratios for each gene were determined between CRC cases and controls. The public microarray data sets GSE4107, GSE4183, GSE8671, GSE9348, GSE10961, GSE13067, GSE13294, GSE13471, GSE14333, GSE15960, GSE17538, and GSE18105, comprising 519 cases of adenocarcinoma and 88 controls of normal mucosa, were used to verify the candidate genes from the logistic models and to estimate their external generalizability. Results: A 7-gene model of CPEB4, EIF2S3, MGC20553, MS4A1, ANXA3, TNFAIP6, and IL2RB, selected pairwise, showed the best results in logistic regression analysis (Hosmer-Lemeshow p=1.000, R²=0.951, AUC=0.999, accuracy=0.968, specificity=0.966, and sensitivity=0.994). Conclusions: A novel gene expression profile was associated with CRC and can potentially be applied to blood-based detection assays.
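The performance measures reported in the Results (odds ratio, sensitivity, specificity, accuracy, AUC) can be illustrated with a minimal Python sketch. This is not the authors' analysis pipeline: the function names, the 0.5 classification threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers are invented rather than taken from the study.

```python
import math


def odds_ratio(coef):
    """Odds ratio for a one-unit increase in a predictor,
    given its fitted logistic regression coefficient."""
    return math.exp(coef)


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels
    (1 = CRC case, 0 = control) and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney formulation:
    the probability that a random case scores higher than a random
    control, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 cases and 4 controls with model-predicted
# probabilities. These values are invented, not from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 for calling a sample a case.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of measure are reported side by side.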