TY - JOUR
T1 - Prioritizing candidate genes by weighted network structure for the identification of disease marker genes
AU - Shin, Miyoung
AU - Lee, Hyungmin
PY - 2011/1
Y1 - 2011/1
N2 - Ranking genes by their microarray expression profiles is one of the most popular approaches to finding marker genes associated with specific diseases. Recently, other types of biological resources, such as gene annotations and bio-literature, have also been explored alongside expression profiles. The GeneRank algorithm is one such approach: it uses gene annotation data as well as expression scores to prioritize genes. Specifically, GeneRank constructs an unweighted network structure from gene annotation data and, based on that network, calculates a ranking score for each gene from its associated links and expression score. In this work, we investigate how effective a weighted network structure generated from gene annotations is for gene prioritization. To this end, we propose two novel weighting schemes that define the link strength between genes: the Shared Functions (SF) link-weighting scheme and the Weighted Shared Functions (WSF) link-weighting scheme. We evaluated the proposed schemes by applying them to prioritize candidate genes associated with prostate cancer. That is, from microarray expression profiles and gene annotation data, we produced ranking scores for individual genes based on the weighted network structure built by the proposed link-weighting schemes. The top n-ranked genes were then taken as our selection of marker genes associated with prostate cancer. For biological validation, we compiled from bio-literature a list of genes known a priori to be related to prostate cancer and used it as the gold standard. We then counted how many gold-standard genes appeared among the top n-ranked genes under each proposed scheme.
In our experiments, the proposed link-weighting schemes improved the detection of disease marker genes compared with the original GeneRank algorithm. These results suggest that using a weighted network structure for gene ranking can be very effective for identifying marker genes involved in specific diseases.
AB - Ranking genes by their microarray expression profiles is one of the most popular approaches to finding marker genes associated with specific diseases. Recently, other types of biological resources, such as gene annotations and bio-literature, have also been explored alongside expression profiles. The GeneRank algorithm is one such approach: it uses gene annotation data as well as expression scores to prioritize genes. Specifically, GeneRank constructs an unweighted network structure from gene annotation data and, based on that network, calculates a ranking score for each gene from its associated links and expression score. In this work, we investigate how effective a weighted network structure generated from gene annotations is for gene prioritization. To this end, we propose two novel weighting schemes that define the link strength between genes: the Shared Functions (SF) link-weighting scheme and the Weighted Shared Functions (WSF) link-weighting scheme. We evaluated the proposed schemes by applying them to prioritize candidate genes associated with prostate cancer. That is, from microarray expression profiles and gene annotation data, we produced ranking scores for individual genes based on the weighted network structure built by the proposed link-weighting schemes. The top n-ranked genes were then taken as our selection of marker genes associated with prostate cancer. For biological validation, we compiled from bio-literature a list of genes known a priori to be related to prostate cancer and used it as the gold standard. We then counted how many gold-standard genes appeared among the top n-ranked genes under each proposed scheme.
In our experiments, the proposed link-weighting schemes improved the detection of disease marker genes compared with the original GeneRank algorithm. These results suggest that using a weighted network structure for gene ranking can be very effective for identifying marker genes involved in specific diseases.
KW - Disease marker genes
KW - Gene prioritization
KW - Gene ranking
KW - Microarray
KW - Weighted network structure
UR - http://www.scopus.com/inward/record.url?scp=79953034489&partnerID=8YFLogxK
U2 - 10.1007/s13206-011-5105-4
DO - 10.1007/s13206-011-5105-4
M3 - Article
AN - SCOPUS:79953034489
SN - 1976-0280
VL - 5
SP - 27
EP - 31
JO - Biochip Journal
JF - Biochip Journal
IS - 1
ER -