A self-learning graph clustering approach for protein complexes detection
Received: 2016-08-03  Revised: 2017-03-21
DOI  10.7641/CTA.2017.60581
  2017,34(6):776-782
Keywords  graph clustering  protein complexes  non-negative matrix factorization
Funding  Natural Science Foundation of Guangdong Province, among others
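Author  Affiliation  E-mail
Zhu Jia (朱佳)  South China Normal University  jzhu@m.scnu.edu.cn
Wu Xingcheng (武兴成)  School of Computer Science, South China Normal University
Lin Xueqin (林雪琴)  School of Computer Science, South China Normal University
Xiao Danyang (肖丹阳)  School of Computer Science, South China Normal University
Xiao Jing (肖菁)  School of Computer Science, South China Normal University
Huang Jin (黄晋)  School of Computer Science, South China Normal University  1936079@qq.com
He Chaobo (贺超波)  College of Information Science and Technology, Zhongkai University of Agriculture and Engineering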
Abstract
      A protein complex is a group of two or more associated polypeptide chains that plays an essential role in biological processes. Given a graph representing protein-protein interaction (PPI) data, finding protein complexes, that is, subsets of closely coupled proteins, is important but non-trivial, particularly as PPI networks have grown greatly in size in recent years. In this paper, we propose a graph-based clustering approach that adopts symmetric non-negative matrix factorization and can effectively detect densely connected subgraphs in complex networks. We compare our approach with state-of-the-art approaches on three PPI networks, using the same well-known set of benchmark complexes. The experimental results show that our approach significantly outperforms the other methods on all three PPI networks, which differ in size and density.
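
To illustrate the building block named in the abstract, the sketch below shows plain symmetric non-negative matrix factorization on a graph. It is not the authors' self-learning method, whose details are not given on this page. It assumes the standard objective of approximating an adjacency matrix A by H·Hᵀ with H ≥ 0, and a damped multiplicative update that is commonly used for this problem. The toy graph (two 4-node cliques joined by one edge), the helper names, and the parameters `n_iter`, `beta`, and `eps` are all hypothetical choices made for illustration.

```python
import random


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]


def symnmf(A, k, n_iter=2000, beta=0.5, seed=0, eps=1e-12):
    """Approximate a symmetric non-negative matrix A by H @ H.T with H >= 0.

    Uses a damped multiplicative update (an assumption of this sketch):
        H <- H * ((1 - beta) + beta * (A H) / (H H^T H))
    """
    rng = random.Random(seed)
    n = len(A)
    H = [[rng.random() for _ in range(k)] for _ in range(n)]
    for _ in range(n_iter):
        AH = matmul(A, H)                       # n x k
        HHtH = matmul(H, matmul(transpose(H), H))  # n x k
        H = [[H[i][j] * ((1 - beta) + beta * AH[i][j] / (HHtH[i][j] + eps))
              for j in range(k)] for i in range(n)]
    return H


def two_cliques_with_bridge():
    """Hypothetical toy network: nodes 0-3 and 4-7 form cliques, joined by edge 3-4."""
    n = 8
    A = [[0.0] * n for _ in range(n)]
    for block in (range(0, 4), range(4, 8)):
        for i in block:
            for j in block:
                if i != j:
                    A[i][j] = 1.0
    A[3][4] = A[4][3] = 1.0
    return A


A = two_cliques_with_bridge()
H = symnmf(A, k=2)
# Assign each node to the column of H in which it has the largest weight.
labels = [max(range(2), key=lambda j: row[j]) for row in H]
print(labels)
```

Each row of H can be read as a node's soft membership in k groups, so taking the largest entry per row gives a hard clustering. In a real PPI setting, A would be the network's (possibly weighted) interaction matrix and k the number of candidate complexes, and both choices would need to follow the paper rather than this sketch.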