Modeling hospitalization expenses of gastric cancer patients based on clustering and support vector machines

Tao Zhou; Huiling Lu; Wenwen Wang; Huiqun Wang

Cite this article:	Tao Zhou, Huiling Lu, Wenwen Wang, Huiqun Wang. A new model for hospitalization expenses of gastric cancer based on clustering and support vector machine [J]. Control Theory & Applications (控制理论与应用), 2017, 34(6): 803-810.

A new model for hospitalization expenses of gastric cancer based on clustering and support vector machine

Received: 2016-07-25; Revised: 2016-11-21


DOI: 10.7641/CTA.2017.60545

2017,34(6):803-810

Keywords: gastric cancer; hospitalization expense; support vector machine; clustering; category label

Funding: National Natural Science Foundation of China (No. 61561040); Natural Science Foundation of Ningxia (No. NZ16067)

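Author	Affiliation	E-mail
Tao Zhou	School of Science, Ningxia Medical University	zhoutaonxmu@126.com
Huiling Lu	School of Science, Ningxia Medical University
Wenwen Wang	School of Science, Ningxia Medical University
Huiqun Wang	School of Science, Ningxia Medical University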

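Chinese abstract (translated)

To address the complexity of setting category labels for the hospitalization expenses of gastric cancer patients and the limitations of traditional cost-modeling algorithms, this paper proposes a hospitalization-expense modeling algorithm based on clustering and support vector machines, providing a methodological basis for controlling and predicting these expenses. Records of 1583 gastric cancer patients treated at a Grade III-A (tertiary) hospital in Ningxia from 2009 to 2011 were collected as samples. K-means was applied to total hospitalization expenses year by year to obtain category labels, and a support vector machine was then used to model and predict hospitalization expenses and to analyze the influencing factors, with classification accuracy as the evaluation metric. The results show that hospitalization expenses of gastric cancer patients rose year by year, with Western medicine fees the largest component at 53.74% of total expenses. Clustering expenses by year with K-means raised classification accuracy by 13.13% compared with clustering on the expense distribution alone. The support vector machine model was most stable with a Gaussian kernel, penalty factor C = 10 and kernel parameter equal to 1, reaching a classification accuracy of 92.11%. The results indicate that category labels obtained by clustering by year are more reasonable, and that combining clustering with SVM predicts hospitalization expenses more effectively.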
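The year-by-year labeling step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmeans_1d` and `label_by_year`, the number of clusters `k`, the deterministic initialization, and the sample records are all assumptions made for illustration, since the abstract does not state them. The sketch runs a plain one-dimensional K-means separately on each year's total expenses, so a label means "low" or "high" relative to that year rather than to the pooled data.

```python
from collections import defaultdict


def kmeans_1d(values, k, iters=100):
    """Plain 1-D K-means (Lloyd's algorithm); returns one cluster id per value.

    Cluster ids follow ascending centre, so 0 is the cheapest group.
    Assumes len(values) >= k.
    """
    ordered = sorted(values)
    # Deterministic start (an assumption): centres spread over the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda c: abs(v - centres[c]))

    for _ in range(iters):
        assign = [nearest(v) for v in values]          # assignment step
        new_centres = []
        for c in range(k):                             # update step
            members = [v for v, a in zip(values, assign) if a == c]
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break
        centres = new_centres

    assign = [nearest(v) for v in values]
    # Relabel so that cluster ids follow ascending centre value.
    rank = {c: r for r, c in enumerate(sorted(range(k), key=lambda c: centres[c]))}
    return [rank[a] for a in assign]


def label_by_year(records, k):
    """records: list of (year, total_expense). Returns labels in input order."""
    by_year = defaultdict(list)
    for idx, (year, expense) in enumerate(records):
        by_year[year].append((idx, expense))
    labels = [None] * len(records)
    for items in by_year.values():
        year_labels = kmeans_1d([e for _, e in items], k)
        for (idx, _), lab in zip(items, year_labels):
            labels[idx] = lab
    return labels


# Made-up expenses, for illustration only (not data from the study).
sample = [
    (2009, 8000.0), (2009, 9000.0), (2009, 30000.0), (2009, 32000.0),
    (2010, 12000.0), (2010, 13000.0), (2010, 45000.0), (2010, 47000.0),
]
sample_labels = label_by_year(sample, k=2)
```

The resulting labels could then serve as the target classes for a classifier such as an SVM.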

English abstract

A new modeling method based on clustering and support vector machines (SVM) is proposed to reduce the complexity of setting category labels for the hospitalization expenses of gastric cancer patients and to overcome the limitations of traditional cost-modeling techniques, thereby providing a methodological basis for controlling and predicting these expenses. A total of 1583 gastric cancer cases from a tertiary general hospital in Ningxia, covering 2009 to 2011, were collected as samples. Total hospitalization expenses were clustered year by year using K-means to obtain category labels, and an SVM was then used to predict hospitalization expenses and analyze their influencing factors. Classification accuracy was used as the index for evaluating prediction performance. The experimental results show that the hospitalization expenses of gastric cancer patients increased year by year, and that Western drugs accounted for most of the expenses (53.74%). The factors influencing the cost of hospitalization were treatment outcome, surgery, admission condition, length of stay, age and marital status, among which prognosis and surgery were the most important. The classification accuracy obtained with K-means clustering by year was 13.13% higher than that obtained by clustering on the expense distribution alone. The SVM with a Gaussian kernel was the most stable model, with a classification accuracy of 92.11% when the penalty factor C and the kernel parameter were set to 10 and 1, respectively. Clustering by year is therefore a more reasonable way to obtain category labels, and combining clustering with SVM is effective for predicting hospitalization expenses.
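For readers unfamiliar with the two quantities named above, the following is a minimal sketch of a Gaussian kernel and of classification accuracy. The abstract does not say how the kernel parameter is defined, so the form exp(-gamma * ||x - y||^2) with `gamma = 1` is an assumption; other parameterizations (for example by a width sigma) are equally possible. The function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel exp(-gamma * ||x - y||^2) for equal-length vectors.

    The gamma parameterization is an assumption; the source gives only
    the value 1 for an unnamed kernel parameter.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def accuracy(y_true, y_pred):
    """Fraction of predicted category labels that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

The kernel equals 1 for identical inputs and decays toward 0 as the inputs move apart; the penalty factor C belongs to the SVM training objective and does not appear in the kernel itself.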