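Innovation in the Training of Professional-Degree Graduate Students in Agricultural Informatization
Authors:
龙陈锋; 谭泗桥; 方逵; 张红燕
Journal:
计算机教育, 2019(1): 48-51. ISSN: 1672-5913
Author affiliations:
湖南农业大学信息科学技术学院; [方逵; 谭泗桥; 龙陈锋] 湖南省农村农业信息化工程技术研究中心; [张红燕] 湖南农业大学
Keywords:
agricultural informatization; professional degree; training model; six-stage teaching; "four-dual" training
Abstract:
To address problems in training professional-degree master's students in agricultural informatization, such as an unclear understanding of the specialty, outdated knowledge, and a shortage of practice bases, this paper analyzes their causes. Drawing on the experience of local agricultural universities in training professional-degree graduate students in informatization, it proposes a new approach to talent training, "strengthen the specialty, track the frontier, emphasize ability, highlight application," and describes the "four-dual" training method, the "five-joint" management method, the "six-stage" teaching model, the "4+2" course cluster, and the "4+4+2" course teaching and assessment method.
Language:
Chinese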
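Practice Teaching Reform in the Information Engineering Major Guided by Engineering Education Accreditation
Authors:
张红燕
Journal:
实验技术与管理, 2019, 36(5): 167-169+175. ISSN: 1002-4956
Author affiliations:
湖南农业大学信息科学技术学院, 湖南长沙, 410128; [张红燕] 湖南农业大学
Keywords:
engineering education; information engineering; practice teaching; modern educational technology; engineering ability
Abstract:
In view of the characteristics of talent training in the information engineering major, and guided by engineering education accreditation standards, a "modular + personalized" practice teaching system was built, practice teaching content at different levels was standardized, and modern educational technology was used to renew the laboratory classroom. These reforms emphasize students' self-directed learning, engineering practice, and innovation abilities, and have achieved good results.
Language:
Chinese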
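A Survey and Analysis of College Student Satisfaction: The Case of the College of Information Science and Technology, Hunan Agricultural University
Authors:
李依依; 张红燕
Journal:
教育教学论坛, 2017(4): 69-71. ISSN: 1674-9324
Author affiliations:
湖南农业大学信息科学技术学院, 湖南长沙, 410128
Keywords:
college students; campus life; satisfaction
Abstract:
To examine college student satisfaction, a sample survey was conducted among current students of the College of Information Science and Technology at Hunan Agricultural University. The resulting data were used to study students' satisfaction with various aspects of campus life and to analyze how it could be improved. Overall satisfaction was fairly good, but there is room for improvement; universities are advised to raise satisfaction by improving teaching services and investing in infrastructure.
Language:
Chinese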
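Simulation and Analysis of BPSK, QPSK, and OQPSK Based on SystemView
Authors:
张晓慧; 张红燕
Journal:
长江信息通信, 2015(12): 1-3. ISSN: 2096-9759
Author affiliations:
湖南农业大学信息科学技术学院, 湖南长沙, 410128; [张晓慧; 张红燕] 湖南农业大学
Keywords:
SystemView simulation
Abstract:
In recent years, phase-shift keying has been widely used in satellite communication, mobile communication, and other fields. BPSK, QPSK, and OQPSK are the basic phase-shift keying schemes. To help beginners distinguish and learn these three modulation schemes, the article introduces the modulation and demodulation principles of BPSK, QPSK, and OQPSK, presents system simulation designs based on SystemView, and compares the simulation results, making phase-shift keying more concrete and easier for beginners to understand in depth.
Language:
Chinese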
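As a rough illustration of how the three phase-shift keying schemes named in the record above differ, the following Python sketch maps bits to baseband symbols. It is not the SystemView design from the article; the function names, the bit-to-level mapping (0 to +1, 1 to -1), and the two-samples-per-symbol representation are assumptions made here for illustration.

```python
import math

def bpsk_symbols(bits):
    """BPSK: one bit per symbol, two antipodal phases (0 -> +1, 1 -> -1)."""
    return [1.0 if b == 0 else -1.0 for b in bits]

def qpsk_symbols(bits):
    """QPSK: two bits per symbol; the first bit sets I, the second sets Q."""
    if len(bits) % 2 != 0:
        raise ValueError("QPSK needs an even number of bits")
    scale = 1.0 / math.sqrt(2.0)  # keeps every symbol at unit magnitude
    levels = bpsk_symbols(bits)
    return [complex(levels[k] * scale, levels[k + 1] * scale)
            for k in range(0, len(levels), 2)]

def oqpsk_streams(bits):
    """OQPSK: same I/Q levels as QPSK, but Q is delayed by half a symbol.

    Both streams are sampled twice per symbol. A 0.0 entry marks padding
    where a branch carries no data yet (start of Q) or any more (end of I).
    """
    i_stream, q_stream = [], [0.0]
    for s in qpsk_symbols(bits):
        i_stream += [s.real, s.real]
        q_stream += [s.imag, s.imag]
    i_stream.append(0.0)
    return i_stream, q_stream

bits = [0, 0, 1, 1, 0, 1, 1, 0]
i_stream, q_stream = oqpsk_streams(bits)
for step, (i_level, q_level) in enumerate(zip(i_stream, q_stream)):
    print(step, round(i_level, 3), round(q_level, 3))
```

Because the Q branch is shifted by half a symbol, the I and Q levels never change at the same step, which is the property that limits OQPSK phase jumps compared with plain QPSK.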
Binary matrix shuffling filter for feature selection in neuronal morphology classification
Authors:
Sun, Congwei; Dai, Zhijun; Zhang, Hongyan; Li, Lanzhi*; Yuan, Zheming
Journal:
Computational and Mathematical Methods in Medicine, 2015, vol. 2015, article 626975, pp. 1-9. ISSN: 1748-6718
Corresponding author:
Li, Lanzhi
Author affiliations:
[Zhang, Hongyan; Li, Lanzhi; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Agr Univ, Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan; Li, Lanzhi; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Agr Univ, Hunan Prov Key Lab Biol & Control Plant Dis & Ins, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Li, Lanzhi] Hunan Agr Univ, Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.
Abstract:
A prerequisite for understanding neuronal function and characteristics is to classify neurons correctly. Existing classification techniques are usually based on structural characteristics and employ principal component analysis to reduce the feature dimension. In this work, we aim to classify neurons based on neuronal morphology. A new feature selection method named the binary matrix shuffling filter was used in neuronal morphology classification. This method, coupled with a support vector machine for implementation, usually selects a small number of features for easy interpretation. The retained features are used to build classification models with support vector classification and two other commonly used classifiers. Compared with the reference feature selection methods, the binary matrix shuffling filter showed optimal performance and exhibited broad generalization ability in five random replications of neuron datasets. In addition, the binary matrix shuffling filter was able to distinguish each neuron type from the other types correctly; for each neuron type, private features were also obtained. © 2015 Congwei Sun et al.
Language:
English
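QSAR Study of the Antibacterial Activity of Flavanone Derivatives
Authors:
熊光; 张红燕
Journal:
中国农学通报, 2015, 31(29): 77-81. ISSN: 1000-6850
Author affiliations:
湖南农业大学信息科学技术学院, 长沙 410128; 湖南化工职业技术学院, 湖南株洲 412000
Keywords:
quantitative structure-activity relationship; flavanone; bactericidal activity; support vector regression; multiple linear regression
Abstract:
To study the relationship between the molecular structure of flavanone derivatives and their bactericidal activity, the bactericidal activity of 34 flavanone derivatives was used as the response, and initial molecular descriptors were extracted with the quantum chemistry software PCLIENT. Feature selection by high-dimensional feature screening followed by multiple rounds of last-place elimination yielded 14 important molecular descriptors. Multiple linear regression and support vector regression models built on the retained descriptors achieved high prediction accuracy. Finally, the roles of the retained descriptors were interpreted with a nonlinear interpretation framework based on support vector machines. Ranked by their importance for the bactericidal activity of flavanone derivatives, the descriptors were SssO > MATS6e > SeaC2C2aa > EEig04d > EEig03x > C-028 > R7v+ > L2e > X4A > Mor25m > G1m > Mor17m > BEHv1 > RDF045m, which offers guidance for the synthesis of flavanone-based bactericides.
Language:
Chinese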
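Research on an Intelligent Greenhouse Monitoring System Based on ZigBee Technology
Authors:
徐鸽; 张红燕
Journal:
企业技术开发, 2014, 33(11): 6-8. ISSN: 1006-8937
Author affiliations:
湖南农业大学信息科技学院, 湖南长沙, 410128
Keywords:
precision agriculture; intelligent greenhouse; agricultural environment monitoring; sensors
Abstract:
This paper studies an intelligent agricultural greenhouse monitoring system based on ZigBee technology. Previously proposed intelligent greenhouses focused mainly on monitoring greenhouse data while relying on manual adjustment of environmental factors, so control was not sufficiently real-time and depended too heavily on human experience. To address this weakness, the article proposes an intelligent monitoring and control system based on the ZigBee communication protocol. The system does not depend entirely on human experience, responds in real time, needs no on-site wiring, allows flexible placement of collection points over a wide area, and has a long operating life.
Language:
Chinese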
Informative gene selection and direct classification of tumor based on chi-square test of pairwise gene interactions
Authors:
Zhang, Hongyan; Li, Lanzhi; Luo, Chao; Sun, Congwei; Chen, Yuan; ...
Journal:
BioMed Research International, 2014, vol. 2014, article 589290. ISSN: 2314-6133
Corresponding author:
Yuan, Zheming
Author affiliations:
[Zhang, Hongyan; Li, Lanzhi; Chen, Yuan; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan; Luo, Chao] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan; Li, Lanzhi; Chen, Yuan; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Prov Key Lab Biol & Control Plant Dis & Ins, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.
Abstract:
In efforts to discover disease mechanisms and improve clinical diagnosis of tumors, it is useful to mine profiles for informative genes with definite biological meanings and to build robust classifiers with high precision. In this study, we developed a new method for tumor-gene selection, the Chi-square test-based integrated rank gene and direct classifier (χ²-IRG-DC). First, we obtained the weighted integrated rank of gene importance from chi-square tests of single and pairwise gene interactions. Then, we sequentially introduced the ranked genes and removed redundant genes by using leave-one-out cross-validation of the chi-square test-based Direct Classifier (χ²-DC) within the training set to obtain informative genes. Finally, we determined the accuracy of independent test data by utilizing the genes obtained above with χ²-DC. Furthermore, we analyzed the robustness of χ²-IRG-DC by comparing the generalization performance of different models, the efficiency of different feature-selection methods, and the accuracy of different classifiers. An independent test of ten multiclass tumor gene-expression datasets showed that χ²-IRG-DC could efficiently control overfitting and had higher generalization performance. The informative genes selected by χ²-IRG-DC could dramatically improve the independent test precision of other classifiers; meanwhile, the informative genes selected by other feature selection methods also had good performance in χ²-DC. © 2014 Hongyan Zhang et al.
Language:
English
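The record above ranks genes with chi-square tests. The sketch below shows only the generic Pearson chi-square statistic on a gene-state by class contingency table; it is not the χ²-IRG-DC procedure itself. The median-based high/low discretization, the function names, and the toy expression values are all assumptions introduced here.

```python
from statistics import median

def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def contingency_table(states, labels):
    """Count samples for every (gene state, class label) combination."""
    state_values = sorted(set(states))
    label_values = sorted(set(labels))
    return [[sum(1 for s, y in zip(states, labels) if s == sv and y == lv)
             for lv in label_values] for sv in state_values]

def gene_score(values, labels):
    """Assumed discretization: 1 if above the gene's median, else 0."""
    cut = median(values)
    states = [1 if v > cut else 0 for v in values]
    return chi_square_statistic(contingency_table(states, labels))

# Toy data (hypothetical): gene_a separates the classes, gene_b does not.
labels = ["normal"] * 4 + ["tumor"] * 4
expression = {
    "gene_a": [1, 2, 1, 2, 8, 9, 8, 9],
    "gene_b": [5, 1, 6, 2, 5, 1, 6, 2],
}
ranking = sorted(expression, key=lambda g: gene_score(expression[g], labels),
                 reverse=True)
print(ranking)
```

One hypothetical way to extend this to a gene pair is to pass tuples of the two genes' states to `contingency_table`, so that each joint state becomes one row; whether the paper scores pairs this way is not stated in the abstract.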
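Numerical Identification of Insects Based on Support Vector Machines
Authors:
吴宏华; 张红燕; 陈渊
Journal:
中国农学通报, 2014, 30(7): 286-291. ISSN: 1000-6850
Author affiliations:
[吴宏华; 张红燕] 湖南农业大学信息科学技术学院; [陈渊] 湖南省植物病虫害生物学与防控重点实验室
Keywords:
support vector machine; numerical identification; feature screening; insect recognition
Abstract:
To improve the accuracy of insect identification, a new computer-based numerical identification method built on support vector machines was proposed and applied to the identification of seven butterfly species, using the distances between vein junctions inside the forewing as numerical features. First, DrawWing software was used to obtain the coordinates of the wing-vein junctions of the seven butterfly species automatically, and the Euclidean distance between each pair of adjacent junctions was calculated. Next, the samples of each class were set against all other samples to form a binary classification model. Then, for each model, nonlinear feature screening with a support vector machine removed useless or redundant features, and the retained features were used to build the final classifier. The seven prediction models reached an average independent-test accuracy of 98.64%, clearly higher than the reference models, indicating that the new method has good application prospects in insect identification.
Language:
Chinese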
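Two of the data-preparation steps described in the record above, distances between adjacent vein junctions and one binary problem per species, can be sketched in a few lines of Python. This is only an illustration: treating consecutive list entries as "adjacent" junctions, the function names, and the example coordinates are assumptions, and the SVM feature screening itself is not shown.

```python
import math

def adjacent_distances(junctions):
    """Euclidean distance between each pair of consecutive junction points.

    Assumption: the order of the list defines which junctions are adjacent.
    """
    return [math.dist(p, q) for p, q in zip(junctions, junctions[1:])]

def one_vs_rest_labels(species_labels):
    """One binary label vector per species: +1 for it, -1 for all others."""
    return {target: [1 if s == target else -1 for s in species_labels]
            for target in sorted(set(species_labels))}

# Hypothetical junction coordinates for a single wing image.
features = adjacent_distances([(0, 0), (3, 4), (3, 10)])
print(features)

# Hypothetical species labels for three specimens.
print(one_vs_rest_labels(["species_a", "species_b", "species_a"]))
```

With seven species this produces seven binary label vectors, matching the seven prediction models mentioned in the abstract.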
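Prediction of the Number of Mobile Communication Subscribers Based on a Combined ARIMA-SVM Model
Authors:
王佳敏; 张红燕
Journal:
计算机时代, 2014(9): 12-15, 17. ISSN: 1006-8228
Author affiliations:
湖南农业大学信息科学技术学院, 湖南长沙, 410128; [王佳敏; 张红燕] 湖南农业大学
Keywords:
number of mobile communication subscribers; prediction; time series; autoregressive integrated moving average model; support vector machine
Abstract:
By analyzing historical mobile communication traffic data across time periods and regions, operators can predict traffic over a coming period and thereby provide decision support to management. To capture the fluctuation patterns of the number of mobile subscribers in China accurately and improve prediction accuracy, the busy-hour total number of mobile subscribers and the number of 3G subscribers over the 26 months from January 2012 to February 2014 were analyzed. An autoregressive integrated moving average (ARIMA) model was used for linear modeling of the traffic time series, a support vector machine (SVM) was used for nonlinear modeling of the ARIMA residuals, and the two models were combined to predict the busy-hour number of mobile subscribers. The results show that the combined ARIMA-SVM model predicts clearly more accurately than either single model and draws on the strengths of both. The combined model is a practical method for mobile communication traffic prediction.
Language:
Chinese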
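The two-stage structure described in the record above (a linear time-series model, then a nonlinear model fitted to its residuals, with the two forecasts added) can be sketched with standard-library Python. The stand-ins are deliberate and are not the paper's models: a least-squares AR(1) fit on first differences replaces ARIMA, and a nearest-neighbor average replaces the SVM. All names and the toy series are hypothetical.

```python
def fit_ar1_on_differences(series):
    """Linear stage stand-in: least-squares AR(1) on first differences."""
    d = [b - a for a, b in zip(series, series[1:])]
    num = sum(d[t] * d[t - 1] for t in range(1, len(d)))
    den = sum(d[t - 1] ** 2 for t in range(1, len(d)))
    phi = num / den if den else 0.0
    residuals = [d[t] - phi * d[t - 1] for t in range(1, len(d))]
    return phi, d, residuals

def predict_residual(residuals, k=2):
    """Nonlinear stage stand-in: average the residuals that followed the
    k past residuals closest in value to the most recent one."""
    latest = residuals[-1]
    nearest = sorted(range(len(residuals) - 1),
                     key=lambda t: abs(residuals[t] - latest))[:k]
    return sum(residuals[t + 1] for t in nearest) / len(nearest)

def hybrid_forecast(series):
    """Return (combined forecast, linear part, residual correction)."""
    phi, d, residuals = fit_ar1_on_differences(series)
    linear = series[-1] + phi * d[-1]
    correction = predict_residual(residuals)
    return linear + correction, linear, correction

# Toy monthly counts (hypothetical, at least four points are needed).
forecast, linear, correction = hybrid_forecast([10, 12, 15, 14, 18, 21, 20, 25])
print(round(forecast, 3), round(linear, 3), round(correction, 3))
```

The point of the design is that the second stage only has to model what the linear stage missed, so either stage can be swapped for a stronger model without changing how the forecasts are combined.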
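Reforming Communication Principles Laboratory Teaching with Virtual Simulation Experiments
Authors:
任峻; 张红燕
Journal:
实验技术与管理, 2014(3): 95-97, 104. ISSN: 1002-4956
Author affiliations:
湖南农业大学信息科学技术学院, 湖南长沙, 410128
Keywords:
communication principles; laboratory teaching; virtual experiment
Abstract:
Communication principles experiments based on experiment boxes are limited by fixed hardware: communication systems cannot be built freely and spectrum analysis cannot be performed. SystemView simulation software was therefore used to establish a comprehensive communication principles laboratory that combines virtual and physical experiments. A virtual simulation experiment on a 2FSK modulation and demodulation system illustrates the laboratory teaching process, including system design, system construction, and signal analysis.
Language:
Chinese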
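For readers without SystemView, the 2FSK modulation and demodulation experiment mentioned in the record above can be approximated in plain Python. This is a minimal sketch under stated assumptions, not the article's system design: the sampling resolution, the two tone frequencies, and the noncoherent energy detector are choices made here for illustration.

```python
import math

SAMPLES_PER_BIT = 32        # hypothetical sampling resolution
CYCLES = {0: 2, 1: 4}       # hypothetical carrier cycles per bit period

def fsk2_modulate(bits):
    """Emit one cosine burst per bit, at the frequency assigned to that bit."""
    wave = []
    for b in bits:
        for n in range(SAMPLES_PER_BIT):
            wave.append(math.cos(2 * math.pi * CYCLES[b] * n / SAMPLES_PER_BIT))
    return wave

def tone_energy(segment, cycles):
    """Energy of a segment at one tone (squared cosine and sine correlations)."""
    c = sum(x * math.cos(2 * math.pi * cycles * n / SAMPLES_PER_BIT)
            for n, x in enumerate(segment))
    s = sum(x * math.sin(2 * math.pi * cycles * n / SAMPLES_PER_BIT)
            for n, x in enumerate(segment))
    return c * c + s * s

def fsk2_demodulate(wave):
    """Decide each bit by comparing the energy at the two tone frequencies."""
    bits = []
    for start in range(0, len(wave), SAMPLES_PER_BIT):
        segment = wave[start:start + SAMPLES_PER_BIT]
        bits.append(max(CYCLES, key=lambda b: tone_energy(segment, CYCLES[b])))
    return bits

sent = [1, 0, 0, 1, 1, 0]
received = fsk2_demodulate(fsk2_modulate(sent))
print(received == sent)
```

Both tones complete a whole number of cycles per bit, so they are orthogonal over one bit period and the energy comparison separates them cleanly in this noise-free sketch.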
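Prediction of Rice Gall Midge Occurrence Based on REMCC-BPNN
Authors:
隆轲; 张红燕; 谢元瑰; 李诚
Journal:
中国农学通报, 2014, 30(13): 289-293. ISSN: 1000-6850
Author affiliations:
[隆轲; 张红燕; 谢元瑰; 李诚] 湖南农业大学信息科学技术学院; 湖南省农村农业信息化工程技术研究中心
Keywords:
BP neural network; pest prediction; time series
Abstract:
To improve the accuracy of predicting rice gall midge occurrence and effectively limit the area damaged by the pest, the REMCC-BPNN prediction model was adopted: a BP neural network optimized under the principle of minimizing the correlation coefficient between the absolute relative fitting error of the K nearest-neighbor samples and the time order. With the widely accepted factors of air temperature and precipitation as inputs, rice gall midge occurrence was predicted independently. Validation on two cases (the occurrence degree of rice gall midge on late rice in Huazhou City and the occurrence degree of rice gall midge in Yongning County, Guangxi) showed independent prediction accuracies of 94% and 100% respectively, clearly better than reference models such as classical regression analysis, SVR-CAR, and MIV-BPNN. The REMCC-BPNN model therefore has good application prospects in predicting pest occurrence.
Language:
Chinese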
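A Hierarchical-Structure-Based Employment Feature Model for Migrant Workers
Authors:
陈玉峰; 张红燕; 敬松; 谢元瑰; 隆珂
Journal:
中国农学通报, 2013, 29(11): 101-106. ISSN: 1000-6850
Author affiliations:
[陈玉峰; 张红燕; 敬松; 谢元瑰; 隆珂] 湖南农业大学信息科学技术学院
Keywords:
hierarchical structure; migrant workers; employment; feature model; personalized recommendation
Abstract:
The aim is to provide migrant workers with personalized employment recommendations more effectively and to promote the transfer of surplus rural labor. A hierarchical representation was used to combine a decision tree model with a vector space model, the characteristics of personalized employment push services for migrant workers were analyzed, and workers' basic profile features were combined with the features of their online job-seeking behavior to build an employment feature model. Empirical results show that the model represents migrant workers' employment features better and improves the accuracy of employment recommendations.
Language:
Chinese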
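Prediction of Cultivated Land Area Based on GS-SVR and Analysis of Its Driving Factors
Authors:
王笑冰; 张红燕; 谢元瑰; 陈玉峰; 隆轲
Journal:
中国农学通报, 2013, 29(23): 210-215. ISSN: 1000-6850
Author affiliations:
[王笑冰; 张红燕; 谢元瑰; 陈玉峰; 隆轲] 湖南农业大学信息科学技术学院
Keywords:
cultivated land area; driving factors; support vector regression; prediction
Abstract:
The driving factors behind changes in cultivated land area are complex, variable, and hard to determine. To select these driving factors soundly, improve the prediction accuracy of cultivated land area, and guide the rational allocation and use of cultivated land resources, the driving factors were determined with a GS-SVR-based method that evaluates all combinations of independent variables and keeps the one with the minimum prediction mean squared error (MSE). Taking changes in the cultivated land area of Hunan Province as an example, common time-series prediction methods including SVR-CAR, LSSVM, BPNN, ARIMA, and MLRR were used to verify the effectiveness of the selected driving factors. The results show that the optimal combination of driving factors for Hunan Province is the urbanization level and the output value index of the real estate industry, and that the common time-series prediction methods predicted cultivated land area far more accurately when they used the driving factors selected by the GS-SVR all-combination approach. Selecting driving factors by the GS-SVR all-combination minimum-MSE principle is scientifically sound and has broad application prospects in time-series prediction, including cultivated land area.
Language:
Chinese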
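The "all combinations of independent variables, minimum MSE" selection rule in the record above can be illustrated with a short exhaustive search. This is a sketch under stated assumptions rather than the GS-SVR method: a leave-one-out 1-nearest-neighbor regressor stands in for SVR so the example needs only the standard library, and the factor names and data are toy values.

```python
from itertools import combinations

def loo_mse_nearest_neighbor(rows, targets, columns):
    """Leave-one-out MSE of a 1-nearest-neighbor regressor on chosen columns.

    Stand-in for the SVR model named in the abstract; any regressor with a
    comparable error estimate could be plugged in here.
    """
    total = 0.0
    for i, row in enumerate(rows):
        best_dist, prediction = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((row[c] - other[c]) ** 2 for c in columns)
            if best_dist is None or dist < best_dist:
                best_dist, prediction = dist, targets[j]
        total += (targets[i] - prediction) ** 2
    return total / len(rows)

def best_factor_subset(rows, targets, names):
    """Try every non-empty combination of factors; keep the lowest-MSE one."""
    best = None
    for size in range(1, len(names) + 1):
        for columns in combinations(range(len(names)), size):
            mse = loo_mse_nearest_neighbor(rows, targets, columns)
            if best is None or mse < best[0]:
                best = (mse, [names[c] for c in columns])
    return best

# Toy data (hypothetical): the target follows the first factor only.
names = ["urbanization", "unrelated_index"]
rows = [(1, 5), (2, 1), (3, 4), (4, 2), (5, 3), (6, 6)]
targets = [10, 20, 30, 40, 50, 60]
print(best_factor_subset(rows, targets, names))
```

The search visits 2^n - 1 subsets for n candidate factors, so it is practical only when the list of candidate driving factors is short.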
Prediction of multidimensional time series based on GS-RSR-SVR and its application in agricultural economy
Authors:
Y.G. Xie; H.Y. Zhang; H.Y. Wang; L.F. Wang; Z.M. Yuan
Journal:
Bulgarian Journal of Agricultural Science, 2013, 19(6): 1327-1336. ISSN: 1310-0351
Corresponding author:
Zhang, H. Y.
Author affiliations:
Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization, Changsha 410128, China; Hunan Agricultural University, College of Information Science and Technology, Changsha 410128, China; Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Changsha 410128, China; [Wang H.Y.] Kansas State University, Department of Statistics, Manhattan, KS 66506, United States; [Yuan Z.M.; Wang L.F.] Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization, Changsha 410128, China, Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, Changsha 410128, China
Corresponding institution:
[Zhang, H. Y.] Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization, Changsha 410128, China
Keywords:
Geo-statistics tool; Multidimensional time series; Prediction; Reasonable sample rejection; Support vector machine regression
Abstract:
This paper proposes a method that applies a Geo-statistics tool (GS) to complete fast and adequate order determination and introduces a novel algorithm, named Reasonable Sample Rejection (RSR), to realize rational sample selection. Then, combined with Support Vector Machine Regression (SVR), a high-precision non-linear prediction method named GS-RSR-SVR is proposed for multidimensional time series. The main steps of the method are: 1) determine the order for the dependent variable of the training samples based on the one-dimensional GS aftereffect duration (range); 2) screen the independent variables according to Leave-One-Out Cross Validation (LOOCV) based on the minimum Mean Squared Error (MSE); 3) reject some of the oldest training samples based on the minimum correlation coefficient between the absolute relative fitting error of training sets of different rejected sizes and the sample number. Three real-world datasets were used to test the effectiveness of GS-RSR-SVR. The results show that GS-RSR-SVR has higher prediction precision and more stable prediction ability than MLR, ARIMA, CAR, BPNN, SVR and SVR-CAR.
Language:
English
TSG: A new algorithm for binary and multi-class cancer classification and informative genes selection
Authors:
Wang, Haiyan; Zhang, Hongyan; Dai, Zhijun; Chen, Ming-shun; Yuan, Zheming*
Journal:
BMC Medical Genomics, 2013, 6(Suppl. 1): 1-14. ISSN: 1755-8794
Corresponding author:
Yuan, Zheming
Author affiliations:
[Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.; [Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Util, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Agr Univ, Coll Bio Safety Sci & Technol, Changsha 410128, Hunan, Peoples R China.; [Chen, Ming-shun] USDA ARS, Manhattan, KS 66506 USA.
Corresponding institution:
[Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Util, Changsha 410128, Hunan, Peoples R China.
Conference name:
International Conference on Bioinformatics and Computational Biology (BIOCOMP)
Conference date:
JUL 18-21, 2011
Conference location:
Las Vegas, NV
Conference organizer:
[Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.; [Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Util, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Agr Univ, Coll Bio Safety Sci & Technol, Changsha 410128, Hunan, Peoples R China.; [Chen, Ming-shun] USDA ARS, Manhattan, KS 66506 USA.; [Chen, Ming-shun] Kansas State Univ, Dept Entomol, Manhattan, KS 66506 USA.
Keywords:
Support Vector Machine; Chisquare Statistic; Family Classifier; Marker Pair; Informative Gene
Abstract:
Background: One of the challenges in classification of cancer tissue samples based on gene expression data is to establish an effective method that can select a parsimonious set of informative genes. The Top Scoring Pair (TSP), k-Top Scoring Pairs (k-TSP), Support Vector Machines (SVM), and prediction analysis of microarrays (PAM) are four popular classifiers that have comparable performance on multiple cancer datasets. SVM and PAM tend to use a large number of genes and TSP, k-TSP always use even number of genes. In addition, the selection of distinct gene pairs in k-TSP simply combined the pairs of top ranking genes without considering the fact that the gene set with best discrimination power may not be the combined pairs. The k-TSP algorithm also needs the user to specify an upper bound for the number of gene pairs. Here we introduce a computational algorithm to address the problems. The algorithm is named Chisquare-statistic-based Top Scoring Genes (Chi-TSG) classifier simplified as TSG. Results: The TSG classifier starts with the top two genes and sequentially adds additional gene into the candidate gene set to perform informative gene selection. The algorithm automatically reports the total number of informative genes selected with cross validation. We provide the algorithm for both binary and multi-class cancer classification. The algorithm was applied to 9 binary and 10 multi-class gene expression datasets involving human cancers. The TSG classifier outperforms TSP family classifiers by a big margin in most of the 19 datasets. In addition to improved accuracy, our classifier shares all the advantages of the TSP family classifiers including easy interpretation, invariant to monotone transformation, often selects a small number of informative genes allowing follow-up studies, resistant to sampling variations due to within sample operations. 
Conclusions: Redefining the scores for gene set and the classification rules in TSP family classifiers by incorporating the sample size information can lead to better selection of informative genes and classification accuracy. The resulting TSG classifier offers a useful tool for cancer classification based on numerical molecular data. © 2013 Yuan; licensee BioMed Central Ltd.
Language:
English
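The abstract above compares TSG against the Top Scoring Pair (TSP) family. As background, the following Python sketch implements the basic TSP idea for two classes: pick the gene pair whose within-sample ordering differs most between classes, then classify a new sample by that ordering. It illustrates the baseline only, not the TSG algorithm; function names and toy data are hypothetical, and ties are broken by taking the first best pair.

```python
from itertools import combinations

def tsp_fit(samples, labels):
    """Find the gene pair whose ordering differs most between two classes.

    samples: list of expression vectors; labels: class label per sample.
    Returns (i, j, class_if_less, class_otherwise).
    """
    class_a, class_b = sorted(set(labels))  # exactly two classes expected

    def prob_less(i, j, cls):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        return sum(1 for x in rows if x[i] < x[j]) / len(rows)

    best = None
    for i, j in combinations(range(len(samples[0])), 2):
        pa, pb = prob_less(i, j, class_a), prob_less(i, j, class_b)
        score = abs(pa - pb)  # TSP score of the pair (i, j)
        if best is None or score > best[0]:
            if pa > pb:
                best = (score, i, j, class_a, class_b)
            else:
                best = (score, i, j, class_b, class_a)
    return best[1:]

def tsp_predict(model, sample):
    """Classify by comparing the two selected genes within the sample."""
    i, j, class_if_less, class_otherwise = model
    return class_if_less if sample[i] < sample[j] else class_otherwise

# Toy data (hypothetical): gene 0 is below gene 1 only in normal samples.
samples = [[1, 5, 3], [2, 6, 3], [7, 2, 3], [8, 1, 3]]
labels = ["normal", "normal", "tumor", "tumor"]
model = tsp_fit(samples, labels)
print(model, tsp_predict(model, [3, 9, 3]), tsp_predict(model, [9, 3, 3]))
```

Because the decision uses only the relative order of two values inside one sample, it is unchanged by any monotone transformation of that sample's measurements, which is the invariance property the abstract mentions.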
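Prediction of Grain Yield Based on REMCC-BPNN
Authors:
谢元瑰; 张红燕; 陈玉峰
Journal:
安徽农业科学, 2013(06): 2775-2777+2781. ISSN: 0517-6611
Author affiliations:
湖南农业大学信息科学技术学院
Keywords:
BP neural network; time series; grain yield; prediction
Abstract:
Accurate prediction of grain yield is of great significance for ensuring food security and maintaining social stability. A time-series prediction model, REMCC-BPNN, is proposed, in which a BP neural network is optimized under the principle of minimizing the correlation coefficient between the absolute relative fitting error of the K nearest-neighbor training samples and the time order. The model was applied to predicting the grain yield of China and of Hunan Province. The results show that REMCC-BPNN predicts more accurately than common time-series prediction models such as BPNN, SVR, ARIMA, and GM(1,N), and that it trains quickly and is highly stable.
Language:
Chinese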
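Reflections on a Long-Term Mechanism for Building the Academic Affairs Administration Team of Independent Colleges Affiliated with Parent Universities
Authors:
陈玉峰; 张红燕; 肖宇; 张香芽
Journal:
时代教育(小学版), 2012(9): 21-22. ISSN: 1672-8181
Author affiliations:
湖南农业大学信息科学技术学院, 湖南长沙, 410128; 湖南农业大学东方科技学院, 湖南长沙, 410128
Keywords:
parent university; independent college; academic affairs administrators; long-term mechanism
Abstract:
Academic affairs administrators are a core part of teaching work at independent colleges and provide basic support for teaching. The authors analyze the characteristics of independent colleges in depth, summarize the main problems in building their academic affairs administration teams, analyze the causes, and propose suggestions and countermeasures for a long-term mechanism for building these teams.
Language:
Chinese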
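Construction of an Information Engineering Innovation Laboratory and Practice in Training Innovative Talent
Authors:
张红燕; 曾炼成
Journal:
计算机教育, 2012(13): 43-46. ISSN: 1672-5913
Author affiliations:
湖南农业大学信息科学技术学院, 湖南长沙, 410128; [曾炼成; 张红燕] 湖南农业大学
Keywords:
innovation laboratory; innovative talent; innovation ability; information engineering
Abstract:
Innovation laboratories play an important role in training innovative talent. This paper describes the construction, organization, and operation of the undergraduate innovation laboratory for the information engineering major at Hunan Agricultural University, and analyzes and discusses the results achieved. It concludes that building innovation laboratories is one effective way to improve undergraduates' hands-on practical ability and research innovation ability.
Language:
Chinese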
Improving accuracy for cancer classification with a new algorithm for genes selection
Authors:
Zhang, Hongyan; Wang, Haiyan*; Dai, Zhijun; Chen, Ming-shun; Yuan, Zheming
Journal:
BMC Bioinformatics, 2012, 13(1): 298. ISSN: 1471-2105
Corresponding author:
Wang, Haiyan
Author affiliations:
[Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.; [Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Agr Univ, Coll Biosafety Sci & Technol, Changsha 410128, Hunan, Peoples R China.; [Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.; [Chen, Ming-shun] Kansas State Univ, USDA ARS, Manhattan, KS 66506 USA.; [Chen, Ming-shun] Kansas State Univ, Dept Entomol, Manhattan, KS 66506 USA.
Corresponding institution:
[Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.
Keywords:
Linear Discriminant Analysis; Support Vector Machine Classifier; Feature Subset; Quadratic Discriminant Analysis; Informative Gene
Abstract:
Background: Even though the classification of cancer tissue samples based on gene expression data has advanced considerably in recent years, it faces great challenges to improve accuracy. One of the challenges is to establish an effective method that can select a parsimonious set of relevant genes. So far, most methods for gene selection in the literature focus on screening individual genes or pairs of genes without considering the possible interactions among genes. Here we introduce a new computational method named the Binary Matrix Shuffling Filter (BMSF). It not only overcomes the difficulty associated with the search schemes of traditional wrapper methods and the overfitting problem in a large-dimensional search space but also takes potential gene interactions into account during gene selection. This method, coupled with Support Vector Machine (SVM) for implementation, often selects a very small number of genes for easy model interpretability. Results: We applied our method to 9 two-class gene expression datasets involving human cancers. During the gene selection process, the set of genes to be kept in the model was recursively refined and repeatedly updated according to the effect of a given gene on the contributions of other genes in reference to their usefulness in cancer classification. The small number of informative genes selected from each dataset leads to significantly improved leave-one-out (LOOCV) classification accuracy across all 9 datasets for multiple classifiers. Our method also exhibits broad generalization in the genes selected since multiple commonly used classifiers achieved either equivalent or much higher LOOCV accuracy than those reported in the literature.
Conclusions: Evaluation of a gene's contribution to binary cancer classification is better considered after adjusting for the joint effect of a large number of other genes. A computationally efficient search scheme was provided to perform effective search in the extensive feature space that includes possible interactions of many genes. Performance of the algorithm applied to 9 datasets suggests that it is possible to improve the accuracy of cancer classification by a big margin when joint effects of many genes are considered. © 2012 Zhang et al.; licensee BioMed Central Ltd.
Language:
English