Cumulative risk assessment of phthalates in edible vegetable oil consumed by Chinese residents
Authors:
Xiang, Wei;Gong, Qin;Xu, Jian;Li, Kailong;Yu, Fengxiang;...
Journal:
Journal of the Science of Food and Agriculture, 2020, 100(3):1124-1131. ISSN: 0022-5142
Corresponding author:
Wang, Fangbin
Author affiliations:
[Xiang, Wei] Hunan Acad Agr Sci, Crop Res Inst, Changsha, Hunan, Peoples R China.;[Wang, Fangbin; Gong, Qin; Chen, Ting; Li, Can; Li, Kailong] Hunan Inst Food Qual Supervis Inspect & Res, Changsha 410111, Hunan, Peoples R China.;[Xu, Jian] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha, Hunan, Peoples R China.;[Yu, Fengxiang] Hunan Biol & Electromech Polytech, Dept Food Sci & Technol, Changsha, Hunan, Peoples R China.;[Qin, Si] Hunan Agr Univ, Coll Food Sci & Technol, Changsha, Hunan, Peoples R China.
Corresponding institution:
[Wang, Fangbin] Hunan Inst Food Qual Supervis Inspect & Res, Changsha 410111, Hunan, Peoples R China.
Keywords:
edible vegetable oil;phthalates;cumulative risk assessment;hazard index
Abstract:
BACKGROUND: Phthalates are widely used as plasticizers in various industries and have attracted wide international attention because of their reproductive toxicity. Chinese residents are often exposed to phthalates via edible vegetable oil. In the present study, gas chromatography–mass spectrometry was used to detect the two main phthalates, bis(2-ethylhexyl) phthalate (DEHP) and dibutyl phthalate (DBP), in four major edible vegetable oil sources: an edible oil blend, soybean oil, peanut oil and rapeseed oil (a total of 1016 samples), collected throughout China. Furthermore, cumulative risk assessment was used to estimate the reproductive health risk to Chinese residents posed by phthalates from edible vegetable oils. RESULTS: Both phthalates were detected in all four major edible vegetable oil sources. The phthalate with the highest detection rate was DBP (13.48%), followed by DEHP (7.78%). The results of the cumulative risk assessment showed that the hazard indices of these two phthalates in edible vegetable oils were less than 1, except in soybean oil. Nevertheless, the two phthalates had their lowest detection rates in soybean oil, which were 1.94% (DEHP) and 5.16% (DBP). In China, contamination levels of phthalates in the soils where oil crops are cultivated have a great influence on the phthalate concentrations in edible vegetable oils. CONCLUSION: It is recommended that Chinese residents who consume soybean oil choose well-known brands and regularly change the brand they consume. The phthalates in edible vegetable oils pose a relatively small reproductive health risk to Chinese residents. © 2019 Society of Chemical Industry.
Language:
English
Application-aware deadline constraint job scheduling mechanism on large-scale computational grid
Authors:
Tang, Xiaoyong;Liao, Xiaoyi*
Journal:
PLOS ONE, 2018, 13(11):e0207596. ISSN: 1932-6203
Corresponding author:
Liao, Xiaoyi
Author affiliations:
[Tang, Xiaoyong; Liao, Xiaoyi] Hunan Agr Univ, Southern Reg Collaborat Innovat Ctr Grain & Oil C, Coll Informat Sci & Technol, Changsha, Hunan, Peoples R China.;[Tang, Xiaoyong] Hunan Univ, Sch Informat Sci & Engn, Changsha, Hunan, Peoples R China.;[Tang, Xiaoyong] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Liao, Xiaoyi] Hunan Agr Univ, Southern Reg Collaborat Innovat Ctr Grain & Oil C, Coll Informat Sci & Technol, Changsha, Hunan, Peoples R China.
Keywords:
Jobs;Algorithms;Computer software;Network bandwidth;China;Forecasting;Geographic distribution;Genetic algorithms
Abstract:
Recently, computational Grids have proven to be a good solution for processing large-scale, computation-intensive problems. However, the heterogeneity and dynamics of resources and the diversity of application requirements have always been important factors affecting their performance. In response to these challenges, this work first builds a Grid job scheduling architecture that can dynamically monitor Grid computing center resources and make corresponding scheduling decisions. Second, a Grid job model is proposed to describe the application requirements. Third, this paper studies the characteristics of commercial interconnection networks used in Grids and forecasts job transmission time. Fourth, this paper proposes an application-aware job scheduling mechanism (AJSM) that includes a periodic scheduling flow and a heuristic application-aware deadline constraint job scheduling algorithm. The rigorous performance evaluation results clearly demonstrate that the proposed application-aware job scheduling mechanism can successfully schedule more Grid jobs than the existing algorithms. For successfully scheduled jobs, the proposed AJSM method is the best algorithm in terms of average job processing time and makespan.
Language:
English
Discovering essential proteins based on PPI network and protein complex
Authors:
Ren, Jun;Wang, Jianxin* ;Li, Min;Wu, Fangxiang
Journal:
INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2015, 12(1):24-43. ISSN: 1748-5673
Corresponding author:
Wang, Jianxin
Author affiliations:
[Ren, Jun; Li, Min; Wang, Jianxin] Cent S Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.;[Ren, Jun] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Wu, Fangxiang] Univ Saskatchewan, Dept Mech Engn, Saskatoon, SK S7N 5A9, Canada.;[Wu, Fangxiang] Univ Saskatchewan, Div Biomed Engn, Saskatoon, SK S7N 5A9, Canada.
Corresponding institution:
[Wang, Jianxin] Cent S Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.
Keywords:
centrality measure;complex centrality;essential proteins;harmonic centrality;PPI network;protein complex
Abstract:
Most computational methods for identifying essential proteins focus on the topological centrality of protein-protein interaction (PPI) networks. However, these methods have limitations, such as difficulty in identifying essential proteins with low centrality values and poor performance on incomplete PPI networks. In this paper, the protein complex is proven to be an important factor for determining protein essentiality, and a new centrality measure, complex centrality, is proposed. The weighted average of complex centrality and subgraph centrality, called harmonic centrality (HC), is proposed to predict essential proteins. It combines PPI network topology and protein complex information and performs better than methods based on the PPI network. The improvement is greater when the PPI network is incomplete. Furthermore, a weighted PPI network is generated by integrating cellular localisation and biological process into a PPI network. The performance of the HC measure improves by 5% in this weighted PPI network.
Language:
English
Binary matrix shuffling filter for feature selection in neuronal morphology classification
Authors:
Sun, Congwei;Dai, Zhijun;Zhang, Hongyan;Li, Lanzhi* ;Yuan, Zheming
Journal:
Computational and Mathematical Methods in Medicine, 2015, 2015:626975:1-626975:9. ISSN: 1748-6718
Corresponding author:
Li, Lanzhi
Author affiliations:
[Zhang, Hongyan; Li, Lanzhi; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Agr Univ, Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.;[Zhang, Hongyan; Li, Lanzhi; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Agr Univ, Hunan Prov Key Lab Biol & Control Plant Dis & Ins, Changsha 410128, Hunan, Peoples R China.;[Zhang, Hongyan] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Li, Lanzhi] Hunan Agr Univ, Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.
Abstract:
A prerequisite for understanding neuronal function and characteristics is to classify neurons correctly. Existing classification techniques are usually based on structural characteristics and employ principal component analysis to reduce the feature dimension. In this work, we classify neurons based on neuronal morphology. A new feature selection method, the binary matrix shuffling filter, was used in neuronal morphology classification. This method, coupled with a support vector machine for implementation, usually selects a small number of features for easy interpretation. The retained features are used to build classification models with support vector classification and two other commonly used classifiers. Compared with the reference feature selection methods, the binary matrix shuffling filter showed the best performance and exhibited broad generalization ability in five random replications of neuron datasets. In addition, the binary matrix shuffling filter was able to distinguish each neuron type from the other types correctly; for each neuron type, private features were also obtained. © 2015 Congwei Sun et al.
Language:
English
Informative gene selection and direct classification of tumor based on chi-square test of pairwise gene interactions
Authors:
Zhang, Hongyan;Li, Lanzhi;Luo, Chao;Sun, Congwei;Chen, Yuan;...
Journal:
BioMed Research International, 2014, 2014:589290. ISSN: 2314-6133
Corresponding author:
Yuan, Zheming
Author affiliations:
[Zhang, Hongyan; Li, Lanzhi; Chen, Yuan; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.;[Zhang, Hongyan; Luo, Chao] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Zhang, Hongyan; Li, Lanzhi; Chen, Yuan; Dai, Zhijun; Yuan, Zheming; Sun, Congwei] Hunan Prov Key Lab Biol & Control Plant Dis & Ins, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.
Abstract:
In efforts to discover disease mechanisms and improve clinical diagnosis of tumors, it is useful to mine profiles for informative genes with definite biological meanings and to build robust classifiers with high precision. In this study, we developed a new method for tumor-gene selection, the Chi-square test-based integrated rank gene and direct classifier (χ²-IRG-DC). First, we obtained the weighted integrated rank of gene importance from chi-square tests of single and pairwise gene interactions. Then, we sequentially introduced the ranked genes and removed redundant genes by using leave-one-out cross-validation of the chi-square test-based Direct Classifier (χ²-DC) within the training set to obtain informative genes. Finally, we determined the accuracy of independent test data by utilizing the genes obtained above with χ²-DC. Furthermore, we analyzed the robustness of χ²-IRG-DC by comparing the generalization performance of different models, the efficiency of different feature-selection methods, and the accuracy of different classifiers. An independent test of ten multiclass tumor gene-expression datasets showed that χ²-IRG-DC could efficiently control overfitting and had higher generalization performance. The informative genes selected by χ²-IRG-DC could dramatically improve the independent test precision of other classifiers; meanwhile, the informative genes selected by other feature selection methods also performed well in χ²-DC. © 2014 Hongyan Zhang et al.
Language:
English
Virtual medical plant modeling based on L-system
Authors:
Ding, Dehong;Fang, Kui* ;Jing, Song;Bo, Liu;Bo, Qiao;...
Journal:
AFRICAN HEALTH SCIENCES, 2014, 14(4):1056-1062. ISSN: 1680-6905
Corresponding author:
Fang, Kui
Author affiliations:
[Fang, Kui; Bo, Liu; Bo, Qiao; Ding, Dehong; Jing, Song] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Ding, Dehong] HeZhou Univ, Coll Comp Sci & Informat Technol, Hezhou 542800, Peoples R China.;[Yu, Hexing] Hunan Agr Univ, Coll Resource & Environm, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Fang, Kui] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.
Keywords:
Drug R&D;L-system;fractals;medical plants;quasi binary-trees;toxicity
Abstract:
Background: Searching for drug molecules in medicinal plants has become increasingly popular, given that herbal components are widely considered to be safe. In medical virtual plant studies, development rules are difficult to extract, the construction of plant organs is highly dependent on equipment, and the process is complicated. Aim: To establish a three-dimensional structural virtual plant growth model. Methods: The quasi-binary tree structure and its properties were obtained through research on binary tree theory; then the relationship between the quasi-binary tree structure and the plant three-dimensional branching structure model was analyzed, and the three-dimensional morphology of plants was described. Results: A three-dimensional plant branch structure pattern extraction algorithm based on the quasi-binary tree structure was developed. Using the 3-D L-system method, the extracted rules were systematized and standardized. Furthermore, we built a comprehensive L-model system. With the aid of graphics and PlantVR, we reconstructed the plant shape and 3-D structure. Conclusion: A three-dimensional structure virtual plant growth model based on a time-controlled L-system has been successfully established.
Language:
English
Identifying hierarchical and overlapping protein complexes based on essential protein-protein interactions and "seed-expanding" method
Authors:
Ren, Jun;Zhou, Wei;Wang, Jianxin*
Journal:
BioMed Research International, 2014, 2014:838714. ISSN: 2314-6133
Corresponding author:
Wang, Jianxin
Author affiliations:
[Ren, Jun] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Ren, Jun; Wang, Jianxin] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.;[Zhou, Wei] Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Wang, Jianxin] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.
Abstract:
Much evidence has demonstrated that protein complexes are overlapping and hierarchically organized in PPI networks. Meanwhile, the large size of PPI networks requires complex detection methods to have low time complexity. Up to now, few methods can quickly identify overlapping and hierarchical protein complexes in a PPI network. In this paper, a novel method, called MCSE, is proposed based on the λ-module and "seed-expanding." First, it chooses seeds as essential PPIs or edges with high edge clustering values. Then, it identifies protein complexes by expanding each seed to a λ-module. MCSE is suitable for large PPI networks because of its low time complexity. MCSE can identify overlapping protein complexes naturally because a protein can be visited by different seeds. MCSE uses the parameter λ-th to control the range of seed expanding and can detect a hierarchical organization of protein complexes by tuning the value of λ-th. Experimental results on S. cerevisiae show that this hierarchical organization is similar to that of known complexes in the MIPS database. The experimental results also show that MCSE outperforms previous competing algorithms, such as CPM, CMC, Core-Attachment, Dpclus, HC-PIN, MCL, and NFC, in terms of functional enrichment and matching with known protein complexes. © 2014 Jun Ren et al.
Language:
English
Cross-correlation detection and analysis for California's electricity market based on analogous multifractal analysis
Authors:
Wang, Fang;Liao, Gui-ping* ;Li, Jian-hui;Zou, Rui-biao;Shi, Wen
Journal:
CHAOS, 2013, 23(1):013129. ISSN: 1054-1500
Corresponding author:
Liao, Gui-ping
Author affiliations:
[Zou, Rui-biao; Shi, Wen; Wang, Fang] Hunan Agr Univ, Coll Sci, Changsha 410128, Hunan, Peoples R China.;[Liao, Gui-ping; Wang, Fang; Li, Jian-hui] Hunan Agr Univ, Agr Informat Inst, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Liao, Gui-ping] Hunan Agr Univ, Agr Informat Inst, Changsha 410128, Hunan, Peoples R China.
Abstract:
A novel method, which we call analogous multifractal cross-correlation analysis, is proposed in this paper to study the multifractal behavior in the power-law cross-correlation between price and load in the California electricity market. In addition, a statistic rho(AMF-XA), which we call the analogous multifractal cross-correlation coefficient, is defined to test whether the cross-correlation between two given signals is genuine. Our analysis finds that both the price and load time series in the California electricity market exhibit a multifractal nature. However, as indicated by the rho(AMF-XA) statistical test, there is a huge difference in the cross-correlation behavior between the years 1999 and 2000 in the California electricity market. © 2013 American Institute of Physics. [http://dx.doi.org/10.1063/1.4793355]
Language:
English
Identifying protein complexes based on density and modularity in protein-protein interaction network
Authors:
Ren, Jun;Wang, Jianxin* ;Li, Min;Wang, Lusheng
Journal:
BMC Systems Biology, 2013, 7(4):1-15. ISSN: 1752-0509
Corresponding author:
Wang, Jianxin
Author affiliations:
[Ren, Jun; Li, Min; Wang, Jianxin] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.;[Ren, Jun] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Wang, Lusheng] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China.
Corresponding institution:
[Wang, Jianxin] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.
Keywords:
Protein Complex;Neighbor Vertex;Dense Subgraph;Undirected Weighted Graph;Predict Protein Complex
Abstract:
Identifying protein complexes is crucial to understanding the principles of cellular organization and functional mechanisms. Because much evidence indicates that subgraphs with high density or high modularity in a PPI network usually correspond to protein complexes, PPI-network-based methods for detecting protein complexes have focused on a subgraph's density or its modularity. However, dense subgraphs may have low modularity, and subgraphs with high modularity may have low density, so protein complexes may be subgraphs with low modularity or low density in the PPI network. Because density-based methods have difficulty mining protein complexes with low density, and modularity-based methods have difficulty mining protein complexes with low modularity, both kinds of method are limited in identifying protein complexes of varying density and modularity. To identify protein complexes with various densities and modularities, including those that have low density but high modularity and those that have low modularity but high density, we define a novel subgraph fitness, $f_\rho$, as $f_\rho = (\mathrm{density})^{\rho} \cdot (\mathrm{modularity})^{1-\rho}$, and propose a novel algorithm, named LF_PIN, to identify protein complexes by expanding seed edges to subgraphs with the local maximum fitness value. Experimental results of LF-PIN in S. cerevisiae show that, compared with the results when fitness equals density (ρ = 1) or modularity (ρ = 0), LF-PIN identifies known protein complexes more effectively when the fitness value is determined by both density and modularity (0 < ρ < 1). Compared with seven competing protein complex detection methods (CMC, Core-Attachment, CPM, DPClus, HC-PIN, MCL, and NFC) in S. cerevisiae and E. coli, LF-PIN outperforms all seven in terms of matching with known complexes and functional enrichment. Moreover, LF-PIN performs better in identifying protein complexes with low density or low modularity. By considering both density and modularity, LF-PIN outperforms protein complex detection methods that consider only density or only modularity, especially in identifying known protein complexes with low density or low modularity.
Language:
English
TSG: A new algorithm for binary and multi-class cancer classification and informative genes selection
Authors:
Wang, Haiyan;Zhang, Hongyan;Dai, Zhijun;Chen, Ming-shun;Yuan, Zheming*
Journal:
BMC Medical Genomics, 2013, 6(SUPPL.1):1-14. ISSN: 1755-8794
Corresponding author:
Yuan, Zheming
Author affiliations:
[Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.;[Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Util, Changsha 410128, Hunan, Peoples R China.;[Zhang, Hongyan] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Agr Univ, Coll Bio Safety Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Chen, Ming-shun] USDA ARS, Manhattan, KS 66506 USA.
Corresponding institution:
[Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Util, Changsha 410128, Hunan, Peoples R China.
Conference name:
International Conference on Bioinformatics and Computational Biology (BIOCOMP)
Conference date:
July 18-21, 2011
Conference location:
Las Vegas, NV
Conference organizer:
[Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.^[Zhang, Hongyan;Dai, Zhijun;Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Util, Changsha 410128, Hunan, Peoples R China.^[Zhang, Hongyan] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.^[Zhang, Hongyan;Dai, Zhijun;Yuan, Zheming] Hunan Agr Univ, Coll Bio Safety Sci & Technol, Changsha 410128, Hunan, Peoples R China.^[Chen, Ming-shun] USDA ARS, Manhattan, KS 66506 USA.^[Chen, Ming-shun] Kansas State Univ, Dept Entomol, Manhattan, KS 66506 USA.
Keywords:
Support Vector Machine;Chisquare Statistic;Family Classifier;Marker Pair;Informative Gene
Abstract:
Background: One of the challenges in classification of cancer tissue samples based on gene expression data is to establish an effective method that can select a parsimonious set of informative genes. The Top Scoring Pair (TSP), k-Top Scoring Pairs (k-TSP), Support Vector Machines (SVM), and prediction analysis of microarrays (PAM) are four popular classifiers that have comparable performance on multiple cancer datasets. SVM and PAM tend to use a large number of genes, and TSP and k-TSP always use an even number of genes. In addition, the selection of distinct gene pairs in k-TSP simply combines the pairs of top-ranking genes without considering the fact that the gene set with the best discrimination power may not be the combined pairs. The k-TSP algorithm also needs the user to specify an upper bound for the number of gene pairs. Here we introduce a computational algorithm to address these problems. The algorithm is named the Chisquare-statistic-based Top Scoring Genes (Chi-TSG) classifier, abbreviated as TSG. Results: The TSG classifier starts with the top two genes and sequentially adds additional genes into the candidate gene set to perform informative gene selection. The algorithm automatically reports the total number of informative genes selected with cross validation. We provide the algorithm for both binary and multi-class cancer classification. The algorithm was applied to 9 binary and 10 multi-class gene expression datasets involving human cancers. The TSG classifier outperforms TSP family classifiers by a large margin in most of the 19 datasets. In addition to improved accuracy, our classifier shares all the advantages of the TSP family classifiers, including easy interpretation, invariance to monotone transformation, frequent selection of a small number of informative genes that allows follow-up studies, and resistance to sampling variations due to within-sample operations.
Conclusions: Redefining the scores for gene sets and the classification rules in TSP family classifiers by incorporating the sample size information can lead to better selection of informative genes and better classification accuracy. The resulting TSG classifier offers a useful tool for cancer classification based on numerical molecular data. © 2013 Yuan; licensee BioMed Central Ltd.
Language:
English
Multi-objective dynamic population shuffled frog-leaping biclustering of microarray data
Authors:
Liu, Junwan;Li, Zhoujun;Hu, Xiaohua* ;Chen, Yiming;Liu, Feifei
Journal:
BMC Genomics, 2012, 13(3):1-11. ISSN: 1471-2164
Corresponding author:
Hu, Xiaohua
Author affiliations:
[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp & Informat Engn, Changsha 410004, Hunan, Peoples R China.;[Hu, Xiaohua] Cent China Normal Univ, Dept Comp Sci, Wuhan 430079, Peoples R China.;[Li, Zhoujun] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China.;[Li, Zhoujun] Beihang Univ, Beijing Key Lab Network Technol, Beijing 100191, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Informat Sci, Philadelphia, PA 19104 USA.
Corresponding institution:
[Hu, Xiaohua] Cent China Normal Univ, Dept Comp Sci, Wuhan 430079, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference date:
November 12-15, 2011
Conference location:
Atlanta, GA
Conference organizer:
[Hu, Xiaohua] Cent China Normal Univ, Dept Comp Sci, Wuhan 430079, Peoples R China.^[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp & Informat Engn, Changsha 410004, Hunan, Peoples R China.^[Li, Zhoujun] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China.^[Li, Zhoujun] Beihang Univ, Beijing Key Lab Network Technol, Beijing 100191, Peoples R China.^[Hu, Xiaohua] Drexel Univ, Coll Informat Sci, Philadelphia, PA 19104 USA.^[Chen, Yiming] Hunan Agr Univ, Sch Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.
Keywords:
Particle Swarm Optimization;Pareto Front;Microarray Dataset;MOPSO;Discrete Particle Swarm Optimization
Abstract:
Multi-objective optimization (MOO) involves optimization problems with multiple objectives. Generally, these objectives are used to estimate very different aspects of the solutions, and these aspects are often in conflict with each other. MOO first obtains a Pareto set and then looks for both commonality and systematic variations across the set. For large-scale data sets, heuristic search algorithms such as evolutionary algorithms (EA) combined with MOO techniques are ideal. The newly developed DNA microarray technology can measure the transcriptional response of a complete genome to different experimental conditions and yields many large-scale datasets. Biclustering can simultaneously cluster the rows and columns of a dataset and helps to extract more accurate information from those datasets. Biclustering needs to optimize several conflicting objectives and can be solved with MOO methods. As a heuristics-based optimization approach, particle swarm optimization (PSO) simulates the movements of a bird flock finding food. The shuffled frog-leaping algorithm (SFL) is a population-based cooperative search metaphor combining the benefits of the local search of PSO and the global shuffling of information of the complex evolution technique. SFL is used to solve the optimization problems of large-scale datasets. This paper integrates a dynamic population strategy and the shuffled frog-leaping algorithm into biclustering of microarray data, and proposes a novel multi-objective dynamic population shuffled frog-leaping biclustering (MODPSFLB) algorithm to mine maximum biclusters from microarray data. Experimental results show that the proposed MODPSFLB algorithm can effectively find significant biological structures in terms of related biological processes, components and molecular functions. The proposed MODPSFLB algorithm has good diversity and fast convergence of Pareto solutions and will become a powerful tool for systematic functional analysis in genome research.
Language:
English
Improving accuracy for cancer classification with a new algorithm for genes selection
Authors:
Zhang, Hongyan;Wang, Haiyan* ;Dai, Zhijun;Chen, Ming-shun;Yuan, Zheming
Journal:
BMC Bioinformatics, 2012, 13(1):298. ISSN: 1471-2105
Corresponding author:
Wang, Haiyan
Author affiliations:
[Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Prov Key Lab Crop Germplasm Innovat & Utili, Changsha 410128, Hunan, Peoples R China.;[Zhang, Hongyan; Dai, Zhijun; Yuan, Zheming] Hunan Agr Univ, Coll Biosafety Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.;[Chen, Ming-shun] Kansas State Univ, USDA ARS, Manhattan, KS 66506 USA.;[Chen, Ming-shun] Kansas State Univ, Dept Entomol, Manhattan, KS 66506 USA.
Corresponding institution:
[Wang, Haiyan] Kansas State Univ, Dept Stat, Manhattan, KS 66506 USA.
Keywords:
Linear Discriminant Analysis;Support Vector Machine Classifier;Feature Subset;Quadratic Discriminant Analysis;Informative Gene
Abstract:
Background: Even though the classification of cancer tissue samples based on gene expression data has advanced considerably in recent years, improving accuracy remains a great challenge. One of the challenges is to establish an effective method that can select a parsimonious set of relevant genes. So far, most methods for gene selection in the literature focus on screening individual genes or pairs of genes without considering the possible interactions among genes. Here we introduce a new computational method named the Binary Matrix Shuffling Filter (BMSF). It not only overcomes the difficulty associated with the search schemes of traditional wrapper methods and the overfitting problem in a large-dimensional search space but also takes potential gene interactions into account during gene selection. This method, coupled with a Support Vector Machine (SVM) for implementation, often selects a very small number of genes for easy model interpretability. Results: We applied our method to 9 two-class gene expression datasets involving human cancers. During the gene selection process, the set of genes to be kept in the model was recursively refined and repeatedly updated according to the effect of a given gene on the contributions of other genes in reference to their usefulness in cancer classification. The small number of informative genes selected from each dataset leads to significantly improved leave-one-out cross-validation (LOOCV) classification accuracy across all 9 datasets for multiple classifiers. Our method also exhibits broad generalization in the genes selected, since multiple commonly used classifiers achieved either equivalent or much higher LOOCV accuracy than those reported in the literature. Conclusions: A gene's contribution to binary cancer classification is better evaluated after adjusting for the joint effect of a large number of other genes. A computationally efficient search scheme was provided to perform effective search in the extensive feature space that includes possible interactions of many genes. Performance of the algorithm applied to 9 datasets suggests that it is possible to improve the accuracy of cancer classification by a large margin when joint effects of many genes are considered. © 2012 Zhang et al.; licensee BioMed Central Ltd.
Language:
English
Identification of hierarchical and overlapping functional modules in PPI networks
Authors:
Wang, Jianxin* ;Ren, Jun;Li, Min;Wu, Fang-Xiang
Journal:
IEEE Transactions on NanoBioscience, 2012, 11(4):386-393. ISSN: 1536-1241
Corresponding author:
Wang, Jianxin
Author affiliations:
[Ren, Jun; Li, Min; Wang, Jianxin] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.;[Ren, Jun] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Wu, Fang-Xiang] Univ Saskatchewan, Dept Mech Engn, Saskatoon, SK S7N 5A9, Canada.;[Wu, Fang-Xiang] Univ Saskatchewan, Div Biomed Engn, Saskatoon, SK S7N 5A9, Canada.
Corresponding institution:
[Wang, Jianxin] Cent South Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.
Keywords:
Clustering coefficient;PPI networks;hierarchical clustering algorithm;overlapping and hierarchical module
Abstract:
Various lines of evidence have demonstrated that functional modules are overlapping and hierarchically organized in protein-protein interaction (PPI) networks. Up to now, few methods have been able to identify both overlapping and hierarchical functional modules in PPI networks. In this paper, a new hierarchical clustering algorithm, called OH-PIN, is proposed based on the overlapping M, λ-module, and a new concept of clustering coefficient between two clusters. By recursively merging the two clusters with the maximum clustering coefficient, OH-PIN finally assembles all M into λ-modules. Since the Ms are overlapping, the λ-modules based on them are also overlapping. Thus, OH-PIN can detect a hierarchical organization of overlapping modules by tuning the value of λ. This hierarchical organization is similar to that of GO annotations and that of the known complexes in MIPS. To compare the performance of OH-PIN with that of other existing competing algorithms, we apply them to the yeast PPI network. The experimental results show that OH-PIN outperforms the existing algorithms in terms of functional enrichment and matching with known protein complexes. © 2011 IEEE.
Language:
English
Identifying differentially expressed genes in cancer patients using a non-parameter Ising model
Authors:
Li, Xumeng; Feltus, Frank A.; Sun, Xiaoqian; Wang, James Z.; Luo, Feng*
Journal:
PROTEOMICS, 2011, 11(19):3845-3852. ISSN: 1615-9853
Corresponding author:
Luo, Feng
Author affiliations:
[Wang, James Z.; Luo, Feng; Li, Xumeng] Clemson Univ, Sch Comp, Clemson, SC 29634 USA.;[Li, Xumeng] Hunan Agr Univ, Dept Informat & Comp Sci, Changsha, Hunan, Peoples R China.;[Feltus, Frank A.] Clemson Univ, Dept Biochem & Genet, Clemson, SC 29634 USA.;[Sun, Xiaoqian] Clemson Univ, Dept Math Sci, Clemson, SC 29634 USA.
Corresponding institution:
[Luo, Feng] Clemson Univ, Sch Comp, Clemson, SC 29634 USA.
Keywords:
Bioinformatics;Differentially expressed genes;Ising model;Microarray;Protein interaction network
Abstract:
Identification of genes and pathways involved in diseases and physiological conditions is a major task in systems biology. In this study, we developed a novel non-parameter Ising model to integrate protein-protein interaction network and microarray data for identifying differentially expressed (DE) genes. We also proposed a simulated annealing algorithm to find the optimal configuration of the Ising model. The Ising model was applied to two breast cancer microarray data sets. The results showed that the Ising model identified more cancer-related DE sub-networks and genes than the Markov random field model. Furthermore, cross-validation experiments showed that DE genes identified by the Ising model can improve classification performance compared with DE genes identified by the Markov random field model. © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Language:
English
Dynamic biclustering of microarray data by multi-objective immune optimization
Authors:
Liu, Junwan*; Li, Zhoujun; Hu, Xiaohua; Chen, Yiming; Park, E. K.
Journal:
BMC Genomics, 2011, 12(2):1-7. ISSN: 1471-2164
Corresponding author:
Liu, Junwan
Author affiliations:
[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp & Informat Engn, Changsha 410004, Hunan, Peoples R China.;[Li, Zhoujun] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China.;[Li, Zhoujun] BeiHang Univ, Beijing Key Lab Network Technol, Beijing 100191, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA.;[Chen, Yiming] Hunan Agr Univ, Sch Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.
Corresponding institution:
[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp & Informat Engn, Changsha 410004, Hunan, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine
Conference dates:
Dec 18-21, 2010
Conference location:
Hong Kong, Peoples R China
Conference organizer:
[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp & Informat Engn, Changsha 410004, Hunan, Peoples R China.^[Li, Zhoujun] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China.^[Li, Zhoujun] BeiHang Univ, Beijing Key Lab Network Technol, Beijing 100191, Peoples R China.^[Hu, Xiaohua] Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA.^[Chen, Yiming] Hunan Agr Univ, Sch Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.^[Park, E. K.] CSI CUNY, Staten Isl, NY USA.
Keywords:
Pareto Front;Pareto Optimal Solution;Microarray Dataset;Human Dataset;Yeast Dataset
Abstract:
Background: New microarray technologies yield large-scale datasets. The microarray datasets are usually presented in 2D matrices, where rows represent genes and columns represent experimental conditions. Systematic analysis of those datasets provides an increasing amount of information, which is urgently needed in the post-genomic era. Biclustering, which is a technique developed to allow simultaneous clustering of rows and columns of a dataset, might be useful to extract more accurate information from those datasets. Biclustering requires the optimization of two conflicting objectives (residue and volume), and a multi-objective artificial immune system capable of performing a multi-population search. As a heuristic search technique, artificial immune systems (AISs) can be considered a new computational paradigm inspired by the immunological system of vertebrates and designed to solve a wide range of optimization problems. During biclustering, several objectives in conflict with each other have to be optimized simultaneously, so a multi-objective optimization model is suitable for solving the biclustering problem. Results: Based on a dynamic population, this paper proposes a novel dynamic multi-objective immune optimization biclustering (DMOIOB) algorithm to mine coherent patterns from microarray data. Experimental results on two common and public datasets of gene expression profiles show that our approach can effectively find significant localized structures related to sets of genes that show consistent expression patterns across subsets of experimental conditions. The mined patterns present a significant biological relevance in terms of related biological processes, components and molecular functions in a species-independent manner. Conclusions: The proposed DMOIOB algorithm is an efficient tool to analyze large microarray datasets. It achieves a good diversity and rapid convergence. © 2011 Liu et al; licensee BioMed Central Ltd.
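The two conflicting objectives named in the abstract above, residue and volume, can be illustrated with a small sketch. It assumes the residue is the mean squared residue commonly used in the biclustering literature (Cheng and Church) and that volume is the number of cells in the bicluster; the abstract does not state the exact definitions used by DMOIOB, and the function names and toy matrix are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the sub-matrix picked out by rows x cols.

    Assumption: this is the Cheng-Church definition, not necessarily
    the exact objective used in the paper.
    """
    n_rows, n_cols = len(rows), len(cols)
    # Mean of each selected row over the selected columns, and vice versa.
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    overall_mean = sum(row_mean.values()) / n_rows
    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + overall_mean
            total += residue * residue
    return total / (n_rows * n_cols)


def volume(rows, cols):
    """Number of cells covered by the bicluster (assumed definition)."""
    return len(rows) * len(cols)


# Toy expression matrix: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [0.0, 8.0, 2.0, 5.0],
]

# Genes 0-2 follow a purely additive pattern over conditions 0-2,
# so the residue of this bicluster is close to 0.
small = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2])
# Adding the incoherent gene 3 raises the volume but also the residue.
large = mean_squared_residue(expression, [0, 1, 2, 3], [0, 1, 2])
print(small, volume([0, 1, 2], [0, 1, 2]))
print(large, volume([0, 1, 2, 3], [0, 1, 2]))
```

The example shows why the objectives conflict: enlarging a bicluster increases its volume but tends to increase its residue as well, so no single bicluster is best on both.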
Language:
English
Predicting gene function using few positive examples and unlabeled ones
Authors:
Chen, Yiming; Li, Zhoujun*; Wang, Xiaofeng; Feng, Jiali; Hu, Xiaohua
Journal:
BMC Genomics, 2010, 11(2):1-9. ISSN: 1471-2164
Corresponding author:
Li, Zhoujun
Author affiliations:
[Chen, Yiming] Natl Univ Def Technol, Comp Sch, Changsha, Hunan, Peoples R China.;[Li, Zhoujun] BeiHang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China.;[Chen, Yiming] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha, Hunan, Peoples R China.;[Feng, Jiali; Wang, Xiaofeng] Shanghai Maritime Univ, Coll Informat Engn, Shanghai, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA.
Corresponding institution:
[Li, Zhoujun] BeiHang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China.
Keywords:
Annotate Gene;Unknown Gene;Functional Term;Class Imbalance Problem;Predict Gene Function
Abstract:
Background: A large amount of functional genomic data has provided enough knowledge to predict gene function computationally, using known functional annotations and the relationships between unknown genes and known ones to map unknown genes to GO functional terms. The prediction procedure is usually formulated as a binary classification problem. Training a binary classifier needs both positive and negative examples in roughly equal numbers. However, from the various annotation databases, we can obtain only a few positive gene annotations for most functional terms; that is, there are only a few positive examples for training the classifier, which makes predicting gene function directly infeasible. Results: We propose a novel approach, SPE_RNE, to train a classifier for each functional term. Firstly, the positive example set is enlarged by creating synthetic positive examples. Secondly, representative negative examples are selected by training an SVM (support vector machine) iteratively to move the classification hyperplane to an appropriate place. Lastly, an optimal SVM classifier is trained using a grid search technique. On a combined kernel of yeast protein sequence, microarray expression, protein-protein interaction and GO functional annotation data, we compare SPE_RNE with three other typical methods on three classical performance measures, recall R, precision P and their combination F: twoclass considers all unlabeled genes as negative examples, twoclassbal randomly selects the same number of negative examples from the unlabeled genes, and PSoL selects a set of negative examples that are far from the positive examples and far from each other. Conclusions: On test data and unknown gene data, we compute the average and variance of measure F. The experiments show that our approach has better generalization performance and practical prediction capacity. In addition, our method can also be used for other organisms such as human. © 2010 Li et al; licensee BioMed Central Ltd.
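For readers unfamiliar with the three performance measures named in the abstract above, the following is a minimal sketch. It assumes that F is the balanced harmonic mean of precision and recall (often written F1); the abstract only calls it "their combination", so the exact weighting is an assumption, and the function name and gene identifiers are hypothetical.

```python
def precision_recall_f(predicted, actual):
    """Return (precision, recall, F) for two collections of gene identifiers.

    Assumption: F is the balanced harmonic mean of precision and recall.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: four genes predicted for a functional term,
# three genes truly annotated with it, two in common.
p, r, f = precision_recall_f(
    predicted=["g1", "g2", "g3", "g4"],
    actual=["g1", "g2", "g5"],
)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.5 0.667 0.571
```

A combined measure of this kind is useful under class imbalance because it penalises a classifier that reaches high recall only by labelling almost every gene positive.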
Language:
English
Biclustering of microarray data with MOSPO based on crowding distance
Authors:
Liu, Junwan*; Li, Zhoujun; Hu, Xiaohua; Chen, Yiming
Journal:
BMC Bioinformatics, 2009, 10(4):1-10. ISSN: 1471-2105
Corresponding author:
Liu, Junwan
Author affiliations:
[Li, Zhoujun; Chen, Yiming; Liu, Junwan] Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China.;[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp Sci, Changsha, Hunan, Peoples R China.;[Li, Zhoujun] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA.;[Chen, Yiming] Hunan Agr Univ, Sch Informat Sci & Technol, Changsha, Hunan, Peoples R China.
Corresponding institution:
[Liu, Junwan] Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine
Conference dates:
Nov 3-5, 2008
Conference location:
Philadelphia, PA
Conference organizer:
[Liu, Junwan;Li, Zhoujun;Chen, Yiming] Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China.^[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp Sci, Changsha, Hunan, Peoples R China.^[Li, Zhoujun] Beihang Univ, Sch Comp Sci & Engn, Beijing, Peoples R China.^[Hu, Xiaohua] Drexel Univ, Coll Informat Sci & Technol, Philadelphia, PA 19104 USA.^[Chen, Yiming] Hunan Agr Univ, Sch Informat Sci & Technol, Changsha, Hunan, Peoples R China.
Keywords:
Gene Ontology;Particle Swarm Optimization;Particle Swarm;Pareto Front;Microarray Dataset
Abstract:
Background: High-throughput microarray technologies have generated and accumulated massive amounts of gene expression datasets that contain expression levels of thousands of genes under hundreds of different experimental conditions. The microarray datasets are usually presented in 2D matrices, where rows represent genes and columns represent experimental conditions. The analysis of such datasets can discover local structures composed of sets of genes that show coherent expression patterns under subsets of experimental conditions. This has led to the development of sophisticated algorithms capable of extracting novel and useful knowledge from a biomedical point of view. In the medical domain, these patterns are useful for understanding various diseases, and aid in more accurate diagnosis, prognosis, treatment planning, as well as drug discovery. Results: In this work we present CMOPSOB (Crowding distance based Multi-objective Particle Swarm Optimization Biclustering), a novel clustering approach for microarray datasets that clusters genes and conditions highly related in sub-portions of the microarray data. The objective of biclustering is to find sub-matrices, i.e. maximal subgroups of genes and subgroups of conditions where the genes exhibit highly correlated activities over a subset of conditions. Since these objectives are mutually conflicting, they become suitable candidates for multi-objective modelling. Our approach CMOPSOB is based on a heuristic search technique, multi-objective particle swarm optimization, which simulates the movements of a flock of birds searching for food. Meanwhile, the nearest neighbour search strategies based on crowding distance and ε-dominance can rapidly converge to the Pareto front and guarantee diversity of solutions. We compare the potential of this methodology with other biclustering algorithms by analyzing two common and public datasets of gene expression profiles. In all cases our method can find localized structures related to sets of genes that show consistent expression patterns across subsets of experimental conditions. The mined patterns present a significant biological relevance in terms of related biological processes, components and molecular functions in a species-independent manner. Conclusion: The proposed CMOPSOB algorithm is successfully applied to biclustering of microarray datasets. It achieves a good diversity in the obtained Pareto front, and rapid convergence. Therefore, it is a useful tool to analyze large microarray datasets. © 2009 Liu et al; licensee BioMed Central Ltd.
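The crowding distance mentioned in the abstract above can be sketched as follows. This is the standard crowding-distance computation popularised by NSGA-II, offered only as an illustration of the concept; the abstract does not say whether CMOPSOB uses exactly this form, and the function name and sample front are hypothetical.

```python
def crowding_distance(points):
    """Crowding distance of each point in a set of objective vectors.

    Assumption: standard NSGA-II style definition. Boundary points in any
    objective receive infinity so that they are always preferred.
    """
    n = len(points)
    if n == 0:
        return []
    distance = [0.0] * n
    n_objectives = len(points[0])
    for k in range(n_objectives):
        # Indices of the points sorted by the k-th objective.
        order = sorted(range(n), key=lambda i: points[i][k])
        lowest, highest = points[order[0]][k], points[order[-1]][k]
        distance[order[0]] = distance[order[-1]] = float("inf")
        if highest == lowest:
            continue  # all values equal: this objective adds nothing
        for pos in range(1, n - 1):
            gap = points[order[pos + 1]][k] - points[order[pos - 1]][k]
            distance[order[pos]] += gap / (highest - lowest)
    return distance


# Hypothetical two-objective front (for example residue vs. another cost).
front = [(1.0, 5.0), (2.0, 3.0), (4.0, 2.0), (5.0, 1.0)]
print(crowding_distance(front))  # [inf, 1.5, 1.25, inf]
```

Points with a larger crowding distance lie in sparser regions of the front, so favouring them when selecting leaders or pruning an archive helps keep the solutions spread out.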
Language:
English