Abstract:
This paper addresses the problem of scheduling the precedence-constrained tasks of a parallel application, with random task processing times and edge communication times, on Grid computing systems so as to minimize the makespan in a stochastic environment. This is a difficult problem, and few efforts to solve it have been reported in the literature. The problem is first formulated as a stochastic scheduling model on Grid systems. Then, a stochastic heterogeneous earliest finish time (SHEFT) scheduling algorithm is developed that incorporates the expected value and variance of stochastic processing time into scheduling. Our rigorous performance evaluation study, based on randomly generated stochastic parallel application DAGs, shows that the proposed SHEFT scheduling algorithm performs much better than existing scheduling algorithms in terms of makespan, speedup, and makespan standard deviation.
Keywords:
Support Vector Machine;Chisquare Statistic;Family Classifier;Marker Pair;Informative Gene
Abstract:
One of the challenges in classifying cancer tissue samples based on gene expression data is to establish an effective method that can select a parsimonious set of informative genes. The Top Scoring Pair (TSP), k-Top Scoring Pairs (k-TSP), Support Vector Machines (SVM), and prediction analysis of microarrays (PAM) are four popular classifiers that have comparable performance on multiple cancer datasets. SVM and PAM tend to use a large number of genes, while TSP and k-TSP always use an even number of genes. In addition, k-TSP selects distinct gene pairs by simply combining the pairs of top-ranking genes, without considering that the gene set with the best discrimination power may not be a combination of such pairs. The k-TSP algorithm also requires the user to specify an upper bound on the number of gene pairs. Here we introduce a computational algorithm to address these problems. The algorithm is named the Chisquare-statistic-based Top Scoring Genes (Chi-TSG) classifier, abbreviated as TSG. The TSG classifier starts with the top two genes and sequentially adds genes to the candidate gene set to perform informative gene selection. The algorithm automatically reports the total number of informative genes selected with cross validation. We provide the algorithm for both binary and multi-class cancer classification. The algorithm was applied to 9 binary and 10 multi-class gene expression datasets involving human cancers. The TSG classifier outperforms TSP family classifiers by a large margin in most of the 19 datasets. In addition to improved accuracy, our classifier shares all the advantages of the TSP family classifiers, including easy interpretation, invariance to monotone transformation, a tendency to select a small number of informative genes that allows follow-up studies, and resistance to sampling variations due to within-sample operations.
Redefining the scores for gene set and the classification rules in TSP family classifiers by incorporating the sample size information can lead to better selection of informative genes and classification accuracy. The resulting TSG classifier offers a useful tool for cancer classification based on numerical molecular data.
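The pair score used by the TSP family, and the invariance to monotone transformation mentioned above, can be illustrated with a minimal sketch. It assumes the commonly stated TSP score, |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|, and does not reproduce the chi-square-based TSG score, which the abstract does not spell out. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations

def tsp_score(samples, labels, i, j):
    """Commonly stated TSP score of gene pair (i, j):
    |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|."""
    freq = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        freq[cls] = sum(1 for s in rows if s[i] < s[j]) / len(rows)
    return abs(freq[0] - freq[1])

def top_scoring_pair(samples, labels):
    """Return the gene pair with the highest score (first pair wins on ties)."""
    n_genes = len(samples[0])
    return max(combinations(range(n_genes), 2),
               key=lambda pair: tsp_score(samples, labels, *pair))

# Hypothetical toy data: 4 samples x 3 genes, two classes.
samples = [
    [1.0, 5.0, 3.0],
    [2.0, 6.0, 7.0],
    [7.0, 3.0, 3.1],
    [8.0, 1.0, 9.0],
]
labels = [0, 0, 1, 1]

best = top_scoring_pair(samples, labels)
best_score = tsp_score(samples, labels, *best)

# The score depends only on within-sample orderings, so a strictly
# increasing transform such as log leaves it unchanged.
log_samples = [[math.log(v) for v in s] for s in samples]
log_score = tsp_score(log_samples, labels, *best)
```

Because only the ordering of two expression values inside one sample matters, normalization choices that preserve order do not affect the selected pair.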
Journal:
IEEE Transactions on Nanobioscience, 2012, 11(4):386-393. ISSN: 1536-1241
Corresponding author:
Wang, Jianxin
Author affiliations:
[Ren, Jun; Li, Min; Wang, Jianxin] Cent S Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.;[Ren, Jun] Hunan Agr Univ, Coll Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.;[Wu, Fang-Xiang] Univ Saskatchewan, Dept Mech Engn, Saskatoon, SK S7N 5A9, Canada.;[Wu, Fang-Xiang] Univ Saskatchewan, Div Biomed Engn, Saskatoon, SK S7N 5A9, Canada.
Corresponding institution:
[Wang, Jianxin] Cent S Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China.
Keywords:
Clustering coefficient;hierarchical clustering algorithm;overlapping and hierarchical module;PPI networks
Abstract:
Much evidence has demonstrated that functional modules are overlapping and hierarchically organized in protein-protein interaction (PPI) networks. Up to now, few methods have been able to identify both overlapping and hierarchical functional modules in PPI networks. In this paper, a new hierarchical clustering algorithm, called OH-PIN, is proposed based on the overlapping M_clusters, λ-module, and a new concept of clustering coefficient between two clusters. By recursively merging two clusters with the maximum clustering coefficient, OH-PIN finally assembles all M_clusters into λ-modules. Since M_clusters are overlapping, λ-modules based on them are also overlapping. Thus, OH-PIN can detect a hierarchical organization of overlapping modules by tuning the value of λ. The hierarchical organization is similar to the hierarchical organization of GO annotations and that of the known complexes in MIPS. To compare the performance of OH-PIN and other existing competing algorithms, we apply them to the yeast PPI network. The experimental results show that OH-PIN outperforms the existing algorithms in terms of the functional enrichment and matching with known protein complexes.
Author affiliations:
[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp & Informat Engn, Changsha 410004, Hunan, Peoples R China.;[Hu, Xiaohua] Cent China Normal Univ, Dept Comp Sci, Wuhan 430079, Peoples R China.;[Li, Zhoujun] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China.;[Li, Zhoujun] Beihang Univ, Beijing Key Lab Network Technol, Beijing 100191, Peoples R China.;[Hu, Xiaohua] Drexel Univ, Coll Informat Sci, Philadelphia, PA 19104 USA.
Corresponding institution:
[Hu, Xiaohua] Cent China Normal Univ, Dept Comp Sci, Wuhan 430079, Peoples R China.
Conference name:
IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Conference dates:
NOV 12-15, 2011
Conference location:
Atlanta, GA
Conference organizer:
[Hu, Xiaohua] Cent China Normal Univ, Dept Comp Sci, Wuhan 430079, Peoples R China.^[Liu, Junwan] Cent S Univ Forestry & Technol, Sch Comp & Informat Engn, Changsha 410004, Hunan, Peoples R China.^[Li, Zhoujun] Beihang Univ, State Key Lab Software Dev Environm, Beijing 100191, Peoples R China.^[Li, Zhoujun] Beihang Univ, Beijing Key Lab Network Technol, Beijing 100191, Peoples R China.^[Hu, Xiaohua] Drexel Univ, Coll Informat Sci, Philadelphia, PA 19104 USA.^[Chen, Yiming] Hunan Agr Univ, Sch Informat Sci & Technol, Changsha 410128, Hunan, Peoples R China.
Abstract:
Multi-objective optimization (MOO) involves optimization problems with multiple objectives. Generally, those objectives are used to estimate very different aspects of the solutions, and these aspects are often in conflict with each other. MOO first obtains a Pareto set, and then looks for both commonality and systematic variations across the set. For large-scale datasets, heuristic search algorithms such as evolutionary algorithms (EA) combined with MOO techniques are well suited. Recent DNA microarray technology can study the transcriptional response of a complete genome to different experimental conditions and yields many large-scale datasets. Biclustering can simultaneously cluster the rows and columns of a dataset and helps to extract more accurate information from those datasets. Biclustering needs to optimize several conflicting objectives, and can therefore be solved with MOO methods. As a heuristics-based optimization approach, particle swarm optimization (PSO) simulates the movements of a bird flock searching for food. The shuffled frog-leaping algorithm (SFL) is a population-based cooperative search metaphor combining the benefits of the local search of PSO and the global shuffling of information of the complex evolution technique. SFL is used to solve optimization problems on large-scale datasets. This paper integrates a dynamic population strategy and the shuffled frog-leaping algorithm into biclustering of microarray data, and proposes a novel multi-objective dynamic population shuffled frog-leaping biclustering (MODPSFLB) algorithm to mine maximum biclusters from microarray data. Experimental results show that the proposed MODPSFLB algorithm can effectively find significant biological structures in terms of related biological processes, components and molecular functions. The proposed MODPSFLB algorithm achieves good diversity and fast convergence of Pareto solutions and can become a powerful tool for systematic functional analysis in genome research.
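The notion of a Pareto set referred to above can be made concrete with a short sketch. It assumes every objective is minimized; the candidate objective vectors are hypothetical and are not taken from the paper, and the function names are illustrative only.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_set(points):
    """Keep the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective vectors for six candidate solutions.
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0),
              (4.0, 4.0), (6.0, 5.0), (7.0, 1.0)]
front = pareto_set(candidates)
```

Here (3.0, 8.0) is dropped because (2.0, 7.0) is better in both objectives, and (6.0, 5.0) is dropped because of (4.0, 4.0); the remaining four points trade one objective against the other.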
Keywords:
power markets, pricing, statistical testing, time series
Abstract:
A novel method, which we call analogous multifractal cross-correlation analysis, is proposed in this paper to study the multifractal behavior in the power-law cross-correlation between price and load in the California electricity market. In addition, a statistic rho(AMF-XA), which we call the analogous multifractal cross-correlation coefficient, is defined to test whether the cross-correlation between two given signals is genuine. Our analysis finds that both the price and load time series in the California electricity market exhibit multifractal nature. However, as indicated by the rho(AMF-XA) statistical test, there is a large difference in the cross-correlation behavior between the years 1999 and 2000 in the California electricity market. (C) 2013 American Institute of Physics. [http://dx.doi.org/10.1063/1.4793355]
Keywords:
Average response time;Cache;Multi-core;Schedule length;Task scheduling
Abstract:
In the past few years, multi-core processors incorporating four, six, eight, or more cores on a single die have become ubiquitous. Those cores, each with its own private cache, often share a higher-level cache memory, which leads to competition among different tasks. This can seriously affect the average performance of multi-core systems, as the probability of a cache hit may be lowered. Recognizing this, we study the problem of scheduling bag-of-tasks (BoT) applications under a shared cache constraint on multi-core systems. We first use cache space isolation techniques to divide shared caches into partitions. Then, we give a motivational example and outline the shared cache aware scheduling problem of multi-core systems. Finally, to address this problem, we propose a heuristic shared cache contention aware scheduling (SCAS) algorithm for multi-core systems. Our extensive simulation-based performance evaluation demonstrates that the proposed SCAS algorithm outperforms the traditional scheduling algorithm Min-min and the modified algorithm MSCAS in terms of schedule length and average response time.
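For reference, the Min-min baseline named above can be sketched as follows. This is the textbook form of the heuristic for independent tasks, not the SCAS algorithm, and it deliberately ignores shared cache contention. The execution-time matrix `etc` and the function name are hypothetical.

```python
def min_min(etc):
    """Min-min heuristic for independent (bag-of-tasks) scheduling.

    etc[t][c] is the estimated execution time of task t on core c.
    Returns (assignment, schedule_length), where assignment maps
    each task index to the core it was placed on.
    """
    n_cores = len(etc[0])
    ready = [0.0] * n_cores          # time at which each core becomes free
    unscheduled = set(range(len(etc)))
    assignment = {}
    while unscheduled:
        # Pick the task whose best completion time is smallest overall;
        # ties are broken by task index, then core index.
        finish, task, core = min(
            (ready[c] + etc[t][c], t, c)
            for t in unscheduled
            for c in range(n_cores)
        )
        assignment[task] = core
        ready[core] = finish
        unscheduled.remove(task)
    return assignment, max(ready)

# Hypothetical execution times: 3 tasks on 2 cores.
etc = [
    [3, 5],
    [2, 4],
    [6, 1],
]
assignment, schedule_length = min_min(etc)
```

In this toy run task 2 goes to core 1 first, then tasks 1 and 0 both land on core 0, which gives a schedule length of 5. A cache-aware scheduler would additionally account for how co-running tasks slow each other down.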
Journal:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2005, 3759:647-656. ISSN: 0302-9743
Conference name:
Parallel and Distributed Processing and Applications - ISPA 2005 Workshops, ISPA 2005 International Workshops AEPP, ASTD, BIOS, GCIC, IADS, MASN, SGCA, and WISA, Nanjing, China, November 2-5, 2005, Proceedings
Conference location:
Nanjing, China
Conference organizer:
(1) Institute of Computer and Information Engineering, Hunan Agricultural University, Changsha 410128, Hunan, China; (2) Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027, Zhejiang, China; (3) College of Computer and Communication, Hunan University, Changsha 410082, Hunan, China
Proceedings title:
Lecture Notes in Computer Science
Abstract:
With the rapid development of multimedia information technology, providers of harmful ("badness") information embed it in images or save it directly as image files to evade filtering, which creates serious hidden security risks for society. This paper presents an information audit system based on image content filtering. First, we discuss basic methods for filtering objectionable image content, analyze key technologies of image content filtering, and characterize texture by four features: contrast, energy, entropy and correlation. Next, we use dynamic programming to segment image objects and a similarity measure to quantify the similarity between two feature measures. Finally, we give an example of identifying pornographic ("yellow") content, which extracts the texture features of an image and matches them against a predefined feature database. Our system can monitor and control objectionable image content and automate the audit of multimedia information.
Keywords:
information audit;badness information;text character;Fuzzy neural network
Abstract:
Intelligent methods for information audit systems are a hot topic in the field of network security, and the application of pattern recognition and data mining in information audit systems is attracting worldwide attention and study. As an important method of pattern recognition, the fuzzy neural network has the capability of self-organization, self-learning and generalization. Applying a fuzzy neural network in an information audit system can not only identify known harmful ("badness") information, but also detect new harmful information and abnormal events.
Journal:
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2006, 3842:869-876. ISSN: 0302-9743
Corresponding author:
Yu, F
Author affiliations:
[Yu, F] Hunan Agr Univ, Sch Comp & Informat Engn, Changsha 410128, Peoples R China.;Chinese Acad Sci, State Key Lab Informat Secur, Grad Sch, Beijing 100049, Peoples R China.;Hunan Univ, Coll Comp & Commun, Changsha 410082, Peoples R China.
Conference name:
APWeb 2006 International Workshops: XRA, IWSN, MEGA, and ICSE
Conference dates:
January 16, 2006 - January 18, 2006
Conference location:
Harbin, China
Conference organizer:
(1) School of Computer and Information Engineering, Hunan Agricultural University, Changsha, 410128, China; (2) State Key Laboratory of Information Security, Graduate School, Chinese Academy of Sciences, Beijing, 100049, China; (3) College of Computer and Communication, Hunan University, Changsha, 410082, China
Abstract:
High-throughput microarray technologies have generated and accumulated massive amounts of gene expression datasets that contain expression levels of thousands of genes under hundreds of different experimental conditions. The microarray datasets are usually presented in 2D matrices, where rows represent genes and columns represent experimental conditions. The analysis of such datasets can discover local structures composed of sets of genes that show coherent expression patterns under subsets of experimental conditions. This has led to the development of sophisticated algorithms capable of extracting novel and useful knowledge from a biomedical point of view. In the medical domain, these patterns are useful for understanding various diseases, and aid in more accurate diagnosis, prognosis, treatment planning, as well as drug discovery. In this work we present CMOPSOB (Crowding distance based Multi-objective Particle Swarm Optimization Biclustering), a novel clustering approach for microarray datasets that clusters genes and conditions highly related in sub-portions of the microarray data. The objective of biclustering is to find sub-matrices, i.e. maximal subgroups of genes and subgroups of conditions where the genes exhibit highly correlated activities over a subset of conditions. Since these objectives are mutually conflicting, they become suitable candidates for multi-objective modelling. Our approach CMOPSOB is based on a heuristic search technique, multi-objective particle swarm optimization, which simulates the movements of a flock of birds searching for food. In addition, the nearest neighbour search strategies based on crowding distance and ϵ-dominance can rapidly converge to the Pareto front and guarantee diversity of solutions. We compare the potential of this methodology with other biclustering algorithms by analyzing two common and public datasets of gene expression profiles.
In all cases our method can find localized structures related to sets of genes that show consistent expression patterns across subsets of experimental conditions. The mined patterns present a significant biological relevance in terms of related biological processes, components and molecular functions in a species-independent manner. The proposed CMOPSOB algorithm is successfully applied to biclustering of microarray datasets. It achieves a good diversity in the obtained Pareto front, and rapid convergence. Therefore, it is a useful tool to analyze large microarray datasets.
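The crowding distance mentioned above can be illustrated with a small sketch. It assumes the definition popularized by NSGA-II (per-objective normalized gap between a solution's two neighbours, with boundary solutions set to infinity); the abstract does not state which variant CMOPSOB uses, and the front below is hypothetical.

```python
def crowding_distance(front):
    """Crowding distance of each objective vector in a non-dominated front
    (NSGA-II style; an assumed definition, not taken from the paper)."""
    n = len(front)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(front[0])):
        # Indices of the solutions sorted by objective m.
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        # Boundary solutions are always preferred.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # objective is constant; it adds no spread information
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist

# Hypothetical two-objective front.
front = [(0.0, 4.0), (1.0, 3.0), (2.0, 1.0), (4.0, 0.0)]
distances = crowding_distance(front)
```

Solutions with a larger distance sit in sparser regions of the front, so preferring them when selecting leaders or pruning an archive helps preserve diversity.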
Conference name:
Joint 14th IEEE Int Conf on Trust, Secur and Privacy in Comp and Commun / 13th IEEE Int Symposium on Parallel and Distributed Proc with Applications / 9th IEEE Int Conf on Big Data Science and Engineering (IEEE TrustCom-ISPA-BigDataSE)
Keywords:
key management;rekeying;multi-privileged group communications;security
Abstract:
In some group-oriented applications, users can access several data resources according to their own preferences. How to access the data resources effectively is therefore a challenge in multi-privileged group communications. Several key management schemes for hierarchical access control have been proposed. In this paper, we discuss the challenges of key management. Then, we present a list of evaluation criteria for secure key management in multi-privileged group communications, and investigate the features of some typical schemes. The schemes can be divided into several classes along separate dimensions, such as the topology of the key model, the rekeying method, the rekeying policy and the encryption method of the data resources. We analyze them comparatively with regard to the secure distribution and renewal of key materials.
Abstract:
Feature extraction plays an important role in image processing and pattern recognition. As a powerful tool, multifractal theory has recently been employed for this task. However, traditional multifractal methods were proposed to analyze objects with a stationary measure and cannot handle a non-stationary measure. The contribution of this paper is twofold. First, the definition of a stationary image and 2D image feature detection methods are proposed. Second, a novel feature extraction scheme for non-stationary images is proposed based on local multifractal detrended fluctuation analysis (Local MF-DFA), which builds on 2D MF-DFA. A set of new multifractal descriptors, called the local generalized Hurst exponent (Lh_q), is defined to characterize the local scaling properties of textures. To test the proposed method, the novel texture descriptor and two other multifractal indicators, namely local Hölder coefficients based on the capacity measure and the multifractal dimension D_q based on the multifractal differential box-counting (MDBC) method, are compared in segmentation experiments. The first experiment indicates that the segmentation results obtained with the proposed Lh_q are slightly better than those of the MDBC-based D_q and significantly better than those of the local Hölder coefficients. The results of the second experiment demonstrate that Lh_q distinguishes the texture images more effectively and provides significantly more robust segmentations than the MDBC-based D_q.