Activities @ NCRA

NCRA Seminar Series

*25 May 2012*
Surrogate Models for Single- and Multi-Objective Stochastic Optimization: Integrating Support Vector Machines and Covariance Matrix Adaptation-ES
Marc Schoenauer (INRIA Saclay - Ile-de-France)

**Abstract:** Support Vector Machines (SVMs) are today widely used in Machine Learning. Designed originally to handle classification problems, their success relies on two main ingredients: linear classification, which leads to a simple constrained quadratic optimization problem that can be solved using standard, powerful algorithms; and the "kernel trick", which allows the quadratic problem to be solved in a very high-dimensional feature space while computing only in the low-dimensional original space. Furthermore, beyond classification, several other Machine Learning problems can be addressed by placing other constraints on the resulting quadratic problem, from continuous and ordinal regression to one-class identification. Surrogate models are mandatory for any optimization method tackling expensive objective functions, all the more so for Evolutionary Algorithms, which are well known for their appetite for CPU cycles. Regression-SVMs are one of the many methods that have been used to build approximations of the objective function in the evolutionary framework. However, the flexibility of SVMs to address the other ML paradigms mentioned above can lead to original surrogate models. Along these lines, in the framework of continuous evolutionary optimization, this talk will present two Pareto-oriented surrogate models for multi-objective optimizers, and a comparison-based surrogate for (comparison-based) CMA-ES in the context of single-objective optimization.

*5 July 2011*
Heuristics for optimisation and financial applications
Prof Dietmar Maringer (University of Basel)

**Abstract:**

*5 July 2011*
Adaptive Algorithmic Trading - the challenges
Prof Chris Stephens (UNAM)

**Abstract:**

*4 July 2011*
Market microstructure for Algorithmic Trading
Prof Philip Hamill (University of Ulster)

**Abstract:**

*12 November 2010*
Artificial Intelligence in Games using Evolutionary Techniques
Miguel Nicolau (UCD)

**Abstract:**

*30 October 2009*
Exploring Hierarchical Evolution with an Artificial Protocell
Prof Barry McMullin (Dublin City University)

**Abstract:** In this talk I will present an idealised model of an RNA-world "protocell", and some of the evolutionary phenomena it gives rise to. In this model a protocell is simply a vesicle-like container carrying a well-stirred mixture of "polymers" which can function as abstract analogues of certain (hypothetical) kinds of ribozymes; i.e., they function both as informational templates to be replicated and as replicase enzymes to catalyse such replication. Replicases vary in the classes of templates they can replicate (i.e., in the template motifs they recognise and bind to). Some molecular species function as self-replicases. As the molecular population increases through replication, the vesicle grows until it reaches a critical size, at which point it fissions. The result is a system with two distinct but coupled selectional processes: one at the molecular level, one at the cellular level. I will illustrate some of the (perhaps unexpected?) consequences of this architecture.

*9 October 2009*
Theoretically-Grounded Genetic Programming
Colin Johnson (University of Kent, Canterbury, England)

**Abstract:** This talk will consist of an overview of a number of projects we have carried out in recent years. The general topic is the interaction between theoretical issues in programming languages and genetic programming. The talk is split into two sections. In the first I will discuss the idea of analysing programs to derive fitness properties, with two case studies: the first is the use of multimodal optimisation to evolve a robot control task with a safety constraint that must be satisfied by the final evolved program; the second is the application of model checking as a means of constructing analysis-based fitness functions that apply across the input space. The second section will look at the application of program semantics to initialisation and operators in GP. I will start with a discussion of the idea of a "semantic wrapper" as a general principle for introducing semantics to aspects of GP, and then present some examples from initialisation, crossover and mutation.

*11 June 2009*
Multi-stage stochastic optimization - modeling techniques and solution approaches
Ronald Hochreiter (University of Vienna)

**Abstract:** Multi-stage decision optimization under uncertainty can provide quantitative decision support for a wide range of application areas. However, both the modeling of the underlying real-world problem and the solution of the resulting stochastic optimization problems can be tedious and complicated, especially when non-textbook applications are considered. An overview of the issues that must be considered to obtain useful results will be given, and a set of solution approaches for each step, from pre-processing data to analyzing solutions, will be presented.

*26 September 2008*
From Artificial Evolution to Computational Evolution
Prof Wolfgang Banzhaf (Memorial University of Newfoundland)

**Abstract:** In the past 50 years, computer scientists and engineers have been very successful in gleaning recipes from natural evolution. With the advent of genome biology, however, there have been many new discoveries in biology not yet reflected in computational models of evolution. This talk discusses some of the more recent developments in biology, and how they might impact our thinking about algorithms.

*9 May 2008*
Evolvability in Optimization Problems: Towards more Efficient Evolutionary Algorithms
Jorge Tavares (INRIA Lille)

**Abstract:** A fundamental issue in the development of evolutionary algorithms is the representation of a candidate solution. The success of this type of approach relies on choosing a genetic representation that allows the algorithm to be efficient and accurate. Additionally, it is important to understand the representation's role and its interplay with heuristics. Does a particular representation need auxiliary mechanisms? Which of these complementary procedures are influential? Within combinatorial optimization problems this becomes even more relevant, since such problems have strong practical applications in many diverse fields. Understanding the effects of a representation when solving a problem is important so that new and more efficient encodings can be developed. The ultimate case is when the encoding designer is the evolutionary algorithm itself. Preliminary work in this direction has been done, bringing self-organization and evolution into a common problem-solving framework. This is the natural next step, since finding a good representation for an optimization problem is a problem in itself. Even studying the known representations for a given problem may not be enough to ensure that the algorithm will encode candidate solutions in the best way.

*10 April 2008*
Genetic Programming in Drug Discovery and Development
Leonardo Vanneschi (University of Milano-Bicocca)

**Abstract:** Genetic Programming (GP) is an evolutionary approach which extends the genetic model of learning to the space of programs. It is a major variation of Genetic Algorithms in which the evolving individuals are themselves computer programs, instead of fixed-length strings from a limited alphabet of symbols. In the last few years, GP has become more and more popular for biomedical applications. In particular, GP has recently been used to mine large datasets with the goal of automatically generating the underlying (hidden) functional relationship between data, and correlating the behavior of latent features with parameters bound to drug activity patterns. In this seminar, after a brief introduction to GP and an informal discussion of how and why GP works as an optimization method, an application of GP in drug discovery and development will be discussed, in which GP has been used for quantitative prediction of drug-induced toxicity and bioavailability.

*14 March 2008*
Inference for Continuous Semimartingales Observed at High Frequency: A General Approach
Prof. Per Mykland (Dept of Statistics, University of Illinois at Chicago)

**Abstract:** The econometric literature on high-frequency data usually relies on moment estimators derived by assuming local constancy of volatility and related quantities. We show here that this first-order approximation is not always valid if used naively. We find that such approximations require an ex post adjustment involving asymptotic likelihood ratios, which are given. Several examples (powers of volatility, leverage effect, ANOVA) are provided. The first-order approximations in this study can be over the period of one observation, or over blocks of successive observations. The theory relies heavily on the interplay between stable convergence and measure change, and on asymptotic expansions for martingales. Practically, the procedure permits (1) the definition of estimators of hard-to-reach quantities, such as the leverage effect of volatility, (2) improved efficiency in classical estimators, and (3) easy analysis.

*14 March 2008*
Microstructure and the Hidden Semimartingale Model
Prof. Lan Zhang (Dept of Finance, University of Illinois at Chicago)

**Abstract:** The availability of high-frequency data for financial instruments has opened the possibility of accurately determining volatility in small time periods, such as one day. Recent work on such estimation indicates that it is necessary to analyze the data with a hidden semimartingale model, typically by incorporating microstructure noise. We develop the methodology for analyzing such data, including two- and multiscale sampling. We also show how to make these estimators robust to dependent noise.

*5 December 2007*
Semantic Building Blocks in Genetic Programming
Nic McPhee (University of Minnesota, Morris)

**Abstract:** Subtree crossover is one of the oldest and remains one of the most widely used recombination operators in genetic programming (GP). It is still unclear, however, why or how it works. It is hardly obvious that yanking a random chunk of code from one program and plopping it unceremoniously into a random location in a second program would be a good thing. Yet it clearly works (at some level) in GP. But why? How (and when) does subtree crossover move the population closer to the solution?

I will present a new mechanism for studying the impact of subtree crossover in terms of semantic building blocks. This approach allows us to completely and compactly describe the semantic action of crossover, and provides insight into what does (or doesn't) make crossover effective. Our results make it clear that a very high proportion of crossover events (typically over 75% in our experiments) are guaranteed to perform no immediately useful search in the semantic space...

*15 Nov 2007*
Can Trend Followers Survive in the Long-Run? Insights from Agent-Based Modeling
Youwei Li (Queen's University Belfast)

**Abstract:**

*18 September 2007*
Xgrid
Erik Hemberg (NCRA)

**Abstract:** Xgrid acts as a job scheduler, allowing a group of networked computers to take advantage of distributed computing. An introduction to NCRA's Xgrid is presented, and advice is provided on how best to utilise this resource for our needs.

*13 July 2007*
Entropy and Compression in Understanding Genetic Programming
Prof. Bob McKay (Seoul National University)

**Abstract:**

*25 May 2007*
Complex Adaptive Systems
Prof Doyne Farmer (Santa Fe Institute)

**Abstract:** Prof. Farmer is a senior member of the Santa Fe Institute and previously worked at the Los Alamos National Laboratory. He has a PhD in Physics from the University of California (Santa Cruz). While at Santa Cruz, he was a founding member of the 'Dynamical Systems Collective', an informal grouping which played an important role in the development of Chaos Theory. After finishing his PhD, Doyne took up a post-doctoral appointment at the Center for Nonlinear Studies at the Los Alamos National Laboratory, and in 1988 became leader of the Complex Systems Group there.

*17 May 2007*
Integrating generative growth and evolutionary computation for form exploration
Martin Hemberg (Imperial College London)

**Abstract:** We present an algorithmic growth process that extends Lindenmayer's Map L-systems. This growth process relies upon a set of rewrite rules, a map axiom and a geometric interpreter integrated with a 3D simulated environment. The outcome of the growth process is a digital surface in 3D space which has grown within, and in response to, its environment. The growth procedure can be derived from a context-free grammar, and we have developed an evolutionary algorithm based on Grammatical Evolution (GE) which is able to generate sets of rewrite rules. Using a quantitative multi-objective fitness function that evaluates a variety of surface properties, the integrated system (evolutionary algorithm and growth process) can explore and generate diverse and interesting surfaces with a resemblance to organic form. The algorithms have been implemented to create a design tool for architects called Genr8.

*3 April 2007*
A Grammatical Genetic Programming Approach to Modularity in Genetic Algorithms
Erik Hemberg (NCRA)

**Abstract:** The ability of Genetic Programming to scale to problems of increasing difficulty operates on the premise that it is possible to capture regularities in a problem environment by decomposing the problem into a hierarchy of modules. As computer scientists, and more generally as humans, we tend to adopt a similar divide-and-conquer strategy in our problem solving. In this paper we consider the adoption of such a strategy for Genetic Algorithms. By adopting a modular representation in a Genetic Algorithm we can make efficiency gains that enable superior scaling characteristics on problems of increasing size. We present a comparison of two modular Genetic Algorithms, one of which is a Grammatical Genetic Programming algorithm, the meta-Grammar Genetic Algorithm (mGGA), which generates binary string sentences instead of traditional GP trees. A number of problem instances are tackled which extend the Checkerboard problem by introducing different kinds of regularity and noise.

*16 March 2007*
An Artificial Genetic Regulatory Model: Analysis and Applications
Miguel Nicolau (INRIA)

**Abstract:** A recently introduced artificial gene regulatory network is presented and analysed. Created through a process of whole-genome duplication and divergence, it presents remarkable similarities to natural transcriptional regulatory networks. Its dynamic regulatory nature has potential for use on optimisation problems, and some potential applications are presented.

*8 February 2007*
Quantum-Inspired Evolutionary Algorithms
Kai Fan (NCRA)

**Abstract:**

*21 November 2006*
Memetic Algorithms
Carlos Cotta (Departamento de Lenguajes y Ciencias de la Computacion, University of Malaga)

**Abstract:** Memetic algorithms are population-based metaheuristics aimed at solving hard optimization problems. These techniques are explicitly concerned with exploiting available knowledge in order to achieve the most effective resolution of the target problem. The rationale behind these optimization techniques and their main features will be presented, along with a glimpse of current research issues and future challenges.

*3 July 2006*
The Hidden Risks of Value at Risk
Dietmar Maringer (Centre for Computational Finance and Economic Agents (CCFEA), University of Essex)

**Abstract:** Though heavily criticised for its shortcomings, Value at Risk (VaR) has become a standard measure of portfolio risk over the last decade. It even became one of the cornerstones of the Basel II accord on banks' equity requirements. Nevertheless, the practical application of the VaR concept suffers from two problems: how to estimate VaR, and how to optimise a portfolio for a given level of VaR? For the first problem, several approaches have been suggested, including the historical simulation method. The second problem, however, can be a formidable challenge: even under the simple assumption of normally distributed returns, constraints faced in real life, such as non-divisibility of the assets and disallowed short-selling, make the optimisation problem too complex for standard optimisation techniques. Empirical distributions are often claimed to be superior to parametric distributions, yet they also increase the computational complexity and are therefore hard to apply in portfolio optimisation. Indeed, if non-parametric distributions are used to describe the assets' returns, the optimisation problem becomes virtually inapproachable for traditional numerical methods. In this paper, a memetic algorithm was implemented in which a population-based search strategy is combined with local neighbourhood search elements. In a preliminary study, several variants of the local search part, and the usefulness of elitist strategies, were tested on different kinds of constraints. We found that different types of constraints seem to favour different variants of the heuristic. Next, stock portfolios (Standard and Poor's 100 stocks) and bond portfolios (Swiss bonds) are optimised. Our results indicate that empirical distributions might turn into a Pandora's box: though highly reliable for predicting the assets' risks, employing these distributions in the optimisation process might result in severe mis-estimations of the resulting portfolios' actual risk. It is found that even a simple mean-variance approach can be superior, despite its known specification errors. In other words, we find that a solution to the two aforementioned problems gives rise to a third one: the actual VaR of portfolios optimised under a VaR constraint might exceed its nominal level to a large extent.

*5 May 2006*
The Probabilistic Halting Problem
William Langdon (Computer Science, University of Essex)

**Abstract:** Experiments with a linear genetic programming (GP) system, including memory, IF and loops, suggest that the fraction of terminating programs is vanishingly small. However, their number grows exponentially with their length. The halting probability in von Neumann architectures scales as the square root of the program's length, suggesting you can test for almost all infinite loops on GHz machines in one millisecond.

*9 February 2006*
Evolution and Permanent Change in Organisations
Robin Matthews (Kingston University Business School London)

**Abstract:**