The following list summarizes related models, along with their pros and cons compared to our model.
- ART: generates synthetic next-generation sequencing data by mimicking the real sequencing process with empirical error models or quality profiles. ART supports simulation of single-end, paired-end and mate-pair reads of three major commercial next-generation sequencing platforms: Illumina’s Solexa, Roche’s 454 and Applied Biosystems’ SOLiD. ART can perform regular genome sequencing simulation as well as amplicon sequencing simulation. ART is implemented in C++ with optimized algorithms and is highly efficient in read simulation. ART outputs reads in the FASTQ format, and alignments in the ALN/MAP and/or SAM format. ART can also generate alignments in UCSC BED file format.
- AVIDA: provides detailed control over experimental settings and protocols, a large array of measurement tools, and sophisticated methods to analyze and post-process experimental data. In Avida, each digital organism is a self-contained computing automaton that has the ability to construct new automata. The organism is responsible for building the genome (computer program) that will control its offspring automaton and handing that genome to the Avida world. Avida will then construct virtual hardware for the genome to be run on and determine how this new organism should be placed into the population. In a typical Avida experiment, a successful organism attempts to make an identical copy of its own genome, and Avida randomly places that copy into the population, typically by replacing another member of the population. In principle, the only assumption made about these self-replicating automata in the core Avida software is that their initial state can be described by a string of symbols (their genome) and that it is possible through processing these symbols to autonomously produce offspring organisms. However, in practice our work has focused on automata with a simple von Neumann architecture that operate on an assembly-like language inspired by the Tierra system. Future research projects will likely have us implement additional organism instantiations to allow us to explore additional biological questions.
- CDPOP: an individual-based simulator of gene flow in complex landscapes to explain observed population responses and provide a foundation for landscape genetics. It models genetic exchange among spatially located individuals as a function of individual-based movement through mating and dispersal, incorporating population dynamics and all the factors that affect the frequency of an allele in a population (mutation, gene flow, genetic drift, and selection). Users initially specify individual locations, environmental conditions governing gene flow, spatially-explicit fitness landscapes governing selection, and various genic configurations, and CDPOP models divergence through time as a function of individual-based movement, breeding and dispersal, which are in turn functions of the given landscape surfaces.
- EasyPOP: simulates haploid, diploid or haplodiploid data. For diploids there is the choice between hermaphrodites and sexuals. For hermaphrodites, the proportion of clonal reproduction and selfing can be chosen, whereas for sexuals, complex breeding structures can be simulated (e.g. monogamy with a given proportion of extra-pair matings). The number of individuals can be selected for each population and dispersal is sex-specific. There are various migration models such as the two-dimensional stepping stone or hierarchical island model. In addition there is an isolation-by-distance option which works with the coordinates of the populations in any number of dimensions. There are also several mutation models implemented, which are oriented particularly toward the simulation of microsatellite loci. Genotypes are truly multilocus (i.e. there are no independent replicates for each locus). All mutation parameters can be set individually for each locus. EASYPOP can handle very large simulations on standard personal computers in a reasonable amount of time: the computer code has been optimized for maximum speed, and simulation size is limited only by the memory of the machine. In order to fit analytical expectations, in particular for variances, the functions implemented in EASYPOP are probabilistic and not deterministic. In other words, the simulations rely on the generation of random numbers.
- ForSim: a forward evolutionary simulation system designed to be highly flexible for application to a wide variety of both applied health and life science questions as well as issues in theoretical evolutionary biology. It attempts to simulate in the most natural way the evolutionary process that generates the genetic architecture that underlies present-day traits, and related phenomena such as mate choice, migration bias, population substructure, and interactions with the environment. These phenomena are related to the way natural selection affects underlying genetic variation, molding the trait’s genetic architecture. Variation over the short evolutionary scale, within species or among closely related species, is generally built upon a phylogenetically stable underlying causal genetic architecture upon which mutation, selection, and demographic effects are laid to generate subsequent variation within and among populations.
- Marlin: Marlin is a program for running spatially explicit forward-in-time population genetic simulations. It provides an intuitive user interface with which realistic geographic scenarios can be easily created and simulated. Marlin also goes further than that and directly analyses and plots the results. This combination of creation, simulation, and analysis makes Marlin ideal for teaching and for scientists who are interested in doing simulations without having to learn command-line operations.
- Nemo: implements many different life cycles and evolvable traits with a large variety of genetic architectures. Species interaction between a parasite and its host can also be modeled (i.e., the cytoplasmic incompatibility-inducing endosymbiont Wolbachia). All this is framed within a flexible metapopulation model that allows for patch-specific carrying capacities, dispersal rates (dispersal matrices), stochastic extinction/harvesting rates, and demographic stochasticity. Populations can be dynamically modified during a simulation, allowing for population bottlenecks, patch fusion/fission, population expansion, etc. Spatially heterogeneous selection on quantitative traits can also be modeled. Nemo’s interface is a simple text file containing the simulation parameters. Large batches of simulations can be run from a single parameter file with multiple parameter values. Many complex evolutionary and demographic scenarios can be modeled easily by providing temporally varying parameter values.
- PEDAGOG: simulates population dynamics at the individual level, allows for heritability and selection of traits, records individual genotype and pedigree information, and allows for several types of errors to manifest in the output, which can be formatted for 57 existing software programs. In all, parameters can be specified for genetics, demographics, mating strategy, mutations and genetic/demographic errors, growth models, heritability and selection, and output. Demographic parameters can be either age or size based, and all parameters can be drawn from twelve statistical distributions where appropriate.
- QMSim: Linkage disequilibrium (LD) and linkage analyses have been used extensively to identify quantitative trait loci (QTL) in humans and livestock. Owing to recent developments in genotyping technologies, dense marker maps are now available for several livestock species. Even though genotyping costs have substantially declined, large-scale genome-wide association studies are still costly. For this reason many studies in livestock suffer from small sample size or from low density of markers. However, simulation is a highly valuable tool for assessing and validating newly proposed methods for association studies at very low cost. During the last few decades, simulation has played a major role in answering a wide variety of questions in genomics. Several software tools have been developed for simulating genomes, especially in human research. However, most of these tools do not provide the functionality required for many applications in livestock. QMSim was developed to simulate large-scale genomic data in livestock populations. QMSim is a family-based simulator, which can also take into account predefined evolutionary features, such as LD, mutation, bottlenecks and expansions. The simulation is carried out in two steps: in the first step, a historical population is simulated to establish mutation-drift equilibrium and, in the second step, recent population structures are generated, which can be complex. QMSim allows for a wide range of parameters to be incorporated in the simulation models in order to produce appropriate simulated data.
- quantiNEMO: quantiNEMO is an individual-based, genetically explicit stochastic simulation program. It was developed to investigate the effects of selection, mutation, recombination, and drift on quantitative traits with varying architectures in structured populations connected by migration and located in a heterogeneous habitat. quantiNEMO is highly flexible at various levels: population, selection, trait(s) architecture, genetic map for QTL and/or markers, environment, demography, mating system, etc. quantiNEMO is a console program coded in standard C++ using an object-oriented approach. It runs on any computer platform and is distributed under an open source license.
- simuPOP: a general-purpose individual-based forward-time population genetics simulation environment. The core of simuPOP is a scripting language (Python) that provides a large number of objects and functions to manipulate populations, and a mechanism to evolve populations forward in time. Using this environment, users can create, manipulate and evolve populations interactively, or write a script and run it as a batch file. Owing to its flexible and extensible design, simuPOP can simulate large and complex evolutionary processes with ease.
- SLiM: an evolutionary simulation framework that combines a powerful engine for population genetic simulations with the capability of modeling arbitrarily complex evolutionary scenarios. Simulations are configured via the integrated Eidos scripting language that allows interactive control over practically every aspect of the simulated evolutionary scenarios. The underlying individual-based simulation engine is highly optimized to enable modeling of entire chromosomes in large populations. For Mac OS X users (on OS X 10.9 or later), we also provide a graphical user interface for easy simulation set-up, interactive runtime control, and dynamical visualization of simulation output.
- SMARTPOP: a fast and flexible forward-in-time simulator for population genetics. Specially developed for speed, it is available in serial and parallel versions. Developed for anthropological inference on human populations and eco-anthropological questions, SMARTPOP simulates individuals with sequences of sex-linked DNA (mitochondria, X and Y chromosomes) and autosomes. Studies of social dynamics are enabled by SMARTPOP’s flexible demographic model and social rules of mating.
- SPIP: simulates the transmission of genes from parents to offspring in a population having demographic structure defined by the user. Numerous variables controlling the age structure of the population, the number of offspring produced, the variance in male and female reproductive success, survival rates of different age classes, mate fidelity, duration of simulation, etc. can be specified by the user. The program stores the pedigree of all individuals in the simulated population. This pedigree is used to simulate genetic data on sampled individuals by tracing lineages back through paternal or maternal genes within each sampled individual. Data may be simulated for an arbitrary number of loci that are assumed to be independently segregating and to be neither subject to natural selection nor linked to any selected genes. Genotypes are reported in terms of both “founder alleles” (i.e., each distinct allele amongst the founders of the pedigree is given a distinct label) and also in terms of alleles whose frequencies amongst the founding members of the pedigree may be specified by the user. The pedigree of individuals is also output by spip in a format that may be read into programs of the MORGAN package maintained by Elizabeth A. Thompson at the Department of Statistics, University of Washington. This is particularly useful because the pedigree output from spip may be fed into the MORGAN program kin, so that the coefficients of inbreeding for sampled individuals, as well as the coefficients of coancestry for pairs of individuals, may be computed exactly given the pedigree.
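
Several of the tools listed above (for example simuPOP, SLiM, and SMARTPOP) are forward-in-time simulators that evolve a population one generation at a time under forces such as mutation, genetic drift, and selection. As a rough illustration of what one such generation loop can look like, the following is a minimal sketch of a haploid, single-locus Wright-Fisher model. It does not use the API of any tool in the list; the function name `wright_fisher` and all of its parameters are hypothetical and chosen only for this example.

```python
import random


def wright_fisher(pop_size, generations, init_freq,
                  mutation_rate=0.0, selection=0.0, seed=None):
    """Return the frequency of allele A in each generation.

    Assumptions of this sketch (not taken from any listed tool):
    haploid individuals, one biallelic locus, constant population size,
    non-overlapping generations, symmetric mutation between A and a,
    selection > -1, and relative fitness 1 + selection for allele A.
    """
    rng = random.Random(seed)
    freq = init_freq
    trajectory = [freq]
    for _ in range(generations):
        # Selection: reweight allele A by its relative fitness.
        weighted = freq * (1.0 + selection)
        expected = weighted / (weighted + (1.0 - freq))
        # Mutation: A -> a and a -> A at the same rate.
        expected = (expected * (1.0 - mutation_rate)
                    + (1.0 - expected) * mutation_rate)
        # Drift: each offspring draws its allele independently.
        count = sum(1 for _ in range(pop_size) if rng.random() < expected)
        freq = count / pop_size
        trajectory.append(freq)
    return trajectory


if __name__ == "__main__":
    # Illustrative parameter values only.
    traj = wright_fisher(pop_size=200, generations=100, init_freq=0.1,
                         mutation_rate=1e-4, selection=0.02, seed=42)
    print(f"final frequency of allele A: {traj[-1]:.3f}")
```

Because drift is modeled by random sampling, two runs with different seeds generally give different trajectories, which is why simulators of this kind are typically run in many replicates. The tools above add much more on top of such a core loop (diploidy, multiple loci, recombination, spatial structure, demography), none of which this sketch attempts.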