Title sim1000G
Short Description sim1000G integrates fully with R and can simulate the variation observed in a single VCF file. It can also simulate arbitrary pedigrees.
Long Description We develop a new user-friendly and integrated R package, sim1000G, which simulates genomic regions for unrelated individuals or for families. The only input needed is a single raw phased Variant Call Format (VCF) file. Haplotypes are extracted from this file to compute linkage disequilibrium (LD) in the simulated region and are then used to generate new genotype data for unrelated individuals. The covariance across variants is used to preserve the LD structure of the original population. Pedigrees of arbitrary size are generated by modeling recombination events within sim1000G. Various simulation scenarios are presented, assuming unrelated individuals from a single population or from two distinct populations, or alternatively three-generation family data. sim1000G can capture allele frequency diversity, short- and long-range LD patterns, and subtle population differences in LD structure without the need for any tuning parameters.
Keywords simulator variants VCF pedigree
Version 1.19
Project Started 2018
Last Release 1 year, 1 month ago
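
Illustrative sketch (not part of sim1000G): the Long Description mentions two mechanisms, computing LD from phased haplotypes and generating pedigrees by modeling recombination. The Python snippet below is a minimal sketch of both ideas under stated assumptions. It does not use or reproduce the sim1000G API. The function names (ld_r, make_gamete, draw_crossovers), the toy haplotypes, and the uniform per-interval crossover rate are all hypothetical choices made for illustration. The package itself may model recombination differently, for example with a genetic map.

```python
import random


def ld_r(haplotypes, i, j):
    """Correlation between 0/1 alleles at variants i and j across phased haplotypes."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n          # alt-allele frequency at variant i
    p_j = sum(h[j] for h in haplotypes) / n          # alt-allele frequency at variant j
    p_ij = sum(h[i] * h[j] for h in haplotypes) / n  # frequency of the 1-1 haplotype
    d = p_ij - p_i * p_j                             # LD coefficient D
    denom = (p_i * (1 - p_i) * p_j * (1 - p_j)) ** 0.5
    if denom == 0:
        return 0.0  # a monomorphic variant carries no LD information
    return d / denom


def make_gamete(hap_a, hap_b, crossovers, start_with_a=True):
    """Copy alleles from one parental haplotype, switching template at each crossover index."""
    templates = (hap_a, hap_b) if start_with_a else (hap_b, hap_a)
    switch_points = set(crossovers)
    current = 0
    gamete = []
    for pos in range(len(hap_a)):
        if pos in switch_points:
            current = 1 - current  # crossover: continue copying from the other haplotype
        gamete.append(templates[current][pos])
    return gamete


def draw_crossovers(n_variants, rate, rng):
    """Draw crossover positions, assuming an independent, uniform rate per interval (hypothetical)."""
    return [pos for pos in range(1, n_variants) if rng.random() < rate]


# Toy phased haplotypes: rows are haplotypes, columns are variants.
haps = [
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
print(ld_r(haps, 0, 1))  # variants 0 and 1 always co-occur
print(ld_r(haps, 0, 2))  # variants 0 and 2 are independent in this toy sample

# One parent transmits a mosaic of its two haplotypes to a child.
rng = random.Random(1)
crossovers = draw_crossovers(len(haps[0]), rate=0.3, rng=rng)
print(make_gamete(haps[0], haps[3], crossovers))
```

Repeating the gamete step for each parent in each generation gives a pedigree of any depth, which is the general idea behind the three-generation family scenario described above.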
