Applying Multivariate Discrete Distributions to Genetically Informative Count Data

Behav Genet. 2016 Mar;46(2):252-68. doi: 10.1007/s10519-015-9757-z. Epub 2015 Oct 24.

Abstract

We present a novel method of conducting biometric analysis of twin data when the phenotypes are integer-valued counts, which often show an L-shaped distribution. Monte Carlo simulation is used to compare five likelihood-based modeling approaches: our multivariate discrete method when its distributional assumptions are correct, the same method when those assumptions are incorrect, and three other methods in common use (the Normal, Lognormal, and Ordinal models). With data simulated from a skewed discrete distribution, recovery of twin correlations and of the proportions of variance due to additive genetic and common environmental effects was generally poor for the Normal, Lognormal, and Ordinal models, but good for the two discrete models. Sex-separate applications to substance-use data from twins in the Minnesota Twin Family Study showed superior performance of the two discrete models. The new methods are implemented in R and OpenMx and are freely available.

Keywords: Biometric variance components; Count variables; Lagrangian probability distributions; Multivariate discrete distributions; Substance use; Twin study.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Computer Simulation
  • Databases, Genetic
  • Family
  • Humans
  • Models, Genetic
  • Monte Carlo Method
  • Multivariate Analysis
  • Phenotype
  • Substance-Related Disorders / genetics
  • Twins / genetics*