API Reference

Input/Output

The io module provides functions for loading GenotypeArrays or scalar values from common variant file formats and for saving them to those formats.

from_plink(input[, swap_alleles, …])

Load genetic data from plink v1 files (.bed, .bim, and .fam) into a DataFrame.

to_plink(data, output[, phenotype_name, …])

Save genetic data to plink v1 files (.bed, .bim, and .fam).
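For orientation, a plink v1 .bed file packs four genotypes into each byte, two bits per sample, with the first sample in the lowest-order bits. The sketch below decodes one such byte in plain Python. The helper name decode_bed_byte is hypothetical and is not part of the io module; the sketch illustrates the file format only and makes no claim about how from_plink or to_plink are implemented.

```python
# Two-bit genotype codes used by the plink v1 .bed format (SNP-major mode).
# Values are the count of the second (.bim A2) allele; None marks a missing call.
BED_CODES = {
    0b00: 0,     # homozygous for the first allele in the .bim file
    0b01: None,  # missing genotype
    0b10: 1,     # heterozygous
    0b11: 2,     # homozygous for the second allele in the .bim file
}


def decode_bed_byte(byte, n_samples=4):
    """Decode up to four genotypes from one byte (hypothetical helper)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    if not 1 <= n_samples <= 4:
        raise ValueError("a byte holds between 1 and 4 genotypes")
    # The first sample sits in the two lowest-order bits of the byte.
    return [BED_CODES[(byte >> (2 * i)) & 0b11] for i in range(n_samples)]
```

The n_samples argument covers the final byte of a variant block, which may be only partly filled when the sample count is not a multiple of four.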

from_vcf(filename[, min_qual, drop_filtered])

Load genetic data from a VCF or BCF file into a DataFrame.
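The parameter names min_qual and drop_filtered suggest that records are screened on the QUAL and FILTER columns. The sketch below shows one way such a screen could work on a single VCF data line. It assumes that min_qual is a lower bound on QUAL and that drop_filtered discards records whose FILTER value is neither PASS nor missing; the actual semantics in from_vcf may differ. The helper keep_record is hypothetical.

```python
def keep_record(vcf_line, min_qual=None, drop_filtered=True):
    """Decide whether one VCF data line passes the assumed filters."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO
    qual, filt = fields[5], fields[6]
    if min_qual is not None:
        # "." means QUAL is missing; this sketch treats that as failing.
        if qual == "." or float(qual) < min_qual:
            return False
    # Assumption: "PASS" and "." (no filters applied) both count as unfiltered.
    if drop_filtered and filt not in ("PASS", "."):
        return False
    return True
```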

from_bed(filename)

Yield genomic regions from a BED file as Region scalars.
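As a rough picture of what yielding regions from a BED file involves, the sketch below parses BED lines into simple records with the standard library. RegionRecord is a hypothetical stand-in for the Region scalar, and iter_bed_regions is not the library's from_bed. BED start positions are 0-based and end positions are exclusive; the sketch keeps the coordinates exactly as written in the file and assumes nothing about how Region stores them.

```python
from typing import Iterable, Iterator, NamedTuple


class RegionRecord(NamedTuple):
    """Hypothetical stand-in for the Region scalar."""
    chromosome: str
    start: int
    end: int
    name: str


def iter_bed_regions(lines: Iterable[str]) -> Iterator[RegionRecord]:
    """Yield one record per BED data line, skipping headers and blanks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chromosome, start, end = fields[0], int(fields[1]), int(fields[2])
        # The name column is optional in BED; use an empty string if absent.
        name = fields[3] if len(fields) > 3 else ""
        yield RegionRecord(chromosome, start, end, name)
```

Because the function is a generator, a large file can be consumed one region at a time, for example with `for region in iter_bed_regions(open(path)):`.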


Scalars

This module contains scalar types, some of which are used in the ExtensionArrays. They may also be useful on their own.

Variant(chromosome, position, id, ref, alt, …)

Information about a variant.

Genotype(variant, allele_idxs, …)

Genotype information associated with a specific variant.

Region(chromosome, start, end, name)

A genomic region on a chromosome, defined by start and end positions and a name.


Simulation

The sim module provides classes and functions for generating simulated genotypes.

BAMS(pen_table, …)

Biallelic Model Simulator.

SNPEffectEncodings(value)

Enum: Normalized SNP Effects encoded as 3-length tuples
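To illustrate the idea of an enum whose values are 3-length tuples, the sketch below defines a small stand-in. The class name EffectEncoding, its member names, and its values are assumptions for illustration, as is the tuple ordering (homozygous reference, heterozygous, homozygous alternate); consult SNPEffectEncodings itself for the real members.

```python
from enum import Enum


class EffectEncoding(Enum):
    """Illustrative enum; member names and values are assumptions."""
    RECESSIVE = (0.0, 0.0, 1.0)
    ADDITIVE = (0.0, 0.5, 1.0)
    DOMINANT = (0.0, 1.0, 1.0)


def effect_for(encoding, alt_allele_count):
    """Look up the normalized effect for 0, 1, or 2 alternate alleles."""
    if alt_allele_count not in (0, 1, 2):
        raise ValueError("alt_allele_count must be 0, 1, or 2")
    return encoding.value[alt_allele_count]
```

The `(value)` in the signature reflects the standard Enum call form: passing a value returns the matching member, so `EffectEncoding((0.0, 0.5, 1.0))` gives `EffectEncoding.ADDITIVE` in this sketch.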

PenetranceTables(value)

Enum: Penetrance Tables for Simple Models

generate_random_gt(variant, alt_allele_freq)

Simulate random genotypes according to the provided allele frequencies.
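A minimal sketch of simulating genotypes from an allele frequency follows. It assumes a diploid, biallelic variant and draws the two alleles of each genotype independently, so it does not cover multiple alternate alleles. The function simulate_genotypes and its n and seed parameters are hypothetical and are not the signature of generate_random_gt.

```python
import random


def simulate_genotypes(alt_allele_freq, n, seed=None):
    """Return n genotypes as (allele, allele) tuples; 0 = ref, 1 = alt."""
    if not 0.0 <= alt_allele_freq <= 1.0:
        raise ValueError("alt_allele_freq must be between 0 and 1")
    rng = random.Random(seed)  # a seeded generator gives reproducible draws
    genotypes = []
    for _ in range(n):
        # Each allele is the alternate allele with probability alt_allele_freq.
        first = int(rng.random() < alt_allele_freq)
        second = int(rng.random() < alt_allele_freq)
        genotypes.append((first, second))
    return genotypes
```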


Arrays

This module contains ExtensionArrays and their corresponding ExtensionDtypes.

GenotypeDtype(variant)

An ExtensionDtype for genotype data.

GenotypeArray(values, …)

Holder for genotypes

Specialized methods are added to the GenotypeArray using Mixins:

encoding_mixin.EncodingMixin()

Genotype Mixin containing functions for performing encoding

info_mixin.InfoMixin()

Genotype Mixin containing functions for calculating various information
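The mixin pattern described above can be sketched in plain Python: each mixin contributes a group of methods, and the holder class inherits from all of them. Every class and method name below is hypothetical and chosen only for illustration; none of them is claimed to exist on GenotypeArray, EncodingMixin, or InfoMixin.

```python
class EncodingSketchMixin:
    """Hypothetical mixin grouping encoding-style methods."""

    def encode_additive(self):
        # Count of alternate alleles per genotype (0, 1, or 2).
        return [sum(gt) for gt in self.genotypes]


class InfoSketchMixin:
    """Hypothetical mixin grouping summary-information methods."""

    def alt_allele_freq(self):
        total_alleles = 2 * len(self.genotypes)
        if total_alleles == 0:
            return float("nan")
        return sum(sum(gt) for gt in self.genotypes) / total_alleles


class GenotypeHolder(EncodingSketchMixin, InfoSketchMixin):
    """Hypothetical holder; the mixins only rely on self.genotypes."""

    def __init__(self, genotypes):
        self.genotypes = list(genotypes)
```

Splitting methods this way keeps the holder class small while still exposing every method on a single object, for example `GenotypeHolder([(0, 1)]).encode_additive()`.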


Accessors

This module contains ‘genomics’ accessors for DataFrames and Series.

GenotypeSeriesAccessor(obj)

Series accessor for GenotypeArray methods

GenotypeDataframeAccessor(pandas_obj)

DataFrame accessor for GenotypeArray methods