pandas_genomics.arrays.GenotypeArray

class pandas_genomics.arrays.GenotypeArray(values: Union[List[pandas_genomics.scalars.Genotype], pandas_genomics.arrays.genotype_array.GenotypeArray, numpy.ndarray], dtype: Optional[pandas_genomics.arrays.genotype_array.GenotypeDtype] = None, copy: bool = False)

Holder for genotypes

Variant information is stored as part of the type, and the genotype is stored as a pair of integer arrays

Parameters

values: list-like: The values of the genotypes.
dtype: GenotypeDtype: The specific parameterized type. Optional if it can be inferred from values.

Attributes

dtype: GenotypeDtype: The specific parameterized type
data: np.dtype("u8") with shape (<genotypes>, <ploidy>): The genotype values encoded as indices into the allele list of the dtype
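The layout described for `data` can be illustrated with plain NumPy. This is a minimal sketch that does not use pandas_genomics itself; the allele list, the sample genotypes, and the variable names are assumptions made for illustration, and it assumes index 0 is the reference allele.

```python
import numpy as np

# Hypothetical allele list for one variant; index 0 is treated as the reference allele.
alleles = ["A", "C"]

# Three diploid genotypes written as allele strings: A/A, A/C, C/C.
genotype_strings = [("A", "A"), ("A", "C"), ("C", "C")]

# Encode each allele as its index in the allele list,
# giving an array of shape (<genotypes>, <ploidy>).
allele_idxs = np.array(
    [[alleles.index(a) for a in gt] for gt in genotype_strings],
    dtype="u8",
)

# Decode back to allele strings by indexing into the allele list.
decoded = [tuple(alleles[i] for i in row) for row in allele_idxs]
```

Keeping the allele list on the type and only integer indices in the array means the per-sample storage stays compact regardless of how long the allele strings are.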

__init__(values: Union[List[pandas_genomics.scalars.Genotype], pandas_genomics.arrays.genotype_array.GenotypeArray, numpy.ndarray], dtype: Optional[pandas_genomics.arrays.genotype_array.GenotypeDtype] = None, copy: bool = False): Initialize assuming values is a GenotypeArray or a numpy array with the correct underlying shape

Methods

`__init__`(values[, dtype, copy])	Initialize assuming values is a GenotypeArray or a numpy array with the correct underlying shape
`argmax`([skipna])	Return the index of maximum value.
`argmin`([skipna])	Return the index of minimum value.
`argsort`([ascending, kind, na_position])	Return the indices that would sort this array.
`astype`(dtype[, copy])	Cast to a NumPy array with ‘dtype’.
`copy`()	Return a copy of the array.
`delete`(loc)
`dropna`()	Return ExtensionArray without NA values.
`encode_additive`()	Additive Encoding
`encode_codominant`()	This encodes the genotype into three categories.
`encode_dominant`()	Dominant Encoding
`encode_edge`(alpha_value, ref_allele, …)	Perform EDGE (weighted) encoding.
`encode_recessive`()	Recessive Encoding
`equals`(other)	Return if another array is equivalent to this array.
`factorize`([na_sentinel])	Return an array of ints indexing unique values
`fillna`([value, method, limit])	Fill NA/NaN values using the specified method.
`is_genotype_array`(other)
`isin`(values)	Pointwise comparison for set containment in the given values.
`isna`()	A 1-D array indicating if each value is missing
`ravel`([order])	Return a flattened view on this array.
`repeat`(repeats[, axis])	Repeat elements of an ExtensionArray.
`searchsorted`(value[, side, sorter])	Find indices where elements should be inserted to maintain order.
`set_reference`(allele)	Change the reference allele (in-place) by specifying an allele index value or an allele string
`shift`([periods, fill_value])	Shift values by desired number.
`take`(indexer[, allow_fill, fill_value])	Take elements from an array.
`to_numpy`([dtype, copy, na_value])	Convert to a NumPy ndarray.
`transpose`(*axes)	Return a transposed view on this array.
`unique`()	Return a GenotypeArray of unique values
`value_counts`([dropna])	Return a Series of unique counts with a GenotypeArray index
`view`([dtype])	Return a view on the array.
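The `encode_additive`, `encode_dominant`, `encode_recessive`, and `encode_codominant` entries above name standard genetic encodings. The sketch below shows the conventional definitions on a plain NumPy index array for a diploid, biallelic variant. It is not the library's implementation: the return types, category labels, and missing-value handling of the real methods are not documented on this page, and the "Ref"/"Het"/"Hom" labels here are illustrative assumptions.

```python
import numpy as np

# Allele indices with shape (<genotypes>, <ploidy>); 0 is assumed to be the reference allele.
allele_idxs = np.array([[0, 0], [0, 1], [1, 1]], dtype="u8")

# Number of non-reference alleles carried by each sample.
alt_count = (allele_idxs != 0).sum(axis=1)

# Additive: the count of alternate alleles (0, 1, or 2).
additive = alt_count

# Dominant: 1 if at least one alternate allele is present.
dominant = (alt_count >= 1).astype(int)

# Recessive: 1 only if both alleles are alternate.
recessive = (alt_count == 2).astype(int)

# Codominant: three unordered categories (labels are illustrative).
codominant = np.array(["Ref", "Het", "Hom"])[alt_count]
```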

Attributes

`T`
`allele_idxs`	Return the allele indices for each genotype
`dtype`	The specific parameterized type
`gt_scores`	Return the genotype score for each genotype (as a float)
`hwe_pval`	Calculate the probability that the samples are in HWE for diploid variants
`is_heterozygous`	Boolean array: True if the sample is heterozygous (carries differing alleles)
`is_homozygous`	Boolean array: True if the sample is homozygous for any allele
`is_homozygous_alt`	Boolean array: True if the sample is homozygous for any non-reference allele
`is_homozygous_ref`	Boolean array: True if the sample is homozygous for the reference allele
`is_missing`	Boolean array: True if the sample is missing all alleles
`maf`	Calculate the Minor Allele Frequency (MAF) for the most-frequent alternate allele.
`nbytes`	How many bytes to store this object in memory
`ndim`	Extension Arrays are only allowed to be 1-dimensional.
`shape`	Return a tuple of the array dimensions.
`size`	The number of elements in the array.
`variant`	Return the variant identifier
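The zygosity flags and `maf` listed above can be approximated from an allele-index array. The following is a rough NumPy sketch for diploid data with no missing genotypes; the sample data and variable names are assumptions, and the library's own handling of missing values and ploidy may differ.

```python
import numpy as np

# Four diploid samples; 0 is assumed to be the reference allele index.
allele_idxs = np.array([[0, 0], [0, 1], [1, 1], [0, 0]], dtype="u8")
first, second = allele_idxs[:, 0], allele_idxs[:, 1]

# Zygosity flags, one boolean per sample.
is_homozygous = first == second
is_heterozygous = first != second
is_homozygous_ref = is_homozygous & (first == 0)
is_homozygous_alt = is_homozygous & (first != 0)

# Count every allele across all samples (bincount needs a signed integer input).
allele_counts = np.bincount(allele_idxs.ravel().astype(np.int64), minlength=2)

# Frequency of the most frequent alternate allele among all observed alleles.
maf = allele_counts[1:].max() / allele_counts.sum()
```

With the sample data above there are five reference and three alternate alleles, so `maf` is 3/8.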