I’ve generated a VCF file with filtered SNPs. I’ve tried to use PLINK to generate genotype data coded as 0, 1, 2 using the --recodeAD option along with --allow-extra-chr and --double-id (my sample names contain underscores).
My plink.raw file contains not only 0s, 1s, and 2s but also many NAs. I’ve referred to the PLINK manual but can’t work out what they mean. The output looks something like this:
Indv1 0 NA NA 0 0 0 0 0
Indv2 0 0 0 0 0 0 0 0 0 0
Indv3 NA NA 1 1 0 0 1 1
I am very new to this kind of analysis. It would be great to get some input on what the NAs mean and whether there is a simpler approach. My ultimate aim is to use this data to generate a genotype heat map for my individuals in R, so I just want the position and genotype (the two homozygous classes and the heterozygous class) for each sample.
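
To make the target table concrete, here is a minimal Python sketch of reading a .raw-style file into a per-sample genotype list and counting the NA entries. It rests on several assumptions: the file is whitespace-delimited with PLINK's six leading columns (FID, IID, PAT, MAT, SEX, PHENOTYPE) followed by one additive column per SNP, NA stands for a missing genotype call, and only additive columns are present (with the AD recoding there would also be a _HET column per SNP). The SNP names, the inline data, and the helper read_raw are hypothetical and only for illustration.

```python
import io

# Hypothetical stand-in for a plink.raw file; the real file would be opened with open("plink.raw").
RAW_TEXT = """FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_A snp3_A
Indv1 Indv1 0 0 0 -9 0 NA 2
Indv2 Indv2 0 0 0 -9 0 0 1
Indv3 Indv3 0 0 0 -9 NA 1 0
"""

def read_raw(handle):
    """Parse .raw-style text into (snp_names, {sample_id: [genotype, ...]}).

    Genotypes are 0, 1, or 2; an NA entry is stored as None (assumed missing call).
    """
    header = handle.readline().split()
    snp_names = header[6:]  # skip FID, IID, PAT, MAT, SEX, PHENOTYPE
    genotypes = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        sample_id = fields[1]  # IID column
        genotypes[sample_id] = [None if v == "NA" else int(v) for v in fields[6:]]
    return snp_names, genotypes

snp_names, genotypes = read_raw(io.StringIO(RAW_TEXT))

# Count missing calls per sample, to see how much of the matrix is NA.
missing_per_sample = {
    sample_id: sum(call is None for call in calls)
    for sample_id, calls in genotypes.items()
}

print(snp_names)
print(genotypes)
print(missing_per_sample)
```

The resulting samples-by-SNPs structure is the shape a heat map function would need; how the missing entries should be displayed or filtered is a separate decision.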