measureAggregateRsquared measures the aggregate r squared of imputed .gen files against a validation (truth) .gen file.
measureAggregateRsquared --validation truth.gen.gz --imputed imputed.gen.gz \
--sample truth_and_imputed.samples --freq allele_frequencies_of_imputed_sites.freq \
--bin allele_frequency_bins.txt --output output_base
Make sure the truth and imputed .gen files contain the same samples in the same order, which is defined in the .samples file.
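A quick sanity check before running the tool might look like the sketch below. It relies on the IMPUTE2 .gen layout (five leading columns, then three genotype probabilities per sample) and on the .samples file having two header lines, as in the example further down. The function names are hypothetical and not part of measureAggregateRsquared. Since .gen files carry no sample IDs, this can only confirm that the sample counts agree, not that the order matches.

```python
import gzip


def count_samples(samples_path):
    """Count individuals in a .samples file, skipping its two header lines."""
    with open(samples_path) as handle:
        rows = [line for line in handle if line.strip()]
    return len(rows) - 2  # column-name line + column-type line


def gen_sample_count(gen_path):
    """Infer the number of individuals from the first line of a .gen(.gz) file."""
    opener = gzip.open if gen_path.endswith(".gz") else open
    with opener(gen_path, "rt") as handle:
        fields = handle.readline().split()
    # Leading columns: SNP ID, rsID, position, allele A, allele B.
    n_prob_columns = len(fields) - 5
    if n_prob_columns < 0 or n_prob_columns % 3 != 0:
        raise ValueError(f"{gen_path}: unexpected column count {len(fields)}")
    return n_prob_columns // 3


def check_inputs(truth_gen, imputed_gen, samples_path):
    """Raise ValueError unless both .gen files match the .samples sample count."""
    expected = count_samples(samples_path)
    for path in (truth_gen, imputed_gen):
        found = gen_sample_count(path)
        if found != expected:
            raise ValueError(f"{path}: {found} samples, expected {expected}")
    return expected
```

For example, `check_inputs("truth.gen.gz", "imputed.gen.gz", "truth_and_imputed.samples")` returns the shared sample count or raises if the files disagree.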
This code was written by Olivier Delaneau and Warren Kretzschmar. The maintainer of this code is Warren Kretzschmar.
To report a problem, please raise an issue on the GitHub page.
The --validation and --imputed input files are Impute2 .gen files.
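To make the inputs concrete: each .gen line holds three genotype probabilities (AA, AB, BB) per sample after five leading columns. The sketch below converts a line to expected allele B dosages and computes a squared Pearson correlation over pooled values. Treating "aggregate r squared" as a correlation pooled across sites is an assumption made for illustration, and both function names are hypothetical; the tool's actual computation may differ.

```python
def dosages(gen_line):
    """Expected allele B dosage per sample: P(AB) + 2 * P(BB)."""
    probs = [float(x) for x in gen_line.split()[5:]]
    if len(probs) % 3 != 0:
        raise ValueError("expected three probabilities per sample")
    return [probs[i + 1] + 2.0 * probs[i + 2] for i in range(0, len(probs), 3)]


def pooled_r_squared(truth, imputed):
    """Squared Pearson correlation between two equal-length dosage lists.

    Assumption: dosages from all sites in a bin are concatenated before
    calling this function.
    """
    n = len(truth)
    if n == 0 or n != len(imputed):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(truth) / n
    mean_i = sum(imputed) / n
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(truth, imputed))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_t == 0 or var_i == 0:
        return float("nan")  # correlation is undefined without variation
    return cov * cov / (var_t * var_i)
```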
Below is a set of examples for the other input files.
Sample file (--sample): the comparison is made by population, and multiple populations are allowed.
ID_1 ID_2 missing pop
0 0 0 D
NA07346 NA07346 0 EUR
NA11832 NA11832 0 EUR
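As an illustration of how the pop column partitions the samples, the sketch below groups sample positions by population. It assumes whitespace-separated columns, a column named pop in the first line, and a second line of column types, as in the example above. The function name is hypothetical.

```python
from collections import defaultdict


def samples_by_population(lines):
    """Map each population label to the 0-based positions of its samples."""
    rows = [line.split() for line in lines if line.strip()]
    header, body = rows[0], rows[2:]  # rows[1] is the column-type line
    pop_col = header.index("pop")
    groups = defaultdict(list)
    for index, row in enumerate(body):
        groups[row[pop_col]].append(index)
    return dict(groups)
```

The positions refer to the sample order shared by the truth and imputed .gen files.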
Frequency file (--freq): the first line names the population. Each following line gives the allele frequency in that population for one site in the truth.gen file. Multiple columns are allowed, one per population.
EUR
0.2214
0.02241
0.3206
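A minimal reader for this layout could look like the following. It assumes the columns are separated by whitespace, which the example above does not confirm for the multi-population case, and the function name is hypothetical.

```python
def read_frequencies(lines):
    """Return {population: [frequency per site]} from frequency-file lines."""
    rows = [line.split() for line in lines if line.strip()]
    populations = rows[0]
    table = {pop: [] for pop in populations}
    for row in rows[1:]:
        if len(row) != len(populations):
            raise ValueError("each row needs one value per population")
        for pop, value in zip(populations, row):
            table[pop].append(float(value))
    return table
```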
Bin file (--bin): each line defines one boundary between allele frequency bins.
Whether the first boundary is included can be changed using the "--discard-monomorphic" flag (I think).
0.000
0.005
0.010
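To show how boundaries turn into bins, the sketch below assigns a frequency to a bin index. It assumes bins are half-open intervals (lower, upper], so a frequency equal to the first boundary is dropped unless include_lowest is set. That convention and the include_lowest parameter are assumptions for illustration, not the tool's documented behaviour.

```python
from bisect import bisect_left


def assign_bin(freq, boundaries, include_lowest=False):
    """Return the 0-based bin index for freq, or None if it falls in no bin.

    boundaries must be sorted ascending; bin k covers
    (boundaries[k], boundaries[k + 1]].
    """
    if include_lowest and freq == boundaries[0]:
        return 0
    index = bisect_left(boundaries, freq) - 1
    if index < 0 or index >= len(boundaries) - 1:
        return None  # below the first or above the last boundary
    return index
```

With the boundaries above, a frequency of 0.005 lands in the first bin and 0.007 in the second.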
Output files are written to the output base given by --output.
The test suite requires an installed version of CPAN. To install the Perl dependencies for the test runners:
make test-setup
To run the tests:
make test