- Public release!

- Minor non-code changes for first CRAN submission.

- README.md now contains instructions for installing from CRAN as well as from GitHub.

Added option

`noFixed`

to function`rbnpsd`

to redraw loci that were drawn fixed for a single allele. These loci are not polymorphic so they would normally not be considered in analyses.Added function

`fixed_loci`

to test for fixed loci within rbnpsd.

- Added function
`coanc_to_kinship`

to easily obtain kinship matrices from coancestry matrices.

- Second CRAN submission.

- Converted the vignette from PDF to HTML

`qis`

now returns a numeric admixture proportions matrix (used to be logical).`q1d`

and`q1dc`

now handle`sigma = 0`

special case.`q1d`

and`q1dc`

now provide more informative out-of-bounds messages when`sigma`

is missing (and`s`

is provided)`sigma`

root finding in`q1d`

and`q1dc`

(when`s`

is provided) is now more robust, explicitly tested at boundaries (min`s > 0`

achieved at`sigma = 0`

and max`s = 1`

achieved at`sigma = Inf`

).- Removed arguments
`interval`

and`tol`

from both`q1d`

and`q1dc`

(users would never need to set them now that procedure is more robust).

- Removed arguments
- Updated coding style, renamed some internal functions and variables.

- Renamed most functions for clarity:
`coanc`

->`coanc_admix`

`q1d`

->`admix_prop_1d_linear`

`q1dc`

->`admix_prop_1d_circular`

`qis`

->`admix_prop_indep_subpops`

`rpanc`

->`draw_p_anc`

`rpint`

->`draw_p_subpops`

`rpiaf`

->`make_p_ind_admix`

`rgeno`

->`draw_genotypes_admix`

`rbnpsd`

->`draw_all_admix`

`fst`

->`fst_admix`

(no deprecated version available in this case, to eliminate conflict with`popkin::fst`

)- Functions with old names remain for now as deprecated functions (to be removed in the future).

- Renamed several recurrent argument names for clarity:
`Q`

->`admix_proportions`

`F`

->`coanc_subpops`

(if general matrix is accepted),`inbr_subpops`

(vector or scalar versions required)`s`

->`bias_coeff`

`w`

->`weights`

`Theta`

->`coancestry`

`m`

->`m_loci`

`n`

->`n_ind`

`k`

->`k_subpops`

`pAnc`

->`p_anc`

`B`

->`p_subpops`

`P`

->`p_ind`

- Deprecated functions still accept old argument names.

- Fixed a
`sigma = 0`

bug in`admix_prop_1d_circular`

. - Changed default values for
`draw_all_admix`

(compared to deprecated`rbnpsd`

, which retains old defaults):`require_polymorphic_loci`

(old`noFixed`

) is now`TRUE`

by default.`want_p_ind`

and`want_p_subpops`

(old`wantP`

and`wantB`

) are now`FALSE`

by default.- Names (following above conventions) and order of items in return list changed.

`draw_p_subpops`

now admits scalar inputs`p_anc`

and`inbr_subpops`

, while number of loci and number of subpopulations can be provided as additional options.- Added more input checks to functions, informative error messages.
- Updated documentation, particularly on whether intermediate subpopulation coancestries are accepted generally (
`coanc_subpops`

) or if the diagonal matrix case is required (specified as vector or scalar`inbr_subpops`

).

- Third CRAN submission.
- Added ORCIDs to authors.
- Corrected doc typos.
- Adjusted layout of subpopulations and individuals (default limits) for circular 1D geography (
`admix_prop_1d_circular`

) to prevent overlapping individuals on the edges, and to better agree visually with the linear version (`admix_prop_1d_linear`

).

- Non-code changes:
- Edited .Rbuildignore to stop ignoring README; also removed non-existent files from list
- Removed unused .travis.yml and bnpsd.Rproj files

- Improved memory efficiency of default
`draw_genotypes_admix`

- Old approach was by default very memory-hungry (created IAF matrix whole when admixture proportions were provided). The
`low_mem`

option could be set but filled slowly by locus only. - New approach is always low-memory (so the explicit option was removed). This was made faster by filling by individual when there are fewer individuals than loci, or filling by locus otherwise, therefore always vectorizing as much as possible. Test showed this was always as fast as the original full memory approach, so the latter was removed as an option.

- Old approach was by default very memory-hungry (created IAF matrix whole when admixture proportions were provided). The
`draw_all_admix`

is also now automatically low-memory whenever`want_p_ind = FALSE`

, and the explicit`low_mem`

option has also been removed.- Updated documentation to use RMarkdown
- Other code tidying

- Added option
`beta`

in function`draw_p_anc`

to trigger a symmetric Beta distribution for the ancestral allele frequencies, with the desired shape parameter. The`beta`

option can also be set on the wrapper function`draw_all_admix`

. This option allows simulation of a distribution heavier on rare variants (when`beta`

is much smaller than 1), more similar to real human data.

- Fourth CRAN submission.
- Removed deprecated function names:
`q1dc`

,`q1d`

,`qis`

,`coanc`

,`rbnpsd`

,`rgeno`

,`rpanc`

,`rpint`

,`rpiaf`

. - Moved logo to
`man/figures/`

- Minor Roxygen-related updates.

- Fourth CRAN submission, second attempt.
- Fixed a rare bug in
`bias_coeff_admix_fit`

, which caused it to die if the desired bias coefficient was an extreme value (particularly`1`

). The error message was:`f() values at end points not of opposite sign`

. The actual bug was not observed in the regular R build, but rather in a limited precision setting where R was configured with`--disable-long-double`

.

- Added option
`p_anc`

to function`draw_all_admix`

, to specify desired ancestral allele frequencies instead of having the code generate it randomly (default). - Added details for documentation of function
`draw_p_subpops.R`

, clarifying that input`p_anc`

can be scalar.

- Function
`draw_all_admix`

: when option`p_anc`

is provided as scalar and`want_p_anc = TRUE`

, now the return value is always a vector (in this case the input scalar value repeated`m_loci`

times). The previous behavior was to return`p_anc`

as scalar if that was the input, which could be problematic for downstream applications.

- 5th CRAN submission
- Functions
`admix_prop_1d_linear`

and`admix_prop_1d_circular`

had these changes:- The optional parameters
`bias_coeff`

,`coanc_subpops`

and`fst`

now have default values (of`NA`

,`NULL`

, and`NA`

, respectively) instead of missing, and these “missing” values can be passed to get the same behavior as if they hadn’t been passed at all. - Their documentation has been clarified.
- Improved internal code to handle edge case
`bias_coeff = 1`

(to fix an issue only observed on Apple M1).

- The optional parameters
- Function
`admix_prop_indep_subpops`

: default value for the optional parameter`subpops`

is now made more clear in arguments definition. - Simplified documentation (most functions) by clarifying language, using markdown roxygen, and replacing all LaTeX equations with simpler code equations.
- Updated paper citations in
`DESCRIPTION`

,`README.md`

and the vignette, to point to the published method in PLoS Genetics.

- Documentation updates:
- Fixed links to functions, in many cases these were broken because of incompatible mixed Rd and markdown syntax (now markdown is used more fully).

- Added support for intermediate subpopulations related by a tree
- New function
`draw_p_subpops_tree`

is the tree version of`draw_p_subpops`

. - New function
`coanc_tree`

calculates the true coancestry matrix corresponding to the subpopulations related by a tree. - Function
`draw_all_admix`

has new argument`tree_subpops`

that can be used in place of`inbr_subpops`

(to simulated subpopulation allele frequencies using`draw_p_subpops_tree`

instead of`draw_p_subpops`

). - Note: These other functions work for trees (without change) because they accept arbitrary coancestry matrices (param
`coanc_subpops`

) as input, so they work if they are passed the matrix that`coanc_tree`

returns:`coanc_admix`

,`fst_admix`

,`admix_prop_1d_linear`

,`admix_prop_1d_circular`

.

- New function
- Functions
`admix_prop_1d_linear`

and`admix_prop_1d_circular`

, when`sigma`

is missing (and therefore fit to a desired`coanc_subpops`

,`fst`

, and`bias_coeff`

), now additionally return multiplicative`factor`

used to rescale`coanc_subpops`

.

It’s Fangorn Forest around here with all the tree updates!

- Added these functions:
`fit_tree`

for fitting trees to coancestry matrices!`scale_tree`

to easily scale coancestry trees and check for out-of-bounds values.`tree_additive`

for calculating “additive” edges for probabilistic edge coancestry trees, and also the reverse function .- This already existed as an internal, unexported function used mainly by
`coanc_tree`

, but now it’s renamed, exported, and well documented!

- This already existed as an internal, unexported function used mainly by

- Added support of
`$root.edge`

to tree`phylo`

objects passed to these functions:`coanc_tree`

: edge is a shared covariance value affecting all subpopulations.`draw_all_admix`

and`draw_p_subpops_tree`

: if root edge is present, functions warn that it will be ignored.

- Functions
`admix_prop_1d_linear`

and`admix_prop_1d_circular`

: debugged an edge case where`sigma`

is small but not zero and numerically-calculated densities all come out to zero in a given row of the`admix_proportions`

matrix (for`admix_prop_1d_circular`

infinite values also arise), which used to lead to NAs upon row normalization; now for those rows, the closest ancestry (by coordinate distance) gets assigned the full admixture fraction (just as for independent subpopulations/`sigma = 0`

).

- Updated various functions to transfer names between inputs and outputs as it makes sense
- Functions
`admix_prop_1d_linear`

,`admix_prop_1d_circular`

now copy names from the input`coanc_subpops`

(vector and matrix versions, only required when fitting`bias_coeff`

) to the columns of the output`admix_proportions`

matrix. - Function
`draw_genotypes_admix`

now copies row and column names from input matrix`p_ind`

(or rownames from`p_ind`

and column names from the rownames of`admix_proportions`

when the latter is provided) to output genotype matrix - Function
`draw_p_subpops`

now copies names from`p_anc`

to rows, names from`inbr_subpops`

to columns, when present and of the right dimensions. - Function
`draw_p_subpops_tree`

now copies names from`p_anc`

to rows. Names from`tree_subpops`

were already copied to columns before. - All other functions already transferred names as desired/appropriate. Tests were added for these functions to ensure that this is so.

- Functions
- Updated various functions to stop if there are paired names for two objects that are both non-NULL and disagree, as this suggests that the data is misaligned or incompatible.
- Functions
`coanc_admix`

and`fst_admix`

stop if the column names of`admix_proportions`

and the names of`coanc_subpops`

disagree. - Function
`draw_all_admix`

stops if the column names of`admix_proportions`

and the names of either`inbr_subpops`

or`tree_subpops`

disagree. - Function
`draw_genotypes_admix`

, when`admix_proportions`

is passed, stops if the column names of`admix_proportions`

and`p_ind`

disagree. - Function
`make_p_ind_admix`

stops if the column names of`admix_proportions`

and`p_subpops`

disagree.

- Functions
- Function
`tree_additive`

now has option`force`

, which when`TRUE`

simply proceeds without stopping if additive edges were already present (in`tree$edge.length.add`

, which is ignored and overwritten).

New functions and bug fixes dealing with reordering tree edges and tips.

- Added function
`tree_reindex_tips`

for ensuring that tip order agrees in both the internal labels vector and the edge matrix. Such lack of agreement is generally possible (technically the tree is the same for arbitrary orders of edges in the edge matrix). However, such a disagreement causes visual disagreement in plots (for example, trees are plotted in the order of the edge matrix, versus coancestry matrices are ordered as in the tip labels vector instead), which can now be fixed in general. - Added function
`tree_reorder`

for reordering tree edges and tips to agree as much as possible with a desired tip order. The heuristic finds the exact solution if it exists, otherwise returns a reasonable order close to the desired order. Tip order in labels and edge matrix agree (via`tree_reindex_tips`

). - Function
`fit_tree`

now outputs trees with tip order that better agrees with the input data, and tip order in labels vector and edge matrix now agree (via`tree_reorder`

). - Several functions now work with trees whose edges are arbitrarily ordered, particularly when they do not move out from the root (i.e. reverse postorder):
- Function
`tree_additive`

. Before this bug fix, some trees could trigger the error message “Error: Node index 6 was not assigned coancestry from root! (unexpected)”, where “6” could be other numbers. - Function
`draw_p_subpops_tree`

. Before this bug fix, some trees could trigger the error message “Error: The root node index in`tree_subpops$edge`

(9) does not match`k_subpops + 1`

(6) where`k_subpops`

is the number of tips! Is the`tree_subpops`

object malformed?”, where “9” and “6” could be other numbers. Other possible error messages contain “Parent node index 6 has not yet been processed …” or “Child” instead of “Parent”, where “6” could be other numbers. - Internal functions used by
`fit_tree`

had related fixes, but overall`fit_tree`

appears to have had no bugs because users cannot provide trees, and the tree-building algorithm does not produce scrambled edges that would have caused problems.

- Function

- Functions
`fixed_loci`

and`draw_all_admix`

have a new parameter`maf_min`

that, when greater than zero, allows for treating rare variants as fixed. In`draw_all_admix`

, this now allows for simulating loci with frequency-based ascertainment bias.

- Fixed a rare bug in
`draw_all_admix`

that could cause a “stack overflow” error. The function used to call itself recursively if`require_polymorphic_loci = TRUE`

, and in cases where there are very rare allele frequencies or high`maf_min`

the number of recursions could be so large that it triggered this error. Now the function has a`while`

loop, and does not recurse more than one level at the time; there is no limit to the number of iterations and no errors occur inherently due to large numbers of iterations.

- Function
`fit_tree`

internally simplified to use`stats::hclust`

, which also results in a small runtime gain. The new code (when`method = "mcquitty"`

, which is default) gives the same answers as before (in other words, the original algorithm was a special case of hierarchical clustering).- New option
`method`

is passed to`hclust`

. Although all`hclust`

methods are allowed, for this application the only ones that make sense are “mcquitty” (WPGMA) and “average” (UPGMA). In internal evaluations, both algorithms had similar accuracy and runtime, but only “mcquitty” exactly recapitulates the original algorithm.

- New option

- Updated citations in
`inst/CITATION`

(missed last time I updated them in other locations).

- Added function
`undiff_af`

for creating “undifferentiated” allele frequency distributions based on real data but with a lower variance (more concentrated around 0.5) according to a given FST, useful for simulating data trying to match real data. - Added
`LICENSE.md`

. - Reformatted this
`NEWS.md`

slightly to improve its automatic parsing.

- Function
`undiff_af`

:- Added several useful informative statistics to return list:
`F_max`

,`V_in`

,`V_out`

,`V_mix`

, and`alpha`

. - Debugged
`distr = "auto"`

cases where mixing variance ended up being smaller than required due to roundoff errors (`alpha`

is now larger than given in direct formula by`eps = 10 * .Machine$double.eps`

, which is also a new option.

- Added several useful informative statistics to return list:

- Function
`draw_all_admix`

added option`p_anc_distr`

for passing custom ancestral allele frequency distributions (as vector or function). This differs from the similar preexisting option`p_anc`

, which fixed ancestral allele frequencies per locus to those values. These two options behave differently when loci have to be re-drawn due to being fixed or having too-low MAFs: passing`p_anc`

never changes those values, whereas passing`p_anc_distr`

results in drawing new values as necessary. The new option is more natural biologically and results in re-drawing fixed loci less often.

- Function
`undiff_af`

renamed parameter`F`

to`kinship_mean`

, and updated all documentation to reflect the correction that this parameter is the mean kinship and not FST (the complete derivation will appear in a manuscript).- One element in the return list previously called
`F_max`

is similarly now`kinship_mean_max`

.

- One element in the return list previously called

- 6th CRAN submission.
- Removed “LazyData: true” from DESCRIPTION (to avoid a new “NOTE” on CRAN).
- Fixed spelling in documentation.

- 6th CRAN submission, second attempt.
- Debugged internal code (
`bias_coeff_admix_fit`

) shared by`admix_prop_1d_linear`

and`admix_prop_1d_circular`

for edge cases.- Error was only observed on M1mac architecture (previous code worked on all other systems!).
- If a bias coefficient of 1 was desired, expected sigma to be
`Inf`

, but instead an error was encountered. - Previous error message: “f() values at end points not of opposite sign” (in
`stats::uniroot`

)