Contributing to unmarked: guide to adding a new model to unmarked

Ken Kellner

Léa Pautrel

December 08, 2023

Follow the steps in this guide to add a new model to the unmarked package. Note that the order can be adjusted based on your preferences. For instance, you can start with the likelihood function, as it forms the core of adding a model to unmarked, and then build the rest of the code around it. In this document, the steps are ordered as they would occur in an unmarked analysis workflow.

This guide uses the recently developed gdistremoval function for examples, mainly because most of the relevant code is in a single file instead of spread around. It also uses occu functions to show simpler examples that may be easier to understand.

Prerequisites and advice

1 Organise the input data: design the unmarkedFrame object

Most model types in unmarked have their own unmarkedFrame, a specialized kind of data frame. This is an S4 object which contains, at a minimum, the response (y). It may also include site covariates, observation covariates, primary period covariates, and other info related to study design (such as distance breaks).

In some cases you may be able to use an existing unmarkedFrame subclass. You can list all the existing unmarkedFrame subclasses by running the following code:

showClass("unmarkedFrame")
## Class "unmarkedFrame" [package "unmarked"]
## 
## Slots:
##                                                                               
## Name:                  y           obsCovs          siteCovs           mapInfo
## Class:            matrix optionalDataFrame optionalDataFrame   optionalMapInfo
##                         
## Name:             obsToY
## Class:    optionalMatrix
## 
## Extends: "unmarkedFrameOrNULL"
## 
## Known Subclasses: 
## Class "unmarkedMultFrame", directly
## Class "unmarkedFrameDS", directly
## Class "unmarkedFrameOccu", directly
## Class "unmarkedFrameOccuFP", directly
## Class "unmarkedFrameOccuMulti", directly
## Class "unmarkedFramePCount", directly
## Class "unmarkedFrameMPois", directly
## Class "unmarkedFrameOccuCOP", directly
## Class "unmarkedFrameOccuMS", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameOccuTTD", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameG3", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFramePCO", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameGDR", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameGMM", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameGDS", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameGPC", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameGOccu", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameMMO", by class "unmarkedMultFrame", distance 4
## Class "unmarkedFrameDSO", by class "unmarkedMultFrame", distance 4

You can find more information about each unmarkedFrame subclass in the documentation of the function that creates objects of that subclass, for example with ?unmarkedFrameGDR, or on the package’s website.

1.1 Define the unmarkedFrame subclass for this model

1.2 Write the function that creates the unmarkedFrame object
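As a minimal sketch of sections 1.1 and 1.2, the code below defines an S4 subclass with one extra slot and a constructor function that builds it. Everything named here is hypothetical: demoFrame is a simplified stand-in for unmarkedFrame so that the sketch runs without the package, and demoFrameDuration and its duration slot are invented for illustration. In the package itself you would use contains = "unmarkedFrame" instead of the stand-in.

```r
library(methods)

# Stand-in for the real parent class (assumption: only the y slot is mimicked).
setClass("demoFrame", slots = c(y = "matrix"))

# Hypothetical subclass adding one study-design slot: survey duration per visit.
setClass(
  "demoFrameDuration",
  contains = "demoFrame",
  slots = c(duration = "matrix"),
  validity = function(object) {
    if (!identical(dim(object@duration), dim(object@y))) {
      return("duration must have the same dimensions as y")
    }
    TRUE
  }
)

# Constructor: coerce user input, fill in defaults, then build the object.
demoFrameDuration <- function(y, duration) {
  y <- as.matrix(y)
  if (missing(duration)) {
    # Default to unit-length surveys when no duration is supplied
    duration <- matrix(1, nrow = nrow(y), ncol = ncol(y))
  }
  new("demoFrameDuration", y = y, duration = as.matrix(duration))
}

umf <- demoFrameDuration(y = matrix(c(0, 1, 2, 0, 0, 3), nrow = 3))
dim(umf@duration)
```

Putting the dimension check in the validity function rather than in the constructor means it also runs when validObject() is called on a modified object.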

1.3 Write the S4 methods associated with the unmarkedFrame object

Note that you may not have to write all of the S4 methods below. Most of them will work without being rewritten, but you should test them to make sure. All the methods associated with unmarkedFrame objects are listed in the unmarkedFrame class documentation accessible with help("unmarkedFrame-class").

Specific methods

Here are methods you probably will have to rewrite.

Generic methods

Here are methods that you should test but probably will not have to rewrite. They are defined in the unmarkedFrame.R file, for the unmarkedFrame parent class.

  • coordinates
  • getY
  • numSites
  • numY
  • obsCovs
  • obsCovs<-
  • obsNum
  • obsToY
  • obsToY<-
  • plot
  • projection
  • show
  • siteCovs
  • siteCovs<-
  • summary

Methods to access new attributes

You may also need to add specific methods to allow users to access an attribute you added to your unmarkedFrame subclass.

  • For example, getL for unmarkedFrameOccuCOP
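An accessor of this kind usually amounts to a generic plus a one-line method. The sketch below uses a hypothetical class demoFrameDuration and a hypothetical accessor getDuration. It is not the actual getL code from unmarked.

```r
library(methods)

# Hypothetical frame class with one extra slot (stand-in for a real subclass)
setClass("demoFrameDuration", slots = c(y = "matrix", duration = "matrix"))

# Generic and method that expose the extra slot to users
setGeneric("getDuration", function(object) standardGeneric("getDuration"))
setMethod("getDuration", "demoFrameDuration", function(object) object@duration)

umf <- new("demoFrameDuration", y = matrix(0, 2, 3), duration = matrix(5, 2, 3))
getDuration(umf)
```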

2 Fitting the model

The fitting function can be broken down into three main steps: reading the unmarkedFrame object, maximising the likelihood, and formatting the outputs.

2.1 Inputs of the fitting function

2.2 Read the unmarkedFrame object: write the getDesign method

Most models have their own getDesign function, an S4 method. The purpose of this method is to convert the information in the unmarkedFrame into a format usable by the likelihood function.

Writing the getDesign method is frequently the most tedious and difficult part of adding a new model.
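To illustrate the kind of work this step involves, here is a simplified sketch in base R. It is not the getDesign method from unmarked: the function get_design_demo, its arguments, and its rule for dropping sites are assumptions made for this example. It builds one design matrix per submodel from a formula and removes sites that cannot be used.

```r
# Hypothetical helper: turn formulas and covariates into design matrices.
# y: M x J response matrix; site_covs: data frame with M rows;
# obs_covs: data frame with M * J rows, ordered site by site.
get_design_demo <- function(state_formula, det_formula, y, site_covs, obs_covs) {
  M <- nrow(y)
  J <- ncol(y)
  stopifnot(nrow(site_covs) == M, nrow(obs_covs) == M * J)

  # na.pass keeps rows with missing covariates so that dimensions stay aligned
  X <- model.matrix(state_formula,
                    model.frame(state_formula, site_covs, na.action = na.pass))
  V <- model.matrix(det_formula,
                    model.frame(det_formula, obs_covs, na.action = na.pass))

  # Assumed rule: drop a site when a state covariate is missing
  # or when none of its visits has an observed response
  drop_site <- rowSums(is.na(X)) > 0 | rowSums(!is.na(y)) == 0
  keep_obs <- rep(!drop_site, each = J)

  list(
    y = y[!drop_site, , drop = FALSE],
    X = X[!drop_site, , drop = FALSE],
    V = V[keep_obs, , drop = FALSE],
    removed_sites = unname(which(drop_site))
  )
}

set.seed(1)
y <- matrix(rbinom(12, 1, 0.5), nrow = 4)
site_covs <- data.frame(elev = c(0.2, NA, -1.1, 0.7))
obs_covs <- data.frame(wind = rnorm(12))
dm <- get_design_demo(~elev, ~wind, y, site_covs, obs_covs)
dm$removed_sites
```

Returning the indices of removed sites alongside the matrices lets the fitting function report them to the user later.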

2.3 The likelihood function

2.3.1 The R likelihood function: easily understandable

If you are mainly used to coding in R, you should probably start here. Users who want to dig deeper into the likelihood of a model may not be familiar with other languages, so it is useful for them to be able to read R code that calculates the likelihood. This likelihood function can be used only for fixed-effects models.

  • Example for occu
  • gdistremoval doesn’t have an R version of the likelihood function
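For readers who want a concrete picture, the following is a self-contained sketch of an R negative log-likelihood for a single-season occupancy model, written from the standard model definition rather than copied from the occu source. The function name nll_occu_demo, the argument layout (design matrices X and V, with V ordered site by site), and the simulated data are all assumptions made for this example.

```r
# Negative log-likelihood of a single-season occupancy model (sketch).
# params: state coefficients followed by detection coefficients (logit scale)
# y: M x J detection/non-detection matrix (NA allowed)
# X: M x n_state design matrix; V: (M * J) x n_det design matrix, site by site
nll_occu_demo <- function(params, y, X, V) {
  M <- nrow(y)
  J <- ncol(y)
  n_state <- ncol(X)
  beta_state <- params[seq_len(n_state)]
  beta_det <- params[-seq_len(n_state)]

  psi <- plogis(as.vector(X %*% beta_state))
  p <- matrix(plogis(as.vector(V %*% beta_det)), M, J, byrow = TRUE)

  # Probability of each detection history given that the site is occupied;
  # missing visits contribute a factor of 1
  cp <- p^y * (1 - p)^(1 - y)
  cp[is.na(y)] <- 1
  lik_occupied <- apply(cp, 1, prod)

  # Sites with no detections may also be unoccupied
  never_detected <- rowSums(y, na.rm = TRUE) == 0
  lik <- psi * lik_occupied + (1 - psi) * never_detected
  -sum(log(lik))
}

# Simulated data with known parameters and intercept-only design matrices
set.seed(123)
M <- 500
J <- 4
z <- rbinom(M, 1, 0.6)                                  # true occupancy state
y <- matrix(rbinom(M * J, 1, rep(z, J) * 0.4), M, J)    # detections
X <- matrix(1, M, 1)
V <- matrix(1, M * J, 1)

fit <- optim(c(0, 0), nll_occu_demo, y = y, X = X, V = V,
             method = "BFGS", hessian = TRUE)
plogis(fit$par)  # estimated occupancy and detection probabilities
```

Fitting simulated data with known parameters, as done here, is a quick way to check a new likelihood before wiring it into the rest of the fitting function.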

2.3.2 The C++ likelihood function: faster

The C++ likelihood function is essentially a C++ version of the R likelihood function, also designed exclusively for fixed-effects models. This function uses the RcppArmadillo R package, presented here. In the C++ code, you can use functions of the Armadillo C++ library, documented here.

Your C++ function should be in a .cpp file in the ./src/ folder of the package. You do not need to write a header file (.hpp), nor do you need to compile the code by yourself as it is all handled by the RcppArmadillo package. To test if your C++ function runs and gives you the expected result, you can compile and load the function with Rcpp::sourceCpp("./src/nll_yourmodel.cpp"), and then use it like you would use an R function: nll_yourmodel(params=params, arg1=arg1).

2.3.3 The TMB likelihood function: for random effects

#TODO

2.4 Organise the output data

2.4.1 unmarkedEstimate objects per submodel

Outputs from optim should be organized into unmarkedEstimate (S4) objects, with one unmarkedEstimate per submodel (e.g. state, detection). These objects include the parameter estimates and other information such as the link functions.
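The bookkeeping involved can be sketched in base R as follows. Plain lists stand in for unmarkedEstimate objects, the numbers in opt are made up, and the element names (estimates, covMat, invlink) are illustrative. Check the definition of the real unmarkedEstimate function for its exact arguments.

```r
# 'opt' mimics the list returned by optim(..., hessian = TRUE) for a model with
# two state parameters and one detection parameter (made-up numbers)
opt <- list(
  par = c(0.4, -1.2, 0.1),
  hessian = matrix(c(50, 5, 2,
                      5, 40, 1,
                      2, 1, 60), nrow = 3, byrow = TRUE)
)
param_names <- c("(Intercept)", "elev", "(Intercept)")

# The inverse Hessian of the negative log-likelihood approximates the
# covariance matrix of the estimates
cov_mat <- solve(opt$hessian)

# Positions of each submodel's parameters in the full parameter vector
idx <- list(state = 1:2, det = 3)

# One stand-in "estimate" per submodel
estimates <- lapply(idx, function(i) {
  list(
    estimates = setNames(opt$par[i], param_names[i]),
    covMat = cov_mat[i, i, drop = FALSE],
    invlink = "logistic"  # label of the inverse link used to back-transform
  )
})

sqrt(diag(estimates$state$covMat))  # standard errors of the state parameters
```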

The unmarkedEstimate class is defined here in the unmarkedEstimate.R file. The unmarkedEstimate function, defined here, is used to create new unmarkedEstimate objects. You will not normally need to create an unmarkedEstimate subclass.

2.4.2 Design the unmarkedFit object

You’ll need to create a new unmarkedFit subclass for your model. The main component of unmarkedFit objects is a list of the unmarkedEstimates described above.

After you have defined your unmarkedFit subclass, you can create the object in your fitting function.

The fitting function returns this unmarkedFit object.
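A rough sketch of this last step is shown below. The class demoFit is a stand-in and not a real unmarkedFit subclass. Its slot names are illustrative, and the toy objective nll_toy exists only so that the example runs on its own.

```r
library(methods)

# Stand-in for an unmarkedFit subclass (slot names are illustrative)
setClass("demoFit", slots = c(
  estimates = "list",      # one element per submodel
  negLogLike = "numeric",  # value of the objective at the optimum
  AIC = "numeric",
  nllFun = "function"      # kept so the likelihood can be re-evaluated later
))

# Final step of a hypothetical fitting function: wrap the optim() output
make_demo_fit <- function(opt, estimates, nll_fun) {
  n_par <- length(opt$par)
  new("demoFit",
      estimates = estimates,
      negLogLike = opt$value,
      AIC = 2 * opt$value + 2 * n_par,
      nllFun = nll_fun)
}

# Toy objective with a known minimum, standing in for a real likelihood
nll_toy <- function(par) sum((par - c(1, 2))^2) + 10
opt <- optim(c(0, 0), nll_toy, method = "BFGS")
fit <- make_demo_fit(opt, list(state = opt$par[1], det = opt$par[2]), nll_toy)
fit@AIC
```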

2.5 Test the complete fitting function process

3 Write the methods associated with the unmarkedFit object

Develop methods specific to your unmarkedFit type for operating on the output of your model. As with the methods associated with an unmarkedFrame object above, you probably will not have to rewrite all of them, but you should test them to see if they work. All the methods associated with unmarkedFit objects are listed in the unmarkedFit class documentation accessible with help("unmarkedFit-class").

Specific methods

These are methods you will want to rewrite, adjusting them for your model.

getP

The getP method (defined here) “back-transforms” the detection parameter (\(p\) the detection probability or \(\lambda\) the detection rate, depending on the model). It returns a matrix of the estimated detection parameters. It is called by several other methods that are useful to extract information from the unmarkedFit object.
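The back-transformation itself can be sketched in a few lines of base R. This is not the getP method from unmarked: the function get_p_demo, the logit link, and the site-by-site ordering of the design matrix V are assumptions made for this example.

```r
# Hypothetical back-transformation of detection parameters.
# V: (M * J) x q detection design matrix, ordered site by site
# beta_det: detection coefficients on the logit scale
get_p_demo <- function(V, beta_det, M, J) {
  p <- plogis(as.vector(V %*% beta_det))
  # byrow = TRUE because the rows of V run through the visits of site 1,
  # then the visits of site 2, and so on
  matrix(p, nrow = M, ncol = J, byrow = TRUE)
}

# 2 sites x 3 visits, an intercept and one observation covariate
V <- cbind(1, c(-1, 0, 1, -1, 0, 1))
p_hat <- get_p_demo(V, beta_det = c(0, 1), M = 2, J = 3)
p_hat
```

The result has one row per site and one column per visit, matching the layout of the response matrix y.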

simulate

The generic simulate method (defined here) calls the simulate_fit method that depends on the class of the unmarkedFit object, which depends on the model.

The simulate method can be used in two ways:

You should test both ways with your model.

plot

This method plots the results of your model. The generic plot method for unmarkedFit (defined here) plots the residuals of the model.

Generic methods

Here are methods that you should test but probably will not have to rewrite. They are defined in the unmarkedFit.R file, for the unmarkedFit parent class.

Methods to access new attributes

You may also need to add specific methods to allow users to access an attribute you added to your unmarkedFit subclass.

For example, some methods are relevant only for some types of models:

4 Update the NAMESPACE file

5 Write tests

Using the testthat package, you need to write tests for your unmarkedFrame function, your fitting function, and the methods described above. The tests should be fast, but cover all the key configurations.

Write your tests in the ./tests/testthat/ folder, creating an R file for your model. If you are using RStudio, you can run the tests of your file easily by clicking on the “Run tests” button. You can run all the tests by clicking on the “Test” button in the Build pane.
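As an indication of what such a file can look like, here is a small self-contained sketch. The function under test, nll_binom_demo, is a hypothetical binomial likelihood defined only for this example. In a real test file you would instead call your own unmarkedFrame constructor, fitting function and methods.

```r
library(testthat)

# Hypothetical function under test: a binomial negative log-likelihood
nll_binom_demo <- function(logit_p, y, n) {
  -sum(dbinom(y, n, plogis(logit_p), log = TRUE))
}

test_that("nll_binom_demo matches a hand-computed value", {
  # One site, 1 detection in 2 visits, p = 0.5: likelihood = 2 * 0.5 * 0.5
  expect_equal(nll_binom_demo(0, y = 1, n = 2), -log(0.5))
})

test_that("optimisation recovers the closed-form estimate", {
  y <- c(1, 0, 2, 1)
  opt <- optimize(nll_binom_demo, interval = c(-5, 5), y = y, n = 2)
  # The MLE of p is the overall proportion of detections
  expect_equal(plogis(opt$minimum), sum(y) / (2 * length(y)), tolerance = 1e-3)
})
```

Both tests run in a fraction of a second, in line with the advice to keep tests fast. One checks a value that can be computed by hand and the other checks that optimisation recovers a known answer.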

6 Write documentation

You need to write the documentation files for the new classes and functions you added. Documentation .Rd files are stored in the man folder. Here is a guide on how to format your documentation.

Depending on how much you had to add, you may also need to update existing files:

7 Add to unmarked