---
title: "Pseudo-absences"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Pseudo-absences}
  %\VignetteEngine{knitr::knitr}
  %\VignetteEncoding{UTF-8}
---
### Definition
When using `data.type = 'binary'` in [BIOMOD_FormatingData](../reference/BIOMOD_FormatingData.html), `biomod2` requires either **presence / absence data**, or **presence-only data supplemented with pseudo-absences**. These pseudo-absences can be generated with the same function.
The general idea is to select points in the study area that are used to compare the observed environment (represented by the presences) against the available environment. These points are NOT to be considered absences; they represent the environment that is available. Several terms are used in the literature for similar purposes: *pseudo-absences*; *background data*, mostly in the context of MaxEnt; and *quadrature points*, when applying a point-process model (PPM). The last two differ from pseudo-absences in that they may include presence points, whereas pseudo-absences cannot be selected at coordinates matching an observation.
**Note** that it is NOT allowed to mix absence and pseudo-absence data.
### How to select them ? - Methods
Three methods are implemented in `biomod2` to select pseudo-absences (PA), through either [bm_PseudoAbsences](../reference/bm_PseudoAbsences.html) or [BIOMOD_FormatingData](../reference/BIOMOD_FormatingData.html):
1. **random**: PA are randomly selected over the study area (excluding presence points)
2. **disk**: PA are randomly selected within circles around presence points, defined by a minimum and a maximum distance (expressed in the units of the presence points' projection system)
3. **SRE**: a Surface Range Envelope model is fitted, and PA are randomly selected outside this envelope, i.e. in conditions (combinations of explanatory variables) that differ by a defined proportion from those of the presence points
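
To make the three strategies listed above concrete, here is a minimal base R sketch on simulated data. It does not call `biomod2` and is not its implementation: the grid, the variable names (`temp`, `prec`), the distance bounds and the quantile are all hypothetical choices made for illustration, and the disk rule is taken here as "distance to the nearest presence lies between the minimum and the maximum".

```r
set.seed(42)

# Hypothetical study area: a 50 x 50 grid with two simulated variables
grid <- expand.grid(x = 1:50, y = 1:50)
grid$temp <- grid$x / 5 + rnorm(nrow(grid), sd = 0.5)
grid$prec <- grid$y * 2 + rnorm(nrow(grid), sd = 2)

# Simulated presences: 30 cells drawn from one corner of the area
pres_id <- sample(which(grid$x > 30 & grid$y > 30), 30)
# Candidate cells: pseudo-absences never coincide with a presence
cand_id <- setdiff(seq_len(nrow(grid)), pres_id)
n_pa <- 100

# 1. random: any non-presence cell of the study area
pa_random <- sample(cand_id, n_pa)

# 2. disk: distance to the nearest presence within [d_min, d_max]
d_min <- 5
d_max <- 15
dx <- outer(grid$x, grid$x[pres_id], "-")
dy <- outer(grid$y, grid$y[pres_id], "-")
near_d <- apply(sqrt(dx^2 + dy^2), 1, min)
disk_ok <- cand_id[near_d[cand_id] >= d_min & near_d[cand_id] <= d_max]
pa_disk <- sample(disk_ok, min(n_pa, length(disk_ok)))

# 3. SRE: build a rectangular envelope from presence quantiles,
#    then pick cells falling outside it for at least one variable
q <- 0.025
inside <- rep(TRUE, nrow(grid))
for (v in c("temp", "prec")) {
  lim <- unname(quantile(grid[pres_id, v], probs = c(q, 1 - q)))
  inside <- inside & grid[[v]] >= lim[1] & grid[[v]] <= lim[2]
}
sre_ok <- cand_id[!inside[cand_id]]
pa_sre <- sample(sre_ok, min(n_pa, length(sre_ok)))

# Number of pseudo-absences obtained with each strategy
sapply(list(random = pa_random, disk = pa_disk, sre = pa_sre), length)
```

Plotting `grid[pa_random, c("x", "y")]`, `grid[pa_disk, c("x", "y")]` and `grid[pa_sre, c("x", "y")]` side by side shows how strongly the chosen strategy shapes where pseudo-absences fall relative to the presences, in geographic space for *disk* and in environmental space for *SRE*.
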
The choice of method depends on a more fundamental, underlying question:
*how were the presence points of the data set obtained?*
- Was there a sampling design?
- If yes, what were the objective and the scope of the study? Could this design have allowed exhaustive sampling?
- Otherwise, were there any potential sources of bias?
    + the question of interest
    + the study area, its extent and how this extent was defined (administrative or geographical limits?)
    + the observation method
    + the number of observers and the consistency between them (training, objectives)
    + etc.
