| Title: | Educational Datasets for Ecology and Agriculture | 
| Version: | 0.1.0 | 
| Description: | A collection of curated educational datasets for teaching ecology and agriculture concepts. Includes data on wildlife monitoring, plant treatments, and ecological observations with documentation and examples for educational use. All datasets are derived from published scientific studies and are available under CC0 or compatible licenses. | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.3.2 | 
| Suggests: | dplyr, ggplot2, knitr, rmarkdown, testthat (≥ 3.0.0) | 
| Config/testthat/edition: | 3 | 
| Depends: | R (≥ 3.5) | 
| LazyData: | true | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2025-06-25 14:29:39 UTC; weh90 | 
| Author: | W. Edwin Harris [aut, cre, cph] | 
| Maintainer: | W. Edwin Harris <weh9000@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-06-29 10:20:02 UTC | 
ecoteach: Educational Datasets for Ecology and Agriculture
Description
A collection of curated educational datasets for teaching ecology and agriculture concepts. The package provides clean, well-documented datasets that can be used for teaching data analysis, statistics, and ecological concepts in classroom settings. Each dataset includes comprehensive documentation and examples of potential analyses.
Details
The package includes the following datasets:
-  berberis_treatment: Data on invasive Berberis management treatments
-  magellanic_penguins: Long-term monitoring data on Magellanic penguins
-  vulture_diet: Diet composition analysis of African vultures
-  chimpanzee_cameras: Camera trap detection data for wild chimpanzees
-  lion_reproduction: Reproductive data for Galapagos sea lions
-  barnswallow_brightness: Plumage brightness data for barn swallows
-  pangolin_habitat: Habitat occupancy data for Chinese pangolins
-  whale_brains: Brain size evolution data for cetaceans
-  elephant_farmers: Agricultural use metrics for elephants
-  Dsimulans_matechoice: Mate copying data for fruit flies
-  carrion_arrivals: Vertebrate scavenger visits to roe deer carrion
-  scavenger_community: Scavenger community structure along environmental gradients
-  badger_energy: Energy expenditure data for European badgers with tuberculosis
-  raccoondog_environment: Raccoon dog activity and environmental factors in China
-  dormouse_hibernation: Hibernation and reproduction data for edible dormice
All datasets are provided in tidy format, with factors appropriately coded, and include proper citation information. The package aims to make it easy for instructors to incorporate real ecological data into their teaching.
Data Sources
All datasets are derived from published scientific studies and are available under CC0 or compatible licenses. Full citations and DOIs are provided in the documentation for each dataset.
Author(s)
Maintainer: W. Edwin Harris weh9000@gmail.com [copyright holder]
References
Adriaens, T., Verschelde, P., Cartuyvels, E., D'hondt, B., Vercruysse, E., van Gompel, W., Dewulf, E., & Provoost, S. (2019). A preliminary field trial to compare control techniques for invasive Berberis aquifolium in Belgian coastal dunes. doi:10.5281/zenodo.3351504
Rebstock, G. A., Boersma, P. D., & García-Borboroglu, P. (2022). Magellanic penguin nest counts and reproductive success at Punta Tombo, Argentina, 1982-2021. doi:10.5061/DRYAD.8931ZCRSV
Baino, A., Hopcraft, G., Kendall, C., Munishi, L., Behdenna, A., & Newton, J. (2021). We are what we eat, plus some per mill: Using stable isotopes to estimate diet composition of African vultures. doi:10.5061/dryad.1ns1rn8qf
Crunchant, A-S., Borchers, D., Kuehl, H., & Piel, A. K. (2020). Listening and watching: do camera traps or acoustic sensors more efficiently detect wild chimpanzees in an open habitat? doi:10.5061/dryad.5dv41ns34
Kalberer, S., Meise, K., Trillmich, F., & Krüger, O. (2018). Reproductive performance of a tropical apex predator in an unpredictable habitat. doi:10.5061/DRYAD.6S48579
Morosse, O., Tsunekage, T., Kenny-Duddela, H., Schield, D., Keller, K., Safran, R., & Levin, I. (2025). North American barn swallows pair, mate, and interact assortatively. doi:10.5061/DRYAD.1G1JWSV8G
Subba, A., Tamang, G., Lama, S., Basnet, N., Kyes, R. C., & Khanal, L. (2024). Habitat occupancy of the critically endangered Chinese pangolin (Manis pentadactyla) under human disturbance in an urban environment: Implications for conservation. doi:10.5061/DRYAD.73N5TB34T
Peacock, J., Waugh, D., Bajpai, S., & Thewissen, J. G. M. (2025). The evolution of hearing and brain size in Eocene whales. doi:10.5061/DRYAD.SF7M0CGH1
Hahn, N. (2021). Elephant agricultural use metrics in Mara-Serengeti ecosystem. doi:10.5061/DRYAD.RN8PK0PBN
Nöbel, S., & Kaufmann, T. (2025). Data from: Mate copying in Drosophila simulans. doi:10.5061/DRYAD.ZS7H44JMC
Schwegmann, S. (2023). Use of viscera from hunted roe deer by vertebrate scavengers in summer in central European mountainous mixed forest. doi:10.5061/DRYAD.Q573N5TPP
Gomo, G., Rød-Eriksen, L., Andreassen, H. P., Mattisson, J., Odden, M., Devineau, O., & Eide, N. E. (2020). Scavenger community structure along an environmental gradient from boreal forest to alpine tundra in Scandinavia. doi:10.1002/ece3.6834
Barbour, K., McClune, D. W., Delahay, R. J., Speakman, J. R., McGowan, N. E., Kostka, B., Montgomery, I. W., Marks, N. J., & Scantlebury, D. M. (2019). No energetic cost of tuberculosis infection in European badgers (Meles meles). doi:10.5061/DRYAD.MN84H20
Miyamoto, K., Chen, C., & Luan, X. (2025). Seasonal activity changes in raccoon dogs and influences of environmental factors from autumn to winter. doi:10.5061/DRYAD.C866T1GJN
Bieber, C., Turbill, C., & Ruf, T. (2019). Effects of aging on timing of hibernation and reproduction. doi:10.5061/DRYAD.8004G37
Examples
# Load a dataset
data(vulture_diet)
# View the structure
str(vulture_diet)
# Basic summary
summary(vulture_diet)
# See what datasets are available
data(package = "ecoteach")
Mate Copying in Drosophila simulans
Description
A dataset containing observations of mate choice decisions of Drosophila simulans females from three different populations to test whether they copy the mate choice of their conspecifics. The study tested whether female fruit flies acquire a sexual preference for a particular trait of a male after observing a single mating event. The experimental protocol involved a naïve, unmated female first observing a conspecific's mate choice between one artificially colored green and one artificially colored pink male, and subsequently being allowed to choose between two males of the same phenotype herself.
Usage
Dsimulans_matechoice
Format
A data frame with 383 rows and 14 variables:
- Experimentor
- Who conducted the experiment (SN: Sabine Nöbel, TK: Tim Kaufmann) 
- Date
- Date of the experiment 
- TimeDemo
- Beginning time of the experiment 
- Chamber
- Position in the experimental box (A-F) 
- Device
- Number of the experimental set-up 
- Strain
- Fly population: Halle (Saale), Maison Salasar, or Deyme 
- Treatment
- Experimental treatment: Mate copying (informed) or Control (uninformed) 
- Temp
- Temperature (°C) in the experimental room 
- Humidity
- Humidity (%) in the experimental room 
- ColourDemo
- Color of the male copulating in the demonstration: Green or Pink 
- Colour1Court
- Color of the male that started the first courtship: Green or Pink 
- Colour2Court
- Color of the male that started the second courtship: Green or Pink 
- ColourTest
- Color of the male copulating in the test phase: Green or Pink 
- MCS
- Mate-copying score: "Same color" (observer female chose the same colored male as the demonstrator) or "Different color" (observer female chose a different colored male than the demonstrator) 
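The relationship between ColourDemo, ColourTest, and MCS can be sketched as follows. The data frame below is a small invented example (not the package data), and it assumes both colour columns use the labels "Green" and "Pink" exactly as documented. The binomial test against 50% is one simple way to ask whether copying exceeds chance; it is not necessarily the analysis used in the original study.

```r
# Toy data with the documented column names; all values are invented
toy <- data.frame(
  ColourDemo = c("Green", "Pink", "Green", "Pink", "Green", "Pink"),
  ColourTest = c("Green", "Pink", "Pink", "Pink", "Green", "Green"),
  stringsAsFactors = FALSE
)

# A female "copies" when she mates with the colour she saw being chosen
toy$MCS_check <- ifelse(toy$ColourDemo == toy$ColourTest,
                        "Same color", "Different color")

# Number and proportion of females that copied
n_same <- sum(toy$MCS_check == "Same color")
prop_same <- n_same / nrow(toy)

# Exact binomial test against the 50% expected under random choice
bt <- binom.test(n_same, nrow(toy), p = 0.5)
bt$p.value
```

With the real data, comparing MCS_check against the stored MCS column would be a quick consistency check before running the test separately for each Treatment level.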
Source
Nöbel, Sabine and Kaufmann, Tim (2025). Data from: Mate copying in Drosophila simulans. Dryad Digital Repository. doi:10.5061/DRYAD.ZS7H44JMC
Examples
# Load the dataset
data(Dsimulans_matechoice)
# Basic exploration
head(Dsimulans_matechoice)
summary(Dsimulans_matechoice)
# Examine mate copying rates by treatment
table(Dsimulans_matechoice$Treatment, Dsimulans_matechoice$MCS)
# Compare mate copying across different strains (using base R)
mosaicplot(table(Dsimulans_matechoice$Strain, Dsimulans_matechoice$MCS),
           main = "Mate choice outcomes by strain",
           color = c("lightblue", "salmon"))
           
# Analyze if environmental conditions affect mate copying
boxplot(Temp ~ MCS, data = Dsimulans_matechoice, 
        main = "Temperature effects on mate copying",
        ylab = "Temperature (°C)")
European badger energy expenditure and tuberculosis infection data
Description
This dataset contains measurements of daily energy expenditure (DEE) and related variables for European badgers (Meles meles) in relation to their tuberculosis (TB) infection status. The data were collected to examine how disease status and other factors like season, group size, sex, age, and body mass affect energy balance in wild badgers. Some individuals were measured multiple times across different seasons.
Usage
badger_energy
Format
A data frame with 56 rows and 8 variables:
- ID
- Unique identifier for each badger 
- age
- Age class of the badger: "cub" or "adult" 
- sex
- Sex of the badger: "F" (female) or "M" (male) 
- group_size
- Number of badgers in the social group 
- body_mass
- Body mass in kilograms (kg) 
- daily_energy
- Daily energy expenditure (DEE) in kilojoules per day (kJ/day) 
- season
- Season when measurements were taken: "Winter", "Spring", "Summer", or "Autumn" 
- disease
- Tuberculosis infection status: "Negative", "Diseased", or "Exposed" 
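Because some badgers were measured in more than one season, rows sharing an ID are not independent. The sketch below uses an invented data frame with the documented column names to show how repeated individuals might be identified and averaged before a simple comparison. Averaging per individual is only one option; a mixed model with ID as a random effect would be another.

```r
# Toy data; IDs and energy values are invented for illustration
toy <- data.frame(
  ID = c("B1", "B1", "B2", "B3", "B3", "B3"),
  daily_energy = c(1200, 1400, 1000, 900, 1100, 1000),
  season = c("Spring", "Summer", "Spring", "Winter", "Spring", "Autumn"),
  stringsAsFactors = FALSE
)

# Number of measurements per badger
n_per_badger <- table(toy$ID)

# Badgers measured more than once
repeated_ids <- names(n_per_badger)[n_per_badger > 1]

# One mean DEE value per individual
mean_dee <- aggregate(daily_energy ~ ID, data = toy, FUN = mean)
mean_dee
```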
Source
Barbour, Katie and McClune, David W. and Delahay, Richard J. and Speakman, John R. and McGowan, Natasha E. and Kostka, Berit and Montgomery, Ian W. and Marks, Nikki J. and Scantlebury, David M. (2019). Data from: No energetic cost of tuberculosis infection in European badgers (Meles meles). Dryad Digital Repository. doi:10.5061/DRYAD.MN84H20
Examples
# Load the dataset
data(badger_energy)
# Basic exploration
head(badger_energy)
summary(badger_energy)
# Compare energy expenditure by disease status
boxplot(daily_energy ~ disease, data = badger_energy, 
        main = "Daily Energy Expenditure by TB Status",
        ylab = "DEE (kJ/day)", xlab = "TB Status")
        
# Examine relationship between body mass and energy expenditure
plot(daily_energy ~ body_mass, data = badger_energy,
     col = as.numeric(disease), pch = 16,
     main = "Energy Expenditure vs Body Mass",
     xlab = "Body Mass (kg)", ylab = "DEE (kJ/day)")
legend("topright", levels(badger_energy$disease), 
       col = 1:3, pch = 16)
Barn Swallow Plumage Brightness and Mate Selection
Description
This dataset contains information on plumage brightness measurements for North American barn swallows (Hirundo rustica erythrogaster) and their mating patterns. The data includes measurements of belly and breast brightness for social pairs and extra-pair mates. This dataset was used to study assortative mating patterns in barn swallows, investigating how plumage coloration affects mate selection both within social pairs and through extra-pair fertilizations.
Usage
barnswallow_brightness
Format
A data frame with 19 rows and 7 variables:
- MaleID
- Band ID for the focal male 
- PairID
- Band ID for the female social mate 
- EPID
- Band ID for the extra-pair mate 
- PairB_bright
- Belly brightness for the social male (unitless measurement of reflectance) 
- PairR_bright
- Breast brightness for the social male (unitless measurement of reflectance) 
- EPB_bright
- Belly brightness for the extra-pair male (unitless measurement of reflectance) 
- EPR_bright
- Breast brightness for the extra-pair male (unitless measurement of reflectance) 
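Since each row holds brightness values for both a social male and an extra-pair male, a paired comparison is a natural first analysis. The following is a minimal sketch on invented values, assuming that the two belly-brightness columns in a row refer to males linked to the same female; it is not taken from the original study.

```r
# Toy data; brightness values are invented for illustration
toy <- data.frame(
  PairB_bright = c(20, 25, 30, 22, 28),
  EPB_bright   = c(18, 27, 26, 21, 25)
)

# Within-row difference: social male minus extra-pair male
d <- toy$PairB_bright - toy$EPB_bright

# Paired t-test on the same two columns
tt <- t.test(toy$PairB_bright, toy$EPB_bright, paired = TRUE)
mean(d)
tt$p.value
```

With only 19 rows in the real dataset, a nonparametric alternative such as `wilcox.test(..., paired = TRUE)` may be worth showing students alongside the t-test.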
Source
Morosse, Omar, Tsunekage, Toshi, Kenny-Duddela, Heather, Schield, Drew, Keller, Kayleigh, Safran, Rebecca, & Levin, Iris (2025). North American barn swallows pair, mate, and interact assortatively. Dryad Digital Repository. doi:10.5061/DRYAD.1G1JWSV8G
Examples
# Load the dataset
data(barnswallow_brightness)
# Basic exploration
head(barnswallow_brightness)
summary(barnswallow_brightness)
# Compare brightness between social and extra-pair males
boxplot(barnswallow_brightness$PairB_bright, barnswallow_brightness$EPB_bright,
        names = c("Social Male", "Extra-pair Male"),
        main = "Comparison of Belly Brightness",
        ylab = "Brightness")
        
# Correlation between social and extra-pair male brightness
plot(barnswallow_brightness$PairB_bright, barnswallow_brightness$EPB_bright,
     main = "Correlation between Social and Extra-pair Male Brightness",
     xlab = "Social Male Belly Brightness",
     ylab = "Extra-pair Male Belly Brightness")
abline(lm(EPB_bright ~ PairB_bright, data = barnswallow_brightness), col = "red")
Berberis aquifolium Invasive Species Management Treatment Data
Description
Experimental data from a management treatment study of invasive Berberis aquifolium (Oregon grape) plants conducted across four heavily infested dune sites in Belgium. The study evaluated the effectiveness of different management treatments on individual plants, with regrowth assessments conducted at 6 months and 1 year post-treatment.
Usage
berberis_treatment
Format
A data frame with 127 rows and 14 variables:
- plant_id
- Character, unique identifier for each B. aquifolium plant/clone 
- region
- Factor, field code identifying the dune site location 
- date
- Date, when the plant was initially located and treated (April/May 2013) 
- treatment
- Factor, management treatment applied:
  - Manual digging - Uprooting by digging with shovels
  - Leaf spray (glyphosate) - 5%
  - Stem cut + glyphosate - Cut and paint with 5%
  - Stem cut + salt - Cut and treat with saturated NaCl solution
- height
- Integer, plant height in centimeters 
- diameter
- Integer, clone diameter in centimeters 
- n_stems
- Integer, number of stems per individual plant/clone 
- date_regrowth
- Date, date of regrowth assessment 
- regrowth
- Ordered factor, stem regrowth response (Dead < Limited < Vital) 
- x_proj
- Numeric, X-coordinate of plant location (GPS projection) 
- y_proj
- Numeric, Y-coordinate of plant location (GPS projection) 
- days_to_assessment
- Numeric, days between treatment and assessment 
- treatment_success
- Factor, binary outcome (Success/Failure) 
- volume_approx
- Numeric, approximate plant volume 
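The days_to_assessment column is described as the number of days between treatment and assessment. A minimal sketch of that calculation, on an invented data frame and assuming both date columns are stored as Date objects as documented:

```r
# Toy data with the documented column names; dates are invented
toy <- data.frame(
  treatment = c("Manual digging", "Stem cut + salt", "Manual digging"),
  date = as.Date(c("2013-04-15", "2013-05-02", "2013-04-20")),
  date_regrowth = as.Date(c("2013-10-15", "2014-05-02", "2014-04-20"))
)

# Subtracting two Date vectors gives a difftime in days
toy$days_check <- as.numeric(toy$date_regrowth - toy$date)
toy
```

On the real data, `all.equal(berberis_treatment$days_to_assessment, days_check)` would confirm whether the stored column follows this definition.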
Source
Adriaens, T., Verschelde, P., Cartuyvels, E., D'hondt, B., Vercruysse, E., Gompel, W.V., Dewulf, E., & Provoost, S. (2019). Data from: A preliminary field trial to compare control techniques for invasive Berberis aquifolium in Belgian coastal dunes. Dryad Digital Repository. doi:10.5061/DRYAD.ZKH189361
Examples
# Load the dataset
data(berberis_treatment)
# Treatment effectiveness summary
table(berberis_treatment$treatment, berberis_treatment$regrowth)
# Visualize treatment effectiveness
barplot(table(berberis_treatment$treatment, berberis_treatment$regrowth),
        beside = TRUE, legend = TRUE,
        main = "Treatment Effectiveness for Invasive Berberis")
Vertebrate scavenger visits to roe deer carrion
Description
This dataset contains observations of vertebrate scavenger activity at roe deer evisceration residues (carrion) in a central European mountainous mixed forest. The data was collected from 47 roe deer viscera samples from hunted deer that were exposed to vertebrate scavenging in front of camera traps between May and October 2022. The dataset records visits, feeding, and removal events by various vertebrate scavengers, with events considered independent if more than 20 minutes passed between consecutive pictures. Samples were observed for a maximum of 16 days. The data provides insights into the composition of vertebrate scavenging fauna using evisceration residues, which species remove entire samples, and how long viscera remain available to invertebrate scavengers.
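The 20-minute independence rule can be illustrated on raw picture timestamps. The sketch below uses invented data; the `timestamp` column, the species labels, and the choice to apply the rule separately per sample and species are all assumptions made for illustration, not details confirmed by the source.

```r
# Toy picture log; sample IDs, species, and times are invented
pics <- data.frame(
  Samples = c("S1", "S1", "S1", "S1", "S2", "S2"),
  Species = c("Red fox", "Red fox", "Red fox", "Common buzzard",
              "Red fox", "Red fox"),
  timestamp = as.POSIXct(c("2022-06-01 10:00:00", "2022-06-01 10:05:00",
                           "2022-06-01 10:40:00", "2022-06-01 10:10:00",
                           "2022-06-02 09:00:00", "2022-06-02 09:30:00"),
                         tz = "UTC"),
  stringsAsFactors = FALSE
)

# Sort within each sample/species combination
key <- paste(pics$Samples, pics$Species)
pics <- pics[order(key, pics$timestamp), ]
key <- paste(pics$Samples, pics$Species)

# Minutes since the previous picture in the same group (Inf for the first)
gap_min <- ave(as.numeric(pics$timestamp), key,
               FUN = function(t) c(Inf, diff(t) / 60))

# A picture starts a new independent event if the gap exceeds 20 minutes
pics$new_event <- gap_min > 20
sum(pics$new_event)
```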
Usage
carrion_arrivals
Format
A data frame with 599 rows and 7 variables:
- Samples
- Sample site identifier (factor) 
- Dates Setup
- Date when camera and sample were set up (Date) 
- Date Event
- Date when the scavenging event was recorded (Date) 
- Time
- Time when the scavenging event was recorded (hms) 
- Species
- Species of scavenger observed (factor) 
- Behaviour
- Type of behavior observed: Visit, Feeding, or Removal (factor) 
- Days2
- Time elapsed in days between sample exposure and detected event (numeric) 
Source
Schwegmann, Sebastian (2023). Data for: Use of viscera from hunted roe deer by vertebrate scavengers in summer in central European mountainous mixed forest. Dryad Digital Repository. doi:10.5061/DRYAD.Q573N5TPP
Examples
# Load the dataset
data(carrion_arrivals)
# Basic exploration
head(carrion_arrivals)
summary(carrion_arrivals)
# Count observations by species
table(carrion_arrivals$Species)
# Compare behaviors by species
table(carrion_arrivals$Species, carrion_arrivals$Behaviour)
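# Calculate mean days between sample exposure and recorded events, by species
# (requires dplyr)
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  carrion_arrivals %>%
    group_by(Species) %>%
    summarize(mean_days = mean(Days2, na.rm = TRUE)) %>%
    arrange(mean_days)
}
# Visualize scavenger activity over time (requires ggplot2)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)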
  ggplot(carrion_arrivals, aes(x = Days2, fill = Species)) +
    geom_histogram(binwidth = 1, position = "stack") +
    labs(title = "Scavenger activity over time",
         x = "Days since carrion placement",
         y = "Number of observations")
}
Chimpanzee Camera Trap Detection Data
Description
This dataset contains presence/absence data for wild chimpanzees (Pan troglodytes) detected by camera traps in the Issa Valley, Tanzania. The data was collected as part of a study comparing the efficiency of camera traps versus passive acoustic monitoring for detecting chimpanzees in a savanna-woodland mosaic habitat.
Usage
chimpanzee_cameras
Format
A data frame with observations across multiple cameras and dates:
- Camera
- Camera trap identifier (factor) 
- Latitude
- Latitude coordinates of the camera trap location (numeric) 
- Longitude
- Longitude coordinates of the camera trap location (numeric) 
- Method
- Camera placement method: 'systematic' or 'targeted' (factor) 
- Vegetation
- Vegetation type at camera location: 'open' or 'closed' (factor) 
- Topography
- Landscape feature at camera location: 'valley', 'slope', or 'plateau' (factor) 
- date
- Date of observation (Date) 
- detection
- Chimpanzee detection status: 'absent' or 'present' (factor) 
Details
The dataset is in long format, with each row representing a camera trap observation for a specific date. Detection values are coded as 'present' (at least one detection during the day) or 'absent' (no detection). NA values indicate days when no survey was conducted (e.g., due to camera malfunction or not being deployed).
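Because NA marks days without a survey rather than days without chimpanzees, survey effort should be counted from non-missing rows only. A minimal base R sketch on invented data with the documented column names:

```r
# Toy long-format data; camera IDs and detections are invented
toy <- data.frame(
  Camera = rep(c("C1", "C2"), each = 4),
  detection = factor(c("absent", "present", NA, "absent",
                       NA, NA, "present", "present"),
                     levels = c("absent", "present"))
)

# Survey effort: number of days each camera was actually operating
effort <- tapply(!is.na(toy$detection), toy$Camera, sum)

# Days with at least one detection
det_days <- tapply(toy$detection == "present", toy$Camera, sum, na.rm = TRUE)

# Naive detection rate per camera, based on surveyed days only
rate <- det_days / effort
rate
```

Dividing by the total number of rows instead of `effort` would treat unsurveyed days as absences and bias the rate downward.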
Source
Crunchant, Anne-Sophie and Borchers, David and Kuehl, Hjalmar and Piel, Alex K. (2020). Listening and watching: do camera traps or acoustic sensors more efficiently detect wild chimpanzees in an open habitat? Dryad Digital Repository. doi:10.5061/DRYAD.5DV41NS34
Examples
# Load the dataset
data(chimpanzee_cameras)
# Basic exploration
head(chimpanzee_cameras)
summary(chimpanzee_cameras)
# Count detections by camera (requires dplyr)
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  chimpanzee_cameras %>%
    group_by(Camera) %>%
    summarize(
      total_observations = n(),
      detections = sum(detection == "present", na.rm = TRUE),
      detection_rate = mean(detection == "present", na.rm = TRUE)
    )
}
# Visualize detection patterns over time (requires ggplot2)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  ggplot(chimpanzee_cameras, aes(x = date, y = Camera, fill = detection)) +
    geom_tile() +
    scale_fill_manual(values = c("absent" = "lightblue", "present" = "darkred"),
                      na.value = "gray90") +
    theme_minimal() +
    labs(title = "Chimpanzee detections by camera over time",
         x = "Date", y = "Camera")
}
Dormouse Hibernation and Reproduction Dataset
Description
This dataset contains hibernation and reproductive data for edible dormice (Glis glis). The data tracks hibernation patterns, body mass changes, and reproductive activity across multiple years and individuals. The study examines how age affects hibernation timing and reproductive behavior in these small hibernating mammals.
Usage
dormouse_hibernation
Format
A data frame with 290 rows and 16 variables:
- animal_id
- Unique identifier for each dormouse 
- year_birth
- Year of birth for the animal 
- age
- Age of the animal in years 
- log_age
- Logarithm of age 
- body_mass_before
- Body mass before hibernation (g) 
- body_mass_after
- Body mass after hibernation (g) 
- hibernation_duration
- Duration of hibernation in days 
- hibernation_start
- Date when hibernation started (DD.MM.YY format) 
- hibernation_end
- Date when hibernation ended (DD.MM.YY format) 
- hibernation_end_year_before
- End date of previous year's hibernation 
- body_mass_spring
- Body mass in spring (g) 
- year
- Year of observation 
- sex
- Sex of the animal (male or female) 
- diet
- Diet type (medium, high fat, or protein) 
- age_death
- Age at death in years 
- repro_active
- Whether the animal was reproductively active (yes or no) 
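The hibernation dates are documented in DD.MM.YY format, so they need an explicit format string before any date arithmetic. The sketch below assumes the two columns are stored as character strings (an assumption; check with `str()` on the real data) and uses invented dates.

```r
# Toy data; dates are invented and stored as DD.MM.YY strings
toy <- data.frame(
  hibernation_start = c("15.09.10", "01.10.11"),
  hibernation_end   = c("20.05.11", "10.05.12"),
  stringsAsFactors = FALSE
)

# %d = day, %m = month, %y = two-digit year
start <- as.Date(toy$hibernation_start, format = "%d.%m.%y")
end   <- as.Date(toy$hibernation_end, format = "%d.%m.%y")

# Duration in days, comparable with the hibernation_duration column
toy$duration_check <- as.numeric(end - start)
toy
```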
Details
The research shows that age strongly affects hibernation/activity patterns through two pathways: (1) with increasing age, dormice are more likely to reproduce, which delays hibernation onset, and (2) age directly advances emergence from hibernation in spring. This suggests hibernation is not merely an energy-saving strategy but an age-affected life-history trait used to maximize fitness.
Source
Bieber, Claudia and Turbill, Christopher and Ruf, Thomas (2019). Data from: Effects of aging on timing of hibernation and reproduction. Dryad Digital Repository. doi:10.5061/DRYAD.8004G37
Examples
# Load the dataset
data(dormouse_hibernation)
# Basic exploration
head(dormouse_hibernation)
summary(dormouse_hibernation)
# Examine hibernation duration by age
boxplot(hibernation_duration ~ age, data = dormouse_hibernation, 
        main = "Hibernation Duration by Age",
        xlab = "Age (years)", ylab = "Duration (days)")
        
# Compare body mass change during hibernation
with(dormouse_hibernation, 
     plot(body_mass_before, body_mass_after, 
          col = as.integer(sex),
          main = "Body Mass Before vs After Hibernation",
          xlab = "Body Mass Before (g)", 
          ylab = "Body Mass After (g)"))
legend("topleft", levels(dormouse_hibernation$sex), 
       col = 1:2, pch = 1)
       
# Examine reproductive activity by age
with(subset(dormouse_hibernation, !is.na(repro_active)), 
     table(age, repro_active))
Elephant Agricultural Use Metrics in Mara-Serengeti Ecosystem
Description
A dataset containing agricultural use metrics for 66 elephants in the Mara-Serengeti ecosystem in Kenya and Tanzania. The data were collected to characterize crop use tactics by elephants and to understand how elephants interact with agricultural areas. The dataset includes metrics such as mean agricultural use, maximum use from a moving average, and the difference between mean and max use. These metrics were used to classify agricultural use tactics for each elephant using Gaussian mixture models. The dataset contains individual-year agricultural use metrics, space use, and elephant metadata, with tactic classifications for both lifetime tracks and individual years.
Usage
elephant_farmers
Format
A data frame with 202 rows and 17 variables:
- subject_name
- Individual elephant ID 
- tactic.aggregate
- Tactic classification for lifetime GPS track of individual: Rare, Sporadic, Seasonal, or Habitual 
- season.year
- Year cuts (cut date April 1 of each year) 
- tactic.season
- Tactic classification for the associated year: Rare, Sporadic, Seasonal, or Habitual 
- year.begin
- Data start date for a given year 
- year.end
- Data stop date for a given year 
- n.fixes
- Number of GPS relocations for an individual in a given year 
- year.mean
- Mean agricultural use for a given year 
- year.max
- Maximum agricultural use from a 90-day moving average for a given year 
- year.delta
- Difference between mean and maximum agricultural use for a given year 
- year.mcp.area
- Minimum convex polygon (MCP) home range for an individual in a given year 
- mu.daily.disp
- Mean daily displacement for an individual in a given year 
- subject_sex
- Sex of the individual (male or female) 
- subject_ageClass
- Age class of individual (young adult or mature adult) 
- centroid.dist.meters
- Distance from centroid of homerange to agriculture (meters) 
- tactic.prev
- Tactic of the previous year (NA if no previous tactic could be confirmed) 
- tactic.change
- Whether an individual changed tactics ("Changed") or stayed the same ("No change") 
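The three yearly metrics (year.mean, year.max, year.delta) can be illustrated on a toy daily series. Everything below is a sketch under stated assumptions: the `use` vector is invented and stands in for a daily proportion of GPS fixes on agricultural land, the moving average is computed as a trailing 90-day window, and the delta is taken as maximum minus mean. The original study may define or align these differently.

```r
# Toy daily agricultural use for one year: zero except a 90-day season
use <- rep(0, 365)
use[101:190] <- 0.5

# Mean use over the year
year_mean <- mean(use)

# Trailing 90-day moving average; the first 89 days are NA
ma90 <- stats::filter(use, rep(1 / 90, 90), sides = 1)

# Maximum of the moving average, and its difference from the mean
year_max <- max(as.numeric(ma90), na.rm = TRUE)
year_delta <- year_max - year_mean
c(mean = year_mean, max = year_max, delta = year_delta)
```

A large delta with a low mean, as in this toy series, is the kind of pattern that would distinguish concentrated seasonal use from steady habitual use.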
Source
Hahn, Nathan (2021). Elephant agricultural use metrics in Mara-Serengeti ecosystem. Dryad Digital Repository. doi:10.5061/DRYAD.RN8PK0PBN
Examples
# Load the dataset
data(elephant_farmers)
# Basic exploration
head(elephant_farmers)
summary(elephant_farmers)
# Examine distribution of tactics by sex
table(elephant_farmers$subject_sex, elephant_farmers$tactic.season)
# Compare agricultural use metrics across tactics
boxplot(year.mean ~ tactic.season, data = elephant_farmers, 
        main = "Mean Agricultural Use by Tactic",
        ylab = "Mean Agricultural Use")
        
# Examine tactic changes over time
# Count how many elephants changed tactics vs stayed the same
table(elephant_farmers$tactic.change, useNA = "ifany")
# Look at relationship between distance to agriculture and tactic
boxplot(centroid.dist.meters ~ tactic.season, data = elephant_farmers,
        main = "Distance to Agriculture by Tactic",
        ylab = "Distance (meters)")
Galapagos Sea Lion Reproduction Data
Description
This dataset contains reproductive performance data for female Galapagos sea lions (Zalophus wollebaeki) collected over a 13-year period in the Galapagos archipelago. The data includes information on mother birth dates, body mass, age at first reproduction, and offspring details. This dataset was used to study life history traits and reproductive trade-offs of this tropical apex predator in an unpredictable habitat.
Usage
lion_reproduction
Format
A data frame with 48 rows and 12 variables:
- MotherID
- Unique identifier for the mother sea lion 
- MotherBD
- Mother's birth date 
- exact
- Whether the birth date is exact ("Yes"), estimated ("No"), or unknown ("Unknown") 
- MotherBirthyear
- Year the mother was born 
- AgeAtCapture
- Age of the mother when first captured (in days) 
- AverageOneYearBodymass
- Average body mass of the mother at one year of age (in kg) 
- PupID
- Unique identifier for the pup 
- FirstPupBorn
- Date when the first pup was born 
- OffspringSex
- Sex of the offspring ("Male" or "Female") 
- SeenSince
- Year the mother was first observed 
- SeenUntil
- Year the mother was last observed 
- AFR
- Age at first reproduction (in years) 
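Age at first reproduction can be approximated from the mother's birth date and the birth date of her first pup. The sketch below uses invented dates and assumes both columns can be converted to Date; the stored AFR values may have been derived differently (for example, as whole breeding seasons).

```r
# Toy data with the documented column names; dates are invented
toy <- data.frame(
  MotherBD     = as.Date(c("2003-10-01", "2004-10-15")),
  FirstPupBorn = as.Date(c("2009-10-01", "2009-10-20"))
)

# Age in days at first pup, converted to years
age_days <- as.numeric(toy$FirstPupBorn - toy$MotherBD)
toy$AFR_check <- age_days / 365.25
round(toy$AFR_check, 2)
```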
Source
Kalberer, Stephanie, Meise, Kristine, Trillmich, Fritz, & Krüger, Oliver (2018). Reproductive performance of a tropical apex predator in an unpredictable habitat. Dryad Digital Repository. doi:10.5061/DRYAD.6S48579
Examples
# Load the dataset
data(lion_reproduction)
# Basic exploration
head(lion_reproduction)
summary(lion_reproduction)
# Calculate mean age at first reproduction
mean(lion_reproduction$AFR, na.rm = TRUE)
# Compare age at first reproduction by offspring sex
boxplot(AFR ~ OffspringSex, data = lion_reproduction,
        main = "Age at First Reproduction by Offspring Sex",
        ylab = "Age (years)")
        
# Relationship between mother's body mass and age at first reproduction
plot(AverageOneYearBodymass ~ AFR, data = lion_reproduction,
     main = "Body Mass vs. Age at First Reproduction",
     xlab = "Age at First Reproduction (years)",
     ylab = "Average Body Mass at One Year (kg)")
Magellanic Penguin Foraging and Reproductive Data
Description
Data from satellite tracking of Magellanic penguins (Spheniscus magellanicus) at Punta Tombo, Argentina, spanning 23 breeding seasons. The dataset contains information on foraging site fidelity, trip characteristics, and reproductive success for individual penguins.
Usage
magellanic_penguins
Format
A data frame with 212 rows and 21 variables:
- SeasonYear
- Integer, breeding season year 
- PenguinID
- Integer, unique identifier for each penguin 
- PenguinSeq
- Integer, sequential penguin number within season 
- InstrumentSeq
- Integer, instrument deployment sequence 
- NTripPairsLong
- Integer, number of long trip pairs 
- DistBetMean
- Numeric, mean distance between foraging sites (km) 
- DistBetSD
- Numeric, standard deviation of distance between sites (km) 
- DistBetMin
- Numeric, minimum distance between foraging sites (km) 
- DistBetMax
- Numeric, maximum distance between foraging sites (km) 
- InstrType
- Factor, type of tracking instrument used 
- InstrModel
- Character, model of tracking instrument 
- NFledged
- Integer, number of chicks that fledged successfully 
- DurDaysMean
- Numeric, mean trip duration in days 
- TripDistMean
- Numeric, mean trip distance (km) 
- BearingMean
- Numeric, mean bearing of foraging trips (degrees) 
- PenguinSex
- Factor, sex of penguin ("Male", "Female") 
- NumTrips
- Integer, total number of foraging trips 
- NumChicksDeploy
- Integer, number of chicks at deployment 
- NumChicksStarved
- Integer, number of chicks that starved 
- DeployDurDays
- Integer, deployment duration in days 
- ChlaMean
- Numeric, mean chlorophyll-a concentration 
Source
Rebstock, G., Abrahms, B., & Boersma, D. (2022). Data from: Site fidelity increases reproductive success by increasing foraging efficiency in a marine predator. Dryad Digital Repository. doi:10.5061/DRYAD.8931ZCRSV
References
Rebstock, G.A., Abrahms, B. & Boersma, P.D. (2022). Site fidelity increases reproductive success by increasing foraging efficiency in a marine predator. Proceedings of the Royal Society B, 289(1975), 20220175.
Examples
# Load the dataset
data(magellanic_penguins)
# Basic exploration
head(magellanic_penguins)
summary(magellanic_penguins)
# Compare mean trip distance by sex
boxplot(TripDistMean ~ PenguinSex, data = magellanic_penguins,
        main = "Mean Trip Distance by Sex",
        xlab = "Sex", ylab = "Mean Trip Distance (km)")
# Relationship between site fidelity and reproductive success
plot(magellanic_penguins$DistBetMean, magellanic_penguins$NFledged,
     xlab = "Mean Distance Between Sites (km)",
     ylab = "Number of Chicks Fledged",
     main = "Site Fidelity vs Reproductive Success")
Habitat Occupancy of the Critically Endangered Chinese Pangolin
Description
A dataset containing habitat occupancy observations of the Critically Endangered Chinese pangolin (Manis pentadactyla) in the urban landscape of Dharan Sub-metropolitan City, Nepal. The data were collected to analyze spatial distribution, habitat use patterns, and anthropogenic impacts on habitat occupancy of Chinese pangolins. The study used a single-season occupancy modeling approach, investigating factors influencing detection probability and habitat occupancy across 134 grid cells of 600m × 600m each.
Usage
pangolin_habitat
Format
A data frame with 152 rows and 18 variables:
- object_id
- Unique identifier for each grid cell 
- replicate_1
- Detection (1) or non-detection (0) in first survey replicate 
- replicate_2
- Detection (1) or non-detection (0) in second survey replicate 
- replicate_3
- Detection (1) or non-detection (0) in third survey replicate 
- replicate_4
- Detection (1) or non-detection (0) in fourth survey replicate 
- replicate_5
- Detection (1) or non-detection (0) in fifth survey replicate 
- replicate_6
- Detection (1) or non-detection (0) in sixth survey replicate 
- distance_to_water
- Distance to nearest water body in meters 
- terrain_ruggedness
- Terrain Ruggedness Index (TRI), a measure of topographic heterogeneity 
- mean_ndvi
- Mean Normalized Difference Vegetation Index, a measure of vegetation density 
- habitat_type
- Type of habitat: "Sal Forest", "Mixed Forest", "Human Settlement", or "Agricultural Land" 
- habitat_structure
- Topographic structure: "Terrace" or "Cliff" 
- human_disturbance_index
- Index of human disturbance, ranging from 0 (low) to 1 (high) 
- termite_mounds
- Number of termite mounds in the grid cell 
- detection_sum
- Total number of detections across all six replicates 
- detected
- Binary indicator of whether pangolin was detected (1) or not (0) in any replicate 
- disturbance_level
- Categorized human disturbance: "Low", "Medium-Low", "Medium-High", or "High" 
Details
The dataset is particularly valuable for teaching concepts in wildlife conservation, occupancy modeling, and human-wildlife interactions in urban environments. It demonstrates how ecological and anthropogenic factors affect endangered species in human-dominated landscapes.
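The detection columns lend themselves to a quick check before fitting a formal occupancy model. The sketch below uses a small made-up data frame (`toy`, not the packaged data) with the same `replicate_1` to `replicate_6` layout. It shows how `detection_sum`, `detected`, and a naive occupancy estimate could be derived, assuming the summary columns are simple row-wise summaries of the replicates, which the documentation implies but does not state outright.

```r
# Hypothetical detection histories for four grid cells (illustrative values only)
toy <- data.frame(
  object_id   = 1:4,
  replicate_1 = c(0, 1, 0, 0),
  replicate_2 = c(0, 1, 0, 1),
  replicate_3 = c(0, 0, 0, 0),
  replicate_4 = c(0, 1, 0, 0),
  replicate_5 = c(0, 0, 0, 1),
  replicate_6 = c(0, 1, 0, 0)
)

rep_cols <- paste0("replicate_", 1:6)

# Total detections per cell, and whether the cell had any detection at all
toy$detection_sum <- rowSums(toy[, rep_cols])
toy$detected      <- as.integer(toy$detection_sum > 0)

# Naive occupancy: share of cells with at least one detection
naive_occupancy <- mean(toy$detected)

# Share of cells with a detection in each survey replicate
detections_per_replicate <- colMeans(toy[, rep_cols])
```

Naive occupancy ignores imperfect detection, so it tends to underestimate true occupancy; that gap is what a single-season occupancy model is designed to address.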
Source
Subba, Asmit and Tamang, Ganesh and Lama, Sony and Basnet, Nabin and Kyes, Randall C. and Khanal, Laxman (2024). Habitat occupancy of the critically endangered Chinese pangolin (Manis pentadactyla) under human disturbance in an urban environment: Implications for conservation. Dryad Digital Repository. doi:10.5061/DRYAD.73N5TB34T
Examples
# Load the dataset
data(pangolin_habitat)
# Basic exploration
head(pangolin_habitat)
summary(pangolin_habitat)
# Examine detection rates across habitat types
table(pangolin_habitat$habitat_type, pangolin_habitat$detected)
# Visualize the relationship between termite mounds and pangolin detection
boxplot(termite_mounds ~ detected, data = pangolin_habitat,
        main = "Termite Mounds and Pangolin Detection",
        xlab = "Pangolin Detected", ylab = "Number of Termite Mounds",
        names = c("Not Detected", "Detected"))
        
# Examine the effect of human disturbance on pangolin detection
boxplot(human_disturbance_index ~ detected, data = pangolin_habitat,
        main = "Human Disturbance and Pangolin Detection",
        xlab = "Pangolin Detected", ylab = "Human Disturbance Index",
        names = c("Not Detected", "Detected"))
        
# Visualize detection across disturbance levels
barplot(prop.table(table(pangolin_habitat$disturbance_level, 
                         pangolin_habitat$detected), 1)[,2],
        main = "Pangolin Detection Rate by Disturbance Level",
        xlab = "Disturbance Level", ylab = "Detection Rate")
Raccoon dog activity and environmental factors in China
Description
This dataset contains records of raccoon dog (Nyctereutes procyonoides) detections and other mammal species from camera traps in the Sizuolou Nature Reserve, Beijing, China, along with associated environmental variables. The data was collected from October 15, 2023, to February 29, 2024, covering both autumn and winter seasons. The study examined seasonal activity changes in raccoon dogs and their relationship to environmental and mammalian factors.
Usage
raccoondog_environment
Format
A data frame with 144 rows and 21 variables:
- point_id
- Camera installation point ID 
- asian_badger
- Number of Asian badger detection events 
- wild_boar
- Number of wild boar detection events 
- hog_badger
- Number of hog badger detection events 
- leopard_cat
- Number of leopard cat detection events 
- masked_palm_civet
- Number of masked palm civet detection events 
- rock_squirrel
- Number of Père David's rock squirrel detection events 
- raccoon_dog
- Number of raccoon dog detection events 
- red_squirrel
- Number of red squirrel detection events 
- roe_deer
- Number of Siberian roe deer detection events 
- siberian_weasel
- Number of Siberian weasel detection events 
- striped_squirrel
- Number of Swinhoe's striped squirrel detection events 
- tolai_hare
- Number of Tolai hare detection events 
- dist_impervious
- Distance from camera to nearest impervious area (meters) 
- dist_agricultural
- Distance from camera to nearest agricultural land (meters) 
- dist_water
- Distance from camera to nearest water source (meters) 
- dist_roads
- Distance from camera to nearest road (meters) 
- Altitude
- Elevation of camera installation point (meters above sea level) 
- tpi
- Topographic Position Index of the camera installation point 
- Day
- Date of detection (days since start of study, October 15, 2023) 
- Season
- Season at time of detection ("Autumn" or "Winter") 
- Vegetation
- Type of vegetation at the camera installation point 
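Because `Day` is stored as a day count rather than a calendar date, converting it back can help when plotting activity over time. A minimal sketch, assuming `Day = 0` corresponds to October 15, 2023 (the documentation does not say whether counting starts at 0 or 1, so check this against the data); the offsets below are made up for illustration:

```r
# Hypothetical day offsets (illustrative values, not taken from the dataset)
day_offset <- c(0, 17, 47, 137)

# Assumption: an offset of 0 is the first day of the study
study_start   <- as.Date("2023-10-15")
calendar_date <- study_start + day_offset

format(calendar_date, "%Y-%m-%d")
```

Under this assumption an offset of 137 falls on February 29, 2024, the last day of the stated collection period.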
Source
Miyamoto, Keisuke and Chen, Chuan and Luan, Xiaofeng (2025). Seasonal activity changes in raccoon dogs and influences of environmental factors from autumn to winter. Dryad Digital Repository. doi:10.5061/DRYAD.C866T1GJN
Examples
# Load the dataset
data(raccoondog_environment)
# Basic exploration
head(raccoondog_environment)
summary(raccoondog_environment)
# Compare raccoon dog detections by season
boxplot(raccoon_dog ~ Season, data = raccoondog_environment,
        main = "Raccoon Dog Detections by Season",
        ylab = "Number of Detections", xlab = "Season")
        
# Examine relationship between environmental factors and raccoon dog presence
# Create a binary presence variable
raccoondog_presence <- raccoondog_environment
raccoondog_presence$presence <- ifelse(raccoondog_presence$raccoon_dog > 0, 1, 0)
# Plot distance to agricultural land by presence
# (a factor on the x-axis gives side-by-side boxplots instead of two columns of points)
plot(dist_agricultural ~ factor(presence), data = raccoondog_presence,
     main = "Raccoon Dog Presence vs. Distance to Agricultural Land",
     xlab = "Presence (0=Absent, 1=Present)", 
     ylab = "Distance to Agricultural Land (m)")
     
# Examine vegetation types where raccoon dogs were detected
table(raccoondog_presence$Vegetation[raccoondog_presence$presence == 1])
Scavenger community structure in Scandinavian ecosystems
Description
This dataset contains observations of scavenger communities along an environmental gradient from boreal forest to alpine tundra in central Scandinavia. The data was collected using baited camera traps to quantify the structure of the winter scavenger community and assess how climatic conditions affected spatial patterns of species occurrences at baits. The study found that habitat type (forest or alpine tundra) and snow depth were main determinants of community structure. Occurrence at baits by habitat generalists (red fox, golden eagle, and common raven) typically increased at low temperatures and high snow depth, likely due to increased energetic demands and lower abundance of natural prey in harsh winter conditions.
Usage
scavenger_community
Format
A data frame with 1255 rows and 61 variables:
- id
- Site identifier (factor) 
- year
- Year of observation (numeric) 
- jd
- Julian day (numeric) 
- photo_sum
- Total number of photos (numeric) 
- empty
- Number of empty photos (numeric) 
- raven
- Number of photos with ravens (numeric) 
- crow
- Number of photos with crows (numeric) 
- magpie
- Number of photos with magpies (numeric) 
- eurjay
- Number of photos with Eurasian jays (numeric) 
- sibjay
- Number of photos with Siberian jays (numeric) 
- geagle
- Number of photos with golden eagles (numeric) 
- wteagle
- Number of photos with white-tailed eagles (numeric) 
- rlbuzz
- Number of photos with rough-legged buzzards (numeric) 
- goshawk
- Number of photos with goshawks (numeric) 
- redfox
- Number of photos with red foxes (numeric) 
- arcfox
- Number of photos with arctic foxes (numeric) 
- wolverine
- Number of photos with wolverines (numeric) 
- badger
- Number of photos with badgers (numeric) 
- pinemart
- Number of photos with pine martens (numeric) 
- mustelids
- Number of photos with other mustelids (numeric) 
- lb
- Total number of photos with large birds (numeric) 
- bird
- Total number of photos with birds (numeric) 
- mammal
- Total number of photos with mammals (numeric) 
- age
- Age of bait in days (numeric) 
- snow
- Snow presence indicator (numeric) 
- session
- Sampling session identifier (factor) 
- ravenm
- Raven presence/absence (numeric) 
- crowm
- Crow presence/absence (numeric) 
- magpiem
- Magpie presence/absence (numeric) 
- eurjaym
- Eurasian jay presence/absence (numeric) 
- sibjaym
- Siberian jay presence/absence (numeric) 
- geaglem
- Golden eagle presence/absence (numeric) 
- wteaglem
- White-tailed eagle presence/absence (numeric) 
- rlbuzzm
- Rough-legged buzzard presence/absence (numeric) 
- goshawkm
- Goshawk presence/absence (numeric) 
- redfoxm
- Red fox presence/absence (numeric) 
- arcfoxm
- Arctic fox presence/absence (numeric) 
- wolverinem
- Wolverine presence/absence (numeric) 
- badgerm
- Badger presence/absence (numeric) 
- pinemartm
- Pine marten presence/absence (numeric) 
- mustelidsm
- Other mustelids presence/absence (numeric) 
- arcfox.day
- Arctic fox detected that day (numeric) 
- redfox.day
- Red fox detected that day (numeric) 
- wolverine.day
- Wolverine detected that day (numeric) 
- raven.day
- Raven detected that day (numeric) 
- crow.day
- Crow detected that day (numeric) 
- magpie.day
- Magpie detected that day (numeric) 
- geagle.day
- Golden eagle detected that day (numeric) 
- wteagle.day
- White-tailed eagle detected that day (numeric) 
- badger.day
- Badger detected that day (numeric) 
- pinemart.day
- Pine marten detected that day (numeric) 
- eurjay.day
- Eurasian jay detected that day (numeric) 
- sibjay.day
- Siberian jay detected that day (numeric) 
- lb.day
- Large bird detected that day (numeric) 
- se
- Session identifier (numeric) 
- bird.day
- Any bird detected that day (numeric) 
- mammal.day
- Any mammal detected that day (numeric) 
- length
- Length of session (numeric) 
- samean
- Mean solar angle (numeric) 
- tamean
- Mean temperature (numeric) 
- habitat
- Habitat type: "Boreal forest" or "Alpine tundra" (factor) 
- hosl
- Hours of sunlight (numeric) 
- scover
- Snow cover: "Snow cover" or "No snow cover" (factor) 
- loghosl
- Log-transformed hours of sunlight (numeric) 
- altitude
- Altitude in meters (numeric) 
- sdepth
- Snow depth in cm (numeric) 
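The per-species indicator columns make it possible to compute a simple richness measure for each record. The sketch below uses a made-up data frame (`toy`) with three of the presence/absence columns and assumes they are coded 0/1. It counts how many of those species were present in each row, then averages by habitat.

```r
# Hypothetical records (illustrative values only, not taken from the dataset)
toy <- data.frame(
  habitat    = c("Boreal forest", "Boreal forest", "Alpine tundra", "Alpine tundra"),
  ravenm     = c(1, 1, 1, 0),
  redfoxm    = c(1, 0, 0, 0),
  wolverinem = c(0, 0, 1, 0)
)

presence_cols <- c("ravenm", "redfoxm", "wolverinem")

# Number of the listed species present in each record
toy$richness <- rowSums(toy[, presence_cols] > 0, na.rm = TRUE)

# Mean per-record richness in each habitat
mean_richness <- tapply(toy$richness, toy$habitat, mean)
mean_richness
```

With the packaged data, `presence_cols` could be extended to all of the columns ending in `m`, provided they follow the same coding.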
Source
Gomo, Gjermund and Rød-Eriksen, Lars and Andreassen, Harry P. and Mattisson, Jenny and Odden, Morten and Devineau, Olivier and Eide, Nina E. (2020). Scavenger community structure along an environmental gradient from boreal forest to alpine tundra in Scandinavia. Ecology and Evolution. doi:10.1002/ece3.6834
Examples
# Load the dataset
data(scavenger_community)
# Basic exploration
head(scavenger_community)
summary(scavenger_community)
# Records with selected species present, by habitat type
if(require(dplyr)) {
  # Count the records in which each species was present. This tallies
  # presences for three example species; it is not a species richness estimate.
  scavenger_community %>%
    group_by(habitat) %>%
    summarize(
      raven_present = sum(ravenm > 0, na.rm = TRUE),
      redfox_present = sum(redfoxm > 0, na.rm = TRUE),
      wolverine_present = sum(wolverinem > 0, na.rm = TRUE),
      total_presences = raven_present + redfox_present + wolverine_present
    )
}
# Compare bird vs mammal occurrence between habitats
if(require(dplyr)) {
  scavenger_community %>%
    group_by(habitat) %>%
    summarize(
      bird_observations = sum(bird, na.rm = TRUE),
      mammal_observations = sum(mammal, na.rm = TRUE),
      observation_ratio = bird_observations / mammal_observations
    )
}
# Visualize snow depth distribution by habitat
if(require(ggplot2)) {
  ggplot(scavenger_community, aes(x = sdepth, fill = habitat)) +
    geom_histogram(position = "dodge", bins = 20) +
    labs(title = "Snow depth by habitat type",
         x = "Snow depth (cm)",
         y = "Count")
}
Gyps Vulture Stable Isotope Analysis - Feather Data (AR.feather subset)
Description
Stable isotope data (carbon, nitrogen, and sulfur) from Gyps vulture feathers collected in Tanzania for dietary analysis using stable isotope mixing models (SIMM). This dataset represents the AR.feather subset containing raw consumer isotope values from vulture feathers. Data were collected over 10 months from two protected areas, Serengeti National Park and Selous Game Reserve, to analyze vulture dietary patterns across space and time.
Usage
vulture_diet
Format
A data frame with 21 rows and 5 variables:
- d13C
- Numeric, delta 13C carbon isotope values per mill (‰) 
- d15N
- Numeric, delta 15N nitrogen isotope values per mill (‰) 
- d34S
- Numeric, delta 34S sulfur isotope values per mill (‰) 
- species
- Factor, vulture species sampled (African white-backed or Rüppell's griffon) 
- tissue
- Factor, tissue type analyzed (feathers) 
Details
Vultures were captured using noose lines around provisioned or natural bait, processed, and released. Feather samples were analyzed for delta13C, delta15N, and delta34S using a PyroCube elemental analyzer at the NERC Life Sciences Mass Spectrometry Facility. The isotope signatures provide insights into vulture diet composition, with delta13C distinguishing between C3 and C4 plant consumers (browsers vs grazers), delta15N indicating trophic level, and delta34S helping separate geographic regions.
This subset was specifically prepared for use in stable isotope mixing models to estimate diet composition in Gyps vultures. The study found that vulture diet consisted primarily of grazing herbivores, with those in Serengeti National Park consuming higher proportions (>87%) of grazing species. Coordinates in the original study were denatured by +0.5 degrees to preserve geographic distribution while ensuring location confidentiality.
Collection period: August 18, 2018 to May 31, 2019
Study locations: Serengeti National Park (2.1540°S, 34.6857°E) and Selous Game Reserve (9.0000°S, 37.5000°E), Tanzania
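The delta values in this dataset follow the standard notation for stable isotopes: the heavy-to-light isotope ratio of a sample is expressed relative to a reference standard, in parts per thousand. A minimal sketch of that calculation, using a hypothetical helper `delta_permil()` and made-up ratios rather than values from the study:

```r
# Hypothetical helper: delta = (R_sample / R_standard - 1) * 1000
delta_permil <- function(r_sample, r_standard) {
  (r_sample / r_standard - 1) * 1000
}

# Illustrative isotope ratios only (not measured values)
r_standard <- 0.0112
r_sample   <- c(0.0112, 0.0110, 0.0111)

delta <- delta_permil(r_sample, r_standard)
round(delta, 2)
```

A sample with the same ratio as the standard has a delta of 0; samples depleted in the heavy isotope relative to the standard have negative values.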
Source
Baino, A., Hopcraft, G., Kendall, C., Munishi, L., Behdenna, A., & Newton, J. (2021). We are what we eat, plus some per mill: Using stable isotopes to estimate diet composition in Gyps vultures over space and time. Dryad Digital Repository. doi:10.5061/DRYAD.1NS1RN8QF
Examples
# Load the dataset
data(vulture_diet)
head(vulture_diet)
summary(vulture_diet)
# Examine isotope signatures by species
boxplot(d13C ~ species, data = vulture_diet,
        main = "Carbon Isotope Signatures by Vulture Species",
        xlab = "Species", ylab = "d13C (per mill)")
# Create isotope biplot
plot(vulture_diet$d13C, vulture_diet$d15N,
     col = as.numeric(vulture_diet$species),
     pch = 16, cex = 1.2,
     xlab = "d13C (per mill)", ylab = "d15N (per mill)",
     main = "Vulture Feather Isotope Signatures")
legend("topright", legend = levels(vulture_diet$species), 
       col = 1:nlevels(vulture_diet$species), pch = 16)
# Summary statistics by species
aggregate(. ~ species, data = vulture_diet[,1:4], FUN = mean)
The Evolution of Hearing and Brain Size in Eocene Whales
Description
A dataset containing endocranial volume and body mass measurements for various cetacean (whale) species and other mammals. This dataset was compiled to study the evolution of hearing and brain size in Eocene whales. It includes both extant (living) and fossil species, with a focus on understanding how brain size evolved in relation to body mass and hearing adaptations across different taxonomic groups. The dataset is particularly valuable for teaching concepts in comparative anatomy, allometry, and cetacean evolution.
Usage
whale_brains
Format
A data frame with 269 rows and 9 variables:
- family
- Taxonomic family of the species 
- binomial_name
- Full taxonomic name for each species 
- common_name
- Common name for each species (NA for most fossil species) 
- endocranial_volume
- Endocranial volume in cubic centimeters (cc) 
- brain_mass
- Brain mass in grams 
- ocw_mm
- Occipital condyle width in millimeters 
- body_mass
- Body mass in kilograms 
- taxonomic_group
- Categorization as "Cetacean", "Hippopotamid", or "Other Mammal" 
- time_period
- Classification as "Extant" (living) or "Fossil" species 
Details
Toothed whales (odontocetes) use high-frequency sounds to echolocate, differing significantly from baleen whales (mysticetes), which use low-frequency sound for long-distance communication. This dataset helps explore how hearing functioned in ancestral archaeocetes, and when the specializations of modern species arose.
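Brain size is usually compared across species after accounting for body size, since the two scale together. The sketch below fits a log-log (allometric) regression with base R on a made-up data frame (`toy`); the numbers are illustrative and are not taken from the dataset.

```r
# Hypothetical species measurements (illustrative values only)
toy <- data.frame(
  body_mass  = c(50, 200, 1000, 5000, 40000),  # kg
  brain_mass = c(300, 800, 1500, 3000, 6000)   # g
)

# Allometric model: log10(brain mass) as a linear function of log10(body mass)
fit <- lm(log10(brain_mass) ~ log10(body_mass), data = toy)

# The slope is the allometric scaling exponent
slope <- unname(coef(fit)[2])

# Residuals show which species have larger or smaller brains than
# the fitted line predicts for their body mass
toy$residual <- residuals(fit)
toy
```

The same model could be fitted to `whale_brains` after dropping rows with missing `brain_mass` or `body_mass`, and the residuals compared between taxonomic groups or time periods.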
Source
Peacock, John and Waugh, David and Bajpai, Sunil and Thewissen, JGM (2025). The evolution of hearing and brain size in Eocene whales. Dryad Digital Repository. doi:10.5061/DRYAD.SF7M0CGH1
Examples
# Load the dataset
data(whale_brains)
# Basic exploration
head(whale_brains)
summary(whale_brains)
# Compare brain mass across taxonomic groups
boxplot(whale_brains$brain_mass ~ whale_brains$taxonomic_group, 
        main = "Brain Mass by Taxonomic Group",
        ylab = "Brain Mass (g)", log = "y")
# Look at the relationship between brain mass and body mass
# Using log scales to show allometric relationships
# Convert the grouping variable to a factor so it can index colors and label the legend
grp <- factor(whale_brains$taxonomic_group)
plot(whale_brains$body_mass, whale_brains$brain_mass, 
     log = "xy", col = as.numeric(grp),
     pch = 16, main = "Brain Mass vs. Body Mass",
     xlab = "Body Mass (kg)", ylab = "Brain Mass (g)")
legend("topleft", legend = levels(grp), 
       col = seq_along(levels(grp)), pch = 16)
       
# Compare fossil and extant cetaceans
cetaceans <- subset(whale_brains, taxonomic_group == "Cetacean")
boxplot(cetaceans$brain_mass ~ cetaceans$time_period,
        main = "Brain Mass in Fossil vs. Extant Cetaceans",
        ylab = "Brain Mass (g)", log = "y")