| Title: | Retime and Analyse Speech Signals | 
| Version: | 0.1.3 | 
| Description: | Retime speech signals with a native Waveform Similarity Overlap-Add (WSOLA) implementation translated from the 'TSM toolbox' by Driedger & Müller (2014) https://www.audiolabs-erlangen.de/content/resources/MIR/TSMtoolbox/2014_DriedgerMueller_TSM-Toolbox_DAFX.pdf. Design retimings and pitch (f0) transformations with tidy data and apply them via 'Praat' interface. Produce spectrograms, spectra, and amplitude envelopes. Includes implementation of vocalic speech envelope analysis (fft_spectrum) technique and example data (mm1) from Tilsen, S., & Johnson, K. (2008) <doi:10.1121/1.2947626>. | 
| Depends: | R (≥ 4.1) | 
| License: | MIT + file LICENSE | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| Imports: | rPraat, purrr, dplyr, tibble, tidyr, tuneR, ggplot2, stringr, signal, gsignal, methods, seewave, phonTools | 
| RoxygenNote: | 7.3.2 | 
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) | 
| Config/testthat/edition: | 3 | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2025-01-21 11:55:20 UTC; alistairbeith | 
| Author: | Alistair Beith [aut, cre, cph] | 
| Maintainer: | Alistair Beith <alistair.beith@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-01-22 17:20:02 UTC | 
retimer: Retime and Analyse Speech Signals
Description
Retime speech signals with a native Waveform Similarity Overlap-Add (WSOLA) implementation translated from the 'TSM toolbox' by Driedger & Müller (2014) https://www.audiolabs-erlangen.de/content/resources/MIR/TSMtoolbox/2014_DriedgerMueller_TSM-Toolbox_DAFX.pdf. Design retimings and pitch (f0) transformations with tidy data and apply them via 'Praat' interface. Produce spectrograms, spectra, and amplitude envelopes. Includes implementation of vocalic speech envelope analysis (fft_spectrum) technique and example data (mm1) from Tilsen, S., & Johnson, K. (2008) doi:10.1121/1.2947626.
Author(s)
Maintainer: Alistair Beith alistair.beith@gmail.com [copyright holder]
extractPitchTier
Description
Extracts 'Praat' PitchTier from wav object.
Usage
extractPitchTier(wav, res = 0.1, fmin = 50, fmax = 250, output = "PitchTier")
Arguments
| wav | path to a wav file or a tuneR WAVE object | 
| res | resolution of PitchTier | 
| fmin | minimum frequency of PitchTier | 
| fmax | maximum frequency of PitchTier | 
| output | can be "PitchTier" or "file" | 
Value
Returns a PitchTier object or the temporary path to the generated PitchTier file
extractWord
Description
Extract from a wav file with reference to a TextGrid.
Usage
extractWord(
  x,
  word,
  tier = "Word",
  ignore_case = TRUE,
  instance = "random",
  wd = getwd()
)
Arguments
| x | path to a TextGrid | 
| word | word to search for | 
| tier | name of word tier in TextGrid | 
| ignore_case | default is 'TRUE' | 
| instance | instance of word in TextGrid to extract. Default extracts a random instance. Can also be numeric (row number) | 
| wd | working directory for Praat to use. Accepts relative paths. | 
Value
Extracts section of wav file corresponding to word and saves in format name_wordi.wav where name is the original name, word is the word and x is the numeric instance.
See Also
density
extract_env
Description
Extract amplitude envelope of filtered speech signal. Adapted from Tilson & Johnson (2008). Procedure:
Usage
extract_env(
  x,
  fs,
  low_pass = 80,
  fs_out = 80,
  win = c(700, 1300),
  mean_centre = FALSE,
  replace_init = FALSE
)
Arguments
| x | a speech signal | 
| fs | sampling frequency of signal | 
| low_pass | frequency of lowpass filter used for smoothing | 
| fs_out | output sampling frequency | 
| win | lower and upper frequencies for initial bypass filter. Default is 700Hz-1300Hz as in Tilson & Johnson (2008) | 
| mean_centre | if TRUE signal will be scaled between 0 and 1 and then mean centred. Default is FALSE | 
| replace_init | if TRUE (default is FALSE) first sample of result will be replaced with second sample to deal with initialisation issue in resampling | 
Details
1. Signal is bypass filtered to extract desired frequency range 2. Absolute signal is then lowpass filtered 3. Signal is downsampled and mean centred if desired
Value
A matrix with time and amplitude
References
Tilsen, S., & Johnson, K. (2008). Low-frequency Fourier analysis of speech rhythm. The Journal of the Acoustical Society of America, 124(2), EL34–EL39. doi:10.1121/1.2947626
See Also
fft_spectro
fft_spectro
Description
Calculates low frequency power spectrogram of vocalic interval of speech signal. Following method of Tilsen & Johnson (2008)
Usage
fft_spectro(x, f_out = 80, window_size = 256, padding = 2048, plot = TRUE)
Arguments
| x | a 'tuneR' "Wave" object or the path to a .wav file. | 
| f_out | the sample frequency for the output | 
| window_size | number of samples to calculate each spectrum over | 
| padding | length to zero pad signal to. If signal is longer than padding, this will be increased. | 
| plot | if true a spectrogram will be plotted | 
Value
Returns a tibble with frequency (Hz), time (s) and power
References
Tilsen, S., & Johnson, K. (2008). Low-frequency Fourier analysis of speech rhythm. The Journal of the Acoustical Society of America, 124(2), EL34–EL39. doi:10.1121/1.2947626
See Also
fft_spectrum
fft_spectrum
Description
Calculates low frequency power spectrum of vocalic interval of speech signal. Following method of Tilsen & Johnson (2008)
Usage
fft_spectrum(signal, f, f_out = 80, padding = 512)
Arguments
| signal | a speech signal | 
| f | sampling frequency | 
| f_out | output sampling frequency. Signal will be lowpass filtered at f_out/2 | 
| padding | length to zero pad signal to. If signal is longer than padding, this will be increased. | 
Value
Returns a matrix with columns 'freq' (frequency in Hz) and 'pwr' (spectral power).
References
Tilsen, S., & Johnson, K. (2008). Low-frequency Fourier analysis of speech rhythm. The Journal of the Acoustical Society of America, 124(2), EL34–EL39. doi:10.1121/1.2947626
See Also
fft_spectro
findPeak
Description
Find the mode of numeric vector using the peak of its density distribution.
Usage
findPeak(x, ...)
Arguments
| x | a numeric vector | 
| ... | further arguements to be passed to 'density' | 
Value
Returns the value of 'x' that corresponds to the peak of the density curve.
See Also
density
findWord
Description
Find a word in a TextGrid
Usage
findWord(x, word = "speech", tier = "Word", ignore_case = TRUE)
Arguments
| x | path to a TextGrid | 
| word | word to search for | 
| tier | name of word tier in TextGrid | 
| ignore_case | default is 'TRUE' | 
Value
Returns a tibble with onset (t1) and offset (t2) of each occurance of the word in the TextGrid
See Also
extractWord
flatF0
Description
Flatten fundamental frequency contour using 'Praat'
Usage
flatF0(wav, .f = findPeak, ...)
Arguments
| wav | path to a wav file or a tuneR WAVE object | 
| .f | function to use to determine pitch. Default is findPeak which finds the mode of the existing pitch contour. | 
| ... | Additional arguments passed to extractPitchTier | 
Value
Returns a tuneR WAVE object of the input with a flat F0 contour
See Also
extractPitchTier
get_serial_anchors
Description
Convert a set of point anchors to a set of anchors that prevent overlaps while fixing the retiming factor within words.
Usage
get_serial_anchors(
  anc_in,
  anc_out,
  w_onsets,
  w_offsets,
  fs = NULL,
  retime_f = NULL,
  dry_run = FALSE,
  smudge = 0
)
Arguments
| anc_in | a vector of time points in the input signal | 
| anc_out | a vector of the times anc_in should be mapped to in the output signal | 
| w_onsets | a vector of time points for the onsets of words. Should be same length as anc_in. | 
| w_offsets | a vector of time points for the offsets of words. Should be same length as anc_in. | 
| fs | Sample rate of signal. If provided, returned anchor points with be expressed in samples. If NULL result will be expressed in seconds. | 
| retime_f | The desired factor that words should be sped by. If NULL the minimum change in rate that will prevent overlaps will be calculated. | 
| dry_run | If TRUE function will exit early with the minimum factor that will prevent overlaps. | 
| smudge | If > 0 this applies a crude adjustment to the calculated anchors to ensure monotonicity. Not necessary unless w_onsets are same as previous w_offsets. | 
Value
A list that can be used to perform retiming with the wsola function of this package.
See Also
wsola
S4 generic for length
Description
S4 generic for length.
Usage
## S4 method for signature 'Wave'
length(x)
Arguments
| x | a 'tuneR' WAVE object | 
Value
The length of the left channel of the WAVE object
See Also
length
mm1
Description
Example speech from Tilsen & Johnson (2008)
Usage
mm1
Format
A tuneR "Wave" object:
References
Tilsen, S., & Johnson, K. (2008). Low-frequency Fourier analysis of speech rhythm. The Journal of the Acoustical Society of America, 124(2), EL34–EL39. doi:10.1121/1.2947626
praatRetime
Description
praatRetime
Usage
praatRetime(wav, tg)
Arguments
| wav | path to a wav file or a tuneR WAVE object | 
| tg | a 'Praat' TextGrid object with 2 tiers: First tier should be intervals in the input audio file and second tier should be the same intervals with the desired onsets (t1) and offsets (t2). | 
Value
A wav file with the timing of the second tier of the TextGrid will be saved to the outfile location.
See Also
[read_tg()] for reading an existing TextGrid and [write_tg()] for saving a tibble as a TextGrid.
Examples
set.seed(42)
data(mm1)
dur <- length(mm1)/mm1@samp.rate
x <- runif(10)
t2_out <- dur*cumsum(x)/sum(x)
t1_out <- c(0, t2_out[-length(t2_out)])
t2_in <- dur*seq_len(10)/10
t1_in <- c(0, t2_in[-length(t2_in)])
tg <- dplyr::tibble(
               name = rep(c("old", "new"), each = 10),
               type = "interval",
               t1 = c(t1_in, t1_out),
               t2 = c(t2_in, t2_out),
               label = rep(letters[1:10], times = 2)
             ) |>
  tidyr::nest(data = c(t1, t2, label))
if (Sys.which("praat") != "") {
 wav_retimed <- praatRetime(mm1, tg)
} else {
 message("Skipping example because Praat is not installed.")
}
praatScript
Description
Executes a Praat script using the R system function.
Usage
praatScript(args, script = "reTimeWin.praat", wd = getwd(), praat = NULL)
Arguments
| args | arguements to pass to Praat script ("–run" not required) | 
| script | name of script if using a script from this package, or path to script for other scripts | 
| wd | working directory for Praat to use | 
| praat | path to Praat. If null will search for Praat in C:/Program Files (for Windows) or attempt to use "praat" for Unix based systems. | 
Value
Runs script in Praat and prints stdout to console.
praatSys
Description
Call 'Praat' via system2()
Usage
praatSys(args = "--version", praat = NULL, ...)
Arguments
| args | arguements to pass to 'Praat' | 
| praat | path to 'Praat'. If null will search for 'Praat' in C:/Program Files (for Windows) or attempt to use "praat" for Unix based systems. | 
| ... | arguements to pass to internal get_praat_path() function. This can be used to change the folder to look for R in for Windows (default is appDir = "C:/Program Files") | 
Value
Prints stdout to console
See Also
system2
read_tg
Description
Reads a 'Praat' TextGrid as a nested tibble
Usage
read_tg(file, encoding = "auto")
Arguments
| file | path to TextGrid file | 
| encoding | Passed to rPraat::tg.read: 'auto' (default) will detect encoding, or can be set to 'UTF-8' (rPraat default) | 
Value
Returns a nested tibble with 'name', 'type' and 'data'. 'data' has the variables 't1', 't2' and 'label'
spectrogram
Description
Universal spectrogram function.
Usage
spectrogram(
  x,
  fs = NULL,
  method = NULL,
  output = "tibble",
  wintime = 25,
  steptime = 10
)
Arguments
| x | a signal, 'tuneR' WAVE object, or the path to an .wav or .mp3 file. | 
| fs | sample rate if supplying the signal as a vector | 
| method | spectrogram implementation to use. Available options are 'phonTools', 'tuneR', 'gsignal', and 'seewave'. Default is to select the first of these methods that is available. | 
| output | format of output | 
| wintime | length of analysis window in ms | 
| steptime | interval between steps in ms | 
Value
Returns a spectrogram in the desired format
write_tg
Description
Writes a nested tibble to a 'Praat' TextGrid file
Usage
write_tg(x, file)
Arguments
| x | Nested tibble. Must contain the columns 'name', 'type' and 'data'. 'data' must have the columns 't1', 't2' and 'label' | 
| file | File name to save TextGrid as | 
Value
Returns path of saved TextGrid file
wsola
Description
Waveform Similarity Overlap-add. Translated from 'TSM Toolbox'.
Usage
wsola(x, s, win = "hann", winLen = 1024, synHop = 512, tol = 512)
Arguments
| x | an audio signal | 
| s | a scaling factor or a list of two vector with anchor points | 
| win | window function. Default is 'hann' for hanning window. Can also be a custom window supplied as a vector | 
| winLen | window length | 
| synHop | synthesis window hop size | 
| tol | tolerance for overlap delta | 
Value
retimed audio signal as vector
References
Driedger, J., Müller, M. (2014). TSM Toolbox: MATLAB Implementations of Time-Scale Modification Algorithms. In Proceedings of the International Conference on Digital Audio Effects (DAFx): 249–256.
See Also
fft_spectrum, get_serials_anchors
Examples
set.seed(42)
data(mm1)
dur <- length(mm1)
n <- 10
x <- runif(n)
anchors <- list(anc_in = c(0, dur*seq_len(n)/n),
                anc_out = c(0, dur*cumsum(x)/sum(x)))
sig <- wsola(mm1@left, anchors)