| Title: | Parse 'Tableau' Workbooks into Functional Data | 
| Version: | 0.2.3 | 
| Description: | High-performance parsing of 'Tableau' workbook files into tidy data frames and dependency graphs for other visualization tools like R 'Shiny' or 'Power BI' replication. | 
| License: | MIT + file LICENSE | 
| URL: | https://prigasg.github.io/twbparser/, https://github.com/PrigasG/twbparser | 
| BugReports: | https://github.com/PrigasG/twbparser/issues | 
| Depends: | R (≥ 4.2.0) | 
| Imports: | dplyr, igraph, purrr, R6, rlang, stringr, tibble, tidyr, withr, xml2 | 
| Suggests: | cli, covr, ggraph, knitr, lintr, magrittr, optparse, rmarkdown, spelling, styler, testthat (≥ 3.0.0), tidygraph, vctrs, zip | 
| VignetteBuilder: | knitr | 
| Config/testthat/edition: | 3 | 
| Encoding: | UTF-8 | 
| Language: | en-US | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-09-23 16:26:11 UTC; garthur | 
| Author: | George Arthur | 
| Maintainer: | George Arthur <prigasgenthian48@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-09-30 08:30:02 UTC | 
twbparser: Parse Tableau TWB/TWBX files
Description
Tools to parse Tableau workbooks (relationships, joins, fields, datasources) and inspect TWBX assets.
Author(s)
Maintainer: George Arthur prigasgenthian48@gmail.com (ORCID)
See Also
Useful links:
- Report bugs at https://github.com/PrigasG/twbparser/issues 
Classify a TWBX entry by file extension
Description
Classify a TWBX entry by file extension
Usage
.twbx_classify(name)
Arguments
| name | Path/filename | 
Value
One of "workbook","extract","image","text","excel","other"
Tableau Workbook Parser (R6)
Description
Initialize the parser from a .twb or .twbx path.
Return the TWBX manifest (if available).
Return TWBX extract entries.
Return TWBX image entries.
Extract files from the TWBX to disk.
Validate relationships; optionally stop on failure.
Print a oneāline summary of parsed content.
Arguments
| path | Path to a  | 
| types | Optional vector of types (e.g.,  | 
| pattern | Optional regex to match archive paths. | 
| files | Optional explicit archive paths to extract. | 
| exdir | Output directory (defaults to parser's twbx dir or tempdir()). | 
| error | If  | 
Format
An R6 class generator.
Details
Create a parser for Tableau .twb / .twbx files. On initialization, the
parser reads the XML and precomputes relationships, joins, fields, calculated
fields, inferred relationships, and datasource details. For .twbx, it also
extracts the largest .twb and records a manifest.
Fields
- path
- Path to the - .twbor- .twbxfile on disk.
- xml_doc
- Parsed - xml2document of the workbook.
- twbx_path
- Original - .twbxpath if the workbook was packaged.
- twbx_dir
- Directory where the - .twbxwas extracted.
- twbx_manifest
- Tibble of - .twbxcontents from- twbx_list().
- relations
- Tibble of - <relation>nodes from- extract_relations().
- joins
- Tibble of join clauses from - extract_joins().
- relationships
- Tibble of modern relationships from - extract_relationships().
- inferred_relationships
- Tibble of inferred relationship pairs by name and role. 
- datasource_details
- List containing - data_sources,- parameters, and- all_sources.
- fields
- Tibble of raw fields with table information. 
- calculated_fields
- Tibble of calculated fields. 
- last_validation
- Result from - validate()as list with- okand- issueselements.
Methods
- new(path)
- Create a parser from - .twbor- .twbxfile.
- get_twbx_manifest()
- Return - .twbxmanifest tibble.
- get_twbx_extracts()
- Return - .twbxextract entries.
- get_twbx_images()
- Return - .twbximage entries.
- extract_twbx_assets(types, pattern, files, exdir)
- Extract files from - .twbxarchive.
- get_relations()
- Return relations tibble. 
- get_joins()
- Return joins tibble. 
- get_relationships()
- Return modern relationships tibble. 
- get_inferred_relationships()
- Return inferred relationship pairs. 
- get_datasources()
- Return datasource details tibble. 
- get_parameters()
- Return parameters tibble. 
- get_datasources_all()
- Return all sources tibble. 
- get_fields()
- Return raw fields tibble. 
- get_calculated_fields(pretty = FALSE, strip_brackets = FALSE, wrap = 100L)
- 
Return calculated fields tibble. When pretty = TRUE, includes aformula_prettycolumn with line breaks and indentation.
- validate(error = FALSE)
- Validate relationships. Stops execution if - error = TRUE.
- summary()
- Print a brief summary to console. 
Escape square/round brackets with backslashes
Description
Escape square/round brackets with backslashes
Usage
add_back_slash(string)
Arguments
| string | Character vector (or NULL) | 
Value
Character vector with [], () escaped; NULL passes through.
Safely extract an attribute from a named list
Description
Safely extract an attribute from a named list
Usage
attr_safe_get(attrs, name, default = NA_character_)
Arguments
| attrs | Named list (e.g.,  | 
| name | Attribute to retrieve | 
| default | Fallback value | 
Value
Scalar character
Safe basename with fallback
Description
Safe basename with fallback
Usage
basename_safe(x, fallback = "<unknown>")
Arguments
| x | Path or file name | 
| fallback | Value to use if  | 
Value
Basename or fallback
Build a field dependency graph from calculated fields
Description
Creates a directed graph where edges point from input fields used in a
formula to the calculated output field. Tokens are extracted from bracketed
references like [Table].[Field] or [Field].
Usage
build_dependency_graph(fields_df)
Arguments
| fields_df | A data frame with at least columns  | 
Value
An igraph directed graph where vertices are field names and edges
represent dependencies (input → output).
Examples
fields <- tibble::tibble(
name = c("X_plus_Y", "Z"),
formula = c("[X] + [Y]", "[X_plus_Y] * 2")
)
g <- build_dependency_graph(fields)
Extract calculated fields from a TWB
Description
Finds columns that contain <calculation> nodes and returns metadata and
formulas, with a heuristic flag for table calculations.
Usage
extract_calculated_fields(xml_doc)
Arguments
| xml_doc | An  | 
Value
A tibble with columns:
- datasource
- Datasource name. 
- name
- User-visible caption or cleaned internal name. 
- tableau_internal_name
- Internal Tableau name (often bracketed). 
- datatype
- Tableau datatype. 
- role
- Tableau role. 
- formula
- Calculation formula string. 
- calc_class
- Tableau calc class (often - "tableau").
- is_table_calc
- Heuristic flag for table calcs (e.g., - WINDOW_,- LOOKUP).
- table
- Raw table reference. 
- table_clean
- Cleaned table name. 
Examples
# A tiny TWB shipped with the package:
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
# Extract calculated fields
calcs <- extract_calculated_fields(xml)
head(calcs)
Extract columns with their source tables from a TWB
Description
Scans top-level <datasource> nodes (excluding view-specific references) and
returns fields with raw names/captions, cleaned table/field names, and basic
metadata.
Usage
extract_columns_with_table_source(xml_doc)
Arguments
| xml_doc | An  | 
Value
A tibble with columns:
- datasource
- Datasource name. 
- name
- Raw column name (may include brackets/qualifiers). 
- caption
- Column caption if present. 
- datatype
- Tableau datatype. 
- role
- Tableau role. 
- semantic_role
- Semantic role if present. 
- table
- Raw table reference. 
- table_clean
- Cleaned table name (no brackets/suffix). 
- field_clean
- Cleaned field name. 
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
fields <- extract_columns_with_table_source(xml)
Extract datasource details from a Tableau TWB
Description
Gathers runtime tables (from the object graph), merges in named connection metadata (class, caption, targets), and augments with top-level datasource definitions (field counts, connection type, location). Also returns a filtered table of parameter datasources.
Usage
extract_datasource_details(xml_doc)
Arguments
| xml_doc | An  | 
Value
A named list with:
- data_sources
- Tibble of datasources joined with connection metadata. 
- parameters
- Tibble of parameter datasources (if present). 
- all_sources
- Same as - data_sources(placeholder for future variants).
Examples
# Preferred: from a tiny .twb
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
if (nzchar(twb) && file.exists(twb)) {
xml <- xml2::read_xml(twb)
res <- extract_datasource_details(xml)
head(res$data_sources)
}
# Alternative: from a tiny .twbx
twbx <- system.file("extdata", "test_for_zip.twbx", package = "twbparser")
if (nzchar(twbx) && file.exists(twbx)) {
members <- twbx_list(twbx)
twb_member <- members$Name[grepl("\\.twb$", members$Name)][1]
if (!is.na(twb_member)) {
xml <- xml2::read_xml(unz(twbx, twb_member))
res <- extract_datasource_details(xml)
head(res$data_sources)
  }
}
Extract Tableau join clauses from <relation type="join"> nodes
Description
Handles both column-based clauses (<clause><column/></clause>) and
expression-based predicates (<expression op=...>) found in TWB XML.
Usage
extract_joins(xml_doc)
Arguments
| xml_doc | An  | 
Value
A tibble with columns:
- join_type
- Join kind (e.g., - inner,- left), if available.
- left_table
- Left table name (cleaned). 
- left_field
- Left field name. 
- operator
- Predicate operator (defaults to - "="when missing).
- right_table
- Right table name (cleaned). 
- right_field
- Right field name. 
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
extract_joins(xml)
Extract <named-connection> entries from a TWB
Description
Rich, safe extraction of <named-connection> nodes and their <connection>
attributes into a tidy tibble.
Usage
extract_named_connections(xml_doc)
Arguments
| xml_doc | An  | 
Value
Tibble with columns like connection_id, connection_caption,
connection_class, connection_target, dbname, schema, warehouse,
region, filename, and location_named.
Examples
# Preferred: read from a tiny '.twb'
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
if (nzchar(twb) && file.exists(twb)) {
xml <- xml2::read_xml(twb)
extract_named_connections(xml)
}
# Alternative: read from a tiny '.twbx'
twbx <- system.file("extdata", "test_for_zip.twbx", package = "twbparser")
if (nzchar(twbx) && file.exists(twbx)) {
members <- twbx_list(twbx)
twb_member <- members$Name[grepl("\\.twb$", members$Name)][1]
if (!is.na(twb_member)) {
xml <- xml2::read_xml(utils::unz(twbx, twb_member))
extract_named_connections(xml)
  }
}
Extract parameter fields from a TWB
Description
Returns parameter columns (those with param-domain-type) and basic metadata,
including a best-effort current value if present.
Usage
extract_parameters(xml_doc)
Arguments
| xml_doc | An  | 
Value
A tibble with columns:
- datasource
- Datasource name. 
- name
- User-visible caption or cleaned internal name. 
- tableau_internal_name
- Internal Tableau name. 
- datatype
- Tableau datatype. 
- role
- Tableau role. 
- parameter_type
- Tableau parameter domain type. 
- allowable_type
- Underlying data-type (if present). 
- current_value
- Current value if specified. 
- is_parameter
- Always - TRUE.
- table
- Raw table reference. 
- table_clean
- Cleaned table name. 
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
params <- extract_parameters(xml)
head(params)
Extract non-calculated, non-parameter fields from a TWB
Description
Returns raw columns excluding calculated fields and parameters.
Usage
extract_raw_fields(xml_doc)
Arguments
| xml_doc | An  | 
Value
A tibble with columns:
- datasource
- Datasource name. 
- name
- User-visible caption or cleaned internal name. 
- tableau_internal_name
- Internal Tableau name. 
- datatype
- Tableau datatype. 
- role
- Tableau role. 
- is_hidden
- Whether the field is hidden. 
- is_parameter
- Always - FALSE.
- table
- Raw table reference. 
- table_clean
- Cleaned table name. 
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
raw_fields <- extract_raw_fields(xml)
head(raw_fields)
Extract all <relation> tags from a TWB
Description
Returns a tibble of <relation> elements found in a Tableau TWB XML,
with key attributes and any custom SQL text.
Usage
extract_relations(xml_doc)
Arguments
| xml_doc | An  | 
Value
A tibble with columns:
| name | Relation name | 
| table | Table reference | 
| connection | Connection ID | 
| type | Relation type (table, join, etc.) | 
| join | Join type if applicable | 
| custom_sql | Inline SQL text if present | 
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
fields <- extract_columns_with_table_source(xml)
inferred <- infer_implicit_relationships(fields)
head(inferred)
Extract modern relationships from a Tableau TWB
Description
Parses Tableau "relationships" (introduced in 2020.2) between logical tables, including the join predicate fields and operator.
Usage
extract_relationships(xml_doc)
Arguments
| xml_doc | An  | 
Value
A tibble with columns:
| relationship_type | Always "Relationship" | 
| left_table | Left table name | 
| right_table | Right table name | 
| left_field | Field name on left side | 
| operator | Join operator (e.g., "=") | 
| right_field | Field name on right side | 
| left_is_calc | Logical, whether left field is a calculation | 
| right_is_calc | Logical, whether right field is a calculation | 
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
extract_relationships(xml)
Extract the .twb (and optionally all files) from a .twbx
Description
Extract the .twb (and optionally all files) from a .twbx
Usage
extract_twb_from_twbx(
  twbx_path,
  extract_dir = file.path(tempdir(), paste0("twbx_",
    tools::file_path_sans_ext(basename(twbx_path)), "_", format(Sys.time(),
    "%Y%m%d%H%M%S"))),
  extract_all = FALSE
)
Arguments
| twbx_path | Path to a  | 
| extract_dir | Directory to extract into (defaults to a timestamped temp dir). | 
| extract_all | If  | 
Value
List with twb_path, exdir, twbx_path, and manifest (tibble).
Examples
twbx <- system.file("extdata", "test_for_zip.twbx", package = "twbparser")
res  <- extract_twb_from_twbx(twbx, extract_all = FALSE)
basename(res$twb_path)
Infer implicit relationships between tables from field metadata
Description
Generates candidate join pairs by:
- Matching - semantic_roleacross different tables.
- Matching field names (case-insensitive) across different tables. 
Usage
infer_implicit_relationships(fields_df, max_pairs = 50000L)
Arguments
| fields_df | A data frame like the output of
 | 
| max_pairs | Maximum number of candidate pairs to return (default 50,000). | 
Value
A tibble with columns:
- left_table
- Left table name. 
- left_field
- Left field name. 
- right_table
- Right table name. 
- right_field
- Right field name. 
- reason
- Why the pair was suggested. 
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
stopifnot(nzchar(twb), file.exists(twb))
xml <- xml2::read_xml(twb)
fields <- extract_columns_with_table_source(xml)
inferred <- infer_implicit_relationships(fields)
head(inferred)
Is an xml2 node missing?
Description
Is an xml2 node missing?
Usage
is_xml_missing(node)
Arguments
| node | An  | 
Value
TRUE/FALSE
Log a one-line summary of .twbx contents
Description
Log a one-line summary of .twbx contents
Usage
log_twbx_contents(twbx_path)
Arguments
| twbx_path | Path to a  | 
Value
(Invisibly) the manifest tibble
Plot a field dependency graph
Description
Draws a quick base-graphics plot of a dependency graph. Vertices that are
calculated fields (present in fields_df$name) are drawn differently.
Usage
plot_dependency_graph(g, fields_df = NULL, seed = NULL)
Arguments
| g | An  | 
| fields_df | Optional data frame with a  | 
| seed | Optional integer seed to make the layout reproducible. If  | 
Value
Invisibly returns g.
Examples
fields <- tibble::tibble(
name = c("X_plus_Y", "Z"),
formula = c("[X] + [Y]", "[X_plus_Y] * 2")
)
g <- build_dependency_graph(fields)
plot_dependency_graph(g, fields)           # nondeterministic layout
plot_dependency_graph(g, fields, seed = 1) # deterministic layout
Plot a field-level relationship DAG (legacy)
Description
Uses relationships_df with columns left_table, right_table,
left_field, right_field, and optional operator.
Usage
plot_relationship_graph(relationships_df, seed = NULL)
Arguments
| relationships_df | Data frame of field-level relationships. | 
| seed | Optional integer seed to make the layout reproducible. If  | 
Value
Invisibly returns the plotted graph.
Plot a source join graph
Description
Visualizes joins between sources. Expects joins_df with columns
left_table, right_table, left_field, right_field. If
relationships_df is provided (modern relationships), it will render a
second graph highlighting those relationships.
Usage
plot_source_join_graph(joins_df, relationships_df = NULL, seed = NULL)
Arguments
| joins_df | Data frame with join edges. | 
| relationships_df | Optional data frame with modern relationships. | 
| seed | Optional integer seed to make layouts reproducible. If  | 
Value
Invisibly returns the join graph, or a list list(joins = g, relationships = gr)
if relationships_df is provided.
Add a prettified formula column to calculated fields table
Description
Add a prettified formula column to calculated fields table
Usage
prettify_calculated_fields(calcs, strip_brackets = FALSE, wrap = 100L)
Arguments
| calcs | tibble from extract_calculated_fields() | 
| strip_brackets | logical | 
| wrap | integer wrap width; default 100 | 
Value
tibble with extra column formula_pretty
Print a quick data-source summary from a parser object
Description
Print a quick data-source summary from a parser object
Usage
print_datasource_summary(parser)
Arguments
| parser | An object with  | 
Value
Invisibly prints summary
Wrap a string in Tableau-style square brackets
Description
Wrap a string in Tableau-style square brackets
Usage
put_in_sq_bracket(string)
Arguments
| string | Character vector; NA/NULL returns  | 
Value
Character vector wrapped like [name]
Redact AWS access keys in strings
Description
Redact AWS access keys in strings
Usage
redact(x)
Arguments
| x | Character vector | 
Value
Character vector with keys replaced by [REDACTED_AWS_KEY]
Safely evaluate and return fallback on error (with warning)
Description
Safely evaluate and return fallback on error (with warning)
Usage
safe_call(expr, fallback)
Arguments
| expr | Expression to evaluate | 
| fallback | Value if an error occurs | 
Value
Result of expr or fallback
Prettify a Tableau calculation formula for display
Description
Prettify a Tableau calculation formula for display
Usage
tableau_formula_pretty(formula, strip_brackets = FALSE, wrap = NA_integer_)
Arguments
| formula | character scalar | 
| strip_brackets | logical; remove [ ] around field names (default FALSE) [ ]: R:%20 | 
| wrap | optional integer to hard-wrap lines (use NA to disable) | 
Value
character scalar (multi-line, indented)
Custom SQL (Metadata API) for a published item
Description
Queries the Metadata (GraphQL) API for Custom SQL tables in the content graph.
Usage
tbs_custom_sql_graphql(
  content_id,
  base_url = Sys.getenv("TABLEAU_BASE_URL"),
  site = Sys.getenv("TABLEAU_SITE"),
  token = Sys.getenv("TABLEAU_PAT")
)
Arguments
| content_id | Character. Workbook or datasource ID (GUID). | 
| base_url | Character. Server/Cloud base URL (e.g., "https://..."). | 
| site | Character. Site contentUrl ("" for default site). | 
| token | Character. REST credentials token. | 
Value
A tibble with columns such as custom_sql_name, custom_sql_query,
database, schema. Zero rows if none.
Examples
tbs_custom_sql_graphql("abc-123")
Publish info for a workbook or datasource on 'Tableau' Server/Cloud
Description
Returns an empty tibble when credentials are missing or the item is not found.
Usage
tbs_publish_info(
  content_id,
  base_url = Sys.getenv("TABLEAU_BASE_URL"),
  site = Sys.getenv("TABLEAU_SITE"),
  token = Sys.getenv("TABLEAU_PAT")
)
Arguments
| content_id | Character. Workbook or datasource ID (GUID). | 
| base_url | Character. Server/Cloud base URL (e.g., "https://..."). | 
| site | Character. Site contentUrl ("" for the default site). | 
| token | Character. REST credentials token (from a prior sign-in). | 
Value
A tibble with columns like content_id, site, project, web_url,
created_at, updated_at. May be zero rows if unavailable.
Examples
tbs_publish_info("abc-123")
Extract Custom SQL relations from a TWB XML
Description
Extract Custom SQL relations from a TWB XML
Usage
twb_custom_sql(xml_doc)
Arguments
| xml_doc | An xml2 document for a .twb | 
Value
tibble with relation_name, relation_type, custom_sql
Extract Initial SQL statements from connections (if present)
Description
Extract Initial SQL statements from connections (if present)
Usage
twb_initial_sql(xml_doc)
Arguments
| xml_doc | An xml2 document for a .twb | 
Value
tibble with connection_id, initial_sql
Detect likely references to published data sources (vs embedded)
Description
Detect likely references to published data sources (vs embedded)
Usage
twb_published_refs(xml_doc)
Arguments
| xml_doc | An xml2 document for a .twb | 
Value
tibble with datasource name, caption, likely_published, hints
Extract specific files from a .twbx
Description
Extract specific files from a .twbx
Usage
twbx_extract_files(
  twbx_path,
  files = NULL,
  pattern = NULL,
  types = NULL,
  exdir = NULL
)
Arguments
| twbx_path | Path to a  | 
| files | Vector of archive paths to extract (optional). | 
| pattern | Perl regex to match archive paths (optional). | 
| types | Subset by  | 
| exdir | Output directory (defaults to temp). | 
Value
Tibble with name, type, and out_path of extracted files.
Examples
twbx <- system.file("extdata", "test_for_zip.twbx", package = "twbparser")
files <- twbx_extract_files(twbx, types = c("workbook"))
head(files)
List contents of a Tableau .twbx
Description
List contents of a Tableau .twbx
Usage
twbx_list(twbx_path)
Arguments
| twbx_path | Path to a  | 
Value
Tibble with columns: name, size_bytes, modified, type.
Examples
twbx <- system.file("extdata", "test_for_zip.twbx", package = "twbparser")
twbx_list(twbx)
Validate relationships against available datasources and fields
Description
Checks that relationship endpoints reference known datasource tables and that
the predicate fields appear somewhere in the workbook (calculated, raw, or
parameter fields), using a lenient token match (e.g., INT([GEOID]) = GEOID).
Usage
validate_relationships(parser, strict = FALSE)
Arguments
| parser | A  | 
| strict | Logical. Reserved for future table-scoped checks that can be overly conservative with federated sources. Currently not used. | 
Value
A list with:
- ok
- TRUEif no issues;- FALSEotherwise.
- issues
- A named list of tibbles. Possible elements: -  unknown_tables: endpoints not found among known tables.
-  unknown_fields: predicate fields not found in the field pool.
 
-  
Examples
twb <- system.file("extdata", "test_for_wenjie.twb", package = "twbparser")
if (nzchar(twb) && file.exists(twb)) {
  parser <- TwbParser$new(twb)
  res <- validate_relationships(parser)
  if (!res$ok) print(res$issues)
}