Scans Cricinfo parquet directories for ball-by-ball, match, and innings data and loads them into DuckDB. Skips matches already in the database. Also loads the fixtures index if present.
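The skip step can be pictured as a set difference between the match IDs found on disk and those already stored. This is only a sketch in base R: the ID values and variable names are hypothetical, and the function's actual lookup against DuckDB is not shown in this page.

```r
# Hypothetical match IDs; in practice these would come from the parquet
# files on disk and from a query against the existing database.
ids_on_disk <- c("1001", "1002", "1003", "1004")
ids_in_db   <- c("1001", "1003")

# Keep only the matches that are not yet in the database.
ids_to_load <- setdiff(ids_on_disk, ids_in_db)
print(ids_to_load)
```

Under this reading, re-running the ingest over an unchanged directory would leave nothing to load.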

Usage

ingest_cricinfo_data(
  cricinfo_dir = NULL,
  path = NULL,
  formats = c("t20i", "odi", "test"),
  genders = c("male", "female"),
  verbose = TRUE
)

Arguments

cricinfo_dir

Character. Path to the Cricinfo data directory (e.g., "../bouncerdata/cricinfo"). If NULL, the path is auto-detected from the sibling bouncerdata directory.

path

Character. Path to the DuckDB database file. If NULL, the default database path is used.

formats

Character vector. Formats to ingest. Default c("t20i", "odi", "test").

genders

Character vector. Genders to ingest. Default c("male", "female").

verbose

Logical. Print progress messages. Default TRUE.
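How formats and genders combine is not spelled out above. One plausible reading, assumed here, is that every format/gender pair is considered. The base R sketch below only enumerates those pairs from the documented defaults; it makes no claim about the on-disk directory layout.

```r
# Documented defaults for the two filter arguments.
formats <- c("t20i", "odi", "test")
genders <- c("male", "female")

# Assumption: each format is combined with each gender.
combos <- expand.grid(
  format = formats,
  gender = genders,
  stringsAsFactors = FALSE
)
print(combos)
```

With the defaults this gives six pairs; passing a single format and a single gender narrows it to one.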

Value

Invisibly returns a list with counts of ingested records.
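Because the result is returned invisibly, it is not printed when the function is called on its own line; assign it to inspect the counts. The stub below is a hypothetical stand-in used only to show that behaviour, and the element names `matches` and `balls` are assumptions, not the documented structure of the returned list.

```r
# Hypothetical stand-in for a function that returns its result invisibly.
ingest_stub <- function() {
  invisible(list(matches = 2L, balls = 480L))
}

ingest_stub()            # nothing is printed at the console
counts <- ingest_stub()  # assign the result to keep it
str(counts)              # inspect the structure of the list
```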

Examples

if (FALSE) { # \dontrun{
# Ingest all formats from default location
ingest_cricinfo_data()

# Ingest only T20I male data
ingest_cricinfo_data(formats = "t20i", genders = "male")

# Ingest from a specific directory
ingest_cricinfo_data(cricinfo_dir = "path/to/cricinfo")
} # }