sheetwise.formula_parser

Formula parsing and analysis utilities for spreadsheets.

Classes

FormulaDependencyAnalyzer([formula_parser])

Specialized analyzer for formula dependencies.

FormulaParser()

Extracts, analyzes and simplifies Excel formulas from spreadsheets.

FormulaTokenizer()

Helper class to tokenize Excel formulas safely.

class sheetwise.formula_parser.FormulaTokenizer

Helper class to tokenize Excel formulas safely.

tokenize(formula)

Split formula into tokens while preserving structure.

Return type:

List[str]
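As an illustration of what "preserving structure" can mean for a formula tokenizer, here is a minimal sketch. It is not the sheetwise implementation; the function name tokenize_formula and the exact token rules (string literals and ranges such as A1:B2 kept as single tokens) are assumptions made for this example.

```python
from typing import List

# Characters treated as single-character tokens in this sketch.
# ':' is deliberately absent so that a range like A1:B2 stays in one token.
DELIMITERS = "(),+-*/^&<>="


def tokenize_formula(formula: str) -> List[str]:
    """Split an Excel-style formula into tokens (illustrative only)."""
    text = formula.lstrip("=")  # drop the leading '=' if present
    tokens: List[str] = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == '"':
            # String literal: keep it whole; a doubled quote is an escape.
            j = i + 1
            while j < len(text):
                if text[j] == '"':
                    if j + 1 < len(text) and text[j + 1] == '"':
                        j += 2
                        continue
                    break
                j += 1
            tokens.append(text[i:j + 1])
            i = j + 1
        elif ch in DELIMITERS:
            tokens.append(ch)
            i += 1
        else:
            # Function name, number, or cell reference: read to a delimiter.
            j = i
            while (j < len(text) and not text[j].isspace()
                   and text[j] not in DELIMITERS and text[j] != '"'):
                j += 1
            tokens.append(text[i:j])
            i = j
    return tokens


print(tokenize_formula('=SUM(A1:B2, IF(C1>0, "x,y", 2))'))
```

The comma inside the string literal stays within one token, which is the main thing a naive split on "," would get wrong.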

extract_args(token_stream)

Extract arguments from a function call, handling nested parentheses.

Return type:

List[str]
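Handling nested parentheses usually comes down to tracking depth and splitting only on commas at the outermost level. The sketch below shows that technique; the helper name split_call_args and the assumption that the token list starts at the opening parenthesis of the call are hypothetical and may differ from what extract_args expects.

```python
from typing import List


def split_call_args(tokens: List[str]) -> List[str]:
    """Return the top-level arguments of a call whose tokens start at '('."""
    if not tokens or tokens[0] != "(":
        raise ValueError("expected the token stream to start with '('")
    args: List[str] = []
    current: List[str] = []
    depth = 0
    for tok in tokens:
        if tok == "(":
            depth += 1
            if depth == 1:
                continue  # skip the call's own opening parenthesis
        elif tok == ")":
            depth -= 1
            if depth == 0:
                break  # reached the parenthesis that closes the call
        elif tok == "," and depth == 1:
            # A comma at depth 1 separates two top-level arguments.
            args.append("".join(current))
            current = []
            continue
        current.append(tok)
    if current:
        args.append("".join(current))
    return args


tokens = ["(", "A1", ",", "SUM", "(", "B1", ",", "B2", ")", ",", "3", ")"]
print(split_call_args(tokens))
```

Commas inside the nested SUM call are seen at depth 2, so they stay part of the second argument instead of splitting it.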

class sheetwise.formula_parser.FormulaParser

Extracts, analyzes and simplifies Excel formulas from spreadsheets. Optimized for memory usage with streaming reads.

CELL_REF_PATTERN = re.compile('([A-Z]+[0-9]+|[A-Z]+\\:[A-Z]+|[0-9]+\\:[0-9]+|[A-Z]+[0-9]+\\:[A-Z]+[0-9]+)')

__init__()

Initialize the formula parser.

extract_formulas(excel_path, sheet_name=None)

Extract all formulas from an Excel file using memory-efficient streaming.

Parameters:
  • excel_path (str) – Path to the Excel file

  • sheet_name (Optional[str]) – Specific sheet to parse; limiting extraction to one sheet saves time

Return type:

Dict[str, str]

Returns:

Dictionary mapping cell addresses to formulas

build_dependency_graph()

Build a graph of cell dependencies based on extracted formulas.

Return type:

Dict[str, Set[str]]
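To make the Dict[str, Set[str]] shape concrete, the following sketch builds a small graph from a dictionary of formulas. It assumes each formula cell maps to the set of cells it reads; the direction sheetwise actually uses is not stated here. The name build_graph and the simplified single-cell pattern are hypothetical.

```python
import re
from typing import Dict, Set

# Simplified pattern for this sketch: single cells such as A1 or BC27 only.
SIMPLE_CELL = re.compile(r"[A-Z]+[0-9]+")


def build_graph(formulas: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each formula cell to the set of cells its formula references."""
    graph: Dict[str, Set[str]] = {}
    for cell, formula in formulas.items():
        graph[cell] = set(SIMPLE_CELL.findall(formula))
    return graph


formulas = {"C1": "=A1+B1", "D1": "=C1*2"}
graph = build_graph(formulas)
print(graph["C1"])  # the two cells that C1 reads
print(graph["D1"])  # the single cell that D1 reads
```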

extract_cell_references(formula)

Extract all cell references from a formula.

Return type:

List[str]
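The pattern documented as CELL_REF_PATTERN can be tried directly with Python's re module. The sketch below assumes the pattern is applied with findall, which may not match how extract_cell_references uses it; the helper name find_cell_references is hypothetical.

```python
import re
from typing import List

# Same pattern as the documented CELL_REF_PATTERN, written as a raw string.
CELL_REF_PATTERN = re.compile(
    r"([A-Z]+[0-9]+|[A-Z]+\:[A-Z]+|[0-9]+\:[0-9]+|[A-Z]+[0-9]+\:[A-Z]+[0-9]+)"
)


def find_cell_references(formula: str) -> List[str]:
    """Return every non-overlapping pattern match, left to right."""
    return CELL_REF_PATTERN.findall(formula)


print(find_cell_references("=SUM(A1:B2)+C3"))  # ['A1', 'B2', 'C3']
print(find_cell_references("=SUM(A:C)"))       # ['A:C']
print(find_cell_references("=SUM(1:3)"))       # ['1:3']
```

Note that in this sketch a rectangular range such as A1:B2 comes back as its two corner cells. Python tries alternatives left to right, so the single-cell alternative matches A1 before the full-range alternative is considered. Whether the library post-processes such matches is not documented here.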

simplify_formula(formula)

Generate a simplified explanation using robust tokenization. Handles nested functions better than a regex-based approach.

Return type:

str

get_formula_impact(cell_address)

Analyze the impact of a formula cell.

Return type:

Dict[str, Any]
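One plausible reading of "impact" is the set of cells a formula reads plus the set of cells that would change, directly or indirectly, if it changed. The sketch below computes both from a dependency graph. The function name formula_impact and the result keys ("cell", "depends_on", "affects") are assumptions for illustration; the keys actually returned by get_formula_impact are not documented here.

```python
from typing import Any, Dict, Set


def formula_impact(graph: Dict[str, Set[str]], cell: str) -> Dict[str, Any]:
    """Summarize what `cell` reads and which cells depend on it transitively.

    `graph` maps each formula cell to the set of cells it references.
    """
    # Invert the graph: referenced cell -> cells whose formulas read it.
    dependents: Dict[str, Set[str]] = {}
    for target, sources in graph.items():
        for src in sources:
            dependents.setdefault(src, set()).add(target)

    # Depth-first walk over dependents; the visited set also guards cycles.
    affected: Set[str] = set()
    stack = [cell]
    while stack:
        node = stack.pop()
        for child in dependents.get(node, ()):
            if child not in affected:
                affected.add(child)
                stack.append(child)

    return {
        "cell": cell,
        "depends_on": sorted(graph.get(cell, set())),
        "affects": sorted(affected),
    }


graph = {"C1": {"A1", "B1"}, "D1": {"C1"}, "E1": {"D1", "A1"}}
print(formula_impact(graph, "C1"))
```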

encode_formulas_for_llm(formulas=None)

Generate LLM-friendly encoding.

Return type:

str

class sheetwise.formula_parser.FormulaDependencyAnalyzer(formula_parser=None)

Specialized analyzer for formula dependencies.

__init__(formula_parser=None)

find_calculation_chains()

Return type:

List[List[str]]
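The method carries no description, so the following is only a guess at what a "calculation chain" could look like: a path that starts at an input cell (one with no references of its own) and follows dependents until no further cell depends on the result. The name calculation_chains and this definition are assumptions, not the documented behavior.

```python
from typing import Dict, List, Set


def calculation_chains(graph: Dict[str, Set[str]]) -> List[List[str]]:
    """List input-to-output paths through a dependency graph.

    `graph` maps each formula cell to the set of cells it references.
    Path enumeration can grow quickly on large sheets; this is a sketch.
    """
    # Invert the graph: referenced cell -> cells whose formulas read it.
    dependents: Dict[str, Set[str]] = {}
    for target, sources in graph.items():
        for src in sources:
            dependents.setdefault(src, set()).add(target)

    all_cells = set(graph) | set(dependents)
    # Inputs are cells that reference nothing (or are plain values).
    inputs = sorted(c for c in all_cells if not graph.get(c))

    chains: List[List[str]] = []

    def walk(cell: str, path: List[str]) -> None:
        # Skip cells already on the path so cycles cannot loop forever.
        nxt = sorted(n for n in dependents.get(cell, ()) if n not in path)
        if not nxt:
            if len(path) > 1:
                chains.append(path)
            return
        for n in nxt:
            walk(n, path + [n])

    for start in inputs:
        walk(start, [start])
    return chains


graph = {"B1": {"A1"}, "C1": {"B1"}, "D1": {"A1"}}
print(calculation_chains(graph))  # [['A1', 'B1', 'C1'], ['A1', 'D1']]
```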

identify_critical_cells()

Return type:

List[str]
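The method carries no description either. One common heuristic, shown here purely as an assumption, is to treat a cell as critical when many formulas read it directly. The name critical_cells and the min_dependents threshold are hypothetical and are not part of the documented API.

```python
from collections import Counter
from typing import Dict, List, Set


def critical_cells(graph: Dict[str, Set[str]], min_dependents: int = 2) -> List[str]:
    """Return cells read by at least `min_dependents` formulas.

    `graph` maps each formula cell to the set of cells it references.
    Results are ordered by descending count, then by cell address.
    """
    # Count how many formulas reference each cell directly.
    counts = Counter(src for sources in graph.values() for src in sources)
    hits = [cell for cell, n in counts.items() if n >= min_dependents]
    return sorted(hits, key=lambda cell: (-counts[cell], cell))


graph = {"B1": {"A1"}, "C1": {"B1"}, "D1": {"A1"}}
print(critical_cells(graph))                    # ['A1']
print(critical_cells(graph, min_dependents=1))  # ['A1', 'B1']
```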