A rules-based expression rewriting toolkit for symbolic computation.
xtk
is a Python package that provides tools for symbolic computation through expression rewriting. It offers capabilities such as pattern matching, rule-based transformations, expression evaluation, theorem proving via tree search, and data generation for AI and machine learning applications.
To quickly get started with xtk
, follow these steps:
-
Installation:
Install
xtk
from PyPI:pip install xtk
-
Basic Usage:
Here's a simple example of how to use
xtk
to simplify an expression:from xtk import simplifier # Define a simplification rule: x + 0 => x rules = [ [['+', ['?', 'x'], 0], [':', 'x']] ] # Create a simplifier function using the rule simplify = simplifier(rules) # Simplify an expression expr = ['+', 'a', 0] result = simplify(expr) print(f"Simplified expression: {result}") # Output: Simplified expression: a
-
Exploring More Features:
To explore more advanced features like symbolic differentiation, tree search algorithms for theorem proving, and working with predefined mathematical rules, refer to the detailed sections below.
- Expression Toolkit:
xtk
- Introduction
- Quick Start
- Table of Contents
- Installation
- Overview
- Representation of Rules and Expressions
- Pattern Matching
- Skeleton Instantiation
- Evaluation
- Simplification Process
- Tree Search and Theorem Proving
- Modules
- Using Tree Search Algorithms: DFS and Best-First Search
- Predefined Rewrite Rules
- Notebooks and Examples
- The Power of Language Design
To install the package locally from the source code:
git clone https://github.com/queelius/xtk
cd xtk
pip install -e .
To install it from PyPI:
pip install xtk
xtk
provides a comprehensive set of tools for symbolic computation:
-
Expression Rewriting Engine (
xtk/rewriter.py
): Functions for pattern matching, expression instantiation, evaluation, and simplification using transformation rules. -
Simplifier (
xtk/simplifier.py
): A recursive simplifier that applies rewrite rules to expressions in a bottom-up manner, facilitating expression simplification. -
Tree Search Algorithms (
xtk/search/
): Algorithms for theorem proving and exploring expression spaces, including BFS, DFS, IDDFS, Best-First Search, A* Search, and Monte Carlo Tree Search. -
Predefined Mathematical Rules (
xtk/rules/
): A collection of rules for various mathematical domains such as algebra, calculus, trigonometry, limits, and more. -
Jupyter Notebooks (
notebooks/
): Examples demonstrating the functionality of the package.
xtk
employs a powerful yet simple representation for rules and expressions, enabling efficient definition and manipulation of symbolic expressions.
Rewrite rules are defined using an abstract syntax tree (AST) representation, utilizing nested lists to represent expressions. Each rule consists of:
- Pattern: Defines the structure of the expression to match.
- Skeleton: A template for the replacement expression.
-
Simplification Rule: The sum of a variable
x
and zero simplifies tox
.[['+', ['?', 'x'], 0], [':', 'x']]
- Pattern:
['+', ['?', 'x'], 0]
- Skeleton:
[':', 'x']
- Pattern:
-
Derivative of a Constant: The derivative of a constant
c
with respect tox
is zero, ( \frac{d}{dx} c = 0 ).[['dd', ['?c', 'c'], ['?', 'x']], 0]
- Pattern:
['dd', ['?c', 'c'], ['?', 'x']]
- Skeleton:
0
- Pattern:
For enhanced readability, xtk
offers a simplified DSL for writing rules:
# Derivative of a constant: d(c)/dx = 0
dd (?c c) (?v x) = 0
# Derivative of a variable with respect to itself: d(x)/dx = 1
dd (?v x) (?v x) = 1
# Product rule: d(f * g)/dx = f' * g + f * g'
dd (* (? f) (? g)) (?v x) =
(+ (* (dd (: f) (: x)) (: g))
(* (: f) (dd (: g) (: x))))
In the DSL:
?c c
: Matches any constant and binds it toc
.?v x
: Matches any variable and binds it tox
.? f
,? g
: Match any expressions and bind them tof
andg
.:
is used in the skeleton to refer to matched variables from the pattern.
Rules can also be represented as JSON objects, allowing for additional metadata:
{
"pattern": ["dd", ["?c", "c"], ["?v", "x"]],
"replacement": 0,
"name": "derivative_of_constant",
"description": "The derivative of a constant is zero."
}
Expressions are represented in the same AST format as rules:
['+', 'x', 3]
represents ( x + 3 ).['*', ['+', 'x', 3], 4]
represents ( (x + 3) \times 4 ).['dd', ['*', 2, 'x'], 'x']
represents ( \frac{d}{dx} (2x) ).
Pattern matching is used to determine if a rule can be applied to an expression. The syntax includes:
- Exact Match: An expression
foo
matches exactlyfoo
. - List Match: An expression
["f", "a", "b"]
matches a list with first element"f"
and subsequent elements"a"
,"b"
. - Pattern Variables:
['?', 'x']
: Matches any expression and binds it tox
.['?c', 'c']
: Matches any constant and binds it toc
.['?v', 'x']
: Matches any variable and binds it tox
.
This pattern-matching system is simple yet powerful, allowing for flexible expression transformations.
After matching, the skeleton is instantiated by replacing pattern variables with the matched expressions:
- Direct Substitution:
["f", "a", "b"]
remains unchanged. - Variable Substitution:
[':', 'x']
is replaced with the expression bound tox
.
For example, if x
is bound to 3
, then [':', 'x']
instantiates to 3
.
The evaluator computes the value of instantiated skeleton expressions using a dictionary of bindings for operations and variables.
from xtk import evaluate
# Define the bindings
bindings = {
'+': lambda x, y: x + y,
'x': 3,
'y': 4
}
# Evaluate the expression
expr = ['+', 'x', 'y']
result = evaluate(expr, bindings)
print(f"evaluate({expr}, {bindings}) => {result}")
# Output: evaluate(['+', 'x', 'y'], {'+': <function>, 'x': 3, 'y': 4}) => 7
In this example:
- The evaluator replaces
'x'
and'y'
with their values3
and4
. - It then applies the
+
operation to compute7
.
Simplification involves recursively applying rewrite rules to an expression until it cannot be further reduced. The process works bottom-up, simplifying sub-expressions before moving up the expression tree.
We can visualize the simplification process as follows:
graph TD
A0(( )) -->|Expression| A
B0(( )) -->|Pattern| A
C0(( )) -->|Bindings| A
D0(( )) -->|Skeleton| B
A[Matcher] -->|Augmented Bindings| B
B[Instantiator] -->|Instantiated Skeleton| C[Evaluator]
C -->|Simplified Expression| A
style A0 fill:none, stroke:none
style B0 fill:none, stroke:none
style C0 fill:none, stroke:none
style D0 fill:none, stroke:none
- Matcher: Matches the pattern against the expression and generates bindings.
- Instantiator: Uses the bindings to instantiate the skeleton.
- Evaluator: Evaluates the instantiated skeleton to produce a simplified expression.
- Recursive Application: The process repeats until no further simplifications can be made.
Simplify ['+', 'x', 0]
using the rule that adding zero simplifies to the original expression.
- Rule:
[['+', ['?', 'x'], 0], [':', 'x']]
-
Match:
- Pattern:
['+', ['?', 'x'], 0]
- Expression:
['+', 'x', 0]
- Binding:
{'x': 'x'}
- Pattern:
-
Instantiate:
- Skeleton:
[':', 'x']
- Instantiated Skeleton:
'x'
- Skeleton:
-
Evaluate:
- Result:
'x'
- Result:
The expression simplifies from ['+', 'x', 0]
to 'x'
.
Simplify ['+', 3, 5]
using the addition operation defined in the evaluator.
-
Bindings:
{'x': 3, 'y': 5, '+': lambda x, y: x + y}
-
Expression:
['+', 'x', 'y']
-
Evaluate:
- Replace
'x'
and'y'
with3
and5
. - Apply
+
operation:3 + 5 = 8
.
- Replace
-
Result:
8
Beyond simplification, xtk
provides tree search algorithms for tasks such as theorem proving, where the goal is to find a sequence of rewrites that transforms an expression into a target form.
- Breadth-First Search (BFS): Explores all nodes at the current depth before moving deeper.
- Depth-First Search (DFS): Explores as far as possible along each branch before backtracking.
- Iterative Deepening DFS (IDDFS): Combines DFS's space efficiency with BFS's completeness.
- Best-First Search: Uses a heuristic to explore more promising branches first.
- A* Search: Combines path cost and heuristic information for optimal pathfinding.
- Monte Carlo Tree Search (MCTS): Uses randomness and statistical sampling to explore the search space.
These algorithms can be used to prove equivalence between expressions, find transformations that satisfy certain conditions, or explore the space of possible rewrites.
-
rewriter.py
: Contains core functions for pattern matching, instantiation, and evaluation.match(pattern, expression, bindings)
: Matches a pattern against an expression using the provided bindings.instantiate(skeleton, bindings)
: Instantiates a skeleton using the bindings.evaluate(expression, bindings)
: Evaluates an expression using the bindings.
-
simplifier.py
:simplifier(rules)
: Returns a function to simplify expressions using the provided rules.
Each search algorithm is implemented in its own module under search/
:
bfs.py
: Breadth-First Searchdfs.py
: Depth-First Searchiddfs.py
: Iterative Deepening DFSbest_first.py
: Best-First Searchastar.py
: A* Searchmcts.py
: Monte Carlo Tree Search
These modules provide functions to perform search operations on expression spaces. Next, we show how to use these modules to explore the space of possible rewrites.
xtk
provides powerful tree search algorithms for tasks such as theorem proving, expression transformation, and exploring possible rewrites. This section demonstrates how to use Depth-First Search (DFS) and Best-First Search within xtk
.
DFS explores as far as possible along each branch before backtracking, making it suitable for finding deep solutions without excessive memory consumption.
Goal: Prove the identity ( \sin^2 x + \cos^2 x = 1 ) by transforming the left-hand side (LHS) into the right-hand side (RHS) using DFS.
-
Define the Rewrite Rules
We'll use fundamental trigonometric identities as rewrite rules.
# trigonometric_rules.py rules = [ # Pythagorean identity [['+', ['^', ['sin', ['?','x']], 2], ['^', ['cos', ['?','x']], 2]], 1], # Expand sine squared [['^', ['sin', ['?','x']], 2], ['-', 1, ['^', ['cos', [':','x']], 2]]], # Expand cosine squared [['^', ['cos', ['?','x']], 2], ['-', 1, ['^', ['sin', [':','x']], 2]]], # Sine double angle [['sin', ['*', 2, ['?','x']]], ['*', 2, ['sin', [':','x']], ['cos', [':','x']]]], # Cosine double angle [['cos', ['*', 2, ['?','x']]], ['-', ['^', ['cos', [':','x']], 2], ['^', ['sin', [':','x']], 2]]], # Other relevant identities... ]
-
Set Up the Initial Expression and Goal Test
initial_expr = ['+', ['^', ['sin', 'x'], 2], ['^', ['cos', 'x'], 2]] # Define a goal test function def goal_test(expr): return expr == 1
-
Implement DFS
from xtk.search.dfs import dfs_search # Perform the search solution = dfs_search(initial_expr, rules, goal_test) if solution: print("Proof found!") for step in solution: print(step) else: print("No proof found.")
-
Run the Code
Execute the script to see if the proof is found. The output should show the steps leading from the LHS to the RHS.
- dfs_search: A function in
xtk
that performs DFS given an initial expression, a set of rules, and a goal test. - Goal Test Function: Determines when the search should stop by checking if the current expression equals
1
.
Best-First Search uses a heuristic to prioritize exploration, making it efficient in finding optimal or near-optimal solutions.
Goal: Simplify the expression ( (x^2 - 1) ) into its factored form ( (x - 1)(x + 1) ).
-
Define the Rewrite Rules
# algebraic_rules.py rules = [ # Difference of squares [['-', ['^', ['?', 'x'], 2], ['^', ['?','y'], 2]], ['*', ['+', [':','x'], [':','y']], ['-', [':','x'], [':','y']]]], # Expand multiplication [['*', ['+', ['?','a'], ['?','b']], ['+', ['?','c'], ['?','d']]], ['+', ['*', [':','a'], [':','c']], ['*', [':','a'], [':','d']], ['*', [':','b'], [':','c']], ['*', [':','b'], [':','d']]]], # Simplify exponents [['^', ['^', ['?','x'], ['?','m']], ['?','n']], ['^', [':','x'], ['*', [':','m', [':','n']]]], # ... other algebraic rules ]
-
Set Up the Initial Expression and Goal Test
initial_expr = ['-', ['^', 'x', 2], 1] target_expr = ['*', ['-', 'x', 1], ['+', 'x', 1]] # Define a goal test function def goal_test(expr): return expr == target_expr
-
Define a Heuristic Function
The heuristic estimates how "close" an expression is to the target expression.
def heuristic(expr): # Simple heuristic: count matching elements with the target expression def flatten(e): return [e] if not isinstance(e, list) else sum(map(flatten, e), []) expr_elements = set(flatten(expr)) target_elements = set(flatten(target_expr)) # Higher score if more elements match return len(expr_elements & target_elements)
-
Implement Best-First Search
from xtk.search.best_first import best_first_search # Perform the search solution = best_first_search(initial_expr, rules, goal_test, heuristic) if solution: print("Simplification found!") for step in solution: print(step) else: print("No simplification found.")
-
Run the Code
Execute the script to find the simplification steps from the initial expression to the factored form.
- best_first_search: A function in
xtk
that performs Best-First Search using the provided heuristic. - Heuristic Function: Guides the search by estimating the similarity between the current expression and the target expression.
- Rule Completeness: Ensure that the set of rules covers the transformations needed to reach the goal.
- Heuristic Design: The effectiveness of Best-First Search heavily relies on the heuristic function's ability to estimate closeness accurately.
- Performance Considerations: While DFS might explore irrelevant branches deeply, Best-First Search can be more efficient if the heuristic is well-designed.
- Combining Searches: Sometimes, combining different search strategies can yield better results.
- Custom Goal Tests: For more complex goals, define custom goal test functions that capture the desired conditions.
- Debugging: If the search isn't finding a solution, check the rules and heuristic for completeness and correctness.
- Theorem Proving: Use tree searches to find proofs for mathematical theorems by transforming premises into conclusions.
- Expression Optimization: Find more efficient or simplified forms of expressions in computational settings.
- Automated Reasoning: Implement logic-based systems that require exploring possible inferences.
The rules/
directory contains predefined rules for various mathematical domains:
deriv_rules.py
: Rules for symbolic differentiation.trig_rules.py
: Trigonometric identities.limit_rules.py
: Rules for computing limits.random_var_rules.py
: Manipulations of random variables.integral_rules.py
: Rules for integration.calculus_rules.py
: General calculus rules.algebra_rules.py
: Algebraic manipulation rules.
Additional rules cover domains such as differential equations, logic, set theory, combinatorics, graph theory, group theory, ring theory, field theory, vector spaces, linear algebra, topology, measure theory, probability theory, and statistics.
The notebooks/
directory contains Jupyter notebooks demonstrating the functionality of xtk
, including examples of simplification, evaluation, and theorem proving.
Designing a domain-specific language (DSL) enables expressing complex ideas concisely and readably. In xtk
, the DSL allows users to define transformation rules effectively, leveraging the power of symbolic computation and rule-based systems.
Our rules-based system is Turing-complete, capable of expressing any computable function. While powerful, such systems are better suited for symbolic computation, theorem proving, and other symbolic tasks rather than general-purpose programming.