Prealloc for jectcoeffs! (#137)

* Fixes in Project.toml, and add Manifest.toml to .gitignore * Add BookKeeping mutable structure to parse eqs * Rewrite taylorize macro internal functions to produce two parsed functions the optimized jetcoeffs! and a function to allocate the memory resources once (`_allocate_jetcoeffs!`) * Fix threads-macro calls and generation of preamble Rename _defs_preamble! to _defs_allocs! * Adapt main code to new output of taylorize macro * Fix common interface with DiffEqs * Fix tests * Add Logging to Project.toml * Update taylorize-macro docs * Test emitted warnings during the integration and fix tests for common interface * Fixes for DynamicalODEPoblem's and uncomment tests * Fix methods returning local Taylor1 polynomials * Remove warnings from tests * Fix conflicts in docs * Fix common interface * Fixes after review * Bump minor version * Fix broken test in Julia v1.0, and test Julia v1.6
PerezHz · Apr 19, 2022 · efdeb5f · efdeb5f · lbenet · Apr 19, 2022
1 parent 5cc625f
commit efdeb5f
Show file tree

Hide file tree

Showing 19 changed files with 1,619 additions and 985 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -17,7 +17,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        version: ['1.0', '1', 'nightly']
+        version: ['1.0', '1.6', '1', 'nightly']
         os:
           - ubuntu-latest
           - macos-latest

diff --git a/.gitignore b/.gitignore
@@ -1,4 +1,5 @@
 .DS_Store
+Manifest.toml
 examples/.ipynb_checkpoints
 examples/JuliaCon2017/.ipynb_checkpoints/
 docs/build/

diff --git a/Project.toml b/Project.toml
@@ -1,14 +1,15 @@
 name = "TaylorIntegration"
 uuid = "92b13dbe-c966-51a2-8445-caca9f8a7d42"
 repo = "https://github.com/PerezHz/TaylorIntegration.jl.git"
-version = "0.8.11"
+version = "0.9.0"
 
 [deps]
 DiffEqBase = "2b5f629d-d688-5b77-993f-72d75c75574e"
 Elliptic = "b305315f-e792-5b7a-8f41-49f472929428"
 Espresso = "6912e4f1-e036-58b0-9138-08d1e6358ea9"
 InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
+Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
 Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
 OrdinaryDiffEq = "1dea7af3-3e70-54e6-95c3-0bf5283fa5ed"
 RecursiveArrayTools = "731186ca-8d62-57ce-b412-fbd966d074cd"
@@ -34,11 +35,11 @@ DiffEqBase = "2b5f629d-d688-5b77-993f-72d75c75574e"
 Elliptic = "b305315f-e792-5b7a-8f41-49f472929428"
 InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
+Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
 Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
 StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
 TaylorSeries = "6aa5eb33-94cf-58f4-a9d0-e4b2c4fc25ea"
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
 
 [targets]
-test = ["TaylorSeries", "LinearAlgebra", "Test", "Elliptic", "InteractiveUtils",
-    "Pkg", "DiffEqBase", "StaticArrays"]
+test = ["DiffEqBase", "Elliptic", "InteractiveUtils", "LinearAlgebra", "Logging", "Pkg", "StaticArrays", "TaylorSeries", "Test"]
diff --git a/docs/src/taylorize.md b/docs/src/taylorize.md
@@ -16,17 +16,20 @@ the function where the ODEs of the problem are defined, in
 order to compute the recurrence relations that are used to construct
 the Taylor expansion of the solution. This is done for each order of
 the series in [`TaylorIntegration.jetcoeffs!`](@ref). These computations are
-not optimized: they waste memory due to allocations of some temporary
+not optimized: they waste memory due to repeated allocations of some temporary
 arrays, and perform some operations whose result has been previously
 computed.
 
 Here we describe one way to optimize this: The idea is to replace the
-default method of [`TaylorIntegration.jetcoeffs!`](@ref) by another (with the
-same name) which is called by dispatch, that in principle performs better.
-The new method is constructed specifically for the actual function
-defining the equations of motion by parsing its expression; the new
-function performs in principle *exactly* the same operations, but avoids
-the extra allocations and the repetition of some operations.
+default method of [`TaylorIntegration.jetcoeffs!`](@ref) by another
+method (same function name) which is called by dispatch, and that in principle
+performs better.
+The new method is constructed specifically for the specific function
+defining the equations of motion by parsing its expression. This new
+method performs in principle *exactly* the same operations, but avoids
+the extra allocations; to do the latter, the macro also creates an *internal*
+function `TaylorIntegration._allocate_jetcoeffs!`, which allocates all temporary
+`Taylor1` objects as well as the declared `Array{Taylor1,1}`s.
 
 
 ## An example
@@ -46,29 +49,38 @@ end
 
 # Initial time (t0), final time (tf) and initial condition (q0)
 t0 = 0.0
-tf = 100.0
+tf = 10000.0
 q0 = [1.3, 0.0]
 
 # The actual integration
 t1, x1 = taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500); # warm-up run
 e1 = @elapsed taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500);
-e1
+all1 = @allocated taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500);
+e1, all1
 ```
 
 We note that the initial number of methods defined for
 `TaylorIntegration.jetcoeffs!` is 2.
 ```@example taylorize
 length(methods(TaylorIntegration.jetcoeffs!)) == 2 # initial value
 ```
-Using `@taylorize` will increase this number by creating a new method.
+Similarly, the number of methods for `TaylorIntegration._allocate_jetcoeffs!` is
+also 2.
+```@example taylorize
+length(methods(TaylorIntegration._allocate_jetcoeffs!)) == 2 # initial value
+```
+Using `@taylorize` will increase this number by creating a new method for these
+two functions.
 
 The macro [`@taylorize`](@ref) is intended to be used in front of the function
 that implements the equations of motion. The macro does the following: it
-first parses the actual function as it is, so the integration can be computed
+first parses the function as it is, so the integration can be computed
 using [`taylorinteg`](@ref) as above, by explicitly using the keyword
-argument `parse_eqs=false`. It then creates and evaluates a new method of
+argument `parse_eqs=false`; this also declares the function of the ODEs, whose name
+is used for parsing. It then creates and evaluates a new method of
 [`TaylorIntegration.jetcoeffs!`](@ref), which is the specialized method
-(through `Val`) on the specific function passed to the macro.
+(through `Val`) on the specific function passed to the macro as well as a specialized
+`TaylorIntegration._allocate_jetcoeffs!`.
 
 ```@example taylorize
 @taylorize function pendulum!(dx, x, p, t)
@@ -85,27 +97,32 @@ println(methods(TaylorIntegration.jetcoeffs!))
 ```
 
 We see that there is only one method of `pendulum!`, and
-there is a new method of `TaylorIntegration.jetcoeffs!`, whose signature appears
-in this documentation as `Val{Main.ex-taylorize.pendulum!}`; it is
-an specialized version for the function `pendulum!` (with some extra information
+there is a *new* method (three in total) of `TaylorIntegration.jetcoeffs!`,
+whose signature appears
+in this documentation as `Val{Main.ex-taylorize.pendulum!}`. It is
+a specialized version for the function `pendulum!` (with some extra information
 about the module where the function was created). This method
 is selected internally if it exists (default), exploiting dispatch, when
 calling [`taylorinteg`](@ref) or [`lyap_taylorinteg`](@ref); to integrate
-using the hard-coded method of [`TaylorIntegration.jetcoeffs!`](@ref) of the
+using the hard-coded (default) method of [`TaylorIntegration.jetcoeffs!`](@ref) of the
 integration above, the keyword argument `parse_eqs` has to be set to `false`.
+Similarly, one can check that there exists a new method of
+`TaylorIntegration._allocate_jetcoeffs!`
 
 Now we carry out the integration using the specialized method; note that we
-use the same instruction as above.
+use the same instruction as above; the default value for the keyword argument `parse_eqs`
+is `true`, so we may omit it.
 
 ```@example taylorize
 t2, x2 = taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500); # warm-up run
 e2 = @elapsed taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500);
-e2
+all2 = @allocated taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500);
+e2, all2
 ```
 
-We note the difference in the performance:
+We note the difference in the performance and allocations:
 ```@example taylorize
-e1/e2
+e1/e2, all1/all2
 ```
 
 We can check that both integrations yield the same results.
@@ -121,8 +138,8 @@ setting it to `false` causes the standard method to be used.
 taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500, parse_eqs=false); # warm-up run
 
 e3 = @elapsed taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500, parse_eqs=false);
-
-e1/e3
+all3 = @allocated taylorinteg(pendulum!, q0, t0, tf, 25, 1e-20, maxsteps=1500, parse_eqs=false);
+e1/e3, all1/all3
 ```
 
 We now illustrate the possibility of exploiting the macro
@@ -144,13 +161,13 @@ The speed-up obtained comes from the design of the new (specialized) method of
 allocations and some repeated computations. This is achieved by knowing the
 specific AST of the function of the ODEs integrated, which is walked
 through and *translated* into the actual implementation, where some
-required auxiliary arrays are created and the low-level functions defined in
+required auxiliary arrays are created, and the low-level functions defined in
 `TaylorSeries.jl` are used.
 For this, we heavily rely on [`Espresso.jl`](https://github.com/dfdx/Espresso.jl) and
 some metaprogramming; we thank Andrei Zhabinski for his help and comments.
 
-The new `jetcoeffs!` method can be inspected by constructing the expression
-corresponding to the function, and using
+The new `TaylorIntegration.jetcoeffs!` and `TaylorIntegration._allocate_jetcoeffs!` method can be inspected by
+constructing the expression corresponding to the function, and using
 [`TaylorIntegration._make_parsed_jetcoeffs`](@ref):
 
 ```@example taylorize
@@ -160,12 +177,13 @@ ex = :(function pendulum!(dx, x, p, t)
     return dx
 end)
 
-new_ex = TaylorIntegration._make_parsed_jetcoeffs(ex)
+new_ex1, new_ex2 = TaylorIntegration._make_parsed_jetcoeffs(ex)
 ```
 
-This function has a similar structure as the hard-coded method of
+The first function has a similar structure as the hard-coded method of
 `TaylorIntegration.jetcoeffs!`, but uses low-level functions in `TaylorSeries`
-(e.g., `sincos!` above) and explicitly allocates the necessary temporary arrays.
+(e.g., `sincos!` above); all temporary `Taylor1` objects as well as declared
+arrays are allocated once by `TaylorIntegration._allocate_jetcoeffs!`.
 More complex functions quickly become very difficult to read. Note that,
 if necessary, one can further optimize `new_ex` manually.
 
@@ -174,10 +192,10 @@ if necessary, one can further optimize `new_ex` manually.
 
 The construction of the internal function obtained by using
 [`@taylorize`](@ref) is somewhat complicated and limited. Here we
-list some limitations and advice.
+list some limitations and provide some advice.
 
 - It is best to have expressions which involve two arguments at most, which
-  requires parentheses be appropriately used: For example, `res = a+b+c` should
+  requires using parentheses appropriately: For example, `res = a+b+c` should
   be written as `res = (a+b)+c`.
 
 - Updating operators such as `+=`, `*=`, etc., are not supported. For
@@ -200,13 +218,13 @@ list some limitations and advice.
   `DifferentialEquations` may not be able to exploit it.
 
 - `if-else` blocks are recognized in their long form, but short-circuit
-  conditional operators (`&&` and `||`) are not.  When comparands are subject to
+  conditional operators (`&&` and `||`) are not.  When comparing to a
   Taylor expansion, use operators such as `iszero` for `if-else` tests
   rather than comparing against numeric literals.
 
 - Input and output lengths should be determined at the time of `@taylorize`
   application, not at runtime.  Do not use the length of the input as an
-  implicit indicator of whether or not to write all elements of the output.  If
+  implicit indicator of whether to write all elements of the output.  If
   conditional output of auxiliary equations is desired, use explicit methods,
   such as through parameters or by setting auxiliary `t0` vector elements
   to zero, and assigning unwanted auxiliary outputs zero.
@@ -216,16 +234,16 @@ list some limitations and advice.
   heuristics used, especially for vectors, may not work for all cases.
 
 - Use `local` for internal parameters (simple constant values); this improves
-  performance. Do not use it if the variable is Taylor expanded.
+  performance. Do not use it if the variable is Taylor expanded during the integration.
 
 - `@taylorize` supports multi-threading via `Threads.@threads`. **WARNING**:
   this feature is experimental. Since thread-safety depends on the definition
   of each ODE, we cannot guarantee the resulting code to be thread-safe in
   advance. The user should check the resulting code to ensure that it is indeed
   thread-safe. For more information about multi-threading, the reader is
-  referred to the [Julia documentation](https://docs.julialang.org/en/v1/manual/parallel-computing#man-multithreading-1).
+  referred to the [Julia documentation](https://docs.julialang.org/en/v1/manual/multi-threading/#man-multithreading).
 
 It is recommended to skim `test/taylorize.jl`, which implements different
-cases.
+cases and highlights cases where the macro doesn't work and how to solve the problem.
 
 Please report any problems you may encounter.