The term newtonian motion denotes motion at low speed compared to the speed of light in a flat space/time and the change in the movement of an particles is describes by the equation

$$ \frac{dp}{dt} = F $$ | (1) |

where \(F\) denotes the force acting upon the particle with the impuls \(p\).

In the order to use the above equation to calculate the movement of a given particle it has to be integrated. The integration of newton's quation can be done rather straight forward in three steps as shown below. First all forces acting on a particle are summed up. THe change of motion is calculated and in a final step the actual position of the particle is calculated. After this has been done for all particles interacting with each other, the new impuls and locations are the initial values for a new caculation step.

$$ F_{i}=\sum_{j=1}^{N}F(x_{i},x_{j}) $$ | (2) |

$$ v_{i}(t+\Delta t)\simeq v_{i}(t)+\frac{1}{m_{i}}F_{i}\triangle t $$ | (3) |

$$ x_{i}(t+\Delta t)\simeq x_{i}(t)+v_{i}(t)\triangle t+\frac{1}{2m_{i}}F_{i}(x_{t})\triangle t^{2} $$ | (4) |

The above algorithm is rather easy for implement in any modern programing language. Unfortunately the comptuation effort increases like \(N^2\) with the number of interacting particles. One way of speeding the algorithm up is to utilitze multiple CPU's in one computer.

The algorithm above is basicaly an iteration over all particles \(i\) by applying some function \(I\) to the inititial state vector \((x_{1}....x_{N})\).

- Step 1
$$ y_{i} = I( (x_{1}....x_{N})) $$ (5) - Step 2
$$ x_{i} = y_{i} $$ (6)

After step 2 the calculation starts again at step 1.

If the function \(I\) cant be speeded up, the process can be speeded up if the index range in split up into partitions and the work for one partition gets assigned to one CPU.

Using the collaboration frame work which has been developed with a more general scope, i have implemented a simple program to demonstrate the speedup for different number of processor and particle number. The result is shown in the diagram below.

Fig. 1 - Execution Time vs number of particles

The result is shown in the diagram above. It shows the execution time depending on the particle number for different number of processors. The diagram shows clearly that using multiple processor has its tradeoff due to synchronisation.

Step 2 of the algorithm above requires that the results of all paritions are available before the step 1 can be executed again. To ensure this synchronization point between all CPU have to be used. Consequently the 8 core configuration is only faster then 2 core configuraton if the number of particles exceeds about 60 particles.