Grid ALU Processor

  • Start date: 01.10.2007
  • End date: 31.01.2012
  • Funded by: DFG (Deutsche Forschungsgemeinschaft)
  • Local head of project: Prof. Dr. Theo Ungerer
  • Local scientists: Sascha Uhrig


Few current architectural approaches propose new ways to raise the performance of conventional sequential instruction streams in the billion-transistor era. Many application programs could profit from processors that speed up the execution of sequential applications beyond the performance of current superscalar processors. The Grid ALU Processor (GAP) is a runtime-reconfigurable processor designed to accelerate a conventional sequential instruction stream without the need for recompilation. It comprises a superscalar processor front-end, a novel configuration unit, and an array of reconfigurable functional units (FUs) that is fully integrated into the pipeline. The configuration unit maps data-dependent and independent instructions simultaneously into the array of FUs at run-time. This dynamic mapping allows the acceleration of sequential control-flow-oriented applications. Moreover, loops are accelerated by executing their iterations inside the array of FUs.
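
The idea of placing dependent and independent instructions into a two-dimensional array can be illustrated with a small sketch. It is not the GAP's actual configuration algorithm: the `Instr` tuple, the register names, and the rule "place each instruction one row below its latest producer" are assumptions made only for illustration, and the sketch tracks only true (read-after-write) dependences.

```python
from collections import namedtuple

# Hypothetical instruction format: one destination register, a tuple of sources.
Instr = namedtuple("Instr", "dest srcs")

def place_rows(instrs):
    """Assign each instruction to the earliest row below all of its producers."""
    last_writer_row = {}  # register -> row of the most recent instruction writing it
    rows = []
    for ins in instrs:
        row = 0
        for src in ins.srcs:
            if src in last_writer_row:
                # A consumer must sit at least one row below its producer.
                row = max(row, last_writer_row[src] + 1)
        rows.append(row)
        last_writer_row[ins.dest] = row
    return rows

program = [
    Instr("r1", ("r0",)),       # independent of the next instruction
    Instr("r2", ("r0",)),       # can share row 0 with the first one
    Instr("r3", ("r1", "r2")),  # depends on both, so it goes one row lower
    Instr("r4", ("r3",)),       # depends on r3, so it goes lower again
]
rows = place_rows(program)
print(rows)
```

Independent instructions end up in the same row and could execute side by side, while a chain of dependent instructions occupies successive rows.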

Beyond the hardware design, a software tool is being developed for the analysis and optimization of binaries to be executed on the GAP. For analysis, it helps programmers understand a whole program’s structure by isolating the functions and graphically representing their control flow graphs. The tool also performs analyses at program or function level. It also serves as a platform for whole-program post-link optimizations, which can use information gained from executing the program on the simulator, e.g. how often blocks are executed. In this way, programs are to be modified to run faster on the GAP by better exploiting its special features.
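
As a rough illustration of how a control flow graph can be recovered from a linear instruction sequence, the following sketch uses the textbook "leader" method. It does not describe the project's tool: the `(op, target)` instruction format and the opcode names `op`, `br`, `jmp`, and `ret` are hypothetical, and the input is assumed to be non-empty with valid branch targets.

```python
def basic_blocks(instrs):
    """Split a linear instruction list into basic blocks and CFG edges.

    instrs: list of (op, target) pairs; op is "op", "br" (conditional branch),
    "jmp" (unconditional jump) or "ret"; target is an instruction index or None.
    Returns (blocks, edges): blocks are half-open (start, end) index ranges,
    edges are (from_block, to_block) pairs.
    """
    n = len(instrs)
    leaders = {0}  # the first instruction always starts a block
    for i, (op, target) in enumerate(instrs):
        if op in ("jmp", "br"):
            leaders.add(target)      # a branch target starts a block
        if op in ("jmp", "br", "ret") and i + 1 < n:
            leaders.add(i + 1)       # so does the instruction after a branch
    starts = sorted(leaders)
    blocks = list(zip(starts, starts[1:] + [n]))
    block_of = {start: b for b, (start, _) in enumerate(blocks)}

    edges = set()
    for b, (start, end) in enumerate(blocks):
        op, target = instrs[end - 1]  # the last instruction decides the successors
        if op in ("jmp", "br"):
            edges.add((b, block_of[target]))
        if op not in ("jmp", "ret") and end < n:
            edges.add((b, block_of[end]))  # fall-through edge
    return blocks, sorted(edges)

code = [
    ("op", None),   # 0
    ("br", 4),      # 1: conditional branch to 4
    ("op", None),   # 2
    ("jmp", 5),     # 3: unconditional jump to 5
    ("op", None),   # 4
    ("ret", None),  # 5
]
blocks, edges = basic_blocks(code)
print(blocks)
print(edges)
```

Execution counts gathered from a simulator run could then be attached to the blocks to guide optimization decisions.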

With the tool FADSE (Framework for Automatic Design Space Exploration), near-optimal configurations can be found either for the hardware parameters of the GAP alone or for the parameters of both hardware and code optimizations. FADSE uses genetic algorithms, and thanks to several optimizations in the exploration process the exploration time is short relative to the huge size of the design space. Because of its speed, FADSE can also be used for adaptive code optimizations. In this way, the performance of the GAP can be increased, or the hardware resources needed to reach a given performance level can be reduced considerably.
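
The trade-off between performance and hardware resources can be made concrete with a Pareto-dominance filter, the selection criterion at the core of multi-objective exploration. The sketch below is not FADSE code: the function names and the candidate values are invented for illustration, and both objectives (for example execution time and hardware cost) are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (execution time, hardware cost) pairs, illustrative values only.
candidates = [(1.0, 9.0), (1.5, 5.0), (2.0, 6.0), (3.0, 2.0), (3.0, 4.0)]
front = pareto_front(candidates)
print(front)
```

A genetic algorithm would apply such a filter repeatedly, generating new candidate configurations from the surviving ones instead of enumerating the whole design space.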





  • Finding near-perfect parameters for hardware and code optimizations with automatic multi-objective design space explorations 
    Ralf Jahr, Horia Calborean, Lucian Vintan, Theo Ungerer 
    Concurrency and Computation: Practice and Experience



  • Advanced architecture optimisation and performance analysis of a reconfigurable grid ALU processor 
    Sascha Uhrig, Ralf Jahr, Theo Ungerer 
    High-performance computing system architectures: design and performance, Volume 6, Issue 5, pages 334-341
  • A Comparison of Multi-objective Algorithms for the Automatic Design Space Exploration of a Superscalar System 
    Horia Calborean, Ralf Jahr, Theo Ungerer, Lucian Vintan 
    Advances in Intelligent Systems and Computing, volume 187, pages 489-502




  • Automatic Multi-Objective Optimization of Parameters for Hardware and Code Optimizations 
    Ralf Jahr, Theo Ungerer, Horia Calborean, Lucian Vintan 
    Proceedings of the 2011 International Conference on High Performance Computing & Simulation (HPCS 2011), pages 308-316
  • Optimizing a Superscalar System using Multi-objective Design Space Exploration 
    Horia Calborean, Ralf Jahr, Theo Ungerer, Lucian Vintan 
    Proceedings of the 18th International Conference on Control Systems and Computer Science (CSCS), pages 339-346
  • Optimized Replacement in the Configuration Layers of the Grid Alu Processor 
    Ralf Jahr, Basher Shehan, Sascha Uhrig, Theo Ungerer 
    Proceedings of the Second International Workshop on New Frontiers in High-Performance and Hardware-aware Computing (HipHaC'11), pages 9-16




  • Optimization and Evaluation of the Reconfigurable Grid Alu Processor 
    Basher Shehan, Ralf Jahr, Sascha Uhrig, Theo Ungerer 
    The 15th CSI Symposium on Computer Architecture & Digital Systems (CADS), Tehran
  • Reconfigurable Grid Alu Processor: Optimization and Design Space Exploration 
    Basher Shehan, Ralf Jahr, Sascha Uhrig, Theo Ungerer 
    13th EUROMICRO Conference on Digital System Design (DSD 2010), Lille, France, 1-3 September 2010
  • Static Speculation as Post-Link Optimization for the Grid Alu Processor 
    Ralf Jahr, Basher Shehan, Sascha Uhrig, Theo Ungerer 
    4th Workshop on Highly Parallel Processing on a Chip (HPPC 2010)
  • Enhancing the Grid Alu Processor for a Better Exploitation of the Functional Units 
    Basher Shehan, Ralf Jahr, Sascha Uhrig, Theo Ungerer 
    17th International Conference Mixed Design of Integrated Circuits and Systems
  • The Two-dimensional Superscalar GAP Processor Architecture 
    Sascha Uhrig, Basher Shehan, Ralf Jahr, Theo Ungerer 
    International Journal on Advances in Systems and Measurements, ISSN 1942-261X, vol. 3, no. 1 & 2, 2010, pp. 71-81




  • A Two-dimensional Superscalar Processor Architecture 
    Sascha Uhrig, Basher Shehan, Ralf Jahr, Theo Ungerer 
    The First International Conference on Future Computational Technologies and Applications (FUTURE COMPUTING 2009)
  • The Grid ALU Processor 
    Ralf Jahr, Basher Shehan, Sascha Uhrig, Theo Ungerer 
    ACACES 2008 Poster Abstracts