Activity 3: Memory organization without caches for data storage

Leader: Julio Sahuquillo; Researchers: Noel Tomás, Salvador Petit, Pedro López

1. Brief Description of the Goals

Computer architects have used cache memories for several decades to reduce the average memory access time in the face of large main memory latencies. However, current systems and applications demand ever larger caches, which implies longer cache access times. Moreover, as main memory capacity increases, the working sets of applications also grow, which increases the pressure on the caches and produces a larger number of capacity misses.

In this research we investigated the use of current compiler strategies that analyze the utilization frequency and lifetime of program variables in order to map variables not only to registers or main memory, but to a hierarchy of memories featuring very different access times. That is, we looked for a set of memory structures implementing a flat address space: in other words, memory levels like the cache levels of current processors, except that no cache controllers and no tags are required. In this way, the memory address determines whether an access is performed to one memory structure (e.g., L1) or to another. The address ranges covered by the different structures do not overlap, thus providing a truly non-uniform memory access (NUMA) architecture, even for uniprocessors.
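The address-based selection described above can be illustrated with a minimal sketch. The region names, base addresses, sizes, and latencies below are hypothetical values chosen for illustration; they are not taken from the project. The sketch only shows how non-overlapping address ranges let a plain range lookup replace tag comparison.

```python
from bisect import bisect_right
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    base: int     # first address covered by the region
    size: int     # region size in bytes
    latency: int  # access latency in cycles (illustrative value)


# Hypothetical layout: contiguous, non-overlapping ranges, fastest first.
REGIONS = [
    Region("L1", 0x0000_0000, 32 * 1024, 2),
    Region("L2", 0x0000_8000, 512 * 1024, 12),
    Region("DRAM", 0x0008_8000, 256 * 1024 * 1024, 200),
]
_BASES = [r.base for r in REGIONS]


def decode(address: int) -> Region:
    """Return the memory structure that serves the given address.

    No tags are consulted: the address alone selects the structure.
    """
    idx = bisect_right(_BASES, address) - 1
    if idx < 0:
        raise ValueError(f"address {address:#x} is below every region")
    region = REGIONS[idx]
    if address >= region.base + region.size:
        raise ValueError(f"address {address:#x} is not mapped")
    return region
```

With this layout, `decode(0x7FFF)` selects the hypothetical L1 region and `decode(0x8000)` selects L2, since each address belongs to exactly one structure.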

Using this simple memory organization, the compiler can assign memory addresses to variables based on their utilization frequency and lifetimes, in such a way that the more frequently used variables are mapped to faster regions of memory (of course, the most frequently used variables should still be mapped to registers). Also, as with registers, different variables can be mapped to the same memory location in different phases of program execution, simply by inserting suitable code that moves variables from fast memory to slower memory locations. The expected benefits were worth the research effort.
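As a rough illustration of frequency-based placement, the sketch below greedily assigns variables to memory tiers by access density (accesses per byte). It is an assumed heuristic, not the allocation strategy used in the project, and it ignores lifetimes, so it does not model the sharing of one location by several variables. The `Variable` type, the tier names, and the capacities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str
    size: int      # bytes
    accesses: int  # profiled or estimated access count


def place_variables(variables, tiers):
    """Greedily map variables to tiers.

    `tiers` is a list of (name, capacity_in_bytes) ordered from fastest
    to slowest. Variables with the most accesses per byte are placed
    first, each in the fastest tier that still has room.
    """
    free = [capacity for _, capacity in tiers]
    placement = {}
    by_density = sorted(variables, key=lambda v: v.accesses / v.size, reverse=True)
    for var in by_density:
        for i, (tier_name, _) in enumerate(tiers):
            if var.size <= free[i]:
                free[i] -= var.size
                placement[var.name] = tier_name
                break
        else:
            raise ValueError(f"no tier can hold variable {var.name!r}")
    return placement


# Illustrative input only.
tiers = [("fast", 16), ("medium", 64), ("slow", 1024)]
variables = [
    Variable("counter", 4, 1000),
    Variable("buffer", 64, 640),
    Variable("table", 512, 512),
    Variable("flag", 4, 100),
]
placement = place_variables(variables, tiers)
```

In this example the two small, heavily used variables end up in the fast tier, the buffer in the medium tier, and the large table in the slow tier. A lifetime-aware allocator could additionally reuse fast locations across program phases, as the text describes for registers.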

2. Scientific and Technical Developed Activities

In this task we characterized the memory requirements of different applications. To this end, we used the SPEC2000 benchmark suite and ran experiments on a detailed cycle-by-cycle simulator for superscalar processors. The study was performed on a large number of detailed traces provided by the simulator. We analyzed the results and found that an important subset of benchmarks could improve their performance when using a flat address space instead of a typical cache hierarchy. However, mainly because in some applications the data exhibit only spatial locality, these memory structures did not work well in the devised flat address space. Thus, further refinements of the proposal became necessary. Unfortunately, we had to abandon this research line for two main reasons:

i. the person hired to perform this task left the research group after finding another job, and

ii. at that time, we did not find anyone with enough compiler knowledge to carry out this task.