Home  
 

Main Menu
Home
About us
Project Description
Quantitative Results
Research Lines
Research Results
Impact on Society
Press room
Contact us
News
Secure Login
Events Calendar
« < October 2017 > »
M T W T F S S
25 26 27 28 29 30 1
2 3 4 5 6 7 8
9 10 11 12 13 14 15
16 17 18 19 20 21 22
23 24 25 26 27 28 29
30 31 1 2 3 4 5
Login

subTask 1.2 (UPV): Development of deadlock-free routing strategies

Leader: José Flich; Researchers: Andrés Mejía, Crispín Gomez, Francisco Gilabert, Jose María Martínez, M. Engracia Gómez, Crispín Gómez

1. Brief Description of the Goals

Routing is a challenging task in multiprocessor systems and clusters. The main challenge comes when failures arise in the system, thus leading to irregular networks. In such scenario the straightforward solutions are not suitable and more complex solutions are required. In addition, current routing solutions often require more resources (like virtual channels) or impose severe restrictions that may harm performance. In this task, different alternative solutions have been explored and novel proposals have been presented in different venues (international peer-reviewed conferences and journals).

2. Scientific and Technical Developed Activities

Three different paths have been followed in Task 1.2.

In a first path (topology-agnostic routing), a topology-agnostic routing algorithm has been designed. The algorithm, referred to as Segment-based Routing (SR) is flexible enough to define multiple routing algorithms all of them agnostic to the topology, thus being suitable for an environment when faults lead to changes in the original topology. The SR routing does not require virtual channels, thus having a low implementation cost. The algorithm, although designed for high-performance networks, have been also applied to Ethernet networks. Such work has been done in collaboration with Simula Labs in Oslo, Norway.

SR flexibility may harm performance. To mitigate this effect, different procedures to achieve routing algorithms out of SR have been explored. As a major result in this activity, a paper has been published by Mejia et al. in ICPP 2008 conference. In such paper the different methodologies to compute segments in SR are described and evaluated.

In addition, SR routing has been combined with different task mapping strategies. For each SR combination, network performance has been obtained taking into account the mapping of tasks to processors. As a major result in this activity, a new routing method (CART) has been proposed and a paper has been published by Tornero et al. in Euromicro DSD 2008 Conference. The paper described different suitable combinations of SR routing and mapping policy. An extension of the work has been published by Tornero et al. in International Journal on Parallel Programming, vol 39, issue 3, 2011.

As a major outcome of the research related to SR (three previous paragraphs), a PhD has been defended by Andres Mejía, including all the works related with SR (topology-agnostic routing). Several international conferences have been attended linked to these activities.

To conclude the research performed in topology-agnostic routing, a complete survey on multiple routing algorithms for irregular topologies have been performed, analyzing for each case on multiple topologies the maximum performance achieved. Also, a taxonomy of routing algorithms have been developed. All this has been included in a journal paper published by Flich et al in Transactions on Parallel and Distributed Systems, vol 23, issue 3, 2012.

In a second path, adaptive routing algorithms have been explored. One of the usual requirements of adaptive routings is the need for virtual channels. However, a new approach without virtual channel requirements has been explored, ending up in a technical report describing the new algorithm.

In a third path, we have analyzed how to route packets efficiently in  fat-tree topologies, with a low hardware cost and routing time, while obtaining maximum throughput and minimum latency. For doing this, we have focused on selecting carefully the upwards paths in the upwards adaptive phase which is the one that decides the downwards path in fat-trees for each packet.

With the mechanism designed to select the upwards phases we have obtained a new deterministic routing algorithm published by Gómez et al. in the International Parallel and Distributed Processing Symposium 2007, that achieves similar performance levels that the ones obtained by more sophisticated adaptive routing algorithms while being simpler and providing in-order delivery of packets. Thus, savings in implementation are achieved without compromising performance levels, thanks to balancing network traffic and reducing the HoL blocking effect as much as possible since when using such algorithm, the traffic going down the tree is totally classified by destination, in the sense that each down link is used only for traffic addressed to a single destination, and a separate path exists (in the downstream path) for each destination. This deterministic algorithm is being used in Magnum switch; an InfiniBand-based system made of 3456 ports, built by Sun Microsystems, and has been protected by the patent “Method and switch for routing data packets in interconnection networks".

Taking into account the packet classification in the downwards phase of the fat-tree, we have proposed a simplified version of the fat-tree topology. This has been achieved by replacing the down path by a single link connecting each destination with a switch located at the uppest stage. 

This means that link cost is halved and switch cost is also reduced (switches have a lower radix) but performance is maintain in the same levels. The final outcome called RUFT has been published by Gomez et al. in Computer Architecture Letters, vol 7, no 2, 2008 and by Gómez et al. in ICPADS 2008 and Ludovici et al. in DATE 2009. 


Publications: [Mejia07] [Mejia08] [Gomez07] [Gomez08] [Gomez08b] [Tornero09] [Tornero08] [Gilabert09]

Projects funded by Public Calls:   TIN2006-15516-C04-01, TIN2009-14475-C04-01STREP Num. 248972CA501 COMCAS labelTSI-020400-2009-64PRI-PIBIN-2011-0989 , PCC-08-0078-9856 

External collaborations Academia: Tor Skeie and Sven-Arne Reinemo

External collaborations Industry: Sun Microsystems (Norway)

Company Agreements: Sun Microsystems (US)

PhD dissertations: Andrés Mejía Gómez