Task 2 (UM): Distributed software enhancement in PC clusters
1. Brief Introduction

This task is made up of four subtasks that we describe below.

Subtask 2.1. Design and implementation of efficient mechanisms for data transfer and storage in distributed systems.

Secondary storage is usually a performance bottleneck in modern computer systems, from desktops to large distributed systems. This performance problem can be tackled from two points of view: locally, by developing new mechanisms to improve the throughput of the hard disks attached to each node, and globally, by implementing new scalable techniques and algorithms for distributed storage management. In this project, we have proposed different approaches to improve the performance of the secondary storage both locally and globally.

Subtask 2.2. Building a middleware infrastructure for the development of distributed applications over high-performance networks.

The development of the CORBA-LC component model was motivated by some deficiencies in existing tools for the development of distributed applications. First, component-based development (CBD), a well-established and convenient paradigm for application development, did not have a distributed counterpart that also allowed efficient and transparent use of network resources. Existing distributed component models, such as EJB or CCM, are oriented towards enterprise applications and implement a server-oriented model rather than a peer model that would allow the homogeneous integration of the set of network resources. At the same time, current systems that do offer those network integration capabilities (such as Grid systems) generally do not offer a rich programming model that allows modular, reusable application development, and they have complex installation and maintenance procedures. They are also designed to run mostly on dedicated systems rather than to integrate existing networks transparently, which is the aim of CORBA-LC.


Subtask 2.3. Development of location aware services for WLAN users.

Location-based services are a special kind of application that takes into account information about the geographical location of a particular subject in order to provide a context-dependent service. The main purpose of this subtask is to define an architecture for indoor location services based on widely deployed technologies such as 802.11 or Bluetooth. Using models for signal propagation and received signal strength (RSSI), the architecture will be able to determine, in real time, the location of a particular subject with an accuracy of 3-5 meters.

Subtask 2.4. Design of a security policy management system for GT4.

Globus Toolkit version 4 (GT4) provides a range of new services and features. One of these new features is a robust implementation of Globus Web Services components. This allows GT4 to complete the first stage of the migration to web services that began with GT3.

The management of grid services is a complex task, with a significant number of issues to be considered. Ideally, useful grid infrastructures should offer a number of tools for managing services in order to facilitate the management tasks of grid users and administrators. They also ought to provide added-value services such as checking the validity of grid management rules or performing proactive monitoring, thus enabling automatic and dynamic management frameworks. The Globus toolkit does not provide any particular solution for distributed grid management. However, several active groups of the OGF (Open Grid Forum) are starting to address some of these issues.

In line with this, our work proposes the design and implementation of a framework for managing security policies in grid scenarios based on GT4. Our framework meets certain requirements that make it different from existing solutions (e.g., PERMIS). These requirements are the integration of web services in the whole management cycle (from the definition task to the enforcement and monitoring processes) and the use of semantic-aware management languages that enable new added-value features, such as the detection and resolution of conflicts between different grid management policies specified on the basis of decision rules.
  

2. Main results achieved

The main results achieved by each subtask are the following:

Subtask 2.1. Design and implementation of efficient mechanisms for data transfer and storage in distributed systems.

In order to improve the performance of the secondary storage of each cluster node, we have implemented REDCAP, a RAM-based disk cache placed between the operating system's buffer cache and the built-in cache of the disk controller. REDCAP is able to greatly reduce the I/O time of disk read requests by using a small portion of the main memory. One of the most important elements of REDCAP is its activation-deactivation algorithm. This algorithm dynamically compares the performance obtained by REDCAP with the estimated performance achieved by a system without REDCAP, and activates or deactivates the REDCAP cache accordingly. The experiments carried out show that REDCAP can improve performance by up to 80% for several workloads, while achieving results similar to those of a vanilla Linux kernel for workloads where an I/O improvement is hard to obtain [Gonzalez07, Gonzalez08a].
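The activation-deactivation decision described above can be illustrated with a minimal Python sketch. It is not the REDCAP implementation: the class name, the method signature and the 5% hysteresis margin are assumptions made for illustration, and the sketch does not model how the two I/O times are measured or estimated.

```python
from dataclasses import dataclass


@dataclass
class CacheSwitch:
    """Toggle a cache on or off by comparing two I/O time figures.

    Hypothetical names throughout; the margin is an assumed tolerance
    that keeps the switch from flapping on small differences.
    """
    active: bool = True
    margin: float = 0.05

    def update(self, time_with_cache: float, time_without_cache: float) -> bool:
        """Return the new state after one evaluation interval.

        Both arguments are I/O times in seconds for the same interval.
        """
        if self.active and time_with_cache > time_without_cache * (1 + self.margin):
            self.active = False  # the cache is clearly slower: turn it off
        elif not self.active and time_with_cache < time_without_cache * (1 - self.margin):
            self.active = True   # the cache would clearly help: turn it on
        return self.active


switch = CacheSwitch()
# (time with cache, time without cache) for four consecutive intervals
intervals = [(0.8, 1.0), (1.3, 1.0), (1.02, 1.0), (0.7, 1.0)]
states = [switch.update(w, wo) for w, wo in intervals]
print(states)
```

In the third interval the cache stays off because the difference falls inside the margin; only a clear advantage in the fourth interval turns it back on.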

In the case of distributed file systems, we have implemented a new approach for Active Storage in collaboration with the Pacific Northwest National Laboratory (USA). Active Storage reduces the bandwidth requirements between the compute and storage elements of a cluster by moving appropriate processing tasks to the storage nodes. Executing tasks on the storage nodes also allows Active Storage to leverage the processing power of these nodes. Active Storage has been implemented on the Lustre and PVFS parallel file systems. We have also provided a scientist-friendly environment where it is easy to describe and run an Active Storage job [Piernas07].

The current implementation of Active Storage is able to deal with striped files, i.e., files whose data is spread across several nodes, which are typically used to improve the I/O bandwidth. It is also able to deal with netCDF files, which are very common for data exchange in some scientific applications [Piernas08].
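The idea of processing a striped file where its data lives can be sketched in a few lines of Python. This is an illustration only, assuming simple round-robin striping with a fixed stripe size; the function names are hypothetical and nothing here uses the Lustre or PVFS interfaces.

```python
def stripe(data: bytes, num_nodes: int, stripe_size: int) -> list[list[bytes]]:
    """Split data into fixed-size stripes dealt round-robin to the nodes."""
    nodes: list[list[bytes]] = [[] for _ in range(num_nodes)]
    for offset in range(0, len(data), stripe_size):
        node = (offset // stripe_size) % num_nodes
        nodes[node].append(data[offset:offset + stripe_size])
    return nodes


def local_count(stripes: list[bytes], byte_value: int) -> int:
    """Work done on one storage node: scan only the locally held stripes."""
    return sum(chunk.count(byte_value) for chunk in stripes)


data = b"abracadabra, abracadabra"
nodes = stripe(data, num_nodes=3, stripe_size=4)

# Each node returns a single integer instead of shipping its stripes
# to a compute node; the client only adds up the partial results.
partials = [local_count(stripes, ord("a")) for stripes in nodes]
total = sum(partials)
print(partials, total)
```

Counting single bytes is convenient here because no item spans a stripe boundary; operations on records that cross stripes would need extra coordination between nodes.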

Subtask 2.2. Building a middleware infrastructure for the development of distributed applications over high-performance networks.

We have specified a component model based on CORBA (CORBA Lightweight Components, CORBA-LC) that eases the development of distributed applications. This is because of the key characteristics of CORBA-LC:

1. Software is divided into independent modules with well-defined interfaces. These modules are called components and, being independent, they can be developed, specified and tested in isolation, which enhances software quality.

2. Each component specifies CORBA interfaces. Those interfaces allow the component to be accessed remotely in a transparent way, regardless of the computer it is running on. Each component also offers standardised meta-information (how to install it, its CPU and memory requirements, architecture and operating system, other required components) that makes it possible to manage it, send it to remote nodes, install and uninstall it, etc.

3. The running environment also supports a well-known standard set of services, such as load balancing, fault tolerance, distributed searches for components, etc. These services are offered to the component through the container. The container also manages activation and deactivation of components through standard interfaces.

4. Components specify what services they demand from the container, and it provides them. This has the following advantages:

a) service code is shared among all the components
b) the component only implements its required functionality, and does not have to take into account other requirements, such as load balancing, that are provided by the container.

5. Component-based development allows applications to be built by connecting basic components. As part of this activity, a tool for developing applications in this way has been built.

Finally, the set of policies forming the dialog between the component and the container has been formally established [Sevilla07a].
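The division of labour between components and the container listed above can be sketched in plain Python. This is an illustration under stated assumptions, not CORBA-LC code: it uses no CORBA interfaces, and every class, method and service name below is hypothetical.

```python
class Component:
    """Base class: a component declares the services it needs by name."""
    name = "component"
    required_services: tuple[str, ...] = ()

    def __init__(self) -> None:
        self.services: dict[str, object] = {}
        self.active = False


class Container:
    """Hosts components and supplies the shared services they request."""

    def __init__(self, services: dict[str, object]) -> None:
        self._services = services
        self._components: dict[str, Component] = {}

    def install(self, component: Component) -> None:
        missing = [s for s in component.required_services if s not in self._services]
        if missing:
            raise LookupError(f"{component.name} needs unavailable services: {missing}")
        # Hand over only the services the component asked for.
        component.services = {s: self._services[s] for s in component.required_services}
        self._components[component.name] = component

    def activate(self, name: str) -> None:
        self._components[name].active = True

    def deactivate(self, name: str) -> None:
        self._components[name].active = False


class Logger:
    """A shared service; one instance serves every component."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def log(self, message: str) -> None:
        self.lines.append(message)


class Adder(Component):
    """Implements only its own functionality and asks for logging."""
    name = "adder"
    required_services = ("logging",)

    def add(self, a: int, b: int) -> int:
        self.services["logging"].log(f"add({a}, {b})")
        return a + b


logger = Logger()
container = Container({"logging": logger, "load_balancing": object()})
adder = Adder()
container.install(adder)
container.activate("adder")
result = adder.add(2, 3)
print(result, logger.lines, sorted(adder.services))
```

The component never sees the load balancing service because it did not request it, which mirrors the point that service code lives in the container and is shared rather than reimplemented.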

Subtask 2.3. Development of location aware services for WLAN users.

At this stage, we have developed a first prototype based on 802.11 and fingerprints. The system has been deployed in two phases. In the first phase, we analyzed the indoor environment, a building floor, to determine the number of fingerprints to be obtained and to record the signals from the existing access points. In the second, online phase, we tested some initial location algorithms based on RSSI. This first prototype relies on software installed on laptops and PDAs, which provides the signal information to a central server responsible for the location estimation. Current accuracy is around 6 meters.
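A common way to use fingerprints for location is nearest-neighbour matching in signal space, sketched below in Python. The source does not say which algorithms the prototype tested, so this is only an assumed example; the radio map values, access point names and the -100 dBm placeholder for unheard access points are all invented for illustration.

```python
import math

# First phase: a hypothetical radio map. Keys are positions (x, y) in
# meters; values are the RSSI in dBm recorded from each access point.
radio_map = {
    (0.0, 0.0): {"ap1": -40, "ap2": -70, "ap3": -80},
    (10.0, 0.0): {"ap1": -70, "ap2": -40, "ap3": -75},
    (0.0, 10.0): {"ap1": -72, "ap2": -78, "ap3": -45},
    (10.0, 10.0): {"ap1": -80, "ap2": -60, "ap3": -55},
}

MISSING_DBM = -100  # assumed reading for an access point that was not heard


def signal_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance between two RSSI readings in signal space."""
    aps = set(a) | set(b)
    return math.sqrt(sum((a.get(ap, MISSING_DBM) - b.get(ap, MISSING_DBM)) ** 2 for ap in aps))


def locate(observed: dict[str, float], fingerprints: dict, k: int = 1) -> tuple[float, float]:
    """Online phase: average the positions of the k closest fingerprints."""
    ranked = sorted(fingerprints, key=lambda pos: signal_distance(observed, fingerprints[pos]))
    nearest = ranked[:k]
    x = sum(pos[0] for pos in nearest) / len(nearest)
    y = sum(pos[1] for pos in nearest) / len(nearest)
    return (x, y)


estimate = locate({"ap1": -42, "ap2": -68, "ap3": -82}, radio_map)
print(estimate)
```

With k greater than 1 the estimate falls between recorded positions, which is one simple way to trade off fingerprint density against accuracy.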


Subtask 2.4. Design of a security policy management system for GT4.

This work has designed and implemented a semantic-aware framework enabling the dynamic management of security services in GT4 infrastructures. The defined framework also represents one step towards the automatic management of security services, considering not only authorization services, but also providing additional reasoning mechanisms to deal with issues such as detection and resolution of conflicts between different grid management rules.
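To make the notion of a conflict between management rules concrete, the following Python sketch flags pairs of rules that address the same request with opposite effects. It is a deliberately simplified assumption: the framework relies on semantic-aware languages and reasoning, whereas this example uses exact matching on hypothetical rule fields and an assumed deny-overrides resolution strategy.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    """A hypothetical decision rule; field names are illustrative only."""
    name: str
    subject: str
    resource: str
    action: str
    effect: str  # "permit" or "deny"

    def target(self) -> tuple[str, str, str]:
        return (self.subject, self.resource, self.action)


def find_conflicts(rules: list[Rule]) -> list[tuple[str, str]]:
    """Return pairs of rules with the same target but opposite effects."""
    return [
        (a.name, b.name)
        for a, b in combinations(rules, 2)
        if a.target() == b.target() and a.effect != b.effect
    ]


def resolve(rules: list[Rule], subject: str, resource: str, action: str) -> str:
    """Assumed resolution strategy: a matching deny overrides any permit."""
    effects = {r.effect for r in rules if r.target() == (subject, resource, action)}
    if "deny" in effects:
        return "deny"
    if "permit" in effects:
        return "permit"
    return "not-applicable"


rules = [
    Rule("r1", "alice", "job-service", "submit", "permit"),
    Rule("r2", "alice", "job-service", "submit", "deny"),
    Rule("r3", "bob", "job-service", "submit", "permit"),
]
conflicts = find_conflicts(rules)
decision = resolve(rules, "alice", "job-service", "submit")
print(conflicts, decision)
```

A reasoning-based approach could also detect conflicts that exact matching misses, for example when one rule targets a group and another targets a member of that group.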

Some other components of the framework (such as the security module area) have been used to provide security components in ubiquitous systems, routing scenarios and overlay networks. In this respect, the most relevant paper of this subtask [Martinez07] provides a two-tier framework for managing semantic-aware distributed firewall policies to be applied to the devices in one administrative domain.


Publications: [Gonzalez07], [Gonzalez08a], [Gonzalez08b], [Piernas07], [Piernas08], [Sevilla07a], [Sevilla07b], [Sanchez08], [Muñoz07a], [Muñoz07b], [Muñoz07c], [Muñoz09], [Martinez07], [Martinez08], [García09]

PhD dissertations: [Sevilla08]

Collaborations:  [Nieplocha], [Cortes]

 

3. Current work

These are the current issues we are working on:

Subtask 2.1. Design and implementation of efficient mechanisms for data transfer and storage in distributed systems.

The activation-deactivation algorithm of REDCAP uses a disk simulator to estimate the I/O throughput that can be achieved by a system with REDCAP and by a regular system without REDCAP. We are improving the accuracy of this simulator so that the algorithm makes the right decisions. We are also evaluating the effectiveness of REDCAP with aged file systems and other additional workloads.

With respect to Active Storage, we are improving the implementation to make the management of striped netCDF files easier.

Subtask 2.2. Building a middleware infrastructure for the development of distributed applications over high-performance networks.

This subtask has been successfully completed. There is no current work in progress.

Subtask 2.3. Development of location aware services for WLAN users.

The current infrastructure is being modified to address two different issues. First, we are incorporating new hardware, such as wireless 802.11 tags and RFID tags, in order to support a wider range of devices. This new hardware requires some modifications to the existing infrastructure, since signals must now be obtained by the access points and not by the end-user devices. Second, we are working on different methods for location estimation, with particular attention to accuracy and performance.


Subtask 2.4. Design of a security policy management system for GT4.

This subtask has been successfully completed. There is no current work in progress.