
[email protected] · 08-16-2017, 08:46 PM

Randomized Protocol for Duplicate Elimination in Peer-to-Peer Storage System

Peer-to-peer systems have emerged as cost-effective alternatives for scalable data sharing, backup, and archival storage. Peers contribute data and storage and, in return, gain access to data at other peers. Effective storage management is an important issue in the deployment of such systems. Data replication and caching are key enabling techniques for scalability, performance, and availability. In this context, an important problem is pruning unwanted copies of data efficiently and safely. Aggressive replication may lead to significant overhead from thrashing in resource-constrained environments. Even if replication at peers is controlled, as in systems such as Samsara, the network as a whole must provide mechanisms for eliminating replicas that are not accessed, while leaving a minimum number of replicas in the network to satisfy availability constraints.
In this paper, we investigate the problem of eliminating duplicate data items in peer-to-peer systems. We examine this issue in the context of unstructured networks, where no assumptions can be made about the relationship between an object and the peers at which it resides. Unstructured networks differ from their structured counterparts in several important respects. Structured networks provide a simple primitive for locating an object that relies on a distributed hash table (DHT) abstraction. The associated lookup techniques provide bounds on the number of hops as a function of the number of peers. These bounds are achieved by establishing and maintaining a well-defined overlay topology. In networks with a high transient population, the overhead of maintaining this topology may be significant. In contrast to structured peer-to-peer networks, unstructured networks are resilient to node failures and incur low overhead on node arrivals and departures. These characteristics make unstructured networks attractive for use in highly transient networks, where peers do not have significant resources. Unfortunately, the issue of object location, which is central to the problem of identifying redundant copies, is significantly more complex in this environment.
The primary focus of this paper is on systems where peers are cooperative and non-malicious. Peers divide their storage into two spaces: a private space and a public space. The private space contains the peer's own data and is not subject to duplicate elimination. The public space holds data from other peers and is subject to duplicate elimination. We can view the public space as backup storage or as a cache, to facilitate availability and performance, respectively.
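The private/public split described above can be illustrated with a small Java sketch. This is only an illustration under stated assumptions, not the paper's implementation: the class name `PeerStorage`, its methods, and the use of string identifiers for objects are all hypothetical. It shows only the rule that duplicate elimination may touch the public space and never the private one; how a peer decides that a copy is redundant is not modeled here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of one peer's storage, split into two spaces. */
class PeerStorage {
    // Private space: the peer's own data, never subject to duplicate elimination.
    private final Map<String, byte[]> privateSpace = new HashMap<String, byte[]>();
    // Public space: data held on behalf of other peers (backup or cache).
    private final Map<String, byte[]> publicSpace = new HashMap<String, byte[]>();

    /** Stores the peer's own data in the private space. */
    void storeOwn(String id, byte[] data) {
        privateSpace.put(id, data);
    }

    /** Stores a replica of another peer's data in the public space. */
    void storeForOthers(String id, byte[] data) {
        publicSpace.put(id, data);
    }

    /**
     * Removes the replica with the given id, but only from the public space.
     * Returns true if a replica was actually removed.
     */
    boolean eliminate(String id) {
        return publicSpace.remove(id) != null;
    }

    /** Returns true if this peer holds the object in either space. */
    boolean holds(String id) {
        return privateSpace.containsKey(id) || publicSpace.containsKey(id);
    }

    public static void main(String[] args) {
        PeerStorage peer = new PeerStorage();
        peer.storeOwn("own-file", new byte[] {1});
        peer.storeForOthers("replica", new byte[] {2});
        System.out.println("eliminated own-file: " + peer.eliminate("own-file"));
        System.out.println("eliminated replica:  " + peer.eliminate("replica"));
    }
}
```

Keeping the two spaces in separate maps means an elimination request can never reach the peer's own data, even if the same identifier happens to appear in both spaces.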

HARDWARE SPECIFICATION
Processor : Any processor above 500 MHz.
RAM : 128 MB.
Hard Disk : 10 GB.
Compact Disk : 650 MB.
Input device : Standard Keyboard and Mouse.
Output device : VGA and High Resolution Monitor.

SOFTWARE SPECIFICATION
Operating System : Windows 2000 Server family.
Techniques : JDK 1.5
Database : Microsoft SQL Server
External Tool : JFreeChart