Collaborative Research: CIF: Small: Versatile Data Synchronization: Novel Codes and Algorithms for Practical Applications
University of Virginia Main Campus, Charlottesville VA
Investigators
Abstract
The total volume of global digital data created or copied is estimated to double approximately every three years. This rapid growth has led to an increasing need for reliable and universal access to data in personal, enterprise, and scientific environments. To meet these requirements, data synchronization, which refers to the process of maintaining consistency between different versions of data stored on separate hosts, has become a crucial aspect of managing data. However, state-of-the-art synchronization tools have significant shortcomings and inefficiencies, resulting in increased costs and high-latency access. This project aims to develop data synchronization algorithms with optimal communication bandwidth based on error-correcting codes and to broaden the applicability of synchronization to real-world settings where current tools are inadequate. In addition to scientific and technological advances, the project has the potential to facilitate access to distributed storage systems for users with limited access to broadband Internet, such as in rural areas; help reduce energy consumption associated with data transmission; and provide opportunities to engage and train undergraduate researchers.
The goals of the project will be achieved through three research thrusts. The first thrust aims to increase the efficiency of data synchronization by designing low-redundancy systematic edit-correcting codes, along with efficient encoding and decoding algorithms. The second thrust focuses on synchronizing compressed data. As conventional compression typically destroys the similarity between related files, the project will develop mutually compatible compression and synchronization methods which, given the prevalence of data compression, have the potential to significantly expand the use of synchronization for large datasets. Finally, the third thrust will address often-overlooked real-world constraints on synchronization from theoretical and practical points of view. In particular, bounds on the information exchange will be established where one party is under communication or computational constraints. Furthermore, incremental and adaptive synchronization protocols will be developed to efficiently synchronize data when the statistics of the stochastic processes governing data update and modification are unknown.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.