Feasibility study of GlusterFS and CephFS in a geographically distributed environment
Managing the massive volumes of data generated across different disciplines is already recognized as a major challenge. Recently, projects such as GlusterFS and CephFS have claimed to provide a geo-replication model. In this thesis we aim to develop a solution that allows distributed file systems to run beyond a single cluster, across geographically distributed sites, with an acceptable performance penalty. For this task we would like to experiment with both the Ceph and Gluster file systems, which have different architectures but both claim to support geo-replication. The proposed solution will provide a basis for developing a next-level intra-Cloud solution for data analysis.
Prerequisites: Distributed systems, basics of file systems, good knowledge of programming and shell scripting, and working experience with the Linux operating system.
For a preliminary discussion, contact Lirim Osmani (lirim.osmani@cs.helsinki.fi) or Salman Toor (salman.toor@helsinki.fi).