In order to apply to an Outreachy project this round you should:
Communication channels for more information:
IRC: #ceph channel at irc.oftc.net.
HDDs and SSDs expose internal metrics about usage, wear, and hardware health in the form of SMART metrics that can be queried from the host. This information can be used to build a model of the expected longevity of a device, so that impending failures can be predicted, data can be proactively replicated, and failing devices can be removed from the system before they cause availability problems.
Ceph is an open source distributed storage system that uses replication and erasure coding to distribute data across many HDDs and/or SSDs. The system aims to be self-managing, which should include predicting the failures of constituent devices before they happen to improve overall data safety.
This project will include integrating low-level tools to extract SMART data from devices on a regular basis (e.g., by modifying the smartctl(8) utility to dump its results in structured form), feeding that data back to the central Ceph cluster “manager” daemons, and implementing a simple mathematical model to predict failures and preemptively remove failing devices from the system. Several simple existing models can be used as-is once the SMART data is centrally stored and monitored. Once this infrastructure is in place, we will eventually look to improve the accuracy of the predictive models based on additional data.
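As a rough illustration of the kind of simple model described above, the following Python sketch flags a device when several failure-related SMART attributes report nonzero raw values. The JSON layout, the helper names (`failing_attributes`, `should_mark_out`), the set of watched attributes, and the `min_flags` threshold are all assumptions made for this example; they are not the output format of smartctl(8) or an existing Ceph interface.

```python
import json

# SMART attribute IDs often associated with drive failure. The choice of
# attributes and the "any nonzero raw value" rule are assumptions for
# this sketch, not a validated predictive model.
WATCHED_ATTRIBUTES = {
    5: "Reallocated_Sector_Ct",
    187: "Reported_Uncorrect",
    188: "Command_Timeout",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def failing_attributes(smart_json):
    """Return the names of watched attributes whose raw value is nonzero."""
    report = json.loads(smart_json)
    flagged = []
    for attr in report.get("attributes", []):
        name = WATCHED_ATTRIBUTES.get(attr.get("id"))
        if name is not None and attr.get("raw", 0) > 0:
            flagged.append(name)
    return flagged


def should_mark_out(smart_json, min_flags=2):
    """Decide whether a device looks risky enough to remove preemptively.

    min_flags is a hypothetical tuning knob: the number of watched
    attributes that must be nonzero before the device is flagged.
    """
    return len(failing_attributes(smart_json)) >= min_flags


# Hypothetical structured dump for a single device.
SAMPLE_REPORT = json.dumps({
    "device": "/dev/sda",
    "attributes": [
        {"id": 5, "raw": 12},
        {"id": 9, "raw": 20000},   # power-on hours, not watched
        {"id": 197, "raw": 3},
        {"id": 198, "raw": 0},
    ],
})

if __name__ == "__main__":
    print(failing_attributes(SAMPLE_REPORT))
    print(should_mark_out(SAMPLE_REPORT))
```

A real implementation would presumably run centrally in the manager daemons over data collected from every host, and would replace the fixed threshold with one of the existing models mentioned above.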
Ceph is a highly available, distributed, software-defined storage system that provides object, key/value, and file-system interfaces. Ceph Radosgw provides an HTTP REST API that is compatible with AWS S3 and OpenStack Swift.
radosgw-admin is a command-line tool for configuring and controlling this service.
It also allows querying the service status, user data, and the geo-replication state.
Today, adding or updating a command is a complex process because of how the tool is implemented.
We would like to improve the implementation:
The project will consist of three parts:
The Ceph Benchmarking Tool (CBT) is a Python-based framework used to automate performance tests of distributed data storage. Lately, we have been working on integrating CBT with Teuthology, our nightly regression testing framework. The goal is to automate comprehensive performance testing for Ceph. We need a candidate with a keen interest in data visualization and knowledge of web programming (especially web-based graphing frameworks) to help us take the nightly performance test results and system monitoring data and present them in intuitive, user-friendly ways.
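To give a feel for the data preparation such a visualization might need, here is a minimal Python sketch that groups nightly results by benchmark and date and emits a JSON time series that a web graphing library could plot. The record fields (`benchmark`, `date`, `mb_per_sec`), the benchmark label, and the `build_series` helper are hypothetical; they do not reflect the actual output of CBT or Teuthology.

```python
import json
from collections import defaultdict
from statistics import mean


def build_series(results):
    """Group results by benchmark and date, averaging throughput per night.

    Each input record is assumed to be a dict with the hypothetical keys
    "benchmark", "date" (ISO 8601, so string order is date order) and
    "mb_per_sec".
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for record in results:
        grouped[record["benchmark"]][record["date"]].append(record["mb_per_sec"])

    series = {}
    for benchmark, by_date in grouped.items():
        series[benchmark] = [
            {"date": date, "mean_mb_per_sec": round(mean(values), 2)}
            for date, values in sorted(by_date.items())
        ]
    return series


# Hypothetical nightly results; two runs on the first night, one on the second.
RESULTS = [
    {"benchmark": "radosbench_write", "date": "2017-09-02", "mb_per_sec": 410.0},
    {"benchmark": "radosbench_write", "date": "2017-09-01", "mb_per_sec": 400.0},
    {"benchmark": "radosbench_write", "date": "2017-09-01", "mb_per_sec": 420.0},
]

if __name__ == "__main__":
    # The JSON document could be served to a browser-side charting library.
    print(json.dumps(build_series(RESULTS), indent=2))
```

Keeping the aggregation separate from the presentation layer would leave the choice of graphing framework open.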
Ceph includes a lightweight web dashboard that enables users to see the health of the system and explore the status of the various services within the cluster.
The dashboard was added recently, in the 12.x Luminous release of Ceph, and there is considerable scope for enhancement: ceph-mgr currently has far more data available than the dashboard code exposes.
This project will include making a variety of improvements to the dashboard, such as:
There is some flexibility in exactly which features/pages to work on.