Google Summer of Code

Google Summer of Code (g.co/gsoc) is Google's mentorship program for bringing new contributors into open source communities.

Google Summer of Code is a unique program where new contributors to open source, ages 18 and over, are paired with a mentor to introduce them to the open source community and provide guidance while they work on a real world open source project over the summer. Projects cover a wide range of fields including: Cloud, Operating Systems, Graphics, Medicine, Programming Languages, Robotics, Science, Security and many more. GSoC Contributors do earn a stipend to work on their small, (~90 hour), medium (~175 hour) or large (~350 hour) projects.

GSoC is a highly competitive program, so don't wait to the last minute to prepare! GSoC Contributors should reach out to the mentors of projects that interest them once orgs are announced on February 27, 2025. Potential GSoC Contributors can apply for Ceph's projects at g.co/gsoc from March 24th - April 8, 2025.

See the list of projects we have available for GSoC contributors and learn how get started with contributions.

In addition, for each project, mentors have detailed their expectations for usage of AI tools at the end of each project description.

For any questions, contact Vallari Agrawal (vallari.agrawal@ibm.com).


radosgw-admin UX and documentation improvements

Mentor name(s): Yuval Lifshitz, Jacques Heunis

Mentor email(s): ylifshit@ibm.com, jheunis@bloomberg.net

Difficulty: Advanced

Project Hours: 350

Skills needed: C++, (maybe python)

Subcomponent of Ceph: RGW

Description of project:

Currently documenting radosgw-admin commands is a manual and error prone process.

After implementing a new command, the "usage" part should be updated accordingly in the code, where there could be a mismatch between the command itself and its arguments and what is documented in the usage.

After that the man page should be updated manually, as well as the admin guide. any reference to this command in other places in our documentation is also manual.

We would like to use a more Programmatic approach towards the problem:

Standup/weekly call mentee could attend?: RGW standup, RGW refactoring meeting

Steps to evaluate an applicant for the project: TBD

1-2 short paragraphs about what first 2 weeks of work would look like during the internship: TBD

Expected Outcome: Detailed in the description.

Rules for AI usage: https://gist.github.com/yuvalif/b07312c98ea74890e157594a456c6e6b


Kafka Security

Mentor name(s): Yuval Lifshitz

Mentor email(s): ylifshit@ibm.com

Difficulty: Advanced

Project Hours: 350

Skills needed: Python and C++

Subcomponent of Ceph: RGW (teuthology and rook as stretch goals)

Description of project:

Bucket notification integration with Kafka is a very useful feature in the RGW. However, some security features needed for such integrations are missing. so, in this project we will try to make bucket notifications over kafka more secure.

The following features are missing:

The main challenge in the above would be in automating the tests, so they could easily run locally,

Standup/weekly call mentee could attend?: RGW daily standup, RGW weekly refactoring meeting

Steps to evaluate an applicant for the project: TBD

1-2 short paragraphs about what first 2 weeks of work would look like during the internship: TBD

Expected outcome: Detailed in the description.

Rules for AI usage: https://gist.github.com/yuvalif/b07312c98ea74890e157594a456c6e6b


RGW tcmalloc profiling

Mentor name(s): Yuval Lifshitz

Mentor email(s): ylifshit@ibm.com

Difficulty: Advanced

Project Hours: 175

Skills needed: Python and C++

Subcomponent of Ceph: RGW (teuthology and rook as stretch goals)

Description of project:

All daemons in ceph are using tcmalloc as the memory allocator to achieve better performance.

However, while the OSD, MON, and MDS can report the memory allocation performance, the RGW does not support that (see: https://docs.ceph.com/en/latest/rados/troubleshooting/memory-profiling/#memory-profiling).

In this project, we should add this reporting support to the RGW as well.

As a stretch goal, we should use the profiling information from RGW runs to tune the tcmalloc parameters so that would be more suitable for the memory use of the RGW

Standup/weekly call mentee could attend: RGW daily standup, RGW weekly refactoring meeting

Steps to evaluate an applicant for the project: TBD

1-2 short paragraphs about what first 2 weeks of work would look like during the internship: TBD

Expected outcome: Detailed in description.

Rules for AI usage: https://gist.github.com/yuvalif/b07312c98ea74890e157594a456c6e6b