Malinda Malwala Arachchige, a PhD candidate in Computer Science studying under Professor Danny Dig, has been awarded a gold prize at the ESEC/FSE Student Research Competition 2021 for developing an infrastructure for studying repetitive code changes in machine learning software. The victory was made even sweeter because it came in the midst of great personal stress: Malwala Arachchige's child was born prematurely, and he had to submit to the competition while he and his family were in the hospital. Having won the gold medal, he is eligible to compete at the SRC world finals in 2022, and he plans to do so.
What research did you undertake for the competition?
Over the years, researchers have capitalized on the repetitiveness of software changes to automate many software evolution tasks such as code completion, bug-fix recommendation, library adaptation, automatic program repair, and automated refactoring. Despite the extraordinary rise in popularity of Python-based ML systems, we observed that they do not benefit from these advances. Therefore, we developed an infrastructure that enables researchers to study the repetitiveness of ML software. Using the infrastructure, we conducted the first and most fine-grained study on code change patterns of 1000 top-rated ML systems comprising 58 million SLOC. We identified 22 code pattern groups, and we revealed 4 major trends in how ML developers change their code. Our research serves as a guide for tool builders to build code automation tools that can save thousands of ML developers’ man-hours.
If you were talking to someone without specific knowledge of software engineering, how would you describe the importance of the research you undertook?
In my Ph.D. research career, I went from searching for the hardest problem I could solve to searching for the simplest problem that would have the most impact. We observed that automation tool support for Python ML code development is significantly behind the level of support provided by IDEs for other languages such as Java. ML developers spend thousands of man-hours performing the same code changes within a project or across projects. To solve this problem and save man-hours, the research, Discovering Repetitive Code Changes in ML Systems, proposes and implements techniques to study the repetitive code changes that Python ML developers perform. Using these techniques, we conducted the first and most fine-grained study on code change patterns of 1000 top-rated ML systems comprising 58 million SLOC. This research guides tool builders and researchers in automating the frequent code changes that ML developers perform and, in turn, reduces the pain of Python ML software evolution.
What makes this medal so important?
Winning the gold award in the student research competition at FSE 2021 for my short paper "Discovering Repetitive Code Changes in ML Systems", which describes the technology behind identifying repetitive code changes in Python ML systems, is so far the most important milestone in my research philosophy and PhD career. There are many professional and personal reasons why this medal is important to me. First, FSE is the flagship conference in software engineering, so it helps my work reach the top research community in the field.
Second, it rewards my hard work and dedication throughout the last year. But above all, this medal is extra special to me because of the unprecedented circumstances I faced during the paper submission. One week prior to the submission deadline, my wife went into emergency preterm labor. This is something that I never expected. We had to stay in the hospital for one month until my son and wife had fully recovered. After we were admitted to the hospital, I started reading success stories of premature babies and focused only on the positive side.
My Ph.D. adviser, Prof. Danny, has also taught me the importance of being positive and how it helps in making the right decision whenever necessary. I knew that I had all the support that I needed to get through this difficult time. I also did not forget the deadline that I had in a few days. Though Prof. Danny insisted that I take time off from work, I really wanted to finish what I had started and did not want to miss the deadline. Once I felt mentally strong, I opened my laptop, started working on the paper from the hospital, and submitted the paper to the conference as we had planned. After months of evaluation, the evaluation committee invited me to present the work at the conference, and I won the gold medal in the student research competition.
Even though I had thousands of reasons not to submit the paper, I did not want my son to be a reason for missing the deadline. I am truly grateful to my adviser, Danny Dig, who helped me be mentally strong; to Ameya Ketkar, who guided me throughout the competition; and to CU Boulder, for creating an environment in which students can succeed even with life challenges.