College of Marin (COM) student Kayumba Herve and Astronomy Instructor Dr. Antonino ‘Nino’ Cucchiara set their sights high — no, really, really high.
Herve and Cucchiara worked alongside researchers at the Lawrence Berkeley National Laboratory (better known as the Berkeley Lab) for a summer internship like no other. Their mission? To teach computers how to efficiently study the cosmos.
There is nothing more infinite than space and, currently, the data collection methods used to study it are not particularly discriminating. The volume becomes overwhelming for researchers, who must choose between turning off their instruments and potentially missing something, or trying to sift through a universe full of data. Researchers at the Berkeley Lab have been working on a branch of data science called machine learning, which makes that sifting manageable with the help of trained artificial intelligence. Astronomers have pursued this incredibly challenging task for nearly a decade, but there is still more work to be done, according to Cucchiara.
Computer science researchers at the Berkeley Lab worked with Herve and Cucchiara to design the algorithm and build the initial computing framework, which evolved over the 10-week internship. Although Herve and Cucchiara began working with the Berkeley Lab in early June 2021, their preparation for the internship began in spring 2021. Herve, Cucchiara and other students gathered a preliminary data set to bring to the Berkeley Lab researchers, in the hope of getting relevant results as soon as possible.
“Machine learning is the technique that we're going to use to discriminate between what data are interesting and what are not,” Cucchiara said. “Our project is much, much further reaching than just the few weeks that we are expecting to do so over the summer.”
Herve, an international student from Rwanda, is studying computer science at COM. During this internship, he worked alongside the Berkeley Lab researchers to help develop their original algorithm into something that can collect information from X-ray, gamma-ray, infrared, and other observations. Specifically, the work conducted by Herve, Cucchiara, and the Berkeley Lab researchers focused on time-domain astronomy, which looks at how objects in the cosmos change over time. Some of them, like supernovae and gamma-ray bursts, even explode.
The 10-week program was offered through the U.S. Department of Energy’s (DOE) Community College Internships (CCI) program. It seeks to encourage community college students to enter technical careers that will support the efforts of the DOE, which owns the Berkeley Lab and 16 other national laboratories across the country. CCI is sponsored and managed by the Office of Workforce Development for Teachers and Scientists within the DOE’s Office of Science. Last year was the first year COM applied for a grant for the program.
Cucchiara and Herve met through Umoja, a learning community for African American and African diaspora students, and through a STEM (Science, Technology, Engineering, Mathematics) Field Club. Cucchiara knew Herve was a computer science student and thought he would be a great fit for this research project, which married their two interests.
“During my internship last summer, we were able to create algorithms for a software program that pulled astronomical data gathered from satellites, mostly the Swift satellite,” Herve said. “The algorithm parameters we created helped the software program label astronomical data by using light curves to detect different types of astronomical events and objects in space.”
“I enjoyed my internship to the fullest,” continued Herve. “I would tell any student who is interested, that internships are great. They allow you to learn from skilled people who have lots of experience. And they help you build a strong resume and secure good references and recommendations.”
Cucchiara is particularly invested in the project. He is part of a team that is expected to open a new observatory in Chile, the Rubin Observatory, in 2023. That facility is expected to produce petabytes of data every night. A petabyte is roughly one million gigabytes, or about 1,000 terabytes (1,024 in the binary convention), and without machine learning that amount of data could prove unmanageable.
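For readers who want to check the scale of a petabyte, the arithmetic can be sketched in a few lines of Python. This is only an illustration and not part of the Berkeley Lab project. The constant names are our own, and the sketch assumes the two common conventions for storage units: decimal prefixes (powers of 1,000) and binary prefixes (powers of 1,024).

```python
# Illustrative unit arithmetic only; all names here are hypothetical.

# Decimal (SI) prefixes, in bytes.
GB = 10**9
TB = 10**12
PB = 10**15

# Binary prefixes (gibibyte, tebibyte, pebibyte), in bytes.
GIB = 2**30
TIB = 2**40
PIB = 2**50

# Decimal convention: one petabyte in gigabytes and in terabytes.
gb_per_pb = PB // GB
tb_per_pb = PB // TB

# Binary convention: one pebibyte in gibibytes and in tebibytes.
gib_per_pib = PIB // GIB
tib_per_pib = PIB // TIB

print(f"1 PB  = {gb_per_pb:,} GB = {tb_per_pb:,} TB")
print(f"1 PiB = {gib_per_pib:,} GiB = {tib_per_pib:,} TiB")
```

Under either convention, one petabyte is on the order of a million gigabytes. The figure of 1,024 comes from the binary convention, in which each unit is 1,024 times the one below it.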
For more information about STEM programs at COM, visit: academics.marin.edu/programs/stem
For more information about the Community College Internships (CCI) program through the U.S. Department of Energy, visit: science.osti.gov/wdts/cci