For my summer term of 2014 at the United States Department of Energy’s Lawrence Berkeley National Laboratory, I am conducting research in computational molecular biology for a blind submission experiment in the Critical Assessment of Protein Structure Prediction (CASP) initiative. In this work, I am formulating algorithms that help identify three-dimensional protein structures from amino acid sequences, hydrophilic properties, repulsive forces, and other protein folding propensities. When proteins do not fold properly, diseases such as cancer, Alzheimer’s, and Parkinson’s may develop. Because I found this scientific endeavor to be a rewarding way of applying theoretical computer science to alleviate society’s burden of degenerative diseases, I decided to join Berkeley’s national research lab to further CASP’s goal of assessing efficient methods for structure prediction. Progress toward this goal will help scientists use knowledge of 3D protein structures to develop drugs and search for potential cures for fatal diseases. As a research intern, I have learned how several clustering algorithms from machine learning work and have applied them to group similar unknown proteins in order to improve our submissions to the CASP experiment. I have also gained experience programming in Python and using computational resources such as the Message Passing Interface (MPI) for parallel computing and Matplotlib for data visualization. Research on the protein structure prediction team of the Computational Research Division has stressed the ability to share ideas across academic disciplines, since tackling this problem draws on a combination of computer science, biology, mathematics, and chemistry.
Ultimately, this opportunity in the Science Undergraduate Laboratory Internship (SULI) program has broadened my knowledge across multiple scientific disciplines and given me research experience that will prepare me for graduate-level studies.
My experiences have allowed me to:
- substantially improve my Python programming skills
- learn introductory cluster analysis algorithms and implementations of their variants
- apply parallel computing with the Message Passing Interface (MPI) to run hybrid methods simultaneously in a biological software pipeline
- utilize Python libraries such as Matplotlib and scikit-learn
- gain experience working in an interdisciplinary team to solve problems in computational molecular biology
- develop a better understanding of the fast pace of groundbreaking research