Tanzima Islam is an assistant professor in the Department of Computer Science at Texas State University (TxState). Dr. Islam earned her Ph.D. in Computer Engineering from Purdue University and was a postdoctoral scholar at Lawrence Livermore National Laboratory (LLNL). Her research develops software tools and data-driven analysis techniques to automatically identify performance problems of scientific applications running on High-Performance Computing (HPC) systems. She is the recipient of the prestigious Early Career Research Program (ECRP) award from the Department of Energy (DOE) and of DOE's SRP fellowship ('22, '19). The impact of her research has been recognized nationally and internationally through awards such as the R&D 100 award, the Science and Technology award from Lawrence Livermore National Laboratory, and the College of Science and Engineering’s Excellence in Scholarly Activities at TxState. Dr. Islam’s research has been funded by DOE, TxState, AMD, and various national laboratories.
Dr. Islam is also the co-founder of Bangladeshi Women in Computer Science and Engineering (BWCSE), a research and mentoring platform that supports Bangladeshi female Computer Science and Engineering students in scholarly activities. Since its inauguration in 2014, this pioneering effort has provided hundreds of female students with information and mentorship to secure national and international opportunities to strengthen their resumes. More on BWCSE can be found here.
I am the director of the Per4ML laboratory at TxState. Our research mission is to enable rapid scientific discoveries with effective cyberinfrastructure utilization. To achieve this mission, we develop performance modeling, analysis, and optimization techniques for both scientific and deep learning applications running on heterogeneous High Performance Computing (HPC) systems. Specifically, at Per4ML, we work in the cross-cutting field of data science for HPC by addressing exciting problems in computer systems, machine learning, software development, and visualization.
Per4ML has Funded Positions Available!
Per4ML has multiple RA and Postdoc positions available. I am looking for driven and self-motivated individuals interested in developing deep learning techniques for decision-making with application to HPC and systems. Please reach out with your CV that demonstrates your experience in leveraging ML.
Self-motivated undergraduate students can also get involved in research to bolster their resumes. Contact me by clicking on the email icon under my picture.
Research and Other Funding
PI, INTELYTICS: An Efficient Data-Driven Decision-Making Engine for Performance In the Era of Heterogeneity, DOE Early Career Research Program, 770K (2022-2027)
Co-PI, Scalable Metadata And Provenance Services for Reproducible Hybrid Workflows, DOE Next Generation Data Management, 300K (2022-2025)
Sub-contract, ICE4HPC: Towards the Intelligent Center for HPC, Laboratory Directed Research and Development at Lawrence Livermore National Laboratory, 450K (2022-2025)
PI, Characterizing Workflow Applications using Machine Learning, DOE SRP-HPC fellowship at Brookhaven National Lab, 68K (2022)
PI, AMD Research Gift, 100K (2021-2022)
Member, REU Site: Research Experiences for Undergraduates in Edge Computing, NSF, 389K (2021)
Co-PI, AMD COVID HPC Grant, 5-petaflops HPC system, 400K (2021)
PI, Proxy Application Validation for Exascale Co-design. SRP fellowship at Lawrence Berkeley National Laboratory, 43K (2019)
PI, Parallel Computing course. Time allocation grant from XSEDE, 100K core-hours (2017, 2018, 2021, 2022)
Co-PI, course development for "Scientific Data Visualization". Office of Research and Sponsored Programs at Western Washington University, 12K (2018)
PI, Veritas for Understanding Performance Evolution during Code Development. Linking Exploratory Application Research to Next-gen Development at Lawrence Livermore National Laboratory, 200K (2016)
Awards & Honors
The College of Science and Engineering’s Excellence in Scholarly Activities at TxState, 2021-2022
R&D 100 award for Scalable Checkpoint-restart library, 2019-2020, by R&D 100 Magazine
Director's Science & Technology Awards and Excellence in Publication, LLNL, 2014
2nd place, Computation Directorate Postdoctoral Poster Symposium LLNL, 2014
Best Poster Award, Scholars Symposium@LLNL, 2014, 2015, 2016
2nd place, ACM Student Research Competition (SRC), GHC, 2010
Why Texas State (TxState)?
According to CSRankings, in the area of HPC alone, TxState ranks among the top 40 universities in the nation. We are one of the few institutions to have a petaflop supercomputer on-premises, and we have easy access to the TACC supercomputers.
TxState is a large public university (~40K students) located in San Marcos, just 30 miles south of the heart of Austin, TX. Austin is a vibrant city bustling with diversity in food, people, and culture. Being so close to Austin, you will have access to the city's many perks and amenities while enjoying affordable living in San Marcos. Added bonus: almost every single large company (e.g., Google, Amazon, Apple, Meta, AMD, NXP, Samsung) has an office in Austin, so opportunities for jobs and internships are aplenty.
TxState is committed to providing resources and support for a diverse student body. Everyone fits right in!
Meet The Team
DOE Next-gen Data Management
Excited! Our collaborative proposal with Brookhaven National Lab and Argonne National Lab has been funded by DOE.
DOE ECRP Award
I have received the Early Career Research award from the Department of Energy (DOE) for the innovative application of deep learning to accelerate real-time performance analytics! I will be hiring two new Ph.D. students for this cutting-edge research project!
I have received DOE's SRP-HPC fellowship and will be conducting research over the summer as a visiting scholar at Brookhaven National Laboratory. My students Chase Phelps, Arunavo Dey, and Alicia Guite (sophomore) will also join me.
Congratulations to Tarek Ramadan for successfully defending his M.Sc. thesis.
I have been invited to give a talk at the CHEOPS workshop colocated with EuroSys'22.
Our paper "libNVCD: An Extendable and User-friendly Multi-GPU Performance Measurement Tool" has been accepted at IEEE COMPSAC'22 (acceptance rate: 23%)!
I am chairing the Performance measurement, modeling, and analysis track at SC this year. Excited and honored!
Thanks to AMD for the 50K gift! This is the second year of our collaboration. Here is to many more!
My invited paper talks about designing an HPC course focusing on distributed memory parallelism and scalable parallel I/O patterns.
College achievement award
I won the College of Science and Engineering's "Achievement Award in Scholarly Activities" at TxState. Honored!
Congratulations to Chase Phelps and Tarek Ramadan for securing internships at LLNL and LBNL, respectively!
Our paper "College Life is Hard! - Shedding Light on Stress Prediction for Autistic College Students using Data-Driven Analysis" has been accepted at COMPSAC'21.
Our paper "Comparative Code Structure Analysis using Deep Learning for Performance Prediction" has been accepted at ISPASS'21.
RobustScience Community I
Excited to be in the RobustScience virtual café talking about scalability, reproducibility, and trustworthiness issues of scientific workflows.
Received funding from the Research Enhancement Program at TXST for developing deep learning models using code structure to predict performance.
AMD Research Gift
My team will develop performance models for the next-gen AMD GPUs and improve their performance profiler rocprof. Thanks AMD for the gift!
AMD Equipment Grant
AMD has granted $100K worth of on-prem and Cloud HPC resources to TXST for COVID-19 related research. Read here: https://www.amd.com/en/corporate/hpc-fund
Organized a panel at ACM Richard Tapia Celebration of Diversity in Computing Conference on High Performance Computing. Take a look: video presentation and slides: https://drive.google.com/file/d/18daQuN8CCdkrIKgm2cHNrUBKJDggfZyx/view?usp=sharing
I gave a talk at AMD on my research in performance analysis on various architectures.
Invited talk at the Exascale Computing Project's annual hackathon organized by ORNL and BNL on Dashing (here).
Gave invited talk "Learning to Manage in Grad School for a Sustainable Career in Future" at Purdue University. Here is an uncut version.
R&D 100 Award Winner
Our research on scalable checkpoint restart won the R&D 100 award.
Our paper titled "Towards A Programmable Analysis and Visualization Framework for Interactive Performance Analytics" accepted in the ProTools workshop at SC'19.
I have moved! I have joined Texas State University in San Marcos. It was incredibly hard to leave my colleagues, students, and friends in beautiful Bellingham behind.
Our paper titled "Performance Optimality or Reproducibility: That is the Question" has been accepted in the performance track at SC'19 (acceptance rate: 72/344 = 20%)! Congratulations to Alex for his first SC paper.
Visiting faculty scholar at LBNL with my students for the summer.
Received the DOE SRP fellowship for my proposal titled "Proxy application validation for Exascale Co-design".
Congratulations to Gian-Carlo for getting his first research poster accepted at ISC'19!
I am organizing the first workshop on how different application domains generate and analyze big data by leveraging HPC. The workshop (BDCAA) will be organized in conjunction with IEEE COMPSAC'19 in Milwaukee.
Super excited to be a part of the Computer Systems Engineering (CSE) track at GHC 2019. About time women working in computer systems meet!
Attending the CRA-W Career mentoring workshop in Phoenix, AZ.
Invited to present my research at the Sustainable Research Pathways Program at Lawrence Berkeley National Laboratory.
I am the vice-chair of the performance track at ICPP 2019. Make sure to submit your work at the conference.
Our poster titled "Automatic Generation of Mixed-Precision Programs" has been accepted at SC'18. This work has been done in collaboration with LLNL.
Received XSEDE allocation grant for my Parallel Computing course. Happy computing!
Several of my students are interning at research labs such as LLNL and PNNL. Happy summer!
Our paper titled "Low Rank Smoothed Sampling Methods for Identifying Impactful Pair-wise Mutations" has been accepted at ACM Conference on Bioinformatics, Computational Biology, and Health Informatics. Congratulations Nick!
Our paper titled "PADDLE: Performance Analysis using a Data-driven Learning Environment" has been accepted at IPDPS 2018!
Joined as an Assistant Professor in CS at WWU.
Our paper on scalable file system for burst buffers titled "MetaKV: A key-value store for metadata management of distributed burst buffers" has been accepted in IPDPS'17 [Acceptance rate ~22%].
I am a TPC member of the workshop "Advancing Science via Large Scale Text Analytics over Scientific Articles" that will be hosted as a part of the IEEE International Conference on Distributed Computing Systems (ICDCS) 2017.
I will serve as a TPC member for the performance track in SC'17.
Our paper titled "CMT-bone - A Proxy Application for Compressible Multiphase Turbulent Flows" has been accepted at the 23rd IEEE International Conference on High Performance Computing, Data, and Analytics, which will be held in Hyderabad, India, on Dec 19-22, 2016.
My paper with the title "A Machine Learning Framework for Performance Coverage Analysis of Proxy Applications" has been accepted in the International Conference for High Performance Computing, Networking, Storage and Analysis (SC) 2016!
Attending IPDPS 2016 in Chicago. Our paper on "I/O aware power shifting" will be presented in the "I/O and Storage Track".
Started my own project as the PI on applying machine-learning to analyze application performance. The coolest parts of my project are (a) managing an awesome team; (b) working directly with the applications that are important for LLNL. Excited!
Heard back from the LLNL research council, and they are interested in moving forward with my LEARN proposal. The idea is to develop machine-learning techniques for correlating a plethora of system performance metrics with application performance.
I have been invited as a Program Committee member at the workshop "Tools for Program Development and Analysis in Computational Science" co-located with ICCS 2016.
Our paper titled "I/O Aware Power Shifting" has been accepted in IPDPS 2016.
Our paper titled "Fault Tolerance Assistant (FTA): An Exception Handling Approach for MPI Programs" has been accepted in the ExaMPI workshop at SC'15.
Best poster 2015
Our poster titled "Towards Scientific-Data Compression Using Variable Clustering" received the Best Poster Award in the Computation Directorate's scholar poster session at LLNL.
Our paper titled "Exploring the MPI Tool Information Interface: Features and Capabilities" has been accepted in IJHPCA.
My work titled "Reliable and Efficient Distributed Checkpointing System for Grid Environments" has been accepted in Journal of Grid Computing.
My work on proxy application validation is one of the few projects that have been selected to showcase LLNL's research in achieving the laboratory's mission to the External Review Committee (ERC).
I presented my work on proxy application validation at the Joint Operations Weapons Operations Group (JOWOG) 34 meeting at Sandia National Laboratory in New Mexico.
Our poster on the feasibility of applying lossy compression on checkpoints has been accepted in SC'14.
Our poster titled "Lossy Compression for Checkpointing: Fallible or Feasible?" received the Best Poster Award in the 2014 Computation Directorate's scholar poster symposium.
My poster on proxy application validation secured 2nd place in Computation's Annual Postdoc Poster Symposium.