I am an honorary research fellow at the University of Liverpool and a computational biologist at King Abdullah International Medical Research Center (KAIMRC). I advocate for inclusive and collaborative community spaces to enable purposeful human-centred adoption of open research practices, tools and culture. I lead the Open Science Community Saudi Arabia (OSCSA), which introduces and contextualises Open Science practices in Arabic-speaking countries. I'm also a Content Subject Matter Expert (SME) in NASA's Transform to Open Science (TOPS).
I started as a pharmacologist and made a transition to computational biology and bioinformatics during my Ph.D. The focus of my Ph.D. was on exploring the potential vulnerabilities of triple-negative breast cancer (TNBC), particularly in the DNA repair pathways. Having worked with various tools and techniques in genetics and structural biology, I developed a unique blend of interests, including the use of machine learning algorithms to answer biological questions.
I am part of the R-Ladies Global Team. R-Ladies is a worldwide organisation that promotes gender diversity in the R community. I'm also a core contributor to The Turing Way, an open-source, community-driven guide to reproducible, ethical and collaborative data science, a certified Carpentry instructor, and a member of the Education Committee of the International Society for Computational Biology (ISCB). I'm a mentor and facilitator in the Open Life Science (OLS) program, a 16-week personal mentorship and cohort-based training in which participants share their expertise and gain the knowledge essential to create, lead, and sustain an Open Science project, empowering each other to become effective Open Science ambassadors in their communities. I have also served as a reviewer for different journals, including PLOS Computational Biology, R Journal and JOSS.
I have delivered a number of talks and training programmes on Open Science for various events and communities, including eLife and UNESCO (see below for more details about talks). When I'm not coding, I'm embarking on an unplanned road trip exploring new places.
I am passionate about using technology to make a positive impact on the world, and I am constantly looking for new opportunities to learn and grow. On this page, you will find a list of the various projects I am currently involved in, including academic research, open-source contributions, community building, and educational initiatives. These projects represent a diverse range of interests and skills, but they all share a common goal: to use technology to make a difference.
I am currently leading an exciting project to produce a FAIR (Findable, Accessible, Interoperable, Reusable) dataset for encephalitis patients in KSA. The goal of this project is to improve the diagnosis of encephalitis using deep learning and to further our understanding of this condition by responsibly leveraging clinical data from 15 hospitals across the country. This project is a collaboration between multiple institutions and is designed to make a real impact on patient care by providing doctors with more accurate and timely diagnoses.
I am currently engaged in a research project that aims to understand the correlation between mutations in BLM (Bloom syndrome protein) and disordered regions. Using molecular dynamics simulations on the high-performance computing facilities at the University of Liverpool, we are exploring the impact of these mutations on the structure and stability of BLM. This research has the potential to provide new insights into the role of BLM in DNA repair and to inform the development of new therapeutic strategies for treating disorders associated with BLM mutations.
I am proud to have worked on a collaborative project with Odin Vision in the Alan Turing Institute's Data Science Group to explore methods for interpreting neural network classifications in diagnostic predictions. Our focus was on developing interpretable predictions for clinicians to support their decision-making. We investigated a variety of techniques, with a particular emphasis on gradient-based and perturbation-based attribution methods. Our research identified Guided Grad-CAM as the most insightful method for understanding the relative influence of input features on the classification process and for identifying potential failure modes such as false negative predictions. The results of this project have been published in a preprint.
I am currently collaborating with Dr. Sean Brady from Rockefeller University and King Abdullah International Medical Research Center (KAIMRC) on a project to discover new antibiotics from Saudi soil using culture-independent methods. Our project involves collecting environmental samples from all regions of Saudi Arabia, resulting in a large dataset containing over 600 samples with rich metadata. Our goal is to develop a reproducible and modular pipeline for integrated metagenomic analysis in drug discovery from Saudi soil.
I am currently working on a project to develop an nf-core pipeline for the molecular dynamics of proteins as part of the nf-core mentorship program. This project will be conducted as part of the "BioHackathon MENA" organized by KAUST. The goal of this project is to create a modular and reproducible pipeline for the molecular dynamics simulation of proteins. The pipeline will be built using Nextflow, a powerful workflow tool that allows for the execution of tasks across multiple compute infrastructures in a portable manner. We will be using Docker/Singularity containers to ensure that installation is easy and results are highly reproducible.
I am honored to be participating in a meaningful project to formulate a manifesto on Open Science in collaboration with eLife and the Einstein Foundation Berlin. This project aims to explore important topics related to decolonization of knowledge, representation of research from local communities in non-WEIRD countries, multilingualism in Open Science, and how Open Science can compound inequities if not done responsibly. As part of this project, I co-organized a two-day virtual symposium on December 1-2, 2022, which explored the theme of "Global Dynamics in Responsible Research" under the mission of the Einstein Foundation Award. I am part of an international team that is responsible for all aspects of event planning, website design, maintenance, and management, including nominating and inviting speakers. I am excited to be a part of this initiative and to contribute to the advancement of responsible and inclusive Open Science practices.
I am proud to be a core member of the Turing Way, an open-source, community-driven guide to reproducible, ethical, inclusive and collaborative data science developed by the Alan Turing Institute. In February 2020, the Turing Way expanded to a series of books covering reproducible research, project design, communication, collaboration, and ethical research. I am leading the localization and translation team, working to make the guide available in multiple languages using Crowdin and GitHub. I am excited to be a part of this initiative and contribute to the advancement of responsible and inclusive data science practices.
I am honored to be a mentor and co-facilitator in the Open Life Science (OLS) program, which is based on the Mozilla Open Leader Program and aims to help individuals and stakeholders become Open Science ambassadors. I have been a mentor for both OLS5 and OLS6. One of the projects I mentored, led by Biandri Joubert, aimed to create an open educational resource that demonstrates the "why" for people from legal backgrounds who might want to learn R. The project is partly a case for open and reproducible research in a field that does not typically use R and quantitative methods. It also involves creating different data sets derived from commonly used legal sources that law students or graduates would be familiar with, and incorporating them into a single platform with code and a few examples of practical use and application. I am excited to be a part of this initiative and to contribute to the advancement of open and reproducible research in the legal field.
I am proud to have founded the Open Science Community Saudi Arabia (OSCSA), with the mission to build capacity and promote open science and open innovation in Arabic-speaking countries. The community aims to provide a platform for researchers, practitioners, and educators to share knowledge, collaborate, and support each other in the adoption of open science principles and practices. The OSCSA community offers a variety of activities, such as workshops, training, meetups, and online discussions, to introduce and contextualize open science practices in Arabic-speaking countries. The community also provides resources, such as guides, templates, and online tools, to help researchers in these countries implement open science practices. As the founder of OSCSA, I have been actively involved in building the community from the ground up by recruiting members, organizing events, and creating resources to support open science in the region. My goal is to make open science more accessible and inclusive for researchers in Arabic-speaking countries, and to promote collaboration and knowledge sharing among researchers in the region.
I am honored to be a Content Subject Matter Expert (SME) in NASA's TOPS (Transform to Open Science) initiative, which is designed to promote scientific innovation, transparency, and reproducibility. As part of my role, I have been actively involved in designing the Open Tools module, which is a component of a Massive Open Online Course (MOOC) on open science. The MOOC will be hosted on the Open edX platform, making it accessible to a wide audience. I am excited to be a part of this initiative and to contribute to the development of educational resources that will help researchers to adopt open science practices in their work.
I am proud to be working with the Education Committee of the International Society for Computational Biology (ISCB) to co-organise hands-on tutorials as part of the ISCB Academy. Our goal is to provide researchers and students with the knowledge and skills they need to perform cutting-edge computational biology research. The tutorials are designed to be interactive and engaging, with a focus on practical applications of computational biology methods. One of the key goals of this project is to implement FAIR principles in the design of the tutorials, which we hope will promote transparency and reproducibility in computational biology research.
I had the opportunity to spend a month in two cities in Uganda, Hoima and Kampala, as part of the Liverpool-Mulago Partnership. The partnership is a collaboration between the University of Liverpool and Mulago National Referral Hospital in Kampala, Uganda that aims to improve maternal and child health outcomes in Uganda. During my time there, I had the opportunity to assist in the hospital and provide support to the medical staff. In addition, I was involved in collecting and analyzing data regarding infant mortality in the region. The data collected will be used to identify the main causes of infant mortality and to develop strategies to reduce the rate of infant mortality in Uganda. This experience was extremely valuable to me as it gave me the opportunity to learn about the challenges that people in the region face and the impact of poor maternal and child health on their lives.
I am proud to be a certified instructor for The Carpentries, a global non-profit organization that teaches computer programming and data science skills to researchers through instructional workshops. The Carpentries is made up of three program areas: Software Carpentry, Data Carpentry, and Library Carpentry. As a certified instructor, I have been involved in teaching workshops to researchers, providing them with the skills they need to analyze and manage data effectively. In addition to teaching workshops, I am also working with the lesson development team to develop a new lesson about single-cell RNA-seq in the Carpentries-incubator. Single-cell RNA-seq is a cutting-edge technology that allows researchers to study gene expression at the level of individual cells. The new lesson will provide researchers with the knowledge and skills they need to analyze single-cell RNA-seq data and gain insights into the biology of individual cells. I am excited to be a part of this initiative and to contribute to the development of educational resources that will help researchers to advance the field of single-cell RNA-seq research.
I recently had the opportunity to co-organize the Global useR Conference 2021, a non-profit conference organized by community volunteers for the R community. The conference brought together around 2000 attendees from academia and industry, including R developers, data scientists, business intelligence specialists, analysts, statisticians and students. As a member of the organizing committee, I played an active role in the planning and execution of the conference. I was in charge of the newbie's event, which was designed to provide a supportive environment for individuals who are new to the R community or have never contributed to it. The event helped to introduce new members to the R community and provided them with the resources and support they need to become active contributors. I was also part of the Code of Conduct response team, which was responsible for ensuring that the conference was a safe and inclusive space for all attendees. I am proud to have been a part of this initiative and to have contributed to the success of the Global useR Conference 2021. It was an incredible opportunity to connect with the R community and to learn from experts in the field.
I am the founder and organizer of the first local chapter of R-Ladies in the Arab states of the Arabian Gulf. R-Ladies is an organization that promotes gender diversity in the R community worldwide. As the founder and organizer of the local chapter, I am dedicated to creating a welcoming and inclusive space for individuals of all genders to learn about the R programming language, algorithms, and advanced tools. I also collaborate with other R-Ladies chapters in North Africa to promote reproducibility with R to scientists in the Arab nations. Through our meetings, whether in person or virtual, members have the opportunity to learn from experienced R users, network with peers, and become part of a supportive and encouraging community. I am passionate about fostering diversity and inclusivity in the field of data science and technology, and I am proud to have created a local chapter of R-Ladies that reflects these values. I look forward to continuing to build and grow this community and to supporting its members in their journey to learn and use the R programming language.
I am always happy to talk! Here are some of my recent talks with links to the materials.
- Batool Almarzouq. (2022, May 27). UNESCO Talk: Saudi Arabia effort towards adopting Open Science Practices inline with Vision 2030. Zenodo. https://doi.org/10.5281/zenodo.6586783 A short talk delivered at UNESCO (Slides)
- Batool Almarzouq. (2022, March 11). Reflection on Open Science practices and Research Software in the Kingdom in alignment with Saudi Arabia's Vision 2030. Zenodo. https://doi.org/10.5281/zenodo.6345895 A talk presented to Research Software Alliance (ReSA) as part of Asia Pacific Advanced Network (APAN53) (Slides)
- Batool Almarzouq. (2022, April 11). Our Journey Towards the Adoption of Open Science (OS). Zenodo. https://doi.org/10.5281/zenodo.6450357 A talk delivered to eLife Community (Slides)
- Batool Almarzouq. (2021, November 11). The Turing Way: Four Selfish reasons to work openly. Zenodo. https://doi.org/10.5281/zenodo.5674321 A talk introducing Open Science and the Turing Way in the Danish Diabetes Academy (DDA) Winter School (Slides)
- Anelda van der Walt, Batool Almarzouq, Malvika Sharan, Yo Yehudi, Nelsy Mtsweni, & Asmaa Nofal. (2021, September 22). How can open educational video resources be made more accessible for remixing and translation beyond posting on YouTube?. Zenodo. https://doi.org/10.5281/zenodo.5521150 This presentation was given at the Creative Commons Summit 2021 (Slides)
- Batool Almarzouq. (2022, August 1). Open Science approach to increase the discoverability of the local research outputs (Arabic). Zenodo. https://doi.org/10.5281/zenodo.6948435 A talk delivered to useR Oman Community (Slides)
- Almarzouq, Batool, Karoune, Emma, & Sharan, Malvika. (2022, June 9). Are best practices applicable to your project? Contextualising The Turing Way for the global Community. [Workshop materials in Arabic and English]. Zenodo. https://doi.org/10.5281/zenodo.6627260 A session delivered at RightsCon 2022 (Slides)
- Batool Almarzouq. (2022). Leveraging Communities to Advance R and Open Science (Arabic only). https://doi.org/10.5281/zenodo.6449331 A talk delivered to JeelAIDM Community (Slides)
- Batool Almarzouq. (2022, March 17). Practices to Improves Visibility and Outputs for Postgraduate Students in Saudi UK network (Arabic/English). Zenodo. https://doi.org/10.5281/zenodo.6366449 A talk presented to the Saudi UK Network (Slides)
- Batool Almarzouq. (2021, October 6). Challenges of Developing Open Science Communities and Resources in the Middle East. Zenodo. https://doi.org/10.5281/zenodo.5552035 A talk presented in the Open Source Community Call co-hosted by FORCE11, Dryad, and eLife (Slides)
- Batool Almarzouq. (2021, April 4). An Open Science Approach to Machine Learning in Biomedical Research. Zenodo. https://doi.org/10.5281/zenodo.4662096 A talk delivered to the Saudi Data Community (Slides)
- Batool Almarzouq. (2021, June 13). Open Science Community in Saudi Arabia. Zenodo. https://doi.org/10.5281/zenodo.4940010 A talk delivered to Open Life Science Program (Slides)
- Batool Almarzouq. (2021, July 6). Make Your Computational Analysis Citable. Zenodo. https://doi.org/10.5281/zenodo.5075932 A lightning talk presented in useR!2021 Conference (Slides)
I enjoy simplifying complex open science concepts and programming. Here is a list of workshops with links to the materials.
- Batool Almarzouq. (2022, August 13). Demystifying the Command line (Arabic). Zenodo. https://doi.org/10.5281/zenodo.6988763 A 60-minute training delivered to JeelAIDM (Slides)
- Batool Almarzouq, Hussain Alsalman, Monah Abou Alezz, Abulrahman Alasiri, Haifa Ben Messaoud, Ammar Alkhaldi, & Abulrahman Alswaji. (2022, July 2). Beginners Guide for R through Open Data Science Practices (Arabic). Zenodo. https://doi.org/10.5281/zenodo.6796234 A training based on the Carpentries Workshops (Slides)
- Batool Almarzouq, & Hussain Alsalman. (2022, April 25). All you need to know to master GitHub without the command line (Arabic only). Zenodo. https://doi.org/10.5281/zenodo.6484771 A training delivered to ArabR Community (Slides)
- Batool Almarzouq. (2021, April 4). Collaborating on Open Data Science Projects. Zenodo. https://doi.org/10.5281/zenodo.4662095 This workshop was delivered as a part of WiDS Saudi Arabia (Slides)
I occasionally write blog posts since it allows me to merge my love for writing with everything else I'm passionate about. I write about coding and deep learning, and try to simplify scientific concepts and make them easier to learn, understand and use, without sacrificing their power and usefulness.
Please feel free to contact me at firstname.lastname@example.org.
I occasionally take on freelance opportunities.
Have an exciting project where you need some help?
Send me over a message, and let's chat!