A community-driven computational framework for standardizing, collecting, and submitting microbial biotransformation data to ChEMBL — making xenobiotic metabolism research FAIR and reproducible.
What is BioXend?
Microbial biotransformation of xenobiotics — the process by which microorganisms chemically modify drugs, environmental contaminants, and other foreign compounds — is poorly represented in public databases due to a lack of reporting standards. BioXend fixes that.
Work Packages
Develop community-consensus minimum reporting standards (MIX-MB) covering chemicals, microbes, and biotransformation assays.
Automate metadata collection from user inputs, external APIs (ChEMBL, NCBI Taxonomy), and Bioschemas-aligned identifiers.
Build a Nextflow DSL2 pipeline that generates seven ChEMBL-ready submission files from simple tabular inputs.
Release BioXend as part of the NFDI4Microbiota workflow ecosystem for broad community adoption and long-term maintenance.
MIX-MB Standards
Three interconnected sub-standards define exactly what metadata must be reported for reproducible and database-ready experiments. The standards are versioned and updated continuously, and feedback is welcome.
Top-level overview of all three sub-standards, version matrix, identifier scheme, ChEMBL field mappings, and FAIR alignment. Start here to understand the full MIX-MB framework.
View MIX-MB Standards overview →
Minimum metadata for the chemical substrate of biotransformation reactions: structure, identity, and classification.
Minimum metadata for the microbial agent: taxonomy, culture conditions, and sample provenance — aligned to Bioschemas Taxon profile.
Minimum metadata for the experimental assay and activity: conditions, analytical methods, products, and activity measurements.
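As a rough illustration of what a minimum reporting standard implies in practice, the sketch below checks a record for the metadata categories named in the three descriptions above. The category names come from those descriptions. The nested-dictionary layout, the function name `missing_categories`, and the example record are assumptions made for illustration and are not the actual MIX-MB field definitions.

```python
# Category names follow the three sub-standard descriptions; the record layout
# and all identifiers here are hypothetical, not the real MIX-MB schema.
REQUIRED_CATEGORIES = {
    "chemical": ("structure", "identity", "classification"),
    "microbe": ("taxonomy", "culture_conditions", "sample_provenance"),
    "assay": ("conditions", "analytical_methods", "products", "activity_measurements"),
}


def missing_categories(record):
    """Return {sub_standard: [missing category, ...]} for a nested dict record."""
    gaps = {}
    for sub_standard, categories in REQUIRED_CATEGORIES.items():
        section = record.get(sub_standard, {})
        # A category counts as missing if it is absent or empty (None, "", [], {}).
        absent = [name for name in categories if not section.get(name)]
        if absent:
            gaps[sub_standard] = absent
    return gaps


# Made-up example record with two incomplete sections.
example = {
    "chemical": {
        "structure": "CC(=O)Oc1ccccc1C(=O)O",
        "identity": "aspirin",
        "classification": "drug",
    },
    "microbe": {"taxonomy": "Escherichia coli", "culture_conditions": "anaerobic, 37 C"},
    "assay": {"conditions": "24 h incubation", "analytical_methods": "LC-MS"},
}

print(missing_categories(example))
# {'microbe': ['sample_provenance'], 'assay': ['products', 'activity_measurements']}
```

A check of this kind reports gaps per sub-standard, so a contributor can see which part of a submission still needs metadata before it is database-ready.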
Contribute
MIX-MB is community-driven. Your expertise in microbiology, metabolomics, cheminformatics, or data science directly shapes what gets standardized. Every form of participation counts.
No GitHub account needed. Share your opinion on which fields are truly essential for biotransformation data.
Open Survey →
Propose a new field, report a bug, or flag missing vocabulary terms via GitHub Issues. Templates are provided.
Open an Issue →
Have a broader idea or question? GitHub Discussions is the place for open-ended proposals and community debate.
Go to Discussions →
Browse the Project Board and add a thumbs-up to issues and PRs you support. Community endorsement drives the standards forward.
View Project Board →
Comfortable with Git? Submit PRs against the devel branch. Standards changes require a 14-day review period.
Explore community survey results and analysis to see what the field considers most important to standardize.
Browse Analysis →
Nextflow Pipeline
A Nextflow pipeline that generates seven ChEMBL-ready submission files from a single filled-out Template_open.ods.
Pipeline Steps
Reads publication metadata from the Reference sheet
Output: REFERENCE.tsv, README.toml
Validates SMILES and generates 2D structures via RDKit
Output: COMPOUND_RECORD.tsv, COMPOUND_CTAB.sdf
Builds assay entries per organism and condition
Output: ASSAY.tsv
Links experimental parameters to each assay by AIDX
Output: ASSAY_PARAM.tsv
Links compounds to assays and writes activity records
Output: ACTIVITY.tsv
Quickstart
Requires Nextflow ≥ 22.10.0 and Docker.
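To make the assay and parameter steps listed under Pipeline Steps more concrete, here is a minimal Python sketch of how rows for ASSAY.tsv and ASSAY_PARAM.tsv could be linked through a shared AIDX. It is not the pipeline's actual code. Only the AIDX key and the two file names come from this page. The other column names, the AIDX format, the function names, and the choice to attach every parameter to every assay are assumptions made for illustration.

```python
import csv
import io
from itertools import product


def build_assays(organisms, conditions):
    """One assay row per (organism, condition) pair, each with a unique AIDX."""
    assays = []
    for i, (organism, condition) in enumerate(product(organisms, conditions), start=1):
        assays.append({
            "AIDX": f"A{i:03d}",  # hypothetical identifier format
            "ASSAY_ORGANISM": organism,
            "ASSAY_DESCRIPTION": f"Biotransformation by {organism} ({condition})",
        })
    return assays


def build_assay_params(assays, params):
    """Attach every parameter to every assay, keyed by AIDX (a simplification)."""
    return [
        {"AIDX": a["AIDX"], "TYPE": p["TYPE"], "VALUE": p["VALUE"], "UNITS": p["UNITS"]}
        for a in assays
        for p in params
    ]


def to_tsv(rows):
    """Serialize a non-empty list of dicts as tab-separated text with a header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up inputs for illustration only.
organisms = ["Escherichia coli", "Bacteroides thetaiotaomicron"]
conditions = ["aerobic", "anaerobic"]
params = [
    {"TYPE": "TEMPERATURE", "VALUE": "37", "UNITS": "degC"},
    {"TYPE": "INCUBATION_TIME", "VALUE": "24", "UNITS": "h"},
]

assays = build_assays(organisms, conditions)
assay_params = build_assay_params(assays, params)
print(to_tsv(assays))
print(to_tsv(assay_params))
```

Because each ASSAY_PARAM row carries only the AIDX of its assay, the two tables stay normalized: assay descriptions live in one file and conditions in the other, joined by that key.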
Team & Funding
Maintainer · EMBL, Molecular Systems Biology Unit
ORCID: 0000-0003-2127-4150
Collaborator · European Bioinformatics Institute (EBI)
ebi.ac.uk/about/teams/chemical-biology-services/
BioXend is funded by the NFDI4Microbiota FlexFund 2026. NFDI4Microbiota is part of Germany's National Research Data Infrastructure (NFDI) initiative, dedicated to making microbiome research data FAIR and accessible.