Usage
1. Sample search
- Bar search: Enter keywords such as sample ID, sample name, country, wetland type, or sampling medium (e.g., water or soil) to batch search samples. The search is case-insensitive.
- Map search: Select samples by clicking on individual sampling points on the map or by using the Drawing Selection tool to define and select a region of interest. You can also use the checkbox in the upper-left corner to display sampling points by type.
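As a rough illustration of the two search modes above, the Python sketch below matches a keyword case-insensitively against a few text fields and selects samples inside a rectangular region. The record layout and the field names (`sample_id`, `country`, `lat`, `lon`, and so on) are hypothetical, and the site's actual implementation is not documented here.

```python
# Hypothetical field names; the real sample schema may differ.
TEXT_FIELDS = ("sample_id", "sample_name", "country", "wetland_type", "medium")


def search_samples(samples, keyword):
    """Return samples where any text field contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [
        s for s in samples
        if any(needle in str(s.get(f, "")).casefold() for f in TEXT_FIELDS)
    ]


def select_in_box(samples, south, west, north, east):
    """Return samples whose coordinates fall inside a rectangular region.

    Assumes the box does not cross the 180 degree meridian.
    """
    return [
        s for s in samples
        if south <= s["lat"] <= north and west <= s["lon"] <= east
    ]
```

A drawn region that is not rectangular would need a point-in-polygon test instead of the simple bounds check used here.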
2. MAG search
- Bar search: Enter keywords such as MAG ID, sample ID or taxonomic name to batch search MAGs. The search is case-insensitive.
- Map search: View MAGs associated with sampling points by clicking on individual points on the map or by using the Drawing Selection tool to select a region of interest. You can also use the checkbox in the upper-left corner to display sampling points by type.
- Filter: Use the filter to narrow search results based on MAG completeness, contamination, GC content, N50, and genome size.
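The filter can be thought of as a set of range checks on MAG quality metrics. The sketch below is only an illustration: the field names and the idea of passing `(low, high)` bounds are assumptions, not the site's interface.

```python
def filter_mags(mags, **bounds):
    """Keep MAGs whose metrics fall inside every (low, high) bound given.

    Each keyword is a hypothetical field name such as 'completeness' or 'n50';
    use None for an open bound, e.g. contamination=(None, 5).
    """
    kept = []
    for mag in mags:
        inside = True
        for field, (low, high) in bounds.items():
            value = mag[field]
            if low is not None and value < low:
                inside = False
            if high is not None and value > high:
                inside = False
        if inside:
            kept.append(mag)
    return kept
```

For example, `filter_mags(mags, completeness=(90, None), contamination=(None, 5))` would keep only MAGs with at least 90% completeness and at most 5% contamination, assuming both values are stored as percentages.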
3. Blast
Query nucleotide or protein sequences in FASTA format against a custom gene database built from all genes in our collection. The search uses the BLAST algorithm to identify homologous sequences and reports alignment scores, E-values, and percent identities to support functional annotation and comparative analyses.
- Methods
| Method | Input type | Database type | Function |
|---|---|---|---|
| blastn | Nucleic acid sequence | Nucleic acid database | The blastn application searches a nucleotide query against nucleotide subject sequences or a nucleotide database. |
| tblastn | Protein sequence | Nucleic acid database | The tblastn application searches a protein query against nucleotide subject sequences or a nucleotide database translated at search time. |
- Parameters
| Parameter | Function | Available method |
|---|---|---|
| E-value | Expect value (E) for saving hits. | blastn, tblastn |
| Max target sequences | Number of aligned sequences to keep. | blastn, tblastn |
| Word size | Word size for initial match. | blastn, tblastn |
| Reward | Reward for a nucleotide match. | blastn |
| Penalty | Penalty for a nucleotide mismatch. | blastn |
| Percent identity | Percent identity cutoff. | blastn |
| Threshold | Minimum score to add a word to the BLAST lookup table. | tblastn |
| Window size | Multiple hits window size, use 0 to specify 1-hit algorithm. | tblastn |
- Input format
Each gene or protein sequence should begin with a header line starting with '>', followed by one or more lines containing the sequence. Multiple sequences can be included in one file, with a maximum of 10 entries.
example file
- Result format
This description of the result format follows the presentation on Metagenomics Wiki.
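For readers who want to check a query file or reproduce a search locally, the Python sketch below does two things: it parses FASTA text and enforces the 10-entry limit, and it maps the parameters from the table in this section onto the corresponding BLAST+ command-line options (`-evalue`, `-max_target_seqs`, `-word_size`, `-reward`, `-penalty`, `-perc_identity`, `-threshold`, `-window_size`). The function names and the database name passed in are hypothetical, and the site's own defaults are not known, so no default values are filled in.

```python
def parse_fasta(text, max_entries=10):
    """Parse FASTA text into (header, sequence) pairs and enforce the entry limit."""
    records, header, chunks = [], None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    if not records:
        raise ValueError("no FASTA entries found")
    if any(not seq for _, seq in records):
        raise ValueError("a header line has no sequence")
    if len(records) > max_entries:
        raise ValueError(f"at most {max_entries} entries are allowed")
    return records


# Parameter availability as listed in the table; names are BLAST+ option names.
SHARED = {"evalue", "max_target_seqs", "word_size"}
ALLOWED = {
    "blastn": SHARED | {"reward", "penalty", "perc_identity"},
    "tblastn": SHARED | {"threshold", "window_size"},
}


def build_blast_command(method, query_path, db, **params):
    """Build an argument list for a local BLAST+ run (nothing is executed here)."""
    if method not in ALLOWED:
        raise ValueError(f"unsupported method: {method}")
    command = [method, "-query", query_path, "-db", db]
    for name, value in params.items():
        if name not in ALLOWED[method]:
            raise ValueError(f"{name} is not available for {method}")
        command += [f"-{name}", str(value)]
    return command
```

The returned list can be passed to `subprocess.run` on a machine where BLAST+ is installed and a local database has been built; the web service itself may run the search differently.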
4. Module mapper
Map a user-provided KO list to the 511 KEGG modules and evaluate whether each step in the module satisfies the specified conditions. The results are displayed as a nested semi-circular plot, with KOs in the inner ring and module steps in the outer ring.
- Input format
Provide KOs via the input box or by uploading a .txt file, with one KO per line. Alternatively, upload a .annotations file, which must adhere to the original output format produced by eggNOG-mapper.
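The two accepted inputs can be reduced to the same KO set. The sketch below is illustrative only: it assumes the eggNOG-mapper v2 layout, in which metadata lines start with `##`, the header row starts with `#`, and a tab-separated `KEGG_ko` column holds values such as `ko:K00001,ko:K00002` or `-`. The function names are hypothetical.

```python
import re

KO_PATTERN = re.compile(r"K\d{5}")


def kos_from_list(text):
    """Read a plain KO list with one KO identifier per line."""
    kos = set()
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue
        if not KO_PATTERN.fullmatch(entry):
            raise ValueError(f"not a KO identifier: {entry!r}")
        kos.add(entry)
    return kos


def kos_from_annotations(text):
    """Collect KOs from the KEGG_ko column of an eggNOG-mapper .annotations file."""
    kos, column = set(), None
    for line in text.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # blank line or metadata comment
        fields = line.split("\t")
        if line.startswith("#"):
            # Header row, e.g. "#query<TAB>seed_ortholog<TAB>..."
            if "KEGG_ko" in fields:
                column = fields.index("KEGG_ko")
            continue
        if column is None:
            raise ValueError("KEGG_ko column not found in header")
        if column < len(fields):
            kos.update(KO_PATTERN.findall(fields[column]))
    return kos
```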
example file
- Step condition
| Symbol | Meaning | Logical operator | Description |
|---|---|---|---|
| space | Pathway step | AND | Separates consecutive steps in a pathway or module. Order implies the sequence of reactions. |
| + | Molecular complex | AND | Connects subunits of a molecular complex. All components are required for function. |
| , | Alternative | OR | Separates homologous genes or proteins. Only one of the components is required. |
| - | Optional component | Optional | Indicates a non-essential component within a complex or signature. |
| () | Grouping | N/A | Groups expressions to define the order of logical evaluation (precedence). |
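To make the step conditions concrete, here is a minimal Python sketch of how a module definition string could be evaluated against a KO set. It is not the tool's actual implementation. It assumes a precedence that the text does not state: `,` binds loosest, then space, then `+` and `-`, with parentheses overriding all of them. It also assumes there are no spaces directly inside parentheses or around operators, and that steps are the space-separated parts at the top level. The function names and the example definition are hypothetical.

```python
import re

_TOKEN = re.compile(r"K\d{5}|[(),+\-]|\s+")


def tokenize(expression):
    tokens = _TOKEN.findall(expression)
    if "".join(tokens) != expression:
        raise ValueError(f"unexpected characters in {expression!r}")
    return [" " if tok.isspace() else tok for tok in tokens]


class _Parser:
    def __init__(self, tokens, kos):
        self.tokens, self.kos, self.pos = tokens, kos, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        if self.pos >= len(self.tokens):
            raise ValueError("unexpected end of expression")
        self.pos += 1
        return self.tokens[self.pos - 1]

    def alternatives(self):  # a,b : OR
        result = self.sequence()
        while self.peek() == ",":
            self.take()
            rhs = self.sequence()
            result = result or rhs
        return result

    def sequence(self):  # a b : AND over consecutive steps
        result = self.complex()
        while self.peek() == " ":
            self.take()
            rhs = self.complex()
            result = result and rhs
        return result

    def complex(self):  # a+b : AND; a-b : b is optional
        result = self.atom()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.atom()
            if op == "+":
                result = result and rhs
            # '-' marks an optional component: parsed, but never required
        return result

    def atom(self):
        tok = self.take()
        if tok == "(":
            result = self.alternatives()
            if self.take() != ")":
                raise ValueError("missing closing parenthesis")
            return result
        if tok.startswith("K"):
            return tok in self.kos
        raise ValueError(f"unexpected token {tok!r}")


def evaluate(expression, kos):
    """Return True if the KO set satisfies the expression."""
    tokens = tokenize(expression.strip())
    parser = _Parser(tokens, set(kos))
    result = parser.alternatives()
    if parser.pos != len(tokens):
        raise ValueError("trailing tokens in expression")
    return result


def module_steps(definition, kos):
    """Split a definition on top-level spaces and evaluate each step."""
    steps, depth, current = [], 0, []
    for ch in definition.strip():
        depth += (ch == "(") - (ch == ")")
        if ch.isspace() and depth == 0:
            if current:
                steps.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        steps.append("".join(current))
    return [(step, evaluate(step, kos)) for step in steps]
```

For instance, with the made-up definition `(K00844,K12407) K01810 (K00850+K00851-K00852)`, a KO set containing K00844, K01810, K00850 and K00851 satisfies all three steps, because K12407 is an alternative and K00852 is optional.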
Acknowledgments
This work was supported by grants from the National Natural Science Foundation of China (NSFC nos. 32025024, 92251305 and 32430070 to L.C.; 32101246 to C.X.; 32301286 to S.L.), the Zhejiang Provincial NSFC (LZ24C030001 to L.C., LQ22C030006 to C.X., LQ24C030001 to J.X.) and the Academy of Ecological Civilization of Zhejiang University.
We gratefully acknowledge the large collective effort expended in administering, gathering, sequencing and uploading these datasets into the public domain. The datasets utilized in this study were obtained from the following public projects:
- NCBI projects
  - PRJNA983538
  - PRJNA870682
  - PRJNA1283286
  - PRJNA1068274
  - PRJNA788992
  - PRJNA1013564
  - PRJNA1109798
  - PRJNA1140737
  - PRJNA554750
  - PRJEB36558
  - PRJNA745574
- NGDC projects
- JGI/IMG
Contact
Lei Cheng, Ph.D., Professor
Email: lcheng@zju.edu.cn
College of Life Sciences
Zhejiang University
Zhejiang 310058, China
Yuxiang Wei, Ph.D. candidate
Email: 12407146@zju.edu.cn
College of Life Sciences
Zhejiang University
Zhejiang 310058, China
