This tool uses an XGBoost machine learning model to predict effective siRNA candidates, then aligns each candidate against a reference transcriptome to check specificity and minimize off-target effects.
1. Design Parameters
Note: This parameter sets the sensitivity of the off-target search. A larger value (e.g., 3) flags more potential off-target sites, including more distant matches, so filtering is stricter and the retained siRNAs are safer. A smaller value (e.g., 1) screens more loosely.
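To illustrate why a larger value is stricter, here is a minimal sketch that lists the sites in a transcript lying within a given number of mismatches of a candidate. It assumes the parameter is a maximum mismatch count. The names `hamming` and `find_sites` and the toy sequences are hypothetical, and a plain Hamming-distance scan stands in for the aligner the tool actually uses, which is not described here.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_sites(query: str, transcript: str, max_mismatches: int):
    """Return (position, mismatches) for every window within the tolerance."""
    k = len(query)
    sites = []
    for i in range(len(transcript) - k + 1):
        d = hamming(query, transcript[i:i + k])
        if d <= max_mismatches:
            sites.append((i, d))
    return sites


# Toy data (hypothetical): one exact site plus some near matches.
query = "ACGTACGT"
transcript = "TTACGTACGTTTACGAACGTTTACCAAGGT"
for tolerance in (0, 1, 3):
    print(tolerance, len(find_sites(query, transcript, tolerance)))
```

Every site found at a given tolerance is also found at any higher tolerance, so the number of reported sites can only grow as the value increases. More candidates are then rejected as potentially off-target.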
2. Input Sequence
Or paste nucleotide sequence:
Supported formats: Raw sequence or FASTA format.
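As a rough sketch of how the two input formats could be normalized to a single sequence string, the function below drops a FASTA header line if one is present and strips whitespace. The function name `parse_input`, the accepted alphabet (A, C, G, T, U, N), and the single-record restriction are assumptions for illustration, not documented behavior of the tool.

```python
def parse_input(text: str) -> str:
    """Return an uppercase nucleotide sequence from raw or FASTA text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]

    # FASTA header lines start with '>'; this sketch accepts at most one record.
    headers = [ln for ln in lines if ln.startswith(">")]
    if len(headers) > 1:
        raise ValueError("expected a single sequence record")

    # Join the sequence lines and remove any internal whitespace.
    seq = "".join("".join(ln.split()) for ln in lines if not ln.startswith(">"))
    seq = seq.upper()
    if not seq:
        raise ValueError("no sequence found")

    # Assumed alphabet: DNA or RNA bases plus N for unknown positions.
    invalid = sorted(set(seq) - set("ACGTUN"))
    if invalid:
        raise ValueError(f"unexpected characters: {invalid}")
    return seq


print(parse_input(">gene1 example\nacgt\nACGU\n"))  # prints ACGTACGU
```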
Algorithm & Model
siRNA efficacy prediction is based on an XGBoost model trained on a large, experimentally validated dataset (test R² = 0.45, ROC AUC = 0.84). Key features include:
- Sequence composition: Nucleotide frequencies at specific positions.
- Thermodynamic features: GC content (local and global), Tm value (melting temperature).
- Position preferences: Specific rules derived from Reynolds et al. and Ui-Tei et al. (e.g., A/U at the 5' end of the antisense strand).
- Motif filtering: Exclusion of candidates containing immune-stimulatory motifs or toxic sequences.
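The feature groups above can be sketched as follows. This is an illustration under stated assumptions, not the tool's feature set: the Wallace rule is a rough stand-in for the unspecified Tm calculation, the 7-nt local GC window is arbitrary, and the blocked motifs (UGUGU, GUCCUUCAA, homopolymer runs of four or more) are examples reported in the literature rather than the tool's actual list. All function names are hypothetical.

```python
import re

# Illustrative motif list (assumption): not the tool's actual filter.
BLOCKED_MOTIFS = ("UGUGU", "GUCCUUCAA")
HOMOPOLYMER = re.compile(r"A{4,}|C{4,}|G{4,}|U{4,}")


def to_rna(seq: str) -> str:
    return seq.upper().replace("T", "U")


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    s = to_rna(seq)
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq: str) -> int:
    """Wallace rule estimate in degrees C: 2 per A/U plus 4 per G/C."""
    s = to_rna(seq)
    at = s.count("A") + s.count("U")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def features(antisense: str) -> dict:
    """Build a flat feature dict for one antisense strand (5' to 3')."""
    s = to_rna(antisense)
    feats = {
        "gc_global": gc_content(s),
        "gc_5p_local": gc_content(s[:7]),  # assumed window size
        "tm_wallace": wallace_tm(s),
        "au_at_5p": s[0] in "AU",          # A/U at the antisense 5' end
    }
    # Position-specific nucleotide indicators (one-hot encoding).
    for i, nt in enumerate(s, start=1):
        for base in "ACGU":
            feats[f"pos{i}_{base}"] = int(nt == base)
    return feats


def passes_motif_filter(seq: str) -> bool:
    """True if the sequence contains none of the illustrative motifs."""
    s = to_rna(seq)
    return not any(m in s for m in BLOCKED_MOTIFS) and HOMOPOLYMER.search(s) is None


f = features("UAGCUUACGGAUCCAGUCA")
print(f["tm_wallace"], f["au_at_5p"])  # prints 56 True
```

A dictionary like this would be converted to a numeric vector before being passed to a gradient-boosted model. The motif check is kept separate because it removes candidates outright rather than contributing to the score.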
Specificity check: Candidate sequences are aligned against the transcriptome of the selected species. Only siRNAs specifically targeting the input gene (with no off-target hits within the specified mismatch tolerance) are retained.
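A minimal sketch of this retention rule: every window of the input transcript is treated as a candidate target site, and a candidate is kept only if no other transcript contains a site within the mismatch tolerance. The function names, the dictionary-based transcriptome, and the default 19-nt window are assumptions for illustration. A production pipeline would use an indexed aligner rather than this brute-force scan.

```python
from typing import Dict, List


def mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def off_target_hits(site: str, transcriptome: Dict[str, str],
                    target_id: str, max_mismatches: int) -> List[str]:
    """IDs of non-target transcripts containing a near match to the site."""
    k = len(site)
    hits = []
    for tid, seq in transcriptome.items():
        if tid == target_id:
            continue  # matches to the input gene itself are expected
        if any(mismatches(site, seq[i:i + k]) <= max_mismatches
               for i in range(len(seq) - k + 1)):
            hits.append(tid)
    return hits


def specific_candidates(target_id: str, transcriptome: Dict[str, str],
                        k: int = 19, max_mismatches: int = 1) -> List[str]:
    """Target sites of length k with no off-target hit within the tolerance."""
    target = transcriptome[target_id]
    keep = []
    for i in range(len(target) - k + 1):
        site = target[i:i + k]
        if not off_target_hits(site, transcriptome, target_id, max_mismatches):
            keep.append(site)
    return keep


# Toy transcriptome (hypothetical) with short windows for readability.
toy = {"geneA": "ACGTTGCA", "geneB": "TTTACGTAGG"}
print(specific_candidates("geneA", toy, k=4, max_mismatches=0))
# prints ['CGTT', 'GTTG', 'TTGC', 'TGCA']
print(specific_candidates("geneA", toy, k=4, max_mismatches=1))
# prints ['TGCA']
```

The sites are compared in the sense orientation. This is equivalent to checking where the antisense strand could bind, since the antisense strand is the reverse complement of its target site.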
References
- Vert JP, Foveau N, Lajaunie C, Vandenbrouck Y. An accurate and interpretable model for siRNA efficacy prediction. BMC Bioinformatics. 2006 Nov 30;7:520.
- Huesken D, et al. Design of a genome-wide siRNA library using an artificial neural network. Nat Biotechnol. 2005 Aug;23(8):995-1001.
- Katoh T, Suzuki T. Specific residues at every third position of siRNA shape its efficient RNAi activity. Nucleic Acids Res. 2007;35(4):e27.
- Bai Y, Zhong H, Wang T, Lu ZJ. OligoFormer: an accurate and robust prediction method for siRNA design. Bioinformatics. 2024 Oct 1;40(10):btae577.