Design siRNA (SiPro)

This tool uses an XGBoost machine learning model to predict siRNA candidates with high knockdown efficacy, then aligns each candidate against a reference transcriptome to check specificity and minimize off-target effects.

1. Design Parameters

Note: This parameter sets the mismatch tolerance of the off-target search. A larger value (e.g., 3) also flags sites that differ from the siRNA at more positions, so more candidates are rejected; filtering is stricter and the retained siRNAs are safer. A smaller value (e.g., 1) flags only near-identical sites, so screening is looser.
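To illustrate why a larger value makes the screen stricter, the sketch below counts how many distinct sequences lie within a given number of mismatches of a single siRNA. It assumes a 21-nt siRNA and substitutions only (no insertions or deletions); the function name `neighborhood_size` is illustrative and not part of the tool.

```python
from math import comb

def neighborhood_size(length: int, max_mismatches: int) -> int:
    """Count sequences within max_mismatches substitutions of one sequence."""
    # Choose i positions to change; each changed position has 3 alternative bases.
    return sum(comb(length, i) * 3**i for i in range(max_mismatches + 1))

# Assumed siRNA length of 21 nt, for illustration only.
for k in (1, 2, 3):
    print(k, neighborhood_size(21, k))
# 1 64
# 2 1954
# 3 37864
```

Under these assumptions, moving from a tolerance of 1 to 3 enlarges the set of sequences treated as potential off-target sites from 64 to 37,864 per siRNA, which is why fewer candidates survive.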

2. Input Sequence

Or paste nucleotide sequence:

Supported formats: Raw sequence or FASTA format.
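The two input formats can be normalized to a single sequence string before design. The following is a minimal sketch, not the tool's actual parser: it assumes a single FASTA record, and the accepted alphabet (A, C, G, T, U, N) and the function name `read_sequence` are assumptions made for illustration.

```python
def read_sequence(text: str) -> str:
    """Return an uppercase nucleotide string from raw or single-record FASTA text."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    # A FASTA record starts with a '>' header line; raw input has none.
    if lines and lines[0].startswith(">"):
        lines = lines[1:]
    seq = "".join(lines).replace(" ", "").upper()
    # Assumed alphabet; a second '>' header (multi-record FASTA) is rejected here.
    invalid = set(seq) - set("ACGTUN")
    if not seq or invalid:
        raise ValueError(f"empty or invalid sequence: {sorted(invalid)}")
    return seq

print(read_sequence(">gene1 example\nacgt\nACGU\n"))  # ACGTACGU
print(read_sequence("acg tac"))                       # ACGTAC
```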

Algorithm & Model

siRNA efficacy prediction is based on an XGBoost model trained on a large, experimentally validated dataset (test R² = 0.45, ROC AUC = 0.84). Key features include:

  • Sequence composition: Nucleotide frequencies at specific positions.
  • Thermodynamic features: GC content (local and global), Tm value (melting temperature).
  • Position preferences: Specific rules derived from Reynolds et al. and Ui-Tei et al. (e.g., A/U at the 5' end of the antisense strand).
  • Motif filtering: Exclusion of immune-stimulatory motifs and toxic sequences.
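The feature groups above can be sketched as a small extraction function. This is an illustration under stated assumptions, not the model's actual feature set: the Wallace rule (2 °C per A/U, 4 °C per G/C) stands in for whichever Tm method the tool uses, the 7-nt local window is arbitrary, and the motif list is supplied by the caller because the page does not name the filtered motifs.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq: str) -> int:
    """Rough melting temperature via the Wallace rule (an assumed stand-in)."""
    at = seq.count("A") + seq.count("U") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def features(antisense: str, window: int = 7, motifs: tuple = ()) -> dict:
    """Build an illustrative feature dict for one antisense strand (5' to 3')."""
    s = antisense.upper().replace("T", "U")
    feats = {
        "gc_global": gc_content(s),
        "gc_5p_local": gc_content(s[:window]),      # local GC at the 5' end
        "tm_wallace": wallace_tm(s),
        "starts_with_AU": s[0] in "AU",             # A/U at the antisense 5' end
        "has_flagged_motif": any(m in s for m in motifs),
    }
    # One-hot encoding of the nucleotide at each position.
    for i, base in enumerate(s):
        for b in "ACGU":
            feats[f"pos{i + 1}_{b}"] = int(base == b)
    return feats

f = features("UUCGAAGUACUCAGCGUAA")  # hypothetical 19-nt antisense strand
print(f["tm_wallace"], f["starts_with_AU"], f["pos1_U"])  # 54 True 1
```

A dictionary like this could be turned into one row of the matrix passed to a gradient-boosted model; the real tool may encode these features differently.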

Specificity check: Candidate sequences are aligned against the transcriptome of the selected species. Only siRNAs specifically targeting the input gene (with no off-target hits within the specified mismatch tolerance) are retained.
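The filtering logic can be sketched with a brute-force scan. This is a toy stand-in for the alignment step, whose actual aligner is not named here: the function names, the 12-nt site length, and the two made-up transcripts are all hypothetical, and candidates are written as target-site sequences in the same orientation as the transcripts.

```python
def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def has_off_target(site, transcriptome, target_id, max_mismatches):
    """True if any non-target transcript holds a window within the tolerance."""
    n = len(site)
    for tid, seq in transcriptome.items():
        if tid == target_id:
            continue  # hits in the input gene itself are intended
        for i in range(len(seq) - n + 1):
            if mismatches(site, seq[i:i + n]) <= max_mismatches:
                return True
    return False

def specific_candidates(candidates, transcriptome, target_id, max_mismatches):
    """Keep only candidates with no off-target hit within the tolerance."""
    return [c for c in candidates
            if not has_off_target(c, transcriptome, target_id, max_mismatches)]

# Made-up toy transcriptome, for illustration only.
transcriptome = {
    "geneA": "AUGGCUAGCUAGGAUCCGAUCGAUUAGC",
    "geneB": "CCGAUGGCUAGCAAGGAUCCGUU",
}
sites = ["GGCUAGCUAGGA", "CCGAUCGAUUAG"]  # both taken from geneA

print(specific_candidates(sites, transcriptome, "geneA", 0))
# ['GGCUAGCUAGGA', 'CCGAUCGAUUAG']
print(specific_candidates(sites, transcriptome, "geneA", 1))
# ['CCGAUCGAUUAG']
```

The first site has a one-mismatch neighbor in geneB, so it survives a tolerance of 0 but is discarded at a tolerance of 1. A production pipeline would replace the nested loops with an indexed short-read aligner, since scanning every window of every transcript does not scale.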

References

  1. Vert JP, Foveau N, Lajaunie C, Vandenbrouck Y. An accurate and interpretable model for siRNA efficacy prediction. BMC Bioinformatics. 2006 Nov 30;7:520.
  2. Huesken D, et al. Design of a genome-wide siRNA library using an artificial neural network. Nat Biotechnol. 2005 Aug;23(8):995-1001.
  3. Katoh T, Suzuki T. Specific residues at every third position of siRNA shape its efficient RNAi activity. Nucleic Acids Res. 2007;35(4):e27.
  4. Bai Y, Zhong H, Wang T, Lu ZJ. OligoFormer: an accurate and robust prediction method for siRNA design. Bioinformatics. 2024 Oct 1;40(10):btae577.