Advancing Flowering-Time Gene Identification: A Breakthrough in Machine Learning Models

05/09/2024 TranSpread

A research team built seven machine learning models using Support Vector Machine (SVM) algorithms to distinguish flowering-time-associated genes (FTAGs) from non-FTAGs, with the SVM-Kmer-PC-PseAAC model performing best (F1 score = 0.934, accuracy = 0.939, and receiver operating characteristic = 0.943). They created 'FTAGs_Find', a plant FTAG prediction tool, and used it to identify 318,521 FTAGs in the protein datasets of 81 species. Notably, in Ostreococcus lucimarinus, a non-flowering plant, only 208 FTAGs were predicted, indicating extensive FTAG loss. They also constructed an FTAG database, the Flowering-time Gene Database (FTGD), which gives users access to the FTAG prediction tool and the FTAG datasets. Plans involve expanding FTGD with more datasets and exploring other machine learning (ML) methods, enhancing resources for breeders and researchers in the flowering-time community.
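For readers less familiar with the reported metrics, the following minimal Python sketch shows how accuracy and F1 score are derived from the counts of correct and incorrect predictions. The counts are hypothetical and chosen only for illustration; the release does not report the study's confusion matrix, and the function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, NOT taken from the study:
# tp = FTAGs correctly flagged, fn = FTAGs missed,
# fp = non-FTAGs wrongly flagged, tn = non-FTAGs correctly rejected.
example = classification_metrics(tp=90, fp=10, tn=880, fn=20)
```

Because non-FTAGs greatly outnumber FTAGs in datasets of this kind, accuracy alone can look high even when many FTAGs are missed, which is one reason an F1 score is usually reported alongside it.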

Flowering marks a pivotal shift from vegetative to reproductive phases in higher plants, impacting crop yield and overall plant fitness. While substantial progress has been made in understanding flowering mechanisms, identifying FTAGs remains challenging. Current methods rely on costly, time-consuming and labor-intensive wet-lab experiments or resource-intensive omics technologies. Existing bioinformatics tools like BLAST+ lack comprehensive information for accurate gene recognition. In response, ML emerges as a promising solution, yet no ML model exists for FTAGs' protein sequences.

A study (DOI: 10.48130/tp-0024-0007) published in Tropical Plants on 03 April 2024 develops an ML model for precisely identifying proteins encoded by FTAGs, enhancing research efficiency in flowering-time studies.

To construct the SVM classification model for predicting FTAGs, 628 positive and 8,163 negative protein sequences underwent data preprocessing. The dataset was divided into training and test sets: 80% of the dataset was used to construct the SVM prediction model, while the remaining 20% formed the test set for evaluating it. Seven feature sets were used to train the SVM prediction model (ACC, Kmer, PC-PseAAC, Kmer-ACC, ACC-PC-PseAAC, Kmer-PC-PseAAC, and ACC-Kmer-PC-PseAAC), and each model was optimized using a grid search over the kernel, gamma, and cost parameters. Among the models, SVM-Kmer-PC-PseAAC demonstrated superior performance.

Subsequently, a local Python tool, 'FTAGs_Find', was developed based on this model, enabling proteome-wide identification of FTAGs. The tool identified 318,521 FTAGs from 2,873,697 protein sequences across 81 species. Notably, species like Sphagnum fallax exhibited significant FTAG expansion, while non-flowering plants like Ostreococcus lucimarinus showed minimal FTAG presence. Further, GO enrichment analysis in Brassica rapa revealed FTAG involvement in various flower development processes. Additionally, the constructed prediction model demonstrated an 88% recognition rate for flowering-time-related genes in B. rapa, enhancing confidence in its accuracy and reliability. Finally, the FTGD (www.sagsanno.top:8080/FTGD) was established, offering user-friendly tools for FTAG prediction, dataset browsing, and submission, aiming to facilitate comprehensive research in the field.
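The k-mer feature extraction and the 80/20 data split described above can be illustrated with a short Python sketch. It is not the FTAGs_Find code: the function names, the choice of k = 2, the toy sequence, and the per-class (stratified) split are all assumptions made for illustration. The ACC and PC-PseAAC features, the SVM training, and the grid search are not shown.

```python
import random
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_features(sequence, k=2):
    """Return normalized k-mer frequencies in a fixed vocabulary order."""
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Only k-mers made of standard residues contribute to the total.
    total = sum(counts[kmer] for kmer in vocabulary)
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices so each class keeps roughly the same train/test ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(indices)
        cut = int(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


# Toy protein sequence (hypothetical) turned into a 400-dimensional vector.
features = kmer_features("MKVLAAGMKV", k=2)

# Class sizes from the release: 628 positives (1) and 8,163 negatives (0).
# Whether the study split each class separately is an assumption here.
labels = [1] * 628 + [0] * 8163
train_idx, test_idx = stratified_split(labels)
```

Splitting each class separately keeps the roughly 1:13 ratio of FTAGs to non-FTAGs the same in the training and test sets, so the test metrics are not distorted by an unlucky draw of the rarer positive class.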

According to the study's lead researcher, Zhidong Li, “We are confident that the FTGD will prove to be a valuable and user-friendly resource for all researchers.”

In summary, this study used SVM algorithms to distinguish FTAGs from non-FTAGs with high accuracy, leading to the development of 'FTAGs_Find' for proteome-wide FTAG identification. Large-scale analysis across 81 species revealed evolutionary patterns in FTAGs, and the FTGD was established to make the tool and datasets easy to access. Looking ahead, the goal is to expand FTGD with additional datasets and explore advanced machine learning techniques to further refine the prediction model. This refinement will enhance its utility for the scientific community and contribute to broader insights into plant flowering mechanisms.

###

References

DOI

10.48130/TP-2023-0023

Original Source URL

https://doi.org/10.48130/TP-2023-0023

Funding information

This work was supported by the National Natural Science Foundation of China (32172614) and the Hainan Province Science and Technology Special Fund (ZDYF2023XDNY050). The authors thank the anonymous editor and reviewers for their valuable comments and suggestions.

About Tropical Plants

Tropical Plants (e-ISSN 2833-9851) is the official journal of Hainan University and published by Maximum Academic Press. Tropical Plants undergoes rigorous peer review and is published in open-access format to enable swift dissemination of research findings, facilitate exchange of academic knowledge and encourage academic discourse on innovative technologies and issues emerging in tropical plant research.

Title of original paper: FTGD: a machine learning method for flowering-time gene prediction
Authors: Junyu Zhang, Shuang He, Wenquan Wang, Fei Chen* and Zhidong Li*
Journal: Tropical Plants
Latest article publication date: 22 November 2023
Subject of research: Not applicable
COI statement: The authors declare that they have no competing interests.
Attached files
  • Fig.1 FTGD platform build flowchart.
Regions: North America, United States, Asia, China
Keywords: Applied science, Engineering, Science, Life Sciences
