In-silico identification and phylogenetic analysis of auxin efflux carrier gene family in Setaria italica L.

1 National Institute of Plant Genome Research, Aruna Asaf Ali Marg, 110067 New Delhi, India. 2 Istituto Agrario San Michele all’Adige, Research and Innovation Centre, Foundation Edmund Mach, Trento, Italy. 3 Dipartimento di Biologia Vegetale, Viale Mattioli, 10125, University of Turin, Italy. 4 Istituto Agrario San Michele all’Adige, Research and Innovation Centre, Foundation Edmund Mach, Trento, Italy.


INTRODUCTION
In the model plant Arabidopsis thaliana, auxin plays a crucial role in regulating and coordinating plant growth and is involved in many developmental processes, including embryogenesis, meristem maintenance, organogenesis, lateral root initiation, vascular tissue differentiation and tropisms. Specific auxin influx carriers (AUX/LAX proteins) and efflux carriers (PIN and PGP/MDR proteins) mediate a directional, active, cell-to-cell auxin transport, creating auxin concentration maxima in specific tissues or cells. PIN auxin efflux carriers play a major role in mediating and regulating polar auxin transport (PAT), creating the auxin gradients that provide positional information for cell and tissue development (Benkova et al., 2003; Michniewicz et al., 2007; Reinhardt et al., 2000).
In A. thaliana, there are eight PIN genes (AtPIN1-AtPIN8) coding for proteins that differ in the length of the hydrophilic loop in the middle of their polypeptide chain (Krecek et al., 2009a; Zazimalova et al., 2007). The long PIN proteins of Arabidopsis, viz. PIN1, PIN4 and PIN7, show plasma membrane localization, and their polar localization determines the direction of auxin flux (Friml, 2010). Three PIN proteins, PIN5, PIN6 and PIN8, have a shorter central hydrophilic domain, and both PIN5 and PIN8 have been shown to localize in the endoplasmic reticulum, suggesting a possible role in regulating intracellular auxin homeostasis (Wabnik et al., 2010; Wabnik et al., 2011). The classification of AtPIN6 is more controversial, since it has a partially reduced hydrophilic loop with high sequence similarities at the transmembrane regions (Krecek et al., 2009a; Mravec et al., 2009). In addition to the eight AtPIN proteins, the Arabidopsis genome encodes seven PIN-like genes, which form a separate cluster and whose role is yet to be determined (Paponov et al., 2005).
Many homologous PIN genes have been well characterized in monocot species such as rice (Oryza sativa) and maize (Zea mays). Both specific features and homologies between monocot and Arabidopsis (eudicot) PIN families have been shown. Monocot-specific features comprise both sequence clustering in phylogenetic analyses and expression patterns at the transcript and protein level. In rice, sequence analysis of the 12 PIN genes present in the genome showed that rice has four PIN1 genes and one OsPIN2, while no OsPIN protein was grouped into the AtPIN3, AtPIN4 and AtPIN7 cluster. Four OsPIN genes encode rice PIN proteins with a short central hydrophilic domain: three OsPIN5 and one OsPIN8. Furthermore, three OsPIN proteins appear to be monocot-specific: OsPIN9, OsPIN10a and OsPIN10b. OsPIN9 has a central hydrophilic domain intermediate in length between the long and short PINs of Arabidopsis, and its expression analysis at the transcript level suggests a possible function in adventitious root differentiation. OsPIN10a and OsPIN10b have a long central hydrophilic domain (Carraro et al., 2006; Forestan et al., 2012; Forestan and Varotto, 2010, 2012; Xu and Scheres, 2005). So far, three PIN1 genes have been described in maize using an antibody raised against the AtPIN1 protein (Forestan and Varotto, 2010). Recent studies of PIN genes in Sorghum bicolor revealed the presence of 11 PIN genes; at least three members were grouped in the AtPIN1 cluster and another three in the AtPIN5 cluster (Shen et al., 2010; Wang et al., 2010).
S. italica [(L.) P. Beauv.], commonly known as foxtail millet, is one of the most cultivated millet species and is grown worldwide, including in India, China, Japan, Australia, and North and South America (Devos et al., 1998). Foxtail millet is a diploid grass with a small genome (≈515 Mb), and its draft sequence has been published recently (Bennetzen et al., 2012). The major phytohormone auxin is central to plant growth and development. The availability of a publicly accessible genome sequence of S. italica led us to search for auxin efflux carrier (PIN) genes in silico. Here, we used bioinformatics and comparative genomics approaches to identify these genes in S. italica.

MATERIALS AND METHODS
Auxin efflux carrier (PIN) genes of S. italica were identified from the plant genome database (http://www.plantgdb.org) and the phytozome database (www.phytozome.net) (Dong et al., 2004; Duvick et al., 2008; Goodstein et al., 2012). To identify PIN genes, orthologous auxin efflux carrier genes from A. thaliana were used as search queries. Arabidopsis PIN genes were downloaded from "The Arabidopsis Information Resource" (http://www.arabidopsis.org/). A hidden Markov model approach was used to find the auxin efflux carrier genes of S. italica (Altschul et al., 1997). The identified SiPIN genes were confirmed again by running BLASTP searches in "The Arabidopsis Information Resource", and the presence of auxin efflux carrier domains was confirmed using the SWISS-MODEL Workspace (www.swissmodel.expasy.org/workspace/). The identified SiPIN genes were named according to their BLASTP similarity to A. thaliana AtPIN genes. TMMOD (The Hidden Markov Model for Transmembrane Protein Topology Prediction) (http://www.cbs.dtu.dk/services/TMHMM/) analyses were carried out to confirm the presence of transmembrane domains in SiPIN proteins (Kahsay et al., 2005; Kahsay et al., 2004). Orthologous PIN genes from A. thaliana (AtPIN), O. sativa (OsPIN), Physcomitrella patens (PpPIN), Populus trichocarpa (PtPIN) and S. bicolor (SbPIN) were used to analyze protein sequence similarity and to construct phylogenetic trees. OsPIN genes were downloaded from The TIGR Rice Genome Annotation Resource (Ouyang et al., 2007), whereas PpPIN, PtPIN and SbPIN genes were downloaded from the plant genome database and the phytozome database. Multiple alignments of PIN genes from the above-mentioned species were carried out using the online software Multalin (http://multalin.toulouse.inra.fr/multalin/). The phylogenetic tree was constructed using MEGA 5.2 software.
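The similarity-based naming step described above can be illustrated with a minimal sketch. It assumes the BLASTP results are available in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score). The function name, the candidate identifiers and all scores below are hypothetical and serve only as an illustration; they are not taken from this study.

```python
import csv
import io


def best_arabidopsis_hits(blast_tabular: str) -> dict:
    """Return {query: (subject, bit_score)} keeping the top-scoring hit per query.

    Assumes 12-column tabular BLAST output; the bit score is the last column.
    """
    best = {}
    reader = csv.reader(io.StringIO(blast_tabular), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query, subject, bit_score = row[0], row[1], float(row[11])
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return best


# Hypothetical rows: identifiers and numbers are invented for illustration.
rows = [
    ("cand1", "AtPIN1", "70.1", "600", "150", "5", "1", "600", "1", "620", "0.0", "820"),
    ("cand1", "AtPIN2", "60.3", "590", "200", "8", "1", "595", "1", "640", "1e-180", "640"),
    ("cand2", "AtPIN5", "55.0", "350", "140", "4", "1", "350", "1", "360", "1e-110", "390"),
    ("cand2", "AtPIN8", "58.2", "355", "130", "3", "1", "355", "1", "365", "1e-120", "410"),
]
example = "\n".join("\t".join(r) for r in rows)

hits = best_arabidopsis_hits(example)
for query, (subject, score) in hits.items():
    print(query, "is most similar to", subject, "with bit score", score)
```

In this sketch each candidate would be named after its top-scoring Arabidopsis hit; choosing the bit score rather than the e-value as the ranking criterion is a design assumption, since very strong hits often share an e-value of 0.0.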

RESULTS AND DISCUSSION
Genome-wide analysis of the S. italica genome led to the identification of 12 auxin efflux carrier (SiPIN) genes (Table 1). This result shows that Setaria has the same number of PIN genes as rice and four more PIN genes than A. thaliana. The major genome assembly of S. italica is arranged in 336 scaffolds. The first nine scaffolds are pseudomolecules, and 98.9% of the sequence data is contained in these nine pseudomolecules. In addition, the Setaria genome has 35,471 loci containing 40,599 protein-coding transcripts (Bennetzen et al., 2012). The 12 identified S. italica auxin efflux carrier genes are distributed across eight scaffolds. Scaffold five contains four auxin efflux carrier genes (SiPIN4a, SiPIN5a, SiPIN5b and SiPIN8). The largest SiPIN gene was SiPIN2, with an ORF (open reading frame) length of 1890 nucleotides, present in scaffold 4, whereas the smallest was SiPIN5c, present in scaffold 6. Among the 12 SiPIN genes, seven (SiPIN1a, SiPIN1b, SiPIN4a, SiPIN4b, SiPIN4c, SiPIN4d and SiPIN8) contained five introns each, and SiPIN2 and SiPIN5d contained six introns each (Figure 1). The SiPIN1 transcript organization matched that of OsPIN1 and AtPIN1, indicating their close homology (Wang et al., 2009).
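As a small worked example, the relation between an ORF length in nucleotides and the length of the encoded protein can be sketched as follows. Whether a reported ORF length includes the stop codon is an assumption that changes the result by one residue, so the hypothetical helper below exposes it as a parameter; the values it returns are arithmetic illustrations, not figures taken from Table 1.

```python
def protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumption: the ORF is a whole number of codons. If the reported length
    includes the stop codon, that codon does not contribute a residue.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# The 1890-nucleotide ORF mentioned in the text spans 630 codons.
with_stop = protein_length(1890)                          # stop codon counted in the ORF
without_stop = protein_length(1890, includes_stop=False)  # stop codon not counted
print(with_stop, without_stop)
```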

The phylogenetic tree was constructed with the neighbor-joining method; phylogeny was tested by the bootstrap method with 500 replications, using amino acid substitutions under the Jones-Taylor-Thornton (JTT) model.
Multiple alignment of the amino acid sequences shows conserved N- and C-terminal domains (Supplementary Figure 1). The N-terminal region shows a conserved S-P/T-P motif, a potential target phosphorylation site for mitogen-activated protein kinases (MAPK) (Sinha et al., 2011). The central hydrophilic loop is variable and differs among the PIN proteins in sequence, but some PIN proteins share a conserved T-P-R motif within this variable region (Supplementary Figure 1). The T-P-R motif is a target phosphorylation site of mitogen-activated protein kinase 3 and mitogen-activated protein kinase 6 (Sorensson et al., 2012). The T-P-R motif is conserved only in the long PIN auxin efflux carriers. This suggests that, although the central hydrophilic loop is diverse in sequence, its phosphorylation sites are conserved to carry out specific functions, and that protein phosphorylation has been conserved during evolution.
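A scan for the two motifs discussed above can be sketched in a few lines. This is an illustrative sketch, not the procedure used in this study: it interprets the S-P/T-P motif as a serine or threonine immediately followed by proline, which is an assumption, and the function name and the toy sequence are hypothetical.

```python
import re

# Assumed interpretation: S-P/T-P means Ser or Thr directly followed by Pro.
ST_P_SITE = re.compile(r"[ST]P")
TPR_SITE = re.compile(r"TPR")


def find_motifs(protein: str) -> dict:
    """Return 1-based start positions of S/T-P and T-P-R motifs in a sequence."""
    seq = protein.upper()
    return {
        "S/T-P": [m.start() + 1 for m in ST_P_SITE.finditer(seq)],
        "T-P-R": [m.start() + 1 for m in TPR_SITE.finditer(seq)],
    }


# Toy sequence invented for illustration; it is not a real PIN protein.
toy = "MASPLLTPRGGSAKTPV"
sites = find_motifs(toy)
print(sites)
```

Every T-P-R occurrence is also reported as an S/T-P site, because the first two residues of T-P-R match the shorter pattern. Positions are 1-based so that they can be compared directly with residue numbering in an alignment.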
The PIN-Formed (PIN) proteins are a plant-specific family of transmembrane proteins that transport the phytohormone auxin as their substrate. Very limited data are available to suggest that auxin is a signaling molecule of ancient origin. The PIN gene family is found only in the genomes of land plants. PIN proteins act as regulators and play key roles in developmental processes, including embryogenesis, morphogenesis and organogenesis (Krecek et al., 2009b). The number of PIN genes present in S. italica (12) is equal to that of rice (12) and greater than that of Arabidopsis (8), suggesting that the additional PIN genes may have some extra role in development and morphogenesis. The predicted structure of a PIN protein is similar to the structure of membrane transport proteins that use the electrochemical gradient across the membrane to transport molecules. All the identified PIN proteins have two hydrophobic domains with cytoplasmic orientation. The transmembrane helices of the hydrophobic domains are highly conserved in their amino acid sequence, but substantial differences are present between the long and short PINs. The hydrophobic domains of all long PIN proteins contain amino acids at invariant positions, but these positions are not invariant in short PINs. The invariant amino acids in long PINs may play major roles that have not been retained in short PINs. The loops between the transmembrane helices within the hydrophobic domains exhibit dynamic variability in size and sequence.

CONCLUSION
S. italica, popularly known as foxtail millet, is one of the best-studied millet species in the world. The genome sequencing project of this plant is going to be completed in the near future. This will open the door to further research on this plant at the molecular level. The auxin efflux carrier genes identified in this report will help in understanding the role of auxin signaling in growth and development, as well as in the response to different biotic and abiotic stresses. Phylogenetic analysis shows that auxin efflux carrier genes are conserved among species of the grass family.

Figure 1. Transcript organization of SiPIN genes. Blue boxes indicate the exons and lines indicate the introns of the respective SiPIN genes. The arrows indicate the direction of transcript expression.

Table 1. Phytozome locus ID and transcript information of SiPIN genes. SiPIN genes were named according to BLASTP results against The Arabidopsis Information Resource database.

ID | Gene name | ORF length (nucleotides) | Number of amino acids | Number of introns | 5'-3' coordinates
SiPIN2 clustered with SbPIN2 and OsPIN2; SiPIN8 clustered with SbPIN8 and OsPIN8. In group II, SiPIN5a clustered with SbPIN5b and OsPIN5b; SiPIN5c clustered with SbPIN5c and OsPIN5c; SiPIN5d clustered with SbPIN5a and OsPIN5c; SiPIN5b clustered with SbPIN3 and OsPIN9. In group III, AtPIN6 clustered with PtPIN6. No gene from Setaria or any other grass falls in this cluster, showing diversification of PIN genes. Cluster analysis indicates that S. italica PIN genes are much closer to the PIN genes of the grasses Sorghum bicolor and Oryza sativa.