Classification of PH Domain-containing Proteins
Pleckstrin Homology (PH) domains are the 11th most common domain in the human proteome and were first noted in pleckstrin, which contains two such regions. These domains generally have low sequence identity but share a conserved structural superfold: seven beta strands followed by one alpha helix at the C-terminus. Each PH domain contains about 120 amino acids, and proteins containing PH domains are involved in intracellular signaling or serve as critical constituents of the cytoskeleton.
One of the most important features of these proteins is that they bind phosphatidylinositol (PtdIns or PI) lipids, such as phosphatidylinositol (4,5)-bisphosphate (PIP2) and phosphatidylinositol (3,4,5)-trisphosphate (PIP3), which recruits the proteins to different cell membranes. PI consists of a water-soluble myo-inositol head group linked by a glycerol moiety to two different fatty acid chains, usually a saturated C18 tail in the 1-position and a tetra-unsaturated C20 tail in the 2-position. Unphosphorylated PIs synthesized in the endoplasmic reticulum are transported to other membranes by PI transfer proteins. PIs anchor in cell membranes through their two lipid tails, whereas the water-soluble inositol head group binds proteins and regulates their activity.
We analyzed 250 PH domain proteins and built deep learning models to determine which PIs they can bind. This provides insight into their potential biological functions in cells. The Venn diagram below illustrates the distribution of these proteins in terms of their PI binding properties. Click on the numbers in the figure to see the lists of PH domain proteins that bind the respective PI lipids (best viewed with screen width >1300px). Additionally, we have created a complete, dynamic classification tree map to illustrate the hierarchical relationships among these PH domain proteins.