C22orf31

Identifiers
Aliases	C22orf31, HS747E2A, bK747E2.1, chromosome 22 open reading frame 31
External IDs	HomoloGene: 81840; GeneCards: C22orf31; OMA:C22orf31 - orthologs
Gene location (Human)
Chr.	Chromosome 22 (human)
End	29,061,831 bp
RNA expression pattern
Top expressed in	testicle; pancreatic ductal cell; sperm; right testis; left testis; buccal mucosa cell; putamen; external globus pallidus; caudate nucleus; nucleus accumbens
Orthologs
Species	Human
Entrez	25770
Ensembl	ENSG00000100249
UniProt	O95567
RefSeq (mRNA)	NM_015370; NM_001386866
RefSeq (protein)	NP_056185

C22orf31 (chromosome 22 open reading frame 31) is a protein which in humans is encoded by the C22orf31 gene. The C22orf31 mRNA transcript has an upstream in-frame stop codon, while the protein has a domain of unknown function (DUF4662) spanning the majority of the protein-coding region.^[3] The protein has orthologs with high percent similarity in mammals.^[4] The most distant orthologs are found in species of bony fish, but no orthologs have been identified in birds or amphibians.

Like many proteins, C22orf31 is most highly expressed in the testes. Expression analysis has revealed increased levels of C22orf31 in in vivo matured oocytes,^[5] while promoter analysis has identified predicted binding sites for transcription factors that are active during myeloid cell differentiation.^[6]

Gene

C22orf31 is located on the minus strand of chromosome 22 at 22q12.1.^[7] The gene is 3,172 base pairs long, spanning chr22:29,058,672 to 29,061,844.^[8] C22orf31 contains 3 exons and is also known by the aliases bK747E2.1 and HS747E2A.

Transcript

There is one transcript of C22orf31. The mRNA sequence is 1,070 base pairs long and contains an upstream in-frame stop codon at nucleotides 122–124.^[9]
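
To illustrate what "upstream in-frame" means, the following minimal sketch scans the region upstream of a start codon for stop codons that share its reading frame. The function name, the toy sequence, and the start position are hypothetical and chosen only for illustration; they are not taken from the C22orf31 transcript.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_in_frame_stops(mrna: str, cds_start: int) -> list[int]:
    """Return 1-based positions of stop codons upstream of, and in frame with, the CDS.

    cds_start is the 1-based position of the first nucleotide of the start codon.
    """
    seq = mrna.upper().replace("U", "T")  # accept RNA or DNA alphabets
    start0 = cds_start - 1                # convert to a 0-based index
    hits = []
    # Step backwards from the start codon three nucleotides at a time,
    # so every codon examined is in the same frame as the start codon.
    for i in range(start0 - 3, -1, -3):
        if seq[i:i + 3] in STOP_CODONS:
            hits.append(i + 1)            # report as a 1-based position
    return sorted(hits)


# Toy sequence (hypothetical): 2 nt, a TGA stop, one spacer codon, then ATG.
toy = "GG" + "TGA" + "CCC" + "ATG" + "AAATTTTAA"
print(upstream_in_frame_stops(toy, cds_start=9))
```

A stop codon reported by such a scan indicates that translation cannot begin further upstream in the same frame, which is commonly taken as support for the annotated start codon.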

Protein

General properties

The protein encoded by C22orf31 is 290 amino acids in length, with a predicted molecular mass of 33 kDa.^[10] The predicted isoelectric point of the protein is 10, indicating that the protein is basic and carries a net positive charge at physiological pH. The C22orf31 protein contains a domain of unknown function (DUF4662) spanning amino acids 2–263.^[11] The secondary and tertiary structures of this protein have not been well characterized.
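
As a rough sanity check on the predicted molecular mass, a protein's mass can be approximated from its length alone. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; this average is an assumption, not the method used by the cited tool, which computes mass from the actual sequence.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not sequence-specific).
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Length of the C22orf31 protein as stated in the text above.
print(f"{approx_mass_kda(290):.1f} kDa")  # prints "31.9 kDa"
```

The estimate is close to, though slightly below, the predicted 33 kDa; a sequence-based calculation accounts for the actual residue composition.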

Isoforms

In addition to the reference protein, C22orf31 has two predicted protein isoforms, X1 and X2.^[12] A comparison of these proteins is shown in the table below.

C22orf31 Isoforms
Protein	Accession #	Size (AA)	Features
C22orf31 [Homo sapiens]^[13]	NP_056185	290	DUF4662 (AA 2-263)
Uncharacterized protein C22orf31 isoform X1 [Homo sapiens]^[14]	XP_016884230	249	DUF4662 (AA 1-221)
Uncharacterized protein C22orf31 isoform X2 [Homo sapiens]^[15]	XP_005261548	186	DUF4662 (AA 40-158)
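
The share of each sequence occupied by DUF4662 can be worked out from the coordinates in the table above. The following is a small sketch using only those table values; the helper name `domain_coverage` is illustrative.

```python
# (length in amino acids, domain start, domain end), copied from the isoform table.
isoforms = {
    "NP_056185": (290, 2, 263),
    "XP_016884230": (249, 1, 221),
    "XP_005261548": (186, 40, 158),
}


def domain_coverage(length: int, start: int, end: int) -> float:
    """Fraction of the protein covered by a domain with inclusive coordinates."""
    return (end - start + 1) / length


for accession, (length, start, end) in isoforms.items():
    print(f"{accession}: {domain_coverage(length, start, end):.1%}")
# NP_056185: 90.3%
# XP_016884230: 88.8%
# XP_005261548: 64.0%
```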

Composition

The protein derived from C22orf31 is somewhat rich in lysine and somewhat poor in phenylalanine compared to the composition of the average human protein.^[16] No positively charged, negatively charged, mixed-charge, or uncharged segments were identified in C22orf31. The protein also has no transmembrane segments or signal peptides.

Regulation

Gene level regulation

Transcription factor binding sites

The C22orf31 promoter contains many predicted transcription factor binding sites.^[6] Transcription factors that bind the C22orf31 promoter are commonly found in immortalized liver cancer cell lines (HepG2) and immortalized myelogenous leukemia cell lines (K562).^[17] The presence of a binding site for C/EBP epsilon suggests a possible role for C22orf31 in myeloid cell differentiation. The presence of a binding site for ARNT, which is typically associated with hypoxia-inducible factor 1 alpha, suggests a possible role for C22orf31 in the development of acute myeloblastic leukemia.^[18]

Expression

C22orf31 shows moderate expression in the testes and low expression in the brain and ovaries.^[19] It is expressed in both fetal and adult tissues. Expression of C22orf31 is increased in in vivo matured oocytes in comparison to metaphase II oocytes.^[5]

Transcript level regulation

No microRNA binding sites have been predicted in C22orf31.^[20] Three functionally important stem loops are predicted in each of the 3' UTR and 5' UTR of C22orf31.^[21]

Protein level regulation

C22orf31 is predicted to undergo several types of post-translational modification. It is predicted with high confidence to undergo O-glycosylation,^[22] glycation,^[23] phosphorylation,^[24] and O-GlcNAcylation.^[25] Only two of the predicted phosphorylation sites are located in highly conserved regions of the protein. These modifications can be seen in the conceptual translation on the right.

Homology/evolution

Paralogs

No human paralogs for C22orf31 have been identified.^[26]

Orthologs

Orthologs of the C22orf31 protein exist predominantly in mammals.^[4] The most distant orthologs are found in bony fish, and no orthologs have been identified in amphibians or birds. Major taxonomic groups represented among C22orf31 orthologs include Bovidae, Eulipotyphla, Cetacea, and Diprotodontia.

The table below lists 22 C22orf31 orthologs alongside the human protein, organized first by ascending date of divergence and second by descending percent identity with human C22orf31.

C22orf31 Orthologs
Genus species	Common Name	Taxon	Date of Divergence (MYA)^[27]	Accession #^[4]	Length (AA)^[4]	% identity w/ human^[4]	% similarity w/ human
Homo sapiens	Human	Hominidae	0	NP_056185.1	290	100	100
Miniopterus natalensis	Natal Long-fingered Bat	Chiroptera	94	XP_016054130.1	301	78.45	82.1
Physeter catodon	Sperm whale	Cetacea	94	XP_023976708.1	307	75.68	78.8
Bison bison bison	Bison	Bovidae	94	XP_010827019.1	292	75	79.5
Mustela putorius furo	Domestic ferret	Mustelidae	94	XP_012918895.1	395	73.31	60.4
Ovis aries	Sheep	Bovidae	94	XP_027836065.1	315	73.2	72.7
Suricata suricatta	Meerkat	Carnivora	94	XP_029777390.1	296	72.39	81.1
Manis javanica	Malayan pangolin	Manidae	94	XP_017520770.1	302	72.3	78.2
Lagenorhynchus obliquidens	Pacific white-sided dolphin	Cetacea	94	XP_026981083.1	307	71.14	76
Orcinus orca	Killer whale	Cetacea	94	XP_004283847.1	271	68.62	72.6
Globicephala melas	Long-finned pilot whale	Cetacea	94	XP_030715704.1	287	68.28	74.1
Neophocaena asiaeorientalis	Yangtze finless porpoise	Cetacea	94	XP_024623713.1	324	66.04	70.2
Sorex araneus	European shrew	Eulipotyphla	94	XP_004615674.1	325	64.11	63.1
Condylura cristata	Star-nosed mole	Eulipotyphla	94	XP_004690724.1	347	62.54	59.2
Loxodonta africana	African bush elephant	Paenungulates	102	XP_023415096.1	536	78.52	46.6
Chrysochloris asiatica	Cape golden mole	Afrosoricida	102	XP_006869362.1	460	77.7	53.9
Dasypus novemcinctus	Nine-banded armadillo	Xenarthrans	102	XP_023445504.1	305	75.44	79
Echinops telfairi	Small Madagascar hedgehog	Afrosoricida	102	XP_012863338.2	300	68.01	73.4
Phascolarctos cinereus	Koala	Diprotodontia	160	XP_020852397.1	302	49.19	60.8
Vombatus ursinus	Common wombat	Diprotodontia	160	XP_027718888.1	378	48.87	48.8
Myripristis murdjan	Pinecone soldierfish	Vertebrata	433	XP_029922652.1	184	48.98	27
Cottoperca gobio	Cottoperca	Vertebrata	433	XP_029301846.1	171	34.04	22.4
Astyanax mexicanus	Mexican tetra	Vertebrata	433	XP_022533372.1	208	26.36	26.3
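
The ordering used in the table above (ascending date of divergence, then descending percent identity) can be reproduced with a two-part sort key. The sketch below uses a handful of rows copied from the table; the `Ortholog` type is a hypothetical container introduced for the example.

```python
from typing import NamedTuple


class Ortholog(NamedTuple):
    species: str
    divergence_mya: int
    identity_pct: float


# A few rows from the ortholog table, deliberately shuffled.
rows = [
    Ortholog("Vombatus ursinus", 160, 48.87),
    Ortholog("Bison bison bison", 94, 75.0),
    Ortholog("Loxodonta africana", 102, 78.52),
    Ortholog("Miniopterus natalensis", 94, 78.45),
    Ortholog("Phascolarctos cinereus", 160, 49.19),
    Ortholog("Homo sapiens", 0, 100.0),
]

# Primary key: divergence date ascending; secondary key: identity descending.
ordered = sorted(rows, key=lambda r: (r.divergence_mya, -r.identity_pct))

for r in ordered:
    print(f"{r.species}\t{r.divergence_mya}\t{r.identity_pct}")
```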

Divergence

Compared with fibrinogen alpha chain and cytochrome c, C22orf31 is a moderately evolving protein. This was determined by calculating the corrected percent divergence of different orthologs of each protein, using molecular clock equations,^[28] and comparing it with their date of divergence. A graphical representation of this information can be seen in the divergence graph on the right.
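
The article does not state which molecular clock equation was used. One commonly used correction for multiple substitutions at the same site is m = −ln(1 − n), where n is the observed fraction of differing residues; the sketch below assumes that form and applies it to percent identity values from the ortholog table. The function name is illustrative.

```python
import math


def corrected_divergence_pct(identity_pct: float) -> float:
    """Corrected percent divergence, assuming m = -ln(1 - n) with n = 1 - identity."""
    if not 0.0 < identity_pct <= 100.0:
        raise ValueError("identity_pct must be in the interval (0, 100]")
    n = 1.0 - identity_pct / 100.0  # observed fraction of differing residues
    return -math.log(1.0 - n) * 100.0


# (species, date of divergence in MYA, % identity with human), from the ortholog table.
examples = [
    ("Bison bison bison", 94, 75.0),
    ("Phascolarctos cinereus", 160, 49.19),
    ("Astyanax mexicanus", 433, 26.36),
]

for species, mya, identity in examples:
    print(f"{species}: {mya} MYA, corrected divergence {corrected_divergence_pct(identity):.1f}%")
```

Plotting corrected divergence against date of divergence for several proteins gives lines whose slopes can be compared; a steeper slope corresponds to a faster-evolving protein.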

Interacting proteins

C22orf31 interacts physically with three proteins, according to the BioGRID,^[29] Mentha,^[30] and IntAct^[31] protein interaction browsers: two histone deacetylases (HDAC1 and HDAC2) and lacritin (LACRT). These interactions were determined using high-throughput affinity-purification mass spectrometry.^[32]^[33] A biochemical association between C22orf31 and F-box protein 7 (FBXO7) has also been detected by protein microarray.^[29] All of these proteins, with additional information, are shown in the table below.

C22orf31 Interacting Proteins^[29]
Protein Name	Abbreviation	Interaction Type	Score	Interaction Detection Method
Histone deacetylase 1	HDAC1	Physical association	0.9017	Affinity chromatography
Histone deacetylase 2	HDAC2	Physical association	0.9213	Affinity chromatography
Lacritin	LACRT	Physical association	0.9886	Affinity chromatography
F-box protein 7	FBXO7	Biochemical association	-	Protein microarray

The score for each protein in the table is the level of confidence in the predicted interaction with C22orf31, on a scale from 0 to 1, with 1 indicating the highest confidence.

Clinical significance

Pathology

Increased in vivo expression of C22orf31 in mature oocytes suggests that the gene plays a role in oocyte development.^[34]

Disease

The predicted transcription factor binding sites of C22orf31 suggest a possible role for the gene in myeloid cell differentiation and the development of acute myeloblastic leukemia.^[6]^[18]

References

[1] GRCh38: Ensembl release 89: ENSG00000100249 – Ensembl, May 2017
[2] "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
[3] "NCBI".
[4] "NCBI Blastp".
[5] "NCBI GEO Profile for record GDS3256, C22orf31". NCBI GEO.
[6] "Genomatix MatInspector transcription factor binding sites of C22orf31". Genomatix.
[7] "NCBI Gene results for human C22orf31". NCBI Nucleotide.
[8] "C22orf31 GeneCards Entry".
[9] "NCBI Nucleotide results for C22orf31". 2 September 2020.
[10] "ExPasy compute pI/Mw tool". ExPasy.
[11] "MotifFinder results for C22orf31 protein". MotifFinder.
[12] "NCBI protein search for C22orf31 isoforms".
[13] "NCBI protein entry for Human C22orf31".
[14] "NCBI protein entry for uncharacterized protein C22orf31 isoform X1 [Homo sapiens]".
[15] "NCBI protein entry for uncharacterized protein C22orf31 isoform X2 [Homo sapiens]".
[16] "SAPs compositional analysis tool result for C22orf31 protein". SAPs compositional analysis.
[17] "UCSC Genome browser results for C22orf31 protein". UCSC Genome Browser.
[18] Kallio PJ, Pongratz I, Gradin K, McGuire J, Poellinger L (May 1997). "Activation of hypoxia-inducible factor 1alpha: posttranscriptional regulation and conformational change by recruitment of the Arnt transcription factor". Proceedings of the National Academy of Sciences of the United States of America. 94 (11): 5667–72. Bibcode:1997PNAS...94.5667K. doi:10.1073/pnas.94.11.5667. PMC 20836. PMID 9159130.
[19] "Human Protein Atlas page on C22orf31". Human Protein Atlas.
[20] "miRDB microRNA prediction for C22orf31".
[21] "quickFold Web Server".
[22] "NetOGlyc mucin type GalNAc O-glycosylation site prediction for C22orf31 protein".
[23] "NetGlycate glycation site predictor for C22orf31 protein".
[24] "NetPhos phosphorylation prediction for C22orf31 protein".
[25] "YinOYang prediction for C22orf31 protein".
[26] "NCBI BLASTp of Human C22orf31". NCBI Blastp.
[27] "Time Tree: The Timescale of Life".
[28] Ho S (2008). "The molecular clock and estimating species divergence". Nature Education. 1 (1): 168.
[29] "BioGRID protein interaction browser results for C22orf31 protein".
[30] "Mentha interactome browser results for C22orf31 protein".
[31] "IntAct protein interaction browser results for C22orf31 protein".
[32] Huttlin EL, Ting L, Bruckner RJ, Gebreab F, Gygi MP, Szpyt J, et al. (July 2015). "The BioPlex Network: A Systematic Exploration of the Human Interactome". Cell. 162 (2): 425–440. doi:10.1016/j.cell.2015.06.043. PMC 4617211. PMID 26186194.
[33] Huttlin EL, Bruckner RJ, Paulo JA, Cannon JR, Ting L, Baltier K, et al. (May 2017). "Architecture of the human interactome defines protein communities and disease networks". Nature. 545 (7655): 505–509. Bibcode:2017Natur.545..505H. doi:10.1038/nature22366. PMC 5531611. PMID 28514442.
[34] Gonzalez-Muñoz E (2014). "Histone chaperone ASF1A is required for maintenance of pluripotency and cellular reprogramming". Science. 345 (6198): 822–825. Bibcode:2014Sci...345..822G. doi:10.1126/science.1254745. PMID 25035411. S2CID 34666170.
