Transcription Guiding Principles: Difficult Utterances


Transcription guiding principles

This document summarizes the guiding principles for certain types of phrases that are 
transcribed in ways that conflict with the standard grammar rules of the transcription 
language. For language-specific transcription, follow the language-specific written-domain 
conventions. 

Difficult utterances

Everything relating to problematic utterances (background noise, false starts, etc.) or 
different language varieties. 

Skipping a prompt 

● If you can't understand part of the audio, transcribe only the part you can 
understand. For the part you cannot understand, create a separate speaker 
segment and add the Unintelligible label. 
● For utterances that contain speech that is user-generated, pre-recorded, or 
synthesized, transcribe all of it. 

Hesitations and truncations 

● If a very small part of a word (at most one syllable) has been cut off, and you 
know what the word is supposed to be, transcribe the entire word. If you are not 
sure what the word should be, do not transcribe the word at all. Do not put 
punctuation after words that have been cut off. 
● If a quotation is cut off in the middle, use an end quotation mark anyway. 
● Transcribe only numbers that you hear even if the speaker didn't finish saying the 
entire number. 

Accents 

● If you hear a word with non-standard pronunciation, transcribe the word using the 
standard spelling. 
○ Example Audio: where is dat 
○ Correct: Where is that? 

Agreed spelling

Spelling conventions for words where several spellings are possible, as well as for proper 
names. 

Spelling out 

● If a word is spelled out, write it with spaces in between. This rule does not apply 
to acronyms, URLs, or email addresses. 
○ Example Audio:​ how do you get to c o s t c o 
○ Correct:​ How do you get to c o s t c o? 

Interjections 

● Transcribe words representing laughter or other non-speech vocalizations. Some 
of these words might be: 
○ huh, woah, okay, yep, uh-huh, mhm, nah 
○ This list is not exhaustive of all true interjections. 
● Annotate laughter that is included within speech. 

Proper names 

● For proper names, always use the official spelling and punctuation. 
○ Example Audio:​ will i am 
○ Correct:​ will.i.am 
● If a personal name could have multiple spellings and context does not help 
choose a spelling, use the spelling that yields the most Google search hits when 
you search for the name followed by the word "name" (without quotation marks) 
(e.g. "Anna name"). 
○ Example Audio:​ mcdonald 
○ Correct:​ MacDonald 

Brand and product 

● Format brand and company names as officially formatted. 
○ Example Audio:​ the team is sponsored by united healthcare 
○ Correct:​ The team is sponsored by UnitedHealthcare. 

Media title 

● Write media titles as they are most commonly written. 
○ Example Audio:​ screenshots of call of duty black ops two 
○ Correct:​ screenshots of Call of Duty: Black Ops 2 

Multiple spellings 

● If you hear a word that does not sound like a standard word of your language 
because there is a small sound change (e.g. accent, speech errors, speech 
impairment), transcribe the intended word. 
○ Example Audio:​ where is the nearest liberry 
○ Correct:​ Where is the nearest library? 
● If you hear a word that does not sound like a standard word of your language, but 
it is obviously based on real words, suffixes, prefixes, infixes or circumfixes, 
transcribe as is. 
○ Correct:​ interpretate 
● If you hear a word that does not sound like a standard word of your language 
because it appears to be nonsense, first search for the word in Google. If there is 
a clear candidate, transcribe that word. If there is not a clear candidate, but it is 
easy to spell and articulated clearly, transcribe it anyway. If there is no clear 
candidate and it is not easy to spell, create a separate speaker segment and add 
the Unintelligible label. 
○ Correct:​ Souk Abdali 
○ Explanation:​ User says Souk Abdali. Transcriber searches “sukabdali”, 
finds correct results. Transcribe Souk Abdali. 

Punctuation

Follow the punctuation regulations of your locale. Additional conventions are outlined in 
this section. 

Fragments versus sentences 

● Answers to questions and sentences with dropped subjects should be 
transcribed with ending punctuation. 
○ Correct:​ Who are you talking about? The guy next door. 
○ Explanation:​ two speakers 

Commas 

● Only use commas if they are required according to language grammar. 

Colon and quotation 

● If a speaker is quoting another person, transcribe a comma before the quoted 
speech. 
○ Example Audio:​ my friend said alligator crocodile 
○ Correct:​ My friend said, "alligator crocodile". 
● If the quoted text is a complete sentence, transcribe ending punctuation inside 
the quotation marks. In cases like these, do not add an additional ending 
punctuation after the main sentence. 
○ Example Audio:​ jane asked are we meeting at three o clock 
○ Correct:​ Jane asked, "Are we meeting at 3:00?" 
● Do not use quotation marks for metalinguistic uses of words or phrases. These 
uses include defining the word, talking about the spelling of the word, or any 
other type of reference to the word itself as a thing. 
○ Correct:​ He just said the word zucchini. 

Other symbols 

● Apart from standard letters, you should not use any symbols other than: 0-9 
äâàæÆçÇéèëêïîñÑôöŒœüûùμÿÄÂÀÉÈËÊÏÎÔÖÜÛÙŸ²³,?!'"_°:.()<>{}[]√/@#$€£₹+=%*&-.; 
● When two opposing teams are mentioned, include a hyphen between their 
names. 
○ Example Audio:​ are you going to the saints bears game 
○ Correct:​ Are you going to the Saints-Bears game? 
● Include a hyphen between locations in flight itineraries. 
○ Example Audio:​ rome london flight 
○ Correct:​ Rome-London flight 
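
The symbol rule above can be checked mechanically. The following is a minimal sketch, assuming that "standard letters" means the unaccented letters a-z and A-Z, that spaces are allowed, and that the superscripts ² and ³ belong to the permitted set; the function name find_disallowed is hypothetical and not part of any transcription tool.

```python
import string

# Extra characters copied from the rule above. Treating the superscripts
# "²³" as part of the set is an assumption.
EXTRA = "äâàæÆçÇéèëêïîñÑôöŒœüûùμÿÄÂÀÉÈËÊÏÎÔÖÜÛÙŸ²³,?!'\"_°:.()<>{}[]√/@#$€£₹+=%*&-.;"

# Assumption: "standard letters" are ASCII letters, and spaces are allowed.
ALLOWED = set(string.ascii_letters + string.digits + EXTRA + " ")


def find_disallowed(text):
    """Return the disallowed characters in order of first appearance."""
    seen = []
    for ch in text:
        if ch not in ALLOWED and ch not in seen:
            seen.append(ch)
    return seen
```

For instance, the pre-combined fraction "¼" is reported as disallowed, while "Are you going to the Saints-Bears game?" passes without findings.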

Spoken punctuation 

● For sentence-level spoken punctuation, write out the full word or words between 
curly brackets. Do not add punctuation symbols after spoken punctuation. Be 
careful with homonyms. 
○ Example Audio:​ okay dot dot dot 
○ Correct:​ Okay {dot} {dot} {dot} 

Format

Transcribe numbers, abbreviations, etc. following the formatting conventions in this 
section. 

Number 

● Cardinals and ordinals from 0 to 9 are written with letters (except for measures 
and currency - see Currency and Unit). Use digits for cardinals and ordinals 10 
and above, even if they are coordinated with numbers under 10. Transcribe all 
decimal numbers as digits. 
○ Correct:​ I have six dogs and 12 parakeets. 
● In math expressions or units & measures, transcribe fraction words using 
numerals and slashes. Be careful not to use pre-combined fractions like "1⁄4". 
○ Example Audio:​ in three fourths of a mile turn right 
○ Correct:​ In 3/4 of a mi, turn right. 
● For mixed numbers in math expressions and units & measures, transcribe them 
using numerals. 
○ Example Audio: the koala weighed twelve and a third pounds yesterday 
○ Correct:​ The koala weighed 12 and 1/3 lb yesterday. 
● When referring to items (not units or measures), write fractions out in words. 
With mixed numbers, write the whole number part out in words if it is under ten, 
otherwise write it with numerals. 
○ Example Audio:​ twelve and a half pumpkin pies were made 
○ Correct:​ 12 and a half pumpkin pies were made. 
● Transcribe percentages using numerals followed by the "%" sign. In the unlikely 
case that you encounter a number of a million or greater used as a percentage, 
spell it out. 
○ Example Audio:​ two percent milk 
○ Correct:​ 2% milk 
● Use Roman numerals only when part of an official name or title. 
○ Example Audio:​ king henry the eighth 
○ Correct:​ King Henry VIII 
● Transcribe seasons and episodes of television shows with numerals. 
○ Example Audio:​ season three episode two 
○ Correct:​ season 3 episode 2 
● Transcribe phone numbers using the most common format(s) in the 
transcription language. 
● Transcribe phone numbers as you would write them down in their natural groups. 
When applicable, the STD code should be surrounded by spaces. 
● Math expressions should be transcribed with numerals and math symbols with 
spaces in between them. 
○ Example Audio: five times six to the third 
○ Correct: 5 * 6 ^ 3 
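
The first rule in this list (words for 0 to 9, digits for 10 and above, digits for measures and currency) can be sketched as a small helper. This is only an illustration for whole, non-negative cardinals; the function name and the is_measure_or_currency flag are hypothetical, and ordinals, decimals and fractions are not handled.

```python
# Words for zero through nine, which the rule writes out in letters.
SMALL_NUMBER_WORDS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def format_cardinal(value, is_measure_or_currency=False):
    """Format a non-negative whole number following the cardinal rule."""
    # Measures and currency always use digits, as does anything from 10 up.
    if is_measure_or_currency or value >= 10:
        return str(value)
    return SMALL_NUMBER_WORDS[value]
```

With this helper, the example sentence can be rebuilt as f"I have {format_cardinal(6)} dogs and {format_cardinal(12)} parakeets."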

Currency and unit 

● Transcribe currencies as commonly written in the transcription locale. 
● Abbreviate all units that follow numeric values. 
○ Example Audio:​ my family bought ten liters of orange juice 
○ Correct:​ My family bought 10 L of orange juice. 
● If it is clear from context that a number or number sequence refers to currency or 
time, format it as such. 
○ Example Audio:​ milk is two ninety nine 
○ Correct:​ Milk is $2.99. 

Date and time 

● Use the common form for transcribing dates and times as used in transcription 
language. 
● Write times in hh:mm format whenever possible, unless it would look unnatural 
to do so. 
○ Example Audio:​ a few minutes after three 
○ Correct:​ a few minutes after 3:00 

Address 

● Write out the full names of locations, roads, states, etc. Only use abbreviations 
when explicitly spoken. 
○ Correct:​ 751 Jefferson Street, New York City 
● Transcribe entities and locations by using a comma between them "ENTITY, 
LOCATION" 
○ Correct:​ McDonald's, Castro Street 

Web 

● Write URLs, email addresses, and Twitter hashtags as they are spoken and don't 
capitalize them. 
○ Example Audio:​ im so hashtag hungry i could eat a whole pizza 
○ Correct:​ I'm so #hungry I could eat a whole pizza. 
● Do not correct speaker errors such as transcribing a slash when the user actually 
says "backslash". 
○ Example Audio:​ h t t p colon backslash backslash mail dot yahoo dot com 
○ Correct: http:\\mail.yahoo.com 
● If the speaker drops a "w" or dots and it's an obvious URL, you should correct 
these errors. If the speaker doesn't say the "w"s at all, do not add them. 
○ Example Audio:​ w w facebook dot com 
○ Correct:​ www.facebook.com 
● If a URL is spelled out in individual letters, transcribe without spaces between 
individual letters. 
○ Example Audio: w w w dot t a r g e t dot c o m 
○ Correct:​ www.target.com 
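
The URL rules above amount to joining the spoken tokens and replacing spoken symbol names with the symbols themselves. The sketch below assumes a small, hypothetical mapping of spoken names (dot, colon, slash, backslash). It deliberately keeps what the speaker said, so a spoken "backslash" stays a backslash, and it does not repair a dropped "w", which still needs a transcriber's judgment.

```python
# Hypothetical mapping from spoken symbol names to characters.
SPOKEN_SYMBOLS = {
    "dot": ".",
    "colon": ":",
    "slash": "/",
    "backslash": "\\",
}


def join_spoken_url(spoken):
    """Join a spelled-out URL into one lowercase token without spaces."""
    tokens = spoken.lower().split()
    return "".join(SPOKEN_SYMBOLS.get(token, token) for token in tokens)
```

Because the helper never corrects speaker errors, "backslash backslash" yields two backslashes rather than two forward slashes.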

Abbreviation 

● Do not abbreviate unless the speaker says an abbreviated form. 
○ Correct:​ Warriors versus Lakers 
● In acronyms, do not use periods between letters. 
○ Correct:​ NASA, NASCAR, AAMCO, ZIP code 

Longform generic rules


Follow the rules in this section to label and transcribe speaker segments for longform 
audio. Transcribe ALL speech that is heard, including pre recorded and synthesized 
speech. For speech that is unintelligible, foreign, or singing, follow the rules below. Only 
transcribe up to 20 unique speakers. Stop transcribing the task once the 21st speaker is 
heard. 

Unintelligible or foreign or singing

 
If you hear speech that is unintelligible or in a foreign language, create a speaker 
segment that covers only the audio range with that speech. Select either the 
Unintelligible or Foreign Speech Label and assign to the appropriate speaker. 
 
If the entire audio is unintelligible or in a foreign language, create a speaker segment 
that covers the entire audio range and select either the Unintelligible or Foreign Speech 
Label. 
 
If you hear audio that is singing, transcribe the lyrics, assign to the appropriate speaker, 
and add the Singing label. If the singing is in a foreign language, select the Foreign 
Speech label. 

Segmentation

 
All speaker segment boundaries should be accurate to within 100 ms. 
 
Speaker turns should not contain pauses in speech that are longer than 0.5 seconds. If 
a speaker does pause their speech for longer than 0.5 seconds, end the speaker turn 
before the pause then create a new turn for when the speaker resumes talking. 
 
Speaker turns should not be longer than 30 seconds. If a single speaker talks for more 
than 30 consecutive seconds without taking a 0.5 second pause, then end the turn at 
the 30 second mark and begin a new turn. 
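
The two turn rules (no pause longer than 0.5 seconds inside a turn, no turn longer than 30 seconds) can be illustrated with a short sketch. It assumes that the speech of one speaker is already available as sorted (start, end) spans in seconds; the function name build_turns and that input format are hypothetical and not part of the transcription tool.

```python
MAX_PAUSE = 0.5   # seconds of silence allowed inside one turn
MAX_TURN = 30.0   # maximum length of one turn in seconds


def build_turns(intervals):
    """Group sorted (start, end) speech spans of one speaker into turns."""
    merged = []
    for start, end in intervals:
        if merged and start - merged[-1][1] <= MAX_PAUSE:
            merged[-1][1] = end          # short pause: extend the current turn
        else:
            merged.append([start, end])  # long pause: start a new turn

    turns = []
    for start, end in merged:
        # Cut turns that run past 30 seconds at each 30-second mark.
        while end - start > MAX_TURN:
            turns.append((start, start + MAX_TURN))
            start += MAX_TURN
        turns.append((start, end))
    return turns
```

For example, a speaker who talks for 70 seconds without pausing ends up with three turns: 0 to 30, 30 to 60, and 60 to 70 seconds.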

Speaker labeling

 
All speaker labels should be consistently formatted. Speaker labels should always be in 
all lowercase, be spelled correctly, and contain no underscores or hyphens. Only 
transcribe up to the 20th unique speaker. 
Correct: speaker 1
Incorrect: Speaker 1

Correct: pre recorded speaker 1
Incorrect: pre-recorded speaker_1

Correct: unidentifiable speaker

● 'speaker #' Used for different speakers in the audio. Includes a number that 
corresponds to each different speaker. 
● 'pre recorded speaker #' Used when there is speech coming from a machine. 
Includes a number that corresponds to each different pre recorded speaker. 
● 'unidentifiable speaker' Used when you cannot identify who the speaker is. Does 
not ever include numbers. 
● 'speaker Tom' Used when the name of a speaker becomes known. The names of 
speakers should always be capitalized. You can use first and last names. (Note: 
adding speaker names will be allowed for some projects but not others. In-tool 
validators will indicate whether or not you can submit a speaker name.) 
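
The label formats above lend themselves to a simple pattern check. The sketch below is an assumption about how such a check could look, not the in-tool validator itself; the function name is_valid_label and the allow_names flag are hypothetical, and named labels are limited here to a capitalized first name with an optional capitalized last name.

```python
import re

# 'speaker 1' or 'pre recorded speaker 1': lowercase, no hyphens or underscores.
NUMBERED_LABEL = re.compile(r"^(?:pre recorded )?speaker [1-9]\d*$")
# 'speaker Tom' or 'speaker Tom Hanks': capitalized first and optional last name.
NAMED_LABEL = re.compile(r"^speaker [A-Z][a-z]+(?: [A-Z][a-z]+)?$")


def is_valid_label(label, allow_names=False):
    """Check a speaker label against the formats listed above."""
    if label == "unidentifiable speaker":
        return True
    if NUMBERED_LABEL.match(label):
        return True
    # Named labels are only valid on projects that allow them.
    return allow_names and bool(NAMED_LABEL.match(label))
```

The allow_names flag mirrors the note that speaker names are permitted on some projects but not on others.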

Audio labels

 
For instances with music and lyrics, create separate labels: one for Music and one for 
Singing. Transcribe the singing speech and assign to the appropriate speaker. 
 
Below is a list of labels that may be available for you to choose from. Each includes a 
description of when it is appropriate to use it. 
● Applause: Use this label if you hear one or more people clapping or cheering. 
● DTMF: Stands for 'dual-tone multi-frequency.' This is the sound you hear when a 
number is pressed during a phone call. For example, 'Press one to speak to a 
representative' DTMF. 
● Foreign Speech: Use this label if the speaker is not talking in the target language. 
● Laughter: Use this label when you hear laughter. 
● Music: Use this label when you hear music. 
● Noise: Use this label for instances of miscellaneous noise events. 
● PII: Use this label when you hear Personally Identifiable Information (PII). For 
more information see the PII section of this guideline. 
● Ring Tone: Use this label when you hear a ring-tone. 
● Singing: Use this label to indicate that the speaker is singing their speech. 
● Unintelligible: Use this label to indicate that you cannot understand what the 
speaker is saying. 
● Unknown: Use this label for audio events that are not classified to any of the 
above labels. 

PII

 
PII stands for Personally Identifiable Information. PII is information that is not publicly 
available, but can help you or Google identify an individual person. 
 
PII should never be transcribed. When PII is heard, create a new speaker segment that 
captures the audio range of the PII speech. Add the PII label and assign to the 
appropriate speaker. Do not transcribe PII. 
 
If applicable, select the appropriate PII subcategory which is nested underneath the 
parent 'PII' label. If the appropriate PII subcategory is not available then select the 
parent 'PII' label to cover all other cases. If there are no PII subcategories available to 
choose from then use the 'PII' label for all cases of PII. 
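
The fallback described above can be summarized in a few lines. This is a minimal sketch, assuming that the subcategories offered by the tool are available as a set of strings; the function name choose_pii_label is hypothetical.

```python
def choose_pii_label(category, available_subcategories):
    """Return the matching subcategory if the tool offers it, else 'PII'."""
    if category in available_subcategories:
        return category
    # Covers both a missing subcategory and a tool with no subcategories at all.
    return "PII"
```

An empty set of subcategories therefore always resolves to the parent 'PII' label.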
 
See the list below for all valid PII items: 
● NAME: First and/or Last name 
● CREDIT_CARD_NUMBER 
● EMAIL 
● PHONE_NUMBER 
● SOCIAL_INSURANCE_NUMBER 
● DRIVER_LICENSE_NUMBER 
● NATIONAL_HEALTH_SERVICE_NUMBER 
● SOCIAL_SECURITY_NUMBER 
● PASSPORT 
● TAX_FILE_NUMBER: A tax file number (TFN) is a unique identifier issued by the 
Australian Taxation Office (ATO) to each taxpaying entity 
● LOCATION_STREET: If the street name is heard, mark as PII. Other locations 
such as State, City, County, zip code are all OK to transcribe. 
● LOCATION_STREET_NUMBER: If the street number is heard, mark as PII. Other 
locations such as State, City, County, zip code are all OK to transcribe. 
● MRN (medical record number) 
● BANKERS_CUSIP_ID: CUSIP stands for Committee on Uniform Securities 
Identification Procedures. A CUSIP number identifies most financial instruments, 
including: stocks of all registered U.S. and Canadian companies, commercial 
paper, and U.S. government and municipal bonds. 
● BC_PHN: Each B.C. resident enrolled with the Medical Services Plan (MSP) is 
given a unique lifetime identifier for health care called a Personal Health Number 
(PHN) 
● OHIP: Ontario Health Insurance Plan 
● QUEBEC_HIN: Quebec Health Insurance Number 
● CNI NIR: The French national identity card (French: Carte nationale d'identite or 
CNI) is an official identity document consisting of a laminated plastic card 
bearing a photograph, name and address. 
● IBAN_CODE: The International Bank Account Number (IBAN) is an internationally 
agreed system of identifying bank accounts 
● SWIFT_CODE: A SWIFT code is an international bank code that identifies 
particular banks worldwide. It's also known as a Bank Identifier Code (BIC). 
● BANK_ROUTING_MICR: The line of numbers at the bottom of a check is called 
a MICR line. MICR means Magnetic Ink Character Recognition. The MICR line 
is made up of three sets of numbers. The first set is called the ABA Bank Routing 
Number or routing transit number (RTN) 
● DEA_NUMBER: A DEA number (DEA Registration Number) is an identifier 
assigned to a health care provider (such as a physician, optometrist, dentist, or 
veterinarian) by the United States Drug Enforcement Administration 
● HEALTHCARE_NPI: A National Provider Identifier or NPI is a unique 10-digit 
identification number issued to health care providers in the United States by the 
Centers for Medicare and Medicaid Services (CMS). 
● MEDICARE_NUMBER 
● NIE_NUMBER: The NIE is a tax identification number in Spain 
● CPF_NUMBER: The CPF (Cadastro de Pessoas Fisicas or Natural Persons 
Register) is a number assigned by the Brazilian revenue agency to both Brazilians 
and resident aliens who are subject to taxes in Brazil 
● PAN_INDIVIDUAL: Permanent Account Number (PAN) is a code that acts as an 
identification for individuals, families and corporates (Indian or Foreign), 
especially those who pay Income Tax 
● BSN_NUMBER: Netherlands: The citizen service number (BSN) is a unique 
personal number allocated to everyone registered in the Personal Records 
Database (BRP). 
● ICD_CODE: International Statistical Classification of Diseases and Related Health 
Problems (ICD), a medical classification list by the World Health Organization 
(WHO). It contains codes for diseases, signs and symptoms, abnormal findings, 
complaints, social circumstances, and external causes of injury or diseases. 
● FDA_CODE: Prescription drug 
● NIF: Tax Identification Number in Spain 
http://www.investinspain.org/guidetobusiness/en/2/art_2_3.html 
● TAXPAYER_REFERENCE 
● CURP_NUMBER: CURP is the abbreviation for Clave Unica de Registro de 
Poblacion (translated into English as Unique Population Registry Code or else as 
Personal ID Code Number). It is a unique identity code for both citizens and 
residents of Mexico. 
● RRN: Receiver Registration Number (RRN) is a 10-character alphanumeric code that can be 
linked to a bank account, a credit/debit card, a mobile wallet, or home delivery. 

 
Information that does not fit the above PII rules should NOT be considered PII. Some 
examples of things that are not PII are: 
Correct: Commonly known names like Taylor Swift, Tom Hanks, Michael Jordan

Correct: Company or business name, phone number, or address

Correct: Pet names

Correct: Social media handles

Transcription quality
Comply with the standard rules of the writing system. 

Typo

 
A typo results in the unintentional creation of a non-word. 
Correct: နလငဘ
Incorrect: နလူငဘ
Explanation: Please make sure to type correctly.

 
Avoid making any typographical errors. Carefully check your work before marking items 
as "complete". 
Correct: ရပစနဆင တ ကမယ။
Incorrect: ရပဇနဆင တ ကမယ။
Explanation: Certain words are spelled differently from how they are pronounced. Make sure you use the right spelling.

Example audio: "သစ လမ က လ ပ"
Correct: သစ လမ က လ ပ ။
Incorrect: သစ လန က လ ပ ။
Explanation: Pay attention to homonyms. Guess the most likely spelling given the context.

Context error

 
A context error occurs when a real word is used incorrectly or when the incorrect form 
of a word is used. This includes homophones and punctuation, among other things. 
Correct: ငက ပ ရက တ သတလက။   

Incorrect: ငက ပ ရက တ သပလက။ 

Correct: ငအတ ပ လကပ ။
Incorrect: ငအပ ပ လကပ ။
Explanation: Here, "ငအတ" is a meaningful word in this context, while "ငအပ" may not make much sense. Guess the most likely spelling given the context.

Added or missing words

 
Do not transcribe words that are not spoken, even if they are obviously intended by the 
speaker. Avoid putting words in the speaker's mouth. However, do transcribe implied 
times and units of currency. 
Correct: ပန က ဝတမန သ ကညရ အ င။  Example audio: "​ ပန က ဝတမန သ ကညရ အ င " 

Incorrect: ပန က ဝတမန ပရင သ ကညရ အ င။  Explanation: Do not add the omitted " ပရင". 

Example audio: "င သ င ခ ဆတ သ က လ တစဗ ဖ ဆ ဈ မ က ဘ လ"
Correct: Kyat ၅၅၀၀၀ ဆတ သ က လ တစဗ အတက ဈ မ က ဘ လ ။
Incorrect: Kyat င သ င င ထ င ဆတ သ က လ တစဗ အတက ဈ မ က ဘ လ ။

 
Transcribe all words spoken, even if they are not intended by the speaker. For 
interjections and non-speech vocalizations, refer to Agreed Spelling > Interjections and 
Difficult Utterances > Hesitations and Truncations. 
Correct: ဇ တက ထမ ဇတလက အ လ ဇ တလကလပတ ဘယသမ လ?
Explanation: Speaker clearly corrected themselves after mistakenly saying "ဇတလက".

Correct: စ မ ပ ပ ရင ဟ လ စတတ သ ကညမယ န ။
Explanation: Speaker thinks out loud by saying "ဟ လ".

Example audio: "ဇ ကက ပစတင ထ င ဖ ဝ ပ ဝယခ"
Correct: ဇ ကက၊ ပစတင ထ င၊ ဖ ဝ ပ ဝယခ။
Incorrect: ဇ ကက၊ ပစတင ထ င န ဖ ဝ ပ ဝယခ။
Explanation: Burmese can exclude the "န " when a series of items is mentioned. Transcribe "န " only if spoken.

Substitution

Spacing

 
Use only one space between words and sentences. 
Correct: မနကဖနနနက ဘယသ ခ ငသလ။   

Incorrect: မနကဖနနနက _ဘယသ ခ ငသလ။ 

Correct: ရတဂက သ ပ န က တ    
ပည ထ ငစရပသ မ စ မယ။ 

Incorrect: ရတဂက သ ပ _ န က တ  
ပည ထ ငစရပသ မ စ မယ။ 

 
For most types of punctuation, do not put a space between the preceding word and the 
punctuation. 
Correct: န နတ လ ။   
Incorrect: န နတ လ ။ 

Correct: မမ ဝ၊ မ တ တ က ပ န ။   

Incorrect: မမ ဝ ၊ မ တ တ က ပ န ။ 

 
For quotation marks and similar punctuation, put a space before the opening 
punctuation, but not necessarily after the closing punctuation. 
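
The first two spacing rules can be applied automatically before a transcript is submitted. The sketch below is an illustration only: the function name normalize_spacing is hypothetical, and the set of punctuation marks it tightens (the Burmese signs "၊" and "။" plus , . ? !) is an assumption, since the rule only says "most types of punctuation". Quotation marks are left untouched.

```python
import re

# Burmese little section (U+104A) and section (U+104B) signs plus common
# Latin marks; which marks count as "most punctuation" is an assumption.
SPACE_BEFORE_PUNCTUATION = re.compile(r" +([\u104a\u104b,.?!])")
MULTIPLE_SPACES = re.compile(r" {2,}")


def normalize_spacing(text):
    """Collapse repeated spaces and drop spaces before punctuation."""
    text = MULTIPLE_SPACES.sub(" ", text)
    return SPACE_BEFORE_PUNCTUATION.sub(r"\1", text)
```

A transcriber would still need to review the result, because some punctuation (for example an opening quotation mark) is meant to keep its preceding space.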

Punctuation
Follow the punctuation regulations of your locale. Additional conventions are outlined in 
this section. 

Fragments versus sentences

 
Add punctuation where needed, but err on the side of keeping it minimal. 
 
Sometimes a phrase which is not obviously grammatically a sentence should 
nevertheless be treated as a sentence because of its context, e.g. if it's an answer to a 
specific question, or if it's an example where dropping the subject sounds completely 
natural as a complete sentence. 
Correct: ဘယသ ပ နတ လ။ ဟဘ က ။
Explanation: Two speakers. "ဟဘ က ။" is an answer to a specific question.

Correct: ဟဘ က
Explanation: No context to suggest this is a sentence; treat as a fragment.

Correct: မနကဖနက အလ သ မ လ ။
Explanation: Dropping the subject and main verb here sounds natural as a complete sentence. Punctuate as a sentence.

Correct: ခ မနစ နပ
Explanation: Sounds like a web search as opposed to a dropped subject. Treat as a fragment.

 
Interjections, greetings, and farewells said in isolation should be considered complete 
sentences and punctuated as such. 
Correct: သဟ။  Explanation: interjection 

Correct: ဟလ။  Explanation: greeting 

Correct: သ ပ ပက ။  Explanation: Entire phrase is being used as an interjection.

 
Add end punctuation to sentence fragments that sound like the end of a sentence. For 
fragments that do not clearly sound like the end of a sentence, leave out punctuation. 
Note that sentence fragments may be a result of cut-off audio samples. 
Correct: ဘယလလပမလ။ တကယလမ
Explanation: Sentence-initial fragment ends mid-stream.

Correct: ပဆ တ မ ။ အဓပ ယ တ မရဘ ။
Explanation: Sentence-final fragment begins mid-stream.

Correct: ဆ တ ပခက နရင သ မလပန ။
Explanation: Audio was cut off at the beginning.

Correct: လကဘကရညဆင သ လကဦ မယ။ ခဏ လ ဆရ လကဦ မလ ။
Explanation: Do not use punctuation, a hyphen, or an ellipsis after a fragment even if another sentence follows.

Correct: ဘယက န ဘ ဖစလ ဘယလ က င
Explanation: Both sound like beginnings of sentences.

Correct: ဘယလ ဘယလ ပန ပ ပ ဦ ။
Explanation: Repeated beginning of the sentence.

Correct: ဆ ပမ တကယတမ က တ
Explanation: Sounds like the middle of a sentence; beginning and end were cut off.

Example audio: "က က န ခ က ရ ထ ပ"
Correct: က က န ခ က ရ ထ ပ ။
Incorrect: က က န ခ က ရ ထ ပ ။
Explanation: Burmese does not use a space between words. When two items are mentioned together, do not use a space between the first item and "န " or " င". Use a space only between "န " or " င" and the second item.

 
If an utterance is not clearly a sentence according to the above rules and examples, do 
not punctuate it as a sentence. 

Commas

 
Only use downward stroke "၊" where required. Err on the side of minimal punctuation. Do 
not rely on intonation. 
Correct: ဓ တဆ ဖညဖ နရ က ဘယန မ လ။
Incorrect: ဓ တဆ ဖညဖ၊ နရ က၊ ဘယန မ လ။
Explanation: Even if the speaker uses long pauses, do not use commas to show those pauses. There are places where commas are allowed or required, but this example contains neither.

 
Use a downward stroke "၊" when a sentence begins with a discourse word, interjection, 
or yes/no word. If there is a long pause between a discourse word, interjection, or 
yes/no word and a full sentence that follows it, treat that initial word as a separate 
sentence. 
Correct: သ ဓကယ၊ က န မ ခ မ သ ပ စ။  Explanation: Discourse word. Other examples of 
discourse words in Burmese include "ခငဗ ", 
"တငပ ", "မနလပ ", and "ဟငအင ". 

Correct: အကယ၊ လလကတ ။  Explanation: Interjection. Other examples of 
interjections include "အ တ ", "ဟ ဟ ", " မတစ ဘရ ", and others. 

Correct: အ ၊ သပ တယ။  Explanation: yes/no word 

Correct: ဟတ ပ၊ လပ ပ မယ။  Explanation: yes/no word 

Correct: ဟတ ပ။ လပ ပ မယ။  Explanation: substantial pause after "ဟတ ပ" 

Correct: သ ဓကယ။ ကသလအမ က ရတ ပ ။  Explanation: substantial pause after "သ ဓကယ" 

Correct: ဇ ကကရယ၊ ပစတင ထ ငရယ၊ ဖ ဝ ပရယ အကနဝယခ။
Incorrect: ဇ ကကရယ ပစတင ထ ငရယ ဖ ဝ ပရယ အကနဝယခ။
Explanation: A series of items shall be separated by the little section sign "၊".

 
Except in greetings, sentence-initial and sentence-final addressees should be separated 
by downward stroke "၊". 
Correct: မ မ၊ သ အမစ ပ လ တယ။   

Incorrect: မ မ။ သ အမစ ပ လ တယ။ 

 
The phrase "Ok Google" in isolation is transcribed without a comma or end punctuation. 
When the phrase appears before longer utterances, place a comma after "Google". 
Correct: Ok Google   

Incorrect: အ က၊ ဂဂ။ 

Correct: Ok Google, သလငပင ပပ  Explanation: Always use a comma "," after 
Google even when followed by an utterance in 
Burmese. 

Correct: Ok Google, က င အပ က လပစ ပပ ။   

Correct: Ok Google, Saint Patric Day ဆတ ဘ လ။

Intonation marks

 
Questions should be punctuated as sentences. In the case of a question in another 
language (English for instance), the sentence should be capitalized and punctuated with 
a question mark. 
Correct: သခ တယလ ။  Explanation: "လ " is a question word in 
Burmese. 
Correct: မနက ၃:၀၀ လ ။  Explanation: Regardless of rising tone, it is 
most likely a question when an utterance ends 
with "လ ". 

 
Exclamation marks should not be used in Burmese script. Use a double downward 
stroke "။" even if the speaker uses an exclamatory intonation. 
Correct: ဟ ။  Explanation: Speaker sounds enthusiastic. 

Correct: ဟ ။  Explanation: Speaker sounds unenthused. 

Colon and quotation

 
Do not put punctuation between reported speech verbs and direct quotations. Do not 
put punctuation within quotation marks unless the punctuation belongs to the reported 
speech. 
Correct: သဇ က " ပညက သ လညမယ" တ။
Incorrect: သဇ က၊ " ပညက သ လညမယ" တ။
Incorrect: သဇ က " ပညက သ လညမယ။" တ။
Incorrect: သဇ က - " ပညက သ လညမယ" တ။
Explanation: The word "တ" is the most common reported speech particle in Burmese, but other words such as "ဆပ", " ပ တယ", "လ" can be used for reported speech. No need to use a comma or section sign before the quotation.

Correct: " က အ သ စ မယ" လ ပ ။
Incorrect: " က အ သ စ မယ။" လ ပ ။
Explanation: Do not punctuate if the quotation is only one word.

 
If the text in quotation marks qualifies as a sentence, punctuate as if it were its own 
utterance. Do not alter its end punctuation even if the quote is within a sentence. Do not 
add excess punctuation after end quotation marks. 
Correct: အဘက ဆသညမ "သသရ ငမ စသတည ။"
Incorrect: အဘက ဆသညမ "သသရ ငမ စသတည ။"။
Explanation: Text in quotation marks qualifies as a sentence.

Correct: မ မက မ တ "၃ န ရ ဆမ လ ။"
Incorrect: မ မက မ တ ၊ "၃ န ရ ဆမ လ ။"။
Incorrect: မ မက မ တ ၊ "၃ န ရ ဆမ လ ။"
Explanation: Text in quotation marks qualifies as a sentence. Do not add excess punctuation. Do not put a comma or section sign before the quotation.

Correct: စ ပ လကတ "ဟ ရ ကမ တ မယ။"
Incorrect: စ က ပ လကတ ၊ "ဟ ရ ကမ တ မယ။"။
Explanation: Text in quotation marks qualifies as a sentence. Do not add excess punctuation. Do not put a comma or section sign before the quotation.

Correct: တဇ က "ရနကနဆင လ မယ။" ဆ ပ နရကအတအက မ ပ ဘ ။
Explanation: Text in quotation marks qualifies as a sentence.

Correct: "ကစ ရမယ။ စ က ကရမယ။ မဘက ကညရမယ။" ဟ ဆရ ကဆသည။

 
Use a hyphen in quotative voice actions when the quote follows the command. Use 
quotation marks when the quote is in the middle of the sentence. 
Correct: ပငသစဘ သ ပနရန - န က င လ ။  Explanation: The quote follows the command, 
so use a hyphen not quotation marks. 

Correct: " န က င လ " က ပငသစဘ သ သ ပနပ ။  Explanation: The quote is in the middle of a 
sentence, so use quotation marks not a 
hyphen. 
Correct: ပငသစလ "ခ စတယ" လ ဘယလ ပ မလ။  Explanation: Do not use a hyphen after " ပ " 
verbs in translation requests. 
Incorrect: ပငသစလ - "ခ စတယ" လ ဘယလ ပ မလ။ 

Correct: [email protected] သ - ဟ ၊ န က င ကလ ။

Correct: [email protected] ဆက " ဟ ၊ န က င ကလ " လ စ ပလက။

 
Do not use quotation marks for metalinguistic uses of words or phrases. These uses 
include defining the word, talking about the spelling of the word, or any other type of 
reference to the word itself as a thing. 

Other symbols

 
Apart from the Burmese letters and the Latin letters a through z, you should not use any 
other symbol than: 0-9 ၀-၉ 
äâàáāçčćđéèéëêēïîíīñóôöüőōšûùúűūÿȳžÄÂÀÁĀÇČĆĐÉÈÉËÊĒÏÎÍĪÑÓÔÖŌŠÜÛÙÚŪŸȲŽ²³,?
!~^\'"_°:.()<>{}[]√/@#$€£+=%*&-.; 

Spoken punctuation

 
For sentence-level spoken punctuation, write out the full word or words between curly 
brackets. Do not add punctuation symbols after spoken punctuation. Be careful with 
homonyms. (See exceptions in the next rule.) 
Example audio: "okay dot dot dot"
Correct: Okay {dot} {dot} {dot}
Incorrect: Okay...

Example audio: "ဇဇဝ ပဒက လ ခတ ပဒက လ ပန ညက ဝယလ ပ ပဒမ"
Correct: ဇဇဝ {ပဒက လ } ခတ {ပဒက လ } ပန ညက ဝယလ ပ {ပဒမ}
Incorrect: ဇဇဝ ၊ ခတ ၊ ပန ညက ဝယလ ပ ။

Example audio: "ဦ ဘန င ကင စ တကသလ ကင ပတ"
Correct: ဦ ဘန င {ကင စ} တကသလ {ကင ပ}
Incorrect: ဦ ဘန င (တကသလ)

Example audio: "smiley face"
Correct: {smiley face}
Incorrect: :-)
Incorrect: smiley face

 
Don't spell out internal punctuation like hyphens in web pages, email addresses, 
addresses, phone numbers, or other word-level punctuation. 
Example audio: "အခန နပ တ က ဆယရစ မ စ င က ဆယက"
Correct: အခန နပ တ ၉၈/၉၉
Incorrect: အခန နပ တ ၉၈ မ စ င ၉၉

Example audio: "ဒဘလ ဒဘလ ဒဘလ ဒ အဘစ ဒ ကန"
Correct: www.abc.com
Incorrect: www {dot} abc {dot} com

Example audio: "သက ဆရ ဝန မ စ င အဆ တ မ စ င စ ရ ဆရ"
Correct: သက ဆရ ဝန/အဆ တ /စ ရ ဆရ ။
Incorrect: သက ဆရ ဝန {မ စ င } အဆ တ {မ စ င } စ ရ ဆရ ။
Explanation: Slash is a word-level punctuation mark.

Example audio: "ယ နပ တ လ က က မ စ င င ရစခန စ"
Correct: ယ နပတ ၄က/၅၈၇၂
Incorrect: ယ နပ တ ၄က မ စ င ၅၈၇၂

 
If a word that can refer to a punctuation mark is spoken in isolation, it should be written 
out between curly brackets. 
Correct: {ပဒမ}   

Correct: {ကင စကင ပတ}   

Correct: {မ စ င }   

Format
Transcribe numbers, abbreviations, and similar items following the formatting conventions 
in this section. 

Number

 
Use Burmese numerals and not Arabic numerals. 
Correct: ၁ ၂ ၃ ၄ ၅  Example audio: "​ တစ စ သ လ င " 

Incorrect: 1 2 3 4 5 
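
As a minimal sketch of the numeral rule, the mapping from Arabic to Burmese digits can be written with a translation table. The helper name `to_burmese_digits` is an assumption made for illustration; the only fact relied on is that Burmese digits occupy the Unicode code points U+1040 (zero) through U+1049 (nine).

```python
# Burmese digits occupy U+1040 (zero) through U+1049 (nine).
BURMESE_DIGITS = "".join(chr(0x1040 + i) for i in range(10))
TO_BURMESE = str.maketrans("0123456789", BURMESE_DIGITS)


def to_burmese_digits(text: str) -> str:
    """Replace every Arabic digit in `text` with its Burmese counterpart."""
    return text.translate(TO_BURMESE)


print(to_burmese_digits("1 2 3 4 5"))
```

Such a helper should only be applied to numeric portions of a transcript; English product names or URLs that contain digits are outside what this sketch considers.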

 
Cardinals and ordinals from 0 to 9 are written with letters (except for measures and 
currency - see Currency and Unit). Use digits for cardinals and ordinals 10 and above, 
even if they are coordinated with numbers under 10. Transcribe all decimal numbers as 
digits. 
Correct: အတန ထမ က င သ က ယ က ရတယ။  Explanation: Numbers less than 10 are written 
with letters. 

Correct: အတန ထမ က င သ ၁၅ ယ က ရတယ။  Explanation: Numbers 10 or above are written 


as numbers. 
Correct: သတမ ခ စ က ငန တပည ၁၅ ယ က ရတယ။  Explanation: Follow this rule even if the noun phrases with numbers are coordinated. 

Correct: ၆.၀၁  Example audio: "​ ခ က ဒသမ သညတစ " 

Incorrect: ခ က ဒသမ သညတစ  Explanation: decimal numbers 

Correct: က င သ ၂၀ ယ က ရတယ။  Example audio: "​ က င သ အ ယ က စဆယ 


ရတယ " 
Incorrect: က င သ စဆယ ယ က ရတယ။ 
Explanation: For tens numbers (such as 20), even if the number is spoken after the unit, transcribe the number in numerals followed by the unit. 
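
The choice between letters and digits described above can be summarized as a small decision function. This is a hedged sketch: the name `number_style` and the `has_unit` flag are illustrative assumptions, and it covers only the general rule, not the special formats for lists, phone numbers, times, or media titles.

```python
def number_style(spoken_value: str, has_unit: bool = False) -> str:
    """Return "words" or "digits" for a spoken number such as "5" or "6.01".

    `has_unit` marks numbers used with a measure or currency, which the
    guidelines always write as digits.
    """
    if "." in spoken_value or has_unit:
        # Decimals, measures, and currency amounts are always digits.
        return "digits"
    # Whole numbers 0 to 9 are written with letters, 10 and above as digits.
    return "words" if 0 <= int(spoken_value) <= 9 else "digits"


for value in ("5", "15", "6.01"):
    print(value, number_style(value))
```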

 
If a large number consists of only a number followed by "သန ", "မလယ", "ဘလယ", 
"ထရလယ", or higher, then transcribe as a numeral plus word. Otherwise, transcribe as 
numerals. 
Correct: ဘငန က င ရ ၅ သန   Example audio: "​ ဘငန က င ရ င သန " 

Correct: ဘငန က င ရ ၁ ထ င  Example audio: "​ ဘငန က င ရ တစ ထ င " 

Correct: ဘငန က င ရ ၁၀၀၀  Example audio: "​ ဘငန က င ရ တစ ထ င " 

Correct: $ ၁.၅ ဘလယ  Example audio: "​ ဒ လ တစဒသမင ဘလယ " 

Explanation: For mixed numbers before "သန ", "မလယ", etc., use decimals. 

 
Write lists of numbers with digits and without commas. 
Correct: ၀ ၁ ၂ ၃ ၅ ၇ ၁၁ ၁၃  Example audio: "​ သည တစ စ သ င ခန ဆယတစ 
ဆယသ " 

Explanation: List of numbers; no comma if just counting. 

Correct: ၁ ၂ ၃ ပ   Example audio: "​ တစ စ သ ပ " 

Explanation: List of numbers; no comma if just counting. 

Correct: ၂ ၄ ၆ ၈ ၁၀  Example audio: "​ စ လ ခ က ရစ ဆယ " 

Explanation: If counting, write them with a space in between, not a comma. 

 
In math expressions or units & measures, transcribe fraction words using numerals and 
slashes. 
Correct: သ က ၁/၃ ပ လမယ။  Example audio: "​ သ က သ ပတစပပ လမယ " 

Incorrect: သ က ၃ ပ ၁ ပ ပ လမယ။  Explanation: Here, the "- ပ - ပ" is part of the fraction, so don't include it in the transcription. Also, avoid including spaces or pre-combined fraction characters. 

Correct: ရ ၃:၁ ရ ပ ။  Example audio: "​ ရ သ ခ တစခ ရ ပ " 

Incorrect: ရ ၃ ခ ၁ ခ ရ ပ ။  Explanation: Here, the " - ခ - ခ " describes a ratio, so use ":" in between to indicate the expression. 

Correct: ၈ ၁/၂ မင က လ ခပ ။  Example audio: "​ ရစမငခက လ ခပ " 

Incorrect: ၈ မငခက လ ခပ ။ 
Correct: က န တ တ ၁/၂ လကမ သစ ပ လမယ။  Example audio: "​ က န တ တ လကမဝက သစ ပ  
လမယ " 

Explanation: Given the context, "ဝက" is a fraction, so transcribe it using numerals and slashes. Also, be careful not to include spaces or pre-combined fraction characters. 
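
The fraction format used in the examples above (numerals, a plain slash, no spaces around it, and no pre-combined fraction characters) can be sketched as follows. The function name `format_fraction` and its parameters are illustrative assumptions, not part of the guidelines.

```python
from typing import Optional

# Burmese digits occupy U+1040 (zero) through U+1049 (nine).
TO_BURMESE = str.maketrans("0123456789", "".join(chr(0x1040 + i) for i in range(10)))


def format_fraction(numerator: int, denominator: int, whole: Optional[int] = None) -> str:
    """Format a fraction, or a mixed number when `whole` is given."""
    fraction = f"{numerator}/{denominator}"  # plain slash, no surrounding spaces
    text = fraction if whole is None else f"{whole} {fraction}"
    return text.translate(TO_BURMESE)


print(format_fraction(1, 3))
print(format_fraction(1, 2, whole=8))
```

The two calls reproduce the shapes of the "one third" and "eight and a half" examples; for currency amounts the guidelines ask for decimals instead, which this sketch does not handle.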

 
For mixed numbers that represent currency amounts, always use decimals. 
Correct: လတင မ Kyat ၂.၅ ရတယ။  Example audio: "​ လတင မ စက ပခ ရတယ " 

Correct: အမဝယတန က Kyat ၅.၅ သန ပ ခရတ ။  Example audio: "​ အမဝယတန က င သန ခ ပ ခရတ  

 
Transcribe percentages using numerals and the % sign. (In the unlikely case that you 
encounter a number of a million or greater used as a percentage, spell it out.) 
Correct: ၂%    

Correct: ၁ မလယ ရ ခင န    
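
A possible sketch of the percentage rule, assuming a hypothetical helper `format_percentage` that returns `None` when the number is a million or greater, signalling that the transcriber should spell the expression out instead:

```python
from typing import Optional

# Burmese digits occupy U+1040 (zero) through U+1049 (nine).
TO_BURMESE = str.maketrans("0123456789", "".join(chr(0x1040 + i) for i in range(10)))


def format_percentage(value: int) -> Optional[str]:
    """Return numerals plus "%", or None when the value must be spelled out."""
    if value >= 1_000_000:
        return None  # a million or greater: spell it out by hand
    return f"{value}%".translate(TO_BURMESE)


print(format_percentage(2))
```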

 
If a number appears in a context which calls for a certain formatting in your language, 
use that formatting. Otherwise, default to the general rule for transcribing numbers. 
 
Transcribe phone numbers using the most common format in the transcription 
language. 
Correct: ၀၁-၂၀၂၈၁၈  Example audio: "​ သည တစ စ သည စ ရစ တစ ရစ " 

Explanation: landline number with leading "0" and one-digit area code 
Correct: +၉၅ ၁ ၂၉၁၅၉၄  Example audio: "​ အ ပ င က င တစ စ က တစ င  
က လ " 

Explanation: landline number with country code 

Correct: ၀၉ ၂၀၀ ၅၄၃ လင ခ ၁၂  Example audio: "​ သည က စရ င လ သ လင ခ 


ဆယ စ " 

Correct: +၉၅ ၉၄၂ ၁၁၁ ၆၆၆၆  Example audio: "​ အ ပ င က င က လ စ 


တစသ လ ခ က လ လ " 

Explanation: mobile phone number with country code 

Correct: ၀၆ ၂၀ ၂၅ ၉၀၀  Example audio: "​ သည ခြ က စဆယ အစတ က ရ  


Correct: +၆၅ ၈၃၇၇ ၂၆၅၅  Example audio: "​ အ ပ င ရစသ ခနခန 


စ ခ ကင င " 

 
If the utterance clearly sounds like a math expression, transcribe it with numbers and 
symbols, separated by spaces. 
Correct: ၅ / ၆ ^ ၃  Example audio: "​ င အစ ခ က သ ထပ " 

Incorrect: င အစ ခ က ထပ န ကန သ  

Correct: ၅ * ၆ ဆ ဘယ လ ကလ။  Example audio: "​ င အ မ က ခ က ဆ 


ဘယ လ ကလ " 

Correct: √၂၅ မ ၅ ဖစသည  Example audio: "​ စဆယင ၏ စထပကန ရင မ  


င ဖစသည " 
Correct: Kyat ၈ * ၅ န ရ ဆ ဘယ လ ကလ။  Example audio: "​ ရစက ပ န င န ရ ဆ ဘယ လ ကလ 

Correct: ခ သ က င န က င စ က င က ပ င ထ တယ။  Explanation: Does not sound like a true math expression with useful units. 

Currency and unit

 
Transcribe currencies as commonly written in the transcription language. 
Correct: Kyat င ခ ကနခတယ။  Example audio: "​ က ပ င ခ ကနခတယ " 

Incorrect: က ပ ၅၀၀၀၀ ကနခတယ။ 

Correct: £ ၂၀  Example audio: "​ ပ င စဆယက ယအကစ ဒ လ  


ဘယ လ ကလ " 

Correct: ၂၀၀ ယန   Example audio: "​ ယန စရ " 

Incorrect: ¥၂၀၀ 

Correct: $ ၁၀  Example audio: "​ ဆယ ဒ လ " 

 
For degrees, use the ° symbol. 
Correct: အပခ န ၃၁° စငတဂရတ   

Correct: အ ရ ၉၇°၃၀' လ ငဂ က ဟ    
မနမ စ တ ခ နမ ပ ။ 
Correct: ပတ အမ ရခမတ အ က အ တင ရတယ။  Example audio: "​ ပတ အမ ရခမတ အ က အ တင  
ရတယ " 
Incorrect: ပတ အမ ရခမတ အ က -၅ ရတယ။ 

 
For all the units that follow numeric values, please use full Burmese words. 
Correct: အမမ ၂ လတ ပလင ရတယ။  Example audio: "​ အမမ စလတ ပလင ရတယ " 

 
Transcribe all numeric values preceding units in numeral form, even if under 10. 
 
If it is clear from context that a number or number sequence refers to currency or time, 
format it as such. 
Correct: မနက ၅:၃၀ စက ပ ထ ပ ။  Example audio: "​ မနကင ခ စက ပ ထ ပ " 

Incorrect: မနက ၅ ခ စက ပ ထ ပ ။ 

Date and time

 
Use the natural form for transcribing dates. 
Correct: ဇလင ၁၂ ရက၊ ၁၉၆၄ ခ စ  Example audio: "​ ဇလငလ တစ စ ရက 
ထ ငက ရ ခ ကဆယ လ ခ စ " 

Correct: ၇၈ ခ စ ရ သ  Example audio: "​ ခနစဆယရစခ စ ရ သ " 

Correct: ၈၀ ဝန က င သခ င မ   Example audio: "​ ရစဆယဝန က င သခ င မ " 

Correct: ဗဒဟ န၊ မတလ ၆ ရက  Example audio: "​ ဗဒဟ န မတလ ခြ ကရက " 

 
Write times in hh:mm format whenever possible, unless it would look unnatural to do 
so. 
Correct: ၃:၀၀ န ရ  Example audio: "​ သ န ရ " 

Incorrect: ၃ န ရ  Explanation: Use the pattern ၁၂:၀၀ to describe time. 

Correct: စကက ၄:၀၀ န ရ ထ   Example audio: "​ စကက လ န ရထ " 

Correct: ၄:၄၅ န ရ  Example audio: "​ င န ရမတတင " 

Correct: ၆:၃၀ အ ရ က ပနလ မယ။  Example audio: "​ ခ ကခအမ ပနလ မယ " 

Incorrect: ၆ န ရ ၃၀ မနစ အ ရ က ပနလ မယ။ 
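
The time pattern shown in the examples above can be sketched as below. The helper `format_time` is an illustrative assumption; following the examples, the hour is not zero-padded while the minutes always use two digits. Deciding whether the result "would look unnatural" is left to the transcriber.

```python
# Burmese digits occupy U+1040 (zero) through U+1049 (nine).
TO_BURMESE = str.maketrans("0123456789", "".join(chr(0x1040 + i) for i in range(10)))


def format_time(hour: int, minute: int = 0) -> str:
    """Format a clock time as hour:minutes in Burmese digits."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    return f"{hour}:{minute:02d}".translate(TO_BURMESE)


print(format_time(3))
print(format_time(6, 30))
```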

 
Use "နနက၊", " နလည၊", "ည န၊", or "ည၊" if spoken. 
Correct: နလည ၁၂:၀၀ န ရတတ လ ခမယ။  Example audio: "​ နလည ဆယ စန ရတတ လ ခမယ 

Address

 
Favor full spellings over abbreviations where natural, but use abbreviations when 
explicitly spoken. 
 
Use the section sign "ပဒက လ " (၊) to separate an entity from its location: ENTITY၊ LOCATION 
Correct: ဝင က ကအ၊ က ကတတ    

Correct: ရ က င ကယ၊ ၅၁ လမ    
Correct: အမအမတ ၃၁၊ ၄၉ လမ    

Correct: ၅၃(ခ)၊ ရ တ င က လမ ၊ ဗဟန မ နယ   

Correct: ဝဇရ ၊ မ လ အတက ပရင ပခ နမ    

Correct: မ င အ က ဖ ၊ ဗလ မတထန လမ င   
ရ က လမ ထ င 

Correct: ရနကန မ တ ခန မ၊ မ လ မ   

Web

 
Write URLs, email addresses, and Twitter hashtags as they are spoken and don't 
capitalize them. 
Correct: www.google.com.mm  Example audio: "​ ဒဘလ သ လ ဒ ဂဂ ဒ ကန ဒ 
အမအမ " 

Correct: amazon.com  Example audio: "​ အ မဇန ဒ ကန " 

Correct: [email protected]  Example audio: "​ ကအ မ င အက ဂ မ ဒ ကန " 

Correct: သမတအသစ #ဦ ထင က   Example audio: "​ သမတအသစ ဟကရတက 


ဦ ထင က " 

 
If the speaker drops a "w" or dots and it's an obvious URL, you should correct these 
errors. If the speaker doesn't say the "w"s at all, do not add them. 
Correct: www.google.com.mm  Example audio: "​ ဒဘလ ဒဘလ ဒ ဂဂ ဒ ကန  
ဒ အမအမ " 

Correct: www.google.com.mm  Example audio: "​ ဒဘလ သ လ ဒ ဂဂ ဒ ကန ဒ 


မနမ " 

Correct: www.bbc.com  Example audio: "​ ဒဘလ သ လ ဘဘစ ဒ ကန " 

Correct: www.youtube.com  Example audio: "​ ဒဘလ သ လ ဒ ယတ ဒ ကန " 
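
The correction rule for a dropped "w" can be illustrated with a small sketch. The function `normalize_spoken_url` is hypothetical; it only repairs a prefix with exactly two w's, lowercases the result, and leaves URLs spoken without any w's untouched. Whether something is "an obvious URL", and where missing dots belong, still has to be judged by the transcriber.

```python
import re


def normalize_spoken_url(url: str) -> str:
    """Lowercase a URL and complete a "ww." prefix to "www."."""
    url = url.lower()
    # Exactly two leading w's followed by a dot: the speaker dropped one "w".
    return re.sub(r"^ww\.", "www.", url)


print(normalize_spoken_url("ww.Google.com.mm"))
print(normalize_spoken_url("amazon.com"))
```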

Abbreviation

 
Do not abbreviate unless the speaker says an abbreviated form. 
Correct: မက န သ သမဟတ က က တ   Example audio: "​ မက န သ သမဟတ က က တ " 

Incorrect: မက န သ (သ) က က တ  

Incorrect: မက န သ သ က က တ  

Correct: မရပည ဟ တယ၊ မန ။  Example audio: "​ မရပည ဟ တယ မန " 

Incorrect: မရပည ဟ တယ၊ မ လ ။ 

Correct: စ ပ က သန ဝန က ဌ န  Example audio: "​ စ ပ က သန ဝန က ဌ န " 

Incorrect: စ ပ ရ င က သန ရ င ဝယ ရ  
ဝန က ဌ န 

 
Capitalize and abbreviate English titles when accompanied by proper English names. 
Correct: Dr. Schuster   

 
In Burmese, transcribe titles as spoken. 
Correct: ပ ရ ဖကဆ ဒ နလ သန    

Correct: ဒ နလ သန ဟ ပ ရ ဖကဆ ဖစပ တယ။   

Correct: ဒ နမ   Example audio: "​ ဒ နမ " 

Incorrect: ဒတယ န က ရမ  Explanation: A job title is abbreviated. Just transcribe as spoken. 

Correct: ဒတယ န က ရမ ခန   Example audio: "​ ဒတယ န က ရမ ခန " 

Incorrect: ဒ နမ ခန   Explanation: Transcribe title as spoken. 

Correct: ရ န ဦ ကန  Explanation: For titles such as Lawyer, Doctor, or Celebrity preceding a proper name, write them with a space and use the standard spelling. 
Incorrect: ရ နဦ ကန 

Correct: မ ပ ပ ဦ က ရင  Explanation: For titles such as Lawyer, Doctor, or Celebrity preceding a proper name, write them with a space and use the standard spelling. 

Correct: အလက က စ က စ ညန   Explanation: For titles such as Lawyer, Doctor, or Celebrity preceding a proper name, write them with a space and use the standard spelling. 

 
In acronyms, do not use periods between letters. 
Correct: NASA, NASCAR, AAMCO, ZIP code   

Correct: အထက၊ တစည၊ ဖဆပလ၊ မခမမ   

Correct: စစစ  Example audio: "​ စသ လ " 

Agreed spelling
Spelling conventions for words where several options are possible, as well as proper 
names. 

Spelling out

 
If a word is spelled out, or obvious pauses are made between letters, write it letter by letter 
as it is said (often done for foreign names or businesses, for example). Use lowercase letters 
for the spelled-out portion. This rule does not apply to acronyms or initialisms, or to 
spelled-out web or email addresses. 
Correct: သ ဝထ မရရစ မ  Example audio: "​ သ ဝထ မရရစ မ " 

Correct: က က ခ က ဂငယ ဃ က င  Explanation: spelled out the consonants 

Correct: CEO  Example audio: "​ စအ အ " 

Explanation: Transcribe common acronyms with Latin letters. 

Correct: VIP တ  Example audio: "​ vip တ " 

Incorrect: ဗအငပ တ 
Explanation: When it is plural, add the plural indicator as pronounced. 

Interjections

 
Transcribe words representing laughter or other non-speech vocalizations with up to 
three syllables, but no more. 
Correct: ဟ ဟ ဟ ဟ    

Correct: ဟဟဟ  Example audio: "​ ဟဟဟဟဟ " 

Incorrect: ဟဟဟဟဟ  Explanation: Do not transcribe more than three syllables. 

Proper names

 
Use official spelling and punctuation for proper names. Google them and pay attention 
to the correct format. The official format and spelling of a proper name may supersede the 
usual written transcription conventions detailed in this document. 
 
If a personal name could have multiple spellings and context does not help choose a 
spelling, use the spelling that yields the most Google search hits when you search for 
the name followed by the word "name" (without quotation marks) (e.g. "Anna name"). 
Correct: သန လင  Explanation: Although "သန ", "သဏ " and "သမ " have the same pronunciation, "သန " is the most common spelling. 
Incorrect: သဏ လင 
Correct: ဦ က ကလ   Explanation: "ဦ က ကလ " is the most common spelling. 
Incorrect: ဦ က ကလ  

Correct: အဘအမည ဦ မင ဆ (ခ) ဦ သ က   Example audio: "​ အဘအမည ဦ မင ဆ ခ ဦ သ က  



Incorrect: အဘအမည ဦ မင ဆ ( ခ ) ဦ သ က  

Correct: Sophia MacDonald  Explanation: Search full names to see if they refer to a celebrity. If the full name does not belong to a celebrity, use the most common spelling for each part of the name (e.g. search "Sophia name" and "MacDonald name"). 

 
Follow the standard spelling when you transliterate to Burmese. For names 
consisting of a first name and a last name, there shouldn't be a space between the first 
name and the surname. Look them up to check their standard spelling. 
Correct: အဒလ  Explanation: If the name is only one word (in 
this example: Adele), transliterate to Burmese. 
Please follow the standard spelling. 

Correct: အနဂ လန ဂ လ  Example audio: "​ အနဂ လန ဂ လ " 

Incorrect: အနဂ လန ဂ လ  Explanation: Some names consisting of a first name and a last name are fully transliterated. In that case, there shouldn't be a space between the first name and the surname. Look them up to check their standard spelling. 

Correct: ဂ ကကခ န   Example audio: "​ ဂ ကကခ န " 

Incorrect: ဂ ကက ခ န   Explanation: Some names consisting of a first name and a last name are fully transliterated. In that case, there shouldn't be a space between the first name and the surname. Look them up to check their standard spelling. 

Correct: က င ဝ  Explanation: Some names consisting of a first name and a last name are fully transliterated. In that case, there shouldn't be a space between the first name and the surname. Look them up to check their standard spelling. 

Correct: လမငဟ  Explanation: Some names consisting of a first name and a last name are fully transliterated. In that case, there shouldn't be a space between the first name and the surname. Look them up to check their standard spelling. 

 
If a speaker makes a small mistake in a proper name, transcribe it anyway as long as 
the difference is minimal. "Minimal differences" refers to adding or dropping articles, 
possessives, and plurals. 
Correct: ညမ လ က အခ စက က ကယတယတလ   Example audio: "​ ညမ လ က အခ စက 
က ကယတယတလ " 
Incorrect: ညမ လ က အခ စက က ကယသတလ  

Correct: The Lord of the Ring  Explanation: Actual name is "The Lord of the Rings". 

Correct: သမ န အ မမ   Example audio: "​ သမ န အ မမ " 

Incorrect: သမ ငအ မမ  

Brand and product


 
Format proper names as they are most commonly formatted on the entity's website 
(especially official documents), if available, or the Wikipedia or IMDb page. In cases of 
ambiguity, defer to their privacy policy. If there are no other sources, use the top Google hits. 
Correct: အလပက ပတ မ အကဖမ မ ပ ။   

Correct: က လ တက Xbox လခ င၊ Wii လခ ငတယ။   

Correct: ယက    

Correct: ဂ သန    

Incorrect: ခ သန  

Correct: မထသ  Explanation: Officially "မထသ"; it is based on an acronym. 

Correct: အငတရပကက  Explanation: Comes from the English acronym "IT" followed by "ရပကက". Usually, English acronyms are transliterated according to Burmese pronunciation. The whole product name is "အငတရပကက". 

Correct: LEGO   

Incorrect: Lego 

Correct: KFC  Explanation: Do not transliterate well-known names based on acronyms like KFC, DHL, ACCA. Look them up to check whether they should be transliterated or not. 
Incorrect: ကအကဖစ 
Correct: United Vision မ ကမနဆင   

 
The phrase "Ok Google", as well as possible derivatives such as "Ok Google Now" and 
"Ok Glass", require their own particular spelling of "okay". This spelling is unique to these 
cases. 
Correct: Ok Google   

Correct: Ok Google ဖငပ    

Correct: Ok Google, ရ၊ က ဖဆင ဘယန မ လ။   

Correct: Ok Google, အ ခ ကကင   

Correct: အ က   

Correct: အ က၊ ဆကလပ။   

Correct: အ က၊ န လည ပ။ ဆကလပ ကစ။   

Media title

 
Refer to the Google Play Store for official spellings of media titles. For film/television, 
IMDb is also available. If an utterance is ambiguous between a media title and a 
sentence or web search, use your judgment for which is more likely; if truly unclear, 
default to media title. 
 
Do not use quotation marks for media titles. 
Correct: သန ဖ မငရ အ ရ က နဝန ထကသညပမ    
စ အပ 

Correct: က န ရ ကကခ စ သခ င ဖငပ ။   

Correct: ပ လကရငခင ကညမယ။   

 
Sometimes, media titles can include numbers or digits. Please transcribe as full words. 
Correct: သတ ခနစတန  Example audio: "​ သတ ခနစတန " 

Incorrect: သတ ၇ တန  Explanation: For media titles, the whole title should be transcribed including numbers. 

Correct: ခ က ဒသမ သညတစ စ အပ  Example audio: "​ ခ က ဒသမ သညတစ စ အပ " 

Incorrect: ၆.၀၁ စ အပ  Explanation: For media titles, the whole title 
should be transcribed including numbers. 

Multiple spellings

 
When multiple spellings are attested, use the first spelling used in the reference 
dictionary for your language. If there is no entry, Google the word and use the form with 
the most hits. 
Correct: ဖည ဖည   Explanation: " ဖည ဖည " is preferred by the Myanmar Language Commission (MLC) Dictionary. 
Incorrect: ဖ ဖ  

Correct: သ ခညဥ  Example audio: "​ သ ဂ ဥ or သ ဂ ဥ " 


Incorrect: သ ခ ညဥ or သ ခဥ  Explanation: Although it is spoken as " သ ဂ ဥ" or " သ ဂ ဥ", please use the standard spelling " သ ခညဥ" from the MLC Dictionary. 

Correct: မငအ   Example audio: "​ မငအ " 

Incorrect: မငအ   Explanation: Certain words are pronounced differently than their spelling, such as the words " မတ " and "မငအ ". Please use the standard spelling from the MLC Dictionary. 

 
Transcribe slang and colloquialisms as spoken according to the appendix on this page. 
Do not alter non-standard speech that the speaker probably wouldn't want corrected. 
Correct: မသဘ  Example audio: "​ မသဘ " 

Incorrect: မသဘ  

Correct: ခဏ လ က   Example audio: "​ ခဏ လ က " 

Incorrect: ခဏ လ က က  

Correct: က ဖ သ ကဦ မယ  Example audio: "​ က ဖ သ ကဦ မယ " 

Incorrect: က ဖ သ ကဦ မယ 

 
Write commonly accepted contractions as usual. Transcribe contractions when you 
hear them spoken. 
Correct: ပနလ မယ မတလ   Example audio: "​ ပနလ မယ မတလ " 

Incorrect: ပနလ မယ မဟတလ   Explanation: The speaker said "မတလ ", which actually means "မဟတလ "; write the utterance as pronounced. 
Correct: ဟတကလ   Example audio: "​ ဟတကလ " 

Incorrect: ဟတကလ   Explanation: The speaker said "ဟတကလ ", which actually means "ဟတကလ "; write the utterance as pronounced. 

 
Use standard spelling for reductions that commonly occur in normal running speech, 
like "သ တ မလ" and "စ မလ ", for "သ တ မ" and "စ မ " respectively. 
Correct: ဘယသတန   Example audio: "​ ဘသတန " 

Correct: သ မ ပ   Example audio: "​ သ မ ဘ " 

Correct: ဟတရ လ   Example audio: "​ ဟတကလ " 

Correct: သ ရ သလ   Example audio: "​ သ က သလ " 

 
If you hear a word that does not sound like a standard word of your language, but it is 
obviously based on real words, suffixes, or prefixes, transcribe as is. 
Correct: ဥ ရဩ  Explanation: Speaker meant "ဥဩ" but added one extra infix " ရ". 

Correct: ပ ဈ က ရခ ရမန က င တယ  Explanation: Unsure whether speaker intended to say " ရခမန" or " ရမန", but clearly articulated and easy to spell. 

 
Transcribe onomatopoeia when clearly spoken. Otherwise, use the Unintelligible label 
as instructed in: Longform generic rules > Unintelligible or foreign or singing. 

Difficult utterances
Everything relating to problematic utterances (background noise, false starts, etc.) or 
different language varieties. 

Skipping a prompt

 
For Loft 1.0: If you can't understand part of the audio, transcribe only the part you can 
understand. For the part you cannot understand, create a separate speaker segment 
and add the Unintelligible label as instructed in: Longform generic rules > Unintelligible 
or foreign or singing. 
 
For Loft 2.0: Below is a list of reasons to skip the audio that may be available for you to 
choose from. Each includes a description of when it is appropriate to use it. 
● No Audio: The audio doesn't load. 
● No Sound: The waveform indicates there is audio but I can't hear anything. 
● Other Locale: All of the speech is in a different language. 
● Silent Audio: The entire utterance is silent. 
● Noisy Audio: The entire utterance is too noisy. 
● Other: Other reason (Please explain). 
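
For tooling that records why an utterance was skipped, the reasons listed above could be modelled as an enumeration. This is only a sketch: the class name `SkipReason` and the helper `requires_explanation` are assumptions, and the member values simply restate the descriptions from the list.

```python
from enum import Enum


class SkipReason(Enum):
    """Skip reasons for Loft 2.0, with the description given in the list."""
    NO_AUDIO = "The audio doesn't load."
    NO_SOUND = "The waveform indicates there is audio but I can't hear anything."
    OTHER_LOCALE = "All of the speech is in a different language."
    SILENT_AUDIO = "The entire utterance is silent."
    NOISY_AUDIO = "The entire utterance is too noisy."
    OTHER = "Other reason (Please explain)."


def requires_explanation(reason: SkipReason) -> bool:
    """Only the catch-all reason asks the transcriber for an explanation."""
    return reason is SkipReason.OTHER


print(SkipReason.OTHER_LOCALE.value)
```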

 
For utterances that contain speech that is user-generated, pre-recorded, or synthesized, 
transcribe all of it. 

Hesitations and truncations

 
If a speaker says only the beginning of a word, transcribe it if it can be considered 
a word on its own. Otherwise, do not transcribe the false start. 
Correct: အလ မ ခ ကန   Example audio: "​ အ အလ မ ခ ကန " 

Correct: သ သ ၊ မလ ခန   Example audio: "​ သ- သ သ မလ ခန " 


 
Complete words that have been truncated only if a very small portion of the word is 
missing (one syllable or less in a multisyllable word) and it is obvious what the word 
should be. In cases of ambiguity, do not transcribe the cut-off word. Do not put 
punctuation at the end of truncated words. 
Correct: ကကသ ပလ တ  Example audio: "​ ကကသ ပလ တ- " 

Explanation: Final vowel "ာ" was truncated from the word "တ ". When the word is well known, please transcribe it fully. 

Correct: ကကဥ ခ ကမန စ   Example audio: "​ ကကဥ ခ ကမန စ ပ- " 

Explanation: Unclear whether the speaker would have said "ပ " or "မလ " at the end. When it is unclear how the sentence ends, transcribe up to the last clearly heard word. 

Correct: ကနဘလ မ   Example audio: "​ -အနဘလ မ " 

Explanation: Initial consonant "က" from "ကန" was cut off. 

Correct: က ပင က ကနယသ ပ   Example audio: "​ -ပင က ကနယသ ပ " 

Explanation: Initial syllable " က " from " က ပင က က" was cut off. 

 
Twisted words are sometimes used in Burmese. The speaker may or may not have 
actual apraxia; sometimes the speaker deliberately swaps vowels or syllables. 
English examples: "kitchen" - "chicken" or "tea pot" - "top pea". If the 
speaker uses a twisted word, do not correct it; transcribe as spoken. 
Correct: သဘ တ ပ သလ ။  Example audio: "​ သဘ တ ပ သလ " 

Incorrect: သ ဘ တပ သလ ။  Explanation: It might be said this way on purpose. No need to correct it. 

Correct: ထမ စင ကမယ။  Example audio: "​ ထမ စင ကမယ " 

Explanation: Understood as a call for meal time, "ထမင စ ", but people may say "ထမ စင " by swapping the vowels of each syllable. Please transcribe as spoken. 

Correct: က ပက ကလတ ပ ။  Example audio: "​ က ပက ကလတ ပ " 

Correct: ဘ က လ က င နတယ။  Example audio: "​ ဘ က လ က င နတယ " 

Explanation: The speaker actually wants to say "ဘ က " but it is more fun to say it backwards, so the uttered word may become "ဘ က ". Please transcribe as spoken. 

Correct: အ ဘ က န ဝယလက။  Example audio: "​ အ ဘ က န ဝယလက " 

Explanation: The speaker actually wants to say "eBay" but happened to say it backwards, so the uttered word may become " အ ဘ ". Please transcribe as spoken. 

 
Transcribe any filler words that are applicable and used in the target language. Below 
are examples of filler words in the English language. These may or may not be 
applicable in the target language. Again, only transcribe filler words that exist in and are 
used in the target language. 
● um 
● uh 
● right 
● you know 
● so 
● like 

 
Below is a list of filler words that should be transcribed only when they are affirmations or 
answers to a question. 
● ah 
● er 
● mhm 

Correct: I need to get a new um telephone.   

Correct: Do you like it? Mhm.   

 
Only transcribe foreign words if they are common in your language or if speakers of 
your language would understand them. If a word is foreign and speakers of your language 
would not know it, use the Foreign Speech label as instructed in: Longform generic 
rules > Unintelligible or foreign or singing. 

Background and foreground speech

Foreign language

Accents

 
Correct non-standard pronunciations to their standard ones. Non-standard 
pronunciations could be from speakers of regional dialects, language learners, or 
speakers from different countries. 
Correct: အ က က သယရတ   Example audio: "​ အ က က သယရသ " 

Incorrect: အ က က သယရသ   Explanation: Speaker said "သ " instead of "တ ", 
but it should be spelled as standard. 

Correct: က ဇ တငတ   Example audio: "​ က ဇ တငသ " 

Incorrect: က ဇ တငသ   Explanation: Speaker said " က ဇ တငတ " with a Northern accent, using "သ" instead of "တ". 

Correct: ကမ ယ စ လ။  Explanation: Upper-river-region dialect uses "ကင " instead of "ကမ ". Use the standard spelling. 
Incorrect: ကင ယ စ လ။ 

Correct: မ ကင ဂမန ရတယ။  Explanation: Upper-river-region dialect uses "ကန " instead of "ကင ". Use the standard spelling. 
Incorrect: မ ကန ဂမန ရတယ။ 

Correct: သတအမက သရကသ တ င လ တယ။  Example audio: "​ တတအမက တရကတ  


တ င လ တယ " 
Incorrect: တတအမက တရကတ တ င လ တယ။ 
Explanation: Speakers may unintentionally mix up the pronunciation of "တ" and "သ". Please use the standard spelling regardless of mistakes in the utterance. 
