How does character encoding impact the size of my SMS?

Understand what type of characters can be used in SMS messages (special characters, accents, etc) and how it impacts SMS length and cost.

Written by Selma
Updated over a week ago

An SMS message can be composed of one or several message parts, each part being limited to a maximum number of characters. The number of characters allowed per message part depends on two factors:

  1. The type of characters used in the message

  2. The number of required message parts (one or several)

Types of characters used in SMS

Depending on the character encoding, two types of character sets can be used in short messages:

A. The GSM-7 character set

This includes the basic Latin alphabet (A-Z), numbers (0-9), and a set of common symbols:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z 0 1 2 3 4 5 6 7 8 9 : ; < = > ? ¡ ¿ ! " # ¤ % & ( ) ' * + , - . / Ä Ö Ñ Ü § ä ö ñ ü à @ £ $ ¥ è é ù ì ò Ç Ø ø Å å Δ _ Φ Γ Λ Ω Π Ψ Σ Θ Ξ Æ æ ß É

Note: the following characters are also part of the GSM-7 character set, but each one counts as 2 characters instead of 1 because it is sent with an escape prefix:

€ ^ { } [ ] ~ |

B. The Unicode (UNI) character set

This includes all characters that are not part of the GSM-7 set listed above. For instance:

  • Additional accented letters (ë, â, ï, etc)

  • Non-Latin alphabets (Arabic, Chinese, Cyrillic, etc)

  • Emojis

  • Other symbols (©, ™, ★, etc)

Message size and character count

The number of characters allowed in a message part depends on the character encoding used.

Short messages are encoded in GSM-7 by default. If a message contains at least one character from the Unicode set, the whole message is encoded in Unicode (UNI).
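The encoding rule above can be sketched in a few lines of Python. This is a minimal illustration, not Batch's implementation: the names GSM7_BASIC, GSM7_EXTENDED, and sms_encoding_and_length are hypothetical, the character sets are copied from the lists in this article, and space and line breaks are assumed to belong to GSM-7. For Unicode messages the sketch counts one character per Python code point, which is a simplification (some emojis may count for more than one).

```python
# Basic GSM-7 characters, copied from the list in this article.
# Assumption: space and line breaks are also GSM-7 characters.
GSM7_BASIC = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    ":;<=>?¡¿!\"#¤%&()'*+,-./"
    "ÄÖÑÜ§äöñüà@£$¥èéùìòÇØøÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " \n\r"
)
# GSM-7 characters that count as 2 characters each.
GSM7_EXTENDED = set("€^{}[]~|")


def sms_encoding_and_length(text):
    """Return (encoding, character count) for a message body."""
    count = 0
    for ch in text:
        if ch in GSM7_BASIC:
            count += 1
        elif ch in GSM7_EXTENDED:
            count += 2
        else:
            # One character outside GSM-7 switches the whole message to Unicode.
            return "Unicode", len(text)
    return "GSM-7", count


print(sms_encoding_and_length("Price: 10€"))     # ('GSM-7', 11)
print(sms_encoding_and_length("Joyeux Noël !"))  # ('Unicode', 13)
```

In the first example the € sign counts as 2 characters, so the 10 visible characters give a count of 11. In the second, the single ë is enough to switch the whole message to Unicode.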

GSM-7 encoding

A short message containing only characters from the GSM-7 character set fits in one message part of up to 160 characters, or is split into several message parts of up to 153 characters each.

Unicode encoding (UNI)

A short message containing at least one character from the Unicode set fits in one message part of up to 70 characters, or is split into several message parts of up to 67 characters each.

In both encodings, the maximum number of characters per message part is lower when the message contains several parts, because an invisible data header is added to each part to allow concatenation.

In summary:

Character encoding    Number of message parts    Max message part length
GSM-7                 1                          160 chars
GSM-7                 2 or more                  153 chars
Unicode               1                          70 chars
Unicode               2 or more                  67 chars
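As a rough illustration of the limits summarized above, the number of message parts can be estimated from the encoding and the character count. The names LIMITS and message_parts are hypothetical, and the sketch assumes the text is split evenly at the per-part limit, ignoring any edge cases at part boundaries.

```python
import math

# (single-part limit, per-part limit once the message is concatenated)
LIMITS = {"GSM-7": (160, 153), "Unicode": (70, 67)}


def message_parts(encoding, length):
    """Estimate how many message parts a message of `length` characters needs."""
    single, multi = LIMITS[encoding]
    if length <= single:
        return 1
    # Each part of a multi-part message carries fewer characters.
    return math.ceil(length / multi)


print(message_parts("GSM-7", 160))    # 1
print(message_parts("GSM-7", 161))    # 2
print(message_parts("Unicode", 135))  # 3
```

Note how a single extra character (161 instead of 160) doubles the number of parts, since both parts are then limited to 153 characters.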

Message editing and preview

While you edit, Batch's SMS editor shows a preview of the length of your message, the number of message parts, and the type of encoding.

Auto-encoding

Batch's auto-encoding feature helps reduce the number of message parts in an SMS message by transliterating Unicode characters into equivalent GSM-7 characters.

For instance, many characters in the Unicode character set have a close equivalent in the GSM-7 character set.

Here are a few examples:

Character

Replacement

"

}

)

'
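The idea behind this kind of transliteration can be sketched as a simple character map. The map below is hypothetical and only covers typographic quotes as an example. It is not the list of replacements Batch actually applies, and the names REPLACEMENTS and transliterate are assumptions.

```python
# Hypothetical replacement map (Unicode character -> GSM-7 equivalent).
REPLACEMENTS = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def transliterate(text):
    """Replace each mapped character and leave all other characters unchanged."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in text)


print(transliterate("It\u2019s \u201cfree\u201d"))  # It's "free"
```

After the replacement, the example contains only GSM-7 characters, so it can use the 160-character limit instead of the 70-character one.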

Marketing vs Transactional SMS

Transactional SMS is used to relay transactional information. You can use this type of SMS to share important information such as order status, delivery notifications, etc.

On the other hand, Marketing SMS is used to share promotional content. Unlike the former, this type of SMS requires prior consent from the recipient as well as clear unsubscription instructions within the message.

Thus, when an SMS automation or campaign is set to only target users who are subscribed to your SMS Marketing communications, Batch will automatically append unsubscription instructions at the end of your message.

Note: The characters used by the STOP mention are included in the character count of your SMS.

How does encoding impact SMS cost?

The cost of sending an SMS is based on the number of message parts it contains, regardless of the encoding used. For example, a 2-part SMS costs twice the price per message part.

That being said, a message is more likely to need a higher number of message parts when it uses Unicode encoding, since fewer characters fit in each message part.

Note: the cost per message part may vary based on the recipient's country.
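The pricing rule above comes down to a single multiplication, shown here as a short sketch. The function name sms_cost and the price of 0.05 per message part are made up for illustration. Actual prices depend on your contract and on the recipient's country.

```python
from decimal import Decimal


def sms_cost(parts, price_per_part):
    """Cost of one SMS: number of message parts times the price per part."""
    return parts * Decimal(price_per_part)


# Hypothetical price of 0.05 per message part.
print(sms_cost(1, "0.05"))  # 0.05
print(sms_cost(2, "0.05"))  # 0.10
```

Decimal is used instead of float so that amounts of money are not affected by binary rounding.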


This article belongs to Batch's FAQ.
