Summary: in this tutorial, you will learn about the Go string type and how to manipulate strings effectively.
Introduction to Go string
ASCII (American Standard Code for Information Interchange) uses 7-bit integers (0-127) to encode 128 specified characters including English letters (both uppercase and lowercase), digits (0-9), control characters (newline, carriage return), and special symbols (punctuation marks). Each character in the ASCII character set fits in one byte.
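Unicode assigns a unique number, called a code point, to characters from all writing systems, which makes it much more comprehensive than ASCII. A fixed-width encoding such as UTF-32 stores every code point in 4 bytes.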
But we don’t want to use 4 bytes to encode every character in programs. To make programs more efficient, we use UTF-8 (Unicode Transformation Format – 8 bit) with a variable-width encoding for encoding Unicode characters.
UTF-8 uses 1 to 4 bytes to represent a character:
- 1 byte for characters in the ASCII range.
- 2 to 4 bytes for other characters.
This makes UTF-8 compact for text that is mostly ASCII, while still covering every Unicode character.
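As a rough illustration of the variable-width encoding, the following sketch uses utf8.RuneLen from the standard unicode/utf8 package to report how many bytes a few sample characters need. The encodedWidth helper and the chosen characters are only for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedWidth is a hypothetical helper that returns the number of bytes
// UTF-8 needs to encode the rune r.
func encodedWidth(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// 'a' (ASCII), 'é' (U+00E9), '€' (U+20AC), and an emoji (U+1F600).
	for _, r := range []rune{'a', '\u00e9', '\u20ac', '\U0001F600'} {
		fmt.Printf("%c: %d byte(s)\n", r, encodedWidth(r))
	}
}
```

The four sample characters take 1, 2, 3, and 4 bytes respectively.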
In Go, strings are sequences of bytes that are typically encoded in UTF-8. Additionally, strings are immutable. It means that once they are created, they cannot be changed.
Go strings can include Unicode characters, which makes them very flexible for handling text from various languages.
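To make immutability concrete, here is a minimal sketch. Assigning to a byte of a string does not compile, so one common workaround is to copy the string into a byte slice, change the copy, and build a new string. The replaceFirstByte helper is hypothetical and only meant for illustration.

```go
package main

import "fmt"

// replaceFirstByte returns a new string whose first byte is b.
// The original string is never modified because strings are immutable.
func replaceFirstByte(s string, b byte) string {
	if s == "" {
		return s
	}
	buf := []byte(s) // the conversion copies the bytes of s
	buf[0] = b
	return string(buf)
}

func main() {
	original := "Hi"
	// original[0] = 'h' // would not compile: cannot assign to original[0]
	changed := replaceFirstByte(original, 'h')
	fmt.Println(original, changed) // Hi hi
}
```

Note that this sketch works on bytes, so it is only safe when the first character is ASCII.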
Declaring string variables
To declare a string variable, you use the var keyword, followed by the variable name and the string type:
var message string
In this example, we declare the message variable with the type string. The message variable takes the “zero value” as its default value, which is an empty string "".
You can declare a string variable and initialize its value at the same time:
var message string = "Hi"
In this example, we declare the message variable and initialize its value to the string "Hi".
Inside a function, you can use the short variable declaration syntax to make the code more concise:
message := "Hi"
Creating string literals
In Go, string literals have two forms: plain and raw string literals.
- Plain string literals (called interpreted string literals in the Go specification) are enclosed in double quotes. They can include escape sequences, which Go converts to their corresponding characters. For example, \n for newline, \t for tab, and \\ for backslash.
- Raw string literals are enclosed in backticks. They do not support escape sequences. Raw string literals are useful for multiline strings or strings that include many special characters, such as HTML code or regular expressions.
Here’s an example of a plain string literal:
message := "They said:\n\"Go is awesome\"."
In this string, Go converts \n to the newline character and \" to a double quote:
package main

import "fmt"

func main() {
	message := "They said:\n\"Go is awesome\"."
	fmt.Println(message)
}
Output:
They said:
"Go is awesome".
It’s equivalent to the following raw string literal:
package main

import "fmt"

func main() {
	message := `They said:
"Go is awesome".`
	fmt.Println(message)
}
Output:
They said:
"Go is awesome".
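Raw string literals are also handy when a string contains many backslashes. The following sketch writes the same text both ways and checks that the two literals are equal. The Windows-style path and the samplePaths helper are made up for illustration.

```go
package main

import "fmt"

// samplePaths returns the same hypothetical path written as a plain
// (interpreted) literal and as a raw literal.
func samplePaths() (plain, raw string) {
	plain = "C:\\temp\\notes.txt" // each backslash must be escaped
	raw = `C:\temp\notes.txt`     // backslashes are taken literally
	return plain, raw
}

func main() {
	plain, raw := samplePaths()
	fmt.Println(plain == raw) // true
	fmt.Println(raw)          // C:\temp\notes.txt
}
```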
Getting length
The built-in len() function returns the number of bytes in a string, not the number of characters:
message := "elite"
fmt.Println(len(message))
Output:
5
If a string includes non-ASCII characters, the len() function returns a number that is higher than the number of characters in the string. For example:
message := "élite"
fmt.Println(len(message))
Output:
6
In this example, the string "élite" has 5 characters but 6 bytes, because the character é takes 2 bytes in UTF-8.
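If you need the number of characters rather than bytes, one option is utf8.RuneCountInString from the standard unicode/utf8 package. The sketch below compares both counts. The counts helper is hypothetical, and the example assumes é is stored as the single code point U+00E9.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// counts returns the number of bytes and the number of runes
// (Unicode code points) in s.
func counts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	nBytes, nRunes := counts("\u00e9lite") // "élite"
	fmt.Println(nBytes, nRunes)            // 6 5
}
```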
Accessing characters
To access individual characters in a string, you use index notation:
string[index]
Note that this returns the byte at the given index, not the character. Strings use zero-based indexing, meaning that the first index is zero.
For example:
package main

import "fmt"

func main() {
	message := "elite"
	fmt.Println(message[0])
}
Output:
101
101 is the Unicode value (code point) of the character 'e'. To show the character, you can use the Printf function from the fmt package with the %c verb:
package main

import "fmt"

func main() {
	message := "elite"
	fmt.Printf("%c", message[0])
}
Output:
e
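Because indexing returns a single byte, it does not work well for non-ASCII characters. A for range loop over a string decodes UTF-8 and yields one rune per iteration instead. The following is a minimal sketch, and the runeAt helper is hypothetical.

```go
package main

import "fmt"

// runeAt returns the n-th character (rune) of s, counting from zero.
// The boolean result is false when s has fewer than n+1 characters.
func runeAt(s string, n int) (rune, bool) {
	i := 0
	for _, r := range s { // range decodes one UTF-8 character per step
		if i == n {
			return r, true
		}
		i++
	}
	return 0, false
}

func main() {
	s := "\u00e9lite" // "élite"

	fmt.Printf("%c\n", s[0]) // prints Ã, the first byte of é taken on its own

	// The index is the byte offset at which each character starts.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r) // 0:é 2:l 3:i 4:t 5:e
	}
	fmt.Println()

	if r, ok := runeAt(s, 0); ok {
		fmt.Printf("%c\n", r) // é
	}
}
```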
Concatenating strings
To concatenate two strings into a single string, you use the + operator. For example:
package main

import "fmt"

func main() {
	message := "Hi" + " Bob"
	fmt.Println(message)
}
Output:
Hi Bob
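Since strings are immutable, every + creates a new string. When you build a string from many pieces, for example in a loop, strings.Builder from the standard library is a common alternative. The following is a sketch, and the joinWords helper is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords concatenates words, separated by single spaces,
// using a strings.Builder instead of repeated + operations.
func joinWords(words []string) string {
	var sb strings.Builder
	for i, w := range words {
		if i > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(w)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinWords([]string{"Hi", "Bob"})) // Hi Bob
}
```

For a simple separator like this one, strings.Join would give the same result. The builder is shown here because it also covers cases where the pieces are produced one at a time.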
Comparing strings
To compare two strings, you use the comparison operators ==, !=, <, <=, >, and >=. Go compares strings byte by byte in lexicographic order:
package main

import "fmt"

func main() {
	name := "Joe"
	result := name == "Joe" // true
	fmt.Println(result)
}
Output:
true
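The ordering operators compare strings byte by byte, so uppercase ASCII letters sort before lowercase ones, and == is case-sensitive. The sketch below shows this, along with strings.EqualFold for a case-insensitive equality check. The before helper and the sample names are only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// before reports whether a sorts before b in byte-wise lexicographic order.
func before(a, b string) bool {
	return a < b
}

func main() {
	fmt.Println(before("apple", "banana")) // true
	fmt.Println(before("Zoe", "adam"))     // true: 'Z' (90) is less than 'a' (97)

	name := "Joe"
	fmt.Println(name == "joe")                  // false: == is case-sensitive
	fmt.Println(strings.EqualFold(name, "joe")) // true: ignores case
}
```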
Converting strings and other types
To convert between strings and values of other types, you use functions from the strconv package. For example, the following illustrates how to convert an integer (200) to a string:
package main

import (
	"fmt"
	"strconv"
)

func main() {
	i := 200
	str := strconv.Itoa(i)
	fmt.Println(str) // Output: 200
}
Output:
200
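Converting in the other direction can fail, so functions such as strconv.Atoi return an error alongside the result. The following sketch shows one way to handle it. The toInt helper and its fallback behavior are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// toInt converts s to an int and returns fallback when s is not a valid integer.
func toInt(s string, fallback int) int {
	n, err := strconv.Atoi(s)
	if err != nil {
		return fallback
	}
	return n
}

func main() {
	n, err := strconv.Atoi("200")
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println(n + 1) // 201

	// "20x" is not a valid integer, so the fallback is returned.
	fmt.Println(toInt("20x", -1)) // -1

	// strconv also parses other types, for example floats and booleans.
	f, _ := strconv.ParseFloat("3.5", 64)
	b, _ := strconv.ParseBool("true")
	fmt.Println(f, b) // 3.5 true
}
```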
String functions
The strings package includes many useful functions for working with strings. Here are some common ones:
package main

import (
	"fmt"
	"strings"
)

func main() {
	str := "Hi there!"

	// Contains
	fmt.Println(strings.Contains(str, "there")) // Output: true

	// Count
	fmt.Println(strings.Count(str, "e")) // Output: 2

	// HasPrefix
	fmt.Println(strings.HasPrefix(str, "H")) // Output: true

	// HasSuffix
	fmt.Println(strings.HasSuffix(str, "!")) // Output: true

	// Index
	fmt.Println(strings.Index(str, "there")) // Output: 3

	// ToUpper
	fmt.Println(strings.ToUpper(str)) // Output: HI THERE!

	// ToLower
	fmt.Println(strings.ToLower(str)) // Output: hi there!
}
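A few more functions from the strings package that often come up are TrimSpace, Split, Join, Fields, and ReplaceAll. The sketch below shows them on sample inputs. The normalize helper is hypothetical and simply combines Fields and Join.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize trims leading and trailing whitespace and collapses
// runs of whitespace inside s into single spaces.
func normalize(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	// TrimSpace
	fmt.Println(strings.TrimSpace("  Hi there!  ")) // Output: Hi there!

	// Split
	fmt.Println(strings.Split("a,b,c", ",")) // Output: [a b c]

	// Join
	fmt.Println(strings.Join([]string{"a", "b", "c"}, "-")) // Output: a-b-c

	// ReplaceAll
	fmt.Println(strings.ReplaceAll("Hi there!", "e", "E")) // Output: Hi thErE!

	// Fields + Join
	fmt.Println(normalize("  Hi    there!  ")) // Output: Hi there!
}
```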
Summary
- Strings are immutable sequences of bytes, which represent text in Go.
- Plain string literals are enclosed in double quotes, while raw string literals are enclosed in backticks.
- Use the + operator to concatenate strings.
- len(string) returns the number of bytes of the string.
- Use comparison operators to compare two strings.
- Use functions from the strconv package to convert between strings and values of other types.
- Use functions from the strings package to manipulate strings.