Utf8.next
This is an iterator function that traverses a UTF-8 string one codepoint at a time.
Syntax
int, int utf8.next ( string input [, int charpos = 0 [, int offset = 1 ]] )
Required Arguments
- input: A string character sequence
Optional Arguments
NOTE: When using optional arguments, you might need to supply all arguments before the one you wish to use. For more information on optional arguments, see optional arguments.
- charpos: An integer representing the starting position (offset is added to or subtracted from it).
- offset: An integer representing the offset to charpos.
Returns
Returns the integer position in bytes and the integer codepoint at that position, or nil if there is no further character to return.
Example
Server
This example shows how to traverse a UTF-8 string codepoint by codepoint, avoiding the problems that arise when it is treated as a plain byte string.
for position, codepoint in utf8.next, "utf8-string" do
    print( "Codepoint @ ".. position .." = ".. codepoint )
end

for position, codepoint in utf8.next, "Как" do
    print( "Codepoint @ ".. position .." = ".. codepoint )
end
Output:
// 1st iteration
Codepoint @ 1 = 117
Codepoint @ 2 = 116
Codepoint @ 3 = 102
Codepoint @ 4 = 56
Codepoint @ 5 = 45
Codepoint @ 6 = 115
Codepoint @ 7 = 116
Codepoint @ 8 = 114
Codepoint @ 9 = 105
Codepoint @ 10 = 110
Codepoint @ 11 = 103
// 2nd iteration
Codepoint @ 1 = 1050
Codepoint @ 3 = 1072
Codepoint @ 5 = 1082
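In the second run the positions advance by two (1, 3, 5) because each Cyrillic letter occupies two bytes in UTF-8, while the reported position is a byte index. The following pure-Lua sketch illustrates how an iterator of this kind can derive those values. It is not the library's implementation: next_codepoint is a hypothetical helper written for illustration only, and it assumes the input is well-formed UTF-8 and performs no validation.

```lua
-- Hypothetical helper (NOT part of the utf8 library): returns the byte
-- position and codepoint of the first character after byte position `pos`,
-- or nil when the end of the string is reached.
-- Assumption: `input` is well-formed UTF-8; malformed input is not handled.
local function next_codepoint(input, pos)
    pos = pos or 0

    -- Skip continuation bytes (10xxxxxx) to reach the next lead byte.
    local i = pos + 1
    while i <= #input do
        local b = input:byte(i)
        if b < 0x80 or b >= 0xC0 then break end
        i = i + 1
    end
    if i > #input then return nil end

    -- The lead byte determines how many continuation bytes follow.
    local lead = input:byte(i)
    local extra, code
    if lead < 0x80 then
        extra, code = 0, lead            -- 0xxxxxxx (ASCII)
    elseif lead < 0xE0 then
        extra, code = 1, lead - 0xC0     -- 110xxxxx
    elseif lead < 0xF0 then
        extra, code = 2, lead - 0xE0     -- 1110xxxx
    else
        extra, code = 3, lead - 0xF0     -- 11110xxx
    end

    -- Each continuation byte contributes its low 6 bits.
    for k = 1, extra do
        code = code * 64 + (input:byte(i + k) - 0x80)
    end
    return i, code
end

-- "\208\154\208\176\208\186" is the UTF-8 byte sequence for "Как".
for position, codepoint in next_codepoint, "\208\154\208\176\208\186" do
    print( "Codepoint @ ".. position .." = ".. codepoint )
end
```

Because the helper takes the string and the previous position and returns nil at the end, it can be used directly in a generic for loop, in the same way as utf8.next in the example above.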
See Also
- utf8.byte
- utf8.char
- utf8.charpos
- utf8.escape
- utf8.find
- utf8.fold
- utf8.gmatch
- utf8.gsub
- utf8.insert
- utf8.len
- utf8.lower
- utf8.match
- utf8.ncasecmp
- utf8.next
- utf8.remove
- utf8.reverse
- utf8.sub
- utf8.title
- utf8.upper
- utf8.width
- utf8.widthindex