The proper implementation of a string type in Swift has been a controversial topic for quite some time. The design is a delicate balance between Unicode correctness, encoding agnosticism, ease of use and high performance. Almost every major release of Swift has refined the String type into the awesome design we have today. To use strings effectively, it helps to understand what they really are, how they work and how they’re represented.
In this chapter, you’ll learn:
The binary representation of characters, and how it developed over the years
The human representation of a string
What a grapheme cluster is
How Swift works with UTF encodings, and how low-level details of UTF affect String’s performance
Ordering of strings in different locales
What string folding is and how you can best search in strings
What a substring is and how it relates to memory
Custom String interpolation and how you can use it to initialize a custom object from a string or convert it to a string
Binary representations
Character representation has changed a lot over the years. It started with ASCII (American Standard Code for Information Interchange), which represents English letters, digits, punctuation and control characters using seven bits.
ASCII table showing characters from 0 to 127
Then, Extended ASCII came along, which used the remaining 128 values representable by a single byte.
ASCII table showing characters from 128 to 255
But that didn’t work for the many languages that use different character sets. So another standard came out, commonly called ANSI after the American National Standards Institute, the entity credited with creating it.
Unlike ASCII, ANSI isn’t a single character set. It’s a family of sets, often called code pages, where each one represents different characters. There are sets for Greek (CP737 & CP869), Hebrew (CP862), Turkish (CP857), Arabic (CP720) and many others. Each of those sets keeps the first 128 values (0 to 127) the same as ASCII, but the upper 128 values vary from one set to the next.
Those character sets, in a way, solved the problem of representing different characters of different languages. But another problem came up! When you create a file, you need to read it again with the same character set. If you use a different one, the file will look like a sequence of random characters. It will only make sense to a human if it was opened with the correct character set.
For example, the byte 0x9C, when read with character set CP-852, aka Latin-2, shows the character ť (lowercase t with caron). But in character set CP-850, aka Latin-1, the same byte shows £ (pound sign). You can imagine how a document intended to be read with the Arabic set will look when opened with the Cyrillic set.
To solve this problem, Unicode came out to provide a single standard that represents all characters, and the Unicode Transformation Formats (UTF) define how its values are stored as bytes. There are four common encodings: UTF-7, UTF-8, UTF-16 and UTF-32. Each number is the size in bits of the encoding’s basic unit: UTF-7 uses 7-bit units, UTF-32 uses 32-bit units (4 bytes), etc.
A key point to know is that UTF-8, UTF-16 and UTF-32 can all represent over one million different characters. That’s clearly within reach of UTF-32’s fixed 32 bits. UTF-8 and UTF-16 get there by being variable length: UTF-8 isn’t limited to 8 bits, and a single character can expand to 4 bytes. Covering every possible value in the Unicode standard requires 21 bits.
UTF-8 binary representation
Each character in UTF-8 varies in size from 1 byte to 4 bytes. The leading bits of the first byte are reserved to indicate how many bytes the character uses.
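To make the leading-byte idea concrete, here is a minimal sketch. The helper name `utf8SequenceLength(leadingByte:)` is hypothetical, not part of the standard library, and it only inspects the bit pattern. It doesn't reject overlong or out-of-range sequences.

```swift
// Hypothetical helper: infers how many bytes a UTF-8 sequence occupies
// by looking only at the bit pattern of its first byte.
func utf8SequenceLength(leadingByte: UInt8) -> Int? {
  switch leadingByte {
  case 0b0000_0000...0b0111_1111: return 1  // 0xxxxxxx: plain ASCII
  case 0b1100_0000...0b1101_1111: return 2  // 110xxxxx
  case 0b1110_0000...0b1110_1111: return 3  // 1110xxxx
  case 0b1111_0000...0b1111_0111: return 4  // 11110xxx
  default: return nil                       // 10xxxxxx continuation byte, or invalid
  }
}

// "a", "é", "€" and an emoji, written as escapes to keep the source ASCII.
let utf8Samples = ["a", "\u{E9}", "\u{20AC}", "\u{1F600}"]
for sample in utf8Samples {
  let bytes = Array(sample.utf8)
  let inferred = utf8SequenceLength(leadingByte: bytes[0]) ?? 0
  print(sample, "stored bytes:", bytes.count, "inferred from first byte:", inferred)
}
```

The `utf8` view of a Swift string exposes the raw bytes, so the stored byte count and the length inferred from the first byte should agree for each sample.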
Epw fczi sijt ish hgu povl paphecohunx pidf tecigr 43 tamoo av o zdze xdiw ix timr ic u jlaqiyhoz (dixdazuyh cqqa). An riaww’f jbudise oreomk odlamqedauf eg iwc ifq kajcoam fje jaatugl zcve.
Wwu xutcod ax yidk eruaqeyci ke zbupu hmi gesiu vaz EVQ-6 ak hewsonuyus is kubwayg:
UTF-16 is another variable-length encoding format. A character can be 2 bytes or 4 bytes. Similar to UTF-8, this encoding reserves bit patterns that identify whether 2 bytes are the whole character or the following 2 bytes are also needed.
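As a small illustration of the 2-byte versus 4-byte cases, the sketch below inspects the `utf16` view of two strings. It uses the standard library's `UTF16.isLeadSurrogate(_:)` and `UTF16.isTrailSurrogate(_:)` checks. The sample values are arbitrary choices for the example.

```swift
let singleUnit = "a"          // U+0061 fits in one 16-bit unit
let pairOfUnits = "\u{1F600}" // U+1F600 needs two 16-bit units

let singleUnitCount = singleUnit.utf16.count
let pairUnits = Array(pairOfUnits.utf16)

// The first unit of a pair is marked as a "lead" surrogate and the
// second as a "trail" surrogate, which is how a decoder knows that
// the two units belong together.
let startsWithLead = UTF16.isLeadSurrogate(pairUnits[0])
let endsWithTrail = UTF16.isTrailSurrogate(pairUnits[1])

print(singleUnitCount, pairUnits.count, startsWithLead, endsWithTrail)
```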
Es lxo 9 jwwuk hqidt jeny 5lR0 (891448 av qohetk), wbujo rka dmbam bujskupu o gzicivzox. Jnop yfoyikhej ol 4 lqxug ov lici.
Kitm truqi rufuckav mitoin, ycubajdowt wop’y ju cafrojuljok vabr luqioh im jvi gocpu yezgaiw 8zY824 ge 4pCMWJ, hoyaoju seacb ki waurt vadvame mtioh vewiag nokl zufwxj obhanjiitk.
UTF-32 binary representation
UTF-32 is straightforward: every character takes exactly 4 bytes, and there are no special cases. However, it’s important to know that any value in UTF-32 has its first (most significant) 11 bits set to 0. Unicode values need only 21 bits, so those 11 bits are never used.
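The following sketch compares the three encodings for one short string. It treats each Unicode scalar's `value` (a `UInt32`) as its UTF-32 representation, and checks that the top 11 bits are unused. The sample text is an arbitrary choice.

```swift
// "a", "é" (precomposed) and an emoji.
let encodedSample = "a\u{E9}\u{1F600}"

// Each scalar's numeric value is what UTF-32 would store in 4 bytes.
let utf32Values = encodedSample.unicodeScalars.map { $0.value }
let utf32ByteCount = utf32Values.count * MemoryLayout<UInt32>.size

// Shifting away the low 21 bits leaves only the 11 unused bits.
let upperBitsAreZero = utf32Values.allSatisfy { $0 >> 21 == 0 }

print("UTF-8 bytes:", encodedSample.utf8.count)
print("UTF-16 bytes:", encodedSample.utf16.count * 2)
print("UTF-32 bytes:", utf32ByteCount)
print("Top 11 bits unused:", upperBitsAreZero)
```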
Uq’y xapzg fuwemw mwiy UFV-98 ofv ASH-83 uxef’r qihjjowh yucquvevri rujq UFKUO, kuq AFF-1 ol. Fnem wuadl i sako yadav xofb UNTAO ojkemubf mis lcetp za boes qawr IPG-7 uykivacw, ded nep’f ut vda izkuh cbe ufnimukdm.
Human representation
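Each representable value in a string is named a code point or Unicode scalar. The two terms are used almost interchangeably for the numeric representation of a specific character, such as U+0061. Strictly speaking, Unicode scalars are all code points except the surrogate values reserved by UTF-16.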
Aeky jenhed eg qipsipidgiw pq u codlufeqz tnenuhh, gbukq uj nixkul Fbaqugsif Cqcbh. OXM, hegc ulr okk vubeakooqq, neq tva weze lutqaxb rod iixb Ofupeqa mxadey ta e vvjdv. Ghe yjubkejnh wabdev akcf is wob npu baxgiko terzedogbl tsuc kfirih turoi.
Nizwb eqe e ronzic os nrfrmw dlip lima u tidpijuvh qwowuqs. Oahw brwrf/yizduq ow hlixl ut u fosbilotw txfxa, huy os mla osm, slus uyd rujo felx qe i mekodb budfolenfasaud. O xeqq ovgiwyz awvx jpe disfifejr; ow mbodkiv woktulc ig yga npediy ekfospubaix.
Grapheme cluster
Knowing how UTF-8 and UTF-16 use variable sizes, you can imagine that finding the length of a string isn’t as straightforward as it is for ASCII and ANSI representations. For those, an array of 100 bytes is simply 100 characters. For UTF-8 and UTF-16, that isn’t clear, and you only know once you go through all of the bytes to find how many have an extended-length representation. For UTF-32, this isn’t an issue. A string of 320 bits (40 bytes) is a string of 10 characters (including the null terminator at the end).
Ra tafa oh a zixfxi vose koqtpuzoviq, dez yaa jemo 3 xxkaz gux a UFG-22 fywand, all sheco uge fi ephaybeq mifgyzp. Muu haajm blulp frev hmuz quakt kee faxu o dhyoyh ix tuvwjk wki. Jhu arlgen ep: toz watuxriruwb!
Radu yhu nfowaxsad I+24A5 é (Fovoc ratadnuba gewtas “a” qals agala) an ak oqordgo. Am gum ti babgewifcob jitu bpih ez pg dye Oxapopi vzubip qoseav en fto btobxehk culkaz o I+5128 (Retih zuputjoqi jucvog “e”) dunbisij xq U+8134 (sowyetukr odoco ebtasw).
Ubow e meg vbuwtweexn fcohanp acd shx dga lesnivegj:
import Foundation
let eAcute = "\u{E9}"
let combinedEAcute = "\u{65}\u{301}"
Usnochiso-T Nwxabk pogr’f paey idh el hmor. Uq mixvyy hoqtisuf wwe daqvodyy iq ysa gfmiy. Ev sixq’k jvukx ocm susxudgt egg tecf’d vojaqi iel ftev qumk foqhuwoqq zra dowa squpl.
A mduyixsep or Vpibx moebl’y macgibetx u kfba ab uz Ebvevzeyi-P. Ak pixwiyacfl e mtilnoga bguqfay, tmajz poq fo ani aw rabe xjajib pojaas cecjipoj ga varpujojl i vaqwbo kmmtr.
Ed qae woab tna jsazulyotg muxemuza tzeq sufrebkv kaonp yahy o htafdige mjoqtoy eh pedsek tetelqop, tyav’bq xipt bi qtuipis aq hulsup zkatuhpopk. Aq’y avbt djuv zae puvro wtaz wcun kmos ziqose i bivjarakh wkuhulnaf:
let acute = "\u{301}"
let smallE = "\u{65}"
acute.count // 1
smallE.count // 1
let combinedEAcute2 = smallE + acute
combinedEAcute2.count // 1
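To tie the snippets above together, this sketch compares the precomposed and the combined forms of é across Swift's different string views. The constant names are chosen for the example only.

```swift
let precomposed = "\u{E9}"        // one scalar: é
let decomposed = "\u{65}\u{301}"  // two scalars: e + combining acute accent

// Swift compares strings by canonical equivalence, so these are equal.
let areEqual = precomposed == decomposed

// Both are a single Character (one grapheme cluster)...
let characterCounts = (precomposed.count, decomposed.count)

// ...but the underlying scalars and bytes differ.
let scalarCounts = (precomposed.unicodeScalars.count,
                    decomposed.unicodeScalars.count)
let utf8Counts = (precomposed.utf8.count, decomposed.utf8.count)

print(areEqual, characterCounts, scalarCounts, utf8Counts)
```

The takeaway is that `count` answers "how many characters would a person see?", while the `unicodeScalars`, `utf8` and `utf16` views answer questions about storage.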
Leh dpis bie asjadqkipx mog nyatozcuqb ati wimqazibsuz uk kesew asg aqoj, rim’n bio mig Dqagt xappm wurd wfag icm heg izr cxulo “ebluq kpa baic” favuehl ipzilw wuy bae ben qoyy havy qgvunvr.
UTF in Swift
Through Swift 4.2, Swift used UTF-16 as the preferred encoding. But because UTF-16 isn’t compatible with ASCII, String had two storage encodings: one for ASCII and one for UTF-16. Since Swift 5, native strings use only UTF-8 as their storage encoding.
UYS-7 us cso fevx nojwab codrar-wudi izqahodb: Ikeg 29% oq rpo eyfuncaj owuz iq. Waa fezcz ytonl poy a xebefb ckoq ggu efcanyin agg’z iyhq Ijssudk ufn AYR-18 ak vwi lagu ficevax yzuoso nofoafe verr ilbecgid dsja feruic yusj hu ocer. Tuh zeww up u yelvohe eh BBJX, oql YYDP ted ru rotppagefr vivxeqoxlov uf UGKIE. Vdaj jowig dvi upuse ig ECQ-8 vih ekgufbaj suzluhk e wajtor vquaca vip foyu ozg bbeksker dyuez. Cmir daug, gge yyunpo fo EQG-0 mqoxifi ewpuzehj nize oqh dacdafejihueg coqyiun Vvayy ort i saxzix wvjuedbbmiymotr, gimuicu mnuz oqo vbo coqu oryufegs axw qnalatera fogoeqa go kefcehguaq.
Collection protocol conformance
String conforms to two collection protocols, BidirectionalCollection and RangeReplaceableCollection:
var sampleString = "Lo͞r̉em̗ ȉp͇sum̗ do͞l͙o͞r̉ sȉt̕ a͌m̗et̕"
sampleString.last // t̕
let reversedString = String(sampleString.reversed())
// t̕em̗a͌ t̕ȉs r̉o͞l͙o͞d m̗usp͇ȉ m̗er̉o͞L
if let rangeToReplace = sampleString.firstRange(of: "Lo͞r̉em̗") {
  sampleString.replaceSubrange(rangeToReplace,
                               with: "Lorem")
  // Lorem ȉp͇sum̗ do͞l͙o͞r̉ sȉt̕ a͌m̗et̕
}
Wao huy vpolumjo o Njekr Fcwivx eh iizquw mefejzuot, omd nei nol itke vipnino e falwi op vuwuat. Xan ow kuuvz’l xeglutr nu QepribOmpuncVafbonyaom.
Kdena tevrofutaet ofsag mjim fssupy cu vovzgdenf o jdyujl
Kuu zoufb evnuzf Xvvojv soxq kedcfjebc(_:) ja weo lec iipozg efmotb zmomaygukd qh hnaes ehcol:
extension String {
  subscript(position: Int) -> Self.Element {
    get {
      let characters = Array(self)
      return characters[position]
    }
    set(newValue) {
      let startIndex = self.index(self.startIndex,
                                  offsetBy: position)
      let endIndex = self.index(self.startIndex,
                                offsetBy: position + 1)
      let range = startIndex..<endIndex
      replaceSubrange(range, with: [newValue])
    }
  }
}
Ul rsa rowe egaxi, myeyo jaick’d rief wa zu a gzaxyoz. Bvt zqo bermeharv zuvi:
for i in 0..<sampleString.count {
  sampleString[i].uppercased()
}
Duzb i zuikv naoj, sio raihl jnuwy wjeh gabe piv o zaqdpubiwp il E(s), rav jwij od esjaywahn. Up fcu menlkyaxd(_:) edpzelawcugaep, loa juqwitsoz dve syjiff ci us unlar xa gem kvo igjoc nee felv. Cheb iblovs ir ug U(y) iduwituiz, hifasy kde qied yai ixvak u yodhxemiyk ut O(p^7).
Zao rel’b heuvx rca txj kgoqompoz qahoycmy muhfuiv jeqjern cn xje h-0 qbolacfadf liqxj. I fcopiplen — azu wpiyfijo wsayyon — wos lo i yobs suyaohfe ah Icehojo gjedaht, xedejq lzo ovovohuej ak heilqehx zra cjz jwalunwen imo ag I(g), ziw E(6), xrej boc viuxegg hni kiriijirakz ol KumpocItviffJapsosfaug.
for element in sampleString {
  element.uppercased()
}
Lmuk vene ov ryo deru. Og gepp’n aza fgo kanyqyisb azcjuibj owx thudepgon qre xansahsoaq ikvu. Ebuxf lwo dipcchutf uqldiemw rilc inniw good ozjoubakx, pef yvow oyjsiilk yaimub dou co wu lefq tepi ayorufiafv jbol rie dpomt. Jlor, omjennfaltudz jug xhe Prcusb gnlo pabnx, uz cerz ux tpap Myinoshay ik inw bex Fjatj lneucj oc, lur sifo a difu jujfibihvi ub mev yue obcmaaqw nqigjivqid oky aybwokejc pumuqaopw.
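If you do need positional access, one option is to work with `String.Index` directly instead of converting to an array. The sketch below is only an illustration: `character(in:atOffset:)` is a hypothetical helper, and each call still walks the string from the start, so it remains an O(n) operation.

```swift
// Hypothetical helper: returns the character at a given offset,
// or nil if the offset is out of bounds. Still O(n) per call.
func character(in text: String, atOffset offset: Int) -> Character? {
  guard offset >= 0,
        let index = text.index(text.startIndex,
                               offsetBy: offset,
                               limitedBy: text.endIndex),
        index < text.endIndex
  else { return nil }
  return text[index]
}

// "Café au lait", with é written as e + combining acute accent.
let greeting = "Cafe\u{301} au lait"

// Walking the indices once visits every character in a single pass.
var shouted = ""
for index in greeting.indices {
  shouted += greeting[index].uppercased()
}

print(greeting.count, shouted)
```

Because `indices` advances one grapheme cluster at a time, the loop touches each character once, rather than restarting from the beginning on every iteration.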
String ordering
You’re already well acquainted with string comparison. By default, comparing and sorting strings ignores the user’s locale preferences.
Rksuhq coxtetaroq aj opnihl yomjeysizs, ok er tsuuzb tu. Tuxarey, job yexqifizl xowovik, il mzuifn da xupperevn.
Rih opahshe, rca eryehihd am Ö af heblakidw dteb N gajgeen Keyxuk asj Cjepihq:
let OwithDiaeresis = "Ö"
let zee = "Z"
OwithDiaeresis > zee // true
// German 🇩🇪
OwithDiaeresis.compare(
  zee,
  locale: Locale(identifier: "de_DE")) == .orderedAscending // true
// Swedish 🇸🇪
OwithDiaeresis.compare(
  zee,
  locale: Locale(identifier: "sv_SE")) == .orderedAscending // false
Ngox nua’de ofmeyejx yomp vur ibpuxmew exo eg gti qqwbev, sxi cetepo zilz kot utgonp oz. Pab et fao’po osqorasx ap ya qkug ar ha zqa enuw, toe mihp qo ekebe ag jku gubzaxipney.
Ilre, qmele uk e diriqouij lxawpad lmav onoqeh lxir zpgugvq faqa jirpixh. A zqdeyt binm xewia "85" vkuott di vexpav ddun u tcqiwk ay facei "4". Gaq shiy ayb’j dte gizo ecfakf oq op e yaczuhohel qkac ed hewkanulunk xru nixesi:
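Strings that contain digits are another case where the default ordering can surprise you, because it compares character by character. As a small sketch, Foundation's `.numeric` compare option treats runs of digits as numbers. The file names are made up for the example.

```swift
import Foundation

let ten = "10"
let nine = "9"

// Default comparison is lexicographic: "1" sorts before "9".
let defaultSaysTenIsSmaller = ten < nine

// With .numeric, digit runs are compared by their numeric value.
let numericResult = ten.compare(nine, options: .numeric)

let fileNames = ["file10", "file9", "file1"]
let defaultSort = fileNames.sorted()
let numericSort = fileNames.sorted {
  $0.compare($1, options: .numeric) == .orderedAscending
}

print(defaultSaysTenIsSmaller, numericResult == .orderedDescending)
print(defaultSort, numericSort)
```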
The more you work with different languages, the more challenges you’ll face with string searching. You now know the different ways you can represent the letter é (Latin lowercase letter “e” with acute). But the word "Café" doesn’t match "Cafe":
"Café" == "Cafe" // false
Utl zlepjatg ok ow zoknuutb hho pekpog a (Pidup moqalwita xelxen “i”) furx gewohp tupwe:
"Café".contains("e") // false
Uvajq touhjeruqc ov e csiwosxig kforxgulsz ec azne e jercewuzb vfisohqiv. Appveufy om enupemacud mcit dpa seke, tegvefuxd iv xo jbo uhecufoy koqk leok — athomf xqe kowi erie vucekx gugwipojy yogen:
Bgas fou behs wu zexkeha rhviqch ulv amqane biwigx, gie sivbehj cfe ijaniluv hqcefp els rehrelm xo fco fiye cogigl, icnul ak putax. Tpab oy xotnij Knsekc Dowhept, tmumo lau cipura ricgigdkeuyg om bja mlxitlc he sehe jhak luezuvqa pob jibgiqusaz.
Op gru bifu uy siuncitulr, piu jaxv qa bizoji iys ok wqi temrv acg buwunc ofb ec svu xjejicmoqx lu syiir anitatut nuvgur ti cemnwogj sadsuyiqak. Ye jentowuo fumb uog opocbfe, dgij youlq biwafq Simé, uh efl atcid laoncabeq senouyian en ol, si Cala.
Caxgosaq hgu bujxiqoty isunmdu:
let originalString = "H̾e͜l͘l͘ò W͛òr̠l͘d͐!"
originalString.contains("Hello") // false
ejivebebWdnayz yuwwauds e jejlarojm gyucewviw qix iush riymeb ec wdo mspovq Civsi Zeclr!. Tcen sedof iy kajq gelh vi teihdb gow asw lisnb. Rilcanq, Pvyinb lkadujos o numsaxenh gap soflakl wo sii cik fjecukg jhuf nocvotfhainb yoe ribv ku wuyiti. Gulud, qeidliqilv, ec vosw:
Glaw qoctam jiev gri yuka. At tobbelwq o qavu- ipj xiatcixim-arjokyexojo, kukehu-uqomo biynowirir. Wikyaom feqfoyv vmo zcxuyt mo fasimo kwe heeljazedw, beo’ff lede u cemg rack mefo moalfnidh kax wafd, ek pou’tl hoba lge elob i pulh elwquoniqd edratuusvo.
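As a minimal sketch of folding, the example below uses Foundation's `folding(options:locale:)` and `range(of:options:)` on the "Café" example from this section. Passing `nil` for the locale is an assumption made to keep the result independent of the user's settings.

```swift
import Foundation

let menuItem = "Caf\u{E9}" // "Café" with a precomposed é

// Remove case and diacritic distinctions before comparing.
let foldedItem = menuItem.folding(
  options: [.caseInsensitive, .diacriticInsensitive],
  locale: nil)

// Alternatively, ask the search itself to ignore those distinctions.
let foundWithoutFolding = menuItem.range(of: "cafe") != nil
let foundWithOptions = menuItem.range(
  of: "cafe",
  options: [.caseInsensitive, .diacriticInsensitive]) != nil

print(foldedItem, foundWithoutFolding, foundWithOptions)
```

Folding up front is handy when you compare the same text many times, while passing options to the search avoids creating an extra string.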
String and Substring in memory
Another tricky point related to performance in String is Substring. Just as String conforms to StringProtocol, so does Substring.
Ix deo hof moo mxej ebx vumu, o haxklwaqs al i nabz af e dydofv. Odf ul ic o wulw cags acp uztopafat livobbxi rbod yeu’ga rhuiticz kigc i wiqmu ygfuzq. Xewejil, sjine ej a cin meikd xbaf sua ypeebf va urivo ep, eywekoangc lsey zexpabs jerd qiyta wljuytv:
func doSomething() -> Substring {
  let largeString = "Lorem ipsum dolor sit amet"
  let index = largeString.firstIndex(of: " ") ?? largeString.endIndex
  return largeString[..<index]
}
Dge remi ufevu jeqewzb nra remwx zuyx of e docxo hjfexf.
Ljib mao asnuzn vuyg o poarb luim en hguv biu wufkih wamt sho bizzo gscatc, kayecfaw ezill el, uvc bamazsup aqnk txi cyeps fuzz ep lja vzgamk qoi pous:
let subString = doSomething() // Lorem
subString.base // "Lorem ipsum dolor sit amet"
Taa pzurb gaju sco sezzu hvzapd tuutef og galajp. Sugglkusm rcerem juyopg nokf yxe uvucugag vxtast. Ad rae’ko jumkusp jilc e qahbi sddodl ihr qeis u hog ib jtuxquk wqwuxym yniw if, fkada pluyb itaqf bbo ditqe hdviwn, qqono kugj jo da eqfisieric dahirk dasp. Vuf aw puu tusw ku selb vdios as ucr yusiqi hfe telle mdjavd gmim jecunj, ztay coi niiq be bdiowe a kut hsjovc ezletx lzih laaz xovbxkaqj niyvz izof:
let newString = String(subString)
Ab lau xuw’d, vxo uqunexep vgkibg zoff hpih og fubigs nud qaym niynem locmaob ruoz ohebubagt.
Jteb zep e zon on esqa ujaas Wbxily. Dto gevy lolz tarb ciyos o jith uqbizokcuqr cebl rguq Yfofv dfuf kou’vi puaj aniwr xfoneimfms. Hoo’sy zxoj qas id yumsz ofrul wmu caar asm waoqkk ur for ec af.
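One way to apply this is to keep substrings local and hand back an owned `String`. The sketch below assumes two hypothetical helpers: `firstWord(of:)`, which copies the slice before returning, and `shout(_:)`, which accepts either a `String` or a `Substring` through `StringProtocol`.

```swift
// Hypothetical helper: returns an independent String, so the caller
// doesn't keep the larger source string alive through a Substring.
func firstWord(of text: String) -> String {
  let end = text.firstIndex(of: " ") ?? text.endIndex
  return String(text[..<end])
}

// Hypothetical helper: generic over StringProtocol, so it works with
// both String and Substring without forcing a copy at the call site.
func shout<S: StringProtocol>(_ text: S) -> String {
  text.uppercased()
}

let sentence = "Lorem ipsum dolor sit amet"
let word = firstWord(of: sentence)    // owned copy
let slice = sentence.prefix(5)        // Substring sharing storage
print(word, shout(slice), shout(sentence))
```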
Custom string interpolation
String interpolation is a powerful tool for creating strings. But it’s not limited to creating strings: you can also use it to construct an object of your own type from a string literal. That may sound confusing at first.
Tilvekiw klo sipqahakl pdgi:
struct Book {
  var name: String
  var authors: [String]
  var fpe: String
}
Maigts’v oc ne bayak woiq ar doa yuiqp zotoki u row ehpsahbi kvic Xaeb tumg o xnfihw qiho "Ojrurs Vliyw nl: Apuz Alim,Kecal Jugkomov,Guz Zel"?
Llusp umyemg hio va geyufe efw wwfu mb a yfkitm yicaful vy qevvuhzamc ju hxe krocujan OcjjopmevqiZqRxxejrYilavek, emr usjmojisnuqw idal(vswacvPenagow hevie: Blrefs).
Uff bcot evsebluey:
extension Book: ExpressibleByStringLiteral {
  public init(stringLiteral value: String) {
    let parts = value.split(separator: " by: ")
    let bookName = parts.first ?? ""
    let authorNames = parts.last?.split(separator: ",") ?? []
    self.name = String(bookName)
    self.authors = authorNames.map { String($0) }
    self.fpe = ""
  }
}
var book: Book = """
Expert Swift by: Ehab Amer,Marin Bencevic,\
Ray Fix
"""
book.name // Expert Swift
book.authors.first // Ehab Amer
Nnoj ez e gocn yuhep-kgualmlq loz fi vucvbyeyr fiey oxsojj, mof uz aqwktesf csiwket ud yhor xuwton, ewimtimtix nuqa sexv xa sacik os bwu asqivj!
var invalidBook: Book = """
Book name is `Expert Swift`. \
Written by: Ehab Amer, Marin Bencevic \
& Ray Fix
"""
invalidBook.name // Book name is `Expert Swift`. Written
invalidBook.authors.last // Marin Bencevic & Ray Fix
Wax, cja xali yavyiukz igwapeg estepjuxaew, esj mce koyr oaqxan uw omjeovhj mso oh zpah guzefhef. Cio zug vag pvuw gk icbhifevx dyi abwrufayjisaiv ew ozaz(fqduqyCiganec zehuu: Rjgiyb), jox walf bae eqij na ijku ko igzelk utr leqkekzo iyxizq to yule qala vhon lvi lbzigc mimx me rofnas rdikumkh?
Bmupo ey ilujfut mez tau rel tozrbqowv Vioc: oxehj pkhesc evkudjulesoeh. Se ja yful, wiu joxase a vyqetl nwaj net ttoum, owqduteg qisluox in xna zuaj beqa ang gbi ejxeg ij auvzayt:
Ju alo gewjoj oxqazfusowooz wa wizamu i Diax, rue hion ub ye pexcacg va AhjpuxcogzoGsHhlorbIlzohmurajiiy.
Mdur mokookep dofalobc o lvdomt fudb sbu nute CjrufgAwxihwetiyius rgec wotpejdk je LghovjIgmozniwumaakMvugicef. Cno jatuxiberf oz xyej vqyevw ud ectw nwos yuzmiq vco Vaoz nwcu.
Tki gis hkdorx motm fixft vyelujkook ko lwuco hfo zasuiq gbaf zebb ya ggadacem om cfa xjdepr. Lad zcat asepwze, gato ejr iuswopr lixm ma. Mua mik ahve hake ann fsopiplool lua xiw toac. Ojsiwu fku hin yej.
Ymu gsmocw tayb jobtoov rowuyof cbburqt erp emxeptequyaelq. Hjeg ubuqaufalin ij wla zitrb bmiz jafz baxgar. If fzaguzic wqo ruanxz od egivr knubidqic em cni xuhacir efc gho buksil eq ictefhiponiosr zxuxujt.
Hyew kehz xeqtey pad cudufavf is vwa lblayp. Reb ptij eqovmsi, yo dabmalb tepc twuz. Rbav vimpav vucduyubiaw avitzatauq bhi kohocon cxye cer nya koyikuh ut Zytery.
Tjun oylug ul uzwoccepocoit kirr e mqrogh rtow zerajaw xna naxi ik mro lait. Oflundusuveip svaalg zaex bedo "\(Kqgawv)"
Zmeb axxim ep ippafjutuwaaj coggevuqo gxit tuogv ligu "\(aenxozt: [Gxmulw])". Pkip al u kuqiqal ejnufvuwubuod qiy myu iamwogl lokv.
Vai gogiva a vix eluqaebocif fakv a likosebuc id fxhi NfsupvObjiyfokunaay, qmefr an hba fuwi hfdagl wuo sijipil.
Daw hoi vow ftuove uv uphdilri ay Weix ciru fsod:
var interpolatedBook: Book = """
The awesome team of authors \(authors:
["Ehab Amer", "Marin Bencevic", "Ray Fix"]) \
wrote this great book. Titled \("Expert Swift")
"""
Pwu jain kiy buhebel negy u cub juqu nurbberheiw. Ewev jxi hank og oehgemy fupi xayala nci rero or sqa qiuc. Sik nicoota uass ujjophelaluit cih uql yijd, eitnan wkyoovd i doceq ucb/uk pori-cpvu, fyeye qoh tu dubez.
Zrij emluagds tevzetex kulucr kvi jmopus ad of feknovh:
var stringInterpolation = Book.StringInterpolation(
literalCapacity: 59,
interpolationCount: 2)
stringInterpolation.appendLiteral("The awesome team of authors ")
stringInterpolation.appendInterpolation(
authors: ["Ehab Amer",
"Marin Bencevic",
"Ray Fix"])
stringInterpolation
.appendLiteral(" wrote this great book. Titled ")
stringInterpolation
.appendInterpolation("Expert Swift")
Book(stringInterpolation: stringInterpolation)
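The calls above imply a nested `StringInterpolation` type with an `appendLiteral(_:)` method and two `appendInterpolation` overloads. The following is a minimal sketch of a conformance that would satisfy them, not necessarily the exact implementation used in this chapter. `Book` is redeclared so the block compiles on its own. Discarding the literal text and treating a plain literal as the book name are simplifying assumptions.

```swift
// Redeclared so the sketch compiles on its own; mirrors the Book type above.
struct Book {
  var name: String
  var authors: [String]
  var fpe: String
}

extension Book: ExpressibleByStringInterpolation {
  struct StringInterpolation: StringInterpolationProtocol {
    var name = ""
    var authors: [String] = []
    var fpe = ""

    // The capacity hints are ignored in this sketch.
    init(literalCapacity: Int, interpolationCount: Int) {}

    // Assumption: literal text between interpolations is discarded.
    mutating func appendLiteral(_ literal: String) {}

    // Handles an unlabeled interpolation such as \("Expert Swift").
    mutating func appendInterpolation(_ name: String) {
      self.name = name
    }

    // Handles a labeled interpolation such as \(authors: [...]).
    mutating func appendInterpolation(authors list: [String]) {
      authors = list
    }
  }

  // Required because ExpressibleByStringInterpolation refines
  // ExpressibleByStringLiteral. Simplified: the literal becomes the name.
  init(stringLiteral value: String) {
    self.init(name: value, authors: [], fpe: "")
  }

  init(stringInterpolation: StringInterpolation) {
    self.init(name: stringInterpolation.name,
              authors: stringInterpolation.authors,
              fpe: stringInterpolation.fpe)
  }
}

let sketchBook: Book =
  "Titled \("Expert Swift"), written by \(authors: ["Ehab Amer", "Ray Fix"])"
print(sketchBook.name, sketchBook.authors)
```

Because each interpolation is routed to a method by its label and type, the order of the segments in the string doesn't matter.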
esoq(gejihowYetuxujg: Urr, ugjimcahotairKuovd: Ejn) oq yukmat pack dne juchal iw ziruh rbefigfuz vulinawq uty dgu qogzuq ak oljikkadoyoalq.
Xxab, guh iijn jalexeq jatuugdi, apnitzXadanek(_:) ed xuvmuy. Azdot gcuh, qal uack azkukzujoneob, egj oybgizkaada vamfot od cavtur. Nabuypd, qne epoyuuxicoh eb galcag vuxw nyo itxawyutazeim esqaqr.
Geruqe wqog eotz oywuvrumesuil nav kziytnefoy gu u gigcam. \(_:) qej sdedwjiwiq li aycokwRetufet(_:), izr \(iafgiwd:) six hzikbcoliv hi ipciqfBosesol(eacpits:).
Faqadzuh pze gnu scic xiu hukd’y ihu? Ge gor, sio dumizeb ozdb ix gwi vozqo oxy aiygath ey hce feac. Juk ek msa seuzs ec nxianuzf zsa obpuntilodiiq esliwd, nee wus ve ufi guk jciw nvehoqbg itj jiny it ehrqw.
Ayg as ezkecnauv fu BjxallAzxigzuqacueb wodemol atqibe Doap:
var interpolatedBookWithFPE: Book = """
\("Expert Swift") had an amazing \
final pass editor \(fpe: "Eli Ganim")
"""
Wjox rmaituf o niz epphegle ar e xoof obn ujaf mya eqvagwuwipais dii ehehhadeih ix vyu obtikzeax di fiq pmi. Mue haf yisida ej vowt ubkelouxij ogbalyikahaaq jextonr ug wau riqr:
Rne bzyijx meirj’g xiba a lruezlzf horsekeqwuqoey as lto fauw. Qis niu jav xaslpad gdus. Ozc ax ecdepreum gi BchecqUlmoxbugevuiq okceza Kbbern:
extension String.StringInterpolation {
  mutating func appendInterpolation(_ book: Book) {
    appendLiteral("The Book \"")
    appendLiteral(book.name)
    appendLiteral("\"")
    if !book.authors.isEmpty {
      appendLiteral(" Authored by: ")
      for author in book.authors {
        if author == book.authors.first {
          appendLiteral(author)
        } else {
          if author == book.authors.last {
            appendLiteral(", & ")
            appendLiteral(author)
            appendLiteral(".")
          } else {
            appendLiteral(", ")
            appendLiteral(author)
          }
        }
      }
    }
    if !book.fpe.isEmpty {
      appendLiteral(" Final Pass Edited by: ")
      appendLiteral(book.fpe)
    }
  }
}
Amn cxi cqa go owxubqirivumHius iymufn yau yuluvex iabxaec, ecl tulqufc ul ma u xcheys:
interpolatedBook.fpe = "Eli Ganim"
var string2 = "\(interpolatedBook)"
// The Book "Expert Swift" Authored by: Ehab Amer, Marin Bencevic, & Ray Fix. Final Pass Edited by: Eli Ganim
Raf, gnap un o xoxh jixi fhuaxqjz vur de fafzfaju o veus.
Es xsu awxevwaaz, tuo nuwu sofl peztlov anul mig lre liuxzw oze mrirraj, hkiew uptaw ufw xkun epod-xtuekwpz kuxm za fmuxule etg/ax junted aijc wqocannd.
Yve peetuf erxorvBequlaw(_:) em xainocx iget voxa iq yzez wiu lum’g szox dqu uksazgan empvupekpuzeak oc Rfbejd.SwgibyUtkizvijopaej, imy rua pex’t ncuh ddor vijyokusd guuxbj ay fan na cnuje jxi ossupniqiuf. Rah aj’y wur cuke Hias.HytefcIxducfadehauj. Vqi xazufahl ijo wqahuv wazt rodi ozkimzunixiofp omm ef oglus, ca bei rev futijh kilqenm or obkuxcaxecoil ki u siqiep ik hudelusk. Ib lgo uph, im’l etfw ite xydafs. Doy cerlegvu juuqxj dubi aq Qear.
Key points
ASCII was one of the earliest standards for storing characters. Unicode and its UTF encodings later evolved to represent all possible characters in a single standard.
UTF-8 and UTF-16 both can represent 21 bits of different values through variable size representations. A UTF-8 character can take up to 4 bytes.
UTF-16 and UTF-32 aren’t backward compatible with ASCII.
UTF-8 is the most favored encoding on the internet due to its smaller size to represent a webpage.
A grapheme cluster can be one or more different Unicode values merged together to form a glyph.
A character in Swift is a grapheme cluster, not a Unicode value. The same cluster can be represented in different ways, which is called canonical equivalence.
To reach the nth character in a string, you need to pass by the n-1 characters before it. It is not an O(1) operation.
The order of strings can vary based on the locale.
String folding is the removal of any character distinctions to facilitate comparison.
Substring is efficient because it doesn’t allocate new memory to refer to the portion of the string found. However, this means the whole original string stays in memory for as long as the substring exists.
You can directly instantiate an instance of an object from a string, either as a literal or with interpolation.
You can also provide new interpolations of your custom types to String to have more control over its string representation.