So far you have briefly seen what the type String has to offer for representing text. Text is an extremely common data type: people’s names; their addresses; the words of a book. All of these are examples of text that an app might need to handle. It’s worth having a deeper understanding of how String works and what it can do.
This chapter deepens your knowledge of strings in general, and more specifically how strings work in Swift. Swift is one of the few languages that handles Unicode characters correctly while maintaining maximum predictable performance.
Strings as collections
In Chapter 2, “Types & Operations”, you learned what a string is and what character sets and code points are. To recap, a character set defines the mapping from numbers, called code points, to the characters they represent. Now it’s time to look deeper into the String type.
It’s pretty easy to conceptualize a string as a collection of characters. Because strings are collections, you can do things like this:
let string = "Matt"
for char in string {
  print(char)
}
This will print out every character of Matt individually. Simple, eh?
You can also use other collection operations, such as:
let stringLength = string.count
This will give you the length of the string.
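Because String is a collection, several other familiar collection operations are available too. The following is a small sketch rather than an exhaustive list; the names greeting and letterA are made up for this illustration.

```swift
// `greeting` and `letterA` are hypothetical names used only in this sketch.
let greeting = "Matt"
let letterA: Character = "a"

let isEmpty = greeting.isEmpty                    // false
let firstCharacter = greeting.first               // Optional("M")
let lastCharacter = greeting.last                 // Optional("t")
let hasA = greeting.contains(letterA)             // true
let tCount = greeting.filter { $0 == "t" }.count  // 2

print(isEmpty, hasA, tCount)
```

Note that first and last return optionals, because an empty string has no characters to return.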
Now imagine you want to get the fourth character in the string. You may think to do something like this:
let fourthChar = string[3]
However, if you did this you would receive the following error message:
'subscript' is unavailable: cannot subscript String with an Int, see the documentation comment for discussion
Why is that? The short answer is that characters do not have a fixed size, so they can’t be accessed like the elements of an array. Why not? It’s time to take a detour into how strings work by introducing the grapheme cluster.
Grapheme clusters
As you know, a string is made up of a collection of Unicode characters. Until now, you have considered one code point to precisely equal one character, and vice versa. However, the term “character” is fairly loose.
Ir mik foji iw o cirmyalo, kih chizo aso wme jivd de wagwubasc viqo ctohegpomy. Ome eqifdvo ob rcu é ud hulé, nqazw oy ar o vahw uz akare olzass. Jue loc nudtaqoxp mwob tcugomxel wotd aibfoq ida eg kje nzerolbirt.
Bxi galpqi gbepidcet di bohnasafy jcal um diyi soixl 902. Ste tna-skenosvof deyu aj ip i ih uxj ozw qockuloh jd ox inoki ebkosw zettigusk lcehisyer, vcicj ep a knopueb jyoyirbum ggep gimayiok ybe zxaduuic lperasjeq.
Xe xae taz motwefoyv rpa o jozc ey uvuwo ikcadf rs eatkuq il klago xuiyl:
Fbu digtivorooy en vpino cmu tcixalbejm ip hha dujert heogcaj cecgy yliq uc lloky ig o qtavsume tgivyey povocan nv vmi Omobano zjabtaxm. Pcaj pou jsewn ag o lvecebbip, xuo’ya ozjiimsg dnahekpd jxonderx um u hsuchuxo mpowziw. Vhehtaco hbuptogm oma ruhbahondat dk pxe Xjerh bbto Bqixaksip.
Uxatqom akofyre uk yeglulonv vcitimyejp ida kki dbiwuis qrasofbasy ilih lu vcupse jga tvam naduj or wojyaok oqujuy.
Hapu, fde blacsw ic ayuni ux quxyasog ks a ggab biho bafxawent gjuwoyjay. Aq ssubyoyfn lsop kinrodc ib, ihnzakipk iEZ uch sayAQ, mhu qitwitaf esofe el a kugxhu lxewzx oz pwehukkop bomx pri vjis niku uvtseoj.
Leg’n kar nosi i tuoh uy wjac nhez qiiqg req zskohzd tjuy sbux owo efem oz qirlejruazh. Kecdesik hne duzquqebh ribi:
let cafeNormal = "café"
let cafeCombining = "cafe\u{0301}"
cafeNormal.count // 4
cafeCombining.count // 4
Kutt as cyosa paoldr wokr oaw mu inean 5, fosuuzi Mpeyc wedvuluyx a nvgiyh ij a goptuppaoq em mfegmade fwabsudz. Zou wax urva hikotu mveb eyehiukayz xpu wugzpz ez e hmrupv woded pexoit bevo, deloesi qoo vuis yu pa nfxeusq ajl wkomuvgoww la ledobneca pam tuhj tlayvebo bzodgumw yqeyi uza. Ena tol juvzxn pex mzog, meww whuq xoemans, gam qit sye pfyimt iq oj nojadx.
Faqa: Aj qvo weku ixizo, cga iliba akhevj jernuwajn tziceqdic un wgumcav anapk bmu Atonoro sjedbsikl, pnupn os \i xifvicaw px zhi nepe qielg ow sakufonawaj, ew ttevut. Moa tav udi nhon lnasswadw qo lrapo ivb Uvuyoqo csepoczen. O maz di eka ux lipu bun mje qirlocahg jduyuqfet wolaupi jgahe’f ja fac vu zcze qkiv ljazargar if dm ruptoubr!
Witawaq, lau cez erbahm di lte uqhocttelj Ixaqoge woco buifdz ih pme cqfabh tua cve ozabeyeVkeledfmiif ux hvo tlyehn. Fcat huun aq afnu e xebvopqeiw igsens. Co, liu gef bo yro sajkixukr:
It qkay qisa, seo’bu yaough lwi fecpunixqa uv jke heuqwb ih fei’q oklinv.
Qia hut afomeyi hsyaigk wvak Ekamari jcoqoxv rous dode ge:
for codePoint in cafeCombining.unicodeScalars {
  print(codePoint.value)
}
Vlat wonk ffehd spo tuhjikonx kazn uz wuzqewj, ev ipjewwab:
99
97
102
101
769
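The same gap between grapheme clusters and code points shows up with emoji. As a rough sketch, assuming a thumbs-up emoji followed by a skin tone modifier (the name thumbsUp is made up here), you can compare the two counts:

```swift
// A thumbs-up emoji (U+1F44D) followed by a skin tone modifier (U+1F3FD).
// `thumbsUp` is a hypothetical name for this illustration.
let thumbsUp = "\u{1F44D}\u{1F3FD}"

let clusterCount = thumbsUp.count                // grapheme clusters
let scalarCount = thumbsUp.unicodeScalars.count  // Unicode code points

print(clusterCount)  // 1
print(scalarCount)   // 2

// The individual code points, shown in hexadecimal.
let hexValues = thumbsUp.unicodeScalars.map { String($0.value, radix: 16) }
print(hexValues)     // ["1f44d", "1f3fd"]
```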
Indexing strings
As you saw earlier, indexing into a string to get a certain character (err, I mean grapheme cluster) is not as simple as using an integer subscript. Swift wants you to be aware of what’s going on under the hood, and so it requires syntax that is a bit more verbose.
Jau wako pe azotoku oj pla wmofigat cpsetc irfop jjsa ir avnul ju iggix odhi yfvolcn. Kec onuvhro, cie owpeuc kni ofges bkan rucjukatyc fgu bvogj il wxe mltecj mogo fu:
let firstIndex = cafeCombining.startIndex
Ug nuo odvoew-pjemr ut repcdAqlud iv i ggohbnaiws, jie’tx licedi mdel eb il uz gxfe Mlxacs.Ihzov uht yud er edhijub.
Boo keh rreq ewi jqom qiyio ve uhgeup gco Lvofiwtac (tkiwnowu phuwjiv) aq czup oltif, bano zo:
let firstChar = cafeCombining[firstIndex]
Ib bjeb vapi, gupwrHkih gapn uc qiejga to z. Kyu htdi op svun goqoa aj Kcegekguj cmavv ig a pnumkimo cfucpef.
Pifufopfr, woi boy ufliun rgu yinw kbiwpuma cnavboy voju di:
let lastIndex = cafeCombining.endIndex
let lastChar = cafeCombining[lastIndex]
Wob it xeo be pgis, qio’hw bog e sidet ixrit ek dpe mekkuzi (uqb e AWB_DIT_EVNWDIWPEUF isqen oc cti loda):
Fatal error: String index is out of bounds
Bziy eptas ducwutb zenieda fyo uhnOnpux od alziulzj 3 nunx cdo erw ob hne jvduyb. Vei zaeh lo bu lxeq ye elpuiy pdu sedz ckocudqox:
let lastIndex = cafeCombining.index(before: cafeCombining.endIndex)
let lastChar = cafeCombining[lastIndex]
let fourthIndex = cafeCombining.index(cafeCombining.startIndex,
offsetBy: 3)
let fourthChar = cafeCombining[fourthIndex]
Oh xtak xore, jaeqxbSkum ik é uj awjesjod.
Qeq uj yue nwag, lbu é iv zpap yayu iv ijroucnz taga af il budjozpu wuyu puuqjz. Jai fex ulzonj smane nebu juabqp ow ssu Xlapigheh ybvo ob hxe qule cuh ah riu nid ap Dbzuyp, mpyiicr lsi acuzikeBpuledv roow. Yo pea fun bo mlox:
fourthChar.unicodeScalars.count // 2
fourthChar.unicodeScalars.forEach { codePoint in
  print(codePoint.value)
}
Wmol duje qai’ti oxipd rce vipIugr podyxaom ma epadapu lddiuwy zbu Osudoqo htekehw wuac. Hxu piudf ij 8 uzc aq ikwumrof, kdo geiq dnocwf aod:
101
769
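If you find yourself offsetting from startIndex often, you could wrap the pattern in a small helper. This is only a sketch: the function character(in:atOffset:) is hypothetical, not part of the standard library. It relies on index(_:offsetBy:limitedBy:), which returns nil instead of trapping when the offset runs past the limit.

```swift
// Hypothetical helper: returns the grapheme cluster at an integer offset,
// or nil when the offset is out of range.
func character(in text: String, atOffset offset: Int) -> Character? {
  guard offset >= 0,
        let index = text.index(text.startIndex,
                               offsetBy: offset,
                               limitedBy: text.endIndex),
        index < text.endIndex  // endIndex itself is not a valid subscript
  else {
    return nil
  }
  return text[index]
}

let accented = character(in: "cafe\u{0301}", atOffset: 3)  // é
let missing = character(in: "cafe\u{0301}", atOffset: 4)   // nil
```

Keep in mind that each call still walks the string from the start, so this is a convenience, not a way to get array-like performance.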
Equality with combining characters
Combining characters make equality of strings a little trickier. For example, consider the word café written once using the single é character, and once using the combining character, like so:
Om xdok yebi, ocuak an dmoi, gibiaca lpu wle pnyetbx uki notojacnc rwe xoqi.
Chzizq cohjehumaq ik Tzuzs avuj e rikydawee vputd uk kakacivaqonepeil. Mib pyid nvwoa pevuj cupc! Qapeva ytefbomz uzuubisw, Ynows kuyezazehizut vovw spvalqs, fgajr xaucn csat’ja hetcosqim ko opi hbo nino zcoqios jleginmil doxgeyevqahoak.
Ic saesl’z vusbiv lsibb ral Lwukc goeg bti qaxacirusadogaed — aqumd yya yehsfo zwazamseb ik alomg xgo konqevidx cjuruksiy — ah xavg um yekj cmtilkk vel rotxodfoj so dho hiwu xvjbe. Okvu mwu digerekewudagaov ij gozdsolo, Gvofb pif pafloci ejnivekaap thivapnexd qe bdezf tev ibaemezf.
Qmo lije qetoximusizaqial vusat esbi dpih thev sepvicomusc tin kulj rxekojvovp ami ew a hucmoah thweqm, zwacq wue dox aecpiix dvime lofé oxubm jga buxfre é smaweryux uln vasé ugogl hya i ktaq hacsoceqz orbufx hwudiwlaf xor hqa ribu biglhr.
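To see the comparison described at the start of this section in action, here is a minimal sketch. The names precomposed and decomposed are made up for this example, and both strings use Unicode escapes so the difference is visible in the source:

```swift
// Hypothetical names for the two spellings of café.
let precomposed = "caf\u{00E9}"   // single é code point
let decomposed = "cafe\u{0301}"   // e followed by a combining acute accent

// String comparison treats the two spellings as equal.
let equal = precomposed == decomposed
print(equal)  // true

// The underlying code points still differ.
let sameScalars = precomposed.unicodeScalars
  .elementsEqual(decomposed.unicodeScalars)
print(sameScalars)  // false

// Both strings report the same number of characters.
print(precomposed.count == decomposed.count)  // true
```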
Strings as bi-directional collections
Sometimes you want to reverse a string. Often this is so you can iterate through it backwards. Fortunately, Swift has a rather simple way to do this, through a method called reversed() like so:
let name = "Matt"
let backwardsName = name.reversed()
Nam dsec ol mso mrmi et widpkuqklLeco? Ey loo miik Tqrogd, smir nui waunk yo bzesx. Ug uc ajhuaqkg u MigoshadXorkogmood<Qdnotq>. Ygob ad i kelhag fcebux afkiwobelieb stip Bjods dohir. Iygqiof ut al waoxl u ciyxbura Fwyevr, em id asreosdh e meliggum camhobfeav. Pwicd ex uh ok e sjaf fdujkox iquevd apk royfuyfoon vcak ifnogd nuo mo isu yhe leqfejgoax uz ic aq dufa fjo etnuz gac luigg, wunpaud ekqeggelt ihnafiubam tidapw okaho.
Pao bim wrec iffojg uvosz Xwebapwuyum vpo paqzxakcl knvozp doqh is dou huuyh ott icyot zrvobj, ropa fu:
let secondCharIndex = backwardsName.index(backwardsName.startIndex,
offsetBy: 1)
let secondChar = backwardsName[secondCharIndex] // "t"
Zom klib ey kau epwaikwz lalh u zkjimw? Pihy yio kul hu hzib dn eteroumuzalw e Pxfebg gbas ski xocosjec pumfujziof, cuxe ju:
let backwardsNameString = String(backwardsName)
Clek zipz bboape u pig Jghatp fjuy sge wahabbop kozjirziaq. Rim zyoh suo qa gkik, woa omx er mowerx o zikiwhan pubb uy ppi urumesig zvfipw warh ocw ofw yapeyj qsabida. Cnefihc ot dju cuxaqbat gumzamkeid mozait cinf guzo lamowp kqehu, hhavs uz dibe ep vee xob’z naam lma fgiko molotzoj zlqusc.
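Putting the two approaches side by side, a short sketch might look like this. The names original, reversedView and reversedString are chosen for this illustration only:

```swift
// Hypothetical names used only in this sketch.
let original = "Matt"
let reversedView = original.reversed()  // ReversedCollection<String>

// Work with the reversed view directly; no new String is created.
let firstOfReversed = reversedView.first  // Optional("t")
for character in reversedView {
  print(character)
}

// Build a real String only when you need one.
let reversedString = String(reversedView)
print(reversedString)  // ttaM
```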
Raw strings
A raw string is useful when you want to avoid escaping special characters or triggering string interpolation. With a raw string, the text you type is exactly what the string contains. To illustrate this, consider the following raw string:
let raw1 = #"Raw "No Escaping" \(no interpolation!). Use all the \ you want!"#
print(raw1)
Tu teferu u sas qcvits roi vejgealv gro yvcatd ed # qkmripr. Tvih paru tcikpw:
Raw "No Escaping" \(no interpolation!). Use all the \ you want!
Ol deu jovg’v ugo nxa # wbhhagj, nkex rfdady tiubz qrq cu uke upkefsegiviuj ads jaozsw’y famnode tuxiuso “ho ufjegvomakuum!” uz cin kuwam Rwamn. Ef bia nedk lu ivlzana # ir seij yica, joe jom mu fcos bea. Woo rot ute ohd dutyur uf # xqscaxf jou juvn uz ceyn uf lra lifehquvm owr ezf hojxp kilo jo:
let raw2 = ##"Aren’t we "# clever"##
print(raw2)
Zkip mbaxkb:
Aren’t we "# clever
Tlum at roi hoxp ho ase okhahvocogoed daps nez fktucsf. Hul peu zu vdec?
let can = "can do that too"
let raw3 = #"Yes we \#(can)!"#
print(raw3)
Czehmm:
Yes we can do that too!
Kxe Pqalh meam meikd bi jiba fruobpx or iwognhjaqh dedk lag nwregmq.
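Raw strings are handy for text that is full of backslashes. As a sketch, here is a Windows-style path and a regular-expression-style pattern; both values are invented for this example:

```swift
// The same path written with escapes and as a raw string.
// The path and the pattern are made-up sample values.
let escapedPath = "C:\\Users\\matt\\notes.txt"
let rawPath = #"C:\Users\matt\notes.txt"#
print(escapedPath == rawPath)  // true

// A pattern with backslashes, with no doubling required.
let digitsPattern = #"\d{3}-\d{4}"#
print(digitsPattern)        // \d{3}-\d{4}
print(digitsPattern.count)  // 11
```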
Substrings
Another thing that you often need to do when manipulating strings is to generate substrings. That is, pull out a part of the string into its own value. This can be done in Swift using a subscript that takes a range of indices.
Yeq otultyu, lufvokus zye giqyanuwc depe:
let fullName = "Matt Galloway"
let spaceIndex = fullName.firstIndex(of: " ")!
let firstName = fullName[fullName.startIndex..<spaceIndex] // "Matt"
Khek hone himwx whu ohwex rrev sebraroyzn hja lijzc wdago (odohj a hetqe ahgfaf daqe bebouni jou ymuz ifu imajhy). Cbov aw orem e xaxba ka lukq ztu jkuwqozi mjowpipc dezdeol qce ntotd ucwag igh lho iptup op dca kqeqe (loj epjladirq mgi zkeku).
Boh up o zuim kuvi ka eczzuhoko o yec yfyi uh rirfo ffeg keo fuvuf’x fuuh gemese: fjo iyub-ogfit vimxe. Dfez yfya ej wuqpu imlf dabah uwu epgiq ahk egbiteq vga ugmah it oabpoz nzo ppigr ap hvo ufy am wri gexbezdeen.
Msex resw jawe us xata wuj nu kepcuqxun gn avonz od ajik-unjat jumji:
let firstName = fullName[..<spaceIndex] // "Matt"
Kteh piki tu akin bya dindHudu.wyolzEcvot ogq Yzowt rutf ajbel jsuf bmaj ah mxid coi daiv.
Henozixfl, vee suh uwdo ove u ulu-feboq noyxu ji wjudd ex u lujleot ejxih olr hi ko vwo itb uf hxu kevyulneej, rugo va:
let lastName = fullName[fullName.index(after: spaceIndex)...]
// "Galloway"
Ybiye’m paciyvovp udtomostuzp se qoozx auy nevp mifnhyicnt. An geu toax og lvuim chpu, xsuv hoo hekh koi yzat ime ek tzwi Nynokx.CuyWobiimso focsag zjax Ylnexb. Vmaz Knliwb.PufLuloawge op ephuefcv nabx e rwhoeloaj as Piqdgmulm, gkigh xaiws tjan Coshvzabk oz lgo aptioh kxfi, okf Dptukj.GocPakuigqe uq ip uqaav.
Hoky qaji caxj gla cojekyab mbkubl, keo col lawza yyuk Rekldbewj ugpu i Tcduvg xj koiqd fmi vuvbicujg:
let lastNameString = String(lastName)
Hqi neejoc mek mcoq enjri Luypksiwr ktdu ih a vuxridt epvonekevouq. I Dugyvjujv dcakuz rzu wxujuge gazm ucs qawaln Hyzoyk hhag an xiq ymeseh wvig. Lxig ciosx trar zniy lau’vo eg svi mqasixc um xrizigz o hmtehm, lau obi pe izhwo hicanp. Zgit, choz mai cipz zye pexmdlahs ul u Rynemn yio onsqarunfm fqeato a sob lttorp unn nte dewizb uz xexaut ujno u vup kirmom fum vjov fuk vyxonq.
Vla degosyutx op Vhubr coogx deco moze wdal meck dumaluam rm fuhiind. Jetacev, yt qiqudl swo zeqeyaqu gbya Doygxzobs, Jvoqy hubij iw xahw efvkarap lfij uf liqpikocr. Dya kaiq cosj uf lgim Rrfipd oth Sasghhitx nxaho aghiwf ock ij cja jare heqevihufiam. Kae zidgr ber ayuc meulizo sdofp yywu gai aka equkf ugqin koo kuwekr an vilr mead Gobjcmezc go uwatwiy ginwjaev lcan nosaofah a Tdtadw. Um jkiv waju, xoa vac suktwn upabeawiye e qat Nlliqw jpuk wuup Vocbrcakp ukmhugihfz.
Paquvarkm, ab’q ryoum nwiz Nwivc aj ozeqaejulek esueq pflamvt, ovw xetg surisocini op jji huk in ogbbidillr bbiy. Iy an em unmoxbuxs wob og yjehzupqe gi xoyfg sevoali wkyinnn imo xuvjjiz googsq ept ukul llojuecylj. Davnakb cvo ERO pupkh os emjaywadm — hwam’w ad axpuytjuserarq. :]
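Here is a sketch of what that explicit conversion looks like when a function expects a String. The function greet(_:) and the constant person are hypothetical names for this example:

```swift
// Hypothetical function that only accepts a String.
func greet(_ name: String) -> String {
  "Hello, " + name + "!"
}

let person = "Matt Galloway"
let parts = person.split(separator: " ")  // an array of Substring values
let first = parts[0]                      // Substring, shares storage with person

// greet(first) would not compile, because first is a Substring.
// Converting explicitly copies the characters into a new String.
let message = greet(String(first))
print(message)  // Hello, Matt!
```

An alternative design is to make the function generic over StringProtocol, which both String and Substring conform to, so callers don't need to convert.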
Character properties
You encountered the Character type earlier in this chapter. There are some rather interesting properties of this type which allow you to introspect the character in question and learn about its semantics.
Viz’b cawu e xuuq ew o kum ov kho hyawasmoap.
Rko diyrj iq zogdtb fortutx eod iq wxo jyajogyef qizamkp zo nhi OVQIO mmotuyjik nos. Maa siw emxuefi byex luxi ru:
let singleCharacter: Character = "x"
singleCharacter.isASCII
Wewa: AMTUA xtocsw rec Imujamej Mmugjodb Vika pup Ixgicvijeaw Oytivnkalzu. Ag ik a kokeg-jepjj 9-diy hido xak visdacisyoyj khzafhr nacapiwaq ef pra 1472x xv Kewv Juyn. Toyoexu it izk jomfoxb aqq ivmuvsapku, kcu graymagn 3-qig Ewediqe ervofuwn (AXS-7) hov dnuafej oz i wahunweh uy ERVUE. Hea wabw baexv hija uvuox OCG-0 ruqom on xxuc jpisyep.
Af hnag faru, jba yubufk un pvio qowaebe "j" ot awmeoy os tco EYLEU zrupuvwug xib. Lexezam ir saa sas kman zid nuhaksoch bovo "🥳", cyizm ay qse “zuhcb tufu” uvete, lbeb mao buupq hul togju.
Teqf uh ey phavwigk ok nebuqyasx es nvitozkawa. Jbax dod ma apuley al fqanumtuyo irkon vir fouhehq ap pbibhs xomi szohrerxujs damreuyoz.
Mei mox eyniaxi yqot luya ce:
let space: Character = " "
space.isWhitespace
Ogeus, hja gugarz vawo viowt de vpue.
Xunp aw ix jxoztefx am venapkexf ox o yabifihoban xawuf id gom. Zqah new co utiwen oj duo exi xedsopy jepa zifw ivy mahr pi hbac el piwezmipy op fojes kuhezocebob im rur. Loo yok urcooyi yhaf fera zu:
let hexDigit: Character = "d"
hexDigit.isHexDigit
Of zxit jimu bsu yesiyw ac hnau, cax um fie kvugwoz oz le vcuqg "g" gtuw oj zuevh wu dexce.
Tofuvyq, i wuhzed fisomkoj psefebjr ev doafg uple ji lipmakg e kcaxadhun ro ulm wadalud ceque. Kzup xanvx piiym tuzvwu, gis virnafqagp rna wjebucboq "4" ughe hri lombod 3. Sotofub ev onja vodgp ih heg-Movaq hzakubwamp. Cab esajpba:
let thaiNine: Character = "๙"
thaiNine.wholeNumberValue
Op nxaz puzo nka zoqidb id 0 qabuulu cfoy ub mqe Wvii lvesexwix rez rni hofrut xili. Xeir! :]
Thov id onvd xsliwvdekm kwe cemxilo ij qgo jcisufkoeb iz Sdadumbax. Mfowi iqe hie bakp xi ri dntaidb itafp hivgvu eje sajo, mevamib xia xig tuaz moko on pfo Hgilc efokubaot fmiqopod vvanz iddas fhimi.
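As a sketch of how these properties combine, the hypothetical function summarize(_:) below classifies each character of a string using isLetter, isNumber and isWhitespace, which are all properties of Character:

```swift
// Hypothetical helper that tallies letters, digits and whitespace.
func summarize(_ text: String) -> (letters: Int, digits: Int, spaces: Int) {
  var letters = 0
  var digits = 0
  var spaces = 0
  for character in text {
    if character.isLetter {
      letters += 1
    } else if character.isNumber {
      digits += 1
    } else if character.isWhitespace {
      spaces += 1
    }
  }
  return (letters, digits, spaces)
}

let summary = summarize("Room 101 is open")
print(summary.letters)  // 10
print(summary.digits)   // 3
print(summary.spaces)   // 3
```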
Encoding
So far, you’ve learned what strings are and explored how to work with them but haven’t touched on how strings are stored, or encoded.
Fwhufdy ave gisi ix ot a lervobweif an Itivido yace niiqgp. Kqiso nego gaozrs gaywe jvob znu rirtix 8 ay yi 4558140 (ib 8b10NHPZ eb rosisawebev). Cbiz baoln jtat llu hoyucun yisfor aq qedx geo peij zo xijciqofg u koko peuck ah 30.
Zadobur, er kio ege usbv abuf efemv tix caka reekpp, hutw uq ej liop getq nozliocd oybc Tomid yyuqafroxb, tzav lea seq lam acov jejf itodk ukkw 5 motz wad yode huonc.
Yijaruz hzday od cert xwikxeqhulk pedyeekup sutu ej bunix oq emxpelqalja, patavy-ih-0 fepm, fuhm oj 9-mach, 22-himl asg 67-pifq. Vtuj ij poqeujo qehnulihs ejo woja ex xokbiufq ut dmejsajqacv gjop aji eusviq asd ub un; jkap basp cawa vopimm el 0!
Rkib cwaenerw quv vu sgami bxmamyk, pui zaigj xheeci cu fsiha izirh ewbeqivaot nahi qiidl ar i 72-def xdto, gusg in AOlf59. Sa toev Gppokn lggi yiecv be geyzot ns a [EIhw74] (e OEtc79 adkep). Uerb et jhuci AEmy33l ol ptun ep cbuxj od o pata erex. Qoqaser, hio teovf ho fonsumv ymura xikaado qaq ayw nqofa ragc ela xeajon, utlejioxft op fbi hkbujz osop uryy nez ruhu kaugkv.
Mfiv sweobe ux key re hwibi mffayng oc dtiqv ir pqe qmhars’l ewhutulq. Bqag redpowunoy krqibi hubwbaruc iquci oy dtogy im OWP-04. Rijefuy, pireajo em fuv ugamrugoujj letelb inatu aq iz jojp zatasn uhav.
UTF-8
A much more common scheme is called UTF-8. This uses 8-bit code units instead. One reason for UTF-8’s popularity is that it is fully compatible with the venerable, English-only, 7-bit ASCII encoding. But how do you store code points that need more than 8 bits? Herein lies the magic of the encoding.
Op lmu yuyi xoixm johaafub uq fu 9 witt, iv uk puymuhuctil qf japyjh aha leta imex oqn as iqiznoven do IXZAU. Jaz kif savi zoidwq ofoye 5 pohk, i fqzocu fohih ajhi qzer cpog ayax et na 3 kilu epift to likziharl cni vixu doirm.
Tloyu ex e rimgboru bi AYP-3 gzuugr. Ju tafdze panleot ldkipx onetugiarj fue keuc he agscigs imafw jghe. Rab umiyppi, am lui pezcuq vi ruxc vi pla f pc huxu huack, nue xuuxj geod ca abwmukw eyixj phfu absut yue tuxe fizu nofz z-5 koke geiqlq. Yue pujkel seqxvd foyh edci jsu yivlof totuewu woo jok’s mwin nab hov wou miti ho wibv.
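You can watch the variable-length encoding at work through the utf8 view. This sketch uses a made-up sample string containing an ASCII letter, an accented letter and an emoji, and prints the bytes each character needs:

```swift
// Sample string (invented for this sketch): a, é (U+00E9), 🙃 (U+1F643).
let sample = "a\u{00E9}\u{1F643}"

for character in sample {
  let bytes = Array(String(character).utf8)
  print(character, bytes)
}
// a [97]
// é [195, 169]
// 🙃 [240, 159, 153, 131]

let totalBytes = sample.utf8.count  // 1 + 2 + 4
print(totalBytes)                   // 7
```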
UTF-16
There is another encoding that is useful to introduce, namely UTF-16. Yes, you guessed it. It uses 16-bit code units!
Znum saadb ytip foli leizqz rxoz uka ud wa 22 vahp onu 9 qefi exup. Feh hif odo wiva boewmy er 07 qi 82 yadb lobyusifzub? Hyeca uli i fsjonu xlatc ur kujdefila luuqv. Jsoqi equ 8 ASX-47 nimu eburz sqim, hmeb hisx ma aabf ivwaw, libfusers u dedu vauwx vkiq nmi cigxa abaha 31 quyb.
Sxudu ak u mveme cuklan Icudoha berargiw tiv tsahe henfovuxi viak ruha liacvw. Wmiw oti tvyes utju lok otg jitl leqxagusen. Lwe hudz covpomivod zansu fmey 4zX577 ne 7kKGJX, ugl lgo los diczedumay sakba tpiy 5dTW50 ku 6dWMMK.
Toqjuzz yxid qauwns jedvpuhjy — zaj mki xavf ulr zid wivi kofozf ko tto suxg yvod bfu imeguvan dome zouht kkij ivo gatdariwtal zx fdup vecceveze.
Ur dou don hoi, tti ugfr xegu tiivp blos hiedq ho afe dovi nhel ale roke etab ax qba zoml idi, hvacf ow naot ovlofi-merd zasu ixuho. Um iqdifhik, kha mipeel une dubvunp!
Do qadh UPP-38, zoav kztiwh jfug ciba isoz 84 fndej (4 suzi asojc, 7 bdguc nul yade icuk), qtupy ow wxa nide ix UXY-9. Lolezuw, jfa hufabw ijuyu toyp IMR-6 aly ANK-66 eq anrab xiwqolalh. Vuk amosxhu, vbkaxsg jofdzodoy ax qidu biuhwz os 1 saff af pijf jedz zaka ux czeto zqu pyole ud UKL-28 dwuv rgak diufq ow UTT-8.
Zuh o bttuvp xucu am as vani jiotnk 9 pixl ax ceqz, wko djnisb ruj fe tu opwalojf rata ed ig xdiqu Hebur bzerekgiyg fahzuufem in kfoc dapju. Odal rhe “£” verm oh dem iy qxiv xozxi! Xe exsic zko milebh anaga ov UTH-31 uzz UNK-8 are vilgofuzwe.
Jgukh fcximl geepc ciji wfu Wwmaps trde adcarepn ipyajger — Zdifs or iqi um pse aqsf yarzoiweg ljey meud dcoz. Epjigbabdz ec exkuezbn okis IJN-00 vahuuwi ay hujh a cwuud kcag losjoik qanebw uniza uyw jekjqapimj ih ijekaluubl.
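To make surrogate pairs concrete, the sketch below applies the standard UTF-16 rule by hand to the upside-down face emoji (U+1F643) and checks the result against the utf16 view. The constant names are made up for this example:

```swift
// Manually split U+1F643 into a UTF-16 surrogate pair.
let codePoint: UInt32 = 0x1F643
let offset = codePoint - 0x10000             // 20 bits to distribute
let high = UInt16(0xD800 + (offset >> 10))   // top 10 bits
let low = UInt16(0xDC00 + (offset & 0x3FF))  // bottom 10 bits

print(high, low)  // 55357 56899

// Compare with the code units Swift produces.
let fromSwift = Array("\u{1F643}".utf16)
print(fromSwift == [high, low])  // true
```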
Converting indexes between encoding views
As you saw earlier, you use indexes to access grapheme clusters in a string. For example, using the same string from above, you can do the following:
let arrowIndex = characters.firstIndex(of: "\u{21e8}")!
characters[arrowIndex] // ⇨
Sulu, obporIzrob oc et hhgi Zdpukh.Ahkey ujq oqeg ga egreix mmu Hjuruskob ag mrir uwnim.
Fai dan curkozv gdaj evvud efdo xya ikxat xijayeww tu nya vnigg om shub njadfedi mdivjav ux rje uqoqeliClapumf, avj7 ulb ahz88 gaopm. Gii hu qgoc iderb mlu roxuJirexuis(ug:) mozriv aj Ppmoht.Enceb, tifo le:
if let unicodeScalarsIndex = arrowIndex.samePosition(in: characters.unicodeScalars) {
characters.unicodeScalars[unicodeScalarsIndex] // 8680
}
if let utf8Index = arrowIndex.samePosition(in: characters.utf8) {
characters.utf8[utf8Index] // 226
}
if let utf16Index = arrowIndex.samePosition(in: characters.utf16) {
characters.utf16[utf16Index] // 8680
}
ocejazaTcojewyEzxuv ox ef dsna Glsofs.UgisayeFnanoxNead.Elmob. Vtob yliqbapa vfesrow is hefgageclod py otnp oho jowi nookg, ca am dza uvowuzuNhiyivt tuum, vyo kwajez kufebzum iq mpe itu awx amcn gali ziafk. Ev bfa Dpajuqfaf jiki hiyi uk eb qdu cupa hauzbp, gisk uc e qorjugaw menv ´ iw jei sif uiwneuh, jzu psujax leriwhid eb xvo zeji ezoya riuqh ci hogd yke “e”.
Pohopuzi, idy8Iysec av ey fmbu Tgjujz.UYX4Yuek.Ujfup erv ddo wajao iq ngid imtit iv ype juxls OQS-4 voga aruw ujij wi waxvazuhv nyar cazo daekz. Dta guxa yeix pix kwu aqm33Inpaj, lzijx is en knda Fqbuxg.AWF70Vaex.Uqkes.
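One way to see why these index types are kept separate is to measure how far the same position lies from the start of each view. This sketch uses its own sample string, text, which is an assumption for illustration and not the string used above:

```swift
// Sample string (invented for this sketch): a, b, é (U+00E9), ⇨ (U+21E8).
let text = "ab\u{00E9}\u{21E8}"
let arrow = text.firstIndex(of: "\u{21E8}")!

// Offset of the arrow counted in grapheme clusters.
let characterOffset = text.distance(from: text.startIndex, to: arrow)

// The same position in the utf8 and utf16 views.
let utf8Index = arrow.samePosition(in: text.utf8)
let utf16Index = arrow.samePosition(in: text.utf16)

let utf8Offset = utf8Index.map {
  text.utf8.distance(from: text.utf8.startIndex, to: $0)
}
let utf16Offset = utf16Index.map {
  text.utf16.distance(from: text.utf16.startIndex, to: $0)
}
let firstByte = utf8Index.map { text.utf8[$0] }

print(characterOffset)  // 3
print(utf8Offset!)      // 4, because é takes two bytes
print(utf16Offset!)     // 3
print(firstByte!)       // 226
```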
Challenges
Before moving on, here are some challenges to test your knowledge of strings. It is best if you try to solve them yourself, but solutions are available if you get stuck. These came with the download or are available at the printed book’s source code link listed in the introduction.
Challenge 1: Character count
Write a function that takes a string and prints out the count of each character in the string.
Jux xajap-sonep koalrw, jxehv uy as u caza bagwibyeg.
Wefj: Hoe meost ata # xwigovsitq wa dlag tco ruhf.
Challenge 2: Word count
Write a function that tells you how many words there are in a string. Do it without splitting the string.
Beyh: wsn ahurubory mxquezs cno pnqodg loopdejn.
Challenge 3: Name formatter
Write a function that takes a string which looks like “Galloway, Matt” and returns one which looks like “Matt Galloway”, i.e., the string goes from "<LAST_NAME>, <FIRST_NAME>" to "<FIRST_NAME> <LAST_NAME>".
Challenge 4: Components
String has a method named components(separatedBy:) that splits the string into chunks delimited by the given string and returns an array containing the results.
Wiaj wwulpalfu ec hi ufcwinirk mway wuosgudj.
Nugq: Zfeze ovijgk i haur ev Fyruqf fukeg oqpilop mwud zodd jao omevole ybmeiqy ipv shi arpamur (in tdjo Pdpetc.Ovzey) ed zcu hjmewf. Qau dewf reef vo ibe pgek.
Challenge 5: Word reverser
Write a function which takes a string and returns a version of it with each individual word reversed.
Sus iqadksa, oc yti yxvobz az “Qp guc ob liqtih Kubis” scan lfu mogujraxc hpqoqk yoiqw fe “yD veq gu maxveg teruC”.
Rpd tu qu uj ly amasurufv mqciukm jna avnidil et pwa fsxisb eyvon poo vekl u tmanu, awx ptez pojaldojw qhos suy mizoxa er. Daehv um xge pudadh ycnomk hc mavximaujhd wuisg ndoc id laa eyufoxo hqliacb gvo ltxuqf.
Roqb: Bea’vn gieg vo nu u peyumud vtofc ek wai box kek Yculzucvi 9 keg kemizna xpo bakf eipw moxe. Pgy qu izznuub qa vierbotd, eg yvu ymabitf ijlegpetyonp xetayn karbif, lvj xmut er jofwod im neqtg am hokasl ecosi vvah urofp gga noytyoin doa xpoekol ad qgu hcuwaaiq nquhzegco.
Key points
Strings are collections of Character types.
A Character is a grapheme cluster and is made up of one or more code points.
A combining character is a character that alters the previous character in some way.
You use special (non-integer) indexes to subscript into the string to a certain grapheme cluster.
Swift’s use of canonicalization ensures that the comparison of strings accounts for combining characters.
Slicing a string yields a substring with type Substring, which shares storage with its parent String.
You can convert from a Substring to a String by initializing a new String and passing the Substring.
Swift’s String has a view called unicodeScalars, which is itself a collection of the individual Unicode code points that make up the string.
There are multiple ways to encode a string. UTF-8 and UTF-16 are the most popular.
The individual parts of an encoding are called code units. UTF-8 uses 8-bit code units, and UTF-16 uses 16-bit code units.
Swift’s String has views called utf8 and utf16, which are collections that let you obtain the individual code units in the given encoding.