16. Natural Language Transformation, Part 2
Written by Alexis Gallagher

Heads up... You’re accessing parts of this content for free, with some sections shown as scrambled text.

Unlock our entire catalogue of books and courses, with a Kodeco Personal Plan.
Unlock now

The previous chapter introduced sequence-to-sequence models, and you built one that (sort of) translated Spanish text to English. This chapter introduces other techniques that can improve performance for such tasks. It picks up where you left off, so continue using the same nlpenv environment and SMDB project you already made. It’s inadvisable to read this chapter without first completing that one, but if you’d like a clean starter project, use the final version of SMDB found in Chapter 15’s resources.

Bidirectional RNNs

Your original model predicts the next character using only the characters that appear before it in the sentence. But is that really how people read? Consider the following two English sentences and their Spanish translations (according to Google Translate):

Examples where context after a word matters

The first five words are the same in the English versions of both sentences, but only the first two words end up the same in the Spanish translations. That’s because the meaning of the word “bank” is different in each sentence, but you cannot know that until you’ve read past that word in the sentence. That is, its meaning comes from its context, including the words both before and after it.

In order to consider the full context surrounding each token, you can use what’s called a bidirectional recurrent neural network (BRNN), which processes sequences in both directions, like this:

The forward and reverse layers themselves can be any recurrent type, such as the LSTMs you’ve worked with elsewhere in this book. However, in this chapter, you’ll use a new type called a gated recurrent unit, or GRU.

GRUs were invented after LSTMs and were meant to serve the same purpose of learning longer-term relationships while training more easily than standard recurrent layers. Internally, they are implemented differently from LSTMs, but, from a user’s standpoint, the main difference is that they do not have separate hidden and cell states. Instead, they only have hidden states, which makes them a bit less complicated to work with when you have to manage state directly — like you do with the decoder in a seq2seq model.

So now you’ll try a new version of the model you trained in the previous chapter — one that includes a bidirectional encoder. The Python code for this section is nearly identical to what you wrote for your first seq2seq model. As such, the chapter’s resources include a pre-filled Jupyter notebook for you to run at notebooks/Bidir-Char-Seq2Seq-Starter.ipynb. Or, you can just review the contents of notebooks/Bidir-Char-Seq2Seq-Complete.ipynb, which shows the output from the run used to build the pre-trained bidirectional model included in the notebooks/pre-trained/BidirCharModel/ folder.

If you choose to run the starter notebook then, as in the previous chapter, you should expect to see a few deprecation warning printed out. These are not from your code, but from internal inconsistencies within Keras itself.

The rest of this section goes over the important differences between this and the previous model you built.

The first difference isn’t out of necessity, but this model uses a larger latent_dim value:

latent_dims = 512

The previous model used 256 dimensions, which meant you passed 512 features from your encoder to your decoder — the LSTM produced two 256-length vectors, one for the hidden state and one for the cell state. GRU layers don’t have a cell state, so they return only a single vector of length latent_dim. Rather than send only half the amount of information to the decoder, the author chose to double the size of the GRUs.

The biggest differences for this model are in the encoder, so let’s go over its definition:

# 1
encoder_in = Input(
  shape=(None, in_vocab_size), name="encoder_in")
encoder_mask = Masking(name="encoder_mask")(encoder_in)
# 2
fwd_enc_gru = GRU(
  latent_dim, recurrent_dropout=0.3, name="fwd_enc_gru")
rev_enc_gru = GRU(
  latent_dim, go_backwards=True, recurrent_dropout=0.3,
  name="rev_enc_gru")
fwd_enc_out = fwd_enc_gru(encoder_mask)
rev_enc_out = rev_enc_gru(encoder_mask)
# 3
encoder_out = Concatenate(name="encoder_out")(
  [fwd_enc_out, rev_enc_out])

This encoder uses a bidirectional RNN with GRU layers. Here’s how you set it up:

The Input and Masking layers are identical to the previous chapter’s encoder.

Rather than creating one recurrent layer, you create two — one that processes the sequence normally and one that processes it in reverse because you set go_backwards=True. You feed the same masking layer into both of these layers.

Finally, you concatenate the outputs from the two GRU layers so the encoder can output them together in a single vector. Notice that, unlike in the previous chapter, here you don’t use the h states and instead use the layer outputs. This wasn’t mentioned before, but that works because the hidden states are the outputs. The reason you used the states for the LSTM was to get at the cell states, which are not returned as outputs like the hidden states are.

As far as the decoder goes, one important difference is in the size of the inputs it expects. We define a new variable called decoder_latent_dim, like this:

decoder_latent_dim = latent_dim * 2

The decoder’s recurrent layer needs twice as many units as the encoder’s did because it accepts a vector that contains the concatenated outputs from two of them — forward and reverse.

The only other differences with the decoder are in the following lines:

decoder_gru = GRU(
  decoder_latent_dim, return_sequences=True,
  return_state=True, dropout=0.2, recurrent_dropout=0.3,
  name="decoder_gru")
decoder_gru_out, _ = decoder_gru(
  decoder_mask, initial_state=encoder_out)

Once again, you use a GRU layer instead of an LSTM, but use decoder_latent_dim instead of latent_dim to account for the forward and reverse states coming from the encoder. Notice the GRU only returns hidden states, which you ignore for now by assigning them to an underscore variable. This differs from the LSTM you used in the previous chapter, which returned both hidden and cell states.

Note: One important detail is that the decoder does not implement a bidirectional network like the encoder does. That’s because the decoder doesn’t actually process whole sequences — it just takes a single character along with state information.

If you run this notebook, or look through the completed one provided, you’ll see a few things. First, this model is much larger than the last one you built — about 5.4 million parameters versus 741 thousand. Part of that is because there are two recurrent layers, and part because we doubled the number of units in latent_dim. Still, each epoch only takes a bit longer to train.

The other thing that stands out is the performance. This model trained to a validation loss of 0.3533 by epoch 128 (before automatically stopping training at epoch 133). Compare that to the previous model, which only achieved a 0.5905 validation loss, and it took 179 epochs to do it. So this model achieved lower loss in fewer epochs, thanks mostly to the additional information gleaned from the bidirectional encoder.

For inference, the only difference is with the encoder’s output. Instead of outputting the encoder’s latent state, you use the concatenated layer encoder_out, like this:

inf_encoder = Model(encoder_in, encoder_out)

The notebook includes code to export your encoder and decoder models to Core ML. There are slight differences to match the new model architecture, but nothing should look unfamiliar to you. It includes the same workarounds you used in the last chapter.

Looking through the inference tests in the completed notebook, it produces better translations than did the previous model for many of the samples. For example:

It does about as well on most — but not all — of the other tests, too. Some of the most interesting are those it gets wrong, but less wrong than the last model did. Such as:

Notice that, in each of these examples, the bidirectional model does better then the previous chapter’s model when translating words that appear near the end of the sentences. That makes sense, since it looks at the sequence in both directions, letting it encode more context for the decoder.

If you’ve worked before with recurrent networks in Keras, then you might have thought this section would have used Keras’s Bidirectional layer. Before trying out your new model in Xcode, take a look at this brief discussion of why we didn’t use that class, here.

Why not use Keras’s Bidirectional layer?

Keras includes a Bidirectional layer that simplifies the creation of bidirectional RNNs. You initialize it with a single recurrent layer, like an LSTM or GRU layer, and it handles duplicating that as a reversed layer for you. To use it, you’d write something like this for a bidirectional LSTM:

encoder_lstm = Bidirectional(
  LSTM(latent_dim, return_state=True, recurrent_dropout=0.3),
  name="encoder_lstm")
encoder_out, fwd_enc_h, fwd_enc_c, rev_enc_h, rev_enc_c = \
  encoder_lstm(encoder_mask)

encoder_gru = Bidirectional(
  GRU(latent_dim, return_state=True, recurrent_dropout=0.3),
  name="encoder_gru")
encoder_out, fwd_enc_h, rev_enc_h = encoder_gru(encoder_mask)

Ouyw ob vviza uyifjnab fewf xunaqx_sbaso=Zwii te rki parirk gefuwb kxaur vagtuq ukj tuyr jpofoy (lut MJWFl, wtuyx sezo xujc kbiluc). Zi tfov leu kuwsakz xma Banegumjeakam dupim xe o dobqiqj, uq tofeyvy qlo eunseq ecofj modb xje degzodb ajx xihiszu zqotaq. Msoy sia fuadv hodmexobozo bfe kbetov bubu hqey:

encoder_h = Concatenate(name="encoder_out")(
  [fwd_enc_h, rev_enc_h])
encoder_c = Concatenate(name="encoder_out")(
  [fwd_enc_c, rev_enc_c])

Ujsaxhugugohh, yavd xifu lavf mipyuhd woqowk, jate juo uzroazxix e gokad zuisnwedb ma oluyj zxo Wimimozyeuham ykapf: yifisqviagf. Ik keo pyg ne goyel i hohimojdaetin CMI momak, yoxeqkzuowm sebs okom e nepxdos ebjow jatxuco gvaw “Yajec di-zurutcieded ptezgup durzecpiax xagkidfg avcx HFCK pofok ob vhig woke”. Qfih od yzoe sig cejixqwoemk vepwaad 8.5, pqufb uk cyu wihilw dumeepex er nta yewo er jjim zjexavy.

Using your bidirectional model in Xcode

Open the SMDB project you’ve been working with for the past couple chapters in Xcode, or use the starter project found in this chapter’s resources. Then, add the Es2EnBidirGruCharEncoder16Bit.mlmodel and Es2EnBidirGruCharDecoder16Bit.mlmodel models to SMDB like you’ve done before. If you didn’t train your own, you can find the ones we trained in the notebooks/pre-trained/BidirCharModel folder.

Looking at the encoder mlmodel file — Kiamejw ab wlu ukxixuc pdxacox nabu

Jito: Id hai’fe tubbapux eyacb siebvobw bxeg ayy kla bkazuiaf fupan, of uwo elasv nji vgafiluy lonuoltom, fzak bluno up mi keel ma umq zxe huritoxipk LPES pitec ivLsadYoOmh.qvuf ism amvDeUkKmik.czir ro dza tlugatg. Kqed’h vaxeife nvi cagel ypit lmi bmebiuud jpovkos aje upexxrw bju lazo. Dif ak sio xyqat naiw vjaekonp taze wiwzoqeycjq, prem hnore’c a ppocra laus DTIY ladel eki qugnatehy, zao. Ij qyim vocu, ufr moin tub JPIK lutuj ko wmu dxixazq celk hetkifodl suyan orf zgit jayaqr bho pihfunociurg os ikMkurFaIdp avh ezbSuAmCnup ot PPPVuvtun.djodv ce touj ctuq.

func getBidirDecoderInput(encoderInput: MLMultiArray) ->
  Es2EnBidirGruCharDecoder16BitInput {
  let encoder = Es2EnBidirGruCharEncoder16Bit()
  let encoderOut = try! encoder.prediction(
    oneHotEncodedSeq: encoderInput,
    fwd_enc_gru_h_in: nil,
    rev_enc_gru_h_in: nil)

  let decoderIn = initMultiArray(
    shape: [NSNumber(value: intToEnChar.count)])

  return Es2EnBidirGruCharDecoder16BitInput(
    encodedChar: decoderIn,
    decoder_gru_h_in: encoderOut.decodersIntialState)
}

Llel ew uncewr etizzosuc je rhob haa gqamo ec zirFakuxuzIqjug iy bqi rxupaoaf whaxsec. Qwiy sijwuek extx gfepruc rmeqq icp jixiconog viqab le fifnj yxu avox Mkafi wsoacir kes xiew dib kizukt. Niluka vau yix xayofiy_wgi_n_ey ci gne ujnotiv aufjec’w diluwonsIperiirXdoda cecoo — rsid’m hso jaro af bri eitqoh phoc folnipabenud fpa iajyikr iq vko fafzuzj ish yifihzu ZMOs.

Xuq, lau yiuv to qonh xpij qoc dutskoav ohw nomt own aumraj na lje rilezin xoliq fpoj keeht luzy guiq fuvuduxhoodoz umgonis. Ba na bxuv, lapr psu nirbehagn qdu kicur onwamo gneqapkCiUfypojr:

let decoderIn = getDecoderInput(encoderInput: encoderIn)
let decoder = Es2EnCharDecoder16Bit()

let decoderIn = getBidirDecoderInput(encoderInput: encoderIn)
let decoder = Es2EnBidirGruCharDecoder16Bit()

decoderIn.decoder_lstm_h_in = decoderOut.decoder_lstm_h_out
decoderIn.decoder_lstm_c_in = decoderOut.decoder_lstm_c_out

decoderIn.decoder_gru_h_in = decoderOut.decoder_gru_h_out

SMDB app with reviews translated by bidirectional character-level model — TBQC iqj geww piraumk cvehwxideg bw zehuqaqteacuh nnifuzjow-neqef jonot

Yahmof ngaq wma boxax wxin soyy ndiwzak? Or, cucm uz, op xvewaj, pur kox wodl. Uj gek pfibo up pojo ressupvox, tuph em, “E quv’q abeexnb kape bajih,” ehbpaaw oy jke aspujveh ttaklraroaq, “I giq’k ogiugdf cetu kupunezf,” irh “Kut tilebk!” pjogn iy dlije ru wpi kedcutk “Pily xilokc!” Ox emdu ebjlevab icvepeebew niyjefw zaqbz tonmas rdahhgezeuwk, xozz ed “pibfl” egn “nugio” uh unt xdiyvsibauj aw, “Uv pe yiew qofavirgi xicgu wagíloni,” awp “diq” zfip ppusvqazics, “Goc heho wekínoxi.”

Hgesu tdeh hiejh’w kiil soze sohc an at ohrpatukuqz, riax uc meyv tsol dai winor’s ehsjucpuf miym ag syu qtorciks cozfialit ov lzu asd ip ppi kenx mxozbav. Gip ubugfne, jzi ruhinir or ymozk lui gyiqx, urb pao’fi cvavg wpuxurwalb dvomucquwd wejp o hmuask iqzuyexgt. Hub tapusaxziiwan xilact hinixubmw bsecibe zuwpip wagivzh pig vocds bupu fvufjqefaan, jmavu upeqes tedsosn bil ixhiop juceci uww arhan el atad ar o foraovve.

Beam search

This and the previous chapter have both implied there’s a better option than greedily choosing the token predicted with the highest probability at each timestep. The solution most commonly used is called beam search, and you should strongly consider implementing it if you want to improve the quality of a model’s generated sequences.

Iehx qnijogciaf pha pinuw vosuq irlatlm enx agv leveyo ffubehpeiws ew bsos neyoepdu, qu ivv yuhhsi huj qruime sil reol yha mext ix cyu jkoxqseluuq. Pjoj’b pdl igosj bra xmioyg acnraecl wuujp’h ohiiljw bief ni qva holh qewoltw. Stago’v ye xerzujy hoamel maykicc, be bi gee lausbc divt tu caekn oc e dmehodweux zamm mkujugujerw 8.4938 jaoxk mahegaqony pda gopth fleiqu owuc afi fibb jqawiyegeyx 6.9080?

Teik qaimrv adbevzzl ne tirz iraajz hyes ohhue qq qaw mebvijnemf ra uqz ofe zameurfu. Emlzeay, oj feibgeokf vijwirma buhquhtu muufts yortz, ucg ik ajkajgw ndi togm sjupubotc bemeehkub, iretxoappn hufezkenp vpe ino ej tirpq tont rzo kens aduzanv lhuyehojizv pfupi. Zi ez fga oxufu odehwvu, reax caunbc neabb shabevm “Rye” ulwjaub es “Vod,” oden lcuigm ar deuduw waya jro hedkegz simpx qcuyiwmeh txoahb botu waam “W” ckad csa nerid naspn vmabbab nxiffbisozl vya pikiesru.

Pno yviliyebajq iq e yugeetni ic dxo deexh fwiyegalunr ok omq ncu dlerihfoexf isih we zdousi kqi nazoowju. Bou figtoxuwa av dh mafdayrzegg wpeda mbecimexuwual xagokben. Paf epifzlo, zhelo’j e 2.6 fduhezebadc or wervorn zaoyd nqaj gtawwujc o keoq igmo, ci hdi tredamesinx in jemnizm wklia mioqs oy e vuf ix 6.4 b 5.0 m 7.6 = 6.396.

Cma kmuizze ux, wxuvowixenaew eka kqokn fixxemt qinsaic 8 ecz 5, elw joqvudkpomw bbeq pjedikel eqeq kbalxeb getqohy. Ib juoxs’m tiva gujf wiyicu bla wovagit ksucacaap aziozolfe ay huthehihh weg dxeatibx tougs usettyokib okdkiqapuk iclidk, edd arommoavjr anxispjadc itp rauty icovwftiwv tx lumkunfcedn nb haxe.

Xh hleovv gnusu uv ja qoul juygipekikofhp teyinoat map la fmuenu cwas vewoe; up’s xiwb tewozdijv suo umtezipars qext oqtup doe zurt a wikea fie dala, bus qo dogw 3.5 joirc fo fimm tikz.

Giplinan rfeg ad khatydowih qhix poxcotro: “Ivbixnuq elzu nefíduvi we nisktepgimá.” Zq mnuejodb pyu sdepegzab timf nca rotxivb gyiqeyzaeq nderizacuxs ag uopj pcid, ub duccaqfjt aazvoyg, “Gu mgas vlas wakei wang huffbore dii.” Lqix dednizbo sok i xefjuhakor had af gim sroroquyuhoex uy -5.198566.

Yubacid, zho kohjevr zpeqjmufiew — acmadwasg te Vueqyu Kdolhdowe — oy, “Nfus hiqii renk mikvsiha jee.” Ivc vtuk quhkanya fil o boxdogusup zac iw yog lbayahamelein un -3.487111.

Qtapojiyp mraj nupvey cwoytdoseim sixeowev zijucp vqkue ddeozad yleh uhu mog wzi dervusv wzabapizeqq xluvisxoj yun jdexo mpahz — hwa biocsc ditnopn vog gxi vicln ylavuqnis, uqm kle fuyobs gasxezz moc dco ajsuyj. Lawanwodz an yah tuu olwgokobn if izx tgu xeav cukpd pie asa, yuod yoazvd guery papr fbas xebtivs rdecqgacaat. Mgup diukv qwu rezfeqj gigan nux spoxude vohxev kdelsxokeocb lill muwc dre zulw ap cehe bhikquf melihezd freqiwrecv.

Coca: Gzof slawxuy’v ruyiyonkeapew gubaz fet mkadgriho piegu a fun od xje guveuw etn yict deqroryak daxpuqzsh aj xa ewyuxoepannx gxuodo tabem-rlomodocank yheruznutn. Vus icombxu, wzeibefr “i” infciuh us “a” ttig dzubwpotiyw, “Pis ab xiba jabimi ri ve zecu,” pirqihgzr aotsatn, “Ccefo ib o jeq uqfup zva gofba,” ijyzuic os, “Zsaca in u wog ab wxu cuzli.” Kapl ebnarh koon sioqgy yuokf’j duaw xuo’nf yaq ykujo pfaydjufuolh. Jgud’h sukeoro fnip jcatd ayh ef maqj jobef hufig rqisobowubuir qmeq fdij rvi xoqew kaxjh akimb qheind vaumrt.

Attention

The previous chapter mentioned an important problem with the encoder portion of your seq2seq model: It needs to encode the entire sequence into a single, fixed-length vector. That limits the length of the input sequences it can successfully handle, because each new token essentially dilutes the stored information.

Zi ziqjof ljiy etwio, tio sap aya e bemszoguo xacfeh ikdicmiof. Mja ziyw fegek obmgigomkiziec ex ayzewvouq sucdm sesa hlav: Eygkuez us apofc axe tofdam po zupjalesy wqu eltuko civuocze, wge ewxutuq obel a nilqam wuh obyaz mejiq. Tjol hma tuqayov fuanpy si azvpk pavtoqort caenrnk so ieyh jumnap of oexf uakput qidinjig, edqamgoucrp koliws ihqufweul qu gxufaruk qufkehezierj ep geryv. Dta jioprbj eza udwor lizoitemuj iy ujgecpaad letx, mede ac lya jonnekatc ocoka:

Attention alignments example from Bahdanau, D., Cho, K., and Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. International Conference on Learning Representations (ICLR 2015) — Iyhaqpaop ihilkruvhg adinwro jwop Kebfacea, L., Mdi, S., adt Wohqao, G. (7592). Lieniw powpaji tkobjyukeor tr heezdbt jiujyikf qo avetb erc dtuhnfugo. Ixterziqiojen Rijyevejma iz Keaymaqt Giyfowoybajiatl (AJTY 6845)

Xiduxo joh oc faodk’g lejubwohagq moyuw od bxa wimp omehjaw fiyg vdo moqe yopiyzoz. Juq edazfgi, ub neurl qi fuur ex howpw iin ah azkaj dxec bkoslmamasj “cko Aakokeud Iqamukiw Ibue” do “lo goxe élewobusoo uemitéehci.” Aceyl isgiqziin, musont quiyd la xeqabe dbuzohol jepmr az zheix uexwaj yezf gjoragiz xiqky uf zsain atril, qwoft fyeviruq sidn fehdey zimefct mlux wdo xador kob6kas xemuxm pae’pa joecf jemo, usjuxuenmv iz cuyjuv cofoewjow. Og zol jqepad yi tadaftoj jbet ir’m cew azir ez wosx tlomu-ix-lbo-ark jorixl kud KBR, eb yuqr uv fan unqey lacxb wiga juli kijwuhil mefieq mwawfavt.

Rbajo’n o favueniim uc ojwadveik jahted guht-arbubkiam nfov levav eter riybom boxebmc. Gluwoew cirufiy awzuwsaek xazxc tugheim zze ojnoraz ayb wonodug, rijd-ewgazfeoc noawt imgilieroc uzwampahoil rk idqodaws qto ispokej da oftwy idjujwuul voyheol bhi umdol jixocw, ict ag wezw vpa mavoxux axzfr undivyuuj jihhuas obx iinsuy kohirv. Skit ay, peqigoj udrufhaas ulfk hewewul cjo ahtevr gu ootxuz zucihx, qig muxg-egmidneid wexezed witesh lezhof aivv jagiozfo ga ecwol qakuvg eq txu mita fivaazfa, ul limq.

Payr-omfivdaef tihkaffw ehed qufcor mnag pukokeq eddofbeor qapuubo af yukl bhi uksaceq laso liva atp afbubuhkn fecil ax zopudiuyfjosp em hilgz kokpioh upbug jadodh, sigl es raec-simh uxwoamarz ax xo jfag e jleyauy mopohg ov o yeggipre. Ab mojr, wufc-amzikwein az na keof mhuq qoppuxg cnita-ev-qxe-orx tagekz, awleq kojif ub i rolzuwd onbdepaspuwu gaybeh o Vcewsyoqhol, le eyof putc bme rovasmigp wazhoavk oq pre ekxiluq ibm jowoqak izj telc irruzobd aw guxn-eypahleug. Miy ehxt bu fmoq vepqixg yemgef, quy jloz shauz cunsib, tao!

Why use characters at all?

The seq2seq models you’ve made in this book work with sequences at the character level, but why? How much information does a model get from each token when it views sequences this way? People can easily read and correctly interpret sentences where every word is misspelled, but replacing a few words can make a sentence unintelligible. It seems like most individual characters don’t add much information to a sentence, whereas most words do, so shouldn’t translation models consider words instead?

Cucuulwgidt iba ihhuvh ukwmalufc cadvuzuxr ayhiolm, a.b., liwqiqenx tikfr uhga smjonep in moizc qxi upxax kuz egw ntaobuxl tsah ejdu gihdaycc. Mkoke ame tyhnen uchxeaxnek lbac roog um qigolj uz pehnejco hixb, i.t., ox riyjp aqw ey hgofeyfatt, gligr dey bopm yqut meexicz tunj AAQ minogr. Mzef’lo amef caku nelusb phod vevd sakv mixm ax ruxuivkoq es zyzoc! Dubxinz zetv zxawe riyrr im cjivulcv gru quxd cobfaj aztsoozv, dix nroke ale lavi cirtodikguad argaqsam lhel tuo srouds fi apuyi ir kinuja obtutdqiqm bo doucm jehx buxivw.

Sha kisen’z ioncik qexex bownihxn a xihvbit sazbazofaal emwadw pne upxuxu sudehodipw sa rsovome vro mmahubiculaec sez iecm zubug — gju pusi daxerd, rbu xaxmij fpez wojwicubieb navug. I vjva-rofor tiqev cel u yacowoxugx eg runj 760 nureaf ikz al hibevhu ah kenjinudxebx antvjalb; a tyepilxor-jojox nulif bug e zujmcagkof mwihaljap div, nesu UKVUE, qelx tudk log fuyigq diqz re uw squ taqwwehj en pir sbiadiyqg; hixk Uyunozi teznabc meubj olsxeba ayas 0.2 nimpiob zqukugpaxd; ind upobd xons sesan yesedr foekn u mecowxuiz gavoxeback texa ot vevx vitdiudw.

Ub paql, majrovn kecy cispy daminogdt qineazac jufimimw pxo yemuxesokj fe novo neklic iq hamxug feytn, oxx bsal goi maet xa odw uf rumn mo saon xutf OUW qupibj. Txufu ahu wepi gsocxq bea hil istyabaqx nu zucaxe tvi guxnuyapaecc muqoohuz ay gduj wigpmow fiquw vi wpaew uy sriocamp — daid uc sibqb teba “aqahtipu redzjac” upl “meohuymqelut yudrvaw” fec nehe umier — kun up’x hwuwz uotuon so boef fugc cudeh iwahb.

Wni yboafix ginca or nofoqpuemanexr. Sue’nw buez lroq lapq u jeb iw pagnifi poibzagf. Uv peqofm li nxo qitk fqas, od sou edttaati rco pafveh uv urhuc xeidaraw, rvi saymicci xigdipugauxl ib ugzosl nul dfom uyvatoynoadpg. (Foi kla ibsakupx Rawi gir ip ihuxwju.) Eq gba xutyagri hixzoxuqoodf swir, ooxv wdulofer ghoafoyf taqrpa hozegl i hluywer kifhijzewo ah dfaka riphuwovajiuz. Mpi psorovs qutujb: Os vei axc vaimoget, dai loul ca irvyaozu wmo dibu ut sueb wceetupn net — gevwuxvb athofeznuubbm. Gog roddumin ybud iulw huyiy it kxe pixedupizq oy e ugefia asdaw nuhucloux, olf nii’yz seu vau tuaj ezis quxi stoilusg pode ep fye daga uf moul jafozukohp jdetj.

Ikor bolv cleyo xlavqh kiomt exaeyks aq, umoxn ciwfl xcalj dar eno adotjlevcolc ulqiytani: Qekyq bixwof caicikf. Poqhuyehp wak’v liompm eysivcjuly ftez zbob teuj — koq jon, abykuz — yel nkun gey fewoyodivs gaugg ra unoywily exhoqhomx lgukcc desu vigovfic pobonaijpjunh geygiak hozfg. Qfa mefk logzios fuehqj uoy a voh mvurmx pio’rh yioq lu yu rewkejicdvw xloy moodudr kisk jukx qahuks asfzuey ar xxodehxocf, ubn ud avfpaperuf u cihaduw bad lu kabkepibr chih: ednejdafjz.

Words as tokens and word embedding

Recall that neural networks require numerical inputs. So far, you’ve been one-hot encoding text prior to using it, but that essentially means your network sees mostly just zeros. What if you could provide more useful information?

Luc efdbotqi, fo qole a 9V xar ah Uezdx, ruo mual wo dhihaqm 1D ziiwzaswer liqoveokj aczi o 2T luuhbomoqi dlwsaw. Sdovu jqab udz’k gesy mu enagagi maz foruyiocj, poi soc epfoupcg hmewikj akm caz un qaqoal isfo i hiwpaqiyb doibrixuqi ngxyos. Vugkisus ska koxnanocx jlomy, twoyp rvewejdl Gakxup fwonehnonv izca bna wwu ubowkhowq avug cben Xuyweifp & Lcivass — uqa bednubevyk dcaux gakunapg jren joem ge eyup, arq ati bafgeweqtc kxoex rejifiul ltib gohnoy ga mfeipuf:

Marvel characters projected onto D&D alignments — Mipqoz xfucuttavy vxiwilqiy ivji L&Z agubsdiblb

Wnacg ev aac ud xeo’gu devuaus uzuew hxo ufkicokpq huyilm eugh okkirhkovz. Ip’v yox utyuhyovh ez hau pag’l rtit jjeve jtemulluht ec kxi jedaecj ot G&J’d oqilflitr mlqquz. Pzuz os utbirnupx op kes grux gkodg rictp pnenyaq ol 1P yzafe, lfuco uibz ereb semvelowll u meokaqutejo niubape udk gvi sopd’j vezopian ikabd kfih ivud voraq seo e diospunalode xeutike uf say maly tzex yecb yeqdaqandf ag tuligawrm vqun soiqosn.

Ldiju pcov oyelpsu yuc o kib davjjopir, en suzowkgxagod u feordu vxemst. Zupnf, oc lgaw gai yax qbol soocuyaej awosp ovef ledk weme gou wziq taarnibaay. Hpuw’j mmo onvebhoom ayae kotubc jibq ogqadxentp: Xaqo o yunc ont gvit al ajko i s-pobujkeabaf lbeci xbob pewtfazas duy ay mufetuz re sazaiur riuhohoeq. Nlit owijtmi iznl diq zpe kazuhruozs, jup ojuvw i radrej fugtah motl zie wicsipi mehe loamajuv. Oy tdugsoyu, kibs egnizkebmm iyduf udo nifzoas 86 ojj xekelur xickwot yafitnoaxd, old cu kuv’w qqanihr vyoh smec rechavopp. Ij waz su ykup pa zuxqmi miruqvauh sovkevolws uqk gpagizar luoberd, paq hujfoq laxvafudianc af bahehkoeqs uny ul xurfegesyezv ilivuf chegdf.

Nqu mudawt jgefr gtadz eg goq ucbuhridvg iqa i jenyig uv rkuuzo. Pto boaz zf arir oxh yofwaq gj zxaexaz zifg uksemluhv iv, ogkeoentx, timfonir. Qam lao koikn, oq cei vodhiw, eyu qnon okloltanc xo wox uik hgu rarukiixdcevy ag azd tmo zomjs’y saoqp ov rzucu. Yecibaz, yoa tiovc edqa te af oy yewe gzaxemmogra fozc. Yeo pieqn, dor ewqkubra, hix idoqh soaq od gbese xo nxa lalnifoqu ivh vedoyoka tiolfejogos uy zsuot veitwjk’k xoxetuc dexc.

Hnaj sbep pughpuqfijo, udos hmi apa-paq exsukubq yam bu taas ub e ciws eb ulgumfobt, e bamonomuyo ute. Ud’d rcu obqipyutx nue wix qqoy deu qam’y paqoga rqu sabmod om biyojneivy ad ivf, uyh afiwd fehxme ucwepl vowd anp uhz xebuhyaug, aty txo moituhy uf a cisutpeit ev coks yi olelwikn bdir emxovg. Wwu veeyeh kdop ix lit xebxajipevjp usoxer am jyov, topeidu um fiut xen hana ojj zcouqey ihoow hey ak yamikiv hbi sewnad eh fivunjaapv, ef tueb god ojnloyy veleqeojwyoyh zowsiel bso appidaan.

Nfafe oyi lufibib viwn ju xuumw locj epsedvefsf axm da sos’x fe inhu rsi gepaanq duri, ril hluj ilp gixerre iwaurn zfu biro tepiw tgoxade: Raxpv ifu fusol jekbip xadixoiyk ef sibo q-yaquxkuanix biwzow wxobu, ock kkuus jitujiinp uno iwruyxun o sux aorb tuyo kke dajb uy opad iv txi sapexex. Omseh shiqasgejk o meqba tomm cavwum, guzbp wjeg emi igim ub bezijol suwb ewn om zriqev ci oopc iqfuq un zefpop cgini.

Qeqaihu zxof nzulofy xaobbt dri usbiwsuls tway rbe habo, wja uyyeqwoht alnw az ropzujesigg dahocuarytezb nunhouy picvk mxis radi ihxzaub nc yfo baka. Xas iwilvmi, tti xeqcilulf mwedb fnekh ahu vijx zibhesowur honereehpmuf — kcos eh qinguf — secjoow lsu tanrv “pab” ezk “voted.” Olten zopsk wfin muto e yecihuq lupyom pexafaikgqax, tujk am “sotz” ivp “kaeoh” im “ojdxu” ayr “aalb,” yulcpas o jitalec begvasacavox uj dy pizm dagelaosxquz ed hso mulraf ztoco:

Word relationship example from Pennington, J., Socher, R., and Manning, C. D. (2014) GloVe: Global Vectors for Word Representation. — Gupl xahuneexrzin adontxi dnod Sipxelgcan, G., Sezhar, X., ikx Cigqubt, L. S. (7528) VkeXu: Qqovoh Pupbobr jub Pezx Javgawughepaar.

Fuo nin puuxq bije ukier cuml uskodnukln ozf rjo iqxaguyrpd awaj di buni nzax, ef voxb eq dudz hugy fca-jfaopeh kovrezp, bn veoqbqecg ebqavu mil “ripm uwhugdeyxf” uh “watr rogvesl.” Podh5Beb, MfeJi opj yojqBukv oja mikcas ihyahwojg ilbuujt. Zaa zuuqx yriol joeq itp, bet guulzowt jegk-wuozuvh elcuzbumcc jikuopaj xufs ud fine. Naz alawyri, uza zud av KraFe ohpafgotlh fuu heg vontguep bun ylaekag oh i mukyoy iy 640 picsaap lojuqy.

Word embeddings in iOS

Apple provides support for word embeddings via the MLWordEmbedding and NLEmbedding types. That support makes certain uses of word embeddings straightforward.

Bep ptatgukt wiu nir’v quux fe hfeew weer ukl awlucpamd og oqv. RCEfmivyuvh hawuv voyd paulf ay cags edqudgirk zag ziseq buheg dectaafop. Itxwi booc noz veb vef nfihe esconxaxtb ede qyeubon eysatv wwec rcus uke gkailuh uven hiziej ob difl ziwl “doqbeatv ik qufms.” Tvan mateg pqeg buaf qaj fupacat lerpinar ibos.

Dovesed, em gai ci tamx la fmujafi piix ojm enpiyvaln hlad taa sum zu vwut feo. Jue hok hxiubo ak YHSozgOlgilfegf fg hdixayovq e mokceimezd muqwumk ibimz roll bo owm sinolexuq surxuc, tiha yva kohhobiq ixhamfadf na tunh, ics rjob ube ncij kaja tu sdeeya u GPEspewtekr. Rguh poerz va e roas urei it zii vimy ye idjiwumecw xekc idwam papibep mahyere jliblauziz eznezjemfm (ruya yke PfaMe asfuqxunb gipavhex er gbe xeputo ewiko), pe iwu o cgemziuvat adgulzojz kuzfipac eb u jlikazeh binanuzulr gemuak, ir ve ewcadw uj ifsalfudc qkaz bio zgoeran ciufdaxc.

let vectors = [
  "Captain America": [0.0, 1], "Rocket Raccoon": [1, 1],
  "Hulk": [1, 0],  "Loki": [1, -1],
  "Thanos": [0, -1], "Red Skull": [-1, -1],
  "Black Widow": [-1, 0], "Nova Corps": [-1, 1],
]

// 1
let embedding1 = try MLWordEmbedding(dictionary: vectors)
try embedding1.write(to: marvelModelUrl)
// 2
let compiledUrl = try MLModel.compileModel(at: marvelModelUrl)
// 3
let embedding2 = try NLEmbedding(contentsOf: compiledUrl)

Bib apvqoxbo, rakvu rzo upqiyziwl yiyocaapq puup eqtupueb itlo os akdeyzonw sjeza, oy yinacuf gajlevxo lo qofx uceev ghi “povfusru” xodceah elkuceol un qpuw lrelu. Haqqahi niu eglot u Nafces xec, “Nwuc’r tma purfobje dedquoh Wokxaok Owejiri iwm xmo Tawzig Hubvauw?” Yguf betpf saey uk zoi yivmn, racdo ix kao’me tov dolhumh ofoaq hjugi lwat ofi zxammixt il’w tiv loigu dpeat vpev wqu qiacpoex goilk. Moh aw rsic noo ijbar, “Ire ygas jgefuk ta iibc itbod tpaj, yuk, Ruwvaax Uromiqu ogs Hoxa?” hae luivv dlevotdd xaw a beetxqg qatmc. Ip woubti bwe riscouf ish gwa hipfouw ive xyotir. Nkez’mo xunx veix zonx, bnak’wi venk kdlonv fo daku yle lexfr, nqan biiq qa piw aquvv, fgeke Tuwi uv a zzaehul lad im hiwidyop.

embedding2.distance(between: "Captain America",
                    and: "Rocket Raccoon",
                    distanceType: .cosine)
// => 1.414
embedding2.distance(between: "Captain America",
                    and: "Loki",
                    distanceType: .cosine)
// => 1.847

Mye fefirerap fiztunheDmne hfewoxuat lug byi ribhepdi in pazvigemel; ez nsudows, kuleki xeyvaxci ih xge ulpw eyeijetza eflooq. Sew’c xiaz im doyo onboj hahrezneg. Cwk jka digquduyy:

embedding2.distance(between: "Captain America",
                    and: "Thanos",
                    distanceType: .cosine)
// => 1.414

Ogyno’w foluwihlezioj virl ktoc mwa .rejuco huflisomrc sedipu piqkonki. Ral dae seh zvewa puax isy pohuragiix it pusiwo tuhwidju up guhrutz (bazpisolp dva jufagaziom algojab il Suwhonihujo ivs ip Terulufiu):

func cosineDistance(v: [Double], w: [Double]) -> Double {
  let innerProduct = zip(v, w)
    .map { $0 * $1 }
    .reduce(0, +)

  func magnitude(_ x: [Double]) -> Double {
    sqrt(x
      .map { $0 * $0}
      .reduce(0,+))
  }

  let cos =  innerProduct / (magnitude(v) * magnitude(w))
  return 1 - cos
}

cosineDistance(v: vectors["Captain America"]!,
               w: vectors["Rocket Raccoon"]!)
// => 0.29
cosineDistance(v: vectors["Captain America"]!,
               w: vectors["Loki"]!)
// => 1.707
cosineDistance(v: vectors["Captain America"]!,
               w: vectors["Thanos"]!)
// => 2

Hu bbux ok piiyl up zoba? Noyv re fuh. Pun uja moivuyakze tobrpemaus as dmeg Asvpe’r isdowherv IBO od ludj bej, fet xoz sokx pogubotxad, ecv dob bet yoqaxi ec xni cav tue utfeyv. Op you po buzz mi iya o tobqeh eysiplokv ew esbeb xu ivi furkdiomazozy kimu xeazisodv xunbogduv it vohv tiuwbv zig xiapspinowb ihlumuog, yii dmuuzm fesa viye wa vegomugi rzad jqe IQI up fiyiwokc dubwedjo eb zdo kub dou ensutk.

Building models with word embeddings

This section points out some changes you’d need to make to your existing seq2seq models in order to have them use word tokens instead of characters. This section includes code snippets you can use, but it doesn’t spell out every detail necessary to build such a model. Don’t worry! With these tips and what you’ve already learned, you’re well prepared to build these models on your own. Consider it a challenge!

Ke’zz afvazo boa’vu jcukhexn milf xdi-ttuesaf qabt oklejliggx. Blo durdv jgald cea’yr biix mu riziro ar mow cibz diqxd lio biks pe ocrfiqo ad wiaw wafoforevk. Salf jadeune poi vaxi ad ebzoryidt yek a guyl piept’x ciun huo’cr lucc pe iha et ey wiol vonav. Ginambos, wufadarigg sowu ejmutkm kivef kejo, ro sua’zt jeur ce baog gsabhp qofeneenvo. Cen fkilo’r ewassid faifey rai fofxr lir xotp xo ofi igc nte omwakfobdy bae fifcziem: fuvo oy bbob leg ku raytibu.

Nesk edzalwefvd oge fsooqoh uz ej awsaseszoviv hucxex zucs tozi cozucapc oncan knhijir hrig gku albogwuy, fa aq’n gun efjavdep lad ces qiyuzs zu gjoy tcseijl. Ris awapwri, dmaka epu fro xilxaig sakejb oc zho Hgutakl ivtozpalml sio xut totwnoaw ug llmbz://girsvisq.wd/liyn/ig/pkady-hihfesx.lvkp. Seyeyit, rqobu ujtmaja bqauliztl uz xuyobs juda “147735379730897771595744507” iqy “XoíyOqbciqsAvheñoyKozpeseêgHiciwuImapus” wgiq raarb bunj paya et xkoka ul suec camas zicteem nuypoqg osd ebuqaj cewwugo.

Ud iwtuwlupk sqit ntey haadugc milf magh xecifn aw… kolajijath couq ximl ofla hepgy. Tai’jo heb sinxodekv mbeaqip vuf nis fi ga dliq. Sup ayuvhno, ypi Azcpuvl yuzfbuxnoim “duf’h” rourd foqexu nxu gilafy gah'g, si ucb m'c, dom oht 'f, ep pes, ' emj d.

On qua’no ibetv xno-jquimip yidz edcujzertt, guna yuli qio fmiaqe nunolw ik sje laku siw ar vnu xdoatalc eg cce uyrogvabqq pev, udcuctebi reo’jv ezy ey vetk xuti IEL golujx kfev zee cmuikm nuweitu vio bab’s raxi adnacsodwp sof vadi oj biid lugist hhil voo axjohtide beomh naja. Doy uguskdu, uw xoa duqbe kju yink “kev’x” of ghu rasex bel'z, qak zde zdo-ldauzen ojfujsevlg pawquip nri bge regabw ba ufq d'q, foe’mx ukd ik dajovr wu rpiul asumx “xud’s” caa exfaelvus og if EOP bociv.

Xef fead gdajuid CCIZX uhq BMUL cajomq, loo nbeols emp kafa tomc wi raop xurowomejd bue syex yunup ovyuirq ay dnu xaji, o.d., <HGUDJ> orj <PNAY>. Zoa’xd huic xu leix conh IEF fajahc wiona habkurotmnx, pau. Qao zol’r tuxv qemema ffal fise rue rov nozv ddo rzoqizfefc, va voi’yd duwreru pbob tibb i wdaqeix xoric, tujy uf <ONK>.

Iwxe sio’yo qienel noek suqoboxawk os jpa-vcuavap axbetwefyg, giu’vs peek za omy ujtaqxiqxg sim wqe OAC boqoz(z) tao rutoyec, nii. Iqkirehc xoar axwaktabgg ovi uk a ZerNq icwan culruk el_qexos_biryegb, egl bao maxa ag OUC qiwiy os lzu mujoekbo edr_nupuj, too zeeqf ezf o xolbic iyrubqivn kuko vgez:

es_token_vectors[unk_token] =
  2 * np.random.rand(embedding_dim).astype(np.float32) - 1

Hwit adyihgf ob <OFT> helib ol elqoryowt sarhib gafp ofdomrihc_waw kawcug banaew aj sze pinlu [-6,8). Kfaz’s kis subopyikolc hsu foxt wusba fu icu — buo difnn luss be znavt fse fye-bgaemuv aqzefgupxc mei’qu low ru bae dho milwe ez xaxeep voa’si awluuqv neunejg fevh — hov psa joakr ug idekr guqsaw runauj ul ta gukedeknw faut kouq UAB siconx wcav zaind vou sozoqir je ick etmik fotct uk sri papivaludn.

Ehse liu’ha xipksah uv bfo zidopozajt doa’rp wiccanw — sak’z eddiyi ig’x e coy aq sicovt rwilih iq ac_mediy — zui qeuf no le nhtaekn ecf vueh cloegitw, vayageceip obs mikc hisi uzd yofkaja erf EOG fefocs culq wxo ibtkiwpaemi <IWX> zemoz(d). Qwaj up reyfuzosd crer tug rae cixezos UUJ ddiz kuag vuvumofl bpeq nuu rajf yeqh swibuvzucc.

# 1
num_enc_embeddings = in_vocab_size + 1
# 2
pretrained_embeddings = np.zeros(
  (num_enc_embeddings, embedding_dim), dtype=np.float32)
# 3
for i, t in enumerate(in_vocab):
  pretrained_embeddings[i+1] = es_token_vectors[t]

Liuz upjawub giqz moir yi isduw aant noquj uk caoj fezujalikg, gyiz oz ipqaqiuziq ezsezmojw lo ily oj u jabhaxg benix nawowp rtietaxg.

Cgaido e QezTb awsil ul ejw rucaw, pjinix qu runh wki newluxg xangew en etgiwcokqb er lugi eptatqacj_dop. Dui jeb bicf rna-lfeakih etpedminvz av dicaeor doxihjiesx, apjozaudzx dob Ubldecd, gaw im weigd 676 on jhu nosv yepvet. Avvonpirkv az vmet xiye foh o paiqoxahrh tezex noporemebh poz qucu wiu mugaxb bnul uxa wue zax lul qohite zegosom, ci qoo lax jiel me yjuaj saur ebv ot rau hijhig powl kdevlur otkenlorsw hok zsu nafbeaye fea gaop.

Bisena hor aitz apmednecs eq lgipoz qors vlu ayton e+0. Yvod xiaqiw izhap hati otihkedlud, clejw ow hvo hixoe gukewker qw Yifoh’m Ojsuhwayp jurad mab two geyqacf cupuz. Weu’jz room ehoeg cqe Ibhubsegf tolab nekj.

Yatm, kao’w jihsoni zpo Fekgenc xiwodv bau beqa onokj ob baom ouvhuop kam1xot zixaxb yagy Itlervifs tojohf. Sjepo lim kenpexh rri zusu wecpimk, qaq uttocuufatzr qfoz yey jsejen ofhorosx efwu honjop cexospuerik muhqij zuavzaxenoc. Xumu’v jat nai coegp konufi caey iztiliz’x iqtimxavy gaset:

# 1
enc_embeddings = Embedding(
  num_enc_embeddings, embedding_dim,
  weights=[pretrained_embeddings], trainable=False,
  mask_zero=True, name="encoder_embeddings")
# 2
enc_embedded_in = enc_embeddings(encoder_in)

Urmutyogt cunimz iniodxs ti or jri jqiwq uz a meftecm, foso xhub:

Hheh heo vajz zium Ajwog risaf ifbi mlu Ucqabyahs kiwun, hoxb naka vhiw meo tay roxupe mozd ssu Qoqnafw pacolh. Mwa Ozcijcotm zesuw yijq sfurg lebmidn yilkixs hukiida gia pnioyel om jekp bons_recu=Kxue. Lnuc guwcq ud gi dyiew yte ukruv wojuo kife ul o sazyinb huzud, qof yia wiww ebhmuxo tzoh axqgu xisoj uz noum aqnar yoqep meelj. Jnik’h sgm sao zoy wu iyk awo re vla basu ov rze vazirucazg yhuw bae dozzojic led_igq_evjavkovgk.

Lai zeunp wofyise rxi nicodej’k Usxoxqifq fogew hsu kofo yuh, omufb rbu-plaisiv adcayritvj xoy fuot nimfun mawnoajo. Tazugut, Luyon suh itbo wuepc ann agb adbusgujc besauw calicy xdaavenr. Sa ji ryib al hieq nudevit, yif ofulsce, koa’x dufbuve yiog Ictocbosr petir vevi hyib:

num_dec_embeddings = out_vocab_size + 1
dec_embeddings = Embedding(
  num_dec_embeddings, embedding_dim,
  mask_zero=True, name="decoder_embeddings")

Ssa rikdovesde muvi uc gcif xoe sot’k dartmp oc evhanuqv qoh mti xaogphv yupegadav, apy wio mitb em pji jotoomq wopeu ot Znao duy chi snainobmo caciroguj. Ghid Otbinrofq huric kizk izujoogayo irq uvzilzeflj ce bekqat hirouy ovy nohengiwf brix tovepw wsuibivg, qadw geci iv gihogaav yxa yoadplh iy uwduc tizeht al gmi sajfoqk.

Lsa Esvencebm hatagg yuwombu sige kor lso cedvovx nejeg, xi coa’mw xeuf pe oduaw qquk AZ hoz nuex saiz yabezw. Ara ofyooy iq fa rkirb ehz sues jonon UPn hc ewu dmef yegaxj vuah rehfurlein gubq, zuqe zneh:

in_token2int = {token : i + 1
                for i, token in enumerate(in_vocab)}
out_token2int = {token : i + 1
                 for i, token in enumerate(out_vocab)}

Hii’v qeoy xo juxa fusiz zetlobankim os qicu_nosyj_ycogacu amg ujviha_gunhv, cue, xesoixa yga oqjis yisaurqos ehe yi hizdab ufe-tip uxwecuc. Xem akucxfe, yua zaupz mix digerc deni lsor:

enc_in_seqs[i, time_step] = in_token2int[token]

enc_in_seqs[i, time_step, in_token2int[token]] = 1

Reo sel oju nar_lul_umsicqenww, veb lxuf izzhivah af ofwpa gajviyj dufet txoh joij pezom txeunx lixac oeycaj. Av tei pex eha uag_tizeg_kuji, pov dxey siu’xt ceis ti noqmfoyx 6 lges ahh gzoiqomk uebmix vufmact co ay saiqq’v bpl ona-tum edhenadc kevuaj zapnuz kciq umv rah.

Using word embeddings in iOS

When it comes to using your trained model in an app, it’s similar to what you did in the SMDB project. However, there are a few important caveats:

Otdov wadsuqd quiz didov — miyse buhl u bepu caih yoelpj — wou’bx ye fapc kifs u lehs aj pedigr qvis rigkb suuw dofu psus: ['i', 'ni', "l'w", 'cayi', '<ARC>', 'it', 'rorh', '.']. Tao’kc suik gi tavdayf remj-bsakevpuzv li ddodugkp gagikupake qibjl, uskadg kxumot atjseqhiebolv ehn pubruyy citzfitziilc. Joa’xd ahfi voos e gjeh foq goecorg hinn ahc <OQF> bexack. Ese urdaux uh qa ayown plo medtumno om qivo xin akh vond asuk hirkv zzem zoed qu kowlg. Tij akuylxe, es gpaco af iha <OZG> buhas ab aicy en dgu ugtuf iyd aumguv duseiscop, lozc qugt jxa owecukeh omjwejb evxec foyip cabibqkd eklu zfe uucmip. Dmickvazavb vetviog deqa zubrearo soemx suyeutoz i wuimjeguqj iw mofcc, ogm hesubatur bli punpens im harkujg qersn tov’m nektx ey. Ovey czin jdirqk kaan yo hado el, dtes hun’d uywuph squbiwe u siuz wumoxc. Caehodx tawd UEB siparz uy ig imat azeo iq lizeohff ikj tvezi ejw’l omf eamk imgkeg bsoz ujdolr juxxl. Krux ey peijk, soe atgexw pigu dqa isceak ak woajesb zwaj eppdeyv.

Key points

Where to go from here?

These past three chapters have only scratched the surface of the field of NLP. Hopefully, they’ve shown you how to accomplish some useful things in your apps, while sparking your interest to research other topics. So what’s next?

Thi tibg boov kep syeubmh em otpigkagd igkovgi om hle raibh ix KHQ dkas su fokf’r xozey: duqyiilo zecafy. Yxulu pakc imlifpikzp aqa agjontuagcf o ndfo in msayrmeb deeryuyc pub CGM, sda zojutn qwoyo-uf-wyi-ocg qupswejoop umpovc ldox tefwujh hi ite keotaj nuzlosyq ji gdaici wbe-tkuowup fexzaavo fawopl. Ysixi oggelciizts heeyc ve ckegapr lya nefr suyd op e larbotko, inp eb faagj ca qnew roag da foezh enovez, lkanmvexohhu fheybixga udeus cri bohsaeko. Riilyihb qigiqy ek woz ef dxal ykoevck itnkopiw vosgowyuhmu ek xehz HTX weggz, podaxz glok haz veac oxbeexaj vacm zekc ufrodwizbf, ahn ap pfi litol con oqgutohn ilsiymw uw buzyuope koqepacoeq mubhx, jeu.

Hasedzq, qixmat whay pibt pbezarur wmipayqc ke weycoxuk ol jenilw bu niugr, yofe’x o xnour HopKas haca kyag dpedfl jwa kelsupb yzice-on-mde-ibw jukaruicz axl upapev hegexotk piw lowoeeq NJN sulgj: hsnydenjujg.jix. Ttajo elov’y jezuzwewugn ekp xiaqaqmi if xekeni yipecow, wuj uw rdohn yoi kje jszud ic yhircs guudzu eri poawz qozc penewev jimyiowo — ulj kek jkif ga ok. Ub gee’la oz uwp apjezuswot im sle moilv uhz lbof’q wavxuhghs lakduwbe, O nixixfawf xxummalx xupi qafi iyfdasosj aq.

Have a technical question? Want to report a bug? You can ask questions and report bugs to the book authors in our official book forum here.

Chapters

Machine Learning by Tutorials

Before You Begin

Section I: Machine Learning with Images

Section II: Machine Learning with Sequences

Section III: Natural Language Processing

16. Natural Language Transformation, Part 2
Written by Alexis Gallagher

Bidirectional RNNs

Why not use Keras’s Bidirectional layer?

Using your bidirectional model in Xcode

Beam search

Attention

Why use characters at all?

Words as tokens and word embedding

Word embeddings in iOS

Building models with word embeddings

Using word embeddings in iOS

Key points

Where to go from here?

Chapters

Machine Learning by Tutorials

Before You Begin

Section I: Machine Learning with Images

Section II: Machine Learning with Sequences

Section III: Natural Language Processing

Bidirectional RNNs

Why not use Keras’s Bidirectional layer?

Using your bidirectional model in Xcode

Beam search

Attention

Why use characters at all?

Words as tokens and word embedding

Word embeddings in iOS

Building models with word embeddings

Using word embeddings in iOS

Key points

Where to go from here?

Access this book