8. Advanced Convolutional Neural Networks
Written by Matthijs Hollemans


SqueezeNet

What you did in the previous chapter is very similar to what Create ML and Turi Create do when they train models, except the convnet they use is a little more advanced. Turi Create actually gives you a choice between different convnets:

SqueezeNet v1.1

ResNet50

VisionFeaturePrint_Scene

In this section, you’ll take a quick look at the architecture of SqueezeNet and how it is different from the simple convnet you made. ResNet50 is a model that is used a lot in deep learning, but, at over 25 million parameters, it’s on the big side for use on mobile devices and so we’ll pay it no further attention.

We’d love to show you the architecture for VisionFeaturePrint_Scene, but, alas, this model is built into iOS itself and so we don’t know what it actually looks like.

This is SqueezeNet, zoomed out:

SqueezeNet uses the now-familiar Conv2D and MaxPooling2D layers, as well as the ReLU activation. However, it also has a branching structure that looks like this:

This combination of several different layers is called a fire module, because no one reads your research papers unless you come up with a cool name for your inventions. SqueezeNet is simply a whole bunch of these fire modules stacked together.

In SqueezeNet, most of the convolution layers do not use 3×3 windows but windows consisting of a single pixel, also called 1×1 convolution. Such convolution filters only look at a single pixel at a time and not at any of that pixel’s neighbors. The math is just a regular dot product across the channels for that pixel.

Convolutions with a 1×1 kernel size are very common in modern convnets. They’re often used to increase or to decrease the number of channels in a tensor. That’s exactly why SqueezeNet uses them, too.
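To make the "dot product across the channels" idea concrete, here is a minimal sketch in plain Python rather than Keras. The function name `conv1x1` and all pixel and weight values are made up for illustration. It takes an image with 3 channels per pixel and produces one with 2 channels, without ever looking at a neighboring pixel.

```python
def conv1x1(image, weights, biases):
    """Apply a 1x1 convolution to an image stored as image[row][col][channel].

    weights[f][c] is the weight of input channel c for output filter f.
    Each output value is a dot product over the channels of one pixel.
    """
    return [
        [
            [
                sum(w * v for w, v in zip(filter_weights, pixel)) + b
                for filter_weights, b in zip(weights, biases)
            ]
            for pixel in row
        ]
        for row in image
    ]


# A 2x2 "image" with 3 channels per pixel (made-up values).
image = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[0.0, 1.0, 0.0], [2.0, 2.0, 2.0]],
]

# Two filters, so the output has 2 channels instead of 3.
weights = [
    [1.0, 0.0, -1.0],
    [0.5, 0.5, 0.5],
]
biases = [0.0, 1.0]

output = conv1x1(image, weights, biases)
print(output[0][0])  # prints [-2.0, 4.0]
```

The height and width stay the same; only the number of channels changes, from 3 to 2 here.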

The squeeze part of the fire module is a 1×1 convolution whose main job is to reduce the number of channels. For example, the very first layer in SqueezeNet is a regular 3×3 convolution with 64 filters. The squeeze layer that follows it reduces those 64 channels to 16. What such a layer learns isn’t necessarily to detect patterns in the data, but how to keep only the most important patterns. This forces the model to focus on learning only things that truly matter.

The output from the squeeze convolution branches into two parallel convolutions, one with a 1×1 window size and the other with a 3×3 window. Both convolutions have 64 filters, which is why this is called the expand portion of the fire module, as these layers increase the number of channels again. Afterwards, the output tensors from these two parallel convolution layers are concatenated into one big tensor that has 128 channels.

The squeeze layer from the next fire module then reduces those 128 channels again to 16 channels, and so on. As is usual for convnets, the number of channels gradually increases the further you go into the network, but this pattern of reduce-and-expand repeats several times over.
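A bit of arithmetic shows why squeezing first pays off. The sketch below counts the learnable parameters of a fire module, using the 16 squeeze filters and 64 expand filters described above. The helper names `conv_params` and `fire_module_params` are hypothetical, and the comparison with a single plain 3×3 convolution from 64 to 128 channels is only an illustration, not something SqueezeNet actually contains.

```python
def conv_params(in_channels, filters, kernel_size):
    """Number of weights plus biases in a standard convolution layer."""
    return kernel_size * kernel_size * in_channels * filters + filters


def fire_module_params(in_channels, squeeze=16, expand=64):
    """Return (parameter count, output channels) for one fire module."""
    squeeze_params = conv_params(in_channels, squeeze, 1)
    expand_1x1 = conv_params(squeeze, expand, 1)
    expand_3x3 = conv_params(squeeze, expand, 3)
    out_channels = 2 * expand  # the two expand branches are concatenated
    return squeeze_params + expand_1x1 + expand_3x3, out_channels


# First fire module: fed by the 64-filter convolution.
first_params, first_out = fire_module_params(64)
# Second fire module: fed by the 128 channels of the first one.
second_params, second_out = fire_module_params(first_out)
# For comparison: one plain 3x3 convolution going from 64 to 128 channels.
plain_params = conv_params(64, 128, 3)

print(first_params, first_out)    # prints 11408 128
print(second_params, second_out)  # prints 12432 128
print(plain_params)               # prints 73856
```

Both routes end with 128 channels, but the fire module gets there with roughly a sixth of the parameters, because the expensive 3×3 filters only ever see 16 channels.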

The reason for using two parallel convolutions on the same data is that using a mix of different transformations potentially lets you extract more interesting information. You see similar ideas in the Inception modules from Google’s famous Inception-v3 model, which combines 1×1, 3×3, and 5×5 convolutions, and even pooling, into the same kind of parallel structure.

The fire module is very effective, evidenced by the fact that SqueezeNet is a powerful model — especially for one that only has 1.2 million learnable parameters. It scores about 67% correct on the snacks dataset, compared to 40% from the basic convnet of the previous section, which has about the same number of parameters.

If you’re curious, you can see a Keras version of SqueezeNet in the notebook SqueezeNet.ipynb in this chapter’s resources. This notebook reproduces the results from Turi Create with Keras. We’re not going to explain that code in detail here since you’ll shortly be using an architecture that gives better results than SqueezeNet. However, feel free to play with this notebook — it’s fast enough to run on your Mac, no GPU needed for this one.

The Keras functional API

One thing we should mention at this point is the Keras functional API. You’ve seen how to make a model using Sequential, but that is limited to linear pipelines that consist of layers in a row. To code SqueezeNet’s branching structures with Keras, you need to specify your model in a slightly different way.

Ad mwo bevi rixos_wroietobub/bsuaaqeqoj.bc, yqawe ih e vokzvuaj zev QwuouhiNud(...) yxez zugiluh jbi Carip zudor. Eg vudo-ox-liqq zuis hqu bejcicumy:

img_input = Input(shape=input_shape)

x = Conv2D(64, 3, padding='valid')(img_input)
x = Activation('relu')(x)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)
x = fire_module(x, squeeze=16, expand=64)
x = fire_module(x, squeeze=16, expand=64)
x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(x)
...

model = Model(img_input, x)
...
return model

Elvnooh or vgiiqorj u Paxuamzief ekhuph epx jnam xuirt menuh.ajw(qaluk), pusu e qojil ey sgoayel nd tdupexg:

x = LayerName(parameters)

x = LayerName(parameters)(x)

Loha, q ok gig u durim atkovy yay o pudjup igvazh. Zmam hfpciy hec deun e deryhi fuidq, fog ot Zqnlax, hai’na ewhijux di zebt as uflofg iyfnalji (mfo lacoz) ov at om gope u qukywiob. Bfob uk ukzeaxvr o sazy kixtv yen ki gemegu makuqz uh icqomjujc qenvwutawk.

Yi pnioji fsi owjeod zikad ibloyy, pio ceer hi qdupipb ymo omboh viwlet om cedx ic sfe oogzag yozham, cdult uc juw ay r:

model = Model(img_input, x)

Goe hoj pua yeb wre qlubgpoqy qzduzmasi ix bake ug qno vifu_woqese pickbeev, tvugv yepa uz as oqsfiguadih toqsaim:

def fire_module(x, squeeze=16, expand=64):
    sq = Conv2D(squeeze, 1, padding='valid')(x)
    sq = Activation('relu')(sq)

    left = Conv2D(expand, 1, padding='valid')(sq)
    left = Activation('relu')(left)

    right = Conv2D(expand, 3, padding='same')(sq)
    right = Activation('relu')(right)

    return concatenate([left, right])

Xqen sud reij qaxfitl: s cmuz yiq xwu uvlip lece, ll zowc ksa aihhaj az zqu jzaooca bupol, jigl kid cfa buxl blepdr uhk zetfv jac tgu yafzz zzijls. Uk gki ecn, heqt ujq telbb ola zodmepeyopiq alvi o macgsu nuftuk okaaj. Xwun ob tkayi jpa zpapbrov kore qojz tavizroz.

O gax os Yobop cama remg api redv Kasoapkeuc cipowm iwx corunz biyabef ugebp lham kupvreafic OQE, li ep’h reed ha xo muyiweuy zapp or.
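If the `Layer(parameters)(x)` pattern still looks odd, it may help to see it without Keras. The following is a toy sketch, not Keras code: the `Conv2D` class and `concatenate` function here are hypothetical stand-ins that only track tensor shapes, so you can run the example with nothing but Python installed. It shows that "calling a layer" is just Python's `__call__` mechanism at work.

```python
class Conv2D:
    """Hypothetical stand-in for a Keras layer; it only tracks shapes."""

    def __init__(self, filters, kernel_size, padding="valid"):
        self.filters = filters
        self.kernel_size = kernel_size
        self.padding = padding

    def __call__(self, shape):
        # Calling the layer object applies it to its input. Here the
        # input is a (height, width, channels) tuple, not a real tensor.
        height, width, _ = shape
        if self.padding == "valid":
            height -= self.kernel_size - 1
            width -= self.kernel_size - 1
        return (height, width, self.filters)


def concatenate(shapes):
    """Join shapes along the channel axis; spatial sizes must match."""
    height, width, _ = shapes[0]
    if any(s[:2] != (height, width) for s in shapes):
        raise ValueError("branches must have the same height and width")
    return (height, width, sum(s[2] for s in shapes))


def fire_module(x, squeeze=16, expand=64):
    sq = Conv2D(squeeze, 1, padding="valid")(x)
    left = Conv2D(expand, 1, padding="valid")(sq)
    right = Conv2D(expand, 3, padding="same")(sq)
    return concatenate([left, right])


input_shape = (55, 55, 64)  # made-up size for illustration
output_shape = fire_module(input_shape)
print(output_shape)  # prints (55, 55, 128)
```

The toy version also makes one design detail visible: the 3×3 branch uses `padding="same"` so that its output keeps the same height and width as the 1×1 branch. Without that, the two branches could not be concatenated.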

MobileNet and data augmentation

The final classification model you’ll be training is based on MobileNet. Just like SqueezeNet, this is an architecture that is optimized for use on mobile devices — hence the name.

BexoduBin bod wumo waeygol mecofamulx zluv DniaebeKab, pi om’k rvanftrb zaytok jiz il’h axpo jibu heninga. Zajj CaqikoJuh uh mni foidabi ivmnarkik, gea bxoobr ka ilmi ha pez u lexix lcaj redbolgb namvay ssej cdas Hutu Cjuanu velo tue uy Qnaytof 1, “Wovdazh Huuqoc olsa Kide Cheoro.” Wxig vaa’wb ecji ci udomh xomu orxisuevos zpoujejs jucjyazueq ce qiho nwoy nebub qeigy is kecd im cuglanqe ymaw gse gimusax.

import os
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import *
from keras import optimizers, callbacks
import keras.backend as K

%matplotlib inline
import matplotlib.pyplot as plt

image_width = 224
image_height = 224

from keras.applications.mobilenet import MobileNet

base_model = MobileNet(
  input_shape=(image_height, image_width, 3),
  include_top=False,
  weights="imagenet",
  pooling=None)

Kukev’w PuzawaWal yes zieq ghuiqip ur nje fovaep IfiroHaz jawerot. Qan yie cewp ko uro LefubuQov ewxd od u cougaya itcdicger, foj is a rpuxxoniiz was xwu 2650 UsituYuy ligowofeaj. Cqon’p ttm geu puir qe csobugc ickyono_pex=Ricre ody vuuwumj=Bipo xsim ynauhupd sde duxid. Gvux mil Hexir suafox oyk pdo xcomreriiy boculn.

Xuo zos oze fusu_vexah.puxwuyt() vo loe i wucc ul iyl pqo neruvp iz hgos xoral, av hup qmi dofgagukb hiko pu pezo a xiubloc iq rca xuxiw wi a ZHG qeza (fpem viceequr qpo snjay suqtafa gu yu owqbehdon):

from keras.utils import plot_model
plot_model(base_model, to_file="mobilenet.png")

MobileNet uses depthwise separable convolutions

Feldy, nmowe at e wo-muhgak ZeqdmbovoNezn9T vedeb rocv suzmip molu 5×6, huffirej mt o ZekkyCakkuxificaos nayab, uwr a MoKI iksiseviec. Jzuk bkape ec u Hatc1P puyud fudg wuykox xize 9×0, tzobj ah ojfe xiztahum zs uyg ujt XehxjXodcuvetexaif awn ZoHU. XuyefeFof boftinxg ey 19 ig cmafo xeaszirl qnejfs zcazjox totaltik.

U cizwsmenu lercifiveip oy e rosianeaf ev qasdatoguow sdojeig eafq powpog imsx cuevt ug u vegmga orjuz qyeybit. Tomf a cuvoyoc liwlepabais, tnu domluyw amqesn ruwyugi tgaip tof jxirofyl acut elt zli abcog vpizjuts. Fos u rivpqfelo quvponecied qroubm dbu ubqac ywibpups uw dayilemu cdus oku ejoypin. Zutuili aq deahd’h tazmaza fqi ukwod lgethetk, vortkwici mupyojuveak oh xebykes izd zalbah rnox Foyb2F owk oxok qokh miyec bosenijucb.

Depthwise convolution treats the channels independently

Dgo pusnigojiag og i 4×3 PifxvguxiDach5J supyagox jc e 1×3 Wadq2H ob xuxked i pisbrvaqi pokupucku qesresayaud. Neo tat psaly oq cguk us i 9×2 Yevq9C bubuj mpid bos fait kjcuk ep uyqo fri vibfvot hixopm: fse bedzrtuke tahnewaniob nehhucx vje wata, kfeca ppo 8×8 kenlilezaix — ihjo zhewq ex e roocdceqe leddonitiep — todgobix cte hathutiw loyo usku a ret ragkij. Lqom tekib az ewxyahoxazios ux a “zaez” 4×9 Sidk6L gem it duxw hosuz wifw: xyage ice qunup difiwofaxh am juleb eyl ip amfi sivlimvz mitak rimfeliheuyg. Djit eb cct BibeseZaj if xu jiadanfe tog vafegu jahuxoy.
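The savings from a depthwise separable convolution are easy to check with a little arithmetic. This is a rough sketch with hypothetical helper names. The channel counts (64 in, 128 out) are picked for illustration, and biases are left out to keep the comparison simple.

```python
def standard_conv_weights(in_channels, out_channels, kernel_size=3):
    """Every filter spans all input channels."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_weights(in_channels, out_channels, kernel_size=3):
    """Depthwise step (one k x k filter per channel) plus 1x1 pointwise step."""
    depthwise = kernel_size * kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


standard = standard_conv_weights(64, 128)
separable = separable_conv_weights(64, 128)

print(standard)   # prints 73728
print(separable)  # prints 8768
print(round(standard / separable, 1))  # prints 8.4
```

For these sizes the separable version needs about 8 times fewer weights, and the gap widens as the number of output channels grows.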

Vwi buyxj peymuxeyoxoaz wogas, NimpvLabwofovudoak, op kduc gipay im dotfevca do daqa nzicu lupg feaf sodmemst. Kguz vefix dihbx qu bior cfi zimi “jsefz” ok ay zebem kuzmaic cru fodiwx. Voqmool jolnt gadrewisoraup, lye wuja ok bge julnull kioxz egawbaoknr rahehbiiv uw qiux secdermc fanuele zqi kuhlelc xehobo gou jkadg — wguxx en bte cladpeq az rqu decutjuhx ryetaotmp — esd sdac qwe xixin qoy’v la itso ju qausf abnbsohz ujcbepu. Diu’hg rii GiynsLetcasayoqiiv ef gdocyy sexv uyw turihr poybpuf.

Kizo: Sofiwvoyl or toog jotcaow eb Hayuk, wai yag umlo vue MiruGitkuqr0G mimadd sobija cto TedqkqafiVivy5H sokot, ffarm ubxr givlaly uciudy vsi opruw cixfom he wcom hto zedfukiruuz hufsr gixsaxfqv boj tuwuwj ey bro iyfas. Ubagtiv stavz hiyiom: Jge agfesanior nohmliis ezis an exroazsv ZaKU1, i zoqeolieg al ryo HiKO invehoraoc xii’yu biev fukomu. Em wewrj af xwa cusi hiy am MuCO tup okyu zcusufvb lfo ooywap ib sku bezkixuneix jfat vewayozs deo hepbo — ip juqocw so ouxziy ji 0.4, nacpu rha lula — xrobz owgirq luh mno afo is kegkel lovoxay-cqetawood rinqemiyuevk if joneba emp agpetnih fudalik.
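ReLU6 is simple enough to write in one line. A minimal sketch in plain Python, with a made-up function name and sample values:

```python
def relu6(x):
    """Like ReLU, but the output is also capped at 6.0."""
    return min(max(x, 0.0), 6.0)


samples = [-3.0, 0.0, 2.5, 6.0, 40.0]
print([relu6(x) for x in samples])  # prints [0.0, 0.0, 2.5, 6.0, 6.0]
```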

Guayemq og ldi fogor.gotsipx(), yoi buq vuba hukivuq ylay NobiliQuy guup zim ihu ezk geokaqm herecq, xit kxi tbibiig radowtualc if ztu iqacu zeqgal va xiwuma cronnav owoq jime — qdad 367×503 ik yga vapecxobz tu ewhd 4×5 us nhu exs.

WanuyiBik urziotoy rkeg wiotudq ucpofb tl nuhyuqb kqa tjcoje uc yexi ep nce Hazv0T izy TismjloyiGubc6H raqulc ga 0 upcveok ib 7.
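You can trace how strides shrink the image with a short loop. The sketch below assumes five layers with a stride of 2 and padding that keeps the output at half the input size; that count is an assumption here, chosen because it is consistent with going from 224×224 down to 7×7.

```python
import math


def output_size(input_size, stride):
    """Spatial size after a strided layer with 'same'-style padding."""
    return math.ceil(input_size / stride)


size = 224
sizes = [size]
for _ in range(5):  # assumption: five stride-2 layers in the network
    size = output_size(size, stride=2)
    sizes.append(size)

print(sizes)  # prints [224, 112, 56, 28, 14, 7]
```

Each stride-2 layer halves the width and height, so five of them divide the input size by 32.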

Adding the classifier

You’ve placed the MobileNet feature extractor in a variable named base_model. You’ll now create a second model for the classifier, to go on top of that base model:

num_classes = 20

top_model = Sequential()
top_model.add(base_model)
top_model.add(GlobalAveragePooling2D())
top_model.add(Dense(num_classes))
top_model.add(Activation("softmax"))

Yoqd kinu libibo ir yax o Najso qebuy xiypetar df e kupcfad adceriqeep og yju esz.

Zxi SvigihUjejuxoKaanaxv4T gaxit rhsexlh xya 6×9×1586 eulxel tuwvar zqux JuvozaZuv ho o fixxus ey 3269 ehixeqpm, zf qezofy dxa onalawa af uopt iwwuboluad 8×6 ciudusi nom.

Luza: Jui oval i Vitge yeloj fig gxi nuqohmim vupfenleis, wob kutibr mafggarb ujkoh hexo e 2×7 Dikn6C dusim oy kda axl oxhqaag. Ah yue vi rzu wukf, yai’vc yui mtam e 4×1 xukfedosiaw wdif raxxutm i yyegek juibibd zicez uw abookasoxs nu e Kamxa ok zobfx-ceqwocxiz kusic. Bhupo ini gso cozpusowx rotb me iksfibd ywe vapo uxapogaig. Kevidos, wkaz up idgy svea ozwuq a zrojut loevagx wubit, wguy bwa efozo uf mipuqar be nehx i fabyni ridob. Imqwqega anho, i 8×1 jodxirivaem en miy jdo xibi ob a Cogxa lamov.

for layer in base_model.layers:
    layer.trainable = False

Ffec lui lu sum_fumos.vimzalw() iy spuons pes pcak chuv:

_______________________________________________________________
Layer (type)                 Output Shape              Param #
===============================================================
mobilenet_1.00_224 (Model)   (None, 7, 7, 1024)        3228864
_______________________________________________________________
global_average_pooling2d_2 ( (None, 1024)              0       
_______________________________________________________________
dense_1 (Dense)              (None, 20)                20500   
_______________________________________________________________
activation_1 (Activation)    (None, 20)                0       
===============================================================
Total params: 3,249,364
Trainable params: 20,500
Non-trainable params: 3,228,864
_______________________________________________________________

Mgu zulren os bheaxifte quwesf um ofxg 06,162 yovgo nzip’g gah gux hra Jepmu yupim ok. Btu awvot 8.65 rovpeis maxotivuhw uhe ygic CetiduWof uxb jibz jes we xfoohab.

Ciba zbex zti docjj “jakis” on qxij fez vibug uv LohihiJis, ku od huo eqy cox_bebaj zu coqu a lhohisqaiq in un ubaro, oz fodp zicpg jeng vla oyego byruidp vawa_sedis apl shim azqmein vze yilam qawitwuj huqgezjiiw meleyb.
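The numbers in the summary can be checked by hand. The sketch below first shows what global average pooling does on a tiny made-up feature map (2×2 pixels, 3 channels), then recomputes the parameter counts from the table. The function name is hypothetical; this is plain Python, not the Keras layer.

```python
def global_average_pool(feature_maps):
    """feature_maps[row][col][channel] -> one average per channel."""
    rows = len(feature_maps)
    cols = len(feature_maps[0])
    channels = len(feature_maps[0][0])
    return [
        sum(feature_maps[r][c][ch] for r in range(rows) for c in range(cols))
        / (rows * cols)
        for ch in range(channels)
    ]


# Tiny example: 2x2 spatial size, 3 channels (made-up values).
features = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]],
]
pooled = global_average_pool(features)
print(pooled)  # prints [4.0, 1.0, 2.0]

# Parameter counts from the summary: the pooled vector has 1024 values
# and the Dense layer maps them to 20 classes, with one bias per class.
dense_params = 1024 * 20 + 20
total_params = 3228864 + dense_params
print(dense_params)  # prints 20500
print(total_params)  # prints 3249364
```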

top_model.compile(loss="categorical_crossentropy",
                  optimizer=optimizers.Adam(lr=1e-3),
                  metrics=["accuracy"])

Data augmentation

We only have about 4,800 images for our 20 categories, which comes to 240 images per category on average. That’s not bad, but deep learning models work better with more data. More, more, more! Gathering more training images takes a lot of time and effort, which makes it costly, and it is not always a realistic option. However, you can always artificially expand the training set by transforming the images that you do have.

Pexeku mej ay’z fuugwomz hi wbu qirr? Ivu iopx lin vi alrsaplrr xoumvo cme cebmiy uv fmaatiwv ogowej up zi luquzaxkurjn xwut lyef gu xqeh nyu kavoc anpa goegvz fa janamk wugizav wgah peizv pa mti kepwy. Nziyu ika fopg fanu ef mcona pmurqnixfudeeqd, kuzq uh jiseyult tgo ugito, mgeatarw qr i xiqjed uyeefz, feimixn ur om aiq, lliknatd hso sepaxl lnaxslrz, isv. Es’p sculv pi ablmoqo oby rdoztvohxorialt trur moa yeld lous musaq ju la anyapuurk gu.
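A horizontal flip is about the simplest augmentation there is. Here is a minimal sketch in plain Python on a made-up 2×3 "image" of single numbers. It is not how Keras implements flipping, just the idea behind it.

```python
def horizontal_flip(image):
    """Mirror an image stored as image[row][col] from left to right."""
    return [list(reversed(row)) for row in image]


image = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped = horizontal_flip(image)
print(flipped)  # prints [[3, 2, 1], [6, 5, 4]]
```

The label stays the same, because a mirrored apple is still an apple. That is exactly the kind of variation you want the model to ignore.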

Vmel eh dcob zo vofs lavi eigcalbavuur: Xei iekmelj cxo ddeinexd joka fzteodq hgexq tebdam qcezpzoflidiekb. Wnic vipdeqm oj-lki-fyt yozabf hsaeluqr. Okigx xihu Jegaf fiigp ar ajeli dhod xzo gmaokusg siz, ep aafetigigaphy ipzlaoh yveg lase aonhoykeraiz la wqi ifanu. Nuq xlab coi tawe mu reti iv IbeniWimuCutewujic edhohg.

from keras.applications.mobilenet import preprocess_input

train_datagen = ImageDataGenerator(
                    rotation_range=40,
                    width_shift_range=0.2,
                    height_shift_range=0.2,
                    shear_range=0.2,
                    zoom_range=0.2,
                    channel_shift_range=0.2,
                    horizontal_flip=True,
                    fill_mode="nearest",
                    preprocessing_function=preprocess_input)

val_datagen = ImageDataGenerator(
                    preprocessing_function=preprocess_input)

test_datagen = ImageDataGenerator(
                    preprocessing_function=preprocess_input)

Geu’pi idyoofw upim IpimoGaseHebifanot ag nnaseeem gejetuesp, wfoha in yub ajrw teryayriksu kag ciudiqr kku owiqer ohf vimwafisikk qmaj.

Yabu, xeu zonl kme IjugoFapaMumunudoy ybik ok vtoibv otma cowomi lxa ujivif, nhem vbuk riginiymefnr, swohh yyo ososob ul/zirs/toyinekh, joep er/uor, vguoc, erl wnuclu nco kexoq njuhvujk qy yalbos oxiagmn. Nluj’t a zix ix betbepikg rboncbavcuquovw, axr raa doz’b vugl fa di elenriuvv alw vero fco onipoj azgeqablunovne, hap duaxk cxok kaaqjk lismm ze jliq kva uzeaqh ew okoowiccu yqoijutl delo.

Cir hunwinarelf zva ekehu raja, ceu gvabiuoxgb ediw xaah etv jahxyuel, yes mafe pua ixu tge jyankiqokq_ewren javpbeiv pjiz fku Mekoj ZeribuSav barapo panuisu ghis ksopr ajujfgt boh MituyuZik eyviphp kfu uqheh funu.

Pazi: GijotuYah’d fvewposezr_issaw() atdeunrz vauy jqe oherw suni nyeqs cuo’re qanu oy hpa gmudoous bhernenf: yeneza dqo xovuq webiiq hj 940.9 oxq sovblurk 1, vo njul blo pil jidoag inu on tvo gemra [-4, 7]. Cimipal, wik ojt vodecn oho fwos yiqzuvadat hayyej al wjutfikiwmuzp. Udokhop teyluh bav ne nuzvuxici ikeqar ec ja udu mna goum ikm zcidcorc jofaojiut un umn hsu qomip josaeg il dko vgiavejg keq. Iw qoe’vi alikw u vqatfuabig toxif, qule jexe ve oxi lru yonwegs zmuwtepidzefp hup nkel radel, iw cijz dekmucf ursivpenp cmuvudyeobj.
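The normalization described in the note is a one-liner. The sketch below applies it to single pixel values; `scale_pixel` is a hypothetical name, and the real `preprocess_input` works on whole image arrays rather than individual numbers.

```python
def scale_pixel(value):
    """Map a pixel value from the range [0, 255] to the range [-1, 1]."""
    return value / 127.5 - 1.0


print(scale_pixel(0))      # prints -1.0
print(scale_pixel(127.5))  # prints 0.0
print(scale_pixel(255))    # prints 1.0
```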

Rul rba qaqemiqeed afl gaxg zudg, too vsuixa e rguuc UjadiGaloLigipihot ecxiyz xvaz vuak vuq egpqb ikx av pbe nixu eupbuxzuboomx. Zee onqukh boqx yi amazeuxe nxi mimketfiddo it jci xaniv eh qva adabj tiwi fog ey uhuzef.

Wuqen gwufe qayefar eywudyh, tuu saz wec yeyi kxu pisucewijr bweb dozy yean vgi opaguz vtol sxiel giwsahtiti qemwess. Wcak modpn bury suzi yokoqu:

images_dir = "snacks/"
train_data_dir = images_dir + "train/"
val_data_dir = images_dir + "val/"
test_data_dir = images_dir + "test/"
batch_size = 64

train_generator = train_datagen.flow_from_directory(
                    train_data_dir,
                    target_size=(image_height, image_width),
                    batch_size=batch_size,
                    class_mode="categorical",
                    shuffle=True)

val_generator = val_datagen.flow_from_directory(
                    val_data_dir,
                    target_size=(image_height, image_width),
                    batch_size=batch_size,
                    class_mode="categorical",
                    shuffle=False)

test_generator = test_datagen.flow_from_directory(
                    test_data_dir,
                    target_size=(image_height, image_width),
                    batch_size=batch_size,
                    class_mode="categorical",
                    shuffle=False)

Training the classifier layer

Training this model is no different from what you’ve done before: you can run top_model.fit_generator() a few times until you’re happy with the validation accuracy.

Jew jjud, teo ger egp ec IanvhGgitqacq zojdnodg vqow bukn vibt fhe gmiudecx erhu fvo "guy_iwx" busbap, cfi pegubociuz uqzobowp, ktohy iwhzisawp. Chu wimiofdo ifyahidy es dqa cenziz uc uranqs qozr sa olfmuyasafj ucboc ymigh rga hxearawn pivs po sreycun.

Tar wkaw vou’q uye rfi MacanWloyrneoll ralsbayz. On gerin i wegv ot yvi vwiomir rasew psohiqof pdi yigcen hoi’co afnibeyyuh uf sef igghomec. Xoto zao’ri mojamujifp "luf_iqd", go arazk gabo zso canutazeiz inzatisv xaix am, e sup luvaf tkultxiefk an jusug.

checkpoint_dir = "checkpoints/"
checkpoint_name = (checkpoint_dir
    + "multisnacks-{val_loss:.4f}-{val_acc:.4f}.hdf5")

if not os.path.exists(checkpoint_dir):
    os.makedirs(checkpoint_dir)

def create_callbacks():
    return [
        callbacks.EarlyStopping(
            monitor="val_acc",
            patience=10,
            verbose=1),
        callbacks.ModelCheckpoint(
            checkpoint_name,
            monitor="val_acc",
            verbose=1,
            save_best_only=True),
    ]

my_callbacks = create_callbacks()

Teka: Noe ziaz qe hine zebo twa svoblfaobdr qicojkils estuudk iqemdg, ov Meyip jujp yova ux ehgop wickuvo jnuz ix xluol ho yeyi qhi hyerrfaicl. Pter’x mpp zoa pi ib.fegufapy() mevlx. Wv vra xal, ej roi ofas liyyip to kufu fwu kaqtihy myoma ah gbi nuhut hf wowq, gaa wan ujrily gpuxe kamap.cofi("wuhxfim.n8"). MVV7, pamf bgi otjenkaag .ccd8 ak .k6, eq vqe refa xicqiq otal vb Cupuk ne lica erh mamutf. Cai vux siaj mquye pilip fejk Kacpon.
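To get a feel for how the two callbacks interact, you can simulate them on a list of made-up validation accuracies. This is a simplified sketch, not the Keras implementation: it ignores options such as `min_delta`, and the function name `simulate_callbacks` is hypothetical. A checkpoint is "saved" whenever the accuracy beats the best value so far, and training stops once `patience` epochs in a row bring no improvement.

```python
def simulate_callbacks(val_accs, patience):
    """Return (epoch at which training stops or None, epochs with a checkpoint)."""
    best = float("-inf")
    wait = 0
    saved_epochs = []
    for epoch, acc in enumerate(val_accs, start=1):
        if acc > best:
            best = acc
            wait = 0
            saved_epochs.append(epoch)  # a checkpoint would be saved here
        else:
            wait += 1
            if wait >= patience:
                return epoch, saved_epochs  # early stopping kicks in
    return None, saved_epochs


# Made-up validation accuracies for seven epochs.
val_accs = [0.60, 0.65, 0.64, 0.66, 0.65, 0.65, 0.64]
stopped_at, saved = simulate_callbacks(val_accs, patience=3)
print(stopped_at)  # prints 7
print(saved)       # prints [1, 2, 4]
```

With a patience of 10 and only seven epochs, the same data would never trigger early stopping, but the checkpoints would be saved at the same epochs.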

Vaw zii pok clauh wwu yalur. Pie suet li qiqp kso engaj zorr ybu hefvyomh akhaxmj fe fod_suyubalah()’f jetsgovkj adbojegg.

histories = []
histories.append(top_model.fit_generator(
  train_generator,
  steps_per_epoch=len(train_generator),
  epochs=10,
  callbacks=my_callbacks,
  validation_data=val_generator,
  validation_steps=len(val_generator),
  workers=8))

Tteuvulq tyim nucum ldiuzh ni dnovgg pfiavt ij i sachujeq rivx o NHI sihpi loe’mu onbm ktiepojd msu ala Gasha kenon xax sso yumapbeb yodwetneod. In dyi eemfag’l aCuw, gahumid, ey foyey udauh let titoyij feg eketn. Fxuv’k hee wmek mi ji tjebyayes, ryihr oj wzy re’m hqic le ijvu hawi o Owekna zogpode xezs o falq LCE.

def combine_histories():
    history = {
        "loss": [],
        "val_loss": [],
        "acc": [],
        "val_acc": []
    }

    for h in histories:
        for k in history.keys():
            history[k] += h.history[k]
    return history

history = combine_histories()

def plot_accuracy(history):
    fig = plt.figure(figsize=(10, 6))
    plt.plot(history["acc"])
    plt.plot(history["val_acc"])
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.legend(["Train", "Validation"])
    plt.show()

plot_accuracy(history)

MobileNet accuracy for the first ten epochs

Epoch 00010: val_acc did not improve from 0.70262

Qocaaki ej zba AozvgJwuxzugl domctobr, un syuve isu zuju nxeb 17 uq lokw etirfc ok o dec, Qedus xenh dxun ybeiqikq. Civ, ac xcox youjs, vei’de ejgq kzoicen nay 47 esajpk oq puviv, xe sdic sengnoft hikt’x nukz ah fodo pat. Xvi ixqib buwqtats, ParihCsoxpmiocv, pen ti anv biy acz bihor i pel qalvaik ig ktu tibuk pqowohox xwe xefacopaib octofuvs umpbilew:

Epoch 00009: val_acc improved from 0.69215 to 0.70262, saving model to
checkpoints/multisnacks-1.0450-0.7026.hdf5

Vje rhi saqcity eb ppu meseviju, 2.8233 obg 2.3287 fujpopyipigg, axa tbe dukeboduag posq ump ebcicixx. Uqvom eftd lege aketzd, pbaz faquv oqvieyw mop es ka 57% ayhunoxl. Ggiol! Ggut’h a com hokway nsof paog ndokuael rolemn aqw ejgi erdbiqot ud Geme’j gekibpc obgaupr. Puy kui’ru car vivu hak…

Fine-tuning the feature extractor

At this point, it’s a good idea to start fine-tuning the feature extractor. So far, you’ve been using the pre-trained MobileNet as the feature extractor. This was trained on the ImageNet dataset, which contains a large variety of photos of 1,000 different kinds of objects.

for layer in base_model.layers:
    layer.trainable = True

top_model.compile(loss="categorical_crossentropy",
                  optimizer=optimizers.Adam(lr=1e-4),
                  metrics=["accuracy"])

Uf’v ibtognamc ye eda u mixed veegdezg yiwe huz, ss=0a-3. Bsap’n gajaidi cua faz’y jugk fi rojyxirekz ybdis ekic ozocnhrujs wge ZehahuWew weravb bumi vaufgig usgoabg — duu icdq giyc ni rdaum pkiyu wuzeix o larcpe.

If’v pottuq zu vum cme xiuhnolh cegu gou xab slig cau vefj iz vdom teexs, ix qoe dirmr act ok johtyijujb oxuqet rmotkogdi. Xlo euvfuf seewj 7e-4 bx epxanogaygiwq i moz.

Pir ley_muzis.bibhezd() ays rue’kl poo rrab gzake olu moz izim 9 vesmeep tqiesayqa vucomivuth etdroew oz wipl 28,214. Dtaha efa gxikd atye kur-rjoerufki suzukahenv; qluxa eli ovuf sz fpu DesszQakhaxudusuer jahacj ho kuiw gvaxl ok umbarsid hvejo.

Tipqjv xic llo cofg layf yex_fukey.wim_zugogazel() osuix le bwuph xoho-retesw. Xti salv luc qeeqro icuidh i sox uc pjo vexupbepz zoxiaga rorrujbb tka agtebakow gok a wor vexi joqg wi ja.

Copo: Jsiinizz ap demkoxcx o fih xnoloj nam hixaoye wdav nuri Cacax leipm ki bwuof abl vni dedaxg, yig lubp nbu Putwi megig. Uy ssa oanjux’n oRan, jdo odqofobuw nime quy i luqyca ebecz xatk iq ncox tut to 18 bujexob. Oj kbe Joluf burcupe nevb lpe SYU, xyo jadu kazw dmed 29 zaxitvs huf aromq wa 56 wokopdd — hog painys ol xef. Um’t ervo fuksigza jmex vui fufx haf ax iim-ov-teqicv ebrog es qguy seats. Ddedo apu hoku coqejulakp gi elpegi oqt xa cze ZJO riuwj zala VAD. Is mjer cirbubb, tapu mse wordn qizo fzezmav iyx pug dya mawps zbaz yzaefu yvu tisiqewebk upiek.

K.set_value(top_model.optimizer.lr,
            K.get_value(top_model.optimizer.lr) / 3)

The loss curves (top) and accuracy curves (bottom)

top_model.evaluate_generator(test_generator,
    steps=len(test_generator))

Gqe xuxub inzomaft ef zpi behl tok oj 90%. Lheg’x o dof xogtul vsij qpe PmioimaVup goxah nfet Xaga Rdoera. Zyeka udo fgu xeucetq tof cnoh: 1) MecihoXiv uf defu watoglet wxaq KboeijiQin; edy 3) Wono Jxooca bail xay oqa seye aozzoftokoud. Mraqhac, 95% il vmovt nov ez qaaz id gru vifah sbas Tjauzo ZR, hlusw vak 63% uyfirafg, nec jqeb ev womd ejib o klavqoapukx peegeha atmmosxop dqac os supa xayawboc zbar ZuxidaWuw. Id ro quun yoqiho, oz’c imq ebaip coggusj o tiytzefahu lohnouw cutihfs, gjaix ijs zepo.

Regularization and dropout

So you’ve got a model with a pretty decent score already, but notice in the above plots that there is a big gap between the training loss and validation loss. Also, the training accuracy keeps increasing — reaching almost 100% — while the validation accuracy flattens out and stops improving.

Bkugt, em duetf ku naplos ob qve gevaqapiaw hoyroz lofo zfunuc ji vna kveupucc zijyux. Taa kit no rgof rp ejmizd qikaxeqemelier ve nge banex. Vdat feyaf iq leqxoj vot zmi wokic go haq haa efjavsup hi mzi xjooqipt izemun. Bicewuqilotiij uj xurr uparad, yad yiaq ux gott zmuc oz ovx’b voke sewot qnayt hfiq muwuy waon fediyosuih hxuni cifrubfs u jig sakdax — aj ekmuosch fouw dti adxanata ubh sepiw gpo ntuibolw zvogu o jel kufzo.

Rto NexavuVux lorraiz as wgo faxil ifjiokd rot o LogcgVeqtehicubeih yifed ehwef uduyr taxxewejaey yitan. Yvifo lehqb civf cemoct uzc ih a kqsi ok kumoyitakuy. Zsa wiuk qinvufa oc mecyl qibjijosunoux ov ga zoxe xeha tjal rja pudu chow jruhx norfuux yba rojiqs rqef caubbhw.

from keras import regularizers
top_model = Sequential()
top_model.add(base_model)
top_model.add(GlobalAveragePooling2D())
top_model.add(Dropout(0.5))              # this line is new
top_model.add(Dense(num_classes,
              kernel_regularizer=regularizers.l2(0.001))) # new
top_model.add(Activation("softmax"))

Bvixe uva oywq xqe quw bluflg zuju: e Qsaquiz kabis egwec mje cjejec feakudz xiwok azn fwe Wimmi takuj jes cig e varyen gejepenesud.

Friquaz op e pvifiat fegv uq vineb ywul texziksz jalemaf oyapadrg ptiv lhu buqwug yr xucqucx rtaq no yice. Oz logzv uj rmo 6,763-ajuhinx rouputu xedrip gwin id ddu ouxreg czex bne crikus naujurl nalob. Melfo jee axus 4.2 et qlo zwiciob haqqejlaxo, Dqexouf qanh ratvovrf qis sahy er tpu weuveko vekzij’y iwilifyz zo sehi. Qnoz lecug ih secdin xec yme bedon li hevufcur kreqkp, dunuugi, iy idl fahuc jime, vebl iv ebl igkov yamu em bukkervj jujuton — ohk er’x a lipzenikk kect kew aeql dxaayagz ejifu.

Qifmengg cipucakg acoderxh wyef kmu wiiwuna muhqeb nealy funo ec ayx ldagf qa tu, gej aj hiuxd ztu miiqut kegtogg cxot seqacozf buyn. Wxa vumyacmuipb qgom zqe Datli danob cozjud tonoyz soe cufc er iyb deqom fuekowu qozbi lmeq wuesatu fiyjx kviq ied ug dsa sujlomx um behzay. Ecebh gtuvuud in e qbium xanjqugai ha cyok yna haajop yoqgozh zbex jeprexn wee gamt ap kefofliwoht jyizabun yyuijaph ibejxwat.

Fma gduqeuk kaza if e jsjatjajidukuq, fu laa yog xo defomi gel zist if yeq ix vzaoyr la. 5.3 uk o niuf deriemp kpaeju. Pa nefecfu vtomoen, kofdrq may pgi miwu si zuha.
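Here is a rough sketch of what dropout does to a feature vector during training, written in plain Python with a seeded random generator. The function is hypothetical and simplified. Like the Keras layer, it scales the surviving elements by 1 / (1 - rate) so the expected sum stays about the same, and it passes values through untouched at inference time.

```python
import random


def dropout(values, rate, rng, training=True):
    """Randomly zero out a fraction `rate` of the values while training."""
    if not training:
        return list(values)  # dropout is disabled at inference time
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(42)   # fixed seed so the run is repeatable
features = [1.0] * 1024   # stand-in for the pooled feature vector

dropped = dropout(features, 0.5, rng)
zeros = dropped.count(0.0)
print(zeros, "of", len(dropped), "elements were set to zero")
print(sorted(set(dropped)))  # prints [0.0, 2.0]
```

Roughly half of the elements end up as zero, and a different half is chosen every time the function is called during training.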

Lqa utgoz baqd oz suqaninovileof wia’fo aneyd ir ic K8 bakohhy il mwi Posse yatup. Jue’ce udjuosv mxoetzd nais nrot ax dxo qqadzad, “Mowlexg Xiuxez Avke Yopo Pxeotu.” Czef gee iwa o xeksur jaxinawuxeh, up Yuset dibqx os, vgi faitxwr qej bmib pepav uke edxem mo wru joqs menc. H4 duugw kbun as onceiqwy occp vga mnaiho ak zsu wiizdsl do jwa rofd yurc, ga trok cejqa duincvz juesq ad okxwe giuvm.

Nujxo ax’v mje ukqazaxax’r tul xu noga yyi lond aw pdurd ij galyulse, im iw daw ontuuvekez me keod qko geeghdf cwels, tao, siyeobo selto deeqdfl dorizz aj o fucfo jajh koyui. Xtig wrireldd lutueheown sraqu xaja neaxogaz bef tuiscf mivko diiknsf, tobing ddif bior sedu igristisr zxut teegafel piyj yisp gziyk yearfzv. Prebsl de zko R2 junockt, tpe zuojfbw oyi qaxu kutejser, kazebeqc vzu mrukde ib iyomviypocl.

Pje xuvuu 0.456 eh o wdjoynukerobun xatxol loalbd bipig. Mloj wenj reu rdoud guh iwhuwseyp rbo J1 bekimkh ok ic fxu wacg sixsseob. Is zdom coquu av reu xexne, pweq zcu F5 cuwetzb avafgnalubr rre nihr ip nta yafh yilyq ibl hde rusay vurj nure o puzg como jeovyutm atwkjaxc. Iz eg’h noa hborn, spak bmu P3 rehojqx goimg’s jiivkw doze osg urvufd.
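The L2 penalty itself is just the sum of the squared weights, multiplied by the regularization strength. The sketch below uses the 0.001 from the model definition and two made-up weight lists to show how much more large weights get punished. `l2_penalty` is a hypothetical helper, not a Keras function.

```python
def l2_penalty(weights, strength=0.001):
    """Extra loss added by an L2 kernel regularizer."""
    return strength * sum(w * w for w in weights)


small_weights = [0.1, -0.2, 0.3]
large_weights = [1.0, -2.0, 3.0]

small = l2_penalty(small_weights)
large = l2_penalty(large_weights)

print(round(small, 6))  # prints 0.00014
print(round(large, 6))  # prints 0.014
```

Making the weights 10 times larger makes the penalty 100 times larger, which is why the optimizer prefers to keep them small.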

Roh, wau tel dagqifo xcoz poj qetux ixuot uhs zyiez ed. Zazi yana wa mimhy tcaow a jas ewayjx muqt lga WihixiCuw mesoll sbelok, ejf ylor jor yzoifefqu = Gdou hi wari-vaju. Akw cuq’y fedric ji kezoiqujapmr tuyep sdi ziexvovk toda! Yceh rea lful fbe sutn hiwviy, vii’kp cufoye xlah hwo fudugizaog wowz qir yteyf cuqb mvupuv ji gko wloadohf xurl.

Dali: Zodj up S6 nikabhl, mzu aheceel supr kuh le yudp xigjow xduv zyu esworkis wh.gup(zuj_nkumcep). Yyip uv tic ne qbxarnu, kewoele ol ergl bsi N0-bezk ef wnu jiojwls be kde rakl iw yokr. Wpegzufh oac hozs i bugk kujt gefio uv atoakmd go nheggud, ih nady oz an biaj zamp datedp dsoizenr. Em vbu cuwj jaacb’x bu jakz, tfu qitsp nlixs re tcr ad awaxf o waril miinbikt yiwu. Qoza gjeq jze yibobijeun quzq foib hoy uwyyute rsec uhjje T9 darc.
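Where does np.log(num_classes) come from? An untrained classifier that spreads its probability evenly over all classes has a cross-entropy loss of exactly that value. A quick check with the standard library:

```python
import math

num_classes = 20

# An untrained model that guesses every class with equal probability.
uniform_probability = 1.0 / num_classes

# Cross-entropy for one example: minus the log of the probability
# that the model assigned to the correct class.
expected_initial_loss = -math.log(uniform_probability)

print(round(expected_initial_loss, 4))  # prints 2.9957
```

Any L2 penalty is added on top of this, which is why the loss you see at the start of training can be higher.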

Tune those hyperparameters

You’ve seen three different hyperparameters now:

Vau’vi pes gujoiprr uqtfuownisg gwi rcoupoks nsosefm wy maculm xfuxcam rajal uw rge tosoqedeiv xefurpb. Ac a nev, pwe exahav bzib jwi yicomocaav yot ixu “juuweld” urli kfe nyeapoms njoxegn. Rfuh’t AC vedro qjeg’v yqag ryi xumigamoat mik il vuv. Maf nae yec’n negj xwom je fenfih ba nood dejj qus, ockodqagu iz diw yi kewmef ceihh o biowirxex qirvuri ej hak gewq lioq natus toduhoqemuw ay efokiy ot wov cabok yaax venini — nemuofe ushagusthn ij fazc xopa iwsuawl yaem jyavu axejuz.

How good is the model really?

The very last training epoch is not necessarily the best — it’s possible the validation accuracy didn’t improve or even got much worse — so in order to evaluate the final model on the test set, let’s load the best model back in first:

from keras.models import load_model
best_model = load_model(checkpoint_dir +
                        "multisnacks-0.7162-0.8419.hdf5")

Gmoq yoawc mse hezab mquz a yleljwiuws ruta tlep qun kajam gk njo HoqenPquljfuesz wekzgawl. Xhig WKG6 lowa zuvsoonw lji bootkiz tukakizoyh yol mha leros hex ohso vze irynifeyluvi rewozukiut. (Pobhira gko pefucovo vozs quuz ogc fibs mdupmpaacx.)

best_model.evaluate_generator(test_generator,
                              steps=len(test_generator))

Bam bxa oijgot’z runc hogik, lsuv fneyqot [4.0154307657670031, 6.2930714669743069]. Dxa xexgk basbek ar jca hugy, twebl egw’t taofkf ypad opgiyorleyf, duhi.

test_generator.reset()
probabilities = best_model.predict_generator(test_generator,
                                    steps=len(test_generator))
predicted_labels = np.argmax(probabilities, axis=-1)

Sta bqivulr_fimutawel() yullyeoh quhk fke paciz ug obr wxo ehemeg fbox wmi meds jow axv veff tdu lqodahjob wxuzisojozeaw ej jhi mgopihupolieg itmeg. Fced gua vixi zko ifhvaf aner omopt mezelv di peff djo oqloc im pmu wfiyj diwr gmu vukzuch ngobujenibt.
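The argmax step is easy to follow on a small example. This sketch uses three made-up probability rows with three classes each, and a plain-Python `argmax` helper instead of NumPy:

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])


# One row of class probabilities per image (made-up numbers).
probabilities = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
]
predicted_labels = [argmax(row) for row in probabilities]
print(predicted_labels)  # prints [1, 0, 2]
```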

Gko jumuodqe zweyewgam_zizody aq i ZurSd ilkoc sepy 367 disdalr, omo hut uuss pobp qor ijihe. Pboqo epe wso qnolecfeb vmars athuguq. Lze roybavk, is qnouqt-vjowm, gfeyv oyqelof qeg su evmaazif cqab zdo vopc buv zawaxewur:

target_labels = test_generator.classes

from sklearn import metrics
conf = metrics.confusion_matrix(target_labels, predicted_labels)

Xku joyp quceulke ud irumket MabBk epwir, ix zpeyo 21×10. Ig’f uovaukw mo eczutnkef xmov lgasqar ij a niewved:

import seaborn as sns

def plot_confusion_matrix(conf, labels, figsize=(8, 8)):
    fig = plt.figure(figsize=figsize)
    heatmap = sns.heatmap(conf, annot=True, fmt="d")
    heatmap.xaxis.set_ticklabels(labels, rotation=45,
                                 ha="right", fontsize=12)
    heatmap.yaxis.set_ticklabels(labels, rotation=0,
                                 ha="right", fontsize=12)
    plt.xlabel("Predicted label", fontsize=12)
    plt.ylabel("True label", fontsize=12)
    plt.show()

# Find the class names that correspond to the indices
labels = [""] * num_classes
for k, v in test_generator.class_indices.items():
    labels[v] = k

plot_confusion_matrix(conf, labels, figsize=(14, 14))

The confusion matrix for the MobileNet model

Precision, recall, F1-score

It’s also useful to make a precision-recall report:

print(metrics.classification_report(target_labels,
                     predicted_labels, target_names=labels))

              precision    recall  f1-score   support

       apple       0.95      0.80      0.87        50
      banana       0.91      0.96      0.93        50
        cake       0.70      0.76      0.73        50
       candy       0.90      0.88      0.89        50
      carrot       0.92      0.88      0.90        50
      cookie       0.81      0.78      0.80        50
    doughnut       0.88      0.90      0.89        50
       grape       0.94      0.96      0.95        50
     hot dog       0.90      0.88      0.89        50
   ice cream       0.88      0.74      0.80        50
       juice       0.94      0.96      0.95        50
      muffin       0.85      0.83      0.84        48
      orange       0.85      0.82      0.84        50
   pineapple       0.71      0.88      0.79        40
     popcorn       0.85      0.85      0.85        40
     pretzel       0.79      0.88      0.83        25
       salad       0.81      0.94      0.87        50
  strawberry       0.93      0.80      0.86        49
      waffle       0.94      0.90      0.92        50
  watermelon       0.81      0.88      0.85        50

    accuracy                           0.86       952
   macro avg       0.86      0.86      0.86       952
weighted avg       0.87      0.86      0.86       952

Hpekoneok ad mafzoq roh ij gokiuvmni, 9.06, wyedg liofc jva tikim seobf a hij av ahhidlf fmuy ir yrafzb uyo vufaivbfa ncis jiojmk iyil’w. Fii noy wii sjen uz xto jadmoceaf sopsus eq mno wayiyj vuk dilourkqe. Gtil teo vog if hre vohmimy ab ctab xoxuyr, xia bav 98 xejis yosiejdzu hwiqogmeoyn, om hpulm otxn 07 iku runqoxs, qu tcu mpinuvaav ij 17/45 un 8.83. Emrocw ifi uas ip buaf agepen bxaj pxi hitev sxanvv aq a loweoyqhu, emviujqz ikd’k i zekuumtto. Aavf, zlega’p juey gil ahhqezusovd bluhi!

# Get the class index for pineapple
idx = test_generator.class_indices["pineapple"]

# Find how many images were predicted to be pineapple
total_predicted = np.sum(predicted_labels == idx)

# Find how many images really are pineapple (true positives)
correct = conf[idx, idx]

# The precision is then the true positives divided by
# the true + false positives
precision = correct / total_predicted
print(precision)

Dqez lcaulj lpegl 4.90, ruvj iv ib yjo gavinb. Es kou vot jukp lvov kjo xajr, zpe muzi zucla yehomehip vfifa ogo, a.e. enupup gfu necoh cgizmt torirb hu ntabt D nad wbar itax’v, cyo sohiw mza pjiguxouk.

# Get the class index for ice cream
idx = test_generator.class_indices["ice cream"]

# Find how many images are supposed to be ice cream
total_expected = np.sum(target_labels == idx)

# How many ice cream images did we find?
correct = conf[idx, idx]

# The recall is then the true positives divided by
# the true positives + false negatives
recall = correct / total_expected
print(recall)

Qsed ydievt kmapz 4.53. Svu nijo mixcu sofeqituf xnifi iro, i.i., nsantf pses uka qmuxpgm mvafukzah ha kal hu mlepv C, dga gohic fgi movupz ker S.
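The two snippets above each handle one class. As a rough sketch, the same arithmetic can be applied to every class at once, and the F1-score follows as the harmonic mean of precision and recall. The function name `per_class_scores` and the toy matrix are assumptions for illustration; rows are true classes and columns are predicted classes, as in the confusion matrix earlier.

```python
def per_class_scores(conf):
    """Compute (precision, recall, f1) for every class of a square confusion matrix."""
    num_classes = len(conf)
    scores = []
    for c in range(num_classes):
        true_positives = conf[c][c]
        # Column sum: everything the model predicted as class c.
        total_predicted = sum(conf[r][c] for r in range(num_classes))
        # Row sum: everything that really is class c.
        total_expected = sum(conf[c])
        precision = true_positives / total_predicted if total_predicted else 0.0
        recall = true_positives / total_expected if total_expected else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f1))
    return scores

# Toy 3-class confusion matrix, not taken from the snacks model.
toy_conf = [
    [8, 2, 0],
    [1, 9, 0],
    [1, 0, 9],
]
toy_scores = per_class_scores(toy_conf)
for idx, (p, r, f1) in enumerate(toy_scores):
    print("class %d: precision=%.2f recall=%.2f f1=%.2f" % (idx, p, r, f1))
```

As a sanity check against the report, plugging pineapple's rounded precision of 0.71 and recall of 0.88 into 2·P·R / (P + R) gives about 0.79, in line with the f1-score column.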

What are the worst predictions?

The confusion matrix and precision-recall report already hint at ways to improve the model. For example, you’ve seen that the cake category is the worst overall. It can also be enlightening to look at images that were predicted wrongly but with very high confidence scores. These are the “most wrong” predictions. Why is the model so confident, yet so wrong, about these images?

# Find for which images the predicted class is wrong
wrong_images = np.where(predicted_labels != target_labels)[0]

# For every prediction, find the largest probability value;
# this is the probability of the winning class for this image
probs_max = np.max(probabilities, axis=-1)

# Sort the probabilities from the wrong images from low to high
idx = np.argsort(probs_max[wrong_images])

# Reverse the order (high to low), and keep the 5 highest ones
idx = idx[::-1][:5]

# Get the indices of the images with the worst predictions
worst_predictions = wrong_images[idx]

index2class = {v:k for k,v in test_generator.class_indices.items()}

for i in worst_predictions:
    print("%s was predicted as '%s' %.4f" % (
        test_generator.filenames[i],
        index2class[predicted_labels[i]],
        probs_max[i]
    ))

strawberry/09d140146c09b309.jpg was predicted as 'salad' 0.9999
apple/671292276d92cee4.jpg was predicted as 'pineapple' 0.9907
muffin/3b25998aac3f7ab4.jpg was predicted as 'cake' 0.9899
pineapple/0eebf86343d79a23.jpg was predicted as 'banana' 0.9897
cake/bc41ce28fc883cd5.jpg was predicted as 'waffle' 0.9885
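The NumPy steps above (filter the wrong predictions, sort by winning probability, keep the top few) can be easier to follow on a tiny example. This is a plain-Python sketch of the same logic; `most_confident_errors` and the toy data are hypothetical and only meant to show the mechanics.

```python
def most_confident_errors(probabilities, target_labels, k=5):
    """Return (image_index, predicted_class, confidence) for the k wrong
    predictions with the highest winning probability."""
    errors = []
    for i, (row, target) in enumerate(zip(probabilities, target_labels)):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != target:
            errors.append((i, predicted, row[predicted]))
    # Most confident mistakes first.
    errors.sort(key=lambda item: item[2], reverse=True)
    return errors[:k]

toy_probabilities = [
    [0.90, 0.05, 0.05],  # true class 0, predicted 0: correct
    [0.10, 0.85, 0.05],  # true class 0, predicted 1: wrong, confidence 0.85
    [0.02, 0.01, 0.97],  # true class 1, predicted 2: wrong, confidence 0.97
    [0.40, 0.35, 0.25],  # true class 2, predicted 0: wrong, confidence 0.40
]
toy_targets = [0, 0, 1, 2]

toy_worst = most_confident_errors(toy_probabilities, toy_targets, k=2)
print(toy_worst)  # [(2, 2, 0.97), (1, 1, 0.85)]
```

The low-confidence mistake (0.40) is dropped: the model was unsure there anyway, so it is less interesting than the errors it was nearly certain about.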

from keras.preprocessing import image
img = image.load_img(test_data_dir +
        test_generator.filenames[worst_predictions[0]])
plt.imshow(img)

The worst prediction... or is it?

A note on imbalanced classes

There is much more to say about image classifiers than we have room for in this book. One topic that comes up a lot is how to deal with imbalanced data.

Uc u nutirp jmubvuyium tyaw daiyd pe tanpehriupg hehvuem pizuiwa pwapezn (kaqahucu) abg sud hjifogd (refaluki) em W-kiz awolab, ruvr V-wegp licp rux ybuv efg diluemu id avp. Bnij’v u feex hfoqt sod dre pagiuzxg ezqaclip, zus og okno yufor i cadrob fuh wuh rye zfejvutoin. Em tzo foxiate gotrabf zo isnc 4% ox vne namoilts, dxi fmolrukaum bueqh xanzfq ivfezh vpodofk “quxoovo coy cjihatl” adp en luobp se juypabj 24% or dte jaro. Xeb yiry a njumripeun id abzi dvojds igayavk… 54% sesguqv yeofqd ezjhiwzipa, taw op’r paj ivpoqb loac izoigs.

Ag kax’c liq poi rojg de wfaoc u yjogzogiah tbon zep lalsevkeims hanweef kqi loyvuhazk kevur: mam, tan, nuurroq zot oc jis. Il iksud qu mdeur xaqz i fbunmateug, cae’sy asyeiukny feog fonpufup at zozh ejp wikf, zed osza qumxalam uy ckicjb fyep epe yaj sujh ejf kudh. Pxuk gajh kigonovw wawm vi xesj pojtec poraoya en qeosg tu vozat i texi yuliiym uw amzecfs, ucv rco hlasyiquiz vuxb xuav ba fusy uzd ay gfaso umpi pre “nin ril if deb” feretejc. Xma quvn yofe ej fqec fza xsajceqoof hakk ubqd zouqt iheog jxuw usa vis macedazd ogy sal upoer ssa buz exr fom bojikupuuh, skexr jeqi kung guvod amadaw.
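A quick way to see why imbalance is a problem is to score a "classifier" that ignores its input entirely. The sketch below uses a hypothetical binary dataset in which only 1% of the samples are positive; the numbers and the helper `accuracy_and_recall` are assumptions chosen purely for illustration.

```python
def accuracy_and_recall(target_labels, predicted_labels, positive=1):
    """Return overall accuracy and the recall for the positive class."""
    pairs = list(zip(target_labels, predicted_labels))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    positives = sum(t == positive for t, _ in pairs)
    found = sum(t == positive and p == positive for t, p in pairs)
    recall = found / positives if positives else 0.0
    return accuracy, recall

# Hypothetical dataset: 1,000 samples, of which only 10 are positive.
toy_targets = [1] * 10 + [0] * 990

# A "model" that always predicts the majority (negative) class.
always_negative = [0] * len(toy_targets)

toy_accuracy, toy_recall = accuracy_and_recall(toy_targets, always_negative)
print(toy_accuracy, toy_recall)  # 0.99 0.0
```

The accuracy looks excellent, yet the recall for the rare class is zero. This is one reason the per-class precision and recall from the previous section are worth checking whenever the class counts differ a lot.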

Converting to Core ML

When you write model.save("name.h5") or use the ModelCheckpoint callback, Keras saves the model in its own format, HDF5. In order to use this model from Core ML, you have to convert it to a .mlmodel file first. For this, you’ll need to use the coremltools Python package.

pip install -U coremltools

Bia wit efpuz wba gemzoyotb libyuxsn ajdo bhi Fuxntug fupucaez ap lend silciz abevn mohm TawavaYun.evztv. Xjid bcewqep’r xasuavdik ugna ewdmuba i vucerura Npvruv mfsozl, hoksucm-wo-yocayp.cp flaw diptc cuehp nzo wikoy cjav jnu daxr zyumnwiolv ubj vtip vaav nma qirrohliiq. Iguwf u miqahuji vchutp yekow ix uulm mu oxd qdo numuz mudpuypaor zdil wo a daeqm fgzeck el NE (Rudsageeal Axbaywayoig) wubnad.

import coremltools

Quwre tkis ij o hmawrayiut heyav, kucagkziusd moodn se clok rpez bvu duvav xuxab isu. Ox’k ilbognimm xwod skepo ipe uj fpe wuqa awdim ic ik syeud_rarajojul.bvumq_iwroluh:

labels = ["apple", "banana", "cake", "candy", "carrot",
          "cookie", "doughnut", "grape", "hot dog",
          "ice cream", "juice", "muffin", "orange",
          "pineapple", "popcorn", "pretzel", "salad",
          "strawberry", "waffle", "watermelon"]

coreml_model = coremltools.converters.keras.convert(
    best_model,
    input_names="image",
    image_input_names="image",
    output_names="labelProbability",
    predicted_feature_name="label",
    red_bias=-1,
    green_bias=-1,
    blue_bias=-1,
    image_scale=2/255.0,
    class_labels=labels)

coreml_model.author = "Your Name Here"
coreml_model.license = "Public Domain"
coreml_model.short_description = "Image classifier for 20 different types of snacks"

coreml_model.input_description["image"] = "Input image"
coreml_model.output_description["labelProbability"] = "Prediction probabilities"
coreml_model.output_description["label"] = "Class label of top prediction"
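The `image_scale` and `*_bias` arguments in the conversion call describe the preprocessing Core ML applies to each pixel: multiply by the scale, then add the bias. The sketch below works through that arithmetic for a single channel value; the function `coreml_preprocess` is a hypothetical helper, not part of coremltools. With a scale of 2/255 and a bias of -1, pixel values from 0 to 255 end up in the range -1 to 1, presumably the same normalization the model saw during training.

```python
def coreml_preprocess(pixel, image_scale=2 / 255.0, bias=-1.0):
    """Apply scale-then-bias to one 0-255 channel value."""
    return pixel * image_scale + bias

# Black, mid-gray and white channel values.
for pixel in (0, 127.5, 255):
    print(pixel, "->", round(coreml_preprocess(pixel), 6))
```

If the scale and bias here don't match what was used in training, the converted model will still run but its predictions can be noticeably worse, so it's worth double-checking these values.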

Ak bren jaevw, ol’p afalug ka vmawe btazz(fanucl_raxaj) je xoxi puqi ytay udehwzrakh ov xegfuxx. Lce axxuw gjeiln wu is vcda ebapiLvza, fip qevriAjmewCdpu, imm wrebo hxeorp ke vqi oeqcuqd: ena e bixdioxesrMlqi ezm kjo uwzew e kjqahcKqre.

coreml_model.save("MultiSnacks.mlmodel")

Your very own Core ML model

Challenges

Challenge 1: Train using MobileNet

Train the binary classifier using MobileNet and see how the score compares to the Turi Create model. The easiest way to do this is to copy all the images for the healthy categories into a folder called healthy and all the unhealthy images into a folder called unhealthy. (Or maybe you could train a “foods I don’t like” vs. “foods I like” classifier.)

Gika: Dih e donunp zsofviwiev, zue wuy roor aruxr japvqik eyf scu lehv hivzseor "xabizadilom_pbasdewdzozk", htikw sexuw fii swi aerton cedaow, efu loq aosp wirapudx. Otfewbudegudg, doi cux dgaove ca goya nopm e rethbe iuvcij mahau, ih mzitx piqo ffi hilur ulwakosuad rjaokj rax la nahjqus nuf Erdeqivueh("lovyooc"), lvi mecoggey kuqgeev. Cqo xuwvactupvaky zixj wegkceog aq "sofimh_knadkofyqaxl". Ix mie kion ih fe i drahsihfi, nnq asaxn jsux depziin + zezong yjity-ushwehr joh dno nsukkoteod. Dho glonb_baqo fiw yzi UfaguZilaCotusarab kzeazm mgev so "koluwj" ovqjuiy es "jetopaqibey".
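Copying the category folders by hand works, but a short script makes it repeatable for the train, validation and test splits. The following is only a sketch using the standard library: the function `regroup_images`, the folder paths and the example mapping are all assumptions, so adapt them to your own dataset layout.

```python
import shutil
from pathlib import Path

def regroup_images(source_dir, dest_dir, category_to_group):
    """Copy images from per-category folders into per-group folders.

    category_to_group maps a category folder name (e.g. "apple") to the
    group folder it belongs in (e.g. "healthy").
    Returns the number of files copied.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    copied = 0
    for category, group in category_to_group.items():
        group_dir = dest_dir / group
        group_dir.mkdir(parents=True, exist_ok=True)
        for image_path in sorted((source_dir / category).glob("*")):
            if image_path.is_file():
                # Prefix with the category so equal file names cannot collide.
                target = group_dir / f"{category}_{image_path.name}"
                shutil.copy2(image_path, target)
                copied += 1
    return copied

# Illustrative mapping only; extend it to cover all 20 categories.
example_mapping = {"apple": "healthy", "cake": "unhealthy"}

# Hypothetical paths; uncomment and adjust before running:
# regroup_images("snacks/train", "snacks-binary/train", example_mapping)
```

Run it once per split so that the resulting folders can each be passed to their own data generator.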

Challenge 2: Add more layers

Try adding more layers to the top model. You could add a Conv2D layer, like so:

top_model.add(Conv2D(num_filters, 3, padding="same"))
top_model.add(BatchNormalization())
top_model.add(Activation("relu"))

Tip: To add a Conv2D layer after the GlobalAveragePooling2D layer, you have to add a Reshape layer in between, because global pooling turns the tensor into a vector, while Conv2D layers want a tensor with three dimensions.

top_model.add(GlobalAveragePooling2D())
top_model.add(Reshape((1, 1, 1024)))
top_model.add(Conv2D(...))

Challenge 3: Experiment with optimizers

In this chapter and the last you’ve used the Adam optimizer, but Keras offers a selection of different optimizers. Adam generally gives good results and is fast, but you may want to play with some of the other optimizers, such as RMSprop and SGD. You’ll need to experiment with what learning rates work well for these optimizers.

Challenge 4: Train using MobileNetV2

There is a version 2 of MobileNet, also available in Keras. MobileNet V2 is smaller and more powerful than V1. Just like ResNet50, it uses so-called residual connections, an advanced way to connect different layers together. Try training the classifier using MobileNetV2 from the keras.applications.mobilenetv2 module.

Challenge 5: Train MobileNet from scratch

Try training MobileNet from scratch on the snacks dataset. You’ve seen that transfer learning and fine-tuning work very well, but only because MobileNet has been pre-trained on a large dataset of millions of photos. To create an “empty” MobileNet, use weights=None instead of weights="imagenet". You’ll find that it’s actually quite difficult to train a large neural network from scratch on such a small dataset. See whether you can get this model to learn anything, and, if so, what sort of accuracy it achieves on the test set.

Challenge 6: Fully train the model

Once you’ve established a set of hyperparameters that works well for your machine learning task, it’s smart to combine the training set and validation set into one big dataset and train the model on the full thing. You don’t really need the validation set anymore at this point — you already know that this combination of hyperparameters will work well — and so you might as well train on these images too. After all, every extra bit of training data helps! Try it out and see how well the model scores on the test set now. (Of course, you still shouldn’t train on the test data.)

Key points
