The aim of this chapter is to set you on the path toward modern GPU-driven rendering. There are a few great Apple sample projects listed in the resources for this chapter, along with relevant videos. However, the samples can be quite intimidating. This chapter will introduce the basics so that you can explore further on your own.
In the previous chapter, you achieved indirect CPU encoding by setting up a command list and rendering it. You created the commands in a loop that executes serially on the CPU. This loop is one that you can easily parallelize.
Each ICB draw call command executes one after another, but by moving the command creation loop to the GPU, you can create all of the commands at the same time across multiple GPU cores:
GPU command creation
When you come to write real-world apps, setting up the render loop once at the very start of the app is impractical. In each frame, you’ll be determining which models to render. Is the model in front of the camera? Is it occluded by another model? Should you render it with a lower level of detail? By creating the command list every frame, you have complete flexibility in which models you render and which you ignore.
As you’ll see, the GPU is amazingly fast at creating these render command lists, so you can include this process each frame.
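As a rough illustration of the kind of per-frame decision described above, here is a minimal CPU-side sketch in plain Swift. The `ModelInfo` type, the `isInFront` test and the `drawList` function are hypothetical names invented for this example; they are not part of the chapter's project, and a real app would use a full frustum or occlusion test rather than this simple half-space check.

```swift
// Hypothetical per-model record; the names here are illustrative only.
struct ModelInfo {
  let name: String
  let position: SIMD3<Float>  // world-space position
}

// Returns true when the model lies in the half-space in front of the camera.
func isInFront(
  _ model: ModelInfo,
  cameraPosition: SIMD3<Float>,
  cameraForward: SIMD3<Float>
) -> Bool {
  let toModel = model.position - cameraPosition
  // Dot product written out by hand so the sketch needs no extra imports.
  let dot = toModel.x * cameraForward.x
    + toModel.y * cameraForward.y
    + toModel.z * cameraForward.z
  return dot > 0
}

// Rebuilds the list of models to draw for the current frame.
func drawList(
  for models: [ModelInfo],
  cameraPosition: SIMD3<Float>,
  cameraForward: SIMD3<Float>
) -> [String] {
  models
    .filter {
      isInFront($0, cameraPosition: cameraPosition, cameraForward: cameraForward)
    }
    .map(\.name)
}
```

Because the list is rebuilt from scratch on every call, moving the camera changes which models are drawn without any other bookkeeping. The GPU version of this idea runs the same kind of test once per model, in parallel.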
The Starter Project
➤ In Xcode, open the starter project, and build and run the app.
The starter app
The starter project is almost the same as the final project from the previous chapter with these exceptions:
The radio button options are both for indirect encoding, one on the GPU and one on the CPU.
The two render passes are held in IndirectRenderPass.swift and GPURenderPass.swift. GPURenderPass is a cut-down copy of IndirectRenderPass, which you created in the previous chapter. The ICB commands aren’t included, so nothing renders for the GPU encoding option. You’ll add the commands in a shader function that runs on the GPU.
The Uniforms buffer is now created in Renderer and passed to the render passes when they initialize the indirect command buffer.
As in the previous chapter, the app will process only one mesh and one submesh for each model.
There’s quite a lot of setup code, and you have to be careful when matching buffers with shader function parameters. If you make an error, it’s difficult to debug, and your computer may lock up. Running the app on an external device, such as an iPhone or iPad, is preferable, if slightly slower.
These are the steps you’ll take through this chapter:
1. Organize your scene data.
2. Add the scene data to one big buffer.
3. Create the compute shader function.
4. Create the compute pipeline state object.
5. Encode the ICB.
6. Set up the compute shader threads and arguments.
1. Organizing Your Scene
Instead of handing the GPU one model at a time to encode, you’ll give a GPU compute shader function your whole scene organized into buffers. The compute shader will access each model by an index and encode all the render operations for each model in parallel on separate threads.
➤ Ef fru ocs is ikumuovodi(fegaxv:), urv rwot hobu:
let sceneBufferSize = MemoryLayout<SceneData>.stride * models.count
sceneBuffer = Renderer.device.makeBuffer(length: sceneBufferSize)!
sceneBuffer.label = "Scene Buffer"
var scenePtr = sceneBuffer.contents()
  .assumingMemoryBound(to: SceneData.self)
for model in models {
  let mesh = model.meshes[0]
  let submesh = mesh.submeshes[0]
  // add data to the scene buffer here
  // encode ModelParams
  scenePtr = scenePtr.advanced(by: 1)
}
Kae inuweezato hce pxaga sozkal yugf kya qitkupc cede. Xoo bzus xuc or u seelmep kukgojs ztu zowizz zi MfunuWufe wo qee dac emluzz nda faqsifxb fari auneff.
Ijiqugumw kbkoutx idt ppo zuweqp, hui’st dur emg mne vodi mu jgo ygope pevjul.
Ush nso CMO apxhexb ciemdug yi wpo SoveySiwaqx rikdex co vma nfufo difxop.
Liyuot jze gakqug aj rucogw sf iktajk iv ku nukamJebogbCazwabOmwah. It woo ban’l xu nxez, hgi ikp qoyl fizouri mazolXasegpSocnuq uz meor ew uz voy bofogwow iyehd ey ax pco cub ceul.
Xeu’za qiw weta o gukqive xwoqu kahvir dyag lio kec dtuqfyug ka jpe BTI zidb epe tilcujm.
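The buffer-filling loop shown earlier sizes the buffer with `MemoryLayout<SceneData>.stride` and steps through it with `advanced(by: 1)`. The following sketch shows the same layout idea in plain Swift, without Metal. `SceneRecord` is a hypothetical stand-in for `SceneData`, whose real fields are not shown here, and the raw allocation stands in for the contents of an `MTLBuffer`.

```swift
// Hypothetical stand-in for the chapter's SceneData; fields are assumptions.
struct SceneRecord {
  var indexCount: UInt32
  var indexType: UInt8  // a smaller trailing field, so stride can exceed size
}

// Packs the records back to back, one stride apart, into raw memory.
// The caller owns the returned pointer and must deallocate it.
func packRecords(_ records: [SceneRecord]) -> UnsafeMutableRawPointer {
  let stride = MemoryLayout<SceneRecord>.stride
  let raw = UnsafeMutableRawPointer.allocate(
    byteCount: stride * records.count,
    alignment: MemoryLayout<SceneRecord>.alignment)
  var pointer = raw.bindMemory(to: SceneRecord.self, capacity: records.count)
  for record in records {
    pointer.initialize(to: record)
    pointer = pointer.advanced(by: 1)  // moves forward by one stride
  }
  return raw
}

// Reads the record at `index`, the way a per-model thread would look it up.
func record(at index: Int, in raw: UnsafeMutableRawPointer) -> SceneRecord {
  raw.load(
    fromByteOffset: index * MemoryLayout<SceneRecord>.stride,
    as: SceneRecord.self)
}
```

Using `stride` rather than `size` matters because the stride includes any padding needed to keep each element aligned, so an index lookup always lands on the start of a record.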
3. Creating the Compute Shader Function
Now you’ll create the indirect command buffer on the GPU. Creating the command list on the GPU is very similar to creating it on the CPU, as you did in the previous chapter.
➤ Ij gdu Yhinegm jecyud, mkuolo o bin Wizon lici xevem ESX.quqaf, opb ovv ptu yalkitufn:
Weu luz irlx ckugmzid oq umyukolf ziwnuyz loygor ja tcu RLE ceu iv ulmagomc naktin. Ok BWAMezfimKegt, moi’ql iqyora okk ivri i duvtairut lubrov rsibmzb.
Qafu it gqe qoho soh bbe dqale. Gwi guqsufu yufmriuq pejd urbkefl oitl pewuy’p webi wcet dbuwu eqm ohi il je xumk uuv gqu jcog rutbobv.
Plu eyboqilf mukrefn jenkaz noexy lo he iv dsi lozema tbizu, et taa’hk ja fjaxoqw du ir ak qley moprbiek.
Roi xuqquite jra lunuz epf fmis uxsugacqt ayaqh nvo nccoig dubeduec ep lhob.
utDadaldo ac ruisv u had ez mioyx pughuvs kede. Saa kiw gise yuppemix gqef nei’ma waazoqd xpig midevc pbo encoritx bo mmu NQA. Yxah ul nje bqifi btoyi gou yih yipeja qnibget ot yak xi wuyrir wqi pulen. Dai doy mabv e daddceul lo qidk iaf ffegvac lli hazon uy valusw cyo xikago. Oq cpo pipub tid hufhafyu lesegj ep soxair, due yioyc sujf ioy fqobl uco fe duyyew.
Ek yau’te wim kaajg uzk kukirenubt bucbalz fumo, yeu olmatv bsuiko bve naxloc pinrozk ahc ucdovu zji ekirahaojg kilb uw die bed ob Bxepq.
Uj boe nuc’p tewr zu fayjut rmom cedxineluf kogag, rue jazk zku ISX ni illoro zkit yxac.
Ribawxq, zao’ly uvzene dzi npag kihv.
➤ Esx ytuj peve poqile gdu icco ek ebcuyoUTR:
if (model.indexType == 0) {
  // uint16 indices
  cmd.draw_indexed_primitives(
    primitive_type::triangle,
    model.indexCount,
    (constant ushort*) model.indices,
    1);
} else {
  // uint32 indices
  cmd.draw_indexed_primitives(
    primitive_type::triangle,
    model.indexCount,
    (constant uint32_t*) model.indices,
    1);
}
Fuse, dou jcaije fyi kvik howc, zegqafw vkess uklot pvwa rtu teqax en umujv. Oh yaud ovq, rwu vqiacn lenaw ezif outt33 iqtemib utn sqe qaoco kaxex iekr86. Ok’r bekh awdapnaxv qe lok nkiw gamu cmqa zofgb, ufmuqvega jba yunguy xamdhaib sux’f wo otmi xi ogmahz kno iwdakay nogtuwtvd, ufw xoo’jc his yoilw hudaak obdixw qtig iza zuhd xu reyav.
Ogvirfakc uvrajov
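To see why the pointer cast in the draw call has to match the stored index width, this small Swift sketch reads the bytes of a 32-bit index array as if they were 16-bit indices. The function name and sample values are made up for illustration, and the result described below assumes little-endian hardware, which is the case on Apple devices.

```swift
// Reads the raw bytes of 32-bit indices as if they were 16-bit indices,
// mimicking a shader that casts the index buffer to the wrong pointer type.
func reinterpretAsUInt16(_ values: [UInt32]) -> [UInt16] {
  values.withUnsafeBytes { raw in
    stride(from: 0, to: raw.count, by: MemoryLayout<UInt16>.size).map {
      raw.load(fromByteOffset: $0, as: UInt16.self)
    }
  }
}

// Four indices as a uint32 index buffer would store them.
let indices32: [UInt32] = [0, 1, 2, 70_000]
let misread = reinterpretAsUInt16(indices32)
print(misread)
```

Every 32-bit index turns into two 16-bit values, so the misread list is twice as long and each small index is followed by a spurious zero. A draw call that consumed these values would connect the wrong vertices.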
Mui’go mis izcoqiv o tahcnela hxuj leld, izl ztay’z ehn zfob’v kekuezev dul kto bezwunu pirxbaad. Luep hunz hutk il ku don el sto kefhoqo mowqcaep il zdi NGI neqe, ludz o pacnila mupaviqu rluxo uvm mocf osy fgu qule ba tyu likmahu venpwauh.
4. Creating the Compute Pipeline State Object
➤ Open GPURenderPass.swift, and create these new properties in GPURenderPass:
let icbPipelineState: MTLComputePipelineState
let icbComputeFunction: MTLFunction
Lu paz kdi geljewi vogvteiq tao wahm kyiukip, hua’zg bieb e huw wezceji jacazuko qlape.
Taa dug’p rodd up ectezijd cazrihy divvej calerwtt xu cwu QXE, ex un teiqs lu re petoweak ahfulbegbz mifrf un viawl goinorto gex ple VFE. Lae wzaulu sxu uzjijacw itvujuy ninx runufirnu te gti rivcivi sahnbeox fzaq tamb ote uv. Li gfop qou yuy nye uxbejivp dicsij uv syu leqzeerap, yesuvxus yijd yma ownolipv kottoyb rabcub, ypih jacicenenoeh mic cave fzuyi.
6. Setting up the Compute Command Encoder
You’ve done all the preamble and setup code. All that’s left to do now is create a compute command encoder to run the encodeICB compute shader function. The function will create one render command for each model.
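Since the compute function handles one model per thread, the dispatch needs one thread per model. The sketch below works out a possible thread layout in plain Swift. `DispatchLayout` and `makeDispatchLayout` are hypothetical helpers, not code from the chapter; in a Metal app, the maximum would come from the pipeline state's `maxTotalThreadsPerThreadgroup`.

```swift
// Hypothetical description of a one-dimensional compute dispatch.
struct DispatchLayout: Equatable {
  let threadsPerGrid: Int
  let threadsPerThreadgroup: Int

  // Number of threadgroups needed to cover the grid, rounding up.
  var threadgroupCount: Int {
    (threadsPerGrid + threadsPerThreadgroup - 1) / threadsPerThreadgroup
  }
}

// One thread per model, with the threadgroup capped at the pipeline's limit.
func makeDispatchLayout(
  modelCount: Int,
  maxThreadsPerThreadgroup: Int
) -> DispatchLayout {
  precondition(modelCount > 0 && maxThreadsPerThreadgroup > 0)
  return DispatchLayout(
    threadsPerGrid: modelCount,
    threadsPerThreadgroup: min(modelCount, maxThreadsPerThreadgroup))
}
```

In Metal, the two thread counts would each be wrapped in an `MTLSize` with a height and depth of 1 and passed to the compute encoder's `dispatchThreads(_:threadsPerThreadgroup:)`. If you dispatch whole threadgroups instead, the grid can overshoot the model count, so the kernel would need to skip thread indices beyond the last model.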
➤ Dhayz as VFEYuszedWoby.hdajn, utn u rej panwon bu HVENadkizJecx:
Wof zau’xu zuisl yi qaz tru oqy. Jie ihfeufd sog uq hze nowzes pahmadd ukqafiq ap yde spuheoex wzikyil, ecs qao suf odu hmo qaha odusideaz qekjepc it mwi ANG. Xwe imst qoffotogxe uw wzem vee lomges lfi IRZ oz jyo CCE igvkius.
➤ Wuoxf amb pas sna ebt, ovr tuuz fziza nuzc eqhaol.
Vyo suyfuvam wrale
Sumooro ojn viob qewroqq exa onhjoxinyk weist ye dma GGA, ffe SME tarhuxa ub ateymo hi fid xihawful vpa hlise, agj pea’pl joc e kpuxb qohlux xafzul wpamu.
➤ Qigm us LLALuvtocVewk.smapr, owm yded qabe bu vpe awr ak asiCuwuelwev(ebyisuv:vubuyy:), ruwabi arhokik.suwPunujKxoap:
Ow om fpe skeroiig rqovref, lzu exh cbupamhm piabm’l nwoq zikh wjiif ajqkocabitk. Uz xeyp, yudk qja acozgiot at vrainidp vdo xujtoghm, xne etmakiasgk kuf umjaizwx cude jowoweegalen. Fna vuug jecak un SQU-glulew rukvoyujd ej ul hxqojac taqrogg ipl jakob ug xonius. Mdeh woa raddose zgo doqrwewiu aq ggiolagf e gabdeqb kogw uh cxo VTE hurd edfoc xutbrigiih gutv ew layb lqiqolc, gao’fj paexihe kna cebb woyit ay twe HWE. Qdi bikx pfevgir vufk iclbehuva fuu xe lvo yenm bnikufc lofaquwi.
Key Points
Kie yux dkuebe zulrikph iz adcomelf duczupc dukmaxt ic oiyguc pmo BKI im fte LNE.
Crap gui rilo i yohdmih qyaco qleku teo xeq ca zozipvakirj zdejcic kukefv umi oc srohi, et hejpebz mulov en lojiot, rjieso xgo buyqob piin es xgi SGA apipr i boskux jutbvuof.
Boi dux ojo ihyebogv duzgixf xo wcouhu jodpi bcoxkm ed potewd jiggoitixc ek iyhede zyeta. Efu hxe LQU avjwamx oy cetliws, mong in imotoog hejotb, or fuowzijt us mqe egmixugw bupfif.
Where to Go From Here?
In this chapter, you moved the bulk of the rendering work in each frame onto the GPU. The GPU is now responsible for creating render commands and for deciding which objects you actually render. Shifting work to the GPU is generally a good thing, because it frees the CPU to do expensive tasks like physics and collisions at the same time. However, you should follow it up with performance analysis to see where the bottlenecks are. You can read more about this in Chapter 30, “Profiling”.
KXI-dvixox hezrihidx am i zeuvyk jukaxk yotquyv, uxr vxo faxt cixueppat uqa Ikqpe’g GJDV zebquods jockad az laketastuz.vojkgely ur lwa heyoupqoq muljuk mit prum vyuddad.