Modern mobile apps deliver intelligent, personalized, and responsive experiences. For years, this intelligence was powered by large-scale cloud servers, but it is increasingly moving onto the user's hardware itself. This paradigm, known as on-device AI, involves deploying and executing ML and generative AI models directly on a user’s hardware, like a smartphone or tablet, instead of relying on remote servers for inference. The choice between on-device and cloud-based AI is crucial for developers, as it impacts performance, privacy, and the overall user experience.
ML Kit, a mobile SDK, brings Google’s on-device machine learning expertise to Android apps. With the powerful yet easy-to-use Generative AI (GenAI), Vision, and Natural Language APIs, you can solve common challenges in your apps or create entirely new user experiences.
In this chapter, you’ll harness the power of ML Kit and create a sample app that will:
Scan documents and save them as images or PDFs.
Extract text from the saved documents and share it online.
Let’s get started with ML Kit!
ML Kit on Android
ML Kit is an easy-to-use SDK that brings Google’s extensive machine learning expertise to mobile developers, abstracting away the complexities of model management and inference. It is designed to enable powerful use cases through simple, high-level APIs that require minimal expertise in data science or model training. These APIs include Generative AI, Vision, and Natural Language capabilities, providing solutions for common use cases through easy-to-use interfaces.
ML Kit APIs run on-device and are optimized for fast, real-time use cases where you want to process text, images, or a live camera stream. Based on their underlying ML models, the ML Kit APIs are categorized as follows:
GenAI APIs
Text Summarization: Condense articles or chat conversations into a short, bulleted summary.
Proofreading: Polish short content by refining grammar and correcting spelling errors.
Rewriting: Rephrase short messages in various tones or styles.
Translation: Dynamically translate text between more than 50 languages, even when the device is offline.
Smart Reply: Automatically generate contextually relevant and concise replies to messages.
The ML Kit GenAI APIs are built upon AICore, an Android system service that facilitates the on-device execution of foundation models, allowing these features to process user data locally and maintain a high level of privacy.
Creating a Document Scanner using ML Kit
In this section, you’ll learn how easily you can create a custom document scanner by relying on the Document Scanner module from ML Kit. To use it, add the following dependency to your app-level build.gradle file:
var documentScannerVersion = "16.0.0-beta1"
implementation "com.google.android.gms:play-services-mlkit-document-scanner:$documentScannerVersion"
Preparing the Scanner
You’re now ready to use the dependency and its helper classes. Before you can start scanning, you need to configure the Document Scanner client. To do so, open MainViewModel.kt and update the prepareScanner() function as follows:
fun prepareScanner(): GmsDocumentScanner {
  val options = GmsDocumentScannerOptions.Builder()
    .setPageLimit(3)
    .setResultFormats(RESULT_FORMAT_JPEG)
    .setScannerMode(SCANNER_MODE_FULL)
    .build()
  return GmsDocumentScanning.getClient(options)
}
The setPageLimit() function limits the number of pages that can be scanned in a session.
setResultFormats() sets the output format of the results. Using RESULT_FORMAT_JPEG as the parameter sets your output as images; you could use RESULT_FORMAT_PDF if you need your documents converted to PDF.
Passing SCANNER_MODE_FULL to the setScannerMode() function enables all available features of the Document Scanner. If you prefer to restrict certain advanced capabilities, such as image filters, you can use SCANNER_MODE_BASE instead to disable the image editing capabilities.
Once the configuration options are specified, the prepareScanner() function returns a GmsDocumentScanner instance, which you’ll be using in the next steps.
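As a sketch, here's how the same builder could be configured to also emit a combined PDF and to use the leaner base scanning UI. RESULT_FORMAT_PDF and SCANNER_MODE_BASE come from the same GmsDocumentScannerOptions constants; the function name here is illustrative:

```kotlin
// Sketch: an alternative configuration that also produces a PDF and
// skips the image-editing extras of the full scanner UI.
fun preparePdfScanner(): GmsDocumentScanner {
  val options = GmsDocumentScannerOptions.Builder()
    .setPageLimit(3)
    // Request both JPEG pages and a combined PDF in the scan result.
    .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
    // Base mode: no advanced capabilities such as filters.
    .setScannerMode(SCANNER_MODE_BASE)
    .build()
  return GmsDocumentScanning.getClient(options)
}
```

With RESULT_FORMAT_PDF included, the scanning result also exposes a PDF alongside the per-page image URIs.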
Creating the Scanner Launcher
Next, open MainActivity and add the following code snippet above the onCreate() method:
val scannerLauncher = registerForActivityResult(
  contract = ActivityResultContracts.StartIntentSenderForResult()
) { result ->
  if (result.resultCode == RESULT_OK) {
    val scanResult = GmsDocumentScanningResult.fromActivityResultIntent(result.data)
    viewModel.extractPages(scanResult = scanResult)
  }
}
This is the activity launcher for the Document Scanner. It starts the Document Scanner intent and waits for the result.
You’re parsing the scanned data through GmsDocumentScanningResult.fromActivityResultIntent(result.data) and passing it to extractPages() once the scan completes successfully. The next step is to iterate through all the pages of the scanResult object.
You’re not ready to launch the app just yet; you first need to implement the extractPages() function to extract data from each page, which you’ll do in the next section.
Handling Result
Go back to MainViewModel.kt again. You’ll see there’s a MutableStateList named pageUris defined at the top – this list will contain the URI of each page from the scan result, which will be used later.
Hij idjika jga ajjhuhhSanib() zabmnoaq ix faswokb:
fun extractPages(scanResult: GmsDocumentScanningResult?) {
  viewModelScope.launch(Dispatchers.IO) {
    scanResult?.pages?.let { pages ->
      pageUris.clear()
      for (page in pages) {
        pageUris.add(page.imageUri)
      }
    }
  }
}
It’s a simple function that operates as follows:
It takes scanResult as an input parameter and checks whether it contains one or more pages.
If new pages are detected, it clears the existing pageUris list before adding the new results.
It then iterates through the pages list. Since you set RESULT_FORMAT_JPEG as the output format earlier, each page will contain an imageUri. The function adds the imageUri of each page to the pageUris list.
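The steps above boil down to a clear-then-copy pattern. Here's the same logic as plain Kotlin, with Strings standing in for the Android Uri objects (the function and parameter names are illustrative, not part of ML Kit):

```kotlin
// Plain-Kotlin sketch of the extractPages() logic. Strings stand in
// for the imageUri of each scanned page.
fun collectPageUris(pageUris: MutableList<String>, scannedPages: List<String>?) {
    // A null result (for example, a cancelled scan) leaves the previous pages intact.
    scannedPages?.let { pages ->
        pageUris.clear()          // drop the previous scan's pages
        for (page in pages) {
            pageUris.add(page)    // keep one URI per freshly scanned page
        }
    }
}
```

Note that passing null leaves the list untouched, which mirrors why the function only clears pageUris after a successful scan produces pages.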
Scanning Documents
You’re all set to launch the Document Scanner at this point and use the resultant data. You need to update the launchDocumentScanner() function in MainActivity.kt as shown below:
private fun launchDocumentScanner() {
  viewModel
    .prepareScanner()
    .getStartScanIntent(this@MainActivity)
    .addOnSuccessListener { intentSender ->
      val scannerIntent = IntentSenderRequest.Builder(intentSender).build()
      scannerLauncher.launch(scannerIntent)
    }
}
This function prepares the Document Scanner intent using the configurations specified in the prepareScanner() function. Once it’s ready, it creates an IntentSenderRequest from that intent and launches scannerLauncher to start scanning.
The final step is to trigger the launchDocumentScanner() function when the user taps the Scan button.
Finally, build and run the app. Tap the Scan icon at the bottom of the screen and your app will become a fully functional Document Scanner!
Scanning Documents
Extracting Text using Text Recognizer
ML Kit made it easy to turn your app into a Document Scanner, but what if you also want to extract text (OCR) from your scanned documents? This part of the chapter will teach you how to do exactly that!
The Text Recognition API can recognize text in various character sets, in real time, on a wide range of devices. Key capabilities of this API include:
Recognizing text in Chinese, Devanagari, Japanese, Korean, and Latin scripts.
var textRecognitionVersion = "16.0.1"
implementation "com.google.mlkit:text-recognition:$textRecognitionVersion"
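Note that com.google.mlkit:text-recognition bundles the Latin-script model into your APK. If app size is a concern, Google also publishes an unbundled variant that downloads the model through Google Play services instead; swapping it in is a one-line change (the version number below is an assumption, so check the ML Kit release notes):

```groovy
// Unbundled variant: the recognition model is fetched via
// Google Play services rather than shipping inside your APK.
implementation "com.google.android.gms:play-services-mlkit-text-recognition:19.0.0"
```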
Then, sync the dependency with your app.
Recognizing Text from Image
Remember saving your scanned pages as images? That’ll come in handy now. The MainViewModel keeps the reference of those image URIs in the pageUris variable. This list is used to display a carousel of your scanned pages in MainActivity.kt, which looks like this:
To implement this, open MainViewModel.kt and update the getTextFromImage() function as follows:
fun getTextFromImage(image: Uri, onCompleted: (String?) -> Unit) {
  viewModelScope.launch(Dispatchers.IO) {
    // Convert the image Uri into an InputImage the recognizer can process.
    val inputImage = InputImage.fromFilePath(application, image)
    TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
      .process(inputImage)
      .addOnSuccessListener { visionText ->
        val resultText = visionText.text
        onCompleted(resultText)
      }
      .addOnFailureListener { e ->
        onCompleted(null)
      }
  }
}
This function performs all the heavy lifting of Text Recognition. Here’s how it works:
It takes the image Uri from the input parameter and converts it into an InputImage from the file path.
Next, the TextRecognition client processes the InputImage. It attempts to detect and read the characters it finds, organized as Blocks, Lines, and Elements.
The recognizer decomposes detected text into a hierarchy: a Block is a contiguous set of text lines, such as a paragraph or column; a Line is a contiguous set of words on the same axis; and an Element is a contiguous set of alphanumeric characters, roughly a word, within a line.
Text Structure
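If you need more than the flat string, you can walk this hierarchy yourself from the success listener. The sketch below assumes the same visionText result; textBlocks, lines, elements, text, and boundingBox are the standard accessors on ML Kit's Text object, while the function name is illustrative:

```kotlin
// Sketch: walking the recognized Text hierarchy instead of using the
// flat visionText.text string. Text is com.google.mlkit.vision.text.Text.
fun logRecognizedWords(visionText: Text) {
  for (block in visionText.textBlocks) {   // paragraph- or column-level chunks
    for (line in block.lines) {            // one line of text within the block
      for (element in line.elements) {     // roughly one word
        Log.d("OCR", "word=${element.text} at ${element.boundingBox}")
      }
    }
  }
}
```

This is handy when you care about where text appears on the page, for example to highlight recognized words over the scanned image.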
Af yazj yokeltidauj at jenmerdtuk, toyaulJoyz.wukj zobuyjk xni jisiwcohum gelg vzesy ag o zofpna Xdyasq, visuwiket zx bix xeyew, syzuezx gze avQoyzwebuv() kukcmaxt. Gee toc ugu zozaijBezf.xishHzekrd anrvoaq, eb fau kberiq a xzi-vagamfeehek hukjugheiv ez gmjezbj sa orazeru zjmeugg oavv sago.
Ap hpo HubcVenajpocuig qqiunp deobr du ziwemy owr xozc vxal tma buvav uvoso, et ximejpy gudc am tku emZekdveweh() lavlfehs.
Dfo usCahfnayif() denpgozl hukbip vdi mowefd ra gku HoolUfnaqezd.vr, kdaji yaa’sj potpva asv yujxtir mmu axkkuxdit jugq.
Handling Result
At this point, viewModel.getTextFromImage(uri) will return the extracted text from the image. You might want to use or share this text within your app. To do that, open MainActivity.kt and add a function that shares the extracted text with other apps.
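A minimal version of such a function, built on Android's standard share sheet, might look like the sketch below (the function name and chooser title are assumptions, not the chapter's exact code):

```kotlin
// Sketch: hand the extracted text to other apps via the system share sheet.
private fun shareExtractedText(text: String) {
  val sendIntent = Intent(Intent.ACTION_SEND).apply {
    type = "text/plain"
    putExtra(Intent.EXTRA_TEXT, text)
  }
  // Let the user pick a target app: email, notes, messaging, and so on.
  startActivity(Intent.createChooser(sendIntent, "Share extracted text"))
}
```

You could call it from the onCompleted callback, for example: viewModel.getTextFromImage(uri) { text -> text?.let(::shareExtractedText) }.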
Build and run the app again. The app should now look like this:
Sharing Text
Congratulations! You’ve now turned your app into a full-fledged Document Scanner and OCR tool using ML Kit. Feel free to explore other APIs from ML Kit to elevate your next innovative idea!
On-device AI solves a lot of problems for developers, but there are some trade-offs you need to consider when using it. In the next section, you’ll learn about those.
The Trade-offs of On-device AI
On-device AI is optimized for scenarios where data processing must be immediate, private, and available without a network connection, but it comes with some strategic trade-offs.
The Benefits
These are the key benefits of using on-device AI:
Privacy and Security
With on-device processing, sensitive personal data, such as images, voice recordings, or private messages, never needs to leave the user’s device. This significantly reduces the risk of data breaches or model theft, and simplifies compliance with stringent data protection regulations like GDPR.
Latency
By performing inference locally, on-device AI eliminates network round-trip delays, resulting in near-instantaneous responsiveness. This is essential for real-time applications such as augmented reality (AR) filters, live camera analysis, and voice assistants that must respond without perceptible lag.
Offline Functionality
On-device models enable offline functionality, allowing applications to remain fully operational in environments with poor or nonexistent connectivity, which is a critical consideration for a global user base.
Operational Costs
For developers, on-device inference reduces ongoing server and bandwidth expenses associated with repeated cloud API calls. Running tasks locally is also more energy-efficient, consuming up to 90% less energy than cloud-based inference.
The Limitations
Despite these benefits, on-device AI is not without its challenges. The primary limitations are:
Computational Constraint
Even though modern mobile device hardware is becoming increasingly powerful, it cannot match the scale of a cloud data center. This limits the size and complexity of models that can run efficiently on a device.
Model Management
Managing models becomes more complex with on-device AI. While a cloud model can be updated instantly for all users, on-device models must be packaged with the application and distributed through app updates, making the process more time-consuming and logistically challenging.
Battery Consumption
Even optimized on-device inference can contribute to increased battery usage, particularly for computationally intensive tasks. Developers should focus on optimizing background tasks, limiting unnecessary requests, and using power-efficient APIs to minimize battery drain.
App Size
When using on-device models, you as a developer must also consider broader app performance. Managing app size is a critical consideration for on-device deployment: large model files can hinder installation on slow connections and consume valuable storage space.
Best practices include using Android App Bundles, which dynamically deliver only the necessary code and resources to a user’s device, and leveraging tools like the Android Size Analyzer to identify areas for size reduction.
Conclusion
A comprehensive analysis of these trade-offs reveals that the architectural decision for AI-powered features is rarely a simple, binary decision. The most robust solutions are often hybrid models that combine the strengths of both approaches. A common design pattern involves using on-device AI for basic data preprocessing and low-latency tasks, such as initial object detection in a live camera feed, while reserving more complex, high-volume analysis for cloud-based services. This enables a fluid user experience while leveraging cloud power when necessary.
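The routing decision in such a hybrid design can be as simple as a guard function. The sketch below is purely illustrative, with none of the names coming from ML Kit: it keeps cheap or offline work on-device and sends expensive jobs to the cloud only when a network is available.

```kotlin
// Illustrative sketch of hybrid routing: cheap or offline tasks stay
// on-device; expensive tasks go to the cloud when a network is available.
enum class Engine { ON_DEVICE, CLOUD }

fun chooseEngine(taskCost: Int, isOnline: Boolean, onDeviceBudget: Int = 10): Engine =
    if (taskCost <= onDeviceBudget || !isOnline) Engine.ON_DEVICE else Engine.CLOUD
```

In a real app the "cost" signal might be input size, model latency on the current hardware, or battery state, but the shape of the decision stays the same.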
A closer look at the on-device AI landscape shows a strong emphasis on privacy and latency. This focus is not merely a technical alternative but a more strategic imperative: with privacy-conscious consumers and increasing regulatory pressure, a product’s ability to keep user data private is becoming a key market differentiator.
The design of ML Kit, with its “no training needed” philosophy and simple APIs, represents a deliberate strategy to democratize AI development. By providing turn-key solutions, Google is lowering the barrier to entry, allowing developers to integrate sophisticated AI-powered features without the need for specialized data science expertise or the infrastructure required for training custom models. This shifts the focus from the complexity of model development to the creative application of pre-trained intelligence.