Regular Expressions in Kotlin

Learn how to improve your string manipulation with the power of regular expressions in Kotlin. You’ll love them! By arjuna sky kok.

Using Reluctant Quantifiers

Replace the content of extractNamesFromHtml with:

val pattern = Regex("""<li>(.*?)</li>""")
val results = pattern.findAll(names)
return results.map {
  it.groupValues[1]
}.toList()

Notice the difference: you put ? to the right of .*. This turns it into a reluctant quantifier.

Build and run the app. Then submit the form:

Supervillains Diet Plan Reluctant Quantifier Result

This is the correct result. The (.*?) matches as few characters as possible before </li>. The (.*?) reluctantly moves forward.
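To see the difference in isolation, here is a minimal sketch that runs a greedy and a reluctant pattern over the same input. The sample HTML and the function names extractGreedy and extractReluctant are made up for illustration; the app's real form data may look different.

```kotlin
// Hypothetical sample input, not the app's actual form data.
val sampleHtml = "<ul><li>Salad</li><li>Soup</li></ul>"

// Greedy: .* grabs as much as it can, then backtracks to the LAST </li>.
fun extractGreedy(html: String): List<String> =
    Regex("""<li>(.*)</li>""").findAll(html).map { it.groupValues[1] }.toList()

// Reluctant: .*? stops at the FIRST </li> it can reach.
fun extractReluctant(html: String): List<String> =
    Regex("""<li>(.*?)</li>""").findAll(html).map { it.groupValues[1] }.toList()

fun main() {
    println(extractGreedy(sampleHtml))    // [Salad</li><li>Soup]
    println(extractReluctant(sampleHtml)) // [Salad, Soup]
}
```

The greedy version returns a single, overly long match, while the reluctant version returns one match per list item.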

You’ve now successfully extracted the meal data using regex.

To get more familiar with these quantifiers, check out this comparison between them and their results:

Regex Quantifiers Comparison

Understanding the Logical Operator and Escaping Regex

Supervillains Club recruits a lot of young supervillains. They also monitor chats between young supervillains to ensure they don’t defect. But Gen Z writes differently: they don’t respect English grammar.

This creates a problem when Supervillains Club wants to analyze Gen Z’s chat messages. A Gen Z supervillain might write: “I just beat a hero :] looks like I’m good :)”.

You have to separate the dialog into sentences, but Gen Z supervillains don’t use end punctuation. Kids these days… :]

Fortunately, Supervillains Club’s NLP scientists have done their research. It looks like Gen Z uses :], :) and 🤣 in place of a period.

Open http://localhost:8080/split and submit the form:

Supervillains Club Splitting Form

Nothing happens. It’s time to analyze Gen Z using regex!

To split the sentences using regex, you use… split!

In RegexValidator.kt, replace the content of splitSentences with:

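// :) contains ), a regex metacharacter, so escape it to match it literally.
val escapedString = Regex.escape(""":)""")
// Split on :], :) or 🤣.
val pattern = Regex("""(:]|${escapedString})|🤣""")
return pattern.split(sentences).map {
  it.trim()
}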

split looks for matches of the regex inside the input string and treats each match as a separator. If the regex string is Y and the input string is sunny Y rainy Y cloudy, then the result, after trimming, is sunny, rainy and cloudy.

But notice there’s another character: |. This is a special character in regex. It’s the alternation operator, which works as a logical OR.

If you want to use more than one character to split, join them using |. If the regex string is Y|B, then you’ll split the sentences using Y or B.
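The Y and Y|B examples above can be sketched as follows. splitOn is a hypothetical helper written for this illustration, not part of the sample project; it trims each piece the same way splitSentences does.

```kotlin
// Hypothetical helper: split the input on every match of the pattern, then trim each piece.
fun splitOn(pattern: String, input: String): List<String> =
    Regex(pattern).split(input).map { it.trim() }

fun main() {
    // A single separator.
    println(splitOn("Y", "sunny Y rainy Y cloudy"))   // [sunny, rainy, cloudy]
    // Two alternative separators joined with |.
    println(splitOn("Y|B", "sunny Y rainy B cloudy")) // [sunny, rainy, cloudy]
}
```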

Notice that you also escape :) using Regex.escape:

Regex.escape(""":)""")

The ) character is special in regex. As you learned previously, it’s the character that closes a group, so it has to be escaped to match a literal ).
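Here is a small sketch of what escaping buys you, assuming the app runs on the JVM, where Kotlin's Regex is backed by java.util.regex. The compiles helper is hypothetical and exists only for this illustration.

```kotlin
import java.util.regex.PatternSyntaxException

// Hypothetical helper: reports whether a string is a valid regex pattern on the JVM.
fun compiles(pattern: String): Boolean =
    try {
        Regex(pattern)
        true
    } catch (e: PatternSyntaxException) {
        false
    }

fun main() {
    // An unescaped ) has no matching (, so the pattern is rejected.
    println(compiles(":)"))                            // false

    // On the JVM, Regex.escape wraps the text in \Q...\E so it is matched literally.
    val escaped = Regex.escape(":)")
    println(escaped)                                   // \Q:)\E
    println(compiles(escaped))                         // true
    println(Regex(escaped).containsMatchIn("good :)")) // true
}
```

Escaping the single character by hand, as in """:\)""", works as well. Regex.escape is convenient when you don't want to check every character of the literal text yourself.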

Build and run the app. Submit the form again. You’ll see this:

Supervillains Club Splitting Form Result

Your work impressed Supervillains Club. They offer to make you a supervillain.

Why not?

Your supervillain name is Regex Monster. When people have a problem, you tell them a popular regex joke: “Now you have two problems.” :]

Where to Go From Here?

Download the final project using the Download Materials button at the top or bottom of the tutorial.

You learned the most common Regex methods, but there are some you didn’t try, like replaceFirst, splitToSequence and toPattern. You can consult the Regex API documentation to learn more.
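As a quick taste of those three methods, here is a minimal sketch. The chat text and the emoticon pattern are invented for illustration; splitToSequence assumes Kotlin 1.6 or later, and toPattern is only available on the JVM.

```kotlin
import java.util.regex.Pattern

// Hypothetical pattern matching the :] and :) emoticons.
val emoticon = Regex(""":\]|:\)""")

// Hypothetical chat line.
const val chat = "I won :] easy :) bye"

fun main() {
    // replaceFirst replaces only the first match.
    println(emoticon.replaceFirst(chat, "."))  // I won . easy :) bye

    // splitToSequence is the lazy counterpart of split.
    val parts = emoticon.splitToSequence(chat).map { it.trim() }.toList()
    println(parts)                             // [I won, easy, bye]

    // toPattern exposes the underlying java.util.regex.Pattern.
    val javaPattern: Pattern = emoticon.toPattern()
    println(javaPattern.matcher(chat).find())  // true
}
```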

You also need to watch out for catastrophic backtracking. A poorly written pattern, such as one with nested quantifiers, can consume a lot of CPU time and even cause an outage.
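A classic illustration of the problem is a pattern with nested quantifiers. The sketch below uses two made-up patterns that accept exactly the same strings; the input is deliberately kept short so the example finishes quickly. With the risky pattern, the work needed to reject a near-match can grow roughly exponentially as the run of a characters gets longer.

```kotlin
// Nested quantifiers: (a+)+ can split a run of a's into groups in a huge number of ways.
val riskyPattern = Regex("""^(a+)+$""")

// Accepts the same strings with a single quantifier, so there is only one way to match.
val saferPattern = Regex("""^a+$""")

fun main() {
    // Short on purpose. The trailing ! makes the match fail,
    // which forces the risky pattern to try the alternative splits.
    val almostMatching = "a".repeat(16) + "!"

    println(riskyPattern.matches("aaaa"))         // true
    println(saferPattern.matches("aaaa"))         // true
    println(riskyPattern.matches(almostMatching)) // false
    println(saferPattern.matches(almostMatching)) // false
}
```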

You used some regex patterns, but regex pattern syntax is vast. For example, you haven’t used multiline mode or named groups. Head to the Regex pattern documentation to learn more.
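To give a rough idea of both features, here is a minimal sketch. The input format, the group name villain and the villainNames function are all invented for this example, and looking up a group by name with groups["villain"] assumes a recent Kotlin version on the JVM.

```kotlin
// MULTILINE makes ^ and $ match at the start and end of every line,
// not only at the start and end of the whole input.
val namePattern = Regex("""^name: (?<villain>\w+)$""", RegexOption.MULTILINE)

// Hypothetical helper: collects the value of the named group from every match.
fun villainNames(text: String): List<String> =
    namePattern.findAll(text)
        .mapNotNull { it.groups["villain"]?.value }
        .toList()

fun main() {
    val roster = "name: Joker\nname: Bane"
    println(villainNames(roster)) // [Joker, Bane]
}
```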

Regex isn’t invincible. It fails in fuzzy operations like classifying the sentiment of a tweet. For this problem, you need Natural Language Processing or NLP.

Regex is complicated. You can debug your patterns in one of the many regex playgrounds. One example is regex101. Choose the Java 8 flavor in the playground.

I hope you enjoyed this tutorial! Please join the forum discussion below if you have any questions or comments.