Ultimate Guide to Regular Expressions (RegEx) in JavaScript
Matching strings? Lookaheads? Flags? Let's learn more about it!
What's RegEx?
Regular expressions are objects with a cryptic syntax that lets you inspect strings. They can help you match strings with multiple possibilities.
Think of it as a little friendly Gnome village looking forward to helping you find all the matches you're asking for! For them to understand you, you're actually learning how to ask in their language. Isn't that nice?
You might be asking yourself: what could I possibly do with that?
Well, let me tell you, it's used in many applications, from restricting usernames and the characters they may contain to writing password checks that look for specific elements like numbers or two consecutive digits. Or even better: only accepting passwords that are 5+ characters long.
Wow, Yuri! I'm sold. How can I start using it in my code?
How to start writing RegEx
Here's what you need to know before we dig deep into RegEx syntax.
RegEx are used to match literal strings or parts of them, white space, numbers and special characters
You create the patterns you want it to return
Quotes ( " " ) are not required within the RegEx
Alright! Now let's dive in:
Matching literal strings
Let's create our first RegEx!
let myRegEx = /Blog/
Wait, what? Where did the "/" come from?!
That's RegEx syntax! It will return a literal match. It's basically a little grumpy gnome saying
/ i will find this and this only /
How can you know for sure you're getting what you asked for?
The .test() method
let randomString = "Hashnode Blog"
let myRegEx = /Blog/
myRegEx.test(randomString) // true;
.test() takes the RegEx, applies it to the string and returns a boolean (true or false) depending on whether it finds a match.
Remember we're matching a literal string so any other searches like /blog/, /vlog/, /blogg/ will in fact return false.
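If you'd like to check that claim yourself, here's a minimal sketch. The variable names are made up for this illustration.

```javascript
// Hypothetical names, used only for this example.
const sampleTitle = "Hashnode Blog";

// A literal pattern is case sensitive and has to match character by character.
const literalChecks = {
  Blog: /Blog/.test(sampleTitle),
  blog: /blog/.test(sampleTitle),   // lowercase "b" is not in the string
  vlog: /vlog/.test(sampleTitle),
  blogg: /blogg/.test(sampleTitle), // there is no second "g"
};

console.log(literalChecks);
// { Blog: true, blog: false, vlog: false, blogg: false }
```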
This can be quite limiting, so how about we search for some variants?
Literal String with different possibilities
This happens with the or operator ( | )
It searches for the string either before or after it. Think of it as a very comfortable gnome sitting down and only wanting to look to the front or the back to find the item you're asking for.
let randomString = "Hashnode Blog"
let myRegEx = /Blog|blog/ // no spaces around the | (spaces would become part of the pattern)
myRegEx.test(randomString) // true;
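One detail worth seeing in action: white space between the slashes counts as part of the pattern. This is a small sketch with made-up variable names, comparing a pattern written with and without spaces around the pipe.

```javascript
const blogTitle = "Hashnode Blog";

// No spaces around the pipe: matches "Blog" or "blog".
const tightPattern = /Blog|blog/;

// Spaces inside the slashes are literal characters,
// so this one looks for "Blog " or " blog " (spaces included).
const spacedPattern = /Blog | blog /;

const tightResult = tightPattern.test(blogTitle);
const spacedResult = spacedPattern.test(blogTitle);

console.log(tightResult);  // true
console.log(spacedResult); // false, there is no space after "Blog"
```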
But this too can be quite limiting: you'd have to type every variant for it to return true.
Hmm... Well, what if we try to fix the problem telling the RegEx to ignore case while matching??
Ignore Case
Sure! We can match BOTH uppercase and lowercase with a flag.
A flag? Yes! A flag. It's the /i flag.
This is a more active gnome who wants to help you find every match for that string, no matter how it's written!
let randomString = "Hashnode Blog"
let secondRandom = "HASHNODE BLOG"
let thirdRandom = "hashnode blog"
let lastRandom = "hAsHnOdE bLoG"
let myRegEx = /Blog/i
myRegEx.test(randomString) // true;
myRegEx.test(secondRandom) // true;
myRegEx.test(thirdRandom) // true;
myRegEx.test(lastRandom) // true
This can help you ignorecase, IgnoreCase or IgNoRe cAsE.
Yuri's learning tip: Try to think the /i stands for international, it's easy to remember that it's searching "internationally" and finds the same word with different "spelling"[case], just like you would internationally.
Okay, so now that you have your matches, how do you extract them?
The .match() method
Now that we have it, we could go to the village bank and see what we got.
let randomString = "Hashnode is an user friendly blog"
let myRegEx = /Blog/i
randomString.match(myRegEx)
// ["blog", index: 29, input: "Hashnode is an user friendly blog", groups: undefined]
// without the /g flag, .match() returns one array holding the first match, plus index and input properties.
It returns a nice array with the match! Isn't it fun?
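Here's a small sketch of what lives inside that array, assuming the same sentence as above. The variable names are only for illustration.

```javascript
const sentence = "Hashnode is an user friendly blog";

// Without the /g flag we get the first match plus some extra details.
const firstMatch = sentence.match(/Blog/i);
console.log(firstMatch[0]);    // "blog" (the text exactly as it appears in the string)
console.log(firstMatch.index); // 29 (the position where the match starts)
console.log(firstMatch.input === sentence); // true

// When nothing matches, .match() returns null instead of an empty array.
const noMatch = sentence.match(/vlog/);
console.log(noMatch); // null
```

Because of that null, it's a good idea to check the result before reading from it.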
How to know when to use .test() or .match() method?
Well, let's remember that .test() returns a boolean after checking the RegEx against a specified string.
The syntax is: RegEx.test(String)
.match() is used to retrieve the matches. It compares a string against the RegEx and returns an array. If there aren't any matches it returns null.
The syntax is: string.match(RegEx)
Use .test() if you want a quick check and .match() when you want to retrieve the matches, for example all of them with a /g flag.
But what's a /g flag...??
Get more than one initial match
So until now we've been getting one gnome to help us find what we're looking for, but how about we ask a couple more for help and return more words? We do that with the /g flag.
let randomString = "Hashnode is an user friendly blog for people that like to Blog and blog"
let myRegEx = /Blog/gi
randomString.match(myRegEx) // ["blog", "Blog", "blog"]
Remember: we add the flag /i to ignore case while matching!
Yuri's learning tip: /g stands for global match because it looks for the /regEx/ everywhere in the string!
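A quick sketch of how the /g flag behaves with both methods. The string and variable names are made up for this example. The part to watch is that a global RegEx remembers where its last match ended (its lastIndex property), which changes what repeated .test() calls return.

```javascript
const globalBlog = /blog/gi;
const post = "Blog about your blog";

// .match() with /g returns every match as a plain array of strings.
const allBlogs = post.match(globalBlog);
console.log(allBlogs); // ["Blog", "blog"]

// .test() with /g continues from where the previous call stopped.
const firstCall = globalBlog.test(post);  // true, found "Blog"
const secondCall = globalBlog.test(post); // true, found "blog"
const thirdCall = globalBlog.test(post);  // false, nothing left, lastIndex goes back to 0

console.log(firstCall, secondCall, thirdCall); // true true false
```

If you only need a yes or no answer, a RegEx without /g keeps .test() predictable.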
Checkpoint! 🏖️
So how're you feeling? Good so far?
The gnomes are doing everything they can to help us find matches, they say it's like a game night!
What have we seen so far?
.test() and /literal/ strings
/literal|strings|with|multiple|possibilities/
/i to ignore case while matching
.match() to extract matches
/g to find more than the first match
This is all for now! See you in the next Checkpoint.
Take this 🍦 and you're ready to continue :)
But what if you don't know exactly what to ask the gnomes to match for?
Match anything with a period
If you don't know the exact characters in the string you can use a period ( . )
The period matches any single character (except a line break), so the gnome will accept whatever character sits in that position of the /RegEx./
let randomString = "Hashnode is an user friendly bloggie for people that like to Blog"
let myRegEx = /Blo./gi
myRegEx.test(randomString) // true
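To see exactly what the period accepts, here's a short sketch with made-up test strings. It also shows the backslash escape for when you want a real period, which is an addition to what's covered above.

```javascript
const dotPattern = /Blo./;

const dotResults = [
  dotPattern.test("Blog"),  // true, "." matched "g"
  dotPattern.test("Blo!"),  // true, "." matches punctuation too
  dotPattern.test("Blo"),   // false, "." needs one character to be there
  dotPattern.test("Blo\n"), // false, "." skips line breaks by default
];
console.log(dotResults); // [true, true, false, false]

// To match a literal period, escape it with a backslash.
const literalDot = /blog\.com/;
console.log(literalDot.test("blog.com")); // true
console.log(literalDot.test("blogXcom")); // false
```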
So we can match anything with a period, how about just ONE character with multiple possibilities?
Match single literal character with flexibility
This is quite exciting! We're going to place them in brackets ( [ ] ) .
The [ ] allow you to define a group of characters you want the gnome to find! Think of it as a basket.
To match vowels
let randomString = "Hashnode, Hashtag, HashBrown"
let myRegEx = /[aeiou]/gi
randomString.match(myRegEx)
// (7) ["a", "o", "e", "a", "a", "a", "o"]
But the gnome doesn't want to read all those vowels! Isn't there a better way to make his job easier?
I sure believe so! We can use the hyphen character ( - )
Let's match letters of the alphabet, globally and ignoring case:
let randomString = "Hashnode, Hashtag, Hashbrown"
let myRegEx = /[a-z]/gi
randomString.match(myRegEx)
// (24) ["H", "a", "s", "h", "n", "o", "d", "e",
// "H", "a", "s", "h", "t", "a", "g",
// "H", "a", "s", "h", "b", "r", "o", "w", "n"]
Don't worry if you want to match numbers, the gnomes can also help you with that!
Match numbers and letters
The hyphen character ( - ) can help us here too! Let's look for numbers from 0 to 9 and letters from A to Z (both ends of each range are included):
let hashnodeStr = "Hashnode 3459689504"
let myRegEx = /[a-z0-9]/ig
hashnodeStr.match(myRegEx);
// (18) ["H", "a", "s", "h", "n", "o", "d", "e",
// "3", "4", "5", "9", "6", "8", "9", "5", "0", "4"]
This has been lovely! But what if I don't want the gnome to get me all that stuff??
Match characters you don't want
These are called negative (or negated) character sets.
You place a caret ( ^ ) right after the opening bracket, before the characters you don't want.
Let's tell the gnome to ignore the numbers 5 and 6 when looking for a match!
let hashnodeStr = "Hashnode 567856"
let myRegEx = /[^5-6]/ig
hashnodeStr.match(myRegEx) // (11) ["H", "a", "s", "h", "n", "o", "d", "e", " ", "7", "8"]
But the gnome is asking what should they do with the characters that appear one or more times?
Match characters that happen multiple times
To let the gnome know what to do with characters that appear one or more consecutive times, we add a plus sign ( + ) right after the character!
let unnecessaryAmount = "Mississippi"
let myRegEx = /s+/g // the + goes outside any brackets; inside [ ] it would be a literal plus sign
unnecessaryAmount.match(myRegEx)
// (2) ["ss", "ss"]
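Where you put the plus sign matters. This sketch, with made-up variable names, compares a plus after the character with a plus inside the brackets, where it's just another character in the basket.

```javascript
const river = "Mississippi";

// "+" after the character: each run of consecutive "s" comes back as one match.
const runsOfS = river.match(/s+/g);
console.log(runsOfS); // ["ss", "ss"]

// "+" inside the brackets is a literal plus sign, so every "s" is matched on its own.
const singleS = river.match(/[s+]/g);
console.log(singleS); // ["s", "s", "s", "s"]

// The "+" only applies to the character right before it.
const iThenS = river.match(/is+/g);
console.log(iThenS); // ["iss", "iss"]
```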
Wow! Pretty cool, thanks for letting me know. But what do I do with those that happen zero or more times then? - the gnome asks.
Match characters that happen zero or more times
We can use the asterisk ( * ) character! It matches the character before it zero or more times.
let randomString = " Hashnooooooode"
let myRegEx = /no*/
randomString.match(myRegEx) // ["nooooooo"]
Even lazy gnomes can help us find matching characters!
Lazy matching
This helps us find the smallest part of the string that matches the RegEx! You do this by adding a question mark ( ? ) right after a quantifier such as + or *.
let text = "Hashnode"
let myRegEx = /ha[b-z]+?/i
// this is asking to look for ha followed by one or more letters of the alphabet from b to z
// the ? after the + makes it lazy, so the gnome stops at the smallest match
// the /i flag ignores case
text.match(myRegEx) // ["Has"]
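Laziness only shows when the question mark follows another quantifier such as + or *. Here's a sketch comparing the greedy and the lazy version side by side. The second string is a made-up example.

```javascript
const word = "Hashnode";

// Greedy: "+" grabs as many letters as it can.
const greedyWord = word.match(/ha[b-z]+/i)[0];
// Lazy: "+?" stops as soon as the pattern is satisfied.
const lazyWord = word.match(/ha[b-z]+?/i)[0];

console.log(greedyWord); // "Hashnode"
console.log(lazyWord);   // "Has"

// The difference is easier to spot with a closing character in the pattern.
const tagText = "<b>bold</b>";
const greedyTag = tagText.match(/<.+>/)[0];
const lazyTag = tagText.match(/<.+?>/)[0];

console.log(greedyTag); // "<b>bold</b>"
console.log(lazyTag);   // "<b>"
```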
Checkpoint! 🏖️
Wow! This has been a lot, the gnomes are so happy to be here with us!
While they're having some water let's review what we've seen so far!
You don't need to know exactly what characters you're looking for with ( . ) -> /Ha./
You can search for the same character with multiple possibilities with ( [ ] ) -> [aeiou]
You can avoid typing each character using the hyphen character ( - ) -> [a-u]
You can also match numbers with the hyphen character ( - ) -> /[0-5]/
Can match letters AND numbers /[a-z0-9]/
For negative character sets (those that you don't want to match) use ( ^ ) -> /[^aeiou]/
For consecutive characters we can match them with ( + ) -> /a+/
For those that happen zero or more times we use an asterisk -> /no*/
Lazy matching helps us find the smallest part of a string possible using ( ? ) after a quantifier -> /ha[b-z]+?/
All good? Don't forget to grab your 🍹 and be on your way, my friend!
Let me show you a little shortcut we can use with the gnomes.
Shorthand Character Classes
Writing ranges for Regular Expressions can be both useful and laborious (like the alphabetical ones we've done), so Shorthand Character Classes were created to make writing RegEx much easier.
Let's meet them!
Match the beginning of string patterns
Remember how we use ( ^ ) inside brackets to match characters we don't want? Outside the brackets it works as an anchor that matches the beginning of a string!
let hashnodeStr = "Hashnode is a supportive platform"
let myRegEx = /^Hashnode/
myRegEx.test(hashnodeStr) //true
let otherStr = "Try and find Hashnode now"
myRegEx.test(otherStr) //false
If we can find the beginning, the gnomes reached the conclusion we can also find the end of strings!
Match the end of string patterns
We can do this by using the dollar sign ( $ ) character.
let hashnodeStr = "Hashnode is a supportive platform"
let myRegEx = /platform$/
myRegEx.test(hashnodeStr) // true
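Using both anchors together lets you check a whole string from start to end. This is a minimal sketch under an assumed rule (usernames may only contain letters and digits). The rule and the names are hypothetical.

```javascript
// Hypothetical rule for this example: only letters and digits, at least one character.
const usernamePattern = /^[a-z0-9]+$/i;

const usernameChecks = {
  valid: usernamePattern.test("Yuri2021"),      // true
  withSpace: usernamePattern.test("Yuri 2021"), // false, the space is not in the set
  withSymbol: usernamePattern.test("Yuri!"),    // false, "!" is not in the set
  empty: usernamePattern.test(""),              // false, "+" needs at least one character
};

console.log(usernameChecks);
```

Without the ^ and $, the pattern would be happy with any string that contains one valid character somewhere.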
Match all letters and numbers
But Yuri, we already do that with ( - ) .
We do! But we can match BOTH uppercase and lowercase letters PLUS numbers (and the underscore, _ ) just by adding \w . The character goes in lowercase.
What used to be this
let hashnodeStr = "Hashnode 3459689504"
let myRegEx = /[a-z0-9]/ig
hashnodeStr.match(myRegEx);
// (18) ["H", "a", "s", "h", "n", "o", "d", "e",
// "3", "4", "5", "9", "6", "8", "9", "5", "0", "4"]
Becomes this
let hashnodeStr = "Hashnode 3459689504"
let myRegEx = /[\w]/g
hashnodeStr.match(myRegEx);
// (18) ["H", "a", "s", "h", "n", "o", "d", "e",
// "3", "4", "5", "9", "6", "8", "9", "5", "0", "4"]
Fun isn't it? But well, since life's all about balance we are also able to do the opposite!
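Before flipping things around, here's a quick sketch of one difference between \w and [a-z0-9]: \w also accepts the underscore. The sample string is made up.

```javascript
const handle = "yuri_dev 2021!";

// \w keeps letters, digits and the underscore.
const wordChars = handle.match(/\w/g).join("");
// The explicit range leaves the underscore out.
const rangeChars = handle.match(/[a-z0-9]/gi).join("");

console.log(wordChars);  // "yuri_dev2021"
console.log(rangeChars); // "yuridev2021"
```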
Match non-alphanumeric characters
We do this by adding \W .
⚠️- It's very important to remember that the character is \W in UPPERCASE because we just saw that \w in lowercase matches letters, numbers and the underscore.
What would we match if not letters and numbers? White space and special characters like !, %, *, +, = , and more.
let hashnodeStr = "Hashnode is a supportive platform!! "
let myRegEx = /\W/g
hashnodeStr.match(myRegEx) // (7) [" ", " ", " ", " ", "!", "!", " "]
Match all numbers
If we need to we can also tell the gnomes (that are great mathematicians) to match all the numbers for us!
The shorthand for this is \d in lowercase .
⚠️ This is the same as saying [0-9].
let findNumLength = "2021 is the current year "
let myRegEx = /\d/g
findNumLength.match(myRegEx).length // 4
findNumLength.match(myRegEx) // (4) ["2", "0", "2", "1"]
Match all non numbers
Now let's try and find all the non-number characters using \D in UPPERCASE.
⚠️This is the same as saying [^0-9].
Let's find all the non-number characters. This will also include white space.
let findStrLength = "2021 is the current year "
let myRegEx = /\D/g
findStrLength.match(myRegEx).length // 21
findStrLength.match(myRegEx)
// (21) [" ", "i", "s", " ", "t", "h", "e", " ", "c", "u", "r", "r", "e", "n", "t", " ", "y", "e", "a", "r", " "]
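A common use for this pair is cleaning up input. Here's a small sketch with a made-up phone number: \d keeps the digits and \D shows what gets left out.

```javascript
const phone = "(555) 010-2021";

// Collect every digit and glue them back together.
const digitsOnly = phone.match(/\d/g).join("");
// Everything that is not a digit: brackets, the space and the hyphen.
const separators = phone.match(/\D/g);

console.log(digitsOnly); // "5550102021"
console.log(separators); // ["(", ")", " ", "-"]
```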
Match white space
It matches ALL the white space between letters. The character is \s in lowercase.
⚠️ Keep in mind this WILL also match:
- Tab -> \t
- New Line \n
- Form Feed \f
- Carriage Return \r
let hashnodeStr=" Hashnode is a supportive platform! "
let myRegEx=/\s/g
hashnodeStr.match(myRegEx) // (6) [" ", " ", " ", " ", " ", " "]
let multiLineStr = " Hashnode is a supportive\n platform! "
multiLineStr.match(myRegEx)
// (7) [" ", " ", " ", " ", "\n", " ", " "]
Match non-white space characters!
Our gnome friends can also do the opposite and skip white space!
Perfectly balanced, as all things should be. - Thanos
It'll help us search for everything BUT white space. The character we use for this is \S in UPPERCASE.
⚠️ This will NOT match
- Tab -> \t
- New Line \n
- Form Feed \f
- Carriage Return \r
let hashnodeStr = " Hashnode is a supportive platform! "
let myRegEx = /\S/g
hashnodeStr.match(myRegEx) // (30) ["H", "a", "s", "h", "n", "o", "d", "e", "i", "s", "a", "s", "u", "p", "p", "o", "r", "t", "i", "v", "e", "p", "l", "a", "t", "f", "o", "r", "m", "!"]
let multiLineStr = " Hashnode is a supportive\n platform! "
multiLineStr.match(myRegEx)
// (30) ["H", "a", "s", "h", "n", "o", "d", "e", "i", "s", "a", "s", "u", "p", "p", "o", "r", "t", "i", "v", "e", "p", "l", "a", "t", "f", "o", "r", "m", "!"]
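Getting every character one by one isn't always what you want. Combining \S with the plus sign from earlier pulls out whole words instead. This is a sketch with a made-up, messy string.

```javascript
const messyText = "  Hashnode is\ta supportive\n platform!  ";

// One or more non-white-space characters in a row make up a "word".
const words = messyText.match(/\S+/g);
console.log(words); // ["Hashnode", "is", "a", "supportive", "platform!"]

// \s is also handy for a quick "does this contain any white space?" check.
const hasGap = /\s/.test("Hashnode Blog");
const hasNoGap = /\s/.test("HashnodeBlog");
console.log(hasGap, hasNoGap); // true false
```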
Checkpoint! 🏖️
We've gotten technical quite quickly! I know you're doing alright but it's time to take a bit of a break and go over what we just saw with the gnomes!
You can use ( ^ ) to match the beginning of strings /^Hashnode/ AND also to specify characters we don't want /[^aeiou]/.
We match the end of a string with a dollar sign ( $ ) at the end of our RegEx.
Instead of doing [a-z0-9] with the /i flag we can use lowercase \w, which covers all letters and numbers (plus the underscore).
If you want to match everything BUT letters and numbers use \W in UPPERCASE.
To match all numbers we need \d in lowercase.
To match all non numbers we need \D in UPPERCASE.
When we need to match white space we can use \s in lowercase.
If instead we want to match all NON white space we use \S in UPPERCASE.
Okay! That's it. Take this 🧃 and you're ready to continue!
Quantifiers
They explicitly identify the number of characters, group or string we want to match!
What you need to know:
- To look for one or more characters use the plus sign ( + )
- To look for zero or more characters use an asterisk ( * )
- The characters for quantity specifiers are curly brackets { }
So if we wanted to match
let str= "2021"
We'd use a quantity specifier to match those 4 numbers. It'll look something like:
let myRegEx = /\d{4}/
str.match(myRegEx) // ["2021"]
⚠️ Remember \d in lowercase is used to match all numbers!!
Specify the number of matches
To let the gnomes know the exact range we want, with both a lower and an upper number of matches, we once again need the quantity specifiers -> {n,m}, this time with two numbers.
- n representing the lowest
- m representing the highest. This one is optional: {n,} means n or more, and {n} with no comma means exactly n
Let's see it !
let str = " 2021 is the current year, it'll be '22 next one"
let myRegEx = /\d{2,4}/g //to find a number from two digits to four globally.
str.match(myRegEx) // (2) ["2021", "22"]
myRegEx.test(str) // true
Get the lowest number of matches
Now let's skip the highest number -> {n,} , it means n or more
let str = "2021 is the current year, it'll be '22 next one"
let myRegEx = /\d{2,}/g //to find runs of two or more digits, globally
str.match(myRegEx) // (2) ["2021", "22"]
myRegEx.test(str) // true
⚠️ It's very important to remember that the plus sign is shorthand for {1,}, not {2,}. Here /\d+/ gives the same result only because this string has no single-digit numbers:
let str = "2021 is the current year, it'll be '22 next one"
let myRegEx = /\d+/g //to find runs of one or more digits, globally
str.match(myRegEx) // (2) ["2021", "22"]
myRegEx.test(str) // true
And if we want a specific number of matches from the gnomes, we can lose the comma and just have {n}, like we saw in the beginning!
let str= "2021"
let myRegEx = /\d{4}/
str.match(myRegEx) // ["2021"]
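Here's a sketch that puts the three forms to work, combined with the ^ and $ anchors from earlier. The PIN and password rules are hypothetical examples, and so are the variable names.

```javascript
// Exactly four digits, nothing before or after (hypothetical PIN rule).
const fourDigitPin = /^\d{4}$/;
const pinChecks = [
  fourDigitPin.test("2021"),  // true
  fourDigitPin.test("20210"), // false, five digits
  fourDigitPin.test("202"),   // false, three digits
];

// Five or more of any character (hypothetical password length rule).
const atLeastFive = /^.{5,}$/;
const lengthChecks = [
  atLeastFive.test("gnome"), // true
  atLeastFive.test("gnom"),  // false
];

// "+" means {1,}, so it also picks up single digits that {2,} skips.
const inventory = "7 gnomes, 22 hats";
const oneOrMore = inventory.match(/\d+/g);
const twoOrMore = inventory.match(/\d{2,}/g);

console.log(pinChecks);    // [true, false, false]
console.log(lengthChecks); // [true, false]
console.log(oneOrMore);    // ["7", "22"]
console.log(twoOrMore);    // ["22"]
```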
Check matches for zero or one
It lets you mark a character as optional, so it may appear zero times or once.
We do this by using a question mark ( ? ) .
Yuri's learning tip: think of it as what it is, a question mark asking whether the element before it is there. It's optional, right?
A great example for this is color - colour:
let usa = "color"
let uk = "colour"
let myRegEx = /colou?r/ //letting the RegEx gnomes know that the u is optional in the matching
usa.match(myRegEx) // ["color"]
myRegEx.test(usa) // true
uk.match(myRegEx) // ["colour"]
myRegEx.test(uk) // true
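The same trick works for more than spelling. Here's a sketch, with made-up URLs, that accepts both http and https by making the s optional. The forward slashes are escaped with a backslash because they would otherwise end the RegEx.

```javascript
// "s?" means the "s" may be there zero times or once.
const urlStart = /^https?:\/\//;

const urlChecks = [
  urlStart.test("http://hashnode.com"),  // true
  urlStart.test("https://hashnode.com"), // true
  urlStart.test("ftp://hashnode.com"),   // false
];

console.log(urlChecks); // [true, true, false]
```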
Positive and negative lookahead
Lookaheads are patterns that tell JavaScript (the gnomes) to check or 'look ahead' for something while doing the matching.
They could be:
- Positive lookahead : Will check that the pattern is there but won't match it. It's represented by (?=abc), where abc is what must come next; the lookahead won't add it to the result.
let pixelNums = "1pt, 2px, 5px, 10em"
let myRegEx = /\d(?=px)/g //it will look for a digit that is followed by 'px', without including 'px' in the result
myRegEx.test(pixelNums) // true
pixelNums.match(myRegEx) // (2) ["2", "5"]
- Negative lookahead : Will make sure the pattern is NOT there after the match. It's represented by (?!abc), where abc is what must not come next; if the lookahead finds it, the match is discarded.
It will look for a digit that is not followed by 'px' and keep only those digits in the result.
let pixelNums = "1pt, 2px, 5px, 10em"
let myRegEx = /\d(?!px)/g
myRegEx.test(pixelNums) // true
pixelNums.match(myRegEx) // (3) ["1", "1", "0"]
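Lookaheads are a natural fit for the password checks mentioned at the start. This is a minimal sketch under an assumed rule (at least 5 characters, at least one digit). The rule and the names are hypothetical, not a recommendation for real password policies.

```javascript
// Whole numbers followed by "px": \d+ takes all the digits, the lookahead checks the unit.
const sizes = "1pt, 2px, 5px, 10em, 12px";
const pxValues = sizes.match(/\d+(?=px)/g);
console.log(pxValues); // ["2", "5", "12"]

// (?=.*\d) peeks ahead for a digit anywhere, then .{5,} checks the length.
const passwordRule = /^(?=.*\d).{5,}$/;
const passwordChecks = [
  passwordRule.test("gnome1"), // true
  passwordRule.test("gnomes"), // false, no digit
  passwordRule.test("gn1"),    // false, too short
];
console.log(passwordChecks); // [true, false, false]
```

Because a lookahead doesn't consume any characters, the length check still starts from the beginning of the string.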
Grouping of characters
The gnomes can also help us find groups of characters! For this we use parentheses ( ) and place inside the strings to use.
What you need to know:
- If we put a quantifier ( + , * , { } ) after the parentheses it applies to the group as a whole
- It lets us get the group as a separate item in the resulting array
Capturing group
Why would we use capturing groups?
Instead of looking for characters separately as in "no+", which means "n" followed by "o" repeated one or more times to form noooo, noooooo, nooooooooo;
We could look for the whole group repeated, as in (no)+:
let str= "You are allowed to say no"
let otherStr = " no nope noo nononono no"
//instead of doing it like this
let otherRegEx = /no+/ig
//We can do it like this!
let myRegEx = /(no)+/ig
str.match(myRegEx) // ["no"]
otherStr.match(myRegEx) //(5) ["no", "no", "no", "nononono", "no"]
//.test() with the /g flag remembers where its last match ended, so we check with a copy that has no /g
/(no)+/i.test(str) // true
/(no)+/i.test(otherStr) // true
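Groups also come back as separate items when you call .match() without the /g flag. Here's a sketch that pulls a date apart. The sentence and the variable names are made up.

```javascript
const publishedLine = "Published on 2021-03-14";

// Three groups: year, month and day.
const dateParts = publishedLine.match(/(\d{4})-(\d{2})-(\d{2})/);

console.log(dateParts[0]); // "2021-03-14" (the whole match)
console.log(dateParts[1]); // "2021" (first group)
console.log(dateParts[2]); // "03"   (second group)
console.log(dateParts[3]); // "14"   (third group)

// With a quantifier on the group, the whole match grows
// but the group itself holds only its last repetition.
const laugh = "hahaha!".match(/(ha)+/);
console.log(laugh[0]); // "hahaha"
console.log(laugh[1]); // "ha"
```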
Reuse patterns using capture groups
We can help our gnome friends to be more efficient looking for matches using capture groups.
You put part of the RegEx in (parentheses) to capture it, then write a backslash followed by a number ( \1 ) where that same text should appear again. The number starts at 1 and increases with each additional group.
let str = "Hashnode Hashnode"
let myRegEx = /(\w+)\s\1/
str.match(myRegEx) // (2) ["Hashnode Hashnode", "Hashnode"]
myRegEx.test(str) // true
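A quick sketch of backreferences with one and two groups, using made-up strings. A \1 or \2 doesn't repeat the pattern, it repeats the exact text that group captured.

```javascript
// \1 must be the same text that (\w+) captured.
const doubledWord = /(\w+)\s\1/;
const doubledChecks = [
  doubledWord.test("Hashnode Hashnode rocks"), // true
  doubledWord.test("Hashnode rocks"),          // false
];

// Two groups: \2 repeats the second word, \1 repeats the first.
const mirrored = /(\w+) (\w+) \2 \1/;
const mirroredChecks = [
  mirrored.test("blog gnome gnome blog"), // true
  mirrored.test("blog gnome blog gnome"), // false, the order doesn't mirror
];

console.log(doubledChecks);  // [true, false]
console.log(mirroredChecks); // [true, false]
```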
Conclusion
Wow, what a fun ride!
RegEx has been one of my favorite JavaScript topics ever since I learned it. It's very versatile and, like most things in programming, you need to be very specific in order to get what you want! (The fact that you feel like you can walk on fire when you understand RegEx patterns is a plus) hahaha.
I really hope you learned something new today, and I want you to know you can always come back and use this article as your personal notes!
Thank you for reading!! :)
Don't hesitate to contact me and let me know in the comments if you'd like to add something else.
Oh and don't forget to throw all the trash from the checkpoints here 🗑️!