mirror of
https://github.com/changkun/modern-cpp-tutorial.git
synced 2025-12-16 20:27:08 +03:00
book: typo and grammar fixes (#185)
@@ -10,64 +10,62 @@ order: 6

## 6.1 Introduction

Regular expressions are not part of the C++ language, so we only introduce them briefly here.

Regular expressions describe patterns for matching strings.
They are mainly used to meet the following three requirements:

1. Check whether a string contains a substring of some form;
2. Replace the matching substrings;
3. Extract the substrings that satisfy a condition from a string.
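
The three requirements above map onto the C++11 `<regex>` facilities introduced in the next section. The following is a minimal sketch rather than a definitive implementation: the helper names `contains_number`, `mask_numbers`, and `first_number`, as well as the digit pattern `[0-9]+`, are assumptions chosen for illustration and do not come from the original text.

```cpp
#include <regex>
#include <string>

// 1. Check: does `text` contain a run of one or more digits?
// (hypothetical helper, named for illustration only)
bool contains_number(const std::string& text) {
    static const std::regex digits("[0-9]+");
    return std::regex_search(text, digits);
}

// 2. Replace: substitute every run of digits in `text` with `replacement`.
std::string mask_numbers(const std::string& text,
                         const std::string& replacement) {
    static const std::regex digits("[0-9]+");
    return std::regex_replace(text, digits, replacement);
}

// 3. Extract: return the first run of digits, or "" if there is none.
std::string first_number(const std::string& text) {
    static const std::regex digits("[0-9]+");
    std::smatch match;
    if (std::regex_search(text, match, digits)) {
        return match[0].str();  // match[0] is the whole matched substring
    }
    return "";
}
```

For instance, `mask_numbers("a1b22", "#")` yields `a#b#`, and `first_number("order 123, item 7")` yields `123`.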

Regular expressions are text patterns consisting of ordinary characters (such as a to z)
and special characters. A pattern describes one or more strings to match when searching for text.
Regular expressions act as a template to match a character pattern to the string being searched.

### Ordinary characters

Ordinary characters include all printable and non-printable characters that are not explicitly designated as metacharacters. This includes all uppercase
and lowercase letters, all digits, all punctuation, and some other symbols.

### Special characters

A special character has a special meaning in a regular expression and forms the core of its matching syntax. See the table below:

| Special characters | Description |
| :----------------: | :---------- |
| `$` | Matches the end position of the input string. |
| `(`,`)` | Marks the start and end of a subexpression. Subexpressions can be captured for later use. |
| `*` | Matches the previous subexpression zero or more times. |
| `+` | Matches the previous subexpression one or more times. |
| `.` | Matches any single character except the newline character `\n`. |
| `[` | Marks the beginning of a bracket expression. |
| `?` | Matches the previous subexpression zero or one time, or indicates a non-greedy quantifier. |
| `\` | Marks the next character as a special character, a literal character, a backreference, or an octal escape character. For example, `n` matches the character `n`, while `\n` matches a newline character. The sequence `\\` matches the `'\'` character, while `\(` matches the `'('` character. |
| `^` | Matches the beginning of the input string, unless it is used inside a bracket expression, where it negates the set of characters. |
| `{` | Marks the beginning of a quantifier expression. |
| `\|` | Indicates a choice between two alternatives. |
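
Several of the special characters in the table can be seen working together in a single pattern. The sketch below is illustrative only: the helper name `image_base_name` and the choice of the `png` and `jpg` extensions are assumptions, not part of the original text. It uses `std::regex_search` so that the `^` and `$` anchors actually do the work of pinning the pattern to the whole input.

```cpp
#include <regex>
#include <string>

// Returns the base name of an image file such as "cat.png", or an empty
// string if `name` does not look like one.
// (hypothetical helper, named for illustration only)
std::string image_base_name(const std::string& name) {
    // ^ and $  anchor the pattern to the start and end of the input
    // ( )      mark subexpressions; group 1 captures the base name
    // .+       any character, one or more times
    // \.       a literal dot
    // png|jpg  a choice between two extensions
    static const std::regex pattern(R"(^(.+)\.(png|jpg)$)");
    std::smatch match;
    if (std::regex_search(name, match, pattern)) {
        return match[1].str();
    }
    return "";
}
```

With this pattern, `image_base_name("cat.png")` returns `cat`, while `image_base_name("notes.txt")` and `image_base_name("cat.pngx")` both return an empty string.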

### Quantifiers

A quantifier specifies how many times a given component of a regular expression must appear to satisfy the match. See the table below:

| Character | Description |
| :-------: | :---------- |
| `*` | Matches the previous subexpression zero or more times. For example, `foo*` matches `fo` and `foooo`. `*` is equivalent to `{0,}`. |
| `+` | Matches the previous subexpression one or more times. For example, `foo+` matches `foo` and `foooo` but does not match `fo`. `+` is equivalent to `{1,}`. |
| `?` | Matches the previous subexpression zero or one time. For example, `Your(s)?` matches both `Your` and `Yours`. `?` is equivalent to `{0,1}`. |
| `{n}` | `n` is a non-negative integer. Matches exactly `n` times. For example, `o{2}` does not match the single `o` in `for`, but matches the two `o`s in `foo`. |
| `{n,}` | `n` is a non-negative integer. Matches at least `n` times. For example, `o{2,}` does not match the single `o` in `for`, but matches all the `o`s in `foooooo`. `o{1,}` is equivalent to `o+`. `o{0,}` is equivalent to `o*`. |
| `{n,m}` | `m` and `n` are non-negative integers, where `n` is less than or equal to `m`. Matches at least `n` times and at most `m` times. For example, `o{1,3}` matches the first three `o`s in `foooooo`. `o{0,1}` is equivalent to `o?`. Note that there can be no spaces between the comma and the two numbers. |

With these two tables, we can read almost all regular expressions.
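
The examples in the quantifier table can be checked directly with the C++11 `<regex>` library. This is a small sketch under stated assumptions: the helpers `full_match` and `first_match` are hypothetical names introduced here for illustration, and the patterns are interpreted with the default (ECMAScript) grammar of `std::regex`.

```cpp
#include <regex>
#include <string>

// True if the whole of `text` matches `pattern`.
// (hypothetical helper, named for illustration only)
bool full_match(const std::string& text, const std::string& pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// Returns the first substring of `text` matched by `pattern`,
// or "" if nothing matches.
std::string first_match(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    std::smatch match;
    return std::regex_search(text, match, re) ? match[0].str() : "";
}
```

For example, `full_match("fo", "foo*")` is `true` while `full_match("fo", "foo+")` is `false`, and `first_match("foooooo", "o{1,3}")` returns `ooo`, in line with the rows of the table.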

## 6.2 `std::regex` and Its Related

The most common way to match string content is to use regular expressions. Unfortunately, traditional C++ supported regular expressions neither at the language level nor in the standard library. C++ is a high-performance language, and back-end services written in it also rely on regular expressions, for example when checking URL resource links, which is the most mature and common practice in the industry.

The general solution was to use the regular expression library from `boost`. C++11 officially incorporated regular expression processing into the standard library, providing standard support and no longer relying on third parties.
@@ -77,7 +75,7 @@ We use a simple example to briefly introduce the use of this library. Consider t

- `[a-z]+\.txt`: In this regular expression, `[a-z]` matches a single lowercase letter, and `+` matches the previous expression one or more times, so `[a-z]+` matches a string of one or more lowercase letters. In a regular expression, `.` matches any character, while `\.` matches the literal character `.`, and the final `txt` matches exactly the three letters `txt`. So this regular expression matches the names of text files consisting purely of lowercase letters.

`std::regex_match` matches a string against a regular expression and has many overloaded forms. The simplest form takes a `std::string` and a `std::regex`. It returns `true` when the entire string matches, and `false` otherwise. For example:

```cpp
#include <iostream>
@@ -131,14 +129,14 @@ bar.txt sub-match[1]: bar

## Conclusion

This section briefly introduces regular expressions themselves,
and then introduces the use of the regular expression library
through a practical example based on the main requirements of
using regular expressions.

## Exercise

In web server development, we usually want to serve some routes that satisfy a certain condition.
Regular expressions are one of the tools to accomplish this.
Given the following request structure:
@@ -230,4 +228,4 @@ A suggested solution can be found [here](../../exercises/6).

## Licenses

<a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-nd/4.0/88x31.png" /></a><br />This work was written by [Ou Changkun](https://changkun.de) and licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License</a>. The code of this repository is open sourced under the [MIT license](../../LICENSE).