book: typo and grammar fixes (#185)

Nikhil Garuda
2021-04-18 15:42:26 +05:30
committed by GitHub
parent 45bef893a7
commit 731a219d89
10 changed files with 421 additions and 429 deletions

@@ -10,64 +10,62 @@ order: 6
## 6.1 Introduction
Regular expressions are not part of the C++ language, so we only briefly
introduce them here.
A regular expression describes a pattern for matching strings.
Regular expressions are generally used to meet
the following three requirements:
1. Check whether a string contains a substring of some form;
2. Replace the matching substrings;
3. Extract the matching substrings from a string.
Regular expressions are text patterns consisting of ordinary characters (such as a to z)
and special characters. A pattern describes one or more strings to match when searching for text.
A regular expression acts as a template that is matched against the string being searched.
### Ordinary characters
Ordinary characters include all printable and non-printable characters that are not explicitly designated as metacharacters. This includes all uppercase
and lowercase letters, all digits, all punctuation, and some other symbols.
### Special characters
A special character is a character that has a special meaning in a regular expression; special characters form the core of the matching syntax. See the table below:
| Special characters | Description |
| :----------------: | :---------- |
| `$` | Matches the end position of the input string. |
| `(`,`)` | Mark the start and end of a subexpression. Matched subexpressions can be captured for later use. |
| `*` | Matches the previous subexpression zero or more times. |
| `+` | Matches the previous subexpression one or more times. |
| `.` | Matches any single character except the newline character `\n`. |
| `[` | Marks the beginning of a bracket expression. |
| `?` | Matches the previous subexpression zero or one time, or marks a quantifier as non-greedy. |
| `\` | Marks the next character as a special character, a literal character, a back-reference, or an octal escape. For example, `n` matches the character `n`, while `\n` matches a newline character. The sequence `\\` matches the `'\'` character, and `\(` matches the `'('` character. |
| `^` | Matches the beginning of the input string, unless it is the first character of a bracket expression, in which case it negates the character set. |
| `{` | Marks the beginning of a quantifier expression. |
| `\|` | Indicates a choice between two alternatives. |
### Quantifiers
A quantifier specifies how many times a given component of a regular expression must appear for the match to succeed. See the table below:
| Character | Description |
| :-------: | :---------- |
| `*` | Matches the previous subexpression zero or more times. For example, `foo*` matches `fo` and `foooo`. `*` is equivalent to `{0,}`. |
| `+` | Matches the previous subexpression one or more times. For example, `foo+` matches `foo` and `foooo` but does not match `fo`. `+` is equivalent to `{1,}`. |
| `?` | Matches the previous subexpression zero or one time. For example, `Your(s)?` matches both `Your` and `Yours`. `?` is equivalent to `{0,1}`. |
| `{n}` | `n` is a non-negative integer. Matches exactly `n` times. For example, `o{2}` cannot match the `o` in `for`, but matches the two `o`s in `foo`. |
| `{n,}` | `n` is a non-negative integer. Matches at least `n` times. For example, `o{2,}` cannot match the `o` in `for`, but matches all the `o`s in `foooooo`. `o{1,}` is equivalent to `o+`. `o{0,}` is equivalent to `o*`. |
| `{n,m}` | `m` and `n` are non-negative integers, where `n` is less than or equal to `m`. Matches at least `n` and at most `m` times. For example, `o{1,3}` matches the first three `o`s in `foooooo`. `o{0,1}` is equivalent to `o?`. Note that no spaces are allowed between the comma and the two numbers. |
With these two tables, we can usually read almost all regular expressions.
## 6.2 `std::regex` and Related Facilities
The most common way to match string content is to use regular expressions. Unfortunately, traditional C++ did not support regular expressions at the language level, nor did its standard library include them. Yet C++ is a high-performance language that is often used to develop back-end services, where regular expressions are needed, for example, to validate URL resource links; this is the most mature and common practice in the industry.
The usual solution was to use the regular expression library from `boost`. C++11 officially incorporated regular expression processing into the standard library, so it no longer has to rely on third-party libraries.
@@ -77,7 +75,7 @@ We use a simple example to briefly introduce the use of this library. Consider t
- `[a-z]+\.txt`: In this regular expression, `[a-z]` matches a single lowercase letter, and `+` matches the previous expression one or more times, so `[a-z]+` matches a string of one or more lowercase letters. In a regular expression, a `.` matches any character, whereas `\.` matches the literal character `.`. The final `txt` matches the three letters `txt` exactly. So this regular expression matches the name of a text file consisting purely of lowercase letters.
`std::regex_match` checks whether an entire string matches a regular expression, and it has many overloads. The simplest form takes a `std::string` and a `std::regex`. It returns `true` when the match succeeds and `false` otherwise. For example:
```cpp
#include <iostream>
@@ -131,14 +129,14 @@ bar.txt sub-match[1]: bar
## Conclusion
This section briefly introduces regular expressions themselves,
and then introduces the regular expression library
through a practical example based on the main requirements of
using regular expressions.
## Exercise
In web server development, we usually want to serve only those routes that satisfy a certain condition.
Regular expressions are one of the tools for accomplishing this.
Given the following request structure:
@@ -230,4 +228,4 @@ A suggested solution can be found [here](../../exercises/6).
## Licenses
<a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-nd/4.0/88x31.png" /></a><br />This work was written by [Ou Changkun](https://changkun.de) and licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/4.0/">Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License</a>. The code of this repository is open sourced under the [MIT license](../../LICENSE).