Feature: Support inputURL and inputHostname in scrapers (#6250)

Gykes
2025-11-09 20:00:47 -08:00
committed by GitHub
parent f434c1f529
commit 678b3de7c8
4 changed files with 102 additions and 18 deletions

@@ -325,10 +325,58 @@ Alternatively, an attribute value may be set to a fixed value, rather than scrap
```yaml
performer:
  Gender:
    fixed: Female
```
### Input URL placeholders
The `{inputURL}` and `{inputHostname}` placeholders can be used in both `fixed` values and `selector` expressions to access information about the URL that was used to scrape the content.
#### {inputURL}
The `{inputURL}` placeholder provides access to the full URL. This is useful when you want to return or reference the source URL as part of the scraped data.
For example:
```yaml
scene:
  URL:
    fixed: "{inputURL}"
  Title:
    selector: //h1[@class="title"]
```
When scraping from `https://example.com/scene/12345`, the `{inputURL}` placeholder will be replaced with `https://example.com/scene/12345`.
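
As a rough mental model, the substitution can be thought of as a plain string replacement applied to the configured value. The sketch below is illustrative Python, not Stash's actual implementation; the `expand_input_url` helper is a hypothetical name introduced only for this example.

```python
def expand_input_url(template: str, input_url: str) -> str:
    """Replace every {inputURL} token in `template` with the scraped URL."""
    # Hypothetical helper: shows the documented behaviour, not Stash's code.
    return template.replace("{inputURL}", input_url)


# A `fixed` value consisting only of the placeholder yields the URL itself.
fixed_value = expand_input_url("{inputURL}", "https://example.com/scene/12345")
print(fixed_value)
```
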
#### {inputHostname}
The `{inputHostname}` placeholder extracts just the hostname from the URL. This is useful when you need to reference the domain without manually parsing the URL.
For example:
```yaml
scene:
  Studio:
    fixed: "{inputHostname}"
  Details:
    selector: //div[@data-domain="{inputHostname}"]//p[@class="description"]
```
When scraping from `https://example.com/scene/12345`, the `{inputHostname}` placeholder will be replaced with `example.com`.
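
To make the hostname extraction concrete, here is a minimal Python sketch of how such a value could be derived and substituted into a selector. It is an illustration under assumptions, not Stash's code: `input_hostname` and `expand_input_hostname` are hypothetical helpers, and the source does not state how a port in the URL is treated.

```python
from urllib.parse import urlsplit


def input_hostname(input_url: str) -> str:
    """Return the hostname part of `input_url`, or "" if there is none."""
    # Assumption: `hostname` lowercases the host and drops any port.
    # The documentation above does not say how Stash treats ports.
    return urlsplit(input_url).hostname or ""


def expand_input_hostname(template: str, input_url: str) -> str:
    """Replace every {inputHostname} token in `template`."""
    return template.replace("{inputHostname}", input_hostname(input_url))


selector = expand_input_hostname(
    '//div[@data-domain="{inputHostname}"]//p[@class="description"]',
    "https://example.com/scene/12345",
)
print(selector)
```

With the example URL, the expanded selector matches `div` elements whose `data-domain` attribute equals `example.com`.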
Both placeholders can also be used within selectors, for example to match elements whose attributes carry the page's URL or hostname:
```yaml
scene:
  Details:
    selector: //div[@data-url="{inputURL}"]//p[@class="description"]
  Site:
    selector: //div[@data-host="{inputHostname}"]//span[@class="site-name"]
```
> **Note:** These placeholders represent the actual URL used to fetch the content, after any URL replacements have been applied.
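
The ordering described in the note can be sketched as follows: URL replacements run first, and the placeholders are then filled from the resulting URL. This is a hedged Python illustration only. The function names, the `(pattern, replacement)` rule format, and the sample `www.` rule are all assumptions made for the example and do not reflect Stash's configuration syntax or internals.

```python
import re
from urllib.parse import urlsplit


def apply_url_replacements(url: str, rules: list[tuple[str, str]]) -> str:
    """Apply each hypothetical (regex, replacement) rule to the URL in order."""
    for pattern, replacement in rules:
        url = re.sub(pattern, replacement, url)
    return url


def expand_placeholders(template: str, fetched_url: str) -> str:
    """Fill {inputURL} and {inputHostname} from the URL actually fetched."""
    values = {
        "inputURL": fetched_url,
        # Assumption: hostname without port, as in urllib's `hostname`.
        "inputHostname": urlsplit(fetched_url).hostname or "",
    }
    # Single pass, so substituted text is never re-scanned for placeholders.
    return re.sub(
        r"\{(inputURL|inputHostname)\}",
        lambda match: values[match.group(1)],
        template,
    )


# Hypothetical rule that strips a leading "www." before fetching.
rules = [(r"^https://www\.", "https://")]
fetched = apply_url_replacements("https://www.example.com/scene/12345", rules)
print(expand_placeholders("{inputHostname} | {inputURL}", fetched))
```

In this sketch the placeholders see `https://example.com/scene/12345`, not the `www.` form that was originally supplied.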
### Common fragments
The `common` field is used to configure selector fragments that can be referenced in the selector strings. These are key-value pairs where the key is the string to reference the fragment, and the value is the string that the fragment will be replaced with. For example: