Expose url for URLReplace in JSON scrapeByURL and scrapeByFragment (#1150)

* Expose url for URLReplace in JSON scrapeByURL and scrapeByFragment
* Apply queryURLReplace to xpath scrapers

Co-authored-by: WithoutPants <53250216+WithoutPants@users.noreply.github.com>
bnkai
2021-03-02 00:19:56 +02:00
committed by GitHub
parent fe990e00c1
commit 117e6326db
5 changed files with 56 additions and 9 deletions

@@ -223,6 +223,7 @@ For `sceneByFragment`, the `queryURL` field must also be present. This field is
* `{oshash}` - the oshash of the scene
* `{filename}` - the base filename of the scene
* `{title}` - the title of the scene
* `{url}` - the url of the scene
These placeholder field values may be manipulated with regex replacements by adding a `queryURLReplace` section, containing a map of placeholder field to regex configuration which uses the same format as the `replace` post-process action covered below.
@@ -241,6 +242,24 @@ sceneByFragment:
The above configuration would scrape from the value of `queryURL`, replacing `{filename}` with the base filename of the scene, after it has been manipulated by the regex replacements.
### scrapeXPath and scrapeJson use with `<scene|performer|gallery|movie>ByURL`
For `sceneByURL`, `performerByURL`, and `galleryByURL`, the `queryURL` field can also be present in order to use `queryURLReplace`. The functionality is the same as for `sceneByFragment`, except that the only available placeholder field is `url`:
* `{url}` - the url of the scene/performer/gallery
```yaml
sceneByURL:
  - action: scrapeJson
    url:
      - metartnetwork.com
    scraper: sceneScraper
    queryURL: "{url}"
    queryURLReplace:
      url:
        - regex: '^(?:.+\.)?([^.]+)\.com/.+movie/(\d+)/(\w+)/?$'
          with: https://www.$1.com/api/movie?name=$3&date=$2
```
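To make the effect of the `regex`/`with` pair above concrete, here is a minimal Go sketch of the rewrite. It assumes the replacement behaves like Go's `regexp.ReplaceAllString`, where `$1`, `$2`, and `$3` refer to capture groups. The helper `rewriteURL` and the sample scene URL are hypothetical and exist only for illustration; this is not the scraper's actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// rewriteURL applies a single regex/with pair to a URL, mirroring one entry
// under queryURLReplace. Assumption: the replacement follows the semantics of
// Go's regexp.ReplaceAllString ($1, $2, ... expand to capture groups).
// This helper is hypothetical and not part of the scraper itself.
func rewriteURL(pattern, with, url string) string {
	re := regexp.MustCompile(pattern)
	// A URL that does not match the pattern is returned unchanged.
	return re.ReplaceAllString(url, with)
}

func main() {
	// The regex and replacement template from the configuration above.
	pattern := `^(?:.+\.)?([^.]+)\.com/.+movie/(\d+)/(\w+)/?$`
	with := "https://www.$1.com/api/movie?name=$3&date=$2"

	// Hypothetical scene URL, for illustration only.
	sceneURL := "https://www.example.com/model/jane-doe/movie/20210301/SAMPLE_TITLE"

	fmt.Println(rewriteURL(pattern, with, sceneURL))
}
```

With the sample URL above, the sketch prints `https://www.example.com/api/movie?name=SAMPLE_TITLE&date=20210301`: the site name, date, and title are captured from the page URL and rearranged into an API query.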
### Stash
A different stash server can be configured as a scraping source. This action applies only to `performerByName`, `performerByFragment`, and `sceneByFragment` types. This action requires that the top-level `stashServer` field is configured.