Release notes dialog (#2726)

* Move manual docs
* Move changelog docs
* Add migration notes
* Move changelog to settings
* Add release notes dialog
* Add new changelog
WithoutPants
2022-07-13 12:57:53 +10:00
parent 964b559309
commit 30877c75fb
59 changed files with 229 additions and 57 deletions


@@ -0,0 +1,17 @@
# Auto Tagging
This task matches your Performers, Studios, and Tags against your media, based on names only. It finds Scenes, Images, and Galleries where the path or filename contains the Performer/Studio/Tag.
For each scene it finds that matches, it sets the applicable field. It will **only** tag based on performers, studios, and tags that already exist in your database. In order to completely identify and gather information about the scenes in your collection, you will need to use the Tagger view and/or Scraping tools.
When the Performer/Studio/Tag name has multiple words, the search will include paths/filenames where the Performer/Studio/Tag name is separated with `.`, `-` or `_` characters, as well as whitespace.
For example, auto tagging for performer `Jane Doe` will match the following filenames:
* `Jane.Doe.1.mp4`
* `Jane_Doe.2.mp4`
* `Jane-Doe.3.mp4`
* `Jane Doe.4.mp4`
Matching is case insensitive, and should only match exact wording within word boundaries. For example, `Jane Doe` will not match `Maryjane-Doe`, but will match `Mary-Jane-Doe`.
Auto tagging for only specific Performers, Studios and Tags can be performed from the individual Performer/Studio/Tag page.
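The matching rules above can be sketched as a regular expression. This is an illustrative sketch only, not Stash's actual implementation: the function name `name_pattern` is hypothetical, and "word boundary" is assumed to mean "not directly adjacent to a letter or digit".

```python
import re

def name_pattern(name):
    """Build a case-insensitive pattern for a Performer/Studio/Tag name.

    Assumption: words may be separated by whitespace, '.', '-' or '_',
    and the name must not be directly preceded or followed by a letter
    or digit.
    """
    words = [re.escape(word) for word in name.split()]
    body = r"[\s._-]+".join(words)
    # [^\W_] matches a single letter or digit (a word character other than '_').
    return re.compile(r"(?<![^\W_])" + body + r"(?![^\W_])", re.IGNORECASE)

pattern = name_pattern("Jane Doe")
matched = [
    filename
    for filename in ("Jane.Doe.1.mp4", "Jane_Doe.2.mp4", "Maryjane-Doe", "Mary-Jane-Doe")
    if pattern.search(filename)
]
```

With these inputs, `matched` contains every name except `Maryjane-Doe`, mirroring the examples given above.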


@@ -0,0 +1,45 @@
# Browsing
## Querying and Filtering
### Keyword searching
The text field allows you to search using keywords. Keyword searching matches on different fields depending on the object type:
| Type | Fields searched |
|------|-----------------|
| Scene | Title, Details, Path, OSHash, Checksum, Marker titles |
| Image | Title, Path, Checksum |
| Movie | Title |
| Marker | Title, Scene title |
| Gallery | Title, Path, Checksum |
| Performer | Name, Aliases |
| Studio | Name, Aliases |
| Tag | Name, Aliases |
Keyword matching uses the following rules:
* all words are required in the matching field. For example, `foo bar` matches scenes with both `foo` and `bar` in the title.
* the `or` keyword or symbol (`|`) is used to match either term. For example, `foo or bar` (or `foo | bar`) matches scenes with `foo` or `bar` in the title. Or sets can be combined. For example, `foo or bar or baz xyz or zyx` matches scenes with one of `foo`, `bar` and `baz`, *and* one of `xyz` or `zyx`.
* the not symbol (`-`) is used to exclude terms. For example, `foo -bar` matches scenes with `foo` and excludes those with `bar`. The not symbol cannot be combined with an or operand. That is, `-foo or bar` will be interpreted to match `-foo` or `bar`. On the other hand, `foo or bar -baz` will match `foo` or `bar` and exclude `baz`.
* surrounding a phrase in quotes (`"`) matches on that exact phrase. For example, `"foo bar"` matches scenes with `foo bar` in the title. Quotes may also be used to escape the keywords and symbols. For example, `foo "-bar"` will match scenes with `foo` and `-bar`.
* quoted phrases may be used with the or and not operators. For example, `"foo bar" or baz -"xyz zyx"` will match scenes with `foo bar` *or* `baz`, and exclude those with `xyz zyx`.
* `or` keywords or symbols at the start or end of a line will be treated literally. That is, `or foo` will match scenes with `or` and `foo`.
* all matching is case-insensitive
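To make the rules above concrete, here is a small sketch of a parser that follows them. It is not Stash's query code: `parse_query` and `matches` are hypothetical names, and plain substring matching on a single field is an assumption made to keep the example short.

```python
import re

# An optionally negated quoted phrase, or a run of non-space characters.
TOKEN = re.compile(r'(-?)"([^"]*)"|(\S+)')

def parse_query(query):
    """Return (required_or_sets, excluded_terms) for a keyword query."""
    tokens = []  # (text, quoted, negated)
    for m in TOKEN.finditer(query):
        if m.group(3) is None:
            tokens.append((m.group(2), True, m.group(1) == "-"))
        else:
            tokens.append((m.group(3), False, False))

    def is_or(i):
        # `or` / `|` is an operator unless quoted or at the start/end of the line.
        text, quoted, _ = tokens[i]
        return not quoted and text.lower() in ("or", "|") and 0 < i < len(tokens) - 1

    required, excluded = [], []
    join_next = False
    for i, (text, quoted, negated) in enumerate(tokens):
        if is_or(i):
            join_next = True
            continue
        beside_or = join_next or (i + 1 < len(tokens) and is_or(i + 1))
        bare_not = not quoted and text.startswith("-") and len(text) > 1
        if bare_not and not beside_or:
            excluded.append(text[1:].lower())
        elif negated and not beside_or:
            excluded.append(text.lower())
        else:
            # A not symbol next to an `or` operand is kept as literal text.
            term = ("-" + text if negated else text).lower()
            if join_next and required:
                required[-1].append(term)
            else:
                required.append([term])
        join_next = False
    return required, excluded

def matches(field, query):
    """Case-insensitive check of one field against a query (substring match assumed)."""
    required, excluded = parse_query(query)
    text = field.lower()
    return (all(any(term in text for term in group) for group in required)
            and not any(term in text for term in excluded))
```

For example, `parse_query("foo or bar -baz")` yields one or-set containing `foo` and `bar`, plus the excluded term `baz`.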
### Filters
Filters can be accessed by clicking the filter button on the right side of the query text field.
Note that only one filter criterion per criterion type may be assigned.
### Sorting and page size
The current sort field and order are shown next to the query text field. The page size dropdown allows selecting from a standard set of objects per page, and allows setting a custom page size.
### Saved filters
Saved filters can be accessed with the bookmark button on the left of the query text field. The current filter can be saved by entering a filter name and clicking on the save button. Existing saved filters may be overwritten with the current filter by clicking on the save button next to the filter name. Saved filters may also be deleted by pressing the delete button next to the filter name.
### Default filter
The default filter for the top-level pages may be set to the current filter by clicking the `Set as default` button in the saved filter menu.


@@ -0,0 +1,14 @@
# Captions
Stash supports captioning with SRT and VTT files.
These files need to be named as follows:
## Scene
- {scene_name}.{language_code}.ext
- {scene_name}.ext
Where `{language_code}` is defined by the [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) (2 letters) standard and `ext` is the file extension. Caption files without a language code will be labeled as Unknown in the video player but will work fine.
Scenes with captions can be filtered with the `captions` criterion.
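The naming convention can be checked with a short sketch like the one below. It assumes that `{scene_name}` is the video file name without its extension and that language codes are exactly two letters; `caption_language` is a hypothetical helper, not part of Stash.

```python
from pathlib import PurePath

CAPTION_EXTENSIONS = {".srt", ".vtt"}

def caption_language(caption_file, scene_file):
    """Return the language code of a caption file for a scene.

    Returns "" when the caption has no language code, and None when the
    file does not follow the naming convention for this scene.
    """
    caption, scene = PurePath(caption_file), PurePath(scene_file)
    if caption.suffix.lower() not in CAPTION_EXTENSIONS:
        return None
    stem = caption.stem  # file name without the .srt/.vtt extension
    if stem == scene.stem:
        return ""
    prefix = scene.stem + "."
    code = stem[len(prefix):]
    if stem.startswith(prefix) and len(code) == 2 and code.isalpha():
        return code.lower()
    return None
```

Under these assumptions, `movie.en.srt` is an English caption for `movie.mp4`, while `movie.vtt` is accepted without a language code.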


@@ -0,0 +1,137 @@
# Configuration
## Stashes
This section allows you to add and remove directories from your library list. Files in these directories will be included when scanning. Files that are outside of these directories will be removed when running the Clean task.
> **⚠️ Note:** Don't forget to click `Save` after updating these directories!
## Excluded Patterns
Given a valid [regex](https://github.com/google/re2/wiki/Syntax), files that match even partially are excluded during the Scan process and are not entered in the database. During the Clean task, any such files that already exist in the database are removed from it and their generated files are deleted.
Prior to matching, both the filenames and the patterns are converted to lower case, so the match is case insensitive.
Regex patterns can be added in the config file or from the UI.
Patterns added manually to the config file require a restart to take effect, while patterns added from the UI only require clicking the Save button.
When adding patterns directly to the config file, take special care to double escape the `\` character.
Some examples:
For the config file, add the following:
```
exclude:
- "sample\\.mp4$"
- "/\\.[[:word:]]+/"
- "c:\\\\stash\\\\videos\\\\exclude"
- "^/stash/videos/exclude/"
- "\\\\\\\\stash\\\\network\\\\share\\\\excl\\\\"
```
* the first excludes all files ending in `sample.mp4` (the `.` also needs to be escaped)
* the second excludes hidden directories such as `/.directoryname/`
* the third is an example for a Windows directory, `c:\stash\videos\exclude`
* the fourth excludes the directory `/stash/videos/exclude/`
* the last excludes a Windows network path, `\\stash\network\share\excl\`
**Note:** if a directory is excluded for images and videos, then the directory will be excluded from scans completely.
_a useful [link](https://regex101.com/) to experiment with regexps_
## Hashing algorithms
Stash identifies video files by calculating a hash of the file. There are two algorithms available for hashing: `oshash` and `MD5`. `MD5` requires reading the entire file, and can therefore be slow, particularly when reading files over a network. `oshash` (which uses the OpenSubtitles hashing algorithm) only reads 64 KB from each end of the file.
The hash is used to name the generated files such as preview images and videos, and sprite images.
By default, new systems have MD5 calculation disabled for optimal performance. Existing systems that are upgraded will have the oshash populated for each scene on the next scan.
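To illustrate why `oshash` is cheap to compute, the following sketch implements the publicly documented OpenSubtitles hash: the file size plus the sum of the first and last 64 KB read as little-endian 64-bit integers, truncated to 64 bits. It is a sketch only; Stash's own implementation may differ in details such as how files smaller than 64 KB are handled, and the function name and stream-based signature are choices made for this example.

```python
import io
import struct

CHUNK_SIZE = 64 * 1024  # 64 KB read from each end of the file

def oshash(stream, size):
    """Compute the OpenSubtitles hash of a seekable binary stream of `size` bytes."""
    if size < CHUNK_SIZE:
        raise ValueError("this sketch only handles files of at least 64 KB")

    def chunk_sum(offset):
        stream.seek(offset)
        data = stream.read(CHUNK_SIZE)
        # Interpret the chunk as 8192 little-endian unsigned 64-bit integers.
        return sum(struct.unpack("<%dQ" % (CHUNK_SIZE // 8), data))

    total = size + chunk_sum(0) + chunk_sum(size - CHUNK_SIZE)
    return "%016x" % (total & 0xFFFFFFFFFFFFFFFF)

# A 128 KB stream of zero bytes hashes to its own size in hexadecimal.
example = oshash(io.BytesIO(bytes(2 * CHUNK_SIZE)), 2 * CHUNK_SIZE)
```

Only 128 KB is read regardless of file size, which is why this approach is much faster than MD5 for large files or network storage.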
### Changing the hashing algorithm
To change the file naming hash to oshash, all scenes must have their oshash values populated. oshash population is done automatically when scanning.
To change the file naming hash to `MD5`, the MD5 must be populated for all scenes. To do this, `Calculate MD5` for videos must be enabled and the library must be rescanned.
MD5 calculation may only be disabled if the file naming hash is set to `oshash`.
After changing the file naming hash, any existing generated files will now be named incorrectly. This means that stash will not find them and may regenerate them if the `Generate task` is used. To remedy this, run the `Rename generated files` task, which will rename existing generated files to their correct names.
#### Step-by-step instructions to migrate to oshash for existing users
These instructions are for existing users whose systems will be defaulted to use and calculate MD5 checksums. Once completed, MD5 checksums will no longer be calculated when scanning, and oshash will be used for generated file naming. Existing calculated MD5 checksums will remain on scenes, but checksums will not be calculated for new scenes.
1. Scan the library (to populate oshash for all existing scenes).
2. In Settings -> Configuration page, untick `Calculate MD5` and select `oshash` as file naming hash. Save the configuration.
3. In Settings -> Tasks page, click on the `Rename generated files` migration button.
## Parallel Scan/Generation
#### Number of parallel tasks for scan/generation
This setting controls how many sub-tasks will be run in parallel during scanning and generation tasks. (See Tasks)
Auto-detection can be enabled by setting this to zero. This will calculate the number of parallel tasks to be cpu_cores/4 + 1.
This setting can be used to increase/decrease overall CPU utilisation in two scenarios:
1) High performance 4+ core cpus.
2) Media files stored on remote/cloud filesystem.
Note: If this is set too high, it will decrease overall performance and may cause failures (out of memory).
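The auto-detection rule (cpu_cores/4 + 1) can be written out as a short sketch. Integer division is assumed here, and `parallel_tasks` is a hypothetical helper rather than a Stash function.

```python
import os

def parallel_tasks(configured, cpu_cores=None):
    """Return the number of parallel sub-tasks for a configured value.

    A value of zero enables auto-detection: cpu_cores / 4 + 1
    (integer division assumed).
    """
    if configured > 0:
        return configured
    cores = cpu_cores or os.cpu_count() or 1
    return cores // 4 + 1
```

Under this assumption, an 8-core machine with the setting at zero would run 3 sub-tasks in parallel, and a 4-core machine would run 2.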
## Scraping
### User Agent string
Some websites require a legitimate User-Agent string when receiving requests, or they will be rejected. If entered, this string will be applied as the `User-Agent` header value in http scrape requests.
### Chrome CDP path
Some scrapers require a Chrome instance to function correctly. If left empty, stash will attempt to find the Chrome executable in the path environment, and will fail if it cannot find one.
`Chrome CDP path` can be set to the path of the Chrome executable, or to the http(s) address of a remote Chrome instance (for example: `http://localhost:9222/json/version`).
## Authentication
By default, stash is not configured with any sort of password protection. To enable password protection, both `Username` and `Password` must be populated. Note that when entering a new username and password where none was set previously, the system will immediately request these credentials to log you in.
## API key
If password protection is enabled, you may also generate an API key. An API key is used by external systems to access your stash system without needing to login first.
External systems using the API key must set the `ApiKey` header value to the configured API key in order to bypass the login requirement.
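As a minimal sketch, an external script could attach the header as follows. The base URL, the `/graphql` endpoint path and the query text are assumptions made for illustration; only the `ApiKey` header name comes from the text above.

```python
import json
import urllib.request

def build_request(base_url, api_key, query):
    """Build (but do not send) a POST request carrying the ApiKey header."""
    body = json.dumps({"query": query}).encode("utf-8")
    # Assumption: the graphql interface is served under /graphql.
    request = urllib.request.Request(
        base_url.rstrip("/") + "/graphql", data=body, method="POST"
    )
    request.add_header("Content-Type", "application/json")
    # urllib normalises the stored name's capitalisation; HTTP header
    # names are case-insensitive, so the server still sees ApiKey.
    request.add_header("ApiKey", api_key)
    return request

# Hypothetical values; replace with your own server address and key.
request = build_request("http://localhost:9999", "my-api-key", "{ __typename }")
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch does not need a running server.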
### Logging out
The logout button is situated in the upper-right part of the screen when you are logged in.
### Recovering from a forgotten username or password
Stash saves login credentials in the config.yml file. If you have forgotten your password, reset both the login and the password as follows:
* Close your Stash process
* Open the `config.yml` file found in your Stash directory with a text editor
* Delete the `login` and `password` lines from the file and save
Stash authentication should now be reset with no authentication credentials.
## Advanced configuration options
These options are typically not exposed in the UI and must be changed manually in the `config.yml` file.
| Field | Remarks |
|-------|---------|
| `custom_served_folders` | A map of URLs to file system folders. See below. |
| `custom_ui_location` | The file system folder where the UI files will be served from, instead of using the embedded UI. Empty to disable. Stash must be restarted to take effect. |
| `max_upload_size` | Maximum file upload size for import files. Defaults to 1GB. |
| `theme_color` | Sets the `theme-color` property in the UI. |
### Custom served folders
Custom served folders are served when the server handles a request with the `/custom` URL prefix. The following is an example configuration:
```
custom_served_folders:
  /: D:\stash\static
  /foo: D:\bar
```
With the above configuration, a request for `/custom/foo/bar.png` would serve `D:\bar\bar.png`.
The `/` entry matches anything that is not otherwise mapped by the other entries. For example, `/custom/baz/xyz.png` would serve `D:\stash\static\baz\xyz.png`.
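The mapping behaviour described above can be sketched as a lookup that tries the most specific entry first and falls back to `/`. This is an illustration, not Stash's routing code: `resolve_custom` is a hypothetical name and longest-prefix matching is an assumption.

```python
from pathlib import PureWindowsPath

def resolve_custom(url_path, folders):
    """Map a /custom/... URL path to a file system path, or None if unmapped."""
    prefix = "/custom"
    if not url_path.startswith(prefix + "/"):
        return None
    rest = url_path[len(prefix):]  # e.g. "/foo/bar.png"
    # Try longer (more specific) entries before the catch-all "/" entry.
    for key in sorted(folders, key=len, reverse=True):
        if key == "/":
            relative = rest.lstrip("/")
        elif rest == key or rest.startswith(key + "/"):
            relative = rest[len(key):].lstrip("/")
        else:
            continue
        return str(PureWindowsPath(folders[key], *relative.split("/")))
    return None

folders = {"/": r"D:\stash\static", "/foo": r"D:\bar"}
served = resolve_custom("/custom/foo/bar.png", folders)
fallback = resolve_custom("/custom/baz/xyz.png", folders)
```

Here `served` and `fallback` reproduce the two example paths given above for the sample configuration.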


@@ -0,0 +1,47 @@
# Ways to contribute
## Financial
Financial contributions are welcomed and are accepted using [Open Collective](https://opencollective.com/stashapp).
## Development-related
The Stash backend is written in golang with a sqlite database. The UI is written in react. Bug fixes, improvements and new features are welcomed. Please see the [DEVELOPMENT.md](https://github.com/stashapp/stash/blob/develop/docs/DEVELOPMENT.md) file for details on how to get started. Assistance can be provided via our [Discord](https://discord.gg/2TsNFKt).
## Documentation
Efforts to improve documentation in stash help new users and reduce the number of questions we have to field in Discord. Contributions to documentation are welcomed. While submitting documentation changes via git pull requests is ideal, we will gladly accept submissions via [github issues](https://github.com/stashapp/stash/issues) or on [Discord](https://discord.gg/2TsNFKt).
For those with web page experience, we also welcome contributions to our [website](https://stashapp.cc/) (which as of writing is very undeveloped).
## Testing features, improvements and bug fixes
Testing is currently covered by a very small group, so new testers are welcomed. Being able to build stash locally is ideal, but custom binaries for pull requests are available by navigating to the `continuous-integration/travis-ci/pr` travis check details.
The link to the custom binary for each platform can be found at the end of the build log, and looks like the following:
```
$ if [ "$TRAVIS_PULL_REQUEST" != "false" ]; then sh ./scripts/upload-pull-request.sh; fi
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 43.1M 100 35 100 43.1M 3 3812k 0:00:11 0:00:11 --:--:-- 5576k
stash-osx uploaded to url: https://transfer.sh/.../stash-osx
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 60.7M 100 39 100 60.7M 3 5391k 0:00:13 0:00:11 0:00:02 7350k
stash-win.exe uploaded to url: https://transfer.sh/.../stash-win.exe
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 44.6M 100 37 100 44.6M 2 3648k 0:00:18 0:00:12 0:00:06 7504k
stash-linux uploaded to url: https://transfer.sh/.../stash-linux
```
The `if` line will need to be expanded to see the details.
## Submitting and contributing to bug reports, improvements and new features
We welcome contributions for future improvements and features, and bug reports help everyone. These can all be found in the [github issues](https://github.com/stashapp/stash/issues).
## Providing support
Offering support for new users on [Discord](https://discord.gg/2TsNFKt) is also welcomed.


@@ -0,0 +1,9 @@
# Dupe Checker
[The dupe checker](/sceneDuplicateChecker) searches your collection for scenes that are perceptually similar. This means that the files don't need to be identical, and will be identified even with different bitrates, resolutions, and intros/outros.
To achieve this, stash needs to generate what's called a phash, or perceptual hash. Similar to sprite generation, stash will generate a set of 25 images from fixed points in the scene. These images will be stitched together, and then hashed using the phash algorithm. The phash can then be used to find scenes that are the same or similar to others in the database. Phash generation can be run during scan, or as a separate task. Note that generation can take a while due to the work involved with extracting screenshots.
The dupe checker can be run with four different levels of accuracy. `Exact` looks for scenes that have exactly the same phash. This is a fast and accurate operation that should not yield any false positives except in very rare cases. The other accuracy levels look for duplicate files within a set distance of each other. This means the scenes don't have exactly the same phash, but are very similar. `High` and `Medium` should still yield very good results with few or no false positives. `Low` is likely to produce some false positives, but might still be useful for finding dupes.
Note that to generate a phash stash requires an uncorrupted file. If any errors are encountered during sprite generation the phash will not be generated. This is to prevent false positives.
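The idea of matching "within a set distance" can be sketched as follows, assuming the distance between two phashes is the Hamming distance (the number of differing bits). The function names and the thresholds used are hypothetical; the actual distances behind `High`, `Medium` and `Low` are not stated here.

```python
def hamming_distance(a, b):
    """Number of bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")

def find_duplicates(phashes, max_distance=0):
    """Return pairs of scene ids whose phashes are within max_distance bits.

    max_distance=0 corresponds to an exact match; larger values allow
    progressively less similar scenes to be paired.
    """
    ids = sorted(phashes)
    pairs = []
    for index, first in enumerate(ids):
        for second in ids[index + 1:]:
            if hamming_distance(phashes[first], phashes[second]) <= max_distance:
                pairs.append((first, second))
    return pairs

# Toy 4-bit hashes keyed by scene id (real phashes are much longer).
sample = {1: 0b1111, 2: 0b1111, 3: 0b1110, 4: 0b0000}
exact = find_duplicates(sample)
near = find_duplicates(sample, max_distance=1)
```

Raising the allowed distance finds more candidate pairs, which is why the lower accuracy levels are more likely to include false positives.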


@@ -0,0 +1,131 @@
# Embedded Plugins
Embedded plugins are executed within the stash process using a scripting system.
## Supported script languages
Stash currently supports Javascript embedded plugins using [otto](https://github.com/robertkrimen/otto).
# Javascript plugins
## Plugin input
The input is provided to Javascript plugins using the `input` global variable, and is an object based on the structure provided in the `Plugin input` section of the [Plugins](/help/Plugins.md) page. Note that the `server_connection` field should not be necessary in most embedded plugins.
## Plugin output
The output of a Javascript plugin is derived from the evaluated value of the script. The output should conform to the structure provided in the `Plugin output` section of the [Plugins](/help/Plugins.md) page.
There are a number of ways to return the plugin output:
### Example #1
```
(function() {
    return {
        Output: "ok"
    };
})();
```
### Example #2
```
function main() {
    return {
        Output: "ok"
    };
}
main();
```
### Example #3
```
var output = {
    Output: "ok"
};
output;
```
## Logging
See the `Javascript API` section below on how to log with Javascript plugins.
# Plugin configuration file format
The basic structure of an embedded plugin configuration file is as follows:
```
name: <plugin name>
description: <optional description of the plugin>
version: <optional version tag>
url: <optional url>
exec:
- <path to script>
interface: [interface type]
tasks:
- ...
```
The `name`, `description`, `version` and `url` fields are displayed on the plugins page.
## exec
For embedded plugins, the `exec` field is a list with the first element being the path to the Javascript file that will be executed. It is expected that the path to the Javascript file is relative to the directory of the plugin configuration file.
## interface
For embedded plugins, the `interface` field must be set to one of the following values:
* `js`
# Javascript API
## Logging
Stash provides the following API for logging in Javascript plugins:
| Method | Description |
|--------|-------------|
| `log.Trace(<string>)` | Log with the `trace` log level. |
| `log.Debug(<string>)` | Log with the `debug` log level. |
| `log.Info(<string>)` | Log with the `info` log level. |
| `log.Warn(<string>)` | Log with the `warn` log level. |
| `log.Error(<string>)` | Log with the `error` log level. |
| `log.Progress(<float between 0 and 1>)` | Sets the progress of the plugin task, as a float, where `0` represents 0% and `1` represents 100%. |
## GQL
Stash provides the following API for communicating with stash using the graphql interface:
| Method | Description |
|--------|-------------|
| `gql.Do(<query/mutation string>, <variables object>)` | Executes a graphql query/mutation on the stash server. Returns an object in the same way as a graphql query does. |
### Example
```
// creates a tag
// tagName is a placeholder value; set it to the name of the tag to create
var tagName = "example tag";
var mutation = "\
mutation tagCreate($input: TagCreateInput!) {\
  tagCreate(input: $input) {\
    id\
  }\
}";
var variables = {
    input: {
        'name': tagName
    }
};
var result = gql.Do(mutation, variables);
log.Info("tag id = " + result.tagCreate.id);
```
## Utility functions
Stash provides the following API for utility functions:
| Method | Description |
|--------|-------------|
| `util.Sleep(<milliseconds>)` | Suspends the current thread for the specified duration. |


@@ -0,0 +1,102 @@
# External Plugins
External plugins are executed by running an external binary.
## Plugin interfaces
Stash communicates with external plugins using an interface. Stash currently supports RPC and raw interface types.
### RPC interface
The RPC interface uses JSON-RPC to communicate with the plugin process. A golang plugin utilising the RPC interface is available in the stash source code under `pkg/plugin/examples/gorpc`. RPC plugins are expected to provide an interface that fulfils the `RPCRunner` interface in `pkg/plugin/common`.
RPC plugins are expected to accept requests asynchronously.
When stopping an RPC plugin task, the stash server sends a stop request to the plugin and relies on the plugin to stop itself.
### Raw interface
Raw interface plugins are not required to conform to any particular interface. The stash server will send the plugin input to the plugin process via its stdin stream, encoded as JSON. Raw interface plugins are not required to read the input.
The stash server reads stdout for the plugin's output. If the output can be decoded as a JSON representation of the plugin output data structure then it will do so. If not, it will treat the entire stdout string as the plugin's output.
When stopping a raw plugin task, the stash server kills the spawned process without warning or signals.
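A raw interface plugin can be sketched in Python as below. This is a minimal sketch under assumptions: the `output` key used for the JSON result and the `args` key in the sample input are assumed names (the Plugins page describes the authoritative input and output structures), and `handle` is a hypothetical function standing in for the plugin's real work.

```python
import io
import json
import sys

def handle(plugin_input):
    """Do the plugin's work and return its output value."""
    # Hypothetical behaviour: report which top-level input fields were received.
    return {"received": sorted(plugin_input)}

def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the JSON plugin input from stdin and write JSON output to stdout."""
    raw = stdin.read()
    try:
        plugin_input = json.loads(raw) if raw.strip() else {}
    except json.JSONDecodeError:
        # Raw plugins are not required to read or understand the input.
        plugin_input = {}
    # Assumption: the plugin output structure has an "output" field.
    stdout.write(json.dumps({"output": handle(plugin_input)}))

# Simulated run with in-memory streams; a real plugin script would call main().
simulated_stdout = io.StringIO()
main(io.StringIO('{"args": {}}'), simulated_stdout)
example_output = simulated_stdout.getvalue()
```

Because a raw plugin task is killed without warning when stopped, a plugin written this way should avoid leaving partially written files or other state that needs cleanup.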
## Logging
External plugins may log to the stash server by writing to stderr. By default, data written to stderr will be logged by stash at the `error` level. This default behaviour can be changed by setting the `errLog` field in the plugin configuration file.
Plugins can log for specific levels or log progress by prefixing the output string with special control characters. See `pkg/plugin/common/log` for how this is done in go.
# Plugin configuration file format
The basic structure of an external plugin configuration file is as follows:
```
name: <plugin name>
description: <optional description of the plugin>
version: <optional version tag>
url: <optional url>
exec:
- <binary name>
- <other args...>
interface: [interface type]
errLog: [one of none, trace, debug, info, warning, error]
tasks:
- ...
```
The `name`, `description`, `version` and `url` fields are displayed on the plugins page.
## exec
For external plugins, the `exec` field is a list with the first element being the binary that will be executed, and the subsequent elements are the arguments passed. The execution process will search the path for the binary, then will attempt to find the program in the same directory as the plugin configuration file. The `exe` extension is not necessary on Windows systems.
> **⚠️ Note:** The plugin execution process sets the current working directory to that of the stash process.
Arguments can include the plugin's directory with the special string `{pluginDir}`.
For example, if the plugin executable `my_plugin` is placed in the `plugins` subdirectory and requires arguments `foo` and `bar`, then the `exec` part of the configuration would look like the following:
```
exec:
- my_plugin
- foo
- bar
```
Another example might use a python script to execute the plugin. Assuming the python script `foo.py` is placed in the same directory as the plugin config file, the `exec` fragment would look like the following:
```
exec:
- python
- "{pluginDir}/foo.py"
```
## interface
For external plugins, the `interface` field must be set to one of the following values:
* `rpc`
* `raw`
See the `Plugin interfaces` section above for details on these interface types.
The `interface` field defaults to `raw` if not provided.
## errLog
The `errLog` field tells stash what the default log level should be when the plugin outputs to stderr without encoding a log level. It defaults to the `error` level if not provided. This field is not necessary if the plugin outputs logging with the appropriate encoding. See the `Logging` section above for details.
# Task configuration
In addition to the standard task configuration, external plugin tasks may be configured with an optional `execArgs` field to add extra parameters to the execution arguments for the task.
For example:
```
tasks:
  - name: <operation name>
    description: <optional description>
    execArgs:
      - <arg to add to the exec line>
```


@@ -0,0 +1,12 @@
# Galleries
**Note:** images are now included during the scan process and are loaded independently of galleries. It is _no longer necessary_ to have images in zip files to be scanned into your library.
Galleries are automatically created from zip files found during scanning that contain images. It is also possible to automatically create galleries from folders containing images, by selecting the "Create galleries from folders containing images" checkbox in the Configuration page. It is also possible to manually create galleries.
For best results, images in zip files should be stored without compression (copy, store or no compression options, depending on the software you use; e.g. on Linux: `zip -0 -r gallery.zip foldertozip/`). This **heavily** impacts zip read performance.
If the filename of an image in the gallery zip file ends with `cover.jpg`, it will be treated as the cover: it is presented first in the gallery view page and used as the gallery cover in the gallery list view. If more than one image matches, the first one found in natural sort order is selected.
Images can be added to a gallery by navigating to the gallery's page, selecting the "Add" tab, querying for and selecting the images to add, then selecting "Add to Gallery" from the `...` menu button. Likewise, images may be removed from a gallery by selecting the "Images" tab, selecting the images to remove and selecting "Remove from Gallery" from the `...` menu button.


@@ -0,0 +1,7 @@
# Where to get further help
Join our [Discord](https://discord.gg/2TsNFKt).
The [Github wiki](https://github.com/stashapp/stash/wiki) covers some areas not covered in the in-app help.
Raise a [github issue](https://github.com/stashapp/stash/issues).


@@ -0,0 +1,31 @@
# Identify
This task iterates through your Scenes and attempts to identify the scene using a selection of scraping sources.
This task accepts one or more scraper sources. Valid scraper sources for the Identify task are stash-box instances, and scene scrapers which support scraping via Scene Fragment. The order of the sources may be rearranged.
For each Scene, the Identify task iterates through the scraper sources, in the order provided, and tries to identify the scene using each source. If a result is found in a source, then the Scene is updated, and no further sources are checked for that scene.
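The first-match-wins behaviour described above can be sketched as a nested loop. The `identify` function and the lookup callables are hypothetical stand-ins for the real scraper sources.

```python
def identify(scenes, sources):
    """Try each source in order for every scene; the first result wins.

    `sources` is an ordered list of (name, lookup) pairs, where lookup
    returns scraped data for a scene or None if nothing was found.
    """
    results = {}
    for scene in scenes:
        for name, lookup in sources:
            scraped = lookup(scene)
            if scraped is not None:
                results[scene] = (name, scraped)
                break  # no further sources are checked for this scene
    return results

# Hypothetical sources backed by dictionaries.
first_source = {"scene-a": {"title": "A"}}
second_source = {"scene-a": {"title": "A (alt)"}, "scene-b": {"title": "B"}}
sources = [("first", first_source.get), ("second", second_source.get)]
identified = identify(["scene-a", "scene-b", "scene-c"], sources)
```

In this example `scene-a` is taken from the first source even though the second also has a result, `scene-b` falls through to the second source, and `scene-c` is left unidentified.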
## Options
The following options can be set:
| Option | Description |
|--------|-------------|
| Include male performers | If false, then male performers will not be created or set on scenes. |
| Set cover images | If false, then scene cover images will not be modified. |
| Set organised flag | If true, the organised flag is set to true when a scene is identified. |
Field specific options may be set as well. Each field may have a Strategy. The behaviour for each strategy value is as follows:
| Strategy | Description |
|----------|-------------|
| Ignore | Not set. |
| Overwrite | Overwrite existing value. |
| Merge (*default*) | For multi-value fields, adds to existing values. For single-value fields, only sets if not already set. |
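The three strategies in the table can be expressed as a small function. This is a sketch of the described behaviour only; `apply_strategy` and its lowercase strategy names are hypothetical, and empty strings are assumed to count as "not set".

```python
def apply_strategy(strategy, existing, scraped, multi_value=False):
    """Return the new field value for a given strategy."""
    if strategy == "ignore":
        return existing  # field is not set from the scraped result
    if strategy == "overwrite":
        return scraped
    if strategy == "merge":
        if multi_value:
            # Add scraped values to the existing ones, skipping duplicates.
            merged = list(existing or [])
            for value in scraped or []:
                if value not in merged:
                    merged.append(value)
            return merged
        # Single-value fields are only set if not already set.
        return scraped if existing in (None, "") else existing
    raise ValueError("unknown strategy: %r" % (strategy,))
```

For example, merging tags `["a", "b"]` with scraped tags `["b", "c"]` gives `["a", "b", "c"]`, while merging a title that is already set leaves it unchanged.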
For Studio, Performers and Tags, an option is also available to Create Missing objects. This is false by default. When true, if a Studio/Performer/Tag is included during the identification process and does not exist in the system, then it will be created.
Default Options are applied to all sources unless overridden in specific source options.
The result of the identification process for each scene is output to the log.


@@ -0,0 +1,9 @@
# Interactivity
Stash currently supports syncing with Handy devices, using funscript files.
In order for stash to connect to your Handy device, the Handy Connection Key must be entered in Settings -> Interface.
Funscript files must be in the same directory as the matching video file and must have the same base name. For example, a funscript file for `video.mp4` must be named `video.funscript`. A scan must be run to update scenes with matching funscript files.
Scenes with funscript files can be filtered with the `interactive` criterion.
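The naming rule for funscript files amounts to swapping the video file's extension, as in this small sketch (`funscript_path` is a hypothetical helper, and POSIX-style paths are used only to keep the example deterministic):

```python
from pathlib import PurePosixPath

def funscript_path(video_path):
    """Return the expected funscript path for a video file.

    The funscript must sit in the same directory and share the base name.
    """
    return PurePosixPath(video_path).with_suffix(".funscript")

expected = str(funscript_path("/media/video.mp4"))
```

Here `expected` is `/media/video.funscript`, matching the `video.mp4` example above.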


@@ -0,0 +1,53 @@
# Interface Options
## Language
Setting the language affects the formatting of numbers and dates.
## Scene/Marker Wall Preview Type
The Scene Wall and Marker pages display scene preview videos by default. This can be changed to animated image (webp) or static image.
> **⚠️ Note:** scene/marker preview videos must be generated to see them in the applicable wall page if Video preview type is selected. Likewise, if Animated Image is selected, then Image Previews must be generated.
## Show Studios as text
By default, a scene's studio will be shown as an image overlay. Checking this option changes this to display studios as a text name instead.
## Scene Player options
By default, scene videos do not automatically start when navigating to the scenes page. Checking the "Auto-start video" option changes this to auto play scene videos.
The maximum loop duration option allows looping of shorter videos. Set this value to the maximum scene duration that scene videos should loop. Setting this to 0 disables this functionality.
## Custom CSS
The stash UI can be customised using custom CSS. See [here](https://github.com/stashapp/stash/wiki/Custom-CSS-snippets) for a community-curated set of CSS snippets to customise your UI.
[Stash Plex Theme](https://github.com/stashapp/stash/wiki/Theme-Plex) is a community created theme inspired by the popular Plex interface.
## Custom served folders
It is possible to expose specific folders to the UI. This configuration is performed manually in the `config.yml` file only.
Custom served content is exposed via the `/custom` URL path prefix.
For example, in the `config.yml` file:
```
custom_served_folders:
  /: D:\stash\static
  /foo: D:\bar
```
With the above configuration, a request for `/custom/foo/bar.png` would return `D:\bar\bar.png`. The `/` entry matches anything that is not otherwise mapped by the other entries. For example, `/custom/baz/xyz.png` would return `D:\stash\static\baz\xyz.png`.
Applications for this include using static images in custom css, like the Plex theme. For example, using the following config:
```yml
custom_served_folders:
  /: <stash folder>\custom
```
The `background.png` and `noise.png` files can be placed in the `custom` folder, then in the custom css, the `./background.png` and `./noise.png` strings can be replaced with `/custom/background.png` and `/custom/noise.png` respectively.
Other applications are to add custom UIs to stash, accessible via `/custom`.


@@ -0,0 +1,7 @@
# Introduction
Stash works by cataloging your media using the paths that you provide. Once you have [configured](/settings?tab=library) the locations where your media is stored, you can click the Scan button in [`Settings -> Tasks`](/settings?tab=tasks) and stash will begin scanning and importing your media into its library.
For the best experience, it is recommended that video previews and sprites are generated after a scan has finished. You can do this in [`Settings -> Tasks`](/settings?tab=tasks). Note that currently it is only possible to perform one task at a time and there is no task queue, so the Generate task should be performed after Scan is complete.
Once your media is imported, you are ready to begin creating Performers, Studios and Tags, and curating your content!


@@ -0,0 +1,497 @@
# Import/Export JSON Specification
The metadata given to Stash can be exported into the JSON format. This structure can be modified, or replicated by other means. The resulting data can then be imported again, giving the possibility for automatic scraping of all kinds. The format of this metadata bulk is a folder structure, containing the following folders:
* `downloads`
* `galleries`
* `performers`
* `scenes`
* `studios`
* `movies`
Additionally, it contains a `mappings.json` file.
The mappings file contains a reference to all files within the folders, by including their checksum. All files in the aforementioned folders are named by their checksum (like `967ddf2e028f10fc8d36901833c25732.json`), which (at least in the case of galleries and scenes) is generated from the file that this metadata relates to. The algorithm for the checksum is MD5.
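As a rough sketch, the file name for a metadata file could be derived as follows. It assumes the checksum is the MD5 digest of the complete media file contents; the function name `metadata_filename` is hypothetical and is not part of stash.

```python
import hashlib


def metadata_filename(media_path, chunk_size=1024 * 1024):
    """Return '<md5 hex digest>.json' for the file at media_path."""
    digest = hashlib.md5()
    with open(media_path, "rb") as handle:
        # Read in chunks so large video files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() + ".json"
```

The result has the same shape as the `967ddf2e028f10fc8d36901833c25732.json` example above: 32 hexadecimal characters followed by `.json`.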
# Content of the json files
The following sections list the fields of each JSON file. Numeric values are written with decimal places (like `29.98` or `50.0`), but still as strings. The meaning of most fields should be obvious from the previous explanation or from the possible values stash offers when editing; otherwise a short comment is added.
The JSON values are given as strings, unless stated otherwise. Each line stands for one value in the JSON. If the value is a list of objects, the fields of that object are shown indented.
If a value is empty in any file other than `mappings.json`, it can be left out of the file entirely. In `mappings.json`, however, all values must be present; if there are no objects of a type (for example, no performers), the value is simply null.
Many files have a `created_at` and an `updated_at` field. Both are kept in the following format:
```
YYYY-MM-DDThh:mm:ssTZD
```
Example:
```
"created_at": "2019-05-03T21:36:58+01:00"
```
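A timestamp in this format can be read and written with Python's standard library. The helper names `parse_timestamp` and `format_timestamp` are hypothetical, and the sketch assumes the value always carries a numeric offset such as `+01:00`, as in the example above.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse a 'YYYY-MM-DDThh:mm:ss+hh:mm' string into an aware datetime."""
    return datetime.fromisoformat(value)


def format_timestamp(moment):
    """Format a timezone-aware datetime back into the same layout."""
    # Drop microseconds so the output matches the seconds-only format.
    return moment.replace(microsecond=0).isoformat()


created = parse_timestamp("2019-05-03T21:36:58+01:00")
print(format_timestamp(created))
```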
## `mappings.json`
```
performers
  name
  checksum
studios
  name
  checksum
galleries
  path
  checksum
scenes
  path
  checksum
```
## Performer
```
name
url
twitter
instagram
birthdate
death_date
ethnicity
country
hair_color
eye_color
height
weight
measurements
fake_tits
career_length
tattoos
piercings
image (base64 encoding of the image file)
created_at
updated_at
rating (integer)
details
```
## Studio
```
name
url
image (base64 encoding of the image file)
created_at
updated_at
rating (integer)
details
```
## Scene
```
title
studio
url
date
rating (integer)
details
performers (list of strings, performers name)
tags (list of strings)
markers
  title
  seconds
  primary_tag
  tags (list of strings)
  created_at
  updated_at
file (not a list, but a single object)
  size (in bytes, no decimal places)
  duration (in seconds)
  video_codec (example value: h264)
  audio_codec (example value: aac)
  width (integer, in pixels)
  height (integer, in pixels)
  framerate
  bitrate (integer, in bits)
created_at
updated_at
```
## Gallery
No files of this kind are generated yet.
# In JSON format
For those preferring JSON Schema, defined [here](https://json-schema.org/), the following format may be more useful:
## mappings.json
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://github.com/stashapp/stash/wiki/JSON-Specification/mappings.json",
"title": "mappings",
"description": "The base file for the metadata. Referring to all other files with names, as well as providing the path to files.",
"type": "object",
"properties": {
"performers": {
"description": "Link to the performers files along with names",
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"checksum": {
"type": "string"
}
},
"required": ["name", "checksum"]
},
"minItems": 0,
"uniqueItems": true
},
"studios": {
"description": "Link to the studio files along with names",
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"checksum": {
"type": "string"
}
},
"required": ["name", "checksum"]
},
"minItems": 0,
"uniqueItems": true
},
"galleries": {
"description": "Link to the gallery files along with the path to the content",
"type": "array",
"items": {
"type": "object",
"properties": {
"path": {
"type": "string"
},
"checksum": {
"type": "string"
}
},
"required": ["path", "checksum"]
},
"minItems": 0,
"uniqueItems": true
},
"scenes": {
"description": "Link to the scene files along with the path to the content",
"type": "array",
"items": {
"type": "object",
"properties": {
"path": {
"type": "string"
},
"checksum": {
"type": "string"
}
},
"required": ["path", "checksum"]
},
"minItems": 0,
"uniqueItems": true
}
},
"required": ["performers", "studios", "galleries", "scenes"]
}
```
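A decoded `mappings.json` can be checked against the required fields of this schema with a few lines of Python. This is a hand-written sketch rather than a full JSON Schema validator, and the name `check_mappings` is hypothetical. It also accepts `null` for a section, following the earlier prose about empty types, even though the schema itself declares each section as an array.

```python
# Required keys per section, taken from the "required" lists in the schema above.
REQUIRED_ITEM_KEYS = {
    "performers": ("name", "checksum"),
    "studios": ("name", "checksum"),
    "galleries": ("path", "checksum"),
    "scenes": ("path", "checksum"),
}


def check_mappings(mappings):
    """Return a list of problems found in a decoded mappings.json object."""
    problems = []
    for section, keys in REQUIRED_ITEM_KEYS.items():
        if section not in mappings:
            problems.append(f"missing section: {section}")
            continue
        entries = mappings[section] or []  # assumption: null means "no objects"
        for index, entry in enumerate(entries):
            for key in keys:
                if not isinstance(entry.get(key), str):
                    problems.append(f"{section}[{index}]: '{key}' must be a string")
    return problems
```

An empty list means that every section is present and every entry carries its two required string fields.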
## performer.json
``` json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://github.com/stashapp/stash/wiki/JSON-Specification/performer.json",
"title": "performer",
"description": "A json file representing a performer. The file is named by a MD5 Code.",
"type": "object",
"properties": {
"name": {
"description": "Name of the performer",
"type": "string"
},
"url": {
"description": "URL to website of the performer",
"type": "string"
},
"twitter": {
"description": "Twitter name of the performer",
"type": "string"
},
"instagram": {
"description": "Instagram name of the performer",
"type": "string"
},
"birthdate": {
"description": "Birthdate of the performer. Format is YYYY-MM-DD",
"type": "string"
},
"death_date": {
"description": "Death date of the performer. Format is YYYY-MM-DD",
"type": "string"
},
"ethnicity": {
"description": "Ethnicity of the Performer. Possible values are black, white, asian or hispanic",
"type": "string"
},
"country": {
"description": "Country of the performer",
"type": "string"
},
"hair_color": {
"description": "Hair color of the performer",
"type": "string"
},
"eye_color": {
"description": "Eye color of the performer",
"type": "string"
},
"height": {
"description": "Height of the performer in centimeters",
"type": "string"
},
"weight": {
"description": "Weight of the performer in kilograms",
"type": "string"
},
"measurements": {
"description": "Measurements of the performer",
"type": "string"
},
"fake_tits": {
"description": "Whether performer has fake tits. Possible are Yes or No",
"type": "string"
},
"career_length": {
"description": "The time the performer has been in business. In the format YYYY-YYYY",
"type": "string"
},
"tattoos": {
"description": "Giving a description of Tattoos of the performer if any",
"type": "string"
},
"piercings": {
"description": "Giving a description of Piercings of the performer if any",
"type": "string"
},
"image": {
"description": "Image of the performer, parsed into base64",
"type": "string"
},
"created_at": {
"description": "The time this performer's data was added to the database. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
},
"updated_at": {
"description": "The time this performer's data was last changed in the database. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
},
"details": {
"description": "Description of the performer",
"type": "string"
}
},
"required": ["name", "ethnicity", "image", "created_at", "updated_at"]
}
```
## studio.json
``` json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://github.com/stashapp/stash/wiki/JSON-Specification/studio.json",
"title": "studio",
"description": "A json file representing a studio. The file is named by a MD5 Code.",
"type": "object",
"properties": {
"name": {
"description": "Name of the studio",
"type": "string"
},
"url": {
"description": "URL of the studio's website",
"type": "string"
},
"image": {
"description": "Logo of the studio, parsed into base64",
"type": "string"
},
"created_at": {
"description": "The time this studio's data was added to the database. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
},
"updated_at": {
"description": "The time this studio's data was last changed in the database. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
},
"details": {
"description": "Description of the studio",
"type": "string"
}
},
"required": ["name", "image", "created_at", "updated_at"]
}
```
## scene.json
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://github.com/stashapp/stash/wiki/JSON-Specification/scene.json",
"title": "scene",
"description": "A json file representing a scene. The file is named by the MD5 Code of the file its data is referring to.",
"type": "object",
"properties": {
"title": {
"description": "Title of the scene",
"type": "string"
},
"studio": {
"description": "The name of the studio that produced that scene",
"type": "string"
},
"url": {
"description": "The URL of the scene's original source",
"type": "string"
},
"date": {
"description": "The release date of the scene. It is given in the format YYYY-MM-DD",
"type": "string"
},
"rating": {
"description": "The scene's rating. It is given in stars, from 1 to 5",
"type": "integer"
},
"details": {
"description": "A description of the scene, containing things like the story arc",
"type": "string"
},
"performers": {
"description": "A list of names of the performers in this scene",
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"uniqueItems": true
},
"tags": {
"description": "A list of the tags associated with this scene",
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"uniqueItems": true
},
"markers": {
"description": "Markers mark certain events in the scene, most often the change of the position. They are attributed with their own tags.",
"type": "array",
"items": {
"type": "object",
"properties": {
"title": {
"description": "Searchable name of the marker",
"type": "string"
},
"seconds": {
"description": "At what second the marker is set. It is given with after comma values, such as 10.0 or 17.5",
"type": "string"
},
"primary_tag": {
"description": "A tag identifying this marker. Multiple markers from the same scene with the same primary tag are concatenated, showing them as similar in nature",
"type": "string"
},
"tags": {
"description": "A list of the tags associated with this marker",
"type": "array",
"items": {
"type": "string"
},
"minItems": 1,
"uniqueItems": true
},
"created_at": {
"description": "The time this marker was added to the database. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
},
"updated_at": {
"description": "The time this marker was updated the last time. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
}
},
"required": ["seconds", "primary_tag", "created_at", "updated_at"]
},
"minItems": 1,
"uniqueItems": true
},
"file": {
"description": "Some technical data about the scene's file.",
"type": "object",
"properties": {
"size": {
"description": "The size of the file in bytes",
"type": "string"
},
"duration": {
"description": "Duration of the scene in seconds. It is given with after comma values, such as 10.0 or 17.5",
"type": "string"
},
"video_codec": {
"description": "The coding of the video part of the scene file. An example would be h264",
"type": "string"
},
"audio_codec": {
"description": "The coding of the audio part of the scene file. An example would be aac",
"type": "string"
},
"width": {
"description": "The width of the scene in pixels",
"type": "integer"
},
"height": {
"description": "The height of the scene in pixels",
"type": "integer"
},
"framerate": {
"description": "Framerate of the scene. It is given with after comma values, such as 29.95",
"type": "string"
},
"bitrate": {
"description": "The bitrate of the video, in bits",
"type": "integer"
}
},
"required": ["size", "duration", "video_codec", "audio_codec", "height", "width", "framerate", "bitrate"]
},
"created_at": {
"description": "The time this scene's data was added to the database. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
},
"updated_at": {
"description": "The time this scene's data was last changed in the database. Format is YYYY-MM-DDThh:mm:ssTZD",
"type": "string"
}
},
"required": ["file", "created_at", "updated_at"]
}
```
## Gallery
No files of this kind are created here yet.

View File

@@ -0,0 +1,173 @@
# Keyboard Shortcuts
## Global shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `?` | Display manual |
### Global Navigation
| Keyboard sequence | Target page |
|-------------------|--------|
| `g s` | Scenes |
| `g i` | Images |
| `g v` | Movies |
| `g k` | Markers |
| `g l` | Galleries |
| `g p` | Performers |
| `g u` | Studios |
| `g t` | Tags |
| `g z` | Settings |
## Query page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `/` | Focus search field |
| `f` | Show Add Filter dialog |
| `r` | Reshuffle if sorted by random |
| `v g` | Set view to grid |
| `v l` | Set view to list |
| `v w` | Set view to wall |
| `+` | Increase zoom slider |
| `-` | Decrease zoom slider |
| `←` | Previous page of results |
| `→` | Next page of results |
| `Shift + ←` | Go to current results page -10 |
| `Shift + →` | Go to current results page +10 |
| `Ctrl + Home` | Go to first page of results |
| `Ctrl + End` | Go to last page of results |
| `s a` | Select all on page |
| `s n` | Unselect all |
| `e` | Edit selected |
| `d d` | Delete selected |
## Scenes page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `p r` | Play random scene |
## Scene page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `a` | Details tab |
| `q` | Queue tab |
| `k` | Markers tab |
| `i` | File info tab |
| `e` | Edit tab |
| `,` | Hide/Show sidebar |
| `.` | Hide/Show scene scrubber |
| `o` | Increment O-Counter |
| `p n` | Play next scene in queue |
| `p p` | Play previous scene in queue |
| `p r` | Play random scene in queue |
| `{1-9}` | Seek to 10-90% duration |
| `[` | Scrub backwards 10% duration |
| `]` | Scrub forwards 10% duration |
### Scene Markers tab shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `n` | Display Create Markers dialog |
### Edit Scene tab shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `r {1-5}` | Set rating |
| `r 0` | Unset rating |
| `s s` | Save Scene |
| `d d` | Delete Scene |
| `Ctrl + v` | Paste Scene cover |
[//]: # "Commented until implementation is dealt with"
[//]: # "(| `l` | Focus Gallery selector |)"
[//]: # "(| `u` | Focus Studio selector |)"
[//]: # "(| `p` | Focus Performers selector |)"
[//]: # "(| `v` | Focus Movies selector |)"
[//]: # "(| `t` | Focus Tags selector |)"
## Movies Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `n` | New Movie |
## Movie Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `e` | Edit Movie |
| `s s` | Save Movie |
| `d d` | Delete Movie |
| `r {1-5}` | Set rating (in edit mode) |
| `r 0` | Unset rating (in edit mode) |
| `Ctrl + v` | Paste Movie image |
[//]: # "Commented until implementation is dealt with"
[//]: # "(| `u` | Focus Studio selector (in edit mode) |)"
## Markers Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `p r` | Play random marker |
## Performers Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `n` | New Performer |
| `p r` | Open random Performer |
## Performer Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `a` | Details tab |
| `c` | Scenes tab |
| `e` | Edit tab |
| `o` | Operations tab |
| `f` | Toggle favourite |
### Edit Performer tab shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `s s` | Save Performer |
| `d d` | Delete Performer |
| `Ctrl + v` | Paste Performer image |
## Studios Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `n` | New Studio |
## Studio Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `e` | Edit Studio |
| `s s` | Save Studio |
| `d d` | Delete Studio |
| `Ctrl + v` | Paste Studio image |
## Tags Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `n` | New Tag |
## Tag Page shortcuts
| Keyboard sequence | Action |
|-------------------|--------|
| `e` | Edit Tag |
| `s s` | Save Tag |
| `d d` | Delete Tag |
| `Ctrl + v` | Paste Tag image |

View File

@@ -0,0 +1,177 @@
# Plugins
Stash supports running tasks via plugins. Plugins can be implemented using embedded Javascript, or by calling an external binary.
Stash also supports triggering of plugin hooks from specific stash operations.
> **⚠️ Note:** Plugin support is still experimental and is likely to change.
# Adding plugins
By default, Stash looks for plugin configurations in the `plugins` sub-directory of the directory where the stash `config.yml` is read. This will either be the `$HOME/.stash` directory or the current working directory.
Plugins are added by adding configuration yaml files (format: `pluginName.yml`) to the `plugins` directory.
Loaded plugins can be viewed in the Plugins page of the Settings. After plugins are added, removed or edited while stash is running, they can be reloaded by clicking the `Reload Plugins` button.
# Using plugins
Plugins provide tasks which can be run from the Tasks page.
# Creating plugins
See [External Plugins](/help/ExternalPlugins.md) for details on making external plugins.
See [Embedded Plugins](/help/EmbeddedPlugins.md) for details on making embedded plugins.
## Plugin input
Plugins may accept an input from the stash server. This input is encoded according to the interface, and has the following structure (presented here in JSON format):
```
{
"server_connection": {
"Scheme": "http",
"Port": 9999,
"SessionCookie": {
"Name":"session",
"Value":"cookie-value",
"Path":"",
"Domain":"",
"Expires":"0001-01-01T00:00:00Z",
"RawExpires":"",
"MaxAge":0,
"Secure":false,
"HttpOnly":false,
"SameSite":0,
"Raw":"",
"Unparsed":null
},
"Dir": <path to stash config directory>,
"PluginDir": <path to plugin config directory>
},
"args": {
"argKey": "argValue"
}
}
```
The `server_connection` field contains all the information needed for a plugin to access the parent stash server, if necessary.
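A plugin written in Python might turn this input into connection details along the following lines. This is a sketch under stated assumptions: it assumes the input arrives as JSON on stdin, that the server is reachable on `localhost`, and that the API is served at `/graphql`. None of these is stated in this section, and the function names are hypothetical.

```python
import json
import sys


def read_plugin_input(stream=sys.stdin):
    """Decode the plugin input (assumed here to be JSON on stdin)."""
    return json.load(stream)


def connection_details(plugin_input, host="localhost"):
    """Build a URL and request headers from the server_connection object."""
    conn = plugin_input["server_connection"]
    # Assumptions: the host name and the "/graphql" path are illustrative only.
    url = f"{conn['Scheme']}://{host}:{conn['Port']}/graphql"

    headers = {}
    cookie = conn.get("SessionCookie") or {}
    if cookie.get("Name"):
        # Send the session cookie back so the request is tied to the session.
        headers["Cookie"] = f"{cookie['Name']}={cookie['Value']}"
    return url, headers
```

For the sample input above, this returns `http://localhost:9999/graphql` together with a `Cookie: session=cookie-value` header.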
## Plugin output
Plugin output is expected in the following structure (presented here as JSON format):
```
{
"error": <optional error string>,
"output": <anything>
}
```
The `error` field is logged in stash at the `error` log level if present. The `output` is written at the `debug` log level.
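A minimal Python sketch of producing this structure is shown below. The helper name `build_plugin_output` is hypothetical, and writing the result as JSON to stdout is an assumption about how the plugin interface is used.

```python
import json


def build_plugin_output(output=None, error=None):
    """Serialise the plugin result; 'error' is only included when set."""
    result = {"output": output}
    if error is not None:
        result["error"] = error
    return json.dumps(result)


print(build_plugin_output(output="ok"))
```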
## Task configuration
Tasks are configured using the following structure:
```
tasks:
  - name: <operation name>
    description: <optional description>
    defaultArgs:
      argKey: argValue
```
A plugin configuration may contain multiple tasks.
The `defaultArgs` field is used to add inputs to the plugin input sent to the plugin.
## Hook configuration
Stash supports executing plugin operations via triggering of a hook during a stash operation.
Hooks are configured using a similar structure to tasks:
```
hooks:
  - name: <operation name>
    description: <optional description>
    triggeredBy:
      - <trigger types>...
    defaultArgs:
      argKey: argValue
```
**Note:** it is possible for hooks to trigger each other or themselves if they perform mutations. For safety, hooks will not be triggered if they have already been triggered in the context of the operation. Stash uses cookies to track this context, so it's important for plugins to send cookies when performing operations.
### Trigger types
Trigger types use the following format:
`<object type>.<operation>.<hook type>`
For example, a post-hook on a scene create operation will be `Scene.Create.Post`.
The following object types are supported:
* `Scene`
* `SceneMarker`
* `Image`
* `Gallery`
* `Movie`
* `Performer`
* `Studio`
* `Tag`
The following operations are supported:
* `Create`
* `Update`
* `Destroy`
* `Merge` (for `Tag` only)
Currently, only `Post` hook types are supported. These are executed after the operation has completed and the transaction is committed.
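The trigger type format and the lists above can be checked with a small Python helper. This is an illustrative sketch, not part of stash; the name `parse_trigger_type` is hypothetical.

```python
OBJECT_TYPES = {
    "Scene", "SceneMarker", "Image", "Gallery",
    "Movie", "Performer", "Studio", "Tag",
}
OPERATIONS = {"Create", "Update", "Destroy", "Merge"}
HOOK_TYPES = {"Post"}


def parse_trigger_type(trigger):
    """Split '<object type>.<operation>.<hook type>' and validate each part."""
    parts = trigger.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated parts: {trigger!r}")
    object_type, operation, hook_type = parts
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"unsupported object type: {object_type}")
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    if operation == "Merge" and object_type != "Tag":
        raise ValueError("Merge is only supported for Tag")
    if hook_type not in HOOK_TYPES:
        raise ValueError(f"unsupported hook type: {hook_type}")
    return object_type, operation, hook_type
```

For example, `parse_trigger_type("Scene.Create.Post")` yields the three components, while `Scene.Merge.Post` is rejected because `Merge` applies to `Tag` only.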
### Hook input
Plugin tasks triggered by a hook include an argument named `hookContext` in the `args` object structure. The `hookContext` is structured as follows:
```
{
"id": <object id>,
"type": <trigger type>,
"input": <operation input>,
"inputFields": <fields included in input>
}
```
The `input` field contains the JSON graphql input passed to the original operation. This will differ between operations. For hooks triggered by operations in a scan or clean, the input will be nil. `inputFields` is populated in update operations to indicate which fields were passed to the operation, to differentiate between missing and empty fields.
For example, here are the `args` values for a Scene update operation:
```
{
"hookContext": {
"type":"Scene.Update.Post",
"id":45,
"input":{
"clientMutationId":null,
"id":"45",
"title":null,
"details":null,
"url":null,
"date":null,
"rating":null,
"organized":null,
"studio_id":null,
"gallery_ids":null,
"performer_ids":null,
"movies":null,
"tag_ids":["21"],
"cover_image":null,
"stash_ids":null
},
"inputFields":[
"tag_ids",
"id"
]
}
}
```
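In a Python plugin, `inputFields` can be used to pick out only the fields that were really passed to the operation. The sketch below is illustrative, and the function name `passed_fields` is hypothetical.

```python
def passed_fields(args):
    """Return {field: value} for the fields named in hookContext.inputFields."""
    context = args["hookContext"]
    hook_input = context.get("input") or {}  # nil for scan/clean operations
    fields = context.get("inputFields") or []
    return {name: hook_input.get(name) for name in fields}
```

Applied to the example above, this returns only `tag_ids` and `id`. Fields such as `title` are `null` in `input` but are not listed in `inputFields`, so they were not part of the update and should not be treated as cleared.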

View File

@@ -0,0 +1,63 @@
# Scene Filename Parser
[This tool](/sceneFilenameParser) parses the scene filenames in your library and allows setting the metadata from those filenames.
## Parser Options
To use this tool, a filename pattern must be entered. The pattern accepts the following fields:
| Field | Remark |
|-------|--------|
| `title` | Text captured within is set as the title of the scene. |
|`ext`|Matches the end of the filename. It is not captured. Does not include the last `.` character.|
|`d`|Matches delimiter characters (`-_.`). Not captured.|
|`i`|Matches any ignored word entered in the `Ignored words` field. Ignored words are entered as space-delimited words. Not captured. Use this to match release artifacts like `DVDRip` or release groups.|
|`date`|Matches `yyyy-mm-dd` and sets the date of the scene.|
|`rating`|Matches a single digit and sets the rating of the scene.|
|`performer`| Sets the scene performer, based on the text captured.|
|`tag`| Sets the scene tag, based on the text captured.|
|`studio`| Sets the scene studio, based on the text captured.|
|`{}`|Matches any characters. Not captured.|
> **⚠️ Note:** `performer`, `tag` and `studio` fields will only match against Performers/Tags/Studios that already exist in the system.
The `performer`/`tag`/`studio` fields will remove any delimiter characters (`.-_`) before querying. Name matching is case-insensitive.
The following partial date fields are also supported. The date will only be set on the scene if a date string can be built using the partial date components:
| Field | Remark |
|-------|--------|
|`yyyy`|Four digit year|
|`yy`|Two digit year. Assumes the first two digits are `20`|
|`mm`|Two digit month|
|`mmm`|Three letter month, such as `Jan` (case-insensitive)|
|`dd`|Two digit date|
The following full date fields are supported, using the same partial date rules as above:
* `yyyymmdd`
* `yymmdd`
* `ddmmyyyy`
* `ddmmyy`
* `mmddyyyy`
* `mmddyy`
All of these fields are available from the `Add Field` button.
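The partial date rules can be illustrated with a short Python sketch. It assumes a date is only produced when a year, a month and a day are all available and form a valid calendar date; the function name `build_date` and the dictionary input are hypothetical and do not reflect how the parser is implemented.

```python
import datetime

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def build_date(parts):
    """Build 'YYYY-MM-DD' from captured yyyy/yy, mm/mmm and dd values, or None."""
    if "yyyy" in parts:
        year = int(parts["yyyy"])
    elif "yy" in parts:
        year = 2000 + int(parts["yy"])  # two-digit years assume the prefix 20
    else:
        return None

    if "mm" in parts:
        month = int(parts["mm"])
    elif "mmm" in parts:
        name = parts["mmm"].lower()  # three-letter months are case-insensitive
        if name not in MONTHS:
            return None
        month = MONTHS.index(name) + 1
    else:
        return None

    if "dd" not in parts:
        return None
    try:
        return datetime.date(year, month, int(parts["dd"])).isoformat()
    except ValueError:
        return None  # e.g. month 13 or day 32: no date string can be built
```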
Title generation also has the following options:
| Option | Remark |
|--------|--------|
|Whitespace characters| These characters are replaced with whitespace (defaults to `._`, to handle filenames like `three.word.title.avi`)|
|Capitalize title| Capitalises the first letter of each word|
The fields to display can be customised with the `Display Fields` drop-down section. By default, any field with new/different values will be displayed.
## Applying the results
Once the options are correct, click on the `Find` button. The system will search for scenes that have filenames that match the given pattern.
The results are presented in a table showing the existing and generated values of the discovered fields, along with a checkbox to determine whether or not the field will be set on each scene. These fields can also be edited manually.
The `Apply` button updates the scenes based on the set fields.
> **⚠️ Note:** results are paged and the `Apply` button only applies to scenes on the current page.

View File

@@ -0,0 +1,839 @@
# Contributing Scrapers
Scrapers can be contributed to the community by creating a PR in [this repository](https://github.com/stashapp/CommunityScrapers/pulls).
# Scraper configuration file format
```yaml
name: <site>
performerByName:
  <single scraper config>
performerByFragment:
  <single scraper config>
performerByURL:
  <multiple scraper URL configs>
sceneByName:
  <single scraper config>
sceneByQueryFragment:
  <single scraper config>
sceneByFragment:
  <single scraper config>
sceneByURL:
  <multiple scraper URL configs>
movieByURL:
  <multiple scraper URL configs>
galleryByFragment:
  <single scraper config>
galleryByURL:
  <multiple scraper URL configs>
<other configurations>
```
`name` is mandatory, all other top-level fields are optional. The inclusion of each top-level field determines what capabilities the scraper has.
A scraper configuration in any of the top-level fields must at least have an `action` field. The other fields are required based on the value of the `action` field.
The scraping types and their required fields are outlined in the following table:
| Behavior | Required configuration |
|-----------|------------------------|
| Scraper in `Scrape...` dropdown button in Performer Edit page | Valid `performerByName` and `performerByFragment` configurations. |
| Scrape performer from URL | Valid `performerByURL` configuration with matching URL. |
| Scraper in query dropdown button in Scene Edit page | Valid `sceneByName` and `sceneByQueryFragment` configurations. |
| Scraper in `Scrape...` dropdown button in Scene Edit page | Valid `sceneByFragment` configuration. |
| Scrape scene from URL | Valid `sceneByURL` configuration with matching URL. |
| Scrape movie from URL | Valid `movieByURL` configuration with matching URL. |
| Scraper in `Scrape...` dropdown button in Gallery Edit page | Valid `galleryByFragment` configuration. |
| Scrape gallery from URL | Valid `galleryByURL` configuration with matching URL. |
URL-based scraping accepts multiple scrape configurations, and each configuration requires a `url` field. stash iterates through these configurations, attempting to match the entered URL against the `url` fields in the configuration. It executes the first scraping configuration where the entered URL contains the value of the `url` field.
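The matching rule can be sketched in Python as follows. The function name `find_url_scraper` is hypothetical; each configuration is modelled as a dictionary whose `url` value is a list of strings, as in the YAML examples in this document.

```python
def find_url_scraper(configs, entered_url):
    """Return the first configuration with a `url` value contained in entered_url."""
    for config in configs:
        for fragment in config.get("url", []):
            if fragment in entered_url:
                return config
    return None


configs = [
    {"action": "scrapeXPath", "url": ["example.com/videos/"], "scraper": "videoScraper"},
    {"action": "scrapeXPath", "url": ["example.com"], "scraper": "fallbackScraper"},
]
match = find_url_scraper(configs, "https://www.example.com/videos/123")
print(match["scraper"])
```

Because the first match wins, more specific `url` values should be listed before more general ones. The `example.com` entries here are placeholders.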
## Actions
### Script
Executes a script to perform the scrape. The `script` field is required for this action and accepts a list of string arguments. For example:
```yaml
action: script
script:
  - python
  - iafdScrape.py
  - query
```
If the script specifies the python executable, Stash will find the correct python executable for your system, either `python` or `python3`. So, for example, this configuration could execute `python iafdScrape.py query` or `python3 iafdScrape.py query`.
`python3` is looked for first; if it is not found, `python` is checked. If neither is found, you will get an error.
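The lookup order can be sketched in Python with `shutil.which`. This illustrates the documented behaviour only and is not stash's code. The function name `resolve_python` is hypothetical, and treating a first argument of either `python` or `python3` as "specifying the python executable" is an assumption.

```python
import shutil


def resolve_python(command, which=shutil.which):
    """Replace a leading python/python3 with whichever executable is found first."""
    if not command or command[0] not in ("python", "python3"):
        return list(command)
    # python3 is preferred; python is the fallback.
    for candidate in ("python3", "python"):
        if which(candidate):
            return [candidate] + list(command[1:])
    raise FileNotFoundError("neither python3 nor python was found")
```

The `which` parameter exists only so that the lookup can be replaced in tests; by default the system `PATH` is searched.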
Stash sends data to the script process's `stdin` stream and expects the output to be streamed to the `stdout` stream. Any errors and progress messages should be output to `stderr`.
The script is sent input and expects output based on the scraping type, as detailed in the following table:
| Scrape type | Input | Output |
|-------------|-------|--------|
| `performerByName` | `{"name": "<performer query string>"}` | Array of JSON-encoded performer fragments (including at least `name`) |
| `performerByFragment` | JSON-encoded performer fragment | JSON-encoded performer fragment |
| `performerByURL` | `{"url": "<url>"}` | JSON-encoded performer fragment |
| `sceneByName` | `{"name": "<scene query string>"}` | Array of JSON-encoded scene fragments |
| `sceneByQueryFragment`, `sceneByFragment` | JSON-encoded scene fragment | JSON-encoded scene fragment |
| `sceneByURL` | `{"url": "<url>"}` | JSON-encoded scene fragment |
| `movieByURL` | `{"url": "<url>"}` | JSON-encoded movie fragment |
| `galleryByFragment` | JSON-encoded gallery fragment | JSON-encoded gallery fragment |
| `galleryByURL` | `{"url": "<url>"}` | JSON-encoded gallery fragment |
For `performerByName`, only `name` is required in the returned performer fragments. One entire object is sent back to `performerByFragment` to scrape a specific performer, so the other fields may be included to assist in scraping a performer. For example, the `url` field may be filled in for the specific performer page, then `performerByFragment` can extract by using its value.
Python example of a performer Scraper:
```python
import json
import sys


def readJSONInput():
    # stash sends the scraper input as JSON on stdin
    input = sys.stdin.read()
    return json.loads(input)


def searchPerformer(name):
    # perform scraping here - using name for the query
    # fill in the output
    ret = []

    # example shown for a single found performer
    p = {}
    p['name'] = "some name"
    p['url'] = "performer url"
    ret.append(p)

    return ret


def scrapePerformer(input):
    # get the url from the input
    url = input['url']
    return scrapePerformerURL(url)


def debugPrint(t):
    # errors and progress messages go to stderr, not stdout
    sys.stderr.write(t + "\n")


def scrapePerformerURL(url):
    debugPrint("Reading url...")
    debugPrint("Parsing html...")

    # parse html
    # fill in performer details - single object
    ret = {}
    ret['name'] = "fred"
    ret['aliases'] = "freddy"
    ret['ethnicity'] = ""
    # and so on

    return ret


# read the input
i = readJSONInput()

# the first script argument selects the operation
if sys.argv[1] == "query":
    ret = searchPerformer(i['name'])
    print(json.dumps(ret))
elif sys.argv[1] == "scrape":
    ret = scrapePerformer(i)
    print(json.dumps(ret))
elif sys.argv[1] == "scrapeURL":
    ret = scrapePerformerURL(i['url'])
    print(json.dumps(ret))
```
### scrapeXPath
This action scrapes a web page using an xpath configuration to parse. This action is **not valid** for `performerByFragment`.
This action requires that the top-level `xPathScrapers` configuration is populated. The `scraper` field is required and must match the name of a scraper name configured in `xPathScrapers`. For example:
```yaml
sceneByURL:
  - action: scrapeXPath
    url:
      - pornhub.com/view_video.php
    scraper: sceneScraper
```
The above configuration requires that `sceneScraper` exists in the `xPathScrapers` configuration.
XPath scraping configurations specify the mapping between object fields and an xpath selector. The xpath scraper scrapes the applicable URL and uses xpath to populate the object fields.
### scrapeJson
This action works in the same way as `scrapeXPath`, but uses a mapped json configuration to parse. It uses the top-level `jsonScrapers` configuration. This action is **not valid** for `performerByFragment`.
JSON scraping configurations specify the mapping between object fields and a GJSON selector. The JSON scraper scrapes the applicable URL and uses [GJSON](https://github.com/tidwall/gjson/blob/master/SYNTAX.md) to parse the returned JSON object and populate the object fields.
### scrapeXPath and scrapeJson use with `performerByName`
For `performerByName`, the `queryURL` field must also be present. This field is the URL used to search for performer names. The placeholder string sequence `{}` is replaced with the performer name search string. For the subsequent performer scrape to work, the `URL` field must be filled in with the URL of the performer page that matches a URL given in a `performerByURL` scraping configuration. For example:
```yaml
name: Boobpedia
performerByName:
  action: scrapeXPath
  queryURL: http://www.boobpedia.com/wiki/index.php?title=Special%3ASearch&search={}&fulltext=Search
  scraper: performerSearch
performerByURL:
  - action: scrapeXPath
    url:
      - boobpedia.com/boobs/
    scraper: performerScraper
xPathScrapers:
  performerSearch:
    performer:
      Name: # name element
      URL: # URL element that matches the boobpedia.com/boobs/ URL above
  performerScraper:
    # ... performer scraper details ...
```
### scrapeXPath and scrapeJson use with `sceneByFragment` and `sceneByQueryFragment`
For `sceneByFragment` and `sceneByQueryFragment`, the `queryURL` field must also be present. This field is used to build a query URL for scenes. For `sceneByFragment`, the `queryURL` field supports the following placeholder fields:
* `{checksum}` - the MD5 checksum of the scene
* `{oshash}` - the oshash of the scene
* `{filename}` - the base filename of the scene
* `{title}` - the title of the scene
* `{url}` - the url of the scene
These placeholder field values may be manipulated with regex replacements by adding a `queryURLReplace` section, containing a map of placeholder field to regex configuration which uses the same format as the `replace` post-process action covered below.
For example:
```yaml
sceneByFragment:
  action: scrapeJson
  scraper: sceneQueryScraper
  queryURL: https://metadataapi.net/api/scenes?parse={filename}&limit=1
  queryURLReplace:
    filename:
      - regex: <some regex>
        with: <replacement>
```
The above configuration would scrape from the value of `queryURL`, replacing `{filename}` with the base filename of the scene, after it has been manipulated by the regex replacements.
### scrapeXPath and scrapeJson use with `<scene|performer|gallery|movie>ByURL`
For `sceneByURL`, `performerByURL` and `galleryByURL`, the `queryURL` field can also be present in order to use `queryURLReplace`. The functionality is the same as for `sceneByFragment`, but the only placeholder field available is `{url}`:
* `{url}` - the url of the scene/performer/gallery
```yaml
sceneByURL:
  - action: scrapeJson
    url:
      - metartnetwork.com
    scraper: sceneScraper
    queryURL: "{url}"
    queryURLReplace:
      url:
        - regex: '^(?:.+\.)?([^.]+)\.com/.+movie/(\d+)/(\w+)/?$'
          with: https://www.$1.com/api/movie?name=$3&date=$2
```
### Stash
A different stash server can be configured as a scraping source. This action applies only to `performerByName`, `performerByFragment`, and `sceneByFragment` types. This action requires that the top-level `stashServer` field is configured.
`stashServer` contains a single `url` field for the remote stash server. The username and password can be embedded in this string using `username:password@host`.
An example stash scrape configuration is below:
```yaml
name: stash
performerByName:
  action: stash
performerByFragment:
  action: stash
sceneByFragment:
  action: stash
stashServer:
  url: http://stashserver.com:9999
```
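As described above, credentials can be embedded in the `url` string. A minimal sketch, assuming a hypothetical account `myuser` with password `mypassword` on the same placeholder host:

```yaml
stashServer:
  # myuser and mypassword are placeholders, not real credentials
  url: http://myuser:mypassword@stashserver.com:9999
```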
## Xpath and JSON scrapers configuration
The top-level `xPathScrapers` field contains xpath scraping configurations, freely named. These are referenced in the `scraper` field for `scrapeXPath` scrapers.
Likewise, the top-level `jsonScrapers` field contains json scraping configurations.
Collectively, these configurations are known as mapped scraping configurations.
A mapped scraping configuration may contain a `common` field, and must contain `performer`, `scene`, `movie` or `gallery` depending on the scraping type it is configured for.
Within the `performer`/`scene`/`movie`/`gallery` field are key/value pairs corresponding to the [golang fields](/help/ScraperDevelopment.md#object-fields) on the performer/scene object. These fields are case-sensitive.
The values of these may be either a simple selector value, which tells the system where to get the value of the field from, or a more advanced configuration (see below). For example, for an xpath configuration:
```yaml
performer:
  Name: //h1[@itemprop="name"]
```
This will set the `Name` attribute of the returned performer to the text content of the element that matches `<h1 itemprop="name">...`.
For a json configuration:
```yaml
performer:
  Name: data.name
```
The value may also be a sub-object. If it is a sub-object, then the selector must be set to the `selector` key of the sub-object. For example, using the same xpath as above:
```yaml
performer:
  Name:
    selector: //h1[@itemprop="name"]
    postProcess:
      # post-processing config values
```
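To show where these fragments sit in a scraper file, the following sketch places them under the top-level `xPathScrapers` and `jsonScrapers` fields. The scraper names `examplePerformerScraper` and `exampleJsonScraper` are hypothetical. Any free-form name can be used, as long as the `scraper` field of the corresponding action refers to it.

```yaml
xPathScrapers:
  examplePerformerScraper: # hypothetical name, referenced by a scrapeXPath action
    performer:
      Name: //h1[@itemprop="name"]
jsonScrapers:
  exampleJsonScraper: # hypothetical name, referenced by a scrapeJson action
    performer:
      Name: data.name
```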
### Fixed attribute values
Alternatively, an attribute value may be set to a fixed value, rather than scraping it from the webpage. This can be done by replacing `selector` with `fixed`. For example:
```yaml
performer:
  Gender:
    fixed: Female
```
### Common fragments
The `common` field is used to configure selector fragments that can be referenced in the selector strings. These are key-value pairs where the key is the string to reference the fragment, and the value is the string that the fragment will be replaced with. For example:
```yaml
common:
  $infoPiece: //div[@class="infoPiece"]/span
performer:
  Measurements: $infoPiece[text() = 'Measurements:']/../span[@class="smallInfo"]
```
The `Measurements` xpath string will replace `$infoPiece` with `//div[@class="infoPiece"]/span`, resulting in: `//div[@class="infoPiece"]/span[text() = 'Measurements:']/../span[@class="smallInfo"]`.
> **⚠️ Note:** Recursive common fragments are **not** supported.
Referencing a common fragment within another common fragment will cause an error. For example:
```yaml
common:
  $info: //div[@class="info"]
  # Referencing $info in $models will cause an error
  $models: $info/a[@class="model"]
scene:
  Title: $info/h1
  Performers:
    Name: $models
    URL: $models/@href
```
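One way to avoid the error, sketched here with the same selectors, is to write each fragment out in full instead of building one fragment from another:

```yaml
common:
  $info: //div[@class="info"]
  # $models repeats the $info path instead of referencing it
  $models: //div[@class="info"]/a[@class="model"]
scene:
  Title: $info/h1
  Performers:
    Name: $models
    URL: $models/@href
```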
### Post-processing options
Post-processing operations are contained in the `postProcess` key. Post-processing operations are performed in the order they are specified. The following post-processing operations are available:
* `feetToCm`: converts a string containing feet and inches numbers into centimeters. Looks for up to two separate integers and interprets the first as the number of feet, and the second as the number of inches. The numbers can be separated by any non-numeric character including the `.` character. It does not handle decimal numbers. For example `6.3` and `6ft3.3` would both be interpreted as 6 feet, 3 inches before converting into centimeters.
* `lbToKg`: converts a string containing lbs to kg.
* `map`: contains a map of input values to output values. Where a value matches one of the input values, it is replaced with the matching output value. If no value is matched, then the value is left unmodified.
Example:
```yaml
performer:
  Gender:
    selector: //div[@class="example element"]
    postProcess:
      - map:
          F: Female
          M: Male
  Height:
    selector: //span[@id="height"]
    postProcess:
      - feetToCm: true
  Weight:
    selector: //span[@id="weight"]
    postProcess:
      - lbToKg: true
```
Gets the contents of the selected div element, and sets the returned value to `Female` if the scraped value is `F`; `Male` if the scraped value is `M`.
Height and weight are extracted from the selected spans and converted to `cm` and `kg`.
* `parseDate`: if present, the value is the date format using go's reference date (2006-01-02). For example, if an example date was `14-Mar-2003`, then the date format would be `02-Jan-2006`. See the [time.Parse documentation](https://golang.org/pkg/time/#Parse) for details. When present, the scraper will convert the input string into a date, then convert it to the string format used by stash (`YYYY-MM-DD`). Strings "Today", "Yesterday" are matched (case insensitive) and converted by the scraper so you don't need to edit/replace them.
* `subtractDays`: if set to `true` it subtracts the value in days from the current date and returns the resulting date in stash's date format.
Example:
```yaml
Date:
  selector: //strong[contains(text(),"Added:")]/following-sibling::text()
  postProcess:
    - replace:
        - regex: (\d+)\sdays\sago.+
          with: $1
    - subtractDays: true
```
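A `parseDate` operation can be sketched in the same way. This assumes a hypothetical page element `//span[@class="release-date"]` containing a date such as `14-Mar-2003`, which corresponds to the layout `02-Jan-2006`:

```yaml
Date:
  selector: //span[@class="release-date"] # hypothetical element
  postProcess:
    - parseDate: 02-Jan-2006
```

With this layout, the scraped string `14-Mar-2003` would be converted to stash's `YYYY-MM-DD` format, that is `2003-03-14`.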
* `replace`: contains an array of sub-objects. Each sub-object must have a `regex` and `with` field. The `regex` field is the regex pattern to replace, and `with` is the string to replace it with. `$` is used to reference capture groups - `$1` is the first capture group, `$2` the second and so on. Replacements are performed in order of the array.
Example:
```yaml
CareerLength:
  selector: $infoPiece[text() = 'Career Start and End:']/../span[@class="smallInfo"]
  postProcess:
    - replace:
        - regex: \s+to\s+
          with: "-"
```
Replaces `2001 to 2003` with `2001-2003`.
* `subScraper`: if present, the sub-scraper will be executed after all other post-processes are complete and before parseDate. It then takes the value and performs an http request, using the value as the URL. Within the `subScraper` config is a nested scraping configuration. This allows you to traverse to other webpages to get the attribute value you are after. For more info and examples have a look at [#370](https://github.com/stashapp/stash/pull/370), [#606](https://github.com/stashapp/stash/pull/606)
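A minimal sketch of a `subScraper`, assuming that the nested configuration accepts a `selector` like any other attribute and using hypothetical page elements. The outer selector yields a link, and the nested selector is evaluated against the page behind that link:

```yaml
Studio:
  Name:
    selector: //a[@class="studio-link"]/@href # hypothetical link to a studio page
    postProcess:
      - subScraper:
          selector: //h1[@class="studio-name"] # hypothetical element on the linked page
```

Because the scraped value is used as the request URL, a relative link may first need a `replace` operation to turn it into a full URL.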
Additionally, there are a number of fixed post-processing fields that are specified at the attribute level (not in `postProcess`) that are performed after the `postProcess` operations:
* `concat`: if an xpath matches multiple elements, and `concat` is present, then all of the elements will be concatenated together
* `split`: the inverse of `concat`. Splits a string to more elements using the separator given. For more info and examples have a look at PR [#579](https://github.com/stashapp/stash/pull/579)
Example:
```yaml
Tags:
  Name:
    selector: //span[@class="list_attributes"]
    split: ","
```
Splits a comma separated list of tags located in the span and returns the tags.
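`concat` can be sketched in the same way. This assumes a hypothetical description made up of several `<p>` elements, and assumes that the value given to `concat` is the separator placed between the joined elements:

```yaml
Details:
  selector: //div[@class="description"]/p # hypothetical elements
  concat: "\n\n" # join the matched paragraphs with a blank line between them
```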
For backwards compatibility, `replace`, `subscraper` and `parseDate` are also allowed as keys for the attribute.
Attribute-level post-processing is performed in the following order: `concat`, `replace`, `subscraper`, `parseDate` and then `split`.
### XPath resources:
- Test XPaths in Firefox: https://addons.mozilla.org/en-US/firefox/addon/try-xpath/
- XPath cheatsheet: https://devhints.io/xpath
### GJSON resources:
- GJSON Path Syntax: https://github.com/tidwall/gjson/blob/master/SYNTAX.md
### Debugging support
To print the received html/json from a scraper request to the log file, add the following to your scraper yml file:
```yaml
debug:
  printHTML: true
```
### CDP support
Some websites deliver content that cannot be scraped using the raw html file alone. These websites use javascript to dynamically load the content. As such, direct xpath scraping will not work on these websites. There is an option to use Chrome DevTools Protocol to load the webpage using an instance of Chrome, then scrape the result.
Chrome CDP support can be enabled for a specific scraping configuration by adding the following to the root of the yml configuration:
```yaml
driver:
  useCDP: true
```
Optionally, you can add a `sleep` value under the `driver` section. This specifies the amount of time (in seconds) that the scraper should wait after loading the website to perform the scrape. This is needed as some sites need more time for loading scripts to finish. If unset, this value defaults to 2 seconds.
When `useCDP` is set to true, stash will execute or connect to an instance of Chrome. The behavior is dictated by the `Chrome CDP path` setting in the user configuration. If left empty, stash will attempt to find the Chrome executable in the path environment, and will fail if it cannot find one.
`Chrome CDP path` can be set to the path of the Chrome executable, or to the http(s) address of a remote Chrome instance (for example: `http://localhost:9222/json/version`). A Docker container can also serve as the remote instance; the `chromedp/headless-shell` image is highly recommended.
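As a small sketch of the optional `sleep` value described above, the following `driver` section waits longer than the default before scraping. The value `5` is only an illustration:

```yaml
driver:
  useCDP: true
  sleep: 5 # wait 5 seconds after loading the page before scraping
```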
### CDP Click support
When using CDP, you can use the `clicks` part of the `driver` section to perform mouse clicks on elements you need to collapse or toggle. Each click element has an `xpath` value that holds the XPath of the button/element you need to click, and an optional `sleep` value that is the time in seconds to wait after clicking.
If the `sleep` value is not set it defaults to `2` seconds.
A demo scraper using `clicks` follows.
```yaml
name: clickDemo # demo only for a single URL
sceneByURL:
  - action: scrapeXPath
    url:
      - https://getbootstrap.com/docs/4.3/components/collapse/
    scraper: sceneScraper
xPathScrapers:
  sceneScraper:
    scene:
      Title: //head/title
      Details: # shows the id/s of the visible div/s for the Multiple targets example of the page
        selector: //div[@class="bd-example"]//div[@class="multi-collapse collapse show"]/@id
        concat: "\n\n"
driver:
  useCDP: true
  sleep: 1
  clicks: # demo usage toggle on off multiple times
    - xpath: //a[@href="#multiCollapseExample1"] # toggle on first element
    - xpath: //button[@data-target="#multiCollapseExample2"] # toggle on second element
      sleep: 4
    - xpath: //a[@href="#multiCollapseExample1"] # toggle off first element
      sleep: 1
    - xpath: //button[@data-target="#multiCollapseExample2"] # toggle off second element
    - xpath: //button[@data-target="#multiCollapseExample2"] # toggle on second element
```
> **⚠️ Note:** each `click` adds an extra delay of `clicks sleep` seconds, so the above adds `2+4+1+2+2=11` seconds to the loading time of the page.
### Cookie support
On some websites, cookies are needed to bypass a welcome message or some other kind of protection. Stash supports setting cookies for both the direct xpath scraper and the CDP based one. Due to implementation issues, the usage varies a bit between the two.
To use the cookie functionality a `cookies` sub section needs to be added to the `driver` section.
Each cookie element can consist of a `CookieURL` and a number of `Cookies`.
* `CookieURL` is only needed if you are using the direct / native scraper method. It is the request url that we expect from the site we scrape. It must be in the same domain as the cookies we try to set otherwise all cookies in the same group will fail to set. If the `CookieURL` is not a valid URL then again the cookies of that group will fail.
* `Cookies` are the actual cookies we set. When using CDP that's the only part required. They have `Name`, `Value`, `Domain`, `Path` values.
In the following example we use cookies for a site using the direct / native xpath scraper. We expect requests to come from `https://www.example.com` and `https://api.somewhere.com` that look for a `_warning` and a `_warn` cookie. A `_test2` cookie is also set just as a demo.
```yaml
driver:
  cookies:
    - CookieURL: "https://www.example.com"
      Cookies:
        - Name: "_warning"
          Domain: ".example.com"
          Value: "true"
          Path: "/"
        - Name: "_test2"
          Value: "123412"
          Domain: ".example.com"
          Path: "/"
    - CookieURL: "https://api.somewhere.com"
      Cookies:
        - Name: "_warn"
          Value: "123"
          Domain: ".somewhere.com"
```
The same functionality when using CDP would look like this:
```yaml
driver:
  useCDP: true
  cookies:
    - Cookies:
        - Name: "_warning"
          Domain: ".example.com"
          Value: "true"
          Path: "/"
        - Name: "_test2"
          Value: "123412"
          Domain: ".example.com"
          Path: "/"
    - Cookies:
        - Name: "_warn"
          Value: "123"
          Domain: ".somewhere.com"
```
For some sites, the value of the cookie itself doesn't actually matter. In these cases, we can use the `ValueRandom`
property instead of `Value`. Unlike `Value`, `ValueRandom` requires an integer value greater than `0` where the value
indicates how long the cookie string should be.
In the following example, we will adapt the previous cookies to use `ValueRandom` instead. We set the `_test2` cookie
to randomly generate a value with a length of 6 characters and the `_warn` cookie to a length of 3.
```yaml
driver:
  cookies:
    - CookieURL: "https://www.example.com"
      Cookies:
        - Name: "_warning"
          Domain: ".example.com"
          Value: "true"
          Path: "/"
        - Name: "_test2"
          ValueRandom: 6
          Domain: ".example.com"
          Path: "/"
    - CookieURL: "https://api.somewhere.com"
      Cookies:
        - Name: "_warn"
          ValueRandom: 3
          Domain: ".somewhere.com"
```
When developing a scraper you can have a look at the cookies set by a site by adding
* a `CookieURL` if you use the direct xpath scraper
* a `Domain` if you use the CDP scraper
and having a look at the log / console in debug mode.
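A rough sketch of the two cases follows. It assumes that a cookie group without any cookies to set is accepted for the direct scraper, and that a cookie entry carrying only a `Domain` is accepted for the CDP scraper. Neither assumption is confirmed here, and `example.com` is a placeholder.

```yaml
# Direct xpath scraper: only a CookieURL is given
driver:
  cookies:
    - CookieURL: "https://www.example.com"
```

```yaml
# CDP scraper: only a Domain is given
driver:
  useCDP: true
  cookies:
    - Cookies:
        - Domain: ".example.com"
```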
### Headers
Sending request headers is possible when using a scraper.
Headers can be set in the `driver` section and are supported for plain, CDP enabled and JSON scrapers.
They consist of a Key and a Value. If the Key is empty or not defined, then the header is ignored.
```yaml
driver:
  headers:
    - Key: User-Agent
      Value: My Stash Scraper
    - Key: Authorization
      Value: Bearer ds3sdfcFdfY17p4qBkTVF03zscUU2glSjWF17bZyoe8
```
* headers are set after stash's `User-Agent` configuration option is applied.
This means setting a `User-Agent` header from the scraper overrides the one in the configuration settings.
### XPath scraper example
A performer and scene xpath scraper is shown as an example below:
```yaml
name: Pornhub
performerByURL:
  - action: scrapeXPath
    url:
      - pornhub.com
    scraper: performerScraper
sceneByURL:
  - action: scrapeXPath
    url:
      - pornhub.com/view_video.php
    scraper: sceneScraper
xPathScrapers:
  performerScraper:
    common:
      $infoPiece: //div[@class="infoPiece"]/span
    performer:
      Name: //h1[@itemprop="name"]
      Birthdate:
        selector: //span[@itemprop="birthDate"]
        parseDate: Jan 2, 2006
      Twitter: //span[text() = 'Twitter']/../@href
      Instagram: //span[text() = 'Instagram']/../@href
      Measurements: $infoPiece[text() = 'Measurements:']/../span[@class="smallInfo"]
      Height:
        selector: $infoPiece[text() = 'Height:']/../span[@class="smallInfo"]
        postProcess:
          - replace:
              - regex: .*\((\d+) cm\)
                with: $1
      Ethnicity: $infoPiece[text() = 'Ethnicity:']/../span[@class="smallInfo"]
      FakeTits: $infoPiece[text() = 'Fake Boobs:']/../span[@class="smallInfo"]
      Piercings: $infoPiece[text() = 'Piercings:']/../span[@class="smallInfo"]
      Tattoos: $infoPiece[text() = 'Tattoos:']/../span[@class="smallInfo"]
      CareerLength:
        selector: $infoPiece[text() = 'Career Start and End:']/../span[@class="smallInfo"]
        postProcess:
          - replace:
              - regex: \s+to\s+
                with: "-"
  sceneScraper:
    common:
      $performer: //div[@class="pornstarsWrapper"]/a[@data-mxptype="Pornstar"]
      $studio: //div[@data-type="channel"]/a
    scene:
      Title: //div[@id="main-container"]/@data-video-title
      Tags:
        Name: //div[@class="categoriesWrapper"]//a[not(@class="add-btn-small ")]
      Performers:
        Name: $performer/@data-mxptext
        URL: $performer/@href
      Studio:
        Name: $studio
        URL: $studio/@href
```
See also [#333](https://github.com/stashapp/stash/pull/333) for more examples.
### JSON scraper example
A performer and scene scraper for ThePornDB is shown below:
```yaml
name: ThePornDB
performerByName:
  action: scrapeJson
  queryURL: https://api.metadataapi.net/performers?q={}
  scraper: performerSearch
performerByURL:
  - action: scrapeJson
    url:
      - https://api.metadataapi.net/performers/
    scraper: performerScraper
sceneByURL:
  - action: scrapeJson
    url:
      - https://api.metadataapi.net/scenes/
    scraper: sceneScraper
sceneByFragment:
  action: scrapeJson
  queryURL: https://api.metadataapi.net/scenes?parse={filename}&hash={oshash}&limit=1
  scraper: sceneQueryScraper
  queryURLReplace:
    filename:
      - regex: "[^a-zA-Z\\d\\-._~]" # clean filename so that it can construct a valid url
        with: "." # "%20"
      - regex: HEVC
        with:
      - regex: x265
        with:
      - regex: \.+
        with: "."
jsonScrapers:
  performerSearch:
    performer:
      Name: data.#.name
      URL:
        selector: data.#.id
        postProcess:
          - replace:
              - regex: ^
                with: https://api.metadataapi.net/performers/
  performerScraper:
    common:
      $extras: data.extras
    performer:
      Name: data.name
      Gender: $extras.gender
      Birthdate: $extras.birthday
      Ethnicity: $extras.ethnicity
      Height:
        selector: $extras.height
        postProcess:
          - replace:
              - regex: cm
                with:
      Measurements: $extras.measurements
      Tattoos: $extras.tattoos
      Piercings: $extras.piercings
      Aliases: data.aliases
      Image: data.image
  sceneScraper:
    common:
      $performers: data.performers
    scene:
      Title: data.title
      Details: data.description
      Date: data.date
      URL: data.url
      Image: data.background.small
      Performers:
        Name: data.performers.#.name
      Studio:
        Name: data.site.name
      Tags:
        Name: data.tags.#.tag
  sceneQueryScraper:
    common:
      $data: data.0
      $performers: data.0.performers
    scene:
      Title: $data.title
      Details: $data.description
      Date: $data.date
      URL: $data.url
      Image: $data.background.small
      Performers:
        Name: $data.performers.#.name
      Studio:
        Name: $data.site.name
      Tags:
        Name: $data.tags.#.tag
driver:
  headers:
    - Key: User-Agent
      Value: Stash JSON Scraper
    - Key: Authorization
      Value: Bearer lPdwFdfY17p4qBkTVF03zscUU2glSjdf17bZyoe # use an actual API Key here
# Last Updated April 7, 2021
```
## Object fields
### Performer
```
Name
Gender
URL
Twitter
Instagram
Birthdate
DeathDate
Ethnicity
Country
HairColor
EyeColor
Height
Weight
Measurements
FakeTits
CareerLength
Tattoos
Piercings
Aliases
Tags (see Tag fields)
Image
Details
```
*Note:* - `Gender` must be one of `male`, `female`, `transgender_male`, `transgender_female`, `intersex`, `non_binary` (case insensitive).
### Scene
```
Title
Details
URL
Date
Image
Studio (see Studio Fields)
Movies (see Movie Fields)
Tags (see Tag fields)
Performers (list of Performer fields)
```
### Studio
```
Name
URL
```
### Tag
```
Name
```
### Movie
```
Name
Aliases
Duration
Date
Rating
Director
Studio
Synopsis
URL
FrontImage
BackImage
```
### Gallery
```
Title
Details
URL
Date
Rating
Studio (see Studio Fields)
Tags (see Tag fields)
Performers (list of Performer fields)
```

@@ -0,0 +1,73 @@
# Metadata Scraping
Stash supports scraping of metadata from various external sources.
## Scraper Types
| Type | Description |
|---|:---|
| Fragment | Uses existing metadata for an Item and matches it to a result from a metadata source. |
| Search/By Name | Uses a provided query string to search a metadata source for a list of matches for the user to pick from. |
| URL | Extracts metadata from a given URL. |
## Supported Scrapers
| | Fragment | Search | URL |
|---|:---:|:---:|:---:|
| gallery | ✔️ | | ✔️ |
| movie | | | ✔️ |
| performer | | ✔️ | ✔️ |
| scene | ✔️ | ✔️ | ✔️ |
# Scraper Operation
## Included Scrapers
Stash provides the following built-in scrapers:
| Scraper | Description |
|---|--|
| Freeones | `search` Performer scraper for freeones.xxx. |
| Auto Tag | Scene `fragment` scraper that matches existing performers, studio and tags using the filename. |
## Adding Scrapers
By default, Stash looks for scraper configurations in the `scrapers` sub-directory of the directory where the stash `config.yml` is read. This will either be the `$HOME/.stash` directory or the current working directory.
Scrapers are added by placing yaml configuration files (format: `scrapername.yml`) in the `scrapers` directory.
> **⚠️ Note:** Some scrapers may require more than just the yaml file; consult the individual scraper documentation.
After the yaml files are added, removed or edited while stash is running, they can be reloaded by going to `Settings > Metadata Providers > Scrapers` and clicking `Reload Scrapers`.
The stash community maintains a number of custom scraper configuration files that can be found [here](https://github.com/stashapp/CommunityScrapers).
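As a rough sketch of what such a file might contain, the following hypothetical `scrapers/example.yml` defines a single URL scraper for scenes. The site `example.com` and the selectors are placeholders, not a working scraper:

```yaml
# scrapers/example.yml (hypothetical file name and site)
name: Example
sceneByURL:
  - action: scrapeXPath
    url:
      - example.com/videos/
    scraper: sceneScraper
xPathScrapers:
  sceneScraper:
    scene:
      Title: //h1 # placeholder selector
      Details: //div[@class="description"] # placeholder selector
```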
## Using Scrapers
#### Fragment Scraper
Click on the `Scrape With...` button in the `edit` tab of an item, then select the scraper you wish to use.
#### Search Scraper
Click on the 🔍 button in the `edit` tab of an item. You will be presented with a search dialog with a pre-populated query to search for. After searching, you will be presented with a list of results to pick from.
#### URL Scraper
Enter the URL in the `edit` tab of an Item. If a scraper is installed that supports that url, then a button will appear to scrape the metadata.
## Tagger View
The Tagger view is accessed from the scenes page. It allows the user to run scrapers on all items on the current page. The Tagger presents the user with potential matches for an item from a selected stash-box instance or metadata source if supported. The user needs to select the correct metadata information to save.
When used in combination with stash-box, the user can optionally submit scene fingerprints to contribute to a stash-box instance. A scene fingerprint consists of any generated hashes (`phash`, `oshash`, `md5`) and the scene duration. Fingerprint submissions are associated with your stash-box account. Submitting fingerprints assists others in matching their files, because stash-box returns a count of matching user submitted fingerprints with every potential match.
| | Has Tagger | Source Selection |
|---|:---:|:---:|
| gallery | | |
| movie | | |
| performer | ✔️ | |
| scene | ✔️ | ✔️ |
## Identify Task
This task iterates through your Scenes and attempts to identify the scene using a selection of scraping sources. This task can be found under `Settings -> Tasks -> "Identify..." (Button)`. For more information see the [Tasks > Identify](/help/Identify.md) page.

@@ -0,0 +1,21 @@
# Scene Tagger
Stash can be integrated with stash-box which acts as a centralized metadata database. This is in the early stages of development but can be used for fingerprint/keyword lookups and automated tagging of performers and scenes. The batch tagging interface can be accessed from the [scene view](/scenes?disp=3). For more information join our [Discord](https://discord.gg/2TsNFKt).
#### Searching
The fingerprint search matches your current selection of files against the remote stash-box instance. Any scenes with a matching fingerprint will be returned, although there is currently no validation of fingerprints, so it's recommended to double-check the validity before saving.
If no fingerprint match is found, it's possible to search by keywords. The search works by matching the query against a scene's _title_, _release date_, _studio name_, and _performer names_. By default the tagger uses metadata set on the file, or parses the filename; this can be changed in the config.
An important thing to note is that it only returns a match *if all query terms are a match*. As an example, if a scene is titled `"A Trip to the Mall"` with the performer `"Jane Doe"`, a search for `"Trip to the Mall 1080p"` will *not* match, however `"trip mall doe"` would. Usually a few pieces of info are enough, for instance performer name + release date or studio name. To avoid common unrelated keywords, you can add them to the blacklist in the tagger config. Any items in the blacklist are stripped out of the query.
#### Saving
When a scene is matched, stash will try to match the studio and performers against your local studios and performers. If you have previously matched them, they will automatically be selected. If not, you have to either select the correct performer/studio from the dropdown, choose create to create a new entity, or skip to ignore it.
Once a scene is saved, the scene and the matched studio/performers will have the `stash_id` saved, which will then be used for future tagging.
By default male performers are not shown; this can be enabled in the tagger config. Likewise, scene tags are not saved by default. They can be set to either merge with existing tags on the scene, or overwrite them. Setting tags is currently not recommended, since they are hard to deduplicate and can litter your data.
#### Submitting fingerprints
After a scene is saved you will be prompted to submit the fingerprint back to the stash-box instance. This is optional, but can be helpful for other users who have an identical copy, who will then be able to match via the fingerprint search. No information other than the `stash_id` and file fingerprint is submitted.

@@ -0,0 +1,79 @@
# Tasks
This page allows you to direct the stash server to perform a variety of tasks.
# Scanning
The scan function walks through the stash directories you have configured for new and moved files.
Stash currently identifies files by performing a quick file hash. This means that if the file is renamed or moved elsewhere within your configured stash directories, then the scan will detect this and update its database accordingly.
Stash currently ignores duplicate files. If two files contain identical content, only the first one it comes across is used.
The scan task accepts the following options:
| Option | Description |
|--------|-------------|
| Generate previews | Generates video previews which play when hovering over a scene. |
| Generate animated image previews | Generates animated webp previews. Only required if the Preview Type is set to Animated Image. Requires Generate previews to be enabled. |
| Generate sprites | Generates sprites for the scene scrubber. |
| Generate perceptual hashes | Generates perceptual hashes for scene deduplication and identification. |
| Generate thumbnails for images | Generates thumbnails for image files. |
| Don't include file extension in title | By default, scenes, images and galleries have their title created using the file basename. When the flag is enabled, the file extension is stripped when setting the title. |
| Set name, date, details from embedded file metadata. | Parse the video file metadata (where supported) and set the scene attributes accordingly. It has previously been noted that this information is frequently incorrect, so only use this option where you are certain that the metadata is correct in the files. |
# Auto Tagging
See the [Auto Tagging](/help/AutoTagging.md) page.
# Scene Filename Parser
See the [Scene Filename Parser](/help/SceneFilenameParser.md) page.
# Generated Content
The scanning function automatically generates a screenshot of each scene. The generated content provides the following:
* Video or image previews that are played when mousing over the scene card
* Perceptual hashes - helps match against StashDB, and feeds the duplicate finder
* Sprites (scene stills for parts of each scene) that are shown in the scene scrubber
* Marker video previews that are shown in the markers page
* Transcoded versions of scenes. See below
* Image thumbnails of galleries
The generate task accepts the following options:
| Option | Description |
|--------|-------------|
| Previews | Generates video previews which play when hovering over a scene. |
| Animated image previews | Generates animated webp previews. Only required if the Preview Type is set to Animated Image. Requires Generate previews to be enabled. |
| Scene Scrubber Sprites | Generates sprites for the scene scrubber. |
| Markers Previews | Generates 20 second videos which begin at the marker timecode. |
| Marker Animated Image Previews | Generates animated webp previews for markers. Only required if the Preview Type is set to Animated Image. Requires Markers to be enabled. |
| Marker Screenshots | Generates static JPG images for markers. Only required if Preview Type is set to Static Image. Requires Marker Previews to be enabled. |
| Transcodes | MP4 conversions of unsupported video formats. Allows direct streaming instead of live transcoding. |
| Perceptual hashes | Generates perceptual hashes for scene deduplication and identification. |
| Overwrite existing generated files | By default, where a generated file exists, it is not regenerated. When this flag is enabled, then the generated files are regenerated. |
## Transcodes
Web browsers support a limited number of video and audio codecs and containers. Stash will directly stream video files where the browser supports the codecs and container. Originally, stash did not support viewing scene videos where the browser did not support the codecs/container, and generating transcodes was a way of viewing these files.
Stash has since implemented live transcoding, so transcodes are essentially unnecessary now. Further, transcodes use up a significant amount of disk space and are not guaranteed to be lossless.
## Image gallery thumbnails
These are generated when the gallery is first viewed, so generating them beforehand is not necessary.
# Cleaning
This task will walk through your configured media directories and remove any scene from the database that can no longer be found. It will also remove generated files for scenes that subsequently no longer exist.
Care should be taken with this task, especially where the configured media directories may be inaccessible due to network issues.
# Exporting and Importing
The import and export tasks read and write JSON files to the configured metadata directory. Import from file will merge your database with a file.
> **⚠️ Note:** The full import task wipes the current database completely before importing.
See the [JSON Specification](/help/JSONSpec.md) page for details on the exported JSON format.
---