Compare commits

8 Commits

3 changed files with 28 additions and 21 deletions

@@ -79,4 +79,8 @@ curl -X POST "http://es:9200/ta_channel/_search?pretty" \
} }
} }
}' }'
``` ```
You should get back a result like this example: [Example-channel-info-elasticsearch.json](Example-channel-info-elasticsearch.json)
I'm using this as an example to update channels in TubeArchivist that are missing data. I wish Kibana would let me do it; maybe it can, and I just haven't figured it out yet.

@@ -1,10 +1,10 @@
{ {
"id": "", "id": "Scripted in Video ID",
"channel_id": "Change To Channel ID/username", "channel_id": "Change to Youtube Username",
"uploader": "Youtube Username", "uploader": "Change to Youtube Username",
"uploader_id": "Change To Channel ID", "uploader_id": "Change To Channel ID",
"uploader_url": "https://www.youtube.com/channel/ChangeToChannelID-or-username", "uploader_url": "https://www.youtube.com/channel/ChangeToChannelID-or-username",
"title": "Example", "title": "Scripted in Title",
"description": null, "description": null,
"upload_date": "ToBeScriptedInYYYYMMDD", "upload_date": "ToBeScriptedInYYYYMMDD",
"categories": null, "categories": null,

@@ -8,22 +8,28 @@ Small collection of Bash helpers used to prepare offline / archived YouTube vide
Normalize filenames and create accompanying metadata (.info.json) so TubeArchivist can ingest local archives (especially those from archive.org or other offline sources). Normalize filenames and create accompanying metadata (.info.json) so TubeArchivist can ingest local archives (especially those from archive.org or other offline sources).
Example input filename: Example input filename:
`20170311 (5XtCZ1Fa9ag) Terry A Davis Live Stream.mp4` - Example A: `20170311 (5XtCZ1Fa9ag) Terry A Davis Live Stream.mp4`
- Example B: `20131003 - 001 - 1okW1RTPZ7Q - TempleOS Hymns #1.mp4`
Resulting filename and sidecar JSON: Resulting filename and sidecar JSON:
- `20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].mp4` - Example A:
- `20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].info.json` - `20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].mp4`
- `20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].info.json`
- Example B:
- `20131003 - 001 - TempleOS Hymns #1 [1okW1RTPZ7Q].mp4`
- `20131003 - 001 - TempleOS Hymns #1 [1okW1RTPZ7Q].info.json`
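The two renames above can be sketched as a single Bash function. This is an illustrative helper, not one of the repository's scripts (which split the work into several steps): the name `normalize_name` is hypothetical, and it assumes a video id is always 11 characters from `A-Z a-z 0-9 _ -` and that the date is a leading 8-digit block. It only prints the new name and does not rename anything.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the normalized
# filename for either example layout. Nothing is renamed on disk.
normalize_name() {
  local name="$1"
  local ext="${name##*.}"    # text after the last dot
  local stem="${name%.*}"    # text before the last dot

  # Assumption: ids are 11 characters from A-Z, a-z, 0-9, "_" and "-".
  # Example A layout: "<date> (<id>) <title>"
  local re_a='^([0-9]{8}) \(([A-Za-z0-9_-]{11})\) (.+)$'
  # Example B layout: "<date> - <index> - <id> - <title>"
  local re_b='^([0-9]{8} - [^ ]+) - ([A-Za-z0-9_-]{11}) - (.+)$'

  if [[ $stem =~ $re_a ]]; then
    printf '%s %s [%s].%s\n' \
      "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[2]}" "$ext"
  elif [[ $stem =~ $re_b ]]; then
    printf '%s - %s [%s].%s\n' \
      "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[2]}" "$ext"
  else
    # Unknown layout: echo the name back unchanged.
    printf '%s\n' "$name"
  fi
}

normalize_name '20170311 (5XtCZ1Fa9ag) Terry A Davis Live Stream.mp4'
# prints: 20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].mp4
normalize_name '20131003 - 001 - 1okW1RTPZ7Q - TempleOS Hymns #1.mp4'
# prints: 20131003 - 001 - TempleOS Hymns #1 [1okW1RTPZ7Q].mp4
```

Names that match neither layout are echoed back unchanged, so the sketch can be run over a whole directory listing to preview results.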
--- ---
## How it works / Usage ## How it works / Usage
1. Put all the scripts in the directory with your video files (scripts currently do not recurse into subdirectories). 1. Put all the scripts in the directory with your video files (scripts currently do not recurse into subdirectories).
2. Edit 'Example.info.json' 2. Edit 'Example.info.json'
Update these lines - Update these lines (and any other lines you want copied to every video that the scripts won't fill in; fields left as null are ones I didn't have data for yet).
- "channel_id": "Change To Channel ID/username", ```
- "uploader": "Youtube Username", "channel_id": "Change to Youtube Username",
- "uploader_id": "Change To Channel ID", "uploader": "Change to Youtube Username",
- "uploader_url": "https://www.youtube.com/channel/ChangeToChannelID-or-username", "uploader_id": "Change To Channel ID",
"uploader_url": "https://www.youtube.com/channel/ChangeToChannelID-or-username",
```
3. Run the scripts in order from the directory containing your media below: 3. Run the scripts in order from the directory containing your media below:
Each script performs a single transformation so you can inspect results between steps. Each script performs a single transformation so you can inspect results between steps.
@@ -34,7 +40,7 @@ Each script performs a single transformation so you can inspect results between
- If the filename already has the id at the end, skip to 3. - If the filename already has the id at the end, skip to 3.
1b. `move-find-id-to-end-filename.bash` 1b. `move-find-id-to-end-filename.bash`
- Split the filename into parts. Find the id between the second and third " - " (without brackets), add brackets, and move [id] to the end of the filename before the extension. - Split the filename into parts. Find the video id between the second and third " - " (without brackets), add brackets, and move [id] to the end of the filename before the extension.
- Skip 1a/2a, straight to 3. - Skip 1a/2a, straight to 3.
2a. `move-[id]-to-end-filename.bash` 2a. `move-[id]-to-end-filename.bash`
@@ -50,15 +56,16 @@ Each script performs a single transformation so you can inspect results between
- Insert the cleaned title into the sidecar JSON. - Insert the cleaned title into the sidecar JSON.
6. `insert-date-into-json.bash` 6. `insert-date-into-json.bash`
- Insert the date (if available) into the sidecar JSON. - Insert the date from filename (if available) into the sidecar JSON.
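
The date step can be sketched roughly as follows. This is not the repository's `insert-date-into-json.bash`; the function name `insert_date` is hypothetical, and it assumes the sidecar filename starts with an 8-digit `YYYYMMDD` date followed by a space and that the JSON contains an `"upload_date": "..."` line as in the example template. It uses `sed` through a temporary file instead of a JSON-aware tool, which is only safe for this simple one-field-per-line layout.

```bash
#!/usr/bin/env bash
# Hypothetical sketch (not the repository's script): copy the leading
# YYYYMMDD date from a sidecar filename into its "upload_date" field.
insert_date() {
  local json="$1"
  local base tmp
  base="$(basename -- "$json")"

  # Assumption: the filename begins with an 8-digit date and a space.
  local re='^([0-9]{8}) '
  if [[ ! $base =~ $re ]]; then
    echo "no leading date in: $base" >&2
    return 1
  fi
  local date="${BASH_REMATCH[1]}"

  # Rewrite via a temp file so the sketch does not depend on "sed -i",
  # whose syntax differs between GNU and BSD sed.
  tmp="$(mktemp)" || return 1
  sed "s/\"upload_date\": *\"[^\"]*\"/\"upload_date\": \"$date\"/" "$json" > "$tmp" \
    && mv -- "$tmp" "$json"
}

# Usage (run against a copy first):
#   insert_date '20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].info.json'
```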
--- ---
## Notes and tips ## Notes and tips
- Scripts do not process subdirectories. Run at the directory root for each archive. - Scripts do not process subdirectories. Run at the directory root for each archive.
- Always test on a copy or run a subset first to confirm behavior. - Always test on a copy or run a subset first to confirm behavior.
- If filenames contain unusual characters, run a quick grep for non-ASCII prior to processing. - If filenames contain unusual characters, run a quick grep for non-ASCII prior to processing.
- Modify scripts to add dry-run mode if you want safer previews. - Modify scripts to add dry-run mode if you want safer previews.
- Elasticsearch common commands for updates: [Elasticsearch Common Commands](ElasticSearch-Common-Commands.md)
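
Two of the tips above (checking for non-ASCII names and adding a dry-run mode) could look something like this. Both helpers, `has_non_ascii` and `safe_mv`, and the `DRY_RUN` variable are hypothetical names that do not exist in the scripts. The check treats anything outside printable ASCII (space through `~`) as unusual, so tabs and control characters are flagged too.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository's scripts).

# Succeeds if the argument contains any byte outside printable ASCII.
# LC_ALL=C makes grep compare raw bytes, so UTF-8 characters are caught.
has_non_ascii() {
  printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# Dry-run aware rename: with DRY_RUN=1 (the default here) it only prints
# what it would do; with DRY_RUN=0 it renames without overwriting.
DRY_RUN="${DRY_RUN:-1}"
safe_mv() {
  if [[ $DRY_RUN == 1 ]]; then
    printf 'would rename: %s -> %s\n' "$1" "$2"
  else
    mv -n -- "$1" "$2"
  fi
}

# List media files whose names need a closer look before processing.
for f in *.mp4; do
  [[ -e $f ]] || continue          # glob matched nothing
  if has_non_ascii "$f"; then
    printf 'non-ASCII name: %s\n' "$f"
  fi
done
```

Replacing a script's `mv` calls with a wrapper like `safe_mv` lets you preview every rename first, then rerun with `DRY_RUN=0` once the output looks right.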
--- ---
@@ -66,8 +73,4 @@ Each script performs a single transformation so you can inspect results between
Archive used for testing: Archive used for testing:
`https://archive.org/details/TempleOS-TheMissingVideos` `https://archive.org/details/TempleOS-TheMissingVideos`
Processed example (after running full pipeline):
`20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].mp4`
`20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].info.json`
--- ---