I want to download the subtitles of a YouTube playlist with the following specifications:
Only in English
In srt format
Just the subtitles files and not the video itself
I have tried the following code snippet, but it downloads the subtitles in all available languages and in vtt format.
ydl_opts = {
    'allsubtitles': True,
    'writesubtitles': True,
    'convertsubtitles': True,
    'skip_download': True,
    'outtmpl': 'C:/Users/shrayani.mondal/Desktop/Personal/Python Projects/Speech to text/Subtitles/%(title)s.%(ext)s',
    #'subtitlesformat': 'srt'
    'subtitleslangs': 'en',
    'postprocessors': [{
        'key': 'FFmpegSubtitlesConvertor',
        'format': 'srt',
    }],
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=Lp7E973zozc&list=PLQltO7RlbjPJnbfHLsFJWP-DYnWPugUZ7'])
My second objective is to use auto-generated English subtitles for videos that do not have subtitles available. How do I include the if statement for that?
# To convert vtt into srt,
# use the postprocessors entry shown below
ytdlp_option = {
    "progress_hooks": [self.track_progress],
    "logger": DownloadVideo.Logger(self),
    "noplaylist": True,
    "format": f'{self.selected_file.get(VIDEO, {}).get("format_id")}+{self.selected_file.get(AUDIO, {}).get("format_id")}',
    "paths": {"home": self.video_download_path},
    "outtmpl": {"default": f"{self.video_file_path}.{self.output_extension}"},
    "postprocessors": [
        {
            "key": "FFmpegSubtitlesConvertor",
            "format": "srt",
        }
    ],
    "writesubtitles": True,
    "subtitlesformat": "vtt",
    "subtitleslangs": ["en"],
}
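For the youtube_dl API used in the question, a minimal sketch of the options might look like the following. It assumes that, with both writesubtitles and writeautomaticsub set, youtube_dl uses uploaded subtitles where they exist and falls back to auto-generated captions otherwise, so no explicit if statement is needed. The helper names build_subtitle_options and download_playlist_subtitles are hypothetical, and whether the srt conversion runs when the download is skipped may depend on the installed version, so that part should be verified.

```python
def build_subtitle_options(output_dir):
    """Return youtube_dl options for English-only subtitles without the video."""
    return {
        "skip_download": True,       # fetch subtitles only, not the video
        "writesubtitles": True,      # uploaded (manual) subtitles
        "writeautomaticsub": True,   # auto-generated captions as well
        "subtitleslangs": ["en"],    # a list, not the bare string 'en'
        "outtmpl": output_dir + "/%(title)s.%(ext)s",
        # 'allsubtitles' is deliberately left out: it requests every language.
        "postprocessors": [
            {"key": "FFmpegSubtitlesConvertor", "format": "srt"},
        ],
    }


def download_playlist_subtitles(playlist_url, output_dir):
    """Run the download; needs youtube_dl (and ffmpeg for the conversion)."""
    import youtube_dl  # imported here so the options helper works without it

    with youtube_dl.YoutubeDL(build_subtitle_options(output_dir)) as ydl:
        ydl.download([playlist_url])
```

Calling download_playlist_subtitles(url, "Subtitles") with the playlist URL from the question would then write one subtitle file per video into the Subtitles folder.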
You can download only the subtitles this way:
youtube-dl --write-sub --write-auto-sub --skip-download YOURLINK
The default subtitle language is English.
More options are listed in the youtube-dl documentation or in the output of youtube-dl --help.
The default subtitle format is vtt. You can convert it to srt with youtube-dl itself or with one of the converters available online.
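If the conversion has to be done by hand, a small script can handle simple files. This is only a sketch: the function name vtt_to_srt is hypothetical, it assumes plain cues separated by blank lines, and it does not strip the inline styling tags that auto-generated captions often contain.

```python
import re

# A WebVTT cue timing line; the hours part is optional in WebVTT.
_TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2})\.(\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2})\.(\d{3})"
)


def _srt_time(clock, millis):
    """Pad 'MM:SS' to 'HH:MM:SS' and use the comma separator SRT expects."""
    if clock.count(":") == 1:
        clock = "00:" + clock
    return f"{clock},{millis}"


def vtt_to_srt(vtt_text):
    """Convert simple WebVTT text to SRT text."""
    blocks = re.split(r"\n\s*\n", vtt_text.replace("\r\n", "\n").strip())
    cues = []
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line)
            if match:
                start = _srt_time(match.group(1), match.group(2))
                end = _srt_time(match.group(3), match.group(4))
                # Everything after the timing line is the cue text.
                cues.append((start, end, "\n".join(lines[i + 1:])))
                break  # blocks without a timing line (header, notes) are skipped
    numbered = [
        f"{number}\n{start} --> {end}\n{text}\n"
        for number, (start, end, text) in enumerate(cues, start=1)
    ]
    return "\n".join(numbered)
```

Reading a .vtt file as UTF-8 text, passing it through vtt_to_srt, and writing the result to a .srt file would complete the conversion.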
I created an m3u8 file using ffmpeg and need to play it. For that I used an npm package called react-hls-player, but it is not playing. What mistake did I make? My code is below.
I used the code below to create the m3u8 file using ffmpeg in Node.js:
const ffmpeg = require('fluent-ffmpeg')
const ffmpegInstaller = require('@ffmpeg-installer/ffmpeg')
ffmpeg.setFfmpegPath(ffmpegInstaller.path)
ffmpeg('uploads/malayalam.mp4').addOptions([
    '-c:a', 'aac',
    '-ar', '48000',
    '-b:a', '128k',
    '-c:v', 'h264',
    '-profile:v', 'main',
    '-crf', '20',
    '-g', '48',
    '-keyint_min', '48',
    '-sc_threshold', '0',
    '-b:v', '2500k',
    '-maxrate', '2675k',
    '-bufsize', '3750k',
    '-hls_time', '4',
    '-hls_playlist_type', 'vod',
    '-hls_segment_filename', 'uploads/video/720p_%03d.ts'
]).output('uploads/video/output.m3u8').on('end', (err, data) => {
    console.log(data)
    console.log('end')
}).run()
Below is a screenshot of my m3u8 file directory.
I used react-hls-player to play the m3u8 file. My Node.js server is running on port 5000.
Below is the React code:
import React from 'react'
import ReactHlsPlayer from 'react-hls-player';
function M3u8_player() {
    return (
        <div className="userlist">
            <ReactHlsPlayer
                src="http://localhost:5000/uploads/video/output.m3u8"
                autoPlay={false}
                controls={true}
                width="100%"
                height="auto"
            />
        </div>
    )
}
export default M3u8_player
I can't play the m3u8 file here. Could someone please help me? Did I make a mistake when creating the m3u8 file?
I think the problem is that you need to pass an Authentication header within the HLS player.
In my company, we're trying to transition from writing in Word to using markdown. We need several output formats for markdown, so I'm trying to create a grunt.js task that can watch a folder, and convert markdown files into the desired outputs.
Required outputs are: DOCX, PDF, and HTML.
I'm using the node-pandoc plugin for grunt.js to handle the conversions.
I've currently got the grunt task creating the docx and pdf. The issue is the HTML file - Pandoc seems to require that an HTML filename be specified explicitly, instead of just using the name of the original file. If none are specified, it generates an error (openBinaryFile: does not exist).
My current task options look like this:
grunt.initConfig({
    node_pandoc: {
        engdocx: {
            //English source+options
            expand: true,
            src: 'Eng/*.md',
            dest: 'pdfBuild/',
            ext: '.docx',
            flatten: true,
            options: {
                flags: '-f markdown -t docx -o --reference-doc=reference.docx --metadata-file=EngMeta.yml'
            }
        },
        hebdocx: {
            //Hebrew source+options
            expand: true,
            src: 'Heb/*.md',
            dest: 'pdfBuild/',
            ext: '.docx',
            flatten: true,
            options: {
                flags: '-f markdown -t docx -o --reference-doc=reference.docx --metadata-file=HebMeta.yml'
            },
            // Target-specific file lists and/or options go here.
        },
        html_heb: {
            // Convert markdown to html
            expand: true,
            src: ['Heb/*.md'],
            dest: 'Final/',
            ext: '.html',
            flatten: true,
            options: {
                flags: '-f markdown -t html -o'
            },
        },
        html_eng: {
            // Convert markdown to html
            expand: true,
            src: ['Eng/*.md'],
            dest: 'Final/',
            ext: '.html',
            flatten: true,
            options: {
                flags: '-f markdown -t html -s -o '
            },
        }
    }
});
I'm wondering if I can:
Extract the filename from within the task.
Append it dynamically to the options for each task so pandoc knows what the desired file name is.
Alternatively, persuade pandoc to generate the files without specifying the file name explicitly.
It turns out I was missing the -s (standalone) flag.
html_heb: {
    // Convert markdown to html
    expand: true,
    src: ['Heb/*.md'],
    dest: 'Final/',
    ext: '.html',
    flatten: true,
    options: {
        flags: '-f markdown -t html -o -s' // Was missing a -s.
    },
},
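For the other idea raised in the question, deriving the output name from the source file, here is a rough sketch outside of grunt, written in Python. The function name pandoc_html_command and the folder names are only illustrative; the pandoc flags used (-f, -t, -s, -o) are the same ones that appear in the task options above.

```python
from pathlib import Path


def pandoc_html_command(source, dest_dir):
    """Build a pandoc command that writes <dest_dir>/<source name>.html."""
    source = Path(source)
    # Reuse the source file's base name so no output name is hard-coded.
    output = Path(dest_dir) / (source.stem + ".html")
    return [
        "pandoc",
        "-f", "markdown",
        "-t", "html",
        "-s",                      # standalone document
        "-o", output.as_posix(),   # explicit output file
        source.as_posix(),
    ]
```

Each command could then be run with subprocess.run(command, check=True) for every markdown file found in the source folder.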
I'm using nightwatchjs to run my test suite, and I would like to remove the warning messages being outputted to my terminal display.
At the moment, I'm getting loads of these (admittedly genuine) warning messages whilst my scripts are running and it's making the reading of the results harder and harder.
As an example;
Yes they are valid messages, but it's not often possible for me to uniquely pick out each individual element and I'm not interested in them for my output.
So, I'd like to know how I can stop them from being reported in my terminal.
Below is what I've tried so far in my nightwatch.conf.js config file;
desiredCapabilities: {
    browserName: 'chrome',
    javascriptEnabled: true,
    acceptSslCerts: true,
    acceptInsecureCerts: true,
    chromeOptions: {
        args: [
            '--ignore-certificate-errors',
            '--allow-running-insecure-content',
            '--disable-web-security',
            '--disable-infobars',
            '--disable-popup-blocking',
            '--disable-notifications',
            '--log-level=3'],
        prefs: {
            'profile.managed_default_content_settings.popups': 1,
            'profile.managed_default_content_settings.notifications': 1
        },
    },
},
},
but it's still displaying the warnings.
Any help on this would be really appreciated.
Many thanks.
You can try setting the detailed_output property to false in the configuration file. This should stop these details from being printed to the console.
You can find a sample config file here.
You can find relevant details available under Output Settings section of official docs here.
Update 1: It looks like a combination of properties controls this, and the combination below works for me.
live_output: false,
silent: true,
output: true,
detailed_output: false,
disable_error_log: false,
I'm using youtube-dl for a discord bot in python and it works fine, however it downloads the files to the root directory of the project. Since it will be downloading LOTS of videos, I would prefer for it to download to a directory inside of the root. How do I do this?
These are my current options:
ytdl_format_options = {
    'format': 'bestaudio/best',
    'outtmpl': '%(extractor)s-%(id)s-%(title)s.%(ext)s',
    'restrictfilenames': True,
    'noplaylist': True,
    'nocheckcertificate': True,
    'ignoreerrors': False,
    'logtostderr': False,
    'quiet': True,
    'no_warnings': True,
    'default_search': 'auto',
    'source_address': '0.0.0.0',  # bind to ipv4 since ipv6 addresses cause issues sometimes
    'output': r'youtube-dl'
}
ffmpeg_options = {
    'before_options': '-nostdin',
    'options': '-vn'
}
Set an output template containing slashes in the outtmpl option:
ytdl_format_options = {
    'outtmpl': 'somewhere/%(extractor_key)s/%(extractor)s-%(id)s-%(title)s.%(ext)s',
    ...
}
Output templates can have lots of fields (including playlist IDs, license, format name/bitrates, album name & much more, depending on what the video website you're using supports). For more information, refer to the youtube-dl documentation of output templates. All fields can be used as directory or file names.
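As a small sketch of the same idea for the bot's options, the directory can be joined onto the template in one place. The helper name build_ytdl_options and the folder name passed to it are illustrative, and this assumes youtube-dl creates the directory if it does not exist yet.

```python
import os


def build_ytdl_options(download_dir):
    """Return youtube-dl options that place downloads inside download_dir."""
    # The directory part of outtmpl decides where files are written.
    template = os.path.join(download_dir, "%(extractor)s-%(id)s-%(title)s.%(ext)s")
    return {
        "format": "bestaudio/best",
        "outtmpl": template,
        "restrictfilenames": True,
        "noplaylist": True,
        "quiet": True,
    }
```

For example, build_ytdl_options("downloads") keeps everything under a downloads folder relative to the directory the bot is started from.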
Receiving the error Could not load file 'worker.js' for content script. It isn't UTF-8 encoded.
> file -I chrome/worker.js
chrome/worker.js: text/plain; charset=utf-8
With to-utf8-unix
> to-utf8-unix chrome/worker.js
chrome/worker.js
----------------
Detected charset:
UTF-8
Confidence of charset detection:
100
Result:
Conversion not needed.
----------------
I also tried converting the file with Sublime Text back and forth without any luck.
manifest:
"content_scripts": [{
"matches": ["http://foo.com/*"],
"js": ["worker.js"]
}],
The file in question: https://www.dropbox.com/s/kcv23ooh06wlxg3/worker.js?dl=1
It is a compiled javascript file spit out from clojurescript with cljsbuild:
{:id "chrome-worker"
:source-paths ["src/chrome/worker"],
:compiler {:output-to "chrome/worker.js",
:optimizations :simple,
:pretty-print false}}
]}
Other files (options page, background) are compiled the same way and don't generate this error. I tried getting rid of weird characters like Emojis but that didn't fix the problem.
If you are using Webpack, you can solve it by replacing the default minifier, Uglify, with Terser, which won't produce those encoding issues.
In your webpack.conf.js, add:
const TerserPlugin = require('terser-webpack-plugin');
// add this into your config object
optimization: {
    minimize: true,
    minimizer: [
        new TerserPlugin({
            parallel: true,
            terserOptions: {
                ecma: 6,
                output: {
                    ascii_only: true
                },
            },
        }),
    ],
},
It turns out this is a problem within the google closure compiler that clojurescript uses to generate javascript - https://github.com/google/closure-compiler/issues/1704
A workaround is to set compilation to "US-ASCII"
:closure-output-charset "US-ASCII"
Thanks a lot to pesterhazy from the clojurians slack for helping with this!
I had this error thrown after editing working source code in WordPad. When I saved the file in WordPad, the encoding was lost. To fix it, open the same file in Notepad, choose Save as, and specify "UTF-8" in the Encoding drop-down menu next to the Save button.
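To check a file before loading the extension, a short script can report whether the bytes decode as UTF-8 and where the non-ASCII characters sit (these are the characters an ASCII-only output setting would escape). This is only a diagnostic sketch; the function name check_utf8 is made up for the example.

```python
def check_utf8(data):
    """Return (is_valid_utf8, non_ascii) for the given bytes.

    non_ascii is a list of (line_number, column, character) tuples,
    both counted from 1.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8 at all.
        return False, []
    non_ascii = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if ord(char) > 127:
                non_ascii.append((line_number, column, char))
    return True, non_ascii
```

It could be called as check_utf8(open("chrome/worker.js", "rb").read()) to inspect the compiled file from the question.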
In case anyone has this issue with Parcel, just add a .terserrc file with this content.
{
    "ecma": 6,
    "output": {
        "ascii_only": true
    }
}
This is an adaptation of @marian-klühspies's response: https://stackoverflow.com/a/58528858/2920671