Tutorials on Bash

Learn about Bash from fellow newline community members!

  • React
  • Angular
  • Vue
  • Svelte
  • NextJS
  • Redux
  • Apollo
  • Storybook
  • D3
  • Testing Library
  • JavaScript
  • TypeScript
  • Node.js
  • Deno
  • Rust
  • Python
  • GraphQL

Processing JSON with jq

Commonly, we process JSON data by writing a program to load, deserialize and manipulate this data. Depending on the programming language, this program may require an additional compilation step before being executed within a terminal. For simple operations, such as filtering and mapping, we don't need to write an additional program. Rather, we can manipulate our JSON data directly within a terminal via the jq command-line utility, which edits streamed JSON data without an interactive text editor interface ("sed for JSON"). If you're looking for a tool to retrieve JSON data from an API endpoint, process this data and save the result to a CSV, TSV or JSON file, then jq easily accomplishes this task in a single-line command. Below, I'm going to show you how to process JSON data with jq .

Install the jq command-line utility by visiting the homepage of the jq website, downloading a prebuilt binary (compatible with your operating system) and executing this binary once the download is complete. Alternatively... To verify the installation was successful, restart the terminal and enter the command jq . This should print an overview of the jq command: For extensive documentation, enter the command man jq , which summons the manual pages for the jq command:

To get started, let's pretty-print a JSON dataset (with formatting and syntax highlighting). The jq command must be passed a filter as its first argument. A filter is a program that tells jq what output should be returned given the input JSON data. The most basic filter is the pre-defined identity filter . , which tells jq to do nothing to the input JSON data and return it as is. To run jq on a JSON dataset, pipe the stringified JSON to jq (e.g., the file content of a .json file via the cat command or the JSON response from an API endpoint via the cURL command). If we pipe the JSON response of a cURL command to jq , then jq pretty-prints this response in the terminal.

Suppose we only wanted a single element from the JSON data. To access a single element from a JSON array, pass the array index filter to jq , which follows the syntax .[x] with x representing an index value (a positive or negative integer). To access the first element: To access the last element: To access the penultimate (second-to-last) element: To access the element at index 3 : If the index value is outside of the JSON array's bounds, then no element is returned by the array index filter: Here, the dataset only contains 41 rows. Therefore, any index beyond 40 causes the filter to return no element. If the index value is omitted, then all of the elements are returned: Additionally, the .[] filter can be used on JSON objects to return all top-level values within the object. If you are unsure whether the input data is valid JSON, append a ? to the empty square brackets to suppress errors. For example, if the input data is a stringified integer value... Without the ? , the error jq: error (at <stdin>:1): Cannot iterate over number (1) will be thrown. With the ? , this error is suppressed as if no error occurred.

Suppose we only wanted a subset of the JSON data. To extract a sub-array from a JSON array, pass the array/string slice filter to jq , which follows the syntax .[x:y] with x and y representing the starting (inclusive) and ending (exclusive) index values respectively (positive or negative integers). It behaves similarly to JavaScript's .slice() method.
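The identity and array-index filters above can be exercised with a small inline array standing in for the NYC water-consumption dataset (the field names here are assumptions, not the dataset's real schema):

```shell
command -v jq >/dev/null || exit 0   # skip if jq is not installed

JSON='[{"year":"1979","gallons":1512},{"year":"1980","gallons":1506},{"year":"1981","gallons":1309}]'

echo "$JSON" | jq '.'      # identity filter: pretty-print the input as-is
echo "$JSON" | jq '.[0]'   # first element
echo "$JSON" | jq '.[-1]'  # last element
echo "$JSON" | jq '.[-2]'  # penultimate element
echo "$JSON" | jq '.[99]'  # out of bounds: prints null
echo "$JSON" | jq '.[]'    # omitted index: every element, one per line
echo '1' | jq '.[]?'       # ? suppresses "Cannot iterate over number (1)"
```

Piping a cURL response works the same way, e.g. curl <endpoint> | jq '.' .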
To extract the first element only: To extract the last element only: To extract all elements but the first (omit the first element): To extract all elements but the last (omit the last element): To extract the elements at indices 3 - 5 :

To retrieve the length of a JSON array, pipe the output of the identity filter to the built-in length function: This returns the total number of elements within the JSON array. For our example dataset, the total number of records returned by the NYC Open Data API is 41 . For a JSON object, the length function returns the total number of top-level keys within this object. To retrieve the length of each item of a JSON array, pipe the output of the .[] filter to the length function: This returns a list of each element's length. For our example dataset, each record contains four pieces of information: the year, the population of NYC for that year, the total number of gallons (in millions) of water consumed by NYC residents per day and the average number of gallons of water consumed by a NYC resident per day. If an element is a string, then length returns the string's length. If an element is a null value, then length returns zero.

To retrieve the top-level keys from a JSON object, use the built-in keys function. These keys are returned as an array of strings. Unlike the length function, the keys function requires no filter piping. By default, these keys are sorted alphabetically. Alternatively, the keys_unsorted function does not sort keys alphabetically and returns the keys in their original order. For JSON arrays, this function returns a list of indices. Experiment with these techniques on other JSON data sources/files.
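A sketch of the slice, length and keys operations on an inline stand-in for the dataset:

```shell
command -v jq >/dev/null || exit 0   # skip if jq is not installed

JSON='[{"year":"1979"},{"year":"1980"},{"year":"1981"},{"year":"1982"}]'

echo "$JSON" | jq '.[:1]'        # first element only (as a one-element array)
echo "$JSON" | jq '.[-1:]'       # last element only
echo "$JSON" | jq '.[1:]'        # omit the first element
echo "$JSON" | jq '.[:-1]'       # omit the last element
echo "$JSON" | jq '.[1:3]'       # indices 1-2 (the ending index is exclusive)
echo "$JSON" | jq '. | length'   # number of elements: 4
echo "$JSON" | jq '.[] | length' # per-element length (keys per object here)
echo '{"b":1,"a":2}' | jq 'keys'           # ["a","b"] -- sorted
echo '{"b":1,"a":2}' | jq 'keys_unsorted'  # ["b","a"] -- original order
echo "$JSON" | jq 'keys'                   # array input: a list of indices
```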


ffmpeg - Thumbnail and Preview Clip Generation (Part 2)

Disclaimer - If you are unfamiliar with FFmpeg, then please read this blog post before proceeding. When you upload a video to a platform such as YouTube, you can select and add a custom thumbnail image to display within its result item. Amongst the many recommended videos, a professionally made thumbnail captures the attention of undecided users and improves the chances of your video being played. At a low level, a thumbnail consists of an image, a title and a duration (placed within a faded black box and fixed to the lower-right corner): To generate a thumbnail from a video with ffmpeg :

Let's test the drawtext filter by extracting the thumbnail image from the beginning of the video and writing "Test Text" to the center of this image. This thumbnail image will be a JPEG file. Notice that the drawtext filter accepts the parameters text , fontcolor , fontsize , x and y for configuring it: The parameters are delimited by a colon. For a full list of drawtext parameters, consult the drawtext section of the FFmpeg filters documentation.

Now that we've covered the basics, let's add a duration to this thumbnail: Unfortunately, there's no convenient variable like w or tw for accessing the input's duration. Therefore, we must extract the duration from the input's information, which is outputted by the -i option. 2>&1 redirects standard error ( 2 for stderr ) to standard output ( 1 for stdout ). We pipe the information outputted by the -i option directly to grep to search for the line containing the text "Duration" and pipe it to cut to extract the duration (i.e., 00:00:10 for ten seconds) from this line. This duration is stored within a variable DURATION so that it can be injected into the text passed to drawtext . Here, we use two drawtext filters to modify the input media: one for writing the title text "Test Text" and one for writing the duration "00:00:10". The filters are comma delimited. To place the duration within a box, provide the box parameter and set it to 1 to enable it.
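A sketch of the commands described above (font sizes and offsets are assumptions; requires an ffmpeg build with the drawtext filter enabled):

```shell
command -v ffmpeg >/dev/null || exit 0   # skip if ffmpeg is unavailable
INPUT=./Big_Buck_Bunny_360_10s_30MB.mp4
[ -f "$INPUT" ] || exit 0                # skip if the sample video is absent

# `ffmpeg -i` prints media details to stderr; 2>&1 folds stderr into stdout
# so grep/cut can pull "00:00:10" out of the "Duration: 00:00:10.00, ..." line.
DURATION=$(ffmpeg -i "$INPUT" 2>&1 | grep "Duration" | cut -d ' ' -f 4 | cut -d '.' -f 1)

# Two comma-separated drawtext filters: "Test Text" centered, then the
# duration inside a black box (box=1) pinned to the lower-right corner.
ffmpeg -y -i "$INPUT" -frames:v 1 \
  -vf "drawtext=text='Test Text':fontcolor=white:fontsize=32:x=(w-text_w)/2:y=(h-text_h)/2,drawtext=text='$DURATION':fontcolor=white:fontsize=18:box=1:boxcolor=black:x=w-text_w-10:y=h-text_h-10" \
  thumbnail.jpg
```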
To set the background color of this box, provide the boxcolor parameter. Note : Alternatively, you could get the video's duration via the ffprobe command.

Let's tidy up this thumbnail by substituting the placeholder title with the actual title, uppercasing this title, changing the font to "Open Sans" and moving the duration box to the bottom-right corner. Like the duration, the title must also be extracted from the input media's information. To uppercase every letter in the title, place the ^^ symbol of Bash 4 at the end of the title's variable via parameter expansion ( ${TITLE^^} ). Since Bash is required for the uppercasing, let's place these commands inside of a .sh file beginning with a Bash shebang, which determines how the script will be executed. To find the location of the Bash interpreter for the shebang, run the following command: ( thumbnail.sh ) To specify a font weight for a custom font, reference that font weight's file as the fontfile . Don't forget to replace <username> with your own username! Additionally, several changes were made to the thumbnail box. The box color has a subtle opacity of 0.625. This number (any number between 0 and 1) follows the @ in the boxcolor . A border width of 8px provides a bit of spacing between the edges of the box and the text itself. Note : If you run into a bash: Bad Substitution error, update Bash to version 4+ and verify the Bash shebang correctly points to the Bash executable.

When you hover over a recommended video's thumbnail, a brief clip appears and plays to give you an idea of what the video's content is. With the ffmpeg command, generating a clip from a video is relatively easy. Just provide a starting timestamp via the -ss option (from the original video, -ss seeks until it reaches this timestamp, which serves as the point the clip begins at) and an ending timestamp via the -to option (the point in the original video at which the clip should end).
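Putting the thumbnail steps together, a thumbnail.sh along these lines is one possible sketch (the font path and geometry are assumptions; requires Bash 4+ for ${TITLE^^} and an ffmpeg build with drawtext):

```shell
#!/usr/bin/env bash
# thumbnail.sh -- sketch; font path and output geometry are assumptions.
command -v ffmpeg >/dev/null || exit 0
INPUT=./Big_Buck_Bunny_360_10s_30MB.mp4
[ -f "$INPUT" ] || exit 0

FONT="$HOME/Library/Fonts/OpenSans-Bold.ttf"   # font-weight-specific file

# Pull the title and duration out of the details printed by `ffmpeg -i`.
TITLE=$(ffmpeg -i "$INPUT" 2>&1 | grep -m 1 "title" | cut -d ':' -f 2- | xargs)
DURATION=$(ffmpeg -i "$INPUT" 2>&1 | grep "Duration" | cut -d ' ' -f 4 | cut -d '.' -f 1)

# ${TITLE^^} uppercases the title (Bash 4+). boxcolor=black@0.625 gives the
# duration box a 62.5%-opaque background; boxborderw=8 pads the text by 8px.
ffmpeg -y -i "$INPUT" -frames:v 1 \
  -vf "drawtext=text='${TITLE^^}':fontfile=$FONT:fontcolor=white:fontsize=32:x=(w-text_w)/2:y=(h-text_h)/2,drawtext=text='$DURATION':fontfile=$FONT:fontcolor=white:fontsize=18:box=1:boxcolor=black@0.625:boxborderw=8:x=w-text_w-10:y=h-text_h-10" \
  thumbnail.jpg
```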
Because video previews on YouTube are three seconds long, let's extract a three-second segment starting from the four-second mark and ending at the seven-second mark. Since the clip lasts for only a few seconds, we must re-encode the video (exclude -c copy ) to accurately capture instances when no keyframes exist.

To clip a video without re-encoding, ffmpeg must capture a sufficient number of keyframes from the video. Since MP4s are encoded with the H.264 video codec ( h264 (High) is stated under the video's metadata printed by ffmpeg -i <input> ), if we assume that there are 250 frames between any two keyframes ("a GOP size of 250"), then for the ten-second Big Buck Bunny video with a frame rate of 30 fps, there is one keyframe every eight to nine seconds. Clipping a segment shorter than nine seconds with -c copy results in no keyframes being captured, and thus, the outputted clip contains no video ( 0 kB of video). Eight Second Clip (with -c copy ): Nine Second Clip (with -c copy ): Note : Alternatively, the -t option can be used in place of the -to option. With the -t option, you must specify the duration rather than the ending timestamp. So instead of 00:00:07 with -to , it would be 00:00:03 with -t for a three-second clip.

Suppose you want to add your brand's logo, custom-made title graphics or watermark to the thumbnail. To overlay such an image on top of a thumbnail, pass this image as an input file via the -i option and apply the overlay filter. Position the image on top of the thumbnail accordingly with the x and y parameters. ( thumbnail.sh ) Passing multiple inputs (in this case, a video and a watermark image) requires the -filter_complex option in place of the -vf option. The main_h and overlay_h variables represent the main input's height (from the input video) and the overlay's height (from the input watermark image) respectively. Here, we place the watermark image in the lower-left corner of the thumbnail.
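The clip commands sketched below use the timestamps from the text; the to_seconds helper is an illustrative addition (not from the original post) that makes the relationship between -to and -t explicit:

```shell
# HH:MM:SS -> seconds (Bash; 10# forces base-10 despite leading zeros).
to_seconds() {
  local h=${1%%:*} rest=${1#*:}
  local m=${rest%%:*} s=${rest#*:}
  echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

START=00:00:04; END=00:00:07
DUR=$(( $(to_seconds "$END") - $(to_seconds "$START") ))   # 3 seconds

command -v ffmpeg >/dev/null || exit 0   # skip if ffmpeg/input are absent
INPUT=./Big_Buck_Bunny_360_10s_30MB.mp4
[ -f "$INPUT" ] || exit 0

# Re-encode (no `-c copy`) so the clip can begin on a non-keyframe.
ffmpeg -y -i "$INPUT" -ss "$START" -to "$END" clip.mp4

# Equivalent with -t, which takes a duration instead of an end timestamp:
ffmpeg -y -i "$INPUT" -ss "$START" -t "$DUR" clip-t.mp4
```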
The watermark image looks a bit large compared to the other elements on the thumbnail. Let's scale down the watermark image to half its original size by first scaling it down before any of the existing chained filters are executed. ( thumbnail.sh ) To scale the watermark image to half its size, we must explicitly tell the scale filter to only scale this image and not the video. This is done by prepending [1:v] to the scale filter to have the scale filter target our second input -i ./watermark-ex.png . The iw and ih variables will represent the watermark image's width and height respectively. Once the scaling is done, the scaled watermark image is outputted to ovrl , which can be referenced by other filters for consumption as a filter input. Because the overlay filter takes two inputs, an input video and an input image overlay, we prepend the overlay filter with these inputs: [0:v] for the first input -i ./Big_Buck_Bunny_360_10s_30MB.mp4 and ovrl for our scaled watermark image. Imagine having a large repository of videos that needs to be processed and uploaded during continuous integration. Write a Bash script to automate this process.
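A sketch of the scaled watermark overlay described above (the watermark filename comes from the text; the 10px offsets are assumptions):

```shell
command -v ffmpeg >/dev/null || exit 0
INPUT=./Big_Buck_Bunny_360_10s_30MB.mp4
WATERMARK=./watermark-ex.png
{ [ -f "$INPUT" ] && [ -f "$WATERMARK" ]; } || exit 0

# [1:v] makes the scale filter target only the second input (the watermark),
# halving it via its own iw/ih and naming the result [ovrl]; [0:v] (the
# video) and [ovrl] then feed the overlay filter. y=main_h-overlay_h-10
# pins the scaled watermark to the lower-left corner.
ffmpeg -y -i "$INPUT" -i "$WATERMARK" -frames:v 1 \
  -filter_complex "[1:v]scale=iw/2:ih/2[ovrl];[0:v][ovrl]overlay=x=10:y=main_h-overlay_h-10" \
  thumbnail.jpg
```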



ffmpeg - Editing Audio and Video Content (Part 1)

Online streaming and multimedia content platforms garner a large audience and consume a disproportionate amount of bandwidth compared to other types of platforms. These platforms rely on content creators to upload, share and promote their videos and music. To process and polish video and audio files, professionals and amateurs alike typically resort to using interactive software, such as Adobe Premiere. Such software features many tools to unleash the creativity of its users, but each comes with its own set of entry barriers (learning curve and pricing) and unique workflows for editing tasks. For example, in Adobe Premiere, to manually concatenate footage together, you create a nested sequence, which involves several steps of creating sequences and dragging and dropping clips into a workspace's timeline. If you produce lots of content weekly for a platform such as YouTube and work on a tight schedule that leaves no extra time for video editing, then you may consider hiring a devoted video editor to handle the video editing for you.

Fortunately, you can develop a partially autonomous workflow for video editing by offloading certain tedious tasks to FFmpeg. FFmpeg is a cross-platform, open-source library for processing multimedia content (e.g., videos, images and audio files) and converting between different video formats (e.g., MP4 to WebM). Commonly, developers use FFmpeg via the ffmpeg CLI tool, but there are language-specific bindings written for FFmpeg to import it as a package/dependency into your projects. With ffmpeg , Bash scripts can automate your workflow with simple, single-line commands, whether it is making montages, replacing a video's audio with stock background music or streamlining bulk uploads. This either significantly reduces or completely eliminates your dependence on a user interface to manually perform these tasks by moving around items, clicking buttons, etc. Below, I'm going to show you...
Some operating systems already have ffmpeg installed. To check, simply type ffmpeg into the terminal. If the command is already installed, then the terminal prints a synopsis of ffmpeg . If ffmpeg is not yet installed on your machine, then visit the FFmpeg website, navigate to the "Download" page, download a compiled executable (compatible with your operating system) and execute it once the download is complete. Note : It is recommended to install the stable build to avoid unexpected bugs. Alternatively... For extensive documentation, enter the command man ffmpeg , which summons the manual pages for the ffmpeg command:

For this blog post, I will demonstrate the versatility of ffmpeg using the Big Buck Bunny video, an open-source, animated film built using Blender. Because downloading from the official Big Buck Bunny website might be slow for some end users, download the ten-second Big Buck Bunny MP4 video ( 30 MB, 640 x 360 ) from Test Videos. The wget CLI utility downloads files from the web. Essentially, this command downloads the video from Test Videos to the current directory, and this downloaded video is named Big_Buck_Bunny_360_10s_30MB.mp4 . The -c option tells wget to resume an interrupted download from the most recent download position, and the -O option tells wget to download the file to a location of your choice and customize the name of the downloaded file.

The ffmpeg command follows the syntax: For a full list of options supported by ffmpeg , consult the documentation. Square brackets and curly braces indicate optional items. Items grouped within square brackets are not required to be mutually exclusive, whereas items grouped within curly braces are required to be mutually exclusive. For example, you can provide the -i option with a path of the input file ( infile ) to ffmpeg without any infile options. However, to provide any outfile option, ffmpeg must be provided the path of the output file ( outfile ).
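A sketch of the download step (the exact URL is an assumption based on Test Videos' naming scheme — copy the real link from their site):

```shell
command -v wget >/dev/null || exit 0   # skip if wget is unavailable

# Assumed URL; replace with the link published on test-videos.co.uk.
URL=https://test-videos.co.uk/vids/bigbuckbunny/mp4/h264/360/Big_Buck_Bunny_360_10s_30MB.mp4
OUT=Big_Buck_Bunny_360_10s_30MB.mp4

# -c resumes an interrupted download; -O names the local file.
# Set DO_DOWNLOAD=yes to actually fetch the 30 MB file.
if [ "${DO_DOWNLOAD:-no}" = yes ] && [ ! -f "$OUT" ]; then
  wget -c "$URL" -O "$OUT" || rm -f "$OUT"   # drop the empty file on failure
fi

# With only an input and no output file, ffmpeg prints the media details.
if command -v ffmpeg >/dev/null && [ -f "$OUT" ]; then
  ffmpeg -hide_banner -i "$OUT" || true   # exits non-zero without an output
fi
```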
To specify an input media file, provide its path to the -i option. Unlike specifying an input media file, specifying an output media file does not require an option; it just needs to be the last argument provided to the ffmpeg command. To print information about a media file, run the following command: Just providing an input media file to the ffmpeg command displays its details within the terminal. Here, the Metadata contains information such as the video's title ("Big Buck Bunny, Sunflower version") and encoder ("Lavf54.20.4"). The video runs for approximately ten and a half minutes at 30 FPS. To strip away the FFmpeg banner information (i.e., the FFmpeg version) from the output of this command, provide the -hide_banner option. That's much cleaner!

To convert a media file to a different format, provide the outfile path (with the extension of the format). For example, to convert an MP4 file to a WebM file... Note : Depending on your machine's hardware, you may need to be patient for large files! To find out all the formats supported by ffmpeg , run the following command:

To reduce the amount of bandwidth consumed by users watching your videos on a mobile browser, or to save space on your hard/flash drive, compress your videos by: Here, we specify a video filter with the -vf option. We pass a scale filter to this option that scales down the video to a quarter of its original width and height. The original aspect ratio is not preserved. Note : To preserve the aspect ratio, you need to set either the target width or height to -1 (i.e., scale=360:-1 sets the width to 360px and the height to a value calculated based on this width and the video's aspect ratio). The output file is less than 100 KBs! Here, we specify the H.265 video codec by setting the -c:v option to libx265 . The -preset defines the speed of the encoding. The faster the encoding, the worse the compression, and vice versa.
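A sketch of the conversion and scaling commands above (output filenames are assumptions):

```shell
command -v ffmpeg >/dev/null || exit 0

# Every format this ffmpeg build can read (demux) or write (mux):
ffmpeg -hide_banner -formats | head -n 20

INPUT=Big_Buck_Bunny_360_10s_30MB.mp4
[ -f "$INPUT" ] || exit 0   # skip if the sample video is absent

# Container conversion: MP4 -> WebM (re-encodes, so large files take a while).
ffmpeg -y -i "$INPUT" Big_Buck_Bunny_360_10s_30MB.webm

# Scale to a quarter of the original width and height (iw/ih are the
# input's dimensions):
ffmpeg -y -i "$INPUT" -vf "scale=iw/4:ih/4" Big_Buck_Bunny_small.mp4

# Fixed width; -1 derives the height from the input's aspect ratio:
ffmpeg -y -i "$INPUT" -vf "scale=360:-1" Big_Buck_Bunny_360.mp4
```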
The default preset is medium , but we set it to fast , which is just one level above medium in terms of speed. The CRF is set to 28 for the default quality maintained by the codec. The -tag:v option is set to hvc1 to allow QuickTime to play this video. The output file is less than 500 KBs, and it still has the same aspect ratio and dimensions as the original video while also maintaining an acceptable quality!

Unfortunately, because browser support for H.265 is sparse, videos compressed with this standard cannot be viewed within most major browsers (e.g., Chrome and Firefox). Instead, use the H.264 video codec, an older standard that offers worse compression ratios (larger compressed files, etc.) compared to H.265, to compress videos. Videos compressed with this standard can be played in all major browsers. Note : We don't need to provide the additional -tag:v option since QuickTime automatically knows how to play videos compressed with H.264. Note : 23 is the default CRF value for H.264 (it visually corresponds to 28 for H.265, but the size of an H.264-compressed file will be twice that of an H.265-compressed file). Notice that the resulting video ( Big_Buck_Bunny_360_10s_30MB_codec_2.mp4 ) is now twice the size of the previous one ( Big_Buck_Bunny_360_10s_30MB_codec.mp4 ), but now you have a video that can be played within all major browsers. Simply drag and drop these videos into separate tabs of Chrome or Firefox to see this. Big_Buck_Bunny_360_10s_30MB_codec_2.mp4 in Firefox: Big_Buck_Bunny_360_10s_30MB_codec.mp4 in Firefox: Check out this codec compatibility table to ensure you choose the appropriate codec based on your videos and the browsers you need to support. Much like formats, to find out all the codecs supported by ffmpeg , run the following command:

First, let's download another video, the ten-second Jellyfish MP4 video ( 30 MB, 640 x 360 ), from Test Videos.
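The two codec runs described above might look like this (requires an ffmpeg build with libx265 and libx264; output names follow the text):

```shell
command -v ffmpeg >/dev/null || exit 0

# Every codec this ffmpeg build supports:
ffmpeg -hide_banner -codecs | head -n 20

INPUT=Big_Buck_Bunny_360_10s_30MB.mp4
[ -f "$INPUT" ] || exit 0   # skip if the sample video is absent

# H.265/HEVC: best compression, sparse browser support. -crf 28 is the
# libx265 default quality; -tag:v hvc1 lets QuickTime play the result.
ffmpeg -y -i "$INPUT" -c:v libx265 -preset fast -crf 28 -tag:v hvc1 \
  Big_Buck_Bunny_360_10s_30MB_codec.mp4

# H.264: roughly twice the file size at -crf 23 (the libx264 default),
# but playable in all major browsers.
ffmpeg -y -i "$INPUT" -c:v libx264 -preset fast -crf 23 \
  Big_Buck_Bunny_360_10s_30MB_codec_2.mp4
```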
To concatenate this video to the Big Buck Bunny video, run the following command: Since both video files are MP4s and encoded with the same codec and parameters (e.g., dimensions and time base), they can be concatenated by passing them through a demuxer, which extracts a list of video files from an input text file and demultiplexes the individual streams (e.g., audio, video and subtitles) of each video file; the constituent streams are then multiplexed back into a single coherent stream. Essentially, this command concatenates audio to audio, video to video, subtitles to subtitles, etc., and then combines these concatenations together into a single video file. By omitting the decoding and encoding steps for the streams (via -c copy ), the command quickly concatenates the files with no loss in quality. Note : Setting the -safe option to 0 allows the demuxer to accept any file, regardless of protocol specification. If you are just concatenating files referenced via relative paths, then you can omit this option. When you play the concatenated.mp4 video file, you will notice that this video's duration is 20 seconds. It starts with the Big Buck Bunny video and then immediately jumps to the Jellyfish video at the 10-second mark. Note : If the input video files are encoded differently or are not of the same format, then you must re-encode all of the video files with the same codec before concatenating them.

Suppose you wanted to merge the audio of a video with stock background music to fill the silence. To do this, you must provide the video file and stock background music file as input files for ffmpeg . Then, we specify the video codec ( -c:v ) to be copy to tell FFmpeg to copy the video's bitstream directly to the output with zero quality changes, and we specify the audio codec ( -c:a ) to be aac (for Advanced Audio Coding ) to tell FFmpeg to encode the audio to an MP4-friendly format.
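A sketch of the concat-demuxer invocation (the Jellyfish filename is an assumption based on Test Videos' naming scheme):

```shell
# The list file feeds the concat demuxer: one `file` directive per line.
cat > mylist.txt <<'EOF'
file 'Big_Buck_Bunny_360_10s_30MB.mp4'
file 'Jellyfish_360_10s_30MB.mp4'
EOF

# -f concat selects the demuxer, -safe 0 relaxes path checks, and -c copy
# skips re-encoding, making the concatenation fast and lossless.
if command -v ffmpeg >/dev/null && [ -f Big_Buck_Bunny_360_10s_30MB.mp4 ] \
    && [ -f Jellyfish_360_10s_30MB.mp4 ]; then
  ffmpeg -y -f concat -safe 0 -i mylist.txt -c copy concatenated.mp4
fi
```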
Since our audio file is an MP3, which can be handled by an MP4 container, you can omit the -c:a option. To prevent the output from lasting as long as the two-and-a-half-minute audio file rather than the original video, add the -shortest option to tell FFmpeg to stop encoding once the shortest input file (in this case, the ten-second Big Buck Bunny video) is finished. Additionally, download the audio file Ukulele from Bensound. If your audio file happens to have a shorter duration than your video file, and you want to continuously loop the audio file until the end of the video, then pass the -stream_loop option to FFmpeg. Set its value to -1 to infinitely loop over the input stream. Note : The -stream_loop option is applied to the input file that comes directly after it in the command, which happens to be the short.mp3 file. This audio file has a duration less than that of the video file. Consult the FFmpeg Documentation to learn more about all of the different video processing techniques it provides.
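The audio-merging and looping commands above could be sketched as follows (the Bensound filename is an assumption; the -map options are an explicit addition here to pair input 0's video with input 1's audio):

```shell
command -v ffmpeg >/dev/null || exit 0
VIDEO=Big_Buck_Bunny_360_10s_30MB.mp4
AUDIO=bensound-ukulele.mp3   # assumed filename for the Bensound track
{ [ -f "$VIDEO" ] && [ -f "$AUDIO" ]; } || exit 0

# Copy the video stream untouched (-c:v copy), encode the audio to AAC
# (-c:a aac), and stop at the shortest input (-shortest) so the 2.5-minute
# track doesn't outlast the ten-second video.
ffmpeg -y -i "$VIDEO" -i "$AUDIO" -map 0:v:0 -map 1:a:0 \
  -c:v copy -c:a aac -shortest merged.mp4

# Loop a too-short audio file forever; -stream_loop applies to the input
# that follows it (here, short.mp3), and -shortest ends the output with
# the video.
if [ -f short.mp3 ]; then
  ffmpeg -y -i "$VIDEO" -stream_loop -1 -i short.mp3 -map 0:v:0 -map 1:a:0 \
    -c:v copy -c:a aac -shortest looped.mp4
fi
```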


Searching with find and grep

If you work within a disorganized workspace with deeply nested folders and try locating a specific folder, file or code snippet, then your productivity suffers from the constant distraction of manually searching through the workspace. Navigating the workspace and rummaging through every folder (double-clicking each one) to find a single folder or file becomes repetitive and directs attention away from your work. If you forget to close the folders after exploring them, then these opened folders accumulate over time and obstruct subsequent searches by cluttering the screen. Additionally, a computer's file explorer, such as Mac's Finder or Ubuntu's Nautilus, slows down when loading and displaying folders and files within large external hard drives, thumb drives or SD cards filled (or nearly filled) to maximum capacity.

Operating systems based on the UNIX kernel provide the find and grep command-line utilities to search for files/folders and text within a file respectively via pattern matching. With a single-line command, you avoid interacting with the interface of the computer's file explorer. Instead, the command prints the search results to standard output ( stdout ) displayed within the terminal. Both the find and grep commands are considered some of the most essential building blocks of Bash scripting! Knowing how to use them allows you to integrate them into your continuous integration (CI) pipeline to automate search tasks. Below, I'm going to show you:

To demonstrate the find and grep commands, we will search for directories and files within a downloaded copy of one of GitHub's most popular repositories, facebook/react . The find command, as its name implies, recursively finds directories and files located within a specified list of directory paths.
When a file or directory matches search criteria (based on the options provided to the find command), the find command outputs the matched directories and files by their path relative to the given starting point/s. To recursively list all of the directories and files (including those hidden) within the current directory: To narrow our list down to a specific directory or file, we must provide an expression to the find command: Note : Angle brackets indicate required arguments, whereas square brackets indicate optional arguments. An expression describes how to identify a match via tests , which use certain properties of directories/files to determine which directory/file satisfies the defined conditions: For more information on other tests, check out the find command's documentation in the Linux Manual Page. To get started, let's search for all files named package.json . Note : This command also searches for all directories named package.json . It is highly unlikely for directory names to contain extensions. To limit the search to files only, add the -type f test. If we try to search for directories or files that do not exist, then find returns an empty list with an exit code of zero. Now let's search for all JSON files. If you execute this command without the quotation marks around the glob pattern, then you may expect the terminal to also print a list of JSON files. However, notice that the terminal only prints a list of package.json files. Suppose we rename the package.json file in the root of the current directory to package-x.json . If we execute the previous find command, then notice that the terminal only prints this package-x.json file. Without the quotation marks, Bash expands the glob pattern and will replace it with the first file in the current directory that matches this pattern, which is package-x.json . To ensure that Bash does not expand the glob pattern and behave non-deterministically, wrap the glob pattern in quotation marks. 
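A runnable sketch of the searches above, using a throwaway directory tree in place of the facebook/react checkout (the tree's layout is an assumption):

```shell
# Scratch tree standing in for the repository checkout.
tmp=$(mktemp -d)
mkdir -p "$tmp/packages/react" "$tmp/packages/react-dom"
echo '{}' > "$tmp/package.json"
echo '{}' > "$tmp/packages/react/package.json"
echo '{}' > "$tmp/packages/react-dom/package.json"

find "$tmp" -name package.json -type f   # files only: every package.json
find "$tmp" -name "*.json"               # quoted glob: all JSON files
find "$tmp" -name no-such-file           # no match: empty output, exit 0

rm -rf "$tmp"
```

Note the quotation marks around "*.json" — without them, the shell expands the glob before find ever sees it.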
Now, revert the renaming of package-x.json back to package.json . Currently, this project has no empty directories: Let's create an empty directory: When we search for empty directories within this project, the terminal will print the empty-dir directory. Since relative paths start with ./ to indicate the current directory, the argument passed to the -path test must begin with either * or ./ to allow the glob pattern to match the leading segment of a relative path. If we tried to locate package.json files with -path and forgot to add these prefixes to the glob pattern, then the find command returns an empty list. Prepending * or ./ to the glob pattern allows the find command to correctly match for package.json files via their relative paths. Let's find all package.json files within the packages sub-directory: The -type test filters out directories/files based on what the user is looking for. If the user only wants the find command to limit its search to files, then the -type test must be passed f for "file." The following command prints all of the files within the current directory. If the user only wants the find command to limit its search to directories, then the -type test must be passed d for "directory." The following command prints all of the sub-directories within the current directory. To search for multiple types, join the types together and separate each type with a comma. The -type test supports additional types: Proceed on to the second part of this blog post, which dives into the grep command.
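The -empty, -path and -type tests above can be sketched against a throwaway tree (layout assumed; the comma-joined -type d,f syntax requires GNU find):

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/packages/react" "$tmp/empty-dir"
echo '{}' > "$tmp/package.json"
echo '{}' > "$tmp/packages/react/package.json"

find "$tmp" -type d -empty                           # -> .../empty-dir
find "$tmp" -path "*/packages/*" -name package.json  # only the nested copy
find "$tmp" -type f                                  # files only
find "$tmp" -type d                                  # directories only
find "$tmp" -type d,f                                # both (GNU find syntax)

rm -rf "$tmp"
```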
