Objectives

  • Write a shell script that runs a command or series of commands for a fixed set of files.
  • Run a shell script from the command line.
  • Write a shell script that operates on a set of files defined by the user on the command line.

We are finally ready to see what makes the shell such a powerful programming environment. We are going to take the commands we repeat frequently and save them in files so that we can re-run all those operations again later by typing a single command. For historical reasons, a bunch of commands saved in a file is usually called a shell script, but make no mistake: these are actually small programs.

Let's start by putting the following lines in the file sorted_lengths.sh:

  #! /bin/bash
  wc -l *.pdb | sort -n > sorted_lengths.txt

The first line is a “shebang” (or “hash bang”): it tells the system which program to run the script with. In this case it is bash. For a Python script, the line would instead give the path to the Python interpreter. The second line is our pipe-and-filter command for generating a list of files sorted by line count.
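
As a minimal sketch of how the shebang is used (the scratch directory and the tiny stand-in .pdb files are made up for illustration, and mktemp is assumed to be available): when the script file is handed to bash explicitly, bash reads the file itself and treats the first line as an ordinary comment. The shebang only comes into play when the file is executed by name.

```shell
# Work in a throwaway directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two stand-in data files; the contents are placeholders, not real PDB data.
printf 'a\nb\nc\n' > methane.pdb
printf 'a\nb\nc\nd\ne\n' > ethane.pdb

# Save the two-line script from the text.
cat > sorted_lengths.sh <<'EOF'
#! /bin/bash
wc -l *.pdb | sort -n > sorted_lengths.txt
EOF

# Show the shebang line.
head -n 1 sorted_lengths.sh

# Hand the file to bash explicitly; here the first line is just a comment.
bash sorted_lengths.sh
cat sorted_lengths.txt
```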

We can try to run the script:

  $ ./sorted_lengths.sh
  bash: ./sorted_lengths.sh: Permission denied

Not quite what we expected. We don't have permission to execute (run) the file. Unix controls who can read, modify, and run files using permissions. Users can belong to any number of groups, each of which has a unique group name and numeric group ID. The list of who's in what group is usually stored in the file /etc/group.
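
To see which user and groups the shell is currently running as, the standard id command can be used. This is only a sketch: the names and numbers printed depend entirely on the system it is run on.

```shell
# Print the current user's name and numeric user ID.
id -un
id -u

# Print the primary group name, then the names of all groups the user belongs to.
id -gn
id -Gn
```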

Now let's look at files and directories. Every file and directory on a Unix computer belongs to one owner and one group. Along with each file's content, the operating system stores the numeric IDs of the user and group that own it.

The user-and-group model means that for each file every user on the system falls into one of three categories: the owner of the file, someone in the file's group, and everyone else.

For each of these three categories, the computer keeps track of whether people in that category can read the file, write to the file, or execute the file (i.e., run it if it is a program).

For example, if a file had the following set of permissions:

           user   group  all
  read     yes    yes    no
  write    yes    no     no
  execute  no     no     no

it would mean that:

  • the file's owner can read and write it, but not run it;
  • other people in the file's group can read it, but not modify it or run it; and
  • everybody else can do nothing with it at all.
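
The permissions described in the list above can be set and inspected as follows. This is a sketch on a scratch file (notes.txt is a made-up name, and mktemp is assumed to be available); it uses the chmod command, and the resulting permission string is read from the start of the ls -l output.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# notes.txt is a hypothetical file used only for this demonstration.
touch notes.txt

# Owner: read and write; group: read only; everyone else: nothing.
chmod u=rw,g=r,o= notes.txt

# The first ten characters of the long listing are the type and permission string.
perms=$(ls -l notes.txt | cut -c1-10)
echo "$perms"
```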

Let's look at this model in action by running ls -l:

  $ ls -l
  -rw-r--r-- 1 jens jens 1178 Aug 10 14:31 cubane.pdb
  -rw-r--r-- 1 jens jens  634 Aug 10 14:31 ethane.pdb
  -rw-r--r-- 1 jens jens  431 Aug 10 14:31 methane.pdb
  -rw-rw-r-- 1 jens jens   57 Aug 10 14:31 sorted_lengths.sh

The -l flag tells ls to give us a long-form listing. It's a lot of information, so let's go through the columns in turn.

On the right side, we have the files' names. Next to them, moving left, are the times and dates they were last modified. Backup systems and other tools use this information in a variety of ways, but you can use it to tell when you (or anyone else with permission) last changed a file.

Next to the modification time is the file's size in bytes and the names of the user and group that own it (in this case, jens and jens respectively). We'll skip over the second column for now (the one showing 1 for each file) because it's the first column that we care about most. This shows the file's permissions, i.e., who can read, write, or execute it.

Let's have a closer look at a typical permission string: -rwxr-xr-x. The first character tells us what type of thing this is: '-' means it's a regular file, while 'd' means it's a directory, and other characters mean more esoteric things.

The next three characters tell us what permissions the file's owner has. Here, the owner can read, write, and execute the file: rwx. The middle triplet shows us the group's permissions. If the permission is turned off, we see a dash, so r-x means “read and execute, but not write”. The final triplet shows us what everyone who isn't the file's owner, or in the file's group, can do. In this case, it's 'r-x' again, so everyone on the system can look at the file's contents and run it.
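
The breakdown above can be sketched in a few lines of shell. The variable names (kind, owner, group, other) are made up for this illustration; cut -c simply picks out character positions from the string.

```shell
perm="-rwxr-xr-x"

# Character 1 is the file type; characters 2-4, 5-7 and 8-10 are the three triplets.
kind=$(printf '%s\n' "$perm" | cut -c1)
owner=$(printf '%s\n' "$perm" | cut -c2-4)
group=$(printf '%s\n' "$perm" | cut -c5-7)
other=$(printf '%s\n' "$perm" | cut -c8-10)

echo "type:  $kind"
echo "owner: $owner"
echo "group: $group"
echo "other: $other"
```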

To change permissions, we use the chmod command (whose name stands for “change mode”).

  $ chmod u+x sorted_lengths.sh

The 'u' signals that we're changing the privileges of the user (i.e., the file's owner), + that we are adding permissions, and x is the permission being added (execute). A quick ls -l shows us that it worked, because the owner's permissions are now read, write, and execute:

  $ ls -l
  -rw-r--r-- 1 jens jens 1178 Aug 10 14:31 cubane.pdb
  -rw-r--r-- 1 jens jens  634 Aug 10 14:31 ethane.pdb
  -rw-r--r-- 1 jens jens  431 Aug 10 14:31 methane.pdb
  -rwxrw-r-- 1 jens jens   57 Aug 10 14:31 sorted_lengths.sh
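
The effect of chmod u+x can be checked on a scratch file. In this sketch, hello.sh is a made-up one-line script, mktemp is assumed to be available, and the starting permissions are set explicitly so that the before and after strings do not depend on local defaults.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# hello.sh is a hypothetical script used only for this demonstration.
printf '#! /bin/bash\necho hello\n' > hello.sh

# Start from a known state: owner read/write, group and others read only.
chmod u=rw,go=r hello.sh
before=$(ls -l hello.sh | cut -c1-10)

# Add execute permission for the owner only.
chmod u+x hello.sh
after=$(ls -l hello.sh | cut -c1-10)

echo "before: $before"
echo "after:  $after"

# With the execute bit set, the script can be run by name.
./hello.sh
```

Only the owner's triplet changes; the group and other triplets are left as they were.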

Now we can finally run our script.

  $ ./sorted_lengths.sh

We check the result with ls and cat.

  $ ls
  cubane.pdb           ethane.pdb           methane.pdb
  sorted_lengths.sh    sorted_lengths.txt
  $ cat sorted_lengths.txt
    9 methane.pdb
   12 ethane.pdb
   15 propane.pdb
   20 cubane.pdb
   21 pentane.pdb
   30 octane.pdb
  107 total

Maybe we want to make the input more flexible so that we can tell the script which files to run on.

Let's edit the file:

  #! /bin/bash
  wc -l $* | sort -n > sorted_lengths.txt

$* means “All of the command-line parameters to the shell script.”
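
One caveat is worth a short sketch. An unquoted $* is split again on whitespace, so a file name containing a space arrives as two separate words. A common alternative, not used in this lesson, is "$@", which keeps each parameter intact. The function names below are made up for illustration; inside a function, $* and "$@" behave as they do inside a script.

```shell
# count reports how many parameters it received.
count() {
    echo $#
}

# Pass the parameters on using unquoted $* (re-split on whitespace).
via_star() {
    count $*
}

# Pass the parameters on using "$@" (each parameter kept as given).
via_at() {
    count "$@"
}

via_star "my file.pdb" ethane.pdb
via_at "my file.pdb" ethane.pdb
```

For the file names used in this lesson, which contain no spaces, both forms give the same result.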

Before we try this out let's add a comment:

  #! /bin/bash
  # Write a list of files sorted by length (in lines) to sorted_lengths.txt
  # Usage: sorted_lengths.sh filenames
  wc -l $* | sort -n > sorted_lengths.txt

A comment starts with a # character and runs to the end of the line. The computer ignores comments, but they're invaluable for helping people understand and use scripts.

We can run the script by naming the files explicitly:

  $ ./sorted_lengths.sh methane.pdb ethane.pdb propane.pdb

or using the wild-card:

  $ ./sorted_lengths.sh *.pdb
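
When a wild-card is used, the shell expands it into the matching file names before the script starts, so the script receives an ordinary list of parameters. A minimal sketch, using a scratch directory with empty stand-in files and a made-up function show_params in place of a script (mktemp is assumed to be available):

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch cubane.pdb ethane.pdb methane.pdb notes.txt

# show_params is a hypothetical stand-in for a script: it reports what it received.
show_params() {
    echo "number of parameters: $#"
    echo "parameters: $*"
}

# The shell replaces *.pdb with the three matching names before the call.
show_params *.pdb
```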

Let's make another script, middle.sh:

  #! /bin/bash
  # Select lines from the middle of a file
  # Usage: middle.sh filename -end_line -num_lines
  head $2 $1 | tail $3

Inside a shell script, $1 means “the first filename (or other parameter) on the command line”. $2 and $3 mean the “second parameter” and “third parameter”, respectively.

What does this script do?
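
One way to explore the question is to run the script on a small file whose contents are easy to predict. In this sketch, numbers.txt is a made-up input holding the numbers 1 to 10, one per line, and mktemp is assumed to be available. The script is passed to bash explicitly, so it does not need execute permission.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# numbers.txt is a hypothetical input: ten lines holding the numbers 1 to 10.
i=1
while [ "$i" -le 10 ]; do
    echo "$i"
    i=$((i + 1))
done > numbers.txt

# Save the script from the text.
cat > middle.sh <<'EOF'
#! /bin/bash
# Select lines from the middle of a file
# Usage: middle.sh filename -end_line -num_lines
head $2 $1 | tail $3
EOF

# Run it with -5 as the second parameter and -2 as the third.
bash middle.sh numbers.txt -5 -2
```

Changing the two numbers and comparing the output with the contents of numbers.txt shows what each parameter controls.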

Now we have two scripts we can run in succession. When we make a small change to one of the input files, is there a quick way to rerun the whole analysis?

Key Points

  • Save commands in files (usually called shell scripts) for re-use.
  • Use chmod to set a file's permissions, for example to make a script executable.
  • $* refers to all of a shell script's command-line parameters.
  • $1, $2, etc., refer to specified command-line parameters.
  • Letting users decide what files to process is more flexible and more consistent with built-in Unix commands.
wiki/cli/scripts.txt · Last modified: 2022/07/21 06:59 by 127.0.0.1