file globbing

(Written by Paul Cobbaut, https://github.com/paulcobbaut/, with contributions by: Alex M. Schapelle, https://github.com/zero-pytagoras/, Bert Van Vreckem, https://github.com/bertvv/)

This chapter will explain file globbing. Typing man 7 glob (on Debian) will tell you that long ago there was a program called /etc/glob that would expand wildcard patterns. Soon afterward, this became a shell built-in.

A string is a wildcard pattern if it contains ?, * or [. Globbing (or dynamic filename generation) is the operation that expands a wildcard pattern into a list of pathnames that match the pattern.

* asterisk

The asterisk * is interpreted by the shell as a sign to generate filenames, matching the asterisk to any combination of characters (even none). When no path is given, the shell will use filenames in the current directory. See the man page of glob(7) for more information.

student@linux:~/gen$ ls
file1  file2  file3  File4  File55  FileA  fileå  fileab  Fileab  FileAB  fileabc  fileæ  fileø  filex  filey  filez
student@linux:~/gen$ ls File*
File4  File55  FileA  Fileab  FileAB
student@linux:~/gen$ ls file*
file1  file2  file3  fileå  fileab  fileabc  fileæ  fileø  filex  filey  filez
student@linux:~/gen$ ls *ile55
File55
student@linux:~/gen$ ls F*ile55
File55
student@linux:~/gen$ ls F*55
File55
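
The expansion is done by the shell, not by ls: the command only receives the resulting list of names. The following is a minimal sketch that makes this visible. It assumes a scratch directory created with mktemp -d, and count_args is a hypothetical helper function introduced here only for illustration.

```shell
# Work in a throw-away directory so no real files are touched (assumption).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file1 file2 file3 notes

# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args file*)    # the shell expands file* before calling the function
quoted=$(count_args 'file*')    # quoted: the pattern is passed as one literal argument
all=$(count_args *)             # * alone matches every non-hidden name in this directory

echo "unquoted=$unquoted quoted=$quoted all=$all"
# prints: unquoted=3 quoted=1 all=4
```

The function never sees the asterisk in the unquoted case; it just receives three separate filenames.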

? question mark

Similar to the asterisk, the question mark ? is interpreted by the shell as a sign to generate filenames, matching the question mark with exactly one character.

student@linux:~/gen$ ls File?
File4  FileA
student@linux:~/gen$ ls Fil?4
File4
student@linux:~/gen$ ls Fil??
File4  FileA
student@linux:~/gen$ ls File??
File55  Fileab  FileAB
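
The difference between ? and * can be sketched in a small script. The directory and the log filenames below are assumptions chosen for the illustration; set -- is used only to capture the result of the expansion in the positional parameters.

```shell
# Scratch directory with a few hypothetical filenames.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch log log1 log2 log10

set -- log?      # exactly one character after "log"
one_char="$*"
set -- log??     # exactly two characters after "log"
two_chars="$*"
set -- log*      # any number of characters after "log", including none
any_chars="$*"

echo "log?  -> $one_char"     # log?  -> log1 log2
echo "log?? -> $two_chars"    # log?? -> log10
echo "log*  -> $any_chars"    # log*  -> log log1 log10 log2
```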

[] square brackets

The square bracket [ is interpreted by the shell as a sign to generate filenames, matching any of the characters between [ and the first subsequent ]. The order in this list between the brackets is not important. Each pair of brackets is replaced by exactly one character.

student@linux:~/gen$ ls File[5A]
FileA
student@linux:~/gen$ ls File[A5]3
ls: cannot access 'File[A5]3': No such file or directory
student@linux:~/gen$ ls File[A5]
FileA
student@linux:~/gen$ ls File[A5][5b]
File55
student@linux:~/gen$ ls File[a5][5b]
File55  Fileab
student@linux:~/gen$ ls File[a5][5b][abcdefghijklm]
ls: cannot access 'File[a5][5b][abcdefghijklm]': No such file or directory
student@linux:~/gen$ ls file[a5][5b][abcdefghijklm]
fileabc

You can also exclude characters from a list between square brackets by putting an exclamation mark ! directly after the opening bracket. These wildcards can be combined freely.

student@linux:~/gen$ ls file[a5][!Z]
fileab
student@linux:~/gen$ ls file[!5]*
file1  file2  file3  fileå  fileab  fileabc  fileæ  fileø  filex  filey  filez
student@linux:~/gen$ ls file[!5]?
fileab
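
As a compact recap of the bracket rules above, here is a small sketch. It assumes a scratch directory and four hypothetical filenames; the positional parameters are used to capture what each pattern expands to.

```shell
# Scratch directory with hypothetical filenames.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file5 fileA fileb filex

set -- file[5A]     # one character, either 5 or A
listed="$*"
set -- file[A5]     # same set written in a different order
reordered="$*"
set -- file[!5A]    # one character that is neither 5 nor A
negated="$*"

echo "file[5A]  -> $listed"       # file[5A]  -> file5 fileA
echo "file[A5]  -> $reordered"    # file[A5]  -> file5 fileA
echo "file[!5A] -> $negated"      # file[!5A] -> fileb filex
```

The first two results are identical, which illustrates that the order inside the brackets does not matter.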

a-z and 0-9 ranges

The bash shell will also understand ranges of characters between brackets.

student@linux:~/gen$ ls file[a-z]*
fileab  fileabc  filex  filey  filez
student@linux:~/gen$ ls file[0-9]
file1  file2  file3
student@linux:~/gen$ ls file[a-z][a-z][0-9]*
ls: cannot access 'file[a-z][a-z][0-9]*': No such file or directory
student@linux:~/gen$ ls file[a-z][a-z][a-z]*
fileabc
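
Ranges can be narrower than a-z or 0-9, and several ranges can share one pair of brackets. The sketch below assumes a scratch directory and hypothetical filenames that contain only digits and lowercase ASCII letters, so the result should not depend on the locale.

```shell
# Scratch directory with hypothetical filenames.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch img1 img2 img7 imga imgb

set -- img[0-9]       # any single digit
digits="$*"
set -- img[1-5]       # only the digits 1 through 5
low_digits="$*"
set -- img[a-z]       # any single lowercase ASCII letter
letters="$*"
set -- img[0-9a-z]    # two ranges combined in one bracket expression
combined_count=$#

echo "img[0-9]    -> $digits"                    # img[0-9]    -> img1 img2 img7
echo "img[1-5]    -> $low_digits"                # img[1-5]    -> img1 img2
echo "img[a-z]    -> $letters"                   # img[a-z]    -> imga imgb
echo "img[0-9a-z] -> $combined_count matches"    # img[0-9a-z] -> 5 matches
```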

named character classes

Instead of ranges, you can also specify named character classes: [[:alnum:]], [[:alpha:]], [[:blank:]], [[:cntrl:]], [[:digit:]], [[:graph:]], [[:lower:]], [[:print:]], [[:punct:]], [[:space:]], [[:upper:]], [[:xdigit:]]. For example, instead of [a-z], you can use [[:lower:]].

student@linux:~/gen$ ls file[a-z]*
fileab  fileabc  filex  filey  filez
student@linux:~/gen$ ls file[[:lower:]]*
fileå  fileab  fileabc  fileæ  fileø  filex  filey  filez

Note that the named character classes work better for international characters. In the example above, [a-z] does not match the Danish characters æ, ø, and å, but [[:lower:]] does.
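
Named classes can be used like any other bracket content, including after the negating exclamation mark. The following sketch assumes bash, a scratch directory, and hypothetical ASCII-only filenames (so it says nothing about accented characters, which depend on the locale).

```shell
# Scratch directory with hypothetical filenames.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch report1 reportA reportb report_

set -- report[[:digit:]]     # one digit
digit_match="$*"
set -- report[[:upper:]]     # one uppercase letter
upper_match="$*"
set -- report[[:alpha:]]     # one letter of either case
alpha_count=$#
set -- report[![:alnum:]]    # one character that is neither a letter nor a digit
other_match="$*"

echo "digit: $digit_match"             # digit: report1
echo "upper: $upper_match"             # upper: reportA
echo "alpha: $alpha_count matches"     # alpha: 2 matches
echo "other: $other_match"             # other: report_
```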

$LANG and square brackets

However, don't forget the influence of the $LANG variable. Depending on the selected language or locale, the shell will interpret the square brackets and named character classes differently. Sort order may also be affected.

For example, when we select the default locale called C:

student@linux:~/gen$ sudo localectl set-locale C
[... log out and log in again ...]
student@linux:~/gen$ echo $LANG
C
student@linux:~/gen$ ls
 File4   File55   FileA   FileAB   Fileab   file1   file2   file3   fileab   fileabc   filex   filey   filez  'file'$'\303\245'  'file'$'\303\246'  'file'$'\303\270'
student@linux:~/gen$ ls file[[:lower:]]*
fileab  fileabc  filex  filey  filez

The Danish characters can't be displayed properly and don't match the [[:lower:]] character class.

Let us change the locale to da_DK.UTF-8 (Danish/Denmark with UTF-8 support) and see what happens:

student@linux:~/gen$ sudo localectl set-locale da_DK.UTF-8
[... log out and log in again ...]
student@linux:~/gen$ echo $LANG
da_DK.UTF-8
student@linux:~/gen$ ls
file1  file2  file3  File4  File55  FileA  FileAB  Fileab  fileab  fileabc  filex  filey  filez  fileæ  fileø  fileå
student@linux:~/gen$ ls file[[:lower:]]*
fileab  fileabc  filex  filey  filez  fileæ  fileø  fileå

Now the Danish characters are displayed properly and match the [[:lower:]] character class.

In the en_US.UTF-8 locale (US English, with UTF-8 support), the Danish characters are displayed properly, and also match the [[:lower:]] character class. However, they are sorted differently:

student@linux:~/gen$ sudo localectl set-locale en_US.UTF-8
[... log out and log in again ...]
student@linux:~/gen$ echo $LANG
en_US.UTF-8
student@linux:~/gen$ ls
file1  file2  file3  File4  File55  FileA  fileå  fileab  Fileab  FileAB  fileabc  fileæ  fileø  filex  filey  filez
student@linux:~/gen$ ls file[[:lower:]]*
fileå  fileab  fileabc  fileæ  fileø  filex  filey  filez
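
The effect of the locale on sort order can be tried out without changing the system locale or logging out. The sketch below is only an illustration under a few assumptions: it uses a scratch directory with hypothetical filenames, and it sets LC_ALL (which takes precedence over LANG) inside a subshell, so the change is gone as soon as the subshell ends. Only the C locale is used, because it is always available.

```shell
# Scratch directory with hypothetical filenames of mixed case.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch apple Banana cherry

# The command substitution runs in a subshell, so the locale change does not leak out.
# The variable is assigned in the shell itself, before the pattern is expanded.
c_order=$(LC_ALL=C; export LC_ALL; set -- *; echo "$*")

echo "C locale order: $c_order"
# prints: C locale order: Banana apple cherry
```

In the C locale, uppercase ASCII letters sort before lowercase ones, which is why Banana comes first. Note that a prefix assignment such as LANG=C ls ... is normally only placed in the environment of ls; the pattern has then typically already been expanded by the shell under its own locale, so such a prefix mainly changes how ls sorts and displays the names.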

preventing file globbing

If a wildcard pattern does not match any filenames, the shell leaves the pattern unexpanded and passes it to the command literally. Consequently, in an empty directory, echo * will display a *. When the directory is not empty, it will echo the names of all files in it.

student@linux:~$ mkdir test42
student@linux:~$ cd test42/
student@linux:~/test42$ echo *
*
student@linux:~/test42$ touch test{1,2,3}
student@linux:~/test42$ echo *
test1 test2 test3
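
This behaviour matters in scripts: a loop over a pattern that matches nothing still runs once, with the literal pattern as its value. A common guard is to check that each name really exists. The sketch below assumes default shell settings, a scratch directory, and hypothetical .log filenames.

```shell
# Start in an empty scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

count=0
for f in *.log; do
    # With no match, f holds the literal string "*.log", which does not exist.
    [ -e "$f" ] || continue
    count=$((count + 1))
done
before=$count

touch a.log b.log
count=0
for f in *.log; do
    [ -e "$f" ] || continue
    count=$((count + 1))
done
after=$count

echo "before: $before, after: $after"
# prints: before: 0, after: 2
```

In bash, the nullglob shell option (shopt -s nullglob) is an alternative: with it, a pattern without matches expands to nothing instead of staying literal.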

Globbing can be prevented by quoting the pattern or by escaping the special characters with a backslash, as shown in the example below.

student@linux:~/test42$ echo *
test1 test2 test3
student@linux:~/test42$ echo \*
*
student@linux:~/test42$ echo '*'
*
student@linux:~/test42$ echo "*"
*
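
The same rules apply when the pattern is stored in a variable: an unquoted variable is subject to globbing after it is substituted, a quoted one is not. A minimal sketch, assuming a scratch directory, hypothetical filenames, and a variable called pattern:

```shell
# Scratch directory with hypothetical filenames.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch test1 test2 test3

pattern='test*'    # single quotes keep the asterisk literal in the assignment

from_unquoted=$(echo $pattern)      # unquoted: the substituted value is expanded
from_quoted=$(echo "$pattern")      # double quotes: the value stays as it is
from_escaped=$(echo test\*)         # backslash: the asterisk is taken literally

echo "unquoted: $from_unquoted"     # unquoted: test1 test2 test3
echo "quoted:   $from_quoted"       # quoted:   test*
echo "escaped:  $from_escaped"      # escaped:  test*
```

This is one reason why variables that may contain wildcard characters are usually written inside double quotes.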

practice: shell globbing

In the questions below, use the ls command with globbing patterns to list the specified files. Don't pipe the output to grep or another tool to filter on regular expressions!

  1. Create a test directory glob and enter it.

  2. Create the following files:

    student@ubuntu:~/glob$ ls
    'file('   file10  'file 2'   File2   file33   fileA   fileà   fileAAA
     file1    file11   file2     File3   filea    fileá   fileå   fileAB
    

    (Note that file 2 has a space in its name!)

  3. List all files starting with file

  4. List all files starting with File

  5. List all files starting with file and ending in a number.

  6. List all files starting with file and ending with a letter

  7. List all files starting with File and having a digit as fifth character.

  8. List all files starting with File and having a digit as fifth and last character (i.e. the name consists of five characters).

  9. List all files starting with a letter and ending in a number.

  10. List all files that have exactly five characters.

  11. List all files that start with f or F and end with 3 or A.

  12. List all files that start with f, have i or R as second character, and end in a number.

  13. List all files that do not start with the letter F.

  14. Show the influence of $LANG (the system locale) in listing A-Z or a-z ranges.

  15. You receive information that one of your servers was cracked. The cracker probably replaced the ls command with a rootkit so it can no longer be used safely. You know that the echo command is safe to use. Can echo replace ls? How can you list the files in the current directory with echo?

solution: shell globbing

  1. Create a test directory glob and enter it.

    mkdir glob; cd glob
    
  2. Create the files:

    student@ubuntu:~/glob$ touch file1 file10 file11 file2 File2 File3 file33 fileAB
    student@ubuntu:~/glob$ touch filea fileá fileà fileå fileA fileAAA 'file(' 'file 2'
    
  3. List all files starting with file

    student@ubuntu:~/glob$ ls file*
    'file('   file10  'file 2'   file33   fileA   fileà   fileAAA
     file1    file11   file2     filea    fileá   fileå   fileAB
    
  4. List all files starting with File

    student@ubuntu:~/glob$ ls File*
    File2  File3
    
  5. List all files starting with file and ending in a number.

    student@ubuntu:~/glob$ ls file*[0-9]
    file1   file10   file11  'file 2'   file2   file33
    
  6. List all files starting with file and ending with a letter

    student@ubuntu:~/glob$ ls file*[A-Za-z]
    filea  fileA  fileAAA  fileAB
    student@ubuntu:~/glob$ ls file*[[:alpha:]]
    filea  fileA  fileá  fileà  fileå  fileAAA  fileAB
    

    Note that the first solution is not complete, as it does not list the files with accented characters in the name! In this case, it's better to use the named class [:alpha:].

  7. List all files starting with File and having a digit as fifth character.

    student@ubuntu:~/glob$ ls File[0-9]*
    File2  File3
    
  8. List all files starting with File and having a digit as fifth and last character (i.e. the name consists of five characters).

    student@ubuntu:~/glob$ ls File[0-9]
    File2  File3
    
  9. List all files starting with a letter and ending in a number.

    student@ubuntu:~/glob$ ls [[:alpha:]]*[[:digit:]]
    file1   file10   file11  'file 2'   file2   File2   File3   file33
    
  10. List all files that have exactly five characters.

    student@ubuntu:~/glob$ ls ?????
    'file('   file1   file2   File2   File3   filea   fileA   fileá   fileà   fileå
    
  11. List all files that start with f or F and end with 3 or A.

    student@ubuntu:~/glob$ ls [fF]*[3A]
    File3  file33  fileA  fileAAA
    
  12. List all files that start with f, have i or R as second character, and end in a number.

    student@ubuntu:~/glob$ ls f[iR]*[0-9]
    file1   file10   file11  'file 2'   file2   file33
    
  13. List all files that do not start with the letter F.

    student@ubuntu:~/glob$ ls [!F]*
    'file('   file10  'file 2'   file33   fileA   fileà   fileAAA
     file1    file11   file2     filea    fileá   fileå   fileAB
    
  14. Show the influence of $LANG (the system locale) in listing A-Z or a-z ranges.

    student@ubuntu:~/glob$ LANG=C ls file[[:alpha:]]*
    fileA   fileAAA   fileAB   filea  'file'$'\303\240'  'file'$'\303\241'  'file'$'\303\245'
    student@ubuntu:~/glob$ LANG=en_US.UTF-8 ls file[[:alpha:]]*
    filea  fileA  fileá  fileà  fileå  fileAAA  fileAB
    student@ubuntu:~/glob$ LANG=da_DK.UTF-8 ls file[[:alpha:]]*
    fileA  filea  fileá  fileà  fileAB  fileå  fileAAA
    
  15. You receive information that one of your servers was cracked. The cracker probably replaced the ls command with a rootkit so it can no longer be used safely. You know that the echo command is safe to use. Can echo replace ls? How can you list the files in the current directory with echo?

    student@ubuntu:~/glob$ echo *
    file( file1 file10 file11 file 2 file2 File2 File3 file33 filea fileA fileá fileà fileå fileAAA fileAB
    

    A disadvantage is that you can't see properties of the files, like permissions, owner, group, size, and date. For this, you can use stat, e.g. stat -c '%A %h %U %G %s %y %n' *.
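
For the last exercise, a variation that may be easier to read is printf, which like echo is a shell builtin in bash. The sketch below is only an illustration under a few assumptions (a scratch directory and hypothetical filenames): it prints one name per line, so a name containing a space is not mistaken for two files, and it uses a second pattern for hidden files, which * does not match by default.

```shell
# Scratch directory with hypothetical filenames, one with a space and one hidden.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file1 'file 2' file3 .hidden

# One name per line; the format string is reused for every argument.
printf '%s\n' *

# Count the visible names that were printed.
visible_count=$(printf '%s\n' * | wc -l)

# .[!.]* matches names that start with a dot, but not . and .. themselves.
set -- .[!.]*
hidden_names="$*"

echo "visible: $visible_count, hidden: $hidden_names"
```

With echo * the name file 2 would be printed as two words separated by a space, indistinguishable from two separate files.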