file globbing
(Written by Paul Cobbaut, https://github.com/paulcobbaut/, with contributions by: Alex M. Schapelle, https://github.com/zero-pytagoras/, Bert Van Vreckem https://github.com/bertvv/)
This chapter will explain file globbing. Typing man 7 glob (on Debian) will tell you that long ago there was a program called /etc/glob that would expand wildcard patterns. Soon afterward, this became a shell built-in.
A string is a wildcard pattern if it contains '?', '*' or '['. Globbing (or dynamic filename generation) is the operation that expands a wildcard pattern into a list of pathnames that match the pattern.
* asterisk
The asterisk '*' is interpreted by the shell as a sign to generate filenames, matching the asterisk to any combination of characters (even none). When no path is given, the shell will use filenames in the current directory. See the man page of glob(7) for more information.
student@linux:~/gen$ ls
file1 file2 file3 File4 File55 FileA fileå fileab Fileab FileAB fileabc fileæ fileø filex filey filez
student@linux:~/gen$ ls File*
File4 File55 FileA Fileab FileAB
student@linux:~/gen$ ls file*
file1 file2 file3 fileå fileab fileabc fileæ fileø filex filey filez
student@linux:~/gen$ ls *ile55
File55
student@linux:~/gen$ ls F*ile55
File55
student@linux:~/gen$ ls F*55
File55
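That the shell, and not ls, performs the expansion can be checked with a minimal sketch like the one below. It assumes a throwaway directory created with mktemp and a few hypothetical file names (not the ~/gen directory used above), and it uses set -- to capture what a command would receive as arguments.

```shell
# Scratch directory with hypothetical file names (an assumption for this sketch).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file1 file2 File4 File55

# The shell replaces File* with the matching names before any command runs,
# so set -- receives two separate arguments rather than the pattern.
set -- File*
count=$#
names="$*"
echo "$count arguments: $names"
```

When at least one file matches, any command (not just ls) receives the expanded list and does not see the asterisk itself.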
? question mark
Similar to the asterisk, the question mark '?' is interpreted by the shell as a sign to generate filenames, matching the question mark with exactly one character.
student@linux:~/gen$ ls File?
File4 FileA
student@linux:~/gen$ ls Fil?4
File4
student@linux:~/gen$ ls Fil??
File4 FileA
student@linux:~/gen$ ls File??
File55 Fileab FileAB
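Because each question mark stands for exactly one character, a run of question marks selects names of a fixed length. The following is a minimal sketch, assuming a scratch directory and hypothetical file names chosen only for illustration.

```shell
# Scratch directory with hypothetical names of different lengths.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a ab abc abcd file1

four=$(echo ????)         # names of exactly four characters
five=$(echo ?????)        # names of exactly five characters
three_plus=$(echo ???*)   # names of three or more characters

echo "$four"
echo "$five"
echo "$three_plus"
```

Combining ? with * as in the last pattern sets a minimum length instead of an exact one.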
[] square brackets
The square bracket '[' is interpreted by the shell as a sign to generate filenames, matching any of the characters between '[' and the first subsequent ']'. The order in this list between the brackets is not important. Each pair of brackets is replaced by exactly one character.
student@linux:~/gen$ ls File[5A]
FileA
student@linux:~/gen$ ls File[A5]3
ls: cannot access 'File[A5]3': No such file or directory
student@linux:~/gen$ ls File[A5]
FileA
student@linux:~/gen$ ls File[A5][5b]
File55
student@linux:~/gen$ ls File[a5][5b]
File55 Fileab
student@linux:~/gen$ ls File[a5][5b][abcdefghijklm]
ls: cannot access 'File[a5][5b][abcdefghijklm]': No such file or directory
student@linux:~/gen$ ls file[a5][5b][abcdefghijklm]
fileabc
You can also exclude characters from a list between square brackets with the exclamation mark '!'. And you are allowed to make combinations of these wildcards.
student@linux:~/gen$ ls file[a5][!Z]
fileab
student@linux:~/gen$ ls file[!5]*
file1 file2 file3 fileå fileab fileabc fileæ fileø filex filey filez
student@linux:~/gen$ ls file[!5]?
fileab
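The bracket rules can be tried in isolation with a minimal sketch like the one below. It assumes a scratch directory and four hypothetical file names; the variable names are only for illustration.

```shell
# Scratch directory with hypothetical file names.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file1 file5 fileA fileab

either=$(echo file[A5])      # fifth character is A or 5
not_five=$(echo file[!5])    # fifth character is anything except 5
combined=$(echo file[!5]?)   # a non-5 character followed by exactly one more

echo "$either"
echo "$not_five"
echo "$combined"
```

Each bracket pair consumes one character, which is why fileab only shows up once a second wildcard is added.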
a-z and 0-9 ranges
The bash shell will also understand ranges of characters between brackets.
student@linux:~/gen$ ls file[a-z]*
fileab fileabc filex filey filez
student@linux:~/gen$ ls file[0-9]
file1 file2 file3
student@linux:~/gen$ ls file[a-z][a-z][0-9]*
ls: cannot access 'file[a-z][a-z][0-9]*': No such file or directory
student@linux:~/gen$ ls file[a-z][a-z][a-z]*
fileabc
named character classes
Instead of ranges, you can also specify named character classes: [[:alnum:]], [[:alpha:]], [[:blank:]], [[:cntrl:]], [[:digit:]], [[:graph:]], [[:lower:]], [[:print:]], [[:punct:]], [[:space:]], [[:upper:]], [[:xdigit:]]. Instead of, e.g. [a-z], you can also use [[:lower:]].
student@linux:~/gen$ ls file[a-z]*
fileab fileabc filex filey filez
student@linux:~/gen$ ls file[[:lower:]]*
fileå fileab fileabc fileæ fileø filex filey filez
Note that the named character classes work better for international characters. In the example above, [a-z] does not match the Danish characters æ, ø, and å, but [[:lower:]] does.
$LANG and square brackets
But don't forget the influence of the $LANG variable. Depending on the selected language or locale, the shell will interpret the square brackets and named character classes differently. Sort order may also be affected.
For example, when we select the default locale called C:
student@linux:~/gen$ sudo localectl set-locale C
[... log out and log in again ...]
student@linux:~/gen$ echo $LANG
C
student@linux:~/gen$ ls
File4 File55 FileA FileAB Fileab file1 file2 file3 fileab fileabc filex filey filez 'file'$'\303\245' 'file'$'\303\246' 'file'$'\303\270'
student@linux:~/gen$ ls file[[:lower:]]*
fileab fileabc filex filey filez
The Danish characters can't be displayed properly and don't match the [[:lower:]] character class.
Let us change the locale to da_DK.UTF-8 (Danish/Denmark with UTF-8 support) and see what happens:
student@linux:~/gen$ sudo localectl set-locale da_DK.UTF-8
[... log out and log in again ...]
student@linux:~/gen$ echo $LANG
da_DK.UTF-8
student@linux:~/gen$ ls
file1 file2 file3 File4 File55 FileA FileAB Fileab fileab fileabc filex filey filez fileæ fileø fileå
student@linux:~/gen$ ls file[[:lower:]]*
fileab fileabc filex filey filez fileæ fileø fileå
Now the Danish characters are displayed properly and match the [[:lower:]] character class.
In the en_US.UTF-8 locale (US English, with UTF-8 support), the Danish characters are displayed properly, and also match the [[:lower:]] character class. However, they are sorted differently:
student@linux:~/gen$ sudo localectl set-locale en_US.UTF-8
[... log out and log in again ...]
student@linux:~/gen$ echo $LANG
en_US.UTF-8
student@linux:~/gen$ ls
file1 file2 file3 File4 File55 FileA fileå fileab Fileab FileAB fileabc fileæ fileø filex filey filez
student@linux:~/gen$ ls file[[:lower:]]*
fileå fileab fileabc fileæ fileø filex filey filez
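Changing the system locale and logging in again is not the only way to observe this effect. The following is a minimal sketch, assuming bash is installed and using hypothetical file names in a scratch directory. It starts a child shell with LC_ALL set (LC_ALL takes precedence over LANG), so the child performs the expansion under that locale. The en_US.UTF-8 line assumes that locale is installed; the sketch discards any warning printed when it is not.

```shell
# Scratch directory with hypothetical file names, one of them non-ASCII.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch filea 'fileå' File1

# The child shell expands the pattern, so it must run under the locale of interest.
in_c=$(LC_ALL=C bash -c 'echo file[[:lower:]]')
# Assumption: the en_US.UTF-8 locale is installed on this system.
in_utf8=$(LC_ALL=en_US.UTF-8 bash -c 'echo file[[:lower:]]' 2>/dev/null)

echo "C:           $in_c"
echo "en_US.UTF-8: $in_utf8"
```

Prefixing ls itself with a locale variable, as in LANG=C ls file[[:lower:]]*, only changes how ls sorts and displays the names: by then the pattern has already been expanded by the current shell under its own locale.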
preventing file globbing
If a wildcard pattern does not match any filenames, the shell will not expand the pattern. Consequently, when in an empty directory, echo * will display a *. It will echo the names of all files when the directory is not empty.
student@linux:~$ mkdir test42
student@linux:~$ cd test42/
student@linux:~/test42$ echo *
*
student@linux:~/test42$ touch test{1,2,3}
student@linux:~/test42$ echo *
test1 test2 test3
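Because an unmatched pattern is passed on literally, a loop over * in an empty directory runs once with the literal string *. The following is a minimal sketch of a common guard against that, assuming a scratch directory; the helper name count_entries is hypothetical.

```shell
# Count the names matched by *, ignoring the literal '*' that the shell
# leaves behind when nothing matches. (Hypothetical helper name.)
count_entries() {
    n=0
    for f in *; do
        [ -e "$f" ] || continue   # the unexpanded pattern names no existing file
        n=$((n + 1))
    done
    echo "$n"
}

dir=$(mktemp -d)
cd "$dir" || exit 1
empty=$(count_entries)     # the directory is still empty
touch test1 test2 test3
filled=$(count_entries)    # now three names match
echo "$empty $filled"
```

Without the -e test, the first call would count the unexpanded pattern as if it were a file.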
Globbing can be prevented using quotes or by escaping the special characters, as shown in this example.
student@linux:~/test42$ echo *
test1 test2 test3
student@linux:~/test42$ echo \*
*
student@linux:~/test42$ echo '*'
*
student@linux:~/test42$ echo "*"
*
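The same rule applies to a pattern stored in a variable: an unquoted expansion of the variable is globbed, a quoted one is not. The following is a minimal sketch, assuming a scratch directory with three hypothetical test files; the variable names are only for illustration.

```shell
# Scratch directory with hypothetical test files.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch test1 test2 test3

pattern='test*'              # single quotes keep the asterisk literal here
unquoted=$(echo $pattern)    # unquoted: the shell globs the value
quoted=$(echo "$pattern")    # quoted: the value is passed through unchanged
escaped=$(echo test\*)       # backslash: same effect as quoting

echo "$unquoted"
echo "$quoted"
echo "$escaped"
```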
practice: shell globbing
In the questions below, use the ls command with globbing patterns to list the specified files. Don't pipe the output to grep or another tool to filter on regular expressions!
- Create a test directory glob and enter it.
- Create the following files:
  vagrant@ubuntu:~/glob$ ls
  'file(' file10 'file 2' File2 file33 fileA fileà fileAAA
  file1 file11 file2 File3 filea fileá fileå fileAB
  (note that file 2 has a space in the name!)
- List all files starting with file
- List all files starting with File
- List all files starting with file and ending in a number.
- List all files starting with file and ending with a letter.
- List all files starting with File and having a digit as fifth character.
- List all files starting with File and having a digit as fifth and last character (i.e. the name consists of five characters).
- List all files starting with a letter and ending in a number.
- List all files that have exactly five characters.
- List all files that start with f or F and end with 3 or A.
- List all files that start with f, have i or R as second character, and end in a number.
- List all files that do not start with the letter F.
- Show the influence of $LANG (the system locale) in listing A-Z or a-z ranges.
- You receive information that one of your servers was cracked. The cracker probably replaced the ls command with a rootkit so it can no longer be used safely. You know that the echo command is safe to use. Can echo replace ls? How can you list the files in the current directory with echo?
solution: shell globbing
- Create a test directory glob and enter it.
- Create the files:
- List all files starting with file
- List all files starting with File
- List all files starting with file and ending in a number.
- List all files starting with file and ending with a letter.
  student@ubuntu:~/glob$ ls file*[A-Za-z]
  filea fileA fileAAA fileAB
  student@ubuntu:~/glob$ ls file*[[:alpha:]]
  filea fileA fileá fileà fileå fileAAA fileAB
  Note that the first solution is not complete, as it does not list the files with special characters in the name! In this case, it's better to use the named class [:alpha:].
- List all files starting with File and having a digit as fifth character.
- List all files starting with File and having a digit as fifth and last character (i.e. the name consists of five characters).
- List all files starting with a letter and ending in a number.
- List all files that have exactly five characters.
- List all files that start with f or F and end with 3 or A.
- List all files that start with f, have i or R as second character, and end in a number.
- List all files that do not start with the letter F.
- Show the influence of $LANG (the system locale) in listing A-Z or a-z ranges.
  student@ubuntu:~/glob$ LANG=C ls file[[:alpha:]]*
  fileA fileAAA fileAB filea 'file'$'\303\240' 'file'$'\303\241' 'file'$'\303\245'
  student@ubuntu:~/glob$ LANG=en_US.UTF-8 ls file[[:alpha:]]*
  filea fileA fileá fileà fileå fileAAA fileAB
  student@ubuntu:~/glob$ LANG=da_DK.UTF-8 ls file[[:alpha:]]*
  fileA filea fileá fileà fileAB fileå fileAAA
- You receive information that one of your servers was cracked. The cracker probably replaced the ls command with a rootkit so it can no longer be used safely. You know that the echo command is safe to use. Can echo replace ls? How can you list the files in the current directory with echo?
  student@ubuntu:~/glob$ echo *
  file( file1 file10 file11 file 2 file2 File2 File3 file33 filea fileA fileá fileà fileå fileAAA fileAB
  A disadvantage is that you can't see properties of the files, like permissions, owner, group, size, and date. For this, you can use stat, e.g. stat -c '%A %h %U %G %s %y %n' *.
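For the questions that are listed above without an answer, the patterns below are one possible set of solutions, offered as a sketch rather than as the authors' answers. The sketch recreates the exercise files in a scratch directory (an assumption; the exercise itself uses ~/glob) and defines a hypothetical helper, show, that prints a label, the number of matches, and the matches themselves. The results for ????? and for the named classes depend on the locale of the shell that runs the sketch.

```shell
# Recreate the exercise files in a scratch directory (an assumption).
dir=$(mktemp -d)
cd "$dir" || exit 1
touch 'file(' file1 file10 file11 'file 2' file2 File2 File3 file33 \
      filea fileA 'fileá' 'fileà' 'fileå' fileAAA fileAB

# Hypothetical helper: print a label, the match count, and each match in brackets.
show() {
    label=$1
    shift
    printf '%-20s %2d:' "$label" "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

show 'file*'             file*               # starting with file
show 'File*'             File*               # starting with File
show 'file*[0-9]'        file*[0-9]          # starting with file, ending in a number
show 'File[0-9]*'        File[0-9]*          # File, digit as fifth character
show 'File[0-9]'         File[0-9]           # File, digit as fifth and last character
show '[[:alpha:]]*[0-9]' [[:alpha:]]*[0-9]   # starting with a letter, ending in a number
show '?????'             ?????               # exactly five characters
show '[fF]*[3A]'         [fF]*[3A]           # starting with f or F, ending with 3 or A
show 'f[iR]*[0-9]'       f[iR]*[0-9]         # f, then i or R, ending in a number
show '[!F]*'             [!F]*               # not starting with F
```

The brackets in the output make the space in file 2 visible, which a plain echo would hide.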