Skip to content

Shell Tools and Scripting

Shell Scripting

So far we have seen how to execute commands in the shell and pipe them together. However, in many scenarios you will want to perform a series of commands and make use of control flow expressions like conditionals or loops.

到目前为止,我们已经学习来如何在 shell 中执行命令,并使用管道将命令组合使用。但是,很多情况下我们需要执行一系列的操作并使用条件或循环这样的控制流。

Shell scripts are the next step in complexity.

shell 脚本是一种更加复杂度的工具。

Most shells have their own scripting language with variables, control flow and its own syntax. What makes shell scripting different from other scripting programming language is that it is optimized for performing shell-related tasks. Thus, creating command pipelines, saving results into files, and reading from standard input are primitives in shell scripting, which makes it easier to use than general purpose scripting languages. For this section we will focus on bash scripting since it is the most common.

大多数shell都有自己的一套脚本语言,包括变量、控制流和自己的语法。shell脚本与其他脚本语言不同之处在于,shell 脚本针对 shell 所从事的相关工作进行来优化。因此,创建命令流程(pipelines)、将结果保存到文件、从标准输入中读取输入,这些都是 shell 脚本中的原生操作,这让它比通用的脚本语言更易用。本节中,我们会专注于 bash 脚本,因为它最流行,应用更为广泛。

To assign variables in bash, use the syntax foo=bar and access the value of the variable with $foo. Note that foo = bar will not work since it is interpreted as calling the foo program with arguments = and bar. In general, in shell scripts the space character will perform argument splitting. This behavior can be confusing to use at first, so always check for that.

在bash中为变量赋值的语法是foo=bar,访问变量中存储的数值,其语法为 $foo。 需要注意的是,foo = bar (使用空格隔开)是不能正确工作的,因为解释器会调用程序foo 并将 =bar作为参数。 总的来说,在shell脚本中使用空格会起到分割参数的作用,有时候可能会造成混淆,请务必多加检查。

Strings in bash can be defined with ' and " delimiters, but they are not equivalent. Strings delimited with ' are literal strings and will not substitute variable values whereas " delimited strings will.

Bash中的字符串通过'"分隔符来定义,但是它们的含义并不相同。以'定义的字符串为原义字符串,其中的变量不会被转义,而 "定义的字符串会将变量值进行替换。

foo=bar
echo "$foo"
# prints bar
echo '$foo'
# prints $foo

Screenshot 2023-11-09 at 15.31.25

As with most programming languages, bash supports control flow techniques including if, case, while and for. Similarly, bash has functions that take arguments and can operate with them. Here is an example of a function that creates a directory and cds into it.

和其他大多数的编程语言一样,bash也支持if, case, whilefor 这些控制流关键字。同样地, bash 也支持函数,它可以接受参数并基于参数进行操作。下面这个函数是一个例子,它会创建一个文件夹并使用cd进入该文件夹。

mcd () {
    mkdir -p "$1"
    cd "$1"
}

Screenshot 2023-11-09 at 16.04.13

source mcd.sh: This will execute this script in our shell and load it. It looks like nothing happened. But now the "mcd" function has been defined in our shell.

Screenshot 2023-11-09 at 16.13.08

Here $1 is the first argument to the script/function. Unlike other scripting languages, bash uses a variety of special variables to refer to arguments, error codes, and other relevant variables. Below is a list of some of them. A more comprehensive list can be found here.

这里 $1 是脚本的第一个参数。与其他脚本语言不同的是,bash使用了很多特殊的变量来表示参数、错误代码和相关变量。下面是列举来其中一些变量,更完整的列表可以参考 这里

  • $0 - Name of the script

  • $1 to $9 - Arguments to the script. $1 is the first argument and so on.

  • $@ - All the arguments

  • $# - Number of arguments

  • $? - Return code of the previous command

query for the value. if 0 means everything went ok and there weren't any issues.

And a 0 exit code is the same as you will get in a language like C. 1: means something wrong.

  • $$ - Process identification number (PID) for the current script

  • !! - Entire last command, including arguments. A common pattern is to execute a command only for it to fail due to missing permissions; you can quickly re-execute the command with sudo by doing sudo !!

  • $_ - Last argument from the last command. If you are in an interactive shell, you can also quickly get this value by typing Esc followed by . or Alt+.

  • $0 - 脚本名

  • $1$9 - 脚本的参数。 $1 是第一个参数,依此类推。

  • $@ - 所有参数

  • $# - 参数个数

  • $? - 前一个命令的返回值

  • $$ - 当前脚本的进程识别码

  • !! - 完整的上一条命令,包括参数。常见应用:当你因为权限不足执行命令失败时,可以使用 sudo !!再尝试一次。

Screenshot 2023-11-09 at 16.14.19

Screenshot 2023-11-09 at 16.15.02

  • $_ - 上一条命令的最后一个参数。如果你正在使用的是交互式 shell,你可以通过按下 Esc 之后键入 . 来获取这个值。

Commands will often return output using STDOUT, errors through STDERR, and a Return Code to report errors in a more script-friendly manner. The return code or exit status is the way scripts/commands have to communicate how execution went. A value of 0 usually means everything went OK; anything different from 0 means an error occurred.

命令通常使用 STDOUT来返回输出值,使用STDERR 来返回错误及错误码,便于脚本以更加友好的方式报告错误。 返回码或退出状态是脚本/命令之间交流执行状态的方式。返回值0表示正常执行,其他所有非0的返回值都表示有错误发生。

Exit codes can be used to conditionally execute commands using && (and operator) and || (or operator), both of which are short-circuiting operators. Commands can also be separated within the same line using a semicolon ;. The true program will always have a 0 return code and the false command will always have a 1 return code. Let's see some examples

false || echo "Oops, fail"
# Oops, fail

true || echo "Will not be printed"
#

true && echo "Things went well"
# Things went well

false && echo "Will not be printed"
#

true ; echo "This will always run"
# This will always run

false ; echo "This will always run"
# This will always run

Screenshot 2023-11-09 at 16.22.08

Another common pattern is wanting to get the output of a command as a variable. This can be done with command substitution.

另一个常见的模式是以变量的形式获取一个命令的输出,这可以通过 命令替换command substitution)实现。

Whenever you place $( CMD ) it will execute CMD, get the output of the command and substitute it in place. For example, if you do for file in $(ls), the shell will first call ls and then iterate over those values. A lesser known similar feature is process substitution, <( CMD ) will execute CMD and place the output in a temporary file and substitute the <() with that file's name. This is useful when commands expect values to be passed by file instead of by STDIN. For example, diff <(ls foo) <(ls bar) will show differences between files in dirs foo and bar.

当您通过 $( CMD ) 这样的方式来执行CMD 这个命令时,它的输出结果会替换掉 $( CMD ) 。例如,如果执行 for file in $(ls) ,shell首先将调用ls ,然后遍历得到的这些返回值。还有一个冷门的类似特性是 进程替换process substitution), <( CMD ) 会执行 CMD 并将结果输出到一个临时文件中,并将 <( CMD ) 替换成临时文件名。这在我们希望返回值通过文件而不是STDIN传递时很有用。例如, diff <(ls foo) <(ls bar) 会显示文件夹 foobar 中文件的区别。

Screenshot 2023-11-09 at 16.33.50

Since that was a huge information dump, let's see an example that showcases some of these features. It will iterate through the arguments we provide, grep for the string foobar, and append it to the file as a comment if it's not found.

说了很多,现在该看例子了,下面这个例子展示了一部分上面提到的特性。这段脚本会遍历我们提供的参数,使用grep 搜索字符串 foobar,如果没有找到,则将其作为注释追加到文件中。

#!/bin/bash

echo "Starting program at $(date)" # Date will be substituted

echo "Running program $0 with $# arguments with pid $$"   # '$0' is the name of the script that we're running '$#': this is the number of arguments that we are giving to the command
# '$$' means the process ID of thsi command that is running.
# There's a lot of kind of these dollar things, they're not intuitive. Because they don't have like a mnemonic way of remembering, maybe "$#". But you just use more and will be more familiar with them.

for file in "$@"; do      # "$@": will expand all the arguments $1 $2 ...
    grep foobar "$file" > /dev/null 2> /dev/null
    # grep: which is just search for a substring in some file
    # for here is searching `foobar` in the file
    # we can redirect it to somewhere to save it or like to connect it to some other file
    # /dev/null 这是Unix系统中的一种特殊设备, 你可以像写入文件一样写入数据, 但它将被丢弃
    # '2>'用于重定向标准错误流的, 因为这两个流是分开的, 你需要告诉bash如何处理它们
    # 如果文件包含foobar它将返回0. 如果没有则返回1
    # When pattern is not found, grep has exit status 1
    # We redirect STDOUT and STDERR to a null register since we do not care about them
    # -ne 不相等
    if [[ $? -ne 0 ]]; then
        echo "File $file does not have any foobar, adding one"
        echo "# foobar" >> "$file"
    fi
done

In the comparison we tested whether $? was not equal to 0. Bash implements many comparisons of this sort - you can find a detailed list in the manpage for test. When performing comparisons in bash, try to use double brackets [[ ]] in favor of simple brackets [ ]. Chances of making mistakes are lower although it won't be portable to sh. A more detailed explanation can be found here.

在条件语句中,我们比较 $? 是否等于0。 Bash实现了许多类似的比较操作,您可以查看 test 手册。 在bash中进行比较时,尽量使用双方括号 [[ ]] 而不是单方括号 [ ],这样会降低犯错的几率,尽管这样并不能兼容 sh。 更详细的说明参见这里

Screenshot 2023-11-09 at 17.42.13

注意空格.

Screenshot 2023-11-09 at 17.44.01

When launching scripts, you will often want to provide arguments that are similar. Bash has ways of making this easier, expanding expressions by carrying out filename expansion. These techniques are often referred to as shell globbing.

当执行脚本时,我们经常需要提供形式类似的参数。bash使我们可以轻松的实现这一操作,它可以基于文件扩展名展开表达式。这一技术被称为shell的 通配globbing

  • Wildcards - Whenever you want to perform some sort of wildcard matching, you can use ? and * to match one or any amount of characters respectively. For instance, given files foo, foo1, foo2, foo10 and bar, the command rm foo? will delete foo1 and foo2 whereas rm foo* will delete all but bar.

通配符 - 当你想要利用通配符进行匹配时,你可以分别使用 ?* 来匹配一个或任意个字符。例如,对于文件foo, foo1, foo2, foo10bar, rm foo?这条命令会删除foo1foo2 ,而rm foo* 则会删除除了bar之外的所有文件。

  • Curly braces {} - Whenever you have a common substring in a series of commands, you can use curly braces for bash to expand this automatically. This comes in very handy when moving or converting files.

花括号{} - 当你有一系列的指令,其中包含一段公共子串时,可以用花括号来自动展开这些命令。这在批量移动或转换文件时非常方便。

convert image.{png,jpg}
# Will expand to
convert image.png image.jpg

cp /path/to/project/{foo,bar,baz}.sh /newpath
# Will expand to
cp /path/to/project/foo.sh /path/to/project/bar.sh /path/to/project/baz.sh /newpath

# Globbing techniques can also be combined
mv *{.py,.sh} folder
# Will move all *.py and *.sh files


mkdir foo bar
# This creates files foo/a, foo/b, ... foo/h, bar/a, bar/b, ... bar/h
touch {foo,bar}/{a..h}
touch foo/x bar/y
# Show differences between files in foo and bar
diff <(ls foo) <(ls bar)       
# Outputs
# < x
# ---
# > y

Screenshot 2023-11-09 at 18.18.59Screenshot 2023-11-09 at 18.24.01

Writing bash scripts can be tricky and unintuitive. There are tools like shellcheck that will help you find errors in your sh/bash scripts.

编写 bash 脚本有时候会很别扭和反直觉。例如 shellcheck 这样的工具可以帮助你定位sh/bash脚本中的错误。

Note that scripts need not necessarily be written in bash to be called from the terminal. For instance, here's a simple Python script that outputs its arguments in reversed order:

注意,脚本并不一定只有用 bash 写才能在终端里调用。比如说,这是一段 Python 脚本,作用是将输入的参数倒序输出:

#!/usr/local/bin/python         ---> shebang
import sys                          # python 默认情况下不会与shell交互, 所以需要导入一些库
for arg in reversed(sys.argv[1:]):      # 遍历 “sys.argv[1:]”
    print(arg)

Screenshot 2023-11-09 at 19.00.06

不同的电脑配置环境, python 位置会不一样

Screenshot 2023-11-09 at 19.01.18

The kernel knows to execute this script with a python interpreter instead of a shell command because we included a shebang line at the top of the script.

内核知道去用 python 解释器而不是 shell 命令来运行这段脚本,是因为脚本的开头第一行的 shebang

It is good practice to write shebang lines using the env command that will resolve to wherever the command lives in the system, increasing the portability of your scripts. To resolve the location, env will make use of the PATH environment variable we introduced in the first lecture. For this example the shebang line would look like #!/usr/bin/env python.

shebang 行中使用 env 命令是一种好的实践,它会利用环境变量中的程序来解析该脚本,这样就提高来您的脚本的可移植性。env 会利用我们第一节讲座中介绍过的PATH 环境变量来进行定位。 例如,使用了env的shebang看上去时这样的#!/usr/bin/env python

Screenshot 2023-11-09 at 19.07.25

Some differences between shell functions and scripts that you should keep in mind are:

shell函数和脚本有如下一些不同点:

  • Functions have to be in the same language as the shell, while scripts can be written in any language. This is why including a shebang for scripts is important.

函数只能与shell使用相同的语言,脚本可以使用任意语言。因此在脚本中包含 shebang 是很重要的。

  • Functions are loaded once when their definition is read. Scripts are loaded every time they are executed. This makes functions slightly faster to load, but whenever you change them you will have to reload their definition.

函数仅在定义时被加载,脚本会在每次被执行时加载。这让函数的加载比脚本略快一些,但每次修改函数定义,都要重新加载一次。

  • Functions are executed in the current shell environment whereas scripts execute in their own process. Thus, functions can modify environment variables, e.g. change your current directory, whereas scripts can't. Scripts will be passed by value environment variables that have been exported using export

函数会在当前的shell环境中执行,脚本会在单独的进程中执行。因此,函数可以对环境变量进行更改,比如改变当前工作目录,脚本则不行。脚本需要使用 export 将环境变量导出,并将值传递给环境变量。

  • As with any programming language, functions are a powerful construct to achieve modularity, code reuse, and clarity of shell code. Often shell scripts will include their own function definitions.

与其他程序语言一样,函数可以提高代码模块性、代码复用性并创建清晰性的结构。shell脚本中往往也会包含它们自己的函数定义。

Screenshot 2023-11-09 at 19.20.17

If not found, brew install ShellCheck first.

bash ? shell ? zsh ?

Shell Tools

Finding how to use commands

At this point, you might be wondering how to find the flags for the commands in the aliasing section such as ls -l, mv -i and mkdir -p. More generally, given a command, how do you go about finding out what it does and its different options? You could always start googling, but since UNIX predates StackOverflow, there are built-in ways of getting this information.

看到这里,您可能会有疑问,我们应该如何为特定的命令找到合适的标记呢?例如 ls -l, mv -imkdir -p。更普遍的是,给您一个命令行,您应该怎样了解如何使用这个命令行并找出它的不同的选项呢? 一般来说,您可能会先去网上搜索答案,但是,UNIX 可比 StackOverflow 出现的早,因此我们的系统里其实早就包含了可以获取相关信息的方法。

As we saw in the shell lecture, the first-order approach is to call said command with the -h or --help flags. A more detailed approach is to use the man command. Short for manual, man provides a manual page (called manpage) for a command you specify.

在上一节中我们介绍过,最常用的方法是为对应的命令行添加-h--help 标记。另外一个更详细的方法则是使用man 命令。man 命令是手册(manual)的缩写,它提供了命令的用户手册。

For example, man rm will output the behavior of the rm command along with the flags that it takes, including the -i flag we showed earlier. In fact, what I have been linking so far for every command is the online version of the Linux manpages for the commands. Even non-native commands that you install will have manpage entries if the developer wrote them and included them as part of the installation process. For interactive tools such as the ones based on ncurses, help for the commands can often be accessed within the program using the :help command or typing ?.

例如,man rm 会输出命令 rm 的说明,同时还有其标记列表,包括之前我们介绍过的-i。 事实上,目前我们给出的所有命令的说明链接,都是网页版的Linux命令手册。即使是您安装的第三方命令,前提是开发者编写了手册并将其包含在了安装包中。在交互式的、基于字符处理的终端窗口中,一般也可以通过 :help 命令或键入 ? 来获取帮助。

Sometimes manpages can provide overly detailed descriptions of the commands, making it hard to decipher what flags/syntax to use for common use cases. TLDR pages are a nifty complementary solution that focuses on giving example use cases of a command so you can quickly figure out which options to use.

有时候手册内容太过详实,让我们难以在其中查找哪些最常用的标记和语法。 TLDR pages 是一个很不错的替代品,它提供了一些案例,可以帮助您快速找到正确的选项。

For instance, I find myself referring back to the tldr pages for tar and ffmpeg way more often than the manpages.

例如,自己就常常在tldr上搜索tarffmpeg 的用法。

Screenshot 2023-11-09 at 19.35.51

tldr可以快速教你用法

Finding files

One of the most common repetitive tasks that every programmer faces is finding files or directories. All UNIX-like systems come packaged with find, a great shell tool to find files. find will recursively search for files matching some criteria. Some examples:

程序员们面对的最常见的重复任务就是查找文件或目录。所有的类UNIX系统都包含一个名为 find 的工具,它是 shell 上用于查找文件的绝佳工具。find命令会递归地搜索符合条件的文件,例如:

# Find all directories named src
find . -name src -type d              
# .:current folder  
# src: name
# type to be a directory
# it's gonna recursively go through the current directory

# Find all python files that have a folder named test in their path
find . -path '*/test/*.py' -type f
# Find all files modified in the last day
find . -mtime -1
# Find all zip files with size in range 500k to 10M
find . -size +500k -size -10M -name '*.tar.gz'

Screenshot 2023-11-09 at 20.17.13

Beyond listing files, find can also perform actions over files that match your query. This property can be incredibly helpful to simplify what could be fairly monotonous tasks.

除了列出所寻找的文件之外,find 还能对所有查找到的文件进行操作。这能极大地简化一些单调的任务。

# Delete all files with .tmp extension
find . -name '*.tmp' -exec rm {} \;
# Find all PNG files and convert them to JPG
find . -name '*.png' -exec convert {} {}.jpg \;

Screenshot 2023-11-10 at 00.57.39

Despite find's ubiquitousness, its syntax can sometimes be tricky to remember. For instance, to simply find files that match some pattern PATTERN you have to execute find -name '*PATTERN*' (or -iname if you want the pattern matching to be case insensitive).

尽管 find 用途广泛,它的语法却比较难以记忆。例如,为了查找满足模式 PATTERN 的文件,您需要执行 find -name '*PATTERN*' (如果您希望模式匹配时是不区分大小写,可以使用-iname选项)

You could start building aliases for those scenarios, but part of the shell philosophy is that it is good to explore alternatives. Remember, one of the best properties of the shell is that you are just calling programs, so you can find (or even write yourself) replacements for some.

您当然可以使用 alias 设置别名来简化上述操作,但 shell 的哲学之一便是寻找(更好用的)替代方案。 记住,shell 最好的特性就是您只是在调用程序,因此您只要找到合适的替代程序即可(甚至自己编写)。

For instance, fd is a simple, fast, and user-friendly alternative to find. It offers some nice defaults like colorized output, default regex matching, and Unicode support. It also has, in my opinion, a more intuitive syntax. For example, the syntax to find a pattern PATTERN is fd PATTERN.

例如,fd 就是一个更简单、更快速、更友好的程序,它可以用来作为find的替代品。它有很多不错的默认设置,例如输出着色、默认支持正则匹配、支持unicode并且我认为它的语法更符合直觉。以模式PATTERN 搜索的语法是 fd PATTERN.

Screenshot 2023-11-10 at 01.19.03

Most would agree that find and fd are good, but some of you might be wondering about the efficiency of looking for files every time versus compiling some sort of index or database for quickly searching. That is what locate is for.

大多数人都认为 findfd 已经很好用了,但是有的人可能想知道,我们是不是可以有更高效的方法,例如不要每次都搜索文件而是通过编译索引或建立数据库的方式来实现更加快速地搜索。

locate uses a database that is updated using updatedb. In most systems, updatedb is updated daily via cron. Therefore one trade-off between the two is speed vs freshness. Moreover find and similar tools can also find files using attributes such as file size, modification time, or file permissions, while locate just uses the file name. A more in-depth comparison can be found here.

这就要靠 locate 了。 locate 使用一个由 updatedb负责更新的数据库,在大多数系统中 updatedb 都会通过 cron 每日更新。这便需要我们在速度和时效性之间作出权衡。而且,find 和类似的工具可以通过别的属性比如文件大小、修改时间或是权限来查找文件,locate则只能通过文件名。 这里有一个更详细的对比。

Screenshot 2023-11-10 at 01.29.56

Finding code

Finding files by name is useful, but quite often you want to search based on file content. A common scenario is wanting to search for all files that contain some pattern, along with where in those files said pattern occurs.

查找文件是很有用的技能,但是很多时候您的目标其实是查看文件的内容。一个最常见的场景是您希望查找具有某种模式的全部文件,并找它们的位置。

To achieve this, most UNIX-like systems provide grep, a generic tool for matching patterns from the input text. grep is an incredibly valuable shell tool that we will cover in greater detail during the data wrangling lecture.

为了实现这一点,很多类UNIX的系统都提供了grep命令,它是用于对输入文本进行匹配的通用工具。它是一个非常重要的shell工具,我们会在后续的数据清理课程中深入的探讨它。

For now, know that grep has many flags that make it a very versatile tool. Some I frequently use are -C for getting Context around the matching line and -v for inverting the match, i.e. print all lines that do not match the pattern. For example, grep -C 5 will print 5 lines before and after the match. When it comes to quickly searching through many files, you want to use -R since it will Recursively go into directories and look for files for the matching string.

grep 有很多选项,这也使它成为一个非常全能的工具。其中我经常使用的有 -C :获取查找结果的上下文(Context);-v 将对结果进行反选(Invert),也就是输出不匹配的结果。举例来说, grep -C 5 会输出匹配结果前后五行。当需要搜索大量文件的时候,使用 -R 会递归地进入子目录并搜索所有的文本文件。

But grep -R can be improved in many ways, such as ignoring .git folders, using multi CPU support, &c.

但是,我们有很多办法可以对 grep -R 进行改进,例如使其忽略.git 文件夹,使用多CPU等等。

Many grep alternatives have been developed, including ack, ag and rg. All of them are fantastic and pretty much provide the same functionality. For now I am sticking with ripgrep (rg), given how fast and intuitive it is. Some examples:

因此也出现了很多它的替代品,包括 ack, agrg。它们都特别好用,但是功能也都差不多,我比较常用的是 ripgrep (rg) ,因为它速度快,而且用法非常符合直觉。例子如下:

# Find all python files where I used the requests library
rg -t py 'import requests'
# Find all files (including hidden files) without a shebang line
rg -u --files-without-match "^#!"
# 打印出所有不匹配给定模式的文件

# Find all matches of foo and print the following 5 lines
rg foo -A 5
# Print statistics of matches (# of matched lines and files )
rg --stats PATTERN

Note that as with find/fd, it is important that you know that these problems can be quickly solved using one of these tools, while the specific tools you use are not as important.

find/fd 一样,重要的是你要知道有些问题使用合适的工具就会迎刃而解,而具体选择哪个工具则不是那么重要。

Screenshot 2023-11-10 at 01.37.53

rg "import requests" -t py -C 5 ~/Downloads/

Screenshot 2023-11-10 at 01.42.13

Screenshot 2023-11-10 at 01.45.35

Screenshot 2023-11-10 at 01.45.50

Finding shell commands

So far we have seen how to find files and code, but as you start spending more time in the shell, you may want to find specific commands you typed at some point. The first thing to know is that typing the up arrow will give you back your last command, and if you keep pressing it you will slowly go through your shell history.

目前为止,我们已经学习了如何查找文件和代码,但随着你使用shell的时间越来越久,您可能想要找到之前输入过的某条命令。首先,按向上的方向键会显示你使用过的上一条命令,继续按上键则会遍历整个历史记录。

The history command will let you access your shell history programmatically. It will print your shell history to the standard output. If we want to search there we can pipe that output to grep and search for patterns. history | grep find will print commands that contain the substring "find".

history 命令允许您以程序员的方式来访问shell中输入的历史命令。这个命令会在标准输出中打印shell中的里面命令。如果我们要搜索历史记录,则可以利用管道将输出结果传递给 grep 进行模式搜索。 history | grep find 会打印包含find子串的命令。

history:

Screenshot 2023-11-10 at 01.52.56

In most shells, you can make use of Ctrl+R to perform backwards search through your history. After pressing Ctrl+R, you can type a substring you want to match for commands in your history.

对于大多数的shell来说,您可以使用 Ctrl+R 对命令历史记录进行回溯搜索。敲 Ctrl+R 后您可以输入子串来进行匹配,查找历史命令行。

Screenshot 2023-11-10 at 01.54.59

As you keep pressing it, you will cycle through the matches in your history. This can also be enabled with the UP/DOWN arrows in zsh.

反复按下就会在所有搜索结果中循环。在 zsh 中,使用方向键上或下也可以完成这项工作。

A nice addition on top of Ctrl+R comes with using fzf bindings. fzf is a general-purpose fuzzy finder that can be used with many commands. Here it is used to fuzzily match through your history and present results in a convenient and visually pleasing manner.

Ctrl+R 可以配合 fzf 使用。fzf 是一个通用对模糊查找工具,它可以和很多命令一起使用。这里我们可以对历史命令进行模糊查找并将结果以赏心悦目的格式输出。

Screenshot 2023-11-10 at 01.58.11

brew intall fzf.

Screenshot 2023-11-10 at 01.59.36

Another cool history-related trick I really enjoy is history-based autosuggestions. First introduced by the fish shell, this feature dynamically autocompletes your current shell command with the most recent command that you typed that shares a common prefix with it. It can be enabled in zsh and it is a great quality of life trick for your shell.

另外一个和历史命令相关的技巧我喜欢称之为基于历史的自动补全。 这一特性最初是由 fish shell 创建的,它可以根据您最近使用过的开头相同的命令,动态地对当前对shell命令进行补全。这一功能在 zsh 中也可以使用,它可以极大的提高用户体验。

You can modify your shell's history behavior, like preventing commands with a leading space from being included. This comes in handy when you are typing commands with passwords or other bits of sensitive information. To do this, add HISTCONTROL=ignorespace to your .bashrc or setopt HIST_IGNORE_SPACE to your .zshrc. If you make the mistake of not adding the leading space, you can always manually remove the entry by editing your .bash_history or .zhistory.

你可以修改 shell history 的行为,例如,如果在命令的开头加上一个空格,它就不会被加进shell记录中。当你输入包含密码或是其他敏感信息的命令时会用到这一特性。 为此你需要在.bashrc中添加HISTCONTROL=ignorespace或者向.zshrc 添加 setopt HIST_IGNORE_SPACE。 如果你不小心忘了在前面加空格,可以通过编辑。bash_history.zhistory 来手动地从历史记录中移除那一项。

Directory Navigation

So far, we have assumed that you are already where you need to be to perform these actions. But how do you go about quickly navigating directories? There are many simple ways that you could do this, such as writing shell aliases or creating symlinks with ln -s, but the truth is that developers have figured out quite clever and sophisticated solutions by now.

之前对所有操作我们都默认一个前提,即您已经位于想要执行命令的目录下,但是如何才能高效地在目录 间随意切换呢?有很多简便的方法可以做到,比如设置alias,使用 ln -s 创建符号连接等。而开发者们已经想到了很多更为精妙的解决方案。

As with the theme of this course, you often want to optimize for the common case. Finding frequent and/or recent files and directories can be done through tools like fasd and autojump.

由于本课程的目的是尽可能对你的日常习惯进行优化。因此,我们可以使用fasdautojump 这两个工具来查找最常用或最近使用的文件和目录。

Fasd ranks files and directories by frecency, that is, by both frequency and recency. By default, fasd adds a z command that you can use to quickly cd using a substring of a frecent directory. For example, if you often go to /home/user/files/cool_project you can simply use z cool to jump there. Using autojump, this same change of directory could be accomplished using j cool.

Fasd 基于 frecency 对文件和文件排序,也就是说它会同时针对频率(frequency)和时效(recency)进行排序。默认情况下,fasd使用命令 z 帮助我们快速切换到最常访问的目录。例如, 如果您经常访问/home/user/files/cool_project 目录,那么可以直接使用 z cool 跳转到该目录。对于 autojump,则使用j cool代替即可

More complex tools exist to quickly get an overview of a directory structure: tree, broot or even full fledged file managers like nnn or ranger.

还有一些更复杂的工具可以用来概览目录结构,例如 tree, broot 或更加完整的文件管理器,例如 nnnranger

Screenshot 2023-11-10 at 11.41.38

Screenshot 2023-11-10 at 11.42.50

brew install nnn

Screenshot 2023-11-10 at 11.44.37

Exercises

  1. Read man ls and write an ls command that lists files in the following manner

    • Includes all files, including hidden files
    • Sizes are listed in human readable format (e.g. 454M instead of 454279954)
    • Files are ordered by recency
    • Output is colorized

    A sample output would look like this

    -rw-r--r--   1 user group 1.1M Jan 14 09:53 baz
    drwxr-xr-x   5 user group  160 Jan 14 09:53 .
    -rw-r--r--   1 user group  514 Jan 14 06:42 bar
    -rw-r--r--   1 user group 106M Jan 13 12:12 foo
    drwx------+ 47 user group 1.5K Jan 12 18:08 ..
    

    Screenshot 2023-11-10 at 11.47.39

    Screenshot 2023-11-10 at 11.53.26

  2. Write bash functions marco and polo that do the following. Whenever you execute marco the current working directory should be saved in some manner, then when you execute polo, no matter what directory you are in, polo should cd you back to the directory where you executed marco. For ease of debugging you can write the code in a file marco.sh and (re)load the definitions to your shell by executing source marco.sh.

Solution:

Screenshot 2023-11-10 at 12.07.57

#!/bin/bash
 marco(){
     echo "$(pwd)" > $HOME/marco_history.log
     echo "save pwd $(pwd)"
 }
 polo(){
     cd "$(cat "$HOME/marco_history.log")"
 }
  1. Say you have a command that fails rarely. In order to debug it you need to capture its output but it can be time consuming to get a failure run. Write a bash script that runs the following script until it fails and captures its standard output and error streams to files and prints everything at the end. Bonus points if you can also report how many runs it took for the script to fail.
  #!/usr/bin/env bash

  n=$(( RANDOM % 100 ))

  if [[ n -eq 42 ]]; then
     echo "Something went wrong"
     >&2 echo "The error was using magic numbers"
     exit 1
  fi

  echo "Everything went according to plan"

Solution:

Screenshot 2023-11-10 at 12.35.53

  1. As we covered in the lecture find’s -exec can be very powerful for performing operations over the files we are searching for. However, what if we want to do something with all the files, like creating a zip file? As you have seen so far commands will take input from both arguments and STDIN. When piping commands, we are connecting STDOUT to STDIN, but some commands like tar take inputs from arguments. To bridge this disconnect there’s the xargs command which will execute a command using STDIN as arguments. For example ls | xargs rm will delete the files in the current directory.

Your task is to write a command that recursively finds all HTML files in the folder and makes a zip with them. Note that your command should work even if the files have spaces (hint: check -d flag for xargs).

If you’re on macOS, note that the default BSD find is different from the one included in GNU coreutils. You can use -print0 on find and the -0 flag on xargs. As a macOS user, you should be aware that command-line utilities shipped with macOS may differ from the GNU counterparts; you can install the GNU versions if you like by using brew.

Screenshot 2023-11-10 at 12.39.36

  1. (Advanced) Write a command or script to recursively find the most recently modified file in a directory. More generally, can you list all files by recency?

Screenshot 2023-11-10 at 12.41.25

Find all files in the current directory recursively.

List all files by recency.

Find the most recently modified file in a directory i.e. the first line of the list.