6.s081 Lab0 Xv6 and Unix Utilities | 没什么看头的地方

这个人太懒了，都不想加标签

Lab

发布日期: 2023-10-22

热身运动，实现一些 shell 命令。

#Boot xv6

首先去 lab tool page 把实验所需工具都给装上，然后查看一下各个工具的版本，检查是否安装成功。

$ tool --version

获取代码，切换到实验对应分支，准备开撸。

$ git clone git://g.csail.mit.edu/xv6-labs-2021
$ cd xv6-labs-2021
$ git checkout util

输入 make qemu 可以编译所有代码并模拟内核环境，输入 ctrl+a x 以退出环境。

#Task1: sleep

该任务要求实现 sleep 命令，模式为

$ sleep 时间值

这个任务就是帮助我们熟悉整个代码框架，以及一些头文件。

比如框架为我们提供了系统调用，位于头文件 user/user.h 中，以及 kernel/types.h 为我们提供了类型定义。这两个头文件在后面都是需要 include 的。

在 user/ 文件夹下 touch 一个新文件，然后写入代码：

user/sleep.c
#include "kernel/types.h"
#include "user/user.h"

int
main(int argc, char *argv[])
{
  if (argc < 2) {
    fprintf(2, "Usage: sleep seconds...\n");
    exit(1);
  }

  int ticks = atoi(argv[1]);  // 自带的库函数，定义位于 user/ulib.c
  sleep(ticks);               // user.h 中的系统调用
  exit(0);                    // 每个进程都要调用 exit(0) 正常退出
}

与此同时，还要修改 Makefile 里头 UPROGS 变量，在后面加上 $U/_sleep\ 即可正确编译。编译后，我们可以进入 qemu 并调用 sleep 看看效果：

$ make qemu
$ sleep 10
(nothing happens for a little while)

也可以用 lab 自带的 python 脚本 grade-lab-util [test-name]

$ ./grade-lab-util sleep
...
== Test sleep, no arguments == sleep, no arguments: OK (3.6s)
== Test sleep, returns == sleep, returns: OK (0.6s)
== Test sleep, makes syscall == sleep, makes syscall: OK (1.0s)

#Task2: pingpong

该任务要求我们利用 pipe() 和 fork() 两大系统调用进行父子进程间的同步：

父进程通过管道发送一个字节给子进程；
子进程收到后，打印一条消息，然后将该字节返回给父进程，退出；
父进程等待子进程运行完毕，再进行读取，这样做是防止读到自己已经写入的内容。收到后，打印一条消息，然后退出；

首先是管道的创建。int pipe(int *fd) 会将 fd[0] 设置为读端，fd[1] 设置为写端，并返回调用是否成功（返回 -1 表示创建管道失败）

再是进程的创建。int fork() 会在当前进程 parent 的基础上创建一个新进程 child，child 相当于是 parent 的副本——拷贝了所有内容，唯一的区别在于这俩进程不共享地址空间，也就是说在 child 中修改全局变量并不会影响到 parent 里的全局变量，毕竟只是副本。对于父进程，fork 会返回子进程的 pid；而对于子进程则返回 0。

最后等待子进程运行完毕需要用到 wait(int *)，传入的是子进程 pid 的指针，如果传入 0 意为等待所有子进程结束。

有了这些预备知识，代码就很好编写了：

user/pingpong.c
#include "kernel/types.h"
#include "user/user.h"

#define ONEBYTE sizeof(char)

int
main(int argc, char *argv[])
{
  int fd[2]; // fd[0] 为读端, fd[1] 为写端
  if (pipe(fd) < 0) {
    fprintf(2, "pingpong: pipe failed\n");
    exit(1);
  }

  // 1. The parent should send a byte to the child;
  // 2. the child should print "<pid>: received ping", where <pid> is its process ID,
  //    and write the byte on the pipe to the parent, and exit;
  // 3. the parent should read the byte from the child, print "<pid>: received pong", and exit.
  int pid = fork();
  if (pid > 0) { // parent
    char *buf = (char*)malloc(ONEBYTE); // an one-byte memory
    if (write(fd[1], buf, ONEBYTE) != ONEBYTE) {
      fprintf(2, "pingpong: proc %d write failed\n", getpid());
      free(buf);
      exit(1);
    }

    wait(&pid); // 如果不加这步，则下面的 read 可能会抢先拿出自己 write 的数据

    if (read(fd[0], buf, ONEBYTE) < 0) {
      fprintf(2, "pingpong: proc %d read failed\n", getpid());
      free(buf);
      exit(1);
    }

    printf("%d: received pong\n", getpid());
    free(buf);  // malloc 后跟 free 防止内存泄漏
  } else if (pid == 0) { // child
    char* buf = (char*)malloc(ONEBYTE);
    if (read(fd[0], buf, ONEBYTE) < 0) {
      fprintf(2, "pingpong: proc %d read failed\n", getpid());
      free(buf);
      exit(1);
    }

    printf("%d: received ping\n", getpid());

    if (write(fd[1], buf, ONEBYTE) != ONEBYTE) {
      fprintf(2, "pingpong: proc %d write failed\n", getpid());
      free(buf);
      exit(1);
    }
    free(buf);
  } else {
    fprintf(2, "pingpong: fork failed\n");
    exit(1);
  }

  exit(0);
}

修改完 Makefile 后进行测试，结果如下：

$ ./grade-lab-util pingpong
...
== Test pingpong == pingpong: OK (1.3s)

#Task3: primes

该任务是上一个任务的 plus 版本，编写一个并发输出 2~35 之间所有素数的程序，算法思想见这个网页。

在这里，就是对于每一个进程 p，不断接收其父进程 parent 写入管道的数据。对于第一个收到的数据 n，直接打印，对于后续的数据 data，如果满足 data % p != 0，说明该数可能是个素数，通过管道交给子进程 child。

由于所有的管道都是单向的，即从 parent 流向 child，故每个进程需关闭左侧（与 parent 交互）管道的写端与右侧（与 child 交互）的读端，防止文件描述符不够用的情况。

注意到，由于范围内的最后一个素数是 31，所以收到并打印 31 以后就不用再进一步创建子进程，反之，打印完素数后进一步 fork()，这也指明了递归终止的条件（创建子进程相当于一个递归的过程）。同时，最左侧进程（也就是主进程）只能写不能读，那就干脆不让他打印了，只往管道写数据好了。

代码如下：

user/primes.c
#include "kernel/types.h"
#include "user/user.h"

void
createNewProc(int p, int *left)
{
  int right[2];
  if (pipe(right) < 0) {
    fprintf(2, "primes: pipe failed\n");
    exit(1);
  }

  int pid = fork();
  if (pid > 0) {
    close(right[0]);
    int n;
    while (read(left[0], &n, sizeof(int)) == sizeof(int)) {
      if (n % p != 0) {
        if (write(right[1], &n, sizeof(int)) != sizeof(int)) {
          fprintf(2, "primes: proc %d write failed\n", getpid());
          exit(1);
        }
      }
    }
    close(right[1]);
    wait((int*)0);
  } else if (pid == 0) {
    close(right[1]);

    int p;
    read(right[0], &p, sizeof(int));
    printf("prime %d\n", p);

    if (p != 31) {
      createNewProc(p, right);
    }
  } else {
    fprintf(2, "primes: fork failed\n");
    exit(1);
  }
}

int
main(int argc, char *argv[])
{
  int fd[2];
  if (pipe(fd) < 0) {
    fprintf(2, "primes: pipe failed\n");
    exit(1);
  }

  int pid = fork();
  if (pid > 0) {
    close(fd[0]); // 关闭用不到的
    for (int i = 2; i <= 35; i++) {
      if (write(fd[1], &i, sizeof(int)) != sizeof(int)) {
        fprintf(2, "primes: proc %d write failed\n", getpid());
        exit(1);
      }
    }
    close(fd[1]); // 写完了就关掉，防止不够用
    wait((int*)0);
  } else if (pid == 0) {
    close(fd[1]); // 关闭用不到的
    int p;
    read(fd[0], &p, sizeof(int));
    printf("prime %d\n", p);

    createNewProc(p, fd); // 第一个打印数据的进程，递归创建子进程进行打印数据
  } else {
    fprintf(2, "primes: fork failed\n");
    exit(1);
  }

  exit(0);
}

修改完 Makefile 后进行测试，结果如下：

$ ./grade-lab-util primes
...
== Test primes == primes: OK (0.7s)

#Task4: find

该任务要求我们实现一个简单版本的 find 命令。其模式为

$ find 目录 文件名

user/ls.c 里有读取目录的样例实现。熟悉文件系统的应该知道，目录也是「文件」，也具有 inode 号和相应的磁盘空间，只不过其磁盘块上的内容是若干文件条目，在 lab 中用数据结构 dirent 表示，其内容为 {inode, name}，其中 name 既可以是文件名，也可以是目录名。

所以思路就清晰了：

对于每个当前目录 dir，遍历其 dirent 并利用 stat() 获取该条目对应的类型；
如果是文件类型，则判断其 name 与待查找的文件名 target 是否一致，若一致则输出路径；
如果是目录类型，递归查找，跳过当前目录 . 与父目录 ..；

代码如下：

user/find.c
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
#include "kernel/fs.h"
#include "kernel/param.h"

#define BUFSIZE 512

void
searchFile(char *dir, char *file)
{
  int fd; // 指向目录 dir
  if ((fd = open(dir, 0)) < 0){
    fprintf(STDERR, "find: cannot open %s\n", dir);
    exit(1);
  }

  if (strlen(dir) + 1 + DIRSIZ + 1 > BUFSIZE){ // path + '/' + name + '\0'
    fprintf(STDERR, "find: path too long\n");
    exit(1);
  }

  // 构建路径名
  char buf[BUFSIZE], *p;
  strcpy(buf, dir);
  p = buf+strlen(buf);
  *p++ = '/'; // buf 当前为 "${dir}/"

  // 遍历目录下所有条目
  struct dirent de;
  struct stat st;
  while (read(fd, &de, sizeof(de)) == sizeof(de)) { // 每次读一整个 dirent 大小
    if (de.inum == 0 || strcmp(de.name, ".") == 0 || strcmp(de.name, "..") == 0) { // 跳过无效项
      continue;
    }

    // 构建文件路径名
    memmove(p, de.name, DIRSIZ);
    p[DIRSIZ] = 0;
    if (stat(buf, &st) < 0) {
      fprintf(STDERR, "find: cannot stat %s\n", buf);
      continue;
    }

    switch (st.type) {
      case T_FILE: // 检查文件名是否一致
        if (strcmp(file, de.name) == 0) {
          printf("%s\n", buf);
        }
        break;
      case T_DIR: // 递归查找子目录
        searchFile(buf, file);
        break;
    }
  }
}

int
main(int argc, char *argv[])
{
  if (argc <= 2) {
    fprintf(2, "Usage: find dir files...");
    exit(1);
  }

  for (int i = 2; i < argc; i++) { // 支持同一目录下多文件查找
    searchFile(argv[1], argv[i]);
  }
  exit(0);
}

修改完 Makefile 后进行测试，结果如下：

$ ./grade-lab-util find
...
== Test find, in current directory == find, in current directory: OK (0.8s)
== Test find, recursive == find, recursive: OK (1.1s)

#Task5: xargs

最后一个任务是实现 xargs 命令。该命令一般配合管道使用，将前一个命令输出，作为后面命令的额外参数，比如

$ echo hello too | xargs echo bye
bye hello too

就是将 echo hello too 的输出 hello too 作为 echo bye 的额外参数并附加到末尾，相当于执行了命令 echo bye hello too。这里我们只需实现 -n 1 的版本，即每 1 行输出作为一组额外参数。

$ echo -e "1\n2" | xargs -n 1 echo line
line 1
line 2
## 前面理论上输出为
## 1
## 2
## 整行命令相当于依次执行了
## echo line 1
## echo line 2

对于 xargs 而言，必然前面是出现管道符 | 的，也就意味着有额外参数被写入到标准输出(fd=1)中，那就要从标准输入(fd=0)中读取了。思路很明确了：不断从标准输入中读字符直到 '\n'，意味着完整的参数已被读取，那就调用 exec() 执行命令。

注意到 exec() 会将当前进程替换为新进程，原来那个进程后面就不继续了，所以每次 exec() 都要由主进程 fork() 一个子进程去执行

user/xargs.c
#include "kernel/types.h"
#include "kernel/param.h"
#include "user/user.h"

int
main(int argc, char *argv[])
{
  // eg: find . b | xargs grep hello
  char *cmd = argv[1];
  char *new_args[MAXARG];

  // 先把原本的参数加进去
  int k;
  for (k = 0; k < argc-1; k++) {
    new_args[k] = argv[k+1];
  }

  char extra_arg[32];
  int p = 0;
  // 逐行读取，再把新参数 append 到 new_args 里去
  while (read(STDIN, extra_arg+p, 1) > 0) {
    if (extra_arg[p] == '\n') {
      extra_arg[p] = '\0';

      // 遇到 '\n' 就意味着准备好 exec 了，立马 fork 一个子进程去做。
      // 记得用 wait 来进行同步
      int pid;
      pid = fork();
      if (pid > 0) {
        wait(&pid);
      } else if (pid == 0) {
        new_args[k] = extra_arg;
        exec(cmd, new_args);

        // 如果 exec 调用失败才会到这一行
        fprintf(2, "exec %s failed\n", cmd);
        exit(1);
      } else {
        fprintf(2, "xargs: fork failed\n");
        exit(1);
      }
      p = 0;
    } else {
      p++;
    }
  }
  exit(0);
}

修改完 Makefile 后进行测试，结果如下：

$ ./grade-lab-util xargs
...
== Test xargs == xargs: OK (1.9s)

#最后的工作

新建一个 time.txt 文件，写上总共花了多少小时，然后 git commit -am "" 将所有修改提交到本地。
去网页用邮箱注册，然后收到一封邮件，点击邮件链接会收到一个 key，妥善保管；
执行 make handin，输入刚才得到的那个 key，就成功提交了，并且显示当前的课程进度；

可选的挑战再说吧，没有什么想做的欲望。

这个人太懒了，都不想加标签

6.s081 Lab1 System Calls

本 lab 要求我们进行系统调用代码的编写。

2023-10-23 Lab

面经

个人的面试经验整理。

2023-10-19 Interview