Minishell
[42 Madrid] As beautiful as a shell
minishell | 42 Madrid
As beautiful as a shell 🐚
<div align="center"> <img src=https://user-images.githubusercontent.com/40824677/149231106-b3b63b76-633f-4618-a5af-c9526bcdd10f.png /> </div>
Introduction
This project is all about recreating your very own (mini)shell, taking bash (Bourne Again SHell) as a reference. This was our first group project, and I was honored to do it with @mbueno-g :)
What the Shell?
As we just said, we are asked to implement our own shell, but what is a shell to begin with? If we think of (for example) Linux as a nut or a seashell, the kernel/seed is the core of the nut and has to be surrounded by a cover or shell. Likewise, the shell we are implementing works as a command interpreter communicating with the OS kernel in a secure way, and allows us to perform a number of tasks from a command line, namely executing commands, creating or deleting files or directories, or reading and writing the contents of files, among (many) other things.
Our Implementation of Minishell
The general idea for this shell is to read a string of commands from a prompt using readline. Before anything else, it is highly recommended to take a deep dive into the bash manual, as it goes over every detail we had to keep in mind when doing this project. Minishell involves heavy parsing of the string read by readline, so it is crucial to divide the code of the project into different parts: the lexer, the expander, the parser, and lastly the executor.
Lexer and Expander
This first part covers the code in charge of expanding environment variables (a $ followed by characters) and of expanding ~ to the user's home directory. Here we also split the input string into small chunks, or tokens, to better handle pipes, redirections, and expansions.
After reading from the stdin we use a function we named cmdtrim which separates the string taking spaces and quotes into account. For example:
string: echo "hello there" how are 'you 'doing? $USER |wc -l >outfile
output: {echo, "hello there", how, are, 'you 'doing?, $USER, |wc, -l, >outfile, NULL}
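The splitting step above can be sketched as follows. This is a minimal illustration, not the project's actual cmdtrim: the names `split_quoted`, `token_len`, and `free_tokens` are hypothetical, and only spaces are treated as separators.

```c
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated token array (hypothetical helper). */
void free_tokens(char **tok)
{
    size_t i;

    if (!tok)
        return;
    for (i = 0; tok[i]; i++)
        free(tok[i]);
    free(tok);
}

/* Length of the token starting at s: it ends at the first space
 * that is not inside single or double quotes. */
static size_t token_len(const char *s)
{
    size_t i = 0;
    char quote = 0; /* 0 outside quotes, otherwise the open quote char */

    while (s[i] && (quote || s[i] != ' '))
    {
        if (!quote && (s[i] == '\'' || s[i] == '"'))
            quote = s[i];
        else if (quote && s[i] == quote)
            quote = 0;
        i++;
    }
    return i;
}

/* Splits line on unquoted spaces, keeping the quotes in the tokens.
 * Returns a NULL-terminated array (free with free_tokens) or NULL. */
char **split_quoted(const char *line)
{
    /* Every token but the last needs a char plus a space: len/2 + 2 slots suffice. */
    char **out = calloc(strlen(line) / 2 + 2, sizeof(*out));
    size_t n = 0;
    size_t len;

    if (!out)
        return NULL;
    while (*line)
    {
        while (*line == ' ')
            line++;
        if (!*line)
            break;
        len = token_len(line);
        out[n] = malloc(len + 1);
        if (!out[n])
        {
            free_tokens(out);
            return NULL;
        }
        memcpy(out[n], line, len);
        out[n][len] = '\0';
        n++;
        line += len;
    }
    return out;
}
```

Tracking the open quote character (rather than a simple flag) lets a single quote appear inside double quotes, and vice versa, without ending the token.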
Then, we apply the expander functions on top of every substring of the original string, resulting in something similar to this:
output: {echo, "hello there", how, are, 'you 'doing?, pixel, |wc, -l, >outfile, NULL}
Note: if a variable is not found, the $var part of the string will be replaced by an empty string.
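As a rough sketch of the expansion step, the function below replaces every $NAME outside single quotes with its value from the environment, and with nothing when the variable is unset. The names `expand_vars` and `append` are hypothetical, the lookup goes through getenv rather than the shell's own envp array, and ~ and other special expansions are left out.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Appends n bytes of src to the heap string *dst of length *len.
 * Returns 0 on success, -1 if the allocation fails. */
static int append(char **dst, size_t *len, const char *src, size_t n)
{
    char *tmp = realloc(*dst, *len + n + 1);

    if (!tmp)
        return -1;
    memcpy(tmp + *len, src, n);
    *len += n;
    tmp[*len] = '\0';
    *dst = tmp;
    return 0;
}

/* Returns a new string with $NAME expanded outside single quotes.
 * Assumption: values come from getenv(), not from a private envp. */
char *expand_vars(const char *s)
{
    char *out = NULL;
    size_t len = 0;
    int in_single = 0;
    int in_double = 0;

    if (append(&out, &len, "", 0) < 0)
        return NULL;
    while (*s)
    {
        if (*s == '\'' && !in_double)
            in_single = !in_single;
        else if (*s == '"' && !in_single)
            in_double = !in_double;
        if (*s == '$' && !in_single
            && (isalpha((unsigned char)s[1]) || s[1] == '_'))
        {
            size_t n = 1; /* counts the '$' plus the name characters */
            char *name;
            const char *val;

            while (isalnum((unsigned char)s[n]) || s[n] == '_')
                n++;
            name = malloc(n); /* n - 1 name chars plus the terminator */
            if (!name)
                return (free(out), NULL);
            memcpy(name, s + 1, n - 1);
            name[n - 1] = '\0';
            val = getenv(name);
            free(name);
            if (val && append(&out, &len, val, strlen(val)) < 0)
                return (free(out), NULL);
            s += n;
        }
        else
        {
            if (append(&out, &len, s, 1) < 0)
                return (free(out), NULL);
            s++;
        }
    }
    return out;
}
```

Quotes are copied through unchanged here; removing them is left to a later step.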
Lastly, we have another split function called cmdsubsplit which separates with <, |, or >, but only if those chars are outside of quotes:
output: {echo, "hello there", how, are, 'you 'doing?, pixel, |, wc, -l, >, outfile, NULL}
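A second split of this kind could look like the sketch below, applied to one token at a time. It is not the project's cmdsubsplit: `split_operators`, `piece_len`, and `is_op` are hypothetical names, and each operator character becomes its own piece, so a double operator such as >> would come out as two > pieces.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if c is one of the operator characters <, | or >. */
static int is_op(char c)
{
    return c == '<' || c == '|' || c == '>';
}

/* Length of the next piece of s: a single operator character, or a run
 * of other characters (operators inside quotes count as ordinary text). */
static size_t piece_len(const char *s)
{
    size_t i = 0;
    char quote = 0;

    if (is_op(s[0]))
        return 1;
    while (s[i] && (quote || !is_op(s[i])))
    {
        if (!quote && (s[i] == '\'' || s[i] == '"'))
            quote = s[i];
        else if (quote && s[i] == quote)
            quote = 0;
        i++;
    }
    return i;
}

/* Splits one token around unquoted <, | and >.
 * Returns a NULL-terminated array of heap strings, or NULL on failure. */
char **split_operators(const char *tok)
{
    /* At most one piece per character, plus the NULL terminator. */
    char **out = calloc(strlen(tok) + 1, sizeof(*out));
    size_t n = 0;
    size_t len;

    if (!out)
        return NULL;
    while (*tok)
    {
        len = piece_len(tok);
        out[n] = malloc(len + 1);
        if (!out[n])
        {
            while (n > 0)
                free(out[--n]);
            free(out);
            return NULL;
        }
        memcpy(out[n], tok, len);
        out[n][len] = '\0';
        n++;
        tok += len;
    }
    return out;
}
```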
Parser
The parser is in charge of taking the tokenized string and storing it in a form that is useful for the executor later. Our data structure is laid out as follows:
```c
int	g_status;

typedef struct s_prompt
{
	t_list	*cmds;
	char	**envp;
	pid_t	pid;
}	t_prompt;

typedef struct s_mini
{
	char	**full_cmd;
	char	*full_path;
	int		infile;
	int		outfile;
}	t_mini;
```
Here is a short summary of what every variable is used for:
| Parameter | Description |
| :-------: | :---------: |
| cmds | Linked list of t_mini nodes, one for each command separated by pipes |
| full_cmd | Equivalent of the typical argv, containing the command name and its parameters when needed |
| full_path | If not a builtin, first available path for the executable denoted by argv[0] from the PATH variable |
| infile | Which file descriptor to read from when running a command (defaults to stdin) |
| outfile | Which file descriptor to write to when running a command (defaults to stdout) |
| envp | Up-to-date array containing keys and values for the shell environment |
| pid | Process ID of the minishell instance |
| g_status | Exit status of the most-recently-executed command |
After running our lexer and expander, we have a two-dimensional array. Following the previous example, it looks like this:
{echo, "hello there", how, are, 'you 'doing?, pixel, |, wc, -l, >, outfile, NULL}
Now, our parser starts building the linked list of commands (t_list *cmds), which is filled in the following way:
- Iterate over the two-dimensional array
- Whenever a redirection is found, check the type of redirection and retrieve a file descriptor for the target file, to be used as the command's infile or outfile
- Check that the file descriptor that has been opened is valid (!= -1) and continue
- If a pipe is found, add a new node to the list of commands
- In all other cases add whatever words are found to the argument list (argv), which we call full_cmd
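The steps above can be sketched roughly as follows. To stay self-contained, this sketch swaps the linked list for a fixed-size array and uses a hypothetical `t_cmd` struct in place of t_mini; `parse_tokens`, `new_cmd`, `MAX_CMDS`, and `MAX_ARGS` are likewise made up for illustration. It only handles the plain < and > redirections, does not strip quotes, and leaves the pipe wiring between commands to the executor.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAX_CMDS 16
#define MAX_ARGS 32

/* Simplified stand-in for t_mini (hypothetical). */
typedef struct s_cmd
{
    char *argv[MAX_ARGS + 1]; /* plays the role of full_cmd */
    int argc;
    int infile;
    int outfile;
} t_cmd;

/* Returns an empty command that reads stdin and writes stdout. */
static t_cmd new_cmd(void)
{
    return (t_cmd){.infile = STDIN_FILENO, .outfile = STDOUT_FILENO};
}

/* Fills cmds (capacity MAX_CMDS) from a NULL-terminated token array.
 * Returns the number of commands, or -1 on a syntax error, an overflow
 * or a failed open(). The argv entries point into tok, they are not copied. */
int parse_tokens(char **tok, t_cmd *cmds)
{
    int n = 0;
    size_t i;

    cmds[0] = new_cmd();
    for (i = 0; tok[i]; i++)
    {
        t_cmd *c = &cmds[n];

        if (strcmp(tok[i], "|") == 0)
        {
            /* A pipe needs a command on both sides. */
            if (c->argc == 0 || !tok[i + 1] || n + 1 == MAX_CMDS)
                return -1;
            cmds[++n] = new_cmd();
        }
        else if (strcmp(tok[i], "<") == 0 || strcmp(tok[i], ">") == 0)
        {
            int is_out = (tok[i][0] == '>');
            int fd;

            if (!tok[i + 1])
                return -1;
            if (is_out)
                fd = open(tok[i + 1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
            else
                fd = open(tok[i + 1], O_RDONLY);
            if (fd == -1)
                return -1;
            if (is_out)
                c->outfile = fd;
            else
                c->infile = fd;
            i++; /* skip the file name */
        }
        else
        {
            if (c->argc == MAX_ARGS)
                return -1;
            c->argv[c->argc++] = tok[i];
        }
    }
    return n + 1;
}
```

Because every command starts zero-initialized, its argv stays NULL-terminated without extra bookkeeping, which is the form execve expects later on.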
Here's what the variables will look like for the example we used before:
```
cmds:
    cmd 1:
        infile: 0 (default)
        outfile: 1 (redirected to pipe)
        full_path: NULL (because echo is a builtin)
        full_cmd: {echo, hello there, how, are, you doing?, pixel, NULL}
    cmd 2:
        infile: 0 (contains output of previous command)
        outfile: 3 (fd corresponding to the open file 'outfile')
        full_path: /bin/wc
        full_cmd: {wc, -l, NULL}
envp: (envp from main)
pid: process ID of current instance
g_status: 0 (if last command exits normally)
```
Executor
With all our data properly stored in our structs, the executor has all the information it needs to execute commands. For this part we run builtins and other commands in separate child processes that redirect stdin and stdout, in the same way we did in our previous pipex project. If we are given a full path (e.g. /bin/ls), we do not need to look up the command and can execute it directly with execve. If we are given only a command name (e.g. ls), we use the PATH environment variable to determine the full_path of the command. After all commands have started running, we retrieve the exit status of the most recently executed command with the help of waitpid.
Once all commands have finished running, the allocated memory is freed and a new prompt appears to read the next command.
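For a single command, the fork, redirect, execve, and waitpid sequence described above might look like this sketch. `run_command` is a hypothetical name, and the path is assumed to be resolved already. Unlike the approach described above, it waits for the child right away; a real pipeline would start every command first and wait afterwards. The 127 and 128 + signal return values follow the convention bash uses.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs path with argv/envp in a child process whose stdin and stdout
 * are redirected to infile and outfile. Returns the exit status. */
int run_command(const char *path, char *const argv[], char *const envp[],
                int infile, int outfile)
{
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return 1;
    if (pid == 0)
    {
        /* Child: install the redirections, then replace the process image. */
        if (infile != STDIN_FILENO && dup2(infile, STDIN_FILENO) == -1)
            _exit(1);
        if (outfile != STDOUT_FILENO && dup2(outfile, STDOUT_FILENO) == -1)
            _exit(1);
        execve(path, argv, envp);
        _exit(127); /* only reached when execve fails */
    }
    /* Parent: collect the child's status. */
    if (waitpid(pid, &status, 0) == -1)
        return 1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return 1;
}
```

Calling `_exit` rather than `exit` in the child avoids flushing stdio buffers that were inherited from the parent.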
Mind Map
Here is a handy mind map of our code structure to help you understand everything we mentioned previously.

Global Variable
For this project we were allowed to use one global variable. At first it seemed we were never going to need one, but later it became obvious that it is required, specifically for signal handling. When we use signal to capture SIGINT (from Ctrl-C) and SIGQUIT (from Ctrl-\), we have to change the exit status, but a signal handler only receives the signal number, so it has no obvious way to reach the status that should change when either of these signals is captured. To work around this, we added a global variable g_status that holds the exit status and is updated when signals are detected.
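A minimal sketch of that idea is shown below. The handler and `install_handlers` are hypothetical names, ignoring SIGQUIT is an assumption of this sketch, and g_status is declared as `volatile sig_atomic_t` here (the project declares a plain int) because that is the type the C standard guarantees can be written safely from a handler. The value 130 is 128 + SIGINT, which is what bash reports after Ctrl-C.

```c
#include <signal.h>

/* The single global: exit status of the most recent command. */
volatile sig_atomic_t g_status = 0;

/* SIGINT handler: it only receives the signal number, so the global
 * is the one place where it can record the new status. */
static void handle_sigint(int sig)
{
    (void)sig;
    g_status = 130; /* 128 + SIGINT */
}

/* Installs the handlers; returns 0 on success, -1 on failure. */
int install_handlers(void)
{
    if (signal(SIGINT, handle_sigint) == SIG_ERR)
        return -1;
    if (signal(SIGQUIT, SIG_IGN) == SIG_ERR) /* assumption: ignore Ctrl-\ */
        return -1;
    return 0;
}
```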
Builtins
We were asked to implement some basic builtins with the help of some functions. Here is a brief overview of them:
| Builtin | Description | Options | Parameters | Helpful Functions |
| :-----: | :---------: | :-----: | :--------: | :---------------: |
| echo | Prints arguments separated with a space followed by a new line | -n | :heavy_check_mark: | write |
| cd | Changes current working directory, updating PWD and OLDPWD | :x: | :heavy_check_mark: | chdir |
| pwd | Prints current working directory | :x: | :x: | getcwd |
| env | Prints environment | :x: | :x: | write |
| export | Adds/replaces variable in environment | :x: | :heavy_check_mark: | :x: |
| unset | Removes variable from environment | :x: | :heavy_check_mark: | :x: |
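To give an idea of how small a builtin can be, here is a sketch of echo with its -n option, built on write as listed in the table. `builtin_echo` is a hypothetical name, and taking the output file descriptor as a parameter is an assumption made so the function works with redirections.

```c
#include <string.h>
#include <unistd.h>

/* Writes argv[1..] to fd separated by single spaces, followed by a
 * newline unless the first argument is -n. Returns 0, or 1 on error. */
int builtin_echo(char *const argv[], int fd)
{
    int newline = 1;
    size_t i = 1;

    if (argv[1] && strcmp(argv[1], "-n") == 0)
    {
        newline = 0;
        i = 2;
    }
    for (; argv[i]; i++)
    {
        if (write(fd, argv[i], strlen(argv[i])) == -1)
            return 1;
        /* Separator only between arguments, not after the last one. */
        if (argv[i + 1] && write(fd, " ", 1) == -1)
            return 1;
    }
    if (newline && write(fd, "\n", 1) == -1)
        return 1;
    return 0;
}
```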
Prompt
As mentioned previously, we use readline to read the string containing the shell commands. To make it more interactive, readline receives a string to be used as a prompt. We have heavily tweaked the looks of it to be nice to use. The prompt is structured as follows:
$USER@minishell $PWD $
Some remarks:
- If there is any problem retrieving the user, it will be replaced with guest
- The PWD is colored blue and dynamically replaces the HOME variable with ~ when the variable is set. See below for more details
- The $ in the end is printed blue or red depending on the exit status in the struct
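The prompt layout can be sketched as below. `build_prompt` is a hypothetical helper, the colors are left out, and requiring the home prefix to end at a path boundary (so /home/pixelated is not shortened for a home of /home/pixel) is this sketch's own assumption.

```c
#include <stdio.h>
#include <string.h>

/* Writes "user@minishell dir $ " into buf, replacing a leading home
 * directory in pwd with ~. Returns 0, or -1 if buf is too small. */
int build_prompt(char *buf, size_t size, const char *user,
                 const char *pwd, const char *home)
{
    const char *tilde = "";
    size_t hlen = home ? strlen(home) : 0;
    int n;

    if (!user || !*user)
        user = "guest"; /* fallback when the user cannot be retrieved */
    if (hlen > 0 && strncmp(pwd, home, hlen) == 0
        && (pwd[hlen] == '/' || pwd[hlen] == '\0'))
    {
        tilde = "~";
        pwd += hlen; /* keep only the part after the home directory */
    }
    n = snprintf(buf, size, "%s@minishell %s%s $ ", user, tilde, pwd);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

The resulting string is what would be handed to readline as its prompt argument.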