website

Website contents
git clone git://git.reagancfischer.dev/website.git
Log | Files | Refs

lexer_new.lit (21010B)


      1 @code_type c .c
      2 @comment_type /* %s */
      3 @compiler lit -t lexer.lit && gcc -Wall -Wextra -Wpedantic -Wstrict-aliasing=3 -Wwrite-strings -Wvla -Wcast-align=strict -Wstrict-prototypes -Wstringop-overflow=4 -Wshadow -fanalyzer tokenizer.c input.c hash_table.c -D TOK_TEST -g -O0 && rm a.out
      4 
      5 @title Lexer
      6 @add_css ../style.css
      7 @s General Project Structure
      8 Since this is the first article, I'll outline the project structure for the C- compiler.
      9 
     10 The project has a series of pretty typical stages:
     11 
     12 1.  The lexer. This takes a file as input and emits a series of tokens (Its input is already preprocessed, I outsource that to "gcc -E").
     13 2.  The parser. This takes the tokens and builds an abstract syntax tree (AST).
     14 3.  The symbol table. This exists in a sort of in-between space next to the lexer and parser. It's used to store information about variables and functions.
     15 4.  The type checker. This is used to ensure that the types of variables and functions are correct.
     16 5.  The code generator. This takes the AST and generates an intermediate representation (IR).
     17 6.  The optimizer. This takes the IR and optimizes it. This'll be broken up into a few stages.
     18 7.  The lowerer. This takes the IR and lowers it to a simpler IR.
     19 8.  The register allocator. This takes the IR, which has instructions in an infinite number of registers, and assigns them to a finite number of registers.
     20 9.  The code emitter. This takes the IR and emits RISC-V assembly.
     21 
     22 As far as possible, I'd like to keep each of these stages separate. One benefit of this is that it simplifies memory management greatly. I plan to use an arena allocator for each stage, and by making sure the only thing on the actual heap is the output of the stage, and all temporary data is stored in the arena, I can free all the memory used by a stage by simply freeing the arena.
     23 
     24 @s Some Rules
     25 
     26 Here are some rules (more like guidelines) that I plan to follow for this project; they're mostly just to keep things simple and consistent.
     27 
     28 1. PROGRAM LIKE IT'S 1999
     29 
     30 > 640 KB ought to be enough for anybody. - Bill Gates
     31 
     32 Maybe not that little, But I'm going to try to keep the project as simple as possible, 640 KB probably won't be enough, but I'll still aim for less than 10 MB of memory usage.
     33 
     34 This places a lot of constraints on the project, but I think it's a good exercise in minimalism.
     35 
     36 Some consequences of this are that I'll have to use memory-wise algorithms, be very careful about program structure, and avoid some of the bigger libraries (which will help with making this project self-hosting in the future).
     37 
     38 2. PROGRAM IN C++--
     39 
     40 I'm not a big fan of C++, but its class system helps prevent a lot of ugly bugs. To that end, I'm going to try and keep data structures out of header files, and only expose functions that operate on those data structures, to create a sort of approximation of a class. This has a few benefits:
     41 
     42 *   Quicker compilation. A change to a data structure will only require one file to be recompiled, rather than every file that includes the header.
     43 *   Less chance of bugs. If a function is the only way to interact with a data structure, then it's much harder to misuse that data structure.
     44 *   Run time type checking. I can include some sort of tag in the first field of every data structure to ensure that the correct functions are being called.
     45 
     46 3. DON'T GET FANCY
     47 
     48 My goal here isn't to write the fastest interpreter in the world, or the most complete. I just want to make something that works and can be understood by someone else.
     49 
     50 That means I'm going to avoid a lot of the tricks that are used in production interpreters, and focus more on simplicity and readability.
     51 
     52 4. DESIGN FOR DEBUGGING
     53 
     54 This code is going to be peppered with asserts and contain mechanisms to print out the state of the program at any point.
     55 
     56 This might be painful, but it'll make debugging a lot simpler and let users look under the hood.
     57 
     58 5. SMART DATA, STUPID CODE
     59 
     60 A lot of times, the right data structure can replace 50-100 lines of procedural code. I'm going to try and design data structures which make the algorithms as simple as possible.
     61 
     62 For example, instead of writing 50-100 lines of code to hold every keyword in the language, I can just use a simple hash table.
     63 
     64 @s Misc
     65 THIS IS A LITERATE PROGRAM! Go to [this link](https://reagancfischer.dev/lexer.lit) to see the file that generated this HTML.
     66 
     67 @s The Lexer
     68 
     69 A lexical analyzer reads source code and produces tokens, which are the smallest unit of meaning in a language. For example, in the C programming language, the tokens are things like keywords (if, else, while, etc.), identifiers (variable names), numbers, and punctuation (braces, semicolons, etc.).
     70 
     71 Given a string like `int main() { return 0; }`, the lexer would produce a series of tokens like `INT`, `IDENTIFIER(main)`, `LPAREN`, `RPAREN`, `LBRACE`, `RETURN`, `INTCONSTANT(0)`, `SEMICOLON`, `RBRACE`.
     72 
     73 @s Design
     74 
     75 I'll break the lexer up into a couple of modules. `token.c` will contain the token data structure and functions to create and destroy tokens. `input.c` will contain the input data structure and functions to read from the input file. `tokenizer.c` will contain the main lexer logic.
     76 
     77 @s Token Interface
     78 Tokens are the smallest unit of meaning in a language. They're used by the parser to build an abstract syntax tree (AST). We'll need a couple of things to represent a token:
     79 *   The type of token. This will be an enum, with values like `TOK_CTK_IF` or `TOK_CONST_INTEGER_U32`.
     80 *   The value of the token. Some tokens, like keywords, don't have a value. Others, like identifiers or constants, do.
     81 *   The line and column of the token. This is used for error messages.
     82 
     83 As I mentioned earlier, we're trying to implement a sort of class system in C. For that, we'll need to hide the token implementation details behind an opaque pointer. We could just have a `void` pointer, but that stops us from being able to use compile-time type checking.
     84 Instead, we'll use a forward declaration of the token type in the header file, and then define the token type in the implementation file. 
     85 @s
     86 --- Opaque Token Type
     87 typedef struct token token_t;
     88 ---
     89 @s
     90 We'll need a couple of functions to create and destroy tokens.
     91 --- Token Creation and Destruction
     92 token_t *token_data_create(c_token_types kind, int lin, int col,
     93                                   int len);
     94 
     95 token_t *token_create(c_token_types kind, int lin, int col, int len);
     96 
     97 token_t *token_create_int(c_token_types kind, int lin, int col,
     98                                  int64_t i, int len);
     99 
    100 token_t *token_create_float(c_token_types kind, int lin, int col,
    101                                    double f, int len);
    102 
    103 token_t *token_create_char(c_token_types kind, int lin, int col, char c,
    104                                   int len);
    105 
    106 token_t *token_create_string(c_token_types kind, int lin, int col,
    107                                     const char *s, int len);
    108 
    109 void token_destroy(token_t *token);
    110 ---
    111 @s
    112 We'll also need some functions to access the token data.
    113 --- Token Interface
    114 c_token_types token_type(token_t *token);
    115 
    116 int64_t token_int(token_t *token);
    117 
    118 double token_float(token_t *token);
    119 
    120 const char *token_string(token_t *token);
    121 
    122 char token_char(token_t *token);
    123 
    124 int token_line(token_t *token);
    125 
    126 int token_column(token_t *token);
    127 ---
    128 @s
    129 We'll need some types to represent the different kinds of tokens.
    130 --- Token Types
    131 typedef enum {
    132   // Control Keywords
    133   TOK_CTK_IF,
    134   TOK_CTK_ELSE,
    135   TOK_CTK_SWITCH,
    136   TOK_CTK_CASE,
    137   TOK_CTK_DEFAULT,
    138   TOK_CTK_WHILE,
    139   TOK_CTK_DO,
    140   TOK_CTK_FOR,
    141   TOK_CTK_CONTINUE,
    142   TOK_CTK_BREAK,
    143   TOK_CTK_RETURN,
    144   TOK_CTK_GOTO,
    145 
    146   // Type Keywords
    147   TOK_TK_VOID,
    148   TOK_TK_CHAR,
    149   TOK_TK_SHORT,
    150   TOK_TK_INT,
    151   TOK_TK_LONG,
    152   TOK_TK_FLOAT,
    153   TOK_TK_DOUBLE,
    154   TOK_TK_SIGNED,
    155   TOK_TK_UNSIGNED,
    156   TOK_TK_STRUCT,
    157   TOK_TK_UNION,
    158   TOK_TK_ENUM,
    159   TOK_TK_TYPEDEF,
    160 
    161   // Storage Class/Specifier Keywords
    162   TOK_SCSK_AUTO,
    163   TOK_SCSK_REGISTER,
    164   TOK_SCSK_STATIC,
    165   TOK_SCSK_EXTERN,
    166   TOK_SCSK_CONST,
    167   TOK_SCSK_VOLATILE,
    168 
    169   // Misc Keywords
    170   TOK_MK_SIZEOF,
    171 
    172   // Operators
    173   TOK_OP_ADD,            // +
    174   TOK_OP_SUB,            // -
    175   TOK_OP_MUL,            // *
    176   TOK_OP_DIV,            // /
    177   TOK_OP_MOD,            // %
    178   TOK_OP_BIT_AND,        // &
    179   TOK_OP_BIT_OR,         // |
    180   TOK_OP_BIT_XOR,        // ^
    181   TOK_OP_BIT_NOT,        // ~
    182   TOK_OP_LSHIFT,         // <<
    183   TOK_OP_RSHIFT,         // >>
    184   TOK_OP_NOT,            // !
    185   TOK_OP_ASSIGN,         // =
    186   TOK_OP_LT,             // <
    187   TOK_OP_GT,             // >
    188   TOK_OP_INC,            // ++
    189   TOK_OP_DEC,            // --
    190   TOK_OP_EQ,             // ==
    191   TOK_OP_NE,             // !=
    192   TOK_OP_LE,             // <=
    193   TOK_OP_GE,             // >=
    194   TOK_OP_AND,            // &&
    195   TOK_OP_OR,             // ||
    196   TOK_OP_MEMBER_POINTER, // ->
    197   TOK_OP_MEMBER,         // .
    198   TOK_OP_COND_DECISION,  // :
    199   TOK_OP_COND,           // ?
    200   TOK_OP_ASSIGN_ADD,     // +=
    201   TOK_OP_ASSIGN_SUB,     // -=
    202   TOK_OP_ASSIGN_MUL,     // *=
    203   TOK_OP_ASSIGN_DIV,     // /=
    204   TOK_OP_ASSIGN_MOD,     // %=
    205   TOK_OP_ASSIGN_BITAND,  // &=
    206   TOK_OP_ASSIGN_BITOR,   // |=
    207   TOK_OP_ASSIGN_BITXOR,  // ^=
    208   TOK_OP_ASSIGN_LSHIFT,  // <<=
    209   TOK_OP_ASSIGN_RSHIFT,  // >>=
    210 
    211   // Separators
    212   TOK_SEP_LEFT_PAREN,    // (
    213   TOK_SEP_RIGHT_PAREN,   // )
    214   TOK_SEP_LEFT_BRACKET,  // [
    215   TOK_SEP_RIGHT_BRACKET, // ]
    216   TOK_SEP_LEFT_BRACE,    // {
    217   TOK_SEP_RIGHT_BRACE,   // }
    218   TOK_SEP_COMMA,         // ,
    219   TOK_SEP_SEMICOLON,     // ;
    220   TOK_SEP_DOT,           // .
    221   TOK_SEP_ELLIPSIS,      // ...
    222   TOK_SEP_HASH,          // #
    223 
    224   // Identifiers
    225   TOK_ID,
    226 
    227   // Constants
    228   TOK_CONST_INTEGER_U32,  // u
    229   TOK_CONST_INTEGER_U64,  // ul
    230   TOK_CONST_INTEGER_S32,  // (no suffix)
    231   TOK_CONST_INTEGER_S64,  // l
    232   TOK_CONST_FLOAT_32,     // f
    233   TOK_CONST_FLOAT_64,     // (no suffix)
    234   TOK_CONST_CHAR,         // 'c'
    235   TOK_CONST_STRING_ASCII, // "string" (width of 8 bits)
    236 
    237   // Special
    238   TOK_SPECIAL_EOF,
    239   TOK_SPECIAL_ERROR,
    240 } c_token_types;
    241 ---
    242 @s
    243 We bring this all together in `token.h`.
    244 --- token.h
    245 #ifndef TOKEN_H
    246 #define TOKEN_H
    247 #include <stdint.h> // We use this for int64_t
    248 @{Token Types}
    249 @{Opaque Token Type}
    250 @{Token Creation and Destruction}
    251 @{Token Interface}
    252 #endif
    253 ---
    254 
    255 @s Token Implementation
    256 Now that we have the interface, we can implement the token data structure. We'll need a couple of things:
    257 *   The token type.
    258 *   A way to store extra data. 
    259 *   Implementations of the functions we defined in the interface.
    260 
    261 @s
    262 One problem is we haven't defined a way to verify that the token we're getting isn't corrupt. We'll use a tag for that.
    263 --- Token Data Structure
    264 #define TOK_MAGIC_1 0x544F4B454E544F4Bul // "TOKENTOK"
    265 #define TOK_MAGIC_2 0x544F4B544F4B454Eul // "TOKTOKEN"
    266 struct token {
    267   long magic;
    268   int line;
    269   int column;
    270   short kind;
    271   long opt_data[0];
    272 };
    273 
    274 typedef struct token token_t;
    275 
    276 struct token_data {
    277   union {
    278     long long i;
    279     double f;
    280     const char *s;
    281     char c;
    282   } data;
    283 };
    284 
    285 typedef struct token_data token_data_t;
    286 ---
    287 You might notice that a zero-length array is used in the token data structure. This is a GCC extension that allows us to allocate memory for the token data structure and the token data in one allocation. This is a bit of a hack, but it's a common pattern in C code.
    288 @s
    289 To access this extra data, we define a macro
    290 --- Token Data Access
    291 #define token_data(token) ((struct token_data *)((token)->opt_data))
    292 ---
    293 @s
    294 Now we can implement the functions we defined in the interface.
    295 --- Token Creation and Destruction (Except `token_create_string`)
    296 static token_t *token_data_create(c_token_types kind, int lin, int col, int len) {
    297   token_t *token = malloc(sizeof(token_t) + sizeof(struct token_data));
    298   if (token == NULL) {
    299     fputs("Out of memory\n", stderr);
    300     exit(1);
    301   }
    302   token->magic = TOK_MAGIC_1;
    303   token->line = lin;
    304   token->column = col;
    305   column += len;
    306   token->kind = kind;
    307   return token;
    308 }
    309 
    310 static token_t *token_create(c_token_types kind, int lin, int col, int len) {
    311   token_t *token = malloc(sizeof(token_t));
    312   if (token == NULL) {
    313     fputs("Out of memory\n", stderr);
    314     exit(1);
    315   }
    316   token->magic = TOK_MAGIC_2;
    317   token->line = lin;
    318   token->column = col;
    319   column += len;
    320   token->kind = kind;
    321   return token;
    322 }
    323 
    324 static token_t *token_create_int(c_token_types kind, int lin, int col, int64_t i, int len) {
    325   token_t *token = token_data_create(kind, lin, col, len);
    326   token_data(token)->data.i = i;
    327   return token;
    328 }
    329 
    330 static token_t *token_create_float(c_token_types kind, int lin, int col, double f, int len) {
    331   token_t *token = token_data_create(kind, lin, col, len);
    332   token_data(token)->data.f = f;
    333   return token;
    334 }
    335 
    336 static token_t *token_create_char(c_token_types kind, int lin, int col, char c, int len) {
    337   token_t *token = token_data_create(kind, lin, col, len);
    338   token_data(token)->data.c = c;
    339   return token;
    340 }
    341 
    342 void token_destroy(token_t *token) {
    343   if (token->magic == TOK_MAGIC_1) {
    344     free(token);
    345   } else if (token->magic == TOK_MAGIC_2) {
    346     free(token);
    347     if (kind == TOK_CONST_STRING_ASCII) {
    348       free(token_data(token)->data.s);
    349     }
    350   } else {
    351         fputs("Corrupt token\n", stderr);
    352         exit(1);
    353     }
    354 }
    355 ---
    356 @s
    357 `token_create_string` can be implemented either the easy way or the right way. Let's try the easy way.
    358 --- token_create_string
    359 token_t *token_create_string(c_token_types kind, int lin, int col,
    360                                     const char *s, int len) {
    361   token_t *token = token_create(kind, lin, col, len);
    362   token_data(token)->data.s = strdup(s);
    363   return token;
    364 }
    365 ---
    366 @s
    367 There's an issue with this approach. `token_create_string` will be called for every identifier and every string in a program. Imagine a large program, say a shell, with a bunch of user input and output. That program will likely have 20-40 calls to `fprintf`, `fscanf`, `strchr`, `strtok`, each. We create a new string for each of those calls. That's a lot of duplicates, and can quickly add up to a lot of memory usage.
    368 To fix this, we use a hash table to store the strings. We'll define a hash table in `hash_table.h` and `hash_table.c`.
    369 @s Hash Table
    370 A hash table is a data structure that maps keys to values. It's commonly used to store information, such as variables and functions in a symbol table. To implement a generic hash table, we'll need several things:
    371 * A function to hash the keys.
    372 * A function to compare keys.
    373 * An opaque type for the hash table.
    374 * A function to destroy deleted keys and values.
    375 
    376 Let's start with the interface.
    377 
    378 @s
    379 --- Hash Table Opaque Types
    380 typedef struct hash_table hash_table_t;
    381 typedef int (*hash_table_cmp_fn)(void *key1, void *key2);
    382 typedef unsigned int (*hash_table_hash_fn)(void *key);
    383 typedef void (*hash_table_dtor)(void *value, int is_key);
    384 ---
    385 
    386 @s
    387 --- Hash Table Creation and Destruction
    388 hash_table_t *hash_table_create(int size, hash_table_cmp_fn cmp, hash_table_hash_fn hash, hash_table_dtor dtor);
    389 void hash_table_destroy(hash_table_t *table);
    390 ---
    391 
    392 @s
    393 --- Hash Table Access
    394 void *hash_table_get(hash_table_t *table, void *key);
    395 void hash_table_put(hash_table_t *table, void *key, void *value);
    396 void hash_table_remove(hash_table_t *table, void *key);
    397 ---
    398 
    399 @s
    400 --- hash_table.h
    401 #ifndef HASH_TABLE_H
    402 #define HASH_TABLE_H
    403 @{Hash Table Opaque Types}
    404 @{Hash Table Creation and Destruction}
    405 @{Hash Table Access}
    406 #endif
    407 ---
    408 
    409 @s
    410 Let's implement the hash table now.
    411 
    412 --- hash_table.c
    413 #include <stdlib.h>
    414 #include <string.h>
    415 #include <stdio.h>
    416 #include "hash_table.h"
    417 
    418 @{Hash Table Data Structure}
    419 @{Hash Table Entry Data Structure}
    420 
    421 hash_table_t *hash_table_create(int size, hash_table_cmp_fn cmp, hash_table_hash_fn hash, hash_table_dtor dtor) {
    422   @{Allocate and Initialize Hash Table}
    423   return table;
    424 }
    425 
    426 void hash_table_destroy(hash_table_t *table) {
    427   @{Destroy Entries}
    428   free(table->entries);
    429   free(table);
    430 }
    431 
    432 void *hash_table_get(hash_table_t *table, void *key) {
    433   @{Get Entry By Hash}
    434   @{Loop Through Entries and Return Value if Match}
    435   return NULL;
    436 }
    437 
    438 void hash_table_put(hash_table_t *table, void *key, void *value) {
    439   @{Get Entry By Hash}
    440   @{Loop Through Entries and Replace Value if Key Matches}
    441   @{Allocate New Entry if No Match}
    442 }
    443 
    444 void hash_table_remove(hash_table_t *table, void *key) {
    445   @{Get Entry By Hash}
    446   @{Loop Through Entries and Remove Entry if Key Matches}
    447 }
    448 
    449 #ifdef TEST_HASH_TABLE
    450 #include <assert.h>
    451 #include <stdio.h>
    452 #include <string.h>
    453 
    454 int string_cmp(void *key1, void *key2) { 
    455   return strcmp((char *)key1, (char *)key2); 
    456 }
    457 
    458 unsigned long string_hash(void *key) {
    459   unsigned long hash = 5381;
    460   char *str = (char *)key;
    461   while (*str != '\0') {
    462     hash = ((hash << 5) + hash) + *str;
    463     str++;
    464   }
    465   return hash;
    466 }
    467 
    468 int main() {
    469   hash_table_t *table = hash_table_create(16, string_cmp, string_hash, NULL);
    470   hash_table_put(table, "foo", "bar");
    471   hash_table_put(table, "foo", "baz");
    472   assert(strcmp((char *)hash_table_get(table, "foo"), "baz") == 0);
    473   hash_table_remove(table, "foo");
    474   assert(hash_table_get(table, "foo") == NULL);
    475   hash_table_destroy(table);
    476   return 0;
    477 }
    478 #endif
    479 ---
    480 
    481 @s
    482 For the hash table data structure, we'll define a pointer to an array of entries, the size of the array, and the hash/comparison functions.
    483 
    484 --- Hash Table Data Structure
    485 struct hash_table {
    486   struct hash_table_entry **entries;
    487   int size;
    488   hash_table_cmp_fn cmp;
    489   hash_table_hash_fn hash;
    490   hash_table_dtor dtor;
    491 };
    492 ---
    493 
    494 @s
    495 Entries in the hash table will have a key, a value, and a link to the next entry in the chain.
    496 
    497 --- Hash Table Entry Data Structure
    498 struct hash_table_entry {
    499   void *key;
    500   void *value;
    501   struct hash_table_entry *next;
    502 };
    503 ---
    504 
    505 @s
    506 Allocating a hash table involves allocating memory for the hash table itself and the entries, zeroing out the entries, and setting the hash and comparison functions.
    507 
    508 --- Allocate and Initialize Hash Table
    509 hash_table_t *table = malloc(sizeof(struct hash_table));
    510 if (table == NULL) {
    511   fputs("Out of memory, could not allocate hash table\n", stderr);
    512   exit(1);
    513 }
    514 table->entries = calloc(size, sizeof(struct hash_table_entry *));
    515 if (table->entries == NULL) {
    516   fputs("Out of memory, could not allocate hash table entries\n", stderr);
    517   exit(1);
    518 }
    519 table->size = size;
    520 table->cmp = cmp;
    521 table->hash = hash;
    522 table->dtor = dtor;
    523 ---
    524 
    525 @s
    526 To destroy a hash table, we loop through the entries, freeing the keys and values, and then free the entries and the table itself.
    527 
    528 --- Destroy Entries
    529 for (int i = 0; i < table->size; i++) {
    530   struct hash_table_entry *entry = table->entries[i];
    531   while (entry != NULL) {
    532     struct hash_table_entry *next = entry->next;
    533     if (table->dtor != NULL) {
    534       table->dtor(entry->key, 1);
    535       table->dtor(entry->value, 0);
    536     }
    537     free(entry);
    538     entry = next;
    539   }
    540 }
    541 ---
    542 
    543 @s
    544 To get an entry from the hash table, we hash the key, loop through the entries, and return the value if we find a match.
    545 
    546 --- Get Entry By Hash
    547 unsigned int hash = table->hash(key) % table->size;
    548 struct hash_table_entry *entry = table->entries[hash];
    549 ---
    550 
    551 @s
    552 To put an entry in the hash table, we hash the key, loop through the entries, and replace the value if we find a match.
    553 
    554 --- Loop Through Entries and Replace Value if Key Matches
    555 while (entry != NULL) {
    556   if (table->cmp(entry->key, key) == 0) {
    557     entry->value = value;
    558     return;
    559   }
    560   entry = entry->next;
    561 }
    562 ---
    563 
    564 @s
    565 If we don't find a match, we allocate a new entry, set the key and value, and insert it at the head of the linked list.
    566 This exploits a property in computer science called locality of reference. The gist of that is that when you write to a piece of memory, you're likely to read from it again soon. By putting the new entry at the head of the linked list, we increase the chances that we'll find it quickly next time.
    567 
    568 --- Allocate New Entry if No Match
    569 struct hash_table_entry *new_entry = malloc(sizeof(struct hash_table_entry));
    570 if (new_entry == NULL) {
    571   fputs("Out of memory, could not allocate hash table entry\n", stderr);
    572   exit(1);
    573 }
    574 new_entry->key = key;
    575 new_entry->value = value;
    576 new_entry->next = table->entries[hash];
    577 table->entries[hash] = new_entry;
    578 ---
    579 
    580 @s
    581 To remove an entry from the hash table, we hash the key, loop through the entries, and remove the entry if we find a match.
    582 
    583 --- Loop Through Entries and Remove Entry if Key Matches
    584 struct hash_table_entry *prev = NULL;
    585 while (entry != NULL) {
    586     if (table->cmp(entry->key, key) == 0) {
    587         if (prev == NULL) {
    588             table->entries[hash] = entry->next;
    589         } else {
    590             prev->next = entry->next;
    591         }
    592         if (table->dtor != NULL) {
    593             table->dtor(entry->key, 1);
    594             table->dtor(entry->value, 0);
    595         }
    596         free(entry);
    597         return;
    598     }
    599     prev = entry;
    600     entry = entry->next;
    601 }
    602 ---
    603 
    604 @s
    605 To find a value associated with a given key in the hash table, we hash the string, loop through the entries, and return the value if a match is found.
    606 
    607 --- Loop Through Entries and Return Value if Match
    608 while (entry != NULL) {
    609   if (table->cmp(entry->key, key) == 0) {
    610     return entry->value;
    611   }
    612   entry = entry->next;
    613 }
    614 ---