[논문 정리] Steelix: Program-State Based Binary Fuzzing

Abstract

coverage-based fuzzing
- 취약점 찾는데 가장 효과적인 기술 중 하나
- BUT difficulty in exercising the paths that are protected by magic bytes(file signature) comparisons
- magic bytes comparisons을 break하기 위해 heavy-weight 프로그램 사용이 제안되었다.
In this paper
- program-state based binary fuzzing (named Steelix)
- 실행 속도를 낮춤으로써 퍼저의 침투력을 개선한다.
- 특히, 퍼저에게 coverage information과 comparison progress information을 제공하기 위해서 light-weight static analysis와 binary instrumentation을 사용했다.
  - 그런 프로그램 상태 정보는 퍼저에게 test input에서의 magic bytes의 위치와 효과적으로 magic bytes를 맞추기(match) 위해 mutation을 어떻게 실행해야 하는가를 알려준다.
- Steelix를 개발하고 3개의 dataset으로 평가했다.
  - AVA-M dataset, DARPA CGC sample binaries, five real-life programs
그 결과
- Steelix는 better code coverage, bug detection capability than state-of-the-art fuzzers

Introduction

퍼징 분류
- based on how the structural knowledge of the PUT is utilized
  - white-box
  - black-box
  - grey-box
- based on how the test inputs to the PUT are generated
  - mutation-based
  - generation-based
In this paper, we focus on grey-box mutation-based fuzzing
- 가장 성공적인 기술 중 하나 coverage-based fuzzing
  - 어떤 test input이 퍼징을 위해 남겨져야 하는지 판단하기 위해 각 실행된 test input에 대해 coverage information을 추출하려고 light-wieght instrumentation을 사용한다.
  - incremental 방식으로 PUT의 execution path를 explore한다.
  - AFL이 state-of-the-art coverage-based 퍼저이다.
- 하지만, coverage-based fuzzing은 magic bytes comparisons에 의해 보호되는 path를 exercise(학습?)하기 위한 제한된 침투력을 갖고 있다. (coverage-based fuzzer에 단점이 존재한다.)
- 그래서 program-state based fuzzer (Steelix) 를 제안한다.
  - binary level에서 동작
  - execution speed를 적당히 낮춤으로써 magic bytes comparison에 의해 보호되는 paths를 exercise할 수 있다.
  - key idea of Steelix : coverage 정보를 모을 뿐만아니라 comparison progress 정보도 모은다.
  - 이런 접근 방식을 AFL을 확장하여 개발했고, 여러 dataset으로 평가했다.

Contributions

execution speed를 적당히 낮춤으로써 magic bytes comparison에 의해 보호되는 path를 학습하기 위해 program-state based binary fuzzing을 제안한다.
adaptive mutation을 guide하기 위한 dynamic feedbacks로써 coverage와 comparison progress information을 모두 수집하기 위해 light-weight static analysis and binary instrumentation을 제안한다.
Steelix를 개발하고 various benchmark and real-life program으로 평가했는데, 유의미한 결과를 보여줬다.
CVE 1개와 9개의 이전에는 알려지지 않았던 버그를 (in some widely-used real-life program)발견했다.

Motivation

coverage-based fuzzer AFL은 어떤 mutated test input이 유지되어야 하는가 결정하기 위해 coverage information을 사용한다.
하지만 test input에서 magic bytes의 위치를 알 수 없고, 그러므로 test input을 효과적으로 mutate할 수 없다는 이유로 AFL은 crash trigger하는데 여전히 어려움을 겪고 있다.

binary level에서 동작함으로써 Steelix는 open & close source 프로그램에 적용될 수 있다.
Steelix의 3 main components : static analysis, binary instrumentation, fuzzing loop
- static analysis
  - 프로그램 바이너리를 input으로 disassemble한다.
  - 몇 규칙으로 uninteresting comparisons를 filter out한다.
  - 그리고 interesting comparisons의 정보와 basic blocks의 정보를 추출한다. 이 정보들은 binary instrumentation이 "where to instrument", "what to instrument" 에 대해 알려준다.
  - [**] basic blocks는 퍼저를 위해 coverage information을 수집하는데 사용된다.
  - 이 statistically extracted information과 program binary는 함께 binary instrumentation으로 pass된다.
- binary instrumentation
  - 2 main concerns(우려)
    - comparison progress를 compact한 방법으로 mark해야한다. comparison progress가 efficiency를 위해 제한된 size로 디자인된 shared memory에 recored되는 것처럼
    - 프로그램이 comparison operands의 실제값을 갖고 퍼징동안 comparison progress 정보를 생성하기위해 프로그램을 instrument했다.
- fuzzing loop
  - instrumented binary를 갖고 퍼징 시작
  - instrumented program을 실행한 후 퍼저는 coverage와 comparison progress feedback을 받고, 그 feedback에 기반하여 magic bytes의 location information을 얻어낼 것이다.
  - 그 coverage, comparison progress와 location information은 adaptive(적응할 수 있는) mutation을 guide하는데 사용될 것이다.

Conclusion

In this paper, we proposed program-state based binary fuzzing approach, named Steelix.
Steelix가 갖는 프로그램 상태 정보
- coverage information
- comparison progress infomation
- 위 두 정보들은 light-weight static analysis와 binary instrumentation에 의해 모였다.
comparison progress의 안내로(With the guidance of) Steelix는 magic bytes comparisons를 더 효과적으로 penetrate할 수 있다. (execution overhead는 낮게 유지한 채로)
Steelix를 개발하고 평가한 결과, bug detection capability와 code coverage를 개선할 수 있었다.

간단하게 정리하면,,,

(기존) coverage-based fuzzer(ex. AFL)

test input에서 magic bytes의 위치를 알 수 없다.
그러므로 test input을 효과적으로 mutate할 수 없다.

결론적으로, AFL은 crash를 trigger하는데 어려움이 있다.

▼

(제안) program-state based binary fuzzing(Steelix)

실행 속도를 낮춰 퍼저 침투력 개선
light-weight static analysis와 binary instrumentation을 사용해 퍼저에게 coverage information과 comparison progress information 제공
- 이런 프로그램 상태 정보는 퍼저에게 test input에서의 magic bytes의 위치와 효과적으로 magic bytes를 맞추기(match) 위해 mutation을 어떻게 실행해야 하는가를 알려준다.

즉, 제안하는 두 방법으로 기존의 어려움을 해결할 수 있다.

'정보보안 > fuzzing' 카테고리의 다른 글

[실험] AFL/AFL++ 실험설계 및 수행 (0)	2022.01.21
[최종 정리] AFL의 특징과 AFL++의 개선방안 (0)	2022.01.13
[논문 정리] AFL++: Combining Incremental Steps of Fuzzing Research (0)	2022.01.10
AFL 설치 및 실습 (0)	2022.01.08
AFL 이해를 위한 기본 개념 (0)	2022.01.07

Steadily

[논문 정리] Steelix: Program-State Based Binary Fuzzing

Abstract

Introduction

Contributions

Motivation

Conclusion

간단하게 정리하면,,,

(기존) coverage-based fuzzer(ex. AFL)

(제안) program-state based binary fuzzing(Steelix)

'정보보안 > fuzzing' 카테고리의 다른 글

댓글

티스토리툴바

[논문 정리] Steelix: Program-State Based Binary Fuzzing

Abstract

Introduction

Contributions

Motivation

Conclusion

간단하게 정리하면,,,

(기존) coverage-based fuzzer(ex. AFL)

(제안) program-state based binary fuzzing(Steelix)

'정보보안 > fuzzing' 카테고리의 다른 글

관련글

댓글

티스토리툴바