-This document describes the core design concept of a secure memory copy scheme in DragonOS based on the Exception Table mechanism. This solution addresses the issue of safely accessing user-space memory in system call contexts, preventing kernel panics caused by accessing invalid user addresses.
+This document describes the core design of a secure memory copy scheme in DragonOS based on the Exception Table mechanism. This solution addresses the issue of safely accessing user-space memory in system call contexts, preventing kernel panics caused by accessing invalid user addresses.

 ## Design Background and Motivation

 ### Problem Definition

-During system call processing, the kernel needs to access pointers passed from user space (such as path strings, parameter structures, etc.). These accesses may fail due to:
+During system call processing, the kernel needs to access pointers passed from user space (such as path strings, parameter structures, etc.). These accesses may fail:

 1. **Unmapped address**: The user-provided address has no corresponding VMA (Virtual Memory Area)
 2. **Insufficient permissions**: The page exists but lacks required permissions
@@ -38,22 +38,22 @@ During system call processing, the kernel needs to access pointers passed from u
 ### Limitations of Traditional Solutions

 **TOCTTOU issues with pre-checking solutions:**
-- Addresses may be valid during check but modified by other threads when used
-- Race condition windows exist
+- An address may be valid during check but modified by other threads when used
+- Race condition window exists

-**Dilemma of direct access:**
+**Challenges with direct access:**
 - Cannot distinguish between "normal page fault" and "illegal access"
-- Page fault handlers cannot determine whether it's a kernel bug or user error
+- Page fault handler cannot determine whether it's a kernel bug or user error

-## Principles of Exception Table Mechanism
+## Exception Table Mechanism Principles

 ### Core Idea

-The Exception Table mechanism achieves secure user-space access through **compile-time marking + runtime lookup**:
+The exception table mechanism achieves secure user-space access through **compile-time marking + runtime lookup**:

-1. **Compile-time**: Generate exception table entries at instructions that may trigger page faults
+1. **Compile-time**: Generate exception table entries for instructions that may trigger page faults
 2. **Runtime**: When a page fault occurs, search the exception table and jump to fix-up code
-3. **Zero overhead**: No performance loss on normal paths
+3. **Zero overhead**: No performance loss on the normal path

 ### Architectural Diagram

@@ -76,7 +76,7 @@ The Exception Table mechanism achieves secure user-space access through **compil
 │ │ 4. Modify instruction pointer (RIP)  │ │
 │ │        ↓                             │ │
 │ │ 5. Execute fix-up code               │ │
-│ │    └─ Set error code (-1)            │ │
+│ │    └─ Return bytes left uncopied     │ │
 │ │        ↓                             │ │
 │ │ 6. Return EFAULT to user             │ │
 │ └──────────────────────────────────────┘ │
@@ -85,15 +85,15 @@ The Exception Table mechanism achieves secure user-space access through **compil

 ### Core Data Structures

-**Exception Table Entry (8-byte aligned):**
+**Exception table entry (8-byte aligned):**
 ```
 ┌─────────────────────────────┬─────────────────────────────┐
 │ Instruction relative offset │ Fix-up code relative offset │
 │          (4 bytes)          │          (4 bytes)          │
 └─────────────────────────────┴─────────────────────────────┘
 ```

-**Design Highlights:**
+**Design highlights:**
 - Uses relative offsets to support ASLR (Address Space Layout Randomization)
 - 8-byte alignment improves cache performance
 - Stored in read-only segments to prevent tampering
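
The entry layout and relative-offset lookup described above can be sketched in user-space Rust. This is a minimal illustration under stated assumptions, not DragonOS's actual code: the names `ExceptionTableEntry`, `insn_offset`, `fixup_offset`, and `search_exception_table` are hypothetical, offsets are assumed to be measured from a single image base, and the table is assumed to be sorted by instruction offset so it can be binary-searched.

```rust
/// Hypothetical entry layout: two 32-bit relative offsets, 8 bytes in total.
#[repr(C, align(8))]
#[derive(Clone, Copy, Debug)]
pub struct ExceptionTableEntry {
    /// Offset of the instruction that may fault (assumed relative to `base`).
    pub insn_offset: i32,
    /// Offset of the fix-up code for that instruction (assumed relative to `base`).
    pub fixup_offset: i32,
}

impl ExceptionTableEntry {
    /// Resolve the faulting instruction's absolute address.
    pub fn insn_addr(&self, base: u64) -> u64 {
        // Sign-extend the offset, then add with wrapping two's-complement arithmetic.
        base.wrapping_add(self.insn_offset as i64 as u64)
    }

    /// Resolve the fix-up code's absolute address.
    pub fn fixup_addr(&self, base: u64) -> u64 {
        base.wrapping_add(self.fixup_offset as i64 as u64)
    }
}

/// Look up the fix-up address for `fault_ip`.
/// Assumes `table` is sorted by `insn_offset` in ascending order.
pub fn search_exception_table(
    table: &[ExceptionTableEntry],
    base: u64,
    fault_ip: u64,
) -> Option<u64> {
    table
        .binary_search_by_key(&fault_ip, |entry| entry.insn_addr(base))
        .ok()
        .map(|index| table[index].fixup_addr(base))
}
```

Because only offsets are stored, the same table resolves correctly whatever base the image is loaded at, which is the property the ASLR bullet refers to.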
@@ -114,14 +114,14 @@ The Exception Table mechanism achieves secure user-space access through **compil
 │
 Yes
 ↓
-Set RIP to fix-up code ──→ Return error code
+Set RIP to fix-up code ──→ Return number of bytes left uncopied
 ```
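
The final step of this flow, redirecting the saved instruction pointer when the faulting address has a table entry, might look like the following sketch. `TrapFrame`, `FaultAction`, and `handle_kernel_fault` are hypothetical names, and the table is simplified to sorted pairs of absolute addresses; what happens when no entry matches is left to the caller, since this excerpt does not show it.

```rust
/// Hypothetical saved register state; only the instruction pointer matters here.
pub struct TrapFrame {
    pub rip: u64,
}

/// Outcome of consulting the exception table for a fault taken in kernel mode.
#[derive(Debug, PartialEq, Eq)]
pub enum FaultAction {
    /// An entry matched: execution resumes at the fix-up address.
    ResumeAtFixup(u64),
    /// No entry matched: the caller falls back to its ordinary fault handling.
    NoFixup,
}

/// `table` holds (faulting instruction address, fix-up address) pairs,
/// assumed sorted by the first element.
pub fn handle_kernel_fault(frame: &mut TrapFrame, table: &[(u64, u64)]) -> FaultAction {
    match table.binary_search_by_key(&frame.rip, |&(insn, _)| insn) {
        Ok(index) => {
            let fixup = table[index].1;
            // Rewriting the saved RIP means the return from the exception
            // lands in the fix-up code instead of retrying the faulting access.
            frame.rip = fixup;
            FaultAction::ResumeAtFixup(fixup)
        }
        Err(_) => FaultAction::NoFixup,
    }
}
```

The frame is left untouched when no entry matches, so a fault at an unregistered instruction is never silently redirected.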

 ## Typical Execution Scenarios

-### Scenario: System Call with Invalid Address
+### Scenario: System call with invalid address

-Taking the `open()` system call as an example, demonstrating the operation of the exception table:
+Taking the `open()` system call as an example, demonstrating the exception table's operation:

 ```
 User program: open(0x1000, O_RDONLY) // 0x1000 is unmapped
@@ -150,7 +150,7 @@ Taking the `open()` system call as an example, demonstrating the operation of th
                        ↓
 ┌──────────────────────────────────────────────┐
 │ 4. Set instruction pointer to fix-up code    │
-│    └─ Set return value to error code         │
+│    └─ Set return value to bytes left uncopied│
 └──────────────────────────────────────────────┘
                        │
                        ↓
@@ -162,42 +162,42 @@ Taking the `open()` system call as an example, demonstrating the operation of th
 User program: fd = -1, errno = EFAULT
 ```

-**Key Points:**
+**Key points:**
 - No need for pre-checking address validity
-- Page faults are automatically converted to error codes
-- Kernel won't panic, user programs receive clear error information
+- Page faults are automatically converted to returning the number of remaining uncopied bytes
+- The kernel won't panic, and user programs receive clear error information
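
The "remaining uncopied bytes" convention in the added lines can be modelled in plain Rust. The sketch below is a user-space simulation only: `FakeUserSpace` stands in for a mapped user region, a failed read stands in for a page fault that reached the fix-up code, and `EFAULT` uses the Linux value 14 as an assumption. None of these names come from DragonOS.

```rust
/// Assumed errno value (Linux uses 14 for EFAULT).
pub const EFAULT: i32 = 14;

/// Simulated user address space: one mapped region starting at `base`.
pub struct FakeUserSpace {
    pub base: usize,
    pub bytes: Vec<u8>,
}

impl FakeUserSpace {
    /// Returns `None` where a real access would page-fault.
    fn read(&self, addr: usize) -> Option<u8> {
        addr.checked_sub(self.base)
            .and_then(|offset| self.bytes.get(offset).copied())
    }
}

/// Copies `dst.len()` bytes from user address `src`.
/// Returns the number of bytes NOT copied; 0 means full success.
pub fn copy_from_user(dst: &mut [u8], user: &FakeUserSpace, src: usize) -> usize {
    for i in 0..dst.len() {
        match src.checked_add(i).and_then(|addr| user.read(addr)) {
            Some(byte) => dst[i] = byte,
            // Simulated fix-up path: report how much is left.
            None => return dst.len() - i,
        }
    }
    0
}

/// Syscall-side wrapper: any non-zero remainder becomes EFAULT.
pub fn read_user_buffer(user: &FakeUserSpace, src: usize, len: usize) -> Result<Vec<u8>, i32> {
    let mut buffer = vec![0u8; len];
    if copy_from_user(&mut buffer, user, src) != 0 {
        return Err(EFAULT);
    }
    Ok(buffer)
}
```

Returning a byte count rather than a bare error lets the caller decide whether a partial copy is acceptable; in this sketch the wrapper treats any shortfall as `EFAULT`, matching the `open()` scenario.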

 ## Usage Scenario Analysis

-### ✅ Suitable Scenarios for Exception Table Protection
+### ✅ Scenarios Suitable for Exception Table Protection

-#### 1. Small Data System Call Parameters
+#### 1. Small system call parameter data

 **Characteristics:**
 - Small data volume (typically < 4KB)
-- Single copy operation
+- One-time copy
 - Unknown data length (e.g., strings)

-**Typical Applications:**
+**Typical applications:**
 - Path strings: `open()`, `stat()`, `execve()`, etc.
 - Fixed-size structures: `sigaction`, `timespec`, `stat`, etc.
 - Small arrays: `iovec[]`, `pollfd[]`, etc.

 **Advantages:**
 - **Avoids TOCTTOU races**: No pre-checking needed
 - **High robustness**: User errors won't cause kernel panics
-- **Acceptable performance**: Small data volume, minimal impact even with extra copies
+- **Acceptable performance**: Small data volume; even multiple copies have minimal impact
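
For the path-string case, where the length is unknown in advance, a single-pass copy avoids any separate "measure, then copy" step. The following is a hedged user-space sketch: `copy_path_from_user` is a hypothetical name, the `read_user_byte` closure stands in for a fault-protected single-byte read, and the errno values are the Linux ones (an assumption, as is the use of `ENAMETOOLONG` for an unterminated string).

```rust
/// Assumed errno values (Linux numbering).
pub const EFAULT: i32 = 14;
pub const ENAMETOOLONG: i32 = 36;

/// Copies a NUL-terminated string from user address `src`, reading each byte once.
/// `read_user_byte` returns `None` where a real access would fault.
pub fn copy_path_from_user(
    read_user_byte: impl Fn(usize) -> Option<u8>,
    src: usize,
    max_len: usize,
) -> Result<Vec<u8>, i32> {
    let mut path = Vec::new();
    for i in 0..max_len {
        // An address computation that overflows is treated like a fault.
        let addr = src.checked_add(i).ok_or(EFAULT)?;
        match read_user_byte(addr) {
            None => return Err(EFAULT),      // simulated fault reached the fix-up path
            Some(0) => return Ok(path),      // terminator found; it is not stored
            Some(byte) => path.push(byte),
        }
    }
    // No terminator within the limit.
    Err(ENAMETOOLONG)
}
```

Each byte is read exactly once and validated by the read itself, so there is no window between a length check and the copy for another thread to exploit.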

-#### 2. Scenarios with Uncertain Address Validity
+#### 2. Scenarios with uncertain address validity

 When address validity cannot be verified by other means, the exception table is the safest choice:
 - Raw pointers directly provided by users
 - Addresses that may be concurrently modified in multi-threaded environments
 - Operations requiring atomicity guarantees

-### ❌ Unsuitable Scenarios for Exception Table Protection
+### ❌ Scenarios Unsuitable for Exception Table Protection

-#### 1. Large Data Transfers
+#### 1. Large data transfers

 **Anti-pattern: Double buffering in read/write system calls**
 ```
@@ -207,43 +207,43 @@ When address validity cannot be verified by other means, the exception table is