Jollen's Blog: January 2007 Archive: Embedded Linux Consulting and Training: Focused on Embedded Linux

2007: Back to Work!

In the new year of 2007, "Jollen's Blog" will also focus on the Linux kernel and Linux device drivers; your comments and feedback are welcome.

Posted by jollen on January 1, 2007 12:52 AM

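Process Creation, #2: Running a "User Process"

Process Creation

When discussing process creation, the first thing to understand is Linux's three kinds of processes. All of them are "running programs" (processes), but classified by their characteristics (design perspective) they fall into three kinds:

idle process
kernel threads
user process

The user process is the subject of this series. A user process is a simple kind of process; put simply, a program executed in either of the following two ways is a user process:

An external program run by typing a Linux command at the shell
An external program run by the init process

So how is a user process executed? The earlier post "Process Creation, #1: Running an External Program from the Shell (Basic Concepts and Example)" introduced this concept.

Process Creation: Running a Program

Next, a diagram illustrates the flow of running an external program from the shell (running a program).

The flow is as follows:

The user types a UNIX command
The shell searches the PATH directories for the command, i.e. the external program (stored program)
The shell forks/clones itself, and the child process then replaces itself with the ELF image^*1

This flow does not apply to the idle process or to kernel threads.

*1 That is, the external program: Linux executables are in ELF format, hence the term "ELF image".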

Also See

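2006.12.31: Process Creation, #1: Running an External Program from the Shell (Basic Concepts and Example)

Posted by jollen on January 2, 2007 11:08 PM

A Small Trick to Keep a Program from Being Toyed With

While sharing some hands-on experience today, the talk turned to "how to keep a program from being toyed with." The idea is simple: prevent an executable from being handled by the standard program tools on Linux. For example:

Prevent it from being read by the "objdump" tool
Prevent it from being disassembled with "objdump -d"
Prevent it from being debugged with gdb
Prevent its symbol table from being read by nm
Prevent it from being handled by other standard program tools (GNU binutils, as well as tools such as file)

I shared a simple method, one that I have also mentioned now and then in past "Linux Systems Programming" classes. Bear in mind that this is only a fun little trick; any expert-level Linux user could probably defeat it in under an hour.

Test

Take the "tar" command as an example. Here is the normal behavior: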

# objdump -d /bin/tar

/bin/tar:     file format elf32-i386

Disassembly of section .init:

08049734 <.init>:
 8049734:       55                      push   %ebp
 8049735:       89 e5                   mov    %esp,%ebp
 8049737:       83 ec 08                sub    $0x8,%esp
 804973a:       e8 b1 07 00 00          call   0x8049ef0
 804973f:       e8 0c 08 00 00          call   0x8049f50
 8049744:       e8 93 b7 01 00          call   0x8064edc
 8049749:       c9                      leave
 804974a:       c3                      ret
Disassembly of section .plt:
...

After "tar" is processed, all of these standard tools fail. First process /bin/tar with the small tool provided by Jollen:

# ./truncate_it /bin/tar > ./tar.trunc
# chmod a+x tar.trunc

Then try objdump:

# objdump -d tar.trunc
objdump: tar.trunc: File truncated

nm, gdb, and the like fail as well:

# nm tar.trunc
nm: tar.trunc: File truncated
[root@mail tmp]# gdb tar.trunc
GNU gdb Red Hat Linux (5.3post-0.20021129.18rh)
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"..."/tmp/tar.trunc": not in executable format: File truncated

(gdb) r
Starting program:
No executable file specified.
Use the "file" or "exec-file" command.

Yet the program still runs normally, without any problem:

# ./tar.trunc --help|more
GNU `tar' saves many files together into a single tape or disk archive, and
can restore individual files from the archive.

Usage: ./tar.trunc [OPTION]... [FILE]...

Examples:
  ./tar.trunc -cf archive.tar foo bar  # Create archive.tar from files foo and bar.
  ./tar.trunc -tvf archive.tar         # List all files in archive.tar verbosely.
  ./tar.trunc -xf archive.tar          # Extract all files from archive.tar.

If a long option shows an argument as mandatory, then it is mandatory
for the equivalent short option also.  Similarly for optional arguments.
...

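In short, the effect of this trick is that everything works at run time while the program-manipulation tools all fail. Conversely, since run time is unaffected, the program cannot escape run-time debugging tools.

Download

The method and principle are quite simple; once you know how to make good use of the ELF specification, the trick is easy to understand. You can download Jollen's small program "Truncate It", which is provided as an executable; the next post explains how it works and provides the source code. Usage: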

# ./truncate_it /bin/tar > tar.trunc

Remember to chmod the result so that it is executable:

# chmod a+x tar.trunc

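The tar.trunc executable is then available in the current directory.

truncate_it makes some small changes to an ELF executable and produces a new executable (written to stdout). Most files processed by truncate_it still run normally, and neither GNU binutils nor gdb can do anything with them.

Posted by jollen on January 4, 2007 1:51 AM

Busybox, Used in Embedded Linux, Now Has "CONFIG_DESKTOP"

Busybox released version 1.3.0 (stable) on 2006-12-14 and the 1.3.1 (stable) update on 2006-12-27. What caught my attention, however, was not this routine version update but the following passage: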

This release has CONFIG_DESKTOP option which enables features needed for busybox usage on desktop machine.

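With the arrival of 'CONFIG_DESKTOP', Busybox, originally used in Embedded Linux settings, has begun to support Desktop Linux use as well. This brought the following thoughts to mind...

Busybox was originally designed around "simplified system tools" and a "simplified command set", aiming at a small footprint. Back then target-device hardware was scarce (for example, only 4 MB of RAM), so this approach was not only reasonable but also intuitive (it made sense).

Today, advances in target-device hardware are blurring the line between Embedded Linux and Desktop Linux. By its fundamental definition, Embedded Linux is a minimized Linux system, so we had to learn to build a root filesystem and to shrink it as far as possible (downsizing). Now, in some Embedded Linux applications, minimization or sensitivity to size is no longer the point, and one may even answer that "shrinking the size doesn't make sense."

When Embedded Linux roughly equals Desktop Linux, the fundamental Embedded Linux skill (building a minimized root filesystem) is still a required basic ability, but "Auto Build" environments are the practical approach. Under this premise, Embedded Linux may need to be redefined. From a technical definition, BSPs and peripheral interfaces may be what really count as basic abilities, for example porting the Linux driver for some SoC's NAND flash controller; as for the root filesystem, "leave it to the left mouse button."

Posted by jollen on January 5, 2007 12:05 AM

Linux Virtual Memory Areas (VMA): An Introduction to the Basic Concepts

From the viewpoint of a user process, a VMA is a region of the process's virtual address space. The virtual address space is contiguous, and so is each VMA. The main benefit of VMAs for Linux is that memory is used more efficiently and the user process address space is easier to manage.

Seen another way, VMAs let the Linux kernel manage the virtual address space on a per-process basis. A process's VMA mappings can be read from the /proc/<pid>/maps file; for example, the VMA mapping of pid 1 (init) is: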

$ cat /proc/1/maps
08048000-0804e000 r-xp 00000000 08:01 12118      /sbin/init
0804e000-08050000 rw-p 00005000 08:01 12118      /sbin/init
08050000-08054000 rwxp 00000000 00:00 0
40000000-40016000 r-xp 00000000 08:01 52297      /lib/ld-2.2.4.so
40016000-40017000 rw-p 00015000 08:01 52297      /lib/ld-2.2.4.so
40024000-40025000 rw-p 00000000 00:00 0
40025000-40157000 r-xp 00000000 08:01 58241      /lib/i686/libc-2.2.4.so
40157000-4015c000 rw-p 00131000 08:01 58241      /lib/i686/libc-2.2.4.so
4015c000-40160000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0

The fields of each line are:

start-end perm offset major:minor inode image

Linux records the information for each VMA region in struct vm_area_struct (include/linux/mm.h):

struct vm_area_struct {
        struct mm_struct * vm_mm;
        unsigned long vm_start;
        unsigned long vm_end;

        struct vm_area_struct *vm_next;

        pgprot_t vm_page_prot;
        unsigned long vm_flags;

        rb_node_t vm_rb;

        struct vm_area_struct *vm_next_share;
        struct vm_area_struct **vm_pprev_share;

        struct vm_operations_struct * vm_ops;

        unsigned long vm_pgoff;

        struct file * vm_file;
        unsigned long vm_raend;
        void * vm_private_data;
};

Three fields of struct vm_area_struct are used to maintain the VMA data structure:

˙ unsigned long vm_start: the start address of this VMA region.
˙ unsigned long vm_end: the end address of this VMA region.
˙ struct vm_area_struct *vm_next: a pointer to the next VMA region structure (Linux maintains the VMA regions as a linked list).

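VMAs are implemented mainly to manage memory more efficiently and are built on top of the paging system; a VMA is a higher-level memory-management mechanism than raw paging.

Posted by jollen on January 5, 2007 2:08 PM

Process Creation, #3: sys_fork (Basic Concepts)

fork() is an important milestone in the "Process Creation" topic: bringing up a system call means that our study is formally entering kernel space.

It is worth noting that sys_fork is a machine-dependent system call; on i386, for example, it is implemented in linux/arch/i386/kernel/process.c.^*1

The Operating System Perspective

sys_fork is a very important system call; from the operating system's point of view it is the main system call for creating a process. When the user types a command that names an external program (an ELF image), the shell calls fork(), and the operating system's fork system call creates a new process in which the external program runs. This kind of fork is also called spawn.

See Jollen's two earlier "Process Creation" posts for the concepts of spawn, fork, exec, and so on:

2007.01.02: Process Creation, #2: Running a "User Process"
2006.12.31: Process Creation, #1: Running an External Program from the Shell (Basic Concepts and Example)

I strongly recommend reading these two posts and understanding the user-space view of processes before continuing; the fork system call will then be much easier to grasp.

The Linux Perspective

Linux has no spawn system call; as explained in the "Process Creation" posts, fork + exec on Linux equals spawn. The core idea lies in the implementation of fork, that is, how a new process is created.

Linux creates a new process with sys_fork or sys_clone, and both system calls end up calling do_fork(), Linux's main fork routine. Digging into do_fork() leads to two important topics:

memory descriptor (struct mm_struct)
memory area (VMA)

We stop here for now; the next post (Kernel Implementation) goes deeper into fork(). Below is the code of sys_fork() and sys_clone() (Linux 2.6.17.7::arch/i386/kernel/process.c):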

asmlinkage int sys_fork(struct pt_regs regs)
{
	return do_fork(SIGCHLD, regs.esp, &regs, 0, NULL, NULL);
}

asmlinkage int sys_clone(struct pt_regs regs)
{
	unsigned long clone_flags;
	unsigned long newsp;
	int __user *parent_tidptr, *child_tidptr;

	clone_flags = regs.ebx;
	newsp = regs.ecx;
	parent_tidptr = (int __user *)regs.edx;
	child_tidptr = (int __user *)regs.edi;
	if (!newsp)
		newsp = regs.esp;
	return do_fork(clone_flags, newsp, &regs, 0, parent_tidptr, child_tidptr);
}

The prototype of do_fork():

/*
 *  Ok, this is the main fork-routine.
 *
 * It copies the process, and if successful kick-starts
 * it and waits for it to finish using the VM if required.
 */
long do_fork(unsigned long clone_flags,
	      unsigned long stack_start,
	      struct pt_regs *regs,
	      unsigned long stack_size,
	      int __user *parent_tidptr,
	      int __user *child_tidptr);

clone_flags is at the heart of do_fork() and is an interesting topic in its own right.

Also See

^*1 2006.10.11: System Calls in Linux 2.6: 12 Major Categories

Posted by jollen on January 8, 2007 1:44 PM

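Qt Centre Programming Contest 2007, and a Few Thoughts of My Own

Qt Centre (The Ultimate Qt Community) has announced the "Qt Centre Programming Contest 2007". So Qt Centre, together with several partners (Trolltech among them, of course), is now running a Qt4 programming contest.

The Community Approach

The items in the earlier "Embedded Linux 2006: Top Ten in Review!" were all concrete events, but looking at the bigger picture one more item certainly belongs on the list: "The open source community approach showed its strength in 2006." Consider the explosive growth of Linux mobile and the strategies behind it, or even several IBM case studies, and the importance of the community approach becomes clear.

Because such an ecosystem has formed, the open source movement is pushed further forward, which is definitely a good thing.

Back to the Topic

In the open source world of 2007, besides the continued leap of "Linux mobile", community management will of course be another highlight of the year.

Whenever I read news like this my mind tends to wander off into odd associations, so let me return to the topic. The Qt4 contest is of course not a problem-solving contest. Its "Guidelines on how to win the contest" put particular emphasis on:

1. idea. The key is that your idea should be "big".
2. completeness. The result should be complete and working.
3. portability. It should run on the major Qt4 platforms.
4. design. The code itself should be well designed.
5. documentation. Good comments and documentation.
6. code quality. Good code.
7. tests. The code must be tested by some means.
8. dependencies. Built on the Qt platform.
9. team size. The team should have several members; a one-person team cannot win the contest.

I think these keywords are quite good, and they can be generalized and applied to Embedded Linux projects. For code quality, for example, self-explanatory variable names are a basic skill; for dependencies, because this is Embedded Linux development, it is reasonable to restrict the libraries and applications that programmers may use.

Posted by jollen on January 8, 2007 11:32 PM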

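Source Code and Principle of the "Truncate It" Trick

In the earlier post "A Small Trick to Keep a Program from Being Toyed With", Jollen provided a small tool called "Truncate It". Its source code is available at "http://tw.jollen.org/elf-programming/truncate_it.tar.bz2"; anyone interested can download it and experiment.

For usage, see the previous post. Note that "Truncate It" is only a proof-of-concept prototype and is not polished code; besides writing its result to stdout, it currently handles only IA32 ELF images.

How Truncate It Works

The principle behind Truncate It is quite simple: it removes the Section Header Table and related information from the ELF image, so the standard tools can no longer process the image. This is because the GNU binutils tools interpret an ELF image through the linking view; to disassemble a truncated ELF image properly, the executable has to be interpreted through the execution view.

In addition, the Section Header Table is optional in the execution view; see Jollen's earlier post "ELF (Executable and Linking Format) Tutorial, #1: Introduction to ELF".

The following walk-through illustrates the principle:

1. Inspect the executable with objdump (see the figure).

2. Cut out the front part of the file and save it as hello.trunc.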

# dd if=hello of=hello.trunc bs=1308 count=1
1+0 records in
1+0 records out
# chmod a+x hello.trunc

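0x51c equals 1308 in decimal. The .bss section has a length of 4, but 4 does not need to be added to 0x51c; for the reason, see Jollen's column "BSS Section Concepts: An Introduction to the .bss Section".

Also, the .bss section is normally the last section after strip. The .bss section occupies no space in the file itself; it exists mainly because of "a cause planted by the original C language standard."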

Posted by jollen on January 9, 2007 11:41 PM

.bss section: A Cause Planted by the C Language

The original C language standard states that uninitialized global variables have an initial value of zero; as a result, all uninitialized global variables must be initialized to zero when a program runs.

Linux handles this by allocating zeroed pages for the .bss section, so the values of uninitialized global variables are zero.

Note: the compiler places uninitialized global variables in the .bss section; see Jollen's "BSS Section Concepts" column.

Moreover, GCC also places a global variable in the .bss section when it is initialized to zero. So the following code:

int foo;
main()
{
...
}

is equivalent to:

int foo = 0;
main()
{
...
}

Both forms cause foo to be placed in the .bss section. Hence a variable ends up in the .bss section in either of two cases:

1. the global variable is not initialized;
2. or the global variable is initialized to zero.

-fno-zero-initialized-in-bss

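What if we do not want a variable to be placed in the .bss section (and there are indeed such applications)? There are two ways. The first is the traditional one: the programmer simply initializes the global variable to a nonzero value, as in the following program:

#include <stdio.h>
int foo = 1;

int main(void)
{
    printf("foo is %d.\n", foo);
    return 0;
}

Compile the program: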

$ gcc -O2 -o bss bss.c

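Inspecting the result with objdump shows that GCC has placed foo in the .data section. This approach works at the coding level.

The second way is the "-fno-zero-initialized-in-bss" optimization option, supported since GCC 3.4.x. Modify the program as follows:

#include <stdio.h>
int foo = 0;

int main(void)
{
    printf("foo is %d.\n", foo);
    return 0;
}

Note that the global variable must still be initialized, so foo must be explicitly assigned the value zero. Compile the program: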

$ gcc -fno-zero-initialized-in-bss -O2 -o bss bss.c

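GCC places foo in the .data section. Again, objdump can be used to verify this.

Good Embedded Linux Programming Habits

With the above in mind, here is an important point. Some specific applications or targets need variables to reside in the .data section, so for Embedded Linux applications I recommend initializing global variables. For example: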

int x;
int y;
int z;

int main(void)
{
...
}

This traditional C habit should be changed to:

int x = 0;
int y = 0;
int z = 0;

int main(void)
{
...
}

Good habits, once formed, pay off in many ways later.

This style is not the redundancy that traditional C teaching would call it. Think of it from the viewpoint of Linux systems software: with this style, GCC's "-fno-zero-initialized-in-bss" or "-fzero-initialized-in-bss" (the default) option can decide whether global variables go in the .bss section or the .data section.

Posted by jollen on January 10, 2007 10:44 PM

Process Creation, #4: sys_fork (Kernel Implementation)

Continuing from the previous post: "Linux creates a new process with sys_fork or sys_clone, and both system calls end up calling do_fork(), Linux's main fork routine."

From an implementation standpoint, what do_fork() must do is copy the original process into a new one. Linux's sys_fork() is implemented by copying the process: when a user program calls the fork() wrapper function, sys_fork() copies the original process to obtain a new process.

It follows that the keys to sys_fork()'s implementation are:

1. How to copy a process.

2. Which parts of the process to copy.

Both are well worth exploring, and studying the kernel implementation of "copy process" also reinforces the concept of the process address space. Below we first trace roughly through the internal flow of sys_fork() and then discuss "copy process"; "copy what" is covered in the "clone()" column.

First, both sys_fork() and sys_clone() call the do_fork routine. Here is the source of do_fork():

/*
 *  Ok, this is the main fork-routine.
 *
 * It copies the process, and if successful kick-starts
 * it and waits for it to finish using the VM if required.
 */
long do_fork(unsigned long clone_flags,
	      unsigned long stack_start,
	      struct pt_regs *regs,
	      unsigned long stack_size,
	      int __user *parent_tidptr,
	      int __user *child_tidptr)
{
	struct task_struct *p; // see 1.
	int trace = 0;
	struct pid *pid = alloc_pid(); // see 2.
	long nr;

	if (!pid)
		return -EAGAIN;
	nr = pid->nr;
	if (unlikely(current->ptrace)) {
		trace = fork_traceflag (clone_flags);
		if (trace)
			clone_flags |= CLONE_PTRACE;
	}
	// see 3.
	p = copy_process(clone_flags, stack_start, regs, stack_size, parent_tidptr, child_tidptr, nr);
	/*
	 * Do this prior waking up the new thread - the thread pointer
	 * might get invalid after that point, if the thread exits quickly.
	 */
	if (!IS_ERR(p)) {
		struct completion vfork;

		if (clone_flags & CLONE_VFORK) {
			p->vfork_done = &vfork;
			init_completion(&vfork);
		}

		if ((p->ptrace & PT_PTRACED) || (clone_flags & CLONE_STOPPED)) {
			/*
			 * We'll start up with an immediate SIGSTOP.
			 */
			sigaddset(&p->pending.signal, SIGSTOP);
			set_tsk_thread_flag(p, TIF_SIGPENDING);
		}

		if (!(clone_flags & CLONE_STOPPED))
			wake_up_new_task(p, clone_flags);
		else
			p->state = TASK_STOPPED;

		if (unlikely (trace)) {
			current->ptrace_message = nr;
			ptrace_notify ((trace << 8) | SIGTRAP);
		}

		if (clone_flags & CLONE_VFORK) {
			wait_for_completion(&vfork);
			if (unlikely (current->ptrace & PT_TRACE_VFORK_DONE))
				ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
		}
	} else {
		free_pid(pid);
		nr = PTR_ERR(p);
	}
	return nr;
}

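The important places are marked with numbered comments. To begin, we need to understand this part of the implementation:

1. Declare a process descriptor.

2. Request a PID for the new process.

3. Call copy_process() to duplicate the process.

This shows that the Linux kernel's copy_process() API is an important process-creation API.

Next, trace into the source of copy_process():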

/*
 * This creates a new process as a copy of the old one,
 * but does not actually start it yet.
 *
 * It copies the registers, and all the appropriate
 * parts of the process environment (as per the clone
 * flags). The actual kick-off is left to the caller.
 */
static task_t *copy_process(unsigned long clone_flags,
				 unsigned long stack_start,
				 struct pt_regs *regs,
				 unsigned long stack_size,
				 int __user *parent_tidptr,
				 int __user *child_tidptr,
				 int pid)
{
...
}

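copy_process() is fairly long, so only its prototype is listed here.

Look at the first parameter of copy_process(), clone_flags. It is originally passed in by sys_fork() or sys_clone(), and copy_process() decides "copy what" according to clone_flags.

How do we know which values clone_flags can take? They are defined in the <linux/sched.h> header; here are the bitwise definitions: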

/*
 * cloning flags:
 */
#define CSIGNAL		0x000000ff	/* signal mask to be sent at exit */
#define CLONE_VM		0x00000100	/* set if VM shared between processes */
#define CLONE_FS		0x00000200	/* set if fs info shared between processes */
#define CLONE_FILES	0x00000400	/* set if open files shared between processes */
#define CLONE_SIGHAND	0x00000800	/* set if signal handlers and blocked signals shared */
#define CLONE_PTRACE	0x00002000	/* set if we want to let tracing continue on the child too */
#define CLONE_VFORK	0x00004000	/* set if the parent wants the child to wake it up on mm_release */
#define CLONE_PARENT	0x00008000	/* set if we want to have the same parent as the cloner */
#define CLONE_THREAD	0x00010000	/* Same thread group? */
#define CLONE_NEWNS	0x00020000	/* New namespace group? */
#define CLONE_SYSVSEM	0x00040000	/* share system V SEM_UNDO semantics */
#define CLONE_SETTLS	0x00080000	/* create a new TLS for the child */
#define CLONE_PARENT_SETTID	0x00100000	/* set the TID in the parent */
#define CLONE_CHILD_CLEARTID	0x00200000	/* clear the TID in the child */
#define CLONE_DETACHED		0x00400000	/* Unused, ignored */
#define CLONE_UNTRACED		0x00800000	/* set if the tracing process can't force CLONE_PTRACE on this clone */
#define CLONE_CHILD_SETTID		0x01000000	/* set the TID in the child */
#define CLONE_STOPPED		0x02000000	/* Start in stopped state */

/*
 * List of flags we want to share for kernel threads,
 * if only because they are not used by them anyway.
 */
#define CLONE_KERNEL	(CLONE_FS | CLONE_FILES | CLONE_SIGHAND)

Looking at the implementation of sys_fork():

asmlinkage int sys_fork(struct pt_regs regs)
{
	return do_fork(SIGCHLD, regs.esp, &regs, 0, NULL, NULL);
}

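A user who calls the fork() wrapper function cannot specify clone flags. Therefore, until we learn how to use the clone() function, the clone flags can be skipped for now.

That completes this trace of sys_fork()'s internals. We now know what the clone flags do, but since sys_fork() sets none of the CLONE_* flags (it passes only SIGCHLD), we leave them aside. Our sys_fork() trace is not finished yet, though: the next post is "Process Creation, #5: copy process".

All kernel traces above use the Linux 2.6.17.7 source code.

Also See

2007.01.08: Process Creation, #3: sys_fork (Basic Concepts)
2007.01.02: Process Creation, #2: Running a "User Process"
2006.12.31: Process Creation, #1: Running an External Program from the Shell (Basic Concepts and Example)

Posted by jollen on January 11, 2007 11:17 PM

Nano-X Programming, #3: Displaying an Image (image.c)

The example program image.c builds on hello.c and adds image display. Through image.c we learn the following Nano-X programming techniques:

˙ How to use an embedded image
˙ How to display an image in a window

Reading an image file and displaying it is the common approach; the example here instead embeds the image directly in the program rather than reading it from an external file.

How to Use an Embedded Image

To embed an image in a program, the image must first be converted into a C source file that holds it as numeric data. Nano-X provides a tool named convbmp that converts a BMP image into a C source file.

The source of convbmp is located at src/mwin/bmp/convbmp.c. It is a tool for users of the Microwindows API, so when configuring the Nano-X build options we must select the Microwindows API option in addition to the Nano-X API in order to build the convbmp executable. After the build, convbmp can be found under src/bin/; we install it manually into /usr/bin/ for convenience: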

# cd microwin-0.89/
# cp src/bin/convbmp /usr/bin

First convert the image to BMP format, then use convbmp to convert it into a C source file. For example, to convert jollen.bmp, the command is:

$ convbmp jollen.bmp

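Figure (jollen.bmp)

The conversion produces jollen.c. Next we modify hello.c to display the image in the window.

How to Display an Image in a Window

Because the image data is an external variable, first declare it in the program: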

extern GR_IMAGE_HDR image_jollen;

image_jollen holds the image's pixel data and is generated by convbmp; see jollen.c for details. Then, where the GR_EVENT_TYPE_EXPOSURE event is handled, call GrDrawImageBits() to draw the image:

GrDrawImageBits(wid, gc, 0, 0, &image_jollen);

Remember to link with jollen.c when building; a simple Makefile rule takes care of this:

image: image.o jollen.o
	$(CC) $(CFLAGS) $(LDFLAGS) $^ -o $@

The complete image.c program follows.

/*
 * Copyright(c) 2003,2004 www.jollen.org
 *
 * - Nano-X API example.
 * - image.c
 */

#include <stdio.h>
#define MWINCLUDECOLORS
#include <microwin/nano-X.h>

GR_WINDOW_ID wid;
GR_GC_ID gc;

/* external image */
extern GR_IMAGE_HDR image_jollen;

void event_handler (GR_EVENT *event);

int main (void)
{
   if (GrOpen() < 0) {
        fprintf (stderr, "GrOpen failed");
        return -1;
   }

   gc = GrNewGC();
   GrSetGCForeground (gc, 0xFF0000);

   wid = GrNewWindowEx(GR_WM_PROPS_APPFRAME |
                       GR_WM_PROPS_CAPTION  |
                       GR_WM_PROPS_CLOSEBOX,
                       "jollen.org",
                       GR_ROOT_WINDOW_ID, 
                       0, 0, 
                       image_jollen.width,  /* image width */
                       image_jollen.height, /* image height */
                       0xFFFFFF);

   GrSelectEvents(wid, GR_EVENT_MASK_CLOSE_REQ | GR_EVENT_MASK_EXPOSURE);

   GrMapWindow(wid);
   GrMainLoop(event_handler);

   return 0;
}

void event_handler (GR_EVENT *event)
{
   switch (event->type)
   {
      case GR_EVENT_TYPE_EXPOSURE:
           GrDrawImageBits(wid, gc, 0, 0, &image_jollen);
	   break;
      case GR_EVENT_TYPE_CLOSE_REQ:
	   GrClose();
	   break;
      default: break;
   }
}

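Notes

2007.01.13 Revision

Also See

2004.08.11: Nano-X Programming, #1: Introduction and Installation
2004.04.26: Nano-X Programming, #2: "Hello World"

Posted by jollen on January 13, 2007 8:41 PM

Nano-X Programming, #4: Configuring the Window Manager (wm.c)

wm.c is based on image.c. When wm.c runs, only the image appears on screen (with no window title bar or border), and the image can be dragged with the mouse. This example demonstrates the following Nano-X API programming techniques:

˙ How to set window properties (window manager).
˙ How to interact with the user (human interaction).

To implement image dragging, the window's title bar and border must first be removed; otherwise only the whole window can be dragged, by its title bar. This is very simple: just set the window manager properties when creating the window: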

   wid = GrNewWindowEx(GR_WM_PROPS_APPWINDOW | 
                       GR_WM_PROPS_NODECORATE | 
                       GR_WM_PROPS_NOAUTOMOVE,
                       NULL,
                       GR_ROOT_WINDOW_ID, 
                       0, 0, 
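                       image_jollen.width, /* image width */
                       image_jollen.height /* image height */, 
                       0xFFFFFF);

To get the behavior we want, three window manager property values are specified:

GR_WM_PROPS_APPWINDOW: use the window manager's appearance.
GR_WM_PROPS_NODECORATE: no window border.
GR_WM_PROPS_NOAUTOMOVE: do not move the window when it is first displayed.

Because GR_WM_PROPS_CAPTION is not specified, there is no title bar. For user interaction, we need to handle three mouse events ourselves:

GR_EVENT_MASK_MOUSE_POSITION: the mouse moved.
GR_EVENT_MASK_BUTTON_UP: a mouse button was released.
GR_EVENT_MASK_BUTTON_DOWN: a mouse button was pressed.

The event selection code therefore becomes: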

   GrSelectEvents(wid, GR_EVENT_MASK_MOUSE_POSITION | 
                           GR_EVENT_MASK_BUTTON_UP |
                           GR_EVENT_MASK_BUTTON_DOWN);

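Next, pay particular attention to when the image is drawn. Here it is drawn after the window is mapped and before entering the event loop. First, the code that displays the image: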

   GrMapWindow(wid);
   GrDrawImageBits(wid, gc, 0, 0, &image_jollen);

Then modify the event handler and add code to handle the three events above:

void event_handler (GR_EVENT *event)
{
   switch (event->type)
   {
      case GR_EVENT_TYPE_CLOSE_REQ: 
            GrClose();
            break; /* do not fall through into the mouse handler */
      case GR_EVENT_TYPE_MOUSE_POSITION:
            position_event(&event->mouse);
            break;
      case GR_EVENT_TYPE_BUTTON_UP:
      case GR_EVENT_TYPE_BUTTON_DOWN:
            button_event(&event->button);
            break;
      default: break;
   }
}

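Mouse buttons are handled as follows: when a button is pressed, record the mouse coordinates and raise the window to the top; if the user then moves the mouse, move the whole window to the new position. The user thus sees the whole image being dragged.

The button handling code is: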

void button_event(GR_EVENT_BUTTON *e)
{
   if (e->type == GR_EVENT_TYPE_BUTTON_DOWN) {
      newx = e->x;
      newy = e->y;

      button_down = 1;
      GrRaiseWindow(e->wid);
   } else {
      button_down = 0;
   }
}

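First check whether the event is GR_EVENT_TYPE_BUTTON_DOWN; if so, record the new coordinates and set button_down to 1. If the event is not GR_EVENT_TYPE_BUTTON_DOWN, the button has been released, so set button_down to 0.

The mouse movement handling code is: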

void position_event(GR_EVENT_MOUSE *e)
{
   if (!button_down) return;

   GrMoveWindow(e->wid, e->rootx-newx, e->rooty-newy);
}

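First check whether button_down is false: if button_down equals 0, the button is released and nothing is done. Conversely, if button_down is true, the button is still being held down, and only then is GrMoveWindow() called to move the window.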

The complete wm.c program follows.

/*
 * Copyright(c) 2003,2004 www.jollen.org
 *
 * - Nano-X API example.
 * - wm.c
 *
 */

#include <stdio.h>
#define MWINCLUDECOLORS
#include <microwin/nano-X.h>
#include <microwin/nxcolors.h>

GR_WINDOW_ID wid;
GR_GC_ID gc;

/* external image */
extern GR_IMAGE_HDR image_jollen;

void event_handler (GR_EVENT *event);

/* mouse coordinates */
int newx, newy;
int button_down;

int main (void)
{
   if (GrOpen() < 0) {
        fprintf (stderr, "GrOpen failed");
        return -1;
   }

   gc = GrNewGC();
   GrSetGCForeground (gc, 0xFF0000);

   wid = GrNewWindowEx(GR_WM_PROPS_APPWINDOW |
                       GR_WM_PROPS_NODECORATE | 
                       GR_WM_PROPS_NOAUTOMOVE,
                       NULL,
                       GR_ROOT_WINDOW_ID, 
                       0, 0, 
                       image_jollen.width, /* image width */
                       image_jollen.height /* image height */, 
                       0xFFFFFF);

   GrSelectEvents(wid, GR_EVENT_MASK_MOUSE_POSITION |
                    GR_EVENT_MASK_BUTTON_UP |
                    GR_EVENT_MASK_BUTTON_DOWN);

   GrMapWindow(wid);
   GrDrawImageBits(wid, gc, 0, 0, &image_jollen);

   GrMainLoop(event_handler);

   return 0;
}

void button_event(GR_EVENT_BUTTON *e)
{
   if (e->type == GR_EVENT_TYPE_BUTTON_DOWN) {
      newx = e->x;
      newy = e->y;

      button_down = 1;
      GrRaiseWindow(e->wid);
   } else {
      button_down = 0;
   }
}

void position_event(GR_EVENT_MOUSE *e)
{
   if (!button_down) return;

   GrMoveWindow(e->wid, e->rootx-newx, e->rooty-newy);
}

void event_handler (GR_EVENT *event)
{
   switch (event->type)
   {
      case GR_EVENT_TYPE_CLOSE_REQ: 
	   GrClose();
	   break; /* do not fall through into the mouse handler */
      case GR_EVENT_TYPE_MOUSE_POSITION:
	   position_event(&event->mouse);
	   break;
      case GR_EVENT_TYPE_BUTTON_UP:
      case GR_EVENT_TYPE_BUTTON_DOWN:
           button_event(&event->button);
	   break;
      default: break;
   }
}

Notes

2007.01.13 Revision

Also See

2004.08.11: Nano-X Programming, #1: Introduction and Installation
2004.04.26: Nano-X Programming, #2: "Hello World"
2007.01.13: Nano-X Programming, #3: Displaying an Image (image.c)

Posted by jollen on January 13, 2007 11:11 PM

Process Creation, #5：copy_process()

Before We Start

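When tracing the Linux kernel, a few important principles should be kept in mind:

1. Do not read the Linux kernel line by line (tracing line by line is the worst approach). Instead, identify the core concept you want to understand and study the implementation details related to it.
2. Map concepts to code, but this does not mean annotating the source line by line; rather, for each detailed concept, collect the relevant fragments of the implementation.
3. The implementation of each kernel API inherently involves many basic OS concepts and theories, such as mutexes, semaphores, and locking. Code related to this background knowledge belongs to the whole, so it should be considered apart from the individual kernel API implementation.
4. Following 3., if you already know this material, you will readily understand what these code fragments do. Avoid perfectionism: do not rush to understand every detail of these fragments, or you may lose sight of the bigger picture.
5. Following 3., if you do not know this material, try treating these fragments as black boxes. This is a matter of learning method, not a deliberate avoidance of what you do not understand.

In this post, for example, we want to understand how the Linux kernel creates a new process, and the key is the copy_process() function. However, we are currently tracing sys_fork(), and sys_fork() sets none of the CLONE_* flags in clone_flags (it passes only SIGCHLD), so while tracing copy_process() we can skip the processing specific to clone_flags for now.

copy_process()

The implementation of copy_process() follows; the parts that depend on specific clone_flags can be skipped at this stage.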

/*
 * This creates a new process as a copy of the old one,
 * but does not actually start it yet.
 *
 * It copies the registers, and all the appropriate
 * parts of the process environment (as per the clone
 * flags). The actual kick-off is left to the caller.
 */
static task_t *copy_process(unsigned long clone_flags,
				 unsigned long stack_start,
				 struct pt_regs *regs,
				 unsigned long stack_size,
				 int __user *parent_tidptr,
				 int __user *child_tidptr,
				 int pid)
{
	int retval;
	struct task_struct *p = NULL;

	if ((clone_flags & (CLONE_NEWNS|CLONE_FS)) == (CLONE_NEWNS|CLONE_FS))
		return ERR_PTR(-EINVAL);

	/*
	 * Thread groups must share signals as well, and detached threads
	 * can only be started up within the thread group.
	 */
	if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND))
		return ERR_PTR(-EINVAL);

	/*
	 * Shared signal handlers imply shared VM. By way of the above,
	 * thread groups also imply shared VM. Blocking this case allows
	 * for various simplifications in other code.
	 */
	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
		return ERR_PTR(-EINVAL);

	retval = security_task_create(clone_flags);
	if (retval)
		goto fork_out;

	retval = -ENOMEM;
	p = dup_task_struct(current);
	if (!p)
		goto fork_out;

	retval = -EAGAIN;
	if (atomic_read(&p->user->processes) >=
			p->signal->rlim[RLIMIT_NPROC].rlim_cur) {
		if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_RESOURCE) &&
				p->user != &root_user)
			goto bad_fork_free;
	}

	atomic_inc(&p->user->__count);
	atomic_inc(&p->user->processes);
	get_group_info(p->group_info);

	/*
	 * If multiple threads are within copy_process(), then this check
	 * triggers too late. This doesn't hurt, the check is only there
	 * to stop root fork bombs.
	 */
	if (nr_threads >= max_threads)
		goto bad_fork_cleanup_count;

	if (!try_module_get(task_thread_info(p)->exec_domain->module))
		goto bad_fork_cleanup_count;

	if (p->binfmt && !try_module_get(p->binfmt->module))
		goto bad_fork_cleanup_put_domain;

	p->did_exec = 0;
	copy_flags(clone_flags, p);
	p->pid = pid;
	retval = -EFAULT;
	if (clone_flags & CLONE_PARENT_SETTID)
		if (put_user(p->pid, parent_tidptr))
			goto bad_fork_cleanup;

	p->proc_dentry = NULL;

	INIT_LIST_HEAD(&p->children);
	INIT_LIST_HEAD(&p->sibling);
	p->vfork_done = NULL;
	spin_lock_init(&p->alloc_lock);
	spin_lock_init(&p->proc_lock);

	clear_tsk_thread_flag(p, TIF_SIGPENDING);
	init_sigpending(&p->pending);

	p->utime = cputime_zero;
	p->stime = cputime_zero;
 	p->sched_time = 0;
	p->rchar = 0;		/* I/O counter: bytes read */
	p->wchar = 0;		/* I/O counter: bytes written */
	p->syscr = 0;		/* I/O counter: read syscalls */
	p->syscw = 0;		/* I/O counter: write syscalls */
	acct_clear_integrals(p);

 	p->it_virt_expires = cputime_zero;
	p->it_prof_expires = cputime_zero;
 	p->it_sched_expires = 0;
 	INIT_LIST_HEAD(&p->cpu_timers[0]);
 	INIT_LIST_HEAD(&p->cpu_timers[1]);
 	INIT_LIST_HEAD(&p->cpu_timers[2]);

	p->lock_depth = -1;		/* -1 = no lock */
	do_posix_clock_monotonic_gettime(&p->start_time);
	p->security = NULL;
	p->io_context = NULL;
	p->io_wait = NULL;
	p->audit_context = NULL;
	cpuset_fork(p);
#ifdef CONFIG_NUMA
 	p->mempolicy = mpol_copy(p->mempolicy);
 	if (IS_ERR(p->mempolicy)) {
 		retval = PTR_ERR(p->mempolicy);
 		p->mempolicy = NULL;
 		goto bad_fork_cleanup_cpuset;
 	}
	mpol_fix_fork_child_flag(p);
#endif

#ifdef CONFIG_DEBUG_MUTEXES
	p->blocked_on = NULL; /* not blocked yet */
#endif

	p->tgid = p->pid;
	if (clone_flags & CLONE_THREAD)
		p->tgid = current->tgid;

	if ((retval = security_task_alloc(p)))
		goto bad_fork_cleanup_policy;
	if ((retval = audit_alloc(p)))
		goto bad_fork_cleanup_security;
	/* copy all the process information */
	if ((retval = copy_semundo(clone_flags, p)))
		goto bad_fork_cleanup_audit;
	if ((retval = copy_files(clone_flags, p)))
		goto bad_fork_cleanup_semundo;
	if ((retval = copy_fs(clone_flags, p)))
		goto bad_fork_cleanup_files;
	if ((retval = copy_sighand(clone_flags, p)))
		goto bad_fork_cleanup_fs;
	if ((retval = copy_signal(clone_flags, p)))
		goto bad_fork_cleanup_sighand;
	if ((retval = copy_mm(clone_flags, p)))
		goto bad_fork_cleanup_signal;
	if ((retval = copy_keys(clone_flags, p)))
		goto bad_fork_cleanup_mm;
	if ((retval = copy_namespace(clone_flags, p)))
		goto bad_fork_cleanup_keys;
	retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
	if (retval)
		goto bad_fork_cleanup_namespace;

	p->set_child_tid = (clone_flags & CLONE_CHILD_SETTID) ? child_tidptr : NULL;
	/*
	 * Clear TID on mm_release()?
	 */
	p->clear_child_tid = (clone_flags & CLONE_CHILD_CLEARTID) ? child_tidptr: NULL;
	p->robust_list = NULL;
#ifdef CONFIG_COMPAT
	p->compat_robust_list = NULL;
#endif
	/*
	 * sigaltstack should be cleared when sharing the same VM
	 */
	if ((clone_flags & (CLONE_VM|CLONE_VFORK)) == CLONE_VM)
		p->sas_ss_sp = p->sas_ss_size = 0;

	/*
	 * Syscall tracing should be turned off in the child regardless
	 * of CLONE_PTRACE.
	 */
	clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
#ifdef TIF_SYSCALL_EMU
	clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
#endif

	/* Our parent execution domain becomes current domain
	   These must match for thread signalling to apply */
	   
	p->parent_exec_id = p->self_exec_id;

	/* ok, now we should be set up.. */
	p->exit_signal = (clone_flags & CLONE_THREAD) ? -1 : (clone_flags & CSIGNAL);
	p->pdeath_signal = 0;
	p->exit_state = 0;

	/*
	 * Ok, make it visible to the rest of the system.
	 * We dont wake it up yet.
	 */
	p->group_leader = p;
	INIT_LIST_HEAD(&p->thread_group);
	INIT_LIST_HEAD(&p->ptrace_children);
	INIT_LIST_HEAD(&p->ptrace_list);

	/* Perform scheduler related setup. Assign this task to a CPU. */
	sched_fork(p, clone_flags);

	/* Need tasklist lock for parent etc handling! */
	write_lock_irq(&tasklist_lock);

	/*
	 * The task hasn't been attached yet, so its cpus_allowed mask will
	 * not be changed, nor will its assigned CPU.
	 *
	 * The cpus_allowed mask of the parent may have changed after it was
	 * copied first time - so re-copy it here, then check the child's CPU
	 * to ensure it is on a valid CPU (and if not, just force it back to
	 * parent's CPU). This avoids alot of nasty races.
	 */
	p->cpus_allowed = current->cpus_allowed;
	if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed) ||
			!cpu_online(task_cpu(p))))
		set_task_cpu(p, smp_processor_id());

	/* CLONE_PARENT re-uses the old parent */
	if (clone_flags & (CLONE_PARENT|CLONE_THREAD))
		p->real_parent = current->real_parent;
	else
		p->real_parent = current;
	p->parent = p->real_parent;

	spin_lock(&current->sighand->siglock);

	/*
	 * Process group and session signals need to be delivered to just the
	 * parent before the fork or both the parent and the child after the
	 * fork. Restart if a signal comes in before we add the new process to
	 * it's process group.
	 * A fatal signal pending means that current will exit, so the new
	 * thread can't slip out of an OOM kill (or normal SIGKILL).
 	 */
 	recalc_sigpending();
	if (signal_pending(current)) {
		spin_unlock(&current->sighand->siglock);
		write_unlock_irq(&tasklist_lock);
		retval = -ERESTARTNOINTR;
		goto bad_fork_cleanup_namespace;
	}

	if (clone_flags & CLONE_THREAD) {
		/*
		 * Important: if an exit-all has been started then
		 * do not create this new thread - the whole thread
		 * group is supposed to exit anyway.
		 */
		if (current->signal->flags & SIGNAL_GROUP_EXIT) {
			spin_unlock(&current->sighand->siglock);
			write_unlock_irq(&tasklist_lock);
			retval = -EAGAIN;
			goto bad_fork_cleanup_namespace;
		}

		p->group_leader = current->group_leader;
		list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);

		if (!cputime_eq(current->signal->it_virt_expires,
				cputime_zero) ||
		    !cputime_eq(current->signal->it_prof_expires,
				cputime_zero) ||
		    current->signal->rlim[RLIMIT_CPU].rlim_cur != RLIM_INFINITY ||
		    !list_empty(&current->signal->cpu_timers[0]) ||
		    !list_empty(&current->signal->cpu_timers[1]) ||
		    !list_empty(&current->signal->cpu_timers[2])) {
			/*
			 * Have child wake up on its first tick to check
			 * for process CPU timers.
			 */
			p->it_prof_expires = jiffies_to_cputime(1);
		}
	}

	/*
	 * inherit ioprio
	 */
	p->ioprio = current->ioprio;

	if (likely(p->pid)) {
		add_parent(p);
		if (unlikely(p->ptrace & PT_PTRACED))
			__ptrace_link(p, current->parent);

		if (thread_group_leader(p)) {
			p->signal->tty = current->signal->tty;
			p->signal->pgrp = process_group(current);
			p->signal->session = current->signal->session;
			attach_pid(p, PIDTYPE_PGID, process_group(p));
			attach_pid(p, PIDTYPE_SID, p->signal->session);

			list_add_tail_rcu(&p->tasks, &init_task.tasks);
			__get_cpu_var(process_counts)++;
		}
		attach_pid(p, PIDTYPE_PID, p->pid);
		nr_threads++;
	}

	total_forks++;
	spin_unlock(&current->sighand->siglock);
	write_unlock_irq(&tasklist_lock);
	proc_fork_connector(p);
	return p;

bad_fork_cleanup_namespace:
	exit_namespace(p);
bad_fork_cleanup_keys:
	exit_keys(p);
bad_fork_cleanup_mm:
	if (p->mm)
		mmput(p->mm);
bad_fork_cleanup_signal:
	cleanup_signal(p);
bad_fork_cleanup_sighand:
	__cleanup_sighand(p->sighand);
bad_fork_cleanup_fs:
	exit_fs(p); /* blocking */
bad_fork_cleanup_files:
	exit_files(p); /* blocking */
bad_fork_cleanup_semundo:
	exit_sem(p);
bad_fork_cleanup_audit:
	audit_free(p);
bad_fork_cleanup_security:
	security_task_free(p);
bad_fork_cleanup_policy:
#ifdef CONFIG_NUMA
	mpol_free(p->mempolicy);
bad_fork_cleanup_cpuset:
#endif
	cpuset_exit(p);
bad_fork_cleanup:
	if (p->binfmt)
		module_put(p->binfmt->module);
bad_fork_cleanup_put_domain:
	module_put(task_thread_info(p)->exec_domain->module);
bad_fork_cleanup_count:
	put_group_info(p->group_info);
	atomic_dec(&p->user->processes);
	free_uid(p->user);
bad_fork_free:
	free_task(p);
fork_out:
	return ERR_PTR(retval);
}

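Many of the fragments that are not greyed out should also be skipped on a first pass, for example data structure manipulation, spinlocks, error handling, and variable initialization.

Where is the key part of copy_process()?

Having traced this far, I first pull out the fragments of the implementation that matter for "going beneath sys_fork()". They are listed below for your reference:

1. The new process descriptor: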

	struct task_struct *p = NULL;

2. "dup" current into p:

	p = dup_task_struct(current);
	if (!p)
		goto fork_out;

3. copy_process() checks whether the system-wide thread count has already reached its limit:

	if (nr_threads >= max_threads)
		goto bad_fork_cleanup_count;

max_threads is computed during fork_init(); see "Jollen's Linux Kernel Sharing Pack, #3: fork_init() (Handout 6)".
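For orientation, the limit can be sketched in user space. The formula below follows what 2.6-era kernel/fork.c does as far as I know (thread structures may use about 1/8 of memory, with a floor of 20); the function name and the two size constants are assumptions of this sketch, so check them against your own kernel tree.

```c
#include <assert.h>

/* Assumed values for i386 with 8 KB kernel stacks; other configurations differ. */
#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_THREAD_SIZE 8192UL

/*
 * User-space sketch of the limit computed in fork_init().
 * mempages is the number of physical page frames.
 */
unsigned long sketch_max_threads(unsigned long mempages)
{
	unsigned long max_threads;

	assert(SKETCH_THREAD_SIZE >= SKETCH_PAGE_SIZE);
	max_threads = mempages / (8 * SKETCH_THREAD_SIZE / SKETCH_PAGE_SIZE);
	if (max_threads < 20)	/* keep at least 20 threads available */
		max_threads = 20;
	return max_threads;
}
```

With these assumed constants, a machine with 128 MB of RAM (32768 pages) ends up with a limit of 2048 threads.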

4. Start "copying" current into the new process:

	if ((retval = copy_semundo(clone_flags, p)))
		goto bad_fork_cleanup_audit;
	if ((retval = copy_files(clone_flags, p)))
		goto bad_fork_cleanup_semundo;
	if ((retval = copy_fs(clone_flags, p)))
		goto bad_fork_cleanup_files;
	if ((retval = copy_sighand(clone_flags, p)))
		goto bad_fork_cleanup_fs;
	if ((retval = copy_signal(clone_flags, p)))
		goto bad_fork_cleanup_sighand;
	if ((retval = copy_mm(clone_flags, p)))
		goto bad_fork_cleanup_signal;
	if ((retval = copy_keys(clone_flags, p)))
		goto bad_fork_cleanup_mm;
	if ((retval = copy_namespace(clone_flags, p)))
		goto bad_fork_cleanup_keys;
	retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
	if (retval)
		goto bad_fork_cleanup_namespace;
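The chain of copy_xxx() calls above, each paired with a goto label, is the usual kernel idiom for undoing partial work in reverse order. A minimal user-space sketch of the same idiom follows; sketch_task, sketch_copy() and sketch_release() are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define NSTEPS 3

/* Records which resources are currently held, for illustration only. */
struct sketch_task {
	int held[NSTEPS];
};

/* Hypothetical stand-in for a copy_xxx() call: fails when step == fail_at. */
static int sketch_copy(struct sketch_task *t, int step, int fail_at)
{
	if (step == fail_at)
		return -1;
	t->held[step] = 1;
	return 0;
}

static void sketch_release(struct sketch_task *t, int step)
{
	t->held[step] = 0;
}

/*
 * Acquire three resources in order. On failure, jump to the label that
 * undoes everything acquired so far, in reverse order, and fall through.
 * Pass fail_at = -1 for the success path.
 */
int sketch_copy_process(struct sketch_task *t, int fail_at)
{
	int retval;

	assert(t != NULL);
	if ((retval = sketch_copy(t, 0, fail_at)))
		goto out;
	if ((retval = sketch_copy(t, 1, fail_at)))
		goto cleanup_0;
	if ((retval = sketch_copy(t, 2, fail_at)))
		goto cleanup_1;
	return 0;

cleanup_1:
	sketch_release(t, 1);
cleanup_0:
	sketch_release(t, 0);
out:
	return retval;
}
```

Each label names the last step that succeeded, which is why a failure in copy_files() in the listing jumps to bad_fork_cleanup_semundo rather than to a label named after files.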

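The Homework Is Not Finished

The point of distilling the essentials is, of course, to make further study both deeper and more efficient. I have used my own learning method to present the line of reasoning behind the "Process Creation" series; I hope this approach is genuinely helpful to you.

Before going further, a few pieces of background knowledge are needed. The keywords are:

process relationship
process descriptor
memory descriptor

The "Process Creation" series stops here for now, but that does not mean it is over: the "homework" will still be completed. I will start a separate series to finish the remaining hacking work.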

Also See

2007.01.11: Process Creation, #4: sys_fork (Kernel Implementation)
2007.01.08: Process Creation, #3: sys_fork (Basic Concepts)
2007.01.02: Process Creation, #2: Running a "User Process"
2006.12.31: Process Creation, #1: Running an External Program from the Shell (Basic Concepts and Examples)

Posted by jollen on January 14, 2007 1:59 AM

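Linux Virtual Memory Areas (VMA): The Overall Picture of Processes and VMAs

The best way to grasp the overall concept of the VMA (Virtual Memory Area) is through a diagram. The figure below shows how a process relates to its VMAs.

Figure: The overall picture of a process and its VMAs

Memory Descriptor

The Linux process descriptor is struct task_struct (include/linux/sched.h). Its mm field points to the memory descriptor, which records the VMA information of the process: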

struct task_struct {
	...
	struct mm_struct *mm;
	...
};

struct mm_struct is the "memory descriptor" data structure that Linux provides. Its definition is shown below:

struct mm_struct {
	struct vm_area_struct * mmap;	/* list of VMAs */
	struct rb_root mm_rb;
	struct vm_area_struct * mmap_cache;	/* last find_vma result */
	unsigned long (*get_unmapped_area) (struct file *filp,
				unsigned long addr, unsigned long len,
				unsigned long pgoff, unsigned long flags);
	void (*unmap_area) (struct mm_struct *mm, unsigned long addr);
	unsigned long mmap_base;		/* base of mmap area */
	unsigned long task_size;		/* size of task vm space */
	unsigned long cached_hole_size;      /* if non-zero, the largest hole below free_area_cache */
	unsigned long free_area_cache;	/* first hole of size cached_hole_size or larger */
	pgd_t * pgd;
	atomic_t mm_users;			/* How many users with user space? */
	atomic_t mm_count;			/* How many references to "struct mm_struct" (users count as 1) */
	int map_count;				/* number of VMAs */
	struct rw_semaphore mmap_sem;
	spinlock_t page_table_lock;		/* Protects page tables and some counters */

	struct list_head mmlist;		/* List of maybe swapped mm's.  These are globally strung
					 * together off init_mm.mmlist, and are protected
					 * by mmlist_lock
					 */

	/* Special counters, in some configurations protected by the
	 * page_table_lock, in other configurations by being atomic.
	 */
	mm_counter_t _file_rss;
	mm_counter_t _anon_rss;

	unsigned long hiwater_rss;	/* High-watermark of RSS usage */
	unsigned long hiwater_vm;	/* High-water virtual memory usage */

	unsigned long total_vm, locked_vm, shared_vm, exec_vm;
	unsigned long stack_vm, reserved_vm, def_flags, nr_ptes;
	unsigned long start_code, end_code, start_data, end_data;
	unsigned long start_brk, brk, start_stack;
	unsigned long arg_start, arg_end, env_start, env_end;

	unsigned long saved_auxv[AT_VECTOR_SIZE]; /* for /proc/PID/auxv */

	unsigned dumpable:2;
	cpumask_t cpu_vm_mask;

	/* Architecture-specific MM context */
	mm_context_t context;

	/* Token based thrashing protection. */
	unsigned long swap_token_time;
	char recent_pagein;

	/* coredumping support */
	int core_waiters;
	struct completion *core_startup_done, core_done;

	/* aio bits */
	rwlock_t		ioctx_list_lock;
	struct kioctx		*ioctx_list;
};

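As the name suggests, the memory descriptor is the data structure that describes the memory of a process. struct mm_struct has a field named mmap, a pointer to struct vm_area_struct that heads the list of VMAs; this is the VMA data structure introduced in "Linux Virtual Memory Areas (VMA): Basic Concepts".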
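To make the "list of VMAs" concrete, here is a small user-space sketch of walking such a list to find the area for an address. The struct and function names are simplified stand-ins invented for this example; the real kernel also keeps the areas in a red-black tree (mm_rb) and caches the last lookup (mmap_cache).

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for vm_area_struct and mm_struct. */
struct sketch_vma {
	unsigned long vm_start;		/* first address of the area */
	unsigned long vm_end;		/* first address after the area */
	struct sketch_vma *vm_next;	/* next area, sorted by address */
};

struct sketch_mm {
	struct sketch_vma *mmap;	/* head of the sorted VMA list */
	int map_count;			/* number of VMAs */
};

/*
 * Return the first VMA whose vm_end is above addr, or NULL.
 * As with the kernel's find_vma(), the result does not necessarily
 * contain addr; the caller must still check vm_start <= addr.
 */
struct sketch_vma *sketch_find_vma(const struct sketch_mm *mm, unsigned long addr)
{
	struct sketch_vma *vma;

	assert(mm != NULL);
	for (vma = mm->mmap; vma != NULL; vma = vma->vm_next)
		if (addr < vma->vm_end)
			return vma;
	return NULL;
}
```

A linear walk is enough to show the idea; it is not meant to reflect the performance characteristics of the kernel's lookup.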

The Mapping Between VMAs and the ELF Image

As described in "Linux Virtual Memory Areas (VMA): Basic Concepts", the VMA mappings of a process can be read from the /proc/<pid>/maps file. For example, the VMA mapping of pid 1 (init) is:

$ cat /proc/1/maps
08048000-0804e000 r-xp 00000000 08:01 12118      /sbin/init
0804e000-08050000 rw-p 00005000 08:01 12118      /sbin/init
08050000-08054000 rwxp 00000000 00:00 0
40000000-40016000 r-xp 00000000 08:01 52297      /lib/ld-2.2.4.so
40016000-40017000 rw-p 00015000 08:01 52297      /lib/ld-2.2.4.so
40024000-40025000 rw-p 00000000 00:00 0
40025000-40157000 r-xp 00000000 08:01 58241      /lib/i686/libc-2.2.4.so
40157000-4015c000 rw-p 00131000 08:01 58241      /lib/i686/libc-2.2.4.so
4015c000-40160000 rw-p 00000000 00:00 0
bfffe000-c0000000 rwxp fffff000 00:00 0

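This listing can be used to explain the relationship between VMAs and ELF images. Read together with the figure above, the rows map as follows:

Row 1 is the VMA mapping of the code section of the ELF executable (/sbin/init);
Row 2 is the VMA mapping of the data section of the ELF executable;
Row 3 is the VMA mapping of the .bss section of the ELF executable.
Row 4 is the VMA mapping of the code section of the dynamic loader (/lib/ld-2.2.4.so);
Row 5 is the VMA mapping of the data section of the dynamic loader;
Row 6 is the VMA mapping of the .bss section of the dynamic loader.
Row 7 is the VMA mapping of the code section of libc;
Row 8 is the VMA mapping of the data section of libc;
Row 9 is the VMA mapping of the .bss section of libc.
Row 10 (bfffe000-c0000000), at the top of the user address space, is the process stack.

Note also that the "code section" and "data section" mentioned here are not necessarily the ELF .text and .data sections; "code section" stands for all executable sections, and "data section" for the sections that hold data.

Throughout the VMA discussion we deal only with the code section and the data section (as in the figure). The .bss section is, in principle, better treated separately, together with its kernel implementation.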
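If you want to inspect these rows from a program rather than by eye, each line of /proc/<pid>/maps can be split with sscanf(). The sketch below is one way to do it; struct maps_row and parse_maps_line() are names made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed row of /proc/<pid>/maps (field names are this sketch's own). */
struct maps_row {
	unsigned long start, end;	/* address range of the VMA */
	char perms[5];			/* e.g. "r-xp" */
	unsigned long offset;		/* offset into the backing file */
	char path[256];			/* backing file, "" for anonymous areas */
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
int parse_maps_line(const char *line, struct maps_row *row)
{
	unsigned int major, minor;
	unsigned long inode;
	int n;

	assert(line != NULL && row != NULL);
	row->path[0] = '\0';
	n = sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %255s",
		   &row->start, &row->end, row->perms, &row->offset,
		   &major, &minor, &inode, row->path);
	/* Anonymous areas have no path, so seven converted fields are enough. */
	return (n >= 7) ? 0 : -1;
}
```

An empty path together with an rw-p or rwxp permission string corresponds to the anonymous rows (3, 6, 9 and 10) in the listing.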

Also See

2007.01.05: Linux Virtual Memory Areas (VMA): Basic Concepts

Posted by jollen on January 15, 2007 3:27 PM

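Race Conditions in Shared Memory

While discussing the basic shared memory mechanism today, the topic of synchronization came up. Shared memory has a synchronization problem, mainly because Linux does not itself synchronize access to shared memory. The following looks at this purely from the user-space programming point of view.

First, write the following three small programs:

shm_allocate.c, which requests the shared memory
shm_read.c, which reads the shared memory
shm_write.c, which writes to the shared memory

These programs can be downloaded from [http://tw.jollen.org/ipc-programming/shm_race.tar.bz2]. They are used as follows: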

# ./shm_allocate jollen
Shared Memory Segment ID: 98306

First run shm_allocate to allocate the shared memory and fill in an arbitrary initial string; the program prints the Segment ID of the shared memory. Next:

# ./shm_read 98306
Message of Shared Memory: jollen
Message of Shared Memory: jollen
Message of Shared Memory: jollen
Message of Shared Memory: jollen
...

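Then run shm_read, passing the Segment ID of the shared memory as the command-line argument. shm_read reads the contents of the shared memory continuously. After that, run the shared memory writer in another terminal.

We use shm_write to write the strings [00000000,11111111,22222222,...,99999999] into the shared memory and watch the output. Because every string that shm_write stores consists of a single repeated digit, output like the following shows that a race condition exists between shm_read and shm_write: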

...
Message of Shared Memory: 98888888
Message of Shared Memory: 33333322
Message of Shared Memory: 11111111
Message of Shared Memory: 66666666
Message of Shared Memory: 44443333
Message of Shared Memory: 77777666
...

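Therefore, when shared memory is used as an IPC mechanism, a synchronization algorithm must be implemented. At the algorithm level, three cases can be distinguished:

All Read: no race condition to consider.
1-Write, n-Read (n >= 1)
n-Write, n-Read (n >= 1)

From this we can see that as soon as anyone writes to the shared memory, race conditions must be considered. Thinking about the problem at the algorithm level, the following are relatively simple approaches:

1-Write, n-Read (n >= 1): a lock file mechanism can be used.
n-Write, n-Read (n >= 1): if waiting on the lock file is done by polling, it must be combined with a queue (or circular queue) to solve the problem effectively.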
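When experimenting with a locking scheme, it helps to detect the broken reads automatically instead of watching the terminal. The sketch below relies only on the property stated above, that shm_write stores strings of one repeated digit; is_torn_message() is a hypothetical helper, not part of the downloadable programs, and it would also flag the initial string written by shm_allocate.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: a snapshot that mixes different characters was
 * read in the middle of a write. Returns 1 for such a torn snapshot,
 * 0 for a consistent (or empty) one.
 */
int is_torn_message(const char *msg)
{
	size_t i;

	assert(msg != NULL);
	if (msg[0] == '\0')
		return 0;
	for (i = 1; msg[i] != '\0'; i++)
		if (msg[i] != msg[0])
			return 1;
	return 0;
}
```

Counting how often this returns 1 before and after adding a lock gives a quick, if rough, indication of whether the synchronization is working.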

When I get the chance, I will share the code a friend wrote.

Posted by jollen on January 16, 2007 3:54 PM

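Embedded Linux Testing: The Bootstrap Root Filesystem (x86) Stage, Program Execution Test

This post only describes a test method used during implementation. For the procedure, steps, concepts, and underlying ideas of the "bootstrap root filesystem", please consult the relevant documents.

Background: Starting from Busybox

The outline of this stage is as follows.

Download the source code from the official Busybox site: http://busybox.net/downloads/busybox-1.3.1.tar.bz2. If this is your first time working with an open source package, there is one important idea to understand: for any open source package, the following two files are required reading:

README
INSTALL

Most packages include these two files, and much important information is written in them. After unpacking Busybox, first check that the following utilities are selected:

init process
shell (prefer ash)

Also set the install prefix to the empty root filesystem directory created beforehand. Then build. Errors may occur during the build; for now, deselect the options that cause them, get Busybox to compile, and install it into the root filesystem directory.

How do I test that the root filesystem is correct at this stage?

Assuming the root filesystem is in /tmp/busybox_project/rootfs, running chroot directly is a good first test. chroot requires root privileges, so run it as root:

# chroot /tmp/busybox_project/rootfs/ /bin/sh

If the following message appears, the root filesystem is not usable:

chroot: /bin/sh: No such file or directory

The following is what a correct run looks like: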

# chroot /tmp/busybox_project/rootfs/ /bin/sh


BusyBox v1.3.1 (2007-01-17 14:42:55 CST) Built-in shell (ash)
Enter 'help' for a list of built-in commands.

# ls
bin      lib      linuxrc  sbin     usr

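Once in the shell, you can run the applications in the root filesystem. This kind of test mainly exercises the applications in the root filesystem; its purposes may include:

Checking that the shell and applications run normally.
Checking that the library dependencies are correct.
Checking that applications produce correct results.
Checking that application configuration files are set up correctly.
Finding out whether the root filesystem still lacks any components (files, tools, configuration files, and so on).

Common Causes

Possible causes of errors at this stage are:

The shell is missing.
Library dependency problems.

This test method cannot really exercise problems in the init process stage, which is why the first program executed after changing the root directory should be the shell.

To practice the chroot test, you need to download Busybox and build a basic root filesystem yourself. I also provide the result of this stage [http://tw.jollen.org/root-filesystem/busybox_project_001.tar.bz2] so that you can practice using chroot.

Posted by jollen on January 17, 2007 3:43 PM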

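An SD/SDIO Development Board

Jinvani Systech has released an SD/SDIO development platform. What is special is that it is a PCI-based host controller card, which makes it very useful. See the full story on LinuxDevices.com [http://www.linuxdevices.com/news/NS7542044278.html].

I call this platform "very useful" because, in SD/SDIO projects, developing and porting the host controller is not the primary task. The important work is instead:

1. Device drivers for SD/SDIO peripheral cards, such as WiFi cards, Bluetooth, GPS modems, and SD memory.

2. Applications for SD/SDIO cards, such as digital photo frame software (based on SD memory cards).

So being able to plug a card like this into the host (x86) makes life much easier: it simplifies part of the application testing and speeds up SDIO device driver development. Previously, even a plain SD/SDIO application always had to be tested on the real target, which was inconvenient. Drivers could not be tested on the host with an emulator (for example Qemu) either, so they had to be loaded onto the target board, which was laborious and awkward to debug.

For Embedded Linux application development this SD/SDIO board is a real help. Its specifications are as follows (excerpted from the LinuxDevices.com story):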

Supports two SD card slots

SD memory and SDIO capable

SD Host 1.0 compatible

Supports high capacity / high speed SD memory cards

Supports security commands

Supports SD 1-bit and 4-bit modes (no SPI)

Supports SD Sleuth II

In addition, this SD/SDIO development board also supports booting from SD memory!

Posted by jollen on January 19, 2007 3:49 PM

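An Overview of Current SD/MMC/SDIO Support in Linux (Open Source)

SD (Secure Digital) and MMC (MultiMediaCard)

SD is a flash memory card standard, the familiar SD memory card. MMC is an earlier memory card standard that has now been superseded by SD. Wikipedia has a fairly detailed description of the SD/MMC specifications: [http://zh.wikipedia.org/wiki/Secure_Digital].

SDIO (Secure Digital I/O)

SDIO is the technology we currently care more about. As the name suggests, SDIO is the I/O interface of SD, although that explanation may still be a little abstract. More concretely: SD began as a memory card standard, but an SD slot can now also accept peripheral cards, and that technology is SDIO.

SDIO itself is therefore a fairly simple technology: external peripherals are connected through the SD I/O pins and exchange data with the host over the SD data lines. Members of the SD Association have also released fairly complete SDIO stack drivers, which has made developing and using SDIO peripherals (which we call SDIO cards) quite popular.

A great many mobile phones and handheld devices already support SDIO (the SD standard was designed for mobile devices in the first place), and many SDIO peripherals have been developed. This makes it easier to attach peripherals to a phone and allows more flexible designs (the peripheral need not be built in). Common SDIO peripherals (SDIO cards) include:

Wi-Fi card (wireless network card)
CMOS sensor card (camera module)
GPS card
GSM/GPRS modem card
Bluetooth card
Radio/TV card (great fun)

SDIO is expected to become one of the most important interface technologies for embedded systems, and to replace today's GPIO-based SPI interfaces.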

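SD/SDIO Transfer Modes

SD has the following three transfer modes:

SPI mode (required)
1-bit mode
4-bit mode

SDIO supports the same three transfer modes. According to the SD standard, every SD (memory card) and SDIO (peripheral) device must support SPI mode, which is why SPI mode is marked "required". In addition, early MMC cards (which use SPI transfers) can also be plugged into an SD slot and read in SPI mode or 1-bit mode.

The MMC Mode of SD

SD can also read MMC memory. The MMC standard says that MMC memory need not support SPI mode (though it must support 1-bit mode), but the MMC cards actually on the market all support SPI mode. We can therefore configure SD for SPI mode transfers to read MMC memory cards.

The MMC mode of SD is the transfer mode used to read MMC cards. Although it also uses SPI mode, its physical characteristics differ:

The maximum transfer rate of MMC SPI mode is 20 Mbit/s;
The maximum transfer rate of SD SPI mode is 25 Mbit/s.

To avoid confusion, the notations SPI/MMC mode and SPI/SD mode are sometimes used to distinguish the two clearly.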

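The Linux SD/MMC Driver

Linux 2.6.17 officially added the SD/MMC driver, and the "official" release can of course only be used on ARM platforms. Setting host controller support aside, the current state of Linux SD/MMC/SDIO support (Core API; open source compatible for SD/MMC/SDIO) is as follows:

1. MMC is supported. The open source implementation is the SD-MMC driver.

2. The Core API "can" support SD in SPI mode only. Since SPI mode can be supported, 1-bit mode "can be made to work" as well. Extending 1-bit mode to 4-bit mode has been done by people on the net, but it raises licensing issues.

3. For SDIO, the only support seen so far is for the Atheros Wi-Fi card (available on sourceforge, with source code officially released).

4. The complete SD stack must be paid for. Note in particular that the SD Card Association product license agreement does "not" allow open source driver implementations. So there is no need to keep asking why there is no open source SD driver; Linux is not to blame!

Note that the above describes "official" support only and does not include the various patches shared by enthusiasts.

Posted by jollen on January 19, 2007 5:32 PM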

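[Notes] The MMC Core in Linux 2.6

The following is compiled from Jollen's notes (not a tutorial). Many points are not fully explained, and I must ask readers to fill those in themselves. This post is shared for those who want to study the Linux MMC driver implementation (MMC Core). The analysis is based on Linux 2.6.17.7; newer kernels have added many patches (for example the SDHC patch in Linux 2.6.19), which are outside the scope of this discussion.

Continuing from the previous post, "An Overview of Current SD/MMC/SDIO Support in Linux (Open Source)": strictly speaking, Linux SD/MMC/SDIO currently supports only MMC memory cards, and using an SD memory card comes with many restrictions.

From the Linux driver point of view, and looking only at the MMC part, the Linux SD/MMC driver layer consists of the following (Kconfig):

CONFIG_MMC
CONFIG_MMC_BLOCK

The relevant files are in the drivers/mmc/ directory. The Makefile shows which files implement what: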

#                                                       
# Core                                                  
#                                                       
obj-$(CONFIG_MMC)               += mmc_core.o           
                                                        
#                                                       
# Media drivers                                         
#                                                       
obj-$(CONFIG_MMC_BLOCK)         += mmc_block.o          
                                                        
#                                                       
# Host drivers                                          
#                                                       
obj-$(CONFIG_MMC_ARMMMCI)       += mmci.o               
obj-$(CONFIG_MMC_PXA)           += pxamci.o             
obj-$(CONFIG_MMC_IMX)           += imxmmc.o             
obj-$(CONFIG_MMC_SDHCI)         += sdhci.o              
obj-$(CONFIG_MMC_WBSD)          += wbsd.o               
obj-$(CONFIG_MMC_AU1X)          += au1xmmc.o            
obj-$(CONFIG_MMC_OMAP)          += omap.o               
obj-$(CONFIG_MMC_AT91RM9200)    += at91_mci.o           
                                                        
mmc_core-y := mmc.o mmc_queue.o mmc_sysfs.o

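Leaving the host controller drivers aside for now, the implementation files of the MMC Core API layer are:

drivers/mmc/mmc.c: the main MMC command and protocol implementation.
drivers/mmc/mmc_queue.c: the I/O request queue implementation.
drivers/mmc/mmc_sysfs.c: the Linux 2.6 kobject and sysfs implementation.
drivers/mmc/mmc_block.c: the block layer implementation, that is, the file operation part that forms the interface to user space.

So the MMC Core layer consists of the following source files:

drivers/mmc/mmc.c
drivers/mmc/mmc_queue.c
drivers/mmc/mmc_sysfs.c

On the block layer side, mmc_block.c registers itself with the kernel as a block device and creates a devfs directory: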

static struct mmc_driver mmc_driver = {
	.drv		= {
		.name	= "mmcblk",
	},
	.probe		= mmc_blk_probe,
	.remove		= mmc_blk_remove,
	.suspend		= mmc_blk_suspend,
	.resume		= mmc_blk_resume,
};

static int __init mmc_blk_init(void)
{
	int res = -ENOMEM;

	res = register_blkdev(major, "mmc");
	if (res < 0) {
		printk(KERN_WARNING "Unable to get major %d for MMC media: %d\n",
		       major, res);
		goto out;
	}
	if (major == 0)
		major = res;

	devfs_mk_dir("mmc");
	return mmc_register_driver(&mmc_driver);

 out:
	return res;
}
...
module_init(mmc_blk_init);

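mmc_register_driver() registers with the MMC Core layer, and the MMC Core in turn registers with the driver model (kobject). Anyone who has studied Linux 2.6 drivers knows that the Core API layer must call driver_register() to register as a driver, while the low-level (machine-dependent) host controller driver must register as a platform driver.

Because the driver model calls back the probe method of the driver (mmc_blk_probe, as set in struct mmc_driver above), mmc_blk_probe() is the entry point of the MMC block layer, and everything the MMC block layer does should be read starting from that function. The MMC block layer in Linux 2.6.17.7 uses the familiar genhd.c layer.

As for the most important step in a Linux block driver, initializing the I/O request queue, it is likewise done during mmc_blk_probe(), by calling mmc_init_queue() in the MMC Core layer.

With the overall Linux MMC architecture understood, you can begin to study in depth how the specification is implemented.

Posted by jollen on January 20, 2007 12:06 AM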

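OpenMoko Prepares to Go on the Offensive

Two days ago OpenMoko released its latest roadmap. On January 20, OpenMoko project leader Sean Moss-Pultz published "Free Your Phone", which says that Neo1973/OpenMoko will "free the phone" in three phases. The first to arrive is "2007-02-11 Phase 0: Developer Preview".

In this first phase, developers will be able to obtain the OpenMoko source code; the complete OpenMoko Linux distribution and its source will be opened. The hardware specifications of the FIC Neo1973 phone are: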

* 120.7 x 62 x 18.5 (mm)
* 2.8" VGA (480x640) TFT Screen
* Samsung s3c2410 SoC @ 266 MHz
* Global Locate AGPS chip
* Ti GPRS (2.5G not EDGE)
* Unpowered USB 1.1
* Touchscreen
* micro-sd slot
* 2.5mm audio jack
* 2 additional buttons
* 1200 mAh battery (charged over USB)
* 128 MB SDRAM
* 64 MB NAND Flash
* Bluetooth (2.0)

On the software side, the OpenMoko Linux distribution is built with OpenEmbedded and uses the following major packages:

* Linux 2.6.17.14
* gcc 4.1.1
* binutils 2.17.50.0.5
* glibc 2.4
* Xorg 7.1
* glib 2.6.4
* gtk 2.6.10
* dbus 0.9
* eds
* (more)

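OpenMoko itself is a complete application framework for Linux mobile phones. It is based on GTK+2 / Dbus / GConf / EDS and provides C APIs for phone developers. The UI of the OpenMoko application framework is based on GObject and provides the following UI classes (quoted from the original):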

* MokoApplication -- base application class
* MokoPanedWindow -- base class for stylus main windows
* MokoMenuBox -- menu widget holding application and filter menu
* MokoFingerWindow -- base class for finger main windows
* MokoFingerWheel -- rotary finger wheel
* MokoFingerToolBox -- finger tool box
* (more)

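The second phase of the OpenMoko release plan is "2007-03-11 Phase 1: Official Developer Launch", at which point the Neo1973 phone can be ordered online from OpenMoko.com for US$ 350 each. The third phase is "2007-09-11 Phase 2: Mass Market Sale", so OpenMoko's big push looks set for September of this year. Something to look forward to!

Judging from the discussions of the past few months and from the news announced two days ago, the core spirit of "Mobile 2.0" that OpenMoko set out in its original presentation is even clearer: Open, open, open; Community, community, community.

I believe this is an open source project worth looking forward to.

Further Reading

2006.11.08: "Thoughts on Mobile 2.0" and the First Linux Smartphone to Use OpenMoko

Posted by jollen on January 22, 2007 8:16 PM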

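Building an ARM9 Bootstrap Root Filesystem

I have updated the tutorial "Building ARM9 Bootstrap Root Filesystem" and am sharing it here; corrections are welcome so that the document can be improved. The text below was converted from MS-Word, so please excuse the formatting.

"Jollen's Root Filesystem Construction Series"

Building an ARM9 Bootstrap Root Filesystem

Author: 陳俊宏

http://www.jollen.org

Last updated: 2007/1/23

Provided that the source is fully credited (see the attribution instructions), you are granted permission to repost and quote immediately, without notifying the author.

Purpose

Build a bootstrap root filesystem (base root filesystem) that provides the simplest, most minimal bootable environment. The finished system boots to a shell and can use the commands provided by busybox.

Preparation

First, prepare a host development environment with a cross toolchain installed. This article tests on real hardware, so if you do not have an ARM9 development board, consider using Qemu for emulated testing instead.

The walkthrough below excerpts only the key commands; you may need to adjust their order or parameters to fit your own overall workflow.

Step 1: Create the working directory

Create a dedicated working directory named arm9.so-busybox/: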

# mkdir arm9.so-busybox/

# cd arm9.so-busybox/

Next, create 5 subdirectories under arm9.so-busybox/:

# mkdir src/ install/ mnt/ pub/ build/

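When actually building a root filesystem, we should get into the habit of keeping files sorted by category. In this project, build/ is used for compiling programs, src/ holds the source code, and install/ holds the final root filesystem.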

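Step 2: Create the directory structure

Following the FHS directory layout, create the directory hierarchy under the root filesystem directory (install/). lib/ is included because Step 5 copies libraries into it:

# cd install/

# mkdir bin/ dev/ etc/ lib/ mnt/ proc/ sbin/ tmp/ usr/ var/

/var and /tmp are two further required directories. Both need to be writable, so here they are only created as empty mount points and are mounted from ramdisks at boot.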

Step 3: Create the device files

Create the necessary device files under dev/ in the root filesystem:

crw------- 1 root root  5, 1 Jan  1  1970 console

crw------- 1 root root 29, 0 Jan  1  1970 fb0

crw------- 1 root root  1, 3 Jan  1  1970 null

brw------- 1 root root  1, 0 Jan  1  1970 ram0

crw------- 1 root root  5, 0 Jan  1  1970 tty

crw------- 1 root root  4, 0 Jan  1  1970 tty0

This step is done with the mknod command. First change to the dev/ directory of the root filesystem, then run the following commands:

# mknod console c 5 1

# mknod fb0 c 29 0

# mknod null c 1 3

# mknod ram0 b 1 0

# mknod tty c 5 0

# mknod tty0 c 4 0

Where a large number of device files must be created, the '-D' option of genext2fs can be used instead. See Jollen's Blog: [Using genext2fs '-D' (device file table) to build a root filesystem].
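Once the list of nodes grows, it is less error-prone to keep it in a table and generate the commands from it, which also guards against leaving out the type letter. The sketch below shows the idea for the six nodes above; the table layout and format_mknod() are inventions of this example and are not the genext2fs device table format.

```c
#include <assert.h>
#include <stdio.h>

/* One device node: name, type ('c' or 'b'), major and minor numbers. */
struct dev_entry {
	const char *name;
	char type;
	unsigned int major, minor;
};

/* The six nodes from the listing above. */
static const struct dev_entry dev_table[] = {
	{ "console", 'c',  5, 1 },
	{ "fb0",     'c', 29, 0 },
	{ "null",    'c',  1, 3 },
	{ "ram0",    'b',  1, 0 },
	{ "tty",     'c',  5, 0 },
	{ "tty0",    'c',  4, 0 },
};

/* Write the mknod command for one entry into buf; returns snprintf's result. */
int format_mknod(const struct dev_entry *e, char *buf, size_t len)
{
	assert(e != NULL && buf != NULL);
	assert(e->type == 'c' || e->type == 'b');
	return snprintf(buf, len, "mknod %s %c %u %u",
			e->name, e->type, e->major, e->minor);
}
```

Looping over dev_table and printing each formatted line yields a small script that can be run inside dev/ as root.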

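Step 4: Add Busybox

Compile and install Busybox (dynamically linked). Unpack the Busybox source into the src/ subdirectory of the project directory. A few notes:

This tutorial uses Busybox 1.3.1.
Starting with Busybox 1.3.0, the Makefile of the Linux kernel is used (because CONFIG_DESKTOP is now supported). For cross compilation, modify the Makefile as follows: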

ARCH ?= arm

CROSS_COMPILE ?= /opt/crosstool/gcc-3.4.1-glibc-2.3.3/arm-9tdmi-linux-gnu/bin/arm-9tdmi-linux-gnu-

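CROSS_COMPILE is set to the "PREFIX" of the cross toolchain and depends on your toolchain. You can download the GCC 3.4.1 ARM9 toolchain used here from http://www.jollen.org/kit/ so that exactly the same modification applies.

Busybox integrates commonly used commands and tools, and can be configured to select the features we need. Enter the Busybox configuration menu: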

# make menuconfig

Note that init and a shell are mandatory; check that both are selected. Also remember to set the Busybox installation path to our root filesystem directory.

Then build (cross compile):

# make

When the build finishes, install Busybox into our root filesystem directory (the install/ directory from Step 2):

# make install

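At this point you should see the files installed by Busybox under the root filesystem directory.

Step 5: Add the shared libraries

The compiled Busybox is already in a format that runs on ARM9, but our build configuration links Busybox against shared libraries, so Busybox needs the following files at run time:

˙ libc.so.6: the standard C library.

˙ ld-linux.so.2: the native dynamic loader.

Copy these two files from the toolchain into the lib/ directory of the root filesystem: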

# cd ../../install    (change to the root of the root filesystem)

# cp /opt/crosstool/gcc-3.4.1-glibc-2.3.3/arm-9tdmi-linux-gnu/arm-9tdmi-linux-gnu/lib/ld-linux.so.2 lib/    (copy the native dynamic loader; enter the command on one line)

# cp /opt/crosstool/gcc-3.4.1-glibc-2.3.3/arm-9tdmi-linux-gnu/arm-9tdmi-linux-gnu/lib/libc.so.6 lib/    (copy the C library; enter the command on one line)

Depending on the version and the selected options, Busybox may need more libraries. Use the objdump of the cross toolchain to check the library dependencies of Busybox (ldd cannot be used), and copy the required libraries from the toolchain into lib/ of the root filesystem.

Step 6: Add system files

Add two important system files under etc/:

˙ fstab: the mount table.

˙ inittab: the init table.

The contents of etc/fstab are:

/dev/ram0 / ext2 defaults 1 1

none /proc proc defaults 0 0

/dev/ram1 /tmp ramfs defaults 0 0

/dev/ram2 /var ramfs defaults 0 0

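The first fstab line re-mounts /dev/ram0 as '/' (root) in order to specify ext2 as the filesystem of '/'. The last two lines mount the two important directories /var and /tmp as ramfs, so that /tmp and /var can be written to even if root is not remounted at boot (see below).

The contents of etc/inittab are: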

:0:sysinit:/etc/rc.d/rc.init

:0:respawn:/bin/sh

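According to this inittab, the init process does two things after boot: (1) it runs /etc/rc.d/rc.init, which is the "init script"; (2) it runs /bin/sh, that is, it enters the shell. The run level field is written as 0, but note that Busybox init ignores the run level field, so there is no real run level switching here.

We do not follow the LSB standard for run levels here, nor do we use getty to let users log in (multi-user mode).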
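Each inittab line has four colon-separated fields, id:runlevels:action:process. The sketch below splits one line into those fields; struct inittab_entry and parse_inittab_line() are names made up for this example and are not taken from the Busybox source.

```c
#include <assert.h>
#include <string.h>

struct inittab_entry {
	char id[32];
	char runlevels[32];
	char action[32];
	char process[128];
};

/* Copy the text before the next ':' into dst; return a pointer past ':' or NULL. */
static const char *take_field(const char *s, char *dst, size_t len)
{
	const char *colon = strchr(s, ':');
	size_t n;

	if (colon == NULL)
		return NULL;
	n = (size_t)(colon - s);
	if (n >= len)
		return NULL;
	memcpy(dst, s, n);
	dst[n] = '\0';
	return colon + 1;
}

/* Split "id:runlevels:action:process"; returns 0 on success, -1 on error. */
int parse_inittab_line(const char *line, struct inittab_entry *e)
{
	assert(line != NULL && e != NULL);
	if ((line = take_field(line, e->id, sizeof(e->id))) == NULL)
		return -1;
	if ((line = take_field(line, e->runlevels, sizeof(e->runlevels))) == NULL)
		return -1;
	if ((line = take_field(line, e->action, sizeof(e->action))) == NULL)
		return -1;
	if (strlen(line) >= sizeof(e->process))
		return -1;
	strcpy(e->process, line);
	return 0;
}
```

For the first line of the inittab above this yields an empty id, runlevels "0", action "sysinit", and process "/etc/rc.d/rc.init".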

Step 7: Write the init script

According to inittab, the init script of our root filesystem is /etc/rc.d/rc.init. Here is a template init script for Embedded Linux:

#!/bin/sh

# automount (/etc/fstab)

mount -a

# remount root

mount -o remount,rw /

#

mkdir /var/lock

mkdir /var/lock/subsys

mkdir /var/run

# start other applications (run applications automatically during boot)

# e.g. /bin/thttpd -p 80 -d /var/www

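When "mount -a" runs, mount reads the fstab from the previous step and performs the mounts listed there. There is also a remount:

# mount -o remount,rw /

Its purpose is to remount root ('/') read-write. This step is optional; if you omit it, make sure /var and /tmp remain writable (implementing them as ramdisks is recommended).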

If the root filesystem contains no inittab, Busybox uses the following built-in settings:

::sysinit:/etc/init.d/rcS

::askfirst:/bin/sh

::ctrlaltdel:/sbin/reboot

::shutdown:/sbin/swapoff -a

::shutdown:/bin/umount -a -r

::restart:/sbin/init

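Even so, writing your own inittab is recommended.

Step 8: Create the root filesystem image file

By now our filesystem has the basic system commands and tools. Next, we can turn the finished root filesystem into an ext2 image file.

Two ways of making an ext2fs image file are given below: (1) the manual way; (2) using the genext2fs tool.

The traditional manual way comes first. Use dd to create an empty file of 4 MB: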

# dd if=/dev/zero of=ext2fs bs=1k count=4096

We name the file ext2fs, and then make an ext2 filesystem in it:

# mkfs.ext2 ext2fs

mke2fs 1.32 (09-Nov-2002)

ext2fs is not a block special device.

Proceed anyway? (y,n) y

After answering y, the following output appears:

Filesystem label=

OS type: Linux

Block size=1024 (log=0)

Fragment size=1024 (log=0)

128 inodes, 1024 blocks

51 blocks (4.98%) reserved for the super user

First data block=1

1 block group

8192 blocks per group, 8192 fragments per group

128 inodes per group

Writing inode tables: done

Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or

180 days, whichever comes first. Use tune2fs -c or -i to override.

We now have an empty image file formatted as ext2. All that remains is to copy the root filesystem built earlier "into" the ext2fs image.

First mount ext2fs on any empty directory, for example mnt/:

# mkdir mnt/

# mount -t ext2 -o loop ext2fs mnt/    (specify ext2 as the filesystem type)

To copy the filesystem we use tar rather than cp:

# cd install/

# tar cz * > ../install.tar.gz    (pack the filesystem into a tarball, which also serves as a backup of the root filesystem)

# cd ..

# cd mnt/

# tar zxvf ../install.tar.gz    (then unpack the tarball into the image)

Then umount the image and compress it:

# cd ..

# umount mnt/

# gzip -9c ext2fs > pub/ext2fs.gz

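The resulting ext2fs.gz is the finished product. Note that if you do not use tar, you should use cpio to copy the files and avoid a plain cp, which does not preserve device files, ownership, and symbolic links.

Using genext2fs

genext2fs is a tool for creating ext2 filesystem image files, which makes it easy to turn a root filesystem into an image file. Download the source package from the official genext2fs site: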

http://genext2fs.sourceforge.net/

After compiling you get the genext2fs binary. The following command turns the install/ directory into an ext2fs image file:

# genext2fs -b 8192 -i 1024 -d install/ ext2fs

This produces an image file named ext2fs of size 8 MB ('-b' specifies the image size in blocks, 1 KB each here); then compress ext2fs with gzip as before.
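Both image sizes in this step come from the same arithmetic, block size times block count. A tiny sketch, assuming 1 KB blocks in both cases as in the commands above (image_bytes() is a name made up for this example):

```c
#include <assert.h>

/* Size in bytes of an image made of `count` blocks of `block_size` bytes. */
unsigned long image_bytes(unsigned long block_size, unsigned long count)
{
	assert(block_size > 0);
	return block_size * count;
}
```

So dd with bs=1k count=4096 gives 4 MB, and genext2fs with -b 8192 gives 8 MB. The image must also fit in the ramdisk size configured in the kernel.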

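Step 9: Test on the target

This step uses Jollen-Kit! as the example. Jollen-Kit! is an ARM9 training board from www.jollen.org; see [http://www.jollen.org/kit/] for details. Note that this stage differs from one target device to another, so the demonstration below applies only to Jollen-Kit! or other SMDK2410 platforms.

The ext2fs.gz from Step 8 must be wrapped in the U-Boot image format before U-Boot can load it into RAM as the initial ramdisk (initrd) of the kernel: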

# mkimage -A arm -O linux -T ramdisk -C none -a 0x30800000 -e 0x30800000 -n ramdisk -d ext2fs.gz urootfs.img

This produces urootfs.img. For convenience during testing we can load urootfs.img straight into RAM; the U-Boot commands are:

jollen.org # tftpboot 32000000 urootfs.img; tftpboot 30F00000 uimage.img; bootm 30F00000 32000000

urootfs.img is the root filesystem we built, and uimage.img is the (pre-built) Linux kernel for Jollen-Kit!.
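What mkimage adds in front of ext2fs.gz is a 64-byte header that bootm later checks. As a rough sketch, assuming the legacy U-Boot image format with big-endian fields, the header can be described and checked as follows; the struct is modelled on U-Boot's image header, while is_uimage() and uimage_load_addr() are helper names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IH_MAGIC 0x27051956UL	/* legacy uImage magic number */
#define IH_NMLEN 32		/* length of the image name field */

/* Layout of the 64-byte legacy header; multi-byte fields are big-endian. */
struct uimage_header {
	uint32_t ih_magic;	/* magic number */
	uint32_t ih_hcrc;	/* header CRC */
	uint32_t ih_time;	/* creation timestamp */
	uint32_t ih_size;	/* payload size */
	uint32_t ih_load;	/* load address (-a) */
	uint32_t ih_ep;		/* entry point (-e) */
	uint32_t ih_dcrc;	/* payload CRC */
	uint8_t  ih_os;		/* operating system (-O) */
	uint8_t  ih_arch;	/* CPU architecture (-A) */
	uint8_t  ih_type;	/* image type (-T) */
	uint8_t  ih_comp;	/* compression (-C) */
	uint8_t  ih_name[IH_NMLEN];	/* image name (-n) */
};

/* Read a big-endian 32-bit value regardless of host byte order. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return 1 if buf starts with a legacy uImage header, 0 otherwise. */
int is_uimage(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	if (len < sizeof(struct uimage_header))
		return 0;
	return read_be32(buf) == IH_MAGIC;
}

/* Load address stored in the ih_load field (byte offset 16). */
uint32_t uimage_load_addr(const unsigned char *buf)
{
	assert(buf != NULL);
	return read_be32(buf + offsetof(struct uimage_header, ih_load));
}
```

For the mkimage command above, the load address field would read 0x30800000, matching the -a option.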

Further Reading

2006.10.03: Key Points on Library Dependency

Posted by jollen on January 23, 2007 6:25 PM

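The LiMo Foundation (Linux Mobile Phone Foundation) Opens Its Doors

Major international companies "cooperating" to set up "non-profit" foundations seems to have become something of a movement. It suggests that a more open, freer, and more community-oriented way of doing business has become an important strategy for building an ecosystem.

The LiMo Foundation is open for business. Today I saw this news on LinuxDevices.com: "Cellphone giants unveil mobile Linux foundation". Six major mobile companies (Motorola, NEC, NTT DoCoMo, Panasonic Mobile Communications, Samsung Electronics, and Vodafone) are jointly sponsoring and founding a foundation named "LiMo". A short passage from the story reads: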

The collaboration of giants now known as the LiMo Foundation was first announced in mid-June of last year. The group's goals and official name remained a mystery until today,

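An intriguing passage, and one that hints at a mysterious force in Linux mobile phones that could erupt at any time. The main goals of the LiMo Foundation are as follows (quoted from the report):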

- An API specification
- An architecture
- References to open source code
- New source code-based reference implementation components (to be developed and contributed by Foundation members)
- Specifications for referenced third party software

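The figure below shows the Linux phone platform architecture that LiMo has initially published. In this architecture the Application UI Framework is GTK+. Several of the Linux phone frameworks we have looked at before also use GTK+ as their UI solution, which shows that GTK+ has become an important UI solution for Embedded Linux on handheld devices.

Figure (source: http://www.linuxdevices.com/news/NS2923387573.html): LiMo Foundation architecture

In addition, the LiMo APIs are licensed under the "Foundation Public License (FPL)", which is LiMo's own license, while the kernel and middleware of course keep their original GPL licensing. The report describes LiMo's licensing as follows: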

From a legal perspective, it appears that the LiMo Foundation will license its APIs to members on a royalty-free basis, using a "Foundation Public License" (FPL). Of course, the kernel and middleware defined by the Foundation's specifications will remain under open source licenses such as the GPL.

LiMo 的授權方式是比較需要注意的地方，因為報導中提及 FPL 的授權方式中，對非會員的處理方式是「However, the license forbids source code distribution to non-members of the Foundation under any terms.」；不過，LiMo 也有針對 3rd party 的授權方式，LiMo 經由此授權方式，讓非會員也能利用 LiMo 的 API 來開發手機應用軟體。

LiMo Foundation 的官方網站為：http://www.limofoundation.org/sf/sfmain/do/home。

Further reading

2007.01.22: OpenMoko is getting ready to go on the offensive
2006.11.08: "Thoughts on Mobile 2.0" and the first Linux smartphone to use OpenMoko
2006.12.25: Hiker: another GTK+ based Linux mobile phone application framework

Posted by jollen on January 26, 2007 6:38 PM

U-Boot: Porting a new Board (how to add your own board to U-Boot)

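U-Boot keeps the support code for each board (board support code, or a board support package) under the board/ directory, one directory per board. Sometimes, for various reasons, we want to add a new board directory to hold our own BSP. Take Jollen-Kit! (JK2410) as an example: JK2410 is developed from SMDK2410, so configuring U-Boot with 'smdk2410_config' also supports our JK2410. Even so, we may want to add a separate <U-Boot>/board/jk2410/ directory to hold the JK2410 code, even if the code in <U-Boot>/board/smdk2410/ and <U-Boot>/board/jk2410/ is 100% identical.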

Some concrete reasons for wanting this are:

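To practice porting U-Boot.
To avoid modifying the original BSP code.
For appearances: U-Boot gets support dedicated to our board instead of relying on another board's.
To release an official patch later.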

The concrete steps for "Porting a New Board" are as follows:

1. Create a dedicated BSP directory for JK2410.

# cd <U-Boot>

# cd board

# cp -pa smdk2410/ jk2410/

At this point the code under jk2410/ is identical to the code under smdk2410/.

2. Create a dedicated config file.

# cd <U-Boot>

# cd include/configs/

# cp smdk2410.h jk2410.h

This creates our own jk2410.h configuration file, based on smdk2410.h.

3. Edit the Makefile and add a rule for JK2410 modeled on the SMDK2410 rule. The jk2410_config rule below is the newly added part:

smdk2410_config :       unconfig
        @./mkconfig $(@:_config=) arm arm920t smdk2410 NULL s3c24x0

jk2410_config   :       unconfig
        @./mkconfig $(@:_config=) arm arm920t jk2410 NULL s3c24x0

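The board name that follows arm920t in the mkconfig line (jk2410 in the new rule, highlighted in the original post) is the board's directory name, i.e. <U-Boot>/board/<Board Name>.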
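The first mkconfig argument, $(@:_config=), is a make substitution reference: it takes the target name ($@) and removes a trailing "_config". As a rough shell illustration of that one step only, the sketch below uses a hypothetical helper named board_from_target, which is not part of U-Boot:

```shell
#!/bin/sh
# Hypothetical helper (not part of U-Boot): mimic make's $(@:_config=),
# which strips a trailing "_config" from the target name.
board_from_target() {
    printf '%s\n' "${1%_config}"
}

board_from_target smdk2410_config   # prints smdk2410
board_from_target jk2410_config     # prints jk2410
```

So for the jk2410_config target, mkconfig receives jk2410 as its first argument, followed by the fixed words written in the rule.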

A configuration target dedicated to JK2410

Once this is done, we can use our own target to configure U-Boot for JK2410:

# cd <U-Boot>

# make jk2410_config
Configuring for jk2410 board...
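If you expect to repeat steps 1 and 2 for other boards, they could be wrapped in a small shell function. This is only a sketch: clone_board is a hypothetical name, the demo runs against a throwaway directory standing in for <U-Boot>, and it uses the portable cp -pR instead of cp -pa. The Makefile rule from step 3 still has to be added by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of U-Boot): copy an existing board's
# directory and config header under a new board name.
# Usage: clone_board <u-boot-top> <existing-board> <new-board>
clone_board() {
    top=$1; src=$2; dst=$3
    if [ ! -d "$top/board/$src" ]; then
        echo "no such board: $src" >&2
        return 1
    fi
    if [ -e "$top/board/$dst" ]; then
        echo "board already exists: $dst" >&2
        return 1
    fi
    # Step 1: dedicated BSP directory. Step 2: dedicated config header.
    cp -pR "$top/board/$src" "$top/board/$dst" &&
    cp -p "$top/include/configs/$src.h" "$top/include/configs/$dst.h"
}

# Demo against a throwaway tree, not a real U-Boot checkout.
demo=$(mktemp -d)
mkdir -p "$demo/board/smdk2410" "$demo/include/configs"
: > "$demo/board/smdk2410/smdk2410.c"
: > "$demo/include/configs/smdk2410.h"
clone_board "$demo" smdk2410 jk2410 && find "$demo" -type f | sort
rm -rf "$demo"
```

Refusing to overwrite an existing board directory is a design choice of this sketch, in keeping with the goal of leaving the original BSP code untouched.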

Let's start and happy porting !!

Posted by jollen on January 29, 2007 4:14 PM

On U-Boot's overall program entry point

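To explain U-Boot's program entry point, I will use the SMDK2410 platform as an example. Every board directory contains a linker script, and reading that file tells you the program's overall entry point. For SMDK2410, we should start with <U-Boot>/board/smdk2410/u-boot.lds. Here is an excerpt: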

ENTRY(_start)
SECTIONS
{
        . = 0x00000000;

        . = ALIGN(4);
        .text      :
        {
          cpu/arm920t/start.o   (.text)
          *(.text)
        }

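In the SECTIONS block of the linker script, the first thing placed in the .text section is the .text of <U-Boot>/cpu/arm920t/start.o, and ENTRY(_start) names the entry symbol, so we can tell that this file is the entry point of the whole U-Boot program. In other words, "the overall U-Boot boot flow starts in <U-Boot>/cpu/arm920t/start.o". In U-Boot for ARM9, start.o is built from assembly code rather than C; its source file is start.S.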
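The same reading of the linker script can be done mechanically. The sketch below is an illustration under assumptions: lds_entry and lds_first_text_input are hypothetical helper names, and the here-document is a copy of the excerpt above with the braces closed so that it stands alone.

```shell
#!/bin/sh
# Hypothetical helper: print the symbol named by ENTRY(...) in a linker script.
lds_entry() {
    sed -n 's/^[[:space:]]*ENTRY(\([^)]*\)).*/\1/p' "$1"
}

# Hypothetical helper: print the first input file listed in the .text
# output section, i.e. the object whose code is placed first.
lds_first_text_input() {
    awk '/^[ \t]*\.text[ \t]*:/ { in_text = 1; next }
         in_text && /\(\.text\)/ { print $1; exit }' "$1"
}

# Copy of the excerpt above, written to a temporary file for the demo.
lds=$(mktemp)
cat > "$lds" <<'EOF'
ENTRY(_start)
SECTIONS
{
        . = 0x00000000;

        . = ALIGN(4);
        .text      :
        {
          cpu/arm920t/start.o   (.text)
          *(.text)
        }
}
EOF

lds_entry "$lds"              # prints _start
lds_first_text_input "$lds"   # prints cpu/arm920t/start.o
rm -f "$lds"
```

These helpers only handle the simple layout shown here; a linker script that puts ENTRY or the section header on unusual lines would need a real parser.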

So reading from <U-Boot>/cpu/arm920t/start.S is the starting point for understanding U-Boot's whole execution flow. As a side note, conceptually the main jobs of start.S are:

Set up the interrupt vector table
Set up the processor
Initialization Sequence
Relocation

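For this reason, we also call start.S the "hardware bring-up code", and more specifically the early-stage bring-up code. Under U-Boot's cpu/ directory you can see the bring-up code that U-Boot provides for many different processors, which shows that U-Boot really is a "universal bootloader".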

In the next post I will share my notes from reading start.S. Readers interested in U-Boot are welcome to write in with comments.

Posted by jollen on January 30, 2007 11:41 PM

Linux/PowerPC: The New Generation (Prologue)

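Linux/PPC has now been renamed Linux/PowerPC. Since IBM founded the power.org organization, PowerPC processor chips have been referred to under the name Power Architecture, so the long-used abbreviation PPC should now properly be written PowerPC. In other words, it is better to use PowerPC as the general name for the new generation of Power Architecture processor chips, and the old abbreviation PPC will be dropped.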

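Starting with version 2.6.15, the Linux kernel has been reorganizing arch/ppc/ into arch/powerpc/. Until the whole migration is complete, arch/ppc/ will continue to exist, but its development will stop. From here on, Linux kernel development for PowerPC moves to the new arch/powerpc/ structure.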

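In addition, PPC platforms used to be written in arch/ppc/platforms/*.c, which was not well organized; this has finally been restructured. PowerPC platforms (the equivalent of a "machine" in ARM Linux) are now organized under arch/powerpc/platforms/, with one directory per platform, which is much cleaner: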

# ls arch/powerpc/platforms/
4xx/ 85xx/ apus/ embedded6xx/ maple/ prep/
82xx/ 86xx/ cell/ iseries/ pasemi/ pseries/
83xx/ 8xx/ chrp/ Makefile powermac/

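Also, support for game console platforms has recently appeared in the kernel's GIT tree. The first one to be formally merged into the Linux kernel will be the PS3 game console: 2.6.20 will add the PS3 PowerPC platform. Let us look forward together to the arrival of the new PowerPC generation!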

Posted by jollen on January 31, 2007 2:56 PM
