Gcc4.4.4 Multilib Toolchain Release Note
1 What's new
New features of this toolchain include:
- gcc 4.4.4. This compiler version supports VFPv3 and NEON. It is also the first gcc version that supports Cortex-A9 and Cortex-R4.
- Multilib cross toolchain. This is a multilib toolchain: the libraries are compiled separately for each CPU model.
- Optimized library. Some common library routines have been optimized to improve application performance.
- Application debug tools. Some debug tools are provided to trace and detect application bugs.
Freescale Semiconductor
2 What's inside
The whole toolchain contains:
- binutils 2.20.1
- gcc 4.4.4 with multilib support
- glibc 2.11.1
- glibc-ports 2.11 (some routines are optimized with NEON and ARM instructions)
- gdb and gdbserver 7.1
- other debug tools and some companion libraries
Toolchain directory structure.

|-- bin                      //toolchain with prefix, such as arm-none-linux-gnueabi-gcc
|-- lib                      //library files used by the toolchain itself, not for applications
`-- arm-fsl-linux-gnueabi
    |-- bin                  //toolchain without prefix, such as gcc
    |-- debug-root           //all debug tools
    `-- multi-libs           //all libraries and headers
        |-- armv5te          //library for armv5te (i.mx 2xx); soft floating point only
        |-- armv6            //library for armv6 (i.mx 3xx), soft FPU version
        |   `-- vfp          //library for armv6, VFP FPU version
        |-- armv7-a          //library for armv7-a (i.mx5xx), hardware FPU version
        |   |-- neon         //library for armv7-a, using NEON as the FPU
        |   |-- thumb        //library for armv7-a, using Thumb-2 instructions instead of ARM
        |   `-- vfpv3        //library for armv7-a, using VFPv3 as the FPU
        |-- lib              //default library; can be used for armv4t and above
        `-- usr
            |-- include      //header files for application development
            `-- lib          //third-party libraries and statically built libraries
Gcc options for different library directories.

Directory path                 Target CPU model    Gcc options       Alias options
-----------------------------  ------------------  ----------------  ---------------------------------------------
lib, usr/lib                   armv4t
armv5te/lib, armv5te/usr/lib   armv5te, i.mx2xx    -march=armv5te    -mcpu= or -mtune= with the name of an armv5te
                                                                     CPU, such as arm968e-s or arm946e-s
armv6/lib, armv6/usr/lib       armv6, i.mx3xx      -march=armv6      -mcpu= or -mtune= with the name of an armv6
                                                                     CPU, such as arm1136j-s
                                                   -mfpu=vfp         same as above
                                                   -march=armv7-a    -mcpu= or -mtune= with the name of an armv7-a
                                                                     CPU, such as cortex-a8 or cortex-a9
                                                                     same as above
                                                                     same as above
                                                                     same as above
Notes:
- Directory path: the path is relative to gcc-4.4.4-glibc-2.11.1-multilib-1.0/arm-fsl-linux-gnueabi/arm-fsl-linux-gnueabi/multi-libs.
- Target CPU model: the listed CPU model and above, not only that specific model.
- Gcc options: the options that must be provided. Other options can be added on top of them.
- Alias options: some gcc options are aliases of others, so using an alias option has the same effect as using the listed gcc options.
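The mapping from gcc options to library directory can be sketched as a small shell function. This is only an illustration under stated assumptions: multilib_dir is a hypothetical helper that is not part of the toolchain, it knows only the CPU aliases named in the table, and the precedence among -mfpu and -mthumb follows the SYSROOT_SUFFIX_SPEC example given later in this note. It does not query the compiler.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the toolchain): print the
# multi-libs subdirectory that a set of gcc options is expected to select.
multilib_dir() {
    local arch="" fpu="" thumb="" opt
    for opt in "$@"; do
        case "$opt" in
            -march=armv5te|-mcpu=arm968e-s|-mtune=arm968e-s|-mcpu=arm946e-s|-mtune=arm946e-s)
                arch=armv5te ;;
            -march=armv6|-mcpu=arm1136j-s|-mtune=arm1136j-s)
                arch=armv6 ;;
            -march=armv7-a|-mcpu=cortex-a8|-mtune=cortex-a8|-mcpu=cortex-a9|-mtune=cortex-a9)
                arch=armv7-a ;;
            -mfpu=*)
                fpu=${opt#-mfpu=} ;;
            -mthumb)
                thumb=yes ;;
        esac
    done
    case "$arch" in
        armv5te)
            echo "armv5te" ;;            # soft float only
        armv6)
            if [ "$fpu" = vfp ]; then echo "armv6/vfp"; else echo "armv6"; fi ;;
        armv7-a)
            if [ "$fpu" = vfpv3 ]; then echo "armv7-a/vfpv3"
            elif [ "$fpu" = neon ]; then echo "armv7-a/neon"
            elif [ -n "$thumb" ]; then echo "armv7-a/thumb"
            else echo "armv7-a"; fi ;;
        *)
            echo "." ;;                  # default library in lib and usr/lib
    esac
}

multilib_dir -march=armv7-a -mfpu=neon -mfloat-abi=softfp
multilib_dir -mcpu=arm1136j-s -mfpu=vfp
multilib_dir -O2
```

The three calls print armv7-a/neon, armv6/vfp and "." respectively. Options the function does not recognize, such as -mfloat-abi=softfp, are ignored, in line with the note that other options can be added on top of the required ones.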
http://dmalloc.com/docs/latest/online/dmalloc_toc.html
    case $1 in
        libc_start_files) startfile_multilib; break;;
        gcc_core_pass_2)  gcc_shared_multilib; break;;
        libc)             libc_multilib; break;;
        *)                break;;
    esac

libc_start_file.sh

libc_start_file.sh builds and installs the glibc start files once for each set of gcc options. The script can look something like this:

    #!/bin/bash
    # Arrays and "local -a" below require bash.
    startfile_multilib() {
        local TC_PREFIX=arm-fsl-linux-gnueabi
        local build_dir=${CT_WORK_DIR}/${TC_PREFIX}/build/build-libc-startfiles/
        local src_dir=${CT_WORK_DIR}/src/glibc-2.11.1
        local dest_dir_prefix=${CT_PREFIX_DIR}/${TC_PREFIX}/multi-libs/
        local CC_PATH=${CT_WORK_DIR}/${TC_PREFIX}/build/gcc-core-static/bin
        export PATH=${CT_PREFIX_DIR}/bin:${CC_PATH}:$PATH

        # One entry per library variant; the three arrays are indexed together.
        local -a gcc_options=("-march=armv7-a -mfpu=neon -mfloat-abi=softfp" "")
        local -a dest_dirname=("armv7-a/arm/neon" ".")
        local -a fp_config=("--with-fp" "--without-fp")

        for ((index=0; index < ${#gcc_options[@]}; index++)); do
            cd ${build_dir}
            make clean
            # Pre-seed configure results that cannot be probed when cross compiling.
            echo "libc_cv_forced_unwind=yes" > config.cache
            echo "libc_cv_c_cleanup=yes" >> config.cache
            BUILD_CC=${CC_PATH}/${TC_PREFIX}-gcc \
            CFLAGS="${gcc_options[index]}" \
            CC="${TC_PREFIX}-gcc" \
            AR=${TC_PREFIX}-ar \
            RANLIB=${TC_PREFIX}-ranlib \
            ${src_dir}/configure --prefix=/usr \
                --build=i686-build_pc-linux-gnu --host=${TC_PREFIX} \
                --without-cvs --disable-profile --disable-debug --without-gd \
                --with-headers=${dest_dir_prefix}/usr/include \
                --cache-file=config.cache --with-__thread --with-tls \
                --enable-shared ${fp_config[index]} \
                --enable-add-ons=nptl,ports \
                --enable-kernel=${CT_LIBC_GLIBC_MIN_KERNEL}
            # Build only the start files (csu) with the same options.
            make OBJDUMP_FOR_HOST=${TC_PREFIX}-objdump PARALLELMFLAGS= -j1 \
                csu/subdir_lib ASFLAGS="${gcc_options[index]}"
        done
    }

gcc_shared_multilib.sh

gcc_shared_multilib.sh is invoked after gcc-shared is built. It is important to patch gcc to enable your multilib scheme, because gcc does not support multilib flexibly. For example, to build an ARM multilib toolchain, first apply the multilib patch provided by Freescale, then modify gcc-4.4.4/gcc/config/arm/t-arm-elf as follows, according to your scheme:

    MULTILIB_OPTIONS     = march=armv5te/march=armv6/march=armv7-a mfpu=vfp/mfpu=vfpv3/mfpu=neon mthumb
    MULTILIB_DIRNAMES    = armv5te armv6 armv7-a vfp vfpv3 neon thumb
    MULTILIB_EXCEPTIONS  = mfpu*
    MULTILIB_EXCEPTIONS += *march=armv5te/*mfpu*
    MULTILIB_EXCEPTIONS += *march=armv6/*mfpu=vfpv3*
    MULTILIB_EXCEPTIONS += *march=armv6/*mfpu=neon*
    MULTILIB_EXCEPTIONS += *march=armv7-a/*mfpu=vfp
    MULTILIB_EXCEPTIONS += *mfpu*/*mthumb*
    MULTILIB_EXCEPTIONS += mthumb*
    MULTILIB_EXCEPTIONS += march=armv5te/*mthumb*
    MULTILIB_EXCEPTIONS += march=armv6/*mthumb*

This scheme builds libraries for the armv5te, armv6 and armv7-a architectures. The FPU can be vfp, vfpv3 or neon, and the instruction set can be ARM or Thumb.
- MULTILIB_OPTIONS describes the gcc options for multilib.
- MULTILIB_DIRNAMES describes the directory name corresponding to each option.
- MULTILIB_EXCEPTIONS describes the option combinations to exclude. For example, -march=armv5te -mfpu=neon is an invalid combination.

Next, modify gcc/config/arm/sysroot_suffix.h to define your option aliases.
For example:

    #undef SYSROOT_SUFFIX_SPEC
    #define SYSROOT_SUFFIX_SPEC "" \
    "%{march=armv5te|mcpu=arm946e-s|mtune=arm946e-s|mcpu=arm968e-s|mtune=arm968e-s|mcpu=arm926ej-s|mtune=arm926ej-s|mcpu=arm10tdmi|mtune=arm10tdmi|mcpu=arm1020t|mtune=arm1020t|mcpu=arm1026ej-s|mtune=arm1026ej-s|mcpu=arm10e|mtune=arm10e|mcpu=arm1020e|mtune=arm1020e|mcpu=arm1022e|mtune=arm1022e:/armv5te;" \
    "march=armv6|mcpu=arm1136j-s|mtune=arm1136j-s|mcpu=arm1136jf-s|mtune=arm1136jf-s|mcpu=mpcore|mtune=mpcore|mcpu=mpcorenovfp|mtune=mpcorenovfp|mcpu=arm1156t2-s|mtune=arm1156t2-s|mcpu=arm1156t2f-s|mtune=arm1156t2f-s|mcpu=arm1176jz-s|mtune=arm1176jz-s|mcpu=arm1176jzf-s|mtune=arm1176jzf-s:" \
    "%{mfpu=vfp:/armv6/vfp;" \
    ":/armv6};" \
    "march=armv7-a|mcpu=cortex-a5|mtune=cortex-a5|mcpu=cortex-a8|mtune=cortex-a8|mcpu=cortex-a9|mtune=cortex-a9: " \
    "%{mfpu=vfpv3:/armv7-a/vfpv3;" \
    "mfpu=neon:/armv7-a/neon;" \
    "mthumb:/armv7-a/thumb;" \
    ":/armv7-a};" \
    ":}"

libc_multilib.sh

The last script, libc_multilib.sh, is similar to libc_start_file.sh: it compiles glibc with different gcc options. The difference is that the start files are compiled by the static gcc, whereas this step is compiled by the shared gcc. The script looks like this:

    #!/bin/bash
    # Names written as ${...} with spaces are placeholders to fill in for your setup.
    libc_multilib() {
        local TC_PREFIX=arm-fsl-linux-gnueabi
        local build_dir=${CT_WORK_DIR}/${TC_PREFIX}/build/build-libc/
        local src_dir=${CT_WORK_DIR}/src/glibc-2.11.1
        local dest_dir_prefix=${CT_PREFIX_DIR}/${TC_PREFIX}/multi-libs/
        local CC_PATH=${CT_WORK_DIR}/${TC_PREFIX}/build/gcc-core-shared/bin
        export PATH=${binutils dir}:${your shared gcc dir}:$PATH

        # One entry per library variant; the two arrays are indexed together.
        local -a gcc_options=( )
        local -a dest_dirname=( )

        for ((index=0; index < ${#gcc_options[@]}; index++)); do
            cd ${build_dir}
            BUILD_CC=${your shared arm gcc} \
            CFLAGS="${gcc_options[index]}" \
            CC=${your shared arm gcc} \
            $src_dir/configure --prefix=/usr/ ${other options}
            make OBJDUMP_FOR_HOST=${your arm binutils objdump} \
                ASFLAGS="${gcc_options[index]}" all
            # Install each variant under its own multi-libs subdirectory.
            mkdir -p ${dest_dir_prefix}/${dest_dirname[index]}/usr/lib
            make install_root=${dest_dir_prefix}/${dest_dirname[index]} \
                OBJDUMP_FOR_HOST=${your arm binutils objdump} install
        done
    }

Example scripts are provided in the release package, packed together with the patch and crosstool-ng. Be aware that these scripts are not guaranteed to produce a correct toolchain, because toolchain building depends on the host machine, the environment variables you have set, and the component configurations.

Freescale Semiconductor gcc 4.4.4 toolchain, Rev 10.11.01
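To see which library variants the MULTILIB_OPTIONS and MULTILIB_EXCEPTIONS scheme for t-arm-elf leaves in place, the option groups can be enumerated and filtered with shell pattern matching. The sketch below is a simplified illustration and not gcc's own genmultilib script; the function names is_excluded, to_dirname and list_multilibs are hypothetical, and the directory naming assumes the one-to-one pairing of MULTILIB_OPTIONS with MULTILIB_DIRNAMES described above.

```shell
#!/bin/sh
# Simplified illustration only; this is NOT gcc's genmultilib script.
# Exception patterns copied from the MULTILIB_EXCEPTIONS lines.
exceptions='mfpu* *march=armv5te/*mfpu* *march=armv6/*mfpu=vfpv3* *march=armv6/*mfpu=neon* *march=armv7-a/*mfpu=vfp *mfpu*/*mthumb* mthumb* march=armv5te/*mthumb* march=armv6/*mthumb*'

# Succeed if the option combination matches any exception pattern.
is_excluded() {
    set -f                       # keep patterns from expanding as file globs
    for pat in $exceptions; do
        case "$1" in
            $pat) set +f; return 0 ;;
        esac
    done
    set +f
    return 1
}

# Turn an option combination into its directory, as MULTILIB_DIRNAMES pairs them.
to_dirname() {
    echo "$1" | sed -e 's/march=//' -e 's/mfpu=//' -e 's/mthumb/thumb/'
}

# Walk every combination of one choice (or none) from each option group.
list_multilibs() {
    for arch in "" march=armv5te march=armv6 march=armv7-a; do
        for fpu in "" mfpu=vfp mfpu=vfpv3 mfpu=neon; do
            for thumb in "" mthumb; do
                combo=""
                for part in "$arch" "$fpu" "$thumb"; do
                    if [ -n "$part" ]; then
                        combo="${combo:+$combo/}$part"
                    fi
                done
                # The empty combination is the default library.
                if [ -z "$combo" ]; then
                    continue
                fi
                is_excluded "$combo" || to_dirname "$combo"
            done
        done
    done
}

list_multilibs
```

Of the 31 non-default combinations, seven survive the exceptions: armv5te, armv6, armv6/vfp, armv7-a, armv7-a/thumb, armv7-a/vfpv3 and armv7-a/neon. These correspond to the subdirectories listed under multi-libs in the toolchain directory structure.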
memcpy() performance test result.

Implementations compared:
- memcpy: the original memcpy from glibc 2.11.1
- memcpy_android: memcpy from Android 2.2 CTS release 2
- memcpy with neon: memcpy optimized with NEON instructions, released in multi-libs/armv7-a/neon/

memcpy() performance benchmark result. Data unit: MB/s. Each cell has the form A / B. A is the average value when the source and destination addresses differ at every test loop. B is the average value when the same source and destination addresses are used at every test loop.

Block size     memcpy             memcpy_android     memcpy with neon
-------------  -----------------  -----------------  -----------------
3 bytes        82.4 / 86.8        94.6 / 101.0       127.6 / 143.5
4 bytes        56.1 / 64.3        110.8 / 115.6      139.0 / 192.2
5 bytes        63.3 / 80.4        73.0 / 71.7        171.8 / 239.2
7 bytes        78.2 / 112.4       104.1 / 100.4      220.9 / 336.5
8 bytes        87.1 / 128.4       222.4 / 230.8      218.4 / 384.6
11 bytes       112.0 / 176.6      177.6 / 173.1      281.9 / 526.2
12 bytes       119.9 / 192.9      301.4 / 313.1      276.7 / 574.4
15 bytes       142.3 / 241.2      211.4 / 204.2      334.6 / 718.0
16 bytes       148.9 / 257.2      165.3 / 368.2      271.7 / 539.1
24 bytes       193.8 / 385.9      243.2 / 508.5      373.5 / 808.8
31 bytes       227.1 / 498.4      322.1 / 360.2      486.3 / 1044.7
4096 bytes     1360.4 / 2246.0    3394.9 / 3560.4    3551.7 / 3695.4
6144 bytes     1385.3 / 2270.4    3508.3 / 3630.9    3586.7 / 3671.9
65536 bytes    990.1 / 1298.7     1557.7 / 1369.3    1577.5 / 1545.9
98304 bytes    736.9 / 884.1      1544.2 / 1350.3    1552.9 / 1463.8
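The relative gain between two implementations is the ratio of their throughput figures. As a small worked example, the hypothetical speedup function below (not part of the release) divides one MB/s value by another using awk; the inputs are the A values from the benchmark results.

```shell
#!/bin/sh
# Hypothetical helper: speed-up of one throughput figure over another,
# printed to two decimal places.
speedup() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

# A values (different source and destination addresses), in MB/s.
speedup 3551.7 1360.4    # 4096 bytes: memcpy with neon vs. glibc memcpy
speedup 3551.7 3394.9    # 4096 bytes: memcpy with neon vs. Android memcpy
speedup 127.6 82.4       # 3 bytes: memcpy with neon vs. glibc memcpy
```

For these inputs the ratios are 2.61, 1.05 and 1.55: at 4096 bytes the NEON version is roughly 2.6 times as fast as the glibc original but only about 5% faster than the Android version.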