c++ - Are the optimizations done in LTO the same as in normal compilation?
While compiling a translation unit, the compiler does a lot of optimizations: inlining, constant folding/propagation, alias analysis, loop unrolling, dead code elimination, and many others I haven't even heard of. Are all of them done when using LTO/LTCG/WPO across multiple translation units, or is only a subset (or a variant) of them done (I've heard about inlining)? If not all optimizations are done, I would consider unity builds superior to LTO (or maybe use both when there is more than one unity source file).
My guess is that it's not the same (unity builds having the full set of optimizations), and also that it varies a lot across compilers.
The LTO documentation of each compiler doesn't precisely answer this (or I am failing at understanding it).
Since LTO involves saving the intermediate representation in the object files, in theory LTO could do all the optimizations... right?
Note that I am not asking about build speed - that is a separate issue.
EDIT: I am mostly interested in gcc/llvm.
If you have a look at the GCC documentation, you find:
-flto[=n]
This option runs the standard link-time optimizer. When invoked with source code, it generates GIMPLE (one of GCC's internal representations) and writes it to special ELF sections in the object file. When the object files are linked together, all the function bodies are read from these ELF sections and instantiated as if they had been part of the same translation unit.
To use the link-time optimizer, -flto and optimization options should be specified at compile time and during the final link. For example:
gcc -c -O2 -flto foo.c
gcc -c -O2 -flto bar.c
gcc -o myprog -flto -O2 foo.o bar.o
The first two invocations to GCC save a bytecode representation of GIMPLE into special ELF sections inside foo.o and bar.o. The final invocation reads the GIMPLE bytecode from foo.o and bar.o, merges the two files into a single internal image, and compiles the result as usual. Since both foo.o and bar.o are merged into a single image, this causes all the interprocedural analyses and optimizations in GCC to work across the two files as if they were a single one. This means, for example, that the inliner is able to inline functions in bar.o into functions in foo.o and vice-versa.
As the documentation says: yes, all of them! The optimizations are applied as if the program had been compiled from a single file. This can also be done with -fwhole-program to get the "same" optimization result.
If I compile this simple example:
f1.cpp:
int f1() { return 10; }
f2.cpp:
int f2(int i) { return 2*i; }
main.cpp:
int main() { int res=f1(); res=f2(res); res++; return res; }
I get the following assembler output:
00000000004005e0 <main>:
  4005e0: b8 15 00 00 00          mov    $0x15,%eax
  4005e5: c3                      retq
  4005e6: 66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  4005ed: 00 00 00
All code is inlined, as expected.
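To check the arithmetic behind that `mov $0x15,%eax`, here is a minimal single-translation-unit sketch of the same three functions. It is not what the compiler does internally: the `constexpr` qualifiers and the name `main_result` are my own additions (assumptions, requiring C++14 or later) so that the value can be verified at compile time. The original functions are plain, non-constexpr functions in three separate files.

```cpp
// Single-file sketch of f1.cpp, f2.cpp and main.cpp from the example.
// constexpr is added only so the result can be checked at compile time;
// it is not part of the original code.
constexpr int f1() { return 10; }
constexpr int f2(int i) { return 2 * i; }

// Same statements as the body of main() in main.cpp, under a
// hypothetical name so it can be evaluated in a constant expression.
constexpr int main_result() {
    int res = f1();   // 10
    res = f2(res);    // 2 * 10 = 20
    res++;            // 21
    return res;
}

// 21 decimal is 0x15, the immediate loaded into %eax in the disassembly.
static_assert(main_result() == 0x15, "10 * 2 + 1 should fold to 21 (0x15)");
```

With LTO the optimizer has to discover this constant on its own across the three object files. The sketch only confirms that 0x15 is the value a fully inlined and folded `main` should return.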
My experience is that current GCC optimizes with LTO just as if everything were compiled in a single file. In very rare cases I got an ICE while using LTO, but with the current 5.2.0 version I have not seen an ICE again.
[ICE] -> Internal Compiler Error