Saturday, November 12, 2016

A weird problem in my build process...

I have been working recently on an embedded project and have run into a very weird problem.  My main development machine runs Linux Mint and I use the GNU toolchain (gcc, make, etc.).  This tool set usually gives me no issues, but the linker keeps warning me that it cannot find a symbol:

arm-none-eabi-ld: warning: cannot find entry symbol Reset_Handler; defaulting to 0000000008000000

Now, if you search for this, there are tons of answers on how it can happen.  There are even suggestions that changing the extension of the file from a lower-case s to an upper-case S can fix the problem.
(This is because gcc runs .S files through the C preprocessor before assembling them, while .s files are assembled as-is.)
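To make the .s versus .S distinction concrete, here is a small sketch of how the two suffixes are often handled in a makefile.  It is not taken from my project: the tool names and the rules are assumptions for illustration, and recipe lines must start with a tab.

```make
# Assumed tool names for a bare-metal ARM toolchain.
CC := arm-none-eabi-gcc
AS := arm-none-eabi-as

# .S: the gcc driver runs the C preprocessor first (#include, #define,
# #ifdef all work), then hands the result to the assembler.
%.o: %.S
	$(CC) $(CPPFLAGS) -c $< -o $@

# .s: passed to the assembler as-is, with no preprocessing step.
%.o: %.s
	$(AS) $(ASFLAGS) -o $@ $<
```

If a startup file uses preprocessor directives but carries the lower-case extension, those directives are never expanded, which is one way a symbol such as Reset_Handler can go missing.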

I researched quite a bit and tried many different things to fix this.  I examined the linker script, the assembly files, and the flags that were passed to the assembler, the compiler, and the linker.  I consulted quite a few friends of mine who do embedded development, and they had some more suggestions for what the problem was.  I looked at almost everything they suggested, and then something two of them agreed upon popped into my head.  They agreed that make "sucks", that it sometimes doesn't behave as you would want.  I don't entirely agree, but there it was: the one part of the equation that I hadn't really thought about was GNU make.

I decided to make a bare-bones embedded project with some of the existing files and run each step of my build process by hand.  I assembled, compiled, and linked it manually.  Presto!  No issue.  With my makefile, however, I got the problem almost every time.

I like make, and it provides a lot of useful features, including the one that has been biting me for the last few months.  GNU Make lets you call out to the shell, so you can embed shell commands directly in the makefile.  As it turns out, I did not understand this feature as well as I thought I did.  I had the following lines in my makefile, and they caused my problem for months:

OBJECTS += $(ASM_SRCS:%.s=%.o)
OBJECTS += $(STD_LIB_SRCS:%.c=%.o)
OBJECTS += $(RTOS_SRCS:%.c=%.o)
OBJECTS += $(RTOS_ASM_SRCS:%.s=%.o)
OBJECTS += $(HAL_SRCS:%.c=%.o)
OBJECTS += $(APP_SRCS:%.c=%.o)
OBJECTS += $(OS_SRCS:%.c=%.o)
OBJECTS += $(C_SRCS:%.c=%.o)

OBJECT_FILES := $(shell find $(OBJ_PATH) -name '*.o')

...

link: $(OBJECTS)
	@$(LD) $(LDFLAGS) -o $(OBJ_PATH)/$(ELF_IMAGE_NAME) $(OBJECT_FILES)

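The intent of these lines is to create a list of the objects and link them after they have been moved to a build folder.  I am sure that at the point in time I wrote this, it made some sense to me.  I look at it now and wonder what the hell I was thinking.  The catch is that a variable assigned with := is expanded immediately, so the find command runs once, while make is reading the makefile, before any target has been built.  The objects were getting built and moved to the build directory, but not before the OBJECT_FILES list had been generated by the shell command.  On a clean tree the list was empty or incomplete, so the linker never saw the object that defines Reset_Handler.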
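The timing is easy to reproduce in isolation.  The following is a minimal sketch, not my real makefile: the demo target and the startup.o file name are made up for illustration, and recipe lines must start with a tab.

```make
# Hypothetical stand-alone demo; only OBJ_PATH and OBJECT_FILES mirror
# names from the original makefile.
OBJ_PATH := build

# ':=' expands right now: find runs once, while make is still reading
# this file, before any recipe has been executed.
OBJECT_FILES := $(shell find $(OBJ_PATH) -name '*.o' 2>/dev/null)

demo:
	mkdir -p $(OBJ_PATH)
	touch $(OBJ_PATH)/startup.o
	@echo "seen at parse time: [$(OBJECT_FILES)]"
	@echo "on disk now:        [$$(find $(OBJ_PATH) -name '*.o')]"

.PHONY: demo
```

Running make demo on a clean tree should print an empty list on the first echo line and build/startup.o on the second.  Running it a second time shows the file in both, which matches the "almost every time" behaviour: the link only worked when objects from an earlier run were already sitting in the build directory.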

I have since removed the shell command line and changed the link target so that it no longer depends on a list generated outside of make's own dependency tracking.  The problem is gone; I can execute any target, in any order, and it builds perfectly.  The moral of this story is: always understand all your tools, and never think that a tool can't be the problem.
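For anyone hitting the same thing, one way to write such a link rule is sketched below.  This is not necessarily the exact rule I ended up with; it assumes that every name in OBJECTS is the final location of that object file (nothing is moved afterwards), and recipe lines must start with a tab.

```make
# Assumption: each object is compiled straight to the path listed in
# OBJECTS, so the prerequisite list is exactly what the linker should read.
link: $(OBJ_PATH)/$(ELF_IMAGE_NAME)

# $^ is the list of prerequisites, so the linker input can never be out
# of step with what make has just built.
$(OBJ_PATH)/$(ELF_IMAGE_NAME): $(OBJECTS)
	@mkdir -p $(OBJ_PATH)
	@$(LD) $(LDFLAGS) -o $@ $^

.PHONY: link
```

The design point is that the linker's inputs come from the rule's own prerequisites rather than from a directory scan, so make guarantees they exist before the recipe runs.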
(Technically, the tool did what I asked it to, so it was still my fault!)


