Friday, January 31, 2014

Managing non-code resources in cross platform projects and cross language projects

For the book, I am writing a fair amount of example code.  The example code uses a lot of non-compiled resources.  Meshes, images, fonts and even shader sources are all needed for rendering.  I need to be able to access these resources on all my target platforms (Windows, OSX, Linux, Android), and in all my target languages (C++, Java, Android Java).

Accessing these resources at run-time poses several challenges:
  • How do I store the resource so that it's available to my application regardless of platform?
  • How do I specify the resource I want in the application?
On top of these basic problems, which I must solve in order to have the examples work at all, there are two other constraints I would like to support:
  • How do I avoid duplicating the resource storage for each of my example applications?
  • How can I enable a debugging mode so that if a resource changes while an application is running, the changes are reflected in the application immediately (without rebuilding or even restarting the application)? 

Motivation

The requirement for storage is simple.  When I check out my code and build it, it's not hard to set up the build process so that the resources are loaded from wherever they've been checked out on the disk.  However, this makes it hard to create an executable I can easily hand to other people to try out.  The application might end up looking for a file at /home/bdavis/git/OculusRiftExample/resource/shaders/simple.vs but it's not likely to find it when running on someone else's machine.  

In terms of specifying resources, in my source code I don't want to be using strings to load resources, because they're easy to mistype and much time can be wasted because I asked for shader/simple.vs instead of shaders/simple.vs.  So I'd rather my build process found all the possible resources and created an enumeration for them instead.  

Enumeration

Enumeration of the resources is pretty easy. I can identify all the resources I want to include with a few CMake commands.  I have a project with a CMakeLists.txt file specifically for dealing with all of the resources

file(GLOB MESHES meshes/*.ctm)
file(GLOB FONTS fonts/*.sdff)
file(GLOB_RECURSE IMAGES images/*.png)
file(GLOB_RECURSE VERTEX_SHADERS shaders/*.vs)
file(GLOB_RECURSE FRAGMENT_SHADERS shaders/*.fs)
file(GLOB_RECURSE COMMON_SHADERS shaders/*.glsl)

Having done that, I can iterate over all the resources to determine their relative path from the root of the resource folder

    foreach (file ${RESOURCE_FILES})
        file(RELATIVE_PATH relative ${CMAKE_CURRENT_SOURCE_DIR} ${file})
        set(res ${relative})
        string(TOUPPER ${res} res)
        string(REGEX REPLACE "\\.\\./" "" res "${res}" )
        string(REGEX REPLACE "[^A-Z0-9]" "_" res "${res}" )
        set(HEADER_BUFFER "${HEADER_BUFFER}\t${res},\n")
        set(MAP_BUFFER "${MAP_BUFFER}  std::make_pair(${res}, \"${relative}\"), \n")
        string(REGEX REPLACE " " "_" relative_str "${relative}" )
        set(RC_BUFFER "${RC_BUFFER}\n${relative_str} TEXTFILE \"${file}\"")
    endforeach()

Creating the relative path turns /home/bdavis/git/OculusRiftExamples/resource/shaders/simple.vs into shaders/simple.vs, which is stored in the relative variable.  The subsequent setting and modification of the res variable populate it with a simple string suitable for inclusion in an enumeration, so shaders/simple.vs becomes SHADERS_SIMPLE_VS.  I then append to three different variables I'll be using in file generation.  HEADER_BUFFER contains a comma-delimited list of all of the res values I've created.  MAP_BUFFER contains a comma-delimited list of std::pair values, each associating a given res with a string containing the corresponding relative value.  Finally, RC_BUFFER contains a list of all the relative paths and the corresponding full path to each file on my drive.  In this last instance I tweak the relative path so that it contains no spaces, as the RC compiler treats whitespace as a field delimiter, regardless of the use of quotes.
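The name mangling is easy to check outside of CMake.  Here is a minimal C++ sketch of the same three string operations (upper-case, strip "../", replace everything else with an underscore).  The function name toEnumName is hypothetical and not part of the project, and the sketch assumes every digit 0 through 9 should survive in the enum name.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper mirroring the CMake string operations:
// 1. upper-case the relative path
// 2. drop any "../" sequences
// 3. replace every character that is not A-Z or 0-9 with '_'
std::string toEnumName(const std::string& relativePath) {
    std::string name;
    name.reserve(relativePath.size());
    for (char c : relativePath) {
        name += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }

    const std::string parent = "../";
    for (std::string::size_type pos = name.find(parent);
         pos != std::string::npos;
         pos = name.find(parent, pos)) {
        name.erase(pos, parent.size());
    }

    for (char& c : name) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
        if (!keep) {
            c = '_';
        }
    }
    return name;
}
```

Calling toEnumName("shaders/simple.vs") gives SHADERS_SIMPLE_VS, matching the example above.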

In my resources project I have a ResourceEnums.h.in and a ResourceEnums.cpp.in.  In the CMakeLists.txt file I can tell CMake to process them and do variable substitution, like so

configure_file( ResourceEnums.h.in ResourceEnums.h )
configure_file( ResourceEnums.cpp.in ResourceEnums.cpp )

In the header input file I have a section that looks like this

enum Resources_EXPORT Resource {
${RESOURCE_ENUM}  NO_RESOURCE
};

By the time of processing the RESOURCE_ENUM has been populated with the output HEADER_BUFFER data from the for loop above.  The NO_RESOURCE value acts as a sentinel value allowing me to easily iterate over all the resources and know when I've reached the end.  
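As a rough illustration of the sentinel, the sketch below walks the enumeration until it hits NO_RESOURCE.  The enum here is a hand-written stand-in for the generated one, with hypothetical resource names, and allResources is an assumed helper name rather than something from the project.

```cpp
#include <vector>

// Stand-in for the generated enumeration; the resource names are
// hypothetical examples of what the build might emit.
enum Resource {
    FONTS_EXAMPLE_SDFF,
    SHADERS_SIMPLE_FS,
    SHADERS_SIMPLE_VS,
    NO_RESOURCE  // sentinel, always the last enumerator
};

// Collect every real resource by counting up until the sentinel.
// This relies on the enumerators being contiguous and starting at 0,
// which holds when no explicit values are assigned.
std::vector<Resource> allResources() {
    std::vector<Resource> result;
    for (int i = 0; i != NO_RESOURCE; ++i) {
        result.push_back(static_cast<Resource>(i));
    }
    return result;
}
```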

Similarly in the cpp input file I have the following

const Resources::Pair Resources::RESOURCE_MAP_VALUES[] = { 
${RESOURCE_MAP}  std::make_pair(NO_RESOURCE, "")
};

RESOURCE_MAP contains the output of MAP_BUFFER, so it ends up creating an array that associates my enum values with the relative paths of all my resources.  
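To show how such an array might be consumed, here is a small sketch that looks up the relative path for an enum value by scanning until the sentinel entry.  The enum, the Pair typedef and the getResourcePath function are stand-ins written for this example; the real generated code may be organised differently.

```cpp
#include <string>
#include <utility>

// Hand-written stand-ins for the generated enum and map.
enum Resource { SHADERS_SIMPLE_FS, SHADERS_SIMPLE_VS, NO_RESOURCE };

typedef std::pair<Resource, const char*> Pair;

static const Pair RESOURCE_MAP_VALUES[] = {
    std::make_pair(SHADERS_SIMPLE_FS, "shaders/simple.fs"),
    std::make_pair(SHADERS_SIMPLE_VS, "shaders/simple.vs"),
    std::make_pair(NO_RESOURCE, "")  // sentinel entry terminates the array
};

// Linear scan over the map; stops at the sentinel and returns an
// empty string if the resource is not present.
std::string getResourcePath(Resource resource) {
    for (const Pair* entry = RESOURCE_MAP_VALUES;
         entry->first != NO_RESOURCE;
         ++entry) {
        if (entry->first == resource) {
            return entry->second;
        }
    }
    return std::string();
}
```

A linear scan is fine for a handful of resources; since the enum values are contiguous, indexing the array directly would also work.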

The end result is that I have a ResourceEnums.h and ResourceEnums.cpp that contain all the information my applications will need in order to load the actual resource data.

Storage

Windows and OSX actually have pretty good solutions for resource storage.  Windows has the ability to store non-code resources in an executable or DLL.  OSX applications are actually deep folder structures, with a specific location reserved for such resources.

To my knowledge Linux doesn't really have this sort of easy 'resource fork' functionality.  There's mention on Stack Overflow of using objcopy to push resources into an executable, but it looks pretty fiddly and in order to get it to work properly I'd really need to build a framework around it, which is something I want to avoid.  For the time being I've sidestepped this problem, because while I do want to create applications for OSX and Windows that I can distribute, I'm less motivated to do so for Linux. Instead I think it's satisfactory (for now) for Linux to go ahead and load files from their original paths on the drive.

However, I'd still like to avoid having every executable store a copy of every resource, particularly when many of them won't actually be using more than a few of them.  For Windows I can create a resource DLL.  Similarly, for OSX I can create a framework, which serves a similar purpose (in theory... in actuality CMake's support for creating shared frameworks is sub-optimal).

Windows

For the Windows resources I have a Resources.rc.in file that contains in its entirety

${RESOURCE_RC}

This variable is populated with the output of RC_BUFFER from the for loop above.  During a Windows build the resource compiler will parse this file and create resources in the output DLL for all the referenced files.  So for instance the presence of the file images/images/Tuscany Undistorted.png in my resources project produces the following line in the output file:

images/images/Tuscany_Undistorted.png TEXTFILE "C:/Users/bdavis/Git/OculusRiftExamples/resources/images/Tuscany Undistorted.png"

Note the underscore in the relative path portion at the beginning of the line.  

OSX

For OSX environments I'd like to have the resources project produce a shared framework that includes all the resources.  So far I haven't found out how to make CMake do that, so I'm having to fall back on copying the resources into each individual application.  This is accomplished in the CMakeLists.txt file for creating the examples.

        list(APPEND SOURCE_FILES ${ALL_RESOURCES})
        foreach(resource_file ${ALL_RESOURCES}) 
            file(RELATIVE_PATH relative_path ${RESOURCE_ROOT} ${resource_file})
            get_filename_component(relative_path ${relative_path} PATH)
            set_source_files_properties(${resource_file} 
                PROPERTIES MACOSX_PACKAGE_LOCATION Resources/${relative_path})
            source_group("Resources\\\\${relative_path}" FILES ${resource_file})
        endforeach()
        add_executable(${EXECUTABLE} MACOSX_BUNDLE ${SOURCE_FILES} )

This process makes the files available to the native OSX APIs for fetching application resources.

Loading

The code for loading the resources at runtime is located in the library produced by the Resources project, regardless of whether the resources are actually stored there or not.  There's a bit of complexity introduced by the various platforms and the pre-processor logic used to break them up, but the interface is simple

class Resources {
public:
  static time_t getResourceModified(Resource resource);
  static size_t getResourceSize(Resource resource);
  static void getResourceData(Resource resource, void * out);
};

You'll notice that despite the class usage, the interface for fetching data is pretty C-ish.  Because on Windows the resources are going to be located in another DLL with its own memory management, we can't risk anything that would allow one module to allocate memory that would end up being freed by another module.  This means no passing or returning of container types like std::string or std::vector.  Instead, if we want a given resource we have to query for its size, and then make another call to fetch the data itself into storage we've allocated for it.  
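From the caller's side, the two-call pattern looks roughly like the sketch below.  The fake namespace is a stand-in I've written so the example is self-contained; it only mimics the signatures of getResourceSize and getResourceData, and loadResourceAsString is a hypothetical convenience wrapper that would live in the application, not in the resource library.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

enum Resource { SHADERS_SIMPLE_VS, NO_RESOURCE };

// Stand-in for the resource library, serving one in-memory resource.
namespace fake {
    const char SIMPLE_VS[] = "void main() {}";

    std::size_t getResourceSize(Resource resource) {
        return resource == SHADERS_SIMPLE_VS ? sizeof(SIMPLE_VS) - 1 : 0;
    }

    void getResourceData(Resource resource, void* out) {
        if (resource == SHADERS_SIMPLE_VS) {
            std::memcpy(out, SIMPLE_VS, sizeof(SIMPLE_VS) - 1);
        }
    }
}

// Caller-side wrapper: ask for the size, allocate in the calling
// module, then have the library fill the caller-owned buffer.
std::string loadResourceAsString(Resource resource) {
    const std::size_t size = fake::getResourceSize(resource);
    std::vector<char> buffer(size);
    if (size > 0) {
        fake::getResourceData(resource, buffer.data());
    }
    return std::string(buffer.begin(), buffer.end());
}
```

Because the std::vector and std::string are created and destroyed entirely on the application side, no memory crosses the module boundary.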

You may wonder what the purpose of the getResourceModified method is, since it doesn't really make sense for an application loading resources out of the executable to care about the modified time of either the executable or the original resource.  Neither can change under those conditions.  However, as I said before, I want to have a mechanism for debugging.  In particular, debugging shaders is often easier if I can make a change to the shader and have the running application pick it up immediately.  So in addition to loading resources out of the resource fork, I want to be able to build my applications to load them directly from their source locations while debugging.  This mechanism is triggered by a pre-processor define called FILE_LOADING in the source.  If FILE_LOADING is defined then resources won't be loaded out of a DLL or an application bundle, but will be loaded from their original location.  

On every platform, FILE_LOADING is defined when the RIFT_DEBUG flag is enabled.  On Linux, FILE_LOADING is always defined.  
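One way an application could use the modified time for live reloading is to poll it once per frame and reload when it changes.  The ResourceWatcher class below is a hypothetical sketch of that idea, not code from the project; it takes the timestamp function as a parameter so that something with the shape of getResourceModified can be plugged in.

```cpp
#include <ctime>
#include <functional>

enum Resource { SHADERS_SIMPLE_VS, NO_RESOURCE };

// Hypothetical helper that remembers the last modification time seen
// for one resource and reports when it changes.
class ResourceWatcher {
public:
    typedef std::function<std::time_t(Resource)> ModifiedFn;

    ResourceWatcher(Resource resource, ModifiedFn modified)
        : resource_(resource),
          modified_(modified),
          lastSeen_(modified_(resource_)) {}

    // Returns true exactly once for each change in the timestamp.
    bool changed() {
        const std::time_t current = modified_(resource_);
        if (current == lastSeen_) {
            return false;
        }
        lastSeen_ = current;
        return true;
    }

private:
    Resource resource_;
    ModifiedFn modified_;
    std::time_t lastSeen_;
};
```

In a render loop this would be something like: if the watcher reports a change, fetch the shader source again and recompile it.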

Each of the three methods starts by turning the passed Resource enum into a string.  The contents of the string differ based on platform and whether FILE_LOADING is defined.  If FILE_LOADING is defined, then the string contains the full absolute path to the resource on the local system.  

On OSX, if FILE_LOADING is not defined, the string contains the relative path to the file, allowing it to be used with CFBundleCopyResourceURL and CFURLGetFileSystemRepresentation to turn it into an absolute file path.

On Windows, if FILE_LOADING is not defined, the string contains the relative path, but with whitespace characters turned to underscores, to match the names stored in the resource DLL.  This can then be used with FindResource and LoadResource to get the resource data.  
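The branching described above can be sketched as follows.  This is an assumption about the general shape rather than the actual implementation: RESOURCE_ROOT, toWindowsResourceName and resolveResourceName are hypothetical names, the root path is a placeholder, and the sketch replaces only the space character to mirror the substitution done in the CMake loop.

```cpp
#include <string>

// Placeholder for the absolute resource directory; in a real build
// this would presumably be supplied by the build system.
static const std::string RESOURCE_ROOT = "/path/to/resources/";

// Mirror the CMake substitution: spaces become underscores so the
// name matches the entry written into the .rc file.
std::string toWindowsResourceName(std::string relativePath) {
    for (char& c : relativePath) {
        if (c == ' ') {
            c = '_';
        }
    }
    return relativePath;
}

// Pick the string form appropriate to the platform and build mode.
std::string resolveResourceName(const std::string& relativePath) {
#if defined(FILE_LOADING)
    // Debug builds (and Linux): absolute path on the local disk.
    return RESOURCE_ROOT + relativePath;
#elif defined(_WIN32)
    // Windows release builds: name of the entry in the resource DLL.
    return toWindowsResourceName(relativePath);
#else
    // OSX release builds: relative path, resolved via the bundle APIs.
    return relativePath;
#endif
}
```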

Other Languages

I want to be able to use my resources in Java as well, and on Android.  For Java, it's not too hard.  I already use Maven for project and dependency management in Java.  By including a pom.xml in my resource directory I'm able to tell Maven to construct a jar consisting entirely of the resources, preserving their directory structure.

This is the relevant section of the pom's build section

    <resources>
      <resource>
        <directory>.</directory>
        <includes>
          <include>images/*</include>
          <include>shaders/*</include>
          <include>meshes/*</include>
          <include>fonts/*</include>
        </includes>
      </resource>
    </resources>

This only solves the problem of storage though, not enumeration.  The same technique I used in the CMakeLists.txt to generate a C++ header and source file containing the enumeration could be applied here to create a similar Java file, but by default such files are placed in the build output directory.  This means that the pom file would have to be generated as well in order to refer to it.  This in turn makes it more annoying to set up an Eclipse workspace, though perhaps not insurmountably so.

For Android, the files need to be 'assets' rather than resources.  Android has its own special brand of managing non-code artifacts, and in the Android world 'resource' means something that is very specifically not what I'm dealing with.  Assets on the other hand refer to arbitrary non-code artifacts that should be included in a given application or library.  Again, it should be possible to cause CMake to generate the appropriate Java and project files for doing so, but I haven't gone far in that direction yet.  

Future Work

I'd like to make this easier to use overall, perhaps by making it more encapsulated and easier to integrate into other CMake projects.  Obviously I'd like to solve the problem of using a shared framework on a Mac rather than duplicating the resources.  I'd also eventually like to solve the problem of executable resources on Linux.  Perhaps it will require essentially re-implementing the Windows resource fork functionality using the objcopy mechanism mentioned in the Stack Overflow question, or perhaps it could involve something more brute-force, like generating a C++ source file for each resource that contains the binary representation of it in a byte array.  
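To give a feel for the brute-force option, here is a sketch of a generator that turns a resource's bytes into C++ source text.  Nothing like this exists in the project yet; emitByteArraySource and the _SIZE suffix are names invented for the example, and it assumes the resource is non-empty, since an empty initializer list would not be a valid array definition.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical generator: emits a C++ definition of a byte array
// holding the resource, plus a companion constant with its length.
// Assumes bytes is non-empty.
std::string emitByteArraySource(const std::string& symbol,
                                const std::vector<unsigned char>& bytes) {
    std::ostringstream out;
    out << "extern const unsigned char " << symbol << "[] = {";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) {
            out << ",";
        }
        // Cast so the byte is written as a number, not a character.
        out << static_cast<unsigned int>(bytes[i]);
    }
    out << "};\n";
    out << "extern const unsigned long " << symbol << "_SIZE = "
        << bytes.size() << ";\n";
    return out.str();
}
```

The generated files would be compiled into the resource library like any other source, at the cost of longer build times for large images and meshes.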
