It’s been a bit hard to grasp the full concept though, so I second the request for 101-style tutorial documentation. That would make the learning curve easier, as the COLLADA specification is mostly just a reference and leaves many things unanswered.
Yes, we do our best, but more is needed indeed.
We wrote a book to give more information about the concepts, along with implementation details that we cannot include in the specification, since it has to be a reference document by design.
We are currently working on tutorials that we plan to add to a wiki section of collada.org. But we need help with this effort.
- What is the current status with external binary files?
COLLADA users all have their own preferred binary format, and most of the time one binary format per target platform. Defining a unique format is outside of the scope of the COLLADA specification.
It’s encouraging to see that there is at least some sort of effort to “binarize” COLLADA and make file sizes and loading times more acceptable for larger models. However, I’m a bit worried that the existing raw binary method would generate many binary files, one for each array, greatly increasing the external file count per .dae. I’m not 100% sure about this, so correct me if I’m wrong, but it just doesn’t sound like a very scalable solution.
Encouraging is the right word! We want to encourage game developers to use COLLADA’s design to address their needs. To answer your question, the COLLADA DOM binary sample will create a single raw file containing all the values stored in arrays in the document, using the ‘offset’ and ‘stride’ values of accessors.
Note that this does not address issues such as alignment, little/big endian, NaN and so forth, so the binary raw file is only usable on the same kind of platform/OS that created it.
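As a rough sketch of how the offset/stride scheme can work (the file name, counts, and the idea of pointing the accessor’s source URI at a raw file are illustrative here, and the actual COLLADA DOM sample’s conventions may differ), two accessors can share one interleaved blob by using the same source with different offsets:

```xml
<!-- Hypothetical layout: scene.raw holds interleaved position/normal floats. -->
<source id="positions">
  <technique_common>
    <!-- offset 0, stride 6: reads floats 0,1,2 then 6,7,8, ... -->
    <accessor source="./scene.raw" count="1000" offset="0" stride="6">
      <param name="X" type="float"/>
      <param name="Y" type="float"/>
      <param name="Z" type="float"/>
    </accessor>
  </technique_common>
</source>
<source id="normals">
  <technique_common>
    <!-- offset 3, stride 6: reads floats 3,4,5 then 9,10,11, ... -->
    <accessor source="./scene.raw" count="1000" offset="3" stride="6">
      <param name="X" type="float"/>
      <param name="Y" type="float"/>
      <param name="Z" type="float"/>
    </accessor>
  </technique_common>
</source>
```

Because both accessors index into the same flat float sequence, a single raw file can serve every array in the document.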
Maybe it would be better to solve the issue by creating a real binary-based COLLADA specification, which could be translated to XML when needed, than introducing external files and raw-data outside COLLADA. What do you think?
I am not sure what the advantage of this would be.
That would be a huge amount of work. A binary format would have to be bidirectional (XML->binary->XML), lossless (64- or 128-bit floats?), and cross-platform (endianness, alignment, IEEE vs. other encodings), and it would require writing a lot of tooling for validation, where today we can use any XML validation tool.
Maybe you are wondering about performance and size issues?
We have already made some measurements, and we do not see any significant difference between the XML and binary implementations.
In fact, if you compare COLLADA (XML) with the interchange format used by Autodesk between Max and Maya (FBX, a binary format), we have observed that the same content saved as FBX is twice as large as with COLLADA.
If you compare with Maya .ma files for large files, the COLLADA file is 1.2x to 1.8x larger than the source file. So no big difference there. Then if you use zip to make the files even smaller, you will observe that the COLLADA files compress better than the binary file, and in all our tests the zipped COLLADA file is smaller than the zipped binary file.
If you look at performance, most of the time spent in the COLLADA loader or exporter in 3dsMax or Maya is spent in the internal SDK, so even if you completely eliminate the serialization time (either in binary or XML), the export/import time will be the same.
- How is the COLLADA content pipeline supposed to work?
Most of the time COLLADA documents are created by exporting data from a modeling tool, and each export creates a new COLLADA file. Since one of the main points of COLLADA is that external data is retained in the file and can be processed, modified, and saved by various tools, I don’t see how this fits with the fact that all that data is lost when the model is re-exported.
Indeed, if you overwrite the previous document with the new one, all the information that was in the previous document will be lost. That’s the nature of the file system, and has nothing to do with COLLADA per se.
If you are at the origin of your content pipeline, this is not a problem. But if you are in the middle of your content pipeline, then you’ll have to import, and then export back in the COLLADA document.
The rule of thumb is that your source (format) is the place where you store all the information you need, so you do not lose the information the next time you ‘export’ from the source.
If you decide that your source is one specific tool, then you have to make sure to bring all your data into this tool (via plug-ins), make sure that the given source format can handle all your extra data, and make sure that you always edit the data within this tool.
Another choice is to use COLLADA as your source, in which case you make sure that you always import/export (and not load/save) from all the tools. This way you do not have to concentrate all your data creation in one single tool, you are not limited by how much a single tool can handle, and you gain a lot of flexibility and productivity. You enable simpler and more specialized tools to be used in your pipeline, and you can take advantage of new releases of tools in the middle of game development (since you do not rely on their source format and do not depend on a lot of plug-ins).
It’s your choice.
I figure there should be some intelligent merging of files going on, but as far as I can see there’s no support for merging in the COLLADA DOM or FCOLLADA. I think it would be useful to have in the base libraries, as that’s what everyone probably needs anyway.
Another way to take advantage of COLLADA is to use multiple sources and external references. You would be creating a ‘root’ document, which contains instances of geometry, materials (itself an instance of FX), animations, skin… Each of those elements are external COLLADA files that can be created by the same, or different application. This workflow has several advantages, since it allows for much faster exports, concurrent/collaborative development, utilization of several tools on the same content, ‘real-time’ partial updates and so forth.
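A minimal root document in this style might look like the following (the file names and ids are hypothetical); each url resolves into a separate COLLADA document that can be re-exported independently by a different tool or artist:

```xml
<COLLADA xmlns="http://www.collada.org/2005/11/COLLADASchema" version="1.4.1">
  <library_visual_scenes>
    <visual_scene id="root">
      <node id="hero">
        <!-- the geometry lives in its own document, exported by the modeler -->
        <instance_geometry url="./meshes/hero.dae#hero-mesh">
          <bind_material>
            <technique_common>
              <!-- the material lives in yet another document -->
              <instance_material symbol="skin"
                                 target="./materials/shared.dae#hero-skin"/>
            </technique_common>
          </bind_material>
        </instance_geometry>
      </node>
    </visual_scene>
  </library_visual_scenes>
  <scene>
    <instance_visual_scene url="#root"/>
  </scene>
</COLLADA>
```

Re-exporting hero.dae updates the mesh everywhere it is instantiated, without touching the root document or the materials.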
Merging and diff would indeed be really useful tools for COLLADA. They would need to be graphical tools. Imagine being able to highlight the meshes that have changed between two versions and switch between the two. Imagine using this capability to merge files, just as is done for source-code development.
COLLADA’s asset tag, which is available on a per-element basis, can be used to drastically improve asset management. Unfortunately there has been little effort from the tool companies. The only reference to this I have seen so far is the Pinecoast CLManager.
Automatic merging is yet another issue that has not been completely solved even for source code, so we are probably not there yet for 3D data.
This of course wouldn’t be a problem if COLLADA were used as the source format in modeling tools, but that’s unlikely to be the case in the near future, isn’t it?
Yes, it is most probably unlikely that the main DCC tools will drop their own format and switch to COLLADA! But this is not unlikely for smaller specialized tools, and it has already been done for Equinox-3D, for example. The author used the <extra> extensibility mechanism to add all the information needed by the tool (menu and window positions, extra geometry/shader definitions…) and dropped the original source format.
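As a hypothetical sketch (the profile name and the child elements are invented for illustration, not Equinox-3D’s actual schema), tool-specific state rides along inside an <extra> block, and other importers simply skip techniques whose profile they don’t recognize:

```xml
<node id="editor-camera">
  <instance_camera url="#persp-camera"/>
  <extra>
    <!-- hypothetical profile: only this one tool reads it; others ignore it -->
    <technique profile="EQUINOX3D">
      <window x="10" y="20" width="640" height="480"/>
      <menu state="expanded"/>
    </technique>
  </extra>
</node>
```

This is the standard COLLADA pattern for extensions: the common data stays portable, and each tool keys its private data off its own profile string.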
Note that it actually does not make a difference, since you are the one who makes the choice of what the source format in your pipeline will be, as explained above.
In the future, we are looking at a very different model: getting rid of files.
Instead, we want to use a real-time 3D database system, to which we will send updates and from which we will get notifications. This provides the ultimate speed, with:
- real-time update of the data from the modeler to the game engine
- ultimate security of data, since even if an application crashes the data is safe in the server
- collaborative development, since several artists can act on the same data at the same time.
We are actively collaborating with the Verse project and already did a prototype demonstration during GDC’07.