Designing the Framework of a
Parallel Game Engine
How to Get the Most Out of a Multi‐Core CPU with Your Game Engine
Table of Contents
How to Get the Most Out of a Multi-Core CPU with Your Game Engine
1. Introduction
1.1. Overview
1.2. Assumptions
2. Parallel Execution State
2.1. Execution Modes
2.1.1. Free Step Mode
2.1.2. Lock Step Mode
2.2. Data Synchronization
3. The Engine
3.1. Framework
3.1.1. Scheduler
3.1.2. Universal Scene & Objects
3.2. The Managers
3.2.1. Task Manager
3.2.2. State Manager
3.2.3. Service Manager
3.2.4. Environment Manager
3.2.5. Platform Manager
4. Interfaces
4.1. Subject and Observer Interfaces
4.2. Manager Interfaces
4.3. System Interfaces
4.4. Change Interfaces
5. Systems
5.1. Types
5.2. System Components
5.2.1. System
5.2.2. Scene
5.2.3. Object
5.2.4. Task
6. Tying It All Together
6.1. Initialization Stage
6.2. Scene Loading Stage
6.3. Game Loop Stage
6.3.1. Task Execution
6.3.2. Distribution
6.3.3. Runtime Check and Exit
7. Final Thoughts
8. About the Author
Appendix A. Example Engine Diagram
Appendix B. Engine and System Relationship Diagram
Appendix C. The Observer Design Pattern
Appendix D. Tips on Implementing Tasks
List of Figures
Bibliography
1. Introduction
With the advent of multiple cores within a processor, the need to create a parallel game engine has
become more and more important. It is still possible to focus primarily on just the GPU and have a
single-threaded game engine, but utilizing all the processors on a system, whether CPU or GPU,
can give the user a much better experience. For example, by utilizing more CPU
cores a game could increase the number of rigid body physics objects for greater effects on screen, or
develop smarter AI that behaves in a more human-like way.
1.1. Overview
The “Parallel Game Engine Framework”, or simply the engine, is a multi-threaded game engine designed to
scale to as many processors as are available within a platform. It does this by executing different
functional blocks in parallel so that it can utilize all available processors. This is easier said than
done, as the many pieces of a game engine often interact with one another, and those interactions
can cause many threading errors. The engine takes these scenarios into account and has
mechanisms for synchronizing data properly without being bound by
synchronization locks. The engine also has a method for executing data synchronization in parallel
in order to keep serial execution time to a minimum.
1.2. Assumptions
This paper assumes a good working knowledge of modern computer game development as well as
some experience with game engine threading or threading for performance in general.
2. Parallel Execution State
The concept of a parallel execution state in an engine is crucial to an efficient multi-threaded
runtime. In order for a game engine to truly run in parallel, with as little synchronization overhead as
possible, it will need to have each system operate within its own execution state with as little
interaction as possible with anything else that is going on in the engine. Data still needs to be shared,
however, but now instead of each system accessing a common data location to, say, get position or
orientation data, each system has its own copy. This removes the data dependency that exists
between different parts of the engine. Notices of any changes made by a system to shared data are
sent to a state manager, which then queues up all the changes; this is called messaging. Once the different
systems are done executing, they are notified of the state changes and update their internal data
structures, which is also part of messaging. Using this mechanism greatly reduces synchronization
overhead, allowing systems to act more independently.
2.1. Execution Modes
Execution state management works best when operations are synchronized to a clock, meaning the
different systems execute synchronously. The clock frequency may or may not be equivalent to a
frame time and it is not necessary for it to be so. The clock time does not even have to be fixed to a
specific frequency but could be tied to frame count, such that one clock step would be equal to how
long it takes to complete one frame regardless of length. How you would like to
implement your execution state will determine the clock time. Figure 1 illustrates the different systems
operating in free step mode of execution, meaning they don’t all have to complete their execution on
the same clock. There is also a lock step mode of execution (see Figure 2) where all systems execute
and complete in one clock.
Figure 1: Execution State using Free Step Mode (diagram: the Graphics, Physics, AI, and Other systems shown for Clock n and Clock n+1, with Data Synchronization through the State Manager)
2.1.1. Free Step Mode
This mode of execution allows systems to operate in the time they need to complete their
calculations. “Free” can be misleading, as a system is not free to complete whenever it wants to, but is
free to select the number of clocks it will need to execute.
With this method a simple notification of a state change to the state manager is not enough; data
also needs to be passed along with the state change notification. This is because a system that has
modified shared data may still be executing when a system that wants the data is ready to do an
update. This requires more memory and more copies, so it may not be the ideal mode
for all situations.
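The point about passing data along with the notification can be sketched in C++ as follows. This is only an illustration under assumed names: `Position`, `ChangeNotice`, `StateManager`, `postChange`, and `drain` are hypothetical and do not come from the engine itself. The notice stores a copy of the changed value, so an observer reads a stable snapshot even if the producing system keeps modifying its own data.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical shared datum; any copyable value would do.
struct Position { float x, y, z; };

// A free step notice carries a snapshot of the changed value, because the
// producing system may still be executing when an observer wants the data.
struct ChangeNotice {
    std::uint32_t objectId;  // which object changed
    std::uint32_t sourceId;  // which system made the change
    Position      value;     // copy taken at the moment the change was posted
};

// Hypothetical state manager that only queues notices (no threading shown).
class StateManager {
public:
    // Queue a change together with its data snapshot.
    void postChange(const ChangeNotice& notice) { queue_.push_back(notice); }

    // Hand all queued notices to the caller and leave the queue empty.
    std::vector<ChangeNotice> drain() {
        std::vector<ChangeNotice> out;
        out.swap(queue_);
        return out;
    }

private:
    std::vector<ChangeNotice> queue_;
};

// Usage: the observer sees the posted snapshot, not the producer's later edits.
inline void freeStepExample() {
    StateManager stateManager;
    Position physicsLocal{1.0f, 2.0f, 3.0f};
    stateManager.postChange({7u, 1u, physicsLocal});
    physicsLocal.x = 99.0f;  // the producer is still executing
    const std::vector<ChangeNotice> notices = stateManager.drain();
    assert(notices.size() == 1 && notices[0].value.x == 1.0f);
}
```

The extra copy in `ChangeNotice::value` is the memory cost the text refers to.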
2.1.2. Lock Step Mode
This mode requires that all systems complete their execution in a single clock. This is simpler to
implement and does not require passing data with the notification because systems that are
interested in a change made by another system can simply query the other system for the value (at
the end of execution of course).
Lock step can also implement a pseudo free step mode of operation by staggering calculations
across multiple steps. One use of this is with an AI that will calculate its initial “large view” goal in
the first clock but instead of just repeating the goal calculation for the next clock it can now come
up with a more focused goal based on the initial goal.
Figure 2: Execution State using Lock Step Mode (diagram: the Graphics, Physics, AI, and Other systems shown for Clock n and Clock n+1, with Data Synchronization through the State Manager)
2.2. Data Synchronization
It is possible for multiple systems to make changes to the same shared data. Because of this,
something needs to be put in place in the messaging to determine which value would be the correct
value to use. There are two such mechanisms that can be used:
• Time, where the last system to make the change time-wise has the correct value.
• Priority, where a system with a higher priority will be the one that has the correct value.
This can also be combined with the time mechanism to resolve changes from systems of
equal priority.
Data values that are determined to be stale, via the two mechanisms, will simply be overwritten or
thrown out of the change notification queue.
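The two mechanisms can be combined into a single comparison, sketched here in C++. The `Change` record and the `wins` and `resolve` functions are hypothetical names chosen for illustration, and the sketch assumes each change carries a numeric priority and a timestamp. Higher priority wins, and the later change wins among equal priorities.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical record of one queued change to a piece of shared data.
struct Change {
    std::uint32_t dataId;     // which shared datum (e.g. an object's position)
    int           priority;   // priority of the system that made the change
    std::uint64_t timestamp;  // when the change was made (larger = later)
    float         value;      // the proposed new value
};

// True if `a` should replace `b`: higher priority first, and for equal
// priority the later change wins.
inline bool wins(const Change& a, const Change& b) {
    if (a.priority != b.priority) return a.priority > b.priority;
    return a.timestamp > b.timestamp;
}

// Keep one winning change per datum; stale changes are simply dropped.
// On a complete tie the change that was queued first is kept.
inline std::map<std::uint32_t, Change> resolve(const std::vector<Change>& queue) {
    std::map<std::uint32_t, Change> winners;
    for (const Change& c : queue) {
        auto it = winners.find(c.dataId);
        if (it == winners.end()) {
            winners.emplace(c.dataId, c);
        } else if (wins(c, it->second)) {
            it->second = c;
        }
    }
    return winners;
}

// Usage: a high-priority change beats a later low-priority one.
inline void resolveExample() {
    const std::vector<Change> queue = {
        {1, 2, 5, 2.0f},   // high priority, early
        {1, 1, 10, 1.0f},  // low priority, later: stale
    };
    const std::map<std::uint32_t, Change> winners = resolve(queue);
    assert(winners.size() == 1 && winners.at(1).value == 2.0f);
}
```

Using time alone corresponds to giving every system the same priority.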
Because the data is shared, using relative values for data can prove to be difficult, as some data may
be order dependent when combined. To alleviate this problem, use absolute data values for data
that requires it, so that when systems update their local values they just replace the old with the new.
A combination of both absolute and relative data is ideal, with the choice depending on
each specific situation. For example, common data, like position and orientation, should be kept
absolute, as a transformation matrix built from relative updates would depend on the order in which
they are received, but a custom system that generated particles, via the graphics system, and fully
owned the particle information could merely send relative value updates.
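A small worked example in C++ shows the order dependence of relative updates. It is a simplified 2D sketch with integer coordinates, and `Vec2`, `rotate90`, `translate`, and `Transform` are hypothetical names introduced only for this illustration. Applying the same rotation and translation in different orders gives different points, while replacing absolute values gives the same result in either order.

```cpp
#include <cassert>

struct Vec2 { int x, y; };

inline bool operator==(Vec2 a, Vec2 b) { return a.x == b.x && a.y == b.y; }

// Relative updates: each one is expressed in terms of the current value.
inline Vec2 rotate90(Vec2 p)          { return {-p.y, p.x}; }  // quarter turn counter-clockwise
inline Vec2 translate(Vec2 p, Vec2 d) { return {p.x + d.x, p.y + d.y}; }

// Absolute updates: each one simply replaces a field of the local copy.
struct Transform {
    Vec2 position{0, 0};
    int  quarterTurns = 0;  // orientation in 90-degree steps
};
inline void setPosition(Transform& t, Vec2 p)    { t.position = p; }
inline void setOrientation(Transform& t, int q)  { t.quarterTurns = q; }

inline void orderingExample() {
    const Vec2 start{1, 0};
    const Vec2 step{1, 0};

    // The same two relative updates, received in different orders.
    const Vec2 rotateFirst    = translate(rotate90(start), step);  // (1, 1)
    const Vec2 translateFirst = rotate90(translate(start, step));  // (0, 2)
    assert(!(rotateFirst == translateFirst));

    // The same two absolute updates give the same result in either order.
    Transform a, b;
    setPosition(a, {5, 5});
    setOrientation(a, 1);
    setOrientation(b, 1);
    setPosition(b, {5, 5});
    assert(a.position == b.position && a.quarterTurns == b.quarterTurns);
}
```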
3. The Engine
The engine’s design is focused on flexibility, allowing for the simple expansion of its functionality.
It can therefore be easily modified to accommodate platforms that are constrained by certain
factors, such as memory.
The engine is broken up into two distinct pieces called the framework and the managers. The
framework (section 3.1) contains the parts of the game that are duplicated, meaning there will be
multiple instances of them. It also contains items that have to do with execution of the main game
loop. The managers (section 3.2) are singletons that the game logic is dependent upon.
The following diagram illustrates the different sections that make up the engine:
Figure 3: Engine High-Level Architecture (diagram: the Engine contains the Framework, with the Scheduler, Game Loop, UScene, and UObjects, and the Managers: Task, State, Service, Environment, and Platform; the Systems connect to the Engine through the Interfaces)
Notice that the game processing functionality, referred to as a system, is treated as a separate entity
from the engine. This is for the purpose of modularity, essentially making the engine the “glue” for
tying all the functionality together. Modularity also allows for the systems to be loaded or unloaded
as needed.
The interfaces are the means of communication between the engine and the systems. Systems
implement the interface so that the engine can get access to a system’s functionality, and the engine
implements the interface so that the systems can access the managers.
To get a clearer picture of this concept refer to Appendix A, “Example Engine Diagram”.
As described in section 2, “Parallel Execution State”, the systems are inherently discrete. Because of
this, systems can run in parallel without interfering with the execution of other systems. This does
cause some problems when systems need to communicate with each other, as data is not guaranteed
to be in a stable state. Two reasons for inter-system communication are:
• To inform another system of a change it has made to shared data (e.g. position or
orientation),
• To request functionality that is not available within the system (e.g. the AI system
asking the geometry/physics system to perform a ray intersection test).
The first communication problem is solved by implementing the state manager described in the
previous section. The state manager is discussed in more detail in section 3.2.2, “State Manager”.
To rectify the second problem, a mechanism is included for a system to provide a service that a
different system can use. For a more detailed description, you can reference section 3.2.3, “Service
Manager”.
3.1. Framework
The framework is responsible for tying all the different pieces of the engine together. Engine
initialization occurs within the framework, with the exception of the managers, which are globally
instantiated. The information about the scene is also stored in the framework. For the purpose of
flexibility the scene is implemented as what is called a universal scene, which contains universal objects;
these are merely containers for tying together the different functional parts of a scene. More
information on this is available in section 3.1.2.
The game loop is also located within the framework and has the following flow:
Figure 4: Main Game Loop (diagram: Process Window Messages, then Scheduler Execution (1. Determine Systems to Execute, 2. Send to Task Manager, 3. Wait for Completion), then Distribute Changes, then Check Execution Status)
The first step in the game loop is to process all pending OS window messages as the engine operates
in a windowed environment. The engine would be unresponsive to the OS if this was not done.
The next step is for the scheduler to issue the systems’ tasks with the task manager. This is
discussed in more detail in section 3.1.1 below. Next, the changes that the state manager (section
3.2.2) has been keeping track of are distributed to all interested parties. Finally, the framework
checks the execution status to see if the engine should quit, or perform some other engine execution
action like go to the next scene. The engine execution status is located in the environment manager
which is discussed in section 3.2.4.
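The four steps of the loop described above can be sketched in C++ as a skeleton. Everything named here (`ExecStatus`, `LoopHooks`, `runGameLoop`) is a hypothetical stand-in rather than the engine's actual interface. The sketch runs the tasks inline on the calling thread, whereas the task manager described in this paper would dispatch them to worker threads and wait for completion.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical execution status, kept by the environment manager in the paper.
enum class ExecStatus { Running, Quit };

using Task = std::function<void()>;

// Hypothetical hooks standing in for the engine pieces named in Figure 4.
struct LoopHooks {
    std::function<void()>              processWindowMessages;
    std::function<std::vector<Task>()> determineSystemsToExecute;
    std::function<void()>              distributeChanges;
    std::function<ExecStatus()>        checkExecutionStatus;
};

// Runs the loop until the status says quit; returns the number of iterations.
inline int runGameLoop(const LoopHooks& hooks) {
    int iterations = 0;
    for (;;) {
        hooks.processWindowMessages();                       // keep the OS responsive
        std::vector<Task> tasks = hooks.determineSystemsToExecute();
        for (Task& task : tasks) task();                     // stand-in for the task manager
        hooks.distributeChanges();                           // state manager hands out changes
        ++iterations;
        if (hooks.checkExecutionStatus() == ExecStatus::Quit) break;
    }
    return iterations;
}

// Usage: two systems, two iterations, with every step recorded in order.
inline std::vector<std::string> gameLoopExample() {
    std::vector<std::string> log;
    int count = 0;
    LoopHooks hooks;
    hooks.processWindowMessages = [&] { log.push_back("messages"); };
    hooks.determineSystemsToExecute = [&] {
        return std::vector<Task>{
            [&] { log.push_back("physics"); },
            [&] { log.push_back("graphics"); }};
    };
    hooks.distributeChanges = [&] { log.push_back("distribute"); };
    hooks.checkExecutionStatus = [&] {
        return ++count < 2 ? ExecStatus::Running : ExecStatus::Quit;
    };
    const int iterations = runGameLoop(hooks);
    assert(iterations == 2);
    return log;
}
```

Changes are distributed only after every task of the iteration has finished, which is what lets the systems run without locks in between.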
3.1.1. Scheduler
The scheduler holds the master clock for execution which is set at a pre-determined frequency. The
clock can also run at an unlimited rate, for things like benchmarking mode, so that there is no
waiting for the clock time to expire before proceeding.
The scheduler submits systems for execution, via the task manager, on a clock tick. For free step
mode (section 2.1.1), the scheduler communicates with the systems to determine how many clock
ticks they will need to complete their execution and from there determines which systems are ready
for execution and which systems will be done by a certain clock tick. This amount can be adjusted
by the scheduler if it determines that a system needs more execution time. Lock step mode (section
2.1.2) has all systems start and end on the same clock, so the scheduler will wait for all systems to
complete execution.
3.1.2. Universal Scene & Objects
The universal scene and objects are containers for the functionality that is implemented within the
systems. By themselves, the universal scene and objects do not possess any functionality other than
the ability to interact with the engine. They can, however, be extended to include the f