To pad or not to pad… (Vertex data alignment on PowerVR)

This topic contains 10 replies, has 5 voices, and was last updated by  Dark_Photon 2 years ago.

Viewing 11 posts - 1 through 11 (of 11 total)
  • #52078

    Does this performance recommendation apply to all Series 6 GPUs?:

    When vertex data is interleaved, each vertex should be aligned to a four byte boundary.

    (This is from the PowerVR Perf guide, but I thought I’d double-check as sometimes the recommendations haven’t been updated for Series 6.)

    #52080

    Joe Davis
    Member

    Hi Dark_Photon,

    I did a quick review of the performance recommendations doc for the 4.0 SDK to make sure the most important Rogue recommendations were in there (hopefully, we’ll be able to make bigger changes for the 4.1 SDK). Here’s the new section on vertex attributes:

    For optimal performance on PowerVR Graphics Cores, a mesh with static attribute data should:

    • Use indexed triangle lists
    • Interleave VBO attribute data
    • Ensure that every VBO attribute is used by the shader
    • Align to 16 bytes
    • Avoid changing the layout of VBO attribute data

    These rules are applicable to SGX and Rogue.

    #52083

    Simon Fenney
    Moderator

    “Ensure that every VBO attribute is used by the shader”
    Shouldn’t that be worded as “don’t include unused attributes”?

    #52116

    Joe Davis
    Member

    Good suggestion – that’s definitely clearer. I’ll see if I can squeeze the change in before the 4.0 release.

    #52121

    Thanks, Joe. Two questions:

    * Align to 16 bytes

    Does this mean:
    1) Align the starting address/offset of each batch to a 16-byte aligned boundary, or
    2) Align every vertex within an interleaved VBO to a 16-byte aligned boundary?

    I suspect #1, as #2 could waste a lot of space (and memory bandwidth)!

    If #1, then these recommendations don’t say anything about alignment/padding of individual vertices within an interleaved VBO. That was my question. Any recommendation there?

    Thanks.
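
To make the two readings concrete, here is a minimal sketch in C. It assumes a hypothetical 36-byte packed vertex (3 floats position, 3 floats normal, 2 floats UV, 4 bytes colour); the helper names are invented for illustration and are not part of any PowerVR SDK. Under reading 1 only the batch start offset moves; under reading 2 the stride grows from 36 to 48 bytes, which is one third more vertex data.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of alignment.
   alignment must be a power of two. */
static size_t align_up(size_t n, size_t alignment)
{
    return (n + alignment - 1) & ~(alignment - 1);
}

/* Reading 1: only the start of each batch is 16-byte aligned;
   vertices inside the batch stay tightly packed. */
static size_t next_batch_offset(size_t prev_batch_end)
{
    return align_up(prev_batch_end, 16);
}

/* Reading 2: every vertex is padded so that the stride
   is a multiple of 16 bytes. */
static size_t padded_stride(size_t packed_vertex_size)
{
    return align_up(packed_vertex_size, 16);
}

/* Extra bytes spent on padding for a whole mesh under reading 2. */
static size_t padding_cost(size_t packed_vertex_size, size_t vertex_count)
{
    return (padded_stride(packed_vertex_size) - packed_vertex_size)
           * vertex_count;
}
```

For a 1000-vertex mesh with the assumed 36-byte vertex, `padding_cost(36, 1000)` comes to 12000 bytes of padding, whereas reading 1 costs at most 15 bytes per batch.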

    #52124

    warmi
    Member

    Hmm... I have always aligned at the vertex attribute boundary (not the vertex struct), accepting the resulting wastage. My assumption was that the recommendation relates to the code reading attributes rather than the code copying vertex structs. But who knows... maybe I was wrong.

    #52149

    Joe Davis
    Member

    Hi Dark_Photon, warmi,

    Align to 16 bytes

    This recommendation refers to padding the attributes of every vertex to 16 byte boundaries.

    #2 could waste a lot of space [and memory bandwidth]!

    Definitely. I’ve softened the wording of this recommendation in the doc. It should improve the performance of GPU cache accesses but, as you’ve pointed out, is unlikely to help if it causes an application to be bottlenecked by storage space or memory bandwidth.

    While making changes, I’ve also added clarification to the “Avoid changing the layout of VBO attribute data” recommendation. Here’s the revised section:

    For optimal performance on PowerVR Graphics Cores, a mesh with static attribute data should:

    • Use indexed triangle lists;
    • Interleave VBO attribute data;
    • Not include unused attributes.

    For optimal vertex shader execution performance, meshes transformed by the same vertex shader (even if compiled into different shader programs) must have the same VBO attribute data layout.

    On some devices, padding each vertex to 16 byte boundaries may also improve performance.
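
As an illustration of the revised wording, the sketch below shows one way an interleaved vertex could be padded to a 16-byte multiple. The `Vertex` struct, its attribute set, and the `AttribDesc` table are assumptions made up for this example, not something taken from the recommendations document; it also assumes a 4-byte `float`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interleaved vertex: 36 bytes of attributes
   plus explicit padding up to 48 bytes. */
typedef struct Vertex {
    float   position[3];  /* offset  0, 12 bytes */
    float   normal[3];    /* offset 12, 12 bytes */
    float   uv[2];        /* offset 24,  8 bytes */
    uint8_t color[4];     /* offset 32,  4 bytes */
    uint8_t pad[12];      /* offset 36, unused, pads the stride */
} Vertex;

/* Fail the build if the layout drifts from what we expect. */
_Static_assert(sizeof(Vertex) % 16 == 0,
               "vertex stride should be a multiple of 16 bytes");
_Static_assert(offsetof(Vertex, color) == 32,
               "unexpected compiler padding inside Vertex");

/* One attribute as a vertex-array API would consume it
   (hypothetical descriptor type, for illustration only). */
typedef struct AttribDesc {
    int    components;  /* number of components in the attribute */
    size_t offset;      /* byte offset within one vertex */
    size_t stride;      /* byte distance between consecutive vertices */
} AttribDesc;

static const AttribDesc kLayout[4] = {
    { 3, offsetof(Vertex, position), sizeof(Vertex) },
    { 3, offsetof(Vertex, normal),   sizeof(Vertex) },
    { 2, offsetof(Vertex, uv),       sizeof(Vertex) },
    { 4, offsetof(Vertex, color),    sizeof(Vertex) },
};
```

In an OpenGL ES application, each entry's `stride` and `offset` would typically be what gets passed as the stride and pointer arguments of `glVertexAttribPointer`. Keeping one such table per vertex shader is also a simple way to check that meshes sharing a shader share a layout. Whether the 12 bytes of padding pay off still has to be measured on the target device.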

    #52155

    Thank you, Joe. That clarifies it. One short follow-up:

    On some devices, padding each vertex to 16 byte boundaries may also improve performance.

    Could you specify which GPU series prefer this (Series 6? SGX?)? Or is this a function of the system that the PowerVR GPU is embedded in?

    If the former, this would be great info to have in the Performance recommendations!

    #52160

    Ganesh
    Member

    I have one more question regarding this topic.
    Suppose I have two shaders: one uses all of the vertex attributes, whereas the other uses only some of them.
    A good example is rendering a reflection texture (later used in the main rendering pass).
    Which scenario is better for performance:
    should I create two separate vertex buffers, or can the same vertex buffer be shared across both passes?

    #52241

    Joe Davis
    Member

    Hi Dark_Photon,

    Could you specify which GPU series prefer this [Series 6? SGX?]? Or is this a function of the system that the PowerVR GPU is embedded in?

    The recommendation is based on the way Rogue GPUs behave. I’ll look into adding SGX recommendations to a future version of the doc. Based on preliminary discussions with our competitive analysis team though, I believe a 16 byte alignment recommendation would also apply to SGX.

    Hi Ganesh,

    Do I create two separate vertex buffers, or can the same vertex buffer be shared across both passes?

    Theoretically, creating a VBO for each render would result in the best performance as it will increase the speed at which the GPU can copy attribute data to USC registers. However, the benefit of implementing this depends very heavily on where your render is bottlenecked. You would have to implement both solutions and benchmark the performance of each to see if the performance gain is worth the added complexity of having to duplicate data into multiple VBOs. Unless you are heavily vertex processing limited, I suspect using a single interleaved VBO for both passes would be fast enough.
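
For anyone wanting to benchmark the two-VBO variant, here is a rough sketch of how the second buffer's data might be built on the CPU. It assumes, purely for illustration, that the reflection pass only needs positions; `FullVertex`, `PosVertex` and `extract_positions` are hypothetical names, and a 4-byte `float` is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Vertex used by the main pass (hypothetical layout, 32 bytes). */
typedef struct FullVertex {
    float position[3];
    float normal[3];
    float uv[2];
} FullVertex;

/* Vertex used by the reflection pass: position only,
   padded with one float so the stride is 16 bytes. */
typedef struct PosVertex {
    float position[3];
    float pad;
} PosVertex;

/* Copy just the positions out of the full interleaved array.
   dst must have room for count elements. */
static void extract_positions(const FullVertex *src, PosVertex *dst,
                              size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        memcpy(dst[i].position, src[i].position, sizeof dst[i].position);
        dst[i].pad = 0.0f;
    }
}
```

The resulting array would be uploaded as its own VBO and bound only for the reflection pass. With these assumed layouts the reflection pass reads a 16-byte stride instead of 32, at the price of storing positions twice, so the benchmark Joe describes is what decides whether it is worth it.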

    #52242

    That’s what I needed. Thanks again!
