Vulkan design guidelines
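Vulkan differs from earlier graphics APIs in that drivers do not perform certain optimizations, such as pipeline reuse, on behalf of apps. Instead, apps that use Vulkan must implement these optimizations themselves. If they do not, they may perform worse than apps running OpenGL ES.
When apps implement these optimizations themselves, they can often do so more effectively than a driver can, because they have more specific information about their own use case. As a result, a skillfully optimized Vulkan app can outperform the same app using OpenGL ES.
This page introduces several optimizations that your Android app can implement to gain performance from Vulkan.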
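Hardware acceleration
Most devices support Vulkan 1.1 through hardware acceleration, while a small subset support it through software emulation. Apps can detect a software-based Vulkan device by calling vkGetPhysicalDeviceProperties and checking the deviceType field of the returned structure. SwiftShader and other CPU-based implementations report the value VK_PHYSICAL_DEVICE_TYPE_CPU. Apps can check specifically for SwiftShader by comparing the vendorID and deviceID fields of the same structure against SwiftShader-specific values.
Performance-critical apps should avoid software-emulated Vulkan implementations and fall back to OpenGL ES instead.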
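The check described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names RendererChoice, isSoftwareVulkanDevice, matchesDevice, and chooseRenderer are hypothetical, and the fallback type definitions are stand-ins that exist only so the sketch compiles without the Vulkan SDK (assuming a C++17 compiler for __has_include). In a real app the structure would be filled by vkGetPhysicalDeviceProperties. The SwiftShader-specific vendorID and deviceID values are not given here, so they are left as parameters.

```cpp
#include <cstdint>

#if __has_include(<vulkan/vulkan.h>)
#include <vulkan/vulkan.h>
#else
// Stand-ins (assumption) so the sketch compiles without the Vulkan SDK.
// Enumerator values match the public Vulkan headers.
enum VkPhysicalDeviceType {
    VK_PHYSICAL_DEVICE_TYPE_OTHER = 0,
    VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU = 1,
    VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU = 2,
    VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU = 3,
    VK_PHYSICAL_DEVICE_TYPE_CPU = 4
};
struct VkPhysicalDeviceProperties {  // only the fields read below
    uint32_t vendorID;
    uint32_t deviceID;
    VkPhysicalDeviceType deviceType;
};
#endif

// Hypothetical result type for the app's renderer selection.
enum class RendererChoice { Vulkan, OpenGLES };

// CPU-based implementations such as SwiftShader report the CPU device type.
bool isSoftwareVulkanDevice(const VkPhysicalDeviceProperties& props) {
    return props.deviceType == VK_PHYSICAL_DEVICE_TYPE_CPU;
}

// Compares against caller-supplied IDs; the caller provides the
// implementation-specific values it wants to detect.
bool matchesDevice(const VkPhysicalDeviceProperties& props,
                   uint32_t vendorId, uint32_t deviceId) {
    return props.vendorID == vendorId && props.deviceID == deviceId;
}

// Falls back to OpenGL ES when only software-emulated Vulkan is available.
RendererChoice chooseRenderer(const VkPhysicalDeviceProperties& props) {
    return isSoftwareVulkanDevice(props) ? RendererChoice::OpenGLES
                                         : RendererChoice::Vulkan;
}
```

Keeping the decision in a function that takes the properties structure, rather than calling the Vulkan entry point directly, makes the fallback logic easy to test without a device.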
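Apply display rotation during rendering
When the upward-facing direction of an app doesn't match the orientation of the device's display, the compositor rotates the app's swapchain images so that the two match. It performs this rotation as it displays the images, which consumes more power, sometimes significantly more, than displaying unrotated images.
By contrast, rotating swapchain images while generating them results in little, if any, additional power consumption. The VkSurfaceCapabilitiesKHR::currentTransform field indicates the rotation that the compositor applies to the window. After an app applies that rotation during rendering, it sets the VkSwapchainCreateInfoKHR::preTransform field to the same value to report that the rotation has already been applied.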
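A minimal sketch of the swapchain side of this technique follows. The function names applyPreRotation and isQuarterTurn are hypothetical, and the fallback type definitions are reduced stand-ins so the sketch compiles without the Vulkan SDK (assuming a C++17 compiler for __has_include). The sketch also assumes that currentExtent is reported in the window's current orientation, so width and height are swapped for 90-degree and 270-degree transforms to recover the display's native extent; verify that assumption on your target platform.

```cpp
#include <cstdint>
#include <utility>

#if __has_include(<vulkan/vulkan.h>)
#include <vulkan/vulkan.h>
#else
// Stand-ins (assumption) so the sketch compiles without the Vulkan SDK.
// Enumerator values match the public Vulkan headers.
enum VkSurfaceTransformFlagBitsKHR {
    VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR = 0x1,
    VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR = 0x2,
    VK_SURFACE_TRANSFORM_ROTATE_180_BIT_KHR = 0x4,
    VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR = 0x8
};
struct VkExtent2D { uint32_t width; uint32_t height; };
struct VkSurfaceCapabilitiesKHR {   // only the fields read below
    VkExtent2D currentExtent;
    VkSurfaceTransformFlagBitsKHR currentTransform;
};
struct VkSwapchainCreateInfoKHR {   // only the fields written below
    VkExtent2D imageExtent;
    VkSurfaceTransformFlagBitsKHR preTransform;
};
#endif

// True for the two transforms that exchange the width and height axes.
static bool isQuarterTurn(VkSurfaceTransformFlagBitsKHR t) {
    return t == VK_SURFACE_TRANSFORM_ROTATE_90_BIT_KHR ||
           t == VK_SURFACE_TRANSFORM_ROTATE_270_BIT_KHR;
}

// Reports the compositor's transform back through preTransform, which tells
// the compositor that the app has already rotated its output.
void applyPreRotation(const VkSurfaceCapabilitiesKHR& caps,
                      VkSwapchainCreateInfoKHR* info) {
    info->preTransform = caps.currentTransform;
    info->imageExtent = caps.currentExtent;
    if (isQuarterTurn(caps.currentTransform)) {
        // Assumption: currentExtent follows the window orientation, so the
        // swapchain images need the native (unrotated) dimensions.
        std::swap(info->imageExtent.width, info->imageExtent.height);
    }
}
```

Setting preTransform is only half of the technique: the app must also apply the matching rotation to what it draws, for example in its projection matrix, which this sketch does not show.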
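Minimize render passes per frame
On most mobile GPU architectures, beginning and ending a render pass is expensive in both time and power. Your app can improve performance by organizing rendering operations into as few render passes as possible.
Different attachment load and store operations offer different levels of performance. For example, if you don't need to preserve the existing contents of an attachment, you can use the faster VK_ATTACHMENT_LOAD_OP_CLEAR or VK_ATTACHMENT_LOAD_OP_DONT_CARE instead of VK_ATTACHMENT_LOAD_OP_LOAD. Similarly, if you don't need to write the attachment's final values to memory for later use, VK_ATTACHMENT_STORE_OP_DONT_CARE performs better than VK_ATTACHMENT_STORE_OP_STORE.
Also, in most render passes, your app doesn't need to load or store the depth/stencil attachment. In such cases, you can avoid allocating physical memory for the attachment by creating the attachment image with the VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT flag and, where the device offers it, backing the image with lazily allocated memory (VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT). This bit provides the same benefits as glInvalidateFramebuffer (or glDiscardFramebufferEXT) in OpenGL ES.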
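The load and store choices above can be sketched for two common cases: a depth/stencil attachment that lives only within one render pass, and a color attachment whose result must be kept. The function names are hypothetical, and the fallback type definitions are reduced stand-ins so the sketch compiles without the Vulkan SDK (assuming a C++17 compiler for __has_include). A real VkAttachmentDescription also needs its format, sample count, and layouts set, which are omitted here.

```cpp
#include <cstdint>

#if __has_include(<vulkan/vulkan.h>)
#include <vulkan/vulkan.h>
#else
// Stand-ins (assumption) so the sketch compiles without the Vulkan SDK.
// Enumerator values match the public Vulkan headers.
typedef uint32_t VkImageUsageFlags;
enum VkImageUsageFlagBits {
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT = 0x20,
    VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT = 0x40
};
enum VkAttachmentLoadOp {
    VK_ATTACHMENT_LOAD_OP_LOAD = 0,
    VK_ATTACHMENT_LOAD_OP_CLEAR = 1,
    VK_ATTACHMENT_LOAD_OP_DONT_CARE = 2
};
enum VkAttachmentStoreOp {
    VK_ATTACHMENT_STORE_OP_STORE = 0,
    VK_ATTACHMENT_STORE_OP_DONT_CARE = 1
};
struct VkAttachmentDescription {  // only the fields written below
    VkAttachmentLoadOp loadOp;
    VkAttachmentStoreOp storeOp;
    VkAttachmentLoadOp stencilLoadOp;
    VkAttachmentStoreOp stencilStoreOp;
};
#endif

// Depth/stencil used only inside one render pass: clear on load and
// never write the result back to memory.
void configureTransientDepthOps(VkAttachmentDescription* a) {
    a->loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
    a->storeOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    a->stencilLoadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
    a->stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
}

// Usage flags for the image backing such an attachment.
VkImageUsageFlags transientDepthUsage() {
    return VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT |
           VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT;
}

// Color attachment whose previous contents are irrelevant but whose
// final values are needed later (for example, for presentation).
void configureKeptColorOps(VkAttachmentDescription* a) {
    a->loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
    a->storeOp = VK_ATTACHMENT_STORE_OP_STORE;
    a->stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    a->stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
}
```

VK_ATTACHMENT_LOAD_OP_LOAD and VK_ATTACHMENT_STORE_OP_STORE remain the right choice whenever a later pass actually reads the previous or final contents.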
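Choose appropriate memory types
When allocating device memory, apps must choose a memory type. The memory type determines how an app can use the memory and also describes the memory's caching and coherence properties. Different devices offer different memory types, and different memory types exhibit different performance characteristics.
An app can use a simple algorithm to pick the best memory type for a given use. The algorithm picks the first memory type in the VkPhysicalDeviceMemoryProperties::memoryTypes array that meets two criteria: the memory type must be allowed for the buffer or image (its bit is set in VkMemoryRequirements::memoryTypeBits), and it must have at least the properties that the app requires.
Mobile systems generally don't have separate physical memory heaps for the CPU and GPU. On such systems, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT is less significant than on systems with discrete GPUs that have their own dedicated memory. An app should not assume that this property is required.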
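The selection algorithm can be sketched as below. The names findMemoryType, findPreferredMemoryType, and kNoMemoryType are hypothetical, and the fallback type definitions are reduced stand-ins so the sketch compiles without the Vulkan SDK (assuming a C++17 compiler for __has_include). The second function illustrates one way to treat VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT as a preference rather than a requirement; it is an assumption about how an app might apply the advice above, not a prescribed method.

```cpp
#include <cstdint>

#if __has_include(<vulkan/vulkan.h>)
#include <vulkan/vulkan.h>
#else
// Stand-ins (assumption) so the sketch compiles without the Vulkan SDK.
// Values match the public Vulkan headers.
typedef uint32_t VkMemoryPropertyFlags;
enum VkMemoryPropertyFlagBits {
    VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT = 0x1,
    VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT = 0x2,
    VK_MEMORY_PROPERTY_HOST_COHERENT_BIT = 0x4
};
struct VkMemoryType {
    VkMemoryPropertyFlags propertyFlags;
    uint32_t heapIndex;
};
struct VkPhysicalDeviceMemoryProperties {  // only the fields read below
    uint32_t memoryTypeCount;
    VkMemoryType memoryTypes[32];
};
#endif

// Hypothetical sentinel meaning "no suitable memory type".
const uint32_t kNoMemoryType = UINT32_MAX;

// Returns the first memory type that is allowed for the resource
// (bit i set in memoryTypeBits) and has every required property.
uint32_t findMemoryType(const VkPhysicalDeviceMemoryProperties& props,
                        uint32_t memoryTypeBits,
                        VkMemoryPropertyFlags required) {
    for (uint32_t i = 0; i < props.memoryTypeCount; ++i) {
        const bool allowed = (memoryTypeBits & (1u << i)) != 0;
        const bool hasAll =
            (props.memoryTypes[i].propertyFlags & required) == required;
        if (allowed && hasAll) return i;
    }
    return kNoMemoryType;
}

// Tries required + preferred first, then falls back to required only,
// so a missing preferred property never makes allocation fail.
uint32_t findPreferredMemoryType(const VkPhysicalDeviceMemoryProperties& props,
                                 uint32_t memoryTypeBits,
                                 VkMemoryPropertyFlags required,
                                 VkMemoryPropertyFlags preferred) {
    const uint32_t best =
        findMemoryType(props, memoryTypeBits, required | preferred);
    if (best != kNoMemoryType) return best;
    return findMemoryType(props, memoryTypeBits, required);
}
```

In a real app, memoryTypeBits comes from vkGetBufferMemoryRequirements or vkGetImageMemoryRequirements, and the properties structure from vkGetPhysicalDeviceMemoryProperties.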
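Group descriptor sets by frequency
If you have resource bindings that change at different frequencies, use multiple descriptor sets per pipeline rather than rebinding all resources for each draw. For example, you can use one descriptor set for per-scene bindings, another for per-material bindings, and a third for per-mesh-instance bindings.
Use push constants (immediate constants) for the highest-frequency changes, such as changes made with each draw call.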
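The grouping can be illustrated without any Vulkan calls by recording which descriptor set would be bound at each point in a frame. Everything in this sketch is hypothetical (the Draw and BindCommand types, the set-index constants, and recordBinds); each BindCommand stands in for one vkCmdBindDescriptorSets call whose firstSet argument is the recorded set index. It assumes that all draws share a compatible pipeline layout, so binding a higher-numbered set leaves the lower-numbered sets bound, and that draws are already sorted by material.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical set indices, ordered from least to most frequently changed.
static const uint32_t kPerSceneSet = 0;
static const uint32_t kPerMaterialSet = 1;
static const uint32_t kPerInstanceSet = 2;

// Hypothetical draw record: which material and mesh instance it uses.
struct Draw {
    uint32_t materialId;
    uint32_t instanceId;
};

// Stands in for one descriptor-set bind at the given set index.
struct BindCommand {
    uint32_t setIndex;
    uint32_t resourceId;
};

// Binds the per-scene set once, the per-material set only when the
// material changes, and the per-instance set for every draw.
std::vector<BindCommand> recordBinds(uint32_t sceneId,
                                     const std::vector<Draw>& draws) {
    std::vector<BindCommand> binds;
    BindCommand scene = {kPerSceneSet, sceneId};
    binds.push_back(scene);

    bool haveMaterial = false;
    uint32_t currentMaterial = 0;
    for (const Draw& d : draws) {
        if (!haveMaterial || d.materialId != currentMaterial) {
            BindCommand material = {kPerMaterialSet, d.materialId};
            binds.push_back(material);
            currentMaterial = d.materialId;
            haveMaterial = true;
        }
        BindCommand instance = {kPerInstanceSet, d.instanceId};
        binds.push_back(instance);
    }
    return binds;
}
```

For three draws that use two materials, this records six binds instead of the nine that rebinding all three sets per draw would need. If the per-instance data is small, the per-instance bind could instead become a vkCmdPushConstants call.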
Last updated (UTC): 2024-01-10.