« on: September 15, 2016, 04:29:20 PM »
NB: forum doesn't allow longer headlines
I spent last week working on the glmark2 performance issues. I now have a NIR patch out for the pathological conditionals test (it's now faster than on the old driver), and a branch for job shuffling (+17% and +27% on the two desktop tests).
Here's the basic idea of job shuffling:
We're a tiled renderer, and tiled renderers get their wins from having a Clear at the start of the frame (indicating we don't need to load any previous contents into the tile buffer). When your frame is done, we flush each tile out to memory. If you do your clear, start rendering some primitives, and then switch to some other FBO (because you're rendering to a texture that you're planning on texturing from in your next draw to the main FBO), we have to flush out all of those tiles, start rendering to the new FBO, and flush its rendering. Then, when you come back to the main FBO, we have to reload your old cleared-and-a-few-draws tiles.
Job shuffling deals with this by separating the single GL command stream into separate jobs per FBO. When you switch to your temporary FBO, we don't flush the old job, we just set it aside. To make this work we have to add tracking for which buffers have jobs writing into them (so that if you try to read those from another job, we can go flush the job that wrote it), and which buffers have jobs reading from them (so that if you try to write to them, they can get flushed so that they don't get incorrectly updated contents).
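The tracking described above can be sketched roughly as follows. This is only an illustration of the idea, not the driver's code: `Job`, `JobTracker`, and the use of plain strings as buffer names are all made up for the example, and a "flush" here just records the order in which jobs would be submitted.

```python
class Job:
    """Pending rendering to one target buffer (hypothetical structure)."""
    def __init__(self, target):
        self.target = target
        self.reads = set()   # buffers this job samples from
        self.draws = []      # queued draws, not yet submitted


class JobTracker:
    """Tracks which pending jobs write to or read from each buffer."""
    def __init__(self):
        self.writer = {}     # buffer -> Job with pending writes to it
        self.readers = {}    # buffer -> set of Jobs with pending reads from it
        self.flushed = []    # targets, in the order their jobs were flushed

    def flush(self, job):
        # Submit the job and forget about its pending reads and writes.
        self.flushed.append(job.target)
        del self.writer[job.target]
        for buf in job.reads:
            self.readers[buf].discard(job)

    def draw(self, target, reads=()):
        job = self.writer.get(target)
        # Reading a buffer another job is writing: flush that writer first.
        for buf in reads:
            w = self.writer.get(buf)
            if w is not None and w is not job:
                self.flush(w)
        # Writing a buffer another job reads: flush those readers first,
        # so they do not see the updated contents.
        for r in list(self.readers.get(target, ())):
            if r is not job:
                self.flush(r)
        # Switching targets does not flush; the old job is just set aside.
        if job is None:
            job = Job(target)
            self.writer[target] = job
        for buf in reads:
            job.reads.add(buf)
            self.readers.setdefault(buf, set()).add(job)
        job.draws.append((target, tuple(reads)))
```

With this sketch, drawing to "main", then to "tex", then back to "main" while texturing from "tex" flushes only the "tex" job; the "main" job keeps both of its draws queued and never has to reload its tiles.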
MSI is celebrating its 30th anniversary as a leading manufacturer of innovative PC hardware. During the past 30 years, MSI has earned a reputation for providing products featuring cutting edge technology and striving to create and use only the best quality components.
To celebrate this milestone, MSI has created an exclusive limited edition graphics card, combining the excellence of MSI GAMING graphics cards with a unique custom designed EK waterblock for this anniversary edition. The exceptionally classy waterblock features infused RGB LED lights that can be set to any of 16.8 million colors by using the MSI Gaming App.
At the heart of this exclusive card is NVIDIA’s GeForce® GTX 1080 GPU to provide all the power you need at up to 4K resolution gaming. The card comes fully assembled in a closed loop liquid cooling configuration that is covered by warranty and maintenance-free. Enclosed in the exquisite and sturdy wooden box is a small gift which is perfect for enjoying the latest epic games in full comfort.
Now that DX11 has given us UAVs in all the other shading stages as well, I decided to try the equivalent for the vertex cache. By “Vertex Cache”, I mean the Post-transform vertex re-use cache. That is, the thing which enables us to re-use vertex shading results across duplicated vertices in a mesh.
Using UAVs in a VS, we can use SV_VertexID to do an atomic increment into a buffer containing one counter for each vertex. An atomic inc is necessary here because we don’t actually know what the vertex distribution algorithm is, and we could theoretically process a given vert in more than one VS thread simultaneously. For that matter, HW could simply be duplicating all the verts. We won’t know until we’ve looked at the results. Using this approach, we end up with a buffer telling us the exact number of times that each vert was processed during the draw. From this, we can directly calculate the ACMR (average cache miss ratio) of the mesh.
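As a rough illustration of that last step, here is how the ACMR could be computed on the CPU once the counter buffer has been read back. The function name and arguments are hypothetical, and the sketch assumes an indexed triangle list (three indices per triangle).

```python
def acmr(invocation_counts, index_count):
    """Average cache miss ratio: vertex shader invocations per triangle.

    invocation_counts: one counter per vertex, as filled in by the
                       atomic increments in the vertex shader.
    index_count:       number of indices in the draw (assumed to be a
                       triangle list, so three indices per triangle).
    """
    triangles = index_count // 3
    if triangles == 0:
        raise ValueError("draw contains no triangles")
    return sum(invocation_counts) / triangles
```

For a quad drawn as two triangles with indices 0,1,2, 0,2,3, counters of [1, 1, 1, 1] mean every shared vertex was re-used (ACMR 2.0), while [2, 1, 2, 1] means nothing was re-used (ACMR 3.0, the worst case for a triangle list).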
This code accompanies the research paper "Masked Software Occlusion Culling", and implements an efficient alternative to the hierarchical depth buffer algorithm. Our algorithm decouples depth values and coverage, and operates directly on the hierarchical depth buffer. It lets us efficiently parallelize both coverage computations and hierarchical depth buffer updates.
This code is mainly optimized for the AVX2 instruction set, and some AVX specific instructions are required for best performance. However, we also provide SSE 4.1 and SSE 2 implementations for backwards compatibility. The appropriate implementation will be chosen during run-time based on the CPU's capabilities.
Gathering petabytes of data about your customers is cool, but how can you take advantage of this data? BlazingDB lets you run high-performance SQL on a database using a ton of GPUs.
Relying on GPUs for a database is quite interesting. GPUs can run a ton of tasks in parallel and present a clear advantage for very specific tasks. In particular, companies have been using GPUs a lot lately for image processing and machine learning applications — but it’s the first time I’m hearing about taking advantage of GPUs for databases.
That’s where BlazingDB shines. You can do sums, use predicates and run through many, many database entries in little time. The company just started accepting customers in June 2016, and there are already big Fortune 100 companies that want to use BlazingDB.
The design of iBow docking allows you to replace graphics cards easily according to your requirements to enhance the graphics experience. iBow was developed to accommodate the largest video cards currently available in the market.
Git is hard: screwing up is easy, and figuring out how to fix your mistakes is fucking impossible. Git documentation has this chicken and egg problem where you can't search for how to get yourself out of a mess, unless you already know the name of the thing you need to know about in order to fix your problem.
So here are some bad situations I've gotten myself into, and how I eventually got myself out of them in plain English.
GpuCapsViewer.exe /exp_txt_report /exp_full_filename="C:/tmp/gpucapsviewer_report.txt"
GpuCapsViewer.exe /exp_xml_report /exp_full_filename="C:/tmp/gpucapsviewer_report.xml"
The new DOOM is a perfect addition to the franchise, using the new id Tech 6 engine where ex-Crytek Tiago Sousa now assumes the role of lead renderer programmer after John Carmack’s departure.
Historically id Software is known for open-sourcing their engines after a few years, which often leads to nice remakes and breakdowns. Whether this will stand true with id Tech 6 remains to be seen but we don’t necessarily need the source code to appreciate the nice graphics techniques implemented in the engine.
Unlike most Windows games released these days, DOOM doesn’t use Direct3D but offers an OpenGL and Vulkan backend.
Vulkan being the new hot thing and Baldur Karlsson having recently added support for it in RenderDoc, it was hard to resist poking into DOOM internals. The following observations are based on the game running with Vulkan on a GTX 980 with all the settings on Ultra; some are guesses, others are taken from the SIGGRAPH presentation by Tiago Sousa and Jean Geffroy.
Zepto ransomware is a relatively new player in the ransomware scene, and it’s closely related to the infamous Locky ransomware. Taking a closer look at Zepto’s code, we found that the code is pretty much the same as Locky’s code, but it has been slightly modified. The malware authors behind Zepto use the same methods used to spread Locky, and even the infection vector and the TOR payment page are the same, which makes us think that the people behind Locky are now spreading Zepto. The only difference between Locky and Zepto is the ransom demand. Zepto’s demand is much higher than Locky’s: 3 Bitcoins (approximately $1,850).