Understanding the General Concepts of Halide Programming Language | by Minhaz | Apr, 2022

Write quick and maintainable code

Screenshot of the 3×3 field blur algorithm written in Halide. Supply — halide-lang.org

Final time I offered the way it’s attainable to attain excessive efficiency with easy and maintainable code with Halide. This time, I’ll share in regards to the common ideas in Halide.

Halide is an open-source programming language designed to make it simpler to write down high-performance picture processing or array processing code on fashionable machines.

Within the last article of this series, I defined:

  • What’s Halide?
  • What’s the want for Halide? (Recap: quick code is commonly complicated and exhausting to write down and keep)
  • How Halide addresses the necessity.

On this article, I’ll dig deeper and share some common ideas in Halide — key knowledge varieties and totally different schedule primitives. For various schedule primitives — an instance of generated code and a demo of code execution is included.

This text is a summarized model of content material obtainable within the documentation and tutorials shared by the Halide improvement staff.

I’ve tried to summarize among the major ideas for the readers and my very own future reference. This appeared crucial to share earlier than speaking in-depth about among the subjects.

Necessary Disclaimer: Any opinion known as out on this article are my very own and don’t replicate opinion or stance of the organizations I work with.

I’ll be protecting this subject simply on the floor, for studying extra please checkout the tutorials at halide-lang.org/tutorials.

A Halide program sometimes have

  • Some enter(s) and a number of output(s).
  • Some expressions or algorithms that that expresses the conversion of enter(s) to output(s).
  • Schedules that outline how these expressions can be evaluated on a sure machine. We will additionally outline a number of schedules every concentrating on a unique subset of {hardware}.

By way of these expressions or algorithms we’ve got

  • Func: A Func object represents a pipeline stage. It is a pure perform that defines what worth every pixel ought to have. You may consider it as a computed picture.
  • Var: Var objects are names to make use of as variables within the definition of a Func. They don’t have any which means by themselves. We typically use x and y as variables that outline the x axes and y axes of photographs.
  • Expr: Expr permits us to outline complicated expressions utilizing Vars.

I’ll use the gradient instance shared within the tutorials as instance right here

Halide code for producing gradient with out schedule

Supply: halide-lang.org > tutorials > lesson 1

This could generate a picture, one thing like this

Gradient picture generated with f(x, y) = x + y with matplotlib (cmap=vidris). Picture by Creator.

Schedule is Halide’s manner of defining “ run the algorithm effectively on a selected machine”.

Following content material is abstract of the bigger tutorial at halide-lang.org > tutorials > lesson 5

For the above gradient() instance, by default the code can be executed in sequence in row-major iteration. The generated code shall look one thing like

Row main iteration code instance
Execution order of the gradient instance — default. Supply: halide-lang.org, Apache license.

Let’s check out totally different schedule choices.


Halide schedule for reorder

This reorders the sequence of iteration, on this the loop shall be executed in column main vogue. The generated code will as a substitute seem like

Column main iteration code instance
Execution order of the gradient instance — column main iteration. Supply: halide-lang.org, Apache license.

This type of change often have a big influence on the efficiency based mostly on how the CPU leverages caches.

Cut up

Halide schedule for cut up

This can be a very highly effective primitive and permits us to separate a loop in sure dimension in a number of components. The final argument right here is known as the “cut up issue”. On this case the generated code adjustments to one thing like

Row main for loop cut up at x-axis


Halide schedule for fuse

That is reverse of splitting, on this case we fuse two variables and merges the 2 loop right into a single for loop over the product of the extents. The generated code seems to be like

2D for loop fused into 1D for loop


Schedule for tiling in Halide

With each cut up and reorder in hand, we will now do tiled evaluations. The above schedule permits us to separate each x and y loop by an element of 4 and reorder the order of execution. The generated code is now complicated however ought to look one thing like

Code generated for tiling in Halide
Execution order of the gradient instance — tiling. Supply: halide-lang.org, Apache license.


Schedule for vectorization in Halide

We will instruct the compiler to now generate directions to vectorize the interior loops in x for loops. This generates code like

Code generated for vectorization in Halide
Determine: Execution order of the gradient instance — vectorization. Supply: halide-lang.org, Apache license.

Unrolling a loop

Schedule for unroll in Halide

We will simply information the compiler to unroll the loop through the use of unroll primitive.

Fusing, tiling, and parallelizing

Halide schedule for fusing, tiling and vectorization

We will principally mix all of those constructs relying on the info independence to attain the very best efficiency. That is the place tiling and fusing shine. We will parallelize throughout the tiles.

The above-mentioned schedule generates code like:

Code generated for fusing, tiling and vectorization
Execution order of the gradient instance — Fusing, tiling and vectorizing. Supply: halide-lang.org, Apache license.

Additional combining all of the constructs for a big picture:

Schedule for placing all of it collectively in Halide

This video exhibits a visualization of how the complicated schedule will get computed.


An enormous shout out to Jonathan Ragan-Kelly and the staff for developing with Halide and making it open supply. A lot of the content material on this article is derived from their work printed as tutorials at halide-lang.org/tutorials.

twentieth April 2022 This text is 2nd article of a collection of articles on this style. In case you missed the primary article, yow will discover it right here:

Within the subsequent few articles, I plan to write down about

  • Article 3: Quick picture processing on Android with Halide.
  • Article 4: Looking for most optimized schedules — Auto Scheduler.

If there’s something particular about this subject you’d wish to be taught, please let me know

  1. Halide — halide-lang.org
  2. Halide tutorials
  3. Halide: decoupling algorithms from schedules for high-performance image processing — Open entry analysis article by Jonathan Ragan-Kelley et. al.
  4. Processing images fast with native code in Android
  5. How to use RenderScript to convert YUV_420_888 YUV Image to Bitmap
  6. Faster image processing in Android Java using multi threading

More Posts