24 posts tagged with "tech-blog"

array_sort lambda function

· 6 min read
Masha Basmanova
Software Engineer @ Meta

Presto provides an array_sort function to sort arrays in ascending order with nulls placed at the end.

presto> select array_sort(array[2, 5, null, 1, -1]);
_col0
---------------------
[-1, 1, 2, 5, null]

There is also an array_sort_desc function that sorts arrays in descending order with nulls placed at the end.

presto> select array_sort_desc(array[2, 5, null, 1, -1]);
_col0
---------------------
[5, 2, 1, -1, null]

Both array_sort and array_sort_desc place nulls at the end of the array.
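The nulls-last behavior shown in both queries can be modeled outside the engine. The following is a minimal standalone sketch, not Presto's or Velox's implementation: it assumes an array element can be represented as `std::optional<int64_t>`, with `std::nullopt` standing in for SQL null, and the names `Element` and `sortNullsLast` are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Assumption: std::nullopt models a SQL null element.
using Element = std::optional<int64_t>;

// Returns a sorted copy of `values`. Non-null elements are ordered ascending
// or descending; nulls are placed at the end in both cases.
std::vector<Element> sortNullsLast(std::vector<Element> values, bool descending) {
  std::stable_sort(
      values.begin(),
      values.end(),
      [descending](const Element& a, const Element& b) {
        if (!a.has_value() || !b.has_value()) {
          // A non-null element sorts before a null one; two nulls are equivalent.
          return a.has_value() && !b.has_value();
        }
        return descending ? *a > *b : *a < *b;
      });
  return values;
}
```

Calling `sortNullsLast` on the elements `2, 5, null, 1, -1` with `descending` set to `false` mirrors the `array_sort` query, and with `true` it mirrors `array_sort_desc`. The null check runs before the direction is applied, which is why nulls stay at the end either way.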

Simple Functions: Efficient Complex Types

· 6 min read
Laith Sakka
Software Engineer @ Meta

This blog post is part of a series that discusses different features and optimizations of the simple function interface.

Efficient Complex Types

In this blog post, we will discuss two major recent changes to the simple function interface that make its performance comparable to vector function implementations for functions that produce or consume complex types (arrays, maps, and rows).

To show how much simpler simple functions are, the figure below shows a function NestedMapSum written in both the simple and vector interfaces. The function consumes a nested map and computes the sum of all its keys and values. Note that the vector function implementation is minimal and has no special optimizations (e.g., vector reuse or a fast path for flat inputs). Adding such optimizations would make it even longer.

NestedMapSum function implemented using the vector (left) and simple (right) interfaces.
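To make the computation itself concrete, here is a minimal standalone sketch of the row-level logic the NestedMapSum function is described as performing. It is not written against either Velox interface: it assumes 64-bit integer keys and values, models the nested map with `std::map`, and ignores nulls. The names `InnerMap`, `NestedMap`, and `nestedMapSum` are hypothetical.

```cpp
#include <cstdint>
#include <map>

// Assumption: keys and values are 64-bit integers and never null.
using InnerMap = std::map<int64_t, int64_t>;
using NestedMap = std::map<int64_t, InnerMap>;

// Sums every outer key, every inner key, and every inner value.
int64_t nestedMapSum(const NestedMap& input) {
  int64_t sum = 0;
  for (const auto& [outerKey, inner] : input) {
    sum += outerKey;
    for (const auto& [innerKey, value] : inner) {
      sum += innerKey + value;
    }
  }
  return sum;
}
```

A function written against a single row, as here, only needs these two loops. A vector implementation has to do the same work for every row of a batch while also decoding the input vectors and tracking offsets and nulls itself.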

Improving the Velox Build Experience

· 5 min read
Jacob Wujciak-Jens
Software Engineer @ Voltron Data
Raúl Cumplido
Software Engineer @ Voltron Data
Krishna Pai
Software Engineer @ Meta

When Velox was open sourced in August 2021, it was far less usable and portable than it is today. In order for Velox to become the unified execution engine blurring the boundaries for data analytics and ML, we needed Velox to be easy to build and package on multiple platforms, and to support a wide range of hardware architectures. If we are supporting all these platforms, we also need to ensure that Velox remains fast and that regressions are caught early.

To improve the Velox experience for users and community developers, Velox has partnered with Voltron Data to help make Velox more accessible and user-friendly. In this blog post, we will examine the challenges we faced, the improvements that have already been made, and the ones yet to come.

Simple Functions: Introduction and Basic Optimizations

· 8 min read
Laith Sakka
Software Engineer @ Meta

This blog post is part of a series that discusses different features and optimizations of the simple function interface in Velox.

Introduction to Simple Functions

Scalar functions are one of the most used extension points in Velox. Since Velox is a vectorized engine, functions are by nature "vector functions" that consume Velox vectors (batches of data) and produce vectors. Velox lets users write functions either as vector functions or as single-row operations, called "simple functions", that are converted to vector functions using template expansion through SimpleFunctionAdapter.
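The idea of turning a single-row function into a batch-level one through templates can be sketched in a few lines. The example below is a toy model only: it does not use Velox headers or the real SimpleFunctionAdapter, it assumes flat, non-null 64-bit integer input held in a `std::vector`, and the names `PlusOne` and `ToyVectorAdapter` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-row function: it only ever sees one row at a time.
struct PlusOne {
  void call(int64_t& result, const int64_t& input) const {
    result = input + 1;
  }
};

// Hypothetical adapter: the template is instantiated per row-level function
// and supplies the loop over the batch, so the function author does not
// write any batch-handling code.
template <typename RowFunction>
struct ToyVectorAdapter {
  RowFunction fn;

  std::vector<int64_t> apply(const std::vector<int64_t>& input) const {
    std::vector<int64_t> output(input.size());
    for (std::size_t row = 0; row < input.size(); ++row) {
      fn.call(output[row], input[row]);
    }
    return output;
  }
};
```

With these definitions, `ToyVectorAdapter<PlusOne>{}.apply({1, 2, 3})` processes the whole batch with one call. Because the row function is a template parameter, the compiler can inline `call` into the loop, which is the general reason a template-based adapter need not cost a per-row indirect call.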