In the 5th Frosty Friday challenge, we are to create a Python User defined Function (UDF) within the Snowflake ecosystem. Hence, in this post we will create a Python function to produce a new result column in a snowflake table.
What is a Python UDF?
User-defined functions (UDFs) let us extend the snowflake system to perform operations that are not provided through the built-in, system-defined functions. With a Python UDF, Snowflake lets us use the Python programming language to manipulate data and return either scalar or tabular results. Essentially, snowflake allows us to write Python code and call it as though it were a SQL function. Python UDFs can contain both new code and calls to existing packages, allowing both flexibility and code reuse.
An in-depth description of Python UDFs can be found in the Snowflake Docs.
In challenge 5 we are asked to create a function that multiplies a value times 3 and call it through the SQL statement to produce a column as a calculation for each value.
As the challenge states, we want to keep things simple in order to understand the snowflake syntax involved in the task. The python code for such a function is quite straightforward:
def multi_3_py(input: int): return input*3
Moving forward, we can carry on and configure our snowflake environment, as mentioned in former blogs. For the rest of the challenge let’s first create a table with 5 values (numbers 1-5):
// Creating a table with one column 'numbers' // with a data type of NUMBER CREATE OR REPLACE TABLE ch5_sample_tbl (numbers NUMBER); // Inserting values to the 'numbers' column INSERT INTO ch5_sample_tbl VALUES (1), (2), (3), (4), (5)
We are now ready to use the python code we had earlier and make a UDF out of it. Snowflake documentation points out two ways of doing that:
- Uploading code from a stage
- Specifying the code in-line
We will specify the python code in-line using an
AS clause, as part of the
CREATE FUNCTION statement:
// Creating a Python UDF // Create a Python function of multiplying an integer*3 CREATE OR REPLACE FUNCTION multi_3(input INT) RETURNS INT NOT NULL language python runtime_version = '3.8' // <python_version> handler = 'multi_3_py' // '<function_name>' AS $$ def multi_3_py(input: int): return input*3 $$;
We have created our first Python UDF!
Now let’s use it in our newly created value table:
// Calling and Validating our Python UDF with ch5_sample_tbl SELECT numbers, numbers*3 as validation, multi_3(numbers) as python_udf FROM ch5_sample_tbl;
We have created a python UDF and used it to calculate a column in a table.
I hope this blog helped you clarify and better understand the challenge.
Work together with one of our consultants and maximise the effects of your data.
Contact us, and we’ll help you right away.