Data Lake Query

morningstar_data.datalake.query(
query_str: str,
temp_tables: List[TempTable] | None = None,
) → DataFrame

Upcoming Feature

Retrieve the results of a SQL query from the Morningstar Data Lake.

Parameters:
  • query_str – SQL query to be executed in the Morningstar Data Lake.

  • temp_tables – Optional list of temporary tables that exist only for the duration of the query. Defaults to None.

Returns:

A DataFrame containing the results of the SQL query.

Examples:

Submit a query using a temp table

import morningstar_data as md
import pandas as pd


df_my_table = pd.DataFrame({
    'sec_id': ['F0GBR0606A', 'F00000SYAH', 'F00000WP51'],
    'closing_price': [128.372, 23.02, 528.33],
})
df_query = md.datalake.query(
    query_str='select * from my_table;',
    temp_tables=[md.datalake.TempTable('my_table', df_my_table)],
)

Output:

sec_id      closing_price
F0GBR0606A  128.372
F00000SYAH  23.02
F00000WP51  528.33

Errors:

InvalidQueryException: When query_str contains invalid SQL syntax.

UnauthorizedDataLakeAccessError: When the calling user is not authorized to query the Morningstar Data Lake.

TempTableNameNotFoundException: When one or more of the supplied temp tables are not referenced in the query string.
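
Because a temp table that is never referenced in the query string raises TempTableNameNotFoundException, it can be useful to check the names locally before submitting the query. The following is a minimal sketch of such a check. The helper unreferenced_temp_tables is hypothetical and not part of morningstar_data, and the whole-word, case-insensitive matching rule is an assumption; the library's own rule for matching table names may differ.

```python
import re
from typing import Iterable, List


def unreferenced_temp_tables(query_str: str, table_names: Iterable[str]) -> List[str]:
    """Return the table names that never appear as a whole word in query_str.

    Hypothetical helper: the matching rule (whole word, case-insensitive)
    is an assumption, not the documented behavior of the Data Lake.
    """
    missing = []
    for name in table_names:
        # \b keeps 'my_table' from matching inside 'my_table2'
        pattern = r"\b" + re.escape(name) + r"\b"
        if not re.search(pattern, query_str, flags=re.IGNORECASE):
            missing.append(name)
    return missing


query = "select * from my_table;"
print(unreferenced_temp_tables(query, ["my_table"]))
print(unreferenced_temp_tables(query, ["my_table", "prices"]))
```

An empty list means every temp table name appears in the query string. Any names returned are candidates for removal from temp_tables, or a sign that the query string contains a typo.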