# Data Query Language

# What

# What is Data Query Language?

Data Query Language (DQL) is a query language developed by Clinc for locating data. It supports queries on strings, phrases, slots, slot values, and the number of labeled slots, which gives a clear picture of slot value distributions and slot data quality.

# What's the syntax of DQL?

The examples below illustrate DQL syntax and capabilities using the following annotated samples:

| Index | Annotated Sample |
| --- | --- |
| 0 | Book a flight from { SRC rome } to { DST paris } |
| 1 | I need to book a flight |
| 2 | Please find airfare out of { SRC london } to { DST dublin } |
| 3 | I need to fly to { DST turin } |
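To experiment with these samples outside the platform, the annotation format can be split into plain text and slot pairs. The sketch below is an illustration only: `SAMPLES`, `SLOT_PATTERN`, and `parse` are hypothetical names, and the parsing rule (a slot name followed by its value inside `{ }`) is inferred from the samples above, not taken from Clinc's implementation.

```python
import re

# The four annotated samples from the table above.
SAMPLES = [
    "Book a flight from { SRC rome } to { DST paris }",
    "I need to book a flight",
    "Please find airfare out of { SRC london } to { DST dublin }",
    "I need to fly to { DST turin }",
]

# Assumed annotation shape: "{ NAME value }" with whitespace inside the braces.
SLOT_PATTERN = re.compile(r"\{\s*(\S+)\s+(.*?)\s*\}")


def parse(sample):
    """Return (plain_text, [(slot_name, slot_value), ...]) for one sample."""
    slots = [(m.group(1), m.group(2)) for m in SLOT_PATTERN.finditer(sample)]
    # Replace each annotation with its bare value to recover the utterance.
    text = SLOT_PATTERN.sub(lambda m: m.group(2), sample)
    return text, slots


text0, slots0 = parse(SAMPLES[0])
# text0  == "Book a flight from rome to paris"
# slots0 == [("SRC", "rome"), ("DST", "paris")]
```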

# String Queries

Query for strings by using double-quotes. We can select all samples containing the string "flight" by querying:

| Query | Retrieved |
| --- | --- |
| `"flight"` | [0, 1] |

To find samples that have the phrase "book a flight", query:

| Query | Retrieved |
| --- | --- |
| `"Book a flight"` | [0, 1] |

You can also write the search string as a regex, for example:

Query: "r'(cheese|ham)burger'"

This selects all utterances containing either "cheeseburger" or "hamburger".

Note: The whole regex must be within double quotes and the expression body must be within single quotes. All regular expressions in DQL use POSIX standards.
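The behaviour of string and regex queries can be approximated in a few lines of Python. This is a rough sketch, not DQL itself: `string_query`, `regex_query`, and `MENU` are hypothetical names, matching is assumed to be case-insensitive (the retrieved results above suggest this, but the text does not state it), and Python's `re` module is not a POSIX engine, although grouping and bracket expressions behave the same way for these patterns. The second half shows why alternation needs parentheses: square brackets form a bracket expression that matches a single character.

```python
import re

# Plain-text form of the four samples (slot annotations removed).
SAMPLES = [
    "Book a flight from rome to paris",
    "I need to book a flight",
    "Please find airfare out of london to dublin",
    "I need to fly to turin",
]


def string_query(phrase):
    """Indices of samples containing the phrase (assumed case-insensitive)."""
    needle = phrase.lower()
    return [i for i, s in enumerate(SAMPLES) if needle in s.lower()]


def regex_query(pattern, utterances):
    """Utterances in which the pattern matches anywhere."""
    rx = re.compile(pattern)
    return [u for u in utterances if rx.search(u)]


flight_hits = string_query("flight")         # [0, 1]
phrase_hits = string_query("Book a flight")  # [0, 1]

# Illustrative utterances, not part of the sample set above.
MENU = [
    "one cheeseburger please",
    "a hamburger to go",
    "veggie burger",
    "limburger sandwich",
]

# Grouped alternation: matches only "cheeseburger" or "hamburger".
group_hits = regex_query(r"(cheese|ham)burger", MENU)

# Bracket expression: one character from the set, then "burger",
# so "limburger" also matches through its "mburger" substring.
bracket_hits = regex_query(r"[cheese|ham]burger", MENU)
```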

# Boolean Operators

DQL currently supports the `and`, `or`, and `not` operators. All samples with "from" and "to" can be selected by querying:

| Query | Retrieved |
| --- | --- |
| `"from" and "to"` | [0] |

Boolean expressions can be grouped within parentheses (). The query below shows an example:

| Query | Retrieved |
| --- | --- |
| `("flight" or "book") and not "from"` | [1] |
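One way to reason about the boolean operators is as set operations over sample indices: `and` is intersection, `or` is union, and `not` is the complement with respect to all samples. The sketch below assumes case-insensitive substring matching; `has` and `ALL` are hypothetical helper names, not part of DQL.

```python
# Plain-text form of the four samples (slot annotations removed).
SAMPLES = [
    "Book a flight from rome to paris",
    "I need to book a flight",
    "Please find airfare out of london to dublin",
    "I need to fly to turin",
]

ALL = set(range(len(SAMPLES)))


def has(word):
    """Set of sample indices whose text contains the word."""
    return {i for i, s in enumerate(SAMPLES) if word.lower() in s.lower()}


# "from" and "to"
from_and_to = sorted(has("from") & has("to"))  # [0]

# ("flight" or "book") and not "from"
grouped = sorted((has("flight") | has("book")) & (ALL - has("from")))  # [1]
```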

# Slot Queries

The `{ }` characters specify slots. The syntax is: { slot_name slot_value_expression }

The slot_name is the name of the slot to search for and may be either a string or a regex string (format r'expression'). The slot_value_expression can be a string representing a slot value, a regex (format r"expression"), or a boolean expression that combines strings with the `and`, `or`, and `not` operators.

Note: Spacing matters. There must be whitespace after the `{` and before the `}`.

All samples with the slot DST and value "dublin" can be selected by querying:

| Query | Retrieved |
| --- | --- |
| `{ DST "dublin" }` | [2] |

Find all samples with SRC slot with value "london" or "rome" by querying:

| Query | Retrieved |
| --- | --- |
| `{ SRC "london" or "rome" }` | [0, 2] |

Note: The expression `"london" or "rome"` is an example of a slot_value_expression. Also note that a regex used as the slot name requires single quotes (r'expression'), while a regex used as the slot value requires double quotes (r"expression").

The wildcard symbol `*` can be used within slot queries. All samples with the slot SRC and any value can be selected by querying:

| Query | Retrieved |
| --- | --- |
| `{ SRC * }` | [0, 2] |

All samples with any slot can be found by querying:

| Query | Retrieved |
| --- | --- |
| `{ * * }` | [0, 2, 3] |

Boolean operators can be used to join together slot queries into an expression. To find samples that have DST but not SRC slots, query:

| Query | Retrieved |
| --- | --- |
| `{ DST * } and not { SRC * }` | [3] |

Regular expressions can be used in slot_value_expressions. The query below uses a regex to find all samples with a slot value that begins with "par" or "tur":

Query: { * r"^(par|tur)" }
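The slot queries in this section can be mimicked over a list of (slot name, slot value) pairs. This is a sketch under assumptions: `SLOTS` and `slot_query` are hypothetical names, quoted string values are assumed to match the whole slot value exactly, and the "begins with par or tur" condition is written as `^(par|tur)` in Python's `re` syntax rather than in DQL's POSIX dialect.

```python
import re

# Slot annotations of the four samples, as (name, value) pairs per sample.
SLOTS = [
    [("SRC", "rome"), ("DST", "paris")],
    [],
    [("SRC", "london"), ("DST", "dublin")],
    [("DST", "turin")],
]


def slot_query(name, value_ok=lambda value: True):
    """Indices of samples with a slot matching name ('*' = any) and value test."""
    return [
        i
        for i, slots in enumerate(SLOTS)
        if any((name == "*" or n == name) and value_ok(v) for n, v in slots)
    ]


dst_dublin = slot_query("DST", lambda v: v == "dublin")             # [2]
src_either = slot_query("SRC", lambda v: v in {"london", "rome"})   # [0, 2]
any_src = slot_query("SRC")                                         # [0, 2]
any_slot = slot_query("*")                                          # [0, 2, 3]

# { DST * } and not { SRC * }
dst_not_src = sorted(set(slot_query("DST")) - set(slot_query("SRC")))  # [3]

# Any slot whose value begins with "par" or "tur".
prefix = re.compile(r"^(par|tur)")
prefix_hits = slot_query("*", lambda v: prefix.search(v) is not None)
```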

# String and Slot Queries

We can use slot queries within string queries. One use case is where we want to find all samples with the phrase "from rome to paris", and where "rome" is a SRC slot and "paris" is a DST slot. This can be done by querying:

| Query | Retrieved |
| --- | --- |
| `"from { SRC "rome" } to { DST "paris" }"` | [0] |

Note: We still used the same double quote symbol "" introduced above to specify the phrase, and we also used the same slot syntax { } introduced above to specify the slots.
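A combined query of this kind can be imitated by matching against the annotated form of each sample instead of its plain text. The sketch below is a simplification: `phrase_with_slots` is a hypothetical helper that only handles quoted string slot values (no wildcards, regexes, or boolean expressions), and it assumes case-insensitive matching.

```python
import re

# Annotated samples, with slots written as "{ NAME value }".
ANNOTATED = [
    "Book a flight from { SRC rome } to { DST paris }",
    "I need to book a flight",
    "Please find airfare out of { SRC london } to { DST dublin }",
    "I need to fly to { DST turin }",
]


def phrase_with_slots(query):
    """Indices of samples whose annotated text contains the query phrase."""
    # Drop the quotes around slot values: { SRC "rome" } -> { SRC rome }
    needle = re.sub(r'\{\s*(\S+)\s+"([^"]*)"\s*\}', r"{ \1 \2 }", query)
    needle = needle.lower()
    return [i for i, s in enumerate(ANNOTATED) if needle in s.lower()]


hits = phrase_with_slots('from { SRC "rome" } to { DST "paris" }')  # [0]

# The same words with the wrong slot name do not match.
misses = phrase_with_slots('from { DST "rome" } to { DST "paris" }')  # []
```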

# Number Queries

The `<`, `>`, and `=` operators can be used to query for samples containing specific numbers of strings or slots.

Find all samples with more than 1 "to" by querying:

| Query | Retrieved |
| --- | --- |
| `"to" > 1` | [3] |

Find all samples with fewer than 2 instances of "to" by querying:

| Query | Retrieved |
| --- | --- |
| `"to" < 2` | [0, 1, 2] |

Number queries can be mixed with booleans. (However, the comparison value cannot be grouped.) An example with a number query combined with a slot query is:

| Query | Retrieved |
| --- | --- |
| `"to" < 2 and { SRC "london" }` | [2] |

Number queries can work with slots as well:

| Query | Retrieved |
| --- | --- |
| `0 < { SRC * }` | [0, 2] |
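Number queries amount to counting occurrences and comparing the count. The sketch below reproduces the four results in this section; `count_string` and `count_slot` are hypothetical names, and string occurrences are counted as case-insensitive, non-overlapping substrings, which is an assumption that happens to agree with the results shown.

```python
# Each sample as (plain text, [(slot name, slot value), ...]).
SAMPLES = [
    ("Book a flight from rome to paris", [("SRC", "rome"), ("DST", "paris")]),
    ("I need to book a flight", []),
    ("Please find airfare out of london to dublin",
     [("SRC", "london"), ("DST", "dublin")]),
    ("I need to fly to turin", [("DST", "turin")]),
]


def count_string(word, text):
    """Number of non-overlapping occurrences of word in text."""
    return text.lower().count(word.lower())


def count_slot(name, slots, value=None):
    """Number of slots with this name (and this value, if one is given)."""
    return sum(1 for n, v in slots if n == name and value in (None, v))


# "to" > 1
more_than_one = [i for i, (t, _) in enumerate(SAMPLES) if count_string("to", t) > 1]

# "to" < 2
fewer_than_two = [i for i, (t, _) in enumerate(SAMPLES) if count_string("to", t) < 2]

# "to" < 2 and { SRC "london" }
mixed = [
    i
    for i, (t, s) in enumerate(SAMPLES)
    if count_string("to", t) < 2 and count_slot("SRC", s, "london") > 0
]

# 0 < { SRC * }
with_src = [i for i, (_, s) in enumerate(SAMPLES) if 0 < count_slot("SRC", s)]
```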

# How

# How to use the advanced slot data search tool?

To use the advanced slot data search feature in the platform, you need to enable it first:

  1. Go to the Settings page, scroll to the Institution-Level Beta Features section, and find DQL.
  2. Click Enable. The changes are saved automatically.

The search bar is located in the upper-right corner of every slot data page. Click the checkbox to enable DQL.

The search bar displays a detailed error message when the query syntax is incorrect, and it shows the number of matching utterances.

Note: Data is implicitly filtered by the slot of the page you are on. To query across all slot data, make sure you are on the All Data page.



Last updated: 08/21/2020