SPARTA
Semantic Parsing And Relational Table Aware model that generates SQL from questions written in Korean
This is a term project for the Unstructured Text Analysis class. We implement a deep learning model that converts Korean natural-language questions into SQL queries.
Team Members
- Hoonsang Yoon
- Jaehyuk Heo
- Jungwoo Choi
- Jeongseob Kim
Information
- Korea University DSBA Lab
- Advisor: Pilsung Kang
Demo
See the demo here.
Video
Text2SQL Result Video
Dataset
tar xvjf data/data.tar.bz2
Korean WikiSQL dataset
unzip data/ko_token.zip
unzip data/ko_token_not_h.zip
unzip data/ko_from_table.zip
unzip data/ko_from_table_not_h.zip
Translation
We translated the English questions into Korean in four ways, as follows.
No | Method | Data Name | Description
---|---|---|---
1 | Where+Select | ko_token | Keep where values in label and column used in select clause among the words in English question
2 | Where | ko_token_not_h | Keep header of table among the words in English question
3 | Table+Header | ko_from_table | Keep values and header in table among the words in English question
4 | Table | ko_from_table_not_h | Keep values in table among the words in English question
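One way to read "keep" in the table above is that the selected words (where values, headers, or table values) are left untranslated so they still match the table contents. The sketch below illustrates that idea under this assumption. It is not taken from the repository: `mask_terms`, `unmask_terms`, and the `[[n]]` placeholder format are hypothetical, and whether a given translation service leaves such placeholders intact would need to be checked.

```python
import re


def mask_terms(question, terms):
    """Replace each protected term with a numbered placeholder (hypothetical format)."""
    # Longest terms first so that a short term never splits a longer one.
    ordered = sorted(set(terms), key=lambda t: (-len(t), t))
    if not ordered:
        return question, {}
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    mapping = {}

    def repl(match):
        placeholder = f"[[{len(mapping)}]]"
        mapping[placeholder] = match.group(0)  # remember the original surface form
        return placeholder

    return pattern.sub(repl, question), mapping


def unmask_terms(text, mapping):
    """Put the protected terms back into the translated text."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text


question = "What position does Kim Min-jae play for Napoli?"
protected = ["Kim Min-jae", "Napoli", "position"]  # e.g. table values plus a header

masked, mapping = mask_terms(question, protected)

# Stand-in for the translated output; a real run would send `masked` to a translator.
translated = "[[1]]는 [[2]]에서 어떤 [[0]]으로 뛰나요?"
restored = unmask_terms(translated, mapping)
print(masked)
print(restored)
```

Under this reading, the four methods would differ only in which terms are passed as `protected`.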
Run translation
- Create a question dataframe for translating English to Korean.
bash run_translate.sh value
- Translate the English questions to Korean using Google Translator, then copy the resulting text file, such as 'ko_train_question.txt', into the ko_data directory.
- Insert the Korean questions.
bash run_translate.sh token
SPARTA Model
We use pretrained multilingual BERT as the encoder.
Sub Task
Seq2Seq
Evaluation
- Logical Form Accuracy
- Execution Accuracy
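The two metrics above can be sketched as follows. This is a minimal illustration, not the project's evaluation script: it assumes WikiSQL-style logical forms (a `sel` column index, an `agg` index, and `conds` triples of column, operator, value), and the table, the queries, and the function names `logical_form_match` and `execution_match` are made up for the example.

```python
import sqlite3


def logical_form_match(pred, gold):
    """Logical form accuracy: the predicted query structure must equal the gold one."""
    def normalize(query):
        # Condition order does not change the query, so compare conditions sorted.
        conds = sorted((col, op, str(val).lower()) for col, op, val in query["conds"])
        return (query["sel"], query["agg"], conds)

    return normalize(pred) == normalize(gold)


def execution_match(conn, pred_sql, gold_sql):
    """Execution accuracy: both queries must return the same rows when run."""
    try:
        pred_rows = conn.execute(pred_sql).fetchall()
    except sqlite3.Error:
        return False  # an invalid predicted query counts as wrong
    gold_rows = conn.execute(gold_sql).fetchall()
    return sorted(pred_rows, key=repr) == sorted(gold_rows, key=repr)


# Toy table for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, goals INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Kim", "Seoul", 3), ("Lee", "Busan", 5)],
)

gold = {"sel": 0, "agg": 0, "conds": [(1, 0, "Seoul")]}
gold_sql = "SELECT name FROM players WHERE team = 'Seoul'"

# The prediction filters on a different column but happens to return the same row.
pred = {"sel": 0, "agg": 0, "conds": [(2, 0, 3)]}
pred_sql = "SELECT name FROM players WHERE goals = 3"

lf_ok = logical_form_match(pred, gold)
ex_ok = execution_match(conn, pred_sql, gold_sql)
print("logical form match:", lf_ok)
print("execution match:", ex_ok)
```

In this toy case the prediction fails the logical form check but passes the execution check, which is why execution accuracy can be higher than logical form accuracy for the same model.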
Experiments
Model | Task | Test<br>Logical Form<br>Accuracy(%) | Test<br>Execution<br>Accuracy(%)
---|---|:---:|:---:
SQLova | Subtask | 65.8 | 74.3
HydraNet | Subtask | 40.4 | 40.7
Bridge | Generation | 54.6 | 62.1
Download Trained Models
Method | SQLova | Bridge
---|---|---
Where+Select | Download | -
Where | Download | -
Table+Header | Download | -
Table | Download | -
Presentation
Proposal
Interim Findings
Final
Reference
- [1] Victor Zhong, Caiming Xiong, and Richard Socher. 2017. Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning.
- [2] Hwang, W., Yim, J., Park, S., & Seo, M. (2019). A comprehensive exploration on WikiSQL with table-aware word contextualization. KR2ML Workshop at NeurIPS 2019.
- [3] Lyu, Q., Chakrabarti, K., Hathi, S., Kundu, S., Zhang, J., & Chen, Z. (2020). Hybrid ranking network for text-to-sql. arXiv preprint arXiv:2008.04759.
- [4] Xi Victoria Lin, Richard Socher and Caiming Xiong. Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing. Findings of EMNLP 2020.