You analyze existing NL2SQL methods and automate data preparation while improving the performance and accuracy of the system.
Requirements
- Master's degree in Computer Science or Engineering
- Good knowledge of Python
- Experience with SQL and Data Science
- Growth mindset
- Good English skills
Your tasks
- Analyze existing domain knowledge and NL2SQL methods
- Investigate evaluation techniques and relevant methods
- Automate the collection and preparation of datasets
- Develop scripts to automate data preparation
- Ensure data quality across different scenarios
- Implement methods to improve NL2SQL performance
- Design and implement methods to increase performance
- Establish robust evaluation methods to measure improvements
- Define metrics for assessing NL2SQL performance
- Implement evaluation protocols for systematic assessment
- Test and validate the effectiveness of the NL2SQL system
- Compare the results against a baseline to measure improvement
- Complete the documentation of methods, code, and results
- Prepare a final report on the results
Original description
## Job Description
The increasing use of Natural Language to SQL (NL2SQL) techniques is transforming the way large language models (LLMs) help bridge the gap between complex industrial data and users, enabling domain experts to interact with data using natural language. However, challenges remain in optimizing and evaluating NL2SQL outputs, particularly for interactive AI applications and specialized domains like semiconductor data visualization. This project aims to investigate and improve NL2SQL methods to support our "Data Viz" tool.
* During your thesis, you will conduct a thorough analysis of the existing domain knowledge, advanced NL2SQL methods, and existing evaluation techniques. This includes reviewing methods such as fine-tuning and retrieval-augmented generation (RAG), researching how AI agents are applied to NL2SQL tasks, and identifying key areas for improvement.
* You will automate the collection, preparation, and management of training and testing datasets by developing scripts and tools for data extraction and preprocessing. You will ensure that the data is of sufficient quality to represent domain knowledge across various scenarios.
* Furthermore, you will design and implement multiple suitable methods to enhance NL2SQL performance and retrieval accuracy, and benchmark them against each other.
* Moreover, you will establish robust evaluation methods to measure the improvements in NL2SQL results. You will define metrics for evaluating NL2SQL performance (e.g. accuracy, efficiency) and implement evaluation protocols to systematically assess improvements.
* Finally, you will rigorously test and validate the effectiveness of the enhanced NL2SQL system, comparing results against a baseline to measure improvement. You will compile comprehensive documentation of the entire process, including methodologies, code, and results, and prepare a final report summarizing findings and potential future work.
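
To make the evaluation task more concrete, here is a minimal sketch of one commonly used NL2SQL metric, execution accuracy: a predicted query counts as correct if it returns the same rows as the reference query. This is an illustration only, not the project's actual evaluation protocol. The `wafers` table, its columns, and the example queries are hypothetical, and the comparison deliberately ignores row order.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, reference) SQL pairs whose results match."""
    if not pairs:
        return 0.0
    hits = 0
    for predicted_sql, reference_sql in pairs:
        expected = run_query(conn, reference_sql)
        actual = run_query(conn, predicted_sql)
        # A failing reference query never counts as a hit.
        if expected is not None and actual == expected:
            hits += 1
    return hits / len(pairs)


# Hypothetical toy schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wafers (id INTEGER, lot TEXT, yield_pct REAL)")
conn.executemany(
    "INSERT INTO wafers VALUES (?, ?, ?)",
    [(1, "A", 91.5), (2, "A", 88.0), (3, "B", 95.2)],
)

pairs = [
    # Different SQL text, same result rows: counted as correct.
    ("SELECT lot FROM wafers WHERE yield_pct > 90 ORDER BY id",
     "SELECT lot FROM wafers WHERE yield_pct >= 90.1"),
    # Wrong filter value: counted as incorrect.
    ("SELECT COUNT(*) FROM wafers WHERE lot = 'B'",
     "SELECT COUNT(*) FROM wafers WHERE lot = 'A'"),
    # Invalid SQL: counted as incorrect.
    ("SELEC id FROM wafers", "SELECT id FROM wafers"),
]
score = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {score:.2f}")
```

Running the same function on a baseline system and on an improved system over a fixed test set gives a simple before-and-after comparison. Questions whose answer depends on row order would need an order-sensitive comparison instead of the multiset used here.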
## Qualifications
* **Education:** Master's studies in Computer Science, Engineering, Microelectronics, or a comparable field
* **Experience and Knowledge:** good knowledge of Python; experience with SQL, Data Science, AI, GenAI, LLM, microelectronics and web development is a plus
* **Personality and Working Practice:** you have a growth mindset
* **Languages:** good English skills
## Additional Information
**Start:** according to prior agreement
**Duration:** 6 months
Enrollment at a university is a requirement for this thesis. Please attach your CV, transcript of records, examination regulations and, if applicable, a valid work and residence permit.
Diversity and inclusion are not just trends for us but are firmly anchored in our corporate culture. Therefore, we welcome all applications, regardless of gender, age, disability, religion, ethnic origin or sexual identity.