Natural-language querying of external data sources
Public engagement and open parliament
Italy - Senate
Use case ID: 030
Author: Senate of Italy
Date: 12 June 2024
Objective:
Enable users to query external data sources, such as the website of the Italian National Institute of Statistics (ISTAT) or the Normattiva website (the portal of current laws), in natural language. The aim is to improve accessibility and the user experience by letting users interact with external datasets intuitively, in conversational language.
Actors:
- Senate website users (citizens, researchers and journalists)
- Senate IT and web development team
Prerequisites:
- Access to external data sources via application programming interfaces (APIs) or web scraping
- Large language model (LLM)-based artificial intelligence (AI) model trained to understand and process natural-language queries
- Internet access for users
Scenario:
- The user accesses the Senate website.
- The user enters a query in natural language (e.g. “What is the latest unemployment rate according to ISTAT?”) in the search bar.
- The LLM-based AI model processes the natural-language query to understand the intent and key terms.
- The AI system sends a request to the relevant external data source (e.g. the ISTAT website) to retrieve the requested information.
- The external data source returns the relevant data to the AI system.
- The AI system formats the data and presents it to the user in a user-friendly format.
- The user can refine their search or ask follow-up questions in natural language to obtain more specific information.
- The system logs the query and results for continuous improvement of the LLM-based AI model.
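The scenario above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Senate's implementation: `parse_intent`, `QueryPipeline`, `SOURCE_KEYWORDS` and `fake_istat` are hypothetical names, the keyword lookup merely stands in for the LLM-based model, and the fetcher returns placeholder data instead of calling any real ISTAT or Normattiva interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical routing table: keyword -> source name. A real deployment would
# rely on the LLM-based model, not keyword matching, to identify the intent.
SOURCE_KEYWORDS = {
    "unemployment": "ISTAT",
    "population": "ISTAT",
    "law": "Normattiva",
    "decree": "Normattiva",
}


@dataclass
class Intent:
    source: str  # external data source to query
    topic: str   # key term extracted from the query


def parse_intent(query: str) -> Optional[Intent]:
    """Stand-in for the LLM step: find the first known keyword in the query."""
    lowered = query.lower()
    for keyword, source in SOURCE_KEYWORDS.items():
        if keyword in lowered:
            return Intent(source=source, topic=keyword)
    return None


@dataclass
class QueryPipeline:
    # Maps a source name to a fetch function; injected so that no real API
    # or scraper is assumed here.
    fetchers: dict[str, Callable[[str], dict]]
    log: list[dict] = field(default_factory=list)

    def answer(self, query: str) -> str:
        intent = parse_intent(query)
        if intent is None:
            reply = "Sorry, I could not understand the query. Please rephrase it."
        else:
            # Send the request to the relevant source and format the result.
            data = self.fetchers[intent.source](intent.topic)
            reply = f"{data['label']}: {data['value']} (source: {intent.source})"
        # Log the query and result for later improvement of the model.
        self.log.append({"query": query, "reply": reply})
        return reply


# Fake fetcher returning placeholder data, not real statistics.
def fake_istat(topic: str) -> dict:
    return {"label": f"Latest {topic} figure", "value": "<placeholder>"}


pipeline = QueryPipeline(fetchers={"ISTAT": fake_istat})
print(pipeline.answer("What is the latest unemployment rate according to ISTAT?"))
# prints: Latest unemployment figure: <placeholder> (source: ISTAT)
```

Passing the fetchers in as plain functions keeps the sketch independent of how each source is actually reached (API or web scraping), so either can be swapped in without changing the pipeline.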
Alternate flows:
- If the LLM-based AI model cannot understand the query, it prompts the user to rephrase or provides suggestions.
- If the external data source is unavailable or returns incomplete data, the system informs the user and suggests alternative sources.
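The two alternate flows could be handled along the following lines. This is a hedged sketch only: `respond`, `SourceUnavailable` and the `ALTERNATIVES` table are hypothetical, and the alternative sources listed are illustrative examples, not sources named in this use case.

```python
from typing import Callable, Optional


class SourceUnavailable(Exception):
    """Raised by a fetcher when the external source cannot be reached."""


# Hypothetical table of alternative sources to suggest; illustrative only.
ALTERNATIVES = {"ISTAT": ["Eurostat"], "Normattiva": ["Gazzetta Ufficiale"]}


def respond(source: Optional[str], fetch: Callable[[], Optional[dict]]) -> str:
    # Flow 1: the model could not identify the intent, so ask for a rephrase.
    if source is None:
        return (
            "I could not understand the query. Try rephrasing it, "
            "for example: 'latest unemployment rate'."
        )
    try:
        data = fetch()
    except SourceUnavailable:
        data = None
    # Flow 2: the source is down or returned incomplete data.
    if not data or "value" not in data:
        options = ", ".join(ALTERNATIVES.get(source, [])) or "none available"
        return (
            f"{source} is unavailable or returned incomplete data. "
            f"Alternative sources: {options}."
        )
    return f"{data['value']} (source: {source})"


def unreachable() -> dict:
    raise SourceUnavailable()


print(respond("ISTAT", unreachable))
# prints: ISTAT is unavailable or returned incomplete data. Alternative sources: Eurostat.
```

Treating an exception and an incomplete payload in the same branch means the user always receives a message, never an empty result or a stack trace.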
Expected results:
- User satisfaction is improved owing to quick and accurate responses drawn from external data sources.
- Usage of the Senate website's search functionality for accessing external data is increased.
- Users need less time and effort to find specific information in external datasets.
- Accessibility for users unfamiliar with querying technical databases or websites is improved.
Potential challenges:
- Ensuring that the LLM-based AI model accurately understands the diverse phrasing and terminology used in relation to external datasets
- Handling queries that require complex data processing or multiple external sources
Data requirements:
- Historical queries and user interactions for training and improving the LLM-based AI model
- Access to APIs or web scraping tools for retrieving data from external sources
- Real-time user query data for ongoing learning and adaptation
Integrations with other systems:
- LLM-based AI processing systems and models
- APIs or web scraping tools for accessing external data sources
- User interface components for displaying query results
Success metrics:
- Query response time
- User satisfaction ratings and feedback
- Accuracy and relevance of query results from external data sources
- Reduction in user queries requiring manual intervention or support
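As an illustration of how these metrics might be derived from logged queries, the sketch below aggregates a list of records. The `QueryRecord` fields and the `summarise` function are assumptions made for this example; the use case does not specify a log format, a rating scale or how relevance is judged.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class QueryRecord:
    response_seconds: float  # time taken to answer the query
    rating: Optional[int]    # user rating (assumed 1-5), None if not given
    relevant: bool           # result judged accurate and relevant on review
    needed_support: bool     # query required manual intervention or support


def summarise(records: list[QueryRecord]) -> dict[str, float]:
    """Aggregate the four success metrics over a batch of logged queries."""
    if not records:
        raise ValueError("at least one record is required")
    ratings = [r.rating for r in records if r.rating is not None]
    total = len(records)
    return {
        "mean_response_seconds": mean(r.response_seconds for r in records),
        "mean_rating": mean(ratings) if ratings else float("nan"),
        "relevance_rate": sum(r.relevant for r in records) / total,
        "support_rate": sum(r.needed_support for r in records) / total,
    }


# Made-up records for illustration only, not measured data.
sample = [
    QueryRecord(1.0, 5, True, False),
    QueryRecord(3.0, None, False, True),
]
print(summarise(sample))
```

Comparing `support_rate` across reporting periods would give the reduction named in the last metric.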