
Huawei Ireland Research Centre

(Huawei Technologies (Ireland) Co., Limited)

Nominated Award:
Best Application of AI in a Large Enterprise

Website of Company (or LinkedIn profile of Person):
https://www.huawei.com/ie/
https://ie.linkedin.com/company/huawei-ireland-research-center

 

Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. With integrated solutions across four key domains – telecom networks, IT, smart devices, and cloud services – Huawei is committed to bringing digital to every person, home and organization for a fully connected, intelligent world. Huawei employs over 194,000 people in 170 countries. Huawei has been in Ireland since 2004, with its business now serving over 3 million people and supporting over 860 direct and indirect jobs. Huawei’s business activities in Ireland continue to thrive. The rollout of intelligent connectivity over fibre and 5G has begun and will bring AI and IoT technologies to the mobile and broadband network markets. Huawei Ireland works closely with local operators and partners, and is focused on nurturing future talent and highly skilled professionals in these areas across the country.

Huawei has consistently maintained heavy investment in R&D and innovation over the past three decades. Huawei puts customer value above shareholder returns and does not constrain R&D investment by profits. This is a principle we have always followed. On average, we invest 10% of our annual revenue in R&D. In 2021, Huawei’s total R&D investment amounted to US$22.4 billion, accounting for 22.4% of our total revenue. This was the highest level we’ve achieved in the past decade. Huawei’s total R&D investment over the past 10 years has exceeded US$132.5 billion.

Huawei ranks second in the 2021 EU Industrial R&D Investment Scoreboard, a European Commission publication prepared by the EU Joint Research Centre (JRC). The Scoreboard ranks the research investment levels of 2,500 companies around the world that together account for 90% of the world’s business-funded R&D. This places Huawei as the second-highest private-sector investor in research and development in the world.

Much of the global research that Huawei carries out takes place in Europe. Huawei set up its first research centre in Sweden in 2000. Today, Huawei employs over 2,400 researchers in 23 research centres across Europe. Through a series of partnerships with over 150 European universities, Huawei is deeply embedded within the ICT research ecosystem in Europe. Through this collaborative research activity, Huawei makes Europe fit for the digital age.

Reason for Nomination:

Chatbots provide a natural dialogue interface to users, simplifying information search and assisting in domain-specific applications. They are increasingly used in healthcare, ecommerce, public administration, and education as a “one-stop” fast information hub for end-users. However, building them involves three challenges: (i) domain understanding (e.g., concepts within domains); (ii) anticipating question styles (e.g., linguistic variations, abbreviations and colloquialisms); and (iii) multilinguality.

Traditionally, chatbots relied on curated FAQ utterance-response pairs, but creating such structured data requires expertise and costly effort. NLP techniques like machine reading comprehension (MRC) have therefore been used in chatbots to extract answers to queries from free-flowing unstructured text (such as documents and manuals). Unfortunately, few efforts combine the two techniques; instead, they are offered as separate products, such as Google DialogFlow and its knowledge connector, or Amazon Lex and Kendra.

We present a hybrid, unified AI-based chatbot prototype that integrates both structured and unstructured domain-specific data to seamlessly answer diverse queries within an online, cloud-deployable framework. This would enable the adoption of AI chatbot technology across diverse domains and organizations through efficient deployment (in the cloud), scalability (through multilinguality), and limited effort (reusing existing FAQs and documents/manuals). It would also greatly benefit end-users by giving them quick access to pertinent information across applications and geographical barriers.

We developed an AI-based hybrid conversational framework named CAGE, comprising three machine learning architectures:

(1) Intent Classification (IC) – This module trains a supervised machine learning model for user question classification based on a pre-defined structured query-response dataset (e.g., an FAQ). The model learns to map an end-user’s query to one of the pre-defined questions in the FAQ. Internally, the IC module uses several multilingual sentence transformer blocks to map questions into high-dimensional dense vector representations that capture context and semantics. The vectors obtained from the transformer blocks are concatenated to form a “query meta-embedding”, which is fed to a shallow neural network for classification. The network thus learns question-question similarity and retrieves the appropriate answer.

(2) MRC – This module allows the parsing and understanding of unstructured text for answering the user question. Here a machine learning model is tuned to the applied domain in a self-supervised manner. Specifically, a T5 language model is used to automatically generate possible questions and corresponding answers from domain-specific documents – to form a training dataset for “domain adaptation” of a pre-trained question-answering model.

(3) Checker – The final module drives the seamless transition between the IC and MRC modules to extract the best answer, enabling the “hybrid” nature of our system. The triggering threshold is set based on the confidence scores of the two modules.
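The meta-embedding step of module (1) can be illustrated with a minimal sketch. It assumes each encoder maps a sentence to a fixed-length vector; the hashed bag-of-words encoders below (`make_toy_encoder`) are hypothetical stand-ins for the multilingual sentence transformers, and normalizing each vector before concatenation is an assumption rather than a detail stated above. The shallow classification network is omitted.

```python
import hashlib
import math
from typing import Callable, List

Vector = List[float]
Encoder = Callable[[str], Vector]


def make_toy_encoder(seed: str, dim: int) -> Encoder:
    """Hypothetical stand-in for one sentence transformer (hashed bag-of-words)."""
    def encode(text: str) -> Vector:
        vec = [0.0] * dim
        for token in text.lower().split():
            digest = hashlib.sha256((seed + token).encode("utf-8")).digest()
            vec[digest[0] % dim] += 1.0  # bucket the token by its hash
        return vec
    return encode


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def meta_embedding(text: str, encoders: List[Encoder]) -> Vector:
    """Concatenate the normalized vectors produced by every encoder."""
    combined: Vector = []
    for encoder in encoders:
        combined.extend(l2_normalize(encoder(text)))
    return combined


# Two toy encoders with different output sizes (assumed values).
encoders = [make_toy_encoder("a", 16), make_toy_encoder("b", 32)]
emb = meta_embedding("I cannot login to my account", encoders)
print(len(emb))  # 48, i.e. 16 + 32
```

In a real system the concatenated vector would be the input to the trained classifier; the point here is only that the meta-embedding length is the sum of the individual encoder dimensions.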

Our multilingual CAGE chatbot has been integrated with the BotFront dialogue system interface for deployment as an online, cloud-based service. The deployed framework is trained on a structured FAQ on Mobile Service applications and on unstructured text descriptions obtained from the web.

A user query is first passed to the IC module to obtain a matching question from the FAQ. If the IC confidence score is greater than a threshold, the matched answer is returned. Otherwise, the query is routed to the MRC module to obtain the answer text. If the MRC module is also not confident, the chatbot asks the user to rephrase the query (or flags it as out of scope). This thresholding provides a degree of control over the chatbot response, reducing the chance that inappropriate (e.g., irrelevant or offensive) answers are presented to the user.
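The routing just described can be sketched as a simple confidence cascade. This is an illustration under stated assumptions, not the deployed implementation: the threshold values, the `route` function, the fallback message, and the two stub modules (`toy_ic`, `toy_mrc`) with their scores are all hypothetical.

```python
from typing import Callable, Tuple

# Each module returns (answer text, confidence in [0, 1]).
Module = Callable[[str], Tuple[str, float]]

# Hypothetical fallback message.
FALLBACK = "Sorry, I did not understand. Could you rephrase your question?"


def route(query: str, ic: Module, mrc: Module,
          ic_threshold: float = 0.7,      # assumed value
          mrc_threshold: float = 0.5      # assumed value
          ) -> Tuple[str, str]:
    """Return (source, answer): try IC first, then MRC, then fall back."""
    answer, confidence = ic(query)
    if confidence > ic_threshold:
        return "IC", answer
    answer, confidence = mrc(query)
    if confidence > mrc_threshold:
        return "MRC", answer
    return "FALLBACK", FALLBACK


def toy_ic(query: str) -> Tuple[str, float]:
    """Stub FAQ matcher: confident only for login questions."""
    if "login" in query.lower():
        return "<FAQ answer about login>", 0.92
    return "", 0.10


def toy_mrc(query: str) -> Tuple[str, float]:
    """Stub reader: confident only for questions about languages."""
    if "languages" in query.lower():
        return "70+ languages", 0.81
    return "", 0.05


for q in ["I cannot login to my account?",
          "How many languages are supported?",
          "What is the weather today?"]:
    print(route(q, toy_ic, toy_mrc))
```

Raising either threshold makes the chatbot more conservative: more queries reach the fallback, in exchange for fewer low-confidence answers being shown.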

A typical interaction begins with the user greeting the system, to which the chatbot responds by asking how it can help. The user follows with a domain-pertinent question which, if present in the FAQ, is matched correctly by our system. For example, the user question “I cannot login to my account?” is matched to the FAQ, and the well-documented answer is returned. Open-ended questions are also answered efficiently by the MRC module, which retrieves information present only in the text along with a longer context for readability. For example, for the question “How many languages are supported?”, CAGE reports: … covers 190+ countries …, supporting “70+ languages”. Here, the text in quotes provides the direct answer, while the entire response presents a readable answer in context. Even seemingly subjective questions like “Why should I use Maps?” or “What is the benefit of Cloud?” are answered well by CAGE (returning “to find locations, driving directions, …” and “… to back up your data which prevents data loss …” respectively). Finally, for multilingual questions, the answer is returned in the same language as the query to improve readability. A short demo of CAGE can be found in the additional documents and at https://youtu.be/PIzwbrmM4UU.

Such chatbots offer organizations high efficiency, performance and customer reach, and improve user satisfaction and quality of experience, facilitating better human-machine interaction.

Additional Information:

The proposed AI-based chatbot solution addresses an important aspect of the growing need for better communication between humans and machines. Our CAGE framework combines machine learning, AI, NLP, and Information Retrieval techniques to better serve the information needs of end users, providing a novel approach geared towards enhanced digital assistants. Further, the advantages brought by multilinguality, cloud deployment capabilities, and the use of easily accessible data would transform how organizations reach out and offer support to customers and clients, across products, services and geographies, at scale.

Related scientific publications (at top-tier AI conferences) and detailed technical information on the chatbot framework can be found in:

1. Semantic Aware Answer Sentence Selection using Self-Learning based Domain Adaptation. KDD 2022 (to appear).

2. CAGE: A Hybrid Framework for Closed-Domain Conversational Agents. ECML-PKDD 2022 (to appear).

3. Enhanced Sentence Meta-Embeddings for Textual Understanding. ECIR 2022.

4. Qasar: Self-Supervised Learning Framework for Extractive Question Answering. IEEE Big Data 2021.

5. Efficient Multi-Lingual Sentence Classification Framework with Sentence Meta Encoders. IEEE Big Data 2021.

6. DTAFA: Decoupled Training Architecture for Efficient FAQ Retrieval. SIGDIAL 2021.