Chatbot: Data Science and Artificial Intelligence at Davidson Consulting
The rapidly expanding Data Science / Artificial Intelligence section of Davidson Consulting was created to meet both our internal needs (our DUTLER intelligent intranet) and those of our clients across a variety of business sectors (Finance, Telecoms, Energy, etc.). It was therefore natural for us to take a serious interest in chatbots.
First observation: today's standard market bots ask questions and give answers drawn from pre-recorded data in a fixed decision tree, prepared in advance by the back office. This is extremely useful and valuable in a closed, well-defined environment, but in an open-ended context where the product or service handled by the bot changes regularly, this approach carries a significant human cost (the tree needs constant updating).
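To make the limitation concrete, here is a minimal sketch of such a fixed, pre-recorded bot (the keywords and answers are invented for illustration): every intent has to be written by hand, so any change to the product means editing the structure itself.

```python
# Minimal sketch of the classic fixed-decision-tree bot described above.
# Every intent and answer is hand-written in advance (illustrative data);
# any change to the product or service means editing this table by hand.
DECISION_TREE = {
    "balance": "Your current balance is available in the Accounts tab.",
    "card": "To block a lost card, call the 24/7 hotline.",
}

def tree_bot(question: str) -> str:
    """Match the question against pre-recorded keywords; no generation."""
    for keyword, answer in DECISION_TREE.items():
        if keyword in question.lower():
            return answer
    return "Sorry, I did not understand. Please rephrase."

print(tree_bot("How do I check my balance?"))
```

Any question that does not contain one of the hand-coded keywords falls through to the fallback answer, which is exactly the maintenance burden described above.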
The innovation developed by our data scientists therefore consists in replacing fixed decision rules with a generative algorithm able to construct its responses word by word. The Chatbot thus generates a consistent response to the same question however it is phrased by users. It also has the advantage of being roughly 80% generic and transposable to a variety of contexts.
To achieve this, we are making use of Deep Learning techniques and, in particular, recurrent neural networks. A specific recurrent architecture in this context is the “sequence to sequence” network, introduced in 2014, and this is the one we have chosen for our application. As shown in the schematic below, it is based on a pair of networks: an encoder to “understand” the question and a decoder to generate the response word by word.
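The encoder/decoder wiring can be sketched as follows. This is only a structural illustration with random, untrained weights (so the generated tokens are meaningless), not our production model: the encoder folds the question into a context vector, and the decoder unrolls that vector into an answer one token at a time.

```python
import numpy as np

# Structural sketch of a "sequence to sequence" pair: an encoder that
# summarises the question into a context vector, and a decoder that
# generates the answer token by token from that vector. Weights are
# random here, so only the wiring is meaningful.
rng = np.random.default_rng(0)
VOCAB, HIDDEN, EOS = 20, 8, 0  # toy sizes; EOS = end-of-sequence token

W_emb = rng.normal(size=(VOCAB, HIDDEN))        # token embeddings
W_enc = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1  # encoder recurrence
W_dec = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1  # decoder recurrence
W_out = rng.normal(size=(HIDDEN, VOCAB))        # hidden -> vocabulary

def encode(tokens):
    """Elman-style recurrence: h_t = tanh(x_t + W_enc @ h_{t-1})."""
    h = np.zeros(HIDDEN)
    for t in tokens:
        h = np.tanh(W_emb[t] + W_enc @ h)
    return h  # context vector summarising the whole question

def decode(context, max_len=10):
    """Greedy word-by-word generation from the context vector."""
    h, token, out = context, EOS, []
    for _ in range(max_len):
        h = np.tanh(W_emb[token] + W_dec @ h)
        token = int(np.argmax(h @ W_out))  # pick the most likely token
        if token == EOS:
            break
        out.append(token)
    return out

answer = decode(encode([3, 7, 2]))  # token ids stand in for words
```

In the real model the recurrence would be an LSTM or GRU and the weights would be learned by training; the point here is only the two-network structure.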
We also make use of so-called Natural Language Processing (NLP) techniques upstream, and of REST APIs downstream, to retrieve specific information (to be integrated into the answer) from third-party services. This gives rise to the following architecture:
Prior to training, we run a data augmentation phase to enhance the model's adaptability to the variety of user writing styles and to enable it to detect the meaning of a question however it is asked. Alongside this augmentation phase there is also a cleaning, or data pre-processing, phase: we remove all special characters, use a Word2vec representation to reduce dimensionality, detect specific regular expressions corresponding to information relating to the connected user, etc. The network is then trained by backpropagation, which requires substantial computing resources; this is why we run it on GPUs. Training is carried out in a number of stages:
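The cleaning and augmentation steps can be illustrated with a small sketch. Everything here is hypothetical example data (the IBAN pattern, the synonym table), and the Word2vec dimensionality-reduction step is omitted; the sketch shows only the special-character removal, regular-expression detection and style augmentation described above.

```python
import re

# Illustrative pre-processing sketch (not our full pipeline): mask
# user-specific patterns with regular expressions, strip special
# characters, and augment a question with synonym variants so the model
# sees several writing styles for the same intent. All data invented.
SYNONYMS = {"show": ["display", "give me"], "balance": ["account balance"]}
IBAN_RE = re.compile(r"\bFR\d{2}[A-Z0-9]{10,}\b")  # connected-user info

def clean(text: str) -> str:
    text = IBAN_RE.sub("xxxibanxxx", text)          # mask account identifiers
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)     # drop special characters
    return re.sub(r"\s+", " ", text).strip().lower()

def augment(question: str) -> list:
    """Return the question plus variants with synonym substitutions."""
    variants = [question]
    for word, alts in SYNONYMS.items():
        if word in question:
            variants += [question.replace(word, a) for a in alts]
    return variants

print(clean("Show my balance for FR7612345678901!!"))
```

Each augmented variant is a new training pair with the same target answer, which is what teaches the network that differently phrased questions share one meaning.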
We naturally evaluate our model before deployment by testing its capacity to memorise the information in the training data (inversely proportional to the training loss below), as well as its capacity to answer questions it did not encounter during training (inversely proportional to the validation loss below). The longer the training lasts (number of epochs), the more these errors decrease. Their convergence towards a minimum indicates both that training has completed correctly and that the model can generalise.
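The convergence criterion described above can be expressed as a small helper: training is considered done when both curves stop improving by more than a tolerance over the last few epochs. The loss values and thresholds below are illustrative, not taken from our actual runs.

```python
# Toy sketch of the stopping criterion: training has converged when the
# loss improved by less than `tol` over the last `window` epochs.
# All numbers below are illustrative.
def has_converged(losses, window=3, tol=1e-3):
    """True once the loss curve has flattened out over `window` epochs."""
    if len(losses) <= window:
        return False
    return losses[-window - 1] - losses[-1] < tol

train_loss = [2.1, 1.4, 0.9, 0.50, 0.4999, 0.4998, 0.4997]
val_loss   = [2.3, 1.7, 1.2, 0.78, 0.7799, 0.7798, 0.7797]
done = has_converged(train_loss) and has_converged(val_loss)
```

Watching the validation loss rather than the training loss alone is what guards against pure memorisation: a training loss that keeps falling while the validation loss stalls would signal overfitting.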
Once trained, our Chatbot is able to extend its operational field and answer questions that were not necessarily included in the training database. It takes a question, carries out its pre-processing and generates a response word by word (without any decision tree: this is the critical innovation!). The generated answer is a template, which is then often completed, in a post-parsing step, with data retrieved from third-party services (REST APIs). This template answer is semantically complete but does not yet contain elements such as geographic locations, financial amounts, etc.
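The post-parsing step can be sketched as follows, using the xxxchargexxx / xxxdescriptionxxx patterns mentioned below. The REST call is mocked with a plain function returning a dict, and the endpoint and values are invented for illustration; in production it would be an HTTP request to the relevant third-party service.

```python
import re

# Sketch of the post-parsing step: the generated answer is a template
# whose xxx...xxx slots are filled with data fetched from a third-party
# service. The REST call is mocked here; the data is illustrative.
def fetch_transaction(user_id: str) -> dict:
    """Stand-in for a REST call such as GET /users/<id>/transactions."""
    return {"charge": "42.50 EUR", "description": "Grocery store"}

def fill_template(template: str, data: dict) -> str:
    """Replace each xxx<key>xxx slot with the corresponding fetched value."""
    return re.sub(r"xxx(\w+?)xxx",
                  lambda m: data.get(m.group(1), m.group(0)),
                  template)

template = "Your last transaction of xxxchargexxx corresponds to: xxxdescriptionxxx."
answer = fill_template(template, fetch_transaction("user-1"))
print(answer)  # → Your last transaction of 42.50 EUR corresponds to: Grocery store.
```

Unknown slots are left untouched rather than dropped, so a missing service response degrades visibly instead of silently producing a wrong answer.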
The Chatbot decides which tools to call on to complete the answers it supplies (not only does it know how to speak “like an adult”, it also knows how to interact with other applications 🙂).
We can offer you a preview of the initial results achieved by “DAVE”, our Chatbot, for a variety of scenarios, highlighting its ability to adapt to a range of contexts. The following video shows the generation of the answer template by a trained network, without the use of REST APIs. The context here is a request for information on bank account transactions, with the xxxchargexxx and xxxdescriptionxxx patterns indicating that the model has to look up this data through requests to third-party services.
And here now is the complete generation with the API, which has connected to a simulated internal service.
Our engine can also handle a geolocation context, specifically bank branches near an address: it understands any question concerning the geolocation of a branch and generates answers by calling the Google Maps API.
What if an address is not detailed enough? Our model is intelligent enough to request further details and include these in its analysis.
Now, with the addition of UX/UI design by Colorz and its art directors, you are working with top-level conversational agents.
Next Step: Introduction of voice recognition. Coming soon in episode 2 of the adventures of “DAVE”.