I did some experimentation with openai embeddings and developed a app that you can ask chatGPT contextualized questions. It has all the textblaze documentation in the backend and all you need to do is to ask something. It will try to answer and even generate snippets using textblaze documentation.
I'd love to hear from you guys some feedback. It certainly has limitations, but it answered me working snippets and uptodate with textblaze documentation. It could generate snippets that calculate a person age, for example. Pretty nice I think
For sure. I think the big limitation here is the context size limitation that openai stablishes. I can't do it in a way to aswer the question considering all of the documentation. Behind the scenes, the app selects some aspect of the documentation that seems to contain the answer and afterwards generate an answer with it.
I believe openAI will make it possible to include larger contexts, so we won't have this limitation.
As a result, more complex questions remain with the 'I don't know' answer from the model.
Because of the same limitation, it can generate pretty long answers. In the same API call it must fit all the context and the answer with respect to the max tokens allowed by openai.
Hi @Vinicius_Almeida thanks for sharing this! I'm excited to dive into this.
I'm curious to know about the tech stack. I had experimented (about a month ago) with LangChain+FAISS max relevance search on top of GPT3 embeddings, and this didn't work because of hallucination. I had also tried manually copy pasting the relevant documentation into GPT3.5 prompt, before asking the question, and it still hallucinates (not a scalable technique but worth a shot I guess ). Didn't try with GPT4 yet but will try again once I get a chance.
As you indicated, the hallucination in my case was only on complex questions. For simple questions, where the answer usually exists in our documentation, the answer was usually correct.
Really nice, @Gaurang_Tandon! You can check the way I generated the prompt in the github repo. There you can also find the documentation scrapped data in a .hdf file.
Here's the link. It's public
Feel free to add any comments or suggestions. It will be really nice for me to collaborate in projects like this one. We could just build something together if worth it. That can be really nice.
I do not work at TextBlaze but I believe they're working on integrations with OpenAI.
If TextBlaze wants me in their team, please get in touch! @scott@Gaurang_Tandon